Course Notes Math235 - Copy 2
COURSE NOTES
FALL 2023
VERSION: October 5, 2023
EYAL Z. GOREN,
MCGILL UNIVERSITY
[email protected]
These notes will be updated and corrected throughout the semester. I will mark with z the point I got to in
relation to revisions. If you find any mistakes kindly bring them to my attention, especially if they are in the
part that was already revised. Thanks, E.G.
Contents
Part 2. Arithmetic in Z 33
9. Division, GCD and the Euclidean Algorithm 33
9.1. Division with residue 33
9.2. Division 34
9.3. GCD 34
9.4. The Euclidean algorithm 35
10. Primes and unique factorization 36
10.1. Primes 36
10.2. Applications of the Fundamental Theorem of Arithmetic 39
11. Exercises 42
17. Exercises 62
Part 5. Rings 63
18. Some basic definitions and examples 63
19. Ideals 65
20. Homomorphisms 67
20.1. Units 69
21. Quotient rings 71
21.1. The quotient ring F[ x ]/( f ( x )) 73
21.2. Every polynomial has a root in a bigger field 75
21.3. Roots of polynomials over Z/pZ 75
22. The First Isomorphism Theorem 76
22.1. Isomorphism of rings 76
22.2. The First Isomorphism Theorem 76
22.3. The Chinese Remainder Theorem 77
22.3.1. Inverting Z/mnZ → Z/mZ × Z/nZ 78
23. Prime and maximal ideals 80
24. Exercises 82
Part 6. Groups 85
25. First definitions and examples 85
25.1. Definitions and some formal consequences 85
25.2. Examples 85
25.3. Subgroups 86
26. Permutation groups and dihedral groups 87
26.1. Permutation groups 87
26.2. Cycles 88
26.3. The Dihedral group 90
27. The theorem of Lagrange 91
27.1. Cosets 91
27.2. Lagrange’s theorem 92
28. Homomorphisms and isomorphisms 93
28.1. homomorphisms of groups 93
28.2. Isomorphism 93
29. Group actions on sets 94
29.1. Basic definitions 94
29.2. Basic properties 95
29.3. Some examples 95
30. The Cauchy-Frobenius Formula 97
30.1. Some applications to Combinatorics 98
31. Cauchy’s theorem: a wonderful proof 100
32. The first isomorphism theorem for groups 101
32.1. Normal subgroups 101
32.2. Quotient groups 101
32.3. The first isomorphism theorem 102
32.4. Groups of low order 103
32.4.1. Groups of order 1 103
32.4.2. Groups of order 2, 3, 5, 7 103
32.4.3. Groups of order 4 103
32.4.4. Groups of order 6 104
32.5. Odds and evens 104
32.6. Odds and Ends 104
32.7. Exercises 106
Introduction.
It is important to realize that Algebra started in antiquity as an applied science. Like every science, it was born
of necessity; in this case, the need to solve everyday problems, some of which are mentioned below. Since the
19th century, if not earlier, there has been a growing side to Algebra which is purely theoretical. However,
some of the most abstract algebraic structures of the past, such as Galois fields
(nowadays simply called finite fields), have become in modern days part of the foundation of certain branches
of applied algebra, for example in applications to coding theory and cryptography. The lesson (and please
repeat it to any politician you happen to meet) is that what is at present considered “pure” and what is
considered “applied” reflects our temporary perspective, and in time many of the pure, seemingly useless, branches
of mathematics turn out to have critical relevance to real-world applications.
The word “algebra” is derived from the title of a book, Hisab al-jabr w’al-muqabala, written by the Persian
scholar Abu Ja’far Muhammad ibn Musa Al-Khwarizmi (790 - 840 AD). The word al-jabr itself comes from
the root of reunite (in the sense of completing or putting together) and refers to one of the methods of
solving quadratic equations (by completing the square) described in that book. The book can be considered
as the first treatise on algebra. The word algorithm is in fact derived from the name Al-Khwarizmi. The
book was very much concerned with methods for solving known practical problems; Al-Khwarizmi intended to
teach (in his own words) “... what is easiest and most useful in arithmetic, such as men constantly require
in cases of inheritance, legacies, partition, lawsuits, and trade, and in all their dealings with one another, or
where the measuring of lands, the digging of canals, geometrical computations, and other objects of various
sorts and kinds are concerned." 1
Algebra I is a first course in Algebra. Little is assumed in the way of background. Though the course is
self-contained, it puts some of the responsibility of digesting and exploring the material on the student, as is
normal in university studies. You’ll soon realize that we are also learning a new language in this course and a
new attitude towards mathematics. The language is the language of modern mathematics; it is very formal,
precise and concise. One of the challenges of the course is digesting and memorizing the new concepts and
definitions. The new attitude is an attitude where any assumptions one is making while making an argument
have to be justified, or at least clearly stated as a postulate, and from there on one proceeds in a logical and
clear manner towards the conclusion. This is called proof and one of the main challenges in the course is to
understand what constitutes a good proof and to be able to write proofs yourself.2 A further challenge for
most students is that the key ideas we learn in this course are very abstract, bordering on philosophy and art,
yet they are truly scientific in their precision. You should expect to not understand everything right away;
you should expect to need time to reflect on the meaning of the new ideas and concepts that we introduce.
Here are some pointers as to how to cope with the challenges of this course:
• Read the class notes and the textbook over and over again. Try and give yourself examples of the
theorems and propositions and try and provide counterexamples when some of the hypotheses are
dropped.
• Do lots and lots of exercises. The more, the better.
• This textbook attempts to be lean and so everything in it is important. Examples and exercises may
contain important observations and may be referred to later.
• Explain to your friends, and possibly to your family, the material of the course.3 Work together with
your classmates on assignments, but write your own solutions in the end; try to understand different
solutions to assignments and try and find flaws in your friends’ solutions.
• Use the instructor’s and the TA’s office hours, as well as the math help center, to quickly close any
gap and clarify any point you’re not sure about.
1. Sets
1.1. First definitions. A set is a collection of elements. The notion of a set is logically not quite defined
(what’s a “collection"? an “element"?) but, hopefully, it makes sense to us. What we have is the ability to
say whether an element is a member of a set or not. Thus, in a sense, a set is a property, and its elements
are the objects having that property (the property is to be in the set).4
There are various ways to define sets:
(1) By writing it down:
S = {1, 3, 5}.
The set is named S and its elements are 1, 3 and 5. The use of curly brackets is mandatory! Another
example is
T = {2, 3, Jim’s football}.
This is a set whose elements are the numbers 2, 3 and Jim’s football. It is assumed here that “Jim"
refers to one particular individual.
A set can also be given as all objects with a certain property:
S1 = {all beluga whales}.
Another example is
$T_5 = \{ n : n \text{ is an odd integer},\ n^3 = n \}$.
The colon means that the part that follows is the list of properties n must satisfy, i.e. the colon is
shorthand for “such that”. Note that this set is equal to the set
$U^+ = \{ n : n^2 = 1 \}$.
Our eccentric notation $T_5$, $S_1$, $U^+$ is just to make a point that a set can be denoted in many ways.
(2) Sometimes we write a set where the description of its elements is implicit, to be understood by the
reader. For example:
N = {0, 1, 2, 3, . . . }, Z = {. . . , −2, −1, 0, 1, 2, . . . },
and
Q = { a/b : a, b ∈ Z, b ≠ 0 }.
Thus N is the set of natural numbers, Z is the set of integers and Q the set of rational numbers.5
The use of the letters N, Z, Q is standard. Other standard notation is
R = the set of real numbers ( = points on the line),
and the complex numbers
C = { a + bi : a, b ∈ R}.
Here i is the imaginary number satisfying $i^2 = -1$ (we’ll come back to that in §6). Note that we
sneaked in new notation. If A is a set, the notation x ∈ A means x is an element (a member) of A,
while x ∉ A means that x is not an element of A. Thus, the expression C = { a + bi : a, b ∈ R} is
saying that C is the set whose elements are a + bi, where a and b are real numbers; these are formal
expressions and it is not assumed that we know how to add a to bi. For example, 1 + i, √3 + πi are
complex numbers. Every real number r is a complex number that we still write as r instead of the
more formal expression r + 0i. For example, in the notation above, 3 ∈ S, 2 ∉ S, Jim’s football ∈ T
but ∉ $U^+$, i ∈ C but i ∉ R.
We haven’t really defined any of these sets rigorously. We have assumed that the reader understands
what we mean. This suffices for the level of this course. A rigorous treatment is usually
given in a logic course for N (constructed via the Peano axioms) or in an analysis course for R (via
Dedekind cuts). The set of real numbers can also be thought of as the set of all numbers written in
a possibly infinite decimal expansion. Thus, 1, 2, 1/3 = 0.33333 . . . and indeed any rational number
is an element of R, as are π = 3.1415926 . . . , e = 2.718281 . . . , √2 = 1.414 . . . and so on. In fact
π, e and √2 are not rational numbers; we will prove that √2 is not rational in Proposition 10.2.4,
and that e is irrational in Appendix B. The proof that π is irrational is much harder.
4Like some upscale gentlemen’s clubs, they are defined as much by those that aren’t members as by those that are.
5For us 0 is a natural number, that is, we include it in N, but some authors do not. The letter Z comes from “Zahlen”, meaning
“numbers” in German, and Q comes from “quotient”.
Two sets A, B are equal, A = B, if they have the same elements, that is, if every element of A is an element
of B and vice-versa. Thus, for example, A = {1, 2} is equal to B = {2, 1} (the order doesn’t matter), and
A = {−1, 1} is equal to B = { x ∈ R : $x^2 - 1 = 0$ } (the description doesn’t matter). Also A = {1, −1} is
equal to B = {1, 1, −1, 1} (repetitions do not matter, either). We say that
A ⊆ B,
(A is contained in B), or simply A ⊂ B, if every element of A is an element of B. For example N ⊂ Z.
Some authors use A ⊂ B to mean A is contained in B but not equal to it. We do not follow this convention
and for us A ⊂ B allows A = B. If we want to say that A is contained in B and not equal to it, we shall use
A ⊊ B. Note that A = B holds precisely when both A ⊂ B and B ⊂ A.
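The conventions above can be tried out on small finite sets. The following is a minimal Python sketch, not part of the notes' formal development: the helper names `is_subset` and `sets_equal` and the finite search window are assumptions made for illustration (the sets in the text range over all integers).

```python
# Order and repetitions do not matter for sets.
A = {1, 2}
B = {2, 1}
order_irrelevant = (A == B)

C = {1, -1}
D = {1, 1, -1, 1}
repetitions_irrelevant = (C == D)

# The description does not matter: T5 and U+ from the text, searched in a
# finite window of integers (an assumption made only so the code terminates).
window = range(-100, 101)
T5 = {n for n in window if n % 2 != 0 and n**3 == n}
U_plus = {n for n in window if n**2 == 1}

def is_subset(X, Y):
    """Return True if every element of X is an element of Y (X ⊂ Y in the notes' convention)."""
    return all(x in Y for x in X)

def sets_equal(X, Y):
    """Equality checked as two inclusions: X ⊂ Y and Y ⊂ X."""
    return is_subset(X, Y) and is_subset(Y, X)
```

Here `sets_equal(T5, U_plus)` tests equality exactly as in the last sentence above, by checking both inclusions.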
The notation
∅
stands for the empty set. It is a set but it has no elements. Admittedly, that sounds funny; the logic behind it
is that we want the intersection of sets to always be a set. We let
A ∩ B = { x : x ∈ A and x ∈ B}
be the intersection of A and B, the set of common elements, and we let
A ∪ B = { x : x ∈ A or x ∈ B}
be the union of A and B. For example, {1, 3} ∩ {n : $n^2 = n$} = {1}, N ∩ { x : −x ∈ N} = {0}, $S_1 \cap T_5 = \emptyset$.6
We shall also need arbitrary unions and intersections. Let I be a non-empty set (thought of as an index
set) and suppose that for each i ∈ I we are given a set Ai . Then
\[ \bigcap_{i \in I} A_i = \{ x : x \in A_i,\ \forall i \}, \]
(∀ means “for all”) is the set of elements belonging to each $A_i$, and
\[ \bigcup_{i \in I} A_i = \{ x : x \in A_i \text{ for some } i \}, \]
is the set of elements appearing in at least one Ai . For example, define for i ∈ Z,
$A_i = \{ x \in \mathbb{Z} : x \ge i \}$
(so $A_{-1}$ = {−1, 0, 1, 2, 3, 4, . . . }, $A_0$ = {0, 1, 2, 3, . . . }, $A_1$ = {1, 2, 3, 4, . . . }, $A_2$ = {2, 3, 4, 5, . . . } and so on).
Then $\bigcup_{i \in \mathbb{Z}} A_i = \mathbb{Z}$, while $\bigcap_{i \in \mathbb{Z}} A_i = \emptyset$.
Here’s another example: for every real number x with 0 ≤ x ≤ 1, define
$S_x = \{ (x, y) : 0 \le y \le x,\ y \in \mathbb{R} \}$.
Then $\bigcup_{0 \le x \le 1} S_x$ is the triangular area in the plane whose vertices are (0, 0), (1, 0), (1, 1).
A good way to decipher formulas involving two or three sets is by diagrams. For example:
Another definition we shall often use is that of the cartesian product. Let A1 , A2 , . . . , An be sets. Then
A1 × A2 × · · · × An = {( x1 , x2 , . . . , xn ) : xi ∈ Ai , for 1 ≤ i ≤ n}.
In particular,
A × B = {( a, b) : a ∈ A, b ∈ B}.
Example 1.1.1. Let A = {1, 2, 3}, B = {1, 2}. Then
A × B = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)}.
Note that (3, 1) ∈ A × B but (1, 3) ∉ A × B, because although 1 ∈ A, 3 ∉ B.
Example 1.1.2. Let A = B = R. Then
A × B = {( x, y) : x ∈ R, y ∈ R}.
This is just the presentation of the plane in cartesian coordinates (which explains why we call such products
“cartesian" products).
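Example 1.1.1 can be reproduced mechanically. The sketch below uses Python's standard `itertools.product`; the variable name `AxB` is chosen here only for illustration.

```python
from itertools import product

A = {1, 2, 3}
B = {1, 2}

# All ordered pairs (a, b) with a in A and b in B.
AxB = set(product(A, B))

# Order inside a pair matters: (3, 1) is in A x B, but (1, 3) is not,
# because 3 is not an element of B.
three_one_in = (3, 1) in AxB
one_three_in = (1, 3) in AxB
```

Note that the product has |A| · |B| = 6 elements, matching the list in Example 1.1.1.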
1.2. Algebra of set operations. Up till now we have only introduced notation and vocabulary. With the
exception of the notion of the empty set, none of what we have said has any depth. We now wish to make
general statements relating some of these operations. Once a statement is important enough to highlight
it, it falls under the heading of a Lemma, a Proposition or a Theorem (or, more colloquially, a Claim, an
Assertion and so on). Usually, “Lemma" is reserved for technical statements often to be used in the proof
of a proposition or a theorem. “Proposition" and “Theorem" are more or less the same. They are used for
claims that are more conceptual, or central, with “Theorem" implying even more importance. However, none
of these rules is absolute. For example, consider the following proposition:
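Proposition 1.2.1. Let I be a set. Let A be a set and $B_i$, i ∈ I, be sets as well. Then
\[ A \cap \Big( \bigcup_{i \in I} B_i \Big) = \bigcup_{i \in I} (A \cap B_i), \]
and
\[ A \cup \Big( \bigcap_{i \in I} B_i \Big) = \bigcap_{i \in I} (A \cup B_i). \]
Furthermore,
\[ A \setminus \Big( \bigcup_{i \in I} B_i \Big) = \bigcap_{i \in I} (A \setminus B_i), \]
and
\[ A \setminus \Big( \bigcap_{i \in I} B_i \Big) = \bigcup_{i \in I} (A \setminus B_i). \]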
We will prove parts of this Proposition below and the rest is left as an exercise.
2.1. Proving equality by two inequalities. When one wants to show that two real numbers x, y are equal,
it is often easier to show instead that x ≤ y and y ≤ x and to conclude that x = y.
In the same spirit, to show two sets A and B are equal, one may show that every element of A is an
element of B and that every element of B is an element of A. That is, we prove two “inequalities", A ⊆ B
and B ⊆ A. Thus, our principle of proof is
A = B
if and only if
(x ∈ A ⇒ x ∈ B) and (x ∈ B ⇒ x ∈ A).
(The notation ⇒ means “implies that".)
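Let us now prove the statement $A \cap (\bigcup_{i \in I} B_i) = \bigcup_{i \in I} (A \cap B_i)$. The way you should write it in an assignment,
a test, or a research paper is as follows:
\[ A \cap \Big( \bigcup_{i \in I} B_i \Big) = \bigcup_{i \in I} (A \cap B_i). \]
Proof. Let $x \in A \cap (\bigcup_{i \in I} B_i)$; then $x \in A$ and $x \in \bigcup_{i \in I} B_i$. That is, $x \in A$ and $x \in B_{i_0}$ for some $i_0 \in I$.
Then $x \in A \cap B_{i_0}$ and so $x \in \bigcup_{i \in I} (A \cap B_i)$. We have shown so far that $A \cap (\bigcup_{i \in I} B_i) \subset \bigcup_{i \in I} (A \cap B_i)$.
Conversely, let $x \in \bigcup_{i \in I} (A \cap B_i)$. Then there is some $i_0 \in I$ such that $x \in A \cap B_{i_0}$ and so for that $i_0$ we
have $x \in A$ and $x \in B_{i_0}$. In particular, $x \in A$ and $x \in \bigcup_{i \in I} B_i$ and so $x \in A \cap (\bigcup_{i \in I} B_i)$.
Let’s also do
\[ A \setminus \Big( \bigcup_{i \in I} B_i \Big) = \bigcap_{i \in I} (A \setminus B_i). \]
We use the same technique. Let $x \in A \setminus (\bigcup_{i \in I} B_i)$; thus $x \in A$ and $x \notin \bigcup_{i \in I} B_i$. That means that $x \in A$ and
for all $i \in I$ we have $x \notin B_i$. That is, for all $i \in I$ we have $x \in A \setminus B_i$ and so $x \in \bigcap_{i \in I} (A \setminus B_i)$.
Conversely, let $x \in \bigcap_{i \in I} (A \setminus B_i)$. Then, for all $i \in I$ we have $x \in A \setminus B_i$. That is, $x \in A$ and $x \notin B_i$ for
every i. Thus, $x \in A$ and $x \notin \bigcup_{i \in I} B_i$ and it follows that $x \in A \setminus (\bigcup_{i \in I} B_i)$.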
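A finite sanity check is no substitute for the proofs above, but it can help in forming intuition for the four identities of Proposition 1.2.1. The following Python sketch assumes a small hypothetical family $B_0, B_1, B_2$ (so I = {0, 1, 2}); the sample sets and the helper names `union_all` and `intersection_all` are illustrative choices, not taken from the notes.

```python
def union_all(sets):
    """Union of a finite family of sets."""
    result = set()
    for s in sets:
        result |= s
    return result

def intersection_all(sets):
    """Intersection of a finite, non-empty family of sets."""
    sets = list(sets)
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
    return result

# Hypothetical sample data: a set A and a family B_i indexed by I = {0, 1, 2}.
A = {0, 1, 2, 3, 4, 5}
B = [{1, 2, 3}, {2, 3, 7}, {3, 4, 8}]

# Each entry compares the left and right sides of one identity in the proposition.
checks = {
    "cap over cup": A & union_all(B) == union_all(A & Bi for Bi in B),
    "cup over cap": A | intersection_all(B) == intersection_all(A | Bi for Bi in B),
    "minus cup": A - union_all(B) == intersection_all(A - Bi for Bi in B),
    "minus cap": A - intersection_all(B) == union_all(A - Bi for Bi in B),
}
all_hold = all(checks.values())
```

Changing `A` and `B` to other finite sets is a quick way to look for counterexamples when a hypothesis is dropped or an identity is misremembered.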
2.2. Proof by contradiction and the contrapositive. Proof by contradiction is a very useful technique,
even though using it too often shows lack of deeper understanding of the subject. Suppose that some
statement is to be proven true. In this technique one assumes that the statement is false and then one
proceeds to derive logical consequences of this assumption until an obvious contradiction arises. Here is an
easy example that illustrates this:
Claim. There are no positive integers x, y such that $x^2 - y^2 = 1$.
Proof. Assume not. Then there are positive integers x, y such that $x^2 - y^2 = 1$. Then, (x − y)(x + y) = 1.
However, the only ways to write 1 as a product of two integers are 1 × 1 and (−1) × (−1) and, in either case, it follows that x − y =
x + y. It follows that 2y = (x + y) − (x − y) = 0 and so that y = 0. Contradiction (because we have
assumed both x and y are positive).
Claim. If x and y are two integers whose sum is odd, then exactly one of them is odd.
Proof. Suppose not. Then, either both x and y are odd, or both x and y are even. In the first case x =
2a + 1, y = 2b + 1, for some integers a, b, and x + y = 2( a + b + 1) is even; contradiction. In the second
case, x = 2a, y = 2b, for some integers a, b, and x + y = 2( a + b) is even; contradiction again.
Somewhat related is the technique of proving the contrapositive. Let A and B be two assertions and
let ¬ A, ¬ B be their negation. Logically the implication
A⇒B
is equivalent to
¬ B ⇒ ¬ A.
Here is an example. Let A be the statement “it rains" and B the statement “it’s wet outside". Then ¬ A
is the statement “it doesn’t rain" and ¬ B is the statement “it’s dry outside". The meaning of A ⇒ B is “it
rains therefore it’s wet outside" and its contrapositive is “it’s dry outside therefore it doesn’t rain". Those
statements are equivalent. Here is a mathematical example:
Claim. If x and y are integers such that xy is even then either x or y is even.7
Proof. The contrapositive is: If both x and y are odd then xy is odd. To prove that, write x = 2a + 1, y =
2b + 1 for some integers a, b. Then xy = 4ab + 2a + 2b + 1 = 2(2ab + a + b) + 1, is one more than an even
integer and so is odd.
2.3. Proof by Induction. Induction is perhaps the most fun technique. Its logical foundations also lie deeper
than the previous methods. The principle of induction, to be explained below, rests on the following axiom:
We remark that the axiom is actually intuitively obviously true. The reason we state it as an axiom is that
when one develops the theory of sets in a very formal way from fundamental axioms the assertion stated
above doesn’t follow from simpler axioms and, in one form or another, has to be included as an axiom.
Theorem 2.3.1. (Principle of Induction) Let $n_0$ be a given natural number. Suppose that for every natural
number $n \ge n_0$ we are given a statement $P_n$. Suppose that we know that:
(1) $P_{n_0}$ is true.
(2) If $P_n$ is true then $P_{n+1}$ is true.
Then $P_n$ is true for every $n \ge n_0$.
7Note that in mathematics “or" always means “and/or".
The statement $P_n$ is “for every real number $q \neq 1$, $1 + q + \cdots + q^n = \frac{1 - q^{n+1}}{1 - q}$”. The base case, that is the
first n for which the statement is claimed true, is n = 0; the statement is then
\[ 1 = \frac{1 - q}{1 - q}, \]
which is obviously true.
Now suppose that the statement is true for n. That is, suppose that
\[ 1 + q + \cdots + q^n = \frac{1 - q^{n+1}}{1 - q}. \]
We need to show that
\[ 1 + q + \cdots + q^{n+1} = \frac{1 - q^{n+2}}{1 - q}. \]
Indeed,
\begin{align*}
1 + q + \cdots + q^{n+1} &= (1 + q + \cdots + q^n) + q^{n+1} \\
&= \frac{1 - q^{n+1}}{1 - q} + q^{n+1} \\
&= \frac{1 - q^{n+1}}{1 - q} + \frac{(1 - q)\, q^{n+1}}{1 - q} \\
&= \frac{1 - q^{n+1}}{1 - q} + \frac{q^{n+1} - q^{n+2}}{1 - q} \\
&= \frac{1 - q^{n+2}}{1 - q}.
\end{align*}
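The induction argument proves the formula for every n; a numerical spot check merely guards against slips in the algebra. The following Python sketch compares both sides exactly using the standard `fractions.Fraction` type. The function names and the sample values of q are assumptions chosen for illustration.

```python
from fractions import Fraction

def geometric_sum(q, n):
    """Left side: 1 + q + ... + q^n, computed term by term."""
    return sum(q**k for k in range(n + 1))

def closed_form(q, n):
    """Right side: (1 - q^(n+1)) / (1 - q); requires q != 1."""
    return (1 - q**(n + 1)) / (1 - q)

# Hypothetical sample values of q (all different from 1), as exact rationals.
samples = [Fraction(1, 2), Fraction(-3, 7), Fraction(5), Fraction(0)]

all_agree = all(
    geometric_sum(q, n) == closed_form(q, n)
    for q in samples
    for n in range(12)
)
```

Exact rational arithmetic is used rather than floating point so that equality of the two sides is tested without rounding error.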
Here are further examples of statements that are easy to prove by induction. (i) For n ≥ 0, $n^2 \le 4^n$. (ii)
$1 + 3 + \cdots + (2n - 1) = n^2$, for n ≥ 1. More thought is required for the following: (iii) Consider the following
scenario: there are n people at a party and when the bell rings each is supposed to throw a pie at the person
closest to him or her. (Well, things were getting wild...) Assume that all the distances between the people
are mutually distinct. If n is odd then at least one person is not going to be hit by a pie.
2.4. Prove or disprove. A common exercise, and a situation one often faces in research, is to prove or
disprove a particular statement. For example,
“Prove or disprove: for every natural number n, 4n + 1 is either a square or a sum of two squares."
At that point you are requested first to form a hunch, a guess, an opinion about whether the statement
is true or false. To form that hunch you can try some examples (4 ∗ 0 + 1 = 1 = $1^2$, 4 ∗ 1 + 1 = 5 =
$1^2 + 2^2$, 4 ∗ 2 + 1 = 9 = $3^2$, 4 ∗ 3 + 1 = 13 = $2^2 + 3^2$, ...) to see if the statement holds for these examples,
or you can think whether the statement is similar to other statements you know to be true or false, or, when
at a loss, toss a coin. After deciding on your initial position, if you believe the statement is true you should
proceed to find a proof. If you don’t, then you have two options. You can try and show that if the statement
is true it will imply a contradiction to a known fact, or you can provide one counterexample. The statement
being false doesn’t mean it’s false for every n; it means it’s false for at least one n. In the case at hand,
if we take n = 5 we find that 4 ∗ 5 + 1 = 21, which is neither a square nor a sum of two squares (just try all
possibilities) and so the statement is false.
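The “just try all possibilities” step is easy to automate. Below is a minimal Python sketch of a brute-force counterexample search; the function names and the search bound are assumptions for illustration. A square $m = a^2$ is treated as $a^2 + 0^2$, so a single test covers both alternatives in the statement.

```python
from math import isqrt

def is_square(m):
    """True if m is the square of a non-negative integer."""
    return m >= 0 and isqrt(m) ** 2 == m

def is_square_or_sum_of_two_squares(m):
    """True if m = a^2 + b^2 for integers a, b >= 0 (b = 0 covers plain squares)."""
    return any(is_square(m - a * a) for a in range(isqrt(m) + 1))

def first_counterexample(limit):
    """Smallest n < limit for which 4n + 1 fails the statement, or None if there is none."""
    for n in range(limit):
        if not is_square_or_sum_of_two_squares(4 * n + 1):
            return n
    return None

counterexample = first_counterexample(100)
```

A search like this can only disprove the statement (by exhibiting a counterexample); had it found nothing below the bound, that would have been evidence for a hunch, not a proof.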
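2.5. The pigeonhole principle. The pigeonhole principle is simple to state, yet it is a powerful tool. It states
the following:8
If there are more pigeons than pigeonholes, then at least one pigeonhole contains two or more pigeons.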
Following are examples of applications of the pigeonhole principle. To appreciate its power, I recommend
reading the statement first and trying to come up with an independent proof, before reading the provided
proof.
Example 2.5.1. Let a, b, c, d, e, f be 6 integers. Then there are among them two integers whose difference is
divisible by 5.
To prove this consider the remainders of a, b, c, d, e, f upon division by 5. The remainder is either 0, 1, 2, 3
or 4 (these are the 5 pigeonholes) but we get from our numbers 6 remainders (those are the pigeons).
Therefore, there are two numbers among a, b, c, d, e, f with the same remainder. Say a and b have the same
remainder r. Then a − b is divisible by 5 (because a = 5a′ + r, b = 5b′ + r and so a − b = 5(a′ − b′)).
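The argument of Example 2.5.1 is constructive enough to run: sort the numbers into pigeonholes by remainder and stop at the first collision. The Python sketch below does this; the function name and the sample list are hypothetical.

```python
def pair_with_difference_divisible_by(numbers, m=5):
    """Return two entries of `numbers` with the same remainder mod m, or None.

    The pigeonholes are the remainders 0, 1, ..., m - 1; with more than m
    numbers a collision is guaranteed by the pigeonhole principle.
    """
    seen = {}  # remainder -> first number found with that remainder
    for x in numbers:
        r = x % m
        if r in seen:
            return seen[r], x
        seen[r] = x
    return None

# Hypothetical sample: six integers, hence two share a remainder mod 5.
pair = pair_with_difference_divisible_by([3, 14, 27, 8, 41, 100])
```

With five or fewer numbers the function may return `None`, which matches the fact that the principle needs strictly more pigeons than pigeonholes.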
A similar example is the following:
Example 2.5.2. Let n ≥ 2 be an integer. In any group of n people, there are two that have the same number
of friends within the group.
We prove that by induction on n. The case n = 2 is trivial. Let N ≥ 2; suppose the claim holds for all n with 2 ≤ n ≤ N,
and consider a group of N + 1 people. If there is a person with 0 friends, we can look at the rest of the
people. This is a group of N people and the number of friends each has in this smaller group is the same as
in the original group. We can apply induction to conclude that two have the same number of friends (initially
in the smaller group, but in fact also in the larger group).
The other case is when each person in the group of N + 1 people has at least 1 friend. By considering
the number of friends each person in the group has, we get N + 1 integer values between 1 and N and so,
by the pigeonhole principle, two must be equal.
Here is another way to formulate the statement we have just proven. A graph is a collection of vertices and
edges connecting them. We say that a graph is simple if it has no loops and no multiple edges, namely, an
edge always goes from a vertex to a different vertex and between any two vertices there is at most one edge.
A graph is called finite if it has finitely many vertices. The degree of a vertex v is the number of edges one of
whose terminal points is v. What we proved is that in a finite simple graph two of the vertices must have the
same degree. (Given a party, create a vertex for every person and connect two vertices if the corresponding
persons are friends.)
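The graph formulation can be explored by brute force on small graphs. The sketch below, in Python, represents a finite simple graph on the vertices 0, ..., n − 1 by a list of edges (pairs of distinct vertices, each pair at most once); this representation and the function names are assumptions made for illustration.

```python
from itertools import combinations

def degrees(n, edges):
    """Degree of each vertex 0..n-1 in a simple graph given by its edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def two_vertices_of_equal_degree(n, edges):
    """Return a pair of distinct vertices with the same degree, or None."""
    seen = {}  # degree -> first vertex found with that degree
    for v, d in enumerate(degrees(n, edges)):
        if d in seen:
            return seen[d], v
        seen[d] = v
    return None

# Exhaustive check over all simple graphs on 5 vertices (2^10 edge subsets).
n = 5
all_possible_edges = list(combinations(range(n), 2))
always_found = all(
    two_vertices_of_equal_degree(n, edge_set) is not None
    for k in range(len(all_possible_edges) + 1)
    for edge_set in combinations(all_possible_edges, k)
)
```

The exhaustive loop confirms the claim only for n = 5; the general statement is what the induction argument above establishes.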
Finally, notice that we have used a variant of Induction called “complete induction". Namely, we used
the following principle. Suppose that a statement $P_n$ is given for every natural number greater than or equal to a
natural number $n_0$. Suppose that the statement $P_{n_0}$ is true and that for every $n \ge n_0$, if all the statements
$P_{n_0}, P_{n_0+1}, \dots, P_n$ are true then so is $P_{n+1}$. Then $P_n$ is true for every $n \ge n_0$.
The proof is more or less the same as the proof of Theorem 2.3.1, so we leave it as an exercise.9
3. Functions
We are familiar with functions already from high-school, where functions were usually, probably always,
functions of a real variable. For example, the function y = sin( x ) or y = 3x + 1. Here x ∈ R and the value y
is in R too. We are also used to plotting this function using cartesian coordinates as all the pairs ( x, y) where
y = sin( x ) or 3x + 1, as the case may be. We want to abstract this notion and for a start we may say that
we have a function f : R → R, where for the first example f ( x ) = sin( x ) and for the second f ( x ) = 3x + 1.
That is, instead of writing y we write f ( x ). In this language, the graph consists of all the points in the plane of the
form ( x, f ( x )), or, what is the same, all the points ( x, y) such that y = f ( x ).
There are more formal and less formal ways to define a function. Here we take the most pedestrian
approach. Let A and B be sets. A function f from A to B,
f : A −→ B,
is a rule assigning to each element of A a single element of B. The set A is called the source, or the domain,
of the function, and B the target, or codomain, of the function. For a ∈ A, f ( a) is called the image of a
(under f ) and f ( A) = { f ( a) : a ∈ A} is the image of f .
9You should not take that as meaning that you can safely ignore the matter of finding a proof for complete induction. On the
contrary, it means that you should understand the proof of Theorem 2.3.1 so well that it is clear to you how to prove the variant
given here.
Example 3.0.1. The simplest example is the identity function. Let A be any set and define
$1_A : A \to A$
to be the function sending each element to itself. Namely,
$1_A(x) = x$,
for any x ∈ A.
Example 3.0.2. Let A = {1, 2, 3}, B = {1, 2} and consider the following rules for f : A −→ B.
(1) f (1) = 2, f (2) = 1, f (3) = 1.
(2) f (1) = 1 or 2, f (2) = 2, f (3) = 1.
(3) f (1) = 1, f (2) = 1.
The first recipe defines a function from A to B. The second recipe does not, because 1 is assigned two
possible values. The third also doesn’t define a function because no information is given about f (3).
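The three recipes of Example 3.0.2 can be tested against the definition directly. In the Python sketch below a recipe is encoded, as an assumption of this illustration, by a dictionary sending each element to the set of values the recipe offers for it; a recipe defines a function from A to B exactly when every element of A gets one and only one value, and that value lies in B.

```python
A = {1, 2, 3}
B = {1, 2}

def defines_function(rule, A, B):
    """True if `rule` assigns to each a in A exactly one value, lying in B."""
    return all(
        a in rule and len(rule[a]) == 1 and rule[a] <= B
        for a in A
    )

# The three recipes from Example 3.0.2.
rule1 = {1: {2}, 2: {1}, 3: {1}}      # one value for every element of A
rule2 = {1: {1, 2}, 2: {2}, 3: {1}}   # f(1) is ambiguous
rule3 = {1: {1}, 2: {1}}              # f(3) is missing

results = [defines_function(r, A, B) for r in (rule1, rule2, rule3)]
```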
Example 3.0.3. Let R≥0 denote the non-negative real numbers. Consider the following attempts to define
functions.
(1) f : R → R, f ( x ) = √x.
(2) f : R≥0 → R, f ( x ) = y, where y is a real number such that $y^2 = x$.
(3) f : R≥0 → R, f ( x ) = the non-negative root of x = +√x.
(4) f : R → R, f ( x ) = 1/x.
The first definition fails because −1 doesn’t have a root in R. The second definition fails because every
positive number has 2 roots (differing by a sign) and it isn’t clear which root one is supposed to take. This
problem also exists in the first definition. The third definition does define a function. The fourth definition
doesn’t define a function, because the value f (0) is not well-defined.
There are various ways to define a function. It can be done by writing down f ( a) for every a ∈ A explicitly,
it can be done by providing a formula, and it can be done by giving some other description. For example,
take A to be the set of all people who ever lived, B = A and f : A → A is given by
f ( a) = a’s mother.
This definitely looks like a good definition at first sight. However, the astute reader will note the problem
here. If this function were truly well-defined then the set A would have to be infinite, because if it were finite we would
have a person who’s a descendant of herself (consider a, f ( a), f ( f ( a)), f ( f ( f ( a))), . . . ). Since a mother is
older than any of her children by at least a day, say (just not to worry about an exact number), it follows
that if A and f are indeed well defined, then people have existed forever. The various ways to resolve the
paradox, namely to provide an explanation as to why f is not well-defined, are rather amazing and I leave it
to you as an amusing exercise. (And, no, I don’t view that as a proof that God exists.)
Here is some more notation: the symbol ∀ means “for all". The symbol ∃ means “exists". The symbol ∃!
means “exists unique". A function can also be defined by using a set
Γ ⊂ A × B,
with the following property: ∀ a ∈ A, ∃!b ∈ B such that ( a, b) ∈ Γ. (Read: for all a in A there exists a unique b
in B such that ( a, b) is in Γ.) We then define f ( a) to be the unique b such that ( a, b) ∈ Γ. Conversely, given
a function f we let
$\Gamma = \Gamma_f = \{ (a, f(a)) : a \in A \}$.
The set $\Gamma_f$ is called the graph of f .
Example 3.0.4. Let A be a set and Γ ⊂ A × A the “diagonal",
Γ = {( x, x ) : x ∈ A}.
The function defined by Γ is $1_A$.
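For finite sets, the condition “for all a ∈ A there exists a unique b ∈ B with (a, b) ∈ Γ” can be checked by counting. The following Python sketch encodes Γ as a set of pairs; the function names are hypothetical and serve only to illustrate the definition and Example 3.0.4.

```python
def is_graph_of_function(gamma, A, B):
    """True if gamma ⊂ A × B and every a in A occurs in exactly one pair (a, b)."""
    inside = all(a in A and b in B for (a, b) in gamma)
    unique = all(sum(1 for (x, _) in gamma if x == a) == 1 for a in A)
    return inside and unique

def function_from_graph(gamma):
    """Recover f as a dictionary a -> f(a); assumes gamma passed the check above."""
    return {a: b for (a, b) in gamma}

A = {1, 2, 3}
diagonal = {(x, x) for x in A}          # the "diagonal" of Example 3.0.4
identity = function_from_graph(diagonal)
```

The two failure modes of the check correspond to the two failures in Example 3.0.2: an element with more than one value, and an element with no value at all.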
3.1. Injective, surjective, bijective and inverse image. We introduce some properties of functions. Let
f: A→B
be a function. Then:
(1) f is called injective if f ( a) = f ( a′ ) ⇒ a = a′ . I.e., different elements of A go to different elements
of B. Such a function is also called one-one.
(2) f is called surjective, or onto, if ∀b ∈ B, ∃ a ∈ A such that f ( a) = b. I.e., every element in the
target is the image of some element in the source under the function f .
(3) f is called bijective if it is both injective and surjective. In that case, every element of B is the
image of a unique element of A.
Let f : A → B be a function. Let U ⊂ B. We define the pre-image of U to be the set
$f^{-1}(U) = \{ a : a \in A,\ f(a) \in U \}$.
If U consists of a single element, U = {u}, we usually write $f^{-1}(u)$ (instead of $f^{-1}(\{u\})$) and call it the
fibre of f over u.
Example 3.1.1. (1) f : R → R, $f(x) = x^2$. Then f is neither surjective (a square is always non-negative)
nor injective as f ( x ) = f (− x ). We have $f^{-1}([1, 4]) = [1, 2] \cup [-2, -1]$ and $f^{-1}(0) = \{0\}$, $f^{-1}(-1) = \emptyset$.
(2) f : R → R≥0 , $f(x) = x^2$. Then f is surjective but not injective.
(3) f : R≥0 → R≥0 , $f(x) = x^2$. Then f is bijective.
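A finite analogue of Example 3.1.1 can be checked by machine. In the Python sketch below the real line is replaced by small sets of integers, which is an assumption of this illustration (the example in the text is about R), and the helper names are hypothetical. The same squaring rule changes its properties as the source and target change.

```python
def square(x):
    return x * x

def is_injective(f, A):
    """Different elements of A go to different values."""
    values = [f(a) for a in A]
    return len(values) == len(set(values))

def is_surjective(f, A, B):
    """Every element of B is f(a) for some a in A."""
    return set(B) <= {f(a) for a in A}

def is_bijective(f, A, B):
    return is_injective(f, A) and is_surjective(f, A, B)

def preimage(f, A, U):
    """The set of a in A with f(a) in U."""
    return {a for a in A if f(a) in U}

# Finite stand-ins for R, R>=0 and their images under squaring.
A = set(range(-3, 4))        # {-3, ..., 3}
A_nonneg = {0, 1, 2, 3}
B_all = set(range(-9, 10))   # contains negative numbers, which are never squares
B_squares = {0, 1, 4, 9}

case1 = (is_injective(square, A), is_surjective(square, A, B_all))
case2 = (is_injective(square, A), is_surjective(square, A, B_squares))
case3 = is_bijective(square, A_nonneg, B_squares)
```

Here `preimage(square, A, {1, 4})` plays the role of $f^{-1}([1, 4])$: it collects both the positive and the negative square roots.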
[Diagram: A → B → C, with arrows f : A → B and g : B → C.]
3.3. The inverse function. Let f : A → B be a bijective function. In this case we can define the inverse
function
$f^{-1} : B \to A$,
by the property
$f^{-1}(b) = a$ if f ( a) = b.
This is well defined: since f is surjective such an a exists for every b, and it is unique because f is injective.
Thus $f^{-1}$ is a function. It is easy to verify that
$f^{-1} \circ f = 1_A$, $f \circ f^{-1} = 1_B$.
To tie it up with previous definitions, note that if we let
ϕ : A × B → B × A, ϕ( a, b) = (b, a),
then
$\varphi(\Gamma_f) = \Gamma_{f^{-1}}$.
In fact, this gives another way of defining $f^{-1}$.
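The description of the inverse through graphs translates almost word for word into code: swap the coordinates of every pair. The Python sketch below does this for a small bijection; the sample function (squaring on {0, 1, 2, 3}) and the names are assumptions chosen for illustration.

```python
def swap_graph(graph):
    """Apply (a, b) -> (b, a) to every pair of a graph."""
    return {(b, a) for (a, b) in graph}

# A bijection f : A -> B with A = {0, 1, 2, 3} and B = {0, 1, 4, 9}, given by its values.
f = {0: 0, 1: 1, 2: 4, 3: 9}
graph_f = set(f.items())

# The swapped graph is the graph of the inverse function B -> A.
f_inv = dict(swap_graph(graph_f))

# Both compositions are identity functions.
left_identity = all(f_inv[f[a]] == a for a in f)        # inverse after f is the identity on A
right_identity = all(f[f_inv[b]] == b for b in f_inv)   # f after inverse is the identity on B
```

If f were not injective, two pairs of the swapped graph would share a first coordinate and the swapped set would no longer be the graph of a function, which is the graph-theoretic reason bijectivity is required.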
4. Cardinality of a set
Imagine a group of students about to enter a lecture hall. The instructor wants to know if there are
sufficient chairs for all the students. There are two ways to do that. One is to count both students and
chairs separately and determine which is larger. The other is to ask each student to take a seat. If there are
students left standing, the number of chairs is too small. If there are chairs left unoccupied, there are more
chairs than students. In the remaining case there is a perfect match between students and chairs and so their
number (cardinality) is equal.
This idea proves very powerful in discussing the cardinality (“size", “magnitude", “number of elements") of
sets, finite or infinite.
Georg Cantor revolutionized mathematics,10 and human thought, by defining two sets A, B, possibly
infinite, to be of equal cardinality, denoted | A| = | B|, if there is a bijective function
f : A → B.
10Cantor’s own words are as fresh as ever; the following are the opening paragraphs of Cantor’s paper whose title is translated
into English as “Contributions to the founding of the theory of transfinite numbers". The translation is based on the Dover
edition, but I have replaced “aggregate", “uniting" etc. by the modern terminology “set", “union" and so on, and the use of
quotation marks by italics.
“By a set we are to understand any collection into a whole M of definite and separate objects m of our intuition or our thought.
These objects are called the elements of M. In signs we express this thus:
(1) M = { m }.
We denote the union of many sets M, N, P, . . . , which have no common elements, into a single set by
(2) ( M, N, P, . . . ).
The elements of this set are, therefore, the elements of M, of N, of P, . . . , taken together. [....]
Every set M has a definite power, which we will also call its cardinal number. We will call by the name power or cardinal number
of M the general concept which, by means of our active faculty of thought, arises from the set M when we make abstraction of
the nature of its various elements m and of the order in which they are given."
What is striking to me is that Cantor’s investigation is intimately tied with introspective meditation on the nature of thought
and abstraction. The notion of cardinal number is not claimed to be absolute; it arises as a result of our thought process. Note
the effort in trying to explain what cardinality is. In current times such efforts are rare. One simply says when two sets have the
same cardinality, as we have done, dodging the issue of what that cardinality is.
Cantor suffered vicious personal attacks in reaction to his theory. Leopold Kronecker’s public opposition and personal attacks
included describing Cantor as a "scientific charlatan", a "renegade" and a "corrupter of youth." On the other hand, David
Hilbert defended Cantor by declaring "No one shall expel us from the Paradise that Cantor has created." However, this came
too late for Cantor, who had already died by then.
14 EYAL GOREN MCGILL UNIVERSITY
(Note that then there is an inverse function f⁻¹ : B → A which is also a bijection, so it doesn’t matter if we
require a bijection from A to B or from B to A.) He defined the cardinality of A to be no larger than B’s if
there is an injective function
f : A → B,
and this is denoted | A| ≤ | B|. We also say that the cardinality of A is less than B’s, | A| < | B|, if | A| ≤ | B|
and | A| ≠ | B|. As a sanity check we’d like to know at least the following.
Proposition 4.0.1. Let A, B, C be sets. If | A| = | B| and | B| = |C | then | A| = |C |.
Proof. Let f : A → B, g : B → C be bijections. Then
g◦ f: A → C
is also a bijection. Indeed: if for some x, y ∈ A we have ( g ◦ f )( x ) = ( g ◦ f )(y) then g( f ( x )) = g( f (y)).
Since g is injective, f ( x ) = f (y) and, since f is injective, x = y.
To show g ◦ f is surjective, let c ∈ C and choose b ∈ B such that g(b) = c; such b exists since g is surjective.
Since f is surjective, there is an a ∈ A such that f ( a) = b. Then ( g ◦ f )( a) = g( f ( a)) = g(b) = c.
To show the definitions and notations make sense at all, we surely need to know the following theorem.
Theorem 4.0.2 (Cantor-Bernstein). If | A| ≤ | B| and | B| ≤ | A| then | A| = | B|.
The proof is ingenious and intricate, yet it requires no more background than we already have. Knowing the
proof is not required for this course; we give it in Appendix A. We would like to explain, though, why the
theorem is not obvious.
What we are given is that there is some injective function f from A to B and some injective function g
from B to A. Those functions need not be related to each other in any way. We should conclude from this that
there is a bijective function h from A to B. It is not necessarily true that h = f . One should somehow
construct h from f and g. Here is an example: let A be the set of points in the plane at distance at most 1
from the origin (the closed unit disk) and B the square [−1, 1] × [−1, 1]. The function
f : A → B, f ( x ) = x,
is a well-defined injective function but not a bijection. The function
g : B → A, g(b) = b/√2,
is also a well-defined injective function, but not a bijection. One can find a bijection from A to B, but it is not
immediately clear how to find it based on the knowledge of f and g (and in fact in this particular example, it
is better to “rethink the situation" rather than to deduce it from f and g).
A set A is called countable (or enumerable) if it is either finite or has the same cardinality as N. If it is
finite there is a bijective function f : {0, 1, . . . , n − 1} → A, where n is the number of elements of A and so
A = { f (0), f (1), . . . , f (n − 1)}. If A is infinite, there is a bijective function f : N → A. The elements of A
are thus { f (0), f (1), f (2), f (3), . . . }. If we introduce the notation ai = f (i ) then we can also enumerate the
elements of A as { a0 , a1 , a2 , a3 , . . . } and this explains the terminology.
COURSE NOTES - ALGEBRA I 15
Example 4.0.4. Let A be the set {0, 2, 4, 6, . . . } and B the set {0, 1, 4, 9, 16, 25, . . . }. Then
|N| = | A | = | B |.
Indeed, one verifies that the functions
f : N → A, f ( x ) = 2x,
and
g : N → B, g( x ) = x²,
are bijections. We can then conclude that | A| = | B| and, if we want to, we can find the bijection. It is
h = g ◦ f⁻¹ ; that is, h( x ) = ( x/2)².
f : N → A, f ( x ) = 4x + 3.
First, this is well defined. Namely, 4x + 3 is really in A. This is because if n is a square, n leaves residue 1
or 0 when divided by 4 (if n = (2m)² = 4m² the residue is zero; if n = (2m + 1)² = 4(m² + m) + 1 the
residue is one). But 4x + 3 leaves residue 3. Clearly f is injective and so |N| ≤ | A|.
Proof. We define
\[ f : \mathbb{Z} \to \mathbb{N}, \qquad f(x) = \begin{cases} 2x & x \ge 0 \\ -2x - 1 & x < 0. \end{cases} \]
Then f is a bijective function, as is easy to check.
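The "easy to check" claim can be explored numerically. The following is only a sketch: it tests the map on a finite window of integers (the window −50, ..., 49 and the helper name f_inverse are our own choices, not part of the notes), which illustrates the bijection but does not replace the proof.

```python
def f(x: int) -> int:
    """The map Z -> N from the proof: 2x for x >= 0 and -2x - 1 for x < 0."""
    return 2 * x if x >= 0 else -2 * x - 1


def f_inverse(n: int) -> int:
    """Candidate inverse N -> Z: even n comes from n/2, odd n from -(n + 1)/2."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


# On the window -50..49 the values of f are exactly 0..99, each hit once.
values = sorted(f(x) for x in range(-50, 50))
round_trip_ok = all(f_inverse(f(x)) == x for x in range(-50, 50))
```

Non-negative integers land on the even numbers and negative integers on the odd numbers, which is why the two branches never collide.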
Proposition 4.0.7. |N| = |N × N|.
g : N × N → N, g(n, m) = 2^n 3^m .
g = ( f , f ) : N × N → Z × Z,
is also a bijection.
Exercise 4.0.9. Prove that |N| = |Q|. (Hint: there’s an easy injection Q → Z × Z).
When Cantor laid down the foundations for the study of infinite cardinals he also dropped a bombshell:
Proof. Suppose that |N| = |R|. We can then enumerate the real numbers as a0 , a1 , a2 , . . . . Let us write the
decimal expression of each number as
where we agree to use 000000000 . . . instead of 999999999 . . . (so we write 1.00000000000 . . . and not
0.9999999999 . . . , etc.). Here ei is a sign, + or −, and each bij , cij is a digit, i.e. in {0, 1, . . . , 9}.
Now consider the number
0.e0 e1 e2 e3 e4 . . . , where ei = 3 if cii ≠ 3 and ei = 4 if cii = 3.
This is a real number that differs from each ai at the (i + 1)-th digit after the decimal point and hence is not
equal to any ai . It follows that the list a0 , a1 , a2 , . . . cannot consist of all real numbers and so we arrive at a
contradiction.
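The diagonal construction can be tried on a finite list. The sketch below is an illustration under our own assumptions: each number is represented only by a string of its digits after the decimal point, and the function name diagonal_digits and the sample rows are hypothetical.

```python
def diagonal_digits(rows: list[str]) -> str:
    """rows[i] holds the digits after the decimal point of a_i.

    Returns the digits e_0 e_1 ... with e_i = 3 if c_ii != 3 and e_i = 4 if c_ii == 3,
    so the result differs from rows[i] at position i.
    """
    return "".join("4" if row[i] == "3" else "3" for i, row in enumerate(rows))


# Sample digit strings (the first digits of 1/3, 1/2, 1/7 and pi - 3).
rows = ["3333", "5000", "1428", "1415"]
d = diagonal_digits(rows)
differs_everywhere = all(d[i] != rows[i][i] for i in range(len(rows)))
```

Because the new digits are only ever 3 or 4, the constructed number never ends in a tail of 0's or 9's, so the convention about 999... plays no role for it.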
Remark 4.1.2. The Continuum hypothesis, formulated by Georg Cantor, asserts that there is no set A
such that |N| < | A| < |R|. Much later, Kurt Gödel and Paul Cohen proved that neither the hypothesis,
nor its negation, can be proven from the standard axioms of set theory. That is, in the axiom system we
are using in mathematics, we cannot prove the hypothesis or provide a counter-example. These discoveries
were nothing short of shocking. They came at a time when humanity was fascinated with its own power,
especially the power of the intellect. These results proved the inherent limitations of thought.
One may ask at this point if there is a cardinality bigger than that of R. That is, is there a set T such that
|R| < | T |. The answer to that is yes. One may take T to be the set of all subsets of R. This is discussed in
the exercises.
5. Relations
A relation on a set S is best described as a subset Γ ⊂ S × S. We say that s ∈ S is “related" to t ∈ S if (s, t) ∈ Γ.
Though the format reminds one of functions, the actual relevance of the notion of functions here is minimal.
For example, usually for a given s there will be many elements t such that (s, t) ∈ Γ, which is the opposite of
what we require for functions, where there is precisely one t for a given s. We shall usually denote that x is
related to y, namely that ( x, y) ∈ Γ, by x ∼ y.
Note that so far the definition is wide enough to allow any Γ. Here are some basic examples:
(1) Γ = S × S. In this case for any x, y, we have x ∼ y. Any two elements are related.
(2) Γ = ∅. In this case for no x, y we have x ∼ y. No elements are related (including an element with
itself).
(3) Γ = {(s, s) : s ∈ S} (the “diagonal"). In this case x ∼ x for all x ∈ S, but if x ≠ y then x ≁ y.
(4) Γ = {( x, y) : x, y ∈ S, x ≤ y}, where S is the interval of real numbers [0, 1]. Then to say that x ∼ y
means that x ≤ y, in the sense of the usual inequality of real numbers.
(5) Γ = {( x, y) : x, y ∈ Z, 5 | ( x − y)} is the relation on Z where x ∼ y if x − y is divisible by 5.
A relation on S is called reflexive if for all x ∈ S we have x ∼ x. In words: every element is related to itself.
The relations in (1), (3), (4) and (5) are reflexive, but the relation in (2) is not (except in the trivial case
where S is the empty set).
A relation is called symmetric if for all x, y ∈ S, x ∼ y ⇒ y ∼ x. In words, whenever x is related to y
also y is related to x. The relations (1), (2), (3), (5) are symmetric, but (4) is not.
A relation is called transitive if for all x, y, z ∈ S, x ∼ y and y ∼ z implies x ∼ z. All the relations (1) -
(5) are transitive.
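For a finite set the three properties can be checked by brute force. The sketch below does this for analogues of examples (1) - (5); as an assumption made only for illustration, every relation is placed on the finite sample S = {−10, ..., 10} of integers (in particular, example (4) is tested on this sample and not on the interval [0, 1]), and the helper names are our own.

```python
from itertools import product


def is_reflexive(S, rel):
    return all(rel(x, x) for x in S)


def is_symmetric(S, rel):
    return all(rel(y, x) for x, y in product(S, repeat=2) if rel(x, y))


def is_transitive(S, rel):
    return all(rel(x, z) for x, y, z in product(S, repeat=3) if rel(x, y) and rel(y, z))


S = range(-10, 11)  # finite sample, chosen only for illustration
examples = {
    "(1) all pairs": lambda x, y: True,
    "(2) no pairs": lambda x, y: False,
    "(3) diagonal": lambda x, y: x == y,
    "(4) x <= y": lambda x, y: x <= y,
    "(5) 5 | (x - y)": lambda x, y: (x - y) % 5 == 0,
}
for name, rel in examples.items():
    print(name, is_reflexive(S, rel), is_symmetric(S, rel), is_transitive(S, rel))
```

A check on a finite sample can only refute a property, not establish it on an infinite set; here it agrees with the claims made in the text.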
A relation on a set S is called a partial order if it is reflexive and transitive and, in addition, if both x ∼ y
and y ∼ x then x = y. We then use the notation x ≤ y for x ∼ y and, in this notation, we always have:
x ≤ x, the implication x ≤ y, y ≤ z ⇒ x ≤ z holds, and if both x ≤ y and y ≤ x then x = y. There may
very well be x, y for which neither x ≤ y nor y ≤ x holds. A linear order (or a simple order or a total order)
is a partial order such that for every x, y we have either x ≤ y or y ≤ x.
For example, for real numbers we have the relation ≤ of “less than or equal to". That is, for real numbers
a, b we have the notion of a ≤ b and this is a linear ordering. It is reflexive (a ≤ a), satisfies transitivity
(a ≤ b and b ≤ c implies that a ≤ c) and if both a ≤ b and b ≤ a then a = b. We can put a relation on the
positive natural numbers N+ by saying that a ∼ b if a|b. Since we have a|b and b|c implies a|c this relation
is transitive. Also, if a|b and b| a then a = b and, clearly, always a| a. Thus, this is a partial ordering of the
natural numbers. Note that for a = 2, b = 3 neither a|b nor b| a. That is, this is not a linear order.
Another important class of relations, even more important for this course than order relations, are the
equivalence relations. They are very far from order relations. A relation is called an equivalence relation if
it satisfies the following properties:
(1) (Reflexive) For every x we have x ∼ x.
(2) (Symmetric) If x ∼ y then y ∼ x.
(3) (Transitive) If x ∼ y and y ∼ z then x ∼ z.
Equivalence relations arise when one wishes to identify elements in a given set R according to some principle.
If we reflect on the meaning of “identify" we see that what we aim at is that x is identified with x (well,
obviously!), that if x is identified with y then y is identified with x, and that if x is identified with y and y is
identified with z then, by all accounts, x should be identified with z too. That is, we aim at an equivalence
relation. And conversely, an equivalence relation is a way to identify elements in a set. We identify x with all
the elements y such that x ∼ y.
Example 5.0.1. (1) A permutation σ of a set A is a bijection σ : A → A. A permutation of n elements
is a bijection
σ : {1, 2, . . . , n} → {1, 2, . . . , n}.
We denote by Sn the set of all permutations σ of n elements. There are n! permutations, that is,
|Sn | = n!. Define a relation on Sn by saying for two permutations σ, τ that
σ ∼ τ if σ (1) = τ (1).
This is an equivalence relation.
(2) Here is a simple example, that will be generalized in §12. Define a relation on the integers Z by
saying that a ∼ b if a − b is even. This is an equivalence relation as a − a = 0 is always even, if a − b
is even then so is b − a and if both a − b and b − c are even then so is a − c = ( a − b) + (b − c).
(3) Define a relation on sets11 by saying that
A ∼ B if | A| = | B|.
This is an equivalence relation. The identity function A → A shows A ∼ A. If A ∼ B there is a
bijection f : A → B and then the inverse function f −1 : B → A is a bijection too, showing B ∼ A.
Finally, if | A| = | B| and | B| = |C | then | A| = |C |, by Proposition 4.0.1.
11We ignore here the fact that the collection of all sets is not a set and we only defined relations on a set. The example would
be correct if we restricted ourselves to all sets that are subsets of some fixed given set.
Let S be a set and Si , i ∈ I, be subsets of S. We say that S is the disjoint union of the sets Si if S = ∪_{i∈I} Si
and for any i ≠ j we have Si ∩ Sj = ∅. We denote this by
\[ S = \coprod_{i \in I} S_i. \]
Lemma 5.0.2. Let ∼ be an equivalence relation on a set S. Define the equivalence class [ x ] of an element x ∈
S as follows:
[ x ] = {y : y ∈ S, x ∼ y}.
This is a subset of S. The following holds:
(1) Two equivalence classes are either disjoint or equal.
(2) S is a disjoint union of equivalence classes.
Conversely, if S is a disjoint union S = ∐_{i∈I} Ui of non-empty sets Ui (this is called a partition of S) then
there is a unique equivalence relation on S for which the Ui are the equivalence classes.
Proof. Let x, y be elements of S and suppose that [ x ] ∩ [y] ≠ ∅. Then, there is an element z such that
x ∼ z, y ∼ z. Since ∼ is symmetric also z ∼ y and using transitivity x ∼ y. Now, if s ∈ [y] then y ∼ s and
by transitivity x ∼ s and so s ∈ [ x ] and we showed [y] ⊂ [ x ]. Since x ∼ y also y ∼ x and the same argument
gives [ x ] ⊂ [y]. We conclude that [ x ] = [y].
Every element of S lies in the equivalence class of itself. It follows that S is a disjoint union of equivalence
classes.
To prove the second part of the lemma, we define that x ∼ y if both x and y lie in the same set Ui . It is
clearly reflexive and symmetric. It is also transitive: x ∼ y means x, y ∈ Ui for some i, y ∼ z means y, z ∈ Uj
for some j. But there is a unique Ui containing y because the union is a disjoint union. That is Ui = Uj and
so x, z ∈ Uj ; that is, x ∼ z. The equivalence classes are clearly the Ui .
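For a finite set the partition into equivalence classes can be computed directly. The following sketch assumes that the function passed in really is an equivalence relation, which is what makes comparison with a single representative of each class sufficient; the function name and the sample set {−4, ..., 4} are our own.

```python
def equivalence_classes(S, related):
    """Partition the finite collection S into the classes of the equivalence relation `related`."""
    classes = []
    for x in S:
        for cls in classes:
            if related(cls[0], x):  # one representative suffices, by symmetry and transitivity
                cls.append(x)
                break
        else:
            classes.append([x])  # x is not related to any earlier element: a new class
    return classes


# The relation "a - b is even" on a small sample of integers.
classes = equivalence_classes(range(-4, 5), lambda a, b: (a - b) % 2 == 0)
```

The result has two classes, the even and the odd elements of the sample, and every element appears in exactly one of them, as the lemma predicts.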
Example 5.0.3. We revisit Example 5.0.1 and find the equivalence classes. In the first case, there are n
equivalence classes [σ1 ], . . . , [σn ] (each having (n − 1)! permutations in it), where we may choose σi to be
the permutation:
\[ \sigma_i(j) = \begin{cases} j & j \neq 1, i \\ 1 & j = i \\ i & j = 1. \end{cases} \]
That is, σi is the permutation sending 1 to i and i to 1 and leaving all other elements intact.
In the second case there are two equivalence classes, the even integers and the odd integers.
In the third case, the equivalence classes are the cardinalities, per definition (namely, one could say that
a cardinality is an equivalence class of sets under the relation of existence of bijection). Some distinct
equivalence classes are “all sets of cardinality 3" (that is, all sets that are in bijection with {1, 2, 3}, which are
precisely the sets of 3 elements), “all sets of cardinality ℵ0 " (that is, all sets that are in bijection with N and
this includes Z, N × N, Q, ....), “all sets of cardinality 2^{ℵ0} ", which is the cardinality of the real numbers R
(see also Exercise 12).
We introduce the following terminology: Let S be a set with an equivalence relation. We say that a subset
T ⊆ S is a complete set of representatives if S = ∐_{t∈T} [t]. That is, S is the disjoint union of the equivalence
classes of the elements in T: every equivalence class [ x ] in S is equal to an equivalence class [t] for a unique
element t ∈ T.
Sometimes we want to index the elements of T and so as a notational variant we say that a subset { xi :
i ∈ I } ⊆ S, I some index set, is a complete set of representatives if the equivalence classes [ xi ] are disjoint
and S = ∪_{i∈I} [ xi ]. This means that every equivalence class is of the form [ xi ] for a unique i ∈ I. That is, if
i ≠ j then [ xi ] ≠ [ xj ] (and in fact [ xi ] ∩ [ xj ] = ∅).
6. Number systems
Again we start with an apology of a sort. The formal discussion of number systems is a rather involved piece
of mathematics. Our approach is pragmatic. We assume that at some level we all know what are integers
and real numbers and no confusion shall arise there. We use those to define more complicated notions.
As we have already said, we denote the natural numbers N by
N = {0, 1, 2, . . . }, N+ = {1, 2, 3, . . . }.
We also denote the integers by
Z = {. . . , −2, −1, 0, 1, 2, . . . }.
The rational numbers are the set
\[ \mathbb{Q} = \left\{ \frac{a}{b} : a, b \in \mathbb{Z},\ b \neq 0 \right\}, \]
where we identify a/b with c/d if ad = bc. The real numbers R are the “points on the line". Each real number
has a decimal expansion such as 0.19874526348 . . . that may or may not repeat itself from some point on.
For example:
1/3 = 0.3333333 . . . , 1/2 = 0.5000000 . . . , 1/7 = 0.142857142857142857142857142857 . . . ,
π = 3.141592653589793238462643383 . . . .
It is a fact that a number is rational if and only if from some point on its decimal expansion becomes periodic.
The length of the period of a rational number can be explained. For example, the period of a number of the
form 1/n, where n ≥ 1 is an integer divisible by neither 2 nor 5, is found as follows.
Consider the sequence 1, 10, 100, 1000, . . . and their residues upon division by n, say r0 = 1, r1 , r2 , . . . .
The first d > 0 such that rd = 1 is the period of the decimal expansion of 1/n. For example, for n =
3 we have 1, 10, 100, . . . that give residues 1, 1, 1, . . . when divided by 3, and indeed 1/3 = 0.3333 . . . has
period 1. On the other hand, the residues of 1, 10, 100, . . . when divided by 7 are r0 = 1, r1 = 3, r2 =
2, r3 = 6, r4 = 4, r5 = 5, r6 = 1 which predicts a period of 6 for the decimal expansion of 1/7 and indeed
1/7 = 0.142857142857142857142857142857 . . . .
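The recipe above translates directly into a short loop. This is a minimal sketch; the function name decimal_period is our own, and we choose to exclude n = 1, where every residue is 0 and the recipe does not apply as stated.

```python
def decimal_period(n: int) -> int:
    """Return the first d > 0 with 10**d leaving residue 1 modulo n.

    Assumes n > 1 is divisible by neither 2 nor 5.
    """
    if n <= 1 or n % 2 == 0 or n % 5 == 0:
        raise ValueError("n must be greater than 1 and divisible by neither 2 nor 5")
    r, d = 10 % n, 1  # r is the residue r_d of 10**d
    while r != 1:
        r = (r * 10) % n
        d += 1
    return d


periods = {n: decimal_period(n) for n in (3, 7, 11, 13)}
```

For n = 7 the loop passes through the residues 3, 2, 6, 4, 5, 1, exactly the list r1, ..., r6 computed in the text.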
(We shall later say that the complex numbers form a field.) In fact, given these rules and i² = −1, there
is no need to memorize the formula for multiplication as it just follows by expansion: ( a + bi )(c + di ) =
ac + adi + bci + bdi² = ac − bd + adi + bci = ac − bd + ( ad + bc)i.
Let z = a + bi be a complex number. We define the complex conjugate of z, z̄, as follows:
z̄ = a − bi.
Lemma 6.0.1. The complex conjugate has the following properties:
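(1) $\overline{\overline{z}} = z$.
(2) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$, $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$.
(3) $\mathrm{Re}(z) = \frac{z + \bar{z}}{2}$, $\mathrm{Im}(z)\, i = \frac{z - \bar{z}}{2}$.
(4) Define for z = a + bi,
\[ |z| = \sqrt{a^2 + b^2}. \]
(This is just the distance of the point ( a, b) from the origin.) Then |z|² = z · z̄ and the following
holds:
\[ |z_1 + z_2| \le |z_1| + |z_2|, \qquad |z_1 \cdot z_2| = |z_1| \cdot |z_2|. \]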
Thus, the assertion |z1 · z2 | = |z1 | · |z2 | follows by taking roots. The inequality |z1 + z2 | ≤ |z1 | + |z2 | viewed
in the plane model for complex numbers is precisely the assertion that the sum of the lengths of two sides of
a triangle is greater or equal to the length of the third side.
Example 6.0.2. If z ≠ 0 then z has an inverse with respect to multiplication. Indeed, z · (z̄/|z|²) = (z · z̄)/|z|² = 1. We
write
z⁻¹ = z̄/|z|².
Just to illustrate calculation with complex numbers, we calculate 1 + 2i + (3 − i)/(1 + 5i). Using z⁻¹ = z̄/|z|², we
have 1/(1 + 5i) = (1 − 5i)/26 and so (3 − i)/(1 + 5i) = (3 − i)(1 − 5i)/26 = −1/13 − (8/13)i and thus
1 + 2i + (3 − i)/(1 + 5i) = 12/13 + (18/13)i.
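The calculation can be replayed with exact rational arithmetic. In this sketch a complex number a + bi is represented by the pair (a, b); that representation and the helper names inverse and multiply are our own choices for illustration.

```python
from fractions import Fraction


def inverse(a, b):
    """Return 1/(a + bi) as a pair (real, imag), using z^{-1} = conj(z)/|z|^2."""
    norm_sq = a * a + b * b
    return Fraction(a, norm_sq), Fraction(-b, norm_sq)


def multiply(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, on pairs (real, imag)."""
    (a, b), (c, d) = z, w
    return a * c - b * d, a * d + b * c


inv = inverse(1, 5)                          # 1/(1 + 5i)
quotient = multiply((3, -1), inv)            # (3 - i)/(1 + 5i)
total = (1 + quotient[0], 2 + quotient[1])   # 1 + 2i + (3 - i)/(1 + 5i)
```

Working with Fraction avoids rounding, so the pairs can be compared with −1/13 − (8/13)i and 12/13 + (18/13)i exactly.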
6.1. The polar representation. Considering a complex number z = a + bi in the plane model as the vector
( a, b) we see that we can describe each complex number z by the length r of the corresponding vector
( a, b) and the angle θ it forms with the real axis.
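[Figure: the complex number z drawn in the plane as a vector of length r forming the angle θ with the real axis.]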
We have
r = |z|, sin θ = Im(z)/|z|, cos θ = Re(z)/|z|.
Lemma 6.1.1. If z1 has parameters r1 , θ1 and z2 has parameters r2 , θ2 then z1 z2 has parameters r1 r2 , θ1 + θ2
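(up to multiples of 360° or 2π rad).
Proof. We have r1 r2 = |z1 ||z2 | = |z1 z2 | and this shows that r1 r2 is the length of z1 z2 . Let θ be the angle
of z1 z2 ; then
\begin{align*}
\sin\theta &= \frac{\mathrm{Im}(z_1 z_2)}{|z_1 z_2|} \\
&= \frac{\mathrm{Re}(z_1)\,\mathrm{Im}(z_2) + \mathrm{Re}(z_2)\,\mathrm{Im}(z_1)}{|z_1||z_2|} \\
&= \frac{\mathrm{Re}(z_1)}{|z_1|}\cdot\frac{\mathrm{Im}(z_2)}{|z_2|} + \frac{\mathrm{Re}(z_2)}{|z_2|}\cdot\frac{\mathrm{Im}(z_1)}{|z_1|} \\
&= \cos\theta_1 \sin\theta_2 + \cos\theta_2 \sin\theta_1 \\
&= \sin(\theta_1 + \theta_2).
\end{align*}
Similarly, we get
\begin{align*}
\cos\theta &= \frac{\mathrm{Re}(z_1 z_2)}{|z_1 z_2|} \\
&= \frac{\mathrm{Re}(z_1)}{|z_1|}\cdot\frac{\mathrm{Re}(z_2)}{|z_2|} - \frac{\mathrm{Im}(z_1)}{|z_1|}\cdot\frac{\mathrm{Im}(z_2)}{|z_2|} \\
&= \cos(\theta_1)\cos(\theta_2) - \sin(\theta_1)\sin(\theta_2) \\
&= \cos(\theta_1 + \theta_2).
\end{align*}
It follows that θ = θ1 + θ2 up to multiples of 360°.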
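The lemma can be observed numerically with the standard library function cmath.polar, which returns the pair (length, angle in radians). The two sample numbers below are arbitrary choices of ours, and the check is a floating-point illustration, not a proof.

```python
import cmath
import math

z1, z2 = 1 + 1j, -2 + 3j          # arbitrary sample values
r1, t1 = cmath.polar(z1)
r2, t2 = cmath.polar(z2)
r, t = cmath.polar(z1 * z2)

# Lengths multiply; angles add, but only up to a multiple of 2*pi.
angle_gap = (t - (t1 + t2)) % (2 * math.pi)
angles_agree = min(angle_gap, 2 * math.pi - angle_gap) < 1e-9
lengths_agree = math.isclose(r, r1 * r2)
```

The reduction modulo 2π is needed because cmath.polar reports angles in the interval (−π, π], so t1 + t2 may fall outside that range.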
6.2. The complex exponential function. Let θ be any real number. Let e^{iθ} denote the unit vector whose
angle is θ. Note that we define e^{iθ} this way. That is, e^{iθ} is the complex number of length 1 and angle θ.
Clearly we have
e^{iθ} = cos θ + i sin θ.
If z is any complex number with length r and angle θ then we have the equality z = |z|e^{iθ} . The formula we
proved in Lemma 6.1.1 is
z1 z2 = |z1 ||z2 | e^{i(θ1 +θ2 )}
and in particular, if we take the complex numbers e^{iθ1} , e^{iθ2} themselves, we find that
e^{iθ1} e^{iθ2} = e^{i(θ1 +θ2 )} .
Let z = a + bi be a complex number; then we define
e^z = e^a e^{ib} ,
where e^a is the usual exponential (a is a real number) and e^{ib} is as defined above. Combining the formulas
for the real exponent with our definition for e^{iθ} , we conclude that for any complex numbers z1 , z2 ,
e^{z1} e^{z2} = e^{z1 + z2} .
We have defined here e^z in a purely formal way. A more analytic approach, and in fact the “correct" approach,
is the following:
We say that a sequence of complex numbers z1 , z2 , . . . converges to a complex number A if
\[ \lim_{n \to \infty} |A - z_n| = 0. \]
In this sense, one can show that for every complex number z the series
\[ 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots + \frac{z^n}{n!} + \cdots \]
converges and is equal to e^z . This is well known to hold for z a real number, and so we see that our definition
of e^z for z a complex number is a natural extension of the function e^z for z a real number. See Appendix D
for more on the complex exponential function.
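The agreement between the formal definition and the series can be observed numerically. The sketch below compares a partial sum of the series with e^a (cos b + i sin b); the sample value z = 1 + 2i and the cutoff of 40 terms are arbitrary assumptions of ours, and the comparison is in floating point only.

```python
import math


def exp_partial_sum(z: complex, terms: int = 40) -> complex:
    """Sum of z**n / n! for n = 0, ..., terms - 1."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term         # here term equals z**n / n!
        term *= z / (n + 1)   # advance to z**(n + 1) / (n + 1)!
    return total


z = 1 + 2j
by_definition = math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))
by_series = exp_partial_sum(z)
gap = abs(by_series - by_definition)
```

Each term is obtained from the previous one by multiplying by z/(n + 1), which avoids computing large powers and factorials separately.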
Example 6.2.1. Consider the equation x^n − a = 0, where a is a non-zero complex number. We claim that
this equation has n distinct roots in C. Write a = re^{iθ} (so | a| = r and the line from 0 to a forms an angle θ
with the real axis). A complex number z = Re^{iΘ} is a solution to the equation if and only if z^n = R^n e^{inΘ} = re^{iθ} .
That is, if and only if
R^n = r, nΘ ≡ θ (mod 2π ).
Thus, the solutions are exactly
\[ z = r^{1/n} e^{i\left(\frac{\theta}{n} + j \cdot \frac{2\pi}{n}\right)}, \qquad j = 0, 1, \dots, n - 1. \]
In particular, taking a = 1 the solutions are called the roots of unity of order n. There are precisely n of
them: the points on the unit circle having angles 0, 2π/n, 2 · 2π/n, . . . , (n − 1) · 2π/n.
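The formula for the solutions can be evaluated directly. This is a floating-point sketch under our own naming (nth_roots is hypothetical); it uses cmath.polar to obtain r and θ and then lists the n values of z.

```python
import cmath
import math


def nth_roots(a: complex, n: int) -> list[complex]:
    """The n solutions of x**n = a for a non-zero a, following the formula of Example 6.2.1."""
    r, theta = cmath.polar(a)
    return [
        r ** (1 / n) * cmath.exp(1j * (theta / n + j * 2 * math.pi / n))
        for j in range(n)
    ]


cube_roots = nth_roots(-8, 3)        # solutions of x**3 = -8
fourth_roots_of_unity = nth_roots(1, 4)
```

For a = −8 and n = 3 one of the three values is (up to rounding) the real number −2; for a = 1 and n = 4 the values are approximately 1, i, −1, −i.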
6.3. The Fundamental Theorem of Algebra. A complex polynomial f ( x ) is an expression of the form
a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 ,
where n is a non-negative integer, x is a variable, and the coefficients a_i are complex numbers. If all the
coefficients are real we may call it a real polynomial; if all the coefficients are rational numbers we may call
it a rational polynomial and so on. But note that x² + 1 is at once a rational, a real and a complex polynomial.
The zero polynomial, denoted 0, is the case when n = 0 and a_0 = 0.
A polynomial defines a function
f : C → C, z ↦ f ( z ) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0 .
The notation ↦ appearing in this formula means “maps to". If a_n ≠ 0 then we say f has degree n. If f (z) = 0
for some particular complex number z, we say that z is a root (or a solution, or a zero) of the polynomial f .
Example 6.3.1. Consider the polynomial f ( x ) = x² + 1. It has degree 2 and f (i ) = i² + 1 = −1 + 1 =
0, f (−i ) = (−i )² + 1 = −1 + 1 = 0. So i and −i are roots of f . This is a special case of Example 6.2.1,
because x⁴ − 1 = ( x² + 1)( x² − 1).
Theorem 6.3.2 (The Fundamental Theorem of Algebra). Let f ( x ) be a complex polynomial of degree at
least 1. Then f ( x ) has a root in C.
Proofs of the theorem are beyond the scope of this course. There are many proofs. In the course Honours
Algebra 4 one sees an algebraic proof using Galois theory; in the course Complex Variables and Transforms
one sees an analytic proof. The theorem is attributed to Gauss who proved it in 1799. There are some issues
regarding the completeness of that proof; he later published several other proofs of the theorem. Gauss didn’t
discover the theorem, though; many attempts and partial results were known before his work, and he was
aware of that literature.
Proposition 6.3.3. Let f ( x ) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 be a complex non-zero polynomial of
degree n. Then
\[ f(x) = a_n \prod_{i=1}^{n} (x - z_i), \]
for suitable complex numbers zi , not necessarily distinct. The numbers zi are all roots of f and any root of f
is equal to some zi . Moreover, this factorization is unique.
Proof. We prove the result by induction on n. For n = 0 we understand the product ∏_{i=1}^{n} ( x − zi ) as 1, by
definition.12 And so, the claim is just that a constant polynomial is equal to its leading coefficient. Clear.
Now, assume that f has degree at least one. By the Fundamental Theorem of Algebra there is a complex
number zn , say, such that f (zn ) = 0. We claim that for every complex number z we can write
f ( x ) = ( x − z) g( x ) + r,
12This is a convention: the empty product is equal to one, the empty sum is equal to zero.
where g( x ) is a polynomial of degree n − 1 and leading coefficient a_n and r is a complex number. Indeed,
write g( x ) = b_{n−1} x^{n−1} + · · · + b_1 x + b_0 and equate the coefficients of x^n , . . . , x in ( x − z) g( x ) = b_{n−1} x^n + (b_{n−2} −
z b_{n−1} ) x^{n−1} + · · · + (b_0 − z b_1 ) x − z b_0 and f ( x ). We want complex numbers b_0 , . . . , b_{n−1} such that
b_{n−1} = a_n , (b_{n−2} − z b_{n−1} ) = a_{n−1} , . . . , (b_0 − z b_1 ) = a_1 ,
and there is no problem solving these equations. Thus, we can choose g( x ) with a leading coefficient an such
that f ( x ) − ( x − z) g( x ) = r is a constant. Note that g( x ) depends on z, but this is not reflected in our
notation.
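Solving the equations for the b's from the top down is a short loop (often called synthetic division). The sketch below is one way to organize it; the function name divide_by_linear and the convention of listing coefficients from the leading one down are our own assumptions.

```python
def divide_by_linear(coeffs, z):
    """Divide f by (x - z).

    coeffs = [a_n, ..., a_1, a_0]. Returns ([b_{n-1}, ..., b_0], r)
    with f(x) = (x - z) g(x) + r.
    """
    b = []
    carry = 0
    for a in coeffs:
        carry = a + z * carry   # b_{n-1} = a_n, then b_{k-1} = a_k + z * b_k
        b.append(carry)
    r = b.pop()                 # the last value is a_0 + z * b_0, the remainder
    return b, r


g, r = divide_by_linear([1, 0, 1], 1j)        # x^2 + 1 divided by (x - i)
g2, r2 = divide_by_linear([1, 0, 0, -1], 2)   # x^3 - 1 divided by (x - 2)
```

In the first call the remainder is 0 and g(x) = x + i, in line with the fact that i is a root of x² + 1; in the second the remainder 7 equals the value of x³ − 1 at x = 2.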
Now, apply that for the particular choice z = zn . We have f ( x ) − ( x − zn ) g( x ) = r. We view r as a
polynomial and substitute x = zn . We get
f (zn ) − (zn − zn ) g(zn ) = r.
Since f (zn ) = 0 we conclude that r = 0.
We showed that if f (zn ) = 0 then
f ( x ) = ( x − zn ) g( x ), g( x ) = b_{n−1} x^{n−1} + · · · + b_0 .
In fact, b_{n−1} = a_n . Using the induction hypothesis, we have
\[ g(x) = a_n \prod_{i=1}^{n-1} (x - z_i), \]
for some complex numbers zi and so
\[ f(x) = a_n \prod_{i=1}^{n} (x - z_i). \]
We note that f (zj ) = a_n ∏_{i=1}^{n} (zj − zi ) = 0, because the product contains the term (zj − zj ). If f (z) = 0
then a_n ∏_{i=1}^{n} (z − zi ) = 0. But, if a product of complex numbers is zero one of the numbers is already zero
(use |z1 z2 · · · za | = |z1 | · |z2 | · ... · |za |). Since a_n ≠ 0, we must have z = zi for some i.
It remains to prove the uniqueness of the factorization. Suppose that
\[ f(x) = a_n \prod_{i=1}^{n} (x - z_i) = a \prod_{i=1}^{n} (x - t_i). \]
Since the leading coefficient of f is a_n we must have a = a_n . We now argue by induction on the degree of
f . The case of degree 0 is clear. Assume f has degree greater than zero. Then the ti are roots of f and
so t1 is equal to some zi , and we may re-index the zi so that t1 = z1 . Dividing both sides by x − z1 we then
conclude that13
\[ a_n \prod_{i=2}^{n} (x - z_i) = a_n \prod_{i=2}^{n} (x - t_i), \]
and, by induction, zi = ti for all i.
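We remark that for n = 1, 2 the expression of f as a product is well-known from high school:
\[ ax + b = a \cdot \left( x - \frac{-b}{a} \right), \]
\[ ax^2 + bx + c = a \cdot \left( x - \frac{-b + \sqrt{b^2 - 4ac}}{2a} \right) \cdot \left( x - \frac{-b - \sqrt{b^2 - 4ac}}{2a} \right). \]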
There are also formulas for the roots for polynomials of degree 3 and 4, but in degrees 5 and higher no such
general formulas exist. Not that they are merely unknown; they cannot exist. This follows from Galois theory,
taught in the course Algebra 4.
The story of the solvability of polynomials is fascinating. One looks for formulas for solving
polynomial equations that only involve the coefficients of the polynomials and elementary operations such
as adding, subtracting, multiplying, dividing and taking n-th roots. This is called “solving by radicals". The
formulas for solving linear and quadratic polynomials by radicals were known since antiquity: in some form or
13We say that f ( x )/g( x ) = h( x ) if h( x ) is a polynomial such that f ( x ) = g( x ) h( x ). We shall see later that h( x ) is uniquely
determined. In our case clearly f ( x )/( x − z1 ) = an ∏in=2 ( x − zi ).
another already to the Babylonians as early as 2000 BC and in a definite form to the Indian mathematician
Brahmagupta in 628 AD. This knowledge propagated through Al-Khwarizmi and others to Europe, and one
of the first European books containing such formulas was published in the 12th century by the Spanish Jewish
scholar Avraham bar Khiyya Ha-Nasi. The formulas for polynomials of degree 3 and 4 were published by the
Italian mathematician Gerolamo Cardano in 1545 (the case of a quartic reduces to a cubic and was discovered
by his student Ludovico Ferrari), and the case of cubic was also known to his contemporary and compatriot
Niccolo Fontana, who went by the name Tartaglia. In fact, Tartaglia had told Cardano how to solve cubics
but sworn him to complete secrecy. As it happened, both were preceded, by a margin of three decades, by Scipione
del Ferro, who preferred taunting his colleagues with challenges to solve particular cubic equations to
publishing his result. Cardano, after discovering del Ferro’s work, didn’t consider himself bound by his promise
to Tartaglia anymore and, indeed, in his book Ars Magna attributed the solution of the cubic to del
Ferro as well as acknowledging that Tartaglia had told him of the method, but not its proof. Cardano’s
publication didn’t sit well with Tartaglia and a bitter feud ensued. Indeed, the solutions to the cubic and
quartic equations made Cardano and his book well-known throughout Europe and the scientific prestige easily
translated into material benefits as well.
For a long time finding a formula for equations of degree 5 was one of the outstanding problems of mathematics
and everyone was sure that such formulas must exist, although Gauss expressed some serious
doubts. Niels Henrik Abel, a Norwegian mathematical genius who died in 1829 at the age of 26 (and after
whom the Abel prize - the “Nobel prize for mathematics" - is named), proved that such formulas cannot
exist. Independently, Évariste Galois, who died at the age of 20 in 1832, proved not only the insolvability of
the quintic, but in fact of all equations of any degree greater than 4, developing in the process the theory of
groups - a theory that transformed algebra and number theory.
w : R × R → R.
That is, an operation is a rule taking two elements of R and returning a new one. For example:
w : C × C → C, w ( z1 , z2 ) = z1 + z2 ,
or
w : C × C → C, w ( z1 , z2 ) = z1 z2 .
Often, for a general set R, we may denote w(z1 , z2 ) by z1 + z2 , or z1 z2 , if we want to stress the fact that
the operation behaves like addition, or multiplication. We shall of course focus on mathematical examples,
but one can certainly be more adventurous. For example, we can take R to be the set of sound waves and
the operation w to be the juxtaposition of two sound waves.
Definition 7.0.1. A ring R is a non-empty set together with two operations, called “addition" and “multipli-
cation" that are denoted, respectively, by
( x, y) ↦ x + y, ( x, y) ↦ xy.
One requires the following axioms to hold:
(1) x + y = y + x, ∀ x, y ∈ R. (Commutativity of addition)
(2) ( x + y) + z = x + (y + z), ∀ x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀ x ∈ R. (Neutral element for
addition)
(4) ∀ x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) ( xy)z = x (yz), ∀ x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀ x ∈ R. (Neutral element for multiplication)
(7) z( x + y) = zx + zy, ( x + y)z = xz + yz, ∀ x, y, z ∈ R. (Distributivity)
We remark that for us, by definition, a ring always has an identity element with respect to multiplication.
Most, but not all, authors follow this convention.
Definition 7.0.2. Note that the multiplication operation is not assumed to be commutative in general.
If xy = yx for all x, y ∈ R, we say R is a commutative ring. If for every non-zero x ∈ R there is an
element y ∈ R such that xy = yx = 1, and also 0 ≠ 1 in R, we call R a division ring. A commutative
division ring is called a field.
Example 7.0.3. Z is a commutative ring. It is not a division ring and so it is not a field.
Example 7.0.4. The rational numbers Q form a field. The real numbers R form a field. In both cases we
assume the properties of addition and multiplication as “well known”. The complex numbers also form a field;
in fact, we have at some level already used all the axioms implicitly in our calculations, but now we prove it
formally using that R is a field.
Example 7.0.6. Here is an example of a non-commutative ring. The elements of this ring are 2-by-2 matrices
with entries in R. We define
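$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a+\alpha & b+\beta \\ c+\gamma & d+\delta \end{pmatrix}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{pmatrix}.$$
The verification of the axioms is a straightforward, but tiresome, business. We shall not do it here. The zero
element is $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and the identity element is $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. To see that the ring is not commutative we give the
following example.
$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$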
7.1. Some formal consequences of the axioms. We note some useful formal consequences of the axioms
defining a ring:
(1) The element 0 appearing in axiom (3) is unique. Indeed, if q is another element with the same
property then q + x = x for any x and in particular q + 0 = 0. But also, using the property of 0 and
commutativity, we have q + 0 = 0 + q = q. So q = 0.
(2) The element y appearing in axiom (4) is unique. Indeed, if for a given x we have x + y = x + y′ = 0
then y = y + (x + y′) = (y + x) + y′ = (x + y) + y′ = 0 + y′ = y′. We shall denote this element y
by −x.
(3) We have −(− x ) = x and −( x + y) = − x − y, where, technically − x − y means (− x ) + (−y). To
prove that, it is enough, after what we have just proven, to show that − x + x = 0 and that ( x +
y) + (− x − y) = 0 (after all, −(− x ) is that unique element that when added to − x gives 0, etc.).
The first is clear, and ( x + y) + (− x − y) = x + (− x ) + y + (−y) = 0 + 0 = 0.
Although this proof is simple and short, it is based on an important idea, worth internalizing.
We prove that an object is equal to another object by proving they both have a property known to
uniquely determine the object. Namely, the proof that −(− x ) = x consists of showing that the
objects x and −(− x ) both have the property that adding them to − x gives zero. Since this is known
to uniquely characterize the additive inverse of − x, denoted −(− x ), we conclude that x = −(− x ).
(4) The element 1 in axiom (6) is unique. (Use the same argument as in (1)).
(5) We have x · 0 = 0, 0 · x = 0. Indeed, x · 0 = x · (0 + 0) = x · 0 + x · 0. Let y = x · 0 then y = y + y
and so 0 = −y + y = −y + (y + y) = (−y + y) + y = 0 + y = y.
We shall see many examples of rings and fields in the course. For now, we just give one more definition and
some examples.
Definition 7.1.1. Let R be a ring. A subset S ⊂ R is called a subring if 0, 1 ∈ S and if a, b ∈ S implies
that a + b, − a and ab ∈ S.
Note that the definition says that the operations of addition and multiplication in R give operations of
addition and multiplication in S (namely, the outcome is in S and so we get functions S × S → S), satisfying
all the axioms of a ring. It follows that S is a ring whose zero element is that of R and whose identity is,
likewise, that of R.
For example, Q is a subring of R and Z is a subring of Q, as well as of R.
Example 7.1.2. Consider the set {0, 1} with the following addition and multiplication tables.
+ | 0 1        × | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
One can verify that this is a ring by directly checking the axioms. (We shall later see that this is the ring of
integers modulo 2).
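The direct check of the axioms mentioned in Example 7.1.2 is a finite computation, so it can be delegated to a short program. The sketch below is only an illustration: the dictionaries ADD and MUL transcribe the two tables, and the function name is_ring is an assumption of this sketch.

```python
from itertools import product

R = (0, 1)
# The two tables of Example 7.1.2, transcribed as dictionaries.
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def add(x, y):
    return ADD[(x, y)]

def mul(x, y):
    return MUL[(x, y)]

def is_ring():
    """Check axioms (1)-(7) of Definition 7.0.1 by brute force."""
    for x, y, z in product(R, repeat=3):
        if add(x, y) != add(y, x):                          # (1)
            return False
        if add(add(x, y), z) != add(x, add(y, z)):          # (2)
            return False
        if mul(mul(x, y), z) != mul(x, mul(y, z)):          # (5)
            return False
        if mul(z, add(x, y)) != add(mul(z, x), mul(z, y)):  # (7), left
            return False
        if mul(add(x, y), z) != add(mul(x, z), mul(y, z)):  # (7), right
            return False
    for x in R:
        if add(0, x) != x:                                  # (3)
            return False
        if not any(add(x, y) == 0 for y in R):              # (4)
            return False
        if mul(1, x) != x or mul(x, 1) != x:                # (6)
            return False
    return True

print(is_ring())  # True
```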
Example 7.1.3. Consider all expressions of the form {a + b√2 : a, b ∈ Z}. We use the notation Z[√2] for
this set. This set is actually a ring. Since Z[√2] ⊂ R and R is a ring (even a field!), it is enough to check
it’s a subring. Indeed, 0, 1 ∈ Z[√2]. Suppose a + b√2, c + d√2 ∈ Z[√2]. Then:
(1) (a + b√2) + (c + d√2) = (a + c) + (b + d)√2 ∈ Z[√2];
(2) −(a + b√2) = −a − b√2 ∈ Z[√2];
(3) (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2 ∈ Z[√2].
Let now
Q[√2] = {a + b√2 : a, b ∈ Q}.
This is a field. The verification that this is a subring of C is the same as above. It is thus a commutative
ring in which 0 ≠ 1. We need to show that every non-zero element has an inverse for multiplication. If a + b√2 is
not zero then either a or b is not zero. If
c = a² − 2b²
is zero then either b = 0 (but then a ≠ 0 and so c ≠ 0, so this case doesn’t happen), or √2 = ±a/b is a rational
number. We shall prove in Proposition 10.2.4 that this is not the case. Thus, c ≠ 0. Now, a/c − (b/c)√2 ∈ Q[√2]
and it is easy to check that
(a + b√2)(a/c − (b/c)√2) = 1.
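The inverse formula can be tested with exact rational arithmetic. In the following sketch an element a + b√2 is stored as the pair (a, b); this encoding and the function names mul and inverse are assumptions made for the illustration, while the formulas themselves are the ones in the text.

```python
from fractions import Fraction

def mul(x, y):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Inverse of a non-zero a + b*sqrt2, namely a/c - (b/c)*sqrt2 with c = a^2 - 2b^2."""
    a, b = Fraction(x[0]), Fraction(x[1])
    c = a * a - 2 * b * b
    if c == 0:
        # For rational a, b this happens only when a = b = 0.
        raise ZeroDivisionError("the zero element has no inverse")
    return (a / c, -b / c)

x = (Fraction(3), Fraction(1, 2))   # the element 3 + (1/2)*sqrt2
x_inv = inverse(x)
print(x_inv)          # (Fraction(6, 17), Fraction(-1, 17))
print(mul(x, x_inv))  # (Fraction(1, 1), Fraction(0, 1))
```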
8. Exercises
(1) Let B be a given set. What does the equality A ∪ B = B ∩ C imply about A? About C?
(2) Calculate the following intersection and union of sets (provide short explanations, if not complete
proofs).
Notation: If a, b are real numbers we use the following notation:
[ a, b] = { x ∈ R| a ≤ x ≤ b}.
[ a, b) = { x ∈ R| a ≤ x < b}.
( a, b] = { x ∈ R| a < x ≤ b}.
( a, b) = { x ∈ R| a < x < b}.
We also use
[ a, ∞) = { x ∈ R| a ≤ x }.
(−∞, b] = { x ∈ R| x ≤ b}.
(−∞, ∞) = R.
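If A1, A2, A3, . . . are sets, we may write $\bigcup_{i=1}^{N} A_i$ for $A_1 \cup A_2 \cup \cdots \cup A_N$ and $\bigcup_{i=1}^{\infty} A_i$ for $\bigcup_{i \in \{1,2,3,\dots\}} A_i$.
(a) Let N ≥ 1 be a natural number. What is $\bigcup_{n=1}^{N} [-n, n]$? What is $\bigcap_{n=1}^{N} [-n, n]$?
(b) What is $\bigcup_{n=1}^{\infty} [n, n+1]$? What is $\bigcup_{n=1}^{\infty} (n, n+2)$?
(c) What is $\bigcup_{n=1}^{\infty} (n, n+1)$? What is $\bigcup_{n=1}^{\infty} (1/n, 1]$?
(d) Let $A_n = \{x^n : x \in \mathbb{N}\}$. What is $\bigcap_{n=1}^{\infty} A_n$?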
(e) Let B = (−1, 1) × (−1, 1), the open square in the plane. Write B as an infinite union of closed
discs of positive radius (where a closed disc with center (v0, v1) and positive radius r is a set of
the form {(v0, v1) + (x, y) : x² + y² ≤ r²} for some fixed positive real number r).
(f) Prove that one cannot write this B as a finite union of discs.
(3) Using the intervals [0, 2] and (1, 3] and the operations of union, intersection and difference, create 8
different sets. For example, the set [0, 1] ∪ (2, 3] is ([0, 2] \ (1, 3]) ∪ ((1, 3] \ [0, 2]).
(4) Let A, B and C be sets. Prove or disprove:
(a) ( A \ B) \ C = A \ ( B \ C );
(b) ( B \ A) ∪ ( A \ B) = ( A ∪ B) \ ( A ∩ B).
(5) Prove that
A \ (∩i∈ I Bi ) = ∪i∈ I ( A \ Bi ).
(6) Prove that the Principle of Induction (Theorem 2.3.1) implies the statement: “every non-empty
subset of N has a minimal element".
(7) Prove by induction that for n ≥ 1,
1³ + · · · + n³ = (1 + · · · + n)².
You may use the formula for the right hand side previously given.
(8) Prove by induction that 2ⁿ > n² for n ≥ 5. It is false if we take n ≥ 0 (because it fails for n = 2),
but it happens to hold for n = 0. Where would your proof break down if you try to argue by induction
starting at n = 0?
(9) Let Cn denote the number of ways to cover the squares of a 2 × n checkers board using plain dominos.
Thus, C1 = 1, C2 = 2, C3 = 3. Compute C4 and C5 and find a recursive formula for Cn. Prove
by induction that
$$C_n = \frac{1}{\sqrt{5}} \left( \left( \frac{1+\sqrt{5}}{2} \right)^{n+1} - \left( \frac{1-\sqrt{5}}{2} \right)^{n+1} \right).$$
(10) Prove by induction that for n ≥ 1
1 + 3 + 5 + · · · + (2n − 1) = n².
(16) Let A, B be sets and assume A is not the empty set. Prove that | A| ≤ | B| if and only if there is a
surjective function B → A.
(17) Prove that |N| = |Q|. (Hint: Show two inequalities; note that there is an easy injection Q → Z ×
Z).
(18) Prove or disprove: if | A1 | = | A2 |, A1 ⊇ B1 , A2 ⊇ B2 , and | B1 | = | B2 | then | A1 \ B1 | = | A2 \ B2 |.
(19) Let A be a set. Then $|A| < |2^A|$, where $2^A$ is the set of all subsets of A. (Another common notation
for the set of subsets of A is P(A).) Prove this as follows: First show $|A| \le |2^A|$ by constructing
an injection $A \to 2^A$. Suppose now that there is a bijection
$$A \to 2^A, \qquad a \mapsto U_a.$$
Define a subset U of A by
$$U = \{ a : a \notin U_a \}.$$
Show that if $U = U_b$ we get a contradiction. (This is some sort of “diagonal argument”). Put all
this together to conclude $|A| < |2^A|$.
(20) Prove that a number is rational if and only if from some point on its decimal expansion becomes
periodic.
(21) Prove that if a product z1 · z2 of complex numbers is equal to zero then at least one of z1 , z2 is zero.
(22) Let f be the complex polynomial f(x) = (3 + i)x² + (−2 − 6i)x + 12. Find a complex number z such
that the equation f(x) = z has a unique solution (use the formula for solving a quadratic equation).
(23) Find the general form of a complex number z such that: (i) z² is a real number (i.e., Im(z²) = 0).
(ii) z² is a purely imaginary number (i.e., Re(z²) = 0). (iii) z² = z̄. (iv) Im(z² + z̄) = 0. (v)
Re(z² + z̄) = 0. Also, in each of these cases, plot the answer on the complex plane.
(24) The ring of 2 × 2 matrices over a field F.
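Let F be a field. We consider the set
$$M_2(F) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in F \right\}.$$
It is called the two-by-two matrices over F. We define the addition of two matrices as
$$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{pmatrix}.$$
We define multiplication by
$$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 a_2 + b_1 c_2 & a_1 b_2 + b_1 d_2 \\ c_1 a_2 + d_1 c_2 & c_1 b_2 + d_1 d_2 \end{pmatrix}.$$
Prove that this is a ring. For each of the following subsets of M2(F) determine if they are subrings
or not.
(a) The set $\left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in M_2(F) \right\}$.
(b) The set $\left\{ \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \in M_2(F) \right\}$.
(c) The set $\left\{ \begin{pmatrix} a & 0 \\ c & d \end{pmatrix} \in M_2(F) \right\}$.
(d) The set $\left\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \in M_2(F) \right\}$.
(e) The set $\left\{ \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \in M_2(F) \right\}$.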
Remark: One defines in a very similar way the ring of n × n matrices with entries in a field F.
(25) The zero ring. Let e be a formal symbol and let R = {e}. Define e + e = e, e · e = e. Verify that R
is a ring in which 1 = 0. Prove that if S is any ring in which 1 = 0 then S has a unique element and
if we choose to denote it by e then e + e = e, e · e = e.
(26) Prove that a commutative ring with 2 elements is a field.
(27) Prove that a commutative ring with 3 elements is a field.
Part 2. Arithmetic in Z
In this part of the course we are going to study arithmetic in the ring of integers Z. We are going to focus
on particular properties of this ring. Our choice of properties is motivated by an analogy to be drawn later
between integers and polynomials. In fact, there is a general class of rings to which one can extend this
analogy, called Euclidean rings; they are discussed briefly in Appendix C.
We are focusing here on rather simple properties of the integers. However, the integers have been the
playground of mathematicians for millennia and the fascination with their properties never waned. Starting
from the Greek mathematicians and philosophers, for whom integers were a manifestation of the divine, simple
problems concerning integers were explored. Remarkably, for most of those problems we still do not know
the answer. For example, a positive integer n is called perfect if the sum of its divisors, excluding itself, is
equal to n. Thus, 6 is a perfect number as 6 = 1 + 2 + 3, as is 28. We do not know if odd perfect numbers
exist and we do not know if there are infinitely many even perfect numbers (although I would suspect that
the answer to the first is no, and for the second is yes). Because of the special role prime numbers play in
arithmetic, perhaps the most substantial open problems are problems concerning prime numbers, and some
of these are mentioned below.
9.2. Division.
Definition 9.2.1. Let a, b be integers. We say that a|b (read, a divides b) if there is an element c ∈ Z such
that b = ac.
Here are some properties:
(1) a|b ⇒ a| − b.
(2) a|b ⇒ a|bd for any d ∈ Z.
(3) a|b, a|d ⇒ a|(b ± d).
(4) if a|b and b| a then a = ±b.
Proof. Write b = ac. Then −b = a · (−c) and so a| − b. Also, bd = a · (cd) and so a|bd.
Write also d = ae. Then b ± d = a · (c ± e) and so a|(b ± d).
Now, for the last property. For some integers c, d we have b = ac, a = bd and so we find that a = bd = acd
and so a(1 − cd) = 0. If a = 0 then also b = 0 and the conclusion a = ±b is definitely true. Otherwise, we
must have cd = 1. But, as c and d are integers, we must have c = d = ±1 and the conclusion follows.
Remark 9.2.2. Extreme cases are sometimes confusing. It follows from the definition that 0 divides only 0,
that every number divides 0 and that ±1 (and only they) divide every number.
Example 9.2.3. Which are the integers that can divide both n and 2n + 5? If a divides n, then a divides 2n
and so, if a divides 2n + 5 then a divides 5, because 5 = (2n + 5) − 2n. Therefore a = ±1, ±5 are the only
possibilities. However, note that this does not imply that they actually occur in every case. If n = 0, they all
do. If n = 1 only ±1 are common divisors.
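Example 9.2.3 can be explored numerically. The sketch below lists the positive common divisors of n and 2n + 5 for a few values of n by brute force; the function name common_divisors and the sample values of n are choices made for this illustration only.

```python
def common_divisors(a, b):
    """Positive common divisors of a and b (not both zero), by brute force."""
    bound = max(abs(a), abs(b))
    return [d for d in range(1, bound + 1) if a % d == 0 and b % d == 0]

for n in (0, 1, 5, 10, 12):
    print(n, common_divisors(n, 2 * n + 5))
# 0 [1, 5]
# 1 [1]
# 5 [1, 5]
# 10 [1, 5]
# 12 [1]
```

In every case the list is contained in {1, 5}, and 5 appears exactly when 5 divides n.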
Corollary 9.2.4. Let a ≠ 0. Then a|b if and only if in dividing b by a with residue, b = aq + r, the residue r is zero.
Proof. If the residue r = 0 then b = aq and so a|b. If a|b and b = aq + r then a|(b − aq), i.e., a|r. But 0 ≤ r < |a|
and so that’s possible only if r = 0.
9.3. GCD.
Definition 9.3.1. Let a, b, be integers, not both zero. The greatest common divisor (gcd) of a and b,
denoted gcd( a, b) or just ( a, b) if the context is clear, is the largest (positive) integer dividing both a and b.
Theorem 9.3.2. Let a, b, be integers, not both zero, and d = ( a, b) their gcd. Then every common divisor
of a and b divides d. There are integers u, v such that
d = ua + vb.
Moreover, d is the minimal positive number that has the form ua + vb.
Proof. Let
S = {ma + nb : m, n ∈ Z, ma + nb > 0}.
First note that S ≠ ∅. Indeed, aa + bb ∈ S. Let D be the minimal element of S. Then, for some u, v ∈ Z
we have D = ua + vb.
We claim that D = d. To show D|a, write a = qD + r, 0 ≤ r < D. Then, D > r = a − qD =
a − q(ua + vb) = (1 − qu)a − qvb. If r ≠ 0 then r = (1 − qu)a − qvb is an element of S smaller than D
and that’s a contradiction. It follows that r = 0, that is D|a. In the same way, D|b.
On the other hand, let e be any common divisor of a and b. Then e also divides ua + vb = D. It follows
that D is the largest common divisor of a, b, so D = d, and also that any other common divisor of a and b
divides it.
Corollary 9.3.3. If a|bc and gcd( a, b) = 1 then a|c.
Proof. We have 1 = ua + vb for some integers u, v, and so c = uac + vbc. Since a|uac and a|vbc we have a|(uac + vbc) = c.
9.4. The Euclidean algorithm. 14 The question arises: how do we compute in practice the gcd of two
integers? This is a very practical issue, even in the simple task of simplifying fractions! As we shall see, there
are two methods. One method uses the prime factorization of the two numbers; we shall discuss that later.
The other method, which is much more efficient, is the Euclidean algorithm.
Theorem 9.4.1. (The Euclidean Algorithm) Let a, b be positive integers with a ≥ b. If b| a then gcd( a, b) = b.
Else perform the following recursive division with residue:
a = bq0 + r0 , 0 < r0 < b,
b = r0 q1 + r1 , 0 ≤ r1 < r0 ,
r0 = r1 q2 + r2 , 0 ≤ r2 < r1 ,
⋮
113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4·1+1
4 = 1 · 4.
Thus gcd(113, 54) = 1.
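The repeated division with residue is easy to mechanize: one keeps replacing the pair (a, b) by (b, r) until the residue is zero, and the last divisor is the gcd. The following is a minimal sketch; the function name gcd_steps and the decision to record each step as a tuple are assumptions of this illustration.

```python
def gcd_steps(a, b):
    """Euclidean algorithm for a >= b > 0.

    Returns gcd(a, b) and the list of division steps
    (dividend, divisor, quotient, residue).
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)      # a = b*q + r with 0 <= r < b
        steps.append((a, b, q, r))
        a, b = b, r
    return a, steps

g, steps = gcd_steps(113, 54)
for a, b, q, r in steps:
    print(f"{a} = {b} * {q} + {r}")
print("gcd =", g)
# 113 = 54 * 2 + 5
# 54 = 5 * 10 + 4
# 5 = 4 * 1 + 1
# 4 = 1 * 4 + 0
# gcd = 1
```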
A further bonus supplied by the Euclidean algorithm is that it allows us to find u, v such that gcd( a, b) =
ua + vb. We just illustrate it in two examples:
1). Take a = 113, b = 54. Then, as we saw,
113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4·1+1
4 = 1 · 4.
and so gcd(113, 54) = 1. We have 1 = 5 − 4 · 1, and we keep substituting for each residue its expression
in terms of the previous residues (the important numbers to modify are the residues, not the
quotients qi). First, 4 = 54 − 5 · 10 and we get 1 = 5 − (54 − 5 · 10) = −54 + 5 · 11. Next, 5 = 113 − 54 · 2 and
we get 1 = −54 + 5 · 11 = −54 + (113 − 54 · 2) · 11 = 54 · (−23) + 113 · 11. Thus,
1 = gcd(54, 113) = −23 · 54 + 11 · 113.
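The back-substitution can also be organized recursively. The sketch below is one standard way to do so (the function name extended_gcd is an assumption of this illustration, and it is not claimed to be the bookkeeping used in the notes): it returns the gcd together with integers u, v such that gcd(a, b) = ua + vb.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) = u*a + v*b, for a, b >= 0 not both zero."""
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)
    # If g = x*b + y*r and r = a - q*b, then g = y*a + (x - q*y)*b.
    g, x, y = extended_gcd(b, r)
    return g, y, x - q * y

g, u, v = extended_gcd(113, 54)
print(g, u, v)           # 1 11 -23
print(u * 113 + v * 54)  # 1
```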
The sieve of Eratosthenes:¹⁵ This is a method that allows one to construct rapidly a list of all primes less
than a given number N. We illustrate that with N = 50. One writes all the numbers from 2 to 50:
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50
The first number on the list is prime. This is 2. We write it in bold-face (shown here in square brackets) and
cross out all its multiples (we denote crossing out by underscores around the number):
[2], 3, _4_, 5, _6_, 7, _8_, 9, _10_, 11, _12_, 13, _14_, 15, _16_, 17, _18_, 19, _20_, 21, _22_, 23, _24_,
25, _26_, 27, _28_, 29, _30_, 31, _32_, 33, _34_, 35, _36_, 37, _38_, 39, _40_, 41, _42_, 43, _44_,
45, _46_, 47, _48_, 49, _50_
The first number on the list not in bold-face and not crossed out is prime. This is 3. We write it in bold-face
and cross out all its multiples:
[2], [3], _4_, 5, _6_, 7, _8_, _9_, _10_, 11, _12_, 13, _14_, _15_, _16_, 17, _18_, 19, _20_, _21_, _22_, 23, _24_,
25, _26_, _27_, _28_, 29, _30_, 31, _32_, _33_, _34_, 35, _36_, 37, _38_, _39_, _40_, 41, _42_, 43, _44_,
_45_, _46_, 47, _48_, 49, _50_
The first number on the list not in bold-face and not crossed out is prime. This is 5. We write it in bold-face
and cross out all its multiples:
[2], [3], _4_, [5], _6_, 7, _8_, _9_, _10_, 11, _12_, 13, _14_, _15_, _16_, 17, _18_, 19, _20_, _21_, _22_, 23, _24_,
_25_, _26_, _27_, _28_, 29, _30_, 31, _32_, _33_, _34_, _35_, _36_, 37, _38_, _39_, _40_, 41, _42_, 43, _44_,
_45_, _46_, 47, _48_, 49, _50_
The first number on the list not in bold-face and not crossed out is prime. This is 7. We write it in bold-face
and cross out all its multiples:
[2], [3], _4_, [5], _6_, [7], _8_, _9_, _10_, 11, _12_, 13, _14_, _15_, _16_, 17, _18_, 19, _20_, _21_, _22_, 23, _24_,
_25_, _26_, _27_, _28_, 29, _30_, 31, _32_, _33_, _34_, _35_, _36_, 37, _38_, _39_, _40_, 41, _42_, 43, _44_,
_45_, _46_, 47, _48_, _49_, _50_
The next number 11 is already greater than √N = √50 ≈ 7.071 . . . . So we stop, because any number is
a product of prime numbers (see Lemma 10.1.3 below) and so any number less than or equal to N, which is not
prime, has at least one prime divisor smaller than or equal to √N. Thus, any number left on our list is prime.
[2], [3], _4_, [5], _6_, [7], _8_, _9_, _10_, [11], _12_, [13], _14_, _15_, _16_, [17], _18_, [19], _20_, _21_, _22_, [23], _24_,
_25_, _26_, _27_, _28_, [29], _30_, [31], _32_, _33_, _34_, _35_, _36_, [37], _38_, _39_, _40_, [41], _42_, [43], _44_,
_45_, _46_, [47], _48_, _49_, _50_
¹⁵ Eratosthenes of Cyrene, 276BC - 194BC, was a Greek mathematician and is famous for his work on prime numbers and for
measuring the diameter of the earth. For more see http://www-groups.dcs.st-and.ac.uk/%7Ehistory/Biographies/Eratosthenes.html
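The sieve translates almost word for word into a program. The following is a minimal sketch; the function name sieve and the boolean list crossed are assumptions of this illustration. As in the text, multiples are crossed out only for primes not exceeding √N.

```python
from math import isqrt

def sieve(N):
    """Return the list of primes <= N using the sieve of Eratosthenes."""
    crossed = [False] * (N + 1)          # crossed[k] is True once k is crossed out
    primes = []
    for p in range(2, N + 1):
        if crossed[p]:
            continue
        primes.append(p)                 # first number not crossed out: a prime
        if p <= isqrt(N):                # larger primes have nothing new to cross out
            for multiple in range(2 * p, N + 1, p):
                crossed[multiple] = True
    return primes

print(sieve(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```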
Theorem 10.1.2 (The Fundamental Theorem of Arithmetic). Every non-zero integer n is a product of primes.
(We allow the empty product, equal by definition to 1). That is, one can write every non-zero integer n as
n = ep1 p2 · · · pm ,
where e = ±1 and 0 ≤ p1 ≤ p2 ≤ · · · ≤ pm are primes (m ≥ 0). Moreover, this way of writing n is unique.
Proof. We first show n can be written this way. We may assume n is positive (if n is negative, apply the
statement to −n, −n = p1 p2 · · · pm and thus n = −1 · p1 p2 · · · pm ).
Lemma 10.1.3. Every positive integer is a product of prime numbers. (We allow the empty product, equal
by definition to 1).
Proof. Suppose not. Then the set S of positive integers that are not a product of prime numbers has a minimal
element, say n0. n0 is not one, or a prime, because in those cases it is a product of primes (1, as said, is the
empty product). Thus, there are integers 1 < s < n0, 1 < t < n0 such that n0 = st. Note that s and t are
not in S because they are smaller than n0. Thus, s = q1 q2 · · · qa is a product of primes, t = r1 r2 · · · rb is a
product of primes and therefore n0 = q1 q2 · · · qa r1 r2 · · · rb is also a product of primes. This contradicts the
choice of n0 as an element of S, and so contradicts our initial assumption that there are positive integers that
are not a product of prime numbers. Thus, every positive integer is a product of prime numbers.
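The existence part of the theorem can be made concrete by trial division: repeatedly split off the smallest divisor greater than 1, which is necessarily prime. This is only a sketch of one simple method (the name prime_factors is an assumption of this illustration), and it is practical only for small n.

```python
def prime_factors(n):
    """Return the primes p1 <= p2 <= ... <= pm with n = p1*p2*...*pm, for n >= 1.

    For n = 1 the result is the empty list (the empty product).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:    # d is the smallest divisor > 1 of n, hence prime
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                # what is left has no divisor up to its square root
        factors.append(n)
    return factors

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(113))  # [113]
print(prime_factors(1))    # []
```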
Choosing the sign e appropriately and ordering the primes in increasing order we conclude that any non-zero
integer n = ep1 p2 · · · pm , where e = ±1 and 0 ≤ p1 ≤ p2 ≤ · · · ≤ pm are primes. We now show uniqueness.
For this we need the following important fact.
Proposition 10.1.4. Let p > 1 be an integer. The following are equivalent:
(1) p is a prime number;
(2) if p| ab then p| a or p|b.
Proof. Suppose p is prime and p|ab. If p ∤ a then gcd(p, a) = 1 and so, as we have already seen
(Corollary 9.3.3), p|b.
Now suppose that p satisfies (2). Let p = st. By replacing s by −s and t by −t if necessary, we may assume that
s, t are positive. As p = st, p|st and so p|s, say. So s = ps′ and p = ps′t. But we must have then
that s′ = t = 1, because s′, t are positive integers. Therefore p = s. So p has no proper divisors and hence
is prime.
Remark 10.1.5. Suppose that p is a prime dividing a product of integers q1 q2 · · · qt . Arguing by induction
on t, one concludes that p divides some qi .
We now finish the proof of the theorem. Suppose that
n = ep1 p2 · · · pm ,
and also
n = µq1 q2 · · · qt ,
are two expressions of n as in the statement of the theorem. First, e is negative if and only if n is, and
the same holds for µ. So e = µ. We may then assume n is positive and e = µ = 1 and we argue by
induction on n. The case n = 1 is clear: a product of one or more primes will be greater than 1 so the only
way to express n is as the empty product. Assume the statement holds for 1, 2, . . . , n − 1 and consider two
factorizations of n:
n = p1 p2 · · · p m ,
and
n = q1 q2 · · · q t .
First, note that m ≥ 1 and t ≥ 1 because n > 1. Assume that p1 ≤ q1 (the argument in the other case goes
the same way). We have p1 |n and so p1 |q1 q2 · · · qt . It follows that p1 divides some qi but then, qi being
prime, p1 = qi . Furthermore, p1 ≤ q1 ≤ qi = p1 , so p1 = q1 . We then have the factorizations
n/p1 = p2 · · · pm = q2 · · · qt.
Since n/p1 < n we may apply the induction hypothesis and conclude that m = t and pi = qi for all i.
The Fundamental Theorem exhibits the prime numbers as the building blocks of the integers. In itself, it
doesn’t tell us if there are finitely many or infinitely many of them.
Theorem 10.1.6 (Euclid). There are infinitely many prime numbers.
Proof. Let p1, p2, . . . , pn be distinct prime numbers. We show then that there is a prime not in this list. It
follows that there couldn’t be only finitely many prime numbers.
Consider the integer N = p1 p2 · · · pn + 1 and its prime factorization. Let q be a prime dividing N. If q ∈
{p1, p2, . . . , pn} then q|p1 p2 · · · pn and so q|(N − p1 p2 · · · pn), that is q|1, which is a contradiction. Thus, q
is a prime not in the list {p1, p2, . . . , pn}.
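Here is a small numerical instance of Euclid's argument. The starting list of primes and the helper smallest_prime_factor are choices made for this illustration; note that the product plus one need not itself be prime, it only has a prime factor outside the list.

```python
from math import prod

def smallest_prime_factor(n):
    """Smallest prime dividing n, for n >= 2, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1             # 2*3*5*7*11*13 + 1
q = smallest_prime_factor(N)
print(N, q, q in primes)         # 30031 59 False
```

Here 30031 = 59 · 509 is not prime, but its prime factors 59 and 509 are both new.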
So! We know that every integer is a product of prime numbers, we know there are infinitely many prime
numbers. That teaches us about the integers, and invites some more questions:
– How frequent are prime numbers? The Prime Number Theorem asserts that the number of primes
in the interval [1, n] is roughly n/ log n, in the sense that the ratio between the true number and the
estimate n/ log n approaches 1 as n goes to infinity. The result was conjectured by Gauss¹⁶ at the age of 15 or
16 and proven by J. Hadamard and Ch. de la Vallée Poussin, independently, in 1896. The famous Riemann
hypothesis, currently (Fall 2023) one of the Clay Institute million dollar millennium prize problems, is initially
a conjecture about the zeros of some complex valued function – the so-called Riemann zeta function. An
equivalent formulation is the following: let π(x) be the number of prime numbers in the interval [1, x], for
x > 1, and let $\mathrm{Li}(x) = \int_2^x dt/\log t$ be the logarithmic integral, which is asymptotic to x/ log x; then for some
constant C, we have
$$|\pi(x) - \mathrm{Li}(x)| < C \sqrt{x} \cdot \log x$$
for all x ≥ 2.
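To get a feeling for the Prime Number Theorem one can compare the true count of primes up to n with the estimate n/ log n for a few values of n. The sketch below does this with a simple sieve; the function name prime_count and the sample values of n are assumptions of this illustration, and the natural logarithm is used.

```python
from math import log

def prime_count(n):
    """pi(n): the number of primes in the interval [1, n], via a simple sieve."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
        p += 1
    return sum(is_prime)

for n in (10**3, 10**4, 10**5, 10**6):
    count = prime_count(n)
    estimate = n / log(n)
    # Columns: n, pi(n), the estimate n/log n, and their ratio.
    print(n, count, round(estimate), round(count / estimate, 3))
```

The counts are 168, 1229, 9592 and 78498, and the ratio in the last column decreases slowly towards 1.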
16Johann Carl Friedrich Gauss, 1777 - 1855, worked in a wide variety of fields in both mathematics and physics including number
theory, analysis, differential geometry, geodesy, magnetism, astronomy and optics. His work has had an immense influence
in many areas of Science. He is mentioned in the same breath with Euclid, Archimedes and Newton as one of the giants of
mathematics.
– How small can the gaps between consecutive primes be? For example, we have the pairs (3, 5), (5, 7), (11, 13),
(17, 19), ... of primes that differ by 2. But are there infinitely many such pairs? The answer is believed to be
yes but no one has proven it yet (Fall 2023). This is called the Twin Primes Conjecture. Incidentally, it is not
hard to prove that the gap between primes can be arbitrarily large. Given an integer N ≥ 2, consider the numbers
N! + 2, N! + 3, N! + 4, . . . , N! + N.
This is a set of N − 1 consecutive integers none of which is prime, because N! + k is divisible by k for
2 ≤ k ≤ N. In recent breakthroughs, originating with
Yitang Zhang in 2013, it was established that there is a computable constant C such that there are infinitely
many pairs of primes at most C apart. Zhang provided C = 70,000,000 and J. Maynard improved that not
long after to C = 600; “Polymath project 8” - a large group of mathematicians working on the problem
using an online platform - improved the bound further to C = 246. It is believed that their methods can be
improved to yield C = 6, but not (oh, painful irony!) to yield C = 2, leaving the twin primes conjecture just
beyond reach.
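The construction of long runs of composite numbers can be verified directly for a small N. In the sketch below N = 10 and the helper is_prime are choices made only for this illustration.

```python
from math import factorial

def is_prime(n):
    """Primality test by trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

N = 10
run = [factorial(N) + k for k in range(2, N + 1)]   # N! + 2, ..., N! + N

# Each N! + k is divisible by k, hence composite.
print(all((factorial(N) + k) % k == 0 for k in range(2, N + 1)))  # True
print(len(run), any(is_prime(m) for m in run))                    # 9 False
```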
– How far does one need to go until the next prime shows up? For example, it is known that there
is always a prime between n and 2n. It follows rather easily from the prime number theorem for n ≫ 0,
but in fact it holds for every n. There is a rather elementary, yet ingenious, proof of this fact, the technical
background for which is not much more than Stirling’s formula. We sometimes give it in the Number Theory
course.
– What about adding primes? Goldbach’s conjecture asserts that every even integer greater than 2 is
the sum of two prime numbers. For example, 4 = 2 + 2, 6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7, 12 = 5 + 7, 14 =
3 + 11, 16 = 5 + 11, .... It has been verified (Winter 2016) up to $n \le 4 \times 10^{17}$, but no proof is currently
known (Fall 2023).
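Goldbach's conjecture is easy to test by machine in a small range. The sketch below checks every even integer from 4 up to an arbitrarily chosen bound; the bound LIMIT and the function names are assumptions of this illustration, and such a check is of course evidence, not a proof.

```python
def primes_up_to(limit):
    """List of primes <= limit, by a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [k for k in range(2, limit + 1) if is_prime[k]]

LIMIT = 10_000
PRIME_LIST = primes_up_to(LIMIT)
PRIME_SET = set(PRIME_LIST)

def goldbach_pair(n):
    """Return primes (p, q) with p <= q and p + q = n, or None if there are none."""
    for p in PRIME_LIST:
        if p > n // 2:
            break
        if n - p in PRIME_SET:
            return p, n - p
    return None

print([goldbach_pair(n) for n in (4, 6, 8, 10, 12)])
# [(2, 2), (3, 3), (3, 5), (3, 7), (5, 7)]
print(all(goldbach_pair(n) is not None for n in range(4, LIMIT + 1, 2)))  # True
```

The function returns the decomposition with the smallest first prime, so for 16 it finds 3 + 13 rather than 5 + 11.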
The “odd Goldbach conjecture” states that every odd integer greater than 5 is the sum of 3 primes.
Remarkably this is known. Series of works over a century proved better and better results towards the odd
conjecture. In particular, it was known in 2002 to be true for every odd integer greater than $C = 2 \cdot 10^{1346}$,
which is computationally out of reach. Harald Helfgott proved the full conjecture in 2013, by improving the
constant C greatly so that the remaining cases could be verified by direct computation (and, in fact, were
already verified prior to his work).
that is not rational. However, the advantage of Proposition 10.2.4 is that it shows that a specific number is
irrational. We will need the following result.
Proposition 10.2.3. Any non-zero rational number q can be written as
$$q = e\, p_1^{a_1} \cdots p_m^{a_m},$$
where e = ±1, the pi are distinct prime numbers and a1 , . . . , am are non-zero integers (possibly negative).
Moreover, this expression is unique (up to reordering the primes).
Proof. By definition, for some integers a, b we have q = a/b. Let us write:
$$a = e_a\, r_1^{s_1} \cdots r_n^{s_n}, \qquad b = e_b\, r_1^{t_1} \cdots r_n^{t_n},$$
where ea , eb ∈ {±1}, the ri are distinct primes and si , ti are non-negative integers, possibly zero. This is
always possible to do because we allow zero exponents here. Then, clearly,
$$q = (e_a / e_b)\, r_1^{s_1 - t_1} \cdots r_n^{s_n - t_n}.$$
Now, omitting the primes such that si − ti = 0 from the list, calling the remaining primes p1 , . . . , pm , and
letting e = ea /eb , we obtain an expression as desired.
Suppose that we have two such expressions for q. Again, by allowing zero exponents, it is enough to
consider the following situation:
$$q = e\, p_1^{a_1} \cdots p_m^{a_m} = e'\, p_1^{a_1'} \cdots p_m^{a_m'}.$$
Since e, e′ determine the sign of q, they must be equal and we need to show that $a_i = a_i'$ for all i. Dividing
through, we get an expression of the form,
$$1 = p_1^{c_1} \cdots p_m^{c_m},$$
where $c_i = a_i - a_i'$ and we need to show all the $c_i$ are zero. By rearranging the primes, we may assume
that $c_1, \dots, c_t$ are negative and $c_{t+1}, \dots, c_m$ are non-negative. We conclude that
$$p_1^{-c_1} \cdots p_t^{-c_t} = p_{t+1}^{c_{t+1}} \cdots p_m^{c_m}.$$
In this expression there are no negative exponents. Thus, from unique factorization for integers, since all the
primes are distinct all the powers must be zero.
Proposition 10.2.4. √2 is not a rational number.
Proof. Suppose it is, and write $\sqrt{2} = p_1^{a_1} \cdots p_m^{a_m}$, distinct primes with non-zero exponents (possibly negative).
Then
$$2 = p_1^{2a_1} \cdots p_m^{2a_m}$$
must be the unique factorization of 2. However, 2 is prime. So there must be only one prime on the right
hand side, i.e. m = 1. Then $2 = p_1^{2a_1}$ and we must have $p_1 = 2$ and $2a_1 = 1$. But this contradicts the fact
that $a_1$ is an integer.
Here is another proof:
Suppose that √2 is rational and write √2 = m/n, where (m, n) = 1. Then
2n² = m².
This implies that 2|m² = m · m. As 2 is prime, it follows that 2|m. Thus, m is even, say m = 2k. It follows
that 2n² = 4k² and so n² = 2k². Therefore, 2|n² and by the same considerations 2|n. This means that 2
divides both n and m, contrary to our assumption. Thus, assuming √2 is rational leads to a contradiction
and so √2 is not a rational number.
Example 10.2.5. We can draw the conclusion that also $\alpha = \sqrt{1 + \sqrt{2}}$ is irrational. Indeed, if α were rational,
say equal to m/n, then α² = m²/n² is also rational and so is α² − 1 = (m² − n²)/n². But α² − 1 = √2,
which we know to be irrational.
Some problems sound very hard, but turn out to have elementary solutions. For example, consider the
statement: “there exist two irrational numbers a, b such that $a^b$ is rational”. This seems hard, but a shrewd
observation due to Dov Jarden gives a proof: if $\sqrt{2}^{\sqrt{2}}$ is rational then we are done. Otherwise, consider
$a^b$, where $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$. As $a^b = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^{2} = 2$, we are done. This clever argument doesn’t tell us which
is the correct example, $\sqrt{2}^{\sqrt{2}}$, or $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$. In fact, by the Gelfond-Schneider theorem, we know $\sqrt{2}^{\sqrt{2}}$ is
transcendental (namely, it doesn’t solve any non-zero polynomial with rational coefficients; being irrational just
means it doesn’t solve any linear polynomial with rational coefficients), so the example is actually $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$.
See also Exercise 20.
11. Exercises
(1) Find the quotient and remainder when a is divided by b:
(a) a = 302, b = 19.
(b) a = −302, b = 19.
(c) a = 0, b = 19.
(d) a = 2000, b = 17.
(e) a = 2001, b = 17.
(f) a = 2002, b = 17.
(2) Prove that the square of any integer a is either of the form 3k or of the form 3k + 1 for some integer
k. (Hint: write a in the form 3q + r, where r = 0, 1 or 2.)
(3) Prove or disprove: If a|(b + c) then a|b or a|c. If a|bc then a|b. If a|c and (a + b)|c then b|c.
(4) If r ∈ Z and r is a solution of x² + ax + b = 0 (where a, b ∈ Z) prove that r|b.
(5) If n ∈ Z, what are the possible values of
(a) (n, n + 2);
(b) (n, n + 6).
(6) Find the following gcd’s. In each case also express ( a, b) as ua + vb for suitable integers u, v ∈ Z.
(a) (56, 72).
(b) (24, 138).
(c) (143, 227).
(d) (314, 159).
(7) If a|c and b|c, must ab divide c? What if ( a, b) = 1?
(8) The least common multiple of nonzero integers a, b is the smallest positive integer m such that a|m
and b|m. We denote it by lcm( a, b) or [ a, b]. Prove that:
(a) If a|k and b|k then [ a, b]|k.
(b) [a, b] = ab/(a, b) if a > 0, b > 0.
(9) Let $a = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$ and $b = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k}$, where $p_1, p_2, \dots, p_k$ are distinct positive primes and
each $r_i, s_i \ge 0$. Prove that
(a) $(a, b) = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$, where $n_i = \min(r_i, s_i)$.
(b) $[a, b] = p_1^{t_1} p_2^{t_2} \cdots p_k^{t_k}$, where $t_i = \max(r_i, s_i)$.
(10) Prove or disprove: If n is an integer and n > 2, then there exists a prime p such that n < p < n!.
(11) Find all the primes between 1 and 150. The solution should consist of a list of all the primes + giving
the last prime used to sieve + explanation why you didn’t have to sieve by larger primes.
(12) Unlike twin primes, prove that there is a unique triplet of primes; namely, if n, n + 2, n + 4 are primes
for some n ≥ 1 then n = 3.
(13) Prove that $\sqrt{2 + \sqrt{3}}$ is irrational.
(14) Prove that if p is a prime then $\sqrt{p}$ is irrational.
(15) Prove that if p is a prime then $\sqrt[3]{p}$ is irrational.
(16) Let k ≥ 1 be an integer. Prove that if a positive integer n is not a k-th power of another integer
then $\sqrt[k]{n}$ is irrational.
(17) Prove that there are no rational numbers a, b such that $\sqrt{3} = a + b\sqrt{2}$.
(18) If the ratio of the frequencies of two musical notes is 3 : 2, we say the notes form a fifth. For
example, the notes C and G (where G is higher) form a fifth, as do the notes G and D (where D is
higher). In fact, in the following sequence any two consecutive notes are supposed to form a fifth:
C, G, D, A, E, B, F♯, C♯, G♯, D♯, A♯, F, C.
(The last C is 7 octaves higher than the first C).
On the other hand, the ratio between two consecutive C’s is an octave and is 2 : 1. Explain why
this leads to a contradiction; the sequence above cannot really be a sequence of fifths. This is solved,
in tuning a piano, by the so-called equal temperament method; the fifths are not quite in ratio 3 : 2.
In fact, they are in ratio x : 1, where x is chosen so to ensure that the octaves stay in ratio 2 : 1.
What is x and how close is it to 3 : 2?
(19) Prove that there are infinitely many primes congruent to 3 modulo 4.
(20) Let I be the interval $((1/e)^{1/e}, \infty)$. Then every rational number in I is either of the form $a^a$ where a
is irrational, or of the form $n^n$ where n is an integer. This gives infinitely many examples of irrational
numbers a such that $a^a$ is rational.
Here are some suggestions as to how to prove this statement. First, using analysis show that the
function $f(x) = x^x$ is a bijection between the interval (1/e, ∞) and I. Now, let r be a rational number
in I and assume that $r = a^a$, where a is rational too. Write a = n/m and r = b/c where b, c, m, n
are positive integers and (b, c) = (m, n) = 1. The goal is to show m = 1.
By analysing $(n/m)^{n/m} = b/c$ conclude that we must have $c^m = m^n$. If m > 1, take a prime p
dividing both c and m, say $p^i \,\|\, c$ and $p^j \,\|\, m$ (the notation $p^i \,\|\, c$ means that $p^i \mid c$, but $p^{i+1} \nmid c$); deduce
that im = jn and from that $p^j \mid j$. Show that this is not possible.
z
Lemma 12.0.1. Congruence modulo n is an equivalence relation on Z. The set {0, 1, . . . , n − 1} is a complete
set of representatives.
Proof. First n|( x − x ) so x ≡ x and the relation is thus reflexive. If n|( x − y) then n| − ( x − y) = y − x,
so the relation is symmetric. Suppose n|( x − y), n|(y − z) then n|(( x − y) + (y − z)) = x − z and so the
relation is transitive too.
Let x be any integer and write x = qn + r with 0 ≤ r < n. Then x − r = qn and so x ≡ r. It follows
that every equivalence class is represented by some r ∈ {0, 1, . . . , n − 1}. The equivalence classes defined by
elements of {0, 1, . . . , n − 1} are disjoint. If not, then for some 0 ≤ i < j < n we have i ≡ j, that is, n|( j − i ).
But 0 < j − i < n and we get a contradiction.
Theorem 12.0.2. Denote the equivalence classes of congruence modulo n by $\bar{0}, \bar{1}, \dots, \overline{n-1}$ instead of
[0], [1], . . . , [n − 1]. Denote this set by Z/nZ. The set Z/nZ is a commutative ring under the following
operations:
$$\bar{i} + \bar{j} = \overline{i + j}, \qquad \bar{i} \cdot \bar{j} = \overline{ij}.$$
The neutral element for addition is $\bar{0}$, for multiplication $\bar{1}$, and the inverse of $\bar{i}$ with respect to addition
is $\overline{-i} = \overline{n - i}$.
Modular arithmetic, that is calculating in the ring Z/nZ, is sometimes called “clock arithmetic”. The reason
is the following. The usual clock is really displaying hours modulo 12. When 5 hours pass from the time
10 o’clock the clock shows 3. Note that 3 ≡ 15 (mod 12). We are used to adding hours modulo 12 (or
modulo 24, for that matter), but we are not used to multiplying hours as that doesn’t quite make sense.
However, if you’d like you can think about multiplication as repeated addition 5 · 3 = 5 + 5 + 5. So, in that
sense, we are already familiar with the operations modulo 12 and the definitions above are a generalization.
Continuing with our numerical example, let us solve the equation 4x + 2 = 7 in Z/13Z. Now, and from
now on, we are just writing 4, 2, 7 etc. for 4̄, 2̄, 7̄. So we need to solve 4x = 5. Let us first look for a
residue class r modulo 13 so that 4r ≡ 1 (mod 13). We guess (but later we’ll develop methods for that)
that r = 10 and check: 4 · 10 = 40 ≡ 1 (mod 13). We now multiply both sides of the equation 4x = 5 by
10. Then, 4x = 5 implies 10 · 4x ≡ x ≡ 50 ≡ 11 (mod 13). Thus, the only possibility is x = 11. We go
back to the original equation 4x = 5 and verify that 4 · 11 ≡ 5 (mod 13). We found the solution x = 11.
The idea of the calculation was to use that for a given a ≠ 0 we can find r such that ra = 1. This is
used to solve the equation ax = b by reducing to x = rax = rb, and we had done that for the particular
case a = 4 and b = 5. We remark that in general such an r need not exist if the modulus n is not a prime.
These issues will be discussed later. In particular, as we shall see, there are efficient ways to find the solution
to ax = 1 (mod n), but the situation for equations of the form ax = b (mod n) is more involved.
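The calculation above can be replayed on a computer. The following is a minimal sketch in Python, not part of the original notes; the function name `inverse_by_search` is an illustrative choice, and the search simply tries every residue class, which is only sensible for small n.

```python
def inverse_by_search(a, n):
    """Return r in {0, ..., n-1} with a*r ≡ 1 (mod n), or None if there is none."""
    for r in range(n):
        if (a * r) % n == 1:
            return r
    return None


n = 13
r = inverse_by_search(4, n)      # the text guesses r = 10
x = (r * 5) % n                  # multiply both sides of 4x = 5 by r
print(r, x)                      # 10 11
print((4 * x + 2) % n)           # 7, so x solves 4x + 2 = 7 in Z/13Z

# When n is not prime an inverse need not exist:
print(inverse_by_search(2, 4))   # None
```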
Example 12.0.4. As another example, we give the addition and multiplication table of Z/5Z.
+ | 0 1 2 3 4        · | 0 1 2 3 4
--+----------        --+----------
0 | 0 1 2 3 4        0 | 0 0 0 0 0
1 | 1 2 3 4 0        1 | 0 1 2 3 4
2 | 2 3 4 0 1        2 | 0 2 4 1 3
3 | 3 4 0 1 2        3 | 0 3 1 4 2
4 | 4 0 1 2 3        4 | 0 4 3 2 1
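Such tables are easy to generate for any modulus. Here is a small Python sketch (an illustration only; the helper name `tables` is ours) that builds the addition and multiplication tables of Z/nZ using the representatives 0, 1, . . . , n − 1.

```python
def tables(n):
    """Return (addition table, multiplication table) of Z/nZ as lists of rows."""
    add = [[(i + j) % n for j in range(n)] for i in range(n)]
    mul = [[(i * j) % n for j in range(n)] for i in range(n)]
    return add, mul


add, mul = tables(5)
for row in mul:
    print(*row)
```

For n = 5 the rows of `mul` agree with the multiplication table above; for instance the row of 2 is 0 2 4 1 3.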
Proof. (Of Theorem 12.0.2) We first prove that the operations do not depend on the representatives for the
equivalence classes that we have chosen.
Suppose $\bar{i} = \bar{i'}$, $\bar{j} = \bar{j'}$, where i, i′, j, j′ need not be in the set {0, 1, 2, . . . , n − 1}. We have defined $\bar{i} + \bar{j} = \overline{i + j}$
and so we need to check that this is the same as $\overline{i' + j'}$. Since $\bar{i} = \bar{i'}$, n|(i − i′) and similarly n|(j − j′).
Therefore, n|((i + j) − (i′ + j′)); that is, $\overline{i + j} = \overline{i' + j'}$.
We also need to show that $\overline{ij} = \overline{i'j'}$. But, ij − i′j′ = ij − ij′ + ij′ − i′j′ = i(j − j′) + j′(i − i′) and so
n|(ij − i′j′).
The verification of the axioms is now easy if we make use of the fact that Z is a commutative ring:
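(1) $\bar{i} + \bar{j} = \overline{i + j} = \overline{j + i} = \bar{j} + \bar{i}$. (The first and third equalities are by definition and the second equality
follows from Z being a commutative ring.)
(2) $(\bar{i} + \bar{j}) + \bar{k} = \overline{i + j} + \bar{k} = \overline{(i + j) + k}$. Note that at this point we used the simplification that we can
use any representative of the equivalence class to carry out the operations. Had we insisted on always
using the representative in the set {0, 1, 2, . . . , n − 1} we would usually have needed to replace i + j
by its representative in that set and things would be turning messy. Now, $\overline{(i + j) + k} = \overline{i + (j + k)} =
\bar{i} + \overline{j + k} = \bar{i} + (\bar{j} + \bar{k})$.
(3) $\bar{0} + \bar{i} = \overline{0 + i} = \bar{i}$.
(4) $\bar{i} + \overline{-i} = \overline{i + (-i)} = \bar{0}$. Note that $\overline{-i} = \overline{n - i}$.
(5) $(\bar{i} \cdot \bar{j})\bar{k} = \overline{ij} \cdot \bar{k} = \overline{(ij)k} = \overline{i(jk)} = \bar{i} \cdot \overline{jk} = \bar{i}(\bar{j} \cdot \bar{k})$. (A dot is added occasionally merely to make it
easier to read.)
(6) $\bar{1} \cdot \bar{i} = \overline{1 \cdot i} = \bar{i}$ and $\bar{i} \cdot \bar{1} = \overline{i \cdot 1} = \bar{i}$.
(7) $\bar{i}(\bar{j} + \bar{k}) = \bar{i} \cdot \overline{j + k} = \overline{i(j + k)} = \overline{ij + ik} = \overline{ij} + \overline{ik} = \bar{i} \cdot \bar{j} + \bar{i} \cdot \bar{k}$. Similarly, $(\bar{j} + \bar{k})\bar{i} = \overline{j + k} \cdot \bar{i} =
\overline{(j + k)i} = \overline{ji + ki} = \overline{ji} + \overline{ki} = \bar{j} \cdot \bar{i} + \bar{k} \cdot \bar{i}$.
Furthermore, this is a commutative ring: $\bar{i} \cdot \bar{k} = \overline{ik} = \overline{ki} = \bar{k} \cdot \bar{i}$.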
In the proof we saw that the ring properties of Z/nZ, the set of equivalence classes modulo n, all follow
from the ring properties of Z. We shall later see that this can be generalized to any ring R: if we impose
a correct notion of an equivalence relation, the equivalence classes themselves will form a ring and the fact
that the ring axioms hold for it follows from the fact that they hold for R.
Before providing the proof we introduce some terminology. Let R be a ring, x ∈ R a non-zero element. x is
called a zero divisor if there is an element y ≠ 0 such that either xy = 0 or yx = 0 (or both).
Lemma 12.0.6. Let R be a commutative ring. If R has zero divisors then R is not a field.
Proof. Let x ≠ 0 be a zero divisor and let y ≠ 0 be an element such that xy = 0. If R is a field then there
is an element z ∈ R such that zx = 1. But then z( xy) = z · 0 = 0 and also z( xy) = (zx )y = 1 · y = y.
So y = 0, and that is a contradiction.
Proof. (Of Theorem) If n = 1 then Z/nZ has a single element and so 0 = 1 in that ring. Therefore, it is
not a field. Suppose that n > 1 and n is not prime, n = ab where 1 < a < n, 1 < b < n. Then $\bar{a} \neq \bar{0}$, $\bar{b} \neq \bar{0}$
but $\bar{a} \cdot \bar{b} = \overline{ab} = \bar{n} = \bar{0}$. So Z/nZ has zero divisors and thus is not a field.
Suppose now that n is prime and let $\bar{a} \neq \bar{0}$. That is, n ∤ a, which, since n is prime, means that (n, a) = 1.
Consider the list of elements
$$\bar{0} \cdot \bar{a},\ \bar{1} \cdot \bar{a},\ \dots,\ \overline{n - 1} \cdot \bar{a}.$$
We claim that they are distinct elements of Z/nZ. Suppose that $\bar{i} \cdot \bar{a} = \bar{j} \cdot \bar{a}$, for some 0 ≤ i ≤ j ≤ n − 1;
then $\overline{ia} = \overline{ja}$, which means that n|(ia − ja) = (i − j)a. Since (n, a) = 1, it follows that n|(i − j), and since
0 ≤ j − i < n that means i = j. Thus, the list $\bar{0} \cdot \bar{a}, \bar{1} \cdot \bar{a}, \dots, \overline{n - 1} \cdot \bar{a}$ contains n distinct elements of Z/nZ and so it must
contain $\bar{1}$. That is, there’s an i such that $\bar{i} \cdot \bar{a} = \bar{1}$ and therefore $\bar{a}$ is invertible.
Here is another proof for the invertibility of ā. Since ( a, n) = 1, for suitable integers u, v we have
ua + vn = 1. But that means that n|(ua − 1). That is, ua ≡ 1 (mod n).
The first proof has the advantage that a minor variant shows that every finite commutative ring with no
zero-divisors is a field; the second proof has the advantage that in our particular situation one can use the
Euclidean algorithm to calculate u such that ua ≡ 1 (mod n), namely to calculate $a^{-1}$ in the field. We shall
see that this is a tremendously useful fact.
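The second proof is constructive, and it may help to see it as a short program. The sketch below is in Python and is not part of the original notes; the names `extended_gcd` and `inverse_mod` are illustrative. It keeps track of the coefficients u, v during the Euclidean algorithm, so that at the end gcd(a, n) = ua + vn, and then reduces u modulo n.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) = u*a + v*b, for integers a, b >= 0."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: old_r = old_u*a + old_v*b and r = u*a + v*b.
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(a, n):
    """Return the inverse of a in Z/nZ as a representative in {0, ..., n-1}."""
    g, u, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not invertible modulo n")
    return u % n


print(inverse_mod(4, 13))    # 10, the value guessed earlier
print(inverse_mod(12, 17))   # 10, that is, -7 modulo 17
```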
Let p be a prime number. We denote Z/pZ also by $F_p$. It is a field with p elements. It is a fact that any
finite field (that is, any field with finitely many elements) has cardinality a power of a prime and for any prime
power there is a field with that cardinality - we learn ways to construct such fields in this course. Finite fields,
such as $F_p$, play an important role in coding and cryptography as well as in pure mathematics. Interestingly
enough, these fields used to be called Galois fields and a notation such as GF(p, n) (Galois field of
$p^n$ elements) was used. Initially, such fields were the height of abstraction; today we routinely use them in
various cryptographic and coding theory implementations.
$$a^{p-1} \equiv 1 \pmod{p}.$$
Before proving the theorem we state two auxiliary statements whose proofs are left as an exercise.
Lemma 12.1.2. Let p be a prime number. We have $p \mid \binom{p}{i}$ for every 1 ≤ i ≤ p − 1.¹⁸
Lemma 12.1.3. Let R be a commutative ring and x, y ∈ R. Interpret $\binom{n}{i}$ as adding $\binom{n}{i}$ times the element 1
to itself. Then the binomial formula holds in R:
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.$$
¹⁷ Pierre de Fermat, 1601 - 1665, was a French lawyer and government official most remembered for his work in number theory;
in particular for Fermat’s Last Theorem that states the equation $x^n + y^n = z^n$ does not have solutions in positive integers
x, y, z such that xyz ≠ 0 for any n ≥ 3. He famously scribbled in the margin of his copy of Diophantus’ Arithmetica book
that he had discovered a marvellous proof that the margin is too small to contain. Fermat’s last theorem was proved by Sir
Andrew Wiles in 1994, ushering in the process a whole new area of number theory. Wiles’ proof is so difficult, and builds on
so much mathematics that didn’t exist at Fermat’s time, that Fermat’s claim cannot be accepted as such. Fermat had also
made important contributions to calculus and diophantine equations by introducing the so-called method of infinite descent.
This method, still in use today, assumes that a given diophantine equation (namely, f ( x ) = 0, f ( x ) ∈ Z[ x ]) has any solution
in integers and attempts to use those to provide a solution in smaller integers (in absolute value). The process can then be
repeated, yielding eventually a contradiction as one arrives at integer solutions of negative absolute value.
¹⁸ Recall that the binomial coefficient $\binom{n}{a}$ is defined as follows. Let n, a be natural numbers. Define $\binom{n}{a} = \frac{n!}{a!\,(n-a)!}$, where 0! := 1.
This is in fact an integer and has the combinatorial interpretation that $\binom{n}{a}$ is the number of ways to choose a distinct objects
from a set of n objects, where the order in which we choose the objects doesn’t matter. That is, it is the number of subsets
of cardinality a of a set of cardinality n. This, incidentally, is a proof that $\binom{n}{a}$ is indeed an integer, and the reason the binomial
number $\binom{n}{a}$ is often read “n choose a”.
Proof. (of Fermat’s little theorem) We prove the statement by induction on 1 ≤ a ≤ p − 1. For a = 1 the
result is clear. Suppose the result for a and consider a + 1, provided a + 1 < p. We have, by the binomial
formula,
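\begin{align*}
(a + 1)^p &= \sum_{i=0}^{p} \binom{p}{i} a^i \\
&= 1 + \binom{p}{1} a + \binom{p}{2} a^2 + \cdots + \binom{p}{p-1} a^{p-1} + a^p \\
&\equiv 1 + a^p \pmod{p} && \text{(using the lemma)} \\
&\equiv 1 + a \pmod{p} && \text{(using the induction hypothesis, which gives } a^p \equiv a\text{)}
\end{align*}
Since 1 + a ≢ 0 (mod p) it has an inverse y in $F_p$, y(1 + a) ≡ 1. Then, $y(1 + a)^p \equiv y(1 + a) \equiv 1$. As also
$y(1 + a)^p = y(1 + a)(1 + a)^{p-1} \equiv (1 + a)^{p-1}$, we find that $(1 + a)^{p-1} \equiv 1$.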
Example 12.1.4. We calculate $2^{100}$ modulo 13. We have $2^{100} = 2^{96} \cdot 2^4 = (2^{12})^8 \cdot 2^4 \equiv 2^4 \equiv 3$ modulo 13.
Fermat’s little theorem gives a criterion for numbers to be composite. Let n be a positive integer. If there
is 1 ≤ a ≤ n − 1 such that $a^{n-1} \not\equiv 1$ (mod n) then n is not prime. Unfortunately, it is possible that for
every 1 ≤ a ≤ n − 1 such that (a, n) = 1, one has $a^{n-1} \equiv 1$ (mod n) and yet n is not prime. Thus, this
test fails to recognize such n as composite numbers. Such numbers are called Carmichael numbers. There
are infinitely many such numbers. The first are 561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, 15841,
29341, ...
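The beginning of this list can be recovered by a direct search. The following Python sketch is an illustration only (the function names are ours, and the search is naive); it lists the composite n below 3000 for which $a^{n-1} \equiv 1$ (mod n) holds for every 1 ≤ a ≤ n − 1 with (a, n) = 1.

```python
from math import gcd


def is_prime_trial(n):
    """Primality by trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def passes_fermat_for_all_coprime_bases(n):
    """True if a^(n-1) ≡ 1 (mod n) for every 1 <= a < n with gcd(a, n) = 1."""
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


carmichael = [n for n in range(2, 3000)
              if not is_prime_trial(n) and passes_fermat_for_all_coprime_bases(n)]
print(carmichael)   # [561, 1105, 1729, 2465, 2821]
```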
Primality testing routines first test divisibility by small primes available to the program as pre-computed
data and then use various methods that are more sophisticated versions of the following method: choose
randomly some 1 ≤ a < n; if (a, n) ≠ 1 then n is not prime. If (a, n) = 1 the program calculates $a^{n-1}$
(mod n). If the result is not 1 (mod n) then n is not prime. If the result is 1, the program chooses another a.
After a certain number of tests, say 10, if n passed all the tests it is declared as “prime”, though there is
no absolute reassurance it is indeed a prime; it could be a Carmichael number, or we could have just been
unlucky in choosing our elements a. We remark that calculating $a^{n-1}$ (mod n) can be done quickly. One
calculates $a, a^2, a^4, a^8, a^{16}, \dots$ modulo n, as long as the power is less than n, each term being the square of
the previous one. One then expresses the exponent n − 1 in base 2 to find the result. Here is an example: Let us calculate $3^{54}$ (mod 55) (random
choice of numbers). We have, modulo 55, $3$, $3^2 = 9$, $3^4 = 81 = 26$, $3^8 = 26^2 = 676 = 16$, $3^{16} = 16^2 = 256 = 36$, $3^{32} =
36^2 = 1296 = 31$. Now, 54 = 2 + 4 + 16 + 32 and so $3^{54} = 9 \cdot 26 \cdot 36 \cdot 31 = 4$. In particular, 55 is not a
prime – not that this is a particularly shrewd observation...
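The repeated squaring procedure and the basic test can be sketched in Python as follows. This is a hedged illustration, not the routine of any particular software package: the names `power_mod` and `looks_prime` are ours, and the bases are passed in explicitly rather than chosen at random so that the outcome is reproducible.

```python
from math import gcd


def power_mod(a, e, n):
    """Compute a^e modulo n by repeated squaring, reading e in base 2."""
    result = 1
    square = a % n            # runs through a, a^2, a^4, a^8, ... modulo n
    while e > 0:
        if e & 1:             # the current binary digit of e is 1
            result = (result * square) % n
        square = (square * square) % n
        e >>= 1
    return result


def looks_prime(n, bases):
    """Fermat test: False means n is certainly composite, True is inconclusive."""
    for a in bases:
        if gcd(a, n) != 1 or power_mod(a, n - 1, n) != 1:
            return False
    return True


print(power_mod(3, 54, 55))          # 4, as in the example above
print(looks_prime(55, [3]))          # False
print(looks_prime(561, [2, 5, 7]))   # True, although 561 = 3 * 11 * 17
```

Python's built-in `pow(a, e, n)` computes the same quantity; the explicit loop is written out only to mirror the description in the text.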
It is important to note that there is a polynomial-time algorithm (this means that the running time of the
algorithm is bounded by a fixed polynomial in the number of digits of the input) to decide, without any doubt,
if an integer is prime. Such an algorithm was discovered by Agrawal, Kayal and Saxena in 2002 and was a
real sensation at the time. Note, however, that the algorithm does not produce a decomposition of n
in case n is composite. An efficient algorithm of that kind would compromise the very backbone of e-commerce and military
security.
12.2. Solving equations in Z/nZ. There is no general method for solving polynomial equations in Z/nZ.
We just present some selected topics.
12.2.1. Linear equations. We want to consider the equation ax + b = 0 in Z/nZ. Let us assume that ( a, n) =
1. Then, there are integers u, v such that 1 = ua + vn. We remark that u, v are found by the Euclidean
algorithm. Note that this implies that ua ≡ 1 (mod n). Thus, if x solves ax + b = 0 in Z/nZ then x solves
the equation uax + ub = 0 (mod n), that is x + ub = 0 and so x = −ub in Z/nZ. Conversely, if x = −ub
in Z/nZ where ua = 1 in Z/nZ then ax = a(−ub) = − aub = −b in Z/nZ.
We summarize: if ( a, n) = 1 then the equation
ax + b = 0 (mod n),
has a unique solution x = −ub, where u is such that ua = 1 (mod n).
Example 12.2.1. Here is a specific example: Let us solve 12x + 3 = 0 (mod 17). First 17 = 12 + 5, 12 =
2 · 5 + 2, 5 = 2 · 2 + 1, so (12, 17) = 1 and, moreover, 1 = 5 − 2 · 2 = 5 − 2 · (12 − 2 · 5) = 5 · 5 − 2 ·
12 = 5 · (17 − 12) − 2 · 12 = 5 · 17 − 7 · 12. We see that −7 · 12 ≡ 1 (mod 17). Thus, the solution
is x = 7 · 3 = 21 = 4 (mod 17).
More generally, if ( a, n) = d, then the equation ax + b ≡ 0 (mod n) has a solution if and only if d|b.
Indeed, if we have a solution, then b = kn − ax for some integer k and we see that d|b. Conversely, if d|b
consider the equation ( a/d) x + (b/d) ≡ 0 (mod n/d). A solution to this equation also solves the original
equation ax + b ≡ 0 (mod n). Now ( a/d, n/d) = 1 and we are in the case already discussed above.
Unlike the case d = 1, if d > 1 (and d|b) there is more than one solution to ax + b ≡ 0
(mod n). In fact, one can prove that if $x_0$ is one solution, then a complete set of solutions modulo n is
$$x_0,\quad x_0 + \frac{n}{d},\quad x_0 + 2 \cdot \frac{n}{d},\quad \dots,\quad x_0 + (d - 1) \cdot \frac{n}{d}.$$
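The discussion of linear equations can be summarized in a short Python sketch. It is an illustration under stated assumptions: the function name `solve_linear` is ours, and the inverse is computed with Python's built-in `pow(a, -1, n)`, which requires Python 3.8 or later (one could equally use the Euclidean algorithm as in the text).

```python
from math import gcd


def solve_linear(a, b, n):
    """Return all x in {0, ..., n-1} with a*x + b ≡ 0 (mod n), in increasing order."""
    a %= n
    d = gcd(a, n)
    if b % d != 0:
        return []                           # solvable if and only if d | b
    a1, b1, n1 = a // d, b // d, n // d     # reduced equation modulo n/d
    u = pow(a1, -1, n1) if n1 > 1 else 0    # u*a1 ≡ 1 (mod n1)
    x0 = (-u * b1) % n1
    return [x0 + k * n1 for k in range(d)]  # x0, x0 + n/d, ..., x0 + (d-1)n/d


print(solve_linear(12, 3, 17))   # [4], as in Example 12.2.1
print(solve_linear(6, 3, 9))     # [1, 4, 7]
print(solve_linear(6, 2, 9))     # [], since 3 does not divide 2
```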
12.2.2. Quadratic equations. Consider the equation $ax^2 + bx + c = 0$ in Z/nZ and assume n is a prime
greater than 2. In that case, assuming that a ≠ 0 modulo n, there is an element $(2a)^{-1}$. One can prove that
the equation has a solution if and only if $b^2 - 4ac$ is a square in Z/nZ (which may or may not be the case).
In case it is a square, the solutions of this equation are given by the usual formula:
$$(2a)^{-1}\left(-b \pm \sqrt{b^2 - 4ac}\right).$$
For example, the equation $x^2 + x + 1 = 0$ has no solution in Z/5Z because the discriminant $b^2 - 4ac$ is in this
case $1^2 - 4 = -3 = 2$ in Z/5Z and 2 can be checked not to be a square in Z/5Z (one just tries: $0^2 =
0$, $1^2 = 1$, $2^2 = 4$, $3^2 = 4$, $4^2 = 1$ in Z/5Z). On the other hand, $x^2 + x + 1 = 0$ can be solved in Z/7Z. The
solutions are $4(-1 \pm \sqrt{-3}) = -4 \pm 4\sqrt{4} = -4 \pm 8 = -4 \pm 1 = \{2, 4\}$.
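The recipe above can be tried out with a small Python sketch. This is only an illustration for small odd primes p: the name `solve_quadratic` is ours, square roots of the discriminant are found by trying all residues, and $(2a)^{-1}$ is computed as $(2a)^{p-2}$, which is valid by Fermat's little theorem.

```python
def solve_quadratic(a, b, c, p):
    """Solutions of a*x^2 + b*x + c = 0 in Z/pZ, for p an odd prime and a ≢ 0 (mod p)."""
    disc = (b * b - 4 * a * c) % p
    # Square roots of the discriminant, found by trial (fine for small p).
    roots = [s for s in range(p) if (s * s) % p == disc]
    inv_2a = pow(2 * a, p - 2, p)        # (2a)^(-1), by Fermat's little theorem
    return sorted({(inv_2a * (-b + s)) % p for s in roots})


print(solve_quadratic(1, 1, 1, 5))   # [], the discriminant 2 is not a square
print(solve_quadratic(1, 1, 1, 7))   # [2, 4]
```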
When n is not prime, we shall not study the problem in this course, beyond remarking that one can proceed
by trying all possibilities if n is small and that the number of solutions can be very large. For example:
consider the equation $x^3 - x = 0$ in Z/8Z. We can verify that its solutions are 0, 1, 3, 5, 7. There are 5 solutions
but the equation has degree three. We shall later see that in any field a polynomial equation of degree n
has at most n roots. As the example of $x^3 - x$ in Z/8Z suggests, this may fail spectacularly in a general
commutative ring.
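"Trying all possibilities" is easily done by machine when n is small. A minimal Python sketch follows; the function name `roots_mod` and the convention that `coeffs[i]` is the coefficient of $x^i$ are our own choices.

```python
def roots_mod(coeffs, n):
    """Roots in Z/nZ of the polynomial sum(coeffs[i] * x^i), found by trying every x."""
    def value(x):
        return sum(c * pow(x, i, n) for i, c in enumerate(coeffs)) % n
    return [x for x in range(n) if value(x) == 0]


print(roots_mod([0, -1, 0, 1], 8))   # x^3 - x in Z/8Z: [0, 1, 3, 5, 7]
print(roots_mod([0, -1, 0, 1], 7))   # x^3 - x in the field Z/7Z: [0, 1, 6]
```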
12.3. Public key cryptography; the RSA method.¹⁹ We cannot go here too much into the practical aspects
of cryptography. Suffice it to say that in many cryptographic applications two parties X and Y wish to
exchange a secret. Given any large integer n that secret can be represented as a number modulo n, and we
leave it to the reader’s imagination to devise methods for that. The method proceeds as follows:
The rest of the data p, q, k, d is kept secret. In fact, p, q, k can be destroyed altogether and only d is kept,
and kept secret!
¹⁹ The RSA method described above is named after Ron Rivest, Adi Shamir and Len Adleman, who discovered it in 1977.
Y, wishing to send a secret, writes it as an integer b modulo n, which is also relatively prime to n, and sends $b^e$
(mod n) to X, allowing anyone interested to see that message. The point is, and this is called the discrete
log problem, that it is very difficult to find what b is, even when one knows $b^e$ and n. Thus, someone seeing
Y’s message cannot find the secret b from it.
p = 10007, q = 10009;
n = p*q = 100160063
k = (p-1)*(q-1) = 100140048
d = 10001
e = 88695185
b = 3
b^e = 33265563
33265563^10001 = 3 Mod n.
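The numbers in this example can be checked with a few lines of Python. The sketch below only uses the values listed above and the built-in three-argument `pow`; the variable names `cipher` and `recovered` are ours. It verifies that de ≡ 1 (mod k) and that raising the transmitted value to the power d recovers b.

```python
p, q = 10007, 10009
n = p * q                       # public modulus
k = (p - 1) * (q - 1)
d = 10001                       # kept secret by X
e = 88695185                    # public exponent
b = 3                           # the secret Y wishes to send

assert n == 100160063 and k == 100140048
assert (d * e) % k == 1         # d and e are inverse to each other modulo k

cipher = pow(b, e, n)           # what Y sends: b^e (mod n)
recovered = pow(cipher, d, n)   # what X computes: (b^e)^d (mod n)
print(cipher, recovered)
```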
RSA rests on several security assumptions. Besides the belief that the discrete log problem is inherently
difficult, it also pre-supposes that factoring an integer of the form pq, where p and q are
primes of similar size, is a computationally difficult problem whose solution has running time comparable to
the size of p. All evidence so far indicates that this is so. Suppose that p is a prime of the order of magnitude
of $10^{1000}$ (at this point in time (2016) this is considered very secure). Let us calculate how long it will take
to factor n by brute force trial and error. Cray’s supercomputer Titan does 20,000 trillion ($2 \times 10^{16}$) floating-point
operations (flops) per second. It had cost 97 million dollars to be built. To simplify, let us assume that this many
flops boils down to trying $2 \times 10^{16}$ integers as factors of n per second.
We can assume that such a computer can’t be built for less than 10 million dollars. A computer 10000
times faster would probably cost today in excess of 10 billion dollars, if it can be constructed at all. The existence
of anything bigger would probably be public knowledge for budgetary reasons and the size of resources needed
for constructing and running such a computer. Even with dedicated architecture it wouldn’t perform more
than $10^{25}$ operations per second. Factoring an integer greater than $10^{1000}$ by brute force would then take
more than $10^{975}$ seconds, which is much much more than a billion years. Even relaxing our assumptions
greatly shows that for all practical purposes, as long as our security assumptions are valid, it is not feasible
to break RSA based on a key of this size.
13. Exercises
(1) A relation can be either reflexive or not, symmetric or not, transitive or not. This gives a priori 8
possibilities (e.g., reflexive, non-symmetric, transitive). For each possibility either give an example
of such a relation, or indicate why this possibility doesn’t occur. Whenever possible, give “natural
examples".
(2) Let A be a set.
(a) Let Γ = A × A. What relation does Γ define on A? Is it an equivalence relation? For every
a ∈ A write the set of elements b ∈ A such that a ∼ b.
(b) Let Γ = {( a, a) : a ∈ A}. What relation does Γ define on A? Is it an equivalence relation? For
every a ∈ A write the set of elements b ∈ A such that a ∼ b.
(c) Define a relation on non-zero real numbers by saying that a ∼ b if a/b is a rational number.
Show that this is an equivalence relation.
(d) Define a relation on complex numbers by saying that z1 ∼ z2 if |z1 | = |z2 |. Show that this is
an equivalence relation. What do the equivalence classes look like graphically?
(e) Let f(x) be a polynomial of the form $x^n + a_{n-1} x^{n-1} + \cdots + a_0$, where n ≥ 1 and $a_i$ ∈ R.
Describe a relation on R by saying that a ∼ b if f ( a) = f (b). Prove that this is an equivalence
relation. Prove that if n is odd, there is an element a ∈ R whose equivalence class consists only
of itself. (This question requires some use of calculus.)
(3) Given an integer N, we write N in decimal expansion as $N = n_k n_{k-1} \dots n_0$, the $n_i$ being the digits
of N. Note that this means that $N = n_0 + 10 n_1 + 10^2 n_2 + \cdots + 10^k n_k$. In the following you are
asked to show certain divisibility criteria that can be proved by using congruences.
(a) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 3 if and only if the sum of its
digits $n_0 + n_1 + \cdots + n_k$ is divisible by 3. (Hint: show that in fact N and $n_0 + n_1 + \cdots + n_k$
are congruent to the same number modulo 3.) Example: 34515 is divisible by 3 because 3 +
4 + 5 + 1 + 5 = 18 is divisible by 3.
(b) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 11 if and only if the sum of its
digits with alternating signs $n_0 - n_1 + n_2 - \cdots + (-1)^k n_k$ is divisible by 11. Example: 1234563
is divisible by 11 since 1 − 2 + 3 − 4 + 5 − 6 + 3 = 0 is divisible by 11.
(c) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 7 if and only if when we let
$M = n_k n_{k-1} \dots n_1$, we have that $M - 2n_0$ is divisible by 7. Example: take the number 7 · 11 ·
13 · 17 = 17017. It is clearly divisible by 7. Let us check the criterion against this example. We
form the number 1701 − 2 · 7 = 1687 and then the number 168 − 2 · 7 = 154 and then the
number 15 − 2 · 4 = 7. So it works. Let us also check the number 82. It is not divisible by 7, in
fact its residue modulo 7 is 5. Also 8 − 2 · 2 = 4, so the criterion shows that it’s not divisible.
Note though that in this case the number N = 82 and the number $M - 2n_0$ = 8 − 2 · 2 = 4 don’t
have the same residue modulo 7. So you need to construct your argument a little differently
than in the previous exercises.
(4) To check if you had multiplied correctly two large numbers A and B, A × B = C, you can make the
following check: sum the digits of A; keep doing it repeatedly until you get a single digit number a.
Do the same for B and C and get numbers b, c. If you have multiplied correctly, the sum of digits of
ab is c. Prove that this is so. This is called in French “preuve par neuf".
(12) Find all values a for which the system of equations xy = a, x + y = 1 has a solution in Z/19Z. Is
there such an a for which there is a unique solution?
(13) Prove Wilson’s theorem: Let p be a prime number then ( p − 1)! ≡ −1 (mod p).
(14) Let p be an odd prime and consider the sum
$$\frac{m}{n} := 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{p - 1}.$$
Prove that p|m. One method starts by multiplying this sum by (p − 1)!. The other method considers
the ring Z[1/p] := { a/b : p ∤ b, a, b ∈ Z }. As p is an element of this ring, we can talk about divisibility
by p and define for two elements x, y in this ring that x ≡ y (mod p) if p|(x − y). Show that there
is a well-defined commutative ring structure on the set of congruence classes. Show that the residue
classes are represented once more by 0, 1, . . . , p − 1. Let a be an integer not congruent to 0 modulo
p. There is an integer b such that ab ≡ 1 (mod p); namely, the congruence class of b is the inverse
of a in the field Z/pZ. Show that $b \equiv \frac{1}{a}$ (mod p), where this congruence is taking place in the
ring Z[1/p]. With these preparations, prove that p|m.
(15) Provide another proof of Fermat’s little theorem: Let p be a prime number and a ≢ 0 (mod p) then
$a^{p-1} \equiv 1$ (mod p).
Start the proof by showing that the residue classes a, 2a, 3a, . . . , (p − 1)a are distinct and non-zero,
and deduce that $(p - 1)! \cdot a^{p-1} \equiv (p - 1)!$ (mod p).
Let F be a field. We shall see that there are many similarities between the ring of integers Z and the ring
of polynomials with coefficients in F, F[ x ]. Initially, we shall see that they share many notions concerning
divisibility. The notion of absolute value for Z which measures the size of an integer is replaced by the notion
of degree for polynomials, which is the “size" of a polynomial. Using this notion we will define division with
residue and a Euclidean algorithm. Later on, after introducing some ideas from ring theory, we will also
construct the analog of the rings of congruence classes Z/nZ (n ≥ 1 an integer) and in particular of the
fields Z/pZ (p a prime number).
16. Arithmetic in F[ x ]
In this section F is a field. We denote by F× the set of non-zero elements of F.
16.1. Some remarks about divisibility in a commutative ring T. The definitions we made in § 9.2 can be
made in general and the same basic properties hold. Let T be a commutative ring and a, b ∈ T. We say that
a divides b if b = ac for some c ∈ T. We have the following properties:
(1) a|b ⇒ a| − b.
(2) a|b ⇒ a|bd for any d ∈ T.
(3) a|b, a|d ⇒ a|(b ± d).
In particular, the definition and properties hold for the ring of polynomials R[ x ], where R is a commutative
ring.
16.2. GCD of polynomials.
Definition 16.2.1. Let f ( x ), g( x ) ∈ F[ x ], not both zero. The greatest common divisor of f ( x ) and g( x ),
denoted gcd(f(x), g(x)) or just (f(x), g(x)), is the monic polynomial of largest degree dividing both f(x)
and g( x ). (We shall see below that there is a unique such polynomial.)
Theorem 16.2.2. Let f ( x ), g( x ) be polynomials, not both zero. The gcd of f ( x ) and g( x ), h( x ) =
( f ( x ), g( x )), is unique and can be expressed as
h ( x ) = u ( x ) f ( x ) + v ( x ) g ( x ), u ( x ), v ( x ) ∈ F[ x ].
It is the monic polynomial of minimal degree having such an expression. If t( x ) divides both g( x ) and f ( x )
then t( x )|h( x ).
Proof. Consider the following set of monic polynomials
S = { a( x ) : a( x ) = u( x ) f ( x ) + v( x ) g( x ) for some u( x ), v( x ) ∈ F[ x ], a( x ) monic}.
S contains a non-zero polynomial, because if f(x) ≠ 0, $f(x) = bx^n + \text{l.o.t.}$²⁰, then $b^{-1} f(x)$ ∈ S; if f(x) = 0
then g(x) is not zero and the same argument can be applied to g(x). Let h(x) be an element of minimal
degree of S. We claim that h(x) divides both f(x) and g(x). Since the situation is symmetric,
we just prove h(x)|f(x). Suppose not, then we can write f(x) = q(x)h(x) + r(x), where r(x) is a
non-zero polynomial of degree smaller than h(x). Then r(x) = f(x) − q(x)(u(x)f(x) + v(x)g(x)) =
(1 − q(x)u(x)) · f(x) − q(x)v(x) · g(x) and so, if we let $r_1(x)$ be r(x) divided by its leading coefficient, we
see that $r_1(x)$ ∈ S and has degree smaller than h(x), which is a contradiction.
By construction, h( x ) is the monic polynomial of minimal degree having such an expression. If t( x ) divides
both g( x ) and f ( x ) then t( x )|(u( x ) f ( x ) + v( x ) g( x )) = h( x ). Therefore, h( x ) is a monic polynomial of the
largest possible degree dividing both f ( x ), g( x ). Suppose that h1 ( x ) is another monic polynomial dividing f ( x )
and g( x ) having the largest possible degree, i.e., the degree of h( x ). Then, we have h( x ) = h1 ( x )b( x ) by
what we proved. Since both polynomials have the same degree b( x ) must be a constant polynomial, and,
then, since both are monic, b( x ) = 1. We have shown the gcd is unique.
This is indeed possible, and the process always terminates. Letting $r_t(x) = c_m x^m + \cdots + c_0$, we have
$$(f(x), g(x)) = c_m^{-1} r_t(x).$$
Thus,
$$x + 1 = (f(x), g(x)) = \left(-\frac{1}{15} + \frac{4}{15} x\right) \cdot f(x) - \left(1 + \frac{4}{15} x\right) \cdot g(x).$$
(4) Now consider the same polynomials over the field F = Z/3Z. We now have:
$$f(x) = 1 \cdot g(x) + x^2 + 2x + 1$$
$$x^3 + x^2 - x - 1 = (x - 1)(x^2 + 2x + 1).$$
Therefore, now we have $(f(x), g(x)) = x^2 + 2x + 1 = (x + 1)^2$.
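The computation over Z/3Z can be reproduced with a short Python sketch of the Euclidean algorithm in (Z/pZ)[x]. Everything in it is an illustrative assumption rather than part of the notes: polynomials are stored as coefficient lists with the constant term first, the names `trim`, `poly_divmod` and `poly_gcd` are ours, p must be prime, and the two inputs of `poly_gcd` must not both be zero. The polynomial f is entered as g(x) + x^2 + 2x + 1, as in the first line of the calculation above.

```python
def trim(f):
    """Return a copy of f without trailing zero coefficients."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_divmod(f, g, p):
    """Division with residue in (Z/pZ)[x]: return (q, r) with f = q*g + r."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv_lead = pow(g[-1], p - 2, p)      # inverse of the leading coefficient of g
    q = [0] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        coef = (f[-1] * inv_lead) % p
        q[shift] = coef
        for i, c in enumerate(g):        # subtract coef * x^shift * g from f
            f[i + shift] = (f[i + shift] - coef * c) % p
        f = trim(f)
    return trim(q), f


def poly_gcd(f, g, p):
    """Monic gcd of f and g in (Z/pZ)[x], by the Euclidean algorithm."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    while g:
        _, r = poly_divmod(f, g, p)
        f, g = g, r
    inv_lead = pow(f[-1], p - 2, p)
    return [(c * inv_lead) % p for c in f]


g = [-1, -1, 1, 1]                       # x^3 + x^2 - x - 1
f = [0, 1, 2, 1]                         # g(x) + x^2 + 2x + 1, reduced modulo 3
print(poly_divmod(f, g, 3))              # ([1], [1, 2, 1])
print(poly_gcd(f, g, 3))                 # [1, 2, 1], that is, x^2 + 2x + 1
```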
One interesting application of the Euclidean algorithm is that it allows one to check whether two polynomials
f, g ∈ C[x] (for example) have a common root, although it would not find the root. The point is that
if f(α) = g(α) = 0 then (x − α)|f(x) and (x − α)|g(x). Indeed, divide f(x) by x − α with residue:
f(x) = q(x)(x − α) + r(x). Division with residue gives that r(x) is a constant polynomial. On the other
hand, substitute x = α to see that r(α) = f(α) − q(α)(α − α) = 0. Thus, r(x) = 0 and (x − α)|f(x);
similarly, (x − α)|g(x). Thus, if f and g have a common root α then (x − α)|(f, g) and so (f, g) ≠ 1.
Conversely, suppose that d(x) := (f, g) ≠ 1. As d(x) is a non-constant polynomial it has a root α ∈ C and
as $f(x) = d(x) q_1(x)$, $g(x) = d(x) q_2(x)$ for some $q_i(x)$ ∈ C[x], it follows that f(α) = g(α) = 0.
The interesting point about this use of the Euclidean algorithm is that it provides a quick and efficient
answer to the question of whether two polynomials f and g have a common root. It should be stressed that the
straightforward approach, namely “calculate the roots of f and g and check if one of them is the same”,
usually fails, because there is no general method to find in exact form the roots of polynomials of high degree.
16.4. Irreducible polynomials and unique factorization. Let F be a field. We define a relation on polynomials
f(x) ∈ F[x]. We say that f(x) ∼ g(x) if there is an element a ∈ F, a ≠ 0 such that f(x) = ag(x).
Lemma 16.4.1. This relation is an equivalence relation. Related polynomials are called associates. The
associates of 1 are the non-zero scalars F× .
Proof. The relation is reflexive because f(x) = 1 · f(x) and symmetric, because f(x) = ag(x) implies g(x) =
$a^{-1} f(x)$. It is also transitive since f(x) = ag(x) and g(x) = bh(x) implies f(x) = abh(x) and ab ≠ 0.
Finally, the last claim follows immediately from the definition.
The motivation for this definition of being “associated" is that as far as division goes two associated
polynomials behave the same. Indeed, suppose that f ∼ g, say f = ag. If f | h, that is, if h = f f_1 for some
polynomial f_1, then h = g(a f_1) and that shows that g | h as well. The argument can be reversed and one
concludes that if g | h then f | h. It is also easy to check that if h | f then h | g and, conversely, if h | g then h | f.
Why did this issue not come up for integers?! Well, in a sense it was always there but it was so “visible"
that it required no special definition. The units of Z are just ±1 and in this case the correct definition is
to say that two integers a, b are associate if a = ±b. Under this definition it is clear that a|c ⇔ b|c and
c| a ⇔ c|b.
Let us return to polynomials. A non-constant polynomial f is called irreducible if g | f implies that g ∼ 1
or g ∼ f. As we have noted, if g | f and g_1 ∼ g then g_1 | f. Therefore, trying to define “irreducible” by requiring that g | f
implies g(x) = 1 or g(x) = f(x) will not make sense; in general it would be too strong a requirement.
Proposition 16.4.2. Let f ( x ) ∈ F[ x ] be a non-constant polynomial. The following are equivalent:
(1) f is irreducible.
(2) if f | gh then f | g or f |h.
Proof. Suppose that f is irreducible, f | gh and f ∤ g. The only monic polynomials dividing f are 1 and a^{-1} f,
where a is the leading coefficient of f. Since f ∤ g, also a^{-1} f ∤ g, so the only monic common divisor of f and g is 1. Therefore, (f, g) = 1 and so, for suitable polynomials u, v we
have u f + vg = 1. Then u f h + vgh = h. Since f divides the left hand side, it also divides the right hand
side, i.e., f |h.
Suppose now that f has the property f | gh ⇒ f | g or f |h. Let g be a divisor of f . Then f = gh for some h
and so f | gh. Therefore, f | g or f |h. Since h| f too, the situation concerning whether f | g or f |h is entirely
symmetric and we can assume that g| f and f | g. This implies that deg( g) ≤ deg( f ) and deg( f ) ≤ deg( g),
and so deg( f ) = deg( g). But then deg(h) = deg( f ) − deg( g) = 0 and so h is a constant polynomial. We
find that f ∼ g.
Example 16.4.3. Here are some comments on irreducible polynomials.
(1) Every linear polynomial is irreducible.
(2) If f = ax^2 + bx + c is reducible then f = (αx + β)(γx + δ), where α, β, γ, δ ∈ F and α ≠ 0, γ ≠ 0.
It follows then that f has a root in F, for example x = −α^{-1}β.
Conversely, suppose that f has a root α ∈ F then, as we shall see shortly (Theorem 16.5.1), f =
( x − α) g( x ) for some polynomial g( x ) ∈ F[ x ], and degree considerations dictate that g( x ) is a linear
polynomial.
Therefore, for quadratic polynomials one can say that f is reducible if and only if f has a root
in F. If, furthermore, 2 ≠ 0 in the field F then, as we have seen in the assignments, we know that f
has a root if and only if b^2 − 4ac is a square in F. In fact, in that case, the unique factorization of f
is
\[ ax^2 + bx + c = a \left( x - \frac{-b + \sqrt{b^2 - 4ac}}{2a} \right) \left( x - \frac{-b - \sqrt{b^2 - 4ac}}{2a} \right). \]
(3) If f has degree 3 it is still true that f is reducible if and only if f has a root. But if f has degree 4
or higher this may fail. For example, the polynomial x^2 − 2 is irreducible over Q because √2 is
irrational. Same for x^2 − 3. Thus, for example, the polynomials (x^2 − 2)^2, (x^2 − 2)(x^2 − 3), (x^2 − 3)^2
are reducible over Q but don't have a root in Q. (Indeed, if α is a root of (x^2 − 2)(x^2 − 3) then in C we
have (α − √2)(α + √2)(α − √3)(α + √3) = 0 and so α is ±√2 or ±√3 and, in any case, is not
rational.)
(4) The property of f being irreducible depends on the field. It is not an absolute property. For example,
x^2 − 2 is irreducible in Q[x] but is reducible in C[x] because there we can write x^2 − 2 =
(x − √2)(x + √2).
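For polynomials of degree at most 3 the root criterion in items (2) and (3) is easy to try out over the small fields Z/pZ. The sketch below is a hedged illustration, not a general irreducibility test: it searches for roots by brute force, so it is only sensible for small p, and the function names (reduce_mod_p, has_root, is_irreducible_low_degree, discriminant_is_square) are ours. Coefficient lists are indexed by the power of x.

```python
def reduce_mod_p(coeffs, p):
    """Coefficients reduced mod p, without trailing zeros; coeffs[i] multiplies x^i."""
    c = [a % p for a in coeffs]
    while c and c[-1] == 0:
        c.pop()
    return c

def has_root(coeffs, p):
    """Brute-force search for a root in Z/pZ."""
    c = reduce_mod_p(coeffs, p)
    return any(sum(a * pow(x, i, p) for i, a in enumerate(c)) % p == 0
               for x in range(p))

def is_irreducible_low_degree(coeffs, p):
    """Decide irreducibility over Z/pZ for degree 1, 2 or 3 via the root criterion."""
    deg = len(reduce_mod_p(coeffs, p)) - 1
    if deg not in (1, 2, 3):
        raise ValueError("the root criterion only decides degrees 1, 2 and 3")
    return deg == 1 or not has_root(coeffs, p)

def discriminant_is_square(a, b, c, p):
    """For an odd prime p: is b^2 - 4ac a square in Z/pZ?"""
    squares = {x * x % p for x in range(p)}
    return (b * b - 4 * a * c) % p in squares

# x^2 + 1 is irreducible over Z/3Z but reducible over Z/5Z (2^2 + 1 = 5).
print(is_irreducible_low_degree([1, 0, 1], 3))
print(is_irreducible_low_degree([1, 0, 1], 5))
```

The ValueError for degree 4 and above reflects item (3): a polynomial such as (x^2 − 2)(x^2 − 3) can be reducible without having a root, so the absence of roots proves nothing there.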
Theorem 16.4.4. (Unique factorization for polynomials) Let f(x) ∈ F[x] be a non-zero polynomial. Then
there is an a ∈ F× and distinct monic irreducible polynomials f_1, ..., f_g and positive integers r_1, ..., r_g such
that
\[ f = a f_1^{r_1} \cdots f_g^{r_g}. \tag{3} \]
Moreover, if
\[ f = b h_1^{s_1} \cdots h_t^{s_t}, \]
where b ∈ F×, the h_i are distinct monic irreducible polynomials and s_i > 0, then a = b, g = t, and after re-naming
the h_i's we have h_i = f_i for all i, and r_i = s_i for all i.
Proof. The proof is very similar to the proof for integers. We first prove the existence of factorization.
Suppose that there is a non-zero polynomial f ( x ) with no such factorization. Choose then a non-zero
polynomial f ( x ) of minimal degree for which no such factorization exists. Then f ( x ) is not a constant
polynomial and is not an irreducible polynomial either, else f(x) = a_n x^n + · · · + a_0 = a_n · (a_n^{-1} f(x)) is a
suitable factorization. It follows that f ( x ) = f 1 ( x ) f 2 ( x ), where each f i ( x ) has degree less than that of f ( x ).
Therefore, each f i ( x ) has a factorization
f 1 ( x ) = c1 a1 ( x ) · · · a m ( x ), f 2 ( x ) = c2 b1 ( x ) · · · bn ( x ),
with ci ∈ F and ai , b j monic irreducible polynomials, not necessarily distinct. It follows that
f ( x ) = (c1 c2 ) a1 ( x ) · · · am ( x )b1 ( x ) · · · bn ( x ),
has also a factorization as claimed, by collecting factors. Contradiction. Thus, no such f ( x ) exists, and every
polynomial has a factorization as claimed.
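We now prove uniqueness. Suppose that f(x) = c_1 a_1(x) · · · a_m(x) = c_2 b_1(x) · · · b_n(x), where c_i ∈ F× and the a_i, b_j are monic irreducible polynomials, not necessarily distinct. Since each of c_1, c_2 is the
leading coefficient of f, we have c_1 = c_2. In particular, the case of deg(f) = 0 holds. Assume that we proved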
uniqueness for all polynomials of degree ≤ d and deg( f ) = d + 1 ≥ 1. Since a1 ( x )|c2 b1 ( x ) · · · bn ( x ) and a1 ( x )
is irreducible, it follows that either a1 ( x )|c2 (which is impossible because c2 is a constant) or a1 ( x )|bi ( x ) for
some i. But since b_i(x) is irreducible it then follows that a_1(x) ∼ b_i(x) and so, both polynomials being
monic, a1 ( x ) = bi ( x ).
Let us re-number the bi so that a1 = b1 . Then, dividing by a1 ( x ) we have
c1 a2 ( x ) · · · am ( x ) = c2 b2 ( x ) · · · bn ( x ).
Induction gives that m = n and, after re-numbering the bi , ai ( x ) = bi ( x ), i = 2, 3, . . . , n.
(see Proposition 6.3.3) and this is precisely its unique factorization, except if some of the zi are equal
and then we may wish to collect them together to get a factorization as in (3).
We can now deduce from Theorem 16.4.4 the analogues of Proposition 10.2.1 and Corollary 10.2.2. The
proofs are the same.
Proposition 16.4.6. Let f, g be non-zero polynomials in F[x]. Then f | g if and only if
\[ f = a\, p_1^{a_1} \cdots p_m^{a_m} \quad \text{and} \quad g = b\, p_1^{a_1'} \cdots p_m^{a_m'} q_1^{b_1} \cdots q_t^{b_t} \]
(products of distinct irreducible monic polynomials p_i, q_j; a, b non-zero scalars)
with a_i' ≥ a_i for all i = 1, ..., m.
Corollary 16.4.7. Let f = a p_1^{a_1} · · · p_m^{a_m}, g = b p_1^{b_1} · · · p_m^{b_m} with p_i distinct irreducible monic polynomials, a, b
non-zero scalars and a_i, b_i non-negative integers. (Any two non-zero polynomials can be written this way.)
Then
\[ \gcd(f, g) = p_1^{\min(a_1, b_1)} \cdots p_m^{\min(a_m, b_m)}. \]
16.5. Roots. Let F be a field and let f ( x ) ∈ F[ x ] be a non-zero polynomial. Recall that an element a ∈ F
is called a root (or zero, or solution) of f if f ( a) = 0.
A field F is called algebraically closed if any non-constant polynomial f ( x ) ∈ F[ x ] has a root in F. Recall
the following important result.
Theorem 16.5.2 (The Fundamental Theorem of Algebra). The field of complex numbers is algebraically
closed.
It is a fact (proven in Algebra III) that every field is contained in an algebraically closed field. If F is
algebraically closed, then the only irreducible polynomials over F are the linear polynomials, viz. x − a, a ∈ F.
It follows then that
\[ f(x) = A (x - a_1)^{s_1} \cdots (x - a_m)^{s_m}, \]
where A is the leading coefficient of f and a1 , . . . , am are the roots (with multiplicities s1 , . . . , sm ).
A natural question is, for a given field F and a given polynomial f ( x ), to tell if f has a root in F or not. It
is the first check one can do when trying to determine whether f ( x ) is irreducible or not. Unfortunately, in
general this is impossible to decide; but we have some partial answers in special cases.
Proof. Since the roots of f are the roots of −f, we may assume that f(x) = a_n x^n + · · · + a_1 x + a_0, a_i ∈
R, a_n > 0. An easy estimate shows that there is an N > 0 such that f(N) > 0 and f(−N) < 0. By the
intermediate value theorem there is some a, − N ≤ a ≤ N such that f ( a) = 0. (Another proof appears in
the exercises.)
Proof. We first prove that f(x) is irreducible over Z. Namely, suppose f(x) = g(x)h(x), where g(x), h(x) ∈
Z[x] and both polynomials are not constant. Let us write g(x) = c_a x^a + · · · + c_1 x + c_0, h(x) = d_b x^b + · · · +
d_1 x + d_0. Note that c_a, d_b are ±1. Reduce the identity f(x) = g(x)h(x) modulo p to get x^{a+b} = ḡ(x) · h̄(x), where a bar denotes reduction modulo p.
By unique factorization for Z/pZ[x] we conclude that ḡ(x) = c̄_a x^a, h̄(x) = d̄_b x^b, and in particular, that
p | c_0, p | d_0. It follows that p^2 | c_0 d_0 = a_0 and that's a contradiction.
We next prove that if f(x) is reducible over Q it is reducible over Z. Suppose that f(x) = g(x)h(x),
where g(x), h(x) ∈ Q[x] are non-constant polynomials. Multiply by a suitable integer N to get that F(x) =
G(x)H(x), where F(x) = N f(x) is in Z[x] and all its coefficients are divisible by N, and G(x), H(x) ∈ Z[x]
are non-constant polynomials. It is enough to prove that if a prime p divides all the coefficients of F(x), it
either divides all the coefficients of G(x) or all the coefficients of H(x), because then, peeling off one prime
at a time, we find a factorization of f(x) into polynomials with integer coefficients.
Reduce the equation F(x) = G(x)H(x) modulo p, for a prime p that divides all the coefficients of F(x),
to find 0 = Ḡ(x) · H̄(x). Using that Z/pZ[x] is an integral domain, we conclude that either Ḡ(x) = 0 or
H̄(x) = 0, which means that either p divides all the coefficients of G, or all the coefficients of H.
16.7. Roots of polynomials in Z/pZ. Let p be a prime and let Z/pZ be the field with p elements whose
elements are congruence classes modulo p. By Fermat's little theorem, every element of (Z/pZ)× is a root
of x^{p−1} − 1. This gives p − 1 distinct roots of x^{p−1} − 1 and so these must be all the roots and each must
appear with multiplicity one. It follows that the roots of x^p − x are precisely the elements of Z/pZ, again
each with multiplicity one. That is,
\[ x^p - x = \prod_{a=0}^{p-1} (x - \bar{a}). \]
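For small primes the factorization of x^p − x can be checked directly by multiplying out the linear factors modulo p. The following sketch does exactly that; the function names are ours and the coefficient lists are indexed by the power of x. It is a sanity check for a few primes, not a proof.

```python
def product_of_linear_factors(p):
    """Coefficients mod p of the product of (x - a) over a = 0, 1, ..., p - 1."""
    poly = [1]                       # the constant polynomial 1
    for a in range(p):
        new = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            new[i + 1] = (new[i + 1] + c) % p    # contribution of c * x^i * x
            new[i] = (new[i] - a * c) % p        # contribution of c * x^i * (-a)
        poly = new
    return poly

def x_p_minus_x(p):
    """Coefficients mod p of x^p - x."""
    poly = [0] * (p + 1)
    poly[p] = 1
    poly[1] = (poly[1] - 1) % p
    return poly

for p in (2, 3, 5, 7, 11, 13):
    print(p, product_of_linear_factors(p) == x_p_minus_x(p))
```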
Proposition 16.7.1. Let f(x) be any polynomial in Z/pZ[x]. Then f(x) has a root in Z/pZ if and only
if gcd(f(x), x^p − x) ≠ 1.
Proof. If f(a) = 0 for some a ∈ Z/pZ then (x − a) | f(x), but also (x − a) | (x^p − x). It follows that gcd(f(x), x^p − x) ≠ 1.
Conversely, if h(x) = gcd(f(x), x^p − x) ≠ 1 then, since h(x) | x^p − x = ∏_{a=0}^{p−1} (x − ā), by unique
factorization we must have h(x) = ∏_{i=1,...,n} (x − a_i) for some distinct elements a_1, ..., a_n of Z/pZ, with n ≥ 1. In
particular, each such a_i is a root of f(x).
The straightforward way to check if f(x) has a root in Z/pZ is just to try all possibilities for x. Suppose
that f(x) has a small degree relative to p. Even then, except in special cases, we still have to try p residue
classes, each in its turn, to see if any of them is a root. But p may be very large, much too large for
this method to be feasible. For example, p might be of cryptographic size ≈ 2^2048 (according to the RSA
company, this should be secure until 2030). Even with a computer doing 10^10 operations per second, which is
about what a good laptop does these days (2016), checking all these possibilities will take about 10^600 years!
Proposition 16.7.1 suggests a different method: Calculate gcd(f(x), x^p − x). Note that except for the
first step
\[ x^p - x = q_0(x) f(x) + r_0(x), \]
all the polynomials involved in the Euclidean algorithm would have very small degrees (smaller than that of f, for
example) and so the Euclidean algorithm will terminate very quickly. The first step, though, could be very
time consuming given what we know at this point. Later we shall see that it can, in fact, be done quickly (in
order of magnitude log(p)). For example, using the software GP/PARI, it took my laptop 69 microseconds
to determine that for p = 2^2203 − 1, the polynomial f(x) = x^3 + x + 1 is irreducible. This p is an example of
a Mersenne prime; for numbers of the form 2^n − 1 we have special methods to ascertain their primality. Even
a prime of the form p = 2^9941 − 1 was no problem; it took 3.582 seconds to determine that f(x) is reducible
modulo p. It even took only 7.710 seconds to find the root. As the root has close to 3000 digits, I am not
listing it here.
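One standard way to make the first step fast, which may or may not be the method these notes have in mind, is to compute x^p modulo f(x) by repeated squaring, so that only about log_2(p) multiplications of polynomials of degree less than deg f are needed. The sketch below assumes that f is non-constant with leading coefficient not divisible by p, and that Python 3.8 or later is used (for pow(a, -1, p)); all function names are ours.

```python
def trim(a):
    """Remove trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, f, p):
    """Residue of a modulo f in (Z/pZ)[x]; f must be non-zero modulo p."""
    a = trim([c % p for c in a])
    f = trim([c % p for c in f])
    inv_lead = pow(f[-1], -1, p)              # inverse of the leading coefficient
    while len(a) >= len(f):
        k = a[-1] * inv_lead % p
        shift = len(a) - len(f)
        for i, c in enumerate(f):
            a[i + shift] = (a[i + shift] - k * c) % p
        trim(a)
    return a

def poly_mul(a, b, p):
    """Product of a and b in (Z/pZ)[x]."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def x_power_mod(n, f, p):
    """x^n modulo f by repeated squaring."""
    result, base = [1], poly_mod([0, 1], f, p)
    while n:
        if n & 1:
            result = poly_mod(poly_mul(result, base, p), f, p)
        base = poly_mod(poly_mul(base, base, p), f, p)
        n >>= 1
    return result

def poly_gcd(a, b, p):
    """A gcd of a and b in (Z/pZ)[x], not normalized to be monic."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def has_root(f, p):
    """True iff gcd(f, x^p - x) is non-constant, as in Proposition 16.7.1."""
    r = x_power_mod(p, f, p)                  # x^p mod f
    r = r + [0] * (2 - len(r))                # make sure the x-coefficient exists
    r[1] = (r[1] - 1) % p                     # now r = (x^p - x) mod f
    return len(poly_gcd(f, trim(r), p)) > 1

# x^2 - 4 has the root 2 modulo the Mersenne prime 2^61 - 1.
print(has_root([-4, 0, 1], 2**61 - 1))
```

Replacing x^p − x by its residue modulo f does not change the gcd with f, which is why the large-degree polynomial x^p − x never has to be written down.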
We have seen that many of the features of arithmetic in Z can be carried out in F[ x ]. We still don’t have
an analogue of passing from Z to Z/nZ in the context of F[ x ]. This is one motivation for studying rings in
much more detail; we’d like to be able to emulate the process of Z to Z/nZ for general rings, not just F[ x ].
17. Exercises
(1) In each case, divide f(x) by g(x) with residue:
(a) f(x) = 3x^4 − 2x^3 + 6x^2 − x + 2, g(x) = x^2 + x + 1 in Q[x].
(b) f(x) = x^4 − 7x + 1, g(x) = 2x^2 + 1 in Q[x].
(c) f(x) = 2x^4 + x^2 − x + 1, g(x) = 2x − 1 in Z/5Z[x].
(d) f(x) = 4x^4 + 2x^3 + 6x^2 + 4x + 5, g(x) = 3x^2 + 2 in Z/7Z[x].
(2) Use the Euclidean algorithm to find the gcd of the following pairs of polynomials and express it as a
combination of the two polynomials.
(a) x^4 − x^3 − x^2 + 1 and x^3 − 1 in Q[x].
(b) x^5 + x^4 + 2x^3 − x^2 − x − 2 and x^4 + 2x^3 + 5x^2 + 4x + 4 in Q[x].
(c) x^4 + 3x^3 + 2x + 4 and x^2 − 1 in Z/5Z[x].
(d) 4x^4 + 2x^3 + 3x^2 + 4x + 5 and 3x^3 + 5x^2 + 6x in Z/7Z[x].
(e) x^3 − ix^2 + 4x − 4i and x^2 + 1 in C[x].
(f) x^4 + x + 1 and x^2 + x + 1 in Z/2Z[x].
(3) Consider the equation x^2 + x = 0 over Z/nZ.
(a) Find an n such that the equation has at least 4 solutions.
(b) Find an n such that the equation has at least 8 solutions.
(4) Is the given polynomial irreducible:
(a) x^2 − 3 in Q[x]? In R[x]?
(b) x^2 + x − 2 in F_3[x]? In F_7[x]? (For any prime p we denote Z/pZ also by F_p. This notation
is used only for primes! Namely, one does not use a notation such as F_n if n is not a prime.)
(5) Find the rational roots of the polynomial 2x^4 + 4x^3 − 5x^2 − 5x + 2.
(6) For a polynomial g(x) = a_n x^n + · · · + a_1 x + a_0 with complex coefficients, let ḡ(x) = ā_n x^n + · · · +
ā_1 x + ā_0 be the polynomial obtained by taking the complex conjugate of the coefficients. Check that
the conjugate of the product g_1(x) g_2(x) is ḡ_1(x) · ḡ_2(x). Let f(x) be a polynomial with real coefficients and
\[ f(x) = a (x - \alpha_1)^{a_1} (x - \alpha_2)^{a_2} \cdots (x - \alpha_r)^{a_r} \]
its unique factorization over C. Apply complex conjugation to both sides. Deduce that if α is a root
of f with multiplicity d then ᾱ is a root of f with the same multiplicity. Deduce that if f has odd
degree then f has a real root.
(7) Let p > 2 be a prime. Calculate the gcd of x^{p−1} − 1 and x^2 + 1 in the ring Z/pZ[x], using the
Euclidean algorithm method, and conclude that −1 is a square in Z/pZ if and only if p ≡ 1 (mod 4).
(8) Let p be a prime number. Use the factorization of x^p − x to deduce Wilson's theorem: (p − 1)! ≡ −1
(mod p).
Part 5. Rings
18. Some basic definitions and examples
Recall our definition of a ring.
Definition 18.0.1. A ring R is a non-empty set together with two operations, called “addition” and “multiplication”,
that are denoted, respectively, by
(x, y) ↦ x + y, (x, y) ↦ xy.
One requires the following axioms to hold:
(1) x + y = y + x, ∀ x, y ∈ R. (Commutativity of addition)
(2) ( x + y) + z = x + (y + z), ∀ x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀ x ∈ R. (Neutral element for
addition)
(4) ∀ x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) ( xy)z = x (yz), ∀ x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀ x ∈ R. (Neutral element for multiplication)
(7) z( x + y) = zx + zy, ( x + y)z = xz + yz, ∀ x, y, z ∈ R. (Distributivity)
Recall also that a ring R is called a division ring (or sometimes a skew-field) if 1 ≠ 0 in R and any non-zero
element of R has an inverse with respect to multiplication. A commutative division ring is precisely what we
call a field.
Example 18.0.2. Z is a commutative ring. It is not a division ring and so is not a field. The rational
numbers Q form a field. The real numbers R form a field. The complex numbers C form a field.
We have also noted some useful formal consequences of the axioms defining a ring:
(1) The element 0 appearing in axiom (3) is unique.
(2) Given x, the element y appearing in axiom (4) is unique. We shall denote y by − x.
(3) We have −(−x) = x and −(x + x′) = −x − x′, where, technically, −x − x′ means (−x) + (−x′).
(4) We have x · 0 = 0, 0 · x = 0.
Here are some further examples. We do not prove that the ring axioms hold; this is left as an exercise.
Example 18.0.3. Let F be a field and n ≥ 1 an integer. Consider the set of n × n matrices:
\[ M_n(F) = \left\{ \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix} : a_{ij} \in F \right\}. \]
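For example:
(1) for n = 1 we just get (a_{11}), a_{11} ∈ F;
(2) for n = 2, $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$;
(3) for n = 3 we get $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$.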
In general we shall write an n × n matrix as (a_{ij}), or (a_{ij})_{i,j=1}^{n} if we need to be very clear about the dimensions
of the matrix. The index i is the row index and the index j is the column index. We then define
\[ (a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij}), \qquad (a_{ij})(b_{ij}) = (c_{ij}), \]
where
\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}. \]
We can say that the ij entry of the product AB is the dot product of the i-th row of A with the j-th column
of B:
\[ c_{ij} = \begin{pmatrix} a_{i1} & \dots & a_{in} \end{pmatrix} \begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix}. \]
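For example:
(1) for n = 1 we get (a) + (b) = (a + b) and (a)(b) = (ab). Namely, we just get F again!
(2) for n = 2, we have
\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}, \]
and
\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix}. \]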
Under these definitions M_n(F) is a ring, called the ring of n × n matrices with entries in F, with identity
given by the identity matrix
\[ I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \dots & 0 & 1 \end{pmatrix}, \]
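and zero given by the zero matrix (the matrix all of whose entries are zero). For n ≥ 2 this is a non-commutative
ring. For example, for n = 2 we have
\[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}. \]
These are never equal, else 2 = 1 in F, which implies 1 = 0 in F, which is never the case, by definition.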
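The formula for the entries of a product translates directly into code. The sketch below uses integer entries as a stand-in for elements of a field and a helper name, mat_mul, that is ours; it recomputes the two products from the example above.

```python
def mat_mul(A, B):
    """Product of two n x n matrices given as lists of rows: c_ij = sum over k of a_ik * b_kj."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]

print(mat_mul(A, B))   # the product AB
print(mat_mul(B, A))   # the product BA, which differs from AB
```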
Example 18.0.4. Let e be a formal symbol and F a field. The ring of dual numbers, F[e], is defined as
F[e] = { a + be : a, b ∈ F},
with the following addition and multiplication:
( a + be) + (c + de) = a + c + (b + d)e, ( a + be)(c + de) = ac + ( ad + bc)e.
Note that e is a zero divisor: e ≠ 0 but e^2 = 0.
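The two operations on dual numbers can be modelled by a small class. This is a sketch under the assumption that the field is Q, represented with exact fractions; the class name Dual and its attribute names are ours.

```python
from fractions import Fraction

class Dual:
    """An element a + b*e of the ring of dual numbers over Q, where e*e = 0."""

    def __init__(self, a, b):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        # (a + be) + (c + de) = (a + c) + (b + d)e
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + be)(c + de) = ac + (ad + bc)e, because e*e = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"Dual({self.a}, {self.b})"

e = Dual(0, 1)
print(e * e)                      # the zero element, although e itself is not zero
print(Dual(1, 2) * Dual(3, 4))
```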
Example 18.0.5. Let R1 , R2 be rings. Then R1 × R2 is a ring with the following operations:
( a1 , b1 ) + ( a2 , b2 ) = ( a1 + a2 , b1 + b2 ), ( a1 , b1 )( a2 , b2 ) = ( a1 a2 , b1 b2 ).
The zero element is (0_{R_1}, 0_{R_2}) and the identity element is (1_{R_1}, 1_{R_2}). The ring R_1 × R_2 is called the direct
product of R1 and R2 .
Some of the examples of rings are rather complicated (for example, rings of matrices) or exotic looking (like
the ring of dual numbers) and one would be justified to ask why one would want to consider such rings. As
one progresses in mathematics, the need for the concept of ring becomes more and more clear. In linear
algebra we learn that linear transformations of R3 , such as rotations fixing the origin, can be described by
3 × 3 matrices with real entries and composition of linear transformations matches multiplication of matrices
in M_3(R). Modern algebraic geometry associates to every commutative ring R a space Spec(R) (it is a set,
there is a notion of open subsets and of functions, and so on) whose points are the prime ideals of R (see
below for the concept of a prime ideal) and, in this setting, the ring of dual numbers is used to study the
tangent space at a point of Spec( R). If F is a field, we understand the importance of the ring of polynomials
F[x] and similarly F[x, y], but if we want to consider polynomials in x “modulo a fixed polynomial f(x)”, or
polynomials in x and y “modulo a collection of polynomials f 1 ( x, y), . . . , f n ( x, y)" etc., we get new rings and
ring theory is the best way to have a rigorous language to discuss those. Thus, with no more apologies, we
proceed to develop the very basics of this huge area of mathematics.
19. Ideals
Definition 19.0.1. Let R be a ring. A (two-sided) ideal I of R is a subset of R such that
(1) 0 ∈ I;
(2) if a, b ∈ I then a + b ∈ I;
(3) if a ∈ I, r ∈ R, then ra ∈ I and ar ∈ I.
Remark 19.0.2. Note that if a ∈ I then −1 · a = − a ∈ I.
We shall use the notation I ◁ R to indicate that I is an ideal of R.
Example 19.0.3. I = {0} and I = R are always ideals. They are called the trivial ideals.
Example 19.0.4. Suppose that R is a division ring (e.g., a field) and I ◁ R is a non-zero ideal. Then I = R.
Indeed, there is an element a ∈ I such that a ≠ 0. Then 1 = a^{-1} a ∈ I and so for every r ∈ R we
have r = r · 1 ∈ I. That is, I = R. We conclude that a division ring has only the trivial ideals. (Note also
that the argument shows for any ring R that if an ideal I contains an invertible element of R then I = R.)
Example 19.0.5. Let R be a commutative ring. Let r ∈ R. The principal ideal (r ) is defined as
(r ) = {ra : a ∈ R} = { ar : a ∈ R}.
We also denote this ideal by rR or Rr. This is indeed an ideal: First 0 = r · 0 is in (r ). Second, given two
elements ra1 , ra2 in (r ) we have ra1 + ra2 = r ( a1 + a2 ) ∈ (r ) and for every s ∈ R we have s(ra1 ) = (sr ) a1 =
(rs) a1 = r (sa1 ) ∈ rR (using commutativity!), (ra1 )s = r ( a1 s) ∈ rR.
Definition 19.0.6. Let R be a commutative ring. If every ideal of R is principal, one calls R a principal ideal
ring.
Theorem 19.0.7. Z is a principal ideal ring. In fact, the list
(0), (1), (2), (3), (4), . . .
is a complete list of the ideals of Z. (Note that another notation is 0, 1Z, 2Z, 3Z, 4Z, . . . .)
Proof. We already know that these are ideals, in fact principal ideals, and we note that for i > 0 the minimal
positive number in the ideal (i ) is i. Thus, these ideals are distinct.
Let I be an ideal of Z. If I = {0} then I appears in the list above. Else, there is some non-zero
element a ∈ I. If a < 0 then − a = −1 · a ∈ I and so I has a positive element in it. Choose the smallest
positive element in I and call it i.
First, since i ∈ I so is ia for any a ∈ Z and so (i ) ⊂ I. Let b ∈ I. Divide b by i with residue: b = qi + r,
where 0 ≤ r < i. Note that r = b − qi is an element of I, smaller than i. The only possibility is that r = 0
and so b = qi ∈ (i ). Thus, I = (i ).
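The proof picks the smallest positive element i of the ideal and shows that it generates the ideal. For the ideal of all combinations xa + yb of two integers a, b, that smallest positive element is gcd(a, b), a standard fact. The sketch below checks this numerically by a bounded search; the function name and the search bound are ours, and a finite search is of course only an illustration, not a proof.

```python
from math import gcd

def smallest_positive_combination(a, b, bound=50):
    """Smallest positive value of x*a + y*b with |x|, |y| <= bound, or None if there is none."""
    values = [x * a + y * b
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)]
    positives = [v for v in values if v > 0]
    return min(positives) if positives else None

# The ideal 12Z + 18Z should be the principal ideal (6).
print(smallest_positive_combination(12, 18), gcd(12, 18))
```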
Theorem 19.0.8. Let F be a field. The ring F[ x ] is a principal ideal ring. Two ideals ( f ( x )), ( g( x )) are equal
if and only if f ∼ g.
Proof. The proof is very similar to the case of Z. Let I be an ideal. If I = {0} then I = (0), the principal
ideal generated by 0. Else, let f ( x ) ∈ I be a non-zero polynomial whose degree is minimal among all non-
zero elements of I. On the one hand I ⊇ ( f ( x )). On the other hand, let g( x ) ∈ I and write g( x ) =
q(x) f(x) + r(x), where r(x) is either zero or of degree smaller than that of f. But r(x) = g(x) − q(x) f(x) ∈ I.
Thus, by the minimality of the degree of f, we must have r(x) = 0 and so g(x) = q(x) f(x) ∈ (f(x)). That is, I ⊆ (f(x)).
At this point we need a definition and a lemma.
Let R be any ring. The units of R are denoted R× and defined as follows:
R× = { x ∈ R : ∃y ∈ R, xy = yx = 1}.
For example, 1R is always a unit. If R is a field then, by definition, R× = R − {0}. Given x ∈ R× the y such
that xy = yx = 1 is unique and we denote it x^{-1}. Indeed, if also xy_1 = 1 then y(xy_1) = y and on the other
hand y(xy_1) = (yx)y_1 = 1 · y_1 = y_1. Thus, y = y_1.
Lemma 19.0.9. Let R be a commutative integral domain and a, b ∈ R. We say that a ∼ b (a and b are
associates) if for some unit u of R, au = b. This is an equivalence relation. If a|b and b| a then a ∼ b.
Proof. (Lemma) As a = 1 · a and au = b implies a = u^{-1} b if u is a unit (else u^{-1} doesn't even make sense),
this is a reflexive and symmetric relation. If au = b and bv = c, where u, v ∈ R×, then a(uv) = c and uv is a
unit (the inverse is v^{-1} u^{-1}). This shows transitivity.
Now suppose a|b and b| a. If a = 0 then b = 0 and conversely, and there’s nothing to prove. Else, write
au = b for some u ∈ R and bv = a for some v ∈ R. We need to show that u, v are units. But, we have
a(1 − uv) = a − (au)v = a − bv = a − a = 0. Since R is an integral domain and a ≠ 0 it must be that
1 − uv = 0. Thus uv = 1. Starting from b(1 − vu) = 0 we get in the same way that vu = 1. It follows that
u, v are units of R.
Note that for R = F[ x ], the notion of being associate is precisely the one we have previously defined.
Indeed, the units of R are just the non-zero scalars and two polynomials are associate precisely if they differ
by multiplication by a non-zero scalar. As R is also an integral domain, we may apply the Lemma. So, suppose
that ( f ( x )) ⊃ ( g( x )) then g( x ) = f ( x )h( x ) for some polynomial h( x ) ∈ F[ x ]. That is f ( x )| g( x ). Thus, if
( f ( x )) = ( g( x )) then f | g and g| f and so f ∼ g.
If f | g, say g( x ) = f ( x )h( x ) then any multiple of g( x ), say g( x )t( x ) is equal to f ( x )[h( x )t( x )] and
so ( g( x )) ⊂ ( f ( x )). If f ∼ g then f | g and g| f and so, by the argument above, ( f ( x )) = ( g( x )).
Example 19.0.10. Let F be a field. One can show that all the ideals of F[e] are {0} = (0), F[e] = (1) and
(e) = {be : b ∈ F} and so the ring of dual numbers is also a principal ideal ring.
Example 19.0.11. The ring of polynomials C[ x, y] in two variables with complex coefficients is not a principal
ideal ring. We claim that the set of polynomials I = { f ( x, y) : f (0, 0) = 0}, namely, polynomials with zero
constant term, is an ideal that is not principal. We leave that as an exercise.
Example 19.0.12. Let R1 , R2 be rings with ideals I1 , I2 , respectively. Then I1 × I2 is an ideal of R1 × R2 .
Example 19.0.13. Let us consider the ring F[ x ] and in it the set
S = { f(x) : f(x) = a_0 + a_2 x^2 + a_3 x^3 + · · · },
of polynomials with no x term. Note that 0 ∈ S and s_1, s_2 ∈ S ⇒ s_1 + s_2 ∈ S and even s_1 s_2 ∈ S. However, S
is not an ideal. We have 1 ∈ S but x = x · 1 ∉ S.
Example 19.0.14. Consider the ring Z[√5]. In this ring we consider
I = {5a + b√5 : a, b ∈ Z}.
We claim that I is an ideal. This can be verified directly, but it is easier to note that I is in fact the principal
ideal (√5).
Example 19.0.15. Let R be a ring and I1 , I2 two ideals of R. Then
I1 + I2 = {i1 + i2 : i1 ∈ I1 , i2 ∈ I2 }
is an ideal of R. Inductively, the sum of n ideals I1 + I2 + · · · + In is an ideal. A particular case is the
following: Let R be a commutative ring and Ii = ri R a principal ideal. Then
\[ r_1 R + r_2 R + \cdots + r_n R = \left\{ \sum_{i=1}^{n} r_i a_i : a_i \in R \right\} \]
is an ideal of R; we often denote it by (r_1, r_2, ..., r_n) or ⟨r_1, r_2, ..., r_n⟩ (so in particular a principal ideal (r)
may also be denoted ⟨r⟩).
Let us consider the situation of the ring R = Z[√−5] and the ideal ⟨2, 1 + √−5⟩. We know abstractly
that this is an ideal. We claim that this ideal is not principal. In particular, this shows that this ideal is
not Z[√−5] and, more importantly, gives us an example of a ring with non-principal ideals.
Suppose that ⟨2, 1 + √−5⟩ = ⟨a + b√−5⟩. It follows that 2 = (a + b√−5)(c + d√−5) and so that 2 =
(a − b√−5)(c − d√−5) (check!). Therefore, by multiplying these two equations, 4 = (a^2 + 5b^2)(c^2 + 5d^2).
This is an equation in integers and so (because 0 < a^2 + 5b^2 ≤ 4) a ∈ {±1, ±2}, b = 0 and we conclude
that ⟨2, 1 + √−5⟩ = ⟨a⟩ is equal to ⟨1⟩ or ⟨2⟩. Now, if ⟨2, 1 + √−5⟩ = ⟨2⟩ this implies that 1 + √−5 =
2(c + d√−5), which is a contradiction. If ⟨2, 1 + √−5⟩ = ⟨1⟩ then 1 = 2(c_1 + d_1√−5) + (1 + √−5)(c_2 +
d_2√−5) = (2c_1 + c_2 − 5d_2) + √−5(2d_1 + c_2 + d_2). Therefore, 2d_1 + c_2 + d_2 = 0, that is, −2d_1 − c_2 = d_2
and we get 1 = 2c_1 + c_2 − 5d_2 = 2c_1 + c_2 + 10d_1 + 5c_2 = 2(c_1 + 3c_2 + 5d_1). This is an equation in integers
and it implies that 1 is even. Contradiction.
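The two computational steps of this argument can be replayed numerically. In the sketch below an element a + b√−5 is stored as the pair (a, b); the function names and the search bounds are ours. The search over small coefficients only illustrates the argument; it is the parity computation above that rules out all integer coefficients at once.

```python
from itertools import product

def mul(u, v):
    """(a + b√-5)(c + d√-5) = (ac - 5bd) + (ad + bc)√-5."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)

def norm(u):
    """The integer a^2 + 5b^2 attached to a + b√-5."""
    a, b = u
    return a * a + 5 * b * b

def possible_generators(search=3):
    """Elements with |a|, |b| <= search whose norm is a positive divisor of 4."""
    return [(a, b)
            for a in range(-search, search + 1)
            for b in range(-search, search + 1)
            if norm((a, b)) != 0 and 4 % norm((a, b)) == 0]

def one_is_reachable(bound=3):
    """Search for 2(c1 + d1√-5) + (1 + √-5)(c2 + d2√-5) = 1 with coefficients of size <= bound."""
    for c1, d1, c2, d2 in product(range(-bound, bound + 1), repeat=4):
        x, y = mul((2, 0), (c1, d1))
        s, t = mul((1, 1), (c2, d2))
        if (x + s, y + t) == (1, 0):
            return True
    return False

print(possible_generators())   # only ±1 and ±2 appear, all with b = 0
print(one_is_reachable())      # no combination equal to 1 is found
```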
20. Homomorphisms
When we discussed sets, we also considered functions from one set to another. It was the functions, really,
that made the subject much more deep and useful. Indeed, without functions, we wouldn’t even be able to
compare cardinalities of sets. Moreover, functions on sets are as natural as things get in mathematics. I can
consider the set of guests of a wedding. A well-known headache-inducing problem is to try to find a function
from the set of guests to the set of tables that associates to each guest a table so as to maximize the number
of friends around each table.
It is a general principle in mathematics that when sets are endowed with more structure, the maps should
take these structures into account; we say, “respect" the structures. We are about to make the definition
for rings and later on we will see it for groups. In Algebra II (or other linear algebra courses) you will see it
for vector spaces. In each case, the way to define correctly functions should be evident; one should be very
careful not to ignore all the special features of the special sets (rings, groups, vector spaces, ...) that we are
considering.
Definition 20.0.1. Let R, S be rings. A function f : R → S is a ring homomorphism if the following holds:
(1) f (1R ) = 1S ;
(2) f (r1 + r2 ) = f (r1 ) + f (r2 );
(3) f (r1 r2 ) = f (r1 ) f (r2 ).
Here are some formal consequences (that are nonetheless very useful).
• f (0R ) = 0S . Indeed, f (0R ) = f (0R + 0R ) = f (0R ) + f (0R ). Let y = f (0R ) then y = y + y.
Adding −y to both sides we find 0S = y = f (0R ).
• We have f (−r ) = − f (r ). Indeed: 0S = f (0R ) = f (r + (−r )) = f (r ) + f (−r ) and so f (−r ) =
− f (r ) (just because it sums with f (r ) to 0S !)
• We have f (r1 − r2 ) = f (r1 ) − f (r2 ), because f (r1 − r2 ) = f (r1 + (−r2 )) (this, by definition) and
so f (r1 − r2 ) = f (r1 ) + f (−r2 ) = f (r1 ) − f (r2 ).
Note, in particular, that f (0R ) = 0S is a consequence of axioms (2), (3). On the other hand f (1R ) = 1S
does not follow from (2), (3) and we therefore include it as an axiom (though not all authors do that). Here
is an example. Consider,
f : R → R × R, f (r ) = (r, 0).
This map satisfies f (r1 + r2 ) = f (r1 ) + f (r2 ) and f (r1 r2 ) = f (r1 ) f (r2 ), but f (1) = (1, 0) is not the identity
element of R × R. So this is not a ring homomorphism.
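The example can be checked mechanically on a few sample values. In the sketch below integers stand in for real numbers, and the names f, add, mul and ONE are ours; pairs are added and multiplied componentwise as in the direct product.

```python
def f(r):
    """The map r -> (r, 0) into the direct product."""
    return (r, 0)

def add(u, v):
    """Componentwise addition in the direct product."""
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    """Componentwise multiplication in the direct product."""
    return (u[0] * v[0], u[1] * v[1])

ONE = (1, 1)   # identity element of the direct product

samples = range(-3, 4)
additive = all(f(r + s) == add(f(r), f(s)) for r in samples for s in samples)
multiplicative = all(f(r * s) == mul(f(r), f(s)) for r in samples for s in samples)
print(additive, multiplicative, f(1) == ONE)
```

The first two checks succeed and the third fails, which is exactly why condition (1) has to be stated separately in the definition.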
On the other hand, if S ⊂ R is a subring then the inclusion map i : S → R, i (s) = s, is a ring homomorphism.
Note that this explains why in the definition of a subring we insisted on 1R ∈ S.
Proposition 20.0.2. Let f : R → S be a homomorphism of rings. The image of f is a subring of S.
Proof. As we have seen, f (0R ) = 0S . Also, by definition f (1R ) = 1S and so 0S , 1S ∈ Im( f ). Let now s1 , s2 ∈
Im( f ), say si = f (ri ). Then, s1 ± s2 = f (r1 ) ± f (r2 ) = f (r1 ± r2 ) and so s1 ± s2 ∈ Im( f ). Similarly, s1 s2 =
f (r1 r2 ) and so s1 s2 ∈ Im( f ).
Definition 20.0.3. Let f : R → S be a homomorphism of rings. The kernel of f , Ker( f ), is defined as
follows:
Ker( f ) = {r ∈ R : f (r ) = 0}.
Proposition 20.0.4. Ker( f ) is an ideal of R. The map f is injective if and only if Ker( f ) = {0}.
Proof. First, since f (0R ) = 0S we have 0R ∈ Ker( f ). Suppose that r1 , r2 ∈ Ker( f ) then f (ri ) = 0S and we
find that f (r1 + r2 ) = f (r1 ) + f (r2 ) = 0S + 0S = 0S , so r1 + r2 ∈ Ker( f ).
Now suppose that r1 ∈ Ker( f ) and r ∈ R is any element. We need to show that rr1 , r1 r ∈ Ker( f ). We
calculate f (rr1 ) = f (r ) f (r1 ) = f (r )0S = 0S , so rr1 ∈ Ker( f ). Similarly for r1 r.
So far we proved that Ker( f ) is an ideal. Suppose now that f is injective. Then f (r ) = 0S implies f (r ) =
f (0R ) and so r = 0R . That is, Ker( f ) = {0R }.
Suppose conversely that Ker( f ) = {0R }. If f (r1 ) = f (r2 ) then 0S = f (r1 ) − f (r2 ) = f (r1 − r2 ) and
so r1 − r2 ∈ Ker( f ). Since Ker( f ) = {0R }, we must have r1 − r2 = 0R ; that is, r1 = r2 . We proved that f
is injective.
We now look at some examples:
Example 20.0.5. Let n ≥ 1 be an integer. Define a function,
f : Z → Z/nZ,
by f ( a) = ā (the congruence class of a modulo n). Then f is a homomorphism:
(1) f (1) = 1̄ and 1̄ is indeed the identity element of Z/nZ;
(2) f ( a + b) = $\overline{a + b}$ = ā + b̄ = f ( a) + f (b);
Example 20.0.9. Let A be the set of all continuous functions f : [0, 1] → R. Define the sum (resp. product)
of two functions f , g to be the function f + g (resp. f g) whose value at any x is f ( x ) + g( x ) (resp. f ( x ) g( x )).
That is:
( f + g)( x ) = f ( x ) + g( x ), ( f g)( x ) = f ( x ) g( x ).
This is a ring (in particular, these are operations – the sum and product of continuous functions is continuous!).
Its zero element is the constant function zero and its identity element is the constant function 1. Let a ∈ [0, 1]
be a fixed element. Define
ϕ : A → R, ϕ ( f ) = f ( a ).
Then ϕ is a ring homomorphism whose kernel consists of all the functions vanishing at the point a.
20.1. Units. Let R be any ring. Recall the definition of units: The units of R are denoted R× and defined
as follows:
R× = { x ∈ R : ∃y ∈ R, xy = yx = 1}.
For example, 1R is always a unit. If R is a field then, by definition, R× = R − {0}.
Lemma 20.1.1. We have the following properties:
(1) If r1 , r2 ∈ R× then r1 r2 ∈ R× .
Example 20.1.3. We have F[e]× = { a + be : a ≠ 0}. Indeed, if a ≠ 0 then ( a + be)( a⁻¹ − a⁻²be) = 1
(where a⁻² is by definition ( a²)⁻¹; it satisfies a⁻²a = a⁻¹). Conversely, if ( a + be)(c + de) = 1 then ac = 1
and so a ≠ 0.
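The inverse formula can be checked numerically. The following is a minimal sketch, assuming that F = Q (modelled with Python's `Fraction`) and that the symbol e satisfies e² = 0, as the displayed product requires; the pair `(a, b)` standing for a + be and the function names are illustrative choices, not notation from the notes.

```python
from fractions import Fraction

def dual_mul(p, q):
    """Multiply a + b*e by c + d*e, using the assumed relation e*e = 0."""
    a, b = p
    c, d = q
    return (a * c, a * d + b * c)

def dual_inverse(p):
    """Return a^(-1) - a^(-2)*b*e, the inverse of a + b*e when a != 0."""
    a, b = p
    if a == 0:
        raise ZeroDivisionError("a + b*e is a unit only when a != 0")
    a_inv = 1 / Fraction(a)
    return (a_inv, -a_inv * a_inv * b)

u = (Fraction(3), Fraction(5))          # the element 3 + 5e
product = dual_mul(u, dual_inverse(u))  # should be 1 + 0e
```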
Example 20.1.4. Let n ≠ 0, 1 be a square free integer. Recall that Z[√n] = { a + b√n : a, b ∈ Z} and
every element of this ring has a unique expression as a + b√n. We claim that
Z[√n]× = { a + b√n : a² − b²n = ±1}.
Indeed, if a² − b²n = ±1 then ( a + b√n)( a − b√n) = ±1 and so a + b√n is invertible with inverse ±( a − b√n).
Conversely, if a + b√n is invertible, say ( a + b√n)(c + d√n) = 1 (for some c, d ∈ Z) then ad + bc = 0
and so also ( a − b√n)(c − d√n) = 1. We get that
( a + b√n)( a − b√n)(c + d√n)(c − d√n) = 1.
But ( a + b√n)( a − b√n) = a² − b²n and (c + d√n)(c − d√n) = c² − d²n are integers. So
( a + b√n)( a − b√n) = a² − b²n = ±1.
If n is negative, then it is easy to see that the only solutions are a = ±1, b = 0, except when n = −1
where a = 0, b = ±1 are also solutions. Namely, in general, only ±1 are units for n < 0, but for Z[i ] the units are
±1, ±i. On the other hand, for n > 1 it turns out that there are infinitely many units; there are infinitely
many solutions ( a, b) to the so-called Pell equation
a² − b²n = 1.
This is not an easy statement. It is interesting to try and prove this, but don’t be discouraged if you can’t.
The Pell equation has been studied much and is related to the Archimedes Cattle problem that asks for “the
number of the cattle of the sun which once grazed upon the plains of Sicily”. It is stated in the form of a
poem, ultimately reducing to a Pell equation, the solutions of which are just enormous. Finding the smallest
solution to a Pell equation is a complicated problem, related to the continued fraction expression of √n. It
behaves rather erratically. For example, the smallest solution to a² − 60 · b² = 1 is a = 31, b = 4. In contrast,
the smallest solution to a² − 61 · b² = 1 is a = 1766319049, b = 226153980.
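The two numerical examples can be reproduced by machine. Below is a sketch of the standard continued fraction method: it walks through the convergents h/k of √n and stops at the first one with h² − nk² = 1. The notes only mention the connection to continued fractions, so the function name and the particular recurrence used here are assumptions of this illustration.

```python
from math import isqrt

def pell_smallest_solution(n):
    """Smallest (a, b) with b > 0 and a*a - n*b*b == 1, for non-square n > 1."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    # State of the continued fraction expansion of sqrt(n).
    m, d, a = 0, 1, a0
    # Consecutive convergents h_prev/k_prev and h/k.
    h_prev, h = 1, a0
    k_prev, k = 0, 1
    while h * h - n * k * k != 1:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k
```

Calling `pell_smallest_solution(60)` and `pell_smallest_solution(61)` recovers the two solutions quoted above, which shows how differently neighbouring values of n can behave.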
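Example 20.1.5. Let F be a field. The units of the ring M2 (F) are the matrices
$$GL_2(F) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc \neq 0 \right\}.$$
Indeed, suppose that for the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we have ad − bc ≠ 0. Consider the matrix
$$(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
(where by $t \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we mean $\begin{pmatrix} ta & tb \\ tc & td \end{pmatrix}$. It is equal to $\begin{pmatrix} a & b \\ c & d \end{pmatrix} t$). We claim that this is the inverse. We have
$$(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = (ad - bc)^{-1} \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Similarly, one checks that
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Suppose now that $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. The expression ad − bc is called the determinant of the matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and is denoted det( M). One can verify by a laborious but straightforward calculation that
for any two matrices M, N we have
det( MN ) = det( M) det( N ).
If the matrix M has an inverse, say MN = N M = I2 , then
det( MN ) = det( M) det( N ) = det( I2 ) = 1,
and that shows that det( M) ≠ 0. One can then show that N is necessarily $(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. In
fact, a more general fact is true.
Let R be a ring and x ∈ R× , yx = xy = 1. Suppose also that zx = xz = 1. Then (y − z) x = 1 − 1 = 0
and so (y − z)( xy) = 0 · y = 0. Therefore, since xy = 1, we have y − z = 0, that is, y = z.
Applying this to MN = I2 and $M (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = I_2$, we conclude that
$$N = (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$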
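As an illustration, the inverse formula can be tried over a concrete field. The sketch below assumes F = Z/pZ for a prime p and stores a matrix as a pair of rows; both function names are hypothetical helpers written for this example. It relies on Python 3.8 or later, where `pow(x, -1, p)` returns an inverse modulo p.

```python
def inverse_2x2_mod_p(M, p):
    """Inverse of M = ((a, b), (c, d)) over Z/pZ via (ad - bc)^(-1) * ((d, -b), (-c, a))."""
    (a, b), (c, d) = M
    det = (a * d - b * c) % p
    if det == 0:
        raise ValueError("determinant is zero, so the matrix is not a unit")
    t = pow(det, -1, p)  # inverse of the determinant in Z/pZ
    return ((t * d % p, -t * b % p), (-t * c % p, t * a % p))

def mul_2x2_mod_p(M, N, p):
    """Product of two 2x2 matrices with entries reduced modulo p."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

M = ((1, 2), (3, 4))
M_inv = inverse_2x2_mod_p(M, 7)
```

Multiplying `M` by `M_inv` on either side gives the identity matrix, in line with the uniqueness argument above.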
*********************
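We now verify the ring axioms. It will be convenient to write ā for a + I. With this notation we have
$\bar a + \bar b = \overline{a + b}$, $\bar a \, \bar b = \overline{ab}$.
The axioms follow from the definition of the operations and the fact that they hold for R. To make clear at
what point we use that the axioms hold in R, we use the notation $\overset{!}{=}$ to draw attention to this.
(1) $\bar a + \bar b = \overline{a + b} \overset{!}{=} \overline{b + a} = \bar b + \bar a$.
(2) $\bar a + (\bar b + \bar c) = \bar a + \overline{b + c} = \overline{a + (b + c)} \overset{!}{=} \overline{(a + b) + c} = \overline{a + b} + \bar c = (\bar a + \bar b) + \bar c$.
(3) We have $\bar 0 + \bar a = \overline{0 + a} \overset{!}{=} \bar a$. (We remark that $\bar 0 = I$.)
(4) We have $\bar a + \overline{-a} = \overline{a + (-a)} \overset{!}{=} \bar 0$.
(5) $\bar a (\bar b \, \bar c) = \bar a \, \overline{bc} = \overline{a(bc)} \overset{!}{=} \overline{(ab)c} = \overline{ab} \, \bar c = (\bar a \, \bar b) \bar c$.
(6) We have $\bar a \, \bar 1 = \overline{a \cdot 1} \overset{!}{=} \bar a$ and $\bar 1 \, \bar a = \overline{1 \cdot a} \overset{!}{=} \bar a$.
(7) $(\bar a + \bar b) \bar c = \overline{a + b} \, \bar c = \overline{(a + b)c} \overset{!}{=} \overline{ac + bc} = \overline{ac} + \overline{bc} = \bar a \, \bar c + \bar b \, \bar c$. Also, $\bar c (\bar a + \bar b) = \bar c \, \overline{a + b} = \overline{c(a + b)} \overset{!}{=} \overline{ca + cb} = \overline{ca} + \overline{cb} = \bar c \, \bar a + \bar c \, \bar b$.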
Proposition 21.0.5. The natural map,
π : R → R/I, a ↦ π ( a) := ā
is a surjective ring homomorphism with kernel I. Thus, every ideal I ◁ R is the kernel of some ring homomorphism
from R to some other ring.
Proof. Note that 1 ↦ 1̄, which is the identity element of R/I. We have π ( a + b) = $\overline{a + b}$ = ā + b̄ =
π ( a) + π (b). Also, π ( ab) = $\overline{ab}$ = ā b̄ = π ( a) π (b). We have shown that π is a ring homomorphism and it
is clearly surjective.
The kernel of π consists of the elements a ∈ R such that π ( a) = ā = 0̄, namely, the elements a such that a + I =
0 + I. By Lemma 21.0.3 this is the set of elements a such that a − 0 ∈ I; that is, the kernel is precisely I.
Example 21.0.6. Consider the ring Z. If we take the ideal {0} then Z/{0} can be identified with Z; the
map Z → Z/{0} is a bijective ring homomorphism. Now let n > 0. The ring Z/(n) has as elements
the cosets a + (n). Two cosets a + (n), b + (n) are equal if and only if a − b ∈ (n), that is, precisely
when n|( a − b). We see that the elements of Z/(n) are just the congruence classes modulo n, as we have
in fact noted before, and the operations on Z/(n) are just the operations we defined on congruence classes.
Thus, the quotient rings of Z are (either Z or) the familiar rings of congruences. In particular, if p is a
prime number we get the field Z/pZ of p elements. Following on the analogy between Z and F[ x ], it is
natural to examine next the quotient rings of F[ x ]. We shall see that in fact we can get this way fields and
in particular fields whose cardinality is any power of a prime (in contrast, Z/pᵃZ is never a field for a > 1).
It is a fact that any finite field has cardinality a power of a prime, so the methods we develop in this course
produce all finite fields. We shall not prove, though, in this course that any finite field has cardinality a power
of a prime, or that we get all finite fields this way. This is done usually in Algebra IV.
21.1. The quotient ring F[ x ]/( f ( x )). Let F be a field, f ( x ) ∈ F[ x ] a non-constant polynomial and
( f ( x )) = f ( x ) · F[ x ] the principal ideal it defines. Consider the quotient ring F[ x ]/( f ( x )). Suppose
that f ( x ) = $x^n + a_{n-1} x^{n-1} + \cdots + a_0$ is a monic polynomial of degree n. The following lemma is an
analogue of Lemma 12.0.1.
Lemma 21.1.1. Every element of F[ x ]/( f ( x )) is of the form $\overline{g(x)}$ := g( x ) + ( f ( x )) for a unique
polynomial g( x ) which is either zero or of degree less than n.
Proof. Let h( x ) be a polynomial. To say that we have equality of cosets, h( x ) + ( f ( x )) = g( x ) + ( f ( x )), is
to say that h( x ) = q( x ) f ( x ) + g( x ) for some polynomial q( x ). The requirement that g is zero or deg( g) < deg( f )
amounts to the assertion that the expression
h( x ) = q( x ) f ( x ) + g( x )
is the one gotten by dividing h by f with residue. We know that this is always possible and in a unique
fashion.
Theorem 21.1.2. Let F be a field, f ( x ) ∈ F[ x ] a non-constant irreducible polynomial of degree n. The
quotient ring F[ x ]/( f ( x )) is a field. If F is a finite field of cardinality q then F[ x ]/( f ( x )) is a field with qⁿ
elements.
Proof. We already know that F[ x ]/( f ( x )) is a commutative ring. We note that 0̄ ≠ 1̄ because 1 ∉ ( f )
(if it were, f would be a constant polynomial). Thus, we only need to show that a non-zero element has
an inverse. Let ḡ = $\overline{g(x)}$ be a non-zero element. That means that g( x ) ∉ ( f ( x )), thus f ( x ) ∤ g( x ), and so
gcd( f , g) = 1 (here is where we use that f is irreducible). Therefore, there are polynomials u( x ), v( x ) such
that
u( x ) f ( x ) + v( x ) g( x ) = 1.
Passing to the quotient ring, that means that v̄ ḡ = 1̄, and 1̄ is the identity of the quotient ring. Therefore ḡ
is invertible (and its inverse is v̄).
Finally, by the Lemma, every element of F[ x ]/( f ( x )) has a unique representative of the form
$a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where $a_0, a_1, \ldots, a_{n-1}$ are elements of F. If F has q elements, we get qⁿ such
polynomials, as qⁿ is the number of choices for the coefficients $a_0, a_1, \ldots, a_{n-1}$.
Example 21.1.3. A field with 4 elements. Take the field F to be F2 = Z/2Z and consider the
polynomial x² + x + 1 over that field. Because it is of degree 2 and has no root in F2 it must be irreducible.
Therefore, F2 [ x ]/( x² + x + 1) is a field K with 4 elements. Let us list its elements:
K = {0̄, 1̄, x̄, x̄+1̄}.
(This is the list of polynomials a0 + a1 x with a0 , a1 ∈ F2 .) We can describe the addition and multiplication
by tables:

  +    | 0̄     1̄     x̄     x̄+1̄
  -----+--------------------------
  0̄    | 0̄     1̄     x̄     x̄+1̄
  1̄    | 1̄     0̄     x̄+1̄   x̄
  x̄    | x̄     x̄+1̄   0̄     1̄
  x̄+1̄  | x̄+1̄   x̄     1̄     0̄

  ·    | 0̄     1̄     x̄     x̄+1̄
  -----+--------------------------
  0̄    | 0̄     0̄     0̄     0̄
  1̄    | 0̄     1̄     x̄     x̄+1̄
  x̄    | 0̄     x̄     x̄+1̄   1̄
  x̄+1̄  | 0̄     x̄+1̄   1̄     x̄
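The two tables can be regenerated mechanically. The following is a small sketch in which an element a0 + a1 x of K is stored as the pair `(a0, a1)`; the pair encoding, the `NAMES` dictionary and the function names are choices made for this illustration only.

```python
ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]            # 0, 1, x, x+1
NAMES = {(0, 0): "0", (1, 0): "1", (0, 1): "x", (1, 1): "x+1"}

def add(u, v):
    """Coefficientwise addition in F_2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Multiply in F_2[x]/(x^2 + x + 1), replacing x^2 by x + 1."""
    a0, a1 = u
    b0, b1 = v
    c0, c1, c2 = a0 * b0, a0 * b1 + a1 * b0, a1 * b1   # c0 + c1*x + c2*x^2
    return ((c0 + c2) % 2, (c1 + c2) % 2)

def table(op):
    """Rows of the operation table, in the order 0, 1, x, x+1."""
    return [[NAMES[op(u, v)] for v in ELEMENTS] for u in ELEMENTS]

for row in table(mul):
    print(row)
```

Each non-zero row of the multiplication table contains the entry 1, which is a direct way of seeing that K is a field.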
***************
Example 21.1.4. A field with 9 elements. Consider the polynomial x² + 1 over F3 = Z/3Z. It is
quadratic and has no root in F3 , hence is irreducible over F3 . We conclude that L = F3 [ x ]/( x² + 1)
is a field with 9 elements. Note that in F3 the element −1 = 2 is not a square. However, in L we
have x² = x² − ( x² + 1) = −1 and so −1 is a square now; its root is x (viewed as an element of L). In
fact, any quadratic polynomial over F3 has a root in L, because the discriminant “b² − 4ac” is one of 0, 1, 2
and all those are squares in L.
For example, consider the polynomial t² + t + 2. It has discriminant −7 ≡ −1 (mod 3), which is not a
square in F3 . In the field L we have x² = −1. The roots of the polynomial are then (−1 ± √−1)/2 =
2(−1 ± x ) = 1 ± 2x.
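Since L has only 9 elements, the roots can also be found by trying every element. In the sketch below an element a + bx of L is stored as the pair `(a, b)` and multiplication uses x² = −1; the encoding and the function names are assumptions of this example.

```python
P = 3
ELEMENTS = [(a, b) for a in range(P) for b in range(P)]   # a + b*x

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    """(a + b*x)(c + d*x) = (ac - bd) + (ad + bc)*x, because x^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def roots(c1, c0):
    """Roots in L of t^2 + c1*t + c0, where c1 and c0 lie in F_3."""
    found = []
    for t in ELEMENTS:
        value = add(add(mul(t, t), mul((c1, 0), t)), (c0, 0))
        if value == (0, 0):
            found.append(t)
    return found

print(roots(1, 2))   # the pairs encoding 1 + x = 1 - 2x and 1 + 2x
```

Running `roots` over all nine monic quadratics with coefficients in F3 is a quick way to confirm the claim that each of them has a root in L.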
On the other hand, one can prove that the polynomial t³ + t² + 2 is irreducible over F3 and stays irreducible
over L. In MATH 370 we learn a systematic theory for deciding which polynomials stay irreducible and which
do not.
Example 21.1.5. Fields with 8 and 16 elements. A polynomial of degree 3 is irreducible if and only if
it doesn’t have a root. We can verify that x³ + x + 1 doesn’t have a root in F2 = Z/2Z and conclude
that F2 [ x ]/( x³ + x + 1) is a field with 8 elements. Consider the field K with 4 elements constructed above.
We note that the polynomial t² + t + x̄ is irreducible over K (simply by substituting for t any of the four
elements of K and checking). Thus, we get a field L with 16 elements
L = K[t]/(t² + t + x̄ ).
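The substitution check mentioned above is short enough to automate. This sketch again stores a0 + a1 x ∈ K as the pair `(a0, a1)`, an encoding chosen for the example, and tests whether a monic quadratic t² + c1 t + c0 with coefficients in K has a root in K.

```python
K = [(0, 0), (1, 0), (0, 1), (1, 1)]   # 0, 1, x, x+1 in F_2[x]/(x^2 + x + 1)

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Multiply in K, replacing x^2 by x + 1."""
    a0, a1 = u
    b0, b1 = v
    c0, c1, c2 = a0 * b0, a0 * b1 + a1 * b0, a1 * b1
    return ((c0 + c2) % 2, (c1 + c2) % 2)

def has_root_in_K(c1, c0):
    """True if t^2 + c1*t + c0 vanishes at some t in K."""
    return any(add(add(mul(t, t), mul(c1, t)), c0) == (0, 0) for t in K)

ONE, X = (1, 0), (0, 1)
irreducible = not has_root_in_K(ONE, X)   # the polynomial t^2 + t + x
```

For a quadratic, having no root is the same as being irreducible, which is why this check suffices.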
Remark 21.1.6. To construct a finite field of pⁿ elements, where p is a prime number, we need to find a
polynomial of degree n which is irreducible over the field Z/pZ. One can prove that such a polynomial
always exists by a counting argument. Finding a specific one for a given p and n is harder. Nonetheless,
given p and n one can find such an irreducible polynomial and so construct a field of pⁿ elements explicitly,
for example in the sense that one can write a computer program that makes calculations in such a field.
21.3. Roots of polynomials over Z/pZ. We can now continue our discussion, begun in § 16.7, of the
efficient determination of whether a small degree polynomial f ( x ) over Z/pZ has a root in Z/pZ. Recall
that the only remaining point was whether the Euclidean algorithm step,
$x^p - x = q(x) f(x) + r(x)$,
can be done rapidly. Now we can answer that affirmatively. Note that r ( x ) + x is exactly the representative
of $x^p$ in the ring F[ x ]/( f ( x )), where F = Z/pZ. This representative can be calculated quickly by the method
we already used for calculating powers. We need to calculate
x, x², x⁴, x⁸, . . .
and express p in base 2, $p = \sum a_i 2^i$, $a_i \in \{0, 1\}$, so that $x^p = \prod_{\{i : a_i \neq 0\}} x^{2^i}$. We see that the slowing factor
now is how quickly we can carry out multiplication in the ring F[ x ]/( f ( x )). It is not hard to see that this
depends on the degree of f and not on p.
Let us illustrate this by finding if x³ + x + 1 has a root in the field with 17 elements. We calculate x¹⁷ in
the quotient ring L = F17 [ x ]/( x³ + x + 1). We have x, x²,
x⁴ = x ( x³ + x + 1) − ( x² + x ), so x⁴ = −( x² + x ) in L,
x⁸ = ( x² + x )² = x⁴ + 2x³ + x² = −( x² + x ) + 2(− x − 1) + x² = −3x − 2,
x¹⁶ = (3x + 2)² = 9x² + 12x + 4
and so
x¹⁷ − x = x (9x² + 12x + 4) − x = 9(− x − 1) + 12x² + 3x = 12x² − 6x − 9.
This is the residue of dividing x¹⁷ − x by x³ + x + 1. Now we continue with the Euclidean algorithm, in the
way we are used to.
x³ + x + 1 = (10x + 5)(12x² − 6x − 9) + 2x + 12,
12x² − 6x − 9 = (6x − 5)(2x + 12)
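The whole computation can be scripted. The sketch below represents a polynomial over Z/pZ as a list of coefficients with the constant term first, computes x^p in F[x]/(f) by repeated squaring, and then runs the Euclidean algorithm. All function names and the list encoding are assumptions of this illustration, and `pow(c, -1, p)` needs Python 3.8 or later.

```python
def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_divmod(a, b, p):
    """Return (q, r) with a = q*b + r and r zero or deg r < deg b, over Z/pZ."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    inv_lead = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coef = a[-1] * inv_lead % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        trim(a)
    return trim(q), a

def mul_mod(a, b, f, p):
    """Product of a and b reduced modulo f, over Z/pZ."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(prod, f, p)[1]

def x_power_mod(e, f, p):
    """Representative of x^e in (Z/pZ)[x]/(f), by repeated squaring."""
    result, base = [1], poly_divmod([0, 1], f, p)[1]
    while e:
        if e & 1:
            result = mul_mod(result, base, f, p)
        base = mul_mod(base, base, f, p)
        e >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd of a and b over Z/pZ (empty list for the zero polynomial)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    if a:
        inv = pow(a[-1], -1, p)
        a = [c * inv % p for c in a]
    return a

def root_gcd(f, p):
    """gcd(x^p - x, f); its degree counts the distinct roots of f in Z/pZ."""
    r = x_power_mod(p, f, p)
    r = r + [0] * (2 - len(r))
    r[1] = (r[1] - 1) % p
    return poly_gcd(f, r, p)

print(root_gcd([1, 1, 0, 1], 17))   # coefficient list of the monic gcd
```

For f = x³ + x + 1 and p = 17 the gcd comes out as a monic polynomial of degree 1, a scalar multiple of the last non-zero remainder 2x + 12 found above, so f has exactly one root in the field with 17 elements.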
Proof. First, because f (1R ) = 1S we have g(1S ) = 1R . Next, let s1 , s2 ∈ S. We need to prove g(s1 + s2 ) =
g(s1 ) + g(s2 ) and g(s1 s2 ) = g(s1 ) g(s2 ). It is enough to prove that
f ( g(s1 + s2 )) = f ( g(s1 ) + g(s2 )), f ( g(s1 s2 )) = f ( g(s1 ) g(s2 )),
because f is injective. But f ( g(s1 ) + g(s2 )) = f ( g(s1 )) + f ( g(s2 )) = s1 + s2 = f ( g(s1 + s2 )) and f ( g(s1 ) g(s2 )) =
f ( g(s1 )) f ( g(s2 )) = s1 s2 = f ( g(s1 s2 )).
Definition 22.1.3. Let R, S be rings. We say that R and S are isomorphic if there is a ring isomorphism R → S.
Proof. First, the identity function is always a ring homomorphism from R to R, so this relation is reflexive.
Secondly, if f : R → S is an isomorphism then g : S → R is an isomorphism, where g is the inverse function
to f . Thus, the relation is symmetric. Now suppose f : R → S and g : S → T are ring isomorphisms between
the rings R, S, T. To show the relation is transitive we need to prove that g ◦ f : R → T is an isomorphism.
Indeed:
(1) ( g ◦ f )(1R ) = g( f (1R )) = g(1S ) = 1T ;
(2) ( g ◦ f )(r1 + r2 ) = g( f (r1 + r2 )) = g( f (r1 ) + f (r2 )) = g( f (r1 )) + g( f (r2 )) = ( g ◦ f )(r1 ) + ( g ◦
f )(r2 );
(3) ( g ◦ f )(r1 r2 ) = g( f (r1 r2 )) = g( f (r1 ) f (r2 )) = g( f (r1 )) · g( f (r2 )) = ( g ◦ f )(r1 ) · ( g ◦ f )(r2 ).
Theorem 22.2.1. Let f : R → S be a surjective homomorphism of rings and let I = Ker( f ). Then there is an
isomorphism F : R/I → S, such that the following diagram commutes (that is, F ◦ π = f ):

          f
     R ------> S
      \       ^
     π \     / F
        v   /
        R/I

where π : R → R/I is the canonical map g ↦ ḡ.
This theorem is very useful. It says that to solve an equation modulo mn, (m, n) = 1, is the same as solving it
modulo m and modulo n. That is, for given integers $a_0, \ldots, a_d$ and an integer A we have
$a_d A^d + \cdots + a_1 A + a_0 \equiv 0 \pmod{mn}$ if and only if we have $a_d A^d + \cdots + a_1 A + a_0 \equiv 0 \pmod{m}$
and $a_d A^d + \cdots + a_1 A + a_0 \equiv 0 \pmod{n}$. Here is an example:
Example 22.3.2. Solve the equation 5x + 2 = 0 modulo 77.
We consider the equation modulo 7 and get 5x = −2 = 5 (mod 7) so x = 1 (mod 7); we consider
it modulo 11 and get 5x = −2 = 20 (mod 11) and get that x = 4 (mod 11). There is an x ∈ Z such
that x (mod 7) = 1, x (mod 11) = 4 and in fact x is unique modulo 77 (this is the CRT). We can guess
that x = 15 will do in this case, but it raises the general problem of finding the inverse isomorphism to
Z/mnZ → Z/mZ × Z/nZ.
22.3.1. Inverting Z/mnZ → Z/mZ × Z/nZ. Suppose we know how to find an integer e1 such that e1 = 1
(mod m), e1 = 0 (mod n), and an integer e2 such that e2 = 0 (mod m), e2 = 1 (mod n). Then we would have solved
our problem. Indeed, given two congruence classes a (mod m), b (mod n), take the integer ae1 + be2 .
It is congruent to a modulo m and to b modulo n.
Since (m, n) = 1 we may find u, v such that 1 = um + vn. Put
e1 = 1 − um, e2 = 1 − vn.
These are the integers we are looking for.
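The recipe translates directly into code. Here is a minimal sketch that finds u, v with 1 = um + vn by the extended Euclidean algorithm and then forms e1 = 1 − um and e2 = 1 − vn; the function names are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) = u*a + v*b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def crt_idempotents(m, n):
    """Return (e1, e2) with e1 = (1, 0) and e2 = (0, 1) in Z/mZ x Z/nZ."""
    g, u, v = extended_gcd(m, n)
    if g != 1:
        raise ValueError("m and n must be relatively prime")
    return 1 - u * m, 1 - v * n

def crt_pair(a, m, b, n):
    """The class modulo m*n that is a modulo m and b modulo n."""
    e1, e2 = crt_idempotents(m, n)
    return (a * e1 + b * e2) % (m * n)

print(crt_pair(1, 7, 4, 11))   # the class from Example 22.3.2
```

The integers u, v are not unique, so a different run of the Euclidean algorithm may give different e1, e2; the class of ae1 + be2 modulo mn is the same in every case.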
Example 22.3.3. Solve the equation 56x + 23 = 0 (mod 323).
We have 323 = 17 · 19.
• Solution modulo 17.
We have the equation 5x + 6 = 0 (mod 17), or x = −6 · 5⁻¹ = 11 · 5⁻¹ . To find 5⁻¹ we look
for u, v such that 1 = u · 5 + v · 17.
17 = 3 · 5 + 2, 5 = 2 · 2 + 1 so 1 = 5 − 2 · 2 = 5 − 2 · (17 − 3 · 5) = 7 · 5 − 2 · 17 and so 7 · 5 = 1
(mod 17). We conclude that x = 11 · 7 = 77 = 9 (mod 17).
• Solution modulo 19.
We have the equation − x + 4 = 0 so x = 4 (mod 19) is a solution.
• Finding e1 , e2 .
We have 19 = 17 + 2, 17 = 8 · 2 + 1 so 1 = 17 − 8 · 2 = 9 · 17 − 8 · 19. It follows that e1 =
1 − 9 · 17 = −152, e2 = 1 + 8 · 19 = 153.
• We conclude that the solution to the equation 56x + 23 = 0 (mod 323) is 9 · e1 + 4 · e2 = −1368 +
612 = −756 and modulo 323 this is 213.
With linear equations, there is in fact a quicker way to solve the equation that we already know. If the leading
coefficient in ax + b is prime to mn, there is a c such that ca ≡ 1 (mod mn) (c can be found using the
Euclidean algorithm). Then x = −bc.
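The quicker method is a one-liner in practice. The sketch below assumes Python 3.8 or later, where the built-in `pow(a, -1, N)` returns the inverse c of a modulo N; the function name is a hypothetical helper.

```python
from math import gcd

def solve_linear(a, b, modulus):
    """Solve a*x + b = 0 (mod modulus), assuming gcd(a, modulus) = 1."""
    if gcd(a, modulus) != 1:
        raise ValueError("the leading coefficient must be prime to the modulus")
    c = pow(a, -1, modulus)      # c*a = 1 (mod modulus)
    return (-b * c) % modulus

print(solve_linear(56, 23, 323))   # the equation of Example 22.3.3
```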
Example 22.3.4. Solve the equation x² = 118 (mod 323). As before we reduce to solving x² = 118 = 16
(mod 17) and x² = 118 = 4 (mod 19). There are two solutions in each case, given by x = ±4 (mod 17)
and x = ±2 (mod 19). We conclude that overall we have 4 solutions given by
±4(−152) ± 2(153) (mod 323).
One can then reduce those numbers to standard representatives and find that 21, 55, 268, 302 (mod 323) are
the four solutions.
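Both the CRT recipe and a direct search are easy to run here. The sketch below takes e1 = −152 and e2 = 153 from Example 22.3.3 as given constants and compares the four combinations with an exhaustive search over Z/323Z; the function names are illustrative.

```python
E1, E2 = -152, 153   # e1, e2 for m = 17, n = 19, as found in Example 22.3.3

def roots_via_crt():
    """All combinations (+-4)*e1 + (+-2)*e2, reduced modulo 323."""
    return sorted((s * 4 * E1 + t * 2 * E2) % 323
                  for s in (1, -1) for t in (1, -1))

def roots_brute_force():
    """Every x in 0..322 with x^2 = 118 (mod 323)."""
    return [x for x in range(323) if (x * x - 118) % 323 == 0]

print(roots_via_crt())
print(roots_brute_force())
```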
One can be more precise about the connection between solutions mod mn and solutions mod m and mod n.
First, let us generalize the Chinese Remainder Theorem:
Theorem 22.3.5. Let m1 , . . . , mk be relatively prime non-zero integers (that is, (mi , mj ) = 1 for i ≠ j). Then
there is an isomorphism
Z/m1 m2 . . . mk Z ≅ Z/m1 Z × Z/m2 Z × · · · × Z/mk Z,
given by
a (mod m1 m2 . . . mk ) ↦ ( a (mod m1 ), a (mod m2 ), . . . , a (mod mk )).
The theorem is not hard to prove by induction on k. The main case, k = 2, is the one we proved above.
Now, let g( x ) = $a_n x^n + \cdots + a_1 x + a_0$ be a polynomial with integer coefficients. Let S be the solutions of g in
Z/m1 m2 . . . mk Z and Si the solutions of g in Z/mi Z. Then we have a bijection
S ↔ S1 × S2 × · · · × Sk ,
given by
a (mod m1 m2 . . . mk ) ↦ ( a (mod m1 ), a (mod m2 ), . . . , a (mod mk )).
Indeed, g( a) (mod m1 m2 . . . mk ) is mapped to ( g( a) (mod m1 ), g( a) (mod m2 ), . . . , g( a) (mod mk )) and
g( a) ≡ 0 (mod m1 m2 . . . mk ) if and only if for every i we have g( a) ≡ 0 (mod mi ). This shows that we
have a map
S → S1 × S2 × · · · × Sk .
But, conversely, given solutions ri to g( x ) mod mi , there is a unique r (mod m1 m2 · · · mk ) such that r ≡ ri
(mod mi ) and g(r ) ≡ 0 (mod m1 m2 · · · mk ) because it is true modulo every mi .
In particular, we may draw the following conclusion.
Corollary 22.3.6. Let m1 , . . . , mk be relatively prime integers. Let s be the number of solutions to the
equation $a_n x^n + \cdots + a_1 x + a_0 = 0$ (mod m1 m2 · · · mk ) and let si be the number of solutions modulo mi .
Then
s = s1 s2 · · · sk .
Example 22.3.7. The equation x² = 1 has 8 solutions modulo 2 · 3 · 5 · 7 = 210, because it has one solution
mod 2, and 2 solutions mod 3, 5 or 7.
Example 22.3.8. The equation 34x = 85 (mod 17 · 19) has 17 solutions, because it has 17 solutions modulo
17 (it is then the equation 0 · x = 0 (mod 17)) and has a unique solution modulo 19 (it is then the equation
4x = 10 (mod 19) and x = 12 is the unique solution).
On the other hand, the equation 34x = 5 (mod 17 · 19) has no solutions, because it has no solutions
modulo 17.
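For small moduli the counts in these examples can be confirmed by exhaustive search. In this sketch a polynomial is passed as a list `coeffs` with `coeffs[i]` the coefficient of x^i; the function name and the encoding are choices made for the illustration.

```python
def count_solutions(coeffs, modulus):
    """Number of x in Z/modulus with sum(coeffs[i] * x^i) = 0."""
    def g(x):
        return sum(c * x**i for i, c in enumerate(coeffs)) % modulus
    return sum(1 for x in range(modulus) if g(x) == 0)

# x^2 - 1 modulo 210 and modulo each of its prime factors.
total = count_solutions([-1, 0, 1], 210)
local = [count_solutions([-1, 0, 1], m) for m in (2, 3, 5, 7)]
print(total, local)
```

The same function applied to 34x − 85 and 34x − 5 modulo 323 gives the counts stated in Example 22.3.8.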
There remains the question how to calculate a solution mod m1 m2 · · · mk from solutions mod m1 , mod m2 ,
. . . , mod mk . That is, how to find explicitly the inverse to the map
Z/m1 m2 . . . mk Z → Z/m1 Z × Z/m2 Z × · · · × Z/mk Z.
We explain how to do that for 3 numbers m1 , m2 , m3 , though the method is general.
We first find integers e1 , e2 such that
e1 ≡ 1 (mod m1 ), e1 ≡ 0 (mod m2 m3 ),
and
e2 ≡ 0 (mod m1 ), e2 ≡ 1 (mod m2 m3 ).
This we know how to do because we are only dealing with two relatively prime numbers, that is, m1 and
m2 m3 . Then, find λ2 , λ3 such that
λ2 ≡ 1 (mod m2 ), λ2 ≡ 0 (mod m3 ),
and
λ3 ≡ 0 (mod m2 ), λ3 ≡ 1 (mod m3 ).
Then, the numbers
µ1 = e1 , µ2 = e2 λ2 , µ3 = e2 λ3 ,
are congruent to (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively in the ring Z/m1 Z × Z/m2 Z × Z/m3 Z. To find
an integer mod m1 m2 m3 mapping to ( a, b, c) in Z/m1 Z × Z/m2 Z × Z/m3 Z take aµ1 + bµ2 + cµ3 :
aµ1 + bµ2 + cµ3 ↦ ( a, b, c) ∈ Z/m1 Z × Z/m2 Z × Z/m3 Z.
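The three-modulus procedure can be sketched as follows. As a shortcut, the pair (e1, e2) for two relatively prime numbers is built here from modular inverses via `pow(x, -1, m)` (Python 3.8 or later) rather than from the Euclidean algorithm; this substitution, the function names and the moduli 3, 5, 7 in the usage line are all assumptions of the example.

```python
def idempotent_pair(m, n):
    """Return (e1, e2) with e1 = (1, 0) and e2 = (0, 1) in Z/mZ x Z/nZ."""
    e1 = n * pow(n, -1, m)   # a multiple of n that is 1 modulo m
    e2 = m * pow(m, -1, n)   # a multiple of m that is 1 modulo n
    return e1, e2

def crt3(a, b, c, m1, m2, m3):
    """The class modulo m1*m2*m3 mapping to (a, b, c)."""
    e1, e2 = idempotent_pair(m1, m2 * m3)
    lam2, lam3 = idempotent_pair(m2, m3)
    mu1, mu2, mu3 = e1, e2 * lam2, e2 * lam3
    return (a * mu1 + b * mu2 + c * mu3) % (m1 * m2 * m3)

print(crt3(2, 3, 2, 3, 5, 7))   # congruent to 2, 3, 2 modulo 3, 5, 7
```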
Let us illustrate all this with a numerical example.
24. Exercises
(1) Recall that for the ring Z a complete list of ideals is given by (0), (1), (2), (3), (4), (5), . . . , where
(n) is the principal ideal generated by n, namely, (n) = {na : a ∈ Z}. Find the complete list of
ideals of the ring Z × Z.
(2) Let R be a ring and let I and J be two ideals of R.
(a) Prove that I ∩ J is an ideal of R, where
I ∩ J = {r : r ∈ I, r ∈ J }.
It is called the intersection of the ideals I and J.
(b) Prove that
I + J = {i + j : i ∈ I, j ∈ J }
is an ideal of R. It is called the sum of the ideals I and J.
(c) Find for every two ideals of the ring Z their sum and intersection.
(3) Let F be a field. Prove that the ring M2 (F) of 2 × 2 matrices with entries in F has no non-trivial
(two-sided) ideals. That is, every ideal is either the zero ideal or M2 (F) itself.
(Note: there is also a notion of a one-sided ideal that we don’t discuss in this course. The ring
M2 (F) has a non-trivial one sided ideal. The notion of one-sided ideals is studied in Higher Algebra
I & II).
(4) The ring of real quaternions H (Hamilton’s quaternions). Let i, j, k be formal symbols and
H = { a + bi + cj + dk : a, b, c, d ∈ R}.
Addition on H is defined by
( a + bi + cj + dk) + ( a′ + b′i + c′j + d′k) = ( a + a′) + (b + b′)i + (c + c′) j + (d + d′)k.
Multiplication is determined by defining
i² = j² = −1, ij = − ji = k,
(and one extends this to a product rule by linearity).
(a) Prove that the map
$$f : H \to \left\{ \begin{pmatrix} z_1 & z_2 \\ -\bar z_2 & \bar z_1 \end{pmatrix} : z_1, z_2 \in C \right\},$$
taking a + bi + cj + dk to the matrix $\begin{pmatrix} a + bi & c + di \\ -c + di & a - bi \end{pmatrix}$, is bijective and satisfies: f ( x + y) =
f ( x ) + f (y) and f (i )² = f ( j)² = − I2 , f (i ) f ( j) = f (k ) = − f ( j) f (i ).
(b) Use part (a) to conclude that H is indeed a ring, by proving it is a subring of M2 (C).
(c) Prove that H is a non-commutative division ring.
(5) In the following questions it’s useful to remember that under our definitions a ring homomorphism
takes 1 to 1.
(a) Prove that there is no ring homomorphism Z/5Z → Z.
(b) Prove that there is no ring homomorphism Z/5Z → Z/7Z.
(c) Prove that the rings Z/2Z × Z/2Z and Z/4Z are not isomorphic.
(d) Is there a ring homomorphism Z/4Z → Z/2Z × Z/2Z?
(e) Is there a ring homomorphism Z/2Z × Z/2Z → Z/4Z?
(6) (a) Let R be a commutative ring and let r1 , . . . , rn be elements of R. We define (r1 , . . . , rn ) to be
the set
{r1 a1 + · · · + rn an : ai ∈ R for all i }.
Prove that (r1 , . . . , rn ) is an ideal of R. We call it the ideal generated by r1 , . . . , rn .
(b) Now apply that to the case where R = Z[ x ] (polynomials with integer coefficients). Let (2, x )
be the ideal generated by 2 and x.
(i) Prove that the ideal (2, x ) is not principal and conclude that Z[ x ] is not a principal ideal ring.
(ii) Find a homomorphism f : R → Z2 such that (2, x ) = Ker( f ).
(7) Prove that C[ x, y] is not a principal ideal ring, for example, that the ideal ( x, y) is not a principal
ideal.
(8) Prove that no two of the following rings are isomorphic:
(a) R × R × R × R (with addition and multiplication given coordinate by coordinate);
(b) M2 (R);
(c) The ring H of real quaternions.
(9) Let f : R → S be a ring homomorphism.
(a) Let J ◁ S be an ideal. Prove that f⁻¹( J ) (equal by definition to {r ∈ R : f (r ) ∈ J }) is an ideal
of R.
(b) Prove that if f is surjective and I ◁ R is an ideal then f ( I ) is an ideal (where f ( I ) = { f (i ) : i ∈
I }).
(c) Show, by example, that if f is not surjective the assertion in (b) need not hold.
(10) Let F be a field and let
$$R = \left\{ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} : a_{ij} \in F \right\}.$$
Let
I = {( aij ) ∈ R : a11 = a22 = a33 = 0}.
Prove that R is a subring of M3 (F), I is an ideal of R and R/I ≅ F × F × F.
(11) Let d be an integer, which is not a square of another integer.
(a) Prove that d is not a square of a rational number.
(b) Let Q[√d] := { a + b√d : a, b ∈ Q}. Show that Q[√d] is a subring of C and is in fact a field.
(c) Prove that Q[√d] ≅ Q[ x ]/( x² − d).
(d) Prove that the fields Q[√2] and Q[√3] are not isomorphic.
(12) Prove a Chinese Remainder Theorem for polynomials:
Let F be a field and let f ( x ), g( x ) be two non-constant polynomials that are relatively prime,
gcd( f , g) = 1. Prove that
F[ x ]/( f g) ≅ F[ x ]/( f ) × F[ x ]/( g).
(Hint: mimic the proof of the Chinese Remainder Theorem for integers.)
(13) Let R and S be rings and let I ◁ R, J ◁ S be ideals. Prove that
( R × S)/( I × J ) ≅ ( R/I ) × (S/J ).
(14) For each of the rings Z/60Z and F[ x ]/( x⁴ + 2x³ + x² ) find all their ideals and identify all their
homomorphic images. Suggestion: To find the ideals use exercise (9) to reduce the calculation to
ideals of the ring Z or F[ x ] that contain the ideal (60), respectively ( x⁴ + 2x³ + x² ), but explain why
this is valid.
(15) Using the Chinese remainder theorem, find the solutions (if any) to the following polynomial equations
(for example, in (d), write the solutions as integers mod 30 and so on):
(a) 15x = 11 in Z/18Z.
(b) 15x = 12 in Z/63Z.
(c) x² = 37 in Z/63Z.
(d) x² = 4 in Z/30Z.
(16) Prove that F5 [ x ]/( x² + 2) is a field. Calculate the following expressions as polynomials of degree
smaller than 2: ( x² + x ) · ( x³ + 1) − ( x + 1), ( x² + 3)⁻¹ and ( x² − 1)/( x² + 3). Find all the roots
of the polynomials t² + 2 and t² + 3 in this field.
(17) Prove that F2 [ x ]/( x³ + x + 1) is a field. Calculate the following expressions as polynomials of degree
smaller than 3: ( x² + x ) · ( x³ + 1) − ( x + 1), ( x² + 3)⁻¹ and ( x² − 1)/( x² + 3). Find all the roots
of the polynomials t³ + t² + 1 and t³ + 1 in this field.
(18) Let f ( x ) = x² − 2 ∈ F19 [ x ].
(a) Prove that f is irreducible and deduce that L := F19 [ x ]/( x² − 2) is a field.
(b) Find the roots of the polynomial t² − t + 2 in the field L.
(c) Prove that h( x ) = x³ − 2 is irreducible over F19 . Prove that it is also irreducible over L.
(d) Find x¹⁹ − x in F19 [ x ]/( x³ − 2) as being represented by a polynomial of degree at most 2 (hint:
x¹⁹ = ( x³ )⁶ · x ). Use this to rapidly calculate gcd( x¹⁹ − x, h( x )) and conclude also in this way
that h( x ) is irreducible over F19 .
(19) Find all the solutions to the equation x³ = 38 (mod 195).
Part 6. Groups
25. First definitions and examples
25.1. Definitions and some formal consequences.
Definition 25.1.1. A group G is a non-empty set with an operation
G × G → G, ( a, b) ↦ ab,
such that the following axioms hold:
(1) ( ab)c = a(bc). (Associativity)
(2) There exists an element e ∈ G such that eg = ge = g for all g ∈ G. (Identity)
(3) For every g ∈ G there exists an element d ∈ G such that dg = gd = e. (Inverse)
Here are some formal consequences of the definition:
(1) e is unique. Say ẽ has the same property then ẽ = eẽ, using the property of e, but also eẽ = e,
using the property of ẽ. Thus, e = ẽ.
(2) d appearing in (3) is unique (therefore we shall call it “the inverse of g” and denote it by g⁻¹ ).
Say d̃ also satisfies d̃g = gd̃ = e. Then
d̃ = d̃e = d̃( gd) = (d̃g)d = ed = d.
$a^m a^n = a^{m+n}$, $(a^m)^n = a^{mn}$.
25.2. Examples.
Example 25.2.1. The trivial group G is a group with one element e and multiplication law ee = e.
Example 25.2.2. If R is a ring, then R with addition only is a group. It is a commutative group. The
operation in this case is of course written g + h. In general a group is called commutative or abelian if for
all g, h ∈ G we have gh = hg. It is customary in such cases to write the operation in the group as g + h and
not as gh, but this is not a must. This example thus includes Z, Q, R, C, F, F[e], M2 (F), Z/nZ, all with the
addition operation.
Example 25.2.3. Let R be a ring. Recall that the units R× of R are defined as
{u ∈ R : ∃v ∈ R, uv = vu = 1}.
This is a group. If u1 , u2 ∈ R× have inverses v1 , v2 , respectively, then, as above, one checks that v2 v1 is an
inverse for u1 u2 and so R× is closed under the product operation. The associative law holds because it holds
in R; 1R serves as the identity. If R is not commutative there is no reason for R× to be commutative, though
in certain cases it may be.
Thus we get the examples of Z× = {±1}, Q× = Q − {0}, R× = R − {0}, C× = C − {0}, and more
generally, F× = F − {0}. We also have, GL2 (F) = { M ∈ M2 (F) : det( M) ≠ 0}, F[e]× = { a + be : a ≠ 0}.
Proposition 25.2.4. Let n > 1 be an integer. The group (Z/nZ)× is precisely
{ā : 1 ≤ a ≤ n, ( a, n) = 1}.
Proof. If ā is invertible then ab = 1 (mod n) for some integer b; say ab = 1 + kn for some k ∈ Z. If d| a, d|n
then d|1. Therefore ( a, n) = 1.
Conversely, suppose that ( a, n) = 1 then for some u, v we have 1 = ua + vn and so ua = 1 (mod n).
One defines Euler’s ϕ function on positive integers by
$$\varphi(n) = \begin{cases} 1 & n = 1, \\ |(Z/nZ)^{\times}| & n > 1. \end{cases}$$
One can prove that this is a multiplicative function, namely, if (n, m) = 1 then ϕ(nm) = ϕ(n) ϕ(m). I
invite you to try and prove this based on the Chinese Remainder Theorem.
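Before attempting the proof it can help to see the multiplicativity numerically. The sketch below computes ϕ(n) straight from the description of (Z/nZ)× as the classes of 1 ≤ a ≤ n prime to n; the function names and the search bound 30 are arbitrary choices for the illustration, and a finite check of course proves nothing in general.

```python
from math import gcd

def phi(n):
    """Euler's function: 1 for n = 1, else the number of units of Z/nZ."""
    if n == 1:
        return 1
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def multiplicative_up_to(bound):
    """Check phi(n*m) == phi(n)*phi(m) for all coprime n, m below bound."""
    return all(phi(n * m) == phi(n) * phi(m)
               for n in range(1, bound) for m in range(1, bound)
               if gcd(n, m) == 1)

print(multiplicative_up_to(30))
```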
Definition 25.3.3. Let G be a group. G is called cyclic if there is an element g ∈ G such that G = { gⁿ :
n ∈ Z}; that is, any element of G is a power of g. The element g is then called a generator of G.
Example 25.3.4. Let G be any group. Let g ∈ G and define
⟨g⟩ := { gⁿ : n ∈ Z}.
This is a cyclic subgroup of G (it may be finite or infinite).
Example 25.3.5. The group Z is cyclic. As a generator we may take 1 (or −1).
Example 25.3.6. The group (Z/5Z)× = {1, 2, 3, 4} is cyclic. The elements 2, 3 are generators. The
group (Z/8Z)× is not cyclic. One can check that the square of any element is 1.
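Both claims are finite checks. The following sketch lists the units modulo n, computes the order of each by repeated multiplication, and collects the generators; the function names are hypothetical helpers for this example and n > 1 is assumed.

```python
from math import gcd

def units(n):
    """Representatives 1 <= a < n of the units of Z/nZ, for n > 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def order_mod(a, n):
    """Least k >= 1 with a^k = 1 (mod n), for a unit a."""
    k, power = 1, a % n
    while power != 1:
        power = power * a % n
        k += 1
    return k

def generators(n):
    """Units whose order equals the size of the whole unit group."""
    group = units(n)
    return [a for a in group if order_mod(a, n) == len(group)]

print(generators(5), generators(8))
```

An empty list of generators means that the unit group is not cyclic.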
The first line is in fact a cyclic subgroup of S3 . It is the subgroup generated by $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$.
(4) In general, let T0 be a subset of T. We can define two subgroups of Σ_T . The first is H = {σ ∈ Σ_T :
σ (t) = t, ∀t ∈ T0 } and the second is I = {σ ∈ Σ_T : σ ( T0 ) = T0 }. Then H ⊆ I ⊆ Σ_T are subgroups.
H and I depend on the choice of T0 , but we do not reflect this in the notation. If Σ_T = Sn and T0
has m elements then H has (n − m)! elements and I has (n − m)! m! elements.
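The two counts can be checked by listing all of Sn for a small n. This sketch stores a permutation as a dictionary from {1, ..., n} to itself; the function name and the choice n = 5, T0 = {1, 2} in the usage line are assumptions of the illustration.

```python
from itertools import permutations
from math import factorial

def count_H_and_I(n, t0):
    """Sizes of H (fixes T0 pointwise) and I (maps T0 onto T0) inside S_n."""
    points = range(1, n + 1)
    h = i = 0
    for images in permutations(points):
        sigma = dict(zip(points, images))
        if all(sigma[t] == t for t in t0):
            h += 1
        if {sigma[t] for t in t0} == set(t0):
            i += 1
    return h, i

print(count_H_and_I(5, {1, 2}))
```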
26.2. Cycles. There is a still more efficient notation for permutations in Sn . Fix n ≥ 1. A cycle (in Sn ) is an
expression of the form
( a1 a2 · · · at ),
where ai ∈ {1, 2, . . . , n} are distinct elements. This expression is understood as the permutation σ given by
$$\sigma(a) = \begin{cases} a_{i+1} & a = a_i,\ i < t, \\ a_1 & a = a_t, \\ a & \text{else.} \end{cases}$$
Pictorially:
a1 ↦ a2 ↦ a3 ↦ · · · ↦ at ↦ a1 .
Proof. Let k be the minimal positive integer such that g^k = e (∞ if no such integer exists).
Suppose first that o(g) is finite, say equal to r. Then the r + 1 elements {e, g, g^2, . . . , g^r} cannot be distinct
and so g^i = g^j for some 0 ≤ i < j ≤ r. It follows that g^{j−i} = e and so k ≤ j − i ≤ r. In particular k is also
finite. So r finite implies that k is finite and k ≤ r.
Suppose now that k is finite. Let n be an integer and write n = ak + b where 0 ≤ b < k. Then g^n =
(g^k)^a g^b = e^a g^b = g^b. We conclude that ⟨g⟩ ⊆ {e, g, · · · , g^{k−1}}, and so k finite implies that r is finite
and r ≤ k.
Example 26.2.3. Let ( a1 a2 · · · at ) be a cycle. Its order is t.
Two cycles σ, τ are called disjoint if they contain no common elements. In this case, clearly στ = τσ.
Moreover, (στ)^n = σ^n τ^n and since σ^n and τ^n are disjoint, (στ)^n = Id if and only if σ^n = Id and τ^n = Id.
Thus o(σ)|n, o(τ)|n and we deduce that o(στ) (namely, the least n such that (στ)^n = Id) is lcm(o(σ), o(τ)).
Arguing in the same way a little more generally we obtain:
Lemma 26.2.4. Let σ1 , . . . , σn be disjoint permutations of orders r1 , . . . , rn , respectively. Then the order of
the permutation σ1 ◦ σ2 ◦ · · · ◦ σn is lcm(r1 , r2 , . . . , rn ).
Combining this lemma with the following proposition allows us to calculate the order of every permutation
very quickly.
Proposition 26.2.5. Every permutation is a product of disjoint cycles.
We shall not provide a formal proof of this proposition, but illustrate it by examples.
Example 26.2.6. Consider the permutation
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix}.
\]
To write it as a product of cycles we
begin by (1 and check where 1 goes to. It goes to 3. So we write (13 and check where 3 goes to. It goes
to 1 and so we have (13). The first number we didn’t consider is 2. 2 goes to 4 and so we write (13)(24
and 4 goes to 5 and so we write (13)(245 . Now, 5 goes to 2 and so we have σ = (13)(245). The order of
σ is lcm(2, 3) = 6.
Example 26.2.7. Consider the permutation
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix}.
\]
It is written as a
product of disjoint cycles as follows: (1324)(5 10 8)(79). To find this expression, we did the same
procedure described above: We start with (1 , continue with (13 , because 1 goes to 3, and then with (132 ,
because 3 goes to 2. Then we find that 2 goes to 4 which goes to 1 and we have found (1324). The first
number not in this list is 5 which goes to 10 and so we have (1324)(5 10. Since 10 goes to 8 and 8 to 5 we get
now (1324)(5 10 8). The first number not in this list is 6 that goes to 6 and that gives (1324)(5 10 8)(6). We
then continue with 7. Since 7 goes to 9 which goes to 7 we have (1324)(5 10 8)(6)(79). We have considered
all numbers and so σ = (1324)(5 10 8)(6)(79) = (1324)(5 10 8)(79). The order of σ is lcm(4, 3, 2) = 12.
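The procedure used in Examples 26.2.6 and 26.2.7 can be sketched in code. The sketch below assumes a permutation is given by the bottom row of its table (so `table[i - 1]` is the image of i); the names `disjoint_cycles` and `order` are illustrative, and `math.lcm` requires Python 3.9 or later.

```python
from math import lcm  # available from Python 3.9

def disjoint_cycles(table):
    """Disjoint cycles of the permutation i -> table[i - 1], fixed points omitted."""
    seen = set()
    cycles = []
    for start in range(1, len(table) + 1):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = table[start - 1]
        while nxt != start:          # follow start -> sigma(start) -> ... until we return
            cycle.append(nxt)
            seen.add(nxt)
            nxt = table[nxt - 1]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

def order(table):
    """Order of the permutation: lcm of the cycle lengths (Lemma 26.2.4)."""
    return lcm(*(len(c) for c in disjoint_cycles(table)))
```

For instance, `disjoint_cycles([3, 4, 1, 5, 2])` follows exactly the steps of Example 26.2.6.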
Example 26.2.8. Suppose we want to find a permutation of order 10 in S7 . We simply take (12345)(67). If
we want to find a permutation of order 10 in S10 we can take either (12345)(67) or (123456789 10) (and all
variants on this).
Finally we remark on the computation of σ−1 for a permutation σ. If σ is given in the form of a table, for
example:
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix},
\]
then because σ(i) = j ⇔ σ^{−1}(j) = i, the table describing σ^{−1} is the same table but read from the bottom
to the top. That is,
\[
\sigma^{-1} = \begin{pmatrix} 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{pmatrix}.
\]
Following our convention, we write the columns in the conventional order and so we get
\[
\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 4 & 3 & 1 & 2 & 8 & 6 & 9 & 10 & 7 & 5 \end{pmatrix}.
\]
If σ is a cycle, say σ = (i_1 i_2 . . . i_{k−1} i_k), then σ^{−1} is easily seen to be (i_k i_{k−1} . . . i_2 i_1). So,
(i_1 i_2 . . . i_{k−1} i_k)^{−1} = (i_k i_{k−1} . . . i_2 i_1).
Now, if σ is a product of disjoint cycles, σ = σ_1 σ_2 . . . σ_r, then σ^{−1} = σ_r^{−1} . . . σ_2^{−1} σ_1^{−1} (by a generalization
of the rule (ab)^{−1} = b^{−1}a^{−1}), but, since those cycles are disjoint they commute, and so we can also write
this as σ^{−1} = σ_1^{−1} σ_2^{−1} . . . σ_r^{−1}. (This last manipulation is wrong if the cycles are not disjoint!) Thus, for
example, the inverse of σ = (1324)(5 10 8)(79) is (4231)(8 10 5)(97), which we can also write, if we wish,
as (1423)(5 8 10)(79).
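Both recipes for the inverse, the table read bottom to top and the reversed cycles, are short enough to sketch. The representation (a list holding the bottom row of the table, and a list of tuples for disjoint cycles) and the function names are assumptions made for this illustration.

```python
def inverse_table(table):
    """Bottom row of the table of the inverse: if i goes to j, then j goes to i."""
    inv = [0] * len(table)
    for i, j in enumerate(table, start=1):
        inv[j - 1] = i
    return inv

def inverse_cycles(cycles):
    """Reverse each cycle. As a product this is the inverse only for DISJOINT cycles."""
    return [tuple(reversed(c)) for c in cycles]
```

Applied to the permutation σ above, `inverse_table` reproduces the bottom row 4 3 1 2 8 6 9 10 7 5.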
The nature of the symmetries 1, y, . . . , y^{n−1} is clear: y^j rotates clockwise by angle j · 360°/n.
Proposition 26.3.2. Let 0 ≤ j < n. The element y^j x is a reflection through the line forming an angle
−j · 360°/2n with the x-axis.
Proof. The symmetry y^j x is not trivial. If it fixes an angle θ it must be reflection through the line with that
angle. Note that y^j x sends the angle θ to −θ and then adds −j · 360°/n so the equation is θ = −θ − j · 360°/n
(mod 360°). That is θ = −j · 360°/2n.
Example 26.3.3. We revisit Example 25.2.6. The technique is rather similar to studying the dihedral group.
The symmetries of solids we considered there preserve orientation (we don’t allow flip-overs through a fourth
dimension). If we number the vertices of a tetrahedron by 1, 2, 3, 4 and the vertices of the cube by 1, 2, . . . , 8,
we may view the group of symmetries, call them A and B, respectively, as subgroups of S4 and S8 , respectively.
If we fix an edge on the tetrahedron, or the cube, then a symmetry is completely determined by its effect
on this edge. The edge, as a moment’s reflection shows, can go to any other edge on the solid and in two
ways. Since a tetrahedron has 6 edges and a cube has 12, we conclude that the group A has 12 elements and
the group B has 24. If we wish we can enumerate the elements of the groups A, B, by listing the permutations
they induce.
gH := { gh : h ∈ H },
for some g ∈ G. The set gH is called the left coset of g; g is called a representative of the coset gH.
Example 27.1.1. Consider the subgroup H of S3 given by {1, (123), (132)}. Here are some cosets: H =
1H = (123) H = (132) H, (12) H = (13) H = (23) H = {(12), (23), (13)}. We leave the verification to the
reader.
We now prove the equivalence of the assertions (i) - (iii). Suppose (i) holds. Then g_1 = g_1 e ∈ g_2 H and
(ii) holds. Suppose (ii) holds; say g_1 = g_2 h. Then g_2^{−1} g_1 = h ∈ H and (iii) holds. Suppose that (iii)
holds; g_2^{−1} g_1 = h for some h ∈ H. Then g_1 = g_2 h and so g_1 H ∩ g_2 H ≠ ∅. By what we have proved in the
first part, g_1 H = g_2 H.
Remark 27.1.3. The Lemma and its proof should be compared with Lemma 21.0.3. In fact, since R is an
abelian group and an ideal I is a subgroup, that lemma is a special case of the lemma above.
Corollary 27.1.4. G is a disjoint union of cosets of H. Let {g_i : i ∈ I} be a set of elements of G such that
each coset has the form g_i H for a unique g_i. That is, G = ⊔_{i∈I} g_i H. Then {g_i : i ∈ I} is called a
complete set of representatives.
In the same manner one defines a right coset of H in G to be a subset of the form Hg = {hg : h ∈ H }
and Lemma 27.1.2 holds for right cosets with the obvious modifications. Two right cosets are either equal
or disjoint and the following are equivalent: (i) Hg_1 = Hg_2; (ii) g_1 ∈ Hg_2; (iii) g_1 g_2^{−1} ∈ H. Thus, the
Corollary holds true for right cosets as well.
We remark that the intersection of a left coset and a right coset may be non-empty, yet not a coset itself.
For example, take H = {1, (12)} in S3 . We have the following table.
g        gH                 Hg
1        {1, (12)}          {1, (12)}
(12)     {(12), 1}          {(12), 1}
(13)     {(13), (123)}      {(13), (132)}
(23)     {(23), (132)}      {(23), (123)}
(123)    {(123), (13)}      {(123), (23)}
(132)    {(132), (23)}      {(132), (13)}
The table demonstrates that indeed any two left (resp. right) cosets are either equal or disjoint, but the
intersection of a left coset with a right coset may be non-empty and properly contained in both.
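The table can be reproduced by a short computation. The sketch below represents a permutation of {1, 2, 3} as the tuple of its images and composes from right to left, (s∘t)(i) = s(t(i)), which is the convention consistent with the entries of the table; the names `compose`, `left_coset` and `right_coset` are illustrative.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)); a permutation is the tuple of images of 1..n."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

S3 = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]           # the subgroup {1, (12)}

def left_coset(g):
    return frozenset(compose(g, h) for h in H)

def right_coset(g):
    return frozenset(compose(h, g) for h in H)

left_cosets = {left_coset(g) for g in S3}     # three cosets, pairwise disjoint
right_cosets = {right_coset(g) for g in S3}   # three cosets, pairwise disjoint
```

With g = (13), stored as `(3, 2, 1)`, the sets `left_coset(g)` and `right_coset(g)` meet in the single element (13), so their intersection is non-empty but is not a coset.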
Proof. We have,
\[
G = \coprod_{i \in I} g_i H.
\]
Let a, b ∈ G. We claim that the function
f : aH → bH, x ↦ ba^{−1}x,
is a well defined bijection. First, x = ah for some h and so ba^{−1}x = bh ∈ bH and so the map is well defined.
It is surjective, because given an element y ∈ bH, say y = bh, it is the image of ah. The map is also injective:
if ba^{−1}x_1 = ba^{−1}x_2 then multiplying both sides by ab^{−1} we get x_1 = x_2.
We conclude that each coset gi H has the same number of elements, which is exactly the number of
elements in H = eH. We get therefore that
| G | = | H | · | I |.
That completes the proof.
Here are some applications of Lagrange’s theorem:
(1) Let G be a finite group of prime order p. Then G is cyclic; in fact, every element of G that is not
the identity generates G.
Indeed, let g ≠ e. Then H = ⟨g⟩ is a non-trivial subgroup. So |H| > 1 and divides p. It follows
that |H| = |G| and so that ⟨g⟩ = G.
(2) In a similar vein, we conclude that a group of order 6 say, cannot have elements of order 4, or 5,
or of any order not dividing 6. This follows immediately from Lagrange’s theorem, keeping in mind
that ord(g) = |⟨g⟩|.
Suppose f is injective. Then, since f(e_G) = e_H, e_G is the only element mapping to e_H and so Ker(f) =
{e_G}. Conversely, suppose Ker(f) = {e_G} and f(g_1) = f(g_2). Then e_H = f(g_1)^{−1} f(g_2) = f(g_1^{−1}) f(g_2) =
f(g_1^{−1} g_2). That means that g_1^{−1} g_2 ∈ Ker(f) and so g_1^{−1} g_2 = e_G. That is, g_1 = g_2.
28.2. Isomorphism.
Definition 28.2.1. A group homomorphism f : G → H is called an isomorphism if it is bijective.
As in the case of rings, one verifies that if f is an isomorphism, the inverse function g = f −1 is automatically
a homomorphism and so an isomorphism as well. Also, one easily checks that a composition of group
homomorphisms is a group homomorphism. It follows that being isomorphic is an equivalence relation on
groups. Cf. §22.1.
Example 28.2.2. Let n be a positive integer. Any two cyclic groups of order n are isomorphic.
Indeed, suppose that G = ⟨g⟩, H = ⟨h⟩ are cyclic groups of order n. Define, for any integer a,
f(g^a) = h^a.
This is well defined; if g^a = g^b then g^{a−b} = e_G and so n|(a − b). Thus, a = b + kn and f(g^a) = h^a =
h^b (h^n)^k = h^b = f(g^b). Obviously f is a surjective homomorphism; f is also injective, because f(g^a) = h^a =
e_H implies that n|a and so g^a = e_G.
In particular, we conclude that any cyclic group of order n is isomorphic to the group Z/nZ (with the
group operation being addition).
Example 28.2.3. Let p be a prime number. Then any two groups of order p are isomorphic. Indeed, we have
seen that such groups are necessarily cyclic.
Theorem 28.2.4. (Cayley) Let G be a finite group of order n. Then G is isomorphic to a subgroup of Sn.
G → Sn, g ↦ σ_g.
This map is a homomorphism of groups: σ_{gh}(a) = gha = σ_g(σ_h(a)). That is, σ_{gh} = σ_g ◦ σ_h. This homomorphism
is injective: if σ_g is the identity permutation then σ_g(e) = e and that implies ge = e, that is g = e. We
get that G is isomorphic to its image, which is a subgroup of Sn, under this homomorphism.
Remark 28.2.5. We were somewhat informal about identifying the permutations of G with Sn . A more
rigorous approach is the following.
Lemma 28.2.6. Let T, Z be sets and f : T → Z a bijection. The groups of permutations of T and Z are
isomorphic.
S_Z → S_T, τ ↦ f^{−1}τf.
Therefore, we found a bijective homomorphism S_T → S_Z, which shows those two permutation groups are
isomorphic.
Given an action of G on S we can define the following sets. Let s ∈ S. Define the orbit of s
Orb(s) = {g ⋆ s : g ∈ G}.
Note that Orb(s) is a subset of S, equal to all the images of s under the action of the group G. We also
define the stabilizer of s to be
Stab(s) = {g ∈ G : g ⋆ s = s}.
Note that Stab(s) is a subset of G. In fact, it is a subgroup, as Lemma 29.2.1 states.
The stabilizer of i is the permutations fixing i. These permutations can be identified with permutations of
the set T − {i}, because σ ∈ Stab(i) induces a permutation of T − {i} and a permutation of T − {i} can be
extended to a permutation of T sending i to itself. Thus, for every i, Stab(i) ≅ S_{n−1}. The orbit of i is T;
for every j ≠ i the transposition (ij) shows that j ∈ Orb(i).
Example 29.3.2. Let G be the group of real numbers R. The group operation is addition. Let S be the
sphere in R3 of radius 1 about the origin. The group R acts by rotating around the z-axis. An element r ∈ R
rotates by r radians. For every point s ∈ S, different from the poles, the stabilizer is 2πZ. For the poles the
stabilizer is R. The orbit of every point is the circle of latitude on which it lies.
Example 29.3.3. Let G be a group and H a subgroup of G. Then H acts on G by
H × G → G, (h, g) ↦ hg.
Here H plays the role of the group and G the role of the set in the definition. This is indeed a group
action: e_H g = g for all g ∈ G, because by definition e_H = e_G. Also, h_1(h_2 g) = (h_1 h_2)g is nothing but the
associative law.
The orbit of g ∈ G is
Orb( g) = {hg : h ∈ H } = Hg.
That is, the orbits are the right cosets of H. We have that G is a disjoint union of orbits, namely, a disjoint
union of cosets. The stabilizer of any element g ∈ G is {e}. The formula we have proven, |Orb(g)| =
| H |/|Stab( g)|, gives us | Hg| = | H | for any g ∈ G, and we see that we have another point of view on
Lagrange’s theorem.
Example 29.3.4. We consider a roulette with n sectors and write n = i1 + · · · + ik , for some positive (and
fixed) integers i1 , . . . , ik . We suppose we have different colors c1 , . . . , ck and we color i1 sectors of the roulette
by the color c1 , i2 sectors by the color c2 and so on. The sectors can be chosen as we wish and so there are
many possibilities. We get a set S of colored roulettes. Now, we turn the roulette r steps clockwise, say, and
we get another colored roulette, usually with a different coloring. Nonetheless, it is natural to view the two
colorings as the same, since “they only depend on your point of view”. We may formalize this by saying that
the group Z/nZ acts on S; r acts on a colored roulette by turning it r steps clockwise, and by saying that
we are interested in the number of orbits for this action.
Example 29.3.5. Let G be the dihedral group D8 . Recall that G is the group of symmetries of a regular
octagon in the plane.
G = {e, y, y^2, . . . , y^7, x, yx, y^2x, . . . , y^7x},
where y is rotation clockwise by angle 2π/8 and x is reflection through the x-axis. We have the relations
x^2 = y^8 = e, xyxy = e.
We let S be the set of colorings of the octagon ( = necklaces laid on the table) having 4 red vertices (rubies)
and 4 green vertices (sapphires). The group G acts on S by its action on the octagon.
For example, the coloring s_0, consisting of alternating green and red, is certainly preserved under x and
under y^2. Therefore, the stabilizer of s_0 contains at least the set of eight elements
{e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x}.     (4)
Remember that the stabilizer is a subgroup and, by Lagrange’s theorem, of order dividing 16 = |G|. On the
other hand, Stab(s_0) ≠ G because y ∉ Stab(s_0). It follows that the stabilizer has exactly 8 elements and is
equal to the set in (4).
Let H be the stabilizer of s_0. According to Lemma 29.2.1 the orbit of s_0 is in bijection with the left cosets
of H = {e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x}. By Lagrange’s theorem there are two cosets. For example, H and yH
are distinct cosets, because y ∉ H. The proof of Lemma 29.2.1 tells us how to find the orbit: it is the set {s_0, ys_0}, which
is of course quite clear if you think about it.
Remark 30.0.2. Note that I ( g) is the number of fixed points for the action of g on S. Thus, the CFF can
be interpreted as saying that the number of orbits is the average number of fixed points (though this does
not make the assertion more obvious).
Proof. We define a function
\[
T : G \times S \to \{0, 1\}, \qquad T(g, s) = \begin{cases} 1 & g \star s = s, \\ 0 & g \star s \neq s. \end{cases}
\]
Note that for a fixed g ∈ G we have
\[
I(g) = \sum_{s \in S} T(g, s),
\]
and that for a fixed s ∈ S we have
\[
|\mathrm{Stab}(s)| = \sum_{g \in G} T(g, s).
\]
Let us fix representatives s1 , . . . , s N for the N disjoint orbits of G in S. Now,
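\begin{align*}
\sum_{g \in G} I(g) &= \sum_{g \in G} \Bigl( \sum_{s \in S} T(g, s) \Bigr) = \sum_{s \in S} \Bigl( \sum_{g \in G} T(g, s) \Bigr) \\
&= \sum_{s \in S} |\mathrm{Stab}(s)| = \sum_{s \in S} \frac{|G|}{|\mathrm{Orb}(s)|} \\
&= \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s)|} = \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s_i)|} \\
&= \sum_{i=1}^{N} \frac{|G|}{|\mathrm{Orb}(s_i)|} \cdot |\mathrm{Orb}(s_i)| = \sum_{i=1}^{N} |G| \\
&= N \cdot |G|.
\end{align*}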
z
Remark 30.0.3. If N, the number of orbits, is equal to 1 we say that G acts transitively on S. It means exactly
that for every s_1, s_2 ∈ S there exists g ∈ G such that g ⋆ s_1 = s_2. Note that if G and S are finite and G
acts transitively then the number of elements in S divides the number of elements in G,
|S| ∣ |G|,
because, if S = Orb(s) then |S| = |G|/|Stab(s)|.
Corollary 30.0.4. Let G be a finite group acting transitively on a finite set S. Suppose that |S| > 1. Then
there exists g ∈ G without fixed points.
21 This is also sometimes called Burnside’s formula.
22 The sum appearing in the formula means just that: if you write G = {g_1, . . . , g_n} then ∑_{g∈G} I(g) is
∑_{i=1}^{n} I(g_i) = I(g_1) + I(g_2) + · · · + I(g_n). The double summation ∑_{g∈G} ∑_{s∈S} T(g, s) appearing in the proof
means that if we write S = {s_1, . . . , s_m} then the double sum is T(g_1, s_1) + T(g_1, s_2) + · · · + T(g_1, s_m) +
T(g_2, s_1) + T(g_2, s_2) + · · · + T(g_2, s_m) + · · · + T(g_n, s_1) + T(g_n, s_2) + · · · + T(g_n, s_m).
Proof. By contradiction. Suppose that every g ∈ G has a fixed point in S. That is, suppose that for
every g ∈ G we have
I ( g) ≥ 1.
Since I (e) = |S| > 1 we have that
\[
\sum_{g \in G} I(g) > |G|.
\]
By the Cauchy-Frobenius formula, the number of orbits N is greater than 1. Contradiction.
Example 30.1.2. How many roulettes with 12 wedges painted 2 blue, 2 green and 8 red are there when we
allow rotations?
element g      I(g)
0              2970
i ≠ 0, 6       0
i = 6          30
Applying CFF we get that there are
\[
N = \frac{1}{12}(2970 + 30) = 250
\]
different roulettes.
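The fixed-point counts in the table follow a pattern that can be written as a formula: rotation by r steps splits the n sectors into gcd(n, r) cycles of equal length, and a coloring is fixed exactly when it is constant on each cycle. The following sketch applies the Cauchy-Frobenius formula on that basis; the function names are illustrative and `counts` is an assumed way of passing the numbers i_1, . . . , i_k of sectors per color.

```python
from math import gcd, factorial

def multinomial(counts):
    """Number of ways to color sum(counts) slots using counts[j] slots of color j."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total

def fixed_colorings(n, counts, r):
    """I(r): colorings with the given color counts fixed by rotation by r steps."""
    cycles = gcd(n, r)            # gcd(n, 0) = n: the identity has n cycles of length 1
    length = n // cycles
    if any(c % length for c in counts):
        return 0                  # a color count that is not a multiple of the cycle length
    return multinomial([c // length for c in counts])

def count_roulettes(n, counts):
    """Number of orbits of Z/nZ on the colorings, by the Cauchy-Frobenius formula."""
    return sum(fixed_colorings(n, counts, r) for r in range(n)) // n
```

For the roulette of Example 30.1.2 one would call `count_roulettes(12, [2, 2, 8])`.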
Example 30.1.3. In this example S is the set of necklaces made of four rubies and four sapphires laid on the
table. We ask how many necklaces there are when we allow rotations and flipping-over. We may talk of S
as the colorings of a regular octagon, four vertices are green and four are red. The group G = D8 acts on S
and we are interested in the number of orbits for the group G. The results are the following
element g                  I(g)
e                          70
y, y^3, y^5, y^7           0
y^2, y^6                   2
y^4                        6
xy^i for i = 0, . . . , 7  6
(For example xy = (1 8)(2 7)(3 6)(4 5) is of this sort). Whichever the case, one uses similar reasoning to
deduce that there are 6 colorings preserved by a reflection.
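The number of orbits can also be found by brute force, without counting fixed points at all. The sketch below is one way to do that, assuming the vertices are labelled 0, . . . , 7, a coloring is recorded as the set of its four red vertices, y acts as v ↦ v + 1 and x as v ↦ −v (mod 8); these labels are choices made for the illustration.

```python
from itertools import combinations

N_VERTICES = 8

def rotate(s):
    """Action of y: shift every red vertex by one step."""
    return frozenset((v + 1) % N_VERTICES for v in s)

def reflect(s):
    """Action of x: reflect through the axis passing through vertex 0."""
    return frozenset((-v) % N_VERTICES for v in s)

# All colorings with 4 red and 4 green vertices, recorded by the red positions.
colorings = [frozenset(c) for c in combinations(range(N_VERTICES), 4)]

def orbit(s):
    """Orbit of s under D8: apply every rotation, with and without the reflection."""
    out = set()
    t = s
    for _ in range(N_VERTICES):
        out.add(t)
        out.add(reflect(t))
        t = rotate(t)
    return frozenset(out)

orbits = {orbit(s) for s in colorings}
```

The value `len(orbits)` can be compared with what the Cauchy-Frobenius formula gives from the table, (70 + 4 · 0 + 2 · 2 + 6 + 8 · 6)/16 = 8.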
we conclude that there must be an element (g_1, . . . , g_p) in S with a non-trivial stabilizer. This means that for
some g ∈ G, such that g ≠ e, we have
(gg_1, . . . , gg_p) is equal to (g_1, . . . , g_p) up to a cyclic shift.
This means that for some i we have
(gg_1, . . . , gg_p) = (g_{i+1}, g_{i+2}, g_{i+3}, . . . , g_p, g_1, g_2, . . . , g_i).
Therefore, gg_1 = g_{i+1}, g^2 g_1 = gg_{i+1} = g_{2i+1}, . . . , g^p g_1 = · · · = g_{pi+1} = g_1 (we always read the indices
mod p). That is, there exists g ≠ e with
g^p = e.
Since the order of g divides p, g ≠ e, and p is prime, the order of g must be p.
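Finally, $\bar{a} \cdot \overline{a^{-1}} = \overline{a a^{-1}} = \overline{e_G} = e_{G/H}$ and $\overline{a^{-1}} \cdot \bar{a} = \overline{a^{-1} a} = \overline{e_G} = e_{G/H}$. Thus, every ā is invertible and its
inverse is $\overline{a^{-1}}$ (that is, (aH)^{−1} = a^{−1}H).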
[Diagram: the projection π onto G/Ker(f) and the induced map F out of G/Ker(f).]
32.4.1. Groups of order 1. There is a unique group of order 1, up to isomorphism. It consists of its identity
element alone. There is only one way to define a homomorphism between two groups of order 1 and it is an
isomorphism.
32.4.2. Groups of order 2, 3, 5, 7. Recall that we proved that every group G of prime order is cyclic, and,
in fact, any non-trivial element is a generator. This implies that any subgroup of G different from {eG } is
equal to G. We also proved that any two cyclic groups having the same order are isomorphic. We therefore
conclude:
Corollary 32.4.1. Every group G of prime order p is isomorphic to Z/pZ; it has no subgroups apart from
the trivial subgroups {e_G}, G.
Claim: Let G be a group in which every element different from the identity has order 2. Then G is
commutative.
Proof: Note first that if a ∈ G has order 2 (or is the identity) then aa = e_G and so a^{−1} = a. Now, we need
to show that for every a, b ∈ G we have ab = ba. But this is equivalent to ab = b^{−1}a^{−1}. Multiply both
sides by ab and we see that we need to prove that abab = e_G. But, abab = (ab)^2 and so is equal to e_G, by
assumption.
One example of a group of order 4 satisfying all these properties is Z/2Z × Z/2Z. We claim that G ≅
Z/2Z × Z/2Z. Pick two distinct elements g_1, g_2 of G, not equal to the identity. Define a map
The non-trivial subgroups of Z/2Z × Z/2Z are all cyclic. They are {(0, 0), (0, 1)}, {(0, 0), (1, 0)} and
{(0, 0), (1, 1)}. Since the group is commutative they are all normal and the quotient in every case has
order 2, hence isomorphic to Z/2Z.
32.4.4. Groups of order 6. We know three candidates already: Z/6Z, Z/2Z × Z/3Z and S3. Now, in
fact, Z/6Z ≅ Z/2Z × Z/3Z (CRT). And since S3 is not commutative it is not isomorphic to Z/6Z. In
fact, every group of order 6 is isomorphic to either Z/6Z or S3. We don’t prove it here.
The subgroups of Z/6Z: Let n be a positive integer. We have a surjective group homomorphism π :
Z → Z/nZ. Similar to the situation with rings one can show that this gives a bijection between subgroups H
of Z that contain nZ and subgroups K of Z/nZ. The bijection is given by
H ↦ π(H), K ↦ π^{−1}(K).
The subgroups of Z are all cyclic, having the form mZ for some m (same proof as for ideals, really). We thus
conclude that the subgroups of Z/nZ are cyclic and generated by the elements m such that m|n. Thus,
for n = 6 we find the cyclic subgroups generated by 1, 2, 3, 6. Those are the subgroups Z/6Z, {0, 2, 4}, {0, 3}, {0}.
They are all normal and the quotients are isomorphic respectively to {0}, Z/2Z, Z/3Z, Z/6Z.
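The description of the subgroups of Z/nZ can be compared with an exhaustive search for small n. In the sketch below, `subgroups_of_Zn` lists the subgroup generated by each divisor m of n, and `all_subgroups_bruteforce` tests every subset directly; both names are illustrative, and the brute-force search is only practical for small n.

```python
from itertools import combinations

def subgroups_of_Zn(n):
    """Map each divisor m of n to the subgroup of Z/nZ generated by m."""
    return {
        m: sorted({(m * k) % n for k in range(n)})
        for m in range(1, n + 1)
        if n % m == 0
    }

def all_subgroups_bruteforce(n):
    """All subsets of Z/nZ containing 0 and closed under subtraction."""
    found = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            if 0 in s and all((a - b) % n in s for a in s for b in s):
                found.append(sorted(s))
    return found
```

For n = 6 the subgroup generated by m has 6/m elements, so the quotient by it has order m, in agreement with the list of quotients above.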
32.5. Odds and evens. Let n ≥ 2 be an integer. One can show that there is a way to assign a sign, ±1, to
any permutation in Sn such that the following properties hold:
• sgn(στ ) = sgn(σ) · sgn(τ ).
• sgn((ij)) = −1 for i ≠ j.
We do not prove that here, but we shall prove that next term in MATH 251. Note that since any permutation
is a product of transpositions, the two properties together determine the sign of any permutation. Here are
some examples: sgn((12)) = −1, sgn((123)) = sgn((13)(12)) = sgn((13)) · sgn((12)) = 1, sgn((1234)) =
sgn((14)(13)(12)) = (−1)^3 = −1.
The property sgn(στ ) = sgn(σ) · sgn(τ ) could be phrased as saying that the function
sgn : Sn −→ {±1}
is a surjective group homomorphism. We define An as the kernel of the homomorphism sgn. It is called the
alternating group on n letters and its elements are called even permutations. The elements of Sn \ An are
called odd permutations. The group An is a normal subgroup of Sn , being a kernel of a homomorphism. Its
cardinality is n!/2. Here are some examples:
• A2 = {1};
• A3 = {1, (123), (132)};
• A4 = {1, (12)(34), (13)(24), (14)(23), (123), (132), (234), (243), (124), (142), (134), (143)}. (Easy
to check those are distinct 12 even permutations, so the list must be equal to A4 ).
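Granting the two properties of sgn stated above, the sign can be read off from the cycle structure: a cycle of length t is a product of t − 1 transpositions, as in (1234) = (14)(13)(12), so a permutation of n letters with c cycles (fixed points counted as cycles of length 1) has sign (−1)^{n−c}. The sketch below uses this to list A4; the tuple-of-images representation and the name `sign` are assumptions of the illustration.

```python
from itertools import permutations

def sign(p):
    """Sign of the permutation i -> p[i - 1], computed as (-1)^(n - number of cycles)."""
    n = len(p)
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:       # walk around the cycle containing i + 1
                seen[j] = True
                j = p[j] - 1
    return (-1) ** (n - cycles)

# The even permutations of {1, 2, 3, 4}, as tuples of images.
A4 = [p for p in permutations(range(1, 5)) if sign(p) == 1]
```

The list `A4` has 4!/2 = 12 elements, matching the cardinality stated above.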
Let a ∈ Z/pZ. Then the two sets H* and a − H* := {a − h : h ∈ H*} have size (p + 1)/2 and so
must intersect (because Z/pZ has p < 2 · (p + 1)/2 elements). That is, there are two squares x^2, y^2 such
that a − x^2 = y^2 and so a = x^2 + y^2.
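The counting argument can be checked directly for small primes. The sketch assumes, as the argument suggests, that H* denotes the set of squares in Z/pZ including 0, which has (p + 1)/2 elements for an odd prime p; the function names are illustrative.

```python
def squares_mod(p):
    """The set of squares in Z/pZ, including 0."""
    return {(x * x) % p for x in range(p)}

def every_element_is_sum_of_two_squares(p):
    """Check that each a in Z/pZ can be written as x^2 + y^2."""
    squares = squares_mod(p)
    return all(
        any((a - s) % p in squares for s in squares)   # a - x^2 is again a square
        for a in range(p)
    )
```

For example, `every_element_is_sum_of_two_squares(7)` confirms the statement for p = 7 even though −1 is not a square modulo 7.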
32.7. Exercises.
(1) Write the following permutations as a product of disjoint cycles in S9 and find their order:
(a) στ^2σ, where σ = (1234)(68) and τ = (123)(398)(45).
(b) στστ, σ = (123), τ = (345)(17).
(c) σ^{−1}τσ, σ = (123456789), τ = (12)(345)(6789).
(d) σ^{−1}τσ, σ = (123456789), τ = (12345)(6789).
(2) Which of the following are subgroups of S4 ?
(a) {1, (12)(34), (13)(24), (14)(23)}.
(b) {1, (1234), (13), (24), (13)(24)}.
(c) {1, (423), (432), (42), (43), (23)}.
(d) {1, (123), (231), (124), (142)}.
(3) (a) Let Q8 be the set of eight elements {±1, ±i, ± j, ±k} in the quaternion ring H, discussed in a
previous assignment (so ij = k = − ji etc.). Show that Q8 is a group.
(b) For each of the groups S3 , D4 , Q8 do the following:
(i) Write their multiplication table;
(ii) Find the order of each element of the group;
(iii) Find all the subgroups. Which of them are cyclic?
(4) (a) Find the order of the permutation
\[
\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 1 & 5 & 6 & 2 & 7 & 8 & 4 \end{pmatrix}
\]
and write it as a product of cycles.
(b) Find a permutation in S12 of order 60. Is there a permutation of larger order in S12 ?
(6) Let G be a group and let H1 , H2 be subgroups of G. Prove that if H1 ∪ H2 is a subgroup then either
H1 ⊆ H2 or H2 ⊆ H1 .
is a group. If F has q elements, how many elements are in the group SL2 (F)? (You may use the
fact that GL2 (F) is a group, being the units of the ring M2 (F).)
(8) Let n ≥ 2 be an integer. Let Sn be the group of permutations on n elements {1, 2, ..., n}. A
permutation σ is called a transposition if σ = (i j) for some i ≠ j, namely, σ exchanges i and j
and leaves the rest of the elements in their places. Prove that every element of Sn is a product of
transpositions.
Hint: Reduce to the case of cycles.
(9) (a) Let n ≥ 1 be an integer. Show that the order of a ∈ Z/nZ, viewed as a group of order n with respect
to addition, is n/gcd(a, n).
(b) Let n ≥ 3. Find the order of every element of the dihedral group Dn .
(10) Let n > 1 be an integer, relatively prime to 10. Consider the decimal expansion of 1/n. It is
periodic, as we have proven in a previous assignment. Prove that the length of the period is precisely
the order of 10 in the group Z/nZ× (the group of congruence classes relatively prime to n, under
multiplication).
For example: 1/3 = 0.33333 . . . has period 1 and the order of 10 (mod 3) = 1 (mod 3), with
respect to multiplication, is 1. We have 1/7 = 0.142857142857142857 · · · , which has
Note that R has 2^6 = 64 elements. Let G be the group of units of R. In this case it consists of the
matrices such that a_{11} = a_{22} = a_{33} = 1. Therefore G has 8 elements. Prove that G ≅ D4.
(17) Let H be a subgroup of index 2 of a group G. Prove that H is normal in G.
Part 7. Appendices
Appendix A: The Cantor-Bernstein Theorem
In this appendix we prove the Cantor-Bernstein theorem (Theorem 4.0.2).
Proof. We divide the proof into three parts. First, given a set X, denote by P ( X ) the set whose elements
are the subsets of X. The first step is the following “fixed point" lemma.
Lemma 32.7.2. Let X be a set and let ϕ : P ( X ) → P ( X ) be a function with the property that for A1 , A2 ∈
P ( X ), A1 ⊆ A2 =⇒ ϕ( A1 ) ⊆ ϕ( A2 ). Then, there is A0 ∈ P ( X ) such that
ϕ ( A0 ) = A0 .
D = { A ∈ P ( X ) : A ⊆ ϕ( A)}.
Let
A0 = ⋃_{A∈D} A.
f : X → Y, g : Y → X.
Lemma 32.7.3. There is a subset A0 of X such that g^{−1} gives a bijection between X − A0 and Y − f(A0).
ϕ( A) = X − g(Y − f ( A)).
A0 = ϕ( A0 ) = X − g(Y − f ( A0 )).
That means that X − A0 = g(Y − f(A0)) and so that X − A0 is contained in the image of g and g^{−1} gives
a bijection X − A0 → Y − f(A0).
We now proceed to the last part of the proof where we construct a bijection
h : X → Y.
We define
\[
h(x) = \begin{cases} f(x) & x \in A_0, \\ g^{-1}(x) & x \in X - A_0. \end{cases}
\]
Note that h is injective when restricted to A0 and when restricted to X − A0 . At the same time h( A0 ) =
f ( A0 ) is disjoint from h( X − A0 ) = Y − f ( A0 ). Thus, h is injective. On the other hand, as Y = f ( A0 ) ∪
(Y − f ( A0 )), h is also surjective.
is an integer. This number is clearly positive. On the other hand, it is smaller than
\[
\frac{1}{N+1} + \frac{1}{(N+1)^2} + \frac{1}{(N+1)^3} + \cdots = \frac{1}{N} < 1.
\]
As there is no positive integer smaller than 1, we have arrived at a contradiction and so e is irrational.
As said, this process of repeated division with residue must stop as the sizes |r_i| of the residues are decreasing
natural numbers. Then r_t is the gcd of a and b. That is, r_t divides both a and b and any element of R dividing
both a and b divides r_t. Note though that r_t is not uniquely determined by these properties as for every unit
u of R also ur_t would have these properties (also, in each step of the division with residue we have to choose
q_i and that affects the residues as well). In the case of Z we could make the gcd unique by demanding that
it is positive, and in the case of F[ x ] we could make it unique by requiring it to be a monic polynomial, but
for a general ring R there are no such normalizations. And so, we should really say “a gcd" of a, b and not
“the gcd" of a, b. As in the case of Z or F[ x ] we can prove:
Proposition 32.7.4. R is a principal ideal domain. Let a, b ∈ R not both zero. Let r be a generator of the
ideal of R generated by a and b, ⟨r⟩ = ⟨a, b⟩. Then r is a gcd of a and b.
Proof. Let us show first that R is a principal ideal domain. Let I be a non-zero ideal of R and consider the
non-empty set of natural numbers
{| a| : a ∈ I − {0}}.
This set has a minimal element, necessarily of the form |a_0| for some element a_0 ∈ I, a_0 ≠ 0. Clearly ⟨a_0⟩ ⊆ I.
Let x ∈ I and write x = qa_0 + r where either r = 0 or |r| < |a_0|. As r = x − qa_0 ∈ I, we cannot have
|r| < |a_0| and so we must have r = 0. That is, x ∈ ⟨a_0⟩ and we got I ⊆ ⟨a_0⟩. Therefore, I = ⟨a_0⟩ and is
principal.
Assume that b is not zero and let r be a gcd of a, b obtained by the Euclidean algorithm. It follows from the
algorithm that r = xa + yb for some x, y ∈ R and so r ∈ ⟨a, b⟩ and consequently ⟨r⟩ ⊆ ⟨a, b⟩. On the other
hand, as r|a we have a ∈ ⟨r⟩ and similarly b ∈ ⟨r⟩ and it follows that ⟨a, b⟩ ⊆ ⟨r⟩. That is, ⟨r⟩ = ⟨a, b⟩,
where r is a gcd of a and b. Remark that the generators of the ideal ⟨r⟩ are precisely the gcd’s of a, b (they
are the elements {ur : u ∈ R×}).
The next step is to show that for r ∈ R, r ≠ 0, r ∉ R×, the following properties are equivalent: (i) if r|ab
then r|a or r|b; (ii) if r = ab then either a or b is a unit. Such an element is called a prime element of R. A
proof very similar to those for Z or F[x] gives us unique factorization:
Theorem 32.7.5. Let R be a Euclidean ring and a ≠ 0 an element of R. Then, there are non-associated
prime elements p_1, . . . , p_t of R and positive integers a_1, . . . , a_t and a unit u such that
\[
a = u\, p_1^{a_1} \cdots p_t^{a_t}.
\]
Moreover, the factorization is unique in the sense that if a = v q_1^{b_1} · · · q_s^{b_s}, with v a unit, the q_i non-associated
prime elements and the b_i positive integers, then possibly after renaming the q_i we have s = t, p_i ∼ q_i for all i
and a_i = b_i for all i.
Here is an interesting example of a Euclidean ring, called the ring of Gaussian integers . It is the ring
Z[i ] = { x + yi : x, y ∈ Z}, where
|x + yi| = x^2 + y^2.
Given a, b ∈ Z[i] such that b ≠ 0, write using division in C, a/b = s + ti, where s, t are real numbers (in fact
they come out rational numbers) and let q = s_0 + t_0 i, where s_0 is the integer nearest to s and t_0 the integer
nearest to t. There is a unique r ∈ Z[i ] such that the equation
a = qb + r
holds; that is, define r = a − qb. One then verifies that
|r| < |b|,
and so we get that Z[i] is Euclidean. It is interesting that if you try to mimic this argument for R = Z[√−5]
and |x + y√−5| = x^2 + 5y^2 the proof doesn’t work. We know that because we have seen that R is not a
principal ideal domain (see page 67). It is interesting to analyze the exact point where the proof fails, but we
leave that to the interested reader.
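The division with residue in Z[i] described above, and the Euclidean algorithm built on it, can be sketched as follows. A Gaussian integer x + yi is stored as the pair (x, y); the rounding is done in exact integer arithmetic, with ties rounded upwards, which is one admissible choice of "nearest integer". All function names are illustrative.

```python
def norm(z):
    """The size |x + yi| = x^2 + y^2 used in the text."""
    x, y = z
    return x * x + y * y

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def nearest(numerator, denominator):
    """An integer nearest to numerator/denominator, for denominator > 0."""
    return (2 * numerator + denominator) // (2 * denominator)

def divmod_gauss(a, b):
    """Return (q, r) with a = q*b + r and |r| < |b|, for b != 0."""
    n = norm(b)
    s_num, t_num = mul(a, (b[0], -b[1]))      # a/b = (s_num + t_num*i)/n
    q = (nearest(s_num, n), nearest(t_num, n))
    qb = mul(q, b)
    return q, (a[0] - qb[0], a[1] - qb[1])

def gcd_gauss(a, b):
    """A gcd of a and b (determined only up to a unit) by repeated division."""
    while b != (0, 0):
        _, r = divmod_gauss(a, b)
        a, b = b, r
    return a
```

For instance, `gcd_gauss((5, 0), (2, 1))` returns an associate of 2 + i, reflecting the factorization 5 = (2 + i)(2 − i).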
Lemma 32.7.6. For every complex number z the series converges to a complex number that we shall call e^z.
Proof. Left as an exercise. The key is to show that for every real number $\varepsilon > 0$ and $z \in \mathbb{C}$ there is
an integer $n \geq 0$ such that for all integers $N \geq 0$,
\[ \left| \sum_{k=n}^{n+N} \frac{z^k}{k!} \right| < \varepsilon. \]
This implies that the real part and imaginary part of this sum are both less than $\varepsilon$ in absolute value and this, in turn, implies
that the sequences $\mathrm{Re}\big(\sum_{k=1}^{N} z^k/k!\big)$ and $\mathrm{Im}\big(\sum_{k=1}^{N} z^k/k!\big)$, as N varies, are Cauchy
sequences of real numbers that thus converge.
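The convergence claimed in the lemma can be observed numerically. The sketch below is an illustration, not a proof: the function name `exp_partial_sum` is hypothetical, and it assumes that Python's library function `cmath.exp` agrees with the value of the series, using it only as a reference point.

```python
import cmath

def exp_partial_sum(z, n_terms):
    """Sum of z^k / k! for k = 0, ..., n_terms - 1."""
    total = 0j
    term = 1 + 0j  # the k = 0 term, z^0 / 0!
    for k in range(n_terms):
        total += term
        term *= z / (k + 1)  # turns z^k / k! into z^(k+1) / (k+1)!
    return total

z = 1 + 2j
for n_terms in (5, 10, 20, 40):
    gap = abs(exp_partial_sum(z, n_terms) - cmath.exp(z))
    print(n_terms, gap)  # the gap shrinks rapidly as n_terms grows
```

Each term is obtained from the previous one by multiplying by z/(k+1), which avoids computing large powers and factorials separately.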
Theorem 32.7.7. The complex exponential function $e^z$ satisfies
\[ e^{z_1 + z_2} = e^{z_1} e^{z_2}. \]
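Proof. As formal power series we have
\begin{align*}
e^{z_1 + z_2} &= \sum_{n=0}^{\infty} \frac{(z_1 + z_2)^n}{n!} \\
&= \sum_{n=0}^{\infty} \sum_{j=0}^{n} \frac{\binom{n}{j} z_1^{j} z_2^{n-j}}{n!} \\
&= \sum_{n=0}^{\infty} \sum_{j=0}^{n} \frac{z_1^{j}}{j!} \cdot \frac{z_2^{n-j}}{(n-j)!} \\
&= \Big( \sum_{n=0}^{\infty} \frac{z_1^{n}}{n!} \Big) \Big( \sum_{n=0}^{\infty} \frac{z_2^{n}}{n!} \Big).
\end{align*}
To justify that as equality of complex numbers we should work with finite sums $\sum_{n=0}^{2M} \frac{(z_1 + z_2)^n}{n!}$ where we get
equalities up to the last step where we don’t quite get $\big( \sum_{n=0}^{2M} \frac{z_1^n}{n!} \big) \big( \sum_{n=0}^{2M} \frac{z_2^n}{n!} \big)$, but a sub-sum of this containing
all terms appearing in $\big( \sum_{n=0}^{M} \frac{z_1^n}{n!} \big) \big( \sum_{n=0}^{M} \frac{z_2^n}{n!} \big)$.
However, it is easy to show that the two sums differ by a quantity
that goes to zero as M goes to infinity. We leave the details to the reader.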
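The finite-sum comparison in the proof can be illustrated numerically. In the sketch below the name `partial_exp` and the sample values of z1 and z2 are assumptions chosen for illustration. The partial sum of the series for z1 + z2 up to 2M is compared with the two products that sandwich it in the argument: the product of the partial sums up to M and the product of the partial sums up to 2M.

```python
def partial_exp(z, top):
    """Sum of z^n / n! for n = 0, ..., top."""
    total, term = 0j, 1 + 0j
    for n in range(top + 1):
        total += term
        term *= z / (n + 1)  # next term z^(n+1) / (n+1)!
    return total

z1, z2 = 0.5 + 1j, -1.5 + 0.25j
for M in (2, 4, 8, 16):
    lhs = partial_exp(z1 + z2, 2 * M)
    small = partial_exp(z1, M) * partial_exp(z2, M)
    big = partial_exp(z1, 2 * M) * partial_exp(z2, 2 * M)
    # Both gaps should tend to zero as M grows.
    print(M, abs(lhs - small), abs(lhs - big))
```

This does not replace the estimate left to the reader; it only shows the two gaps shrinking for one choice of z1 and z2.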
Lemma 32.7.8 (Euler’s formula). Let $\theta$ be a real number. Then
\[ e^{i\theta} = \cos(\theta) + i \sin(\theta). \]
Remark 32.7.9. In the text, where we discussed complex numbers, we have defined $e^{i\theta}$ this way as a quick
and dirty method to get a function $e^z$ with good properties. We now show that this formula follows from the
more systematic approach of defining $e^z$ via a power series, valid for any real or complex number z.
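Proof. We have
\begin{align*}
e^{i\theta} &= \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} \\
&= \sum_{n=0}^{\infty} i^{2n} \frac{\theta^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} i^{2n} \frac{\theta^{2n+1}}{(2n+1)!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n+1}}{(2n+1)!} \\
&= \cos(\theta) + i \sin(\theta).
\end{align*}
We have assumed here familiarity with the Taylor series expansion for sin and cos.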
Corollary 32.7.10 (de Moivre’s formula). Let $\theta$ be a real number and n an integer. Then
\[ (\cos(\theta) + i \sin(\theta))^n = \cos(n\theta) + i \sin(n\theta). \]
Proof. The left hand side is equal to $(e^{i\theta})^n$, while the right hand side is equal to $e^{in\theta}$. Theorem 32.7.7 implies
these are equal.
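Euler's formula and de Moivre's formula can both be checked numerically, up to floating point error. The following is a sketch under stated assumptions: `exp_series` and `cis` are hypothetical helper names, the series is truncated after 60 terms, and the sample values of theta and n are arbitrary.

```python
import math

def exp_series(z, n_terms=60):
    """Truncated power series: sum of z^k / k! for k = 0, ..., n_terms - 1."""
    total, term = 0j, 1 + 0j
    for k in range(n_terms):
        total += term
        term *= z / (k + 1)
    return total

def cis(theta):
    """cos(theta) + i sin(theta) as a Python complex number."""
    return complex(math.cos(theta), math.sin(theta))

theta, n = 0.7, 5
euler_gap = abs(exp_series(1j * theta) - cis(theta))
de_moivre_gap = abs(cis(theta) ** n - cis(n * theta))
print(euler_gap, de_moivre_gap)  # both are at the level of rounding error
```

The series is summed directly, rather than calling a library exponential, so that the first check really compares the power series definition with cos and sin.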
Example 32.7.11. By expanding we find trigonometric formulas, and this is the easiest way to know them,
without memorizing them!
\[ \cos(2\theta) + i \sin(2\theta) = (\cos(\theta) + i \sin(\theta))^2 = \cos(\theta)^2 - \sin(\theta)^2 + 2i \cos(\theta) \sin(\theta). \]
By separating real and imaginary parts, we find:
\[ \cos(2\theta) = \cos(\theta)^2 - \sin(\theta)^2, \qquad \sin(2\theta) = 2 \cos(\theta) \sin(\theta). \]
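The same method gives the triple angle formulas. As a further worked instance (not part of the original example), take n = 3 in de Moivre's formula and expand with the binomial theorem, using $i^2 = -1$ and $i^3 = -i$:
\begin{align*}
\cos(3\theta) + i \sin(3\theta) &= (\cos(\theta) + i \sin(\theta))^3 \\
&= \cos(\theta)^3 + 3i \cos(\theta)^2 \sin(\theta) - 3 \cos(\theta) \sin(\theta)^2 - i \sin(\theta)^3.
\end{align*}
Separating real and imaginary parts:
\[ \cos(3\theta) = \cos(\theta)^3 - 3 \cos(\theta) \sin(\theta)^2, \qquad \sin(3\theta) = 3 \cos(\theta)^2 \sin(\theta) - \sin(\theta)^3. \]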
Example 32.7.12. As $e^{i\theta} \cdot e^{-i\theta} = 1$ we find that
\[ 1 = (\cos(\theta) + i \sin(\theta))(\cos(-\theta) + i \sin(-\theta)) = (\cos(\theta) + i \sin(\theta))(\cos(\theta) - i \sin(\theta)) \]
(cos is an even function and sin is an odd function; this is well known but is also
evident from their Taylor expansions). Expanding the product on the right gives the identity
\[ \cos(\theta)^2 + \sin(\theta)^2 = 1. \]