(Textbooks in Mathematics) Mark Hunacek - Introduction To Number Theory-Chapman and Hall - CRC (2023) PDF
Introduction to Number Theory covers the essential content of an introductory
number theory course including divisibility and prime factorization,
congruences and quadratic reciprocity. The instructor may also choose from
a collection of additional topics.
Aligning with the trend toward smaller, essential texts in mathematics,
the author strives for clarity of exposition. Proof techniques and proofs are
presented slowly and clearly.
The book employs a versatile approach to the use of algebraic ideas.
Instructors who wish to put this material into a broader context may do so,
though the author introduces these concepts in a non-essential way.
A final chapter discusses algebraic systems (like the Gaussian integers)
presuming no previous exposure to abstract algebra. Studying general
systems helps students to realize unique factorization into primes is a more
subtle idea than may at first appear; students will find this chapter interesting,
fun and quite accessible.
Applications of number theory include several sections on cryptography
and other applications to further interest instructors and students alike.
Textbooks in Mathematics
Series editors:
Al Boggess, Kenneth H. Rosen
https://www.routledge.com/Textbooks-in-Mathematics/book-series/CANDHTEXBOOMTH
Introduction to Number
Theory
Mark Hunacek
First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known
or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.
com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@
tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003318712
Typeset in Palatino
by codeMantra
This book is dedicated to Leslie, Adrienne and Sofia,
Preface
Author
1 Divisibility
1.1 The Principles of Well-Ordering and Mathematical Induction
Exercises
1.2 Basic Properties of Divisibility
Exercises
1.3 The Greatest Common Divisor
Exercises
1.4 The Euclidean Algorithm
Exercises
1.5 Primes
Exercises
1.6 Numbers to Different Bases
Exercises
Challenge Problems for Chapter 1
Preface
One inclusion of an algebraic idea that I could not resist occurs in the
section on greatest common divisors. I have always had a fondness for
proving the existence of the gcd by using ideals, so, in Chapter 1, I define
that concept (for the integers only), prove that any ideal in the integers is
generated by a single element, and use that result to quickly prove, in one
fell swoop, that the gcd of two integers exists and is a linear combination of
those integers.
This approach to the gcd pays dividends in the final chapter of the book,
which also introduces some algebraic ideas, though in a fairly concrete set-
ting, focusing on specific examples rather than abstract algebraic systems.
This chapter begins with a fairly detailed look at the Gaussian integers, mim-
icking, wherever possible, the various arguments used previously in the text
for the ordinary integers (including the concept of an ideal and using ideals
to prove the existence of a gcd). From the Gaussian integers, we proceed to
other quadratic extensions, including a discussion of algebraic systems in
which unique factorization fails, thus showing the students that unique fac-
torization is a more subtle concept than might have originally been thought.
As a very pleasant additional benefit, studying other algebraic systems can
actually be used to prove results about the ordinary integers. As seen in the
text, for example, the Gaussian integers can actually be used to prove results
about sums of two squares of integers and also used to classify Pythagorean
triples. Studying the quaternions allows a proof that any positive integer can
be written as the sum of four squares. Over the years, I have found that my
students find this material to be interesting, fun and quite accessible. And
here again, the instructor has some discretion in determining whether to use
algebraic terms like “ring” and “field”; I have written the book so as to accom-
modate either choice.
In writing this book, I have resisted the urge to discuss a plethora of top-
ics, most of which will never be gotten to in a one-semester introductory
course. I find it discouraging to use a book as a text for a course and then only
cover half (or less) of it. Students, I think, don’t like this either, particularly
since they are the ones who are paying for the book. Therefore, I have tried
to write a book that covers the essential content of an introductory number
theory course (divisibility and prime factorization, congruences, quadratic
reciprocity) and a collection of topics from which the professor can choose
(perfect numbers, sums of squares, Pythagorean triples, primitive roots and,
as previously noted, a chapter on algebraic systems other than the integers).
Because I invariably had computer science majors in my class, and because
the math majors also generally found it interesting, I also have included
some optional material on cryptography. All told, there is probably a little
more material in the text than can be covered in a one-semester course, but
not so much more as to be discouraging. The last time I taught the course, I
covered in one class period a selection of material from Chapter 0, and then
did Chapters 1 through 6 in their entirety (weaving in, as appropriate, the
three Appendices). This left me enough time to cover a substantial amount of
0
Introduction: What Is Number Theory?
Carl Friedrich Gauss, whom many people consider the greatest mathema-
tician who ever lived, once described number theory as the “Queen of
Mathematics”. Indeed, the integers (“whole numbers”), and the patterns
they exhibit, have been the subject of fascination and study for literally thousands
of years. Euclid’s famous treatise The Elements, which is often thought
of as being solely related to geometry, actually contains many results that are
theorems of number theory. In Chapter 1 of this text, for example, we will
give Euclid’s proof that there are infinitely many prime integers.
For our purposes, the term “number theory” will (mostly) refer to the study
of the integers and various issues connected with them. Unlike many areas
of mathematics, where the problems and conjectures themselves (let alone
the proofs) are so technical that one has to be a specialist in the area to even
understand them, many questions in number theory do not involve technical
terms or results and can be understood by a grade-schooler. In this introduc-
tory chapter, we will look at some examples of these problems, so as to try
and give a sense of the flavor of the subject.
First, let us start with one of the most famous problems in mathematics:
Fermat’s Last Theorem. This is one of the great success stories of mathematics,
and it also has a fascinating history that dates back to the Pythagorean theo-
rem: if a right triangle has sides of length x and y and hypotenuse of length z,
then $x^2 + y^2 = z^2$. From a number-theoretic point of view, it is of interest to look
for positive integer solutions to this equation, such as x = 3, y = 4 and z = 5. Later
in the text, we will find the general form of all such solutions.
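Before the general form is derived, a few solutions can be found by brute force. The following is only an exploratory sketch: the function name pythagorean_triples and the search bound are illustrative choices, not anything taken from the text.

```python
from math import isqrt

def pythagorean_triples(limit):
    """Return all (x, y, z) with x <= y and z <= limit such that x^2 + y^2 = z^2."""
    triples = []
    for x in range(1, limit + 1):
        for y in range(x, limit + 1):
            total = x * x + y * y
            z = isqrt(total)  # integer square root, no floating-point rounding
            if z <= limit and z * z == total:
                triples.append((x, y, z))
    return triples

print(pythagorean_triples(20))
```

With the bound 20, the search finds (3, 4, 5) together with a handful of others, some of which are just multiples of smaller solutions, such as (6, 8, 10).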
Mathematicians love to generalize from one problem to another, and when
you’ve considered $x^2 + y^2 = z^2$, it is not too big a leap to consider more general
equations like $x^n + y^n = z^n$, where n > 2 is a positive integer. Equations like
these are called Diophantine equations, in honor of the Greek mathematician
Diophantus, who wrote a textbook titled Arithmetica in which he discussed
solutions of many equations.
In the mid-1600s, a lawyer and amateur mathematician named Pierre
Fermat was reading Diophantus’s book and wrote in the margin that he had
discovered a “marvelous proof” that “this margin is too narrow to contain”
that for positive integers n > 2, the above-mentioned equation $x^n + y^n = z^n$ has
no solution in positive integers x, y and z.
Whether Fermat actually did have such a proof is something that will
never be known for sure, but most authorities believe that he did not. In any
event, this cryptic marginal reference led to a search, lasting for more than
300 years, for a proof of this result. Correct proofs were given for some spe-
cific values of n (Fermat himself proved the result for n = 4, and Euler proved
the result for n = 3 by a different method) but nobody came up with a proof
that worked for all n. On the other hand, nobody could come up with a coun-
terexample showing the result to be false. Some mathematicians, including
some very good ones, thought they had come up with a proof, but errors
were always found. Sometimes these errors themselves shed some light on
subtle points, such as the uniqueness of factorization into primes.
Finally, in 1993, Andrew Wiles announced that, after 7 years of intense
effort, he had found a proof of the result. Unfortunately, Wiles’ proof was
found to contain an error, but that error was, in collaboration with his stu-
dent Richard Taylor, eventually patched up in 1994. The proof, published in
1995, uses very deep and difficult mathematics that is far beyond the scope of
this text, and which did not even exist in Fermat’s time.
As just noted, Fermat’s equation $x^n + y^n - z^n = 0$ is an example of a Diophantine
equation. More generally, a Diophantine equation in k variables $x_1, x_2, \dots, x_k$
is an equation of the form $p(x_1, x_2, \dots, x_k) = 0$, where p is a polynomial in these
variables with integer coefficients. The study of such polynomial equations,
which is a major part of number theory, itself naturally leads to several ques-
tions, including questions of interest to computer scientists. Specifically, one
might ask: Is there an algorithm for determining whether a given Diophantine
equation has a solution in integers or rational numbers? If so, is there an
algorithm for determining all such solutions? If we can’t determine all solutions,
can we at least determine some? The first of these questions, posed for integer
solutions, is known as Hilbert’s Tenth Problem, so named because it was the tenth of 23
then-open problems identified by the mathematician David Hilbert in a famous
speech that he gave in Paris in the year 1900. In 1970 it was shown, via the
collaborative efforts of several mathematicians, that no such algorithm exists
for integer solutions; the corresponding question for rational solutions remains open.
This result shows just one way in which number theory intersects with other
areas of mathematics (here, logic).
Fermat’s Last Theorem and Hilbert’s Tenth Problem are at least mathemati-
cal problems that were eventually solved. There are other problems in num-
ber theory that remain unsolved to this date. A number of them are also easy
to state. As a first example, let us say that a positive integer n is perfect if the
sum of its factors (other than n itself) is equal to n. For example, 6 is perfect,
because 1 + 2 + 3 = 6. There are, as of this writing, only 51 known perfect num-
bers, all of which are even. This suggests two questions: Are there any odd
perfect numbers? Are there infinitely many even perfect numbers? Nobody
knows the answer to either of these questions. However, as we will see, we
can at least tell what an even perfect number looks like: we’ll prove this later,
but the answer is hinted at in exercises 0.4 and 0.5.
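The definition of a perfect number translates directly into a short search. This is a minimal sketch: the helper names proper_divisor_sum and is_perfect are hypothetical, and the naive divisor loop is only suitable for small n.

```python
def proper_divisor_sum(n):
    """Sum of the positive factors of n other than n itself."""
    return sum(d for d in range(1, n) if n % d == 0)

def is_perfect(n):
    """True if the positive integer n equals the sum of its proper factors."""
    return n > 0 and proper_divisor_sum(n) == n

# Search the positive integers below 1000.
small_perfect = [n for n in range(1, 1000) if is_perfect(n)]
print(small_perfect)
```

The search below 1000 turns up only 6, 28 and 496, all of them even, which is consistent with the remarks above.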
Here’s another example of a problem that is also easy to state but currently
unsolved. Start with a positive integer n (your choice!); now define a new pos-
itive integer, which we will call n’, as follows: if n is even, n’ = n/2; if n is odd,
n’ = 3n + 1. Now, having defined n’, use this same recipe to define (n’)’, and so
0.1 Exercises
0.1. Find several prime numbers that can be written in the form $n^2 + 1$,
for some positive integer n. The question of whether there are infinitely
many such primes is also unsolved to this day. The question
of whether there are infinitely many primes of the form $n^3 + 1$ is,
however, easily resolved. Explain. What about primes of the form
$n^2 - 1$? Explain again.
1
Divisibility
DOI: 10.1201/9781003318712-2
far. The Well-Ordering Principle, stated below (and also taken as an axiom in
Appendix B), makes this intuition precise.
Well-Ordering Principle: Any nonempty set S of positive integers contains
a smallest element, i.e., an element x with the property that x ≤ y for all y ∈ S.
We immediately point out a trivial restatement of this principle: any non-
empty set S of nonnegative integers has a smallest element. This is because
if 0 is an element of S, then it is clearly the smallest element of it; if 0 is not
an element of S, then S consists of positive integers and the Well-Ordering
Principle applies.
We motivated the Well-Ordering Principle by noting that there is a small-
est positive integer, namely 1. This is certainly something that most readers
of this book will be happy to take on faith as a “given”, based on their years
of prior acquaintance with the set of integers. Yet, it is not something that was
assumed as an axiom and, it turns out, can be proved easily as a consequence
of the Well-Ordering Principle. Because of the simplicity of the proof, and
because it illustrates how to use the Well-Ordering Principle, we take the
time to prove it precisely rather than just slip it under the rug.
Theorem 1.1.1
Theorem 1.1.2
Proof. Assume, hoping for a contradiction, that there is a positive integer that
is not in S. Then, by the Well-Ordering Principle (applied to the nonempty set
of all such integers) there must be a smallest positive integer not in S; call it
k. Note that k ≠ 1 (because 1 ∈ S), so k − 1 is a positive integer (note that we
are using Theorem 1.1.1 here!) and because it is smaller than k, must be in S.
However, by assumption, since k − 1 ∈ S, it must be the case that k = (k − 1) + 1 ∈ S,
a contradiction. This contradiction yields the desired result.
The Principle of Mathematical Induction is typically used as a proof tool. If
asked to prove that a certain statement is true for all positive integers n, one
first proves it is true for n = 1 and then, assuming it is true for n, proves it true
for n + 1. In the language of the preceding theorem, we let S be the set of all
positive integers n for which the result is true; the proofs just discussed then
show that S is the set of all positive integers, and we are done.
We illustrate this method with a simple example: we will prove that the
sum of the first n positive integers is equal to n(n + 1)/2. For n = 1, this is obvi-
ous because (1 × 2)/2 = 1. So, now assume the result is true for n, and let us
examine the sum of the first n + 1 positive integers; we want to prove this is
equal to (n + 1)(n + 2)/2. Well,
1 + 2 + … + n + (n + 1) =
(1 + 2 + … + n) + (n + 1) =
n(n + 1)/2 + (n + 1) =
(n(n + 1) + 2(n + 1))/2 =
(n + 1)(n + 2)/2.
Suppose that
Exercises
1.1 Prove by induction that the sum of the first n positive odd integers
is equal to $n^2$.
1.2 Prove by induction that $2^n > n$ for every positive integer n.
1.3 Prove by induction that a set with n elements has $2^n$ subsets.
1.4 If n is a nonnegative integer, define the nth Fermat number $F_n$ to be
$2^{2^n} + 1$. Use mathematical induction to prove that, for every n,
$F_0 F_1 \cdots F_n + 2 = F_{n+1}$. (We will see Fermat numbers again. They pop up in
unexpected places in mathematics, including the geometric ques-
tion of when a regular polygon with n sides can be constructed
with compass and straightedge alone.)
1.5 Find the error in this fallacious “proof” that all billiard balls have
the same color: “We will prove that, for any positive integer n, in
any set of n billiard balls, they all have the same color. This is obvi-
ous if n = 1. Now assume the result is true for n and consider a set of
n + 1 billiard balls; let us denote them by $B_1, \dots, B_{n+1}$. By our inductive
assumption, all the balls in the set of n balls $\{B_1, \dots, B_n\}$ have the
same color; without loss of generality, let us say the color is black.
Now consider the set $\{B_2, \dots, B_{n+1}\}$. This is also a set of n balls and so
they must have the same color as well. But this color must be black
because $B_2$ is in both sets. So all the billiard balls $B_1, \dots, B_{n+1}$ have
the same color, and we are done.”
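Before attempting an induction proof such as the one requested in Exercise 1.4, it can be reassuring to check the claimed identity numerically. The sketch below assumes the standard definition of the nth Fermat number as 2^(2^n) + 1; the function name fermat is an illustrative choice, and a finite check of this kind is of course not a proof.

```python
def fermat(n):
    """Return the nth Fermat number, 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

first_five = [fermat(n) for n in range(5)]
print(first_five)

# Check that F_0 * F_1 * ... * F_n + 2 == F_(n+1) for n = 0, ..., 5.
product = 1
for n in range(6):
    product *= fermat(n)
    assert product + 2 == fermat(n + 1)
```

Python integers have arbitrary precision, so the rapidly growing values cause no overflow here.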
Theorem 1.2.1
(as you no doubt know from grade school) that we can divide an integer a by
a positive integer b to obtain a quotient q and remainder r, with r being a non-
negative integer less than b. The precise statement of this result is called the
Division Algorithm (a slight misnomer, since it is not really an “algorithm”
in the usual sense). Although we stated earlier that we will simply assume
as known all the basic facts about integer addition and multiplication, this
fact involves division and, we think, is best proved, particularly because the
proof provides a nice use of the Well-Ordering Principle.
Theorem 1.2.2
(Division Algorithm) If a and b are integers, with b positive, then there exist
unique integers q and r such that a = bq + r and 0 ≤ r < b.
Proof. We first prove that such q and r exist and will then prove that they
are unique. First note that if b ∣ a, then existence is obvious (with r = 0). So we
assume without loss of generality (to prove existence) that b does not divide
a. Let S be the set of all positive integers of the form a − bx, as x ranges over the
integers. It is obvious that S is nonempty; any x that is less than a/b will give
an element of S. So, by Well-Ordering, S contains a smallest element, which
we will call r. By definition of r, it can be written as a − bq for some integer q.
So we know a = bq + r; it therefore suffices to prove 0 ≤ r < b, and since we know
r is positive, we need only show that r < b. We can’t have r = b because then
b would divide a, contrary to our assumption. (Why?) If we had r > b, then
a − b(q + 1) = r − b would be positive, less than r, and in the set S, a contradiction.
So in fact r < b, and we have shown the existence of a quotient and remainder
satisfying the conditions of the theorem.
We next prove uniqueness. Specifically, we prove that if a = bq1 + r1 = bq2 + r2,
with both r1 and r2 nonnegative and less than b, then r1 = r2 and q1 = q2. It suf-
fices to prove that r1 = r2, as simple algebra then would establish q1 = q2 . We use
a proof by contradiction, and assume instead, without loss of generality, that
r1 > r2. We then have, by some more simple algebra, r1 − r2 = b(q1 − q2). Since
b and r1 − r2 are both positive, q1 − q2 must be positive as well, which would
then imply that the right-hand side of this equation is at least equal to b. But
this is a contradiction, because r1 − r2 ≤ r1 < b. This concludes the proof.
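The quotient and remainder of the theorem are easy to compute in practice. As a minimal sketch, Python's built-in divmod returns a pair satisfying the conditions of the theorem whenever b is positive, including when a is negative; the wrapper name divide is a hypothetical choice made for this illustration.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, for a positive integer b."""
    if b <= 0:
        raise ValueError("b must be a positive integer")
    q, r = divmod(a, b)  # floor division, so r is never negative when b > 0
    return q, r

for a in (17, -17, 20, 0):
    q, r = divide(a, 5)
    print(f"{a} = 5({q}) + {r}")
    assert a == 5 * q + r and 0 <= r < 5
```

Note the case a = -17: the quotient is -4 and the remainder is 3, not a quotient of -3 with remainder -2, because the theorem requires a nonnegative remainder.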
The division algorithm is a powerful tool in number theory, and we will
begin putting it to use in the very next section, where we discuss the greatest
common divisor of two integers. For the moment, though, we record an easy
but important consequence of it. If b = 2, then the division algorithm tells us
that any integer a can be written as either 2q or 2q + 1, but not as both. Integers
of the first type are (as we mentioned above) called even, and integers of the
second type are called odd. The uniqueness of the quotient and remainder in
the Division Algorithm tells us that no integer can be both even and odd. This
surely does not come as a surprise to you, but it’s nice to know it “officially”. It
is easy to see that the product of two odd integers is odd (see the exercises). In
particular, if n is an integer and $n^2$ is even, then n must be even. Just these basic
facts allow us to prove a result of immense historical significance, namely that
$\sqrt{2}$ is irrational. (Recall that a rational number is a fraction; more precisely, it
is a number of the form a/b, where a and b are integers and b ≠ 0.)
Our proof of the irrationality of $\sqrt{2}$ gives a preview of the concept of greatest
common divisor, discussed in the next section.
Theorem 1.2.3
$\sqrt{2}$ is irrational.
Proof. Suppose, hoping for a contradiction, that we could write $\sqrt{2} = a/b$, where
a and b are positive integers. We can and do assume this fraction is in “lowest
terms”, meaning that a and b have no positive divisors in common except 1.
(We can always divide both a and b by any common divisors greater than 1
without changing the value of the fraction.) Squaring both sides and clearing
denominators gives $2b^2 = a^2$. This equation tells us that $a^2$ is even, and hence,
by the observation above, this means that a is even, say a = 2k. Squaring and
substituting gives $2b^2 = 4k^2$ or $b^2 = 2k^2$, which implies that $b^2$, and hence b, is
even. But if a and b are both even, then the fraction a/b is not in lowest terms, a
contradiction.
The historical importance of this result can be traced back to at least as far
as the ancient Greeks, who believed that any two lengths were commensu-
rable, i.e., both a multiple of some other length. If that were true, though, then
it would be true for both the side of a unit square and its diagonal (which by
the Pythagorean theorem has length $\sqrt{2}$) and that would mean that $\sqrt{2}$ is
rational. The realization that the side and diagonal of a square are not com-
mensurable had important historical consequences and led Greek geometers
to separate the concepts of number and segment and to develop an intricate
“theory of proportions” to deal with these issues.
The Division Algorithm also underlies the idea of writing integers to “dif-
ferent bases”. This is discussed in Section 1.6 at the end of this chapter, but
this section can be read now if desired.
Exercises
If a and b are integers, not both 0, then the greatest common divisor of a and
b, denoted gcd(a, b), is the positive integer d with the following properties:
So, the gcd of a and b is not only greater than or equal to any other common
divisor of these integers, it is a multiple of that common divisor. For example,
if a = 8 and b = 20, then the only positive common divisors of a and b are 1, 2
and 4; thus, gcd(8, 20) = 4, which is a multiple of the only other common divi-
sors, 1 and 2.
As noted above, it is not immediately obvious that gcd(a, b) always exists.
What is clear (see the exercises) is that if the gcd of a and b exists, it is unique.
(This is why we refer in the definition to “the positive integer d”.) We will
(i) 0 is an element of I
(ii) if a and b are elements of I, so is a + b (closure under addition)
(iii) if a is an element of I and b is any integer whatsoever, ab is an element
of I (“super-closure” under multiplication)
Note the difference between conditions (ii) and (iii): condition (ii) is ordinary
closure under addition (that if two integers are both in I, then so is their sum),
but condition (iii) is a stronger condition (hence the ad-hoc term “super-clo-
sure”, which is not standard terminology): not only is I closed under multi-
plication in the ordinary sense, but I contains any product where even one of
the terms in the product is in I.
Examples of ideals are easy to give: the two “trivial” ones are {0} and Z
itself. For a less trivial example, let k be any integer and denote by $\langle k \rangle$ the set
of all multiples km of k. It is easy to verify that this is an ideal, called the principal
ideal generated by k. Observe that this example includes the two previous
ones as special cases, since clearly $\{0\} = \langle 0 \rangle$ and $Z = \langle 1 \rangle$.
As it happens, the principal ideals in Z are all the ideals of Z.
Theorem 1.3.3
Theorem 1.3.4
If a and b are integers, not both zero, then the gcd of a and b exists and is a
linear combination of a and b.
Proof. Let I = {ax + by : x, y ∈ Z} be the set of all linear combinations of a and
b. Note that both a = a·1 + b·0 and b = a·0 + b·1 are in I, and at least one of these
integers is nonzero. Since it is also easy to see (check this!) that I is an ideal,
I is principal by the previous theorem and therefore consists precisely of the
multiples of some integer d. It follows from our previous remarks that we can
assume that d > 0 (why?). We will prove that d is the gcd of a and b. Since it is
obvious that d is also a linear combination of a and b by the way it is defined,
this will complete the proof.
To show that the positive integer d is the gcd of a and b, we first observe
that d divides both a and b. This follows from the observation, made in the
previous paragraph, that I contains both a and b, and every element of I is a
multiple of d by the way it is defined.
Finally, suppose k also divides both a and b. Then it is clear that k also
divides any linear combination of a and b. But one such linear combination is
d itself. So k ∣ d , and this completes the proof.
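The content of this theorem can be observed experimentally. The sketch below looks at a finite window of the set I of linear combinations ax + by and checks that its smallest positive element agrees with the gcd and divides every other element in the window. The function name linear_combinations and the bound of 20 on |x| and |y| are assumptions made for the illustration; math.gcd is Python's standard library function.

```python
from math import gcd

def linear_combinations(a, b, bound=20):
    """All values a*x + b*y with |x|, |y| <= bound (a finite window into I)."""
    return {a * x + b * y
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)}

a, b = 114, 102
window = linear_combinations(a, b)
d = min(v for v in window if v > 0)  # smallest positive combination found

print(d, gcd(a, b))
print(all(v % d == 0 for v in window))
```

For a = 114 and b = 102 the smallest positive combination in the window is 6 (for instance, 114·9 + 102·(−10) = 6), and every value in the window is a multiple of it. A finite window is only evidence, not a proof; the proof is the argument with ideals given above.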
Although this theorem guarantees the existence of the greatest common
divisor, the proof given above does not provide a method for actually finding
it in a particular case. In the next section, however, we will discuss a useful
algorithm for computing the gcd of two integers and expressing it as a linear
combination of these integers. For the moment, we content ourselves with
a simple example: suppose we wish to find the gcd of 114 and 102. We can
begin by listing the positive divisors of 102: 1, 2, 3, 6, 17, 34, 51 and 102. Of these integers,
we see that 1, 2, 3 and 6 also divide 114. So 6 is the gcd, and, consistent with
the theorem, all the other common divisors (1, 2 and 3) divide it.
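The listing method used in this example is easy to mechanize, although it is far too slow for large integers. In the sketch below, the names positive_divisors and gcd_by_listing are hypothetical helpers introduced only for this illustration.

```python
def positive_divisors(n):
    """Return the positive divisors of a nonzero integer n, in increasing order."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def gcd_by_listing(a, b):
    """Return (gcd, common divisors) for nonzero a and b by listing divisors of a."""
    common = [d for d in positive_divisors(a) if b % d == 0]
    return max(common), common

print(positive_divisors(102))
g, common = gcd_by_listing(102, 114)
print(g, common)

# Every common divisor divides the gcd, as the theorem predicts.
assert all(g % d == 0 for d in common)
```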
Two integers whose greatest common divisor is equal to 1 are called rela-
tively prime. If integers a and b are relatively prime, then the previous result
establishes that 1 can be written as a linear combination of a and b. It is quite
easy to see that the converse of this result is also true.
Theorem 1.3.5
Two integers a and b, not both 0, are relatively prime if and only if the equa-
tion ax + by = 1 has a solution in integers x and y.
Proof. As noted, if gcd (a, b) = 1, then Theorem 1.3.4 guarantees the existence
of integers x and y satisfying ax + by = 1. For the converse, suppose this equa-
tion has a solution in integers x and y. Let d denote the gcd of a and b. Then
by our basic list of properties of divisibility, it is clear that d divides ax + by.
But this means that d divides 1, which (since d is positive) implies that d = 1.
Hence, a and b are relatively prime.
Students frequently misinterpret the previous theorem and misread it as
saying that “a and b have gcd n if and only if the equation ax + by = n has a
solution in integers x and y”. This is false: the equation 2x + 3y = 2 has the
obvious solution x = 1, y = 0 but the gcd of 3 and 2 is not 2; it is 1. The true state
of affairs is given by the following theorem, the proof of which is similar to
that of Theorem 1.3.5, and which includes Theorem 1.3.5 as a special case
(because the only positive integer that divides 1 is 1).
Theorem 1.3.6
For two integers a and b, not both 0, the equation ax + by = n has a solution in
integers x and y if and only if the gcd of a and b divides n.
Proof. Exercise.
It is worth noting that the integers x and y in the previous theorem are not
unique. Indeed, suppose that the equation ax + by = n has a solution (X, Y). Let
d denote the greatest common divisor of a and b, and consider the integers
x’ = X + (b/d)t and y’ = Y − (a/d)t, where t is any integer whatsoever. It is easy
to show by direct calculation that x’ and y’ are also solutions to the equation
ax + by = n. We will finish this section by proving that these are all the solu-
tions—i.e., that any solution is of this form, for some integer t. We first need
some preliminary results that are important in their own right.
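The direct calculation mentioned above can be spot-checked numerically. The sketch below uses the illustrative values a = 6, b = 4, n = 10 with the particular solution (X, Y) = (1, 1); these numbers, and the function name shifted_solution, are assumptions chosen for the example rather than taken from the text.

```python
from math import gcd

def shifted_solution(a, b, X, Y, t):
    """Given a*X + b*Y == n, return (X + (b/d)t, Y - (a/d)t) where d = gcd(a, b)."""
    d = gcd(a, b)
    return X + (b // d) * t, Y - (a // d) * t

a, b, n = 6, 4, 10
X, Y = 1, 1  # one particular solution: 6(1) + 4(1) = 10
assert a * X + b * Y == n

for t in range(-3, 4):
    x, y = shifted_solution(a, b, X, Y, t)
    print(t, (x, y), a * x + b * y)
    assert a * x + b * y == n
```

Each value of t gives another solution because the two extra terms, a(b/d)t and −b(a/d)t, cancel.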
Theorem 1.3.7
If a, b and c are integers, a ∣ bc, and a and b are relatively prime, then a ∣ c.
Proof. Since a and b are relatively prime, we can express 1 as a linear combination
of them: 1 = ax + by. Multiplying both sides of this equation by c gives
c = acx + bcy. Now notice that because a ∣ bc, a divides the second summand on
the right-hand side of this equation; it also obviously divides the first sum-
mand, and hence it divides their sum, which is c.
Theorem 1.3.8
Theorem 1.3.9
If $m_1, \dots, m_n$ are integers, any two of which are relatively prime, and each $m_i$
divides an integer c, then the product $m_1 \cdots m_n$ also divides c.
Proof. Induction on n. We defer the proof, however, to Section 1.5, after we
have developed some more machinery. (The reader might wish to try prov-
ing this now; what problem arises?)
We can now, as promised, finish up the discussion of roots of the equation
ax + by = n.
Theorem 1.3.10
Exercises
Theorem 1.4.1
Suppose a and b are nonzero integers, and a = bq + r. Then gcd (a, b) = gcd (b, r).
Proof. Let us denote gcd (a, b) by d. We will show that d is also the greatest
common divisor of b and r by showing that it satisfies the defining proper-
ties of that greatest common divisor. We already know that d is positive, so
it suffices to show that d divides both b and r, and also that d is a multiple of
any other divisor of b and r. Both of these, however, follow from our basic
properties of divisibility. Since d divides both a and b, it divides a − bq = r. So
d divides b and r. Also, if k is any divisor of b and r, then k also divides bq + r,
which is a. As a divisor of a and b, k divides d.
We will use this theorem in a second to see that the algorithm works. But
first, we must describe the algorithm.
Euclidean Algorithm. Given two positive integers a and b with, say, a > b,
apply the Division Algorithm repeatedly, each time dividing the previous
divisor by the previous remainder, until we get a remainder of 0:
a = bq1 + r1
b = r1q2 + r2
r1 = r2q3 + r3
⋮
rn = rn+1qn+2
which results in the conclusion that rn+1, the last nonzero remainder, is the
greatest common divisor of a and b.
Before proceeding further, we should perhaps call attention to a possible
issue: our statement of the algorithm used the phrase “until we get a remain-
der of 0”. Must we, in fact, always eventually get a zero remainder? The answer
is yes: note that by construction of the integers rt, we get a strictly decreasing
sequence of nonnegative integers: the second equation above gives r2 < r1, the
third gives r3 < r2, etc. A strictly decreasing sequence of nonnegative integers
can’t go on forever; it must eventually reach 0.
We will shortly prove that the Euclidean Algorithm does, in fact, produce
the gcd of a and b, but before doing so, it would be helpful to work through a
specific example. Suppose, for example, that we want to compute the gcd of
a = 824 and b = 260. We perform the following calculations:
824 = 260(3) + 44
260 = 44(5) + 40
44 = 40(1) + 4
40 = 4(10)
which produces 4 (the last nonzero remainder) as the gcd of 824 and 260.
To see that this method works in general, refer back to the general system
of equations listed above, and note that, by Theorem 1.4.1, we have:
gcd (a, b) = gcd (b, r1) = gcd (r1, r2) = … = gcd (rn, rn+1) = gcd (rn+1, 0) = rn+1,
where, for the last equality, we used the (very simple) result of Exercise 1.20.
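For readers who like to experiment, the procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names euclid_gcd and euclid_steps are our own, and we assume nonnegative integer inputs that are not both zero.

```python
def euclid_gcd(a, b):
    """Return gcd(a, b) for nonnegative integers a and b, not both zero."""
    # Replace (a, b) by (b, r), where r is the remainder of a divided by b.
    # By Theorem 1.4.1 this replacement does not change the gcd.
    while b != 0:
        a, b = b, a % b
    # The last nonzero remainder is the gcd.
    return a


def euclid_steps(a, b):
    """Return the division steps as (dividend, divisor, quotient, remainder)."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r
    return steps


print(euclid_gcd(824, 260))
for dividend, divisor, q, r in euclid_steps(824, 260):
    print(f"{dividend} = {divisor}({q}) + {r}")
```

Running this on 824 and 260 reproduces the four divisions of the worked example and reports 4 as the gcd.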
Because 4 is the greatest common divisor of 824 and 260, we know that it
can be expressed as a linear combination of these two integers: i.e., there exist
integers x and y such that 4 = 824x + 260y. The calculations in the Euclidean
Algorithm allow us, by proceeding backward, to actually find integers x and
y that work. The trick is to start at the penultimate equation, solve for 4 as a
linear combination of the previous remainders and keep “backward solving”
until we arrive at a linear combination of the original integers. Observe:
4 = 44 − 40(1)
= 44 – (260 – 44(5))
= 44(6) – 260
= 824(6) – 260(19).
Thus, we have expressed 4 as a linear combination of 824 and 260. (The pru-
dent thing to do, of course, would be to check the right-hand side with a
calculator to make sure that it gives us the value 4; it does.)
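The backward substitution above can also be organized as forward bookkeeping, an arrangement usually called the extended Euclidean algorithm. The sketch below (the name extended_gcd is our own, and nonnegative inputs are assumed) carries along coefficients so that every remainder is always expressed as a linear combination of a and b. It is a different arrangement of the same arithmetic, not the back-solving method itself.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g == a*x + b*y."""
    # Invariants: old_r == a*old_x + b*old_y  and  r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(824, 260)
print(g, x, y)
print(824 * x + 260 * y == g)
```

For a = 824 and b = 260 this returns the triple (4, 6, -19), in agreement with 4 = 824(6) – 260(19).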
Let’s try one more example. Suppose we want to find the gcd of a = 234 and
b = 63. We get
234 = 63(3) + 45
63 = 45(1) + 18
45 = 18(2) + 9
18 = 9(2)
so the gcd is 9 (the last nonzero remainder). Solving backward as before, we get
9 = 45 – 18(2)
= 45 – (63 – 45(1))(2)
= 63(−2) + 45(3)
= 63(−2) + (234 – 63(3))(3)
= 63(−11) + 234(3),
which we can also check to be correct. (The right-hand side is −693 + 702.)
We close this section by stating, without proof, a result that might be of
interest to readers who care about the computational complexity of
algorithms. It is known as Lamé’s theorem.
Theorem 1.4.2
Exercises
1.27 Use the Euclidean Algorithm to find the gcd of 1024 and 342. Then
express this gcd as a linear combination of 1024 and 342.
1.5 Primes
In this section we begin the study of prime integers, the properties of which,
as noted in Chapter 0 of this text, have fascinated people for literally
centuries. We begin with the definition.
Definition 1.5.1
An integer p > 1 is called prime if its only positive divisors are 1 and p. An
integer n > 1 that is not prime is called composite.
An equivalent way to define a prime integer is to say that p > 1 is prime if
and only if whenever p = ab, a product of positive integers, one of a or b must
be 1 and the other must be p.
Note that, by definition, 1 is not a prime, even though its only divisors are itself
and 1. There are technical reasons why we want to exclude 1 as a prime; we’ll
see one of these when we discuss the Fundamental Theorem of Arithmetic,
shortly. The first few primes are 2, 3, 5, 7, 11, 13 and 17. Note that all primes,
other than 2, are odd—that’s obvious, because any even integer greater than 2
is divisible by 2 and hence cannot be a prime. Note also that if p is a prime and
p does not divide the integer a, then p and a must be relatively prime.
We can put this last observation to good use immediately by proving a
result generally known as Euclid’s Lemma.
Theorem 1.5.2
If p is a prime and a and b are integers with the property that p ∣ab, then p ∣a or p ∣b.
Proof. This is just a special case of Theorem 1.3.8. Make sure you under-
stand why and can write the reason out in a clear, coherent sentence.
An easy induction argument (which we leave as an exercise) allows us to
extend this result to the case of an n-fold product.
Theorem 1.5.3
If p is a prime and a1, …, an are integers with the property that p divides
the product a1a2…an, then p ∣ai for some i.
Proof. Exercise.
We can also use Euclid’s Lemma to clean up a loose end from Section 1.3,
namely the proof of Theorem 1.3.10. For convenience, we restate the theorem
and prove it:
Theorem 1.3.10
If m1, …, mn are integers, any two of which are relatively prime, and each mi
divides an integer c, then the product m1…mn also divides c.
Proof. Induction on n. The case n = 2 is just Theorem 1.3.9. If we know the
result is true for n and want to prove it for n + 1, just note that the product
m1…mn and mn+1 are relatively prime: if the gcd of these two integers were
greater than 1, it would have a prime factor p, but then p would divide mn+1
and, by Theorem 1.5.3, one of the mi (i = 1, 2, …, n), contradicting the
assumption that mi and mn+1 are relatively prime. Once we know that m1…mn and
mn+1 are relatively prime, and that both divide c (the first by the induction
hypothesis), the result follows from Theorem 1.3.9.
Euclid’s Lemma allows us to prove a result called the Fundamental Theorem
of Arithmetic, which states that any integer greater than 1 can be written as a
product of primes (we consider a single prime to be a product of one term) and
that, moreover, this expression is unique, except for the order of the factors.
For example, 15 is 3 times 5, and that is the only way (other than 5 times 3) to
write 15 as a product of primes. The uniqueness of the prime factorization is
actually a rather subtle phenomenon and plays a significant role in algebra. It
turns out, for example, that there are algebraic “number systems” where there
is a factorization into primes, but that the factorization need not be unique.
A brief taste of these ideas will be provided in Chapter 6 of this book. Note,
however, that if 1 were a prime, we would not have uniqueness of prime fac-
torization, since 15 could also be written as 5 × 3 × 1, a factorization involving
different primes than 3 × 5. This is one of the technical reasons, mentioned
earlier, why it is more convenient to not consider 1 as a prime.
We can now state and prove the Fundamental Theorem of Arithmetic.
Theorem 1.5.4
Any integer n > 1 can be written as a product of one or more primes. Moreover,
this representation is unique, except for the order of the terms.
Proof. We first prove the existence of such a product. (We will give a proof
using Well-Ordering, but the reader may wish to supply an alternative proof
using Strong Induction.) If this result is false, then by Well-Ordering, there
is a smallest positive integer n > 1 that cannot be so written. Obviously, this
means that n itself is not prime, so we can write n = ab, where a and b are both
positive integers that are both strictly greater than 1 and strictly less than n.
This means, by the way n was chosen, that the integers a and b can both be
written as the product of one or more primes, but then clearly n = ab can also
be so written, a contradiction.
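The existence argument above is not constructive, but a factorization is easy to find by machine for small n. The following is a minimal sketch using trial division, which is our own choice of method (the proof itself uses Well-Ordering); the name prime_factors is hypothetical.

```python
def prime_factors(n):
    """Return the prime factors of n > 1 in nondecreasing order."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    # Any composite n has a divisor d with d*d <= n, so we can stop there.
    while d * d <= n:
        while n % d == 0:
            factors.append(d)   # d is prime: smaller factors were removed
            n //= d
        d += 1
    if n > 1:
        factors.append(n)       # whatever remains is itself prime
    return factors


print(prime_factors(15))
print(prime_factors(360))
```

Listing the factors in nondecreasing order gives one fixed representative of the factorization, so two integers have the same output exactly when they are equal.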
We next prove the more subtle and interesting part of the theorem,
namely that the decomposition of n as a product of primes is unique. More
precisely, we prove that if r = p1p2…pm = q1q2…qn, then m = n and every pi = qj
for some j. From the equation p1p2…pm = q1q2…qn, we immediately conclude
Theorem 1.5.5
Theorem 1.5.6
If m and n are positive relatively prime integers, then mn is a kth power if and
only if both m and n are.
Proof. Exercise.
As our final result about primes in this section, we use the fact that any
integer greater than 1 has a prime divisor to give a famous proof of the fact
that there are infinitely many primes. This proof was known to Euclid; it
appears (in less modern terminology, of course) in his famous treatise The
Elements, which dates back to roughly 300 B.C. It is a masterpiece of math-
ematical reasoning: short, beautifully elegant, easy to understand and
insightful. Every mathematics major should see this proof before he or she
graduates from college.
Theorem 1.5.7
Theorem 1.5.8
Theorem 1.5.9
Theorem 1.5.10
Exercises
Theorem 1.6.1
Let b > 1 be any integer. Then any nonnegative integer m can be written
uniquely as a sum an b^n + … + a1 b + a0, where each ai satisfies 0 ≤ ai < b.
We won’t prove this result because we won’t really use it in the remain-
der of this book, and the proof is rather dry and technical. However, it is
worth noting that the proof really amounts to repeated use of the Division
Algorithm, so we will give a few examples illustrating this fact.
Before giving these examples, though, a few brief remarks are in order.
First, the case b = 10 just gives the ordinary decimal expansion of a num-
ber, where each ai can be an integer from 0 to 9. Also, when the number
m is written in the form specified above, it is traditional to write
m = (an an−1 … a0)_b, and we say that m has been written in the base b. Finally, when
b = 2, each ai is either 0 or 1, so when we write m out in base 2 we just get a
sum of distinct powers of 2. In other words, every coefficient in the base 2
expansion of an integer is 0 or 1. Computer science majors will no doubt
recognize the significance of writing any nonnegative integer as a string
of 0s and 1s.
We now illustrate how to write a nonnegative integer in base b and also
illustrate how the Division Algorithm is related to this idea. Suppose, for example,
that we want to write the number 1000 in base 2. We first determine the largest
power of 2 that is less than or equal to 1000; in this case, that is clearly 512 = 2^9.
So we divide 1000 by 512, getting a quotient of 1 and remainder 488. Now we
ask: what is the largest power of 2 that does not exceed 488? That’s 256, or 2^8.
Divide 488 by 256 and we get a quotient of 1 and remainder 232. (Note, at this
point, that we are never going to get a quotient greater than 1, because then a
higher power of 2 would be less than or equal to the number that we are
dividing into.) The highest power of 2 that does not exceed 232 is 128 = 2^7.
Applying the Division Algorithm again, we get 232 = 128 + 104. Keep this up: we
get 104 = 64 + 40, 40 = 32 + 8 and 8 = 8 + 0. Putting everything together, we get:
1000 = 512 + 256 + 128 + 64 + 32 + 8, or, putting it another way:
1000 = 2^9 + 2^8 + 2^7 + 2^6 + 2^5 + 2^3. (All the “missing” powers of 2
appear with coefficient 0.) So, in base 2, we have 1000 = (1111101000)_2.
As another example, let us write 1000 in base 5. Reasoning similar to that
used above gives 1000 = 625 + 375 = 625 + 3·125 = 5^4 + 3·5^3 = (13000)_5.
It is worth pointing out explicitly that since the base b can be any integer
greater than 1, there is no reason why it can’t be greater than 10. If we were
to express, for example, 121 in base 12, we would start with
121 = 10·12^1 + 1·12^0 and might be tempted to write 121 = (101)_12. But this is
wrong, because (101)_12 = 12^2 + 1 = 145, not 121. So, instead of writing 10 or 11
as a coefficient, we need to invent a symbol for each of these, say, T and E. In
that case, 121 = (T1)_12.
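The base b digits can also be produced by dividing by b itself over and over and collecting the remainders, which come out from the last digit to the first. This is a slightly different (but equivalent) use of the Division Algorithm from the "largest power first" approach in the examples. The sketch below uses a hypothetical function name, to_base, and returns the digits as a list of integers so that bases above 10 need no special symbols.

```python
def to_base(m, b):
    """Return the base-b digits of m, most significant digit first."""
    if b <= 1 or m < 0:
        raise ValueError("need b > 1 and m >= 0")
    if m == 0:
        return [0]
    digits = []
    while m > 0:
        m, r = divmod(m, b)   # r is the next digit, starting from a0
        digits.append(r)
    return digits[::-1]       # reverse so the leading digit comes first


print(to_base(1000, 2))
print(to_base(1000, 5))
print(to_base(121, 12))
```

For 121 in base 12 the result is the list [10, 1]; the digit 10 is what the text writes as T.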
Exercises
1.44 Write the number 248 in base 2, base 3 and base 12.
C1.1 Given ten consecutive integers, prove that one must be relatively
prime to the other 9.
C1.2 Find all positive integer solutions x, y to the equation x^y = y^x.
C1.3 Prove that an integer n > 1 has an odd number of divisors if and
only if n is a square.
C1.4 If n is a positive integer, what are the possible values of gcd(n + 1,
n^2 – n + 1)?
C1.5 If p > 3 and p + 2 are both primes, prove that 12 divides 2p + 2.
C1.6 If n > 1 is an integer, prove that 1 + 1/2 + … + 1/n is not an integer.
C1.7 Let m and n be positive integers with gcd d. Prove that the number
mn/d is a multiple of m and n, and divides any other multiple of m
and n. We call this number the least common multiple of m and n,
and denote it [m, n].
2
Congruences and Modular Arithmetic
Let us begin with a simple motivating problem: If today is Monday, what day
will it be 200 days from now? At first glance this seems like a cumbersome
problem to answer, until we recognize a simple trick: 7 days from now, it will
be Monday again. It will also be Monday 14 days from now, 21 days from
now, etc. In particular, it will be Monday 196 days from now, because 196 is a
multiple of 7. So, since we “reset the clock” at day 196, the original question
posed is equivalent to asking what day it will be 4 days from now, and the
answer to that is obvious: Friday.
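The same "reset the clock" computation is exactly what the remainder operator does. As a small sketch (the list DAYS and the function day_after are our own names):

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def day_after(start, offset):
    """Return the weekday that falls `offset` days after `start`."""
    # Only the remainder of the total on division by 7 matters.
    return DAYS[(DAYS.index(start) + offset) % 7]


print(200 % 7)
print(day_after("Monday", 200))
```

Here 200 leaves remainder 4 on division by 7, so the function reports Friday, as in the text.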
The idea here (counting until we get to a certain number, then “equating”
that number with 0, and resuming counting from scratch) is simple but very
important in number theory. It is formalized in the definition of congruence
modulo n, the concept to which we now turn.
Definition 2.1.1
Theorem 2.1.2
DOI: 10.1201/9781003318712-3
Proof. We prove (iii), and leave the similar (but easier) proofs of (i) and (ii) to the
reader. To prove (iii), note that we are given that n ∣ a – b and n ∣ b – c. It follows
from our basic properties of divisibility that n ∣ (a – b) + (b – c) or, equivalently,
n ∣ a – c. But this is precisely the same as saying that a ≡ c (mod n), which is
what we wanted to prove.
There is a relationship between congruence and the Division Algorithm.
Suppose we apply the Division Algorithm to the integers a and n, getting
a quotient q and remainder r: a = nq + r. Then a – r = nq, so n divides a – r.
Thus, every integer is congruent, mod n, to its remainder when divided by
n. In particular, every integer is congruent to one of the integers 0, 1, …, n
– 1. Moreover, no two different integers between 0 and n – 1 are congruent
mod n (why?) so we have shown that every integer is congruent, mod n,
to exactly one integer between 0 and n – 1. Another way to say this is to
say that the set of integers {0, 1, …, n – 1} is a complete set of residues mod n.
There are, of course, other complete sets of residues. If, for example, n = 5,
then not only is {0, 1, 2, 3, 4} a complete set of residues, but so is {10, 21, 67,
103, 14}.
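Whether a given list of integers is a complete set of residues mod n can be checked mechanically: there must be exactly n of them, and their remainders must be 0, 1, …, n – 1 in some order. A minimal sketch, with a function name of our own choosing:

```python
def is_complete_residue_system(numbers, n):
    """True if `numbers` contains exactly one member of each class mod n."""
    remainders = {x % n for x in numbers}
    return len(numbers) == n and remainders == set(range(n))


print(is_complete_residue_system([0, 1, 2, 3, 4], 5))
print(is_complete_residue_system([10, 21, 67, 103, 14], 5))
print(is_complete_residue_system([0, 1, 2, 3, 8], 5))
```

The third list fails because 3 and 8 are congruent mod 5, so the class of 4 is never hit.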
Readers familiar with equivalence relations know that, given such a rela-
tion, we can define the equivalence class of an element of the set on which
the relation is defined. In the case of congruence modulo n as the given
equivalence relation, we call these congruence classes. The precise definition
is as follows.
Definition 2.1.3
If n is a positive integer and a is any integer, then the congruence class deter-
mined by a, denoted [a]n, is the set of all integers that are congruent to a
modulo n.
If the integer n is understood, it is common to suppress the subscript in this
notation and just write [a]. So, for example, if n = 2 then [0] is the set of all even
integers and [1] is the set of all odd integers. (Make sure you understand why
this is so.) If n = 6, then [3] = { …, –9, –3, 3, 9, 15, …}. Our next theorem merely
restates a familiar property about equivalence relations in general, but for
those readers who are not familiar with this general theory, we state and
prove the result from scratch.
Theorem 2.1.4
Proof. We will prove, equivalently, that if two congruence classes have even
one element in common, then they are equal as sets. So, suppose [a] and [b] are
two congruence classes modulo n that have the integer k in common. We will
show [a] = [b] by showing that each of these sets is a subset of the other. To show
that [a] ⊆ [b], let x be an arbitrary element of [a]. Then, by definition, x ≡ a (mod n).
But since k is another element in [a], we know that a ≡ k (mod n). Hence, by
transitivity, x ≡ k (mod n). But k is also in [b], so k ≡ b (mod n). By transitivity again,
x ≡ b (mod n), which is just another way of saying that x is an element of [b].
Hence, [a] ⊆ [b]. The proof of the reverse inclusion is identical (or we could just
say that the result follows from symmetry, since a and b are interchangeable).
This concludes the proof.
Since it is obvious from the definition that a ∈ [a] (this is just the reflexive
rule, rephrased), it follows from the previous theorem that [a] = [b] if and
only if a ≡ b (mod n). This is because if a ≡ b (mod n), then a ∈ [b]; in that case,
the congruence classes [a] and [b] both have the element a in common, and
hence are equal. Conversely, if [a] = [b], then, since a ∈ [a], it is also the case
that a ∈ [b], from which a ≡ b (mod n) follows. This is a simple property
of congruence classes, but an important one, and one that deserves to be
stated explicitly.
Also, because a ∈ [a], the union of the distinct congruence classes modulo
n is all of Z. Therefore, we can sum up some of the results of this section as
follows: if n is a positive integer, and [a] denotes the congruence class of a
modulo n, then any integer x lies in one, and only one, of the congruence
classes [0], [1], …, [n – 1].
We denote the set of these n congruence classes by Z n. Thus, for example,
Z 2 = {[0], [1]}, with [0] being the set of even integers and [1] the set of odd inte-
gers. In the next section of this book, we will define operations of addition
and multiplication on the set Z n, thereby turning this set into a “miniature
arithmetic system” (more precisely referred to, in algebraic terminology, as a
“commutative ring with identity”; see Appendix C).
We end this section by discussing, in the spirit of the original motivating
example, an easy but amusing application of congruences. Specifically, we
want to prove that any year contains at least one Friday the 13th. We will
do this under the assumption that our year has exactly 365 days (i.e., is not
a leap year) but the same argument works, with minor arithmetic changes,
for leap years as well. So, let us start with any non-leap year. January 13 of
this year will fall on a certain day; let us number this day 0, and number the
remaining days of the week, in order, 1, 2, …, 6. (So, for example, if January
13 is a Wednesday, then Thursday is day 1, Friday is day 2, etc.) Now, what
day is February 13? Because January has 31 days and 31 is congruent to 3
mod 7, February 13 is day 3. So is March 13, because March 13 occurs 28 days
after February 13, and 28 is congruent to 0 mod 7. Proceeding in this way,
we can determine what day of the week the 13th of each month falls on.
After doing so, we see that all of the numbers from 0 to 6 appear as days of
the week on which this happens. Since one of these numbers corresponds
to Friday, it follows that there is a Friday the 13th in this year, and the result
is proved.
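The month-by-month bookkeeping in this argument is easy to carry out by machine. The sketch below (the function name thirteenth_offsets is our own) numbers the weekday of January 13 as 0 and lists the weekday number of the 13th of every month, for either a common year or a leap year.

```python
def thirteenth_offsets(leap=False):
    """Weekday numbers (mod 7) of the 13th of each month, with Jan 13 = 0."""
    month_lengths = [31, 29 if leap else 28, 31, 30, 31, 30,
                     31, 31, 30, 31, 30, 31]
    offsets = [0]
    # The 13th of the next month comes one full month length later.
    for length in month_lengths[:-1]:
        offsets.append((offsets[-1] + length) % 7)
    return offsets


common = thirteenth_offsets()
print(common)
print(sorted(set(common)) == list(range(7)))
print(sorted(set(thirteenth_offsets(leap=True))) == list(range(7)))
```

In both kinds of year every number from 0 to 6 occurs, which is the fact the proof relies on.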
Exercises
2.2 Arithmetic in Z n
Fix a positive integer n, and let, as in the previous section, Z n = {[0], [1], …, [n – 1]}.
We want to define operations of “addition” and “multiplication” in Z n, prefer-
ably in such a way as to maintain the “nice” properties of arithmetic (associative
law, commutative law, etc.) that hold for the integers, or at least as many of these
properties as we can. There is a pretty obvious way to approach this, namely
by defining
[ a ] + [b ] = [ a + b ] (*)
but there is a fairly subtle potential problem that we must deal with in order
for this definition to be “legal”. The problem arises from the fact that the con-
gruence class [a] does not uniquely determine a.
To illustrate the problem, suppose a student, Joe, wants to compute [2] + [7]
in Z 8. On the one hand, he can apply definition (*) above and get [9], or [1],
since we want our answer to be an element of Z 8. On the face of it, this is
pretty straightforward. But now suppose that another student, Alice, also
does this computation, but instead of writing [2], she writes [18]. There’s
nothing wrong with this, because, in Z 8, [18] is equal to [2]. And suppose also
that instead of writing [7], she writes [–1]. Again, there’s nothing wrong
with this, because in Z 8, [7] and [–1] are exactly the same objects. So, Alice,
following the rule given in (*) above, will compute the sum to be [18] +
[–1] = [17]. Fortunately, [17] turns out to be the same object as [1], so Alice
winds up with the same answer as Joe. If she hadn’t, however, definition (*)
would be worthless, because two people, both computing the sum correctly,
would have arrived at different answers, based simply on the fact that they
chose different representatives of the same object to compute with.
So, to define addition of congruence classes by the formula (*), we must show
that what happened here is not a lucky coincidence and that it is always the case
that this definition of addition is “independent of the representative” that we use
to denote the congruence classes. In mathematics, this condition is expressed by
saying that definition (*) is well-defined. Fortunately, it is quite simple to prove
that addition is, indeed, well-defined; this is the content of our next theorem.
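Before the proof, it may help to see the claim tested numerically. The sketch below replays Joe's and Alice's computations and then tries several representatives of every pair of classes. It is a finite spot check under our own naming (residue, sums_agree), not a substitute for the proof.

```python
def residue(a, n):
    """Return the representative of [a] lying between 0 and n - 1."""
    return a % n


def sums_agree(n, shifts=range(-3, 4)):
    """Check that [a] + [b] does not depend on the representatives tried."""
    for a in range(n):
        for b in range(n):
            expected = residue(a + b, n)
            for s in shifts:
                for t in shifts:
                    # a + s*n and b + t*n are other names for [a] and [b].
                    if residue((a + s * n) + (b + t * n), n) != expected:
                        return False
    return True


print(residue(2 + 7, 8), residue(18 + (-1), 8))
print(sums_agree(8))
```

Both students arrive at the class of 1, and no choice of representatives in the tested range changes any sum.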
Theorem 2.2.1
Theorem 2.2.2
As pointed out in Appendix C, this theorem says that Z n, with respect to the
operation of addition, is an abelian group.
The additive inverse of an element [a] in Z n is denoted − [a]. This allows us
to define subtraction in Z n: [a] − [b] is simply defined to be the sum of [a] and − [b].
[ a ][ b ] = [ ab ] (**)
Theorem 2.2.3
Theorem 2.2.4
In the language of Appendix C, the properties set out in this theorem and
Theorem 2.2.2 say that Z n is, with respect to the operations of addition and
multiplication, a commutative ring (or, depending on which book you read, a
commutative ring with identity).
We thus have created a miniature “arithmetic system” that shares some
properties with the ring of integers, but which in some other ways is
quite different from this ring. This new arithmetic system, for example,
is, unlike the set of integers, a finite set. The operations of addition and
multiplication also can act quite differently than the corresponding opera-
tions on the set of integers. For example, note that in the set of integers, it
is not possible to multiply two nonzero integers and get 0. However, in the
set Z n, it is, for certain integers n, quite possible to multiply two nonzero
elements in Z n (“zero” in the set Z n refers to the additive identity [0]) and
get [0]. For example, in Z 8, we have [2][4] = [0]. The elements [2] and [4] are
called zero divisors in Z 8. Our next theorem states exactly when zero
divisors can exist in Z n.
Theorem 2.2.5
Let n > 1. The ring Z n has no (nonzero) zero divisors if and only if n is
prime.
Proof. Suppose first that n is prime and that [a][b] = [0]. We want to show that
either [a] = [0] or [b] = [0]. This, however, follows immediately from Euclid’s
Lemma (Theorem 1.5.2): if [ab] = [0] in Z n, that means n divides ab, but since
n is prime, Euclid’s Lemma says that n divides a or n divides b. This in turn
means either [a] = [0] or [b] = [0], as was to be proved.
For the converse, suppose that n is not a prime. Then we can write n = ab,
where both a and b are strictly between 1 and n. But this means that [0] = [n] = [a]
[b], where [a] and [b] are nonzero elements in Z n; i.e., Z n has zero divisors.
In algebraic language, this theorem says that Z n (like Z) is an integral domain
if and only if n is prime.
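For small n the theorem can be checked by brute force. In the sketch below (the function name zero_divisors is our own), classes are represented by the integers 1, …, n – 1, and a class counts as a zero divisor if some nonzero class multiplies it to [0].

```python
def zero_divisors(n):
    """Return the nonzero a < n for which a*b ≡ 0 (mod n) for some nonzero b < n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


print(zero_divisors(8))
print(zero_divisors(7))
print(zero_divisors(6))
```

The list is empty for the prime 7 and nonempty for the composite moduli 6 and 8, as the theorem predicts.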
Let us compare and contrast Z n and Z in another respect. Suppose we look
for the nonzero integers a in Z that have multiplicative inverses—i.e., an inte-
ger b such that ab = 1. Clearly, the only integers a with this property are ±1.
The situation, however, is different in Z n, where [1], the multiplicative
identity, plays the role that 1 plays in Z. Consider, for example, Z 8, where we can
quickly see that [1], [3], [5] and [7] all have multiplicative inverses; indeed,
each is its own inverse: [3][3] = [9] = [1], [5][5] = [25] = [1] and
[7][7] = [49] = [1]. We can equally quickly see, by simply considering all possible
products, that [2], [4] and [6] do not have multiplicative inverses. See a pattern?
The next theorem spells it out.
Theorem 2.2.6
The nonzero element [a] in Z n has a multiplicative inverse if and only if a and n
are relatively prime.
Proof. First, suppose that [a] has a multiplicative inverse [b]. Then [ab] = [1],
which means (translating into the language of congruences) that ab ≡ 1(mod n).
This means that n divides ab – 1, or, by definition, that ab – 1 = nx for some integer
x. But this means that ab – nx = 1. Since we can express 1 as a linear combination
of a and n, this means that a and n are relatively prime. For the converse, simply
reverse this line of reasoning.
We use the term unit to refer to a nonzero element of Z n that has a
multiplicative inverse. Note that if n is a prime, then every integer a between 1 and
n – 1 is relatively prime to n, and so every nonzero element of Z n is a unit. In
algebraic terminology (see Appendix C once again), Z n is then a field. The ring
of integers, however, is not a field.
We will denote the set of all units in Z n by Z n*. Thus, for example, Z 8* = {[1],
[3], [5], [7]}. Note that this set is not closed under addition but is closed
under multiplication. This is true in general (why?). Z n*, with respect to the
operation of multiplication, is an algebraic structure known as a group (see
Appendix C). Of course, when n is a prime, Z n* is simply the set of all nonzero
congruence classes mod n.
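By Theorem 2.2.6 the units can be listed with a gcd test, and closure under multiplication can then be checked directly. A minimal sketch, with function names of our own choosing:

```python
from math import gcd


def units(n):
    """Return the representatives a in 1..n-1 with gcd(a, n) = 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def closed_under_multiplication(n):
    """Check that the product of any two units mod n is again a unit."""
    u = set(units(n))
    return all((a * b) % n in u for a in u for b in u)


print(units(8))
print(closed_under_multiplication(8))
print((3 + 5) % 8 in units(8))
```

The last line shows the failure of closure under addition: [3] + [5] = [0], which is not a unit.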
We end this section by briefly addressing two computational questions:
first, if [a] is a unit in Z n, how do we explicitly find its multiplicative inverse?
Since our assumption is that a and n are relatively prime, we know there
are integers x and y such that ax + ny = 1; we also know that we can use the
Euclidean Algorithm to explicitly find x and y. But then (consider this equa-
tion modulo n) it follows immediately that [x] is a multiplicative inverse of [a].
It is also the only one (see Exercise 2.10).
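This recipe translates directly into code. The sketch below repeats a forward-bookkeeping form of the Euclidean Algorithm (our own arrangement, with hypothetical names extended_gcd and mod_inverse, and assuming a and n are positive) to find x with ax + ny = 1, and then reduces x mod n.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g == a*x + b*y."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, n):
    """Return the representative in 0..n-1 of the inverse of [a] in Z_n."""
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a and n are not relatively prime, so [a] is not a unit")
    return x % n   # a*x + n*y = 1 reduces to a*x ≡ 1 (mod n)


print(mod_inverse(5, 11))
print(mod_inverse(3, 8))
```

For example, the inverse of [5] in Z 11 is [9], since 5 · 9 = 45 = 4 · 11 + 1.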
Second, there is a technique (“repeated squaring”) that allows for reasonably
efficient exponentiation modulo a given number. Recall that any positive
integer can be written in “base 2” as a sum of powers of 2. For example,
37 = 32 + 4 + 1 = 2^5 + 2^2 + 2^0. Suppose we wanted to compute 5^37 modulo 17.
Computing the 37th power of a number is no fun, but we can avoid that by
using congruences. Begin by repeatedly squaring 5 and reducing mod 17:
5^2 ≡ 8
5^4 ≡ 13 ≡ −4
5^8 ≡ 16 ≡ −1
5^16 ≡ 1
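Squaring once more gives 5^32 ≡ 1, and since 37 = 32 + 4 + 1 we get 5^37 = 5^32 · 5^4 · 5 ≡ 1 · (−4) · 5 = −20 ≡ 14 (mod 17). The same bookkeeping can be sketched in code by reading off the binary digits of the exponent from the right. The function name power_mod is our own; Python's built-in three-argument pow performs the same task and is used here only as a cross-check.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1 % modulus
    square = base % modulus          # holds base^(2^k) mod modulus
    while exponent > 0:
        if exponent & 1:             # this power of 2 occurs in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1               # move to the next binary digit
    return result


print(power_mod(5, 37, 17))
print(power_mod(5, 37, 17) == pow(5, 37, 17))
```

Only about as many squarings are needed as there are binary digits in the exponent, rather than 36 separate multiplications.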
Exercises
[ a ][ x ] = [b ]
ax ≡ b ( mod n)
ax – ny = b
for some integer y. We know from Section 1.3 that this equation has a solu-
tion in integers x and y if and only if the greatest common divisor of a and n
divides b. We thus have the following result, which really is just a restatement
of previously established results in new language: the equation [a][x] = [b] in Z n
has a solution if and only if the greatest common divisor of a and n divides b.
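The criterion is easy to test against a brute-force search for small moduli. In the sketch below both function names are our own; solve_linear_congruence simply tries every x from 0 to n – 1.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in 0..n-1 with a*x ≡ b (mod n), found by exhaustive search."""
    return [x for x in range(n) if (a * x - b) % n == 0]


def is_solvable(a, b, n):
    """The criterion from the text: gcd(a, n) must divide b."""
    return b % gcd(a, n) == 0


print(solve_linear_congruence(6, 4, 10), is_solvable(6, 4, 10))
print(solve_linear_congruence(6, 3, 10), is_solvable(6, 3, 10))
```

Here gcd(6, 10) = 2 divides 4 but not 3, so the first congruence has solutions and the second has none.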
In fact, we can say more. In Section 1.3, we learned not only when a solution
to ax – ny = b exists, but also what the general form of a solution looks like. We
Theorem 2.3.1
x ≡ 4 (mod 5)
x ≡ 7 (mod 11)
We know that 4 is the unique solution mod 5 to the first equation, but we also
have to consider things mod 11 to satisfy the second. We also know (by the
previous theorem) that the general solution to the first equation is 4 + 5t for
some integer t. So, if we want an x that satisfies both equations, the sensible
thing to do is plug 4 + 5t into the second, thus getting
4 + 5t ≡ 7 (mod 11) or
5t ≡ 3 (mod 11).
Theorem 2.3.2
If m and n are relatively prime positive integers, and a and b are any two
integers, then the two congruence equations x ≡ a (mod m) and x ≡ b (mod n)
have a simultaneous solution, and any two solutions are congruent mod mn.
Proof. As above, consider any integer of the form a + mt, where t is an inte-
ger. Any integer of this form satisfies x ≡ a (mod m); we want to show that
there is an integer t such that this integer also satisfies x ≡ b (mod n). In other
words, we ask: can we find a t such that a + mt ≡ b (mod n)? This amounts to
asking whether there is a t for which mt ≡ b – a (mod n), and we know the
answer to this question is “yes” because m, being relatively prime to n, has
a multiplicative inverse mod n, so we can solve for t by multiplying by this
multiplicative inverse. Once we find t, then a + mt is a simultaneous solution
to the system of congruence equations. The uniqueness of this solution mod
mn follows from the fact that the difference between any two solutions is
divisible by m and by n, and hence (because m and n are relatively prime) by
mn, just as in the example above.
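The proof is constructive, and its steps can be followed literally in code: solve mt ≡ b – a (mod n) for t by multiplying by the inverse of m, then form a + mt. The sketch below uses our own function name crt_pair and assumes Python 3.8 or later, where pow(m, -1, n) returns the inverse of m mod n (and raises ValueError when m and n are not relatively prime).

```python
def crt_pair(a, m, b, n):
    """Return the x in 0..mn-1 with x ≡ a (mod m) and x ≡ b (mod n)."""
    m_inv = pow(m, -1, n)          # inverse of m mod n (Python 3.8+)
    t = ((b - a) * m_inv) % n      # solves m*t ≡ b - a (mod n)
    return (a + m * t) % (m * n)


x = crt_pair(4, 5, 7, 11)
print(x, x % 5, x % 11)
```

For the system x ≡ 4 (mod 5), x ≡ 7 (mod 11) this gives t = 5 and x = 29, the unique solution mod 55.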
We can now state and prove the more general version of the Chinese
Remainder Theorem, which involves a system of k equations rather than just
two. We could actually use Theorem 2.3.2 to prove the more general case, but
it seems advisable to give a more constructive proof.
Theorem 2.3.3
Suppose that n1, …, nk are k positive integers, any two of which are relatively
prime. Suppose also that a1, …, ak are any k integers. Then there is an integer
x that simultaneously satisfies each of the following congruences:
x ≡ a1 (mod n1)
⋮
x ≡ ak (mod nk).
Moreover, any two such solutions are congruent mod N = n1…nk.
Proof. We will prove the last sentence first. If x and y are two simultaneous
solutions to this system of congruences, then, by the basic equivalence
relation properties of the congruence relation, it must be the case that
x ≡ y (mod ni) for each i between 1 and k. In other words, each ni divides x – y.
But because the ni are pairwise relatively prime, their product N also divides
x – y (Theorem 1.3.10); that is, x ≡ y (mod N).
We now prove that such an x exists. First, let Ni = N/ni. In other words, Ni
is simply the product of all the n’s except for ni. Observe that Ni and ni are
relatively prime: if some prime p divided Ni, then by Euclid’s Lemma p would
have to divide some nj (j ≠ i) and hence p could not, by our assumption of
pairwise relative primeness, also divide ni.
Because Ni and ni are relatively prime, Ni has a multiplicative inverse xi
mod ni: i.e., xiNi ≡ 1 (mod ni). Now let x = a1x1N1 + … + akxkNk. If i is any integer
between 1 and k, then every summand of x except aixiNi is congruent to 0
mod ni (why?) and this one summand is congruent to ai mod ni. Hence, x is
congruent to ai mod ni and is therefore a simultaneous solution to the system
of congruences.
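The construction in this proof can be carried out step by step in code. The sketch below uses the names N, N_i and x_i from the proof; the function name crt is our own, and we assume Python 3.8 or later for math.prod and for pow(N_i, -1, n_i), which computes the inverse of N_i mod n_i.

```python
from math import prod


def crt(residues, moduli):
    """Solve x ≡ a_i (mod n_i) for pairwise relatively prime moduli n_i."""
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = N // n_i               # product of all the moduli except n_i
        x_i = pow(N_i, -1, n_i)      # x_i * N_i ≡ 1 (mod n_i)
        x += a_i * x_i * N_i         # ≡ a_i mod n_i, ≡ 0 mod the other n_j
    return x % N                     # reduce to the representative in 0..N-1


x = crt([1, 2, 3], [2, 3, 5])
print(x, x % 2, x % 3, x % 5)
```

For the system x ≡ 1 (mod 2), x ≡ 2 (mod 3), x ≡ 3 (mod 5) the sum is 15 + 20 + 18 = 53, which reduces to 23 mod 30.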
We close this section with a bit of history: the term “Chinese Remainder
Theorem” memorializes an ancient Chinese document, likely dating back to the
late 3rd century, called the Mathematical Manual. In this document, Sun Tze poses
the problem of finding an integer that leaves a remainder of 2 when divided by
3, a remainder of 3 when divided by 5 and a remainder of 2 when divided by 7.
Sun Tze provides an answer; you will be offered the opportunity to provide one
in the exercises that follow.
Exercises
Definition 2.4.1
Theorem 2.4.2
Now that we know how ϕ treats primes, it is only natural to ask how ϕ
treats prime powers. There’s a simple answer.
Theorem 2.4.3
Theorem 2.4.4
Theorem 2.4.5
If n1, …, nk are k positive integers, any two of which are relatively prime, then
ϕ (n1 … nk) = ϕ (n1) … ϕ (nk).
If the prime factorization of an integer n is known, this result allows an
easy calculation of ϕ (n).
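For small integers the multiplicative property can be checked by counting. The sketch below assumes the standard definition of ϕ (n) as the number of integers between 1 and n that are relatively prime to n, which agrees with the count of units in Z n used in this section; the function name phi is our own.

```python
from math import gcd


def phi(n):
    """Count the integers a with 1 <= a <= n and gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print(phi(8))
print(phi(3 * 4 * 5), phi(3) * phi(4) * phi(5))
print(phi(4), phi(2) * phi(2))
```

The last line shows why the hypothesis matters: 2 and 2 are not relatively prime, and ϕ (4) = 2 differs from ϕ (2) ϕ (2) = 1.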
Theorem 2.4.6
Of course, it still remains to prove Theorem 2.4.4. We tie up this loose end
now. We can assume both m and n are greater than 1, as otherwise the result
is trivial. We will prove the theorem by counting units in the sets Z n, Z m and
Z mn. (To help keep things straight, we will use subscripts to keep track of con-
gruence classes relative to different moduli.) Recall that if G and H are sets,
then G × H, the Cartesian product of G and H, denotes the set of all ordered
pairs whose first component comes from G and whose second component
comes from H. We assume the reader knows that if G has m elements and H
has n elements, then G × H has mn elements. In particular, Z m × Z n has mn ele-
ments, the same number of elements as the set Zmn.
Define a function T from Z mn to the Cartesian product Z m × Z n as follows: if
[a]mn denotes an arbitrary element of Z mn, let T([a]mn) = ([a]m, [a]n). First observe
that this function is clearly well-defined (see the definition in Section 2.2);
this follows from the fact that two integers congruent mod mn are also con-
gruent mod m and mod n. Moreover, this function is onto: if we start with
any two residue classes [a]m and [b]n in Z m and Z n, respectively, then by the
Chinese Remainder Theorem there is an integer x that is congruent to a mod
m and to b mod n; it follows that T([x]mn) = ([a]m, [b]n). Since a function from
a set onto another set of the same size must also be 1-1, it follows that T is a
bijection from Z mn to Z m × Z n.
What we are really interested in, however, is the way T acts on the subset
of units of Z mn. Let us denote this set as Z mn* and use similar notation for Z
m and Z n. We claim that T is a bijection from Z mn* to the set Z m* × Z n*. Since
the set Z mn* has ϕ (mn) elements and the set Z m* × Z n* has ϕ (m) ϕ ( n) elements,
this will prove the result.
We first observe that T actually maps Z mn* into the set Z m* × Z n*. This is
pretty clear: if [a]mn is an element of Z mn*, then a is relatively prime to mn. But
then a is also relatively prime to both m and n, so [a]m is an element of Z m* and
[a]n is an element of Z n*.
We next show that T is 1-1. Of course, this follows from the bijectivity of T
as a function from Z mn to Z m × Z n (why?) but let us give a simple direct proof.
Suppose T([a]mn) = T([b]mn). Then by definition of T, we must have [a]m = [b]m and
[a]n = [b]n. But then, since m and n are relatively prime, it follows that [a]mn = [b]mn,
as was to be proved.
Finally, we show that T, as a function from Z mn* into the set Z m* × Z n*, is
onto. Let us start with arbitrary residue classes [a]m and [b]n in Z m* and Z n*.
We know (see above) that there is an integer x such that T([x]mn) = ([a]m, [b]n).
We know [x]mn is an element of Z mn, but is it an element of Z mn*? In fact, it is:
we know that x is congruent to a mod m, and since a is relatively prime to m,
this means that x must be also. The same reasoning shows that x is relatively
prime to n. But if x is relatively prime to both m and n, and m and n are relatively
prime, then x must be relatively prime to mn, as needed to be shown.
We have shown that T induces a bijection from the set Z mn* onto the set
Z m* × Z n*. The desired result then follows immediately.
Exercises
Definition 2.5.1
Theorem 2.5.2
Proof. We are told that [a^2] = [1], which means (translated into the language
of congruences) that a^2 ≡ 1 (mod p). This in turn means that p divides
a^2 – 1 = (a – 1)(a + 1), and by Euclid’s Lemma, this means that either p
divides (a – 1) or p divides (a + 1). In the first case, [a] = [1] and in the second
case [a] = [–1] = [p – 1].
We can now state and prove Wilson’s Theorem. The proof exploits the mul-
tiplicative structure of Z p.
Theorem 2.5.3
Theorem 2.5.4
a different order. But this means that the product of all the terms listed in (*)
is the same as the product of the terms listed in (**):
[1] = [a^(p − 1)]
Theorem 2.5.5
gives a^10 ≡ 1 (mod 11), from which we immediately conclude that a^560 ≡ 1 (mod 11).
The same reasoning applies for the prime 17 as well, concluding the proof.
A composite integer n, greater than 1, with the property that a^(n – 1) ≡ 1 (mod n)
for all a relatively prime to n, is called a Carmichael number. We have just shown
that 561 is one; there are others. In fact, it was proved in 1994 by Alford, Granville
and Pomerance that there are infinitely many.
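The defining property of a Carmichael number can be checked by brute force for small n. The sketch below is only an illustration (the function name is_carmichael is an assumption made here); it tests every base a relatively prime to n, which is feasible only for small moduli.

```python
from math import gcd, isqrt

def is_carmichael(n):
    """Brute-force check: n is composite and a^(n-1) ≡ 1 (mod n) for every a coprime to n."""
    is_composite = any(n % d == 0 for d in range(2, isqrt(n) + 1))
    if n < 3 or not is_composite:
        return False
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)

# Search a small range; the bound 600 is arbitrary.
found = [n for n in range(2, 600) if is_carmichael(n)]
```

The three-argument pow performs the modular exponentiation without ever forming the enormous integer a^(n−1).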
FLT clearly does not hold if the modulus is not a prime: 2^3 is certainly not
congruent to 1 mod 4, for example. But there is a generalization of FLT, called
Euler’s Theorem, which does hold; it makes use of the Euler phi function
from the last section. As you read the statement of the theorem, note that if n
is a prime, it reduces to FLT.
Theorem 2.5.6
Exercises
x^2 + y^2 = z^2 (P)
Let us now shift focus a little and think of (P) not as a statement about a known
entity but as an equation in the three variables x, y and z. This being a course
in number theory, we are interested in solutions to this equation in positive
integers. Solutions certainly exist, the simplest one being x = 3, y = 4, z = 5. In
this section of the book we will, using modular arithmetic as a helpful tool,
study equation (P) and see if we can determine all solutions to this equation.
We begin with a simple observation: if (x, y, z) is a solution to (P), then so
is (cx, cy, cz) for any positive integer c; after all, if x^2 + y^2 = z^2 then certainly
(cx)^2 + (cy)^2 = (cz)^2. Since such a solution is only a trivial modification
of an existing one, we focus attention on solutions whose entries have no
common positive factor greater than 1.
Definition 2.6.1
Theorem 2.6.2
If (x, y, z) is a PPT, then x and y have opposite parity (i.e., one is even and the
other is odd), and z is odd.
Theorem 2.6.3
If r and s are relatively prime positive integers of opposite parity with r > s, then
(r^2 – s^2, 2rs, r^2 + s^2) is a PPT. Moreover, any PPT (with x odd and y even) is of this
form for some relatively prime positive integers r and s of opposite parity.
So, for example, the choice r = 2, s = 1 yields the familiar PPT (3, 4, 5). Another
familiar example, (5, 12, 13), corresponds to r = 3, s = 2.
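The parametrization in Theorem 2.6.3 is easy to experiment with. The following sketch is illustrative only: the function name ppt is an assumption made here, and it insists on r > s > 0 so that all three entries are positive.

```python
from math import gcd

def ppt(r, s):
    """Return (r^2 - s^2, 2rs, r^2 + s^2) for coprime r > s > 0 of opposite parity."""
    if not (r > s > 0 and gcd(r, s) == 1 and (r - s) % 2 == 1):
        raise ValueError("need r > s > 0, gcd(r, s) = 1, and opposite parity")
    return (r * r - s * s, 2 * r * s, r * r + s * s)

# The two examples from the text, plus one more choice of (r, s)
triples = [ppt(2, 1), ppt(3, 2), ppt(4, 1)]
```

Each triple produced this way can be checked against equation (P) and against the primitivity condition directly.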
Proof of theorem 2.6.3: The easy part of the theorem is showing that any tri-
ple of the desired form is, indeed, a PPT. The fact that this triple satisfies the
Pythagorean equation (P) follows immediately from some tedious, but easy,
high school algebra. The fact that it is primitive follows from the easily shown
fact that any prime p that divides all three terms of the PPT would have to divide
either r or s; then, the fact that p divides r^2 – s^2 would imply that it divides both r
and s, a contradiction. We leave it to the reader, as an exercise, to fill in the details.
We turn now to the less obvious half of the theorem. Starting with an arbitrary
PPT (x, y, z), we must prove the existence of two relatively prime integers
r and s of opposite parity such that (x, y, z) = (r^2 – s^2, 2rs, r^2 + s^2).
As a first step to doing so, recall that we are assuming that x is odd and y
is even. So we can write y = 2t for some integer t. Substituting in equation (P)
gives 4t^2 = z^2 – x^2, or
\[ t^2 = \frac{z - x}{2} \cdot \frac{z + x}{2} \qquad (*) \]
Note that the two factors on the right-hand side of (*) are integers, because x
and z are both odd (so their sum and difference are both even). In fact, they
are relatively prime integers: if, say, a prime p divided both of them, then p
would have to divide their sum and difference, which would imply that x
and z are both divisible by p. But then it would follow from (P) that p divides
y as well, which contradicts the fact that (x, y, z) is a PPT.
We may now appeal to Theorem 1.5.6 to conclude that both (z − x)/2 and
(z + x)/2 are squares:
\[ \frac{z - x}{2} = s^2, \qquad \frac{z + x}{2} = r^2. \]
Simple algebra now confirms that (x, y, z) = (r^2 – s^2, 2rs, r^2 + s^2). So, to finish the
proof of the theorem, we need only show that r and s are relatively prime and
of opposite parity. Both of these observations, however, are practically obvious.
Since r^2 + s^2 = z is odd, it follows that r^2 and s^2, and hence r and s, must
have opposite parity. And if r and s were both divisible by a prime p, then p
would divide each of x, y and z, which isn’t possible. So r and s are relatively
prime, and the proof of Theorem 2.6.3 is complete.
There are other ways to prove this result. In Chapter 6, for example, we give
a proof using the Gaussian Integers, an algebraic system that extends the set
of (ordinary) integers.
Exercises
3
Cryptography: An Introduction
One of the more striking “real world” applications of number theory is the
study of cryptography, which concerns itself with secret communications.
The ability to decipher such communications can have striking global consequences;
the decipherment of the famous Zimmermann telegram (a 1917 communication
from the German Foreign Office to the German ambassador in
Mexico), for example, played a significant role in the decision of the United
States to enter World War I. In this chapter, we give an introduction to this
area of mathematics, emphasizing the role that number theory plays in it.
Entire books (e.g., [R-S]) have been written on this subject, so our chapter-
long treatment of it will necessarily hit only a few high points. Our approach
will be somewhat informal; we will focus on the mechanics of the applica-
tions rather than excessive mathematical formalism.
A B C D… Z
DOI: 10.1201/9781003318712-4
And underneath that, write out some permutation (rearrangement) of the let-
ters. For example, we can write the letters in reverse order:
A B C D… Z
Z Y X W…A
To encrypt a message, simply take every letter of the plaintext and replace it
by the letter that appears underneath it in the list above. For example, BAD
converts to YZW. To decrypt the message, we of course just replace every let-
ter in the ciphertext by the letter that appears immediately above it.
The disadvantages of this “substitution cipher” should be obvious. First, if
we try to maximize security by taking a purely random rearrangement of the
alphabet, we run the risk of having to either memorize the substitution key (a
daunting chore) or writing it down somewhere, which leads to the possibility
of the writing being stolen or otherwise accessed. Also, this kind of substitution
cipher can be broken by frequency analysis. There are tables that list the
relative frequency of each letter in English text. The top ten most frequently
occurring letters in the English language are, in decreasing order of
frequency: E, T, A, O, N, R, I, S, H and D. So,
in a message of any significant size, one might look for the letters that appear
the most often and make guesses as to what they are. There are also tables
that list the relative frequency of two-letter and three-letter groupings, and
frequency analysis can be applied here as well. Once a number of letters have
been filled in, the rest of the message can often be deciphered by figuring out
what makes sense.
Since the substitution method doesn’t seem very useful in practice, let us
bring mathematics into the picture and see how number theoretic ideas can
be used.
we can identify the letter A with 0, B with 1 and so on, until we get to Z,
which is assigned number 25. Encryption under the Caesar cipher simply
amounts to then applying the function
x → x + 3 (mod 26)
and decryption amounts to applying the inverse function
x → x − 3 (mod 26)
Of course, there is nothing magical about the number “3”; it can be replaced
by any other integer between 1 and 25. This is an example of a shift cipher—
we simply shift each letter a certain number of places.
As a cryptosystem, though, shift ciphers have serious defects. For one
thing, if Eve knows the method Alice and Bob are using (and it is good practice
to assume your adversary does know the method, if not the key; this is
called Kerckhoffs’s principle), then all Eve has to do is check 25 numbers and see
which one works. This can easily be done by hand, let alone in seconds with
a computer.
Another problem here is that this method is also subject to attacks by fre-
quency analysis. This is not surprising, since this method is really just an
example of a substitution cipher, with the substitutions being given by a spe-
cific formula.
There is a mild generalization of shift ciphers that, unfortunately, is still
vulnerable to the kinds of attacks mentioned above. This is the idea of an
affine cipher. Whereas a shift cipher deals with functions of the form
x → x + b (mod 26),
an affine cipher deals with functions of the form
x → ax + b (mod 26)
Of course, to ensure that different plaintext letters get mapped into different
ciphertext letters (this is necessary if we want to ensure that Bob can invert
the process), we must ensure that this function is 1-1 and onto. For this to
happen, a must be invertible modulo 26. This means that the integer a must
be relatively prime to 26; the total number of distinct (mod 26) integers that
satisfy this condition is ϕ (26) = 12.
As an illustration, consider the affine cipher given by the key
x → 5 x + 9 ( mod 26 )
Suppose Bob receives the encrypted message JIIXWD. The letter J (the 10th
letter of the alphabet) corresponds to the number 9, so to find the plaintext
letter that corresponds to it, Bob must solve (modulo 26) 9 = 5x + 9, or 0 = 5x. The
only solution to this is x = 0, so A is the letter that encrypts to J. Proceeding in
a similar fashion through the remaining letters of the word, Bob eventually
arrives at the word AFFINE.
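Bob's letter-by-letter work can be sketched in a few lines of Python. This is an illustration only; the function name affine_decrypt is an assumption made here. It inverts x → ax + b by multiplying y − b by the inverse of a mod 26.

```python
def affine_decrypt(ciphertext, a, b, m=26):
    """Invert x -> a*x + b (mod m) letter by letter, with A=0, ..., Z=25."""
    a_inv = pow(a, -1, m)   # exists because a is relatively prime to m
    letters = []
    for ch in ciphertext:
        y = ord(ch) - ord("A")
        letters.append(chr((a_inv * (y - b)) % m + ord("A")))
    return "".join(letters)

# The example from the text: key x -> 5x + 9 (mod 26)
plaintext = affine_decrypt("JIIXWD", 5, 9)
```

Here the inverse of 5 mod 26 is 21, since 5 × 21 = 105 = 4 × 26 + 1.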
However, just as is the case with shift ciphers, affine ciphers are also vul-
nerable to quick searches through all possible values of a and b, as well as
to frequency analysis attack. If Eve can identify the images of two letters,
she can set up a system of linear equations that can be solved modulo 26.
Suppose, for example, that having intercepted a message from Alice to Bob
and knowing that an affine cipher is being used, Eve also knows, or at least
guesses, that E encrypts to B and that T encrypts to A. Eve can then write
down two equations:
1 ≡ 4a + b
0 ≡ 19a + b
(All congruences are mod 26, of course.) Subtracting the second from the
first, we get
1 ≡ −15a
or
1 ≡ 11a
from which we conclude that a = 19, since 11 × 19 = 209 ≡ 1 (mod 26). Now that
we know a = 19, substituting in the first equation gives b ≡ −75 ≡ 3 (mod 26).
So our affine cipher is
x → 19 x + 3 ( mod 26 ) ,
and now (if Eve was originally guessing) all she has to do is check to see if
this works.
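Eve's calculation can be sketched as follows. The function name recover_affine_key is hypothetical, and the sketch assumes that the difference of the two plaintext values is invertible mod 26 (otherwise the key is not determined uniquely this way).

```python
def recover_affine_key(p1, c1, p2, c2, m=26):
    """Solve c1 ≡ a*p1 + b and c2 ≡ a*p2 + b (mod m) for (a, b).

    Assumes p1 - p2 is invertible mod m; otherwise pow raises ValueError.
    """
    a = ((c1 - c2) * pow((p1 - p2) % m, -1, m)) % m
    b = (c1 - a * p1) % m
    return a, b

# E (4) encrypts to B (1) and T (19) encrypts to A (0), as in the example
key = recover_affine_key(4, 1, 19, 0)
```

As the text notes, if Eve was only guessing the two letter pairs, she would still need to test the recovered key against the rest of the ciphertext.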
One way to help foil a frequency attack is to use a cryptosystem in which
a letter in the plaintext does not necessarily always correspond to the same
letter in the ciphertext and vice versa. For example, we might come up with
a system in which the letter A in the plaintext is encrypted to C on one occa-
sion and to X on another. There are several such systems known. Some are
called block ciphers because they operate on blocks of letters rather than individual
letters. The first of these that we will discuss dates back to the 16th
century and is known as the Vigenère cipher.
This works as follows. Alice and Bob agree on a word (say, for purposes
of illustration, DOOR). This word happens to have four letters in it, so the
cipher will operate on blocks of four letters each (or fewer, if we run out of
letters). Translating the word DOOR into its numerical equivalent, we get a
four-component vector (3, 14, 14, 17). Now, given any plaintext message (say,
RETREAT), our first step is to break it into four-letter blocks; since RETREAT
has seven letters, our second block will only have three letters. We get RETR
and EAT. Now, in the first block, we shift R by 3, E by 14, T by 14 and R by 17.
This gives us USHI. In the second block, we shift E by 3, A by 14 and T by 14,
giving us HOH. Alice will encrypt her message, therefore, as USHIHOH. The
most frequently occurring letter in the ciphertext is H, but this tells us noth-
ing, because the first H corresponds to T, the second to E and the third to T.
When Bob gets the ciphertext, he can easily use modular arithmetic to
decrypt it. He must simply subtract instead of add, or, to put it another way,
he must add 23, 12, 12, and 9 (since these numbers are, respectively, the addi-
tive inverses (modulo 26) of 3, 14, 14 and 17).
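The procedure just described can be sketched in Python as follows. The function name vigenere and its decrypt flag are assumptions made here for illustration; indexing the keyword by position mod its length has the same effect as splitting the message into blocks.

```python
def vigenere(text, keyword, decrypt=False):
    """Shift the i-th letter of text by the (i mod len(keyword))-th keyword letter, mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        shift = ord(keyword[i % len(keyword)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + sign * shift) % 26 + ord("A")))
    return "".join(out)

ciphertext = vigenere("RETREAT", "DOOR")
recovered = vigenere(ciphertext, "DOOR", decrypt=True)
```

Subtracting the shift mod 26 is the same as adding its additive inverse, which is why no separate table of inverses appears in the code.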
We should point out that, in practice today, nobody really uses the Vigenère
cipher. Although a straightforward frequency analysis does not work, the
cipher is vulnerable to other kinds of statistical attack, but discussing this
would take us too far afield. Suffice it to say, though, that longer key words
(or phrases) are more secure than shorter ones.
Another example of a block cipher uses arithmetic modulo 26 applied
to matrices and is called the Hill cipher. (This discussion assumes familiar-
ity with matrix multiplication.) Here, Alice and Bob agree beforehand on a
square matrix whose entries are integers modulo 26. (We could work with
integers modulo n, but 26 is customary; it is also common to work with 2 × 2
matrices, and we shall do so in what follows.) In order to be invertible, the
determinant of the matrix must be a unit modulo 26—i.e., must be relatively
prime to 26. To illustrate the procedure, we will take
\[ A = \begin{pmatrix} 7 & 2 \\ 5 & 3 \end{pmatrix} \]
and break it into chunks of two letters each: YO HM HT. The first group of
letters corresponds to (24, 14)A−1, which (of course) is (12, 14). Proceeding in
this way, Bob deciphers the ciphertext and arrives at MONEYZ. He ignores
the placeholder Z and gets MONEY as his secret message.
We should remark that some authors write vectors x as column vectors and
therefore compute Ax rather than xA. This is a matter of personal prefer-
ence; all that matters is that Alice and Bob have an understanding as to what
approach is being used.
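Bob's decryption in the example above can be sketched in Python. This is only an illustration under stated assumptions: the function name hill_decrypt is hypothetical, the row-vector convention xA of this section is used, the key is 2 × 2, and the ciphertext is assumed to have even length. The inverse matrix is obtained from the usual adjugate formula, with division by the determinant replaced by multiplication by its inverse mod 26.

```python
def hill_decrypt(ciphertext, A, m=26):
    """Decrypt with a 2x2 key A, using row vectors: each block is (y1, y2) A^{-1} mod m."""
    (a, b), (c, d) = A
    det_inv = pow((a * d - b * c) % m, -1, m)   # the determinant must be a unit mod m
    # Inverse of A mod m via the adjugate formula
    inv = [[(d * det_inv) % m, (-b * det_inv) % m],
           [(-c * det_inv) % m, (a * det_inv) % m]]
    out = []
    for i in range(0, len(ciphertext), 2):
        y1, y2 = (ord(ch) - ord("A") for ch in ciphertext[i:i + 2])
        x1 = (y1 * inv[0][0] + y2 * inv[1][0]) % m
        x2 = (y1 * inv[0][1] + y2 * inv[1][1]) % m
        out.append(chr(x1 + ord("A")) + chr(x2 + ord("A")))
    return "".join(out)

message = hill_decrypt("YOHMHT", [[7, 2], [5, 3]])
```

A reader using the column-vector convention would multiply by the inverse on the left instead, which amounts to transposing the roles of the rows and columns above.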
Exercises
discuss one of these methods, known as RSA (named for the discoverers
Rivest, Shamir and Adleman). As in the previous section, we are more inter-
ested in an explanation of how the method works and what it has to do with
modular arithmetic than with a rigorous development of the subject, so we
will dispense with mathematical formalisms.
Essentially, the RSA method depends for its validity on this rule of thumb:
multiplying is easy, factoring is hard. By this, we mean that it is computationally
easy to multiply two integers (even two very large ones), but, given a very
large number, there is currently no known computationally fast algorithm
for factoring it. (If somebody were to discover one, the whole face of number
theory, as well as government and modern business, would be dramatically
changed.)
The basic idea is as follows. Alice has a message that she wants to send to
Bob; since words can be converted to integers, we assume the message is a
number, say m. Bob begins by selecting two distinct prime integers p and q
(in practice, both very large) and then multiplying them together to form the
integer n = pq. He also computes ϕ (n) = (p – 1)(q – 1). These computations can be
done easily on a computer, and once they are done Bob has no further need
of the individual primes p and q; he does not even need to let Alice know
what they are. Bob does transmit to Alice what the number n is, but he does
not need to worry that this message might be intercepted: the beauty of the
method is that the whole world can know what n is as well; there is no need
to keep it secret.
Bob also selects an integer e that is relatively prime to ϕ (n) and transmits
this number to Alice as well. Alice then computes m^e (mod n); this is the
ciphertext (call it c) that is transmitted by Alice to Bob. (This computation
can be done quickly on a computer; in fact, there are modular arithmetic
calculators available online that instantly produce an answer when the base,
exponent and modulus are entered.)
By the way e was selected, it has a multiplicative inverse modulo ϕ(n); call
it d. To decrypt the message, Bob simply computes c^d (mod n), which can also
be done quickly on a computer. Euler’s Theorem guarantees that Bob’s computation
recovers the number m, at least when m is relatively prime to n. To see why,
observe that (by definition of multiplicative inverse mod ϕ(n)) ed = kϕ(n) + 1
for some integer k. Thus, with all calculations below being done modulo n, we have
\[
\begin{aligned}
c^d &\equiv (m^e)^d \\
    &= m^{ed} \\
    &= m^{k\phi(n)+1} \\
    &= m^{k\phi(n)}\, m \\
    &= \left(m^{\phi(n)}\right)^k m \\
    &\equiv 1^k\, m \quad \text{(by Euler's theorem)} \\
    &= m.
\end{aligned}
\]
So, Bob has recovered the original plaintext message.
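To see the whole exchange in miniature, here is a toy sketch in Python. The primes p = 11 and q = 17, the exponent e = 3, the message m = 42 and the helper names encrypt and decrypt are all assumptions made here for illustration; real keys use primes hundreds of digits long.

```python
from math import gcd

p, q = 11, 17                 # toy primes (assumed for illustration only)
n = p * q                     # public modulus
phi_n = (p - 1) * (q - 1)     # phi(n) = (p - 1)(q - 1)
e = 3                         # public exponent; must be relatively prime to phi(n)
assert gcd(e, phi_n) == 1
d = pow(e, -1, phi_n)         # multiplicative inverse of e modulo phi(n)

def encrypt(m):
    """Alice's step: c = m^e mod n."""
    return pow(m, e, n)

def decrypt(c):
    """Bob's step: m = c^d mod n."""
    return pow(c, d, n)

m = 42
c = encrypt(m)
recovered = decrypt(c)
```

With numbers this small, anyone can factor n = 187 by hand and recompute d, which is exactly the attack that becomes infeasible when p and q are very large.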
Exercises
In doing these exercises, you can use the Repeated Squaring Method dis-
cussed earlier, or a computer, or an online modular arithmetic calculator
such as the one found at http://ptrow.com/perl/calculator.pl
3.9 Encrypt I LOVE YOU using p = 7, q = 13, e = 5. How will Bob decrypt
this?
3.10 Encrypt SETTLE THE CASE using p = 5, q = 17, e = 3. How will Bob
decrypt this?
3.11 Would it make any sense at all to select e = 1? Explain.
3.12 Will 2 ever be selected as the encryption exponent e? Explain.
4
Perfect Numbers
The subject matter of this chapter, perfect numbers, has ancient roots, dat-
ing back to the time of Euclid, where they are discussed in Book IX of his
monumental treatise The Elements. Interestingly, however, only four specific
perfect numbers were actually known to the Greeks. Now we know many
more, but as we will see there are still several long-standing open questions
concerning them.
Definition 4.1.1
A positive integer n > 1 is called perfect if the sum of the positive divisors of n,
other than n itself, is equal to n. Equivalently, n > 1 is perfect if the sum of all
of the positive divisors of n is equal to 2n.
The two smallest perfect numbers are 6 and 28, as one can easily verify:
1 + 2 + 3 = 6
1 + 2 + 4 + 7 + 14 = 28.
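The definition translates directly into a brute-force test. The sketch below is illustrative only; the names divisor_sum and is_perfect are assumptions made here, and the method is far too slow for large n.

```python
def divisor_sum(n):
    """Sum of all positive divisors of n, including n itself (brute force)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def is_perfect(n):
    """n > 1 is perfect when its positive divisors sum to 2n."""
    return n > 1 and divisor_sum(n) == 2 * n

checks = [is_perfect(6), is_perfect(28), is_perfect(12)]
```

The test uses the second form of the definition (divisors summing to 2n), which avoids having to exclude n from the sum.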
Because we will be dealing with the sum of the positive divisors of a positive
integer, it is convenient to adopt a compact notation. Accordingly, we have:
Definition 4.1.2
DOI: 10.1201/9781003318712-5
This is not the first time that we have defined a function from the set of
positive integers into the set of positive integers; the Euler phi function of
Section 2.4 is another example of one. It turns out that both the σ and ϕ func-
tions have an important property, which we define next.
Definition 4.1.3
A function f from the set of positive integers into the set of positive integers
is called multiplicative if f(mn) = f(m)f(n) whenever m and n are relatively prime
positive integers.
We have previously proved (Theorem 2.4.4) that the ϕ function is multipli-
cative; we now show that the σ function is as well.
Theorem 4.1.4
Theorem 4.1.5
Exercises
Definition 4.2.1
Theorem 4.2.2
Theorem 4.2.3
Theorem 4.2.4
If n is an even perfect number, then n = 2^(p–1)(2^p – 1) for some Mersenne prime
2^p – 1.
Proof. Because n is even, we can write n = 2^k m for some positive integers k
and m, with m odd. Note that m must be greater than 1, since a power of 2 is
never a perfect number by Exercise 4.5. We will show that m is a Mersenne
prime 2^p – 1 and that k = p – 1, thus proving the result.
Because n is perfect and 2^k is relatively prime to m, we have
From this, it follows that 2^(k+1) – 1 divides 2^(k+1)m, which in turn implies that
2^(k+1) – 1 divides m by Theorem 1.3.7. Let us see what this entails. If we write
2^(k+1)r = σ(m). (***)
≥ m+r
= σ ( m ) . (from ***)
Since σ ( m ) is the sum of all divisors of m and is also the sum of the two divi-
sors m and r, it follows that these are all the divisors of m. But one divisor of
m is, of course, 1. So we must have r = 1 and σ ( m ) = m + 1. This means (Exercise
4.3) that m is a prime. By Theorem 4.2.2 and (**), this means that k + 1 is a
prime. If we write k + 1 = p, then
n = 2^k m = 2^(p – 1)(2^p – 1), and the proof is complete.
The case p = 2 in this theorem yields n = 6, and the case p = 3 gives the even
perfect number 28. As an exercise, check for other even perfect numbers
using other values of p.
Exercises
4.8 Theorem 4.2.2 says that 2^10 – 1 can’t be a prime. Verify this directly.
4.9 Verify without electronic assistance that 2^11 – 1 is not prime, thus
showing that the converse of Theorem 4.2.2 is false.
4.10 Find two more even perfect numbers.
4.11 A triangular number is one that can be written as the sum of the
first n positive integers, for some n. Prove that any even perfect
number is triangular.
4.12 Prove that the unit digit of any even perfect number is either 6
or 8.
C4.1 If m and n are prime powers (not necessarily of the same prime)
and σ (n)/n = σ (m)/m, prove that m = n.
5
Primitive Roots
The subject matter of this chapter has strong algebraic content and is, in fact,
largely a special case of basic group theory in abstract algebra. Of course,
abstract algebra is not a prerequisite for this text, so the material will be devel-
oped from scratch in a number-theoretic setting. However, since it would be
a pity not to understand this material in its proper context, occasional ref-
erences to group theory will be made. These references can be ignored by
people who are not interested in these algebraic connections.
Definition 5.1.1
DOI: 10.1201/9781003318712-6
Section 2.2, we defined Z*n to be the set of units of Zn. Here is where algebra
enters the picture: in Section 2.2, we showed that this set was an algebraic
structure called a group. In this context, the positive integer k is called the
order of the element [a] in the group Z*n.
Theorem 5.1.2
If the order of a mod n is k, then no two of the integers a, a^2, …, a^k are congruent
mod n. Equivalently, in the group Z*n, the elements [a], [a]^2, …, [a]^k are all
distinct.
Proof. Suppose to the contrary that a^i and a^j are congruent mod n, where
0 < i < j ≤ k. This yields a^(j − i) ⋅ a^i ≡ a^i (mod n). Since a^i is relatively prime to n (why?),
we can cancel it in this congruence, getting a^(j − i) ≡ 1 (mod n), where j − i
is a positive integer less than k. This contradicts the fact that the order of a mod
n is k.
If k is the order of a mod n, and d is any positive integer such that a^d ≡ 1 (mod n), then
of course k must be less than or equal to d. But our next theorem says even
more: namely, that k must divide d. The proof uses a familiar technique.
Theorem 5.1.3
If k is the order of a mod n, and d is any integer such that a^d ≡ 1 (mod n), then
k divides d.
Proof. The Division Algorithm tells us that we can write d = kq + r, where
0 ≤ r < k. Since a^d = (a^k)^q ⋅ a^r and a^k ≡ 1 (mod n), it follows that
1 ≡ a^d ≡ a^r (mod n). This can only happen
if r = 0, as otherwise r would be a positive integer, less than k, which gives 1
when a is raised to that power, contradicting the fact that k is the order of a.
So r = 0, and d = kq. In particular, k divides d.
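The theorem is easy to test numerically. In the sketch below, which is an illustration only, the function name order and the sample values a = 2, n = 7 are assumptions made here; the function finds the order by repeated multiplication, and the list collects the exponents d up to 30 for which a^d ≡ 1.

```python
from math import gcd

def order(a, n):
    """Smallest positive k with a^k ≡ 1 (mod n); requires gcd(a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    k, power = 1, a % n
    while power != 1 % n:          # 1 % n handles the degenerate modulus n = 1
        power = (power * a) % n
        k += 1
    return k

k = order(2, 7)
# Every exponent d (here up to 30) with 2^d ≡ 1 (mod 7) should be a multiple of k
exponents = [d for d in range(1, 31) if pow(2, d, 7) == 1]
```

The loop terminates because a is a unit mod n, so some positive power of a is congruent to 1 (by Euler's Theorem, for instance).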
Combining the previous result with Euler’s Theorem immediately gives
us:
Corollary 5.1.4
result is left as an exercise. People who have taken abstract algebra will note
that this result, too, is really a special case of basic group theory.
Theorem 5.1.5
Exercises
Definition 5.2.1
Theorem 5.2.2
The integer a is a primitive root modulo n if and only if every integer that is
relatively prime to n is congruent, mod n, to some power of a, which in turn
happens if and only if every element of Z*n is a power of [a].
Let’s consider the example n = 5. We have Z*5 = {[1],[2],[3],[4]}, and a simple
calculation shows that [2]^1 = [2], [2]^2 = [4], [2]^3 = [3], [2]^4 = [1]. Thus, the smallest
positive power of 2 that is congruent to 1 mod 5 is the 4th power. Since 4 = ϕ(5), 2 is a
primitive root mod 5. At the same time, we see that the powers of [2] sweep
out all the elements of Z*5, as we would expect from the previous theorem.
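The criterion of Theorem 5.2.2 gives a direct, if inefficient, computational test. The sketch below is illustrative only (the function name is_primitive_root is an assumption made here, and n ≥ 2 is assumed): it checks whether the powers of a sweep out every unit mod n.

```python
from math import gcd

def is_primitive_root(a, n):
    """True if the powers of a hit every residue class relatively prime to n."""
    units = {u for u in range(1, n) if gcd(u, n) == 1}
    powers = {pow(a, k, n) for k in range(1, len(units) + 1)}
    return powers == units

roots_mod_5 = [a for a in range(1, 5) if is_primitive_root(a, 5)]
roots_mod_8 = [a for a in range(1, 8) if is_primitive_root(a, 8)]
```

The second search comes back empty: every unit mod 8 squares to 1, so no single element can generate all four units.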
Primitive roots, when they exist, are generally not unique. In the previous
example, we know that 3, the multiplicative inverse of 2 mod 5, must also have
order 4 (see Theorem 5.1.5) and so must also be a primitive root mod 5. We can
verify this directly by computing powers of [3] and observing that we don’t
get to [1] until the fourth power: [3]^1 = [3], [3]^2 = [4], [3]^3 = [2], [3]^4 = [1]. The following
theorem describes explicitly how one primitive root is related to another.
Theorem 5.2.3
An interesting question, motivated by the example above, is: are there infi-
nitely many primes that have 2 as a primitive root? The conjecture that there
are, known as Artin’s Conjecture, is yet another example of an unsolved prob-
lem in number theory.
Our next goal is to prove that primitive roots exist for any prime integer p.
To accomplish this task, we need to study polynomials, which we do in the
next section.
Exercises
5.3 Polynomials in Z p
Throughout this section, p denotes an arbitrary but fixed prime. We will be
working a lot with the set Z p = {[0],[1],…,[p − 1]}. Because of this, for nota-
tional convenience, we will abuse notation and drop the brackets on the ele-
ments of this set, writing them to look like integers. But we must keep in
mind that they are not integers, and that addition and multiplication are all
done modulo p. For example, in Z7 , 3 + 6 = 2.
In high school, you undoubtedly learned about polynomials with real coef-
ficients, so let us very briefly recall some of the relevant facts about them.
A polynomial with real coefficients is an expression f(x) = a0 + a1x + … + an x^n,
where the ai are all real numbers; they are called the coefficients of the polyno-
mial f(x). If the ai are not all zero, then the largest n for which an is nonzero is
called the degree of f(x).
Polynomials can be added and multiplied. Addition is particularly simple:
we simply add “like terms”. In other words, if f(x) = a0 + a1x + … + an x^n
and g(x) = b0 + b1x + … + bm x^m, then the coefficient of x^k in f(x) + g(x) is simply
ak + bk. Multiplication is slightly more complicated: with f(x) and g(x) as
before, the coefficient of x^k in f(x)g(x) is a0bk + … + akb0. These operations satisfy
the basic rules of arithmetic, and the set of real polynomials is therefore an
example of a ring (see Appendix C). In practice, if you have to multiply two
f(r) = a0 + a1r + … + an r^n
and we say that r is a root of f(x) if f(r) = 0. You learned in high school that a poly-
nomial of degree n has at most n distinct roots. If f(x) and g(x) are polynomials,
then one can verify by calculation that (f + g)(r) = f(r) + g(r) and (fg)(r) = f(r)g(r).
In proving the various facts quoted above, the only properties of the real
numbers that are used are that they can be added, subtracted, multiplied
and divided, and that the various standard rules of arithmetic hold.
Recall from Chapter 2 (and Appendix C) that these properties can be sum-
marized by saying that the set of real numbers is a field. But, as a result of
our investigations in Chapter 2, we now know of another field, namely the
set Z p, where p is a prime. It therefore seems appropriate to consider the set
of polynomials with coefficients from Z p. This set is denoted Z p[x]. A typical
such polynomial, for example, might be (take p = 5) [2] + [1]x + [4]x^2. This
looks ungainly, which is why we adopted the convention mentioned in the
first paragraph above to drop brackets. With this convention in place, we can
write this polynomial as 2 + x + 4x^2. This looks nicer, but we have to keep in
mind that all calculations are taking place mod p. For example, if we denote
this polynomial f(x), then f(0) = 2, f(1) = 2, f(2) = 0, f(3) = 1, and f(4) = 0. Thus, this
polynomial has two roots.
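Since Z p is finite, the roots of a polynomial can be found simply by trying every element. The following sketch is an illustration only; the function name roots_mod_p and the convention of listing coefficients from a0 upward are assumptions made here.

```python
def roots_mod_p(coeffs, p):
    """coeffs = [a0, a1, ..., an]; return the roots of a0 + a1*x + ... + an*x^n in Z_p."""
    def evaluate(r):
        # All arithmetic is reduced mod p, as in the text
        return sum(c * pow(r, i, p) for i, c in enumerate(coeffs)) % p
    return [r for r in range(p) if evaluate(r) == 0]

# The example from the text: f(x) = 2 + x + 4x^2 over Z_5
roots = roots_mod_p([2, 1, 4], 5)
```

Trying all p elements is perfectly reasonable for small primes, though it would be hopeless for the very large primes used in cryptography.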
Polynomials with coefficients in Z p behave in many respects like polyno-
mials with real coefficients. In particular, we have the following theorem,
which we will need in the next section.
Theorem 5.3.1
g(x) is an x^n. Thus, the degree of the polynomial k(x) = f(x) − g(x) is strictly less
than n, and our Strong Induction hypothesis applies to it: this polynomial, if
nonzero, cannot have n roots.
We also note, however, that k(x) does have n roots: clearly, each ri (i = 1, 2, …, n)
is a root. So, by induction hypothesis, k(x) must be the zero polynomial, which
implies that f(x) = g(x). But this in turn implies that
0 = f(r_{n+1}) = g(r_{n+1}) = an(r_{n+1} − r1)…(r_{n+1} − rn).
But this is impossible: the product on the right-hand side of this equation
is a product of nonzero elements of Z p, and we know from Chapter 2 that
the product of nonzero elements of Z p must be nonzero. So, we have found the
contradiction that we sought, and this finishes the induction argument.
As a special case of this theorem, we record, for use in the next section, the
following corollary.
Corollary 5.3.2
Exercises
Theorem 5.4.1
Let n be a positive integer with distinct positive divisors $d_1, \ldots, d_k$. Then
$n = \varphi(d_1) + \cdots + \varphi(d_k)$.
Theorem 5.4.2
≤ φ(d1)+ … + φ(dk)
Exercises
5.16 Use Theorem 4.4.2 and the previous exercise to give another proof
of Wilson’s Theorem (Theorem 2.4.3).
5.17 If p is an odd prime and $a^{(p-1)/2} \equiv -1 \pmod{p}$, is a necessarily a
Now that Alice has the number $g^b$, she can compute $(g^b)^a$ (modulo p).
Likewise, Bob can compute the number $(g^a)^b$ (modulo p). But of course these
are the same number. So, Alice and Bob are now in possession of a shared
number, which they can use as the key.
Let us illustrate this with an example, where, as we have done previously,
we have made the numbers absurdly small. In particular, take p = 11. We have
seen previously that 2 is a primitive root mod 11, so let us take g = 2. Suppose
Alice selects a = 3 and Bob selects b = 5. Thus, Alice transmits the number 8
to Bob and Bob transmits the number 10 (i.e., 32 mod 11) to Alice. Alice then
takes the number 10 and raises it to the 3rd power, getting 10. Bob takes the
number 8 and raises it to the 5th power; a short calculation shows that this
number, mod 11, is also 10 (as it had to be!). So, Alice and Bob now have a
shared key.
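The toy exchange above can be replayed in a few lines. This is only an illustrative sketch with the same absurdly small numbers; the variable names are ours, and Python's built-in three-argument `pow` is used for modular exponentiation.

```python
p, g = 11, 2   # public prime and primitive root from the example
a, b = 3, 5    # Alice's and Bob's secret exponents

A = pow(g, a, p)   # the number Alice transmits
B = pow(g, b, p)   # the number Bob transmits

key_alice = pow(B, a, p)   # Alice raises Bob's number to her exponent
key_bob = pow(A, b, p)     # Bob raises Alice's number to his exponent

print(A, B)                 # 8 10
print(key_alice, key_bob)   # 10 10
```

Nothing secret ever crosses the channel: only the two transmitted numbers are public, while a and b stay with their owners.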
Note an interesting thing: the Diffie-Hellman Method not only shares a
key, it creates one. Neither Bob nor Alice knew what the key would be until
the other one transmitted his or her number.
One final comment: an astute reader might ask at this point why it is neces-
sary to select g to be a primitive root; the method would work no matter what
the number g is. The reason we select g to be a primitive root is more practical
than mathematical. The more distinct powers of g there are, the harder it is
to find the exponent, knowing the power of g. And we know that primitive
roots give the most distinct powers of g.
C5.1 If n > 1 is an integer, prove that n does not divide $2^n - 1$. (Hint: suppose
not, and let p be the smallest prime factor of n. Then consider
the order of 2 mod p.)
C5.2 If n > 1 is an integer, prove that n divides $\varphi(2^n - 1)$.
C5.3 Prove that if m and n are relatively prime integers, each greater
than 1, then mn does not have a primitive root.
C5.4 Prove that if a and b have orders r and s, respectively, modulo n,
and r and s are relatively prime, then ab has order rs mod n.
C5.5 If the integer a has order 3 modulo the prime p, prove that $1 + a + a^2$
is divisible by p.
C5.6 Under the circumstances of the previous problem, prove that a2
has order 6 mod p.
6
Quadratic Reciprocity
Definition 6.1.1
Let p be an odd prime. We say that the integer a (not a multiple of p) is a quadratic
residue mod p if the equation $x^2 \equiv a \pmod{p}$ has an integer solution. If
the equation does not have a solution, then a is a quadratic nonresidue mod p.
Equivalently, a is a quadratic residue mod p if and only if, for some integer x,
$[a] = [x]^2$ in $\mathbb{Z}_p$.
A trivial consequence of this definition is that if a and b are two integers
that are congruent to each other modulo p, and if one of these integers is
a quadratic residue mod p, then so is the other. We leave the proof of this
simple result to the exercises.
As an example, let us determine the quadratic residues mod 7. The simplest
way to do this is to just square each of the nonzero elements of $\mathbb{Z}_7$: we get,
after a simple calculation, [1], [4] and [2]. So the quadratic residues mod 7 are
the integers that are congruent to 1, 2 or 4 mod 7. The quadratic nonresidues
mod 7 are the integers congruent to 3, 5 and 6.
When doing the calculation above, we note that there is duplication as
we square things: $[1]^2 = [-1]^2 = [6]^2$, etc. In general, assuming that p is an odd
prime, if we list the elements of $\mathbb{Z}_p^*$ from [1] to [p − 1] and start to square them,
we can pair off the first and last terms, the second and next-to-last, and so
DOI: 10.1201/9781003318712-7
Theorem 6.1.1
If p is an odd prime, then there are exactly (p – 1)/2 quadratic residues that
are non-congruent mod p.
Proof. We employ the reasoning above. As noted there, if a is a quadratic
residue mod p then [a] must be one of $[1]^2, \ldots, [(p-1)/2]^2$. Thus, there are at
most (p − 1)/2 choices for a. If we knew that $[1]^2, \ldots, [(p-1)/2]^2$ were all distinct
elements in $\mathbb{Z}_p^*$, then we would know that there are exactly (p − 1)/2 choices
for a, and we would be done.
So, let us prove that. Assume to the contrary that there is some duplication;
let us say $[a]^2 = [b]^2$ where, without loss of generality, we have 0 < a < b ≤ (p − 1)/2.
This means that p divides $b^2 - a^2 = (b - a)(b + a)$. By Euclid’s Lemma, we then
know that p must divide either b − a or b + a. But both of these terms are positive
integers that are strictly less than p, so this is impossible. This contradiction
proves the result.
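The count in Theorem 6.1.1 can be confirmed experimentally for small primes. The sketch below is our own illustration (the name `quadratic_residues` is not from the text); it squares only the first (p − 1)/2 nonzero residues, exactly as the proof suggests.

```python
def quadratic_residues(p):
    """Distinct nonzero squares mod the odd prime p, in increasing order."""
    half = (p - 1) // 2
    return sorted({(x * x) % p for x in range(1, half + 1)})


print(quadratic_residues(7))   # [1, 2, 4]

# Theorem 6.1.1 predicts exactly (p - 1)/2 residues for each odd prime p.
for p in (3, 5, 7, 11, 13, 101):
    print(p, len(quadratic_residues(p)) == (p - 1) // 2)
```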
Exercises
course, the best way to tackle a problem like this is to look at some specific cases
and see if we can discern a pattern, so let us take p = 7. In this case (p − 1)/2 = 3, so
we take cubes of integers that are relatively prime to 7. Some calculation shows
that $1^3$, $2^3$ and $4^3$ are all congruent to 1 mod 7, and $3^3$, $5^3$ and $6^3$ are all congruent
to −1 mod 7. This seems fairly random, until we compare with the example
that led off Section 6.1. We noticed there that 1, 2 and 4 were quadratic residues
mod 7 and 3, 5 and 6 were quadratic nonresidues. We now have a pattern and a
conjecture: $a^{(p-1)/2}$ is congruent to 1 mod p if a is a quadratic residue mod p, and
Theorem 6.2.1
$1 \equiv a^{(p-1)/2} \equiv g^{m(p-1)/2} \pmod{p}$.
Because the order of g mod p is p − 1, it follows from the above that m(p − 1)/2
must be a multiple of p − 1; i.e., m/2 must be an integer. Thus, m = 2k for
some integer k. But then
$a \equiv g^m \equiv g^{2k} \equiv (g^k)^2 \pmod{p}$.
But this congruence equation says that a is a quadratic residue mod p, a con-
tradiction. This contradiction proves the theorem.
This theorem tells us that, if a is not a multiple of the odd prime p, the number
$a^{(p-1)/2}$, mod p, is a quantity that is either 1 or −1 depending on whether
Definition 6.2.2
Theorem 6.2.3
p
Proof. If a is a quadratic residue mod p, both sides of this congruence are
congruent to 1, and if a is not a quadratic residue, both sides are congruent
to –1.
We can use this result to establish a “multiplicative property” of the
Legendre symbol.
Theorem 6.2.4
If p is an odd prime and a and b are integers that are relatively prime to p,
then $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.
Proof. Since 1 is not congruent to −1 mod p, it suffices, in order to show that
$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$, to show that both sides are congruent mod p. This, in turn, is
an immediate consequence of the previous theorem:
$\left(\frac{ab}{p}\right) \equiv (ab)^{(p-1)/2} \equiv a^{(p-1)/2}\, b^{(p-1)/2} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$,
with, of course, all congruences being mod p.
It follows from this that the product of two quadratic residues or two qua-
dratic nonresidues is a quadratic residue, and the product of a quadratic
residue and a quadratic nonresidue is a quadratic nonresidue. Some of these
results can be easily proved directly, and appeared as Exercises 6.3 and 6.4
in the last section.
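Both Euler's Criterion and the multiplicative property lend themselves to a quick numerical check. The following sketch is our own; the function name `legendre` is an assumption, and the prime 23 is an arbitrary choice for the test.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed via Euler's Criterion.

    a^((p-1)/2) mod p is 1 for residues and p - 1 (that is, -1) for
    nonresidues; it is 0 when p divides a.
    """
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


# Residues mod 7 are 1, 2, 4; nonresidues are 3, 5, 6.
print([legendre(a, 7) for a in range(1, 7)])   # [1, 1, -1, 1, -1, -1]

# Multiplicativity: (ab/p) = (a/p)(b/p) for all a, b prime to p = 23.
p = 23
print(all(legendre(a * b, p) == legendre(a, p) * legendre(b, p)
          for a in range(1, p) for b in range(1, p)))
```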
As an application of these ideas, we can answer the question: for which odd
primes p is −1 a quadratic residue mod p? We know that
$\left(\frac{-1}{p}\right) \equiv (-1)^{(p-1)/2} \pmod{p}$.
For the right-hand side to be equal to 1, it must be the case that (p – 1)/2 is even,
say (p – 1)/2 = 2k for some integer k. But this happens if and only if p = 4k + 1. We
have thus proved:
Theorem 6.2.5
Theorem 6.2.6
There are infinitely many primes of the form 4n + 1, for a positive integer n.
Proof. Suppose to the contrary (hoping for a contradiction) that there are only a
finite number of such primes, and let us denote the product of all of them by
P. Now consider the number $N = (2P)^2 + 1$. Clearly, N is an odd integer greater
than 1, so it has a prime factor p, which must be odd since N is. Clearly also, p
does not divide P and so cannot be one of the primes whose product defined
P. In other words, p cannot be one of the finitely many primes of
the form 4n + 1. Note, however, that since p divides N, we must have $(2P)^2 \equiv -1 \pmod{p}$,
an equation which says that −1 is a quadratic residue mod p. But if this
is the case, then Theorem 6.2.5 tells us that p is of the form 4n + 1, a contradiction.
A careful reader will recall that we pointed out, much earlier (Theorem
1.5.9) that there is a far-reaching generalization of this result known as
Dirichlet’s Theorem, the proof of which is beyond the scope of the text: if
a and b are two relatively prime positive integers, then there are infinitely
many primes of the form an + b.
Exercises
6.7 Evaluate $\left(\frac{8}{11}\right)$, $\left(\frac{12}{11}\right)$ and $\left(\frac{-2}{11}\right)$.
6.8 If both a and –a are quadratic residues mod p, what can you say
about p?
6.9 Evaluate $\left(\frac{97}{101}\right)$. (Do this mentally.)
6.10 Let q be the smallest positive nonresidue mod the odd prime p.
Prove that q is prime.
6.11 If p is an odd prime, evaluate, with proof, $\left(\frac{1}{p}\right) + \left(\frac{2}{p}\right) + \cdots + \left(\frac{p-1}{p}\right)$.
$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{1}{2}(p-1)\cdot\frac{1}{2}(q-1)}$.
Let’s take a second to think about what this means. The term on the right
is known to us if we know p and q; it is either 1 or –1, depending on
whether the exponent is even or odd. If it is even and the right hand side
is 1, that means the two Legendre symbols on the left are either both 1 or
both –1; i.e., p is a quadratic residue mod q if and only if q is a quadratic
residue mod p. If, on the other hand, the right-hand side is −1, then that
means the two Legendre symbols have opposite signs, which means that
p is a quadratic residue mod q if and only if q is not a quadratic residue
mod p.
Any odd number is, of course, congruent to either 1 or 3 mod 4. It is easy to
show (we leave this as an exercise) that the exponent on the right-hand side
above is even if either p or q is congruent to 1 mod 4 and is odd if both are
congruent to 3 mod 4. Thus, we may rephrase Theorem 6.3.1 above as follows:
If p and q are odd primes, both congruent to 3 mod 4, then p is a quadratic
residue mod q if and only if q is not a quadratic residue mod p; if, on the other
hand, either p or q is congruent to 1 mod 4, then p is a quadratic residue mod
q if and only if q is a quadratic residue mod p.
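Since we are omitting the proof, a numerical check of the law may be reassuring. The sketch below is ours: it computes Legendre symbols by Euler's Criterion and tests both the formula and the rephrased version on a hand-picked list of small odd primes. It is evidence, of course, not a proof.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's Criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    exponent = ((p - 1) // 2) * ((q - 1) // 2)
    return legendre(p, q) * legendre(q, p) == (-1) ** exponent


def rephrased_holds(p, q):
    """The two symbols differ exactly when p and q are both 3 mod 4."""
    both_three = (p % 4 == 3) and (q % 4 == 3)
    return (legendre(p, q) != legendre(q, p)) == both_three


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
pairs = [(p, q) for p in odd_primes for q in odd_primes if p < q]
print(all(reciprocity_holds(p, q) for p, q in pairs))
print(all(rephrased_holds(p, q) for p, q in pairs))
```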
Although the statement of this law is elegant and beautiful, the same can-
not be said for its elementary proofs. There are, as previously noted, a lot of
known proofs of this result, but all of the ones that are elementary enough to
be presented in a first course in number theory are fairly technical counting
arguments that seem to miraculously turn out right at the end. None of them,
unfortunately, give any real feeling for why the result should be true. For that
reason, we will omit the proof. Proofs are easily found in other elementary
number theory textbooks, such as [KW], as well as in journals, such as Kim’s
relatively recent proof [Kim].
We mention at this point that the result we have stated in Theorem 6.3.1 is not
the entire Law of Quadratic Reciprocity. There are also two “supplemental rela-
tions”, which we will state and prove in the next section. (One of them, in fact,
has already been proved.) For the moment, however, we focus on this “main”
result.
Let us illustrate the usefulness of the result in determining whether a
prime p is or is not a quadratic residue modulo another prime q. Consider,
for example, p = 3 and q = 101. It would be, to put it mildly, a tedious chore to
square the integers from 1 to 50 to determine whether any of those squares
are congruent to 3 mod 101. But with the law of quadratic reciprocity, the
calculation becomes so trivial that it could be done mentally. Since 101 is
congruent to 1 mod 4, we know that $\left(\frac{3}{101}\right) = \left(\frac{101}{3}\right)$. But since 101 is congruent
to 2 mod 3, $\left(\frac{101}{3}\right) = \left(\frac{2}{3}\right) = -1$. So, 3 is not a quadratic residue mod 101.
As another example, we ask whether 97 (which is a prime) is a quadratic
residue mod 101. Just as before, $\left(\frac{97}{101}\right) = \left(\frac{101}{97}\right) = \left(\frac{4}{97}\right)$, where the last equality
derives from the fact that 101 is congruent to 4 mod 97. But 4, being a perfect
square, is obviously a quadratic residue mod 97, so $\left(\frac{97}{101}\right) = \left(\frac{4}{97}\right) = 1$.
So we have shown, with minimal calculation, that 97 is a quadratic residue
mod 101.
We can also use the Law of Quadratic Reciprocity to investigate the qua-
dratic character of non-primes modulo a prime. For example, suppose we
want to determine whether 57 is a quadratic residue mod 101. Now, 57 is not
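a prime; its prime factors are 3 and 19. However, using Theorem 6.2.4, we
see that $\left(\frac{57}{101}\right) = \left(\frac{3}{101}\right)\left(\frac{19}{101}\right)$. So, it suffices to evaluate each of the Legendre
symbols on the right. We have already determined that $\left(\frac{3}{101}\right) = -1$, and we
see that $\left(\frac{19}{101}\right) = \left(\frac{101}{19}\right) = \left(\frac{6}{19}\right) = \left(\frac{2}{19}\right)\left(\frac{3}{19}\right)$. It is not too hard to verify that 2 is
not a quadratic residue mod 19. It is equally easy to see that 3 is not either,
but we can do this without calculation (since 3 and 19 are both congruent to
3 mod 4) by noting that $\left(\frac{3}{19}\right) = -\left(\frac{19}{3}\right) = -\left(\frac{1}{3}\right) = -1$. Thus, putting everything
together, we see that $\left(\frac{57}{101}\right) = \left(\frac{3}{101}\right)\left(\frac{19}{101}\right) = \left(\frac{3}{101}\right)\left(\frac{2}{19}\right)\left(\frac{3}{19}\right) = (-1)^3 = -1$. In
other words, 57 is not a quadratic residue mod 101.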
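The "tedious chore" that reciprocity spares us is no chore for a computer, so the three conclusions reached above can be double-checked by brute force. This is a sketch of our own; `is_quadratic_residue` is a hypothetical helper name.

```python
def is_quadratic_residue(a, p):
    """Brute force: does x^2 = a (mod p) have a solution with 1 <= x <= (p-1)/2?"""
    half = (p - 1) // 2
    return any((x * x - a) % p == 0 for x in range(1, half + 1))


for a in (3, 97, 57):
    print(a, is_quadratic_residue(a, 101))
# 3 False
# 97 True
# 57 False
```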
Exercises
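6.12 Evaluate $\left(\frac{17}{101}\right)$, $\left(\frac{13}{101}\right)$, $\left(\frac{51}{101}\right)$ and $\left(\frac{11}{53}\right)$. Explain your answers.
6.13 Evaluate $\left(\frac{3}{19}\right)$ in two ways, using Euler’s Criterion and Quadratic
Reciprocity.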
6.14 Prove the “rephrased” version of Theorem 6.3.1 that is stated in
the text.
6.15 Find (with proof) all odd primes p for which 5 is a quadratic resi-
due mod p.
6.16 If p is an odd prime, prove that 3 is a quadratic residue mod p if
and only if p is congruent to 1 or 11 mod 12.
$2^5(5!) = 2^5(1)(2)(3)(4)(5)$
$= (2 \times 1)(2 \times 2)(2 \times 3)(2 \times 4)(2 \times 5)$
$\equiv (2)(4)(-5)(-3)(-1)$
$\equiv (-1)^3(5!)$
$\equiv (-1)5!$
Let us look at this last product closely. The first 2k terms are all the even positive
integers that are less than or equal to 4k + 1. The remaining 2k + 1 terms are
the negatives of all the odd integers that are less than or equal to 4k + 1. If we
factor out a minus sign from each of these last 2k + 1 terms, the product (*) is seen to
be equal to $(-1)^{2k+1}$ times the product of all positive even integers ≤ 4k + 1, times
the product of all positive odd integers ≤ 4k + 1. But these last two products,
multiplied together, are just (4k + 1)!. Since it is obvious that $(-1)^{2k+1} = -1$, we have
just shown that $2^{(p-1)/2}((p-1)/2)!$ is congruent mod p to $(-1)((p-1)/2)!$, which, upon can-
Exercises
Definition 6.5.1
Theorem 6.5.2
Let m and n be positive odd integers and a and b nonzero integers that are
relatively prime to both m and n. Then
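(a) $\left(\frac{a}{n}\right)\left(\frac{b}{n}\right) = \left(\frac{ab}{n}\right)$
(b) $\left(\frac{a}{mn}\right) = \left(\frac{a}{m}\right)\left(\frac{a}{n}\right)$
(c) $\left(\frac{a}{n}\right) = 1$ if n is a square
(d) If a and b are congruent mod n, then $\left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$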
Jacobi symbols also satisfy a Quadratic Reciprocity Law (complete with sup-
plemental relations), which we state below but do not prove.
Theorem 6.5.3
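(a) $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{1}{2}(m-1)\cdot\frac{1}{2}(n-1)}$
(b) $\left(\frac{-1}{n}\right)$ is equal to 1 if n ≡ 1 (mod 4), and equal to −1 if n ≡ 3 (mod 4).
(c) $\left(\frac{2}{n}\right)$ is equal to 1 if n ≡ 1, 7 (mod 8), and equal to −1 if n ≡ 3, 5 (mod 8).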
One advantage to using these identities for Jacobi symbols is that, in contrast
with Legendre symbols, we don’t have to concern ourselves with the question
of whether the entries in the symbol are primes. For a very large integer,
it may be difficult to determine whether that integer is or is not
a prime, or to factor a known composite integer into primes; if we think of
$\left(\frac{m}{n}\right)$ as being a Jacobi symbol rather than a Legendre symbol, we no longer
have to worry about this.
To illustrate these ideas and to show how Jacobi symbols can sometimes
simplify the computation of Legendre symbols, consider the Legendre symbol
$\left(\frac{105}{113}\right)$. Without using Jacobi symbols, we would have to factor 105 and
then separately compute the three Legendre symbols $\left(\frac{3}{113}\right)$, $\left(\frac{5}{113}\right)$ and
$\left(\frac{7}{113}\right)$. However, using Jacobi symbols, we can cheerfully ignore the fact that
105 is not a prime and apply Theorem 6.5.3 directly:
$\left(\frac{105}{113}\right) = \left(\frac{113}{105}\right) = \left(\frac{8}{105}\right) = \left(\frac{2}{105}\right)^3 = 1$,
where the last step uses the fact that 105 is congruent to 1 mod 8.
As a final example, let us determine whether 55 is a quadratic residue modulo
the odd prime 401. We can use Jacobi symbols to evaluate the Legendre
symbol $\left(\frac{55}{401}\right)$ directly (and very quickly) rather than computing $\left(\frac{5}{401}\right)$
and $\left(\frac{11}{401}\right)$. By Theorem 6.5.3(a), $\left(\frac{55}{401}\right) = \left(\frac{401}{55}\right) = \left(\frac{16}{55}\right)$, which is obviously 1
because 16 is a square.
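The "flip and reduce" strategy used in these examples can be turned into a short program that never factors anything except powers of 2. What follows is a sketch of the standard algorithm, not something taken from the text; the name `jacobi` is ours, and the rules it applies are the reciprocity law, the supplement for 2, and reduction of the top entry mod the bottom.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n.

    Returns 0 when a and n share a common factor.
    """
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using the rule for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Flip the symbol; the sign changes only if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(jacobi(105, 113))   # 1
print(jacobi(55, 401))    # 1
print(jacobi(3, 101))     # -1
```

Keep in mind that when n is composite, a value of 1 does not by itself guarantee that a is a square mod n; the symbol decides the question only when n is prime.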
Exercises
6.20 Prove that if a Jacobi symbol $\left(\frac{a}{n}\right)$ is −1, then a is not a square mod n.
6.21 Evaluate $\left(\frac{105}{113}\right)$ without using Jacobi symbols.
6.22 Evaluate $\left(\frac{55}{401}\right)$ without using Jacobi symbols.
6.23 Prove Theorem 6.5.2.
6.24 Prove part (b) of Theorem 6.5.3.
6.25 Evaluate the Jacobi symbol $\left(\frac{109}{385}\right)$.
C6.1 Suppose that q and p = 2q + 1 are both odd primes. Prove that the
primitive roots of p consist of the quadratic nonresidues of p and
one other number. What is that number?
C6.2 With p and q as in the previous problem, prove that –4 is a primi-
tive root of p.
C6.3 Prove that every element of $\mathbb{Z}_p$ can be written as a sum of two
squares of elements of $\mathbb{Z}_p$.
C6.4 If p > 5 is a prime, prove that there are two consecutive quadratic
residues mod p.
(Hint: Show that at least one of the numbers 2, 5 and 10 is a qua-
dratic residue mod p.)
C6.5 If p > 5 is a prime, prove that there are two quadratic residues mod
p that differ by 2.
C6.6 Let p = 4k + 1 be a prime, and d an odd divisor of k. Prove that d is a
quadratic residue mod p.
7
Arithmetic Beyond the Integers
Up to now, we have been studying the set Z of integers. In this chapter, however,
we expand our horizons and study other number systems that have
features in common with Z, but also some differences. We study them
for several reasons. First, we can use these systems to actually prove things
about the integers, and second, their study helps shed some light on unique
factorization into primes, which turns out to be a subtler idea than one might
expect at first.
$(a + bi) \pm (c + di) = (a \pm c) + (b \pm d)i$
Multiplication can be given by a formula, but it’s easier to just think of it as a
consequence of the distributive and associative laws:
$(a + bi)(c + di) = (a + bi)c + (a + bi)di$
$= ac + bic + adi + (bd)i^2$
$= ac + bic + adi - bd$
$= (ac - bd) + (bc + ad)i$
DOI: 10.1201/9781003318712-8
$= [(a + bi)(c - di)]/(c^2 + d^2)$
Theorem 7.1.1
Theorem 7.1.2
(a) N(α) = 1
(b) α has a multiplicative inverse in Z [ i ]
(c) α = 1, –1, i or –i
If a Gaussian integer α satisfies any one (hence all) of the equivalent condi-
tions above, it is called a unit. We say that Gaussian integers α and β are associ-
ates if α = βu for some unit u.
Exercises
Any complex number a + bi can be identified with the point (a, b) in the ordi-
nary Cartesian plane. Indeed, this provides a way of actually defining a com-
plex number that avoids reliance on the nebulous concept of an “imaginary
unit”, but we won’t need this precise definition; we will, however, exploit the
identification.
Under this identification, the Gaussian integers correspond to the points
in the plane with two integer coordinates. Geometrically, these points form
a lattice in the plane. They constitute the vertices of infinitely many “unit
squares” that tile the plane, extending infinitely far from left to right and up
and down.
Note that the distance from the point z = a + bi to the origin is given by
$|z| = \sqrt{a^2 + b^2}$, and that for Gaussian integers z, $N(z) = |z|^2$.
Now let z be a complex number that is not a Gaussian integer. Then z lies
in one (or two) of the unit squares that tile the plane—either in the interior,
or on one of the sides. Pick a square containing z and call it S. Now divide S
into four sub-squares of side length ½ by drawing the horizontal and verti-
cal lines connecting the midpoints of the sides of S. The point z lies in at
least one of these sub-squares, and each of these four sub-squares contains
exactly one vertex of the original square S (i.e., contains exactly one Gaussian
integer). The length of any diagonal of one of these sub-squares is, by the
Pythagorean Theorem, equal to the square root of ¼ + ¼ = ½, or, putting it
another way, 1/√2 = √2/2 < 1. It is clear, geometrically, that this is the largest
possible distance from any point in a sub-square to the unique Gaussian
integer (vertex) contained in that sub-square. Thus, we have shown the fol-
lowing geometric fact: given any complex number z, there is a Gaussian integer
α whose distance from z is less than 1; i.e., | z − α | < 1. We will use this fact in
Section 7.4, where we give a geometric proof of the Division Algorithm for
the Gaussian integers.
Exercises
7.7 Find the Gaussian integer that is closest to the complex number
½ + ¼i.
7.8 Give an example to show that the Gaussian integer closest to a
complex number z need not be unique.
Definition 7.3.1
Theorem 7.3.2
(a) α|α
(b) 1|β
(c) if α | β and β | γ, then α | γ
(d) if α | β and α | γ, then α | β ± γ
(e) if α | 1, then α is a unit
(f) if α | β and β | α, then α and β are associates
(g) if α | β then N(α) | N(β)
(h) if α | β and N(α) = N(β), then α and β are associates
Examples are easy to produce. Since (3 − i)(1 + 4i) = 7 + 11i, for example, it fol-
lows that (3 − i) | (7 + 11i). For a non-example, note that (1 + 3i)/(1 − 3i) is not
a Gaussian integer (check this!) and so 1 − 3i does not divide 1 + 3i. Note also
that since 1 + 3i and 1 − 3i obviously have the same norm, this last example
also serves to establish that the converse of part (g) of the previous theorem
is not true.
We can also adapt the definition of an ordinary prime integer to define
prime elements in Z [ i ]. We will use the word “irreducible” rather than
“prime” to distinguish the notion from another kind of primality that will
be discussed later. In the Gaussian integers these two ideas turn out to be
equivalent, but that is not the case for certain other number systems, as we
will see. We will use the Greek letter π to denote irreducible Gaussian inte-
gers, hoping that no confusion with the real number π will result.
Definition 7.3.3
prime does not mean that it is irreducible in the Gaussian integers. The inte-
ger 5 is certainly prime in Z, but it is not irreducible in Z [ i ], as the factoriza-
tion 5 = (1 + 2i)(1 – 2i) shows.
We will ultimately state and prove a theorem that completely characterizes
the irreducible Gaussian integers, but this will require the development of
some more mathematical machinery. For the moment, however, we can at
least record one easy theorem.
Theorem 7.3.4
Theorem 7.3.5
Exercises
Theorem 7.4.1
Suppose that α and β are Gaussian integers, with β ≠ 0. Then there exist
Gaussian integers γ and ρ such that α = βγ + ρ, and 0 ≤ N(ρ) < N(β).
Proof. Consider α/β, which is not necessarily a Gaussian integer but which
is certainly a complex number. By the geometric reasoning of Section 7.2,
there is a Gaussian integer γ that satisfies |α/β − γ| < 1, or (multiply through
by |β|) the equivalent inequality |α − βγ| < |β|. Now define ρ = α − βγ. Then
it is obvious that α = βγ + ρ, and moreover,
$N(\rho) = |\rho|^2 < |\beta|^2 = N(\beta)$. This completes the proof.
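The proof is constructive: round α/β to a nearby lattice point and take what is left over as the remainder. The sketch below is our own rendering of that idea. Gaussian integers are represented as pairs (a, b) standing for a + bi, all helper names are assumptions, and exact integer arithmetic is used in place of floating-point complex numbers.

```python
def g_mul(x, y):
    """(a + bi)(c + di) = (ac - bd) + (bc + ad)i."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, b * c + a * d)


def norm(x):
    """N(a + bi) = a^2 + b^2."""
    return x[0] ** 2 + x[1] ** 2


def round_div(n, d):
    """Nearest integer to n/d for d > 0 (ties are rounded up)."""
    return (2 * n + d) // (2 * d)


def g_divmod(alpha, beta):
    """Return (gamma, rho) with alpha = beta*gamma + rho and N(rho) < N(beta)."""
    n = norm(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    # alpha / beta = alpha * conj(beta) / N(beta); round each coordinate.
    num = g_mul(alpha, (beta[0], -beta[1]))
    gamma = (round_div(num[0], n), round_div(num[1], n))
    prod = g_mul(beta, gamma)
    rho = (alpha[0] - prod[0], alpha[1] - prod[1])
    return gamma, rho


print(g_divmod((7, 11), (3, -1)))   # ((1, 4), (0, 0))
print(g_divmod((1, 0), (1, -1)))    # ((1, 1), (-1, 0))
```

Rounding each coordinate to the nearest integer puts γ within distance √2/2 of α/β, which is the bound obtained geometrically in Section 7.2. The tie-breaking rule in `round_div` is an arbitrary choice of ours.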
Note that this proof gives a method for actually computing the greatest
common divisor of two Gaussian integers; the reader can try his or her hand
at computing a gcd in Exercise 7.13. Note also that, in contrast to the situation
for ordinary integers, the quotient and remainder here are not necessarily
unique, because there may be more than one γ that is closest to α /β . For a
simple example, note that if α = 1 and β = 1 − i, then we have
1 = (1 − i)(0) + 1
  = (1 − i)(1) + i
  = (1 − i)(1 + i) + (−1)
  = (1 − i)(i) + (−i)
it, as the name implies, as a common divisor that is greatest in norm among
all common divisors. This definition has the advantage of making it clear
that a greatest common divisor exists, but a disadvantage of not telling us
everything we need to know. So, we will give a different definition, where
the existence, though not obvious, can be proved.
Definition 7.4.2
Suppose that α and β are Gaussian integers, not both zero. Then a greatest
common divisor of α and β is a Gaussian integer δ with the properties:
Our first objective is to prove that a greatest common divisor (gcd) actually
exists. There are several ways to do this; we will mimic the “ideal-theoretic”
argument used to establish the greatest common divisor of two ordinary
integers. We first restate the definition of an “ideal”, this time in the context
of the Gaussian integers.
Definition 7.4.3
Theorem 7.4.4
Theorem 7.4.5
If α and β are Gaussian integers, not both zero, then a gcd of α and β exists
and is a linear combination of α and β .
Proof. Let $I = \{\alpha\sigma + \beta\tau : \sigma, \tau \in \mathbb{Z}[i]\}$ be the set of all linear combinations of α
and β. It is obvious that I is nonempty (it contains both α and β), and it is easy
to see (check this!) that I is an ideal. Therefore, I is principal, and therefore
consists of multiples of a Gaussian integer, say δ. We will prove that δ is a
gcd of α and β. Since it is obvious that δ is also a linear combination of α and
β by the way it is defined, this will complete the proof.
To show that δ is a gcd of α and β, we first observe that δ divides both α
and β. This follows from the observation, made in the previous paragraph,
that I contains both α and β, and every element of I is a multiple of δ by the
way it is defined.
Finally, suppose δ ′ also divides both α and β . Then it is clear that δ ′ also
divides any linear combination of α and β . But one such linear combination
is δ itself. So δ ′ | δ , and this completes the proof.
It should be observed that, when working with the integers, the greatest
common divisor of two integers was unique. That was because the gcd could
be defined as a positive integer satisfying certain properties. There is, how-
ever, no notion of “positivity” for Gaussian integers, and so we must sacrifice
complete uniqueness. We can, however, salvage a partial result, the proof of
which we leave to the exercises: if δ is a gcd of α and β , and δ ′ is a Gaussian
integer, then δ ′ is also a gcd of α and β if and only if δ ′ and δ are associates.
Just as with ordinary integers, we say that two Gaussian integers α and
β are relatively prime if they have 1 as a gcd. Equivalent conditions are that
(a) α and β have no common divisor other than a unit, and (b) 1 is a linear
combination of α and β. One can readily check that if π and α are Gaussian
integers, with π irreducible and a non-divisor of α, then α and π are relatively
prime.
The Euclidean Algorithm for finding the greatest common divisor of two
ordinary integers can be adapted readily enough to find the greatest com-
mon divisor of two Gaussian integers, but doing these calculations seems
like something of a chore, so we won’t pursue this further. (But see Exercise
7.14 if you can’t resist trying your hand at this.) We do note, however, that
sometimes we can use norms to calculate the gcd without having to go
through any algorithmic procedures. For example, consider the Gaussian
integer 1 + 4i and its conjugate 1 – 4i. They both have norm 17, and so any non-
unit common divisor would have to have norm 17 as well. (Why?) However,
a divisor of a Gaussian integer with the same norm must be an associate of
that Gaussian integer, and it is easy to see that 1 + 4i and 1 – 4i are not associ-
ates of each other, so there can be no non-unit common divisor of these two
Gaussian integers. Hence, these Gaussian integers are relatively prime.
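For readers who would rather delegate the chore, here is a sketch of the Euclidean Algorithm in the Gaussian integers. It is our own illustration: Gaussian integers are pairs (a, b) meaning a + bi, the helper names are assumptions, and the division step rounds α/β coordinate by coordinate. The answer is only determined up to a unit, so we compare norms.

```python
def g_mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, b * c + a * d)


def norm(x):
    return x[0] ** 2 + x[1] ** 2


def g_mod(alpha, beta):
    """A remainder rho = alpha - beta*gamma with N(rho) < N(beta)."""
    n = norm(beta)
    num = g_mul(alpha, (beta[0], -beta[1]))       # alpha * conj(beta)
    gamma = ((2 * num[0] + n) // (2 * n),          # nearest integers to the
             (2 * num[1] + n) // (2 * n))          # coordinates of alpha/beta
    prod = g_mul(beta, gamma)
    return (alpha[0] - prod[0], alpha[1] - prod[1])


def g_gcd(alpha, beta):
    """A greatest common divisor of alpha and beta (not both zero)."""
    while beta != (0, 0):
        alpha, beta = beta, g_mod(alpha, beta)
    return alpha


d = g_gcd((1, 4), (1, -4))
print(d, norm(d))                       # a unit, so the norm is 1
print(norm(g_gcd((7, 11), (3, -1))))    # 10, since 3 - i divides 7 + 11i
```

The loop terminates because the norm of the remainder strictly decreases at every step.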
We can now state and prove another analog of a useful ordinary integer
divisibility theorem, Euclid’s Lemma.
Theorem 7.4.6
(Euclid’s Lemma for Gaussian Integers) If π , α and β are Gaussian integers with
π irreducible, and π |αβ , then either π |α or π |β .
Proof. Suppose that it is not the case that π | α. Then by the remark above,
π and α are relatively prime, and we can write 1 as a linear combination of
these two Gaussian integers:
1 = πλ + αμ.
Multiplying both sides by β gives
β = βπλ + αβμ.
Each summand of the right-hand side above is divisible by π (the second
because π | αβ), and hence so is the left-hand side. We have therefore shown that if it is not the
case that π | α, then it must be the case that π | β. This completes the proof.
The previous theorem extends easily by mathematical induction to the fol-
lowing result: if π is irreducible and π |α 1 … α n then π |α j for some j, 1 ≤ j ≤ n.
We have now developed enough material to state and prove an analog, for
the Gaussian integers, of the Fundamental Theorem of Arithmetic for ordinary
integers. In what follows, when we speak of a “product of irreducibles”,
we implicitly allow for the product to have just one term; i.e., a single
irreducible Gaussian integer is considered to be a product of irreducibles.
Theorem 7.4.7
Exercises
Lemma 7.5.1
If a and b are positive integers that can each be written as the sum of two
squares, then ab can as well.
Proof. Write $a = m^2 + n^2$ and $b = r^2 + s^2$. We could just write down an algebraic
identity for ab, but that would be unmotivated; let us discover such a result by using
Gaussian integers. If we let α = m + ni and β = r + si, then ab = N(α)N(β) = N(αβ),
which, by definition, is the sum of two squares.
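Expanding αβ = (mr − ns) + (nr + ms)i makes the resulting identity explicit. As a small sketch (the function name is ours), the two squares for ab can be produced directly from those for a and b:

```python
def two_squares_product(m, n, r, s):
    """Given a = m^2 + n^2 and b = r^2 + s^2, return (u, v) with ab = u^2 + v^2.

    (u, v) are the real and imaginary parts of (m + ni)(r + si).
    """
    return (m * r - n * s, n * r + m * s)


# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 should be a sum of two squares.
u, v = two_squares_product(1, 2, 2, 3)
print(u, v, u * u + v * v)   # -4 7 65
```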
We now characterize those primes that can be written as the sum of two
squares.
Theorem 7.5.2
The ordinary prime integer p can be written as the sum of two squares if and
only if p = 2 or p is congruent to 1 mod 4.
Proof. The “only if” direction is easy and amounts to recalling that the square
of an integer is congruent to either 0 or 1 mod 4 depending on whether that integer
is even or odd. We leave the details to the reader and instead prove the more
challenging “if” direction. The number 2 is obviously a sum of two squares, so
suppose that p is a prime that is congruent to 1 modulo 4. We know that −1 must
be a quadratic residue mod p, so, for some x, $x^2 \equiv -1 \pmod{p}$. It follows that p
divides $x^2 + 1$. Thinking of this in the Gaussian integers, this means that p | (x + i)(x − i).
Now, if p were irreducible in $\mathbb{Z}[i]$, this would imply p | (x + i) or p | (x − i).
But since p is an ordinary integer, Exercise 7.11 clearly makes this impossible.
Thus, p (viewed as a Gaussian integer) is not irreducible, which means we can
write p = αβ, where neither α nor β is a unit. Taking norms gives $p^2 = N(\alpha)N(\beta)$
with neither factor equal to 1, so N(α) = p. But if we write α = a + bi, this means
that $p = a^2 + b^2$; i.e., that p is a sum of two squares, as was to be proved.
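The proof can be made computational. One way to produce a factor α of norm p, offered here as a sketch under our own assumptions rather than as the text's method, is to find x with x² ≡ −1 (mod p) by brute force and then take a gcd of p and x + i in the Gaussian integers. Pairs (a, b) stand for a + bi, and all function names are ours.

```python
def g_mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, b * c + a * d)


def g_mod(alpha, beta):
    """Remainder of alpha on division by beta, with norm less than N(beta)."""
    n = beta[0] ** 2 + beta[1] ** 2
    num = g_mul(alpha, (beta[0], -beta[1]))
    gamma = ((2 * num[0] + n) // (2 * n), (2 * num[1] + n) // (2 * n))
    prod = g_mul(beta, gamma)
    return (alpha[0] - prod[0], alpha[1] - prod[1])


def g_gcd(alpha, beta):
    while beta != (0, 0):
        alpha, beta = beta, g_mod(alpha, beta)
    return alpha


def two_squares(p):
    """For a prime p = 1 (mod 4), return (a, b) with a^2 + b^2 = p."""
    # Brute-force search for a square root of -1 mod p (fine for small p).
    x = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    a, b = g_gcd((p, 0), (x, 1))   # a gcd of p and x + i has norm p
    return (abs(a), abs(b))


for p in (5, 13, 17, 29, 101):
    a, b = two_squares(p)
    print(p, a, b, a * a + b * b == p)
```

The gcd has norm p because it is a proper divisor of p that is not a unit: p does not divide x + i, while some irreducible factor of p does.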
Using this, we can characterize all positive integers that are the sum of two
squares. This argument does not use the Gaussian integers, but it does use a
result that we established when we discussed quadratic reciprocity, namely
that −1 is not a quadratic residue mod p for any prime p that is congruent to
3 mod 4.
Theorem 7.5.3
An integer n > 1 can be written as the sum of two squares if and only if every
prime factor of n that is congruent to 3 mod 4 appears with even multiplicity
in the prime factorization of n.
Proof. If this condition is satisfied, then n can be written as $s^2 t$ where s and t
are positive integers and t is the product of distinct primes, each one either 2
or congruent to 1 mod 4. It follows immediately from Lemma 7.5.1 and Theorem 7.5.2 that t can
be written as the sum of two squares, say $t = a^2 + b^2$. But then $n = (sa)^2 + (sb)^2$ is
also a sum of two squares, as desired.
For the converse, suppose n is the sum of two squares, say n = a^2 + b^2. Let p be
a prime dividing n that is congruent to 3 mod 4. We will show that p appears
to an even power in the prime factorization of n. To do this, first note that
since p divides n = a^2 + b^2, we must have a^2 ≡ −b^2 (mod p). If it were the case
that p did not divide b, then b would be relatively prime to p and would have a
multiplicative inverse mod p; multiplying both sides of the congruence
a^2 ≡ −b^2 (mod p) by the square of that multiplicative inverse, we would see
that −1 was a quadratic residue mod p, which we know it is not. So p divides b,
and from this it is immediate that p divides a as well. It follows that p^2
divides both a^2 and b^2, and hence p^2 divides n.
If p^2 is the largest power of p dividing n, then we are done; if not, then p
divides n/p^2. However, n/p^2 is also a sum of two square integers: (a/p)^2 + (b/p)^2.
By what we have just shown, p^2 divides n/p^2, or p^4 divides n. If this is the
largest power of p that divides n, we are again done; otherwise, repeat the process
once more. The point is that we have to stop at some point, and we must stop
at an even power of p, since every time that p divides n/p^k, so does p^2. This
completes the proof.
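The criterion in Theorem 7.5.3 is easy to test by machine. The following Python sketch compares a brute-force search for a representation n = a^2 + b^2 (with a or b allowed to be 0) against the prime-factorization criterion. The function names are hypothetical, and trial division is used only because it is short; it is not meant to be efficient.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: does n = a^2 + b^2 for some integers a, b >= 0?"""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            return True
    return False


def satisfies_criterion(n):
    """True if every prime factor p ≡ 3 (mod 4) of n occurs to an even power."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:          # only primes can divide here, since
            n //= p                # smaller prime factors are already removed
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What is left is 1 or a single prime occurring to the first power.
    return not (n > 1 and n % 4 == 3)


# The two tests agree on every n in a small range, as the theorem predicts.
print(all(is_sum_of_two_squares(n) == satisfies_criterion(n)
          for n in range(2, 2000)))
```

For instance, 45 = 3^2 · 5 passes the criterion (and 45 = 6^2 + 3^2), while 21 = 3 · 7 does not.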
Now take a prime, like 5, which is a sum of two squares: 5 = 2^2 + 1^2. It is
clear that, except for the order of the terms 2^2 and 1^2, this is the only way 5
can be written as the sum of two squares. The same is true of other primes
like 13 = 2^2 + 3^2 or 17 = 1^2 + 4^2. The next result says that this is not a coincidence.
Theorem 7.5.4
Exercises
7.16 Determine whether each of the integers 688, 1000 and 1240 can be
written as the sum of two squares.
7.17 The last paragraph of the proof of Theorem 7.5.3 is a little informal.
Make it precise by showing that, in the notation of the theorem,
p^(2k+1) cannot be the largest power of p that divides n.
7.18 Prove that, among any four positive consecutive integers, at least
one cannot be written as the sum of two squares.
7.19 Prove that if p is a prime that is congruent to 3 mod 4, then p^2 is not
a sum of two positive squares.
coefficients, for which integer solutions are sought. We illustrate this idea
with the Diophantine equation y^2 = x^3 − 1. One solution, which we can find
by inspection, is x = 1 and y = 0. It turns out that this is the only solution, and
Gaussian integers can be used to prove this.
Theorem 7.6.1
Exercises
Theorem 7.7.1
Let a, b and c be three relatively prime positive integers, with a odd, b even,
and a^2 + b^2 = c^2. Then there exist positive, relatively prime, integers m and n of
opposite parity such that a = m^2 − n^2, b = 2mn and c = m^2 + n^2.
Proof. The equation a^2 + b^2 = c^2 factors, in the Gaussian integers, as
(a + bi)(a − bi) = c^2. The first thing to observe is that, as in the proof of the
preceding result, (a + bi) and (a − bi) are relatively prime. The proof is not too
different from the proof given in the previous result, and we leave it as an exercise.
Now that we know that the product of two relatively prime Gaussian inte-
gers is equal to a square, it is tempting to assert that each term is a square.
We used similar reasoning, with cubes replacing squares, in the preceding
proof. But as noted in that proof, there’s a subtle point here: since irreducible
factorization is unique only up to associates, all we can really conclude is that
each term is an associate of a square. This didn’t create an issue in the
previous result, because every unit in the Gaussian integers is a cube. However,
not every unit in the Gaussian integers is a square (specifically, i and −i are
not), so things are not quite as simple now. We can therefore, a priori, only
assert that a + bi = (m + ni)^2 or a + bi = i(m + ni)^2. (Why do we not need to consider
the case a + bi = (−i)(m + ni)^2?)
If the second case holds, then, expanding the square and equating real
parts, we get a = −2mn, which contradicts our assumption that a is odd. So in
fact this case cannot hold after all.
We now know, then, that a + bi = (m + ni)^2. Squaring the right-hand side and
equating real and imaginary parts gives us that a = m^2 − n^2 and b = 2mn, as
desired. Observe that m and n can be chosen to be positive; they are obviously
either both positive or both negative (why?), and in the latter case, we may
replace each one by its negative. Also, note that m and n must be relatively
prime, because a and b are. They must also be of opposite parity, as otherwise
a and b would both be even, a contradiction. Finally, computing
c^2 = a^2 + b^2 = (m^2 + n^2)^2 gives us c = m^2 + n^2, completing the proof.
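The parametrization in Theorem 7.7.1 can be turned directly into a small generator of primitive Pythagorean triples. The Python sketch below is only an illustration: the function name primitive_triples and the bound parameter limit are hypothetical, and it assumes m > n so that a = m^2 − n^2 is positive.

```python
from math import gcd


def primitive_triples(limit):
    """List (a, b, c) with a odd, b even, a^2 + b^2 = c^2 and c <= limit,
    using a = m^2 - n^2, b = 2mn, c = m^2 + n^2 with m > n >= 1,
    gcd(m, n) = 1 and m, n of opposite parity."""
    triples = []
    m = 2
    while m * m + 1 <= limit:              # smallest possible c for this m
        for n in range(1, m):
            if (m - n) % 2 == 1 and gcd(m, n) == 1:
                a, b, c = m * m - n * n, 2 * m * n, m * m + n * n
                if c <= limit:
                    triples.append((a, b, c))
        m += 1
    return triples


print(primitive_triples(30))
# [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29)]
```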
Exercises
7.22 Prove that, in the proof of Theorem 7.7.1, the Gaussian integers
(a + bi) and (a − bi) are relatively prime.
7.23 Explicitly answer the question posed in the proof above: why do
we not need to consider the case a + bi = (−i)(m + ni)^2?
Theorem 7.8.1
(a) 1 + i
(b) a Gaussian integer π, where N(π) is an ordinary prime congruent to 1
mod 4
(c) an ordinary prime p that is congruent to 3 mod 4
Exercises
norm, like the one defined on the set of Gaussian integers, is multiplicative,
although here it should be noted that the norm may take on negative values.
This requires modification of a theorem about the Gaussian integers: now, an
element α in Z[√2] is a unit if and only if N(α) = ±1. We leave the details of
proving this to the exercises.
It can be shown that the equation a^2 − 2b^2 = ±1 has infinitely many integer
solutions. So, Z[√2] has infinitely many units, unlike the Gaussian integers,
which have only 4. However, in another respect, Z[√2] is similar to the
Gaussian integers: there is an analog of the Division Algorithm in Z[√2], and
it follows from this that there is unique factorization into irreducible elements
here as well. We will not prove these facts here.
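A quick search gives a feel for the solutions of a^2 − 2b^2 = ±1. The Python sketch below simply tries small values of b and tests whether 2b^2 ± 1 is a perfect square; it proves nothing about infinitude, and the names norm and small_units are hypothetical. An element a + b√2 is stored as the pair (a, b), and only a, b ≥ 0 are listed.

```python
from math import isqrt


def norm(a, b):
    """Norm of a + b*sqrt(2): a^2 - 2*b^2 (may be negative)."""
    return a * a - 2 * b * b


def small_units(bound):
    """Pairs (a, b) with a >= 0, 0 <= b < bound and a^2 - 2*b^2 = ±1."""
    units = []
    for b in range(bound):
        for target in (2 * b * b - 1, 2 * b * b + 1):   # candidate a^2
            if target < 0:
                continue
            a = isqrt(target)
            if a * a == target:
                units.append((a, b))
    return units


print(small_units(30))
# [(1, 0), (1, 1), (3, 2), (7, 5), (17, 12), (41, 29)]
```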
Now let’s vary things and consider Z[√−2], or the set of all complex numbers
of the form a + b√2i. This set is also closed under addition and multiplication,
and hence we can do basic arithmetic in this domain. In particular,
we can define divisibility and irreducibility just as with the Gaussian
integers. We can define a norm on elements of this set by exact analogy with the
Gaussian integers: N(a + b√2i) = (a + b√2i)(a − b√2i) = a^2 + 2b^2. So here, the norm
takes on only nonnegative values, is multiplicative and satisfies N(a + b√2i) = 1 if
and only if a + b√2i is a unit. The equation a^2 + 2b^2 = 1, however, obviously has
as its only solutions a = ±1, b = 0, so this ring has only two units: ±1. It can be
shown, though we won’t do so here, that an analog of the Division Algorithm
holds for this ring, and, with it, unique factorization into irreducibles.
We next consider Z[√−5], or the set of all numbers of the form a + b√5i. We
define the norm of this generic element to be a^2 + 5b^2, and, as in the previous
case, this is equal to 1 only when a = ±1, b = 0, so this ring has only two units:
±1. But Z[√−5] differs from Z[√−2] in one very important respect: this time,
there is no Division Algorithm, and unique factorization into irreducibles
fails. In fact, we will show that 6 = 3 × 2 = (1 + √5i)(1 − √5i) gives two distinct
irreducible factorizations of 6.
To see this, observe that no one of the four elements that appear as factors
of 6 is an associate of any other one. We can also show that each of these four
elements is irreducible. Suppose, for example, that 2 = αβ, where α and β are
non-unit elements of Z[√−5]. Taking norms gives 4 = N(α)N(β). This in turn
implies, by unique factorization of ordinary integers, that N(α) = N(β) = 2.
But this is impossible, because 2 cannot be written as a^2 + 5b^2. For similar
reasons (3 cannot be written as a^2 + 5b^2 either), 3, (1 + √5i) and (1 − √5i)
are also irreducible. Thus, we have two distinct (even up to associates)
factorizations of 6 into irreducible elements.
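The arithmetic behind this example can be checked mechanically. In the Python sketch below, an element a + b√5i is stored as the pair (a, b); the names norm, mul and attained_norms are hypothetical. The sketch confirms both factorizations of 6 and that no element has norm 2 or 3, which is what rules out proper factors of 2, 3 and 1 ± √5i.

```python
from math import isqrt


def norm(a, b):
    """Norm of a + b*sqrt(5)*i: a^2 + 5*b^2."""
    return a * a + 5 * b * b


def mul(x, y):
    """Product of two elements, each given as a pair (a, b)."""
    (a, b), (c, d) = x, y
    # (a + b*s)(c + d*s) with s^2 = -5
    return (a * c - 5 * b * d, a * d + b * c)


def attained_norms(limit):
    """All values a^2 + 5*b^2 that are <= limit."""
    values = set()
    for b in range(isqrt(limit // 5) + 1):
        for a in range(isqrt(limit - 5 * b * b) + 1):
            values.add(norm(a, b))
    return values


print(mul((2, 0), (3, 0)), mul((1, 1), (1, -1)))   # (6, 0) (6, 0)
print(sorted(attained_norms(9)))                   # [0, 1, 4, 5, 6, 9]
```

The four factors have norms 4, 9, 6 and 6, so a proper factor of any of them would need norm 2 or 3, and neither value appears in the list.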
It is worthwhile to consider the equation 3 ∙ 2 = (1 + √5i)(1 − √5i) from the
standpoint of Euclid’s Lemma. Notice that the irreducible element 2 divides
the product on the right-hand side but does not divide either of the two terms
making up this product—i.e., Euclid’s Lemma for the integers fails for some
quadratic extensions of the integers. It is for this reason that many books
Exercises
Theorem 7.10.1
Exercises
Theorem 7.11.1
If a and b are Hurwitz integers and b ≠ 0, then there exist Hurwitz integers q
and r such that a = qb + r, and 0 ≤ N(r) < N(b).
The existence of a division algorithm allows us to introduce the notion
of a greatest common right divisor. Specifically, we say that q is a greatest
Theorem 7.11.2
Let p be an odd ordinary prime, and suppose that a and b are Hurwitz
integers with the property that p ∣ ab. Then p ∣ a or p ∣ b.
Proof. If p does not divide a, then p and a are relatively prime, and 1 can be
expressed as 1 = qa + rp for Hurwitz integers q and r. Multiplying both sides of
this equation by b gives b = qab + rpb. Because p is an ordinary integer, it com-
mutes with everything, and so we have b = qab + rbp. It is now obvious from
this expression that p ∣ b.
Exercises
7.35 Prove that there are infinitely many quaternions q satisfying q^2 = −1.
7.36 There are 24 units in the set of Hurwitz integers. Find them all,
and prove that your list is complete.
7.37 Let us denote by L the set of quaternions with integer coefficients.
The quaternions a = 1 + i + j + k and b = 2 are obviously elements of L.
Show that if q is any element of L, then N(a − qb) ≥ 4. Explain from
this why Theorem 7.11.1 does not hold in L.
7.38 Fill in the details of the argument that the norm of a Hurwitz inte-
ger is an integer.
7.39 Prove that if q is a Hurwitz integer, then some associate of q has
all-integer coefficients.
Theorem 7.12.1
Any positive integer can be written as the sum of four integer squares.
We will prove this via a sequence of lemmas. Our first is the analog of
Lemma 7.5.1 for four squares instead of two.
Lemma 7.12.2
If a and b are positive integers that can each be written as the sum of four
squares, then ab can as well.
Proof. The proof is just like the proof of Lemma 7.5.1, this time using the fact
that a and b are the norms of Hurwitz integers rather than Gaussian integers.
It is interesting to note that although we gave a quaternion-based proof of
this result, it was originally proved almost a century before the quaternions
were discovered by Hamilton.
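The lemma can be illustrated with explicit quaternion arithmetic. The Python sketch below stores a quaternion a + bi + cj + dk as a tuple (a, b, c, d) and uses the standard Hamilton product; the names qmul and qnorm are hypothetical. For simplicity it only uses quaternions with integer coefficients, so that the norm is literally a sum of four integer squares.

```python
def qmul(x, y):
    """Hamilton product of two quaternions given as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qnorm(x):
    """Norm of a quaternion: the sum of the squares of its coefficients."""
    return sum(t * t for t in x)


x = (2, 1, 1, 1)        # norm 7 = 4 + 1 + 1 + 1
y = (1, 1, 1, 0)        # norm 3 = 1 + 1 + 1 + 0
z = qmul(x, y)
print(z, qnorm(z))      # (0, 2, 4, 1) 21
```

Since the norm is multiplicative, the coefficients of the product give 21 = 0^2 + 2^2 + 4^2 + 1^2 as a sum of four squares.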
It follows from the previous lemma and the Fundamental Theorem of
Arithmetic that to prove that any positive integer can be written as the sum
of four squares, it suffices to prove it for the numbers 1, 2 and any odd prime
integer. Since the result is obviously true for the integers 1 and 2, it therefore
suffices to prove Theorem 7.12.1 when n is an odd prime. To do this, we need
another lemma that says that, for any prime p and any integer n, n can be
written as a sum of two squares mod p. We actually don’t need the result in
quite this level of generality, but it is just as easy to prove it at that level.
Lemma 7.12.3
If p is any odd prime and n is any positive integer, then there are integers x
and y such that x^2 + y^2 ≡ n (mod p).
Proof. We shall work in the set Z_p of residue classes mod p, where for
typographical convenience we denote the elements of Z_p as integers rather than
residue classes; i.e., we write a typical element of Z_p as a rather than [a]. We
need to keep in mind, however, that equality in Z_p amounts to congruence
mod p as integers.
We will use a counting argument. We have previously seen that there are
(p − 1)/2 quadratic residues mod p. In other words, there are (p − 1)/2 nonzero
squares in Z_p. If we add 0 to this list, we get a total of (p − 1)/2 + 1 = (p + 1)/2
squares. Another way to say this is that the set A = {x^2 : x ∈ Z_p} has (p + 1)/2
elements in it. It follows immediately from this that the set B = {n − x^2 : x ∈ Z_p}
also has (p + 1)/2 elements in it.
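The counts just described are easy to confirm numerically. The Python sketch below builds the sets A and B for a given odd prime p and integer n, checks that each has (p + 1)/2 elements, and then looks for a common element, from which x and y with x^2 + y^2 ≡ n (mod p) can be read off. The function names are hypothetical, and everything is done by brute force.

```python
def squares_mod(p):
    """The set A of all squares in Z_p (including 0)."""
    return {x * x % p for x in range(p)}


def two_square_rep(n, p):
    """Return (x, y) with x^2 + y^2 ≡ n (mod p), for an odd prime p."""
    A = squares_mod(p)
    B = {(n - s) % p for s in A}          # the set of all n - x^2
    assert len(A) == len(B) == (p + 1) // 2
    t = min(A & B)                        # an element lying in both sets
    x = next(x for x in range(p) if x * x % p == t)
    y = next(y for y in range(p) if (n - y * y) % p == t)
    return x, y


x, y = two_square_rep(3, 7)
print(x, y, (x * x + y * y) % 7)
```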
These observations imply that the sets A and B cannot be disjoint: if they
were, then the union of these sets would contain p + 1 elements, but that’s not
possible because there are only p elements in Z_p. So, let us denote by t an
element that is in both sets; t must, on the one hand, be equal to x^2 for some x in
Lemma 7.12.4
Suppose that n is a positive integer and that 2n can be written as the sum of
four squares. Then n can be so written.
Proof. Write 2n = a^2 + b^2 + c^2 + d^2, where a, b, c and d are integers. Since 2n is
even, it must be the case that a, b, c and d are either all even, all odd, or that
two of them are even and two of them are odd. In any event, we may assume
that (relabeling if necessary) a and b have the same parity, as do c and d. This
means that the numbers (a + b)/2, (a − b)/2, (c + d)/2 and (c − d)/2 are all integers.
High school algebra now shows that the sum of the squares of these four
integers is n, and we are done.
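The “high school algebra” here is the identity ((a + b)/2)^2 + ((a − b)/2)^2 = (a^2 + b^2)/2, applied once to the pair a, b and once to the pair c, d. A minimal Python sketch of the construction follows; the function name halve_four_squares is hypothetical, and the relabeling step is done by sorting the four numbers by parity.

```python
def halve_four_squares(a, b, c, d):
    """Given integers with a^2 + b^2 + c^2 + d^2 = 2n, return four integers
    whose squares sum to n."""
    if (a * a + b * b + c * c + d * d) % 2 != 0:
        raise ValueError("the sum of the four squares must be even")
    # Relabel so that a, b share a parity and c, d share a parity:
    # sorting by parity puts the even numbers first, then the odd ones.
    a, b, c, d = sorted((a, b, c, d), key=lambda v: v % 2)
    return ((a + b) // 2, (a - b) // 2, (c + d) // 2, (c - d) // 2)


quad = halve_four_squares(1, 2, 3, 4)      # 1 + 4 + 9 + 16 = 30 = 2 * 15
print(quad, sum(t * t for t in quad))      # (3, -1, 2, -1) 15
```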
We can now prove:
Theorem 7.12.5
this is manifestly not the case. So p must be reducible, and we can write p = ab
where a and b are non-unit Hurwitz integers. Taking norms gives p^2 = N(a)N(b).
Since neither N(a) nor N(b) is equal to 1, this implies p = N(a). But now we
are done, because by the previous result the norm of a Hurwitz integer is the
sum of four squares.
C7.1 Find all Gaussian integers α , β and γ with the property that αβγ =
α + β + γ = 1.
C7.2 If n is a positive integer, denote by F(n) the number of Gaussian
integers with norm less than n. Are there infinitely many n satis-
fying F(n) = F(n + 1)?
C7.3 Prove that γ is a greatest common divisor of the nonzero Gaussian
integers α , β if and only if γ is a common divisor of α , β of maxi-
mal norm.
C7.4 Use Gaussian integers to classify all integer solutions to the
equation x^2 + y^2 = z^3.
C7.5 If m and n are distinct squarefree integers, prove that Q(√m) ≠ Q(√n).
C7.6 Exhibit infinitely many units in Z[√2]. (Hint: begin by showing
that 1 + √2 is one.)
Appendix A: A Proof Primer
One way in which mathematics differs from all other disciplines is that in
mathematics, things are proved—in other words, mathematics is a deductive,
rather than inductive, discipline. Let us illustrate with a simple example that
should be familiar to you from your high school geometry course. Consider
the statement “The sum of the angles of a triangle is 180°”. Suppose (contrary
to fact) that you had access to a device that was capable of measuring angles
with 100% precision. Suppose also that you drew 1000 triangles, all different
shapes and sizes, measured the angles in each of them, and came up with
an angle sum of 180° every time. Would that establish the correctness of the
sentence quoted above?
The answer is no, for the simple reason that the angle sum of the 1001st tri-
angle, the one that you didn’t measure, might not be 180°. Of course it doesn’t
matter if we change 1000 to any other positive integer—a billion, a trillion,
what have you. Since we can draw an infinite number of triangles, it is impos-
sible to try them all; there’ll always be some that we didn’t measure. In order
to establish the correctness of the statement, therefore, we can’t simply rely
on experiment; we need a proof.
A proof is a logically convincing argument—a series of assertions, each
one with an appropriate justification, leading to the desired conclusion. We’ll
shortly talk about what kinds of justifications are appropriate and describe
some standard kinds of proof, but before we do that we need to establish
some basic vocabulary and discuss the rules of (very) elementary logic.
Many mathematical statements are what we call conditional statements—
i.e., statements of the form “if P, then Q”. This statement simply means that
if we assume P, then Q must be true. The statement does not mean that Q is
always true, and it says nothing at all about what happens if P is not assumed
to be true. The only time a statement “if P, then Q” is false is when P is true
and Q is false. So, for example, the silly-sounding statement “if Paris is the
capital of Spain, then 1 + 1 = 3” is a true statement, because the antecedent
clause (“Paris is the capital of Spain”) is not true. (Sentences like this are said
to be vacuously true.) It follows from the foregoing that the negation of a con-
ditional statement “if P, then Q” is “P and not Q”.
Associated with every conditional statement “if P, then Q” is its converse,
which is the statement “if Q, then P”. (So, for example, the converse of the
statement in the previous paragraph is “if 1 + 1 = 3, then Paris is the capital of
Spain”.) It is important to note that the truth or falsity of a conditional state-
ment says nothing whatsoever about the truth or falsity of its converse. A true
conditional statement can have a converse that is true or one that is false; so
can a false conditional statement. (Examples illustrating this are easy to con-
struct, and the reader should pause now and construct some.)
to show that such a statement is false, it therefore suffices to find one single
counterexample. The statement “all prime integers are odd” is false because
there is, indeed, one single example where it fails to hold: namely, the inte-
ger 2.
To prove a statement of the form “there exists…” (e.g., “there exists an even
prime number”), it suffices to show that there is at least one such object. Since
2 is an even prime, merely pointing this out is sufficient to prove the state-
ment. The fact that 2 is the only even prime is irrelevant to the truth of this
statement; all we need to show is that there is one.
Implicit in these remarks are the facts that the negation of a “for all” state-
ment is a “there exists” statement, and vice versa. In other words, the nega-
tion of the statement “All dog owners are happy” is NOT “All dog owners are
unhappy”; it is, instead, “There exists an unhappy dog owner”.
Sometimes, an existence theorem can be proved without explicitly giving
an example of the desired object. This is called a non-constructive proof.
Here is an amusing example of such a proof. We want to prove that there
exist two irrational numbers α and β with the property that α^β is rational.
(An irrational number is one that cannot be expressed as a quotient of
integers; it is a fact, one that is proved in the text and will be assumed here, that
√2 is irrational.) For the proof, consider the number γ = √2^√2. If γ is rational,
take α = β = √2, and we are done. If γ is not rational, take α to be γ, and β to
be √2. Then α^β = (√2^√2)^√2 = √2^(√2·√2) = (√2)^2 = 2, which is rational. So either
way we have found two irrational numbers α and β with the property that α^β is rational.
(It actually turns out that γ is not rational, but that fact is quite hard to prove.
The point is that we don’t need to know whether it is or not for this proof to
work.)
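The exponent arithmetic in this argument can be sanity-checked in floating point. The short Python sketch below is only a numerical illustration: floating-point numbers cannot tell us whether γ is rational, and the variable names are hypothetical.

```python
import math

root2 = math.sqrt(2)
gamma = root2 ** root2        # the number called γ in the proof, about 1.6325
value = gamma ** root2        # γ^√2, which equals 2 in exact arithmetic

print(gamma, value)
print(math.isclose(value, 2.0))
```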
The previous proof illustrates another technique that is often used in
proofs—consideration of cases. Occasionally, while working one’s way
through an argument, one encounters a situation that can occur in multiple
ways. In such a situation, it may be useful to just consider each possible way
the situation can occur and show the result is true in each case.
We now turn to the mechanics of proof in general. As stated earlier, a proof
consists of a string of assertions, each appropriately justified, leading to a
desired conclusion. (In high school geometry you probably did “two column
proofs” where each assertion really was a separate line in a column, with the
justification in the second column. Mathematicians write proofs in prose, but
it may help you to first write a two-column proof and then work on putting
the lines together into prose.) There are six permissible justifications for a
line in a proof: definition, assumption, axiom, previously proved theorem,
previous line in a proof, or principle of logic. Of these, the one that probably
requires the most explanation is “axiom”.
Modern mathematics is often done axiomatically: i.e., certain principles are
taken as “given” (they are, so to speak, the “rules of the game”) and deduc-
tions are made from them. You may have encountered this in your geometry
classes: the statement “given any two points, there is a unique line containing
them” is often taken as an axiom of Euclidean geometry. It isn’t something
that we attempt to justify rigorously; we simply assume it to be true. (Words
like “point” and “line” are generally taken as undefined terms; since there
are only a finite number of words in the English language, it is impossible to
define everything; if you tried, you would eventually wind up in a circular
situation.)
In this number theory book, we have not attempted to rigorously define the set
of integers by specifying axioms for them. We simply assume the reader is famil-
iar with them, and we assume as known all the familiar facts from arithmetic
that the reader has used for years. However, it is worth noting that some of these
facts can be taken as axioms and others can be proved as consequences of these
axioms. Appendix B summarizes some axioms for the integers and also speci-
fies some of the results that can be deduced, as theorems, from these axioms.
We now turn to a survey of some basic methods of proof. First is the direct
method, where, to prove “if P, then Q” we simply assume P and proceed,
step by step and using the six basic justifications specified above, to prove
Q. We illustrate with an example. In Chapter 1 of this text we define, for two
integers m and n, the relation “n divides m” (denoted n | m) to simply mean
m = nx for some integer x; this is just a precise way of saying “n goes evenly
into m”. The following theorem, summarizing basic facts about divisibility, is
stated in Chapter 1; we prove it here. The proofs are quite easy but do illus-
trate the method of a direct proof.
Theorem A.1
(a) n | n
(b) 1 | m
(c) if n | m and m | r then n | r
(d) if n | m and n | r then n | (m + r) and n | (m − r)
Proof
Theorem A.2
(a) if n | 1 then n = ±1
(b) if n | m and m | n then n = ±m
Proof
(a) We are told that there exists an integer x such that 1 = nx. It is intui-
tively obvious that this forces x to be 1 or −1, but let’s give a more
careful proof, using the fact that 1 is the smallest positive integer.
(The fact that 1 is, indeed, the smallest positive integer is something
that you can assume for the moment, but we will give a precise proof
immediately after the conclusion of this one.) Since 1 = nx, it is clear
that x is nonzero, so is either positive or negative. If x is positive and
not equal to 1, it is greater than 1, but then (since n must be positive
as well) we have nx ≥ x > 1, contradicting our assumption that 1 = nx.
Finally, suppose x is negative. Then 1 = nx = (−n)(−x), where now –x is
positive. By what we have just done, this forces –x = 1, from which we
conclude x = −1, as desired.
(b) Try this yourself.
prove a result that seems almost insultingly obvious: that there is no inte-
ger between 0 and 1, or, to rephrase things, that 1 is the smallest posi-
tive integer. Although obvious sounding, this result is actually used in
other proofs (in fact, we just used it above, and will again use it, almost
immediately, to prove the Principle of Mathematical Induction) and, if
you’re going to do things very precisely, requires proof. The proof is actu-
ally quite simple and provides a good illustration of how to use the Well-
Ordering Principle.
Theorem A.3
Proof
Suppose to the contrary that a positive integer less than 1 existed. Then the
set S of all positive integers less than 1 is nonempty, and hence, by the Well-
Ordering Principle, has a smallest element; call it x. Multiply the inequality
0 < x < 1 by x; since we are multiplying by a positive integer, the inequality
is preserved and we get 0 < x^2 < x < 1. It follows from this that x^2 is a positive
integer that is less than 1 but also less than x, which contradicts our
definition of x.
We next prove the Principle of Mathematical Induction (see Section 1.1 of
the text) as a consequence of the Well-Ordering Principle. For convenience,
we restate the Principle of Mathematical Induction.
Theorem A.4
Proof
Assume, hoping for a contradiction, that there is a positive integer that is not
in S. Then, by the Well-Ordering Principle (applied to the nonempty set of all
such integers), there must be a smallest positive integer not in S; call it k. Note
that k ≠ 1 (because 1 ∈ S), so k – 1 is a positive integer. (Note that we are using
the previously proved result here!) It is also in S (since it is smaller than k).
Theorem A.5
Proof
Exercises
A1. Is the statement “Paris is the capital of France and New York is
the capital of Spain” true or false? What is the negation of this
statement?
A2. Write down the negation of the statement “If it is raining, I will go
to the movies”.
A3. Prove part (b) of Theorem A.2 above.
A4. Write down four true statements, two of which have false con-
verses and two of which have true converses. The statements
you choose can be “mathematical” or “nonmathematical”, as you
choose.
A5. For purposes of this problem, assume that any integer can be writ-
ten in the form 2m or 2n + 1 for some integer m or n. Integers of the
first kind are, of course, called even; integers of the second kind
are called odd. Use properties of divisibility to prove that no inte-
ger can be both even and odd.
A6. (See previous problem.) Prove that the sum of two even, or two
odd, integers is even. Prove that the sum of an even integer and an
odd integer is odd.
Appendix B: Axioms for the Integers
Because the reader has presumably been dealing with the set of integers for
years now, he or she is no doubt familiar with some of their very basic prop-
erties—for example, that the product of two nonzero integers is nonzero. In
this book, we will simply assume familiarity with these properties and use
them freely. However, it is worthwhile to note that the set of integers can be
characterized by axioms, or assumptions, that, if taken for granted, can be
used to prove all the other properties of the integers that we will need. For the
benefit of those who prefer a somewhat more formal approach to the integers,
and to give some practice in the construction of simple proofs, we briefly
indicate in this Appendix how an axiomatic approach can be carried out.
Our axioms are divided into three groups: axioms of Arithmetic, Order and
a Well-Ordering Principle. We assume the existence of the set Z = {…−2, −1, 0,
1, 2, … } on which are defined two operations of addition and multiplication.
Order Axioms: there exists a nonempty subset P of the set of integers, called
the set of positive integers, with the following properties:
Well-Ordering Principle:
• If m + r = n + r, then m = n
• m0 = 0
• −(−m) = m
• (−m)(n) = −mn
• (−m)(−n) = mn
• if mn = 0 then either m = 0 or n = 0
• if mr = nr and r is nonzero, then m = n
The reader may wonder why it is even necessary to prove “obvious” facts like
these. This is the nature of mathematical reasoning: when proving things
from axioms, we cannot take anything for granted. If one is going to develop
mathematics rigorously, then careful definitions and careful proofs (even of
things that seem obvious) cannot be avoided. So, for the sake of complete-
ness, we will prove some of the facts above and leave the others as exercises.
We start by proving the first property, which can be summarized by the
phrase “additive cancellation”. Suppose m + r = n + r. Then add –r to both
sides of this equation, getting (m + r) + (−r) = (n + r) + (−r), which by the asso-
ciative law reduces to m + (r + (−r)) = n + (r + (−r)), which by axiom 5 leads to
m + 0 = n + 0, or (by axiom 3) m = n, as desired.
With this established, we can easily prove the second property above. We
know that 0 = 0 + 0 by axiom 3, so we have the following chain of equalities:
0 + m0 = m0 = m(0 + 0) = m0 + m0. It follows from this, and the previous result,
that m0 = 0.
For the third property, first consider m + (−m). By axiom 5, this is 0. On the
other hand, axiom 5 also tells us that when we add − (−m) to − m, we get 0.
So we have m + (−m) = 0 = −(−m) + (−m), and additive cancellation gives
m = −(−m).
We leave it to the reader to prove the fourth and fifth properties above.
To prove the sixth property above, we use the order axioms as well as the
arithmetic ones. If mn = 0 and neither m nor n is zero, then there are three
possibilities: both m and n are positive, both are negative, or one is positive
and one is negative. If m and n are positive, then by axiom 10 the product
mn is also positive, and hence, by axiom 9, cannot be 0. If m and n are both
negative, then −m and −n are positive, and once again mn = (−m) (−n) is posi-
tive, and hence can’t be 0. We leave to the reader the task of disposing of the
one remaining case and also proving the seventh and last bulleted property
above.
One thing that might be noted from the list of results above is the
familiar fact that “the product of two negative numbers is positive”, a fact
that students learning arithmetic for the first time sometimes wonder about.
This fact, we now see, actually follows logically from the other axioms. A
number of other “obvious” facts about arithmetic follow from these defini-
tions, but aside from a few that are listed in the exercises (e.g., −0 = 0), we will
not make the effort to list and prove all of them; now that we have given a set
of axioms and seen how they can be used, we will simply take all the familiar
basic principles of arithmetic as given and use them without explicit proof.
Note also that nothing is said in these axioms about division. There’s a
reason for that, of course: there is no operation of division defined on the
integers because the quotient of two integers may very well not be an integer.
For example, 1 divided by 2 is ½, which is certainly not an integer. However,
we will see in the text that given two integers, we can divide one by the other,
obtaining a quotient and remainder. This is another “intuitively obvious”
result, and the reason it is not listed as an axiom is that it can be deduced, as a
theorem, from the other axioms. Here, however, the proof is not trivial, but it
is instructive, so it is proved in Chapter 1. Another nontrivial but very useful
result that can be deduced from the axioms is the Principle of Mathematical
Induction, which is also discussed in Chapter 1.
Exercises
B1. Prove the fourth, fifth and seventh bulleted properties of the
integers.
B2. Prove that if m is a nonzero integer then m^2 > 0.
B3. Prove that if a < b and b < c then a < c.
B4. Explain from the axioms why –0 = 0.
Appendix C: Basic Algebraic Terminology
When the binary operation * is understood, we will denote the group just
by identifying the set and speak of “the group G”. In addition, it is custom-
ary to suppress the * notation and denote the binary operation by ordinary
juxtaposition of letters. In other words, we write ab instead of the more cum-
bersome a*b. It should be kept in mind, however, that ab does not necessar-
ily symbolize the product of a and b under any kind of multiplication, but
instead the product under an abstract operation.
One other point should be emphasized: it is implicit in the definition of
“binary operation” that the set G is closed under the binary operation *: in other
words, if a and b are elements of G, then a*b is also an element of G. Thus, for
example, the set of positive integers is not a group under subtraction, because
the set is not closed under this operation: 3 and 5 are in the set, but 3 − 5 = −2 is not.
A fairly trivial consequence of the defining conditions of a group is that
in any group G, cancellation holds: if ab = ac, then b = c. To see this, simply
“multiply” both sides of the given equation by a^(−1) on the left and use the
associative law.
Some more definitions: If G is a finite set, with, say, n elements, then we
say G has order n; if G is an infinite set, then we say G has infinite order. If the
binary operation is commutative, i.e., ab = ba for all a, b ∈ G, then we say that
G is an abelian group. (This is named for the Norwegian mathematician Niels
Henrik Abel.)
Consider the set of nonnegative powers of a: e, a, a^2, …. One of two things must
be the case: either these powers are all distinct or there is some repetition among
them, say a^m = a^n with m > n. If the latter condition holds (which it must if G is
finite), then by cancellation a^{m-n} = e, and so by the Well-Ordering Principle there
is a smallest positive integer d such that a^d = e. This smallest positive integer d
is called the order of a. If the powers of a are all distinct, then we say that a has
infinite order.
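To make the definition concrete, the following sketch computes the order of an element in one particular finite group: the invertible congruence classes modulo n under multiplication. The choice of group and the function name `element_order` are illustrative assumptions, not notation from the text.

```python
from math import gcd


def element_order(a: int, n: int) -> int:
    """Return the smallest positive d with a**d congruent to 1 modulo n.

    Assumes n >= 2 and gcd(a, n) == 1, so that a represents an element
    of the finite multiplicative group of units modulo n.
    """
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("a must be a unit modulo n, with n >= 2")
    power = a % n  # current power a**d reduced modulo n, starting at d = 1
    d = 1
    while power != 1:
        power = (power * a) % n
        d += 1
    return d


# Powers of 2 modulo 7 are 2, 4, 1, so 2 has order 3 in this group.
print(element_order(2, 7))  # prints 3
# Powers of 3 modulo 7 are 3, 2, 6, 4, 5, 1, so 3 has order 6.
print(element_order(3, 7))  # prints 6
```

The loop must terminate because the group is finite, which is exactly the repetition argument given above.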
Now, suppose a has order d. Then it is not hard to see that the set {e, a, a^2,
…, a^{d-1}} is a subgroup of G of order d; let us denote this set ⟨a⟩. Note that all
higher powers of a are automatically in ⟨a⟩: since a^d = e by assumption, we
“loop around” when considering a^d, a^{d+1} = a^d a = a, etc. Observe also that this
set is the smallest possible subgroup of G containing a; we call it the subgroup
of G generated by a. Thus, if an element of a group has finite order, this order
is also the order of the subgroup generated by that element. It follows that if
G is a finite group of order n, then (by Lagrange’s Theorem), d ∣ n. This observation,
in turn, allows us to deduce another: since n = dk for some integer k,
it follows that a^n = a^{dk} = (a^d)^k = e^k = e. Thus, in a group of order n, if we take any
element and raise it to the nth power, we get the identity.
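This last observation can be checked numerically in a small case. The sketch below uses, as an assumed example, the group of invertible congruence classes modulo m under multiplication; the helper names `units` and `check_power_is_identity` are hypothetical. In this particular group the statement is Euler's theorem.

```python
from math import gcd


def units(m: int) -> list[int]:
    """List the representatives 1 <= a < m that are invertible modulo m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def check_power_is_identity(m: int) -> bool:
    """Check that a**n is 1 modulo m for every unit a, where n is the group order."""
    group = units(m)
    n = len(group)  # order of the group of units modulo m
    return all(pow(a, n, m) == 1 for a in group)


print(units(12))                    # prints [1, 5, 7, 11]
print(check_power_is_identity(12))  # prints True
```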
We next consider a different kind of algebraic system, one with two binary
operations defined on a set R. These operations are called addition (denoted +)
and multiplication (denoted by juxtaposition). We say that R, with respect to
these operations, is a ring if, for all a, b and c in R:
• The set R is an abelian group with respect to addition (with the identity
denoted 0)
• The distributive laws hold: a(b + c) = ab + ac and (b + c)a = ba + ca
• Multiplication is associative: (ab)c = a(bc)
• There is a multiplicative identity, i.e., an element 1 in R such that
1a = a1 = a.
A few remarks: First, not all authors require the last condition (multiplicative
identity) as part of the definition of a ring and use the term ring with identity
to denote rings that happen to have a multiplicative identity. However, it is
becoming more and more common to require a ring to have an identity, and
since the rings that we will encounter all do have an identity, we will require
this condition as part of the definition.
Second, note that we have not required multiplication to be commutative.
In other words, we do not require that ab = ba for all elements a and b in R. A
commutative ring is one in which this requirement does hold.
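For a finite set, the ring axioms can be verified by brute force. The sketch below does this for the integers modulo n under addition and multiplication modulo n. This is an assumed example and the function name `is_commutative_ring_mod` is hypothetical. It checks each bulleted condition, together with commutativity of multiplication.

```python
from itertools import product


def is_commutative_ring_mod(n: int) -> bool:
    """Brute-force check of the ring axioms for {0, ..., n-1} modulo n (n >= 1)."""

    def add(a: int, b: int) -> int:
        return (a + b) % n

    def mul(a: int, b: int) -> int:
        return (a * b) % n

    elems = range(n)
    one = 1 % n  # multiplicative identity (equals 0 in the one-element ring)

    for a in elems:
        # Additive identity and additive inverse.
        if add(a, 0) != a or add(a, (-a) % n) != 0:
            return False
        # Multiplicative identity on both sides.
        if mul(one, a) != a or mul(a, one) != a:
            return False
    for a, b in product(elems, repeat=2):
        # Commutativity of addition and of multiplication.
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False
    for a, b, c in product(elems, repeat=3):
        # Associativity of both operations.
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        # Both distributive laws.
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
        if mul(add(b, c), a) != add(mul(b, a), mul(c, a)):
            return False
    return True


print(is_commutative_ring_mod(6))  # prints True
```

A check of this kind covers only one modulus at a time, so it illustrates the axioms rather than proving anything in general.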
Here are some examples that will be particularly relevant for us. The set
ℤ of integers is a commutative ring with respect to the “usual” operations of
addition and multiplication, as is the set ℤ_n of congruence classes modulo
some positive integer n. Likewise, the sets ℚ and ℝ of rational and real numbers,
respectively, are commutative rings. The set of even integers is not a ring
[AC] A. Adler and J.E. Coury, Theory of Numbers: A Text and Source Book of Problems,
Jones and Bartlett Publishers, Burlington, MA, 1995.
[Cam] D. Campbell, An Open Door to Number Theory, MAA Press, New Denver, 2018.
[Con] K. Conrad, The Gaussian Integers,
https://kconrad.math.uconn.edu/blurbs/ugradnumthy/Zinotes.pdf.
[Jar] F. Jarvis, Algebraic Number Theory, Springer-Verlag, Heidelberg/Berlin, Germany,
2014.
[Kim] S. Kim, “An elementary proof of the quadratic reciprocity law,” American
Mathematical Monthly, 111, 1 (2004), 48–50.
[KW] J. Kraft and L. Washington, An Introduction to Number Theory with Cryptography,
2nd edition, CRC Press, Boca Raton, FL, 2018.
[NZM] I. Niven, H. Zuckerman and H. Montgomery, An Introduction to the Theory
of Numbers, 5th edition, Wiley, New York, 1991.
[R-S] S. Rubinstein-Salzedo, Cryptography, Springer-Verlag, Heidelberg/Berlin,
Germany, 2018.
[Ros] K. H. Rosen, Elementary Number Theory, 6th edition, Pearson, Upper Saddle
River, NJ, 2010.
[Sil] J. Silverman, A Friendly Introduction to Number Theory, 4th edition, Pearson, Upper
Saddle River, NJ, 2011.
[St] J. Stillwell, Elements of Number Theory, Springer-Verlag, Heidelberg/Berlin,
Germany, 2003.
Index
Division Algorithm 12, 13, 15, 16, 28, 29, 72, 116, 120–122
  relationship between congruence and 32
  in ℤ[i] 103–107

elementary number theory 139, 140
ElGamal Cryptosystem 80–81
encryption/enciphering 55
equivalence class 32
equivalence relation 31
Euclidean Algorithm 19–22, 106, 122
Euclid’s Lemma 23, 47, 84, 85, 106, 107, 114, 116, 122
Euler phi function 43–46, 49, 77
  multiplicative 43–46
Euler’s Criterion 84–88, 91
Euler’s theorem 49, 61, 71, 72
even 11, 12
even perfect numbers 67–69

Fermat’s Last Theorem 1, 2, 117
Fermat’s Little Theorem (FLT) 47–49, 85
Fibonacci sequence, defined as 22
field 38, 76, 142
FLT see Fermat’s Little Theorem (FLT)
frequency analysis 56, 58, 59
Fundamental Theorem of Arithmetic 10, 24, 26, 106, 123

Gaussian integers 52, 97–100, 106
  divisibility and primes in 100–102
  geometric interlude 99–100
Goldbach conjecture 4
greatest common divisor (gcd) 14–19, 39, 40
  in ℤ[i] 103–107
group 72, 139, 140
  element of 141

Hilbert’s Tenth Problem 2
Hill cipher 59
Hurwitz integers 120–125

infinite order 139, 141
integers 7, 16
  algebraic 118
  algebraic numbers and 118–119
  axioms for 135–137
  Gaussian 52, 97–100, 106
  order of 71–73
  ordinary 98, 101–106, 108, 110, 121
  parity of 11
  prime 10, 23, 25
integral domains 37, 142
irreducible element 107, 113, 116, 117
irreducible Gaussian integers 101–102, 113–115

Jacobi symbol 93–95

Kerchkoff’s Principle 57

Lagrange’s Theorem 72, 140
Lamé’s theorem 22
Law of Quadratic Reciprocity 88–91
Legendre symbols 84–89, 93–95
linear combination 16
linear equations 58
  in ℤ_n 39–43

Mathematical Manual 42
mathematical reasoning 136
A Mathematician’s Apology (Hardy) 4–5
mathematics
  numbers in 7
  problems in 1
Mersenne primes 67, 68
multi-digit nonnegative numbers 28
multiplication, in ℤ_n 36–37
multiplicative functions 66, 78
multiplicative inverses 37, 38, 40–42, 46, 47, 49, 62, 74, 142
multiplicative property 141, 142
  Euler phi-function 44–45
  of Legendre symbol 86, 92