MAT105 Elementary Number Theory

Claire Burrin
(Fall 2024, University of Zurich)

Contents

Chapter 1. Introduction
1.1. Basic principles
1.2. Pythagorean triples
1.3. Homework 1
Chapter 2. The Fundamental Theorem of Arithmetic
2.1. Prime numbers
2.2. Failure of unique factorization
2.3. Factoring
2.4. Homework 2
Chapter 3. Euclid’s Algorithm
3.1. Division with remainder
3.2. Complexity theory
3.3. Euclid’s lemma and Bézout’s identity
3.4. Homework 3
Chapter 4. Continued Fractions
4.1. Continued fraction expansions
4.2. Convergents
4.3. Constructing irrational numbers
4.4. Homework 4
Chapter 5. Diophantine Approximation
5.1. Dirichlet’s theorem (1842)
5.2. Hurwitz’s theorem (1891)
5.3. Liouville’s theorem (1844)
5.4. Homework 5
Chapter 6. Congruences (I)
6.1. The basic algebra of congruences
6.2. Some theorems to prime modulus
6.3. Homework 6
Chapter 7. Congruences (II)
7.1. Fermat–Euler theorem
7.2. Chinese remainder theorem
7.3. Euler’s totient function
7.4. Homework 7
Chapter 8. Quadratic Residues
8.1. Public key cryptography – RSA
8.2. Quadratic residues
8.3. Primes that are the sum of two squares
8.4. Homework 8
Chapter 9. Quadratic Reciprocity
9.1. Supplementary laws
9.2. Quadratic reciprocity law
9.3. Jacobi and Kronecker symbols
9.4. Homework 9
Chapter 10. Applications of Residue Symbols
10.1. Apollonian circle packings
10.2. The arithmetic of Gaussian integers
10.3. Homework 10
Chapter 11. Primes
11.1. Gaussian primes
11.2. Rational primes
11.3. Homework 11
Chapter 12. Generating Functions
12.1. Ordinary generating functions
12.2. The zeta function
12.3. The zeta function for Gaussian integers
12.4. An irrationality proof of the infinitude of primes
12.5. Homework 12
CHAPTER 1

Introduction

A course in elementary number theory presents a corpus of results on the natural
numbers, the integers, the rationals, congruences and Diophantine equations. These
results are elementary not in the sense that they are simple, but in the sense that they
do not require advanced techniques of analysis (complex analysis, harmonic analysis, or
measure theory) or algebra (Galois theory). You will encounter these more advanced
techniques during your undergraduate studies, and much of that mathematics was
inspired by questions about number patterns. Hence this course has several goals: to
provide an overview of classical pearls of number theory and discuss their place in
modern mathematics, with a view towards introducing concepts in analysis and algebra
that you will study in more depth over the next two or three years.

1.1. Basic principles


The way we conceive of mathematics as a science is largely inherited from the
ancient Greeks. The Greek method followed a clear strategy: starting with definitions,
agreeing on basic principles (also called laws, axioms, or postulates), and then justifying
claims via deductive reasoning and proofs, a logical structure that continues to define
mathematical practice today.
Arithmetic (from arithmos, number, and tike, art) is the study of numbers and their
properties under the elementary operations of adding, subtracting, multiplying, and dividing.
The set of natural numbers is defined to be the countably infinite set
N = {1, 2, 3, ...}.
This set is totally ordered (i.e. for any two elements a, b ∈ N, either a ≤ b or b ≤ a).
We take for granted the following ‘laws of arithmetic’: for any a, b, c ∈ N,
(1) addition and multiplication are commutative: a + b = b + a, ab = ba;
(2) addition and multiplication are associative: a + (b + c) = (a + b) + c, a(bc) =
(ab)c;
(3) multiplication is distributive with respect to addition: a(b + c) = ab + ac;
(4) If a < b then a + c < b + c and ac < bc;
(5) If ac = bc or a + c = b + c then a = b.
Here is an example of these principles at work. Let a, b ∈ N. We
say that a divides b (shorthand notation a | b) if there is a number n ∈ N such that
b = an.
Proposition 1. If a | b and a | c then a divides any linear combination of b and c,
i.e., for all u, v ∈ N, a | (ub + vc).
Proof. Write b = ma, c = na. Then for any u, v ∈ N, we have
ub + vc = u(ma) + v(na) = (um)a + (vn)a = (um + vn)a,
which shows that a | (ub + vc).
To this arsenal of basic laws of arithmetic, we will add the principle of induction.
Consider for example the sum of the first few odd numbers:
1, 1 + 3 = 4, 1 + 3 + 5 = 9, 1 + 3 + 5 + 7 = 16, 1 + 3 + 5 + 7 + 9 = 25,
1 + · · · + 9 + 11 = 36, 1 + · · · + 11 + 13 = 49, 1 + · · · + 13 + 15 = 64, . . .
It appears that the sum of the first n odd numbers is equal to n^2. To justify this
claim, we need to prove that it holds for all values of n, however large. A
straightforward strategy in such situations is to rely on ‘proof by induction.’
Proposition 2. The sum of the first n odd numbers is equal to n^2.
Proof. The statement is clearly true for n = 1. The n-th odd number has the
form 2n − 1, so the (n + 1)-th odd number is 2n + 1. If we assume that the statement
is true at step n then
sum of the first n odd numbers + (2n + 1) = n^2 + 2n + 1 = (n + 1)^2,
which proves that the statement is true also at step n + 1.
Proof. For fun, here is an alternative proof that avoids induction with an imagi-
native trick. Write down the list of the first n odd numbers twice, the second time in
reverse order:
1 3 . . . 2n − 3 2n − 1
2n − 1 2n − 3 . . . 3 1
We can sum up all the terms by considering that each column sums up to 2n and
that there are n columns, so we have that twice the sum of the first n odd numbers is
n · 2n = 2n^2, and hence the sum itself is n^2.
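A numerical check is no substitute for either proof, but it is a quick way to test the claim before trying to prove it. The following Python sketch (the function name sum_first_odds is our own choice) compares both sides for the first thousand values of n.

```python
def sum_first_odds(n):
    """Sum of the first n odd numbers 1 + 3 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))

# Compare with n^2 for n = 1, ..., 1000 (evidence, not a proof).
for n in range(1, 1001):
    assert sum_first_odds(n) == n ** 2

print([sum_first_odds(n) for n in range(1, 9)])
```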
The set N of natural numbers is not closed under subtraction; for example 1 − 1 =
0 ∉ N. For this, we have the ring of integers¹
Z = {..., −2, −1, 0, 1, 2, ...},
which is now closed under subtraction, but not under division; for example 1/2 ∉ Z. For
this, we have the field of fractions
Q = {a/b : a ∈ Z, b ∈ N}.
The latter sets, Z and Q, are prototypes of important algebraic structures called ring
and field, respectively. For now, think of a ring as a set closed under addition,
multiplication, and subtraction, such that each element a has an additive inverse −a,
with a + (−a) = 0, while a field is a set closed under all four operations (division by 0
excepted), such that each element has an additive inverse and each nonzero element a
has a multiplicative inverse a^{−1}, with a · a^{−1} = 1. Other major examples of fields are
R and C, the real and complex fields, respectively. These are successively larger than Q as
R is also closed under taking the square root of positive integers, and C is closed under
taking all square roots (since i = √−1 ∈ C). We will come back to these algebraic
structures later in the course, see that their structure can be studied abstractly and
systematically, and that they cover many more natural, arithmetically interesting
examples.

¹ The letter Z stands for Zahlen. From the 18th century to the 1930s, German was the primary
language of mathematics.

1.2. Pythagorean triples


Systematic studies of number patterns predate the ancient Greeks. For instance,
a famous clay tablet of Babylonian mathematics (2000–1600 BC), listed as Plimpton
322, records an impressive number of what would later be called Pythagorean triples
in connection with Pythagoras’ theorem.
Definition 3. A Pythagorean triple is a triple (a, b, c) of natural numbers such that
a^2 + b^2 = c^2.
Pythagoras’ theorem played an important part in the development of arithmetic,
even though it is first thought of as a landmark of classical geometry. The systematic
study of numbers carried out by the Greek thinkers was long limited to the natural
numbers and fractions thereof. But they quickly realized that if one considers a
right-angled triangle of side-lengths 1 and 1, the length of the hypotenuse, √2, does not
fit this paradigm. The proof of this fact, a proof by contradiction, is one of the early
pearls of mathematical reasoning.

Theorem 1. √2 is irrational.

Proof. Suppose (for contradiction) that √2 is rational. That is, there are numbers
a, b ∈ N such that √2 = a/b, and we may assume that this fraction is reduced, that
is, that the greatest common divisor (gcd) of a and b is 1.
Squaring and rearranging, we thus have 2b^2 = a^2. In particular, a^2 is an even
number. We leave it to the reader to prove as an exercise that if a^2 is even then a is
even. (Or its contrapositive: if a is odd then a^2 is odd.) If a is even and the gcd of
a and b is 1, b must be odd, and hence (by the previous exercise) b^2 is odd. We may
thus write a = 2m and b^2 = 2n + 1. Now 2b^2 = a^2 implies that
4n + 2 = 4m^2.
The right-hand side is a multiple of 4, the left-hand side isn’t; we have reached a
logical contradiction, leading to the conclusion that our original assumption (i.e., √2
is rational) was erroneous.
Let’s come back to Pythagorean triples. The first example is the triple (3, 4, 5),
followed by (5, 12, 13); in fact there are infinitely many Pythagorean triples and we
know all of them. Already the tablet Plimpton 322 suggests that the Babylonians had
worked out a method to construct such triples, long before the Greeks discovered the
solution to this problem.² The next theorem lists all Pythagorean triples. We say that
two numbers a, b are coprime if their greatest common divisor (gcd), denoted (a, b),
is 1.

² One may wonder if the Babylonian mathematicians came to their construction via the
manipulation of remarkable identities such as (a + b)^2 = a^2 + 2ab + b^2, which they knew well;
the equation c^2 − a^2 = b^2 does suggest the choice c = p^2 + q^2, a = p^2 − q^2, b = 2pq.
Theorem 2. Up to switching the order of a and b, every Pythagorean triple has
the form
a = m(p^2 − q^2),   b = 2mpq,   c = m(p^2 + q^2),
where m ∈ N, and p > q are coprime numbers, one of which is even and the other one
odd. Conversely, any triple (a, b, c) of the above form is a Pythagorean triple.
Proof. Observe that if (a, b, c) is a Pythagorean triple then so is (ma, mb, mc) for
all m ∈ N. We call a triple for which (a, b, c) = 1 a primitive Pythagorean triple. Next
we observe that finding primitive Pythagorean triples amounts to listing all points on
the first quadrant of the unit circle with rational coordinates:
(a/c)^2 + (b/c)^2 = 1.
From the algebraic equation of the circle we have Y^2 = 1 − X^2 = (1 − X)(1 + X).
Setting
t := Y/(X + 1)
(for X ≠ −1), the equation becomes
t^2 = (1 − X)/(1 + X)
and can be solved in X:
t^2 = (1 − X)/(1 + X) ⇐⇒ X = (1 − t^2)/(1 + t^2),   Y = t(1 + (1 − t^2)/(1 + t^2)) = 2t/(1 + t^2).
(Geometrically, our computation parametrizes the intersection points (X, Y) of the
unit circle X^2 + Y^2 = 1 with the line Y = t(X + 1).) It follows that (X, Y) ∈ Q^2 if
and only if t ∈ Q. Let us write t = q/p for p and q coprime. Then X and Y can now be
expressed as
X = a/c = (p^2 − q^2)/(p^2 + q^2),   Y = b/c = 2pq/(p^2 + q^2).
We can easily check that (p^2 − q^2, 2pq, p^2 + q^2) is a Pythagorean triple, but we still need
to check that it is primitive.
Assume first that one of p, q is odd and the other even. Then both p^2 + q^2, p^2 − q^2
are odd. Suppose that d divides both p^2 + q^2, p^2 − q^2; in particular, d is an odd number.
Then d also divides (p^2 + q^2) + (p^2 − q^2) = 2p^2 and (p^2 + q^2) − (p^2 − q^2) = 2q^2. Since
d is odd, it must divide both p^2 and q^2, which forces d = 1 since we assumed p, q are
coprime (so that p^2, q^2 are coprime as well). This proves that (p^2 − q^2, p^2 + q^2) = 1. We
leave it to the reader to conclude that (2pq, p^2 + q^2) = 1, and hence
(p^2 − q^2, 2pq, p^2 + q^2) = 1.
Since p, q are coprime, they cannot both be even and that leaves us with the case
that p and q are both odd. Then p + q, p − q are even and we write
p + q = 2P,   p − q = 2Q.
Adding (respectively subtracting) these two equations, we find p = P + Q and q =
P − Q. (In particular, one of P, Q is odd, and the other even.) In terms of P and Q
we now have
a/c = ((P + Q)^2 − (P − Q)^2)/((P + Q)^2 + (P − Q)^2) = 2PQ/(P^2 + Q^2),   b/c = (P^2 − Q^2)/(P^2 + Q^2),
that is, the same result as before, with the order of a and b switched. The reader can
now check that (p, q) = 1 implies that (P, Q) = 1. This concludes the proof.
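The parametrization of Theorem 2 can be tried out directly. The Python sketch below lists the primitive triples (the case m = 1) with hypotenuse c up to a given bound; the function name primitive_triples and the bound 30 are our own illustrative choices, and we assume p > q ≥ 1 so that a is positive.

```python
from math import gcd

def primitive_triples(limit):
    """Yield (p^2 - q^2, 2pq, p^2 + q^2) with p^2 + q^2 <= limit,
    for coprime p > q >= 1 of opposite parity."""
    p = 2
    while p * p + 1 <= limit:
        for q in range(1, p):
            c = p * p + q * q
            if c > limit:
                break  # c only grows with q
            if (p - q) % 2 == 1 and gcd(p, q) == 1:
                yield (p * p - q * q, 2 * p * q, c)
        p += 1

triples = sorted(primitive_triples(30), key=lambda t: t[2])
for a, b, c in triples:
    assert a * a + b * b == c * c       # Pythagorean
    assert gcd(gcd(a, b), c) == 1       # primitive
print(triples)
```

Multiplying each triple by m ∈ N, and possibly swapping a and b, then gives all remaining triples described by the theorem.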
Around 1637, while studying a proof of this theorem in a translation of Diophantus’s
Arithmetica, Fermat wrote in the margin that for n ≥ 3 the Diophantine equations
x^n + y^n = z^n
no longer have solutions in positive integers. To this he added that the margin was
unfortunately too narrow for his proof; it would take another 350 years and impressive
developments of mathematics in the 20th century for a proof of ‘Fermat’s Last Theorem’
to emerge (Wiles, Taylor–Wiles 1994).
In general, a Diophantine equation (as named after Diophantus) is a polyno-
mial equation with at least two unknowns and integer coefficients, in which we study
the integer or rational solutions. Such equations are still a topic of current research.
Hilbert’s tenth problem (1900) asked whether there is an algorithm, which given any
Diophantine equation, can decide whether the equation has an integer solution. It is a
theorem (Matiyasevich, 1970) that such a general algorithm cannot exist.
1.3. Homework 1
(1) Show that if a | b and c | d then ac | bd.
(2) Let a ∈ N. Show that if a is odd then a^2 is odd.
(3) Use Fermat’s Last Theorem to deduce that the n-th root of 2, i.e., 2^{1/n}, is
irrational for all n ≥ 3.
(4) The Fibonacci sequence (F_n) is defined recursively by F_0 = 0, F_1 = 1, F_n =
F_{n−1} + F_{n−2} for n ≥ 2. Prove that F_n is even if and only if 3 | n.³
(5) It is not difficult to find a closed-form formula for F_n; prove by induction that
F_n = (ϕ^n − ϕ̄^n)/√5,
where ϕ = (1 + √5)/2 and ϕ̄ = (1 − √5)/2 each satisfy the equation x + 1 = x^2.⁴

³ To prove this ‘if and only if’ statement, you need to prove two things: that F_n even implies that
3 | n and that 3 | n implies that F_n is even.
⁴ To argue by induction, you need here to first verify that the statement is true for n = 0, 1, and
then use induction to show that it is true for all larger integers n.
CHAPTER 2

The Fundamental Theorem of Arithmetic

In working out the complete list of Pythagorean triples, we had to show that
(p^2 + q^2, p^2 − q^2) = (p^2 + q^2, 2pq) = 1, for which we implicitly relied on the fundamental
theorem of arithmetic, which says that every natural number can be written as a
product of primes in a unique way (up to reordering).
2.1. Prime numbers
Each natural number n ≥ 2 has at least two divisors: 1 and itself. We refer to those
as the trivial divisors of n.
Definition 4. A natural number n ≥ 2 that only has trivial divisors is called a prime number.
Otherwise, we say that n is composite.
Prime numbers are the irreducible factors of multiplication, and the sequence of
prime numbers is one of the most studied patterns of numbers. We will later see
various proofs that there are infinitely many primes, and discuss some important open
conjectures in prime number theory.
Regarding n = 1 we adopt the following conventions:
• 1 is not considered to be a prime number;
• 1 is seen as the ‘empty product’ (1 = a^0 for any a).
These conventions will allow us to formulate the fundamental theorem of arithmetic in
the clearest possible way. Before stating the theorem, we consider the following much
simpler fact.
Proposition 5. Every natural number n factors as a product of primes, i.e.,
n = p_1 p_2 ⋯ p_r,
where p_1, ..., p_r are primes.
Proof. We proceed by induction. Suppose that the statement holds for all natural
numbers m < n. If n is prime, we are done. If n is composite we write n = ab for two
natural numbers 1 < a, b < n. In particular, the induction hypothesis applies to a and
b, hence their product is also a product of primes. 
The representation of a number as a product of divisors is not unique, e.g.,
12 = 2 · 6 = 3 · 4.
The fundamental theorem of arithmetic asserts that its representation as a product of
prime divisors is unique (up to reordering):
12 = 2^2 · 3 = 3 · 2^2.

Theorem 3 (Fundamental theorem of arithmetic). Every natural number n can be
represented uniquely as a product of primes.
That is, if n can be written both as n = p_1 · p_2 ⋯ p_r and as n = q_1 · q_2 ⋯ q_s, where
p_1 ≤ p_2 ≤ ⋯ ≤ p_r, q_1 ≤ q_2 ≤ ⋯ ≤ q_s, then r = s and p_i = q_i for 1 ≤ i ≤ r.
Remark 6. Observe that if 1 is considered to be a prime, the statement would no
longer hold; e.g., 3 = 3 · 1 would be two distinct representations of 3 as a product of
primes.
Proof. We again proceed by induction and assume that the statement holds for
all m < n. Suppose now that n is composite and has two representations as product
of primes:
n = p_1 · p_2 ⋯ p_r = q_1 · q_2 ⋯ q_s,
where we may assume that p_1 ≤ p_2 ≤ ⋯ ≤ p_r, q_1 ≤ q_2 ≤ ⋯ ≤ q_s. We will assume
p_i ≠ q_j for all 1 ≤ i ≤ r, 1 ≤ j ≤ s; otherwise we can cancel some factors on both sides
of the equation and apply the induction hypothesis.
Since we are assuming that n is composite, n = p_1(p_2 ⋯ p_r) ≥ p_1^2 and similarly
n ≥ q_1^2. Thus n ≥ p_1 q_1, where we can only have n = p_1 q_1 if p_1 = q_1; since p_1 ≠ q_1,
in fact n > p_1 q_1. Observe that both p_1 and q_1 divide n − p_1 q_1. Since by induction
hypothesis n − p_1 q_1 has a unique prime factorization, both p_1, q_1 must appear in it;
we write
n − p_1 q_1 = p_1 q_1 r_1 ⋯ r_t,
where r_1, ..., r_t are primes. We now have the expression
n = p_1 q_1 (1 + r_1 ⋯ r_t) = p_1 p_2 ⋯ p_r.
Cancelling p_1 on both sides, and referring to the induction hypothesis, we find that q_1
is one of the primes appearing in p_2, ..., p_r; a contradiction.
The fundamental theorem of arithmetic makes all kinds of practical and theoretical
computations easier. Here are three examples.

Example 1: How many divisors does n have? Writing
n = p_1^{k_1} ⋯ p_r^{k_r}
so that p_1, ..., p_r are distinct primes, and k_1, ..., k_r ≥ 1, we find that the number d(n)
of positive divisors of n is
d(n) = (k_1 + 1)(k_2 + 1) ⋯ (k_r + 1).
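As a sanity check, the formula for d(n) can be compared with a direct count of divisors. In the Python sketch below the exponents k_1, ..., k_r are supplied by hand, and the helper names num_divisors and num_divisors_brute are our own.

```python
from math import prod

def num_divisors(exponents):
    """d(n) = (k_1 + 1)(k_2 + 1)...(k_r + 1) for n = p_1^k_1 ... p_r^k_r."""
    return prod(k + 1 for k in exponents)

def num_divisors_brute(n):
    """Count the positive divisors of n one by one."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

# 12 = 2^2 * 3, so d(12) = 3 * 2 = 6.
assert num_divisors([2, 1]) == num_divisors_brute(12) == 6
# 360 = 2^3 * 3^2 * 5, so d(360) = 4 * 3 * 2 = 24.
assert num_divisors([3, 2, 1]) == num_divisors_brute(360) == 24
```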

Example 2: Computation of gcd (greatest common divisor). For example
(2829, 6850) = (3 · 23 · 41, 2 · 5^2 · 137) = 1
(3132, 7200) = (2^2 · 3^3 · 29, 2^5 · 3^2 · 5^2) = 2^2 · 3^2 = 36
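The rule behind these two computations is to take every common prime to the smaller of its two exponents. Here is a Python sketch of that rule, with factorizations stored as {prime: exponent} dictionaries (a representation chosen for this illustration) and the result compared against the built-in math.gcd.

```python
from math import gcd, prod

def gcd_from_factorizations(fa, fb):
    """gcd of two numbers given by their prime factorizations:
    each common prime is taken to the smaller of its two exponents."""
    return prod(p ** min(k, fb[p]) for p, k in fa.items() if p in fb)

def value(f):
    """Multiply a factorization back out."""
    return prod(p ** k for p, k in f.items())

f2829 = {3: 1, 23: 1, 41: 1}
f6850 = {2: 1, 5: 2, 137: 1}
f3132 = {2: 2, 3: 3, 29: 1}
f7200 = {2: 5, 3: 2, 5: 2}

assert [value(f) for f in (f2829, f6850, f3132, f7200)] == [2829, 6850, 3132, 7200]
assert gcd_from_factorizations(f2829, f6850) == gcd(2829, 6850) == 1
assert gcd_from_factorizations(f3132, f7200) == gcd(3132, 7200) == 36
```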

Example 3: Checking simple properties of divisibility. The following proposition
is easily proved using the fundamental theorem of arithmetic: If a | bc and
(a, b) = 1 then a | c.

2.2. Failure of unique factorization


You will have noticed that the proof of the fundamental theorem of arithmetic is
considerably more involved than the proof of the preceding proposition. This is because
unique (prime) factorization is no longer a direct consequence of the multiplicative
structure of numbers; one needs to incorporate at least addition as well. In this section,
we will see two examples of sets of numbers that are closed under multiplication (and
in the second case addition as well) but for which unique factorization fails.

Example 1. Consider the arithmetic progression
𝒩 = {4n + 1 : n ≥ 0} = {1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, . . . }.
Since (4m + 1)(4n + 1) = 4(4mn + m + n) + 1 ∈ 𝒩, this set is closed under multiplication.
(In fact, both N and 𝒩 are monoids under multiplication.) We declare that n ∈ 𝒩, n > 1,
is a prime in 𝒩 if it has no nontrivial divisors in 𝒩. For example, 9 is a prime in 𝒩 but
25 = 5^2 is composite. We next take note of
9 · 77 = 3^2 · 7 · 11 = 21 · 33
showing that 693 admits two distinct prime factorizations in 𝒩 (the factors 9, 77, 21, 33
are all primes in 𝒩, since 3, 7, 11 ∉ 𝒩).
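The failure of unique factorization for 693 can be confirmed by brute force. The Python sketch below works inside the set of numbers of the form 4n + 1; the helper names in_monoid, is_prime_in_monoid and factorizations are our own.

```python
def in_monoid(n):
    """Membership in {4n + 1 : n >= 0}."""
    return n >= 1 and n % 4 == 1

def is_prime_in_monoid(n):
    """n > 1 in the monoid with no divisor d in the monoid, 1 < d < n."""
    return in_monoid(n) and n > 1 and all(n % d != 0 for d in range(5, n, 4))

def factorizations(n, smallest=5):
    """All ways of writing n as a nondecreasing product of monoid primes."""
    if n == 1:
        return [[]]
    result = []
    for d in range(smallest, n + 1, 4):   # d runs over 5, 9, 13, ...
        if n % d == 0 and is_prime_in_monoid(d):
            result += [[d] + rest for rest in factorizations(n // d, d)]
    return result

assert is_prime_in_monoid(9) and not is_prime_in_monoid(25)
print(factorizations(693))
```

The search finds exactly the two factorizations 9 · 77 and 21 · 33 of the text.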

Example 2. We next consider the set
Z[√−5] = {a + b√−5 : a, b ∈ Z},
where √−5 is understood as √−5 = √−1 √5 = i√5, with i the imaginary unit.
This example is more interesting than the previous one as not only is Z[√−5] closed
under multiplication, but we will later see that Z[√−5] and Z have the same algebraic
structure (to some extent): they are both rings. However, unique factorization fails in
Z[√−5] while it holds for Z; this can be seen from the example
(1 + √−5)(1 − √−5) = 6 = 2 · 3
once we establish that 1 + √−5, 1 − √−5, 2, 3 are primes in Z[√−5].
Every complex number x + iy can be represented graphically as a point (x, y) in
the plane, and so we may think of Z[√−5] as a discrete set of points in the x-y-plane.
The ‘size’ of each u ∈ Z[√−5] is measured via the distance of u to the origin of the
plane; we say that u = a + b√−5 has norm
N(u) = a^2 + 5b^2,
which is the square of this distance. A direct computation shows that the norm is
multiplicative, i.e.,
N(uv) = N(u)N(v) for all u, v ∈ Z[√−5].
In particular, if w ∈ Z[√−5] is composite, w = uv, then so is its norm. Consequently,
if N(w) is prime then w ∈ Z[√−5] is prime. Consider
N(1 ± √−5) = 6 = 2 · 3.
Note also that N(2) = 4 and N(3) = 9, so a nontrivial factor of any of these four
numbers would have norm 2 or 3. Since there is no element u ∈ Z[√−5] of norm either
2 or 3, we conclude that 1 + √−5, 1 − √−5, 2, 3 are all prime.
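The two facts used in this argument, multiplicativity of the norm and the absence of elements of norm 2 or 3, are easy to test numerically. In the Python sketch below an element a + b√−5 is stored as the pair (a, b); this representation and the names norm and mul are our own choices.

```python
def norm(u):
    """N(a + b*sqrt(-5)) = a^2 + 5 b^2."""
    a, b = u
    return a * a + 5 * b * b

def mul(u, v):
    """(a + b s)(c + d s) = (ac - 5bd) + (ad + bc) s, where s^2 = -5."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)

box = range(-4, 5)
elements = [(a, b) for a in box for b in box]

# Multiplicativity of the norm on a small grid of elements.
assert all(norm(mul(u, v)) == norm(u) * norm(v) for u in elements for v in elements)

# a^2 + 5 b^2 <= 3 forces b = 0 and |a| <= 1, so this grid suffices
# to see that no element has norm 2 or 3.
assert [u for u in elements if norm(u) in (2, 3)] == []

# The two factorizations of 6.
assert mul((1, 1), (1, -1)) == (6, 0) == mul((2, 0), (3, 0))
```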

2.3. Factoring
We next discuss the practical problem of factoring a given (large) number n. The
most immediate approach is to proceed by trial division. An algorithm is a system-
atic, or step-by-step, process to solve a problem (transforming a given input into a
desired output).
Algorithm 7 (Trial division algorithm). Input: composite number n. Output: prime
factorization of n.
Run through primes 2 ≤ p ≤ √n in increasing order until finding p | n.
Record p, and restart the process with n/p in the place of n.
(If no such p is found, the current n is itself prime; record it and stop.)

You might wonder why it suffices to consider primes only up to √n and not up to
n. If n is composite, n = p_1 ⋯ p_r, then r ≥ 2. If all prime factors were > √n, we
would have n > n^{r/2} ≥ n, which is impossible; hence any composite number n has at
least one prime factor of size p ≤ √n.
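Trial division can be sketched in a few lines of Python. As a simplification of Algorithm 7, this version runs through all integers p ≥ 2 rather than only primes; a composite p can never divide the current n, because its prime factors have already been divided out. The function name trial_division is our own.

```python
def trial_division(n):
    """Return the prime factors of n >= 2 in nondecreasing order."""
    factors = []
    p = 2
    while p * p <= n:          # only test p <= sqrt(n)
        if n % p == 0:
            factors.append(p)
            n //= p            # restart with n/p; p may divide again
        else:
            p += 1
    factors.append(n)          # no factor <= sqrt(n) is left, so n is prime
    return factors

print(trial_division(2829))
print(trial_division(7200))
```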
When running this algorithm, the worst-case scenario is that n = pq is a product
of two large primes, both of size approximately √n; then the number of required trial
divisions is the number of primes less than or equal to √n. How many primes are there
up to a large number x? The prime number theorem (to be discussed in a few weeks)
states that there are approximately
x / log x
primes p ≤ x (where log denotes the natural logarithm).
On a classical computer (your laptop or smartphone), a number n is recorded via
its binary (base 2) rather than decimal (base 10) expansion. If you’re unfamiliar with
this concept, consider the decimal and binary expansions of 17:
17 = 1 · 10^1 + 7 · 10^0 (base 10)
17 = 1 · 2^4 + 0 · 2^3 + 0 · 2^2 + 0 · 2^1 + 1 · 2^0 (base 2)
On a classical computer, 17 is stored as the sequence of bits 10001.
If n is a large number of k bits, say n ≈ 2^k, then the trial division algorithm requires
up to
2^{k/2} / ((k/2) log 2)
operations. In other words, it is exponential in the number of bits. For k large enough,
the computation might require more operations than there are atoms in the known
universe (roughly 10^80). And we haven’t even accounted yet for the fact that we first
have to check for each number 2 ≤ m ≤ √n whether it is prime (or rely on a large enough
pre-stored table of primes).
There are obviously more elaborate factoring algorithms than trial division, but
even the best currently known algorithms for classical computers do not run in
polynomial time. The fact that no efficient factoring method is known for such computers
is the basis of public-key cryptography such as RSA, used for secure browsing, VPNs,
email encryption, etc. (We will briefly discuss public-key cryptography later on in the
semester.)

Regarding primality testing, things are better; the AKS algorithm (2002) makes it
possible to determine in polynomial time whether a number is prime. On the other hand,
Shor’s algorithm (1994) shows that on a quantum computer, factoring could also be
performed in polynomial time.
2.4. Homework 2
(1) Let n ∈ N. Show that if 2^n − 1 is prime, then n is prime.¹
(2) Use the fundamental theorem of arithmetic to show that for each prime p, √p
is irrational.
(3) Use the fundamental theorem of arithmetic to show that if a | bc and (a, b) = 1
then a | c.
(4) Let n ∈ N. Show that if the smallest prime factor p of n satisfies p > n^{1/3},
then n/p is either prime or 1.
(5) The sieve of Eratosthenes² is an ancient algorithm to produce tables of primes.
It works as follows. List all numbers from 2 up to N, then successively delete
for each prime 2 ≤ p ≤ √N its higher multiples mp (see Table 1 below). Use
the sieve of Eratosthenes to list all primes between 1 and 64.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
2 3   5   7   9    11    13    15    17    19    21
2 3   5   7        11    13          17    19
Table 1. Eratosthenes’ sieve for N = 21. In the second line we deleted
all higher multiples of 2, in the third line, all higher multiples of 3. The
only numbers remaining are the primes up to 21.

¹ Hint: Use that x^n − 1 = (x − 1)(1 + x + x^2 + ⋯ + x^{n−1}).
² Eratosthenes (276–194 BC) is best known for computing the circumference of the Earth (with
remarkable accuracy).
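The sieve described in exercise (5) can be sketched in Python as follows. The function name sieve and the list-of-flags representation are our own choices; the example reproduces Table 1 for N = 21 and leaves the case N = 64 to the reader.

```python
def sieve(N):
    """Primes up to N by the sieve of Eratosthenes."""
    is_candidate = [True] * (N + 1)
    is_candidate[0:2] = [False, False]       # 0 and 1 are not prime
    p = 2
    while p * p <= N:                        # only primes p <= sqrt(N)
        if is_candidate[p]:
            for multiple in range(2 * p, N + 1, p):
                is_candidate[multiple] = False   # delete higher multiples of p
        p += 1
    return [m for m in range(2, N + 1) if is_candidate[m]]

print(sieve(21))
```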
CHAPTER 3

Euclid’s Algorithm

3.1. Division with remainder


Division with remainder can be formalized as follows. Let a, b be two positive
integers, with say a ≥ b. If b | a we may write a = qb with q = a/b called the quotient
of a and b. If b does not divide a (notation: b ∤ a) then there exists q ∈ N so that
a ∈ (qb, (q + 1)b). The remainder r is defined to be r = a − qb. Note that
qb < a < (q + 1)b =⇒ 0 < r < b.
Note also that in either case q, r are uniquely defined: q is the largest positive integer
for which qb ≤ a. These observations are summarized in the following proposition.
Proposition 8 (Division with remainder). Let a, b ∈ N. Then there exists a unique
quotient q ≥ 0 and a unique remainder 0 ≤ r < b such that a = qb + r.
Division with remainder is the basis of the ancient Euclidean algorithm, arguably
the oldest mathematical algorithm, which we now describe. Suppose that d is a common
divisor of a and b. It must also then divide the remainder r.¹ Conversely, if d divides
b and r it must also divide a. In fact we can show that
(a, b) = (b, r).
For example, consider a = 4, b = 2 or a = 5, b = 2. (Make a habit of always testing
a statement/claim with some simple examples!) We now iterate this process:
Step 1   a = qb + r   with (a, b) = (b, r) and 0 ≤ r < b
Step 2   b = q′r + r′   with (b, r) = (r, r′) and 0 ≤ r′ < r
Step 3   r = q″r′ + r″   with (r, r′) = (r′, r″) and 0 ≤ r″ < r′
⋮

The process terminates once r^{(k)} = 0, which eventually happens since r^{(k)} < r^{(k−1)}.
We then see that
(a, b) = (b, r) = (r, r′) = (r′, r″) = ⋯ = (r^{(k−1)}, r^{(k)}) = r^{(k−1)}.
In other words, the greatest common divisor of a and b is equal to the last nonzero
remainder when iterating division with remainder. This is Euclid’s algorithm.

¹ By convention, every number divides 0, but 0 never divides anything.

Algorithm 9 (Euclid’s algorithm). Input: numbers a ≥ b. Output: gcd (a, b).
Perform division with remainder, a = qb + r.
While r > 0, replace a with b and b with r and restart.
Otherwise return b.

7491 = 17 · 422 + 317
422 = 1 · 317 + 105
317 = 3 · 105 + 2
105 = 52 · 2 + 1

Figure 1. Computing (7491, 422) = 1 with Euclid’s algorithm.
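Algorithm 9 translates almost line by line into Python. In the sketch below the function name euclid and the choice to record every division as a tuple (a, q, b, r), meaning a = q · b + r, are our own; the input is assumed to satisfy a ≥ b ≥ 1.

```python
from math import gcd

def euclid(a, b):
    """Return (gcd(a, b), steps), where each step (a, q, b, r)
    records one division with remainder a = q*b + r."""
    steps = []
    while True:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        if r == 0:
            return b, steps        # last nonzero remainder
        a, b = b, r

g, steps = euclid(7491, 422)
assert g == gcd(7491, 422) == 1
for a, q, b, r in steps:
    print(f"{a} = {q} * {b} + {r}")
```

The printed lines reproduce Figure 1, followed by the final division 2 = 2 · 1 + 0 at which the algorithm stops.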

Figure 2. Graphic representation of Euclid’s algorithm as tiling of a
rectangle of side-lengths 13 and 5. The gcd (13, 5) is the side-length of
the smallest square tile.

3.2. Complexity theory


In the previous chapter, we observed that the gcd of two numbers is very easy
to determine once we know their prime factorizations, but also that factorizing is
computationally complicated. By contrast, Euclid’s algorithm is a very efficient way
to implement the computation of the gcd (it runs in polynomial time in the size of
the input). The analysis of Euclid’s algorithm is sometimes considered as the birth of
complexity theory.
The complexity of an algorithm is measured as follows. We say that an algorithm
A runs in time T(n) if for every n ≥ 1 and for every input x of length n, running the
algorithm A(x) requires at most T(n) steps. In other words, complexity is measured by
considering worst-case scenarios. We recall that the length of input is usually measured
in terms of the binary or decimal expansion of x. For example, if we are working with
binary expansion, the length n of x is given by n = ⌊log_2 x⌋ + 1.
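The formula for the binary length can be compared with what Python reports for a few sample values (the values themselves are arbitrary choices; binary_length is our own helper, while int.bit_length is built in). Note that the floating-point logarithm could be off by one for very large x, so the built-in method is the safer tool in practice.

```python
from math import floor, log2

def binary_length(x):
    """Number of digits in the binary expansion of x >= 1."""
    return len(format(x, "b"))

for x in [1, 2, 17, 422, 7491, 2 ** 20]:
    assert binary_length(x) == x.bit_length() == floor(log2(x)) + 1

print(format(17, "b"), binary_length(17))
```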

Theorem 4 (Lamé 1844). Let a > b be two natural numbers, and b < F_n for some
n ≥ 3. Then running Euclid’s algorithm for the pair a, b requires at most n − 2 steps.
Moreover this bound is sharp if we take a = F_n, b = F_{n−1}.
Proof. The second statement follows from the recurrence relation for the Fibonacci
sequence: Euclid’s algorithm
F_n = F_{n−1} + F_{n−2}
F_{n−1} = F_{n−2} + F_{n−3}
⋮
F_3 = 2F_2
terminates after n − 2 steps. More generally, say that Euclid’s algorithm runs for k
steps;
a = qb + r
b = q′r + r′
⋮
r^{(k−3)} = q^{(k−1)} r^{(k−2)} + r^{(k−1)}
r^{(k−2)} = q^{(k)} r^{(k−1)}

On the last line, we recall that r^{(k−1)} ≥ 1 = F_2 and q^{(k)} ≥ 2. Hence r^{(k−2)} ≥ 2 = F_3.
Going backwards we find that r^{(k−3)} ≥ r^{(k−2)} + r^{(k−1)} ≥ F_3 + F_2 = F_4, and so on, until
b ≥ r + r′ ≥ F_k + F_{k−1} = F_{k+1}. Since b < F_n, the maximal choice of k is given by
k + 1 = n − 1, i.e., k = n − 2.
Theorem 5 (Lamé 1844). Given input a > b, Euclid’s algorithm finishes in at
most 5k steps, where k is the (decimal) length of b.
Proof. Consider the worst-case analysis provided by the previous theorem; a =
F_{n+2}, b = F_{n+1} requiring n steps for Euclid’s algorithm. By induction we can show
that F_{n+1} ≥ ϕ^{n−1}, where ϕ is the golden mean. Then, with log denoting here the
logarithm in base 10,
n − 1 ≤ log b / log ϕ < 5 log b < 5k
(using that 1/log ϕ ≈ 4.785 < 5 and log b < k), and hence n ≤ 5k.
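Both theorems of Lamé can be tested on small inputs. In the Python sketch below a step is one division with remainder; the function names euclid_steps and fib and the search range up to 300 are our own choices.

```python
def euclid_steps(a, b):
    """Number of divisions with remainder performed by Euclid's algorithm."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

def fib(n):
    """Fibonacci numbers with F_0 = 0, F_1 = 1."""
    x, y = 0, 1
    for _ in range(n):
        x, y = y, x + y
    return x

# Sharpness in Theorem 4: the pair (F_n, F_{n-1}) needs exactly n - 2 steps.
for n in range(3, 40):
    assert euclid_steps(fib(n), fib(n - 1)) == n - 2

# Theorem 5: at most 5k steps, where k is the number of decimal digits of b.
for b in range(1, 300):
    for a in range(b + 1, 301):
        assert euclid_steps(a, b) <= 5 * len(str(b))
```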

3.3. Euclid’s lemma and Bézout’s identity


Some mathematicians argue that while proving good theorems is the bread and
butter of the practice of mathematics, glory lies in proving an important, versatile
lemma. Here are two examples that follow from Euclid’s algorithm.
Lemma 10 (Euclid’s Lemma). Let p be a prime. If p | ab, then either p | a or p | b.

Proof. Suppose that p | ab and p does not divide a. Then
p | (ab, pb) = (a, p)b = b,
where the equality in the middle can be justified either via Euclid’s algorithm or the
fundamental theorem of arithmetic (exercise: do both!) and the last equality follows
from the fact that p is a prime that does not divide a. Hence, if p does not divide a,
then p divides b.
To measure the importance of Euclid’s lemma, we first note that it is actually
equivalent to the fundamental theorem of arithmetic, meaning that not only does the
fundamental theorem of arithmetic imply Euclid’s lemma (as we have seen in the proof
above) but Euclid’s lemma implies the fundamental theorem of arithmetic (see proof
below). This is not circular logic since Euclid’s lemma can be proved directly from
Euclid’s algorithm, that is, without relying on the fundamental theorem of arithmetic.
Proof of the fundamental theorem of arithmetic via Euclid’s lemma.
Recall that it is easy to prove that any number factors as a product of primes; we want
to prove that this factorization is unique. For this, suppose that n admits the two
factorizations
n = p_1 ⋯ p_r = q_1 ⋯ q_s,
with p_1, ..., q_s primes. Applying Euclid’s lemma repeatedly we find that p_1 = q_i for
some 1 ≤ i ≤ s. Cancelling these two terms from both sides of the equal sign, we can
repeat the argument until having shown that all primes coincide (or rely on induction,
as we did in the previous chapter).
We also note that the converse to Euclid’s lemma actually offers a characterization
of being a prime. This characterization is often useful in abstract algebra (e.g., the
concept of prime ideals in rings).
Proposition 11. Let p ≥ 2. If p | ab implies that p | a or p | b for any a, b ∈ N, then
p is a prime.
Proof. Suppose for contradiction that p is composite, i.e., p = ab with 1 < a, b <
p. But then we have p | ab but p ∤ a and p ∤ b, a contradiction.
Our second example is the following identity, which is again a simple consequence
of Euclid’s algorithm. Although generally attributed to Bézout, it was proven earlier
by Bachet (1624).
Lemma 12 (Bézout’s identity). For any pair of numbers a, b ∈ N, their gcd (a, b) can
be expressed as a linear combination of the following form: there exist u, v ∈ N such
that au − bv = (a, b).
Proof. First we take note of the fact that the roles of a and b can be exchanged
for free: we claim that if $n = au - bv$ with n, a, b, u, v ∈ N, then there exist $u', v' \in \mathbb{N}$
such that $n = bu' - av'$.
For the claim to be true, we need $au - bv = bu' - av'$ to hold. This is equivalent to
\[ a(u + v') = b(u' + v). \]
We may choose $u' = ma - v$, $v' = mb - u$ for a positive integer m large enough for
$u', v'$ to be positive. This proves the claim.
Now we show that Bézout’s identity is obtained by running Euclid’s algorithm. At
the first step, we have $r = a - qb$. At the second step, $r' = b - q'r$, where by the first
step, $q'r$ is of the form $au - bv$ for some u, v. Hence $r'$ is of the form $bu - av$ for some
u, v. Similarly for the third step, we have $r'' = r - q''r'$ with $r$ of the form $au - bv$ (for
some u, v), $q''r'$ of the form $bu - av$ (for some u, v) and so conclude that $r''$ is of the
form $au - bv$ (for some u, v). Once arrived at the last step, i.e.,
\[ (a, b) = r^{(k-1)} = r^{(k-3)} - q^{(k-1)} r^{(k-2)}, \]
we have $r^{(k-3)}$ of the form $au - bv$ and $q^{(k-1)} r^{(k-2)}$ of the form $bu' - av'$ (or vice versa)
and hence $(a, b)$ of the form $au'' - bv''$.
A neat application of Bézout’s identity is the complete resolution of Diophantine
equations of the following form. (And yet the same question for the linear Diophantine
equation ax + by = n is much harder to answer.)
Theorem 6. Let a, b, n ∈ N. The linear Diophantine equation
ax − by = n
has a solution in the positive integers (i.e., x, y ∈ N) if and only if (a, b) | n.
Proof. If the Diophantine equation has a solution then (a, b) | n. Conversely,
suppose that n = m(a, b) with m ∈ N. Bézout’s identity asserts that au − bv = (a, b)
for some u, v ∈ N. Then a(mu) − b(mv) = n. 
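The argument above is constructive, and a minimal Python sketch may help to see it at work. It uses the standard extended Euclidean algorithm (which produces integers s, t with as + bt = (a, b), possibly negative) and then shifts the solution until both unknowns are positive. The function names `extended_gcd` and `solve_positive` are illustrative choices, not notation from these notes.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def solve_positive(a, b, n):
    """Return positive integers (x, y) with a*x - b*y == n,
    or None if gcd(a, b) does not divide n. Assumes a, b, n >= 1."""
    g, s, t = extended_gcd(a, b)
    if n % g != 0:
        return None
    m = n // g
    # a*s + b*t = g, hence x = m*s, y = -m*t satisfy a*x - b*y = n.
    x, y = m * s, -m * t
    # Adding k*(b/g) to x and k*(a/g) to y leaves a*x - b*y unchanged.
    step_x, step_y = b // g, a // g
    k = max(0, (-x) // step_x + 1, (-y) // step_y + 1)
    return x + k * step_x, y + k * step_y


if __name__ == "__main__":
    print(solve_positive(35, 12, 1))   # a pair (x, y) with 35x - 12y = 1
    print(solve_positive(6, 4, 3))     # None, since (6, 4) = 2 does not divide 3
```

The shift by multiples of b/(a, b) and a/(a, b) plays the same role as the "m large enough" step in the proof of Bézout's identity.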
We conclude by showing that Bézout’s identity can be used to prove Euclid’s lemma.
(This is the usual modern proof adopted by many textbooks.)
Proof of Euclid’s lemma via Bézout’s identity. Suppose that p | ab and
p does not divide a. Then (a, p) = 1 and by Bézout’s identity, we may write au−pv = 1.
Multiplying both sides by b yields abu−pbv = b. Since p | ab we conclude that p | b. 
3.4. Homework 3
(1) Use Euclid’s algorithm to find the gcd of 93, 42.
(2) Use Euclid’s algorithm to find the gcd of 280, 330, 405.
(3) Let a, b, c ∈ N. Prove that (ac, bc) = (a, b)c.
(4) Let a, b ∈ N. Prove that (a + b, b) = (a, b).
(5) The least common multiple {a, b} satisfies
(i) a, b | {a, b};
(ii) if a, b | c then {a, b} | c.
Show that ab = {a, b}(a, b). (Hint: To check (ii), show that c(a, b)/(ab) is a positive integer.)


CHAPTER 4

Continued Fractions

4.1. Continued fraction expansions


We start by expanding $\frac{67}{24}$ in a continued fraction:
\[
\frac{67}{24} = \frac{2 \cdot 24 + 19}{24} = 2 + \frac{19}{24} = 2 + \cfrac{1}{\frac{24}{19}} = 2 + \cfrac{1}{\frac{1 \cdot 19 + 5}{19}} = 2 + \cfrac{1}{1 + \cfrac{1}{\frac{19}{5}}} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{4}}}}.
\]

The terms 2, 1, 3, 1, 4 are called the partial quotients of the continued fraction expansion.
(By contrast, we call $\frac{67}{24}, \frac{24}{19}, \frac{19}{5}, \frac{5}{4}$ the complete quotients.) The partial
quotients are precisely the quotients appearing in Euclid’s algorithm;
67 = 2 · 24 + 19
24 = 1 · 19 + 5
19 = 3 · 5 + 4
5=1·4+1
4 = 4 · 1.
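Since the partial quotients are just the quotients of Euclid's algorithm, they can be read off by a short loop. The following Python sketch illustrates this; the name `partial_quotients` is an illustrative choice and not notation used in these notes.

```python
def partial_quotients(a, b):
    """Partial quotients of a/b, obtained from the divisions with
    remainder a = q*b + r of Euclid's algorithm (a, b positive integers)."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)   # a = q*b + r with 0 <= r < b
        quotients.append(q)
        a, b = b, r           # next division: b by r
    return quotients


if __name__ == "__main__":
    print(partial_quotients(67, 24))   # [2, 1, 3, 1, 4]
```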
Equivalently, the partial quotients are the integer parts of the corresponding complete
quotients (given a real number x ∈ R, its integer part $\lfloor x \rfloor$ is the largest integer such that
$\lfloor x \rfloor \le x$, and its fractional part is $\{x\} = x - \lfloor x \rfloor \in [0, 1)$):
\begin{align*}
\frac{67}{24} &= 2 + \frac{19}{24}, \\
\frac{24}{19} &= 1 + \frac{5}{19}, \\
\frac{19}{5} &= 3 + \frac{4}{5}, \\
\frac{5}{4} &= 1 + \frac{1}{4}, \\
\frac{4}{1} &= 4.
\end{align*}
We set the notation
\[
[a_0, a_1, \dots, a_n] := a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}
\]
to express any number written in this ‘staircase form’ (with no assumption on the
‘digits’ $a_i$). Note that the representation of numbers of this shape is not unique; we have
\[ [a_0, \dots, a_n] = [a_0, \dots, a_n - 1, 1]. \]
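The previous numerical example shows that given a > b, Euclid’s algorithm yields
a sequence $(a_i)_{0 \le i \le n}$ such that
\[ \frac{a}{b} = [a_0, \dots, a_n], \]
where
• $a_0, \dots, a_{n-1} \ge 1$;
• $a_n \ge 2$, since the last step of Euclid’s algorithm reads $r = a_n \cdot (a, b)$ with $r > (a, b)$.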
Proposition 13. Every $\frac{a}{b} \in \mathbb{Q}_{>1}$ has a unique finite continued fraction expansion
whose last partial quotient is > 1.
Proof. Suppose that $\frac{a}{b}$ admits two such finite continued fraction expansions:
\[ [a_0, a_1, \dots, a_m] = [b_0, b_1, \dots, b_n]. \]
Taking the integer part of $\frac{a}{b}$, we see that $\lfloor \frac{a}{b} \rfloor = a_0 = b_0$. Hence canceling $a_0$ out on
both sides, we are left with
\[ [a_1, \dots, a_m] = [b_1, \dots, b_n]. \]
Repeating this argument, we see that m = n, and $a_i = b_i$ for i = 0, . . . , n.
We can adapt the process to express an irrational number x > 1 as a continued
fraction. Write again $x = \lfloor x \rfloor + \{x\}$ and set
\[ a_0 := \lfloor x \rfloor, \qquad x_1 := \frac{1}{\{x\}}. \]
Then
\[ x = a_0 + \frac{1}{x_1}, \]
where $x_1$ is irrational and $x_1 > 1$. Iterating, we obtain sequences
\[ x_n := \frac{1}{\{x_{n-1}\}}, \qquad a_n := \lfloor x_n \rfloor \tag{4.1} \]
for n ≥ 1 with $x_0 := x$, so that
\[
x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{x_n}}}}.
\]

Since each $x_n > 1$, each partial quotient is $a_n \ge 1$. Since each $x_n$ is irrational, the
process never terminates; we claim that continuing this process ad infinitum does
express the value x, i.e.,
\[ x = \lim_{n \to \infty} [a_0, \dots, a_n]. \]
To make this claim precise we need the notion of convergents.
Definition 14. The nth convergent is [a0 , . . . , an ].

4.2. Convergents
Looking at the formal shape of the first few convergents, i.e.,
\begin{align*}
[a_0] &= a_0, \\
[a_0, a_1] &= a_0 + \frac{1}{a_1} = \frac{a_0 a_1 + 1}{a_1}, \\
[a_0, a_1, a_2] &= a_0 + \cfrac{1}{a_1 + \frac{1}{a_2}} = a_0 + \frac{1}{[a_1, a_2]} = a_0 + \frac{a_2}{a_1 a_2 + 1} = \frac{a_0 a_1 a_2 + a_0 + a_2}{a_1 a_2 + 1}, \\
[a_0, a_1, a_2, a_3] &= a_0 + \frac{a_2 a_3 + 1}{a_1 a_2 a_3 + a_1 + a_3} = \frac{a_0 a_1 a_2 a_3 + a_0 a_1 + a_0 a_3 + a_2 a_3 + 1}{a_1 a_2 a_3 + a_1 + a_3},
\end{align*}
we observe a discernible, recursive pattern for both the numerators and denominators
(assuming the above fractions are already in reduced form). Given a sequence $(a_n)_{n \ge 0}$,
set for each n ≥ 1
\begin{align*}
p_n &:= a_n p_{n-1} + p_{n-2}, & p_0 &:= a_0, & p_{-1} &:= 1, \\
q_n &:= a_n q_{n-1} + q_{n-2}, & q_0 &:= 1, & q_{-1} &:= 0.
\end{align*}
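It is often convenient to express linear recurrence relations in matrix form; here we
have
\[
\begin{pmatrix} p_n & q_n \\ p_{n-1} & q_{n-1} \end{pmatrix}
= \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} p_{n-1} & q_{n-1} \\ p_{n-2} & q_{n-2} \end{pmatrix}
\]
for n ≥ 1. Then
\[
\begin{pmatrix} p_n & q_n \\ p_{n-1} & q_{n-1} \end{pmatrix}
= \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix}
\cdots
\begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Taking the determinant on both sides we find that
\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}. \tag{4.2} \]
In particular $(p_n, q_n) \mid (-1)^{n+1}$ and hence $(p_n, q_n) = 1$. These observations made, we
show that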
Proposition 15. Let $(a_n)_{n \ge 0}$ be a sequence. We have
\[ [a_0, \dots, a_n] = \frac{p_n}{q_n} \]
with $p_n, q_n$ defined as above for all n ≥ 0.
Proof. The proof is by induction. Clearly, $[a_0] = \frac{p_0}{q_0}$. Let n ≥ 1 and assume that
$\frac{p_i}{q_i} = [a_0, \dots, a_i]$ for all i ≤ n. We now use a very useful observation: the definition of
$p_n, q_n$ does not depend on the $a_i$’s being integers. Hence under the induction hypothesis
\begin{align*}
[a_0, \dots, a_{n+1}] &= \Bigl[a_0, \dots, a_{n-1}, a_n + \frac{1}{a_{n+1}}\Bigr]
= \frac{\bigl(a_n + \frac{1}{a_{n+1}}\bigr) p_{n-1} + p_{n-2}}{\bigl(a_n + \frac{1}{a_{n+1}}\bigr) q_{n-1} + q_{n-2}} \\
&= \frac{(a_n p_{n-1} + p_{n-2}) + p_{n-1}/a_{n+1}}{(a_n q_{n-1} + q_{n-2}) + q_{n-1}/a_{n+1}}
= \frac{a_{n+1} p_n + p_{n-1}}{a_{n+1} q_n + q_{n-1}} = \frac{p_{n+1}}{q_{n+1}}.
\end{align*}
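The recurrence for $p_n, q_n$ translates directly into a loop. The Python sketch below computes the convergents of a finite list of partial quotients and can be used to check identity (4.2) numerically; the function name `convergents` is an illustrative choice.

```python
def convergents(quotients):
    """Return the list [(p_0, q_0), (p_1, q_1), ...] defined by
    p_n = a_n p_{n-1} + p_{n-2},  q_n = a_n q_{n-1} + q_{n-2},
    with p_0 = a_0, p_{-1} = 1, q_0 = 1, q_{-1} = 0."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0, q_0
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result


if __name__ == "__main__":
    conv = convergents([2, 1, 3, 1, 4])
    print(conv)   # [(2, 1), (3, 1), (11, 4), (14, 5), (67, 24)]
    # Check (4.2): p_n q_{n-1} - p_{n-1} q_n = (-1)^(n+1) for n >= 1.
    for n in range(1, len(conv)):
        (p, q), (p_prev, q_prev) = conv[n], conv[n - 1]
        assert p * q_prev - p_prev * q == (-1) ** (n + 1)
```

The last convergent recovers 67/24, the fraction whose expansion [2, 1, 3, 1, 4] was computed at the start of the chapter.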


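Proposition 16. Let x ∈ R \ Q. Let $(a_n)_{n \ge 0}$ be the sequence defined by (4.1) and
$(p_n), (q_n)$ the corresponding sequences given above. Then
\[ x = \lim_{n \to \infty} \frac{p_n}{q_n}. \]

Proof. We will show that
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| \to 0 \]
as n → ∞. Using the recurrence relations we have
\[ x = [a_0, \dots, a_n, x_{n+1}] = \frac{x_{n+1} p_n + p_{n-1}}{x_{n+1} q_n + q_{n-1}}. \]
Hence, by (4.2) and since $x_{n+1} \ge a_{n+1}$,
\[
\Bigl| x - \frac{p_n}{q_n} \Bigr| = \Bigl| \frac{x_{n+1} p_n + p_{n-1}}{x_{n+1} q_n + q_{n-1}} - \frac{p_n}{q_n} \Bigr| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n} \le \frac{1}{q_{n+1} q_n}.
\]
To conclude we claim that $q_n \to \infty$ as n → ∞. Observe that $q_1 = a_1 \ge 1$, $q_2 = a_2 q_1 + q_0 \ge 2$, $q_3 \ge 3$, ..., and by induction
\[ q_n = a_n q_{n-1} + q_{n-2} \ge q_{n-1} + q_{n-2} \ge (n - 1) + 1 = n \]
for all n ≥ 1. Hence
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| \le \frac{1}{q_{n+1} q_n} \le \frac{1}{(n + 1) n} \to 0 \]
as n → ∞.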
4.3. Constructing irrational numbers
We have seen that (ir)rational numbers admit a (in)finite continued fraction expansion.
Observe that this dichotomy (rationals correspond to finite continued fractions,
irrationals to infinite continued fractions) is absent when considering the decimal
expansion of a real number; e.g., the decimal expansion of $\frac{1}{3}$ is $0.\overline{3}$. The comparison
suggests constructing irrational numbers via sequences $(a_n)_{n \ge 0} \subset \mathbb{N}$. That we can do
this is the content of the following theorem.
Theorem 7. Given an infinite sequence $(a_n)_{n \ge 0} \subset \mathbb{N}$, the limit
\[ \lim_{n \to \infty} \frac{p_n}{q_n} \]
exists and is irrational.


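Proof. The identity (4.2) implies that
\begin{align*}
\frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} &= \frac{p_{n+1} q_n - p_n q_{n+1}}{q_{n+1} q_n} = \frac{(-1)^n}{q_{n+1} q_n}; \\
\frac{p_{n+2}}{q_{n+2}} - \frac{p_n}{q_n} &= \frac{a_{n+2} p_{n+1} + p_n}{a_{n+2} q_{n+1} + q_n} - \frac{p_n}{q_n} = \frac{a_{n+2} (p_{n+1} q_n - p_n q_{n+1})}{q_{n+2} q_n} = \frac{(-1)^n a_{n+2}}{q_{n+2} q_n}.
\end{align*}
This shows that for n even
\[ \frac{p_0}{q_0} < \frac{p_2}{q_2} < \cdots < \frac{p_n}{q_n} < \frac{p_{n+1}}{q_{n+1}} < \cdots < \frac{p_3}{q_3} < \frac{p_1}{q_1}. \]
It follows that the subsequence
\[ \Bigl( \frac{p_n}{q_n} \Bigr)_{n \text{ even}} \]
is increasing and bounded above, while the subsequence
\[ \Bigl( \frac{p_n}{q_n} \Bigr)_{n \text{ odd}} \]
is decreasing and bounded below. Hence both subsequences have a limit, and since
\[ \Bigl| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \Bigr| = \frac{1}{q_{n+1} q_n} \le \frac{1}{n(n + 1)} \to 0 \]
as n → ∞, these two limits coincide.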
Example 17. Consider the constant sequence $a_n = 1$ for n ≥ 0, and let $x = [\overline{1}]$. Then
x satisfies the equation
\[ x = 1 + \frac{1}{x}. \]
The quadratic equation $x^2 = x + 1$ has solutions
\[ \varphi = \frac{1 + \sqrt{5}}{2}, \qquad \bar{\varphi} = \frac{1 - \sqrt{5}}{2}. \]
Since x > 1, we conclude that x = ϕ, the golden mean.
We say that a continued fraction is eventually periodic if it has the form
\[ [a_0, \dots, a_{m-1}, \overline{a_m, \dots, a_{m+n}}] \]
with m, n ≥ 1. Although we won’t prove it in full, we take note of the following
beautiful theorem from the theory of continued fractions.
Theorem 8. Let x ∈ R \ Q. The continued fraction of x is eventually periodic if
and only if x is a quadratic irrational, i.e., the solution of a quadratic equation with
integer coefficients.
It is easy to show that eventually periodic continued fractions are solutions of
quadratic equations. (This goes back to Euler.) Write $x = [\overline{a_m, \dots, a_{m+n}}]$. Using
convergents, we find that
\[ x = [a_m, \dots, a_{m+n}, x] = \frac{x p_n + p_{n-1}}{x q_n + q_{n-1}}, \]
where $p_i/q_i$ denote the convergents of $[a_m, \dots, a_{m+n}]$, which we can rewrite as the
quadratic equation
\[ q_n x^2 + (q_{n-1} - p_n) x - p_{n-1} = 0. \]
Then $[a_0, \dots, a_{m-1}, x]$ is itself the solution of a quadratic equation. The (harder) proof
of the converse direction is due to Lagrange (1770).
Very little is known about continued fractions for irrational numbers other than
quadratic irrationals. For example, the continued fraction of π is given by
π = [3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, 1, 15, 3, 13, 1, 4, 2, . . . ]
(see https://oeis.org/A001203) and does not showcase any known pattern.

4.4. Homework 4
(1) Compute the continued fraction expansions of $\frac{17}{11}$ and $\frac{11}{31}$.
(2) Compute the continued fraction expansion of $\sqrt{2}$. Conclude that $\sqrt{2}$ is irrational.
(3) (Explicit Bézout) Use (4.2) to find values x, y ∈ N such that 67x − 24y = 1.
(4) What is the value of $[\overline{1, 2}] = [1, 2, 1, 2, 1, 2, \dots]$?
(5) Show that the convergents of the golden mean ϕ are given by
\[ \frac{p_n}{q_n} = \frac{F_{n+2}}{F_{n+1}} \]
for every n ≥ 1.
Remark 18. The golden mean ϕ is ϕ ≈ 1.618, which is relatively close to the conversion
rate for miles to kilometers given by 1 mile ≈ 1.609 km. Therefore, using
that
\[ \frac{\text{mile}}{\text{km}} \approx \varphi \approx \frac{F_{n+1}}{F_n} \]
for n sufficiently large, we have the easy conversion rule
\[ F_n \text{ miles} \approx F_{n+1} \text{ km} \]
for anyone knowing the Fibonacci sequence.
CHAPTER 5

Diophantine Approximation

In the previous chapter, we have seen that every irrational number x admits a
continued fraction expansion, whose convergents provide us with $x = \lim_{n \to \infty} \frac{p_n}{q_n}$. In
particular, we recall that
• $(q_n)$ is a strictly increasing sequence;
• $q_{n+1} = a_{n+1} q_n + q_{n-1}$ and $x_{n+1} = a_{n+1} + \frac{1}{x_{n+2}}$;
• $\bigl| x - \frac{p_n}{q_n} \bigr| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n}$.
Combining these facts, we find that
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n} < \frac{1}{q_{n+1} q_n} < \frac{1}{q_n^2}. \tag{5.1} \]
In other words, the irrational x admits an infinite sequence of rational approximants
p/q such that $|x - \frac{p}{q}| < \frac{1}{q^2}$. Finding good sequences of rational approximations for
real numbers is the primary task of the theory of Diophantine approximation. In this
chapter we will showcase three fundamental results of Diophantine approximation.

5.1. Dirichlet’s theorem (1842)


The first important statement in Diophantine approximation is Dirichlet’s theorem,
which states that every irrational number has a very close rational approximant
whose complexity (measured by its denominator) is not too large.
Theorem 9 (Dirichlet). Let x ∈ R \ Q, Q ∈ N. There exists a rational number $\frac{p}{q}$
such that
\[ \Bigl| x - \frac{p}{q} \Bigr| < \frac{1}{qQ} \quad \text{and} \quad 1 \le q \le Q. \]
Proof 1 (via theory of convergents). Choose n such that $q_n \le Q < q_{n+1}$.
Then $\bigl| x - \frac{p_n}{q_n} \bigr| < \frac{1}{q_n q_{n+1}} < \frac{1}{q_n Q}$.

Proof 2 (via box principle). The box principle (also called pigeonhole
principle/Schubfachprinzip) asserts that given N boxes in which we place N + 1 balls, one
of the boxes has to contain at least two balls.
We consider the Q + 1 fractional parts 0, {x}, {2x}, . . . , {Qx} ∈ [0, 1). (Note that
these numbers are all distinct since x is irrational; see the homework.) We divide the
interval [0, 1) in Q subintervals of equal length 1/Q as follows:
\[ [0, 1) = \Bigl[0, \frac{1}{Q}\Bigr) \cup \Bigl[\frac{1}{Q}, \frac{2}{Q}\Bigr) \cup \cdots \cup \Bigl[\frac{Q-1}{Q}, 1\Bigr). \]
By the box principle, one of these intervals contains two distinct
fractional parts {mx}, {nx}. Say that n > m and set $q = n - m$, $p = \lfloor nx \rfloor - \lfloor mx \rfloor$.
Then $|qx - p| = |\{nx\} - \{mx\}| < \frac{1}{Q}$, and dividing by q gives the claim since 1 ≤ q ≤ Q.
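The box principle argument is effective: one simply sorts the fractional parts into boxes until two of them collide. The following Python sketch mimics this, under the assumption that floating-point arithmetic is accurate enough for the chosen x and Q (an exact treatment would need exact arithmetic). The function name `dirichlet_approximant` is an illustrative choice.

```python
import math


def dirichlet_approximant(x, Q):
    """Find (p, q) with 1 <= q <= Q and |q*x - p| < 1/Q by placing the
    fractional parts {0*x}, {1*x}, ..., {Q*x} into Q boxes of length 1/Q.
    Uses floats, so it is only a numerical illustration."""
    boxes = {}                                   # box index -> multiplier k
    for k in range(Q + 1):
        frac = k * x - math.floor(k * x)         # fractional part {k x}
        box = min(int(frac * Q), Q - 1)          # guard against rounding up to Q
        if box in boxes:
            m, n = boxes[box], k                 # two fractional parts share a box
            return math.floor(n * x) - math.floor(m * x), n - m
        boxes[box] = k
    raise AssertionError("impossible: Q + 1 values in Q boxes")


if __name__ == "__main__":
    x, Q = math.sqrt(2), 10
    p, q = dirichlet_approximant(x, Q)
    print(p, q, abs(x - p / q), 1 / (q * Q))
```

Note that the sketch returns the first collision it meets, which need not be the best approximant with denominator at most Q.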
The advantage of Dirichlet’s ‘softer’ proof based on the box principle is that it can
easily be adapted to higher dimensions, i.e., Diophantine approximation in $\mathbb{R}^n$, where
the theory of continued fractions is not so readily available. On the other hand, in the
one-dimensional setting, convergents actually provide the best possible approximants
to an irrational number; this can be formalized as follows.
Theorem 10 (Best approximation theorem). Let x ∈ R \ Q. If $1 \le q \le q_n$, then
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| \le \Bigl| x - \frac{p}{q} \Bigr| \]
with equality only if $q = q_n$, $p = p_n$.
Proof. We will in fact prove a stronger statement, namely that for $1 \le q < q_{n+1}$
we have
\[ |q_n x - p_n| \le |qx - p|, \]
with equality only if $q = q_n$, $p = p_n$. It follows that if $q < q_n$ then
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| < \frac{q}{q_n} \Bigl| x - \frac{p}{q} \Bigr| < \Bigl| x - \frac{p}{q} \Bigr|. \]
Using the identity (4.2), i.e., $p_{n+1} q_n - p_n q_{n+1} = (-1)^n$, we write
\begin{align*}
p &= p_n u - p_{n+1} v, \\
q &= q_n u - q_{n+1} v,
\end{align*}
choosing $u = (-1)^n (q p_{n+1} - p q_{n+1})$ and $v = (-1)^n (q p_n - p q_n)$. We may assume that
$u, v \neq 0$. We now have
\[ qx - p = u(q_n x - p_n) - v(q_{n+1} x - p_{n+1}) \]
and we claim that the two products $u(q_n x - p_n)$ and $v(q_{n+1} x - p_{n+1})$ have opposite
signs. Given this claim, we can conclude that
\[ |qx - p| = |u(q_n x - p_n)| + |v(q_{n+1} x - p_{n+1})| > |q_n x - p_n|. \]
We now prove our claim. We saw earlier (see proof of Theorem 7) that $q_n x - p_n$
and $q_{n+1} x - p_{n+1}$ have opposite signs, so it remains to prove that u and v have the
same sign. Observe that $q_n u = q_{n+1} v + q$ and $q < q_{n+1} \le |q_{n+1} v|$, hence u and v have
the same sign.

5.2. Hurwitz’s theorem (1891)


Now that we know that convergents provide the best approximants, it is natural to
ask if the theory of convergents leads to a better upper bound in the statement “each
irrational x admits an infinite sequence of rational approximants for which $|x - \frac{p}{q}| < \frac{1}{q^2}$.”
Analyzing a bit more carefully the facts on convergents reviewed at the beginning
of this chapter, we observe that
(i) $a_{n+1} < x_{n+1} = a_{n+1} + \frac{1}{x_{n+2}} < a_{n+1} + 1$;
(ii) $a_{n+1} q_n < q_{n+1} = a_{n+1} q_n + q_{n-1} < (a_{n+1} + 1) q_n$.
In other words, the order of $x_{n+1}$ is roughly that of $a_{n+1}$ and the order of $q_{n+1}$ is roughly
that of $a_{n+1} q_n$. This suggests that
\[ \Bigl| x - \frac{p_n}{q_n} \Bigr| \approx \frac{1}{q_{n+1} q_n} \approx \frac{1}{a_{n+1} q_n^2}, \]
i.e., the larger the size of the partial quotients of x the better the approximation. For
example, consider again the first few partial quotients of π,
π = [3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, . . . ].
Take note of the much larger partial quotient $a_4 = 292$; our observations suggest that
the convergent
\[ \frac{p_3}{q_3} = \frac{355}{113} \]
gives the best approximation of π by a rational number of four digits or less; this is
in fact the floating-point approximation of π used by many computers.
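The heuristic $|x - p_n/q_n| \approx 1/(a_{n+1} q_n^2)$ can be tested numerically on the partial quotients of π quoted above. The Python sketch below compares $q_n^2 |\pi - p_n/q_n|$ with $1/a_{n+1}$; it relies on the double-precision constant `math.pi`, which is accurate enough for these first few convergents only. The names `PI_QUOTIENTS` and `scaled_errors` are illustrative choices.

```python
import math

# First partial quotients of pi as quoted in the text: a_0 = 3, a_1 = 7, ...
PI_QUOTIENTS = [3, 7, 15, 1, 292]


def scaled_errors(x, quotients):
    """For each n with a known a_{n+1}, return the tuple
    (n, p_n, q_n, q_n^2 * |x - p_n/q_n|, 1/a_{n+1})."""
    rows = []
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0, q_0
    for n in range(len(quotients) - 1):
        a_next = quotients[n + 1]
        rows.append((n, p, q, q * q * abs(x - p / q), 1 / a_next))
        p, p_prev = a_next * p + p_prev, p
        q, q_prev = a_next * q + q_prev, q
    return rows


if __name__ == "__main__":
    for row in scaled_errors(math.pi, PI_QUOTIENTS):
        print(row)
```

By (i) and (ii), the scaled error always lies between $1/(a_{n+1} + 2)$ and $1/a_{n+1}$, so a large partial quotient such as 292 forces an unusually small error for the preceding convergent 355/113.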
Following this line of thought we might expect the worst approximable irrational
number to be the golden mean ϕ with periodic continued fraction expansion $[\overline{1}]$. We
leave it to the reader to check that in this case $q_n = F_{n+1}$, $x_n = \varphi$. This leads to the
identity
\[ \Bigl| \varphi - \frac{p_n}{q_n} \Bigr| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n} = \frac{1}{\bigl(\varphi + \frac{F_n}{F_{n+1}}\bigr) q_n^2}. \]
Using some facts established in previous homeworks, namely that $\frac{F_{n+1}}{F_n} \to \varphi$ as n → ∞
and $\varphi + \frac{1}{\varphi} = 2\varphi - 1 = \sqrt{5}$, we find that for very large n
\[ \Bigl| \varphi - \frac{p_n}{q_n} \Bigr| \approx \frac{1}{\sqrt{5} q_n^2}. \]
Hurwitz’ theorem states that this is the worst possible bound for the approximation of
an irrational by an infinite sequence of rational approximants.
Theorem 11 (Hurwitz). Every irrational number x admits infinitely many rational
approximants with
\[ \Bigl| x - \frac{p}{q} \Bigr| < \frac{1}{\sqrt{5} q^2} \]
and this statement becomes false if $\frac{1}{\sqrt{5}}$ is replaced by any smaller constant.

Proof. We first prove the second statement. Fix $C < \frac{1}{\sqrt{5}}$. We will show that there
are only finitely many rationals satisfying $|\varphi - \frac{p}{q}| < \frac{C}{q^2}$. Recall that ϕ is the positive
solution of the polynomial equation $X^2 - X - 1 = 0$. More precisely, we have
\[ X^2 - X - 1 = (X - \varphi)(X - \bar{\varphi}). \]
Plugging in $X = \frac{p}{q}$ and multiplying both sides by $q^2$ we find
\[
|p^2 - pq - q^2| = q^2 \Bigl| \varphi - \frac{p}{q} \Bigr| \Bigl| \bar{\varphi} - \frac{p}{q} \Bigr| < C \Bigl| \bar{\varphi} - \frac{p}{q} \Bigr| < C \Bigl( |\bar{\varphi} - \varphi| + \frac{C}{q^2} \Bigr) = C \Bigl( \sqrt{5} + \frac{C}{q^2} \Bigr).
\]
Since the left hand-side is a positive integer, we are left with the inequality
\[ 1 \le q^2 < \frac{C^2}{1 - C\sqrt{5}} \]
which leaves only finitely many possibilities for the choice of q.
The first statement admits various proofs. We will sketch here a geometric proof
based on the Farey–Ford packing along the real line. The Farey sequence is $(F_Q)_{Q \ge 1}$
where each $F_Q$ is the ordered set
\[ F_Q = \Bigl\{ \frac{p}{q} \Bigm| (p, q) = 1, \ 1 \le q \le Q \Bigr\}. \]
To each Farey point $\frac{p}{q}$ we associate the Ford circle $C_{p/q}$ that has radius $\frac{1}{2q^2}$ and center
$\bigl(\frac{p}{q}, \frac{1}{2q^2}\bigr)$. The following statements hold:
(i) The interiors of two distinct Ford circles are disjoint, i.e., $\mathrm{Int}(C_{p/q}) \cap \mathrm{Int}(C_{p'/q'}) = \emptyset$;
(ii) Two Ford circles $C_{p/q}$, $C_{p'/q'}$ are tangent if and only if $\frac{p}{q}, \frac{p'}{q'}$ are consecutive
elements in some Farey set $F_Q$.
Running over the Farey sequence, attaching a Ford circle to each Farey point yields
the Farey–Ford packing.
Given x ∈ R \ Q, the vertical line passing through x intersects infinitely many
of the triangular interstices of the packing. Each triangle is determined by its three
vertices, which are the tangency points of the neighboring Ford circles $C_{p/q}$, $C_{p'/q'}$,
$C_{(p+p')/(q+q')}$. For each triangle, we choose the point among $\frac{p}{q}, \frac{p'}{q'}, \frac{p+p'}{q+q'}$ that lies the
closest to x. Say this is $\frac{p}{q}$; an explicit computation of the tangency points implies
then that $|x - \frac{p}{q}| < \frac{1}{\sqrt{5} q^2}$. Since there are infinitely many triangles, this completes the
proof.

5.3. Liouville’s theorem (1844)


The golden mean is an algebraic number, that is, it is the root of a polynomial with
integer coefficients. We consider the following more precise definition. Let
\[ \mathbb{Z}[X] = \{ f(X) = a_d X^d + \cdots + a_1 X + a_0 \mid a_0, \dots, a_d \in \mathbb{Z}, \ d \ge 0 \} \]
be the ring of polynomials with integer coefficients. If $a_d \neq 0$ is the leading coefficient,
i.e., the non-zero coefficient of highest index, we say that f(X) is a polynomial of degree d.
Definition 19. A real number x is called algebraic of degree d if it is the root of
an irreducible polynomial f(X) ∈ Z[X] of degree d.

Rational numbers are algebraic of degree 1, quadratic irrationals (such as ϕ, $\sqrt{2}$) are
algebraic of degree 2. A real number that is not algebraic is called transcendental.
A classical result in a first course in analysis is Cantor’s proof that the set of algebraic

numbers is countable and the set of real numbers is uncountable. It follows that
most numbers are transcendental. This last statement can be made more precise with
measure theory, which is the topic of Analysis III.
Liouville’s theorem shows that there is a lower bound on how well approximable
an algebraic irrational can be. The result is essentially a corollary of the mean value
theorem, which is one of the fundamental results of Analysis I: If f : [a, b] → R is
continuous and differentiable on (a, b) then there exists t ∈ (a, b) such that
\[ f'(t) = \frac{f(b) - f(a)}{b - a}. \]
Theorem 12 (Liouville). If x ∈ R \ Q is algebraic of degree d, then there exists a
constant c > 0 such that
\[ \Bigl| x - \frac{p}{q} \Bigr| \ge \frac{c}{q^d} \]
for any rational approximant p/q.
Proof. Since x is algebraic of degree d, there is a polynomial $f(X) = a_d X^d + \cdots + a_0$
so that f(x) = 0. Choose p/q to be a rational approximant of x that is closer to x than
any other root of f; hence $f(p/q) \neq 0$. By the mean value theorem, there exists some
t between x and p/q such that
\[
\Bigl| x - \frac{p}{q} \Bigr| = \frac{|f(x) - f(p/q)|}{|f'(t)|} = \frac{|f(p/q)|}{|f'(t)|} = \frac{|a_d p^d + a_{d-1} p^{d-1} q + \cdots + a_0 q^d|}{|f'(t)| q^d} \ge \frac{1}{|f'(t)| q^d}.
\]

The statement of Liouville’s theorem motivates the following definition.
Definition 20. A real number x is called a Liouville number if for all n ∈ N, there
exists a reduced fraction $\frac{p}{q}$ with q ≥ 2 so that
\[ 0 < \Bigl| x - \frac{p}{q} \Bigr| < \frac{1}{q^n}. \]
We leave it as an exercise to the reader to show that every Liouville number is
transcendental. This observation led Liouville to the construction of the first known
transcendental number, the Liouville constant
\[ x := \sum_{n=1}^{\infty} \frac{1}{10^{n!}} = 0.11000100\dots \]
We show that this number is indeed Liouville (and therefore transcendental). Fix
n ∈ N, and set
\[ \frac{p}{q} := \sum_{m=1}^{n} \frac{1}{10^{m!}}. \]
Note that $q = 10^{n!}$. Then
\[
\Bigl| x - \frac{p}{q} \Bigr| \le \sum_{m=n+1}^{\infty} \frac{1}{10^{m!}} \le \frac{1}{10^{(n+1)!}} \Bigl( 1 + \frac{1}{10^{n+2}} + \dots \Bigr) < \frac{2}{10^{(n+1)!}} = \frac{2}{q^{n+1}} < \frac{1}{q^n}.
\]

The general principle is that a number defined by a sufficiently rapidly converging
sequence of rationals is Liouville. The numbers π and e were later shown to be
transcendental but not Liouville. It is still unknown whether various numbers such as
$2^e$, $2^\pi$, $\pi^e$, $\pi + e$, $\pi e$ are irrational.
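The estimate for the Liouville constant can be checked with exact rational arithmetic. In the Python sketch below, the constant itself is replaced by a later partial sum (an assumption made because the full series cannot be stored), so the check is a numerical illustration of the inequality rather than a proof. The names `liouville_partial_sum` and `check_liouville_bound` are illustrative choices.

```python
from fractions import Fraction
from math import factorial


def liouville_partial_sum(n):
    """Exact value of sum_{m=1}^{n} 10^(-m!) as a reduced fraction."""
    return sum(Fraction(1, 10 ** factorial(m)) for m in range(1, n + 1))


def check_liouville_bound(n, extra_terms=2):
    """Check q = 10^(n!) and 0 < x_N - p/q < 1/q^n, where p/q is the n-th
    partial sum and x_N (N = n + extra_terms) stands in for the constant."""
    approx = liouville_partial_sum(n)
    q = approx.denominator
    proxy = liouville_partial_sum(n + extra_terms)
    gap = proxy - approx
    return q == 10 ** factorial(n) and 0 < gap < Fraction(1, q ** n)


if __name__ == "__main__":
    for n in range(1, 4):
        print(n, check_liouville_bound(n))
```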
5.4. Homework 5
(1) Show that if x is irrational then {mx} = {nx} implies that m = n.
(2) Use the box principle to show that if we place four points on a circle of radius
1, at least two of the points are within distance $\le \sqrt{2}$ of each other.
(3) Compute the first few convergents of π and check that $\frac{355}{113}$ is the best
approximation of π by a rational number of at most four digits.
(4) Show that every Liouville number is transcendental.
(5) Show that x ∈ R is irrational if and only if for each n ∈ N there exists a
rational number $\frac{p}{q}$ (≠ x) so that $|x - \frac{p}{q}| < \frac{1}{qn}$.
CHAPTER 6

Congruences (I)

In division with remainder, dividing a number a by a number b gives


a = qb + r, r ∈ {0, . . . , b − 1}.
Sometimes it is the remainder r ∈ {0, . . . , b − 1} and not the quotient we are interested
in. In such a case, we say that a is congruent to r (mod b) and write a ≡ r (mod b).
This is modular arithmetic and you perform it every time you read the clock (whether
in 12h or 24h format).
Example 21. It is 11pm and you want to sleep 8 hours; then you should not set your
alarm before 11 + 8 = 19 ≡ 7 (mod 12) am.
Example 22. Wrapping around a circle can be thought of as (mod 360) since a com-
plete turn is 360 degrees. For example, the final position after a rotation of 440 degrees
is given by the angle 440 ≡ 80 (mod 360).

6.1. The basic algebra of congruences


Let n ∈ N. Gauss introduced the following convenient notation for congruences
(mod n):
a ≡ b (mod n) ⇐⇒ n | (a − b).
We say that the integers a and b are congruent mod n and call n the modulus. It is
easy to check that if a ≡ b (mod n) and c ≡ d (mod n) then a + c ≡ b + d and ac ≡ bd
(mod n). In fact, this extends to finitely many factors:
Proposition 23. If a1 ≡ b1 (mod n), . . . , ak ≡ bk (mod n) then
(i) a1 + · · · + ak ≡ b1 + · · · + bk (mod n),
(ii) a1 · · · ak ≡ b1 · · · bk (mod n).
Proof. For (i), write nmi = ai − bi. Then n(m1 + · · · + mk) = a1 + · · · + ak −
(b1 + · · · + bk ), which proves the claim. For (ii), we proceed by induction on k. For
k = 2, (i) implies that a1 a2 ≡ a1 b2 ≡ b1 b2 (mod n). By induction we then find that
(a1 · · · ak−1 )ak ≡ (b1 · · · bk−1 )bk (mod n) for all k ≥ 2. 
We pause to consider the following elegant use of Gauss’ congruence notation. Every
school pupil is taught the following simple divisibility tests: a number is divisible
by two if its last digit is even, by three if the sum of its digits is divisible by three, by
four if the number formed by its last two digits is a multiple of four, etc. The truth of
these divisibility tests is immediate from the use of Gauss’ notation for congruences.

For instance, using that 10 ≡ 1 (mod 3), and more generally $a \cdot 10^j \equiv a \pmod{3}$ for
any a ∈ Z and j ∈ N, we have
\[ n := a_k 10^k + \cdots + a_1 10 + a_0 \equiv a_k + \cdots + a_0 \pmod{3}, \]
that is, n is divisible by 3 if and only if the sum of its digits is divisible by 3. Observe
that the same argument produces the same rule for 9 in place of 3.
A special case of (ii) above is that if a ≡ b (mod n) then ac ≡ bc (mod n) for
all c ∈ Z. The converse, i.e., the cancellation law for congruences, does not hold in
general; for example, 2 · 3 ≡ 2 · 1 (mod 4) but 3 ≢ 1 (mod 4). The rule for cancellation is
given by the following proposition.
Proposition 24 (Cancellation law for congruences). If ac ≡ bc (mod n) then
$a \equiv b \pmod{\frac{n}{(n, c)}}$. In particular, if (n, c) = 1 then ac ≡ bc (mod n) implies a ≡ b (mod n).
Proof. Set d = (n, c) and write n = kd, c = ld. Then ald ≡ bld (mod kd) implies
that al ≡ bl (mod k). Since (k, l) = 1, we have a ≡ b (mod k).
Definition 25. We say that the relation ∼ is an equivalence relation on the set S
if for any a, b, c ∈ S we have that ∼ is
(i) reflexive: a ∼ a;
(ii) symmetric: a ∼ b if and only if b ∼ a;
(iii) and transitive: if a ∼ b and b ∼ c then a ∼ c.
If S is a set with an equivalence relation ∼, then S/ ∼ is the set of all elements of S
up to equivalence.
Example 26. Consider the real line R with the equivalence relation x ∼ y if and only
if x − y ∈ Z. The equivalence class of x is x + Z. In particular, x ∼ {x} for each
x ∈ R, so we are only concerned with the subset [0, 1) ⊂ R. Note that we can also view
[0, 1) as the closed interval [0, 1] with end-points identified, or as the unit circle. That
is, we can identify S/∼ = R/Z with the unit circle; this is explicitly realized by the
bijection $\mathbb{R}/\mathbb{Z} \to S^1$, $x + \mathbb{Z} \mapsto e^{2\pi i x}$.
It is easy to check that being congruent (mod n) is an equivalence relation on Z:
we set a ∼ b if and only if a − b ∈ nZ, i.e., a ≡ b (mod n). For any integers a, b, c, we
have
(i) a ≡ a (mod n) ⇐⇒ n | 0;
(ii) a ≡ b (mod n) ⇐⇒ n | (a − b) ⇐⇒ n | (b − a) ⇐⇒ b ≡ a (mod n);
(iii) a ≡ b, b ≡ c (mod n) ⇐⇒ b = a + kn = c + ln for some integers k, l
=⇒ a ≡ c (mod n).
Given an integer a ∈ Z, we denote its congruence (equivalence) class (mod n) by
\[ [a] := \{ b \in \mathbb{Z} : b \equiv a \pmod{n} \} = \{ a + nk : k \in \mathbb{Z} \} = a + n\mathbb{Z}. \]
Note that for each congruence class [a] we have a unique representative a ∈ {0, 1, . . . , n − 1},
namely the least non-negative residue (mod n). We obtain a partition of the set of
integers by the finitely many congruence classes
\[ \mathbb{Z} = \bigcup_{a=0}^{n-1} [a]. \]

Consequently, the set Z/nZ of all congruence classes (mod n) is in bijection with the
finite set {0, 1, . . . , n − 1} via the map that sends [a] = a + nZ to the least non-negative
residue a (mod n) ∈ {0, . . . , n − 1}. The set Z/nZ has a rich algebraic structure: it
is a (finite, commutative) ring. In fact, we will explain that it is even a (finite) field if
and only if n = p is prime.
By definition, Z/nZ can be upgraded from being a ring to being a field if every
a ∈ {1, . . . , n − 1} has a reciprocal a′ satisfying aa′ ≡ 1 (mod n). This is not true in
general. For example, the linear congruence equation 3x ≡ 1 (mod 6) has no solution
(i.e., 3 has no reciprocal (mod 6)). To see this, it suffices to check that 3x ≡ 0 or 3
(mod 6) by running x through 0, . . . , 5. We take note of the following more general
proposition.
Proposition 27. The linear congruence equation ax ≡ b (mod n) has a solution if and
only if (a, n) | b. If a solution exists, it is unique modulo $\frac{n}{(a, n)}$.
Proof. If x is a solution to ax ≡ b (mod n) then (a, n) must divide b. The converse
can be obtained from Bézout’s identity; we will however give a more direct argument.
Suppose first that (a, n) = 1. Then we claim that the map $\varphi_a : \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$,
$[x] \mapsto [ax]$ is a bijection. Indeed, it is injective since ax ≡ ay (mod n) implies x ≡ y
(mod n) given that (a, n) = 1. Since $\varphi_a$ is an injective self-map of a finite set, it is
bijective. This shows not only that a solution exists but also that this solution is unique.
More generally, write $a' = \frac{a}{(a, n)}$, $b' = \frac{b}{(a, n)}$, $n' = \frac{n}{(a, n)}$. Then as we have just seen
$a'x \equiv b' \pmod{n'}$ has a unique solution since $(a', n') = 1$, and this is also a solution of
ax ≡ b (mod n). Conversely, every solution of ax ≡ b (mod n) solves $a'x \equiv b' \pmod{n'}$,
which gives uniqueness modulo $n'$.
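The reduction to the coprime case used in the proof gives a practical recipe for solving linear congruences. The Python sketch below follows it, using the built-in `pow(a, -1, m)` (available from Python 3.8) to compute a reciprocal modulo m; the function name `solve_linear_congruence` is an illustrative choice. Solutions are listed as residues modulo n: there are (a, n) of them, all congruent to each other modulo n/(a, n).

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in {0, ..., n-1} with a*x ≡ b (mod n), in increasing order.
    The list is empty exactly when gcd(a, n) does not divide b."""
    g = gcd(a, n)
    if b % g != 0:
        return []
    # Divide through by g to reach the coprime case a'x ≡ b' (mod n').
    a_red, b_red, n_red = a // g, b // g, n // g
    x0 = (b_red * pow(a_red, -1, n_red)) % n_red   # unique solution mod n'
    return [x0 + k * n_red for k in range(g)]


if __name__ == "__main__":
    print(solve_linear_congruence(3, 1, 6))   # []  (3 has no reciprocal mod 6)
    print(solve_linear_congruence(3, 1, 7))   # [5]
    print(solve_linear_congruence(2, 2, 4))   # [1, 3]
```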
Corollary 28. The map ϕa : Z/nZ → Z/nZ, ϕa (x) = ax (mod n) is a bijection if
and only if (a, n) = 1.
Proof. We have seen in the previous proof that if (a, n) = 1, the map ϕa is
a bijection. Conversely if ϕa is surjective, then the proposition above implies that
(a, n) | b for each b ∈ {0, . . . , n − 1} and in particular that (a, n) = 1. 
In particular, Z/nZ is a field if and only if (a, n) = 1 for all 1 ≤ a ≤ n − 1. This
says that n has no non-trivial divisors, i.e., it is prime.

6.2. Some theorems to prime modulus


Theorem 13 (Wilson). p is a prime if and only if (p − 1)! ≡ −1 (mod p).
Proof. Suppose that p > 3 is prime. By the discussion above, each residue
1, 2, . . . , p − 1 has a (unique) reciprocal in {1, . . . , p − 1}. In particular, we can pair
up reciprocals in {1, . . . , p − 1} together except for elements that are their own
reciprocal, namely those that satisfy $a^2 \equiv 1 \pmod{p}$. If so, p | (a − 1)(a + 1) and by Euclid’s
lemma, this implies that a ≡ ±1 (mod p). Hence every residue in {2, . . . , p − 2} can
be paired up with its reciprocal so that (p − 2)! ≡ 1 (mod p). Multiplying both sides
by p − 1 yields the equation (p − 1)! ≡ p − 1 ≡ −1 (mod p). For p = 2, 3 one can check
directly that the congruence equation holds.
Conversely, assume for contradiction that p has a non-trivial divisor d|p with 1 <
d < p. Then (p − 1)! ≡ 0 (mod d) but also (p − 1)! ≡ −1 (mod d). By transitivity, we
are left with 0 ≡ −1 (mod d) which can only be true if d = 1. 
Wilson’s theorem provides a direct primality test; it asserts that the primality of
a natural number depends on checking that a congruence equation holds. In practice
however computing the factorial of a large number rapidly exceeds standard computa-
tional power so that this primality test is mostly of theoretical interest.
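As an illustration, Wilson's criterion can be written in a few lines of Python. The sketch below reduces modulo p after every multiplication, which keeps the numbers small but still requires about p multiplications, so it does not change the conclusion that the test is impractical for large p. The function name `is_prime_wilson` is an illustrative choice.

```python
def is_prime_wilson(p):
    """Primality test based on Wilson's theorem:
    p >= 2 is prime if and only if (p - 1)! ≡ -1 (mod p)."""
    if p < 2:
        return False
    factorial_mod_p = 1
    for k in range(2, p):
        factorial_mod_p = factorial_mod_p * k % p   # reduce at each step
    return factorial_mod_p == p - 1                 # p - 1 ≡ -1 (mod p)


if __name__ == "__main__":
    print([p for p in range(2, 30) if is_prime_wilson(p)])
    # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```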
Theorem 14 (Fermat). If p is a prime then $a^{p-1} \equiv 1 \pmod{p}$ for all (a, p) = 1.
Proof. We present a proof of Ivory (1806). Recall that the map $\varphi_a : \mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$
is a bijection for (a, p) = 1. Hence multiplying all non-zero elements gives
\[ a(2a)(3a) \cdots ((p - 1)a) \equiv (p - 1)! \pmod{p}. \]
On the other hand, $a(2a)(3a) \cdots ((p - 1)a) = (p - 1)! \, a^{p-1}$. Hence $(p - 1)! \, a^{p-1} \equiv (p - 1)! \pmod{p}$
and since (p, (p − 1)!) = 1, we may cancel the factor (p − 1)! on both sides.
Remark 29. For applications, it is sometimes useful to keep in mind that, for (a, p) = 1,
$a^{p-1} \equiv 1 \pmod{p}$ is equivalent to $a^p \equiv a \pmod{p}$.
In contrast to the computation of a factorial in Wilson’s theorem, computing powers
in modular arithmetic is very efficient, thanks to the process of repeated squaring.
E.g., mod 17, we have
\[ 5^{15} = 25^7 \cdot 5 \equiv 8^6 \cdot 40 \equiv 64^3 \cdot 6 \equiv 13^2 \cdot 13 \cdot 6 \equiv (-1) \cdot 10 \equiv 7 \pmod{17}. \]
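Repeated squaring can be organized along the binary digits of the exponent, which is one common way to implement it (Python's built-in three-argument `pow` offers the same functionality). The sketch below is a minimal version; the function name `power_mod` is an illustrative choice.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring
    (exponent a non-negative integer, modulus a positive integer)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current binary digit is 1
            result = result * base % modulus
        base = base * base % modulus          # square for the next digit
        exponent >>= 1
    return result


if __name__ == "__main__":
    print(power_mod(5, 15, 17))   # 7
```

The number of multiplications is proportional to the number of binary digits of the exponent, in contrast to the roughly p multiplications needed for (p − 1)!.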
However Fermat’s theorem does not provide an absolute primality test because its
converse is false: there are composite numbers n for which $a^{n-1} \equiv 1 \pmod n$ for all
(a, n) = 1. Such numbers are called Carmichael numbers. It is known that there are
infinitely many Carmichael numbers (a result first proved in 1994) but these numbers
are very sparse; the smallest Carmichael number is 561 = 3 · 11 · 17 and there are
only 6 other Carmichael numbers below 10000. More generally, a composite number n
satisfying $a^{n-1} \equiv 1 \pmod n$ for some a is called a pseudoprime (to base a). It is known
that the probability of n being a pseudoprime goes to 0 as n → ∞. To this extent,
we say that if $a^{n-1} \equiv 1 \pmod n$ holds for some randomly chosen a ∈ {2, . . . , n − 2}
then n is a probable prime. Extensions of such probabilistic tests are widely used
as primality tests in practice, e.g., the Miller–Rabin or Solovay–Strassen tests. The
first deterministic polynomial-time primality test, the AKS test (2002), also relies on
Fermat’s theorem.
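The Fermat test and the count of Carmichael numbers quoted above can be checked with a short brute-force sketch. The function names are our own, and the search is a naive illustration rather than an efficient method.

```python
from math import gcd, isqrt


def passes_fermat_test(n: int, a: int) -> bool:
    """True if a^(n-1) ≡ 1 (mod n), i.e. n is prime or a pseudoprime to base a."""
    return pow(a, n - 1, n) == 1


def is_composite(n: int) -> bool:
    """Trial division; only meant for small n."""
    return n > 3 and any(n % d == 0 for d in range(2, isqrt(n) + 1))


def carmichael_numbers_below(bound: int) -> list:
    """Composite n < bound passing the Fermat test for every base coprime to n."""
    return [
        n for n in range(4, bound)
        if is_composite(n)
        and all(passes_fermat_test(n, a) for a in range(2, n) if gcd(a, n) == 1)
    ]


print(carmichael_numbers_below(10000))
```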
6.3. Homework 6
(1) Show that n is a multiple of 4 if and only if the number composed of the last
two digits of n is a multiple of 4.
(2) Find the least non-negative residues of 1! + 2! + · · · + 10! (mod n) for n = 3, 11.
40
(3) Find the least non-negative residue of 23 (mod 10).
(4) Let p be a prime. Show that $(a + b)^p \equiv a^p + b^p \pmod p$.¹
(5) Show that if $a^{p-1} \equiv 1 \pmod p$ for all a = 1, . . . , p − 1 then p is prime.²

¹ Hint: Use the binomial theorem and show that $p \mid \binom{p}{k}$ for all k = 1, . . . , p − 1.
² Hint: First show that (p, a) = 1 for all a = 1, . . . , p − 1.


CHAPTER 7

Congruences (II)

We continue last week’s discussion with some theorems to composite modulus.

7.1. Fermat–Euler theorem


A careful analysis of Ivory’s proof of Fermat’s theorem leads to the following gen-
eralization (sometimes called the Fermat–Euler theorem).
Theorem 15 (Euler). We have $a^{\phi(n)} \equiv 1 \pmod n$ whenever (a, n) = 1, where
$$\phi(n) := \#\{0 \le m \le n - 1 : (m, n) = 1\}$$
is called Euler’s totient function.
Proof. Let $A = \{a_1, \dots, a_{\phi(n)}\}$ be the set of all 0 ≤ m ≤ n − 1 with (m, n) = 1.
For each (a, n) = 1, the restriction of $\phi_a : \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$ to the set A is a bijection of A onto itself.
Hence we again find that
$$a_1 \cdots a_{\phi(n)} \equiv a_1 \cdots a_{\phi(n)}\, a^{\phi(n)} \pmod n.$$
Since $(a_1 \cdots a_{\phi(n)}, n) = 1$ the claim follows. □
Recall that the linear congruence equation ax ≡ b (mod n) has a solution whenever
(a, n) = 1. So far, we only know that a solution exists. If we want to compute this
solution, we can apply Euler’s theorem as follows:
$$ax \equiv b \pmod n \iff x \equiv b\,a^{\phi(n)-1} \pmod n$$
by multiplying both sides of the congruence equation by $a^{\phi(n)-1}$.
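As a small illustration of this recipe, the following sketch solves ax ≡ b (mod n) using Euler's theorem. The helper names are hypothetical, and the totient is computed naively from its definition, which is only reasonable for small n.

```python
from math import gcd


def totient(n: int) -> int:
    """Euler's totient, counted directly from the definition."""
    return sum(1 for m in range(n) if gcd(m, n) == 1)


def solve_linear_congruence(a: int, b: int, n: int) -> int:
    """Least non-negative x with a*x ≡ b (mod n), assuming (a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    return b * pow(a, totient(n) - 1, n) % n


x = solve_linear_congruence(3, 4, 7)
print(x, (3 * x - 4) % 7 == 0)
```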
To compute the least residue baϕ(n)−1 (mod n) ∈ {0, . . . , n − 1}, we need to know
how to compute Euler’s totient function. The first few values of Euler’s totient function
are
n 1 2 3 4 5 6 7 8 9 10 . . .
ϕ(n) 1 1 2 2 4 2 6 4 6 4 . . .
Observe that ϕ(n) ≤ n − 1 for n ≥ 2 and that ϕ(p) = p − 1 when p is prime. We will show in
the last section of this chapter that if n has prime factorization $n = p_1^{k_1} \cdots p_r^{k_r}$ with
$p_1, \dots, p_r$ distinct primes, then
$$\phi(n) = (p_1^{k_1} - p_1^{k_1-1}) \cdots (p_r^{k_r} - p_r^{k_r-1}) = \prod_{i=1}^{r} p_i^{k_i-1}(p_i - 1) = n \prod_{p \mid n} (1 - p^{-1}),$$
where the last product runs through all prime factors of n.



7.2. Chinese remainder theorem


We consider the practical problem of finding the least non-negative residue x ≡ a
(mod n) by means of the following numerical example:
$$x \equiv (102^{73} + 55)^{37} \pmod{111}.$$
The smaller the modulus n, the easier it is to compute x ≡ a (mod n). When n is
composite, we can use its prime factorization to reduce the problem to solving a system
of equations to smaller moduli. Here, since 111 = 3 · 37, the solution x should satisfy
$$\begin{cases} x \equiv (102^{73} + 55)^{37} \pmod 3 \\ x \equiv (102^{73} + 55)^{37} \pmod{37} \end{cases}$$
Mod 3, we have 102 ≡ 0, 55 ≡ 1 so that $(102^{73} + 55)^{37} \equiv 1$. Mod 37, we have
$102^{73} = (102^{36})^2 \cdot 102 \equiv 102 \equiv 28$ by Fermat’s theorem, and $(102^{73} + 55)^{37} \equiv 83^{37} \equiv 83 \equiv 9$ by
another application of Fermat’s theorem. Hence the system above simplifies to
$$\begin{cases} x \equiv 1 \pmod 3 \\ x \equiv 9 \pmod{37} \end{cases} \tag{7.1}$$
We next rely on the following special case of the Chinese remainder theorem.
Theorem 16. If (m, n) = 1 and u, v ∈ N are such that mu − nv = 1 then the system
$$\begin{cases} x \equiv a \pmod m \\ x \equiv b \pmod n \end{cases}$$
has solution x = bmu − anv and this solution is unique (mod mn).
Proof. Recall that the existence of u, v ∈ N is guaranteed by Bézout’s identity
since (m, n) = 1. We can check that bmu−anv ≡ −anv ≡ a (mod m) and bmu−anv ≡
bmu ≡ b (mod n).
Suppose y is another solution. Then x ≡ y (mod m) and (mod n). We may write
x − y = km = ln for some integers k, l. Since (m, n) = 1 we have n | k and hence x ≡ y
(mod mn). 
Recall that continued fractions allow us to compute the solutions u, v to Bézout’s
identity. Sometimes the solutions can be found immediately, as is the case for m = 37,
n = 3: 37 · 1 − 3 · 12 = 1. Taking a = 9, b = 1, u = 1, v = 12 in Theorem 16, the system (7.1) has the (unique) solution
x ≡ 1 · 37 · 1 − 9 · 3 · 12 = −287 ≡ 46 (mod 111).
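The formula x = bmu − anv of Theorem 16 can be sketched in Python as follows. Here u, v come from the extended Euclidean algorithm rather than from continued fractions, they are allowed to be negative (the congruence check in the proof does not need positivity), and the function names are our own.

```python
def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def crt_pair(a: int, m: int, b: int, n: int) -> int:
    """Least non-negative x with x ≡ a (mod m) and x ≡ b (mod n), for (m, n) = 1."""
    g, s, t = extended_gcd(m, n)   # s*m + t*n = 1
    if g != 1:
        raise ValueError("moduli must be coprime")
    u, v = s, -t                   # so that m*u - n*v = 1
    return (b * m * u - a * n * v) % (m * n)


print(crt_pair(9, 37, 1, 3))  # system (7.1)
```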
We conclude this section with a proof of the general form of the Chinese remainder
theorem.
Theorem 17 (Chinese remainder theorem). Let $n_1, \dots, n_k$ be pairwise coprime
numbers. Then the system of congruence equations
$$\begin{cases} x \equiv a_1 \pmod{n_1} \\ \quad\vdots \\ x \equiv a_k \pmod{n_k} \end{cases}$$
admits a unique solution (mod $n_1 \cdots n_k$).
Proof. The proof is by induction over k. We already settled the case k = 2 with
Theorem 16. Let k > 2 and suppose that the claim holds for systems of k − 1 equations.
In particular the first k − 1 congruence equations have a simultaneous solution x ≡ a
(mod $n_1 \cdots n_{k-1}$), and we are again left with a system of two equations
$$\begin{cases} x \equiv a \pmod{n_1 \cdots n_{k-1}} \\ x \equiv a_k \pmod{n_k}, \end{cases}$$
which we know to admit a unique solution since $(n_1 \cdots n_{k-1}, n_k) = 1$. □
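The induction in this proof translates directly into a fold over the list of congruences: combine the first two equations, then combine the result with the third, and so on. The sketch below is one way to write this; the name `crt` is hypothetical, and it relies on `pow(m, -1, n)` for the modular inverse, which assumes Python 3.8 or later.

```python
from functools import reduce


def crt(residues, moduli):
    """Return (x, N) with x ≡ residues[i] (mod moduli[i]) and N the product of the moduli.

    The moduli are assumed pairwise coprime; pow(m, -1, n) raises ValueError otherwise.
    """
    def combine(left, right):
        a, m = left
        b, n = right
        # Look for x = a + m*t with m*t ≡ b - a (mod n).
        t = (b - a) * pow(m, -1, n) % n
        return (a + m * t) % (m * n), m * n

    return reduce(combine, zip(residues, moduli))


print(crt([1, 9], [3, 37]))       # system (7.1)
print(crt([2, 3, 2], [3, 5, 7]))
```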

7.3. Euler’s totient function


Definition 30. A function f on N is called multiplicative if f (mn) = f (m)f (n) for
all coprime m, n ∈ N. It is called completely multiplicative if f (mn) = f (m)f (n)
holds for all m, n ∈ N.
Proposition 31. ϕ(mn) = ϕ(m)ϕ(n) for (m, n) = 1.
Proof. Let $A_k := \{0 \le a \le k - 1 : (a, k) = 1\}$. By definition ϕ(mn) is the
cardinality of the set $A_{mn}$ and ϕ(m)ϕ(n) is the cardinality of the Cartesian product
$A_m \times A_n$. To prove the statement it therefore suffices to show that the map $F : A_{mn} \to A_m \times A_n$,
F(x) = (x (mod m), x (mod n)), is bijective. This map is well defined, since (x, mn) = 1 if and only if
(x, m) = 1 and (x, n) = 1. Let $(a, b) \in A_m \times A_n$. Then the
Chinese remainder theorem states that there exists a unique 0 ≤ x ≤ mn − 1 such that x ≡ a (mod m) and
x ≡ b (mod n); by the previous observation $x \in A_{mn}$, and F(x) = (a, b). □
Remark 32. To see that ϕ is not completely multiplicative, observe that ϕ(4) = 2
while ϕ(2)ϕ(2) = 1.
As a consequence, to compute ϕ(n) we need the prime factorization $n = p_1^{k_1} \cdots p_r^{k_r}$
of n (with $p_1, \dots, p_r$ distinct primes) and the following proposition.
Proposition 33. If p is prime and k ≥ 1, we have $\phi(p^k) = p^k - p^{k-1}$.
Proof. This is a computation:
$$\phi(p^k) = \sum_{\substack{m=1 \\ (m,p^k)=1}}^{p^k} 1 = \sum_{\substack{m=1 \\ (m,p)=1}}^{p^k} 1 = p^k - \sum_{\substack{m=1 \\ p \mid m}}^{p^k} 1 = p^k - p^{k-1}.$$
□
We conclude that
$$\phi(n) = (p_1^{k_1} - p_1^{k_1-1}) \cdots (p_r^{k_r} - p_r^{k_r-1}) = \prod_{i=1}^{r} p_i^{k_i-1}(p_i - 1) = n \prod_{p \mid n} (1 - p^{-1}),$$
where the last product runs through all prime factors of n.
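The product formula can be turned into a short routine. The sketch below factors n by trial division (fine for small n, and an assumption of this illustration) and then multiplies the factors $p^{k-1}(p-1)$; the function names are our own.

```python
def prime_factorization(n: int) -> dict:
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                       # whatever is left is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def totient(n: int) -> int:
    """Euler's totient via phi(n) = prod p^(k-1) * (p - 1)."""
    result = 1
    for p, k in prime_factorization(n).items():
        result *= p ** (k - 1) * (p - 1)
    return result


print([totient(n) for n in range(1, 11)])  # compare with the table of values of ϕ
```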



7.4. Homework 7
(1) Find the last two digits of 99 (without calculator).
(2) Solve 97x ≡ 13 (mod 105).
(3) Show that m | n implies ϕ(m) | ϕ(n).
(4) Find all n ∈ N for which ϕ(n) = 6.¹

¹ Hint: First establish that n must be of the form $n = 2^a 3^b 7^c$ for some integers a, b, c ≥ 0.
CHAPTER 8

Quadratic Residues

8.1. Public key cryptography – RSA


As a practical application of the Fermat–Euler theorem from the last chapter, we
describe the RSA algorithm, named after Rivest, Shamir, and Adleman (1977). This
algorithm concerns one-way secure transmission of messages, and uses a public key
(n, e), openly available, and a secret, private key (n, d). Here, n, d, e are all positive
integers.
For the public key, take two primes p, q, large ($\sim 2^{512}$) and chosen at “random”;
the security of the algorithm relies on n = p · q being hard to factor. Choose a number
e such that (e, ϕ(n)) = 1. The data (n, e) is called the public key.
Next find the reciprocal d of e (mod ϕ(n)), i.e., de ≡ 1 (mod ϕ(n)). The arithmetic
condition (e, ϕ(n)) = 1 ensures that the equation de ≡ 1 (mod ϕ(n)) has a unique
solution (mod ϕ(n)). Recall that the computation of ϕ(n) = ϕ(pq) = (p − 1)(q − 1) relies on the
prime factorization of n; this makes it hard to recover d from the public key
(n, e). Hence the data (n, d) is called the private key.
We now describe the principle of the algorithm. Let’s say that Bob wants to send
Alice the message Hello. Bob first encodes his message according to some agreed
upon protocol. For example, using the ASCII standard, where H = 072, e = 101,
l = 108, o = 111, so that his message is
a = 072101108108111.
The message needs to be smaller than the public key’s modulus n. Hence Bob might
need to chop up a into smaller blocks $a_1, a_2, \dots, a_k$, each < n. (For additional security,
Bob might also use some prescribed permutation on the digits of each block to make
sure his message is not too easy to decode.) To encrypt his message, Bob computes
for each block
$$b \equiv a^e \pmod n$$
using Alice’s public key (n, e), and then sends Alice the encrypted message b.
To recover the original message, an eavesdropper would need to solve the equation
$$x^e \equiv b \pmod n.$$
Similarly to prime factorization, taking large modular roots is considered a hard
problem computationally (meaning that known algorithms are slow).
Alice has received the message b. Using her private key, Alice will decrypt b by
computing
$$a \equiv b^d \pmod n.$$
Indeed, since de = 1 + kϕ(n) for some integer k we have
$$b^d \equiv (a^e)^d \equiv a \cdot (a^{\phi(n)})^k \equiv a \pmod n$$
by Euler’s theorem, provided that (a, n) = 1. Since a < n, this is the original message.
The security of the algorithm therefore relies essentially on the following two prob-
lems being hard: factoring large numbers and taking large modular roots.
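The whole procedure fits in a few lines of Python. The sketch below is a toy illustration only: the primes 61 and 53 and the exponent 17 are arbitrary small values chosen for readability (far too small to be secure), the function names are hypothetical, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
from math import gcd


def make_keys(p: int, q: int, e: int):
    """Return (public_key, private_key) = ((n, e), (n, d)) for primes p, q."""
    n = p * q
    phi = (p - 1) * (q - 1)        # phi(n) for n = p*q
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)            # reciprocal of e modulo phi(n)
    return (n, e), (n, d)


def encrypt(block: int, public_key) -> int:
    n, e = public_key
    return pow(block, e, n)


def decrypt(cipher: int, private_key) -> int:
    n, d = private_key
    return pow(cipher, d, n)


public, private = make_keys(61, 53, 17)
blocks = [72, 101, 108, 108, 111]  # ASCII codes of "Hello", one block per letter
ciphertext = [encrypt(b, public) for b in blocks]
print([decrypt(c, private) for c in ciphertext])
```

Each block is smaller than n = 3233, as required for decryption to return the original block.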

8.2. Quadratic residues


In general it is not clear that an equation of the form $x^e \equiv b \pmod n$ admits a
solution. When e = 2 and n is an odd prime, we have a complete theory. At its center
is the quadratic reciprocity law, which is perhaps the most famous result of (modern)
elementary number theory.
Definition 34. Let p be an odd prime, and let a be an integer with (a, p) = 1. The
integer a is called a quadratic residue (mod p) if
$$x^2 \equiv a \pmod p$$
has a solution.
Example 35. Let p = 5; which are the quadratic residues (mod 5)? Observe that by the
definition of quadratic residues we only need to consider x (mod p), i.e., x ∈ {1, . . . , p−1}.
Computing
$$1^2 = 1, \quad 2^2 = 4, \quad 3^2 \equiv 4, \quad 4^2 \equiv 1 \pmod 5$$
we conclude that 1 and 4 are the quadratic residues (mod 5).
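The computation of Example 35 is easy to repeat for other primes. A minimal sketch, with a function name of our own choosing:

```python
def quadratic_residues(p: int) -> list:
    """Sorted list of the quadratic residues (mod p) among 1, ..., p-1, for an odd prime p."""
    return sorted({x * x % p for x in range(1, p)})


for p in (5, 7, 11, 13):
    print(p, quadratic_residues(p))
```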
Looking at various odd primes p, it seems that among 1, . . . , p − 1, there are always
exactly $\frac{p-1}{2}$ quadratic residues (mod p). The next proposition shows that this is a fact.
Proposition 36. There are exactly $\frac{p-1}{2}$ distinct quadratic residues (mod p).
Proof. We count the number of distinct residues $a^2$ (mod p) where a ranges over
1, . . . , p − 1. Since
$$a^2 \equiv (p - a)^2 \pmod p$$
there are at most $\frac{p-1}{2}$ distinct residues. Let $a, b \in \{1, \dots, \frac{p-1}{2}\}$, and suppose that
$a^2 \equiv b^2 \pmod p$. This is equivalent to p | (a − b)(a + b). Since p is prime, Euclid’s
lemma asserts that p must divide one of these factors; hence a ≡ b or −b (mod p).
Since $a, b \in \{1, \dots, \frac{p-1}{2}\}$ we conclude that a = b. □
The next proposition will motivate the introduction of the Legendre symbol, which
is used to compute efficiently whether an integer is a quadratic residue (mod p).
Proposition 37. We have the following multiplicativity behavior (mod p);
(i) If a and b are quadratic residues, then ab is a quadratic residue;
(ii) If a is a quadratic residue and b is not, then ab is not a quadratic residue,
(iii) If a and b are not quadratic residues, then ab is a quadratic residue.
Proof. (i) Suppose there exist x, y such that $x^2 \equiv a$, $y^2 \equiv b \pmod p$. Then
$(xy)^2 \equiv ab \pmod p$, i.e., ab is a quadratic residue (mod p).
(ii) We have seen that among the set {1, 2, . . . , p − 1}, half the elements are
quadratic residues, say {a1 , . . . , a(p−1)/2 } and the other half are (therefore) not, say
{b1 , . . . , b(p−1)/2 } = {1, 2, . . . , p − 1} \ {a1 , . . . , a(p−1)/2 }. Let a be an integer coprime to
p. Then a, 2a, . . . , (p − 1)a are p − 1 distinct numbers (mod p).
If a is a quadratic residue, then by (i) aa1 , . . . , aa(p−1)/2 are quadratic residues as
well, and hence by Proposition 36 the other half, ab1 , . . . , ab(p−1)/2 cannot be. This
proves (ii).
(iii) Consider the same setup as for (ii) but choose a to be a quadratic non-
residue. Then by (ii) aa1 , . . . , aa(p−1)/2 are quadratic nonresidues, and this forces
ab1 , . . . , ab(p−1)/2 to be quadratic residues. This proves (iii). 

In this sense, quadratic residues behave like +1, quadratic nonresidues like −1.
Definition 38. For p an odd prime, a such that (a, p) = 1, the Legendre symbol is
$$\left(\frac{a}{p}\right) = \begin{cases} +1 & a \text{ is a quadratic residue (mod } p) \\ -1 & a \text{ is a quadratic nonresidue (mod } p). \end{cases}$$
We can reformulate the properties of quadratic residues established up to now in
terms of the Legendre symbol as follows:
• $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ for every k ∈ Z;
• $\left(\frac{a}{p}\right) = 1$ for half the elements of {1, . . . , p − 1};
• $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$, i.e., the Legendre symbol is completely multiplicative.
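In the next chapter, we will prove that every Legendre symbol can be computed thanks
to the quadratic reciprocity law
$$\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right),$$
valid for every pair p, q of distinct odd primes, together with the supplementary laws
$$\left(\frac{-1}{p}\right) = 1 \text{ if and only if } p \equiv 1 \pmod 4,$$
$$\left(\frac{2}{p}\right) = 1 \text{ if and only if } p \equiv 1 \text{ or } 7 \pmod 8.$$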
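We conclude this section with some computational examples:
$$\left(\frac{72}{97}\right) = \left(\frac{9}{97}\right)\left(\frac{8}{97}\right) = \left(\frac{3}{97}\right)^2 \left(\frac{2}{97}\right)^3 = \left(\frac{2}{97}\right) = 1$$
$$\left(\frac{34}{97}\right) = \left(\frac{2}{97}\right)\left(\frac{17}{97}\right) = \left(\frac{17}{97}\right) = \left(\frac{97}{17}\right) = \left(\frac{12}{17}\right) = \left(\frac{2}{17}\right)^2 \left(\frac{3}{17}\right) = \left(\frac{2}{3}\right) = -1$$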

8.3. Primes that are the sum of two squares


Before we move on to the proof of the quadratic reciprocity law and its supplemen-
tary laws, we explain some of the motivations for the interest in quadratic residues and
the discovery of these laws. Consider the sequence of primes. Aside from 2, each prime
is odd and either of the form 4k + 1 or 4k − 1. If we list the primes that are ≡ 1 (mod 4),
we soon notice that
$$\begin{aligned}
5 &= 2^2 + 1^2 \\
13 &= 3^2 + 2^2 \\
17 &= 4^2 + 1^2 \\
29 &= 5^2 + 2^2 \\
37 &= 6^2 + 1^2 \\
41 &= 5^2 + 4^2 \\
53 &= 7^2 + 2^2 \\
61 &= 6^2 + 5^2 \\
73 &= 8^2 + 3^2 \\
&\;\;\vdots
\end{aligned}$$
Theorem 18 (Fermat). If p is an odd prime, then $p = a^2 + b^2$ if and only if p ≡ 1
(mod 4).
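For small primes the statement is easy to test by exhaustive search. The following sketch (the function name `two_squares` is our own) looks for a representation $p = a^2 + b^2$ with a ≥ b ≥ 1 and returns None if there is none.

```python
from math import isqrt


def two_squares(p: int):
    """Return (a, b) with a >= b >= 1 and a*a + b*b == p, or None if no such pair exists."""
    for a in range(1, isqrt(p) + 1):
        remainder = p - a * a
        b = isqrt(remainder)
        if b >= 1 and b * b == remainder:
            return max(a, b), min(a, b)
    return None


for p in (5, 13, 17, 29, 37, 41, 53, 61, 73, 3, 7, 11):
    print(p, p % 4, two_squares(p))
```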
We will prove this famous assertion of Fermat at a later time. For now, we only
sketch the structure of its proof. Suppose that $p = a^2 + b^2$ for some a, b ∈ N. Then 0 < a < p,
so (a, p) = 1; let a′ be the reciprocal of a, i.e., aa′ ≡ 1 (mod p). Then
$$p = a^2 + b^2 \implies p\,a'^2 = (aa')^2 + (ba')^2 \implies (ba')^2 \equiv -1 \pmod p \implies \left(\frac{-1}{p}\right) = 1.$$
The key to the proof is to realize that the converse is true, i.e., if $\left(\frac{-1}{p}\right) = 1$ then p is a
sum of two squares. This was done by Euler using the principle of ‘infinite descent’; we will
later see a modern algebraic proof using Gaussian integers. Given Euler’s argument, it
then remains to prove that $\left(\frac{-1}{p}\right) = 1$ if and only if p ≡ 1 (mod 4), and this is precisely
the first supplementary law stated earlier.
By itself, this supplementary law is remarkable. It asserts that an odd prime p is a divisor
of a number of the form $n^2 + 1$ if and only if it lies in the arithmetic progression 4k + 1,
k ∈ N. Even more remarkable, this allows us to prove that there are infinitely many
prime numbers that are ≡ 1 (mod 4). We first recall Euclid’s proof of the infinitude of
primes.
Theorem 19 (Euclid). There are infinitely many primes.
Proof. Suppose there were only finitely many primes, p1 , . . . , pk . Then the number
N = p1 · · · pk + 1 is larger than the largest prime in our list, hence must be composite.
But any prime divisor p | N also divides p1 · · · pk , hence p | 1, a contradiction. 
A simple modification of Euclid’s argument allows us to prove that there are infinitely
many primes ≡ 3 (mod 4).
Theorem 20. There are infinitely many primes ≡ 3 (mod 4).
Proof. Suppose there were only finitely many primes ≡ 3 (mod 4), say $p_1, \dots, p_k$.
Then the number $N = 4(p_1 \cdots p_k) - 1$ is larger than the largest prime in our list, hence
must be composite. At least one prime divisor p | N must be of the form p ≡ 3 (mod 4):
indeed, N is odd, and if p, q ≡ 1 (mod 4) then pq ≡ 1 (mod 4), hence if all prime divisors of a number n
are ≡ 1 (mod 4), we have n ≡ 1 (mod 4), whereas N ≡ 3 (mod 4).
Since p ≡ 3 (mod 4), it appears in our list and therefore also divides $p_1 \cdots p_k$; hence p | 1, and we obtain a contradiction. □
However this argument does not work to show that there are infinitely many primes
p ≡ 1 (mod 4). Instead we use the supplementary law.
Theorem 21. There are infinitely many primes ≡ 1 (mod 4).
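Proof. Suppose there were only finitely many primes ≡ 1 (mod 4), say $p_1, \dots, p_k$.
Then the number $N = (2 p_1 \cdots p_k)^2 + 1$ is odd and larger than the largest prime in our list, hence
must be composite. Any prime divisor p | N is odd and satisfies $(2 p_1 \cdots p_k)^2 \equiv -1 \pmod p$, so that
$\left(\frac{-1}{p}\right) = 1$; hence p is of the form p ≡ 1 (mod 4). Then p appears in our list and divides
$(2 p_1 \cdots p_k)^2$, so p | 1, and we again have a contradiction. □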
All this motivated looking at similar patterns (such as the fact that p is a prime
divisor of $n^2 - 2$ if and only if p lies in either one of the arithmetic progressions 8k + 1,
8k − 1, k ∈ N, i.e., the second supplementary law) until the quadratic reciprocity
law was empirically observed by Euler, and eventually proved by Gauss. The fact
that its proof resisted both Euler and Legendre reflects that it is very much a non-
trivial statement. On the other hand, we have now more than 250 different proofs
of the quadratic reciprocity law, and this is a testament both to the great advances
in mathematics over the last two centuries, and to the fundamental nature of the
quadratic reciprocity law; it marked the birth of algebraic number theory and many
later developments in mathematics.
8.4. Homework 8
(1) Suppose that Alice’s public key is (583, 3). Compute Alice’s private key.
(2) Compute the Legendre symbols $\left(\frac{-26}{73}\right)$, $\left(\frac{19}{73}\right)$, $\left(\frac{33}{73}\right)$.
(3) Find all odd primes for which −2 is a quadratic residue.
(4) Show that there are infinitely many primes that are congruent to either 1 or
7 (mod 8).
(5) Let p ≥ 7 be a prime. Show that there are always two quadratic residues (mod
p) that differ by 2.¹

¹ Hint: Check small numbers 1, 2, 3, ...


CHAPTER 9

Quadratic Reciprocity

We present an elementary proof of the quadratic reciprocity law due to Gauss.

9.1. Supplementary laws


Let p be an odd prime, and a an integer that is coprime to p. The first attempt to
compute the Legendre symbol stems from the following observation. If a is a quadratic
residue (mod p), then by definition there exists an integer x, (x, p) = 1, such that
$x^2 \equiv a \pmod p$. Then Fermat’s little theorem implies that
$$\left(\frac{a}{p}\right) = 1 \equiv x^{p-1} \equiv a^{\frac{p-1}{2}} \pmod p.$$
Euler proved that this congruence equation holds also when a is not a quadratic residue
(mod p).
Proposition 39 (Euler’s criterion).
$$\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod p$$
Proof. To complete the proof, suppose that $\left(\frac{a}{p}\right) = -1$. Recall that the linear
congruence equation bx ≡ a (mod p) has a unique solution x ∈ {1, . . . , p − 1} for each b ∈ {1, . . . , p − 1}.
Since a is not a quadratic residue, this solution must satisfy the constraint x ≠ b.
Therefore, we can split 1, . . . , p − 1 into $\frac{p-1}{2}$ pairs {b, x} of distinct residues with bx ≡ a (mod p)
to obtain
$$1 \cdot 2 \cdots (p - 1) \equiv a^{\frac{p-1}{2}} \pmod p.$$
The left-hand side is (p − 1)! and since p is prime, Wilson’s theorem asserts that
(p − 1)! ≡ −1 (mod p). □
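Combined with repeated squaring, Euler's criterion gives a direct way to evaluate a Legendre symbol numerically. A minimal sketch, with a hypothetical function name and assuming p is an odd prime:

```python
def legendre_symbol(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p and (a, p) = 1, via Euler's criterion."""
    if a % p == 0:
        raise ValueError("a must be coprime to p")
    value = pow(a, (p - 1) // 2, p)       # equals 1 or p - 1 when p is prime
    return -1 if value == p - 1 else value


print(legendre_symbol(72, 97), legendre_symbol(34, 97))
```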
The first supplementary law, i.e., $\left(\frac{-1}{p}\right) = 1$ if and only if p ≡ 1 (mod 4), is a direct
application of Euler’s criterion, since $(-1)^{(p-1)/2} = 1$ if and only if p ≡ 1 (mod 4). In
general however, computing with Euler’s criterion is not very practical. To compute
$$\left(\frac{2}{p}\right) \equiv 2^{(p-1)/2} \pmod p,$$
we will consider as starting point the identity
$$2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! = [2 \cdot 2 \cdots 2] \cdot 1 \cdot 2 \cdots \left(\tfrac{p-3}{2}\right)\left(\tfrac{p-1}{2}\right) = 2 \cdot 4 \cdots (p - 3)(p - 1).$$

On the right-hand side we note that p − 1 ≡ −1 (mod p), p − 3 ≡ −3 (mod p), etc.
This suggests that, applying these congruences to roughly the second half of
the terms, we should eventually obtain
$$2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv \pm \left(\tfrac{p-1}{2}\right)! \pmod p.$$
The determination of the sign is the crucial point; observe that it will depend on
whether $\frac{p-1}{2}$ is even or odd. We consider these two cases separately:
(A) We have $\frac{p-1}{2} = 2k$;
(B) We have $\frac{p-1}{2} = 2k + 1$;
for some number k ∈ N. Depending on whether we are in case (A) or (B), the second
bracket [· · · ] in
$$2 \cdot 4 \cdots (p - 3)(p - 1) = [2 \cdots (2k)] \cdot [(2k + 2) \cdots (p - 3)(p - 1)]$$
is a product of k or k + 1 terms, while the first bracket is a product of k terms.
In case (A), we thus find
$$2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv (-1)^k (2k)! \pmod p, \quad \text{i.e.,} \quad 2^{\frac{p-1}{2}} \equiv (-1)^k \pmod p.$$
Hence in case (A), $\left(\frac{2}{p}\right) = 1$ if and only if k is even, that is, if and only if p ≡ 1 (mod 8).
In case (B), we have instead
$$2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv (-1)^{k+1} (2k + 1)! \pmod p, \quad \text{i.e.,} \quad 2^{\frac{p-1}{2}} \equiv (-1)^{k+1} \pmod p.$$
Hence in case (B), $\left(\frac{2}{p}\right) = 1$ if and only if k is odd, i.e., if and only if p ≡ 7 (mod 8).
This proves the second supplementary law.
On the basis of such computations, Gauss discovered a simple lemma, which will
provide the key to the elementary proof of the law of quadratic reciprocity presented
in the next section.
Lemma 40 (Gauss’ lemma).
$$\left(\frac{a}{p}\right) = (-1)^{\nu},$$
where $\nu := \#\{1 \le k \le \frac{p-1}{2} \mid \frac{p}{2} < ka \ (\mathrm{mod}\ p) < p\}$.
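Proof. Starting again from
$$a^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! = a(2a) \cdots \left(\tfrac{p-1}{2} a\right)$$
we observe that each factor on the right has a distinct residue in
$$\pm 1, \pm 2, \dots, \pm \tfrac{p-1}{2}.$$
Suppose both x and −x appear as residues with $1 \le x \le \frac{p-1}{2}$. Then there exist $1 \le j \ne k \le \frac{p-1}{2}$
such that x ≡ ja, −x ≡ ka (mod p). This would imply
$$0 = x + (-x) \equiv (j + k)a \pmod p$$
and since 0 < j + k < p and (a, p) = 1 this is impossible. Hence each number in the set $1, 2, \dots, \frac{p-1}{2}$
appears exactly once, with a predetermined sign. The number of negative signs is
exactly given by ν, so that $a^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv (-1)^{\nu} \left(\tfrac{p-1}{2}\right)! \pmod p$. Cancelling the factorial
and applying Euler’s criterion gives the claim. □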
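Gauss' lemma can be checked numerically by counting the residues ka (mod p) that fall in the upper half (p/2, p). The following sketch uses a function name of our own and assumes p is an odd prime with (a, p) = 1.

```python
def gauss_lemma_symbol(a: int, p: int) -> int:
    """(a/p) computed as (-1)^nu, where nu counts k <= (p-1)/2 with p/2 < (k*a mod p) < p."""
    half = (p - 1) // 2
    # For odd p, an integer r satisfies p/2 < r < p exactly when half < r <= p - 1.
    nu = sum(1 for k in range(1, half + 1) if (k * a) % p > half)
    return (-1) ** nu


print(gauss_lemma_symbol(2, 17), gauss_lemma_symbol(3, 17))
```

Comparing the output with Euler's criterion for a few primes is a useful sanity check of the lemma.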

9.2. Quadratic reciprocity law


We first recall the statement.
Theorem 22 (Quadratic reciprocity law). For all distinct odd primes p, q we have
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$
In particular, the right-hand side is = −1 if and only if p ≡ q ≡ 3 (mod 4).
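Proof. We have two cases to consider: (A) p ≡ q (mod 4), and (B) p ≡ −q (mod 4).
Write correspondingly
(A) p = q + 4j;
(B) p = −q + 4j.
In case (A) we find
$$\left(\frac{p}{q}\right) = \left(\frac{q + 4j}{q}\right) = \left(\frac{j}{q}\right), \qquad \left(\frac{q}{p}\right) = \left(\frac{p - 4j}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{j}{p}\right) = (-1)^{\frac{p-1}{2}} \left(\frac{j}{p}\right).$$
In case (B) we find
$$\left(\frac{p}{q}\right) = \left(\frac{-q + 4j}{q}\right) = \left(\frac{j}{q}\right), \qquad \left(\frac{q}{p}\right) = \left(\frac{-p + 4j}{p}\right) = \left(\frac{j}{p}\right).$$
Hence the quadratic reciprocity law follows from proving that $\left(\frac{j}{p}\right) = \left(\frac{j}{q}\right)$. We do this
by using Gauss’ lemma, which asserts that $\left(\frac{j}{p}\right)$ is determined by the parity of $\nu_p$, which
counts how many multiples of j lie in the intervals
$$\left(\tfrac{p}{2}, p\right) \cup \left(\tfrac{3p}{2}, 2p\right) \cup \cdots \cup \left(\left(b - \tfrac{1}{2}\right)p, bp\right),$$
where b is either j/2 or (j − 1)/2, whichever is an integer. Dividing throughout by j,
we see that $\nu_p$ is the total number of integers in the union
$$\left(\frac{p}{2j}, \frac{p}{j}\right) \cup \left(\frac{3p}{2j}, \frac{2p}{j}\right) \cup \cdots \cup \left(\frac{(2b - 1)p}{2j}, \frac{bp}{j}\right).$$
Similarly $\nu_q$ (determined by $\left(\frac{j}{q}\right) = (-1)^{\nu_q}$) is the total number of integers in the union
$$\left(\frac{q}{2j}, \frac{q}{j}\right) \cup \left(\frac{3q}{2j}, \frac{2q}{j}\right) \cup \cdots \cup \left(\frac{(2b - 1)q}{2j}, \frac{bq}{j}\right). \tag{9.1}$$
Using that p = q + 4j, $\nu_p$ is the total number of integers in the union
$$\left(\frac{q}{2j} + 2, \frac{q}{j} + 4\right) \cup \left(\frac{3q}{2j} + 6, \frac{2q}{j} + 8\right) \cup \cdots \cup \left(\frac{(2b - 1)q}{2j} + 4b - 2, \frac{bq}{j} + 4b\right). \tag{9.2}$$
Comparing (9.1) and (9.2) we conclude that $\nu_p$ and $\nu_q$ have the same parity. By Gauss’
lemma, this proves $\left(\frac{j}{p}\right) = \left(\frac{j}{q}\right)$. The same argument can be made to show that this
equality also holds in case (B). □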

9.3. Jacobi and Kronecker symbols


We discuss two important extensions of the Legendre symbol.
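Definition 41. For n ∈ N odd and a an integer with (a, n) = 1, the Jacobi symbol
is given by
$$\left(\frac{a}{n}\right) := \left(\frac{a}{p_1}\right)^{k_1} \left(\frac{a}{p_2}\right)^{k_2} \cdots \left(\frac{a}{p_r}\right)^{k_r},$$
where $n = p_1^{k_1} \cdots p_r^{k_r}$ is the prime factorization of n, and $\left(\frac{a}{p}\right)$ denotes the usual Legendre
symbol. Following the usual convention for the empty product, we set $\left(\frac{a}{1}\right) = 1$.
One can check that the Jacobi symbol has most of the same properties as the
Legendre symbol. In particular
$$\left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$$
whenever a ≡ b (mod n), and
$$\left(\frac{ab}{n}\right) = \left(\frac{a}{n}\right)\left(\frac{b}{n}\right), \qquad \left(\frac{a}{mn}\right) = \left(\frac{a}{m}\right)\left(\frac{a}{n}\right).$$
Further we have the quadratic reciprocity law for any pair of coprime odd numbers
m, n,
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}$$
and the supplementary laws
$$\left(\frac{-1}{n}\right) = (-1)^{\frac{n-1}{2}}, \qquad \left(\frac{2}{n}\right) = (-1)^{\frac{n^2-1}{8}}.$$
However, there are important differences as well. For instance, $\left(\frac{a}{n}\right) = 1$ does not imply
that a is a quadratic residue (mod n); e.g.,
$$\left(\frac{2}{9}\right) = 1$$
but 2 is not a quadratic residue (mod 9). On the other hand, if $\left(\frac{a}{n}\right) = -1$ then a is
guaranteed to not be a quadratic residue (mod n).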
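The rules above (periodicity in a, the supplementary law for 2, and reciprocity) are enough to evaluate a Jacobi symbol without factoring n, much like Euclid's algorithm. The following is a sketch of this standard procedure; the function name is our own, and returning 0 when (a, n) > 1 is a convention added here for convenience, not part of Definition 41.

```python
def jacobi_symbol(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n >= 1; returns 0 when a and n are not coprime."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):          # supplementary law: (2/n) = -1 iff n ≡ 3, 5 (mod 8)
                result = -result
        a, n = n, a                      # reciprocity: swap the two odd arguments
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(jacobi_symbol(2, 9), jacobi_symbol(34, 97))
```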
The Kronecker symbol is an extension of the Jacobi symbol to all natural numbers.
It is obtained by extending the Jacobi symbol with the definition of
$$\left(\frac{a}{2}\right) = \begin{cases} 1 & \text{if } a \equiv 1, 7 \pmod 8 \\ -1 & \text{if } a \equiv 3, 5 \pmod 8 \end{cases}$$
Once again the usual rules can be established, but they become more complicated. For
example
$$\left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$$
whenever a ≡ b (mod n) except if n ≡ 2 (mod 4), in which case the equality holds
whenever a ≡ b (mod 4n). The Kronecker symbol satisfies the following version of
quadratic reciprocity. Write n ∈ N as $n = 2^e n'$ with e ≥ 0 and n′ odd, and similarly $m = 2^f m'$. Then for
any coprime m, n ∈ N we have
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m'-1}{2} \cdot \frac{n'-1}{2}}.$$
9.4. Homework 9
(1) Use quadratic reciprocity to find all odd primes for which 3 is a quadratic
residue, i.e., $\left(\frac{3}{p}\right) = 1$.
(2) Use Gauss’ lemma to find all odd primes for which 3 is a quadratic residue.
(3) Show that if p, q are twin primes, i.e., |p − q| = 2, then $\left(\frac{q}{p}\right) = \left(\frac{p}{q}\right)$.
(4) Let n ∈ N be odd and (a, n) = 1. Show that if the Jacobi symbol is $\left(\frac{a}{n}\right) = -1$
then a is not a quadratic residue (mod n).
(5) Show that if m, n ≡ 0, 1 (mod 4) and are coprime then the quadratic reciprocity
law for the Kronecker symbol asserts that $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = 1$.
CHAPTER 10

Applications of Residue Symbols

10.1. Apollonian circle packings


The study of Apollonian circle packings has its origin in the following theorem of
the ancient Greek geometer Apollonius (262–190 BC).
Theorem 23 (Apollonius). For any configuration of three mutually tangent circles
in the plane, there are exactly two other circles that are tangent to all three.

Looking at this picture, it is clear that there is a symmetry connecting the two
new circles; they are reflections of one another with respect to the circle that passes
through the tangency points of the original three circles.

Figure 1. $C_4'$ is obtained from $C_4$ by inversion with respect to the
unique circle passing through the tangency points of $C_1, C_2, C_3$.

In other words, given the first four circles, you know everything about the fifth.
This should apply in particular to the numerical invariants of these circles, namely their
center and radius. Since Apollonius’ theorem does not depend on the position of the
circles in the plane, we only consider the radii. To state the following algebraic identity
in the simplest possible way, we will need to introduce some notation. The curvature
of a circle with radius r is defined to be 1/r. To each circle C in a configuration of four
mutually tangent circles we assign a signed curvature, defined as the usual curvature if
C contains none of the three other circles in its interior, and minus the usual curvature
if C contains all three other circles in its interior. For example in Figure 1, $C_2, C_3, C_4$
would be assigned positive curvature and $C_1$ negative curvature.
Theorem 24 (Descartes, 1643). For any configuration of four mutually tangent
circles with signed curvatures $a_1, a_2, a_3, a_4 \in \mathbb{R}$, we have the identity
$$(a_1 + a_2 + a_3 + a_4)^2 = 2(a_1^2 + a_2^2 + a_3^2 + a_4^2).$$
Remark 42. Observe that if $a_1, a_2, a_3$ are fixed, then the quadratic equation
$$\begin{aligned} &(a_1 + a_2 + a_3 + x)^2 = 2(a_1^2 + a_2^2 + a_3^2 + x^2) \\ \iff\ & x^2 - 2(a_1 + a_2 + a_3)x + \left(2(a_1^2 + a_2^2 + a_3^2) - (a_1 + a_2 + a_3)^2\right) = 0 \end{aligned}$$
has two solutions $a_4, a_4'$, related by the identity
$$a_4 + a_4' = 2(a_1 + a_2 + a_3). \tag{10.1}$$
This is an algebraic expression of Apollonius’ theorem.
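Identity (10.1) gives a simple rule for generating new curvatures: in a quadruple of mutually tangent circles, any one curvature can be replaced by twice the sum of the other three minus itself. The sketch below applies this rule repeatedly, starting from the quadruple (−3, 5, 8, 8), which satisfies Descartes' identity. The function names and the breadth-first bookkeeping are our own illustrative choices.

```python
def satisfies_descartes(quadruple) -> bool:
    """Check (a1 + a2 + a3 + a4)^2 == 2 (a1^2 + a2^2 + a3^2 + a4^2)."""
    total = sum(quadruple)
    return total * total == 2 * sum(c * c for c in quadruple)


def swap_circle(quadruple, i: int):
    """Replace the i-th curvature by the other root of the quadratic, using (10.1)."""
    others = sum(quadruple) - quadruple[i]
    swapped = list(quadruple)
    swapped[i] = 2 * others - quadruple[i]
    return tuple(swapped)


def curvatures_up_to_depth(root, depth: int) -> set:
    """All curvatures seen after at most `depth` successive swaps from the root quadruple."""
    seen = set(root)
    frontier = {tuple(root)}
    for _ in range(depth):
        frontier = {swap_circle(q, i) for q in frontier for i in range(4)}
        for q in frontier:
            seen.update(q)
    return seen


root = (-3, 5, 8, 8)
print(satisfies_descartes(root), sorted(curvatures_up_to_depth(root, 2)))
```

Since every swap produces integers from integers, all curvatures obtained this way from an integral quadruple are again integers.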
An Apollonian circle packing is obtained by starting from a “root configuration”
of four mutually tangent circles and generating smaller and smaller circles in the interstices
according to Apollonius’ theorem.

Figure 2. By Time3000 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=3577043

An Apollonian circle packing A is called a primitive integral Apollonian packing
if its root configuration has integer curvatures $a_1, a_2, a_3, a_4$ and $(a_1, a_2, a_3, a_4) = 1$.

Figure 3. The primitive integral Apollonian packing with root configuration (−3, 5, 8, 8).

For a number theorist, the natural question to ask is: Which integers appear as
curvatures in a primitive integral Apollonian circle packing? Looking at the
example in Figure 3, one sees that

• Except for the outer circle, which has signed curvature −3, every circle in the
packing will have positive signed curvature;
• The signed curvatures 1,2,3,4,6,7 do not appear;
• Each signed curvature that appears is ≡ 0 or 1 (mod 4). (See the homework
at the end of this chapter).
The local-global conjecture states that these are the only possible types of
obstructions to a natural number appearing as a curvature in a primitive integral
Apollonian packing.
Conjecture 43 (Local–global conjecture). For a primitive integral Apollonian packing,
every sufficiently large number in an admissible residue class will appear as a
curvature.
Impressive machinery has been developed in the last decade to make progress on
this conjecture. Then two years ago, during a Research Experience for Undergraduates
(REU) project at the University of Colorado, it was discovered that the conjecture is
... false.¹
Theorem 25 (Haag–Kertzer–Rickards–Stange). No square curvatures $n^2$ appear
in the primitive integral Apollonian packing A generated by (−3, 5, 8, 8).
Sketch of proof. We will take the following facts on primitive integral Apollo-
nian packings for granted. Fix a circle C ∈ A and let n be the curvature of C.
(i) For each C′ ∈ A, there exists a path of tangent circles with coprime curvatures
that connects C to C′;
(ii) For each C′ ∈ A tangent to C and with curvature m coprime to n, we have
$m \equiv Ax^2 \pmod n$ for some integer x, where A is a constant independent
of C′.
¹ You can read more about this story here or here, and find the final research paper here.

Recall that each curvature in this packing satisfies n ≡ 0, 1 (mod 4). Then by (ii) the
Kronecker symbol $\left(\frac{m}{n}\right)$ satisfies
$$\left(\frac{m}{n}\right) = \left(\frac{A}{n}\right).$$
In particular, since the constant A depends only on the circle C, we can define
$$\chi(C) := \left(\frac{A}{n}\right)$$
to be the Kronecker symbol of the circle C. In the previous homework you had to
show that if two numbers m, n satisfy m, n ≡ 0, 1 (mod 4) and are coprime then the
quadratic reciprocity law for the Kronecker symbol asserts that $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = 1$. This
implies that for any two tangent circles C, C′ ∈ A with coprime curvatures, we have
χ(C) = χ(C′).
Let $C_0$ be the circle of curvature 5 appearing in the root configuration. By (i) we
conclude that $\chi(C) = \chi(C_0)$ for all C ∈ A, i.e., the Kronecker symbol is constant
over all circles of the packing. It remains to compute $\chi(C_0)$; this is achieved by the
Legendre symbol computation
$$\chi(C_0) = \left(\frac{8}{5}\right) = -1.$$
To conclude, we observe that any circle C appearing in A with square curvature $n^2$
would have Kronecker symbol χ(C) = 1, which is impossible by the ‘reciprocity
obstruction’ just established. □

10.2. The arithmetic of Gaussian integers


A Gaussian integer is a complex number of the form a + ib where a, b ∈ Z and i
is the imaginary unit. The set of all Gaussian integers,
Z[i] = {u = a + ib : a, b ∈ Z},
can be thought of as a square grid in the complex plane C. The set Z[i] is a ring with
respect to the standard operations of addition and multiplication for complex numbers.
Definition 44. The norm of a Gaussian integer α = a + ib is given by
$$N(\alpha) = \alpha \cdot \bar{\alpha} = (a + ib)(a - ib) = a^2 + b^2.$$
Example 45. Observe that N(α) = 0 if and only if α = 0, N(α) = 1 if and only if α = ±1, ±i,
N(α) = 2 if and only if α = ±1 ± i, and there is no α ∈ Z[i] such that N(α) ≡ 3 (mod 4);
this is the ‘easy’ part of Fermat’s theorem. The hard part of Fermat’s theorem
amounts to showing that if p is an odd prime satisfying p ≡ 1 (mod 4) then there exists
α ∈ Z[i] such that N(α) = p.
Gaussian integers α for which N (α) = 1 are called units. The units of Z[i] are
thus ±1, ±i. For comparison, the units of Z are ±1. If two Gaussian integers are the
same up to multiplication by a unit, i.e., α = uβ for u ∈ {±1, ±i}, we say that they
are associates.
10.2. THE ARITHMETIC OF GAUSSIAN INTEGERS 61

Lemma 46. For any complex numbers α, β ∈ C,
$$N(\alpha\beta) = N(\alpha)N(\beta).$$
Proof. Express α, β in polar coordinates: $\alpha = re^{i\phi}$, $\beta = se^{i\varphi}$. Then $\alpha\beta = (rs)e^{i(\phi+\varphi)}$ and $N(\alpha\beta) = (rs)^2 = r^2 s^2 = N(\alpha)N(\beta)$. □
Theorem 26 (Division with remainder). Let α, β ∈ Z[i], β ≠ 0. There exist
ω, ρ ∈ Z[i] such that α = ωβ + ρ with 0 ≤ N(ρ) < N(β).
Proof. Consider the complex number α/β as a point in the plane. Choose ω ∈ Z[i]
to be a closest Gaussian integer to α/β. Note that ω ∈ Z[i] is not uniquely defined;
α/β could be equidistant to four distinct Gaussian integers.

Figure 4. Depending on the position of α/β, represented on these two
figures as the red dot, the choice of closest Gaussian integer is unique
or not. In the worst case scenario, the red dot is equidistant to four
Gaussian integers.

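Since N(α/β − ω) is the square of the distance between α/β and ω, and every point of the plane lies within distance $\frac{1}{\sqrt{2}}$ of some Gaussian integer, we have the
upper bound
$$N(\alpha/\beta - \omega) \le \frac{1}{2}.$$
Now set ρ := α − ωβ. Since Z[i] is a ring, we have ρ ∈ Z[i]. Moreover N(ρ) ≥ 0 and
by the multiplicativity of the norm we have
$$N(\rho) = N(\beta(\alpha/\beta - \omega)) = N(\beta)N(\alpha/\beta - \omega) \le \frac{1}{2} N(\beta) < N(\beta).$$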

Given division with remainder, Euclid’s algorithm holds in Z[i], implying the fun-
damental theorem of arithmetic for Z[i]: every Gaussian integer can be written as a
product of Gaussian primes in a unique way up to multiplication by a unit (i.e.,
α = ±1, ±i). For instance
5 = (2 + i)(2 − i) = (1 + 2i)(1 − 2i)
are seen as the same factorization given that −i(2 + i) = (1 − 2i) and i(2 − i) = (1 + 2i).
This is similar to considering 6 = 2 · 3 = (−2)(−3).
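The rounding argument in the proof of Theorem 26 translates directly into a procedure. The following is a minimal sketch, assuming Gaussian integers are stored as pairs `(re, im)` of Python integers; the names `gauss_divmod`, `gauss_gcd` and `norm` are illustrative choices, not notation from these notes.

```python
def norm(a):
    """Norm N(a) = re^2 + im^2 of a Gaussian integer given as a pair (re, im)."""
    return a[0] * a[0] + a[1] * a[1]


def gauss_divmod(a, b):
    """Return (q, r) with a = q*b + r and 2*N(r) <= N(b); b must be nonzero."""
    (a1, a2), (b1, b2) = a, b
    n = norm(b)
    # a / b = a * conj(b) / N(b); x and y are the numerators of its real
    # and imaginary parts.
    x = a1 * b1 + a2 * b2
    y = a2 * b1 - a1 * b2
    # Nearest integers to x/n and y/n, computed in exact integer arithmetic.
    q1 = (2 * x + n) // (2 * n)
    q2 = (2 * y + n) // (2 * n)
    # r = a - q*b, with q*b = (q1*b1 - q2*b2) + i(q1*b2 + q2*b1).
    r = (a1 - (q1 * b1 - q2 * b2), a2 - (q1 * b2 + q2 * b1))
    return (q1, q2), r


def gauss_gcd(a, b):
    """Euclid's algorithm in Z[i]; the result is determined up to a unit."""
    while b != (0, 0):
        _, r = gauss_divmod(a, b)
        a, b = b, r
    return a


if __name__ == "__main__":
    print(gauss_divmod((7, 3), (2, 1)))
    print(gauss_gcd((5, 0), (2, 1)))
```

Since the norm of the remainder at least halves at every step, the loop in `gauss_gcd` terminates after at most about log2 N(b) divisions.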
Definition 47. A Gaussian integer π ∈ Z[i] is a Gaussian prime if N (π) > 1 and
if it can not be written as a product π = αβ of two Gaussian integers α, β ∈ Z[i] with
1 < N (α), N (β) < N (π).

Euclid’s lemma for Gaussian primes (equivalent to the fundamental theorem of
arithmetic) states that if π is a Gaussian prime and π | αβ then π | α or π | β. We
can now close this chapter with a proof of Fermat’s theorem on the representation of
primes as sums of two squares.
Theorem 27 (Fermat). Let p be an odd prime. Then p is a sum of two squares if
and only if p ≡ 1 (mod 4).
Proof. As discussed earlier, it is easy to show that if an odd prime p can be
expressed as $p = a^2 + b^2$ for some a, b ∈ N then p ≡ 1 (mod 4). Suppose instead that
p ≡ 1 (mod 4). Then $\left(\frac{-1}{p}\right) = 1$, that is, we have $p \mid (x^2 + 1) = (x + i)(x - i)$ for some
x ∈ N. Since p does not divide either x + i or x − i (the quotients $\frac{x}{p} \pm \frac{i}{p}$ are not Gaussian integers), Euclid’s lemma implies that p
is not a Gaussian prime. Hence we may factor p as p = αβ for some α, β ∈ Z[i] with
$1 < N(\alpha), N(\beta) < N(p) = p^2$. Since by multiplicativity $N(p) = N(\alpha)N(\beta) = p^2$, we
conclude that $p = N(\alpha) = a^2 + b^2$, where α = a + ib, which expresses p as a sum of two squares. Note
that ab ≠ 0, since otherwise p would be a perfect square. □
10.3. Homework 10
(1) Show that each curvature appearing in the Apollonian packing with root con-
figuration (−3, 5, 8, 8) (see Figure 3) satisfies n ≡ 0, 1 (mod 4).
(2) Use the norm of Gaussian integers to prove the famous identity of Fibonacci:
$$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2.$$
(3) Show that the proof of Theorem 26 extends to $\mathbb{Z}[\sqrt{-2}]$ but not to $\mathbb{Z}[\sqrt{-3}]$.
(4) Show that the fundamental theorem of arithmetic (“unique factorization”) does
not hold for $\mathbb{Z}[\sqrt{-3}]$.²
(5) Reread Chapter 3 to review for yourself the relations between division with
remainder, Euclid’s algorithm, Euclid’s lemma, the fundamental theorem of
arithmetic, and Bézout’s identity.

² Review Example 2 in Chapter 2.2 and find a counterexample of unique factorization for $\mathbb{Z}[\sqrt{-3}]$.
CHAPTER 11

Primes

11.1. Gaussian primes


Is there an easy way to check whether a Gaussian integer is prime? The following
easy lemma gives us a first criterion.
Lemma 48. Let π ∈ Z[i]. If N (π) is prime then π is a Gaussian prime.
Proof. Suppose that π were composite, say π = αβ with 1 < N (α), N (β) < N (π).
Then, by multiplicativity, so is its norm N (π) = N (α)N (β). 
Example 49. Let α = 10 + 7i. Since N (α) = 149 is a prime, we can conclude that α
is a Gaussian prime.
The following discussion is based on the proof of Fermat’s theorem on sums of two
squares from the last chapter (Theorem 27). In the course of the proof we showed that if a
prime p ∈ N is not a Gaussian prime, then p is a sum of two squares, and hence p ≢ 3 (mod
4). The contrapositive is the following statement.
Proposition 50. Every prime p ≡ 3 (mod 4) is a Gaussian prime. □
Moreover, the same proof taught us that every prime p ≡ 1 (mod 4), when seen
as a Gaussian integer, is composite; in fact that p = αβ with N (α) = N (β) = p. In
particular, α, β are Gaussian primes by Lemma 48 above. We can slightly refine this
statement to the following proposition.
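Proposition 51. Every prime p ≡ 1 (mod 4) splits as $p = \pi\bar{\pi}$ for a Gaussian prime π.
Proof. Write p = αβ with $\alpha = a_1 + ia_2$, $\beta = b_1 + ib_2$ and N(α) = N(β) = p. We will show that $\beta = \bar{\alpha}$. We have
$$\alpha\beta = (a_1 + ia_2)(b_1 + ib_2) = (a_1b_1 - a_2b_2) + i(a_1b_2 + a_2b_1) = p = a_1^2 + a_2^2 = b_1^2 + b_2^2.$$
Note that $a_1a_2b_1b_2 \neq 0$. Since p is real, the imaginary part vanishes: $a_1b_2 + a_2b_1 = 0$. If we use that $a_2 = -\frac{a_1b_2}{b_1}$ we find that
$$a_1b_1 - a_2b_2 = \frac{a_1}{b_1}(b_1^2 + b_2^2) \overset{!}{=} b_1^2 + b_2^2,$$
which implies $a_1 = b_1$. Similarly, using that $a_1 = -\frac{a_2b_1}{b_2}$, we find that
$$a_1b_1 - a_2b_2 = -\frac{a_2}{b_2}(b_1^2 + b_2^2) \overset{!}{=} b_1^2 + b_2^2,$$
which implies $b_2 = -a_2$. □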
Remark 52. In modern algebraic lingo, we say that primes of the form p ≡ 1 (mod
4) are split and that primes of the form p ≡ 3 (mod 4) are inert. Finally, we note
that 2 = (1 + i)(1 − i), where 1 + i and 1 − i are associates since 1 − i = −i(1 + i); we say that 2 is
ramified. For comparison, we
can easily check that $\pi$ and $\bar{\pi}$ in Proposition 51 are not associates.
We can now characterize Gaussian primes by relating them back to rational (ordi-
nary) primes.
Theorem 28. Let π = a + ib ∈ Z[i], π ≠ 0. If ab ≠ 0, the Gaussian integer π is a
Gaussian prime if and only if $a^2 + b^2$ is prime. If ab = 0, i.e., π = a or π = ib, π is a
Gaussian prime if and only if |π| is prime and |π| ≡ 3 (mod 4).
The proof builds on the following simple observation.
Lemma 53. If π is a Gaussian prime, there exists a rational prime p ∈ N such that
π | p.
Proof. By definition of the norm, we have π | N (π). Since π is a Gaussian prime,
Euclid’s lemma implies that π | p, where p is some prime factor of N (π). 
Proof of Theorem 28. Let π be a Gaussian prime with π | p.
If p ≡ 3 (mod 4), then p is a Gaussian prime and π is associate to p.
If p ≡ 1 (mod 4), there exists some Gaussian prime π′ such that $p = \pi'\bar{\pi}'$, so that $\pi \mid \pi'\bar{\pi}'$ and by
Euclid’s lemma, π is associate to either π′ or its conjugate. Moreover, N(π) = N(π′) =
p. The same conclusion holds if p = 2.
Conversely, if N(π) is prime or if |π| ≡ 3 (mod 4) and |π| is prime, then π is a
Gaussian prime. □
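Theorem 28 gives a practical primality test for Gaussian integers. The sketch below is one possible transcription, assuming trial division is acceptable for the rational primality check; the function names `is_prime` and `is_gaussian_prime` are illustrative.

```python
def is_prime(n):
    """Trial-division primality test for rational integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_gaussian_prime(a, b):
    """Test whether a + ib is a Gaussian prime, following Theorem 28."""
    if a != 0 and b != 0:
        # Case ab != 0: prime if and only if the norm a^2 + b^2 is prime.
        return is_prime(a * a + b * b)
    # Case ab = 0: |a + ib| must be a rational prime congruent to 3 mod 4.
    m = abs(a) + abs(b)
    return is_prime(m) and m % 4 == 3


if __name__ == "__main__":
    for z in [(10, 7), (3, 0), (5, 0), (2, 0), (1, 1)]:
        print(z, is_gaussian_prime(*z))
```

For instance, the call on `(10, 7)` corresponds to Example 49, where the norm 149 is prime.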

11.2. Rational primes


We have seen earlier that there exist infinitely many (odd) primes and in fact
infinitely many primes of the form p ≡ 1 (mod 4) (respectively, of the form p ≡ 3 (mod
4)). Refining further, every odd prime satisfies p ≡ 1, 3, 5, or 7
(mod 8); does each one of these arithmetic progressions contain infinitely many primes?

Theorem 29 (Dirichlet, 1837). For (a, n) = 1, there are infinitely many primes of
the form p ≡ a (mod n).
The proof of Dirichlet’s theorem is unfortunately outside of the scope of this course,
but should be studied by anyone who wants to delve deeper in number theory! In this
section, we will review a few foundational results related to the distribution of prime
numbers. We saw earlier with Eratosthenes sieve that primes become sparser and
sparser the larger the numbers we consider. This is no surprise; a large number is more
likely to have non-trivial divisors.
Let $p_n$ denote the n-th prime and set $g_n = p_{n+1} - p_n$ to be the gap between $p_n$ and
the next larger prime. We first show that
$$\limsup_{n\to\infty} g_n = \infty$$
via the following proposition.
Proposition 54. There exist arbitrarily large gaps between successive primes.
Proof. Choose p a large prime. Then p! + 2, p! + 3, . . . , p! + p is a sequence of
p − 1 consecutive composite numbers. Since there are infinitely many prime numbers,
we may take p to be arbitrarily large. 
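The construction in this proof is easy to check by machine: each p! + k with 2 ≤ k ≤ p is divisible by k. A minimal sketch, where the function name `composite_run` is an illustrative choice:

```python
from math import factorial


def composite_run(p):
    """Return the list p! + 2, ..., p! + p together with a witness divisor for each."""
    base = factorial(p)
    # k divides p! for 2 <= k <= p, hence k divides p! + k as well.
    return [(base + k, k) for k in range(2, p + 1)]


if __name__ == "__main__":
    for value, divisor in composite_run(7):
        print(value, "is divisible by", divisor)
```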
On the other hand, the twin prime conjecture asserts that
$$\liminf_{n\to\infty} g_n = 2,$$
i.e., that there exist infinitely many pairs of successive primes that differ by 2, such as
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), . . . . We currently know that
$$\liminf_{n\to\infty} g_n \le N,$$
where the first breakthrough is due to Zhang, who showed in 2013 that one can take $N = 7 \cdot 10^7$; this bound
was brought down to N = 600 (Maynard, 2013) and N = 246 (Polymath, 2014). You
can check out this Numberphile episode to hear more.
For n ≥ 1, we define the prime counting function to be
$$\pi(n) = \#\{p \le n : p \text{ prime}\}.$$
Although the distribution of the prime numbers is irregular, their average distribution
behaves remarkably regularly. We have, e.g., π(10) = 4, π(1000) = 168, and the plot
of π(n) looks as follows.
[Figure: plot of the prime counting function π(n).]

In the 1790s, Legendre and Gauss both conjectured, based on numerical observations, that the density of prime numbers among the first n whole numbers is approximately
$(\log n)^{-1}$ (where log denotes the natural logarithm); this is usually expressed
as
$$\frac{\pi(n)}{n} \sim \frac{1}{\log n} \qquad (n \to \infty)$$
to say that the two quantities are asymptotically equivalent.
Definition 55. Let f(x), g(x) be two functions of one variable, and suppose that
g(x) > 0 with at most finitely many exceptions. We say that f(x) and g(x) are asymptotically equivalent, written f(x) ∼ g(x) as x → ∞, if and only if
$$\lim_{x\to\infty} \frac{f(x)}{g(x)} = 1.$$

Remark 56. Being asymptotically equivalent is (no surprise) an equivalence relation.


Example 57. When considering the asymptotic growth of a function, we are only
concerned with its leading term; for instance, if $f(x) = 2x^2 + 3x + 2$, then $f(x) \sim 2x^2$
as x → ∞.
The conjecture was eventually proved by de la Vallée Poussin and Hadamard inde-
pendently in 1896, using complex analysis.
Theorem 30 (Prime number theorem). $\pi(n) \sim \frac{n}{\log n}$ as n → ∞.
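Corollary 58. We have $p_n \sim n \log n$ as n → ∞, and the gaps satisfy $g_n \sim \log n$ on average, in the sense that $\frac{1}{n}(g_1 + \cdots + g_n) \sim \log n$ as n → ∞.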
Proof. The prime number theorem implies that
$$\pi(p_n) = n \sim \frac{p_n}{\log(p_n)}.$$
This is equivalent to stating $p_n \sim n \log(p_n)$, so it remains to show that $\log(p_n) \sim \log(n)$.
Observe that the previous asymptotic equivalence implies
$$\lim_{n\to\infty} \frac{p_n}{n \log p_n} = 1 \implies \lim_{n\to\infty} \big(\log(p_n) - \log(n \log(p_n))\big) = 0 \implies \lim_{n\to\infty} \frac{\log(p_n)}{\log(n)} = 1,$$
where the last implication follows upon dividing by $\log(p_n)$, since $\frac{\log\log p_n}{\log p_n} \to 0$ as n → ∞. We leave the second statement as homework. □
How good is the (asymptotic) approximation of π(x) given by the prime number
theorem? For any constant c we have
$$\frac{1}{\log(n) + c} \sim \frac{1}{\log(n)}$$
as n → ∞. It turns out that taking c = −1 yields a consistently better approximation
of π(x) than c = 0. For illustration, here are a few values.
x       π(x)        x/log x     x/(log x − 1)   R(x)
10^3    168         145         169             168
10^5    9 592       8 686       9 512           9 587
10^8    5 761 455   5 428 681   5 740 304       5 761 552

The function in the rightmost column, which gives the better approximation, was introduced
by Riemann and can be expressed as
$$R(x) = 1 + \sum_{n=1}^{\infty} \frac{1}{n\zeta(n+1)} \frac{(\log x)^n}{n!},$$
where ζ is the Riemann zeta-function
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \dots,$$
where s is a complex variable with Re(s) > 1 (to guarantee the convergence of the
series). Although Riemann couldn’t prove the prime number theorem, he conjectured
the following exact formula
$$\pi(x) = R(x) - \sum_{\rho} R(x^{\rho}),$$

where the sum runs over the set of the (‘non-trivial’) zeros of ζ. Without going into
details, this is to say that the fluctuations of π(x) depend on the location of the zeros
of ζ. The Riemann hypothesis, one of the six remaining open Millennium Prize
Problems, states that all such zeros lie on the vertical line $\frac{1}{2} + i\mathbb{R}$ in the complex plane.
It would imply that the primes have as regular a distribution as one could hope for.
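The first columns of the comparison between π(x), x/log x and x/(log x − 1) discussed in this section can be reproduced for moderate x with a simple sieve of Eratosthenes. The following is a sketch under the assumption that x ≥ 2 is small enough for the sieve to fit in memory; the function name `prime_count` is illustrative, and Riemann's function R(x) is not computed here.

```python
from math import isqrt, log


def prime_count(x):
    """Return pi(x), the number of primes p <= x, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, isqrt(x) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


if __name__ == "__main__":
    for x in (10**3, 10**5):
        print(x, prime_count(x), round(x / log(x)), round(x / (log(x) - 1)))
```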
11.3. Homework 11
(1) Find the prime factorization of 3 − i, 4 + 7i, 5 + i.¹
(2) Goldbach’s strong conjecture states that every even number n > 2 can be
written as a sum of two primes. Goldbach’s weak conjecture (settled by
Helfgott in 2013) states that every odd number n > 5 can be written as a sum
of three primes. Show that the strong conjecture implies the weak one.
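(3) Show that the function $n \mapsto g_n$ is not multiplicative.
(4) Show that the gaps satisfy $g_n \sim \log n$ on average, i.e., that $\frac{1}{n}(g_1 + \cdots + g_n) \sim \log n$ as n → ∞.
(5) Use the prime number theorem to show that there are infinitely many prime
numbers with leading digit 7.²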

¹ Hint: Start by computing the prime factorization of the norm.
² Hint: Use the definition of limit to show that for the number Π(k) of primes with leading digit
7 and k further digits, we have Π(k) > 0 for infinitely many k.
CHAPTER 12

Generating Functions

12.1. Ordinary generating functions


Given a sequence $(a_n)_{n\ge 0}$, either finite or infinite, one way of studying its patterns
is to represent the $a_n$’s as the coefficients of a (formal) power series. The ordinary
generating function of a sequence $(a_n)_{n\ge 0}$ is
$$F(x) := \sum_{n=0}^{\infty} a_n x^n.$$
For the moment, we view F(x) as a formal power series, and leave issues of convergence aside.
Example 59. The constant sequence $a_n = 1$ has as (ordinary) generating function the
geometric series
$$\sum_{n=0}^{\infty} x^n = (1 - x)^{-1}.$$
Formally this identity is obtained by the operation of shifting:
$$(1 - x)\sum_{n=0}^{\infty} x^n = \sum_{n=0}^{\infty} x^n - \sum_{n=1}^{\infty} x^n = x^0 = 1.$$
The same trick shows that for every N ≥ 1, we have
$$\sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x}.$$
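Example 60. Recall the definition of the Fibonacci numbers: $F_0 = 0$, $F_1 = 1$, and
$F_n = F_{n-1} + F_{n-2}$ for every n ≥ 2. The generating function for $(F_n)_{n\ge 0}$ then satisfies
$$\sum_{n=0}^{\infty} F_n x^n = x + \sum_{n=2}^{\infty} F_{n-1} x^n + \sum_{n=2}^{\infty} F_{n-2} x^n = x + x\sum_{n=0}^{\infty} F_n x^n + x^2 \sum_{n=0}^{\infty} F_n x^n,$$
leading to the closed form expression
$$\sum_{n=0}^{\infty} F_n x^n = \frac{x}{1 - x - x^2} = \frac{-x}{(x + \varphi)(x + \bar{\varphi})},$$
where $\varphi, \bar{\varphi} = \frac{1 \pm \sqrt{5}}{2}$.
Further, by partial fraction decomposition, we find that
$$\sum_{n=0}^{\infty} F_n x^n = \frac{-x}{(x + \varphi)(x + \bar{\varphi})} = \frac{1}{\sqrt{5}}\,\frac{\bar{\varphi}}{x + \bar{\varphi}} - \frac{1}{\sqrt{5}}\,\frac{\varphi}{x + \varphi} = \sum_{n=0}^{\infty} \frac{\varphi^n - \bar{\varphi}^n}{\sqrt{5}}\, x^n,$$
where the last step uses the geometric series together with $\varphi\bar{\varphi} = -1$.
From this identity, we read off Binet’s identity (see Homework 1)
$$F_n = \frac{\varphi^n - \bar{\varphi}^n}{\sqrt{5}}.$$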
The previous example highlights several useful observations, which extend to the
following facts.
Proposition 61. Formal power series can be added and multiplied as follows:
$$F(x) + G(x) = \sum_{n=0}^{\infty} (a_n + b_n)\, x^n, \qquad F(x) \cdot G(x) = \sum_{n=0}^{\infty} \Big(\sum_{k=0}^{n} a_k b_{n-k}\Big) x^n.$$
With respect to these operations of addition and multiplication, the set of all formal
power series with coefficients in a commutative ring R is a ring itself, denoted R[[X]],
called the ring of formal power series.
Proposition 62. Two sequences $(a_n)$, $(b_n)$ are identical, i.e., $a_n = b_n$ for all n ≥ 0, if
and only if $\sum a_n x^n = \sum b_n x^n$.
Proposition 63. The sequence $(a_n)$ is defined by a linear recurrence relation if and
only if $\sum a_n x^n = \frac{P(x)}{Q(x)}$ for P(x), Q(x) polynomials of finite degrees.
Ordinary generating functions are widely used in combinatorics, probability theory,
and number theory. We highlight a few examples of applications.
Proposition 64. Given a set of n elements, let C(n, k) be the number of k-combinations,
i.e., the number of distinct ways to select k out of the n elements (where the ordering
does not matter). Then $C(n, k) = \binom{n}{k}$.
Proof. The finite generating function $\sum_{k=0}^{n} C(n, k)x^k$ is a polynomial of degree
n. The combinatorial meaning of the coefficients C(n, k) implies that this polynomial factors as
$$\underbrace{(1 + x)(1 + x) \cdots (1 + x)}_{n \text{ times}},$$
where each factor represents one of the n given elements, the summand $x^0 = 1$ corresponds to not selecting the element, and the summand $x^1 = x$ corresponds to its selection.
The statement then follows from the binomial theorem, i.e.,
$$\sum_{k=0}^{n} C(n, k)x^k = (1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.$$ □

Next we consider the most famous function of combinatorial number theory, the
partition function. Let p(n) be the number of ways n ∈ N can be written as a sum
of positive integers (where the order does not matter). For example
6=5+1=4+2=4+1+1=3+3=3+2+1=3+1+1+1=2+2+2
=2+2+1+1=2+1+1+1+1=1+1+1+1+1+1
so that p(6) = 11. The partition function does not satisfy any simple linear recurrence,
as follows from the following theorem. (In fact, the partition function has no known
closed form expression!)
Theorem 31 (Euler). Set p(0) = 1. We have the formal identity
$$\sum_{n=0}^{\infty} p(n)\, x^n = \prod_{n=1}^{\infty} \frac{1}{1 - x^n}.$$
Proof. Each p(n) is uniquely determined by ... Hence
$$\sum_{n=0}^{\infty} p(n)\, x^n = (1 + x + x^2 + \dots)(1 + x^2 + x^4 + \dots) \cdots (1 + x^n + x^{2n} + \dots) \cdots$$
$$= \frac{1}{1 - x} \cdot \frac{1}{1 - x^2} \cdots \frac{1}{1 - x^n} \cdots = \prod_{n=1}^{\infty} \frac{1}{1 - x^n}.$$ □
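Euler’s product gives a direct way to tabulate p(n): multiply the truncated series by the factors 1/(1 − x^m) one at a time. A minimal sketch, with the illustrative function name `partition_numbers`:

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] by expanding Euler's product."""
    p = [1] + [0] * limit  # the constant power series 1
    for part in range(1, limit + 1):
        # Multiplying by 1/(1 - x^part) amounts to the in-place update
        # p[n] += p[n - part], applied for increasing n.
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


if __name__ == "__main__":
    print(partition_numbers(10))
```

Only the factors with m ≤ `limit` are needed, since the remaining factors do not affect the coefficients of x^n for n ≤ `limit`.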

Additive number theory is concerned with the properties of subsets of integers under
addition. A familiar prototypical problem is to determine $r_k(n)$, the number of ways
to write n as a sum of k squares. The function $r_k(n)$ is called the sum-of-squares
function. The generating function for $r_k(n)$ is given in terms of another important
formal series,
$$\theta(x) := \sum_{n \in \mathbb{Z}} x^{n^2},$$
called the (Jacobi) theta function. Indeed,
$$\theta(x)^k = \sum_{n_1, \dots, n_k \in \mathbb{Z}} x^{n_1^2 + \cdots + n_k^2} = \sum_{n=0}^{\infty} r_k(n)\, x^n.$$
Theorem 32 (Jacobi).
$$\theta(x) = \prod_{n=1}^{\infty} \frac{(1 - x^{2n})^5}{(1 - x^n)^2 (1 - x^{4n})^2}.$$

Closed form expressions are known for $r_k(n)$ with k ≤ 8 and their absence when
k > 8 is explained by the theory of modular forms. Similarly, modular forms are
instrumental in proving the asymptotic formula for the partition function
$$p(n) \sim \frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}} \qquad (\text{as } n \to \infty)$$
(first stated by Ramanujan) and in the study of that other famous original problem of
additive number theory, Waring’s problem: Given k, does there exist s ∈ N such
that each positive integer can be written as a sum of at most s k-th powers?

In multiplicative number theory, one instead usually works with a different type of
generating series: given a sequence $(a_n)_{n\ge 1}$, its (formal) Dirichlet series is given by
$$F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}.$$
In practice, the variable s can be a complex number, but we will only consider real
values in this chapter, and again for the moment only consider F(s) as a formal series, leaving convergence issues aside. The prototype of a Dirichlet series is the zeta
function
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s},$$
which we will study in the next section. To conclude here, we take note of the following
proposition.
Proposition 65. Dirichlet series can be added and multiplied as follows,
$$F(s) + G(s) = \sum_{n=1}^{\infty} \frac{a_n + b_n}{n^s}, \qquad F(s) \cdot G(s) = \sum_{n=1}^{\infty} n^{-s} \sum_{d \mid n} a_d\, b_{n/d}.$$
The formal Dirichlet series form a ring.

12.2. The zeta function


Theorem 33 (Euler product). We have the formal identity
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} (1 - p^{-s})^{-1},$$
where the product on the right-hand side is over all prime numbers.
Proof. We start with the observation that
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 2^{-s} + 3^{-s} + 4^{-s} + \dots, \qquad \frac{1}{2^s}\sum_{n=1}^{\infty} \frac{1}{n^s} = 2^{-s} + 4^{-s} + 6^{-s} + \dots$$
so that the difference of the two series is
$$\left(1 - 2^{-s}\right) \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 3^{-s} + 5^{-s} + 7^{-s} + \dots$$
In other words, we have sieved out all multiples of 2. By the same argument, we can
sieve out all multiples of 3:
$$\left(1 - 3^{-s}\right)\left(1 - 2^{-s}\right) \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 5^{-s} + 7^{-s} + \dots.$$
Continuing this process leads to the statement. □

Remark 66. The application of Eratosthenes sieve reflects the fact that we can think
of the existence of the Euler product expansion as a reflection of unique factorization
(i.e., the fundamental theorem of arithmetic);
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \sum_{r=0}^{\infty} \frac{1}{p^{rs}} = \prod_{p} \frac{1}{1 - p^{-s}},$$
where the last step is an application of the geometric series.
Many interesting arithmetic sequences can be recovered from the zeta function. For
example, the Möbius function µ(n) is defined by
$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by the square of a prime}, \\ (-1)^k & \text{if } n \text{ is a product of } k \text{ distinct primes}, \end{cases}$$
and we have
$$\frac{1}{\zeta(s)} = \prod_{p} (1 - p^{-s}) = (1 - 2^{-s})(1 - 3^{-s})(1 - 5^{-s})(1 - 7^{-s}) \cdots$$
$$= 1 - (2^{-s} + 3^{-s} + \dots) + (2^{-s} 3^{-s} + 2^{-s} 5^{-s} + \dots) - (2^{-s} 3^{-s} 5^{-s} + \dots) + \dots$$
$$= \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}.$$
Similarly
$$\frac{\zeta(s)}{\zeta(2s)} = \prod_{p} \frac{(1 - p^{-s})(1 + p^{-s})}{1 - p^{-s}} = \prod_{p} (1 + p^{-s}) = \sum_{n=1}^{\infty} \frac{|\mu(n)|}{n^s}.$$
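Further examples include
$$\zeta(s)^2 = \sum_{n \ge 1} \frac{d(n)}{n^s}, \qquad \frac{\zeta(s)^2}{\zeta(2s)} = \sum_{n \ge 1} \frac{2^{\omega(n)}}{n^s}, \qquad \frac{\zeta(s - 1)}{\zeta(s)} = \sum_{n \ge 1} \frac{\varphi(n)}{n^s},$$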
where d(n) is the number of positive divisors of n, ω(n) is the number of distinct prime
factors of n, and ϕ(n) is the totient function.
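On the level of coefficients, multiplying Dirichlet series is the convolution over divisors from Proposition 65, so the identities above can be tested on truncated coefficient lists. The following is a sketch; the names `dirichlet_product` and `mobius` are illustrative, and lists are indexed from 1 with index 0 unused.

```python
from math import gcd


def dirichlet_product(a, b, limit):
    """Coefficients c[n] = sum over d | n of a[d] * b[n // d], for 1 <= n <= limit."""
    c = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            c[d * m] += a[d] * b[m]
    return c


def mobius(n):
    """Moebius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # divisible by p^2
            result = -result
        p += 1
    return -result if n > 1 else result


if __name__ == "__main__":
    limit = 12
    one = [0] + [1] * limit                             # coefficients of zeta(s)
    ident = list(range(limit + 1))                      # coefficients of zeta(s - 1)
    mu = [0] + [mobius(n) for n in range(1, limit + 1)]  # coefficients of 1/zeta(s)
    print(dirichlet_product(one, one, limit)[1:])       # divisor counts d(n)
    print(dirichlet_product(ident, mu, limit)[1:])      # totient values
    print(dirichlet_product(one, mu, limit)[1:])        # 1, 0, 0, ...
```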
So far, we have only considered generating series formally, but things become really
interesting once we consider the domain of convergence of these series; of course, for
this one needs to rely on analysis, and so to leave the realm of elementary number
theory.

Proposition 67. If s > 1, the zeta function ζ(s) is convergent and admits the Euler
product $\zeta(s) = \prod_{p} (1 - p^{-s})^{-1}$.
Proof. Let s > 1. Then
$$\int_{1}^{\infty} \frac{dx}{x^s} = \left[\frac{x^{1-s}}{1 - s}\right]_{1}^{\infty} < +\infty$$
and the first statement follows from the integral test for convergence. For the second
statement, we use the sieving argument of Theorem 33, which shows that for a large
prime P,
$$\prod_{p < P} (1 - p^{-s})\,\zeta(s) = 1 + P^{-s} + \dots.$$
Then
$$\Big|\prod_{p < P} (1 - p^{-s})\,\zeta(s) - 1\Big| = P^{-s} + \dots \le \sum_{n=P}^{\infty} \frac{1}{n^s} \to 0$$
as P → ∞. □
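The convergence of the truncated Euler product to ζ(s) can be observed numerically. The sketch below compares the product over primes p < P with a long partial sum of the series at s = 2; the function names and the cutoffs are illustrative choices, and both quantities are only approximations of ζ(2).

```python
def primes_below(limit):
    """List the primes p < limit by trial division."""
    return [n for n in range(2, limit)
            if all(n % d for d in range(2, int(n**0.5) + 1))]


def truncated_euler_product(s, limit):
    """Product of (1 - p^(-s))^(-1) over primes p < limit."""
    product = 1.0
    for p in primes_below(limit):
        product /= 1.0 - p ** (-s)
    return product


def zeta_partial_sum(s, terms):
    """Partial sum of the series for zeta(s) with the given number of terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))


if __name__ == "__main__":
    for limit in (10, 100, 1000):
        print(limit, truncated_euler_product(2, limit))
    print("partial sum:", zeta_partial_sum(2, 100000))
```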
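Remark 68. When s = 1, the zeta-function is the harmonic series, which is the model-case example of a divergent series. A nice argument to see this, due to Oresme (ca. 1350),
is by regrouping the terms as follows
$$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \frac{1}{9} + \dots$$
$$> 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \frac{1}{16} + \dots$$
$$= 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \dots = \infty.$$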
Corollary 69. There exist infinitely many prime numbers.
Proof. Suppose for contradiction that there are only finitely many primes. Then
the finite product $\prod_{p} (1 - p^{-s})^{-1}$ converges for any s > 0. In particular, this is also true as
s → 1, which would imply the convergence of the harmonic series, a contradiction. □
More interestingly, Euler derived the following stronger version of the infinitude of
primes using the convergence of the zeta function.
Theorem 34 (Euler, 1737).
$$\sum_{p} \frac{1}{p} = \infty.$$

Proof. Let s > 1. Taking the logarithm of the Euler product, we have
$$\log \zeta(s) = -\sum_{p} \log(1 - p^{-s}).$$
When |x| < 1, log(1 − x) has Maclaurin expansion $\log(1 - x) = -\sum_{m=1}^{\infty} \frac{x^m}{m}$ so that
$$\log \zeta(s) = \sum_{p} \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} = \sum_{p} \frac{1}{p^s} + \sum_{p} \sum_{m=2}^{\infty} \frac{1}{m p^{ms}}.$$
We can show that this last series is uniformly bounded for any choice of s > 1:
$$\sum_{p} \sum_{m \ge 2} \frac{1}{m p^{ms}} \le \frac{1}{2} \sum_{p} \sum_{m \ge 2} \frac{1}{p^{ms}} = \frac{1}{2} \sum_{p} \frac{1}{p^s (p^s - 1)} < \frac{1}{2} \sum_{n \ge 2} \frac{1}{n(n - 1)} = \frac{1}{2}.$$
Hence
$$\Big|\sum_{p} \frac{1}{p^s} - \log \zeta(s)\Big| < \frac{1}{2},$$
and the statement now follows by taking s → 1. □
This clearly again implies that there are infinitely many primes (if there were only
finitely many, the sum would be finite) but by doing so quantitatively, it provides some more information on the density of primes; for example, we now see that primes are denser than
numbers of the form $n^2$, since ζ(2) converges. This result is often cited as the ‘birth of
analytic number theory.’

12.3. The zeta function for Gaussian integers


We have explored earlier the Gaussian integers as another number system with
unique factorization; hence it makes sense to consider its associated (formal) Dirichlet
series
$$\zeta(s, \mathbb{Z}[i]) := \frac{1}{4} \sum_{\substack{\alpha \in \mathbb{Z}[i] \\ \alpha \neq 0}} \frac{1}{N(\alpha)^s}.$$
The constant factor $\frac{1}{4}$ warrants some explanation: recall that N(α) = N(β) whenever
β = (unit) · α and that there are four units, namely ±1, ±i. For comparison, the units
in Z are ±1 and
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \frac{1}{2} \sum_{\substack{n \in \mathbb{Z} \\ n \neq 0}} \frac{1}{|n|^s}.$$
While one might be brought to study ζ(s, Z[i]) for its own sake, it also appears naturally
as the Dirichlet series (generating function) for the sum-of-squares function
$$r_2(n) := \#\{(a, b) \in \mathbb{Z}^2 \mid a^2 + b^2 = n\} = \#\{\alpha \in \mathbb{Z}[i] \mid N(\alpha) = n\}.$$
Until now, we have been interested in numbers n that can be represented as $n = a^2 + b^2$
with a, b ∈ N; we introduce the modified sum-of-squares function
$$\tilde{r}_2(n) := \#\{(a, b) \in \mathbb{Z}^2 \mid a > 0,\ b \ge 0,\ a^2 + b^2 = n\} = \frac{1}{4}\, r_2(n),$$
which counts each representation once up to multiplication by a unit. Then
$$\zeta(s, \mathbb{Z}[i]) = \frac{1}{4} \sum_{n=1}^{\infty} \frac{r_2(n)}{n^s} = \sum_{n=1}^{\infty} \frac{\tilde{r}_2(n)}{n^s}.$$

Our goal is thus to obtain information on $\tilde{r}_2(n)$ through the study of its generating
function ζ(s, Z[i]). Inspired by the theory of the (Riemann) zeta function, our first
step will be to compute the Euler product of ζ(s, Z[i]). For this, we first recall, for the
reader’s convenience, the characterization of Gaussian primes established in Chapter
11.1. We saw that if π is a Gaussian prime then its norm has one of the following
forms:
(1) N(π) = 2;
(2) $N(\pi) = p^2$ with p ≡ 3 (mod 4) prime;
(3) N(π) = p with p ≡ 1 (mod 4) prime;
and these three cases correspond to
(1) π = 1 + i (up to multiplication by a unit);
(2) π = p with p ≡ 3 (mod 4) prime (up to multiplication by a unit);
(3) p ≡ 1 (mod 4) prime factors uniquely (up to multiplication by a unit) as
$p = \pi\bar{\pi}$, with $\pi, \bar{\pi}$ distinct (i.e., not associates) primes.
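Hence the Euler product is given by
$$\zeta(s, \mathbb{Z}[i]) = \frac{1}{4} \prod_{\pi} (1 - N(\pi)^{-s})^{-1}$$
$$= (1 - 2^{-s})^{-1} \prod_{p \equiv 3 \ (\mathrm{mod}\ 4)} (1 - p^{-2s})^{-1} \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} (1 - p^{-s})^{-2}$$
$$= \prod_{p} (1 - p^{-s})^{-1} \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} (1 - p^{-s})^{-1} \prod_{p \equiv 3 \ (\mathrm{mod}\ 4)} (1 + p^{-s})^{-1}$$
$$= \prod_{p} (1 - p^{-s})^{-1} \prod_{p \text{ odd}} \Big(1 - \Big(\frac{-1}{p}\Big) p^{-s}\Big)^{-1}.$$
Observe that the constant factor $\frac{1}{4}$ disappears since each Gaussian prime appears 4
times (via multiplication by units). When s > 1, we recognize the first product to be
ζ(s); can one also write the second product as a zeta function? For this we introduce
the following extension of the Legendre symbol $\left(\frac{-1}{p}\right)$:
$$\chi(n) := \begin{cases} 1 & \text{if } n \equiv 1 \pmod 4; \\ -1 & \text{if } n \equiv 3 \pmod 4; \\ 0 & \text{if } 2 \mid n. \end{cases}$$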
Proposition 70. The Dirichlet character χ(n) is completely multiplicative and the
Dirichlet L-function
$$L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$
converges absolutely for s > 1 and admits the Euler product $L(s, \chi) = \prod_{p} (1 - \chi(p) p^{-s})^{-1}$.
Proof. Recall that a function f is completely multiplicative if f(mn) = f(m)f(n)
for all m, n ∈ N. If m, n are odd, the first statement follows from the construction of
the Jacobi symbol. Otherwise, one of m, n is even if and only if mn is even, and hence
χ(m)χ(n) = 0 = χ(mn). To prove the absolute convergence of L(s, χ), observe that
when s > 1 we have
$$|L(s, \chi)| \le \sum_{n=1}^{\infty} \frac{|\chi(n)|}{n^s} < \zeta(s) < \infty.$$
Finally the development of the Euler product follows the same proof as for ζ(s), using
that χ(n) is completely multiplicative. □

We can thus summarize our discussion with the following theorem.


Theorem 35. The zeta function for Gaussian integers is given by
ζ(s, Z[i]) = ζ(s)L(s, χ) (12.1)
and converges absolutely whenever s > 1.
Remark 71. From our derivation, we understand this decomposition to be an analytic
encoding of the characterization of Gaussian primes carried out in Chapter 11.1.
We now deduce a sequence of striking applications from the identity (12.1). First,
the multiplicative law for Dirichlet series implies that
$$\sum_{n=1}^{\infty} \frac{\tilde{r}_2(n)}{n^s} = \sum_{n=1}^{\infty} \Big(\sum_{d \mid n} \chi(d)\Big) n^{-s},$$
which implies that
$$\tilde{r}_2(n) = \sum_{d \mid n} \chi(d)$$
for each n ≥ 1. It is now easy to check that $\tilde{r}_2(n)$ is multiplicative; indeed if (m, n) = 1
then
$$\sum_{d \mid m} \chi(d) \sum_{d \mid n} \chi(d) = \sum_{\substack{d_1 \mid m \\ d_2 \mid n}} \chi(d_1 d_2) = \sum_{d \mid mn} \chi(d).$$

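Hence it suffices to determine the values of $\tilde{r}_2(n)$ on powers of primes, and this is easily
computed to be
$$\tilde{r}_2(p^k) = \sum_{j=0}^{k} \chi(p)^j = \begin{cases} 1 & \text{if } p = 2; \\ k + 1 & \text{if } p \equiv 1 \pmod 4; \\ 0 & \text{if } p \equiv 3 \pmod 4 \text{ and } k \text{ odd}; \\ 1 & \text{if } p \equiv 3 \pmod 4 \text{ and } k \text{ even}. \end{cases} \qquad (12.2)$$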

In particular, we obtain the following general form of Fermat’s theorem on sums of two
squares “for free.”
Theorem 36 (Fermat). A natural number n can be expressed as a sum of two
squares (i.e., $n = a^2 + b^2$ for a, b ∈ N) if and only if every prime p ≡ 3 (mod 4)
appearing in the prime factorization of n appears to an even power.
Proof. Let $n = p_1^{k_1} \cdots p_r^{k_r}$. We have $\tilde{r}_2(n) = 0$ if and only if $\tilde{r}_2(p_i^{k_i}) = 0$ for some
1 ≤ i ≤ r and this holds if and only if $p_i \equiv 3$ (mod 4) and $k_i$ is odd by (12.2). □
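The divisor-sum formula for the modified sum-of-squares function can be compared against a brute-force count of lattice points. The sketch below is one way to do this, assuming r2 counts all integer pairs (a, b), signs and order included; the function names `chi`, `r2` and `r2_tilde` are illustrative.

```python
from math import isqrt


def chi(n):
    """The character chi: 1 if n = 1 mod 4, -1 if n = 3 mod 4, 0 if n is even."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def r2(n):
    """Brute-force count of pairs (a, b) of integers with a^2 + b^2 = n."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            count += 1 if b == 0 else 2  # b and -b
    return count


def r2_tilde(n):
    """Divisor-sum formula: sum of chi(d) over the divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 9, 25, 45, 49):
        print(n, r2(n), r2_tilde(n))
```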

12.4. An irrationality proof of the infinitude of primes


We end this lecture with a surprising by-product of our previous discussion of the
factorization ζ(s, Z[i]) = ζ(s)L(s, χ), s > 1, established in the previous section. What
happens at s = 1? We know that the ζ-function would blow up (ζ(1) is the harmonic
series), but it could be that L(1, χ) = 0. For example, in the homework you are asked
to show that
$$\eta(s) := \sum_{n=1}^{\infty} \frac{(-1)^n}{n^s} = (2^{1-s} - 1)\,\zeta(s)$$
converges for s > 0; at s = 1 we have η(1) = 0 · ζ(1). Here, L(1, χ) ≠ 0 and in fact the
value of L(1, χ) is computed by Leibniz’s beautiful sum-formula for π.
Theorem 37 (Leibniz).
$$\frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n + 1}.$$
Proof. We start from the simple observation that cos(π/4) = sin(π/4); hence
tan(π/4) = 1 and taking the inverse, π/4 = arctan(1). By the fundamental theorem
of calculus, we may thus write
$$\frac{\pi}{4} = \arctan(1) = \int_{0}^{1} \frac{dx}{1 + x^2}.$$
Whenever x ∈ [0, 1) we have the geometric series expansion
$$\frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}.$$
In particular, if we could expand the integrand in this way, we would find
$$\sum_{n=0}^{\infty} \int_{0}^{1} (-1)^n x^{2n}\, dx = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n + 1} = L(1, \chi).$$
However, the geometric series expansion is not valid when x = 1, so we consider instead
its finite form
$$\sum_{n=0}^{N} (-1)^n x^{2n} = \frac{1 - (-1)^{N+1} x^{2N+2}}{1 + x^2} = \frac{1}{1 + x^2} + \frac{(-1)^N x^{2N+2}}{1 + x^2}.$$
Then
$$\Big|\int_{0}^{1} \frac{dx}{1 + x^2} - \sum_{n=0}^{N} \frac{(-1)^n}{2n + 1}\Big| \le \int_{0}^{1} \frac{x^{2N+2}}{1 + x^2}\, dx \le \int_{0}^{1} x^{2N+2}\, dx = \frac{1}{2N + 3} \to 0$$
as N → ∞ and the statement follows. □
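The error bound 1/(2N + 3) obtained in the proof can be observed directly on the partial sums. A minimal sketch in floating point, with the illustrative function name `leibniz_partial_sum`:

```python
from math import pi


def leibniz_partial_sum(N):
    """Partial sum of (-1)^n / (2n + 1) for n = 0, ..., N."""
    return sum((-1) ** n / (2 * n + 1) for n in range(N + 1))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        error = abs(pi / 4 - leibniz_partial_sum(N))
        print(N, error, "bound:", 1 / (2 * N + 3))
```

The bound decays only like 1/N, which reflects the slow convergence of the series.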
A very nice application of this formula to number theory is the following ‘irrationality
proof’ of the infinitude of prime numbers.
Corollary 72. There are infinitely many primes.
Proof. Leibniz’s formula can be restated using the Euler product as
$$\frac{\pi}{4} = \prod_{p \equiv 1 \ (4)} (1 - p^{-1})^{-1} \prod_{p \equiv 3 \ (4)} (1 + p^{-1})^{-1}.$$
If there were only finitely many primes, the right-hand side would be a finite product of rational numbers, hence rational. Since π is irrational, one of the products on the right must be infinite. □


12.5. Homework 12
(1) Let $a_0 = 3$ and set $a_n = 2a_{n-1}$ for n ≥ 1. Use the ordinary generating function
for $(a_n)_{n\ge 0}$ to find a closed expression for $a_n$.
(2) Show that $\zeta(s)^2 = \sum_{n=1}^{\infty} \frac{d(n)}{n^s}$.
(3) Prove the Abel summation theorem¹: Let f : R → R be a continuously
differentiable function, $(a_n)_{n\ge 1}$ a sequence, and define $A(x) = \sum_{n \le x} a_n$ for
every x > 1. Then
$$\sum_{n \le N} a_n f(n) = A(N) f(N) - \int_{1}^{N} A(x) f'(x)\, dx.$$
(4) Use Abel summation to prove that if A(x) is uniformly bounded, then the
Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ is convergent for s > 0.
(5) Let
$$\eta(s) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n^s}$$
be the alternating zeta function. Show that $\eta(s) = (2^{1-s} - 1)\zeta(s)$ and use
the previous exercise to show that η(s) is convergent for s > 0.

¹ Hint: Use that $a_n = A(n) - A(n - 1)$.
