Elementary Number Theory - MAT105
Claire Burrin
(Fall 2024, University of Zurich)
Contents
Chapter 1. Introduction
1.1. Basic principles
1.2. Pythagorean triples
1.3. Homework 1
Chapter 2. The Fundamental Theorem of Arithmetic
2.1. Prime numbers
2.2. Failure of unique factorization
2.3. Factoring
2.4. Homework 2
Chapter 3. Euclid’s Algorithm
3.1. Division with remainder
3.2. Complexity theory
3.3. Euclid’s lemma and Bézout’s identity
3.4. Homework 3
Chapter 4. Continued Fractions
4.1. Continued fraction expansions
4.2. Convergents
4.3. Constructing irrational numbers
4.4. Homework 4
Chapter 5. Diophantine Approximation
5.1. Dirichlet’s theorem (1842)
5.2. Hurwitz’s theorem (1891)
5.3. Liouville’s theorem (1844)
5.4. Homework 5
Chapter 6. Congruences (I)
6.1. The basic algebra of congruences
6.2. Some theorems to prime modulus
6.3. Homework 6
Chapter 7. Congruences (II)
7.1. Fermat–Euler theorem
7.2. Chinese remainder theorem
7.3. Euler’s totient function
7.4. Homework 7
Introduction
two numbers a, b are coprime if their greatest common divisor (gcd), denoted (a, b),
is 1.
Theorem 2. Up to switching the order of a and b, every Pythagorean triple has
the form
$$a = m(p^2 - q^2), \qquad b = 2mpq, \qquad c = m(p^2 + q^2),$$
where m ∈ N, and p, q are coprime numbers, one of which is even and the other one
odd. Conversely, any triple (a, b, c) of the above form is a Pythagorean triple.
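The parametrization in Theorem 2 can be tried out numerically. The following is a minimal sketch, not part of the original notes; the function names `pythagorean_triple` and `primitive_triples` are hypothetical, and we assume p > q ≥ 1 so that all three entries are positive.

```python
from math import gcd

def pythagorean_triple(m, p, q):
    """Return (a, b, c) = (m(p^2 - q^2), 2mpq, m(p^2 + q^2)) as in Theorem 2."""
    return (m * (p * p - q * q), 2 * m * p * q, m * (p * p + q * q))

def primitive_triples(limit):
    """List the triples with m = 1 for coprime p > q >= 1 of opposite parity, p < limit."""
    triples = []
    for p in range(2, limit):
        for q in range(1, p):
            # p - q odd means exactly one of p, q is even
            if gcd(p, q) == 1 and (p - q) % 2 == 1:
                triples.append(pythagorean_triple(1, p, q))
    return triples

print(primitive_triples(4))  # [(3, 4, 5), (5, 12, 13)]
```

Taking m = 2, p = 2, q = 1 gives the non-primitive triple (6, 8, 10).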
Proof. Observe that if (a, b, c) is a Pythagorean triple then so is (ma, mb, mc) for
all m ∈ N. We call a triple for which (a, b, c) = 1 a primitive Pythagorean triple. Next
we observe that finding primitive Pythagorean triples amounts to listing all points on
the first quadrant of the unit circle with rational coordinates:
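$$\left(\frac{a}{c}\right)^2 + \left(\frac{b}{c}\right)^2 = 1.$$
From the algebraic equation of the circle we have $Y^2 = 1 - X^2 = (1 - X)(1 + X)$.
Setting
$$t := \frac{Y}{X + 1}$$
(for $X \neq -1$), the equation becomes
$$t^2 = \frac{1 - X}{1 + X}$$
and can be solved in $X$:
$$t^2 = \frac{1 - X}{1 + X} \iff X = \frac{1 - t^2}{1 + t^2}, \qquad Y = t\left(1 + \frac{1 - t^2}{1 + t^2}\right) = \frac{2t}{1 + t^2}.$$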
(Geometrically, our computation parametrizes the intersection points $(X, Y)$ of the unit circle $X^2 + Y^2 = 1$ with the line $Y = t(X + 1)$.) It follows that $(X, Y) \in \mathbb{Q}^2$ if and only if $t \in \mathbb{Q}$. Let us write $t = q/p$ for $p$ and $q$ coprime. Then $X$ and $Y$ can now be expressed as
$$X = \frac{a}{c} = \frac{p^2 - q^2}{p^2 + q^2}, \qquad Y = \frac{b}{c} = \frac{2pq}{p^2 + q^2}.$$
We can easily check that $(p^2 - q^2, 2pq, p^2 + q^2)$ is a Pythagorean triple, but we still need to check that it is primitive.
Assume first that one of $p, q$ is odd and the other even. Then both $p^2 + q^2$, $p^2 - q^2$ are odd. Suppose that $d$ divides both $p^2 + q^2$, $p^2 - q^2$; in particular, $d$ is an odd number. Then $d$ also divides $(p^2 + q^2) + (p^2 - q^2) = 2p^2$ and $(p^2 + q^2) - (p^2 - q^2) = 2q^2$. Since $d$ is odd, it must divide both $p^2$ and $q^2$; any prime factor of $d$ would then divide both $p$ and $q$, which forces $d = 1$ since we assumed $p, q$ are coprime. This proves that $(p^2 - q^2, p^2 + q^2) = 1$. We leave it to the reader to conclude that $(2pq, p^2 + q^2) = 1$, and hence $(p^2 - q^2, 2pq, p^2 + q^2) = 1$.
Since $p, q$ are coprime, they cannot both be even and that leaves us with the case that $p$ and $q$ are both odd. Then $p + q$, $p - q$ are even and we write
$$p + q = 2P, \qquad p - q = 2Q.$$
1.3. HOMEWORK 1 9
3 To prove this ‘if and only if’ statement, you need to prove two things: that Fn even implies that
3 | n and that 3 | n implies that Fn is even.
4 To argue by induction, you need here to first verify that the statement is true for n = 0, 1, and
then use induction to show that it is true for all larger integers n.
CHAPTER 2
In working out the complete list of Pythagorean triples, we had to show that $(p^2 + q^2, p^2 - q^2) = (p^2 + q^2, 2pq) = 1$, for which we implicitly relied on the fundamental theorem of arithmetic, which says that every natural number can be written as a product of primes in a unique way (up to reordering).
2.1. Prime numbers
Each natural number n ≥ 2 has at least two divisors: 1 and itself. We refer to those
as the trivial divisors of n.
Definition 4. A natural number n ≥ 2 that only has trivial divisors is called a prime number.
Otherwise, we say that n is composite.
Prime numbers are the irreducible factors of multiplication, and the sequence of prime numbers is one of the most studied patterns of numbers. We will later see various proofs that there are infinitely many primes, and discuss some important open conjectures in prime number theory.
Regarding n = 1 we adopt the following conventions:
• 1 is not considered to be a prime number;
• 1 is seen as the 'empty product' ($1 = a^0$ for any $a$).
These conventions will allow us to formulate the fundamental theorem of arithmetic in
the clearest possible way. Before stating the theorem, we consider the following much
simpler fact.
Proposition 5. Every natural number $n$ factors as a product of primes, i.e.,
$$n = p_1 p_2 \cdots p_r,$$
where $p_1, \dots, p_r$ are primes.
Proof. We proceed by induction. Suppose that the statement holds for all natural
numbers m < n. If n is prime, we are done. If n is composite we write n = ab for two
natural numbers 1 < a, b < n. In particular, the induction hypothesis applies to a and
b, hence their product is also a product of primes.
The representation of a number as a product of divisors is not unique, e.g.,
$$12 = 2 \cdot 6 = 3 \cdot 4.$$
The fundamental theorem of arithmetic asserts that its representation as a product of prime divisors is unique (up to reordering):
$$12 = 2^2 \cdot 3 = 3 \cdot 2^2.$$
2.3. Factoring
We next discuss the practical problem of factoring a given (large) number n. The
most immediate approach is to proceed by trial division. An algorithm is a system-
atic, or step-by-step, process to solve a problem (transforming a given input into a
desired output).
Algorithm 7 (Trial division algorithm). Input: composite number $n$. Output: prime factorization of $n$.
Run through primes $2 \le p \le \sqrt{n}$ in increasing order until finding $p \mid n$.
Record $p$, and restart the process with $n/p$ in the place of $n$.
If no such prime $p$ is found, the current value of $n$ is itself prime; record it and stop.
You might wonder why it suffices to consider primes only up to $\sqrt{n}$ and not up to $n$. If $n$ is composite, $n = p_1 \cdots p_r$, then $r \ge 2$. If all prime factors were $> \sqrt{n}$, we would have $n > n^{r/2} \ge n$, which is impossible; hence any composite number $n$ has at least one prime factor $p \le \sqrt{n}$.
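As an illustration, here is a minimal Python sketch of trial division. It is a simplification of Algorithm 7 rather than a literal transcription: instead of running through a table of primes, it tries every integer d ≥ 2, which gives the same output because a composite d can never divide the remaining cofactor once its prime factors have been removed. The name `trial_division` is our own.

```python
def trial_division(n):
    """Return the prime factors of n >= 2 in increasing order, with multiplicity."""
    factors = []
    d = 2
    while d * d <= n:          # only candidates up to sqrt(n) are needed
        while n % d == 0:      # record d and continue with n/d in place of n
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # the remaining cofactor has no divisor <= its square root, so it is prime
        factors.append(n)
    return factors

print(trial_division(12))   # [2, 2, 3]
```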
When running this algorithm, the worst-case scenario is that $n = pq$ is a product of two large primes, both of size approximately $\sqrt{n}$; then the number of required trial divisions is the number of primes less than or equal to $\sqrt{n}$. How many primes are there up to a large number $x$? The prime number theorem (to be discussed in a few weeks) states that there are approximately
$$\frac{x}{\log x}$$
primes $p \le x$ (where $\log$ denotes the natural logarithm).
On a classical computer (your laptop or smartphone), a number n is recorded via
its binary (base 2) rather than decimal (base 10) expansion. If you’re unfamiliar with
this concept, consider the decimal and binary expansions of 17:
$$17 = 1 \cdot 10^1 + 7 \cdot 10^0 \quad \text{(base 10)}$$
$$17 = 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \quad \text{(base 2)}$$
On a classical computer, 17 is stored as the sequence of bits 10001.
If $n$ is a large number of $k$ bits, say $n \approx 2^k$, then the trial division algorithm requires up to
$$\frac{2^{k/2}}{\frac{k}{2} \log 2}$$
operations. In other words, it is exponential in the number of bits. For $k$ large enough, the computation might require more operations than there are atoms in the known universe (roughly $10^{80}$). And we haven't even accounted yet for the fact that we first have to check for each number $2 \le m \le \sqrt{n}$ whether it is prime (or rely on a large enough pre-stored table of primes).
There are of course more elaborate factoring algorithms than trial division, but even the best currently known algorithms for classical computers require super-polynomial time in the number of bits. That classical computers are not known to factor efficiently is the basis of public-key cryptography, used for secure browsing, VPNs, email encryption, etc. (We will briefly discuss public-key cryptography later on in the semester.)
Regarding primality testing, things are better; the AKS algorithm (2002) makes it possible to determine in polynomial time whether a number is prime. On the other hand, Shor's algorithm (1994) shows that on a quantum computer, factoring could also be performed in polynomial time.
2.4. Homework 2
(1) Let $n \in \mathbb{N}$. Show that if $2^n - 1$ is prime, then $n$ is prime.
(2) Use the fundamental theorem of arithmetic to show that for each prime $p$, $\sqrt{p}$ is irrational.
(3) Use the fundamental theorem of arithmetic to show that if $a \mid bc$ and $(a, b) = 1$ then $a \mid c$.
(4) Let $n \in \mathbb{N}$. Show that if the smallest prime factor $p$ of $n$ satisfies $p > n^{1/3}$, then $n/p$ is either prime or 1.
(5) The sieve of Eratosthenes is an ancient algorithm to produce tables of primes. It works as follows. List all numbers from 2 up to $N$, then successively delete for each prime $2 \le p \le \sqrt{N}$ its higher multiples $mp$ (see Table 1 below). Use the sieve of Eratosthenes to list all primes between 1 and 64.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
2 3 5 7 9 11 13 15 17 19 21
2 3 5 7 11 13 17 19
Table 1. Eratosthenes’ sieve for N = 21. In the second line we deleted
all higher multiples of 2, in the third line, all higher multiples of 3. The
only numbers remaining are the primes up to 21.
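The sieve described in exercise (5) can be sketched in a few lines of Python. This is an illustrative sketch only (the function name `sieve` is our own); it reproduces Table 1 for N = 21 and leaves the case N = 64 to the exercise.

```python
def sieve(N):
    """Return all primes up to N (N >= 2) using the sieve of Eratosthenes."""
    is_prime = [True] * (N + 1)
    is_prime[0:2] = [False, False]          # 0 and 1 are not prime
    p = 2
    while p * p <= N:                       # only primes p <= sqrt(N) are needed
        if is_prime[p]:
            for multiple in range(2 * p, N + 1, p):
                is_prime[multiple] = False  # delete the higher multiples of p
        p += 1
    return [k for k in range(2, N + 1) if is_prime[k]]

print(sieve(21))  # [2, 3, 5, 7, 11, 13, 17, 19]
```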
Euclid’s Algorithm
The process terminates once $r^{(k)} = 0$, which eventually happens since $r^{(k)} < r^{(k-1)}$. We then see that
$$(a, b) = (b, r) = (r, r') = (r', r'') = \cdots = (r^{(k-1)}, r^{(k)}) = r^{(k-1)}.$$
In other words, the greatest common divisor of a and b is equal to the last nonzero
remainder when iterating division with remainder. This is Euclid’s algorithm.
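Iterated division with remainder translates directly into code. The sketch below (the name `euclid` is our own) returns the last nonzero remainder together with the number of divisions performed, which is the quantity studied in the complexity results of this chapter.

```python
def euclid(a, b):
    """Return (gcd(a, b), steps), where steps counts the divisions with remainder."""
    steps = 0
    while b != 0:
        a, b = b, a % b   # replace (a, b) by (b, r) where a = q*b + r
        steps += 1
    return a, steps

print(euclid(67, 24))  # (1, 5)
```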
Theorem 4 (Lamé 1844). Let $a > b$ be two natural numbers, and $b < F_n$ for some $n \ge 3$. Then running Euclid's algorithm for the pair $a, b$ requires at most $n - 2$ steps. Moreover this bound is sharp if we take $a = F_n$, $b = F_{n-1}$.
Proof. The second statement follows from the recurrence relation for the Fibonacci sequence: Euclid's algorithm
$$\begin{aligned}
F_n &= F_{n-1} + F_{n-2} \\
F_{n-1} &= F_{n-2} + F_{n-3} \\
&\;\;\vdots \\
F_2 &= F_1
\end{aligned}$$
terminates after $n - 2$ steps. More generally, say that Euclid's algorithm runs for $k$ steps;
$$\begin{aligned}
a &= qb + r \\
b &= q' r + r' \\
&\;\;\vdots \\
r^{(k-3)} &= q^{(k-1)} r^{(k-2)} + r^{(k-1)} \\
r^{(k-2)} &= q^{(k)} r^{(k-1)}
\end{aligned}$$
On the last line, we recall that $r^{(k-1)} \ge 1$ and $q^{(k)} \ge 2$. Hence $r^{(k-2)} \ge 2$. Going backwards we find that $r^{(k-3)} \ge r^{(k-2)} + r^{(k-1)} \ge F_2 + F_1 = F_3$ until $b \ge r + r' \ge F_k + F_{k-1} = F_{k+1}$. Since $b < F_n$, the maximal choice of $k$ is given by $k + 1 = n - 1$, i.e., $k = n - 2$.
Theorem 5 (Lamé 1844). Given input a > b, Euclid’s algorithm finishes in at
most 5k steps, where k is the (decimal) length of b.
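Proof. Consider the worst-case analysis provided by the previous theorem; $a = F_{n+2}$, $b = F_{n+1}$ requiring $n$ steps for Euclid's algorithm. By induction we can show that $F_{n+1} \ge \varphi^{n-1}$, where $\varphi$ is the golden mean. Then, with $\log$ denoting here the base-10 logarithm (so that $1/\log \varphi \approx 4.785 < 5$ and $\log b < k$),
$$n - 1 \le \frac{\log b}{\log \varphi} < 5 \log b < 5k$$
and hence $n \le 5k$.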
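Both theorems of Lamé can be checked empirically on small inputs. The following sketch is a sanity check, not a proof; the helper names `euclid_steps` and `fib` are our own, and we assume the convention $F_1 = F_2 = 1$.

```python
def euclid_steps(a, b):
    """Number of divisions with remainder performed by Euclid's algorithm on (a, b)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

def fib(n):
    """Return F_n with F_1 = F_2 = 1."""
    x, y = 1, 1
    for _ in range(n - 1):
        x, y = y, x + y
    return x

# Consecutive Fibonacci numbers attain the bound n - 2 of Theorem 4.
print([euclid_steps(fib(n), fib(n - 1)) for n in range(3, 9)])  # [1, 2, 3, 4, 5, 6]

# Theorem 5: at most 5k steps, where k is the number of decimal digits of b.
print(all(euclid_steps(a, b) <= 5 * len(str(b))
          for b in range(1, 100) for a in range(b + 1, 200)))  # True
```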
Continued Fractions
The terms 2, 1, 3, 1, 4 are called the partial quotients of the continued fraction expansion. (By contrast, we call $\frac{67}{24}, \frac{24}{19}, \frac{19}{5}, \frac{5}{4}$ the complete quotients.) The partial quotients are precisely the quotients appearing in Euclid's algorithm;
67 = 2 · 24 + 19
24 = 1 · 19 + 5
19 = 3 · 5 + 4
5=1·4+1
4 = 4 · 1.
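Equivalently, the partial quotients are the integer parts¹ of the corresponding complete quotients;
$$\begin{aligned}
\frac{67}{24} &= 2 + \frac{19}{24} \\
\frac{24}{19} &= 1 + \frac{5}{19} \\
\frac{19}{5} &= 3 + \frac{4}{5} \\
\frac{5}{4} &= 1 + \frac{1}{4} \\
\frac{4}{1} &= 4.
\end{aligned}$$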
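Since the partial quotients are exactly the quotients produced by Euclid's algorithm, they can be computed by the same loop. A minimal sketch (the function name `continued_fraction` is our own):

```python
def continued_fraction(a, b):
    """Return the partial quotients [a_0, ..., a_n] of the rational number a/b."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)   # a = q*b + r, so q is the integer part of a/b
        quotients.append(q)
        a, b = b, r           # continue with the next complete quotient b/r
    return quotients

print(continued_fraction(67, 24))  # [2, 1, 3, 1, 4]
```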
We set the notation
$$[a_0, a_1, \dots, a_n] := a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}$$
to express any number written in this 'staircase form' (with no assumption on the 'digits' $a_i$). Note that the representation of numbers of this shape is not unique; we have
$$[a_0, \dots, a_n] = [a_0, \dots, a_n - 1, 1].$$
¹ Given a real number $x \in \mathbb{R}$, its integer part $\lfloor x \rfloor$ is the largest integer such that $\lfloor x \rfloor \le x$, and its fractional part $\{x\}$ is $\{x\} = x - \lfloor x \rfloor \in [0, 1)$.
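The previous numerical example shows that given $a > b$, Euclid's algorithm yields a sequence $(a_i)_{0 \le i \le n}$ such that
$$\frac{a}{b} = [a_0, \dots, a_n],$$
where
• $a_0, \dots, a_{n-1} \ge 1$;
• $a_n \ge 2$, since the last nonzero remainder is strictly smaller than the term it divides in the final step.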
Proposition 13. Every $\frac{a}{b} \in \mathbb{Q}_{>1}$ has a unique finite continued fraction expansion whose last partial quotient is $> 1$.
Proof. Suppose that $\frac{a}{b}$ admits two finite continued fraction expansions:
$$[a_0, a_1, \dots, a_m] = [b_0, b_1, \dots, b_n].$$
Taking the integer part of $\frac{a}{b}$, we see that $\lfloor \frac{a}{b} \rfloor = a_0 = b_0$. Hence canceling $a_0$ out on both sides, we are left with
$$[a_1, \dots, a_m] = [b_1, \dots, b_n].$$
Repeating this argument, we see that $m = n$, and $a_i = b_i$ for $i = 0, \dots, n$.
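We can adapt the process to express an irrational number $x > 1$ as a continued fraction. Write again $x = \lfloor x \rfloor + \{x\}$ and set
$$a_0 := \lfloor x \rfloor, \qquad x_1 := \frac{1}{\{x\}}.$$
Then
$$x = a_0 + \frac{1}{x_1}$$
where $x_1$ is irrational and $x_1 > 1$. Iterating, we obtain sequences
$$x_n := \frac{1}{\{x_{n-1}\}}, \qquad a_n := \lfloor x_n \rfloor \qquad (4.1)$$
for $n \ge 1$ with $x_0 := x$, so that
$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{x_n}}}}.$$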
Since each $x_n > 1$, each partial quotient satisfies $a_n \ge 1$. Since each $x_n$ is irrational, the process never terminates; we claim that continuing this process ad infinitum does express the value $x$, i.e.,
$$x = \lim_{n \to \infty} [a_0, \dots, a_{n-1}, x_n].$$
To make this claim precise we need the notion of convergents.
Definition 14. The $n$th convergent is $[a_0, \dots, a_n]$.
4.2. Convergents
Looking at the formal shape of the first few convergents, i.e.,
$$\begin{aligned}
[a_0] &= a_0 \\
[a_0, a_1] &= a_0 + \frac{1}{a_1} = \frac{a_0 a_1 + 1}{a_1} \\
[a_0, a_1, a_2] &= a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}} = a_0 + \frac{1}{[a_1, a_2]} = a_0 + \frac{a_2}{a_1 a_2 + 1} = \frac{a_0 a_1 a_2 + a_0 + a_2}{a_1 a_2 + 1} \\
[a_0, a_1, a_2, a_3] &= a_0 + \frac{a_2 a_3 + 1}{a_1 a_2 a_3 + a_1 + a_3} = \frac{a_0 a_1 a_2 a_3 + a_0 a_1 + a_0 a_3 + a_2 a_3 + 1}{a_1 a_2 a_3 + a_1 + a_3},
\end{aligned}$$
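we observe a discernible, recursive pattern for both the numerators and denominators (assuming the above fractions are already in reduced form). Given a sequence $(a_n)_{n \ge 0}$, set for each $n \ge 1$
$$\begin{aligned}
p_n &:= a_n p_{n-1} + p_{n-2}, & p_0 &:= a_0, & p_{-1} &:= 1, \\
q_n &:= a_n q_{n-1} + q_{n-2}, & q_0 &:= 1, & q_{-1} &:= 0.
\end{aligned}$$
It is often convenient to express linear recurrence relations in matrix form; here we have
$$\begin{pmatrix} p_n & q_n \\ p_{n-1} & q_{n-1} \end{pmatrix} = \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} p_{n-1} & q_{n-1} \\ p_{n-2} & q_{n-2} \end{pmatrix}$$
for $n \ge 1$. Then
$$\begin{pmatrix} p_n & q_n \\ p_{n-1} & q_{n-1} \end{pmatrix} = \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Taking the determinant on both sides we find that
$$p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}. \qquad (4.2)$$
In particular $(p_n, q_n) \mid (-1)^{n+1}$ and hence $(p_n, q_n) = 1$. These observations made, we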
show that
Proposition 15. Let $(a_n)_{n \ge 0}$ be a sequence. We have
$$[a_0, \dots, a_n] = \frac{p_n}{q_n}$$
with $p_n, q_n$ defined as above for all $n \ge 0$.
Proof. The proof is by induction. Clearly, $[a_0] = \frac{p_0}{q_0}$. Let $n \ge 1$ and assume that $\frac{p_i}{q_i} = [a_0, \dots, a_i]$ for all $i \le n$. We now use a very useful observation: the definition of $p_n, q_n$ does not depend on the $a_i$'s being integers. Hence under the induction hypothesis
$$\begin{aligned}
[a_0, \dots, a_{n+1}] &= \left[a_0, \dots, a_n + \frac{1}{a_{n+1}}\right] = \frac{\left(a_n + \frac{1}{a_{n+1}}\right) p_{n-1} + p_{n-2}}{\left(a_n + \frac{1}{a_{n+1}}\right) q_{n-1} + q_{n-2}} \\
&= \frac{(a_n p_{n-1} + p_{n-2}) + p_{n-1}/a_{n+1}}{(a_n q_{n-1} + q_{n-2}) + q_{n-1}/a_{n+1}} = \frac{a_{n+1} p_n + p_{n-1}}{a_{n+1} q_n + q_{n-1}} = \frac{p_{n+1}}{q_{n+1}}.
\end{aligned}$$
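The recurrence for $p_n, q_n$ is easy to run by hand or by machine. The sketch below (the function name `convergents` is our own) computes the pairs $(p_n, q_n)$ from a list of partial quotients and checks the determinant identity (4.2); as input we use the first partial quotients 3, 7, 15, 1 of π that are quoted in the chapter on Diophantine approximation.

```python
def convergents(a):
    """Return [(p_0, q_0), ..., (p_n, q_n)] for the partial quotients a = [a_0, ..., a_n]."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = a[0], 1                 # p_0, q_0
    result = [(p, q)]
    for a_n in a[1:]:
        p, p_prev = a_n * p + p_prev, p   # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a_n * q + q_prev, q   # q_n = a_n q_{n-1} + q_{n-2}
        result.append((p, q))
    return result

c = convergents([3, 7, 15, 1])
print(c)  # [(3, 1), (22, 7), (333, 106), (355, 113)]

# Identity (4.2): p_n q_{n-1} - p_{n-1} q_n = (-1)^(n+1) for n >= 1
print(all(c[n][0] * c[n - 1][1] - c[n - 1][0] * c[n][1] == (-1) ** (n + 1)
          for n in range(1, len(c))))  # True
```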
Proposition 16. Let $x \in \mathbb{R} \setminus \mathbb{Q}$. Let $(a_n)_{n \ge 0}$ be the sequence defined by (4.1) and $(p_n)$, $(q_n)$ the corresponding sequences given above. Then
$$x = \lim_{n \to \infty} \frac{p_n}{q_n}.$$
4.4. Homework 4
(1) Compute the continued fraction expansions of $\frac{17}{11}$ and $\frac{11}{31}$.
(2) Compute the continued fraction expansion of $\sqrt{2}$. Conclude that $\sqrt{2}$ is irrational.
(3) (Explicit Bézout) Use (4.2) to find values $x, y \in \mathbb{N}$ such that $67x - 24y = 1$.
(4) What is the value of $[\overline{1, 2}] = [1, 2, 1, 2, 1, 2, \dots]$?
(5) Show that the convergents of the golden mean $\varphi$ are given by
$$\frac{p_n}{q_n} = \frac{F_{n+2}}{F_{n+1}}$$
for every $n \ge 1$.
Remark 18. The golden mean $\varphi$ is $\varphi \approx 1.618$, which is relatively close to the conversion rate for miles to kilometers given by 1 mile ≈ 1.609 km. Therefore, using that
$$\frac{\text{mile}}{\text{km}} \approx \varphi \approx \frac{F_{n+1}}{F_n}$$
for $n$ sufficiently large, we have the easy conversion rule
$$F_n \text{ miles} \approx F_{n+1} \text{ km}$$
for anyone knowing the Fibonacci sequence.
CHAPTER 5
Diophantine Approximation
In the previous chapter, we have seen that every irrational number $x$ admits a continued fraction expansion, whose convergents provide us with $x = \lim_{n \to \infty} \frac{p_n}{q_n}$. In particular, we recall that
• $(q_n)$ is a strictly increasing sequence;
• $q_{n+1} = a_{n+1} q_n + q_{n-1}$ and $x_{n+1} = a_{n+1} + \frac{1}{x_{n+2}}$;
• $\left|x - \frac{p_n}{q_n}\right| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n}$.
Combining these facts, we find that
$$\left|x - \frac{p_n}{q_n}\right| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n} < \frac{1}{q_{n+1} q_n} < \frac{1}{q_n^2}. \qquad (5.1)$$
In other words, the irrational $x$ admits an infinite sequence of rational approximants $p/q$ such that $|x - \frac{p}{q}| < \frac{1}{q^2}$. Finding good sequences of rational approximations for real numbers is the primary task of the theory of Diophantine approximation. In this chapter we will showcase three fundamental results of Diophantine approximation.
Proof 2 (via box principle). The box principle (also called pigeonhole principle/Schubfachprinzip) asserts that given $N$ boxes in which we place $N + 1$ balls, one of the boxes has to contain at least two balls.
We consider the $Q + 1$ fractional parts $0, \{x\}, \{2x\}, \dots, \{Qx\} \in [0, 1)$. (Note that these numbers are all distinct since $x$ is irrational; see the homework.) We divide the interval $[0, 1)$ into $Q$ subintervals of equal length $1/Q$ as follows: $[0, 1) = [0, \frac{1}{Q}) \cup [\frac{1}{Q}, \frac{2}{Q}) \cup \cdots \cup [\frac{Q-1}{Q}, 1)$. By the box principle, one of these intervals contains two distinct fractional parts $\{mx\}$, $\{nx\}$. Say that $n > m$ and set $q = n - m$, $p = \lfloor nx \rfloor - \lfloor mx \rfloor$. Then $|qx - p| = |\{nx\} - \{mx\}| < \frac{1}{Q}$.
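The box-principle argument is constructive and can be turned into a short search. The sketch below follows the proof step by step; the function name `dirichlet_approximation` is our own, and since it works with floating-point numbers it should be read as an illustration for moderate Q, not as exact arithmetic.

```python
from math import floor, sqrt

def dirichlet_approximation(x, Q):
    """Find (p, q) with 1 <= q <= Q and |q*x - p| < 1/Q via the box principle."""
    boxes = {}                                  # box index -> multiplier n already placed there
    for n in range(Q + 1):
        frac = n * x - floor(n * x)             # fractional part {nx}
        box = min(int(frac * Q), Q - 1)         # index of the subinterval [box/Q, (box+1)/Q)
        if box in boxes:                        # two fractional parts share a box
            m = boxes[box]
            return floor(n * x) - floor(m * x), n - m
        boxes[box] = n
    raise AssertionError("unreachable: Q + 1 values cannot fit in Q distinct boxes")

p, q = dirichlet_approximation(sqrt(2), 10)
print(p, q)  # 7 5
```

Here $|5\sqrt{2} - 7| \approx 0.071 < 1/10$, as the proof predicts.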
The advantage of Dirichlet’s ‘softer’ proof based on the box principle is that it can
easily be adapted to higher dimensions, i.e., Diophantine approximation in Rn , where
the theory of continued fractions is not so readily available. On the other hand, in the
one-dimensional setting, convergents actually provide the best possible approximants
to an irrational number; this can be formalized as follows.
Theorem 10 (Best approximation theorem). Let $x \in \mathbb{R} \setminus \mathbb{Q}$. If $1 \le q \le q_n$, then
$$\left|x - \frac{p_n}{q_n}\right| \le \left|x - \frac{p}{q}\right|$$
with equality only if $q = q_n$, $p = p_n$.
Proof. We will in fact prove a stronger statement, namely that for $1 \le q < q_{n+1}$ we have
$$|q_n x - p_n| \le |qx - p|,$$
with equality only if $q = q_n$, $p = p_n$. It follows that if $q < q_n$ then
$$\left|x - \frac{p_n}{q_n}\right| < \frac{q}{q_n}\left|x - \frac{p}{q}\right| < \left|x - \frac{p}{q}\right|.$$
Using the identity (4.2), i.e., $p_{n+1} q_n - p_n q_{n+1} = (-1)^n$, we write
$$\begin{aligned}
p &= p_n u - p_{n+1} v \\
q &= q_n u - q_{n+1} v,
\end{aligned}$$
choosing $u = (-1)^n (q p_{n+1} - p q_{n+1})$ and $v = (-1)^n (q p_n - p q_n)$. We may assume that $u, v \neq 0$. We now have
$$qx - p = u(q_n x - p_n) - v(q_{n+1} x - p_{n+1})$$
and we claim that both terms have opposite signs. Given this claim, we can conclude that
$$|qx - p| = |u(q_n x - p_n)| + |v(q_{n+1} x - p_{n+1})| > |q_n x - p_n|.$$
We now prove our claim. We saw earlier (see proof of Theorem 7) that $q_n x - p_n$ and $q_{n+1} x - p_{n+1}$ have opposite signs, so it remains to prove that $u$ and $v$ have the same sign. Observe that $q_n u = q_{n+1} v + q$ and $q < q_{n+1} \le |q_{n+1} v|$, hence $u$ and $v$ have the same sign.
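(i) $a_{n+1} < x_{n+1} = a_{n+1} + \frac{1}{x_{n+2}} < a_{n+1} + 1$;
(ii) $a_{n+1} q_n < q_{n+1} = a_{n+1} q_n + q_{n-1} < (a_{n+1} + 1) q_n$.
In other words, the order of $x_{n+1}$ is roughly that of $a_{n+1}$ and the order of $q_{n+1}$ is roughly that of $a_{n+1} q_n$. This suggests that
$$\left|x - \frac{p_n}{q_n}\right| \approx \frac{1}{q_{n+1} q_n} \approx \frac{1}{a_{n+1} q_n^2},$$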
i.e., the larger the size of the partial quotients of x the better the approximation. For
example, consider again the first few partial quotients of π,
π = [3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, . . . ].
Take note of the much larger partial quotient $a_4 = 292$; our observations suggest that the convergent
$$\frac{p_3}{q_3} = \frac{355}{113}$$
gives the best approximation of π by a rational number of four digits or less; this is in fact the floating-point approximation of π used by many computers.
Following this line of thought we might expect the worst approximable irrational number to be the golden mean $\varphi$ with periodic continued fraction expansion $[\overline{1}]$. We leave it to the reader to check that in this case $q_n = F_{n+1}$, $x_n = \varphi$. This leads to the identity
$$\left|\varphi - \frac{p_n}{q_n}\right| = \frac{1}{(x_{n+1} q_n + q_{n-1}) q_n} = \frac{1}{\left(\varphi + \frac{F_n}{F_{n+1}}\right) q_n^2}.$$
Using some facts established in previous homeworks, namely that $\frac{F_{n+1}}{F_n} \to \varphi$ as $n \to \infty$ and $\varphi + \frac{1}{\varphi} = 2\varphi - 1 = \sqrt{5}$, we find that for very large $n$
$$\left|\varphi - \frac{p_n}{q_n}\right| \approx \frac{1}{\sqrt{5}\, q_n^2}.$$
Hurwitz's theorem states that this is the worst possible bound for the approximation of an irrational by an infinite sequence of rational approximants.
Theorem 11 (Hurwitz). Every irrational number $x$ admits infinitely many rational approximants with
$$\left|x - \frac{p}{q}\right| < \frac{1}{\sqrt{5}\, q^2}$$
and this statement becomes false if $\frac{1}{\sqrt{5}}$ is replaced by any smaller constant.
Proof. We first prove the second statement. Fix $C < \frac{1}{\sqrt{5}}$. We will show that there are only finitely many rationals satisfying $|\varphi - \frac{p}{q}| < \frac{C}{q^2}$. Recall that $\varphi$ is the positive solution of the polynomial equation $X^2 - X - 1 = 0$. More precisely, we have
$$X^2 - X - 1 = (X - \varphi)(X - \bar{\varphi}),$$
where $\bar{\varphi} = 1 - \varphi$ is the other root. Plugging in $X = \frac{p}{q}$ and multiplying both sides by $q^2$ we find
$$|p^2 - pq - q^2| = q^2 \left|\varphi - \frac{p}{q}\right| \left|\bar{\varphi} - \frac{p}{q}\right| < C \left|\bar{\varphi} - \frac{p}{q}\right| < C\left(|\bar{\varphi} - \varphi| + \frac{C}{q^2}\right) = C\left(\sqrt{5} + \frac{C}{q^2}\right).$$
Since the left-hand side is a positive integer, we are left with the inequality
$$1 \le q^2 < \frac{C^2}{1 - C\sqrt{5}}$$
which leaves only finitely many possibilities for the choice of $q$.
The first statement admits various proofs. We will sketch here a geometric proof based on the Farey–Ford packing along the real line. The Farey sequence is $(\mathcal{F}_Q)_{Q \ge 1}$ where each $\mathcal{F}_Q$ is the ordered set
$$\mathcal{F}_Q = \left\{ \frac{p}{q} \;\middle|\; (p, q) = 1,\ 1 \le q \le Q \right\}.$$
To each Farey point $\frac{p}{q}$ we associate the Ford circle $C_{p/q}$ that has radius $\frac{1}{2q^2}$ and center $\left(\frac{p}{q}, \frac{1}{2q^2}\right)$. The following statements hold:
(i) The interiors of two Ford circles are disjoint, i.e., $\mathrm{Int}(C_{p/q}) \cap \mathrm{Int}(C_{p'/q'}) = \emptyset$;
(ii) Two Ford circles $C_{p/q}$, $C_{p'/q'}$ are tangent if and only if $\frac{p}{q}$, $\frac{p'}{q'}$ are consecutive elements in some Farey set $\mathcal{F}_Q$.
Running over the Farey sequence, attaching a Ford circle to each Farey point yields the Farey–Ford packing.
Given $x \in \mathbb{R} \setminus \mathbb{Q}$, the vertical line passing through $x$ intersects infinitely many of the triangular interstices of the packing. Each triangle is determined by its three vertices, which are the tangency points of the neighboring Ford circles $C_{p/q}$, $C_{p'/q'}$, $C_{(p+p')/(q+q')}$. For each triangle, we choose the point among $\frac{p}{q}$, $\frac{p'}{q'}$, $\frac{p+p'}{q+q'}$ that lies closest to $x$. Say this is $\frac{p}{q}$; an explicit computation of the tangency points then implies that $|x - \frac{p}{q}| < \frac{1}{\sqrt{5}\, q^2}$. Since there are infinitely many triangles, this completes the proof.
numbers is countable and the set of real numbers is uncountable. It follows that
most numbers are transcendental. This last statement can be made more precise with
measure theory, which is the topic of Analysis III.
Liouville’s theorem shows that there is a lower bound on how well approximable
an algebraic irrational can be. The result is essentially a corollary of the mean value
theorem, which is one of the fundamental results of Analysis I: If $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$ then there exists $t \in (a, b)$ such that
$$f'(t) = \frac{f(b) - f(a)}{b - a}.$$
Theorem 12 (Liouville). If $x \in \mathbb{R} \setminus \mathbb{Q}$ is algebraic of degree $d$, then there exists a constant $c > 0$ such that
$$\left|x - \frac{p}{q}\right| \ge \frac{c}{q^d}$$
for any rational approximant $p/q$.
Proof. Since $x$ is algebraic of degree $d$, there is a polynomial $f(X) = a_d X^d + \cdots + a_0$ with integer coefficients so that $f(x) = 0$. Choose $p/q$ to be a rational approximant of $x$ that is closer to $x$ than any other root of $f$; hence $f(p/q) \neq 0$. By the mean value theorem, there exists some $t$ between $x$ and $p/q$ such that
$$\left|x - \frac{p}{q}\right| = \frac{|f(x) - f(p/q)|}{|f'(t)|} = \frac{|f(p/q)|}{|f'(t)|} = \frac{|a_d p^d + a_{d-1} p^{d-1} q + \cdots + a_0 q^d|}{|f'(t)|\, q^d} \ge \frac{1}{|f'(t)|\, q^d}.$$
The statement of Liouville’s theorem motivates the following definition.
Definition 20. A real number $x$ is called a Liouville number if for all $n \in \mathbb{N}$, there exists a reduced fraction $\frac{p}{q}$ so that
$$\left|x - \frac{p}{q}\right| < \frac{1}{q^n}.$$
We leave it as an exercise to the reader to show that every Liouville number is transcendental. This observation led Liouville to the construction of the first known transcendental number, the Liouville constant
$$\sum_{n=1}^{\infty} \frac{1}{10^{n!}} = 0.11000100\ldots$$
We show that this number is indeed Liouville (and therefore transcendental). Fix $n \in \mathbb{N}$, and set
$$\frac{p}{q} := \sum_{m=1}^{n} \frac{1}{10^{m!}}.$$
Note that $q = 10^{n!}$. Then
$$\left|x - \frac{p}{q}\right| \le \sum_{m=n+1}^{\infty} \frac{1}{10^{m!}} \le \frac{1}{10^{(n+1)!}}\left(1 + \frac{1}{10^{n+2}} + \cdots\right) < \frac{2}{10^{(n+1)!}} = \frac{2}{q^{n+1}} < \frac{1}{q^n}.$$
Congruences (I)
For instance, using that $10 \equiv 1 \pmod 3$, and more generally $a \cdot 10^j \equiv a \pmod 3$ for any $a \in \mathbb{Z}$ and $j \in \mathbb{N}$, we have
$$n := a_k 10^k + \cdots + a_1 10 + a_0 \equiv a_k + \cdots + a_0 \pmod 3,$$
that is, $n$ is divisible by 3 if and only if the sum of its digits is divisible by 3. Observe that the same argument produces the same rule for 9 in place of 3.
A special case of (ii) above is that if $a \equiv b \pmod n$ then $ac \equiv bc \pmod n$ for all $c \in \mathbb{Z}$. The converse, i.e., the cancellation law for congruences, does not hold in general; for example, $6 \equiv 2 \pmod 4$ but $3 \not\equiv 1 \pmod 4$. The rule for cancellation is given by the following proposition.
Proposition 24 (Cancellation law for congruences). If $ac \equiv bc \pmod n$ then $a \equiv b \pmod{\frac{n}{(n, c)}}$. In particular, if $(n, c) = 1$ then $ac \equiv bc \pmod n$ implies $a \equiv b \pmod n$.
Proof. Set $d = (n, c)$ and write $n = kd$, $c = ld$. Then $ald \equiv bld \pmod{kd}$ implies that $al \equiv bl \pmod k$. Since $(k, l) = 1$, we have $a \equiv b \pmod k$.
Definition 25. We say that the relation ∼ is an equivalence relation on the set S
if for any a, b, c ∈ S we have that ∼ is
(i) reflexive: a ∼ a;
(ii) symmetric: a ∼ b if and only if b ∼ a;
(iii) and transitive: if a ∼ b and b ∼ c then a ∼ c.
If S is a set with an equivalence relation ∼, then S/ ∼ is the set of all elements of S
up to equivalence.
Example 26. Consider the real line R with the equivalence relation x ∼ y if and only
if x − y ∈ Z. The equivalence class of x is x + Z. In particular, x ∼ {x} for each
x ∈ R, so we are only concerned with the subset [0, 1) ⊂ R. Note that we can also view
[0, 1) as the closed interval [0, 1] with end-points identified, or as the unit circle. That is, we can identify $S/{\sim} = \mathbb{R}/\mathbb{Z}$ with the unit circle; this is explicitly realized by the bijection $\mathbb{R}/\mathbb{Z} \to S^1$, $x + \mathbb{Z} \mapsto e^{2\pi i x}$.
It is easy to check that being congruent (mod n) is an equivalence relation on Z:
we set a ∼ b if and only if a − b ∈ nZ, i.e., a ≡ b (mod n). For any integers a, b, c, we
have
(i) a ≡ a (mod n) ⇐⇒ n | 0;
(ii) a ≡ b (mod n) ⇐⇒ n | (a − b) ⇐⇒ n | (b − a) ⇐⇒ b ≡ a (mod n);
(iii) a ≡ b, b ≡ c (mod n) ⇐⇒ b = a + kn = c + ln for some integers k, l
=⇒ a ≡ c (mod n).
Given an integer a ∈ Z, we denote its congruence (equivalence) class (mod n) by
$$[a] := \{b \in \mathbb{Z} : b \equiv a \pmod n\} = \{a + nk : k \in \mathbb{Z}\} = a + n\mathbb{Z}.$$
Note that for each congruence class $[a]$ we have a unique representative $a \in \{0, 1, \dots, n - 1\}$, namely the least non-negative residue (mod $n$). We obtain a partition of the set of integers by the finitely many congruence classes
$$\mathbb{Z} = \bigcup_{a=0}^{n-1} [a].$$
Consequently, the set $\mathbb{Z}/n\mathbb{Z}$ of all congruence classes (mod $n$) is in bijection with the finite set $\{0, 1, \dots, n - 1\}$ via the map that sends $[a] = a + n\mathbb{Z}$ to the least non-negative residue $a \pmod n \in \{0, \dots, n - 1\}$. The set $\mathbb{Z}/n\mathbb{Z}$ has a rich algebraic structure: it is a (finite, commutative) ring. In fact, we will explain that it is even a (finite) field if and only if $n = p$ is prime.
By definition, $\mathbb{Z}/n\mathbb{Z}$ can be upgraded from being a ring to being a field if every $a \in \{1, \dots, n - 1\}$ has a reciprocal $a'$ satisfying $aa' \equiv 1 \pmod n$. This is not true in general. For example, the linear congruence equation $3x \equiv 1 \pmod 6$ has no solution (i.e., 3 has no reciprocal (mod 6)). To see this, it suffices to check that $3x \equiv 0$ or $3 \pmod 6$ by running $x$ through $0, \dots, 5$. We take note of the following more general proposition.
Proposition 27. The linear congruence equation $ax \equiv b \pmod n$ has a solution if and only if $(a, n) \mid b$. If a solution exists, it is unique modulo $\frac{n}{(a, n)}$; in particular, it is unique (mod $n$) when $(a, n) = 1$.
Proof. If $x$ is a solution to $ax \equiv b \pmod n$ then $(a, n)$ must divide $b$. The converse can be obtained from Bézout's identity; we will however give a more direct argument.
Suppose first that $(a, n) = 1$. Then we claim that the map $\varphi_a : \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$, $[x] \mapsto [ax]$ is a bijection. Indeed, it is injective since $ax \equiv ay \pmod n$ implies $x \equiv y \pmod n$ given that $(a, n) = 1$. Since $\varphi_a$ is an injective self-map of a finite set, it is bijective. This shows not only that a solution exists but also that this solution is unique.
More generally, write $a' = \frac{a}{(a, n)}$, $b' = \frac{b}{(a, n)}$, $n' = \frac{n}{(a, n)}$. Then as we have just seen $a' x \equiv b' \pmod{n'}$ has a unique solution since $(a', n') = 1$, and this is also a solution of $ax \equiv b \pmod n$.
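The proof suggests a direct procedure: divide through by $(a, n)$ and invert $a'$ modulo $n'$. Here is a minimal sketch under that reading; the function name `solve_linear_congruence` is our own, and the modular inverse is obtained with Python's built-in `pow(a, -1, m)` (available from Python 3.8) instead of an explicit Bézout computation.

```python
from math import gcd

def solve_linear_congruence(a, b, n):
    """Solve a*x ≡ b (mod n).

    Returns (x0, n1) meaning that the solutions are exactly x ≡ x0 (mod n1),
    with n1 = n / gcd(a, n), or None if there is no solution.
    """
    d = gcd(a, n)
    if b % d != 0:
        return None                           # (a, n) does not divide b
    a1, b1, n1 = a // d, b // d, n // d       # reduced congruence a'x ≡ b' (mod n')
    x0 = (b1 * pow(a1, -1, n1)) % n1          # a' is invertible mod n' since (a', n') = 1
    return x0, n1

print(solve_linear_congruence(3, 1, 6))  # None
print(solve_linear_congruence(3, 1, 7))  # (5, 7)
print(solve_linear_congruence(2, 2, 4))  # (1, 2)
```

The last call illustrates that when $(a, n) > 1$ the solution is only determined modulo $n/(a, n)$: both $x = 1$ and $x = 3$ solve $2x \equiv 2 \pmod 4$.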
Corollary 28. The map ϕa : Z/nZ → Z/nZ, ϕa (x) = ax (mod n) is a bijection if
and only if (a, n) = 1.
Proof. We have seen in the previous proof that if (a, n) = 1, the map ϕa is
a bijection. Conversely if ϕa is surjective, then the proposition above implies that
(a, n) | b for each b ∈ {0, . . . , n − 1} and in particular that (a, n) = 1.
In particular, Z/nZ is a field if and only if (a, n) = 1 for all 1 ≤ a ≤ n − 1. This
says that n has no non-trivial divisors, i.e., it is prime.
Conversely, assume for contradiction that p has a non-trivial divisor d|p with 1 <
d < p. Then (p − 1)! ≡ 0 (mod d) but also (p − 1)! ≡ −1 (mod d). By transitivity, we
are left with 0 ≡ −1 (mod d) which can only be true if d = 1.
Wilson’s theorem provides a direct primality test; it asserts that the primality of
a natural number depends on checking that a congruence equation holds. In practice
however computing the factorial of a large number rapidly exceeds standard computa-
tional power so that this primality test is mostly of theoretical interest.
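To make the remark concrete, here is a sketch of the primality test suggested by Wilson's theorem, which we assume in the form: $n \ge 2$ is prime if and only if $(n-1)! \equiv -1 \pmod n$. The function name `is_prime_wilson` is our own. Reducing modulo $n$ after every multiplication keeps the numbers small, but the loop still performs about $n$ multiplications, i.e., exponentially many in the number of bits of $n$.

```python
def is_prime_wilson(n):
    """Primality test via Wilson's theorem: n is prime iff (n-1)! ≡ -1 (mod n)."""
    if n < 2:
        return False
    factorial = 1
    for k in range(2, n):
        factorial = (factorial * k) % n   # (n-1)! reduced mod n step by step
    return factorial == n - 1             # n - 1 represents -1 (mod n)

print([n for n in range(2, 30) if is_prime_wilson(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```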
Theorem 14 (Fermat). If $p$ is a prime then $a^{p-1} \equiv 1 \pmod p$ for all $(a, p) = 1$.
Proof. We present a proof of Ivory (1806). Recall that the map $\varphi_a : \mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$ is a bijection for $(a, p) = 1$. Hence multiplying all non-zero elements gives
$$a(2a)(3a) \cdots ((p - 1)a) \equiv (p - 1)! \pmod p.$$
On the other hand, $a(2a)(3a) \cdots ((p - 1)a) = (p - 1)!\, a^{p-1}$. Hence $(p - 1)!\, a^{p-1} \equiv (p - 1)! \pmod p$ and since $(p, (p - 1)!) = 1$, we may cancel the factor $(p - 1)!$ on both sides.
Remark 29. For applications, it is sometimes useful to keep in mind that, for $a$ coprime to $p$, $a^{p-1} \equiv 1 \pmod p$ is equivalent to $a^p \equiv a \pmod p$.
In contrast to the computation of a factorial in Wilson’s theorem, computing powers
in modular arithmetic is very efficient, thanks to the process of repeated squaring.
E.g., mod 17, we have
$$5^{15} = 25^7 \cdot 5 \equiv 8^6 \cdot 40 \equiv 64^3 \cdot 6 \equiv 13^2 \cdot 13 \cdot 6 \equiv (-1) \cdot 10 \equiv 7 \pmod{17}.$$
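Repeated squaring can be written as a short loop over the binary digits of the exponent. The sketch below (the function name `power_mod` is our own) is one standard way to organize the computation; Python's built-in `pow(base, exponent, modulus)` performs the same task.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring (exponent >= 0)."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next binary digit
        exponent >>= 1
    return result

print(power_mod(5, 15, 17))  # 7
```

The loop runs once per bit of the exponent, so the number of multiplications grows linearly in the number of bits.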
However Fermat's theorem does not provide an absolute primality test because its converse is false: there are composite numbers $n$ for which $a^{n-1} \equiv 1 \pmod n$ for all $(a, n) = 1$. Such numbers are called Carmichael numbers. It is known that there are infinitely many Carmichael numbers (a result first proved in 1994) but these numbers are very sparse; the smallest Carmichael number is $561 = 3 \cdot 11 \cdot 17$ and there are only 6 other Carmichael numbers below 10000. More generally, a composite number $n$ satisfying $a^{n-1} \equiv 1 \pmod n$ for some $a$ is called a pseudoprime (to base $a$). It is known that the probability of $n$ being a pseudoprime goes to 0 as $n \to \infty$. To this extent, we say that if $a^{n-1} \equiv 1 \pmod n$ holds for some randomly chosen $a \in \{2, \dots, n - 2\}$ then $n$ is a probable prime. Extensions of such probabilistic tests are widely used as primality tests in practice, e.g., the Miller-Rabin or Solovay-Strassen tests. The first deterministic polynomial-time primality test, the AKS test (2002), also relies on Fermat's theorem.
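The following sketch illustrates the Fermat test and searches for small Carmichael numbers by brute force. The function names are our own, and the search is meant for small ranges only; it is not how Carmichael numbers are found in practice.

```python
from math import gcd, isqrt

def is_composite(n):
    """True if n has a non-trivial divisor."""
    return n > 3 and any(n % d == 0 for d in range(2, isqrt(n) + 1))

def passes_fermat_test(n, a):
    """True if a^(n-1) ≡ 1 (mod n)."""
    return pow(a, n - 1, n) == 1

def is_carmichael(n):
    """True if n is composite and passes the Fermat test for every base coprime to n."""
    return is_composite(n) and all(
        passes_fermat_test(n, a) for a in range(2, n) if gcd(a, n) == 1
    )

print([n for n in range(2, 2000) if is_carmichael(n)])  # [561, 1105, 1729]
print(passes_fermat_test(341, 2), passes_fermat_test(341, 3))  # True False
```

The second line shows that 341 = 11 · 31 is a pseudoprime to base 2 but is exposed as composite by base 3.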
6.3. Homework 6
(1) Show that n is a multiple of 4 if and only if the number composed of the last
two digits of n is a multiple of 4.
(2) Find the least non-negative residues of 1! + 2! + · · · + 10! (mod n) for n = 3, 11.
40
(3) Find the least non-negative residue of 23 (mod 10).
(4) Let $p$ be a prime. Show that $(a + b)^p \equiv a^p + b^p \pmod p$.¹
¹ Hint: Use the binomial theorem and show that $p \mid \binom{p}{k}$ for all $k = 1, \dots, p - 1$.
Congruences (II)
x≡b (mod n)
has solution x = bmu − anv and this solution is unique (mod mn).
Proof. Recall that the existence of u, v ∈ N is guaranteed by Bézout’s identity
since (m, n) = 1. We can check that bmu−anv ≡ −anv ≡ a (mod m) and bmu−anv ≡
bmu ≡ b (mod n).
Suppose y is another solution. Then x ≡ y (mod m) and (mod n). We may write
x − y = km = ln for some integers k, l. Since (m, n) = 1 we have n | k and hence x ≡ y
(mod mn).
Recall that continued fractions allow us to compute the solutions u, v to Bézout’s
identity. Sometimes the solutions can be found immediately, as is the case for m = 37,
n = 3: 37 · 1 − 3 · 12 = 1. Then the system (7.1) has the (unique) solution
x ≡ 37 − 3 · 12 · 9 ≡ 46 (mod 111).
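The solution formula x = bmu − anv can be sketched in code as follows. This is an illustrative sketch under the assumption that u, v are obtained from the extended Euclidean algorithm (rather than continued fractions); the names `extended_gcd` and `crt_pair` are hypothetical, and v is allowed to be any integer with mu − nv = 1.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t = g."""
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    return g, t, s - (a // b) * t

def crt_pair(a, m, b, n):
    """Solve x = a (mod m), x = b (mod n) for coprime m, n; result lies in [0, mn)."""
    g, s, t = extended_gcd(m, n)      # m*s + n*t = 1
    if g != 1:
        raise ValueError("moduli must be coprime")
    u, v = s, -t                      # so that m*u - n*v = 1
    return (b * m * u - a * n * v) % (m * n)

# The residues 9 (mod 37) and 1 (mod 3) match the numbers in the worked example.
x = crt_pair(9, 37, 1, 3)
print(x, x % 37, x % 3)
```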
We conclude this section with a proof of the general form of the Chinese remainder
theorem.
Theorem 17 (Chinese remainder theorem). Let $n_1, \ldots, n_k$ be pairwise coprime
numbers. Then the system of congruence equations
\[ \begin{aligned} x &\equiv a_1 \pmod{n_1} \\ &\ \ \vdots \\ x &\equiv a_k \pmod{n_k} \end{aligned} \]
7.3. Euler’s totient function
We conclude that
\[ \varphi(n) = (p_1^{k_1} - p_1^{k_1 - 1}) \cdots (p_r^{k_r} - p_r^{k_r - 1}) = \prod_{i=1}^{r} p_i^{k_i - 1}(p_i - 1) = n \prod_{p \mid n} (1 - p^{-1}). \]
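The product formula translates directly into a short computation. The sketch below assumes trial-division factoring, which is adequate only for small n; the helper names are chosen here for illustration, and the result is cross-checked against the definition of ϕ as a count of residues coprime to n.

```python
from math import gcd

def prime_factorization(n):
    """Return {p: k} with n equal to the product of p**k (trial division)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def totient(n):
    """Euler's totient via the product of p^(k-1) * (p-1) over prime powers."""
    result = 1
    for p, k in prime_factorization(n).items():
        result *= p ** (k - 1) * (p - 1)
    return result

def totient_by_counting(n):
    """Euler's totient straight from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

print([totient(n) for n in range(1, 13)])
```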
7.4. Homework 7
(1) Find the last two digits of $9^9$ (without calculator).
(2) Solve 97x ≡ 13 (mod 105).
(3) Show that m | n implies ϕ(m) | ϕ(n).
(4) Find all n ∈ N for which ϕ(n) = 6.¹
¹ Hint: First establish that n must be of the form $n = 2^a 3^b 7^c$ for some integers a, b, c ≥ 0.
CHAPTER 8
Quadratic Residues
Proof. (i) Suppose there exist x, y such that $x^2 \equiv a$, $y^2 \equiv b \pmod p$. Then
$(xy)^2 \equiv ab \pmod p$, i.e., ab is a quadratic residue (mod p).
(ii) We have seen that among the set $\{1, 2, \ldots, p-1\}$, half the elements are
quadratic residues, say $\{a_1, \ldots, a_{(p-1)/2}\}$, and the other half are (therefore) not, say
$\{b_1, \ldots, b_{(p-1)/2}\} = \{1, 2, \ldots, p-1\} \setminus \{a_1, \ldots, a_{(p-1)/2}\}$. Let a be an integer coprime to
p. Then $a, 2a, \ldots, (p-1)a$ are p − 1 distinct numbers (mod p).
If a is a quadratic residue, then by (i) $aa_1, \ldots, aa_{(p-1)/2}$ are quadratic residues as
well, and hence by Proposition 36 the other half, $ab_1, \ldots, ab_{(p-1)/2}$, cannot be. This
proves (ii).
(iii) Consider the same setup as for (ii) but choose a to be a quadratic nonresidue.
Then by (ii) $aa_1, \ldots, aa_{(p-1)/2}$ are quadratic nonresidues, and this forces
$ab_1, \ldots, ab_{(p-1)/2}$ to be quadratic residues. This proves (iii).
In this sense, quadratic residues behave like +1, quadratic nonresidues like −1.
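Definition 38. For p an odd prime and a such that (a, p) = 1, the Legendre symbol is
\[ \left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } a \text{ is a quadratic residue (mod } p\text{)} \\ -1 & \text{if } a \text{ is a quadratic nonresidue (mod } p\text{).} \end{cases} \]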
We can reformulate the properties of quadratic residues established up to now in
terms of the Legendre symbol as follows:
• $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ for every k ∈ Z;
• $\left(\frac{a}{p}\right) = 1$ for half the elements of {1, . . . , p − 1};
• $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$, i.e., the Legendre symbol is completely multiplicative.
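In the next chapter, we will prove that every Legendre symbol can be computed thanks
to the quadratic reciprocity law
\[ \left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right), \]
valid for every pair p, q of distinct odd primes, together with the supplementary laws
\[ \left(\frac{-1}{p}\right) = 1 \text{ if and only if } p \equiv 1 \pmod 4, \]
\[ \left(\frac{2}{p}\right) = 1 \text{ if and only if } p \equiv 1 \text{ or } 7 \pmod 8. \]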
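We conclude this section with some computational examples:
\[ \left(\frac{72}{97}\right) = \left(\frac{9}{97}\right)\left(\frac{8}{97}\right) = \left(\frac{3}{97}\right)^2 \left(\frac{2}{97}\right)^3 = \left(\frac{2}{97}\right) = 1 \]
\[ \left(\frac{34}{97}\right) = \left(\frac{2}{97}\right)\left(\frac{17}{97}\right) = \left(\frac{17}{97}\right) = \left(\frac{97}{17}\right) = \left(\frac{12}{17}\right) = \left(\frac{2}{17}\right)^2 \left(\frac{3}{17}\right) = \left(\frac{2}{3}\right) = -1 \]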
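Such hand computations can be double-checked by machine. The sketch below does not use quadratic reciprocity; it assumes Euler's criterion, $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod p$, a standard fact that is not stated in this passage. The function name `legendre` is chosen here for illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion (assumed)."""
    r = pow(a % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

# Cross-check of the two symbols evaluated in the examples.
print(legendre(72, 97), legendre(34, 97))
```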
A simple modification of Euclid’s argument allows us to prove that there are infinitely
many primes ≡ 3 (mod 4).
Theorem 20. There are infinitely many primes ≡ 3 (mod 4).
Proof. Suppose there were only finitely many primes ≡ 3 (mod 4), say $p_1, \ldots, p_k$.
Then the number $N = 4(p_1 \cdots p_k) - 1$ is larger than the largest prime in our list, hence
must be composite. At least one prime divisor p | N must be of the form p ≡ 3 (mod 4);
if p, q ≡ 1 (mod 4) then pq ≡ 1 (mod 4), hence if all prime divisors of a number n
are ≡ 1 (mod 4), we have n ≡ 1 (mod 4), whereas N ≡ 3 (mod 4).
Since p ≡ 3 (mod 4), it is one of the $p_i$ and so also divides $p_1 \cdots p_k$; then p divides
$4(p_1 \cdots p_k) - N = 1$, and we obtain a contradiction.
However this argument does not work to show that there are infinitely many primes
p ≡ 1 (mod 4). Instead we use the supplementary law.
Theorem 21. There are infinitely many primes ≡ 1 (mod 4).
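Proof. Suppose there were only finitely many primes ≡ 1 (mod 4), say $p_1, \ldots, p_k$.
Then the number $N = (p_1 \cdots p_k)^2 + 1$ is larger than the largest prime in our list, hence
must be composite. Since N ≡ 2 (mod 4) and N > 2, it has an odd prime divisor p, and any
such p gives $\left(\frac{-1}{p}\right) = 1$, hence is of the form p ≡ 1 (mod 4). Then p is one of the $p_i$
and divides both N and $(p_1 \cdots p_k)^2$, hence also 1, and we again have a contradiction.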
All this motivated looking at similar patterns (such as the fact that p is a prime
divisor of $n^2 - 2$ if and only if p lies in either one of the arithmetic progressions 8k + 1,
8k − 1, k ∈ N, i.e., the second supplementary law) until the quadratic reciprocity
law was empirically observed by Euler, and eventually proved by Gauss. The fact
that its proof resisted both Euler and Legendre reflects that it is very much a non-
trivial statement. On the other hand, we have now more than 250 different proofs
of the quadratic reciprocity law, and this is a testament both to the great advances
in mathematics over the last two centuries, and to the fundamental nature of the
quadratic reciprocity law; it marked the birth of algebraic number theory and many
later developments in mathematics.
8.4. Homework 8
(1) Suppose that Alice’s public key is (583, 3). Compute Alice’s private key.
(2) Compute the Legendre symbols $\left(\frac{-26}{73}\right)$, $\left(\frac{19}{73}\right)$, $\left(\frac{33}{73}\right)$.
(3) Find all odd primes for which −2 is a quadratic residue.
(4) Show that there are infinitely many primes that are congruent to either 1 or
7 (mod 8).
(5) Let p ≥ 7 be a prime. Show that there are always two quadratic residues (mod
p) that differ by 2.1
Quadratic Reciprocity
On the right hand-side we note that p − 1 ≡ −1 (mod p), p − 3 ≡ −3 (mod p), etc.
This suggests that applying this congruence equivalence to roughly the second half of
the terms, we should eventually obtain
\[ 2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv \pm \left(\tfrac{p-1}{2}\right)! \pmod p. \]
The determination of the sign is the crucial point; observe that it will depend on
whether $\frac{p-1}{2}$ is even or odd. We consider these two cases separately:
(A) We have $\frac{p-1}{2} = 2k$;
(B) We have $\frac{p-1}{2} = 2k + 1$;
for some number k ∈ N. Depending on whether we are in case (A) or (B), the second
bracket [· · · ] in
2 · 4 · · · (p − 3)(p − 1) = [2 · · · (2k)] · [(2k + 2) · · · (p − 3)(p − 1)]
is a product of k or k + 1 terms, while the first bracket is a product of k terms.
In case (A), we thus find
\[ 2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv (-1)^k (2k)! \pmod p, \quad \text{i.e.,} \quad 2^{\frac{p-1}{2}} \equiv (-1)^k \pmod p. \]
Hence in case (A), $\left(\frac{2}{p}\right) = 1$ if and only if k is even, that is, if and only if p ≡ 1 (mod 8).
In case (B), we have instead
\[ 2^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \equiv (-1)^{k+1} (2k+1)! \pmod p, \quad \text{i.e.,} \quad 2^{\frac{p-1}{2}} \equiv (-1)^{k+1} \pmod p. \]
Hence in case (B), $\left(\frac{2}{p}\right) = 1$ if and only if k is odd, i.e., if and only if p ≡ 7 (mod 8).
This proves the second supplementary law.
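The statement just proved is easy to test numerically. The sketch below assumes that $\left(\frac{2}{p}\right)$ is read off from the residue of $2^{(p-1)/2}$ modulo p, as in the computation above; the bound 500 and the helper `is_prime` are arbitrary choices made for illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

odd_primes = [p for p in range(3, 500) if is_prime(p)]

# 2 is a quadratic residue mod p exactly when 2^((p-1)/2) is 1 mod p.
law_holds = all(
    (pow(2, (p - 1) // 2, p) == 1) == (p % 8 in (1, 7))
    for p in odd_primes
)
print(law_holds)
```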
On the basis of such computations, Gauss discovered a simple lemma, which will
provide the key to the elementary proof of the law of quadratic reciprocity presented
in the next section.
Lemma 40 (Gauss’ lemma).
\[ \left(\frac{a}{p}\right) = (-1)^{\nu}, \]
where $\nu := \#\{1 \le k \le \frac{p-1}{2} \mid \frac{p}{2} < ka \ (\mathrm{mod}\ p) < p\}$.
Proof. Starting again from
\[ a^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! = a(2a) \cdots \left(\tfrac{p-1}{2}\, a\right) \]
and since j + k < p − 1 this is impossible. Hence each number in the set $1, 2, \ldots, \frac{p-1}{2}$
appears exactly once, with a predetermined sign. The number of negative signs is
exactly given by ν.
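Gauss' lemma gives a purely combinatorial way to evaluate the Legendre symbol: count the multiples ka whose least non-negative residue exceeds p/2. The following sketch compares this count with Euler's criterion (assumed here as an independent check, not taken from this passage); all function names are illustrative.

```python
def gauss_nu(a, p):
    """Number of k in 1..(p-1)/2 whose residue ka mod p lies strictly above p/2."""
    half = (p - 1) // 2
    return sum(1 for k in range(1, half + 1) if 2 * ((k * a) % p) > p)

def legendre_by_gauss(a, p):
    """(a/p) as (-1)^nu, following Gauss' lemma."""
    return -1 if gauss_nu(a, p) % 2 else 1

def legendre_by_euler(a, p):
    """(a/p) via Euler's criterion, used only as a cross-check."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
agree = all(
    legendre_by_gauss(a, p) == legendre_by_euler(a, p)
    for p in odd_primes
    for a in range(1, p)
)
print(agree)
```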
Comparing (9.1) and (9.2) we conclude that $\nu_p$ and $\nu_q$ have the same parity. By Gauss’
lemma, this proves $\left(\frac{j}{p}\right) = \left(\frac{j}{q}\right)$. The same argument can be made to show that this
equality also holds in case (B).
Once again the usual rules can be established, but they become more complicated. For
example
\[ \left(\frac{a}{n}\right) = \left(\frac{b}{n}\right) \]
whenever a ≡ b (mod n) except if n ≡ 2 (mod 4), in which case the equality holds
whenever a ≡ b (mod 4n). The Kronecker symbol satisfies the following version of
quadratic reciprocity. Write n ∈ N as $n = 2^e n'$ with e ≥ 0 and n′ odd. Then for
any coprime m, n ∈ N we have
\[ \left(\frac{m}{n}\right) = (-1)^{\frac{m'-1}{2} \cdot \frac{n'-1}{2}} \left(\frac{n}{m}\right). \]
9.4. Homework 9
(1) Use quadratic reciprocity to find all odd primes for which 3 is a quadratic
residue, i.e., $\left(\frac{3}{p}\right) = 1$.
(2) Use Gauss’ lemma to find all odd primes for which 3 is a quadratic
residue.
(3) Show that if p, q are twin primes, i.e., |p − q| = 2, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$.
(4) Let n ∈ N be odd and (a, n) = 1. Show that if the Jacobi symbol is $\left(\frac{a}{n}\right) = -1$
then a is not a quadratic residue (mod n).
(5) Show that if m, n ≡ 0, 1 (mod 4) and are coprime then the quadratic reciprocity
law for the Kronecker symbol asserts that $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = 1$.
CHAPTER 10
Applications of Residue Symbols
Looking at this picture, it is clear that there is a symmetry connecting the two
new circles; they are reflections of one another with respect to the circle that passes
through the tangency points of the original three circles.
In other words, given the first four circles, you know everything about the fifth.
This should apply in particular to the numerical invariants of these circles, namely their
center and radius. Since Apollonius’ theorem does not depend on the position of the
circles in the plane, we only consider the radii. To state the following algebraic identity
in the simplest possible way, we will need to introduce some notation. The curvature
of a circle with radius r is defined to be 1/r. To each circle C in a configuration of four
mutually tangent circles we assign a signed curvature defined as the usual curvature if
C contains none of the three other circles in its interior, and minus the usual curvature
if C contains all other three circles in its interior. For example in Figure 1, C2 , C3 , C4
would be assigned positive curvature and C1 negative curvature.
Theorem 24 (Descartes, 1643). For any configuration of four mutually tangent
circles with signed curvatures $a_1, a_2, a_3, a_4 \in \mathbb{R}$, we have the identity
\[ (a_1 + a_2 + a_3 + a_4)^2 = 2(a_1^2 + a_2^2 + a_3^2 + a_4^2). \]
Remark 42. Observe that if $a_1, a_2, a_3$ are fixed, then the quadratic equation
\[ (a_1 + a_2 + a_3 + x)^2 = 2(a_1^2 + a_2^2 + a_3^2 + x^2) \]
\[ \iff x^2 - 2(a_1 + a_2 + a_3)x + \left(2(a_1^2 + a_2^2 + a_3^2) - (a_1 + a_2 + a_3)^2\right) = 0 \]
has two solutions $a_4, a_4'$, related by the identity
\[ a_4 + a_4' = 2(a_1 + a_2 + a_3). \tag{10.1} \]
This is an algebraic expression of Apollonius’ theorem.
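Identity (10.1) means that, once one Descartes quadruple is known, the curvature of the "other" fourth circle follows by integer arithmetic alone. A minimal sketch, with function names chosen here for illustration and the quadruple (−3, 5, 8, 8) used purely as a sample input satisfying the identity:

```python
def satisfies_descartes(a1, a2, a3, a4):
    """Check the identity (a1+a2+a3+a4)^2 = 2(a1^2+a2^2+a3^2+a4^2)."""
    return (a1 + a2 + a3 + a4) ** 2 == 2 * (a1 * a1 + a2 * a2 + a3 * a3 + a4 * a4)

def other_curvature(a1, a2, a3, a4):
    """Second root a4' of the quadratic in x, obtained from a4 + a4' = 2(a1+a2+a3)."""
    return 2 * (a1 + a2 + a3) - a4

quad = (-3, 5, 8, 8)
swapped = other_curvature(*quad)
print(satisfies_descartes(*quad), swapped, satisfies_descartes(-3, 5, 8, swapped))
```

Because the update is linear in the known curvatures, integrality of one quadruple propagates to every circle generated from it.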
An Apollonian circle packing is obtained by starting from a “root configuration”
of four mutually tangent circles and generating smaller and smaller circles in the interstices
according to Apollonius’ theorem.
• Except for the outer circle, which has signed curvature −3, every circle in the
packing will have positive signed curvature;
• The signed curvatures 1,2,3,4,6,7 do not appear;
• Each signed curvature that appears is ≡ 0 or 1 (mod 4). (See the homework
at the end of this chapter).
The local-global conjecture states that these are the only possible types of ob-
structions on a natural number appearing as a curvature in a primitive integral Apol-
lonian packing.
Conjecture 43 (Local–global conjecture). For a primitive integral Apollonian packing,
every sufficiently large number in an admissible residue class will appear as a
curvature.
Impressive machinery has been developed in the last decade to make progress on
this conjecture. Then two years ago, during a Research Experience for Undergraduates
(REU) project at the University of Colorado, it was discovered that the conjecture is
... false.1
Theorem 25 (Haag–Kertzer–Rickards–Stange). No square curvatures $n^2$ appear
in the primitive integral Apollonian packing A generated by (−3, 5, 8, 8).
Sketch of proof. We will take the following facts on primitive integral Apollonian
packings for granted. Fix a circle C ∈ A and let n be the curvature of C.
(i) For each C′ ∈ A, there exists a path of tangent circles with coprime curvatures
that connects C to C′;
(ii) For each C′ ∈ A tangent to C and with curvature m coprime to n, we have
$m \equiv Ax^2 \pmod n$ for some integer x, and where A is a constant independent
of C′.
1 You can read more about this story here or here, and find the final research paper here.
Recall that each curvature in this packing satisfies n ≡ 0, 1 (mod 4). Then by (ii) the
Kronecker symbol $\left(\frac{m}{n}\right)$ satisfies
\[ \left(\frac{m}{n}\right) = \left(\frac{A}{n}\right). \]
In particular, since the constant A depends only on the circle C, we can define
\[ \chi(C) := \left(\frac{A}{n}\right) \]
to be the Kronecker symbol of the circle C. In the previous homework you had to
show that if two numbers m, n satisfy m, n ≡ 0, 1 (mod 4) and are coprime then the
quadratic reciprocity law for the Kronecker symbol asserts that $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = 1$. This
implies that for any two tangent circles C, C′ ∈ A with coprime curvatures, we have
\[ \chi(C) = \chi(C'). \]
Let $C_0$ be the circle of curvature 5 appearing in the root configuration. By (i) we
conclude that $\chi(C) = \chi(C_0)$ for all C ∈ A, i.e., the Kronecker symbol is constant
over all circles of the packing. It remains to compute $\chi(C_0)$; this is achieved by the
Legendre symbol computation
\[ \chi(C_0) = \left(\frac{8}{5}\right) = -1. \]
To conclude, we observe that any circle C appearing in A with square curvature $n^2$
would have Kronecker symbol χ(C) = 1, which is impossible by the ‘reciprocity
obstruction’ just established.
Since N(α/β − ω) is the square of the distance between α/β and ω, we have the
upper bound
\[ N(\alpha/\beta - \omega) \le \frac{1}{2}. \]
Now set ρ := α − ωβ. Since Z[i] is a ring, we have ρ ∈ Z[i]. Moreover N(ρ) ≥ 0 and
by the multiplicativity of the norm we have
\[ N(\rho) = N(\beta(\alpha/\beta - \omega)) = N(\beta)\,N(\alpha/\beta - \omega) \le \frac{1}{2} N(\beta) < N(\beta). \]
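Division with remainder in Z[i] can be sketched as follows: ω is taken to be a Gaussian integer nearest to α/β, found by rounding both coordinates. Gaussian integers are represented here as pairs (a, b) standing for a + ib; this representation and the function names are assumptions made for the illustration, and exact integer arithmetic is used to avoid floating-point rounding.

```python
def norm(z):
    """N(a + ib) = a^2 + b^2."""
    a, b = z
    return a * a + b * b

def gaussian_mul(z, w):
    """Product of two Gaussian integers given as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gaussian_divmod(alpha, beta):
    """Return (omega, rho) with alpha = omega*beta + rho and N(rho) <= N(beta)/2."""
    (a, b), (c, d) = alpha, beta
    n = norm(beta)
    if n == 0:
        raise ZeroDivisionError("division by zero in Z[i]")
    # alpha/beta = alpha * conj(beta) / N(beta) = (re + i*im) / n
    re, im = a * c + b * d, b * c - a * d
    # Round each coordinate to a nearest integer, using integer arithmetic only.
    omega = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    prod = gaussian_mul(omega, beta)
    rho = (a - prod[0], b - prod[1])
    return omega, rho

print(gaussian_divmod((27, 23), (8, 1)))
```

Repeating this step with (β, ρ) in place of (α, β) gives Euclid's algorithm in Z[i], since the norm of the remainder strictly decreases.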
Given division with remainder, Euclid’s algorithm holds in Z[i], implying the fun-
damental theorem of arithmetic for Z[i]: every Gaussian integer can be written as a
product of Gaussian primes in a unique way up to multiplication by a unit (i.e.,
α = ±1, ±i). For instance
5 = (2 + i)(2 − i) = (1 + 2i)(1 − 2i)
are seen as the same factorization given that −i(2 + i) = (1 − 2i) and i(2 − i) = (1 + 2i).
This is similar to considering 6 = 2 · 3 = (−2)(−3).
Definition 47. A Gaussian integer π ∈ Z[i] is a Gaussian prime if N(π) > 1 and
if it cannot be written as a product π = αβ of two Gaussian integers α, β ∈ Z[i] with
1 < N(α), N(β) < N(π).
² Review Example 2 in Chapter 2.2 and find a counterexample of unique factorization for $\mathbb{Z}[\sqrt{-3}]$.
CHAPTER 11
Primes
can easily check that π and $\bar{\pi}$ in Proposition 51 are not associates. We say that 2 is
ramified.
We can now characterize Gaussian primes by relating them back to rational (ordi-
nary) primes.
Theorem 28. Let π = a + ib ∈ Z[i], π ≠ 0. If ab ≠ 0, the Gaussian integer π is a
Gaussian prime if and only if $a^2 + b^2$ is prime. If ab = 0, i.e., π = a or π = ib, π is a
Gaussian prime if and only if |π| is prime and |π| ≡ 3 (mod 4).
The proof builds on the following simple observation.
Lemma 53. If π is a Gaussian prime, there exists a rational prime p ∈ N such that
π | p.
Proof. By definition of the norm, we have π | N (π). Since π is a Gaussian prime,
Euclid’s lemma implies that π | p, where p is some prime factor of N (π).
Proof of Theorem 28. Let π be a Gaussian prime with π | p.
If p ≡ 3 (mod 4), then p is a Gaussian prime and π is associate to p.
If p ≡ 1 (mod 4), there exists some Gaussian prime π′ such that $\pi \mid \pi' \overline{\pi'}$ and by
Euclid’s lemma, π is associate to either π′ or its conjugate. Moreover, N(π) = N(π′) =
p. The same conclusion holds if p = 2.
Conversely, if N(π) is prime or if |π| ≡ 3 (mod 4) and |π| is prime, then π is a
Gaussian prime.
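The criterion of Theorem 28 can be turned into a small test. The sketch below takes π = a + ib as a pair of integers and uses trial division for the rational primality check; both choices, and the function names, are assumptions made for illustration.

```python
def is_prime(n):
    """Trial-division primality test for rational integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_prime(a, b):
    """Decide whether a + ib is a Gaussian prime, following Theorem 28."""
    if a != 0 and b != 0:
        return is_prime(a * a + b * b)      # case ab != 0: the norm must be prime
    m = abs(a) + abs(b)                      # |pi| when one coordinate vanishes
    return is_prime(m) and m % 4 == 3        # case ab = 0

print([(a, b) for a in range(0, 4) for b in range(0, 4) if is_gaussian_prime(a, b)])
```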
Theorem 29 (Dirichlet, 1837). For (a, n) = 1, there are infinitely many primes of
the form p ≡ a (mod n).
The proof of Dirichlet’s theorem is unfortunately outside of the scope of this course,
but should be studied by anyone who wants to delve deeper in number theory! In this
section, we will review a few foundational results related to the distribution of prime
numbers. We saw earlier with Eratosthenes sieve that primes become sparser and
sparser the larger the numbers we consider. This is no surprise; a large number is more
likely to have non-trivial divisors.
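Let $p_n$ denote the n-th prime and set $g_n = p_{n+1} - p_n$ to be the gap between $p_n$ and
the next larger prime. We first show that
\[ \limsup_{n \to \infty} g_n = \infty, \]
i.e., that there are arbitrarily large gaps between successive primes. At the other extreme,
the twin prime conjecture asserts that $\liminf_{n \to \infty} g_n = 2$,
i.e., that there exist infinitely many pairs of successive primes that differ by 2, such as
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), . . . . We currently know that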
\[ \liminf_{n \to \infty} g_n \le N, \]
where the first breakthrough is due to Zhang, who showed in 2013 that one can take $N = 7 \cdot 10^7$;
this was brought down to N = 600 (Maynard, 2013) and N = 246 (Polymath, 2014). You
can check out this Numberphile episode to hear more.
For n > 1, we define the prime counting function to be
π(n) = #{p ≤ n, p prime}.
Although the distribution of the prime numbers is irregular, their average distribution
behaves remarkably regularly. We have, e.g., π(10) = 4, π(1000) = 168, and the plot
of π(n) looks like
In the 1790s, Legendre and Gauss both conjectured, based on numerical observations,
that the density of prime numbers among the first n whole numbers is approximately
$(\log n)^{-1}$ (where log denotes the natural logarithm); this is usually expressed
as
\[ \frac{\pi(n)}{n} \sim \frac{1}{\log n} \quad (n \to \infty) \]
to say that the two quantities are asymptotically equivalent.
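The kind of numerical evidence behind this conjecture is easy to reproduce. The following sketch tabulates π(n) with the sieve of Eratosthenes and compares it with n / log n; the function name `prime_counts` and the sample points are choices made here for illustration.

```python
from math import isqrt, log

def prime_counts(limit):
    """Return a list c with c[n] = pi(n) for 0 <= n <= limit (requires limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"                 # 0 and 1 are not prime
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    counts, running = [0] * (limit + 1), 0
    for n in range(limit + 1):
        running += sieve[n]
        counts[n] = running
    return counts

pi_table = prime_counts(10**5)
for n in (10, 10**3, 10**5):
    print(n, pi_table[n], round(pi_table[n] / (n / log(n)), 4))
```

The ratio in the last column approaches 1 only slowly, which is consistent with the asymptotic (rather than exact) nature of the statement.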
Definition 55. Let f (x), g(x) be two functions of one variable, and suppose that
g(x) > 0 with at most finitely many exceptions. We say that f (x) and g(x) are asymp-
totically equivalent, written f (x) ∼ g(x) as x → ∞, if and only if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]
The function on the very right, which gives the better approximation, was introduced
by Riemann and can be expressed as
\[ R(x) = 1 + \sum_{n=1}^{\infty} \frac{1}{n\,\zeta(n+1)} \frac{(\log x)^n}{n!}, \]
where ζ is the Riemann zeta-function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \ldots, \]
where s is a complex variable with Re(s) > 1 (to guarantee the convergence of the
series). Although Riemann couldn’t prove the prime number theorem, he conjectured
the following exact formula
\[ \pi(x) = R(x) - \sum_{\rho} R(x^{\rho}), \]
where the sum runs over the set of the (‘non-trivial’) zeros of ζ. Without going into
details, this is to say that the fluctuations of π(x) depend on the location of the zeros
of ζ. The Riemann hypothesis, one of the six remaining open Millennium Prize
Problems, states that all such zeros lie on the vertical line $\frac{1}{2} + i\mathbb{R}$ in the complex plane.
It would imply that the primes have as regular a distribution as one could hope for.
11.3. Homework 11
(1) Find the prime factorization of 3 − i, 4 + 7i, 5 + i.1
(2) Goldbach’s strong conjecture states that every even number n > 2 can be
written as a sum of two primes. Goldbach’s weak conjecture (settled by
Helfgott in 2013) states that every odd number n > 5 can be written as a sum
of three primes. Show that the strong conjecture implies the weak one.
(3) Show that the function $n \mapsto g_n$ is not multiplicative.
(4) Show that $g_n \sim \log n$ as n → ∞.
(5) Use the prime number theorem to show that there are infinitely many prime
numbers with leading digit 7.2
Generating Functions
For the moment, we view F (x) as a formal power series, and leave issues of conver-
gence aside.
Example 59. The constant sequence an = 1 has as (ordinary) generating function the
geometric series
\[ \sum_{n=0}^{\infty} x^n = (1 - x)^{-1}. \]
Formally this identity is obtained by the operation of shifting:
\[ (1 - x) \sum_{n=0}^{\infty} x^n = \sum_{n=0}^{\infty} x^n - \sum_{n=1}^{\infty} x^n = x^0 = 1. \]
The same trick shows that for every N ≥ 1, we have
\[ \sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x}. \]
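Example 60. Recall the definition of the Fibonacci numbers: $F_0 = 0$, $F_1 = 1$, and
$F_n = F_{n-1} + F_{n-2}$ for every n ≥ 2. The generating function for $(F_n)_{n \ge 0}$ then satisfies
\[ \sum_{n=0}^{\infty} F_n x^n = x + \sum_{n=2}^{\infty} F_{n-1} x^n + \sum_{n=2}^{\infty} F_{n-2} x^n = x + x \sum_{n=0}^{\infty} F_n x^n + x^2 \sum_{n=0}^{\infty} F_n x^n, \]
leading to the closed form expression
\[ \sum_{n=0}^{\infty} F_n x^n = \frac{x}{1 - x - x^2} = \frac{-x}{(x + \varphi)(x + \bar{\varphi})}, \]
where $\varphi, \bar{\varphi} = \frac{1 \pm \sqrt{5}}{2}$.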
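Further, by partial fraction decomposition, we find that
\[ \sum_{n=0}^{\infty} F_n x^n = \frac{-x}{(x + \varphi)(x + \bar{\varphi})} = \frac{1}{\sqrt{5}} \frac{\bar{\varphi}}{x + \bar{\varphi}} - \frac{1}{\sqrt{5}} \frac{\varphi}{x + \varphi} = \sum_{n=0}^{\infty} \frac{\varphi^n - \bar{\varphi}^n}{\sqrt{5}}\, x^n. \]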
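Both descriptions of the Fibonacci numbers can be compared numerically. The sketch below expands a rational function P(x)/Q(x) as a formal power series by comparing coefficients in Q(x) · Σ c_n x^n = P(x), and checks the result against the closed formula in φ and φ̄; the function name `series_coefficients` and the list-based polynomial encoding (lowest degree first) are assumptions made for the illustration.

```python
from fractions import Fraction
from math import sqrt

def series_coefficients(P, Q, count):
    """First `count` power-series coefficients of P(x)/Q(x), assuming Q[0] != 0."""
    coeffs = []
    for n in range(count):
        s = Fraction(P[n]) if n < len(P) else Fraction(0)
        for k in range(1, min(n, len(Q) - 1) + 1):
            s -= Q[k] * coeffs[n - k]
        coeffs.append(s / Q[0])
    return coeffs

# x / (1 - x - x^2)
fib_from_series = series_coefficients([0, 1], [1, -1, -1], 10)

phi, phi_bar = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
fib_from_formula = [round((phi**n - phi_bar**n) / sqrt(5)) for n in range(10)]

print([int(c) for c in fib_from_series])
print(fib_from_formula)
```

The recursion inside `series_coefficients` is exactly a linear recurrence with coefficients read off from Q, which is the mechanism behind the link between rational generating functions and linear recurrences.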
With respect to these operations of addition and multiplication, the set of all formal
power series with coefficients in a commutative ring R is a ring itself, denoted R[[X]],
called the ring of formal power series.
Proposition 62. Two sequences $(a_n)$, $(b_n)$ are identical, i.e., $a_n = b_n$ for all n ≥ 0, if
and only if $\sum a_n x^n = \sum b_n x^n$.
Proposition 63. The sequence $(a_n)$ is defined by a linear recurrence relation if and
only if $\sum a_n x^n = \frac{P(x)}{Q(x)}$ for P(x), Q(x) polynomials of finite degrees.
Ordinary generating functions are widely used in combinatorics, probability theory,
and number theory. We highlight a few examples of applications.
Proposition 64. Given a set of n elements, let C(n, k) be the number of k-combinations,
i.e., the number of distinct ways to select k out of the n elements (where the ordering
does not matter). Then $C(n, k) = \binom{n}{k}$.
Proof. The finite generating function $\sum_{k=0}^{n} C(n, k) x^k$ is a polynomial of degree
n. The combinatorial meaning of the coefficients C(n, k) implies that this polynomial factors as
\[ \underbrace{(1 + x)(1 + x) \cdots (1 + x)}_{n \text{ times}}, \]
where each factor represents one of the n given elements, the summand $x^0 = 1$ corresponds
to not selecting the element, and the summand $x^1 = x$ corresponds to its selection.
The statement then follows from the binomial theorem, i.e.,
\[ \sum_{k=0}^{n} C(n, k) x^k = (1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k. \]
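The argument can be replayed computationally: multiplying out n copies of (1 + x) and counting k-element selections directly should give the same numbers. This is a small sketch with illustrative helper names; polynomials are stored as coefficient lists, lowest degree first.

```python
from itertools import combinations
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            result[i + j] += pi * qj
    return result

def expand_one_plus_x(n):
    """Coefficients of (1 + x)^n obtained by repeated multiplication."""
    poly = [1]
    for _ in range(n):
        poly = poly_mul(poly, [1, 1])
    return poly

def count_selections(n, k):
    """C(n, k) by listing all k-element subsets of an n-element set."""
    return sum(1 for _ in combinations(range(n), k))

n = 6
print(expand_one_plus_x(n))
print([count_selections(n, k) for k in range(n + 1)])
```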
Next we consider the most famous function of combinatorial number theory, the
partition function. Let p(n) be the number of ways n ∈ N can be written as a sum
of positive integers (where the order does not matter). For example
6=5+1=4+2=4+1+1=3+3=3+2+1=3+1+1+1=2+2+2
=2+2+1+1=2+1+1+1+1=1+1+1+1+1+1
so that p(6) = 11. The partition function does not satisfy any simple linear recurrence,
as follows from the following theorem. (In fact, the partition function has no known
closed form expression!)
Theorem 31 (Euler). Set p(0) = 1. We have the formal identity
\[ \sum_{n=0}^{\infty} p(n)\, x^n = \prod_{n=1}^{\infty} \frac{1}{1 - x^n}. \]
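Euler's identity also gives a practical way to tabulate p(n): multiply the factors 1/(1 − x^m) one at a time, truncating every series at a fixed degree. A minimal sketch, with the function name `partition_numbers` chosen here for illustration:

```python
def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] from the truncated Euler product."""
    coeffs = [1] + [0] * limit            # the power series 1
    for part in range(1, limit + 1):
        # Multiplying by 1/(1 - x^part) = 1 + x^part + x^(2*part) + ...
        # amounts to the in-place update c[n] += c[n - part].
        for n in range(part, limit + 1):
            coeffs[n] += coeffs[n - part]
    return coeffs

print(partition_numbers(10))
```

Factors with m larger than the truncation degree do not affect the retained coefficients, which is why the finite loop suffices.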
Additive number theory is concerned with the properties of subsets of integers under
addition. A familiar prototypical problem is to determine rk (n), the number of ways
to write n as a sum of k squares. The function rk (n) is called the sum-of-squares
function. The generating function for rk (n) is given in terms of another important
formal series,
\[ \theta(x) := \sum_{n \in \mathbb{Z}} x^{n^2} \]
Theorem 32 (Jacobi).
\[ \theta(x) = \prod_{n=1}^{\infty} \frac{(1 - x^{2n})^5}{(1 - x^n)^2 (1 - x^{4n})^2}. \]
Closed form expressions are known for $r_k(n)$ with k ≤ 8 and their absence when
k > 8 is explained by the theory of modular forms. Similarly, modular forms are
instrumental in proving the asymptotic formula for the partition function
\[ p(n) \sim \frac{1}{4n\sqrt{3}}\, e^{\pi \sqrt{2n/3}} \quad (\text{as } n \to \infty) \]
(first stated by Ramanujan) and in the study of that other famous original problem of
additive number theory, Waring’s problem: Given k, does there exist s ∈ N such
that each positive integer can be written as a sum of at most s k-th powers?
In multiplicative number theory, one instead usually works with a different type of
generating series: given a sequence (an )n≥1 , its (formal) Dirichlet series is given by
\[ F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}. \]
In practice, the variable s can be a complex number, but we will only consider real
values in this chapter, and again for a moment only consider F(s) as a formal series,
leaving convergence issues aside. The prototype of a Dirichlet series is the zeta
function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]
which we will study in the next section. To conclude here, we take note of the following
proposition.
Proposition 65. Dirichlet series can be added and multiplied as follows,
\[ F(s) + G(s) = \sum_{n=1}^{\infty} \frac{a_n + b_n}{n^s}, \]
\[ F(s) \cdot G(s) = \sum_{n=1}^{\infty} n^{-s} \sum_{d \mid n} a_d\, b_{n/d}. \]
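On the level of coefficients, the product rule is the operation usually called Dirichlet convolution. The sketch below computes the first coefficients of F(s) · G(s) from those of F and G; the function name and the convention of passing coefficient sequences as Python functions are assumptions made for illustration.

```python
def dirichlet_product(a, b, limit):
    """Coefficients c_1, ..., c_limit of F(s)G(s), where c_n is the sum of a(d) * b(n/d) over d | n."""
    c = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            c[d * m] += a(d) * b(m)
    return c[1:]

# Squaring the series with all coefficients 1 counts the divisors of n.
one = lambda n: 1
print(dirichlet_product(one, one, 10))
```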
where the product on the right hand-side is over all prime numbers.
Proof. We start with the observation that
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 2^{-s} + 3^{-s} + 4^{-s} + \ldots, \qquad 2^{-s} \sum_{n=1}^{\infty} \frac{1}{n^s} = 2^{-s} + 4^{-s} + 6^{-s} + \ldots \]
so that the difference of the two series is
\[ \left(1 - 2^{-s}\right) \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 3^{-s} + 5^{-s} + 7^{-s} + \ldots \]
In other words, we have sieved out all multiples of 2. By the same argument, we can
sieve out all multiples of 3:
\[ \left(1 - 3^{-s}\right)\left(1 - 2^{-s}\right) \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + 5^{-s} + 7^{-s} + \ldots. \]
Continuing this process leads to the statement.
Remark 66. The application of Eratosthenes sieve reflects the fact that we can think
of the existence of the Euler product expansion as a reflection of unique factorization
(i.e., the fundamental theorem of arithmetic);
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \sum_{r=0}^{\infty} \frac{1}{p^{rs}} = \prod_{p} \frac{1}{1 - p^{-s}}, \]
where the last step is an application of the geometric series.
Many interesting arithmetic sequences can be recovered from the zeta function. For
example, the Möbius function µ(n) is defined by
\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by the square of a prime}, \\ (-1)^k & \text{if } n \text{ is a product of } k \text{ distinct primes}, \end{cases} \]
and we have
\[ \frac{1}{\zeta(s)} = \prod_{p} (1 - p^{-s}) = (1 - 2^{-s})(1 - 3^{-s})(1 - 5^{-s})(1 - 7^{-s}) \cdots \]
Proposition 67. If s > 1, the zeta function ζ(s) is convergent and admits the Euler
product $\zeta(s) = \prod_{p} (1 - p^{-s})^{-1}$.
Then
\[ \prod_{p < P} (1 - p^{-s})\, \zeta(s) - 1 = P^{-s} + \cdots < \sum_{n=P}^{\infty} \frac{1}{n^s} \to 0 \]
as P → ∞.
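The convergence of the partial Euler products can be observed numerically. The sketch below evaluates the product over primes p < P at s = 2 and compares it with the value ζ(2) = π²/6, a classical fact that is used here as a reference value and is not derived in this passage; the function names are illustrative.

```python
from math import pi

def primes_below(bound):
    """Return all primes p < bound using the sieve of Eratosthenes."""
    if bound < 3:
        return []
    is_prime = [True] * bound
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(bound ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, bound, i):
                is_prime[j] = False
    return [n for n in range(bound) if is_prime[n]]

def euler_product(s, P):
    """Partial Euler product: product of 1/(1 - p^(-s)) over primes p < P."""
    product = 1.0
    for p in primes_below(P):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

for P in (10, 100, 10000):
    print(P, euler_product(2.0, P), pi ** 2 / 6)
```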
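Remark 68. When s = 1, the zeta-function is the harmonic series, which is the model-case
example of a divergent series. A nice argument to see this, due to Oresme (c. 1350),
is by regrouping the terms as follows
\[ \begin{aligned} & 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \ldots \\ &> 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \ldots \\ &= 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots = \infty \end{aligned} \]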
Corollary 69. There exist infinitely many prime numbers.
Proof. Suppose for contradiction that there are only finitely many primes. Then
the finite product $\prod_{p} (1 - p^{-s})^{-1}$ converges for any s. In particular, this is also true as
s → 1, which would imply the convergence of the harmonic series, a contradiction.
More interestingly, Euler derived the following stronger version of the infinitude of
primes using the convergence of the zeta function.
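Theorem 34 (Euler, 1737).
\[ \sum_{p} \frac{1}{p} = \infty. \]
Proof. Let s > 1. Taking the logarithm of the Euler product, we have
\[ \log \zeta(s) = - \sum_{p} \log(1 - p^{-s}). \]
When |x| < 1, log(1 − x) has Maclaurin expansion $\log(1 - x) = - \sum_{m=1}^{\infty} \frac{x^m}{m}$ so that
\[ \log \zeta(s) = \sum_{p} \sum_{m=1}^{\infty} \frac{1}{m p^{ms}} = \sum_{p} \frac{1}{p^s} + \sum_{p} \sum_{m=2}^{\infty} \frac{1}{m p^{ms}}. \]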
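We can show that this last series is uniformly bounded for any choice of s > 1:
\[ \sum_{p} \sum_{m \ge 2} \frac{1}{m p^{ms}} \le \frac{1}{2} \sum_{p} \sum_{m \ge 2} \frac{1}{p^{ms}} = \frac{1}{2} \sum_{p} \frac{1}{p^s (p^s - 1)} < \frac{1}{2} \sum_{n \ge 2} \frac{1}{n(n-1)} = \frac{1}{2}. \]
Hence
\[ \left| \sum_{p} \frac{1}{p^s} - \log \zeta(s) \right| < \frac{1}{2}, \]
and the statement now follows by taking s → 1.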
This clearly again implies that there are infinitely many primes (if the sum was
finite, it would converge) but by doing so quantitatively, it provides some more information
on the density of primes; for example, we now see that primes are denser than
numbers of the form $n^2$, since ζ(2) converges. This result is often cited as the ‘birth of
analytic number theory.’
The constant factor $\frac{1}{4}$ warrants some explanation: recall that N(α) = N(β) whenever
β = (unit) · α and that there are four units, namely ±1, ±i. For comparison, the units
in Z are ±1 and
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \frac{1}{2} \sum_{\substack{n \in \mathbb{Z} \\ n \ne 0}} \frac{1}{|n|^s}. \]
While one might be brought to study ζ(s, Z[i]) for its own sake, it also appears naturally
as the Dirichlet series (generating function) for the sum-of-squares function
\[ r_2(n) := \#\{(a, b) \in \mathbb{Z}^2 \mid a^2 + b^2 = n\} = \#\{\alpha \in \mathbb{Z}[i] \mid N(\alpha) = n\}. \]
Until now, we have been interested in numbers n that can be represented as $n = a^2 + b^2$
with a, b ∈ N; we introduce the modified sum-of-squares function
\[ \tilde{r}_2(n) := \#\{(a, b) \in \mathbb{N}^2 \mid a^2 + b^2 = n\} = \frac{1}{4} r_2(n). \]
Then
\[ \zeta(s, \mathbb{Z}[i]) = \frac{1}{4} \sum_{n=1}^{\infty} \frac{r_2(n)}{n^s} = \sum_{n=1}^{\infty} \frac{\tilde{r}_2(n)}{n^s}. \]
Our goal is thus to obtain information on r̃2 (n) through the study of its generating
function ζ(s, Z[i]). Inspired by the theory of the (Riemann) zeta function, our first
step will be to compute the Euler product of ζ(s, Z[i]). For this, we first recall, for the
reader’s convenience, the characterization of Gaussian primes established in Chapter
11.1. We saw that if π is a Gaussian prime then its norm has one of the following
forms:
(1) N (π) = 2;
(2) $N(\pi) = p^2$ with p ≡ 3 (mod 4) prime;
(3) N(π) = p with p ≡ 1 (mod 4) prime;
and these three cases correspond to
(1) π = 1 + i (up to multiplication by a unit);
(2) π = p with p ≡ 3 (mod 4) prime (up to multiplication by a unit);
(3) p ≡ 1 (mod 4) prime factors uniquely (up to multiplication by a unit) as
$p = \pi \bar{\pi}$, with π, $\bar{\pi}$ distinct (i.e., not associates) primes.
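Hence the Euler product is given by
\[ \begin{aligned} \zeta(s, \mathbb{Z}[i]) &= \frac{1}{4} \prod_{\pi} (1 - N(\pi)^{-s})^{-1} \\ &= (1 - 2^{-s})^{-1} \prod_{p \equiv 3 \ (\mathrm{mod}\ 4)} (1 - p^{-2s})^{-1} \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} (1 - p^{-s})^{-2} \\ &= \prod_{p} (1 - p^{-s})^{-1} \prod_{p \equiv 1 \ (\mathrm{mod}\ 4)} (1 - p^{-s})^{-1} \prod_{p \equiv 3 \ (\mathrm{mod}\ 4)} (1 + p^{-s})^{-1} \\ &= \prod_{p} (1 - p^{-s})^{-1} \prod_{p \text{ odd}} \left(1 - \left(\tfrac{-1}{p}\right) p^{-s}\right)^{-1}. \end{aligned} \]
Observe that the constant factor $\frac{1}{4}$ disappears since each Gaussian prime appears 4
times (via multiplication by units). When s > 1, we recognize the first product to be
ζ(s); can one also write the second product as a zeta function? For this we introduce
the following extension of the Legendre symbol $\left(\frac{-1}{p}\right)$:
\[ \chi(n) := \begin{cases} 1 & \text{if } n \equiv 1 \pmod 4; \\ -1 & \text{if } n \equiv 3 \pmod 4; \\ 0 & \text{if } 2 \mid n. \end{cases} \]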
Proposition 70. The Dirichlet character χ(n) is completely multiplicative and the
Dirichlet L-function
\[ L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \]
converges absolutely for s > 1 and admits the Euler product $L(s, \chi) = \prod_{p} (1 - \chi(p) p^{-s})^{-1}$.
Finally the development of the Euler product follows the same proof as for ζ(s), using
that χ(n) is completely multiplicative.
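for each n ≥ 1. It is now easy to check that $\tilde{r}_2(n)$ is multiplicative; indeed if (m, n) = 1
then
\[ \sum_{d \mid m} \chi(d) \sum_{d \mid n} \chi(d) = \sum_{\substack{d_1 \mid m \\ d_2 \mid n}} \chi(d_1 d_2) = \sum_{d \mid mn} \chi(d). \]
Hence it suffices to determine the values of $\tilde{r}_2(n)$ on powers of primes, and this is easily
computed to be
\[ \tilde{r}_2(p^k) = \sum_{j=0}^{k} \chi(p)^j = \begin{cases} 1 & \text{if } p = 2; \\ k + 1 & \text{if } p \equiv 1 \pmod 4; \\ 0 & \text{if } p \equiv 3 \pmod 4 \text{ and } k \text{ odd}; \\ 1 & \text{if } p \equiv 3 \pmod 4 \text{ and } k \text{ even}. \end{cases} \tag{12.2} \]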
In particular, we obtain the following general form of Fermat’s theorem on sums of two
squares “for free.”
Theorem 36 (Fermat). A natural number n can be expressed as a sum of two
squares (i.e., $n = a^2 + b^2$ for a, b ∈ N) if and only if any prime p ≡ 3 (mod 4)
appearing in the prime factorization of n appears as an even power.
Proof. Let $n = p_1^{k_1} \cdots p_r^{k_r}$. We have $\tilde{r}_2(n) = 0$ if and only if $\tilde{r}_2(p_i^{k_i}) = 0$ for some
1 ≤ i ≤ r and this holds if and only if $p_i \equiv 3 \pmod 4$ and $k_i$ is odd by (12.2).
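The divisor-sum description of the sum-of-squares function lends itself to a brute-force check. The sketch below counts representations over all of Z² (so it computes r₂(n), with signs and order taken into account) and compares with four times the sum of χ(d) over the divisors d of n; the function names are chosen here for illustration.

```python
from math import isqrt

def chi(n):
    """The character: 1 if n = 1 (mod 4), -1 if n = 3 (mod 4), 0 if n is even."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def r2(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            count += 1 if b == 0 else 2      # b and -b
    return count

def chi_divisor_sum(n):
    """Sum of chi(d) over all positive divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)

print([(n, r2(n), 4 * chi_divisor_sum(n)) for n in (2, 5, 9, 21, 25, 65)])
```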
Proof. We start from the simple observation that cos(π/4) = sin(π/4); hence
tan(π/4) = 1 and taking the inverse, π/4 = arctan(1). By the fundamental theorem
of calculus, we may thus write
\[ \frac{\pi}{4} = \arctan(1) = \int_0^1 \frac{dx}{1 + x^2}. \]
Whenever x ∈ [0, 1) we have the geometric series expansion
\[ \frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}. \]
(4) Use Abel summation to prove that if A(x) is uniformly bounded, then the
Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ is convergent for s > 0.
(5) Let
\[ \eta(s) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n^s} \]
be the alternating zeta function. Show that $\eta(s) = (2^{1-s} - 1)\zeta(s)$ and use
the previous exercise to show that η(s) is convergent for s > 0.