
Number Theory: Lecture Notes

This document provides lecture notes on number theory. It covers topics such as divisibility, greatest common divisors, prime numbers, modular arithmetic, primitive roots, quadratic residues, representation of integers by quadratic forms, Diophantine equations, continued fractions, Diophantine approximations, and quadratic number fields. The notes are intended for a number theory course taught at Carnegie Mellon University and reference additional texts for more advanced topics.


Number Theory

Lecture Notes

Vahagn Aslanyan (www.math.cmu.edu/~vahagn/)
Contents

1 Divisibility
1.1 Greatest common divisors
1.2 Linear Diophantine equations
1.3 Primes and irreducibles
1.4 The fundamental theorem of arithmetic
1.5 Exercises

2 Multiplicative functions
2.1 Basics
2.2 The Möbius inversion formula
2.3 Exercises

3 Modular arithmetic
3.1 Congruences
3.2 Wilson’s theorem
3.3 Fermat’s little theorem
3.4 The Chinese Remainder Theorem
3.5 Euler’s totient function
3.6 Euler’s theorem
3.7 Exercises

4 Primitive roots
4.1 The order of an element
4.2 Primitive roots modulo a prime
4.3 Primitive roots modulo prime powers
4.4 The structure of (Z/2^k Z)^×
4.5 Exercises

5 Quadratic residues
5.1 The Legendre symbol and Euler’s criterion
5.2 Gauss’s lemma
5.3 The quadratic reciprocity law
5.4 Composite moduli
5.5 Exercises

6 Representation of integers by some quadratic forms
6.1 Sums of two squares
6.2 Sums of four squares
6.3 Exercises

7 Diophantine equations
7.1 The Pythagorean equation
7.2 A special case of Fermat’s last theorem
7.3 Rational points on quadratic curves
7.4 Exercises

8 Continued fractions
8.1 Finite continued fractions
8.2 Representation of rational numbers by continued fractions
8.3 Infinite continued fractions
8.4 Periodic continued fractions
8.5 Pell’s equation
8.6 Exercises

9 Diophantine approximations
9.1 Dirichlet’s theorem
9.2 Better approximations
9.3 Transcendental numbers and Liouville’s theorem
9.4 Transcendence of e
9.5 Exercises

10 Quadratic number fields
10.1 Units
10.2 Gaussian integers
10.3 Fermat’s little theorem for Gaussian integers
10.4 Using Gaussian integers to solve Diophantine equations
10.5 Exercises

11 Chebyshev’s theorem
11.1 Basic estimates
11.2 Proof of Chebyshev’s theorem
11.3 On the prime number theorem
11.4 Exercises

Bibliography

Preface

Dominus Illuminatio Mea.


Psalm 27:1

These are lecture notes for the Number Theory course taught at CMU
in Fall 2017 and Fall 2018. I used several texts when preparing these notes.
In particular, most of the material can be found in [Bak12, Gre17, HW80].
The books [Bak12, HW80] go way beyond the material of these notes and
the reader is referred to those books for more advanced topics.

Please email me if you notice any mistakes or typos.

Synopsis
Divisibility in the ring of integers, primes, the fundamental theorem of arithmetic. Multiplicative functions, the Möbius inversion formula. Modular arithmetic, Wilson’s theorem, Fermat’s little theorem, Euler’s theorem, the Chinese Remainder Theorem. Primitive roots. Quadratic residues, Gauss’s law of quadratic reciprocity. Fermat’s theorem on sums of two squares, Lagrange’s theorem on sums of four squares. Some classical Diophantine equations. Continued fractions, Pell’s equation. Diophantine approximations, Liouville numbers, algebraic and transcendental numbers. Quadratic number fields, Gaussian integers. Chebyshev’s theorem, a weak version of the prime number theorem.

Notations
• The sets of natural numbers (positive integers), integers, rationals, reals and complex numbers will be denoted by N, Z, Q, R, C respectively. (The standard convention is that 0 ∈ N but in this course it is more convenient to assume 0 is not a natural number.)

• For a field (or ring) K the ring of polynomials in indeterminate X with coefficients from K is denoted by K[X], while K(X) is the field of rational functions.

• The degree of a polynomial f(X) is denoted by deg(f).

• The greatest common divisor of two integers a and b is denoted by gcd(a, b).

• For a real number x its integral part, denoted [x], is the greatest integer which does not exceed x. The fractional part of x is defined as {x} = x − [x].

More notations will be introduced throughout the text.


Chapter 1

Divisibility

1.1 Greatest common divisors


Definition 1.1. For two integers a and b with a ≠ 0 we say that a divides b
or b is divisible by a and write a | b, iff there is an integer c such that b = ca.

Definition 1.2. The greatest common divisor of two integers a and b, de-
noted gcd(a, b), is defined as a positive number d which divides a and b and
is divisible by every common divisor of a and b.

Proposition 1.3. The greatest common divisor of any two numbers a and
b, which are not simultaneously zero, exists and is unique. It is the biggest
among the common divisors of a and b.

Proof. Denote $d := \min\{ax + by : x, y \in \mathbb{Z},\ ax + by > 0\}$. We claim that $d = \gcd(a, b)$.
If $d' \mid a, b$ then clearly $d' \mid ax + by$ for all integers $x, y$ and hence $d' \mid d$. Further, if $a = dq + r$ for some $r$ with $0 \le r < d$, then it is easy to see that $r = a - dq = ax' + by'$ for some $x', y' \in \mathbb{Z}$. Now if $r > 0$ then we get a contradiction to minimality of $d$. Hence $r = 0$, that is, $d \mid a$; the same argument shows that $d \mid b$.
Uniqueness of $d$ is evident.

Lemma 1.4. For any integers a, b there are integers x, y such that

gcd(a, b) = ax + by.

Proof 1. This follows immediately from the proof of the previous result.
Proof 2. (Euclid’s algorithm)
Let $a = bq_0 + r_0$ for some $0 \le r_0 < b$. If $r_0 = 0$ then $\gcd(a, b) = b$. Otherwise let $b = q_1 r_0 + r_1$ with $0 \le r_1 < r_0$. Now if $r_1 = 0$ then $\gcd(a, b) = r_0$. Otherwise continue the process and divide $r_0$ by $r_1$ with remainder. In the $(k+2)$-th step we get $r_{k-1} = q_{k+1} r_k + r_{k+1}$ with $0 \le r_{k+1} < r_k$. The sequence of non-negative integers $r_i$ is strictly decreasing so the process must terminate at some point, that is, $r_{k+1} = 0$ for some $k$. Then $r_k = \gcd(a, b)$ and going up from the bottom we see that $r_k$ can be written as a linear combination of $a$ and $b$ with integer coefficients.
Remark 1.5. Euclid’s algorithm allows one to compute the numbers x and y
for which ax + by = gcd(a, b).
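
Remark 1.5 can be made concrete with a short program. The following is a minimal sketch of the extended Euclidean algorithm in Python; the function name `extended_gcd` and the iterative bookkeeping of coefficients are illustrative choices, not part of the notes.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g."""
    # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r      # one division step of Euclid's algorithm
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:                        # normalise so that the gcd is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


g, x, y = extended_gcd(84, 120)
print(g, x, y)                           # 12 3 -2
assert 84 * x + 120 * y == g
```

The coefficients are not unique; the pair returned depends on how the remainders are tracked.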

Definition 1.6. Two integers are called coprime or relatively prime if their
greatest common divisor is 1.

Corollary 1.7. If a and b are coprime and a | bc then a | c.

Proof. There are integers x, y such that ax+by = 1. This implies acx+bcy =
c. Now a divides the left hand side and so it must divide the right hand side
as well.

1.2 Linear Diophantine equations


Theorem 1.8. Let $a, b, c$ be integers with $a, b \neq 0$. Then the linear equation

$$ax + by = c$$

has an integer solution, that is, a solution $(x, y) \in \mathbb{Z}^2$, if and only if $d := \gcd(a, b) \mid c$. Given one solution $(x_0, y_0)$, all solutions are of the form $x = x_0 + k \cdot \frac{b}{d}$, $y = y_0 - k \cdot \frac{a}{d}$ for $k \in \mathbb{Z}$.

Proof. If the equation has a solution $(x_0, y_0)$ then obviously $d \mid ax_0 + by_0 = c$. Conversely, if $c = dl$ then since $d = am + bn$ for some integers $m, n$, we know that $(ml, nl)$ is a solution.
If $(x, y)$ and $(x_0, y_0)$ are two solutions then $a(x - x_0) + b(y - y_0) = 0$. Denote $u = x - x_0$, $v = y_0 - y$. Then $au = bv$. If $a = da'$, $b = db'$ then $\gcd(a', b') = 1$ and $a'u = b'v$ and so $a' \mid v$, $b' \mid u$. Therefore $v = a'k$, $u = b'k$ for some integer $k$. The result follows.
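
The proof of Theorem 1.8 is effective, and the sketch below follows it step by step in Python. The helper names `bezout` and `solve_linear_diophantine` are hypothetical, and the recursive computation of Bézout coefficients is just one convenient way to obtain the integers $m, n$ with $d = am + bn$.

```python
def bezout(a, b):
    """Return (d, m, n) with d = gcd(a, b) > 0 and a*m + b*n == d (a, b not both 0)."""
    if b == 0:
        return (abs(a), 1 if a > 0 else -1, 0)
    d, m, n = bezout(b, a % b)
    # d == b*m + (a % b)*n and a % b == a - (a // b)*b
    return d, n, m - (a // b) * n


def solve_linear_diophantine(a, b, c):
    """Describe all integer solutions of a*x + b*y = c.

    Returns None if there is no solution, otherwise (x0, y0, step_x, step_y)
    meaning x = x0 + k*step_x, y = y0 - k*step_y for k in Z.
    """
    d, m, n = bezout(a, b)
    if c % d != 0:
        return None
    l = c // d
    return m * l, n * l, b // d, a // d


x0, y0, step_x, step_y = solve_linear_diophantine(6, 15, 9)
for k in range(-3, 4):
    assert 6 * (x0 + k * step_x) + 15 * (y0 - k * step_y) == 9
```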

1.3 Primes and irreducibles


Definition 1.9. An integer p ≠ 0, ±1 is called irreducible if a | p implies
a = ±1 or a = ±p. A number p ≠ 0, ±1 is prime if whenever p | ab, we have
p | a or p | b.

Here 1 and −1 are the only units of the ring Z, that is, integers with a
multiplicative inverse in Z.
Lemma 1.10. An integer p is prime iff it is irreducible.
Proof. Suppose p is prime and p = ab. Then p | ab, hence p | a or p | b.
Assume the former holds. On the other hand a and b divide p. This shows
that b = ±1 and a = ±p.
Now suppose p is irreducible and p | ab. If p ∤ a then gcd(p, a) = 1, since the only divisors of p are ±1 and ±p.
Therefore p | b by Corollary 1.7.
Note that primes and irreducibles may not coincide in other rings.
Lemma 1.11. Every integer n ≠ 0, ±1 has a prime divisor.
Proof. The number min{d : 1 < d, d | n} is a prime divisor of n.

1.4 The fundamental theorem of arithmetic


Theorem 1.12 (Fundamental theorem of arithmetic). Every natural number
n > 1 can be factored into a product of (positive) primes in a unique way (up
to the order of the factors).
Proof. First we prove the existence of a factorisation. Take a prime divisor $p_1$ of $n$ and write $n = p_1 n_1$. Now, by induction, $n_1$ is a product of primes and hence so is $n$.
Now let us prove uniqueness. Suppose $p_1 \cdots p_k = q_1 \cdots q_s$ where all $p_i$ and $q_j$ are prime. Since $p_1$ divides the right hand side, it must divide one of the factors $q_j$, say $q_1$ (after reordering the $q_j$’s). However, $q_1$ is prime and hence $p_1 = q_1$. Thus, $p_2 \cdots p_k = q_2 \cdots q_s$ and we can proceed by induction. This shows in particular that $k = s$ and the two collections of primes $p_1, \ldots, p_k$ and $q_1, \ldots, q_s$ coincide up to reordering.
Proposition 1.13 (Euclid). There are infinitely many primes.
Proof. Suppose there are only finitely many primes. Denote those by $p_1, \ldots, p_k$. Consider the number $N := p_1 \cdots p_k + 1$. If $q$ is a prime divisor of $N$ (which exists!) then it must be different from $p_1, \ldots, p_k$ which is a contradiction.
Proof 2. (Euler) Suppose there are finitely many primes, namely $p_1, \ldots, p_k$. Then
$$\prod_{i=1}^{k} \frac{1}{1 - p_i^{-1}} = \prod_{i=1}^{k} \sum_{l=0}^{\infty} p_i^{-l} = \sum_{n=1}^{\infty} \frac{1}{n} = \infty,$$
which is a contradiction. Note that the second equality follows from the fundamental theorem.

1.5 Exercises
1. Show that

(i) every two consecutive integers are relatively prime,

(ii) every two consecutive odd integers are relatively prime.

2. Find all integers $x$ and $y$ for which $84x + 120y = 24$.

3. Let $p$ and $q$ be prime numbers. Suppose that the polynomial $x^2 - px + q$ has an integer root. Find all possible values of $p$ and $q$.

4. Show that a natural number is divisible by 3 iff the sum of its digits is divisible by 3.

5. Prove that $4 \nmid n^2 + 1$ for any integer $n$.

6. Show that $\gcd(a^m - 1, a^n - 1) = a^{\gcd(m,n)} - 1$.

7. Show that if for a positive integer $n$ the number $2^n - 1$ is prime then $n$ is prime as well.

8. Find all integers $a, b$ for which $a^2 b^2 \mid a^4 + a^3 b + a b^3 + b^4$.

9. Prove that for every positive integer $n$ there are $n$ consecutive numbers none of which is prime.

10. Show that if for a positive integer $n$ the number $2^n + 1$ is prime then $n$ must be a power of 2.

11. Let $a, b, c, n \in \mathbb{N}$ with $\gcd(a, b) = 1$ and $ab = c^n$. Prove that both $a$ and $b$ must be $n$-th powers of some positive integers.

12. Prove that there are infinitely many primes of the form $4k + 3$, $k \in \mathbb{N}$.

13. Let $a, b$ be positive integers. Show that if $\gcd(a, b) = 1$ then $\gcd(a + b, a^2 - ab + b^2)$ is either 1 or 3.

14. Show that if $p$ is an odd prime and $\gcd(a, b) = 1$ then
$$\gcd\left(a + b, \frac{a^p + b^p}{a + b}\right) = 1 \text{ or } p.$$

Chapter 2

Multiplicative functions

2.1 Basics
Definition 2.1. A function f : N → C is multiplicative if f (1) = 1 and
whenever m, n are coprime natural numbers, we have f (mn) = f (m)f (n).
Lemma 2.2. If $f$ is multiplicative and $g(n) = \sum_{d \mid n} f(d)$ then $g$ is also multiplicative.
Proof. If $\gcd(m, n) = 1$ then any divisor $d$ of $mn$ can be factored into a product $ab$ with $a \mid m$ and $b \mid n$. The numbers $a$ and $b$ are determined uniquely by $d, m, n$. Therefore
$$g(mn) = \sum_{d \mid mn} f(d) = \sum_{a \mid m,\ b \mid n} f(ab) = \sum_{a \mid m,\ b \mid n} f(a) f(b) = \sum_{a \mid m} f(a) \cdot \sum_{b \mid n} f(b) = g(m) g(n).$$

Example 2.3. If $\tau(n)$ and $\sigma(n)$ are respectively the number and the sum of all positive divisors of a natural number $n$, then those are multiplicative. Indeed, $\tau(n) = \sum_{d \mid n} 1$ and $\sigma(n) = \sum_{d \mid n} d$ and the functions $f(n) = 1$ and $f(n) = n$ are obviously multiplicative.
An important multiplicative function, namely Euler’s totient function, will be introduced in the next chapter.
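
As a quick sanity check of Example 2.3, the following Python sketch computes $\tau$ and $\sigma$ by brute force and tests multiplicativity on small coprime pairs. The helper names (`divisors`, `tau`, `sigma`) are chosen here for illustration only.

```python
from math import gcd


def divisors(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    return len(divisors(n))      # number of positive divisors


def sigma(n):
    return sum(divisors(n))      # sum of positive divisors


# f(mn) = f(m) f(n) is only claimed for coprime m and n.
for m in range(1, 40):
    for n in range(1, 40):
        if gcd(m, n) == 1:
            assert tau(m * n) == tau(m) * tau(n)
            assert sigma(m * n) == sigma(m) * sigma(n)

# Coprimality matters: tau(4) = 3 while tau(2) * tau(2) = 4.
assert tau(4) != tau(2) * tau(2)
```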
The following is obvious.
Lemma 2.4. If $f$ is multiplicative and $n = \prod_i p_i^{k_i}$ is the prime factorisation of $n$ then
$$f(n) = \prod_i f(p_i^{k_i}).$$

Thus, in order to show that two multiplicative functions are identically equal, it suffices to show they agree on prime powers.

2.2 The Möbius inversion formula


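Definition 2.5. The Möbius function $\mu(n)$ is defined as
$$\mu(n) = \begin{cases} 0, & \text{if } n \text{ is divisible by the square of a prime}, \\ (-1)^k, & \text{if } n \text{ is a product of } k \text{ distinct primes}. \end{cases}$$
In particular $\mu(1) = 1$ (the case $k = 0$).

It is obvious that $\mu$ is multiplicative. Hence, by Lemma 2.2, the function
$$\nu(n) = \sum_{d \mid n} \mu(d)$$
is multiplicative as well.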

Lemma 2.6.
$$\nu(n) = \begin{cases} 0, & \text{if } n > 1, \\ 1, & \text{if } n = 1. \end{cases}$$

Proof. Since both the left hand side and the right hand side are multiplicative functions, it suffices to establish the equality for $n = p^k$ for primes $p$. This is left as an exercise.
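
Lemma 2.6 is easy to test numerically. The sketch below computes $\mu$ by trial division and checks that $\nu(n) = \sum_{d \mid n} \mu(d)$ vanishes for $1 < n < 200$; the function names `mobius` and `nu` are illustrative, and this check is of course no substitute for the exercise.

```python
def mobius(n):
    """Moebius function of n >= 1, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result      # one more distinct prime factor
        p += 1
    if n > 1:
        result = -result          # a single prime factor is left over
    return result


def nu(n):
    """Sum of mobius(d) over all positive divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)


assert nu(1) == 1
assert all(nu(n) == 0 for n in range(2, 200))
```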

Proposition 2.7 (Möbius Inversion Formula). If $f : \mathbb{N} \to \mathbb{C}$ and $g(n) = \sum_{d \mid n} f(d)$ then
$$f(n) = \sum_{d \mid n} \mu(d) g(n/d).$$

Note that here $f$ does not have to be multiplicative.

Proof. We have
$$\sum_{d \mid n} \mu(d) g(n/d) = \sum_{d \mid n} \sum_{d' \mid \frac{n}{d}} \mu(d) f(d') = \sum_{d' \mid n} \sum_{d \mid \frac{n}{d'}} \mu(d) f(d') = \sum_{d' \mid n} f(d') \sum_{d \mid \frac{n}{d'}} \mu(d) = \sum_{d' \mid n} f(d') \nu(n/d') = f(n).$$

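Proposition 2.8. If
$$f(n) = \sum_{d \mid n} \mu(d) g(n/d)$$
then $g(n) = \sum_{d \mid n} f(d)$.

Proof. We have
$$\sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{d' \mid d} \mu(d') g(d/d') = \sum_{d' \mid n} \sum_{d :\, d' \mid d,\ d \mid n} \mu(d') g(d/d') = \sum_{d' \mid n} \sum_{a \mid \frac{n}{d'}} \mu(d') g(a) = \sum_{a \mid n} g(a) \sum_{d' \mid \frac{n}{a}} \mu(d') = \sum_{a \mid n} g(a) \nu(n/a) = g(n).$$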
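
The inversion formula can be illustrated on a function that is deliberately not multiplicative. In the sketch below $\mu$ is computed from the relation of Lemma 2.6, rearranged as $\mu(n) = -\sum_{d \mid n,\, d < n} \mu(d)$ for $n > 1$; the test function `f` and all helper names are arbitrary choices made for this example.

```python
from functools import lru_cache


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


@lru_cache(maxsize=None)
def mu(n):
    """Moebius function via Lemma 2.6: the sum of mu over the divisors of n > 1 is 0."""
    if n == 1:
        return 1
    return -sum(mu(d) for d in divisors(n) if d < n)


def f(n):
    return n * n + 1              # arbitrary, not multiplicative


def g(n):
    return sum(f(d) for d in divisors(n))


def f_recovered(n):
    """Right hand side of the Moebius inversion formula."""
    return sum(mu(d) * g(n // d) for d in divisors(n))


assert all(f_recovered(n) == f(n) for n in range(1, 100))
```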

2.3 Exercises
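1. For a complex number $s$ denote
$$\sigma_s(n) := \sum_{d \mid n} d^s.$$
Prove that $\sigma_s$ is multiplicative. Notice that $\sigma_0 = \tau$, $\sigma_1 = \sigma$.

2. If $n$ has $k$ distinct prime factors, show that $\sum_{d \mid n} |\mu(d)| = 2^k$.

3. Prove that
$$\sum_{d \mid n} \frac{\mu(d)^2}{\varphi(d)} = \frac{n}{\varphi(n)}.$$

4. Prove that $\tau(n) \le 2\sqrt{n}$ for all $n \in \mathbb{N}$.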
Chapter 3

Modular arithmetic

3.1 Congruences
Definition 3.1. For $a, b, m \in \mathbb{Z}$ with $m \neq 0$ it is said that $a$ is congruent to $b$ modulo $m$, written $a \equiv b \bmod m$, if $m \mid b - a$.
This gives rise to residue classes mod $m$. More precisely for an integer $a$ we denote $[a]_m := \{b \in \mathbb{Z} : a \equiv b \bmod m\}$. Obviously $m\mathbb{Z} = \{mk : k \in \mathbb{Z}\}$ is an ideal of the ring of integers and $[a]_m = a + m\mathbb{Z}$ is just its coset with a representative $a$. Then the quotient ring $\mathbb{Z}/m\mathbb{Z}$ consists of the residue classes modulo $m$ and the operations are defined by
$$(a + m\mathbb{Z}) + (b + m\mathbb{Z}) = (a + b) + m\mathbb{Z},$$
$$(a + m\mathbb{Z}) \cdot (b + m\mathbb{Z}) = (ab) + m\mathbb{Z}.$$
We will often identify $a + m\mathbb{Z}$ with the integer $a$ and by abuse of notation write $\mathbb{Z}/m\mathbb{Z} = \{0, 1, \ldots, m - 1\}$.
Definition 3.2. For a ring $R$ its multiplicative group $R^\times$ is the set of all invertible elements of $R$ which is a group under multiplication of the ring.
Lemma 3.3. For $m \neq 0$ we have
$$(\mathbb{Z}/m\mathbb{Z})^\times = \{a + m\mathbb{Z} : \gcd(a, m) = 1\}.$$
Note that here $\gcd(a, m)$ does not depend on the choice of the representative $a$.
Proof. Clearly, $a + m\mathbb{Z}$ is invertible iff $ak \equiv 1 \bmod m$ for some integer $k$. Such a $k$ exists iff $\gcd(a, m) = 1$.
Corollary 3.4. If $p$ is prime then every non-zero element in $\mathbb{Z}/p\mathbb{Z}$ is invertible. Hence it is a field and is normally denoted by $\mathbb{F}_p$.


3.2 Wilson’s theorem


Theorem 3.5. If $p$ is prime then $(p - 1)! \equiv -1 \bmod p$.

Proof. Pair each element of $\mathbb{F}_p^\times = \{1, 2, \ldots, p - 1\}$ with its inverse. If $a = a^{-1}$ then $a^2 \equiv 1 \bmod p$ and hence $a \equiv \pm 1 \bmod p$. Thus, if $a \not\equiv \pm 1 \bmod p$ then its inverse is different from itself. The product of all those numbers and their inverses is clearly 1 modulo $p$. Multiplying this product by the two remaining elements $1$ and $-1$ we get the desired result.
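
Wilson’s theorem is simple to confirm for small primes. The sketch below uses a naive primality test (`is_prime`, written only for this example) and checks the congruence $(p-1)! \equiv -1 \bmod p$ directly.

```python
from math import factorial


def is_prime(n):
    """Naive primality test by trial division."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def wilson_holds(n):
    """True iff (n - 1)! is congruent to -1 modulo n."""
    return factorial(n - 1) % n == n - 1


# Theorem 3.5 for all primes below 200.
assert all(wilson_holds(p) for p in range(2, 200) if is_prime(p))

# The congruence fails for the composite number 15, since 15 divides 14!.
assert not wilson_holds(15)
```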

3.3 Fermat’s little theorem


Theorem 3.6. If $p$ is prime and $p \nmid a$ then $a^{p-1} \equiv 1 \bmod p$.

Remark 3.7. This is equivalent to the following: $a^p \equiv a \bmod p$ for all integers $a$.
This is a special case of Euler’s theorem that we prove later. The standard proof of Fermat’s little theorem presented in textbooks is actually a special case of the proof of Euler’s theorem. To avoid repetition, we now give a different proof of Fermat’s theorem.

Proof. We prove the equivalent statement of Remark 3.7. We can assume $a > 0$ and we use induction on $a$. For $a = 1$ the theorem is obvious. Now since $\binom{p}{k}$ is divisible by $p$ for $0 < k < p$, we deduce that
$$(a + 1)^p = \sum_{k=0}^{p} \binom{p}{k} a^k \equiv a^p + 1^p \equiv a + 1 \bmod p,$$
where the last congruence follows from the induction hypothesis.

3.4 The Chinese Remainder Theorem


Theorem 3.8 (Chinese Remainder Theorem). Let $a_i, m_i \in \mathbb{Z}$, $i = 1, \ldots, n$, with $m_i \neq 0$ and $\gcd(m_i, m_j) = 1$ for $i \neq j$. Then there is an integer $x$ such that $x \equiv a_i \bmod m_i$ for all $i$.

This follows immediately from the following result which is also known as the Chinese Remainder Theorem.
Theorem 3.9. Let $m_1, \ldots, m_n$ be pairwise coprime and denote $M := \prod_{i=1}^{n} m_i$. Then
$$\mathbb{Z}/M\mathbb{Z} \cong \mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_n\mathbb{Z}.$$

Proof. Consider the map
$$\psi : \mathbb{Z}/M\mathbb{Z} \to \mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_n\mathbb{Z},$$
$$\psi : x + M\mathbb{Z} \mapsto (x + m_1\mathbb{Z}, \ldots, x + m_n\mathbb{Z}).$$
Observe that $x + M\mathbb{Z} = y + M\mathbb{Z}$ iff $x \equiv y \bmod M$ iff $x \equiv y \bmod m_i$, $1 \le i \le n$, iff $x + m_i\mathbb{Z} = y + m_i\mathbb{Z}$, $1 \le i \le n$, iff $\psi(x + M\mathbb{Z}) = \psi(y + M\mathbb{Z})$. This shows that $\psi$ is well defined and injective. Hence, it is surjective since its domain and range have the same cardinality $M$. It is now evident that $\psi$ is also an isomorphism.
Remark 3.10. For the proof of Theorem 3.8 we need only surjectivity of $\psi$. However, the isomorphism of the rings is an important result and we will use it later.
The above proof is not constructive so we give a second proof of Theorem 3.8.
Second proof of Theorem 3.8. Denote $M_i := \frac{M}{m_i}$ where $M = m_1 \cdots m_n$. It is easy to see that $\gcd(M_i, m_i) = 1$. Hence there are integers $k_i$ with $k_i M_i \equiv 1 \bmod m_i$ for all $i$. Now take
$$x = \sum_{i=1}^{n} a_i k_i M_i.$$

Since $M_i \equiv 0 \bmod m_j$ for $i \neq j$, we deduce that $x \equiv a_i \bmod m_i$.


Remark 3.11. Compare this to Lagrange interpolation.
Note that if x solves the above system of congruences then y is also a
solution if and only if x ≡ y mod M .
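
The second proof of Theorem 3.8 translates directly into code. The sketch below assumes positive, pairwise coprime moduli and Python 3.8 or later (for `math.prod` and for `pow` with exponent `-1`, which returns a modular inverse); the function name `crt` is an illustrative choice.

```python
from math import prod


def crt(residues, moduli):
    """Return the unique x in [0, M) with x = a_i (mod m_i), where M = prod(moduli).

    Assumes the moduli are positive and pairwise coprime.
    """
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        k_i = pow(M_i, -1, m_i)      # k_i * M_i = 1 (mod m_i)
        x += a_i * k_i * M_i
    return x % M


print(crt([2, 3, 2], [3, 5, 7]))     # 23
```

Reducing modulo $M$ at the end picks the representative in $\{0, 1, \ldots, M - 1\}$; any other solution differs from it by a multiple of $M$.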

3.5 Euler’s totient function


Definition 3.12 (Euler’s function). For m > 0 define ϕ(m) = #{k : 1 ≤
k ≤ m, gcd(k, m) = 1}.
The following result follows from Lemma 3.3.
Proposition 3.13. The order of the group (Z/mZ)× is ϕ(m).
Lemma 3.14. Euler’s function $\varphi$ is multiplicative. Hence if $n = p_1^{k_1} \cdots p_l^{k_l}$ is the prime factorisation of $n$ then
$$\varphi(n) = \prod_i p_i^{k_i - 1}(p_i - 1).$$

Proof. Theorem 3.9 implies that if $\gcd(m, n) = 1$ then
$$(\mathbb{Z}/mn\mathbb{Z})^\times \cong (\mathbb{Z}/m\mathbb{Z})^\times \times (\mathbb{Z}/n\mathbb{Z})^\times,$$
and the result follows.
For the second part of the lemma notice that if $p$ is prime and $k > 0$ then among the numbers $1, 2, \ldots, p^k$ all multiples of $p$ and only those are not coprime to $p^k$. There are $p^{k-1}$ many such numbers. Therefore $\varphi(p^k) = p^k - p^{k-1}$.
Proposition 3.15. $\sum_{d \mid n} \varphi(d) = n$.
Proof 1. For a divisor $d$ of $n$ denote $A_d := \{a : 1 \le a \le n,\ \gcd(n, a) = d\}$. Then $\{1, \ldots, n\}$ is the disjoint union of $(A_d)_{d \mid n}$ and $|A_d| = \varphi(n/d)$.
Proof 2. Since $\varphi$ is multiplicative, $\sum_{d \mid n} \varphi(d)$ is also multiplicative. So it suffices to establish the equality for $n = p^k$ where $p$ is a prime. In this case
$$\sum_{d \mid n} \varphi(d) = \sum_{i=0}^{k} \varphi(p^i) = 1 + (p - 1) + (p^2 - p) + \cdots + (p^k - p^{k-1}) = p^k.$$
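
The two descriptions of $\varphi$ (Definition 3.12 and the product formula of Lemma 3.14) and the identity of Proposition 3.15 can be compared numerically. The following Python sketch does so for small $n$; the function names are illustrative.

```python
from math import gcd


def phi_by_definition(m):
    """Count 1 <= k <= m with gcd(k, m) == 1 (Definition 3.12)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def phi(n):
    """Euler's function via the product formula, using trial division."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p    # multiply by (1 - 1/p)
        p += 1
    if n > 1:
        result -= result // n        # one prime factor left over
    return result


for n in range(1, 300):
    assert phi(n) == phi_by_definition(n)
    # Proposition 3.15: the values of phi over the divisors of n add up to n.
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
```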

3.6 Euler’s theorem


Theorem 3.16 (Euler). If $\gcd(a, m) = 1$ then $a^{\varphi(m)} \equiv 1 \bmod m$.
Proof 1. Denote $k := \varphi(m)$ and let $m_1, \ldots, m_k$ be a reduced residue system modulo $m$, that is, $\gcd(m_i, m) = 1$ for all $i$ and $m_i \not\equiv m_j \bmod m$ for $i \neq j$. In other words, $(\mathbb{Z}/m\mathbb{Z})^\times = \{m_1, \ldots, m_k\}$.
Observe that $am_1, \ldots, am_k$ is again a reduced residue system mod $m$ and hence it is a permutation of $m_1, \ldots, m_k$ mod $m$. Therefore
$$\prod_i m_i \equiv \prod_i a m_i = a^k \prod_i m_i \bmod m.$$
Since $\gcd\left(\prod_i m_i, m\right) = 1$, we can deduce that $a^k \equiv 1 \bmod m$.

Proof 2. Consider the multiplicative group $(\mathbb{Z}/m\mathbb{Z})^\times$ and its subgroup $(a)$ generated by $a$. If the latter has order (cardinality) $n$ then $a^n = 1$. By Lagrange’s theorem $n$ divides the order of the group $(\mathbb{Z}/m\mathbb{Z})^\times$ which is $\varphi(m)$. Thus, $a^{\varphi(m)} = 1$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ and we are done.
Corollary 3.17 (Fermat’s little theorem). If $p$ is prime and $p \nmid a$ then $a^{p-1} \equiv 1 \bmod p$.
Proof. When $p$ is prime, $\varphi(p) = p - 1$.
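
Euler’s theorem can be checked with Python’s built-in three-argument `pow`, which performs modular exponentiation. The sketch below is only a numerical illustration; `phi` is computed straight from Definition 3.12 and `euler_holds` is a name made up for this example.

```python
from math import gcd


def phi(m):
    """Euler's function, straight from Definition 3.12."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def euler_holds(a, m):
    """Check that a**phi(m) is congruent to 1 modulo m."""
    return pow(a, phi(m), m) == 1 % m


# Theorem 3.16 for all moduli up to 60 and all residues coprime to the modulus.
assert all(euler_holds(a, m)
           for m in range(1, 61)
           for a in range(1, m + 1) if gcd(a, m) == 1)

# A typical use: exponents may be reduced modulo phi(m) when gcd(a, m) = 1.
assert pow(7, 10**6, 100) == pow(7, 10**6 % phi(100), 100)
```

The coprimality assumption cannot be dropped: for example $2^{\varphi(10)} = 16 \equiv 6 \bmod 10$.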

3.7 Exercises
1. Find a positive integer $x$ such that $x \equiv 2 \bmod 4$, $2x \equiv 3 \bmod 9$, $7x \equiv 1 \bmod 11$.

2. Suppose $a^{28} + 28^b$ is prime for some positive integers $a$ and $b > 1$. Prove that $2 \mid b$ or $29 \mid a$.

3. Let $n \ge 6$ be composite. Show that $n \mid (n - 1)!$.

4. For a composite $n$ show that $\varphi(n) \le n - \sqrt{n}$.

5. Let $p > 2$ be a prime number with $p \equiv 3 \bmod 4$. Show that if $p \mid a^2 + b^2$ for some integers $a$ and $b$ then $p \mid a$ and $p \mid b$.

6. Find all integers $x, y$ such that $15x^2 - 7y^2 = 9$.


Chapter 4

Primitive roots

4.1 The order of an element


Definition 4.1. For integers $a, m \neq 0$ with $\gcd(a, m) = 1$ the order of $a$ mod $m$ is its order in the multiplicative group $(\mathbb{Z}/m\mathbb{Z})^\times$, that is,
$$\mathrm{ord}_m(a) = \min\{\gamma \in \mathbb{N} : a^\gamma \equiv 1 \bmod m\}.$$

Lemma 4.2. If $a^n \equiv 1 \bmod m$ then $\mathrm{ord}_m(a) \mid n$. In particular $\mathrm{ord}_m(a) \mid \varphi(m)$.

Proof. Denote $k := \mathrm{ord}_m(a)$ and let $n \equiv l \bmod k$ with $0 \le l < k$. Then $a^l \equiv 1 \bmod m$ and by minimality of $k$ we must have $l = 0$.
The second part of the lemma follows from the first part and Euler’s theorem (or from Lagrange’s theorem directly).
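
For small moduli the order can be found by repeated multiplication. The following sketch (the function name `multiplicative_order` is an illustrative choice, and a positive modulus is assumed) also checks the statement of Lemma 4.2 in one example.

```python
from math import gcd


def multiplicative_order(a, m):
    """Smallest positive k with a**k = 1 (mod m); requires gcd(a, m) == 1 and m > 0."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    k, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        k += 1
    return k


a, m = 3, 14
k = multiplicative_order(a, m)
# Lemma 4.2: a**n = 1 (mod m) exactly when the order divides n.
assert all((pow(a, n, m) == 1) == (n % k == 0) for n in range(1, 50))
```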

Definition 4.3. If ordm (a) = ϕ(m) then a is called a primitive root modulo
m.

In other words, a primitive root is a generator of the multiplicative group


(Z/mZ)× . So there is a primitive root mod m if and only if (Z/mZ)× is
cyclic.
We are going to describe all integers that have a primitive root. More
generally, we will prove a structure theorem about the multiplicative group
(Z/mZ)× . But now we recall two classical results from the theory of finite
groups.

Lemma 4.4. If $G$ is a group and $a, b \in G$ with $ab = ba$ and $\mathrm{ord}(a) = k$, $\mathrm{ord}(b) = l$ and $\gcd(k, l) = 1$ then $\mathrm{ord}(ab) = lk$.

Proof. Let $\mathrm{ord}(ab) = m$. Since $(ab)^{kl} = (a^k)^l (b^l)^k = 1$, we have $m \mid kl$. On the other hand $(ab)^m = 1$, hence $1 = (ab)^{ml} = a^{ml}$. This implies $k \mid ml$ and so $k \mid m$. Similarly, $l \mid m$ and $kl \mid m$.

Lemma 4.5. Let $G$ be a group and $a \in G$ be an element of order $m$. Then for every integer $k$ the order of $a^k$ is equal to $\frac{m}{\gcd(m, k)}$.

Proof. Let $\mathrm{ord}(a^k) =: l$. Then $a^{kl} = 1$ and so $m \mid kl$. Now if $d := \gcd(m, k)$ then $m = dm'$, $k = dk'$ with $\gcd(m', k') = 1$. So $m' \mid k'l$ and $m' \mid l$. Thus $\frac{m}{d} \mid l$. On the other hand $(a^k)^{m/d} = a^{mk'} = 1$.

4.2 Primitive roots modulo a prime


Let $p$ be a prime number. Since $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ is a field, every polynomial equation $f(x) = 0$ has at most $\deg(f)$ solutions in $\mathbb{F}_p$ where $f(X) \in \mathbb{F}_p[X] \setminus \{0\}$. Alternatively, we could say the congruence $f(x) \equiv 0 \bmod p$ has at most $\deg(f)$ integer solutions incongruent mod $p$ where $f(X) \in \mathbb{Z}[X]$ and $p$ does not divide all coefficients of $f$.

Lemma 4.6. For $d \mid p - 1$ the polynomial $X^d - 1$ has exactly $d$ roots in $\mathbb{F}_p$.

Proof. Since $d \mid p - 1$ there is a polynomial $g(X) \in \mathbb{F}_p[X]$ such that
$$X^{p-1} - 1 = (X^d - 1) g(X).$$

Obviously, $\deg(g) = p - 1 - d$ and hence $g(X)$ has at most $p - 1 - d$ roots. However, $X^{p-1} - 1$ has exactly $p - 1$ roots and therefore $X^d - 1$ has at least $p - 1 - (p - 1 - d) = d$ roots. On the other hand, it cannot have more than $d$ roots and therefore it has exactly $d$ roots.

Lemma 4.7. If $p, q$ are primes and $q^k \mid p - 1$ for some $k \ge 1$, then there is $a \in \mathbb{F}_p$ with $\mathrm{ord}_p(a) = q^k$.

Proof. By the above lemma the equation $x^{q^k} - 1 = 0$ has exactly $q^k$ solutions in $\mathbb{F}_p$. Suppose none of them has order $q^k$. This means that for each $a \in \mathbb{F}_p$ with $a^{q^k} = 1$ we have $\mathrm{ord}_p(a) \mid q^{k-1}$. Therefore $a$ is a root of $X^{q^{k-1}} - 1$. But this polynomial has only $q^{k-1}$ roots which is a contradiction.

Theorem 4.8. For every prime $p$ there is a primitive root mod $p$. Equivalently, $\mathbb{F}_p^\times$ is cyclic.

Proof. Let $p - 1 = q_1^{k_1} \cdots q_s^{k_s}$ be the prime factorisation of $p - 1$ and let $a_i \in \mathbb{F}_p^\times$ be of order $q_i^{k_i}$ (such elements exist by Lemma 4.7). Then $g = a_1 \cdots a_s$ has order $p - 1$ by Lemma 4.4.


Remark 4.9. This proof actually shows that every finite subgroup of the multiplicative group of a field is cyclic.
Proof 2. For each divisor $d \mid p - 1$ let $A_d := \{a \in \mathbb{F}_p^\times : \mathrm{ord}_p(a) = d\}$ and $\psi(d) := |A_d|$. Clearly, $(A_d)_{d \mid p-1}$ is a partition of $\mathbb{F}_p^\times$. Therefore
$$\sum_{d \mid p-1} \psi(d) = p - 1.$$

Now suppose $A_d \neq \emptyset$ and let $a \in A_d$. Then $\mathrm{ord}_p(a^k) = \frac{d}{\gcd(d, k)}$ for each $k$. In particular, $\mathrm{ord}_p(a^k) = d$ iff $\gcd(d, k) = 1$, hence the set $A_d \cap \{1, a, \ldots, a^{d-1}\}$ has $\varphi(d)$ elements. Thus $\psi(d) \ge \varphi(d)$. We claim that the equality holds. Indeed, let $b \in \mathbb{F}_p^\times$ with $\mathrm{ord}_p(b) = d$. Then $b$ is a root of the polynomial $X^d - 1$. However, the latter has exactly $d$ roots and those are $1, a, \ldots, a^{d-1}$. Hence $b$ is actually a power of $a$ which proves our claim.

Thus, for every $d$ either $\psi(d) = 0$ or $\psi(d) = \varphi(d)$. Since
$$\sum_{d \mid p-1} \psi(d) = p - 1 = \sum_{d \mid p-1} \varphi(d),$$
$\psi(d) = \varphi(d)$ for every $d \mid p - 1$. In particular $\psi(p - 1) = \varphi(p - 1) > 0$.

Corollary 4.10. There are exactly $\varphi(p - 1)$ primitive roots modulo $p$.
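
Corollary 4.10 can be observed experimentally. The sketch below lists the primitive roots modulo a small prime using a standard consequence of Lemma 4.2: $g$ has order $p - 1$ iff $g^{(p-1)/q} \not\equiv 1 \bmod p$ for every prime $q \mid p - 1$. This test and all function names are choices made for the example, not something taken from the notes.

```python
from math import gcd


def prime_factors(n):
    """Set of prime divisors of n >= 1, by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(g, p):
    """For a prime p, decide whether g generates the multiplicative group mod p."""
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def primitive_roots(p):
    return [g for g in range(1, p) if is_primitive_root(g, p)]


def phi(m):
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


print(primitive_roots(7))            # [3, 5]
for p in (3, 5, 7, 11, 13, 17, 19, 23):
    assert len(primitive_roots(p)) == phi(p - 1)
```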

4.3 Primitive roots modulo prime powers


Theorem 4.11. If $p$ is an odd prime then there is a primitive root mod $p^k$ for every positive integer $k$.

Proof. The proof is split into two steps presented in the following claims.
Claim. If $g$ is a primitive root mod $p$ then either $g$ or $g + p$ must be a primitive root mod $p^2$.
Let $g$ be a primitive root mod $p$. Denote $k := \mathrm{ord}_{p^2}(g)$. We have $k \mid \varphi(p^2) = p(p - 1)$ and $g^k \equiv 1 \bmod p^2$. Since $\mathrm{ord}_p(g) = p - 1$, $k$ must be divisible by $p - 1$. Thus $k = p(p - 1)$ or $k = p - 1$. In the former case $g$ is a primitive root modulo $p^2$. So assume $k = p - 1$, that is, $g^{p-1} \equiv 1 \bmod p^2$. Then
$$(g + p)^{p-1} \equiv g^{p-1} + p(p - 1)g^{p-2} \equiv 1 + p(p - 1)g^{p-2} \not\equiv 1 \bmod p^2.$$

As $g + p$ is a primitive root mod $p$, by the above argument it must be primitive mod $p^2$.

Claim. If $g$ is a primitive root mod $p^2$ then it is primitive mod $p^n$ for all $n \ge 2$.
We prove by induction on $n$ that $\mathrm{ord}_{p^n}(g) = p^{n-1}(p - 1)$. It is true for $n = 2$. Now assume it is true for $n$ and prove it for $n + 1$.
Let $l := \mathrm{ord}_{p^{n+1}}(g)$ so that $l \mid p^n(p - 1)$. On the other hand, since $\mathrm{ord}_{p^n}(g) = p^{n-1}(p - 1)$, either $l = p^n(p - 1)$ or $l = p^{n-1}(p - 1)$. In the former case we are done, so assume the latter is the case.
Now $g^{p^{n-2}(p-1)} \equiv 1 \bmod p^{n-1}$ and therefore $g^{p^{n-2}(p-1)} = 1 + p^{n-1}t$ for some integer $t$. Furthermore, $p \nmid t$ as otherwise we would have $g^{p^{n-2}(p-1)} \equiv 1 \bmod p^n$ which contradicts the induction hypothesis. Thus,
$$g^{p^{n-1}(p-1)} = (1 + p^{n-1}t)^p \equiv 1 + p^n t \not\equiv 1 \bmod p^{n+1},$$
which contradicts the assumption $l = p^{n-1}(p - 1)$.
Note that here we used the fact that $\binom{p}{i}$ is divisible by $p$ for $0 < i < p$ and that $p^{n+1} \mid p^{2n-1}$ for $n \ge 2$. For the last term, where the binomial coefficient is 1, we actually have $p^{n+1} \mid p^{p(n-1)}$ for $p > 2$ and $n \ge 2$.
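
The two claims in the proof of Theorem 4.11 give a recipe for producing a primitive root modulo every power of an odd prime. The Python sketch below applies it to $p = 29$ and $g = 14$, a case in which $g$ itself fails modulo $p^2$; the function names and the brute-force order computation are illustrative only.

```python
def order(a, m):
    """Multiplicative order of a modulo m > 1, assuming gcd(a, m) == 1."""
    k, power = 1, a % m
    while power != 1:
        power = power * a % m
        k += 1
    return k


def lift_to_p_squared(g, p):
    """Given a primitive root g mod an odd prime p, return whichever of
    g and g + p is a primitive root mod p**2 (first claim of the proof)."""
    return g if pow(g, p - 1, p * p) != 1 else g + p


p, g = 29, 14
assert order(g, p) == p - 1              # 14 is a primitive root mod 29 ...
assert pow(g, p - 1, p * p) == 1         # ... but not mod 29**2
h = lift_to_p_squared(g, p)              # so the recipe switches to g + p
for n in range(1, 4):
    # second claim: h stays primitive modulo higher powers of p
    assert order(h, p ** n) == p ** (n - 1) * (p - 1)
```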
There is no primitive root mod $2^k$ for $k > 2$ according to the following result the proof of which is left to the reader as an exercise. Actually, to show that $(\mathbb{Z}/2^k\mathbb{Z})^\times$ is not cyclic it suffices to notice that it has a non-cyclic subgroup, namely $(\mathbb{Z}/8\mathbb{Z})^\times$.
Lemma 4.12. For $k > 2$ and odd $a$ we have $a^{2^{k-2}} \equiv 1 \bmod 2^k$.
Theorem 4.13. There is a primitive root mod $m$ iff $m = 2, 4, p^k, 2p^k$ where $p$ is an odd prime and $k$ is a positive integer.
Proof. Clearly, 1 is primitive mod 2 and $-1$ is primitive mod 4.
Now let $m = p_1^{k_1} \cdots p_s^{k_s}$ be the prime factorisation of $m$. Then
$$(\mathbb{Z}/m\mathbb{Z})^\times \cong (\mathbb{Z}/p_1^{k_1}\mathbb{Z})^\times \times \cdots \times (\mathbb{Z}/p_s^{k_s}\mathbb{Z})^\times,$$
which is cyclic iff $m$ is not divisible by 8 and the numbers $\varphi(p_1^{k_1}), \ldots, \varphi(p_s^{k_s})$ are pairwise coprime. Since $\varphi(p^k)$ is even for an odd $p$, the result follows. (We can also show directly that if $g$ is a primitive root mod $p^k$ then either $g$ (if it is odd) or $g + p^k$ (if $g$ is even) is a primitive root mod $2p^k$.)

4.4 The structure of (Z/2^k Z)^×

Now we give a characterisation of the group $(\mathbb{Z}/2^k\mathbb{Z})^\times$ for $k > 2$ which will allow us to understand the structure of any group $(\mathbb{Z}/m\mathbb{Z})^\times$ using the Chinese Remainder Theorem.


Notation. For a positive integer n denote by Cn the cyclic group of order n


(which is unique up to isomorphism).
Proposition 4.14. For k > 2 we have (Z/2k Z)× ∼
= C2 × C2k−2 .
Definition 4.15. For a prime number p, the p-adic valuation of a non-zero
integer n, denoted vp (n), is the biggest integer γ for which pγ divides n.
Proof of Proposition 4.14. We claim that $\operatorname{ord}_{2^k}(5) = 2^{k-2}$. By Lemma 4.12 it suffices to prove that $5^{2^{k-3}} \not\equiv 1 \bmod 2^k$. We have
$$5^{2^{k-3}} = (1 + 2^2)^{2^{k-3}} = 1 + 2^{k-1} + \sum_{i=2}^{2^{k-3}} \binom{2^{k-3}}{i} 2^{2i}.$$
So it is enough to show that $2^k \mid \sum_{i=2}^{2^{k-3}} \binom{2^{k-3}}{i} 2^{2i}$. For this we need to prove that $v_2\!\left(\binom{2^{k-3}}{i}\right) \ge k - 2i$ for $i \ge 2$. Observe that for $1 \le m \le n$
$$\binom{n}{m} = \frac{n}{m}\binom{n-1}{m-1},$$
hence
$$v_2\!\left(\binom{2^{k-3}}{i}\right) \ge k - 3 - v_2(i) \ge k - 2i,$$
as clearly $v_2(i) < i$. This proves our claim.
Now let $(5) = \{1, 5, 5^2, \ldots, 5^{2^{k-2}-1}\}$ be the subgroup of $(\mathbb{Z}/2^k\mathbb{Z})^\times$ generated by 5. We claim that none of those elements is congruent to −1 mod $2^k$. Indeed, suppose $5^l \equiv -1 \bmod 2^k$ for some $l \in \{0, 1, \ldots, 2^{k-2} - 1\}$. Then $5^{2l} \equiv 1 \bmod 2^k$ and so $2^{k-2} \mid 2l$. Obviously $l \ne 0$ as otherwise we would have $1 \equiv -1 \bmod 2^k$ which is wrong. Thus, $l = 2^{k-3}$. However, we saw above that
$$5^{2^{k-3}} \equiv 1 + 2^{k-1} \not\equiv -1 \bmod 2^k.$$
Thus, the subgroup $\{\pm 1\} \cdot (5) \le (\mathbb{Z}/2^k\mathbb{Z})^\times$ has $2 \cdot 2^{k-2} = 2^{k-1}$ elements, which is the cardinality of the whole group $(\mathbb{Z}/2^k\mathbb{Z})^\times$. Hence $(\mathbb{Z}/2^k\mathbb{Z})^\times = \{\pm 1\} \cdot (5)$. So the intersection of the subgroups $(5)$ and $\{\pm 1\}$ is the trivial subgroup $\{1\}$ and their product is equal to the whole group $(\mathbb{Z}/2^k\mathbb{Z})^\times$. Therefore
$$(\mathbb{Z}/2^k\mathbb{Z})^\times \cong \{\pm 1\} \times (5).$$
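The two facts used in this proof, that 5 has order $2^{k-2}$ and that $\pm 5^l$ exhaust the odd residues, are easy to test numerically. The sketch below is only an illustration with hypothetical helper names; it is not taken from the notes.

```python
def multiplicative_order(a, m):
    """Order of a in (Z/mZ)^x; assumes m >= 2 and gcd(a, m) == 1."""
    order, power = 1, a % m
    while power != 1:
        power = power * a % m
        order += 1
    return order

def check_structure(k):
    """For k >= 3, check ord(5) = 2^(k-2) and that {+-1} * <5> is all of (Z/2^k Z)^x."""
    m = 2 ** k
    powers_of_5 = {pow(5, l, m) for l in range(2 ** (k - 2))}
    units = set(range(1, m, 2))                      # the odd residues mod 2^k
    generated = powers_of_5 | {(-a) % m for a in powers_of_5}
    return multiplicative_order(5, m) == 2 ** (k - 2) and generated == units
```

Calling `check_structure(k)` for a range of small `k` gives a quick confirmation of Proposition 4.14 in those cases.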

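Example 4.16. Let us decompose the group $(\mathbb{Z}/360\mathbb{Z})^\times$ into a direct product of cyclic groups. By the Chinese Remainder Theorem we have
$$(\mathbb{Z}/360\mathbb{Z})^\times \cong (\mathbb{Z}/2^3\mathbb{Z})^\times \times (\mathbb{Z}/3^2\mathbb{Z})^\times \times (\mathbb{Z}/5\mathbb{Z})^\times \cong C_2 \times C_2 \times C_6 \times C_4.$$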

4.5 Exercises
1. Let $n > 0$ be an integer. Prove that $n \mid \varphi(2^n - 1)$.

2. Show that if $g_1$ and $g_2$ are primitive roots modulo an odd prime $p$, then $g_1 g_2$ is not a primitive root modulo $p$.

3. Find all positive integers $n$ for which the congruence $a^{25} \equiv a \bmod n$ holds for all integers $a$.

4. Show that a primitive root modulo $p^2$ is also a primitive root modulo $p$, where $p$ is an odd prime.

5. Prove that for $k > 2$ and $a \in \mathbb{Z}$, if $a$ is odd, then $a^{2^{k-2}} \equiv 1 \bmod 2^k$.

6. Show that if $m$ has a primitive root then it has exactly $\varphi(\varphi(m))$ of them.

7. Let $p$ be an odd prime. Prove that there is a number $1 < g < p$ which is a primitive root modulo $p^n$ for every positive integer $n$.

Chapter 5

Quadratic residues

5.1 The Legendre symbol and Euler’s criterion
Throughout this chapter p is going to be an odd prime unless explicitly stated
otherwise.
Definition 5.1. An integer $a \in \mathbb{Z}$ (or its residue class mod $p$) with $p \nmid a$ is a quadratic residue mod $p$ iff there is $x \in \mathbb{Z}$ such that $x^2 \equiv a \bmod p$, and a quadratic non-residue otherwise.
In other words, quadratic residues are the non-zero squares in the field $\mathbb{F}_p$.
Lemma 5.2. There are exactly $\frac{p-1}{2}$ quadratic residues and $\frac{p-1}{2}$ non-residues.
Proof. In $\mathbb{F}_p$, $x^2 = y^2$ iff $x = \pm y$. Thus $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$ are pairwise distinct and are all the quadratic residues.
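Definition 5.3 (Legendre symbol). For $a \in \mathbb{Z}$ we define
$$\left(\frac{a}{p}\right) = \begin{cases} 1, & \text{if } a \text{ is a quadratic residue mod } p,\\ -1, & \text{if } a \text{ is a quadratic non-residue mod } p,\\ 0, & \text{if } p \mid a. \end{cases}$$
Proposition 5.4 (Euler’s criterion). If $p \nmid a$ then $a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right) \bmod p$.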
Proof. By Fermat’s little theorem, all quadratic residues are roots of the polynomial $X^{\frac{p-1}{2}} - 1$ in $\mathbb{F}_p$. There are exactly $\frac{p-1}{2}$ quadratic residues and the polynomial has at most (in fact, exactly) $\frac{p-1}{2}$ roots, hence quadratic residues are all the roots of the above polynomial, that is, if $a$ is a quadratic non-residue then $a^{\frac{p-1}{2}} \ne 1$ in $\mathbb{F}_p$. However, we know that $a^{p-1} = 1$ and since $\mathbb{F}_p$ is a field, $a^{\frac{p-1}{2}} = \pm 1$. Thus, if $a$ is a quadratic non-residue then $a^{\frac{p-1}{2}} = -1$.
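Euler’s criterion turns the Legendre symbol into a single modular exponentiation. The following minimal Python sketch (function names are illustrative, not from the notes) compares it with a direct search for square roots.

```python
def legendre_euler(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)      # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def legendre_bruteforce(a, p):
    """Legendre symbol (a/p) straight from Definition 5.3."""
    if a % p == 0:
        return 0
    return 1 if any((x * x - a) % p == 0 for x in range(1, p)) else -1
```

The exponentiation takes time polynomial in the number of digits of `p`, whereas the brute-force search is linear in `p`; the second function is included only as a cross-check.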

Corollary 5.5. −1 is a quadratic residue mod p iff p ≡ 1 mod 4.

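Lemma 5.6. The Legendre symbol has the following properties.

• If $a \equiv b \bmod p$ then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$,
• $\left(\frac{a^2}{p}\right) = 1$ for $p \nmid a$,
• $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.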

Proof. The first two are evident and the third one follows from Euler’s cri-
terion.
Remark 5.7. The third property states that the product of two quadratic
residues is a quadratic residue, the product of a quadratic residue and a non-
residue is a non-residue, and the product of two quadratic non-residues is a
quadratic residue. These could be deduced directly from definitions. Indeed,
the first two statements are obvious, while the third one follows from the first
two and the pigeonhole principle.

5.2 Gauss’s lemma


Proposition 5.8 (Gauss’s Lemma). Let $I \subseteq \mathbb{F}_p^\times$ be such that $\mathbb{F}_p^\times$ is the disjoint union of $I$ and $-I$. Then for $a \in \mathbb{F}_p^\times$ we have $\left(\frac{a}{p}\right) = (-1)^t$ where $t = \#\{i \in I : ai \in -I\}$.

Proof. Let $I_+ := \{i \in I : ai \in I\}$ and $I_- := \{i \in I : ai \in -I\}$. Then $I$ is the disjoint union of $I_+$ and $I_-$.
Now we show that $I$ is also the disjoint union of $aI_+$ and $-aI_-$. First, if $ai = -aj$ for some $i \in I_+$, $j \in I_-$ then $i = -j \in I \cap -I$ which is a contradiction. Further, $|aI_+| = |I_+|$ and $|-aI_-| = |I_-|$, therefore $|aI_+ \cup -aI_-| = |I_+| + |I_-| = |I|$. By definition $aI_+ \cup -aI_- \subseteq I$ and so $aI_+ \cup -aI_- = I$ by the pigeonhole principle.
Thus,
$$\prod_{i \in I} i = \prod_{i \in I_+} ai \cdot \prod_{i \in I_-} (-ai) = (-1)^t \cdot a^{\frac{p-1}{2}} \cdot \prod_{i \in I} i$$
and the result follows.
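With the standard choice $I = \{1, \ldots, \frac{p-1}{2}\}$, Gauss’s lemma becomes a simple count. The sketch below is illustrative only (the function name is an assumption); it counts how many multiples $ai$ land in the upper half of the residues.

```python
def gauss_lemma_symbol(a, p):
    """(a/p) via Gauss's lemma with I = {1, ..., (p-1)/2}.

    Assumes p is an odd prime and p does not divide a.
    """
    half = (p - 1) // 2
    # a*i lies in -I exactly when its least non-negative residue exceeds (p-1)/2
    t = sum(1 for i in range(1, half + 1) if (a * i) % p > half)
    return (-1) ** t
```

For example, with `p = 7` and `a = 2` the multiples 2, 4, 6 give `t = 2`, so the symbol is 1, in agreement with 3² ≡ 2 mod 7.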



Note that in applications one always takes $I = \{1, 2, \ldots, \frac{p-1}{2}\}$. In particular, it follows immediately from Gauss’s lemma that $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$, giving a second proof of this result. Now we give a slightly more sophisticated application.
Proposition 5.9. $\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$, that is, 2 is a quadratic residue mod $p$ iff $p \equiv \pm 1 \bmod 8$.

Proof. We need to find the number $t$ of all elements $i \in \{1, \ldots, \frac{p-1}{2}\}$ for which $2i \in \{\frac{p+1}{2}, \ldots, p-1\}$. If $p \equiv 1 \bmod 4$ then $\frac{p-1}{4} < i \le \frac{p-1}{2}$ and so $t = \frac{p-1}{4}$. This is even iff $p \equiv 1 \bmod 8$. If $p \equiv -1 \bmod 4$ then $\frac{p+1}{4} \le i \le \frac{p-1}{2}$ and $t = \frac{p+1}{4}$. This is even iff $p \equiv -1 \bmod 8$.

5.3 The quadratic reciprocity law


We will now formulate and prove one of the most prominent theorems in ele-
mentary number theory established by Gauss, known as the law of quadratic
reciprocity.

Theorem 5.10 (Quadratic Reciprocity Law). For distinct odd primes $p, q$
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{(p-1)(q-1)}{4}}.$$

Proof. Consider the rectangle
$$E := \left\{(x, y) \in \mathbb{Z}^2 : 0 < x < \frac{p}{2},\ 0 < y < \frac{q}{2}\right\}.$$
Denote
$$E_1 := \left\{(x, y) \in E : qx - py < -\frac{p}{2}\right\},$$
$$E_2 := \left\{(x, y) \in E : -\frac{p}{2} < qx - py < 0\right\},$$
$$E_3 := \left\{(x, y) \in E : 0 < qx - py < \frac{q}{2}\right\},$$
$$E_4 := \left\{(x, y) \in E : \frac{q}{2} < qx - py\right\},$$
and $t_i := \#E_i$, $i = 1, 2, 3, 4$. It is clear that $E$ is the disjoint union of $E_1, \ldots, E_4$.
By Gauss’s lemma we know that $\left(\frac{q}{p}\right) = (-1)^t$ where
$$t = \#\left\{x : 0 < x < \frac{p}{2},\ qx \equiv a \bmod p \text{ for some } a \in \left(-\frac{p}{2}, 0\right)\right\}.$$
Given $x$ as above, there is a unique $a \in \left(-\frac{p}{2}, \frac{p}{2}\right)$ with $qx \equiv a \bmod p$. Furthermore, there is a unique integer $y$ such that $a = qx - py$. Moreover, if $-\frac{p}{2} < a < 0$ then $y = \frac{qx - a}{p} \in \left(0, \frac{q}{2}\right)$ (since it is an integer). Therefore $t = \#E_2 = t_2$ and $\left(\frac{q}{p}\right) = (-1)^{t_2}$.
Similarly, $\left(\frac{p}{q}\right) = (-1)^{t_3}$. Thus, to prove the theorem we need to show that
$$\frac{(p-1)(q-1)}{4} - (t_2 + t_3)$$
is even. But this number is equal to $\#E - \#E_2 - \#E_3 = t_1 + t_4$. Observe that the map $(x, y) \mapsto \left(\frac{p+1}{2} - x, \frac{q+1}{2} - y\right)$ is a bijection from $E_1$ onto $E_4$. Hence $t_1 = t_4$ and $t_1 + t_4$ is indeed even.

The law of quadratic reciprocity, along with the results on the quadratic
nature of −1 and 2, gives an algorithm for determining whether a given
number is a quadratic residue modulo a prime.

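Example 5.11. Is −6 a quadratic residue modulo $p = 113$?
We have $\left(\frac{-6}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)\left(\frac{3}{p}\right)$.
Since $p \equiv 1 \bmod 4$, $\left(\frac{-1}{p}\right) = 1$. Also, $p \equiv 1 \bmod 8$ and so $\left(\frac{2}{p}\right) = 1$.
By the reciprocity law
$$\left(\frac{3}{113}\right) = (-1)^{\frac{(3-1)(113-1)}{4}} \left(\frac{113}{3}\right) = \left(\frac{113}{3}\right) = \left(\frac{2}{3}\right) = -1.$$
Thus, $\left(\frac{-6}{113}\right) = -1$.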
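The algorithm described above (factor the numerator, treat 2 by Proposition 5.9, flip odd prime factors by reciprocity) can be sketched in Python as follows. This is one possible illustration rather than the notes’ own procedure: it reduces $a$ modulo $p$ first, so the factor −1 never appears, and it factors by trial division, which is only sensible for small inputs. The function names are assumptions.

```python
def prime_factors(n):
    """Prime factors of n >= 2 with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, using multiplicativity,
    the rule for (2/p) and the quadratic reciprocity law."""
    a %= p
    if a == 0:
        return 0
    if a == 1:
        return 1
    result = 1
    for q in prime_factors(a):
        if q == 2:
            result *= 1 if p % 8 in (1, 7) else -1
        else:
            # (q/p) = (p/q) unless both p and q are 3 mod 4
            sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
            result *= sign * legendre(p, q)
    return result
```

Every recursive call has a strictly smaller prime in the denominator, so the recursion terminates; `legendre(-6, 113)` reproduces the value found in Example 5.11.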

5.4 Composite moduli


So far we have only studied quadratic residues modulo primes. Now we
consider the case of composite moduli.
We define a quadratic residue modulo a number $m$ to be an integer $a$ which is coprime to $m$ and for which the congruence $x^2 \equiv a \bmod m$ has a solution.¹

¹ Given a description of coprime quadratic residues one can easily get a description of non-coprime quadratic residues as well.

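Proposition 5.12. Let $m = \prod_i p_i^{k_i}$ be the prime factorisation of $m$. Then a number $a$ is a quadratic residue mod $m$ iff it is a quadratic residue mod $p_i^{k_i}$ for each $i$.

Proof. One direction is immediate, since a solution of $x^2 \equiv a \bmod m$ is also a solution modulo each $p_i^{k_i}$. For the other direction, let $x_i$ be an integer with $x_i^2 \equiv a \bmod p_i^{k_i}$. By the Chinese Remainder Theorem there is an integer $x$ such that $x \equiv x_i \bmod p_i^{k_i}$ for each $i$. Now it is easy to see that $x^2 \equiv a \bmod m$.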

Proposition 5.13. Let $p$ be an odd prime. If $a$ is a quadratic residue modulo $p$ then it is a quadratic residue modulo $p^k$ for every $k > 0$.

Proof. We induct on $k$. Assume $a$ is a quadratic residue mod $p^k$. This means that for some integers $x$ and $b$ we have

$$x^2 = a + p^k b.$$

Let $y := x + p^k z$ where $z$ is yet to be determined. We want the congruence

$$y^2 \equiv a \bmod p^{k+1}$$

to hold. Since $y^2 - a = p^k(2xz + b) + p^{2k} z^2$, this is equivalent to

$$p^{k+1} \mid (2xz + b)p^k.$$

Since $\gcd(2x, p) = 1$, we can find $z$ such that $2xz \equiv -b \bmod p$.
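The induction step is constructive, so it can be run as a lifting procedure. The following Python sketch is an illustration under the assumptions of Proposition 5.13 (odd prime `p`, `p` not dividing `a`); the function name is hypothetical, and `pow(..., -1, p)` for the modular inverse requires Python 3.8 or later.

```python
def lift_square_root(x, a, p, k):
    """Given x with x^2 = a (mod p^k), return y with y^2 = a (mod p^(k+1)).

    Assumes p is an odd prime that does not divide a.
    """
    pk = p ** k
    b = (x * x - a) // pk                 # x^2 = a + p^k * b, exact division
    z = (-b * pow(2 * x, -1, p)) % p      # solve 2xz + b = 0 (mod p)
    return (x + pk * z) % (pk * p)
```

Starting from `x = 3`, which satisfies 3² ≡ 2 mod 7, repeated calls with `k = 1, 2, 3, ...` produce square roots of 2 modulo 49, 343, and so on.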

Now let us explore quadratic residues modulo powers of 2. All odd num-
bers are quadratic residues modulo 2, and quadratic residues modulo 4 are
exactly the numbers of the form 4n + 1.

Proposition 5.14. An odd integer $a$ is a quadratic residue modulo $2^k$ for $k \ge 3$ iff $a \equiv 1 \bmod 8$.

Proof. If $a$ is a quadratic residue modulo $2^k$ with $k \ge 3$ then it is a quadratic residue modulo 8, and so it must be 1 mod 8, since the square of every odd number is 1 mod 8.
Now assume $a \equiv 1 \bmod 8$ and prove by induction that $a$ is a quadratic residue mod $2^k$ for $k \ge 3$. To this end we replicate the above argument, only in this case $y$ is taken to be of the form $x + 2^{k-1} z$. We leave the details to the reader.
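Proposition 5.14 is easy to confirm for small exponents by listing the squares of odd residues. This is a brute-force illustration only; the function name is an assumption.

```python
def odd_squares_mod_power_of_two(k):
    """Set of squares of odd residues modulo 2^k."""
    m = 2 ** k
    return {(x * x) % m for x in range(1, m, 2)}
```

For each `k >= 3` the returned set should coincide with the residues congruent to 1 modulo 8, that is, with `set(range(1, 2**k, 8))`.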

5.5 Exercises
1. Show that a primitive root modulo an odd prime is a quadratic non-residue.

2. Let $p$ be an odd prime. Prove that the product of quadratic residues mod $p$ is $\equiv (-1)^{\frac{p+1}{2}} \bmod p$.

3. Does the congruence $x^2 - 20x + 10 \equiv 0 \bmod 50893$ have a solution? Note that 50893 is prime.

4. Show that there are infinitely many primes of the form $4k + 1$, $k \in \mathbb{N}$.

5. Describe all primes $p$ (in terms of congruences) for which 7 is a quadratic residue.

6. Suppose $p \equiv 1 \bmod 4$ is prime and $2p + 1$ is prime. Show that 2 is a primitive root mod $2p + 1$.

7. Show that there are infinitely many primes of the form $3k + 1$, $k \in \mathbb{N}$.


Chapter 6

Representation of integers by
some quadratic forms

6.1 Sums of two squares


In this section we are going to prove a result of Fermat describing all natural
numbers that can be represented as sums of two squares.

Proposition 6.1. An odd prime number $p$ can be expressed as a sum of two squares if and only if $p \equiv 1 \bmod 4$.

Proof. If $p = x^2 + y^2$ then obviously $p \equiv 1 \bmod 4$.
Now let $p \equiv 1 \bmod 4$. Then $\left(\frac{-1}{p}\right) = 1$ and hence there is an integer $u$ such that $u^2 \equiv -1 \bmod p$. Consider the numbers

$$au + b; \quad 0 \le a, b \le [\sqrt{p}].$$

This list contains more than $p$ elements, hence there are $a_1, a_2, b_1, b_2 \in [0, \sqrt{p})$ such that $(a_1, b_1) \ne (a_2, b_2)$ and

$$a_1 u + b_1 \equiv a_2 u + b_2 \bmod p.$$

Denote $x := a_1 - a_2$, $y := b_2 - b_1$. Then $x^2 + y^2 \ne 0$ and $xu \equiv y \bmod p$. Therefore
$$y^2 \equiv x^2 u^2 \equiv -x^2 \bmod p.$$

Thus, $p \mid x^2 + y^2$ and $0 < x^2 + y^2 \le 2[\sqrt{p}]^2 < 2p$. Hence $x^2 + y^2 = p$.
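The pigeonhole argument can be executed literally: list the values $au + b$ modulo $p$ until two of them collide. The Python sketch below follows that idea. The way it finds $u$ (as $g^{(p-1)/4}$ for a quadratic non-residue $g$, using Euler’s criterion) is a convenience chosen here, not something taken from the proof, and the function name is illustrative.

```python
from math import isqrt

def two_squares(p):
    """For a prime p = 1 (mod 4), return (x, y) with x^2 + y^2 = p."""
    # u = g^((p-1)/4) satisfies u^2 = -1 (mod p) when g is a non-residue
    u = next(pow(g, (p - 1) // 4, p)
             for g in range(2, p) if pow(g, (p - 1) // 2, p) == p - 1)
    seen = {}
    r = isqrt(p)
    for a in range(r + 1):
        for b in range(r + 1):
            key = (a * u + b) % p
            if key in seen:               # two pairs with the same value mod p
                a2, b2 = seen[key]
                return abs(a - a2), abs(b - b2)
            seen[key] = (a, b)
```

Since there are $([\sqrt{p}] + 1)^2 > p$ pairs, a collision always occurs, and the bound $0 < x^2 + y^2 < 2p$ from the proof forces the returned pair to sum to exactly $p$.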

Theorem 6.2. Let
$$n = 2^l \cdot \prod_i p_i^{r_i} \cdot \prod_j q_j^{s_j}$$
be the prime factorization of $n$ where $p_i \equiv 1 \bmod 4$ and $q_j \equiv 3 \bmod 4$ and $l, r_i, s_j \ge 0$. Then $n = x^2 + y^2$ for some integers $x$ and $y$ iff $s_j$ is even for all $j$.
Proof. First observe that for all $a, b, c, d$
$$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2.$$
Therefore, if all $s_j$’s are even then $n$ is the sum of two squares (note that $2 = 1^2 + 1^2$, each $p_i$ is a sum of two squares by Proposition 6.1, and $q_j^{s_j} = (q_j^{s_j/2})^2 + 0^2$).
Conversely, suppose $n = x^2 + y^2$ and $q \mid n$ is a prime divisor with $q \equiv 3 \bmod 4$ with $v_q(n) = s$, that is, $q^s \mid n$ and $q^{s+1} \nmid n$. We claim that $q \mid x, y$. Indeed, otherwise (say $q \nmid y$) let $y' \in \mathbb{Z}$ be such that $yy' \equiv 1 \bmod q$. Then since $q \mid x^2 + y^2$, we have $q \mid (xy')^2 + 1$, i.e. $\left(\frac{-1}{q}\right) = 1$ which contradicts $q \equiv 3 \bmod 4$.
Thus, $q \mid x, y$ and $x = qx_1$, $y = qy_1$. Now we deduce $q^{s-2} \mid x_1^2 + y_1^2$. If $s$ were odd, this would lead to a contradiction (after applying the same argument repeatedly).

6.2 Sums of four squares


Theorem 6.3 (Lagrange). Every natural number can be expressed as a sum
of four squares.
Proof. Notice that
$$\begin{aligned}(x^2 + y^2 + z^2 + w^2)(x_1^2 + y_1^2 + z_1^2 + w_1^2) ={}& (xx_1 + yy_1 + zz_1 + ww_1)^2 + (xy_1 - yx_1 + wz_1 - zw_1)^2 \\ &+ (xz_1 - zx_1 + yw_1 - wy_1)^2 + (xw_1 - wx_1 + zy_1 - yz_1)^2. \qquad (2.1)\end{aligned}$$
Thus, it is enough to prove the theorem for prime numbers. Let $p$ be an odd prime. Consider the $p + 1$ numbers
$$x^2, \quad -y^2 - 1; \qquad 0 \le x, y \le \frac{p-1}{2}.$$
If $0 \le x_1, x_2 \le \frac{p-1}{2}$ and $x_1 \ne x_2$ then $x_1^2 \not\equiv x_2^2 \bmod p$. Hence for some $x$ and $y$ we must have $x^2 \equiv -y^2 - 1 \bmod p$.
Thus, there are integers $x, y \in \left[0, \frac{p-1}{2}\right]$ such that $p \mid x^2 + y^2 + 1$. So $mp = x^2 + y^2 + 1$ for some positive integer $m$ where
$$m \le \frac{1}{p}\left(2\left(\frac{p-1}{2}\right)^2 + 1\right) < p.$$
Let $l$ be the smallest positive integer for which there are integers $x, y, z, w$ such that
$$lp = x^2 + y^2 + z^2 + w^2.$$
Such an $l$ exists and $1 \le l \le m < p$. We will show that $l = 1$. Assume for contradiction that $l > 1$.
If $l$ is even then an even number of $x, y, z, w$ are odd and, assuming $x \equiv y \bmod 2$ and $z \equiv w \bmod 2$, we get
$$\frac{l}{2}\, p = \left(\frac{x+y}{2}\right)^2 + \left(\frac{x-y}{2}\right)^2 + \left(\frac{z+w}{2}\right)^2 + \left(\frac{z-w}{2}\right)^2$$
which contradicts minimality of $l$.
Therefore $l$ must be odd. Let $x_1 \in \left(-\frac{l}{2}, \frac{l}{2}\right)$ be such that $x \equiv x_1 \bmod l$. Define $y_1, z_1, w_1$ similarly. Then

$$n := x_1^2 + y_1^2 + z_1^2 + w_1^2 \equiv 0 \bmod l.$$

Moreover, as $l \nmid p$ we must have $0 < n < 4 \cdot (l/2)^2 = l^2$. Hence $n = kl$ with $0 < k < l$.
Now

$$(lp)(kl) = (x^2 + y^2 + z^2 + w^2)(x_1^2 + y_1^2 + z_1^2 + w_1^2) = A^2 + B^2 + C^2 + D^2,$$

where $A, B, C, D$ are determined by the identity (2.1). In particular, it is clear that $B, C, D \equiv 0 \bmod l$. Furthermore, $A = xx_1 + yy_1 + zz_1 + ww_1 \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \bmod l$. Thus,
$$kp = \left(\frac{A}{l}\right)^2 + \left(\frac{B}{l}\right)^2 + \left(\frac{C}{l}\right)^2 + \left(\frac{D}{l}\right)^2$$
which contradicts the minimality of $l$.
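Sign conventions in four-square identities are easy to get wrong, so a numerical check is worthwhile. The sketch below implements one valid sign pattern of Euler’s four-square identity and tests it on random integers; the function names are illustrative and not from the notes.

```python
import random

def euler_four_square(u, v):
    """Return (A, B, C, D) with A^2 + B^2 + C^2 + D^2 = |u|^2 * |v|^2."""
    x, y, z, w = u
    x1, y1, z1, w1 = v
    return (x * x1 + y * y1 + z * z1 + w * w1,
            x * y1 - y * x1 + w * z1 - z * w1,
            x * z1 - z * x1 + y * w1 - w * y1,
            x * w1 - w * x1 + z * y1 - y * z1)

def norm(u):
    """Sum of squares of the entries of u."""
    return sum(t * t for t in u)

def check_identity(trials=1000, seed=0):
    """Test the identity on random integer quadruples."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.randint(-50, 50) for _ in range(4)]
        v = [rng.randint(-50, 50) for _ in range(4)]
        if norm(euler_four_square(u, v)) != norm(u) * norm(v):
            return False
    return True
```

A quick hand check is `u = (1, 0, 0, 1)`, `v = (0, 1, 1, 0)`: both have norm 2, and the function returns `(0, 2, 0, 0)`, whose norm is 4. When $u \equiv v \bmod l$ componentwise, the last three entries are visibly divisible by $l$, which is the property the proof relies on.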

6.3 Exercises
1. Describe (in terms of congruences) all primes $p$ which can be expressed as $p = x^2 + 2y^2$ for some $x, y \in \mathbb{Z}$.
Chapter 7

Diophantine equations

Equations of the form


f (x1 , . . . , xn ) = 0,
where f (X1 , . . . , Xn ) ∈ Z[X1 , . . . , Xn ] is a polynomial with integer coeffi-
cients, and where we are interested in integer (or rational) solutions, are
known as Diophantine equations. We studied linear Diophantine equations
in Section 1.2. In this chapter we are going to consider some classical non-
linear Diophantine equations.
Note that Hilbert’s tenth problem asks to find a general algorithm which,
given a Diophantine equation, would decide whether there is an integer solu-
tion. It was proven by M. Davis, Y. Matiyasevich, H. Putnam, J. Robinson
(the theorem being completed by Matiyasevich in 1970) that such an algo-
rithm does not exist.

7.1 The Pythagorean equation


Theorem 7.1. All solutions of the equation

$$x^2 + y^2 = z^2$$

are of the form $(d(a^2 - b^2), 2dab, d(a^2 + b^2))$ or $(2dab, d(a^2 - b^2), d(a^2 + b^2))$ where $a, b, d$ are arbitrary integers (and all those triples are solutions).

Proof. First, if $\gcd(x, y) = d$ then $d \mid z$ and $x = dx'$, $y = dy'$, $z = dz'$ with $(x')^2 + (y')^2 = (z')^2$. Hence, dividing by $d$ we can assume without loss of generality that $\gcd(x, y) = 1$ which implies that $x, y, z$ are pairwise coprime. A solution to the Pythagorean equation with this property is known as a primitive solution.

Reducing the equation modulo 4 we see that z is odd and exactly one of
x, y is even. Since the equation is symmetric with respect to x and y, we can
assume that y is even. Furthermore, it suffices to find positive solutions and
so we can assume x, y, z > 0.
Thus, the theorem follows from the following claim.
Claim. All primitive positive solutions of the Pythagorean equation with $y$ even are given by $x = a^2 - b^2$, $y = 2ab$, $z = a^2 + b^2$ where $a, b \in \mathbb{N}$ with $\gcd(a, b) = 1$ and $a > b$.
Let $y = 2w$. We write the equation in the form
$$w^2 = \frac{z - x}{2} \cdot \frac{z + x}{2}.$$

Since $\gcd\left(\frac{z-x}{2}, \frac{z+x}{2}\right) = 1$, and the product of two coprime numbers is a square iff each of them is, $\frac{z-x}{2}$ and $\frac{z+x}{2}$ must be squares themselves. Thus, there must be coprime numbers $a$ and $b$ such that
$$\frac{z - x}{2} = b^2, \qquad \frac{z + x}{2} = a^2.$$
This yields $z = a^2 + b^2$, $x = a^2 - b^2$, $y = 2ab$. Conversely, for any $a, b$ we have
$$(a^2 - b^2)^2 + (2ab)^2 = (a^2 + b^2)^2.$$

We will shortly give another proof of this theorem.
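The parametrisation in the Claim gives a direct way to enumerate primitive triples. In the following Python sketch the extra requirement that $a$ and $b$ have opposite parity is added by hand, because two odd coprime values would make all three entries even; the function name and the `limit` parameter are assumptions made for this illustration.

```python
from math import gcd

def primitive_triples(limit):
    """Primitive positive triples (x, y, z) with y even and z <= limit,
    generated as (a^2 - b^2, 2ab, a^2 + b^2)."""
    out = []
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            # coprime and of opposite parity, so the triple is primitive
            if (a - b) % 2 == 1 and gcd(a, b) == 1 and a * a + b * b <= limit:
                out.append((a * a - b * b, 2 * a * b, a * a + b * b))
        a += 1
    return sorted(out, key=lambda t: t[2])
```

Multiplying each primitive triple by an integer $d$, and optionally swapping the first two entries, then recovers the general form stated in Theorem 7.1.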

7.2 A special case of Fermat’s last theorem


Fermat’s last theorem states that the equation

$$x^n + y^n = z^n$$

with $n > 2$ does not have any positive integer solutions. This turned out to be an extremely difficult problem solved by A. Wiles in 1995 (after being unsuccessfully attacked by many mathematicians for about 350 years).
It is easy to see that in order to prove Fermat’s last theorem, it is enough to prove it for $n = 4$ and odd prime values of $n$. We consider the case $n = 4$ below. It is also interesting to note that the equation for $n = 2$ has infinitely many positive integer solutions while for $n > 2$ it does not have any.

Theorem 7.2. The equation
$$x^2 + y^4 = z^4 \qquad (2.1)$$
has no solutions in positive integers.
Proof. If $\gcd(y, z) = d$ then $d^2 \mid x$ and $(x/d^2)^2 + (y/d)^4 = (z/d)^4$. So we can assume $x, y, z$ are pairwise coprime.

Case 1. First suppose $y$ is even. By Theorem 7.1 there are coprime integers $a, b$ such that
$$x = a^2 - b^2, \quad y^2 = 2ab, \quad z^2 = a^2 + b^2.$$
Assume $a$ is even and $b$ is odd. From the third equation we deduce that $a = 2uv$, $b = u^2 - v^2$ for some relatively prime integers $u, v$. Now $(y/2)^2 = uv(u^2 - v^2)$ hence $u = p^2$, $v = q^2$, $u^2 - v^2 = r^2$ for some integers $p, q, r$ (since $u, v, u^2 - v^2$ are mutually coprime). Therefore
$$r^2 + q^4 = p^4.$$
Thus, assuming $(x, y, z)$ is a positive solution to (2.1), we found another positive solution $(r, q, p)$ with $p < z$. Applying the same procedure to this new solution, we will find yet another solution the third coordinate of which is smaller than $p$. This can be repeated indefinitely which will yield an infinite strictly decreasing sequence of positive integers which is a contradiction.

Case 2. Now suppose $x$ is even. Then by Theorem 7.1 there are coprime integers $a, b$ such that
$$x = 2ab, \quad y^2 = a^2 - b^2, \quad z^2 = a^2 + b^2.$$
Therefore $(yz)^2 + b^4 = a^4$ and $a < z$. This leads to a contradiction as above.
Remark 7.3. The method of the proof, invented by Fermat, is known as the
method of infinite descent.
Corollary 7.4. The equation $x^4 + y^4 = z^4$ has no positive integer solutions.

7.3 Rational points on quadratic curves


Let f (X, Y ) ∈ Q[X, Y ] be a quadratic polynomial with rational coefficients.
We will see in this section that we can find all rational solutions of the
equation
f (x, y) = 0 (3.2)
provided we know one rational solution (x0 , y0 ).
Denote $C(\mathbb{R}) := \{(x, y) \in \mathbb{R}^2 : f(x, y) = 0\}$¹ and $C(\mathbb{Q}) := C(\mathbb{R}) \cap \mathbb{Q}^2$. Thus $(x_0, y_0) \in C(\mathbb{Q})$. Now pick an arbitrary point $(x_1, y_1) \in C(\mathbb{Q})$ and consider the line $l$ passing through $(x_0, y_0)$ and $(x_1, y_1)$. Its equation is
$$\frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{x_1 - x_0}$$
and it has rational slope $t = \frac{y_1 - y_0}{x_1 - x_0}$.

[Figure: the curve $C$ in the $(x, y)$-plane together with the line through the points $(x_0, y_0)$ and $(x_1, y_1)$.]

Conversely, let $l$ be a line with a rational slope $t$ passing through $(x_0, y_0)$. Its equation is

$$y = t(x - x_0) + y_0.$$

The line $l$ intersects the quadratic curve $C$ in two points one of which is $(x_0, y_0)$. Let $(x_1, y_1) \in \mathbb{R}^2$ be the other point of intersection. It may actually coincide with $(x_0, y_0)$ if $l$ is tangent to $C$. The abscissa $x_1$ is a root of the quadratic polynomial $f(X, t(X - x_0) + y_0)$. Since one of the roots of this polynomial, namely $x_0$, is rational, and its coefficients are rational, the second root must be rational as well. Therefore $x_1$ is rational and hence $y_1 = t(x_1 - x_0) + y_0$ is rational too.
Thus, there is a one-to-one correspondence between rational points on $C$ and lines with rational slope passing through $(x_0, y_0)$. This gives an algorithm for finding all rational solutions of (3.2). We just need to solve the equation $f(x, t(x - x_0) + y_0) = 0$ for $x$. Note that since $x_0$ is a root, the root $x_1$ will be given by a rational function of $t$, that is, $x_1 = r(t)$ for some $r(X) \in \mathbb{Q}(X)$. So for each rational value of $t$ the point $(r(t), t(r(t) - x_0) + y_0)$ is a solution of (3.2) and all solutions are of that form. In other words we found a rational parametrisation of the quadratic curve $C$. One says in this case that $C$ is rational.

¹ One says that $C(\mathbb{R})$ is the set of $\mathbb{R}$-rational points of the curve given by the equation (3.2).

Example 7.5. Let us find all rational solutions of the equation

$$x^2 + y^2 = 1.$$

Obviously, $(-1, 0)$ is a solution. So consider the line $y = t(x + 1)$. Solving $x^2 + t^2(x + 1)^2 = 1$ we get $x = -1$ or $x = \frac{1 - t^2}{1 + t^2}$. Thus, all rational points on the unit circle other than $(-1, 0)$ itself are of the form
$$\left(\frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2}\right), \quad t \in \mathbb{Q}.$$

This can be used to find all integer solutions of the Pythagorean equation

$$x^2 + y^2 = z^2.$$

As we have seen before, we may assume $x, y, z$ are pairwise coprime. Then since $(x/z)^2 + (y/z)^2 = 1$, we must have
$$\frac{x}{z} = \frac{1 - t^2}{1 + t^2}, \qquad \frac{y}{z} = \frac{2t}{1 + t^2}.$$
Let $t = \frac{b}{a}$ with $\gcd(a, b) = 1$. Then
$$\frac{x}{z} = \frac{a^2 - b^2}{a^2 + b^2}, \qquad \frac{y}{z} = \frac{2ab}{a^2 + b^2}.$$
Hence
$$x = a^2 - b^2, \quad y = 2ab, \quad z = a^2 + b^2.$$

Remark 7.6. Here we implicitly assumed that $a$ and $b$ have different parities in order to deduce the last equalities from the previous ones. We leave it as an exercise for the reader to prove that when both $a$ and $b$ are odd, $x$, $y$ and $z$ can be written as $x = 2mn$, $y = m^2 - n^2$, $z = m^2 + n^2$ for some integers $m, n$ with $\gcd(m, n) = 1$.

7.4 Exercises
1. Find all integer solutions of the equation $y^2 = x^3 + 16$.

2. Find all pairs $(x, y)$ of positive rational numbers such that $x^2 + 3y^2 = 1$.

3. Find all integers $x, y$ for which $x^4 - 2y^2 = 1$.


Chapter 8

Continued fractions

8.1 Finite continued fractions


Definition 8.1. A finite continued fraction is a function $[a_0, a_1, \ldots, a_N]$ of variables $a_0, a_1, \ldots, a_N$ of the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_N}}}.$$
We call $a_0, \ldots, a_N$ the partial quotients of the continued fraction, and $[a_0, \ldots, a_n]$, $n \le N$, is called the $n$-th convergent to $[a_0, \ldots, a_N]$.
Lemma 8.2. Define the sequences $p_n$ and $q_n$ recursively by
• $p_0 = a_0$, $p_1 = a_1 a_0 + 1$, $p_n = a_n p_{n-1} + p_{n-2}$, $2 \le n \le N$,
• $q_0 = 1$, $q_1 = a_1$, $q_n = a_n q_{n-1} + q_{n-2}$, $2 \le n \le N$.
Then $[a_0, \ldots, a_n] = \frac{p_n}{q_n}$.
Remark 8.3. Note that $p_n$ and $q_n$ are functions of $a_0, \ldots, a_n$ and hence they depend only on these variables.
Proof. We induct on $n$. For $n = 0, 1$ the equality holds obviously. Now
$$\begin{aligned}[a_0, \ldots, a_n, a_{n+1}] &= \left[a_0, \ldots, a_{n-1}, a_n + \frac{1}{a_{n+1}}\right] = \frac{\left(a_n + \frac{1}{a_{n+1}}\right) p_{n-1} + p_{n-2}}{\left(a_n + \frac{1}{a_{n+1}}\right) q_{n-1} + q_{n-2}} \\ &= \frac{a_{n+1}(a_n p_{n-1} + p_{n-2}) + p_{n-1}}{a_{n+1}(a_n q_{n-1} + q_{n-2}) + q_{n-1}} = \frac{a_{n+1} p_n + p_{n-1}}{a_{n+1} q_n + q_{n-1}} = \frac{p_{n+1}}{q_{n+1}}.\end{aligned}$$
In the second equality we used the induction hypothesis and the above remark.
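The recurrences of Lemma 8.2 translate directly into code. In the Python sketch below the starting values $p_{-1} = 1$, $q_{-1} = 0$ are a common convention adopted here so that a single update rule also covers $n = 1$; they are not introduced in the lemma itself. The function names are illustrative.

```python
from fractions import Fraction

def convergents(quotients):
    """Yield (p_n, q_n) for [a_0, ..., a_N] using the recurrences of Lemma 8.2."""
    p_prev, q_prev = 1, 0            # conventional p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0 = a_0, q_0 = 1
    yield p, q
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield p, q

def evaluate(quotients):
    """Value of the finite continued fraction, computed from the inside out."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value
```

For `[1, 2, 2, 2]` the convergents are 1/1, 3/2, 7/5 and 17/12, and each agrees with evaluating the corresponding truncated fraction directly.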

Lemma 8.4. The following identities hold.

• $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$ for $n > 0$,

• $p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n$ for $n > 1$.

Proof. For the first identity we use induction on $n$.
$$\begin{aligned} p_n q_{n-1} - p_{n-1} q_n &= (a_n p_{n-1} + p_{n-2}) q_{n-1} - p_{n-1}(a_n q_{n-1} + q_{n-2}) \\ &= -(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = -(-1)^{n-2} = (-1)^{n-1}. \end{aligned}$$
The second identity follows easily from the first one.

Corollary 8.5. The functions $p_n$ and $q_n$ satisfy

• $\dfrac{p_n}{q_n} - \dfrac{p_{n-1}}{q_{n-1}} = \dfrac{(-1)^{n-1}}{q_{n-1} q_n}$,

• $\dfrac{p_n}{q_n} - \dfrac{p_{n-2}}{q_{n-2}} = \dfrac{(-1)^n a_n}{q_{n-2} q_n}$.

Now we assign numerical values to the quotients $a_n$. We will assume that $a_n \in \mathbb{Z}$ for all $n$ and $a_n > 0$ whenever $n > 0$. Observe that then $\gcd(p_n, q_n) = 1$.
Let $x_n = \frac{p_n}{q_n}$ be the $n$-th convergent.

Lemma 8.6. The sequence $x_{2n}$ is strictly increasing and $x_{2n+1}$ is strictly decreasing. Furthermore, $x_{2n} < x_{2m+1}$ for every $m, n$.

Proof. The first part follows immediately from the second identity of Corollary 8.5 (note that $q_n$ is positive).
For the second part, first notice that $x_{2n} < x_{2n+1}$ according to the first identity of Corollary 8.5. If $n \le m$ then $x_{2n} \le x_{2m} < x_{2m+1}$, and if $n \ge m$ then $x_{2n} < x_{2n+1} \le x_{2m+1}$.

Thus, if the value of the continued fraction is $x$ then $x_{2n} \le x$ and $x_{2m+1} \ge x$, and equality holds only once when the index is equal to $N$.
8.2 Representation of rational numbers by continued fractions
Theorem 8.7. Every finite continued fraction represents a rational number
and every rational number can be represented by a finite continued fraction.
Proof. It is obvious that finite continued fractions represent rational num-
bers, since the partial quotients are integers. We show now that every rational
number does have such a representation.
Let $x \in \mathbb{Q}$ and denote $x_0 = x$. Define the sequences $a_n, x_n$ by $a_n = [x_n]$ and $x_{n+1} = \frac{1}{x_n - a_n}$. Continue this process as long as $x_n \ne a_n$. Now we show that the process must terminate, that is, $x_k = a_k$ for some $k$.
We use Euclid’s algorithm. If $x = \frac{a}{b}$ with $\gcd(a, b) = 1$ then write
$$\begin{aligned} a &= a_0 \cdot b + r_0, \\ b &= a_1 \cdot r_0 + r_1, \\ r_0 &= a_2 \cdot r_1 + r_2, \\ &\cdots \\ r_{N-2} &= a_N \cdot r_{N-1} + 0. \end{aligned}$$
We know Euclid’s algorithm terminates. We see that $x_0 = \frac{a}{b}$, $x_1 = \frac{b}{r_0}$, $x_n = \frac{r_{n-2}}{r_{n-1}}$, $2 \le n \le N$. Then obviously $x = [a_0, \ldots, a_N]$.
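The proof shows that the partial quotients of a rational number are exactly the quotients produced by Euclid’s algorithm. A minimal Python sketch of this (the function name is illustrative, and a positive denominator is assumed):

```python
def continued_fraction(a, b):
    """Partial quotients of a/b, for integers a and b > 0, via Euclid's algorithm."""
    quotients = []
    while b:
        quotient, remainder = divmod(a, b)   # a = quotient * b + remainder
        quotients.append(quotient)
        a, b = b, remainder
    return quotients
```

For instance, 415/93 gives the divisions 415 = 4 · 93 + 43, 93 = 2 · 43 + 7, 43 = 6 · 7 + 1, 7 = 7 · 1 + 0, hence the quotients 4, 2, 6, 7. Because Python’s `divmod` rounds the quotient down, negative numerators are handled consistently with $a_0 = [x]$.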
If x = [a0 , . . . , aN ] then also x = [a0 , . . . , aN − 1, 1] and thus the con-
tinued fraction representation is not unique. But these are the only two
representations of x according to the following result.
Proposition 8.8. If $x = [a_0, \ldots, a_n] = [b_0, \ldots, b_m]$ with $a_n > 1$, $b_m > 1$ then $m = n$ and $a_i = b_i$ for every $i$.
Proof. Note that $[a_1, \ldots, a_n] > 1$ and $x = [a_0, \ldots, a_n] = a_0 + \frac{1}{[a_1, \ldots, a_n]}$, therefore $a_0 = [x]$. Similarly, $b_0 = [x] = a_0$ and $[a_1, \ldots, a_n] = [b_1, \ldots, b_m]$. Now the result follows by induction.

8.3 Infinite continued fractions


Let $(a_n)_{n \ge 0}$ be a sequence of integers with $a_n > 0$ for $n > 0$. Denote
$$x_n := [a_0, \ldots, a_n] = \frac{p_n}{q_n}.$$

Lemma 8.9. The sequence $x_n$ is convergent.

Proof. Our results on finite continued fractions show that $x_{2n}$ is strictly increasing, $x_{2n+1}$ is strictly decreasing and $x_{2n} < x_{2m+1}$ for all $n, m$. Hence the limits
$$x' := \lim_{n \to \infty} x_{2n} \quad \text{and} \quad x'' := \lim_{n \to \infty} x_{2n+1}$$
exist and $x_{2n} < x' \le x'' < x_{2m+1}$.
Now Corollary 8.5 implies
$$|x' - x''| \le |x_{2n} - x_{2n+1}| = \frac{1}{q_{2n+1} q_{2n}} \to 0.$$
Therefore $x' = x'' =: x$ and $x_n \to x$.
Definition 8.10. We define the infinite continued fraction $[a_0, a_1, \ldots]$ as the limit $x$ of its convergents $x_n = [a_0, \ldots, a_n]$. One also writes
$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}.$$
Remark 8.11. The proof shows actually that
$$\left|x - \frac{p_n}{q_n}\right| \le |x_{n+1} - x_n| = \frac{1}{q_n q_{n+1}} < \frac{1}{q_n^2}.$$
Proposition 8.12. Every irrational number can be represented by an infinite
continued fraction.
Proof. Let $x \in \mathbb{R} \setminus \mathbb{Q}$ and denote $u_0 := x$. Define the sequences $a_n, u_n$ by
$$a_n = [u_n], \qquad u_{n+1} = \frac{1}{u_n - a_n}, \quad n \ge 0.$$
This process does not terminate, i.e. $u_n \ne a_n$, since otherwise $x$ would be rational.
We claim that $x = [a_0, a_1, \ldots]$. To this end notice that
$$x = [a_0, \ldots, a_n, u_{n+1}] = \frac{u_{n+1} p_n + p_{n-1}}{u_{n+1} q_n + q_{n-1}}.$$
Hence
$$x - \frac{p_n}{q_n} = \frac{p_{n-1} q_n - p_n q_{n-1}}{(u_{n+1} q_n + q_{n-1}) q_n} = \frac{(-1)^n}{(u_{n+1} q_n + q_{n-1}) q_n} \to 0$$
as $u_n > 1$ whenever $n > 0$.

The proof of Proposition 8.8 generalises to infinite continued fractions.

Proposition 8.13. Two infinite continued fractions are equal if and only if
all their corresponding partial quotients are equal. Furthermore, an infinite
continued fraction cannot be equal to a finite one.

Corollary 8.14. The value of an infinite continued fraction is irrational.

Proof. Indeed, if the value is rational then the continued fraction algorithm
terminates. This contradicts uniqueness.

We sum up the above results in the following theorems.

Theorem 8.15. Every real number can be represented by a continued fraction


(uniquely for irrational numbers). The continued fraction representation of
a number is finite iff it is rational.

Theorem 8.16. If $x$ is an irrational real number then there are infinitely many rational numbers $\frac{p}{q}$ such that
$$\left|x - \frac{p}{q}\right| < \frac{1}{q^2}.$$
Proof. Take $\frac{p}{q} = \frac{p_n}{q_n}$.

This is known as Dirichlet’s theorem on diophantine approximations. We will give another proof (independent of continued fractions) and establish stronger results in the next chapter.

Example 8.17. Let us find the continued fraction representation of $\sqrt{2}$. We have
$$\sqrt{2} = 1 + (\sqrt{2} - 1),$$
$$\frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1 = 2 + (\sqrt{2} - 1).$$

Noticing the repeating pattern we conclude that
$$\sqrt{2} = [1, 2, 2, \ldots] = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}.$$

8.4 Periodic continued fractions


Definition 8.18. A continued fraction $[a_0, a_1, a_2, \ldots]$ is called (ultimately) periodic if there are integers $n_0 \ge 0$ and $l > 0$ such that $a_{n+l} = a_n$ for all $n > n_0$. When $n_0 = 0$, it is called purely periodic.
A periodic continued fraction is denoted (using the above notation)

$$[a_0, \ldots, a_{n_0}, \overline{a_{n_0+1}, \ldots, a_{n_0+l}}],$$

and $[\overline{a_{n_0+1}, \ldots, a_{n_0+l}}]$ is called the purely periodic part.


Definition 8.19. A quadratic irrational is an irrational root of a quadratic
equation with rational coefficients.
In other words a complex number α is a quadratic irrational iff the field
extension Q(α) ⊇ Q has degree 2.
We will only consider real numbers so by a quadratic irrational we will
normally understand a real one. The following is obvious.
Lemma 8.20. A real number is a quadratic irrational iff it is of the form $\frac{a + b\sqrt{d}}{c}$ for $a, b, c, d \in \mathbb{Z}$ with $d > 0$ non-square and $b, c \ne 0$.
Proposition 8.21. A number $u \in \mathbb{R}$ is a quadratic irrational iff its continued fraction representation is periodic.
Proof. Let $u = [a_0, \ldots, a_n, \overline{a_{n+1}, \ldots, a_{n+l}}]$. If $\frac{p_m}{q_m}$ is the $m$-th convergent to $u$ and $v := [\overline{a_{n+1}, \ldots, a_{n+l}}]$ then
$$u = \frac{p_n v + p_{n-1}}{q_n v + q_{n-1}}$$
and hence it suffices to show that $v$ is a quadratic irrational.
Observe that
$$v = [a_{n+1}, \ldots, a_{n+l}, v] = \frac{pv + p'}{qv + q'}$$
for some integers $p, p', q, q'$. This shows that $v$ is a root of a quadratic equation. It is irrational since its continued fraction representation is infinite.
Conversely, let $u$ be a root of a polynomial $aX^2 + bX + c$ with $a, b, c \in \mathbb{Z}$ with $d = b^2 - 4ac > 0$ non-square. Consider the quadratic form

$$f(X, Y) = aX^2 + bXY + cY^2.$$

If $\frac{p_n}{q_n}$ is the $n$-th convergent to $u$ then the substitution

$$X = p_n X' + p_{n-1} Y', \qquad Y = q_n X' + q_{n-1} Y'$$
takes $f(X, Y)$ into a form

$$f_n(X', Y') = a_n X'^2 + b_n X' Y' + c_n Y'^2$$

where
$$\begin{aligned} a_n &= a p_n^2 + b p_n q_n + c q_n^2 = f(p_n, q_n), \\ c_n &= a p_{n-1}^2 + b p_{n-1} q_{n-1} + c q_{n-1}^2 = f(p_{n-1}, q_{n-1}) = a_{n-1}, \\ b_n &= 2a p_n p_{n-1} + 2c q_n q_{n-1} + b(p_n q_{n-1} + p_{n-1} q_n). \end{aligned}$$

It is easy to see that $b_n^2 - 4a_n c_n = d$. This also follows from the fact that the above transformation has determinant $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$ and hence it does not change the discriminant of the form.
Since $f(u, 1) = 0$, we have
$$\frac{a_n}{q_n^2} = a\left(\frac{p_n^2}{q_n^2} - u^2\right) + b\left(\frac{p_n}{q_n} - u\right).$$

We know that $\left|u - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2} \le 1$, hence $\left|u + \frac{p_n}{q_n}\right| \le |u| + \left|\frac{p_n}{q_n}\right| < 2|u| + 1$.

Therefore
$$\frac{|a_n|}{q_n^2} < |a| \frac{2|u| + 1}{q_n^2} + |b| \frac{1}{q_n^2}.$$
Thus, $|a_n| < |a|(2|u| + 1) + |b|$ and the right hand side does not depend on $n$. So we showed that the sequence $a_n$ is bounded, hence $c_n = a_{n-1}$ is bounded. Also, $b_n^2 = 4a_n c_n + d$ and $b_n$ is bounded as well.
Now let $u = [a_0, a_1, \ldots]$ and $u_n = [a_n, a_{n+1}, \ldots]$ be the $n$-th complete quotient of $u$. Then
$$u = \frac{p_n u_{n+1} + p_{n-1}}{q_n u_{n+1} + q_{n-1}}$$
and

$$f_n(u_{n+1}, 1) = f(p_n u_{n+1} + p_{n-1}, q_n u_{n+1} + q_{n-1}) = (q_n u_{n+1} + q_{n-1})^2 \cdot f(u, 1) = 0.$$

Thus, $u_{n+1}$ is a root of the quadratic polynomial $f_n(X, 1)$ which has bounded coefficients. Therefore there are only finitely many possibilities for $u_n$. So $u_{n_0} = u_{n_0+l}$ for some $n_0 \ge 0$ and $l > 0$. By uniqueness of the continued fraction representation $a_n = a_{n+l}$ for all $n \ge n_0$.

For a quadratic irrational $x = a + b\sqrt{d}$ where $a, b \in \mathbb{Q}$ and $d \in \mathbb{Z}$ is not a square, its conjugate is the number $\bar{x} := a - b\sqrt{d}$. It is the other root of the quadratic polynomial which vanishes at $x$.

Proposition 8.22. The continued fraction representation of a quadratic irrational $x$ is purely periodic iff $x > 1$ and $-1 < \bar{x} < 0$.

Proof. Assume $x > 1$ and $-1 < \bar{x} < 0$. Let $x = [a_0, a_1, \ldots]$ and let $x_n = [a_n, a_{n+1}, \ldots]$ be the $n$-th complete quotient.
First, $a_0 > 0$ as $x > 1$. Further, we show by induction that $-1 < \bar{x}_n < 0$. Since $x_n = a_n + \frac{1}{x_{n+1}}$, we have $\bar{x}_n = a_n + \frac{1}{\bar{x}_{n+1}}$. Hence $-1 < \bar{x}_{n+1} < 0$ by the induction hypothesis.
Now $-\frac{1}{\bar{x}_{n+1}} = a_n - \bar{x}_n$ and $0 < -\bar{x}_n < 1$, hence $a_n = \left[-\frac{1}{\bar{x}_{n+1}}\right]$.
We know that $x_n = x_{n+l}$ for some $n, l$. Hence $\frac{1}{\bar{x}_n} = \frac{1}{\bar{x}_{n+l}}$. This implies $a_{n-1} = a_{n+l-1}$ which shows in its turn that $x_{n-1} = x_{n-1+l}$. Repeating this argument we get $x_0 = x_l$ and we are done.
Conversely, if $x = [a_0, a_1, \ldots]$ is purely periodic then $a_0 = a_l \ge 1$ for some $l > 0$. So $x > a_0 \ge 1$.
Moreover, taking $n$ such that the complete quotient $x_{n+1}$ equals $x$, we get
$$x = \frac{p_n x + p_{n-1}}{q_n x + q_{n-1}}$$
and therefore
$$q_n x^2 + (q_{n-1} - p_n) x - p_{n-1} = 0.$$
The roots of the above quadratic equation are $x$ and $\bar{x}$, hence $x \cdot \bar{x} = -\frac{p_{n-1}}{q_n} < 0$. In particular $\bar{x} < 0$ as $x > 0$.
Recall that even convergents of a continued fraction are less than its value while odd convergents are greater. So if $n$ is even, then $\frac{p_{n-1}}{q_n} < \frac{p_n}{q_n} < x$ since $p_n$ is increasing. And if $n$ is odd, then $\frac{p_{n-1}}{q_n} < \frac{p_{n-1}}{q_{n-1}} < x$ since $q_n$ is increasing. Therefore $x \cdot \bar{x} > -x$ and $\bar{x} > -1$.

Example 8.23. Let $d > 0$ be a non-square and $x = \frac{1}{\sqrt{d} - [\sqrt{d}]}$. Then $x > 1$ and $\bar{x} = \frac{1}{-\sqrt{d} - [\sqrt{d}]} \in (-1, 0)$. Thus $x$ has a purely periodic continued fraction representation, $x = [\overline{a_1, \ldots, a_l}]$. Since $\sqrt{d} = [\sqrt{d}] + \frac{1}{x}$, the continued fraction representation of $\sqrt{d}$ is almost purely periodic, i.e.

$$\sqrt{d} = [a_0, \overline{a_1, \ldots, a_l}]$$

where $a_0 = [\sqrt{d}]$.
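
The expansion in Example 8.23 can be computed with integer arithmetic only. The following Python sketch is an illustration and not part of the notes: it assumes the standard fact that every complete quotient of $\sqrt{d}$ has the form $(m + \sqrt{d})/k$ with integers $m, k$, and that the period closes when $k$ returns to $1$. The function name `sqrt_continued_fraction` is our own.

```python
from math import isqrt

def sqrt_continued_fraction(d):
    """Return (a0, period) such that sqrt(d) = [a0, period repeated forever].

    d must be a positive integer that is not a perfect square.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # Each complete quotient is (m + sqrt(d)) / k with integers m and k.
    m, k, a = 0, 1, a0
    period = []
    while True:
        m = a * k - m
        k = (d - m * m) // k      # this division is always exact
        a = (a0 + m) // k         # integral part of the complete quotient
        period.append(a)
        if k == 1:                # the next complete quotient repeats the first
            return a0, period

print(sqrt_continued_fraction(7))   # sqrt(7) = [2, 1, 1, 1, 4, 1, 1, 1, 4, ...]
```

For $d = 2$ the sketch returns the period $[2]$, in agreement with the expansion of $\sqrt{2}$ used in Example 8.30.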

8.5 Pell’s equation


Let $d > 0$ be a non-square integer. The equation

$$x^2 - dy^2 = 1 \qquad (5.1)$$

is known as Pell's equation. We are going to apply the results of the last
section to solve Pell's equation.
First, notice that for all $d$ there are the trivial solutions $(\pm 1, 0)$. We will first
show that for all $d$ there is a non-trivial solution.

Lemma 8.24. The equation (5.1) has a non-trivial solution.

Proof. We know that the continued fraction representation of $\sqrt{d}$ is "almost" purely periodic, that is,

$$\sqrt{d} = [a_0, \overline{a_1, \ldots, a_m}].$$

Denote $\theta := [\overline{a_1, \ldots, a_m}]$. Let $n$ be an even multiple of $m$. Then

$$\sqrt{d} = \frac{\theta p_n + p_{n-1}}{\theta q_n + q_{n-1}}$$

where $\frac{p_k}{q_k}$ is the $k$-th convergent to $\sqrt{d}$.

On the other hand, $\sqrt{d} = a_0 + \frac{1}{\theta}$. Substituting $\theta = \frac{1}{\sqrt{d} - a_0}$ in the previous equation we get

$$\sqrt{d} = \frac{p_n + p_{n-1}(\sqrt{d} - a_0)}{q_n + q_{n-1}(\sqrt{d} - a_0)}.$$

This yields

$$q_n - a_0 q_{n-1} = p_{n-1}, \qquad p_n - a_0 p_{n-1} = q_{n-1} d$$

and therefore

$$p_{n-1}^2 - d q_{n-1}^2 = -(p_n q_{n-1} - p_{n-1} q_n) = (-1)^n = 1$$

since $n$ is even.
Thus, for every even multiple $n$ of $m$ the pair $(p_{n-1}, q_{n-1})$ is a solution to
Pell's equation. So there are infinitely many non-trivial solutions.

Solutions $(x, y)$ to Pell's equation are in one-to-one correspondence with
the numbers $z := x + y\sqrt{d}$. Often we will say that $z$ is a solution to Pell's
equation. Recalling that the conjugate is defined by $\bar{z} = x - y\sqrt{d}$ we can
rewrite (5.1) in the form
$$z \cdot \bar{z} = 1. \qquad (5.2)$$

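Lemma 8.25. If $z = x + y\sqrt{d}$ with $x, y \in \mathbb{Z}$ and $z\bar{z} = 1$ then $z > 1$ iff $x > 0$, $y > 0$.

Proof. If $x > 0$, $y > 0$ then $z > x \ge 1$. Conversely, if $z > 1$ then $0 < \bar{z} = 1/z < 1$,
hence $x = \frac{z + \bar{z}}{2} > 0$ and $y\sqrt{d} = \frac{z - \bar{z}}{2} > 0$.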

Definition 8.26. The number $z_1 = \min\{z \in \mathbb{Z}[\sqrt{d}] : z > 1,\ z\bar{z} = 1\}$ is
called the fundamental solution of the equation (5.2).

Proposition 8.27. All solutions of the equation (5.2) are given by $\pm z_1^m$, $m \in \mathbb{Z}$.

Proof. Since $z_1 \bar{z}_1 = 1$, $z_1^m \cdot \overline{z_1^m} = (z_1 \bar{z}_1)^m = 1$. Now, taking into account
that $z_1^{-1} = \bar{z}_1$, it suffices to show that for every solution $z$ with $z > 1$ there
is a positive integer $m$ such that $z = z_1^m$. As $z_1 > 1$ there is $m > 0$ such
that $z_1^m \le z < z_1^{m+1}$. If the first inequality is strict then $z' := \frac{z}{z_1^m}$ satisfies
$z' \cdot \bar{z}' = 1$ and $1 < z' < z_1$ which contradicts the definition of $z_1$. This finishes
the proof.

Remark 8.28. The above proposition shows that if $z_1 = x_1 + y_1\sqrt{d}$ then all
solutions of Pell's equation are given by
$$x = \pm\frac{(x_1 + y_1\sqrt{d})^n + (x_1 - y_1\sqrt{d})^n}{2}, \qquad y = \pm\frac{(x_1 + y_1\sqrt{d})^n - (x_1 - y_1\sqrt{d})^n}{2\sqrt{d}},$$
where $n$ is a non-negative integer.
Alternatively, if $x_n, y_n$, $n \ge 0$, are determined from the equation $x_n + y_n\sqrt{d} = (x_1 + y_1\sqrt{d})^n$ then all solutions are given by $(\pm x_n, \pm y_n)$. The trivial solutions
correspond to $n = 0$.
Remark 8.29. In the proof of Lemma 8.24 if we choose $n = \operatorname{lcm}(m, 2)$, that
is, $n = m$ if $m$ is even and $n = 2m$ if $m$ is odd, then $(p_{n-1}, q_{n-1})$ is the
fundamental solution of Pell's equation. Though we will not prove this, it
gives an algorithm for solving the equation. Moreover, all positive solutions of Pell's
equation are of the form $(p_{n-1}, q_{n-1})$ where $n$ is an even multiple of $m$.
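
The algorithm mentioned in Remark 8.29 can be sketched in Python as follows. This is an illustration under assumptions: it relies on the (unproved here) statement that the positive solutions are exactly certain convergents of $\sqrt{d}$, so the first convergent $p/q$ with $p^2 - dq^2 = 1$ is the fundamental solution. The function name `pell_fundamental` is our own.

```python
from math import isqrt

def pell_fundamental(d):
    """Smallest solution (x, y) with x, y > 0 of x^2 - d*y^2 = 1.

    Walks through the convergents p/q of sqrt(d) until p^2 - d*q^2 = 1.
    d must be a positive non-square integer.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, k, a = 0, 1, a0           # complete quotient (m + sqrt(d)) / k
    p_prev, q_prev = 1, 0        # p_{-1}, q_{-1}
    p, q = a0, 1                 # p_0, q_0
    while p * p - d * q * q != 1:
        m = a * k - m
        k = (d - m * m) // k
        a = (a0 + m) // k        # next partial quotient
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q

print(pell_fundamental(7))   # (8, 3), since 64 - 7*9 = 1
```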

Example 8.30. Consider the equation

$$x^2 - 2y^2 = 1.$$

Since $\sqrt{2} = [1, \overline{2}]$, $(3, 2)$ is the fundamental solution. So all solutions are of
the form
$$\left(\pm\frac{(3 + 2\sqrt{2})^n + (3 - 2\sqrt{2})^n}{2},\ \pm\frac{(3 + 2\sqrt{2})^n - (3 - 2\sqrt{2})^n}{2\sqrt{2}}\right).$$
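
To avoid the square roots in the closed formula, one can multiply by the fundamental solution repeatedly: $(x + y\sqrt{d})(x_1 + y_1\sqrt{d}) = (x x_1 + d y y_1) + (x y_1 + y x_1)\sqrt{d}$. A short Python sketch of this (the function name `pell_solutions` is our own, not from the notes):

```python
def pell_solutions(x1, y1, d, count):
    """First `count` positive solutions of x^2 - d*y^2 = 1.

    (x1, y1) is assumed to be the fundamental solution; the n-th solution
    is read off from (x1 + y1*sqrt(d))^n using integer arithmetic only.
    """
    x, y = x1, y1
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        # multiply x + y*sqrt(d) by x1 + y1*sqrt(d)
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return solutions

print(pell_solutions(3, 2, 2, 4))   # [(3, 2), (17, 12), (99, 70), (577, 408)]
```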

8.6 Exercises
1. Find the continued fraction representation of the numbers $\sqrt{2}$, $-\frac{23}{6}$, $\sqrt{5}$, $\sqrt{7}$, $\frac{1}{\sqrt{3}}$.

2. Evaluate the continued fraction [1, 2, 3, 1, 4].

3. Find all integer solutions of the following equations:

(i) $x^2 - 3y^2 = 1$,
(ii) $x^2 - 6y^2 = 1$.

4. Prove that the sum of the first $n$ natural numbers is a perfect square
for infinitely many $n$.

5. (i) Show that $x^2 - 7y^2 = -1$ has no integer solutions.

(ii) Show that if $d$ is divisible by a prime number $p$ with $p \equiv 3 \bmod 4$
then the equation $x^2 - dy^2 = -1$ has no integer solutions.

6. Let $n \ne 0$ be an integer. Show that if $x^2 - dy^2 = n$ has an integer
solution ($d$ is a non-square) then it has infinitely many integer solutions.
Chapter 9

Diophantine approximations

9.1 Dirichlet’s theorem


Theorem 9.1. Let $x$ be a real number. For every integer $Q > 1$ there are
integers $p, q$ with $0 < q < Q$ such that $|qx - p| \le 1/Q$.

Proof. Consider the $Q + 1$ numbers $\{0x\}, \{x\}, \{2x\}, \ldots, \{(Q-1)x\}, 1$,¹ all of which lie in $[0, 1]$. By the
pigeonhole principle, we can choose two of these numbers the distance of
which is at most $1/Q$. Assume first for some $0 \le i < j < Q$ we have
$|\{ix\} - \{jx\}| \le 1/Q$. This means

$$|(j - i)x + ([ix] - [jx])| \le 1/Q$$

and we can choose $q = j - i$, $p = [jx] - [ix]$.

If one of those two numbers whose distance is $\le 1/Q$ is $1$ then we have
$|\{ix\} - 1| \le 1/Q$ for some $0 < i < Q$ and we choose $q = i$, $p = [ix] + 1$.
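
Theorem 9.1 is easy to test numerically. The Python sketch below does not follow the pigeonhole argument; it simply tries every $q$ with $0 < q < Q$ and takes $p$ to be the integer nearest to $qx$, which the theorem guarantees will succeed for some $q$. We represent $x$ by an exact `Fraction` to avoid rounding issues; the function name `dirichlet_pair` is our own.

```python
from fractions import Fraction

def dirichlet_pair(x, Q):
    """Return integers (p, q) with 0 < q < Q and |q*x - p| <= 1/Q.

    x is a Fraction and Q > 1 an integer. Theorem 9.1 guarantees that
    the search succeeds, so None is never returned for valid input.
    """
    for q in range(1, Q):
        p = round(q * x)                      # nearest integer to q*x
        if abs(q * x - p) <= Fraction(1, Q):
            return p, q
    return None

# A rational number close to pi, with Q = 10
print(dirichlet_pair(Fraction(314159, 100000), 10))   # (22, 7)
```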

Corollary 9.2. If $x$ is irrational then there are infinitely many fractions $p/q$
such that $|x - p/q| < 1/q^2$.

Proof. For an arbitrary $Q$ let $p, q$ be such that $|qx - p| \le 1/Q < 1/q$. Then
$|x - p/q| < 1/q^2$ so there is at least one such $p/q$. Now let $Q > \frac{1}{|qx - p|}$ be an
arbitrary integer and let $p', q'$ be such that

$$|q'x - p'| \le 1/Q < |qx - p|.$$

This means that $p'/q' \ne p/q$ and $|x - p'/q'| \le 1/(q'Q) < 1/q'^2$.


¹ For a real number $r$ its fractional part, denoted $\{r\}$, is defined by $\{r\} = r - [r]$ where
$[r]$ is the integral part of $r$.

The latter is known as Dirichlet’s theorem. We have already proved it in


the previous chapter using continued fractions. In the next section we will
use the properties of continued fractions to strengthen Dirichlet’s theorem.
Remark 9.3. If $x$ is rational then there are finitely many such fractions $p/q$.
Indeed, if $x = a/b$ and $|a/b - p/q| < 1/q^2$ with $p/q \ne a/b$ then $\frac{1}{bq} \le |a/b - p/q| < \frac{1}{q^2}$, so $q < b$ and there
are only finitely many possibilities for $q$. Further, for every $q$ there are only
finitely many possibilities for $p$.

9.2 Better approximations


Proposition 9.4. For an irrational number $x$ there are infinitely many rational numbers $p/q$ such that $\left|x - \frac{p}{q}\right| < \frac{1}{2q^2}$.

Proof. We show that at least one of any two consecutive convergents to (the
continued fraction of) $x$ satisfies the desired inequality. Indeed, we have

$$\left|x - \frac{p_n}{q_n}\right| + \left|x - \frac{p_{n+1}}{q_{n+1}}\right| = \left|\frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n}\right| = \frac{1}{q_n q_{n+1}} < \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2},$$

where the first equality holds since $x - \frac{p_n}{q_n}$ and $x - \frac{p_{n+1}}{q_{n+1}}$ have opposite signs
and the last inequality follows from the obvious fact that for all real numbers
$a \ne b$ we have $2ab < a^2 + b^2$.
Now we deduce from the above inequality that either $|x - p_n/q_n| < 1/(2q_n^2)$
or $|x - p_{n+1}/q_{n+1}| < 1/(2q_{n+1}^2)$.
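
As a numerical sanity check (not a proof), the following Python sketch verifies the statement for the first twenty convergents of $x = \sqrt{2} = [1, \overline{2}]$. It uses the standard `decimal` module with 60 significant digits, which is an assumption about sufficient precision for this range; the names `convergents`, `is_good` and `pairs_ok` are our own.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60            # far more digits than needed here
x = Decimal(2).sqrt()

# Convergents of sqrt(2) = [1, 2, 2, 2, ...] via p_n = 2 p_{n-1} + p_{n-2}
convergents = []
p_prev, q_prev, p, q = 1, 0, 1, 1
for _ in range(20):
    convergents.append((p, q))
    p_prev, p = p, 2 * p + p_prev
    q_prev, q = q, 2 * q + q_prev

def is_good(p, q):
    """True if |x - p/q| < 1 / (2 q^2)."""
    return abs(x - Decimal(p) / Decimal(q)) < 1 / (2 * Decimal(q) ** 2)

# At least one of any two consecutive convergents should be good.
pairs_ok = all(is_good(*c1) or is_good(*c2)
               for c1, c2 in zip(convergents, convergents[1:]))
print(convergents[:4], pairs_ok)
```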
Proposition 9.5. For an irrational number $x$ there are infinitely many rational numbers $p/q$ such that $\left|x - \frac{p}{q}\right| < \frac{1}{\sqrt{5}q^2}$.

Proof. Here we show that at least one of any three consecutive convergents
to $x$ must satisfy the required inequality.
Suppose for some $n$
$$\left|x - \frac{p_k}{q_k}\right| \ge \frac{1}{\sqrt{5}q_k^2}$$
for $k = n, n + 1, n + 2$.
Using the proof of the previous proposition we see that
$$\frac{1}{\sqrt{5}q_n^2} + \frac{1}{\sqrt{5}q_{n+1}^2} \le \frac{1}{q_n q_{n+1}}.$$

Denoting $\lambda = q_{n+1}/q_n$ we get $\lambda + 1/\lambda \le \sqrt{5}$.
Obviously $\lambda$ is a rational number, hence the above inequality is strict.
Thus, $\lambda^2 - \sqrt{5}\lambda + 1 < 0$ which implies $\lambda < \frac{\sqrt{5}+1}{2}$.

Similarly, for $\mu := q_{n+2}/q_{n+1}$ we have $\mu < \frac{\sqrt{5}+1}{2}$.
However, $q_{n+2} = a_{n+2}q_{n+1} + q_n \ge q_{n+1} + q_n$ hence $\mu \ge 1 + 1/\lambda$. This
yields $1 + 1/\lambda < \frac{\sqrt{5}+1}{2}$ and so $\lambda > \frac{2}{\sqrt{5}-1} = \frac{\sqrt{5}+1}{2}$, a contradiction.

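Now we show that $\sqrt{5}$ in the last result is best possible.

Proposition 9.6. Proposition 9.5 does not remain true if we replace $\sqrt{5}$ by
a bigger number.

Proof. Let $A > \sqrt{5}$. Consider the number $x = \frac{\sqrt{5}-1}{2}$. Assume $p/q$ satisfies
$|x - p/q| < 1/(Aq^2)$.
Write $x = \frac{p}{q} + \frac{\delta}{q^2}$ where $|\delta| < \frac{1}{A} < \frac{1}{\sqrt{5}}$. Then
$$\frac{\delta}{q} - \frac{\sqrt{5}}{2}q = -\frac{q}{2} - p.$$
Taking squares we get
$$\frac{\delta^2}{q^2} - \delta\sqrt{5} = p^2 + pq - q^2.$$
Now $|\delta\sqrt{5}| < \sqrt{5}/A$ which does not depend on $q$ and is less than $1$. Since
$|\delta| < 1/\sqrt{5}$, if $q$ is large enough then $\frac{\delta^2}{q^2} - \delta\sqrt{5} \in (-1, 1)$. This means $p^2 + pq - q^2 \in (-1, 1)$ and therefore $p^2 + pq - q^2 = 0$ since it is an integer. This
is a contradiction, since it would give $(2p + q)^2 = 5q^2$, which is impossible as $\sqrt{5}$ is irrational. Hence $q$ is bounded and only finitely many fractions $p/q$ satisfy the inequality.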
Remark 9.7. It can be shown (though we will not do it) that for an irrational
$x$ the convergents $p_n/q_n$ give the best approximations to $x$ among all rational
numbers $p/q$ with $1 \le q \le q_n$. More precisely, if $n \ge 1$, $0 < q \le q_n$ and
$p/q \ne p_n/q_n$ then
$$\left|x - \frac{p_n}{q_n}\right| \le \left|x - \frac{p}{q}\right|$$
and the strict inequality holds for $n > 1$.

9.3 Transcendental numbers and Liouville's theorem

Definition 9.8. A complex number α is called algebraic (over Q) if there is
a non-zero polynomial f (X) ∈ Q[X] such that f (α) = 0. If there is no such
polynomial, α is called transcendental.
Proposition 9.9. There are countably many algebraic numbers.

Proof. There are countably many rationals, hence countably many polyno-
mials with rational coefficients. Each of them has finitely many roots, hence
there are countably many algebraic numbers.
Corollary 9.10. Almost all numbers (complex or real) are transcendental.
This establishes the existence of transcendental numbers. Nevertheless,
proving that a given number, like e or π, is transcendental is much more
difficult. Liouville was the first to construct a transcendental number and
thus establish their existence (Cantor’s set theory (and the above argument)
came after his discovery). We will now prove Liouville's result asserting
that algebraic numbers cannot be approximated too well by rationals, so that a
number with sufficiently good Diophantine approximations must be transcendental.
Definition 9.11. A real number $\xi$ is approximable by rationals to the order
$n$ if there is a constant $c = c(\xi)$, depending only on $\xi$, and infinitely many
rational numbers $p/q$ such that
$$\left|\xi - \frac{p}{q}\right| < \frac{c}{q^n}.$$
We saw for example that irrational numbers are approximable to the order
2 while rationals are not.
Definition 9.12. A real number $\xi$ is a Liouville number if for every $n > 0$
there are integers $p, q$ with $q > 1$ such that
$$0 < \left|\xi - \frac{p}{q}\right| < \frac{1}{q^n}.$$
Lemma 9.13. A real number is Liouville iff it is approximable to the order
n for every n > 0.
Proof. Obvious.
Theorem 9.14. Let f (X) ∈ Z[X] be a polynomial of degree n. If ξ is a real
zero of f then it is not approximable to the order n + 1.
Proof. Assume $\xi$ is approximable to the order $n+1$. Then there are a constant
$c$ and infinitely many rational numbers $p/q \in (\xi - 1, \xi + 1)$ with $q \ge 1$ such that $|\xi - p/q| < c/q^{n+1}$ and
$f(p/q) \ne 0$ (recall that $f$ has only finitely many zeros).
Let $f(X) = a_n X^n + \ldots + a_1 X + a_0$. The derivative $f'(X)$ is bounded on
every bounded set. Let $|f'(x)| < M$ for all $x \in (\xi - 1, \xi + 1)$. Now

$$\left|f\left(\frac{p}{q}\right)\right| = \frac{|a_n p^n + a_{n-1}p^{n-1}q + \ldots + a_0 q^n|}{q^n} \ge \frac{1}{q^n}$$

since the numerator is a non-zero integer. On the other hand by the mean
value theorem we have
$$f\left(\frac{p}{q}\right) = f\left(\frac{p}{q}\right) - f(\xi) = \left(\frac{p}{q} - \xi\right) \cdot f'(x)$$

for some number $x$ between $\xi$ and $p/q$. Combining what we obtained above
we get
$$\frac{c}{q^{n+1}} > \left|\xi - \frac{p}{q}\right| = \frac{|f(p/q)|}{|f'(x)|} > \frac{1}{Mq^n}.$$
This shows that $q < cM$, i.e. there are finitely many possibilities for $q$ (and hence for $p$), which
is a contradiction.

Corollary 9.15. Liouville numbers are transcendental.

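Example 9.16. Consider the number

$$\xi := \sum_{n=1}^{\infty} \frac{1}{10^{n!}}.$$

We will show that this is a Liouville number. Fix a positive integer $N$ and
let
$$\xi_N := \sum_{n=1}^{N} \frac{1}{10^{n!}} = \frac{p}{q}$$
where $q = 10^{N!}$. Clearly

$$0 < \xi - \frac{p}{q} = \sum_{n=N+1}^{\infty} \frac{1}{10^{n!}} < \frac{2}{10^{(N+1)!}} = \frac{2}{q^{N+1}} < \frac{1}{q^N}.$$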
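
The estimate in Example 9.16 can be checked with exact rational arithmetic for small $N$. Since $\xi$ itself is an infinite sum, the Python sketch below replaces the tail by two further terms, which is only a proxy for the true difference $\xi - \xi_N$ (the omitted terms are far smaller). The helper names `partial_sum` and `check` and the parameter `extra` are our own.

```python
from fractions import Fraction
from math import factorial

def partial_sum(N):
    """xi_N = sum_{n=1}^{N} 10^(-n!) as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(n)) for n in range(1, N + 1))

def check(N, extra=2):
    """Check 0 < xi_{N+extra} - xi_N < 1/q^N with q = 10^(N!).

    xi_{N+extra} stands in for xi; this is an approximation of the tail.
    """
    q = 10 ** factorial(N)
    gap = partial_sum(N + extra) - partial_sum(N)
    return 0 < gap < Fraction(1, q ** N)

print(partial_sum(3))                       # 110001/1000000
print([check(N) for N in range(1, 5)])
```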

9.4 Transcendence of e
Theorem 9.17. The number e is transcendental.

Proof. Consider the integral

$$I(t) := \int_0^t e^{t-x} f(x)\,dx$$

for $t > 0$ where $f(X) \in \mathbb{R}[X]$ with $\deg(f) = m$. Integrating by parts $m$
times we get
$$I(t) = e^t \sum_{j=0}^{m} f^{(j)}(0) - \sum_{j=0}^{m} f^{(j)}(t).$$

Now suppose $e$ is algebraic, i.e. for some integers $a_0, \ldots, a_n$ with $a_0 \ne 0$

$$a_n e^n + \ldots + a_1 e + a_0 = 0.$$

Let $f$ be the polynomial

$$f(X) = X^{p-1}(X - 1)^p \cdots (X - n)^p$$

where $p$ is a prime to be determined later. Also, denote

$$g(X) = X^{p-1}(X + 1)^p \cdots (X + n)^p.$$

Obviously $|f(x)| \le g(x)$ for all real $x \ge 0$ and $g$ is increasing on the
positive half-line hence
$$|I(t)| \le t e^t g(t).$$
Further, denote
$$J := \sum_{j=0}^{n} a_j I(j).$$

If $m = \deg(f) = (n + 1)p - 1$ then

$$J = -\sum_{j=0}^{m} \sum_{k=0}^{n} a_k f^{(j)}(k).$$

Observe that
• if $1 \le k \le n$ and $j < p$, then $f^{(j)}(k) = 0$,

• if $1 \le k \le n$ and $j \ge p$, then $f^{(j)}(k)$ is an integer divisible by $p!$,

• $f^{(j)}(0) = 0$ for $j < p - 1$,

• if $j > p - 1$, then $f^{(j)}(0)$ is an integer divisible by $p!$,

• $f^{(p-1)}(0)$ is an integer divisible by $(p - 1)!$ but not by $p$ for $p > n$.


Therefore, if $p > \max(n, |a_0|)$ then $J$ is divisible by $(p - 1)!$ but not by $p$. In particular $J \ne 0$ and $(p - 1)! \mid J$, so $|J| \ge (p - 1)!$. On the other hand
$g(k) \le (2n)^m \le (2n)^{2np}$ for all $1 \le k \le n$. Thus
$$|J| \le \sum_{j=1}^{n} |a_j| \cdot |I(j)| \le \sum_{j=1}^{n} |a_j| \cdot j e^j g(j) \le (2n)^{2np} \sum_{j=1}^{n} |a_j| j e^j.$$

Since $\sum_{j=1}^{n} |a_j| j e^j$ does not depend on $p$, the right hand side can be bounded
by $c^p$ where $c$ is a constant independent of $p$. This implies $(p - 1)! < c^p$ which
cannot hold if $p$ is big enough.

9.5 Exercises

1. Find two rational numbers $a/b$ such that $\left|\sqrt{2} - \frac{a}{b}\right| < \frac{1}{\sqrt{5}b^2}$.

2. Let $b > 1$ be an integer and $(a_n)_{n \ge 1}$ be a sequence of integers with
$1 \le a_n < b$ for all $n$. Show that the number

$$\sum_{n=1}^{\infty} \frac{a_n}{b^{n!}}$$

is transcendental.
Chapter 10

Quadratic number fields

Definition 10.1. A field extension K of Q is quadratic if it has degree 2


over Q, that is, if the dimension of K as a Q-vector space is 2.

It is clear that quadratic extensions of $\mathbb{Q}$ are obtained by adjoining a
quadratic irrational to $\mathbb{Q}$. Since those are of the form $a\sqrt{d} + b$ for $a, b, d \in \mathbb{Q}$
with $d$ non square, $K = \mathbb{Q}(\sqrt{d})$.
More explicitly, $\mathbb{Q}(\sqrt{d})$ can be seen as a subfield of complex numbers
consisting of the elements $a + b\sqrt{d}$ for rational $a, b$. When $d > 0$ the field can
be embedded into the field of real numbers. For a non-square integer $d$, the subset $\mathbb{Z}[\sqrt{d}] = \{a + b\sqrt{d} : a, b \in \mathbb{Z}\}$ is obviously a subring. We are going to study some properties of
those rings. Note however that when $d \equiv 1 \bmod 4$ this is not the ring of
algebraic integers¹ of $\mathbb{Q}(\sqrt{d})$.

Definition 10.2. The conjugate of an element $\alpha = a + b\sqrt{d} \in \mathbb{Q}(\sqrt{d})$ is
$\bar{\alpha} = a - b\sqrt{d}$. The norm of $\alpha = a + b\sqrt{d}$ is defined as $N(\alpha) = \alpha\bar{\alpha} = a^2 - b^2 d$.

The following is evident.

Lemma 10.3. The norm is multiplicative, that is, $N(\alpha\beta) = N(\alpha)N(\beta)$ for
all $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$.

10.1 Units

In this section we will describe the multiplicative group of the ring $\mathbb{Z}[\sqrt{d}]$.
Recall that it consists of the invertible elements (units) of the ring, and is
denoted by $\mathbb{Z}[\sqrt{d}]^{\times}$.

¹ An element $\alpha \in \mathbb{Q}(\sqrt{d})$ is an algebraic integer if it is a root of a quadratic polynomial
$X^2 + bX + c$ with $b, c \in \mathbb{Z}$.

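Lemma 10.4. An element $\alpha \in \mathbb{Z}[\sqrt{d}]$ is a unit iff $N(\alpha) = \pm 1$.

Proof. If $N(\alpha) = \pm 1$ then $\alpha \cdot \bar{\alpha} = 1$ or $\alpha \cdot (-\bar{\alpha}) = 1$, hence $\alpha$ is invertible.
Conversely, if $\alpha$ is a unit then there is $\beta \in \mathbb{Z}[\sqrt{d}]$ such that $\alpha\beta = 1$. This
yields $N(\alpha)N(\beta) = 1$. Since $N(\alpha), N(\beta)$ are integers, $N(\alpha) = \pm 1$.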

Now let $\alpha = x + y\sqrt{d} \in \mathbb{Z}[\sqrt{d}]$ be a unit. We distinguish three cases.
Case 1. $d < -1$
In this case $x^2 - dy^2 = \pm 1$ has only two integer solutions in $x, y$, namely
$(\pm 1, 0)$, which correspond to $\alpha = \pm 1$. So the only units are $\pm 1$.

Case 2. $d = -1$
In this case the field $\mathbb{Q}(i)$ (where $i^2 = -1$) is known as the Gaussian field
and the elements of $\mathbb{Z}[i]$, which correspond to lattice points on the complex
plane, are called Gaussian integers.
The equation $x^2 + y^2 = \pm 1$ has four solutions, $(\pm 1, 0)$ and $(0, \pm 1)$. Thus, there are
four units in the ring of Gaussian integers, namely, $\pm 1, \pm i$.

Case 3. $d > 0$
This case is non-trivial unlike the case $d < 0$. We need to solve the equation

$$x^2 - dy^2 = \pm 1.$$

We have already studied the equation $x^2 - dy^2 = 1$ in Section 8.5 and seen that
it has infinitely many solutions. In particular, the ring $\mathbb{Z}[\sqrt{d}]$ has infinitely
many units.
Let $\alpha_0 := \min\{\alpha \in \mathbb{Z}[\sqrt{d}]^{\times} : \alpha > 1\}$. This is called the fundamental
unit. As in Section 8.5 one can show easily that all positive units must be
powers of $\alpha_0$. Moreover, if $N(\alpha_0) = -1$ then all odd powers of $\alpha_0$ will have
norm $-1$ (those will be all positive solutions of the equation $x^2 - dy^2 = -1$)
and all even powers will have norm $1$ (those will be all solutions of Pell's
equation). If $N(\alpha_0) = 1$ then all powers will have norm $1$ and will be solutions
of Pell's equation. In particular, the equation $x^2 - dy^2 = -1$ will have no
solutions in this case.
Taking into account negative solutions of the above equations as well, we
conclude that
$$\mathbb{Z}[\sqrt{d}]^{\times} = \{\pm\alpha_0^m : m \in \mathbb{Z}\} = \langle -1, \alpha_0 \rangle,$$

i.e. the group of units is generated by $-1$ and the fundamental unit.
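
For small $d > 0$ the fundamental unit can be found by brute force. A unit $x + y\sqrt{d} > 1$ has $x, y > 0$, and $x$ grows with $y$, so the unit with the smallest $y \ge 1$ is the fundamental one. The Python sketch below searches for the smallest such $y$; the function name `fundamental_unit` and the cut-off parameter `y_max` are our own and only illustrative.

```python
from math import isqrt

def fundamental_unit(d, y_max=10**6):
    """Smallest unit x + y*sqrt(d) > 1 of Z[sqrt(d)], as (x, y, norm).

    d is a positive non-square integer. Returns None if no unit with
    y <= y_max exists (the search bound is an arbitrary assumption).
    """
    for y in range(1, y_max + 1):
        for norm in (-1, 1):
            x2 = d * y * y + norm         # we need x^2 = d*y^2 + norm
            x = isqrt(x2)
            if x * x == x2:
                return x, y, norm
    return None

print(fundamental_unit(2))   # (1, 1, -1): the unit 1 + sqrt(2) has norm -1
print(fundamental_unit(7))   # (8, 3, 1):  the unit 8 + 3*sqrt(7) has norm 1
```

In the first case $x^2 - 2y^2 = -1$ is solvable; in the second case the fundamental unit has norm $1$, so $x^2 - 7y^2 = -1$ has no integer solutions.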



10.2 Gaussian integers


In this section we will study prime elements of the ring $\mathbb{Z}[i]$. First, notice
that it is a Euclidean domain with the norm being a Euclidean function.
Lemma 10.5. If $\alpha, \beta \in \mathbb{Z}[i]$ with $\beta \ne 0$ then there are $\gamma$ and $\rho$ in $\mathbb{Z}[i]$ such
that $\alpha = \beta\gamma + \rho$ and $N(\rho) < N(\beta)$.
Proof. Let $\frac{\alpha}{\beta} = x + yi \in \mathbb{Q}(i)$. If $u$ and $v$ are the closest integers to $x$ and $y$
respectively then $|x - u|, |y - v| \le \frac{1}{2}$ and
$$N\left(\frac{\alpha}{\beta} - (u + vi)\right) = (x - u)^2 + (y - v)^2 \le \frac{1}{2} < 1.$$

So we can take $\gamma = u + vi$, $\rho = \alpha - \beta\gamma$.
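
The proof of Lemma 10.5 is constructive and translates directly into code. In the Python sketch below a Gaussian integer $a + bi$ is represented by the pair `(a, b)`; this representation and the names `gdivmod` and `norm` are our own choices. Exact fractions are used so that rounding to the nearest integer is reliable.

```python
from fractions import Fraction

def norm(z):
    """N(a + bi) = a^2 + b^2."""
    return z[0] ** 2 + z[1] ** 2

def gdivmod(alpha, beta):
    """Division with remainder in Z[i]: returns (gamma, rho) with
    alpha = beta*gamma + rho and N(rho) < N(beta); beta must be non-zero."""
    (a, b), (c, d) = alpha, beta
    n = norm(beta)
    # alpha / beta = alpha * conj(beta) / N(beta) = x + y*i
    x = Fraction(a * c + b * d, n)
    y = Fraction(b * c - a * d, n)
    u, v = round(x), round(y)                 # closest integers to x and y
    # rho = alpha - beta * (u + v*i)
    rho = (a - (c * u - d * v), b - (c * v + d * u))
    return (u, v), rho

# (7 + 3i) = (2 + i) * 3 + 1
print(gdivmod((7, 3), (2, 1)))   # ((3, 0), (1, 0))
```

Iterating this division as in the usual Euclidean algorithm gives greatest common divisors in $\mathbb{Z}[i]$.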


Thus, Z[i] is a unique factorisation domain² and so irreducibles and
primes³ coincide in Z[i]. In order to distinguish between primes of Z[i] and
Z, we will call the latter rational primes.
Two elements α, β ∈ Z[i] are associates, written α ∼ β, if α|β and β|α,
i.e. β/α is a unit of Z[i]. Obviously, an associate of an irreducible element is
itself irreducible and actually the factorisation into a product of irreducibles
is unique up to associates.
Lemma 10.6. If N (α) is a rational prime then α is prime in Z[i].
Proof. If α = β ·γ then N (α) = N (β)N (γ), hence N (β) = ±1 or N (γ) = ±1.
Thus, either β or γ must be a unit.
Now let π ∈ Z[i] be prime. Then π|π · π̄ = N (π). The latter is an integer
and so it can be factored into a product of rational primes. Then π must
divide one of those rational primes. It cannot divide two distinct rational
primes p, q since we know that px + qy = 1 for some integers x, y (in other
words, if two integers are coprime then they are coprime as Gaussian integers
as well). Thus, every Gaussian prime divides a unique rational prime.
Let π = a + bi|p. Then N (π)|N (p) = p2 . As N (π) is an integer, it must
be either ±1 or ±p or ±p2 . The first case is impossible since π is a prime
and hence not a unit. So there are three possibilities.

² It is a general result that unique factorisation holds in Euclidean domains and the
proof is basically the same as for Z.
³ These are defined as in Section 1.3, that is, an element π ≠ 0, ±1, ±i is irreducible if
π = αβ implies that either α or β is a unit, and it is a (Gaussian) prime if π | αβ implies
π | α or π | β.

Case 1. $N(\pi) = p$ is an odd rational prime.
Then $p = \pi\bar{\pi} = a^2 + b^2$, hence $p \equiv 1 \bmod 4$. Conversely, assume $p$ is a
rational prime with $p \equiv 1 \bmod 4$. Then $\left(\frac{-1}{p}\right) = 1$ and so $p \mid x^2 + 1$ for some
integer $x$. This means that $p \mid (x + i)(x - i)$. But clearly $\frac{x}{p} \pm \frac{1}{p}i \notin \mathbb{Z}[i]$,
hence $p \nmid x \pm i$. So $p$ is not a Gaussian prime, hence it is not irreducible
in $\mathbb{Z}[i]$. Thus, $p = \alpha\beta$ for some Gaussian integers $\alpha, \beta$ which are not units.
Then $p^2 = N(p) = N(\alpha)N(\beta)$ and $N(\alpha) = N(\beta) = p$ (since the norms are
positive). Thus, $p = \alpha\bar{\alpha}$ and as $N(\alpha) = p$ is a rational prime, $\alpha$ is a Gaussian
prime. Moreover, it is clear that $\alpha \not\sim \bar{\alpha}$. One says in this case that $p$ splits
in $\mathbb{Z}[i]$.

Case 2. $N(\pi) = 2$.
We notice that $2 = (1 + i)(1 - i) = -i(1 + i)^2$. Here $1 + i$ is a Gaussian prime
and $2 \sim (1 + i)^2$. One says that $2$ ramifies in $\mathbb{Z}[i]$.

Case 3. $N(\pi) = p^2$.
Then $N(\pi/p) = 1$, hence $\pi \sim p$ and $p$ is a Gaussian prime. It is said to
be inert in $\mathbb{Z}[i]$. We already saw that $p = 2$ and $p \equiv 1 \bmod 4$ cannot be
Gaussian primes, hence in this case $p \equiv 3 \bmod 4$.

Thus, we obtain the following characterisation of Gaussian primes.

Theorem 10.7. A Gaussian integer π is a Gaussian prime if and only if it is of one of the following forms.

• π = a + bi with a2 + b2 = p a rational prime with p ≡ 1 mod 4.

• π ∼ p where p ≡ 3 mod 4 is a rational prime.

• π ∼ 1 + i.

As a consequence we get another proof of Fermat’s theorem stating that


primes of the form 4k + 1 can be represented as a sum of two squares. This
approach also gives the uniqueness of such a representation.

Proposition 10.8. If $p \equiv 1 \bmod 4$ then $p = a^2 + b^2$ for some integers
$a, b$. Moreover, $a$ and $b$ are unique up to signs, that is, if $p = c^2 + d^2$ then
$c = \pm a, d = \pm b$ or $c = \pm b, d = \pm a$.

Proof. Existence of $a, b$ follows immediately from the above analysis. Uniqueness follows from the fact that $\mathbb{Z}[i]$ is a unique factorisation domain. The
details are left to the reader as an exercise.
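
Proposition 10.8 can be observed experimentally. The Python sketch below lists all representations $p = a^2 + b^2$ with $0 < a \le b$ by exhaustive search; it does not use Gaussian integers, and the function name `two_squares` is our own.

```python
from math import isqrt

def two_squares(p):
    """All pairs (a, b) with 0 < a <= b and a^2 + b^2 = p."""
    pairs = []
    for a in range(1, isqrt(p // 2) + 1):     # a <= b forces a^2 <= p/2
        b2 = p - a * a
        b = isqrt(b2)
        if b * b == b2:
            pairs.append((a, b))
    return pairs

print(two_squares(13))   # [(2, 3)]
print(two_squares(97))   # [(4, 9)]
print(two_squares(7))    # []  (7 is congruent to 3 mod 4)
```

For a prime $p \equiv 1 \bmod 4$ the list has exactly one entry, which reflects the uniqueness statement.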

10.3 Fermat's little theorem for Gaussian integers

Theorem 10.9. Let $\pi \in \mathbb{Z}[i]$ be a Gaussian prime which does not divide
$\alpha \in \mathbb{Z}[i]$. Then
$$\alpha^{N(\pi)-1} \equiv 1 \bmod \pi.$$
We sketch two proofs below leaving the details to the reader to complete.
Proof 1. We will show that $\alpha^{N(\pi)} \equiv \alpha \bmod \pi$. Let $\alpha = x + yi$.
If $N(\pi) = p \equiv 1 \bmod 4$ then

$$(x + yi)^p \equiv x^p + y^p i^p \equiv x + yi \bmod p.$$

If $\pi = q \equiv 3 \bmod 4$ then

$$\alpha^q = (x + yi)^q \equiv x^q + y^q i^q \equiv x - yi \equiv \bar{\alpha} \bmod q$$

and so
$$\alpha^{q^2} \equiv \bar{\bar{\alpha}} \equiv \alpha \bmod q.$$

Proof 2. We will construct a complete residue system modulo $\pi$ consisting
of $N(\pi)$ elements.⁴
If $\pi = q \equiv 3 \bmod 4$ is a rational prime then $\{x + yi : 0 \le x, y < q\}$ is a
complete residue system mod $\pi$.
If $N(\pi) = p \equiv 1 \bmod 4$ then for some integer $u \in \mathbb{Z}$ we have $\pi \mid u + i$.
So the system $\{0, 1, \ldots, p - 1\}$ is a complete residue system mod $\pi$.

10.4 Using Gaussian integers to solve Diophantine equations

In this section we demonstrate how Gaussian integers can be applied to solve
Diophantine equations.
Let us consider the equation

$$y^2 = x^3 - 1. \qquad (4.1)$$

We write it as
$$(y + i)(y - i) = x^3.$$

⁴ A set $S \subseteq \mathbb{Z}[i]$ is a complete residue system modulo $\pi$ if every Gaussian integer is
congruent to an element of $S$ mod $\pi$ and no two elements of $S$ are congruent mod $\pi$.

Denote $\delta := \gcd(y + i, y - i)$. Then $\delta \mid 2i$ so either $\delta \sim 2$ or $\delta \sim 1 + i$ or
$\delta \sim 1$.
Obviously $2 \nmid y + i$ so $\delta \not\sim 2$. If $\delta \sim 1 + i$ then $1 + i$ divides $y + i$, i.e.
$(y - 1) + (y + 1)i$ is divisible by $2$, hence $y$ is odd. However, reducing (4.1)
mod $4$ we see that $y$ cannot be odd.
Thus $\delta$ is a unit, that is, $y + i$ and $y - i$ are coprime. But since their
product is a perfect cube, they must be associates of perfect cubes. Since
all units in $\mathbb{Z}[i]$ are cubes, $y + i$ and $y - i$ must be perfect cubes. Hence
$y + i = (a + bi)^3$ and taking conjugates we get $y - i = (a - bi)^3$. This implies
$3a^2 b - b^3 = 1$ and $a^3 - 3ab^2 = y$. The former equality forces $b$ to be $\pm 1$.
Computing the value of $a$ we see that the only possibility is $b = -1$, $a = 0$.
This corresponds to $y = 0$, $x = 1$.
Thus, the only integer solution of the equation (4.1) is $(1, 0)$.
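
A quick computer search is consistent with this conclusion (it is of course no substitute for the proof). The Python sketch below looks for solutions with $1 \le x \le$ `bound`; note that $x \ge 1$ is forced by $x^3 - 1 = y^2 \ge 0$. The function name `solutions` and the search bound are our own.

```python
from math import isqrt

def solutions(bound):
    """Integer solutions (x, y) of y^2 = x^3 - 1 with 1 <= x <= bound."""
    found = []
    for x in range(1, bound + 1):
        r = x ** 3 - 1
        y = isqrt(r)
        if y * y == r:                # x^3 - 1 is a perfect square
            found.append((x, y))
            if y != 0:
                found.append((x, -y))
    return found

print(solutions(10**4))   # [(1, 0)]
```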

10.5 Exercises
1. Prove Proposition 10.8.

2. Factorise the numbers $-14$ and $5i$ in $\mathbb{Z}[i]$.

3. Show that $2, 3, 1 + \sqrt{-5}, 1 - \sqrt{-5}$ are irreducible in $\mathbb{Z}[\sqrt{-5}]$. Use this
to show that the latter is not a unique factorisation domain.
Chapter 11

Chebyshev’s theorem

The following conjecture was posed by Bertrand in 1845 and proved by


Chebyshev in 1852.

Theorem 11.1. For every n > 1 there is a prime number p with n < p < 2n.
 
We are going to estimate the binomial coefficient $\binom{2n}{n} = \frac{(2n)!}{(n!)^2}$ from above
and from below. It is quite easy to get a lower bound which we will do
shortly. To get an upper bound we will factorise it into a product of primes
and get estimates for prime factors. In particular, if Chebyshev's theorem
is false then all prime factors of $\binom{2n}{n}$ are at most $n$. Then we will see that
if $n$ is large enough then the two bounds are inconsistent. This will prove
the theorem for sufficiently large $n$ and small values of $n$ will be dealt with
separately by exhibiting prime numbers in each interval $(n, 2n)$ (we will do
this in a smart way and not consider all possible values of $n$).
The proof that we are going to present is due to Erdős, and is taken from
[AZ14].

11.1 Basic estimates


Lemma 11.2. For $n \ge 1$ we have

$$\binom{2n}{n} \ge \frac{4^n}{2n}.$$

Proof. Observe that

$$4^n = (1 + 1)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} = 2 + \sum_{k=1}^{2n-1} \binom{2n}{k} \le 2n \cdot \binom{2n}{n}.$$

The last inequality follows from the fact that $\binom{2n}{n} \ge \binom{2n}{k}$ for all $k$ and
$\binom{2n}{n} \ge 2$.
Recall that for a positive integer $n$ and a prime number $p$ the $p$-adic
valuation of $n$ is defined by

$$v_p(n) := \max\{\gamma \in \mathbb{Z} : p^\gamma \mid n\}.$$

Then the fundamental theorem of arithmetic implies

$$n = \prod_{p \mid n} p^{v_p(n)} = \prod_{p} p^{v_p(n)},$$

where the last product is over all primes $p$. It is actually a finite product
since if $p \nmid n$ then $v_p(n) = 0$.
Lemma 11.3. For $n \ge 1$ we have

$$v_p(n!) = \sum_{k=1}^{\infty} \left[\frac{n}{p^k}\right].$$

Proof. There are $\left[\frac{n}{p}\right]$ numbers between (and including) $1$ and $n$ divisible by
$p$. Each of those numbers contributes at least $1$ to $v_p(n!)$. Now, $\left[\frac{n}{p^2}\right]$ many
of those numbers are divisible by $p^2$ and they contribute one more each. Repeating this
argument for each $p^k$ we get the desired result. Note that when $k > \log_p n$,
the terms $\left[\frac{n}{p^k}\right]$ vanish and hence the sum is actually a finite sum.
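
The formula of Lemma 11.3 is easy to evaluate and to compare with the definition of $v_p$. The following Python sketch does both; the function names `vp` and `vp_factorial` are our own.

```python
from math import factorial

def vp(n, p):
    """p-adic valuation of a positive integer n, straight from the definition."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def vp_factorial(n, p):
    """v_p(n!) via Lemma 11.3: the sum of [n / p^k] over k >= 1."""
    total, power = 0, p
    while power <= n:             # later terms [n / p^k] vanish
        total += n // power
        power *= p
    return total

# v_3(30!) = [30/3] + [30/9] + [30/27] = 10 + 3 + 1
print(vp_factorial(30, 3), vp(factorial(30), 3))   # 14 14
```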
 
Denote $N := \binom{2n}{n}$.
Lemma 11.4. For each prime $p \le 2n$ we have
(i) $p^{v_p(N)} \le 2n$,

(ii) if $p > \sqrt{2n}$ then $v_p(N) \le 1$,

(iii) if $\frac{2n}{3} < p \le n$ then $v_p(N) = 0$ (for $n \ge 3$).
Proof. (i) First notice that for any real number $x$ the difference $[2x] - 2[x]$
is either $0$ or $1$. Further, by the previous lemma,
$$v_p(N) = v_p((2n)!) - 2v_p(n!) = \sum_{k=1}^{[\log_p 2n]} \left(\left[\frac{2n}{p^k}\right] - 2\left[\frac{n}{p^k}\right]\right) \le \log_p 2n,$$
and hence $p^{v_p(N)} \le 2n$.

(ii) If $\sqrt{2n} < p \le 2n$ then $[\log_p 2n] = 1$.

(iii) If $\frac{2n}{3} < p \le n$ then $3p > 2n$ and $2p > n$ and $p^2 \ge 3p > 2n$. So
$v_p((2n)!) = 2$ and $v_p(n!) = 1$.

Lemma 11.5. For any real number $x > 1$

$$\prod_{p \le x} p \le 4^{x-1},$$

where the product is over all prime numbers that do not exceed $x$.
Proof. Replacing $x$ by $[x]$ we may assume that $x$ is a positive integer and we
will proceed to the proof by induction on $x$. Further, we can assume that
$x = 2m + 1$ is an odd integer (and a prime number).
First, notice that $\binom{2m+1}{m} \le 4^m$ since

$$2^{2m+1} = (1 + 1)^{2m+1} = \sum_{k=0}^{2m+1} \binom{2m+1}{k} \ge \binom{2m+1}{m} + \binom{2m+1}{m+1} = 2\binom{2m+1}{m}.$$

Second,
$$\prod_{m+1 < p \le 2m+1} p \le \binom{2m+1}{m} = \frac{(2m+1)!}{m!(m+1)!}$$
since each prime $p$ with $m + 1 < p \le 2m + 1$ divides $(2m + 1)!$ and does not
divide $m!(m + 1)!$.
Now, using the induction hypothesis for $m + 1$, we get
$$\prod_{p \le 2m+1} p = \prod_{p \le m+1} p \cdot \prod_{m+1 < p \le 2m+1} p \le 4^m \cdot \binom{2m+1}{m} \le 4^m \cdot 4^m = 4^{2m}.$$
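
Lemma 11.5 can be confirmed for small integer values of $x$ by direct computation. The Python sketch below uses a simple sieve of Eratosthenes; the helper names `primes_up_to` and `primorial_bound_holds` are our own.

```python
from math import isqrt

def primes_up_to(n):
    """All primes <= n (n >= 2) by the sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def primorial_bound_holds(x):
    """Check that the product of all primes <= x is at most 4^(x-1)."""
    product = 1
    for p in primes_up_to(x):
        product *= p
    return product <= 4 ** (x - 1)

print(all(primorial_bound_holds(x) for x in range(2, 300)))   # True
```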

11.2 Proof of Chebyshev’s theorem


Now we are ready to prove Chebyshev’s theorem.
Proof of Theorem 11.1. First we will show that Chebyshev's theorem is true
for $n > 5000$. Assume for contradiction it is false for some such $n$, i.e. there
are no primes between $n$ and $2n$. Then from Lemmas 11.2, 11.4 and 11.5 we
have
$$\frac{4^n}{2n} \le \binom{2n}{n} = \prod_{p \le 2n} p^{v_p(N)} \le \prod_{p \le \sqrt{2n}} p^{v_p(N)} \cdot \prod_{\sqrt{2n} < p \le \frac{2n}{3}} p \cdot \prod_{n < p \le 2n} p \le (2n)^{\sqrt{2n}} \cdot 4^{\frac{2n}{3}},$$
where the last product over $n < p \le 2n$ is empty by our assumption.

Thus,
$$4^{n/3} \le (2n)^{\sqrt{2n}+1},$$
and so
$$2n \cdot \frac{\log 2}{3} \le (\sqrt{2n} + 1) \log 2n.$$
This is obviously wrong for sufficiently large $n$ since the left hand side grows
much faster. We will now show that it is in fact false for $n > 5000$. Indeed,
it suffices to show that for $x > 10000$ we have
$$x \cdot \frac{\log 2}{3} > (\sqrt{x} + 1) \log x.$$
First, it is easy to check that $\frac{\log 2}{3} > 0.2$ and $\log x < x^{\frac{1}{4}}$ if $x > 10000$.
We claim that the stronger inequality $0.2x > 2\sqrt{x} \cdot x^{\frac{1}{4}}$ holds. But this is
obviously equivalent to $x > 10000$.
Thus, we proved Chebyshev's theorem for $n > 5000$. For smaller $n$ we
argue as follows. Consider the following sequence of primes:

2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, 5003.

Every prime in this sequence is smaller than twice the previous one. Hence
for any $2 \le n \le 5000$ one of those primes is between $n$ and $2n$.
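
The three facts used about this sequence (its members are prime, each is less than twice its predecessor, and every interval $(n, 2n)$ with $2 \le n \le 5000$ contains one of them) can be verified mechanically. A short Python sketch, with names of our own choosing:

```python
CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503, 5003]

def is_prime(n):
    """Primality test by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

all_prime = all(is_prime(p) for p in CHAIN)
doubling = all(b < 2 * a for a, b in zip(CHAIN, CHAIN[1:]))
covered = all(any(n < p < 2 * n for p in CHAIN) for n in range(2, 5001))
print(all_prime, doubling, covered)   # True True True
```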

11.3 On the prime number theorem


For a positive real number x let π(x) be the number of primes that do not
exceed x (π is called the prime counting function).
Theorem 11.6 (Prime number theorem). $\pi(x) \sim \frac{x}{\log x}$ as $x \to \infty$.¹

This is a classical result in analytic number theory first proved by Hadamard
and de la Vallée-Poussin in 1896. The proof is beyond the scope of this course.
However, we will use the estimates established in the previous sections to
prove a weak version of the prime number theorem.

Theorem 11.7. There are constants $0 < A < 1 < B$ such that
$$A \frac{x}{\log x} \le \pi(x) \le B \frac{x}{\log x}$$
for all sufficiently large $x$.

¹ The notation $f(x) \sim g(x)$ means that $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$.

The following proof is from [Gre16] and the reader can also find a proof
of the prime number theorem there.

Proof. First, we will establish the upper bound. Lemma 11.5 implies

$$\prod_{x/2 < p \le x} p \le 4^x,$$

and hence
$$\left(\frac{x}{2}\right)^{\pi(x) - \pi(x/2)} \le 4^x.$$
Taking logarithms we get

$$\pi(x) \le \pi(x/2) + \frac{x \log 4}{\log(x/2)}.$$

Applying this inequality repeatedly for $\frac{x}{2}, \frac{x}{4}, \ldots$ for each positive integer
$m$ we get
$$\pi(x) \le \pi\left(\frac{x}{2^m}\right) + 2 \log 4 \sum_{k=1}^{m} \frac{x/2^k}{\log(x/2^k)}.$$
Now let $m$ be the biggest integer for which $2^m \le \sqrt{x}$. Then $2^{m+1} > \sqrt{x}$
and $x/2^m < 2\sqrt{x}$. Also, for each $1 \le k \le m$ we have $x/2^k \ge \sqrt{x}$. Hence

$$\pi(x) < 2\sqrt{x} + 2 \log 4 \sum_{k=1}^{m} \frac{x/2^k}{\log \sqrt{x}} < 2\sqrt{x} + \frac{x}{\log x} \cdot 4 \log 4 \sum_{k=1}^{\infty} \frac{1}{2^k}.$$

Obviously, $2\sqrt{x} < \frac{x}{\log x}$ for sufficiently large $x$ and so we can take $B = 1 + 4 \log 4$.²

Now we turn to the lower bound.

Claim. There is a constant $C > 1$ for which $\prod_{p \le 2n} p \ge C^n$ for all sufficiently
large $n$.
Indeed, we saw in the proof of Chebyshev's theorem that
$$\frac{4^n}{2n} < (2n)^{\sqrt{2n}} \cdot \prod_{p \le 2n} p.$$

So any constant $1 < C < 4$ works.

² Actually, any number $B > 4 \log 4$ would work. Moreover, it can be shown that
$\prod_{x/2 < p \le x} p \le C^x$ for any real number $C > 2$. Hence taking $B > 4 \log 2$ would suffice.

Now the claim implies $C^n \le \prod_{p \le 2n} p \le (2n)^{\pi(2n)}$, that is,

$$n \log C \le \pi(2n) \log 2n$$

and
$$\pi(2n) \ge \frac{\log C}{2} \cdot \frac{2n}{\log 2n}.$$
This establishes the lower bound for $x = 2n$. The general case follows
easily from this.
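
To get a feeling for Theorem 11.7 one can tabulate the ratio $\pi(x) \log x / x$, which the theorem bounds between $A$ and $B$ and which tends to $1$ by the prime number theorem. The Python sketch below computes $\pi(x)$ with a sieve; the function name `prime_count` and the chosen sample points are our own, and the printed ratios are floating-point approximations.

```python
from math import isqrt, log

def prime_count(x):
    """pi(x): the number of primes not exceeding the integer x >= 2."""
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(x) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return sum(sieve)

for x in (10**3, 10**4, 10**5):
    ratio = prime_count(x) * log(x) / x
    print(x, prime_count(x), round(ratio, 3))
```

In this range the ratio stays slightly above $1$ and decreases slowly.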

11.4 Exercises
1. How many zeroes are there at the end of the decimal representation of
99! ?

2. Show that $\prod_{x/2 < p \le x} p \le C^x$ for any real number $C > 2$.
Bibliography

[AZ14] Martin Aigner and Günter M. Ziegler. Proofs from THE BOOK.
Springer-Verlag Berlin Heidelberg, 5th edition, 2014.

[Bak12] Alan Baker. A Comprehensive Course in Number Theory. Cambridge University Press, 2012.

[Gre16] Ben Green. Analytic Number Theory: Lecture Notes. University of
Oxford, Oxford, 2016. Available at http://people.maths.ox.ac.uk/greenbj/papers/primenumbers.pdf.

[Gre17] Ben Green. Introduction to Number Theory: Lecture Notes. University of Oxford, Oxford, 2017. Available at http://people.maths.ox.ac.uk/greenbj/papers/numbertheory-2017.pdf.

[HW80] Godfrey Hardy and Edward Wright. An Introduction to the Theory
of Numbers. Oxford University Press, 1980.
