Number Theory
These are lecture notes for math 354b, “Number Theory,” taught by Ross Berkowitz
at Yale University during the spring of 2019. These notes are not official, and have
not been proofread by the instructor for the course. They live in my lecture notes
repository at
https://github.com/jopetty/lecture-notes/tree/master/MATH-354.
If you find any errors, please open a bug report describing the error and label it
with the course identifier, or open a pull request so I can correct it.
Contents
1 January 14, 2019
3 Lecture 3
4 Wednesday, January 23
  4.1 Infinitude of Primes
  4.2 Congruence Equations & Modular Arithmetic
5 Monday, January 28
  5.1 Solving ax ≡ b (mod m)
  5.2 Algorithmic Speed for the Chinese Remainder Theorem
6 Wednesday, January 30
  6.1 Pollard-ρ Factorization
  6.2 Floyd’s Cycle Finding
  6.3 Algorithm
  6.4 Some cool things
7 Monday, February 4
  7.1 Rosen 7.1
  7.2 Multiplicative Structure of Z/nZ
8 No Notes
9 No Notes
10 Wednesday, 23 February
  10.1 Applications of QR
11 Monday, 18 February
  11.1 RSA Cryptography
  11.2 Diffie-Hellman Key Exchange
  11.3 Zero-Knowledge Proofs
2 January 16, 2019
Definition (Prime). An integer p > 1 is prime if its only positive divisors are
1 and itself.
Theorem 2.1 (Well-Ordering Principle). Every nonempty subset of Z≥0 has a least
element. This is the defining property of Z.
2.2 Today
Definition (GCD). Let a, b ∈ Z. The greatest common divisor is the
largest common divisor of a and b, so gcd(a, b) = max{d | d divides a and b}.
We know this exists because of well-ordering.
Definition (GCD). Given a and b in some PID, we say that the GCD is
the generator d of the principal ideal (a, b), so (a, b) = (d). Alternatively,
the gcd is the smallest positive number in (a, b) if we’re working in Z.
Notation (GCD). As a nod to the last definition, we often write the
GCD of two numbers as (a, b) to emphasize the relation to ideals.
Lemma 2.2. Let d be the greatest common divisor of a and b. Then for any x ∈ Z
we know that (a, b + ax) = d as well. That is, the GCD is unchanged under such linear
combinations.
Proof. It’s clear that d still divides b + ax if it divides a and b, so it’s clear that
(a, b + ax) ≥ (a, b). Conversely, there can’t be a larger common divisor:
if d′ divides both a and b + ax, then d′ divides b = (b + ax) − ax, so d′ is a common
divisor of a and b and hence d′ ≤ d. Thus (a, b + ax) ≤ (a, b), so (a, b + ax) = d.
Proof. Write I = {ax + by : x, y ∈ Z} and d = (a, b). We show containment each way.
First we note that I ⊆ dZ, since if d divides a and b then it divides every ax + by.
Then we show that dZ ⊆ I (this is sometimes called Bezout’s Lemma). By the
Well-Ordering property, we know that there exists some c = min(I ∩ Z>0 ). We know
that c ≥ d since it must be the case that d divides c. On the other hand, if we
can show that c is a common divisor of a and b then we know that c ≤ d as well.
Write a = cq + r with 0 ≤ r < c. Since c ∈ I we have
c = ax + by, so r = a − cq = a(1 − xq) + b(−yq) and r ∈ I. Since c is the minimum
positive element of I and r < c, we know that r = 0, so a = cq and c divides a. Repeat for b.
Then c ≤ d and c ≥ d so c = d. This also gives us the definition of the GCD as
the common divisor of a and b which is divisible by all other common divisors.
(Margin note: the containment dZ ⊆ I could also be proved with the Extended Euclidean Algorithm.)
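The margin note on this proof mentions the Extended Euclidean Algorithm. As a minimal sketch in Python (the name extended_gcd is ours, not from the lecture, and nonnegative integer inputs are assumed), the following returns d = (a, b) together with Bezout coefficients x, y satisfying d = ax + by:

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and d == a*x + b*y.

    Assumes a and b are nonnegative integers, not both zero.
    """
    # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


d, x, y = extended_gcd(240, 46)
print(d, 240 * x + 46 * y)  # both numbers should agree
```

Each loop step replaces the pair (old_r, r) by (r, old_r mod r), which is exactly the move allowed by Lemma 2.2, so the GCD is preserved throughout.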
Proof. Note that (a, b) = 1, so there exist some x, y ∈ Z such that 1 = ax + by.
Multiplying through by c, we get that
c = cax + cby.
Theorem 2.7. All integers have a unique prime factorization. For every n ∈ Z≥2
there exists a unique set of primes p_1, …, p_k and positive integers a_1, …, a_k such
that $n = \prod_{i=1}^{k} p_i^{a_i}$.
Proof. Assume that we have two (more than one) such lists of primes and their
powers. Denote them P = p_1, …, p_k (possibly with repeats) and Q = q_1, …, q_ℓ.
Assume by way of contradiction that the lists are disjoint (otherwise we cancel the
like terms). We know that p_1 divides $\prod_{i=1}^{\ell} q_i$, so p_1 must divide q_i for some i. This
can happen if and only if p_1 = q_i. This contradicts the disjointness of our lists.
3 Lecture 3
Didn’t take notes today.
4 Wednesday, January 23
Recall the uniqueness of prime factorization, where for all n ∈ N we have a unique
list of primes p_1, …, p_k and a_1, …, a_k ∈ Z>0 such that $n = \prod_{i=1}^{k} p_i^{a_i}$.
Then by uniqueness of prime factorization for each n ∈ N we know that 1/n appears
exactly once when you expand this product. This is Euler’s product for the ζ
function?
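Euler’s Proof. Assume by way of contradiction that there are finitely many primes p_1, …, p_k.
Then
\[
\sum_{n=1}^{\infty} \frac{1}{n} = \prod_{i=1}^{k} \left(1 + \frac{1}{p_i} + \frac{1}{p_i^2} + \cdots\right) = \prod_{i=1}^{k} \frac{1}{1 - 1/p_i} < \infty.
\]
Yet we know that $\sum 1/n$ diverges, which presents a contradiction.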
Lemma 4.2. For any n ∈ Z>0 there exist unique a, b ∈ Z>0 such that a is square free
(meaning that no square number other than 1 divides it) and n = ab^2.
Erdős’ Proof. Assume by way of contradiction that there are finitely many primes p_1, …, p_k.
Then any square-free number is of the form $n = \prod p_i^{a_i}$ where a_i ∈ {0, 1}. Thus there are only 2^k
square-free numbers. Now let’s look at all numbers at most N for some N. By the
above lemma, they can be specified by (a, b) where a is square-free and b^2 is a square.
There are 2^k square-free numbers and at most √N square numbers, so N ≤ 2^k √N
for all N, so 2^k ≥ √N for all N, which is very false if N > 2^{2k}.
Obs. 1. If (a, m) = 1 then we can use Bezout’s theorem. This tells us that there exist
some X, Y such that 1 = aX + mY . Then we multiply through by b to get
that b = a(Xb) + m(Y b). Then aXb ≡ b (mod m).
Lemma 4.4. The congruence ax ≡ b (mod m) has solutions if and only if the gcd
of a and m divides b.
Proof. Let d be the gcd of a and m. By Bezout, there exist some X0 , Y0 ∈ Z such
that d = aX0 + mY0 . If d divides b there exists some k such that b = dk. Then
b = aX0 k + mY0 k so b ≡ aX (mod m) for X = X0 k. In the other direction, just
write it out. If there is a solution then b ≡ aX (mod m) so b = aX + mY . Since the
gcd divides the right hand side it must divide the left as well, so d divides b.
5 Monday, January 28
5.1 Solving ax ≡ b (mod m)
Recall from last lecture that ax ≡ b (mod m) is solvable if and only if the gcd d = (a, m)
divides b. If we let m′ = m/d then the solutions are unique modulo m′.
Proof. Let x_1, x_2 be solutions, so ax_1 ≡ b (mod m) and ax_2 ≡ b (mod m). Then
a(x_1 − x_2) ≡ 0 (mod m). Let a′ = a/d. Then da′(x_1 − x_2) = dm′k for some k ∈ Z, so
m′ divides a′(x_1 − x_2), and since (m′, a′) = 1 we know that m′ divides
x_1 − x_2.
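As a rough illustration of the solvability criterion and of uniqueness modulo m/d, here is a small Python sketch. The function name solve_linear_congruence is hypothetical, and the code assumes Python 3.8 or later, where the built-in pow accepts an exponent of -1 for modular inverses.

```python
from math import gcd


def solve_linear_congruence(a, b, m):
    """Solve a*x ≡ b (mod m) for integers a, b and modulus m > 0.

    Returns (x0, m_prime), meaning the solutions are exactly the
    integers x ≡ x0 (mod m_prime) with m_prime = m / gcd(a, m),
    or None when gcd(a, m) does not divide b.
    """
    d = gcd(a, m)
    if b % d != 0:
        return None  # not solvable: the gcd must divide b
    a_p, b_p, m_p = a // d, b // d, m // d
    # a_p and m_p are coprime, so a_p is invertible modulo m_p.
    x0 = (b_p * pow(a_p, -1, m_p)) % m_p
    return x0, m_p


print(solve_linear_congruence(6, 4, 10))  # d = 2 divides 4, so solvable
print(solve_linear_congruence(6, 3, 10))  # d = 2 does not divide 3
```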
x ≡ a1 (mod m1 ), . . . , x ≡ ar (mod mr ),
are isomorphic.
Lemma 5.5. If a1 , . . . , ar all divide m and are pairwise relatively prime
then the product a1 · · · ar divides m.
Proof of CRT. Let M = m_1 ⋯ m_r and $\hat{M}_i = M/m_i = \prod_{j \ne i} m_j$. We find a helper y_i such that y_i ≡ 0
(mod $\hat{M}_i$) and y_i ≡ 1 (mod m_i). Then we’ll have that $x = \sum a_i y_i$. Note that
($\hat{M}_i$, m_i) = 1, so 1 = u_i $\hat{M}_i$ + v_i m_i has a solution in integers u_i, v_i. Let y_i = u_i $\hat{M}_i$. This
shows existence. To show uniqueness, just apply Lemma 5.2 above.
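The existence half of this proof is constructive, and can be sketched directly in Python. The function name crt is ours, the moduli are assumed to be pairwise relatively prime, and the sketch relies on math.prod and on pow with exponent -1 (both Python 3.8 or later).

```python
from math import prod


def crt(residues, moduli):
    """Return the unique x in [0, M) with x ≡ a_i (mod m_i) for all i.

    Assumes the moduli are pairwise relatively prime; M is their product.
    """
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_hat = M // m_i
        # Helper y_i: a multiple of M_hat that is 1 modulo m_i.
        y_i = M_hat * pow(M_hat, -1, m_i)
        x += a_i * y_i
    return x % M


x = crt([2, 3, 2], [3, 5, 7])
print(x, x % 3, x % 5, x % 7)
```

Here pow(M_hat, -1, m_i) plays the role of the Bezout coefficient of $\hat{M}_i$ in the proof.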
6 Wednesday, January 30
6.1 Pollard-ρ Factorization
We want to factor n. If we can find 0 < a, b < n with a ≠ b such that a ≡ b (mod p) for a prime factor p of n, then
(b − a, n) is divisible by p and is a nontrivial factor of n. Our first idea was to try numbers at random,
and after about √p samplings we’ll find two a, b which are congruent mod p. But
since there are about p/2 pairs of samples to compare, it takes about p log p steps.
New idea: Start at x_0 = 2. For i ≥ 0, let $x_{i+1} = x_i^2 + 1 \pmod{n}$. This will replace our
random numbers, and the hope is that this sequence x_1, x_2, … is “random enough”
for our uses. Now if x_j ≡ x_i (mod p) for any p dividing n then $x_{j+1} = x_j^2 + 1$
(mod n) ≡ $x_j^2 + 1$ (mod p) ≡ $x_i^2 + 1$ (mod p) ≡ x_{i+1} (mod p). This forms a nice
cycle modulo p.
2. Fix a1 and check all the others in the list until you find a match. But we have
no idea when the cycle starts.
Instead we use a “tortoise and the hare method,” where we have two pointers in our
sequence. The slow pointer t moves through the list while the fast pointer h moves
twice as fast as the tortoise. Eventually both of the following will happen:
6.3 Algorithm
Let x_0 = 2 and let $x_{i+1} = x_i^2 + 1 \pmod{n}$.
If the x_i are sufficiently random, then with high probability there are two j, k ≤
O(√p) such that x_j ≡ x_k (mod p), where p is any prime divisor of n. After O(√p) steps
we will have
• k − j divides i;
• i ≥ min(j, k);
These imply that x_i is in the cycle, so x_i ≡ x_{i+(k−j)} (mod p), and since k − j divides i
we get x_{2i} ≡ x_i (mod p). Then (x_{2i} − x_i, n) is at least p, and it is a nontrivial factor unless it equals n.
Pollard-ρ runs in about O(n^{1/4} log n), which is O(n^{1/4}) computations of the gcd,
and it runs especially quickly in the case that n has small prime factors since those
determine the cycle length.
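Putting the iteration x ↦ x^2 + 1 (mod n) together with the tortoise-and-hare idea gives a short program. This is only a sketch under the assumptions of this section: the function name pollard_rho is ours, and a return value of None means the method failed for this starting point (the gcd jumped straight to n), in which case one would retry with a different x_0 or a different polynomial.

```python
from math import gcd


def pollard_rho(n, x0=2):
    """Try to find a nontrivial factor of n with Pollard's rho method.

    Uses the map x -> x^2 + 1 (mod n) and Floyd's cycle finding.
    Returns a factor d with 1 < d < n, or None if this run fails.
    """
    def step(x):
        return (x * x + 1) % n

    tortoise, hare = x0, x0
    while True:
        tortoise = step(tortoise)       # x_i
        hare = step(step(hare))         # x_{2i}
        d = gcd(hare - tortoise, n)
        if d == n:
            return None                 # the two pointers met modulo n itself
        if d > 1:
            return d                    # they met modulo a proper divisor


print(pollard_rho(8051))  # 8051 = 83 * 97
```

The loop always terminates, since the sequence modulo n is eventually periodic and the two pointers must eventually agree modulo n.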
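Proof. Every number in (Z/pZ)^× can be paired with its multiplicative inverse. Then, working in Z/pZ,
\[
(p-1)! = \prod_{a} a = \Bigl(\prod_{\{a,\,a^{-1}\}} a \cdot a^{-1}\Bigr) \cdot 1 \cdot (-1) = -1,
\]
where the middle product runs over the pairs with a ≠ a^{-1} (the pairing would double count when a = a^{-1}, so
when a = ±1, and those two factors are written separately).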
Theorem 6.2 (Fermat’s Little Theorem). For prime p, x^p ≡ x (mod p) for any
x ∈ Z.
Proof. Recall ϕ(n) is the number of positive integers less than n which are relatively prime
to n, and that (Z/mZ)^× is the set of units in Z/mZ, which is the set of numbers
relatively prime to m equipped with multiplication modulo m.
7 Monday, February 4
Recall that if p is prime and (a, p) = 1 then a^{p−1} ≡ 1 (mod p), which gives us a suggested primality test:
If we want to know if p is prime, pick some 1 < a < p and check a^{p−1} (mod p). This
doesn’t always work: 4^{14} ≡ 1 (mod 15). The question now becomes, is this a rarity?
Theorem 7.1. Fix a base b = 2 with at least one odd pseudoprime n. There are
infinitely many pseudoprimes to the base b = 2.
We will return to this theorem in a few weeks and extend it to the Miller Primality
Test.
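To get a feel for how rare these false positives are, here is a small Python sketch of the test. The helper names fermat_passes, is_prime and pseudoprimes are ours, and is_prime is plain trial division, used only to tell which n are genuinely composite.

```python
def fermat_passes(n, a):
    """True if a^(n-1) ≡ 1 (mod n), i.e. n passes the Fermat test to base a."""
    return pow(a, n - 1, n) == 1


def is_prime(n):
    """Trial division; fine for the small n used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def pseudoprimes(base, limit):
    """Odd composite n < limit that pass the Fermat test to the given base."""
    return [n for n in range(3, limit, 2)
            if not is_prime(n) and fermat_passes(n, base)]


print(fermat_passes(15, 4))     # the example from the notes
print(pseudoprimes(2, 1000))    # odd pseudoprimes to the base 2 below 1000
```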
8 No Notes
9 No Notes
10 Wednesday, 23 February
Recall the question of when is (a/p) = ±1 when a ∈ Z, (a, p) = 1, and p is prime.
We defined |x| = min{x, p − x} for 0 ≤ x ≤ p − 1. Recall Gauss’ Lemma, which
states
Lemma 10.1 (Gauss). Let s be the number of ℓ with 1 ≤ ℓ ≤ (p − 1)/2 such
that |aℓ| ≡ −aℓ (mod p). Then (a/p) = (−1)^s.
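Gauss’ Lemma can be checked numerically. In the sketch below (function names are ours), the condition |aℓ| ≡ −aℓ (mod p) is read as "the least nonnegative residue of aℓ exceeds (p − 1)/2", and the result is compared against Euler's criterion a^{(p−1)/2} ≡ (a/p) (mod p); p is assumed to be an odd prime not dividing a.

```python
def legendre_gauss(a, p):
    """Legendre symbol (a/p) via Gauss' Lemma, for an odd prime p not dividing a."""
    half = (p - 1) // 2
    # s counts the l in 1..(p-1)/2 whose multiple a*l lands in the upper half.
    s = sum(1 for l in range(1, half + 1) if (a * l) % p > half)
    return (-1) ** s


def legendre_euler(a, p):
    """Legendre symbol (a/p) via Euler's criterion, for comparison."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


for p in (3, 5, 7, 11, 13):
    print(p, [legendre_gauss(a, p) for a in range(1, p)])
```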
Proof. We just need to look at how many even numbers are between 1 and (p − 1)/2
versus (p + 1)/2 and p.
Case 3: p = 3+8k. Then (p−1)/2 = 4k+1, so there are 2k even numbers between 1 and
(p − 1)/2 and 2k + 1 even numbers between (p + 1)/2 and p, so (2/p) = −1.
Proof. Consider f (z) = 2i sin(2πz), which has some nice properties. It’s odd, so
−f (z) = f (−z). It’s also 1-periodic, so f (z) = f (z +1). Note that i sin(z) = sinh(iz),
so
$f(z) = e^{2\pi i z} - e^{-2\pi i z}$.
Define ζ = ζ_p to be the pth root of unity $e^{2\pi i/p}$. Note that $\zeta^m \cdot \zeta^n = \zeta^{m+n \bmod p}$
and $(\zeta^m)^{\ell} = \zeta^{m\ell \bmod p}$.
We also have the following Proposition and Lemma, listed after the proof.
Notice how this expression is almost symmetric in p and q, with only one difference
in the final term. In fact, switching them out only requires a factor of $(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$.
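Then recall that the sequence {|a|, |2a|, …, |(p − 1)a/2|} is just {1, 2, …, (p − 1)/2},
which gets us that
\[
\left(\frac{a}{p}\right) \prod_{\ell=1}^{(p-1)/2} f\!\left(\frac{|a\ell|}{p}\right) = \left(\frac{a}{p}\right) \prod_{\ell=1}^{(p-1)/2} f\!\left(\frac{\ell}{p}\right).
\]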
Lemma 10.5. If n is odd then
\[
x^n - y^n = \prod_{k=0}^{n-1} \left(x\zeta^{k} - y\zeta^{-k}\right),
\]
where ζ = ζ_n.
Then
\[
x^n - y^n = \prod_{k=0}^{n-1} \left(x - y\zeta^{-2k}\right) = \Bigl[\prod_{k=0}^{n-1} \left(x\zeta^{k} - y\zeta^{-k}\right)\Bigr] \zeta^{-n(n-1)/2},
\]
where $\zeta^{-n(n-1)/2} = 1$ since n is odd.
if n is odd.
Proof. Notice that $f(nz) = e^{2\pi i n z} - e^{-2\pi i n z}$. Now just apply the previous lemma.
10.1 Applications of QR
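We factor as
\[
\left(\frac{23 \cdot 31}{1009}\right) = \left(\frac{23}{1009}\right)\left(\frac{31}{1009}\right) = \left(\frac{1009}{23}\right)\left(\frac{1009}{31}\right) = \left(\frac{20}{23}\right)\left(\frac{17}{31}\right).
\]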
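Definition (Jacobi Symbol). Let (a/n) be the Jacobi symbol, where
$n = \prod_i p_i^{v_i}$ is an odd product of primes. It is defined multiplicatively by
\[
\left(\frac{a}{n}\right) = \begin{cases} 0 & (a, n) \ne 1, \\ \prod_i \left(\frac{a}{p_i}\right)^{v_i} & (a, n) = 1. \end{cases}
\]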
• $(-1/n) = (-1)^{(n-1)/2}$
• $(2/n) = (-1)^{(n^2-1)/8}$
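These rules, together with reciprocity for odd arguments, are enough to evaluate a Jacobi symbol without factoring. The following Python sketch uses the standard binary-style algorithm; the function name jacobi is ours and n is assumed to be odd and positive. It is checked on 23 · 31 = 713 modulo the prime 1009.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = (-1)^((n^2 - 1)/8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: flip the symbol, with a sign when both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(jacobi(23 * 31, 1009), jacobi(20, 23) * jacobi(17, 31))
```

Since 1009 is prime, the first value can also be compared against Euler's criterion, pow(713, 504, 1009).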
Next Class
• Use Jacobi to talk about when a is a quadratic residue for almost all primes
11 Monday, 18 February
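Recall the definition of the Jacobi Symbol. This is like the multiplicative extension
of the Legendre Symbol, although we lose the nice property that (a/n) = 1 if and
only if a is a quadratic residue modulo n. We do have the following properties:
\[
\left(\frac{a}{n}\right)\left(\frac{b}{n}\right) = \left(\frac{ab}{n}\right) \quad\text{and}\quad \left(\frac{a}{n}\right)\left(\frac{a}{\ell}\right) = \left(\frac{a}{n\ell}\right).
\]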
• $(-1/n) = (-1)^{(n-1)/2}$
• $(2/n) = (-1)^{(n^2-1)/8}$
Theorem 11.2. If a is a non-square, there are infinitely many primes such that
(a/p) = −1, that is where a is not a residue modulo p.
Proof. Assume that $a = 2^e \cdot \prod_i q_i$, where q_1, …, q_m are distinct odd primes and e ∈ {0, 1}. We
assume here that a is square free, since we can always reduce the exponents modulo
2 to get rid of the square part. Fix any set of primes ℓ_1, …, ℓ_k distinct from 2 and the q_i.
We want to show that there is a prime p not in this list such that (a/p) = −1.
We do this by building such a number. By CRT we know there is an x such that
x ≡ 1 (mod 8), x ≡ 1 (mod ℓ_i) for all i, x ≡ 1 (mod q_i) for i < m, and x ≡ s (mod q_m), where (s/q_m) = −1. Consider that
\[
\left(\frac{a}{x}\right) = \left(\frac{2^e}{x}\right) \prod_i \left(\frac{q_i}{x}\right) = 1 \cdot \prod_i \left(\frac{x}{q_i}\right) (-1)^{\frac{x-1}{2}\cdot\frac{q_i-1}{2}} = \prod_i \left(\frac{x}{q_i}\right) = \left(\frac{s}{q_m}\right) = -1.
\]
Then we use the multiplicative nature of the Jacobi symbol to say that
\[
\left(\frac{a}{x}\right) = -1 = \prod_i \left(\frac{a}{p_i}\right)^{v_i} \quad\text{where } x = \prod_i p_i^{v_i},
\]
and we know that p_i ≠ q_j since otherwise the congruence of x modulo q_j would be zero. Likewise p_i ≠ ℓ_j, since x ≡ 1 (mod ℓ_j).
Since we already know this product equals −1, there must be some (at least one) p_i such that
(a/p_i) = −1. This is really similar to Euclid’s proof of the infinitude of primes.
Note: The above assumes that a ≠ 2 since we implicitly assumed there was at least
one odd prime factor. If a = 2, then it is a nonresidue if and only if p ≡ 3, 5 (mod 8). There
are infinitely many primes p ≡ 3 (mod 8).
Another example is Enigma (yay Alan Turing) from WWII, where the ability to
read or send the messages was dependent on a (very high) number of possible
dial-combinations, which made it easy to use but computationally difficult to break.
Public Key cryptography sets up a system where anyone can encrypt a message for
Alice, but only she may decrypt such a message. This is accomplished by having
two different keys. In private, Alice picks two primes p, q and publicly announces
their product n = pq. Privately, she can compute ϕ(n) = (p − 1)(q − 1). She then
picks an encryption exponent e relatively prime to ϕ(n) and announces this too, and then privately computes
d = e^{−1} mod ϕ(n). Let’s say that Bob wants to send a message to Alice. Suppose
this message is some number P between 2 and n (1 fails for obvious reasons). Bob
takes P and encrypts it via C = P^e mod n and sends C to Alice. When she receives
it, she takes $C^d = (P^e)^d \equiv P^{e \cdot e^{-1} \bmod \varphi(n)} = P \pmod{n}$. If Eve is looking in on this
transmission, she can see C = P^e and she even knows what e is! However, given
a composite number n, it is computationally easy to compute P^e mod n, but it is
nearly intractable to find P given P^e. This means that Eve can’t really decrypt
the message by brute force. Nor can she compute ϕ(n), since it is also very hard to
compute ϕ(n) from n; we believe it to be as hard as factoring n, which is not easy
to do. (Footnote 1: We think this is the case.)
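A toy run of this scheme can be written in a few lines of Python. The function names and the tiny primes p = 61, q = 53, e = 17 are illustrative choices of ours, far too small for real use, and the sketch assumes Python 3.8 or later for pow(e, -1, phi).

```python
def rsa_keygen(p, q, e):
    """Return ((n, e), d): Alice's public key and her private exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # raises ValueError if e is not invertible mod phi
    return (n, e), d


def rsa_encrypt(P, public_key):
    """Bob's step: C = P^e mod n."""
    n, e = public_key
    return pow(P, e, n)


def rsa_decrypt(C, d, n):
    """Alice's step: C^d mod n recovers P."""
    return pow(C, d, n)


public_key, d = rsa_keygen(61, 53, 17)
n, e = public_key
C = rsa_encrypt(65, public_key)
print(n, C, rsa_decrypt(C, d, n))
```

Eve sees n, e and C in this run; recovering d the way Alice does would require ϕ(n), which is where the factoring assumption enters.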
Alice and Bob agree publicly on a public prime p, and some primitive root-ish r.
(Footnote 2: This might be hard to find, so we can find something with a large enough order
and just go with that.)
Now they will each privately choose keys k_A and k_B, which they don’t reveal to
anyone. Alice then transmits $c_A = r^{k_A} \bmod p$ and Bob transmits $c_B = r^{k_B} \bmod p$.
Alice takes $c_B^{k_A} = r^{k_A k_B} \bmod p$ and Bob takes $c_A^{k_B} = r^{k_A k_B} \bmod p$. This is their
shared secret. Note however that the secret they end up with, $r^{k_A k_B} \bmod p$, is different
from the keys they started with, but they both end up with the same shared secret.
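The exchange is short enough to simulate directly. In this sketch the function names are ours and the parameters p = 23, r = 5, k_A = 6, k_B = 15 are toy values chosen only for illustration.

```python
def dh_public(r, k, p):
    """Value a party transmits: r^k mod p for their private key k."""
    return pow(r, k, p)


def dh_shared(c_other, k_own, p):
    """Shared secret: the other party's transmission raised to one's own key."""
    return pow(c_other, k_own, p)


p, r = 23, 5           # public parameters
k_A, k_B = 6, 15       # private keys, never transmitted

c_A = dh_public(r, k_A, p)   # Alice sends this
c_B = dh_public(r, k_B, p)   # Bob sends this

secret_A = dh_shared(c_B, k_A, p)
secret_B = dh_shared(c_A, k_B, p)
print(secret_A == secret_B)
```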
Example 11.1. Imagine that Vince is color-blind, and cannot tell red
from green. Paula has a red sock and a green sock. When Vince sees
these, he can’t tell which is which, but Paula wants to prove to Vince
that she can distinguish between them. To do this, Paula hands both
socks to Vince. In each round,
• Vince produces a sock (he doesn’t know which one), shows it
to Paula, and puts it behind his back. He then produces a second
sock (either S_1 or S_2) and asks Paula whether or not it is the
same sock.
• Paula answers him each time. If she couldn’t tell the difference, she
would have to guess which sock it was, and so in total she fails with
probability 1 − 0.5^n after n rounds. If, however, she does see the
difference then Vince is confident that she does so with the same
probability.
Thus Vince can be as sure as he wants to be that Paula can see the colors
without ever actually learning which sock is which.
Example 11.2. Paula wants to prove her identity to the world. She
picks primes p, q, u in private, and announces to the world “I am Paula!
n = pq and v = u^2.” Then Paula is anyone who knows √v = u without
showing what u is. To do so, she will
1. Pick an r at random and send x = r^2 mod n to Vince.
12 Monday, 25 February 2019
13 Monday, 4 March 2019
• ψ(x) ∼ x,
• ϑ(x) ∼ x,
• Π(x) ∼ x/ log x.
Lemma 13.4. $M_{\log}(x) = \sum_{n \le x} \log n = x \log x - x + O(\log x)$.
Proof. Consider
\[
\sum_{n \le x} \log n \cdot 1.
\]
We apply Abel summation using f(n) = 1 and φ(x) = log x to get that
\[
\sum_{n \le x} \log n \cdot 1 = \lfloor x \rfloor \log x - \int_1^x \frac{\lfloor t \rfloor}{t}\, dt,
\]
holds.
Proof. Recall that $M_{\log}(x) = \sum_{n \le x} \log n$, and that
\[
\log n = \sum_{d \mid n} \Lambda(d).
\]
Then
\[
M_{\log}(x) = \sum_{n \le x} \sum_{d \mid n} \Lambda(d) = \sum_{dq \le x} \Lambda(d) = \sum_{q \le x} \sum_{d \le x/q} \Lambda(d) = \sum_{q \le x} \psi\!\left(\frac{x}{q}\right).
\]
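Both this identity and the estimate in Lemma 13.4 are easy to sanity-check numerically for integer x. The sketch below is only a floating-point check, not a proof; the names mangoldt and compare are ours, and ψ is tabulated as the running sum of Λ.

```python
from math import log


def mangoldt(n):
    """Von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if p * p > n:
            return log(n)          # no divisor up to sqrt(n): n is prime
        if n % p == 0:             # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0


def compare(x):
    """Return (sum of log n, sum of psi(x/q), x log x - x) for integer x >= 1."""
    psi = [0.0] * (x + 1)
    for n in range(1, x + 1):
        psi[n] = psi[n - 1] + mangoldt(n)
    m_log = sum(log(n) for n in range(1, x + 1))
    via_psi = sum(psi[x // q] for q in range(1, x + 1))
    return m_log, via_psi, x * log(x) - x


print(compare(1000))
```

The first two values should agree up to rounding, and the third should differ from them by an amount comparable to log x.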
14 Monday, 25 March 2019
Lemma. Let q be prime. There are infinitely many primes p ≡ 1 (mod q).
Proof. Look at $\Phi_q(x) = \frac{x^q - 1}{x - 1} = x^{q-1} + \cdots + 1$. We claim that if a prime p ≠ q divides Φ_q(x) then p ≡ 1
(mod q). Note that p | x^q − 1, so the order of x modulo p is either q or 1, the latter meaning x ≡ 1 (mod p). In the former case, we know that
the order of x is q, so q | p − 1, which implies that p ≡ 1 (mod q). In the latter case,
Φ_q(x) ≡ q (mod p), so we would get that p | q, which is impossible since p and q are distinct primes.
To prove that there are infinitely many such primes, suppose we have some finite
list of primes p_1, …, p_ℓ. Notice that Φ_q(x) ≡ 1 (mod x) for all x, so let x be q times the
product of these finitely many primes. Then any prime which divides Φ_q(x) is coprime to x, so it is neither q nor on our list, and it must itself be
congruent to 1 modulo q. Since such prime factors must exist, we know that
there exist infinitely many primes which are congruent to 1 modulo any prime q.
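The argument is constructive enough to run for small cases. The sketch below (all function names are ours) repeatedly evaluates Φ_q at q times the product of the primes found so far and records the smallest prime factor of the result; factoring is by trial division, so it is only practical for a few rounds.

```python
def prime_factors(n):
    """Set of prime factors of n >= 2, by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def phi_q(q, x):
    """Value of x^(q-1) + ... + x + 1, i.e. the q-th cyclotomic polynomial for prime q."""
    return sum(x ** i for i in range(q))


def build_primes(q, count):
    """Produce `count` distinct primes congruent to 1 modulo the prime q."""
    known = []
    for _ in range(count):
        x = q
        for p in known:
            x *= p
        # Every prime factor of phi_q(q, x) is coprime to x, hence new.
        known.append(min(prime_factors(phi_q(q, x))))
    return known


print(build_primes(3, 3))
print(build_primes(5, 2))
```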
$\psi_a(x) = \zeta^{ax}$,
where ζ is a primitive mth root of unity. One may trivially check that this is, in
fact, a homomorphism.
A list of facts:
2. Orthogonality relations.
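•
\[
\sum_{g \in G} \psi(g) = \begin{cases} |G| & \text{if } \psi \equiv 1, \\ 0 & \text{otherwise.} \end{cases}
\]
•
\[
\sum_{\psi \in \widehat{G}} \psi(g) = \begin{cases} |G| & \text{if } g = 0, \\ 0 & \text{otherwise.} \end{cases}
\]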
χ_0 = 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, …
Example 14.3. Let χ = χ_0 and let q be prime. Then $L(s, \chi_0) = \sum 1/n^s$,
where the sum is over n ≢ 0 (mod q).
Remark — Why care about these characters? Well, they are really good at picking
out numbers which are 1 modulo q.
Theorem. There are infinitely many numbers x ≡ a (mod q). Yeah, this is easy.
If there were finitely many n ≡ 0 (mod m), then the right hand side would need to
be finite. However, we can show rather easily that
\[
\lim_{s \to 1^+} \sum_{\psi} L(s, \psi) = \infty,
\]
and so there must be infinitely many n. The obstacles we’ll face are
• We want to restrict our attention to things that are only relatively prime to q.
Convergence of L-functions
\[
\sum_{x \in G} \chi(x) = 0 \implies \sum_{x=0}^{q-1} \chi(x) = 0 \quad \text{(by orthogonality)},
\]
so consider
\[
\Bigl|\sum_{n \le x} \chi(n)\Bigr| = \Bigl|\sum_{n \le kq} \chi(n) + \sum_{n = kq+1}^{x} \chi(n)\Bigr| \le \varphi(q),
\]
where the first term must be 0, and in the latter, there are at most ϕ(q) summands
which are relatively prime to q. This implies that the L-functions always converge.
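The boundedness of these partial sums can be observed on a concrete nonprincipal character. As an illustration of ours (not taken from the lecture), the sketch below uses the Legendre symbol modulo an odd prime q, computed by Euler's criterion, and tracks the largest partial sum in absolute value; for prime q we have ϕ(q) = q − 1.

```python
def legendre_character(q):
    """Return chi(n) = (n/q), a nonprincipal character modulo the odd prime q."""
    def chi(n):
        n %= q
        if n == 0:
            return 0
        return 1 if pow(n, (q - 1) // 2, q) == 1 else -1
    return chi


def max_partial_sum(chi, limit):
    """Largest |chi(1) + ... + chi(x)| over 1 <= x <= limit."""
    total, worst = 0, 0
    for n in range(1, limit + 1):
        total += chi(n)
        worst = max(worst, abs(total))
    return worst


for q in (3, 5, 7, 11, 13):
    chi = legendre_character(q)
    print(q, sum(chi(n) for n in range(q)), max_partial_sum(chi, 10 * q))
```

The middle column is the sum over one full period, which orthogonality forces to be 0; the last column stays at most q − 1 no matter how far the sum is extended.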
In particular,
\[
L(s, \chi_0) = \zeta(s) \prod_{p \mid q} \left(1 - \frac{1}{p^s}\right).
\]