
Dist of Primes Notes

The document discusses the distribution of prime numbers, focusing on analytic techniques and various conjectures related to primes, including the Cramér model and Hardy–Littlewood k-tuples conjectures. It explores questions about the behavior of primes, their gaps, and their equidistribution, while also providing mathematical definitions and theorems relevant to these topics. The document concludes with predictions and implications of the Cramér model regarding twin primes and the need for adjustments to account for small prime influences.

Uploaded by

Angelo Oppio

THE DISTRIBUTION OF PRIME NUMBERS

VIVIAN KUPERBERG

CONTENTS
1. Overview
2. The Cramér Model
3. Hardy–Littlewood k-tuples conjectures
References

1. OVERVIEW
The study of prime numbers is a beating heart of number theory. In this course we’ll see analytic
techniques applied to questions about prime numbers and their distribution, and we’ll get an
understanding of what we should expect from prime numbers and from our proof techniques.
As an initial guide, here are some immediate questions one can (and does) ask about prime
numbers:
• Are there infinitely many twin primes?
• How many twin primes are there?
• Are primes equidistributed mod q?
• How large does x have to be so that primes look equidistributed mod q? In other words, for
which q can we show that primes up to x are equidistributed mod q?
• Can every even number ≥ 4 be written as a sum of two primes?
• Can every odd number ≥ 7 be written as a sum of three primes?
• What is the largest (and smallest) gap between primes ≤ x?
• How often do large and small gaps occur?
• How many primes lie in the interval [x, x + h(x)] for different functions h(x)? For which
values of h is there an asymptotic answer? How much does this count vary?
The list goes on. Underpinning all of these questions is the following general question:
• In what ways do the primes behave like a “random sequence”? In what ways do they not?
We begin by formalizing what we mean by a “random sequence,” through something called the
Cramér model.

2. THE CRAMÉR MODEL


“It is evident that the primes are randomly distributed but, unfortunately, we do not know
what ‘random’ means.” – R. C. Vaughan

The most basic quantitative result about prime numbers is the prime number theorem.

Theorem 2.1 (Prime Number Theorem, Hadamard–de la Vallée Poussin). Let π(x) denote the number
of primes p ≤ x. Then
\[ \pi(x) = \operatorname{li}(x)(1 + o(1)) := (1 + o(1)) \int_2^x \frac{dt}{\ln t} = \frac{x}{\ln x}(1 + o(1)). \]
Correspondingly, we would like a random model for the prime numbers that, at a minimum,
agrees with the prime number theorem on average.
To begin with, we can discretize the logarithmic integral appearing in the prime number theorem
into a sum over integers n:
Exercise 2.2. Show that
\[ \int_2^x \frac{dt}{\ln t} - \sum_{2 \le n \le x} \frac{1}{\ln n} = O(\ln x). \]

The benefit of this expression is that, well, the integers are discrete. In particular, the statement
“the prime number theorem should hold for a typical sequence in our random model of primes” is
equivalent to saying “the probability that n is prime should be about 1/log n.” Assuming that (and
this is a big assumption!) each n is independently prime with probability 1/log n leads us to the Cramér
model for prime numbers.
Definition 2.3 (Cramér’s random model). For integers n ≥ 2, let X(n) be a sequence of independent
random variables defined by
\[ X(n) := \begin{cases} 1 & \text{with probability } \frac{1}{\log n}, \\ 0 & \text{with probability } 1 - \frac{1}{\log n}, \end{cases} \]
which we think of as a randomized model for the behavior of 1_P, the indicator function of the
prime numbers.
By construction, we have
\[ \mathbb{E} \sum_{2 \le n \le x} X(n) = \sum_{2 \le n \le x} \mathbb{E} X(n) = \sum_{2 \le n \le x} \frac{1}{\ln n} = \operatorname{li}(x) + O(\ln x), \]

so we have correctly calibrated our random model to agree with PNT.
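As a quick sanity check on this calibration, the following Python sketch draws one sample from the Cramér model and compares the number of "random primes" up to x with the sum of 1/ln n. The function names, the seed, and the decision to cap the probability at 1 (needed at n = 2, where 1/ln 2 > 1 is not a valid probability) are illustrative assumptions and not part of the notes.

```python
import math
import random


def cramer_sample(x, seed=0):
    """Draw X(2), ..., X(x) from the Cramér model; return the n with X(n) = 1."""
    rng = random.Random(seed)
    sample = []
    for n in range(2, x + 1):
        # Assumption: cap at 1, since 1/ln(2) > 1 is not a valid probability.
        p = min(1.0, 1.0 / math.log(n))
        if rng.random() < p:
            sample.append(n)
    return sample


def expected_count(x):
    """Sum of E X(n) over 2 <= n <= x, using the same cap at n = 2."""
    return sum(min(1.0, 1.0 / math.log(n)) for n in range(2, x + 1))


if __name__ == "__main__":
    x = 10**5
    print("sampled count :", len(cramer_sample(x)))
    print("expected count:", round(expected_count(x), 1))
```

A single sample should land within a few multiples of the square root of the expected count, which is the kind of fluctuation the model predicts for π(x) itself.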


Now that we have the Cramér model, we can compare it to experimental data for various
questions about prime numbers and see where we land.
Example 2.4 (Frequency of prime gaps). A typical prime gap between p_k and the next prime is of
size log p_k ∼ log k, but one can ask a more refined question: for fixed 0 ≤ α < β and primes p_k for
k ≤ n, how often is p_{k+1} − p_k ∈ [α log k, β log k]? That is, does the limit
\[ \lim_{n \to \infty} \frac{1}{n} \#\left\{ 1 \le k \le n : \frac{p_{k+1} - p_k}{\log k} \in [\alpha, \beta] \right\} \]
exist, and what is it?
This probability is given by
\[ \sum_{\alpha \log k \le h \le \beta \log k} \prod_{j=1}^{h-1} \left(1 - \frac{1}{\log(p_k + j)}\right) \frac{1}{\log(p_k + h)} \sim \sum_{\alpha \log k \le h \le \beta \log k} \left(1 - \frac{1}{\log k}\right)^{h-1} \frac{1}{\log k}, \]
since log(p_k + j) ∼ log k. This is
\[ \sim \sum_{\alpha \le h/\log k \le \beta} \frac{1}{\log k}\, e^{-h/\log k} \sim \int_\alpha^\beta e^{-t}\, dt. \]

So, the probability density of finding p_{k+1} − p_k close to t log k is e^{−t}; in other words, primes are
predicted to form a Poisson process.
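The prediction can be compared with actual primes. The sketch below sieves the primes up to a bound and measures how often the normalized gap falls in [α, β]. Two choices here are assumptions made for illustration: gaps are normalized by log p_k rather than log k (the two are asymptotically equivalent, as noted above), and very small primes are skipped through the hypothetical `start` parameter. Convergence is slow, so only rough agreement should be expected at this range.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def gap_fraction(primes, alpha, beta, start=100):
    """Fraction of consecutive pairs p < q with (q - p)/log p in [alpha, beta]."""
    hits = total = 0
    for p, q in zip(primes, primes[1:]):
        if p < start:
            continue
        total += 1
        if alpha <= (q - p) / math.log(p) <= beta:
            hits += 1
    return hits / total


def poisson_prediction(alpha, beta):
    """Integral of e^{-t} over [alpha, beta]."""
    return math.exp(-alpha) - math.exp(-beta)


if __name__ == "__main__":
    primes = primes_up_to(10**6)
    for a, b in [(0, 1), (1, 2), (2, 3)]:
        print((a, b), round(gap_fraction(primes, a, b), 4),
              round(poisson_prediction(a, b), 4))
```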
Our next example will make use of the Borel–Cantelli Lemma:
Lemma 2.5 (Borel–Cantelli). Let E_1, E_2, . . . be a sequence of events in a probability space. If the sum of the
probabilities of the events {E_n} is finite, i.e.
\[ \sum_{n=1}^{\infty} \mathbb{P}(E_n) < \infty, \]
then the probability that infinitely many events occur is 0, that is,
\[ \mathbb{P}\left( \limsup_{n \to \infty} E_n \right) = 0. \]

Example 2.6 (Cramér and large gaps). This example was in fact Cramér’s motivation for defining
the random model. Say we want to approximate the probability that the primes contain a gap of
width at least g for some integer g. This is the same as saying that for some n, for all 1 ≤ v ≤ g,
each n + v is nonprime (where we imagine that g grows with n). The probability of this event
happening is
\[ \prod_{v=1}^{g} \left(1 - \frac{1}{\log(n + v)}\right). \]
Say g = c(log n)^2 for some constant c > 0. Then, discarding the small fluctuations in log, we get
\[ \approx \left(1 - \frac{1}{\log n}\right)^{c(\log n)^2} \sim e^{-c \log n} = n^{-c}. \]
If c > 1, then the expected number of occurrences of gaps of size c(log n)^2 is therefore
\[ \sum_{n=2}^{\infty} \prod_{v=1}^{c(\log n)^2} \left(1 - \frac{1}{\log(n + v)}\right) \ll \sum_{n=2}^{\infty} \frac{1}{n^c} < \infty, \]
so we expect to see gaps of size c(log n)^2 only finitely often by Borel–Cantelli (Lemma 2.5). By
contrast if c ≤ 1, the expectation is
\[ \sum_{n=2}^{\infty} \prod_{v=1}^{c(\log n)^2} \left(1 - \frac{1}{\log(n + v)}\right) \gg \sum_{n=2}^{\infty} \frac{1}{n^c} \to \infty, \]
so we expect to see gaps of width c(log n)^2 infinitely often. Cramér conjectured therefore that
\[ \limsup_{k \to \infty} \frac{p_{k+1} - p_k}{(\log p_k)^2} = 1, \tag{2.1} \]
where here p_k denotes the kth prime.
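To get a feel for (2.1), one can tabulate the ratio (p_{k+1} − p_k)/(log p_k)^2 for actual primes. The sketch below is only an illustration: the helper names and the cutoff `start` (used to skip tiny primes, where the ratio is distorted by the small value of log p) are assumptions, and a finite computation says nothing about the lim sup itself.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def max_cramer_ratio(limit, start=100):
    """Largest (q - p)/(log p)^2 over consecutive primes start <= p < q <= limit.

    Returns (ratio, p, gap) for the pair attaining the maximum.
    """
    primes = primes_up_to(limit)
    best = (0.0, None, None)
    for p, q in zip(primes, primes[1:]):
        if p >= start:
            ratio = (q - p) / math.log(p) ** 2
            if ratio > best[0]:
                best = (ratio, p, q - p)
    return best


if __name__ == "__main__":
    ratio, p, gap = max_cramer_ratio(10**6)
    print(f"max ratio {ratio:.4f} at p = {p} with gap {gap}")
```

In this range the maximum stays below 1, which is at least not in conflict with the conjecture.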
Let us record a few more predictions of the Cramér model before moving on to evaluating the
strength of these predictions.
Example 2.7 (Cramér and Riemann). What would the Cramér model say, for example, about the
Riemann hypothesis? Recall that the Riemann hypothesis, in one of its many forms, is the statement
that for any fixed ε > 0, we have
\[ \sum_{n \le x} \Lambda(n) = x + O_\varepsilon(x^{1/2+\varepsilon}), \]

where Λ(n) is the von Mangoldt function given by
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k, \\ 0 & \text{else.} \end{cases} \]
We have immediately (this can be taken as an exercise) that
\[ \sum_{\substack{p^k \le x \\ k \ge 2}} \Lambda(p^k) = O_\varepsilon(x^{1/2+\varepsilon}), \]
so we can restrict our attention to
\[ \sum_{n \le x} \mathbf{1}_{\mathcal{P}}(n) \log n, \]
which we can model by
\[ \mathbb{E} \sum_{2 \le n \le x} X(n) \log n. \]
By Borel–Cantelli (Lemma 2.5), it suffices to obtain the estimate
\[ \sum_{2 \le n \le x} X(n) \log n = x + O_\varepsilon(x^{1/2+\varepsilon}) \tag{2.2} \]
with probability 1 − O(x^{−2}), say, in which case exceptions happen only finitely often.
We consider the random variable Y(n) := X(n) log n − 1; the Y(n) have mean zero and are
independent, and have size O(log n) = O(log x) = O(x^{o(1)}). Thus for any fixed natural number k,
\[
\begin{aligned}
\mathbb{E} \Big( \sum_{2 \le n \le x} Y(n) \Big)^k &= \mathbb{E} \sum_{2 \le n_1, \dots, n_k \le x} Y(n_1) Y(n_2) \cdots Y(n_k) \\
&= \sum_{2 \le n_1, \dots, n_k \le x} \mathbb{E}\big(Y(n_1) Y(n_2) \cdots Y(n_k)\big).
\end{aligned}
\]
Since the Y(n) are independent with mean zero, any term Y(n_1)Y(n_2) · · · Y(n_k) where some n_i
appears exactly once will have mean zero. Thus only terms with at most k/2 distinct indices n_i have
nonzero mean; there are O_k(x^{k/2}) of these terms and each one contributes a power of log x = O(x^{o(1)}), so
the expectation overall satisfies
\[ \mathbb{E} \Big( \sum_{2 \le n \le x} Y(n) \Big)^k = O(x^{k/2 + o(1)}). \]
In particular the variance of ∑_{2≤n≤x} Y(n) is O(x^{1+o(1)}).


We now apply Chebyshev’s inequality, which says that for a random variable X with variance σ^2
and mean 0, for any real number r > 0, for any natural number k ≥ 2,
\[ \Pr\left( |X| \ge r\, \mathbb{E}(|X|^k)^{1/k} \right) \le \frac{1}{r^k}. \]
In our case this tells us that
\[ \Pr\Big( \sum_{2 \le n \le x} Y(n) = O(x^{1/2+\varepsilon}) \Big) \ge 1 - O(x^{-k\varepsilon + o(1)}), \]
so taking k large enough depending on ε proves that
\[ \sum_{2 \le n \le x} Y(n) = O_\varepsilon(x^{1/2+\varepsilon}) \]
with probability 1 − O(x^{−2}), and then expanding the definition of Y(n) proves (2.2).

Note that in this example we really used very little about the Cramér model: just that the Y (n) are
bounded and that they have mean zero. This is, accordingly, a very general principle in probability:
random variables with mean zero exhibit square root cancellation.
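Square root cancellation is easy to observe numerically. The following sketch samples the centered sum of the Y(n) under the Cramér model and compares it with the scale √(x log x), which is roughly the standard deviation since Var Y(n) = log n − 1. Starting the sum at n = 3 is an assumption made here so that 1/log n is a genuine probability and Y(n) really has mean zero; the function names and seeds are illustrative.

```python
import math
import random


def centered_sum(x, seed=0):
    """One sample of the sum of Y(n) = X(n) log n - 1 over 3 <= n <= x."""
    rng = random.Random(seed)
    total = 0.0
    for n in range(3, x + 1):
        log_n = math.log(n)
        x_n = 1 if rng.random() < 1.0 / log_n else 0  # X(n) in the Cramér model
        total += x_n * log_n - 1.0
    return total


def sqrt_scale(x):
    """sqrt(x log x), roughly the standard deviation of the centered sum."""
    return math.sqrt(x * math.log(x))


if __name__ == "__main__":
    x = 10**5
    for seed in range(5):
        s = centered_sum(x, seed)
        print(f"seed {seed}: sum = {s:9.1f}, sum / sqrt(x log x) = {s / sqrt_scale(x):6.3f}")
```

The normalized values stay of order 1 even though the sum has about x terms, each of size up to log x.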
Finally, let’s see what Cramér says about twin primes.
Example 2.8 (Cramér and twin primes). According to the Cramér model, the number of twin
primes n ≤ x is expected to be
\[
\begin{aligned}
\mathbb{E} \sum_{n \le x} X(n) X(n+2) &= \sum_{n \le x} \mathbb{E}\big(X(n) X(n+2)\big) \\
&= \sum_{n \le x} \mathbb{E} X(n)\, \mathbb{E} X(n+2) \\
&= \sum_{n \le x} \frac{1}{(\ln n)(\ln(n+2))} \\
&\sim \sum_{n \le x} \frac{1}{(\ln n)^2} \sim \operatorname{li}_2(x) \sim \frac{x}{(\ln x)^2},
\end{aligned}
\]
where the kth logarithmic integral li_k(x) is
\[ \operatorname{li}_k(x) := \int_2^x \frac{dt}{(\ln t)^k}. \]
This is not perfect but reasonably good if we compare to computer data! Computations suggest
that the number of twin primes less than x is ≈ 1.32 li_2(x), so we seem to be correct to within a
constant factor.
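The constant-factor discrepancy is visible already for modest x. The sketch below counts twin primes with a sieve and divides by a discrete stand-in for li_2(x), namely the sum of 1/(ln n)^2 over 2 ≤ n ≤ x (the two differ by a bounded amount). The function names are illustrative assumptions.

```python
import math


def prime_sieve(limit):
    """Bytearray s with s[n] == 1 exactly when n is prime, for 0 <= n <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve


def twin_prime_count(x):
    """Number of n <= x such that n and n + 2 are both prime."""
    sieve = prime_sieve(x + 2)
    return sum(1 for n in range(2, x + 1) if sieve[n] and sieve[n + 2])


def li2_sum(x):
    """Sum of 1/(ln n)^2 over 2 <= n <= x, a discrete stand-in for li_2(x)."""
    return sum(1.0 / math.log(n) ** 2 for n in range(2, x + 1))


if __name__ == "__main__":
    for x in (10**4, 10**5, 10**6):
        count = twin_prime_count(x)
        print(x, count, round(count / li2_sum(x), 4))
```

The printed ratio should hover around 1.3 rather than 1, in line with the remark above.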
However, the example of twin primes suggests a different example where the Cramér model is
hopelessly wrong. Say instead of twin primes we asked how many consecutive primes there are less
than x, i.e., how many n ≤ x there are such that n and n + 1 are both prime. We know the answer
to be 1 for all x ≥ 2, since n = 2, n + 1 = 3 is the only example; for larger n, either n or n + 1 is
even and thus composite. But the argument in Example 2.8 works in precisely the same way for
consecutive primes as for twin primes, so the Cramér model predicts li_2(x) pairs of consecutive
primes ≤ x, and in particular infinitely many.
So, the primes are not actually independent: for example, if n > 3 is prime, then n − 1 and n + 1
must not be. However, the problem in the example of consecutive primes is specifically that we
neglected to consider divisibility by 2, which suggests that we should adjust our model to include
the influence of small prime numbers. This adjustment is best encapsulated by the Hardy–Littlewood
k-tuples conjectures.

3. HARDY–LITTLEWOOD k-TUPLES CONJECTURES


The Hardy–Littlewood k-tuples conjectures, often referred to somewhat less descriptively as
the “first Hardy–Littlewood conjecture,” predict asymptotics for any linear correlations of prime
numbers. Fix k ≥ 1 and a k-tuple H = {h1 , . . . , hk } of distinct nonnegative integers. Then we may
ask, as a generalization of our question about twin primes, how many values of n ≤ x have the
property that n + h_i is prime for every 1 ≤ i ≤ k. In other words, if 1_P denotes the indicator of the
set P of prime numbers, then
\[ \sum_{n \le x} \mathbf{1}_{\mathcal{P}}(n + h_1)\, \mathbf{1}_{\mathcal{P}}(n + h_2) \cdots \mathbf{1}_{\mathcal{P}}(n + h_k) = \ ? \]
Conjecture 3.1 (Hardy–Littlewood k-tuples conjecture). For a fixed integer k ≥ 1 and a fixed k-tuple of
distinct nonnegative integers H = {h_1, . . . , h_k}, as x → ∞,
\[ \sum_{n \le x} \prod_{i=1}^{k} \mathbf{1}_{\mathcal{P}}(n + h_i) = \mathfrak{S}(\mathcal{H}) \operatorname{li}_k(x) + o(\operatorname{li}_k(x)), \tag{3.1} \]

where
\[ \mathfrak{S}(\mathcal{H}) = \prod_{p \text{ prime}} \frac{1 - \nu_p(\mathcal{H})/p}{(1 - 1/p)^k}, \]
and ν_p(H) denotes the number of distinct congruence classes modulo p occupied by elements of H.
It will also be useful to us to consider the Hardy–Littlewood conjecture with von Mangoldt
weights, given by the following:
Conjecture 3.2 (Hardy–Littlewood k-tuples conjecture with von Mangoldt weights). For a fixed
integer k ≥ 1 and a fixed k-tuple of distinct nonnegative integers H = {h_1, . . . , h_k}, as x → ∞,
\[ \sum_{n \le x} \prod_{i=1}^{k} \Lambda(n + h_i) = \mathfrak{S}(\mathcal{H})\, x + o(x). \tag{3.2} \]

An enlightening interpretation of this conjecture is as a correction of the independence assumption
that came from the Cramér model. The Cramér model would predict, just as we saw for twin
primes, that the right-hand side of (3.1) should be li_k(x). The Cramér model, however, assumes
that the primality of different integers is independent, which we can interpret as assuming that for
every prime p and for distinct integers n and m, the event that n is divisible by p is independent
from the event that m is divisible by p.
For a fixed prime p, the probability that k distinct randomly chosen integers are all nonzero mod
p is ((p − 1)/p)^k = (1 − 1/p)^k. But for a randomly chosen n, which we think of as large compared to
both p and h_1, . . . , h_k, we have
\[ n + h_i \not\equiv 0 \bmod p \ \ \forall i \iff n \not\equiv -h_i \bmod p \ \ \forall i. \]
This condition is precisely the condition that n does not land in one of ν_p(H) congruence classes
mod p. Thus there are p − ν_p(H) congruence classes mod p for which n satisfies this condition,
so the probability that a randomly chosen n has the property that n + h_i ≢ 0 mod p for all i is
(p − ν_p(H))/p = 1 − ν_p(H)/p. For each prime p, then, the pth component of S(H) involves dividing by the
independence assumption for divisibility by p and replacing it by the actual asymptotic probability
that no element of the k-tuple is divisible by p.
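The local factors are simple to compute. The sketch below evaluates a truncated version of the product defining S(H), taking primes only up to a hypothetical cutoff `prime_bound`; the function names and the cutoff are assumptions for illustration, and the truncated value is only an approximation to the infinite product.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def nu(H, p):
    """nu_p(H): number of distinct residue classes mod p occupied by H."""
    return len({h % p for h in H})


def singular_series(H, prime_bound=10**5):
    """Product of (1 - nu_p(H)/p) / (1 - 1/p)^k over primes p <= prime_bound."""
    k = len(set(H))
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= (1 - nu(H, p) / p) / (1 - 1 / p) ** k
    return product


if __name__ == "__main__":
    for H in [(0,), (0, 2), (0, 1), (0, 2, 6), (0, 2, 4)]:
        print(H, round(singular_series(H), 5))
```

For a single shift the product is 1, so the k = 1 case recovers the prime number theorem, while a tuple that covers every residue class modulo some prime gives exactly 0.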
Remark 3.3. Note that we are still using an independence assumption: we are assuming that the
events of being 0 mod p_1 and of being 0 mod p_2, for distinct primes p_1 and p_2, are independent.
But this independence assumption has a strong justification: by Sunzi’s remainder theorem, it is
true! And not only do we get pairwise independence, but more generally: if q is any fixed positive integer, then
as x → ∞, the Hardy–Littlewood conjectures (with the constant S(H) restricted to be a product
only over primes p|q) accurately predict the number of n ≤ x with (n + h_i, q) = 1 for all i.
Example 3.4 (Twin primes, revisited). When H = {0, 2}, the Hardy–Littlewood conjecture is
precisely addressing the question of twin primes. In this case the prediction is that
\[ \#\{ p \le x : p,\ p + 2 \text{ prime} \} = 2 \prod_{p \ge 3} \frac{1 - 2/p}{(1 - 1/p)^2}\, \operatorname{li}_2(x)(1 + o(1)). \]
This prediction matches numerical data extremely well. The product over p ≥ 3 (for reasons I don’t
understand, one usually excludes the 2) is called the twin prime constant and denoted C_2, so that
C_2 ≈ 0.66016 and the number of pairs of twin primes ≤ x is very close to 1.32032 li_2(x).
Example 3.5 (Consecutive primes, revisited). At this point we have mostly exhausted the utility of
this example, but let us revisit it once more to emphasize that it fails for the “correct” reason.
Here we would like to predict that there are only finitely many consecutive primes, so when
H = {0, 1} we should have S(H) = 0. Indeed, for p = 2, ν_p({0, 1}) = 2, since 0 and 1 take up two
congruence classes mod 2. Thus
\[ \frac{1 - \nu_2(\mathcal{H})/2}{(1 - 1/2)^2} = \frac{1 - 2/2}{(1 - 1/2)^2} = 0, \]
so S(H) = 0. This is capturing precisely
the same reason we discussed before: the fact that ν_2({0, 1}) = 2 is the fact that for all n, either n or
n + 1 is even, or 0 mod 2.
By the same argument, one can show that whenever S(H) = 0, there are only finitely many n
such that n + h_i is prime for all i. A tuple H is called admissible if S(H) ≠ 0.
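Admissibility can be tested exactly, without evaluating an infinite product. The sketch below relies on the observation that S(H) vanishes precisely when some local factor does, that is, when ν_p(H) = p for some prime p, and that k distinct shifts can only cover all residue classes mod p when p ≤ k. The function name is an illustrative assumption.

```python
import math


def is_admissible(H):
    """True if, for every prime p, the tuple H misses some residue class mod p."""
    k = len(set(H))
    # k distinct values can occupy all p classes only if p <= k.
    for p in range(2, k + 1):
        is_prime = all(p % d for d in range(2, math.isqrt(p) + 1))
        if is_prime and len({h % p for h in H}) == p:
            return False
    return True


if __name__ == "__main__":
    for H in [(0, 2), (0, 1), (0, 2, 4), (0, 2, 6), (0, 4, 6), (0, 2, 6, 8)]:
        print(H, is_admissible(H))
```

For instance, {0, 2, 4} fails at p = 3, since one of n, n + 2, n + 4 is always divisible by 3.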
3.1. Hardy–Littlewood and primes in short intervals. The power, in some cases, of the Cramér
model, is that beginning with the Hardy–Littlewood conjecture instead, one sometimes gets the
same answer. In this section, we’ll outline an argument due to Gallagher [2] about primes in
intervals of length λ log x, for some real constant λ > 0. Much of this exposition follows [4].
Let h = λ log x. To understand π(n + h) − π(n) as n varies over integers n ≤ x, consider the rth
moment
\[ \frac{1}{x} \sum_{n \le x} \big( \pi(n + h) - \pi(n) \big)^r = \frac{1}{x} \sum_{n \le x} \Big( \sum_{\substack{\ell = 1 \\ n + \ell \text{ prime}}}^{h} 1 \Big)^r. \tag{3.3} \]

The Cramér prediction would say that the rth moment is approximately
\[ \frac{1}{x}\, \mathbb{E} \sum_{2 \le n \le x} \Big( \sum_{\ell = 1}^{h} X(n + \ell) \Big)^r. \tag{3.4} \]
If (3.3) and (3.4) are roughly equal for r ≤ R, as long as R = R(x) tends to infinity, then
π(n + h) − π(n), for n chosen uniformly at random in [1, x], has a Poisson distribution with parameter λ,
because the Poisson distribution is determined by its moments. (See exercises for a bit more on
this).
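The first two moments of π(n + h) − π(n) can be measured directly. The sketch below is a rough experiment under stated assumptions: h is rounded to the nearest integer, the names are illustrative, and x is small enough that only loose agreement with the Poisson values (mean and variance both equal to λ) should be expected.

```python
import math


def prime_sieve(limit):
    """Bytearray s with s[n] == 1 exactly when n is prime, for 0 <= n <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve


def interval_count_stats(x, lam):
    """Mean and variance of pi(n + h) - pi(n) over 1 <= n <= x.

    Assumption: h is lam * log x rounded to the nearest integer (at least 1).
    Returns (h, mean, variance).
    """
    h = max(1, round(lam * math.log(x)))
    sieve = prime_sieve(x + h)
    # prefix[m] = pi(m), the number of primes <= m.
    prefix = [0] * (x + h + 1)
    for m in range(1, x + h + 1):
        prefix[m] = prefix[m - 1] + sieve[m]
    counts = [prefix[n + h] - prefix[n] for n in range(1, x + 1)]
    mean = sum(counts) / x
    var = sum((c - mean) ** 2 for c in counts) / x
    return h, mean, var


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        h, mean, var = interval_count_stats(10**5, lam)
        print(f"lambda = {lam}: h = {h}, mean = {mean:.3f}, variance = {var:.3f}")
```

The mean is slightly above λ at this range because the density of primes below x exceeds 1/log x on average; higher moments can be compared in the same way.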
Expanding the rth power in (3.3), we get
\[ \sum_{1 \le \ell_1, \dots, \ell_r \le h} \mathbf{1}_{\mathcal{P}}(n + \ell_1) \cdots \mathbf{1}_{\mathcal{P}}(n + \ell_r). \]

This is something we’d like to apply Hardy–Littlewood to, but we note that the ℓ_i might not all be
different, so we have to do a little counting in order to account for this. Suppose there are exactly k
distinct numbers among the ℓ_1, . . . , ℓ_r, and say that they are given by 1 ≤ h_1 < h_2 < · · · < h_k ≤ h.
The number of choices for ℓ_1, . . . , ℓ_r that yield the same distinct, ordered list h_1, . . . , h_k is given
by the number of different ways of mapping {1, 2, . . . , r} onto {1, . . . , k}. This number is k! times a
Stirling number of the second kind, and we will denote it by σ(r, k). Then (3.3) can be written as
\[
\begin{aligned}
&\sum_{k=1}^{r} \sigma(r, k) \sum_{1 \le h_1 < h_2 < \cdots < h_k \le h} \Big( \frac{1}{x} \sum_{n \le x} \mathbf{1}_{\mathcal{P}}(n + h_1) \cdots \mathbf{1}_{\mathcal{P}}(n + h_k) \Big) \\
&\qquad \sim \sum_{k=1}^{r} \frac{\sigma(r, k)}{(\log x)^k} \sum_{1 \le h_1 < \cdots < h_k \le h} \mathfrak{S}(\{h_1, \dots, h_k\}),
\end{aligned}
\]

where the right-hand side follows from the left by Hardy–Littlewood. Applying the same logic to
the Cramér model expression in (3.4), we get that the two models agree if and only if
\[ \sum_{1 \le h_1 < \cdots < h_k \le h} \mathfrak{S}(\{h_1, \dots, h_k\}) = \sum_{1 \le h_1 < \cdots < h_k \le h} 1, \tag{3.5} \]
that is, if the singular series constants S({h_1, . . . , h_k}) are 1 on average.
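For k = 2 the averaging in (3.5) can be checked numerically. The sketch below uses the closed form S({0, d}) = 2 C_2 ∏_{p | d, p > 2} (p − 1)/(p − 2) for even d and S({0, d}) = 0 for odd d, which follows from the definition because ν_p({0, d}) is 1 when p divides d and 2 otherwise. The function names and the truncation bound for C_2 are illustrative assumptions.

```python
import math


def twin_constant(prime_bound=10**5):
    """Truncated product C_2 over primes 3 <= p <= prime_bound."""
    sieve = bytearray([1]) * (prime_bound + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(prime_bound) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, prime_bound + 1, p)))
    c2 = 1.0
    for p in range(3, prime_bound + 1):
        if sieve[p]:
            c2 *= (1 - 2 / p) / (1 - 1 / p) ** 2
    return c2


def odd_prime_factors(d):
    """Set of odd primes dividing d (trial division)."""
    factors = set()
    while d % 2 == 0:
        d //= 2
    p = 3
    while p * p <= d:
        while d % p == 0:
            factors.add(p)
            d //= p
        p += 2
    if d > 1:
        factors.add(d)
    return factors


def pair_singular_series(d, c2):
    """S({0, d}) for d >= 1, via the closed form for pairs."""
    if d % 2:
        return 0.0
    value = 2 * c2
    for p in odd_prime_factors(d):
        value *= (p - 1) / (p - 2)
    return value


def average_pair_singular_series(h, c2):
    """Average of S({h1, h2}) over all pairs 1 <= h1 < h2 <= h."""
    # Exactly h - d pairs have difference d, and S depends only on d.
    total = sum((h - d) * pair_singular_series(d, c2) for d in range(1, h))
    return total / (h * (h - 1) / 2)


if __name__ == "__main__":
    c2 = twin_constant()
    for h in (10, 100, 1000, 10000):
        print(h, round(average_pair_singular_series(h, c2), 4))
```

The averages should approach 1 as h grows, even though individual values of S({0, d}) range from 0 up to well above 1.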


Proposition 3.6 (Gallagher, 1976 [2]). For each k ≥ 1, the singular series is 1 on average for sets of size k,
that is, (3.5) holds for fixed k as h grows large.

Proof sketch. The first step is to prove that, for any fixed prime p,
\[ \sum_{1 \le h_1 < \cdots < h_k \le h} \frac{1 - \nu_p(\mathcal{H})/p}{(1 - 1/p)^k} \sim \sum_{1 \le h_1 < \cdots < h_k \le h} 1, \]
with a certain error term.
This is the same as saying that if we define
\[ 1 + a_p(\mathcal{H}) = \frac{1 - \nu_p(\mathcal{H})/p}{(1 - 1/p)^k}, \]
then a_p(H) converge to 0 on average as h grows large. We can extend a_p(H) to any squarefree q,
defining a_q(H) := ∏_{p|q} a_p(H) and a_1(H) = 1, so that
\[ \mathfrak{S}(\mathcal{H}) = \sum_{q \ge 1} a_q(\mathcal{H}). \]
By Sunzi’s remainder theorem (although this is by no means immediate), a_q(H) also converges to 0
on average over H as h grows large, for any q > 1.
Then the sum we want to estimate is
\[ \sum_{1 \le h_1 < \cdots < h_k \le h} \mathfrak{S}(\mathcal{H}) = \sum_{1 \le h_1 < \cdots < h_k \le h} \sum_{q \ge 1} a_q(\mathcal{H}). \]
When q is “small” in some sense with respect to h, we can use that a_q(H) converges to 0 on average
over H ⊆ [1, h]. When q is large, we can instead use that the tail of the singular series ∑_{q>y} a_q(H)
is small. □
This tells us that, if (and since) we believe the Hardy–Littlewood conjectures, we should expect
the Cramér guess to be correct for the number of primes in log-size intervals: the distribution
should indeed be Poissonian.
The Cramér model is not always so lucky. For example, in [3] it is shown that the sum of the
von Mangoldt function in intervals (n, n + H] where 1 ≤ n ≤ x and x^δ ≤ H ≤ x^{1−δ}, for a constant
δ > 0, has a distribution that is Gaussian (as one would expect from Cramér) with mean ∼ H and
variance ∼ H log(x/H) (which is smaller than the variance ∼ H log x that Cramér would imply). So
there is a balance to be explored: the Cramér model is a useful back-of-the-envelope calculation to
know approximately what we should expect from questions about prime numbers, but it does not
always agree with the inside-of-the-envelope calculation coming from other more sophisticated
predictions which we believe much more.

REFERENCES
1. K. Ford, Sieve methods lecture notes, spring 2023.
2. P. X. Gallagher, On the distribution of primes in short intervals, Mathematika 23 (1976), no. 1, 4–9. MR 409385
3. H. L. Montgomery and K. Soundararajan, Primes in short intervals, Comm. Math. Phys. 252 (2004), no. 1-3, 589–617.
MR 2104891
4. K. Soundararajan, The distribution of prime numbers, Equidistribution in number theory, an introduction, NATO Sci. Ser.
II Math. Phys. Chem., vol. 237, Springer, Dordrecht, 2007, pp. 59–83. MR 2290494
