AN ELEMENTARY PROOF OF THE PRIME NUMBER THEOREM
ABHIMANYU CHOUDHARY
Contents
1. Arithmetic Functions
2. Elementary Results
3. Chebyshev’s Functions and Asymptotic Formulae
4. Shapiro’s Theorem
5. Selberg’s Asymptotic Formula
6. Deriving the Prime Number Theorem using Selberg’s Identity
Acknowledgments
References
1. Arithmetic Functions
Definition 1.1. The prime counting function denotes the number of primes not
greater than x and is given by π(x), which can also be written as:
\[ \pi(x) = \sum_{p \le x} 1 \]
where the symbol p runs over the set of primes in increasing order.
Using this notation, we state the prime number theorem, first conjectured by
Legendre, as:
Theorem 1.2.
\[ \lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1 \]
Note that unless specified otherwise, log denotes the natural logarithm.
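As a quick numerical illustration of Theorem 1.2, the following Python sketch counts primes with a sieve of Eratosthenes and evaluates π(x) log x / x. The helper names `primes_up_to`, `prime_pi` and `pnt_ratio` are illustrative choices that do not appear in the text, and the sieve is only a convenient way to compute π(x), not a tool used in the proof.

```python
import math


def primes_up_to(limit):
    """Return the list of primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting from i * i.
            sieve[i * i :: i] = bytes(len(sieve[i * i :: i]))
    return [n for n in range(limit + 1) if sieve[n]]


def prime_pi(x):
    """Prime counting function: the number of primes not greater than x."""
    return len(primes_up_to(int(x)))


def pnt_ratio(x):
    """The quantity pi(x) * log(x) / x, which Theorem 1.2 says tends to 1."""
    return prime_pi(x) * math.log(x) / x


if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        print(x, prime_pi(x), pnt_ratio(x))
```

The ratio approaches 1 only slowly, which is consistent with the theorem being a statement about the limit rather than about any fixed x.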
2. Elementary Results
Before proving the main result, we first introduce a number of foundational
definitions and results.
Definition 2.1. An arithmetical function, or sequence, is a function whose domain
is the natural numbers and whose codomain is either the real numbers or the complex
numbers.
Definition 2.2. We define the divisor sum of an arithmetic function a to be:
\[ \sum_{d \mid n} a(d) \]
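Definition 2.3. We define the Dirichlet product of arithmetical functions f and g to be:
\[ (f * g)(n) = \sum_{d \mid n} f(d)\, g\!\left(\frac{n}{d}\right) \]
Note that the Dirichlet product is commutative and associative. Moreover, the
set of arithmetical functions has an identity I under this product, and every
arithmetical function f with the property that f(1) ≠ 0 has an inverse f^{-1} such that
f ∗ f^{-1} = I. It is easy to verify that the identity function I is given by:
\[ I(n) = \left\lfloor \frac{1}{n} \right\rfloor = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases} \]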
Definition 2.4. We define the Möbius function µ as:
\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1 \\ (-1)^k & \text{if } n = p_1 p_2 \cdots p_k \text{ for distinct primes } p_1, \ldots, p_k \\ 0 & \text{otherwise} \end{cases} \]
Thus, the Möbius function allows us to determine a "parity" of sorts for any
squarefree integer.
Theorem 2.5. The divisor sum of the Möbius function is given by:
\[ \sum_{d \mid n} \mu(d) = \left\lfloor \frac{1}{n} \right\rfloor = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases} = I(n) \]
This can be verified using the fundamental theorem of arithmetic and the bino-
mial theorem. Note that this divisor sum yields the identity function, an important
property we will use momentarily.
Definition 2.6. We define the unit function by:
u(n) = 1
for all natural n.
We see that the divisor sum in 2.5 can be rewritten as:
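\[ \sum_{d \mid n} \mu(d) = \sum_{d \mid n} \mu(d) \cdot 1 = \sum_{d \mid n} \mu(d)\, u\!\left(\frac{n}{d}\right) = (\mu * u)(n) = \left\lfloor \frac{1}{n} \right\rfloor \]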
Thus, the Möbius and unit functions are inverses of each other. We can use this
property to derive a powerful formula, known as the Möbius inversion formula.
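The claim that µ and u are Dirichlet inverses can be checked numerically. The sketch below assumes the standard Dirichlet product (f ∗ g)(n) = Σ_{d|n} f(d) g(n/d); the function names `mobius`, `dirichlet_product`, `unit` and `identity` are illustrative and are not taken from the text.

```python
def mobius(n):
    """Return the Möbius function of a positive integer n by trial division."""
    if n == 1:
        return 1
    sign = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n, so it is not squarefree
            sign = -sign  # one more distinct prime factor
        p += 1
    if n > 1:
        sign = -sign  # the leftover factor is a single prime
    return sign


def dirichlet_product(f, g, n):
    """Evaluate (f * g)(n) as the sum of f(d) * g(n / d) over divisors d of n."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def unit(n):
    """The unit function u(n) = 1."""
    return 1


def identity(n):
    """The identity I(n) for the Dirichlet product."""
    return 1 if n == 1 else 0


if __name__ == "__main__":
    ok = all(dirichlet_product(mobius, unit, n) == identity(n) for n in range(1, 500))
    print("mu * u == I for n < 500:", ok)
```

Trial division is used only because it keeps the sketch short; a sieve would be the usual choice for large ranges.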
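Theorem 2.7 (Möbius inversion formula). If f, g are arithmetical functions and:
\[ \sum_{d \mid n} g(d) = f(n) \]
then:
\[ \sum_{d \mid n} f(d)\, \mu\!\left(\frac{n}{d}\right) = g(n) \]
Similarly, if f and g are functions defined for real x ≥ 1 and:
\[ g(x) = \sum_{n \le x} f\!\left(\frac{x}{n}\right) \]
then:
\[ f(x) = \sum_{n \le x} \mu(n)\, g\!\left(\frac{x}{n}\right) \]
where the symbol n ranges over all positive integers not greater than x.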
We now introduce Von Mangoldt’s function given by the symbol Λ.
Definition 2.9 (Von Mangoldt’s Function). For every integer n ≥ 1 we define:
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k \text{ for some prime } p \text{ and } k \ge 1 \\ 0 & \text{otherwise} \end{cases} \]
The proof of this result follows naturally from the fundamental
theorem of arithmetic. Roughly speaking, the function counts each prime factor
of n exactly as many times as it appears in the prime factorization of n. Through
summing and properties of the logarithm, the result follows.
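The result this paragraph refers to is not displayed in the text as it stands. Assuming it is the standard identity Σ_{d|n} Λ(d) = log n, which matches the description above, it can be checked numerically with a short Python sketch. The names `von_mangoldt` and `divisor_sum_of_lambda` are illustrative.

```python
import math


def von_mangoldt(n):
    """Return log(p) if n is a power of a prime p, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip every copy of it.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor up to sqrt(n), so n itself is prime


def divisor_sum_of_lambda(n):
    """Sum of the von Mangoldt function over all divisors of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (12, 30, 64, 97):
        print(n, divisor_sum_of_lambda(n), math.log(n))
```

For n = 12 = 2^2 · 3 the divisors that are prime powers are 2, 4 and 3, contributing log 2 + log 2 + log 3 = log 12, which is the counting described above.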
This sum clearly telescopes to the value A(n) − A(0). Because A(0) is an empty
sum, we have that:
\[ \sum_{n \le x} a(n) = \sum_{n \le x} [A(n) - A(n-1)] \]
Multiplying each term by f(n):
\[ \sum_{n \le x} a(n) f(n) = \sum_{n \le x} [A(n) - A(n-1)] f(n) \]
Note that:
\[ \sum_{n \le x} A(n) f(n) = \sum_{n \le x-1} A(n) f(n) + A(x) f(x) \]
So we have:
\[ \sum_{n \le x} A(n) f(n) - \sum_{n \le x-1} A(n) f(n+1) = A(x) f(x) + \sum_{n \le x-1} A(n) f(n) - \sum_{n \le x-1} A(n) f(n+1) \]
Combining we have:
\[ A(x) f(x) + \sum_{n \le x-1} A(n) f(n) - \sum_{n \le x-1} A(n) f(n+1) = A(x) f(x) - \sum_{n \le x-1} A(n) \bigl( f(n+1) - f(n) \bigr) \]
Because f has Riemann integrable derivative, we can apply the fundamental theo-
rem of calculus to it and say:
\[ \sum_{n \le x-1} A(n) \bigl( f(n+1) - f(n) \bigr) = \sum_{n \le x-1} A(n) \int_n^{n+1} f'(t)\, dt \]
Because A(t) is constant on the interval [n, n + 1) and takes on the value A(n)
everywhere, we can place it inside the integral. Thus:
\[ \sum_{n \le x-1} A(n) \int_n^{n+1} f'(t)\, dt = \sum_{n \le x-1} \int_n^{n+1} A(t) f'(t)\, dt \]
as we need.
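The discrete step at the heart of this argument, summation by parts, can be checked numerically. The sketch below compares Σ_{n≤N} a(n) f(n) with A(N) f(N) − Σ_{n≤N−1} A(n) (f(n+1) − f(n)) for an integer N, where A(n) denotes the partial sums of a. The function name `summation_by_parts` and the sample sequences are illustrative assumptions.

```python
import math


def summation_by_parts(a, f, N):
    """Return both sides of the summation-by-parts identity for an integer N >= 1.

    Left side:  sum of a(n) * f(n) for n = 1..N.
    Right side: A(N) * f(N) - sum of A(n) * (f(n + 1) - f(n)) for n = 1..N-1,
    where A(n) = a(1) + ... + a(n) and A(0) = 0 is the empty sum.
    """
    A = [0.0] * (N + 1)
    for n in range(1, N + 1):
        A[n] = A[n - 1] + a(n)
    lhs = sum(a(n) * f(n) for n in range(1, N + 1))
    rhs = A[N] * f(N) - sum(A[n] * (f(n + 1) - f(n)) for n in range(1, N))
    return lhs, rhs


if __name__ == "__main__":
    # Sample choice: a(n) = n mod 3 and f = log.
    lhs, rhs = summation_by_parts(lambda n: n % 3, math.log, 500)
    print(lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-9))
```

Replacing each difference f(n+1) − f(n) by the integral of f′ over [n, n+1] is what turns this discrete identity into the integral form used in the text.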
Corollary 3.6 (Euler’s Summation Formula). Let f be a function with Riemann-
integrable derivative defined on the interval [1, x]. Then:
\[ \sum_{n \le x} f(n) = \int_1^x f(t)\, dt + \int_1^x (t - \lfloor t \rfloor) f'(t)\, dt + f(1) + f(x)(\lfloor x \rfloor - x) \]
Proof. We first assume (1) from 3.11. We have then by 3.13 that:
\[ \frac{\vartheta(x)}{x} = \frac{\pi(x) \log x}{x} - \frac{1}{x} \int_2^x \frac{\pi(t)}{t}\, dt \]
It suffices then to show that:
\[ \lim_{x \to \infty} \frac{1}{x} \int_2^x \frac{\pi(t)}{t}\, dt = 0 \]
We know by our assumption that:
\[ \frac{\pi(t)}{t} = O\!\left( \frac{1}{\log t} \right) \]
for t ≥ 2. So:
\[ \frac{1}{x} \int_2^x \frac{\pi(t)}{t}\, dt = O\!\left( \frac{1}{x} \int_2^x \frac{1}{\log t}\, dt \right) \]
by 3.2. We can bound this integral as follows:
\[ \int_2^x \frac{1}{\log t}\, dt = \int_2^{\sqrt{x}} \frac{1}{\log t}\, dt + \int_{\sqrt{x}}^x \frac{1}{\log t}\, dt \le \frac{\sqrt{x}}{\log 2} + \frac{x - \sqrt{x}}{\log \sqrt{x}} \]
and, after dividing by x, this bound approaches 0 as x approaches infinity. So by the
squeeze theorem, the original integral expression approaches 0, and we have that:
\[ \lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1 \]
Similarly, to prove the other direction, we now assume that:
\[ \lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1 \]
Using our second integral formula, we have:
\[ \frac{\pi(x) \log x}{x} = \frac{\vartheta(x)}{x} + \frac{\log x}{x} \int_2^x \frac{\vartheta(t)}{t \log^2 t}\, dt \]
As we did previously, we must show the integral expression on the right hand side
approaches 0 as we take a limit to infinity. Our initial assumption implies that
ϑ(t) = O(t), so we have that:
\[ \frac{\log x}{x} \int_2^x \frac{\vartheta(t)}{t \log^2 t}\, dt = O\!\left( \frac{\log x}{x} \int_2^x \frac{1}{\log^2 t}\, dt \right) \]
Again, bounding this integral in a similar manner, we have:
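\[ \int_2^x \frac{1}{\log^2 t}\, dt = \int_2^{\sqrt{x}} \frac{1}{\log^2 t}\, dt + \int_{\sqrt{x}}^x \frac{1}{\log^2 t}\, dt \le \frac{\sqrt{x}}{\log^2 2} + \frac{x - \sqrt{x}}{\log^2 \sqrt{x}} \]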
And again by the squeeze theorem, we see our limit is 0, as necessary, and the
second direction is complete. We have proven the equivalence of 3.11.1 and 3.11.2,
and the other equivalences follow from this, as stated before. Knowing this, we
now set out to prove the prime number theorem by proving the equivalent form 3.11.4, i.e., that
\[ \lim_{x \to \infty} \frac{R(x)}{x} = 0 \]
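The equivalent forms can be compared numerically. The sketch below assumes the usual definitions ϑ(x) = Σ_{p≤x} log p and ψ(x) = Σ_{n≤x} Λ(n), which are not restated in this part of the text, and prints ϑ(x)/x, ψ(x)/x and π(x) log x / x side by side. All function names are illustrative.

```python
import math


def primes_up_to(limit):
    """Return the list of primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(sieve[i * i :: i]))
    return [n for n in range(limit + 1) if sieve[n]]


def chebyshev_theta(x):
    """theta(x): sum of log p over primes p <= x (assumed standard definition)."""
    return math.fsum(math.log(p) for p in primes_up_to(int(x)))


def chebyshev_psi(x):
    """psi(x): sum of log p over prime powers p^k <= x (assumed standard definition)."""
    total = []
    for p in primes_up_to(int(x)):
        power = p
        while power <= x:
            total.append(math.log(p))
            power *= p
    return math.fsum(total)


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        pi_x = len(primes_up_to(x))
        print(x, chebyshev_theta(x) / x, chebyshev_psi(x) / x, pi_x * math.log(x) / x)
```

In this range ϑ(x)/x and ψ(x)/x sit much closer to 1 than π(x) log x / x does, which gives some intuition for why the proof works with ψ and R rather than with π directly.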
We now examine the asymptotic behavior of Von Mangoldt’s function and other
arithmetical functions. The following identities will prove useful later in our analysis
of Chebyshev’s ϑ and ψ functions. Before we do this, however, we will introduce a
set of results about partial sums of Dirichlet products (see Definition 2.3).
Theorem 3.14. Let f, g be arithmetic functions and denote h = f ∗ g. Let H, F, G
denote the partial sums of their respective functions. Then we have that:
\[ H(x) = \sum_{n \le x} f(n)\, G\!\left(\frac{x}{n}\right) = \sum_{n \le x} g(n)\, F\!\left(\frac{x}{n}\right) \]
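Theorem 3.14 can be checked on small examples by computing H(x) directly from the Dirichlet product and comparing it with Σ_{n≤x} f(n) G(x/n). The sketch below restricts x to positive integers and assumes the standard Dirichlet product (f ∗ g)(n) = Σ_{d|n} f(d) g(n/d); the function names and the sample functions f and g are illustrative.

```python
def dirichlet_product(f, g, n):
    """Evaluate (f * g)(n) as the sum of f(d) * g(n / d) over divisors d of n."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def partial_sum(f, x):
    """Partial sum of f over 1 <= n <= x, for a nonnegative integer x."""
    return sum(f(n) for n in range(1, x + 1))


def partial_sum_of_product(f, g, x):
    """H(x) computed directly from the definition of h = f * g."""
    return partial_sum(lambda n: dirichlet_product(f, g, n), x)


def partial_sum_via_theorem(f, g, x):
    """H(x) computed as the sum of f(n) * G(floor(x / n)) for n <= x."""
    return sum(f(n) * partial_sum(g, x // n) for n in range(1, x + 1))


if __name__ == "__main__":
    f = lambda n: n * n - 3
    g = lambda n: 2 * n + 1
    print(all(partial_sum_of_product(f, g, x) == partial_sum_via_theorem(f, g, x)
              for x in range(1, 60)))
```

Since G is a step function, G(x/n) only depends on ⌊x/n⌋, which is why integer division is enough here.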
We now introduce an identity about the partial sums of the harmonic series:
Theorem 3.16.
\[ \sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\!\left( \frac{1}{x} \right) \]
where γ is a constant, hereafter called the "Euler-Mascheroni" constant.
Proof. This is a consequence of 3.6, the Euler summation formula.
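A short numerical check of Theorem 3.16 is sketched below. It uses a hard-coded double-precision value of γ and tests the concrete bound |Σ_{n≤x} 1/n − log x − γ| ≤ 1/x on a few sample points; the constant 1 in that bound is an illustrative choice, not a value taken from the text, as are the names `harmonic` and `harmonic_error`.

```python
import math

# Euler-Mascheroni constant to double precision (hard-coded for this sketch).
EULER_GAMMA = 0.5772156649015329


def harmonic(x):
    """Partial sum of the harmonic series over 1 <= n <= x."""
    return math.fsum(1.0 / n for n in range(1, int(x) + 1))


def harmonic_error(x):
    """The remainder: sum of 1/n for n <= x, minus log x, minus gamma."""
    return harmonic(x) - math.log(x) - EULER_GAMMA


if __name__ == "__main__":
    for x in (10, 100, 1000, 10**5):
        print(x, harmonic_error(x), 1.0 / x)
```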
Theorem 3.17.
\[ \sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \log(\lfloor x \rfloor !) \]
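The identity in Theorem 3.17, read as Σ_{n≤x} Λ(n) ⌊x/n⌋ = log(⌊x⌋!), can be checked with Python's `math.lgamma`, since lgamma(N + 1) = log(N!). The names `von_mangoldt` and `lambda_floor_sum` are illustrative.

```python
import math


def von_mangoldt(n):
    """Return log(p) if n is a power of a prime p, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)


def lambda_floor_sum(x):
    """Sum of Lambda(n) * floor(x / n) over 1 <= n <= x."""
    N = int(x)
    # For a positive integer n, floor(x / n) equals floor(floor(x) / n).
    return math.fsum(von_mangoldt(n) * (N // n) for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, lambda_floor_sum(N), math.lgamma(N + 1), N * math.log(N) - N)
```

The last printed column is x log x − x, which differs from log(⌊x⌋!) by a term of logarithmic size.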
Corollary 3.21.
\[ \sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) = x \log x - x + O(\log x) \]
Proof. This follows from 3.15 and the definition of the ψ function as the partial
sums of Von Mangoldt’s function.
Remark 3.22. Note that we can also use the less precise approximation:
\[ \sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) = x \log x + O(x) \]
as the logarithm is dominated by the linear term. This approximation will be useful
later on, when we discuss Shapiro’s theorem.
The next theorem follows as a consequence of 3.21.
Theorem 3.23.
\[ \sum_{p \le x} \frac{x}{p} \log p = x \log x + O(x) \]
Now, we will reindex the above sum so it can be written in terms of primes less
than or equal to x. We know that Λ is nonzero for prime powers and 0 otherwise.
Thus, we have
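\[ \sum_{n \le x} \left\lfloor \frac{x}{n} \right\rfloor \Lambda(n) = \sum_{p \le x} \sum_{m=1}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \Lambda(p^m) \]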
Again, we have:
\[ x \sum_{p \le x} \log p \cdot \frac{1}{(p-1)p} \le x \sum_{n=2}^{\infty} \frac{\log n}{(n-1)n} = O(x) \]
4. Shapiro’s Theorem
We now provide a proof of Shapiro’s theorem, an important theorem which
relates partial sums of the form $\sum_{n \le x} \lfloor x/n \rfloor a(n)$ to the often more interesting sums
of the form $\sum_{n \le x} a(n)$. Specifically, we will use this to derive a result about the
behavior of the partial sums of the ψ function.
Theorem 4.1 (Shapiro’s Tauberian Theorem). Let a(n) be a nonnegative sequence
such that:
\[ \sum_{n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor = x \log x + O(x) \]
Then the following are true:
(1) For x ≥ 1 we have:
\[ \sum_{n \le x} \frac{a(n)}{n} = \log x + O(1) \]
Reindexing we have
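\[ T(x) - 2T\!\left(\frac{x}{2}\right) = \sum_{n \le x/2} \left( \left\lfloor \frac{x}{n} \right\rfloor a(n) - 2 \left\lfloor \frac{x}{2n} \right\rfloor a(n) \right) + \sum_{x/2 < n \le x} \left\lfloor \frac{x}{n} \right\rfloor a(n) \]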
The first sum on the right-hand side is nonnegative because ⌊2x⌋ − 2⌊x⌋ is always
nonnegative (this can be checked by considering the min and max of both functions)
and our sequence is nonnegative. Thus, we have:
\[ T(x) - 2T\!\left(\frac{x}{2}\right) \ge \sum_{x/2 < n \le x} \left\lfloor \frac{x}{n} \right\rfloor a(n) \]
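For n larger than x/2, we observe that the term ⌊x/n⌋ is 1, as 1 is the floor of any
real number in [1, 2). Hence T(x) − 2T(x/2) ≥ S(x) − S(x/2), where $S(x) = \sum_{n \le x} a(n)$,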
as required. We now use this fact to prove statement 2. Recall by our initial
assumption that:
T (x) = x log x − x + O(x)
We have then that:
\[ T(x) - 2T\!\left(\frac{x}{2}\right) = x \log x - x + O(x) - 2 \left( \frac{x}{2} \log \frac{x}{2} - \frac{x}{2} + O(x) \right) = O(x) \]
By the previous inequality, we can establish that:
\[ S(x) - S\!\left(\frac{x}{2}\right) = O(x) \]
for all x ≥ 1. Thus, we have a constant M′ such that S(x) − S(x/2) ≤ M′x. Applying
this to successive values of x, we get expressions of the form:
\[ S\!\left(\frac{x}{2}\right) - S\!\left(\frac{x}{4}\right) \le M' \frac{x}{2} \]
\[ S\!\left(\frac{x}{4}\right) - S\!\left(\frac{x}{8}\right) \le M' \frac{x}{4} \]
and so on. When we sum these expressions, all terms cancel except S(x)
and we are left with:
\[ S(x) \le M' x \left( 1 + \frac{1}{2} + \frac{1}{4} + \cdots \right) = 2M' x \]
so we have found an appropriate constant, namely 2M′. We now prove part 1. We
see that:
\[ \left\lfloor \frac{x}{n} \right\rfloor = \frac{x}{n} + O(1) \]
So:
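\[ \begin{aligned} T(x) = \sum_{n \le x} \left\lfloor \frac{x}{n} \right\rfloor a(n) &= \sum_{n \le x} \left( \frac{x}{n} + O(1) \right) a(n) \\ &= x \sum_{n \le x} \frac{a(n)}{n} + \sum_{n \le x} a(n)\, O(1) \\ &= x \sum_{n \le x} \frac{a(n)}{n} + O(x) \end{aligned} \]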
If we substitute the asymptotic formula we have for T (x), it follows almost imme-
diately that:
\[ \sum_{n \le x} \frac{a(n)}{n} = \log x + O(1) \]
as needed.
We can now state a number of corollaries that follow immediately from this
result:
Corollary 4.2. The following asymptotic formulae hold:
(1) \[ \psi(x) = O(x) \]
(2) \[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1) \]
(3) \[ \vartheta(x) = O(x) \]
(4) \[ \sum_{p \le x} \frac{\log p}{p} = \log x + O(1) \]
All of these corollaries follow from previous asymptotic formulae derived in sec-
tion 3.
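Parts (1) and (2) of Corollary 4.2 can be illustrated numerically. The sketch below assumes the usual definition ψ(x) = Σ_{n≤x} Λ(n) and prints ψ(x)/x together with Σ_{n≤x} Λ(n)/n − log x, both of which should stay bounded. The function names are illustrative, and the printed values are a sanity check only, not a proof of boundedness.

```python
import math


def von_mangoldt(n):
    """Return log(p) if n is a power of a prime p, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)


def psi(x):
    """Chebyshev's psi function: the partial sums of Lambda(n) for n <= x."""
    return math.fsum(von_mangoldt(n) for n in range(1, int(x) + 1))


def weighted_lambda_sum(x):
    """Sum of Lambda(n) / n over 1 <= n <= x."""
    return math.fsum(von_mangoldt(n) / n for n in range(1, int(x) + 1))


if __name__ == "__main__":
    for x in (10, 100, 1000, 5000):
        print(x, psi(x) / x, weighted_lambda_sum(x) - math.log(x))
```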
We first prove a lemma which will help us obtain the final result.
Lemma 5.2 (Tatuzawa-Iseki Identity). Let F be a real-valued function defined on
R+ and let G be given by:
\[ G(x) = \log x \sum_{n \le x} F\!\left(\frac{x}{n}\right) \]
Then, we have:
\[ F(x) \log x + \sum_{n \le x} F\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{d \le x} \mu(d)\, G\!\left(\frac{x}{d}\right) \]
So we have:
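\[ \sum_{n \le x} F\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \log \frac{n}{d} \tag{5.4} \]
And by our initial definition of G, we can rewrite the right hand side of (5.6) as:
\[ \sum_{d \le x} \mu(d)\, G\!\left(\frac{x}{d}\right) \]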
as needed.
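The Tatuzawa-Iseki identity of Lemma 5.2 holds for any function F on the positive reals, so it can be tested numerically with arbitrary sample functions. The sketch below evaluates both sides for a given F and x ≥ 1; the names `tatuzawa_iseki_sides` and `G`, and the choices F(t) = √t and F(t) = t², are illustrative assumptions.

```python
import math


def mobius(n):
    """Return the Möbius function of a positive integer n by trial division."""
    if n == 1:
        return 1
    sign = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            sign = -sign
        p += 1
    if n > 1:
        sign = -sign
    return sign


def von_mangoldt(n):
    """Return log(p) if n is a power of a prime p, and 0.0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)


def G(F, x):
    """G(x) = log(x) times the sum of F(x / n) over 1 <= n <= x."""
    return math.log(x) * math.fsum(F(x / n) for n in range(1, int(x) + 1))


def tatuzawa_iseki_sides(F, x):
    """Return (left side, right side) of the identity in Lemma 5.2 for x >= 1."""
    N = int(x)
    lhs = F(x) * math.log(x) + math.fsum(
        F(x / n) * von_mangoldt(n) for n in range(1, N + 1)
    )
    rhs = math.fsum(mobius(d) * G(F, x / d) for d in range(1, N + 1))
    return lhs, rhs


if __name__ == "__main__":
    print(tatuzawa_iseki_sides(math.sqrt, 50.5))
    print(tatuzawa_iseki_sides(lambda t: t * t, 30))
```

The two sides agree only up to floating-point rounding, so a comparison with a relative tolerance is more appropriate than exact equality.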
Proof. We apply 5.2 to the functions ψ(x) and x − γ − 1, where γ is the Euler-
Mascheroni constant. For ψ(x) we can define the associated G_ψ as:
\[ \begin{aligned} G_\psi(x) &= \log x \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) \\ &= \log x \,\bigl( x \log x - x + O(\log x) \bigr) \\ &= x \log^2 x - x \log x + O(\log^2 x) \end{aligned} \]
which follows from 3.21. We have for x − γ − 1 that:
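\[ \begin{aligned} G_\gamma(x) &= \log x \sum_{n \le x} \left( \frac{x}{n} - \gamma - 1 \right) \\ &= x \log x \sum_{n \le x} \frac{1}{n} - \log x \sum_{n \le x} (\gamma + 1) \end{aligned} \]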
By 3.16, we have:
\[ x \log x \sum_{n \le x} \frac{1}{n} = x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right) \]
So:
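\[ \begin{aligned} G_\gamma(x) &= x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right) - \log x \sum_{n \le x} (\gamma + 1) \\ &= x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right) - (\gamma + 1) \log x \,\bigl( x + O(1) \bigr) \\ &= x \log^2 x - x \log x + O(\log x) \end{aligned} \]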
We see that the difference between G_ψ and G_γ is O(log² x). We will use the weaker
estimate of O(√x). Now, apply 5.2 (Tatuzawa-Iseki) to ψ and x − γ − 1. We have
for ψ that:
\[ \psi(x) \log x + \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{d \le x} \mu(d)\, G_\psi\!\left(\frac{x}{d}\right) \tag{5.7} \]
Subtracting the RHS of 5.8 from that of 5.7, we have the term:
\[ \sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right] \]
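Applying our O(√x) estimate for the difference of G_ψ and G_γ with x/d as our argument,
we have:
\[ \sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right] = O\!\left( \sum_{d \le x} \sqrt{\frac{x}{d}} \right) \tag{5.9} \]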
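ψ(x) − (x − γ − 1)
we have:
\[ \bigl[ \psi(x) - (x - \gamma - 1) \bigr] \log x + \sum_{n \le x} \left[ \psi\!\left(\frac{x}{n}\right) - \left( \frac{x}{n} - \gamma - 1 \right) \right] \Lambda(n) = \sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right] \]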
as needed. To show the function is Lipschitz, we can use the fact that
R(t)/t = O(1), so that there is a constant M_1 with |R(t)/t| ≤ M_1 for all t ≥ 2.
Since $S(y) = \int_2^y \frac{R(t)}{t}\, dt$, for any x_1, x_2 ≥ 2 we have:
\[ |S(x_1) - S(x_2)| = \left| \int_{x_2}^{x_1} \frac{R(t)}{t}\, dt \right| \le M_1 |x_1 - x_2| \]
and thus S is Lipschitz, as required. In particular, since S(2) = 0, we have |S(x)| ≤ M_1 |x|, i.e. S(x) = O(x).
Corollary 6.3.
\[ S'(y) = \frac{R(y)}{y} = O(1) \]
where S′(y) is defined when y ≠ p^k for any prime p and integer k ≥ 1.
This was proven indirectly in the proof of 6.2. The restriction on y gives us a
guarantee of continuity. We now introduce an important corollary:
Corollary 6.4.
\[ S(y) \log y + \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) = O(y) \]
Proof. Dividing the expression in (6.1) by x and integrating both sides from 2 to
y, we have:
\[ \int_2^y \frac{R(x)}{x} \log x\, dx + \int_2^y \sum_{n \le x} \Lambda(n)\, S\!\left(\frac{x}{n}\right) dx = O(1) \]
Integrating the first expression from the left by parts, we have:
\[ \begin{aligned} \int_2^y \frac{R(x)}{x} \log x\, dx &= \log y \int_2^y \frac{R(x)}{x}\, dx - \int_2^y S(x) \frac{1}{x}\, dx \\ &= S(y) \log y - O(y) \end{aligned} \]
Decomposing the second integral we have:
\[ \int_2^y \sum_{n \le x} \Lambda(n)\, S\!\left(\frac{x}{n}\right) dx = \sum_{n \le x} \Lambda(n) \int_2^y S\!\left(\frac{x}{n}\right) dx \]
Combining, we have:
\[ S(y) \log y - O(y) + \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) = O(1) \]
by 6.2. We now simplify the LHS. First, we break up the sum in the first set of
brackets using properties of logarithms. We have:
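\[ \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log \frac{y}{k} = \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log y - \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log k \]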
We first perform a sign change, and see the above is equal to:
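\[ - \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log k - \sum_{k \le y} \sum_{n \le y/k} \Lambda(k) \Lambda(n)\, S\!\left(\frac{y}{kn}\right) \]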
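Because k ≤ y and n ≤ y/k, we see that the above means nk = m ≤ y, so we reindex
again:
\[ - \sum_{m \le y} \Lambda(m)\, S\!\left(\frac{y}{m}\right) \log m - \sum_{m \le y} S\!\left(\frac{y}{m}\right) \left( \sum_{kn = m} \Lambda(k) \Lambda(n) \right) \]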
Now, we reconsider:
\[ \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log y \]
So in summary, we have:
\[ \log y \,\bigl( O(y) - S(y) \log y \bigr) - \sum_{m \le y} S\!\left(\frac{y}{m}\right) \Lambda_2(m) = O(y \log y) \]
And moving the sum to the RHS will give us the result.
Because we showed that Λ_2(m) = 2 log m + O(m) near the end of section 6, our
next lemma will show that we can use weights of 2 log m, making our sum easier to
work with.
Lemma 6.6. There exists a constant Z2 such that:
\[ \log^2 y\, |S(y)| \le 2 \sum_{m \le y} \left| S\!\left(\frac{y}{m}\right) \right| \log m + Z_2\, y \log y \tag{6.7} \]
We see that:
\[ \Lambda_2(m) - 2 \log m = Q(m) - Q(m-1) \]
Substituting, we have:
\[ \sum_{m \le y} \left| S\!\left(\frac{y}{m}\right) \right| \bigl( Q(m) - Q(m-1) \bigr) = C(y) \]
We can ignore all terms with m < 2 as S(m) = 0 for those terms. Through
examination (this becomes exceedingly clear when terms are written out), we can
reindex the sum above as:
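\[ C(y) = \sum_{2 \le m \le y} \left( \left| S\!\left(\frac{y}{m}\right) \right| - \left| S\!\left(\frac{y}{m+1}\right) \right| \right) Q(m) \tag{6.10} \]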
We see that:
\[ C(y) \le y Z_3 \sum_{2 \le m \le y} \frac{1}{m+1} \le y Z_3 \int_1^y \frac{1}{m}\, dm = Z_3\, y \log y \]
Combining the above with our expression for C(y) proves the lemma.
We now further simplify this inequality by replacing the sum above with an
integral.
Lemma 6.12. There is a constant Z4 such that:
\[ \log^2 y\, |S(y)| \le 2 \int_2^y \left| S\!\left(\frac{y}{u}\right) \right| \log u\, du + Z_4\, y \log y \]
The inequality of 6.12 assumes a simpler form with a change of variables. Letting
x = log y and letting v = log(y/u), we can rewrite 6.12 as:
\[ x^2 |S(e^x)| \le 2 \int_0^{x - \log 2} |S(e^v)|\, (x - v)\, e^{x - v}\, dv + Z_4\, x e^x \tag{6.14} \]
Performing another change of variables by defining the function W(x) = e^{-x} S(e^x),
and applying the transform to 6.14, we have that:
\[ x^2 \frac{|W(x)|}{e^{-x}} \le 2 \int_0^{x - \log 2} |S(e^v)|\, (x - v)\, e^{x - v}\, dv + Z_4\, x e^x \tag{6.15} \]
Further simplifying 6.15 we have:
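\[ x^2 \frac{|W(x)|}{e^{-x}} \le 2 \int_0^{x - \log 2} |W(v)|\, e^v (x - v) \frac{e^x}{e^v}\, dv + Z_4\, x e^x \]
\[ |W(x)|\, e^x \le \frac{2}{x^2} \int_0^{x - \log 2} |W(v)|\, (x - v)\, e^x\, dv + Z_4 \frac{x e^x}{x^2} \]
\[ |W(x)| \le \frac{2}{x^2} \int_0^x |W(v)|\, (x - v)\, dv + Z_4 \frac{1}{x} \tag{6.16} \]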
The transformations above were done to yield a function W which is essentially
dominated by a weighted average of itself. We now prove two lemmas about |W (x)|.
Lemma 6.17.
\[ \limsup_{x \to \infty} |W(x)| = \alpha \le 1 \]
We seek to show that α = 0. To do this, we will use two more facts about W ,
proving them along the way.
Lemma 6.20. Let k = 2M_1. Then, the following holds:
\[ |W(x_1) - W(x_2)| \le k |x_1 - x_2| \]
Proof. We see that by definition:
\[ |W'(x)| = \left| -e^{-x} S(e^x) + S'(e^x) \right| \le e^{-x} |S(e^x)| + |S'(e^x)| \le c + c = 2c \]
and the Lipschitz condition follows by the methods used in 6.2.
Lemma 6.21. If W(v) ≠ 0 for v_1 < v < v_2 then there exists M_2 such that:
\[ \left| \int_{v_1}^{v_2} W(v)\, dv \right| \le M_2 \]
(3) $\frac{2M_2}{\omega} \ge (b - a) \ge \frac{2\omega}{k}$
We can use the reasoning as in case 2 for any points a distance ω/k from
each endpoint. Otherwise, by 6.24 we have:
\[ \int_a^b |W(x)|\, dx \le \frac{\omega^2}{k} + \left( b - a - \frac{2\omega}{k} \right) \omega \]
a k k
Simplifying the right hand side above we have:
\[ (b - a)\, \omega \left( 1 - \frac{\omega}{k(b - a)} \right) \le (b - a)\, \omega \left( 1 - \frac{\omega^2}{2 M_2 k} \right) \]
And this is strictly less than:
\[ (b - a)\, \omega \left( 1 - \frac{\alpha^2}{2 M_2 k} \right) \tag{6.25} \]
as M_2 k > 1 and α ≤ 1. If x_1 is the first zero of W(x) to the right of x_ω
and x̄ is the largest zero to the left of y, then 6.25 and 6.21 imply that:
\[ \int_0^y |W(x)|\, dx \le \int_0^{x_1} |W(x)|\, dx + (\bar{x} - x_1)\, \omega \left( 1 - \frac{\alpha^2}{2 M_2 k} \right) + M_2 \]
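Dividing by y and noting that x̄ ≤ y we have:
\[ \frac{1}{y} \int_0^y |W(x)|\, dx \le \frac{1}{y} \int_0^{x_1} |W(x)|\, dx + \frac{\bar{x} - x_1}{y}\, \omega \left( 1 - \frac{\alpha^2}{2 M_2 k} \right) + \frac{M_2}{y} \]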
Letting y → ∞ we see that:
\[ \beta \le \omega \left( 1 - \frac{\alpha^2}{2 M_2 k} \right) \]
and because β ≥ α, we see:
\[ \alpha \le \omega \left( 1 - \frac{\alpha^2}{2 M_2 k} \right) \]
Since this inequality holds for all ω > α, it must hold for ω = α. Thus,
α³ ≤ 0 and since α ≥ 0 it follows that α = 0.
It follows then that:
\[ \lim_{y \to \infty} \frac{S(y)}{y} = 0 \]
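Thus, for any given ε > 0 we have for large y that:
\[ |S(y)| \le \frac{1}{3} \varepsilon^2 y \]
Hence, we see:
\[ S(y(1 + \varepsilon)) - S(y) \le \frac{1}{3} \varepsilon^2 \bigl( y(1 + \varepsilon) + y \bigr) < \varepsilon^2 y \]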
Expanding S, we have:
\[ \int_y^{y(1 + \varepsilon)} \frac{R(t)}{t}\, dt \le \varepsilon^2 y \]
By the definition of R and because ψ is nondecreasing, we have:
\[ \int_y^{y(1 + \varepsilon)} \frac{\psi(y)}{y(1 + \varepsilon)}\, dt - \int_y^{y(1 + \varepsilon)} dt \le \varepsilon^2 y \]
Hence, we have that ψ(y)/y ≤ (1 + ε)². Similarly, the bound S(y) − S(y(1 − ε)) ≥ −ε²y
for large y leads to ψ(y)/y ≥ (1 − ε)². Because ε is arbitrary, it follows that:
\[ \lim_{x \to \infty} \frac{\psi(x)}{x} = 1 \]
proving 1.2.
Acknowledgments. It is a pleasure to thank my mentor, Karen Butt, for her
assistance in writing this paper, as well as her having the confidence in me to write
on a topic that most would consider ambitious. I would also like to thank the
program director, Peter May, for organizing this research experience.
References
[1] Norman Levinson. A Motivated Account of the Elementary Proof of the Prime Number Theorem.
https://www.jstor.org/stable/2316361
[2] Tom M. Apostol. Introduction to Analytic Number Theory.
http://plouffe.fr/simon/math/IntrodAnalyticNTApostol.pdf