
AN ELEMENTARY PROOF OF THE PRIME NUMBER THEOREM

ABHIMANYU CHOUDHARY

Abstract. This paper presents an "elementary" proof of the prime number theorem, elementary in the sense that no complex analytic techniques are used. First proven by Hadamard and de la Vallée Poussin, the prime number theorem states that the number of primes less than or equal to $x$ is asymptotic to $x/\ln x$. Until 1949, the theorem was considered too "deep" to be proven by elementary means; Erdős and Selberg, however, succeeded in proving it without the use of complex analysis. This paper closely follows a modified version of their proof given by Norman Levinson in 1969.

Contents
1. Arithmetic Functions
2. Elementary Results
3. Chebyshev’s Functions and Asymptotic Formulae
4. Shapiro’s Theorem
5. Selberg’s Asymptotic Formula
6. Deriving the Prime Number Theorem using Selberg’s Identity
Acknowledgments
References

1. Arithmetic Functions
Definition 1.1. The prime counting function $\pi(x)$ denotes the number of primes not greater than $x$, which can also be written as:
\[
\pi(x) = \sum_{p \le x} 1
\]
where the symbol $p$ runs over the set of primes in increasing order.
Using this notation, we state the prime number theorem, first conjectured by Legendre, as:
Theorem 1.2.
\[
\lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1
\]
Note that unless specified otherwise, log denotes the natural logarithm.
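As a purely numerical illustration of Theorem 1.2 (not part of the proof), the ratio $\pi(x)\log x / x$ can be tabulated for moderate $x$. The sketch below assumes a plain sieve of Eratosthenes; the names `primes_up_to`, `prime_pi` and `pnt_ratio` are hypothetical helpers introduced only for this example.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes p <= n."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            # Cross out the multiples i*i, i*i + i, ... up to n.
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def prime_pi(x):
    """pi(x): the number of primes not greater than x."""
    return len(primes_up_to(int(x)))

def pnt_ratio(x):
    """The quotient pi(x) * log(x) / x from Theorem 1.2."""
    return prime_pi(x) * math.log(x) / x

if __name__ == "__main__":
    for k in range(3, 7):
        x = 10 ** k
        print(x, prime_pi(x), round(pnt_ratio(x), 4))
```

For $x = 10^3, 10^4, 10^5, 10^6$ the ratio is roughly 1.16, 1.13, 1.10 and 1.08, so the convergence to 1 is real but slow; a table like this suggests the theorem but proves nothing.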

2. Elementary Results
Before proving the main result, we first introduce a number of foundational
definitions and results.
Definition 2.1. An arithmetical function, or sequence, is a function whose domain is the natural numbers and whose codomain is either the real numbers or the complex numbers.
Definition 2.2. We define the divisor sum of an arithmetic function $a$ to be:
\[
\sum_{d \mid n} a(d)
\]
where the symbol $d$ ranges over the set of positive divisors of $n$.

Definition 2.3. We define the Dirichlet product, or Dirichlet convolution, of two arithmetic functions $f, g$ as:
\[
(f * g)(n) = \sum_{d \mid n} f(d)\, g\!\left(\frac{n}{d}\right)
\]
Note that the Dirichlet product is commutative and associative. Moreover, the set of arithmetical functions has an identity $I$ for this product, and every arithmetical function with the property that $f(1) \ne 0$ has an inverse $f^{-1}$ such that $f * f^{-1} = I$. It is easy to verify that the identity function $I$ is given by:
\[
I(n) = \left\lfloor \frac{1}{n} \right\rfloor =
\begin{cases}
1 & \text{if } n = 1 \\
0 & \text{otherwise}
\end{cases}
\]
Definition 2.4. We define the Möbius function $\mu$ as:
\[
\mu(n) =
\begin{cases}
1 & \text{if } n = 1 \\
(-1)^k & \text{if } n = p_1 p_2 \cdots p_k \text{ for distinct primes } p_1, \dots, p_k \\
0 & \text{otherwise}
\end{cases}
\]
Thus, the Möbius function allows us to determine a "parity" of sorts for any squarefree integer.
Theorem 2.5. The divisor sum of the Möbius function is given by:
\[
\sum_{d \mid n} \mu(d) = \left\lfloor \frac{1}{n} \right\rfloor =
\begin{cases}
1 & \text{if } n = 1 \\
0 & \text{otherwise}
\end{cases}
= I(n)
\]
This can be verified using the fundamental theorem of arithmetic and the binomial theorem. Note that this divisor sum yields the identity function, an important property we will use momentarily.
Definition 2.6. We define the unit function by:
\[
u(n) = 1
\]
for all natural $n$.
We see that the divisor sum in 2.5 can be rewritten as:
\[
\sum_{d \mid n} \mu(d) = \sum_{d \mid n} \mu(d) \cdot 1 = \sum_{d \mid n} \mu(d)\, u\!\left(\frac{n}{d}\right) = (\mu * u)(n) = \left\lfloor \frac{1}{n} \right\rfloor
\]
Thus, the Möbius and unit functions are inverses of each other. We can use this property to derive a powerful formula, known as the Möbius inversion formula.
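The statement that the Möbius and unit functions are Dirichlet inverses can be checked by brute force for small $n$. The following is a minimal sketch, assuming naive divisor enumeration; the function names (`divisors`, `mobius`, `dirichlet`, `unit`, `identity`) are hypothetical and chosen only for this illustration.

```python
def divisors(n):
    """All positive divisors of n (naive enumeration, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result      # one more distinct prime factor
        p += 1
    if n > 1:
        result = -result          # leftover prime factor
    return result

def dirichlet(f, g):
    """Return the Dirichlet product f * g as a new function of n."""
    def h(n):
        return sum(f(d) * g(n // d) for d in divisors(n))
    return h

def unit(n):
    """The unit function u(n) = 1."""
    return 1

def identity(n):
    """The identity I(n) = floor(1/n) for the Dirichlet product."""
    return 1 if n == 1 else 0

if __name__ == "__main__":
    mu_star_u = dirichlet(mobius, unit)
    print(all(mu_star_u(n) == identity(n) for n in range(1, 201)))
```

The same `dirichlet` helper can be used to test an instance of Möbius inversion: with `tau = dirichlet(unit, unit)` (the number-of-divisors function), `dirichlet(mobius, tau)` should return 1 at every `n`.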
Theorem 2.7 (Möbius inversion formula). If $f, g$ are arithmetical functions and:
\[
\sum_{d \mid n} g(d) = f(n)
\]
then:
\[
\sum_{d \mid n} f(d)\, \mu\!\left(\frac{n}{d}\right) = g(n)
\]
Proof. We have by 2.3 that:
\[
g * u = f
\]
Taking the convolution of both sides with $\mu$ we have:
\[
\mu * (g * u) = \mu * f
\]
Using associativity and commutativity, together with $\mu * u = I$, we can write the above expression as:
\[
g = \mu * f = f * \mu
\]
as required.

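Corollary 2.8 (Generalized Möbius Inversion). If $f, g$ are functions defined on the positive reals and:
\[
g(x) = \sum_{n \le x} f\!\left(\frac{x}{n}\right)
\]
then
\[
f(x) = \sum_{n \le x} \mu(n)\, g\!\left(\frac{x}{n}\right)
\]
where the symbol $n$ ranges over all positive integers not greater than $x$.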
We now introduce Von Mangoldt’s function given by the symbol Λ.
Definition 2.9 (Von Mangoldt’s Function). For every integer $n \ge 1$ we define:
\[
\Lambda(n) =
\begin{cases}
\log p & \text{if } n = p^k \text{ for some prime } p \text{ and } k \ge 1 \\
0 & \text{otherwise}
\end{cases}
\]
The above definition is fairly powerful, as it turns a multiplication problem (prime factorization) into an addition problem through the use of logarithms. We are also prohibited from "double counting" any prime factors, as we will see in the next theorem.
Theorem 2.10 (Divisor sum of the Von Mangoldt Function).
\[
\sum_{d \mid n} \Lambda(d) = \log n
\]
The proof of this result follows naturally from the fundamental theorem of arithmetic. Roughly speaking, the divisor sum counts $\log p$ for each prime factor $p$ of $n$ exactly as many times as $p$ appears in the prime factorization of $n$. Through summing and properties of the logarithm, the result follows.
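Theorem 2.10 is easy to test numerically for small $n$. The sketch below is illustrative only: `von_mangoldt` is a hypothetical helper that evaluates $\Lambda(n)$ by trial division, and `divisor_sum_lambda` sums it over the divisors of $n$.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip every copy of it.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor found, so n itself is prime

def divisor_sum_lambda(n):
    """Sum of Lambda(d) over the positive divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    for n in (12, 30, 64, 97):
        print(n, divisor_sum_lambda(n), math.log(n))
```

For $n = 12 = 2^2 \cdot 3$, for instance, the nonzero contributions come from $d = 2, 3, 4$ and add up to $2\log 2 + \log 3 = \log 12$.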

3. Chebyshev’s Functions and Asymptotic Formulae


Definition 3.1. We say that a function $f$ is "big-oh" of $g$ for $x \ge a$, and write:
\[
f(x) = O(g(x))
\]
if there exists a constant $M$ such that for all $x \ge a$ we have:
\[
|f(x)| \le M g(x)
\]
Corollary 3.2. Let $f, g$ be Riemann integrable functions such that $f(t) = O(g(t))$ for $t \ge a$. Then we have:
\[
\int_a^x f(t)\,dt = O\!\left(\int_a^x g(t)\,dt\right)
\]
Definition 3.3. We say that $f$ is asymptotic to $g$, or $f \sim g$, if:
\[
\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1
\]
Definition 3.4. We define the extension of an arithmetic function $a$ as a map $\mathbb{R}^+ \to \mathbb{R}$ given by:
\[
a(x) = a(\lfloor x \rfloor) \quad \text{for all } x \in \mathbb{R}^+
\]
We now have a suitable way to extend the domain of arithmetic functions to the positive reals, which puts a new set of tools at our disposal, namely those of calculus. We now supply a powerful summation formula that allows us to approximate the partial sums of arithmetic functions.
Theorem 3.5 (Abel’s Summation Formula). Let $f$ be a real valued function with a Riemann-integrable derivative for $t \ge 1$. Let $a(n)$ be an arithmetical function and let $A(x) = \sum_{n \le x} a(n)$ be the partial sum of $a$ up to $x$. Then:
\[
\sum_{n \le x} a(n) f(n) = f(x) A(x) - \int_1^x f'(t) A(t)\,dt
\]

Proof. Assume first that $x$ is a positive integer. Since $A(n) - A(n-1) = a(n)$, and $A(0) = 0$ because it is an empty sum, we have:
\[
\sum_{n \le x} a(n) f(n) = \sum_{n \le x} [A(n) - A(n-1)] f(n)
\]
Expanding the right hand side:
\[
\sum_{n \le x} [A(n) - A(n-1)] f(n) = \sum_{n \le x} A(n) f(n) - \sum_{n \le x} A(n-1) f(n)
\]
Reindexing the second sum (and using $A(0) = 0$), it follows that:
\[
\sum_{n \le x} A(n) f(n) - \sum_{n \le x} A(n-1) f(n) = \sum_{n \le x} A(n) f(n) - \sum_{n \le x-1} A(n) f(n+1)
\]
Note that:
\[
\sum_{n \le x} A(n) f(n) = \sum_{n \le x-1} A(n) f(n) + A(x) f(x)
\]
So, combining, we have:
\[
\sum_{n \le x} a(n) f(n) = A(x) f(x) - \sum_{n \le x-1} A(n) \big( f(n+1) - f(n) \big)
\]
Because $f$ has a Riemann integrable derivative, we can apply the fundamental theorem of calculus to it and say:
\[
\sum_{n \le x-1} A(n) \big( f(n+1) - f(n) \big) = \sum_{n \le x-1} A(n) \int_n^{n+1} f'(t)\,dt
\]
Because $A(t)$ is constant on the interval $[n, n+1)$ and takes on the value $A(n)$ everywhere there, we can place it inside the integral. Thus:
\[
\sum_{n \le x-1} A(n) \int_n^{n+1} f'(t)\,dt = \sum_{n \le x-1} \int_n^{n+1} A(t) f'(t)\,dt
\]
Summing the integrals, we have:
\[
\sum_{n \le x-1} \int_n^{n+1} A(t) f'(t)\,dt = \int_1^x A(t) f'(t)\,dt
\]
Thus, in conclusion, we see that
\[
\sum_{n \le x} a(n) f(n) = A(x) f(x) - \int_1^x A(t) f'(t)\,dt
\]
as we need. For non-integer $x$, apply this identity with $\lfloor x \rfloor$ in place of $x$; since $A$ is constant on $[\lfloor x \rfloor, x]$, we have $A(x) f(x) - \int_{\lfloor x \rfloor}^x A(t) f'(t)\,dt = A(\lfloor x \rfloor) f(\lfloor x \rfloor)$, and the formula follows.
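Corollary 3.6 (Euler’s Summation Formula). Let $f$ be a function with Riemann-integrable derivative defined on the interval $[1, x]$. Then:
\[
\sum_{n \le x} f(n) = \int_1^x f(t)\,dt + \int_1^x (t - \lfloor t \rfloor) f'(t)\,dt + f(x)(\lfloor x \rfloor - x) + f(1)
\]
Proof. This result follows from the Abel summation formula applied with $a(n) = 1$, so that $A(t) = \lfloor t \rfloor$, followed by an integration by parts of $\int_1^x t f'(t)\,dt$.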


We will use these results to derive results about the asymptotic behavior of cer-
tain arithmetic functions. We first introduce two important functions of Chebyshev
whose asymptotic behavior we will examine.
Definition 3.7. For $x > 0$ we define the Chebyshev $\vartheta$ function by:
\[
\vartheta(x) = \sum_{p \le x} \log p
\]
where the symbol $p$ runs over all primes not exceeding $x$.

Definition 3.8. For $x > 0$ we define the Chebyshev $\psi$ function by:
\[
\psi(x) = \sum_{n \le x} \Lambda(n)
\]
Lemma 3.9. For $x > 1$ we have:
\[
0 \le \frac{\psi(x)}{x} - \frac{\vartheta(x)}{x} \le \frac{(\log x)^2}{2 \sqrt{x} \log 2}
\]
Remark 3.10. Notice by the squeeze theorem that if either quotient $\frac{\psi(x)}{x}$ or $\frac{\vartheta(x)}{x}$ has a limit as $x \to \infty$, then the other quotient has the same limit.
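The two Chebyshev functions and the inequality of Lemma 3.9 can be explored numerically. The following sketch assumes a simple sieve; `chebyshev` is a hypothetical helper returning the pair $(\vartheta(x), \psi(x))$, and `lemma_3_9_holds` checks the two-sided bound at a single point.

```python
import math

def chebyshev(x):
    """Return (theta(x), psi(x)) by summing log p over primes and prime powers."""
    n = int(x)
    sieve = bytearray([1]) * (n + 1)
    theta = 0.0
    psi = 0.0
    for p in range(2, n + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = 0
            log_p = math.log(p)
            theta += log_p          # p itself contributes to theta
            q = p
            while q <= n:           # every power p, p^2, ... <= x contributes to psi
                psi += log_p
                q *= p
    return theta, psi

def lemma_3_9_holds(x):
    """Check 0 <= psi(x)/x - theta(x)/x <= (log x)^2 / (2 sqrt(x) log 2)."""
    theta, psi = chebyshev(x)
    gap = (psi - theta) / x
    bound = math.log(x) ** 2 / (2 * math.sqrt(x) * math.log(2))
    return 0 <= gap <= bound

if __name__ == "__main__":
    for x in (10, 100, 1000, 10 ** 5):
        theta, psi = chebyshev(x)
        print(x, theta / x, psi / x, lemma_3_9_holds(x))
```

A quick sanity check on the helper: $\vartheta(10) = \log(2 \cdot 3 \cdot 5 \cdot 7) = \log 210$, and $\psi(10) = \log \operatorname{lcm}(1, \dots, 10) = \log 2520$.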
Theorem 3.11. The following four statements are equivalent:
(1)
\[
\lim_{x \to \infty} \frac{\pi(x) \log x}{x} = 1
\]
(2)
\[
\lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1
\]
(3)
\[
\lim_{x \to \infty} \frac{\psi(x)}{x} = 1
\]
(4)
\[
\lim_{x \to \infty} \frac{\psi(x) - x}{x} = 0
\]
Remark 3.12. Throughout the rest of this paper, the function $\psi(x) - x$ is denoted by $R(x)$ and will be known as the remainder function.
We prove that (1) is equivalent to (2). The equivalence of (2) and (3) follows from 3.10, and the equivalence of (3) and (4) is immediate. We first use a corollary of 3.5, Abel’s Summation Formula:
Corollary 3.13. We have the formulas:
(1)
\[
\vartheta(x) = \pi(x) \log x - \int_2^x \frac{\pi(t)}{t}\,dt
\]
(2)
\[
\pi(x) = \frac{\vartheta(x)}{\log x} + \int_2^x \frac{\vartheta(t)}{t \log^2 t}\,dt
\]
These results can be derived in a straightforward manner using the summation formula. We now use them to prove 3.11.

Proof. We first assume (1) from 3.11. We have then by 3.13 that:
\[
\frac{\vartheta(x)}{x} = \frac{\pi(x) \log x}{x} - \frac{1}{x} \int_2^x \frac{\pi(t)}{t}\,dt
\]
It suffices then to show that:
\[
\lim_{x \to \infty} \frac{1}{x} \int_2^x \frac{\pi(t)}{t}\,dt = 0
\]
We know by our assumption that:
\[
\frac{\pi(t)}{t} = O\!\left(\frac{1}{\log t}\right)
\]
for $t \ge 2$. So:
\[
\frac{1}{x} \int_2^x \frac{\pi(t)}{t}\,dt = O\!\left(\frac{1}{x} \int_2^x \frac{1}{\log t}\,dt\right)
\]
by 3.2. We can bound this integral as follows:
\[
\int_2^x \frac{1}{\log t}\,dt = \int_2^{\sqrt{x}} \frac{1}{\log t}\,dt + \int_{\sqrt{x}}^x \frac{1}{\log t}\,dt \le \frac{\sqrt{x}}{\log 2} + \frac{x - \sqrt{x}}{\log \sqrt{x}}
\]
and, after division by $x$, this approaches 0 as $x$ approaches infinity. So by the squeeze theorem, the original integral expression approaches 0, and we have that:
\[
\lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1
\]
Similarly, to prove the other direction, we now assume that:
\[
\lim_{x \to \infty} \frac{\vartheta(x)}{x} = 1
\]
Using our second integral formula, we have:
\[
\frac{\pi(x) \log x}{x} = \frac{\vartheta(x)}{x} + \frac{\log x}{x} \int_2^x \frac{\vartheta(t)}{t \log^2 t}\,dt
\]
As we did previously, we must show the integral expression on the right hand side approaches 0 as we take a limit to infinity. Our initial assumption implies that $\vartheta(t) = O(t)$, so we have that:
\[
\frac{\log x}{x} \int_2^x \frac{\vartheta(t)}{t \log^2 t}\,dt = O\!\left(\frac{\log x}{x} \int_2^x \frac{1}{\log^2 t}\,dt\right)
\]
Again, bounding this integral in a similar manner, we have:
\[
\int_2^x \frac{1}{\log^2 t}\,dt = \int_2^{\sqrt{x}} \frac{1}{\log^2 t}\,dt + \int_{\sqrt{x}}^x \frac{1}{\log^2 t}\,dt \le \frac{\sqrt{x}}{\log^2 2} + \frac{x - \sqrt{x}}{\log^2 \sqrt{x}}
\]
After multiplication by $\frac{\log x}{x}$ the right hand side tends to 0, so again by the squeeze theorem we see our limit is 0, as necessary, and the second direction is complete. We have proven the equivalence of 3.11(1) and 3.11(2), and the other equivalences follow from this, as stated before. Knowing this, we now set out to prove the prime number theorem by proving the equivalent form 3.11(4), i.e. that $\lim_{x \to \infty} \frac{R(x)}{x} = 0$.
We now examine the asymptotic behavior of Von Mangoldt’s function and other arithmetical functions. The following identities will prove useful later in our analysis of Chebyshev’s $\vartheta$ and $\psi$ functions. Before we do this, however, we will introduce a set of results about partial sums of Dirichlet products (see definition 2.3).
Theorem 3.14. Let $f, g$ be arithmetic functions and denote $h = f * g$. Let $H, F, G$ denote the partial sums of their respective functions. Then we have that:
\[
H(x) = \sum_{n \le x} f(n)\, G\!\left(\frac{x}{n}\right) = \sum_{n \le x} g(n)\, F\!\left(\frac{x}{n}\right)
\]
Applying the above with $g = u$, so that $G(x) = \lfloor x \rfloor$ and $h = f * u$ is the divisor sum of $f$, we have:
Theorem 3.15. For $F(x) = \sum_{n \le x} f(n)$ we have:
\[
\sum_{n \le x} \sum_{d \mid n} f(d) = \sum_{n \le x} f(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} F\!\left(\frac{x}{n}\right)
\]

We now introduce an identity about the partial sums of the harmonic series:
Theorem 3.16.
\[
\sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\!\left(\frac{1}{x}\right)
\]
where $\gamma$ is a constant, hereafter called the "Euler-Mascheroni" constant.
Proof. This is a consequence of 3.6, the Euler summation formula. 
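The size of the error term in Theorem 3.16 can be observed directly. The sketch below is only an illustration: `EULER_GAMMA` is the standard double-precision value of $\gamma$, and `harmonic_error` is a hypothetical helper returning $\sum_{n \le x} 1/n - \log x - \gamma$. If the theorem holds, $x$ times this quantity should stay bounded.

```python
import math

# Double-precision value of the Euler-Mascheroni constant.
EULER_GAMMA = 0.5772156649015329

def harmonic_error(x):
    """Return H(x) - log(x) - gamma, where H(x) is the sum of 1/n over n <= x."""
    h = sum(1.0 / n for n in range(1, int(x) + 1))
    return h - math.log(x) - EULER_GAMMA

if __name__ == "__main__":
    for x in (10, 100, 1000, 12345.6):
        # x * error should remain bounded if the error is O(1/x).
        print(x, harmonic_error(x), x * harmonic_error(x))
```

At integer arguments the scaled error is close to $1/2$, reflecting the next term $\frac{1}{2x}$ of the usual expansion of the harmonic numbers.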
Theorem 3.17.
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \log \lfloor x \rfloor!
\]

Proof. This follows from a combination of 2.10 and 3.15. 
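Theorem 3.17 is an exact identity, so it can be tested to floating-point accuracy. In the sketch below, `von_mangoldt`, `weighted_lambda_sum` and `log_factorial` are hypothetical helper names; `math.lgamma(N + 1)` is used as a stand-in for $\log N!$.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def weighted_lambda_sum(x):
    """Left hand side of Theorem 3.17: sum of Lambda(n) * floor(x/n) over n <= x."""
    big_n = int(x)
    # For a positive integer n, floor(x/n) equals floor(floor(x)/n).
    return sum(von_mangoldt(n) * (big_n // n) for n in range(1, big_n + 1))

def log_factorial(x):
    """Right hand side of Theorem 3.17: log(floor(x)!) via the log-gamma function."""
    return math.lgamma(int(x) + 1)

if __name__ == "__main__":
    for x in (10, 50, 200.7):
        print(x, weighted_lambda_sum(x), log_factorial(x))
```

The two columns agree up to rounding error, which is what the combination of 2.10 and 3.15 predicts.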


Lemma 3.18. (Legendre)
\[
\lfloor x \rfloor! = \prod_{p \le x} p^{\alpha(p)}
\]
where:
\[
\alpha(p) = \sum_{m=1}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor
\]
Proof. This identity is a consequence of 3.17. 
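Legendre's identity can be verified with exact integer arithmetic. This is a minimal sketch with hypothetical helper names (`primes_up_to`, `legendre_exponent`, `factorial_from_primes`), restricted to integer arguments for simplicity.

```python
import math

def primes_up_to(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

def legendre_exponent(n, p):
    """alpha(p): the sum of floor(n / p^m) over m >= 1 (a finite sum)."""
    total = 0
    q = p
    while q <= n:
        total += n // q
        q *= p
    return total

def factorial_from_primes(n):
    """Rebuild n! as the product of p^alpha(p) over primes p <= n."""
    result = 1
    for p in primes_up_to(n):
        result *= p ** legendre_exponent(n, p)
    return result

if __name__ == "__main__":
    print(legendre_exponent(10, 2))                     # 5 + 2 + 1 = 8
    print(factorial_from_primes(10) == math.factorial(10))
```

Taking logarithms of both sides of the identity gives $\log \lfloor x \rfloor! = \sum_{p \le x} \alpha(p) \log p$, which is the form that connects it to 3.17.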
Theorem 3.19. For $x \ge 2$, we have:
\[
\log \lfloor x \rfloor! = x \log x - x + O(\log x)
\]
Proof. We see that:
\[
\log \lfloor x \rfloor! = \sum_{n \le x} \log n
\]
Applying Euler’s Summation formula, we have:
\[
\sum_{n \le x} \log n = \int_1^x \log t\,dt + \int_1^x \frac{t - \lfloor t \rfloor}{t}\,dt - (x - \lfloor x \rfloor) \log x
\]
\[
= x \log x - x + 1 + \int_1^x \frac{t - \lfloor t \rfloor}{t}\,dt + O(\log x)
\]
We know that:
\[
\frac{t - \lfloor t \rfloor}{t} = O\!\left(\frac{1}{t}\right)
\]
So, we have:
\[
\int_1^x \frac{t - \lfloor t \rfloor}{t}\,dt = O\!\left(\int_1^x \frac{1}{t}\,dt\right) = O(\log x)
\]
Thus,
\[
\sum_{n \le x} \log n = x \log x - x + O(\log x)
\]
An immediate corollary follows from this:
Corollary 3.20.
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = x \log x - x + O(\log x)
\]
Proof. This follows from 3.17.

Corollary 3.21.
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) = x \log x - x + O(\log x)
\]
Proof. This follows from 3.15 and the definition of the $\psi$ function as the partial sums of Von Mangoldt’s function.

Remark 3.22. Note that we can also use the less precise approximation:
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) = x \log x + O(x)
\]
as the logarithm is dominated by the linear term. This approximation will be useful later on, when we discuss Shapiro’s theorem.
The next theorem follows as a consequence of 3.21.
Theorem 3.23.
\[
\sum_{p \le x} \left\lfloor \frac{x}{p} \right\rfloor \log p = x \log x + O(x)
\]

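Proof. We know by 3.21 that
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = x \log x - x + O(\log x)
\]
Now, we will reindex the above sum so it can be written in terms of primes less than or equal to $x$. We know that $\Lambda$ is nonzero for prime powers and 0 otherwise. Thus, we have
\[
\sum_{n \le x} \Lambda(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{p \le x} \sum_{m=1}^{\infty} \Lambda(p^m) \left\lfloor \frac{x}{p^m} \right\rfloor
\]
Note that the above "infinite" sum is indeed a finite sum, as for sufficiently large $m$ we have $p^m > x$ and thus $\left\lfloor \frac{x}{p^m} \right\rfloor = 0$. By the definition of $\Lambda$ we have:
\[
\sum_{p \le x} \sum_{m=1}^{\infty} \Lambda(p^m) \left\lfloor \frac{x}{p^m} \right\rfloor = \sum_{p \le x} \sum_{m=1}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \log p
\]
Decomposing the above sum we have:
\[
\sum_{p \le x} \sum_{m=1}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \log p = \sum_{p \le x} \sum_{m=2}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \log p + \sum_{p \le x} \left\lfloor \frac{x}{p} \right\rfloor \log p
\]
We now prove that:
\[
\sum_{p \le x} \sum_{m=2}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \log p = O(x)
\]
We know by definition of the floor function that:
\[
\sum_{p \le x} \sum_{m=2}^{\infty} \left\lfloor \frac{x}{p^m} \right\rfloor \log p \le \sum_{p \le x} \sum_{m=2}^{\infty} \frac{x}{p^m} \log p
\]
Summing the geometric series of the right sum we have:
\[
\sum_{p \le x} \sum_{m=2}^{\infty} \frac{x}{p^m} \log p = x \sum_{p \le x} \frac{\log p}{(p-1)p}
\]
Again, we have:
\[
x \sum_{p \le x} \frac{\log p}{(p-1)p} \le x \sum_{n=2}^{\infty} \frac{\log n}{(n-1)n} = O(x)
\]
as the series on the right hand side converges by comparison. Hence the terms with $m \ge 2$ contribute $O(x)$, and we conclude that $\sum_{p \le x} \left\lfloor \frac{x}{p} \right\rfloor \log p = x \log x - x + O(\log x) + O(x) = x \log x + O(x)$.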

4. Shapiro’s Theorem
We now provide a proof of Shapiro’s theorem, an important theorem which relates partial sums of the form $\sum_{n \le x} \left\lfloor \frac{x}{n} \right\rfloor a(n)$ to the often more interesting sums of the form $\sum_{n \le x} a(n)$. Specifically, we will use this to derive a result about the behavior of the $\psi$ function.
Theorem 4.1. (Shapiro’s Tauberian Theorem) Let $a(n)$ be a nonnegative sequence such that:
\[
\sum_{n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor = x \log x + O(x)
\]
for all $x \ge 1$. Then the following are true:
(1) For $x \ge 1$ we have:
\[
\sum_{n \le x} \frac{a(n)}{n} = \log x + O(1)
\]
(2) There exists a constant $M$ such that:
\[
\sum_{n \le x} a(n) \le M x \quad \text{for all } x \ge 1
\]
(3) There exists a constant $m > 0$ such that:
\[
\sum_{n \le x} a(n) \ge m x \quad \text{for all sufficiently large } x
\]

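Proof. Define functions $S, T$ by:
\[
S(x) = \sum_{n \le x} a(n), \qquad T(x) = \sum_{n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor
\]
We first show that the inequality:
\[
S(x) - S\!\left(\frac{x}{2}\right) \le T(x) - 2T\!\left(\frac{x}{2}\right)
\]
holds. We have:
\[
T(x) - 2T\!\left(\frac{x}{2}\right) = \sum_{n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor - 2 \sum_{n \le x/2} a(n) \left\lfloor \frac{x}{2n} \right\rfloor
\]
Splitting the first sum at $x/2$, we have
\[
T(x) - 2T\!\left(\frac{x}{2}\right) = \sum_{n \le x/2} a(n) \left( \left\lfloor \frac{x}{n} \right\rfloor - 2 \left\lfloor \frac{x}{2n} \right\rfloor \right) + \sum_{x/2 < n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor
\]
The first sum on the right hand side is nonnegative because $\lfloor 2t \rfloor - 2 \lfloor t \rfloor$ is always nonnegative (here $t = \frac{x}{2n}$) and our sequence is nonnegative. Thus, we have:
\[
T(x) - 2T\!\left(\frac{x}{2}\right) \ge \sum_{x/2 < n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor
\]
For $\frac{x}{2} < n \le x$, we observe that the term $\left\lfloor \frac{x}{n} \right\rfloor$ is 1, as $1 \le \frac{x}{n} < 2$. Thus, we have:
\[
T(x) - 2T\!\left(\frac{x}{2}\right) \ge \sum_{x/2 < n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{x/2 < n \le x} a(n) = S(x) - S\!\left(\frac{x}{2}\right)
\]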
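as required. We now use this fact to prove statement (2). Recall by our initial assumption that:
\[
T(x) = x \log x + O(x)
\]
We have then that:
\[
T(x) - 2T\!\left(\frac{x}{2}\right) = x \log x + O(x) - 2\left( \frac{x}{2} \log \frac{x}{2} + O(x) \right) = x \log 2 + O(x) = O(x)
\]
By the previous inequality, and because $S(x) - S\!\left(\frac{x}{2}\right) \ge 0$, we can establish that:
\[
S(x) - S\!\left(\frac{x}{2}\right) = O(x)
\]
for all $x \ge 1$. Thus, we have a constant $M'$ such that $S(x) - S\!\left(\frac{x}{2}\right) \le M' x$. Applying this to $\frac{x}{2}, \frac{x}{4}, \dots$ in place of $x$, we get expressions of the form:
\[
S\!\left(\frac{x}{2}\right) - S\!\left(\frac{x}{4}\right) \le M' \frac{x}{2}
\]
\[
S\!\left(\frac{x}{4}\right) - S\!\left(\frac{x}{8}\right) \le M' \frac{x}{4}
\]
and so on. When we sum these expressions (only finitely many are nonzero, since $S\!\left(\frac{x}{2^k}\right) = 0$ once $2^k > x$), all terms on the left cancel except $S(x)$ and we are left with:
\[
S(x) \le M' x \left( 1 + \frac{1}{2} + \frac{1}{4} + \dots \right) = 2M' x
\]
so we have found an appropriate constant, namely $M = 2M'$. We now prove part (1). We see that:
\[
\left\lfloor \frac{x}{n} \right\rfloor = \frac{x}{n} + O(1)
\]
So:
\[
T(x) = \sum_{n \le x} a(n) \left\lfloor \frac{x}{n} \right\rfloor = \sum_{n \le x} \left( \frac{x}{n} + O(1) \right) a(n)
\]
\[
= x \sum_{n \le x} \frac{a(n)}{n} + O\!\left( \sum_{n \le x} a(n) \right)
\]
\[
= x \sum_{n \le x} \frac{a(n)}{n} + O(x)
\]
where the last step uses part (2). If we substitute the asymptotic formula we have for $T(x)$ and divide by $x$, it follows almost immediately that:
\[
\sum_{n \le x} \frac{a(n)}{n} = \log x + O(1)
\]
as needed.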
We can now state a number of corollaries that follow immediately from this
result:
Corollary 4.2. The following asymptotic formulae hold:
(1)
\[
\psi(x) = O(x)
\]
(2)
\[
\sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1)
\]
(3)
\[
\vartheta(x) = O(x)
\]
(4)
\[
\sum_{p \le x} \frac{\log p}{p} = \log x + O(1)
\]
All of these corollaries follow by applying 4.1 to the asymptotic formulae derived in section 3 (3.22 and 3.23).
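Formula (4) of Corollary 4.2 says that $\sum_{p \le x} \frac{\log p}{p} - \log x$ stays bounded. A short numerical sketch, with the hypothetical helpers `primes_up_to` and `mertens_gap`, makes this visible; it is an illustration only and plays no role in the argument.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for m in range(i * i, n + 1, i):
                sieve[m] = 0
    return [i for i in range(n + 1) if sieve[i]]

def mertens_gap(x):
    """Return (sum of log(p)/p over primes p <= x) - log(x)."""
    total = sum(math.log(p) / p for p in primes_up_to(int(x)))
    return total - math.log(x)

if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, mertens_gap(x))
```

The printed differences vary only slightly as $x$ grows by several orders of magnitude, consistent with an $O(1)$ error term.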

5. Selberg’s Asymptotic Formula


We now prove a major result, first derived by Atle Selberg in 1949.
Theorem 5.1 (Selberg’s Asymptotic Formula). For $x > 0$ we have the following:
\[
\psi(x) \log x + \sum_{n \le x} \Lambda(n)\, \psi\!\left(\frac{x}{n}\right) = 2x \log x + O(x)
\]
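Selberg's formula can be probed numerically by computing the normalized error $\big(\psi(x)\log x + \sum_{n \le x}\Lambda(n)\psi(x/n) - 2x\log x\big)/x$, which the theorem asserts is bounded. The sketch below assumes a sieve-based table of $\Lambda$; `von_mangoldt_table` and `selberg_error` are hypothetical names used only here.

```python
import math

def von_mangoldt_table(n):
    """Return a list lam with lam[k] = Lambda(k) for 0 <= k <= n."""
    lam = [0.0] * (n + 1)
    sieve = bytearray([1]) * (n + 1)
    for p in range(2, n + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = 0
            q = p
            while q <= n:           # mark every prime power p^k <= n
                lam[q] = math.log(p)
                q *= p
    return lam

def selberg_error(x):
    """(psi(x) log x + sum Lambda(n) psi(x/n) - 2 x log x) / x."""
    n = int(x)
    lam = von_mangoldt_table(n)
    psi = [0.0] * (n + 1)
    for k in range(1, n + 1):
        psi[k] = psi[k - 1] + lam[k]   # psi at integer arguments
    # psi(x/k) only depends on floor(x/k), which equals n // k.
    lhs = psi[n] * math.log(x) + sum(lam[k] * psi[n // k] for k in range(1, n + 1))
    return (lhs - 2 * x * math.log(x)) / x

if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, selberg_error(x))
```

A bounded (not vanishing) normalized error is exactly what an $O(x)$ term allows, so the printed values are expected to hover around a constant rather than tend to zero.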

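We first prove a lemma which will help us obtain the final result.
Lemma 5.2 (Tatuzawa-Iseki Identity). Let $F$ be a real valued function defined on $\mathbb{R}^+$ and let $G$ be given by:
\[
G(x) = \log x \sum_{n \le x} F\!\left(\frac{x}{n}\right)
\]
Then, we have:
\[
F(x) \log x + \sum_{n \le x} F\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{d \le x} \mu(d)\, G\!\left(\frac{x}{d}\right)
\]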

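Proof. We first rewrite $F(x) \log x$ as a sum. We have:
\[
F(x) \log x = \sum_{n \le x} \left\lfloor \frac{1}{n} \right\rfloor F\!\left(\frac{x}{n}\right) \log \frac{x}{n} = \sum_{n \le x} F\!\left(\frac{x}{n}\right) \log \frac{x}{n} \sum_{d \mid n} \mu(d) \tag{5.3}
\]
Now, we can use the Möbius inversion formula (applied to 2.10) to say that:
\[
\Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d}
\]
So we have:
\[
\sum_{n \le x} F\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \log \frac{n}{d} \tag{5.4}
\]
Adding (5.3) and (5.4) we have:
\[
\sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \left[ \log \frac{x}{n} + \log \frac{n}{d} \right]
\]
This is equal to:
\[
\sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \log \frac{x}{d}
\]
So, summarizing our results so far, we have:
\[
F(x) \log x + \sum_{n \le x} F\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \log \frac{x}{d} \tag{5.5}
\]
Now, taking the right hand side and writing $n = qd$ we have:
\[
\sum_{n \le x} F\!\left(\frac{x}{n}\right) \sum_{d \mid n} \mu(d) \log \frac{x}{d} = \sum_{d \le x} \mu(d) \log \frac{x}{d} \sum_{q \le x/d} F\!\left(\frac{x}{qd}\right) \tag{5.6}
\]
And by our initial definition of $G$, we can rewrite the right hand side of (5.6) as:
\[
\sum_{d \le x} \mu(d)\, G\!\left(\frac{x}{d}\right)
\]
as needed.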

We now prove 5.1 using this lemma.

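Proof. We apply 5.2 to the functions $\psi(x)$ and $x - \gamma - 1$, where $\gamma$ is the Euler-Mascheroni constant. For $\psi(x)$ we can define the associated $G_\psi$ as:
\[
G_\psi(x) = \log x \sum_{n \le x} \psi\!\left(\frac{x}{n}\right)
\]
\[
= \log x \,\big( x \log x - x + O(\log x) \big)
\]
\[
= x \log^2 x - x \log x + O(\log^2 x)
\]
which follows from 3.21. We have for $x - \gamma - 1$ that:
\[
G_\gamma(x) = \log x \sum_{n \le x} \left( \frac{x}{n} - \gamma - 1 \right)
\]
\[
= x \log x \sum_{n \le x} \frac{1}{n} - (\gamma + 1) \log x \sum_{n \le x} 1
\]
By 3.16, we have:
\[
x \log x \sum_{n \le x} \frac{1}{n} = x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right)
\]
So:
\[
G_\gamma(x) = x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right) - (\gamma + 1) \log x \sum_{n \le x} 1
\]
\[
= x \log x \left( \log x + \gamma + O\!\left(\frac{1}{x}\right) \right) - (\gamma + 1) \log x \,\big( x + O(1) \big)
\]
\[
= x \log^2 x - x \log x + O(\log x)
\]
We see that the difference between $G_\psi$ and $G_\gamma$ is $O(\log^2 x)$. We will use the weaker estimate of $O(\sqrt{x})$. Now, apply 5.2 (Tatuzawa-Iseki) to $\psi$ and $x - \gamma - 1$. We have for $\psi$ that:
\[
\psi(x) \log x + \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) \Lambda(n) = \sum_{d \le x} \mu(d)\, G_\psi\!\left(\frac{x}{d}\right) \tag{5.7}
\]
and for $x - \gamma - 1$ we have:
\[
(x - \gamma - 1) \log x + \sum_{n \le x} \left( \frac{x}{n} - \gamma - 1 \right) \Lambda(n) = \sum_{d \le x} \mu(d)\, G_\gamma\!\left(\frac{x}{d}\right) \tag{5.8}
\]
Subtracting the RHS of 5.8 from that of 5.7, we have the term:
\[
\sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right]
\]
Applying our $O(\sqrt{x})$ estimate for the difference of $G_\psi$ and $G_\gamma$ with $\frac{x}{d}$ as our argument, we have:
\[
\sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right] = O\!\left( \sum_{d \le x} \sqrt{\frac{x}{d}} \right) \tag{5.9}
\]
Factoring, we see that 5.9 is equal to:
\[
O\!\left( \sqrt{x} \sum_{d \le x} \frac{1}{\sqrt{d}} \right) = O\!\left( \sqrt{x} \left( 1 + \int_1^x \frac{1}{\sqrt{t}}\,dt \right) \right) = O(x) \tag{5.10}
\]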

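Using 5.2 (Tatuzawa-Iseki), this time on the auxiliary function:
\[
\psi(x) - (x - \gamma - 1)
\]
we have:
\[
[\psi(x) - (x - \gamma - 1)] \log x + \sum_{n \le x} \left[ \psi\!\left(\frac{x}{n}\right) - \left( \frac{x}{n} - \gamma - 1 \right) \right] \Lambda(n) = \sum_{d \le x} \mu(d) \left[ G_\psi\!\left(\frac{x}{d}\right) - G_\gamma\!\left(\frac{x}{d}\right) \right]
\]
and we know from 5.10 that the RHS is $O(x)$, so we have:
\[
[\psi(x) - (x - \gamma - 1)] \log x + \sum_{n \le x} \left[ \psi\!\left(\frac{x}{n}\right) - \left( \frac{x}{n} - \gamma - 1 \right) \right] \Lambda(n) = O(x)
\]
Rearranging terms and simplifying (using 4.2 for the sums over $\Lambda$) we have:
\[
\psi(x) \log x + \sum_{n \le x} \psi\!\left(\frac{x}{n}\right) \Lambda(n) = O(x) + (x - \gamma - 1) \log x + \sum_{n \le x} \left( \frac{x}{n} - \gamma - 1 \right) \Lambda(n)
\]
\[
= O(x) + x \log x + O(\log x) + x \sum_{n \le x} \frac{\Lambda(n)}{n} - (\gamma + 1) \sum_{n \le x} \Lambda(n)
\]
\[
= O(x) + x \log x + O(\log x) + x \big( \log x + O(1) \big) + O(x)
\]
\[
= 2x \log x + O(x)
\]
which is the Selberg asymptotic identity.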
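We now provide two alternate formulations of Selberg’s Identity. The corollary below follows immediately through an application of 3.5 (Abel Summation), together with $\psi(t) = O(t)$.
Corollary 5.11.
\[
\sum_{n \le x} \Lambda(n) \log n = \psi(x) \log x - \int_1^x \frac{\psi(t)}{t}\,dt = \psi(x) \log x + O(x)
\]
Definition 5.12. Define:
\[
\Lambda_2(n) = \Lambda(n) \log n + \sum_{d \mid n} \Lambda(d)\, \Lambda\!\left(\frac{n}{d}\right) = \Lambda(n) \log n + (\Lambda * \Lambda)(n)
\]
By 5.11 and 3.14 the partial sums of $\Lambda_2$ are given by:
\[
\sum_{n \le x} \Lambda_2(n) = \sum_{n \le x} \Lambda(n) \log n + \sum_{n \le x} (\Lambda * \Lambda)(n) = \psi(x) \log x + O(x) + \sum_{n \le x} \Lambda(n)\, \psi\!\left(\frac{x}{n}\right)
\]
So an equivalent restatement of Selberg’s identity is:
\[
\sum_{n \le x} \Lambda_2(n) = 2x \log x + O(x) \tag{5.13}
\]
Moving the $2x \log x$ to the left hand side and replacing it with $2 \sum_{n \le x} \log n$ (the two differ by $O(x)$), we have:
\[
Q(x) = \sum_{n \le x} \Lambda_2(n) - 2 \sum_{n \le x} \log n = \sum_{n \le x} \big( \Lambda_2(n) - 2 \log n \big) = O(x) \tag{5.14}
\]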

6. Deriving the Prime Number Theorem using Selberg’s Identity

We now move to derive the prime number theorem using Selberg’s identity. We will mainly be working with the function $R(x) = \psi(x) - x$. We first introduce a lemma that allows us to restate Selberg’s identity in terms of the remainder function, rather than $\psi$.
Lemma 6.1. Selberg’s identity can be restated as:
\[
R(x) \log x + \sum_{n \le x} \Lambda(n)\, R\!\left(\frac{x}{n}\right) = O(x)
\]
Unfortunately, due to the nature of $\psi(x)$, $R(x)$ is also particularly temperamental, and so is the quotient $\frac{R(x)}{x}$. Thus, we will use the smoother:
\[
S(x) = \int_2^x \frac{R(t)}{t}\,dt
\]
as our starting point. If we can show that:
\[
\lim_{x \to \infty} \frac{S(x)}{x} = 0
\]
then we can show the same result for the quotient involving $R$, which is the result we need. We now prove two properties of $S$.
Lemma 6.2. The following is true of $S(x)$:
(1)
\[
S(x) = O(x)
\]
(2)
$S(x)$ is Lipschitz
Proof. We first show (1). Recall that $\psi(x)$ is $O(x)$ for $x \ge 2$ and thus we have some constant $M_1 \ge 1$ such that:
\[
0 \le \psi(x) \le M_1 x \quad \text{for all } x \ge 2
\]
Subtracting $x$ we have:
\[
-x \le R(x) = \psi(x) - x \le M_1 x - x = x(M_1 - 1) \quad \text{for all } x \ge 2
\]
Dividing by $x$, we see:
\[
\left| \frac{R(x)}{x} \right| \le \max(1, M_1 - 1) \le M_1
\]
So clearly, $\frac{R(x)}{x} = O(1)$. We know then that:
\[
|S(x)| \le \int_2^x \left| \frac{R(t)}{t} \right| dt \le \int_2^x M_1\,dt \le M_1 x
\]
so $S(x) = O(x)$, as needed. To show the function is Lipschitz, we use the same bound on the integrand: for any $x_1 \ge x_2 \ge 2$,
\[
|S(x_1) - S(x_2)| = \left| \int_{x_2}^{x_1} \frac{R(t)}{t}\,dt \right| \le M_1 |x_1 - x_2|
\]
and thus $S$ is Lipschitz, as required.

Corollary 6.3.
\[
S'(y) = \frac{R(y)}{y} = O(1)
\]
where $S'(y)$ is defined when $y \ne p^k$ for every prime $p$ and $k \ge 1$.
This was proven indirectly in the proof of 6.2. The restriction on $y$ gives us a guarantee of continuity. We now introduce an important corollary:
Corollary 6.4.
\[
S(y) \log y + \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) = O(y)
\]

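Proof. Dividing the expression in 6.1 by $x$ and integrating both sides from 2 to $y$, we have:
\[
\int_2^y \frac{R(x)}{x} \log x\,dx + \int_2^y \frac{1}{x} \sum_{n \le x} \Lambda(n)\, R\!\left(\frac{x}{n}\right) dx = O(y)
\]
Integrating the first expression from the left by parts, we have:
\[
\int_2^y \frac{R(x)}{x} \log x\,dx = S(y) \log y - \int_2^y S(x) \frac{1}{x}\,dx
\]
\[
= S(y) \log y + O(y)
\]
since $\frac{S(x)}{x} = O(1)$ by 6.2. Decomposing the second integral we have:
\[
\int_2^y \frac{1}{x} \sum_{n \le x} \Lambda(n)\, R\!\left(\frac{x}{n}\right) dx = \sum_{n \le y} \Lambda(n) \int_n^y \frac{1}{x} R\!\left(\frac{x}{n}\right) dx
\]
Each inner integral can be evaluated by applying the substitution $u = \frac{x}{n}$, which turns it into $\int_1^{y/n} \frac{R(u)}{u}\,du = S\!\left(\frac{y}{n}\right) + O(1)$, because $\frac{R(u)}{u}$ is bounded. Since $\sum_{n \le y} \Lambda(n) = \psi(y) = O(y)$, this gives us:
\[
\int_2^y \frac{1}{x} \sum_{n \le x} \Lambda(n)\, R\!\left(\frac{x}{n}\right) dx = \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) + O(y)
\]
Combining, we have:
\[
S(y) \log y + O(y) + \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) = O(y)
\]
and the result follows.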

Lemma 6.5. There exists a constant $Z_1$ such that:
\[
\log^2 y\, |S(y)| \le \sum_{m \le y} \Lambda_2(m) \left| S\!\left(\frac{y}{m}\right) \right| + Z_1 y \log y
\]
Proof. We prove this by beginning with 6.4. We have that:
\[
S(y) \log y + \sum_{n \le y} \Lambda(n)\, S\!\left(\frac{y}{n}\right) = O(y)
\]
We now substitute $\frac{y}{k}$ for $y$, for some positive integer $k \le y$, and we have:
\[
S\!\left(\frac{y}{k}\right) \log \frac{y}{k} + \sum_{n \le y/k} \Lambda(n)\, S\!\left(\frac{y}{kn}\right) = O\!\left(\frac{y}{k}\right)
\]
Multiplying both sides by $\Lambda(k)$ and summing over $1 \le k \le y$ we have:
\[
\sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log \frac{y}{k} + \sum_{k \le y} \sum_{n \le y/k} \Lambda(k) \Lambda(n)\, S\!\left(\frac{y}{kn}\right) = O\!\left( y \sum_{k \le y} \frac{\Lambda(k)}{k} \right)
\]
The RHS simplifies to:
\[
O(y)\big( \log y + O(1) \big) = O(y \log y)
\]
by 4.2. We now simplify the LHS. First, we break up the first sum using properties of logarithms. We have:
\[
\sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log \frac{y}{k} = \log y \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) - \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) \log k
\]
Next, let $m = kn$ in the double sum. Because $k \le y$ and $n \le \frac{y}{k}$, we know $m \le y$, so we can collect the terms with equal $m$:
\[
\sum_{k \le y} \sum_{n \le y/k} \Lambda(k) \Lambda(n)\, S\!\left(\frac{y}{kn}\right) = \sum_{m \le y} S\!\left(\frac{y}{m}\right) \sum_{kn = m} \Lambda(k) \Lambda(n) = \sum_{m \le y} (\Lambda * \Lambda)(m)\, S\!\left(\frac{y}{m}\right)
\]
Renaming the summation variable $k$ to $m$ in the sum involving $\log k$ (the symbol changes but the value of the sum does not), we then have:
\[
\log y \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) - \sum_{m \le y} \Lambda(m) \log m\, S\!\left(\frac{y}{m}\right) + \sum_{m \le y} (\Lambda * \Lambda)(m)\, S\!\left(\frac{y}{m}\right) = O(y \log y)
\]
Now, we reconsider the first term. By 6.4, this is:
\[
\log y \sum_{k \le y} \Lambda(k)\, S\!\left(\frac{y}{k}\right) = \log y \,\big( O(y) - S(y) \log y \big)
\]
So in summary, after rearranging, we have:
\[
S(y) \log^2 y = - \sum_{m \le y} \Lambda(m) \log m\, S\!\left(\frac{y}{m}\right) + \sum_{m \le y} (\Lambda * \Lambda)(m)\, S\!\left(\frac{y}{m}\right) + O(y \log y)
\]
Taking absolute values, applying the triangle inequality, and using the definition $\Lambda_2(m) = \Lambda(m) \log m + (\Lambda * \Lambda)(m)$, in which both terms are nonnegative, will give us the result.

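Because we showed near the end of section 5 that the partial sums of $\Lambda_2(m) - 2 \log m$ are $O(x)$ (see 5.14), our next lemma will show that we can use weights of $2 \log m$ in place of $\Lambda_2(m)$, making our sum easier to work with.
Lemma 6.6. There exists a constant $Z_2$ such that:
\[
\log^2 y\, |S(y)| \le 2 \sum_{m \le y} \log m \left| S\!\left(\frac{y}{m}\right) \right| + Z_2 y \log y \tag{6.7}
\]
Proof. We define a second remainder function $C(y)$, as given by:
\[
\sum_{m \le y} \big( \Lambda_2(m) - 2 \log m \big) \left| S\!\left(\frac{y}{m}\right) \right| = C(y) \tag{6.8}
\]
Now, recall the function $Q$ defined by:
\[
Q(y) = \sum_{m \le y} \big( \Lambda_2(m) - 2 \log m \big) \tag{6.9}
\]
We see that:
\[
\Lambda_2(m) - 2 \log m = Q(m) - Q(m-1)
\]
Substituting, we have:
\[
\sum_{m \le y} \big( Q(m) - Q(m-1) \big) \left| S\!\left(\frac{y}{m}\right) \right| = C(y)
\]
The term with $m = 1$ vanishes because $Q(1) = Q(0) = 0$, and we use the convention that $S(t) = 0$ for $t < 2$. Through examination (this becomes exceedingly clear when terms are written out), we can reindex the sum above as:
\[
C(y) = \sum_{2 \le m \le y} \left( \left| S\!\left(\frac{y}{m}\right) \right| - \left| S\!\left(\frac{y}{m+1}\right) \right| \right) Q(m) \tag{6.10}
\]
Applying the reverse triangle inequality ($|x - y| \ge \big| |x| - |y| \big|$), we have:
\[
|C(y)| \le \sum_{2 \le m \le y} \left| S\!\left(\frac{y}{m}\right) - S\!\left(\frac{y}{m+1}\right) \right| |Q(m)|
\]
Because $S$ is Lipschitz, we have some $Z_3$ such that:
\[
\sum_{2 \le m \le y} \left| S\!\left(\frac{y}{m}\right) - S\!\left(\frac{y}{m+1}\right) \right| |Q(m)| \le Z_3 \sum_{2 \le m \le y} \left( \frac{y}{m} - \frac{y}{m+1} \right) |Q(m)| \tag{6.11}
\]
Moreover, we can bound $Q(m)$ by some linear function of $m$, as $Q(m) = O(m)$ by 5.14. So, making $Z_3$ large enough and substituting into 6.11 we have:
\[
|C(y)| \le Z_3 \sum_{2 \le m \le y} m \left( \frac{y}{m} - \frac{y}{m+1} \right)
\]
Factoring a $y$ term and simplifying the right hand side, we have:
\[
|C(y)| \le y Z_3 \sum_{2 \le m \le y} m \left( \frac{1}{m} - \frac{1}{m+1} \right) = y Z_3 \sum_{2 \le m \le y} \frac{1}{m+1}
\]
We see that:
\[
|C(y)| \le y Z_3 \sum_{2 \le m \le y} \frac{1}{m+1} \le y Z_3 \int_1^y \frac{1}{m}\,dm = Z_3 y \log y
\]
Combining the above with our expression for $C(y)$ and with 6.5 proves the lemma, with $Z_2 = Z_1 + Z_3$.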
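We now further simplify this inequality by replacing the sum above with an integral.
Lemma 6.12. There is a constant $Z_4$ such that:
\[
\log^2 y\, |S(y)| \le 2 \int_2^y \log u \left| S\!\left(\frac{y}{u}\right) \right| du + Z_4 y \log y
\]
Proof. Note the following bound, which holds as log is an increasing function:
\[
\log m \left| S\!\left(\frac{y}{m}\right) \right| \le \int_m^{m+1} \log u \left| S\!\left(\frac{y}{m}\right) \right| du
\]
Now, by the triangle inequality, we see that:
\[
\left| S\!\left(\frac{y}{m}\right) \right| \le \left| S\!\left(\frac{y}{u}\right) \right| + \left| S\!\left(\frac{y}{m}\right) - S\!\left(\frac{y}{u}\right) \right|
\]
Integrating we have:
\[
\int_m^{m+1} \log u \left| S\!\left(\frac{y}{m}\right) \right| du \le \int_m^{m+1} \log u \left| S\!\left(\frac{y}{u}\right) \right| du + \int_m^{m+1} \left| S\!\left(\frac{y}{m}\right) - S\!\left(\frac{y}{u}\right) \right| \log u\,du
\]
Denote the integral furthest to the right as $J_m$. Now, using the Lipschitz property and the appropriate constant $M_1$ from 6.2, we have that:
\[
J_m = \int_m^{m+1} \left| S\!\left(\frac{y}{m}\right) - S\!\left(\frac{y}{u}\right) \right| \log u\,du \le M_1 \int_m^{m+1} \left( \frac{y}{m} - \frac{y}{u} \right) \log u\,du
\]
We can bound this by:
\[
J_m \le M_1 \int_m^{m+1} \left( \frac{y}{m} - \frac{y}{u} \right) \log u\,du \le M_1 \left( \frac{y}{m} - \frac{y}{m+1} \right) \int_m^{m+1} \log u\,du
\]
Simplifying and bounding the final integral on the RHS we have:
\[
J_m \le M_1 \left( \frac{y}{m} - \frac{y}{m+1} \right) \int_m^{m+1} \log u\,du \le M_1 y \frac{\log(m+1)}{m(m+1)}
\]
Because $m \ge \log(m+1)$ we have:
\[
J_m \le \frac{M_1 y}{m+1}
\]
Now, returning to our original expression, we see:
\[
\log m \left| S\!\left(\frac{y}{m}\right) \right| \le \int_m^{m+1} \log u \left| S\!\left(\frac{y}{u}\right) \right| du + \frac{M_1 y}{m+1} \tag{6.13}
\]
Summing 6.13 over $2 \le m \le y$ (the $m = 1$ term of 6.7 vanishes since $\log 1 = 0$), bounding $\sum_{m \le y} \frac{1}{m+1}$ by $\log y$, and applying the result to 6.6 with $Z_4 = Z_2 + 2M_1$, we have the desired result.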

The inequality of 6.12 assumes a simpler form with a change of variables. Letting $x = \log y$ and letting $v = \log \frac{y}{u}$, so that $\log u = x - v$ and $du = -e^{x-v}\,dv$, we can rewrite 6.12 as:
\[
x^2 |S(e^x)| \le 2 \int_0^{x - \log 2} |S(e^v)| (x - v) e^{x-v}\,dv + Z_4 x e^x \tag{6.14}
\]
Performing another change of variables by defining the function $W(x) = e^{-x} S(e^x)$, and applying the transform to the left hand side of 6.14, we have that:
\[
x^2 \frac{|W(x)|}{e^{-x}} \le 2 \int_0^{x - \log 2} |S(e^v)| (x - v) e^{x-v}\,dv + Z_4 x e^x \tag{6.15}
\]
Further simplifying 6.15 we have:
\[
x^2 \frac{|W(x)|}{e^{-x}} \le 2 \int_0^{x - \log 2} |W(v)| e^v (x - v) \frac{e^x}{e^v}\,dv + Z_4 x e^x
\]
\[
|W(x)| e^x \le \frac{2}{x^2} \int_0^{x - \log 2} |W(v)| (x - v) e^x\,dv + Z_4 \frac{x e^x}{x^2}
\]
\[
|W(x)| \le \frac{2}{x^2} \int_0^x |W(v)| (x - v)\,dv + Z_4 \frac{1}{x} \tag{6.16}
\]
where in the last step the range of integration has been extended to $[0, x]$, which only increases the right hand side. The transformations above were done to yield a function $W$ which is essentially dominated by a weighted average of itself. We now prove two lemmas about $|W(x)|$.
Lemma 6.17.
\[
\limsup_{x \to \infty} |W(x)| = \alpha \le 1
\]
Proof. By 6.3 we know that:
\[
\limsup_{x \to \infty} \left| \frac{R(x)}{x} \right| \le 1
\]
By definition of $S$, it follows that:
\[
\limsup_{y \to \infty} \frac{|S(y)|}{y} \le 1
\]
It is clear from this that $\limsup_{x \to \infty} |W(x)| \le 1$.
Lemma 6.18. Let
\[
\limsup_{x \to \infty} \frac{1}{x} \int_0^x |W(t)|\,dt = \beta
\]
Then $\alpha \le \beta$.
This key result will be proven using lemmas 6.5, 6.6, and 6.12.
Proof. Recall that by 6.16:
Z x
2 1
|W (x)| ≤ |W (v)|(x − v)dv + Z4
x2 0 x
We first decompose the integral on the right hand side by using a dummy variable
and rewriting it as an iterated integral. We see that 6.16 can be rewritten as:
Z x  Z u 
2 1 1
(6.19) |W (x)| ≤ 2 udu |W (v)|dv + Z4
x 0 u 0 x
22 ABHIMANYU CHOUDHARY

which can be verified by reversing the order of integration. Note that:


Z x
2
udu = 1
x2 0
Thus we have the integral on the right handside of the form:
1 u 1 u −v
Z Z
|W (v)|dv = |e S(ev )|dv
u 0 u 0
And this is bounded by M1 (our Lipschitz continuity constant) from lemma 6.2,
i.e:
1 x 1 x −v
Z Z
|W (v)|dv = |e S(ev )|dv ≤ M1
u 0 u 0
Thus, if we fix some x1 , and take any x > x1 , we have:
Z x  Z x 
2 1
I(x) = 2 udu |W (v)|dv
x 0 u 0
Z x1 Z x  Z x 
2M1 2 1
≤ 2 udu + 2 udu |W (v)|dv
x 0 x x1 u 0
As we separate the integrals at x1 . Given arbitrary  > 0, if we choose x1 sufficiently
large, we have by the definition of β (limit supremum) that:
1 x1
Z
|W (v)|dv ≤ β + 
u 0
for all u ≥ x1 . Thus, substituting into our inequality for I(x) we have:
M1 x21 x21
 
I(x) ≤ + (β + ) 1 − 2
x2 x
Thus, for large x, we have by 6.19 that:
M1 x21 x21
 
Z4
|W (x)| ≤ 2
+ (β + ) 1 − 2
+
x x x
Letting x → ∞, we have that α ≤ β + ,and the inequality holds as it is true for
arbitrary . 
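For reference, the two elementary integrals behind the bound on $I(x)$ can be written out. This is a routine computation added as a sketch; it uses only $\int_a^b u\,du = (b^2 - a^2)/2$ together with the bounds $M_1$ and $\beta + \varepsilon$ on the inner average.

```latex
\begin{align*}
\frac{2M_1}{x^2}\int_0^{x_1} u\,du
  &= \frac{2M_1}{x^2}\cdot\frac{x_1^2}{2}
   = \frac{M_1 x_1^2}{x^2}, \\
\frac{2}{x^2}\int_{x_1}^{x} u\,(\beta+\varepsilon)\,du
  &= \frac{2(\beta+\varepsilon)}{x^2}\cdot\frac{x^2-x_1^2}{2}
   = (\beta+\varepsilon)\left(1-\frac{x_1^2}{x^2}\right).
\end{align*}
```

With $x_1$ held fixed, both $M_1 x_1^2/x^2$ and $Z_4/x$ tend to $0$ as $x \to \infty$, which leaves $\limsup_{x\to\infty}|W(x)| \le \beta + \varepsilon$.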

We seek to show that α = 0. To do this, we will use two more facts about W ,
proving them along the way.
Lemma 6.20. Let $k = 2M_1$. Then, the following holds:
\[
|W(x_1) - W(x_2)| \le k\,|x_1 - x_2|
\]
Proof. We see that by definition:
\[
|W'(x)| = \left|-e^{-x}S(e^{x}) + S'(e^{x})\right| \le e^{-x}|S(e^{x})| + |S'(e^{x})| \le c + c = 2c
\]
and the Lipschitz condition follows by the methods used in 6.2. □
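The derivative used in this proof comes from the product and chain rules applied to $W(x) = e^{-x}S(e^{x})$. The sketch below writes that step out. The second line is one standard way to pass from a derivative bound to a Lipschitz bound; it assumes $W$ is the integral of its derivative (for instance, piecewise smooth), and is offered as an illustration rather than as the paper's exact argument.

```latex
\begin{align*}
W'(x) &= \frac{d}{dx}\left(e^{-x}S(e^{x})\right)
       = -e^{-x}S(e^{x}) + e^{-x}\,S'(e^{x})\,e^{x}
       = -e^{-x}S(e^{x}) + S'(e^{x}), \\
|W(x_1)-W(x_2)| &= \left|\int_{x_2}^{x_1} W'(t)\,dt\right|
       \le 2c\,|x_1-x_2| .
\end{align*}
```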

Lemma 6.21. If $W(v) \ne 0$ for $v_1 < v < v_2$ then there exists $M_2$ such that:
\[
\left|\int_{v_1}^{v_2} W(v)\,dv\right| \le M_2
\]
Proof. We see through an application of Abel summation that:
\[
\int_2^x \frac{\psi(t)}{t^2}\,dt = \log x + O(1)
\]
Or, using the remainder function:
\[
\int_2^x \frac{R(t)}{t^2}\,dt = O(1) \tag{6.22}
\]
Taking $\frac{S(y)}{y^2}$, we do another double integral decomposition:
\[
\int_2^x \frac{S(y)}{y^2}\,dy = \int_2^x \frac{dy}{y^2}\int_2^y \frac{R(t)}{t}\,dt = \int_2^x \frac{R(t)}{t}\left(\int_t^x \frac{dy}{y^2}\right)dt
\]
This simplifies to:
\[
\int_2^x \frac{R(t)}{t^2}\,dt - \frac{1}{x}\int_2^x \frac{R(t)}{t}\,dt
\]
Using 6.2 and 6.22, it follows that $\int_2^x \frac{S(y)}{y^2}\,dy = O(1)$. Performing the change of variables $y = e^{u}$, $x = e^{v}$ we have:
\[
\int_{\log 2}^{v} W(u)\,du = O(1)
\]
Letting $v = v_1$ and $v = v_2$ and subtracting the integrals with these as endpoints, we have that the resulting integral is bounded, and thus there exists $M_2$ such that:
\[
\left|\int_{v_1}^{v_2} W(u)\,du\right| \le M_2
\]
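The change of variables can be checked directly from $W(u) = e^{-u}S(e^{u})$. The computation below is a sketch added for convenience; the constant $C$ is a hypothetical name for the $O(1)$ bound, so that one admissible choice is $M_2 = 2C$.

```latex
\begin{align*}
y = e^{u},\quad dy = e^{u}\,du
  &\;\Longrightarrow\;
  \frac{S(y)}{y^{2}}\,dy = e^{-2u}S(e^{u})\,e^{u}\,du = W(u)\,du, \\
\left|\int_{v_1}^{v_2} W(u)\,du\right|
  &= \left|\int_{\log 2}^{v_2} W(u)\,du - \int_{\log 2}^{v_1} W(u)\,du\right|
  \le C + C = 2C .
\end{align*}
```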

Lemma 6.23. A function $W(x)$ subject to 6.17, 6.18, 6.20 and 6.21 must have $\alpha = 0$.
Proof. Take $\omega > \alpha$. Then from the definition of $\alpha$, there exists $x_\omega$ such that for all $x \ge x_\omega$ we have:
\[
|W(x)| \le \omega \tag{6.24}
\]
If $W(x) \ne 0$ for all large $x$, then $W$ eventually has constant sign, so it follows from 6.21 that $\beta = 0$ and thus, by 6.18, $\alpha = 0$. Thus, suppose that $W$ has arbitrarily large zeros. Let $a, b$ be adjacent zeros of $W(x)$ with $a > x_\omega$. We now have 3 cases.
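(1) $(b - a) \ge \frac{2M_2}{\omega}$
By 6.21, as $W(x) \ne 0$ for $a < x < b$ we have:
\[
\int_a^b |W(u)|\,du = \left|\int_a^b W(u)\,du\right| \le M_2 \le \frac{1}{2}(b-a)\,\omega
\]
Thus, the average of $|W|$ on $(a, b)$ is at most $\frac{1}{2}\omega$.
(2) $(b - a) \le \frac{2\omega}{k}$
Here, it follows from 6.20 (the Lipschitz property of $W$) that even if the graph of $|W(x)|$ rises as rapidly as possible from $x = a$ and falls as rapidly as possible to $x = b$, it cannot lie above a triangle with height $\frac{k(b-a)}{2} \le \omega$, and thus:
\[
\int_a^b |W(x)|\,dx \le \frac{1}{2}(b-a)\,\omega
\]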
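(3) $\frac{2M_2}{\omega} \ge (b - a) \ge \frac{2\omega}{k}$
Within a distance $\omega/k$ of each endpoint we can use the reasoning of case 2. On the rest of the interval we use 6.24, so that:
\[
\int_a^b |W(x)|\,dx \le \frac{\omega^2}{k} + \left(b - a - \frac{2\omega}{k}\right)\omega
\]
Simplifying the right-hand side above we have:
\[
(b-a)\,\omega\left(1 - \frac{\omega}{k(b-a)}\right) \le (b-a)\,\omega\left(1 - \frac{\omega^2}{2M_2 k}\right)
\]
And this is strictly less than:
\[
(b-a)\,\omega\left(1 - \frac{\alpha^2}{2M_2 k}\right) \tag{6.25}
\]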
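as $M_2 k > 1$ and $\alpha \le 1$. If $x_1$ is the first zero of $W(x)$ to the right of $x_\omega$ and $\bar{x}$ is the largest zero to the left of $y$, then 6.25 and 6.21 imply that:
\[
\int_0^{y} |W(x)|\,dx \le \int_0^{x_1} |W(x)|\,dx + (\bar{x} - x_1)\,\omega\left(1 - \frac{\alpha^2}{2M_2 k}\right) + M_2
\]
Dividing by $y$ and noting that $\bar{x} \le y$ we have:
\[
\frac{1}{y}\int_0^{y} |W(x)|\,dx \le \frac{1}{y}\int_0^{x_1} |W(x)|\,dx + \frac{\bar{x} - x_1}{y}\,\omega\left(1 - \frac{\alpha^2}{2M_2 k}\right) + \frac{M_2}{y}
\]
Letting $y \to \infty$ we see that:
\[
\beta \le \omega\left(1 - \frac{\alpha^2}{2M_2 k}\right)
\]
and because $\beta \ge \alpha$, we see:
\[
\alpha \le \omega\left(1 - \frac{\alpha^2}{2M_2 k}\right)
\]
Since this inequality holds for all $\omega > \alpha$, it must hold for $\omega = \alpha$. Thus, $\alpha^3 \le 0$ and since $\alpha \ge 0$ it follows $\alpha = 0$.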
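Two of the algebraic steps in this argument are compressed, so a short worked version may help. This is a sketch added for illustration: the first line totals two triangles of base $\omega/k$ and height $\omega$ plus a rectangle of height $\omega$, the second uses the case 3 hypothesis $(b-a) \le 2M_2/\omega$, and the last sets $\omega = \alpha$ in the final inequality.

```latex
\begin{align*}
2\cdot\frac{1}{2}\cdot\frac{\omega}{k}\cdot\omega + \left(b-a-\frac{2\omega}{k}\right)\omega
  &= (b-a)\,\omega - \frac{\omega^{2}}{k}
   = (b-a)\,\omega\left(1-\frac{\omega}{k(b-a)}\right), \\
(b-a) \le \frac{2M_2}{\omega}
  &\;\Longrightarrow\;
  \frac{\omega}{k(b-a)} \ge \frac{\omega^{2}}{2M_2 k}, \\
\alpha \le \alpha\left(1-\frac{\alpha^{2}}{2M_2 k}\right)
  &\;\Longrightarrow\;
  \frac{\alpha^{3}}{2M_2 k} \le 0
  \;\Longrightarrow\; \alpha^{3} \le 0 .
\end{align*}
```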
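It follows then that:
\[
\lim_{y\to\infty} \frac{S(y)}{y} = 0
\]
Thus, for any given $\varepsilon > 0$ (we may assume $\varepsilon < 1$) we have for large $y$ that:
\[
|S(y)| \le \frac{1}{3}\varepsilon^2 y
\]
Hence, we see:
\[
S(y(1+\varepsilon)) - S(y) \le \frac{1}{3}\varepsilon^2\left(y(1+\varepsilon) + y\right) < \varepsilon^2 y
\]
Expanding $S$, we have:
\[
\int_{y}^{y(1+\varepsilon)} \frac{R(t)}{t}\,dt \le \varepsilon^2 y
\]
By the definition of $R$ and because $\psi$ is nondecreasing, we have:
\[
\int_{y}^{y(1+\varepsilon)} \frac{\psi(y)}{y(1+\varepsilon)}\,dt - \int_{y}^{y(1+\varepsilon)} dt \le \varepsilon^2 y
\]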
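Both integrands in the last display are constant in $t$ and the interval of integration has length $\varepsilon y$, so the inequality can be evaluated explicitly. The following short computation is added as a sketch of that step; it divides by $\varepsilon y > 0$.

```latex
\begin{align*}
\frac{\psi(y)}{y(1+\varepsilon)}\cdot \varepsilon y - \varepsilon y \le \varepsilon^{2} y
  &\;\Longrightarrow\; \frac{\psi(y)}{y(1+\varepsilon)} \le 1+\varepsilon \\
  &\;\Longrightarrow\; \frac{\psi(y)}{y} \le (1+\varepsilon)^{2}.
\end{align*}
```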
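Hence, we have that $\frac{\psi(y)}{y} \le (1+\varepsilon)^2$. Similarly, the bound $S(y) - S(y(1-\varepsilon)) \ge -\varepsilon^2 y$ for large $y$ leads to $\frac{\psi(y)}{y} \ge (1-\varepsilon)^2$. Because $\varepsilon$ is arbitrary, it follows that:
\[
\lim_{x\to\infty} \frac{\psi(x)}{x} = 1
\]
proving 1.2. □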
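The limit can also be observed numerically. The sketch below is not part of the proof: it computes $\psi(x) = \sum_{p^k \le x} \log p$ with a smallest-prime-factor sieve and prints $\psi(x)/x$ for a few values of $x$. The function name `chebyshev_psi` and the sample points are illustrative assumptions, not something taken from the paper.

```python
import math


def chebyshev_psi(n: int) -> list:
    """Return a list psi with psi[m] = sum of log p over prime powers p^k <= m.

    Illustrative helper (hypothetical name); assumes n >= 1.
    """
    # Smallest-prime-factor sieve: spf[m] is the least prime dividing m.
    spf = list(range(n + 1))
    for p in range(2, math.isqrt(n) + 1):
        if spf[p] == p:  # p is prime
            for multiple in range(p * p, n + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p

    psi = [0.0] * (n + 1)
    for m in range(2, n + 1):
        p = spf[m]
        q = m
        while q % p == 0:
            q //= p
        # m is a prime power exactly when stripping its least prime leaves 1;
        # in that case the von Mangoldt function contributes log p.
        lam = math.log(p) if q == 1 else 0.0
        psi[m] = psi[m - 1] + lam
    return psi


if __name__ == "__main__":
    N = 10**6
    psi = chebyshev_psi(N)
    for x in (10**2, 10**3, 10**4, 10**5, 10**6):
        print(x, psi[x] / x)
```

The ratios printed should drift toward 1 as $x$ grows, which is consistent with, but of course no substitute for, the theorem. A sieve was chosen over trial division so that a range up to $10^6$ stays fast in pure Python.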
Acknowledgments. It is a pleasure to thank my mentor, Karen Butt, for her
assistance in writing this paper, as well as for her confidence in me to write
on a topic that most would consider ambitious. I would also like to thank the
program director, Peter May, for organizing this research experience.

References
[1] Norman Levinson. A Motivated Account of the Elementary Proof of the Prime Number Theorem. https://www.jstor.org/stable/2316361
[2] Tom M. Apostol. Introduction to Analytic Number Theory. http://plouffe.fr/simon/math/IntrodAnalyticNTApostol.pdf
