0% found this document useful (0 votes)
76 views

On the difference π (x) − li (x) : Christine Lee

This document discusses the difference between the prime counting function π(x) and the logarithmic integral li(x). It summarizes that while π(x) is always less than li(x) for x up to 1014, Littlewood proved in 1914 that π(x) exceeds li(x) at infinitely many values of x, with the differences reaching arbitrarily large magnitudes. Skewes provided the first unconditional upper bound in 1955 below which π(x) is guaranteed to exceed li(x). Subsequent work by Lehman and others have greatly reduced this bound through computational methods. The document then outlines the proofs and computational advances that have lowered this upper bound.

Uploaded by

Khokon Gayen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
76 views

On the difference π (x) − li (x) : Christine Lee

This document discusses the difference between the prime counting function π(x) and the logarithmic integral li(x). It summarizes that while π(x) is always less than li(x) for x up to 1014, Littlewood proved in 1914 that π(x) exceeds li(x) at infinitely many values of x, with the differences reaching arbitrarily large magnitudes. Skewes provided the first unconditional upper bound in 1955 below which π(x) is guaranteed to exceed li(x). Subsequent work by Lehman and others have greatly reduced this bound through computational methods. The document then outlines the proofs and computational advances that have lowered this upper bound.

Uploaded by

Khokon Gayen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 41

On the difference π(x) − li(x)

Christine Lee

Introduction
Riemann’s memoir [13] and the subsequent efforts to prove his statements have
answered a lot of the questions we have about prime numbers. For the prime
counting function
π(x) := # of primes ≤ x,
we now have an explicit formula which expresses this quantity in terms of known
functions. We also have the Prime Number Theorem [11, p.168], which tells us
that the chance that a large number N is a prime number is roughly 1/ log N .
Riemann’s ideas illustrate the connection between complex analysis and number
theory, and paved the way to many results that had been unimaginable. How-
ever, his paper also spawned many more problems, all of considerable difficulties.
Numerous examples can be given, of which the Riemann Hypothesis is the most
prominent [5]. However, in this paper, we will mainly be concerned with the
question described below:
Consider the explicit formula
Z ∞
X
ρ du
J(x) = li(x) − li(x ) + 2
− log 2,
ρ x (u − 1)u log u

where J(x) is defined by


1 1
J(x) = π(x) + π(x1/2 ) + π(x1/3 ) . . .
2 3
When Möbius inversion is used to write π(x) as a function of J(x), we have an
expression of π(x) in terms of li(x) and the zeros of the Riemann zeta function.
This is the main result of Riemann’s paper. A natural question to ask at this point
is how well li(x) approximates π(x). More precisely, we want to know the size of
the error π(x) − li(x), whether it is always positive or negative. The numerical
evidence to this date seems to suggest that this difference is always negative.
Indeed, the increasing difference of π(x) − li(x) from select values up to 1022
could remove the doubt from anyone who finds numerical evidence convincing
enough:

1
x π(x) π(x) − li(x)
108 5761455 -753
109 50847534 -1700
1010 455052511 -3103
1011 4118054813 -11587
1012 37607912018 -38262
1013 346065536839 -108970
1014 3204941750802 -314889
1015 29844570422669 -1052618
1016 279238341033925 -3214631
1017 2623557157654233 -7956588
1018 24739954287740860 -21949554
1019 234057667276344607 -99877774
1020 2220819602560918840 -222744643
1021 21127269486018731928 -597394253
1022 201467286689315906290 -1932355207

Table 1: Primes up to selected x, and the values π(x) − li(x).

It has also been verified through heavy computation that π(x) is less than li(x)
for all x up to 1014 [8]. Nevertheless, it remains a fact, proven by Littlewood
in 1914 [10], that not only does π(x) exceed li(x) at infinitely many x’s on the
real line, their differences also reach arbitrarily large magnitude. Forty one years
later, Skewes succeeded in obtaining the first unconditional upperbound
3
1010
1010

below which π(x) is guaranteed to exceed li(x) [14]. This remarkably large num-
ber has since been greatly reduced through the joint efforts of the computer and
mathematics, of which Lehman’s computational formula [9] is the principal figure.
In the subsequent sections of this paper I shall gradually develop the reasoning
which leads to this result. To that end, I will include a self-contained proof of
Littlewood’s and Lehman’s theorem (see [11] and [9]), while only assuming some
standard facts from number theory and complex analysis. Although the methods
used are elementary, the proofs themselves are by no means trivial and are highly
sensitive to the validity and the falsity of the Riemann Hypothesis. This results
in completely different proofs for both cases. Following this exposition, I will
present a brief summary in the computational advances in lowering the upper
bound below which π(x) − li(x) > 0, with particular emphasis on the result of
Chao and Plymen [4]. Their computation gives the smallest upper bound to date.

This article comprises the author’s MSc dissertation (Manchester University


2008); some slight editorial changes have been made by the author’s supervisor
Professor Roger Plymen.

2
Notations

z, s A complex number.
ζ(s) The Riemann P∞Zeta1 function on the complex plane C,
defined by n=1 ns in the half plane Re(s) > 1.
ρ A zero of the zeta function in the critical strip
0 < Re(s) < 12 in C. R∞
Γ(s) The Gamma function. Defined by 0 ts−1 e−t dt on the
half plane Re(s) > 0.
β The real part of ρ.
γ The imaginary part of ρ.
x A real number.
n A nonnegative integer.
µ(n) := 0 if n is not square free, := (−1)k , where k is the number
of distinct prime factors of n. The Möbius function.
Λ(n) := log p if n = pk for a prime p, := 0 otherwise.
log(x) := The natural logarithm of x.
π(x) ThePnumber of primes not exceeding x.
θ(x) := p prime ≤x log p .
:= P∞ 1 1/k
P
J(x) k=1 k π(x ).
ψ(x) := n≤x Λ(n) .
ϑ A complex number satisfying |ϑ| ≤ 1.
limn→∞ inf xn := sup{inf{xm : m ≥ n} : n ≥ 0}.
limn→∞ sup xn := inf{sup{xm : m ≥ n} : n ≥ 0}.
kxk The distance from x to the nearest integer.
[x] The greatest integer not exceeding x.
{x} The fractional part of x.
f (z) = O(g(z)) |f (x)| ≤ Cg(x) where C is an absolute constant.
f (z)  g(z) f (z) = O(g(z)).
f (z)  g(z) g(z) = O(f (z)).
f (x)  g(x) cf (x) ≤ g(x) ≤ Cf (x) for some positive absolute
constants c, C.
f (x) ∼ g(x) limx→∞ f (x)/g(x) = 1.

3
We list some preliminary lemmas.

1
The Riemann-Stieltjes Integral [11, p.486–494]
Because we will frequently encounter sums taken over discrete step functions it
is essential to develop a notion of integrals to work with them. This enables us
to make very precise estimates for select portions of infinite series such as
X 1
.
γ2
0<|γ|<∞

For a, b ∈ R and a < b, let x = {x0 , x1 , . . . , xn } be a partition of the interval


[a, b]:
a = x0 ≤ x1 ≤ · · · xn = b .
Let x0k be any number such that xk−1 ≤ x0k ≤ xk , we can form the sum from x
and x0 = {x01 , x02 , . . . , x0n }:
n
X
0
S(x, x ) = f (x0k )(g(xk ) − g(xk−1 )).
k=1

Rb
Definition. We say that the Riemann-Stieltjes integral a f (x) dg(x) exists and
has the value I if for evey  > 0 there is a δ > 0 such that

|S(x, x0 ) − I| ≤ 

whenever
mesh{x} = max (xk − xk−1 ) ≤ δ .
1≤k≤n

R b Now we prove a criteria for the existence of the Riemann-Stieltjes integral


a
f (x) dg(x) for a pair of functions f and g.

Definition. We define the variation of g on an interval [a, b] to be


n
X
Var[a,b] (g) = sup |g(xk ) − g(xk−1 )| ,
x
k=1

where the supremum is taken over all partitions x on [a, b].


Rb
Lemma 0.1. The Riemann-Stieltjes integral a f (x) dg(x) exists if f is contin-
uous on [a, b] and g is of bounded variation.

Proof. Since f is continuous on [a, b], for every  > 0 we can find a δ > 0 such
that |f (x) − f (y)| <  for all x, y ∈ [a, b] satisfying |x − y| < δ. We show that for
every  > 0, there exists a δ such that

|S(x, x0 ) − S(y, y0 )| ≤ 2 Var[a,b] (g)


1
obtained directly from [11] with minor changes in notation.

4
whenever mesh{x}, mesh{y} < δ. Consider the partition z which is the union of
x and y. We can write

S(x, x0 )
X n
= f (x0k )(g(xk ) − g(xk−1 ))
k=1
n
X
= f (x0k )(g(xk ) − g(zk` ) + g(zk` ) − g(zk(`−1) ) + · · · + g(zk1 ) − g(zxk−1 )) ,
k=1

where {zk1 , zk2 , . . . , zk` } is the partition of z on the interval (xk−1 , xk ). The
absolute value of difference can be written as

|S(x, x0 ) − S(y, y0 )|

XI
= (f (x0i ) − f (yi0 ))(g(zi ) − g(zi−1 ))


i=1

Since mesh{x} and mesh{y} are less than δ, |f (x0i ) − f (yi0 )| <  for all 1 ≤ i ≤ I,
and

XI
≤  (g(zi ) − g(zi−1 ))


i=1
XI
≤ |g(zi ) − g(zi−1 )|
i=1
≤  Var[a,b] (g),

and we are done since we have a Cauchy sequence.


The reason for defining this integral is so that we can write
Xn Z n
ak f (k) = f (x) dA(x) ,
k=1 0

where A(k) − A(k − 1) = ak and A(x) = j−1


P
k=0 ak for x ∈ [j − 1, j). Certainly
the endpoints of the integral are flexible as long as they don’t change the value
of the the sum.
We also need the formula for integration by parts.
Rb Rb
Lemma 0.2. If a f dg exists for functions f, g, then a g df also exists, and
Z b Z b
g df = f (b)g(b) − f (a)g(a) − f dg.
a a

Proof. For any partition x on [a, b], we expand the sum S(x, x0 ) according to the
Rb
definition for a f dg:
n
X
0
S(x, x ) = f (x0k )(g(k) − g(k − 1)) ,
k=1

5
since x0 = a and xn = b, we have, after choosing x01 = a and x0n = b,
= f (a)(g(x1 ) − g(a)) + f (x2 )(g(x2 ) − g(x1 )) + · · · + f (b)(g(b) − g(xn−1 ))
X n
= f (b)g(b) − f (a)g(a) − g(xk )(f (x0k ) − f (x0k−1 )) .
k=1

thus if we take the sums over all possible partitions of [a, b] we have
Z b Z b
g df = f (b)g(b) − f (a)g(a) − f dg .
a a
The final, and the most useful lemma developing the notion of the Riemann-
Stietljes integral is the following:
Lemma 0.3. If g 0 is continuous on [a, b], then
Z b
V ar[a,b] g = |g 0 (x)| dx .
a
If in addition f is Riemann integrable, then
Z b Z b
f (x) dg(x) = f (x)g 0 (x) dx .
a a
0
Proof. Suppose g is continuous on [a, b], then by the mean value theorem there
is a x0k ∈ [xk−1 , xk ] such that g 0 (x0k )(xk − xk−1 ) = g(xk ) − g(xk−1 ), thus
X n X n
|g(xk ) − g(xk−1 )| = |g 0 (x0k )(xk − xk−1 )|
k=1 k=1

Consider two partitions x, y, where x is a subpartition of y , then


X n Xn
|g(xk ) − g(xk−1 )| ≤ |g(yk ) − g(yk−1 )|
k=1 k=1
Rb
by the triangle inequality. Certainly the Riemann-Stieltjes integral a |g 0 (x)|dx
exists by Lemma 0.1 since g 0 is continuous. It is also the supremum of the right
hand side of the equation above, and we have the first statement. The second
statement is given by a similar substitution.
X n
f (x0k )(g(xk ) − g(xk−1 ))
k=1
n
00
X
= f (x0k )g 0 (xk )(xk − xk−1 )
k=1
n n
00
X X
= f (x0k )g 0 (x0k )(xk − xk−1 ) + f (x0k )(g 0 (xk ) − g 0 (x0k ))(xk − xk−1 )
k=1 k=1
00
where xk , x0k 0
∈ [xk , xk−1 ]. Since g is continuous, as mesh{x} tends to zero, the
second term of the sum tends to
Xn
 f (x0k )(xk − xk−1 ) = M,
k=1

and M is the value of the Riemann integral of f on the interval [a, b], and we are
done.

6
Backlund’s Formula [9, p. 399]2
The Riemann-Stieltjes integral will be used in combination with the following
formula for the number of zeta zeros up to a height T to estimate sums involving
the zeta zeros. We will also assume for the rest of the paper that whenever we
encounter a sum taken over all the zeros of the zeta function, each term of the
sum is arranged according to the increasing order of the imaginary parts.
Lemma 0.4. Let N (T ) be the number of zeta zeros ρ with 0 < γ < T , then
T T T 7
N (T ) = log − + + Q(T ) for (T > 2)
2π 2π 2π 8
where

|Q(T )| < 0.137 log T + 0.443 log log T + 4.35 .


With this formula and the Stieltjes integral, we can prove the following.
Lemma 0.5. X 1
n
< T 1−n log T ,
γ>T
γ
where the γ’s are arranged in increasing order.
Proof.
X 1 Z ∞
1
n
= n
dN (t)
γ>T
γ T t
Z ∞ Z ∞
1 t 1 0
= n
log dt + Q (t) dt
T 2πt 2π T tn
  ∞
1 1 t 1
= log −
2π tn−1 (−n + 1) 2π tn−1 (−n + 1)2 T
∞ Z ∞
1 1
+ n Q(t) + n

n+1
Q(t) dt
t T T t

Since |Q(t)| < 0.137 log T + 0.443/2 log T + 1.533 log T < 2 log T for T > 2πe,
 
1−n 1 log 2π 1
<T log T + +
2π(n − 1) 2π log T (n − 1) 2π(n − 1)2 log T
∞ Z ∞
2 log T 1
−2 1
+ + 2 log t d log t
Tn tn
T T tn
where since n ≥ 2,
   
1−n 1 log 2π 1 1−n 4 1
≤T log T + + 2 +T log T + .
2π 2π log T 4π log T T T log T

The terms multiplying T 1−n log T is less than 0.574, so

< T 1−n log T.


2
Here we are using the formulation from [1] as used by Lehman because it is the most directly
accessible. There are different formulations for other uses which can be found in number theory
texts such as [11].

7
Here is an estimate of the sum:
X 1
< 0.025 . (0.1)
0<γ<∞
γ2

Cauchy’s Integral Formula


We will also use some standard lemmas from complex analysis.

Definition. Let A ⊂ B ⊆ C and f : A → C be analytic. A meromorphic


extension of f is a meromorphic function g : B → C such that g|A = f .

Lemma 0.6. The meromorphic extension of an analytic function to a larger


domain is unique; i.e., with the notation above, if h : B → C is meromorphic
and has the property that h|A = f , then g = h on B.

Lemma 0.7. Suppose U is an open subset of the complex plane C, and f : U → C


is a holomorphic function defined on U , and the closed disk D centered at z0 with
radius r is completely contained in U . Let C be the circle forming the boundary
of D. Then for every z in the interior of D,
I
1 f (s)
f (z) = ds
2πi C s − z

and
I
(n) n! f (s)
f (z) = ds .
2πi C (s − z)n+1

Explicit Formulas
3
We shall use the following explicit formulas without proof. The logarithmic
integral is defined as follows.

Definition.
x+iy
et
Z
z
li(e ) := dt
−∞+iy t
where z = x + iy, y 6= 0.

For x > 1, li(x) is then defined as follows:


1
li(x) := [li(x + i0) + li(x − i0)]
2
In this way, we recover the classical definition of li(x) as an integral principal
value: Z 1− t Z x t 
e e
li(x) = lim dt + dt
→0 0 t 1+ t

3
Interested readers can check out [6], which discusses the proofs and the history of these
explicit formulas in detail.

8
Lemma 0.8. ∞
X xρ ζ0 X x−2k
ψ(x) = x − − (0) + . (0.2)
ρ
ρ ζ k=1
2k

Lemma 0.9.
Z ∞
X
ρ du
J(x) = li(x) − li(x ) + − log 2 . (0.3)
ρ x (u2 − 1)u log u

We use the following formulation of the Riemann Hypothesis for Littlewood’s


theorem:
The supremum of the real parts of the zeros of the zeta function does not exceed
1
2
; i.e. write ρ = β + iγ, then β ≤ 21 .
This alternative formulation is used by Lehman:
The real parts of the zeros of the zeta function lie on the critical line; i.e. write
ρ = β + iγ, then β = 12 .

1 Littlewood’s theorem
How do we even know whether π(x) exceeds li(x) at some point? Let us look at
the explicit formulas. We can write4

X µ(k)
π(x) = J(x1/k ) . (1.1)
k=1
k

By (0.3), we have

" #
Z ∞
X µ(k) X du
π(x) = li(x1/k ) − li(xρ/k ) + 5
. (1.2)
k=1
k ρ x1/k (u2 − 1)u log u

when the difference is written in this way the nature of the difference π(x) − li(x)
is difficult to obtain. Therefore, this formula is not very useful when the goal
is just to prove that the difference changes sign at some point. Instead, we will
make do with very rough estimates of π(x). This difference in approach will be
important later when we actually want to find a precise value for the first x where
the sign switch occurs.
Since we prefer not to work directly with the formula for π(x), however accu-
rate it is, we consider other functions which are related to π(x). It turns out that
the prime number theorem can be easily deduced from the asymptotic formula
ψ(x) ∼ x [6, p.76–77]. We consider the function θ(x), which gives the following
equation
X∞
ψ(x) = θ(x1/k ) .
k=1

4
P This can be seen by substituting the definition for J(x) into the sum and noting that
d|n µ(d) = 0 for n > 1, or by the sieve method.
P∞
5
The constant term log 2 disappears because k=1 µ(k)
k = 0.

9
On the other hand6 ,

(1 − δ) log x(π(x) − x1−δ ) ≤ θ(x) ≤ log x π(x)


π(x) log x log x 1−δ 1 π(x) log x
⇒ ≤ x + and ≥1
θ(x) θ(x) 1−δ θ(x)
for any constant δ such that 0 ≤ δ ≤ 1. Since we can deduce θ(x) ∼ x from
ψ(x) ∼ x, the term log x 1−δ
θ(x)
x disappears because

log x 1−δ log x 1−δ


x ∼ x = 0 as x → ∞ ,
θ(x) x

and we can choose δ such that π(x) log x


θ(x)
is as close to 1 as we would like as x tends
to infinity, thus completing the proof. What is noteworthy here is the connection
between ψ(x) and π(x) via θ(x). From the last two inequalities, we have
 
1−δ θ(x) 1
π(x) ≤ x + .
log x 1 − δ

By definition of ψ(x) we also have that

ψ(x) = θ(x) + x1/2 + O(x1/3 ) .

We reconsider the explicit formula of ψ(x):



X xρ ζ0 X x−2k
ψ(x) = x − − (0) + .
ρ
ρ ζ k=1
2k

It is apparent now, that if only the first term in the sum of the right hand side
of the explicit formula for π(x) “counts”, namely, that
Z ∞
X
ρ du
π(x) = li(x) − li(x ) + 2
− log 2 + error, (1.3)
ρ x (u − 1)u log u

there is an analog between the difference ψ(x)−x and π(x)−li(x). Write c = 1−δ,
all we know is that
1 θ(x)
π(x) − li(x) = + O(xc ) − li(x)
c log x

Substitute θ(x) = ψ(x) − x1/2 + O(x1/3 ), we have

x − x1/2 + O(x1/3 )
 
1 ψ(x) − x c
= + + O(x ) − li(x)
c log x log x
6

X X
θ(x) = log p ≤ log x = π(x) log x , and
p prime ≤x p prime ≤x
X X
≥ log p ≥ (1 − δ) log x ≥ (1 − δ) log x(π(x) − π(x1−δ ))
x1−δ ≤p prime ≤x x1−δ ≤p prime ≤x

10
If ψ(x) − x oscillates with a magnitude that overpowers the terms in the paren-
thesis, then we have that π(x) − li(x) oscillates as well. We can make a further
simplification if we note that [11]
 
d u 1 1
= − ,
du log u log u (log u)2

and
Z x
x 2 1
li(x) = − + du
log x log 2 2 (log u)2
so
Z x
1 ψ(x) − x x1/2 O(x1/3 ) 2 1
π(x) − li(x) = − + + − 2
du + O(xc )
c log x log x log x log 2 2 (log u)
 1/2  Z x
1 ψ(x) − x x 1
= +O − 2
du + O(xc ) .
c log x log x 2 (log u)

Now we see that it’s not enough that ψ(x) − x changes sign, it also has to
change sign with magnitude greater than x1/2 . The upper bound of the error
θ(x)
term π(x) − log x
indicated by O(xc ) does not suffice anymore, we actually need
Rx
a bound on the size of the error term minus 2 (log1u)2 du.
We use the Stieltjes integral to write
Z x
1
π(x) = dθ(u)
2 log u
 x Z x
1 1
= θ(u) + θ(u) du .
log u 2 2 u(log u)2
So now the error term plus the integral becomes
Z x
θ(u) − u
2
du ,
2 u(log u)

and with integration by parts,


x Z x
θ1 (u) − u2 /2 θ1 (u) − u2 /2
  
2
= + 1− du
u(log u)2 2 2 u2 (log u)2 (log u)3 u2
where θ1 (x) is the integral of θ(x) from 2 to x. Since θ(x) can be written in terms
of ψ(x), we make use of the explicit formula for ψ(x) to find the integral of θ(x).
The only problem is that there is an oscillatory sum over the zeta zeros in the
expression of ψ(x).
Z x
x2 X xρ+1 ζ0 ζ0
ψ(u) du = − − (0)x + (−1) + O(x−1/2 ), but
2 2 ρ
ρ+1 ζ ζ


X xρ+1 X xβ+1
≤ ,

ρ(ρ + 1) γ2


ρ ρ

11
and the numerator of each term is bounded byPxΘ+1 , where Θ denotes the supre-
mum of the real parts of the zeta zeros. Also, γ>0 γ12 converges by (0.1) , so the
oscillatory term is O(xΘ+1 ). If we assume that the Riemann hypothesis is true,
we can immediately integrate the explicit formula for ψ(x) and obtain
Z x
θ1 (x) = ψ(x) − x1/2 + O(x1/3 ) dx
2
2
x
= + O(x3/2 ) .
2
Finally,
x
θ(u) − u x2 /2 + O(x3/2 ) − x2 /2
Z
du = + O(1)
2 u(log u)2 x(log x)2
Z x 2
u /2 + O(u3/2 ) − u2 /2
 
2
+ 1− du
2 u2 (log u)2 (log u)3 u2
 1/2 
x
=O
(log x)2
after estimating the last integral. Now we can be assured that every term in
x1/2
the expression for π(x) − li(x) is bounded by log x
. We drop the now irrelevant
constant c and write
 1/2 
ψ(x) − x x
π(x) − li(x) = +O . (1.4)
log x log x
We introduce the main statement that leads to Littlewood’s Theorem and the
notation f (x) = Ω± (g(x)).
Theorem 1.1. If the Riemann Hypothesis is true, then [11, p.477]
Z eδx
1 X sin γδ sin(γ log x)
(ψ(u) − u)du = −2x1/2 · + O(x1/2 ) (1.5)
(e − e−δ )x
δ
e−δx γ>0
γδ γ

uniformly for x ≥ 4, 1/2x ≤ δ ≤ 1/2.


This formula fully reveals the oscillatory nature of the difference ψ(x) − x.
Depending δ and log x, multiplying by γ sends sin γδ to different values on the
sine curve. At a casual glance, the interactions between each term of the sum
seem quite complicated, but with the help of Dirichlet’s theorem, the sum proves
to be quite manageable. Since x is taken over all real numbers, we might even
guess that the difference on the left hand side switches sign infinitely many times
because of the periodic nature of sine. Furthermore, it does so while achieving a
magnitude that is at least x1/2 . We make this notion rigorous by introducing the
notation Ω± .
Definition. Let f (x), g(x) be real-valued functions, we write f (x) = Ω+ (g(x))
if lim supx→∞ f (x)/g(x) > 0, f (x) = Ω− (g(x)) if lim inf x→∞ f (x)/g(x) > 0, and
f (x) = Ω± (g(x)) if both statements are true.
We can now state Littlewood’s theorem:

12
Theorem 1.2. (Littlewood) [11, p.478–479]
Assuming the Riemann Hypothesis, as x → ∞,

ψ(x) − x = Ω± (x1/2 log log log x) (1.6)

and

π(x) − li(x) = Ω± (x1/2 (log x)−1 log log log x) (1.7)

The first statement is derived from (1.5), the second statement is a direct
consequence of the first and (1.4). The factor log log log x is the lower bound on
the magnitude of the oscillation of γ>0 sinγδγδ · sin(γγlog x) . As we will see when
P
we prove the first statement, assuming the Riemann Hypothesis means the error
term O(x1/2 ) in (1.5) is not easily overwhelmed. Therefore, the case where the
Riemann hypothesis is true is the more difficult one to prove.
Proof. (Theorem 1.1)
The formula (1.5) is the average of ψ(x) − x over an interval. If this value is
positive then there must be a point within the interval where ψ(x) exceeds x,
therefore it is sufficient to prove that the sign-switching behavior occurs for this
integral.
Use the explicit formula for ψ(x) and write
Z x X xρ+1 ζ0
ψ(u) − u du = − − (0)x + O(1).7 (1.8)
0 ρ
ρ(ρ + 1) ζ

Replacing x with e±δ x, we have


Z eδ x Z e−δ x Z eδ x
(ψ(u) − u) du = − (ψ(u) − u) du + (ψ(u) − u) du + O(1)
e−δ x 0 0
X (eδ x)ρ+1 − (e−δ x) ρ+1
ζ0
=− − (0)(eδ x − e−δ x) + O(1) .
ρ
ρ(ρ + 1) ζ

Dividing both sides by eδ x − e−δ x = 2 sinh(δ)x,

δ X (eδ(ρ+1) − e−δ(ρ+1) )xρ


=− + O(1) . (1.9)
2 sinh(δ) ρ δρ(ρ + 1)

We invoke the Riemann Hypothesis and get that

eδ(ρ+1) = eδ(1/2+iγ+1)
= e3/2δ · eδiγ

!
X (3/2δ)n
= eδiγ
n=0
n!
= (1 + O(δ))eδiγ
x−2n
7
P∞ R
The O(1) comes from the sum over the trivial zeros of the zeta function, n=0 2n =
x−2n+1
P∞
n=0 (−2n+1)2n = O(1).

13
since δ ≤ 1/2 < 1. We use the same estimate for e−δ(ρ+1) . Replacing both terms
with those estimates we get

δ X (1 + O(δ))(eδiγ − e−δiγ ) ρ
(1.9) = − x + O(1) .
sinh δ ρ 2δρ(ρ + 1)

d δ −δ
Since dδ [sinh(δ)] = e +e
2
which is positive for all δ, sinh(δ) is strictly increasing
1
and so it reaches its smallest point at δ = 2x . Moreover, x ≥ 4, so the smallest δ
1
can be is zero, and we have that sinh δ is O(1). We bound the other terms in the
error term in the sum multiplied by O(δ):

δiγ −δiγ ρ 1/2
X 1 X 1
|e − e | = O(1) , |x | = O(x ) , ≤ = O(1) .


ρ
ρ(ρ + 1) ρ
γ2

Thus the error term is bounded by x1/2 . We reconsider δ/ sinh(δ) and take the
Taylor expansion of sinh(δ) to obtain δ/ sinh(δ) = 1 + O(δ 2 ). We once again try
to bound the error term

2 1/2
X sin γδ xiγ
O(δ ) · −ix · , (1.10)
ρ
δ ρ(ρ + 1)

iδγ −iδγ
where sin δγ = e −e 2i
. Since O(δ 2 ) = O(1), we need only to deal with the sum
multiplied by 1δ . If γ ≤ 1δ then γδ ≤ 1 and sinγδγδ ≤ 1, below the curve x = y.
Thus we can split the sum into two parts
1 X sin γδ 1X 1

δ ρ ρ(ρ + 1) δ ρ γ2
X 1 1 X 1
≤ + . (1.11)
γ δ γ2
0≤γ≤1/δ γ≥1/δ

To evaluate the first sum we form the Stieltjes integral and use the more general
form of Lemma 0.4:
T T T
N (T ) = log − + O(log T ) .
2π 2π 2π

Z 1/δ
X 1 1
= dN (t)
γ 0 t
0≤γ≤1/δ
Z 1/δ Z 1/δ
1 t 1
= log dt + d(O(log t)) .
0 2πt 2π 0 t

We use integration by parts for the second term, choosing dv = dO(log t), v =
O(log t), and get
Z 1/δ  1/δ Z 1/δ
1 O(log t) 1
d(O(log t)) = + O(log t) 2 dt
0 t t 0 0 t

14
that log t does not evaluate at 0 is not an issue because the interval of the Stieltjes
integral is flexible and we can always choose the lower bound to be slightly smaller
than the imaginary component of the first zeta zero above the real axis.
    Z 1/δ
log 1/δ log t
=O + a term ≤ dt
1/δ 0 t2
   1/δ Z 1/δ !
log 1/δ −1 1
=O +O log t · + dt
1/δ 2π 0 0 t2
 
log 1/δ
=O + O(δ) ,
1/δ

and the first term is just


" 2 #1/δ
1/δ
log 2πt
Z
1 t
log dt = = O((log(1/δ))2 ) .
0 2πt 2π 8π 2
0

Thus X 1
= O((log(1/δ))2 ).
γ
0<γ≤1/δ

We write similarly,
Z ∞
1 X 1 1
2
= dN (t)
δ γ 1/δ t2
γ>1/δ
Z ∞ Z ∞
1 t 1
= 2
log dt + 2
d(O(log t))
1/δ 2πt 2π 1/δ t

log 1/δ
 
2π log 1/δ log 1/δ
= + O(δ) + +O
1/δ (1/δ)2 1/δ
= O(log(1/δ)) .

and the entire sum over ρ is O((log 1/δ)2 ). We have


2
 
1/2 (log 1/δ)
(1.10) = O x = O(x1/2 ) , and
(1/δ)2
X sin γδ xiγ
(1.9) = −ix1/2 · + O(x1/2 ).
ρ
δ ρ(ρ + 1)

Now
 
− 1 = −1/2 = O 1 ,
1
ρ iγ i1/2γ − γ 2 γ2

so

xiγ
X sin γδ   
1/2 1 1
(1.9) = −ix · +O + O(x1/2 ) .
ρ
δ ρ+1 iγ γ2

15
We apply the estimate we have used before for the sum over ρ to deduce that the
error term from replacing ρ1 with iγ1 is O(x1/2 ). We can do the same thing again
with the (ρ + 1) term in the denominator and obtain
X sin γδ xiγ
(1.9) = −x1/2 · + O(x1/2 ) .
ρ
γδ iγ

The zeros of the zeta function come in conjugate pairs so when we take the sum
over γ > 0, we add a factor of 2, and we are done.
Estimating the sum in (1.5) requires knowing some information about the
imaginary parts of the zeros of the zeta function. The constant δ is fixed but
sin γδ oscillates depending on the position of γ modulo 2π. If γ log x is “close
enough” to γδ for |γ| < N , then we need only to estimate the simpler sum
2
X sin γδ
δ
γδ

γ<N

which, for small enough δ, will obey sin γδ ≈ γδ, and reduce to the still simpler
1X
1.
δ γ<N

The first obstacle here is to pick an x such that γ log x is close to γδ. That we
can do this is guaranteed by the following lemma due to Dirichlet. It turns out
that one does not need to know more than the number of zeros up to a certain
height.

Lemma 1.3. (Dirichlet)


If x1 , . . . , xK are real numbers, and N is a positive integer, then there is a positive
integer n ≤ N K such that kxk nk < 1/N for 1 ≤ k ≤ K.

Proof. We know that n = 0, 1, . . . , N K N K +1, different values. Let {xk n} denote


the fractional part of xk n, then

p(n) = ({x1 n}, {x2 n}, . . . , {xK n}) ∈ [0, 1)K .

We can partition [0, 1]K into N K smaller cubes with sides of length 1/N . If we
let n run over N K + 1 possibilities then there would be N K + 1 number of p(n)0 s.
By the pigeonhole principle there must be two p(n)0 s in the same hypercube, say
for n1 and n2 . Write, for each k, xk n1 = [xk n1 ] + {xk n1 }, then

kxk n1 − xk n2 k = k([xk n1 ] − [xk n2 ]) + ({xk n1 } − {xk n2 })k


= k{xk n1 } − {xk n2 }k ≤ |{xk n1 } − {xk n2 }| < 1/N ,

since |p(n1 ) − p(n2 )| ≤ 1/N . Assume n2 > n1 , we take n = n2 − n1 and we are


done.
Now we are ready to prove Littlewood’s theorem.

16
Proof. (Littlewood’s Theorem) We again assume RH and use equation (1.5). Let
N be an integer which we will take to be very large. Let T = N log N be the
height to which we would like to consider the zeta zeros, and let

{γ1 (log N )/2π, γ2 (log N )/2π, . . . , γN (T ) (log N )/2π}

be a collection of N (T ) real numbers, where N (T ) is the number of zeta zeros


with imaginary parts ≤ T . By Lemma 1.3, there exists an integer 1 ≤ n ≤ N N (T )
such that kγk (log N )/2π · nk < 1/N and 1 ≤ k ≤ N (T ).
Take x = N n e±1/N , δ = 1/N , we use the inequality

| sin 2πα ± sin 2πβ| ≤ 2πkα ± βk.8

By this identity, we have



γ γ log x ± γ/N
sin γ log x ± sin ≤ 2π
N 2π

γ(n log N ± 1/N ) ± γ/N
= 2π


γ log N n 2π
2π ≤ N
= 2π

for all γ up to height T . Consider the right hand side of (1.5) and substitute δ
for 1/N ,
X sin γ/N sin γ log x
−2x1/2 · + O(x1/2 ) .
γ>0
γ/N γ

We consider its difference and sum with a similar sum of sin γ/N over ρ and
estimate the error:
 2
X sin γ/N
(1.5) ∓ 2x1/2 N −1

γ/N

γ>0

X sin γ/N sin γ log x ∓ sin γ/N

1/2
= 2x γ/N · (S1)

0<γ<T
γ
X (sin γ/N )2
+ 2x1/2
γ/N · γ
(S2)
γ>T

X sin γ/N sin γ log x
1/2
+ 2x γ/N ·
. (S3)
γ>T
γ

We consider (S1) first,


X sin γ/N 2π/N
1/2 1/2
(S1) ≤ x γ 2 1/N = O(x ) .

0<γ<T

8
Use the sum-to-product formulas we get

| sin 2πα ± sin 2πβ| ≤ |2 sin π(α ± β)| ≤ 2πkα ± βk .

17
as we know from the Introduction that γ>0 1/γ 2 < ∞. For (S2), (S3) we note
P
that
X 1 log T log N + log log N 1
2
   .
γ>T
γ T N log N N

by Lemma 0.5. Thus these two terms are both O(x1/2 ) as well, and we may write
X  sin γ/N 2
1/2 −1
(1.5) = ±2x N + O(x1/2 ) (1.12)
γ>0
γ/N

when x = N n e±1/N and δ = 1/N . Now we show that this quantity is strictly
greater than x1/2 log log log x for suitable choices of N .
First we consider the sum of sines. This can be split into two:
" #
X  sin γ/N 2 X  sin γ/N 2 X  sin γ/N 2
−1 −1
N =N + .
γ>0
γ/N γ<T
γ/N γ>T
γ/N

As was discussed previously,

 N −1 (N log N + O(N ))
 log N + O(1) .

We know that N (T )  T log T  N (log N )2 , so that

log x ≤ N N (T ) log N + 1/N


⇒ log x = O(N N (T ) log N )
⇒ log log x = O(N (T ) log N + log log N ) .

subsituting the asymptotic estimate for N (T ), we get

⇒ log log log x  log N + 3 log log N


⇒ log log log x < C log N .

for some absolute constant C > 0. Now we divide the expression (1.12) by
x1/2 log log log x to get

(1.12) ±2x1/2 log N + O(x1/2 )


 .
x1/2 log log log x x1/2 log log log x

As x → ∞, the term bounded by x1/2 goes to zero, and we have

2
± .
C
By splitting the sum into two parts, we see that the contributions of the zeros
over a certain height, namely N , are trivial, and for a very specific value of x,
this sum can be approximated by a term of order log N . Passing the calculation
to the integrated version of the explicit formula for ψ(x) is helpful since γ>0 γ12
P
converges.

18
A point of interest here is whether this formula yields a numerical value
for x. Although a choice of x is explicitly stated in the proof, the use of the
number from Dirichlet’s lemma is merely existential. Also from this choice we
can already see how big this number would be, as verified by Skewes efforts to
find a numerical value. One might also be interested in a computational formula
based on Littlewood’s method. However, his method involves assuming that
the contribution from the square factors of x is always non-neligible, making the
switch from ψ(x) − x to π(x) − li(x) imprecise. He also selected his x by assuming
that the list of zeta zeros is an arbitrary list. This assumption is not quite
accurate as we will see from the much smaller upper bound found by Lehman
by actually entering the precise values for the zeta zeros in his computational
formula. Furthermore, finding an efficient algorithm for the elusive Dirichlet’s
number will be required for a direct implementation.
What about the case when RH is not true? As Littlewood said in his paper,
“one already knows more.”9 Indeed, the proof in the case where the Riemann
Hypothesis is false is much simpler. The inspiration comes from being able to
express the Mellin transform of the related functions in terms of the zeta function
itself. Let Θ be the supremum of the real part of the zeros of the zeta function,
and  > 0. We can write
Z ∞
1 ζ0 1
+ (s) + = (xΘ− − ψ(x) + x)x−s−1 dx .
s−Θ+ ζ s−1 1

If xΘ− > ψ(x)−x for all x, then the expression on the left is analytic for s > Θ−,
which contradicts the assumption that there is a zero of ζ(s) > Θ −  by the
definition of supremum. A rigorous proof of the above statement requires deeper
theoretical foundation than the case where the Riemann Hypothesis is true. In
particular, we will need to develop more theory about the general Dirichlet series
and examine their Mellin transforms. Also in contrast to Littlewood’s theorem,
this method of proof gives not even a number x in principal where the sign change
might occur.
Consider ζ(s) on the half plane Re(s) > 1, we can write
Y  −1
1
ζ(s) = 1− s (The Euler Product)
p prime
p

Since each of the factors is a holomorphic function in the half plane, and the
product on the right converges to to ζ(s) locally uniformly on compact sets, we
can write  −1
X 1
log ζ(s) = log 1 − s
p prime
p

and differentiate the sum on the right term by term to obtain the logarithmic
9
“on sait dèjá plus”.

19
derivative of ζ(s):
" −1 #
ζ0

d X 1
(s) = log 1 − s
ζ dsp prime
p
p−s
X  
=− log p
p prime
1 − p−s
 
X 1 1
=− log p s + 2s + · · ·
p prime
p p

X
=− Λ(n)n−s ,
n=1

since Λ(n) vanishes at any integer which is not a prime power. Then we use the
Stieltjes integral to write the partial sum
XN Z N
−s
Λ(n)n = x−s dψ(x)
n=0 1
Z N
= |ψ(x)x−s ]N
1 − ψ(x) dx−s
1
Z N
−s
= ψ(N )N +s ψ(x)x−s−1 dx .
1

Since we know ψ(N ) ∼ N as N → ∞, the first term goes to zero, and we have
only the integral left when Re(s) > 1, and
Z ∞
ζ0
− (s) = s ψ(x)x−s−1 dx . (1.13)
ζ 1

Since the meromorphic continuation of a function is unique, wherever the in-


tegral on the right hand side is defined outside of the half plane Re(s) > 1,
0
the meromorphic continuation of ζζ (s) must also be defined. We already have
some information on the real parts of the zeros of the Riemann zeta function,
so this restricts the magnitude of the function multiplying x−s−1 . Suppose that
ψ(x) − x ≤ xΘ− for all x and an  > 0 after a certain X , then we can consider
Z ∞
(xΘ− − ψ(x) + x)x−s−1 dx . (1.14)
0

On the left hand side we have


1 1 ζ0 1
+ (s) + . (1.15)
s−Θ+ s ζ s−1
If we can prove that the integral converges where there is a Riemann zeta zero,
there would be a contradiction. The other statement ψ(x) − x = Ω− (xΘ− ) can
be proved in an entirely analogous way. There are several issues here. First is the
validity of the formula like (1.13) if we were to use the same set up to prove the
equivalent statement for π(x) − li(x). Second is determining where the integral
(1.14) converges. To prove these things for each instance of Dirichlet series would
be cumbersome, so we will include the results for general Dirichlet series here,
starting with a definition.

20
Definition. A Dirichlet series is a series of the form α(s) = ∞ −s
P
n=1 an n .

Theorem 1.4. Any Dirichlet series α(s) = ∞ −s


P
n=1 an n has an abscissa of con-
vergence σc with the property that α(s) converges for all s with Re(s) > σc , and
for no s with Re(s) < σc .
Proof. We prove that for any s0 where the Dirichlet series converges, the series
is also convergent in the sector
S = {s : σ ≥ σ0 , |t − t0 | ≤ C(σ − σ0 )} ,
where C is any
P constant greater than 0.
Let R(N ) = n>N an n−s0 be the tail of α(s0 ). Write an = (R(n − 1) − R(n))ns0 ,
then
K
X K
X
an n−s = (R(n − 1) − R(n))ns0 −s .
n=M +1 n=M +1

we take the first and the final term of the sum

= R(M )M s0 −s − R(K)K s0 −s
" K #
X
+ (R(n − 1) − R(n))ns0 −s − R(M )M s0 −s + R(K)K s0 −s
n=M +1
K
X
= R(M )M s0 −s − R(K)K s0 −s − R(n − 1)((n − 1)s0 −s − ns0 −s )
n=M +1
K
X Z n
s0 −s s0 −s
= R(M )M − R(K)K − R(n − 1)(s0 − s) xs0 −s−1 dx .
n=M +1 n−1

Since α(s0 ) converges, with any  > 0, there exists an N 0 such that R(N ) <  for
all N > N 0 . Let M, L be such that M, K > N 0 , we have
K
X −s
s0 −s s0 −s


a n n =

R(M )M − R(K)K < 2
n=M +1
K Z n
Z ∞
X s0 −s−1
xσ0 −σ−1 dx .

+ R(n − 1)(s0 − s) x dx < |(s0 − s)|
n=M +1 n−1 M

Since σ0 < σ we have



M σ0 −σ
Z
1
xσ0 −σ−1 dx = − < .
M σ0 − σ σ0 − σ
Thus K  
X −s
|s0 − s|
an n <  2 + .
σ 0−σ

n=M +1

For s ∈ S, |t − t0 | ≤ C(σ − σ0 )
|s − s0 | ≤ |σ − σ0 + i(t − t0 )|
≤ |σ − σ0 | + |t − t0 |
≤ (C + 1)(σ − σ0 ) .

21

PK
Thus n=M +1 an n can be made arbitrarily small with the right choice of N 0
−s

for which M, K > N 0 . We can let K tend to infinity to obtain our statement.
Now it is a straightforward matter to deduce the theorem since for any s0 where
α(s0 ) converges, there exists a constant C such that a point s with Re(s) > Re(s0 )
belongs to the set S. We take the infimum over all s for which α(s) to obtain
σc .
P
Theorem 1.5. Let A(x) = n≤x an . If σc ≥ 0, then the equation

X Z ∞
−s
an n =s A(x)x−s−1 dx (1.16)
n=1 1

holds for Re(s) > σc , and

log |A(x)|
lim sup = σc . (1.17)
x→∞ log x

Proof. We can use the Stieltjes integral to write


∞ Z ∞  ∞ Z ∞
X
−s −s 1
an n = x dA(x) = A(x) s +s A(x)x−s−1 dx .
n=1 1 x 1 0

This is reminiscent of the formula derived for ψ(x), except this is in a general
context. Evaluating the first term requires estimating the magnitude of A(x).
Let φ denote the left hand side of (1.17), if θ > φ, then A(x) = O(xθ ). Thus if
φ = σc , the term A(x)x−s−1 goes to 0 as x → ∞ if Re(s) > σc . We need only to
prove (1.17). The series α(s) diverges when Re(s) < σc , thus the right hand side
of (1.16) cannot
P converge, thereby φ ≥ σc . So we show φ ≤ σc . For any  > 0,
−σc −
let R(N ) = n>N an n and write an = (R(n − 1) − R(n))n−σc − , we repeat
the process from before
X Z N
σc +
A(N ) = an = −R(N )N + (σc + ) R(u)uσc +−1 du
n≤N 0

Since α(σc + ) converges R(N ) is bounded for all N . Thus A(x) = O(xσc + ) for
every  > 0, and A(x) = O(xσc ).
Now we can prove

Theorem 1.6.
ψ(x) − x = Ω± (xΘ− ) .

Proof. We combine (1.14), (1.15) and write


Z ∞
1 1 ζ0 1
+ (s) + = (xΘ− − ψ(x) + x)x−s−1 dx for Re(s) > 1.
s−Θ+ s ζ s−1 0
(1.18)

22
By assumption ψ(x) − x < xΘ− for all x > X . Thus the integral can be split
into two
Z X Z ∞
−s−1
(x Θ−
− ψ(x) + x)x dx + (xΘ− − ψ(x) + x)x−s−1 dx .
0 X
R∞
The first integral is entire, and the second integral ≤ X xΘ−−s−1 dx, which
converges when s > Θ − . Thus the function defined by the right hand side of
(1.18) is entire on the half plane Re(s) > Θ−. By the uniqueness of meromorphic
extension, the left hand side of (1.18) must also converge on this half plane as
well. This is a contradiction since Θ is the supremum of all zeros of the zeta
function. Therefore the left hand side has a pole in the half plane Θ − . We
apply the same argument with the assumption ψ(x) − x > −xΘ− for all x > X
to Z ∞
1 1 ζ0 1
− (s) − = (xΘ− + ψ(x) − x)x−s−1 dx .
s+Θ+ s ζ s−1 1

We consider the following integral of li(x), and choose v = li(x), du = x−s−1


Z ∞ Z ∞
−s−1
 −s ∞
 1
s li(x)x dx = − li(x)x 2 + dx .
2 2 xs log x

The first term evaluates to 0, and with a subsitution u = log x


Z ∞
1
= (s−1)u u
du .
log 2 e

Write w = (s − 1)u,
Z ∞
1
= dw
(s−1) log 2 ew w
Z 1 ∞ 1
e−w − 1 e−w
Z Z
1
= dw + dw + dw
(s−1) log 2 w 1 w (s−1) log 2 w
1 −w ∞ −w
−1
Z Z
e e
= dw + dw − log(s − 1) − log log 2 .
(s−1) log 2 w 1 w

By integration by parts,
1 ∞ ∞
e−w − 1 e−w
Z Z Z
dw + dw = e−w log w dw = Γ0 (1) = C
0 w 1 w 0

for C a constant. Use this in our formulation of the Mellin transform of li(x) to
obtain
(s−1) log 2
e−w − 1
Z
=− dw − C − log log 2 − log(s − 1) .
0 w
(1.19)

Note that the integrand is discontinuous at 0, but we can always ignore a single
point of discontinuity. The integral thus converges everywhere and is an entire

23
function.
Now we consider J(x) and write it slightly differently,
X Λ(n)
J(x) = .
n≤x
log n

By Theorem 1.5 we have


Z ∞
s J(x)x−s−1 dx = log ζ(s) (σ > 1).
2

Thus we can write


Z ∞
1 1 r(s)
(xΘ− − J(x) + li(x))x−s−1 dx = − log(ζ(s)(s − 1)) +
2 s−Θ+ s s
for Re(s) > 1. The function A(x) = xΘ− − J(x) + li(x) is bounded in a finite
interval 1 ≤ x ≤ X, and is positive by our assumption that J(x) − li(x) <
xΘ+ . In comparison to x−s−1 the integral converges if −s − 1 + Θ −  < −1,
or for Re(s) > Θ −  by Landau’s theorem. By the uniqueness of meromorphic
continuation the function on the right hand side must converge in the half plane
as well, but this is a contradiction since by assumption there exists a zero of the
zeta function with real part greater than Θ − . Thus J(x) − li(x) must exceed
xΘ− . On the other hand if ψ(x) − x > −xΘ− for all x > X0 (), then we can
repeat the argument with the following integral
Z ∞
1 1 r(s)
(xΘ− + J(x) − li(x)x−s−1 ) dx = + log(ζ(s)(s − 1)) − .
2 s−Θ+ s s
Now the switch to π(x) is the reason why this argument will not work if we assume
that the Riemann Hypothesis is true. By definition J(x) = π(x) + O(x1/2 / log x),
and we have
J(x) − li(x) = Ω+ (xΘ− ) .

This means
J(x) − li(x)
lim sup Θ−
>0
x→∞
  x  
J(x) − li(x)
⇔ inf sup : x > n : n > 0 > 0.
xΘ−

We use J(x) = π(x) + O(x1/2 / log x),

π(x) − li(x) + O(x1/2 / log x)


   
⇔ inf sup : x > n : n > 0 = λ > 0.
xΘ−
Let  be such that Θ −  > 1/2, it is always possible to find this number because
the Riemann Hypothesis is assumed to be false. Then the difference between
π(x) − li(x)/xΘ− and O(x1/2 / log x)/xΘ− is greater than or equal to λ. Pick δ
such that δ < λ, then there exists some N such that
−O(x1/2 / log x)
> −δ for all x > N ,
xΘ−

24
and there exists some x > N such that

π(x) − li(x) + O(x1/2 / log x)



xΘ−
> λ − δ > 0.

We have that
π(x) − li(x) = Ω+ (xΘ− )
0
is true for 0 <  < Θ − 1/2, but also trivially true for all 0 > , since then xΘ−
is smaller than xΘ− . The case for

π(x) − li(x) = Ω− (xΘ− )

proceeds in exactly the same way.


This result is stronger than Littlewood’s theorem in the sense that the bound
Ω± (x1/2 log log log x) can be easily derived from this result. To derive Littlewood’s
θ(x)
theorem, we used that π(x) is equal to log x
within an order that is bounded by
x , and then similarly, that θ(x) is ψ(x) within an error bounded by x1/2 . This
1/2

error comes from the number of repeated factors in an integer, and we proceed
with assuming that this error always exists. Also the growth of this difference
is very little (log log log x), nevertheless it is there. For the proof assuming the
falsity of the Riemann Hypothesis, there is a similar switch involving a term of
order x1/2 which is easily overpowered by xΘ . The result on Dirichlet series (The-
orem 1.5) is merely a statement about the magnitude of the difference compared
to x−s−1 , and it is easier to work with J(x) since there is an easily identifiable
representation of J(x) in terms of the series of the zeta function.

2 Lehman’s approach
Instead of considering the difference ψ(x) − x, Lehman chose to work with the
explicit formula for π(x) − li(x) as originally conceived in Riemann’s paper. The
goal is to develop a computational formula for the difference, which depends, to
sufficient accuracy, on only a finite number of zeta zeros. It is easy to see that
for any positive function K(y) and any interval (a, b),
Z b
K(y){π(y) − li(y)} dy > 0 (2.1)
a

implies that π(y) − li(y) is positive within (a, b). Lehman showed that with a
suitable selection of K(y), the computation of this integral only required the first
12,000 zeta zeros to yield a dramatically improved upper bound of Skewes’ num-
ber. Since Lehman’s publication, other mathematicians have used his formula
with minor modifications to yield the best upper bound that is currently known.
The crucial theorem due to Lehman is the following:

Theorem 2.1. Let ρ = β + iγ denote a zero of the zeta function in the critical
strip, and let A be a positive number such that β = 21 for all zeros ρ with imaginary

25
parts 0 ≤ γ ≤ A. Let α, η, and ω be positive numbers such that ω − η > 1 and
the conditions
4A/ω ≤ α ≤ A2 , (2.2)
2A/α ≤ η < ω/2, (2.3)
hold. Let r
α −αy2 /2
K(y) = e . (2.4)

Then for 2πe < T ≤ A ,
Z ω+η X eiγw 2
K(u − ω)ue−u/2 {π(eu ) − li(eu )} du = −1 − e−γ /2α + R , (2.5)
ω−η ρ
0<|γ|≤T

where
2
3.05 2e−αη /2 √ 2
|R| ≤ + 4(ω + η)e−(ω−η)/6 + √ + 0.08 αe−αη /2 (2.6)
ω−η 2παη
 
−T 2 /2α α T 8 log T 4α
+e log + + 3 (2.7)
πT 2 2π T T
2 /2α+(ω+η)/2
+ A log Ae−A (4α−1/2 + 15η) . (2.8)
If we ignore conditions (2.2) and (2.3) for the moment and consider the error
term R, we can see that it is desirable to integrate over a large interval (ω−η, ω+η)
to reduce the error in (2.6), and include as many zeta zeros in the exponential sum
of (2.5) to reduce the second term in R. Finally, (2.8) disappears if the Riemann
Hypothesis is true. Given the current computational power and the number of
zeros that have been verified to lie on the critical line, the error term is negligible
and the computation reduces down to a simple sum. To give a concrete example,
the calculation done by Bays and Hudson gives, for
A = 107 , α = 1010 , η = .002,
T = γ1 , 000, 000 = 600269.677 . . . , ω = 77.95209,
the following bounds on |R|:
(2.6) < .00418985 + 5.94 × 10−50 + 3.64 × 10−8689 + 1.03 × 10−8682 < .00418986
(2.7) < 1.53 × 10−9 ,
(2.8) < 1.20 × 10−2001 .
iγw 2
Since the sum 0<|γ|≤T e ρ e−γ /2α under these constraints is 1.012762 . . . with
P
error smaller than .0078, this value establishes the existence of a crossover in the
interval (1.398201 × 10316 , 1.398244 × 10316 ).
Lehman’s method would have been impractical if the Riemann Hypothesis had
not been verified to such a large number of zeros, or if large-scale computations
were not possible. The breakthrough was a real triumph of modern computational
mathematics. Unfortunately, the constraints imposed on the parameters A, ω, α,
and η meant that there would be gaps in the intervals where this theorem cannot
be applied. An effort to relax or clarify the constraints (2.2), and (2.3) should
prove useful in further investigation of Skewes’ number.

26
The proof.
We let li(x) denote the principal value of the logarithmic integral10 . We recall
that
1 1
J(x) = π(x) + π(x1/2 ) + π(x1/3 ) . . . , (2.9)
2 3
The Riemann-von Mangoldt explicit formula states
Z ∞
X
ρ du
J(x) = li(x) − li(x ) + 2
− log 2 (x > 1) . (2.10)
ρ x (u − 1)u log u

Then we can use two estimates on π(x):

x 3/2
π(x) − < , and
log x (log x)2
2x
π(x) < (x > 1).
log x
Define
x
π(x) − log x
ϑ1 (x) = 3/2
, and
(log x)2
π(x)
ϑ2 (x) = 2x .
log x

Then both ϑ1 (x), ϑ2 (x) < 1, and we can conveniently express π(x) thus:

x ϑ1 (x)3/2x
π(x) = + , and
log x (log x)2
ϑ2 (x)2x
π(x) = . (2.11)
log x
log x
Since π(x1/k ) = 0 when x1/k < 2, this tells us that there are at most log 2
terms
in J(x), so
 
1 1/2 1 1/3 log x
J(x) < π(x) + π(x ) + π(x ) , and
2 3 log 2
 
1 1/2 1 1/3 log x
J(x) = π(x) + π(x ) + ϑ3 (x)π(x ) ,
2 3 log 2

where ϑ3 (x) is defined in the same way as ϑ1 (x) and ϑ2 (x) and does not exceed
1. Combining (2.9) and (2.10), we have
Z ∞
X
ρ du
π(x) − li(x) = − li(x ) + 2
− log 2
ρ x (u − 1)u log u
 
1 1/2 1 1/3 log x
− π(x ) − ϑ3 (x)π(x ) .
2 3 log 2
10
see Introduction, page 8.

27
Substituting the first expression in (2.11) for π(x1/2 ) and the second expression
for π(x1/3 ), we get
Z ∞
X
ρ du
π(x) − li(x) = − li(x ) + 2
− log 2
ρ x (u − 1)u log u
 1/2
ϑ1 (x1/2 )x1/2 ϑ2 (x1/3 )2x1/3
   
x log x
− +3 − ϑ3 (x) .
log x (log x)2 log x log 2

The occurrence of multiple estimating functions is not a serious problem. Indeed,


for Lehman’s purpose, ϑ1 (x1/2 ), ϑ2 (x1/3 ), and ϑ3 (x) can be absorbed into a single
estimate ϑ(x) with modulus < 1, since, by above,

ϑ1 (x1/2 )x1/2 2ϑ3 (x)ϑ2 (x1/3 )x1/3 log x 3x1/2 2x1/3


 
3 + < + , and thus
(log x)2 log x log 2 (log x)2 log 2
ϑ1 (x1/2 )x1/2 ϑ3 (x)ϑ2 (x1/3 )x1/3 log x 3x1/2 2x1/3
   
3 + = ϑ(x) + .
(log x)2 log x log 2 (log x)2 log 2

We have,
Z ∞ Z ∞
du du 1
< 2 = < log 2 (x > e), thus
(u2 − 1)u log u u3 x2
Zx ∞ x
du
− log 2 < 0 < x1/3 log 2 (x ≥ e).
x (u2 − 1)u log u

Finally,
X x1/2 3x1/2 2x1/3
π(x) − li(x) ≤ − li(xρ ) − + 2
+ + x1/3 log 2
ρ
log x (log x) log 2
x1/2 3x1/2
 
X
ρ 1/3 2
≤− li(x ) − + +x + log 2 ,
ρ
log x (log x)2 log 2
 
2
and since log 2
+ log 2 = 3.578 . . . < 4, we have

x1/2 3x1/2
X  
π(x) − li(x) = − ρ
li(x ) − + ϑ0 (x) + 4x 1/3
,
ρ
log x (log x)2

where ϑ0 (x) is the final estimate derived from the inequality above. The problem
with
P using this formula directly is that it isn’t clear how to truncate the series
ρ
ρ li(x ) such that only finitely many zeros are needed in the computation. To
overcome this problem Lehman integrates this term against the Gaussian kernel.
For the ease of inspecting numerical data he has also scaled the difference so that
one would
P only iγω need to look for values greater than 1 in the resulting exponential
2
sum 0<|ω|≤T e ρ e−γ /2α .
We will begin with the simpler terms in the difference. Recall that K(y) =

28
2 /2
e−αy


, and we have
Z ω+η
K(u − ω)ue−u/2 {π(eu ) − li(eu )}du
ω−η
Z ω+η " #
u/2
 u/2
e X e
= K(u − ω)ue−u/2 − − li(euρ ) + ϑ0 (x) 3 2 + 4eu/3 du
ω−η u ρ
u
Z ω+η
= −K(u − ω)du (S1)
ω−η
Z ω+η !
X
−u/2 uρ
+ K(u − ω)ue − li(e ) du (S2)
ω−η ρ
ω+η
eu/2
Z   
−u/2 0
+ K(u − ω)ue ϑ (x) 3 2 + 4eu/3 du . (S3)
ω−η u
The estimate for the integral (S1) depends on the interval of integration, (S2)
provides the oscillatory exponential sum, and (S3) contributes to the error term.
Consider (S1), we can write
Z ω+η
−K(u − ω) du
ω−η
Z ∞ Z ω−η Z ∞
=− K(u − ω) du + K(u − ω) du + K(u − w) du .
−∞ −∞ ω−η

Since K(u − ω)R ∞is the probability density function for the normal distribution
centered at ω, −∞ K(u − ω)du = 1. With a change of variable we can write
Z ∞ Z ∞
K(u − ω) du = K(y) dy
ω+η η
Z ω−η Z −η
K(u − ω) du = K(y) dy .
−∞ −∞

Since K(y) is symmetric about the y-axis, these two sums are the same, and we
only need to estimate one of them. Note
" 2
# 2
d e−αy /2 e−αy /2 2
=− 2
− αe−αy /2 .
dy y y

Thus
∞ ∞
Z Z r
α −αy2 /2
K(y) dy = e dy
η η 2π
r Z ∞ " 2
#
1 d e−αy /2
<− dy
2πα η dy y
r " 2
#∞
1 e−αy /2
=−
2πα y
η
−αη 2 /2
e
=√ .
2παη

29
Since α > 0, and we have
2
e−αη /2
(S1) < −1 + 2 √ .
2παη

Also, the integral (S3) becomes


Z ω+η  
0 3 −u/6
K(u − ω)ϑ (x) + 4ue du ,
ω−η u

and
3
|S3| ≤ + 4(ω + η)e−(ω−η)/6 .
ω−η
The most difficult term to estimate is (S2), and we will start by interchanging
integration and summation:
Z ω+η !
X
K(u − ω)ue−u/2 − li(euρ ) du
ω−η ρ
XZ ω+η
=− K(u − ω)ue−u/2 li(euρ ) du.
ρ ω−η

li(euρ ) is a convergent series in the finite interval ω −η <


P
This is justified since ρ
u < ω + η. Recall
uβ+iuγ
ez
Z

li(e ) := dz .
−∞+iuγ z
With z = ρu − t we have

eρu · e−t
Z

li(e ) = dt .
0 ρu − t

Choosing w = (ρu − t)−1 , dv = e−t , integration by parts gives,


Z ∞ ρu −t Z ∞
−e−t
 
e ·e ρu 1
dt = e −
0 ρu − t ρu (ρu − t)2
 Z0 ∞ −t 
ρu 1 e
<e + dt and so,
ρu 0 (γu)2
eρu ϑ00 (ρu)eρu
li(euρ ) = + ,
ρu (γu)2

where ϑ00 (ρu) is a function of ρ and u defined by the inequality. We can plug
this into the formula for (S2). Take A to be a number for which the Riemann

30
Hypothesis holds for zeros where |γ| < A, then
!
Z ω+η X
−u/2 uρ
K(u − ω)ue − li(e ) du
ω−η ρ
ω+η
eρu eρu
XZ  
−u/2
=− K(u − ω)ue + ϑ00 (ρu) du
ρ ω−η ρu (γu)2
X 1 Z ω+η
=− K(u − ω)eiγu du (E1)
ρ ω−η
0<|γ|≤A
X Z ω+η eiγu
− K(u − ω)ϑ00 (ρu) 2 du (E2)
γ u
0<|γ|≤A ω−η
X Z ω+η
− K(u − ω) li(euρ ) du . (E3)
|γ|>A ω−η

To improve the speed of the calculation we would like to use as few of the zeta
zeros as we can while tolerating a certain amount of error. This means that
we will have to find some way to estimate the terms that are dropped from the
computation. In (E1), we are dealing with the integral
Z ω+η
K(u − ω)eiγu du .
ω−η

With a change of variable taking y = u − ω, the integral becomes


Z η Z ηr
iγω iγy iγω α −αy2 /2 iγy
e K(y)e dy = e e e dy .
−η −η 2π

Since
γ2 α(y − iγ/α)2 αy 2
− − =− + iγy,
2α 2 2
we can rewrite the integral as
r Z η
−γ 2 /2α+iγω α 2
e e−α(y−iγ/α) /2 dy .
2π −η

Take (y − iγ/α)2 = t2 /α, change of variable gives us



α(η−iγ/α)
r Z
−γ 2 /2α+iγω α 2 /2 dt
=e √
e−t √ .
2π α(−η−iγ/α) α

The standard normal distribution is given with the probability density function
1 2
√ e−t /2 .

31
This integrates to 1 over (−∞, ∞). Now we write
Z ω+η
K(u − ω)eiγu du
ω−η
r Z η
−γ 2 /2α+iγω α 2
=e e−α(y−iγ/α) /2 dy
2π −η
r
2 α
= e−γ /2α+iγω

Z ∞ Z ∞ Z −η 
−α(y−iγ/α)2 /2 −α(y−iγ/α)2 /2 −α(y−iγ/α)2 /2
× e dy − e dy − e dy .
−∞ η −∞

By the change of variable above we have


Z ∞ Z −η 
−γ 2 /2α+iγω iγω iγy iγy
=e −e K(y)e dy + K(y)e dy .
η −∞

Now (E1) becomes


X eiγω X eiγω Z ∞ Z −η 
−γ 2 /2α iγy iγy
− e + K(y)e dy + K(y)e dy ,
ρ ρ η −∞
0≤|γ|≤A 0≤|γ|≤A

where
Z ∞ Z −η X 1 Z ∞
X eiγω

iγy iγy
iγy

K(y)e dy + K(y)e dy ≤ 2 K(y)e dy .
ρ η −∞ ρ η

0<|γ|<A 0<|γ|<A

The constant 2 comes from taking the absolute value of the integral
Z ∞ Z ∞ Z ∞ Z −η
iγy
iγy
K(y)e dy ≤ K(y) |e | dy = K(y) dy = K(y) |eiγy | dy .

η η η −∞

Integrating by parts with w = k(y) and dv = eiγy gives


Z ∞ ∞ Z ∞
eiγy eiγy

iγy
K(y)e dy = K(y) − K 0 (y) dy .
η iγ η η iγ

Note that in the first term, |eiγy | is bounded by 1, thus since K(y) decreases to
iγη
0 at infinity, the first term becomes K(η) eiγ . Note that this is simply

eiγη
Z
K 0 (y) dy ,
η iγ

so we can go back to the integral above and get


Z ∞ Z ∞
(eiγη − eiγy )
iγy
K(y)e dy = K 0 (y) dy .
η η iγ

Furthermore, iγη
(e − eiγy ) 2

≤ ,
iγ γ

32
thus
∞ 2 ∞ 0
Z Z r
iγy 2 2 α −αη2 /2
K(y)e dy ≤
|K (y)| dy = K(η) = e .

η γ η γ γ 2π
Based on the sequence of estimates:
X eiγω X 1 Z ∞

−γ 2 /2α iγy

(E1) = − e + 2ϑ K(y)e dy
ρ ρ η

0≤|γ|≤A 0≤|γ|<A
r
X eiγω X 1 α −αη2 /2
−γ 2 /2α
=− e + 4ϑ 2
e . (2.12)
ρ γ 2π
0≤|γ|≤A 0≤|γ|<A

Previously, the estimates used in the calculations merely depended on the


knowledge of the Gaussian kernel K(y) at infinity. Now we will split the sum
involving the zeta zeros into
P two parts based on the following lemmas. We can
1
make the terms involving γ γ 2 more precise. Using Lemma 0.1 and the inequal-
ity (2π)−1/2 < 0.4 ,
r
X eiγω X 1 α −αη2 /2
−γ 2 /2α
(E1) = − e + 8ϑ 2
e
ρ 0<γ<A
γ 2π
0<|γ|<A
X eiγω 2 √ 2
≤− e−γ /2α + (8 · 0.025 · 0.4) αe−αη /2
ρ
0<|γ|<A
X eiγω 2 √ 2
=− e−γ /2α + .08 αe−αη /2 .
ρ
0<|γ|<A

Now we split the exponential sum so that we can enter any number of zeros we
desire:

X eiγω
2
X e−γ 2 /2α
−γ /2α
e ≤2

ρ γ


T <|γ|≤A T <γ<∞

as we write the roots in conjugate pairs, and


Z ∞ −t2 /2α
e
≤2 dN (t)
T t
Z ∞ −t2 /2α Z ∞ −t2 /2α
e t e
≤ log dt + 2 Q0 (t) dt .
T πt 2π T t
−t2 /2α
Choosing dv = Q0 (t) and u = e t ,
Z ∞ −t2 /2α " 2 ∞ ! Z ∞ 2 2
! #
e t e−t /2α e−t /2α e−t /2α
≤ log dt + 2 Q(t) + Q(t) + dt .
T πt 2π t T T t2 α

We use the inequality Q(t) < 2 log t,


Z ∞ −t2 /2α / 2 Z ∞ −t2 /2α
4 log T e−T 2α log T ∞ e−t /2α
Z
e t e
< log dt + +4 t· dt + dt
T πt 2π T T T α T t2
Z ∞ −t2 /2α / Z ∞ −t2 /2α
e t 8 log T e−T 2α e
= log dt + + dt .
T πt 2π T T t2

33
Note that the integrals in the expression above involve multiplying a monotone
decreasing function with e−t/2α . We have

X eiγω
−γ 2 /2α
e

ρ


T <|γ|≤A
 ∞ −T 2 /2α
 ∞ 
2 log T /2π 8 log T e 2 4
≤ −αe−t /2α −t /2α

2
+ + −αe
πT T T3
 T  T
2 α log T /2π 8 log T 4α
= e−T /2α 2
+ + 3 .
πT T T
This completes our description of (E1). To summarize, we have
X eiγω 2
(E1) = e−γ /2α + RE1 ,
ρ
0<|γ|<T

where

 
−T 2 /2α α log T /2π 8 log T 4α 2
|RE1 | ≤ e 2
+ + 3 + 0.08 αe−αη /2 .
πT T T
To estimate (E2), note that
ω+η iγu

X Z e
00
|(E2)| ≤ K(u − ω)ϑ (ρu) du
ω−η
γ 2u
0<|γ|≤A
X 1 1
≤ ,
γ2 ω − η
0<|γ|≤A

and when we apply the inequality 0.1,


0.05
≤ .
ω−η
The most laborious part of the proof comes up in estimating (E3), the sum
involving zeros of the zeta function that does not lie on the critical line (if they
exist). Lehman does not assume any properties of these zeros other than the
fact that if they exist, the magnitude of their imaginary components needs to
exceed 14, which is approximatly the value of the imaginary component of the
first zeta zero above the real axis. The rest of the proof is a steadfast application
of Cauchy’s integral formula to the expression in the integral. The order to which
the derivatives should be taken are carefully chosen to ensure that the estimate
at the end depends only on η, A, and α. This gives rise to the conditions (2.2)
and (2.3), which are hitherto unused.
We would like to estimate the sum
X Z ω+η
(E3) = − K(u − ω)ue−u/2 li(eρu ) du .
|γ|≥A ω−η

Consider the function


2 /2
fρ (s) = ρse−ρs li(eρs )e−α(s−ω)

34
in the sector − π4 ≤ arg(s) ≤ π4 . Note also that 12 5
π < | arg(ρ)| < π2 for every zeta
zero simply because all of them lie in an area bound by a line that goes through
(0, 0) and ( 12 , 14), which lies at an angle of 12 5
π to the x-axis. Combining these
inequalities, we have
π 3
< | arg(ρs)| < π .
6 4
We recall that Z ∞
ρu e−t
li(e ) = eρu dt .
0 ρu − t
Applying this to fρ (s), we get

e−t
Z 
−ρs −α(s−ω)2 /2 ρs

|fρ (s)| = ρse e
e dt
0 ρs − t
2 /2
|ρs||e−α(s−ω) | ∞
≤ −e−t 0
| Im(ρs)|
−α(s−ω)2 /2
≤ 2|e |. (2.13)

We expand (E3) into


ω+η
r
XZ α −α(u−ω)2 /2 −u/2
(E3) = − e ue li(eρu ) du

|γ|≥A ω−η

α X 1 ω+η (ρ− 1 )u
r Z
=− e 2 fρ (u) du .
2π ρ ω−η
|γ|>A

If we evaluate the integral using integration by parts and choose v = fρ (u), dw =


1
e(ρ− 2 )u , after N times we get

(E3) =
r  N −1
α X 1 X (−1)n e(ρ−1/2)ω  (ρ−1/2)η (n) −(ρ−1/2)η (n)

− e f ρ (ω + η) − e f ρ (ω − η)
2π γ≥A ρ n=0 (ρ − 1/2)n+1
Z ω+η
(−1)N

(ρ−1/2)u (N )
+ e fρ (u) du . (2.14)
(ρ − 1/2)N ω−η

In order to apply the Cauchy integral formula we need only find a path in the domain where fρ(s) is analytic along which to evaluate the integral. Let C be a circle of radius r < ω/4 centered at u. If we invoke condition (2.3), specifically η < ω/2, we have that for any s on C,
\[
\mathrm{Re}(s) \ge \omega - \eta - \frac{\omega}{4} > \omega - \frac{\omega}{2} - \frac{\omega}{4} = \frac{\omega}{4}
\]
and |Im(s)| ≤ ω/4. Thus C lies below the line x = y and in the sector |arg(s)| ≤ π/4. By the previous discussion, fρ(s) is analytic in a region enclosing C. Cauchy's integral formula states that
\[
f_\rho^{(n)}(u) = \frac{n!}{2\pi i}\oint_C \frac{f_\rho(z)}{(z-u)^{n+1}}\, dz .
\]
We can then bound the absolute value of fρ^(n) on C:
\[
|f_\rho^{(n)}(u)| \le \frac{n!}{2\pi}\oint_C \frac{\max_{s\in C}|f_\rho(s)|}{|s-u|^{n+1}}\, |ds| ;
\]
writing |s − u| = r and using (2.13), we have
\[
\le \frac{2\, n!}{r^n}\max_{|s-u|=r} \bigl|e^{-\alpha(s-\omega)^2/2}\bigr| .
\]
If s = σ + it, then on the circle (σ − u)² + t² = r²,
\[
\bigl|e^{-\alpha(s-\omega)^2/2}\bigr| = e^{\alpha(t^2-(\sigma-\omega)^2)/2} = e^{\alpha\left(r^2-(\sigma-u)^2-(\sigma-\omega)^2\right)/2} \le e^{\alpha r^2/2} .
\]
If N ≤ αω²/16, we can eliminate r by choosing r = √(N/α), which satisfies the constraint r ≤ ω/4. Plugging this back into our estimate for |fρ^(N)(u)| gives
\[
|f_\rho^{(N)}(u)| \le e^{\alpha (N/\alpha)/2}\cdot \frac{2\, N!}{(N/\alpha)^{N/2}}
= 2\, N!\, N^{-N/2}\alpha^{N/2} e^{N/2} .
\]
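The last two displays are nothing more than the standard Cauchy estimate |f^(n)(u)| ≤ n! max_C |f| / r^n. As a sanity check of the technique (not of fρ itself, which involves li of astronomically large complex arguments), the following sketch applies Cauchy's integral formula to a toy analytic function; the function g and all numerical values are our own choices.

```python
import mpmath as mp

# Toy analytic function standing in for f_rho.
g = lambda z: mp.exp(-z**2 / 2)

u, r, n = mp.mpf('0.3'), mp.mpf('0.5'), 3

# Cauchy's integral formula g^(n)(u) = n!/(2 pi i) * oint_C g(z)/(z-u)^(n+1) dz,
# with C parametrized as z = u + r e^{it}, dz = i r e^{it} dt.
integrand = lambda t: g(u + r*mp.exp(1j*t)) / (r*mp.exp(1j*t))**(n + 1) * 1j*r*mp.exp(1j*t)
cauchy_deriv = mp.factorial(n) / (2j * mp.pi) * mp.quad(integrand, [0, 2 * mp.pi])

direct = mp.diff(g, u, n)                                   # direct numerical derivative
max_on_C = max(abs(g(u + r*mp.exp(2j*mp.pi*k/200))) for k in range(200))
bound = mp.factorial(n) * max_on_C / r**n                   # the Cauchy estimate

print(cauchy_deriv.real, direct, bound)   # the first two agree, and both are below the bound
```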
To estimate fρ^(n)(ω ± η), we choose r = η/2. Since η < ω/2 by condition (2.3), we have r < ω/4; this allows us to apply Cauchy's integral formula again and get
\[
|f_\rho^{(n)}(\omega\pm\eta)| \le 2\, n!\,(\eta/2)^{-n} \max_{|s-(\omega\pm\eta)|=r}\bigl|e^{-\alpha(s-\omega)^2/2}\bigr|
= 2\, n!\,(\eta/2)^{-n} \max\, e^{\alpha\left(r^2-(\sigma-(\omega\pm\eta))^2-(\sigma-\omega)^2\right)/2}
\le 2\, n!\,(\eta/2)^{-n} e^{-\alpha\eta^2/8}
\]
after expanding the terms in the exponent of e. We will proceed with estimating the first sum in the expression (2.14) for (E3) using some simple inequalities:
\[
|e^{(\rho-1/2)(\omega+\eta)}| = e^{(\beta-1/2)(\omega+\eta)} \le e^{(\omega+\eta)/2},
\]
since 0 < β < 1. Similarly,
\[
|e^{(\rho-1/2)(\omega-\eta)}| \le e^{(\omega+\eta)/2},
\]
and we have
\[
\bigl|e^{(\rho-1/2)\omega}\bigl\{ e^{(\rho-1/2)\eta} f_\rho^{(n)}(\omega+\eta) - e^{-(\rho-1/2)\eta} f_\rho^{(n)}(\omega-\eta)\bigr\}\bigr|
\le \bigl|e^{(\rho-1/2)(\omega+\eta)} f_\rho^{(n)}(\omega+\eta)\bigr| + \bigl|e^{(\rho-1/2)(\omega-\eta)} f_\rho^{(n)}(\omega-\eta)\bigr|
\]
\[
\le e^{(\omega+\eta)/2}\cdot 2\, n!\,(\eta/2)^{-n} e^{-\alpha\eta^2/8}\cdot 2
= 4\, n!\; e^{(\omega+\eta)/2}\left(\frac{\eta}{2}\right)^{-n} e^{-\alpha\eta^2/8}
\]
by using our estimates on |fρ^(n)(ω ± η)| above.
The first sum of (E3) is then bounded in absolute value by
\[
\sqrt{\frac{\alpha}{2\pi}} \sum_{|\gamma|\ge A} \frac{1}{|\rho|} \sum_{n=0}^{N-1}
\frac{4\, n!\; e^{(\omega+\eta)/2}\, e^{-\alpha\eta^2/8}}{(\eta/2)^n\, \gamma^{n+1}}
\ \le\ \sqrt{\frac{\alpha}{2\pi}}\; e^{(\omega+\eta)/2} \sum_{|\gamma|>A} \frac{4\, e^{-\alpha\eta^2/8}}{\gamma^2}
\sum_{n=0}^{N-1} \frac{n!}{(\eta/2)^n\,\gamma^n} .
\]
For the second integral in (2.14), we have
\[
\left|\frac{(-1)^N}{(\rho-1/2)^N}\right| \le \frac{1}{\gamma^N}
\qquad\text{and}\qquad
|e^{(\rho-1/2)u}| \le e^{(\omega+\eta)/2}
\]
on the interval of integration. With |fρ^(N)(u)| ≤ 2N! N^{−N/2} α^{N/2} e^{N/2},
\[
\left|\sqrt{\frac{\alpha}{2\pi}} \sum_{|\gamma|\ge A} \frac{1}{\rho}\,\frac{(-1)^N}{(\rho-1/2)^N}
\int_{\omega-\eta}^{\omega+\eta} e^{(\rho-1/2)u}\, f_\rho^{(N)}(u)\, du\right|
\le \sqrt{\frac{\alpha}{2\pi}} \sum_{|\gamma|\ge A} \frac{e^{(\omega+\eta)/2}}{|\rho\,(\rho-1/2)^N|}\;
2\,N!\left(\frac{\alpha e}{N}\right)^{N/2} \bigl((\omega+\eta)-(\omega-\eta)\bigr)
\]
\[
\le \sqrt{\frac{\alpha}{2\pi}} \sum_{|\gamma|\ge A} \frac{e^{(\omega+\eta)/2}\; 4\eta\, N!}{\gamma^{N+1}}
\left(\frac{\alpha e}{N}\right)^{N/2} .
\]
Now we have the final estimate for (E3):
\[
|(E3)| \le 2\sqrt{\frac{\alpha}{2\pi}}\; e^{(\omega+\eta)/2} \sum_{\gamma>A}
\left[ \frac{4\, e^{-\alpha\eta^2/8}}{\gamma^2} \sum_{n=0}^{N-1} \frac{n!}{(\eta/2)^n\,\gamma^n}
+ \frac{4\eta\, N!}{\gamma^{N+1}}\left(\frac{\alpha e}{N}\right)^{N/2} \right].
\]
Applying estimates such as Lemmas 0.4 and 0.5 and inequality (0.1) to the sum over γ completes the proof.
The implementation of this theorem consists of first choosing the parameters A, α, η, ω so that the conditions (2.2) and (2.3) are satisfied, and then deciding how many actual zeros should be entered into the calculation. Then the sum
\[
-\sum_{0<|\gamma|\le T} \frac{e^{i\gamma\omega}}{\rho}\, e^{-\gamma^2/2\alpha}
\]
is computed over a wide range of ω (in Lehman's case, from 14 to 12000) to find heuristic values of ω, and hence of x = e^ω, where the sum could be expected to exceed 1. After that, calculations with the estimates of the error term R are carried out to prove that π(x) > li(x) for some x near this value. Using 1000 Riemann zeta zeros, Lehman found 3 points where the value of the sum exceeds 0.96:

727.952, 853.853, 2682.977 .
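As an illustration of this scan, here is a minimal sketch (ours, not Lehman's program). It evaluates the truncated sum at a few values of ω near the third candidate point, using mpmath's zetazero to supply the zero ordinates; zetazero is accurate but slow, so only a modest number of zeros is used here, whereas serious computations rely on precomputed tables of zeros.

```python
import mpmath as mp

# Ordinates of the first n_zeros nontrivial zeros (slow: each call does a root search).
n_zeros = 100
gammas = [mp.im(mp.zetazero(k)) for k in range(1, n_zeros + 1)]

def lehman_sum(omega, alpha=mp.mpf(10)**7):
    """-sum over the loaded zeros and their conjugates of
    e^{i*gamma*omega}/rho * e^{-gamma^2/(2*alpha)}; the computed zeros all have beta = 1/2."""
    total = mp.mpf(0)
    for g in gammas:
        rho = mp.mpc(mp.mpf('0.5'), g)
        total += 2 * mp.re(mp.exp(1j * g * omega) / rho) * mp.exp(-g**2 / (2 * alpha))
    return -total

# A few sample points near Lehman's third candidate; a real scan covers a wide range of omega.
for omega in ['2682.90', '2682.95', '2682.977', '2683.00']:
    w = mp.mpf(omega)
    print(omega, lehman_sum(w))
```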

Then he used the following parameters to prove that π(x) > li(x) for some x near e^{2682.977}:
\[
A = 17{,}000,\qquad \alpha = 10^{7},\qquad \eta = 0.034,\qquad T = \gamma_{12{,}000},\qquad \omega = 2682.9768,
\]
where the sum is ≥ 1.00133. The error term R is smaller than 0.001268 in absolute value, thus
\[
-1 - \sum_{0<|\gamma|\le T} \frac{e^{i\, 2682.9768\, \gamma}}{\rho}\, e^{-\gamma^2/(2\times 10^7)} + R
\ \ge\ -1 + 1.00133 - 0.001268 \ =\ 6.2\times 10^{-5} \ >\ 0 .
\]

The number 6.2 × 10^−5 is small, but it means that
\[
\int_{2682.9428}^{2683.0108} K(u-2682.9768)\, u\, e^{-u/2}\,\{\pi(e^u) - \mathrm{li}(e^u)\}\, du > 6.2\times 10^{-5} .
\]
Also
\[
\int_{2682.9428}^{2683.0108} K(u-2682.9768)\, u\, e^{-u/2}\,\{u^{-1}e^{u/2}\}\, du < 1
\quad\text{and}\quad
6.2\times 10^{-5}\int_{2682.9428}^{2683.0108} K(u-2682.9768)\, du < 6.2\times 10^{-5} ,
\]
since the probability kernel K integrates to 1 over the entire real line. Because K(u − ω) and u e^{−u/2} are both positive functions, this implies that
\[
\pi(e^u) - \mathrm{li}(e^u) > 6.2\times 10^{-5}\, u^{-1} e^{u/2}
\]
for some u ∈ (ω − η, ω + η). Bounding e^{u/2}/u from below on this interval, essentially by evaluating at u = ω − η = 2682.9428, we have that π(e^u) − li(e^u) > 10^500.
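The size of that last bound is easy to confirm; the following check (ours) does the bookkeeping in logarithms so that no astronomically large number is ever formed.

```python
import math

eps, u = 6.2e-5, 2682.9428      # the integral lower bound and u = omega - eta
log10_bound = u / (2 * math.log(10)) + math.log10(eps) - math.log10(u)
print(log10_bound)              # about 575, so pi(e^u) - li(e^u) exceeds 10^500 with room to spare
```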
The number obtained by Lehman is ≈ 10^1165; this is an enormous improvement over Skewes' $10^{10^{10^{10^{3}}}}$. Why does Lehman's formula yield a much smaller number? Skewes' approach followed Littlewood's proof, which does not assume anything about the distribution of the zeros beyond their number up to a given height. Dirichlet's lemma guarantees that, for a large list of numbers, one can find some (very large) ω at which they all lie close to one another modulo 2π, but it does not account for the possibility that the zeros of low height already lie much closer together modulo 2π at far smaller values of ω. The number obtained by Lehman exposes the distance between Littlewood's model of the zeta zeros and the actual distribution of the zeros less than T.

3 Recent developments
After Lehman's breakthrough, the results on finding a smaller upper bound for the Littlewood violation mainly consist of reapplying Lehman's theorem with little or no modification and increasing the number of zeros entered into the formula for higher accuracy. The results are summarized in the following table.
The most recent result, due to Chao and Plymen, is notable not only for computing the lowest bound currently known, but also because it reduced the error in Lehman's formula, where little had previously been done [2, p.1288]. For computational methods, I refer the reader to [3] for a description of the formulas used to evaluate the zeros and of the errors incurred by using approximations. Bays and Hudson in particular

                    Number of zeros entered        Bound
    Lehman                       12,000        ≈ 1.65 × 10^1165
    Te Riele                     15,000        ≈ 6.69 × 10^370
    Bays & Hudson             1,000,000        ≈ 1.398 × 10^316
    Chao & Plymen             2,000,000        ≈ 1.397 × 10^316

Table 2: Upper bounds computed from 1966 to 2008

provide a graph of computed values of the sum and a heuristic for where the first sign change can be expected to occur [2, p.1291–1293].
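It is easy to see how a crossing location ω translates into the bounds of Table 2: the certified upper bound is essentially e^{ω+η} written in scientific notation. The sketch below (ours) performs this conversion for the top of Lehman's interval, ω + η = 2683.0108, and for the first candidate point 727.952 listed earlier; the first output matches the Lehman row and the second is consistent with the 10^316 rows.

```python
import math

def as_scientific(exponent_of_e):
    """Write e^w as m x 10^k without forming the huge number itself."""
    log10_x = exponent_of_e / math.log(10)
    k = math.floor(log10_x)
    return 10 ** (log10_x - k), int(k)

for w in (2683.0108, 727.952):          # omega + eta for Lehman; first candidate point
    m, k = as_scientific(w)
    print(f"e^{w} ~ {m:.3f} x 10^{k}")  # compare with Table 2
```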
There are many obstacles to refining Lehman's result: our imperfect knowledge of the influence of the four parameters, the lack of more precise knowledge of the Riemann zeta zeros, and our sole dependence on Riemann's formulas for the prime counting function. However, with recent results on better formulas for approximating π(x), we can reduce the main source of error in Lehman's formula. The improvement depends on a more careful examination of the several ϑ's on page 27.
We absorb ϑ₃(x) and ϑ₂(x^{1/3}) into one ϑ(x) and, following similar arguments, obtain
\[
\pi(x) - \mathrm{li}(x) = -\sum_{\rho} \mathrm{li}(x^{\rho}) - \frac{x^{1/2}}{\log x} + \vartheta_1(x^{1/2})\,\frac{3x^{1/2}}{(\log x)^2} + 4\vartheta(x^{1/2})\, x^{1/3} .
\]
When we multiply this by the Gaussian kernel and u e^{−u/2} and integrate over (ω − η, ω + η), we see that the error terms involving ϑ₁(e^{u/2}) and ϑ become
\[
\int_{\omega-\eta}^{\omega+\eta} K(u-\omega)\, u\, e^{-u/2}\left\{ \vartheta_1(e^{u/2})\,\frac{3e^{u/2}}{(\log e^u)^2} + 4\vartheta(e^u)\, e^{u/3} \right\} du
= \frac{3\vartheta_1(e^{u/2})}{\omega-\eta} + 4\vartheta(e^u)(\omega+\eta)\, e^{-(\omega-\eta)/6} .
\]
Thus if we reduce ϑ₁(e^{u/2}), we can reduce the major error term. We use the estimates on π(x) due to L. Panaitopol [12]:
\[
\pi(x) > \frac{x}{\log x - 1 + (\log x)^{-0.5}} \quad\text{for } x \ge 59,
\qquad\text{and}\qquad
\pi(x) < \frac{x}{\log x - 1 - (\log x)^{-0.5}} \quad\text{for } x > 6 .
\]
Together these give
\[
\frac{x}{\log x - 1 + (\log x)^{-0.5}} < \pi(x) < \frac{x}{\log x - 1 - (\log x)^{-0.5}} .
\]
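As a quick sanity check (ours, not part of the argument), the sketch below verifies this sandwich at a modest value of x by counting primes directly with a sieve.

```python
import math

def prime_count(n):
    """pi(n) via a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p*p::p] = bytearray(len(range(p*p, n + 1, p)))
    return sum(sieve)

x = 10**6
L = math.log(x)
lower = x / (L - 1 + L**-0.5)
upper = x / (L - 1 - L**-0.5)
print(lower, prime_count(x), upper)   # pi(x) lands strictly between the two bounds
```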

By the inequality of Schoenfeld used to derive ϑ₁(e^{u/2}), we have
\[
\pi(x) = \frac{x}{\log x} + \frac{3}{2}\cdot\frac{x\,\vartheta_1(x)}{(\log x)^2} < \frac{x}{\log x - 1 - (\log x)^{-0.5}}
\]
and
\[
\pi(x) > \frac{x}{\log x} .
\]
Denote y = y(x) := (log x)^{1/2}. The first of the two inequalities above leads to an upper bound for ϑ₁ (the second gives ϑ₁ > 0):
\[
0 < \vartheta_1(x) < \frac{2}{3}\cdot\frac{y^3 + y^2}{y^3 - y - 1} \qquad \text{for all } x \ge 59 .
\]
The function
\[
F(y) := \frac{y^3 + y^2}{y^3 - y - 1}
\]
is a monotone decreasing function for x ≥ e. Thus we can bound ϑ₁ by a fairly small number if x is chosen large enough. For example, choosing x = 10^{95} gives
\[
0 \le \vartheta_1 < 0.71523279 .
\]
Consequently the factor 3 in the error term involving ϑ₁ can be replaced by 3 × 0.71523279 ≈ 2.14569. This is a substantial improvement over 3, even though it is unclear how much this reduction influences the final outcome.
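The sketch below (ours) evaluates this bound for any x one cares to use, and also confirms that the competing error term 4(ω + η)e^{−(ω−η)/6} is utterly negligible at heights like Lehman's ω ≈ 2683, which is why the ϑ₁ term is the one worth shrinking; the precise constant quoted above of course depends on the exact x chosen in the text.

```python
import mpmath as mp

def theta1_upper_bound(x):
    """(2/3) * (y^3 + y^2) / (y^3 - y - 1) with y = sqrt(log x)."""
    y = mp.sqrt(mp.log(x))
    return mp.mpf(2) / 3 * (y**3 + y**2) / (y**3 - y - 1)

print(theta1_upper_bound(mp.mpf(10)**95))     # roughly 0.715 for this choice of x

# The other error term at Lehman's parameters: astronomically small.
omega, eta = mp.mpf('2682.9768'), mp.mpf('0.034')
print(4 * (omega + eta) * mp.exp(-(omega - eta) / 6))
```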

Conclusion
Where does one go from here? Littlewood's theorem has illustrated well enough the oscillatory nature of the difference π(x) − li(x). This proof was later made effective by Skewes, who computed a numerical value below which the sign switch must occur. Lehman's theorem shows us that an upper bound for the sign switch can be computed efficiently. A possible direction for improvement would be to continue to reduce the error term R, or to find a different representation of the prime counting function. There is of course always the option of including more zeros in Lehman's formula, but this will most likely not yield a great deal of improvement, because the dominant term in the error comes from the length of the interval we wish to consider.

References
[1] R. J. Backlund, Über die Nullstellen der Riemannschen ζ-Funktion, Acta Math. 41 (1918) 345–375.

[2] C. Bays and R.H. Hudson, A new bound for the smallest x with π(x) > li(x),
Mathematics of Computation 69 (2000) 1285–1296.

[3] P. Borwein, S. Choi, B. Rooney and A. Weirathmueller, The Riemann Hypothesis: A Resource for the Aficionado and Virtuoso Alike, Springer, 2008.

[4] K. F. Chao and R. J. Plymen, A new bound for the smallest x with π(x) > li(x), http://arxiv.org/pdf/math/0509312

[5] J.B. Conrey, The Riemann Hypothesis, AMS Notices 50 (2003) 341–353,
www.ams.org/notices/200303/fea-conrey-web.pdf.

[6] H. M. Edwards, Riemann’s Zeta Function, Dover, 1974.

[7] A. Granville and G. Martin, Prime Number Races, American Math. Monthly,
January (2006).

[8] T. Kotnik, The prime-counting function and its analytic approximations, Adv. Comput. Math. 29 (2008) 55–70.

[9] R. S. Lehman, On the difference π(x) − li(x), Acta Arith. 11 (1966) 397–410.

[10] J.E. Littlewood, Sur la distribution des nombres premiers, C.R. Acad. Sci.
Paris 158 (1914) 1869–1872.

[11] H. L. Montgomery and R. C. Vaughan, Multiplicative Number Theory I. Classical Theory, Cambridge University Press, 2006.

[12] L. Panaitopol, Inequalities Concerning the Function π(x): Applications, Acta Arith. 94 (2000) 373–381.

[13] B. Riemann, On the number of primes less than a given quantity (Über die Anzahl der Primzahlen unter einer gegebenen Grösse), 1859. Translated into English by D. R. Wilkins (1998), www.claymath.org/millennium/Riemann_Hypothesis/1859_manuscript/EZeta.pdf

[14] S. Skewes, On the Difference π(x) − li(x) (II), Proc. London Math. Soc. 5
(1955) 48–70.

Email: [email protected]

