Dirichlet
Anthony Várilly
Harvard University, Cambridge, MA 02138
1 Introduction
Dirichlet’s theorem on arithmetic progressions is a gem of number theory. A great part of
its beauty lies in the simplicity of its statement.
Theorem 1.1 (Dirichlet). Let a, m ∈ Z, with (a, m) = 1. Then there are infinitely many
prime numbers in the sequence of integers a, a + m, a + 2m, . . . , a + km, . . . for k ∈ N.
A sixth grader knows enough mathematics to understand this particular formulation of
the theorem. However, many deep ideas of algebra and analysis are required to prove it.
In order to motivate some of the ideas we will introduce, we will sketch how to show there
are infinitely many primes of the form 4k + 1, the special case a = 1, m = 4 of Theorem 1.1.
We shall follow Knapp’s exposition in our sketch [2].
Define the (real valued) Riemann zeta function as
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \qquad s > 1. \tag{1} \]
Throughout this paper, p shall denote a prime number, unless otherwise indicated. It is
possible to write the zeta function as the infinite product
\[ \zeta(s) = \prod_{p} \frac{1}{1 - p^{-s}}. \tag{2} \]
where S is the set of natural numbers whose prime factors do not exceed N. Letting N → ∞
we obtain the result. With this product formula for ζ(s), it is possible to show (and we will
do so in the proof of Theorem 1.1) that
\[ \log \zeta(s) = \sum_{p} \frac{1}{p^s} + g(s) \tag{3} \]
where g(s) is bounded as s → 1.
Define a function χ : Z → {−1, 0, 1} by
\[ \chi(a) = \begin{cases} 0 & \text{if } a \text{ is even}, \\ 1 & \text{if } a \equiv 1 \bmod 4, \\ -1 & \text{if } a \equiv 3 \bmod 4. \end{cases} \]
This function will allow us to distinguish primes of the form 4k + 1 and 4k + 3 from one
another. Notice that χ(mn) = χ(m)χ(n) for all integers m and n. Now let
\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \tag{4} \]
Since χ is multiplicative for all integers (we say χ is strictly multiplicative in this case), one
can write, just like in the case of the zeta function,
\[ L(s, \chi) = \prod_{p} \frac{1}{1 - \chi(p) p^{-s}}. \tag{5} \]
From the Taylor expansion of arctan x, L(1, χ) = π/4 > 0. But ζ(s) diverges as s → 1, so
the left hand sides of (7) and (8) tend to infinity as s → 1. Since both 1/2^s + g(s) + g_1(s, χ)
and 1/2^s + g(s) − g_1(s, χ) remain bounded, it follows that both sums
\[ \sum_{p \equiv 1 \,(4)} \frac{1}{p^s} \qquad \text{and} \qquad \sum_{p \equiv 3 \,(4)} \frac{1}{p^s} \]
diverge as s → 1. This proves there are infinitely many primes of the form
4k + 1. As a bonus, we obtained the existence of infinitely many primes of the form 4k + 3.
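The ingredients of this sketch can be checked numerically. The following Python snippet is only an illustration of ours, not part of the argument: the names chi4 and is_prime are hypothetical helpers, and the combination (1 + χ(p))/2 is one way to express how χ separates the two classes of odd primes.

```python
import math

def chi4(a: int) -> int:
    """The function χ from the text: 0 for even a, 1 if a ≡ 1 mod 4, -1 if a ≡ 3 mod 4."""
    if a % 2 == 0:
        return 0
    return 1 if a % 4 == 1 else -1

def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

# Strict multiplicativity: χ(mn) = χ(m)χ(n) on a small grid of integers.
multiplicative = all(
    chi4(m * n) == chi4(m) * chi4(n)
    for m in range(-20, 21) for n in range(-20, 21)
)

# Partial sum of L(1, χ) = 1 - 1/3 + 1/5 - ..., which tends to π/4.
L1_partial = sum(chi4(n) / n for n in range(1, 200001))

# (1 + χ(p))/2 equals 1 on primes p ≡ 1 mod 4 and 0 on primes p ≡ 3 mod 4.
odd_primes = [p for p in range(3, 100) if is_prime(p)]
filtered = [p for p in odd_primes if (1 + chi4(p)) // 2 == 1]
```

Here filtered contains exactly the primes below 100 of the form 4k + 1, and L1_partial agrees with π/4 to several decimal places.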
There were two crucial ideas that made this last proof possible. First, it was imperative
that L(1, χ) be finite and non-zero, so that its logarithm remains bounded in (8). The other
key idea was the use of the function χ to ‘filter out’ the primes of the form 4k + 1 from all
other primes. To prove Dirichlet’s theorem, we’ll need functions like χ that will filter out
primes of the form a + km. We thus direct our attention to such functions: group characters.
2 Group Characters
Let G be a finite abelian group. A group character is a homomorphism χ : G → C∗. The
characters of a group themselves form a group under pointwise multiplication. We call this
group the dual of G and denote it Ĝ.
If G is a cyclic group of order n, then it is easy to describe Ĝ. Let g be a generator of
G. Then χ(g) = w for some w ∈ C∗. Since χ is a homomorphism,
\[ w^n = \chi(g)^n = \chi(g^n) = \chi(1) = 1. \]
Hence w is an nth root of unity. Conversely, let w be an nth root of unity. Then we can
define a character χ in Ĝ by setting χ(g) = w. Notice that χ^{−1}(a) is the complex conjugate
of χ(a) for all a ∈ G, since χ(a) has absolute value 1. We
have a bijective correspondence between the group of nth roots of unity µn and Ĝ. In fact, it
is easy to see that this correspondence gives an isomorphism µn ≅ Ĝ. Since µn is itself cyclic
of order n, we also have µn ≅ G, and it follows that G ≅ Ĝ.
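For a concrete picture of the cyclic case, the following Python sketch lists the characters of Z/nZ and checks the properties just described. It is an illustration under our own conventions: the group is written additively with generator 1, and the helper name cyclic_characters is hypothetical.

```python
import cmath

def cyclic_characters(n: int):
    """Return the n characters of Z/nZ (written additively, generator g = 1).

    The k-th character sends the generator to w = exp(2πi k/n), an nth root
    of unity, and therefore sends j to w**j.
    """
    def make(k):
        return lambda j: cmath.exp(2j * cmath.pi * k * (j % n) / n)
    return [make(k) for k in range(n)]

n = 6
chars = cyclic_characters(n)
tol = 1e-9

# Each χ is a homomorphism: χ(a + b) = χ(a) χ(b).
is_hom = all(
    abs(chi((a + b) % n) - chi(a) * chi(b)) < tol
    for chi in chars for a in range(n) for b in range(n)
)

# The inverse character is the complex conjugate: 1/χ(a) = conj(χ(a)).
inverse_is_conjugate = all(
    abs(1 / chi(a) - chi(a).conjugate()) < tol
    for chi in chars for a in range(n)
)

# Pointwise products add indices modulo n, so the dual is again cyclic of order n.
dual_is_cyclic = all(
    abs(chars[i](a) * chars[k](a) - chars[(i + k) % n](a)) < tol
    for i in range(n) for k in range(n) for a in range(n)
)
```

All three flags evaluate to True for n = 6; changing n exercises other cyclic groups.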
Now let G be any finite abelian group. The structure theorem for finite abelian groups
tells us G can be written as a direct product of cyclic groups, G ≅ C_{n_1} × · · · × C_{n_k}. Let g_i be
a generator of C_{n_i}. Every element of G can be written as a product of g_i’s to the appropriate
powers, so a character of G is completely determined by the images of the g_i’s. These images
must again be roots of unity.
Conversely, we can define a character χ_i in Ĝ by sending g_i to an n_i-th root of unity w_{n_i}
and all other generators g_j to the identity element. It is easy to see that in this case G ≅ Ĝ
as well. In particular, a group and its dual have the same order.
Example 2.2. Dirichlet Characters modulo m: Let G = (Z/mZ)∗. Then G is a finite abelian
group with φ(m) elements (here φ is the Euler totient function). The characters of G, called
Dirichlet characters modulo m, can be extended to all of Z by setting χ(a) = 0 if (a, m) > 1
and letting χ(a + m) = χ(a) for all integers a. These extensions are not themselves group
characters (a character can’t take the value 0), but they are multiplicative functions on Z.
Through an abuse of language, we will often refer to these extensions as Dirichlet characters
modulo m.
Example 2.3. The principal Dirichlet character modulo m is the extension to Z (as a
multiplicative function) of the trivial character of (Z/mZ)∗ :
\[ \chi_0(a) = \begin{cases} 1 & \text{if } (a, m) = 1, \\ 0 & \text{otherwise.} \end{cases} \]
Example 2.4. Let m = 4 in Example 2.2. Then (Z/4Z)∗ ≅ Z/2Z, so the dual of (Z/4Z)∗
has one non-trivial character; it is given by
\[ \chi(a) = \begin{cases} 1 & \text{if } a \equiv 1 \bmod 4, \\ -1 & \text{if } a \equiv 3 \bmod 4, \\ 0 & \text{otherwise.} \end{cases} \]
Example 2.5. Let m = p in Example 2.2; here p is an odd prime number. The dual of
(Z/pZ)∗ will be cyclic of order p − 1. Hence there will be a character χ of order 2, that is,
χ² = χ0. If a is a quadratic residue modulo p, then χ(a) is forced to be 1. If a is a quadratic
non-residue, then χ(a) is forced to be −1. Thus we can identify χ with the familiar Legendre
symbol,
\[ \chi(a) = \left( \frac{a}{p} \right). \]
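The identification in Example 2.5 can be tested on a small prime. The Python sketch below computes the Legendre symbol with Euler's criterion, a standard formula that is not used in the text, and checks that the result is a multiplicative function of order 2. The function name legendre and the choice p = 11 are ours.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion
    a^((p-1)/2) ≡ (a/p) mod p."""
    r = pow(a, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

p = 11
residues = {(x * x) % p for x in range(1, p)}

# χ(a) = 1 exactly on the nonzero squares modulo p, and -1 on the non-squares.
matches_residues = all(
    legendre(a, p) == (1 if a in residues else -1) for a in range(1, p)
)

# χ is multiplicative on residues modulo p.
is_multiplicative = all(
    legendre(a * b, p) == legendre(a, p) * legendre(b, p)
    for a in range(p) for b in range(p)
)

# χ² is the principal character χ0: 1 on integers prime to p, 0 on multiples of p.
square_is_principal = all(
    legendre(a, p) ** 2 == (1 if a % p else 0) for a in range(2 * p)
)
```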
Remark. The characters χ in Ĝ are strictly multiplicative, that is, χ(ab) = χ(a)χ(b) for all
a, b ∈ G. This follows from the definition of group homomorphism.
The last equality follows from the fact that as a ranges through the elements of G, so does
ab. Hence we have
\[ (\chi(b) - 1) \cdot \frac{1}{|G|} \sum_{a \in G} \chi(a) = 0. \]
By applying Theorem 2.1 to the dual group Ĝ, and since G is isomorphic to the dual of Ĝ,
we get the following result.
Corollary 2.2. Let a ∈ G. Then
\[ \frac{1}{|\hat{G}|} \sum_{\chi \in \hat{G}} \chi(a) = \begin{cases} 1 & \text{if } a = 1, \\ 0 & \text{if } a \neq 1. \end{cases} \tag{10} \]
Equations (11) and (12) are referred to as the orthogonality relations for group characters.
A special case of these relations, which is of interest to us, occurs when G = (Z/mZ)∗ .
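Corollary 2.3 (Orthogonality relations for Dirichlet Characters). Let χ and ψ be
Dirichlet characters modulo m, and let a, b be integers. Then
\[ \frac{1}{\varphi(m)} \sum_{a=0}^{m-1} \chi(a) \overline{\psi(a)} = \begin{cases} 1 & \text{if } \chi = \psi, \\ 0 & \text{otherwise,} \end{cases} \tag{13} \]
\[ \frac{1}{\varphi(m)} \sum_{\chi} \chi(a) \overline{\chi(b)} = \begin{cases} 1 & \text{if } a \equiv b \bmod m, \\ 0 & \text{otherwise,} \end{cases} \tag{14} \]
where in (14) we assume (a, m) = 1, and the bar denotes complex conjugation.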
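Both relations are easy to test numerically when the modulus is prime. The Python sketch below is an illustration under that simplifying assumption (so that (Z/mZ)∗ is cyclic), with the second character conjugated in each sum; the helper characters_mod_prime and the choice m = 7 are ours, not the text's.

```python
import cmath

def characters_mod_prime(q: int):
    """All Dirichlet characters modulo a prime q, each as a list of length q.

    Assumption of this sketch: q is prime, so (Z/qZ)* is cyclic and a
    generator g can be found by brute force. The k-th character sends g**j
    to exp(2πi jk/(q-1)) and sends 0 to 0.
    """
    order = q - 1
    g = next(h for h in range(1, q)
             if len({pow(h, j, q) for j in range(order)}) == order)
    dlog = {pow(g, j, q): j for j in range(order)}  # discrete logarithm table
    return [
        [0j if a == 0 else cmath.exp(2j * cmath.pi * k * dlog[a] / order)
         for a in range(q)]
        for k in range(order)
    ]

m = 7
phi = m - 1
chars = characters_mod_prime(m)

def relation_13(chi, psi) -> complex:
    """(1/φ(m)) Σ_a χ(a) conj(ψ(a)), summed over a = 0, ..., m-1."""
    return sum(chi[a] * psi[a].conjugate() for a in range(m)) / phi

def relation_14(a: int, b: int) -> complex:
    """(1/φ(m)) Σ_χ χ(a) conj(χ(b)), summed over all characters modulo m."""
    return sum(chi[a % m] * chi[b % m].conjugate() for chi in chars) / phi

first_ok = all(
    abs(relation_13(chars[i], chars[k]) - (1 if i == k else 0)) < 1e-9
    for i in range(phi) for k in range(phi)
)
# a runs over integers prime to m; b is arbitrary.
second_ok = all(
    abs(relation_14(a, b) - (1 if (a - b) % m == 0 else 0)) < 1e-9
    for a in range(1, m) for b in range(2 * m)
)
```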
These last two relations shall do us a great service when we try to ‘filter out’ primes of
the form a + km from the zeta function. This is all the character theory we will need. If the
reader is interested in a more thorough treatment of it, we recommend Serre’s book [4]. For
a treatment closer to ours, Ireland and Rosen [1] would be a good book to look at.
We now turn our attention to series like (4). A careful study of them, together with our
knowledge of group characters is enough to prove Theorem 1.1.
3 Dirichlet Series
A series
\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]
with a_n and s complex is called a Dirichlet series. We will be primarily concerned with
series where a_n is a Dirichlet character modulo m. First, we must know something about a
Dirichlet series’ region of convergence. We follow Knapp’s [2] treatment on Dirichlet series
for the following theorems.
Theorem 3.1. Let
\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} \]
be a Dirichlet series. If the series converges for a particular
s = s0, then it converges uniformly on compact subsets of the open half-plane Re s > Re s0.
Furthermore, the sum is analytic in this region.
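Proof of Theorem 3.1. We have
\[ \frac{a_n}{n^s} = \frac{a_n}{n^{s_0}} \cdot \frac{1}{n^{s - s_0}}. \]
Let u_n = a_n/n^{s0} and v_n = 1/n^{s−s0}, and let U_n = u_1 + · · · + u_n denote the partial sums
of the u_n. We know {U_n} is convergent by hypothesis, hence bounded; say |U_n| ≤ U for all n.
Moreover, v_n → 0 uniformly on compact subsets of the half-plane Re s > Re s0.
Thus U_n v_n → 0 as n → ∞ in this region. Then
\[ \left| \sum u_n v_n \right| = \left| \sum U_n (v_n - v_{n+1}) \right| \le \sum |U_n| \, |v_n - v_{n+1}| \le U \sum |v_n - v_{n+1}|. \]
If we can show that
\[ \sum |v_n - v_{n+1}| = \sum \left| \frac{1}{n^{s - s_0}} - \frac{1}{(n+1)^{s - s_0}} \right| \]
converges uniformly on compact subsets of the half-plane Re s > Re s0, we will be done.
For n ≤ t ≤ n + 1, we have
\[ \left| \frac{1}{n^{s - s_0}} - \frac{1}{t^{s - s_0}} \right| \le \frac{|s - s_0|}{n^{1 + \mathrm{Re}(s - s_0)}}, \tag{15} \]
and so
\[ |v_n - v_{n+1}| \le \frac{|s - s_0|}{n^{1 + \mathrm{Re}(s - s_0)}}. \]
Hence
\[ \sum_{n} |v_n - v_{n+1}| \le |s - s_0| \sum_{n} \frac{1}{n^{1 + \mathrm{Re}(s - s_0)}}, \]
and this last expression converges uniformly on compact sets on which Re(s − s0) > 0. The
analyticity of the sum follows from the analyticity of each term in the half-plane.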
3.1 Dirichlet L-series and Euler Products
Dirichlet series that have Dirichlet characters modulo m (extended to Z) as their coefficients
are called L-functions:
\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \tag{16} \]
When we studied primes of the form 4k + 1, we came across an example of an L-function. Notice
though that a general L-function can have s and χ(n) take complex values.
Remark. L(s, 1) looks like a zeta function with complex s that is missing all integers n that
are not relatively prime to m, since χ(n) = 0 for such n.
Lemma 3.3. The zeta function ζ(s) is meromorphic in the half-plane Re s > 0. Its only
pole is s = 1 and it is simple.
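Proof. We have
\[
\begin{aligned}
\zeta(s) &= \frac{1}{s-1} + \sum_{n=1}^{\infty} \frac{1}{n^s} - \frac{1}{s-1}
          = \frac{1}{s-1} + \sum_{n=1}^{\infty} \frac{1}{n^s} - \int_1^{\infty} \frac{1}{t^s} \, dt \\
         &= \frac{1}{s-1} + \sum_{n=1}^{\infty} \left( \frac{1}{n^s} - \int_n^{n+1} \frac{1}{t^s} \, dt \right) \\
         &= \frac{1}{s-1} + \sum_{n=1}^{\infty} \int_n^{n+1} \left( \frac{1}{n^s} - \frac{1}{t^s} \right) dt.
\end{aligned}
\]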
Notice that ∫_n^{n+1} (n^{−s} − t^{−s}) dt is an analytic function for Re s > 0. To show the sum of such
integrals (as n ranges from 1 to ∞) is analytic, all we need is uniform convergence on compact
sets for which Re s > 0. Now,
\[ \left| \int_n^{n+1} \left( n^{-s} - t^{-s} \right) dt \right| \le \int_n^{n+1} \left| n^{-s} - t^{-s} \right| dt \le \sup_{n \le t \le n+1} \left| n^{-s} - t^{-s} \right|, \]
and this last expression is at most |s|/n^{1+Re s} by (15). The series
\[ \sum_{n} \frac{1}{n^{1 + \mathrm{Re}\, s}} \]
converges for Re s > 0. Hence the desired series of integrals converges in this region as well.
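The representation used in this proof can be evaluated directly for real s. The Python sketch below truncates the series of integrals, using the elementary closed form of each integral, and compares the result with two known values: ζ(2) = π²/6, and ζ(1/2) ≈ −1.46035, which lies outside the original region s > 1. The function name zeta_via_lemma and the truncation point are our own choices; convergence is slow for small s.

```python
import math

def zeta_via_lemma(s: float, terms: int = 200_000) -> float:
    """Approximate ζ(s) for real s > 0, s != 1, by truncating
    1/(s-1) + Σ_{n>=1} ∫_n^{n+1} (n^-s - t^-s) dt.

    Each integral is evaluated in closed form, using
    ∫_n^{n+1} t^-s dt = ((n+1)^(1-s) - n^(1-s)) / (1 - s).
    """
    total = 1.0 / (s - 1.0)
    for n in range(1, terms + 1):
        integral_t = ((n + 1) ** (1.0 - s) - n ** (1.0 - s)) / (1.0 - s)
        total += n ** (-s) - integral_t
    return total

zeta_2 = zeta_via_lemma(2.0)       # inside the original region s > 1
zeta_half = zeta_via_lemma(0.5)    # only defined through the extension
```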
Our next goal is to obtain a product expansion for L(s, χ) like that of the zeta function.
We use the crucial fact that Dirichlet characters are strictly multiplicative.
Lemma 3.4. The Dirichlet series Σ_n χ(n)/n^s converges absolutely for Re s > 1. Furthermore,
\[ \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{1 - \chi(p) p^{-s}}. \tag{17} \]
Proof. χ is a bounded function. This gives the desired absolute convergence for Re s > 1.
To see why the product expansion holds note that for Re s > 1 and a fixed prime number q,
\[ \prod_{\substack{p = 2, \\ p \text{ prime}}}^{q} \frac{1}{1 - \chi(p) p^{-s}} = \prod_{\substack{p = 2, \\ p \text{ prime}}}^{q} \left( 1 + \chi(p) p^{-s} + \chi^2(p) p^{-2s} + \cdots \right) \tag{18} \]
\[ = \prod_{\substack{p = 2, \\ p \text{ prime}}}^{q} \left( 1 + \chi(p) p^{-s} + \chi(p^2) p^{-2s} + \cdots \right) = \sum_{n \in S} \frac{\chi(n)}{n^s} \tag{19} \]
where S is the set of natural numbers whose prime factors do not exceed q. This means the
partial product (18) is equal to a convergent infinite sum. Now fix a natural number N. We
have
\[ \sum_{n=1}^{N} \frac{\chi(n)}{n^s} = \prod_{\substack{p = 2, \\ p \text{ prime}}}^{r} \frac{1}{1 - \chi(p) p^{-s}} - \sum_{\substack{n \in S \\ n > N}} \frac{\chi(n)}{n^s} \tag{20} \]
where r is the largest prime number less than or equal to N, and now S is the set of natural
numbers whose prime factors do not exceed r. Letting q → ∞ in (18) and N → ∞ in (20)
we see that the product expansion and the series converge or diverge together. Since we
know the series converges for Re s > 1, the product expansion must also converge in that
region. Furthermore, letting q → ∞ in (18), we obtain (17).
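As a numerical illustration of the product expansion, the Python sketch below compares a partial sum of the series with a partial Euler product for the non-principal character modulo 4 at s = 2. The helpers chi4 and primes_up_to and the cutoff 100000 are our own choices; the two truncated quantities should agree only approximately.

```python
import math

def chi4(n: int) -> int:
    """Non-principal Dirichlet character modulo 4, extended to Z."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def primes_up_to(limit: int):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

s = 2.0
cutoff = 100_000

# Truncated Dirichlet series Σ χ(n)/n^s.
series = sum(chi4(n) / n**s for n in range(1, cutoff + 1))

# Truncated Euler product Π 1/(1 - χ(p) p^-s) over primes p <= cutoff.
product = 1.0
for p in primes_up_to(cutoff):
    product *= 1.0 / (1.0 - chi4(p) * p ** (-s))

difference = abs(series - product)
```

The difference shrinks as the cutoff grows, in line with the statement that the series and the product converge together for Re s > 1.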
With the above three lemmas in hand, we can extend our remark about L(s, 1). Applying
Lemma 3.4 to the principal character, we have
\[ L(s, 1) = \prod_{p \nmid m} \frac{1}{1 - p^{-s}} = \prod_{p \mid m} \left( 1 - p^{-s} \right) \zeta(s). \]
This last equality follows from the fact that we have extended the zeta function to the
region Re s > 0 in Lemma 3.3. Since the product over p|m is finite, it follows that L(s, 1)
is meromorphic in the region Re s > 0 and its only pole is simple at s = 1. Note, however,
that the product expansion of L(s, 1) is only valid in the region Re s > 1.
The product expression (17) is an example of an Euler product of first degree.
If χ is not the principal character, then we can go further and show that the series L(s, χ)
is convergent and analytic in the region Re s > 0.
Theorem 3.5. Let χ be a Dirichlet character modulo m different from the principal character.
Then the series L(s, χ) converges and is analytic in Re s > 0.
Proof. We extended Dirichlet characters to Z by setting χ(a) = 0 when (a, m) > 1 and by
letting χ(a + m) = χ(a) for all integers a. Using the extended characters, it follows, by
Theorem 2.1, that
\[ \sum_{n=1}^{m} \chi(n + a) = 0 \tag{21} \]
for any a.
Let s > 0 for now. We use Abel’s summation formula with u_n = χ(n) and v_n = 1/n^s.
Equation (21) says {U_n} is bounded; say |U_n| ≤ U. It is easy to see that U_n v_n → 0 as
n → ∞. Hence
\[ \left| \sum_{n=M}^{\infty} \frac{\chi(n)}{n^s} \right| = \left| \sum_{n=M}^{\infty} u_n v_n \right| = \left| \sum_{n=M}^{\infty} U_n (v_n - v_{n+1}) \right| \le U \sum_{n=M}^{\infty} |v_n - v_{n+1}| = \frac{U}{M^s} \]
for any finite M. The last equality follows because |v_n − v_{n+1}| = v_n − v_{n+1} for s > 0, so
the sum telescopes. As M → ∞, the last expression tends to zero. Therefore the series
Σ_n χ(n)/n^s is convergent for s real and positive. By Theorem 3.1, the series is convergent
and analytic in the region Re s > 0.
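The tail estimate in this proof can be observed numerically. The Python sketch below uses the non-principal character modulo 4, for which sums of consecutive values of χ always lie in {−1, 0, 1}, so that U = 1 is an admissible bound; it then checks that finite tails of the series stay below U/M^s at s = 1/2. The helper names and the ranges tested are our own choices.

```python
def chi4(n: int) -> int:
    """Non-principal Dirichlet character modulo 4, extended to Z."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def tail(M: int, N: int, s: float) -> float:
    """Finite tail Σ_{n=M}^{N} χ(n)/n^s of the L-series."""
    return sum(chi4(n) / n**s for n in range(M, N + 1))

s = 0.5
U = 1.0  # bound on sums of consecutive values of χ for this character

# Every finite tail starting at M is bounded by U / M^s (small slack for rounding).
bound_holds = all(
    abs(tail(M, N, s)) <= U / M**s + 1e-12
    for M in range(1, 30) for N in range(M, 200)
)

# At s = 1 the partial sums approach L(1, χ) = π/4.
L1_partial = tail(1, 100_000, 1.0)
```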
As a consequence of Theorem 3.5, we see that when χ is not the principal character,
L(s, χ) is well defined at s = 1. We will need to show that in fact it is not zero to prove
Dirichlet’s theorem.
Theorem 3.6. For non-principal χ, L(1, χ) ≠ 0.
We will postpone the proof of this theorem until we prove Dirichlet’s theorem.
4 Dirichlet’s theorem
We are now in a position to prove Theorem 1.1. For the first part of the proof we will loosely
follow Knapp’s [2] treatment.
Proof of Theorem 1.1. First, we will show that for a Dirichlet character χ modulo m,
\[ \log L(s, \chi) = \sum_{p} \frac{\chi(p)}{p^s} + g(s, \chi) \tag{22} \]
for real s > 1, where g(s, χ) is a function that remains bounded as s → 1. Even if s is real,
L(s, χ) could still be complex valued, so if we want to take its logarithm, we had better choose
a branch. For a given p and s ≥ 1, define the value of the logarithm of the pth factor in the
L-function’s Euler product by
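\[ \log \frac{1}{1 - \chi(p) p^{-s}} = \frac{\chi(p)}{p^s} + \sum_{n=2}^{\infty} \frac{\chi(p^n)}{n p^{ns}} \]
and let
\[ g(s, \chi, p) = \sum_{n=2}^{\infty} \frac{\chi(p^n)}{n p^{ns}}. \]
For this choice of branch, and for |z| ≤ 1/2, we have
\[
\begin{aligned}
\left| \log \frac{1}{1 - z} - z \right| &= \left| \sum_{n=2}^{\infty} \frac{z^n}{n} \right| \le \sum_{n=2}^{\infty} \frac{|z|^n}{n} \\
&\le |z|^2 \sum_{n=0}^{\infty} \frac{1}{n+2} \left( \frac{1}{2} \right)^{n} \le |z|^2 \sum_{n=0}^{\infty} \left( \frac{1}{2} \right)^{n+1} = |z|^2.
\end{aligned}
\]
Now set z = χ(p)/p^s. Since |χ(p)/p^s| ≤ 1/2 we obtain
\[ |g(s, \chi, p)| = \left| \log \frac{1}{1 - \chi(p) p^{-s}} - \frac{\chi(p)}{p^s} \right| \le \left| \frac{\chi(p)}{p^s} \right|^2 \le \frac{1}{p^2} \qquad \text{for } s \ge 1. \]
Finally, set g(s, χ) = Σ_p g(s, χ, p). Now
\[ \sum_{p} |g(s, \chi, p)| \le \sum_{p} \frac{1}{p^2} \le \sum_{n} \frac{1}{n^2}, \]
which converges.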
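The bound on log(1/(1 − z)) − z for |z| ≤ 1/2 is easy to test on a grid of complex numbers. In the Python sketch below we use the principal branch of the logarithm from the standard cmath module, which agrees with the power series for |z| < 1; the sample grid and the helper name log_remainder are our own choices.

```python
import cmath

def log_remainder(z: complex) -> complex:
    """log(1/(1 - z)) - z on the principal branch, i.e. Σ_{n>=2} z^n/n for |z| < 1."""
    return -cmath.log(1 - z) - z

# Sample points on circles of radius 0.05, 0.10, ..., 0.50 around the origin.
samples = [
    0.05 * r * cmath.exp(2j * cmath.pi * k / 24)
    for r in range(1, 11) for k in range(24)
]

# The estimate |log(1/(1 - z)) - z| <= |z|^2, with a small slack for rounding.
bound_holds = all(
    abs(log_remainder(z)) <= abs(z) ** 2 + 1e-12 for z in samples
)

# The case z = χ(p)/p^s with p = 2, s = 1 and χ(p) = i, a fourth root of unity.
example = abs(log_remainder(0.5j))
```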
This establishes (22). Now we use group characters to ‘filter out’ the primes of the form a +
km. Recall from our discussion of Dirichlet characters modulo m the following orthogonality
relation
\[ \frac{1}{\varphi(m)} \sum_{\chi} \chi(a) \overline{\chi(b)} = \begin{cases} 1 & \text{if } a \equiv b \bmod m, \\ 0 & \text{otherwise.} \end{cases} \]
We know L(s, 1) has a simple pole at s = 1 and all other L(s, χ) are analytic for Re s > 0.
Suppose there is some non-principal χ such that L(1, χ) = 0. The function ζm (s) would then
be analytic on Re s > 0. We will prove this is not the case. The theorem will follow.
Suppose p is a prime not dividing m. Let f(p) be the order of the image p̄ of p in
(Z/mZ)∗. Define g(p) = φ(m)/f(p), that is, the order of the quotient of (Z/mZ)∗ by the
subgroup (p̄) generated by p̄.
Lemma 4.1. If p ∤ m then
\[ \prod_{\chi} \left( 1 - \frac{\chi(p)}{p^s} \right) = \left( 1 - \frac{1}{p^{f(p)s}} \right)^{g(p)}. \tag{25} \]
For any w ∈ µ_{f(p)}, there are φ(m)/f(p) = g(p) Dirichlet characters modulo m such that
χ(p) = w. Letting x = 1/p^s we obtain the desired result.
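Identity (25) can be checked numerically in a small case. The Python sketch below takes the modulus m = 7 and the prime p = 2, builds all characters modulo 7 from a generator of (Z/7Z)∗, and compares both sides with an arbitrary complex number x in place of 1/p^s. The restriction to a prime modulus and the helper names are assumptions of this sketch.

```python
import cmath

def characters_mod_prime(q: int):
    """All Dirichlet characters modulo a prime q, each as a list of length q.

    Assumption of this sketch: q is prime, so (Z/qZ)* is cyclic with a
    generator g found by brute force.
    """
    order = q - 1
    g = next(h for h in range(1, q)
             if len({pow(h, j, q) for j in range(order)}) == order)
    dlog = {pow(g, j, q): j for j in range(order)}
    return [
        [0j if a == 0 else cmath.exp(2j * cmath.pi * k * dlog[a] / order)
         for a in range(q)]
        for k in range(order)
    ]

def order_mod(p: int, q: int) -> int:
    """Multiplicative order f(p) of p modulo q (requires gcd(p, q) = 1)."""
    f, x = 1, p % q
    while x != 1:
        x = x * p % q
        f += 1
    return f

q = 7            # the modulus m
p = 2            # a prime not dividing m
x = 0.3 + 0.1j   # stands in for 1/p^s; (25) is a polynomial identity in x

f = order_mod(p, q)
g = (q - 1) // f

lhs = 1 + 0j
for chi in characters_mod_prime(q):
    lhs *= 1 - chi[p % q] * x      # Π_χ (1 - χ(p) x)
rhs = (1 - x**f) ** g              # (1 - x^f)^g
```

For these values f(p) = 3 and g(p) = 2, and the two sides agree up to rounding error.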
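Lemma 4.2. The function ζm(s) has a product expansion for Re s > 1 given by
\[ \zeta_m(s) = \prod_{p \nmid m} \left( \frac{1}{1 - p^{-f(p)s}} \right)^{g(p)}. \tag{26} \]
Proof. We have
\[ \zeta_m(s) = \prod_{\chi} L(s, \chi) = \prod_{\chi} \prod_{p \nmid m} \frac{1}{1 - \chi(p) p^{-s}} = \prod_{p \nmid m} \left( \frac{1}{1 - p^{-f(p)s}} \right)^{g(p)}, \]
where the last equality follows from Lemma 4.1.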
Proof. We will follow Serre’s [3, p. 67] ideas in this proof, though our proof is not as general
as his. We may assume without loss of generality that s0 = 0. Just replace s with s − s0.
For convenience, denote the series above by f(s). Because f(s) is analytic in the region
Re s > 0 and in a neighborhood around 0, there is an ε > 0 such that f(s) is analytic in the
disc |s − 1| ≤ 1 + ε. This means the Taylor series of f(s) about s = 1 must converge in this
disc. The pth derivative of f(s) is given by
\[ f^{(p)}(s) = \sum_{n=1}^{\infty} \frac{a_n (-\log n)^p}{n^s}, \qquad \text{so that} \qquad f^{(p)}(1) = (-1)^p \sum_{n=1}^{\infty} \frac{a_n (\log n)^p}{n}. \]
But
\[ (-1)^p f^{(p)}(1) = \sum_{n=1}^{\infty} \frac{a_n (\log n)^p}{n} \]
is a convergent series with positive terms. This means the following double sum converges:
\[ f(-\varepsilon) = \sum_{p} \sum_{n} a_n \frac{1}{p!} \frac{(1 + \varepsilon)^p (\log n)^p}{n}. \]
Since every term is non-negative, we may interchange the order of summation. The sum over
p is then the exponential series for n^{1+ε}, so the double sum equals Σ_n a_n n^ε.
This is the Dirichlet series we started with evaluated at −ε! Therefore, the series converges
at s = −ε, and thus, by Theorem 3.1, for all Re(s) > −ε.
So far we have argued that if L(1, χ) = 0 for some non-principal χ, then ζm (s) must
be analytic for Re s > 0. We saw that ζm (s) is a Dirichlet series with positive coefficients.
Moreover, since all L(s, χ) are convergent for Re(s) > 1, ζm (s) converges in this region as
well. Since ζm (s) is analytic in the region Re(s) > 0, Lemma 4.3 tells us we can push back
the region of convergence of ζm (s) to Re(s) > 0. Our contradiction is at hand.
Let s be a real number greater than 1. Consider the pth factor in the product expansion
of ζm(s) (which is valid for Re s > 1):
\[
\begin{aligned}
\left( \frac{1}{1 - p^{-f(p)s}} \right)^{g(p)} &= \left( 1 + p^{-f(p)s} + p^{-2f(p)s} + \cdots \right)^{g(p)} \\
&\ge 1 + p^{-f(p)g(p)s} + p^{-2f(p)g(p)s} + \cdots \\
&= 1 + p^{-\varphi(m)s} + p^{-2\varphi(m)s} + \cdots \\
&= \frac{1}{1 - p^{-\varphi(m)s}}.
\end{aligned}
\]
This shows that for s > 1,
\[
\begin{aligned}
\zeta_m(s) = \prod_{\chi} L(s, \chi) &= \prod_{\chi} \prod_{p \nmid m} \frac{1}{1 - \chi(p) p^{-s}} = \prod_{p \nmid m} \left( \frac{1}{1 - p^{-f(p)s}} \right)^{g(p)} \\
&\ge \prod_{p \nmid m} \frac{1}{1 - p^{-\varphi(m)s}} = \sum_{(n, m) = 1} n^{-\varphi(m)s}.
\end{aligned}
\tag{27}
\]
Thus, ζm(s) has all its coefficients greater than or equal to those of the series in (27). These
coefficients of ζm(s) remain unchanged if we take s between 0 and 1. But the series (27)
diverges for s = 1/φ(m) > 0. Hence ζm(s) is unbounded for this value of s, and thus the series
ζm(s) diverges for a value of s whose real part is greater than zero. This is a contradiction, since
we showed ζm(s) converges for Re s > 0. This completes the proof of Theorem 3.6.
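Two numerical facts behind this last step can be illustrated in Python. The sketch below checks the factor-by-factor inequality with x standing in for p^{−s}, and then shows that at s = 1/φ(m) the comparison series becomes the sum of 1/n over n prime to m, whose partial sums keep growing. The modulus m = 12, the tested ranges, and the helper names are our own choices.

```python
from math import gcd

def factor_inequality(x: float, f: int, g: int) -> bool:
    """Check (1/(1 - x^f))^g >= 1/(1 - x^(f g)) for 0 < x < 1, up to rounding."""
    lhs = (1.0 / (1.0 - x**f)) ** g
    rhs = 1.0 / (1.0 - x ** (f * g))
    return lhs >= rhs * (1.0 - 1e-12)

inequality_holds = all(
    factor_inequality(i / 100, f, g)
    for i in range(1, 100) for f in range(1, 7) for g in range(1, 7)
)

m = 12  # here φ(m) = 4

def coprime_harmonic(N: int) -> float:
    """Partial sum of Σ n^(-φ(m) s) at s = 1/φ(m), i.e. Σ 1/n over n <= N prime to m."""
    return sum(1.0 / n for n in range(1, N + 1) if gcd(n, m) == 1)

# Partial sums at N = 10, 100, ..., 100000 grow without levelling off.
partials = [coprime_harmonic(10**k) for k in range(1, 6)]
increments = [b - a for a, b in zip(partials, partials[1:])]
```

Each tenfold increase in N adds roughly the same amount to the partial sum, which is the logarithmic growth one expects from a divergent harmonic-type series.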
Remark. Even though the product expansion of ζm (s) is only valid for Re s > 1, we were
able to use the expansion to look at the coefficients of the series representation for ζm (s),
which is valid for Re s > 0.
References
[1] K. Ireland, M. Rosen, A Classical Introduction to Modern Number Theory, Second Edition.
Springer, New York, 1990.
[4] J-P. Serre, Linear Representations of Finite Groups. Springer, New York, 1977.