
Mit 18.786

The document discusses the geometry of the complex upper half plane and its automorphisms, focusing on the action of the group GL2(R) on the Riemann sphere. It introduces concepts such as topological groups, geodesic spaces, and Haar measures, along with key theorems regarding the classification of linear fractional transformations and the properties of the upper half plane. The lecture also emphasizes the importance of invariant metrics and measures in the context of SL2(R) and its applications in number theory.


18.786 Number theory II Spring 2024


Lecture #1 9/6/2021

1 Geometry of the complex upper half plane


This is a summary of the material from §1.1–1.4 of [1] presented in lecture, with proofs omitted.

1.1 Automorphisms of the upper half plane


The group GL2(C) acts on the Riemann sphere P^1(C) = C ∪ {∞} via linear fractional
transformations:

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} z = \frac{az + b}{cz + d} \qquad (z \in \mathbb{P}^1(\mathbb{C})). \]

This defines a left group action of GL2(C) on P^1(C). The maps z ↦ αz are meromorphic.
We now define the upper half plane

H := {z ∈ C : Im(z) > 0}

For z ∈ H and α = \(\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)\) ∈ GL2(R) we have

\[ \operatorname{Im}(\alpha z) = \frac{\det(\alpha)\,\operatorname{Im}(z)}{|cz + d|^2}. \]

For det(α) > 0 we have |cz + d| ≠ 0 and Im(αz) > 0, which implies that z ↦ αz is a holomorphic
automorphism of H for all α ∈ GL2^+(R) := {α ∈ GL2(R) : det(α) > 0}.
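The identity for Im(αz) can be checked numerically. The following is a minimal sketch, assuming a matrix is stored as a tuple (a, b, c, d); the helper names `mobius` and `predicted_imag` are illustrative and not part of the notes.

```python
def mobius(alpha, z):
    """Apply the linear fractional transformation of alpha = (a, b, c, d) to z."""
    a, b, c, d = alpha
    return (a * z + b) / (c * z + d)


def predicted_imag(alpha, z):
    """Right-hand side of the identity: det(alpha) * Im(z) / |cz + d|^2."""
    a, b, c, d = alpha
    det = a * d - b * c
    return det * z.imag / abs(c * z + d) ** 2


alpha = (2.0, 1.0, 1.0, 3.0)   # det = 5 > 0, so alpha lies in GL2^+(R)
z = complex(0.5, 2.0)          # a point of the upper half plane
w = mobius(alpha, z)

# Both numbers agree up to rounding, and the image stays in H.
print(w.imag, predicted_imag(alpha, z))
```

A matrix of negative determinant, such as (1, 0, 0, -1), sends z to -z and so moves H to the lower half plane, which is why the restriction to det(α) > 0 is needed.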

Theorem 1.1. The following hold:

• For every z ∈ H there exists α ∈ SL2 (R) such that αi = z.


• GL2^+(R)/R^× ≅ SL2(R)/{±1} ≅ Aut(H).
• The SL2 (R)-stabilizer of i ∈ H is SL2 (R)i = SO2 (R).

Proof. See [1, Theorem 1.1.3].

Recall that a topological group G is a group object in the category of topological spaces (this
means the maps G × G → G and G → G defined by (g, h) ↦ gh and g ↦ g^{-1} are continuous). All
the topological groups we shall consider in this course are Hausdorff, but we won’t bake that
into the definition (some authors do). We view GL2(R) and all its subgroups and quotients as
topological groups, where GL2(R) has the subspace topology induced by GL2(R) ⊆ M2(R) ≅ R^4.
When a topological group G acts on a topological space X the action map G × X → X
defined by (g, x) ↦ gx is required to be continuous, in addition to satisfying the usual
properties of a (left) group action. For each g ∈ G the map x ↦ gx is an automorphism of X (a
homeomorphism X → X). For x ∈ X we have the G-stabilizer G_x and G-orbit Gx defined by

G_x := {g ∈ G : gx = x} ⊆ G and Gx := {gx : g ∈ G} ⊆ X,

and we use G\X to denote the topological space defined by the G-orbits of X (with the quotient
topology), and we use G/G_x to denote the topological space consisting of the right G_x-cosets of
G (with the quotient topology).

Theorem 1.2. Let G be a second countable locally compact Hausdorff topological group acting
transitively on a locally compact Hausdorff space X. Then for every x ∈ X the map gG_x ↦ gx
defines a homeomorphism G/G_x → X.

Lecture by Andrew V. Sutherland


Proof. See [1, Theorem 1.2.1]. We will prove a more general theorem in the next lecture.

Corollary 1.3. The map αSO2(R) ↦ αi is a homeomorphism SL2(R)/SO2(R) → H.

The corollary allows us to represent elements z of H as right cosets βSO2(R), in which case
the action of α ∈ SL2(R) on H is simply (left) matrix multiplication αz = αβSO2(R) rather than a
linear fractional transformation.

1.2 Classification of linear fractional transformations


For α ∈ GL2^+(R) we define disc(α) := tr(α)^2 − 4 det(α) to be the discriminant of its characteristic
polynomial t^2 − tr(α)t + det(α), whose sign determines the type of eigenvalues α has (distinct
complex conjugates, a single real (double) eigenvalue, or two real eigenvalues).

Definition 1.4. We call a nonscalar α ∈ GL2^+(R) elliptic if disc(α) < 0, parabolic if disc(α) = 0,
and hyperbolic if disc(α) > 0.

We denote the extended upper half-plane by H∗ := H ∪ R ∪ {∞} ⊆ P1 (C).

Theorem 1.5. For each nonscalar α ∈ GL2^+(R) we have

• α is elliptic if and only if its fixed points are z and z̄ for some z ∈ H.
• α is parabolic if and only if it has a unique fixed point x ∈ H∗ − H = R ∪ {∞}.
• α is hyperbolic if and only if it has distinct fixed points x_1, x_2 ∈ H∗ − H = R ∪ {∞}.

In every case, the fixed points of α on P1 (C) lie in the extended upper half-plane H∗ , and they lie
in the upper half-plane if and only if α is elliptic.

Proof. This follows immediately from an analysis of the eigenvalues in each case.
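The classification can be illustrated by solving the fixed-point equation cz^2 + (d − a)z − b = 0 directly, whose discriminant equals disc(α). This is a sketch under the assumption that matrices are given as tuples (a, b, c, d) with exact (integer) entries, so that the test disc == 0 is meaningful; `classify` and `fixed_points` are hypothetical helper names, and float("inf") stands in for the point ∞.

```python
import cmath


def classify(alpha):
    """Return the type of alpha = (a, b, c, d) according to the sign of disc(alpha)."""
    a, b, c, d = alpha
    if b == 0 and c == 0 and a == d:
        return "scalar"
    disc = (a + d) ** 2 - 4 * (a * d - b * c)
    if disc < 0:
        return "elliptic"
    if disc == 0:
        return "parabolic"
    return "hyperbolic"


def fixed_points(alpha):
    """Fixed points on P^1(C) of a nonscalar alpha; float('inf') stands for the point at infinity."""
    a, b, c, d = alpha
    if c == 0:
        # z -> (az + b)/d fixes infinity; a finite fixed point b/(d - a) exists iff a != d.
        points = [float("inf")]
        if a != d:
            points.append(b / (d - a))
        return points
    # Solve c z^2 + (d - a) z - b = 0; its discriminant is disc(alpha).
    disc = (a + d) ** 2 - 4 * (a * d - b * c)
    if disc == 0:
        return [(a - d) / (2 * c)]
    root = cmath.sqrt(disc)
    return [((a - d) + root) / (2 * c), ((a - d) - root) / (2 * c)]


for alpha in [(0, -1, 1, 0), (1, 1, 0, 1), (2, 1, 1, 1)]:
    print(alpha, classify(alpha), fixed_points(alpha))
```

The three sample matrices are elliptic (fixed points ±i), parabolic (fixed point ∞), and hyperbolic (two real fixed points), matching the three cases of the theorem.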

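For x, x′ ∈ R ∪ {∞} we define

\[ \mathrm{GL}_2^+(\mathbb{R})_x^{(p)} := \{\alpha \in \mathrm{GL}_2^+(\mathbb{R})_x : \alpha \text{ is parabolic or scalar}\}, \qquad \mathrm{GL}_2^+(\mathbb{R})_{x,x'} := \mathrm{GL}_2^+(\mathbb{R})_x \cap \mathrm{GL}_2^+(\mathbb{R})_{x'}. \]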

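Lemma 1.6. We have

• \( \mathrm{GL}_2^+(\mathbb{R})_i = \mathbb{R}^\times \cdot \mathrm{SO}_2(\mathbb{R}) \);

• \( \mathrm{GL}_2^+(\mathbb{R})_\infty = \left\{ \left(\begin{smallmatrix} a & b \\ 0 & d \end{smallmatrix}\right) : a, d \in \mathbb{R}^\times,\ b \in \mathbb{R},\ ad > 0 \right\} \);

• \( \mathrm{GL}_2^+(\mathbb{R})_\infty^{(p)} = \left\{ \left(\begin{smallmatrix} a & b \\ 0 & a \end{smallmatrix}\right) : a \in \mathbb{R}^\times,\ b \in \mathbb{R} \right\} \);

• \( \mathrm{GL}_2^+(\mathbb{R})_{0,\infty} = \left\{ \left(\begin{smallmatrix} a & 0 \\ 0 & d \end{smallmatrix}\right) : a, d \in \mathbb{R}^\times,\ ad > 0 \right\} \).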

Moreover, for z ∈ H the stabilizer GL2^+(R)_z is conjugate to GL2^+(R)_i, for x ∈ H∗ − H the stabilizer
GL2^+(R)_x is conjugate to GL2^+(R)_∞ with GL2^+(R)_x^{(p)} conjugate to GL2^+(R)_∞^{(p)}, and for distinct
x, x′ ∈ H∗ − H the double stabilizer GL2^+(R)_{x,x′} is conjugate to GL2^+(R)_{0,∞}. In all cases, the
conjugating element can be chosen to lie in SL2(R).

Proof. See [1, Lemma 1.3.2].



1.3 The invariant metric and measure on the upper half plane
Let z = x + iy ∈ H. The invariant metric ds and invariant measure dv on H are defined by

\[ ds(z) = \frac{\sqrt{dx^2 + dy^2}}{y} \qquad \text{and} \qquad dv(z) = \frac{dx\,dy}{y^2}. \]

We call the image C_{z_0→z_1} of an injective function φ : [0, 1] → H that is C^∞ at all but finitely
many points a path from z_0 = φ(0) to z_1 = φ(1). If we put φ(t) = x(t) + iy(t), the length of
C_{z_0→z_1} is

\[ \ell(C_{z_0 \to z_1}) := \int_0^1 ds(\varphi(t)) = \int_0^1 \frac{\sqrt{(dx(t)/dt)^2 + (dy(t)/dt)^2}}{y(t)}\,dt, \]

which does not depend on the choice of φ, as long as it has image C_{z_0→z_1}.
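As a numerical illustration of the length integral, the sketch below approximates it with a midpoint rule for two paths from i to e·i: the vertical segment φ(t) = i·e^t and a hypothetical detour with x(t) = sin(πt). The function name `hyperbolic_length` and the choice of paths are assumptions made for this example only.

```python
import math


def hyperbolic_length(phi, dphi, steps=20000):
    """Midpoint-rule approximation of the integral of |phi'(t)| / Im(phi(t)) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += abs(dphi(t)) / phi(t).imag * h
    return total


# Vertical segment from i to e*i: the integrand is identically 1, so the length is 1.
vertical_length = hyperbolic_length(
    lambda t: 1j * math.exp(t),
    lambda t: 1j * math.exp(t),
)

# A detour with the same endpoints: x(t) = sin(pi t), y(t) = e^t.
detour_length = hyperbolic_length(
    lambda t: math.sin(math.pi * t) + 1j * math.exp(t),
    lambda t: math.pi * math.cos(math.pi * t) + 1j * math.exp(t),
)

print(vertical_length, detour_length)
```

The detour is strictly longer than the vertical segment, since its integrand is at least y′(t)/y(t) = 1 everywhere and strictly larger wherever x′(t) ≠ 0.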
A circle or line in H orthogonal to R is called a geodesic.

Lemma 1.7. For any two distinct z_0, z_1 ∈ H, there is a unique shortest path C_{z_0→z_1} from z_0 to z_1,
which lies on a geodesic. For any z_0 ∈ H and r > 0 the set {z ∈ H : ℓ(C_{z_0→z}) = r}, where C_{z_0→z}
denotes the shortest path from z_0 to z, is a circle in H orthogonal to every geodesic that contains z_0.

Proof. See Lemma 1.4.1 and Corollary 1.4.2 in [1].

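The isomorphism H ≅ SL2(R)/SO2(R) allows us to uniquely represent α ∈ SL2(R) in the
form α = \(\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)\) = h_z k_θ, where z = x + iy = αi ∈ H and θ = −arg(ci + d), with

\[ h_z = y^{-1/2} \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix} \in \mathrm{SL}_2(\mathbb{R}), \qquad k_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \in \mathrm{SO}_2(\mathbb{R}). \]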
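The decomposition α = h_z k_θ can be computed and checked numerically. The sketch below assumes α is given by real entries a, b, c, d with ad − bc = 1; the function names `decompose` and `recompose` are hypothetical.

```python
import cmath
import math


def decompose(a, b, c, d):
    """Return (z, theta) with z = alpha(i) and theta = -arg(ci + d)."""
    z = (a * 1j + b) / (c * 1j + d)
    theta = -cmath.phase(c * 1j + d)
    return z, theta


def recompose(z, theta):
    """Return the 2x2 matrix h_z * k_theta as nested tuples."""
    x, y = z.real, z.imag
    s = y ** -0.5
    h = ((s * y, s * x), (0.0, s))
    k = ((math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta)))
    return tuple(
        tuple(sum(h[i][m] * k[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )


alpha = (2.0, 1.0, 3.0, 2.0)       # det = 2*2 - 1*3 = 1
z, theta = decompose(*alpha)
print(z, theta)
print(recompose(z, theta))         # recovers ((2, 1), (3, 2)) up to rounding
```

For this α one finds z = (8 + i)/13, so Im(z) = 1/|ci + d|^2 = 1/13, consistent with the formula for Im(αz) in §1.1.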

We now define a measure on SL2(R) by

\[ d\alpha = \frac{dx\,dy\,d\theta}{2\pi y^2}. \]

This measure is SL2(R)-invariant: one can check that dα = d(αβ) = d(βα) for all α, β ∈ SL2(R);
moreover, we have chosen the scaling so that dα is compatible with dv.

Theorem 1.8. SL2(R) is unimodular, and for any measurable f : H → C we have

\[ \int_{\mathbf{H}} f(z)\,dv(z) = \int_{\mathrm{SL}_2(\mathbb{R})} f(\alpha i)\,d\alpha. \]

Proof. This is [1, Thm. 1.4.5].

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #2 2/7/2024

2 Topological group actions and Fuchsian groups


This is a summary of material from §33–34 of [3] presented in lecture, with proofs omitted.

2.1 Geodesic spaces



Definition 2.1. Let X be a metric space with distance d : X × X → R≥0. An isometry g : X → X is a
distance preserving homeomorphism; the isometries of X form a group Isom(X) under composition.

In Lecture 1 we saw that the orientation-preserving isometries of the upper half-plane H form a
group isomorphic to SL2(R)/{±1}.
In any metric space X we can define the length of a path φ : [0, 1] → X as the supremum
over all finite subdivisions 0 = t_0 < t_1 < ··· < t_n < t_{n+1} = 1 of [0, 1] of the sum

\[ \sum_{i=0}^{n} d(\varphi(t_i), \varphi(t_{i+1})) \]

(which need not be finite). Note that this length depends only on the image of φ, which we
denote C_{x_0→x_1}, where x_0 = φ(0) and x_1 = φ(1). Conversely, given a function ℓ assigning
lengths to paths C_{x_0→x_1} in X, we can define

\[ d(x, y) = \inf_{C_{x \to y}} \ell(C_{x \to y}), \]

and if the infimum exists for all x, y ∈ X, then d is a distance metric (positive on distinct points,
symmetric, and satisfies the triangle inequality), called the intrinsic metric. Metric spaces whose
distance functions can be defined in this way are called length spaces (or path metric spaces).

Definition 2.2. Shortest paths in a length space (those that realize the infimum defining
the intrinsic metric) are called geodesic segments. A geodesic in a length space is the image
of a continuous map φ : R → X for which there exists ε > 0 such that for every t_0, t_1 ∈ R
with t_0 < t_1 < t_0 + ε the image of the restriction of φ to [t_0, t_1] is a geodesic segment. A
geodesic space is a length space in which every pair of distinct points is connected by a geodesic
segment, and if this geodesic segment is always unique, then X is a uniquely geodesic space.

In Lecture 1 we saw that H is a uniquely geodesic space. More generally, we have the
following theorem.

Theorem 2.3 (Hopf-Rinow). Every complete locally compact length space is a geodesic space.

Proof. See [1, Prop. I.3.7].

2.2 Haar measures


Definition 2.4. Let X be a locally compact Hausdorff space. The σ-algebra Σ of X is the collec-
tion of subsets of X generated by the open and closed sets under countable unions and countable
intersections. Its elements are Borel sets, or measurable sets. A Borel measure on X is a count-
ably additive function
µ: Σ → R≥0 ∪ {∞}.
A Radon measure on X is a Borel measure on X that additionally satisfies

(i) µ(S) < ∞ if S is compact,


(ii) µ(S) = inf{µ(U) : S ⊆ U, U open},
(iii) µ(S) = sup{µ(C) : C ⊆ S, C compact},



for all Borel sets S.¹

Definition 2.5. A (left) Haar measure µ on a locally compact Hausdorff group G is a nonzero
Radon measure that is translation invariant: µ(S) = µ(gS) for all g ∈ G and measurable S ⊆ G.
Replacing µ(gS) with µ(S g) yields a right Haar measure, and if µ is both a left and a right Haar
measure then µ and G are said to be unimodular.

It follows from the left and right invariance of the measure dα on SL2(R) defined in Lecture 1
that SL2(R) is unimodular. We used the homeomorphism SL2(R)/SO2(R) ≅ H defined by
gSO2(R) ↦ gi to define this measure. But SL2(R) ⊆ M2(R) ≅ R^4 inherits a natural metric space
structure with distance function d(α, β) = ‖α − β‖, where

\[ \left\| \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right\|^2 = a^2 + b^2 + c^2 + d^2, \]

which is related to the distance metric d on H via

\[ \|\alpha\|^2 = 2 \cosh d(i, \alpha i). \]
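The relation between ‖α‖ and the hyperbolic distance can be spot-checked numerically. The sketch below assumes the standard closed form cosh d(z, w) = 1 + |z − w|^2 / (2 Im(z) Im(w)) for the distance on H, which is not stated in these notes; the helper names are hypothetical.

```python
import math


def hyp_dist(z, w):
    """Hyperbolic distance on H, assuming cosh d = 1 + |z - w|^2 / (2 Im z Im w)."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))


def norm_sq(alpha):
    """Squared norm a^2 + b^2 + c^2 + d^2 of alpha = (a, b, c, d)."""
    return sum(t * t for t in alpha)


def act_on_i(alpha):
    """Image of i under the linear fractional transformation of alpha."""
    a, b, c, d = alpha
    return (a * 1j + b) / (c * 1j + d)


for alpha in [(1.0, 0.0, 0.0, 1.0), (2.0, 1.0, 3.0, 2.0), (3.0, 0.0, 0.0, 1.0 / 3.0)]:
    lhs = norm_sq(alpha)
    rhs = 2 * math.cosh(hyp_dist(1j, act_on_i(alpha)))
    print(alpha, lhs, rhs)   # the two values agree up to rounding
```

All three sample matrices have determinant 1; the identity gives ‖α‖^2 = 2 = 2 cosh(0), as expected since it fixes i.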

Theorem 2.6 (Weil). Every locally compact Hausdorff group G has a Haar measure µ, and if µ′
is any other Haar measure on G then µ′ = λµ for some λ ∈ R>0.

Proof. See [2, §7.2].

2.3 Topological group actions


Recall that a covering map f : X → Y is a continuous surjection for which every y ∈ Y has an
open neighborhood V whose preimage f^{-1}(V) is a disjoint union of open U_α ⊆ X on which
f restricts to a homeomorphism U_α → V, in which case X is a covering space of Y. A deck
transformation of f is a homeomorphism φ : X → X that preserves f, meaning f ∘ φ = f.
The deck transformations form a group Deck(f), which acts on X, with each Deck(f)-orbit lying
in a fiber of f. If we actually have Deck(f)\X ≅ Y, then f is a normal covering map.

Definition 2.7. The action of a topological group G on a topological space X is a covering space
action if every x ∈ X has an open neighborhood U for which

ρ(G, U) := {g ∈ G : U ∩ gU ≠ ∅} = {1}.

If the action of G is a covering space action, then the quotient map π : X → G\X is a normal
covering map and G ⊆ Deck(π).

Covering space actions are free actions, since G_x ⊆ ρ(G, U) for every open U ∋ x, but the
actions we are most interested in have non-trivial stabilizers and cannot be covering space actions.
This leads to the following definition.

Definition 2.8. The action of a topological group G on a topological space X is wandering if


every x ∈ X has an open neighborhood U for which ρ(G, U) is finite.

Lemma 2.9. For a Hausdorff group G acting on a Hausdorff space X the following are equivalent:

(i) The action of G is wandering;


(ii) Every x ∈ X has an open neighborhood U for which ρ(G, U) = G_x is finite.

If G acts freely these equivalent conditions hold if and only if the G-action is a covering space action.

Proof. This is [3, Lemma 34.3.6].

¹ Some authors additionally require X to be σ-compact (a countable union of compact sets); all the X we shall
consider are σ-compact.

Lemma 2.10. For a Hausdorff group G acting on a Hausdorff space X the following are equivalent:

(i) G\X is Hausdorff;


(ii) If Gx ≠ Gy then there are open U ∋ x and V ∋ y for which gU ∩ V = ∅ for all g ∈ G;
(iii) The set {(x, gx) : g ∈ G, x ∈ X} ⊆ X × X is closed.

Proof. This is [3, Lemma 34.4.1].

Recall that a continuous map f : X → Y is quasiproper if inverse images of compact sets are
compact, and if it is also a closed map then f is proper.

Lemma 2.11. Every quasiproper continuous map of locally compact Hausdorff spaces is proper.

Proof. See [3, Lemma 34.4.5].

Definition 2.12. A topological group G acts properly on a topological space X if the action map
λ : G × X → X × X defined by (g, x) ↦ (x, gx) is proper.

Proposition 2.13. Let G be a Hausdorff group acting properly on a Hausdorff space X . Then the
following hold:

(i) G\X is Hausdorff;


(ii) Every G-orbit Gx is closed;
(iii) The map G/G_x → Gx defined by gG_x ↦ gx is a homeomorphism;
(iv) Every stabilizer G_x is compact.

Proof. (i)–(iii) follow from Lemma 2.10, since λ(G × X) is closed, and (iv) follows from the fact
that the singleton set {(x, x)} ⊆ X × X is compact, so its preimage G_x × {x} = λ^{-1}(x, x) under the
proper map λ is compact, and therefore so is G_x ≅ G_x × {x}.

Theorem 2.14. Let G be a Hausdorff group acting on a locally compact Hausdorff space X . The
following are equivalent:

(i) G is discrete and the G-action is proper;


(ii) For every compact K ⊂ X the set {g ∈ G : K ∩ gK ≠ ∅} is finite;
(iii) For every compact K, L ⊂ X the set {g ∈ G : K ∩ gL ≠ ∅} is finite;
(iv) For all x, y ∈ X there exist open U ∋ x, V ∋ y for which {g ∈ G : U ∩ gV ≠ ∅} is finite.

These four equivalent conditions imply the following two:

(v) The G-action is wandering;


(vi) Every orbit Gx is discrete and every stabilizer G_x is finite.

When X is metrizable and G acts via isometries, all six conditions are equivalent.

Proof. This is [3, Theorem 34.5.1].



2.4 Fuchsian groups
Proposition 2.15. Let Γ be a subgroup of SL2 (R) or PSL2 (R). The action of Γ on H is wandering
if and only if Γ is discrete.

Proof. The action of Γ ⊆ SL2(R) is the same as that of its image in PSL2(R), and Γ is discrete
if and only if its image in PSL2(R) is discrete. For Γ ⊆ PSL2(R) see [3, Proposition 34.7.2].

Definition 2.16. A Fuchsian group is a discrete subgroup of SL2 (R) or PSL2 (R).

References
[1] Martin R. Bridson and André Haefliger, Metric spaces of non-positive curvature, Springer,
1999.

[2] Joe Diestel and Angela Spalsbury, The Joys of Haar Measure, Amer. Math. Society, 2014.

[3] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #3 2/12/2024

3 Fuchsian groups and Dirichlet domains


This is a summary of the material from §34 and §36–37 of [2] presented in lecture.

3.1 Manifolds and orbifolds


Recall that a (topological) manifold is a second countable Hausdorff space X that is locally
homeomorphic to R^n, for some fixed n ≥ 1 (the dimension of the manifold). This means that every
x ∈ X has an open neighborhood U equipped with a homeomorphism φ : U → φ(U) ⊆ R^n.
The map φ is a chart, and an open cover of charts is an atlas. For every pair of overlapping
charts φ_1 : U_1 → R^n and φ_2 : U_2 → R^n, meaning U_1 ∩ U_2 ≠ ∅, we have a homeomorphic
transition map

\[ \varphi_{12} = \varphi_2 \circ \varphi_1^{-1} : \varphi_1(U_1 \cap U_2) \xrightarrow{\ \sim\ } \varphi_2(U_1 \cap U_2) \]

between two open subsets of R^n. If X has an atlas for which all the transition maps are smooth
(continuous partial derivatives of all orders), then X is a smooth manifold. Replacing R^n with C^n
and “smooth” with “holomorphic” yields a complex manifold. A Riemann surface is a complex
manifold of dimension one. Every compact Riemann surface can be realized as an algebraic
curve.
A real Lie group is a topological group G that is also a smooth manifold for which the maps
(g, h) 7→ gh and g 7→ g −1 are smooth; a complex Lie group is defined similarly.
Theorem 3.1 (Riemann uniformization theorem). Every simply connected Riemann surface is
isomorphic to either the Riemann sphere P1 (C), the complex plane C, or the upper half-plane H.
The universal cover X̃ of a compact Riemann surface X is simply connected, which means
that X ≅ G\X̃, where G = π_1(X, x_0) is the fundamental group of X acting on X̃ as a covering
space action. This leads to the following possibilities:
• X̃ = P^1(C) and X ≅ P^1(C) (elliptic);
• X̃ = C and X ≅ C or X ≅ C/Z or X ≅ C/(Z + τZ) with τ ∈ H (parabolic);
• X̃ = H and X ≅ Γ\H with Γ a torsion-free Fuchsian group (hyperbolic).
Recall that a Fuchsian group is a discrete subgroup of PSL2(R), which need not be torsion free!
Remark 3.2. It will often be convenient to work with subgroups Γ of SL2 (R) rather than PSL2 (R),
but we can ignore the trivial action of −1 (so when we speak of torsion, we always refer to the
image of Γ in PSL2 (R)).
Definition 3.3. An orbifold is a second countable Hausdorff space X locally homeomorphic to
G\R^n for some finite group G acting continuously on R^n. An atlas for an orbifold is an open cover
of charts U_i, closed under finite intersection, and a corresponding collection of open V_i ⊆ R^n
equipped with the continuous action of a finite group G_i and a homeomorphism φ_i : U_i → G_i\V_i
such that whenever U_i ⊆ U_j there exists an injective homomorphism f_{ij} : G_i ↪ G_j and a
G_i-equivariant gluing map ψ_{ij} : V_i ↪ V_j for which ψ_{ij} ∘ φ_i = φ_j. If the gluing maps and G-action
are C^∞-smooth we have a smooth orbifold, and replacing “C^∞-smooth” with “holomorphic”
yields a complex orbifold.
If X is a manifold with a proper action by a discrete group G, we can make G\X into an
orbifold using an atlas defined by neighborhoods U ∋ x chosen so that {g ∈ G : U ∩ gU ≠ ∅} =
G_x is finite; orbifolds that arise in this way are good. If Γ ⊆ SL2(R) is a Fuchsian group, then
Γ\H is a good complex orbifold of dimension 1 (and also a good 2-orbifold, a real orbifold of
dimension 2).



3.2 Dirichlet domains
For a group G acting on a topological space X , we call an element of G trivial if it acts trivially
on every element of X . For example, ±1 are the trivial elements of SL2 (R) (as noted above, we
do not require that G act faithfully).

Definition 3.4. A fundamental set for a group G acting on a topological space X is a subset
S ⊆ X, with interior S°, for which (i) the closure of S° is S, (ii) GS = X, (iii) S° ∩ (gS)° = ∅ for all
nontrivial g ∈ G.

Let X be a complete, locally compact geodesic space with metric d, and let Γ be a discrete
group of isometries acting properly on X.

Definition 3.5. The Dirichlet domain for Γ centered at x_0 ∈ X is the set

D(x_0) := D(Γ; x_0) := {x ∈ X : d(x_0, x) ≤ d(x_0, γx) for all γ ∈ Γ}.

For x_1, x_2 ∈ X the Leibniz half-space is the set

H(x_1, x_2) := {x ∈ X : d(x_1, x) ≤ d(x_2, x)},

with H(x_1, x_2) = X if x_1 = x_2, and D(x_0) = ⋂_γ H(x_0, γx_0) is an intersection of Leibniz
half-spaces over γ ∈ Γ − Γ_{x_0}. For x_1 ≠ x_2 the boundary

L(x_1, x_2) := H(x_1, x_2) − H(x_1, x_2)° = {x ∈ X : d(x_1, x) = d(x_2, x)}

is the hyperplane bisector separating x_1 and x_2, and H(x_1, x_2)° = {x ∈ X : d(x_1, x) < d(x_2, x)}
is the interior of H(x_1, x_2).
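The defining inequalities of a Dirichlet domain can be tested directly for a concrete group. The sketch below is an illustration only: it takes Γ = SL2(Z) acting on the upper half plane (an example chosen here, not taken from the text), centers the domain at x_0 = 2i, and checks the inequality d(x_0, x) ≤ d(x_0, γx) only for the finitely many γ with entries bounded by a cutoff, so a positive answer means membership in a finite intersection of half-spaces rather than a proof of membership in D(x_0). The distance formula and all function names are assumptions of this example.

```python
import math
from itertools import product


def hyp_dist(z, w):
    """Hyperbolic distance on H, assuming cosh d = 1 + |z - w|^2 / (2 Im z Im w)."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))


def small_sl2z(bound):
    """All integer matrices (a, b, c, d) with ad - bc = 1 and entries in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return [(a, b, c, d) for a, b, c, d in product(rng, repeat=4) if a * d - b * c == 1]


def act(g, z):
    """Linear fractional action of g = (a, b, c, d) on z."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)


def in_truncated_dirichlet(x, x0, gammas, tol=1e-12):
    """Check d(x0, x) <= d(x0, gamma x) for every gamma in the finite list gammas."""
    base = hyp_dist(x0, x)
    return all(base <= hyp_dist(x0, act(g, x)) + tol for g in gammas)


gammas = small_sl2z(2)
x0 = 2j
for x in [0.2 + 1.5j, 0.7 + 1.5j, 0.1 + 0.5j]:
    print(x, in_truncated_dirichlet(x, x0, gammas))
```

The first point passes every tested inequality; the second fails because the translation z ↦ z − 1 moves it closer to 2i, and the third fails because z ↦ −1/z does.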

Remark 3.6. Hyperplane bisectors need not be geodesics, but hyperplane bisectors in H are.

Definition 3.7. A set S ⊆ X is convex if it contains every geodesic segment connecting points
in S, and star-shaped with respect to x 0 ∈ S if it contains every geodesic segment connecting a
point in S to x 0 .

Lemma 3.8. Every Leibniz half-space H(x_1, x_2) in X is star-shaped with respect to x_1.

Proof. Consider any x ∈ H(x_1, x_2) and any y on a geodesic segment from x to x_1. Then
d(x_1, y) + d(y, x) = d(x_1, x). If y ∉ H(x_1, x_2) then d(x_2, y) < d(x_1, y) and

d(x_2, x) ≤ d(x_2, y) + d(y, x) < d(x_1, y) + d(y, x) = d(x_1, x),

but this contradicts x ∈ H(x_1, x_2).

Corollary 3.9. The Dirichlet domain D(x 0 ) is closed and star-shaped with respect to x 0 .

Proof. D(x 0 ) is an intersection of Leibniz half-spaces that are closed and star-shaped.

Lemma 3.10. If x_0 ∈ S ⊆ X is bounded then the set {γ ∈ Γ : S ⊄ H(x_0, γx_0)} is finite.

Proof. If S is bounded then r := sup{d(x_0, x) : x ∈ S} < ∞. The orbit Γx_0 ⊆ X is discrete
and the stabilizer Γ_{x_0} ⊆ Γ is finite, since Γ is discrete and acts properly on X. Thus only finitely
many γ ∈ Γ satisfy d(x_0, γx_0) ≤ 2r. For every other γ ∈ Γ and all x ∈ S we have

d(x, γx_0) ≥ d(x_0, γx_0) − d(x, x_0) > 2r − r = r ≥ d(x, x_0),

which implies x ∈ H(x_0, γx_0).



Corollary 3.11. We have D(x_0)° = {x ∈ D(x_0) : d(x_0, x) < d(x_0, γx) for all γ ∈ Γ − Γ_{x_0}}, and
D(x_0) is the closure of D(x_0)°.

Proof. For any x ∈ D(x_0) we can pick a bounded open U ∋ x for which

\[ U \cap D(x_0) = U \cap \bigcap_{\gamma \in F} H(x_0, \gamma x_0), \]

where F is the finite set of Lemma 3.10 for S = U with elements of Γ_{x_0} removed. Therefore

\[ U \cap D(x_0)^\circ = U \cap \bigcap_{\gamma \in F} H(x_0, \gamma x_0)^\circ. \]

For γ ∈ Γ − (F ∪ Γ_{x_0}) we have U ⊆ H(x_0, γx_0)°, and if V is the set on the RHS of the corollary,

\[ U \cap D(x_0)^\circ = U \cap \bigcap_{\gamma \in \Gamma - \Gamma_{x_0}} H(x_0, \gamma x_0)^\circ = U \cap V, \]

since H(x_0, γx_0)° = {x ∈ X : d(x, x_0) < d(x, γx_0)} for γ ∈ Γ − Γ_{x_0}.

This implies the first statement in the corollary, and the second then follows, since the closure
of D(x_0)° is ⋂_{γ ∈ Γ − Γ_{x_0}} H(x_0, γx_0) = D(x_0).

Definition 3.12. S ⊆ X is locally finite if for every compact C ⊆ X the set {γ ∈ Γ : S ∩ γC ≠ ∅}
is finite.

Lemma 3.13. If S is a locally finite fundamental set then Γ is generated by {γ ∈ Γ : S ∩ γS ≠ ∅}.

Proof. Let Γ∗ ≤ Γ be the subgroup generated by {γ ∈ Γ : S ∩ γS ≠ ∅}. Define f : X → Γ∗\Γ
via x ↦ Γ∗γ, with γ ∈ Γ chosen so γx ∈ S; such a γ exists because S is fundamental, and it
doesn’t matter which one we pick because if γ′x ∈ S then γ′x ∈ S ∩ γ′γ^{-1}S ≠ ∅, which implies
γ′γ^{-1} ∈ Γ∗ and Γ∗γ′ = Γ∗γ.
We claim f is locally constant. Consider any x ∈ X with compact neighborhood C. Since S
is locally finite, we can pick a finite set {γ_i} ⊆ Γ so that C ⊆ ⋃_i γ_i S. By replacing C by its
intersection with the union of the γ_i S that contain x, we can assume x ∈ γ_i S for all i, which
implies that f(x) = Γ∗γ_i^{-1} for all i. Any y ∈ C lies in γ_i S for some i, which means f(y) = f(x).
But X is connected (it is a geodesic space), so f is actually constant, and for any γ ∈ Γ and
any x ∈ S we have Γ∗ = f(x) = f(γ^{-1}x) = Γ∗γ, which implies Γ∗ = Γ.

In the theorem below, by trivial, we mean an element or subgroup of Γ that acts trivially on
every x ∈ X (so for Γ ⊆ SL2(R) this would include ±1).

Theorem 3.14. Let X be a complete, locally compact geodesic space and let Γ be a discrete group
of isometries acting properly on X . If x 0 ∈ X has trivial stabilizer then D(x 0 ) is a locally finite
fundamental set for Γ that is star-shaped with respect to x 0 whose boundary consists of segments
of hyperplane bisectors.

Proof. Let x_0 ∈ X have trivial stabilizer. It follows from Lemma 3.8 that D(x_0) is star-shaped
with respect to x_0, and Corollary 3.11 implies that its boundary is a union of segments of
hyperplane bisectors. We now show that D(x_0) is a fundamental set. Corollary 3.11 implies that
D(x_0) is the closure of its interior D(x_0)°, so (i) holds. Now consider any x ∈ X. Then Γx is discrete and

d(Γx, x_0) = inf{d(γx, x_0) : γ ∈ Γ}



is realized by some γx ∈ D(x_0). It follows that D(x_0) intersects every Γ-orbit, so X = ΓD(x_0)
and (ii) holds. Finally, if x ∈ D(x_0)° and x ≠ γx ∈ D(x_0)°, then

d(x, x_0) < d(γx, x_0) ≤ d(γ^{-1}γx, x_0) = d(x, x_0),

a contradiction, so (iii) holds.

To show that D(x_0) is locally finite, it suffices to check a bounded closed disc C ⊆ X with
center x_0 and radius r ≥ 0. If γC ∩ D(x_0) ≠ ∅ then for some x ∈ D(x_0) we have d(x_0, γx) ≤ r
and

d(x_0, γx_0) ≤ d(x_0, γx) + d(γx, γx_0) ≤ r + d(x, x_0),

but x ∈ D(x_0) implies d(x, x_0) = d(x_0, x) ≤ d(x_0, γx) ≤ r, so d(x_0, γx_0) ≤ r + r = 2r. This can
hold for only finitely many γ, since Γx_0 is discrete, so D(x_0) is locally finite.

3.3 Hyperbolic space


Hyperbolic space H3 is the set

C × R>0 = {(x, y) = (x_1 + x_2 i, y) ∈ C × R : y > 0},

equipped with the metric induced by the hyperbolic length element and volume

\[ ds = \frac{\sqrt{dx_1^2 + dx_2^2 + dy^2}}{y}, \qquad dV = \frac{dx_1\,dx_2\,dy}{y^3}. \]

Its boundary is the sphere at infinity P^1(C), analogous to the circle at infinity P^1(R) that forms
the boundary of the hyperbolic plane H2 := H.
The group SL2(C) acts on P^1(C) via linear fractional transformations, and we extend this to
hyperbolic space by embedding C × R>0 in the Hamiltonians H = C + Cj (where jz = z̄j for
z ∈ C ⊆ H) via (x, y) ↦ x + yj and defining

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} z = (az + b)(cz + d)^{-1} \]

for z = x + yj ∈ H. One can then show that PSL2(C) acts faithfully and transitively on H3 via
isometries, and that every orientation preserving isometry of H3 corresponds to the action of
an element of PSL2(C) (and in general Isom(H3) ≅ PSL2(C) ⋊ Z/2Z), just as every orientation
preserving isometry of H2 corresponds to the action of an element of PSL2(R) ≅ PGL2^+(R) (and
in general, Isom(H2) ≅ PGL2(R)).
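The action on H3 can be made concrete by doing the quaternion arithmetic explicitly. The sketch below represents a quaternion u + vj (with u, v ∈ C) as a pair (u, v), uses the rule jw = w̄j to multiply, and interprets the quotient as (az + b)(cz + d)^{-1}; this ordering of the noncommutative quotient, and the names `qmul`, `qinv`, `act_h3`, are assumptions of this example.

```python
def qmul(p, q):
    """Multiply quaternions u + v j (u, v complex) using the rule j w = conj(w) j."""
    u1, v1 = p
    u2, v2 = q
    return (u1 * u2 - v1 * v2.conjugate(), u1 * v2 + v1 * u2.conjugate())


def qinv(q):
    """Inverse of a nonzero quaternion u + v j: its conjugate divided by its squared norm."""
    u, v = q
    n = abs(u) ** 2 + abs(v) ** 2
    return (u.conjugate() / n, -v / n)


def act_h3(g, point):
    """Apply g = (a, b, c, d) in SL2(C) to (x, y) in C x R_{>0}, viewed as z = x + y j."""
    a, b, c, d = (complex(t) for t in g)
    x, y = complex(point[0]), complex(point[1])
    num = (a * x + b, a * y)          # az + b
    den = (c * x + d, c * y)          # cz + d
    u, v = qmul(num, qinv(den))
    # For det(g) = 1 the j-coordinate is real and positive; drop rounding noise.
    return (u, v.real)


g = (1 + 1j, 1, 1j, 1)                # det = (1 + i) - i = 1
p = (0.3 + 0.4j, 2.0)
print(act_h3(g, p))
print(act_h3((0, -1, 1, 0), (0, 2.0)))   # z -> -1/z sends (0, y) to (0, 1/y)
```

Expanding the product shows that the new height is y / (|cx + d|^2 + |c|^2 y^2) when ad − bc = 1, the direct analogue of the formula for Im(αz) on the upper half plane.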

Definition 3.15. A Kleinian group is a discrete subgroup of SL2 (C) or PSL2 (C).

As with Fuchsian groups (discrete subgroups of SL2(R) or PSL2(R)), Kleinian groups act
properly on H3 via isometries.

3.4 Hyperbolic Dirichlet domains


We now specialize to the case where X is either the hyperbolic plane H2 or hyperbolic space H3, which
we note are both complete, locally compact, uniquely geodesic spaces, and Γ is a Fuchsian or
Kleinian group.



Definition 3.16. A fundamental domain for Γ is a connected fundamental set S whose boundary
has measure zero; if the boundary of S is a countable union of geodesic segments then we say
that S has geodesic boundary.

Theorem 3.17. Let Γ be a Fuchsian group acting on X = H2 or a Kleinian group acting on X = H3 .


Let z0 ∈ X have stabilizer Γ0 and let u0 ∈ X have trivial stabilizer in Γ0 . Then

D(Γ ; z0 ) ∩ D(Γ0 ; u0 )

is a convex locally finite fundamental domain for Γ with geodesic boundary.

Proof. We first note that any Dirichlet domain D(x_0) for a Fuchsian or Kleinian group G (including
Γ and Γ_0) is convex with geodesic boundary because it is a countable intersection of Leibniz
half-spaces H(x_0, γx_0) over γ ∈ G; Leibniz half-spaces in H2 or H3 are convex with geodesic
boundary and G is countable (because G is discrete and X is locally compact second countable).
It follows that D(Γ; z_0) ∩ D(Γ_0; u_0) is a convex set with geodesic boundary.
Theorem 3.14 implies that D(Γ_0; u_0) is a locally finite fundamental set for Γ_0, and this implies
that D(Γ; z_0) ∩ D(Γ_0; u_0) is locally finite, since a subset of a locally finite set is locally finite. It
remains only to show that S := D(Γ; z_0) ∩ D(Γ_0; u_0) is a fundamental set; we have already shown
it is convex with geodesic boundary, so this will imply it is a fundamental domain.
The proof of Corollary 3.11 implies that S is the closure of its interior S°, since S is an intersection
of Leibniz half-spaces, so (i) holds. Now consider any z ∈ X, and choose γ ∈ Γ to minimize d(z_0, γz)
and choose γ_0 ∈ Γ_0 so γ_0γz ∈ D(Γ_0; u_0) (note D(Γ_0; u_0) is a fundamental set for Γ_0). Then
d(z_0, γ_0γz) = d(z_0, γz), since γ_0 is an isometry fixing z_0, which by the minimality of γ implies that
γ_0γz ∈ D(Γ; z_0) and therefore in S; thus ΓS = X and (ii) holds. Finally, if γ ∈ Γ is nontrivial,
either γ ∈ Γ_0 and S° ∩ (γS)° = ∅ because D(Γ_0; u_0) is a fundamental set for Γ_0, or γ ∉ Γ_0 and
S° ∩ (γS)° ≠ ∅ implies d(z, z_0) < d(γz, z_0) ≤ d(z, z_0), a contradiction, as in the proof of
Theorem 3.14; so (iii) holds.

Corollary 3.18. Let Γ be a Fuchsian group. If z_0 ∈ H has trivial stabilizer then the Dirichlet domain
D(Γ; z_0) is a connected, convex, locally finite fundamental domain for Γ with geodesic boundary.

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.

[2] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #4 2/14/2024

This is a summary of the material from §1.5, §1.7 of [1] and §37–38 of [2] presented in lecture.

4.1 Cusps and stabilizers


Recall that the action of SL2(R) on the extended upper half plane H̄ := H ∪ P^1(R) ⊆ P^1(C) via
linear fractional transformations stabilizes the boundary ∂H = P^1(R) and is transitive on H and
on ∂H. We classified nontrivial elements α ∈ SL2(R) according to disc(α) = tr(α)^2 − 4 det(α) as

• elliptic if disc(α) < 0, equivalently, α ∈ SL2 (R)z for exactly one z ∈ H and no z ∈ ∂ H;
• parabolic if disc(α) = 0, equivalently, α ∈ SL2 (R)z for no z ∈ H and exactly one z ∈ ∂ H;
• hyperbolic if disc(α) > 0, equivalently, α ∈ SL2 (R)z for no z ∈ H and exactly two z ∈ ∂ H.

If α ∈ SL2(R) fixes more than two points in H̄ then α = ±1 is trivial.

Definition 4.1. For Γ ≤ SL2(R) we classify the z ∈ H̄ = H ∪ ∂H as follows:

• z ∈ H̄ is ordinary if Γ_z ⊆ {±1} (in which case Γ_z is trivial);
• z ∈ H is an elliptic point if Γ_z contains an elliptic γ ∈ Γ;
• z ∈ ∂H is a parabolic point or cusp if Γ_z contains a parabolic γ ∈ Γ;
• z ∈ ∂H is a hyperbolic point if Γ_z contains a hyperbolic γ ∈ Γ (so γ ∈ Γ_{z,z′} ⊆ Γ_z for some z′).

For Γ ≤ SL2(R) we let Z(Γ) = Γ ∩ {±1} denote its trivial subgroup, so that Γ/Z(Γ) is isomorphic
to the image of Γ in PSL2(R).

Theorem 4.2. Let Γ ≤ SL2(R) be a Fuchsian group and let z ∈ H̄ = H ∪ ∂H. The following hold:

(i) If z is an elliptic point of Γ then Γ_z is a finite cyclic group that strictly contains Z(Γ);
(ii) If z is a cusp of Γ then Γ_z/Z(Γ) ≅ Z and ±Γ_z is SL2(R)-conjugate to \( \pm\left\langle \left(\begin{smallmatrix} 1 & h \\ 0 & 1 \end{smallmatrix}\right) \right\rangle \) with h > 0;
(iii) If z is a hyperbolic point of Γ and Γ_z ⊇ Γ_{z,z′} ⊋ Z(Γ) (for some z′ ≠ z) then Γ_{z,z′}/Z(Γ) ≅ Z
and ±Γ_{z,z′} is SL2(R)-conjugate to \( \pm\left\langle \left(\begin{smallmatrix} u & 0 \\ 0 & 1/u \end{smallmatrix}\right) \right\rangle \) with u > 0.


Proof. (i) SL2(R)_z is conjugate to SL2(R)_i = SO2(R) ≅ S^1, a compact abelian group. Since Γ is
discrete, Γ_z = Γ ∩ SL2(R)_z is finite, hence cyclic, and it contains an elliptic γ ∉ Z(Γ).
(ii) WLOG z = ∞ and Γ_z = Γ_∞ is upper triangular and contains some γ = \(\left(\begin{smallmatrix} 1 & h \\ 0 & 1 \end{smallmatrix}\right)\) with
h ≠ 0 (we can conjugate Γ_z to achieve this). Up to ±1 every element of Γ_z must be of this form:
if α = \(\left(\begin{smallmatrix} a & b \\ 0 & 1/a \end{smallmatrix}\right)\) ∈ Γ_z with |a| < 1 (if |a| > 1, replace α by α^{-1}) then

\[ \alpha^{n} \gamma \alpha^{-n} = \begin{pmatrix} 1 & a^{2n} h \\ 0 & 1 \end{pmatrix} \in \Gamma_z \]

for all n ≥ 1; these elements are distinct and converge to 1, but Γ_z is discrete. Thus
Γ_z/Z(Γ) is isomorphic to a nontrivial discrete subgroup of R, which implies Γ_z/Z(Γ) ≅ Z.
(iii) WLOG z = 0 and Γ_z = Γ_0 ⊇ Γ_{0,∞} ⊋ Z(Γ), so that Γ_{z,z′} = Γ_{0,∞} is diagonal, consisting
of elements of the form \(\left(\begin{smallmatrix} u & 0 \\ 0 & 1/u \end{smallmatrix}\right)\) with u ∈ R^×. It follows that Γ_{z,z′}/Z(Γ) is a nontrivial discrete
subgroup of R^×/{±1}, which implies Γ_{z,z′}/Z(Γ) ≅ Z.
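The conjugation identity used in part (ii) of the proof can be verified with exact rational arithmetic. This is a small sketch with sample values a = 1/2, b = 3, h = 1 chosen purely for illustration; the helper names are hypothetical.

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def inverse(p):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = p
    return ((d, -b), (-c, a))


def conjugate_by_power(alpha, gamma, n):
    """Return alpha^n * gamma * alpha^(-n) for n >= 0 by conjugating n times."""
    result = gamma
    for _ in range(n):
        result = matmul(matmul(alpha, result), inverse(alpha))
    return result


a, b, h = Fraction(1, 2), Fraction(3), Fraction(1)
alpha = ((a, b), (Fraction(0), 1 / a))            # upper triangular, det = 1, |a| < 1
gamma = ((Fraction(1), h), (Fraction(0), Fraction(1)))

for n in range(1, 5):
    # The upper-right entry is a^(2n) * h, which tends to 0 as n grows.
    print(n, conjugate_by_power(alpha, gamma, n)[0][1])
```

The upper-right entries 1/4, 1/16, 1/64, ... accumulate at 0, so the conjugates accumulate at the identity matrix, which is the contradiction with discreteness used in the proof.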

Corollary 4.3. For any Fuchsian group Γ we may partition H̄ = H ∪ ∂H into elliptic points, cusps,
hyperbolic points, and ordinary points. In this partition, H consists of elliptic and ordinary points,
while ∂H consists of cusps, hyperbolic points, and ordinary points.

Proof. The theorem implies that a cusp cannot also be a hyperbolic point.

Corollary 4.4. Let Γ be a Fuchsian group. Every finite index subgroup Γ′ ≤ Γ has the same set of cusps as Γ .

Proof. Any cusp x of Γ′ is a cusp of Γ , since Γ′x ⊆ Γx . If x is a cusp of Γ then Γ′x = Γx ∩ Γ′ has finite
index in Γx , which is infinite, so Γ′x cannot be trivial (it contains a power γⁿ ≠ ±1 of a parabolic
γ ∈ Γx , which is again parabolic) and x is also a cusp of Γ′ .

Lecture by Andrew V. Sutherland


Note that even when Γ′ ≤ Γ has the same set of cusps as Γ it need not have the same cusp orbits
(the cusp orbits of Γ′ will be a refinement of the partition of the cusps of Γ into cusp orbits).

Definition 4.5. For a Fuchsian group Γ and z ∈ H̄ = H ∪ ∂ H, the order of z with respect to Γ is

ez := #Γz /Z(Γ ) ∈ Z>0 ∪ {∞}.

It follows from the theorem above that ez = 1 if z is ordinary, 1 < ez < ∞ if z is elliptic, and
ez = ∞ if z is a cusp or hyperbolic point.

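Corollary 4.6. If −1 ∉ Γ then every elliptic point of Γ has odd order.

Proof. For an elliptic point z, the group ±Γz is a finite cyclic group whose unique element of order 2 is −1,
so if −1 ∉ Γ then Γz ≃ ±Γz /{±1} contains no element of order 2 and therefore has odd order.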

Definition 4.7. Let Γ be a Fuchsian group, let x ∈ ∂ H be a cusp of Γ , and choose σ ∈ SL2 (R) so
that σx = ∞. Then $\pm\sigma\Gamma_x\sigma^{-1} = \pm\left\langle\begin{pmatrix}1&h\\0&1\end{pmatrix}\right\rangle$ with h > 0. If $\begin{pmatrix}1&h\\0&1\end{pmatrix} \in \sigma\Gamma_x\sigma^{-1}$ then x is a regular cusp,
and otherwise x is irregular. The latter can arise only when −1 ∉ Γ , and the regularity of x does
not depend on the choice of σ.

4.2 Adjoining cusps


For a Fuchsian group Γ we let PΓ denote its set of cusps, and define H∗Γ := H ∪ PΓ . We extend the
topology of H to H∗Γ by taking as open neighborhoods of ∞ the sets Ul := {z ∈ H : Im(z) > l} ∪ {∞}
for all l > 0, and we take SL2 (R)-translates σUl as open neighborhoods of σ∞ for each
cusp σ∞ of Γ . Then H∗Γ is a Hausdorff space, on which Γ acts.
This action is not proper, since cusps have infinite stabilizers, but the quotient X Γ := Γ \H∗Γ
is Hausdorff (see [1, Lemma 1.7.7]), and we can give it the structure of a Riemann surface (see
[1, §1.8]). If Γ is a Fuchsian group and X Γ has finite volume (under the measure induced by the
measure dv on H), then X Γ is a compact Riemann surface, by Siegel’s Theorem [1, Thm. 1.9.1],
hence an algebraic curve of some genus g ≥ 0. In this case the number of elliptic points and
cusps is necessarily finite [1, Thm. 1.7.8].

4.3 The modular group


The modular group Γ = SL2 (Z) is a Fuchsian group containing −1, generated by the matrices

$$S := \begin{pmatrix}0&-1\\1&0\end{pmatrix} \quad\text{and}\quad T := \begin{pmatrix}1&1\\0&1\end{pmatrix}.$$

A Dirichlet domain for Γ is given by

D(2i) = {z ∈ H : |z| ≥ 1 and Re(z) ∈ [−1/2, 1/2]},

with elliptic points i = ζ4 of order 2, with Γi = 〈S〉, and ω = ζ3 of order 3 with Γω = 〈S T 〉. The
Dirichlet domain D(2i) also contains the elliptic point T ω on its boundary.
The closure of D(2i) in H∗Γ contains a single cusp ∞ with Γ∞ = 〈±T 〉 and PΓ = Γ ∞ = P1 (Q).
Then Y (1) := YΓ := Γ \H ' C and X (1) := X Γ := Γ \H∗Γ ' P1 (C).
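The statements about S, T and the elliptic points can be checked numerically. The following is a minimal sketch (the helper names `matmul` and `mobius` are illustrative, not from the notes): it verifies that S² = (ST)³ = −1, that S fixes i, that ST fixes ω = ζ3, and that Tω lies on the unit circle, which is part of the boundary of D(2i).

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def mobius(A, z):
    """Apply the linear fractional transformation of A to z."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))
MINUS_ONE = ((-1, 0), (0, -1))

ST = matmul(S, T)
S_squared = matmul(S, S)               # S has order 4 in SL2(Z)
ST_cubed = matmul(ST, matmul(ST, ST))  # ST has order 6 in SL2(Z)

i = complex(0, 1)
omega = complex(-0.5, 3 ** 0.5 / 2)    # zeta_3 = exp(2*pi*i/3)
T_omega = mobius(T, omega)             # omega + 1 = zeta_6
```

Since S and ST have orders 4 and 6 in SL2 (Z), their images in PSL2 (Z) have orders 2 and 3, matching the orders of the elliptic points i and ω.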
Every subgroup of Γ = SL2 (Z) is also a Fuchsian group, but we will only be interested in those
that have finite index. An important example of such groups is given by congruence subgroups,
which are subgroups of SL2 (Z) that contain the group

Γ (N ) := {γ ∈ SL2 (Z) : γ ≡ 1 mod N }.

18.786 Spring 2024, Lecture #4, Page 2


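Two particularly important examples of congruence subgroups are

$$\Gamma_0(N) := \left\{\gamma \in \mathrm{SL}_2(\mathbb{Z}) : \gamma \equiv \begin{pmatrix}*&*\\0&*\end{pmatrix} \bmod N\right\} \quad\text{and}\quad \Gamma_1(N) := \left\{\gamma \in \mathrm{SL}_2(\mathbb{Z}) : \gamma \equiv \begin{pmatrix}1&*\\0&1\end{pmatrix} \bmod N\right\}.$$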

We note that not all finite index subgroups of SL2 (Z) are congruence subgroups.
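As a concrete illustration of these definitions, the membership conditions for Γ (N ), Γ1 (N ), and Γ0 (N ) can be written as simple predicates on integer matrices. This is a sketch with hypothetical function names; matrices are represented as nested tuples ((a, b), (c, d)).

```python
def in_SL2Z(g):
    """True if g is an integer 2x2 matrix of determinant 1."""
    (a, b), (c, d) = g
    return all(isinstance(x, int) for x in (a, b, c, d)) and a * d - b * c == 1

def in_Gamma(g, N):
    """Principal congruence subgroup: g is congruent to the identity mod N."""
    (a, b), (c, d) = g
    return in_SL2Z(g) and (a - 1) % N == 0 and b % N == 0 and c % N == 0 and (d - 1) % N == 0

def in_Gamma1(g, N):
    """g is congruent to an upper unitriangular matrix mod N."""
    (a, b), (c, d) = g
    return in_SL2Z(g) and c % N == 0 and (a - 1) % N == 0 and (d - 1) % N == 0

def in_Gamma0(g, N):
    """g is congruent to an upper triangular matrix mod N."""
    (a, b), (c, d) = g
    return in_SL2Z(g) and c % N == 0

g = ((2, 1), (5, 3))   # lies in Gamma0(5) but not in Gamma1(5)
T = ((1, 1), (0, 1))   # lies in Gamma1(5) but not in Gamma(5)
T5 = ((1, 5), (0, 1))  # lies in Gamma(5)
S = ((0, -1), (1, 0))  # lies in none of them for N = 5
```

The examples reflect the chain of inclusions Γ (N ) ⊆ Γ1 (N ) ⊆ Γ0 (N ) ⊆ SL2 (Z), each of which is strict for N = 5.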

4.3.1 Quaternion algebras


Definition 4.8. Let k be a field whose characteristic is not 2. A quaternion algebra $B := \left(\frac{a,b}{k}\right)$
over k is a (unital associative) k-algebra that has a k-basis of the form {1, i, j, i j} with

i² = a, j² = b, i j = − ji

for some a, b ∈ k×.

A quaternion algebra is either a division ring (nonsplit) or isomorphic to M2 (k) (split). The
standard involution on B is the map α ↦ ᾱ that sends α = t + x i + y j + zi j to ᾱ = t − x i − y j − zi j.
The (reduced) norm of α ∈ B is N (α) = αᾱ = t² − a x² − b y² + a bz² ∈ k, which is multiplicative
and nonzero precisely when α is a unit in B. The set B¹ of norm-1 elements forms a subgroup of
the unit group B×.
Let $B = \left(\frac{a,b}{\mathbb{Q}}\right)$ be a quaternion algebra over Q. We say that a place p of Q is ramified in B if
B ⊗Q Qp is a division ring, and unramified if B ⊗Q Qp is split. We call B definite if it is ramified at
the archimedean place of Q, equivalently, $B_\infty := B \otimes_{\mathbb{Q}} \mathbb{R} \simeq \mathbb{H} = \left(\frac{-1,-1}{\mathbb{R}}\right)$ (the Hamilton quaternions), and indefinite otherwise.
The quaternion algebra $\left(\frac{a,b}{\mathbb{Q}}\right)$ is definite if and only if a, b < 0.
The set of ramified places of B is finite of even cardinality; conversely, for every finite set
Σ of places of Q of even cardinality, there is a quaternion algebra ramified exactly at the
places in Σ, which is unique up to isomorphism (this is true more generally for quaternion
algebras over any global field, see [2, Theorem 14.6.1]). The discriminant of B is the product
of the primes that ramify in B, which we negate when ∞ is ramified; the discriminant uniquely
determines the isomorphism class of a quaternion algebra over Q (and in fact |disc(B)| does).
If B is an indefinite quaternion algebra over Q, equivalently disc(B) > 0, we can embed B in
M2 (R) via B ↪ B∞ ≃ M2 (R) by writing $B = \left(\frac{a,b}{\mathbb{Q}}\right)$ with a > 0 and defining

$$\iota_\infty\colon B \hookrightarrow M_2(\mathbb{Q}(\sqrt{a})) \subseteq M_2(\mathbb{R}), \qquad t + xi + yj + zij \mapsto \begin{pmatrix} t + x\sqrt{a} & y + z\sqrt{a} \\ b(y - z\sqrt{a}) & t - x\sqrt{a} \end{pmatrix}.$$

For any α = t + x i + y j + zi j ∈ B we then have

det ι∞ (α) = t² − a x² − b y² + a bz² = N (α),

thus ι∞ restricts to group homomorphisms B× ↪ GL2 (R) and B¹ ↪ SL2 (R).
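The embedding ι∞ can be sanity-checked numerically. The sketch below uses floating-point arithmetic and the illustrative parameters a = 3, b = −1 (chosen here only so that a > 0; they are an assumption of this example, as are the function names). It checks that the images of i and j satisfy the defining relations and that the determinant agrees with the reduced norm.

```python
import math

A_COEFF, B_COEFF = 3, -1  # illustrative parameters with A_COEFF > 0

def iota(t, x, y, z):
    """Image of t + x*i + y*j + z*ij under the embedding into M2(R)."""
    r = math.sqrt(A_COEFF)
    return ((t + x * r, y + z * r), (B_COEFF * (y - z * r), t - x * r))

def norm(t, x, y, z):
    """Reduced norm t^2 - a x^2 - b y^2 + a b z^2."""
    return t * t - A_COEFF * x * x - B_COEFF * y * y + A_COEFF * B_COEFF * z * z

def det(M):
    (p, q), (r, s) = M
    return p * s - q * r

def matmul(M, N):
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def scale(c, M):
    return tuple(tuple(c * entry for entry in row) for row in M)

def close(M, N, tol=1e-9):
    return all(abs(m - n) < tol for rm, rn in zip(M, N) for m, n in zip(rm, rn))

ONE = iota(1, 0, 0, 0)
I = iota(0, 1, 0, 0)
J = iota(0, 0, 1, 0)
IJ = iota(0, 0, 0, 1)
```

With exact arithmetic in Q(√a) the same identities hold on the nose; floats are used only to keep the sketch short.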


An order O in a quaternion algebra B over Q is a subring that is finitely generated as a Z-
module with O ⊗Z Q = B. When B is indefinite, its norm-1 unit group O¹ := O ∩ B¹ embeds via ι∞ as a
discrete subgroup of SL2 (R), hence a Fuchsian group.
Consider the indefinite quaternion algebra $B = \left(\frac{-1,3}{\mathbb{Q}}\right)$ of discriminant 6. A maximal order in B is

$$\mathcal{O} := \mathbb{Z} \oplus \mathbb{Z}i \oplus \mathbb{Z}j \oplus \mathbb{Z}k, \qquad k = \frac{1 + i + j + ij}{2}.$$

The Fuchsian group Γ := ι∞ (O¹) has no cusps, and X Γ = YΓ = Γ \H = X (Γ ) is a good compact
complex 1-orbifold.



References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.

[2] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #5 2/20/2024

5.1 Lattices in locally compact groups


Let G be a locally compact (Hausdorff) group equipped with a left Haar measure µ. Then G
contains an open set U of finite nonzero measure: pick g ∈ U ⊆ C with U open and C compact, so that
µ(U) ≤ µ(C) < ∞. Finitely many G-translates of U cover any given compact set, so if µ(U) = 0 then
every compact set has measure zero (by subadditivity of µ), and then µ = 0, a contradiction.
Definition 5.1. Let U be an open set with 0 < µ(U) < ∞. The Haar modulus for G (also called
the modular function for G) is the continuous group homomorphism

$$\Delta_G\colon G \to \mathbb{R}^\times_{>0}, \qquad g \mapsto \frac{\mu(Ug)}{\mu(U)},$$

which does not depend on the choice of U (since µ is unique up to scaling).
Note that G is unimodular (meaning µ is also a right Haar measure) precisely when ∆G (G) =
{1}. The center of G necessarily lies in the kernel of ∆G , as does its commutator subgroup and
all torsion elements, since R×>0 is abelian and torsion free. This implies that all perfect groups are
unimodular (including SLn (R) for all n ≥ 1), as are all compact groups, since ∆G is continuous
and the only compact subgroup of R×>0 is the trivial group. For n > 1 the subgroup of upper
triangular matrices in GLn (R) is an example of a locally compact group that is not unimodular.
Now let H be a closed subgroup of G. Then the (left) coset space G/H is a locally compact
Hausdorff space with a continuous (left) G-action.
Definition 5.2. A (left) Haar measure µ on G/H is a Radon measure with µ(gS) = µ(S) for
every S ∈ Σ(G/H). If µ(G/H) < ∞ then µ is finite. When G/H admits a finite Haar measure
we say that H is cofinite.
Definition 5.3. A lattice in a locally compact group G is a discrete cofinite subgroup Γ .
As shown in Problem 3 of Problem Set 2, the lattices in SL2 (R) are the finitely generated
Fuchsian groups of the first kind (they satisfy both Λ(Γ ) = P1 (R) and X Γ := H∗Γ /Γ compact).
Lemma 5.4. Let Γ be a discrete subgroup of a second countable locally compact group G. Then Γ is
countable and G contains a measurable set of unique Γ -coset representatives with nonzero measure.
Proof. Let π: G → G/Γ be the projection map. Let U be an open neighborhood of the identity
1 ∈ G such that U ∩ Γ = {1}, let A × B be an open neighborhood of (1, 1) contained in the inverse
image of U under the multiplication map G × G → G, with A, B ⊆ G open, and let
V := (A ∩ B)⁻¹ ∩ (A ∩ B). Then V is an open neighborhood of the identity with V⁻¹V ∩ Γ = {1}.
For any g ∈ G the restriction of π to g V is injective, since for v1 , v2 ∈ V , if π(g v1 ) = π(g v2 ) then
g v1 γ1 = g v2 γ2 for some γ1 , γ2 ∈ Γ and then $\gamma_2\gamma_1^{-1} = v_2^{-1}v_1 \in V^{-1}V \cap \Gamma = \{1\}$.
Let {Un }n≥1 be a countable subcover of the open cover {g V : g ∈ G} of G (second countable implies Lindelöf)
with µ(U1 ) > 0; then π restricts to an injection on each Un and Γ is countable. Now let F1 := U1
and define
Fn := Un − Un ∩ π⁻¹(π(U1 ∪ · · · ∪ Un−1 ))
for n > 1. Then π restricts to an injection on each Fn ⊆ Un and the images π(Fm ) and π(Fn ) are
disjoint for m ≠ n, since π(Fn ) = π(Un ) − π(Un ) ∩ (π(F1 ) ∪ · · · ∪ π(Fn−1 )). Now let F := ∪n≥1 Fn .
The restriction of π to F is injective, and also surjective, since π(F ) = π(∪n≥1 Un ) = π(G). Thus
F is a set of unique Γ -coset representatives which is measurable, since each Fn is (note that
π⁻¹(π(U)) = UΓ is open for any open U ⊆ G). The measure of F is nonzero, since it contains
the open set U1 with positive measure.



A set of unique Γ -coset representatives in G is a strict fundamental domain for Γ .

Lemma 5.5. Let Γ be a discrete subgroup of a second countable locally compact group G. Then
G/Γ admits a Haar measure if and only if ∆G (Γ ) = {1}.

Proof. Let π: G → G/Γ be the quotient map, let F be a measurable strict fundamental domain
for G/Γ as in Lemma 5.4, and let ν be a left Haar measure on G. For S ∈ Σ(G/Γ ) we define

µ(S) := ν(F ∩ π−1 (S)).

For any g ∈ G, the left G-invariance of ν implies

µ(gS) = ν(F ∩ π⁻¹(gS)) = ν(F ∩ gπ⁻¹(S)) = ν(g⁻¹F ∩ π⁻¹(S)) = ν(F′ ∩ T ),

where F′ := g⁻¹F and T := π⁻¹(S). Now F′ is also a measurable strict fundamental domain for
G/Γ , thus G = F′Γ = F Γ , and T Γ = T . If ∆G (Γ ) = {1} then ν is right Γ -invariant and

$$\mu(gS) = \nu(G \cap F' \cap T) = \sum_{\gamma\in\Gamma} \nu(F\gamma \cap F' \cap T) = \sum_{\gamma\in\Gamma} \nu(F \cap F'\gamma^{-1} \cap T) = \nu(F \cap G \cap T) = \mu(S),$$

implying that µ is left G-invariant. The function µ is a nonzero Radon measure on G/Γ , since ν
restricts to a nonzero Radon measure on F (because F is measurable and G is second countable,
hence σ-compact). It follows that µ is a left Haar measure on G/Γ .
Conversely, given a left Haar measure µ on G/Γ , for any S ∈ Σ(G) we may define

$$\nu(S) := \sum_{\gamma\in\Gamma} \mu(\pi(F\gamma \cap S)).$$

The Haar measure µ on G/Γ lifts to a nonzero Radon measure on each of the measurable strict
fundamental domains F γ, which form a countable partition of G. It follows that ν is a nonzero
Radon measure on G. The G-invariance of µ implies that ν is a Haar measure, and we note
that ν is right Γ -invariant, since π is; it follows that ∆G (Γ ) = {1}.

Corollary 5.6. Let Γ be a lattice in a second countable locally compact group G. Then G is uni-
modular.

Proof. Γ ⊆ ker ∆G , so ∆G factors through the map G/Γ → R×>0 induced by µ, and ∆G (G) must
be bounded, since µ is finite. But the only bounded subgroup of R×>0 is the trivial group.

Lemma 5.7. Let G be a second countable locally compact group with a discrete cocompact sub-
group Γ . Then Γ is a lattice.

In other words, for discrete subgroups of second countable locally compact groups, cocom-
pact implies cofinite. The converse does not hold: SL2 (Z) ⊆ SL2 (R) is discrete and SL2 (R)
is unimodular, so SL2 (Z) is a lattice in SL2 (R), but it is not cocompact because the diagonal
subgroup of SL2 (R) has unbounded image in SL2 (R)/SL2 (Z). Cocompact lattices are called uniform.



5.2 Arithmetic groups
Recall that a Lie group is a smooth (second countable) manifold with a compatible group struc-
ture (so group operations are smooth maps). An algebraic group is an algebraic variety with a
compatible group structure (so group operations are rational maps). Projective algebraic groups
are abelian varieties (being projective forces the group to be abelian). Affine algebraic groups
are linear algebraic groups, and include GLn , which can be defined as an affine variety in $\mathbb{A}^{n^2+1}$
over any field k, using variables t and ai j for 1 ≤ i, j ≤ n and the equation t det A = 1 (where
det A is the polynomial in the ai j corresponding to the determinant), along with polynomial
maps corresponding to matrix multiplication and inversion (using t as the inverse of det A).
The classical groups SLn , Sp2n , Un , SU n , On , and SOn are all examples of linear algebraic groups
(assume char(k) 6= 2 for On and SOn ).
If G is an algebraic group over Q (a Q-algebraic group), then G(R) and G(C) are Lie groups.

Definition 5.8. Subgroups H1 and H2 of a group G are commensurable if H1 ∩ H2 has finite


index in both H1 and H2 .

Definition 5.9. Let G be a Lie group. A subgroup Γ ≤ G is arithmetic if there is a Q-algebraic


linear group H and a morphism of Lie groups φ : H(R) → G with compact kernel and finite
cokernel for which φ(H(Z)) is commensurable to Γ .

Arithmetic groups are discrete (since Z is discrete in R) and cofinite, which implies they
are lattices (which may be uniform or non-uniform). Not all lattices are arithmetic (not even
all uniform lattices); triangle groups in SL2 (R) provide many examples (see the next section).
But in a semisimple Lie group G (no nontrivial connected normal abelian subgroup) of real
rank at least 2, every irreducible lattice Γ (infinite lattices for which Γ N is dense in G for every
noncompact closed normal subgroup N of the identity component G°) is arithmetic, by a theorem of Margulis [1].
The arithmetic lattices in SL2 (R) arise from orders O in quaternion algebras B over a totally
real field F in which exactly one of the real places splits (this includes both uniform and non-
uniform lattices). As in the previous lecture, we embed F in R via the unique real place that
splits and use this to embed B in M2 (R), then take the image of the norm-1 elements of O under
this embedding to get a discrete cofinite subgroup Γ (O1 ) ≤ SL2 (R).
When B is a division algebra, the arithmetic lattice Γ (O1 ) is uniform (cocompact) and has
no cusps, otherwise B is a matrix algebra and Γ (O1 ) is non-uniform and has at least one cusp.

5.2.1 Triangle groups

Let Γ be a lattice in SL2 (R), equivalently, a finitely generated Fuchsian group of the first kind
with Z(Γ ) := Γ ∩{±1}. Then Γ /Z(Γ ) is generated by hyperbolic elements {α1 , . . . , α g , β1 , . . . , β g },
elliptic elements {γ1 , . . . , γs }, and parabolic elements {γs+1 , . . . , γs+t }, which satisfy

$$\alpha_1\beta_1\alpha_1^{-1}\beta_1^{-1} \cdots \alpha_g\beta_g\alpha_g^{-1}\beta_g^{-1}\,\gamma_1 \cdots \gamma_{s+t} = 1, \qquad \gamma_i^{e_i} = 1 \quad (1 \le i \le s),$$

where 1 < e1 ≤ · · · ≤ es < ∞ are the orders of the elliptic elements.

Definition 5.10. The signature of Γ is the tuple (g; e1 , . . . , es+t ), with es+1 = · · · = es+t = ∞. It
satisfies the inequality

$$2g - 2 + \sum_{i=1}^{s+t}\left(1 - \frac{1}{e_i}\right) > 0.$$



When g = 0 and s + t = 3 we call Γ a triangle group of type (e1 , e2 , e3 ). For triangle groups
of type (e1 , e2 , e3 ) we necessarily have

$$\frac{1}{e_1} + \frac{1}{e_2} + \frac{1}{e_3} < 1.$$
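The signature inequality, and its special case for triangle groups, is easy to evaluate exactly. The sketch below is illustrative only (the function names are hypothetical); it represents a cusp by `math.inf`, so that 1/e is read as 0, and uses rational arithmetic to avoid rounding.

```python
from fractions import Fraction
import math

def signature_defect(g, orders):
    """Return 2g - 2 + sum(1 - 1/e) over the given orders.

    An order equal to math.inf stands for a cusp and contributes 1.
    """
    total = Fraction(2 * g - 2)
    for e in orders:
        total += 1 if e == math.inf else 1 - Fraction(1, e)
    return total

def is_hyperbolic_triangle_type(e1, e2, e3):
    """True if 1/e1 + 1/e2 + 1/e3 < 1, i.e. the defect with g = 0 is positive."""
    return signature_defect(0, (e1, e2, e3)) > 0

defect_237 = signature_defect(0, (2, 3, 7))
defect_modular = signature_defect(0, (2, 3, math.inf))
```

For example, the type (2, 3, 7) gives the value 1/42, while (2, 3, 6) gives exactly 0 and (2, 3, 5) gives a negative value, so neither of the latter two is the type of a triangle group in the sense above.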

Proposition 5.11. Let (e1 , e2 , e3 ) and s be as above. Up to conjugacy in SL2 (R), the following
hold:

• If s ≥ 1 and ei is even for some i ≤ s then there is a unique triangle group of type (e1 , e2 , e3 )
and it contains −1.
• If s ≥ 2 and ei is odd for all i ≤ s then there are exactly two triangle groups of type (e1 , e2 , e3 ).
One contains −1 and the other does not.
• If s = 0 or s = 1 and e1 is odd then there are exactly three triangle groups of type (e1 , e2 , e3 ).
One contains −1 and the other two do not and have index 2 in the one that does.

Proof. This is [4, Prop. 1].

This implies that there are infinitely many triangle groups.

Theorem 5.12. Let Γ be a triangle group of type (e1 , e2 , e3 ) and let

$$k := \mathbb{Q}\left(\cos(\pi/e_1)^2,\ \cos(\pi/e_2)^2,\ \cos(\pi/e_3)^2,\ \cos(\pi/e_1)\cos(\pi/e_2)\cos(\pi/e_3)\right).$$

If t > 0 then Γ is arithmetic if and only if k = Q. If t = 0 then Γ is arithmetic if and only if k = Q
or k ⊋ Q and under every non-identity embedding σ of k into R we have

$$\sigma\left(\cos(\pi/e_1)^2 + \cos(\pi/e_2)^2 + \cos(\pi/e_3)^2 + 2\cos(\pi/e_1)\cos(\pi/e_2)\cos(\pi/e_3)\right) < 1.$$

Proof. This is [4, Thm. 1].

Theorem 5.13. Let Γ be an arithmetic triangle group. If t > 0 there are 9 possible types for Γ :

(2, 3, ∞), (2, 4, ∞), (2, 6, ∞), (2, ∞, ∞), (3, 3, ∞), (3, ∞, ∞), (4, 4, ∞), (6, 6, ∞), (∞, ∞, ∞)

and otherwise there are 76 possible types:

• (2, 3, n) with n ∈ {7, . . . , 12, 14, 16, 18, 24, 30} or (2, 4, n) with n ∈ {5, . . . , 8, 10, 12, 18};
• (2, 5, n) with n ∈ {5, 6, 8, 10, 20, 30} or (2, 6, n) with n ∈ {6, 8, 12};
• one of (2, 7, 7), (2, 7, 14), (2, 8, 8), (2, 8, 16), (2, 9, 18), (2, 10, 10), (2, 12, 12), (2, 12, 24),
(2, 15, 30), (2, 18, 18);
• (3, 3, n) with n ∈ {4, . . . , 9, 12, 15} or (3, 4, n) with n ∈ {4, 6, 12};
• one of (3, 5, 5), (3, 6, 6), (3, 6, 18), (3, 8, 8), (3, 8, 24), (3, 10, 30), (3, 12, 12);
• (4, 4, n) with n ∈ {4, 5, 6, 9} or one of (4, 5, 5), (4, 6, 6), (4, 8, 8), (4, 16, 16);
• one of (5, 5, 5), (5, 5, 10), (5, 5, 15), (5, 10, 10), (6, 6, 6), (6, 12, 12), (6, 24, 24), (7, 7, 7),
(8, 8, 8), (9, 9, 9), (9, 18, 18), (12, 12, 12), (15, 15, 15).

Proof. This is [4, Thm. 3].



References
[1] Grigory Margulis, Arithmeticity of the irreducible lattices in the semi-simple groups of rank
greater than 1, Invent. Math. 76 (1984), 93–120.

[2] Toshitsune Miyake, Modular forms, Springer, 2006.

[3] Dave Witte Morris, Introduction to arithmetic groups, Deductive Press, 2015.

[4] Kisao Takeuchi, Arithmetic triangle groups, J. Math. Soc. Japan 29 (1977), 91-106.

[5] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #6 2/21/2024

These notes summarize the material in §2.1 of [1] presented in lecture.

6.1 Automorphic forms


Definition 6.1. The automorphy factor j : GL+2 (R) × H → C is defined by

$$j(\gamma, z) := cz + d, \qquad\text{where } \gamma = \begin{pmatrix}a&b\\c&d\end{pmatrix},$$

in other words, the denominator of γz = (az + b)/(cz + d).

Lemma 6.2 (cocycle relation). For all γ1 , γ2 ∈ GL+2 (R) and z ∈ H we have

j(γ1 γ2 , z) = j(γ1 , γ2 z) j(γ2 , z).

Proof. Check.

Definition 6.3. Let k ∈ Z and γ ∈ GL+2 (R). The weight-k slash operator for γ acts on functions
f : H → C via

$$(f|_k\gamma)(z) := (\det\gamma)^{k/2}\, j(\gamma, z)^{-k} f(\gamma z).$$

We have ( f + g)|k γ = f |k γ + g|k γ and (c f )|k γ = c( f |k γ) for all γ ∈ GL+2 (R), f , g : H → C, and
c ∈ C, thus the map f ↦ f |k γ is a linear operator on the vector space of functions H → C.
Moreover, for every γ1 , γ2 ∈ GL+2 (R) we have

f |k (γ1 γ2 ) = ( f |k γ1 )|k γ2 .

In other words, for each k ∈ Z the weight-k slash operator defines a linear right group action of
GL+2 (R) on the vector space of functions f : H → C.

We can omit the factor (det γ)^{k/2} if we only care about γ ∈ SL2 (R). For scalars a ∈ R× we have
f |k a = sgn(a)^k f (so scalars, including −1, act trivially only when k is even).
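The cocycle relation and the right-action property of the slash operator can be checked numerically at a sample point. The sketch below is only an illustration: the test function `f`, the matrices `g1`, `g2`, the weight, and the point `z0` are arbitrary choices made for this example, and the helper names are hypothetical.

```python
import cmath

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def det(g):
    (a, b), (c, d) = g
    return a * d - b * c

def act(g, z):
    """Linear fractional action of g on z."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def j(g, z):
    """Automorphy factor cz + d."""
    (a, b), (c, d) = g
    return c * z + d

def slash(f, k, g):
    """Return the function f|_k g for g with positive determinant."""
    return lambda z: det(g) ** (k / 2) * j(g, z) ** (-k) * f(act(g, z))

def f(z):
    """An arbitrary test function on the upper half plane."""
    return cmath.exp(1j * z) + z * z

g1 = ((2, 1), (1, 3))     # determinant 5
g2 = ((1, -2), (0.5, 1))  # determinant 2
k = 3
z0 = complex(0.3, 0.7)

lhs = slash(f, k, matmul(g1, g2))(z0)
rhs = slash(slash(f, k, g1), k, g2)(z0)
minus_one = ((-1, 0), (0, -1))
```

Note the order of composition: slashing by γ1 first and then by γ2 corresponds to the product γ1 γ2 , which is what makes this a right action.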

Definition 6.4. Let Γ be a lattice in SL2 (R) (equivalently, a finitely generated Fuchsian group of
the first kind), and let k ∈ Z. We say that a meromorphic function f : H → C is an automorphic
function of weight k with respect to Γ if for every γ ∈ Γ we have

f |k γ = f .

The C-vector space of automorphic functions of weight k for Γ is denoted Ωk (Γ ).

We note the following (easily checked) facts:

• If k is odd and −1 ∈ Γ then Ωk (Γ ) = {0}, since f |k (−1) = − f when k is odd.
• For lattices Γ′ ⊆ Γ we have Ωk (Γ ) ⊆ Ωk (Γ′ ) (inclusions are reversed).
• If f ∈ Ωk (Γ ) and g ∈ Ωl (Γ ) then f g ∈ Ωk+l (Γ ).
• For f ∈ Ωk (Γ ) and α ∈ GL+2 (R) we have f |k α ∈ Ωk (α⁻¹ Γ α).

Recall that a graded ring is a ring R whose additive group is an internal direct sum

$$R = \bigoplus_{i\in I} R_i$$

indexed by a monoid I (typically Z or Z≥0 ) such that R i R j ⊆ R i+ j for all i, j ∈ I. Note that this
means each R i is an R0 -module.



Thus $\Omega(\Gamma) := \bigoplus_{k\in\mathbb{Z}} \Omega_k(\Gamma)$ is a graded ring and each Ωk (Γ ) is an Ω0 (Γ )-module (this
generalizes the C-vector space structure of Ωk (Γ ), which can be viewed as multiplication by constant
functions C ⊆ Ω0 (Γ )).
Suppose Γ has a cusp x with stabilizer Γx , and pick σ ∈ SL2 (R) so that σx = ∞. Then

$$\pm\sigma\Gamma_x\sigma^{-1} = \pm\left\langle\begin{pmatrix}1&h\\0&1\end{pmatrix}\right\rangle$$

for some h > 0. Recall that a cusp x with σx = ∞ is regular if $\begin{pmatrix}1&h\\0&1\end{pmatrix} \in \sigma\Gamma_x\sigma^{-1}$ and irregular
otherwise (in which case $\begin{pmatrix}-1&\pm h\\0&-1\end{pmatrix} \in \sigma\Gamma_x\sigma^{-1}$). For f ∈ Ωk (Γ ) we have f |k σ⁻¹ ∈ Ωk (σΓ σ⁻¹),
and if x is regular this implies that

$$(f|_k\sigma^{-1})(z + h) = \left((f|_k\sigma^{-1})\Big|_k \begin{pmatrix}1&h\\0&1\end{pmatrix}\right)(z) = (f|_k\sigma^{-1})(z)$$

for all z ∈ H; this also holds when x is irregular and k is even, since f |k σ−1 |k − 1 = f |k σ−1 for
even k. There is then a function g : D0 → C (where D0 := {z ∈ C× : |z| < 1}) such that

( f |k σ−1 )(z) = g(e2πiz/h )

for all z ∈ H. We say f ∈ Ωk (Γ ) is meromorphic/holomorphic/vanishing at x if g is
meromorphic/holomorphic/vanishing at 0. For x irregular and k odd we adopt the same terminology,
using the g for f² (which has even weight 2k). This definition does not depend on σ.
If $g(w) = \sum_{n\ge n_0} a_n w^n$ is the Laurent series expansion of g in a neighborhood of 0, then

$$(f|_k\sigma^{-1})(z) = \sum_{n\ge n_0} a_n e^{2\pi i n z/h}$$

on some neighborhood Ul := {z ∈ H : Im(z) > l} of ∞. We thus have

$$(f|_k\sigma^{-1})(z + h) = \begin{cases} (f|_k\sigma^{-1})(z) & \text{if } k \text{ is even or } x \text{ is regular},\\ -(f|_k\sigma^{-1})(z) & \text{if } k \text{ is odd and } x \text{ is irregular}, \end{cases}$$

which yields the Fourier expansion of f at the cusp x:

$$(f|_k\sigma^{-1})(z) = \begin{cases} \sum_{\text{even } n\ge 0} a_n e^{\pi i n z/h} & \text{if } k \text{ is even or } x \text{ is regular},\\ \sum_{\text{odd } n\ge 0} a_n e^{\pi i n z/h} & \text{if } k \text{ is odd and } x \text{ is irregular}. \end{cases}$$

Writing this in terms of $q := e^{2\pi i z}$ yields the q-expansion of f at the cusp x.

Definition 6.5. We call an automorphic function f ∈ Ωk (Γ )

• an automorphic form (of weight k for Γ ) if f is meromorphic at every cusp of Γ ;


• a modular form (of weight k for Γ ) if f is holomorphic on H and at every cusp of Γ ;
• a cusp form (of weight k for Γ ) if f is holomorphic on H and vanishes at every cusp of Γ .

The C-vector spaces of automorphic/modular/cusp forms of weight k for Γ are denoted
Ak (Γ ), Mk (Γ ), Sk (Γ ), respectively. For lattices Γ′ ⊆ Γ of finite index we get corresponding
inclusions in the reverse directions (when Γ has cusps this is not immediate, see [1, Lemma 2.1.3]
for a proof), and we always have the inclusions

Sk (Γ ) ⊆ Mk (Γ ) ⊆ Ak (Γ ) ⊆ Ωk (Γ ).
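To make the definitions concrete, here is a numerical check for a standard example that is not taken from these notes: the weight-4 Eisenstein series for SL2 (Z), whose q-expansion at ∞ is E4 = 1 + 240 Σ σ3 (n) qⁿ with q = e^{2πiz}. The sketch truncates the series (the number of terms and the sample point are arbitrary choices) and checks invariance under the weight-4 slash action of T and S, which for S amounts to E4 (−1/z) = z⁴ E4 (z).

```python
import cmath

def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def E4(z, terms=80):
    """Truncated q-expansion of the weight-4 Eisenstein series."""
    q = cmath.exp(2j * cmath.pi * z)
    return 1 + 240 * sum(sigma3(n) * q ** n for n in range(1, terms + 1))

z0 = complex(0.3, 1.1)

# Invariance under T: z -> z + 1 (automorphy factor 1).
t_error = abs(E4(z0 + 1) - E4(z0))

# Invariance under S: z -> -1/z (automorphy factor z, weight 4).
s_error = abs(E4(-1 / z0) - z0 ** 4 * E4(z0))
```

Because the expansion has no negative powers of q and a nonzero constant term, this example is holomorphic but not vanishing at the cusp ∞, so it is a modular form that is not a cusp form.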



If Γ has no cusps then Ωk (Γ ) = Ak (Γ ) and Mk (Γ ) = Sk (Γ ), and for any α ∈ GL+2 (R) the map
f ↦ f |k α induces isomorphisms

Ak (Γ ) ≃ Ak (Γ^α ), Mk (Γ ) ≃ Mk (Γ^α ), Sk (Γ ) ≃ Sk (Γ^α ),

where Γ^α := α⁻¹ Γ α. The product of two automorphic/modular/cusp forms of weights k and l
is an automorphic/modular/cusp form of weight k + l and we thus have graded rings

$$A(\Gamma) = \bigoplus_{k\in\mathbb{Z}} A_k(\Gamma), \qquad M(\Gamma) = \bigoplus_{k\in\mathbb{Z}} M_k(\Gamma), \qquad S(\Gamma) = \bigoplus_{k\in\mathbb{Z}} S_k(\Gamma).$$

The ratio of automorphic forms of weights k and l (with nonzero denominator) is an automorphic form of
weight k − l, thus every nonzero homogeneous element of A(Γ ) is invertible; in particular A0 (Γ ) is a field,
and all the Ak (Γ ) are A0 (Γ )-vector spaces. The field A0 (Γ )
is isomorphic to the field of meromorphic functions on the Riemann surface X Γ := Γ \H∗Γ .

Theorem 6.6. Let f ∈ Ωk (Γ ) be holomorphic on H. If $f(z) = O(\mathrm{Im}(z)^{-v})$ as Im(z) → 0, uniformly
with respect to Re(z), for some v > 0 then f ∈ Mk (Γ ), and if this holds with v < k then f ∈ Sk (Γ ).

Proof. This is [1, Thm. 2.1.4].

Theorem 6.7. Let f ∈ Ωk (Γ ). Then f ∈ Sk (Γ ) if and only if $f(z)\,\mathrm{Im}(z)^{k/2}$ is bounded on H.

Proof. This is [1, Thm. 2.1.5].

Corollary 6.8. Let $\sum_{n\ge 1} a_n e^{\pi i n z/h}$ be the Fourier expansion of a cusp form f ∈ Sk (Γ ) at a cusp
of Γ . Then $a_n = O(n^{k/2})$.

Remark 6.9. The bound in Corollary 6.8 can be improved to $a_n = O(n^{(k-1)/2})$ when Γ is a
congruence subgroup [1, Thm. 4.5.17].

6.2 Automorphic forms with character


Definition 6.10. Let Γ ≤ SL2 (R) be a lattice. A character of Γ is a group homomorphism
χ : Γ → C× with finite image. This implies χ(Γ ) = 〈ζn 〉, where ζn = e2πi/n ; n is the order of χ.

Definition 6.11. A Dirichlet character ε is a periodic totally multiplicative function Z → C,
equivalently, the extension by zero of a character of (Z/N Z)× for some modulus N .

Dirichlet characters ε are commonly used to define characters of Γ0 (N ) via

$$\chi\begin{pmatrix}a&b\\c&d\end{pmatrix} := \varepsilon(d).$$
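The reason this formula defines a homomorphism is that for matrices in Γ0 (N ) the lower right entry of a product is congruent to the product of the lower right entries modulo N . The following sketch illustrates this for N = 5 with the quadratic character modulo 5; the choice of character, the sample matrices, and the function names are assumptions made for this example.

```python
def epsilon(d):
    """Quadratic Dirichlet character of modulus 5, extended by zero."""
    r = d % 5
    if r == 0:
        return 0
    return 1 if r in (1, 4) else -1  # 1 and 4 are the nonzero squares mod 5

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def chi(g):
    """Character of Gamma0(5) induced by epsilon: evaluate on the lower right entry."""
    (a, b), (c, d) = g
    if a * d - b * c != 1 or c % 5 != 0:
        raise ValueError("matrix is not in Gamma0(5)")
    return epsilon(d)

g1 = ((2, 1), (5, 3))
g2 = ((3, 1), (5, 2))
product = matmul(g1, g2)  # equals ((11, 4), (30, 11))
```

Here χ(g1 ) = ε(3) = −1 and χ(g2 ) = ε(2) = −1, while the product has lower right entry 11 ≡ 1 mod 5, so χ(g1 g2 ) = 1 = χ(g1 )χ(g2 ).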

Definition 6.12. Let χ be a character of a lattice Γ ≤ SL2 (R) and let Γχ := ker χ. An automorphic
function f ∈ Ωk (Γχ ) which satisfies
f |k γ = χ(γ) f
for all γ ∈ Γ is an automorphic function for Γ with character χ, and we use Ωk (Γ , χ) to denote
the C-vector space of all such functions. We similarly define the C-vector spaces

Ak (Γ , χ) := Ak (Γχ ) ∩ Ωk (Γ , χ), Mk (Γ , χ) := Mk (Γχ ) ∩ Ωk (Γ , χ), Sk (Γ , χ) := Sk (Γχ ) ∩ Ωk (Γ , χ),

of automorphic/modular/cusp forms for Γ with character χ.



Note that Γχ has finite index in Γ , hence the same cusps as Γ (see [1, Cor. 1.5.5]), so
automorphic forms in A(Γ , χ) are meromorphic at the cusps of Γ , modular forms in M (Γ , χ) are
holomorphic at the cusps of Γ , and cusp forms in S(Γ , χ) vanish at the cusps of Γ .
For any finite index Γ′ ≤ Γ we have corresponding inclusions
A(Γ , χ) ⊆ A(Γ′ , χ′ ), M (Γ , χ) ⊆ M (Γ′ , χ′ ), S(Γ , χ) ⊆ S(Γ′ , χ′ ),
where χ′ is the restriction of χ to Γ′ . If f ∈ Ak (Γ , χ) and g ∈ Al (Γ , ψ) then f g ∈ Ak+l (Γ , χψ),
and similarly for Mk and Sk . The product of a modular form and a cusp form is a cusp form, so
we also have
Mk (Γ , χ)Sl (Γ , ψ) ⊆ Sk+l (Γ , χψ)
for any lattice Γ ≤ SL2 (R) with characters χ, ψ.
Example 6.13. Let N ∈ Z>0 . As we will prove in a later lecture, for the congruence subgroups
Γ0 (N ), Γ1 (N ) ≤ SL2 (Z), we have the decompositions
$$M_k(\Gamma_1(N)) = \bigoplus_{\chi} M_k(\Gamma_0(N), \chi) \quad\text{and}\quad S_k(\Gamma_1(N)) = \bigoplus_{\chi} S_k(\Gamma_0(N), \chi),$$

where the direct sums range over Dirichlet characters of modulus N .

6.3 The Petersson inner product


Let χ be a character for a lattice Γ ≤ SL2 (R) and let f , g ∈ Mk (Γ , χ) with at least one in Sk (Γ , χ).
Then f g ∈ S2k (Γ , χ²) and for any γ ∈ Γ we have
$$f(\gamma z)\,\overline{g(\gamma z)}\,\mathrm{Im}(\gamma z)^k = f(z)\,\overline{g(z)}\,\mathrm{Im}(z)^k,$$
where the bar denotes complex conjugation. By Theorem 6.7, $|f(z)g(z)|\,\mathrm{Im}(z)^k$ is bounded
on H, thus the integral
$$\int_{\Gamma\backslash H} f(z)\,\overline{g(z)}\,\mathrm{Im}(z)^k\, dv(z)$$
converges. Here $dv(z) := \frac{dx\,dy}{y^2}$ is the hyperbolic measure on H (with z = x + i y). We define
v(Γ \H∗Γ ) := v(Γ \H) := v(F ),
where F is any measurable fundamental domain for Γ . The fact that Γ is a lattice (a finitely
generated Fuchsian group of the first kind) implies that such an F exists (indeed, we can use
any Dirichlet domain for Γ ), and that v(F ) ∈ R>0 is finite and independent of the choice of F .
We now define the Petersson inner product of f and g via
$$\langle f, g\rangle := \frac{1}{v(\Gamma\backslash H)} \int_{\Gamma\backslash H} f(z)\,\overline{g(z)}\,\mathrm{Im}(z)^k\, dv(z) \in \mathbb{C},$$
which induces a positive definite Hermitian inner product on the C-vector space Sk (Γ , χ), mak-
ing Sk (Γ , χ) into a Hermitian space.

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.
[2] Fred Diamond and Jerry Shurman, A first course in modular forms, Springer, 2005.
[3] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #7 2/26/2024
These notes summarize the material in §2.2–2.3 of [1] presented in lecture.

7.1 Meromorphic differentials on compact Riemann surfaces



Let X be a (connected) Riemann surface with holomorphic atlas {ψi : Ui → Vi ⊆ C} and transition
maps ψi j := ψ j ◦ ψi⁻¹ : Vi j → Vj defined for all Ui ∩ U j ≠ ∅, where Vi j := ψi (Ui ∩ U j ).
Definition 7.1. Let n be an integer. A meromorphic differential ω of degree n on X is (the
equivalence class of) a collection of meromorphic functions {ωi : Vi → C} such that

$$\omega_i = (\omega_j \circ \psi_{ij})\,(d\psi_i/d\psi_j)^n$$

on Vi j whenever Ui ∩ U j ≠ ∅. Meromorphic differentials {ωi } and {θi } of degree n are equivalent
if {ωi } ∪ {θi } is also a meromorphic differential of degree n.
Remark 7.2. A meromorphic differential of degree 0 is a meromorphic function X → C, equiv-
alently, a holomorphic map X → P1 (C) that is not identically ∞. Nonconstant meromorphic
functions X → C are the same thing as nonconstant holomorphic maps X → P1 (C).
We note the following, all of which are compatible with equivalence:
• Taking all ωi = 0 yields the zero differential, which is a meromorphic differential of degree
n for all n ∈ Z.
• If ω = {ωi } is a nonzero differential of degree n then ω⁻¹ := {1/ωi } is a nonzero
differential of degree −n.
• If ω is a meromorphic differential of degree n then so is cω := {cωi } for any c ∈ C.
• If ω and ω′ are meromorphic differentials of degree n then so is ω + ω′ := {ωi + ω′i }.
• If ω and ω′ are meromorphic differentials of degrees m and n then their product ωω′ :=
{ωi ω′i } is a meromorphic differential of degree m + n.
It follows that the set of meromorphic differentials of degree n on X forms a C-vector space
Ωn (X ), and we have a graded algebra of meromorphic differentials

$$\Omega(X) := \bigoplus_{n\in\mathbb{Z}} \Omega^n(X),$$

in which Ω0 (X ) ≃ C(X ) is the field of meromorphic functions on X . In particular, each C-vector
space Ωn (X ) is also an Ω0 (X )-vector space.

From now on we shall assume that X is compact. This implies that X is a smooth projective
curve over C with function field C(X ) ≃ Ω0 (X ). Recall that the categories of Riemann surfaces
(with holomorphic maps), smooth projective curves over C (with dominant morphisms), and
function fields of transcendence degree 1 over C (with field homomorphisms) are equivalent
categories. The functor from Riemann surfaces to curves is covariant, while the functor from
curves to function fields is contravariant.

For each f ∈ C(X ) we define a differential d f := {d f i }, where f i := f ◦ ψi⁻¹, which is zero
if and only if f is constant, and (d f )ⁿ ∈ Ωn (X ) for all n ∈ Z. Note that C(X ) ≠ C contains a
nonconstant f , thus there are nonzero differentials of every degree and dimC(X ) Ωn (X ) > 0. On
the other hand, if we pick a nonzero ω ∈ Ωn (X ), multiplication by ω⁻¹ defines an isomorphism
Ωn (X ) → Ω0 (X ) of C(X )-vector spaces. Thus for every n ∈ Z we have

dimC(X ) Ωn (X ) = 1.



7.2 Divisors
The divisor group $\mathrm{Div}(X) := \bigoplus_{P\in X(\mathbb{C})} \mathbb{Z}$ of X is the free abelian group generated by X (C):

$$\mathrm{Div}(X) := \left\{ \sum_{P\in X(\mathbb{C})} n_P P : n_P \in \mathbb{Z} \text{ with } n_P = 0 \text{ for all but finitely many } P \right\}.$$

For $D = \sum_P n_P P \in \mathrm{Div}(X)$, the degree of the divisor D is $\deg(D) := \sum_P n_P$, and the map

deg: Div(X ) → Z
D ↦ deg(D)

is a group homomorphism. For each f ∈ C(X )× we have a divisor

$$\mathrm{div}(f) := \sum_{P\in X(\mathbb{C})} \mathrm{ord}_P(f)\,P,$$

where ord P ( f ) is the order of vanishing of f at P. Divisors arising in this way are called principal.
For any f , g ∈ C(X )× we have div( f g) = div( f ) + div(g) and deg(div( f )) = 0. The set of
principal divisors forms a subgroup Prin(X ) ≤ Div0 (X ) ≤ Div(X ), where Div0 (X ) := ker(deg).
We now define the divisor class group

Pic(X ) := Div(X )/ Prin(X ).

We also define Pic0 (X ) := ker(deg)/ Prin(X ).
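For X = P1 (C) these notions can be made completely explicit. The sketch below is an illustration under stated assumptions: a nonzero rational function is described, up to a nonzero constant, by a dictionary sending each finite zero or pole r to its multiplicity (negative for poles), and the order at ∞ is then minus the sum of these multiplicities. The representation and the function names are hypothetical choices for this example.

```python
def div_rational(multiplicities):
    """Divisor on P^1 of f = prod (z - r)^m, given as {r: m} with m < 0 for poles."""
    D = {P: m for P, m in multiplicities.items() if m != 0}
    order_at_infinity = -sum(D.values())  # f behaves like z^(sum of m) near infinity
    if order_at_infinity != 0:
        D["inf"] = order_at_infinity
    return D

def add(D1, D2):
    """Sum of two divisors, dropping points with coefficient zero."""
    total = dict(D1)
    for P, n in D2.items():
        total[P] = total.get(P, 0) + n
    return {P: n for P, n in total.items() if n != 0}

def deg(D):
    return sum(D.values())

f = {0: 2, 1: -1}          # f = z^2 / (z - 1)
g = {1: 1, 2: -3}          # g = (z - 1) / (z - 2)^3
fg = {0: 2, 2: -3}         # f * g = z^2 / (z - 2)^3

div_f = div_rational(f)    # 2*(0) - (1) - (inf)
div_g = div_rational(g)    # (1) - 3*(2) + 2*(inf)
div_fg = div_rational(fg)  # 2*(0) - 3*(2) + (inf)
```

In this model every principal divisor has degree 0 by construction, and conversely every degree-0 divisor on P1 (C) arises this way, which reflects the fact that Pic0 of the projective line is trivial.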


For each ω ∈ Ωn (X ) and P ∈ X (C) we define the order of vanishing of ω at P ∈ Ui to be

ord P (ω) = ord P (ωi ◦ ψi ).

The transition maps ψi j are holomorphic with nonvanishing derivative, so ord P (ωi ) = ord P (ω j ) for Ui ∩ U j ≠ ∅.
This implies that ord P (ω) does not depend on the choice of Ui ∋ P. We now define the divisor

$$\mathrm{div}(\omega) = \sum_{P\in X(\mathbb{C})} \mathrm{ord}_P(\omega)\,P,$$

which for ω ∈ Ω0 (X ) coincides with our previous definition. For any ω ∈ Ωm (X ) and ω′ ∈ Ωn (X )
with ω, ω′ ≠ 0 we have
div(ωω′ ) = div(ω) + div(ω′ ).
Recall that dimC(X ) Ωn (X ) = 1. This implies that {div(ω) : 0 ≠ ω ∈ Ωn (X )} is a divisor class.

We call a divisor $D = \sum_{P\in X} n_P P \in \mathrm{Div}(X)$ positive, or effective, if n P ≥ 0 for all P ∈ X (C),
and write D ≥ 0. For every divisor D ∈ Div(X ) we have an associated Riemann–Roch space

L(D) := { f ∈ C(X ) : f = 0 or div( f ) + D ≥ 0},

which is a finite-dimensional C-vector space, with L(0) = C and L(D) = {0} for deg(D) < 0.
We denote its dimension by
ℓ(D) := dimC L(D).

Theorem 7.3 (Riemann–Roch). Let $X$ be a compact Riemann surface of genus $g$ and let $\omega \in \Omega^1(X)$ be a nonzero meromorphic differential. Then for every $D \in \operatorname{Div}(X)$ we have

\[ \ell(D) = \deg(D) - g + 1 + \ell(\operatorname{div}(\omega) - D). \]



Corollary 7.4. Let $X$ be a compact Riemann surface of genus $g$. The following hold:

• For each nonzero $\omega \in \Omega^1(X)$ we have $\deg(\operatorname{div}(\omega)) = 2g - 2$ and $\ell(\operatorname{div}(\omega)) = g$.
• For each nonzero $\omega \in \Omega^n(X)$ we have $\deg(\operatorname{div}(\omega)) = 2n(g - 1)$.
• For each $D \in \operatorname{Div}(X)$ with $\deg(D) > 2g - 2$ we have $\ell(D) = \deg(D) - g + 1$.
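
As a sanity check on Theorem 7.3 and Corollary 7.4, the following short worked example treats the Riemann sphere. The choice $X = \mathbb{P}^1(\mathbb{C})$ (genus $g = 0$), the coordinate $z$, and the divisor $D = m\cdot\infty$ are illustrative assumptions and do not appear in the lecture.

```latex
% Worked example (assumptions: X = P^1(C) with affine coordinate z, g = 0, D = m*infinity, m >= 0).
\begin{align*}
\operatorname{div}(dz) &= -2\cdot\infty,
  & \deg(\operatorname{div}(dz)) &= -2 = 2g - 2,\\
L(m\cdot\infty) &= \{\text{polynomials in } z \text{ of degree} \le m\},
  & \ell(m\cdot\infty) &= m + 1 = \deg(D) - g + 1,\\
\ell(\operatorname{div}(dz) - D) &= \ell\bigl(-(m+2)\cdot\infty\bigr) = 0,
  & \ell(\operatorname{div}(dz)) &= \ell(-2\cdot\infty) = 0 = g.
\end{align*}
```

Each line matches one of the statements above: the degree of a canonical divisor, the Riemann–Roch formula with vanishing correction term, and $\ell(\operatorname{div}(\omega)) = g$.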

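Definition 7.5. A meromorphic differential $\omega \in \Omega^n(X)$ is holomorphic if $\omega = 0$ or $\operatorname{div}(\omega) \ge 0$.

Corollary 7.6. Let $X$ be a compact Riemann surface of genus $g$. Then $\dim_{\mathbb{C}} \Omega^1_0(X) = g$.

Proof. Let $\omega_1 \in \Omega^1(X)$ be nonzero. Then $L(\operatorname{div}(\omega_1)) = \{ f \in \mathbb{C}(X) : f = 0 \text{ or } \operatorname{div}(f\omega_1) \ge 0 \}$. The isomorphism $\mathbb{C}(X) \xrightarrow{\sim} \Omega^1(X)$ defined by $f \mapsto f\omega_1$ implies

\[ L(\operatorname{div}(\omega_1)) \simeq \{ \omega \in \Omega^1(X) : \omega = 0 \text{ or } \operatorname{div}(\omega) \ge 0 \} = \Omega^1_0(X), \]

and $\dim_{\mathbb{C}} \Omega^1_0(X) = \ell(\operatorname{div}(\omega_1)) = g$.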

7.3 Automorphic forms and differentials


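Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice and consider the Riemann surface $X_\Gamma := \Gamma\backslash\mathbb{H}^*_\Gamma$. Let $\pi := \pi_\Gamma\colon \mathbb{H}^*_\Gamma \to X_\Gamma$ be the projection map. Assume $k = 2n$ is even and fix an automorphic form $f \in A_k(\Gamma)$. Then $f$ is meromorphic on $\mathbb{H}^*_\Gamma$ and $f|_k\gamma = f$ for all $\gamma \in \Gamma$. If $k = 0$ then $f$ defines a meromorphic function $\alpha \in \Omega^0(X_\Gamma)$. Our goal is to define a meromorphic differential $\omega_f \in \Omega^n(X_\Gamma)$ for any $k = 2n$, which will coincide with $f$ when $k = 0$. Recall that for any $\gamma \in \Gamma$ and $z \in \mathbb{H}$ we have

\[ (f|_k\gamma)(z) = (\det\gamma)^{k/2} j(\gamma, z)^{-k} f(\gamma z) = j(\gamma, z)^{-k} f(\gamma z) = (cz + d)^{-k} f\Bigl(\frac{az + b}{cz + d}\Bigr), \]

thus $f(\gamma z) = j(\gamma, z)^k f(z)$.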


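For each $P \in X_\Gamma(\mathbb{C})$, pick $z_P \in \pi^{-1}(P)$ and an open neighborhood $z_P \in U_P \subseteq \mathbb{H}^*_\Gamma$ for which $\gamma U_P \cap U_P \ne \emptyset$ if and only if $\gamma \in \Gamma_P := \Gamma_{z_P}$ and $\gamma U_P = U_P$ for all $\gamma \in \Gamma_P$, and let $\psi_P\colon V_P \xrightarrow{\sim} W_P \subseteq \mathbb{C}$ be a chart for $V_P = \pi(U_P)$, with transition maps $\psi_{PQ} = \psi_Q \circ \psi_P^{-1}\colon W_{PQ} \to W_Q$ defined on $W_{PQ} := \psi_P(V_P \cap V_Q)$ whenever $V_P \cap V_Q \ne \emptyset$.
Let $\alpha = \{\alpha_P\colon W_P \to \mathbb{C}\}$, and define $f_P := f|_{U_P}$ and $\pi_P := \pi|_{U_P}$ so that $f_P = \alpha_P \circ \psi_P \circ \pi_P$. If $z_P \in \mathbb{H}$ is not a cusp, we define $g_P\colon U_P \to \mathbb{C}$ via

\[ g_P(z) = f_P(z)\bigl(d(\psi_P \circ \pi_P(z))/dz\bigr)^{-n}, \]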

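so that $g_P = f_P$ when $n = 0$, and we note that

\[ (\gamma z)' = \Bigl(\frac{az + b}{cz + d}\Bigr)' = \frac{a(cz + d) - (az + b)c}{(cz + d)^2} = \frac{1}{(cz + d)^2} = j(\gamma, z)^{-2}. \]

For $\gamma \in \Gamma_P$ and $z \in U_P$ we have $\psi_P \circ \pi_P(\gamma z) = \psi_P \circ \pi_P(z)$, thus

\begin{align*}
g_P(\gamma z) &= f_P(\gamma z)\bigl(d(\psi_P \circ \pi_P(\gamma z))/d(\gamma z)\bigr)^{-n}\\
&= j(\gamma, z)^k f_P(z)\bigl(d(\psi_P \circ \pi_P(z))/(j(\gamma, z)^{-2}\,dz)\bigr)^{-n}\\
&= f_P(z)\bigl(d(\psi_P \circ \pi_P(z))/dz\bigr)^{-n} = g_P(z)
\end{align*}

for all $\gamma \in \Gamma_P$. There is thus a meromorphic function $\omega_P\colon W_P \to \mathbb{C}$ for which $g_P = \omega_P \circ \psi_P \circ \pi_P$, and

\[ \omega_P \circ \psi_P \circ \pi_P(z) = f_P(z)\bigl(d(\psi_P \circ \pi_P(z))/dz\bigr)^{-n}. \]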



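If $Q \in X_\Gamma(\mathbb{C})$ is not a cusp then $\omega_Q \circ \psi_Q \circ \pi_Q(z) = f_Q(z)\bigl(d(\psi_Q \circ \pi_Q(z))/dz\bigr)^{-n}$, and if $V_P \cap V_Q \ne \emptyset$ then the compatibility $\alpha_P = \alpha_Q \circ \psi_{PQ}$ implies

\[ \omega_P = (\omega_Q \circ \psi_{PQ})(d\psi_P/d\psi_Q)^n. \tag{1} \]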

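If $z_P \in \mathbb{H}^*_\Gamma - \mathbb{H}$ is a cusp, we let $U'_P := U_P - \{z_P\}$ and proceed as above to obtain meromorphic functions $g_P$ on $U_P - \{z_P\}$ and $\omega_P$ on $W_P - \{\psi_P(P)\}$ with $g_P = \omega_P \circ \psi_P \circ \pi_P$. Now choose $\sigma \in \mathrm{SL}_2(\mathbb{R})$ with $\sigma z_P = \infty$, so that $\pm\sigma\Gamma_P\sigma^{-1} = \pm\bigl\langle\bigl(\begin{smallmatrix}1 & h\\ 0 & 1\end{smallmatrix}\bigr)\bigr\rangle$ for some $h > 0$, and define $\varphi_P\colon V_P \to W_P$ via

\[ \varphi_P \circ \pi(z) = e^{2\pi i \sigma z/h}. \]

If we put $c := (2\pi i/h)^{-n}$ then

\begin{align*}
g_P(z) &= c f(z)\bigl(d(\sigma z)/dz\bigr)^{-n}(\varphi_P \circ \pi(z))^{-n}\\
&= c f(z) j(\sigma, z)^{2n}(\varphi_P \circ \pi(z))^{-n}\\
&= c (f|_k\sigma^{-1})(\sigma z)(\varphi_P \circ \pi(z))^{-n}
\end{align*}

is meromorphic at $z_P$, and $\omega_P$ is meromorphic at $\psi_P(P)$. Moreover, if $Q \in X_\Gamma(\mathbb{C})$ with $V_P \cap V_Q \ne \emptyset$ then (1) holds as above.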
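It follows that $\{\omega_P\}$ is a differential $\omega_f \in \Omega^n(X_\Gamma)$, and we have

• $\omega_{fg} = \omega_f\omega_g$ for all $f \in A_{2m}(\Gamma)$ and $g \in A_{2n}(\Gamma)$;
• $\omega_{f+g} = \omega_f + \omega_g$ for all $f, g \in A_{2m}(\Gamma)$.

Conversely, for each $\omega = \{\omega_i\} \in \Omega^n(X_\Gamma)$ we can define an automorphic form

\[ f(z) = \omega_i(\psi_i(\pi(z)))\bigl(d(\psi_i \circ \pi(z))/dz\bigr)^n \in A_{2n}(\Gamma) \]

for which $\omega = \omega_f$. This yields the following theorem.

Theorem 7.7. Let $A(\Gamma)_{\mathrm{even}} = \bigoplus_{n \in \mathbb{Z}} A_{2n}(\Gamma)$. Then $A(\Gamma)_{\mathrm{even}}$ is isomorphic to $\Omega(X_\Gamma)$ as a graded $\mathbb{C}$-algebra via the isomorphism $f \mapsto \omega_f$. In particular, $A_{2n}(\Gamma) \ne \{0\}$.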

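Let $f \in A_{2n}(\Gamma)$ be nonzero and choose $P \in X_\Gamma(\mathbb{C})$, $z_0 := z_P \in \pi^{-1}(P)$, and $z_P \in U_P \subseteq \mathbb{H}^*_\Gamma$ as above. For $z_P \in \mathbb{H}$ choose $\rho \in \mathrm{SL}_2(\mathbb{C})$ so that $\rho\mathbb{H} = \mathbb{D}$ and $\rho z_P = 0$. Choose $\psi_P\colon V_P \xrightarrow{\sim} W_P \subseteq \mathbb{D}$ so that $\psi_P \circ \pi(z) = (\rho z)^e$ for all $z \in U_P$, where $e$ is the ramification index at $P$. Then

\[ g_P(z) = f(z)\bigl(d(\rho z)/dz\bigr)^{-n}\bigl(e(\rho z)^{e-1}\bigr)^{-n} \]

and if we put $w = \rho z$ we have

\[ g_P \circ \rho^{-1}(w) = e^{-n} f(\rho^{-1}w)\bigl(d(\rho^{-1}w)/dw\bigr)^n w^{-n(e-1)}, \]

which implies that

\[ \operatorname{ord}_w(g_P \circ \rho^{-1}) = \operatorname{ord}_w\bigl(f(\rho^{-1}w)(d(\rho^{-1}w)/dw)^n\bigr) - n(e - 1). \]

Now $g_P = \omega_P \circ \psi_P \circ \pi$ and $\psi_P \circ \pi \circ \rho^{-1}(w) = w^e$, so

\[ \operatorname{ord}_w(g_P \circ \rho^{-1}) = e \operatorname{ord}_P(\omega_P) = e \operatorname{ord}_P(\omega_f). \]

We also have

\[ \operatorname{ord}_w\bigl(f(\rho^{-1}w)(d(\rho^{-1}w)/dw)^n\bigr) = \operatorname{ord}_{z_0}(f), \]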



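since $d(\rho z)/dz$ has neither a zero nor a pole at $z_0$. We thus define

\[ \operatorname{ord}_P(f) := \frac{1}{e}\operatorname{ord}_{z_0}(f), \]

and we then have

\[ \operatorname{ord}_P(\omega_f) = \operatorname{ord}_P(f) - n(1 - 1/e), \]

which shows that the definition of $\operatorname{ord}_P(f)$ does not depend on the choice of $z_0$ or $\rho$. When $n = 1$ and $z_0 \in \mathbb{H}$ we have

\[ \omega_f \text{ is holomorphic at } P \iff \operatorname{ord}_P(\omega_f) \ge 0 \iff \operatorname{ord}_P(f) \ge 0 \iff f \text{ is holomorphic at } z_0. \]

For $z_0$ a cusp of $\Gamma$ we choose $\sigma \in \mathrm{SL}_2(\mathbb{R})$ with $\sigma z_0 = \infty$ and define $h > 0$ and $\varphi_P$ as above, so that

\[ (f|_{2n}\sigma^{-1})(\sigma z) = \sum_{m \ge m_0} a_m (\varphi_P \circ \pi(z))^m, \]

with $m_0$ maximal (so $a_{m_0} \ne 0$). We have $\operatorname{ord}_P(g_P) = m_0 - n$ and define $\operatorname{ord}_P(f) := m_0$ so that we have $\operatorname{ord}_P(\omega_f) = \operatorname{ord}_P(f) - n$. When $P = \pi(z_0)$ is a cusp and $n = 1$ we have

\[ \omega_f \text{ is holomorphic at } P \iff \operatorname{ord}_P(\omega_f) \ge 0 \iff \operatorname{ord}_P(f) \ge 1 \iff f \text{ vanishes at } z_0. \]

Theorem 7.8. The correspondence $f \mapsto \omega_f$ induces an isomorphism $S_2(\Gamma) \simeq \Omega^1_0(X_\Gamma)$.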

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.

[2] Fred Diamond and Jerry Shurman, A first course in modular forms, Springer, 2005.

[3] John Voight, Quaternion algebras, Springer, 2021.



18.786 Number theory II Spring 2024
Lecture #8 2/28/2024

These notes summarize the material in §2.3–2.5 of [1] covered in lecture.

8.1 Divisors of automorphic forms


For a compact Riemann surface X we define

Div(X )Q = Div(X ) ⊗Z Q

and call elements of Div(X )Q divisors with rational coefficients.


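Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice, so that $X_\Gamma := \Gamma\backslash\mathbb{H}^*_\Gamma$ is a compact Riemann surface, and let $\pi_\Gamma\colon \mathbb{H}^*_\Gamma \to X_\Gamma$ be the projection map. Let $f \in A_{2n}(\Gamma)$ be a nonzero automorphic form of weight $2n$, and let $P \in X_\Gamma(\mathbb{C})$ and $z_P \in \pi_\Gamma^{-1}(P)$. In the previous lecture we defined

\[ \operatorname{ord}_P(f) = \begin{cases} \frac{1}{e}\operatorname{ord}_{z_P}(f) & \text{if } P \text{ is an elliptic point of order } e,\\ \operatorname{ord}_\infty(f|_{2n}\sigma^{-1}) & \text{if } P \text{ is a cusp and } \sigma z_P = \infty,\\ \operatorname{ord}_{z_P}(f) & \text{otherwise.} \end{cases} \]

Here $\operatorname{ord}_\infty(f|_{2n}\sigma^{-1})$ is the index $m$ of the first nonzero coefficient $a_m$ in the Fourier expansion $(f|_{2n}\sigma^{-1})(z) = \sum_m a_m e^{2\pi i m z/h}$ with $\pm\sigma\Gamma_{z_P}\sigma^{-1} = \pm\bigl\langle\bigl(\begin{smallmatrix}1 & h\\ 0 & 1\end{smallmatrix}\bigr)\bigr\rangle$. We also recall that $e = \#\Gamma_{z_P}/Z(\Gamma)$ is equal to the ramification index of the map $\pi_\Gamma$ at $z_P$ (the number of preimages in a sufficiently small neighborhood of $z_P$ of every $Q \ne P$ sufficiently close to $P$).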
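The lattice $\Gamma$ has only finitely many inequivalent elliptic points and cusps (it is cofinite, hence geometrically finite), so we may define

\[ \operatorname{div}(f) := \sum_{P \in X_\Gamma(\mathbb{C})} \operatorname{ord}_P(f) P \in \operatorname{Div}(X_\Gamma)_{\mathbb{Q}}. \]

In the previous lecture we associated to $f$ a meromorphic differential $\omega_f \in \Omega^n(X_\Gamma)$ with

\[ \operatorname{ord}_P(\omega_f) = \begin{cases} \operatorname{ord}_P(f) - n\bigl(1 - \frac{1}{e}\bigr) & \text{if } P \text{ is an elliptic point of order } e,\\ \operatorname{ord}_P(f) - n & \text{if } P \text{ is a cusp},\\ \operatorname{ord}_P(f) & \text{otherwise.} \end{cases} \]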

Note that unlike div( f ), the divisor div(ω f ) ∈ Div(X Γ ) has integer coefficients (it is the divisor
of a differential form), and the Riemann–Roch theorem implies

deg(div(ω f )) = 2n(g − 1), (1)

where g := g(X Γ ) is the genus of X Γ , as is true for any meromorphic differential of degree n on
a compact Riemann surface of genus g.
The following theorem follows immediately from the formulas above.

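Theorem 8.1. For $k \in 2\mathbb{Z}$ and nonzero $f \in A_k(\Gamma)$ we have

\begin{align*}
\operatorname{div}(f) &= \operatorname{div}(\omega_f) + \frac{k}{2}\sum_{P \in X_\Gamma(\mathbb{C})}\Bigl(1 - \frac{1}{e_P}\Bigr)P,\\
\deg\operatorname{div}(f) &= k(g - 1) + \frac{k}{2}\sum_{P \in X_\Gamma(\mathbb{C})}\Bigl(1 - \frac{1}{e_P}\Bigr),
\end{align*}

where $e_P$ is the ramification index of $P$ and $g = g(X_\Gamma)$, with $1/e_P = 0$ when $P$ is a cusp.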



Corollary 8.2 (Sturm bound). Let $k \in 2\mathbb{Z}_{>0}$, let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice with $r \ge 1$ inequivalent cusps and $s \ge 0$ inequivalent elliptic points, and let $g$ be the genus of $X_\Gamma$. Let $x$ be a cusp of $\Gamma$ and choose $\sigma \in \mathrm{SL}_2(\mathbb{R})$ so that $\sigma x = \infty$. Suppose $f_1, f_2 \in M_k(\Gamma)$ have Fourier expansions

\[ (f_1|_k\sigma^{-1})(z) = \sum_{n \ge 0} a_n e^{2\pi i n z/h} \quad\text{and}\quad (f_2|_k\sigma^{-1})(z) = \sum_{n \ge 0} b_n e^{2\pi i n z/h} \]

at $x$, with $a_n = b_n$ for $n \le k(g - 1 + (r + s)/2)$. Then $f_1 = f_2$.

Proof. Suppose $f_1 \ne f_2$. Then $f_1 - f_2$ and $\omega := \omega_{f_1 - f_2}$ are nonzero. Let $P = \pi_\Gamma(x)$, and let $n > k(g - 1) + k(r + s)/2$ be the least positive integer for which $a_n \ne b_n$. Then

\[ \operatorname{ord}_P(\omega) = \operatorname{ord}_P(f_1 - f_2) - \frac{k}{2} = n - \frac{k}{2} > k(g - 1 + (r + s - 1)/2). \]

Now $f_1 - f_2$ is holomorphic, so $\operatorname{ord}_Q(\omega) \ge -k/2$ for every elliptic point or cusp $Q \ne P$, and $\operatorname{ord}_R(\omega) \ge 0$ for every ordinary point $R$. It follows that

\[ \deg(\operatorname{div}(\omega)) > k(g - 1 + (r + s - 1)/2) - (r + s - 1)k/2 = k(g - 1), \]

but this contradicts the degree equality (1) implied by Riemann–Roch, so $f_1 = f_2$.

Remark 8.3. This corollary has some remarkable implications. If k(g − 1) + k(r + s)/2 < 0
the hypothesis always holds, implying that every f ∈ Mk (Γ ) is zero. It also implies that every
f ∈ Mk (Γ ) is uniquely determined by a finite prefix of its Fourier expansion at any cusp. This
yields an effective algorithm for testing whether two modular forms are equal or not, and if we
fix a basis for the finite dimensional C-vector space Mk (Γ ) it allows us to explicitly compute the
representation of any f ∈ Mk (Γ ) as a linear combination of basis elements.
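
The equality test described in Remark 8.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sturm_bound` and `equal_by_sturm` are hypothetical, Fourier coefficients at the chosen cusp are assumed to be given as exact lists indexed from $n = 0$, and the sample values $g = 0$, $r = 1$, $s = 2$ are the standard data for $\mathrm{SL}_2(\mathbb{Z})$, which the notes do not state here.

```python
from fractions import Fraction


def sturm_bound(k, g, r, s):
    """Return floor(k * (g - 1 + (r + s)/2)), the bound of Corollary 8.2.

    k: even positive weight, g: genus, r: number of inequivalent cusps,
    s: number of inequivalent elliptic points.
    """
    bound = Fraction(k) * (Fraction(g) - 1 + Fraction(r + s, 2))
    return bound.numerator // bound.denominator  # exact floor


def equal_by_sturm(a, b, k, g, r, s):
    """Decide f1 == f2 from coefficient lists a, b (index n holds a_n, b_n)."""
    n = sturm_bound(k, g, r, s)
    if n < 0:
        return True  # hypothesis is vacuous, so M_k(Gamma) = {0}
    if len(a) <= n or len(b) <= n:
        raise ValueError("need coefficients a_0..a_n with n = %d" % n)
    return a[: n + 1] == b[: n + 1]


# Assumed example data: SL2(Z) has g = 0, r = 1 cusp, s = 2 elliptic points.
print(sturm_bound(12, 0, 1, 2))
```

Exact rational arithmetic is used so that the floor is computed without rounding error; the bound is not claimed to be optimal.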

8.2 Odd weight


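For odd $k$ we may assume $-1 \notin \Gamma$, since if $-1 \in \Gamma$ then $f = f|_k(-1) = -f$ forces $f = 0$. Let $f \in A_k(\Gamma)$ be nonzero. Then $f^2 \in A_{2k}(\Gamma)$, and we define

\[ \operatorname{ord}_P(f) := \operatorname{ord}_P(f^2)/2 \]

and

\[ \operatorname{div}(f) := \sum_{P \in X_\Gamma(\mathbb{C})} \operatorname{ord}_P(f) P. \]

Theorem 8.4. Let $k$ be an odd integer and assume $-1 \notin \Gamma$. For each nonzero $f \in A_k(\Gamma)$ we have

\begin{align*}
\operatorname{div}(f) &= \frac{1}{2}\operatorname{div}(\omega_{f^2}) + \frac{k}{2}\sum_P\Bigl(1 - \frac{1}{e_P}\Bigr)P,\\
\deg\operatorname{div}(f) &= k(g - 1) + \frac{k}{2}\sum_P\Bigl(1 - \frac{1}{e_P}\Bigr),\\
\operatorname{ord}_P(f) &\in \begin{cases} \frac{1}{2} + \mathbb{Z} & \text{if } P \text{ is an irregular cusp},\\ \frac{1}{e_P}\mathbb{Z} & \text{if } P \text{ is an elliptic point},\\ \mathbb{Z} & \text{otherwise}, \end{cases}\\
\operatorname{ord}_P(\omega_{f^2}) &\in \begin{cases} 1 + 2\mathbb{Z} & \text{if } P \text{ is a regular cusp},\\ 2\mathbb{Z} & \text{otherwise}, \end{cases}
\end{align*}

where $e_P$ is the ramification index at $P$, with $1/e_P = 0$ if $P$ is a cusp.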



8.3 Computing the measure of X Γ
As above, let Γ ≤ SL2 (R) be a lattice and let X Γ := Γ \H∗Γ .

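Lemma 8.5. Let $m$ be the lcm of the orders of all elliptic $P \in X_\Gamma(\mathbb{C})$. There is a nonzero $f \in A_{2m}(\Gamma)$ which has no zeros or poles at any cusp or any elliptic point in $X_\Gamma(\mathbb{C})$.

Proof. Let $f \in A_{2m}(\Gamma)$ be nonzero and put $k := 2m$. Then $\operatorname{div}(f) = \operatorname{div}(\omega_f) + \frac{k}{2}\sum_P\bigl(1 - \frac{1}{e_P}\bigr)P \in \operatorname{Div}(X_\Gamma)$, since $k/2 = m \in \mathbb{Z}$ is divisible by every $e_P \in \mathbb{Z}$. Let $S$ be the (finite) set of elliptic points and cusps in $X_\Gamma(\mathbb{C})$, let $g := g(X_\Gamma)$, and choose $n \in \mathbb{Z}$ so that

\[ -\deg(\operatorname{div}(f)) - 1 + n > 2g - 2. \]

Fix $Q \notin S$ and consider the divisors $D_Q := -\operatorname{div}(f) + nQ$ and $D_P := -\operatorname{div}(f) - P + nQ$ for each $P \in S$. We have $D_Q \ge D_P$, so $L(D_Q) \supseteq L(D_P)$, and $\deg(D_Q) > \deg(D_P) > 2g - 2$, so

\[ \dim L(D_Q) - \dim L(D_P) = \deg(D_Q) - \deg(D_P) = 1 \]

for all $P \in S$ by Riemann–Roch, since $\ell(D) = \deg(D) - g + 1$ whenever $\deg(D) > 2g - 2$. It follows that the Riemann–Roch space

\[ L(D_Q) = \{ h \in \mathbb{C}(X_\Gamma) : h = 0 \text{ or } \operatorname{div}(h) + D_Q \ge 0 \} \]

contains some $h \in \mathbb{C}(X_\Gamma) \simeq A_0(\Gamma)$ that does not lie in any of the $L(D_P)$ (a $\mathbb{C}$-vector space is not a finite union of proper subspaces), with $\operatorname{ord}_P(f/h) = 0$ for $P \in S$, since $\operatorname{ord}_P(h) = \operatorname{ord}_P(f)$ for $P \in S$. Then $f/h \in A_{2m}(\Gamma)$ has no zeros or poles at any cusp or elliptic point in $X_\Gamma(\mathbb{C})$.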

Recall that $v(X_\Gamma)$ is defined as the measure of any (measurable) fundamental domain for $\Gamma$ under the hyperbolic measure $dv(z) := \frac{dx\,dy}{y^2}$ on $\mathbb{H}$, where $z = x + iy$, which is induced by the Haar measure of $\mathrm{SL}_2(\mathbb{R})$.

Theorem 8.6. For any lattice $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ we have

\[ \frac{1}{2\pi} v(X_\Gamma) = 2g - 2 + \sum_{P \in X_\Gamma(\mathbb{C})}\Bigl(1 - \frac{1}{e_P}\Bigr), \]

where $e_P$ is the ramification index of $P$, with $1/e_P = 0$ when $P$ is a cusp, and $g$ is the genus of $X_\Gamma$.
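
To see the formula of Theorem 8.6 in action, the following Python sketch evaluates its right-hand side exactly. The helper name `area_over_2pi` is hypothetical, and the sample input (genus 0, elliptic points of orders 2 and 3, one cusp) is the standard data for $\mathrm{SL}_2(\mathbb{Z})$, assumed here rather than taken from the notes.

```python
from fractions import Fraction
import math


def area_over_2pi(genus, elliptic_orders, num_cusps):
    """Return v(X_Gamma)/(2*pi) = 2g - 2 + sum_P (1 - 1/e_P) as a Fraction.

    Ordinary points have e_P = 1 and contribute 0; each cusp contributes 1
    because of the convention 1/e_P = 0 at cusps.
    """
    total = Fraction(2 * genus - 2)
    for e in elliptic_orders:
        total += 1 - Fraction(1, e)
    total += num_cusps
    return total


# Assumed example: SL2(Z), giving v = 2*pi/6 = pi/3.
ratio = area_over_2pi(0, [2, 3], 1)
print(ratio, float(ratio) * 2 * math.pi)
```

A positive value is necessary for the data to come from a lattice, which gives a quick consistency check on a proposed signature $(g; e_1, \ldots, e_r; t)$.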

Proof sketch. Pick $m$ and nonzero $f \in A_{2m}(\Gamma)$ via Lemma 8.5, and a measurable fundamental domain $F$ for $\Gamma$ such that $f$ has no zeros or poles on $\partial F$. Let $v_1, \ldots, v_r$ be the vertices of $F$ that are cusps, and pick curves $C_1, \ldots, C_r$ with $C_i$ in a neighborhood of $v_i$ such that $\pi_\Gamma(C_i)$ is a circle around $\pi_\Gamma(v_i)$ oriented counterclockwise. Let $M$ be the compact set bounded by $F$ and the curves $C_1, \ldots, C_r$. Then

\[ v(X_\Gamma) = v(F) = \lim_{C_i \to v_i} \int_M \frac{dx \wedge dy}{y^2}, \]

where $C_i \to v_i$ means we are shrinking the neighborhoods of $v_i$ in which the $C_i$ lie so that the measure of the part of $F$ outside $M$ tends to zero. Applying Stokes' theorem yields

\[ \int_M \frac{dx \wedge dy}{y^2} = \int_{\partial M} \frac{dz}{y} = \int_{\partial M}\Bigl(\frac{dz}{y} + \frac{i}{m}\,d(\log f)\Bigr) - \frac{i}{m}\int_{\partial M} d(\log f), \]

where $\partial M$ is oriented counterclockwise. One can show that the integrand in the first term on the RHS is $\Gamma$-invariant, and since $\Gamma$ acts on pairs of sides of $F$ with opposite orientation, the integral tends to zero as $C_i \to v_i$, and we have

\[ v(X_\Gamma) = \lim_{C_i \to v_i} -\frac{i}{m}\int_{\partial M} d(\log f) = \frac{2\pi}{m}\deg(\operatorname{div}(f)) \]

by Cauchy's residue formula, since we can assume $f$ has no zeros or poles on $\partial M$ and its zeros and poles in $F$ all lie inside $M$. Applying Theorem 8.1 yields

\[ \frac{1}{2\pi} v(X_\Gamma) = \frac{1}{m}\Bigl(2m(g - 1) + m\sum_P\Bigl(1 - \frac{1}{e_P}\Bigr)\Bigr) = 2g - 2 + \sum_{P \in X_\Gamma(\mathbb{C})}\Bigl(1 - \frac{1}{e_P}\Bigr). \]

8.4 Dimension formulas


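Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice. For $D = \sum_P r_P P \in \operatorname{Div}(X_\Gamma)_{\mathbb{Q}}$ we define

\[ \lfloor D \rfloor := \sum_P \lfloor r_P \rfloor P. \]

Lemma 8.7. Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice with regular cusps $P_1, \ldots, P_s$ and irregular cusps $P_{s+1}, \ldots, P_t$. Let $f \in A_k(\Gamma)$ be nonzero. We have the following isomorphisms of $\mathbb{C}$-vector spaces:

• $M_k(\Gamma) \simeq L(\lfloor \operatorname{div}(f) \rfloor)$;
• $S_k(\Gamma) \simeq \begin{cases} L(\lfloor \operatorname{div}(f) - (P_1 + \cdots + P_t) \rfloor) & k \text{ even},\\ L(\lfloor \operatorname{div}(f) - (P_1 + \cdots + P_s + \frac{1}{2}P_{s+1} + \cdots + \frac{1}{2}P_t) \rfloor) & k \text{ odd}. \end{cases}$

Proof. We have $A_k(\Gamma) = \{ fh : h \in A_0(\Gamma) \}$, and it follows that

\begin{align*}
M_k(\Gamma) &= \{ fh : h \in A_0(\Gamma),\ h = 0 \text{ or } \operatorname{div}(fh) \ge 0 \}\\
&\simeq \{ h \in \mathbb{C}(X_\Gamma) : h = 0 \text{ or } \operatorname{div}(h) + \operatorname{div}(f) \ge 0 \}\\
&= L(\lfloor \operatorname{div}(f) \rfloor),
\end{align*}

since $\operatorname{div}(h) + \operatorname{div}(f) \ge 0$ if and only if $\operatorname{div}(h) + \lfloor \operatorname{div}(f) \rfloor \ge 0$ (the divisor $\operatorname{div}(h)$ has integer coefficients).

Now consider $fh \in M_k(\Gamma)$ with $h \in A_0(\Gamma) \simeq \mathbb{C}(X_\Gamma)$. Then

\[ fh \in S_k(\Gamma) \iff \operatorname{ord}_{P_i}(fh) > 0 \text{ for } 1 \le i \le t. \]

If $k$ is even then

\begin{align*}
fh \in S_k(\Gamma) &\iff \operatorname{ord}_{P_i}(fh) \ge 1 \text{ for } 1 \le i \le t\\
&\iff \operatorname{ord}_{P_i}(f) + \operatorname{ord}_{P_i}(h) - 1 \ge 0 \text{ for } 1 \le i \le t\\
&\iff h \in L(\lfloor \operatorname{div}(f) - (P_1 + \cdots + P_t) \rfloor),
\end{align*}

which implies $S_k(\Gamma) \simeq L(\lfloor \operatorname{div}(f) - (P_1 + \cdots + P_t) \rfloor)$. For $k$ odd a similar argument works, except we use the bound $\operatorname{ord}_{P_i}(fh) \ge 1/2$ for $s < i \le t$.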



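Let $e_1, \ldots, e_r$ be the orders of the elliptic points and $t$ the number of cusps in $X_\Gamma(\mathbb{C})$, let $g := g(X_\Gamma)$, define

\[ d := 2g - 2 + \sum_{i=1}^r\Bigl(1 - \frac{1}{e_i}\Bigr) + t, \]

and fix a nonzero $f \in A_k(\Gamma)$. Theorem 8.1 implies that

\[ \deg(\operatorname{div}(f)) = \frac{k}{2}d \quad\text{and}\quad \deg(\lfloor \operatorname{div}(f) \rfloor) = k(g - 1) + \sum_{i=1}^r\Bigl\lfloor \frac{k}{2}\Bigl(1 - \frac{1}{e_i}\Bigr)\Bigr\rfloor + \Bigl\lfloor \frac{k}{2} \Bigr\rfloor t, \]

and we have $d = v(X_\Gamma)/(2\pi) > 0$, by Theorem 8.6. We now consider $k \in 2\mathbb{Z}$ as follows:

• $k < 0$: we have $\deg(\lfloor \operatorname{div}(f) \rfloor) \le \deg(\operatorname{div}(f)) = kd/2 < 0$, so $\dim M_k(\Gamma) = \dim S_k(\Gamma) = 0$.

• $k = 0$: we have $f \in \mathbb{C}(X_\Gamma)^\times$ and $\dim M_0(\Gamma) = \ell(\operatorname{div}(f)) = 1$ and

\[ \dim S_0(\Gamma) = \ell(\operatorname{div}(f) - (P_1 + \cdots + P_t)) = \begin{cases} 1 & \text{if } t = 0,\\ 0 & \text{if } t > 0. \end{cases} \]

• $k = 2$: we have $S_2(\Gamma) \simeq \Omega^1_0(X_\Gamma)$, as shown in Lecture 7, so $\dim S_2(\Gamma) = \ell(\operatorname{div}(\omega_f)) = g$, and $M_2(\Gamma) \simeq S_2(\Gamma)$ if $t = 0$; otherwise Riemann–Roch implies

\[ \dim M_2(\Gamma) = \ell(\lfloor \operatorname{div}(f) \rfloor) = \deg(\lfloor \operatorname{div}(f) \rfloor) - g + 1 + \ell(\operatorname{div}(\omega_f) - \lfloor \operatorname{div}(f) \rfloor) = g - 1 + t. \]

• $k > 2$: fix any nonzero $f \in A_k(\Gamma)$ and let $D = \operatorname{div}(f) - (P_1 + \cdots + P_t)$, where $P_1, \ldots, P_t$ are the cusps in $X_\Gamma(\mathbb{C})$. Then $\deg(D) = \deg(\operatorname{div}(f)) - t$ and

\[ \deg(\lfloor D \rfloor) = k(g - 1) + \sum_{i=1}^r\Bigl\lfloor \frac{k}{2}\Bigl(1 - \frac{1}{e_i}\Bigr)\Bigr\rfloor + \Bigl(\frac{k}{2} - 1\Bigr)t \ge \frac{k - 2}{2}d + (2g - 2) > 2g - 2. \]

Lemma 8.7 and Riemann–Roch imply $\dim S_k(\Gamma) = \ell(\lfloor D \rfloor) = \deg(\lfloor D \rfloor) - g + 1$, and we also have $\dim M_k(\Gamma) = \dim S_k(\Gamma) + \ell(\lfloor \operatorname{div}(f) \rfloor) - \ell(\lfloor D \rfloor) = \dim S_k(\Gamma) + t$.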

This yields the following theorem.

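Theorem 8.8. Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice, let $e_1, \ldots, e_r$ be the orders of the elliptic points in $X_\Gamma(\mathbb{C})$, let $t$ be the number of cusps in $X_\Gamma(\mathbb{C})$, let $g$ be the genus of $X_\Gamma$ and let $k \in 2\mathbb{Z}$. Then

\[ \dim S_k(\Gamma) = \begin{cases} 0 & k < 0 \text{ or } k = 0,\ t > 0,\\ 1 & k = t = 0,\\ g & k = 2,\\ (k - 1)(g - 1) + \sum_{i=1}^r\bigl\lfloor \frac{k}{2}\bigl(1 - \frac{1}{e_i}\bigr)\bigr\rfloor + \bigl(\frac{k}{2} - 1\bigr)t & k > 2, \end{cases} \]

and

\[ \dim M_k(\Gamma) = \begin{cases} 0 & k < 0,\\ 1 & k = 0,\\ g & k = 2,\ t = 0,\\ g - 1 + t & k = 2,\ t > 0,\\ \dim S_k(\Gamma) + t & k > 2. \end{cases} \]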
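
The case analysis of Theorem 8.8 translates directly into code. The sketch below is illustrative only: the function names are hypothetical, and the sample signature (genus 0, elliptic orders 2 and 3, one cusp) is the standard data for $\mathrm{SL}_2(\mathbb{Z})$, assumed here rather than stated in the notes.

```python
def dim_cusp_forms(k, g, elliptic_orders, t):
    """dim S_k(Gamma) for even k, following the cases of Theorem 8.8."""
    if k % 2 != 0:
        raise ValueError("this sketch only handles even weight k")
    if k < 0:
        return 0
    if k == 0:
        return 1 if t == 0 else 0
    if k == 2:
        return g
    total = (k - 1) * (g - 1) + (k // 2 - 1) * t
    for e in elliptic_orders:
        # floor((k/2) * (1 - 1/e)) computed in exact integer arithmetic
        total += ((k // 2) * (e - 1)) // e
    return total


def dim_modular_forms(k, g, elliptic_orders, t):
    """dim M_k(Gamma) for even k, following the cases of Theorem 8.8."""
    if k % 2 != 0:
        raise ValueError("this sketch only handles even weight k")
    if k < 0:
        return 0
    if k == 0:
        return 1
    if k == 2:
        return g if t == 0 else g - 1 + t
    return dim_cusp_forms(k, g, elliptic_orders, t) + t


# Assumed example: SL2(Z) has g = 0, elliptic orders (2, 3), and t = 1 cusp.
for k in (4, 6, 8, 10, 12, 24):
    print(k, dim_cusp_forms(k, 0, [2, 3], 1), dim_modular_forms(k, 0, [2, 3], 1))
```

For the assumed $\mathrm{SL}_2(\mathbb{Z})$ data the first cusp form appears in weight 12, which is a convenient check on the floor terms.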



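For odd $k \ne 1$ a similar series of calculations yields the following result.

Theorem 8.9. Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice with $-1 \notin \Gamma$, let $e_1, \ldots, e_r$ be the orders of the elliptic points in $X_\Gamma(\mathbb{C})$, let $u$ and $v$ be the number of regular and irregular cusps in $X_\Gamma(\mathbb{C})$, let $g$ be the genus of $X_\Gamma$ and let $k \in 1 + 2\mathbb{Z}$. Then

\[ \dim S_k(\Gamma) = \begin{cases} 0 & k < 0,\\ (k - 1)(g - 1) + \sum_{i=1}^r\bigl\lfloor \frac{k}{2}\bigl(1 - \frac{1}{e_i}\bigr)\bigr\rfloor + \frac{k - 2}{2}u + \frac{k - 1}{2}v & k \ge 3, \end{cases} \]

and

\[ \dim M_k(\Gamma) = \begin{cases} 0 & k < 0,\\ \dim S_1(\Gamma) + \frac{u}{2} & k = 1,\\ \dim S_k(\Gamma) + u & k \ge 3. \end{cases} \]

Proof. See [1, Theorem 2.5.3].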

Remark 8.10. No general formula is known for dim S1 (Γ ), even for congruence subgroups.

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #9 3/13/2024

These notes summarize the material in §2.6 of [1] covered in lecture.

9.1 Poincaré series


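Let $k$ be an integer and let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice with a character $\chi\colon \Gamma \to U(1)$ of finite order such that $\chi(-1) = (-1)^k$ if $-1 \in \Gamma$. Let $\Lambda \le \Gamma$ and let $\phi$ be a meromorphic function on $\mathbb{H}$ such that
(i) $\phi|_k\lambda = \chi(\lambda)\phi$ for all $\lambda \in \Lambda$;
(ii) $\phi$ has only finitely many $\Lambda$-inequivalent poles $z_1, \ldots, z_m$;
(iii) if $x_1, \ldots, x_r$ are the $\Gamma$-inequivalent cusps of $\Gamma$, then for any open neighborhoods $U_i \ni z_i$ and $V_j \ni x_j$ we have

\[ \int_{\Lambda\backslash\mathbb{H}'} |\phi(z)| \operatorname{Im}(z)^{k/2}\,dv(z) < \infty, \]

where $\mathbb{H}' = \mathbb{H} - \bigcup_{i=1}^m \bigcup_{\lambda \in \Lambda} \lambda U_i - \bigcup_{j=1}^r \bigcup_{\gamma \in \Gamma} \gamma V_j$ and $dv(z)$ is the hyperbolic measure.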

We now define the Poincaré series

\[ F(z) := F_k(z; \phi, \chi, \Lambda, \Gamma) = \sum_{\gamma \in \Lambda\backslash\Gamma} \overline{\chi(\gamma)}\,(\phi|_k\gamma)(z). \]

If the sum defining $F(z)$ converges, then by construction we have $F|_k\gamma = \chi(\gamma)F$ for all $\gamma \in \Gamma$ and $F \in \Omega_k(\Gamma, \chi)$ is an automorphic form of weight $k$ for $\Gamma$ with character $\chi$.

Theorem 9.1. Let $\phi, \chi, \Lambda, \Gamma, z_i, F$ be as above. Then $F$ converges absolutely and uniformly on any compact subset of $\mathbb{H} - \{\gamma z_i : \gamma \in \Gamma, 1 \le i \le m\}$ and $F \in \Omega_k(\Gamma, \chi)$.

Proof. See Theorem 2.6.6 in [1].

Now let $x \in \mathbb{P}^1(\mathbb{R})$ be a cusp of $\Gamma$, choose $\sigma \in \mathrm{SL}_2(\mathbb{R})$ with $\sigma x = \infty$, and suppose that $\phi$ also satisfies
(iv) if $x$ is not a cusp of $\Lambda$ then $|(\phi|_k\sigma^{-1})(z)| \le M|z|^{-1-\varepsilon}$ on $\operatorname{Im}(z) > \ell$, for some $M, \ell, \varepsilon > 0$;
(v) if $x$ is a cusp of $\Lambda$ then $|(\phi|_k\sigma^{-1})(z)| \le M|z|^{-\varepsilon}$ on $\operatorname{Im}(z) > \ell$, for some $M, \ell > 0$ and $\varepsilon \ge 0$.

Theorem 9.2. Let $\Gamma, \Lambda, \phi, \chi, F$ be as above, and let $x_0$ be a cusp of $\Gamma$ such that (iv) and (v) hold for $x \in \Gamma x_0$. Then $F$ is holomorphic at $x_0$, and if $\varepsilon > 0$ in (v) then $F$ vanishes at $x_0$.

Proof. This is Theorem 2.6.7 in [1].

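We now assume that $\Gamma$ has at least one cusp $x$, with $\sigma x = \infty$, and that

\[ \chi(\gamma) j(\sigma\gamma\sigma^{-1}, z)^k = 1 \text{ for all } \gamma \in \Gamma_x, \tag{1} \]

a condition that does not depend on the choice of $\sigma$ and which implies that if $\chi$ is trivial, $k$ is odd, and $-1 \notin \Gamma$ then $x$ is a regular cusp. Recall that

\[ \pm\sigma\Gamma_x\sigma^{-1} = \pm\Bigl\langle\begin{pmatrix}1 & h\\ 0 & 1\end{pmatrix}\Bigr\rangle \]

for some $h > 0$. For each $m \in \mathbb{Z}_{\ge 0}$ we define

\[ \phi_m(z) := \phi_m(z; x, \sigma) := j(\sigma, z)^{-k} e^{2\pi i m \sigma z/h}. \]

It is easy to check that for all $k \ge 3$ the functions $\phi_m$ satisfy conditions (i) through (v) for $\Gamma$, $\chi$, $\Lambda = \Gamma_x$, $x$, and we have the following theorem.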



Theorem 9.3. Assume $k \ge 3$ and let $\Gamma, \chi, \phi_m, x$ be as above, satisfying conditions (i)–(v) and (1). Then

• If $m \ge 1$ then $F_k(z; \phi_m, \chi, \Gamma_x, \Gamma) \in S_k(\Gamma, \chi)$ is a cusp form of weight $k$ for $\Gamma$ with character $\chi$.

• If $m = 0$ then $F(z) := F_k(z; \phi_0, \chi, \Gamma_x, \Gamma) \in M_k(\Gamma, \chi)$ is a modular form of weight $k$ for $\Gamma$ with character $\chi$ whose Fourier expansion at $x$ has the form

\[ (F|_k\sigma^{-1})(z) = 1 + \sum_{n=1}^\infty a_n e^{2\pi i n z/h} \]

and which vanishes at all cusps that are $\Gamma$-inequivalent to $x$.

Definition 9.4. The Poincaré series $F_k(z; \phi_0, \chi, \Gamma_x, \Gamma)$ in Theorem 9.3 is an Eisenstein series.

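Theorem 9.5. Assume $k \ge 3$ and let $\Gamma, \chi, \phi_m, x$ be as above, satisfying conditions (i)–(v) and (1). Let $\sigma x = \infty$ and for $m \in \mathbb{Z}_{\ge 0}$ define

\[ g_k^{(m)} := F_k(z; \phi_m, \chi, \Gamma_x, \Gamma). \]

Let $f \in S_k(\Gamma, \chi)$ have Fourier expansion

\[ (f|_k\sigma^{-1})(z) = \sum_{n=1}^\infty a_n e^{2\pi i n z/h} \]

at $x$. Then

\[ \bigl\langle f, g_k^{(m)} \bigr\rangle = \int_{\Gamma\backslash\mathbb{H}} f(z)\overline{g_k^{(m)}(z)} \operatorname{Im}(z)^k\,dv(z) = \begin{cases} 0 & \text{if } m = 0,\\ a_m (4\pi m)^{1-k} h^k (k - 2)! & \text{if } m > 0. \end{cases} \]

Proof sketch. Without loss of generality we may assume $x = \infty$ and $\sigma = 1$. As shown in the proof of [1, Theorem 2.6.10], the integral in the Petersson inner product converges uniformly, allowing us to swap the order of integration and the sum defining $F_k$. With $z = x + iy$ we have

\begin{align*}
\int_{\Gamma\backslash\mathbb{H}} f(z)\overline{g_k^{(m)}(z)} \operatorname{Im}(z)^k\,dv(z)
&= \sum_{\gamma \in \Gamma_x\backslash\Gamma} \chi(\gamma)\int_{\Gamma\backslash\mathbb{H}} f(z) e^{-2\pi i m \gamma\bar{z}/h} j(\gamma, \bar{z})^{-k} \operatorname{Im}(z)^k\,dv(z)\\
&= \int_0^\infty\!\!\int_0^h f(z) e^{-2\pi i m \bar{z}/h} y^{k-2}\,dx\,dy\\
&= \sum_{n=1}^\infty a_n \int_0^\infty e^{-2\pi(m+n)y/h} y^{k-2}\,dy \int_0^h e^{2\pi i(n-m)x/h}\,dx\\
&= \begin{cases} 0 & \text{if } m = 0,\\ a_m (4\pi m)^{1-k} h^k (k - 2)! & \text{if } m > 0, \end{cases}
\end{align*}

where we have used the fact that $\int_0^1 e^{2\pi i n x}\,dx$ vanishes for all nonzero integers $n$.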
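
The final equality in the proof sketch rests on two elementary integrals. The following worked step, added as a reading aid, makes them explicit; it uses only the substitution $t = 4\pi m y/h$ and the identity $\Gamma(k - 1) = (k - 2)!$, and assumes $m > 0$ and $k \ge 2$ for convergence.

```latex
% Worked step (assumptions: m > 0, k >= 2, substitution t = 4*pi*m*y/h).
\begin{align*}
\int_0^h e^{2\pi i (n-m)x/h}\,dx &= \begin{cases} h & n = m,\\ 0 & n \ne m, \end{cases}\\
\int_0^\infty e^{-4\pi m y/h}\, y^{k-2}\,dy
  &= \Bigl(\frac{h}{4\pi m}\Bigr)^{k-1}\int_0^\infty e^{-t}\, t^{k-2}\,dt
   = (k-2)!\,(4\pi m)^{1-k} h^{k-1},\\
a_m \cdot (k-2)!\,(4\pi m)^{1-k} h^{k-1} \cdot h &= a_m (4\pi m)^{1-k} h^k (k-2)!.
\end{align*}
```

Only the term $n = m$ survives the $x$-integral, and when $m = 0$ no term with $n \ge 1$ survives, which gives the value $0$.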
Corollary 9.6. Under the hypotheses of Theorem 9.5, the set $\{g_k^{(m)}(z) : m \ge 1\}$ generates $S_k(\Gamma, \chi)$.



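Proof. Let $V$ be the subspace of $S_k(\Gamma, \chi)$ spanned by the $g_k^{(m)}$ with $m \ge 1$, let

\[ V^\perp := \bigl\{ f \in S_k(\Gamma, \chi) : \langle f, g_k^{(m)} \rangle = 0 \text{ for } m \ge 1 \bigr\} \]

be its orthogonal complement with respect to the Petersson inner product, and consider $f \in V^\perp$ with Fourier expansion $(f|_k\sigma^{-1})(z) = \sum_{n \ge 1} a_n e^{2\pi i n z/h}$ at $x$. Theorem 9.5 implies that $a_m = 0$ for all $m \ge 1$, so $f = 0$. Thus $V^\perp = \{0\}$, and since $S_k(\Gamma, \chi)$ is finite-dimensional, $V = S_k(\Gamma, \chi)$.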

Let $E_k(\Gamma, \chi)$ denote the orthogonal complement of $S_k(\Gamma, \chi)$ in $M_k(\Gamma, \chi)$ with respect to the Petersson inner product:

\[ E_k(\Gamma, \chi) := \{ g \in M_k(\Gamma, \chi) : \langle f, g \rangle = 0 \text{ for all } f \in S_k(\Gamma, \chi) \}. \]

Corollary 9.7. Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice and let $x_1, \ldots, x_r$ be the $\Gamma$-inequivalent cusps that satisfy (1). Assume $k \ge 3$ and let $\Gamma, \chi, \phi_m$ be as above, satisfying conditions (i)–(v). Then the functions

\[ g_i(z) := F_k(z; \phi_0, \chi, \Gamma_{x_i}, \Gamma) \in M_k(\Gamma, \chi) \qquad (1 \le i \le r) \]

are a basis for the $\mathbb{C}$-vector space $E_k(\Gamma, \chi)$.

Proof. Consider $f \in M_k(\Gamma, \chi)$. If $x$ is a cusp of $\Gamma$ that does not satisfy (1) then $f$ vanishes at $x$. Otherwise, let $a_0^{(i)}$ be the constant term in the Fourier expansion of $f$ at $x_i$. Theorem 9.3 implies that the function $f(z) - \sum_{i=1}^r a_0^{(i)} g_i(z)$ lies in $S_k(\Gamma, \chi)$, therefore $M_k(\Gamma, \chi) = S_k(\Gamma, \chi) \oplus V$, where $V$ is the subspace of $M_k(\Gamma, \chi)$ spanned by the $g_i$, and Theorem 9.5 implies that $V$ is contained in $E_k(\Gamma, \chi) = S_k(\Gamma, \chi)^\perp$, so $E_k(\Gamma, \chi) = V$ is spanned by the $g_i$, and Theorem 9.3 implies that the $g_i$ are linearly independent and thus form a basis.

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #10 3/13/2024

These notes summarize the material in §2.7–2.8 of [1] covered in lecture, along with some
relevant background on permutation modules.

10.1 Group rings and permutation modules


Let $R$ be a commutative ring. For any set $X$ we use $R[X]$ to denote the free $R$-module generated by the elements of $X$; it consists of all finite formal sums $\sum_{x \in X} r_x x$, with addition and scalar multiplication defined in the obvious way: we let $r_1 x + r_2 x := (r_1 + r_2)x$ and $r_1(r_2 x) = (r_1 r_2)x$ and extend these definitions $R$-linearly to all of $R[X]$.
If $G$ is a group, the $R$-module $R[G]$ is a ring with $r_1 g_1 \cdot r_2 g_2 := (r_1 r_2)(g_1 g_2)$ (to define the product of arbitrary elements of $R[G]$ use the distributive law), called the group ring $R[G]$ of $G$ over $R$; note that this ring is commutative if and only if $G$ is. If $X$ is a left/right $G$-set then the $R$-module $R[X]$ can be given the structure of a left/right $R[G]$-module by extending the $G$-action $R$-linearly. Note that this ensures that the $G$-action is compatible with the $R$-module structure of $R[X]$ in the sense that the action of each $g \in G$ defines an endomorphism of $R[X]$.
For subgroups $H \le G$ we use $[G/H]$ and $[H\backslash G]$ to denote the $G$-sets of left and right cosets of $H$ equipped with the obvious left and right $G$-actions ($g'$ sends $gH$ to $g'gH$ and $Hg$ to $Hgg'$). We then have permutation modules $R[G/H]$ and $R[H\backslash G]$ that are left and right $R[G]$-modules, respectively. If $K$ is another subgroup of $G$ we have an $R$-module of double cosets $R[H\backslash G/K]$ which admits left and right $G$-actions in which $g' \in G$ sends $HgK$ to $Hg'gK$ and $Hgg'K$; extending these $G$-actions $R$-linearly makes $R[H\backslash G/K]$ an $R[G]$-bimodule.
For $H, K \le G$ and $g \in G$, the decomposition of the double coset $HgK$ into right $H$-cosets is given by the orbit of $Hg \in [H\backslash G]$ under the right action of $K$. If this orbit is finite we can formally sum cosets to obtain an element of $R[H\backslash G]$. If every right $H$-coset in $[H\backslash G]$ has a finite $K$-orbit, we can use this to define an injective $R$-module homomorphism

\[ R[H\backslash G/K] \hookrightarrow R[H\backslash G], \qquad HgK \mapsto \sum_{Hg' \in \{Hgk : k \in K\}} Hg'. \]

Elements in the image of this homomorphism are $R$-linear combinations of sums of $K$-orbits in $[H\backslash G]$, hence invariant under the action of $K$. Let $R[H\backslash G]^K$ denote the $R$-submodule of $R[H\backslash G]$ whose elements are fixed by $K$. Every such element must be an $R$-linear sum of $K$-orbits in $[H\backslash G]$, corresponding to the image of an $R$-linear sum of double cosets in $[H\backslash G/K]$ that lies in the image of the map above. It follows that we then have an $R$-module isomorphism

\[ R[H\backslash G/K] \simeq R[H\backslash G]^K, \]

and if every $K$-coset in $[G/K]$ has a finite $H$-orbit we similarly obtain an $R$-module isomorphism

\[ R[H\backslash G/K] \simeq R[G/K]^H. \]

We now note that everything above applies more generally to any semigroup $\Delta \le G$. The semigroup ring $R[\Delta]$ is defined in the same way as the group ring $R[G]$ (and it is indeed a ring, even though $\Delta$ need not be a group). For $H, K \subseteq \Delta$ we have the right $R[\Delta]$-module $R[H\backslash\Delta]$ and the $R[\Delta]$-bimodule $R[H\backslash\Delta/K]$, and if every $K$-orbit in $[H\backslash\Delta]$ and every $H$-orbit in $[\Delta/K]$ is finite, we have $R$-module isomorphisms

\[ R[H\backslash\Delta/K] \simeq R[H\backslash\Delta]^K \simeq R[\Delta/K]^H. \]



10.2 Commensurability
We now investigate conditions under which the finite orbit assumptions above are satisfied.
Definition 10.1. Let G be a group. Two subgroups H, K ≤ G are commensurable if their inter-
section has finite index in both H and K; we write H ∼ K to indicate this relationship.
For any H, K, J ≤ G we have

[H : H ∩ J] ≤ [H : H ∩ K ∩ J] = [H : H ∩ K][H ∩ K : H ∩ K ∩ J] ≤ [H : H ∩ K][K : K ∩ J],

which implies that commensurability is transitive; it is clearly reflexive and symmetric, hence
an equivalence relation.
For $g \in G$, let $H^g$ denote $g^{-1}Hg \le G$. For any $H, K \le G$ and $g \in G$ we have $H \sim K$ if and only if $H^g \sim K^g$. It follows that if $H$ is commensurable with $H^{g_1}$ and $H^{g_2}$ then it is also commensurable with $H^{g_1 g_2}$, and therefore the commensurator

\[ \widetilde{H} := \{ g \in G : H \sim H^g \} \]

is a subgroup of $G$ that contains the normalizer of $H$ in $G$.

If $H, K \le G$ are commensurable, then $\widetilde{H} = \widetilde{K}$ and for any $g \in \widetilde{H} = \widetilde{K}$ we have finite left and right coset decompositions of the double coset $HgK$ via

\[ HgK = \bigsqcup_{i=1}^m h_i gK = \bigsqcup_{j=1}^n Hgk_j, \]

with $h_i \in H$ ranging over $H/(H \cap K^{g^{-1}})$ representatives, of which there are $m := [H : H \cap K^{g^{-1}}]$, and $k_j$ ranging over $(K \cap H^g)\backslash K$ representatives, of which there are $n = [K : K \cap H^g]$.
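
A concrete instance of such a finite right coset decomposition can be checked by machine. The Python sketch below takes $H = K = \mathrm{SL}_2(\mathbb{Z})$ and $g = \bigl(\begin{smallmatrix}1 & 0\\ 0 & p\end{smallmatrix}\bigr)$ with $p$ prime, and relies on the standard fact (assumed here, not proved in these notes) that $HgH$ is the union of the $p + 1$ right cosets represented by $\bigl(\begin{smallmatrix}1 & b\\ 0 & p\end{smallmatrix}\bigr)$ for $0 \le b < p$ and $\bigl(\begin{smallmatrix}p & 0\\ 0 & 1\end{smallmatrix}\bigr)$. The function names are hypothetical; the code only verifies that these representatives lie in pairwise distinct right cosets.

```python
def coset_reps(p):
    """Assumed representatives of SL2(Z) \\ SL2(Z) diag(1, p) SL2(Z), p prime.

    Matrices are stored as ((a, b), (c, d)).
    """
    reps = [((1, b), (0, p)) for b in range(p)]
    reps.append(((p, 0), (0, 1)))
    return reps


def same_right_coset(A, B):
    """True iff SL2(Z) A == SL2(Z) B, i.e. A * B^{-1} lies in SL2(Z).

    A and B are integer matrices with positive determinant.
    """
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    det_a = a * d - b * c
    det_b = e * h - f * g
    # A * adj(B), where adj(B) = ((h, -f), (-g, e)) and B^{-1} = adj(B) / det(B)
    entries = (a * h - b * g, -a * f + b * e, c * h - d * g, -c * f + d * e)
    return det_a == det_b and all(x % det_b == 0 for x in entries)


p = 5
reps = coset_reps(p)
distinct = all(
    not same_right_coset(reps[i], reps[j])
    for i in range(len(reps))
    for j in range(i + 1, len(reps))
)
print(len(reps), distinct)
```

The count $p + 1$ plays the role of the index $n = [K : K \cap H^g]$ in the decomposition above.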

10.3 Hecke algebras


Now let $S$ be a set of commensurable subgroups $\Gamma \le G$, and let $\Delta \subseteq G$ be a semigroup that contains every $\Gamma \in S$ and is contained in their common commensurator in $G$. We will eventually apply this with $\Gamma = \mathrm{SL}_2(\mathbb{Z}) \le \mathrm{GL}_2^+(\mathbb{R}) = G$ and $S$ the set of congruence subgroups of $\mathrm{SL}_2(\mathbb{Z})$, but rather than letting $\Delta$ be the commensurator of $\mathrm{SL}_2(\mathbb{Z})$ in $\mathrm{GL}_2^+(\mathbb{R})$ (which is $\mathbb{R}^\times\mathrm{GL}_2^+(\mathbb{Q})$), we will take $\Delta$ to be a semigroup of integer matrices with positive determinant.
Let $M$ be an $R$-module equipped with a right $\Delta$-action that is compatible with the $R$-module structure of $M$, making $M$ an $R[\Delta]$-module. Let $\Gamma, \Gamma' \in S$, and consider a double coset $\Gamma\alpha\Gamma'$. If $\Gamma\alpha\Gamma' = \bigsqcup_i \Gamma\alpha_i$ is a right coset decomposition, we define an action of $\Gamma\alpha\Gamma'$ on $M^\Gamma$ via

\[ m|\Gamma\alpha\Gamma' := \sum_i m\alpha_i. \]

This definition does not depend on the choice of the $\alpha_i \in \Gamma\alpha\Gamma'$, since if $\Gamma\alpha'_i = \Gamma\alpha_i$ then $\alpha'_i = \gamma_i\alpha_i$ for some $\gamma_i \in \Gamma$ and $m\alpha'_i = m\gamma_i\alpha_i = (m\gamma_i)\alpha_i = m\alpha_i$ for all $m \in M^\Gamma$. We also note that $m|\Gamma\alpha\Gamma' \in M$ is $\Gamma'$-invariant, since for any $\gamma' \in \Gamma'$, if we let $\alpha'_i = \alpha_i\gamma'$ then

\[ (m|\Gamma\alpha\Gamma')\gamma' = \sum_i m\alpha_i\gamma' = \sum_i m\alpha'_i = m|\Gamma\alpha\Gamma', \]

because $\Gamma\alpha\Gamma' = \bigsqcup_i \Gamma\alpha'_i$ is just another right coset decomposition. This defines an $R$-module homomorphism $M^\Gamma \to M^{\Gamma'}$. Extending this $R$-linearly to an action of $R[\Gamma\backslash\Delta/\Gamma']$ yields an $R$-module homomorphism $R[\Gamma\backslash\Delta/\Gamma'] \to \operatorname{Hom}_R(M^\Gamma, M^{\Gamma'})$.


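Now let $\Gamma_1\alpha\Gamma_2 = \bigsqcup_i \Gamma_1\alpha_i$ and $\Gamma_2\beta\Gamma_3 = \bigsqcup_j \Gamma_2\beta_j$, for some $\Gamma_1, \Gamma_2, \Gamma_3 \in S$ and $\alpha, \beta \in \Delta$, and define the product

\[ \Gamma_1\alpha\Gamma_2 \cdot \Gamma_2\beta\Gamma_3 = \sum_{\Gamma_1\gamma\Gamma_3} c_\gamma\,\Gamma_1\gamma\Gamma_3 \quad\text{where}\quad c_\gamma = \#\{(i, j) : \Gamma_1\alpha_i\beta_j = \Gamma_1\gamma\}, \tag{1} \]

with the summation over distinct double cosets $\Gamma_1\gamma\Gamma_3$ (not over all $\gamma \in \Delta$). Note that $c_\gamma$ can be nonzero only when $\Gamma_1\gamma\Gamma_3$ is equal to one of the finitely many double cosets $\Gamma_1\alpha_i\beta_j\Gamma_3$, so the product of $\Gamma_1\alpha\Gamma_2$ and $\Gamma_2\beta\Gamma_3$ is an element of $R[\Gamma_1\backslash\Delta/\Gamma_3]$, and this element does not depend on the choice of the $\alpha_i, \beta_j, \gamma \in \Delta$.
The decomposition $\Gamma_1\alpha\Gamma_2 = \bigsqcup_i \Gamma_1\alpha_i$ defines an $R$-module isomorphism

\[ R[\Gamma_1\backslash\Delta/\Gamma_2] \simeq R[\Gamma_1\backslash\Delta]^{\Gamma_2}, \]

as in §10.1. We can thus view $R[\Gamma_1\backslash\Delta/\Gamma_2]$ as a $\Gamma_2$-invariant $R$-module on which $R[\Gamma_2\backslash\Delta/\Gamma_3]$ acts, and this action is given by the multiplication induced by (1), which extends to multiplication of elements of $R[\Gamma_1\backslash\Delta/\Gamma_2]$ by elements of $R[\Gamma_2\backslash\Delta/\Gamma_3]$ via

\[ \Bigl(\sum_{\Gamma_1\alpha\Gamma_2} a_\alpha\,\Gamma_1\alpha\Gamma_2\Bigr)\Bigl(\sum_{\Gamma_2\beta\Gamma_3} b_\beta\,\Gamma_2\beta\Gamma_3\Bigr) = \sum_{\Gamma_1\alpha\Gamma_2,\,\Gamma_2\beta\Gamma_3} a_\alpha b_\beta\,(\Gamma_1\alpha\Gamma_2 \cdot \Gamma_2\beta\Gamma_3) \in R[\Gamma_1\backslash\Delta/\Gamma_3]. \]

It is easy to verify that for all $m \in M^{\Gamma_1}$, $A_1 \in R[\Gamma_1\backslash\Delta/\Gamma_2]$, $A_2 \in R[\Gamma_2\backslash\Delta/\Gamma_3]$ we have

\[ (m|A_1)|A_2 = m|(A_1A_2) \in M^{\Gamma_3}, \]

and that for all $A_1 \in R[\Gamma_1\backslash\Delta/\Gamma_2]$, $A_2 \in R[\Gamma_2\backslash\Delta/\Gamma_3]$, $A_3 \in R[\Gamma_3\backslash\Delta/\Gamma_4]$ we have

\[ (A_1A_2)A_3 = A_1(A_2A_3) \in R[\Gamma_1\backslash\Delta/\Gamma_4]. \]

If we now consider the case where $\Gamma_1 = \Gamma_2 = \Gamma_3 = \Gamma$, the multiplication defined above makes $R[\Gamma\backslash\Delta/\Gamma]$ into a ring, in fact an $R$-algebra, with identity $\Gamma = \Gamma 1_\Delta \Gamma$.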

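Definition 10.2. Let $G$ be a group, let $\Gamma \le G$ be a subgroup, let $\Gamma \subseteq \Delta \subseteq \widetilde{\Gamma}$ be a semigroup, and let $R$ be a ring. The Hecke algebra of $\Gamma$ over $R$ with respect to $\Delta$ is the $R$-algebra $R[\Gamma\backslash\Delta/\Gamma]$.

Recall that an anti-involution of a semigroup is a bijection $\iota\colon \Delta \to \Delta$ with $\iota(\alpha\beta) = \iota(\beta)\iota(\alpha)$ for all $\alpha, \beta \in \Delta$.

Theorem 10.3. Let $\Gamma \le G$ be a subgroup, $\Gamma \subseteq \Delta \subseteq \widetilde{\Gamma}$ a semigroup, $R$ a commutative ring, and $\iota\colon \Delta \to \Delta$ an anti-involution for which $\iota(\Gamma) = \Gamma$ and $\Gamma\iota(\alpha)\Gamma = \Gamma\alpha\Gamma$. The following hold:

• For all $\alpha \in \Delta$ the double coset $\Gamma\alpha\Gamma$ has a common set of left and right $\Gamma$-coset representatives: there exist $\alpha_1, \ldots, \alpha_r \in \Gamma\alpha\Gamma$ such that $\Gamma\alpha\Gamma = \bigsqcup_{i=1}^r \Gamma\alpha_i = \bigsqcup_{i=1}^r \alpha_i\Gamma$.
• The Hecke algebra $R[\Gamma\backslash\Delta/\Gamma]$ is commutative.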

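Proof. The existence of the anti-involution $\iota$ implies that decompositions of $\Gamma\alpha\Gamma$ into left or right cosets all have the same cardinality. Indeed, if $\Gamma\alpha\Gamma = \bigsqcup_{i=1}^r \Gamma\alpha_i$ then

\[ \Gamma\alpha\Gamma = \iota(\Gamma\alpha\Gamma) = \bigsqcup_{i=1}^r \iota(\Gamma\alpha_i) = \bigsqcup_{i=1}^r \iota(\alpha_i)\iota(\Gamma) = \bigsqcup_{i=1}^r \iota(\alpha_i)\Gamma. \]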



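Now suppose $\Gamma\alpha\Gamma = \bigsqcup_{i=1}^r \Gamma\beta_i = \bigsqcup_{i=1}^r \delta_i\Gamma$. Then $\Gamma\beta_i \cap \delta_j\Gamma \ne \emptyset$ for $1 \le i, j \le r$, otherwise $\Gamma\beta_i \subseteq \bigcup_{k \ne j} \delta_k\Gamma$ for some $i, j$ and $\Gamma\alpha\Gamma = \Gamma\beta_i\Gamma = \bigcup_{k \ne j} \delta_k\Gamma$, which is impossible. So we may pick $\alpha_i \in \Gamma\beta_i \cap \delta_i\Gamma$ so that $\Gamma\beta_i = \Gamma\alpha_i$ and $\delta_i\Gamma = \alpha_i\Gamma$ for $1 \le i \le r$. This proves the first claim.

For the second, given $\alpha, \beta \in \Delta$, the first claim lets us write $\Gamma\alpha\Gamma = \bigsqcup_i \Gamma\alpha_i = \bigsqcup_i \alpha_i\Gamma$ and $\Gamma\beta\Gamma = \bigsqcup_j \Gamma\beta_j = \bigsqcup_j \beta_j\Gamma$, and we then have

\[ \Gamma\alpha\Gamma \cdot \Gamma\beta\Gamma = \sum_{\Gamma\gamma\Gamma} c_\gamma\,\Gamma\gamma\Gamma \quad\text{and}\quad \Gamma\beta\Gamma \cdot \Gamma\alpha\Gamma = \sum_{\Gamma\gamma\Gamma} c'_\gamma\,\Gamma\gamma\Gamma, \]

where

\begin{align*}
c_\gamma &= \#\{(i, j) : \Gamma\alpha_i\beta_j = \Gamma\gamma\}\\
&= \#\{(i, j) : \Gamma\alpha_i\beta_j\Gamma = \Gamma\gamma\Gamma\}/\#(\Gamma\backslash\Gamma\gamma\Gamma)\\
&= \#\{(i, j) : \Gamma\iota(\beta_j)\iota(\alpha_i)\Gamma = \Gamma\iota(\gamma)\Gamma\}/\#(\Gamma\backslash\Gamma\iota(\gamma)\Gamma)\\
&= \#\{(i, j) : \Gamma\iota(\beta_j)\iota(\alpha_i) = \Gamma\iota(\gamma)\}\\
&= c'_\gamma.
\end{align*}

It follows that $\Gamma\alpha\Gamma \cdot \Gamma\beta\Gamma = \Gamma\beta\Gamma \cdot \Gamma\alpha\Gamma$, and this implies that $R[\Gamma\backslash\Delta/\Gamma]$ is commutative.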

10.4 Hecke operators on automorphic forms


We now consider a lattice Γ ∈ SL2 (R), viewed as a subgroup of GL+ 2 (R), and fix a semigroup
Γ ⊆ ∆ ⊆ Γ . If we have a finite order character χ on Γ , we assume it extends to a character χ
e
of ∆ such that χ(αγα−1 ) = χ(γ) whenever γ, αγα−1 ∈ Γ , a condition that is obviously satisfied
when χ is the trivial character.
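Let S be a set of commensurable subgroups of Γ. For Γ1, Γ2 ∈ S and α ∈ ∆ we have a decomposition $\Gamma_1\alpha\Gamma_2 = \bigsqcup_{i=1}^r \Gamma_1\alpha_i$, and for any automorphic form f ∈ A_k(Γ1, χ) we define

\begin{align*}
(f|_k\Gamma_1\alpha\Gamma_2)(z) &:= \det(\alpha)^{k/2-1}\sum_{i=1}^r \overline{\chi(\alpha_i)}\,(f|_k\alpha_i)(z) \\
&= \det(\alpha)^{k-1}\sum_{i=1}^r \overline{\chi(\alpha_i)}\,j(\alpha_i,z)^{-k}f(\alpha_i z).
\end{align*}

The conjugate on χ(α_i) compensates for the relation f|_kγ = χ(γ)f for γ ∈ Γ1 when α_i is replaced by γα_i.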

This definition does not depend on the choice of the αi ; see [1, Theorem 2.8.1].
We now specialize to the case R = Z.

Theorem 10.4. Let Γ1, Γ2, Γ3 be finite index subgroups of a lattice Γ ≤ SL_2(R), let ∆ be a semigroup containing Γ that lies in its commensurator in GL_2^+(R), let k be an integer, and let χ be a character of finite order on Γ extending to ∆ as above. The following hold:

• If f is an automorphic/modular/cusp form of weight k for Γ1 with character χ then f|_kΓ1αΓ2 is an automorphic/modular/cusp form of weight k for Γ2 with character χ.
• For all finite index Γ1 , Γ2 , Γ3 ≤ Γ and α, β ∈ ∆ we have

( f |k Γ1 αΓ2 )|k Γ2 βΓ3 = f |k (Γ1 αΓ2 · Γ2 βΓ3 ).

• The spaces Sk (Γ , χ) ⊆ Mk (Γ , χ) ⊆ Ak (Γ , χ) are right Z[Γ \∆/Γ ]-submodules.

Proof. This is Theorem 2.8.1 in [1].



Recall the Petersson inner product 〈 f , g〉, defined for f ∈ Sk (Γ ) and g ∈ Mk (Γ ).

Theorem 10.5. Let α ∈ GL_2^+(R) and define α′ := det(α)α^{−1}. Let Γ1, Γ2 be commensurable to a lattice Γ ≤ SL_2(R) with commensurator Γ̃ in GL_2^+(R), and let k be an integer. We then have:

• 〈f|_kα, g〉 = 〈f, g|_kα′〉 for all f ∈ S_k(Γ1) and g ∈ M_k(Γ2);
• 〈f|_kΓαΓ, g〉 = 〈f, g|_kΓα′Γ〉 for all f ∈ S_k(Γ) and g ∈ M_k(Γ).

Proof. This is Theorem 2.8.2 in [1].

Corollary 10.6. Let χ, ψ be distinct finite order characters of a lattice Γ ≤ SL_2(R). Then 〈f, g〉 = 0 for all f ∈ S_k(Γ, χ) and g ∈ M_k(Γ, ψ).

Proof. Pick γ ∈ Γ so that χ(γ) ≠ ψ(γ). Then γ′ := det(γ)γ^{−1} = γ^{−1} and

χ(γ)〈f, g〉 = 〈f|_kγ, g〉 = 〈f, g|_kγ^{−1}〉 = ψ(γ)〈f, g〉,

by Theorem 10.5, which is possible only if 〈f, g〉 = 0.

Corollary 10.7. If f ∈ E_k(Γ) then f|_kΓαΓ ∈ E_k(Γ) for all α ∈ Γ̃.

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #11 3/18/2024

These notes summarize the material in §4.3 and §4.5 of [1] covered in lecture.

11.1 Modular forms for congruence subgroups


Definition 11.1. For each positive integer N the principal congruence subgroup of level N is

Γ (N ) := {γ ∈ SL2 (Z) : γ ≡ 1 mod N } .

A subgroup Γ ≤ Γ (1) that contains Γ (N ) for some N ≥ 1 is a congruence subgroup. The least
such N ≥ 1 is the level of Γ . The set of congruence subgroups of level N includes the groups

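\[ \Gamma_0(N) := \left\{\gamma\in\Gamma(1) : \gamma\equiv\begin{pmatrix} * & * \\ 0 & * \end{pmatrix} \bmod N\right\}, \]
\[ \Gamma_1(N) := \left\{\gamma\in\Gamma(1) : \gamma\equiv\begin{pmatrix} 1 & * \\ 0 & * \end{pmatrix} \bmod N\right\}. \]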

Congruence subgroups are lattices in SL2 (R), so Fuchsian groups of the first kind.

If Γ is a congruence subgroup containing Γ(N), then $M_k(\Gamma)\subseteq M_k(\Gamma(N))$ for any k ∈ Z. Moreover, for $\delta_N := \begin{pmatrix} N & 0 \\ 0 & 1\end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{Q})$ we have

\[ \delta_N^{-1}\,\Gamma(N)\,\delta_N = \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in \mathrm{SL}_2(\mathbb{Z}) : c\equiv 0 \bmod N^2,\ a\equiv d\equiv 1 \bmod N\right\} \supseteq \Gamma_1(N^2). \]

Thus for any f ∈ M_k(Γ(N)) we have

\[ f(Nz) = N^{-k/2}(f|_k\delta_N)(z) \in M_k(\Gamma_1(N^2)), \]

and if $f(z) = \sum_{n\ge 0} a_n q^{n/N}$ is the q-expansion of f at ∞ (with $q(z) := e^{2\pi i z}$), then

\[ f(Nz) = \sum_{n\ge 0} a_n q^n, \]

and we can read off the Fourier coefficients of f(Nz) from those of f(z). It follows that provided we are willing to square the level when necessary, the study of modular forms for congruence subgroups reduces to the study of modular forms for Γ1(N).
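The conjugation identity above can be spot-checked numerically. The following is a minimal sketch, assuming 2×2 matrices are stored as nested tuples; the helper names (`mul`, `conj_by_delta`, `sample_gamma_N`) are illustrative and not from the notes, and only a handful of elements of Γ(N) are sampled.

```python
from fractions import Fraction

def mul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def conj_by_delta(gamma, N):
    """Return delta_N^{-1} * gamma * delta_N for delta_N = ((N, 0), (0, 1))."""
    (a, b), (c, d) = gamma
    return ((Fraction(a), Fraction(b, N)), (Fraction(c * N), Fraction(d)))

def satisfies_congruences(M, N):
    """Check: integral, det 1, c = 0 mod N^2, a = d = 1 mod N."""
    (a, b), (c, d) = M
    integral = all(x.denominator == 1 for x in (a, b, c, d))
    return (integral and a*d - b*c == 1 and c % (N*N) == 0
            and a % N == 1 % N and d % N == 1 % N)

def sample_gamma_N(N):
    """A few elements of Gamma(N), built from two unipotent elements."""
    U, L = ((1, N), (0, 1)), ((1, 0), (N, 1))
    return [U, L, mul(U, L), mul(L, U), mul(mul(U, L), U), mul(mul(L, L), U)]

checks = [satisfies_congruences(conj_by_delta(g, N), N)
          for N in (2, 3, 5) for g in sample_gamma_N(N)]
print(all(checks))
```

This only tests the inclusion of the conjugate in the set on the right for sampled elements; it is a sanity check, not a proof of equality.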
Now let χ be a Dirichlet character of modulus N; this means that χ : Z → C is a periodic multiplicative function that is the extension by zero of a group homomorphism (Z/NZ)^× → C^×. The set of all such χ forms a group X(N) that can be identified with the character group of (Z/NZ)^×, which is abstractly isomorphic to (Z/NZ)^× itself, since (Z/NZ)^× is a finite abelian group. We define a character χ : Γ0(N) → C^× via

\[ \chi\begin{pmatrix} a & b \\ c & d\end{pmatrix} := \chi(d). \]

Lemma 11.2. For each positive integer N we have C-vector space decompositions

\[ M_k(\Gamma_1(N)) = \bigoplus_{\chi\in X(N)} M_k(\Gamma_0(N),\chi) \qquad\text{and}\qquad S_k(\Gamma_1(N)) = \bigoplus_{\chi\in X(N)} S_k(\Gamma_0(N),\chi). \]

Proof. We have Γ1(N) ⊴ Γ0(N) and an action of Γ0(N) on M_k(Γ1(N)) and S_k(Γ1(N)) via f ↦ f|_kγ in which elements of Γ1(N) act trivially. This induces a representation of Γ0(N)/Γ1(N) ≃ (Z/NZ)^×.
Now (Z/N Z)× is abelian, with character group X (N ), so every such representation is induced
by a Dirichlet character χ ∈ X (N ), and the decompositions above are simply decompositions
into irreducible representations.

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland


The lemma implies that we can restrict our study of modular forms for congruence subgroups
to the spaces Mk (Γ0 (N ), χ) and Sk (Γ0 (N ), χ), and to simplify the notation we define

Mk (N , χ) := Mk (Γ0 (N ), χ) and Sk (N , χ) := Sk (Γ0 (N ), χ),

and just write Mk (N ) and Sk (N ) when the character χ is the trivial character of modulus N . For
any multiple M of N , each χ ∈ X (N ) induces a character χ ∈ X (M ), and we have

Mk (N , χ) ⊆ Mk (M , χ) and Sk (N , χ) ⊆ Sk (M , χ).

Definition 11.3. A Dirichlet character χ has even parity if χ(−1) = 1, and odd parity otherwise
(in which case we necessarily have χ(−1) = −1).

Lemma 11.4. If k and χ do not have the same parity then Mk (N , χ) = {0}.

Proof. We have −1 ∈ Γ0(N), and f = χ(−1) f|_k(−1) = χ(−1)(−1)^k f cannot hold for f ≠ 0 unless k and χ have the same parity.
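To make the parity condition concrete, here is a small sketch that lists the Dirichlet characters of a prime modulus p and records which weights k are compatible with each. It assumes p is prime so that (Z/pZ)^× is cyclic; the function names and the discrete-log construction are illustrative choices, not taken from the notes.

```python
import cmath

def primitive_root(p):
    """Smallest generator of (Z/pZ)^x for a prime p (brute force)."""
    for g in range(1, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("p must be prime")

def dirichlet_characters(p):
    """All p-1 Dirichlet characters of prime modulus p, as Python functions."""
    g = primitive_root(p)
    log = {pow(g, e, p): e for e in range(p - 1)}   # discrete logarithm table
    def make(t):
        def chi(n):
            n %= p
            if n == 0:
                return 0                              # extension by zero
            return cmath.exp(2 * cmath.pi * 1j * t * log[n] / (p - 1))
        return chi
    return [make(t) for t in range(p - 1)]

def parity(chi):
    """0 if chi(-1) = 1 (even), 1 if chi(-1) = -1 (odd)."""
    return 0 if abs(chi(-1) - 1) < 1e-9 else 1

def may_be_nonzero(k, chi):
    """Lemma 11.4: M_k(N, chi) = {0} unless k and chi have the same parity."""
    return (k - parity(chi)) % 2 == 0

chars = dirichlet_characters(7)
print([parity(chi) for chi in chars])
```

For p = 7 half of the six characters are even and half are odd, so for each weight k only three of the spaces M_k(7, χ) can be nonzero.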

Definition 11.5. Fix N ≥ 1. For each positive integer d coprime to N we define the diamond
operator 〈d〉 on Mk (Γ1 (N )) via
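\[ \langle d\rangle f := f|_k\alpha, \]
for any $\alpha = \begin{pmatrix} a & b \\ c & \delta\end{pmatrix}\in\Gamma_0(N)$ with δ ≡ d mod N. The map $\begin{pmatrix} a & b \\ c & d\end{pmatrix}\mapsto d \bmod N$ on Γ0(N) has kernel Γ1(N), so the definition of 〈d〉 does not depend on the choice of α, since f = f|_kγ for γ ∈ Γ1(N).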
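The fact that the lower-right entry gives a homomorphism Γ0(N) → (Z/NZ)^× with kernel Γ1(N) can be checked on examples. This is a sketch under the assumption N > 1; `gamma0_element` is a hypothetical helper that builds one matrix with prescribed lower-right entry, and it relies on the three-argument `pow` with exponent −1 (Python 3.8+).

```python
def mul(A, B):
    """Multiply two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def gamma0_element(d, N):
    """A matrix ((a, b), (N, d)) in Gamma_0(N) with lower-right entry d coprime to N."""
    a = pow(d, -1, N)        # a*d = 1 mod N
    b = (a * d - 1) // N     # chosen so that a*d - b*N = 1
    return ((a, b), (N, d))

def lower_right_mod(g, N):
    """Image of g under the map Gamma_0(N) -> (Z/NZ)^x."""
    return g[1][1] % N

def in_gamma1(g, N):
    """Membership test for Gamma_1(N), given g in SL_2(Z)."""
    (a, b), (c, d) = g
    return c % N == 0 and a % N == 1 and d % N == 1

N = 7
elements = [gamma0_element(d, N) for d in (1, 2, 3, 4, 5, 6, 8)]
multiplicative = all(
    lower_right_mod(mul(g, h), N) == (lower_right_mod(g, N) * lower_right_mod(h, N)) % N
    for g in elements for h in elements)
print(multiplicative)
```

Since c ≡ 0 mod N, the lower-right entry of a product is congruent to the product of the lower-right entries, which is exactly what the check confirms.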

The spaces Mk (N , χ) and Sk (N , χ) are χ-eigenspaces of diamond operators:

Mk (N , χ) = { f ∈ Mk (Γ1 (N )) : 〈d〉 f = χ(d) f for all d ∈ (Z/N Z)× },


Sk (N , χ) = { f ∈ Sk (Γ1 (N )) : 〈d〉 f = χ(d) f for all d ∈ (Z/N Z)× }.

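For each positive integer N we define

\[ \omega(N) = \begin{pmatrix} 0 & -1 \\ N & 0\end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{Q}). \]

Lemma 11.6. The map f ↦ f|_kω(N) induces isomorphisms

\[ M_k(N,\chi)\simeq M_k(N,\bar\chi) \qquad\text{and}\qquad S_k(N,\chi)\simeq S_k(N,\bar\chi). \]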

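Proof. Let f ∈ M_k(N, χ) and put g = f|_kω(N). For $\gamma = \begin{pmatrix} a & b \\ cN & d\end{pmatrix}\in\Gamma_0(N)$ we have

\[ \omega(N)\,\gamma\,\omega(N)^{-1} = \begin{pmatrix} d & -c \\ -bN & a\end{pmatrix}, \]

thus ω(N)Γ0(N)ω(N)^{−1} = Γ0(N) and g|_kγ = χ(a)g = χ̄(d)g = χ̄(γ)g, since ad ≡ 1 mod N.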
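The matrix identity used in this proof is easy to verify by direct computation. The sketch below uses exact rational arithmetic; the function name `conjugate_by_omega` is illustrative, and γ is passed by its entries a, b, c, d with lower-left entry cN as in the proof.

```python
from fractions import Fraction

def mul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def conjugate_by_omega(a, b, c, d, N):
    """Return omega(N) * gamma * omega(N)^{-1} for gamma = ((a, b), (c*N, d))."""
    omega = ((0, -1), (N, 0))
    omega_inv = ((Fraction(0), Fraction(1, N)), (Fraction(-1), Fraction(0)))
    gamma = ((a, b), (c * N, d))
    return mul(mul(omega, gamma), omega_inv)

# gamma = ((2, 1), (5, 3)) lies in Gamma_0(5): here a=2, b=1, c=1, d=3.
print(conjugate_by_omega(2, 1, 1, 3, 5) == ((3, -1), (-5, 2)))
```

The result has the shape ((d, −c), (−bN, a)), so conjugation by ω(N) swaps the roles of a and d, which is why χ is replaced by its conjugate.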

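For f ∈ M_k(N, χ) with q-expansion $\sum_{n\ge 0} a_n q^n$, we define $\tilde f := \sum_{n\ge 0}\bar a_n q^n$.

Lemma 11.7. For f ∈ M_k(N, χ) we have $\tilde f(z) = \overline{f(-\bar z)} \in M_k(N,\bar\chi)$, and for f ∈ S_k(N, χ) we have $\tilde f\in S_k(N,\bar\chi)$.

Proof. We have $q(z) = e^{2\pi i z}$, so $\overline{q(z)} = q(-\bar z)$ and $\tilde f(z) = \overline{f(-\bar z)}$ follows. For $\gamma = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\in\Gamma_0(N)$, if we put $\gamma' = \begin{pmatrix} a & -b \\ -c & d\end{pmatrix}\in\Gamma_0(N)$, then $\tilde f|_k\gamma = \widetilde{f|_k\gamma'} = \bar\chi(\gamma)\tilde f$.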


11.2 Hecke operators for congruence subgroups
Congruence subgroups have finite index in SL2 (Z) and are thus mutually commensurable (and
commensurable with finite index subgroups of SL2 (Z) that are not congruence subgroups).
Lemma 11.8. Let Γ be a finite index subgroup of SL_2(Z). Then Γ̃ is R^×GL_2^+(Q).

Proof. It suffices to consider Γ = SL_2(Z). Given α ∈ R^×GL_2^+(Q), we may choose c ∈ R^× so that β = cα ∈ GL_2^+(Q) is an integer matrix, with α^{−1}Γα = β^{−1}Γβ. Let m = det(β) ∈ Z≥1 so that mβ^{−1} has integer entries. For γ ∈ Γ(m) the matrix mβ^{−1}γβ ∈ GL_2^+(Q) has integer entries divisible by m, and it follows that β^{−1}γβ ∈ GL_2^+(Q) is an integer matrix with determinant 1, hence an element of Γ. Therefore β^{−1}Γ(m)β = α^{−1}Γ(m)α ⊆ Γ; it follows that αΓα^{−1} ∩ Γ contains Γ(m) and has finite index in Γ. Applying the same argument to α^{−1} ∈ R^×GL_2^+(Q) and conjugating by α shows that αΓα^{−1} ∩ Γ has finite index in αΓα^{−1}. Thus Γ and αΓα^{−1} are commensurable, so α ∈ Γ̃, which proves that R^×GL_2^+(Q) ⊆ Γ̃, since α was arbitrary.

Now suppose α ∈ Γ̃ ⊆ GL_2^+(R). Then αΓα^{−1} is commensurable with Γ, as is βΓβ^{−1}, where β is the inverse transpose of α. It follows that Γ, αΓα^{−1}, βΓβ^{−1} all have the same set of cusps, namely P^1(Q), since taking a finite index subgroup of a Fuchsian group does not change the cusps, as proved in Lecture 4. So α^{−1}0, α^{−1}∞, β^{−1}0, β^{−1}∞ are all elements of P^1(Q), and this implies that every ratio of entries of $\alpha = \begin{pmatrix} a & b \\ c & d\end{pmatrix}$ lies in P^1(Q), therefore α ∈ R^×GL_2^+(Q).

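We now define the following semigroups ∆0(N), ∆0*(N) ⊆ GL_2^+(Q):

\[ \Delta_0(N) := \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in M_2(\mathbb{Z}) : c\equiv 0 \bmod N,\ a\perp N,\ ad-bc>0\right\}, \]
\[ \Delta_0^*(N) := \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in M_2(\mathbb{Z}) : c\equiv 0 \bmod N,\ d\perp N,\ ad-bc>0\right\}. \]

We also note that

\[ \Delta_0(N)\cap\Delta_0^*(N) = \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in M_2(\mathbb{Z}) : c\equiv 0 \bmod N,\ ad-bc\perp N,\ ad-bc>0\right\}. \]

We now want to consider the Hecke algebras

\[ \mathbb{T}(N) := \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0(N)/\Gamma_0(N)], \qquad \mathbb{T}^*(N) := \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0^*(N)/\Gamma_0(N)]. \]

Lemma 11.9. For each α ∈ ∆0(N) there exist unique positive integers l|m with l ⊥ N such that

\[ \Gamma_0(N)\,\alpha\,\Gamma_0(N) = \Gamma_0(N)\begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}\Gamma_0(N), \]

and for each α ∈ ∆0*(N) there exist unique positive integers l|m with l ⊥ N such that

\[ \Gamma_0(N)\,\alpha\,\Gamma_0(N) = \Gamma_0(N)\begin{pmatrix} m & 0 \\ 0 & l\end{pmatrix}\Gamma_0(N). \]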

Proof. This is Lemma 4.5.2 in [1].

Theorem 11.10. The Hecke algebras T(N ),T∗ (N ) are commutative, and for α ∈ ∆0 (N ) ∪ ∆∗0 (N )
the double coset Γ0 (N )αΓ0 (N ) admits a common set of representatives for its decomposition into
left and right Γ0 (N )-cosets.
Proof. By Theorem 10.3 proved in Lecture 10, it suffices to exhibit an anti-involution ι for which ι(Γ0(N)) = Γ0(N) and Γ0(N)ι(α)Γ0(N) = Γ0(N)αΓ0(N), for all α ∈ ∆0(N) and α ∈ ∆0*(N). It follows from Lemma 11.9 that the map

\[ \begin{pmatrix} a & b \\ cN & d\end{pmatrix} \mapsto \begin{pmatrix} a & c \\ bN & d\end{pmatrix} \]

is such an involution.
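That this map reverses products and squares to the identity can be confirmed on sample matrices. The following sketch assumes matrices in ∆0(N) are stored as nested tuples of integers; `iota` and `in_delta0` are illustrative names, and the sample matrices are arbitrary choices for N = 6.

```python
from math import gcd

def mul(A, B):
    """Multiply two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def iota(alpha, N):
    """The map ((a, b), (cN, d)) -> ((a, c), (bN, d)); requires N | lower-left entry."""
    (a, b), (c, d) = alpha
    assert c % N == 0
    return ((a, c // N), (b * N, d))

def in_delta0(alpha, N):
    """Membership in Delta_0(N): c = 0 mod N, a coprime to N, positive determinant."""
    (a, b), (c, d) = alpha
    return c % N == 0 and gcd(a, N) == 1 and a*d - b*c > 0

N = 6
samples = [((5, 2), (12, 7)), ((1, 3), (6, 25)), ((7, -4), (-18, 11))]
anti_multiplicative = all(
    iota(mul(x, y), N) == mul(iota(y, N), iota(x, N))
    for x in samples for y in samples)
print(anti_multiplicative)
```

The check covers only the algebraic properties ι(αβ) = ι(β)ι(α) and ι∘ι = id; the double coset condition Γ0(N)ι(α)Γ0(N) = Γ0(N)αΓ0(N) is what Lemma 11.9 supplies.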




For χ ∈ X(N) and $\alpha = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\in\Delta_0(N)$ we define χ(α) = χ̄(a). Then χ is an extension of the induced character χ of Γ0(N) to ∆0(N) (on Γ0(N) we have ad ≡ 1 mod N, so χ̄(a) = χ(d)), and we claim that

χ(αγα^{−1}) = χ(γ)

for all γ ∈ Γ0(N) and α ∈ ∆0(N) for which αγα^{−1} ∈ Γ0(N). By Lemma 11.9 we may assume $\alpha = \begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}$ with l|m and l ⊥ N. For $\gamma = \begin{pmatrix} a & b \\ cN & d\end{pmatrix}\in\Gamma_0(N)$, if γ′ := αγα^{−1} ∈ Γ0(N) then bl ≡ 0 mod m and $\gamma' = \begin{pmatrix} a & bl/m \\ cNm/l & d\end{pmatrix}$, in which case χ(γ′) = χ(αγα^{−1}) = χ(γ), as required. It follows from Theorem 10.4 of Lecture 10 that the Hecke algebra T(N) acts on M_k(N, χ) and S_k(N, χ).
For ∆0*(N) we extend χ to χ*(α) = χ(d) and one similarly shows that T*(N) acts on M_k(N, χ) and S_k(N, χ).

Remark 11.11. For double cosets Γ0(N)αΓ0(N) ∈ T(N) ∩ T*(N) the actions of Γ0(N)αΓ0(N) as an element of T(N) and T*(N) need not coincide! However, the map α ↦ ω(N)^{−1}αω(N) induces an isomorphism ∆0(N) ≃ ∆0*(N) and we have

χ*(ω(N)^{−1}αω(N)) = χ(α)

for α ∈ ∆0(N).

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #12 3/20/2024

These notes summarize the material in §4.3 and §4.5 of [2] covered in lecture.

12.1 Hecke algebras for congruence subgroups


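Let N be a positive integer. Recall the semigroups ∆0(N), ∆0*(N) ⊆ GL_2^+(Q):

\[ \Delta_0(N) := \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in M_2(\mathbb{Z}) : c\equiv 0 \bmod N,\ a\perp N,\ ad-bc>0\right\}, \]
\[ \Delta_0^*(N) := \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in M_2(\mathbb{Z}) : c\equiv 0 \bmod N,\ d\perp N,\ ad-bc>0\right\}, \]

and the corresponding Hecke algebras:

\[ \mathbb{T}(N) := \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0(N)/\Gamma_0(N)], \qquad \mathbb{T}^*(N) := \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0^*(N)/\Gamma_0(N)], \]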

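which act on M_k(N, χ) and S_k(N, χ) for all N ≥ 1 and Dirichlet characters χ of modulus N via

\begin{align*}
f|_k\Gamma_0(N)\alpha\Gamma_0(N) &= \det(\alpha)^{k/2-1}\sum_{i=1}^r \overline{\chi(\alpha_i)}\, f|_k\alpha_i \\
&= \det(\alpha)^{k-1}\sum_{i=1}^r \overline{\chi(\alpha_i)}\, j(\alpha_i,z)^{-k} f(\alpha_i z),
\end{align*}

where $\Gamma_0(N)\alpha\Gamma_0(N) = \bigsqcup_{i=1}^r \Gamma_0(N)\alpha_i$ and we extend χ to $\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in\Delta_0(N)$ via χ(α) := χ̄(a) and to $\begin{pmatrix} a & b \\ c & d\end{pmatrix}\in\Delta_0^*(N)$ via χ*(α) := χ(d). It follows from [2, Lemma 4.5.2] that

\[ \Gamma_0(N)\backslash\Delta_0(N)/\Gamma_0(N) = \left\{\Gamma_0(N)\begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}\Gamma_0(N) : l,m\in\mathbb{Z}_{\ge 1} \text{ with } N\perp l\,|\,m\right\}, \]
\[ \Gamma_0(N)\backslash\Delta_0^*(N)/\Gamma_0(N) = \left\{\Gamma_0(N)\begin{pmatrix} m & 0 \\ 0 & l\end{pmatrix}\Gamma_0(N) : l,m\in\mathbb{Z}_{\ge 1} \text{ with } N\perp l\,|\,m\right\}, \]

and distinct pairs (l, m) yield distinct double cosets on the RHS in each case. For m ⊥ N we have

\[ \Gamma_0(N)\begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}\Gamma_0(N) = \Gamma_0(N)\begin{pmatrix} m & 0 \\ 0 & l\end{pmatrix}\Gamma_0(N), \tag{1} \]

but this does not contradict the statement above; we are just choosing a different representative for this double coset depending on whether we are working with ∆0(N) or ∆0*(N).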
For positive integers l, m, n with l|m and l ⊥ N we define the following elements of T(N), T*(N):

\[ T(l,m) := \Gamma_0(N)\begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}\Gamma_0(N) \qquad\text{and}\qquad T^*(m,l) := \Gamma_0(N)\begin{pmatrix} m & 0 \\ 0 & l\end{pmatrix}\Gamma_0(N), \]
\[ T(n) := \sum_{lm=n} T(l,m) \qquad\text{and}\qquad T^*(n) := \sum_{lm=n} T^*(m,l), \]

where the sums are taken over l, m ∈ Z≥1 with l|m and l ⊥ N as above, and we note that

\[ T(p) = T(1,p) \qquad\text{and}\qquad T^*(p) = T^*(p,1). \]

For n ⊥ N the double coset T(n, n) is also a right coset (since $\begin{pmatrix} n & 0 \\ 0 & n\end{pmatrix}$ is scalar), which implies

\[ T(n,n)T(l,m) = T(nl,nm) \qquad\text{and}\qquad T^*(n,n)T^*(m,l) = T^*(nm,nl). \]

Lemma 12.1. For any f ∈ M_k(N, χ) and positive integers l, m, n with l|m and lmn ⊥ N we have

\[ f|_kT^*(m,l) = \bar\chi(lm)\, f|_kT(l,m) \qquad\text{and}\qquad f|_kT^*(n) = \bar\chi(n)\, f|_kT(n). \]

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland


Proof. We have m ⊥ N and (1) implies that we can choose right coset representatives α_1, ..., α_r for T(l, m) = T*(m, l) with α_i ∈ ∆0(N) ∩ ∆0*(N) so that

\[ \chi^*(\alpha_i) = \chi(\det\alpha_i)\chi(\alpha_i) = \chi(lm)\chi(\alpha_i), \]

for 1 ≤ i ≤ r, since for $\alpha_i = \begin{pmatrix} a & b \\ c & d\end{pmatrix}$ we have

\[ \chi^*(\alpha_i)/\chi(\alpha_i) = \chi(d)/\bar\chi(a) = \chi(d)\chi(a) = \chi(ad) = \chi(ad-bc) = \chi(\det\alpha_i) = \chi(lm), \]

because c ≡ 0 mod N and χ is a Dirichlet character of modulus N. We then have

\begin{align*}
f|_kT^*(m,l) &= (lm)^{k/2-1}\sum_i \overline{\chi^*(\alpha_i)}\, f|_k\alpha_i \\
&= \bar\chi(lm)\,(lm)^{k/2-1}\sum_i \overline{\chi(\alpha_i)}\, f|_k\alpha_i \\
&= \bar\chi(lm)\, f|_kT(l,m),
\end{align*}

and

\[ f|_kT^*(n) = \sum_{lm=n} f|_kT^*(m,l) = \sum_{lm=n} \bar\chi(lm)\, f|_kT(l,m) = \bar\chi(n)\, f|_kT(n). \]

12.2 Quick linear algebra review


Let V be a complex vector space. A function 〈·, ·〉: V × V → C that satisfies

• 〈u + λv, w〉 = 〈u, w〉 + λ〈v, w〉,
• $\langle u,v\rangle = \overline{\langle v,u\rangle}$ (which then implies 〈u, v + λw〉 = 〈u, v〉 + λ̄〈u, w〉),
• 〈u, u〉 ≥ 0 with 〈u, u〉 = 0 only if u = 0,

for all λ ∈ C and u, v, w ∈ V is a (positive definite) Hermitian inner product. For every N ∈ Z≥1 ,
k ∈ Z and Dirichlet character χ of modulus N , the Petersson inner product is a Hermitian inner
product on the C-vector space Sk (N , χ), making it a (positive definite) Hermitian space.
If T is a linear transformation of a Hermitian space V , a linear operator T ∗ that satisfies
〈Tu, v〉 = 〈u, T ∗ v〉 for all u, v ∈ V is the adjoint operator of T ; it is necessarily unique and guar-
anteed to exist when V is finite dimensional (take the linear operator defined by the conjugate
transpose of a matrix representing T with respect to a 〈·, ·〉-orthonormal basis for V ).
For all u, v ∈ V and linear transformations S, T we have

• $\langle Tu,v\rangle = \langle u,T^*v\rangle = \overline{\langle T^*v,u\rangle} = \overline{\langle v,(T^*)^*u\rangle} = \langle (T^*)^*u,v\rangle$,
• 〈(S + T)u, v〉 = 〈Su, v〉 + 〈Tu, v〉 = 〈u, S*v〉 + 〈u, T*v〉 = 〈u, (S* + T*)v〉,

thus T is the adjoint of its adjoint and the adjoint of the sum is the sum of the adjoints.
Linear operators on a Hermitian space may be classified as

• Hermitian: T = T ∗ ,
• unitary: T T ∗ = 1,
• normal: T T ∗ = T ∗ T .



If V is finite dimensional and we fix a 〈·, ·〉-orthonormal basis, then the matrix M representing
T is Hermitian (M = M ∗ ), unitary (M M ∗ = 1), normal (M M ∗ = M ∗ M ) if and only if T is. Her-
mitian and unitary operators are normal, but normal operators need not be Hermitian/unitary.
The essential property of normal operators is that their eigenspaces are orthogonal, in fact this
property uniquely characterizes normal operators.
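A small example may help separate the three classes. The sketch below uses a 2×2 matrix chosen for illustration (it is not from the notes) that is normal but neither Hermitian nor unitary, and checks that its eigenvectors are orthogonal for the standard Hermitian inner product on C².

```python
def adjoint(M):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, v):
    """Apply a 2x2 matrix to a vector in C^2."""
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(u, v):
    """Standard Hermitian inner product, linear in the first argument."""
    return sum(x * y.conjugate() for x, y in zip(u, v))

M = [[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

is_normal = matmul(M, adjoint(M)) == matmul(adjoint(M), M)
is_hermitian = M == adjoint(M)
is_unitary = matmul(M, adjoint(M)) == identity

# Eigenvectors for the eigenvalues 1+i and 1-i.
v1, v2 = [1 + 0j, -1j], [1 + 0j, 1j]
print(is_normal, is_hermitian, is_unitary, inner(v1, v2) == 0)
```

Here M M* = M* M = 2·1, so M is normal, and its two eigenspaces are orthogonal lines, consistent with the spectral theorem stated next.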

Theorem 12.2 (Spectral theorem). Let T be a linear operator on a Hermitian space V of finite
dimension. Then T is normal if and only if V is the orthogonal sum of the eigenspaces of T .

Proof. This is Theorem 7.31 in [1].

If T = {T1, T2, T3, ...} is a (possibly infinite) family of normal operators on a finite dimensional Hermitian space V that pairwise commute, then V has an orthonormal basis whose elements are simultaneous eigenvectors of every T ∈ T; this follows from the fact that diagonalizable operators that commute can be simultaneously diagonalized; see [1, Theorem 5.76].

12.3 Adjoint Hecke operators


We now want to apply the spectral theorem to Hecke operators in T(N ) acting on the Hermitian
space Sk (N , χ). We continue in the setting of §12.1, with χ a Dirichlet character of modulus
N extended to ∆0 (N ) and ∆∗0 (N ) as above, and Hecke operators T (l, m) and T (n) defined as
above for positive integers with l|m and l ⊥ N .

Theorem 12.3. T (l, m) and T ∗ (m, l) are adjoint operators on the Hermitian space Sk (N , χ) with
the Petersson inner product, as are T (n) and T ∗ (n).

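Proof. For $\alpha = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\in\mathrm{GL}_2^+(\mathbb{R})$ let $\alpha' := \det(\alpha)\alpha^{-1} = \begin{pmatrix} d & -b \\ -c & a\end{pmatrix}$. The map α ↦ α′ is an anti-isomorphism from ∆0(N) to ∆0*(N), which allows us to choose coset representatives

\[ \Gamma_0(N)\begin{pmatrix} l & 0 \\ 0 & m\end{pmatrix}\Gamma_0(N) = \bigsqcup_{i=1}^r \Gamma_0(N)\alpha_i \]

so that

\[ \Gamma_0(N)\begin{pmatrix} m & 0 \\ 0 & l\end{pmatrix}\Gamma_0(N) = \bigsqcup_{i=1}^r \Gamma_0(N)\alpha_i', \]

and we have $\overline{\chi(\alpha_i)} = \chi^*(\alpha_i')$. For g ∈ S_k(N, χ) and any α = α_i we have

\begin{align*}
\langle f|_k\alpha, g\rangle &= v(\Gamma_0(N)\backslash\mathbf{H})^{-1}\int_{\Gamma_0(N)\backslash\mathbf{H}} \det(\alpha)^{k/2} j(\alpha,z)^{-k} f(\alpha z)\,\overline{g(z)}\,\mathrm{Im}(z)^k\,dv(z) \\
&= v(\Gamma_0(N)\backslash\mathbf{H})^{-1}\int_{\alpha\Gamma_0(N)\alpha^{-1}\backslash\mathbf{H}} \det(\alpha)^{k/2} j(\alpha,\alpha^{-1}z)^{-k} f(z)\,\overline{g(\alpha^{-1}z)}\,\mathrm{Im}(\alpha^{-1}z)^k\,dv(\alpha^{-1}z).
\end{align*}

We have α^{−1}z = α′z for any z ∈ H and αα′ = det(α) = det(α′). For γ, δ ∈ GL_2^+(R) we have j(γ, δz) = j(γδ, z)/j(δ, z) and Im(δz) = det(δ)|j(δ, z)|^{−2} Im(z). Taking γ = α and δ = α′ yields

\begin{align*}
\det(\alpha)^{k/2} j(\alpha,\alpha^{-1}z)^{-k}\,\mathrm{Im}(\alpha^{-1}z)^k &= \det(\alpha')^{k/2} j(\alpha,\alpha'z)^{-k}\,\mathrm{Im}(\alpha'z)^k \\
&= \det(\alpha')^{k/2}\bigl(\det(\alpha')/j(\alpha',z)\bigr)^{-k}\bigl(\det(\alpha')|j(\alpha',z)|^{-2}\,\mathrm{Im}(z)\bigr)^k \\
&= \det(\alpha')^{k/2}\,\overline{j(\alpha',z)}^{\,-k}\,\mathrm{Im}(z)^k.
\end{align*}

Now v is the invariant measure, so v(αΓ0(N)α^{−1}\H) = v(Γ0(N)\H) and dv(α^{−1}z) = dv(z), and

\begin{align*}
\langle f|_k\alpha, g\rangle &= v(\alpha\Gamma_0(N)\alpha^{-1}\backslash\mathbf{H})^{-1}\int_{\alpha\Gamma_0(N)\alpha^{-1}\backslash\mathbf{H}} f(z)\,\overline{\det(\alpha')^{k/2} j(\alpha',z)^{-k} g(\alpha'z)}\,\mathrm{Im}(z)^k\,dv(z) \\
&= \langle f, g|_k\alpha'\rangle.
\end{align*}

Left linearity and right sesquilinearity of the Petersson inner product then imply

\begin{align*}
\langle f|_kT(l,m), g\rangle &= \Bigl\langle \sum_{i=1}^r \det(\alpha_i)^{k/2-1}\,\overline{\chi(\alpha_i)}\, f|_k\alpha_i,\ g\Bigr\rangle \\
&= \sum_{i=1}^r \det(\alpha_i)^{k/2-1}\,\overline{\chi(\alpha_i)}\,\langle f|_k\alpha_i, g\rangle \\
&= \sum_{i=1}^r \det(\alpha_i')^{k/2-1}\,\chi^*(\alpha_i')\,\langle f, g|_k\alpha_i'\rangle \\
&= \Bigl\langle f,\ \sum_{i=1}^r \det(\alpha_i')^{k/2-1}\,\overline{\chi^*(\alpha_i')}\, g|_k\alpha_i'\Bigr\rangle = \langle f, g|_kT^*(m,l)\rangle,
\end{align*}

proving that T(l, m) and T*(m, l) are adjoint operators as claimed, as are $T(n) = \sum_{lm=n} T(l,m)$ and $T^*(n) = \sum_{lm=n} T^*(m,l)$.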

Corollary 12.4. For l|m and l mn ⊥ N the Hecke operators T (l, m) and T (n) are normal operators
on the Hermitian space Sk (N , χ).

Proof. By Lemma 12.1 we have

\[ T(l,m)T^*(m,l) = T(l,m)\,\bar\chi(lm)\,T(l,m) = \bar\chi(lm)\,T(l,m)T(l,m) = T^*(m,l)T(l,m), \]

which shows that T(l, m) commutes with its adjoint T*(m, l), by Theorem 12.3. So T(l, m) is a normal operator, and the same argument applies to T(n).¹

Corollary 12.5. The C-vector space Sk (N , χ) has a basis of common eigenfunctions for the Hecke
operators T (n) and T (l, m) with l|m and l mn ⊥ N .

Proof. The Hecke algebra T(N ) is commutative, by Theorem 10.3, and the spectral theorem
implies that Sk (N , χ) admits a basis of simultaneous eigenvectors for the operators T (l, m) and
T (n) when they are normal, which holds for l mn ⊥ N , by the previous corollary.

Proposition 12.6. The isomorphisms M_k(N, χ) ≃ M_k(N, χ̄) and S_k(N, χ) ≃ S_k(N, χ̄) induced by the map $f\mapsto f|_k\begin{pmatrix} 0 & -1 \\ N & 0\end{pmatrix}$ commute with the Hecke operators T(l, m) and T(n) for l|m with l, n ⊥ N.

Proof. This is Theorem 4.5.5 in [2].

Lemma 12.7. For all prime numbers p and e ≥ 1 we have

• $T(p)T(1,p^e) = T(1,p^{e+1}) + \begin{cases} (p+1)\,T(p,p) & \text{if } p\nmid N \text{ and } e=1, \\ p\,T(p,p)\,T(1,p^{e-1}) & \text{if } p\nmid N \text{ and } e>1, \\ 0 & \text{otherwise.}\end{cases}$

• $T(p)T(p^e) = \begin{cases} T(p^{e+1}) + p\,T(p,p)\,T(p^{e-1}) & \text{if } p\nmid N, \\ T(p^{e+1}) & \text{otherwise.}\end{cases}$

Proof. This is Lemma 4.5.7 in [2].

¹ Note that a sum of normal operators need not be normal, so one really does need to apply the argument to T(n).

Lemma 12.8. For positive integers l|m and l′|m′ with lm ⊥ l′m′, and m ⊥ n we have

• T(l, m)T(l′, m′) = T(ll′, mm′),
• T(m)T(n) = T(mn).

Proof. This is Lemma 4.5.8 in [2].

Theorem 12.9. The Hecke algebra T(N ) is equal to the polynomial ring over Z generated by the
Hecke operators T (p), T (p, p), T (q) for prime numbers p - N and q|N .
Proof. That T(p), T(p, p), T(q) generate T(N) follows from the commutativity of T(N) and the preceding lemmas, and one can verify that they are algebraically independent over Q.

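Lemma 12.10. Let p be prime, e ≥ 1. If p ∤ N then $\Gamma_0(N)\begin{pmatrix} 1 & 0 \\ 0 & p^e\end{pmatrix}\Gamma_0(N)$ contains $p^e + p^{e-1}$ distinct right cosets Γ0(N)α with

\[ \alpha\in\left\{\begin{pmatrix} p^{e-f} & m \\ 0 & p^f\end{pmatrix} : 0\le f\le e,\ 0\le m<p^f,\ \gcd(m,p^f,p^{e-f})=1\right\}, \]

and if p|N then $\Gamma_0(N)\begin{pmatrix} 1 & 0 \\ 0 & p^e\end{pmatrix}\Gamma_0(N)$ contains $p^e$ distinct right cosets Γ0(N)α with

\[ \alpha\in\left\{\begin{pmatrix} 1 & m \\ 0 & p^e\end{pmatrix} : 0\le m<p^e\right\}. \]

Proof. See Lemma 4.5.6 in [2].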


Let l|m be positive integers with l ⊥ N, and let $m/l = p_1^{e_1}\cdots p_r^{e_r}$. Then

\[ T(l,m) = T(l,l)\,T(1,m/l) = T(l,l)\prod_{i=1}^r T(1,p_i^{e_i}), \]

and we obtain right coset decompositions

\[ T(l,m) = \bigsqcup \Gamma_0(N)\begin{pmatrix} a & b \\ 0 & d\end{pmatrix} \qquad (ad = lm,\ 0\le b<d,\ \gcd(a,b,d)=l), \]
\[ T(n) = \bigsqcup \Gamma_0(N)\begin{pmatrix} a & b \\ 0 & d\end{pmatrix} \qquad (ad = n,\ 0\le b<d,\ a\perp N). \]
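The coset counts in Lemma 12.10 can be recovered by enumerating the upper triangular representatives listed above. The sketch below represents each coset by the triple (a, b, d); the function names `reps_T` and `reps_T1` are illustrative and not from the notes.

```python
from math import gcd

def reps_T(n, N):
    """Triples (a, b, d) with ad = n, 0 <= b < d and a coprime to N."""
    return [(a, b, n // a)
            for a in range(1, n + 1) if n % a == 0 and gcd(a, N) == 1
            for b in range(n // a)]

def reps_T1(p, e):
    """Triples (a, b, d) with ad = p^e, 0 <= b < d and gcd(a, b, d) = 1."""
    n = p ** e
    return [(a, b, n // a)
            for a in range(1, n + 1) if n % a == 0
            for b in range(n // a) if gcd(gcd(a, b), n // a) == 1]

# p not dividing N: T(1, p^e) has p^e + p^(e-1) right cosets.
print([len(reps_T1(3, e)) for e in (1, 2, 3)])
# p dividing N: only a = 1 survives, giving p^e right cosets.
print([len(reps_T(3 ** e, 3)) for e in (1, 2, 3)])
```

For N = 1 the list `reps_T(n, 1)` has σ₁(n) entries (the sum of the divisors of n), since each divisor d of n contributes d choices of b.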

It follows that for f ∈ M_k(N, χ) we have

\[ (f|_kT(n))(z) = n^{k-1}\sum_{\substack{ad=n \\ 0\le b<d}} \chi(a)\,d^{-k} f\!\left(\frac{az+b}{d}\right) \]

and for l ⊥ N we have

\[ (f|_kT(l,l))(z) = l^{k-2}\chi(l) f(z). \]
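As an illustration of these formulas, one can test them on the weight 12 cusp form Δ = q∏(1 − qⁿ)²⁴ of level 1, whose coefficients are the Ramanujan τ-function. The sketch below assumes the standard coefficient form of the formula for T(n), obtained by summing over b: the j-th coefficient of f|T(n) is Σ_{a | gcd(j,n)} χ(a) a^{k−1} a_{jn/a²}, here with N = 1 and χ trivial. The function names are illustrative.

```python
from math import gcd

def tau_coefficients(bound):
    """tau(1), ..., tau(bound) from Delta = q * prod_{n>=1} (1 - q^n)^24."""
    poly = [0] * bound          # prod (1 - q^n)^24 truncated at q^bound
    poly[0] = 1
    for n in range(1, bound):
        for _ in range(24):     # multiply by (1 - q^n), 24 times, in place
            for i in range(bound - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return {m: poly[m - 1] for m in range(1, bound + 1)}

def hecke(coeffs, n, k, limit):
    """Coefficients 1..limit of f|T(n) for level 1 and trivial character."""
    out = {}
    for j in range(1, limit + 1):
        g = gcd(j, n)
        out[j] = sum(a ** (k - 1) * coeffs[j * n // (a * a)]
                     for a in range(1, g + 1) if g % a == 0)
    return out

tau = tau_coefficients(200)
T2 = hecke(tau, 2, 12, 50)
print(all(T2[j] == tau[2] * tau[j] for j in range(1, 51)))
```

Since S₁₂(1) is one-dimensional, Δ is necessarily an eigenform, and the relations of Lemmas 12.7 and 12.8 show up as identities such as τ(2)τ(3) = τ(6) and τ(2)² = τ(4) + 2¹¹.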

References
[1] Sheldon Axler, Linear algebra done right, Fourth Edition, Springer, 2023.

[2] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #13 4/1/2024

These notes summarize the material in §3.2 of [1] presented in lecture.


Our goal is to prove analytic continuation and a functional equation for the L-functions
attached to cusp forms, as originally proved by Hecke. While this is a classical result, it is of
critical importance for two reasons:

1. It makes modularity a much more powerful tool than it would otherwise be. Knowing that
an elliptic curve E/Q has the same L-function as a modular form is interesting precisely
because of the things we know about L-functions of modular forms. The LHS of the
BSD formula makes sense only because we know both the modularity theorem and that
L-functions of modular forms admit an analytic continuation and functional equation;
modularity alone is not enough.
2. It is a model for proving facts about L-functions of more general automorphic forms.

In the spirit of the second point, as a warmup we recall the proof of the analytic continuation
and functional equation for the Riemann zeta function, proved by Riemann in 1859, which was
the basis for Hecke’s results.

13.1 The Riemann zeta function


Given a sequence of complex numbers a_1, a_2, a_3, ... we can define the Dirichlet series

\[ \sum_{n\ge 1} a_n n^{-s}. \]

In order for this Dirichlet series to converge, we need bounds on the an . Note that |n−s | = n− Re(s) ,
so if we have |an | = O(nσ ) as n → ∞ then the Dirichlet series above converges absolutely and
uniformly to a holomorphic function on Re(s) > σ + 1. This function then uniquely determines
the Dirichlet coefficients an (no matter what σ is); they are intrinsic to the function.
Lemma 13.1. Suppose $\sum_{n\ge 1} a_n n^{-s}$ and $\sum_{n\ge 1} b_n n^{-s}$ converge absolutely at s = σ > 0. If $\sum_{n\ge 1} a_n n^{-s} = \sum_{n\ge 1} b_n n^{-s}$ on Re(s) ≥ σ then a_n = b_n for all n ≥ 1.

Proof. This is Lemma 3.2.1 in [1].

The Riemann zeta function ζ(s) is defined by the Dirichlet series with 1 = a_1 = a_2 = ···:

\[ \zeta(s) := \sum_{n\ge 1} n^{-s}, \]
which converges absolutely and uniformly to a holomorphic function on Re(s) > 1, as does any
Dirichlet series with bounded coefficients, since we can take σ = 0.

Definition 13.2. Euler’s gamma function is defined by the integral

\[ \Gamma(s) := \int_0^\infty e^{-t}t^{s-1}\,dt. \]

It is holomorphic on Re(s) > 0 and satisfies the functional equation Γ(s + 1) = sΓ(s), which allows us to extend it to a meromorphic function on C whose only poles are simple poles at s = −n for n ∈ Z≥0, with residue (−1)^n/n!. The gamma function has no zeros.
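Both properties in the definition can be checked numerically with the standard library, whose `math.gamma` implements the meromorphic continuation on the real line away from the poles. The residue estimate below is a crude finite-difference sketch, with an arbitrarily chosen step `eps`, rather than an exact computation.

```python
import math

def satisfies_functional_equation(s, rel_tol=1e-10):
    """Check Gamma(s + 1) = s * Gamma(s) at a real non-pole s."""
    return math.isclose(math.gamma(s + 1), s * math.gamma(s), rel_tol=rel_tol)

def residue_estimate(n, eps=1e-7):
    """Approximate the residue at s = -n by (s + n) * Gamma(s) at s = -n + eps."""
    return eps * math.gamma(-n + eps)

print([satisfies_functional_equation(s) for s in (0.5, 2.3, -1.5, -0.25)])
print([residue_estimate(n) for n in range(4)])
```

The estimates should be close to 1, −1, 1/2, −1/6, matching (−1)ⁿ/n! for n = 0, 1, 2, 3.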

Lecture by Shiva Chidambaram, notes by Andrew V. Sutherland


The gamma function can also be defined as the Mellin transform of e^{−t}. Recall that the Mellin transform of a function h: R>0 → C is defined by

\[ (\mathcal{M}h)(s) := \int_0^\infty h(t)t^{s-1}\,dt, \]

and is holomorphic on vertical strips Re(s) ∈ (a, b) in which $\int_0^\infty |h(t)|t^{\sigma-1}\,dt$ converges for all σ ∈ (a, b). The Mellin transform is related to the Fourier transform, which we will also need. Recall that a Schwartz function f(x) is a C^∞-function for which

\[ \sup_x \left| x^n \frac{d^m f(x)}{dx^m}\right| < \infty \]

for all integers m, n ≥ 0. The Fourier transform of a Schwartz function f(x) is defined by

\[ (\mathcal{F}f)(y) := \int_{\mathbb{R}} f(x)e^{-2\pi i x y}\,dx, \]

and is also a Schwartz function. We also have the inverse Fourier transform

\[ (\mathcal{F}^{-1}f)(x) := \int_{\mathbb{R}} f(y)e^{+2\pi i x y}\,dy, \]

with $\mathcal{F}^{-1}\mathcal{F}f = \mathcal{F}\mathcal{F}^{-1}f = f$ for all Schwartz functions f. We will need the following fact about the Fourier transform of the Gaussian function.

Lemma 13.3. For any a > 0 the Fourier transform of $e^{-\pi a x^2}$ is $\frac{1}{\sqrt{a}}e^{-\pi y^2/a}$.

Proof. See any textbook on Fourier analysis.

The Mellin transform is related to the Fourier transform via

\[ (\mathcal{M}h)(s) = \bigl(\mathcal{F}h(e^{-x})\bigr)\!\left(\tfrac{s}{2\pi i}\right), \]

provided that h(e^{−x}) is a Schwartz function. We also have the inverse Mellin transform

\[ (\mathcal{M}^{-1}f)(x) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} x^{-s}f(s)\,ds, \]

defined for f(s) holomorphic on a vertical strip Re(s) ∈ (a, b) with c ∈ (a, b). We then have $\mathcal{M}^{-1}\mathcal{M}h = h$ and $\mathcal{M}\mathcal{M}^{-1}f = f$ for suitable h: R>0 → C and f: C → C.

Theorem 13.4 (Poisson summation). Let f be a Schwartz function with Fourier transform $\hat f$. Then

\[ \sum_{n\in\mathbb{Z}} f(n) = \sum_{n\in\mathbb{Z}} \hat f(n). \]

Proof. Both f and $\hat f$ are Schwartz functions, which decay rapidly as |n| → ∞, so the sums converge. Let $F(x) := \sum_{n\in\mathbb{Z}} f(x+n)$. Then F is a periodic C^∞-function with Fourier expansion

\[ F(x) = \sum_{n\in\mathbb{Z}} c_n e^{2\pi i n x}, \]

whose Fourier coefficients are given by

\[ c_n = \int_0^1 F(t)e^{-2\pi i n t}\,dt = \int_0^1 \sum_{m\in\mathbb{Z}} f(t+m)e^{-2\pi i n t}\,dt = \int_{\mathbb{R}} f(t)e^{-2\pi i n t}\,dt = \hat f(n), \]

and we have

\[ \sum_{n\in\mathbb{Z}} f(n) = F(0) = \sum_{n\in\mathbb{Z}} c_n = \sum_{n\in\mathbb{Z}} \hat f(n). \]

We now prove Riemann’s theorem for ζ(s), carefully spelling out each step (this is our model for the next section where we won’t be as pedantic).

Theorem 13.5 (Riemann). Let $\Lambda(s) = \pi^{-s/2}\,\Gamma\!\left(\tfrac{s}{2}\right)\zeta(s)$. Then Λ(s) has a meromorphic continuation to C that satisfies Λ(s) = Λ(1 − s), and $\Lambda(s) + \frac{1}{s} + \frac{1}{1-s}$ is holomorphic on C.

Proof. For Re(s) > 1 we have

\begin{align*}
\Lambda(2s) &= \sum_{n=1}^\infty (\pi n^2)^{-s}\int_0^\infty e^{-t}t^{s-1}\,dt \\
&= \sum_{n=1}^\infty \int_0^\infty e^{-\pi n^2 t}t^{s-1}\,dt \\
&= \int_0^\infty \Bigl(\sum_{n=1}^\infty e^{-\pi n^2 t}\Bigr)t^{s-1}\,dt,
\end{align*}

with a change of variable t ↦ πn²t in the second line and $\sum_{n=1}^\infty\int_0^\infty |e^{-\pi n^2 t}t^{s-1}|\,dt < \infty$ justifying the third line (via Fubini). Applying Poisson summation and Lemma 13.3 to $e^{-\pi x^2 t}$ yields

\[ \sum_{n\in\mathbb{Z}} e^{-\pi n^2 t} = \frac{1}{\sqrt{t}}\sum_{n\in\mathbb{Z}} e^{-\pi n^2/t}, \]

thus $g(t) = \sum_{n\in\mathbb{Z}} e^{-\pi n^2 t}$ satisfies the functional equation $g(t) = \frac{1}{\sqrt{t}}\,g\!\left(\frac{1}{t}\right)$, and we have

\begin{align*}
\Lambda(2s) &= \frac12\int_0^\infty (g(t)-1)\,t^{s-1}\,dt \\
&= \frac12\left(\int_0^1 \Bigl(\tfrac{1}{\sqrt{t}}\,g(\tfrac1t) - 1\Bigr)t^{s-1}\,dt + \int_1^\infty (g(t)-1)\,t^{s-1}\,dt\right) \\
&= \frac12\left(\int_1^\infty \bigl(\sqrt{t}\,g(t) - 1\bigr)t^{-1-s}\,dt + \int_1^\infty (g(t)-1)\,t^{s-1}\,dt\right) \\
&= \frac12\left(\int_1^\infty (g(t)-1)\,t^{-1/2-s}\,dt + \int_1^\infty t^{-1/2-s}\,dt - \int_1^\infty t^{-1-s}\,dt + \int_1^\infty (g(t)-1)\,t^{s-1}\,dt\right) \\
&= \frac12\int_1^\infty \bigl(t^{1/2-s} + t^{s}\bigr)(g(t)-1)\,t^{-1}\,dt - \frac{1}{2s} - \frac{1}{1-2s},
\end{align*}

with a change of variable t ↦ 1/t to get the third line. The last integral converges uniformly on any compact subset and defines a holomorphic function on C, hence a meromorphic continuation of Λ(s) to C for which $\Lambda(s) + \frac1s + \frac{1}{1-s}$ is holomorphic. The last line is invariant under the transformation s ↦ 1/2 − s, which implies Λ(s) = Λ(1 − s).
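The functional equation g(t) = t^{−1/2} g(1/t) that drives this proof can be checked numerically by truncating the sum. This is a minimal sketch; the truncation bound `terms` is an arbitrary choice that is more than sufficient for the sample values of t used here.

```python
import math

def g(t, terms=60):
    """Truncation of g(t) = sum over n in Z of exp(-pi * n^2 * t), for t > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

def functional_equation_holds(t, rel_tol=1e-12):
    """Check g(t) = g(1/t) / sqrt(t) up to floating point error."""
    return math.isclose(g(t), g(1.0 / t) / math.sqrt(t), rel_tol=rel_tol)

print([functional_equation_holds(t) for t in (0.3, 0.7, 1.0, 2.5)])
```

For small t the left side converges slowly while the right side converges quickly, which is the practical content of the Poisson summation step.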
Corollary 13.6. The function $\zeta(s) + \frac{1}{1-s}$ is holomorphic on C.
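As a sanity check of Λ(s) = Λ(1 − s), one can compare the two sides at s = 2 and s = −1. The sketch below does not compute ζ; it takes as inputs the classical values ζ(2) = π²/6 and ζ(−1) = −1/12 (the latter being a value of the analytic continuation), which are assumptions supplied from outside these notes.

```python
import math

def completed_zeta(s, zeta_value):
    """Lambda(s) = pi^(-s/2) * Gamma(s/2) * zeta(s), with zeta(s) supplied by the caller."""
    return math.pi ** (-s / 2) * math.gamma(s / 2) * zeta_value

lam_at_2 = completed_zeta(2, math.pi ** 2 / 6)     # uses zeta(2) = pi^2 / 6
lam_at_minus_1 = completed_zeta(-1, -1 / 12)       # uses zeta(-1) = -1/12
print(lam_at_2, lam_at_minus_1)
```

Both values equal π/6, since Γ(1) = 1 and Γ(−1/2) = −2√π.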



References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #14 4/3/2024

These notes summarize the material in §4.3 of [1] presented in lecture.

14.1 Nice holomorphic functions


We now recall some facts from complex analysis about the growth rate of the coefficients a_n that can appear in the Fourier expansions $f(z) = \sum_{n=0}^\infty a_n e^{2\pi i n z}$ of holomorphic functions f: H → C. We omit the proofs, which can be found in [1].
For z ∈ C we put z = x + iy, with x, y ∈ R (so x and y are implicitly functions of z).

Lemma 14.1. Fix σ ∈ R. If f(z) is a holomorphic function on H whose Fourier expansion $\sum_{n=0}^\infty a_n e^{2\pi i n z}$ converges absolutely and uniformly on H with $f(z) = O(y^{-\sigma})$ as y → 0, then $|a_n| = O(n^\sigma)$ as n → ∞.

Proof. See Corollary 2.1.6 in [1].

Lemma 14.2. Let {a_n}_{n≥0} be a sequence of complex numbers, let

\[ f(z) = \sum_{n=0}^\infty a_n e^{2\pi i n z}. \]

If $|a_n| = O(n^\sigma)$ as n → ∞, for some σ > 0, then the following hold:

• the sum defining f(z) converges absolutely and uniformly on every compact subset of H;
• the function f(z) is holomorphic on H;
• $|f(z)| = O(y^{-\sigma-1})$ as y → 0;
• $|f(z) - a_0| = O(e^{-2\pi y})$ as y → ∞.

Proof. This is Lemma 4.3.3 in [1].

Finally, we note the following lemma, which allows us to extend bounds on a holomorphic
function that hold at the edges of a vertical strip to the interior of the strip.

Lemma 14.3. Let σ1, σ2, a, c ∈ R with a, c > 0, and let φ(z) be a function that is holomorphic on an open set containing a vertical strip S bounded by the lines x = σ1 and x = σ2. Suppose $|\varphi(z)| = O(e^{|y|^a})$ uniformly on S, with $|\varphi(z)| = O(|y|^c)$ on the lines x = σ1 and x = σ2, as |y| → ∞. Then $|\varphi(z)| = O(|y|^c)$ uniformly on S.

Proof. This is Lemma 4.3.4 in [1].

Now suppose f is holomorphic on H with a Fourier expansion $f(z) = \sum_{n\ge 0} a_n e^{2\pi i n z}$ that converges absolutely and uniformly on H, such that $|f(z)| = O(y^{-\sigma})$ as y → 0, for some σ > 0. Then we also have $a_n = O(n^\sigma)$ and $|f(z) - a_0| = O(e^{-2\pi y})$, by the first two lemmas above. Let us call such f(z) nice holomorphic functions.
To each nice holomorphic function $f(z) = \sum_{n\ge 0} a_n e^{2\pi i n z}$ we associate the Dirichlet series

\[ L(f,s) := \sum_{n=1}^\infty a_n n^{-s}, \]

which converges absolutely and uniformly on Re(s) > σ + 1 and is holomorphic on this right half plane. For each positive integer N we define the completed L-function

\[ \Lambda_N(f,s) := N^{s/2}(2\pi)^{-s}\,\Gamma(s)\,L(f,s). \]
Lecture by Shiva Chidambaram, notes by Andrew V. Sutherland


The motivation for this definition is that when a0 = 0 (as when f is a cusp form, for example),
then N −s/2 ΛN ( f , s) is the Mellin transform of the function f (i t): R>0 → C, which we expect to
lead to a functional equation as in the proof for the Riemann zeta function.

Theorem 14.4 (Hecke). Let $f(z) = \sum_{n\ge 0} a_n e^{2\pi i n z}$ and $g(z) = \sum_{n\ge 0} b_n e^{2\pi i n z}$ be nice holomorphic functions, and let k and N be positive integers. The following are equivalent:

(A) $g(z) = (-i\sqrt{N}z)^{-k} f\!\left(-\frac{1}{Nz}\right)$.
(B) The completed L-functions Λ_N(f, s) and Λ_N(g, s) can be analytically continued to C, satisfy

\[ \Lambda_N(f,s) = \Lambda_N(g,k-s), \]

with $\Lambda_N(f,s) + \frac{a_0}{s} + \frac{b_0}{k-s}$ and $\Lambda_N(g,s) + \frac{b_0}{s} + \frac{a_0}{k-s}$ holomorphic and bounded on vertical strips.

Proof. (A)⇒(B): We have |an | = O(nσ ) and |bn | = O(nσ ) for some σ > 0, which implies that
X p
|an |e−2πnt/ N
n≥1
R∞ p
converges for t > 0, and that n≥1 0 |an |t c e−2πnt/ N t −1 d t converges on c > σ + 1, and
P

similarly for the bn . Thus for Re(s) > σ + 1 we have



p
X Z
−s
ΛN ( f , s) = an (2πn/ N ) e−t t s−1 d t
n≥1 0
XZ ∞ p
= an e−2πnt/ N s−1
t dt
n≥1 0
Z ∞ ‚X Œ
p
= an e−2πnt/ N
t s−1 d t
0 n≥1

p
Z

= f (i t/ N ) − a0 t s−1 d t
0
p
where the second line uses the change of variable t 7→ (2πn/ N )t, the sum-integral swap in
the third line is justified
P by the2πinz
convergence noted above, and the fourth line simply applies the
definition of f (z) = n≥0 an e . This implies
Z ∞ Z ∞€
a0 Š
ΛN ( f , s) = − + f (t pi )t −s−1
dt + f ( pi tN ) − a0 t s−1 d t
s N
1 1
∞ ∞
p p
Z Z
a0 b0  
=− − + g(i t/ N ) − b0 t k−s−1
dt + f (i t/ N ) − a0 t s−1 d t,
s k−s 1 1

where the change of variable t 7→ t −1 gives the middle term in the first line, and (A) implies
p p p 1
g(i t/ N ) = (−i N (i t/ N ))−k f (− i t p N
) = t −k f ( t pi N ),

which we used in the second line. The fact that f and g are nice implies | f (i t) − a0 | = O(e−2πt )
and |g(i t) − b0 | = O(e−2πt ) as t → ∞, so the two integrals in the last displayed equation

18.786 Spring 2024, Lecture #14, Page 2


converge absolutely and uniformly on any vertical strip, and are therefore holomorphic on C.
It follows that ΛN ( f , s) has a meromorphic continuation to C and that
\[ \Lambda_N(f,s) + \frac{a_0}{s} + \frac{b_0}{k-s} \]
is holomorphic on C and bounded on every vertical strip. Applying the exact same calculation
to ΛN (g, k − s) yields the desired identity ΛN ( f , s) = ΛN (g, k − s) of meromorphic functions.
(B)⇒(A): Applying the inverse Mellin transform to the Mellin transform Γ(s) of $e^{-t}$ yields
\[ e^{-t} = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=\alpha} \Gamma(s) t^{-s}\,ds, \]
valid for any α > 0. Applying this to all but the first term of $f(iy) = \sum_{n\ge 0} a_n e^{-2\pi n y}$ yields
\[ f(iy) = \frac{1}{2\pi i}\sum_{n\ge 1} a_n \int_{\mathrm{Re}(s)=\alpha} \Gamma(s)(2\pi n y)^{-s}\,ds + a_0. \]
For α > σ + 1 the series $L(f,s) = \sum_{n\ge 1} a_n n^{-s}$ is absolutely convergent and bounded on
Re(s) = α, which implies we can swap the order of summation and integration to obtain
\[ f(iy) = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=\alpha} (\sqrt{N}y)^{-s} \Lambda_N(f,s)\,ds + a_0. \qquad (1) \]

Since L(f, s) is bounded on Re(s) = α, for any c > 0 we have
\[ |\Lambda_N(f,s)| = O(|\mathrm{Im}(s)|^{-c}) \qquad (2) \]
as |Im(s)| → ∞ on Re(s) = α, and applying a similar argument using k − β > σ + 1 yields
\[ |\Lambda_N(f,s)| = |\Lambda_N(g,k-s)| = O(|\mathrm{Im}(s)|^{-c}) \]
as |Im(s)| → ∞ on Re(s) = β. Now (B) implies $\Lambda_N(f,s) + \frac{a_0}{s} + \frac{b_0}{k-s}$ is bounded on β ≤ Re(s) ≤ α,
so Lemma 14.3 implies that (2) holds uniformly on the vertical strip β ≤ Re(s) ≤ α. We may
choose α > k and β < 0. By (B), the function $(\sqrt{N}y)^{-s}\Lambda_N(f,s)$ has simple poles at s = 0 and
s = k with residues $-a_0$ and $(\sqrt{N}y)^{-k} b_0$, respectively, which lie in the strip β ≤ Re(s) ≤ α, so
if we shift the path of integration from Re(s) = α to Re(s) = β in (1) we obtain
\[ f(iy) = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=\beta} (\sqrt{N}y)^{-s}\Lambda_N(f,s)\,ds + (\sqrt{N}y)^{-k} b_0. \]

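Now the functional equation given by (B) implies
\[
\begin{aligned}
f(iy) &= \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=\beta} (\sqrt{N}y)^{-s}\Lambda_N(g,k-s)\,ds + (\sqrt{N}y)^{-k} b_0 \\
&= \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=k-\beta} (\sqrt{N}y)^{s-k}\Lambda_N(g,s)\,ds + (\sqrt{N}y)^{-k} b_0 \\
&= (\sqrt{N}y)^{-k}\, g\Big(\frac{-1}{Niy}\Big),
\end{aligned}
\]
where the last line applies (1) to g at the point $i/(Ny) = -1/(Niy)$.
The functions f(z) and g(z) are holomorphic on H, so this implies $f(z) = (\sqrt{N}z/i)^{-k} g\left(\frac{-1}{Nz}\right)$,
which is equivalent to (A): $g(z) = (-i\sqrt{N}z)^{-k} f\left(\frac{-1}{Nz}\right)$.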



14.2 Analytic continuation and functional equation for cusp forms
Let N be a positive integer and χ a Dirichlet character of modulus N. As in Lecture 11 (see
Lemma 11.6), if we define $\omega(N) := \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{Q})$, then the map $f \mapsto f|_k\omega(N)$
gives an isomorphism $S_k(N,\chi) \simeq S_k(N,\bar\chi)$.
We now observe that any cusp form $f \in S_k(N,\chi)$ is a nice holomorphic function, as is
$g = f|_k\omega(N) \in S_k(N,\bar\chi)$. Applying Theorem 14.4 to f and g, with $a_0 = b_0 = 0$, yields the
following corollary.

Corollary 14.5. For any cusp form $f \in S_k(N,\chi)$ the function $\Lambda_N(f,s)$ is entire and satisfies
\[ \Lambda_N(f,s) = i^k \Lambda_N(f|_k\omega(N), k-s). \]
When N = 1 the character χ is trivial, and $f|_k\omega(1) = f$, since $\omega(1) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in \Gamma_0(1) = \mathrm{SL}_2(\mathbb{Z})$,
so the functional equation becomes
\[ \Lambda_N(f,s) = i^k \Lambda_N(f,k-s). \]
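As a numerical illustration of the N = 1 case, the following minimal sketch checks condition (A) of Theorem 14.4 on the imaginary axis for the weight-12 cusp form Δ, where it reads Δ(i/t) = t^12 Δ(it). The choice of Δ, the product formula Δ = q ∏(1 − q^n)^24, and the function names are assumptions made for this example; none of them come from the notes.

```python
import math

def delta_on_imaginary_axis(t, terms=400):
    """Evaluate Delta(it) = q * prod_{n>=1} (1 - q^n)^24, where q = exp(-2*pi*t)."""
    q = math.exp(-2.0 * math.pi * t)
    product = 1.0
    for n in range(1, terms + 1):
        product *= (1.0 - q ** n) ** 24
    return q * product

def relative_defect(t, k=12):
    """Relative difference between Delta(i/t) and t^k * Delta(it)."""
    lhs = delta_on_imaginary_axis(1.0 / t)       # Delta(-1/z) at z = it
    rhs = t ** k * delta_on_imaginary_axis(t)    # z^k Delta(z) at z = it, using i^12 = 1
    return abs(lhs - rhs) / abs(rhs)

for t in (0.8, 1.0, 1.3, 2.0):
    print(t, relative_defect(t))
```

The printed defects should be at the level of floating-point rounding; this is only a sanity check of the transformation law, not a proof.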

For N > 1, if χ is trivial (in which case we should assume k is even), for f ∈ Sk (N ) we still have
f |k ω(N ) ∈ Sk (N ), but we do not quite have f |k ω(N ) = f , since ω(N ) 6∈ Γ0 (N ). But in fact, up
to a sign the functional equation above still holds provided that f is an eigenfunction for the
Fricke involution ωN , the linear operator defined by f 7→ f |k ω(N ). Note that

\[ (f|_k\omega_N)(z) = N^{-k/2} z^{-k} f\Big(\frac{-1}{Nz}\Big), \]
so
\[ ((f|_k\omega_N)|_k\omega_N)(z) = N^{-k/2} z^{-k} N^{-k/2} \Big(\frac{-1}{Nz}\Big)^{-k} f(z) = (-1)^k f(z) = f(z), \]
where the last equality follows from the fact that for f ∈ Sk (N ), either k is even or f = 0.
Thus ωN is an involution, and its eigenvalues are ±1. We will see later that if f ∈ Sk (N ) is an
eigenform for all the Hecke operators, then it is automatically an eigenfunction for the Fricke
involution, but in any case, for any f ∈ Sk (N ) that is an eigenfunction for ωN we have the
functional equation
ΛN ( f , s) = ±i k ΛN ( f , k − s).
More generally, if f ∈ Sk (N , χ) is an eigenform for all the Hecke operators, we will always have
a functional equation of the form

\[ \Lambda_N(f,s) = \varepsilon\, i^k \Lambda_N(f,k-s), \]

where ε is a root of unity.
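The involution property used above comes down to the matrix identity ω(N)² = −N·I together with the fact that a scalar matrix cI acts under the weight-k slash operator (normalized by det^{k/2}) as multiplication by sgn(c)^k. A small sketch of both facts follows; the helper names are hypothetical and chosen for this example.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def fricke_matrix(N):
    """The matrix omega(N) = [[0, -1], [N, 0]]."""
    return [[0, -1], [N, 0]]

def scalar_slash_factor(c, k):
    """Factor by which c*I acts under f |_k: det^(k/2) * j^(-k) = |c|^k / c^k = sgn(c)^k."""
    if c == 0:
        raise ValueError("c must be nonzero")
    sign = 1 if c > 0 else -1
    return sign ** k

N = 11
print(mat_mul(fricke_matrix(N), fricke_matrix(N)))   # the scalar matrix -N * I
print(scalar_slash_factor(-N, 12), scalar_slash_factor(-N, 3))
```

For even k the factor is 1, which matches the computation (−1)^k f = f above.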

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #15 4/10/2024

These notes summarize the material in §4.3, §4.6 of [3] and §11 of [1] presented in lecture.
In Lecture 11 we reduced the study of modular forms for congruence subgroups Γ to the
study of modular forms for Γ0 (N ) with character χ ∈ X (N ), where X (N ) denotes the group of
Dirichlet characters of modulus N . We did this via two standard reductions/decompositions
that can be applied to any weight k ∈ Z and level N ∈ Z≥1 :
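• $M_k(\Gamma) \subseteq M_k(\Gamma_1(N^2))$, since $\Gamma_1(N^2) \subseteq \delta_N^{-1}\Gamma(N)\delta_N$, where $\delta_N := \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$,

• $M_k(\Gamma_1(N)) = \bigoplus_{\chi\in X(N)} M_k(\Gamma_0(N),\chi)$,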

both of which preserve cusp forms. To ease notation we define $M_k(N,\chi) := M_k(\Gamma_0(N),\chi)$ and
let $S_k(N,\chi) \subseteq M_k(N,\chi)$ denote the subspace of cusp forms.
In Lecture 11 we also defined the diamond operator ⟨d⟩ by $f \mapsto f|_k\alpha$, where
$\alpha = \begin{pmatrix} a & b \\ c & \delta \end{pmatrix} \in \Gamma_0(N)$ satisfies δ ≡ d mod N, and noted that $M_k(N,\chi)$ and $S_k(N,\chi)$ are χ-eigenspaces
of the diamond operators: they are the subspaces of $M_k(\Gamma_1(N))$ and $S_k(\Gamma_1(N))$ for which ⟨d⟩f = χ(d)f
for all $d \in (\mathbb{Z}/N\mathbb{Z})^\times$.
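In Lecture 12 we showed that the Hecke algebra $\mathbb{T}(N) := \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0(N)/\Gamma_0(N)]$, where
\[ \Delta_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(\mathbb{Z}) : c \equiv 0 \bmod N,\ a \perp N,\ ad - bc > 0 \right\}, \]
is generated by the Hecke operators
\[ T(l,m) := \Gamma_0(N) \begin{pmatrix} l & 0 \\ 0 & m \end{pmatrix} \Gamma_0(N) \]
with N ⊥ l | m, and we defined the Hecke operator
\[ T(n) := \sum_{lm=n} T(l,m). \]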

Recall that each double coset $\Gamma_0(N)\alpha\Gamma_0(N) = \bigsqcup_{i=1}^r \Gamma_0(N)\alpha_i$ acts on $f \in M_k(N,\chi)$ via
\[
\begin{aligned}
f|_k\Gamma_0(N)\alpha\Gamma_0(N) &= \det(\alpha)^{k/2-1} \sum_{i=1}^r \chi(\alpha_i)\, f|_k\alpha_i \\
&= \det(\alpha)^{k-1} \sum_{i=1}^r \chi(\alpha_i)\, j(\alpha_i,z)^{-k} f(\alpha_i z),
\end{aligned}
\]
for any $\alpha \in \Delta_0(N)$. We then proved
that every Hecke operator in $\mathbb{T}(N)$ is a polynomial in T(p), T(p, p), and T(q), where p ∤ N and
q | N vary over primes.
In Lecture 12 we proved that $S_k(N,\chi)$ has a basis of common eigenfunctions for T(n) and
T(l, m) with l | m and lmn ⊥ N, equivalently, for T(p) and T(p, p) with p ∤ N. This naturally
leads to the question of whether there is a basis of common eigenfunctions for all the Hecke
operators, including T(q) for primes q | N.
This is not always true, but it is true if we restrict our attention to the new subspace of
Sk (N , χ).

15.1 Old and new modular forms


For all positive integers N |M the inclusion Γ0 (M ) ⊆ Γ0 (N ) implies Mk (N , χ) ⊆ Mk (M , χ) for any
χ ∈ X (N ). But there are many other ways to embed Mk (N , χ) into Mk (M , χ). For every divisor
d of M /N we have
\[ f(dz) = d^{-k/2}\, f|_k\delta_d \in M_k(dN,\chi), \]

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland



where $\delta_d = \begin{pmatrix} d & 0 \\ 0 & 1 \end{pmatrix}$, which induces a map $\delta_d\colon M_k(N,\chi) \to M_k(M,\chi)$ that restricts to a map on
cusp forms.
Lemma 15.1. For any l, n, N ∈ Z≥1 , if n ⊥ l N then the map δl commutes with the action of T (n)
on the spaces Mk (N , χ) and Mk (l N , χ).
Proof. It suffices to consider the case where n is a prime p ⊥ l N , and one can check this directly
using explicit coset representatives, see [3, Lemma 4.6.2].

The lemma below will be used to prove a key lemma due to Hecke that appears as Lemma
4.6.3 in [3] (the proof in [3] omits some details that are filled in by this lemma).
Lemma 15.2. Let f ∈ Mk (N , χ) with k > 0, let h > 1 be an integer prime to N . Suppose
c f (z) = f (z/h) for some nonzero c ∈ C (as functions on the upper half plane). Then f = 0.
Proof. Observe that $f \in M_k(N,\chi)$ implies f(z + 1) = f(z), and therefore
\[ f((z+1)/h) = c f(z+1) = c f(z) = f(z/h). \]
If $f(z) = \sum_{n\ge 0} a(n) e^{2\pi i n z}$ is the Fourier expansion of f at ∞, comparing coefficients of $e^{2\pi i n z/h}$
in the Fourier expansions of both sides of the equality above yields
\[ a(n) e^{2\pi i n/h} = a(n) \]
for all n ≥ 0. This implies a(n) = 0 whenever h does not divide n (because then $e^{2\pi i n/h} \ne 1$).
We now observe that $c^2 f(z) = c f(z/h) = f(z/h^2)$, and the same argument shows that
a(n) = 0 whenever $h^2$ does not divide n. Repeating this argument ad infinitum shows that
a(n) = 0 for all n > 0. For k > 0 there are no nonzero constant functions in $M_k(N,\chi)$, so we
also have a(0) = 0. Thus f = 0 as claimed.

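Lemma 15.3 (Hecke). Let $f \in M_k(N,\chi)$ and let $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Delta_0(N)$ satisfy N ⊥ det(α) > 1 and
gcd(a, b, c, d) = 1. If $f|_k\alpha \in M_k(N,\chi)$ then f = 0.

Proof. Choose $\gamma_1, \gamma_2 \in \Gamma_0(N)$ so that $\gamma_1\alpha\gamma_2 = \begin{pmatrix} l & 0 \\ 0 & m \end{pmatrix}$ for some l | m with l, m > 0; since gcd(a, b, c, d) = 1 we have l = 1 and m = det(α) > 1. Now
\[ \begin{pmatrix} 1 & 0 \\ 0 & m \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & m \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 1/m \\ 0 & 1 \end{pmatrix} \notin \Gamma_0(N), \]
so $\alpha\Gamma_0(N)\alpha^{-1} \not\subseteq \Gamma_0(N)$, and we can choose $\gamma \in \Gamma_0(N)$ so that $\alpha\gamma\alpha^{-1} \notin \Gamma_0(N)$. We have
$\det(\alpha)\alpha^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \in \Delta_0(N)$, so $\det(\alpha)\alpha\gamma\alpha^{-1} \in \Delta_0(N)$, and for some $\gamma_3, \gamma_4 \in \Gamma_0(N)$ we
have
\[ \det(\alpha)\gamma_3\alpha\gamma\alpha^{-1}\gamma_4 = \begin{pmatrix} u & 0 \\ 0 & v \end{pmatrix}, \]
with u | v and $u, v \in \mathbb{Z}_{>0}$. We have $uv = \det(\alpha)^2$ with u ≠ v, otherwise $\alpha\gamma\alpha^{-1} = \gamma_3^{-1}\gamma_4^{-1} \in \Gamma_0(N)$.
Therefore $h := v/u \in \mathbb{Z}_{>1}$. Now let $c := h^{k/2}\chi(\gamma_3)\chi(\gamma)\chi(\gamma_4)$, and suppose that $f|_k\alpha \in M_k(N,\chi)$.
Then $f \in M_k(\alpha\Gamma_0(N)\alpha^{-1},\chi)$, and $f \in M_k(\gamma_4^{-1}\alpha\Gamma_0(N)\alpha^{-1}\gamma_4)$, so we have
\[
\begin{aligned}
c f(z) &= h^{k/2}\chi(\gamma_3)\chi(\gamma)\chi(\gamma_4) f(z) \\
&= h^{k/2}\chi(\gamma_3)\chi(\gamma_4)\,(f|_k\gamma_4^{-1}\alpha\gamma\alpha^{-1}\gamma_4)(z) \\
&= h^{k/2}\chi(\gamma_3)\,(f|_k\alpha\gamma\alpha^{-1}\gamma_4)(z) \\
&= h^{k/2}\chi(\gamma_3)\,\big(f|_k\gamma_3^{-1}\begin{pmatrix} u & 0 \\ 0 & v \end{pmatrix}\big)(z) \\
&= h^{k/2}\,\big(f|_k\begin{pmatrix} u & 0 \\ 0 & v \end{pmatrix}\big)(z) \\
&= h^{k/2}(uv)^{k/2} v^{-k} f(z/h) = f(z/h).
\end{aligned}
\]
Thus c f(z) = f(z/h), with c ≠ 0, and Lemma 15.2 then implies f = 0.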



Definition 15.4. Let χ be a Dirichlet character of modulus N and conductor m. We define the
space of old cusp forms $S_k^{\rm old}(N,\chi)$ to be the subspace of $S_k(N,\chi)$ spanned by the set
\[ \bigcup_{M} \bigcup_{l} \{ f(lz) : f(z) \in S_k(M,\chi) \}, \]

where M ranges over proper divisors of N divisible by m and l ranges over divisors of N /M . We
define the space of new cusp forms Sknew (N , χ) to be the orthogonal complement of Skold (N , χ)
with respect to the Petersson inner product.

Lemma 15.5. The spaces Skold (N , χ) and Sknew (N , χ) are stable under the action of T (n) for all
n ⊥ N.

The lemma implies that the old and new subspaces of Sk (N , χ) each have bases of common
eigenfunctions for the Hecke operators T (n) with n ⊥ N . Moreover, each eigenfunction gener-
ates a one-dimensional subspace that is uniquely determined by the eigenvalues of the Hecke
operators T (n) for all n ⊥ N , or more generally for all n ⊥ L, for any integer L.

Theorem 15.6. Fix L ∈ Z. If f ∈ Sknew (N , χ) and g ∈ Sk (N , χ) are common eigenfunctions of T (n)


with the same eigenvalue for n ⊥ L then g is a multiple of f , and in particular, g ∈ Sknew (N , χ).

Proof. This is Theorem 4.6.12 in [3].

For each proper divisor $N_i$ of N and divisor d of $N/N_i$ the map $\delta_d\colon S_k(N_i,\chi) \to S_k(N,\chi)$ defined
by $f(z) \mapsto f(dz)$ sends the new subspace of $S_k(N_i,\chi)$ to the old subspace of $S_k(N,\chi)$. The
images of these maps give us a complete decomposition of $S_k(N,\chi)$ into new subspaces
\[ S_k(N,\chi) \simeq \bigoplus_{\mathrm{cond}(\chi)\,|\,N_i\,|\,N} S_k^{\rm new}(N_i,\chi)^{\oplus m_i}, \]

where mi is the number of divisors of N /Ni . As usual when we write Sk (M , χ) for an integer M
divisible by cond(χ) we understand χ to denote the unique Dirichlet character of modulus M
induced by the primitive character χ0 of modulus cond(χ) that induces χ.
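The index set and multiplicities in this decomposition are easy to enumerate. The sketch below lists, for given N and cond(χ), each level N_i with cond(χ) | N_i | N together with m_i, the number of divisors of N/N_i. The function names are hypothetical, and the code only does the bookkeeping; it does not compute dimensions of any spaces.

```python
def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def newspace_multiplicities(N, conductor=1):
    """Map each level N_i with conductor | N_i | N to m_i = number of divisors of N / N_i."""
    if N % conductor != 0:
        raise ValueError("the conductor of chi must divide N")
    return {Ni: len(divisors(N // Ni))
            for Ni in divisors(N) if Ni % conductor == 0}

print(newspace_multiplicities(22))      # trivial character, levels 1, 2, 11, 22
print(newspace_multiplicities(12, 4))   # character of conductor 4, levels 4 and 12
```

For example, with N = 22 and trivial χ the new subspace of level 11 appears with multiplicity 2, via f(z) and f(2z).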

Definition 15.7. We call f ∈ Sknew (N , χ) a newform if f is a common eigenfunction of T (n) for


all n ⊥ N with a1 ( f ) = 1.

Theorem 15.8. The newforms in Sknew (N , χ) are common eigenfunctions of T(N ) ∪ T∗ (N ) and
form a basis for Sknew (N , χ).

Proof. We have $f|_kT(n) = a_n f$ for all n ⊥ N. Consider $T \in \mathbb{T}(N)$ and $T^* \in \mathbb{T}^*(N)$. Now $\mathbb{T}(N)$ is
commutative, so T commutes with all the T(n), and similarly, $T^*$ commutes with all the $T^*(n)$.
We have
\[ f|_kT(n) = \chi(n)\, f|_kT^*(n) \]
for n ⊥ N, so $T^*$ also commutes with T(n) for all n ⊥ N, and $f|_kT$ and $f|_kT^*$ are common eigenfunctions
of T(n) with the same eigenvalue $a_n$ for all n ⊥ N. It follows from Theorem 15.6 that $f|_kT$ and $f|_kT^*$ must be
multiples of f, thus f is an eigenfunction for T and $T^*$.
We have already shown that Sk (N , χ) has a basis of common eigenfunctions for T (n) with
n ⊥ N , a subset of which form a basis for Sknew (N , χ), and it follows that if we normalize these
so that a1 ( f ) = 1 we obtain a basis of newforms.



Corollary 15.9. If f ∈ Sk (N , χ) is an eigenfunction of T (n) with eigenvalue an ( f ) for all n ⊥ N
then there is a divisor M |N divisible by the conductor of χ and a newform g ∈ Sknew (M , χ) for which
g|k T (n) = an ( f )g for all n ⊥ N . Moreover, if f ∉ Sknew (N , χ) then M < N .

It follows from the theorem that for newforms $f \in S_k^{\rm new}(N,\chi)$ we have $f|_kT(n) = a_n f$ for
all n, not just for n ⊥ N, and the adjointness of T(n) and $T^*(n)$ implies that $f|_kT^*(n) = \bar a_n(f) f$.
Moreover, if we put $\omega_N := \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$ then
\[ (f|_k\omega_N)|_kT(n) = \bar a_n\,(f|_k\omega_N), \qquad (f|_k\omega_N)|_kT^*(n) = a_n\,(f|_k\omega_N) \]
for all n ∈ Z, and this implies the following theorem.

Theorem 15.10. The action of $\omega_N$ induces an isomorphism between $S_k^{\rm new}(N,\chi)$ and $S_k^{\rm new}(N,\bar\chi)$,
and also between $S_k^{\rm old}(N,\chi)$ and $S_k^{\rm old}(N,\bar\chi)$.

We conclude this section with two important theorems about newforms. The first is a strong
multiplicity one result, which implies that any newform f is uniquely determined by any subset
of its Fourier coefficients (equivalently, Hecke eigenvalues) an ( f ) that includes all n coprime to
some integer L that we are free to choose.

Theorem 15.11. Fix L ∈ Z, let f ∈ Sknew (N , χ) be a newform and let g ∈ Sk (M , ψ). If g is a


common eigenfunction of T(M ) ∪ T∗ (M ) with an (g) = an ( f ) for all n ⊥ L then M = N and g = f .

Proof. See Theorem 4.6.19 in [3].

The second theorem characterizes the multiplicative relations between the Fourier coeffi-
cients of a newform, which together with the theorem above imply that every newform f is
uniquely determined by its Fourier coefficients a p ( f ) at almost all primes p.

Theorem 15.12. Let f ∈ Mk (N , χ). Then f is a newform if and only if the following hold:

• a1 ( f ) = 1;
• a p r ( f ) = a p ( f )a p r−1 ( f ) − χ(p)p k−1 a p r−2 ( f ) for all primes p and r ≥ 2;
• amn ( f ) = am ( f )an ( f ) for all integers m ⊥ n.

Proof. See Proposition 5.8.5 in [2].
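The three conditions can be checked numerically in the simplest case, the level 1 weight 12 form Δ with trivial character, whose coefficients are the Ramanujan τ-function. The sketch below assumes the product formula Δ = q ∏(1 − q^n)^24 (not stated in these notes) to generate coefficients; the function name is hypothetical.

```python
from math import gcd

def delta_coefficients(bound):
    """Return [a_0, ..., a_bound] for Delta = q * prod_{n>=1} (1 - q^n)^24."""
    # Power series of prod (1 - q^n)^24, truncated to degrees 0 .. bound - 1.
    series = [0] * bound
    series[0] = 1
    for n in range(1, bound):
        for _ in range(24):
            # Multiply in place by (1 - q^n), working from high degree to low.
            for d in range(bound - 1, n - 1, -1):
                series[d] -= series[d - n]
    return [0] + series  # multiplying by q shifts every degree up by one

k, bound = 12, 60
a = delta_coefficients(bound)
print(a[1:7])  # a_1, ..., a_6

# Recurrence at prime powers (chi is trivial at level 1, so chi(p) = 1).
for p in (2, 3, 5, 7):
    r = 2
    while p ** r <= bound:
        assert a[p ** r] == a[p] * a[p ** (r - 1)] - p ** (k - 1) * a[p ** (r - 2)]
        r += 1

# Multiplicativity on coprime arguments.
for m in range(1, bound + 1):
    for n in range(1, bound // m + 1):
        if gcd(m, n) == 1:
            assert a[m * n] == a[m] * a[n]
```

Exact integer arithmetic is used throughout, so the assertions test the identities exactly for all indices up to the chosen bound.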

15.2 Twisting newforms


Definition 15.13. For a modular form $f(z) = \sum_{n\ge 0} a_n e^{2\pi i n z} \in M_k(N,\chi)$ and a Dirichlet character
ψ we define the twist f ⊗ ψ of f by ψ via the Fourier series
\[ (f\otimes\psi)(z) := \sum_{n\ge 0} \psi(n)\, a_n e^{2\pi i n z}. \]

Lemma 15.14. Let f ∈ Mk (N , χ) and let ψ be a Dirichlet character. Then f ⊗ ψ ∈ Mk (M , χψ2 ),


where M = lcm(N , cond(ψ)2 , cond(ψ) cond(χ)), and if f is a cusp form, so is f ⊗ ψ.

Proof. See Lemma 4.3.10 in [3].

For newforms f ∈ Sknew (N , χ) one can take M = lcm(N , cond(ψ) cond(χψ)) in the lemma
above; see [1, Lemma 11.2.1].
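The level bounds quoted above are plain lcm computations, and the twist itself is a coefficientwise product. The following sketch simply evaluates those formulas; the function names and the example character (the quadratic character modulo 3) are assumptions for illustration, and `math.lcm` requires Python 3.9 or later.

```python
import math

def twist_level(N, cond_chi, cond_psi):
    """Level M = lcm(N, cond(psi)^2, cond(psi) * cond(chi)) from Lemma 15.14."""
    return math.lcm(N, cond_psi ** 2, cond_psi * cond_chi)

def twist_level_newform(N, cond_psi, cond_chi_psi):
    """Level M = lcm(N, cond(psi) * cond(chi * psi)) quoted for newforms."""
    return math.lcm(N, cond_psi * cond_chi_psi)

def twist_coefficients(coeffs, psi):
    """Coefficients psi(n) * a_n of the twist, given [a_0, a_1, ...]."""
    return [psi(n) * a_n for n, a_n in enumerate(coeffs)]

def psi_mod_3(n):
    """The nontrivial (quadratic) Dirichlet character of modulus 3."""
    return (0, 1, -1)[n % 3]

print(twist_level(11, 1, 3))            # trivial chi at level 11, cond(psi) = 3
print(twist_level_newform(11, 3, 3))    # here chi * psi also has conductor 3
print(twist_coefficients([0, 1, -24, 252, -1472], psi_mod_3))
```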



Definition 15.15. Let f ∈ Sknew (N , χ) be a newform, and let ψ be a Dirichlet character. If
f ⊗ ψ = f , we say that f ⊗ ψ is a self-twist of f (by ψ).

The set SelfTw( f ) := {ψ : f ⊗ ψ = f } is a group (under multiplication of Dirichlet char-


acters). For ψ ∈ SelfTw( f ) we must have cond(ψ)|N and ψ2 = 1, so it is a finite elemen-
tary abelian 2-group. The theory of complex multiplication implies that for k ≥ 2 the group
SelfTw( f ) is cyclic, and if it is nontrivial it is generated by the Kronecker symbol of an imagi-
nary quadratic field; see [4, Thm. 4.5]. In this case we say that f has complex multiplication
(CM).
For k = 1, the group SelfTw( f ) is isomorphic to a subgroup of (Z/2Z)2 and may contain
Kronecker symbols of both real and imaginary fields.

References
[1] A.J. Best, J. Bober, A.R. Booker, E. Costa, J. Cremona, M. Derickx, M. Lee, D. Lowry-Duda,
D. Roe, A.V. Sutherland, and J. Voight, Computing classical modular forms, Arithmetic Ge-
ometry, Number Theory, and Computation, Simons Symposia (2021), 123–213.

[2] Fred Diamond and Jerry Shurman, A first course in modular forms, Springer, 2005.

[3] Toshitsune Miyake, Modular forms, Springer, 2006.

[4] Kenneth A. Ribet, Twists of modular forms and endomorphisms of abelian varieties, Math.
Ann. 253 (1980), 43–62.



18.786 Number theory II Spring 2024
Lecture #16 4/17/2024

These notes summarize the material in §6.1 of [1] presented in lecture.

16.1 Hilbert spaces of functions on the complex upper half plane


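Fix a weight $k \in \mathbb{Z}_{\ge 0}$. Let $p \in \mathbb{R}_{\ge 1} \cup \{\infty\}$ be an exponent. For $f\colon \mathbf{H} \to \mathbb{C}$ we define
\[
\|f\|_p := \begin{cases}
\left(\int_{\mathbf{H}} |f(z)\,\mathrm{Im}(z)^{k/2}|^p\, dv(z)\right)^{1/p} & 1 \le p < \infty, \\
\operatorname{ess\,sup}_{z\in\mathbf{H}} |f(z)\,\mathrm{Im}(z)^{k/2}| & p = \infty,
\end{cases}
\]
where $dv(z) = \frac{dx\,dy}{y^2}$ is the invariant measure on $\mathbf{H}$ (with z = x + iy) and "ess sup" denotes the
essential supremum, the infimum of the set of essential upper bounds.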
If f : Ω → R is a real-valued function on a topological space Ω equipped with a measure, we
call a real number a an essential upper bound of f if the set {z ∈ Ω : f(z) > a} has measure
zero. Thus if $\|f\|_\infty = a \in \mathbb{R}$ then the set $\{z \in \mathbf{H} : |f(z)\,\mathrm{Im}(z)^{k/2}| > a\}$ has measure 0 but the sets
$\{z \in \mathbf{H} : |f(z)\,\mathrm{Im}(z)^{k/2}| > a - \varepsilon\}$ have nonzero measure for every ε > 0.
We use $L_k^p(\mathbf{H})$ to denote the Banach space (complete normed vector space) of all measurable
functions $f\colon \mathbf{H} \to \mathbb{C}$ satisfying $\|f\|_p < \infty$ (recall that a function is measurable if pre-images of
measurable sets are measurable). We call exponents $p, q \in \mathbb{R}_{\ge 1} \cup \{\infty\}$ conjugate if
\[ \frac{1}{p} + \frac{1}{q} = 1, \]
where $\frac{1}{0} := \infty$ and $\frac{1}{\infty} := 0$. For conjugate exponents p, q and functions $f \in L_k^p(\mathbf{H})$ and $g \in L_k^q(\mathbf{H})$
we define the pairing
\[ \langle f, g\rangle := \int_{\mathbf{H}} f(z)\,\overline{g(z)}\,\mathrm{Im}(z)^k\, dv(z), \]
which allows us to identify $g \in L_k^q(\mathbf{H})$ with an element of the dual space of $L_k^p(\mathbf{H})$ (the continuous
linear functionals on $L_k^p(\mathbf{H})$), and for p ≠ ∞ we can view $L_k^q(\mathbf{H})$ as the dual space of $L_k^p(\mathbf{H})$. When
p = q = 2 this defines an inner product on $L_k^2(\mathbf{H})$, making it a Hilbert space (a complete metric
space with distance given by the inner product). We use $H_k^p(\mathbf{H})$ to denote the closed subspace
of $L_k^p(\mathbf{H})$ consisting of holomorphic functions.
For any Hilbert space H of functions f : X → C with inner product ⟨·, ·⟩ we call a function
K : X × X → C a kernel function of H if the following hold:
• for every y ∈ X the function $K_y\colon X \to \mathbb{C}$ defined by $x \mapsto K(x,y)$ lies in H;
• for every f ∈ H we have $f(y) = \langle f, K_y\rangle$ for all y ∈ X.
If K is a kernel function then it is conjugate symmetric:
\[ K(x,y) = K_y(x) = \langle K_y, K_x\rangle = \overline{\langle K_x, K_y\rangle} = \overline{K_x(y)} = \overline{K(y,x)}. \]
Kernel functions need not exist but are uniquely determined when they do (in which case H is
sometimes called a reproducing kernel Hilbert space or RKHS; we won't use this terminology).
Indeed, if K and K′ are both kernel functions of H then for every x, y ∈ X we have
\[ K'(x,y) = K'_y(x) = \langle K'_y, K_x\rangle = \overline{\langle K_x, K'_y\rangle} = \overline{K_x(y)} = \overline{K(y,x)} = K(x,y). \]

If H is finite dimensional then it has the kernel function
\[ K(x,y) := \sum_i f_i(x)\,\overline{f_i(y)}, \]
where $f_1, \ldots, f_r$ is any orthonormal basis of H. Indeed, for y ∈ X the function $K_y = \sum_i \overline{f_i(y)}\, f_i$
is an element of H, and if we write f ∈ H as $f = \sum_i c_i f_i$, then
\[ \langle f, K_y\rangle = \Big\langle \sum_i c_i f_i,\ \sum_i \overline{f_i(y)}\, f_i \Big\rangle = \sum_i c_i\, \overline{\overline{f_i(y)}} = \sum_i c_i f_i(y) = f(y). \]
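The finite-dimensional construction can be tried out directly. The sketch below uses a toy Hilbert space chosen for this example (it is not from the notes): polynomials of degree less than 3 on a five-point set X with a weighted inner product. It builds an orthonormal basis by Gram-Schmidt, forms K(x, y) = Σ f_i(x) conj(f_i(y)), and checks the reproducing property ⟨f, K_y⟩ = f(y).

```python
import math

X = [0.0, 0.5, 1.0, 1.5, 2.0]   # the finite set X
W = [1.0, 2.0, 1.0, 0.5, 1.5]   # positive weights defining the inner product

def inner(f, g):
    """<f, g> = sum_x w_x f(x) conj(g(x)); functions are lists of values on X."""
    return sum(w * fx * gx.conjugate() for w, fx, gx in zip(W, f, g))

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent functions on X."""
    basis = []
    for v in vectors:
        u = list(v)
        for e in basis:
            coeff = inner(u, e)
            u = [ux - coeff * ex for ux, ex in zip(u, e)]
        norm = math.sqrt(inner(u, u).real)
        basis.append([ux / norm for ux in u])
    return basis

# H is spanned by 1, x, x^2 restricted to X.
basis = gram_schmidt([[complex(x) ** d for x in X] for d in range(3)])

def kernel(i, j):
    """K(x_i, x_j) = sum over the orthonormal basis of e(x_i) * conj(e(x_j))."""
    return sum(e[i] * e[j].conjugate() for e in basis)

f = [2 - 3j * x + (1 + 1j) * x * x for x in X]   # an element of H

for j in range(len(X)):
    K_y = [kernel(i, j) for i in range(len(X))]  # the function x -> K(x, y), y = x_j
    print(abs(inner(f, K_y) - f[j]))             # should be at rounding level
```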
i i i i

We now want to compute the kernel function of the Hilbert space $H_k^2(\mathbf{H})$. It follows from
[1, Corollary 2.62] that for any fixed $k \in \mathbb{Z}_{\ge 0}$ and $z_0 \in \mathbf{H}$ there is a constant c such that
\[ |f(z_0)| \le c\,\|f\|_2 \]
for all $f \in H_k^2(\mathbf{H})$. Thus for any fixed $z_0 \in \mathbf{H}$ the map $f \mapsto f(z_0)$ is a continuous linear functional
on $H_k^2(\mathbf{H})$. Since $H_k^2(\mathbf{H})$ is a Hilbert space (so self-dual), there is therefore a unique $g_{z_0} \in H_k^2(\mathbf{H})$
for which
\[ f(z_0) = \langle f, g_{z_0}\rangle = \int_{\mathbf{H}} f(z)\,\overline{g_{z_0}(z)}\,\mathrm{Im}(z)^k\, dv(z) \]
for all $f \in H_k^2(\mathbf{H})$. We thus may take
\[ K(w,z) := g_z(w) \]
as the kernel function of $H_k^2(\mathbf{H})$. For any $f \in H_k^2(\mathbf{H})$ we have $f(z_1) = \langle f, g_{z_1}\rangle = \langle f, K_{z_1}\rangle$ and
\[ f(z_1) = \langle f, K_{z_1}\rangle = \int_{\mathbf{H}} f(z_2)\,\overline{K(z_2,z_1)}\,\mathrm{Im}(z_2)^k\, dv(z_2) = \int_{\mathbf{H}} K(z_1,z_2)\, f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2). \]

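We now want to consider the behavior of $K(z_1,z_2)$ under the action of $\alpha \in \mathrm{SL}_2(\mathbb{R})$ on $\mathbf{H}$ (via
linear fractional transformations). Recall that for $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we define $j(\alpha,z) := cz + d$, which
satisfies the identity
\[ j(\alpha\alpha',z) = j(\alpha,\alpha'z)\, j(\alpha',z), \]
and we have
\[ \mathrm{Im}(\alpha z) = \frac{\mathrm{Im}(z)}{|j(\alpha,z)|^2} = \frac{\mathrm{Im}(z)}{j(\alpha,z)\,\overline{j(\alpha,z)}}. \]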
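Both identities are easy to confirm numerically. A minimal sketch, with matrices stored as nested tuples and helper names chosen for this example:

```python
def act(alpha, z):
    """Linear fractional action of alpha = ((a, b), (c, d)) on z."""
    (a, b), (c, d) = alpha
    return (a * z + b) / (c * z + d)

def j(alpha, z):
    """Automorphy factor j(alpha, z) = c z + d."""
    (_, _), (c, d) = alpha
    return c * z + d

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][l] * B[l][m] for l in range(2)) for m in range(2))
                 for i in range(2))

alpha = ((1.5, 0.5), (1.0, 1.0))    # determinant 1
beta = ((1.0, 3.0), (0.0, 1.0))     # determinant 1
z = 0.3 + 0.7j

cocycle_defect = abs(j(mat_mul(alpha, beta), z) - j(alpha, act(beta, z)) * j(beta, z))
imag_defect = abs(act(alpha, z).imag - z.imag / abs(j(alpha, z)) ** 2)
print(cocycle_defect, imag_defect)
```

The cocycle identity holds for arbitrary 2×2 matrices; the formula for Im(αz) in this form uses det(α) = 1.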
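For any $\alpha \in \mathrm{SL}_2(\mathbb{R})$ and $f \in H_k^2(\mathbf{H})$, the function $f(\alpha^{-1}z)\, j(\alpha,\alpha^{-1}z)^{k} = f(\alpha^{-1}z)\, j(\alpha^{-1},z)^{-k}$ lies
in $H_k^2(\mathbf{H})$, and we have $dv(\alpha^{-1}z) = dv(z)$ and $\mathrm{Im}(\alpha^{-1}z)\,\overline{j(\alpha^{-1},z)} = \mathrm{Im}(z)\, j(\alpha^{-1},z)^{-1}$, thus
\[
\begin{aligned}
&\int_{\mathbf{H}} K(\alpha z_1,\alpha z_2)\, j(\alpha,z_1)^{-k}\,\overline{j(\alpha,z_2)}^{\,-k} f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2) \\
&\quad= j(\alpha,z_1)^{-k} \int_{\mathbf{H}} K(\alpha z_1,z_2)\, f(\alpha^{-1}z_2)\,\overline{j(\alpha,\alpha^{-1}z_2)}^{\,-k}\,\mathrm{Im}(\alpha^{-1}z_2)^k\, dv(\alpha^{-1}z_2) \\
&\quad= j(\alpha,z_1)^{-k} \int_{\mathbf{H}} K(\alpha z_1,z_2)\, f(\alpha^{-1}z_2)\, j(\alpha,\alpha^{-1}z_2)^{k}\,\mathrm{Im}(z_2)^k\, dv(z_2) \\
&\quad= j(\alpha,z_1)^{-k} f(\alpha^{-1}(\alpha z_1))\, j(\alpha,\alpha^{-1}(\alpha z_1))^{k} \\
&\quad= j(\alpha,z_1)^{-k} f(z_1)\, j(\alpha,z_1)^{k} \\
&\quad= f(z_1) \\
&\quad= \int_{\mathbf{H}} K(z_1,z_2)\, f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2).
\end{aligned}
\]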



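It follows from the uniqueness of kernel functions that for any $\alpha \in \mathrm{SL}_2(\mathbb{R})$ we have
\[ K(\alpha z_1,\alpha z_2) = K(z_1,z_2)\, j(\alpha,z_1)^k\, \overline{j(\alpha,z_2)}^{\,k} \]
and
\[ K(\alpha z_1,z_2)\, j(\alpha,z_1)^{-k} = K(z_1,\alpha^{-1}z_2)\, \overline{j(\alpha^{-1},z_2)}^{\,-k}. \]
Applying this to $\alpha = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ yields
\[ K(z_1+b, z_2+b) = K(z_1,z_2) \]
for all b ∈ R. If we let $M := \{(z_1,z_2) \in \mathbb{C}^2 : z_1 \in \mathbf{H},\ \overline{z_1 - z_2} \in \mathbf{H}\}$ and define
\[ h(z_1,z_2) := K(z_1, \overline{z_1 - z_2}), \]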

then $h(z_1,z_2)$ is a holomorphic function on M. Now $h(z_1+b,z_2) = h(z_1,z_2)$ for b ∈ R, so $h(z_1,z_2)$
is actually independent of $z_1$. For any $z \in \mathbf{H}$ we can pick $z_1 \in \mathbf{H}$ so that $(z_1,z) \in M$ and define
\[ P(z) := h(z_1,z) = K(z_1, \overline{z_1 - z}). \]
Then P(z) is holomorphic on $\mathbf{H}$ and we have
\[ K(z_1,z_2) = P(z_1 - \bar z_2) \]
for all $z_1, z_2 \in \mathbf{H}$. If we now consider $\alpha = \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \in \mathrm{SL}_2(\mathbb{R})$, we have $P(a^2z) = a^{-2k}P(z)$ and
\[ P(iy) = y^{-k}P(i) \]
for all y > 0. Since P(z) is holomorphic on $\mathbf{H}$ we must have
\[ P(z) = c_k \Big(\frac{z}{2i}\Big)^{-k} \]
for some constant $c_k > 0$. This yields the following theorem.

Theorem 16.1. $H_k^2(\mathbf{H})$ has kernel function $K(z_1,z_2) = c_k \left(\frac{z_1 - \bar z_2}{2i}\right)^{-k}$ for some $c_k \in \mathbb{R}_{>0}$.
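The transformation law K(αz1, αz2) = K(z1, z2) j(α, z1)^k conj(j(α, z2))^k that drove this derivation can be verified numerically for the explicit kernel of Theorem 16.1. The sketch below sets the constant c_k to 1, since the check does not depend on it; the sample matrix, points, and function names are arbitrary choices for this example.

```python
def act(alpha, z):
    """Linear fractional action of alpha = ((a, b), (c, d)) on z."""
    (a, b), (c, d) = alpha
    return (a * z + b) / (c * z + d)

def j(alpha, z):
    """Automorphy factor j(alpha, z) = c z + d."""
    (_, _), (c, d) = alpha
    return c * z + d

def K(z1, z2, k, c_k=1.0):
    """K(z1, z2) = c_k * ((z1 - conj(z2)) / (2i))^(-k)."""
    return c_k * ((z1 - z2.conjugate()) / 2j) ** (-k)

alpha = ((2.0, 1.0), (1.0, 1.0))   # an element of SL_2(R)
z1, z2, k = 0.3 + 0.7j, -1.2 + 2.5j, 4

lhs = K(act(alpha, z1), act(alpha, z2), k)
rhs = K(z1, z2, k) * j(alpha, z1) ** k * j(alpha, z2).conjugate() ** k
print(abs(lhs - rhs) / abs(rhs))

# On the diagonal the kernel is c_k * Im(z)^(-k), a positive real number.
print(K(z1, z1, k), z1.imag ** (-k))
```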

Corollary 16.2. H k2 (H) ⊆ H k∞ (H).

Proof. For any f ∈ H k2 (H) and z0 ∈ H we have

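\[ |f(z_0)|^2 = |\langle f(z), K(z,z_0)\rangle|^2 \le \|f\|_2^2\, \|K(z,z_0)\|_2^2 = \|f\|_2^2\, K(z_0,z_0) = c_k\, \mathrm{Im}(z_0)^{-k}\, \|f\|_2^2, \]
so $|f(z_0)\,\mathrm{Im}(z_0)^{k/2}| \le \sqrt{c_k}\,\|f\|_2$ for any $z_0 \in \mathbf{H}$, thus $f \in H_k^\infty(\mathbf{H})$.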

We now want to compute the constant $c_k$ in Theorem 16.1, and for this we need to define
the Fourier transform of $f \in H_k^2(\mathbf{H})$. For any $y \in \mathbb{R}_{>0}$ we define the function $f_y\colon \mathbb{R} \to \mathbb{C}$ via
\[ f_y(x) := f(x + iy). \]
Since $\|f\|_2^2 = \int_{\mathbf{H}} |f(x+iy)|^2 y^{k-2}\, dx\, dy < \infty$, we have $f_y \in L^2(\mathbb{R})$ for all y outside a set of
measure zero; let $\hat f_y(u) := \int_{-\infty}^{\infty} f_y(x) e^{-2\pi i u x}\, dx$ denote the Fourier transform of $f_y$ for all such y.



Theorem 16.3. For each $f \in H_k^2(\mathbf{H})$ there is a function $\hat f\colon \mathbb{R} \to \mathbb{C}$ such that $\hat f_y(u) = \hat f(u) e^{-2\pi u y}$
for all y outside a set of measure zero, and $\hat f(u)$ vanishes almost everywhere on $\mathbb{R}_{<0}$.

Corollary 16.4. If k ≤ 1 then H k2 (H) = {0}.

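We call the function $\hat f\colon \mathbb{R} \to \mathbb{C}$ of Theorem 16.3 the Fourier transform of $f \in H_k^2(\mathbf{H})$. We
henceforth assume k > 1, since $H_k^2(\mathbf{H})$ has dimension zero otherwise; see [1, Corollary 6.1.7].
We now define the function
\[ G_k(u) := \int_0^\infty y^{k-2} e^{-\pi u y}\, dy = (\pi u)^{1-k}\,\Gamma(k-1) \]
for u > 0, and let $G_k(u) := 0$ for u ≤ 0. Let $\hat H_k^2$ denote the space of measurable functions
$\varphi\colon \mathbb{R} \to \mathbb{C}$ for which

• φ vanishes almost everywhere on $\mathbb{R}_{<0}$;

• $\int_{-\infty}^{\infty} |\varphi(u)|^2 G_k(4u)\, du < \infty$.

Then $\hat H_k^2$ is a Hilbert space with inner product
\[ \langle \varphi, \phi\rangle := \int_0^\infty \varphi(u)\,\overline{\phi(u)}\, G_k(4u)\, du, \]
and for $f \in H_k^2(\mathbf{H})$ we have
\[ \|f\|_2^2 = \int_{-\infty}^{\infty} |\hat f(u)|^2 G_k(4u)\, du = \langle \hat f, \hat f\rangle. \]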

Thus for f ∈ H k2 (H) we have fˆ ∈ Ĥ k2 , and in fact this defines an isomorphism of Hilbert spaces.

Theorem 16.5. The map f 7→ fˆ is an isomorphism from H k2 (H) to Ĥ k2 .

Proof. This is Theorem 6.1.6 in [1].

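For $\varphi \in \hat H_k^2$ we define
\[ \hat\varphi(z) := \int_{-\infty}^{\infty} \varphi(u) e^{2\pi i u z}\, du. \]
Let $\hat K(u,z)$ denote the Fourier transform of $K(z_1,z)$ as a function of $z_1$, for any fixed z. For
any $\varphi \in \hat H_k^2$ we have
\[ \langle \varphi(u), \hat K(u,z)\rangle = \langle \hat\varphi(z_1), K(z_1,z)\rangle = \hat\varphi(z) = \int_0^\infty \varphi(u) e^{2\pi i u z}\, du, \]
and we also have
\[ \langle \varphi(u), \hat K(u,z)\rangle = \int_0^\infty \varphi(u)\,\overline{\hat K(u,z)}\, G_k(4u)\, du, \]
which implies
\[ \hat K(u,z) = G_k(4u)^{-1} e^{-2\pi i u \bar z} \]
for all u > 0 and $z \in \mathbf{H}$. Taking the inverse transform of $\hat K(u,z)$ as a function of u yields
\[ K(z_1,z_2) = \int_0^\infty G_k(4u)^{-1} e^{2\pi i u (z_1 - \bar z_2)}\, du, \]
and therefore
\[ c_k \Big(\frac{z}{2i}\Big)^{-k} = \int_0^\infty G_k(4u)^{-1} e^{2\pi i u z}\, du. \]
Applying this with z = 2i yields
\[ c_k = \frac{k-1}{4\pi} \]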

and the following theorem.

Theorem 16.6. For $k \in \mathbb{Z}_{>1}$ the kernel function of $H_k^2(\mathbf{H})$ is
\[ K(z_1,z_2) = \frac{k-1}{4\pi} \Big(\frac{z_1 - \bar z_2}{2i}\Big)^{-k}. \]
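The value of the constant can be sanity-checked numerically. At z = 2i the left side of the identity for c_k is just c_k and the integrand becomes (4πu)^{k−1} e^{−4πu} / Γ(k−1), using the closed form of G_k. The sketch below approximates that integral with a plain trapezoid rule; the truncation point, step count, and function name are arbitrary choices for this example.

```python
import math

def c_k_numeric(k, upper=10.0, steps=50000):
    """Trapezoid approximation of int_0^upper G_k(4u)^(-1) exp(-4*pi*u) du for k > 1.

    Uses G_k(4u)^(-1) = (4*pi*u)^(k-1) / Gamma(k-1). The integrand vanishes at
    u = 0 and is negligible at u = upper, so both endpoint terms are omitted.
    """
    h = upper / steps
    gamma = math.gamma(k - 1)
    total = 0.0
    for i in range(1, steps):
        u = i * h
        total += (4.0 * math.pi * u) ** (k - 1) * math.exp(-4.0 * math.pi * u) / gamma
    return total * h

for k in (2, 3, 4, 6):
    print(k, c_k_numeric(k), (k - 1) / (4.0 * math.pi))
```

The two printed columns should agree to many digits, consistent with Γ(k)/Γ(k−1) = k − 1.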

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #17 4/22/2024

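These notes summarize the material in §6.3-4 of [1] presented in lecture. Recall that the weight-k slash
operator associated to $\alpha \in \mathrm{GL}_2^+(\mathbb{R})$ on functions $f\colon \mathbf{H} \to \mathbb{C}$ is defined by
\[ (f|_k\alpha)(z) := (\det\alpha)^{k/2}\, j(\alpha,z)^{-k} f(\alpha z), \]
where $j(\alpha,z) := cz + d$ for $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{R})$ and $z \in \mathbf{H}$, and we have $\mathrm{Im}(\alpha z) = \frac{\det(\alpha)\,\mathrm{Im}(z)}{|j(\alpha,z)|^2}$.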

17.1 Function spaces of automorphic forms


Let Γ ≤ SL2 (R) be a lattice (which we recall is a discrete cofinite subgroup of SL2 (R), equiva-
lently, a finitely generated Fuchsian group of the first kind; see Lecture 5). Let k ∈ Z>2 , and let χ
be a finite order character of Γ with χ(−1) = (−1)^k if −1 ∈ Γ .
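For any measurable function $f\colon \mathbf{H} \to \mathbb{C}$ that satisfies
\[ (f|_k\gamma)(z) = \chi(\gamma) f(z) \quad (\text{for all } \gamma \in \Gamma) \qquad (1) \]
we define
\[
\|f\|_{\Gamma,p} := \begin{cases}
\left(\int_{\Gamma\backslash\mathbf{H}} |f(z)\,\mathrm{Im}(z)^{k/2}|^p\, dv(z)\right)^{1/p} & 1 \le p < \infty, \\
\operatorname{ess\,sup}_{z\in\mathbf{H}} |f(z)\,\mathrm{Im}(z)^{k/2}| & p = \infty.
\end{cases}
\]
For γ ∈ Γ we have
\[ |f(\gamma z)\,\mathrm{Im}(\gamma z)^{k/2}| = |(f|_k\gamma)(z)\, j(\gamma,z)^{k}\,\mathrm{Im}(\gamma z)^{k/2}| = |f(z)\,\mathrm{Im}(z)^{k/2}|, \]
which ensures that $\|f\|_{\Gamma,p}$ is well defined. We now let $L_k^p(\Gamma,\chi)$ denote the set of measurable
functions $f\colon \mathbf{H} \to \mathbb{C}$ that satisfy (1) with $\|f\|_{\Gamma,p} < \infty$, and use $H_k^p(\Gamma,\chi)$ to denote the subspace
of holomorphic $f \in L_k^p(\Gamma,\chi)$. Then $L_k^2(\Gamma,\chi)$ is a Hilbert space with inner product
\[ \langle f, g\rangle := \int_{\Gamma\backslash\mathbf{H}} f(z)\,\overline{g(z)}\,\mathrm{Im}(z)^k\, dv(z). \]
Now $\Gamma\backslash\mathbf{H}$ has finite volume (Γ is a lattice), which implies $L_k^\infty(\Gamma,\chi) \subseteq L_k^p(\Gamma,\chi)$ and $H_k^\infty(\Gamma,\chi) \subseteq
H_k^p(\Gamma,\chi)$ for all $p \in \mathbb{R}_{\ge 1}$. Moreover, we have $H_k^\infty(\Gamma,\chi) = S_k(\Gamma,\chi)$ (see [1, Theorem 2.1.5])
and the restriction of the inner product of $L_k^2(\Gamma,\chi)$ to $H_k^\infty(\Gamma,\chi)$ is just a rescaled version of the
Petersson inner product that omits the leading factor $v(\Gamma\backslash\mathbf{H})^{-1}$.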
Theorem 17.1. H k2 (Γ , χ) = H k∞ (Γ , χ)
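Proof. This is immediate when Γ has no cusps, since then $\Gamma\backslash\mathbf{H}$ is compact. So let $x_0$ be a cusp
of Γ and let $f \in H_k^2(\Gamma,\chi)$. By replacing Γ with a finite index subgroup we can assume χ is trivial.
Pick $\sigma \in \mathrm{SL}_2(\mathbb{R})$ so that $\sigma\infty = x_0$ and put $\pm\sigma^{-1}\Gamma_{x_0}\sigma = \pm\left\langle \begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix} \right\rangle$, with $h \in \mathbb{R}_{>0}$. If we pick a
neighborhood $U_l = \{z \in \mathbf{H} : \mathrm{Im}(z) > l\}$ of ∞ and let $\sum_n a_n e^{\pi i n z/h}$ denote the Fourier expansion
of $f|_k\sigma$, then
\[
\begin{aligned}
\infty &> \int_{\Gamma\backslash\mathbf{H}} |f(z)|^2\,\mathrm{Im}(z)^k\, dv(z) \\
&\ge \frac{1}{2} \iint_{\substack{0\le \mathrm{Re}(z)\le 2h \\ l\le \mathrm{Im}(z)<\infty}} |f(\sigma z)\, j(\sigma,z)^{-k}|^2\,\mathrm{Im}(z)^k\, dv(z) \\
&= \frac{1}{2} \int_l^\infty \int_0^{2h} \sum_{m,n\in\mathbb{Z}} a_m \bar a_n\, e^{-\pi y(m+n)/h} e^{\pi i x(m-n)/h}\, y^{k-2}\, dx\, dy \\
&\ge h\,|a_n|^2 \int_l^\infty e^{-2\pi y n/h}\, y^{k-2}\, dy
\end{aligned}
\]
for any n ∈ Z. For k > 2 the last integral diverges when n ≤ 0, so this implies $a_n = 0$ if n ≤ 0, and therefore $f \in S_k(\Gamma,\chi) = H_k^\infty(\Gamma,\chi)$.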

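For $f \in L_k^1(\mathbf{H})$ we define
\[ f^\Gamma(z) := \frac{1}{\#Z(\Gamma)} \sum_{\gamma\in\Gamma} \chi(\gamma)\, f(\gamma z)\, j(\gamma,z)^{-k} \]
and
\[ K^\Gamma(z_1,z_2) := \frac{1}{\#Z(\Gamma)} \sum_{\gamma\in\Gamma} \chi(\gamma)\, K(\gamma z_1,z_2)\, j(\gamma,z_1)^{-k}, \]
where $\#Z(\Gamma) \in \{1,2\}$ counts the scalar elements ±1 of Γ. In the previous lecture we proved
\[ K(z_1,z_2) = \frac{k-1}{4\pi} \Big(\frac{z_1 - \bar z_2}{2i}\Big)^{-k}, \qquad (2) \]
and we note that $K(z_2,z_1) = \overline{K(z_1,z_2)}$ implies $K^\Gamma(z_2,z_1) = \overline{K^\Gamma(z_1,z_2)}$, for all $z_1, z_2 \in \mathbf{H}$.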

Theorem 17.2. If f ∈ L k1 (H) then f Γ ∈ L k1 (Γ , χ), and if f ∈ H k1 (H) then f Γ ∈ H k1 (Γ , χ).

Proof. See Theorem 6.3.2 in [1].

Theorem 17.3. For f ∈ L k1 (H) the sum in f Γ converges absolutely on H and f Γ lies in L k1 (Γ , χ),
and in H k1 (Γ , χ) if f is holomorphic. In particular, K Γ (z, z2 ) ∈ H k1 (Γ , χ) for every fixed z2 ∈ H.

Proof. See [1, Theorem 6.4.2].

Theorem 17.4. K Γ (z1 , z2 ) is the kernel function of H k2 (Γ , χ).

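Proof. This is [1, Theorem 6.3.3], but modulo issues of convergence, for $f \in H_k^2(\Gamma,\chi)$ we have
\[
\begin{aligned}
\langle f, K^\Gamma_{z_1}\rangle &= \int_{\Gamma\backslash\mathbf{H}} f(z_2)\,\overline{K^\Gamma(z_2,z_1)}\,\mathrm{Im}(z_2)^k\, dv(z_2) = \int_{\Gamma\backslash\mathbf{H}} K^\Gamma(z_1,z_2)\, f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2) \\
&= \frac{1}{\#Z(\Gamma)} \sum_{\gamma\in\Gamma} \int_{\gamma^{-1}F} K(z_1,z_2)\, f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2) \\
&= \int_{\mathbf{H}} K(z_1,z_2)\, f(z_2)\,\mathrm{Im}(z_2)^k\, dv(z_2) \\
&= f(z_1),
\end{aligned}
\]
where F denotes a fundamental domain for Γ, since $f \in H_k^2(\Gamma,\chi) \subseteq H_k^2(\mathbf{H})$ and $K(z_1,z_2)$ is the kernel function of $H_k^2(\mathbf{H})$. This implies that
$K^\Gamma(z_1,z_2)$ is the kernel function of $H_k^2(\Gamma,\chi)$, provided $K^\Gamma_{z_1} \in H_k^2(\Gamma,\chi)$, which one can show follows
from $K_{z_1} \in H_k^2(\mathbf{H})$.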

This gives us a new way to compute the dimension of Sk (Γ , χ).

Corollary 17.5. For any lattice Γ ≤ SL2 (R) and integer k > 2 we have the dimension formula
Z
dim Sk (Γ , χ) = K Γ (z, z) Im(z)k d v(z).
Γ \H



Proof. Let $f_1, \ldots, f_r$ be an orthonormal basis for $S_k(\Gamma,\chi)$ with respect to the inner product on
$H_k^2(\Gamma,\chi) = H_k^\infty(\Gamma,\chi) = S_k(\Gamma,\chi)$. Then
\[ K^\Gamma(z_1,z_2) = \sum_{i=1}^r f_i(z_1)\,\overline{f_i(z_2)}, \]
since as noted in Lecture 16, the RHS is the kernel of any finite dimensional Hilbert space with
orthonormal basis $f_1, \ldots, f_r$, and we have
\[ \dim S_k(\Gamma,\chi) = r = \sum_{i=1}^r \langle f_i, f_i\rangle = \sum_{i=1}^r \int_{\Gamma\backslash\mathbf{H}} f_i(z)\,\overline{f_i(z)}\,\mathrm{Im}(z)^k\, dv(z) = \int_{\Gamma\backslash\mathbf{H}} K^\Gamma(z,z)\,\mathrm{Im}(z)^k\, dv(z). \]

17.2 Traces of Hecke operators


Let $\Gamma \le \mathrm{SL}_2(\mathbb{R})$ be a lattice and let Δ ⊇ Γ be a semigroup contained in the commensurator $\tilde\Gamma$ of
Γ in $\mathrm{GL}_2^+(\mathbb{R})$ (so for α ∈ Δ the group $\alpha\Gamma\alpha^{-1}$ is commensurable with Γ). Let χ be a finite order
character of Γ with $\chi(-1) = (-1)^k$ if −1 ∈ Γ as above, extending to a homomorphism $\Delta \to \mathbb{C}^\times$
with
\[ \chi(\alpha\gamma\alpha^{-1}) = \chi(\gamma) \]
for all γ ∈ Γ and α ∈ Δ.
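Fix $k \in \mathbb{Z}_{>2}$. Recall that the Hecke algebra $\mathbb{Z}[\Gamma\backslash\Delta/\Gamma]$ acts on $f \in S_k(\Gamma,\chi)$ via
\[ (f|_k\Gamma\alpha\Gamma)(z) := \det(\alpha)^{k-1} \sum_{i=1}^r \chi(\alpha_i)\, j(\alpha_i,z)^{-k} f(\alpha_i z), \]
where $\Gamma\alpha\Gamma = \bigsqcup_{i=1}^r \Gamma\alpha_i$ is any right coset decomposition and the action of an integer linear
combination of double cosets is computed by extending Z-linearly. For α ∈ Δ we define
\[ \kappa(\alpha,z) := \det(\alpha)^{k-1}\chi(\alpha)\, K(\alpha z,z)\, j(\alpha,z)^{-k}\,\mathrm{Im}(z)^k. \]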

If $T \in \mathbb{Z}[\Gamma\backslash\Delta/\Gamma]$ is a sum of disjoint double cosets (a single double coset, for example) then the
union of the double coset summands of T is a subset of Δ that admits a finite decomposition
into right cosets $T = \bigsqcup_{i=1}^r \Gamma\alpha_i$; the Hecke operator $T(n) \in \mathbb{T}(N) = \mathbb{Z}[\Gamma_0(N)\backslash\Delta_0(N)/\Gamma_0(N)]$ is
a notable example; it corresponds to the set $\{\alpha \in \Delta_0(N) : \det(\alpha) = n\} \subseteq \Delta_0(N)$. We will regard
such Hecke operators as elements of $\mathbb{Z}[\Gamma\backslash\Delta/\Gamma]$ that are also subsets of Δ we can sum over.

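Theorem 17.6. Let $T \in \mathbb{Z}[\Gamma\backslash\Delta/\Gamma]$ be a sum of double cosets. The trace of T acting on $S_k(\Gamma,\chi)$ is
\[ \mathrm{tr}(T) = \mathrm{tr}(T|S_k(\Gamma,\chi)) = \frac{1}{\#Z(\Gamma)} \int_{\Gamma\backslash\mathbf{H}} \sum_{\alpha\in T} \kappa(\alpha,z)\, dv(z). \]

Proof. Let $f_1, \ldots, f_r$ be an orthonormal basis of $S_k(\Gamma,\chi)$ and let $T = \bigsqcup_{i=1}^n \Gamma\alpha_i$. Then
\[
\begin{aligned}
\mathrm{tr}(T) &= \sum_{l=1}^r \langle f_l|_kT, f_l\rangle \\
&= \sum_{l=1}^r \int_{\Gamma\backslash\mathbf{H}} \sum_{i=1}^n \det(\alpha_i)^{k-1}\chi(\alpha_i)\, j(\alpha_i,z)^{-k} f_l(\alpha_i z)\,\overline{f_l(z)}\,\mathrm{Im}(z)^k\, dv(z) \\
&= \int_{\Gamma\backslash\mathbf{H}} \sum_{i=1}^n \det(\alpha_i)^{k-1}\chi(\alpha_i)\, j(\alpha_i,z)^{-k} K^\Gamma(\alpha_i z,z)\,\mathrm{Im}(z)^k\, dv(z) \\
&= \frac{1}{\#Z(\Gamma)} \int_{\Gamma\backslash\mathbf{H}} \sum_{\alpha\in T} \det(\alpha)^{k-1}\chi(\alpha)\, K(\alpha z,z)\, j(\alpha,z)^{-k}\,\mathrm{Im}(z)^k\, dv(z) \\
&= \frac{1}{\#Z(\Gamma)} \int_{\Gamma\backslash\mathbf{H}} \sum_{\alpha\in T} \kappa(\alpha,z)\, dv(z).
\end{aligned}
\]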

In order to compute the integral in Theorem 17.6 we need to treat cusps of Γ separately.
For x ∈ P1 (R) a cusp of Γ , and a Hecke operator T ⊆ ∆, let Tx := {α ∈ T : αx = x}. If U x is a
neighborhood of x that is stable under the action of Γ x := {γ ∈ Γ : γx = x} then
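\[ \int_{\Gamma_x\backslash U_x} \sum_{\alpha\in T} \kappa(\alpha,z)\, dv(z) = \int_{\Gamma_x\backslash U_x} \sum_{\alpha\in T-T_x} \kappa(\alpha,z)\, dv(z) + \int_{\Gamma_x\backslash U_x} \sum_{\alpha\in T_x} \kappa(\alpha,z)\, dv(z). \]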

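Lemma 17.7. For any cusp x of Γ with $\Gamma_x$-stable neighborhood $U_x$ and Hecke operator $T \in
\mathbb{Z}[\Gamma\backslash\Delta/\Gamma]$ that is a sum of double cosets we have
\[ \int_{\Gamma_x\backslash U_x} \sum_{\alpha\in T-T_x} \kappa(\alpha,z)\, dv(z) = \sum_{\alpha\in T-T_x} \int_{\Gamma_x\backslash U_x} \kappa(\alpha,z)\, dv(z) \]
and $[T_x : \Gamma_x] < \infty$.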

Proof. See [1, Theorem 6.4.5] and [1, Lemma 6.4.6].

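Theorem 17.8. Let x be a cusp of Γ with σx = ∞ for some $\sigma \in \mathrm{SL}_2(\mathbb{R})$. Then
\[ \int_{\Gamma_x\backslash U_x} \sum_{\alpha\in T_x} \kappa(\alpha,z)\, dv(z) = \lim_{s\to 0^+} \sum_{\alpha\in T_x} \int_{\Gamma_x\backslash U_x} \kappa(\alpha,z)\,\mathrm{Im}(z)^{-s}\, |j(\sigma,z)|^{2s}\, dv(z). \]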

Proof. This is [1, Theorem 6.4.7].

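Let $P_\Gamma \subseteq \mathbb{P}^1(\mathbb{R})$ denote the set of cusps of Γ and choose neighborhoods $U_x$ of $x \in P_\Gamma$ so
that $U_{\gamma x} = \gamma U_x$ and $U_x \cap U_{x'} = \emptyset$ for x ≠ x′, and choose $\sigma_x \in \mathrm{SL}_2(\mathbb{R})$ so that $\sigma_x x = \infty$ and
$\mathrm{Im}(\sigma_{\gamma x}\gamma z) = \mathrm{Im}(\sigma_x z)$ for γ ∈ Γ and $z \in \mathbf{H}$. Let T ⊆ Δ be a union of double Γ-cosets, put
$Z(T) := T \cap \mathbb{R}^\times$ and $T_\infty := \bigcup_{x\in P_\Gamma} (T_x - Z(T))$. We then have
\[ \mathrm{tr}(T) = \frac{1}{\#Z(\Gamma)} \left( \int_{\Gamma\backslash\mathbf{H}} \sum_{\alpha\in T-T_\infty} \kappa(\alpha,z)\, dv(z) + \lim_{s\to 0^+} \sum_{\alpha\in T_\infty} \int_{\Gamma\backslash\mathbf{H}} \kappa(\alpha,z,s)\, dv(z) \right), \]
where $\kappa(\alpha,z,s) := \kappa(\alpha,z)\,\mathrm{Im}(z)^{-s}\, |j(\sigma_x,z)|^{2s}$ for $z \in \bigcup_{\alpha x=x} U_x$.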



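For α ∈ T let Γ(α) := {γ ∈ Γ : αγ = γα}, and for any union of Γ-conjugacy classes S ⊆ T let
conj_Γ(S) denote a set of Γ-conjugacy class representatives. Then
\[
\operatorname{tr}(T) = \frac{1}{\#Z(\Gamma)} \left( \sum_{\alpha\in\operatorname{conj}_\Gamma(T-T_\infty)} \int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z)\,dv(z)
+ \lim_{s\to 0^+} \sum_{\alpha\in\operatorname{conj}_\Gamma(T_\infty)} \int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z,s)\,dv(z) \right).
\]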

We now consider the integrals ∫_{Γ(α)\H} κ(α, z) dv(z). There are five different cases:

1. α is scalar;
2. α is elliptic (tr(α)^2 < 4 det(α));
3. α is hyperbolic (tr(α)^2 > 4 det(α)) with fixed points that are not cusps of Γ;
4. α is hyperbolic (tr(α)^2 > 4 det(α)) with a fixed point that is a cusp of Γ;
5. α is parabolic (tr(α)^2 = 4 det(α)).
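The case distinction above depends only on the sign of tr(α)^2 − 4 det(α), the discriminant of the characteristic polynomial of α. The following is a minimal sketch, assuming α is given by four real entries with positive determinant; the function name `classify` is hypothetical. It does not decide whether the fixed points of a hyperbolic α are cusps of Γ, since that depends on Γ.

```python
def classify(a, b, c, d):
    """Classify the real matrix [[a, b], [c, d]] with positive determinant.

    Returns "scalar", "elliptic", "hyperbolic", or "parabolic" by comparing
    tr^2 with 4*det. Whether a hyperbolic fixed point is a cusp of a given
    lattice is not decided here.
    """
    det = a * d - b * c
    if det <= 0:
        raise ValueError("expected a matrix with positive determinant")
    if b == 0 and c == 0 and a == d:
        return "scalar"
    disc = (a + d) ** 2 - 4 * det  # discriminant of the characteristic polynomial
    if disc < 0:
        return "elliptic"      # complex conjugate eigenvalues, one fixed point in H
    if disc > 0:
        return "hyperbolic"    # two distinct real eigenvalues
    return "parabolic"         # repeated real eigenvalue, not scalar
```

For example, `classify(0, -1, 1, 0)` reports the matrix fixing i as elliptic, and `classify(1, 1, 0, 1)` reports the translation z ↦ z + 1 as parabolic.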

17.2.1 Scalar case



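We have α = \(\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}\) with Γ(α) = Γ and αz = z, so
\[
\begin{aligned}
\int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z)\,dv(z)
&= \det(\alpha)^{k-1}\chi(\alpha) \int_{\Gamma\backslash\mathbf{H}} K(\alpha z, z)\, j(\alpha,z)^{-k}\operatorname{Im}(z)^k\,dv(z) \\
&= a^{2k-2}\chi(\alpha) \int_{\Gamma\backslash\mathbf{H}} \frac{k-1}{4\pi}\left(\frac{z-\bar z}{2i}\right)^{-k} a^{-k}\operatorname{Im}(z)^k\,dv(z) \\
&= \frac{k-1}{4\pi}\, a^{k-2}\chi(\alpha)\, v(\Gamma\backslash\mathbf{H}).
\end{aligned}
\]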

17.2.2 Elliptic case


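If z_0 ∈ H is the fixed point of α and we put ρ := \(\begin{pmatrix} 1 & -z_0 \\ 1 & -\bar z_0 \end{pmatrix}\) then ραρ^{−1} = \(\begin{pmatrix} \eta & 0 \\ 0 & \zeta \end{pmatrix}\), where η, ζ are
the eigenvalues of α, and if we put w = ρz = re^{iθ} then one finds that
\[
\kappa(\alpha,z) = \frac{k-1}{4\pi}\,(\eta\zeta)^{k-1}\chi(\alpha)\,\zeta^{-k} \left(\frac{1-r^2}{1-(\eta/\zeta)r^2}\right)^{k}
\]
and using dv(w) = 4r(1 − r^2)^{−2} dr dθ we obtain
\[
\begin{aligned}
\int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z)\,dv(z)
&= \frac{k-1}{4\pi\zeta}\,\eta^{k-1}\chi(\alpha) \int_0^1\!\!\int_0^{\pi} 4r(1-r^2)^{k-2}\bigl(1-(\eta/\zeta)r^2\bigr)^{-k}\,dr\,d\theta \\
&= \frac{(k-1)\,\eta^{k-1}\chi(\alpha)}{[\Gamma(\alpha):Z(\Gamma)]\,\zeta} \int_0^1 (1-t)^{k-2}\bigl(1-(\eta/\zeta)t\bigr)^{-k}\,dt \\
&= \frac{(k-1)\,\eta^{k-1}\chi(\alpha)}{[\Gamma(\alpha):Z(\Gamma)]\,\zeta}\,\bigl((k-1)(1-\eta/\zeta)\bigr)^{-1} \\
&= \frac{\chi(\alpha)}{[\Gamma(\alpha):Z(\Gamma)]}\cdot\frac{\eta^{k-1}}{\zeta-\eta}.
\end{aligned}
\]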
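The only nontrivial step in the elliptic computation is the identity ∫_0^1 (1 − t)^{k−2} (1 − ct)^{−k} dt = ((k − 1)(1 − c))^{−1} with c = η/ζ. It follows from differentiating ((1 − t)/(1 − ct))^{k−1}, and it can be sanity-checked numerically. The sketch below is an illustration only: the helper names and the choice of Simpson's rule are assumptions made for this example, and c is taken on the unit circle with c ≠ 1, as happens for an elliptic α.

```python
import cmath


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; f may be complex-valued."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def elliptic_integral(k, c):
    """Numerically integrate (1 - t)^(k-2) * (1 - c*t)^(-k) over [0, 1]."""
    return simpson(lambda t: (1 - t) ** (k - 2) * (1 - c * t) ** (-k), 0.0, 1.0)


def closed_form(k, c):
    """Value ((k - 1)(1 - c))^(-1) used in the elliptic case."""
    return 1 / ((k - 1) * (1 - c))


# Example ratio of eigenvalues: c = exp(2*pi*i/3), a point on the unit circle.
c = cmath.exp(2j * cmath.pi / 3)
error = abs(elliptic_integral(4, c) - closed_form(4, c))
```

Here `error` should be at the level of the quadrature error, which is small because the integrand is smooth on [0, 1] when c is not a positive real number.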



17.2.3 Hyperbolic case with no fixed cusps

In this case one finds that
\[
\int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z)\,dv(z) = 0.
\]

17.2.4 Hyperbolic case with a fixed cusp

In this case one finds that
\[
\int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z)\,dv(z) = -\chi(\alpha)\operatorname{sgn}(\zeta)^k\,\frac{\min(|\eta|,|\zeta|)^{k-1}}{|\zeta-\eta|}.
\]

17.2.5 Parabolic case

Let x be the fixed point of α; then x is a cusp of Γ and Γ(α) = Γ_x; see the argument in [1].
Choose σ ∈ SL_2(R) so that σx = ∞, and let σασ^{−1} = \(\begin{pmatrix} \zeta & \lambda \\ 0 & \zeta \end{pmatrix}\). Let T^p be the set of parabolic
elements in T, and for each cusp x of Γ let T_x^p := T^p ∩ T_x. Then ∪_{x∈Γ\P_Γ} T_x^p is a complete
set of representatives for conj_Γ(T^p). Let ±σΓ_xσ^{−1} = ⟨±\(\begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix}\)⟩ with h > 0, and for α ∈ T_x^p put
h(α) := λ/ζ and let sgn(α) = sgn(ζ). Then
\[
\lim_{s\to 0^+} \sum_{\alpha\in T_x^p} \int_{\Gamma(\alpha)\backslash\mathbf{H}} \kappa(\alpha,z,s)\,dv(z)
= \lim_{s\to 0^+} \frac{1}{2\pi} \sum_{\alpha\in T_x^p} \chi(\alpha)\operatorname{sgn}(\alpha)^k \det(\alpha)^{k/2-1}\,\bigl(ih/h(\alpha)\bigr)^{1+s}.
\]

17.3 The trace formula


Let Γ ≤ SL_2(R) be a lattice, let ∆ be a semigroup with Γ ≤ ∆ ≤ Γ̃, let χ be a finite order
character of Γ with χ(−1) = (−1)^k if −1 ∈ Γ, extended to ∆ so that χ(αγα^{−1}) = χ(γ) for α ∈ ∆
and γ ∈ Γ, and let T ∈ Z[Γ\∆/Γ] be a sum of Γ-double cosets viewed as a subset of ∆.
Define Z(T) = T ∩ R^×, let T^e be the subset of elliptic α ∈ T, let T^h be the subset of hyperbolic
α ∈ T whose fixed points are cusps of Γ, let T^{h′} be the subset of hyperbolic α ∈ T whose fixed
points are not cusps of Γ, and let T^p be the subset of parabolic α ∈ T whose fixed points are
cusps of Γ, so that
\[
T = Z(T) \sqcup T^e \sqcup T^h \sqcup T^{h'} \sqcup T^p.
\]
For α ∈ T let η_α and ζ_α denote the eigenvalues of α, with η_α = re^{iθ} and ζ_α = re^{−iθ} if α is elliptic
with σασ^{−1} = r\(\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}\). For non-elliptic α let sgn(α) := sgn(ζ_α), and for parabolic α let
m(α) = λ/(hζ_α), where ±σΓ(α)σ^{−1} = ⟨±\(\begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix}\)⟩.


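Theorem 17.9. The trace of T acting on S_k(Γ, χ) is
\[
\operatorname{tr}(T) = t_0 + t_e + t_h + t_p,
\]
where
\[
\begin{aligned}
t_0 &= \frac{(k-1)\,v(\Gamma\backslash\mathbf{H})}{4\pi\,\#Z(\Gamma)} \sum_{\alpha\in Z(T)} \chi(\alpha)\operatorname{sgn}(\alpha)^k \det(\alpha)^{k/2-1}, \\
t_e &= -\sum_{\alpha\in\operatorname{conj}_\Gamma(T^e)} \frac{\chi(\alpha)\,\eta_\alpha^{k-1}}{|\Gamma(\alpha)|\,(\eta_\alpha-\zeta_\alpha)}, \\
t_h &= -\frac{1}{\#Z(\Gamma)} \sum_{\alpha\in\operatorname{conj}_\Gamma(T^h)} \chi(\alpha)\operatorname{sgn}(\alpha)^k\,\frac{\min(|\zeta_\alpha|,|\eta_\alpha|)^{k-1}}{|\zeta_\alpha-\eta_\alpha|}, \\
t_p &= \lim_{s\to 0^+} \frac{1}{2\pi\,\#Z(\Gamma)} \sum_{\alpha\in\operatorname{conj}_\Gamma(T^p)} \chi(\alpha)\operatorname{sgn}(\alpha)^k \det(\alpha)^{k/2-1}\,\bigl(i/m(\alpha)\bigr)^{1+s}.
\end{aligned}
\]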

References
[1] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #18 4/22/2024

These notes summarize the material in §6.1-6.3 of [1] presented in lecture.

18.1 The Jacobian of a compact Riemann surface



Let X be a compact Riemann surface of genus g with holomorphic atlas {ψ_i : U_i → V_i ⊆ C}
and transition maps ψ_{ij} := ψ_j ∘ ψ_i^{−1} : V_{ij} → V_j for all U_i ∩ U_j ≠ ∅, where V_{ij} := ψ_i(U_i ∩ U_j).
In Lecture 7 we considered the space Ω^1(X) of meromorphic differentials ω of degree n on X
defined by a collection of meromorphic functions {ω_i : V_i → C} satisfying
\[
\omega_i = \left(\frac{d\psi_i}{d\psi_j}\right)^{n} (\omega_j\circ\psi_{ij})
\]
on V_{ij} (modulo equivalence {ω_i} = ω ≃ θ = {θ_i} if {ω_i} ∪ {θ_i} is a meromorphic differential).
We associated to each ω a divisor div(ω) = Σ_P ord_P(ω_i ∘ ψ_i) P in the divisor group Div(X) (the
free abelian group generated by P ∈ X(C)), and call a meromorphic differential ω holomorphic
if ω = 0 or div(ω) ≥ 0 (meaning every coefficient of div(ω) is nonnegative). We showed that
the space of holomorphic differentials Ω^1_0(X) is a complex vector space of dimension g.
For a lattice Γ ≤ SL2 (R), we showed how to associate to each automorphic f ∈ A2n (Γ ) a
meromorphic differential ω f of degree n, and we showed that ω f is holomorphic precisely
when f is a cusp form. For cusp forms of weight 2 we get a holomorphic differential of degree 1
and the map f 7→ ω f induces an isomorphism S2 (Γ ) ≃ Ω10 (X Γ ), where X Γ := Γ \H∗ .
The homology group H_1(X, Z) formed by equivalence classes of closed loops on X (1-cycles
modulo boundaries) is a free abelian group of rank 2g. Each γ ∈ H_1(X, Z) determines a
linear form on Ω10 (X ) via the map
Z
ω 7→ ω.
γ

This induces an injective map Z^{2g} ≃ H_1(X, Z) ↪ Ω^1_0(X)^∨ = Hom(Ω^1_0(X), C) ≃ C^g that allows
us to view H_1(X, Z) as a lattice Λ = H_1(X, Z) ≃ Z^{2g} in the vector space V = Ω^1_0(X)^∨ ≃ C^g. The
quotient Ω^1_0(X)^∨/H_1(X, Z) is a complex torus V/Λ that can be equipped with a Riemann form
(a positive definite Hermitian form V × V → C whose imaginary part maps Λ × Λ to Z) that induces
a polarization (a homomorphism to the dual torus) which makes it an abelian variety called the
Jacobian of X, denoted Jac(X).
It turns out that because our complex torus V /Λ arose from a Riemann surface, this polar-
ization is actually an isomorphism and Jac(X ) is a principally polarized abelian variety.
Recall that compact Riemann surfaces are also algebraic curves (varieties of dimension 1)
that are smooth projective curves over C. The Jacobian of a compact Riemann surface is also an
algebraic variety, and in fact a smooth projective variety of dimension g. Moreover, as a complex
torus, the Jacobian is naturally equipped with a group operation (add vectors in V and reduce
modulo Λ), and this induces a group structure on the corresponding projective variety that is
defined by morphisms, making it an algebraic group. This group is obviously abelian (since V/Λ
is), but in fact any smooth projective variety that is also an algebraic group, including those
that are not necessarily Jacobians, has an abelian group structure; thus all projective algebraic
groups are called abelian varieties (note that linear algebraic groups like GL_n are not projective).
The moduli space A g of principally polarized abelian varieties of dimension g has dimension
g(g+1)/2, while the moduli space M g of smooth projective curves of genus g ≥ 2 has dimension
3g − 3. Thus for large g almost all principally polarized abelian varieties are not Jacobians.
But for g = 1 every abelian variety is a Jacobian (in fact an elliptic curve), and for g = 2, 3

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland


the dimensions of A g and M g coincide, meaning that almost all principally polarized abelian
varieties of dimension g ≤ 3 are Jacobians.
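The dimension count above can be tabulated directly. This is a small sketch using only the two formulas quoted in the text, dim A_g = g(g+1)/2 and dim M_g = 3g − 3 for g ≥ 2; the function names are hypothetical.

```python
def dim_Ag(g):
    """Dimension of the moduli space of principally polarized abelian varieties."""
    return g * (g + 1) // 2


def dim_Mg(g):
    """Dimension of the moduli space of smooth projective curves of genus g >= 2."""
    if g < 2:
        raise ValueError("the formula 3g - 3 applies for g >= 2")
    return 3 * g - 3


# The gap dim A_g - dim M_g is zero for g = 2, 3 and grows quadratically afterwards.
gaps = {g: dim_Ag(g) - dim_Mg(g) for g in range(2, 8)}
```

The gap is (g − 2)(g − 3)/2, which is why genus 4 is the first case in which Jacobians form a proper closed subset of A_g.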
Recall the divisor class group Pic^0(X) = Div^0(X)/Prin(X) of divisors of degree zero modulo
principal divisors (those associated to elements of the function field C(X)). We should note that
our identification of Pic^0(X) with the divisor class group is valid for smooth projective curves
over an algebraically closed field, but Pic^0(X) also has an intrinsic definition that may differ
from the divisor class group in other settings.
There is a canonical isomorphism Pic^0(X) ≃ Jac(X) defined as follows. Given D ∈ Div^0(X)
we write D in the form
\[
D = \sum_i (P_i - Q_i)
\]
(this is always possible for X/C) and associate to D the linear form
\[
\omega \mapsto \sum_i \int_{P_i}^{Q_i} \omega
\]
in Ω^1_0(X)^∨. Reducing modulo the lattice H_1(X, Z) yields an element of Jac(X). This is the
Abel-Jacobi map, which can be viewed as an extension of the map X → Pic^0(X) that sends x ∈ X(C)
to the divisor x − x_0, where x_0 ∈ X(C) is a fixed rational point.
If φ : X → Y is a nonconstant holomorphic map of compact Riemann surfaces, we have a
pullback map φ^* : C(Y) → C(X) defined by φ^* f = f ∘ φ, which extends to a linear map of
holomorphic differentials
φ^* : Ω^1_0(Y) → Ω^1_0(X).
Taking duals induces the pushforward map on Jacobians φ_* : Jac(X) → Jac(Y). There is also a
pushforward map φ_* : Pic^0(X) → Pic^0(Y) induced by the pushforward map Div^0(X) → Div^0(Y)
that sends P ∈ X(C) to φ(P) ∈ Y(C), and the two maps are compatible with the isomorphisms
Jac(X) ≃ Pic^0(X) and Jac(Y) ≃ Pic^0(Y).
There is also a pullback map φ^* : Pic^0(Y) → Pic^0(X) induced by the map Div^0(Y) → Div^0(X)
that sends Q ∈ Y(C) to the sum of the points P ∈ φ^{−1}(Q) (counted with multiplicity), and a
corresponding pullback map Jac(Y) → Jac(X) that can be defined using the trace map
Ω^1_0(X) → Ω^1_0(Y). These pullback maps are also compatible with the isomorphisms
Jac(X) ≃ Pic^0(X) and Jac(Y) ≃ Pic^0(Y).

18.2 Modular Jacobians and Hecke operators


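Let Γ_1 and Γ_2 be congruence subgroups, α ∈ GL_2^+(Q) and Γ_3 = α^{−1}Γ_1α ∩ Γ_2. If we decompose
Γ_3\Γ_2 as ⊔_i Γ_3γ_i and Γ_1αΓ_2 as ⊔_j Γ_1β_j and let Γ_3′ := αΓ_3α^{−1} = Γ_1 ∩ αΓ_2α^{−1} then
\[
\Gamma_2 \longleftarrow \Gamma_3 \overset{\sim}{\longrightarrow} \Gamma_3' \longrightarrow \Gamma_1
\]
where the isomorphism is γ ↦ αγα^{−1} and the other arrows are inclusions. This induces
morphisms of modular curves
\[
X_{\Gamma_2} \overset{\pi_2}{\longleftarrow} X_{\Gamma_3} \simeq X_{\Gamma_3'} \overset{\pi_1}{\longrightarrow} X_{\Gamma_1}.
\]
Each point of X_{Γ_2} is mapped via π_1 ∘ α ∘ π_2^{−1} to a multiset of points of X_{Γ_1}. Viewing points on
X_{Γ_1} and X_{Γ_2} as Γ_i-orbits in H^* we have
\[
\Gamma_2 z \overset{\pi_2^{-1}}{\longmapsto} \{\Gamma_3\gamma_i z\} \overset{\alpha}{\longrightarrow} \{\Gamma_3'\beta_j z\} \overset{\pi_1}{\longrightarrow} \{\Gamma_1\beta_j z\}.
\]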



We have a corresponding map Γ_1αΓ_2 : Div(X_{Γ_2}) → Div(X_{Γ_1}), which induces a map on divisor
classes of degree zero, Pic^0(X_{Γ_2}) → Pic^0(X_{Γ_1}), that we can view as a map of Jacobians of
modular curves: Jac(X_{Γ_2}) → Jac(X_{Γ_1}).


Applying S2 (Γ ) ≃ Ω10 (X Γ ) and taking the corresponding isomorphism of the dual space yields
an isomorphism
Jac(X Γ ) ≃ S2 (Γ )∨ /H1 (X Γ , Z)
that is compatible with pushforward and pullback maps. The map

Γ1 αΓ2 : S2 (Γ1 ) → S2 (Γ2 )

induces a map on the dual spaces, and a map on Jacobians

Γ_1αΓ_2 : Jac(X_{Γ_2}) → Jac(X_{Γ_1}),

which defines an action of the Hecke algebra T = Z[ΓαΓ] on the Jacobian Jac(X_Γ).

18.3 Eichler-Shimura
Now let f ∈ S_2(Γ_1(N)) be a newform, hence an eigenform of T(N) = Z[Γ_1(N)\∆(N)/Γ_1(N)], and
consider the eigenvalue map
λ_f : T(N) → C
defined by f|_2 T = λ_f(T) f. Let I_f := ker λ_f = {T ∈ T(N) : f|_2 T = 0}. Since T(N) acts on
Jac(X_1(N)), we can view I_f Jac(X_1(N)) as a subgroup of the divisor class group Pic^0(X_1(N))
corresponding to an abelian subvariety of Jac(X_1(N)).

Definition 18.1. The modular abelian variety A f associated to the newform f ∈ S2 (Γ1 (N )) is
the quotient Jac(X 1 (N ))/I f Jac(X 1 (N )).

The eigenvalues λ_f(T) and the Fourier coefficients a_n(f) are algebraic integers that generate
a number field Q(f). By letting Gal(Q̄/Q) act on the coefficients of f, we obtain a Galois
conjugate newform f^σ associated to each σ ∈ Gal(Q̄/Q) (equivalently, we can view the
coefficients of f as abstract elements of the number field Q(f) and consider the different
embeddings of Q(f) into C).
The set of all Galois conjugates of f spans a subspace of S_2(Γ_1(N)) of dimension [Q(f) : Q],
and this is equal to the dimension of A_f; indeed A_f is invariant under the action of Gal(Q̄/Q)
and can be viewed as an abelian variety over Q. Replacing f with a Galois conjugate newform
does not change A_f.
If we let V_f denote the subspace of S_2(Γ_1(N)) spanned by the Galois conjugates of f and
consider H_1(X_1(N), Z) ↪ Ω^1_0(X_1(N))^∨ ≃ S_2(Γ_1(N))^∨, restricting H_1(X_1(N), Z) to its action on
elements of V_f ⊆ S_2(Γ_1(N)) yields a lattice Λ_f ⊆ V_f^∨, and we can alternatively define A_f as the
abelian variety corresponding to the complex torus V_f^∨/Λ_f (which also admits a polarization).

References
[1] Fred Diamond and Jerry Shurman, A first course in modular forms, Springer, 2005.

[2] Toshitsune Miyake, Modular forms, Springer, 2006.



18.786 Number theory II Spring 2024
Lecture #19 4/29/2024

19.1 The j-function


Let L be a lattice in C (a discrete cocompact subgroup). The weight-k Eisenstein series for L is
\[
G_k(L) := \sum_{\omega\in L-\{0\}} \frac{1}{\omega^k},
\]
which vanishes for odd k and converges for all k > 2. After replacing L with λL for some λ ∈ C^×
we may assume L is the lattice spanned by [1, τ] for some τ ∈ H. We then have
\[
G_k(\tau) := G_k([1,\tau]) = \sum_{\substack{m,n\in\mathbf{Z} \\ (m,n)\neq(0,0)}} \frac{1}{(m+n\tau)^k},
\]

thus for k > 2 we may view Gk (τ) as a holomorphic function H → C that satisfies

Gk (τ + 1) = Gk (τ) and Gk (−1/τ) = τk Gk (τ).


 
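This implies that G_k|_k γ = G_k for all γ ∈ SL_2(Z) = ⟨\(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\)⟩. For even k we have
\[
\lim_{\operatorname{Im}\tau\to\infty} G_k(\tau) = 2\zeta(k),
\]
and G_k(τ) is holomorphic and nonvanishing at the cusps, making it a (non-cuspidal) modular
form of weight k for Γ(1) = SL_2(Z).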
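The properties of G_k stated above can be observed numerically from a truncated lattice sum. The sketch below is an illustration under stated assumptions: the function name `eisenstein` and the square truncation |m|, |n| ≤ bound are choices made for this example, and the truncation converges slowly, so only coarse agreement should be expected.

```python
import math


def eisenstein(k, tau, bound):
    """Truncated sum of 1/(m + n*tau)^k over 0 < max(|m|, |n|) <= bound."""
    total = 0j
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if m == 0 and n == 0:
                continue
            total += 1 / (m + n * tau) ** k
    return total


# G_6(i) = 0: the relation G_k(-1/tau) = tau^k G_k(tau) at tau = i gives G_6(i) = -G_6(i).
g6_at_i = eisenstein(6, 1j, 40)

# For large Im(tau) the value of G_4 approaches 2*zeta(4) = pi^4 / 45.
g4_high = eisenstein(4, 10j, 100)
two_zeta_4 = math.pi ** 4 / 45
```

The square truncation is symmetric under multiplication by i, so the terms of `g6_at_i` cancel in pairs up to rounding error, while `g4_high` agrees with 2ζ(4) only up to the truncation error.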
The Weierstrass ℘-function for L is defined by
\[
\wp(z; L) := \frac{1}{z^2} + \sum_{\omega\in L-\{0\}} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right).
\]

It is a meromorphic function on C that is periodic with respect to L: ℘(z + ω) = ℘(z) for ω ∈ L;
in other words, it is an elliptic function. It has poles of order 2 at each ω ∈ L and is holomorphic
elsewhere. The Laurent series expansion of ℘(z) at z = 0 is
\[
\wp(z) = \frac{1}{z^2} + \sum_{n\ge 1} (2n+1)\,G_{2n+2}(L)\,z^{2n},
\]

and one can use this to show that ℘(z) = ℘(z; L) satisfies the differential equation
\[
\wp'(z)^2 = 4\wp(z)^3 - g_4(L)\wp(z) - g_6(L),
\]
where g_4(L) := 60G_4(L) and g_6(L) := 140G_6(L). The map z ↦ (℘(z), ℘′(z)) defines an
isomorphism from the torus C/L (which is a Riemann surface of genus 1) to the elliptic curve
\[
E_L : y^2 = 4x^3 - g_4(L)x - g_6(L),
\]
with nonzero discriminant ∆(L) := g_4(L)^3 − 27g_6(L)^2, which sends ω ∈ L to the projective
point ∞ := (0 : 1 : 0) on E_L, which serves as the identity element of the group E_L(C). The
discriminant function ∆(τ) := ∆([1, τ]) is a cusp form of weight 12.

Definition 19.1. The j-invariant of the lattice L, and of the elliptic curve E_L, is defined by
\[
j(L) = 1728\,\frac{g_4(L)^3}{\Delta(L)} = 1728\,\frac{g_4(L)^3}{g_4(L)^3 - 27g_6(L)^2} \in \mathbf{C}.
\]

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland


If we write E_L in the form y^2 = x^3 + Ax + B with g_4(L) = −4A and g_6(L) = −4B then
\[
j(E_L) = 1728\,\frac{4A^3}{4A^3 + 27B^2},
\]
which defines the j-invariant of elliptic curves E/k for any field k whose characteristic is not 2
or 3 (and there are generalizations that work over any field).
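The formula for j in terms of A and B is easy to evaluate exactly. The following is a minimal sketch over Q using exact rational arithmetic; the function name `j_invariant` is hypothetical, and inputs with 4A^3 + 27B^2 = 0 are rejected because they do not define an elliptic curve.

```python
from fractions import Fraction


def j_invariant(A, B):
    """j-invariant of y^2 = x^3 + A*x + B over Q, as an exact Fraction."""
    A, B = Fraction(A), Fraction(B)
    denominator = 4 * A ** 3 + 27 * B ** 2
    if denominator == 0:
        raise ValueError("4A^3 + 27B^2 = 0: the curve is singular")
    return 1728 * 4 * A ** 3 / denominator


j_square_lattice = j_invariant(1, 0)      # y^2 = x^3 + x
j_hexagonal_lattice = j_invariant(0, 1)   # y^2 = x^3 + 1
```

The two sample curves have j = 1728 and j = 0 respectively. Scaling (A, B) to (u^4 A, u^6 B) leaves the value unchanged, matching the fact that j depends only on the isomorphism class of the curve over an algebraically closed field.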

We call lattices L and L′ homothetic if L′ = λL for some λ ∈ C^×.

Theorem 19.2. Lattices L and L′ are homothetic if and only if j(L) = j(L′), and if and only if the
corresponding elliptic curves E_L and E_{L′} are isomorphic.

Proof. See Theorem 15.5 and Corollary 15.6 in [1].

Theorem 19.3 (Uniformization theorem). The functor L ↦ E_L defines an equivalence of
categories of complex tori C/L up to homothety and elliptic curves E/C up to isomorphism. The map
C/L → E_L(C) defined by z ↦ (℘(z), ℘′(z)) is an isomorphism of compact Lie groups.

Proof. See Corollary 15.12 in [1].

Now consider j(τ) := j([1, τ]) as a holomorphic function H → C. We have j(γτ) = j(τ) for
all γ ∈ Γ (1) = SL2 (Z), thus j(τ) is a modular function. It is not a modular form (of weight 0)
because it is not holomorphic at the cusps: it has a pole at ∞.
The function j(τ) defines an isomorphism from Y (1) := H/Γ (1) to the affine line A1 (C) that
extends to an isomorphism from X (1) := H∗ /Γ (1) to the projective line P1 (C).
If we put q = e^{2πiτ} the q-expansion of the j-function is
\[
j(\tau) = \frac{1}{q} + 744 + \sum_{n\ge 1} a_n q^n
\]
with a_n ∈ Z. This explains the factor 1728 in the definition of j(τ); it is the least positive
integer that makes the q-expansion of j(τ) integral.
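The first few coefficients of the q-expansion can be computed with integer power series arithmetic. The sketch below relies on two standard facts that are not stated in these notes and should be read as assumptions of the example: with the normalizations E_4 = 1 + 240 Σ σ_3(n) q^n and ∆ = q ∏ (1 − q^n)^{24}, one has j = E_4^3/∆. The function names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, prec):
    """Product of two power series given as coefficient lists, truncated to prec terms."""
    c = [0] * prec
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j >= prec:
                break
            c[i + j] += ai * bj
    return c


def j_coefficients(prec):
    """Return [c_{-1}, c_0, c_1, ...] (prec terms) with j = sum of c_n q^n."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, prec)]
    e4_cubed = multiply(multiply(e4, e4, prec), e4, prec)
    # delta_over_q = prod_{n >= 1} (1 - q^n)^24, truncated to prec terms.
    delta_over_q = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        factor = [0] * prec
        factor[0], factor[n] = 1, -1
        for _ in range(24):
            delta_over_q = multiply(delta_over_q, factor, prec)
    # Power series division; the divisor has constant term 1, so everything stays integral.
    quotient = []
    for m in range(prec):
        partial = sum(quotient[i] * delta_over_q[m - i] for i in range(m))
        quotient.append(e4_cubed[m] - partial)
    return quotient
```

Calling `j_coefficients(4)` gives the coefficients of q^{−1}, q^0, q^1, q^2, beginning 1, 744, 196884.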

19.2 Isogenies
Morphisms of complex tori ϕ : C/L_1 → C/L_2 are induced by holomorphic functions f : C → C
for which the following diagram commutes:
\[
\begin{array}{ccc}
\mathbf{C} & \overset{f}{\longrightarrow} & \mathbf{C} \\
{\scriptstyle\pi_1}\downarrow & & \downarrow{\scriptstyle\pi_2} \\
\mathbf{C}/L_1 & \overset{\phi}{\longrightarrow} & \mathbf{C}/L_2
\end{array}
\]
One can show that, up to homothety, we can always take f to be a multiplication-by-α map
z ↦ αz for some α ∈ C^× for which αL_1 ⊆ L_2. The corresponding homomorphism ϕ_α is then
defined by z + L_1 ↦ αz + L_2. We have an isomorphism of abelian groups
\[
\{\alpha\in\mathbf{C} : \alpha L_1\subseteq L_2\} \overset{\sim}{\longrightarrow} \{\phi : \mathbf{C}/L_1\to\mathbf{C}/L_2\} = \operatorname{Hom}(\mathbf{C}/L_1, \mathbf{C}/L_2),
\]
and if L_1 = L_2 this is an isomorphism of commutative rings. For most lattices L we have
End(C/L) = Z, but if L is homothetic to an ideal in an imaginary quadratic order O then
End(C/L) ≃ O (this is the theory of complex multiplication).



Definition 19.4. An isogeny of elliptic curves is a nonconstant morphism φ : E_1 → E_2 that is
compatible with the group law. The degree of an isogeny φ is its degree as a rational map,
equivalently, the degree of the extension of function fields induced by φ (for each f ∈ k(E_2)
we have f ∘ φ ∈ k(E_1) and a field extension k(E_1)/φ^*(k(E_2))). We say that φ is separable if the
extension k(E_1)/φ^*(k(E_2)) is separable (always true in characteristic zero).
If L1 ⊆ L2 then L2 /L1 is a finite abelian group and the inclusion L1 ⊆ L2 induces a morphism
C/L1 → C/L2 via z + L1 7→ z + L2 and a corresponding isogeny of elliptic curves φ : E L1 → E L2
whose kernel is isomorphic to L2 /L1 with deg φ = [L2 : L1 ]. If we put N = [L2 : L1 ] then we
also have an inclusion N L2 ⊆ L1 that induces an isogeny in the reverse direction of the same
degree called the dual isogeny. The composition of these two isogenies (in either order) is
equivalent to multiplication-by-N .
The kernel of the multiplication-by-N map on any lattice L or elliptic curve E/C is isomorphic
to Z/N Z⊕Z/N Z. We call an isogeny cyclic if its kernel is a cyclic group. If φ : E1 → E2 is a cyclic
isogeny of degree N then its kernel is isomorphic to Z/N Z and φ(E1 [N ]) is a cyclic subgroup
of E2 [N ] that is the kernel of the dual isogeny φ̂.
For any elliptic curve E/k the kernel of any isogeny φ : E → E′ is a finite subgroup of E(k̄)
whose order divides deg φ, with equality if and only if φ is separable. Conversely, every finite
subgroup of E(k̄) is the kernel of a separable isogeny that is unique up to isomorphism; Vélu's
formulas allow one to explicitly construct this isogeny and an equation for its codomain.

19.3 Modular curves


Recall the modular curves X 0 (N ) := H∗ /Γ0 (N ) and X 1 (N ) := H∗ /Γ1 (N ). These are compact
Riemann surfaces, hence smooth projective curves over C, but in fact they have models over Q
that allow us to view them as smooth projective curves over any field whose characteristic does
not divide N .
For X 0 (1) = X 1 (1) = X (1), we have C(X (1)) = C( j) generated by the j-function (this follows
from the fact that j(τ) defines an isomorphism X (1) ' P1 and meromorphic functions on P1 are
rational functions).
If Γ is any congruence subgroup, the inclusion Γ ⊆ Γ(1) induces a morphism of modular curves
X_Γ → X(1). Thus every modular curve comes equipped with a map to the j-line X(1) ≃ P^1. It
follows that each non-cuspidal point on X_Γ can be associated to an elliptic curve, and this
applies not only over C, but over any field over which X_Γ is defined.
Assuming Γ contains −1, the degree of the morphism X Γ → X (1) is [Γ (1): Γ ], which is the
degree of the function field extension C(X Γ )/C( j). We now consider the function field C(X 0 (N )).
Theorem 19.5. For each positive integer N the function jN (τ) := j(N τ) is a modular function for
Γ0 (N ) that generates the extension C(X 0 (N ))/C( j), in other words, C(X 0 (N )) = C( j, jN ).
Proof. See Theorems 19.13 and 19.14 in [1].

Note that while C( j) is a transcendental extension of C, the field C( j, jN ) is a finite (hence


algebraic) extension of C( j).
Definition 19.6. The (classical) modular polynomial ΦN (Y ) ∈ C( j)[Y ] is the minimal polyno-
mial of jN over C( j). If we replace each occurrence of j in the coefficients of ΦN with X , we
obtain a bivariate polynomial ΦN ∈ Z[X , Y ] that satisfies ΦN (X , Y ) = ΦN (Y, X ).
Theorem 19.7. For j1 , j2 ∈ C we have ΦN ( j1 , j2 ) = 0 if and only if j1 and j2 are the j-invariants
of elliptic curves E1 , E2 over C for which there exists a cyclic isogeny φ : E1 → E2 of degree N .



Proof. See Theorem 20.3 in [1] for a proof in the case where N is prime.

Remark 19.8. If E1 and E2 are related by an isogeny φ : E1 → E2 of degree N , the pair


( j(E1 ), j(E2 )) does not necessarily determine φ uniquely (not even up to isomorphism), as it is
possible for there to be two cyclic isogenies of degree N from E1 to E2 that have distinct kernels.
This occurs precisely when j(E2 ) is a double root of the polynomial ΦN ( j(E1 ), Y ).
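For N = 2 the modular polynomial is small enough to write down, which makes Theorem 19.7 and Remark 19.8 concrete. The coefficients below are the standard published values of Φ_2 and are not taken from these notes, so treat them as an assumption of this sketch; the function name `phi2` is hypothetical.

```python
def phi2(x, y):
    """Classical modular polynomial of level 2, evaluated at integers or Fractions."""
    return (
        x ** 3 + y ** 3
        - x ** 2 * y ** 2
        + 1488 * (x ** 2 * y + x * y ** 2)
        - 162000 * (x ** 2 + y ** 2)
        + 40773375 * x * y
        + 8748000000 * (x + y)
        - 157464000000000
    )


# j = 1728 is 2-isogenous to itself and to j = 287496 = 66^3.
roots_over_1728 = [j for j in (1728, 287496) if phi2(1728, j) == 0]

# j = 0 is 2-isogenous only to j = 54000, since Phi_2(0, Y) = (Y - 54000)^3.
triple_root = phi2(0, 54000) == 0
```

In this example Φ_2(1728, Y) = (Y − 1728)(Y − 287496)^2, so 287496 is a double root, which is the situation described in Remark 19.8.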

The connection between Γ_0(N) and isogenies can be seen in two ways. First, there is a
one-to-one correspondence between cyclic subgroups H of Z/NZ ⊕ Z/NZ of order N and cosets of
Γ_0(N) in Γ(1). Alternatively, if we fix a basis ⟨P, Q⟩ of Z/NZ ⊕ Z/NZ so that H = ⟨P⟩, then the
matrices in GL_2(Z/NZ) that send P to a multiple of P and therefore stabilize H are precisely
the upper triangular matrices; if we restrict to matrices of determinant 1, these are precisely the
reductions of elements of Γ_0(N) modulo N.
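The first correspondence can be checked by brute force for small N: the number of cyclic subgroups of order N in Z/NZ ⊕ Z/NZ should equal the index [Γ(1) : Γ_0(N)]. The sketch below assumes the standard index formula N ∏_{p|N} (1 + 1/p), which is not stated in these notes; the function names are hypothetical.

```python
from math import gcd


def cyclic_subgroups_of_order(N):
    """Set of cyclic subgroups of order N in (Z/NZ)^2, each stored as a frozenset of pairs."""
    subgroups = set()
    for a in range(N):
        for b in range(N):
            # (a, b) has order exactly N if and only if gcd(a, b, N) = 1.
            if gcd(gcd(a, b), N) == 1:
                subgroups.add(frozenset(((t * a) % N, (t * b) % N) for t in range(N)))
    return subgroups


def index_gamma0(N):
    """N * prod over primes p dividing N of (1 + 1/p), computed with integers."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # remaining prime factor
        result = result // n * (n + 1)
    return result


counts = {N: len(cyclic_subgroups_of_order(N)) for N in range(1, 13)}
```

For prime N = p the count is p + 1, the number of lines through the origin in F_p^2.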
This leads to the moduli interpretation of X_0(N). There is a one-to-one correspondence
between non-cuspidal points in X_0(N)(C) and equivalence classes of pairs (E, ⟨P⟩), where E is
an elliptic curve and P ∈ E[N] is a point of order N. We regard two pairs (E, ⟨P⟩) and (E′, ⟨P′⟩)
to be equivalent whenever there is an isomorphism ι : E → E′ for which ⟨P′⟩ = ⟨ι(P)⟩.
Let us now consider the modular curve X_1(N). Matrices in Γ_1(N) don't just fix a cyclic
subgroup of Z/NZ × Z/NZ of order N, they fix a point of order N, and there is a one-to-one
correspondence between non-cuspidal points in X_1(N)(C) and equivalence classes of pairs
(E, P), where E is an elliptic curve and P ∈ E[N] is a point of order N, where we now regard
pairs (E, P) and (E′, P′) as equivalent if there is an isomorphism ι : E → E′ for which P′ = ι(P).
For the modular curve X(N) we have a one-to-one correspondence between non-cuspidal
points in X(N)(C) and equivalence classes of triples (E, P, Q) where E[N] = ⟨P, Q⟩ and
equivalence involves an isomorphism ι : E → E′ with ι(P) = P′ and ι(Q) = Q′.
Every matrix in GL_2(Z/NZ) defines an automorphism of X(N) via its action on P and Q,
so given any subgroup H ≤ GL_2(Z/NZ) we can consider the quotient curve X(N)/H. If we
take H to be the subgroup of upper triangular matrices in GL_2(Z/NZ), this quotient is X_0(N),
and restricting to upper triangular matrices with a 1 in the upper left corner yields X_1(N).
Unlike the curves X_0(N) and X_1(N), which have models over Q, the curve X(N) is defined
over Q(ζ_N) (so over Q only when N ≤ 2). But if H is a subgroup of GL_2(Z/NZ) for which
det(H) = (Z/NZ)^×, the quotient X(N)/H will have a model over Q.

References
[1] Andrew V. Sutherland, Lecture notes for Elliptic Curves (18.783), 2023.



18.786 Number theory II Spring 2024
Lecture #20 5/1/2024

20.1 Galois representations attached to elliptic curves


Let E be an elliptic curve over a number field K. For each positive integer m we use E[m] to
denote the m-torsion subgroup of E(K̄). The Galois group G_K := Gal(K̄/K) acts on E[m], since
the coordinates of P ∈ E[m] are roots of the m-division polynomials, whose coefficients lie in K.
This yields the mod-m Galois representation

ρ E,m : GK → Aut(E[m]).
The abelian group E[m] ≃ Z/mZ ⊕ Z/mZ is a free Z/mZ-module of rank 2, hence its
automorphism group is isomorphic to GL_2(Z/mZ). After fixing a Z/mZ-module basis for E[m] we
obtain a linear representation
ρ_{E,m} : G_K → GL_2(Z/mZ).
Note that this representation depends on a choice of basis and is defined only up to conjugation.
When m is a prime ℓ we have Z/ℓZ ≃ F_ℓ and can view this as a 2-dimensional F_ℓ-representation.
The multiplication-by-ℓ map P ↦ ℓP defines a surjective homomorphism E[ℓ^{n+1}] → E[ℓ^n]
for each n ∈ Z_{≥0}. These maps uniquely determine an inverse system that is used to define the
ℓ-adic Tate module
\[
T_\ell(E) := \varprojlim_n E[\ell^n],
\]
which is a free Z_ℓ-module of rank 2. If we fix a basis (P_1, Q_1) for E[ℓ], we can then choose
(P_2, Q_2) so that ℓP_2 = P_1 and ℓQ_2 = Q_1, since the multiplication-by-ℓ map defines a surjection
E[ℓ^2] ↠ E[ℓ] (it is a homomorphism whose kernel has cardinality #E[ℓ] = ℓ^2 = [E[ℓ^2] : E[ℓ]],
so it must be surjective). Continuing in this fashion yields a compatible system of bases that
allows us to identify Aut(T_ℓ(E)) with GL_2(Z_ℓ) and we have the ℓ-adic Galois representation

ρ_{E,ℓ^∞} : G_K → Aut(T_ℓ(E)) ≃ GL_2(Z_ℓ).

As above, the isomorphism with GL_2(Z_ℓ) depends on a choice of basis, so this representation is
determined only up to conjugation in GL_2(Z_ℓ).
More generally, for any m, n ∈ Z_{≥1} with n | m the multiplication-by-(m/n) map defines a
surjective homomorphism E[m] ↠ E[n], yielding an inverse system of torsion subgroups ordered
by divisibility, and we can consider the adelic Tate module
\[
T(E) := \varprojlim_m E[m],
\]
which is a free Ẑ-module of rank 2. Now Ẑ ≃ ∏_ℓ Z_ℓ, so any choice of compatible systems of
bases for all the ℓ-adic Tate modules T_ℓ(E) yields a compatible system of bases for T(E) and we
obtain the adelic Galois representation
\[
\rho_E : G_K \to \operatorname{Aut}(T(E)) \simeq \operatorname{GL}_2(\hat{\mathbf{Z}}) \simeq \prod_\ell \operatorname{GL}_2(\mathbf{Z}_\ell),
\]
which is again defined only up to conjugation in GL_2(Ẑ).


Note that G_K and GL_2(Ẑ) are both profinite groups, hence topological groups, and the
representation ρ_E is continuous in this topology (in other words, ρ_E is a morphism of topological
groups, not just a morphism of groups). Both groups can be defined as inverse limits of finite
groups, which we regard as topological groups equipped with the discrete topology, and the
inverse limit is a topological group that is compact and totally disconnected (any two points can
be separated by open sets that partition the space).

Lecture by Andrew V. Sutherland, notes by Andrew V. Sutherland


In the case of G_K this topology is the Krull topology, and we recall that Galois theory for
infinite algebraic extensions requires that we restrict our attention to subgroups that are closed
in the Krull topology: there is a one-to-one inclusion-reversing correspondence between closed
subgroups of G_K and algebraic extensions of K in K̄, but this is not true if we consider arbitrary
subgroups; indeed, every subgroup of G_K with the same closure will have the same fixed field.
Under this correspondence, finite extensions of K in K correspond to finite index closed
subgroups of GK , which are necessarily also open subgroups, since their complement is a finite
union of closed cosets, hence closed. Conversely, every open subgroup of GK is a closed subgroup
of finite index (it must have finite index because GK is compact and cosets form an open subcover
with no proper subcovers, and every open subgroup of a topological group is also closed, since
it is the complement of the open set formed by the union of its cosets).
We are thus particularly interested in open subgroups H ≤ GL_2(Ẑ). It follows from the
definition of the topology of an inverse limit of discrete groups that every such H is the inverse
image of its projection to GL_2(Z/mZ) for some positive integer m; the least such m is the level
of H. The inverse image of any subgroup of GL_2(Z/mZ) in GL_2(Ẑ) is an open subgroup,
but note that the same subgroup of GL_2(Ẑ) can be obtained from infinitely many different values
of m. However, if we restrict our attention to subgroups of GL_2(Z/mZ) that are not the inverse
image of their projection to GL_2(Z/nZ) for any proper divisor n of m, we obtain a one-to-one
correspondence with open subgroups of GL_2(Ẑ) of level m.
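The notion of level can be made concrete for small m by brute force. The sketch below is an illustration only: matrices are stored as 4-tuples (a, b, c, d) of residues, the function names are hypothetical, and the input H is assumed to be a subgroup of GL_2(Z/mZ) (this is not verified). It returns the least divisor n of m such that H is the full preimage of its reduction modulo n.

```python
from itertools import product
from math import gcd


def gl2(m):
    """All invertible 2x2 matrices over Z/mZ, stored as tuples (a, b, c, d)."""
    return {
        (a, b, c, d)
        for a, b, c, d in product(range(m), repeat=4)
        if gcd((a * d - b * c) % m, m) == 1
    }


def level(H, m):
    """Least n dividing m such that H is the preimage in GL2(Z/mZ) of its image mod n."""
    G = gl2(m)
    H = set(H)
    for n in range(1, m + 1):
        if m % n:
            continue
        image = {tuple(x % n for x in h) for h in H}
        preimage = {g for g in G if tuple(x % n for x in g) in image}
        if preimage == H:
            return n
    raise ValueError("H is not a subset of GL2(Z/mZ)")


G4 = gl2(4)
kernel_mod_2 = {g for g in G4 if tuple(x % 2 for x in g) == (1, 0, 0, 1)}
```

Here `level(kernel_mod_2, 4)` is 2, since the kernel of reduction modulo 2 is determined by its image in GL_2(Z/2Z), while the trivial subgroup of GL_2(Z/4Z) has level 4 and the full group has level 1.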

20.2 Complex multiplication


Theorem 20.1. Let E/k be an elliptic curve with endomorphism ring End(E) and endomorphism
algebra End^0(E) := End(E) ⊗_Z Q. Then End(E) is a free Z-module of rank r that is an order in
the Q-algebra End^0(E) of dimension r and exactly one of the following holds:

r = 1: End^0(E) = Q and End(E) = Z;
r = 2: End^0(E) is an imaginary quadratic field Q(α) with α^2 < 0;
r = 4: End^0(E) is a non-split quaternion algebra Q(α, β) with α^2, β^2 < 0 and char(k) ≠ 0.

Proof. See Theorem 12.17 in [1].

When k is a number field only the first two possibilities can occur, and when End^0(E) is an
imaginary quadratic field we say that E has complex multiplication. Such E are rare: for any
fixed number field K the set {j(E) : End(E) ≠ Z} is finite, and even if we let K range over all
number fields of degree at most d, we still obtain a finite set.
We should note that End(E) is not invariant under base change; any endomorphisms defined
over K are necessarily defined over any extension L/K, but E may acquire additional
endomorphisms over an extension L/K. If End^0(E_K̄) is an imaginary quadratic field then we say
that E has potential complex multiplication, and in this case there is a unique minimal extension
L/K of degree at most 2 for which End(E_L) = End(E_K̄): we can take L to be the compositum of K
and the imaginary quadratic field End^0(E_K̄).¹

¹ This fact is specific to elliptic curves; for a general abelian variety A/K whose geometric
endomorphism algebra is a field F, the minimal field L for which End(A_L) = End(A_K̄) is not
necessarily KF.

The endomorphism ring of E_K̄ has a profound impact on the Galois representation ρ_E. If
End(E) is an imaginary quadratic order O (so E has complex multiplication) then E[m] is not
just a free (Z/mZ)-module of rank 2, it is also a free O/mO-module of rank 1. If we fix a basis
⟨P, Q⟩ = E[m] and consider the orbit of P under the action of End(E) ≃ O = [1, τ], we obtain all
of E[m]. The endomorphisms α ∈ End(E) are defined over K, hence compatible with the action
of G_K, and this implies that we can actually view the mod-m representation as a homomorphism

ρ_{E,m} : G_K → GL_1(O/mO) ≃ (O/mO)^×.

This implies that the image of ρ_{E,m} is an abelian subgroup of GL_2(Z/mZ), which forces it to be
much smaller than GL_2(Z/mZ); ignoring log factors, its cardinality will be quadratic in m, while
the cardinality of GL_2(Z/mZ) is roughly quartic in m. This is true for every m and implies that
ρ_E(G_K) is an abelian subgroup of GL_2(Ẑ), and it cannot be open because it must have infinite
index.
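The quadratic-versus-quartic comparison can be seen in a small brute-force count. The sketch below takes O = Z[i] purely as an illustrative choice of imaginary quadratic order (an assumption of this example, not something fixed by the text), and uses the fact that a + bi is a unit modulo m exactly when its norm a^2 + b^2 is coprime to m. The function names are hypothetical.

```python
from itertools import product
from math import gcd


def order_gl2(m):
    """Number of invertible 2x2 matrices over Z/mZ, counted by brute force."""
    return sum(
        1
        for a, b, c, d in product(range(m), repeat=4)
        if gcd((a * d - b * c) % m, m) == 1
    )


def order_gaussian_units(m):
    """Number of units in Z[i]/mZ[i]: residues a + b*i whose norm is coprime to m."""
    return sum(
        1
        for a, b in product(range(m), repeat=2)
        if gcd(a * a + b * b, m) == 1
    )


# Pairs (#(O/mO)^x, #GL2(Z/mZ)) for a few small primes m.
sizes = {m: (order_gaussian_units(m), order_gl2(m)) for m in (2, 3, 5, 7)}
```

For m = 7 this gives 48 units against 2016 matrices, so the image of a representation landing in (O/mO)^× has index at least 42 in GL_2(Z/7Z), and the index grows without bound as m increases.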
Even when End(E) = Z, if End(E_K̄) is an imaginary quadratic order O (so E has potential
complex multiplication), then over a quadratic extension L/K we will have End(E_L) = O, in
which case ρ_E(G_L) is an abelian subgroup of GL_2(Ẑ) that has index 2 in ρ_E(G_K); in this case
ρ_E(G_K) will not be abelian (it will be a generalized dihedral group), but it will still have infinite
index in GL_2(Ẑ) and cannot be open.

20.3 Images of Galois representations


Theorem 20.2 (Serre's open image theorem). Let E be an elliptic curve over a number field K.
Then ρ_E(G_K) is an open subgroup of GL_2(Ẑ) if and only if E does not have potential complex
multiplication.

Proof. The forward direction was argued above; the reverse direction is Théorème 3 in [2].

References
[1] Andrew V. Sutherland, Lecture notes for Elliptic Curves (18.783), 2023.

[2] Jean-Pierre Serre, Propriétés galoisiennes des points d’ordre fini des courbes elliptiques, Invent.
Math. 15 (1972), 259–331.



18.786 Number theory II Spring 2024
Lecture #21 5/6/2024

21.1 Level structure


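Let $N \in \mathbb{Z}_{\ge 1}$ and let E/k be an elliptic curve over a perfect field k of characteristic prime to N.
To simplify notation, we put $\mathbb{Z}(N) := \mathbb{Z}/N\mathbb{Z}$ and $\mathrm{GL}_2(N) := \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$.
Each basis $(P_1, P_2)$ for $E[N] = \langle P_1, P_2\rangle$ defines an isomorphism $\iota\colon E[N] \xrightarrow{\sim} \mathbb{Z}(N)^2$ that sends
$P_1$ to $e_1 := \binom{1}{0}$ and $P_2$ to $e_2 := \binom{0}{1}$. This induces an isomorphism $\mathrm{Aut}(E[N]) \xrightarrow{\sim} \mathrm{GL}_2(N)$ that
we also denote $\iota$. For $\phi \in \mathrm{Aut}(E[N])$ with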

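\[ \phi(P_1) = aP_1 + cP_2, \qquad \phi(P_2) = bP_1 + dP_2 \]

we define
\[ \iota(\phi) := \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \]
so that
\[ \iota(\phi(P_1)) = \begin{pmatrix} a \\ c \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \iota(\phi)\iota(P_1), \]
\[ \iota(\phi(P_2)) = \begin{pmatrix} b \\ d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \iota(\phi)\iota(P_2). \]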

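This is consistent with the left action of $\mathrm{GL}_2(N)$ on $\mathbb{Z}(N)^2$. The isomorphism $\iota$ allows us to view
the mod-N Galois representation
\[ \rho_{E,N}\colon \mathrm{Gal}_k \to \mathrm{Aut}(E[N]) \xrightarrow[\sim]{\ \iota\ } \mathrm{GL}_2(N) \]
as a continuous group homomorphism $\mathrm{Gal}_k \to \mathrm{GL}_2(N)$.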


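When k has characteristic zero, each compatible system of bases for E[N] for all positive
integers N induces an isomorphism $\iota\colon \mathrm{Aut}(T(E)) \xrightarrow{\sim} \mathrm{GL}_2(\widehat{\mathbb{Z}})$, where
\[ T(E) := E(\bar k)_{\mathrm{tor}} = \varprojlim_N E[N] \simeq \widehat{\mathbb{Z}}^2 \]
is the adelic Tate module. The isomorphism $\iota$ allows us to view the Galois representation
\[ \rho_E\colon \mathrm{Gal}_k \to \mathrm{Aut}(T(E)) \xrightarrow[\sim]{\ \iota\ } \mathrm{GL}_2(\widehat{\mathbb{Z}}) \]
as a continuous group homomorphism $\mathrm{Gal}_k \to \mathrm{GL}_2(\widehat{\mathbb{Z}})$. Recall that the level of an open subgroup
$H \le \mathrm{GL}_2(\widehat{\mathbb{Z}})$ is the least $N \in \mathbb{Z}_{\ge 1}$ for which $H = \pi_N^{-1}(\pi_N(H))$, where $\pi_N\colon \mathrm{GL}_2(\widehat{\mathbb{Z}}) \to \mathrm{GL}_2(N)$ is
the canonical projection associated to the inverse limit $\mathrm{GL}_2(\widehat{\mathbb{Z}}) \simeq \varprojlim_N \mathrm{GL}_2(N)$.
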
Definition 21.1. For $H \le \mathrm{GL}_2(N)$, an H-level structure on an elliptic curve E is an equivalence
class $[\iota]_H$ of isomorphisms $\iota\colon E[N] \to \mathbb{Z}(N)^2$, where $\iota \sim \iota'$ whenever $\iota = h \circ \iota'$ for some $h \in H$,
where $h \in H \le \mathrm{GL}_2(N) = \mathrm{Aut}(E[N])$ acts as an automorphism of $\mathbb{Z}(N)^2$; this action gives the
abelian group $\mathrm{Hom}(E[N], \mathbb{Z}(N)^2)$ the structure of a left H-module, and we can equivalently
view $[\iota]_H = \{h \circ \iota : h \in H\}$ as the H-orbit of an isomorphism $\iota \in \mathrm{Hom}(E[N], \mathbb{Z}(N)^2)$.
For open $H \le \mathrm{GL}_2(\widehat{\mathbb{Z}})$ of level N, an H-level structure on E is a $\pi_N(H)$-level structure on E;
equivalently, an H-level structure on E is the H-orbit of an isomorphism $\iota \in \mathrm{Hom}(T(E), \widehat{\mathbb{Z}}^2)$.

When $H \le \mathrm{GL}_2(N)$ is trivial, the H-orbits of isomorphisms $\iota \in \mathrm{Hom}(E[N], \mathbb{Z}(N)^2)$ are
singletons, each corresponding to a choice of basis $(P_1, P_2)$ for E[N]. This is the full level-N
structure on E. As noted in the previous lecture, the non-cuspidal points on the modular curve



X (N ) := Γ (N )\H∗ correspond to isomorphism classes of triples (E, P1 , P2 ), equivalently, elliptic
curves E equipped with full level-N structure. Each H ≤ GL2 (N ) acts on X (N ) via automor-
phisms, and non-cuspidal points on the quotient H\X (N ) correspond to isomorphism classes of
elliptic curves with an H-level structure.
Alternatively, we can formally define the modular curve X H as the coarse moduli space
associated to the stack MH over Spec Z[1/N ] whose non-cuspidal points parameterize elliptic
curves with H-level structure, as defined by Deligne and Rapoport [1]. The theory of stacks
is beyond the scope of this course, but we can describe this stack very explicitly by giving its
functor of points, and this description allows us to compute many invariants of X H that can be
difficult to compute using a quotient of X (N ), especially when N is large.
As a stack over $\mathbb{Z}[1/N]$ the modular curve $X_H$ can always be viewed as a smooth projective
curve over $\mathbb{Q}$ that has good reduction at all primes $p \nmid N$; this amounts to taking the base
change to $\mathbb{Q}$ and forgetting its moduli structure. But $X_H$ is a nice curve (smooth, projective,
geometrically integral) only when $\det(H) = \widehat{\mathbb{Z}}^\times$, so we will restrict our attention to this case,
which is the relevant case to consider if one is interested in elliptic curves over $\mathbb{Q}$.
If we put ΓH := ±H ∩ SL2 (Z), then over the cyclotomic field Q(ζN ), every component of the
base change of the curve X H is isomorphic to the (nice) algebraic curve corresponding to the
Riemann surface X ΓH := ΓH \H∗ , which a priori is defined over C but admits a model over Q(ζN )
(but not necessarily over Q, which is another reason to consider X H ).
The group $\Gamma_H \le \mathrm{SL}_2(\mathbb{Z})$ is necessarily a congruence subgroup, since it contains $\Gamma(N)$,
but it may also contain $\Gamma(M)$ for some proper divisor M of N. Thus the level of the
congruence subgroup $\Gamma_H$ need not coincide with the level of H, although it will necessarily divide
it. Indeed, there will typically be infinitely many non-conjugate open subgroups $H \le \mathrm{GL}_2(\widehat{\mathbb{Z}})$ of
varying level N that share the same intersection $\Gamma_H$ with $\mathrm{SL}_2(\mathbb{Z})$. The corresponding modular
curves $X_H$ are all isomorphic over $\mathbb{Q}^{\mathrm{cyc}} = \bigcup_N \mathbb{Q}(\zeta_N)$, but not over $\mathbb{Q}$; they are twists of $X_{\Gamma_H}$.

21.2 Twists
Definition 21.2. Two curves $X, X'$ over a field k are twists (of each other) if $X_L \simeq X'_L$ for some
extension L/k. The same definition also applies to abelian varieties (and more generally to any
category of objects over a field k that has an appropriate notion of base change).

For a fixed curve X/k, the k-isomorphism classes of twists $X'$ of X are parameterized by the
cohomology set $H^1(\mathrm{Gal}_k, \mathrm{Aut}(X_{\bar k}))$. Note that $\mathrm{Aut}(X_{\bar k})$ need not be abelian, so this is a pointed
set and not necessarily a group. Twisting is a subtle topic in general, but it is quite simple when
$\mathrm{Aut}(X_{\bar k})$ is a finite abelian group on which $\mathrm{Gal}_k$ acts trivially, which is true in almost all cases
where X is an elliptic curve.
For an elliptic curve E/k with $j(E) \ne 0, 1728$ the only non-trivial automorphism of $E_{\bar k}$ is
the map that sends P to −P, in which case $\mathrm{Aut}(E_{\bar k}) = \mathrm{Aut}(E) \simeq \{\pm 1\}$ is a cyclic group of
order 2 with trivial $\mathrm{Gal}_k$-action. In this situation $H^1(\mathrm{Gal}_k, \mathrm{Aut}(E_{\bar k}))$ consists of all continuous
homomorphisms $\mathrm{Gal}_k \to \{\pm 1\}$, each of which is either trivial (corresponding to a trivial twist
$E' \simeq E$) or has a kernel whose fixed field is a quadratic extension L/k over which $E'_L \simeq E_L$ for
some $E' \not\simeq E$ that is unique up to k-isomorphism; this is the quadratic twist of E by L/k.
When k does not have characteristic 2 or 3, we can assume E is defined by a Weierstrass
equation $y^2 = x^3 + Ax + B$, in which case the map $P \mapsto -P$ simply negates the y-coordinate of
P, and if we write the fixed field of the kernel of a non-trivial element of $H^1(\mathrm{Gal}_k, \mathrm{Aut}(E_{\bar k}))$ as
$L = k(\sqrt{d})$ for some non-square $d \in k$, the twist $E'$ of E by L is isomorphic to $dy^2 = x^3 + Ax + B$
(and also to $y^2 = x^3 + d^2Ax + d^3B$).
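
The twisted Weierstrass equation can be checked numerically over a small prime field. The sketch below is illustrative only: the helper names `count_points` and `quadratic_twist` and the sample curve y^2 = x^3 + x + 1 over F_7 are assumptions chosen for the example, not taken from the notes. It counts points by brute force and compares Frobenius traces of a curve and its twist by a non-square d, which should be negatives of each other.

```python
def count_points(A, B, p):
    """Count points on y^2 = x^3 + A*x + B over F_p (p an odd prime),
    including the point at infinity, by brute force."""
    # sqrt_count[r] = number of y in F_p with y^2 = r
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    return 1 + sum(sqrt_count[(x**3 + A * x + B) % p] for x in range(p))

def quadratic_twist(A, B, d):
    """Coefficients of the twist y^2 = x^3 + d^2*A*x + d^3*B."""
    return d * d * A, d**3 * B

# Hypothetical sample data: p = 7, E: y^2 = x^3 + x + 1, d = 3.
p, A, B, d = 7, 1, 1, 3
assert pow(d, (p - 1) // 2, p) == p - 1   # Euler's criterion: d is a non-square mod p

n_E = count_points(A, B, p)
n_twist = count_points(*quadratic_twist(A, B, d), p)
trace_E = p + 1 - n_E
trace_twist = p + 1 - n_twist
print(trace_E, trace_twist)
```

The substitution x = du shows why this works: the right-hand side of the twisted equation becomes d^3 times the original cubic, so each x-coordinate contributes a square on exactly one of the two curves (or a zero on both).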



Even if j(E) = 0, 1728, provided k does not have characteristic 2 or 3 the automorphism
group $\mathrm{Aut}(E_{\bar k})$ is cyclic of order 4 or 6 (for j(E) = 1728 and j(E) = 0, respectively). In
this situation Aut(Ek̄ ) may not be a trivial Galk -module (this depends on whether k contains a
primitive 4th or 6th root of unity), but in any case it is still easy to understand the Galk -module
structure of Aut(Ek̄ ) and write down Weierstrass equations for twists of E.

21.3 Modular curves X H


Let us now describe X H in terms of its functor of points. As with quotients of the upper half
plane, we will first describe YH , which effectively solves the moduli problem, and then explain
how to add “cusps” to $Y_H$ so that $X_H$ can be viewed as a smooth projective curve.
Let H be an open subgroup of $\mathrm{GL}_2(\widehat{\mathbb{Z}})$ of level N with $\det(H) = \widehat{\mathbb{Z}}^\times$ and let k be a perfect
field of characteristic prime to N. The set $Y_H(\bar k)$ consists of equivalence classes of pairs $(E, [\iota]_H)$,
where $E/\bar k$ is an elliptic curve and $[\iota]_H$ is an H-level structure on E. We have an equivalence
$(E, [\iota]_H) \sim (E', [\iota']_H)$ whenever there is an isomorphism $\phi\colon E \to E'$ for which the induced
isomorphism $\phi_N\colon E[N] \to E'[N]$ satisfies $\iota \sim \iota' \circ \phi_N$, meaning $\iota = h \circ \iota' \circ \phi_N$ for some $h \in \pi_N(H)$.
Equivalently, $Y_H(\bar k)$ consists of pairs $(j(E), \alpha)$, where $\alpha = HgA_E$ is a double coset in
\[ H\backslash \mathrm{GL}_2(N)/A_E, \]
where $A_E := \{\varphi_N : \varphi \in \mathrm{Aut}(E)\}$ with $\varphi_N := \iota(\varphi|_{E[N]})$. For $j(E) \ne 0, 1728$ we have $A_E = \{\pm 1\}$, and
if $-1 \in H$ we can identify double cosets in $H\backslash \mathrm{GL}_2(N)/\{\pm 1\}$ with the right cosets in $H\backslash \mathrm{GL}_2(N)$.
Each $\sigma \in \mathrm{Gal}_k$ induces an isomorphism $E^\sigma[N] \to E[N]$ defined by $P \mapsto \sigma^{-1}(P)$, where $E^\sigma$
is the elliptic curve obtained by letting $\sigma$ act on the coefficients of an equation defining E; let
$\sigma^{-1}$ denote this isomorphism. We have a right $\mathrm{Gal}_k$-action on $Y_H(\bar k)$ given by
\[ (E, [\iota]_H) \mapsto (E^\sigma, [\iota \circ \sigma^{-1}]_H), \]

and define $Y_H(k)$ to be the fixed points of this action. Each point in $Y_H(k)$ is represented by a
pair $(E, [\iota]_H)$, where E/k is an elliptic curve and for every $\sigma \in \mathrm{Gal}_k$ there is a $\varphi \in \mathrm{Aut}(E_{\bar k})$ and
$h \in H$ such that
\[ \iota \circ \sigma^{-1} = h \circ \iota \circ \varphi_N. \]
Equivalently, $Y_H(k)$ consists of pairs $(j(E), \alpha)$ where $j(E) \in k$ is the j-invariant of an elliptic
curve E/k and $\alpha = HgA_E$ is a double coset in $H\backslash \mathrm{GL}_2(N)/A_E$ with $Hg\rho_{E,N}(\sigma)A_E = HgA_E$ for every $\sigma \in \mathrm{Gal}_k$.
For E/k with $j(E) \ne 0, 1728$, if $-1 \in H$ we can determine the k-rational points on $Y_H$ corresponding
to E (if any) by computing the fixed points of $\rho_{E,N}(\mathrm{Gal}_k) \subseteq \mathrm{GL}_2(N)$ in the permutation
representation of $\mathrm{GL}_2(N)$ given by the right action of $\mathrm{GL}_2(N)$ on $H\backslash \mathrm{GL}_2(N)$. This amounts to
computing the rational points in the fiber of $X_H \to X(1)$ above the point j(E) on $X(1) \simeq \mathbb{P}^1_k$.
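
The fixed-point computation in the permutation representation on right cosets can be sketched by brute force for small N. Everything in this block is an illustrative assumption rather than the method used in the notes: the function names, the choice N = 3, and the choice of H as the upper triangular (Borel) subgroup of GL2(Z/3Z). Matrices are stored as tuples (a, b, c, d).

```python
from itertools import product
from math import gcd

def gl2(N):
    """All invertible 2x2 matrices over Z/NZ as tuples (a, b, c, d)."""
    return [m for m in product(range(N), repeat=4)
            if gcd((m[0] * m[3] - m[1] * m[2]) % N, N) == 1]

def mul(x, y, N):
    """Matrix product x*y modulo N."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % N, (a*f + b*h) % N, (c*e + d*g) % N, (c*f + d*h) % N)

def right_cosets(H, G, N):
    """List the right cosets Hg of H in G, each as a frozenset of matrices."""
    cosets, seen = [], set()
    for g in G:
        if g not in seen:
            coset = frozenset(mul(h, g, N) for h in H)
            cosets.append(coset)
            seen |= coset
    return cosets

def fixed_cosets(H, G, N, g):
    """Number of right cosets Hx with Hxg = Hx."""
    return sum(1 for C in right_cosets(H, G, N)
               if frozenset(mul(x, g, N) for x in C) == C)

N = 3
G = gl2(N)
borel = [m for m in G if m[2] == 0]   # hypothetical choice of H: upper triangular matrices
print(fixed_cosets(borel, G, N, (1, 0, 0, 1)))   # identity fixes every coset
print(fixed_cosets(borel, G, N, (1, 1, 0, 1)))   # a unipotent element
```

For this H the cosets correspond to points of the projective line over F_3 (via the bottom row), so the identity fixes all 4 cosets while the unipotent matrix fixes exactly one.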

Proposition 21.3. Let H be an open subgroup of $\mathrm{GL}_2(\widehat{\mathbb{Z}})$ of level N and E/k be an elliptic curve. If
$\rho_{E,N}(\mathrm{Gal}_k) \le H$ then there exists an isomorphism $\iota\colon E[N] \xrightarrow{\sim} \mathbb{Z}(N)^2$ for which $(E, [\iota]_H) \in Y_H(k)$.
Conversely, if $(E, [\iota]_H) \in Y_H(k)$, then for every twist $E'$ of E there exists $\iota'\colon E'[N] \xrightarrow{\sim} \mathbb{Z}(N)^2$
with $(E', [\iota']_H) \in Y_H(k)$, and if $\mathrm{Aut}(E_{\bar k}) = \{\pm 1\}$, then for at least one twist $E'$ we have $\rho_{E',N}(\mathrm{Gal}_k) \le H$.
Proof. If $\rho_{E,N}(\mathrm{Gal}_k) \le H$ then $\mathrm{Gal}_k$ fixes the trivial double coset $H\alpha A_E$ with $\alpha = 1$, and we
have $(j(E), \alpha) \in Y_H(k)$. For the converse, if $E'$ is a twist of E, then we have an isomorphism
$\phi\colon E'_{\bar k} \xrightarrow{\sim} E_{\bar k}$ that induces an isomorphism $\phi_N\colon E'[N] \xrightarrow{\sim} E[N]$, and if we put $\iota' = \iota \circ \phi_N$ the
pairs $(E, [\iota]_H)$ and $(E', [\iota']_H)$ are equivalent (as witnessed by $\phi$) and represent the same point
in $Y_H(\bar k)$, so if $(E, [\iota]_H)$ lies in $Y_H(k)$ then so does $(E', [\iota']_H)$. See Problem 2 on Problem Set 7
for the final claim.



Remark 21.4. The final claim of Proposition 21.3 actually holds whenever Aut(Ek̄ ) is cyclic,
which is always true provided k does not have characteristic 2 or 3; see [2, Proposition 12.2.1].

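The modular curve $X_H$ is obtained from $Y_H$ by adjoining cusps $X_H^\infty$, whose functor of points
we now describe. The cusps in $X_H^\infty(\bar k)$ correspond to double cosets in $H\backslash \mathrm{GL}_2(N)/U(N)$, where
$U(N) = \langle \pm\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \rangle$ corresponds to the stabilizer of $\infty$ in $\mathrm{SL}_2(\mathbb{Z})$. We equip $X_H^\infty(\bar k)$ with a right
$\mathrm{Gal}_k$-action $hgu \mapsto hg\chi_N(\sigma)u$, where the cyclotomic character $\chi_N(\sigma) := \begin{pmatrix} e & 0 \\ 0 & 1 \end{pmatrix}$ is defined by
$\sigma(\zeta_N) = \zeta_N^e$. Rational cusps in $X_H^\infty(k)$ correspond to double cosets in $H\backslash \mathrm{GL}_2(N)/U(N)$ that
are fixed by $\chi_N(\mathrm{Gal}_k)$.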

21.4 Computing the genus


The genus of a curve is invariant under base change. It follows that if we put $\Gamma_H := \pm H \cap \mathrm{SL}_2(\mathbb{Z})$
then $g(X_H) = g(X_{\Gamma_H})$ (here we are assuming $\det(H) = \widehat{\mathbb{Z}}^\times$ so that $X_H$ is geometrically connected,
but otherwise we define the genus of H to be the genus of each of its geometric components, all
of which are isomorphic to $X_{\Gamma_H}$). In fact the genus of $X_{\Gamma_H}$ depends only on the intersection of
$\Gamma_H$ with $\mathrm{SL}_2(N)$, where N is any multiple of the level of $\Gamma_H$; we can take N to be the level of H,
but it is more efficient to take N to be the level of $\Gamma_H$.
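This allows us to compute the genus of $X_H$ via the formula
\[ g(H) = g(\Gamma_H) = 1 + \frac{i(\Gamma_H)}{12} - \frac{\nu_2(\Gamma_H)}{4} - \frac{\nu_3(\Gamma_H)}{3} - \frac{\nu_\infty(\Gamma_H)}{2}, \]
where we view $\Gamma_H$ as a subgroup of $\mathrm{SL}_2(N)$ that contains −1, let $i(\Gamma_H) := [\mathrm{SL}_2(N) : \Gamma_H]$, let
$\nu_2(\Gamma_H)$ and $\nu_3(\Gamma_H)$ count right cosets in $\Gamma_H\backslash \mathrm{SL}_2(N)$ containing conjugates of $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}$,
respectively, and let $\nu_\infty(\Gamma_H)$ count the orbits of $\Gamma_H\backslash \mathrm{SL}_2(N)$ under the right action of $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.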
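
As a sanity check on the genus formula, the following sketch evaluates it by brute force for the upper triangular subgroup of SL2(Z/11Z), which is the image of Γ0(11) and should give the well-known genus 1 of X0(11). It rests on two assumptions that are not spelled out in the notes: that ν2 and ν3 can be computed as the number of right cosets fixed under right multiplication by the matrices of order 4 and 3 with rows (0, 1), (−1, 0) and (0, 1), (−1, −1), and that ν∞ is the number of orbits under right multiplication by the matrix with rows (1, 1), (0, 1). All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def mul(x, y, N):
    """Product of 2x2 matrices (a, b, c, d) modulo N."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % N, (a*f + b*h) % N, (c*e + d*g) % N, (c*f + d*h) % N)

def genus(Gamma, N):
    """Genus from the index, nu_2, nu_3, nu_infinity of Gamma <= SL2(Z/NZ), with -1 in Gamma."""
    G = [m for m in product(range(N), repeat=4)
         if (m[0] * m[3] - m[1] * m[2]) % N == 1]
    cosets, seen = [], set()
    for g in G:                       # enumerate right cosets Gamma*g
        if g not in seen:
            C = frozenset(mul(h, g, N) for h in Gamma)
            cosets.append(C)
            seen |= C

    def act(C, g):                    # right action of g on a coset
        return frozenset(mul(x, g, N) for x in C)

    S, R, T = (0, 1, N - 1, 0), (0, 1, N - 1, N - 1), (1, 1, 0, 1)
    nu2 = sum(1 for C in cosets if act(C, S) == C)
    nu3 = sum(1 for C in cosets if act(C, R) == C)
    remaining, nu_inf = set(cosets), 0
    while remaining:                  # count orbits of T on the cosets
        C = remaining.pop()
        nu_inf += 1
        D = act(C, T)
        while D != C:
            remaining.discard(D)
            D = act(D, T)
    return (1 + Fraction(len(cosets), 12) - Fraction(nu2, 4)
            - Fraction(nu3, 3) - Fraction(nu_inf, 2))

N = 11
borel = [(a, b, 0, d) for a, b, d in product(range(N), repeat=3) if a * d % N == 1]
g = genus(borel, N)
print(g)
```

Exact rational arithmetic via `Fraction` avoids rounding in the terms i/12, ν2/4 and ν3/3. Enumerating the whole group is only feasible for small N; it is meant to make the definitions concrete, not to be efficient.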

References
[1] Pierre Deligne and Michael Rapoport, Les schémas de modules de courbes elliptiques, Modular
functions of one variable, II (Proc. Internat. Summer School, Univ Antwerp, 1972), 1973,
pp. 143–316, Lecture Notes in Mathematics 349, Springer.

[2] Jeremy Rouse, Andrew V. Sutherland, and David Zureick-Brown, with an appendix by John
Voight, ℓ-adic images of Galois for elliptic curves over Q, Forum of Mathematics, Sigma 10
(2022), e62.

[3] Andrew V. Sutherland, Lecture notes for Elliptic Curves (18.783), 2023.



18.786 Number theory II Spring 2024
Lecture #22 5/8/2024

22.1 Galois images of Frobenius elements


Let E be an elliptic curve over a finite field $\mathbb{F}_q$. Recall that the q-power Frobenius automorphism
$x \mapsto x^q$ is a topological generator for $\mathrm{Gal}_{\mathbb{F}_q} \simeq \widehat{\mathbb{Z}}$ (it generates a subgroup isomorphic to $\mathbb{Z}$, which
is dense in $\widehat{\mathbb{Z}}$). The Frobenius automorphism of $\mathrm{Gal}_{\mathbb{F}_q}$ induces the Frobenius endomorphism
$\pi_E \in \mathrm{End}(E)$, which acts on $E(\overline{\mathbb{F}}_q)$ via $(x : y : z) \mapsto (x^q : y^q : z^q)$.


The group of Fq -rational points E(Fq ) is the kernel of the separable endomorphism π E − 1,
which implies

#E(Fq ) = deg(π E − 1) = (π E − 1)(π̂ E − 1) = q + 1 − tr π E ,

where π̂ E is the dual of π E (the unique endomorphism π̂ E for which π E π̂ E = deg π E = q) and
tr π E := π E + π̂ E ∈ Z is the trace of Frobenius.
For any positive integer m ⊥ q and endomorphism α ∈ End(E) with deg(α) ⊥ m, the action
of α on E[m] is given by a matrix αm ∈ GL2 (Z/mZ) that has the same characteristic polynomial

x 2 − tr(α)x + deg(α)

as α modulo m; this implies $\mathrm{tr}(\alpha_m) \equiv \mathrm{tr}(\alpha) \bmod m$ and $\det(\alpha_m) \equiv \deg(\alpha) \bmod m$. For $\alpha = \pi_E$
the action of $\pi_E$ on E[m] is thus described by a matrix in $\mathrm{GL}_2(\mathbb{Z}/m\mathbb{Z})$ with trace congruent to
$\mathrm{tr}(\pi_E) = q + 1 - \#E(\mathbb{F}_q)$ modulo m and determinant congruent to $\deg(\pi_E) = q$ modulo m; see
Lectures 6–7 of [7] for proofs of these facts.
Now tr(π E ) and deg(π E ) are integers that do not depend on m, and in fact there is an integer
matrix Aπ whose reduction modulo m describes the action of π E on E[m] for every m ⊥ q, as
proved by Duke and Tóth in [2].

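Theorem 22.1 (Duke and Tóth). Let $E/\mathbb{F}_q$ be an elliptic curve with Frobenius endomorphism $\pi_E$.
Define $a, b, \Delta \in \mathbb{Z}$ by $a := \mathrm{tr}\,\pi_E$, $\Delta := \mathrm{disc}(\mathrm{End}(E) \cap \mathbb{Q}(\pi_E))$, and $b := \sqrt{(a^2 - 4q)/\Delta}$ if $\Delta \ne 1$,
with $b := 0$ if $\Delta = 1$. For a suitable choice of bases, the reduction modulo m of the integer matrix
\[ \mathrm{Frob}_E := \mathrm{Frob}(a, b, \Delta) = \begin{pmatrix} \frac{a + b\delta}{2} & b \\ \frac{b(\Delta - \delta)}{4} & \frac{a - b\delta}{2} \end{pmatrix} \qquad (\delta := \Delta \bmod 4 \in \{0, 1\}) \]
gives the action of $\pi_E$ on E[m] for every integer m ⊥ q.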

Proof. This follows from [2, Theorem 2.1] and the Deuring lifting theorem [7, Thm. 21.15].
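
A quick way to make the matrix Frob(a, b, ∆) concrete is to build it for sample values satisfying 4q = a^2 − b^2∆ and confirm that its trace is a and its determinant is q. The function name `frob` and the sample triples below (both with q = 7) are assumptions chosen for illustration.

```python
def frob(a, b, D):
    """Integer matrix Frob(a, b, D) as a tuple (m11, m12, m21, m22), with delta = D mod 4."""
    delta = D % 4                      # 0 or 1 when D is a discriminant
    return ((a + b * delta) // 2, b,
            b * (D - delta) // 4, (a - b * delta) // 2)

def trace_and_det(M):
    return M[0] + M[3], M[0] * M[3] - M[1] * M[2]

q = 7
# Hypothetical sample solutions of the norm equation 4q = a^2 - b^2 * D.
samples = [(2, 1, -24), (1, 3, -3)]
for a, b, D in samples:
    assert 4 * q == a * a - b * b * D
    print((a, b, D), frob(a, b, D), trace_and_det(frob(a, b, D)))
```

The determinant identity uses δ^2 = δ: (a^2 − b^2δ^2)/4 − b^2(∆ − δ)/4 simplifies to (a^2 − b^2∆)/4 = q.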

Corollary 22.2. Let E/K be an elliptic curve over a number field, let p be a prime of good reduction
for E, and let π be the Frobenius endomorphism of the reduction of E to the residue field of p. Then
ρ E,m (Frobp ) ∈ GL2 (Z/mZ) is conjugate to the image of Frob E in GL2 (Z/mZ) for every m ⊥ p.

We note that the integers a, b, ∆ that define Frob E = Frob(a, b, ∆) satisfy the norm equation

4q = a2 − b2 ∆,

and either ∆ = 1 (in which case b = 0 and 4q = a2 ), or ∆ is the discriminant of an imaginary


quadratic field, since the Hasse bound implies a2 ≤ 4q. The latter always holds when q is prime.



22.2 Counting points on modular curves
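Let $H \le \mathrm{GL}_2(\widehat{\mathbb{Z}})$ be an open subgroup of level N with $\det(H) = \widehat{\mathbb{Z}}^\times$ and $-1 \in H$. The modular
curve $X_H$ is a nice curve over $\mathbb{Q}$ with good reduction modulo every prime $p \nmid N$, in which case
the reduction of $X_H$ modulo p is a nice curve of genus $g(X_H)$. We would like to compute
\[ \#X_H(\mathbb{F}_p) = \#X_H^\infty(\mathbb{F}_p) + \#Y_H(\mathbb{F}_p). \]
If we put $\mathrm{GL}_2(N) := \mathrm{GL}_2(\mathbb{Z}/N\mathbb{Z})$ and let $U(N) = \langle \pm\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \rangle$, we then have
\[ \#X_H^\infty(\mathbb{F}_p) = \#\bigl(H\backslash \mathrm{GL}_2(N)/U(N)\bigr)^{\mathrm{Gal}_{\mathbb{F}_p}} \]
where $\mathrm{Gal}_{\mathbb{F}_p}$ acts via the matrix $\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$, since $\mathrm{Frob}_p(\zeta_N) = \zeta_N^p$. We note that $\#X_H^\infty(\mathbb{F}_p)$ depends
only on the value of p mod N, so we can determine $\#X_H^\infty(\mathbb{F}_p)$ for every prime $p \nmid N$ by simply
counting double cosets in $H\backslash \mathrm{GL}_2(N)/U(N)$ fixed by $\begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}$ with a varying over $(\mathbb{Z}/N\mathbb{Z})^\times$.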

For the non-cuspidal points we have
\[ \#Y_H(\mathbb{F}_p) = \sum_{j(E) \in \mathbb{F}_p} \#\bigl(H\backslash \mathrm{GL}_2(N)/A_E\bigr)^{\mathrm{Gal}_{\mathbb{F}_p}} \tag{1} \]
where $A_E := \{\varphi_N : \varphi \in \mathrm{Aut}(E_{\bar k})\}$ is just $\{\pm 1\}$ for $j(E) \ne 0, 1728$ and the action of $\mathrm{Gal}_{\mathbb{F}_p}$ is given by
$\rho_{E,N}(\mathrm{Frob}_p) \equiv \mathrm{Frob}_E \bmod N$. We note that while $\mathrm{Frob}_E$ is not completely determined by j(E),
its action on the double-coset space $H\backslash \mathrm{GL}_2(N)/A_E$ is. Indeed, the elliptic curves $E'/\mathbb{F}_p$ with
$j(E') = j(E)$ are twists of E, and for $j(E) \ne 0, 1728$ there are just two $\mathbb{F}_p$-isomorphism classes
of twists, and their Frobenius matrices $\mathrm{Frob}_E$ differ only by $-1 \in A_E$.
For convenience we split the sum in (1) into four parts:
\[ \#Y_H(\mathbb{F}_p) = \#Y_H^{0}(\mathbb{F}_p) + \#Y_H^{1728}(\mathbb{F}_p) + \#Y_H^{\mathrm{ss}}(\mathbb{F}_p) + \#Y_H^{\mathrm{ord}}(\mathbb{F}_p) \]
depending on whether j(E) = 0, j(E) = 1728, $j(E) \ne 0, 1728$ is supersingular, or $j(E) \ne 0, 1728$ is
ordinary. Recall that $E/\mathbb{F}_p$ is ordinary if $\mathrm{tr}\,\pi_E \not\equiv 0 \bmod p$ and supersingular otherwise. To
keep things simple, we will focus on the main term $\#Y_H^{\mathrm{ord}}(\mathbb{F}_p)$; see Lemma 5.1.1 and Table 6
in [6] for the other cases, all of which can be computed very quickly provided one knows the
class numbers h(−p) and h(−4p) (these are easy to compute as long as p is not too large and
precomputed values are available in the LMFDB for $p < 10^{12}$).
For ordinary $j(E) \ne 0, 1728$ we will have $\Delta < -4$, in which case $\mathrm{Frob}_E$ is determined up
to the signs of a and b by the norm equation $4q = a^2 - b^2\Delta$, which has a unique solution
with a, b > 0. For each such solution the theory of complex multiplication implies that there
are exactly $h(\Delta)$ j-invariants j(E) for which $\mathrm{End}(E) \cap \mathbb{Q}(\pi_E) = \mathrm{End}(E)$ has discriminant $\Delta$. It
thus suffices to enumerate solutions to $4q = a^2 - b^2\Delta$ and for each solution count the right
cosets in $H\backslash \mathrm{GL}_2(N)$ that are fixed by the Frobenius $\mathrm{Frob}(a, b, \Delta)$, which can be done using the
permutation character $\chi_H$ for $H \le \mathrm{GL}_2(N)$, which for each $g \in G := \mathrm{GL}_2(N)$ counts the cosets in $H\backslash \mathrm{GL}_2(N)$
that are fixed by g. Note that this permutation character does not depend on p and can be
computed very quickly once one knows its values on generators for G, which can be done as a
precomputation. For all primes p > 3 we then have
\[ \#Y_H^{\mathrm{ord}}(\mathbb{F}_p) = \sum_{\substack{1 \le a < 2\sqrt{p} \\ \Delta = a^2 - 4p}} \ \sum_{b \mid \mathrm{cond}(\Delta)} \chi_H\bigl(\mathrm{Frob}(a, b, \Delta/b^2)\bigr)\, h(\Delta/b^2) \]
where $\mathrm{cond}(\Delta) := \sqrt{\Delta/\mathrm{disc}(\mathbb{Q}(\sqrt{\Delta}))}$ is the conductor of the order of discriminant $\Delta$ (its index
in the ring of integers of $\mathbb{Q}(\sqrt{\Delta})$).
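
The index set of this double sum is easy to enumerate. The sketch below lists the triples (a, b, ∆/b^2) for a given prime p and computes class numbers by counting reduced primitive binary quadratic forms. As a consistency check it takes χ_H ≡ 1 (the case H = GL2, so X_H = X(1)) and keeps only terms with ∆/b^2 < −4; that restriction is my reading of the preceding discussion of ordinary j ≠ 0, 1728 rather than something stated in the formula. All function names are illustrative.

```python
from math import gcd, isqrt

def class_number(D):
    """h(D) for a negative discriminant D, counting reduced primitive forms (a, b, c)."""
    h, a = 0, 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):             # -a < b <= a
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                primitive = gcd(gcd(a, abs(b)), c) == 1
                if c >= a and primitive and not (a == c and b < 0):
                    h += 1
        a += 1
    return h

def conductor(D):
    """Largest f with f^2 | D and D/f^2 congruent to 0 or 1 mod 4."""
    f = 1
    for g in range(1, isqrt(-D) + 1):
        if D % (g * g) == 0 and (D // (g * g)) % 4 in (0, 1):
            f = g
    return f

def ordinary_terms(p):
    """Triples (a, b, D/b^2) with 1 <= a < 2*sqrt(p), D = a^2 - 4p, b | cond(D)."""
    terms, a = [], 1
    while a * a < 4 * p:
        D = a * a - 4 * p
        f = conductor(D)
        terms += [(a, b, D // (b * b)) for b in range(1, f + 1) if f % b == 0]
        a += 1
    return terms

def ordinary_count_level_one(p):
    """The sum with chi_H = 1, restricted to discriminants below -4 (an assumption)."""
    return sum(class_number(D) for (_, _, D) in ordinary_terms(p) if D < -4)

print(ordinary_terms(7))
print([ordinary_count_level_one(p) for p in (5, 7, 11)])
```

For p = 5, 7, 11 the restricted sum gives the number of ordinary j-invariants in F_p other than 0 and 1728, as expected for X(1). For a nontrivial H each term would additionally be weighted by the permutation character value χ_H(Frob(a, b, ∆/b^2)).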



22.3 Decomposing the Jacobian
As noted in earlier lectures, $X_H$ is a quotient of the modular curve $X(N)/\mathbb{Q}(\zeta_N)$. The Jacobian
of X(N) is an abelian variety of dimension g(X(N)) whose simple isogeny factors correspond
to modular abelian varieties $A_f$ associated to the Galois orbit of a weight-2 newform
$f \in S_2(\Gamma_1(N^2))$; see Lecture 18 for the definition of $A_f$ and recall from Lecture 11 that we can
identify $S_2(\Gamma(N))$ with a subspace of $S_2(\Gamma_1(N^2))$ by conjugating $\Gamma(N)$ with $\delta_N := \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$.
This implies that the Jacobian JH of X H is isogenous to a product of modular abelian vari-
eties A f , and that its L-function L(JH , s) is a product of L-functions of normalized eigenforms
in S2 (Γ1 (N 2 )); these eigenforms need not be newforms of level N 2 but each is a newform
in S2 (M , χ) := S2 (Γ0 (M ), χ) for some M |N 2 . One can show that in fact cond(χ)|N (see [6,
Prop. A.1.6]), so it suffices to consider normalized eigenforms in Γ0 (N 2 ) ∩ Γ1 (N ).
If $f_1, \ldots, f_r$ are the distinct $\mathrm{Gal}_{\mathbb{Q}}$-conjugates of a newform f (so $r = [\mathbb{Q}(f) : \mathbb{Q}]$), then
\[ L(A_f, s) = \prod_{i=1}^{r} L(f_i, s) = \prod_{i=1}^{r} \prod_{p} \bigl(1 - a_p(f_i)p^{-s} + \chi(p)p^{1-2s}\bigr)^{-1} = \sum_{n \ge 1} a_n n^{-s} \]
with $a_n \in \mathbb{Z}$. For primes p we have $a_p = \sum_{i=1}^{r} a_p(f_i)$ equal to $a_p(\mathrm{tr}(f))$, where $\mathrm{tr}(f) := \sum_i f_i$ is
the traceform of f. If we now let $f_1, \ldots, f_m$ be representatives of Galois orbits of the normalized
eigenforms in $\Gamma_0(N^2) \cap \Gamma_1(N)$, then for some nonnegative integers $e_1, \ldots, e_m$ we have
\[ L(J_H, s) = \prod_{i=1}^{m} L(A_{f_i}, s)^{e_i} = \sum_{n \ge 1} a_n n^{-s}. \]

This implies that for every prime $p \nmid N$ the Frobenius trace of $J_H$ at p is an integer linear
combination of the Frobenius traces of the $A_f$ at p. Now $a_p(J_H) = p + 1 - \#X_H(\mathbb{F}_p)$, so we can
compute $a_p(J_H)$ by counting points on $X_H$ over $\mathbb{F}_p$ as described above.
If we have precomputed values of $a_p(A_{f_i})$ for $1 \le i \le m$ (which can be found in the LMFDB
for moderate values of p and N), we can construct a linear system Ax = B, where $A = (a_{p_i}(A_{f_j}))_{ij}$
is an integer matrix with rows indexed by primes $p_i \nmid N$ (up to some bound P) and a column
for each of the normalized eigenforms $f_1, \ldots, f_m$, while B is a column vector whose ith entry is
$a_{p_i}(J_H) = p_i + 1 - \#X_H(\mathbb{F}_{p_i})$. For sufficiently large P the matrix A will have rank m and uniquely
determine the exponents $e_i$ that appear in the factorization of $L(J_H, s)$, thereby determining the
modular abelian varieties $A_f$ that appear as isogeny factors in $J_H$ along with their multiplicities.
The isogeny decomposition of $J_H$ has many applications. In particular, it allows us to compute
the analytic rank of $L(J_H, s)$ as the sum of the analytic ranks of the modular
abelian varieties $A_f$, which are easy to compute. Under a generalization of the BSD conjecture,
this analytic rank tells us the Mordell-Weil rank of $J_H$, and if this rank is strictly less than the
genus of $X_H$, we can (at least in principle) use the Chabauty-Coleman method to rigorously
determine the set $X_H(\mathbb{Q})$ and all elliptic curves $E/\mathbb{Q}$ that admit an H-level structure. Note that
Faltings’ theorem implies that $X_H(\mathbb{Q})$ is finite whenever $g(X_H) \ge 2$.
Mazur famously determined the H-level structures that are compatible with rational torsion
(the 15 possible torsion subgroups that arise for $E/\mathbb{Q}$, which are $\mathbb{Z}/N\mathbb{Z}$ for $N \le 12$ with $N \ne 11$ and
$\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2N\mathbb{Z}$ for $N \le 4$) in [4]. The corresponding $X_H$ have genus 0 (as conjectured by Ogg),
meaning that their rational points are parameterized by $\mathbb{P}^1_{\mathbb{Q}}$. Mazur also famously determined
the H-level structures that are compatible with a rational isogeny of prime degree [5], which
was soon extended to cyclic isogenies of any degree. Some of these $X_H \simeq X_0(N)$ have nonzero
genus, which is fortunate; the fact that $X_0(15)$ has only finitely many rational points played a
crucial role in Wiles's proof of Fermat’s Last Theorem (the 3-5 trick); the modular curve $X_0(15)$
has genus 1 and its Jacobian has rank zero, which implies finiteness.
When Mazur announced his torsion theorem [3] he also proposed what is now called Mazur’s
Program B to classify all the $X_H$ that have rational points (over $\mathbb{Q}$, or over a number field K).
Mazur’s isogeny theorem can be viewed as a first step in this direction, which has since been
extended by many researchers. This program is still very much a work in progress (even over $\mathbb{Q}$).

References
[1] Pierre Deligne and Michael Rapoport, Les schémas de modules de courbes elliptiques, Modular
functions of one variable, II (Proc. Internat. Summer School, Univ Antwerp, 1972), 1973,
pp. 143–316, Lecture Notes in Mathematics 349, Springer.

[2] William Duke and Árpád Tóth, The splitting of primes in division fields of elliptic curves,
Experiment. Math. 11 (2002), 555–565.

[3] Barry Mazur, Rational points on modular curves, Modular functions of one variable, V, 1977,
pp. 107-148, Lecture Notes in Math. 601, Springer.

[4] Barry Mazur, Modular curves and the Eisenstein ideal, Inst. Hautes Études Sci. Publ. Math.
47 (1977), 33–186.

[5] Barry Mazur, Rational isogenies of prime degree (with an appendix by D. Goldfeld), Invent.
Math. 44 (1978), 129–162.

[6] Jeremy Rouse, Andrew V. Sutherland, and David Zureick-Brown, with an appendix by John
Voight, ℓ-adic images of Galois for elliptic curves over Q, Forum of Mathematics, Sigma 10
(2022), e62.

[7] Andrew V. Sutherland, Lecture notes for Elliptic Curves (18.783), 2023.



18.786 Number theory II Spring 2024
Lecture #23 5/13/2024

In this final lecture we survey some generalizations of the (classical) modular forms we have
been studying in this course (some of this material can be found in [4, Chapter 15] but much
of it is taken from the literature).

23.1 Half-integral weight modular forms


Recall that for integers k and lattices Γ ≤ SL2 (R) we defined weight-k modular forms for Γ as
holomorphic functions f : H → C that satisfy

f |k γ = f

for every $\gamma \in \Gamma$ and are holomorphic at the cusps of $\Gamma$, where the weight-k slash operator is
\[ f|_k\gamma := j(\gamma, z)^{-k} f(\gamma z) \]
with $j(\gamma, z) := cz + d$ for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. We now want to consider half-integral values of k.
To extend this to $k \in \tfrac{1}{2} + \mathbb{Z}$ we need to define the square root of $j(\gamma, z)$ for $z \in \mathbb{H}$ and $\gamma \in \Gamma$
in a way that makes $f|_k\gamma$ holomorphic and which is compatible with the action of $\Gamma$ on $\mathbb{H}$. In
particular, we want an analog of the cocycle condition $j(\gamma_1\gamma_2, z) = j(\gamma_1, \gamma_2 z)\, j(\gamma_2, z)$ to hold.
One might be tempted to take $(cz + d)^{1/2} := \sqrt{cz + d}$, where $\sqrt{\cdot}$ denotes the principal branch (so
$\mathrm{Arg}(\sqrt{z}) \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$) and then use this to define $(cz + d)^{-k}$ for k = n/2. Note that either c = 0 and
cz + d is constant or $cz + d\colon \mathbb{H} \to \mathbb{C}$ lies in a halfplane on which $\sqrt{z}$ is holomorphic, so we don’t
need to worry about what happens on the branch locus $\mathbb{R}_{<0}$. But note that $(\sqrt{z})^{-n} = \pm\sqrt{z^{-n}}$,
and the sign varies with z, so it is not completely clear what $(cz + d)^{-k}$ means, and even if we
fix a choice, neither choice satisfies the cocycle condition consistently.
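
The failure of the cocycle condition for the principal square root can be seen numerically. In the sketch below the matrix g with rows (0, −1), (1, −10) and the point z = i are hypothetical sample choices, and the helper names are illustrative; `cmath.sqrt` is used as the principal branch.

```python
import cmath

def j(g, z):
    """Automorphy factor j(g, z) = c*z + d for g = (a, b, c, d)."""
    a, b, c, d = g
    return c * z + d

def act(g, z):
    """Linear fractional action of g on z."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)

def mul(g, h):
    """Product of 2x2 matrices stored as tuples."""
    a, b, c, d = g
    e, f, u, v = h
    return (a*e + b*u, a*f + b*v, c*e + d*u, c*f + d*v)

g = (0, -1, 1, -10)        # determinant 1, so g lies in SL2(Z)
z = 1j

# The cocycle condition holds for j itself ...
exact_lhs = j(mul(g, g), z)
exact_rhs = j(g, act(g, z)) * j(g, z)

# ... but not for the principal square root of j: here the two sides differ by a sign.
lhs = cmath.sqrt(j(mul(g, g), z))
rhs = cmath.sqrt(j(g, act(g, z))) * cmath.sqrt(j(g, z))
print(exact_lhs, exact_rhs)
print(lhs, rhs)
```

Both factors on the right have argument close to π, so their principal square roots have arguments close to π/2 and the product lands near the negative real axis, while the principal square root on the left has positive real part.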
Instead we extend $\mathrm{SL}_2(\mathbb{R})$ to a connected double cover of $\mathrm{SL}_2(\mathbb{R})$ known as the metaplectic
group $\mathrm{Mp}_2(\mathbb{R})$. The fundamental group $\pi_1(\mathrm{SL}_2(\mathbb{R})) \simeq \mathbb{Z}$ has a unique quotient of order 2,
so $\mathrm{Mp}_2(\mathbb{R})$ is uniquely determined as a topological space, and the fact that $\mathrm{SL}_2(\mathbb{R})$ is connected
and locally path connected implies that $\mathrm{Mp}_2(\mathbb{R})$ is also a topological group; it is locally compact,
since $\mathrm{SL}_2(\mathbb{R})$ is, so we can consider lattices in $\mathrm{Mp}_2(\mathbb{R})$ (discrete cofinite subgroups).
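We can describe $\mathrm{Mp}_2(\mathbb{R})$ more explicitly as follows. It consists of pairs $(\gamma, \varepsilon)$ with $\gamma \in \mathrm{SL}_2(\mathbb{R})$
and $\varepsilon\colon \mathbb{H} \to \mathbb{C}$ a holomorphic function that satisfies $\varepsilon(z)^2 = j(\gamma, z)$; note that without the
holomorphic condition we would have uncountably many choices for $\varepsilon(z)$, but with it there are
exactly two: $\pm\sqrt{j(\gamma, z)}$. The group operation is
\[ (\gamma_1, \varepsilon_1) \cdot (\gamma_2, \varepsilon_2) = (\gamma_1\gamma_2,\ z \mapsto \varepsilon_1(\gamma_2 z)\varepsilon_2(z)), \]
so the cocycle condition holds. We have a surjective homomorphism $\mathrm{Mp}_2(\mathbb{R}) \to \mathrm{SL}_2(\mathbb{R})$ defined
by $(\gamma, \varepsilon) \mapsto \gamma$ whose kernel is $\{(1, \pm 1)\} \simeq \{\pm 1\}$. This leads to the exact sequence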

1 −→ {±1} −→ Mp2 (R) −→ SL2 (R) −→ 1



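This sequence does not split in general, but it splits over $\Gamma(4)$; indeed, for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma(4)$,
if we define $\varepsilon_\gamma(z) = \left(\tfrac{c}{d}\right)\sqrt{j(\gamma, z)}$, where $\left(\tfrac{c}{d}\right)$ is the Kronecker symbol, then
$\widetilde{\Gamma}(4) := \{(\gamma, \varepsilon_\gamma) : \gamma \in \Gamma(4)\}$ is a lattice in $\mathrm{Mp}_2(\mathbb{R})$
and the surjection $\mathrm{Mp}_2(\mathbb{R}) \to \mathrm{SL}_2(\mathbb{R})$ restricts to an isomorphism $\widetilde{\Gamma}(4) \xrightarrow{\sim} \Gamma(4)$. (If one
enlarges the metaplectic group to allow $\varepsilon(z)^2 = \pm j(\gamma, z)$ one can extend this to $\Gamma_0(4)$.) This leads
many authors to restrict their attention to half-integral weight modular forms on congruence
subgroups whose level is a multiple of 4, but one can work with any lattice in $\mathrm{Mp}_2(\mathbb{R})$. We
define the weight-k slash operator for elements of $\mathrm{Mp}_2(\mathbb{R})$ via
\[ f|_k(\gamma, \varepsilon) = \varepsilon(z)^{-2k} f(\gamma z), \]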



and one can then consider modular forms of weight $k \in \tfrac{1}{2}\mathbb{Z}$ for lattices $\Gamma \le \mathrm{Mp}_2(\mathbb{R})$. The
metaplectic group $\mathrm{Mp}_2(\mathbb{R})$ is a Lie group (but not a matrix group), so in particular it
is a locally compact topological group in which a lattice is any discrete cofinite subgroup.
Every lattice in $\mathrm{SL}_2(\mathbb{R})$ is also a lattice in $\mathrm{Mp}_2(\mathbb{R})$, and for $k \in \mathbb{Z}$ the definition of the weight-k
slash operator above is the same as our original definition. The set of half-integral weight
modular forms of weight $k \in \tfrac{1}{2}\mathbb{Z}$ for a lattice $\Gamma \le \mathrm{Mp}_2(\mathbb{R})$ is a finite dimensional $\mathbb{C}$-vector space.
The canonical example of a half-integral weight modular form is the Jacobi theta function
\[ \theta(\tau) = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 \tau}, \]
which is a weight-1/2 modular form for $\widetilde{\Gamma}(4)$.
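
To get a feel for the theta function, one can compute its q-expansion (with q = e^{2πiτ}) as a truncated power series. The sketch below, with illustrative function names, also squares the series: the coefficient of q^n in θ^2 counts the pairs of integers (x, y) with x^2 + y^2 = n, a standard fact that serves here as a check.

```python
def theta_coeffs(n_terms):
    """Coefficients of theta = sum over n in Z of q^(n^2), truncated below q^n_terms."""
    coeffs = [0] * n_terms
    n = 0
    while n * n < n_terms:
        coeffs[n * n] += 1 if n == 0 else 2   # n and -n give the same exponent
        n += 1
    return coeffs

def multiply(u, v):
    """Product of two truncated power series of the same length."""
    out = [0] * len(u)
    for i, ui in enumerate(u):
        if ui:
            for k in range(len(u) - i):
                out[i + k] += ui * v[k]
    return out

theta = theta_coeffs(30)
theta_sq = multiply(theta, theta)
print(theta[:10])
print(theta_sq[:10])
```

For instance the coefficient of q^5 in θ^2 is 8, coming from (±1, ±2) and (±2, ±1), while the coefficient of q^3 is 0 because 3 is not a sum of two squares.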


One can use the Jacobi theta function to define the space $M_k(4N, \chi)$ of modular forms of
weight $k \in \tfrac{1}{2} + \mathbb{Z}$ and level 4N with character $\chi$, where $\chi$ is a Dirichlet character of
modulus 4N, as the holomorphic functions $f\colon \mathbb{H} \to \mathbb{C}$ for which
\[ f(\gamma z) = \chi(d)\left(\frac{\theta(\gamma z)}{\theta(z)}\right)^{2k} f(z) \]
for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(4N)$, and which are holomorphic at all the cusps of $\Gamma_0(4N)$. Most
of the theory we have developed for integral weight modular forms extends to half-integral
weight, but there are subtleties that arise, and the theory of Hecke operators and eigenforms
is more complicated. However, as shown by Shimura [10], there is a correspondence between
half-integral weight Hecke eigenforms for $M_k(4N, \chi)$ and integral weight Hecke eigenforms for
$M_{2k-1}(2N, \chi^2)$. This is known as the Shimura correspondence, which is not 1-to-1 in general,
but if one restricts to the Kohnen plus space it is; see [8] for details.

23.2 Hilbert modular forms


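Let K be a totally real number field with embeddings $\sigma_1, \ldots, \sigma_r\colon K \to \mathbb{R}$. Each $\sigma_i$ induces an
embedding $\sigma_i\colon \mathrm{GL}_2(K) \hookrightarrow \mathrm{GL}_2(\mathbb{R})$, and we define
\[ \mathrm{GL}_2^+(K) := \{\alpha \in \mathrm{GL}_2(K) : \sigma_i(\alpha) \in \mathrm{GL}_2^+(\mathbb{R}) \text{ for } 1 \le i \le r\}, \]
as the subgroup of matrices in $\mathrm{GL}_2(K)$ with totally positive determinant. The group $\mathrm{GL}_2^+(K)$ acts
on $\mathbb{H}^r$ via the isometric action of $\mathrm{GL}_2^+(\mathbb{R})$ on $\mathbb{H}$. For functions $f\colon \mathbb{H}^r \to \mathbb{C}$, vectors of integers
$k \in \mathbb{Z}^r$, and $\alpha \in \mathrm{GL}_2^+(K)$ we define the weight-k slash operator
\[ (f|_k\alpha)(z) := \left(\prod_{i=1}^{r} \det(\sigma_i(\alpha))^{k_i/2}\, j(\sigma_i(\alpha), z_i)^{-k_i}\right) f(\alpha z), \]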

which we note coincides with the usual weight-k slash operator when r = 1.
Let $\mathbb{Z}_K$ denote the ring of integers of K (the integral closure of $\mathbb{Z}$ in K). The group $\mathrm{GL}_2^+(\mathbb{Z}_K)$ is
an arithmetic lattice in the locally compact group $\mathrm{GL}_2^+(\mathbb{R})^r$, as is any $\Gamma \le \mathrm{GL}_2^+(K)$ commensurable
with $\mathrm{GL}_2^+(\mathbb{Z}_K)$; we note that unlike the case r = 1, for r > 1 lattices in $\mathrm{GL}_2^+(\mathbb{R})^r$ are automatically
arithmetic.

Definition 23.1. Let K be a totally real number field of degree r > 1. For $k \in \mathbb{Z}^r$ and lattices
$\Gamma \le \mathrm{GL}_2^+(K)$ commensurable with $\mathrm{GL}_2^+(\mathbb{Z}_K)$, a Hilbert modular form $f\colon \mathbb{H}^r \to \mathbb{C}$ of weight k
for $\Gamma$ is a holomorphic function that satisfies $f|_k\gamma = f$ for all $\gamma \in \Gamma$.



For r = 1 (in which case K = Q) this coincides with the usual definition of a modular form
without the requirement that f be holomorphic at the cusps. Hilbert modular forms are auto-
matically holomorphic at the cusps (this follows from what is known as the Koecher principle).
We call weights $k = (c, \ldots, c) \in \mathbb{Z}^r$ parallel weights (or scalar weights), and these are the
most commonly studied. There is a natural analog of the congruence subgroups $\Gamma_0(N)$ that is
commonly used for Hilbert modular forms: for any nonzero ideal $\mathfrak{n}$ of $\mathbb{Z}_K$ we define
\[ \Gamma_0(\mathfrak{n}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{Z}_K) : c \in \mathfrak{n} \right\}, \]
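and say that Hilbert modular forms for $\Gamma_0(\mathfrak{n})$ have level $\mathfrak{n}$. More generally, if $\mathfrak{c}$ is any fractional
$\mathbb{Z}_K$-ideal, we can define
\[ \Gamma_0(\mathfrak{c}, \mathfrak{n}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(K) : a, d \in \mathbb{Z}_K,\ c \in \mathfrak{c}\mathfrak{n},\ b \in \mathfrak{c}^{-1} \right\} \]
and consider the space $M_k(\mathfrak{c}, \mathfrak{n})$ of Hilbert modular forms for $\Gamma_0(\mathfrak{c}, \mathfrak{n})$ of level $(\mathfrak{c}, \mathfrak{n})$. This is a finite
dimensional $\mathbb{C}$-vector space, as is its subspace $S_k(\mathfrak{c}, \mathfrak{n})$ of cusp forms (Hilbert modular forms that
vanish at the cusps). These spaces have dimension zero unless $k \in \mathbb{Z}_{\ge 0}^r$.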
For suitable k (including parallel even weight k) there are dimension formulas and trace
formulas, as well as a nice theory of Hecke operators and newforms; we now have Hecke
operators $T_{\mathfrak{p}}$ associated to each prime ideal $\mathfrak{p}$ of $\mathbb{Z}_K$ and newforms f have Fourier coefficients $a_{\mathfrak{p}}$
equal to the eigenvalue of Tp on f (all of which are algebraic integers). We should note that the
field Q( f ) generated by these eigenvalues need not contain K; indeed, we may have Q( f ) = Q.
For parallel weight-2 newforms f there is a conjectural analog of the Eichler-Shimura con-
struction that yields modular abelian varieties over K that is known to hold in many cases (in-
cluding all cases where K has odd degree) due to a theorem of Hida [7]. When f has rational
Hecke eigenvalues (so Q( f ) = Q), one obtains an elliptic curve E/K.
Recall that for K = Q the modularity theorem gives a 1-to-1 correspondence between isogeny
classes of elliptic curves E/Q of conductor N and newforms f of weight 2 and level N with
Q( f ) = Q in which E and f have the same L-function. It is similarly conjectured that there
is a 1-to-1 correspondence between elliptic curves over totally real fields K of conductor n and
Hilbert modular newforms of parallel weight 2 and level n with Q( f ) = Q in which E and f
have the same L-function.
There is much recent progress toward this conjecture, which is now known to hold for all
real quadratic fields [6], all totally real cubic fields [5], and in many other cases. Unlike E/Q, it
is possible for E/K to have everywhere good reduction (meaning that its conductor is the trivial
ideal ZK ), corresponding to a Hilbert modular form for the full Hilbert modular group $\mathrm{GL}_2^+(\mathbb{Z}_K)$.

Remark 23.2. Some authors restrict to subgroups of $\mathrm{SL}_2(\mathbb{Z}_K) \le \mathrm{GL}_2^+(\mathbb{Z}_K)$ when discussing
Hilbert modular forms. This is a nontrivial restriction: the determinant of an element of $\mathrm{GL}_2^+(\mathbb{Z}_K)$
is a unit that is positive in every real embedding (a totally positive unit), but this condition is
satisfied by the square of any unit, and there are infinitely many when K ≠ Q.
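To make the remark concrete, here is a minimal sketch for the real quadratic field K = Q(√2), which is our choice of example and not one taken from the lecture. Elements a + b√2 of Z[√2] are stored as integer pairs, and the helper names (mul, norm, embeddings, and so on) are hypothetical. The unit 1 + √2 is not totally positive, but its square 3 + 2√2 is, so a matrix such as diag(3 + 2√2, 1) has totally positive unit determinant without having determinant 1.

```python
import math

SQRT2 = math.sqrt(2.0)

def mul(u, v):
    """Multiply u = a + b*sqrt(2) and v = c + d*sqrt(2), stored as integer pairs."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def norm(u):
    """Field norm of a + b*sqrt(2): the product of its two real embeddings."""
    a, b = u
    return a * a - 2 * b * b

def embeddings(u):
    """The two real embeddings, sending sqrt(2) to +sqrt(2) and to -sqrt(2)."""
    a, b = u
    return (a + b * SQRT2, a - b * SQRT2)

def is_unit(u):
    """An element of Z[sqrt(2)] is a unit exactly when its norm is +1 or -1."""
    return norm(u) in (1, -1)

def is_totally_positive(u):
    """Floating point sign test; adequate only for small examples like these."""
    return all(x > 0 for x in embeddings(u))

eps = (1, 1)            # 1 + sqrt(2), a unit of norm -1
eps_sq = mul(eps, eps)  # its square, a unit of norm +1

print(eps, is_unit(eps), is_totally_positive(eps))
print(eps_sq, is_unit(eps_sq), is_totally_positive(eps_sq))
```

Repeatedly multiplying by eps_sq produces further totally positive units, which is the source of the infinitely many squares mentioned in the remark.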

23.3 Bianchi modular forms


We began this course by considering the action of GL2(R) on H ≃ SL2(R)/SO2(R), restricting
to $\mathrm{GL}_2^+(\mathbb{R})$ so that the action is isometric, and then to lattices in SL2(R), which acts on
SL2(R)/SO2(R) via left multiplication. Here SO2(R) is a maximal compact subgroup of SL2(R),
and we note that SU2(C) is a maximal compact subgroup of SL2(C). Replacing R with C, we
get an action of SL2(C) on SL2(C)/SU2(C).

What is the analog of the complex upper half plane H in this setting? Well, H = H2 is a
2-dimensional real manifold equipped with the hyperbolic metric induced by the Haar measure
of SL2 (R), and is often called hyperbolic 2-space. Now SL2 (C)/SU2 (C) is a 3-dimensional real
manifold isomorphic to hyperbolic 3-space

\[ \mathbf{H}_3 := \mathbb{C} \times \mathbb{R}_{>0} = \{ (x_1 + x_2 i,\ y) : x_1, x_2 \in \mathbb{R},\ y \in \mathbb{R}_{>0} \}. \]


The Haar measure on SL2(C) induces the hyperbolic metric $\frac{dx_1\,dx_2\,dy}{y^3}$ on H3, and SL2(C) then acts
on H3 via isometries.
For any number field K of degree r, we have r embeddings of K into C, each of which induces
an embedding of SL2(K) into SL2(C), allowing us to view SL2(K) as a subgroup of $\mathrm{SL}_2(\mathbb{C})^r$ that
acts on $\mathbf{H}_3^r$ via isometries.
In the simplest case r = 1 and K is an imaginary quadratic field, which we assume hence-
forth. A complication arises because H3 is not a complex manifold, so we cannot define modular
forms as “holomorphic” functions on H3 . One instead works with vector-valued real analytic
functions F : H3 → $\mathbb{C}^{k+1}$ that are invariant under the action of certain differential operators
(Casimir operators associated to the Lie algebra of SL2 (C)). The weight-k slash operator asso-
ciated to α ∈ SL2 (C) is then defined by

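\[ (F|_k \alpha)(z) := \mathrm{Sym}^k\big(J(\alpha, z)^{-1}\big)\, F(\alpha z), \]

where
\[ J(\alpha, z) := \begin{pmatrix} cx + d & -cy \\ \bar{c}y & \overline{cx + d} \end{pmatrix} \]
for $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and z = (x, y) ∈ C × R>0 = H3, and $\mathrm{Sym}^k$ denotes
the linear operator corresponding to the kth symmetric power of the standard representation of
SL2(C) on $\mathbb{C}^2$. For integers k ≥ 0 and congruence subgroups Γ ≤ SL2(ZK) one can then consider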
the space of Bianchi modular forms of weight k for Γ, which turns out to be a finite-dimensional
C-vector space that contains a subspace of cusp forms and is equipped with Hecke operators Tp associated to
prime ideals of ZK . There is a notion of newforms that are simultaneous eigenforms for all the
Hecke operators, and these form a basis for the new subspace of cusp forms.
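The automorphy factor J(α, z) can be explored numerically. The sketch below assumes the standard formula for the action of SL2(C) on H3 (the one obtained by writing a point as the quaternion x + yj and applying α as a linear fractional transformation); this formula is not stated in these notes, so treat it as an assumption, and the function names are hypothetical. Under that assumption one can check on examples that J satisfies the cocycle relation J(αβ, z) = J(α, βz) J(β, z), which is the property a slash operator needs in order to define a group action.

```python
def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def act(alpha, point):
    """Assumed action of alpha in SL2(C) on point = (x, y) in C x R_{>0}."""
    (a, b), (c, d) = alpha
    x, y = point
    u = c * x + d
    denom = abs(u) ** 2 + abs(c) ** 2 * y ** 2  # equals det J(alpha, point)
    new_x = ((a * x + b) * u.conjugate() + a * c.conjugate() * y ** 2) / denom
    return (new_x, y / denom)

def J(alpha, point):
    """The 2x2 automorphy factor J(alpha, z) for z = (x, y)."""
    (a, b), (c, d) = alpha
    x, y = point
    u = c * x + d
    return ((u, -c * y), (c.conjugate() * y, u.conjugate()))

def close(m, n, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))

alpha = ((2 + 0j, 1j), (1j, 0j))          # determinant 2*0 - i*i = 1
beta = ((1 + 0j, 1 + 1j), (0j, 1 + 0j))   # determinant 1
z = (0.3 + 0.7j, 1.5)

lhs = J(matmul(alpha, beta), z)
rhs = matmul(J(alpha, act(beta, z)), J(beta, z))
print(close(lhs, rhs))
```

Complex entries are used throughout so that the conjugate method is available on every matrix entry.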
As with Hilbert modular forms, one expects to be able to associate modular abelian varieties
over K to weight-2 Bianchi newforms f , and when Q( f ) = Q, to obtain elliptic curves over K.
But unlike the case of Hilbert modular forms, this is not quite true; there are weight-2 Bianchi
newforms with Q( f ) = Q that do not correspond to any elliptic curve E/K, but instead corre-
spond to a simple abelian surface whose endomorphism ring is an order in a quaternion algebra
(something that cannot hold for an elliptic curve over a number field).
It is however true that every elliptic curve over an imaginary quadratic field K is modular,
meaning it has the same L-function as a weight-2 Bianchi newform f with Q( f ) = Q; this was
recently proved in [12].

23.4 Siegel modular forms


For each positive integer g we define the Siegel upper half space

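\[ \mathbf{H}_g := \{ z \in M_g(\mathbb{C}) : z^T = z \text{ and } \operatorname{Im}(z) > 0 \} \]

of symmetric g × g matrices with positive definite imaginary part. Let $J = \begin{pmatrix} 0 & -I_g \\ I_g & 0 \end{pmatrix} \in M_{2g}(\mathbb{Z})$.
The symplectic group
\[ \mathrm{Sp}_{2g}(\mathbb{R}) := \{ \gamma \in \mathrm{SL}_{2g}(\mathbb{R}) : \gamma^T J \gamma = J \} \]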

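acts on H g via “linear fractional transformations” $\tau \mapsto (a\tau + b)(c\tau + d)^{-1}$ for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{Sp}_{2g}(\mathbb{R})$, where a, b, c, d are g × g blocks;
note that for g > 1 this expression involves matrices and the order of multiplication matters.
This extends to an isometric action of $\mathrm{GSp}_{2g}^+(\mathbb{R})$ on H g , where

\[ \mathrm{GSp}_{2g}(\mathbb{R}) = \{ \gamma \in \mathrm{GL}_{2g}(\mathbb{R}) : \gamma^T J \gamma = \lambda J \text{ with } \lambda \in \mathbb{R}^\times \} \]

is the group of symplectic similitudes and $\mathrm{GSp}_{2g}^+(\mathbb{R})$ is the subgroup with positive determinant.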
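As an illustration of why the order of multiplication matters, the following sketch applies three symplectic matrices to a point of the Siegel upper half space for g = 2. The block representation (a, b, c, d), the sample point tau, and all function names are our own hypothetical choices. The three matrices are a translation τ ↦ τ + B with B symmetric, the inversion τ ↦ −τ⁻¹ given by J itself, and a block diagonal element acting as τ ↦ AτAᵀ.

```python
def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matadd(m, n):
    return tuple(tuple(m[i][j] + n[i][j] for j in range(2)) for i in range(2))

def inverse(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] / det, -m[0][1] / det), (-m[1][0] / det, m[0][0] / det))

def act(gamma, tau):
    """(a*tau + b)(c*tau + d)^{-1} for gamma = (a, b, c, d) given as 2x2 blocks."""
    a, b, c, d = gamma
    return matmul(matadd(matmul(a, tau), b), inverse(matadd(matmul(c, tau), d)))

def wrong_order(gamma, tau):
    """The same factors multiplied in the other order, for comparison only."""
    a, b, c, d = gamma
    return matmul(inverse(matadd(matmul(c, tau), d)), matadd(matmul(a, tau), b))

def in_siegel_space(tau, tol=1e-9):
    """Check that tau is symmetric with positive definite imaginary part (g = 2)."""
    symmetric = abs(tau[0][1] - tau[1][0]) < tol
    y00, y01, y11 = tau[0][0].imag, tau[0][1].imag, tau[1][1].imag
    return symmetric and y00 > 0 and y00 * y11 - y01 * y01 > 0

I2 = ((1, 0), (0, 1))
Z2 = ((0, 0), (0, 0))
translate = (I2, ((1, 2), (2, -3)), Z2, I2)             # tau -> tau + B, B symmetric
invert = (Z2, ((-1, 0), (0, -1)), I2, Z2)               # tau -> -tau^{-1}, the matrix J
conj_a = (((1, 1), (0, 1)), Z2, Z2, ((1, 0), (-1, 1)))  # tau -> A tau A^T

tau = ((1j, 0.5 + 0.25j), (0.5 + 0.25j, 2j))

for gamma in (translate, invert, conj_a):
    print(in_siegel_space(act(gamma, tau)))
print(in_siegel_space(wrong_order(conj_a, tau)))
```

For the block diagonal element the correctly ordered product is again symmetric, while the reversed product AᵀAτ is not, so it does not even land in the Siegel upper half space.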
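Each τ ∈ H g has an associated lattice $\Lambda_\tau := \tau\mathbb{Z}^g \oplus \mathbb{Z}^g \subseteq \mathbb{C}^g$ for which the torus $A_\tau = \mathbb{C}^g/\Lambda_\tau$
is an abelian variety A/C equipped with the principal polarization
\[ E_\tau(\tau x_1 + x_2,\ \tau y_1 + y_2) = (x_1^T, x_2^T)\, J \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \]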
For V ≃ $\mathbb{C}^g$, a polarization on a torus V/Λ is a positive definite Hermitian form H : V × V → C with Im H(Λ, Λ) ⊆ Z.
Every polarization induces a homomorphism v ↦ H(v, ·) to the dual torus V∗/Λ∗, where
\[ V^* := \{ f : V \to \mathbb{C} : f(\alpha v) = \bar{\alpha} f(v) \text{ and } f(v_1 + v_2) = f(v_1) + f(v_2) \} \]
and Λ∗ := { f ∈ V∗ : Im f(Λ) ⊆ Z}. This defines an isogeny from the abelian variety A = V/Λ
to the dual abelian variety A∗ = V∗/Λ∗; for principal polarizations this is an isomorphism.
Two elements τ, τ′ ∈ H g define isomorphic principally polarized abelian varieties $A_\tau \simeq A_{\tau'}$
if and only if they lie in the same Sp2g(Z)-orbit. We thus have a bijection between isomorphism
classes of principally polarized abelian varieties A/C of dimension g and the quotient
Sp2g(Z)\H g, which is a variety A g /Q of dimension g(g + 1)/2, the moduli space of principally
polarized abelian varieties of dimension g. For g = 1 the moduli space A1 corresponds to
the modular curve Y (1) = Γ (1)\H; note that Γ (1) = SL2 (Z) = Sp2 (Z) and H = H1 . As with
Y (1) = A1 , the moduli space A g is not compact, but it can be compactified and embedded into
projective space (analogous to the construction of X (1) from Y (1), but a bit more complicated).
We note that for g ≤ 3 the dimensions of A g and M g coincide. Here M g is the moduli space
of nice (smooth projective geometrically connected) curves; it has dimension 1 for g = 1 and
dimension 3g − 3 for g > 1. Thus almost all principally polarized abelian varieties of dimension
g ≤ 3 are Jacobians of nice curves of genus g, but this is not true for any g > 3.
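The dimension comparison above is easy to tabulate. This small sketch uses only the two formulas quoted in the text; the function names dim_A and dim_M are hypothetical.

```python
def dim_A(g):
    """Dimension g(g+1)/2 of the moduli space A_g."""
    return g * (g + 1) // 2

def dim_M(g):
    """Dimension of M_g: 1 for g = 1 and 3g - 3 for g > 1."""
    return 1 if g == 1 else 3 * g - 3

# Columns: g, dim A_g, dim M_g, and the gap between them.
for g in range(1, 7):
    print(g, dim_A(g), dim_M(g), dim_A(g) - dim_M(g))
```

For g > 1 the gap equals (g − 2)(g − 3)/2, which vanishes for g = 2, 3 and is positive for every g > 3.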
We now define
Γ (N ) := {γ ∈ Sp2g (Z) : γ ≡ 1 mod N },
and call subgroups Γ ≤ Sp2g (Z) that contain Γ (N ) for some N congruence subgroups. For g > 1
a holomorphic function f : H g → C is a (classical) Siegel modular form of (parallel even) weight
k ∈ 2Z for a congruence subgroup Γ if it is invariant under the weight-k slash operator
\[ (f|_k \gamma)(\tau) := \det\big(J(\gamma, \tau)\big)^{-k} f(\gamma\tau), \]

for all $\gamma = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \Gamma$, where J(γ, τ) = Cτ + D. This definition coincides with the definition of
a modular form for a congruence subgroup when g = 1 except the holomorphicity condition at
the cusps is omitted because it is automatically satisfied (by Koecher’s principle).
This definition can be generalized in many ways, including to non-parallel weights and
vector-valued functions associated to representations of GL g (C), but for the purpose of this
introduction there is one particular generalization we want to consider, and to simplify matters
we will restrict to g = 2. Rather than a congruence subgroup Γ ≤ Sp2g (Z) we instead use a
paramodular group
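\[ K(N) := \begin{pmatrix} \mathbb{Z} & N\mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ \mathbb{Z} & \mathbb{Z} & \mathbb{Z} & N^{-1}\mathbb{Z} \\ \mathbb{Z} & N\mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ N\mathbb{Z} & N\mathbb{Z} & N\mathbb{Z} & \mathbb{Z} \end{pmatrix} \cap \mathrm{Sp}_4(\mathbb{Q}). \]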
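A membership test for K(N) makes the definition concrete. The sketch below checks two things: that each entry lies in the module prescribed for its position (integers, multiples of N, or integers divided by N in row 2, column 4), and that the matrix satisfies γᵀJγ = J for the 4 × 4 matrix J with blocks (0, −I; I, 0). The helper names and the choice N = 5 are hypothetical, and exact rational arithmetic is used via Fraction.

```python
from fractions import Fraction

# J for g = 2 in block form (0, -I; I, 0).
J4 = (
    (0, 0, -1, 0),
    (0, 0, 0, -1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
)

# "Z": integer, "NZ": multiple of N, "Z/N": integer divided by N.
PATTERN = (
    ("Z", "NZ", "Z", "Z"),
    ("Z", "Z", "Z", "Z/N"),
    ("Z", "NZ", "Z", "Z"),
    ("NZ", "NZ", "NZ", "Z"),
)

def entry_ok(x, kind, N):
    x = Fraction(x)
    if kind == "Z":
        return x.denominator == 1
    if kind == "NZ":
        return (x / N).denominator == 1
    return (x * N).denominator == 1  # kind == "Z/N"

def matmul(m, n):
    return tuple(
        tuple(sum(Fraction(m[i][k]) * n[k][j] for k in range(4)) for j in range(4))
        for i in range(4)
    )

def transpose(m):
    return tuple(zip(*m))

def is_symplectic(m):
    return matmul(matmul(transpose(m), J4), m) == J4

def in_paramodular_group(m, N):
    pattern_ok = all(
        entry_ok(m[i][j], PATTERN[i][j], N) for i in range(4) for j in range(4)
    )
    return pattern_ok and is_symplectic(m)

def unipotent(i, j, value):
    """Identity matrix with one extra off-diagonal entry (0-based indices)."""
    return tuple(
        tuple(Fraction(value) if (r, c) == (i, j) else Fraction(int(r == c))
              for c in range(4))
        for r in range(4)
    )

N = 5
u = unipotent(1, 3, Fraction(1, N))  # 1/N in row 2, column 4: allowed
v = unipotent(0, 2, Fraction(1, N))  # 1/N in row 1, column 3: not integral
w = unipotent(0, 3, 1)               # integral entries, but not symplectic

print(in_paramodular_group(u, N), in_paramodular_group(v, N), in_paramodular_group(w, N))
```

The element u shows that K(N) is not contained in Sp4(Z) when N > 1: it is symplectic and fits the entry pattern, but it has a non-integral entry.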

Points on the quotient K(N )\H2 now correspond to (1, N )-polarized abelian surfaces, but this
distinction won’t concern us here; the connection between modular forms and abelian varieties
is via their L-functions and operates at the level of isogeny classes (recall that polarizations are
isogenies); the L-function of an abelian variety is independent of its polarization.
Modular forms for paramodular groups are called paramodular forms. There is a nice theory
of Hecke operators, eigenforms, and newforms for paramodular groups that is conjectured to
have all the properties one would like; in particular the new subspace of cuspidal paramodular
forms of weight k and level N has a basis consisting of newforms that are normalized eigenforms
for all of the Hecke operators Tp for p ∤ N. The theory of newforms is more complicated than
in the classical case (there are three different ways in which a paramodular form of level M | N
can arise as a paramodular form of level N , as well as Gritsenko lifts of Jacobi forms that need
to be excluded), but this theory is now well understood [9].

23.5 The paramodular conjecture


For abelian surfaces there is a conjectured analog of the modularity of elliptic curves known as
the paramodular conjecture, due to Brumer and Kramer [2].
Conjecture 23.3 (Paramodular conjecture). Let A/Q be an abelian surface of conductor N with
End(A) = Z. Then L(A, s) = L( f , s) for some weight-2 paramodular newform f of level N with
rational Hecke eigenvalues. Conversely, for every weight-2 paramodular newform f of level N with
rational Hecke eigenvalues we have L( f , s) = L(A, s) for an abelian variety A/Q that is either an
abelian surface of conductor N with End(A) = Z, or an abelian fourfold of conductor $N^2$ with
End(A) an order in a non-split quaternion algebra.
Both directions of this conjecture are open and highly non-trivial; in particular, there is no
analog of the Eichler–Shimura construction that allows one to associate an abelian variety to a
weight-2 paramodular newform. In fact there has been more progress toward proving the first
part of the conjecture (the modularity of abelian surfaces) than the second (which for g = 1 was
known long before the modularity of elliptic curves, due to the Eichler–Shimura construction).
The necessity of allowing for the case of QM abelian fourfolds, noted by Frank Calegari, highlights
the difficulty and is analogous to the situation with weight-2 Bianchi newforms with rational
Hecke eigenvalues that correspond to QM abelian surfaces rather than elliptic curves.
The paramodular conjecture has now been verified for many abelian surfaces over Q with
small conductor N , including for N = 277, which is the first level where one finds paramodular
newforms with rational Hecke eigenvalues and is believed to be the smallest conductor of an
abelian surface A/Q with End(A) = Z. There are two abelian surfaces A of conductor 277, both
of which are isogenous to the Jacobian of

\[ y^2 + (x^3 + x^2 + x + 1)\,y = -x^2 - x, \]

which is the genus 2 curve with label 277.a.277.1 in the LMFDB. Infinite families of abelian
surfaces over Q for which the paramodular conjecture holds are now known [3], building on
the potential modularity results of Boxer, Calegari, Gee, and Pilloni [1]. Very recently the same
authors have announced a proof that the paramodular conjecture holds for a positive proportion
of abelian surfaces over Q (but the new result uses different techniques).
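The L-function of the Jacobian of the genus 2 curve above can be probed by counting points. The sketch below counts points on the curve modulo a prime p ≠ 277 and returns p + 1 − #C(F_p), which we assume (it is the standard relation, not stated in these notes) equals the trace of Frobenius of the Jacobian at a prime of good reduction. We also assume that this model has exactly two rational points at infinity for every p, since the equation y² + y = 0 at infinity always has the two distinct roots 0 and −1. The function names are hypothetical.

```python
def count_points(p):
    """Points over F_p on y^2 + (x^3 + x^2 + x + 1) y = -x^2 - x, including infinity."""
    affine = 0
    for x in range(p):
        h = (x ** 3 + x ** 2 + x + 1) % p
        f = (-x ** 2 - x) % p
        for y in range(p):
            if (y * y + h * y - f) % p == 0:
                affine += 1
    return affine + 2  # assumed: two rational points at infinity

def trace_of_frobenius(p):
    """p + 1 - #C(F_p), valid as a trace only at primes of good reduction."""
    return p + 1 - count_points(p)

for p in (2, 3, 5, 7, 11, 13):
    print(p, count_points(p), trace_of_frobenius(p))
```

A hand count gives trace −2 at p = 2 and trace −1 at p = 3. Under the paramodular conjecture these traces match the Hecke eigenvalues of the corresponding weight-2 paramodular newform of level 277.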
One can define paramodular groups K(N ) ≤ Sp2g (Q) for even g > 2, but in general the
quotient K(N )\H g parameterizes abelian varieties whose endomorphism ring is an order in a
totally real field of degree g/2. For g > 2 these abelian varieties are not fully general and do
not account for a positive proportion of abelian varieties of dimension g.

23.6 Shimura varieties
We have seen modular curves that parameterize elliptic curves equipped with a level structure,
Siegel modular varieties that parameterize abelian varieties equipped with a polarization,
and paramodular varieties that parameterize abelian varieties whose endomorphism rings are
orders in a totally real field. These are all special cases of Shimura varieties, and in particular,
PEL type Shimura varieties, that parameterize abelian varieties with a particular polarization
(P), endomorphism ring (E), and level structure (L).
We won’t attempt to give a formal definition of Shimura varieties here, but they generally
start with a complex manifold M (such as M = H) endowed with a Hermitian metric (such
as the hyperbolic metric) that is induced by the Haar measure of a locally compact topological
group (such as SL2 (R)) acting via isometries. Moreover, this topological group corresponds to
the real points G(R) of a Q-algebraic group G (such as G = SL2 ). This allows us to then consider
arithmetic lattices Γ ≤ G(R) (such as SL2 (Z)) and Shimura varieties X = Γ \M formed by the
quotient that are defined over a finite extension of Q and which parameterize abelian varieties
equipped with certain structures.

References
[1] George Boxer, Frank Calegari, Toby Gee, and Vincent Pilloni, Abelian surfaces over totally
real fields are potentially modular, Publ. Math. IHES 134 (2021), 153–501.

[2] Armand Brumer and Ken Kramer, Paramodular abelian varieties of odd conductor, Trans.
Amer. Math. Soc. 366 (2013), 2463–2516 (but see 2018 correction in arXiv:1004.4699).

[3] Frank Calegari, Shiva Chidambaram, and David Roberts, Abelian surfaces with fixed 3-
torsion, Proceedings of the Fourteenth Algorithmic Number Theory Symposium (ANTS XIV),
Open Book Series 4, Mathematical Sciences Publishers, 2020.

[4] Henri Cohen and Fredrik Strömberg, Modular forms: A classical approach, Graduate Studies
in Mathematics 179, American Mathematical Society, 2017.

[5] Maarten Derickx, Filip Najman, and Samir Siksek, Elliptic curves over totally real cubic fields
are modular, Algebra Number Theory 14 (2020), 1791–1800.

[6] Nuno Freitas, Bao V. Le Hung, and Samir Siksek, Elliptic curves over real quadratic fields are
modular, Invent. Math. 201 (2015), 159–206.

[7] Haruzo Hida, Hilbert modular forms and Iwasawa theory, Oxford University Press, 2006.

[8] Winfried Kohnen, Modular forms of half–integral weight on Γ0 (4), Math. Ann. 248 (1980),
249–266.

[9] Brooke Roberts and Ralf Schmidt, Local newforms for GSp(4), Lecture Notes in Mathematics
1918, Springer, 2007.

[10] Goro Shimura, On modular forms of half integral weight, Ann. of Math. 97 (1973), 440–481.

[11] Nils–Peter Skoruppa and Don Zagier, Jacobi forms and a certain space of modular forms,
Invent. Math. 94 (1988), 113–146.

[12] Patrick B. Allen, Frank Calegari, Ana Caraiani, Toby Gee, David Helm, Bao V. Le Hung,
James Newton, Peter Scholze, Richard Taylor, and Jack A. Thorne, Potential automorphy
over CM fields, Ann. of Math. 197 (2023), 897–1113.
