Elliptic Curves, Modular Forms, and Their L-Functions

This book provides an introduction to the connections between elliptic curves, modular forms, and L-functions. It covers the basics of each topic, explains how they are related, and discusses important theorems like modularity and Fermat's Last Theorem. The book is based on lecture notes from an undergraduate summer course and is intended to give readers a high-level overview without proofs.

STUDENT MATHEMATICAL LIBRARY

IAS/PARK CITY MATHEMATICAL SUBSERIES

Volume 58

Elliptic Curves,
Modular Forms,
and Their
L-functions
Álvaro Lozano-Robledo

American Mathematical Society, Providence, Rhode Island

Institute for Advanced Study, Princeton, New Jersey
Editorial Board of the Student Mathematical Library
Gerald B. Folland Brad G. Osgood (Chair)
Robin Forman John Stillwell

Series Editor for the Park City Mathematics Institute

John Polking
Cover art courtesy of Karl Rubin, using MegaPOV, which is based
on POV-Ray, both of which are open source, freely available software.

2000 Mathematics Subject Classification. Primary 14H52, 11G05;
Secondary 11F03, 11G40.

For additional information and updates on this book, visit

www.ams.org/bookpages/stml-58

Library of Congress Cataloging-in-Publication Data

Lozano-Robledo, Álvaro, 1978–
Elliptic curves, modular forms, and their L-functions / Álvaro Lozano-Robledo.
p. cm. — (Student mathematical library ; v. 58. IAS/Park City mathemat-
ical subseries)
Includes bibliographical references and index.
ISBN 978-0-8218-5242-2 (alk. paper)
1. Curves, Elliptic. 2. Forms, Modular. 3. L-functions. 4. Number theory.
I. Title.
QA567.2.E44L69 2010
516.352—dc22
2010038952

Copying and reprinting. Individual readers of this publication, and nonprofit
libraries acting for them, are permitted to make fair use of the material, such as to
copy a chapter for use in teaching or research. Permission is granted to quote brief
passages from this publication in reviews, provided the customary acknowledgment of
the source is given.
Republication, systematic copying, or multiple reproduction of any material in this
publication is permitted only under license from the American Mathematical Society.
Requests for such permission should be addressed to the Acquisitions Department,
American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-
2294 USA. Requests can also be made by e-mail to [email protected].

© 2011 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1 16 15 14 13 12 11
To my parents, who taught me everything important,
to my grandmother, for her smile that does not appear in photographs,
to Marisa, for making me always better myself,
and to “Sally”, who will be born soon.
Contents

Preface

Chapter 1. Introduction
§1.1. Elliptic curves
§1.2. Modular forms
§1.3. L-functions
§1.4. Exercises

Chapter 2. Elliptic curves
§2.1. Why elliptic curves?
§2.2. Definition
§2.3. Integral points
§2.4. The group structure on E(Q)
§2.5. The torsion subgroup
§2.6. Elliptic curves over finite fields
§2.7. The rank and the free part of E(Q)
§2.8. Linear independence of rational points
§2.9. Descent and the weak Mordell-Weil theorem
§2.10. Homogeneous spaces
§2.11. Selmer and Sha
§2.12. Exercises

Chapter 3. Modular curves
§3.1. Elliptic curves over C
§3.2. Functions on lattices and elliptic functions
§3.3. Elliptic curves and the upper half-plane
§3.4. The modular curve X(1)
§3.5. Congruence subgroups
§3.6. Modular curves
§3.7. Exercises

Chapter 4. Modular forms
§4.1. Modular forms for the modular group
§4.2. Modular forms for congruence subgroups
§4.3. The Petersson inner product
§4.4. Hecke operators acting on cusp forms
§4.5. Exercises

Chapter 5. L-functions
§5.1. The L-function of an elliptic curve
§5.2. The Birch and Swinnerton-Dyer conjecture
§5.3. The L-function of a modular (cusp) form
§5.4. The Taniyama-Shimura-Weil conjecture
§5.5. Fermat’s last theorem
§5.6. Looking back and looking forward
§5.7. Exercises

Appendix A. PARI/GP and Sage
§A.1. Elliptic curves
§A.2. Modular forms
§A.3. L-functions
§A.4. Other Sage commands

Appendix B. Complex analysis
§B.1. Complex numbers
§B.2. Analytic functions
§B.3. Meromorphic functions
§B.4. The complex exponential function
§B.5. Theorems in complex analysis
§B.6. Quotients of the complex plane
§B.7. Exercises

Appendix C. Projective space
§C.1. The projective line
§C.2. The projective plane
§C.3. Over an arbitrary field
§C.4. Curves in the projective plane
§C.5. Singular and smooth curves

Appendix D. The p-adic numbers
§D.1. Hensel’s lemma
§D.2. Exercises

Appendix E. Parametrization of torsion structures

Bibliography

Index
Preface

This book grew out of the lecture notes for a course on “Elliptic
Curves, Modular Forms and L-functions” that the author taught at
an undergraduate summer school as part of the 2009 Park City Mathe-
matics Institute. These notes are an introductory survey of the theory
of elliptic curves, modular forms and their L-functions, with an em-
phasis on examples rather than proofs. The main goal is to provide
the reader with a big picture of the surprising connections among
these three types of mathematical objects, which are seemingly so
distinct. In that vein, one of the themes of the book is to explain
the statement of the modularity theorem (Theorem 5.4.6), previously
known as the Taniyama-Shimura-Weil conjecture (Conjecture 5.4.5).
In order to underscore the importance of the modularity theorem, we
also discuss in some detail one of its most renowned consequences:
Fermat’s last theorem (Example 1.1.5 and Section 5.5).
It would be impossible to give the proofs of the main theorems
on elliptic curves and modular forms in one single course, and the
proofs would be outside the scope of the undergraduate curriculum.
However, the definitions, the statements of the main theorems and
their corollaries can be easily understood by students with some
standard undergraduate background (calculus, linear algebra, elementary
number theory and a first course in abstract algebra). Proofs that are
accessible to a student are left to the reader and proposed as exercises
at the end of each chapter. The reader should be warned, though,
that there are multiple references to mathematical objects and results
that we will not have enough space to discuss in full, and the student
will have to take these items on faith (we will provide references to
other texts, however, for those students who wish to deepen their
understanding). Some other objects and theorems are mentioned in
previous chapters but only explained fully in later chapters. To avoid
any confusion, we always try to clarify in the text which objects or
results the student should take on faith, which ones we expect the stu-
dent to be familiar with, and which will be explained in later chapters
(by providing references to later sections of the book).
The book begins with some motivating problems, such as the
congruent number problem, Fermat’s last theorem, and the represen-
tations of integers as sums of squares. Chapter 2 is a survey of the
algebraic theory of elliptic curves. In Section 2.9, we give a proof
of the weak Mordell-Weil theorem for elliptic curves with rational 2-
torsion and explain the method of 2-descent. The goal of Chapter
3 is to motivate the connection between elliptic curves and modular
forms. To that end, we discuss complex lattices, tori, modular curves
and how these objects relate to elliptic curves over the complex num-
bers. Chapter 4 introduces the spaces of modular forms for SL(2, Z)
and other congruence subgroups (e.g., Γ0 (N )). In Chapter 5 we define
the L-functions attached to elliptic curves and modular forms. We
briefly discuss the Birch and Swinnerton-Dyer conjecture and other
related conjectures. Finally, in Section 5.4, we justify the statement
of the original conjecture of Taniyama-Shimura-Weil (which we usu-
ally refer to as the modularity theorem, since it was proved in 1999);
i.e., we explain the surprising connection between elliptic curves and
certain modular forms, and justify which modular forms correspond
to elliptic curves.
In order to make this book as self-contained as possible, I have
also included five appendices with concise introductions to topics that
some students may not have encountered in their classes yet. Appen-
dix A is a quick reference guide to two popular software packages:
PARI and Sage. Throughout the book, we strongly recommend that
the reader try to find examples and do calculations using one of these
two packages. Appendix B is a brief summary of complex analysis.
Due to space limitations we only include definitions, a few examples,
and a list of the main theorems in complex analysis; for a full
treatment see [Ahl79], for instance. In Appendix C we introduce
the projective line and the projective plane. The p-adic integers and
the p-adic numbers are treated in Appendix D (for a complete
reference, see [Gou97]). Finally, in Appendix E we list infinite families
of elliptic curves over Q, one family for each of the possible torsion
subgroups over Q.
I would like to emphasize once again that this book is, by no
means, a thorough treatment of elliptic curves and modular forms.
The theory is far too vast to be covered in one single volume, and the
proofs are far too technical for an undergraduate student. Therefore,
the humble goals of this text are to provide a big picture of the vast and
fast-growing theory, and to be an “advertisement” for undergraduates
of these very active and exciting areas of number theory. The author’s
only hope is that, after reading this text, students will feel compelled
to study elliptic curves and modular forms in depth, and in all their
full glory.
There are many excellent references that I would recommend to
the students, and that I have frequently consulted in the preparation
of this book:

(1) There are not that many books on these subjects at the
undergraduate level. However, Silverman and Tate’s book
[SiT92] is an excellent introduction to elliptic curves for
undergraduates. Washington’s book [Was08] is also acces-
sible for undergraduates and emphasizes the cryptography
applications of elliptic curves. Stein’s book [Ste08] also has
an interesting chapter on elliptic curves.

(2) There are several graduate-level texts on elliptic curves.
Silverman’s book [Sil86] is the standard reference, but Milne’s
[Mil06] is also an excellent introduction to the theory of
elliptic curves (and also includes a chapter on modular forms).
Before reading Silverman or Milne, the reader would benefit
from studying some algebraic geometry and algebraic number
theory. (Milne’s book does not require as much algebraic
geometry as Silverman’s.)
(3) The theory of modular forms and L-functions is definitely
a graduate topic, and the reader will need a strong back-
ground in algebra to understand all the fine details. Dia-
mond and Shurman’s book [DS05] contains a neat, modern
and thorough account of the theory of modular forms (in-
cluding much information about the modularity theorem).
Koblitz’s book [Kob93] is also a very nice introduction to
the theory of elliptic curves and modular forms (and includes
a lot of information about the congruent number problem).
Chapter 5 in Milne’s book [Mil06] contains a good, concise
overview of the subject. Serre’s little book [Ser77] is always
worth reading and also contains an introduction to modular
forms. Miyake’s book [Miy06] is a very useful reference.
(4) Finally, if the reader is interested in computations, we rec-
ommend Cremona’s [Cre97] or Stein’s [Ste07] book. If the
reader wants to play with fundamental domains of modular
curves, try Helena Verrill’s applet [Ver05].
I would like to thank the organizers of the undergraduate summer
school at PCMI, Aaron Bertram and Andrew Bernoff, for giving me
the opportunity to lecture in such an exciting program. Also, I would
like to thank Ander Steele and Aaron Wood for numerous corrections
and comments on an early draft. Last, but not least, I would like to
express my gratitude to Keith Conrad, David Pollack and William
Stein, whose abundant comments and suggestions have improved this
manuscript much more than it would be safe to admit.

Álvaro Lozano-Robledo
Chapter 1

Introduction

Notation:
$\mathbb{N} = \{1, 2, 3, \ldots\}$ is the set of natural numbers.
$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ is the ring of integers.
$\mathbb{Q} = \{\frac{m}{n} : m, n \in \mathbb{Z},\ n \neq 0\}$ is the field of rational numbers.
$\mathbb{R}$ is the field of real numbers.
$\mathbb{C} = \{a + bi : a, b \in \mathbb{R},\ i^2 = -1\}$ is the field of complex numbers.
In this chapter, we introduce elliptic curves, modular forms and L-
functions through examples that motivate the definitions.

1.1. Elliptic curves

For the time being, we define an elliptic curve to be any equation of
the form
\[ y^2 = x^3 + ax^2 + bx + c \]
with $a, b, c \in \mathbb{Z}$ and such that the polynomial $x^3 + ax^2 + bx + c$ does
not have repeated roots. See Section 2.2 for a precise definition.

Example 1.1.1. Are there three consecutive integers whose product
is a perfect square?
There are some trivial examples that involve the number zero, for
example, 0, 1 and 2, whose product equals $0 \cdot 1 \cdot 2 = 0 = 0^2$, a square.
Are there any non-trivial examples? If we try to assign variables to
our problem, we see that we are trying to find solutions to
\[ y^2 = x(x + 1)(x + 2) \tag{1.1} \]
with $x, y \in \mathbb{Z}$ and $y \neq 0$. Equation (1.1) defines an elliptic curve. It
turns out that there are no integral solutions other than the trivial
ones (see Exercise 1.4.1). Are there rational solutions, i.e., are there
solutions with $x, y \in \mathbb{Q}$? This is a more delicate question, but the
answer is still no (we will prove it in Example 2.7.6). Here is a similar
question, with a very different answer:
• Are there three integers that differ by 5, i.e., $x$, $x + 5$ and
$x + 10$, and whose product is a perfect square?
In this case, we are trying to find solutions to $y^2 = x(x+5)(x+10)$
with $x, y \in \mathbb{Z}$. As in the previous example, there are trivial solutions
(those which involve 0) but in this case, there are non-trivial solutions
as well:
\[ (-9) \cdot (-9 + 5) \cdot (-9 + 10) = (-9) \cdot (-4) \cdot 1 = 36 = 6^2, \]
\[ 40 \cdot (40 + 5) \cdot (40 + 10) = 40 \cdot 45 \cdot 50 = 90000 = 300^2. \]
Moreover, there are also rational solutions, which are far from obvious:
\[ \frac{5}{4} \cdot \left(\frac{5}{4} + 5\right) \cdot \left(\frac{5}{4} + 10\right) = \left(\frac{75}{8}\right)^2, \]
\[ \left(-\frac{50}{9}\right) \cdot \left(-\frac{50}{9} + 5\right) \cdot \left(-\frac{50}{9} + 10\right) = \left(\frac{100}{27}\right)^2, \]
and, in fact, there are infinitely many rational solutions! Here are
some of the x-coordinates that work:
\[ x = -9,\ 40,\ \frac{5}{4},\ \frac{-50}{9},\ \frac{961}{144},\ \frac{7200}{961},\ -\frac{12005}{1681},\ -\frac{16810}{2401},\ -\frac{27910089}{5094049},\ \ldots \]
In Sections 2.9 and 2.10 we will explain a method to find rational
points on elliptic curves and, in Exercise 2.12.23, the reader will
calculate all the rational points of $y^2 = x(x + 5)(x + 10)$.
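
The claims in Example 1.1.1 are easy to test numerically. The following Python sketch (the helper names are illustrative and do not come from the text) uses exact rational arithmetic to check that each listed x-coordinate makes x(x + 5)(x + 10) a rational square, and it searches an arbitrarily chosen range of integers for non-trivial solutions of both equations. A bounded search is only evidence, not a proof.

```python
from fractions import Fraction
from math import isqrt


def is_rational_square(q):
    """Return True if the rational number q is the square of a rational."""
    q = Fraction(q)
    if q < 0:
        return False
    # A fraction in lowest terms is a square exactly when its numerator
    # and denominator are both perfect squares.
    num, den = q.numerator, q.denominator
    return isqrt(num) ** 2 == num and isqrt(den) ** 2 == den


def product(x, step):
    """Return x * (x + step) * (x + 2*step)."""
    return x * (x + step) * (x + 2 * step)


def nontrivial_integer_solutions(step, bound):
    """Integers x with |x| <= bound whose product is a nonzero square."""
    return [x for x in range(-bound, bound + 1)
            if product(x, step) != 0 and is_rational_square(product(x, step))]


# x-coordinates listed in the text for y^2 = x(x + 5)(x + 10).
listed_x = [
    Fraction(-9), Fraction(40), Fraction(5, 4), Fraction(-50, 9),
    Fraction(961, 144), Fraction(7200, 961), Fraction(-12005, 1681),
    Fraction(-16810, 2401), Fraction(-27910089, 5094049),
]

all_listed_work = all(is_rational_square(product(x, 5)) for x in listed_x)
consecutive = nontrivial_integer_solutions(1, 1000)     # bound is arbitrary
spaced_by_five = nontrivial_integer_solutions(5, 1000)  # bound is arbitrary
print(all_listed_work, consecutive, spaced_by_five)
```

Working with `Fraction` avoids floating-point error, so a reported square really is one.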
Example 1.1.2 (The Congruent Number Problem). We say that
$n \geq 1$ is a congruent number if there exists a right triangle whose
sides are rational numbers and whose area equals $n$. What natural
numbers are congruent?
For instance, the number 6 is congruent, because the right triangle
with sides of length $(a, b, c) = (3, 4, 5)$ has area equal to $\frac{3 \cdot 4}{2} = 6$.
Similarly, the number 30 is the area of the right triangle with sides
(5, 12, 13); thus, 30 is a congruent number.

Figure 1. A right triangle of area 5 and rational sides.

The number 5 is congruent but there is no right triangle with
integer sides and area equal to 5. However, our definition allowed
rational sides, and the triangle with sides $\left(\frac{3}{2}, \frac{20}{3}, \frac{41}{6}\right)$ has area exactly
5. We do not allow, however, triangles with irrational sides even if
the area is an integer. For example, the right triangle $(1, 2, \sqrt{5})$ has
area 1, but that does not imply that 1 is a congruent number (in fact,
1 is not a congruent number, as we shall see below).
The congruent number problem is one of the oldest open problems
in number theory. For more than a millennium, mathematicians have
attempted to provide a characterization of all congruent numbers.
The oldest written record of the problem dates back to the early
Middle Ages, when it appeared in an Arab manuscript written before
972 (a later 10th century manuscript written by Mohammed Ben
Alcohain would go as far as to claim that the principal object of the
theory of rational right triangles is to find congruent numbers). It is
known that Leonardo Pisano, a.k.a. Fibonacci, was challenged around
1220 by Johannes of Palermo to find a rational right triangle of area
$n = 5$, and Fibonacci found the triangle $\left(\frac{3}{2}, \frac{20}{3}, \frac{41}{6}\right)$. We will explain a
method to find this triangle below. In 1225, Fibonacci wrote a more
general treatment about the congruent number problem, in which he
stated (without proof) that if n is a perfect square, then n cannot
be a congruent number. The proof of such a claim had to wait until
Pierre de Fermat (1601-1665) settled that the number 1 (and every
square number) is not a congruent number (a result that he showed
in order to prove the case n = 4 of Fermat’s last theorem).
The connection between the congruent number problem and elliptic
curves is as follows:

Proposition 1.1.3. The number $n > 0$ is congruent if and only if the
curve $y^2 = x^3 - n^2 x$ has a point $(x, y)$ with $x, y \in \mathbb{Q}$ and $y \neq 0$. More
precisely, there is a one-to-one correspondence $C_n \longleftrightarrow E_n$ between
the following two sets:
\[ C_n = \left\{(a, b, c) : a^2 + b^2 = c^2,\ \frac{ab}{2} = n\right\}, \]
\[ E_n = \{(x, y) : y^2 = x^3 - n^2 x,\ y \neq 0\}. \]
Mutually inverse correspondences $f : C_n \to E_n$ and $g : E_n \to C_n$ are
given by
\[ f((a, b, c)) = \left(\frac{nb}{c - a},\ \frac{2n^2}{c - a}\right), \qquad g((x, y)) = \left(\frac{x^2 - n^2}{y},\ \frac{2nx}{y},\ \frac{x^2 + n^2}{y}\right). \]
The reader can provide a proof (see Exercise 1.4.3). For example,
the curve $E : y^2 = x^3 - 25x$ has a point $(-4, 6)$ that corresponds to
the triangle $\left(\frac{3}{2}, \frac{20}{3}, \frac{41}{6}\right)$. But $E$ has other points, such as $\left(\frac{1681}{144}, \frac{62279}{1728}\right)$
that corresponds to the triangle
\[ \left(\frac{1519}{492},\ \frac{4920}{1519},\ \frac{3344161}{747348}\right) \]
which also has area equal to 5. See Figure 2.
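
The maps f and g of Proposition 1.1.3 can be tried out directly. The sketch below is a minimal Python transcription with exact fractions; the function names are hypothetical. Note that g applied to (−4, 6) returns the sides 3/2 and 20/3 with negative signs, so the triangle is read off from the absolute values.

```python
from fractions import Fraction


def triangle_to_point(a, b, c, n):
    """The map f: a right triangle (a, b, c) of area n gives a point on
    y^2 = x^3 - n^2 x."""
    return (n * b / (c - a), 2 * n * n / (c - a))


def point_to_triangle(x, y, n):
    """The map g: a point (x, y) with y != 0 on y^2 = x^3 - n^2 x gives
    a triple (a, b, c) with a^2 + b^2 = c^2 and ab/2 = n."""
    return ((x * x - n * n) / y, 2 * n * x / y, (x * x + n * n) / y)


def on_curve(x, y, n):
    """Check that (x, y) satisfies y^2 = x^3 - n^2 x."""
    return y * y == x ** 3 - n * n * x


n = 5
P = (Fraction(1681, 144), Fraction(62279, 1728))
a, b, c = point_to_triangle(*P, n)
print(a, b, c)               # sides of the triangle attached to P
print(a * b / 2)             # its area
print(triangle_to_point(a, b, c, n) == P)

# The point (-4, 6) yields the triangle (3/2, 20/3, 41/6) up to sign.
small = point_to_triangle(Fraction(-4), Fraction(6), n)
print([abs(side) for side in small])
```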

Figure 2. Two rational points on the curve $y^2 = x^3 - 25x$.

Today, there are partial results toward the solution of the congruent
number problem, and strong results that rely heavily on famous
(and widely accepted) conjectures, but we do not have a full answer
yet. For instance, in 1975 (see [Ste75]), Stephens showed that the
Birch and Swinnerton-Dyer conjecture (which we will discuss in
Section 5.2) implies that any positive integer $n \equiv 5$, 6 or 7 mod 8 is a
congruent number. For example, $n = 157 \equiv 5 \bmod 8$ must be a
congruent number and, indeed, Don Zagier has exhibited a right triangle
$(a, b, c)$ whose area equals 157. The hypotenuse of the simplest such
triangle is:
\[ c = \frac{2244035177043369699245575130906674863160948472041}{8912332268928859588025535178967163570016480830}. \]
In Example 5.2.7 we will see an application of the conjecture of Birch
and Swinnerton-Dyer to find a rational point $P$ on $y^2 = x^3 - 157^2 x$,
which corresponds to a right triangle of area 157 via the correspondence
in Proposition 1.1.3.
The best known result on the congruent number problem is due
to J. Tunnell:

Theorem 1.1.4 (Tunnell, 1983, [Tun83]). If $n$ is an odd square-free
positive integer and $n$ is the area of a right triangle with rational
sides, then the following cardinalities are equal:
\[ \#\{(x, y, z) \in \mathbb{Z}^3 : n = 2x^2 + y^2 + 32z^2\} = \frac{1}{2}\, \#\{(x, y, z) \in \mathbb{Z}^3 : n = 2x^2 + y^2 + 8z^2\} \]
and, if $n$ is even,
\[ \#\left\{(x, y, z) \in \mathbb{Z}^3 : \frac{n}{2} = 4x^2 + y^2 + 32z^2\right\} = \frac{1}{2}\, \#\left\{(x, y, z) \in \mathbb{Z}^3 : \frac{n}{2} = 4x^2 + y^2 + 8z^2\right\}. \]
Moreover, if the Birch and Swinnerton-Dyer conjecture is true, then,
conversely, these equalities imply that $n$ is a congruent number.

For example, for $n = 2$ we have $\frac{n}{2} = 1 = 4x^2 + y^2 + 32z^2$ if and
only if $x = z = 0$ and $y = \pm 1$, so the left-hand side of the appropriate
equation in Tunnell’s theorem is equal to 2. However, the right-hand
side is equal to 1 and the equality does not hold. Hence, 2 is not a
congruent number.
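
Tunnell's criterion is a finite computation, so it can be checked by brute force. The following Python sketch counts the integer solutions on each side of the equalities in Theorem 1.1.4; the function names are illustrative, and the input is assumed (not verified) to be a square-free positive integer. If the test fails, n is not congruent; if it passes, n is congruent provided the Birch and Swinnerton-Dyer conjecture holds.

```python
from math import isqrt


def count_representations(target, a, b, c):
    """Count (x, y, z) in Z^3 with a*x^2 + b*y^2 + c*z^2 == target."""
    total = 0
    bound_x = isqrt(target // a)
    for x in range(-bound_x, bound_x + 1):
        rest_x = target - a * x * x
        bound_y = isqrt(rest_x // b)
        for y in range(-bound_y, bound_y + 1):
            rest_y = rest_x - b * y * y
            bound_z = isqrt(rest_y // c)
            for z in range(-bound_z, bound_z + 1):
                if c * z * z == rest_y:
                    total += 1
    return total


def passes_tunnell_test(n):
    """Check the equality of Theorem 1.1.4 for a square-free n >= 1."""
    if n % 2 == 1:
        left = count_representations(n, 2, 1, 32)
        right = count_representations(n, 2, 1, 8)
    else:
        left = count_representations(n // 2, 4, 1, 32)
        right = count_representations(n // 2, 4, 1, 8)
    # left == right / 2, written without fractions
    return 2 * left == right


for n in (1, 2, 3, 5, 6, 7, 157):
    print(n, passes_tunnell_test(n))
```

For n = 2 the two counts are 2 and 2, so the left side is 2 and half the right side is 1, matching the computation in the text.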
For a complete historical overview of the congruent number prob-
lem, see [Dic05], Ch. XVI. The book [Kob93] contains a thorough
modern treatment of the problem. The reader may also ﬁnd useful an
expository paper [Con08] on the congruent number problem, written
by Keith Conrad. Another neat exposition, more computational in
nature (using Sage), appears in [Ste08], Section 6.5.3.

Example 1.1.5 (Fermat’s last theorem). Let $n \geq 3$. Are there any
solutions to $x^n + y^n = z^n$ in integers $x, y, z$ with $xyz \neq 0$? The
answer is no. In 1637, Pierre de Fermat wrote in the margin of a
book (Diophantus’ Arithmetica; see Figure 9 in Section 5.5) that he
had found a marvellous proof, but the margin was too small to contain
it. Since then, many mathematicians tried in vain to demonstrate (or
disprove!) this claim. A proof was finally found in 1995 by Andrew
Wiles ([Wil95]). We shall discuss the proof in some more detail
in Section 5.5. For now, we will outline the basic structure of the
argument.

Figure 3. Pierre de Fermat (1601-1665).

First, it is easy to show that, to prove the theorem, it suffices to
show the cases $n = 4$ and $n = p \geq 3$, a prime. It is not difficult to
show that $x^4 + y^4 = z^4$ has no non-trivial solutions in $\mathbb{Z}$ (this was first
shown by Fermat). Now, suppose that $p \geq 3$ and $a, b, c$ are integers
with $abc \neq 0$ and $a^p + b^p = c^p$. Gerhard Frey conjectured that if such
a triple of integers exists, then the elliptic curve
\[ E : y^2 = x(x - a^p)(x + b^p) \]
would have some unexpected properties that would contradict a
well-known conjecture that Taniyama, Shimura and Weil had formulated
in the 1950’s. Their conjecture spelled out a strong connection be-
tween elliptic curves and modular forms, which we will describe in
Section 5.4. Ken Ribet proved that, indeed, such a curve would con-
tradict the Taniyama-Shimura-Weil (TSW) conjecture. Finally, An-
drew Wiles was able to prove the TSW conjecture in a special case
that would cover the hypothetical curve E. Therefore, E cannot exist
and the triple (a, b, c) cannot exist, either.
The Taniyama-Shimura-Weil conjecture (Conjecture 5.4.5), i.e.,
the modularity theorem (Theorem 5.4.6), was fully proved by Christophe
Breuil, Brian Conrad, Fred Diamond, and Richard Taylor in their article
[BCDT01].

1.2. Modular forms

Let $\mathbb{C}$ be the complex plane and let $\mathbb{H}$ be the upper half of the complex
plane, i.e., $\mathbb{H} = \{a + bi : a, b \in \mathbb{R},\ b > 0\}$. A modular form is a function
$f : \mathbb{H} \to \mathbb{C}$ that has several relations among its values (which we will
specify in Definitions 4.1.3 and 4.2.1). In particular, the values of the
function $f$ satisfy several types of periodicity relations. For example,
the modular forms for $\mathrm{SL}(2, \mathbb{Z})$ satisfy, among other properties, the
following:
• $f(z) = f(z + 1)$ for all $z \in \mathbb{H}$, and
• $f\left(-\frac{1}{z}\right) = z^k f(z)$ for all $z \in \mathbb{H}$. The number $k$ is an integer
called the weight of the modular form.
We will describe modular forms in detail in Chapter 4. Let us see
some examples that motivate our interest in these functions.

Example 1.2.1 (Representations of integers as sums of squares). Is
the number $n > 0$ a sum of two (integer) squares? In other words,
are there $a, b \in \mathbb{Z}$ such that $n = a^2 + b^2$? And if so, in how many
different ways can you represent $n$ as a sum of two squares?
For instance, the number $n = 3$ cannot be represented as a sum
of two squares but the number $n = 5$ has 8 distinct representations:
\[ 5 = (\pm 1)^2 + (\pm 2)^2 = (\pm 2)^2 + (\pm 1)^2. \]
Notice that here we consider $(-1)^2 + 2^2$, $1^2 + 2^2$ and $2^2 + 1^2$ as distinct
representations of 5. A general formula for the number of representations
of an integer $n$ as a sum of 2 squares, due to Lagrange, Gauss
and Jacobi, is given by
\[ S_2(n) = 2\left(1 + \left(\frac{-1}{n}\right)\right) \sum_{d \mid n} \left(\frac{-1}{d}\right), \tag{1.2} \]
where $\left(\frac{m}{n}\right)$ is the Jacobi symbol and $\sum_{d \mid n}$ is a sum over all positive
divisors of $n$ (including 1 and $n$). Here we just need the easiest values
$\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2}$ of the Jacobi symbol (for odd $n$). Let us see that
the formula works:
\[ S_2(3) = 2\left(1 + \left(\frac{-1}{3}\right)\right) \sum_{d \mid 3} \left(\frac{-1}{d}\right) = 2(1 + (-1))(1 + (-1)) = 0, \]
\[ S_2(5) = 2\left(1 + \left(\frac{-1}{5}\right)\right) \sum_{d \mid 5} \left(\frac{-1}{d}\right) = 2(1 + 1)(1 + 1) = 8, \]
and $S_2(9) = 4$. Indeed, the number nine has 4 different representations:
$9 = (\pm 3)^2 + 0^2 = 0^2 + (\pm 3)^2$. Let us explore other similar
questions.
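
Formula (1.2) can be compared against a direct count. The sketch below is a small Python check under the assumption that n is odd, since the value $(-1)^{(n-1)/2}$ of the symbol is only used for odd arguments; the function names are illustrative.

```python
from math import isqrt


def S2_brute(n):
    """Count pairs (a, b) of integers with a^2 + b^2 == n."""
    count = 0
    r = isqrt(n)
    for a in range(-r, r + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            # b and -b are distinct representations unless b == 0
            count += 1 if b == 0 else 2
    return count


def symbol_minus_one(m):
    """The symbol (-1/m) = (-1)^((m-1)/2) for odd m > 0."""
    return 1 if m % 4 == 1 else -1


def S2_formula(n):
    """Formula (1.2), evaluated for odd n only."""
    if n % 2 == 0:
        raise ValueError("this sketch only treats odd n")
    divisor_sum = sum(symbol_minus_one(d)
                      for d in range(1, n + 1) if n % d == 0)
    return 2 * (1 + symbol_minus_one(n)) * divisor_sum


for n in (3, 5, 9, 25, 45):
    print(n, S2_brute(n), S2_formula(n))
```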
Let $n > 0$ and $k \geq 2$. Is the number $n > 0$ a sum of $k$ (integer)
squares? In other words, are there $a_1, \ldots, a_k \in \mathbb{Z}$ such that
$n = a_1^2 + \cdots + a_k^2$? And if so, in how many different ways can you represent
$n$ as a sum of $k$ squares? Lagrange showed that every natural number
can be represented as a sum of $k \geq 4$ squares, but how many different
representations are there?
Let $S_k(n)$ be the number of representations of $n$ as a sum of $k$
squares. Determining exact formulas for $S_k(n)$ is a classical problem
in number theory. There are exact formulas known in a number of
cases (e.g. Eq. (1.2)). The formulas for $k = 4$, 6 and 8 are due to
Jacobi and Siegel. We write $n = 2^{\nu} g$, with $\nu \geq 0$ and odd $g > 0$:
\[ S_4(n) = 8 \sum_{d \mid n,\ 4 \nmid d} d, \]
\[ S_6(n) = \left(\left(\frac{-1}{g}\right) 2^{2\nu + 4} - 4\right) \sum_{d \mid g} \left(\frac{-1}{d}\right) d^2, \]
\[ S_8(n) = 16 \cdot \begin{cases} \sum_{d \mid n} d^3 & \text{if } n \text{ is odd}, \\ \sum_{d \mid n} d^3 - 2 \sum_{d \mid g} d^3 & \text{if } n \text{ is even}. \end{cases} \]
For example, $S_4(4) = 8(1 + 2) = 24$ and, indeed,
\[ 4 = (\pm 1)^2 + (\pm 1)^2 + (\pm 1)^2 + (\pm 1)^2 = (\pm 2)^2 + 0 + 0 + 0 \]
\[ = 0 + (\pm 2)^2 + 0 + 0 = 0 + 0 + (\pm 2)^2 + 0 = 0 + 0 + 0 + (\pm 2)^2. \]
So there are 16 + 2 + 2 + 2 + 2 = 24 possible representations of the
number 4 as a sum of 4 squares. Notice that $S_4(2) = S_4(4)$. In how
many ways can 4 be represented as a sum of 6 squares? We write
$4 = 2^2 \cdot 1$, so $\nu = 2$ and $g = 1$, and thus,
\[ S_6(4) = \left(\left(\frac{-1}{1}\right) 2^{2 \cdot 2 + 4} - 4\right) \left(\frac{-1}{1}\right) \cdot 1^2 = (2^8 - 4) \cdot 1 = 252. \]
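
The three closed formulas can be tested against a direct count of representations. In the Python sketch below, `S(k, n)` counts representations recursively by choosing one coordinate at a time; all names are illustrative, and the comparison range in the last lines is an arbitrary choice.

```python
from functools import lru_cache
from math import isqrt


@lru_cache(maxsize=None)
def S(k, n):
    """Number of (a_1, ..., a_k) in Z^k with a_1^2 + ... + a_k^2 == n."""
    if k == 0:
        return 1 if n == 0 else 0
    r = isqrt(n)
    return sum(S(k - 1, n - a * a) for a in range(-r, r + 1))


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def symbol_minus_one(m):
    """The symbol (-1/m) = (-1)^((m-1)/2) for odd m > 0."""
    return 1 if m % 4 == 1 else -1


def odd_part(n):
    """Write n = 2^nu * g with g odd and return (nu, g)."""
    nu = 0
    while n % 2 == 0:
        n //= 2
        nu += 1
    return nu, n


def S4_formula(n):
    return 8 * sum(d for d in divisors(n) if d % 4 != 0)


def S6_formula(n):
    nu, g = odd_part(n)
    factor = symbol_minus_one(g) * 2 ** (2 * nu + 4) - 4
    return factor * sum(symbol_minus_one(d) * d * d for d in divisors(g))


def S8_formula(n):
    _, g = odd_part(n)
    total = sum(d ** 3 for d in divisors(n))
    if n % 2 == 0:
        total -= 2 * sum(d ** 3 for d in divisors(g))
    return 16 * total


print(S(4, 4), S4_formula(4))   # both 24, as computed in the text
print(S(6, 4), S6_formula(4))   # both 252, as computed in the text
agree = all(S(4, n) == S4_formula(n) and S(6, n) == S6_formula(n)
            and S(8, n) == S8_formula(n) for n in range(1, 41))
print(agree)
```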

The formulas for $S_k(n)$ given above are derived using the theory of
modular forms, as follows. We define a formal power series $\Theta(q)$ by
\[ \Theta(q) = \sum_{j=-\infty}^{\infty} q^{j^2} \]
and, for $k \geq 2$, consider the power series expansion of the kth power
of $\Theta$:
\[ (\Theta(q))^k = \left(\sum_{j=-\infty}^{\infty} q^{j^2}\right)^k = \left(\sum_{a_1=-\infty}^{\infty} q^{a_1^2}\right) \cdots \left(\sum_{a_k=-\infty}^{\infty} q^{a_k^2}\right) = \sum_{n \geq 0} c_n q^n. \]
What is the nth coefficient, $c_n$, of $\Theta^k$? If the readers stare at the
previous equation for a while, they will find that $c_n$ is given by
\[ c_n = \#\{(a_1, \ldots, a_k) \in \mathbb{Z}^k : a_1^2 + \cdots + a_k^2 = n\}. \]
Therefore, $c_n = S_k(n)$ and $(\Theta(q))^k = \sum_{n \geq 0} S_k(n) q^n$. In other words,
$\Theta^k$ is a generating function for $S_k(n)$. But, how do we find closed
formulas for $S_k(n)$? This is where the theory of modular forms becomes
particularly useful, for it provides an alternative description of
the coefficients of $\Theta^k$.
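
The generating-function statement can be made concrete by multiplying truncated power series. The following Python sketch represents a series by its list of coefficients of $q^0, q^1, \ldots$ and raises $\Theta$ to the kth power; the function names and the truncation length are illustrative choices.

```python
def theta_series(num_terms):
    """Coefficients of Theta(q) = sum over j of q^(j^2), for exponents
    0, 1, ..., num_terms - 1."""
    coeffs = [0] * num_terms
    j = 0
    while j * j < num_terms:
        # j and -j contribute the same power q^(j^2), except for j == 0
        coeffs[j * j] += 1 if j == 0 else 2
        j += 1
    return coeffs


def multiply_truncated(a, b):
    """Product of two power series, truncated to len(a) coefficients."""
    n = len(a)
    out = [0] * n
    for i, a_i in enumerate(a):
        if a_i:
            for j in range(n - i):
                out[i + j] += a_i * b[j]
    return out


def theta_power(k, num_terms):
    """Coefficients c_0, ..., c_(num_terms - 1) of Theta(q)^k."""
    result = [1] + [0] * (num_terms - 1)
    theta = theta_series(num_terms)
    for _ in range(k):
        result = multiply_truncated(result, theta)
    return result


print(theta_power(2, 10))  # c_3 = 0, c_5 = 8, c_9 = 4, matching S_2
print(theta_power(4, 5))   # [1, 8, 24, 32, 24]
```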
It turns out that, for even $k \geq 2$, the function $\Theta^k$ is a modular
form of weight $\frac{k}{2}$ (more precisely, it is a modular form for the group
$\Gamma_1(4)$), and the space of all modular forms of weight $\frac{k}{2}$, denoted by
$M_{k/2}(\Gamma_1(4))$, is finite dimensional (we will carefully define all these
terms later). For instance, let $k = 4$. Then $M_2(\Gamma_1(4))$, the space of
modular forms of weight $\frac{4}{2} = 2$ for $\Gamma_1(4)$, is a 2-dimensional $\mathbb{C}$-vector
space and a basis is given by modular forms with q-expansions:
\[ f(q) = 1 + 24q^2 + 24q^4 + 96q^6 + 24q^8 + 144q^{10} + 96q^{12} + \cdots \]
\[ g(q) = q + 4q^3 + 6q^5 + 8q^7 + 13q^9 + 12q^{11} + 14q^{13} + \cdots. \]
Therefore, $\Theta^4(q) = \lambda f(q) + \mu g(q)$ for some constants $\lambda, \mu \in \mathbb{C}$. We
may compare q-expansions to find the values of $\lambda$ and $\mu$:
\[ \Theta^4(q) = \sum_{n \geq 0} S_4(n) q^n = 1 + 8q + 24q^2 + 32q^3 + 24q^4 + \cdots \]
\[ \lambda f(q) + \mu g(q) = \lambda + \mu q + 24\lambda q^2 + 4\mu q^3 + \cdots. \]
Therefore, it is clear that $\lambda = 1$ and $\mu = 8$, so $\Theta^4 = f + 8g$. Since
the expansions of $f$ and $g$ are easy to calculate (for example, using
Sage; see Appendix A.2), we can easily calculate the coefficients of
the q-expansion of $\Theta^4$ and, therefore, values of $S_4(n)$.
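
As a sanity check on the comparison of q-expansions, the sketch below takes the coefficients of f and g exactly as printed above (through $q^{13}$), counts $S_4(n)$ by direct enumeration, reads off $\lambda$ and $\mu$ from the first two coefficients, and confirms that the remaining coefficients agree. This is a plain Python illustration with hypothetical names; it does not compute f and g as modular forms, which the text suggests doing in Sage.

```python
from collections import Counter
from itertools import product
from math import isqrt

# Coefficients of f and g as printed in the text, keyed by exponent.
f = {0: 1, 2: 24, 4: 24, 6: 96, 8: 24, 10: 144, 12: 96}
g = {1: 1, 3: 4, 5: 6, 7: 8, 9: 13, 11: 12, 13: 14}


def theta4_coefficients(max_exponent):
    """S_4(n) for 0 <= n <= max_exponent, by enumerating 4-tuples."""
    r = isqrt(max_exponent)
    tally = Counter(sum(a * a for a in t)
                    for t in product(range(-r, r + 1), repeat=4))
    return [tally[n] for n in range(max_exponent + 1)]


c = theta4_coefficients(13)

# f starts with 1 and has no q term; g starts with q.
lam = c[0] // f[0]
mu = c[1] // g[1]
matches = all(c[n] == lam * f.get(n, 0) + mu * g.get(n, 0)
              for n in range(14))
print(lam, mu, matches)
```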
The exact formulas given above for $S_k(n)$, however, follow from
some deeper facts. Here is a sketch of the ideas involved (the reader
may skip these details for now and return here after reading Chapter
4): given $\Theta^4 = \sum c_n q^n$ and $F(q) = \sum \left(\sum_{d \mid n} d\right) q^n$, one can find an
eigenvector $G(q) = \sum b_n q^n$ for a collection of linear maps $T_n$ (the
so-called Hecke operators, $T_n : M_2(\Gamma_1(4)) \to M_2(\Gamma_1(4))$) among spaces
of modular forms, i.e., $T_n(G) = \lambda_n G$ for $n > 1$, and the eigenvalues
$\lambda_n = b_n / b_1 = \sum_{d \mid n} d$. Moreover, the eigenvector $G$ can be written
explicitly as a combination of $\Theta^4$ and $F$. Finally, one can show that
the coefficients $c_n$ must be given by the formula $c_n = 8 \sum_{d \mid n,\ 4 \nmid d} d$
(see [Kob93], III, §5, for more details).

1.3. L-functions
An L-function is a function $L(s)$, usually given as an infinite series of
the form
\[ L(s) = \sum_{n=1}^{\infty} a_n n^{-s} = \sum_{n=1}^{\infty} \frac{a_n}{n^s} = a_1 + \frac{a_2}{2^s} + \frac{a_3}{3^s} + \cdots \]
with some coefficients $a_n \in \mathbb{C}$. Typically, the function $L(s)$ converges
for all complex numbers $s$ in some half-plane (i.e., those $s$
with real part larger than some constant), and in many cases $L(s)$
has an analytic or meromorphic continuation to the whole complex
plane. Mathematicians are interested in L-functions because they
are objects from analysis that, sometimes, capture very interesting
algebraic information.

Example 1.3.1 (The Riemann zeta function). The Riemann zeta
function, usually denoted by $\zeta(s)$, is perhaps the most famous
L-function:
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots. \]
The reader may already know some values of $\zeta$. For example $\zeta(2) =
\sum \frac{1}{n^2}$ is convergent by the p-series test, and its value is $\pi^2/6$ (this
value can be computed using Fourier analysis and Parseval’s equality).
The connection between $\zeta(s)$ and number theory comes from the fact
that $\zeta(s)$ has an Euler product:
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \frac{1}{1 - 2^{-s}} \cdot \frac{1}{1 - 3^{-s}} \cdot \frac{1}{1 - 5^{-s}} \cdots. \]
This Euler product is not difficult to establish (Exercise 1.4.8) and
has the very interesting consequence that any information on the
distribution of the zeros of $\zeta(s)$ can be translated into information
about the distribution of prime numbers among the natural numbers.
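
The Euler product can be observed numerically at s = 2, where both sides converge quickly. The Python sketch below compares a partial sum of the series, a partial product over primes, and the value $\pi^2/6$; the cut-offs are arbitrary and the function names are illustrative.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, flag in enumerate(sieve) if flag]


def zeta_partial_sum(s, terms):
    """Sum of 1/n^s for n = 1, ..., terms."""
    return sum(1 / n ** s for n in range(1, terms + 1))


def zeta_partial_product(s, bound):
    """Product of 1/(1 - p^(-s)) over primes p <= bound."""
    value = 1.0
    for p in primes_up_to(bound):
        value /= 1 - p ** (-s)
    return value


series_value = zeta_partial_sum(2, 100000)
product_value = zeta_partial_product(2, 10000)
print(series_value, product_value, math.pi ** 2 / 6)
```

Both approximations agree with $\pi^2/6$ to several decimal places, although neither truncation is exact.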

Example 1.3.2 (Dirichlet L-function). Let a, N ∈ N be relatively

prime integers. Are there inﬁnitely many primes p of the form a+kN
(i.e., p ≡ a mod N ) for k ≥ 0? The answer is yes and this fact,
known as Dirichlet’s theorem on primes in arithmetic progressions,
was ﬁrst proved by Dirichlet using a particular kind of L-function
that we know today as a Dirichlet L-function.
Let N > 0. A Dirichlet character (modulo N ) is a function
χ : (Z/N Z)× → C× that is a homomorphism of groups, i.e., χ(nm) =
χ(n)χ(m) for all n, m ∈ (Z/N Z)× . Notice that χ(n) ∈ C and
χ(n)ϕ(N ) = 1 for all gcd(n, N ) = 1. Therefore, χ(n) must be a root of
unity. We extend χ to Z as follows. Let a ∈ Z. If gcd(a, N ) = 1, then
χ(a) = χ(a mod N ). Otherwise, if gcd(a, N ) = 1, then χ(a) = 0.
A Dirichlet L-function is a function of the form
∞
χ(n)
L(s, χ) = ,
n=1
ns
1.3. L-functions 13

Figure 4. Johann Peter Gustav Lejeune Dirichlet (1805-

1859) and Georg Friedrich Bernhard Riemann (1826-1866).

where χ is a given Dirichlet character. For example, one can take χ0

to be the trivial Dirichlet character, i.e., χ0 (n) = 1 for all n ≥ 1. Then
L(s, χ0 ) is the Riemann zeta function ζ(s). Dirichlet L-functions also
have Euler products:
\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{1 - \chi(p) p^{-s}}. \]

The idea of the proof of Dirichlet’s theorem generalizes the following proof, due to Euler, of the infinitude of the primes. Consider $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \frac{1}{1 - p^{-s}}$ and suppose there are only finitely many primes. Then the product over all primes is finite, and therefore its value at s = 1 would be finite (a rational number, in fact). However, $\zeta(1) = \sum_{n=1}^{\infty} 1/n$ is the harmonic series, which diverges! Therefore, there must be infinitely many prime numbers.
Dirichlet adapted this argument by looking instead at a different
function:
\[ \Psi_{a,N}(s) = \sum_{p \equiv a \bmod N} \frac{1}{p^s}. \]

He showed that (a) for every non-trivial Dirichlet character χ modulo N, we have L(1, χ) ≠ 0 or ∞, and (b) this implies that $\Psi_{a,N}(1)$ diverges to ∞. Part (b) follows from the equality
\[ \log(\zeta(s)) + \sum_{\substack{\chi \bmod N \\ \chi \neq 1}} \chi(a)^{-1} \log(L(s, \chi)) = \varphi(N) \left( \sum_{p \equiv a \bmod N} \frac{1}{p^s} \right) + g(s), \]

where g(s) is a function with g(1) ﬁnite, and φ is the Euler φ-function.
Therefore, there cannot be a ﬁnite number of primes of the form
p ≡ a mod N .

Example 1.3.3 (Representations of integers as sums of squares). Is

the number n > 0 a sum of three integer squares? In Subsection 1.2,
we saw formulas for the number of representations of an integer as a
sum of k = 2, 4, 6 and 8 integer squares, but we avoided the same
question for odd k. The known formulas for S3 (n), S5 (n) and S7 (n)
involve values of Dirichlet L-functions.
Let us first define the Dirichlet character that we shall use here. The reader should be familiar with the Legendre symbol $\left(\frac{n}{p}\right)$ for an odd prime p, which is equal to 0 if p | n, equal to 1 if n is a square mod p, and equal to −1 if n is not a square mod p. Let m > 0 be a natural number with prime factorization $m = \prod_i p_i$ (the primes are not necessarily distinct). First we define
\[ \left(\frac{n}{2}\right) = \begin{cases} 0 & \text{if } n \text{ is even}, \\ 1 & \text{if } n \equiv \pm 1 \bmod 8, \\ -1 & \text{if } n \equiv \pm 3 \bmod 8. \end{cases} \]
Now we are ready to define the Kronecker symbol of n over m > 0 by
\[ \left(\frac{n}{m}\right) = \prod_i \left(\frac{n}{p_i}\right). \]
For any n > 0, the symbol $\left(\frac{-n}{\cdot}\right)$ induces a Dirichlet character $\chi_n$ defined by $\chi_n(a) = \left(\frac{-n}{a}\right)$, and we can define the associated L-function by
\[ L(s, \chi_n) = \sum_{a=1}^{\infty} \frac{\chi_n(a)}{a^s}. \]
We are ready to write down the formula for S3 (n), due to Gauss,
Dirichlet and Shimura (there are also formulas for S5 (n), due to Eisen-
stein, Smith, Minkowski and Shimura, and a formula for S7 (n), also
due to Shimura). For simplicity, let us assume that n is odd and
square free (for the utmost generality, please check [Shi02]):

\[ S_3(n) = \begin{cases} 0 & \text{if } n \equiv 7 \bmod 8, \\ \dfrac{24\sqrt{n}}{\pi} L(1, \chi_n) & \text{otherwise.} \end{cases} \]
The reader is encouraged to investigate this problem further by at-
tempting Exercises 1.4.6 and 1.4.7.
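As a sanity check of the formula for $S_3(n)$, the following Python sketch counts representations as a sum of three squares by brute force and, for n = 3, compares the count with a truncation of the series for $L(1, \chi_3)$. Two things here are our own assumptions rather than statements from the text: the function names, and the use of the standard fact that the Kronecker symbol $\left(\frac{-3}{a}\right)$ agrees with the non-trivial character modulo 3 (value 1 if a ≡ 1, −1 if a ≡ 2, and 0 if a ≡ 0 mod 3).

```python
import math

def count_three_squares(n):
    """Count ordered integer triples (x, y, z) with x^2 + y^2 + z^2 = n."""
    bound = math.isqrt(n)
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            rest = n - x * x - y * y
            if rest < 0:
                continue
            z = math.isqrt(rest)
            if z * z == rest:
                count += 1 if z == 0 else 2  # z = 0 once, otherwise z and -z
    return count

def chi_3(a):
    """Kronecker symbol (-3/a) for a >= 1, written as the character modulo 3."""
    return (0, 1, -1)[a % 3]

def s3_of_3_from_series(terms):
    """Approximate (24*sqrt(3)/pi) * L(1, chi_3) using the first `terms` terms."""
    partial_l = sum(chi_3(a) / a for a in range(1, terms + 1))
    return 24 * math.sqrt(3) / math.pi * partial_l

print(count_three_squares(3), s3_of_3_from_series(10_000))
print([count_three_squares(n) for n in (7, 15, 23, 31)])
```

The series converges slowly (it is only conditionally convergent), so the truncated value should be read as an approximation of the integer count.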

1.4. Exercises
Exercise 1.4.1. Use the divisibility properties of integers to show
that the only solutions to y 2 = x(x + 1)(x + 2) with x, y ∈ Z are
(0, 0), (−1, 0) and (−2, 0). (Hint: If a and b are relatively prime and
ab is a square, then a is a square and b is a square.)

Exercise 1.4.2. Find all the Pythagorean triples (a, b, c), i.e., a, b, c ∈ Z and $a^2 + b^2 = c^2$, such that $b^2 + c^2 = d^2$ for some d ∈ Z. In other words, find all the integers a, b, c, d such that (a, b, c) and (b, c, d) are both Pythagorean triples. (Hint: You may assume that $y^2 = x(x + 1)(x + 2)$ has no rational points other than (0, 0), (−1, 0) and (−2, 0).)

Exercise 1.4.3. Prove Proposition 1.1.3; i.e., show that f ((a, b, c)) is
a point in En , that g((x, y)) is a triangle in Cn and that f (g((x, y))) =
(x, y) and g(f ((a, b, c))) = (a, b, c).

Exercise 1.4.4. Calculate S4 (n), for n = 1, 3, 5, 6, by hand, using

Jacobi’s formula and also by ﬁnding all possible ways of writing n as
a sum of 4 squares.

Exercise 1.4.5. The goal of this problem is to ﬁnd the q-expansion

of Θ6 (q):

(1) Find by hand the values of S6 (n), for n = 0, 1, 2; i.e., ﬁnd

all possible ways to write n = 0, 1, 2 as a sum of 6 squares.
(2) Using Sage, calculate the dimension of $M_{k/2}(\Gamma_1(4))$ (see Appendix A.2) and a basis of modular forms for k = 6.
(3) Write $\Theta_6$ as a linear combination of the basis elements found in part 2.
(4) Use part 3 to write the q-expansion of $\Theta_6$ up to $O(q^{20})$.
(5) Use the expansion of $\Theta_6$ to verify that $S_6(4) = 252$. Also, calculate $S_6(19)$ using Jacobi’s formula and verify that it coincides with the coefficient of $\Theta_6$ in front of the $q^{19}$ term.
Exercise 1.4.6. Show that any integer n ≡ 7 mod 8 cannot be rep-
resented as a sum of three integer squares.
Exercise 1.4.7. Find the number of representations of n = 3 as a sum of 3 squares. Then compare your result with the value of the formula given in Example 1.3.3; i.e., use a computer to approximate
\[ S_3(3) = \frac{24\sqrt{3}}{\pi} L(1, \chi_3) = \frac{24\sqrt{3}}{\pi} \sum_{a=1}^{\infty} \frac{\left(\frac{-3}{a}\right)}{a} \]
by adding the first 10,000 terms of $L(1, \chi_3)$. Do the same for n = 5 and n = 11. Does the formula seem to work for n = 2? (Note: the command kronecker(-n,m) calculates the Kronecker symbol $\left(\frac{-n}{m}\right)$ in Sage.)
Exercise 1.4.8. Prove that the Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$ has an Euler product; i.e., prove the following formal equality of series
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}. \]
(Hint: There are two possible approaches:

Hint (a). Expand the right-hand side using the Fundamental Theorem of Arithmetic and the algebraic equality $\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k$. [This approach helps build an intuition about what is going on, but may be hard to write into a rigorous proof.]
Hint (b). Calculate $(1 - 1/2^s)\zeta(s)$ and $(1 - 1/3^s)(1 - 1/2^s)\zeta(s)$, etc.)
Chapter 2

Elliptic curves

In this chapter we summarize the main aspects of the theory of elliptic curves¹. Unfortunately, we will not be able to provide many of the proofs because they are beyond the scope of this course. If the reader is not familiar with projective geometry or needs a refresher, this is a good time to look at Appendix C or another reference (for example, [SK52] is a beautiful book on projective geometry).

2.1. Why elliptic curves?

A Diophantine equation is an equation given by a polynomial with
integer coeﬃcients, i.e.

(2.1) f (x1 , x2 , . . . , xr ) = 0

with $f(x_1, \ldots, x_r) \in \mathbb{Z}[x_1, \ldots, x_r]$. Since antiquity, many mathematicians have studied the solutions in integers of Diophantine equations that arise from a variety of problems in number theory, e.g. $y^2 = x^3 - n^2 x$ is the Diophantine equation related to the study of the congruent number problem (see Example 1.1.2).
Since we would like to systematically study the integer solutions
of Diophantine equations, we ask ourselves three basic questions:

¹ The contents of this chapter are largely based on the article [Loz05], in Spanish.

(a) Can we determine if Eq. (2.1) has any integral solutions,

xi ∈ Z, or rational solutions, xi ∈ Q?
(b) If so, can we find any of the integral or rational solutions?
(c) Finally, can we find all solutions and prove that we have
found all of them?
The first question was proposed by David Hilbert: to devise a
process according to which it can be determined in a finite number
of operations whether the equation is solvable in rational integers.
This was Hilbert’s tenth problem out of 23 fundamental questions
that he proposed to the mathematical community during the Second
International Congress of Mathematicians in Paris in the year 1900.
Surprisingly, in 1970, Matiyasevich, Putnam and Robinson discovered
that there is no such general algorithm that decides whether equation
(2.1) has integer solutions (see [Mat93]). However, if we restrict our
attention to certain particular cases, then we can answer questions
(a), (b) and (c) posed above. The most significant advances have
been obtained in equations with one and two variables:
• Polynomials in one variable:
$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$
with $a_i \in \mathbb{Z}$. This case is fairly simple. The following criterion determines how to search for rational or integral roots of a polynomial: if p/q ∈ Q, written in lowest terms, is a solution of f(x) = 0, then $a_n$ is divisible by p and $a_0$ is divisible by q.
• Linear equations in two variables:
ax + by = d
with a, b, d ∈ Z and ab ≠ 0. Clearly, this type of equation always has an infinite number of rational solutions. As
for integral solutions, Euclid’s algorithm (to find gcd(a, b))
determines if there are solutions x, y ∈ Z and, if so, pro-
duces all solutions. In particular, the equation has integral
solutions if and only if d is divisible by gcd(a, b).
• Quadratic equations (conics):
$ax^2 + bxy + cy^2 + dx + ey = f$ with a, b, c, d, e, f ∈ Z.
Finding integral and rational points on a conic is a classical problem. Legendre’s criterion determines whether there are
rational solutions: a conic C has rational solutions if and
only if C has points over R and over Qp , the p-adics, for all
primes p ≥ 2 (see Appendix D for a brief introduction to
the p-adics). Essentially, Legendre’s criterion says that the
conic has rational solutions if and only if there are solutions
modulo pn for all primes p and all n ≥ 1 but, in practice,
one only needs to check this for a finite number of primes
that depends on the coefficients of the conic.
If C has rational points, and we have found at least
one point, then we can find all the rational solutions using
a stereographic projection (see Exercise 2.12.2). The inte-
gral points on C, however, are much more difficult to find.
The problem is equivalent to finding integral solutions to Pell’s equation $x^2 - Dy^2 = 1$. There are several methods to solve Pell’s equation. For example, one can use continued fractions (certain convergents x/y of the continued fraction for $\sqrt{D}$ are integral solutions (x, y) of Pell’s equation; see Exercise 2.12.2).
• Cubic equations:
$aX^3 + bX^2Y + cXY^2 + dY^3 + eX^2 + fXY + gY^2 + hX + jY + k = 0.$
A cubic equation in two variables may have no rational solutions, only 1 rational solution, a finite number of solutions,
or infinitely many solutions. Unfortunately, we do not know
any algorithm that yields all rational solutions of a cubic
equation, although there are conjectural algorithms. In this
chapter we will concentrate on this type of equation: a non-
singular cubic, i.e., no self-intersections or pinches, with at
least one rational point (which will be our definition of an
elliptic curve).
• Higher degree. Typically, curves defined by an equation of
degree ≥ 4 have a genus ≥ 2 (but some equations of degree
4 have genus 1; see Example 2.2.5 and Exercise 2.12.4). The
genus is an invariant that classifies curves according to their
topology. Briefly, if we consider a curve as defined over C, then C(C) may be considered as a surface over R, and the genus of C counts the number of holes in the surface. For example, the projective line $\mathbb{P}^1(\mathbb{C})$ has no holes and g = 0 (the projective line over C is homeomorphic to a sphere; see Appendix C for a quick introduction to projective geometry),
and an elliptic curve has genus 1 (homeomorphic to a torus;
see Theorem 3.2.5). Surprisingly, the genus of a curve is
intimately related with the arithmetic of its points. More
precisely, Louis Mordell conjectured that a curve C of genus
≥ 2 can only have a ﬁnite number of rational solutions. The
conjecture was proved by Faltings in 1983.
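The first two cases in the list above (polynomials in one variable and linear equations in two variables) are completely algorithmic, and can be sketched in a few lines of Python. The function names below are hypothetical and chosen only for this illustration; the first function applies the divisibility criterion for rational roots, and the other two use the extended Euclidean algorithm to solve ax + by = d over the integers.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a0*x^n + ... + an, with coeffs = [a0, ..., an] and a0 != 0."""
    coeffs = list(coeffs)
    roots = set()
    # Factor out powers of x so that the constant term is non-zero.
    while len(coeffs) > 1 and coeffs[-1] == 0:
        roots.add(Fraction(0))
        coeffs.pop()
    if len(coeffs) < 2:
        return sorted(roots)
    a0, an = coeffs[0], coeffs[-1]
    # A root p/q in lowest terms must have p | an and q | a0.
    for p in divisors(an):
        for q in divisors(a0):
            for candidate in (Fraction(p, q), Fraction(-p, q)):
                value = Fraction(0)
                for c in coeffs:  # Horner evaluation
                    value = value * candidate + c
                if value == 0:
                    roots.add(candidate)
    return sorted(roots)

def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and a*s + b*t = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def solve_linear(a, b, d):
    """Solve a*x + b*y = d in integers (ab != 0).

    Returns None if there is no solution; otherwise ((x0, y0), (dx, dy)),
    meaning that all solutions are (x0 + k*dx, y0 + k*dy) for k in Z.
    """
    g, s, t = extended_gcd(a, b)
    if d % g != 0:
        return None
    k = d // g
    return (s * k, t * k), (b // g, -a // g)

print(rational_roots([2, -3, -3, 2]))
print(solve_linear(6, 10, 8), solve_linear(6, 10, 7))
```

For 6x + 10y = 7 the function returns None, in agreement with the criterion that d must be divisible by gcd(a, b).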

2.2. Definition
Definition 2.2.1. An elliptic curve over Q is a smooth cubic projec-
tive curve E defined over Q with at least one rational point O ∈ E(Q)
that we call the origin.

In other words, an elliptic curve is a curve E in the projective

plane (see Appendix C) given by a cubic polynomial F (X, Y, Z) = 0
with rational coeﬃcients, i.e.,

(2.2)  $F(X, Y, Z) = aX^3 + bX^2Y + cXY^2 + dY^3 + eX^2Z + fXYZ + gY^2Z + hXZ^2 + jYZ^2 + kZ^3 = 0,$

with coefficients a, b, c, . . . ∈ Q, and such that E is smooth; i.e., the tangent vector $\left( \frac{\partial F}{\partial X}(P), \frac{\partial F}{\partial Y}(P), \frac{\partial F}{\partial Z}(P) \right)$ does not vanish at any P ∈ E
(see Appendix C.5 for a brief introduction to singularities and non-
singular or smooth curves). If the coefficients a, b, c, . . . are in a field
K, then we say that E is defined over K (and write E/K).
Even though the fact that E is a projective curve is crucial, we
usually consider just affine charts of E, e.g. those points of the form
{[X, Y, 1]}, and study instead the affine curve given by

(2.3)  $aX^3 + bX^2Y + cXY^2 + dY^3 + eX^2 + fXY + gY^2 + hX + jY + k = 0$

but with the understanding that in this new model we may have left
out some points of E at infinity (i.e., those points [X, Y, 0] satisfying
Eq. 2.2).
In general, one can find a change of coordinates that simplifies
Eq. 2.3 enormously:

Proposition 2.2.2. Let E be an elliptic curve, given by Eq. 2.2, defined over a field K of characteristic different from 2 or 3. Then there exists a curve Ê given by

$zy^2 = x^3 + Axz^2 + Bz^3$,  A, B ∈ K with $4A^3 + 27B^2 \neq 0$

and an invertible change of variables ψ : E → Ê of the form
\[ \psi([X, Y, Z]) = \left[ \frac{f_1(X, Y, Z)}{g_1(X, Y, Z)}, \frac{f_2(X, Y, Z)}{g_2(X, Y, Z)}, \frac{f_3(X, Y, Z)}{g_3(X, Y, Z)} \right] \]
where $f_i$ and $g_i$ are polynomials with coefficients in K for i = 1, 2, 3, and the origin O is sent to the point [0, 1, 0] of Ê, i.e., ψ(O) = [0, 1, 0].

The existence of such a change of variables is a consequence of

the Riemann-Roch theorem of algebraic geometry (for a proof of the
proposition see [Sil86], Chapter III.3). The reference [SiT92], Ch. I.3, gives an explicit method to find the change of variables ψ : E → Ê. See also pages 46-49 of [Mil06].
A projective equation of the form $zy^2 = x^3 + Axz^2 + Bz^3$, or $y^2 = x^3 + Ax + B$ in affine coordinates, is called a Weierstrass equation. From now on, we will often work with an elliptic curve in this form. Notice that a curve E given by a Weierstrass equation $y^2 = x^3 + Ax + B$ is non-singular if and only if $4A^3 + 27B^2 \neq 0$, and it has a unique point at infinity, namely [0, 1, 0], which we shall call the origin O or the point at infinity of E.
Sometimes we shall use a more general Weierstrass equation

$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$

with $a_i \in \mathbb{Q}$ (we will explain the funky choice of notation for the coefficients later), but most of the time we will work with equations of the form $y^2 = x^3 + Ax + B$. It is easy to come up with a change of variables from one form to the other (see Exercise 2.12.3).

Example 2.2.3. Let d ∈ Z, d ≠ 0, and let E be the elliptic curve given by the cubic equation

$X^3 + Y^3 = dZ^3$

with O = [1, −1, 0]. The reader should verify that E is a smooth curve. We wish to find a Weierstrass equation for E and, indeed, one can find a change of variables ψ : E → Ê given by

ψ([X, Y, Z]) = [12dZ, 36d(X − Y), X + Y] = [x, y, z]

such that $zy^2 = x^3 - 432d^2z^3$. The map ψ is invertible; the inverse map $\psi^{-1}$ : Ê → E is
\[ \psi^{-1}([x, y, z]) = \left[ \frac{36dz + y}{72d}, \frac{36dz - y}{72d}, \frac{x}{12d} \right]. \]
In affine coordinates, the change of variables is going from $X^3 + Y^3 = d$ to the curve $y^2 = x^3 - 432d^2$:
\[ \psi(X, Y) = \left( \frac{12d}{X + Y}, \frac{36d(X - Y)}{X + Y} \right), \qquad \psi^{-1}(x, y) = \left( \frac{36d + y}{6x}, \frac{36d - y}{6x} \right). \]
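The affine change of variables of Example 2.2.3 can be tested with exact rational arithmetic. The sketch below uses Python's fractions module; the function names psi and psi_inverse and the sample point (X, Y) = (1, 2) on $X^3 + Y^3 = 9$ are our own choices for illustration.

```python
from fractions import Fraction

def psi(X, Y, d):
    """Affine map from X^3 + Y^3 = d to y^2 = x^3 - 432*d^2 (needs X + Y != 0)."""
    x = Fraction(12 * d) / (X + Y)
    y = Fraction(36 * d) * (X - Y) / (X + Y)
    return x, y

def psi_inverse(x, y, d):
    """Affine inverse map, back to X^3 + Y^3 = d (needs x != 0)."""
    X = Fraction(36 * d + y) / (6 * x)
    Y = Fraction(36 * d - y) / (6 * x)
    return X, Y

d = 9
X, Y = 1, 2                      # 1^3 + 2^3 = 9
x, y = psi(X, Y, d)
print((x, y))
print(y**2 == x**3 - 432 * d**2)  # the image lies on the Weierstrass curve
print(psi_inverse(x, y, d))       # we recover the original point
```

Working with Fraction instead of floating point numbers avoids rounding errors, so the equalities are checked exactly.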

Definition 2.2.4. Let E : f(x, y) = 0 be an elliptic curve with origin O, and let E′ : g(X, Y) = 0 be an elliptic curve with origin O′. We say that E and E′ are isomorphic over Q if there is an invertible change of variables ψ : E → E′, defined by rational functions with coefficients in Q, such that ψ(O) = O′.

Example 2.2.5. Sometimes, a curve given by a quartic polynomial

can be isomorphic over Q to another curve given by a cubic polyno-
mial. For instance, consider the curves

$C/\mathbb{Q} : V^2 = U^4 + 1$  and  $E/\mathbb{Q} : y^2 = x^3 - 4x$.

The map ψ : C → E given by
\[ \psi(U, V) = \left( \frac{2(V + 1)}{U^2}, \frac{4(V + 1)}{U^3} \right) \]
is an invertible rational map, defined over Q, that sends (0, 1) to O, and ψ(0, −1) = (0, 0). See Exercise 2.12.4. More generally, any quartic
$C : V^2 = aU^4 + bU^3 + cU^2 + dU + q^2$
for some a, b, c, d, q ∈ Z is isomorphic over Q to a curve of the form $E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$, also defined over Q. The
isomorphism is given in [Was08], Theorem 2.17, p. 37.

Let E be an elliptic curve over Q given by a Weierstrass equation

$E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$,  $a_i \in \mathbb{Q}$.
With a change of variables $(x, y) \mapsto (u^{-2}x, u^{-3}y)$, we can find the equation of an elliptic curve isomorphic to E given by
$y^2 + (a_1 u)xy + (a_3 u^3)y = x^3 + (a_2 u^2)x^2 + (a_4 u^4)x + (a_6 u^6)$
and, for a suitable choice of u ∈ Z, with coefficients $a_i u^i \in \mathbb{Z}$ for i = 1, 2, 3, 4, 6. By the way, this is one of the reasons for the peculiar numbering of the coefficients $a_i$.
Example 2.2.6. Let E be given by $y^2 = x^3 + \frac{x}{2} + \frac{5}{3}$. We may change variables by $x = \frac{X}{6^2}$ and $y = \frac{Y}{6^3}$ to obtain a new equation $Y^2 = X^3 + 648X + 77760$ with integral coefficients.
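The scaling in Example 2.2.6 can be found by a direct search. The following Python sketch, for a curve $y^2 = x^3 + Ax + B$ with rational A and B, looks for the smallest positive integer u such that $Au^4$ and $Bu^6$ are both integers; the function name and the search bound max_u are assumptions made for this illustration.

```python
from fractions import Fraction

def integral_short_model(A, B, max_u=10_000):
    """Return (u, A*u^4, B*u^6) for the smallest u >= 1 making both coefficients integers."""
    A, B = Fraction(A), Fraction(B)
    for u in range(1, max_u + 1):
        new_A, new_B = A * u**4, B * u**6
        if new_A.denominator == 1 and new_B.denominator == 1:
            return u, int(new_A), int(new_B)
    raise ValueError("no suitable u found below max_u")

print(integral_short_model(Fraction(1, 2), Fraction(5, 3)))
```

For A = 1/2 and B = 5/3 the search stops at u = 6, which recovers the equation $Y^2 = X^3 + 648X + 77760$ of the example.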

2.3. Integral points

In 1929, Siegel proved the following result about integral points E(Z),
i.e., about those points on E with integer coordinates:
Theorem 2.3.1 (Siegel’s theorem; [Sil86], Ch. IX, Thm. 3.1). Let
E/Q be an elliptic curve given by $y^2 = x^3 + Ax + B$, with A, B ∈ Z.
Then E has only a ﬁnite number of integral points.

Siegel’s theorem is a consequence of a well-known theorem of Roth

on Diophantine approximation. Unfortunately, Siegel’s theorem is not
effective and provides neither a method to find the integral points on
E nor a bound on the number of integral points. However, in [Bak90],
Alan Baker found an alternative proof that provides an explicit upper bound on the size of the coordinates of an integral solution. More concretely, if x, y ∈ Z satisfy $y^2 = x^3 + Ax + B$, then
\[ \max(|x|, |y|) < \exp\left( \left( 10^6 \cdot \max(|A|, |B|) \right)^{10^6} \right). \]
Obviously, Baker’s bound is not a very sharp bound, but it is theoretically interesting nonetheless.

2.4. The group structure on E(Q)

From now on, we will concentrate on trying to ﬁnd all rational points
on a curve E : y 2 = x3 + Ax + B. We will use the following notation
for the rational points on E:

E(Q) = {(x, y) ∈ E | x, y ∈ Q} ∪ {O}

where O = [0, 1, 0] is the point at inﬁnity.

One of the aspects that makes the theory of elliptic curves so
rich is that the set E(Q) can be equipped with a group structure,
geometric in nature. The (addition) operation on E(Q) can be defined
as follows (see Figure 1). Let E be given by a Weierstrass equation
$y^2 = x^3 + Ax + B$ with A, B ∈ Q. Let P and Q be two rational points
in E(Q) and let $L = \overline{PQ}$ be the line that goes through P and Q (if
P = Q, then we define L to be the tangent line to E at P ). Since
the curve E is defined by a cubic equation, and since we have defined
L so it already intersects E at two rational points, there must be a
third point of intersection R in L ∩ E, which is also defined over Q,
and
L ∩ E(Q) = {P, Q, R}.
The sum of P and Q, denoted by P + Q, is by definition the second
point of intersection with E of the vertical line that goes through R,
or in other words, the reflection of R across the x-axis.
It is easy to verify that the addition operation that we have de-
fined on points of E(Q) is commutative. The origin O is the zero
element, and for every P ∈ E(Q) there exists a point −P such that
P + (−P ) = O. If E is given by y 2 = x3 + Ax + B and P = (x0 , y0 ),
then −P = (x0 , −y0 ). The addition is also associative (but this is
not obvious, and it is tedious to prove) and, therefore, (E, +) is an
abelian group.

Example 2.4.1. Let E be the elliptic curve y 2 = x3 − 25x, as in

Example 1.1.2. The points P = (5, 0) and Q = (−4, 6) belong to

Figure 1. Addition of points on an elliptic curve

E(Q). Let us ﬁnd P + Q. First, we ﬁnd the equation of the line

$L = \overline{PQ}$. The slope must be
\[ m = \frac{0 - 6}{5 - (-4)} = -\frac{6}{9} = -\frac{2}{3} \]
and the line is $L : y = -\frac{2}{3}(x - 5)$. Now we find the third point of intersection of L and E by solving
\[ \begin{cases} y = -\frac{2}{3}(x - 5) \\ y^2 = x^3 - 25x. \end{cases} \]
Plugging the first equation into the second one, we obtain an equation
\[ x^3 - \frac{4}{9}x^2 - \frac{185}{9}x - \frac{100}{9} = 0, \]
which factors as (x − 5)(x + 4)(9x + 5) = 0. The first two factors are expected, since we already knew that P = (5, 0) and Q = (−4, 6) are in L ∩ E. The third point of intersection must have $x = -\frac{5}{9}$, $y = -\frac{2}{3}(x - 5) = \frac{100}{27}$ and, indeed, $R = \left(-\frac{5}{9}, \frac{100}{27}\right)$ is a point in L ∩ E(Q). Thus, P + Q is the reflection of R across the x-axis, i.e., $P + Q = \left(-\frac{5}{9}, -\frac{100}{27}\right)$.
Using Proposition 1.1.3, we may try to use the point $P + Q = \left(-\frac{5}{9}, -\frac{100}{27}\right)$ to find a (new) right triangle with rational sides and area equal to 5, but this point corresponds to the triangle $\left(\frac{20}{3}, \frac{3}{2}, \frac{41}{6}\right)$, the same triangle that corresponds to Q = (−4, 6). In order to find a new triangle, let us find Q + Q = 2Q.
The line L in this case is the tangent line to E at Q. The slope
of L can be found using implicit differentiation on $y^2 = x^3 - 25x$:
\[ 2y \frac{dy}{dx} = 3x^2 - 25, \quad \text{so} \quad \frac{dy}{dx} = \frac{3x^2 - 25}{2y}. \]
Hence, the slope of L is $m = \frac{23}{12}$ and $L : y = \frac{23}{12}(x + 4) + 6$. In order to find R we need to solve
\[ \begin{cases} y = \frac{23}{12}(x + 4) + 6 \\ y^2 = x^3 - 25x. \end{cases} \]
Simplifying yields $x^3 - \frac{529}{144}x^2 - \frac{1393}{18}x - \frac{1681}{9} = 0$, which factors as
\[ (x + 4)^2 (144x - 1681) = 0. \]
Once again, two factors were expected: x = −4 needs to be a double root because L is tangent to E at Q = (−4, 6). The third factor tells us that the x coordinate of R is $x = \frac{1681}{144}$, and $y = \frac{23}{12}(x + 4) + 6 = \frac{62279}{1728}$. Thus, $Q + Q = 2Q = \left(\frac{1681}{144}, -\frac{62279}{1728}\right)$. This point corresponds to the right triangle
\[ (a, b, c) = \left( \frac{1519}{492}, \frac{4920}{1519}, \frac{3344161}{747348} \right). \]

Example 2.4.2. Let E : y 2 = x3 + 1 and put P = (2, 3). Let us ﬁnd

P , 2P , 3P , etc.
• In order to find 2P , first we need to find the tangent line to
E at P , which is y − 3 = 2(x − 2) or y = 2x − 1. The third
point of intersection is R = (0, −1), so 2P = (0, 1).
• To find 3P , we add P and 2P . The third point of inter-
section of E with the line that goes through P and 2P is
R = (−1, 0); hence, 3P = (−1, 0).

Figure 2. The rational points on y 2 = x3 + 1, except the

point at ∞.

• The point 4P can be found by adding 3P and P . The third

point of intersection of E and the line through P and 3P is
R = 2P = (0, 1), and so 4P = P + 3P = (0, −1).
• We find 5P by adding 4P and P . Notice that the line that
goes through 4P = (0, −1) and P = (2, 3) is tangent at
(2, 3), so the third point of intersection is P . Thus, 5P =
4P + P = (2, −3).
• Finally, 6P = P + 5P but 5P = (2, −3) = −P . Hence,
6P = P + (−P ) = O, the point at infinity.
This means that P is a point of finite order, and its order equals
6. See Figure 2 (the Sage code for this graph can be found in the
Appendix A.1.3).
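The computations of Examples 2.4.1 and 2.4.2 can be reproduced with a short Python sketch of the chord-and-tangent rule for curves $y^2 = x^3 + Ax + B$. The representation of O by None, the function names and the cut-off max_order are our own conventions for this illustration; the formulas come from the fact that the three x-coordinates of L ∩ E add up to $m^2$, where m is the slope of L.

```python
from fractions import Fraction

O = None  # the point at infinity

def add(P, Q, A):
    """Add two points on y^2 = x^3 + A*x + B (the coefficient B is not needed)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return O  # vertical line: Q = -P (this includes doubling a point with y = 0)
    if P == Q:
        m = Fraction(3 * x1 * x1 + A) / (2 * y1)  # slope of the tangent line at P
    else:
        m = Fraction(y2 - y1) / (x2 - x1)         # slope of the chord through P and Q
    x3 = m * m - x1 - x2       # x-coordinate of the third point of intersection R
    y3 = m * (x1 - x3) - y1    # reflect R across the x-axis
    return (x3, y3)

def order(P, A, max_order=50):
    """Order of P if it is at most max_order, and None otherwise."""
    Q, n = P, 1
    while Q is not O:
        if n >= max_order:
            return None
        Q = add(Q, P, A)
        n += 1
    return n

# Example 2.4.1: E : y^2 = x^3 - 25x
print(add((5, 0), (-4, 6), -25))
print(add((-4, 6), (-4, 6), -25))

# Example 2.4.2: E : y^2 = x^3 + 1 and P = (2, 3)
P = (2, 3)
multiples, Q = [], P
while Q is not O:
    multiples.append(Q)
    Q = add(Q, P, 0)
print(multiples, order(P, 0))
```

A return value of None from order only says that no multiple up to max_order is O; by itself it does not prove that the point has infinite order.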

The addition law can be deﬁned more generally on any smooth

projective cubic curve E : f (X, Y, Z) = 0, with a given rational point
O. Let P, Q ∈ E(Q) and let L be the line that goes through P and
Q. Let R be the third point of intersection of L and E. Then R is also a rational point in E(Q). Let L′ be the line through R and O. We define P + Q to be the third point of intersection of L′ and E. Notice that any vertical line x = a in the affine plane passes through [0, 1, 0], because the same line in projective coordinates is given by x = az and [0, 1, 0] belongs to such a line. Thus, if E is given by a model $y^2 = x^3 + Ax + B$, and O is chosen to be the point [0, 1, 0], then L′ is always a vertical line, so P + Q is always the reflection of R with respect to the x-axis.
The next step in the study of the structure of E(Q) was conjec-
tured by Jules Poincaré in 1908, proved by Louis Mordell in 1922 and
generalized by André Weil in his thesis in 1928:

Theorem 2.4.3 (Mordell-Weil). E(Q) is a ﬁnitely generated abelian

group. In other words, there are points P1 , . . . , Pn such that any other
point Q in E(Q) can be expressed as a linear combination

Q = a1 P 1 + a2 P 2 + · · · + an P n

for some ai ∈ Z.

The group E(Q) is usually called the Mordell-Weil group of E,

in honor of the two mathematicians who proved the theorem.

Example 2.4.4. Consider the elliptic curve E/Q given by the Weier-
strass equation
y 2 + y = x3 − 7x + 6.
The set of rational points E(Q) for this elliptic curve is inﬁnite. For
instance, the following points are on the curve:

(1, 0), (2, 0), (0, −3), (−3, −1), (8, −22), (−2, −4), (3, −4),
(3, 3), (−1, −4), (1, −1), (0, 2), (2, −1), (−2, 3), (−1, 3),

(1/4, 13/8), (25/9, −91/27), (−26/9, 28/27), (7/9, 17/27), . . . .
At a first glance, it may seem very difficult to describe all the points
on E(Q), including those listed above, in a succinct manner. However,
the Mordell-Weil theorem tells us that there must be a finite set of
points that generate the whole group. Indeed, it can be proved that

Figure 3. Louis Mordell (1888-1972) and André Weil (1906-1998).

the three points

P = (1, 0), Q = (2, 0), and R = (0, −3)
are generators of E(Q). This means that any other point on E(Q)
can be expressed as a Z-linear combination of P , Q and R. In other
words,
E(Q) = {a · P + b · Q + c · R : a, b, c ∈ Z}.
For instance,
(−3, −1) = P + Q, (8, −22) = P + R, (−2, −4) = P − Q,

(−1, −4) = Q − R and (3, 3) = P − R.
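The points and relations of Example 2.4.4 can be verified with exact arithmetic. In the Python sketch below, the function names are ours, and the addition formulas are the chord construction specialized to $y^2 + y = x^3 - 7x + 6$ under two standard facts that we assume here: on this model the inverse of (x, y) is (x, −y − 1), and the x-coordinates of the three points on a non-vertical line add up to the square of its slope. Only sums of points with different x-coordinates are handled.

```python
from fractions import Fraction as F

def on_curve(P):
    """Check whether P = (x, y) satisfies y^2 + y = x^3 - 7x + 6."""
    x, y = P
    return y * y + y == x**3 - 7 * x + 6

def negate(P):
    """Inverse of P: the other point of the curve on the vertical line through P."""
    x, y = P
    return (x, -y - 1)

def add_distinct(P, Q):
    """Chord addition for two points with different x-coordinates."""
    (x1, y1), (x2, y2) = P, Q
    slope = F(y2 - y1) / (x2 - x1)
    x3 = slope * slope - x1 - x2
    y_on_line = y1 + slope * (x3 - x1)  # third intersection point R = (x3, y_on_line)
    return negate((x3, y_on_line))

listed_points = [
    (1, 0), (2, 0), (0, -3), (-3, -1), (8, -22), (-2, -4), (3, -4),
    (3, 3), (-1, -4), (1, -1), (0, 2), (2, -1), (-2, 3), (-1, 3),
    (F(1, 4), F(13, 8)), (F(25, 9), F(-91, 27)),
    (F(-26, 9), F(28, 27)), (F(7, 9), F(17, 27)),
]
print(all(on_curve(point) for point in listed_points))

P, Q, R = (1, 0), (2, 0), (0, -3)
print(add_distinct(P, Q), add_distinct(P, R), add_distinct(P, negate(Q)))
print(add_distinct(Q, negate(R)), add_distinct(P, negate(R)))
```

The five sums printed at the end are the relations P + Q, P + R, P − Q, Q − R and P − R listed in the example.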

The proof of the theorem has three fundamental ingredients: the so-called weak Mordell-Weil theorem (E(Q)/mE(Q) is finite for any m ≥ 2; see below); the concept of height functions on abelian groups; and the descent theorem, which establishes that an abelian group A with a height function h, such that A/mA is finite (for some m ≥ 2), is finitely generated.

Theorem 2.4.5 (weak Mordell-Weil). E(Q)/mE(Q) is a ﬁnite group

for all m ≥ 2.

We will discuss the proof of a special case of the weak Mordell-

Weil theorem in Section 2.9 (see Corollary 2.9.7).
It follows from the Mordell-Weil theorem and the general struc-
ture theory of finitely generated abelian groups that
(2.4)  $E(\mathbb{Q}) \cong E(\mathbb{Q})_{\text{torsion}} \oplus \mathbb{Z}^{R_E}$.
In other words, E(Q) is isomorphic to the direct sum of two abelian
groups (notice however that this decomposition is not canonical!).
The first summand is a finite group formed by all torsion elements,
i.e., those points on E of finite order:
E(Q)torsion = {P ∈ E(Q) : there is n ∈ N such that nP = O}.
The second summand of Eq. (2.4), sometimes called the free part, is $\mathbb{Z}^{R_E}$, i.e., $R_E$ copies of Z for some integer $R_E \geq 0$. It is generated by $R_E$ points of E(Q) of infinite order (i.e., P ∈ E(Q) such that nP ≠ O for all non-zero n ∈ Z). The number $R_E$ is called the rank of the elliptic curve E/Q. Notice, however, that the set
F = {P ∈ E(Q) : P is of infinite order} ∪ {O}
is not a subgroup of E(Q) if the torsion subgroup is non-trivial. For instance, if T ≠ O is a torsion point and P is of infinite order, then P and P + T belong to F but T = (P + T) − P does not belong to F. This fact makes the isomorphism of Eq. (2.4) not canonical because the subgroup of E(Q) isomorphic to $\mathbb{Z}^{R_E}$ cannot be chosen, in general, in a unique way.
Example 2.4.6. The following are some examples of elliptic curves
and their Mordell-Weil groups:
(1) The curve E1 /Q : y 2 = x3 + 6 has no rational points, other
than the point at infinity O. Therefore, there are no torsion
points (other than O) and no points of infinite order. In
particular, the rank is 0, and E1 (Q) = {O}.
(2) The curve $E_2/\mathbb{Q} : y^2 = x^3 + 1$ has only 6 rational points. As we saw in Example 2.4.2, the point P = (2, 3) has exact order 6. Therefore $E_2(\mathbb{Q}) \cong \mathbb{Z}/6\mathbb{Z}$ as groups. Since there are no points of infinite order, the rank of $E_2/\mathbb{Q}$ is 0, and
$E_2(\mathbb{Q})$ = {O, P, 2P, 3P, 4P, 5P} = {O, (2, ±3), (0, ±1), (−1, 0)}.
(3) The curve E3 /Q : y 2 = x3 − 2 does not have any rational
torsion points other than O (as we shall see in the next
section). However, the point P = (3, 5) is a rational point.
Thus, P must be a point of infinite order and E3 (Q) contains
infinitely many distinct rational points. In fact, the rank of
$E_3$ is equal to 1 and P is a generator of all of $E_3(\mathbb{Q})$, i.e., $E_3(\mathbb{Q})$ = {nP : n ∈ Z} and $E_3(\mathbb{Q}) \cong \mathbb{Z}$.
(4) The elliptic curve $E_4/\mathbb{Q} : y^2 = x^3 + 7105x^2 + 1327104x$ features both torsion and infinite order points. In fact, $E_4(\mathbb{Q}) \cong \mathbb{Z}/4\mathbb{Z} \oplus \mathbb{Z}^3$. The torsion subgroup is generated by the point T = (1152, 111744) of order 4. The free part is generated by three points of infinite order:
$P_1 = (-6912, 6912)$, $P_2 = (-5832, 188568)$, $P_3 = (-5400, 206280)$.
Hence
$E_4(\mathbb{Q}) = \{aT + bP_1 + cP_2 + dP_3 : a = 0, 1, 2 \text{ or } 3 \text{ and } b, c, d \in \mathbb{Z}\}$.
As we mentioned above, the isomorphism $E_4(\mathbb{Q}) \cong \mathbb{Z}/4\mathbb{Z} \oplus \mathbb{Z}^3$ is not canonical. For instance, $E_4(\mathbb{Q}) \cong \langle T \rangle \oplus \langle P_1, P_2, P_3 \rangle$ but also $E_4(\mathbb{Q}) \cong \langle T \rangle \oplus \langle P_1', P_2, P_3 \rangle$ with $P_1' = P_1 + T$.

The rank of E/Q is, in a sense, a measurement of the arithmetic

complexity of the elliptic curve. It is not known if there is an upper
bound for the possible values of RE (the largest rank known is 28,
discovered by Noam Elkies; see Andrej Dujella’s website [Duj09] for
up-to-date records and examples of curves with “high” ranks). It has
been conjectured (with some controversy) that ranks can be arbitrar-
ily large; i.e., for all n ∈ N there exists an elliptic curve E over Q
with RE ≥ n. We state this as a conjecture for future reference:

Conjecture 2.4.7 (Conjecture of the rank). Let N ≥ 0 be a natural

number. Then there exists an elliptic curve E deﬁned over Q with
rank RE ≥ N .

One of the key pieces of evidence in favor of such a conjecture

was offered by Shafarevich and Tate, who proved that there exist
elliptic curves defined over function fields Fp (T ) and with arbitrarily
large ranks (Fp (T ) is a field that shares many similar properties with
Q; see [ShT67]). In any case, the problem of finding elliptic curves
of high rank is particularly interesting because of its arithmetic and
computational complexity.

2.5. The torsion subgroup

In this section we concentrate on the torsion points of an elliptic
curve:
E(Q)torsion = {P ∈ E(Q) : there is n ∈ N such that nP = O}.

Example 2.5.1. The curve $E_n : y^2 = x^3 - n^2x = x(x - n)(x + n)$ has three obvious rational points, namely P = (0, 0), Q = (−n, 0), T = (n, 0), and it is easy to check (see Exercise 2.12.6) that each one of these points is torsion of order 2, i.e., 2P = 2Q = 2T = O, and P + Q = T. In fact $E_n(\mathbb{Q})_{\text{torsion}} = \{O, P, Q, T\} \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$.

Note that the Mordell-Weil theorem implies that E(Q)torsion is

always ﬁnite. This fact prompts a natural question: what abelian
groups can appear in this context? The answer was conjectured by
Ogg and proven by Mazur:

Theorem 2.5.2 (Ogg’s conjecture; Mazur, [Maz77], [Maz78]). Let

E/Q be an elliptic curve. Then E(Q)torsion is isomorphic to one of
the following groups:
(2.5) Z/N Z with 1 ≤ N ≤ 10 or N = 12, or
Z/2Z ⊕ Z/2M Z with 1 ≤ M ≤ 4.

Example 2.5.3. For instance, the torsion subgroup of the elliptic

curve with Weierstrass equation y 2 + 43xy − 210y = x3 − 210x2 is
isomorphic to Z/12Z and it is generated by the point (0, 210). The
elliptic curve y 2 + 17xy − 120y = x3 − 60x2 has a torsion subgroup
isomorphic to Z/2Z⊕Z/8Z, generated by the rational points (30, −90)
and (−40, 400). See Figure 4 for a complete list of examples with each
possible torsion subgroup.
Curve                                   Torsion          Generators

y^2 = x^3 − 2                           trivial          O
y^2 = x^3 + 8                           Z/2Z             (−2, 0)
y^2 = x^3 + 4                           Z/3Z             (0, 2)
y^2 = x^3 + 4x                          Z/4Z             (2, 4)
y^2 − y = x^3 − x^2                     Z/5Z             (0, 1)
y^2 = x^3 + 1                           Z/6Z             (2, 3)
y^2 = x^3 − 43x + 166                   Z/7Z             (3, 8)
y^2 + 7xy = x^3 + 16x                   Z/8Z             (−2, 10)
y^2 + xy + y = x^3 − x^2 − 14x + 29     Z/9Z             (3, 1)
y^2 + xy = x^3 − 45x + 81               Z/10Z            (0, 9)
y^2 + 43xy − 210y = x^3 − 210x^2        Z/12Z            (0, 210)
y^2 = x^3 − 4x                          Z/2Z ⊕ Z/2Z      (2, 0), (0, 0)
y^2 = x^3 + 2x^2 − 3x                   Z/4Z ⊕ Z/2Z      (3, 6), (0, 0)
y^2 + 5xy − 6y = x^3 − 3x^2             Z/6Z ⊕ Z/2Z      (−3, 18), (2, −2)
y^2 + 17xy − 120y = x^3 − 60x^2         Z/8Z ⊕ Z/2Z      (30, −90), (−40, 400)

Figure 4. Examples of each of the possible torsion subgroups over Q.

Furthermore, it is known that, if G is any of the groups in Eq.

2.5, there are inﬁnitely many elliptic curves whose torsion subgroup is
isomorphic to G. See, for example, [Kub76], Table 3, p. 217. For the
convenience of the reader, the table in Kubert’s article is reproduced
in Appendix E.
Example 2.5.4. Let $E_t : y^2 + (1 - t)xy - ty = x^3 - tx^2$ with t ∈ Q and $\Delta_t = t^5(t^2 - 11t - 1) \neq 0$. Then, the torsion subgroup of $E_t(\mathbb{Q})$
contains a subgroup isomorphic to Z/5Z, and (0, 0) is a point of exact
order 5. Conversely, if E : y 2 = x3 + Ax + B is an elliptic curve with
torsion subgroup equal to Z/5Z, then there is an invertible change of
variables that takes E to an equation of the form Et for some t ∈ Q.
See also Examples E.1.1 and E.1.2.

A useful and simple consequence of Mazur’s theorem is that if
the order of a rational point P ∈ E(Q) is larger than 12, then P must
be a point of infinite order and, therefore, E(Q) contains an infinite
number of distinct rational points. Except for this criterion, Mazur’s
theorem is not very helpful in effectively computing the torsion
subgroup of a given elliptic curve. However, the following result, proven
independently by E. Lutz and T. Nagell, provides a simple algorithm
to determine $E(Q)_{\mathrm{torsion}}$:
Theorem 2.5.5 (Nagell-Lutz, [Nag35], [Lut37]). Let E/Q be an
elliptic curve with Weierstrass equation
$y^2 = x^3 + Ax + B$, A, B ∈ Z.
Then, every torsion point $P \neq O$ of E satisfies:
(1) The coordinates of P are integers, i.e., x(P), y(P) ∈ Z.
(2) If P is a point of order n ≥ 3, then $4A^3 + 27B^2$ is divisible
by $y(P)^2$.
(3) If P is of order 2, then y(P) = 0 and $x(P)^3 + Ax(P) + B = 0$.

For a proof, see [Sil86], Ch. VIII, Corollary 7.2, or [Mil06], Ch.
II, Theorem 5.1.
Example 2.5.6. Let E/Q : $y^2 = x^3 - 2$, so that A = 0 and B = −2.
The polynomial $x^3 - 2$ does not have any rational roots, so E(Q)
does not contain any points of order 2. Also, $4A^3 + 27B^2 = 27 \cdot 4$.
Thus, if (x(P), y(P)) are the coordinates of a torsion point in E(Q),
then y(P) is an integer and $y(P)^2$ divides 27 · 4. This implies that
y(P) = ±1, ±2, ±3, or ±6. In turn, this implies that $x(P)^3$ = 3, 6,
11 or 38, respectively. However, x(P) is an integer, and none of 3, 6,
11 or 38 is a perfect cube. Thus, $E(Q)_{\mathrm{torsion}}$ is trivial (i.e., the only
torsion point is O).
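The search carried out by hand in Example 2.5.6 can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function names are hypothetical, and the divisor search is a naive brute force that is adequate only for small coefficients. Note that the Nagell-Lutz conditions are necessary but not sufficient, so the output is a list of candidates that contains every affine torsion point.

```python
from math import isqrt


def integer_roots(A, c):
    """Integer roots of x^3 + A*x + c (a nonzero integer root divides c)."""
    if c == 0:
        roots = {0}
        # The remaining roots solve x^2 + A = 0.
        if A <= 0 and isqrt(-A) ** 2 == -A:
            roots |= {isqrt(-A), -isqrt(-A)}
        return sorted(roots)
    divisors = [d for d in range(1, abs(c) + 1) if c % d == 0]
    return sorted(x for d in divisors for x in (d, -d)
                  if x ** 3 + A * x + c == 0)


def nagell_lutz_candidates(A, B):
    """Integral points (x, y) on y^2 = x^3 + A*x + B with y = 0 or
    y^2 dividing 4A^3 + 27B^2. Every affine torsion point is in this list."""
    D = 4 * A ** 3 + 27 * B ** 2
    if D == 0:
        raise ValueError("singular curve")
    ys = [0] + [y for y in range(1, isqrt(abs(D)) + 1) if D % (y * y) == 0]
    points = []
    for y in ys:
        for x in integer_roots(A, B - y * y):
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return sorted(points)


print(nagell_lutz_candidates(0, -2))  # the curve of Example 2.5.6
print(nagell_lutz_candidates(0, 4))   # y^2 = x^3 + 4 from Figure 4
```

For the curve of Example 2.5.6 the candidate list is empty, in agreement with the conclusion that the torsion is trivial. A candidate that does appear must still be checked (for example by computing its multiples) before it is declared a torsion point.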
Example 2.5.7. Let p ≥ 2 be a prime number and let us define a
curve $E_p : y^2 = x^3 + p^2$. Since $x^3 + p^2 = 0$ does not have any rational
roots, $E_p(Q)$ does not contain points of order 2. Let P be a torsion
point on $E_p(Q)$. The list of all squares dividing $4A^3 + 27B^2 = 27p^4$
is short, and by the Nagell-Lutz theorem the possible values for y(P)
are:
$y = \pm 1, \pm p, \pm p^2, \pm 3p, \pm 3p^2$, and $\pm 3$.
Clearly, $(0, \pm p) \in E_p(Q)$ and one can show that those two points and
O are the only torsion points; see Exercise 2.12.8. Thus, the torsion
subgroup of $E_p(Q)$ is isomorphic to Z/3Z for any prime p ≥ 2.

2.6. Elliptic curves over ﬁnite ﬁelds

Let p ≥ 2 be a prime and let Fp be the finite field with p elements,
i.e.,
Fp = Z/pZ = {a mod p : a = 0, 1, 2, . . . , p − 1}.
Fp is a field and we may consider elliptic curves defined over Fp . As
for elliptic curves over Q, there are two conditions that need to be
satisfied: the curve needs to be given by a cubic equation, and the
curve needs to be smooth.
Example 2.6.1. For instance, $E : y^2 \equiv x^3 + 1 \bmod 5$ is an elliptic
curve defined over $F_5$. It is clearly given by a cubic equation
($zy^2 \equiv x^3 + z^3 \bmod 5$ in the projective plane $P^2(F_5)$) and it is smooth,
because for $F \equiv zy^2 - x^3 - z^3 \bmod 5$, the partial derivatives are:
$\frac{\partial F}{\partial x} \equiv -3x^2, \quad \frac{\partial F}{\partial y} \equiv 2yz, \quad \frac{\partial F}{\partial z} \equiv y^2 - 3z^2 \bmod 5.$
Thus, if the partial derivatives are congruent to 0 modulo 5, then
x ≡ 0 mod 5 and yz ≡ 0 mod 5. The latter congruence implies that
y or z ≡ 0 mod 5, and ∂F/∂z ≡ 0 implies that y ≡ z ≡ 0 mod 5.
Since [0, 0, 0] is not a point in the projective plane, we conclude that
there are no singular points on $E/F_5$.
However, $C/F_3 : y^2 \equiv x^3 + 1 \bmod 3$ is not an elliptic curve because
it is not smooth. Indeed, the point $P = (2 \bmod 3, 0 \bmod 3) \in C(F_3)$
is a singular point:
$\frac{\partial F}{\partial x}(P) \equiv -3 \cdot 2^2 \equiv 0, \quad \frac{\partial F}{\partial y}(P) \equiv 2 \cdot 0 \cdot 1 \equiv 0,$ and
$\frac{\partial F}{\partial z}(P) \equiv 0^2 - 3 \cdot 1^2 \equiv 0 \bmod 3.$
Let E/Q be an elliptic curve given by a Weierstrass equation
$y^2 = x^3 + Ax + B$ with integer coefficients A, B ∈ Z, and let p ≥ 2
be a prime number. If we reduce A and B modulo p, then we obtain
the equation of a cubic curve $\tilde{E}$ defined over
the field $F_p$. Even though E is smooth as a curve over Q, the curve
$\tilde{E}$ may be singular over $F_p$. In the previous example, we saw that
E/Q : $y^2 = x^3 + 1$ is smooth over Q and $F_5$ but it has a singularity
over $F_3$. If the reduction curve $\tilde{E}$ is smooth, then it is an elliptic
curve over $F_p$.
Example 2.6.2. Sometimes the reduction of a model for an elliptic
curve E modulo a prime p is not smooth, but it is smooth for some
other models of E. For instance, consider the curve $E : y^2 = x^3 + 15625$.
Then $\tilde{E} \equiv E \bmod 5$ is not smooth over $F_5$ because the point
(0, 0) mod 5 is a singular point. However, using the invertible change
of variables $(x, y) \mapsto (5^2 X, 5^3 Y)$, we obtain a new model over Q for
E given by $E' : Y^2 = X^3 + 1$, which is smooth when we reduce it
modulo 5. The problem here is that the model we chose for E is not
minimal. We describe what we mean by minimal next.

Definition 2.6.3. Let E be an elliptic curve given by $y^2 = x^3 + Ax + B$,
with A, B ∈ Q.
(1) We define $\Delta_E$, the discriminant of E, by
$\Delta_E = -16(4A^3 + 27B^2).$
For a definition of the discriminant for more general Weierstrass
equations, see for example [Sil86], p. 46.
(2) Let S be the set of all elliptic curves $E'$ that are isomorphic
to E over Q (see Definition 2.2.4) and such that the
discriminant of $E'$ is an integer. The minimal discriminant
of E is the integer $\Delta_{E'}$ that attains the minimum of the set
$\{|\Delta_{E'}| : E' \in S\}$. In other words, the minimal discriminant
is the smallest integral discriminant (in absolute value) of
an elliptic curve that is isomorphic to E over Q. If $E'$ is the
model for E with minimal discriminant, we say that $E'$ is a
minimal model for E.

Example 2.6.4. The curve $E : y^2 = x^3 + 5^6$ has discriminant
$\Delta_E = -2^4 3^3 5^{12}$, and the curve $E' : y^2 = x^3 + 1$ has discriminant
$\Delta_{E'} = -2^4 3^3$. Since E and $E'$ are isomorphic (see Definition 2.2.4
and Example 2.6.2), $\Delta_E$ cannot be the minimal discriminant for
E and $y^2 = x^3 + 5^6$ is not a minimal model. In fact, the minimal
discriminant is $\Delta_{E'} = -432$ and $E'$ is a minimal model.

Before we go on to describe the types of reduction modulo p that
one can encounter, we need a little bit of background on types of
singularities. Let $\tilde{E}$ be a cubic curve over a field K with Weierstrass
equation f(x, y) = 0, where
$f(x, y) = y^2 + a_1 xy + a_3 y - x^3 - a_2 x^2 - a_4 x - a_6,$
and suppose that $\tilde{E}$ has a singular point $P = (x_0, y_0)$, i.e.,
∂f/∂x(P) = ∂f/∂y(P) = 0. Thus, we can write the Taylor expansion of f(x, y)
around $(x_0, y_0)$ as follows:
$f(x, y) - f(x_0, y_0)$
$= \lambda_1 (x - x_0)^2 + \lambda_2 (x - x_0)(y - y_0) + \lambda_3 (y - y_0)^2 - (x - x_0)^3$
$= ((y - y_0) - \alpha(x - x_0)) \cdot ((y - y_0) - \beta(x - x_0)) - (x - x_0)^3$
for some $\lambda_i \in K$ and $\alpha, \beta \in \overline{K}$ (an algebraic closure of K).

Definition 2.6.5. The singular point $P \in \tilde{E}$ is a node if $\alpha \neq \beta$. In
this case there are two different tangent lines to $\tilde{E}$ at P, namely
$y - y_0 = \alpha(x - x_0), \quad y - y_0 = \beta(x - x_0).$
If α = β, then we say that P is a cusp, and there is a unique tangent
line at P.

Definition 2.6.6. Let E/Q be an elliptic curve given by a minimal
model, let p ≥ 2 be a prime and let $\tilde{E}$ be the reduction curve of E
modulo p. We say that E/Q has good reduction modulo p if $\tilde{E}$ is
smooth and hence is an elliptic curve over $F_p$. If $\tilde{E}$ is singular at a
point $P \in \tilde{E}(F_p)$, then we say that E/Q has bad reduction at p and
we distinguish two cases:
(1) If $\tilde{E}$ has a cusp at P, then we say that E has additive (or
unstable) reduction.
(2) If $\tilde{E}$ has a node at P, then we say that E has multiplicative
(or semistable) reduction. If the slopes of the tangent lines
(α and β as above) are in $F_p$, then the reduction is said to
be split multiplicative (and non-split otherwise).

Example 2.6.7. Let us see some examples of elliptic curves with
different types of reduction:
(1) $E_1 : y^2 = x^3 + 35x + 5$ has good reduction at p = 7, because
$y^2 \equiv x^3 + 5 \bmod 7$ is a non-singular curve over $F_7$.
(2) However $E_1$ has bad reduction at p = 5, and the reduction
is additive, since modulo 5 we can write the equation as
$((y - 0) - 0 \cdot (x - 0))^2 - x^3$ and the unique slope is 0.
(3) The elliptic curve $E_2 : y^2 = x^3 - x^2 + 35$ has bad multiplicative
reduction at 5 and 7. The reduction at 5 is split, while
the reduction at 7 is non-split. Indeed, modulo 5 we can
write the equation as
$((y - 0) - 2(x - 0)) \cdot ((y - 0) + 2(x - 0)) - x^3,$
the slopes being 2 and −2. However, for p = 7, the slopes
are not in $F_7$ (because −1 is not a quadratic residue in $F_7$).
Indeed, when we reduce the equation modulo 7, we obtain
$y^2 + x^2 - x^3 \bmod 7$
and $y^2 + x^2$ can only be factored in $F_7[i]$ but not in $F_7$.
(4) Let $E_3$ be an elliptic curve given by the model $y^2 + y =
x^3 - x^2 - 10x - 20$. This is a minimal model for $E_3$ and
its (minimal) discriminant is $\Delta_{E_3} = -11^5$. The prime 11 is
the unique prime of bad reduction and the reduction is split
multiplicative. Indeed, the point (5, 5) mod 11 is a singular
point on $\tilde{E}_3(F_{11})$ and, working modulo 11,
$f(x, y) = y^2 + y + x^2 + 10x + 20 - x^3$
$= (y - 5 - 5(x - 5)) \cdot (y - 5 + 5(x - 5)) - (x - 5)^3.$
Hence, the slopes at (5, 5) are 5 and −5, which are obviously
in $F_{11}$ and distinct.
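Items (1) to (3) of Example 2.6.7 can be reproduced with a short Python sketch. It is an illustration under explicit assumptions, not a general algorithm: the model must have the form $y^2 = x^3 + a_2 x^2 + a_4 x + a_6$, it must be minimal at p, and p must be odd. The function name is hypothetical. A singular point of the reduced curve is $(x_0, 0)$ with $x_0$ a repeated root of the cubic f; expanding around $x_0$ gives $y^2 = c(x - x_0)^2 + (x - x_0)^3$ with $c = 3x_0 + a_2$, so the tangent slopes are the square roots of c.

```python
def reduction_type(a2, a4, a6, p):
    """Reduction type at an odd prime p of y^2 = x^3 + a2*x^2 + a4*x + a6,
    assuming (not checking) that the model is minimal at p."""
    for x0 in range(p):
        f = (x0 ** 3 + a2 * x0 ** 2 + a4 * x0 + a6) % p
        df = (3 * x0 ** 2 + 2 * a2 * x0 + a4) % p
        if f == 0 and df == 0:
            # Repeated root: (x0, 0) is singular; slopes are +-sqrt(c).
            c = (3 * x0 + a2) % p
            if c == 0:
                return "additive"  # cusp: a single tangent line
            # Euler's criterion decides whether c is a square in F_p.
            if pow(c, (p - 1) // 2, p) == 1:
                return "split multiplicative"
            return "non-split multiplicative"
    return "good"


print(reduction_type(0, 35, 5, 7))   # E_1 at 7
print(reduction_type(0, 35, 5, 5))   # E_1 at 5
print(reduction_type(-1, 0, 35, 5))  # E_2 at 5
print(reduction_type(-1, 0, 35, 7))  # E_2 at 7
```

The curve $E_3$ of item (4) has nonzero $a_3$, so it falls outside the scope of this sketch.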

Proposition 2.6.8. Let K be a field and let E/K be a cubic curve
given by $y^2 = f(x)$, where f(x) is a monic cubic polynomial in K[x].
Suppose that $f(x) = (x - \alpha)(x - \beta)(x - \gamma)$ with $\alpha, \beta, \gamma \in \overline{K}$ (an
algebraic closure of K) and put
$D = (\alpha - \beta)^2 (\alpha - \gamma)^2 (\beta - \gamma)^2.$
Then E is non-singular if and only if $D \neq 0$.

The proof of the proposition is left as an exercise (see Exercise
2.12.9). Notice that the quantity D that appears in the previous
proposition is the discriminant of the polynomial f(x). The
discriminant of E/Q, $\Delta_E$ as in Definition 2.6.3, is a multiple of D; in fact,
$\Delta_E = 16D$. This fact together with Proposition 2.6.8 yields the
following corollary:
Corollary 2.6.9. Let E/Q be an elliptic curve with coeﬃcients in
Z. Let p ≥ 2 be a prime. If E has bad reduction at p, then p | ΔE .
In fact, if E is given by a minimal model, then p | ΔE if and only if
E has bad reduction at p.
Example 2.6.10. The discriminant of the elliptic curve $E_1 : y^2 =
x^3 + 35x + 5$ of Example 2.6.7 is $\Delta_{E_1} = -2754800 = -2^4 \cdot 5^2 \cdot 71 \cdot 97$
(and, in fact, this is the minimal discriminant of $E_1$). Thus, $E_1$ has
good reduction at 7 but it has bad reduction at 2, 5, 71 and 97. The
reduction at 71 and 97 is multiplicative.
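The computation in Example 2.6.10 is easy to reproduce. The Python sketch below (function names are illustrative, and the factorization is plain trial division) evaluates the discriminant of Definition 2.6.3 and factors it. By Corollary 2.6.9 the primes that appear are exactly the primes of bad reduction only when the model is minimal; for a non-minimal model they are merely candidates.

```python
def discriminant(A, B):
    """Discriminant of y^2 = x^3 + A*x + B, as in Definition 2.6.3."""
    return -16 * (4 * A ** 3 + 27 * B ** 2)


def prime_factorization(n):
    """Factor |n| by trial division; returns {prime: exponent}."""
    n = abs(n)
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


delta = discriminant(35, 5)
print(delta)
print(prime_factorization(delta))
```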

Let $\tilde{E}$ be an elliptic curve defined over a finite field $F_q$ with q
elements, where $q = p^r$ and p ≥ 2 is prime. Notice that $\tilde{E}(F_q) \subseteq
P^2(F_q)$, and the projective plane over $F_q$ only has a finite number
of points (how many?). Thus, the number $N_q := |\tilde{E}(F_q)|$, i.e., the
number of points on $\tilde{E}$ over $F_q$, is finite. The following theorem
provides a bound for $N_q$. This result was conjectured by Emil Artin
(in his thesis) and was proved by Helmut Hasse in the 1930’s:
Theorem 2.6.11 (Hasse; [Sil86], Ch. V, Theorem 1.1). Let $\tilde{E}$ be
an elliptic curve defined over $F_q$. Then
$q + 1 - 2\sqrt{q} < N_q < q + 1 + 2\sqrt{q},$
where $N_q = |\tilde{E}(F_q)|$.

Remark 2.6.12. Heuristically, we expect that $N_q$ is approximately
q + 1, in agreement with Hasse’s bound. Indeed, let E/Q be an elliptic
curve given by $y^2 = x^3 + Ax + B$, with A, B ∈ Z, and let q = p be a
prime for simplicity. There are p choices of x in $F_p$. For each value
$x_0$, the polynomial $f(x) = x^3 + Ax + B$ gives a value $f(x_0) \in F_p$. The
probability that a random element in $F_p$ is a perfect square in $F_p$ is
1/2 (notice, however, that $f(x_0)$ is not random; this is just a heuristic
argument). If $f(x_0)$ is a square modulo p, i.e., if there is a $y_0 \in F_p$
such that $f(x_0) \equiv y_0^2 \bmod p$, then there are two points $(x_0, \pm y_0)$ in
$\tilde{E}(F_p)$. If $f(x_0)$ is not a square modulo p, then there are no points in
$\tilde{E}(F_p)$ with x-coordinate equal to $x_0$. Hence:
$N_p \approx p \cdot \left( \frac{1}{2} \cdot 2 + \frac{1}{2} \cdot 0 \right) + 1 = p + 1.$
Notice that we have added 1 in order to account for the point at
infinity.

Figure 5. Helmut Hasse (1898-1979).

Remark 2.6.13. Suppose that E/Q is an elliptic curve that has bad
reduction at a prime p. How many points does the singular curve
$\tilde{E}$ have over $F_p$? The reader can find the answer to this question in
Exercise 5.7.1.

Example 2.6.14. Let E/Q be the elliptic curve $y^2 = x^3 + 3$. Its
minimal discriminant is $\Delta_E = -3888 = -2^4 \cdot 3^5$. Thus, the only
primes of bad reduction are 2 and 3 and $\tilde{E}/F_p$ is smooth for all p ≥ 5.
For p = 5, there are precisely 6 points on $\tilde{E}(F_5)$, namely
$\tilde{E}(F_5) = \{O, (1, 2), (1, 3), (2, 1), (2, 4), (3, 0)\},$
where all the coordinates should be regarded as congruences modulo
5. Thus, $N_5 = 6$, which is in the range given by Hasse’s bound:
$1.5278\ldots = 5 + 1 - 2\sqrt{5} < N_5 < 5 + 1 + 2\sqrt{5} = 10.4721\ldots.$
Similarly, one can verify that $N_7 = 13$.
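The counts $N_5$ and $N_7$ of Example 2.6.14 can be verified with a brute-force count that mirrors the heuristic of Remark 2.6.12: for each x, count the square roots of f(x) in $F_p$, then add 1 for the point at infinity. The Python sketch below uses hypothetical function names and is meant only for small primes.

```python
from math import sqrt


def count_points(A, B, p):
    """Number of points on y^2 = x^3 + A*x + B over F_p, including O."""
    # roots[v] = number of y in F_p with y^2 = v.
    roots = {}
    for y in range(p):
        v = y * y % p
        roots[v] = roots.get(v, 0) + 1
    affine = sum(roots.get((x ** 3 + A * x + B) % p, 0) for x in range(p))
    return affine + 1  # the extra 1 is the point at infinity


def within_hasse_bound(N, p):
    """Check |N - (p + 1)| <= 2*sqrt(p)."""
    return abs(N - (p + 1)) <= 2 * sqrt(p)


for p in (5, 7):
    N = count_points(0, 3, p)
    print(p, N, within_hasse_bound(N, p))
```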

The connections between the numbers Np and the group E(Q)

are numerous and of great interest. The most surprising relationship
is captured by the Birch and Swinnerton-Dyer conjecture (Conjecture
5.2.1) that relates the growth of Np (as p varies) with the rank of the
elliptic curve E/Q. We shall discuss this conjecture in Section 5.2 in
more detail. In the next proposition we describe a diﬀerent connection
between Np and E(Q). We shall use the following notation: if G is
an abelian group and m ≥ 2, then the points of G of order dividing
m will be denoted by G[m].

Proposition 2.6.15 ([Sil86], Ch. VII, Prop. 3.1). Let E/Q be an
elliptic curve, p a prime number and m a natural number not divisible
by p. Suppose that E/Q has good reduction at p. Then the reduction
map modulo p,
$E(Q)[m] \longrightarrow \tilde{E}(F_p),$
is an injective homomorphism of abelian groups. In particular, the
number of elements of E(Q)[m] divides the number of elements of
$\tilde{E}(F_p)$.

The previous proposition can be very useful when calculating the

torsion subgroup of an elliptic curve. Let’s see an application:

Example 2.6.16. Let E/Q : $y^2 = x^3 + 3$. In Example 2.6.14 we have
seen that $N_5 = 6$ and $N_7 = 13$, and E/Q has bad reduction only at
2 and 3.
If $q \neq 5, 7$ is a prime number, then E(Q)[q] is trivial. Indeed,
Proposition 2.6.15 implies that |E(Q)[q]| divides $N_5 = 6$ and also
$N_7 = 13$. Thus, |E(Q)[q]| must divide gcd(6, 13) = 1.
In the case of q = 5, we know that |E(Q)[5]| divides $N_7 = 13$.
Moreover, by Lagrange’s theorem from group theory, if E(Q)[p] is
non-trivial, then p divides |E(Q)[p]| (later on we will see that E(Q)[p]
is always a subgroup of Z/pZ × Z/pZ; see Exercise 3.7.5). Since 5
does not divide 13, it follows that E(Q)[5] must be trivial. Similarly,
one can show that E(Q)[7] is trivial, and we conclude that $E(Q)_{\mathrm{torsion}}$
is trivial.
However, notice that P = (1, 2) ∈ E(Q) is a point on the curve.
Since we just proved that E does not have any points of finite order
other than O, it follows that P must be a point of infinite order, and,
hence, we have shown that E has infinitely many rational points:
±P, ±2P, ±3P, . . .. In fact, $E(Q) \cong Z$ and (1, 2) is a generator of its
Mordell-Weil group.

In the previous example, the Nagell-Lutz theorem (Theorem 2.5.5)
would have yielded the same result, i.e., the torsion is trivial, in an
easier way. Indeed, for the curve $E : y^2 = x^3 + 3$, the quantity
$4A^3 + 27B^2$ equals $3^5$, so the possibilities for $y(P)^2$, where P is a
torsion point of order ≥ 3, are 1, 9 or 81 (it is easy to see that there are no
2-torsion points). Therefore, the possibilities for $x(P)^3 = y(P)^2 - 3$
are −2, 6 or 78, respectively. Since x(P) is an integer, we reach a
contradiction. In the following example, the Nagell-Lutz theorem would
be a lengthier and much more tedious alternative, and Proposition
2.6.15 is much more effective.

Example 2.6.17. Let E/Q : $y^2 = x^3 + 4249388$. In this case
$4A^3 + 27B^2 = 2^4 \cdot 3^3 \cdot 11^2 \cdot 13^2 \cdot 17^2 \cdot 19^2 \cdot 23^2.$
Therefore, $4A^3 + 27B^2$ is divisible by 192 distinct positive squares,
which makes it very tedious to use the Nagell-Lutz theorem. The
(minimal) discriminant of E/Q is $\Delta_E = -16(4A^3 + 27B^2)$ and
therefore E has good reduction at 5 and 7. Moreover, B = 4249388 ≡
3 mod 35 and therefore, by our calculations in Example 2.6.16, $N_5 = 6$
and $N_7 = 13$. Thus, Proposition 2.6.15 and the same argument we
used in Ex. 2.6.16 show that the torsion of E(Q) is trivial.
Incidentally, the curve E/Q : $y^2 = x^3 + 4249388$ has a rational
point $P = \left( \frac{25502}{169}, \frac{6090670}{2197} \right)$. Since the torsion of E(Q) is trivial, P
must be of infinite order. Another way to see this: since P has
rational coordinates that are not integral, the Nagell-Lutz theorem
implies that the order of P is infinite. In fact, E(Q) is isomorphic to
Z and it is generated by P.
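The argument of Examples 2.6.16 and 2.6.17 can be mechanized as follows. This Python sketch rests on two facts from the text: by Proposition 2.6.15, for each prime p of good reduction the prime-to-p part of the order of the torsion subgroup divides $N_p$, and by Mazur's theorem (Theorem 2.5.2) that order is at most 16. The function names are hypothetical, and the caller is assumed to pass only primes of good reduction.

```python
def count_points(A, B, p):
    """Number of points on y^2 = x^3 + A*x + B over F_p, including O."""
    roots = {}
    for y in range(p):
        v = y * y % p
        roots[v] = roots.get(v, 0) + 1
    return 1 + sum(roots.get((x ** 3 + A * x + B) % p, 0) for x in range(p))


def prime_to_p_part(n, p):
    """Largest divisor of n that is not divisible by p."""
    while n % p == 0:
        n //= p
    return n


def possible_torsion_orders(A, B, good_primes, max_order=16):
    """Orders n <= max_order compatible with Proposition 2.6.15 at every
    prime in good_primes (assumed to be primes of good reduction)."""
    counts = {p: count_points(A, B, p) for p in good_primes}
    return [n for n in range(1, max_order + 1)
            if all(counts[p] % prime_to_p_part(n, p) == 0
                   for p in good_primes)]


print(possible_torsion_orders(0, 4249388, [5, 7]))
```

When the returned list is [1], the torsion subgroup is trivial. In general the list only narrows down the possible orders, and more primes of good reduction can be added to shrink it further.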

2.7. The rank and the free part of E(Q)

In the previous sections we have described simple algorithms that
determine the torsion subgroup of E(Q). Recall that the Mordell-Weil
theorem (Thm. 2.4.3) says that there is a (non-canonical) isomorphism
$E(Q) \cong E(Q)_{\mathrm{torsion}} \oplus Z^{R_E}.$
Our next goal is to try to find $R_E$ generators of the free part of the
Mordell-Weil group. Unfortunately, no algorithm is known that will
always yield such free points. We don’t even have a way to determine
$R_E$, the rank of the curve, although sometimes we can obtain upper
bounds for the rank of a given curve E/Q (see, for instance, Theorem
2.7.4 below).
Naively, one could hope that if the coefficients of the (minimal)
Weierstrass equation for E/Q are small, then the coordinates of the
generators of E(Q) should also be small, and perhaps a brute force
computer search would yield these points. However, Bremner and
Cassels found the following surprising example: the curve $y^2 = x^3 +
877x$ has rank equal to 1 and the x-coordinate of a generator P is
$x(P) = (612776083187947368101/78841535860683900210)^2.$
However, Serge Lang salvaged this idea and conjectured that for all
$\epsilon > 0$ there is a constant $C_\epsilon$ such that there is a system of generators
$\{P_i : i = 1, \ldots, R_E\}$ of E(Q) with
$\hat{h}(P_i) \leq C_\epsilon \cdot |\Delta_E|^{1/2+\epsilon},$
where $\hat{h}$ is the canonical height function of E/Q, which we define next.
Lang’s conjecture says that the size of the coordinates of a generator
may grow exponentially with the (minimal) discriminant of a curve
E/Q.
Definition 2.7.1. We define the height of $\frac{m}{n} \in Q$, with gcd(m, n) =
1, by
$h\left(\frac{m}{n}\right) = \log(\max\{|m|, |n|\}).$
This can be used to define a height on a point P = (x, y) on an elliptic
curve E/Q, with x, y ∈ Q by
$H(P) = h(x).$
Finally, we define the canonical height of P ∈ E(Q) by
$\hat{h}(P) = \frac{1}{2} \lim_{N \to \infty} \frac{H(2^N \cdot P)}{4^N}.$
Note: here $2^N \cdot P$ means multiplication in the curve, using the addition
law defined in Section 2.4, i.e., $2 \cdot P = P + P$, $2^2 \cdot P = 2P + 2P$, etc.
Example 2.7.2. Let $E : y^2 = x^3 + 877x$, and let P be a generator
of E(Q). Here are some values of $\frac{1}{2} \cdot \frac{H(2^N \cdot P)}{4^N}$:

$\frac{1}{2} \cdot H(P) = 47.8645312628\ldots$
$\frac{1}{2} \cdot \frac{H(2 \cdot P)}{4} = 47.7958126219\ldots$
$\frac{1}{2} \cdot \frac{H(2^2 \cdot P)}{4^2} = 47.9720107996\ldots$
$\frac{1}{2} \cdot \frac{H(2^3 \cdot P)}{4^3} = 47.9636902383\ldots$
$\frac{1}{2} \cdot \frac{H(2^4 \cdot P)}{4^4} = 47.9901607777\ldots$
$\frac{1}{2} \cdot \frac{H(2^5 \cdot P)}{4^5} = 47.9901600133\ldots$
$\frac{1}{2} \cdot \frac{H(2^6 \cdot P)}{4^6} = 47.9901569227\ldots$
$\frac{1}{2} \cdot \frac{H(2^7 \cdot P)}{4^7} = 47.9901419861\ldots$
$\frac{1}{2} \cdot \frac{H(2^8 \cdot P)}{4^8} = 47.9901807594\ldots.$

The limit is in fact equal to $\hat{h}(P) = 47.9901859939\ldots$, well below the
value $|\Delta_E|^{1/2} = 207{,}773.12\ldots$.
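The first few rows of the table in Example 2.7.2 can be recomputed with exact rational arithmetic. The sketch below is an illustration with hypothetical function names. It uses the x-coordinate of the generator quoted earlier in this section, and it uses the standard duplication formula $x(2P) = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4(x^3 + Ax + B)}$ for $y^2 = x^3 + Ax + B$, which is not stated in the text, so that only x-coordinates are needed. The numerators roughly quadruple in length at each doubling, so only a few steps are practical.

```python
from fractions import Fraction
from math import log


def naive_height(x):
    """h(m/n) = log(max(|m|, |n|)) for a rational m/n in lowest terms."""
    x = Fraction(x)  # Fraction always stores lowest terms
    return log(max(abs(x.numerator), abs(x.denominator)))


def double_x(x, A, B):
    """x-coordinate of 2P from x = x(P) on y^2 = x^3 + A*x + B.
    Assumes P is not a point of order 2 (nonzero denominator)."""
    numerator = x ** 4 - 2 * A * x ** 2 - 8 * B * x + A ** 2
    denominator = 4 * (x ** 3 + A * x + B)
    return numerator / denominator


def height_approximations(x, A, B, steps):
    """Values (1/2) * H(2^N * P) / 4^N for N = 0, ..., steps."""
    values = []
    for N in range(steps + 1):
        if N > 0:
            x = double_x(x, A, B)
        values.append(0.5 * naive_height(x) / 4 ** N)
    return values


x_P = Fraction(612776083187947368101, 78841535860683900210) ** 2
values = height_approximations(x_P, 877, 0, 4)
for N, value in enumerate(values):
    print(N, value)
```

The printed values can be compared with the rows N = 0, ..., 4 of the table above.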

The canonical height enjoys the following properties and, in fact,

the canonical height is deﬁned so that it is (essentially) the only height
that satisﬁes these properties:
Proposition 2.7.3 (Néron-Tate). Let E/Q be an elliptic curve and
let $\hat{h}$ be the canonical height on E.
(1) For all P, Q ∈ E(Q), $\hat{h}(P + Q) + \hat{h}(P - Q) = 2\hat{h}(P) + 2\hat{h}(Q)$.
(Note: this is called the parallelogram law.)
(2) For all P ∈ E(Q) and m ∈ Z, $\hat{h}(mP) = m^2 \cdot \hat{h}(P)$. (Note: in
particular, the height of mP is much larger than the height
of P, for any $m \neq 0, \pm 1$.)
(3) Let P ∈ E(Q). Then $\hat{h}(P) \geq 0$, and $\hat{h}(P) = 0$ if and only if
P is a torsion point.

For the proofs of these properties, see [Sil86], Ch. VIII, Thm.
9.3, or [Mil06], Ch. IV, Prop. 4.5 and Thm. 4.7.
As we mentioned at the beginning of this section, we can calculate
upper bounds on the rank of a given elliptic curve (see [Sil86], p. 235,
exercises 8.1, 8.2). Here is an example:
Theorem 2.7.4 ([Loz08], Prop. 1.1). Let E/Q be an elliptic curve
given by a Weierstrass equation of the form
$E : y^2 = x^3 + Ax^2 + Bx$, with A, B ∈ Z.
Let $R_E$ be the rank of E(Q). For an integer N ≥ 1, let ν(N) be the
number of distinct positive prime divisors of N. Then
$R_E \leq \nu(A^2 - 4B) + \nu(B) - 1.$
More generally, let E/Q be any elliptic curve with a non-trivial point
of 2-torsion and let a (resp. m) be the number of primes of additive
(resp. multiplicative) bad reduction of E/Q. Then
$R_E \leq m + 2a - 1.$
Example 2.7.5. Pierre de Fermat proved that n = 1 is not a congruent
number (see Example 1.1.2) by showing that $x^4 + y^4 = z^2$ has
no non-trivial rational solutions. As an application of the previous
theorem, let us show that the curve
$E_1 : y^2 = x^3 - x = x(x - 1)(x + 1)$
only has the trivial solutions (0, 0), (±1, 0), which are torsion points
of order 2. Indeed, $E_1$ has the form $y^2 = x^3 + Ax^2 + Bx$ with A = 0
and B = −1, so $A^2 - 4B = 4$ and the first bound of Theorem 2.7.4
(with ν applied to |B| = 1) gives
$R_{E_1} \leq \nu(4) + \nu(1) - 1 = 1 + 0 - 1 = 0.$
(The minimal discriminant of $E_1$ is $\Delta_{E_1} = 64$, so p = 2 is the unique
prime of bad reduction. The reduction at p = 2 is additive, because
modulo 2 the equation becomes $(y + x + 1)^2 = (x + 1)^3$, which has a
cusp; hence the second bound, m + 2a − 1 = 1, is weaker here.)
We conclude that $R_{E_1} = 0$ and $E_1$ only has
torsion points. Finally, using Proposition 2.6.15 or Theorem 2.5.5,
we can show that the only torsion points are the three trivial points
named above.
Example 2.7.6. Let E/Q be the elliptic curve $y^2 = x(x + 1)(x + 2)$,
which already appeared in Example 1.1.1. Since the Weierstrass
equation of E is
$y^2 = x(x + 1)(x + 2) = x^3 + 3x^2 + 2x,$
it follows from Theorem 2.7.4 that the rank $R_E$ satisfies
$R_E \leq \nu(A^2 - 4B) + \nu(B) - 1 = \nu(1) + \nu(2) - 1 = 0 + 1 - 1 = 0,$
and therefore the rank is 0. The reader can check that
$E(Q)_{\mathrm{torsion}} = \{O, (0, 0), (-1, 0), (-2, 0)\}.$
Since the rank is zero, the four torsion points on E/Q are the only
rational points on E.
Example 2.7.7. Let $E : y^2 = x^3 + 2308x^2 + 665858x$. The primes 2
and 577 are the only prime divisors of (both) B and $A^2 - 4B$. Thus,
$R_E \leq \nu(A^2 - 4B) + \nu(B) - 1 = 2 + 2 - 1 = 3.$
The points $P_1 = (-1681, 25543)$, $P_2 = (-338, 26)$, and $P_3 = (577/16,
332929/64)$ are of infinite order and the subgroup of E(Q) generated
by $P_1$, $P_2$ and $P_3$ is isomorphic to $Z^3$. Therefore, the rank of E is
equal to 3.
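The first bound of Theorem 2.7.4 only requires counting prime divisors, so it is simple to evaluate. The following Python sketch (with illustrative function names and trial-division factoring, suitable for small inputs) recomputes the bounds used in Examples 2.7.6 and 2.7.7. Absolute values are taken so that a negative B can be passed; this is an assumption made here for convenience.

```python
def nu(n):
    """Number of distinct prime divisors of |n| (n must be nonzero)."""
    n = abs(n)
    if n == 0:
        raise ValueError("nu is not defined at 0")
    count = 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        count += 1  # leftover prime factor
    return count


def rank_bound(A, B):
    """Upper bound nu(A^2 - 4B) + nu(B) - 1 for the rank of
    y^2 = x^3 + A*x^2 + B*x, following Theorem 2.7.4."""
    return nu(A * A - 4 * B) + nu(B) - 1


print(rank_bound(3, 2))            # Example 2.7.6
print(rank_bound(2308, 665858))    # Example 2.7.7
```

The bound is only an upper bound: in Example 2.7.7 it happens to be attained, which is shown by exhibiting three independent points.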

2.8. Linear independence of rational points

Let E/Q be the curve defined in Example 2.7.7. We claimed that
the subgroup generated by the points $P_1 = (-1681, 25543)$, $P_2 =
(-338, 26)$, and $P_3 = (577/16, 332929/64)$ is isomorphic to $Z^3$. But
how can we show that? In particular, why is $P_3$ not a linear
combination of $P_1$ and $P_2$? In other words, are there integers $n_1$ and $n_2$
such that $P_3 = n_1 P_1 + n_2 P_2$? In fact, E/Q has a rational torsion
point T = (0, 0) of order 2, so could some combination of $P_1$, $P_2$ and
$P_3$ equal T? This example motivates the need for a notion of linear
dependence and independence of points over Z.
Definition 2.8.1. Let E/Q be an elliptic curve. We say that the
rational points $P_1, \ldots, P_m \in E(Q)$ are linearly dependent over Z if
there are integers $n_1, \ldots, n_m \in Z$, not all zero, such that
$n_1 P_1 + n_2 P_2 + \cdots + n_m P_m = T,$
where T is a torsion point. Otherwise, if no such relation exists, we
say that the points are linearly independent over Z.
Example 2.8.2. Let E/Q : $y^2 = x^3 + x^2 - 25x + 39$ and let
$P_1 = \left( \frac{61}{4}, -\frac{469}{8} \right), \quad P_2 = \left( -\frac{335}{81}, -\frac{6868}{729} \right), \quad P_3 = (21, 96).$
The points $P_1$, $P_2$ and $P_3$ are rational points on E and linearly
dependent over Z because
$-3P_1 - 2P_2 + 6P_3 = O.$

Example 2.8.3. Let E/Q : $y^2 + y = x^3 - x^2 - 26790x + 1696662$ and
put
$P_1 = \left( \frac{59584}{625}, \frac{71573}{15625} \right),$
$P_2 = \left( \frac{101307506181}{210337009}, \frac{30548385002405573}{3050517641527} \right).$
The points $P_1$ and $P_2$ are rational points on E, and they are linearly
dependent over Z because
$-3P_1 + 2P_2 = (133, -685),$
and (133, −685) is a torsion point of order 5.

Now that we have deﬁned linear independence over Z, we need

a method to prove that a number of points are linearly independent.
The existence of the Néron-Tate pairing provides a way to prove in-
dependence.
Definition 2.8.4. The Néron-Tate pairing attached to an elliptic
curve is defined by
$\langle \cdot, \cdot \rangle : E(Q) \times E(Q) \to R, \quad \langle P, Q \rangle = \hat{h}(P + Q) - \hat{h}(P) - \hat{h}(Q),$
where $\hat{h}$ is the canonical height on E. Let $P_1, P_2, \ldots, P_r$ be r rational
points on E(Q). The elliptic height matrix associated to $\{P_i\}_{i=1}^r$ is
$H = H(\{P_i\}_{i=1}^r) := (\langle P_i, P_j \rangle)_{1 \leq i \leq r,\ 1 \leq j \leq r}.$
The determinant of H is called the elliptic regulator of the set of
points $\{P_i\}_{i=1}^r$. If $\{P_i\}_{i=1}^r$ is a complete set of generators of the free
part of E(Q), then the determinant of $H(\{P_i\}_{i=1}^r)$ is called the elliptic
regulator of E/Q.

Theorem 2.8.5. Let E/Q be an elliptic curve. Then the Néron-Tate
pairing $\langle \cdot, \cdot \rangle$ associated to E is a non-degenerate symmetric bilinear
form on $E(Q)/E(Q)_{\mathrm{torsion}}$, i.e.,
(1) For all P, Q ∈ E(Q), $\langle P, Q \rangle = \langle Q, P \rangle$.
(2) For all P, Q, R ∈ E(Q) and all m, n ∈ Z,
$\langle P, mQ + nR \rangle = m \langle P, Q \rangle + n \langle P, R \rangle.$
(3) Suppose P ∈ E(Q) and $\langle P, Q \rangle = 0$ for all Q ∈ E(Q). Then
$P \in E(Q)_{\mathrm{torsion}}$. In particular, P is a torsion point if and
only if $\langle P, P \rangle = 0$.

The properties of the Néron-Tate pairing follow from those of the

canonical height in Proposition 2.7.3 (see Exercise 2.12.12). Theorem
2.8.5 has the following important corollary:

Corollary 2.8.6. Let E/Q be an elliptic curve and let $P_1, P_2, \ldots, P_r \in
E(Q)$ be rational points. Let H be the elliptic height matrix associated
to $\{P_i\}_{i=1}^r$. Then:
(1) Suppose det(H) = 0 and $u = (n_1, \ldots, n_r) \in \mathrm{Ker}(H)$, with
$n_i \in Z$. Then the points $\{P_i\}_{i=1}^r$ are linearly dependent and
$\sum_{k=1}^{r} n_k P_k = T$, where T is a torsion point on E(Q).
(2) If $\det(H) \neq 0$, then the points $\{P_i\}_{i=1}^r$ are linearly
independent and the rank of E(Q) is ≥ r.

Here is an example of how the Néron-Tate pairing is used in

practice:

Example 2.8.7. Let E/Q be the elliptic curve $y^2 = x^3 + 2308x^2 +
665858x$. Put
$P = (-1681, 25543), \quad Q = (-338, 26),$ and
$R = \left( \frac{332929}{36}, -\frac{215405063}{216} \right).$
Are P, Q and R independent? In order to find out, we find the elliptic
height matrix associated to {P, Q, R}, using PARI or Sage:
$H = \begin{pmatrix} \langle P, P \rangle & \langle Q, P \rangle & \langle R, P \rangle \\ \langle P, Q \rangle & \langle Q, Q \rangle & \langle R, Q \rangle \\ \langle P, R \rangle & \langle Q, R \rangle & \langle R, R \rangle \end{pmatrix} = \begin{pmatrix} 7.397\ldots & -3.601\ldots & 3.795\ldots \\ -3.601\ldots & 6.263\ldots & 2.661\ldots \\ 3.795\ldots & 2.661\ldots & 6.457\ldots \end{pmatrix}.$
The determinant of H seems to be very close to 0 (PARI returns
$3.368 \cdot 10^{-27}$). Hence Cor. 2.8.6 suggests that P, Q and R are not
independent. If we find the (approximate) kernel of H with PARI, we
discover that the (column) vector (1, 1, −1) is approximately in the
kernel, and therefore, P + Q − R may be a torsion point. Indeed, the
point P + Q − R = (0, 0) is a torsion point of order 2 on E(Q). Hence,
P, Q and R are linearly dependent over Z.
Instead, let $P_1 = (-1681, 25543)$, $P_2 = (-338, 26)$, a third point
$P_3 = (577/16, 332929/64)$ and let $H'$ be the elliptic height matrix
associated to $\{P_i\}_{i=1}^3$. Then $\det(H') = 101.87727\ldots$ is non-zero and,
therefore, $\{P_i\}_{i=1}^3$ are linearly independent and the rank of E/Q is at
least 3.
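The height matrix in Example 2.8.7 is computed numerically, so it only suggests a relation; the relation itself can then be confirmed exactly. The following Python sketch does this for P + Q − R with rational arithmetic. It is an illustration, not the PARI or Sage computation mentioned in the example: the function names are hypothetical, and the addition formulas used are the standard chord-and-tangent formulas for a curve $y^2 = x^3 + a_2 x^2 + a_4 x$, which are not written out in this section.

```python
from fractions import Fraction

O = None  # the point at infinity


def add(P, Q, a2, a4):
    """Sum of two points on y^2 = x^3 + a2*x^2 + a4*x (chord and tangent)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return O  # P and Q are inverses (covers points of order 2)
    if x1 == x2:
        slope = Fraction(3 * x1 * x1 + 2 * a2 * x1 + a4) / (2 * y1)
    else:
        slope = Fraction(y2 - y1) / (x2 - x1)
    x3 = slope * slope - a2 - x1 - x2
    y3 = -(slope * (x3 - x1) + y1)
    return (x3, y3)


def neg(P):
    """Inverse of a point: reflect in the x-axis."""
    return O if P is O else (P[0], -P[1])


a2, a4 = 2308, 665858
P = (Fraction(-1681), Fraction(25543))
Q = (Fraction(-338), Fraction(26))
R = (Fraction(332929, 36), Fraction(-215405063, 216))

T = add(add(P, Q, a2, a4), neg(R), a2, a4)
print(T)                  # the point P + Q - R
print(add(T, T, a2, a4))  # its double
```

If the relation of Example 2.8.7 holds, the first line printed is the point (0, 0) and the second is the point at infinity.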

2.9. Descent and the weak Mordell-Weil theorem

In the previous sections we have seen methods to calculate the torsion
subgroup of an elliptic curve E/Q, and also methods to check if a
collection of points are independent modulo torsion. However, we
have not discussed any method to find points of infinite order. In this
section, we briefly explain the method of descent, which facilitates the
search for generators of the free part of E(Q). Unfortunately, the
method of descent is not always successful! We will try to measure
the failure of the method in the following section. The method of
descent (as explained here) is mostly due to Cassels. For a more
detailed treatment, see [Was08] or [Sil86]. A more general descent
algorithm was laid out by Birch and Swinnerton-Dyer in [BSD63].
The current implementation of the algorithm is more fully explained
in Cremona’s book [Cre97].
Let E/Q be a curve given by $y^2 = x^3 + Ax + B$, with A, B ∈ Z.
The most general case of the method of descent is quite involved,
so we will concentrate on a particular case where the calculations
are much easier: we will assume that E(Q) has 4 distinct rational
points of 2-torsion (including O). As we saw before (Theorem 2.5.5,
or Exercise 2.12.6), a point P = (x, y) ∈ E(Q) is of 2-torsion if and
only if y = 0 and $x^3 + Ax + B = 0$ (or P = O). Thus, if E(Q) has 4
distinct rational points of 2-torsion, that means that $x^3 + Ax + B$ has
three (integral) roots and it factors completely over Z:
$x^3 + Ax + B = (x - e_1)(x - e_2)(x - e_3)$
with $e_i \in Z$. Since $x^3 + Ax + B$ does not have an $x^2$ term, we conclude
that $e_1 + e_2 + e_3 = 0$.
Suppose, then, that $E : y^2 = (x - e_1)(x - e_2)(x - e_3)$, where the
roots satisfy $e_i \in Z$ and $e_1 + e_2 + e_3 = 0$. We would like to find a
solution $(x_0, y_0) \in E$ with $x_0, y_0 \in Q$, i.e.,
$y_0^2 = (x_0 - e_1)(x_0 - e_2)(x_0 - e_3).$
Thus, each term $(x_0 - e_i)$ must be almost a square, and we can make
this precise by writing
$(x_0 - e_1) = au^2, \quad (x_0 - e_2) = bv^2, \quad (x_0 - e_3) = cw^2, \quad y_0^2 = abc(uvw)^2,$
where a, b, c, u, v, w ∈ Q, the numbers a, b, c ∈ Q are square-free, and
abc is a square (in Q).
Example 2.9.1. Let
$E : y^2 = x^3 - 556x + 3120 = (x - 6)(x - 20)(x + 26)$
so that $e_1 = 6$, $e_2 = 20$ and $e_3 = -26$. The point $(x_0, y_0) =
\left( \frac{164184}{289}, \frac{66469980}{4913} \right)$ is rational and on E. We can write
$x_0 - e_1 = \frac{164184}{289} - 6 = 2 \cdot \left( \frac{285}{17} \right)^2$
and, similarly, $x_0 - e_2 = \left( \frac{398}{17} \right)^2$ and $x_0 - e_3 = 2 \cdot \left( \frac{293}{17} \right)^2$. Thus,
following the notation of the preceding paragraphs,
$a = 2, \quad b = 1, \quad c = 2, \quad u = \frac{285}{17}, \quad v = \frac{398}{17}, \quad w = \frac{293}{17}.$
Notice that abc is a square and $y_0^2 = \left( \frac{66469980}{4913} \right)^2 = abc(uvw)^2$.

Example 2.9.2. Let $E : y^2 = x^3 - 556x + 3120$ as before, with
$e_1 = 6$, $e_2 = 20$ and $e_3 = -26$. Let P = (−8, 84), Q = (24, 60) and
$S = P + Q = \left( -\frac{247}{16}, -\frac{5733}{64} \right)$. The points P, Q and S are in E(Q). We
would like to calculate the aforementioned numbers a, b, c for each of
the points P, Q and S. For instance
$x(P) - e_1 = -8 - 6 = -14 = -14 \cdot 1^2,$
$x(P) - e_2 = -28 = -7 \cdot 2^2,$ and $x(P) - e_3 = 18 = 2 \cdot 3^2.$
Thus, $a_P = -14$, $b_P = -7$ and $c_P = 2$. Similarly, we calculate
$x(Q) - 6 = 2 \cdot 3^2, \quad x(Q) - 20 = 2^2, \quad x(Q) + 26 = 2 \cdot 5^2,$
$x(S) - 6 = -7 \cdot \left( \frac{7}{4} \right)^2, \quad x(S) - 20 = -7 \cdot \left( \frac{9}{4} \right)^2, \quad x(S) + 26 = \left( \frac{13}{4} \right)^2.$
Thus $a_Q = 2$, $b_Q = 1$, $c_Q = 2$, and $a_S = -7$, $b_S = -7$, $c_S = 1$.
Notice the following interesting fact:
$a_P \cdot a_Q = -28 = -7 \cdot 2^2, \quad b_P \cdot b_Q = -7, \quad c_P \cdot c_Q = 4.$
Therefore, the square-free part of $a_P \cdot a_Q$ equals $a_S = a_{P+Q} = -7$.
And similarly, the square-free parts of $b_P \cdot b_Q$ and $c_P \cdot c_Q$ equal $b_S = -7$
and $c_S = 1$, respectively. Also, the reader can check that $a_{2P} = b_{2P} =
c_{2P} = 1$ and $a_{2Q} = b_{2Q} = c_{2Q} = 1$.
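The square-free parts in Example 2.9.2 can be computed mechanically. The Python sketch below uses hypothetical function names and trial division, and it handles only points with nonzero y-coordinate. It reduces a nonzero rational m/n to its square-free representative by factoring m·n, which lies in the same class modulo squares because $m/n = mn/n^2$, and then checks that the triple of P times the triple of Q gives the triple of S = P + Q.

```python
from fractions import Fraction


def squarefree_part(q):
    """Square-free integer representing the nonzero rational q modulo squares."""
    q = Fraction(q)
    n = q.numerator * q.denominator  # same class as q modulo squares
    sign = -1 if n < 0 else 1
    n = abs(n)
    result = 1
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if exponent % 2 == 1:
            result *= d
        d += 1
    return sign * result * n  # leftover n is 1 or a single prime


def triple(x, roots):
    """(a, b, c) for a point with x-coordinate x and nonzero y-coordinate."""
    return tuple(squarefree_part(Fraction(x) - e) for e in roots)


def multiply(u, v):
    """Componentwise product of two triples, taken modulo squares."""
    return tuple(squarefree_part(a * b) for a, b in zip(u, v))


roots = (6, 20, -26)
t_P = triple(-8, roots)
t_Q = triple(24, roots)
t_S = triple(Fraction(-247, 16), roots)
print(t_P, t_Q, t_S, multiply(t_P, t_Q))
```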

The previous example points to the fact that there may be a
homomorphism between points on E(Q) and triples (a, b, c) of rational
numbers modulo squares, or square-free parts of rational numbers;
formally, we are talking about $Q^\times/(Q^\times)^2 \times Q^\times/(Q^\times)^2 \times Q^\times/(Q^\times)^2$.
Here, the group $Q^\times/(Q^\times)^2$ is the multiplicative group of non-zero
rational numbers, with the extra relation that two non-zero rational
numbers are equivalent if their square-free parts are equal (or,
equivalently, if their quotient is a perfect square). For instance, 3 and $\frac{12}{25}$
represent the same element of $Q^\times/(Q^\times)^2$ because $\frac{12}{25} = 3 \cdot \left( \frac{2}{5} \right)^2$. The
following theorem constructs such a homomorphism. Here we have
adapted the proof that appears in [Was08], Theorem 8.14.
Theorem 2.9.3. Let E/Q be an elliptic curve

y 2 = x3 + Ax + B = (x − e1 )(x − e2 )(x − e3 )
with distinct e1 , e2 , e3 ∈ Z and e1 + e2 + e3 = 0. There is a homomor-
phism of groups
δ : E(Q) → Q× /(Q× )2 × Q× /(Q× )2 × Q× /(Q× )2
defined for $P = (x_0, y_0)$ by
\[
\delta(P) = \begin{cases}
(1, 1, 1) & \text{if } P = \mathcal{O};\\
(x_0 - e_1,\ x_0 - e_2,\ x_0 - e_3) & \text{if } y_0 \neq 0;\\
((e_1 - e_2)(e_1 - e_3),\ e_1 - e_2,\ e_1 - e_3) & \text{if } P = (e_1, 0);\\
(e_2 - e_1,\ (e_2 - e_1)(e_2 - e_3),\ e_2 - e_3) & \text{if } P = (e_2, 0);\\
(e_3 - e_1,\ e_3 - e_2,\ (e_3 - e_1)(e_3 - e_2)) & \text{if } P = (e_3, 0).
\end{cases}
\]
If δ(P ) = (δ1 , δ2 , δ3 ), then δ1 · δ2 · δ3 = 1 in Q× /(Q× )2 . Moreover, the
kernel of δ is precisely 2E(Q); i.e., if δ(Q) = (1, 1, 1), then Q = 2P
for some P ∈ E(Q).
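
The case distinction in the definition of δ translates directly into code. The sketch below is one possible Python rendering (the function names `squarefree_part` and `delta`, and the convention of representing the point O by `None`, are assumptions made for illustration). Each coordinate is reduced to its square-free representative, so two outputs are equal exactly when they agree in (Q×/(Q×)^2)^3.

```python
from fractions import Fraction

def squarefree_part(q):
    """Square-free integer n with q = n * (rational square); q must be nonzero."""
    q = Fraction(q)
    n = q.numerator * q.denominator
    sign, n, d = (-1 if n < 0 else 1), abs(n), 2
    while d * d <= n:
        while n % (d * d) == 0:
            n //= d * d
        d += 1
    return sign * n

def delta(P, roots):
    """delta(P) as a triple of square-free integers; P is None for the point at infinity."""
    if P is None:
        return (1, 1, 1)
    x0, y0 = P
    diffs = [Fraction(x0) - e for e in roots]
    if y0 == 0:
        # P = (e_i, 0): the vanishing entry is replaced by the product of the other two
        i = diffs.index(0)
        others = [d for j, d in enumerate(diffs) if j != i]
        diffs[i] = others[0] * others[1]
    return tuple(squarefree_part(d) for d in diffs)

roots = (6, 20, -26)                      # E : y^2 = (x - 6)(x - 20)(x + 26)
for point in [None, (6, 0), (20, 0), (-26, 0), (-8, 84), (24, 60)]:
    print(point, delta(point, roots))
```

Replacing the zero entry by the product of the other two is what makes the product of the three coordinates a square in every case, as the theorem asserts.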

Proof. Let δ be the function deﬁned in the statement of the theorem.

Let us show that δ is a homomorphism of (abelian) groups; i.e., we
want to show that δ(P ) · δ(Q) = δ(P + Q). Notice first of all that
δ(P ) = δ(x0 , y0 ) = δ(x0 , −y0 ) = δ(−P ), because the definition of
δ does not depend on the sign of the y coordinate of P (in fact, it
only depends on whether y(P ) = 0). Thus, it suffices to prove that
δ(P ) · δ(Q) = δ(−(P + Q)) for all P, Q ∈ E(Q).
Let P = (x0 , y0 ), Q = (x1 , y1 ) and R = −(P + Q) = (x2 , y2 ),
and let us assume, for simplicity, that $y_i \neq 0$ for $i = 0, 1, 2$. By the
definition of the addition rule on an elliptic curve (see Figure 1), the
points P , Q and R are collinear. Let L = P Q be the line that goes
through all three points, and suppose it has equation L : y = ax + b.
Therefore, if we substitute y in the equation of E/Q, we obtain a
polynomial
\[ p(x) = (ax + b)^2 - (x - e_1)(x - e_2)(x - e_3). \]
The polynomial p(x) is cubic, its leading coefficient is −1, and it has precisely
three rational roots, namely $x_0$, $x_1$ and $x_2$. Hence, it factors:
\[ p(x) = (ax + b)^2 - (x - e_1)(x - e_2)(x - e_3) = -(x - x_0)(x - x_1)(x - x_2). \]
2.9. Descent and the weak Mordell-Weil theorem 53

If we evaluate p(x) at $x = e_i$, we obtain
\[ p(e_i) = (ae_i + b)^2 = -(e_i - x_0)(e_i - x_1)(e_i - x_2) \]
or, equivalently, $(x_0 - e_i)(x_1 - e_i)(x_2 - e_i) = (ae_i + b)^2$. Thus, the
product $\delta(P) \cdot \delta(Q) \cdot \delta(R)$ equals
\begin{align*}
\delta(P) \cdot \delta(Q) \cdot \delta(R) &= (x_0 - e_1,\ x_0 - e_2,\ x_0 - e_3) \cdot (x_1 - e_1,\ x_1 - e_2,\ x_1 - e_3) \cdot (x_2 - e_1,\ x_2 - e_2,\ x_2 - e_3)\\
&= \bigl((x_0 - e_1)(x_1 - e_1)(x_2 - e_1),\ (x_0 - e_2)(x_1 - e_2)(x_2 - e_2),\ (x_0 - e_3)(x_1 - e_3)(x_2 - e_3)\bigr)\\
&= \bigl((ae_1 + b)^2,\ (ae_2 + b)^2,\ (ae_3 + b)^2\bigr)\\
&= (1, 1, 1) \in (\mathbb{Q}^\times/(\mathbb{Q}^\times)^2)^3.
\end{align*}
Hence, $\delta(P) \cdot \delta(Q) \cdot \delta(R) = 1$. If we multiply both sides by $\delta(R)$ and
notice that $a^2 = 1$ for any $a \in \mathbb{Q}^\times/(\mathbb{Q}^\times)^2$, we conclude that
\[ \delta(P) \cdot \delta(Q) = \delta(R) = \delta(-(P + Q)) = \delta(P + Q), \]
as desired. In order to completely prove that δ is a homomorphism,
we would need to check the cases when P , Q or R is one of the points
(ei , 0) or O, but we leave those special cases for the reader to check
(Exercise 2.12.15).
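
The key identity of this argument, (x0 − ei)(x1 − ei)(x2 − ei) = (a·ei + b)^2, can be checked numerically on the curve of Example 2.9.2. The short Python sketch below does this for P = (−8, 84) and Q = (24, 60); the variable names are illustrative, and the formula used for the third root relies on the assumption e1 + e2 + e3 = 0 (so that the roots of p(x) sum to a^2).

```python
from fractions import Fraction as F

roots = (6, 20, -26)                       # e1, e2, e3 with e1 + e2 + e3 = 0
P, Q = (F(-8), F(84)), (F(24), F(60))

a = (Q[1] - P[1]) / (Q[0] - P[0])          # slope of the line L through P and Q
b = P[1] - a * P[0]                        # intercept, so L : y = a x + b
x2 = a * a - P[0] - Q[0]                   # third root of p(x), since x0 + x1 + x2 = a^2
R = (x2, a * x2 + b)                       # third intersection point, R = -(P + Q)

checks = [(P[0] - e) * (Q[0] - e) * (x2 - e) == (a * e + b) ** 2 for e in roots]
print(R, checks)
```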
If δ(P ) = (δ1 , δ2 , δ3 ), then it follows directly from the deﬁnition
of δ that δ1 · δ2 · δ3 = 1 in Q× /(Q× )2 . Indeed, this is clear for
$P = \mathcal{O}$ or $P = (e_i, 0)$, and if $P = (x_0, y_0)$ with $y_0 \neq 0$, then
$(x_0 - e_1)(x_0 - e_2)(x_0 - e_3) = y_0^2$, which is a square, and is therefore trivial
in $\mathbb{Q}^\times/(\mathbb{Q}^\times)^2$.
Next, let us show that the kernel of δ is 2E(Q). Clearly, 2E(Q)
is in the kernel of δ, because δ is a homomorphism with image in
(Q× /(Q× )2 )3 , as we just proved. Indeed, if P ∈ E(Q), then
\[ \delta(2P) = \delta(P) \cdot \delta(P) = \delta(P)^2 = (\delta_1^2, \delta_2^2, \delta_3^2) = (1, 1, 1), \]
because squares are trivial in Q× /(Q× )2 .
Now let us show the reverse inclusion, i.e., that the kernel of δ
is contained in 2E(Q). Let Q = (x1 , y1 ) ∈ E(Q) such that δ(Q) =
(1, 1, 1). We want to find $P = (x_0, y_0)$ such that 2P = Q. Notice that
it is enough to show that $x(2P) = x_1$, because 2P is a point on E(Q)
and if x(2P) = x(Q), then Q = 2(±P). Hence, our goal will be to
construct $(x_0, y_0) \in E(\mathbb{Q})$ such that
\[ x(2P) = \frac{x_0^4 - 2Ax_0^2 - 8Bx_0 + A^2}{4y_0^2} = x_1. \]
The formula for x(2P) above is given in Exercise 2.12.16.
Once again, for simplicity, let us assume $y(Q) = y_1 \neq 0$ and, as
stated above, we assume δ(Q) = (1, 1, 1). Hence, $x_1 - e_i$ is a square
in Q for i = 1, 2, 3. Let us write
\[ x_1 - e_i = t_i^2, \quad \text{for some } t_i \in \mathbb{Q}^\times. \tag{2.6} \]
We define a new auxiliary polynomial p(x) by
\[ p(x) = t_1 \frac{(x - e_2)(x - e_3)}{(e_1 - e_2)(e_1 - e_3)} + t_2 \frac{(x - e_1)(x - e_3)}{(e_2 - e_1)(e_2 - e_3)} + t_3 \frac{(x - e_1)(x - e_2)}{(e_3 - e_1)(e_3 - e_2)}. \]
The polynomial p(x) is an interpolating polynomial (or Lagrange
polynomial) which was defined so that p(ei ) = ti . Notice that p(x) is
a quadratic polynomial, say $p(x) = a + bx + cx^2$. Also define another
polynomial $q(x) = x_1 - x - p(x)^2$ and notice that
\[ q(e_i) = x_1 - e_i - p(e_i)^2 = x_1 - e_i - t_i^2 = 0 \]
from the definition of $t_i$ in Eq. (2.6). Since $q(e_i) = 0$, it follows that
$(x - e_i)$ divides q(x) for i = 1, 2, 3. Thus, $(x - e_1)(x - e_2)(x - e_3) = x^3 + Ax + B$
divides q(x). In other words, $q(x) \equiv 0 \bmod x^3 + Ax + B$.
Since $q(x) = x_1 - x - p(x)^2$, we can also write
\[ x_1 - x \equiv p(x)^2 \equiv (a + bx + cx^2)^2 \bmod (x^3 + Ax + B). \]
We shall expand the square on the right-hand side, modulo $f(x) = x^3 + Ax + B$.
Notice that $x^3 \equiv -Ax - B$, and $x^4 \equiv -Ax^2 - Bx$ modulo f(x):
\begin{align*}
x_1 - x \equiv p(x)^2 &\equiv (a + bx + cx^2)^2\\
&\equiv c^2x^4 + 2bcx^3 + (2ac + b^2)x^2 + 2abx + a^2\\
&\equiv c^2(-Ax^2 - Bx) + 2bc(-Ax - B) + (2ac + b^2)x^2 + 2abx + a^2\\
&\equiv (2ac + b^2 - Ac^2)x^2 + (2ab - Bc^2 - 2Abc)x + (a^2 - 2bcB),
\end{align*}

where all the congruences are modulo f (x) = x3 + Ax + B. The

congruences in the previous equation say that a polynomial of degree
1, call it g(x) = x1 − x, is congruent to a polynomial of degree ≤
2, call the last line h(x), modulo a polynomial of degree 3, namely
f (x). Then h(x) − g(x) is a polynomial of degree ≤ 2, divisible by a
polynomial of degree 3. This implies that h(x) − g(x) must be zero
and h(x) = g(x), i.e.,

\[ x_1 - x = (2ac + b^2 - Ac^2)x^2 + (2ab - Bc^2 - 2Abc)x + (a^2 - 2bcB). \]

If we match coefficients, we obtain the following equalities:

\begin{align}
2ac + b^2 - Ac^2 &= 0, \tag{2.7}\\
2ab - Bc^2 - 2Abc &= -1, \tag{2.8}\\
a^2 - 2bcB &= x_1. \tag{2.9}
\end{align}

If c = 0, then b = 0 by Eq. (2.7); therefore, p(x) = a + bx + cx2 = a is

a constant function, and so t1 = t2 = t3 . By Eq. (2.6), it follows that
e1 = e2 = e3 , which is a contradiction with our assumptions. Hence,
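c must be non-zero. We multiply Eq. (2.8) by $\frac{1}{c^2}$ and Eq. (2.7) by
$\frac{b}{c^3}$ to obtain

\begin{align}
\frac{2ab}{c^2} - B - \frac{2Ab}{c} &= -\frac{1}{c^2}, \tag{2.10}\\
\frac{2ab}{c^2} + \frac{b^3}{c^3} - \frac{Ab}{c} &= 0. \tag{2.11}
\end{align}

We subtract Eq. (2.10) from Eq. (2.11) to get:

\[ \left(\frac{b}{c}\right)^3 + A\left(\frac{b}{c}\right) + B = \left(\frac{1}{c}\right)^2. \]

Hence, the point $P = (x_0, y_0) = \left(\frac{b}{c}, \frac{1}{c}\right)$ is a rational point on E(Q).

It remains to show that x(2P) = x(Q). From Eq. (2.11) we deduce
that
\[ a = \frac{\frac{Ab}{c} - \frac{b^3}{c^3}}{\frac{2b}{c^2}} = \frac{A - \left(\frac{b}{c}\right)^2}{2 \cdot \frac{1}{c}} = \frac{A - x_0^2}{2y_0}, \]
and, therefore, substituting a in Eq. (2.9) yields

\begin{align*}
x(Q) = x_1 = a^2 - 2bcB &= \left(\frac{A - x_0^2}{2y_0}\right)^2 - 2bcB\\
&= \frac{(A^2 - 2Ax_0^2 + x_0^4) - (2bcB)(4y_0^2)}{4y_0^2}\\
&= \frac{(A^2 - 2Ax_0^2 + x_0^4) - (2bcB)\left(\frac{4}{c^2}\right)}{4y_0^2}\\
&= \frac{(A^2 - 2Ax_0^2 + x_0^4) - 8Bx_0}{4y_0^2}\\
&= \frac{x_0^4 - 2Ax_0^2 - 8Bx_0 + A^2}{4y_0^2} = x(2P)
\end{align*}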
as desired. In order to complete the proof of the fact that the kernel of
δ is 2E(Q), we would need to consider the case when y(Q) = y1 = 0,
but we leave this special case to the reader (Exercise 2.12.18).
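
The construction in this part of the proof is effective: given Q with δ(Q) = (1, 1, 1), it produces a point P with 2P = ±Q. The Python sketch below follows the steps of the proof (square roots t_i, interpolating polynomial a + bx + cx^2, then P = (b/c, 1/c)). The function names `rational_sqrt`, `halve` and `x_of_double` are illustrative, the positive square roots are an arbitrary choice of signs, and the test value x1 = 745/36 is the x-coordinate of 2·(−8, 84) on the curve of Example 2.9.2.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(q):
    """Exact non-negative square root of a rational square; ValueError otherwise."""
    q = Fraction(q)
    if q < 0:
        raise ValueError(f"{q} is negative")
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    if n * n != q.numerator or d * d != q.denominator:
        raise ValueError(f"{q} is not a square in Q")
    return Fraction(n, d)

def halve(x1, roots):
    """Given x1 with every x1 - e_i a rational square, return P = (b/c, 1/c) as in the proof."""
    t = [rational_sqrt(Fraction(x1) - e) for e in roots]
    a = b = c = Fraction(0)
    for i in range(3):
        ej, ek = (roots[j] for j in range(3) if j != i)
        w = t[i] / ((roots[i] - ej) * (roots[i] - ek))   # Lagrange weight of t_i
        a += w * ej * ek          # constant term of (x - ej)(x - ek)
        b -= w * (ej + ek)        # linear term
        c += w                    # quadratic term
    return (b / c, 1 / c)

def x_of_double(P, A, B):
    """x-coordinate of 2P on y^2 = x^3 + A x + B (formula quoted in the proof)."""
    x0, y0 = P
    return (x0**4 - 2 * A * x0**2 - 8 * B * x0 + A * A) / (4 * y0 * y0)

A, B, roots = -556, 3120, (6, 20, -26)
x1 = Fraction(745, 36)
P0 = halve(x1, roots)
print(P0, x_of_double(P0, A, B))
```

Other sign choices for the t_i would give other halves of ±Q, which differ from one another by 2-torsion points and signs.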

Thus, the previous theorem shows that there is a homomorphism
$\delta : E(\mathbb{Q}) \to (\mathbb{Q}^\times/(\mathbb{Q}^\times)^2)^3$ with kernel equal to 2E(Q). In fact,
the theorem shows that there is a homomorphism from E(Q) into
Γ = {(δ1 , δ2 , δ3 ) ∈ (Q× /(Q× )2 )3 : δ1 · δ2 · δ3 = 1 ∈ Q× /(Q× )2 }.
Hence, δ induces an injection
E(Q)/2E(Q) → Γ ⊂ (Q× /(Q× )2 )3 .
The groups Q× /(Q× )2 and Γ are inﬁnite, so such an injection does
not tell us much about the size of E(Q)/2E(Q). However, the image
of E(Q)/2E(Q) is much smaller than Γ.

Example 2.9.4. Let E : y 2 = x3 − 556x + 3120 as in Example 2.9.2.

It turns out that $E(\mathbb{Q}) \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}^2$. The generators of the
torsion part are T1 = (6, 0) and T2 = (20, 0), and the generators of
the free part are P = (−8, 84) and Q = (24, 60). The image of the
map δ in this case is, therefore, generated by the images of T1 , T2 , P
and Q.
δ(T1 ) = (−7, −14, 2), δ(T2 ) = (14, 161, 46),
δ(P ) = (−14, −7, 2), δ(Q) = (2, 1, 2).

Thus, the image of δ is formed by the 16 elements that one obtains

by multiplying out δ(T1 ), δ(T2 ), δ(P ) and δ(Q), in all possible ways.
Thus, δ(E(Q)/2E(Q)) is the group:
{(1, 1, 1), (−7, −14, 2), (14, 161, 46), (−2, −46, 23),
(−14, −7, 2), (2, 2, 1), (−1, −23, 23), (7, 322, 46),
(2, 1, 2), (−14, −14, 1), (7, 161, 23), (−1, −46, 46),
(−7, −7, 1), (1, 2, 2), (−2, −23, 46), (14, 322, 23)}.
(Exercise: Check that the elements listed above form a group under
multiplication.) We see that the only primes that appear in the fac-
torization of the coordinates of elements in the image of δ are: 2, 7
and 23. Therefore, the coordinates of δ are not just in $\mathbb{Q}^\times/(\mathbb{Q}^\times)^2$ but
in a much smaller subgroup of 16 elements:
\[ \Gamma' = \{\pm 1, \pm 2, \pm 7, \pm 23, \pm 14, \pm 46, \pm 161, \pm 322\} \subset \mathbb{Q}^\times/(\mathbb{Q}^\times)^2. \]
And the image of E(Q)/2E(Q) embeds into
\[ \Gamma_\Delta = \{(\delta_1, \delta_2, \delta_3) \in \Gamma' \times \Gamma' \times \Gamma' : \delta_1 \cdot \delta_2 \cdot \delta_3 = 1 \in \mathbb{Q}^\times/(\mathbb{Q}^\times)^2\} \subset \Gamma' \times \Gamma' \times \Gamma'. \]
Since $\Gamma'$ has 16 elements and E(Q)/2E(Q) embeds into $(\Gamma')^3$, we
conclude that E(Q)/2E(Q) has at most $16^3 = 2^{12}$ elements. In fact,
$\Gamma_\Delta$ has only $16^2$ elements, so E(Q)/2E(Q) has at most $2^8$ elements.
Notice also the following interesting “coincidence”: the prime divisors
that appear in $\Gamma_\Delta$ coincide with the prime divisors of the discriminant
of E, which is $\Delta_E = 6795034624 = 2^{18} \cdot 7^2 \cdot 23^2$. In the next proposition
we explain that, in fact, this is always the case.
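
The exercise in this example (checking that the 16 listed triples form a group) is a finite computation. As a hedged illustration, the Python sketch below multiplies out the four generator triples quoted in the example in all possible ways, reducing each coordinate modulo squares; the names `squarefree_part`, `mul` and `image` are illustrative.

```python
from itertools import product

def squarefree_part(n):
    """Square-free part of a nonzero integer n (its class modulo squares)."""
    sign, n, d = (-1 if n < 0 else 1), abs(n), 2
    while d * d <= n:
        while n % (d * d) == 0:
            n //= d * d
        d += 1
    return sign * n

def mul(s, t):
    """Componentwise product of two triples, reduced modulo squares."""
    return tuple(squarefree_part(a * b) for a, b in zip(s, t))

# delta(T1), delta(T2), delta(P), delta(Q) as given in the example
generators = [(-7, -14, 2), (14, 161, 46), (-14, -7, 2), (2, 1, 2)]

image = set()
for exponents in product((0, 1), repeat=len(generators)):
    element = (1, 1, 1)
    for g, e in zip(generators, exponents):
        if e:
            element = mul(element, g)
    image.add(element)

print(len(image), sorted(image))
```

Since every element squares to the identity, taking exponents 0 or 1 for each generator already exhausts the subgroup they generate.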
Proposition 2.9.5. Let $E : y^2 = (x - e_1)(x - e_2)(x - e_3)$, with $e_i \in \mathbb{Z}$.
Let $P = (x_0, y_0) \in E(\mathbb{Q})$ and write
\[ (x_0 - e_1) = au^2, \quad (x_0 - e_2) = bv^2, \quad (x_0 - e_3) = cw^2, \quad y_0^2 = abc(uvw)^2, \]
where a, b, c, u, v, w ∈ Q, the numbers a, b, c ∈ Z are square-free, and
abc is a square (in Z). Then, if a prime p divides a · b · c, then p also divides
the quantity $\Delta = (e_1 - e_2)(e_2 - e_3)(e_1 - e_3)$.

Note: the discriminant of E equals $\Delta_E = 16(e_1 - e_2)^2(e_2 - e_3)^2(e_1 - e_3)^2$.
So a prime p divides Δ if and only if p divides $\Delta_E$. If
p > 2, then this is clear (see Exercise 2.12.19 for p = 2).

Proof. Suppose a prime p divides abc. Then p divides a, b or c. Let us

assume that p | a (the same argument works if p divides b or c). Let $p^k$
be the exact power of p that appears in the factorization of the rational
number $x_0 - e_1 = au^2$. Notice that k may be positive or negative,
depending on whether p divides the numerator or denominator of
$au^2$. Notice, however, that k must be odd, because p | a, and a is
square-free.
Suppose first that k < 0, i.e., $p^{|k|}$ is the exact power of p that
divides the denominator of $x_0 - e_1$. Since $e_i \in \mathbb{Z}$, it follows that
$p^{|k|}$ must divide the denominator of $x_0$ too, and therefore $p^{|k|}$ is the
exact power that divides the denominators of $x_0 - e_2$ and $x_0 - e_3$ as
well. Hence, $p^{3|k|}$ is the exact power of p dividing the denominator
of $y_0^2 = \prod_{i} (x_0 - e_i)$, but this is impossible because $y_0^2$ is a square and
3|k| is odd. Thus, k must be positive.
If k > 0 and p divides $x_0 - e_1$, then the denominator of $x_0$ is
not divisible by p, so it makes sense to consider $x_0 \bmod p$, and $x_0 \equiv e_1 \bmod p$.
Similarly, the denominators of $x_0 - e_2$ and $x_0 - e_3$ are not
divisible by p and
\[ bv^2 \equiv x_0 - e_2 \equiv e_1 - e_2, \quad \text{and} \quad cw^2 \equiv x_0 - e_3 \equiv e_1 - e_3 \bmod p. \]
Since $y_0^2 = abc(uvw)^2$ and p divides a, then p must also divide one
of b or c (because abc is a square and a is square-free). Let’s suppose it also
divides b. Then $0 \equiv bv^2 \equiv x_0 - e_2 \equiv e_1 - e_2 \bmod p$
and $\Delta = (e_1 - e_2)(e_2 - e_3)(e_1 - e_3) \equiv 0 \bmod p$, as
claimed.

The deﬁnition of the map δ and the previous proposition yield

the following immediate corollary:
Corollary 2.9.6. With notation as in the previous Theorem and
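Proposition, define a subgroup $\Gamma'$ of $\mathbb{Q}^\times/(\mathbb{Q}^\times)^2$ by
\[ \Gamma' = \{n \in \mathbb{Z} : 0 \neq n \text{ is square-free and if } p \mid n, \text{ then } p \mid \Delta\}/(\mathbb{Z}^\times)^2. \]
Then, δ induces an injection of E(Q)/2E(Q) into
\[ \Gamma_\Delta = \{(\delta_1, \delta_2, \delta_3) \in \Gamma' \times \Gamma' \times \Gamma' : \delta_1 \cdot \delta_2 \cdot \delta_3 = 1 \in \mathbb{Q}^\times/(\mathbb{Q}^\times)^2\} \subset \Gamma' \times \Gamma' \times \Gamma'. \]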

We are ready to prove the weak Mordell-Weil theorem (Thm.

2.4.5), at least in our restricted case:

Corollary 2.9.7 (Weak Mordell-Weil theorem). Let E : y 2 = (x −

e1 )(x−e2 )(x−e3 ) be an elliptic curve, with ei ∈ Z. Then E(Q)/2E(Q)
is ﬁnite.

Proof. By Cor. 2.9.6, E(Q)/2E(Q) injects into $\Gamma_\Delta \subset \Gamma' \times \Gamma' \times \Gamma'$.
Since $\Gamma'$ is finite, E(Q)/2E(Q) is finite as well.

2.10. Homogeneous spaces

In this section we want to make the weak Mordell-Weil theorem ex-
plicit, i.e., we want:
• explicit bounds on the size of E(Q)/2E(Q), and
• a method to find generators of E(Q)/2E(Q) (see Exercise
2.12.25, though).
Before we discuss bounds, we need to understand the structure
of the quotient E(Q)/2E(Q). Remember that, from the Mordell-Weil
theorem (Thm. 2.4.3), $E(\mathbb{Q}) \cong T \oplus \mathbb{Z}^{R_E}$ where $T = E(\mathbb{Q})_{\text{torsion}}$ is a
finite abelian group. Therefore,
\[ E(\mathbb{Q})/2E(\mathbb{Q}) \cong T/2T \oplus (\mathbb{Z}/2\mathbb{Z})^{R_E}. \]
In our restricted case, we have assumed all along that E(Q) contains
4 points of 2-torsion, namely O and $(e_i, 0)$, for i = 1, 2, 3. And, by
Exercise 2.12.6, E(Q) cannot have more points of order 2. Thus,
\[ T/2T \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z} \quad \text{(see Exercise 2.12.20).} \]
Hence, the size of E(Q)/2E(Q) is exactly $2^{R_E + 2}$, under our assumptions.
Recall that we defined ν(N) to be the number of distinct
prime divisors of an integer N . We prove our first bound:
Proposition 2.10.1. Let E : y 2 = (x − e1 )(x − e2 )(x − e3 ) be an
elliptic curve, with ei ∈ Z. Then the rank of E(Q) is RE ≤ 2ν(ΔE ).

Proof. If the quantity $\Delta_E$ has $\nu = \nu(\Delta_E)$ distinct (positive) prime
divisors, then we claim that the set
\[ \Gamma' = \{n \in \mathbb{Z} : 0 \neq n \text{ is square-free and if } p \mid n, \text{ then } p \mid \Delta\}/(\mathbb{Z}^\times)^2 \]
has precisely $2^{\nu(\Delta_E) + 1}$ elements. Indeed, if $\Delta_E = p_1^{s_1} \cdots p_\nu^{s_\nu}$, then
\[ \Gamma' = \{(-1)^{t_0} p_1^{t_1} \cdots p_\nu^{t_\nu} : t_i = 0 \text{ or } 1 \text{ for } i = 0, \ldots, \nu\}. \]
Thus, $\Gamma'$ has as many elements as $\{(t_0, \ldots, t_\nu) : t_i = 0 \text{ or } 1\}$, which
clearly has $2^{\nu + 1}$ elements. Moreover, the set $\Gamma_\Delta$, as defined in Corollary 2.9.6,
has as many elements as $\Gamma' \times \Gamma'$ (the third coordinate is determined by the
first two), i.e., $2^{2\nu + 2}$ elements. Since
E(Q)/2E(Q) injects into $\Gamma_\Delta$, we conclude that it also has at most
$2^{2\nu + 2}$ elements. Since the size of E(Q)/2E(Q) is $2^{R_E + 2}$, we conclude
that $R_E + 2 \leq 2\nu + 2$ and $R_E \leq 2\nu$, as claimed.
Example 2.10.2. Let
\[ E : y^2 = x^3 - 1156x = x(x - 34)(x + 34). \]
The discriminant of E/Q is $\Delta_E = 98867482624 = 2^{12} \cdot 17^6$. Hence,
ν(ΔE ) = 2 and the rank of E is at most 4. (The rank is in fact 2; see
Example 2.10.4 below.)
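
The bound of Proposition 2.10.1 only needs the prime divisors of the discriminant, so it is easy to evaluate. The following Python sketch (the helper names `prime_divisors` and `rank_bound` are illustrative, and trial division is only sensible for small inputs like these) computes the discriminant from the roots e1, e2, e3 and returns the bound 2ν.

```python
def prime_divisors(n):
    """Sorted list of the distinct prime divisors of a nonzero integer (trial division)."""
    n, primes, d = abs(n), [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def rank_bound(e1, e2, e3):
    """Discriminant 16*((e1-e2)(e2-e3)(e1-e3))^2, its prime divisors, and the bound 2*nu."""
    disc = 16 * ((e1 - e2) * (e2 - e3) * (e1 - e3)) ** 2
    primes = prime_divisors(disc)
    return disc, primes, 2 * len(primes)

print(rank_bound(0, -34, 34))     # the curve y^2 = x^3 - 1156x of this example
print(rank_bound(6, 20, -26))     # the curve y^2 = x^3 - 556x + 3120 of Example 2.9.2
```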

The bound RE ≤ 2ν(ΔE ) is, in general, not very sharp (The-

orem 2.7.4 is an improvement). However, the method we followed
to come up with the bound yields a strategy to ﬁnd generators for
E(Q)/2E(Q) as follows. Recall that E(Q)/2E(Q) embeds into ΓΔ
via the map δ, so we want to identify which elements of ΓΔ may
belong to the image of δ. Suppose (δ1 , δ2 , δ3 ) ∈ ΓΔ belongs to the
image of δ and it is not the image of a torsion point. Then there
exists P = (x0 , y0 ) ∈ E(Q) such that:
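\[
\begin{cases}
y_0^2 = (x_0 - e_1)(x_0 - e_2)(x_0 - e_3),\\
x_0 - e_1 = \delta_1 u^2,\\
x_0 - e_2 = \delta_2 v^2,\\
x_0 - e_3 = \delta_3 w^2
\end{cases}
\]
for some rational numbers u, v, w. We may substitute the last equation
in the previous two, and obtain
\[ e_3 - e_1 = \delta_1 u^2 - \delta_3 w^2, \qquad e_3 - e_2 = \delta_2 v^2 - \delta_3 w^2. \]
Recall that the elements $(\delta_1, \delta_2, \delta_3)$ that are in the image of δ satisfy
$\delta_1 \cdot \delta_2 \cdot \delta_3 = 1$ modulo squares. Thus, $\delta_3 = \delta_1 \cdot \delta_2 \cdot \lambda^2$ for some
non-zero rational number λ, and if we do a
change of variables $(u, v, w) \to \left(X, Y, \frac{Z}{\lambda}\right)$, we obtain a system
\[
C(\delta_1, \delta_2) : \begin{cases}
e_3 - e_1 = \delta_1 X^2 - \delta_1 \delta_2 Z^2,\\
e_3 - e_2 = \delta_2 Y^2 - \delta_1 \delta_2 Z^2,
\end{cases}
\]
or, equivalently, one can subtract both equations to get
\[
C(\delta_1, \delta_2) : \begin{cases}
e_1 - e_2 = \delta_2 Y^2 - \delta_1 X^2,\\
e_3 - e_2 = \delta_2 Y^2 - \delta_1 \delta_2 Z^2.
\end{cases}
\]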

The space C(δ1 , δ2 ) is the intersection of two conics, and it may have
rational points or not. If (δ1 , δ2 , δ3 ) is in the image of δ, however,
then the space C(δ1 , δ2 ) must have a rational point; i.e., there are
X, Y, Z ∈ Q that satisfy the equations of C(δ1 , δ2 ). Moreover, if
X0 , Y0 , Z0 ∈ Q are the coordinates of a point in C(δ1 , δ2 ), then

\[ P = (e_1 + \delta_1 X_0^2,\ \delta_1 \delta_2 X_0 Y_0 Z_0) \tag{2.12} \]

is a rational point on E(Q) such that δ(P ) = (δ1 , δ2 , δ3 ). The spaces

C(δ1 , δ2 ) are called homogeneous spaces and are extremely helpful
when we try to calculate the Mordell-Weil group of an elliptic curve.
We record our ﬁndings in the form of a proposition, for later use:

Proposition 2.10.3. Let E/Q be an elliptic curve with Weierstrass

equation y 2 = (x−e1 )(x−e2 )(x−e3 ), with ei ∈ Z and e1 +e2 +e3 = 0.
Let δ : E(Q)/2E(Q) → ΓΔ be the injection given by Corollary 2.9.6,
and let δ(E) := δ(E(Q)/2E(Q)) be the image of δ in ΓΔ . Then:
(1) If (δ1 , δ2 , δ3 ) ∈ δ(E), then the space C(δ1 , δ2 ) has a point
(X0 , Y0 , Z0 ) with rational coordinates, X0 , Y0 , Z0 ∈ Q.
(2) Conversely, if C(δ1 , δ2 ) has a rational point (X0 , Y0 , Z0 ),
then E(Q) has a rational point

\[ P = (e_1 + \delta_1 X_0^2,\ \delta_1 \delta_2 X_0 Y_0 Z_0). \]

(3) Since δ is a homomorphism and δ(E) is the image of δ, it

follows that δ(E) is a subgroup of ΓΔ . In particular:
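• If $(\delta_1, \delta_2, \delta_3)$ and $(\delta_1', \delta_2', \delta_3')$ are elements of the image,
then their product $(\delta_1 \cdot \delta_1', \delta_2 \cdot \delta_2', \delta_3 \cdot \delta_3')$ is also in the
image;
• If $(\delta_1, \delta_2, \delta_3) \in \delta(E)$ but $(\delta_1', \delta_2', \delta_3') \in \Gamma_\Delta$ is not in the
image, then their product $(\delta_1 \cdot \delta_1', \delta_2 \cdot \delta_2', \delta_3 \cdot \delta_3')$ is not
in the image δ(E);
• If $C(\delta_1, \delta_2)$ and $C(\delta_1', \delta_2')$ have rational points, then
$C(\delta_1 \cdot \delta_1', \delta_2 \cdot \delta_2')$ also has a rational point;
• If $C(\delta_1, \delta_2)$ has a rational point but $C(\delta_1', \delta_2')$ does not
have a rational point, then $C(\delta_1 \cdot \delta_1', \delta_2 \cdot \delta_2')$ does not
have a rational point.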

Example 2.10.4. Let $E : y^2 = x^3 - 1156x = x(x - 34)(x + 34)$. The
only prime divisors of $\Delta_E$ are 2 and 17. Thus, $\Gamma' = \{\pm 1, \pm 2, \pm 17, \pm 34\}$. Let
us choose $e_1 = 0$, $e_2 = -34$ and $e_3 = 34$. Therefore, the homogeneous
spaces for this curve are all of the form
\[
C(\delta_1, \delta_2) : \begin{cases}
\delta_2 Y^2 - \delta_1 X^2 = 34,\\
\delta_2 Y^2 - \delta_1 \delta_2 Z^2 = 68
\end{cases}
\]
with $\delta_1, \delta_2 \in \Gamma'$. We analyze these spaces, case by case. There are 64

pairs (δ1 , δ2 ) to take care of:

(1) ((δ1 , δ2 , δ3 ) = (1, 1, 1)). The point at inﬁnity (i.e., the origin)
is sent to (1, 1, 1) via δ, i.e., δ(O) = (1, 1, 1).
(2) ($\delta_1 < 0$ and $\delta_2 < 0$). The equation $\delta_2 Y^2 - \delta_1 \delta_2 Z^2 = 68$
cannot have solutions (in Q or R) because the left-hand side
is never positive for any Y, Z ∈ Q.
(3) (δ1 > 0 and δ2 < 0). The equation δ2 Y 2 − δ1 X 2 = 34
cannot have solutions (in Q or R), because the left-hand
side is always negative.
(4) (δ1 = −1, δ2 = 34). The space C(−1, 34) has a rational
point (X, Y, Z) = (0, 1, 1), which maps to T1 = (0, 0) on
E(Q) via Eq. (2.12).
(5) (δ1 = −34, δ2 = 2). The space C(−34, 2) has the rational
point (X, Y, Z) = (1, 0, 1), which maps to T2 = (−34, 0) on
E(Q) via Eq. (2.12).
(6) (δ1 = 34, δ2 = 17). If δ(T1 ) = δ((0, 0)) equals (−1, 34, −34),
and δ(T2 ) = (−34, 2, −17), then

δ(T1 + T2 ) = δ(T1 ) · δ(T2 ) = (−1, 34, −34) · (−34, 2, −17) = (34, 17, 2).

Thus, the space C(34, 17) must have a point that maps
back to T1 + T2 = (34, 0). Indeed, C(34, 17) has a point
(X, Y, Z) = (1, 2, 0) that maps to (34, 0) via Eq. (2.12).

(7) (δ1 = −1, δ2 = 2). The space C(−1, 2) has a rational point
(X, Y, Z) = (4, 3, 5), which maps to P = (−16, −120) on
E(Q) via Eq. (2.12). P is a point of infinite order.
(8) ((δ1 , δ2 ) = (1, 17), (34, 1), or (−34, 34)). These are the pairs
that correspond to (−1, 2) · γ, with γ = (−1, 34), (−34, 2)
or (34, 17). Therefore, the corresponding spaces C(δ1 , δ2 )
must have rational points that map to P + T1 , P + T2 and
P + T1 + T2 , respectively.
(9) (δ1 = −2, δ2 = 2). The space C(−2, 2) has a rational point
(X, Y, Z) = (1, 4, 3), which maps to Q = (−2, −48) on E(Q)
via Eq. (2.12). Q is a point of infinite order.
(10) ((δ1 , δ2 ) = (2, 17), (17, 1), or (−17, 34)). These are the pairs
that correspond to (−2, 2) · γ, with γ = (−1, 34), (−34, 2)
or (34, 17). Therefore, the corresponding spaces C(δ1 , δ2 )
must have rational points that map to Q + T1 , Q + T2 and
Q + T1 + T2 , respectively.
(11) ((δ1 , δ2 ) = (2, 1), and (−2, 34), (−17, 2), or (17, 17)). Since
(−1, 2) and (−2, 2) correspond to P and Q, respectively,
then (−1, 2) · (−2, 2) = (2, 1) corresponds to P + Q. The
other pairs correspond to (2, 1) · γ, with γ = (−1, 34),
(−34, 2) or (34, 17). Therefore, the corresponding spaces
C(δ1 , δ2 ) must have rational points that map to P + Q + T1 ,
P + Q + T2 and P + Q + T1 + T2 , respectively.
(12) (δ1 = 1, δ2 = 2). The space C(1, 2) does not have ratio-
nal points (see Exercise 2.12.21). In fact, it does not have
solutions in Q2 , the field of 2-adic numbers.
(13) ((δ1 , δ2 ) = (2, 2), (17, 2), (34, 2), (−1, 1), (−2, 1), (−17, 1),
(−34, 1), (−1, 17), (−2, 17), (−17, 17), (−34, 17), (1, 34),
(2, 34), (17, 34), (34, 34)). The corresponding spaces C(δ1 , δ2 )
do not have rational points. For instance, suppose C(2, 2)
had a point. Then (2, 2, 1) would be in the image of δ.
Since (2, 1, 2) is in the image of δ (we already saw above
that C(2, 1) has a point), then (2, 1, 2) · (2, 2, 1) = (1, 2, 2)
would also be in the image of δ, but we just saw (in the pre-
vious item) that (1, 2, 2) is not in the image of δ. Therefore,

we have reached a contradiction and C(2, 2) cannot have a

rational point. One can rule out all the other (δ1 , δ2 ) in the
list similarly.
We have analyzed all 64 possible pairs (δ1 , δ2 ) and have found that
the image of E(Q)/2E(Q) via δ has order $2^4$. Therefore, $2^{R_E + 2} = 2^4$
and RE = 2. The rank of the curve is exactly 2 and T1 , T2 , P and Q
(as found above) are generators of E(Q)/2E(Q). (In fact, they are
generators of E(Q) as well.)
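
The explicit points quoted in items (4) to (9) of this example can be verified mechanically. The Python sketch below checks that each listed triple (X, Y, Z) satisfies both equations of C(δ1, δ2) for e1 = 0, e2 = −34, e3 = 34, and that Eq. (2.12) sends it to a point of y^2 = x^3 − 1156x. The function names `on_space` and `to_curve` and the dictionary `witnesses` are illustrative; the numerical data is copied from the example.

```python
E1, E2, E3 = 0, -34, 34                     # roots chosen in the example

def on_space(d1, d2, X, Y, Z):
    """True if (X, Y, Z) satisfies both equations of C(d1, d2)."""
    return (d2 * Y * Y - d1 * X * X == E1 - E2 and
            d2 * Y * Y - d1 * d2 * Z * Z == E3 - E2)

def to_curve(d1, d2, X, Y, Z):
    """Point of E attached to a point of C(d1, d2) by Eq. (2.12)."""
    return (E1 + d1 * X * X, d1 * d2 * X * Y * Z)

def on_curve(x, y):
    return y * y == (x - E1) * (x - E2) * (x - E3)

# (delta1, delta2) -> (X, Y, Z), as listed in items (4), (5), (6), (7) and (9)
witnesses = {
    (-1, 34): (0, 1, 1),
    (-34, 2): (1, 0, 1),
    (34, 17): (1, 2, 0),
    (-1, 2): (4, 3, 5),
    (-2, 2): (1, 4, 3),
}

for (d1, d2), xyz in witnesses.items():
    point = to_curve(d1, d2, *xyz)
    print((d1, d2), xyz, on_space(d1, d2, *xyz), point, on_curve(*point))
```

This only confirms the spaces that do have points; showing that a space has no rational point requires the local or group-theoretic arguments given in the example.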

Example 2.10.5. Let $E : y^2 = x^3 - 6724x = x(x - 82)(x + 82)$. Let
$e_1 = 0$, $e_2 = -82$ and $e_3 = 82$. The only prime divisors of $\Delta_E$ are 2 and
41, hence $\Gamma' = \{\pm 1, \pm 2, \pm 41, \pm 82\}$. Let us analyze the homogeneous
spaces
\[
C(\delta_1, \delta_2) : \begin{cases}
\delta_2 Y^2 - \delta_1 X^2 = 82,\\
\delta_2 Y^2 - \delta_1 \delta_2 Z^2 = 164
\end{cases}
\]
as we did in the previous example. Once again, there are 64 pairs to
check:
(1) ((δ1 , δ2 , δ3 ) = (1, 1, 1)). The point at inﬁnity (i.e., the origin)
is sent to (1, 1, 1) via δ, i.e., δ(O) = (1, 1, 1).
(2) ($\delta_1 < 0$ and $\delta_2 < 0$). The equation $\delta_2 Y^2 - \delta_1 \delta_2 Z^2 = 164$
cannot have rational solutions because the left-hand side is
never positive for any Y, Z ∈ Q.
(3) (δ1 > 0 and δ2 < 0). The equation δ2 Y 2 −δ1 X 2 = 82 cannot
have rational solutions, because the left-hand side is always
negative.
(4) ((δ1 , δ2 ) = (−1, 82), (−82, 2), (82, 41)). The corresponding
spaces have (trivial) rational points that map, respectively,
to T1 = (0, 0), T2 = (−82, 0) and T3 = T1 + T2 = (82, 0) via
Eq. (2.12).
(5) ((δ1 , δ2 ) = (1, 2)). The space C(1, 2) does not have rational
points (same reason as for Exercise 2.12.21). In fact, it does
not have any solutions over Q2 .
(6) ((δ1 , δ2 ) = (−1, 41), (−82, 1), (82, 82)). The correspond-
ing spaces cannot have rational points, because these ele-
ments of ΓΔ are the product of (1, 2, 2), with no points,

times (−1, 82, −82), (−82, 2, −41), (82, 41, 2), which do have
points by a previous item in this list.
How about all the other possible pairs (δ1 , δ2 )? Consider (−1, 2, −2)
and its homogeneous space:
\[
C(-1, 2) : \begin{cases}
2Y^2 + X^2 = 82,\\
2Y^2 + 2Z^2 = 164.
\end{cases}
\]
Let us show that there are solutions to C(−1, 2) over R, Q2 and Q41 :
• (Over $\mathbb{R}$). The point $(0, \sqrt{41}, \sqrt{41})$ is a point on C(−1, 2)
defined over R.
• (Over $\mathbb{Q}_{41}$). Let $Y_0 = 1$ and put $f(X) = X^2 - 80$, $g(Z) = Z^2 - 81$.
By Hensel’s Lemma (see Appendix D.1 and Corollary D.1.2),
it suffices to show that there are $\alpha_0, \beta_0 \in \mathbb{F}_{41}$
such that
\[ f(\alpha_0) \equiv g(\beta_0) \equiv 0 \bmod 41 \quad \text{and} \quad f'(\alpha_0),\ g'(\beta_0) \not\equiv 0 \bmod 41. \]
The reader can check that the congruences α0 ≡ 11 mod 41
and β0 ≡ 9 mod 41 work. Thus, there are α, β ∈ Q41 such
that f (α) = 0 = g(β). Hence, (X0 , Y0 , Z0 ) = (α, 1, β) is a
point on C(−1, 2) defined over Q41 , as desired.
• (Over $\mathbb{Q}_2$). Let $X_0 = 0$ and put $f(Y) = Y^2 - 41$. Let $\alpha_0 = 1$.
Then $f(\alpha_0) = -40$, $f'(\alpha_0) = 2$ and
\[ 3 = \nu_2(-40) > \nu_2(2^2) = 2. \]
Thus, by Hensel’s Lemma (Theorem D.1.1; see also Ex-
ample D.1.4), there is α ∈ Q2 such that f (α) = 0, or
α2 = 41. Hence, the point (X0 , Y0 , Z0 ) = (0, α, α) is a point
on C(−1, 2) defined over Q2 , as desired.
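
The two applications of Hensel's Lemma above can be illustrated numerically by lifting the approximate roots to higher and higher precision. The Python sketch below is only an illustration to finite precision, not a proof; the function names `lift_sqrt_odd` and `lift_sqrt_two` are illustrative, and the three-argument `pow` with exponent −1 (modular inverse) assumes Python 3.8 or later.

```python
def lift_sqrt_odd(a, r, p, k):
    """Given r*r = a (mod p) with p odd and p not dividing r, return a root mod p**k."""
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Newton step r - f(r)/f'(r) for f(x) = x^2 - a, computed modulo p^(j+1)
        r = (r - (r * r - a) * pow(2 * r, -1, modulus)) % modulus
    return r

def lift_sqrt_two(a, k):
    """Return r with r*r = a (mod 2**k), assuming a = 1 (mod 8) and k >= 3."""
    r = 1                                   # 1*1 = a (mod 8) by assumption
    for j in range(3, k):
        # invariant: r*r = a (mod 2**j); adjust r so that it also holds mod 2**(j+1)
        if (r * r - a) % 2 ** (j + 1):
            r += 2 ** (j - 1)
    return r

alpha = lift_sqrt_odd(80, 11, 41, 6)        # X with X^2 = 80 in Z/41^6
beta = lift_sqrt_odd(81, 9, 41, 6)          # Z with Z^2 = 81 in Z/41^6
gamma = lift_sqrt_two(41, 20)               # Y with Y^2 = 41 in Z/2^20
print(alpha, beta, gamma)
```

Each extra step of the loops gains one more p-adic digit, which is the computational content of the lemma in these two cases.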
One can also show that, in fact, C(−1, 2) has a point over Qp for
all p ≥ 2. Therefore, we cannot deduce any contradictions working
locally about whether C(−1, 2) has a point over Q. A computer search
does not yield any Q-points on C(−1, 2). Therefore, our method
breaks at this point, and we cannot determine whether there is a
point on E(Q) that comes from C(−1, 2).
It turns out that C(−1, 2) does not have rational points (but this
is difficult to show). This type of space, a space that has solutions

everywhere locally (Qp , R) but not globally (Q) is the main obstacle
for the descent method to fully work.

2.11. Selmer and Sha

In Example 2.10.5, we found a type of homogeneous space that made
our approach to finding generators of E(Q)/2E(Q) break down. In
this section, we study everywhere locally solvable spaces in detail.
Let E : y 2 = (x − e1 )(x − e2 )(x − e3 ) be an elliptic curve with
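$e_i \in \mathbb{Z}$ and $e_1 + e_2 + e_3 = 0$. Let $\Gamma'$ be defined as in Corollary 2.9.6,
i.e.:
\[ \Gamma' = \{n \in \mathbb{Z} : 0 \neq n \text{ is square-free and if } p \mid n, \text{ then } p \mid \Delta\}/(\mathbb{Z}^\times)^2 \]
where $\Delta = (e_1 - e_2)(e_2 - e_3)(e_1 - e_3)$. We define H as the following
set of homogeneous spaces:
\[ H := \{C(\delta_1, \delta_2) : \delta_1, \delta_2 \in \Gamma'\}. \]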
Some homogeneous spaces in H have rational points that correspond
to rational points on E(Q); see Prop. 2.10.3. Other homogeneous
spaces do not have points (e.g. C(1, 2) in Example 2.10.4, or C(−1, 2)
in Example 2.10.5). For each elliptic curve, we define two different
sets of homogeneous spaces, the Selmer group and the Shafarevich-
Tate group, as follows. The Selmer group is
Sel2 (E/Q) := {C(δ1 , δ2 ) with points over R and Qp for all primes p}.
In other words, the Selmer group is the set of all homogeneous spaces
that are solvable everywhere locally, i.e., over R and over all fields of
p-adic numbers. The group operation on Sel2 (E/Q) is defined by
C(δ1 , δ2 ) · C(γ1 , γ2 ) = C(δ1 γ1 , δ2 γ2 ).

Notice that, due to Prop. 2.10.3, E(Q)/2E(Q) injects into H

via δ and the homogeneous spaces in the image of δ, i.e. δ(E) ⊆
H, have rational points. Since Q ⊆ Qp for all primes p ≥ 2, the
spaces in the image of δ belong to Sel2 (E/Q). Hence, Sel2 (E/Q)
has a subgroup formed by those homogeneous spaces in Sel2 (E/Q)
that have rational points as well (i.e., over Q), and this subgroup is
isomorphic to E(Q)/2E(Q):
E(Q)/2E(Q) = {C(δ1 , δ2 ) with points deﬁned over Q}.

Finally, the Shafarevich-Tate group is the quotient of the Selmer group

by its subgroup E(Q)/2E(Q). Thus, each element of the Shafarevich-
Tate group is represented by C(1, 1) or by a homogeneous space that
is solvable everywhere locally but does not have a rational point:
Ш2(E/Q) = {C(1, 1)}
∪ {C(δ1, δ2) ∈ Sel2(E/Q) without points over Q}.
These three groups, Selmer, Ш (or “Sha”) and E/2E, fit in a short
exact sequence
0 −→ E(Q)/2E(Q) −→ Sel2(E/Q) −→ Ш2(E/Q) −→ 0.
In other words, the map ψ : E(Q)/2E(Q) → Sel2(E/Q) is injective,
the map φ : Sel2(E/Q) → Ш2(E/Q) is surjective, and the kernel of
φ is the image of ψ.
Remark 2.11.1. Here for simplicity we have defined what number
theorists would usually refer to as the 2-part of the Selmer group
(Sel2(E/Q) above) and the 2-torsion of Ш (the group Ш2 as defined
above). For the definition of the full Selmer and Ш groups, see
[Sil86], Ch. X, §4.
Example 2.11.2. Let E : y 2 = x3 − 1156x, as in Example 2.10.4.
The full group of homogeneous spaces H has 64 elements:
H = {C(δ1 , δ2 ) : δi = ±1, ±2, ±17, ±34}.
The spaces in H with δ2 < 0 do not have points over R, so they
do not belong to Sel2 (E/Q). Moreover, we showed that the spaces
(δ1 , δ2 ) = (1, 2), (2, 2), (17, 2), (34, 2), (−1, 1), (−2, 1), (−17, 1), (−34, 1),
(−1, 17), (−2, 17), (−17, 17), (−34, 17), (1, 34), (2, 34), (17, 34), and
(34, 34) do not have points over Q2 . Therefore, they do not belong
to Sel2 (E/Q) either. All other spaces have rational points; therefore,
they are everywhere locally solvable, so they all belong to Sel2 (E/Q).
Hence,
Sel2 (E/Q) = {C(δ1 , δ2 ) : (δ1 , δ2 ) =
(1, 1), (−1, 34), (−34, 2), (34, 17),
(1, 17), (34, 1), (−34, 34), (−2, 2),
(17, 1), (−17, 34), (2, 1), (−2, 34),
(−17, 2), (17, 17), (−1, 2), (2, 17)}.

Notice that, indeed, the elements of Sel2(E/Q) listed above form
a subgroup of $\Gamma' \times \Gamma' \subset (\mathbb{Q}^\times/(\mathbb{Q}^\times)^2)^2$. Since all the elements of
Sel2(E/Q) have rational points, we conclude that Sel2(E/Q) equals
E(Q)/2E(Q) and
Ш2(E/Q) = Sel2(E/Q)/(E(Q)/2E(Q)) = {C(1, 1)},
i.e., Ш2 is the trivial group in this case.

Example 2.11.3. Let E : y 2 = x3 − 6724x, as in Example 2.10.5.

The full group of homogeneous spaces H has 64 elements:
H = {C(δ1 , δ2 ) : δi = ±1, ±2, ±41, ±82}.
The spaces in H with δ2 < 0 do not have points over R, so they
do not belong to Sel2 (E/Q). Moreover, the spaces (δ1 , δ2 ) = (1, 2), (2, 2),
(41, 2), (82, 2), (−1, 1), (−2, 1), (−41, 1), (−82, 1), (−1, 41), (−2, 41),
(−41, 41), (−82, 41), (1, 82), (2, 82), (41, 82), and (82, 82) do not have
points over Q2 . Therefore, they do not belong to Sel2 (E/Q) either. It
turns out that the rest of the spaces (such as C(−1, 2)) are everywhere
locally solvable (we showed this for C(−1, 2)). Therefore they all
belong to Sel2 (E/Q). Hence,
Sel2 (E/Q) = {C(δ1 , δ2 ) : (δ1 , δ2 ) =
(1, 1), (−1, 82), (−82, 2), (82, 41),
(1, 41), (82, 1), (−82, 82), (−2, 2),
(41, 1), (−41, 82), (2, 1), (−2, 82),
(−41, 2), (41, 41), (−1, 2), (2, 41)}.
The spaces (1, 1), (−1, 82), (−82, 2) and (82, 41) have rational points
that correspond to (torsion) points on E(Q). However, none of the
other spaces have rational solutions! Thus, the rest are representatives
of non-trivial elements of Sha, and we conclude that
E(Q)/2E(Q) = {C(1, 1), C(−1, 82), C(−82, 2), C(82, 41)}
and Ш2(E/Q) = {C(δ1, δ2) : (δ1, δ2) = (1, 1), (−1, 2), (−2, 2), (2, 1)}.
Notice that the elements of Ш2 listed above are representatives
of all the classes in the quotient of Sel2(E/Q) by E(Q)/2E(Q). For
instance, (−1, 2) · (1, 41) = (−1, 82) ∈ E(Q)/2E(Q). Thus, (−1, 2) ·
(1, 41) is trivial in Ш2.
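
The quotient described in this example is a finite bookkeeping exercise once the Selmer group and the image of δ are known. The Python sketch below takes both lists from the text as given (it proves nothing about rational points) and partitions the 16 Selmer elements into cosets of the 4-element image; the names `selmer`, `image`, `mul` and `cosets` are illustrative.

```python
def squarefree_part(n):
    """Square-free part of a nonzero integer n (its class modulo squares)."""
    sign, n, d = (-1 if n < 0 else 1), abs(n), 2
    while d * d <= n:
        while n % (d * d) == 0:
            n //= d * d
        d += 1
    return sign * n

def mul(s, t):
    """Product C(s) * C(t) = C(s1*t1, s2*t2), with entries reduced modulo squares."""
    return tuple(squarefree_part(a * b) for a, b in zip(s, t))

# Sel_2(E/Q) for y^2 = x^3 - 6724x, as listed in the example
selmer = [(1, 1), (-1, 82), (-82, 2), (82, 41),
          (1, 41), (82, 1), (-82, 82), (-2, 2),
          (41, 1), (-41, 82), (2, 1), (-2, 82),
          (-41, 2), (41, 41), (-1, 2), (2, 41)]
# image of E(Q)/2E(Q): the spaces with rational points
image = [(1, 1), (-1, 82), (-82, 2), (82, 41)]

cosets = {frozenset(mul(s, g) for g in image) for s in selmer}
for coset in cosets:
    print(sorted(coset))
```

With these inputs there are four cosets, and the pairs (1, 1), (−1, 2), (−2, 2), (2, 1) named in the example land in four different ones.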

2.12. Exercises
Exercise 2.12.1. Let f (x) = a0 xn + a1 xn−1 + . . . + an , with ai ∈ Z.
Prove that if $x = \frac{p}{q} \in \mathbb{Q}$, with gcd(p, q) = 1, is a solution of f(x) = 0,
then an is divisible by p and a0 is divisible by q.
Exercise 2.12.2. Let C be the conic defined by x2 − 2y 2 = 1.
(1) Find all the rational points on C. (Hint: the point O = (1, 0)
belongs to C. Let L(t) be the line that goes through O and
has slope t. Since C is a quadratic and L(t) ∩ C contains
at least one rational point, there must be a second point of
intersection Q. Find the coordinates of Q in terms of t.)
(2) Let $\alpha = 1 + \sqrt{2}$. Calculate $\alpha^2 = a + b\sqrt{2}$ and $\alpha^4 = c + d\sqrt{2}$
and verify that (a, b) and (c, d) are integral points on $C : x^2 - 2y^2 = 1$.
(Note: in fact, if $\alpha^{2n} = e + f\sqrt{2}$, then
(e, f) ∈ C and the coefficients of $\alpha^{2n+1}$ are a solution of
$x^2 - 2y^2 = -1$.)
(3) (This problem is only for those who already know about
continued fractions.) Find the continued fraction of $\sqrt{2}$ and
find the first 6 convergents. Use the convergents to find
three distinct (positive) integral solutions of x2 − 2y 2 = 1,
other than (1, 0). (Note: the reader should remind herself
or himself how to find the continued fraction and conver-
gents by hand, then check his or her answer using Sage; see
Appendix A.4.)
Exercise 2.12.3. Let C/Q be an affine curve.
(1) Suppose that C/Q is given by an equation of the form
(2.13) C : xy 2 + ax2 + bxy + cy 2 + dx + ey + f = 0.
Find an invertible change of variables that takes the equa-
tion of C onto one of the form xy 2 +gx2 +hxy+jx+ky+l = 0.
(Hint: consider a change of variables X = x + λ, Y = y).
(2) Suppose that C /Q is given by an equation of the form
(2.14) C : xy 2 + ax2 + bxy + cx + dy + e = 0.
Find an invertible change of variables that takes the equa-
tion of C onto one of the form y 2 + αxy + βy = x3 + γx2 +

δx + η. (Hint: multiply (2.14) by x and consider the change

of variables X = x and Y = xy. Make sure that, at the end,
the coefficients of y 2 and x3 equal 1.)
(3) Suppose that C /Q is a curve given by an equation of the
form
(2.15) C : y 2 + axy + by = x3 + cx2 + dx + e.
Find an invertible change of variables that takes the equa-
tion of C onto one of the form y 2 = x3 + Ax + B. (Hint:
do it in two steps. First eliminate the xy and y terms. Then
eliminate the x2 term.)
(4) Let E/Q : y 2 +43xy −210y = x3 −210x2 . Find an invertible
change of variables that takes the equation of E to one of
the form y 2 = x3 + Ax + B.
Exercise 2.12.4. Let C and E be curves defined, respectively, by
$C : V^2 = U^4 + 1$ and $E : y^2 = x^3 - 4x$. Let $\psi$ be the map defined by
$$\psi(U, V) = \left( \frac{2(V + 1)}{U^2}, \frac{4(V + 1)}{U^3} \right).$$
(1) Show that if $U \neq 0$ and $(U, V) \in C(\mathbb{Q})$, then $\psi(U, V) \in E(\mathbb{Q})$.
(2) Find an inverse function for $\psi$; i.e., find $\varphi : E \to C$ such
that $\varphi(\psi(U, V)) = (U, V)$.
Next, we work in projective coordinates. Let $C : W^2 V^2 = U^4 + W^4$
and $E : zy^2 = x^3 - 4xz^2$.
(3) Write down the definition of $\psi$ in projective coordinates; i.e.,
what is $\psi([U, V, W])$?
(4) Show that $\psi([0, 1, 1]) = [0, 1, 0] = O$.
(5) Show that $\psi([0, -1, 1]) = [0, 0, 1]$. (Hint: Show that
$\psi([U, V, W]) = [2U^2, 4UW, W(V - W)]$.)
Exercise 2.12.5. Use Sage to solve the following problems:
(1) Find 3Q, where $E : y^2 = x^3 - 25x$ and $Q = (-4, 6)$. Use
3Q to find a new right triangle with rational sides and area
equal to 5. (Hint: Examples 1.1.2 and 2.4.1.)
(2) Let $y^2 = x(x + 5)(x + 10)$ and $P = (-9, 6)$. Find nP for
n = 1, . . . , 12. Compare the x-coordinates of nP with the
list given at the end of Example 1.1.1, and write down the
next three numbers that belong in the list.

Exercise 2.12.6. Let $E/\mathbb{Q}$ be an elliptic curve given by a Weierstrass
equation of the form $y^2 = f(x)$, where $f(x) \in \mathbb{Z}[x]$ is a monic cubic
polynomial with distinct roots (over $\mathbb{C}$).
(1) Show that $P = (x, y) \in E$ is a torsion point of exact order
2 if and only if $y = 0$ and $f(x) = 0$.
(2) Let $E(\mathbb{Q})[2]$ be the subgroup of $E(\mathbb{Q})$ formed by those rational
points $P \in E(\mathbb{Q})$ such that $2P = O$. Show that the
size of $E(\mathbb{Q})[2]$ may be 1, 2 or 4.
(3) Give examples of three elliptic curves defined over $\mathbb{Q}$ where
the size of $E(\mathbb{Q})[2]$ is 1, 2 and 4, respectively.

Exercise 2.12.7. Let $E_t : y^2 + (1 - t)xy - ty = x^3 - tx^2$ with $t \in \mathbb{Q}$ and
$\Delta_t = t^5(t^2 - 11t - 1) \neq 0$. As we saw in Example 2.5.4 (or Appendix
E), every curve $E_t$ has a subgroup isomorphic to $\mathbb{Z}/5\mathbb{Z}$. Use Sage to
find elliptic curves with torsion $\mathbb{Z}/5\mathbb{Z}$ and rank 0, 1 and 2. Also, try
to find an elliptic curve $E_t$ with rank r, as high as possible. (Note:
the highest rank known, as of 6/1/2009, for an elliptic curve with
$\mathbb{Z}/5\mathbb{Z}$ torsion is 6, discovered by Dujella and Lecacheux in 2001; see
[Duj09].)

Exercise 2.12.8. Let $p \geq 2$ be a prime and $E_p : y^2 = x^3 + p^2$. Show
that there is no torsion point $P \in E_p(\mathbb{Q})$ with $y(P)$ equal to
$$y = \pm 1, \pm p^2, \pm 3p, \pm 3p^2, \text{ or } \pm 3.$$
Prove that $Q = (0, p)$ is a torsion point of exact order 3. Conclude
that $\{O, Q, 2Q\}$ are the only torsion points on $E_p(\mathbb{Q})$. (Note: for
$p = 3$, the point $(-2, 1) \in E_3(\mathbb{Q})$. Show that it is not a torsion
point.)

Exercise 2.12.9. Prove Proposition 2.6.8, as follows:
(1) First show that if $f(x)$ is a polynomial, $f'(x)$ its derivative,
and $f(\delta) = f'(\delta) = 0$, then $f(x)$ has a double root at $\delta$.
(2) Show that if $y^2 = f(x)$ is singular, where $f(x) \in K[x]$ is a
monic cubic polynomial, then the singularity must occur at
$(\delta, 0)$, where $\delta$ is a root of $f(x)$.
(3) Show that $(\delta, 0)$ is singular if and only if $\delta$ is a double root
of $f(x)$. Therefore $D = 0$ if and only if E is singular.

Exercise 2.12.10. Let $E/\mathbb{Q} : y^2 = x^3 + 3$. Find all the points of
$\widetilde{E}(\mathbb{F}_7)$ and verify that $N_7$ satisfies Hasse's bound.
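A brute-force count is a convenient way to check a hand computation like this one. The sketch below is plain Python (not Sage); the helper name `affine_points` is hypothetical, and the code assumes the prime is one of good reduction, which holds for $p = 7$ and $y^2 = x^3 + 3$. It lists the affine solutions modulo p, adds the point at infinity, and tests Hasse's bound $|p + 1 - N_p| \leq 2\sqrt{p}$ in squared form to avoid floating point.

```python
def affine_points(a, b, p):
    """All (x, y) with 0 <= x, y < p and y^2 = x^3 + a*x + b (mod p)."""
    return [(x, y)
            for x in range(p)
            for y in range(p)
            if (y * y - (x ** 3 + a * x + b)) % p == 0]


p = 7
points = affine_points(0, 3, p)   # reduction of y^2 = x^3 + 3 modulo 7
N_p = len(points) + 1             # add 1 for the point at infinity O

# Hasse: |p + 1 - N_p| <= 2*sqrt(p), i.e. (p + 1 - N_p)^2 <= 4p.
hasse_ok = (p + 1 - N_p) ** 2 <= 4 * p

print(points)
print(N_p, hasse_ok)
```

The exhaustive search costs about $p^2$ operations, which is fine for small primes but not a method one would use for large p.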

Exercise 2.12.11. Let $E/\mathbb{Q} : y^2 = x^3 + Ax + B$ and let $p \geq 3$ be
a prime of bad reduction for $E/\mathbb{Q}$. Show that $E(\mathbb{F}_p)$ has a unique
singular point.

Exercise 2.12.12. Prove parts (1) and (3) of Theorem 2.8.5. (Hint:
use Deﬁnition 2.8.4 and Proposition 2.7.3.)

Exercise 2.12.13. Prove Corollary 2.8.6.

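Exercise 2.12.14. Let $E : y^2 = x^3 - 10081x$. Use Sage (or PARI)
to find a minimal set of generators for the subgroup that is spanned
by all these points on E:
$$(0, 0), \quad (-100, 90), \quad \left( \frac{10081}{100}, \frac{90729}{1000} \right), \quad (-17, 408),$$
$$\left( \frac{907137}{6889}, -\frac{559000596}{571787} \right), \quad \left( \frac{1681}{16}, \frac{20295}{64} \right), \quad \left( \frac{833}{4}, \frac{21063}{8} \right),$$
$$\left( -\frac{161296}{1681}, \frac{19960380}{68921} \right), \quad \left( -\frac{6790208}{168921}, -\frac{40498852616}{69426531} \right).$$
(Hint: use Theorem 2.7.4 to determine the rank of $E/\mathbb{Q}$.)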

Exercise 2.12.15. Let E and $\delta$ be defined as in Theorem 2.9.3, and
suppose $P = (x_0, y_0)$ is a point on E with $y_0 \neq 0$. Show:
• $\delta(P) \cdot \delta(O) = \delta(P)$.
• $\delta((e_1, 0)) \cdot \delta((e_2, 0)) = \delta((e_1, 0) + (e_2, 0))$.
• $\delta(P) \cdot \delta((e_1, 0)) = \delta(P + (e_1, 0))$.

Exercise 2.12.16. Let $E : y^2 = x^3 + Ax + B$ be an elliptic curve with
$A, B \in \mathbb{Q}$, and suppose $P = (x_0, y_0)$ is a point on E, with $y_0 \neq 0$.
(1) Prove that the x-coordinate of 2P is given by
$$x(2P) = \frac{x_0^4 - 2Ax_0^2 - 8Bx_0 + A^2}{4y_0^2}.$$
(2) Find a formula for $y(2P)$ in terms of $x_0$ and $y_0$.
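The duplication formula of part (1) is easy to test numerically before proving it. The sketch below uses exact rational arithmetic from Python's standard library; the function names `double_x` and `double_point` are hypothetical. It compares the closed formula for $x(2P)$ with the tangent-line construction of 2P on a short Weierstrass curve, and assumes $y_0 \neq 0$ as in the exercise.

```python
from fractions import Fraction


def double_x(A, B, x0, y0):
    """x(2P) from the closed formula of part (1); requires y0 != 0."""
    x0, y0 = Fraction(x0), Fraction(y0)
    if y0 == 0:
        raise ValueError("P has order 2, so 2P = O")
    return (x0 ** 4 - 2 * A * x0 ** 2 - 8 * B * x0 + A ** 2) / (4 * y0 ** 2)


def double_point(A, x0, y0):
    """2P from the tangent line at P on y^2 = x^3 + A*x + B (B is not needed)."""
    x0, y0 = Fraction(x0), Fraction(y0)
    if y0 == 0:
        raise ValueError("P has order 2, so 2P = O")
    slope = (3 * x0 ** 2 + A) / (2 * y0)
    x2 = slope ** 2 - 2 * x0
    y2 = slope * (x0 - x2) - y0
    return x2, y2


# Test point Q = (-4, 6) on y^2 = x^3 - 25x, as in Exercise 2.12.5.
A, B = -25, 0
x2, y2 = double_point(A, -4, 6)
print(double_x(A, B, -4, 6), (x2, y2))
```

Agreement on a few sample points is of course only evidence, not a proof, but it catches sign errors in the algebra quickly.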

Exercise 2.12.17. The curve $E/\mathbb{Q} : y^2 = x^3 - 157^2 x$ has a rational
point Q with x-coordinate $x = x(Q)$ given by
$$x = \left( \frac{224403517704336969924557513090674863160948472041}{17824664537857719176051070357934327140032961660} \right)^2.$$
Show that there exists a point $P \in E(\mathbb{Q})$ such that $2P = Q$. Find
the coordinates of P. (Hint: use PARI or Sage and Exercise 2.12.16.)

Exercise 2.12.18. Let $E : y^2 = (x - e_1)(x - e_2)(x - e_3)$ with $e_i \in \mathbb{Q}$,
distinct, and such that $e_1 + e_2 + e_3 = 0$. Additionally, suppose that
$e_1 - e_2 = n^2$ and $e_1 - e_3 = m^2$ are squares. This exercise shows
that, under these assumptions, there is a point $P = (x_0, y_0)$ such that
$2P = (e_1, 0)$, i.e., P is a point of exact order 4.
(1) Show that $e_1 = \frac{n^2 + m^2}{3}$, $e_2 = \frac{m^2 - 2n^2}{3}$, $e_3 = \frac{n^2 - 2m^2}{3}$.
(2) Find A and B, in terms of n and m, such that $x^3 + Ax + B =
(x - e_1)(x - e_2)(x - e_3)$. (Hint: Sage or PARI can be of great
help here.)
(3) Let $p(x) = x^4 - 2Ax^2 - 8Bx + A^2 - 4(x^3 + Ax + B)e_1$. Show
that $p(x_0) = 0$ if and only if $x(2P) = e_1$, and therefore
$2P = (e_1, 0)$. (Hint: use Exercise 2.12.16.)
(4) Express all the coefficients of $p(x)$ in terms of n and m.
(Hint: use Sage or PARI.)
(5) Factor $p(x)$ for $(n, m) = (3, 6), (3, 12), (9, 12), \ldots$.
(6) Guess that $p(x) = (x - a)^2 (x - b)^2$ for some a and b. Express
all the coefficients of $p(x)$ in terms of a and b.
(7) Finally, compare the coefficients of $p(x)$ in terms of a, b and
n, m and find the roots of $p(x)$ in terms of n, m. (Hint:
compare first the coefficient of $x^3$ and then the coefficient of
$x^2$.)
(8) Write $P = (x_0, y_0)$ in terms of n and m.
Exercise 2.12.19. Let e1 , e2 , e3 be three distinct integers. Show that

Δ = (e1 − e2 )(e2 − e3 )(e1 − e3 ) is always even.
Exercise 2.12.20. In this exercise we study the structure of the
quotient G/2G, where G is a finite abelian group.
(1) Let $p \geq 2$ be a prime and let $G = \mathbb{Z}/p^e\mathbb{Z}$, with $e \geq 1$. Prove
that $G/2G$ is trivial if and only if $p > 2$.
(2) Prove that, if $G = \mathbb{Z}/2^e\mathbb{Z}$ and $e \geq 1$, then $G/2G \cong \mathbb{Z}/2\mathbb{Z}$.
(3) Finally, let G be an arbitrary finite abelian group. We define
$G[2^\infty]$ to be the 2-primary component of G, i.e.,
$$G[2^\infty] = \{g \in G : 2^n \cdot g = 0 \text{ for some } n \geq 1\}.$$
In other words, $G[2^\infty]$ is the subgroup of G formed by those
elements of G whose order is a power of 2. Prove that
$$G[2^\infty] \cong \mathbb{Z}/2^{e_1}\mathbb{Z} \oplus \mathbb{Z}/2^{e_2}\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/2^{e_r}\mathbb{Z}$$
for some $r \geq 0$ and $e_i \geq 1$ (here $r = 0$ means $G[2^\infty]$ is
trivial). Also show that $G/2G \cong (\mathbb{Z}/2\mathbb{Z})^r$.
Exercise 2.12.21. Show that the space
$$C : \begin{cases} 2Y^2 - X^2 = 34, \\ Y^2 - Z^2 = 34 \end{cases}$$
does not have any rational solutions with $X, Y, Z \in \mathbb{Q}$. (Hint: modify
the system so there are no powers of 2 in any of the denominators,
then work modulo 8.)
Exercise 2.12.22. For the following elliptic curves, use the method
of 2-descent (as in Proposition 2.10.3 and Example 2.10.4) to find the
rank of E/Q and generators of E(Q)/2E(Q). Do not use Sage:
(1) $E : y^2 = x^3 - 14931x + 220590$.
(2) $E : y^2 = x^3 - x^2 - 6x$.
(3) $E : y^2 = x^3 - 37636x$.
(4) $E : y^2 = x^3 - 962x^2 + 148417x$. (Hint: use Theorem 2.7.4
first to find a bound on the rank.)
Exercise 2.12.23. Find the rank and generators for the rational
points on the elliptic curve y 2 = x(x + 5)(x + 10).
Exercise 2.12.24. (Elliptic curves with non-trivial rank.) The goal

here is a systematic way to find curves of rank at least r ≥ 0 without
using tables of elliptic curves:
(1) (Easy) Find 3 non-isomorphic elliptic curves over Q with
rank ≥ 2. You must prove that the rank is at least 2. (To
show linear independence, you may use PARI or Sage to
calculate the height matrix).
(2) (Fair) Find 3 non-isomorphic elliptic curves over Q with rank
≥ 3.
(3) (Medium difficulty) Find 3 non-isomorphic elliptic curves
over Q with rank ≥ 6. If so, then you can probably find 3
curves of rank ≥ 8 as well.
(4) (Significantly harder) Find 3 non-isomorphic elliptic curves
over Q of rank ≥ 10.
(5) (You would be famous!) Find an elliptic curve over Q of
rank ≥ 29.
Exercise 2.12.25. Let E be an elliptic curve and suppose that the
images of the points P1 , P2 , . . . , Pn ∈ E(Q) in E(Q)/2E(Q) generate
the group E(Q)/2E(Q). Let G be the subgroup of E(Q) generated
by P1 , P2 , . . . , Pn .
(1) Prove that the index of G in E(Q) is finite, i.e., the quotient
group E(Q)/G is finite.
(2) Show that, depending on the choice of generators {Pi } of
the quotient E(Q)/2E(Q), the size of E(Q)/G may be ar-
bitrarily large.
Exercise 2.12.26. Fermat's last theorem shows that $x^3 + y^3 = z^3$
has no integer solutions with $xyz \neq 0$. Find the first $d \geq 1$ such
that $x^3 + y^3 = dz^3$ has infinitely many non-trivial solutions, find a
generator for the solutions and write down a few examples. (Hint:
Example 2.2.3.)
Chapter 3

Modular curves

We saw in the introduction (Section 1.2) that a modular form is an

object defined analytically. So far, we have only discussed algebraic
aspects of elliptic curves. Before we go into the precise definitions of
modular forms (Chapter 4), we need to consider elliptic curves over
the complex numbers in order to motivate the definition of modular
curves from the theory of elliptic curves, which in turn will motivate
the definition of modular forms. In this chapter, we shall see that
when we consider an elliptic curve E/Q as defined over C instead,
then E(C) is homeomorphic to a torus over C. We remind the reader
that Appendix B contains a concise introduction to complex analysis.

3.1. Elliptic curves over C

3.1.1. Lattices.

Definition 3.1.1. A lattice L in the complex plane is an additive
discrete subgroup of $\mathbb{C}$ such that $L \otimes \mathbb{R} = \mathbb{C}$.

Alternatively, a lattice can be defined by its generators. Let $w_1 =
u_1 + v_1 i$ and $w_2 = u_2 + v_2 i$ be two non-zero complex numbers such
that the vectors $(u_1, v_1)$ and $(u_2, v_2)$ are linearly independent in $\mathbb{R}^2$.
Then, the set
$$L = \{m w_1 + n w_2 : m, n \in \mathbb{Z}\}$$
is a lattice, and every lattice is given in this way. The lattice generated
by $w_1, w_2 \in \mathbb{C}$ is denoted by $\langle w_1, w_2 \rangle$. We will insist on a positive
orientation of our basis; i.e., we require $w_1/w_2$ to have positive imaginary
part. In other words, $w_1/w_2$ belongs to the upper half complex
plane $\mathbb{H}$, where
$$\mathbb{H} = \{a + bi \in \mathbb{C} : b > 0\}.$$

Figure 1. Points in a lattice in the complex plane.

Example 3.1.2. The Gaussian integers $\mathbb{Z}[i] = \{a + bi : a, b \in \mathbb{Z}\}$
form a lattice. One can take $w_1 = i$ and $w_2 = 1$ as generators (notice
that $w_1/w_2$ has positive imaginary part). See Exercise 3.7.2.

Example 3.1.3. The set $\mathbb{Z}[\sqrt{2}] = \{a + b\sqrt{2} : a, b \in \mathbb{Z}\}$ is not a lattice,
because when we replace $a, b \in \mathbb{Z}$ by $a, b \in \mathbb{R}$ we do not obtain all of
$\mathbb{C}$ but only a 1-dimensional real space (in this case just $\mathbb{R}$). In other
words, there are no two points $w_1, w_2$ in $\mathbb{Z}[\sqrt{2}]$ whose coordinates
are linearly independent in $\mathbb{R}^2$.

We shall be interested in quotients of C by a lattice L.

Definition 3.1.4. Let L be a lattice, and let $w_1, w_2 \in \mathbb{C}$ be generators
of L. The group $\mathbb{C}/L$ is the quotient of $\mathbb{C}$, as an additive group, by
its subgroup L. In other words, we define $\mathbb{C}/L$ via an equivalence
relation: we say that $z_1$ and $z_2$ are equivalent modulo L if there is
$w \in L$ such that $z_1 - z_2 = w$. Then $\mathbb{C}/L$ is the set of equivalence
classes of $\mathbb{C}$ modulo L.
If $L = \langle w_1, w_2 \rangle$, then the parallelogram
$$\mathcal{F} = \{\lambda w_1 + \mu w_2 : 0 \leq \lambda, \mu < 1\}$$
is called a fundamental domain for $\mathbb{C}/L$. Notice that there is a one-to-one
correspondence between elements of $\mathcal{F}$ and classes in $\mathbb{C}/L$; i.e.,
the elements of $\mathcal{F}$ form a complete set of representatives for $\mathbb{C}/L$.
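To make the correspondence between classes in $\mathbb{C}/L$ and points of the fundamental parallelogram concrete, here is a minimal floating-point sketch in Python. The function names `lattice_coords` and `reduce_mod_lattice` are hypothetical. It writes z in the real basis $w_1, w_2$ and then discards the integer parts of the two coordinates, assuming only that $w_1$ and $w_2$ are linearly independent over $\mathbb{R}$.

```python
import math


def lattice_coords(z, w1, w2):
    """Real numbers (lam, mu) with z = lam*w1 + mu*w2 (Cramer's rule)."""
    det = w1.real * w2.imag - w1.imag * w2.real
    if det == 0:
        raise ValueError("w1 and w2 do not span a lattice")
    lam = (z.real * w2.imag - z.imag * w2.real) / det
    mu = (w1.real * z.imag - w1.imag * z.real) / det
    return lam, mu


def reduce_mod_lattice(z, w1, w2):
    """Representative of z mod <w1, w2> with 0 <= lam, mu < 1."""
    lam, mu = lattice_coords(z, w1, w2)
    lam -= math.floor(lam)
    mu -= math.floor(mu)
    return lam * w1 + mu * w2


# The Gaussian integers <i, 1>: the point 3.25 - 1.5i is equivalent to 0.25 + 0.5i.
print(reduce_mod_lattice(3.25 - 1.5j, 1j, 1 + 0j))
```

Two complex numbers are equivalent modulo L exactly when this reduction returns the same representative for both, up to rounding error.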

Figure 2. A fundamental domain for a lattice in the complex plane.

Notice also that if L is a lattice, then $\mathbb{C}/L$ is a (flat) torus because
each side of the parallelogram $\mathcal{F}$ is identified with the opposite side
modulo L.
Example 3.1.5. Let $L = \mathbb{Z}[i] = \langle i, 1 \rangle$. A fundamental domain for
$\mathbb{C}/\mathbb{Z}[i]$ is given by $\mathcal{F} = \{\lambda i + \mu : 0 \leq \lambda, \mu < 1\}$, which is just
a square (only two sides are actually included in $\mathcal{F}$). Notice that
$\lambda i \equiv \lambda i + 1 \bmod L$ for all $\lambda \in \mathbb{R}$ (because $(\lambda i + 1) - \lambda i = 1 \in L$), and
$\mu \equiv \mu + i \bmod L$ for all $\mu \in \mathbb{R}$ (because $i \in L$). Therefore, each side
of the square $\mathcal{F}$ is identified with the opposite side modulo the lattice
L. Thus, $\mathbb{C}/\mathbb{Z}[i]$ is indeed a torus when considered as a surface.
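Proposition 3.1.6. Let $L = \langle w_1, w_2 \rangle$ and $L' = \langle w_1', w_2' \rangle$ be lattices
with oriented bases (i.e., $w_1/w_2$ and $w_1'/w_2' \in \mathbb{H}$).
(1) $L = L'$ if and only if there is a matrix $M \in \mathrm{SL}(2, \mathbb{Z})$ such
that $\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$.
(2) There is a complex analytic (i.e., holomorphic) isomorphism
of the quotients $\mathbb{C}/L$ and $\mathbb{C}/L'$ (as additive groups) if and
only if $L' = \alpha L$ for some $\alpha \in \mathbb{C}$.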

In Appendix B the reader can find the definition of analytic function
(Definition B.2.4). Moreover, we have also included a section
that describes what it means for a map $\mathbb{C}/L \to \mathbb{C}/L'$ to be analytic
(see Section B.6).
Corollary 3.1.7. Let $L = \langle w_1, w_2 \rangle$ and $L' = \langle w_1', w_2' \rangle$ be oriented
bases of lattices such that there is an analytic isomorphism $\mathbb{C}/L \cong
\mathbb{C}/L'$ of abelian groups. Then, there is an $\alpha \in \mathbb{C}^\times$ and $M \in \mathrm{SL}(2, \mathbb{Z})$
such that $\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = \alpha M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$.

The proofs of (most of) the proposition and corollary are left as
exercises (Exercises 3.7.2 and 3.7.3). We will, however, need to rely
on the following fact from complex analysis without giving a proof:
if a map $\psi : \mathbb{C}/L \to \mathbb{C}/L'$ is an analytic isomorphism, then there
is $\alpha \in \mathbb{C}^\times$ such that $L' = \alpha L$ and $\psi(z \bmod L) = \alpha z \bmod L'$. See
[Sil86], Ch. VI, Theorem 4.1 for more details.
Remark 3.1.8. Let $L = \langle w_1, w_2 \rangle$ and $L' = \langle w_1', w_2' \rangle$ be two arbitrary
lattices. Then, the map $\psi : \mathbb{C}/L \to \mathbb{C}/L'$, given by
$$\psi(\lambda w_1 + \mu w_2 \bmod L) = \lambda w_1' + \mu w_2' \bmod L'$$
for any $0 \leq \lambda, \mu < 1$, is a bijection of sets (indeed, $\psi$ is a bijection
between the fundamental domains of $\mathbb{C}/L$ and $\mathbb{C}/L'$). In fact, $\psi$ is
also an isomorphism of abelian groups. However, in general, this map
is not analytic.
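Example 3.1.9. Let $L = \mathbb{Z}[i] = \langle i, 1 \rangle$ with $w_1 = i$ and $w_2 = 1$. Let
$$M = \begin{pmatrix} 3 & 5 \\ 1 & 2 \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}).$$
Put
$$\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = M \begin{pmatrix} i \\ 1 \end{pmatrix} = \begin{pmatrix} 3i + 5 \\ i + 2 \end{pmatrix}.$$
Then $\mathbb{Z}[i] = \langle i, 1 \rangle = \langle 5 + 3i, 2 + i \rangle$. Indeed, it is clear that
$5 + 3i, 2 + i \in \langle i, 1 \rangle$. Moreover,
$$3 \cdot (2 + i) - (5 + 3i) = 1 \quad \text{and} \quad 2 \cdot (5 + 3i) - 5 \cdot (2 + i) = i;$$
therefore $\langle i, 1 \rangle \subseteq \langle 5 + 3i, 2 + i \rangle$, and so they are equal lattices. Now,
define
$$L' = \left\langle \frac{i}{5} + \frac{13}{5}, 1 \right\rangle = \frac{1}{2 + i} \langle 5 + 3i, 2 + i \rangle = \frac{1}{2 + i} \mathbb{Z}[i].$$
By Proposition 3.1.6, there is an isomorphism $\mathbb{C}/\langle i, 1 \rangle \cong \mathbb{C}/\left\langle \frac{i}{5} + \frac{13}{5}, 1 \right\rangle$.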
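The arithmetic in this example is short enough to replay with Python's built-in complex numbers, which are exact here because only small integers are involved until the final division. The variable names below are chosen for this sketch only.

```python
# Change of basis of Z[i] = <i, 1> by the matrix M = (3 5; 1 2).
M = ((3, 5), (1, 2))
w1, w2 = 1j, 1

det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
w1_new = M[0][0] * w1 + M[0][1] * w2   # 3i + 5
w2_new = M[1][0] * w1 + M[1][1] * w2   # i + 2

# Recover the old generators 1 and i from the new ones.
one_again = 3 * w2_new - w1_new
i_again = 2 * w1_new - 5 * w2_new

# Rescaling by 1/(2 + i) gives a basis of the form <tau, 1>.
tau = w1_new / w2_new

print(det_M, w1_new, w2_new, one_again, i_again, tau)
```

The last value is a floating-point approximation of $\frac{13}{5} + \frac{i}{5}$, whose imaginary part is positive as required for an oriented basis.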

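Suppose that $L = \langle w_1, w_2 \rangle$ is an arbitrary lattice, with an oriented
basis (i.e., $w_1/w_2 \in \mathbb{H}$). Then $L' = \langle \tau, 1 \rangle$, with $\tau = w_1/w_2 \in \mathbb{H}$, is
another lattice such that, by Prop. 3.1.6, $\mathbb{C}/L \cong \mathbb{C}/L'$. Therefore,
this shows that for every lattice L, there is a lattice of the form
$L' = \langle \tau, 1 \rangle$ with $\tau \in \mathbb{H}$ such that $\mathbb{C}/L \cong \mathbb{C}/L'$.
When is $\mathbb{C}/\langle \tau, 1 \rangle \cong \mathbb{C}/\langle \tau', 1 \rangle$? If the two quotients are isomorphic,
then Cor. 3.1.7 implies that there must be a matrix
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$$
and $\alpha \in \mathbb{C}^\times$ such that
$$\begin{pmatrix} \tau' \\ 1 \end{pmatrix} = \alpha \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \tau \\ 1 \end{pmatrix} = \begin{pmatrix} \alpha(a\tau + b) \\ \alpha(c\tau + d) \end{pmatrix}.$$
Thus, $1 = \alpha(c\tau + d)$ and so $\alpha = (c\tau + d)^{-1}$. Hence,
$$\tau' = \frac{a\tau + b}{c\tau + d} \quad \text{with } ad - bc = 1.$$
If $M = (a, b; c, d)$ is a matrix in $\mathrm{SL}(2, \mathbb{Z})$ and $\tau \in \mathbb{H}$, we will write
$M\tau := \frac{a\tau + b}{c\tau + d} \in \mathbb{H}$. We record our findings in the form of a proposition.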

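Proposition 3.1.10. Let $L = \langle w_1, w_2 \rangle$ be a lattice in $\mathbb{C}$.
(1) There is a $\tau \in \mathbb{H}$ such that $\mathbb{C}/L \cong \mathbb{C}/\langle \tau, 1 \rangle$.
(2) Let $\tau, \tau' \in \mathbb{H}$. Then $\mathbb{C}/\langle \tau, 1 \rangle \cong \mathbb{C}/\langle \tau', 1 \rangle$ if and only if there
is a matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$ such that
$$\tau' = M\tau = \frac{a\tau + b}{c\tau + d}.$$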
3.2. Functions on lattices and elliptic functions

In this section we discuss functions on $\mathbb{C}/L$. One way to construct a
function $f : \mathbb{C}/L \to \mathbb{C}$ is to find a function $\tilde{f} : \mathbb{C} \to \mathbb{C}$ that is periodic
with respect to the lattice L, i.e., $\tilde{f}(z + w) = \tilde{f}(z)$ for all $w \in L$. Thus,
$\tilde{f}$ induces a well-defined function on $\mathbb{C}/L$ because, if $z_1 \equiv z_2 \bmod L$
(i.e., $z_2 = z_1 + w$ for some $w \in L$), then $\tilde{f}(z_1) = \tilde{f}(z_2)$. Hence, we
can define $f(z \bmod L) := \tilde{f}(z)$ and this is a well-defined function on
$\mathbb{C}/L$. The functions of this type are called elliptic functions.
Definition 3.2.1. An elliptic function (relative to a lattice $L \subset \mathbb{C}$) is
a meromorphic function $f(z) : \mathbb{C} \to \mathbb{C} \cup \{\infty\}$ that satisfies $f(z + w) =
f(z)$ for all $z \in \mathbb{C}$ and all $w \in L$. The set of all elliptic functions for
L is denoted by $\mathbb{C}(L)$.

The most important example of an elliptic function is the Weierstrass
$\wp$-function.
Definition 3.2.2. Let L be a lattice. The Weierstrass $\wp$-function
relative to L is the function
$$\wp(z, L) = \frac{1}{z^2} + \sum_{0 \neq w \in L} \left( \frac{1}{(z - w)^2} - \frac{1}{w^2} \right).$$

The Weierstrass $\wp$-function is a meromorphic function on $\mathbb{C}$ and it
has (double) poles at each lattice point $w \in L$. And, most importantly
for us, the Weierstrass $\wp$-function is an elliptic function for the lattice
L since $\wp(z, L) = \wp(z + v, L)$ for any $v \in L$ (check this!). The
Laurent series of the $\wp$-function is also very important. In order to
be able to write down the Laurent series, we need to define another
very important function of lattices: the Eisenstein series.
Definition 3.2.3. Let $k \geq 2$ and let L be a lattice. The Eisenstein
series of L of weight 2k is the series
$$G_{2k}(L) = \sum_{0 \neq w \in L} \frac{1}{w^{2k}}.$$

Here, we will not worry too much about convergence, but the
worried reader may be relieved to know that G2k (L) is absolutely
convergent for k > 1 and ℘(z, L) converges uniformly on every com-
pact subset of C − L (the worried reader can ﬁnd a proof of the
convergence in [Sil86], Ch. VI, Theorem 3.1). We are now ready to

write down the Laurent series about z = 0 for the function ℘(z, L).

Theorem 3.2.4. Let L be a lattice.
(1) The Laurent series for $\wp(z, L)$ about $z = 0$ is given by
$$\wp(z, L) = \frac{1}{z^2} + \sum_{k=1}^{\infty} (2k + 1) G_{2k+2}(L) z^{2k},$$
where $G_{2k+2}(L)$ is the Eisenstein series for L of weight $2k + 2$.
(2) Let $\wp'(z, L)$ be the derivative of $\wp$ with respect to z. For all
$z \in \mathbb{C}$ and $z \notin L$,
$$\left( \frac{\wp'(z, L)}{2} \right)^2 = \wp(z, L)^3 - 15 G_4(L) \wp(z, L) - 35 G_6(L).$$
In other words, $\left( \wp(z, L), \frac{\wp'(z, L)}{2} \right)$ is a point on the elliptic
curve $E_L(\mathbb{C})$, where
$$E_L/\mathbb{C} : y^2 = x^3 - 15 G_4(L) x - 35 G_6(L).$$
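To get a feeling for the coefficients $-15 G_4(L)$ and $-35 G_6(L)$, one can approximate the Eisenstein series numerically. The sketch below is a rough illustration in plain Python, not a rigorous computation. The function name `eisenstein` and the cutoff `N` are assumptions of this sketch, and the sum is simply truncated to $|m|, |n| \leq N$, which is adequate because the series converges absolutely for weight at least 4.

```python
def eisenstein(weight, w1, w2, N=150):
    """Truncated Eisenstein series: sum of w**(-weight) over the nonzero
    lattice points w = m*w1 + n*w2 with |m|, |n| <= N."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue  # skip w = 0
            total += (m * w1 + n * w2) ** (-weight)
    return total


# The Gaussian integers L = <i, 1>.
G4 = eisenstein(4, 1j, 1)
G6 = eisenstein(6, 1j, 1)

# Coefficients of E_L : y^2 = x^3 + A*x + B.
A = -15 * G4
B = -35 * G6
print(G4, G6, A, B)
```

For this lattice the truncated $G_6$ vanishes up to rounding, reflecting the symmetry $iL = L$, so the corresponding curve has the shape $y^2 = x^3 + Ax$.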

See Exercise 3.7.4 for a proof of the first part of the theorem (part
(2) is shown in [Sil86], Theorem VI.3.5). Theorem 3.2.4 shows that
there is a map:
(3.1) $\Phi : \mathbb{C}/L \to E_L(\mathbb{C}), \quad z \bmod L \mapsto \left( \wp(z, L), \frac{\wp'(z, L)}{2} \right).$
It turns out that the map $\Phi$ has all the "nice" properties that one
would hope for: it is a complex analytic isomorphism of abelian
groups. Moreover, if $E/\mathbb{C} : y^2 = x^3 + Ax + B$ is an elliptic curve,
then there is a lattice $L \subset \mathbb{C}$ such that $\Phi : \mathbb{C}/L \cong E(\mathbb{C})$. This result
is usually called the uniformization theorem:

Theorem 3.2.5. (Uniformization theorem)
(1) Let L be a lattice. Then the equation $y^2 = x^3 - 15 G_4(L) x -
35 G_6(L)$ is non-singular (i.e., its discriminant is $\neq 0$) and
defines an elliptic curve $E_L/\mathbb{C}$. Moreover, the map $\Phi :
\mathbb{C}/L \to E_L(\mathbb{C})$ defined in Eq. (3.1) is a complex analytic
isomorphism of abelian groups.
(2) Let $E/\mathbb{C} : y^2 = x^3 + Ax + B$ be an elliptic curve. Then
there exists a lattice $L \subset \mathbb{C}$ such that $A = -15 G_4(L)$, $B =
-35 G_6(L)$ and $\mathbb{C}/L$ is isomorphic to $E(\mathbb{C})$ via $\Phi$.

For a proof of the uniformization theorem, see [DS05], §1.4.

Example 3.2.6. Let $E/\mathbb{Q} : y^2 = x^3 - x$. The lattice that corresponds
to this elliptic curve is
$$L = \langle (13.5823633497\ldots) i, \; 13.5823633497\ldots \rangle$$
because $-15 G_4(L) = -1$ and $-35 G_6(L) = 0$. Let us define a quantity
$\Omega_E = 13.5823633497\ldots$. Then, $L = \Omega_E \cdot \langle i, 1 \rangle$ and, therefore (by
Prop. 3.1.6), $E(\mathbb{C}) \cong \mathbb{C}/\langle i, 1 \rangle$.

3.3. Elliptic curves and the upper half-plane

The uniformization theorem tells us that every lattice L determines
an elliptic curve $E_L/\mathbb{C}$ and, conversely, for every elliptic curve $E/\mathbb{C}$
there is a lattice L that produces E, i.e., $E(\mathbb{C}) \cong \mathbb{C}/L$. Proposition
3.1.10 tells us that we can find a lattice of the form $\langle \tau, 1 \rangle$, with $\tau \in \mathbb{H}$,
such that $E(\mathbb{C}) \cong \mathbb{C}/\langle \tau, 1 \rangle$. Thus, every $\tau$ in the complex upper half-plane
determines a lattice $L_\tau = \langle \tau, 1 \rangle$, which in turn determines a
$\mathbb{C}$-isomorphism class of an elliptic curve $E(\mathbb{C}) \cong \mathbb{C}/\langle \tau, 1 \rangle$. The choice
of $\tau$, however, is not unique. Remember that, also by Prop. 3.1.10,
if $\tau'$ is another element of the complex upper half-plane and there
exists a matrix $M \in \mathrm{SL}(2, \mathbb{Z})$ such that $\tau' = M\tau$ $\left( = \frac{a\tau + b}{c\tau + d} \right)$, then
$E(\mathbb{C}) \cong \mathbb{C}/\langle \tau', 1 \rangle$.
The discussion in the preceding paragraph motivates the definition
of an equivalence relation between points in $\mathbb{H}$ modulo $\mathrm{SL}(2, \mathbb{Z})$.
For the sake of brevity, we will write $\Gamma(1)$ for $\mathrm{SL}(2, \mathbb{Z})$, and we will
call it the modular group. Later on, we shall describe other subgroups
$\Gamma(N)$ that will justify this notation (see Definition 3.5.1).

Definition 3.3.1. We say that two points $\tau, \tau' \in \mathbb{H}$ are equivalent
relative to the modular group $\Gamma(1)$ if there is a matrix
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$$
such that $\tau' = M\tau$. This defines an equivalence relation (see Exercise
3.7.6), and the set of all equivalence classes is denoted by $Y(1) =
\mathbb{H}/\Gamma(1)$:
$$Y(1) = \mathbb{H}/\Gamma(1) = \frac{\{z = a + bi \in \mathbb{C} : b > 0\}}{\{z \sim z' \text{ if and only if } z' = Mz \text{ for some } M \in \mathrm{SL}(2, \mathbb{Z})\}}.$$

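Remark 3.3.2. The reader should also notice that, for any $\tau \in \mathbb{H}$,
the matrices $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $-M = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$ afford the
same action on $\tau$. Indeed,
$$(-M)\tau = \frac{-a\tau - b}{-c\tau - d} = \frac{a\tau + b}{c\tau + d} = M\tau.$$
Thus, sometimes the equivalence relation is defined with respect to
the quotient $\mathrm{SL}(2, \mathbb{Z})/\{\pm \mathrm{Id}\}$.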

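Example 3.3.3. For instance, $z = 1 + i$ and $z' = 7 + i$ are representatives
of the same equivalence class in $Y(1) = \mathbb{H}/\Gamma(1)$ because
$$M = \begin{pmatrix} 1 & 6 \\ 0 & 1 \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}), \quad \text{and}$$
$$Mz = M \cdot (1 + i) = \frac{1 \cdot (1 + i) + 6}{0 \cdot (1 + i) + 1} = 7 + i = z'.$$
Similarly, $z = 1 + i$ and $z'' = \frac{i + 27}{10}$ are representatives of the same class
in $Y(1)$. In this case, the transitional matrix is $M' = \begin{pmatrix} 3 & 5 \\ 1 & 2 \end{pmatrix} \in
\mathrm{SL}(2, \mathbb{Z})$ and, indeed, $z'' = M'z$ (check this). Proposition 3.1.10 implies
that the elliptic curves that correspond to the quotients $\mathbb{C}/\langle z, 1 \rangle$,
$\mathbb{C}/\langle z', 1 \rangle$ and $\mathbb{C}/\langle z'', 1 \rangle$ are all isomorphic (over $\mathbb{C}$).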
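The "check this" in the example can be done by machine with exact arithmetic. The following sketch represents a point of the upper half-plane with rational coordinates as a pair `(re, im)` of fractions. The function name `mobius` is hypothetical, and the representation is a choice made for this illustration so that no rounding occurs.

```python
from fractions import Fraction


def mobius(M, z):
    """Apply M = ((a, b), (c, d)) to z = (re, im); returns (re, im) of (a*z + b)/(c*z + d)."""
    (a, b), (c, d) = M
    x, y = Fraction(z[0]), Fraction(z[1])
    num_re, num_im = a * x + b, a * y      # a*z + b
    den_re, den_im = c * x + d, c * y      # c*z + d
    norm = den_re ** 2 + den_im ** 2       # |c*z + d|^2, nonzero when im > 0
    # Multiply the numerator by the conjugate of the denominator.
    return ((num_re * den_re + num_im * den_im) / norm,
            (num_im * den_re - num_re * den_im) / norm)


z = (1, 1)                                  # z = 1 + i
print(mobius(((1, 6), (0, 1)), z))          # translation by 6
print(mobius(((3, 5), (1, 2)), z))          # second matrix of the example
```

Note that the imaginary part of the result equals $\mathrm{Im}(z)/|cz + d|^2$ when $ad - bc = 1$, which is why the action preserves the upper half-plane.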

As in the case of the quotient C/L by a lattice L (see Deﬁnition

3.1.4), we would like to ﬁnd a fundamental domain for the quotient
H/Γ(1).
Figure 3. The fundamental domain F (1) for the quotient H/Γ(1).

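Proposition 3.3.4. Let $\mathcal{F}(1) \subset \mathbb{H}$ be the following set of complex
numbers:
$$\mathcal{F}(1) = \left\{ z = a + bi \in \mathbb{H} : |z| > 1 \text{ and } -\frac{1}{2} < a = \Re(z) \leq \frac{1}{2} \right\}
\cup \left\{ z = a + bi \in \mathbb{H} : |z| = 1 \text{ and } 0 \leq a = \Re(z) \leq \frac{1}{2} \right\}.$$
Then $\mathcal{F}(1)$ is a fundamental domain for $\mathbb{H}/\Gamma(1)$, i.e.,
(1) If $w \in \mathbb{H}$, then there is $z \in \mathcal{F}(1)$ such that $w \sim z$ in $\mathbb{H}/\Gamma(1)$;
i.e., there is $M \in \mathrm{SL}(2, \mathbb{Z})$ such that $w = Mz$, and
(2) If $z, z'$ are two distinct elements of $\mathcal{F}(1)$, then $z \not\sim z'$ in
$\mathbb{H}/\Gamma(1)$; that is, the equivalence classes of z and $z'$ are distinct.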

The proof of part (1) of Proposition 3.3.4 is left as an exercise
(Exercises 3.7.1 and 3.7.7). The proof of (2) can be found, for example,
in [Ser77], Ch. VII. In the proof of (1), in Exercise 3.7.7, we
found two distinguished matrices of $\mathrm{SL}(2, \mathbb{Z})$:
$$S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
The action of the matrices S and T on $\tau \in \mathbb{H}$ is simply $T\tau = \tau + 1$
and $S\tau = -\frac{1}{\tau}$. Notice also that $S^2 = -\mathrm{Id}$ and $(ST)^3 = -\mathrm{Id}$. As a
corollary of Prop. 3.3.4, we show that the subgroup generated by S
and T (i.e., the group G of Exercise 3.7.7) is all of $\mathrm{SL}(2, \mathbb{Z})$.
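The two moves $\tau \mapsto \tau + 1$ and $\tau \mapsto -1/\tau$ suggest a simple procedure for carrying a point of the upper half-plane into the fundamental domain: translate until the real part lies in $(-1/2, 1/2]$, invert if the point falls inside the unit circle, and repeat. The sketch below implements this standard reduction for points with rational coordinates, using exact fractions. The function name `reduce_to_F1` and the format of the recorded word are choices made for this illustration, and the routine is not claimed to be the argument intended in Exercise 3.7.7.

```python
from fractions import Fraction
import math


def reduce_to_F1(re, im):
    """Move tau = re + im*i (rational, im > 0) into the fundamental domain
    using powers of T (tau -> tau + 1) and S (tau -> -1/tau).
    Returns (re, im, word), where word lists the moves in the order applied."""
    re, im = Fraction(re), Fraction(im)
    if im <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    word = []
    while True:
        # Translate so that -1/2 < re <= 1/2.
        n = math.ceil(re - Fraction(1, 2))
        if n != 0:
            re -= n
            word.append(f"T^{-n}")
        norm = re * re + im * im            # |tau|^2
        # Invert if tau is inside the unit circle, or on the left boundary arc.
        if norm < 1 or (norm == 1 and re < 0):
            re, im = -re / norm, im / norm  # -1/tau = (-re + im*i)/|tau|^2
            word.append("S")
        else:
            return re, im, word


print(reduce_to_F1(7, 1))
print(reduce_to_F1(Fraction(27, 10), Fraction(1, 10)))
```

Each inversion applied to a point strictly inside the unit circle increases the imaginary part, which is the reason the loop terminates. Both sample points reduce to $\tau = i$, so they represent the same class as $1 + i$.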

Corollary 3.3.5. The modular group Γ(1) = SL(2, Z) is generated

by the matrices S and T .

Proof. Let M be a matrix in $\mathrm{SL}(2, \mathbb{Z})$ and let $\tau \in \mathbb{H}$ be a fixed
complex number in the interior of $\mathcal{F}(1)$ (e.g., $\tau = 2i$ works). Let
$\tau' = M\tau$ and write G for the subgroup of $\mathrm{SL}(2, \mathbb{Z})$ generated by
S and T. Notice that $-\mathrm{Id} = (ST)^3 \in G$. We want to show that
$G = \Gamma(1)$.
Exercise 3.7.7 says that there exists a matrix $M' \in G$ such that
$\tau'' = M'\tau'$ lies in $\mathcal{F}(1)$. Thus, $\tau'' = M'\tau' = (M'M)\tau$. Since M and
$M'$ belong to $\mathrm{SL}(2, \mathbb{Z})$, their product $M'M \in \mathrm{SL}(2, \mathbb{Z})$ and therefore
$\tau \sim \tau''$ in $\mathbb{H}/\Gamma(1)$ by definition. Moreover, both $\tau$ and $\tau''$ are in
$\mathcal{F}(1)$. But Prop. 3.3.4, part (2), says that this is impossible unless
$\tau = \tau''$. Hence, $\tau = M'M\tau$ and this implies that $M'M = \pm \mathrm{Id}$
(here we are using the fact that $\tau$ is in the interior of $\mathcal{F}(1)$). Thus,
$M = \pm (M')^{-1} \in G$. Since $M \in \mathrm{SL}(2, \mathbb{Z})$ was arbitrary, we conclude
that $\Gamma(1) \leq G$. The reverse inclusion is obvious. Thus, $G = \Gamma(1)$, as
claimed.

3.4. The modular curve X(1)

Proposition 3.3.4 shows that each point on the fundamental domain
F(1) represents a unique class in the quotient Y (1) = H/Γ(1), and
every class of Y (1) has a representative in F(1). If we considered
H/Γ(1) as a surface, then it would be homeomorphic to a sphere with
one point missing (to see this, identify the sides in the boundary of
F(1)). In order to compactify Y (1), we add one point, a point at inﬁn-
ity, that makes Y (1)∪{∞} into a compact surface (a sphere), denoted
X(1). In the interest of space, time and ink, we will not discuss the
topology of Y (1) and X(1) (for instance, see [Sil94], Ch. 1, §2 or see
[DS05], Sections 2.1 and 2.2 for a more generalized approach). The
formal construction of this point at infinity is the following.
We define an extended upper half plane by H∗ := H ∪ P1 (Q), i.e.,
the union of H and a copy of a projective line over Q. The points
in the projective line of the form {[s, 1] : s ∈ Q} are simply the
rational points of the real axis in the complex plane (so [s, 1] stands
for s + 0 · i ∈ Q ⊂ R). The remaining point of P1 (Q) is the point at
infinity ∞ := [1, 0]. If the reader needs to review the basics about the
projective line, see Appendix C.1.
We also extend the action of $\Gamma(1)$ to all of $\mathbb{H}^*$ as follows. Let
$M = (a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$ and let $[s, t] \in \mathbb{P}^1(\mathbb{Q})$. Then we define
$$M[s, t] = (-M)[s, t] := [as + bt, cs + dt] \in \mathbb{P}^1(\mathbb{Q}).$$
Notice that $[s, 1]$, which we may identify with $s \in \mathbb{Q}$, is sent to
$M[s, 1] = [as + b, cs + d]$, and as long as $cs + d \neq 0$, then $M[s, 1] =
\left[ \frac{as + b}{cs + d}, 1 \right]$, so that $s \in \mathbb{Q}$ is sent to $Ms = \frac{as + b}{cs + d} \in \mathbb{Q}$. Thus, this action
is consistent with the previous definition of the action of $\Gamma(1)$
on $\mathbb{H}$. The point at infinity can also be treated similarly.

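Example 3.4.1. Let $M = \begin{pmatrix} 5 & 6 \\ 4 & 5 \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$. Let $s = [3, 1] \in \mathbb{Q}$.
Then
$$Ms = M[3, 1] = [5 \cdot 3 + 6, 4 \cdot 3 + 5] = [21, 17],$$
or, equivalently, $Ms = M \cdot 3 = \frac{5 \cdot 3 + 6}{4 \cdot 3 + 5} = \frac{21}{17}$. One needs to be careful
with zeros in the denominators! For instance, let $s' = -\frac{5}{4}$. Then:
$$Ms' = M \cdot \left( -\frac{5}{4} \right) = \frac{5 \cdot \left( -\frac{5}{4} \right) + 6}{4 \cdot \left( -\frac{5}{4} \right) + 5} = \frac{-\frac{1}{4}}{0} \text{ “=” } \infty.$$
The previous equation can be formalized using projective coordinates:
$$M[s', 1] = M\left[ -\frac{5}{4}, 1 \right] = \left[ -\frac{1}{4}, 0 \right] = [1, 0].$$
We can also calculate the action of M on $\infty$ using the usual laws of
limits:
$$M \cdot \infty \text{ “=” } \frac{5 \cdot \infty + 6}{4 \cdot \infty + 5} \text{ “=” } \frac{5}{4}.$$
Once again, this calculation can be formalized using projective coordinates:
$$M[1, 0] = [5 \cdot 1 + 6 \cdot 0, 4 \cdot 1 + 5 \cdot 0] = [5, 4] = \left[ \frac{5}{4}, 1 \right].$$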
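Working with integer projective coordinates avoids the division by zero entirely. The sketch below (plain Python, with a hypothetical function name `act_on_P1`) represents a point of $\mathbb{P}^1(\mathbb{Q})$ by a pair of integers and normalizes the output to lowest terms with a fixed sign convention, so that the cusp $\infty$ always appears as `(1, 0)`. The normalization is a convention of this sketch.

```python
from math import gcd


def act_on_P1(M, point):
    """Apply M = ((a, b), (c, d)) to [s, t] in P^1(Q), with s, t integers not both 0.
    Returns the image in lowest terms, with t > 0, or (1, 0) for infinity."""
    (a, b), (c, d) = M
    s, t = point
    x, y = a * s + b * t, c * s + d * t
    g = gcd(x, y)                    # nonzero because det(M) != 0
    x, y = x // g, y // g
    if y < 0 or (y == 0 and x < 0):  # fix the sign of the representative
        x, y = -x, -y
    return (x, y)


M = ((5, 6), (4, 5))
print(act_on_P1(M, (3, 1)))    # the rational number 3
print(act_on_P1(M, (-5, 4)))   # the rational number -5/4
print(act_on_P1(M, (1, 0)))    # the point at infinity
```

The three calls reproduce the three computations of the example, with $-5/4$ entered as $[-5, 4]$ rather than $[-\frac{5}{4}, 1]$.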

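We are ready to define $X(1)$. The definition now is identical to
the definition of $Y(1)$ in Definition 3.3.1:

Definition 3.4.2. We say that two points $\tau, \tau' \in \mathbb{H}^* = \mathbb{H} \cup \mathbb{P}^1(\mathbb{Q})$
are equivalent relative to the modular group $\Gamma(1)$ if there is a matrix
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$$
such that $\tau' = M\tau$. This defines an equivalence relation, and the set
of all equivalence classes is denoted by $X(1) = \mathbb{H}^*/\Gamma(1)$:
$$X(1) = \mathbb{H}^*/\Gamma(1) = \frac{\{z = a + bi \in \mathbb{C} : b > 0\} \cup \{s \in \mathbb{Q}\} \cup \{\infty\}}{\{z \sim z' \text{ if and only if } z' = Mz \text{ for some } M \in \mathrm{SL}(2, \mathbb{Z})\}}.$$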

At the beginning of this section we claimed that we were going to

adjoin only one point to Y (1), to compactify the surface, but it would
seem we added inﬁnitely many points (i.e., all of P1 (Q)). However, all
the points in P1 (Q) represent the same class in H∗ /Γ(1), and therefore
X(1) = H∗ /Γ(1) only contains one extra point more than Y (1).

Proposition 3.4.3. Let s and $s'$ be two elements of $\mathbb{P}^1(\mathbb{Q})$. Then
there exists a matrix M in $\mathrm{SL}(2, \mathbb{Z})$ such that $s' = Ms$.

The proof is left as an exercise (Exercise 3.7.8). Proposition

3.4.3 implies that every point in P1 (Q) is equivalent to ∞ = [1, 0]
in H∗ /Γ(1). Thus, X(1) = Y (1) ∪ {∞} as we wanted. The point ∞
is called a cusp.
In the next couple of sections we are going to generalize the con-
struction of X(1) and deﬁne other types of modular curves that will
show up later on. First of all, we need to talk about the subgroups
of SL(2, Z) that we are most interested in.
3.5. Congruence subgroups

In this section we define several types of subgroups Γ of SL(2, Z) that
come up often in the theory of modular forms. Later on, we will
define other modular curves as quotients H∗ /Γ in the same way that
we have defined X(1) above.
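Definition 3.5.1. Let $N \geq 1$ be an integer. We define subgroups of
$\mathrm{SL}(2, \mathbb{Z})$ by
$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) : c \equiv 0 \bmod N \right\},$$
$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) : c \equiv 0, \ a \equiv d \equiv 1 \bmod N \right\},$$
$$\Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) : b \equiv c \equiv 0, \ a \equiv d \equiv 1 \bmod N \right\}.$$
We say that a subgroup G of $\mathrm{SL}(2, \mathbb{Z})$ is a congruence subgroup if G
contains $\Gamma(N)$ for some integer $N \geq 1$.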

The reader should check that, indeed, Γ0 (N ), Γ1 (N ) and Γ(N )

are subgroups of SL(2, Z) for any N ≥ 1. Also notice that, for a ﬁxed
N ≥ 1, we have inclusions Γ(N ) ≤ Γ1 (N ) ≤ Γ0 (N ). The equality
Γ(1) = SL(2, Z) follows from Deﬁnition 3.5.1, which explains our
previous notation for the modular group.
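Example 3.5.2. Let $N = 5$. The following matrices belong to $\Gamma_0(5)$:
$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 5 & 1 \end{pmatrix}, \quad \begin{pmatrix} -2 & -1 \\ 5 & 2 \end{pmatrix}, \quad \begin{pmatrix} -3 & -1 \\ 10 & 3 \end{pmatrix},$$
and, in fact, one can show that these matrices (and their inverses)
generate all of $\Gamma_0(5)$. The matrices
$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} -49 & 23 \\ 115 & -54 \end{pmatrix}, \quad \begin{pmatrix} 11 & 4 \\ -25 & -9 \end{pmatrix}, \quad \begin{pmatrix} 66 & 23 \\ 175 & 61 \end{pmatrix}$$
are some examples of elements of $\Gamma_1(5)$, but they are not a complete
generating set for $\Gamma_1(5)$. A complete list of generators can be found,
for example, using Sage with the command Gamma1(5).gens().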
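Membership in these subgroups is a finite check on the entries of a matrix, so the claims of the example are easy to confirm. The sketch below is plain Python and does not use Sage's `Gamma0` or `Gamma1` objects; the predicate names are hypothetical and each one simply transcribes the corresponding congruence conditions of Definition 3.5.1.

```python
def det(M):
    (a, b), (c, d) = M
    return a * d - b * c


def in_Gamma0(M, N):
    """det = 1 and c = 0 mod N."""
    (a, b), (c, d) = M
    return det(M) == 1 and c % N == 0


def in_Gamma1(M, N):
    """In Gamma0(N), with a = d = 1 mod N."""
    (a, b), (c, d) = M
    return in_Gamma0(M, N) and a % N == 1 % N and d % N == 1 % N


def in_Gamma(M, N):
    """In Gamma1(N), with b = 0 mod N."""
    (a, b), (c, d) = M
    return in_Gamma1(M, N) and b % N == 0


gamma0_examples = [((1, 1), (0, 1)), ((-1, 0), (0, -1)), ((1, 0), (5, 1)),
                   ((-2, -1), (5, 2)), ((-3, -1), (10, 3))]
gamma1_examples = [((1, 1), (0, 1)), ((-49, 23), (115, -54)),
                   ((11, 4), (-25, -9)), ((66, 23), (175, 61))]

print([in_Gamma0(M, 5) for M in gamma0_examples])
print([in_Gamma1(M, 5) for M in gamma1_examples])
```

Python's `%` operator returns a non-negative remainder for a positive modulus, so negative entries such as $-49$ are handled correctly, and comparing against `1 % N` keeps the case $N = 1$ consistent with $\Gamma(1) = \mathrm{SL}(2, \mathbb{Z})$.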
3.6. Modular curves

In this section we generalize the definition of X(1), as in Definition
3.4.2, in order to define more general modular curves. To do so, we
simply replace Γ(1) by any congruence subgroup Γ defined in Section
3.5.
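Let $\Gamma$ be a fixed congruence subgroup. We say that two points
$\tau, \tau' \in \mathbb{H}^* = \mathbb{H} \cup \mathbb{P}^1(\mathbb{Q})$ are equivalent relative to $\Gamma$ if there is a matrix
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$$
such that $\tau' = M\tau$. This defines an equivalence relation, and the set
of all equivalence classes is denoted by $X = \mathbb{H}^*/\Gamma$:
$$X = \mathbb{H}^*/\Gamma = \frac{\{z = a + bi \in \mathbb{C} : b > 0\} \cup \{s \in \mathbb{Q}\} \cup \{\infty\}}{\{z \sim z' \text{ if and only if } z' = Mz \text{ for some } M \in \Gamma\}}.$$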
The space X is called a modular curve (indeed, X may be viewed as
a curve over C or as a real surface). The cusps of H∗ /Γ are those
elements in the quotient that have a representative in P1 (Q). Recall
that X(1) had only 1 cusp. However, other modular curves have
multiple distinct cusps.
Let N ≥ 1. The modular curves that correspond to the congru-
ence subgroups Γ0 (N ), Γ1 (N ) and Γ(N ) are usually denoted, respec-
tively, by X0 (N ), X1 (N ) and X(N ).

Example 3.6.1. Let $p \geq 2$ be a prime and let $X_0(p) = \mathbb{H}^*/\Gamma_0(p)$. Then $X_0(p)$
has exactly two cusps. The points $0 = [0, 1]$ and $\infty = [1, 0]$ are
inequivalent in $X_0(p)$ and are representatives of the two distinct
cusps. See Exercise 3.7.10.

Remark 3.6.2. In Proposition 3.3.4 we found F(1), a fundamental

domain for the action of SL(2, Z) on H. Similarly, if Γ is a congruence
subgroup, one can ﬁnd a fundamental domain F(Γ) for the action
of Γ on H. We write F(N ), F1 (N ) and F0 (N ), respectively, for
the fundamental domains for the action of Γ(N ), Γ1 (N ) and Γ0 (N )
on H. Helena Verrill [Ver05] has developed a great applet to draw
fundamental domains for modular curves. See Figure 4 for an example
of a fundamental domain for Γ0 (11).
Figure 4. A fundamental domain (whole shaded region) for

the action of Γ0 (11), obtained with Verrill’s applet. The do-
main is in fact inﬁnitely long in the positive imaginary direc-
tion (upwards), but we had to cut the domain to be able to
ﬁt it on the page. Each hyperbolic triangle inside the shaded
region is a fundamental domain for SL(2, Z).

Remark 3.6.3. As a consequence of the uniformization theorem
(Thm. 3.2.5) and Prop. 3.1.10, every class $[\tau] \in X(1)$ such that
$\tau$ is not a cusp (sometimes we say non-cuspidal $\tau$) corresponds to an
elliptic curve $E/\mathbb{C} \cong \mathbb{C}/\langle \tau, 1 \rangle$ and, conversely, if $E/\mathbb{C}$ is an
elliptic curve, there is a unique class $[\tau] \in X(1)$ such that
$E/\mathbb{C} \cong \mathbb{C}/\langle \tau, 1 \rangle$. Thus, the non-cuspidal points on $X(1)$ classify
elliptic curves up to isomorphism over $\mathbb{C}$.
Similarly, one can show that the modular curves $X_0(N)$, $X_1(N)$
and $X(N)$ have interpretations in terms of elliptic curves together
with some extra data. For instance, $X_0(N)$ classifies pairs $(E, C)$ of
elliptic curves $E$ with a fixed cyclic subgroup $C \subseteq E(\mathbb{C})$ of order $N$, up
to isomorphism over $\mathbb{C}$. The curve $X_1(N)$ classifies pairs $(E, P)$ of
elliptic curves $E$ with a fixed point $P \in E(\mathbb{C})$ of exact order $N$, up to
isomorphism over $\mathbb{C}$.

Remark 3.6.4. One aspect of modular curves that is not at all ob-
vious is the fact that modular curves have algebraic models; i.e., if
Γ is a congruence subgroup, then H∗ /Γ is a compact Riemann sur-
face and it has a model as a projective algebraic curve over C, given
by polynomial equations. The modular curves for Γ0 (N ) have the
surprising property that they have a canonical model deﬁned over
Q. The reason is that the modular j-invariant function j(z) (see Ex-
ample 4.1.10) and the function j(N z) satisfy an algebraic equation
FN (j(z), j(N z)) = 0, with FN (x, y) ∈ Q[x, y], which gives an alge-
braic model for X0 (N ). However, this is typically a highly singular
model, which can be transformed into a non-singular model for the
modular curve. For instance,
(1) The curve $X_0(11) = \mathbb{H}^*/\Gamma_0(11)$ has a model
$y^2 + y = x^3 - x^2 - 10x - 20$ (notice that it is an elliptic curve!).
(2) The curve $X_1(11) = \mathbb{H}^*/\Gamma_1(11)$ has a model $y^2 + y = x^3 - x^2$.
(3) The curve $X_0(14)$ has a model $y^2 + xy + y = x^3 + 4x - 6$.
(4) The curve $X_1(14)$ has a model $y^2 + xy + y = x^3 - x$.
(5) The curve $X_1(13)$ has a model $y^2 + (x^3 - x^2 - 1)y = x^2 - x$.
This is not an elliptic curve (it has genus 2, not 1). Examples
(1)-(4) above are nice, but the model of a modular
curve will often be much more complicated than a cubic.

We conclude this chapter with some genus formulas for X0 (N ),

due to Ogg, Shimura, and others. The genus can be calculated using
the Hurwitz genus formula and the ramiﬁcation points of the quotient
map X0 (N ) → X(1).

Theorem 3.6.5. Let N ≥ 1 be an integer and let X0 (N ) be the

modular curve H∗ /Γ0 (N ). Let g be the genus of the curve X0 (N ).
Then:
g=0 if N = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 16, 18, 25;
g=1 if N = 11, 14, 15, 17, 19, 20, 21, 24, 27, 32, 36, 49;
g=2 if N = 22, 23, 26, 28, 29, 31, 37, 50.
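Moreover, if $p > 3$ is prime, then:
\[
\operatorname{genus}(X_0(p)) =
\begin{cases}
\left[\dfrac{p+1}{12}\right] - 1 & \text{if } p \equiv 1 \bmod 12; \\[2mm]
\left[\dfrac{p+1}{12}\right] & \text{otherwise,}
\end{cases}
\]
\[
\operatorname{genus}(X_1(p)) = 1 + \frac{(p-1)(p-11)}{24}, \quad \text{and}
\]
\[
\operatorname{genus}(X(p)) = 1 + \frac{(p^2-1)(p-6)}{24},
\]
where $[x]$ is the greatest integer $\leq x$.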

The genus formulas for X(p), X0 (p) and X1 (p) are consequences
of the Hurwitz and Riemann-Hurwitz genus formulas (see Exercises
3.1.4, 3.1.5 and 3.1.6 of [DS05] or see Chapter 1 of [Shi73] for proofs).
The list of all modular curves X0 (N ) with genus 0, 1 or 2 can be found
in [Maz72]. For more general genus formulas see [DS05], Section 3.1.
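
The genus table of Theorem 3.6.5 can be reproduced numerically. The following is a minimal sketch, assuming the standard general formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$ for $X_0(N)$ (of the kind found in [DS05], Section 3.1; it is not stated in this text). All function names are illustrative choices, not notation from the book.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def euler_phi(n):
    """Euler phi function, computed from the prime divisors of n."""
    result = n
    for p in prime_divisors(n):
        result -= result // p
    return result

def genus_X0(N):
    """Genus of X_0(N) from the assumed index/elliptic-point/cusp formula."""
    primes = prime_divisors(N)
    mu = N                                   # index of Gamma_0(N) in SL(2,Z) modulo +-Id
    for p in primes:
        mu = mu // p * (p + 1)
    nu2 = 0 if N % 4 == 0 else 1             # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1             # elliptic points of order 3
    for p in primes:
        nu2 *= {1: 2, 3: 0}.get(p % 4, 1)    # factor 1 + (-1/p); p = 2 contributes 1
        nu3 *= {1: 2, 2: 0}.get(p % 3, 1)    # factor 1 + (-3/p); p = 3 contributes 1
    nu_inf = sum(euler_phi(gcd(d, N // d))   # number of cusps
                 for d in range(1, N + 1) if N % d == 0)
    return (12 + mu - 3 * nu2 - 4 * nu3 - 6 * nu_inf) // 12

def genus_X0_prime(p):
    """Genus of X_0(p) for a prime p > 3, as in Theorem 3.6.5."""
    g = (p + 1) // 12
    return g - 1 if p % 12 == 1 else g
```

For example, `genus_X0(11)` and `genus_X0_prime(11)` both return 1, in agreement with the fact that $X_0(11)$ is an elliptic curve.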

3.7. Exercises
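Exercise 3.7.1. Let $a, b, c, d \in \mathbb{R}$, $\tau \in \mathbb{C}$ and $\tau \notin \mathbb{R}$. Show that:
(1) The imaginary part of $\tau' = \frac{a\tau + b}{c\tau + d}$ is
\[
\operatorname{Im}(\tau') = \frac{(ad - bc)\operatorname{Im}(\tau)}{|c\tau + d|^2}.
\]
(2) If $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$ and $\tau \in \mathbb{H}$, then $M\tau \in \mathbb{H}$.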
Exercise 3.7.2. In this exercise we study the relationships between
diﬀerent bases of a lattice.
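(1) Let $L = \langle i, 1 \rangle$ be the lattice of Gaussian integers $\mathbb{Z}[i]$. Let
$a, b, c, d$ be integers such that $ad - bc = 1$. Show that the
lattice $L'$ generated by $w_1 = ai + b$ and $w_2 = ci + d$ is also
$\mathbb{Z}[i]$.
(2) More generally, let $L$ be a lattice generated by $w_1$ and $w_2 \in \mathbb{C}$
with $w_1/w_2 \in \mathbb{H}$. Let
\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
be a matrix in $\mathrm{SL}(2, \mathbb{Z})$, i.e., $a, b, c, d \in \mathbb{Z}$ and $ad - bc = 1$.
Let $\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$, where the operation here is the usual
matrix multiplication of vectors, i.e., $w_1' = aw_1 + bw_2$ and
$w_2' = cw_1 + dw_2$. Show that $w_1'/w_2' \in \mathbb{H}$ and the lattice
generated by $w_1'$ and $w_2'$ is also $L$. (Hint: do Exercise 3.7.1.
Also, notice that $M$ is an invertible matrix.)
(3) Conversely, suppose that $L = \langle w_1, w_2 \rangle = \langle w_1', w_2' \rangle$, for some
$w_i, w_i' \in \mathbb{C}$, such that $w_1/w_2,\ w_1'/w_2' \in \mathbb{H}$. Show that there
is a matrix $M \in \mathrm{SL}(2, \mathbb{Z})$ such that $\begin{pmatrix} w_1' \\ w_2' \end{pmatrix} = M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$.
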

Exercise 3.7.3. Let $L$ and $L'$ be lattices in $\mathbb{C}$. Let $\alpha \in \mathbb{C}^\times$ and
suppose that $\alpha L = L'$. Show that the map $\psi : \mathbb{C}/L \to \mathbb{C}/L'$ defined
by $\psi(z \bmod L) = \alpha z \bmod L'$ is an analytic map and it is also an
isomorphism of abelian groups.

Exercise 3.7.4. This exercise shows part (a) of Theorem 3.2.4.
(a) Find the Taylor series of $f(x) = \frac{1}{(1-x)^2}$ centered at $x = 0$.
(b) Use (a) to find the Laurent series of $\wp(z, L)$ centered around
$z = 0$. Hint:
\[
\frac{1}{(z - w)^2} - \frac{1}{w^2} = \frac{1}{w^2}\left(\frac{1}{\left(1 - \frac{z}{w}\right)^2} - 1\right).
\]
Exercise 3.7.5. Let $E/\mathbb{C}$ be an elliptic curve. Show that
$E[m] \cong \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$. (Hint: use the uniformization theorem, Thm. 3.2.5.
What is the $m$-torsion of $\mathbb{C}/L$?)

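Exercise 3.7.6. The goal of this problem is to show that the relation
that appears in Definition 3.3.1 is indeed an equivalence relation. Let
$M = (a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$, $\tau, \tau' \in \mathbb{H}$ and define $M\tau = \frac{a\tau + b}{c\tau + d}$. We say
that $\tau \sim \tau'$ if there is a matrix $M \in \mathrm{SL}(2, \mathbb{Z})$ such that $\tau = M\tau'$.
Show that:
(1) (Reflexive) $\tau \sim \tau$ for all $\tau \in \mathbb{H}$;
(2) (Symmetric) if $\tau \sim \tau'$, then $\tau' \sim \tau$ for all $\tau, \tau' \in \mathbb{H}$;
(3) (Transitive) if $\tau \sim \tau'$ and $\tau' \sim \tau''$, then $\tau \sim \tau''$ for all
$\tau, \tau', \tau'' \in \mathbb{H}$.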

Exercise 3.7.7. Let $G$ be the subgroup of $\mathrm{SL}(2, \mathbb{Z})$ generated by the
matrices
\[
S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\]
In other words, $G$ is the group of all matrices that can be obtained as
“words” in the letters $S$, $T$, and $T^{-1}$ (e.g. $M = S \cdot T \cdot S \cdot T^3 \cdot S \in G$).
The goal of this exercise is to show that for all $\tau \in \mathbb{H}$ there is $M \in G$
such that $M\tau$ is in the fundamental domain $\mathcal{F}(1)$ defined in Prop.
3.3.4.
(1) Let τ ∈ H be fixed. The set
U = {Im(M τ ) : M ∈ G} ⊂ R>0
has a maximum element, i.e., there is M0 ∈ G such that
Im(M0 τ ) is the maximum element of U . (Hint: show that
|cτ + d| → ∞ as |c| + |d| → ∞. Then use Prob. 3.7.1.)
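(2) Let $\tau$ and $M_0$ be as in (1). Show that there is $n \in \mathbb{Z}$ such
that
\[
|\Re(T^n M_0 \tau)| \leq \frac{1}{2}.
\]
(3) Let $\tau$, $M_0$ and $n$ be as above. Show that if $|T^n M_0 \tau| < 1$,
then
\[
\operatorname{Im}(S T^n M_0 \tau) > \operatorname{Im}(M_0 \tau),
\]
contradicting the definition of $M_0$. Hence $|T^n M_0 \tau| \geq 1$.
(4) If $\tau \in \mathcal{F}'$ with
\[
\mathcal{F}' = \left\{ z = a + bi \in \mathbb{C} : |z| > 1 \text{ and } a = \Re(z) = -\tfrac{1}{2} \right\}
\cup \left\{ z = a + bi \in \mathbb{C} : |z| = 1 \text{ and } -\tfrac{1}{2} \leq a = \Re(z) < 0 \right\},
\]
then there is $M \in G$ such that $M\tau \in \mathcal{F}(1)$.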
(5) Conclude that for every τ ∈ H there is M ∈ G such that
M τ ∈ F(1).
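
The conclusion of Exercise 3.7.7 can be explored numerically. The sketch below uses the usual iterative reduction (translate by a power of $T$, then apply $S$ while $|\tau| < 1$) rather than the maximality argument of the exercise; it works in floating point, ignores the boundary conventions of $\mathcal{F}(1)$, and the function name and the encoding of the word are illustrative assumptions.

```python
def reduce_to_fundamental_domain(tau, max_steps=10_000):
    """Move tau (Im(tau) > 0) into the region |Re| <= 1/2, |tau| >= 1.

    Returns (reduced_tau, word), where word lists the letters applied,
    in order, as ("T", n) for T^n and ("S", 1) for S.
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    word = []
    for _ in range(max_steps):
        n = -round(tau.real)          # T^n shifts Re(tau) into [-1/2, 1/2]
        if n != 0:
            tau = tau + n
            word.append(("T", n))
        if abs(tau) >= 1:             # already outside the open unit disc
            return tau, word
        tau = -1 / tau                # S raises Im(tau) when |tau| < 1
        word.append(("S", 1))
    raise RuntimeError("no convergence within max_steps")
```

For instance, `reduce_to_fundamental_domain(complex(2.3, 5.0))` applies only $T^{-2}$, since $0.3 + 5i$ already satisfies both inequalities.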

Exercise 3.7.8. Prove Proposition 3.4.3. In particular, show that

∞ is equivalent to all rational points [s, 1] ∈ P1 (Q), with s ∈ Q, in
X(1).

Exercise 3.7.9. Let N > 1 be ﬁxed. Prove that Γ(N ), Γ1 (N ) and

Γ0 (N ) are subgroups of SL(2, Z), and Γ(N ) ≤ Γ1 (N ) ≤ Γ0 (N ).

Exercise 3.7.10. Let $p$ be a prime and let $X_0(p) = \mathbb{H}^*/\Gamma_0(p)$. Show
that $X_0(p)$ has exactly two cusps. In particular, show that if
$[s, t] \in \mathbb{P}^1(\mathbb{Q})$, then either there is a matrix $M \in \Gamma_0(p)$ such that
$M[s, t] = [0, 1]$ or there is a matrix $M \in \Gamma_0(p)$ such that $M[s, t] = [1, 0]$, but
both cannot occur simultaneously (i.e., $0 \not\sim \infty$ in $X_0(p)$).
Chapter 4

Modular forms

In this chapter we deﬁne modular forms as functions on modular

curves (see Deﬁnition 3.4.2 and Section 3.6). For further introductory
reading on modular forms (and modular curves), we refer the reader to
[Ser77], Ch. VII, and the ﬁrst two chapters of [DS05]. See Appendix
A.2 for a brief introduction to computing spaces of modular forms
using Sage, and see [Ste07] for a thorough treatment.

4.1. Modular forms for the modular group

First, we deﬁne meromorphic functions on a modular curve H∗ /Γ.
For the deﬁnition of congruence subgroup, see Section 3.5.

Deﬁnition 4.1.1. Let Γ be a congruence subgroup of SL(2, Z) and

let X(Γ) be the modular curve H∗ /Γ. A meromorphic function f :
H∗ /Γ → C ∪ {∞} is called a modular function for Γ. In other words,
a modular function is a function f : H∗ → C ∪ {∞} such that:
(1) f is meromorphic on H∗ , so it is meromorphic at all points
on H and all points on P1 (Q) = Q ∪ {∞}; and
(2) $f(z) = f(Mz)$ for any $M \in \Gamma$; i.e., $f(z) = f\left(\frac{az+b}{cz+d}\right)$, where
$M = (a, b; c, d)$ is a matrix in $\Gamma$.

Remark 4.1.2. Let Γ = SL(2, Z) and let X(1) = H∗ / SL(2, Z). Then
a modular function for SL(2, Z) is a function f : H∗ → C ∪ {∞} that
is meromorphic on all of H∗ and such that f (z) = f (M z) for all
M ∈ SL(2, Z). In particular, f (z) = f (Sz) = f (−1/z) and f (z) =
f (T z) = f (z + 1). In fact, if f (z) = f (M z) for M = S and T , then
it also holds for all M ∈ SL(2, Z), because S and T generate all of
SL(2, Z) by Corollary 3.3.5.

It turns out that the conditions on the deﬁnition of modular func-

tion are quite restrictive and, as a result, there are very few interesting
examples. For instance, the modular functions for SL(2, Z) are just
C(j), where j(z) is the modular j-invariant (see Example 4.1.10). We
extend the deﬁnition a bit. We begin with the deﬁnition of modular
forms for SL(2, Z).

Deﬁnition 4.1.3. A function f : H → C ∪ {∞} is weakly modular of

weight k for SL(2, Z) if:
(1) f is meromorphic on H; and

(2) $f(Mz) = (cz+d)^k f(z)$ for any $M \in \mathrm{SL}(2, \mathbb{Z})$; i.e.,
$f\left(\frac{az+b}{cz+d}\right) = (cz+d)^k f(z)$, where $M$ is a matrix $(a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$.

Since the matrix $T$ is an element of $\mathrm{SL}(2, \mathbb{Z})$, if $f$ is weakly modular,
then
\[
f(z) = f(Tz) = f(z + 1) \quad \text{for all } z \in \mathbb{H}.
\]
This periodicity of $f$ means that, if we set $q = e^{2\pi i z}$, then we can
express $f$ as a Laurent series:
\[
f(z) = \sum_{n=-\infty}^{\infty} a_n q^n = \cdots + \frac{a_{-2}}{q^2} + \frac{a_{-1}}{q} + a_0 + a_1 q + a_2 q^2 + \cdots,
\]
where the $a_n$ are called the Fourier coefficients of $f$.
(a) We say that $f$ is a modular function of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$
if $f$ satisfies (1) and (2) above and $f$ is meromorphic at the
cusp $\infty$ of the modular curve $X(1) = \mathbb{H}^*/\mathrm{SL}(2, \mathbb{Z})$. This
means that the Laurent expansion of $f(z)$ must be of the
form
\[
f(z) = \sum_{n=-m}^{\infty} a_n q^n = \frac{a_{-m}}{q^m} + \frac{a_{-m+1}}{q^{m-1}} + \cdots + \frac{a_{-1}}{q} + a_0 + a_1 q + a_2 q^2 + \cdots
\]
for some $m \in \mathbb{N}$.
(b) $f$ is a modular form of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$ if $f$ is a modular
function of weight $k$ and it is analytic everywhere on $\mathbb{H}$ and
at the cusp $\infty$ of $X(1)$. This means that $f(z)$ has a Taylor
expansion
\[
f(z) = \sum_{n=0}^{\infty} a_n q^n = a_0 + a_1 q + a_2 q^2 + \cdots.
\]
Equivalently, $f$ is analytic at the cusp $\infty$ if $|f(yi)|$ stays
bounded as $y \to \infty$. The value of $f$ at the cusp is equal to
$\lim_{y \to \infty} f(yi)$.
(c) $f$ is a cusp form of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$ if $f$ is a modular
form of weight $k$ and it vanishes at the cusp $\infty$ of $X(1)$.
This means that the function $f(z)$ has a Taylor expansion
\[
f(z) = \sum_{n=1}^{\infty} a_n q^n = a_1 q + a_2 q^2 + \cdots, \quad \text{i.e., } a_0 = 0.
\]
Equivalently, $f$ is a cusp form if $\lim_{y \to \infty} f(yi) = 0$.

Remark 4.1.4. Exercise 4.5.5 shows that condition (2) in the defi-
nition of weakly modular function (resp. modular form below); i.e.,
Definition 4.1.3 above (resp. 4.2.1 below), is equivalent to saying
that f (z)(dz)k is a differential k-form, invariant under the action of
SL(2, Z) (resp. congruence subgroup Γ).

It is easy to show that there are no modular forms of odd weight

for SL(2, Z) other than f (z) = 0; see Exercise 4.5.2. The Eisenstein
series are our first and most important examples of modular forms
of even weight 2k for SL(2, Z). Recall that we have already defined
the Eisenstein series G2k (Λ) as functions of lattices (in Definition
3.2.3). Here we evaluate G2k at the lattice Λz = ⟨z, 1⟩, with z ∈ H.
Therefore we may consider G2k (Λz ) = G2k (z) as a function of the
complex variable z.

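Proposition 4.1.5. Let $k \geq 2$ and define a function of $z \in \mathbb{H}$ by
\[
G_{2k}(z) := G_{2k}(\langle z, 1 \rangle) = \sum_{\substack{w \in \langle z, 1 \rangle \\ w \neq 0}} \frac{1}{w^{2k}} = \sum_{\substack{(m,n) \in \mathbb{Z} \times \mathbb{Z} \\ (m,n) \neq (0,0)}} \frac{1}{(mz + n)^{2k}}.
\]
Then $G_{2k}(z)$ is a modular form of weight $2k$ for $\mathrm{SL}(2, \mathbb{Z})$, and the
value of $G_{2k}$ at the cusp $\infty$ of $X(1)$ is equal to $2\zeta(2k)$, where $\zeta(s)$ is
the Riemann zeta function.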

The proof is an exercise (Exercise 4.5.3 shows everything except

the fact that G2k is meromorphic on H, which is an exercise in uniform
convergence of series).
Definition 4.1.6. We say that a modular form is normalized if the
first non-zero coefficient of its q-expansion is equal to 1.

The following proposition states the formula for the q-expansion

of the Eisenstein series and its normalization.
Proposition 4.1.7. Let $k \geq 2$, let $\zeta(s)$ be the Riemann zeta function,
let $q = e^{2\pi i z}$ and let $\sigma_k(n) = \sum_{0 < d \mid n} d^k$ be the sum of the $k$-th powers
of positive divisors of $n$. Then, the $q$-expansion of the Eisenstein
series $G_{2k}(z)$ is given by
\[
G_{2k}(z) = 2\zeta(2k) + \frac{2(2\pi i)^{2k}}{(2k-1)!} \sum_{n \geq 1} \sigma_{2k-1}(n) q^n.
\]
Therefore, the normalized Eisenstein series is
\[
E_{2k}(z) = \frac{1}{2\zeta(2k)} G_{2k}(z) = 1 - \frac{4k}{B_{2k}} \sum_{n \geq 1} \sigma_{2k-1}(n) q^n,
\]
where $B_{2k}$ is the $2k$-th Bernoulli number.

For a proof of this fact, see [Kob93], Ch. III, Prop. 6. The
values of $\zeta(2k)$ can be computed in terms of Bernoulli numbers:
\[
\zeta(2k) = \sum_{n=1}^{\infty} \frac{1}{n^{2k}} = -\frac{(2\pi i)^{2k} B_{2k}}{2(2k)!} \quad \text{for all } k \geq 1.
\]
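
The $q$-expansion of the normalized Eisenstein series in Proposition 4.1.7 is easy to compute exactly. The following is a minimal sketch using rational arithmetic; the Bernoulli numbers are obtained from the standard recurrence $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ (an assumption of this sketch, not a formula from the text), and the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    """Bernoulli number B_m (with B_1 = -1/2), via the standard recurrence."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B[m]

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def eisenstein_E(weight, prec):
    """Coefficients [a_0, ..., a_{prec-1}] of E_weight, for even weight >= 4.

    Uses E_{2k} = 1 - (4k / B_{2k}) * sum_{n>=1} sigma_{2k-1}(n) q^n with 2k = weight.
    """
    c = -Fraction(2 * weight) / bernoulli(weight)
    return [Fraction(1)] + [c * sigma(weight - 1, n) for n in range(1, prec)]
```

For example, `eisenstein_E(4, 3)` gives the coefficients of $E_4(z) = 1 + 240q + 2160q^2 + \cdots$.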
Next, we list a number of properties satisﬁed by modular forms.
Proposition 4.1.8. Let $k, k' \geq 2$ be integers.
(1) Suppose that $f$ and $g$ are modular forms of weight $k$ for
$\mathrm{SL}(2, \mathbb{Z})$. Then, for all $\lambda, \mu \in \mathbb{C}$, the function $\lambda f(z) + \mu g(z)$
is also a modular form of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$. Therefore,
the set of all modular forms of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$ is a
vector space over $\mathbb{C}$.
(2) The set of all cusp forms of weight $k$ for $\mathrm{SL}(2, \mathbb{Z})$ is a
$\mathbb{C}$-linear subspace of the vector space of forms of weight $k$.
(3) Suppose that $f(z)$ and $g(z)$ are respectively modular forms
of weight $k$ and $k'$ for $\mathrm{SL}(2, \mathbb{Z})$. Then the function $f(z) \cdot g(z)$
is a modular form of weight $k + k'$ for $\mathrm{SL}(2, \mathbb{Z})$.

The proof is Exercise 4.5.4.

Deﬁnition 4.1.9. The C-vector space of all modular forms of weight

k for SL(2, Z) is denoted by Mk (SL(2, Z)). The subspace of cusp forms
is denoted by Sk (SL(2, Z)).

Example 4.1.10. Let $g_2(z) = -15 G_4(z)$, $g_3(z) = -35 G_6(z)$, and
define
\[
\Delta(z) = -16\left(4 (g_2(z))^3 + 27 (g_3(z))^2\right).
\]
Then $\Delta$ is a modular form of weight 12 for $\mathrm{SL}(2, \mathbb{Z})$. The modular
form $\Delta$ is usually called the modular discriminant. $\Delta(z)$ has a simple
zero at $\infty$ and no other zeros. The function
\[
j(z) = -1728\, \frac{(4 g_2(z))^3}{\Delta(z)}
\]
is a modular function of weight 0 (as in Definition 4.1.1) but it is not
a modular form. $j(z)$ is analytic on $\mathbb{H}$ but it is not analytic at $\infty$
because $\Delta(z)$ has a zero at $\infty$ but $g_2(z)$ does not vanish at $\infty$, so $j(z)$
has a pole. The function $j(z)$ is called the modular $j$-invariant.

Example 4.1.11. The modular forms f1 (z) = E10 (z) and f2 (z) =
E4 (z) · E6 (z) are both in the space M10 (SL(2, Z)). A priori, they
are distinct modular forms. However, if we knew the dimension of
M10 (SL(2, Z)) and the dimension was 1, then there should be a linear
relationship between both f1 and f2 . Thus, we need to know the
dimensions of spaces of modular forms!

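Theorem 4.1.12. The dimension of $M_k(\mathrm{SL}(2, \mathbb{Z}))$ as a $\mathbb{C}$-vector
space is finite and it is given by
\[
\dim_{\mathbb{C}}(M_k(\mathrm{SL}(2, \mathbb{Z}))) =
\begin{cases}
0 & \text{if } k < 0 \text{ or if } k \text{ is odd;} \\[1mm]
\left[\dfrac{k}{12}\right] & \text{if } k \geq 0,\ k \equiv 2 \bmod 12; \\[2mm]
\left[\dfrac{k}{12}\right] + 1 & \text{otherwise,}
\end{cases}
\]
where $[x]$ is the greatest integer $\leq x$. Moreover, for all integers $k$,
there is an isomorphism $\psi : M_{2k-12}(\mathrm{SL}(2, \mathbb{Z})) \cong S_{2k}(\mathrm{SL}(2, \mathbb{Z}))$ of
$\mathbb{C}$-vector spaces given by $\psi(f(z)) = \Delta(z) f(z)$. In particular,
\[
\dim_{\mathbb{C}}(S_{2k}(\mathrm{SL}(2, \mathbb{Z}))) = \dim_{\mathbb{C}}(M_{2k-12}(\mathrm{SL}(2, \mathbb{Z}))).
\]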

For a proof, see [DS05], Theorem 3.5.2; [Ser77], Ch. VII, The-
orem 4; or [Kob93], Ch. III, Proposition 9.
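
The dimension formula of Theorem 4.1.12 translates directly into a few lines of code. This is a minimal sketch with illustrative function names; `dim_S` relies on the isomorphism $M_{2k-12} \cong S_{2k}$ from the theorem and is only meant for even weights.

```python
def dim_M(k):
    """Dimension of M_k(SL(2,Z)) according to Theorem 4.1.12."""
    if k < 0 or k % 2 == 1:
        return 0
    return k // 12 if k % 12 == 2 else k // 12 + 1

def dim_S(k):
    """Dimension of S_k(SL(2,Z)) for even k, via S_k = Delta * M_{k-12}."""
    return dim_M(k - 12)
```

For example, `dim_M(10)` returns 1 and `dim_S(12)` returns 1, the two facts used in Examples 4.1.13 and 4.1.14.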

Example 4.1.13. Let $f_1(z) = E_{10}(z)$ and $f_2(z) = E_4(z) \cdot E_6(z)$,
which are both in the space $M_{10}(\mathrm{SL}(2, \mathbb{Z}))$. By Theorem 4.1.12, the
dimension of $M_{10}(\mathrm{SL}(2, \mathbb{Z}))$ as a $\mathbb{C}$-vector space is 1. Therefore, there exists
$\lambda \in \mathbb{C}$ such that $f_1(z) = \lambda f_2(z)$. However, both $f_1(z)$ and $f_2(z)$
are normalized modular forms, meaning that the first non-zero
coefficient of their $q$-expansions equals 1. Hence, comparing their
$q$-expansions, we conclude that $\lambda = 1$ and $E_{10}(z) = E_4(z) E_6(z)$.
The equality we just deduced, together with the $q$-expansion of
the Eisenstein series (Prop. 4.1.7), can be rephrased as:
\[
1 - \frac{20}{B_{10}} \sum_{n \geq 1} \sigma_9(n) q^n = \left(1 - \frac{8}{B_4} \sum_{k \geq 1} \sigma_3(k) q^k\right)\left(1 - \frac{12}{B_6} \sum_{h \geq 1} \sigma_5(h) q^h\right).
\]
In particular, if we compare the coefficient of $q^n$ on both sides, we
obtain the following interesting conclusion:
\[
\frac{20}{B_{10}} \sigma_9(n) = \frac{8}{B_4} \sigma_3(n) + \frac{12}{B_6} \sigma_5(n) - \frac{96}{B_4 B_6} \sum_{j=1}^{n-1} \sigma_3(j) \sigma_5(n - j),
\]
where $B_4 = -1/30$, $B_6 = 1/42$ and $B_{10} = 5/66$. We remind the
reader that $\sigma_i(n) = \sum_{0 < d \mid n} d^i$.
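
The divisor-sum identity of Example 4.1.13 can be checked numerically. Substituting $B_4 = -1/30$, $B_6 = 1/42$ and $B_{10} = 5/66$ turns it into the integer identity $264\,\sigma_9(n) = -240\,\sigma_3(n) + 504\,\sigma_5(n) + 120960 \sum_{j=1}^{n-1} \sigma_3(j)\sigma_5(n-j)$. The sketch below tests this form; the function names are illustrative.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def identity_holds(n):
    """Check the coefficient-of-q^n identity coming from E_10 = E_4 * E_6."""
    convolution = sum(sigma(3, j) * sigma(5, n - j) for j in range(1, n))
    left = 264 * sigma(9, n)                       # 20 / B_10 = 264
    right = (-240 * sigma(3, n)                    # 8 / B_4 = -240
             + 504 * sigma(5, n)                   # 12 / B_6 = 504
             + 120960 * convolution)               # -96 / (B_4 * B_6) = 120960
    return left == right
```

For $n = 2$ both sides equal 135432, and `identity_holds(n)` can be evaluated for any further $n \geq 1$.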

Example 4.1.14. Let $\Delta(z)$ be the modular discriminant form (which
is a modular form of weight 12 for $\mathrm{SL}(2, \mathbb{Z})$), as in Example 4.1.10.
It can be shown that
\[
\Delta(z) = (2\pi)^{12}\, q \prod_{n=1}^{\infty} (1 - q^n)^{24},
\]
where $q = e^{2\pi i z}$ as usual. The Ramanujan $\tau$-function is defined by
the coefficients of the $q$-expansion of the normalized $\Delta$ function, i.e.,
\[
(2\pi)^{-12} \Delta(z) = q \prod_{n=1}^{\infty} (1 - q^n)^{24} = \sum_{n \geq 1} \tau(n) q^n.
\]
Using the fact that $S_{12}(\mathrm{SL}(2, \mathbb{Z}))$ is 1-dimensional (by Theorem 4.1.12),
one can show the following surprising congruence:
\[
\tau(n) \equiv \sigma_{11}(n) \equiv \sum_{0 < d \mid n} d^{11} \bmod 691 \quad \text{for all } n \geq 1.
\]
A proof of this congruence is outlined in Exercise 4.5.7.
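
The values of $\tau(n)$ and the congruence modulo 691 can be verified for small $n$ by expanding the product $q \prod (1 - q^n)^{24}$ as a truncated power series. The following is a minimal sketch; the function names and the in-place multiplication scheme are illustrative choices.

```python
def ramanujan_tau(prec):
    """Return [tau(1), ..., tau(prec)] from q * prod_{n>=1} (1 - q^n)^24."""
    series = [1] + [0] * (prec - 1)      # series[i] = coefficient of q^i in the product
    for n in range(1, prec):
        for _ in range(24):              # multiply by (1 - q^n), 24 times
            for i in range(prec - 1, n - 1, -1):
                series[i] -= series[i - n]
    return series                        # the extra factor q shifts indices: tau(m) = series[m-1]

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def congruence_holds(n, taus):
    """Check tau(n) = sigma_11(n) mod 691, given taus = ramanujan_tau(prec) with prec >= n."""
    return (taus[n - 1] - sigma(11, n)) % 691 == 0
```

The first values produced are $\tau(1), \ldots, \tau(6) = 1, -24, 252, -1472, 4830, -6048$.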

4.2. Modular forms for congruence subgroups

Deﬁnition 4.2.1. Let Γ be a congruence subgroup of SL(2, Z). A
function f : H → C ∪ {∞} is weakly modular of weight k for Γ if:
(1) f is meromorphic on H; and

(2) $f(Mz) = (cz + d)^k f(z)$ for any $M \in \Gamma$; i.e.,
$f\left(\frac{az+b}{cz+d}\right) = (cz + d)^k f(z)$, where $M$ is a matrix $(a, b; c, d) \in \Gamma$.

Since $\Gamma$ is a congruence subgroup, there must be an $N \geq 1$ such
that $\Gamma(N) \subseteq \Gamma$. If $f$ is a weakly modular function of weight $k$ for $\Gamma$,
and $N$ is the smallest positive integer such that $\Gamma(N) \subseteq \Gamma$,
then we say that the level of $f$ is $N$. If $f$ is weakly modular
of weight $k$ and level $N$, then the matrix
\[
T_N = \begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}
\]
belongs to $\Gamma$. Therefore,
\[
f(z) = f(T_N z) = f(z + N) \quad \text{for all } z \in \mathbb{H}.
\]
This periodicity of $f$ means that, if we set $q_N = e^{2\pi i z/N}$, then we can
express $f$ as a Laurent series:
\[
f(z) = \sum_{n=-\infty}^{\infty} a_n (q_N)^n,
\]
where the $a_n$ are the Fourier coefficients of $f$.

(a) We say that $f$ is a modular function of weight $k$ for $\Gamma$ if
$f$ satisfies (1) and (2) above and $f$ is meromorphic at the
cusps of the modular curve $X(\Gamma) = \mathbb{H}^*/\Gamma$. This means
that, for any matrix $M = (a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$, the function
$f_M(z) = (cz + d)^{-k} f\left(\frac{az+b}{cz+d}\right)$ has a Laurent expansion
\[
f_M(z) = \sum_{n=-m}^{\infty} c_n (q_N)^n.
\]
See Remark 4.2.2 below for a discussion about this requirement.
(b) $f$ is a modular form of weight $k$ for $\Gamma$ if $f$ is a modular
function of weight $k$ and it is analytic on $\mathbb{H}$ and at the cusps
of $X(\Gamma)$. This means that, for all $M \in \mathrm{SL}(2, \mathbb{Z})$, the function
$f_M(z)$ has a Taylor expansion
\[
f_M(z) = \sum_{n=0}^{\infty} c_n (q_N)^n.
\]

(c) $f$ is a cusp form of weight $k$ for $\Gamma$ if $f$ is a modular form
of weight $k$ and it vanishes at all the cusps of $X(\Gamma)$. This
means that, for all $M \in \mathrm{SL}(2, \mathbb{Z})$, the function $f_M(z)$ has a
Taylor expansion
\[
f_M(z) = \sum_{n=1}^{\infty} c_n (q_N)^n, \quad \text{i.e., } c_0 = 0.
\]

Remark 4.2.2. In part (a) we request that $f$ should be meromorphic
at the cusps of $X(\Gamma)$. It is simple to check whether $f$ is meromorphic
at the cusp $\infty$. Indeed, $f$ is meromorphic at $\infty$ if it has a Laurent
expansion of the form $f(z) = \sum_{n=-m}^{\infty} c_n (q_N)^n$. In order to check if
$f$ is meromorphic at another cusp, say $s = -\frac{d}{c} \in \mathbb{Q}$, we first do a
change of variables that brings $-d/c$ to $\infty$. This is best accomplished
by a linear fractional transformation $z \mapsto \frac{az+b}{cz+d}$ (see Prop. 3.4.3). The
function $f_M(z)$ is precisely the result of this change of variables on
$f(z)$. Thus, if $f_M(z)$ is meromorphic at $\infty$, then $f(z)$ is meromorphic
at $-d/c$, so we just need to check the Laurent expansion at $\infty$ of
$f_M(z)$.
Notice also that $X(\Gamma)$ only has a finite number of cusps, say
$s_1, s_2, \ldots, s_n$, so one only needs to check the condition in part (a) of
Defn. 4.2.1 for a finite number of matrices $M_1, M_2, \ldots, M_n$ such that
$M_i$ sends $s_i$ to $\infty$.

Let Γ be a congruence subgroup. As in the case of modular forms

for SL(2, Z) (cf. Proposition 4.1.8), the set of all modular forms of
weight k for Γ is a C-vector space and the subset of cusp forms is a
linear subspace.
Proposition 4.2.3. Let k ≥ 2 and let Γ be a congruence subgroup.
Let Mk (Γ) be the set of all modular forms of weight k for Γ, and let
Sk (Γ) be the subset that consists of all cusp forms of weight k for Γ.
(1) Mk (Γ) is a C-vector space and Sk (Γ) is a C-linear subspace
of Mk (Γ).
(2) Let $\Gamma'$ be another congruence subgroup contained in $\Gamma$, i.e.,
$\Gamma' \leq \Gamma$. Then any modular form $f(z)$ of weight $k$ for $\Gamma$
is also a modular form of the same weight for $\Gamma'$. Therefore,
$M_k(\Gamma)$ is a $\mathbb{C}$-linear subspace of $M_k(\Gamma')$, and
$S_k(\Gamma) \subseteq S_k(\Gamma')$.
(3) Let $k$ and $k'$ be positive integers. If $f(z) \in M_k(\Gamma)$ and
$g(z) \in M_{k'}(\Gamma)$, then $(f \cdot g)(z)$ is a modular form in $M_{k+k'}(\Gamma)$.
Remark 4.2.4. Let Γ be a congruence subgroup. Then Γ ≤ SL(2, Z).
Thus, Proposition 4.2.3 implies that Mk (SL(2, Z)) is always a C-linear
subspace of Mk (Γ). Also, recall that Γ(N ) ≤ Γ1 (N ) ≤ Γ0 (N ). There-
fore, Mk (Γ0 (N )) ⊆ Mk (Γ1 (N )) ⊆ Mk (Γ(N )).
Remark 4.2.5. Let N ≥ 1 and let M be a positive divisor of N .
Then
Γ0 (N ) ≤ Γ0 (M ), Γ1 (N ) ≤ Γ1 (M ), and Γ(N ) ≤ Γ(M ).
Therefore, for any k ≥ 1, Mk (Γ0 (M )) ⊆ Mk (Γ0 (N )), Mk (Γ1 (M )) ⊆
Mk (Γ1 (N )), and Mk (Γ(M )) ⊆ Mk (Γ(N )).
Also, suppose that $N = M M'$, where $1 < M, M' < N$, so that
$M$ and $M'$ are proper divisors of $N$. Suppose that $g(z) \in M_k(\Gamma(M))$.
Then, it is an exercise (Exercise 4.5.9) to show that $f(z) := g(M' z)$
belongs to $M_k(\Gamma(N))$.

Definition 4.2.6. Let $N, k \geq 1$ be integers. A modular form $f(z)$
of weight $k$ for $\Gamma(N)$ is said to be an old form if there is some
positive divisor $M$ of $N$, with $M < N$, such that $f(z)$ is a modular form in the space
$M_k(\Gamma(M))$. The $\mathbb{C}$-linear subspace spanned by the set of all old forms
of $M_k(\Gamma(N))$ is usually denoted by $M_k^{\mathrm{old}}(\Gamma(N))$. We also define
\[
S_k^{\mathrm{old}}(\Gamma(N)) := M_k^{\mathrm{old}}(\Gamma(N)) \cap S_k(\Gamma(N)).
\]
In Definition 4.3.2, we will define the space of new cusp forms as
the orthogonal complement of $S_k^{\mathrm{old}}(\Gamma(N))$ in $S_k(\Gamma(N))$ with respect
to the Petersson inner product (see Section 4.3).
As for SL(2, Z), the Eisenstein series (of level N ) are the main
non-trivial examples (compare the following definition with Proposi-
tion 4.1.5).
Definition 4.2.7. Let $k \geq 3$, $N \geq 1$ and let $a = (a_1, a_2) \in \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}$
be non-zero, i.e., $a \not\equiv (0, 0) \bmod N$. We define the Eisenstein
series of level $N$ and weight $k$, corresponding to $a$, by
\[
G_k^a(z) = \sum_{(m,n) \equiv a \bmod N} \frac{1}{(mz + n)^k},
\]
where the sum is over all $(m, n) \in \mathbb{Z} \times \mathbb{Z}$ such that $m \equiv a_1$ and
$n \equiv a_2 \bmod N$.

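Notice that if $a \equiv (0, 0) \bmod N$, then we would have
\[
G_k^a(z) = \sum_{\substack{(m,n) \equiv (0,0) \bmod N \\ (m,n) \neq (0,0)}} \frac{1}{(mz + n)^k}
= \sum_{\substack{(a,b) \in \mathbb{Z}^2 \\ (a,b) \neq (0,0)}} \frac{1}{(Naz + Nb)^k} = N^{-k} G_k(z),
\]
where $G_k(z)$ is the classical Eisenstein modular form of weight $k$ for
$\mathrm{SL}(2, \mathbb{Z})$.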
Proposition 4.2.8. Let $k \geq 3$, $N \geq 1$ and let $a = (a_1, a_2) \in \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}$
be non-zero. Then
\[
G_k^a(z) \in M_k(\Gamma(N)) \quad \text{and} \quad G_k^{(0,a_2)}(z) \in M_k(\Gamma_1(N))
\]
for any $a_2 \not\equiv 0 \bmod N$.

See Exercise 4.5.8 for a proof of the invariance under $\Gamma(N)$ and
$\Gamma_1(N)$.

Remark 4.2.9. The Eisenstein series are very useful because most
of the spaces we are discussing in this book have a basis formed by
Eisenstein series, and we can calculate their q-expansions. For pre-
cise statements see [Miy06], Chapter 7, or [Ste07], Chapter 5 (in
particular, see Section 5.3).

Remark 4.2.10. Let k ≥ 1 and let Γ be a congruence subgroup.

Then Mk (Γ) is a ﬁnite-dimensional C-vector space. The formulas
for the dimension of the spaces of modular forms Mk (Γ) and Sk (Γ)
can be found by calculating the genus and the number of cusps of
the modular curve X(Γ). Since we will not use these formulas here,
we simply refer the reader to [DS05], Theorem 3.5.1 and Figure 3.3
(page 108).

Example 4.2.11. Let $N = 11$ and let the weight be 2. The space
$M_2(\Gamma_0(11))$ is a 2-dimensional $\mathbb{C}$-vector space with basis elements
$\{f, g\}$ given by the $q$-expansions
\[
f(q) = q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - 2q^9 - 2q^{10} + O(q^{11}),
\]
\[
g(q) = 1 + \frac{12}{5} q + \frac{36}{5} q^2 + \frac{48}{5} q^3 + \frac{84}{5} q^4 + \frac{72}{5} q^5 + \frac{144}{5} q^6 + O(q^7),
\]
where $q = e^{2\pi i z}$. Thus, we deduce that $S_2(\Gamma_0(11))$ is 1-dimensional,
generated by $f(q)$.
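
The two $q$-expansions of Example 4.2.11 can be reproduced from closed formulas. The sketch below assumes two classical identities that are not stated in this text: the cusp form is the eta product $q \prod_{n \geq 1} (1 - q^n)^2 (1 - q^{11n})^2$, and the non-cuspidal basis element is $1 + \frac{12}{5} \sum_{n \geq 1} \left(\sigma_1(n) - 11\,\sigma_1(n/11)\right) q^n$, with $\sigma_1(n/11) = 0$ when $11 \nmid n$. Function names are illustrative.

```python
from fractions import Fraction

def cusp_form_11(prec):
    """Return [a_1, ..., a_prec] for q * prod (1 - q^n)^2 (1 - q^(11n))^2 (assumed identity)."""
    series = [1] + [0] * (prec - 1)          # product without the leading factor q
    for n in range(1, prec):
        for step in (n, 11 * n):
            if step < prec:
                for _ in range(2):           # each factor appears squared
                    for i in range(prec - 1, step - 1, -1):
                        series[i] -= series[i - step]
    return series                            # a_m = series[m - 1]

def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def eisenstein_11(prec):
    """Return [b_0, ..., b_{prec-1}] for the assumed weight 2 Eisenstein series of level 11."""
    coeffs = [Fraction(1)]
    for n in range(1, prec):
        s = sigma1(n) - (11 * sigma1(n // 11) if n % 11 == 0 else 0)
        coeffs.append(Fraction(12, 5) * s)
    return coeffs
```

Both functions agree with the coefficients printed in the example, including the vanishing coefficient of $q^8$ in $f$.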

Example 4.2.12. Let $N = 37$ and let the weight be 2. The space
$M_2(\Gamma_0(37))$ is a 3-dimensional $\mathbb{C}$-vector space with basis elements
$\{f, g, h\}$ given by the $q$-expansions:
\[
f(q) = q + q^3 - 2q^4 - q^7 - 2q^9 + 3q^{11} - 2q^{12} - 4q^{13} + O(q^{16}),
\]
\[
g(q) = q^2 + 2q^3 - 2q^4 + q^5 - 3q^6 - 4q^9 - 2q^{10} + 4q^{11} + O(q^{12}),
\]
\[
h(q) = 1 + \frac{2}{3} q + 2q^2 + \frac{8}{3} q^3 + \frac{14}{3} q^4 + 4q^5 + 8q^6 + \frac{16}{3} q^7 + O(q^8),
\]
where, once again, $q = e^{2\pi i z}$. Thus, we deduce that $S_2(\Gamma_0(37))$ is
2-dimensional, generated by $f(q)$ and $g(q)$.

4.3. The Petersson inner product

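Let $\Gamma$ and $\Gamma'$ be congruence subgroups with $\Gamma' \subseteq \Gamma$. Let $\mathcal{F}(\Gamma') \subseteq \mathbb{C}$
be a fundamental domain for the modular curve $X(\Gamma') = \mathbb{H}^*/\Gamma'$, and
suppose $f(z)$ and $g(z)$ are modular forms in $M_k(\Gamma')$ such that at least
one of them is a cusp form in $S_k(\Gamma')$. We define the Petersson inner
product of $f$ and $g$ by
\[
\langle f, g \rangle := \frac{1}{[\overline{\Gamma} : \overline{\Gamma}']} \int_{\mathcal{F}(\Gamma')} f(x + iy)\, \overline{g(x + iy)}\, y^k\, \frac{dx\,dy}{y^2} \in \mathbb{C},
\]
where $\overline{\Gamma} = \Gamma/\{\pm \mathrm{Id}\}$ and $\overline{g(z)}$ is the complex conjugate of $g(z)$.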
Remark 4.3.1. A number of remarks are in order:
(1) The Euclidean measure $dx\,dy$ on $\mathbb{C}$ has been replaced by the
hyperbolic measure $\frac{dx\,dy}{y^2}$ on $\mathbb{H}$. This makes sense because the
hyperbolic measure is invariant under $\mathrm{SL}(2, \mathbb{Z})$. This means
that, if $M \in \mathrm{SL}(2, \mathbb{Z})$ and $F$ is a region in $\mathbb{H}$, then
\[
\int_{F} \frac{dx\,dy}{y^2} = \int_{MF} \frac{dx\,dy}{y^2},
\]
where $MF = \{Mz : z \in F\}$ and the matrix $M = (a, b; c, d)$
acts on $z$ as usual by $Mz = \frac{az+b}{cz+d}$.
(2) The integral in the definition of $\langle f(z), g(z) \rangle$ does not
converge (in general) if neither $f$ nor $g$ is a cusp form.
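(3) The Petersson inner product is a Hermitian inner product
on $S_k(\Gamma')$, i.e.,
• $\langle f(z), g(z) \rangle$ is linear in $f$:
\[
\langle \lambda_1 f_1(z) + \lambda_2 f_2(z), g(z) \rangle = \lambda_1 \langle f_1(z), g(z) \rangle + \lambda_2 \langle f_2(z), g(z) \rangle
\]
for any $\lambda_1, \lambda_2 \in \mathbb{C}$ and any $f_1$, $f_2$ and $g \in S_k(\Gamma')$;
• $\langle f(z), g(z) \rangle$ is antilinear in $g$:
\[
\langle f(z), \lambda_1 g_1(z) + \lambda_2 g_2(z) \rangle = \overline{\lambda_1} \langle f(z), g_1(z) \rangle + \overline{\lambda_2} \langle f(z), g_2(z) \rangle,
\]
where $\overline{\lambda}$ is the complex conjugate of $\lambda \in \mathbb{C}$;
• $\langle f(z), g(z) \rangle$ is conjugate-symmetric, i.e.,
\[
\langle f(z), g(z) \rangle = \overline{\langle g(z), f(z) \rangle};
\]
and
• $\langle f(z), f(z) \rangle > 0$ for $f(z) \neq 0$.
Therefore, the Petersson inner product makes $S_k(\Gamma')$ into
an inner product space.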
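(4) Suppose that $\Gamma''$ is another congruence subgroup with
$\Gamma'' \leq \Gamma' \leq \Gamma$, and let $f(z)$ and $g(z)$ be modular forms for $\Gamma'$.
Then, $f$ and $g$ may also be considered as modular forms for
$\Gamma''$, and
\[
\langle f, g \rangle = \frac{1}{[\overline{\Gamma} : \overline{\Gamma}']} \int_{\mathcal{F}(\Gamma')} f(x + iy)\, \overline{g(x + iy)}\, y^k\, \frac{dx\,dy}{y^2}
= \frac{1}{[\overline{\Gamma} : \overline{\Gamma}'']} \int_{\mathcal{F}(\Gamma'')} f(x + iy)\, \overline{g(x + iy)}\, y^k\, \frac{dx\,dy}{y^2}.
\]
Thus, $\langle f, g \rangle$ gives the same value whether we consider $f$ and
$g$ as modular forms for $\Gamma'$ or for $\Gamma''$.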

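Definition 4.3.2. Let $N, k \geq 1$. Let $S_k^{\mathrm{old}}(\Gamma(N))$ be the subspace
of $S_k(\Gamma(N))$ of old forms, defined in Definition 4.2.6. We define a
subspace of new forms, denoted by $S_k^{\mathrm{new}}(\Gamma(N))$, as the orthogonal
complement of $S_k^{\mathrm{old}}(\Gamma(N))$ in $S_k(\Gamma(N))$ with respect to the Petersson
inner product, i.e.,
\[
S_k^{\mathrm{new}}(\Gamma(N)) := S_k^{\mathrm{old}}(\Gamma(N))^{\perp}
= \{f(z) \in S_k(\Gamma(N)) : \langle f(z), g(z) \rangle = 0 \text{ for all } g \in S_k^{\mathrm{old}}(\Gamma(N))\}.
\]
Hence, $S_k(\Gamma(N))$ factors as:
\[
S_k(\Gamma(N)) = S_k^{\mathrm{new}}(\Gamma(N)) \oplus S_k^{\mathrm{old}}(\Gamma(N))
= S_k^{\mathrm{old}}(\Gamma(N))^{\perp} \oplus S_k^{\mathrm{old}}(\Gamma(N)).
\]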

4.4. Hecke operators acting on cusp forms

Let N ≥ 1 and let Sk (Γ0 (N )) be the space of cusp forms of weight
k for Γ0 (N ). In this section we deﬁne a collection of C-linear maps
from Sk (Γ0 (N )) to Sk (Γ0 (N )) that will be very important in Chapter
5. In particular, we will be very interested in the eigenvalues and
eigenvectors associated to these linear maps.

4.4.1. The wN operator.

Deﬁnition 4.4.1. Let N, k ≥ 1 and let f (z) be a modular form in

Sk (Γ0 (N )). The operator wN on Sk (Γ0 (N )) is a linear map

wN : Sk (Γ0 (N )) → Sk (Γ0 (N ))

deﬁned by
\[
(w_N(f))(z) = i^k \cdot (\sqrt{N}\, z)^{-k} \cdot f\left(\frac{-1}{Nz}\right).
\]

Proposition 4.4.2. Let N, k ≥ 1 and let f (z) be a modular form in

Sk (Γ0 (N )). Then:

• wN (f ) is also a modular form in Sk (Γ0 (N ));

• wN is C-linear, i.e., wN (λ · f + μ · g) = λ · wN (f ) + μ · wN (g)
for all f, g ∈ Sk (Γ0 (N )) and all λ, μ ∈ C; and
• The square of wN is the identity, i.e., wN (wN (f )) = f .

Therefore, $w_N^2 = \mathrm{Id}$ and the eigenvalues of $w_N$ are $+1$ or $-1$. Thus,
$S_k(\Gamma_0(N))$ can be expressed as the direct sum of the eigenspace
corresponding to $+1$ and the eigenspace corresponding to $-1$; i.e., if we
define spaces
\[
S_k^{+}(\Gamma_0(N)) = \{f \in S_k(\Gamma_0(N)) : w_N(f) = f\},
\]
\[
S_k^{-}(\Gamma_0(N)) = \{f \in S_k(\Gamma_0(N)) : w_N(f) = -f\},
\]
then $S_k(\Gamma_0(N))$ factors as
\[
S_k(\Gamma_0(N)) = S_k^{+}(\Gamma_0(N)) \oplus S_k^{-}(\Gamma_0(N)).
\]

We shall see in the next chapter that if f (z) ∈ Sk (Γ0 (N )) is in

the ε = ±1 eigenspace, then the sign in the functional equation of the
L-series attached to f is precisely ε. The proof of Proposition 4.4.2
is an exercise (Exercise 4.5.11).

4.4.2. The diamond operators. Let $\delta \in \mathbb{Z}$ and $N, k \geq 1$. The
diamond operator $\langle \delta \rangle$ is a linear map from $M_k(\Gamma_1(N))$ to itself, defined
as follows.

Definition 4.4.3. Let $\delta \in \mathbb{Z}$ be fixed. Let $M = (a, b; c, d)$ be a
matrix in $\Gamma_0(N)$ such that $d \equiv \delta \bmod N$. The diamond operator $\langle \delta \rangle$
is a linear map $\langle \delta \rangle : M_k(\Gamma_1(N)) \to M_k(\Gamma_1(N))$ defined by
\[
(\langle \delta \rangle f)(z) = (cz + d)^{-k} f(Mz) = (cz + d)^{-k} f\left(\frac{az + b}{cz + d}\right).
\]
Exercise 4.5.12 shows that the definition of $\langle \delta \rangle$ does not depend
on the choice of a matrix $M$. Thus, $\langle \delta \rangle$ is determined by the value
of $\delta \bmod N$, so there are $N$ distinct diamond operators, one for each
value $0, 1, \ldots, N - 1$. Notice that $\langle 1 \rangle f = f$ is the identity operator,
because we can pick $M = \mathrm{Id}$ in the definition of the diamond operator.
Moreover, the following proposition shows that the diamond operators
with $(\delta, N) = 1$ form a group under multiplication.

Proposition 4.4.4. Let $N, k \geq 1$ be fixed and let $\delta, \delta' \in \mathbb{Z}$ with
$(\delta \delta', N) = 1$. Then $\langle \delta' \rangle (\langle \delta \rangle f) = \langle \delta \rangle (\langle \delta' \rangle f) = \langle \delta' \delta \rangle f$. In particular,
$\langle \delta \rangle^{\varphi(N)} = \langle 1 \rangle = \mathrm{Id}$ and the eigenvalues of $\langle \delta \rangle$ must be roots of unity
of order dividing $\varphi(N)$, where $\varphi$ is the Euler phi function.

The proof of this proposition is left to the reader: Exercise 4.5.13.

Let $\mu_{\varphi(N)}$ be the set of all roots of unity of order dividing $\varphi(N)$.
Then, for each $\delta \in \mathbb{Z}$ and every $\zeta \in \mu_{\varphi(N)}$, there is an eigenspace
of $M_k(\Gamma_1(N))$ formed by eigenvectors with eigenvalue $\zeta$. More
concretely, let $\delta \in \mathbb{Z}$ be fixed. Then, for each $\zeta \in \mu_{\varphi(N)}$, the set
\[
M_k(\Gamma_1(N), \delta, \zeta) = \{f(z) \in M_k(\Gamma_1(N)) : (\langle \delta \rangle f)(z) = \zeta \cdot f(z)\}
\]
is a linear subspace of $M_k(\Gamma_1(N))$, which is the eigenspace for $\langle \delta \rangle$
formed by all eigenvectors with eigenvalue $\zeta$. Furthermore, for each
$\delta \in \mathbb{Z}$, the space of modular forms $M_k(\Gamma_1(N))$ can be decomposed as
a direct sum of eigenspaces:
\[
M_k(\Gamma_1(N)) = \bigoplus_{\zeta \in \mu_{\varphi(N)}} M_k(\Gamma_1(N), \delta, \zeta).
\]

As it turns out, one can show a much more interesting decompo-

sition of Mk (Γ1 (N )), as follows.
Proposition 4.4.5. Let $N, k \geq 1$ be fixed. For every group
homomorphism $\chi : (\mathbb{Z}/N\mathbb{Z})^\times \to \mathbb{C}^\times$ (i.e., a character) we define
\[
M_k(N, \chi) = \{f \in M_k(\Gamma_1(N)) : \langle \delta \rangle f = \chi(\delta) f \text{ for all } \delta \in (\mathbb{Z}/N\mathbb{Z})^\times\}.
\]
Then $M_k(\Gamma_1(N)) = \bigoplus_{\chi} M_k(N, \chi)$, where the direct sum is over all
possible characters $\chi$ of $(\mathbb{Z}/N\mathbb{Z})^\times$.

The reader should check (Exercise 4.5.14) that if $\chi_0$ is the trivial
character (i.e., $\chi_0(\delta) = 1$ for all $(\delta, N) = 1$), then $M_k(N, \chi_0) =
M_k(\Gamma_0(N))$.

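4.4.3. The $T_n$ operators. Before we define the Hecke operators $T_n$,
we need to define the auxiliary operators $U_m$ and $V_m$.

Definition 4.4.6. Let $m \geq 1$ and let $f \in M_k(\Gamma_1(N))$. We define
operators $V_m$ and $U_m$ by
\[
(V_m(f))(z) = f(mz) \quad \text{and} \quad (U_m(f))(z) = \frac{1}{m} \sum_{j=0}^{m-1} f\left(\frac{z + j}{m}\right).
\]
If $f$ is given by a $q$-expansion $f(z) = \sum_{n \geq 0} a_n q^n$, then
\[
V_m(f) = \sum_{n \geq 0} a_n q^{mn} \quad \text{and} \quad U_m(f) = \sum_{n \equiv 0 \bmod m} a_n q^{n/m}.
\]
Recall that in Prop. 4.4.5 we defined spaces $M_k(N, \chi)$ by
\[
M_k(N, \chi) = \{f \in M_k(\Gamma_1(N)) : \langle \delta \rangle f = \chi(\delta) f \text{ for all } \delta \in (\mathbb{Z}/N\mathbb{Z})^\times\}.
\]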
Definition 4.4.7. Let $f(z) \in M_k(N, \chi)$ and suppose $f(z)$ is given by
a $q$-expansion $f(z) = \sum_{n \geq 0} a_n q^n$. Let $p \geq 2$ be a prime. We define
an operator $T_p$ by
\[
T_p(f) = U_p(f) + \chi(p) p^{k-1} V_p(f),
\]
where $\chi(p) = 0$ if $N \equiv 0 \bmod p$. Equivalently,
\[
T_p(f(z)) = \sum_{n \geq 0} b_n q^n, \quad \text{such that } b_n = a_{pn} + \chi(p) p^{k-1} a_{n/p}
\]
and $a_{n/p} = 0$ if $n \not\equiv 0 \bmod p$. In particular, if $\chi_0$ is trivial,
$p$ does not divide $N$, and $f \in M_k(N, \chi_0) = M_k(\Gamma_0(N))$, then
\[
T_p(f) = U_p(f) + p^{k-1} V_p(f).
\]
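
On $q$-expansions, Definition 4.4.7 is a simple rule on coefficient lists. The sketch below implements $b_n = a_{pn} + \chi(p) p^{k-1} a_{n/p}$ for a truncated expansion and tests it on the normalized discriminant $\sum \tau(n) q^n$ of Example 4.1.14 (level 1, weight 12, trivial character). The function names and the convention of passing the value $\chi(p)$ as a parameter are illustrative assumptions.

```python
def hecke_Tp(coeffs, p, k, chi_p=1):
    """Apply T_p to a truncated q-expansion.

    coeffs = [a_0, ..., a_M]; returns [b_0, ..., b_{M // p}] with
    b_n = a_{pn} + chi(p) * p^(k-1) * a_{n/p}, where a_{n/p} = 0 unless p divides n.
    """
    top = (len(coeffs) - 1) // p
    result = []
    for n in range(top + 1):
        b = coeffs[p * n]
        if n % p == 0:
            b += chi_p * p ** (k - 1) * coeffs[n // p]
        result.append(b)
    return result

def delta_coeffs(prec):
    """Return [0, tau(1), ..., tau(prec - 1)], the coefficients of q * prod (1 - q^n)^24."""
    prod = [1] + [0] * (prec - 2)
    for n in range(1, prec - 1):
        for _ in range(24):
            for i in range(prec - 2, n - 1, -1):
                prod[i] -= prod[i - n]
    return [0] + prod
```

Since $S_{12}(\mathrm{SL}(2, \mathbb{Z}))$ is 1-dimensional, $T_2$ must send the normalized discriminant to a multiple of itself, and the output of `hecke_Tp(delta_coeffs(40), 2, 12)` is indeed $-24$ times the input coefficients, with $-24 = \tau(2)$.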

Next, we deﬁne Hecke operators Tn for all n ≥ 1.

Definition 4.4.8. Let $f \in M_k(N, \chi)$. We define Hecke operators $T_n$
for all $n \geq 1$ as follows:
• If $n = p \geq 2$ is a prime, then $T_p(f) = U_p(f) + \chi(p) p^{k-1} V_p(f)$
as before;
• If $n = p^r$ and $p \mid N$, then $T_{p^r} = (T_p)^r$, i.e., $T_p$ composed $r$
times with itself;
• If $n = p^r$ and $p \nmid N$, then $T_{p^r}$ can be calculated using the
following recurrence relation:
\[
T_p \cdot T_{p^r} = T_{p^{r+1}} + p^{k-1} \langle p \rangle T_{p^{r-1}}.
\]
• If $(n, m) = 1$, then $T_{nm}(f) = (T_n \cdot T_m)(f) = (T_m \cdot T_n)(f) =
T_m(T_n(f))$.
Remark 4.4.9. There are several equivalent ways to define Hecke
operators. Tn can be defined as above, or as a function on lattices,
or as a double coset operator. See [DS05], [Kob93] or [Mil06] for
alternative definitions.

Every Hecke operator Tn deﬁnes a linear map Tn : Mk (N, χ) →

Mk (N, χ). As in the case of the wN and diamond operators, we
are interested in the eigenvalues and eigenvectors of the operators
Tn . Surprisingly, there exist eigenvectors f which satisfy Tn (f ) =
λn f for all n ≥ 1, i.e., f is an eigenvector for all Hecke operators
simultaneously! These eigenvectors are of particular interest in the
theory, as we shall see in the next chapter.
Deﬁnition 4.4.10. Let f (z) ∈ Mk (N, χ) ⊂ Mk (Γ1 (N )). We say that
f (z) is an eigenform if f is an eigenvector for all Hecke operators Tn ,
n ≥ 1, simultaneously. In other words, f is an eigenform if, for all
n ≥ 1, there exist eigenvalues λn ∈ C such that
Tn (f ) = λn f.
Theorem 4.4.11 (Hecke; [Kob93], Ch. III, Prop. 40). Let $k \geq 1$
and suppose that $f(z)$ is an eigenform in the space $M_k(N, \chi) \subset M_k(\Gamma_1(N))$,
with $T_n(f) = \lambda_n f$ for all $n \geq 1$, for some $\lambda_n \in \mathbb{C}$. Suppose
further that $f$ has a q-expansion of the form $f(z) = \sum_{n \geq 0} a_n q^n$.
Then:
(1) $a_1 \neq 0$ and $a_n = \lambda_n a_1$ for all $n \geq 1$; and
(2) if $a_0 \neq 0$, then the eigenvalues are given by the formula
$\lambda_n = \sum_{d \mid n} \chi(d) d^{k-1}$.

Example 4.4.12. Let $k \geq 2$ and let
\[ E_{2k}(z) = \frac{1}{2\zeta(2k)} G_{2k}(z) = 1 - \frac{4k}{B_{2k}} \sum_{n \geq 1} \sigma_{2k-1}(n) q^n \]
be the (normalized) Eisenstein series of weight $2k$ for SL(2, Z), as in
Proposition 4.1.7. We can write
\[ \widetilde{E}_{2k}(z) = -\frac{B_{2k}}{4k} E_{2k}(z) = -\frac{B_{2k}}{4k} + \sum_{n \geq 1} \sigma_{2k-1}(n) q^n. \]
Therefore, $a_1 = 1$ and $a_n = \sigma_{2k-1}(n) = \sum_{0 < d \mid n} d^{2k-1}$. Since $\widetilde{E}_{2k}$
is a modular form for SL(2, Z), it may also be considered as a form
for $\Gamma_0(N)$ for any $N \geq 1$. Hence, $\widetilde{E}_{2k} \in M_{2k}(N, \chi_0) = M_{2k}(\Gamma_0(N))$,
where $\chi_0$ is the trivial character of $(\mathbb{Z}/N\mathbb{Z})^\times$, and so
\[ a_n = \sum_{0 < d \mid n} \chi_0(d) d^{2k-1} \]
since $\chi_0(d) = 1$. Also notice that $a_0 = -B_{2k}/4k \neq 0$, so $\widetilde{E}_{2k}$ is not
a cusp form. Hence, Hecke's theorem 4.4.11 suggests that $\widetilde{E}_{2k}$ may
be an eigenform; that is, it suggests that $\widetilde{E}_{2k}$ is an eigenvector for
all $T_n$, with $n \geq 1$, with eigenvalue $a_n$. In other words, $T_n(\widetilde{E}_{2k}) = \sigma_{2k-1}(n) \widetilde{E}_{2k}$
for all $n \geq 1$. This equality is left as an exercise for the
reader (see Exercise 4.5.15).
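The coefficient formula $b_n = a_{pn} + \chi(p) p^{k-1} a_{n/p}$ of Definition 4.4.7 is easy to test on a truncated q-expansion. The following is a minimal sketch, not part of the original text: it assumes level $N = 1$ and trivial character, takes weight 4 (so the power of $p$ is $p^3$), and uses $B_4 = -1/30$, which gives constant term $-B_4/8 = 1/240$. The function names are illustrative only.

```python
from fractions import Fraction


def sigma(n, power):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d**power for d in range(1, n + 1) if n % d == 0)


def eisenstein_tilde_weight4(num_terms):
    """Coefficients a_0, ..., a_{num_terms-1} of -B_4/8 + sum_{n>=1} sigma_3(n) q^n.

    Assumes B_4 = -1/30, so the constant term is 1/240.
    """
    coeffs = [Fraction(1, 240)]
    coeffs += [Fraction(sigma(n, 3)) for n in range(1, num_terms)]
    return coeffs


def hecke_tp(coeffs, p, weight, chi_p=1):
    """Apply T_p to a truncated q-expansion: b_n = a_{pn} + chi(p) p^(weight-1) a_{n/p}.

    Only the indices n with p*n inside the truncation are returned.
    """
    usable = (len(coeffs) - 1) // p + 1
    result = []
    for n in range(usable):
        b_n = coeffs[p * n]
        if n % p == 0:  # a_{n/p} is zero unless p divides n
            b_n += chi_p * p ** (weight - 1) * coeffs[n // p]
        result.append(b_n)
    return result


if __name__ == "__main__":
    series = eisenstein_tilde_weight4(60)
    for p in (2, 3, 5):
        image = hecke_tp(series, p, weight=4)
        expected = [sigma(p, 3) * a for a in series[: len(image)]]
        print(p, image == expected)
```

Each comparison checks, up to the truncation, that $T_p$ multiplies the weight-4 series by $\sigma_3(p)$. This is numerical evidence for the eigenform property, not a proof of it.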
Remark 4.4.13. One can show directly that the eigenvalues
$\lambda_n = \sum_{d \mid n} \chi(d) d^{k-1}$ satisfy the recursive relation dictated by our definition
of the Hecke operators (as in Definition 4.4.8). This is left as an
exercise for the reader (Exercise 4.5.16).
Definition 4.4.14. In the notation of Theorem 4.4.11, an eigenform
$f \in M_k(N, \chi) \subset M_k(\Gamma_1(N))$ is said to be normalized if $a_1 = 1$ (cf.
Definition 4.1.6).
Remark 4.4.15. The reader may have noticed that we now have two
different types of normalizations (see Definition 4.1.6). Depending on
the application, it is convenient to normalize modular forms in one
way or another. For instance, in Example 4.1.13 we have normal-
ized a0 = 1 for convenience, while Theorem 4.4.11 shows that when
working with eigenforms it is very useful to normalize them so that
a1 = 1.
Example 4.4.16. It follows from Hecke's theorem that, if $k \geq 2$,
there is a unique normalized eigenform of $M_{2k}(\Gamma_0(N))$ that is not a
cusp form, and it is precisely the Eisenstein series $\widetilde{E}_{2k}$, by Example
4.4.12 (and Exercise 4.5.15).

It is a crucial fact in the theory that one can find a basis for
$S_k^{\mathrm{new}}(\Gamma_1(N))$ such that every element of the basis is an eigenform.
Recall that we defined spaces of new forms and old forms (Definitions
4.2.6 and 4.3.2) such that
\[ S_k(\Gamma(N)) = S_k^{\mathrm{new}}(\Gamma(N)) \oplus S_k^{\mathrm{old}}(\Gamma(N)) = S_k^{\mathrm{old}}(\Gamma(N))^{\perp} \oplus S_k^{\mathrm{old}}(\Gamma(N)) \]
where new and old forms are orthogonal with respect to the Petersson
inner product. We define $S_k^{\mathrm{new}}(\Gamma_1(N)) = S_k(\Gamma_1(N)) \cap S_k^{\mathrm{new}}(\Gamma(N))$,
and define similarly the old space for $\Gamma_1(N)$.

Theorem 4.4.17 (Atkin, Lehner [AL70], Li [Li75]; see also [DS05],
§5.8). The spaces of modular forms $S_k^{\mathrm{new}}(\Gamma_1(N))$ and $S_k^{\mathrm{old}}(\Gamma_1(N))$
are stable under $w_N$, the diamond operators and $T_n$ for all $n \geq 1$.
Furthermore, the space $S_k^{\mathrm{new}}(\Gamma_1(N))$ has an orthonormal basis that
consists of new normalized eigenforms for the Hecke operators $w_N$
and $T_n$ for all $n \geq 1$. In other words, the new forms of weight $k$ for
$\Gamma_1(N)$ have a basis $\{f_1, \ldots, f_d\}$ such that
\[ \langle f_i, f_j \rangle = \begin{cases} 1 & \text{if } i = j; \\ 0 & \text{if } i \neq j, \end{cases} \qquad w_N(f_i) = \pm f_i \quad\text{and}\quad T_n f_i = \lambda_{n,i} f_i, \]
where $\langle \cdot, \cdot \rangle$ is the Petersson inner product, for all $n \geq 1$ and all
$1 \leq i \leq d$, for some eigenvalues $\lambda_{n,i} \in \mathbb{C}$.

Deﬁnition 4.4.18. A normalized eigenform f ∈ Sknew (Γ1 (N )), which

is an eigenvector for all Hecke, wN and diamond operators simulta-
neously, is called a newform (not to be confused with simply a new
form).

We remind the reader that in the previous theorem and deﬁnition

the word “normalized” means that a1 = 1, as in Deﬁnition 4.4.14.

4.5. Exercises
Exercise 4.5.1. The goal of this problem is to show that the modularity
condition (2) in Definition 4.1.3 works "as one would hope" under
matrix multiplication. Let $M = (a, b; c, d)$ and $M' = (a', b'; c', d')$
be matrices in a congruence subgroup $\Gamma \leq \mathrm{SL}(2, \mathbb{Z})$, and let
$M'M = (a'', b''; c'', d'')$ be their product. Let $k \geq 1$ and let $f(z)$ be a function.
Show that:
(1) $(M'M)z = M'(Mz)$, i.e.,
\[ \frac{a''z + b''}{c''z + d''} = \frac{a'(Mz) + b'}{c'(Mz) + d'}, \quad\text{where } Mz = \frac{az+b}{cz+d}; \]
(2)
\[ (cz + d)^{-k} (c'(Mz) + d')^{-k} f\left( \frac{a'(Mz) + b'}{c'(Mz) + d'} \right) = (c''z + d'')^{-k} f\left( \frac{a''z + b''}{c''z + d''} \right). \]
Exercise 4.5.2. Let f (z) be a weakly modular function of weight k
for SL(2, Z) with k odd. Show that f (z) = 0 for all z ∈ H. (Hint:
show that f (z) = −f (z) for all z ∈ H.)

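Exercise 4.5.3. Let $G_{2k}(z)$ be the Eisenstein series
\[ G_{2k}(z) := \sum_{(0,0) \neq (m,n) \in \mathbb{Z} \times \mathbb{Z}} \frac{1}{(mz + n)^{2k}}. \]
(1) Show that $G_{2k}\left(\frac{az+b}{cz+d}\right) = (cz+d)^{2k} G_{2k}(z)$ for all $(a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$.
(2) Show that
\[ \lim_{y \to \infty} \sum_{(0,0) \neq (m,n) \in \mathbb{Z} \times \mathbb{Z}} \frac{1}{(myi + n)^{2k}} = 2\zeta(2k) \]
where $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$ is the Riemann zeta function. (Hint:
you may assume that the convergence is uniform [can you
prove this?], and so the limit can be brought inside the summation.)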

Exercise 4.5.4. Prove Propositions 4.1.8 and 4.2.3.


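Exercise 4.5.5. Let $f$ be a modular form of weight $2k$ for a congruence
subgroup $\Gamma \subseteq \mathrm{SL}(2, \mathbb{Z})$. Show that:
\[ f(z) = f\left( \frac{az+b}{cz+d} \right) \left( \frac{\partial}{\partial z} \frac{az+b}{cz+d} \right)^{k} \quad\text{for any } M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma. \]
(Note: this exercise shows that condition (2) in the definition of modular
form, Definition 4.2.1, is equivalent to saying that $f(z)(dz)^k$ is a
differential k-form, invariant under the action of $\Gamma$. Since
$\frac{\partial}{\partial z} \frac{az+b}{cz+d} = (cz+d)^{-2}$, the weight that matches the exponent $k$ here is $2k$.)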

Exercise 4.5.6. Show that the modular discriminant Δ(z), as de-

ﬁned in Example 4.1.10, is a cusp form of weight 12 for SL(2, Z).

Exercise 4.5.7. Let $\Delta(z)$ be the modular discriminant form, and let
$E_{2k}(z)$ be the normalized Eisenstein series of weight $2k$ for SL(2, Z).
(1) Show that $M = M_{12}(\mathrm{SL}(2, \mathbb{Z}))$ is a 2-dimensional vector
space, and $\{\Delta(z)\}$ is a basis of the cusp forms $S_{12}(\mathrm{SL}(2, \mathbb{Z}))$.
(2) Show that $E_{12}$ and $E_6^2$ belong to $M$, and
\[ E_{12} - E_6^2 = \lambda \Delta, \quad\text{where } \lambda = \frac{(2\pi)^{-12} \cdot 2^6 \cdot 3^5 \cdot 7^2}{691}. \]
(3) Use the q-expansions of $E_6$, $E_{12}$ and $\Delta$ to write an expression
for $\tau(n)$ in terms of $\sigma_5(n)$ and $\sigma_{11}(n)$.
(4) Show that $\tau(n) \equiv \sigma_{11}(n) = \sum_{0 < d \mid n} d^{11} \bmod 691$ for all $n \geq 1$.
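Before attempting part (4), it can be reassuring to test the congruence numerically. The sketch below is an illustration and not a solution: it assumes the standard normalization in which $\tau(n)$ is the coefficient of $q^n$ in $q \prod_{n \geq 1} (1 - q^n)^{24}$ (that is, $\Delta$ with any factor of $(2\pi)^{12}$ removed), and the function names are hypothetical.

```python
def sigma(n, power):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d**power for d in range(1, n + 1) if n % d == 0)


def tau_coefficients(num_terms):
    """Return [tau(1), ..., tau(num_terms)] from q * prod_{n>=1} (1 - q^n)^24.

    poly[i] is the coefficient of q^i in the truncated infinite product,
    which equals tau(i + 1) because of the extra factor q.
    """
    poly = [0] * num_terms
    poly[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            # Multiply in place by (1 - q^n); going downward keeps old values intact.
            for i in range(num_terms - 1, n - 1, -1):
                poly[i] -= poly[i - n]
    return poly


if __name__ == "__main__":
    taus = tau_coefficients(30)
    for n, t in enumerate(taus, start=1):
        print(n, t, (t - sigma(n, 11)) % 691 == 0)
```

A finite check like this only confirms the pattern for the first few $n$; the exercise asks for an argument valid for all $n$.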

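Exercise 4.5.8. Let $k \geq 3$, $N \geq 1$ and let $\mathbf{a} = (a_1, a_2) \in \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}$
be non-zero. Let $G_k^{\mathbf{a}}(z)$ be the Eisenstein series of Definition
4.2.7, and let $M = (a, b; c, d) \in \mathrm{SL}(2, \mathbb{Z})$.
(1) Show that
\[ (cz + d)^{-k} G_k^{\mathbf{a}}\left( \frac{az+b}{cz+d} \right) = G_k^{\mathbf{a}M}(z), \]
where $\mathbf{a}M = (a_1 a + a_2 c, a_1 b + a_2 d) \in \mathbb{Z}/N\mathbb{Z} \times \mathbb{Z}/N\mathbb{Z}$.
(2) Show that if $M \in \Gamma(N)$, then $\mathbf{a} \equiv \mathbf{a}M \bmod N$, and if $M \in \Gamma_1(N)$,
then $(0, a_2) \equiv (0, a_2)M \bmod N$.
(3) Conclude that $G_k^{\mathbf{a}}(z)$ is modular for $\Gamma(N)$ and the function
$G_k^{(0, a_2)}(z)$ is modular for $\Gamma_1(N)$.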

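Exercise 4.5.9. Let $N \geq 1$ and suppose that $N = M \cdot M'$, where
$1 < M, M' < N$ so that $M$ and $M'$ are proper divisors of $N$. Suppose
that $g(z) \in M_k(\Gamma(M))$. Show that $f(z) := g(M'z) \in M_k(\Gamma(N))$.
Also, show that the same conclusion holds if we replace $\Gamma(N)$ by
$\Gamma_1(N)$. Hint:
\[ \begin{pmatrix} aM' & bM' \\ cN & d \end{pmatrix} = \begin{pmatrix} a & bM' \\ cM & d \end{pmatrix} \cdot \begin{pmatrix} M' & 0 \\ 0 & 1 \end{pmatrix}. \]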
Exercise 4.5.10. Show that the Petersson inner product is a Hermit-
ian inner product (see Remark 4.3.1 for the properties of a Hermitian
product).
Exercise 4.5.11. This exercise proves the properties of $w_N$ that are
claimed in Proposition 4.4.2. Let $N, k \geq 1$.
(1) Verify the following identity of matrices:
\[ \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix} \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -c/N \\ -Nb & a \end{pmatrix} \cdot \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}. \]
Conclude that $\Gamma_0(N)$ is preserved under conjugation by the
matrix $(0, -1; N, 0)$.
(2) Use the matrix identity in (1) to show that, if the matrix
$M = (a, b; c, d)$ is in $\Gamma_0(N)$ and $z' = Mz = \frac{az+b}{cz+d}$,
then $(w_N(f))(z') = (cz + d)^k \cdot (w_N(f))(z)$. Thus, $w_N(f) \in S_k(\Gamma_0(N))$.
(3) Show that $w_N(w_N(f)) = f$ for all $f \in S_k(\Gamma_0(N))$. Hence
$w_N \circ w_N = \mathrm{Id}$, the eigenvalues of $w_N$ are all $\pm 1$, and $S_k(\Gamma_0(N))$
factors into the direct sum of eigenspaces
\[ S_k(\Gamma_0(N)) = S_k^{+}(\Gamma_0(N)) \oplus S_k^{-}(\Gamma_0(N)). \]
Exercise 4.5.12. Let $N \geq 1$, $\delta \in \mathbb{Z}$ and let $M = (a, b; c, d) \in \Gamma_0(N)$
with $\delta \equiv d \bmod N$.
(1) Show that, for any $f \in M_k(\Gamma_1(N))$, the modular form $\langle\delta\rangle f$
is also modular of weight $k$ under the action of $\Gamma_1(N)$ (where
$\langle\delta\rangle$ is as in Definition 4.4.3).
(2) Let $M' = (a', b'; c', d') \in \Gamma_0(N)$ be another matrix with
$\delta \equiv d \equiv d' \bmod N$, and let $f \in M_k(\Gamma_1(N))$. Show that
\[ (cz + d)^{-k} f(Mz) = (c'z + d')^{-k} f(M'z) \]
for any $z \in \mathbb{H}$. (Hint: show that $M'M^{-1} \in \Gamma_1(N)$.)

Exercise 4.5.13. Prove Proposition 4.4.4. (Hint: use Exercise 4.5.1.)
Exercise 4.5.14. Let Mk (N, χ) be as in Proposition 4.4.5. Show that
if χ0 is the trivial character (i.e., χ0 (δ) = 1 for all gcd(δ, N ) = 1),
then Mk (N, χ0 ) = Mk (Γ0 (N )).
Exercise 4.5.15. Let $k \geq 2$ and let
\[ \widetilde{E}_{2k}(z) = -\frac{B_{2k}}{4k} + \sum_{n \geq 1} \sigma_{2k-1}(n) q^n \]
be the (renormalized) Eisenstein series of weight $2k$ for $\mathrm{SL}(2, \mathbb{Z}) = \Gamma_0(1)$,
as in Example 4.4.12. Show that:
(1) $T_p(\widetilde{E}_{2k}) = \sigma_{2k-1}(p) \widetilde{E}_{2k}$ for all primes $p \geq 2$. (Hint: here
$N = 1$ and $\chi = \chi_0$ is trivial.)
(2) $T_{p^r}(\widetilde{E}_{2k}) = \sigma_{2k-1}(p^r) \widetilde{E}_{2k}$ for any prime $p \geq 2$ and $r \geq 1$.
(Hint: use induction and the recursive definition of $T_{p^r}$.)
(3) $T_n(\widetilde{E}_{2k}) = \sigma_{2k-1}(n) \widetilde{E}_{2k}$ for all $n \geq 1$. (Hint: $\sigma_{2k-1}$ is
multiplicative for relatively prime integers, i.e., $\sigma_{2k-1}(mn) = \sigma_{2k-1}(m) \sigma_{2k-1}(n)$
for $(m, n) = 1$.)
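Exercise 4.5.16. Let $N > 1$ be a natural number, and let
$\chi : (\mathbb{Z}/N\mathbb{Z})^\times \to \mathbb{C}^\times$ be a character. For each $k \geq 2$, we define
\[ \sigma_n = \sum_{d \mid n} \chi(d) d^{k-1}. \]
Show that
(1) If $(n, m) = 1$, then $\sigma_n \sigma_m = \sigma_{nm}$.
(2) If $p$ is a prime with $\gcd(N, p) = 1$, then
\[ \sigma_p \sigma_{p^r} = \sigma_{p^{r+1}} + \chi(p) p^{k-1} \sigma_{p^{r-1}}. \]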
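The two identities in this exercise can be spot-checked numerically before proving them. The sketch below is only an illustration under stated assumptions: it fixes $N = 4$ and $k = 3$, and uses the non-trivial character modulo 4 (extended by zero to even integers); the names `chi4` and `sigma_chi` are hypothetical.

```python
from math import gcd


def chi4(d):
    """Non-trivial character mod 4, extended by 0 to integers not coprime to 4."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[d % 4]


def sigma_chi(n, k, chi=chi4):
    """sigma_n = sum over divisors d of n of chi(d) * d^(k-1)."""
    return sum(chi(d) * d ** (k - 1) for d in range(1, n + 1) if n % d == 0)


def recursion_holds(p, r, k, chi=chi4):
    """Check sigma_p * sigma_{p^r} == sigma_{p^(r+1)} + chi(p) p^(k-1) sigma_{p^(r-1)}."""
    lhs = sigma_chi(p, k, chi) * sigma_chi(p**r, k, chi)
    rhs = sigma_chi(p ** (r + 1), k, chi) + chi(p) * p ** (k - 1) * sigma_chi(p ** (r - 1), k, chi)
    return lhs == rhs


def multiplicative_holds(n, m, k, chi=chi4):
    """Check sigma_n * sigma_m == sigma_{nm} for coprime n and m."""
    assert gcd(n, m) == 1
    return sigma_chi(n, k, chi) * sigma_chi(m, k, chi) == sigma_chi(n * m, k, chi)


if __name__ == "__main__":
    k = 3
    print(all(recursion_holds(p, r, k) for p in (3, 5, 7) for r in range(1, 5)))
    print(all(multiplicative_holds(n, m, k) for n in (3, 5, 9) for m in (7, 11, 13)))
```

Both checks rely only on the complete multiplicativity of $\chi$, which is also the key ingredient of a proof.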
Chapter 5

L-functions

In this chapter we deﬁne the L-functions attached to elliptic curves

and modular forms, and we investigate when an elliptic curve and a
modular form could have the same L-function.

5.1. The L-function of an elliptic curve

Let $E$ be an elliptic curve over $\mathbb{Q}$ given by a minimal model (as in
Definition 2.6.3):
\[ y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \]
with coefficients $a_i \in \mathbb{Z}$. For $p$ a prime in $\mathbb{Z}$ of good reduction for
$E/\mathbb{Q}$, we define $N_p$ as the number of points in the reduction of the
curve modulo $p$, i.e., the number of points in $E(\mathbb{F}_p)$. In other words,
$N_p$ is the number of points in
\[ \{O\} \cup \{ (x, y) \in \mathbb{F}_p^2 : y^2 + a_1 xy + a_3 y - x^3 - a_2 x^2 - a_4 x - a_6 \equiv 0 \bmod p \} \]
where $O$ is the point at infinity (see Section 2.6 and, in particular,
Hasse's theorem 2.6.11). Also, let $a_p = p + 1 - N_p$. We define the
local factor at $p$ of the L-series to be
\[ L_p(T) = \begin{cases}
1 - a_p T + p T^2, & \text{if $E$ has good reduction at $p$,} \\
1 - T, & \text{if $E$ has split multiplicative reduction at $p$,} \\
1 + T, & \text{if $E$ has non-split multiplicative reduction at $p$,} \\
1, & \text{if $E$ has additive reduction at $p$.}
\end{cases} \]
Definition 5.1.1. The L-function of the elliptic curve $E$ is defined
to be
\[ L(E, s) = \prod_{p \geq 2} \frac{1}{L_p(p^{-s})}, \]
where the product is over all primes $p \geq 2$ and $L_p(T)$ is the local factor
defined above. $L(E, s)$ is sometimes called the Hasse-Weil L-function
of $E/\mathbb{Q}$.
Remark 5.1.2. The product that defines $L(E, s)$ converges and gives
an analytic function for all $\mathrm{Re}(s) > 3/2$. This follows from Hasse's
bound (Theorem 2.6.11), which implies that $|a_p| \leq 2\sqrt{p}$. However,
far more is true. Indeed, mathematicians conjectured that $L(E, s)$
should have an analytic continuation to the whole complex plane and
that it must satisfy a functional equation relating the values of $L(E, s)$
and $L(E, 2 - s)$. For the precise functional equation see Theorem 5.1.9
below.
Example 5.1.3. Let $E/\mathbb{Q}$ be the elliptic curve with equation
\[ y^2 + y = x^3 - x^2 - 10x - 20. \]
This is a minimal model for $E/\mathbb{Q}$, and its discriminant is $\Delta_E = -11^5$.
Therefore, $p = 11$ is the only prime of bad reduction for $E/\mathbb{Q}$, and
the reduction is split multiplicative (see the discussion about $E_3$ in
Example 2.6.7). Therefore,
\[ L(E, s) = \frac{1}{1 - 11^{-s}} \cdot \prod_{p \geq 2,\ p \neq 11} \frac{1}{1 - a_p p^{-s} + p^{1-2s}}. \]
When expanded, the L-series attached to $E$ has the form
\[ L(E, s) = 1 - \frac{2}{2^s} - \frac{1}{3^s} + \frac{2}{4^s} + \frac{1}{5^s} + \frac{2}{6^s} - \frac{2}{7^s} - \frac{2}{9^s} - \frac{2}{10^s} + \frac{1}{11^s} + \cdots. \]
In general, one can always write $L(E, s) = \sum_{n \geq 1} a_n n^{-s}$, where the
$a_n$ are characterized in Proposition 5.1.5 below.
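The prime-indexed coefficients in this expansion come directly from point counts, $a_p = p + 1 - N_p$. As a small sketch (a brute-force count that is only practical for small primes, with illustrative function names that are not from the text), one can recover $a_2$, $a_3$, $a_5$ and $a_7$ for the curve $y^2 + y = x^3 - x^2 - 10x - 20$:

```python
def count_points(a1, a2, a3, a4, a6, p):
    """Number of points on y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6 over F_p.

    Brute force over all (x, y) in F_p x F_p, plus one for the point at infinity.
    """
    count = 1  # the point at infinity O
    for x in range(p):
        rhs = x**3 + a2 * x * x + a4 * x + a6
        for y in range(p):
            lhs = y * y + a1 * x * y + a3 * y
            if (lhs - rhs) % p == 0:
                count += 1
    return count


def a_p(curve, p):
    """Trace of Frobenius a_p = p + 1 - N_p at a prime p of good reduction."""
    return p + 1 - count_points(*curve, p)


# Coefficients (a1, a2, a3, a4, a6) of y^2 + y = x^3 - x^2 - 10x - 20.
CURVE_11 = (0, -1, 1, -10, -20)

if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, a_p(CURVE_11, p))
```

The values agree with the numerators of $2^{-s}$, $3^{-s}$, $5^{-s}$ and $7^{-s}$ in the expansion above. At $p = 11$ the reduction is bad, so the formula $p + 1 - N_p$ is not the definition of the local factor there.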
Example 5.1.4. Let $E/\mathbb{Q} : y^2 = x^3 - 11x^2 + 385$. The curve $E$
has additive (bad) reduction at 2 and 11, split multiplicative reduction at 5 and
non-split multiplicative reduction at 7 and 461. Thus, by definition
\[ L(E, s) = \left( (1 - 5^{-s})(1 + 7^{-s})(1 + 461^{-s}) \right)^{-1} \cdot \prod_{\text{primes } p \neq 2, 5, 7, 11, 461} \frac{1}{1 - a_p p^{-s} + p^{1-2s}} \]
\[ = 1 - \frac{2}{3^s} + \frac{1}{5^s} - \frac{1}{7^s} + \frac{1}{9^s} + \frac{2}{13^s} - \frac{2}{15^s} - \frac{5}{17^s} + \frac{2}{21^s} \cdots. \]
Proposition 5.1.5. Let $E/\mathbb{Q}$ be an elliptic curve, and let $L(E, s)$ be
its L-function. Define Fourier coefficients $a_n$ for all $n \geq 1$ as follows.
Let $a_1 = 1$. If $p \geq 2$ is prime, we define
\[ a_p = \begin{cases}
p + 1 - N_p & \text{if $E$ has good reduction at $p$;} \\
1 & \text{if $E$ has split multiplicative reduction at $p$;} \\
-1 & \text{if $E$ has non-split multiplicative reduction at $p$;} \\
0 & \text{if $E$ has additive reduction at $p$.}
\end{cases} \]
If $n = p^r$ for some $r \geq 1$, we define $a_{p^r}$ recursively using the relation
\[ a_p \cdot a_{p^r} = a_{p^{r+1}} + p \cdot a_{p^{r-1}} \quad\text{if $E/\mathbb{Q}$ has good reduction at $p$} \]
and $a_{p^r} = (a_p)^r$ if $E/\mathbb{Q}$ has bad reduction at $p$. Finally, if $(m, n) = 1$,
then we define $a_{mn} = a_m \cdot a_n$. Then the L-function of $E$ can be written
as the series
\[ L(E, s) = \sum_{n \geq 1} \frac{a_n}{n^s}. \]

The proof is left as an exercise (Exercise 5.7.2).
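The rules of Proposition 5.1.5 translate directly into a procedure that builds every $a_n$ from the prime values $a_p$. The following sketch uses hypothetical function names and takes as input the values $a_2 = -2$, $a_3 = -1$, $a_5 = 1$, $a_7 = -2$ (good reduction) and $a_{11} = 1$ (split multiplicative reduction) read off from Example 5.1.3.

```python
def prime_power_coefficient(ap, p, r, good):
    """a_{p^r} for r >= 1: the recursion a_p a_{p^r} = a_{p^(r+1)} + p a_{p^(r-1)}
    at a prime of good reduction, and a_{p^r} = a_p^r at a prime of bad reduction."""
    if not good:
        return ap**r
    previous, current = 1, ap  # a_{p^0} = a_1 = 1 and a_{p^1} = a_p
    for _ in range(r - 1):
        previous, current = current, ap * current - p * previous
    return current


def coefficient(n, prime_data):
    """a_n from multiplicativity; prime_data maps p -> (a_p, has_good_reduction)."""
    result = 1
    p = 2
    while n > 1:
        if n % p == 0:  # only primes can divide n here, smaller factors are gone
            r = 0
            while n % p == 0:
                n //= p
                r += 1
            ap, good = prime_data[p]
            result *= prime_power_coefficient(ap, p, r, good)
        p += 1
    return result


# Data for y^2 + y = x^3 - x^2 - 10x - 20, taken from Example 5.1.3.
PRIME_DATA_11 = {2: (-2, True), 3: (-1, True), 5: (1, True), 7: (-2, True), 11: (1, False)}

if __name__ == "__main__":
    print([coefficient(n, PRIME_DATA_11) for n in range(1, 12)])
```

The list reproduces the numerators in the expansion of $L(E, s)$ in Example 5.1.3, including the vanishing coefficient $a_8 = 0$, which explains why no $8^{-s}$ term appears there.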

Remark 5.1.6. Notice that the recurrence formula $a_p \cdot a_{p^r} = a_{p^{r+1}} + p \cdot a_{p^{r-1}}$
(and $a_{p^r} = (a_p)^r$ in the bad reduction case) is strikingly
similar to the recurrence relation defining the Hecke operators $T_{p^r}$ for
$k = 2$, and also the recurrence relation satisfied by the eigenvalues
of an eigenform (see Definition 4.4.8, Remark 4.4.13 and Exercise
4.5.16). This is one of the first pieces of evidence that the L-function
of an elliptic curve may be connected to a modular form.

Before we write down the functional equation for $E/\mathbb{Q}$, we need
one more ingredient: the conductor of $E/\mathbb{Q}$. For each prime $p \in \mathbb{Z}$,
we define the quantity $f_p$ as follows:
\[ f_p = \begin{cases}
0, & \text{if $E$ has good reduction at $p$,} \\
1, & \text{if $E$ has multiplicative reduction at $p$,} \\
2, & \text{if $E$ has additive reduction at $p$, and $p \neq 2, 3$,} \\
2 + \delta_p, & \text{if $E$ has additive reduction at $p = 2$ or $3$,}
\end{cases} \]
where $\delta_p$ is a technical invariant (see [Sil94], Ch. IV, §10; the invariant
$\delta_p$ describes whether there is wild ramification in the action of
the inertia group at $p$ of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the Tate module $T_p(E)$).

Definition 5.1.7. The conductor $N_{E/\mathbb{Q}}$ of $E/\mathbb{Q}$ is defined to be
\[ N_{E/\mathbb{Q}} = \prod_{p} p^{f_p}, \]
where the product is over all primes and the exponents $f_p$ are defined
as above.

Example 5.1.8. Let us see some examples of conductors.

(1) Let $E/\mathbb{Q} : y^2 + y = x^3 - x^2 + 2x - 2$. The primes of bad
reduction for $E$ are $p = 5$ and 7. The reduction at $p = 5$
is additive, while the reduction at $p = 7$ is multiplicative.
Hence $N_{E/\mathbb{Q}} = 5^2 \cdot 7 = 175$.
(2) As we saw above, the curve $y^2 + y = x^3 - x^2 - 10x - 20$ has
split multiplicative reduction at $p = 11$ and the reduction is
good elsewhere. Thus, the conductor is 11.
(3) The curves $E_A : y^2 + y = x^3 - x$ and $E_B : y^2 + y = x^3 + x^2 - 23x - 50$
are two non-isomorphic curves with conductor equal to 37.

Theorem 5.1.9 (Functional equation). The L-series $L(E, s)$ has an
analytic continuation to the entire complex plane, and it satisfies the
following functional equation. Define
\[ \Lambda(E, s) = (N_{E/\mathbb{Q}})^{s/2} (2\pi)^{-s} \Gamma(s) L(E, s), \]
where $N_{E/\mathbb{Q}}$ is the conductor of $E$ and $\Gamma(s) = \int_0^{\infty} t^{s-1} e^{-t} \, dt$ is the
Gamma function. Then
\[ \Lambda(E, s) = w \cdot \Lambda(E, 2 - s) \quad\text{with } w = \pm 1. \]

The number w = w(E/Q) in the functional equation is usually

called the root number of E, and it has an important conjectural
meaning (see the next section on the Birch and Swinnerton-Dyer con-
jecture). Theorem 5.1.9 was proved in 1999, since it follows from the
Taniyama-Shimura-Weil conjecture 5.4.5, which was proved by work
of Wiles, Taylor-Wiles, and Breuil, Conrad, Diamond and Taylor.

5.2. The Birch and Swinnerton-Dyer conjecture

Figure 1. Bryan Birch (left) and Sir Peter Swinnerton-Dyer

(right). Photograph courtesy of William Stein.

Conjecture 5.2.1 (Birch and Swinnerton-Dyer). Let $E$ be an elliptic
curve over $\mathbb{Q}$, and let $L(E, s)$ be the L-function attached to $E$. Then:
(1) $L(E, s)$ has a zero at $s = 1$ of order equal to the rank $R_E$ of
$E(\mathbb{Q})$. In other words, the Taylor expansion of $L(E, s)$ at
$s = 1$ is of the form
\[ L(E, s) = C_0 \cdot (s - 1)^{R_E} + C_1 \cdot (s - 1)^{R_E + 1} + C_2 \cdot (s - 1)^{R_E + 2} + \cdots \]
where $C_0$ is a non-zero constant.
(2) The residue of $L(E, s)$ at $s = 1$, i.e., the coefficient $C_0$, has
a concrete expression in terms of invariants of $E/\mathbb{Q}$. More
concretely,
\[ C_0 = \lim_{s \to 1} \frac{L(E, s)}{(s - 1)^{R_E}} = \frac{|Ш| \cdot \Omega_E \cdot \mathrm{Reg}(E/\mathbb{Q}) \cdot \prod_p c_p}{|E_{\mathrm{torsion}}(\mathbb{Q})|^2}. \]
The invariants that appear in the conjectural formula for the
residue are listed below:
• $R_E$ is the (free) rank of $E(\mathbb{Q})$ (see Section 2.7).
• $\Omega_E = \int_{E(\mathbb{R})} \left| \frac{dx}{y} \right|$ is either the real period or twice the real
period of a minimal model for $E$, depending on whether
$E(\mathbb{R})$ is connected.
• $|Ш|$ is the order of the Shafarevich-Tate group of $E/\mathbb{Q}$ (we
defined the 2-torsion of Sha, $Ш_2$, in Section 2.11).
• $\mathrm{Reg}(E/\mathbb{Q})$ is the elliptic regulator of $E(\mathbb{Q})$, as in Definition
2.8.4.
• $|E(\mathbb{Q})_{\mathrm{torsion}}|$ is the number of torsion points on $E/\mathbb{Q}$, including
the point at infinity $O$ (see Section 2.5).
• $c_p$ is an elementary local factor, equal to the cardinality of
$E(\mathbb{Q}_p)/E_0(\mathbb{Q}_p)$, where $E_0(\mathbb{Q}_p)$ is the set of points in $E(\mathbb{Q}_p)$
whose reduction modulo $p$ is non-singular in $E(\mathbb{F}_p)$. Notice
that if $p$ is a prime of good reduction for $E/\mathbb{Q}$, then $c_p = 1$,
so $c_p \neq 1$ only for finitely many primes $p$. The number $c_p$ is
called the Tamagawa number of $E$ at $p$.
In 1974 ([Tat74], p. 198), John Tate wrote about the BSD conjecture:
"This remarkable conjecture relates the behavior of a function
L at a point where it is not at present known to be defined (s = 1)
to the order of a group (Ш) which is not known to be finite!" Tate
is referring to the fact that, when the conjecture was first proposed,
the analytic continuation of $L(E, s)$ was not known, and we did not
know whether Ш was ever finite (nowadays we know many examples
where Ш is finite, but it is still not known for all elliptic curves).
Example 5.2.2. Let $E/\mathbb{Q}$ be an elliptic curve. By Theorem 5.1.9, the
function $L(E, s) = \sum_{n \geq 1} a_n n^{-s}$ has an analytic continuation to $\mathbb{C}$.
In particular, if we restrict our attention to real values $t$, then $L(E, t)$
is a real-valued function. Since $L(E, s)$ is analytic, $L(E, t)$ should be
continuous and (infinitely) differentiable. Let $E_r$, for $r = 0, 1, 2$ and
3, be elliptic curves defined by
\[ E_0 : y^2 + y = x^3 - x^2 - 10x - 20, \qquad E_1 : y^2 + y = x^3 - x, \]
\[ E_2 : y^2 + y = x^3 + x^2 - 2x, \qquad E_3 : y^2 + y = x^3 - 7x + 6. \]
The reader can check that the rank of $E_r$ is precisely $r$. In Figures
2 through 5 we show the graphs of $L(E_r, t)$ for $-1 \leq t \leq 3$. Notice
that the function $L(E_r, t)$ seems to have a zero of order $r$ at $t = 1$, in
agreement with the BSD conjecture.

Figure 2. $L(E_0, t)$ for $E_0 : y^2 + y = x^3 - x^2 - 10x - 20$ and $-1 \leq t \leq 3$.

Example 5.2.3. Let $E/\mathbb{Q} : y^2 = x^3 - 1156x$. Recall that in Examples
2.10.4 and 2.11.2 we calculated $R_E = 2$, $E(\mathbb{Q})_{\mathrm{torsion}} \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$
and $Ш_2 = \{(1, 1)\}$ (here $Ш_2$ is just the 2-torsion of Ш). A non-trivial
calculation yields $Ш = Ш_2 = \{(1, 1)\}$. Figure 6 provides the values
of all the invariants that appear in the BSD conjecture. Thus,
\[ \frac{|Ш| \cdot \Omega_E \cdot \mathrm{Reg}(E/\mathbb{Q}) \cdot \prod_p c_p}{|E(\mathbb{Q})_{\mathrm{torsion}}|^2} = 6.3851519548\ldots. \]
Figure 3. $L(E_1, t)$ for $E_1 : y^2 + y = x^3 - x$ and $-1 \leq t \leq 3$.

Figure 4. $L(E_2, t)$ for $E_2 : y^2 + y = x^3 + x^2 - 2x$ and $-1 \leq t \leq 3$.

Figure 5. $L(E_3, t)$ for $E_3 : y^2 + y = x^3 - 7x + 6$ and $-1 \leq t \leq 3$.

    E/Q                   y^2 = x^3 − 1156x
    R_E                   2, with P = (−16, 120), Q = (−2, 48)
    |Ш|                   1
    Ω_E                   0.8993583214...
    Reg(E/Q)              det H({P, Q}) = 7.0996751824...
    E(Q)_torsion          Z/2Z × Z/2Z ≅ ⟨(0, 0), (34, 0)⟩
    ∏_{p≥2} c_p           c_2 · c_17 = 4 · 4

Figure 6. BSD data for the curve $E/\mathbb{Q} : y^2 = x^3 - 1156x$.

We can also calculate the value $L(E, 1)$ and the values of the derivatives
$L'(E, 1)$ and $L''(E, 1)$; i.e., we can approximate numerically
these values. For instance, one can use Sage (see Appendix A.3).
For a technical description of the algorithms involved, see [Dok04].
Once we have calculated these values, we can write the first few terms
of the Taylor expansion of $L(E, s)$ around $s = 1$.
\[ L(E, s) \approx 9.508 \cdot 10^{-24} - (2.374 \cdot 10^{-23}) \cdot (s - 1) + (6.3851519548) \cdot (s - 1)^2 + \cdots. \]
Therefore, our approximate calculation suggests that $L(E, s)$ has a
zero of order 2 at $s = 1$ and the residue is $6.3851519548\ldots$, in perfect
agreement with the BSD conjecture (at least up to the given
precision).

    E/Q                   y^2 = x^3 − 6724x
    R_E                   0
    |Ш|                   4
    Ω_E                   0.5791156343...
    Reg(E/Q)              det H({}) = 1
    E(Q)_torsion          Z/2Z × Z/2Z ≅ ⟨(0, 0), (81, 0)⟩
    ∏_{p≥2} c_p           c_2 · c_41 = 4 · 4

Figure 7. BSD data for the curve $E/\mathbb{Q} : y^2 = x^3 - 6724x$.

Example 5.2.4. Let $E/\mathbb{Q} : y^2 = x^3 - 6724x$. Recall that Examples
2.10.5 and 2.11.3 suggest that $R_E = 0$, $E(\mathbb{Q})_{\mathrm{torsion}} \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$
and $|Ш_2| = 4$. A non-trivial calculation reveals that $R_E$ is indeed 0
and $|Ш| = |Ш_2| = 4$. Figure 7 provides the values of all the invariants
that appear in the BSD conjecture. Thus,
\[ \frac{|Ш| \cdot \Omega_E \cdot \mathrm{Reg}(E/\mathbb{Q}) \cdot \prod_p c_p}{|E(\mathbb{Q})_{\mathrm{torsion}}|^2} = 2.3164625374\ldots. \]
We can approximate the first few terms of the Taylor expansion of
$L(E, s)$ around $s = 1$.
\[ L(E, s) \approx 2.3164625374 - (7.8248271660) \cdot (s - 1) + (25.7352635691) \cdot (s - 1)^2 + \cdots. \]
Therefore, our approximate calculation suggests that $L(E, s)$ does
not vanish at $s = 1$, and $L(E, 1) = 2.3164625374\ldots$, again in perfect
agreement with the BSD conjecture.
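The right-hand side of the BSD formula in Examples 5.2.3 and 5.2.4 is plain arithmetic once the invariants are known. A minimal sketch follows, using the truncated decimal values from Figures 6 and 7 as inputs; the function name is illustrative, and the results are only as accurate as those ten-digit inputs.

```python
def bsd_leading_coefficient(sha_order, omega, regulator, tamagawa_product, torsion_order):
    """Conjectural value |Sha| * Omega_E * Reg(E/Q) * prod c_p / |E(Q)_torsion|^2."""
    return sha_order * omega * regulator * tamagawa_product / torsion_order**2


if __name__ == "__main__":
    # E : y^2 = x^3 - 1156x, data from Figure 6 (torsion Z/2Z x Z/2Z has order 4).
    value_1156 = bsd_leading_coefficient(
        sha_order=1, omega=0.8993583214, regulator=7.0996751824,
        tamagawa_product=4 * 4, torsion_order=4,
    )
    # E : y^2 = x^3 - 6724x, data from Figure 7 (rank 0, so the regulator is 1).
    value_6724 = bsd_leading_coefficient(
        sha_order=4, omega=0.5791156343, regulator=1.0,
        tamagawa_product=4 * 4, torsion_order=4,
    )
    print(value_1156)  # compare with the (s - 1)^2 coefficient in Example 5.2.3
    print(value_6724)  # compare with L(E, 1) in Example 5.2.4
```

In both cases the Tamagawa product 16 cancels against $|E(\mathbb{Q})_{\mathrm{torsion}}|^2 = 16$, so the value reduces to $|Ш| \cdot \Omega_E \cdot \mathrm{Reg}(E/\mathbb{Q})$.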

The following is an easy consequence of the BSD conjecture (Ex-

ercise 5.7.3). Recall that the root number of E is the sign in the
functional equation of L(E, s).
Conjecture 5.2.5 (Parity Conjecture). The root number of $E$, denoted
by $w = w(E/\mathbb{Q})$, indicates the parity of the rank of the elliptic
curve; i.e., $w = 1$ if and only if the rank $R_E$ is even, and $w = -1$ if and only if
the rank is odd. Equivalently,
\[ w = (-1)^{\mathrm{ord}_{s=1} L(E, s)} = (-1)^{\mathrm{rank}(E(\mathbb{Q}))} \]
or $\mathrm{ord}_{s=1} L(E, s) \equiv \mathrm{rank}(E(\mathbb{Q})) \bmod 2$.

See Exercise 5.7.3.

Deﬁnition 5.2.6. Let E/Q be an elliptic curve, and let L(E, s) be
the L-function attached to E. The analytic rank of E(Q) is deﬁned
to be the order of vanishing of L(E, s) at s = 1, i.e.,
\[ \mathrm{rank}_{\mathrm{an}}(E/\mathbb{Q}) := \mathrm{ord}_{s=1} L(E, s). \]
In other words, $\mathrm{rank}_{\mathrm{an}}(E/\mathbb{Q})$ is the order of the zero of $L(E, s)$ at
$s = 1$.

Thus, the first part of the BSD conjecture is the statement that
the analytic rank equals the (algebraic) free rank of the Mordell-Weil
group E(Q).
Example 5.2.7. Let $E/\mathbb{Q} : y^2 = x^3 - 157^2 x$. Recall that Proposition
1.1.3 says that the rational points on $E/\mathbb{Q}$ with $y \neq 0$ give right
triangles of area 157, so if we find a single non-trivial point on $E$ we
prove that $n = 157$ is a congruent number (as defined in Example
1.1.2).
Comparing values of $\Lambda(s)$ and $\Lambda(2 - s)$, we calculate the root
number $w = w(E/\mathbb{Q}) = -1$. Thus, the parity conjecture suggests
that $E(\mathbb{Q})$ has odd rank, therefore $\geq 1$, and so $E(\mathbb{Q})$ must be infinite.
However, a computer search only yields the trivial 2-torsion points
$(0, 0)$, $(157, 0)$ and $(-157, 0)$. We can calculate values of $L(E, s)$ and
its derivatives at $s = 1$ and write down an approximate Taylor expansion:
\[ L(E, s) \approx (11.4259445007) \cdot (s - 1) - (49.9773214816) \cdot (s - 1)^2 + \cdots. \]
Hence, the BSD conjecture suggests that $R_E = 1$ and
\[ \frac{|Ш| \cdot \Omega_E \cdot \mathrm{Reg}(E/\mathbb{Q}) \cdot \prod_p c_p}{|E(\mathbb{Q})_{\mathrm{torsion}}|^2} = 11.4259445007\ldots. \tag{5.1} \]
If we believe that $R_E = 1$ and we write $P$ for a generator of $E(\mathbb{Q})$
modulo torsion, then one can show that $Ш_2$ must be trivial (and, in
fact, Ш is trivial as well, but this is much tougher to prove). Some
other invariants are easy to calculate:
\[ \Omega_E = 0.4185259488\ldots, \qquad \prod_{p \geq 2} c_p = c_2 \cdot c_{157} = 2 \cdot 4, \qquad |E(\mathbb{Q})_{\mathrm{torsion}}| = 4. \]
However, $\mathrm{Reg}(E/\mathbb{Q}) = \langle P, P \rangle = 2 \cdot \hat{h}(P)$ is difficult to calculate because
we do not know $P$ (here $\hat{h}$ is the canonical height). But we can
solve for $\mathrm{Reg}(E/\mathbb{Q})$ in Eq. (5.1) and obtain
\[ \mathrm{Reg}(E/\mathbb{Q}) = 2 \cdot \hat{h}(P) = 54.6008892938\ldots \]
and $\hat{h}(P) = 27.3004446469\ldots$. That's a huge height! Recall that
\[ \hat{h}(P) \approx \frac{1}{2} \log \max\{ |\mathrm{num}(x(P))|, |\mathrm{den}(x(P))| \} \]
and so $\max\{ |\mathrm{num}(x(P))|, |\mathrm{den}(x(P))| \} \approx e^{54.6} \approx 5.157 \cdot 10^{23}$. This
calculation gives us a rough idea of the size of the numerator and
denominator of the x coordinate. With the help of homogeneous
spaces, and looking for points in the correct height range, we can
succeed at finding $P$. Its coordinates $P = (x(P), y(P))$ are:
\[ x(P) = -\frac{166136231668185267540804}{2825630694251145858025}, \]
\[ y(P) = \frac{167661624456834335404812111469782006}{150201095200135518108761470235125} \]
and the canonical height of $P$ is precisely $27.3004446469\ldots$, as predicted
by the Birch and Swinnerton-Dyer conjecture.
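The step of solving Eq. (5.1) for the regulator, and a check that the displayed point really lies on $y^2 = x^3 - 157^2 x$, can both be carried out mechanically. The following is a sketch under stated assumptions: it takes $|Ш| = 1$ as the example does, uses the truncated decimals quoted above, and works with exact rational arithmetic for the point; the function names are illustrative.

```python
import math
from fractions import Fraction


def regulator_from_bsd(leading_coefficient, omega, tamagawa_product, torsion_order, sha_order=1):
    """Solve Eq. (5.1) for Reg(E/Q), assuming the order of Sha is known."""
    return leading_coefficient * torsion_order**2 / (sha_order * omega * tamagawa_product)


def is_on_curve(x, y, n=157):
    """Exact test of y^2 = x^3 - n^2 x for rational x and y."""
    return y**2 == x**3 - n**2 * x


# Coordinates of P exactly as displayed in the example.
X_P = Fraction(-166136231668185267540804, 2825630694251145858025)
Y_P = Fraction(167661624456834335404812111469782006, 150201095200135518108761470235125)

if __name__ == "__main__":
    regulator = regulator_from_bsd(11.4259445007, 0.4185259488, 2 * 4, 4)
    print("Reg(E/Q) =", regulator)
    print("canonical height =", regulator / 2)
    print("P on curve:", is_on_curve(X_P, Y_P))
    # Naive height of x(P): half the log of the larger of |numerator| and denominator.
    naive = 0.5 * math.log(max(abs(X_P.numerator), X_P.denominator))
    print("naive height of x(P) =", naive)
```

The naive height is only an approximation to $\hat{h}(P)$; the two differ by a bounded amount, which is why the estimate $e^{54.6}$ gives the right order of magnitude rather than the exact size of the coordinates.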

There has been a great amount of research on the BSD conjecture,

but the progress in the general case over Q is minimal (a lot is known
about BSD for elliptic curves over function ﬁelds). The conjecture has
been veriﬁed for many elliptic curves (for instance, see [GJPST09],
[Mil10]), but there is little evidence in the form of proven theorems.
The following result is the strongest piece of evidence proved to date.
Theorem 5.2.8 (Gross-Zagier, Kolyvagin). Let $E/\mathbb{Q}$ be an elliptic
curve of algebraic rank $R_E$. Suppose that the analytic rank of $E/\mathbb{Q}$ is
$\leq 1$, i.e., $\mathrm{ord}_{s=1} L(E, s) \leq 1$. Then:
(1) The first part of BSD holds for $E/\mathbb{Q}$, i.e.,
\[ R_E = \mathrm{rank}(E(\mathbb{Q})) = \mathrm{rank}_{\mathrm{an}}(E/\mathbb{Q}) = \mathrm{ord}_{s=1} L(E, s). \]
(2) The Shafarevich-Tate group Ш associated to $E/\mathbb{Q}$ is finite.

5.3. The L-function of a modular (cusp) form

Let N, k ≥ 1 and let f (z) be a cusp form of weight 2k for the con-
gruence subgroup Γ0 (N ), i.e., f (z) ∈ S2k (Γ0 (N )) in the notation of
Section 4.2 (and, in particular, Prop. 4.2.3). For any N ≥ 1, the ma-
trix T = (1, 1; 0, 1) belongs to Γ0 (N ), and therefore f (z) = f (z + 1)
for all z ∈ H. Moreover, f (z) is a cusp form and so f vanishes at all
the cusps of H∗ /Γ0 (N ), and in particular it vanishes at ∞. Hence
$f(z)$ has a q-expansion of the form
\[ f(z) = \sum_{n \geq 1} a_n q^n, \]
where $q = e^{2\pi i z}$, for some coefficients $a_n \in \mathbb{C}$.
Definition 5.3.1. The L-function attached to a cusp form
$f(z) = \sum_{n \geq 1} a_n q^n \in S_k(\Gamma_0(N))$ is defined by
\[ L(f, s) = \sum_{n \geq 1} a_n n^{-s} = \sum_{n \geq 1} \frac{a_n}{n^s} = a_1 + \frac{a_2}{2^s} + \frac{a_3}{3^s} + \cdots. \]

Example 5.3.2. Let $N = 11$ and $k = 1$. The space $M_2(\Gamma_0(11))$ is
a 2-dimensional $\mathbb{C}$-vector space with basis elements $\{f, g\}$ given in
Example 4.2.11. In particular, $S_2(\Gamma_0(11))$ is generated by
\[ f(q) = q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - 2q^9 - 2q^{10} + O(q^{11}), \]
where $q = e^{2\pi i z}$. Hence the L-function associated to $f$ is
\[ L(f, s) = 1 - \frac{2}{2^s} - \frac{1}{3^s} + \frac{2}{4^s} + \frac{1}{5^s} + \frac{2}{6^s} - \frac{2}{7^s} - \frac{2}{9^s} - \frac{2}{10^s} + \cdots. \]
The very attentive reader might recognize these few terms as the first
few terms in the L-function $L(E, s)$ that appeared in Example 5.1.3,
where $E/\mathbb{Q}$ is the elliptic curve with equation $y^2 + y = x^3 - x^2 - 10x - 20$.
Are they truly the same L-series? Further calculations
show that all terms agree as we increase the precision. We will see
that the Taniyama-Shimura-Weil conjecture 5.4.5, i.e., the modularity
theorem, implies that $L(E, s) = L(f, s)$. Notice that the conductor
of $E/\mathbb{Q}$ is precisely $N = 11$, as we saw in Example 5.1.8.
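The coefficients of $f$ can be generated to any precision from a product formula. The sketch below relies on a fact that is not stated in this section and is used here as an assumption: the generator of $S_2(\Gamma_0(11))$ has the standard eta-product expansion $f(q) = q \prod_{n \geq 1} (1 - q^n)^2 (1 - q^{11n})^2$. The function name is illustrative.

```python
def eta_product_level11(num_terms):
    """Coefficients a_1, ..., a_num_terms of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2.

    poly[i] is the coefficient of q^i in the infinite product (truncated),
    which is the coefficient a_{i+1} of f because of the leading factor q.
    """
    poly = [0] * num_terms
    poly[0] = 1

    def multiply_by_one_minus_q_power(m):
        # In-place multiplication by (1 - q^m), highest degree first.
        for i in range(num_terms - 1, m - 1, -1):
            poly[i] -= poly[i - m]

    for n in range(1, num_terms):
        for m in (n, 11 * n):
            if m < num_terms:
                multiply_by_one_minus_q_power(m)
                multiply_by_one_minus_q_power(m)
    return poly


if __name__ == "__main__":
    print(eta_product_level11(11))
```

The first ten values match the coefficients of $q, \ldots, q^{10}$ displayed above, and the eleventh matches the coefficient of $11^{-s}$ in Example 5.1.3. Agreement of finitely many terms is consistent with, but does not prove, the equality $L(E, s) = L(f, s)$.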
Example 5.3.3. Let $N = 37$ and $k = 1$. In Example 4.2.12 we
described the space $S_2(\Gamma_0(37))$ with basis elements $\{f, g\}$ given by
the q-expansions
\[ f(q) = q + q^3 - 2q^4 - q^7 - 2q^9 + 3q^{11} - 2q^{12} - 4q^{13} + O(q^{16}), \]
\[ g(q) = q^2 + 2q^3 - 2q^4 + q^5 - 3q^6 - 4q^9 - 2q^{10} + 4q^{11} + O(q^{12}). \]
The L-functions attached to $f$ and $g$ are
\[ L(f, s) = 1 + \frac{1}{3^s} - \frac{2}{4^s} - \frac{1}{7^s} - \frac{2}{9^s} + \frac{3}{11^s} - \frac{2}{12^s} - \frac{4}{13^s} + \ldots, \]
\[ L(g, s) = \frac{1}{2^s} + \frac{2}{3^s} - \frac{2}{4^s} + \frac{1}{5^s} - \frac{3}{6^s} - \frac{4}{9^s} - \frac{2}{10^s} + \frac{4}{11^s} + \ldots. \]
Now, let $E_A$ and $E_B$ be the elliptic curves of conductor 37 described
in Example 5.1.8. Then
\[ L(E_B, s) = 1 + \frac{1}{3^s} - \frac{2}{4^s} - \frac{1}{7^s} - \frac{2}{9^s} + \frac{3}{11^s} - \frac{2}{12^s} - \frac{4}{13^s} + \ldots \]
and, indeed, we shall see that $L(f, s) = L(E_B, s)$. How about $E_A$?
\[ L(E_A, s) = 1 - \frac{2}{2^s} - \frac{3}{3^s} + \frac{2}{4^s} - \frac{2}{5^s} + \frac{6}{6^s} - \frac{1}{7^s} + \frac{6}{9^s} + \frac{4}{10^s} - \frac{5}{11^s} + \ldots \]
so $L(E_A, s) \neq L(g, s)$ or $L(f, s)$. Is there some form $F(z) \in S_2(\Gamma_0(37))$
such that $L(E_A, s) = L(F, s)$? If so, $F(q)$ must be a linear combination
$\lambda \cdot f(q) + \mu \cdot g(q)$ for some $\lambda, \mu \in \mathbb{C}$. After a quick look at the
first few coefficients of the q-expansions of $f$ and $g$, and those of the
series $L(E_A, s)$, one can check that, if some $F$ works, then it must be
$F(q) = f(q) - 2g(q)$, and indeed
\[ (f - 2g)(q) = q - 2q^2 - 3q^3 + 2q^4 - 2q^5 + 6q^6 - q^7 + 6q^9 + 4q^{10} - 5q^{11} + O(q^{12}) \]
and so
\[ L(f - 2g, s) = 1 - \frac{2}{2^s} - \frac{3}{3^s} + \frac{2}{4^s} - \frac{2}{5^s} + \frac{6}{6^s} - \frac{1}{7^s} + \frac{6}{9^s} + \frac{4}{10^s} - \frac{5}{11^s} + \ldots \]
Once again, we shall see that the Taniyama-Shimura-Weil conjecture
implies the equality $L(f - 2g, s) = L(E_A, s)$.
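The "quick look at the first few coefficients" can be made explicit. In this sketch (illustrative names, coefficient lists transcribed from the q-expansions of $f$ and $g$ in this example), the combination $f - 2g$ is formed term by term and its prime-indexed coefficients are compared with brute-force point counts $a_p = p + 1 - N_p$ on $E_A : y^2 + y = x^3 - x$.

```python
# Coefficients of q^1, ..., q^11, transcribed from the q-expansions of f and g.
F_COEFFS = [1, 0, 1, -2, 0, 0, -1, 0, -2, 0, 3]
G_COEFFS = [0, 1, 2, -2, 1, -3, 0, 0, -4, -2, 4]


def linear_combination(lam, mu, f_coeffs, g_coeffs):
    """Coefficients of lam * f + mu * g, term by term."""
    return [lam * a + mu * b for a, b in zip(f_coeffs, g_coeffs)]


def a_p_curve_EA(p):
    """a_p = p + 1 - N_p for E_A : y^2 + y = x^3 - x, by brute-force counting over F_p."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = x**3 - x
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return p + 1 - count


if __name__ == "__main__":
    combo = linear_combination(1, -2, F_COEFFS, G_COEFFS)
    print(combo)
    for p in (2, 3, 5, 7, 11):
        # combo[p - 1] is the coefficient of q^p.
        print(p, combo[p - 1], a_p_curve_EA(p))
```

The coefficients of $q$ and $q^2$ already force $\lambda = 1$ and $\mu = -2$, since $f$ has no $q^2$ term and $g$ has no $q$ term; the remaining coefficients then serve as a consistency check.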

5.4. The Taniyama-Shimura-Weil conjecture

In Examples 5.3.2 and 5.3.3, we have seen examples of elliptic curves
$E/\mathbb{Q}$ of conductor $N$ and modular forms $f \in S_2(\Gamma_0(N))$ such that the
L-functions $L(E, s)$ and $L(f, s)$ seem to be identical.

Deﬁnition 5.4.1. We say that an elliptic curve E/Q is modular if

there is a cusp form f (z) such that
L(E, s) = L(f, s).

In the second half of the 20th century, many mathematicians grew

increasingly interested in the question of whether every elliptic curve
over Q is modular. However, early on, it was noticed that not every
cusp form comes from an elliptic curve.
Notice that if $E$ is modular and $L(E, s) = L(f, s) = \sum_{n \geq 1} a_n n^{-s}$,
then $a_p$ must equal $p + 1 - N_p$ when $p$ is a prime of good reduction
for $E$ and, in general, $a_n$ must coincide with those values defined in
Proposition 5.1.5. Hence, for a given elliptic curve, there is a clear
candidate for a cusp form $f$ associated to the elliptic curve $E$.

Definition 5.4.2. Let $E/\mathbb{Q}$ be an elliptic curve. We define the potential
cusp form associated to $E$ to be a function $f_E : \mathbb{H} \to \mathbb{C}$ defined
by its q-expansion
\[ f_E(q) = \sum_{n \geq 1} a_n q^n, \]
where $q = e^{2\pi i z}$ and the $a_n$ are defined in Proposition 5.1.5 (for
instance, if $E/\mathbb{Q}$ has good reduction at $p$, then $a_p = p + 1 - N_p$).

It is very far from clear that fE is a modular form. Let us sup-

pose for a moment that fE is indeed a modular form and L(E, s) =
L(fE , s). What kind of modular form should fE be?
(1) The examples suggest that, ﬁrst of all, fE must be a cusp
form of weight 2 for Γ0 (N ), where N = NE is the conductor
of E/Q;
(2) If L(E, s) = L(fE , s), then, by the functional equation for
L(E, s) in Theorem 5.1.9, the L-function associated to fE ,
that is L(fE , s), must also satisfy a functional equation;
138 5. L-functions

(3) If L(E, s) = L(fE , s), then L(fE , s) must have an Euler

product, since L(E, s) has one. We say that L(s) = n≥1 an n−s
has an Euler product if it can be written as a product

L(s) = p≥2 Lp (s) over all primes p ≥ 2. Clearly, L(E, s) is
deﬁned as an Euler product, so L(fE , s) must have an Euler
product as well.
The work of Hecke characterizes which cusp forms in S2 (Γ0 (N ))
satisfy a functional equation and which cusp forms have an Euler
product. Recall that in Proposition 4.4.2 we deﬁned ±1-spaces of S2
such that
$$S_2(\Gamma_0(N)) = S_2^{+}(\Gamma_0(N)) \oplus S_2^{-}(\Gamma_0(N)).$$

Theorem 5.4.3 (Hecke; [DS05], §5.10). Let N, k ≥ 1 and let
$f(z) \in S_{2k}(\Gamma_0(N))$ be a cusp form such that f(z) is an eigenvector for the
operator $w_N$, i.e., $f(z) \in S_{2k}^{\varepsilon}(\Gamma_0(N))$ for ε = +1 or −1. Then L(f, s)
has an analytic continuation to C. Moreover, if we define
$$\Lambda(f, s) = N^{s/2} (2\pi)^{-s} \Gamma(s) L(f, s),$$
where Γ(s) is the Gamma function, then Λ(f, s) satisfies the functional
equation
$$\Lambda(f, s) = \varepsilon \cdot \Lambda(f, 2k - s).$$

Recall (Definition 4.4.10) that we say that $f(z) = \sum_{n \ge 0} a_n q^n$ is
an eigenform if f is an eigenvector for all Hecke operators $T_n$, n ≥ 1,
simultaneously. We say that f(z) is a normalized eigenform if $a_1 = 1$.

Theorem 5.4.4 (Hecke; [DS05], §5.9). Let N, k ≥ 1. Let f(z) be a
normalized eigenform of weight 2k for $\Gamma_0(N)$ such that $T_p(f) = \lambda_p \cdot f$
for every prime p ≥ 2. Then L(f, s) has an Euler product of the form
$$L(f, s) = \prod_{p \mid N} \frac{1}{1 - \lambda_p p^{-s}} \cdot \prod_{p \nmid N} \frac{1}{1 - \lambda_p p^{-s} + p^{2k-1-2s}}.$$
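
Expanding each local factor of the Euler product as a geometric series shows how all the coefficients $a_n$ are determined by the eigenvalues $\lambda_p$. The sketch below does this in plain Python for weight 2 (k = 1), where the good-prime factor gives the recursion $a_{p^r} = \lambda_p a_{p^{r-1}} - p\, a_{p^{r-2}}$ and the bad-prime factor gives $a_{p^r} = \lambda_p a_{p^{r-1}}$. The function names and the sample eigenvalues (input data for a level 11 form, assumed here for illustration) are not part of the text.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n (for n >= 2)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def coefficients_from_euler_product(lam, level, bound):
    """Return [a_1, ..., a_bound] for a weight 2 normalized eigenform,
    given lam[p] = lambda_p for every prime p <= bound."""
    a = [0] * (bound + 1)
    a[1] = 1
    for n in range(2, bound + 1):
        p = smallest_prime_factor(n)
        prime_power, cofactor = 1, n
        while cofactor % p == 0:
            cofactor //= p
            prime_power *= p
        if cofactor > 1:
            # multiplicativity on coprime factors
            a[n] = a[prime_power] * a[cofactor]
        elif prime_power == p:
            a[n] = lam[p]
        elif level % p == 0:
            # local factor 1 / (1 - lambda_p p^{-s})
            a[n] = lam[p] * a[n // p]
        else:
            # local factor 1 / (1 - lambda_p p^{-s} + p^{1-2s})
            a[n] = lam[p] * a[n // p] - p * a[n // (p * p)]
    return a[1:]


# Assumed sample eigenvalues for a weight 2 form of level 11 (illustration only).
sample_lam = {2: -2, 3: -1, 5: 1, 7: -2, 11: 1}
series = coefficients_from_euler_product(sample_lam, 11, 12)
print(series)
```

The same three rules (coprime products, bad primes, good primes) are what a proof of the theorem has to justify from the Hecke relations; the code only carries out the resulting bookkeeping.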

Now we may use Hecke’s theorems to narrow down which cusp

forms may be associated to elliptic curves. Suppose that E/Q is an
elliptic curve with conductor N and let us assume that the potential
cusp form fE associated to E is indeed a cusp form. Then fE must
verify the following properties:
(1) $f_E(z) \in S_2(\Gamma_0(N))$. The level of $f_E(z)$ should be precisely
N and not lower; otherwise $f_E$ would correspond to a curve
of lower conductor. Thus, we require $f_E(z) \in S_2^{\mathrm{new}}(\Gamma_0(N))$.
Note that the functional equation of $f_E$ determines N, the
conductor/level.
(2) $f_E(z)$ must be in one of the ε-spaces of cusp forms, i.e.,
$f_E \in S_2^{\mathrm{new}}(\Gamma_0(N)) \cap S_2^{\varepsilon}(\Gamma_0(N))$
for ε = +1 or −1.
(3) $f_E(z)$ must be a normalized eigenform in $S_2^{\mathrm{new}}(\Gamma_0(N))$, and
it needs to be an eigenvector for $w_N$ as well. Therefore,
$f_E(z)$ is a normalized newform (Definition 4.4.18).
Taniyama, Shimura and Weil are credited with the following for-
mulation of the modularity conjecture.

Conjecture 5.4.5 (Taniyama-Shimura-Weil). A series of the form
$L(s) = \sum_{n \ge 1} a_n n^{-s}$ with $a_n \in \mathbb{Z}$ is the L-function L(E, s) of an
elliptic curve E/Q of conductor N if and only if L(s) = L(f, s) is the
L-function of a normalized newform of weight 2 for $\Gamma_0(N)$.

The conjecture of Taniyama, Shimura and Weil was proved in

several stages.
• Eichler and Shimura ([Shi73], Ch. 7, Thm. 7.14) showed
one of the directions of the equivalence in the conjecture:
if f (z) is a normalized newform of weight 2 for Γ0 (N ),
then there exists an elliptic curve Ef /Q such that L(f, s) =
L(Ef , s).
• Wiles [Wil95] and Taylor and Wiles [TW95] proved the
Taniyama-Shimura-Weil conjecture when E/Q is semistable
(i.e., if the conductor NE is square-free or, equivalently,
when E/Q does not have any primes of bad additive re-
duction). This was the case that was needed to ﬁnalize the
proof of Fermat’s last theorem (see Section 5.5).
• Finally, Breuil, Conrad, Diamond and Taylor [BCDT01]
showed that the conjecture is true for all elliptic curves over
Q.

The Taniyama-Shimura-Weil conjecture is nowadays frequently

called the modularity theorem. We conclude this section with an
important equivalent formulation of the TSW conjecture:
Theorem 5.4.6 (Modularity theorem). Let E/Q be an elliptic curve
of conductor N , and let X0 (N ) be given by an algebraic model over
Q (see Remark 3.6.4). Then there is a surjective algebraic map of
curves ΨE,N : X0 (N ) → E deﬁned over Q. (The map ΨE,N is called
a modular parametrization of E.)

5.5. Fermat’s last theorem

Theorem 5.5.1. The equation $x^n + y^n = z^n$ has no solutions in
integers x, y, z with xyz ≠ 0, whenever n > 2.

Figure 8. Andrew J. Wiles (right) and his Ph.D. advisor,

John H. Coates (left).

Suppose that n, u, v and w are integers such that n > 2, uvw ≠ 0
and
$$u^n + v^n = w^n.$$
Therefore, either n is divisible by 4, i.e., n = 4k with k ≥ 1, and
$(u^k)^4 + (v^k)^4 = (w^k)^4$, or there is a prime divisor p ≥ 3 of n, with
n = ph and h ≥ 1, such that $(u^h)^p + (v^h)^p = (w^h)^p$. Fermat showed
that the equation $x^4 + y^4 = z^4$ has no solutions x, y, z ∈ Z with
xyz ≠ 0, so we conclude that $x^p + y^p = z^p$ must have an integer
solution with xyz ≠ 0 for some prime p ≥ 3.
Thus, let us suppose that p ≥ 3 and $a^p + b^p = c^p$, with a, b, c ∈ Z
and abc ≠ 0. However, we know that this is not possible for p = 3, 5
or 7.
• Leonhard Euler is generally credited for the proof of the
p = 3 case (although his solution, in 1770, had a major gap).
Kausler (1802), Legendre (1823) and many others have also
published proofs of this case.
• The case of p = 5 was first shown (independently) by Le-
gendre and Dirichlet, around 1825.
• The proof of Fermat’s last theorem for p = 7 is due to Lamé,
published in 1839.
Hence, we may assume that p ≥ 11. It is worth pointing out that, in
1846, Ernst Kummer proved Fermat’s last theorem for regular primes.
Not all primes are regular: we know that there are infinitely many
irregular primes (the first few irregular primes are 37, 59, 67, 101, 103,
131, 149, . . .), but it is widely believed that there are also infinitely
many regular primes. In 1984, the proof of Mordell’s conjecture (now
known as Faltings’ theorem; see the paragraph on Higher degree in
Section 2.1) was announced which shows that, for a fixed n > 2,
xn + y n = z n may have at most a finite number of relatively prime
integer solutions.
The strategy that led to the first (correct) proof of Fermat’s last
theorem was laid out by Frey [Fre86] and Serre [Ser87]. Let p ≥ 11
and suppose a, b, c are relatively prime integers with $a^p + b^p = c^p$ and
abc ≠ 0. In 1984, Frey discovered that the elliptic curve

$$E : y^2 = x(x - a^p)(x + b^p)$$

would be semistable with conductor $N_E = \prod_{\ell \mid abc} \ell$ (see Exercise
5.7.5) and would satisfy some other technical properties. Moreover,
Frey claimed that such a curve E/Q could not be modular; i.e.,
there is no weight 2 normalized newform $f \in S_2(\Gamma_0(N_E))$ such that
L(f, s) = L(E, s). The problem with the modularity of E was made
precise by Serre, and Ribet [Rib90] proved in 1986 that, indeed, E
cannot be modular.
Finally, in 1995, Wiles [Wil95] and Taylor and Wiles [TW95]
proved the Taniyama-Shimura-Weil conjecture 5.4.5 for all semistable
elliptic curves E/Q. Therefore, $E : y^2 = x(x - a^p)(x + b^p)$ would have
to be modular if it existed. Hence, neither E nor the aforementioned
solution (a, b, c) to $x^p + y^p = z^p$ can exist, and Fermat’s last theorem
holds.

5.6. Looking back and looking forward

The quest to find a proof of Fermat’s last theorem lasted more than
350 years, and hundreds of mathematicians tried to attack the prob-
lem in many very different ways. It was simply a fantastic challenge
that piqued the interest of essentially every mathematician from Fer-
mat to Wiles. Still today, Fermat’s last theorem captivates the imag-
ination of math enthusiasts across the world. It is curious, though,
that Fermat’s last theorem has virtually no interesting consequences
other than the statement itself.
However, the study of the solutions of such a simple equation
($x^n + y^n = z^n$) has been the driving force in developing an immense
amount of extremely interesting mathematics. The statement of Fer-
mat’s last theorem may not have relevant corollaries, but the tools
that were used in the proof are incredibly important and offer a vast
range of very useful applications.
The final stages of the proof of Fermat’s last theorem (as out-
lined in Section 5.5) represent one of the biggest triumphs of modern
mathematics — not just because a 358-year-old problem was solved,
but for the fundamental advances in the theory of elliptic curves and
modular forms that were produced in order to verify Fermat’s claim.
This was no small enterprise; we have already briefly described the re-
markable involvement of many important mathematicians (Shimura,
Taniyama, Weil, Frey, Serre, Ribet, Wiles, Taylor, Breuil, Conrad,
and Diamond, among many others). Just the proof of the modularity
theorem (Theorem 5.4.6) occupies more than 200 pages of research
articles (that’s only counting [Wil95], [TW95] and [BCDT01]), and
many books have been written to explain the brilliant mathematics

developed for the proof (see [CSS00] for a graduate-level textbook).
Fermat’s last theorem has been proved, but the broad areas of
research that this book touches on (namely algebraic number theory,
algebraic geometry and their intersection, arithmetic geometry) have
seen an exponential growth over the last couple of centuries, and they
continue to grow at a vigorous pace. Nowadays, there is an immense
amount of research being done on elliptic curves, modular forms, and
generalizations of the modularity theorem to other settings (abelian
varieties, elliptic curves over number ﬁelds, etc.). Many questions
remain unanswered; for instance,

• Are there elliptic curves over Q of arbitrarily high rank? See

Conjecture 2.4.7 and the discussion in the same section.
• Is the Shafarevich-Tate group of an elliptic curve, Ш(E/Q),
always a finite group?
• Is the Birch and Swinnerton-Dyer conjecture true for all el-
liptic curves? See Conjecture 5.2.1 and Section 5.2. The
Clay Mathematics Institute has oﬀered a reward of one mil-
lion dollars for a proof (or counterexample!) of this cele-
brated conjecture.

These are just three questions of great (huge!) interest to number

theorists, but there are many other interesting questions and chal-
lenging problems being formulated as the reader stares at this page.
The Preface to this book contains a list of suggested reading mate-
rial so that the reader can continue to learn (more rigorously, and in
depth) about elliptic curves, modular forms, and their L-functions.

5.7. Exercises
Exercise 5.7.1. Let E/Q be an elliptic curve and let p ≥ 2 be a
prime. Define $E^{ns}(\mathbb{F}_p)$ to be the set of all non-singular points on
$E(\mathbb{F}_p)$, and write $N_p^{ns} = |E^{ns}(\mathbb{F}_p)|$. For instance, if p is a prime of
good reduction, then $E^{ns}(\mathbb{F}_p) = E(\mathbb{F}_p)$ and $N_p^{ns} = N_p = p + 1 - a_p$.
Suppose that E/Q has bad reduction at p. Show that:
$$N_p^{ns} = \begin{cases} p - 1 & \text{if } E \text{ has split multiplicative reduction at } p; \\ p + 1 & \text{if } E \text{ has non-split multiplicative reduction at } p; \\ p & \text{if } E \text{ has additive reduction at } p. \end{cases}$$
Conclude that $L_p(p^{-1}) = N_p^{ns}/p$ for every p ≥ 2 (including good and
bad primes), where the function $L_p(T)$ appears in Definition 5.1.1.
(Hint: write E : f(x, y) = 0 and express
$f(x, y) = ((y - y_0) - \alpha(x - x_0)) \cdot ((y - y_0) - \beta(x - x_0)) - (x - x_0)^3$,
where $(x_0, y_0)$ is the singular point for $E(\mathbb{F}_p)$. Exercise 2.12.11 shows
that there is (at most) one singular point in $E(\mathbb{F}_p)$, at least for p ≥ 3.)

Exercise 5.7.2. Prove Proposition 5.1.5. (Hint: $\frac{1}{1-x} = 1 + x + x^2 + \cdots = \sum_{n \ge 0} x^n$,
and use the Fundamental Theorem of Arithmetic.)

Exercise 5.7.3. Prove the parity conjecture 5.2.5, assuming the

Birch and Swinnerton-Dyer conjecture and the functional equation
of L(E, s). (Hint: use the Taylor expansion of L(E, s) around s = 1.)
Conclude that, if the root number w(E/Q) = −1, then E(Q) is inﬁ-
nite.

Exercise 5.7.4. Let $f(z) = \sum_{n \ge 1} a_n q^n$ be a cusp form in $S_k(\Gamma_0(N))$,
and define the Mellin transform of f(z) by
$$\widetilde{f}(s) = \int_0^{\infty} f(iy)\, y^s \, \frac{dy}{y}.$$
Show that $\widetilde{f}(s) = (2\pi)^{-s} \Gamma(s) L(f, s)$, where Γ(s) is the Gamma function
and L(f, s) is the L-function attached to f. (You may ignore
convergence issues and assume that integrals and infinite sums commute.)

Exercise 5.7.5. Let p > 3 be a prime and suppose that a, b, c are
pairwise relatively prime integers such that $a^p + b^p = c^p$ and abc ≠ 0.
Let E/Q be the elliptic curve (Frey curve) defined by
$$E : y^2 = x(x - a^p)(x + b^p).$$
The goal of this exercise is to show that E is semistable with conductor
$N_E = \prod_{\ell \mid abc} \ell$.
(1) Show that, after rearranging a, b and c if necessary, we can
assume that a ≡ 0 mod 2 and b ≡ c ≡ 1 mod 4. (Hint: if
2|a and b ≡ 3 mod 4, consider $a^p + (-c)^p = (-b)^p$.)
(2) Calculate the discriminant Δ of E/Q.
(3) Show that E/Q has good reduction at all primes that do
not divide abc.
(4) Show that if ℓ ≥ 3 is a prime dividing abc, then E/Q has
bad multiplicative reduction at ℓ.
(5) Show that E/Q has bad multiplicative reduction at ℓ = 2.
(Hint: use the following change of variables
$$x = \frac{X}{4}, \qquad y = \frac{Y}{8} + \frac{3X}{8}$$
to find another model isomorphic to E/Q. Show that this
model has coefficients in Z, and analyze the reduction at
ℓ = 2.)
(6) Conclude that the conductor of E is precisely $N_E = \prod_{\ell \mid abc} \ell$.
(See Definition 5.1.7.)

Figure 9. A 1670 edition of Diophantus’ Arithmetica, which

includes the original Greek text, a Latin translation, and Fer-
mat’s commentary: “Observatio Domini Petri de Fermat”. In
this page Fermat states his famous last theorem.
Appendix A

PARI/GP and Sage

This appendix is meant as a brief introduction to the usage of the

software packages PARI/GP and Sage, oriented to the study of elliptic
curves and modular forms. The websites for these packages are:
• PARI/GP: http://pari.math.u-bordeaux.fr/
• Sage: http://www.sagemath.org/
but notice that you can call PARI/GP from Sage, so I would recom-
mend simply installing Sage on your computer. I strongly recommend
that you use the “notebook” option in Sage and interact with the soft-
ware through your favorite internet browser (e.g. Firefox). Sage can
also be found online (although the performance, usually, is slower
than a local version on your computer):
• Sage online: http://www.sagenb.org/
Both packages have online manuals and speciﬁc sections on elliptic
curves.

A.1. Elliptic curves

A.1.1. Deﬁnition of an Elliptic Curve. An elliptic curve is a
plane curve E given by a Weierstrass equation

$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$

with coefficients $a_1, \ldots, a_6$ in some field F. If the field is of
characteristic different from 2 or 3, one can find an easier model of the
form
$$y^2 = x^3 + Ax + B.$$
In order to work with elliptic curves using the software packages, we
need to define the curves first:
• GP > E = ellinit([a1 , a2 , a3 , a4 , a6 ])
• Sage > E = EllipticCurve([a1 , a2 , a3 , a4 , a6 ])
• or Sage > E = EllipticCurve([A, B]).
Once we have defined an elliptic curve E, we can calculate basic quan-
tities such as the discriminant, the j-invariant or any of the coefficients
bi or cj (as defined in [Sil86], Ch. III, §1):
• In GP, type E.disc, E.c4 or E.j,
• In Sage, type E.discriminant(), E.c4()
or E.j_invariant().
If the elliptic curve is given by a model of the form $y^2 + a_1 xy + a_3 y =
x^3 + a_2 x^2 + a_4 x + a_6$ but you would rather have a model $y^2 = x^3 + Ax + B$,
use the command E.integral_short_weierstrass_model().
Remark A.1.1. Perhaps the two most useful Sage tricks are
the “Tab” key after an object and “?” after a command to get help.
For instance, if we have defined an elliptic curve E, then typing
E.
followed by the “Tab” key displays all possible commands that one can
use with an elliptic curve. This is very useful when we do not remem-
ber the exact syntax or we are wondering if Sage is capable of doing
some particular operation on E. Similarly, if we want to know more
about the usage of a particular command, then “E.command_name?”
will display a help box. For example, if we input E.discriminant?
then Sage tells us that this command returns the discriminant of E
and provides a couple of examples for the user.

A.1.2. Basic operations. Let us start by using the addition on an
elliptic curve. Let E be the curve given by $Y^2 = X^3 + 1$, and suppose
we have initialized E as above. This curve has points P = [0, 1] and
Q = [−1, 0]. Let us find P + Q and 2P (the answers are [2, −3] and
[0, −1] respectively). The commands are:

• In GP, the commands are elladd(E,[0,1],[-1,0]) and,
in order to find 2P, one types ellpow(E,[0,1],2) (this command
is called ellmul in more recent versions of GP);

• Sage: First we create points on the curve: P = E([0,1]);

Q = E([-1,0]) and now we can do addition: type P+Q and
P+P, or calculate multiples by typing 2*P, 3*P, etc.

Notice that Sage will transform aﬃne points to projective coordinates

(e.g., P = E([0,1]) returns (0 : 1 : 1) in Sage). If you want to
ﬁnd points on a curve (up to a given bound B on the height of the
point), use E.point_search(B) in Sage.
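
To see what these commands compute, here is a minimal sketch of the chord-and-tangent addition law for a short Weierstrass model $y^2 = x^3 + Ax + B$ over Q, written in plain Python with exact rational arithmetic. The function name `add` and the convention of representing the point at infinity by `None` are choices made for this illustration only; they are not the Sage or GP interface.

```python
from fractions import Fraction

O = None  # the point at infinity (identity element)


def add(P, Q, A):
    """Add two points on y^2 = x^3 + A*x + B (B does not enter the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 + y2 == 0:
        return O  # P + (-P) = O; this also covers doubling a point with y = 0
    if x1 == x2:
        slope = Fraction(3 * x1 * x1 + A, 2 * y1)  # tangent line at P = Q
    else:
        slope = Fraction(y2 - y1, x2 - x1)  # chord through P and Q
    x3 = slope * slope - x1 - x2
    y3 = slope * (x1 - x3) - y1
    return (x3, y3)


# The curve Y^2 = X^3 + 1 from the text, so A = 0.
P = (0, 1)
Q = (-1, 0)
print(add(P, Q, 0))  # P + Q
print(add(P, P, 0))  # 2P
```

With the points P and Q of the text, the two printed results agree with the answers [2, −3] and [0, −1] quoted above.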

A.1.3. Plotting. Here is an example of a 2D-plot with Sage:

# The curve y^2 = x^3 + 1 and the multiples of P = (2,3)
E = EllipticCurve([0,0,0,0,1]);
Ep = plot(E, -1,2.5,thickness=2);
p1=(2,3); p2=(0,1); p3=(-1,0); p4=(0,-1); p5=(2,-3);
# Red segments joining the points
L1=line([p1,p3],rgbcolor=(1,0,0));
L2=line([p5,p3],rgbcolor=(1,0,0));
L3=line([p4,p3],rgbcolor=(1,0,0));
L4=line([p2,p5],rgbcolor=(1,0,0));
L5=line([p4,p1],rgbcolor=(1,0,0));
# Labels for the points
T1=text('P',[2,3.5]); T2=text('2P',[0.15,1.5]);
T3=text('3P',[-1,.5]); T4=text('4P',[0.15,-1.5]);
T5=text('5P',[2,-3.5]);
P=point([p1,p2,p3,p4,p5],pointsize=30,
rgbcolor=(0,0,0));
PLOT=Ep+T1+T2+T3+T4+T5+L1+L2+L3+L4+L5+P; show(PLOT)
The result is the graph that appears in Figure 2. The following is an

alternative way to plot points on a curve:

Q = E(2,3);
Qplot = plot(Q, pointsize=30)+plot(2*Q, pointsize=30);
show(Qplot)

A.1.4. Good and bad reduction. Given a prime p and an elliptic

curve E/Q given by a Weierstrass equation with integer coeﬃcients,
we can consider E as a curve over Z/pZ. The primes that divide
the (minimal) discriminant are called bad primes or primes of bad
reduction. In Sage, you can ﬁnd the minimal model of an elliptic
curve E by typing E.minimal_model(). For example, in Sage, the
commands

E=EllipticCurve([0,5,0,0,35]);
prime_divisors(E.discriminant())

will return [2,5,7,17]. You may also use

factor(E.discriminant()).

Then one can use the command kodaira_type() to ﬁnd out the
precise type of reduction: I0 is good reduction; Ij, where j > 0 is
some positive number, means bad multiplicative reduction; II, III,
IV or Ij*, for j ≥ 0, or II*,III*,IV* mean additive reduction. For
an explanation of the terminology of Kodaira symbols, see [Sil86],
Appendix C, §15. For our example $E : y^2 = x^3 + 5x^2 + 35$, we obtain

E.kodaira_type(2) returns II (i.e., additive);

E.kodaira_type(5) returns II (i.e., additive);
E.kodaira_type(7) returns I1 (i.e., multiplicative);
E.kodaira_type(17) returns I2 (i.e., multiplicative);
E.kodaira_type(11) returns I0 (i.e., good).
Note: if the equation is not minimal, some of the prime divisors of

the discriminant may not be bad after all. For example,

E=EllipticCurve([0,0,0,0,15625]);
prime_divisors(E.discriminant()) returns [2,3,5] but
E.kodaira_type(5) returns I0 (i.e., good).

This happened because the model $y^2 = x^3 + 15625$ is not minimal
($15625 = 5^6$); we should have used $y^2 = x^3 + 1$ instead.
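
The two examples above can be reproduced by hand from the standard formula for the discriminant in terms of the quantities $b_2, b_4, b_6, b_8$ (as in [Sil86], Ch. III, §1). The following plain Python sketch is only an illustration; the function names `discriminant` and `prime_divisors` are chosen to mirror the Sage commands but are defined here from scratch.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def prime_divisors(n):
    """Sorted list of the primes dividing a non-zero integer n (trial division)."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


print(prime_divisors(discriminant(0, 5, 0, 0, 35)))     # y^2 = x^3 + 5x^2 + 35
print(prime_divisors(discriminant(0, 0, 0, 0, 15625)))  # non-minimal model
print(prime_divisors(discriminant(0, 0, 0, 0, 1)))      # y^2 = x^3 + 1
```

Comparing the last two lines shows the effect of non-minimality: the two discriminants differ by the factor $5^{12}$, so the prime 5 disappears once the model $y^2 = x^3 + 1$ is used.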
If E/Q has good reduction at p, then E defines an elliptic curve
over the finite field Z/pZ and we can count the number of points
modulo p (always including the extra point at infinity). $N_p$ denotes
this number of points while $a_p = p + 1 - N_p$. In GP, the command
ellap(E,p) returns the coefficient $a_p$ and ellan(E,n) returns an
array with the first n coefficients $a_k$ for k = 1, . . . , n.
In Sage, the command E.ap(p) returns $a_p$ while E.an(n) returns
the nth coefficient (and only the nth), and E.anlist(n) provides a
list of all the coefficients up to $a_n$. In Sage you can also directly find
the number $N_p$ by typing E.Np(p).
The conductor of E/Q is another associated quantity that is very
useful in practice:

• In Sage, type E.conductor(),

• In GP, type ellglobalred(E).

The command ellglobalred(E) returns an array [conductor, global

minimal model, product of local Tamagawa numbers]. In Sage, you
can ﬁnd a minimal model of an elliptic curve E by typing the com-
mand E.minimal_model().

A.1.5. The torsion subgroup. It follows from the Mordell-Weil

theorem that the torsion subgroup of an elliptic curve (over a number
ﬁeld) is a ﬁnite abelian group. Over Q, a theorem of B. Mazur says
that the torsion subgroup is one of the following: Z/nZ with 1 ≤ n ≤
10 or n = 12, or Z/2Z × Z/2mZ with 1 ≤ m ≤ 4. One can compute
the torsion subgroup as follows. The computation is easy, due to a
theorem of Nagell and Lutz:
• In GP, the output of elltors(E) is a vector [t, [n, m],
[P,Q]], where t is the size of the torsion subgroup, which
is isomorphic to Z/nZ × Z/mZ, generated by the points P
and Q. If P is a torsion point, the command ellorder(E,P)
provides the order of the element.
• In Sage, E.torsion_order() returns the order of the group,
while G = E.torsion_subgroup() returns the group itself.
Then G.0 and G.1 return generators for G.

Remark A.1.2. Even though the Nagell-Lutz theorem provides a

simple algorithm to calculate the torsion subgroup of an elliptic curve,
this method may not be very eﬀective (at least when the discriminant
is divisible by many primes). In general, there are better algorithms
(for example, see [Dou98]).
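
The simple algorithm alluded to in the remark can be sketched in a few lines for a model $y^2 = x^3 + Ax + B$ with A, B ∈ Z. The sketch assumes the usual statement of the Nagell-Lutz theorem (a torsion point has integer coordinates, and either y = 0 or $y^2$ divides $4A^3 + 27B^2$) together with Mazur's theorem (a torsion point over Q has order at most 12). It is written in plain Python; all function names are hypothetical and the search is deliberately naive.

```python
from fractions import Fraction
from math import isqrt


def add(P, Q, A):
    """Chord-and-tangent addition on y^2 = x^3 + A*x + B; None is the identity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 == 0:
        return None
    if x1 == x2:
        slope = Fraction(3 * x1 * x1 + A, 2 * y1)
    else:
        slope = Fraction(y2 - y1, x2 - x1)
    x3 = slope * slope - x1 - x2
    return (x3, slope * (x1 - x3) - y1)


def is_torsion(P, A, max_order=12):
    """Check whether k*P is the identity for some k <= max_order (Mazur's bound)."""
    R = P
    for _ in range(max_order):
        if R is None:
            return True
        R = add(R, P, A)
    return False


def torsion_points(A, B):
    """Affine torsion points of y^2 = x^3 + A*x + B, with A, B integers."""
    D = abs(4 * A**3 + 27 * B**2)  # non-zero for an elliptic curve
    # Nagell-Lutz candidates: y = 0, or y^2 divides D.
    ys = [0] + [y for y in range(1, isqrt(D) + 1) if D % (y * y) == 0]
    points = []
    for y in ys:
        c = B - y * y
        bound = 1 + max(abs(A), abs(c))  # Cauchy bound for roots of x^3 + A*x + c
        for x in range(-bound, bound + 1):
            if x**3 + A * x + c == 0:
                for sign_y in {y, -y}:
                    if is_torsion((x, sign_y), A):
                        points.append((x, sign_y))
    return sorted(points)


print(torsion_points(0, 1))   # y^2 = x^3 + 1
print(torsion_points(-1, 0))  # y^2 = x^3 - x
```

Together with the point at infinity, the first curve has 6 torsion points and the second has 4. As the remark warns, the number of candidate values of y grows with the number of square divisors of $4A^3 + 27B^2$, which is why this approach is not the one used in practice.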

A.1.6. The free part and the rank. It also follows from the
Mordell-Weil theorem that the free part (here free is the opposite
of torsion) of the group of points E(K) on an elliptic curve (again
over a number field K) is generated by a finite number of points P1 ,
P2 , ..., PR of infinite order. The number R of generators (of infinite
order) is called the rank of E(K). There is no known algorithm that
will always terminate and provide the rank and a set of generators.
However, the so-called “descent algorithm” will terminate in certain
cases (the descent procedure is an algorithm if Ш is finite, and we
conjecture that Ш is always finite). The following commands compute
lower and upper bounds for the rank and, in some cases, if they
coincide, provide the rank of the curve. There are also commands to
calculate generators; however, in many situations, the resulting points
will only generate a group of finite index in E(K) (the software will
warn you when this may be the case). Some of the algorithms take
an optional argument of a bound B.
In Sage, the command E.selmer_rank_bound() gives an upper
bound of the rank, and E.rank(), E.gens() try to find, respectively,
the rank and generators modulo torsion... but the computer may not
succeed! When these commands are called, Sage is using an algorithm
of Cremona in the background (see [Cre97]).
A.1.7. Heights and independence. In order to determine if a set
of rational points is linearly independent (modulo torsion), we use a
pairing arising from the canonical height. The following commands calculate the
global Néron-Tate canonical height of a rational point P on a curve
E:

In GP use ellheight(E,P);
In Sage simply use P.height(), where P is a point on E.

If $S = \{P_1, \ldots, P_n\}$ is a set of rational points, we can test whether
they are independent using the canonical height matrix. The height
pairing of P and Q is defined by $\langle P, Q \rangle = h(P + Q) - h(P) - h(Q)$,
where h is the canonical height on E. The height matrix relative to S
is a matrix H whose coordinate ij is given by $\langle P_i, P_j \rangle$. The canonical
height is a positive definite quadratic form on E(Q) tensored with
the reals. Thus, the determinant of H is non-zero if and only if the
points in S are independent modulo torsion.
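
As a small sanity check of this criterion, consider the (hypothetical) set S = {P, 2P} for a single point P of infinite order. Using only the definition of the pairing above and the fact that the canonical height is a quadratic form, so that $h(mP) = m^2 h(P)$, one finds that the height matrix is singular, as it should be for two dependent points:

```latex
\begin{align*}
\langle P, P \rangle   &= h(2P) - 2h(P) = 2h(P), \\
\langle P, 2P \rangle  &= h(3P) - h(P) - h(2P) = 9h(P) - h(P) - 4h(P) = 4h(P), \\
\langle 2P, 2P \rangle &= h(4P) - 2h(2P) = 16h(P) - 8h(P) = 8h(P), \\
\det H &= 2h(P) \cdot 8h(P) - \bigl(4h(P)\bigr)^2 = 0.
\end{align*}
```

The kernel of H is spanned by the vector (2, −1), which records the relation 2 · P − 1 · (2P) = O.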

In GP use S = [P1,P2,P3];
H=ellheightmatrix(E,S); matdet(H);
In Sage use E.height_pairing_matrix([P1,P2,P3]),

where P1, P2, P3 are points on E (previously deﬁned). In GP, if

matdet(H) returns 0, one can calculate generators for the kernel of
H with matker(H). Each element of the kernel represents a linear
combination of points that adds up to a torsion point. In Sage, you
may use H.kernel() for the same purpose.

A.1.8. Elliptic curves over C. The period lattice of an elliptic

curve E/Q can be found by typing

L=E.period_lattice()

and a basis for the period lattice is found simply using L.basis().
Using PARI/GP, one can start from a lattice and obtain the associ-
ated elliptic curve, as follows:

L=[1,I];
elleisnum(L,4) returns G4 (L),
which equals 2268.8726415...,
elleisnum(L,6) returns G6 (L),
which equals -3.97...E-33, i.e., 0,
thus, L corresponds to an elliptic curve
$y^2 = x^3 - (34033.089\ldots)x$.
The elliptic curve $y^2 = x^3 - (34033.089\ldots)x$ is isomorphic to
$E/\mathbb{Q} : y^2 = x^3 - x$ over C. Thus, $\mathbb{C}/\langle 1, i \rangle \cong E(\mathbb{C})$.

A.2. Modular forms

In this section, all commands we list are to be used in the Sage envi-
ronment.

A.2.1. The modular group and congruence subgroups. The

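modular group and main congruence subgroups, defined for any N > 0 by
$$\mathrm{SL}(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \right\},$$
$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) : c \equiv 0 \bmod N \right\},$$
$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) : a \equiv d \equiv 1 \bmod N \right\},$$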
may be defined in Sage using SL2Z, Gamma0(N), and Gamma1(N), re-
spectively. Alternatively, SL(2, Z) can also be defined as Γ0 (1). Notice
that those 2×2 matrices that define elements of congruence subgroups
are stored in Sage as 4-dimensional row vectors. One can use the sub-
command .gens() on any of the modular and congruence groups to
find a set of matrices that generate (multiplicatively) the given group.
You can call the generators by using the suﬃx [0], [1], etc. Here
are some examples:

A = SL2Z([1,1,0,1]);
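G = SL2Z.gens() returns two matrices
$$G[0] = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad G[1] = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

H = Gamma0(3).gens() returns six matrices
$$H[0] = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad H[1] = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad H[2] = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix},$$
$$H[3] = \begin{pmatrix} 1 & -1 \\ 3 & -2 \end{pmatrix}, \quad H[4] = \begin{pmatrix} 2 & -1 \\ 3 & -1 \end{pmatrix}, \quad H[5] = \begin{pmatrix} -2 & 1 \\ -3 & 1 \end{pmatrix}.$$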
The genus of the modular curve X0 (N ) can be computed with the
command Gamma0(N).genus(). Similarly, Gamma(N).genus() and
Gamma1(N).genus() return the genus of X(N ) and X1 (N ), respec-
tively.
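
The defining conditions of these groups are easy to test directly. The following plain Python sketch (independent of Sage; the function names are made up for this illustration) stores a matrix as a 4-tuple (a, b, c, d), in the spirit of the row vectors mentioned above, and checks membership in SL(2, Z), Γ0(N) and Γ1(N) for the matrices listed in the examples.

```python
def in_SL2Z(m):
    """m = (a, b, c, d) with integer entries and determinant 1."""
    a, b, c, d = m
    return a * d - b * c == 1


def in_Gamma0(m, N):
    """Membership in Gamma_0(N): determinant 1 and c = 0 mod N."""
    return in_SL2Z(m) and m[2] % N == 0


def in_Gamma1(m, N):
    """Membership in Gamma_1(N): in Gamma_0(N) and a = d = 1 mod N."""
    a, _, _, d = m
    return in_Gamma0(m, N) and (a - 1) % N == 0 and (d - 1) % N == 0


# The two generators of SL(2, Z) and the six matrices listed for Gamma_0(3).
G = [(0, -1, 1, 0), (1, 1, 0, 1)]
H = [(1, 1, 0, 1), (-1, 0, 0, -1), (1, -1, 0, 1),
     (1, -1, 3, -2), (2, -1, 3, -1), (-2, 1, -3, 1)]

print([in_SL2Z(m) for m in G])
print([in_Gamma0(m, 3) for m in H])
print([in_Gamma1(m, 3) for m in H])
```

All six matrices pass the Γ0(3) test, while only some of them lie in the smaller group Γ1(3); the matrix G[0] lies in SL(2, Z) but not in Γ0(3).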

A.2.2. Vector spaces of modular forms. Let Γ be a congruence

subgroup of SL(2, Z) and define:
• Mk (Γ), the C-vector space of all modular forms for Γ of
weight k;
• Sk (Γ), the C-vector space of all cusp forms for Γ of weight
k.
Suppose you have already defined a congruence subgroup G (for ex-
ample, G = Gamma0(3)) and are interested in forms of weight k. The
vector spaces of modular forms and cusp forms can be defined in Sage
by
M=ModularForms(G,k) or ModularForms(G,k,prec=m)
if you want q-series expansions up to $q^m$;
S=CuspForms(G,k) or CuspForms(G,k,prec=m).
The precision is set to 6 by default. If you want to find the dimen-
sion or a basis, you can use the suffix .dimension() or .basis(),
respectively. Here is an example:

M=ModularForms(Gamma0(3),4, prec=10);
M.dimension() returns 2;
M.basis() returns the forms:
[1 + 240q^3 + 2160q^6 + 6720q^9 + O(q^10),
q + 9q^2 + 27q^3 + 73q^4 + 126q^5 + 243q^6
+ 344q^7 + 585q^8 + 729q^9 + O(q^10)].
The command CuspForms(Gamma0(3),4,prec=10) returns only the
0 vector space. Notice that even though the modular form $q + 9q^2 +
27q^3 + O(q^4)$ vanishes at the cusp at infinity (because $a_0 = 0$ in the
expansion), it is not a cusp form for $\Gamma_0(3)$ because it does not vanish
at all the cusps of X0 (3) (inﬁnity is not the only cusp!). The com-
mand AllCusps(N) produces a list of all (representatives of) cusps of
X0 (N ).
AllCusps(3) returns [(inf),(0)].
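
The coefficients in the basis printed above can be cross-checked against divisor sums. The sketch below, in plain Python, assumes the identification (an observation made for this illustration, not a statement from the text) that the first basis element has the expansion $1 + 240 \sum_{n \ge 1} \sigma_3(n) q^{3n}$ and that the coefficient of $q^n$ in the second is $\sigma_3(n) - \sigma_3(n/3)$, where $\sigma_3(n) = \sum_{d \mid n} d^3$ and $\sigma_3(n/3) = 0$ when 3 does not divide n.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)


def sigma3_third(n):
    """sigma3(n/3) if 3 divides n, and 0 otherwise."""
    return sigma3(n // 3) if n % 3 == 0 else 0


prec = 10  # coefficients of q^0, ..., q^9, matching prec=10 in the example

first = [1] + [240 * sigma3_third(n) for n in range(1, prec)]
second = [0] + [sigma3(n) - sigma3_third(n) for n in range(1, prec)]

print(first)
print(second)
```

The two printed lists reproduce the coefficients of $q^0, \ldots, q^9$ of the two forms returned by M.basis(), including the places where the second form differs from $\sigma_3(n)$ itself (n = 3, 6 and 9).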

A.3. L-functions
Let E/Q be an elliptic curve, and let L(E, s) be the Hasse-Weil L-
function associated to E, as in Definition 5.1.1. This L-function is
defined in Sage using the command
L=E.lseries()
or one can use L=E.lseries().dokchitser() to use Dokchitser’s
algorithms to calculate values ([Dok04]). Once we have defined L =
L(E, s), we can evaluate L. For example:
E=EllipticCurve([1,2,3,4,5]);
L=E.lseries();
L(1) which returns 0,
L(1+I) = -0.485502124065793 + 0.627256178203893*I.
The value L(E, 1) = 0 is predicted in this case by the Birch and
Swinnerton-Dyer conjecture (Conjecture 5.2.1), since the rank of E is
> 0 (in fact, the rank is 1). One can also plot L(E, x) when x takes
real values (because L(E, x) is real valued for x ∈ R). For instance,
the graph in Figure 2 was created with the following lines of code:

E0=EllipticCurve([0,-1,1,-10,-20]);
L0=E0.lseries().dokchitser();
P0=plot(lambda x: L0(x).real(),0, 3);
show(P0,xmin=-0.5, ymin=0, dpi=150).

If you want to create a PDF ﬁle with your graph, you can use

P=plot(lambda x: real(L0(x)),0, 3).save(
"bsdrank0.pdf",xmin=-0.5, ymin=-0.2, dpi=150).

You may also want to calculate the Taylor polynomial of L(E, s)

around the point x = a of degree n − 1 with L.taylor_series(a,n).

A.3.1. Data related to the BSD conjecture. The Shafarevich-

Tate group of E/Q is deﬁned in Sage by E.sha() but, in general, it
is diﬃcult to calculate its order. The user can calculate a conjectural
value of Sha by typing E.sha().an(). The conductor N of E/Q is
calculated with E.conductor(). The Tamagawa product $\prod_{p \mid N} c_p$ can
be calculated directly with E.tamagawa_product() or the individual
Tamagawa numbers $c_p$, for each prime p|N, may be calculated with
E.tamagawa_number(p). The regulator of E/Q can be calculated by
E.regulator(). Finally, the real period ΩE is calculated as follows:

E=EllipticCurve([1,2,3,4,5]);
M=E.period_lattice();
Then M.omega() returns $\Omega_E$ = 2.78074001376673 . . ..

The reader should try to use the commands above to calculate

all the invariants listed in Examples 5.2.3 and 5.2.4 (see Figure 6 and
Figure 7).

A.4. Other Sage commands

• Continued fractions:
continued_frac_list(N) returns the continued fraction of N ;
continued_frac_list(N,partial_convergents=True) or
convergents(v) return convergents for the cont. frac. v.
• The Kronecker symbol (defined in Example 1.3.3):
kronecker(-n,m) returns the Kronecker symbol $\left(\frac{-n}{m}\right)$.
Appendix B

Complex analysis

In this appendix we review some of the basic notions of complex

numbers and the theory of analytic and meromorphic functions on
the complex plane. This brief appendix is by no means a replacement
for a good course or a good book on complex analysis such as [Ahl79].

B.1. Complex numbers

The complex numbers, usually denoted by C, are defined as an exten-
sion of the real numbers R. Over the reals, the equation x2 + 1 = 0
has no solutions, so we define a new number i that satisfies i2 = −1.
Therefore x2 + 1 = 0 now has two solutions, namely i and −i. We
define C by adjoining our new number i to R:

C = {a + bi : a, b ∈ R, i2 = −1}.

The real and imaginary parts of a complex number α = a + bi are
denoted, respectively, by $\Re(\alpha) = a$ and $\Im(\alpha) = b$. If $\Im(\alpha) = b = 0$
we say that α is a real number, and if $\Re(\alpha) = a = 0$ we say that α
is purely imaginary. We can add and multiply two complex numbers
α = a + bi and β = c + di to obtain a new complex number, as follows:

α+β = (a + bi) + (c + di) = (a + c) + (b + d)i; and

α·β = (a + bi) · (c + di) = (ac − bd) + (ad + bc)i.


The set of all complex numbers together with the operations of addi-
tion and multiplication form a ﬁeld (see Exercise B.7.1).
There are two other operations on complex numbers that occur
often: complex conjugation and calculating the modulus, or absolute
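value. The complex conjugate of α = a + bi is $\bar{\alpha} = a - bi$. The modulus
or absolute value of α is
$$|\alpha| = \sqrt{\alpha \cdot \bar{\alpha}} = \sqrt{(a + bi)(a - bi)} = \sqrt{a^2 + b^2}.$$
Notice that, for any α, β ∈ C, we have
$$\overline{\alpha + \beta} = \bar{\alpha} + \bar{\beta}, \quad \overline{\alpha \cdot \beta} = \bar{\alpha} \cdot \bar{\beta}, \quad |\alpha\beta| = |\alpha||\beta|, \quad \text{and} \quad |\alpha + \beta| \le |\alpha| + |\beta|.$$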

We constructed the complex numbers by adjoining i to R so that
i ∈ C and therefore the equation $x^2 + 1 = 0$ has two solutions in C.
But something extremely surprising happened in this construction.
It turns out that not only does $x^2 + 1$ have a root in C but, in fact,
every non-constant polynomial with complex coefficients has a root in C.
This is an extremely important result:
Theorem B.1.1 (Fundamental Theorem of Algebra). Let p(z) be a
polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \ldots + a_1 z + a_0$$
with complex coefficients $a_i \in \mathbb{C}$ and degree ≥ 1. Then there exists a
complex number α ∈ C such that p(α) = 0.

The proof is left to the reader (Exercise B.7.6).

B.2. Analytic functions

Definition B.2.1. Let α ∈ C and $\delta \in \mathbb{R}^+$. An open disc $D_\delta(\alpha)$ in
the complex plane, centered at α and of radius δ > 0, is the set
$$D_\delta(\alpha) = \{z \in \mathbb{C} : |z - \alpha| < \delta\}.$$
Definition B.2.2. We say that a set S ⊆ C is open if for every α ∈ S
there is a real number δ > 0 such that $D_\delta(\alpha) \subseteq S$. We say that a set
T ⊆ C is closed if the complement of T in C, i.e., C − T, is open.
Definition B.2.3. A non-empty connected open set in the complex
plane is called a region.

Let U be a region in the complex plane and let f(z) : U → C be
a complex-valued function on U. Let α ∈ U. We say that f has a
derivative at α if the usual limit converges:
$$f'(\alpha) = \lim_{h \to 0} \frac{f(\alpha + h) - f(\alpha)}{h},$$
where h runs over non-zero complex numbers that approach 0, with α + h ∈ U.
Alternatively (or more precisely), we can define $f'(\alpha)$ using ε and δ as
follows. We say that f has a derivative at α with value $m = f'(\alpha)$ if
the following statement holds: for every real ε > 0 there exists a real
δ > 0 such that, if 0 < |h| < δ and α + h ∈ U, then
$$\left| \frac{f(\alpha + h) - f(\alpha)}{h} - m \right| < \varepsilon.$$
Definition B.2.4. Let U ⊆ C be a region and let f(z) be a complex-valued
function f : U → C defined for every z ∈ U. We say that f(z)
is analytic (or holomorphic) on U if it has a derivative at
each z ∈ U. A function that is analytic on the whole complex plane C is called entire.
Example B.2.5. The function f (z) = z is analytic on the whole
complex plane C (Exercise B.7.3). The function g(z) = 1/z is analytic
on C − {0}.

It is not hard to show that the sum, product and composition

of two analytic functions are also analytic. Thus, all polynomials
in one variable with complex coeﬃcients deﬁne analytic functions.
Similarly, the quotient of two analytic functions is analytic except at
the zeros of the denominator. Thus, all rational functions (quotients
of polynomials) are analytic in the complex plane except at the zeros
of the polynomial in the denominator.
Remark B.2.6. Let U be a region of C and let f : U → C be an
analytic function. We write f (z) where z = x + yi ∈ U , with x, y ∈ R.
We may also write
f (z) = u(z) + v(z)i,
where u, v : U → R are real-valued functions. Since f is analytic on
U , the functions f , u and v are continuous on U (Exercise B.7.4).
Since f is analytic, the limit
$$f'(z) = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} \tag{B.1}$$
exists for every z ∈ U. The parameter h runs over complex numbers
approaching zero, but we may restrict h to real values (thus,
we are calculating ∂f/∂x). The value of the limit in Eq. (B.1) does
not change under this restriction, and this means that the partial
derivative of f with respect to x equals $f'(z)$. Hence
$$f'(z) = \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial x} i.$$
Similarly, we may restrict h to purely imaginary values h = ik, and
then
$$f'(z) = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = \lim_{k \to 0} \frac{f(z + ik) - f(z)}{ik} = \frac{1}{i} \cdot \lim_{k \to 0} \frac{f(z + ik) - f(z)}{k} = (-i) \frac{\partial f}{\partial y}.$$
It follows that $f'(z) = (-i)\frac{\partial f}{\partial y} = (-i)\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} i\right)$. Therefore,
$$f'(z) = \frac{\partial f}{\partial x} = (-i)\frac{\partial f}{\partial y} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial x} i = \frac{\partial v}{\partial y} - \frac{\partial u}{\partial y} i.$$
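The last equality implies that the real and imaginary parts of every
analytic function must satisfy the following differential equations:
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{B.2}$$
These are called the Cauchy-Riemann differential equations.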
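
A quick numerical experiment makes the Cauchy-Riemann condition concrete. The sketch below is an illustration under stated assumptions, not part of the original text: it approximates ∂f/∂x and ∂f/∂y by central differences (the step size `h` and the function names are arbitrary choices) and measures how far ∂f/∂x is from (−i)∂f/∂y.

```python
import cmath

def partials(f, z, h=1e-6):
    """Central-difference approximations of df/dx and df/dy at z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return df_dx, df_dy

def cauchy_riemann_defect(f, z, h=1e-6):
    """|df/dx - (-i) df/dy|, which is close to 0 when Eq. (B.2) holds at z."""
    df_dx, df_dy = partials(f, z, h)
    return abs(df_dx - (-1j) * df_dy)

z0 = 0.3 + 0.7j
# Polynomials and the exponential are analytic, so the defect is tiny.
assert cauchy_riemann_defect(lambda z: z ** 3, z0) < 1e-6
assert cauchy_riemann_defect(cmath.exp, z0) < 1e-6
# Complex conjugation is differentiable over R but not over C:
# here df/dx = 1 and (-i) df/dy = -1, so the defect is about 2.
assert abs(cauchy_riemann_defect(lambda z: z.conjugate(), z0) - 2) < 1e-6
```

The conjugation map is the standard example of a function that is smooth as a map of the real plane but fails the Cauchy-Riemann equations at every point.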

Diﬀerentiability (or being analytic) over C, as in Deﬁnition B.2.4,

is a much stronger condition than differentiability over R. Indeed, the
existence of a complex derivative implies that the function is in fact
infinitely differentiable and locally equal to its own Taylor series. We
explain what these terms mean in the following theorem.

Theorem B.2.7. Let U ⊆ C be a region and let f : U → C be
analytic. Then f has derivatives of all orders on U (i.e., the derivatives
$f'(z), f''(z), \ldots$ and, more generally, $f^{(n)}(z)$ for all n ≥ 1 are
continuous and differentiable complex-valued functions on U).
Moreover, for every α ∈ U, the Taylor series of f(z) about z = α
converges to f(z) in some neighborhood of α. In other words, for
every α ∈ U, there is a real δ > 0 such that the Taylor series
$$T(z; \alpha) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\alpha)}{n!} (z - \alpha)^n$$
converges for all $z \in D_\delta(\alpha)$, and T(z; α) = f(z).
Conversely, if $S(z) = \sum_{n=0}^{\infty} a_n (z - \alpha)^n$ is a power series with
complex coefficients $a_n$ with a radius of convergence R (i.e., S(z)
converges for all z ∈ C with |z − α| < R), then S(z) defines an
analytic function on the open disc $D_R(\alpha)$.
Example B.2.8. Let $f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$. The radius of convergence of
this series is infinite (over C as well as over R), so it defines an analytic
function in the complex plane. The function f(z) is, of course, the
complex exponential function, which we discuss below in Section B.4 in some
more detail. Similarly, we define sin(z) and cos(z) using the usual
Taylor expansions
$$\sin(z) = \sum_{n=0}^{\infty} (-1)^{n} \frac{z^{2n+1}}{(2n+1)!}, \qquad \cos(z) = \sum_{n=0}^{\infty} (-1)^{n} \frac{z^{2n}}{(2n)!}.$$
Since the radius of convergence of these series is infinite, sin(z) and
cos(z) define analytic functions on C.
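
As an illustration (a sketch, not part of the original text), the partial sums of these three series can be compared with the library functions in Python's `cmath` module. The number of terms is an arbitrary choice that is more than sufficient for small |z|, and the function names are hypothetical.

```python
import cmath

def exp_series(z, terms=40):
    """Partial sum of sum_{n>=0} z^n / n!."""
    total, term = 0j, 1 + 0j          # term holds z^n / n!
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

def sin_series(z, terms=20):
    """Partial sum of sum_{n>=0} (-1)^n z^(2n+1) / (2n+1)!."""
    total, term = 0j, z + 0j
    for n in range(terms):
        total += term
        term *= -z * z / ((2 * n + 2) * (2 * n + 3))
    return total

def cos_series(z, terms=20):
    """Partial sum of sum_{n>=0} (-1)^n z^(2n) / (2n)!."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= -z * z / ((2 * n + 1) * (2 * n + 2))
    return total

z = 1 + 2j
assert abs(exp_series(z) - cmath.exp(z)) < 1e-10
assert abs(sin_series(z) - cmath.sin(z)) < 1e-10
assert abs(cos_series(z) - cmath.cos(z)) < 1e-10
```

Each loop updates the current term from the previous one, which avoids computing large factorials and powers separately.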

B.3. Meromorphic functions

At this juncture, it is useful to extend the complex numbers by introducing
a point at infinity ∞. We will write $\widehat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ for the
extended complex plane. We set the convention that every straight
line shall pass through the point at infinity. (Note that $\widehat{\mathbb{C}}$ is simply
the projective line over C, i.e., $\mathbb{P}^1(\mathbb{C})$. See Appendix C for an
introduction to the projective line and projective geometry.)

With this definition of ∞, suppose that f(z) is a complex-valued
function not defined at α. The expression
$$\lim_{z \to \alpha} f(z) = \infty$$
means that |f(z)| grows without bound as z approaches α. For instance,
f(z) = 1/z is not defined at 0 and $\lim_{z \to 0} 1/z = \infty$. This “∞” is
the complex point at infinity, and it should not be confused with the
infinity that we use in real analysis (“very, very far along the positive
x-axis”). In fact, in R, the limit $\lim_{x \to 0} 1/x$ is undefined (as the value
may be ±∞ depending on how we approach 0), but in C, the limit
$\lim_{z \to 0} 1/z = \infty$ simply means that if z is close to 0, then 1/z is far
from 0 (in some direction, not necessarily along the x-axis).
Suppose that f (z) is some complex-valued function that is not
defined at α but is analytic in a neighborhood of α. How can f fail to
be analytic at α? The function f (z) may have a removable singularity
(e.g., sin(z)/z), an essential singularity (e.g., sin(1/z)) or a pole (e.g.,
1/z). Here we will only discuss poles in some detail (for a complete
discussion, see [Ahl79], Ch. 4, §3).

Definition B.3.1. Let f be a complex-valued function, and let α ∈
C. We say that f has a pole (or isolated pole) at z = α if:
(1) The function f(z) is analytic on some disc $D_\delta(\alpha)$ centered
at α, except at α itself. In other words, f is analytic on the
punctured disc
$$\{z \in \mathbb{C} : 0 < |z - \alpha| < \delta\}$$
for some δ > 0; and
(2) The limit of f at α is infinite:
$$\lim_{z \to \alpha} f(z) = \infty.$$

Deﬁnition B.3.2. A function f (z) is meromorphic in a region U if

f is analytic on U except for a set of isolated poles.

Remark B.3.3. Suppose that f(z) is meromorphic in a region U
with an isolated pole at α ∈ U. It does not make sense to write f :
U → C, since $\lim_{z \to \alpha} f(z) = \infty$. Instead, we may write $f : U \to \widehat{\mathbb{C}}$, where $\widehat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$.

Example B.3.4. Let p(z) and q(z) be polynomials in C[z] such that
p and q have no common factors. Then the rational function p(z)/q(z)
is a meromorphic function with isolated poles at the zeros of q(z).

Example B.3.5. The function sin(1/z) has infinitely many zeros
accumulating near z = 0 (there is a zero at each z = 1/(πk) for
each k ≥ 1). Therefore, $g(z) = (\sin(1/z))^{-1}$ is not meromorphic in any region that contains 0,
because the singularity at 0 is not isolated. In fact, the function g
has infinitely many poles in any open neighborhood of 0. Notice,
however, that $(\sin(z))^{-1}$ is a meromorphic function.

Remark B.3.6. Let f(z) be a function that is analytic in a disc
$D_R(\alpha)$ except, perhaps, at α ∈ C. Then f(z) has a Laurent expansion
of the form
$$f(z) = \sum_{n=-\infty}^{\infty} c_n (z - \alpha)^n.$$
Then, the function f(z):
(1) is analytic at α if $c_n = 0$ for all n < 0 and $f(\alpha) = c_0$ (if
$f(\alpha) \ne c_0$, or if f(α) is undefined, then there is a removable
singularity at α),
(2) is meromorphic at α if there is some M > 0 such that $c_n = 0$
for all n < −M; i.e., the expansion of f(z) is of the form
$$f(z) = \sum_{n=-M}^{\infty} c_n (z - \alpha)^n,$$
and
(3) has an essential singularity at α if there are infinitely many
n < 0 such that $c_n \ne 0$.

B.4. The complex exponential function

The usual real exponential function $e^x$ can be extended to the field
of complex numbers as follows. Let z = x + yi with x, y ∈ R. Then
we define $e^z$ by
$$e^z = e^{x+yi} := e^x(\cos(y) + \sin(y) i).$$
Equivalently, $e^z$ can be defined as a Taylor series (which coincides
with the Taylor series of the real-valued exponential function):
$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$
If z = x + yi with x, y ∈ R:
$$|e^z| = |e^{x+yi}| = |e^x\cos(y) + (e^x\sin(y)) i| = \sqrt{(e^x\cos(y))^2 + (e^x\sin(y))^2} = \sqrt{e^{2x}(\cos^2(y) + \sin^2(y))} = e^x.$$
Notice that, if θ ∈ R, then $e^{\theta i}$ is a complex number that lies on the
unit complex circle {z ∈ C : |z| = 1}. Indeed, by the formula above,
$|e^{\theta i}| = |e^{0+\theta i}| = e^0 = 1$.
In the theory of L-functions, we often calculate powers of natural
numbers n ∈ N with complex exponents s ∈ C. Next, we define
what $n^s$ means precisely. If n ∈ N and s = x + yi ∈ C, we define
$n^s = e^{\log(n)s}$, i.e.,
$$n^s = e^{\log(n)s} = e^{\log(n)x + \log(n)yi} = e^{\log(n)x}(\cos(\log(n)y) + \sin(\log(n)y) i) = n^x(\cos(\log(n)y) + \sin(\log(n)y) i).$$
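
The definition of a complex power of a natural number translates directly into code. The following Python sketch is an illustration only (the function name `natural_power` is hypothetical): it evaluates the last formula above and compares the result with Python's own complex exponentiation.

```python
import math

def natural_power(n, s):
    """n^s = n^x (cos(y log n) + i sin(y log n)) for s = x + yi."""
    s = complex(s)
    x, y = s.real, s.imag
    log_n = math.log(n)
    return n ** x * complex(math.cos(log_n * y), math.sin(log_n * y))

s = 0.5 + 14.0j
value = natural_power(3, s)
# Agreement with Python's built-in complex power.
assert abs(value - 3 ** s) < 1e-12
# The modulus only depends on the real part of s: |n^s| = n^x.
assert math.isclose(abs(value), 3 ** 0.5)
```

The second assertion reflects the computation of $|e^z|$ above: the factor cos(log(n)y) + sin(log(n)y)i lies on the unit circle.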

B.5. Theorems in complex analysis

In this section we state some of the most important and useful the-
orems about analytic functions. We have already stated two funda-
mental theorems, namely Theorems B.1.1 and B.2.7.
The first two theorems concern line integrals along closed curves.
If γ is a closed curve (the starting point is equal to the end point)
in C, and f(z) is a function defined at every point of γ, then the
symbol $\oint_\gamma f(z)\,dz$ represents the line integral of f(z) along γ. A curve
is contractible in a region U if it can be continuously shrunk to a
point, always staying inside U . The winding number of a curve γ
with respect to a point α ∈ C, denoted by n(γ, α), counts the number
of times that the path γ winds around α. The winding number is
positive if the curve goes around α in the counterclockwise direction,
and negative otherwise. (See [Ahl79], Ch. 4.)

Theorem B.5.1 (Cauchy’s Theorem). Let U be a region in C, let
f(z) be a complex-valued function that is analytic on U, and let γ be
any contractible closed curve contained in U. Then
$$\oint_\gamma f(z)\,dz = 0.$$

Theorem B.5.2 (Cauchy’s Integral Formula). Let f(z) be a function
that is analytic in a region U, and let γ be a contractible closed curve inside U.
For any point α ∈ U not on γ, we have
$$n(\gamma, \alpha) \cdot f(\alpha) = \frac{1}{2\pi i} \oint_\gamma \frac{f(z)}{z - \alpha}\,dz,$$
where n(γ, α) is the winding number of γ around α.
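
Both theorems can be tested numerically on a circle. The sketch below is an illustration under stated assumptions, not part of the original text: the curve γ is the unit circle traversed once counterclockwise, the integral is approximated by a Riemann sum in the angle, and the names `circle_integral` and `steps` are hypothetical.

```python
import cmath
import math

def circle_integral(g, center, radius, steps=2000):
    """Approximate the line integral of g along |z - center| = radius,
    traversed once counterclockwise, by a Riemann sum in the angle."""
    total = 0j
    dtheta = 2 * math.pi / steps
    for k in range(steps):
        w = cmath.exp(1j * k * dtheta)        # point on the unit circle
        z = center + radius * w
        dz = 1j * radius * w * dtheta         # derivative of the parametrization
        total += g(z) * dz
    return total

f = cmath.exp

# Cauchy's Theorem: the integral of an analytic function is 0.
assert abs(circle_integral(f, 0, 1)) < 1e-9

# Cauchy's Integral Formula with alpha inside the circle (winding number 1).
alpha = 0.2 + 0.1j
value = circle_integral(lambda z: f(z) / (z - alpha), 0, 1) / (2j * math.pi)
assert abs(value - f(alpha)) < 1e-9

# If alpha lies outside the circle, the winding number is 0.
beta = 2 + 0j
value = circle_integral(lambda z: f(z) / (z - beta), 0, 1) / (2j * math.pi)
assert abs(value) < 1e-9
```

The equally spaced sum is very accurate here because the integrand is analytic near the circle; for curves with corners a finer or adaptive rule would be needed.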

Cauchy’s Theorem B.5.1 has the following converse.

Theorem B.5.3 (Morera’s Theorem). If f(z) is defined and continuous
in a region U ⊆ C, and if $\oint_\gamma f(z)\,dz = 0$ for all closed curves γ
in U, then f(z) is analytic in U.

Another important theorem about line integrals is the Residue

Theorem (see [Ahl79], Ch. 4, §5.1). Before stating the next theo-
rem, we remind the reader that a region is by deﬁnition a non-empty
connected open set.
Theorem B.5.4 (The Maximum Principle). If f (z) is analytic and
non-constant in a region U , then its absolute value |f (z)| has no max-
imum in U . Alternatively, if f (z) is an analytic function on a closed
bounded set T , then the maximum of |f (z)| occurs on the boundary
of T .
Theorem B.5.5 (Liouville’s Theorem). A function which is analytic
and bounded in the whole complex plane must be constant.

We say that a point α in a set S ⊆ C is an accumulation point of S if for
every δ > 0 there is a point β ∈ S, β ≠ α, such that |β − α| < δ.
Theorem B.5.6. If f (z) and g(z) are analytic in a region U , and if
f (z) = g(z) for every z in a set S which has an accumulation point
in U , then f (z) is identically equal to g(z) on all points of U .

The previous theorem has some remarkable consequences: if f(z)
is analytic in U and it is identically zero in a set S ⊆ U that contains
an accumulation point, then f(z) is identically zero. Also, we deduce
that an analytic function is uniquely determined by its values on any
set with an accumulation point in the region of analyticity.
Theorem B.5.7 (Conformal Mapping Theorem). A complex func-
tion is analytic if and only if it maps pairs of intersecting curves into
pairs that intersect at the same angle.

B.6. Quotients of the complex plane

In the theory of elliptic curves over C, we often work with a quotient
of the complex plane C modulo some lattice L. See Section 3.1 for
the definition of lattice, the definition of the quotient C/L and the
relationship to elliptic curves. In this section, we define what it means
for a map C/L → C to be analytic.
Let L ⊂ C be a lattice with a basis $L = \langle w_1, w_2 \rangle$. Usually, we fix
a fundamental domain for L as follows:
$$F_L = \{\lambda w_1 + \mu w_2 \in \mathbb{C} : 0 \le \lambda, \mu < 1\}.$$
For our purposes here, we will define a fundamental domain for C/L
for each α ∈ C such that α is positioned in the interior of the domain:
$$F_{L,\alpha} = \{\alpha + \lambda w_1 + \mu w_2 \in \mathbb{C} : -1/2 \le \lambda, \mu < 1/2\}$$
and we also define the interior of $F_{L,\alpha}$ by
$$F^0_{L,\alpha} = \{\alpha + \lambda w_1 + \mu w_2 \in \mathbb{C} : -1/2 < \lambda, \mu < 1/2\}.$$
Notice that $F^0_{L,\alpha}$ is a region in C (it is non-empty, connected and
open), and α is at the center of the region. Notice that there is a
bijection
$$\psi_{L,\alpha} : \mathbb{C}/L \to F_{L,\alpha}. \tag{B.3}$$
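
To make the bijection of Eq. (B.3) concrete, here is a minimal Python sketch (an illustration, not part of the original text; the function name and its arguments are hypothetical). It takes any representative z of a class in C/L, writes z − α = λw1 + μw2 with real λ and μ, and shifts λ and μ by integers into the interval [−1/2, 1/2).

```python
import math

def reduce_mod_lattice(z, w1, w2, alpha=0j):
    """Representative of z mod L inside F_{L,alpha}, for L = <w1, w2>."""
    d = z - alpha
    # Solve d = lam * w1 + mu * w2 over the reals (Cramer's rule).
    det = w1.real * w2.imag - w1.imag * w2.real
    lam = (d.real * w2.imag - d.imag * w2.real) / det
    mu = (w1.real * d.imag - w1.imag * d.real) / det
    # Shift both coordinates by integers into [-1/2, 1/2).
    lam -= math.floor(lam + 0.5)
    mu -= math.floor(mu + 0.5)
    return alpha + lam * w1 + mu * w2

# Square lattice L = <1, i>.
assert abs(reduce_mod_lattice(2.3 + 3.8j, 1 + 0j, 1j) - (0.3 - 0.2j)) < 1e-9

# Adding a lattice element does not change the representative.
w1, w2 = 2 + 0j, 1 + 1.5j
z = 0.4 + 0.3j
assert abs(reduce_mod_lattice(z + 5 * w1 - 3 * w2, w1, w2)
           - reduce_mod_lattice(z, w1, w2)) < 1e-9
```

Floating-point rounding can move a point that lies exactly on the boundary of the fundamental domain to the opposite side, so the sketch is only reliable for points in the interior.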
Let f : C/L → C be a complex-valued function that is well-defined for
every element of the quotient C/L. Let α mod L be such an element.
We say that f : C/L → C is analytic at α if the map
$$\tilde{f} : F^0_{L,\alpha} \to \mathbb{C}, \quad \tilde{f}(z) = f(z \bmod L)$$
is analytic at α.
When we discuss maps between elliptic curves (e.g., Proposition
3.1.6), we talk about analytic maps $f : \mathbb{C}/L \to \mathbb{C}/L'$, where L and
L′ are lattices. What does “analytic” mean in this context? How do
we define analyticity? It is simply a matter of choosing correct charts
for each of C/L and C/L′, as we shall see next.
Let $f : \mathbb{C}/L \to \mathbb{C}/L'$ be a continuous map. Let α ∈ C and suppose
that f(α mod L) = β mod L′. Let $F^0_{L,\alpha}$ be the region about α defined
above, and similarly define $F^0_{L',\beta}$. Let ε > 0 be small enough so that
the disc $D_\varepsilon(\beta)$ is completely contained in $F^0_{L',\beta}$. Then, by continuity
of f, there is a δ > 0 such that if |z − α| < δ, then f(z mod L) is inside
$D_\varepsilon(\beta)$. Pick δ small enough so that $D_\delta(\alpha)$ is completely contained
in $F^0_{L,\alpha}$. We are now ready to state our definition: we say that the
continuous map $f : \mathbb{C}/L \to \mathbb{C}/L'$ is analytic at α mod L if the map
$$\tilde{f} : D_\delta(\alpha) \to D_\varepsilon(\beta), \quad \tilde{f}(z) = \psi_{L',\beta}(f(z \bmod L)) \in D_\varepsilon(\beta) \subseteq F_{L',\beta}$$
is analytic at α, where $\psi_{L',\beta} : \mathbb{C}/L' \to F_{L',\beta}$ is the bijection we
defined in Eq. (B.3).

B.7. Exercises
Exercise B.7.1. The goal of this exercise is to prove that C is a field.
(1) Show that any non-zero complex number α = a + bi has a
multiplicative inverse which is also a complex number $\alpha^{-1} = c + di$
with c, d ∈ R.
(2) Convince yourself that C is a field; i.e., justify why C satisfies
each of the field axioms.
Exercise B.7.2. Let α be a complex number. Show that α ∈ R if
and only if $\overline{\alpha} = \alpha$.
Exercise B.7.3. Show that f(z) = z is analytic on C. Also, show
that $g_n(z) = z^n$ is analytic on C for every n ≥ 1 and that the derivative
is $g_n'(z) = n z^{n-1}$.
Exercise B.7.4. Let f(z) be a complex-valued function that is analytic
in a region U ⊆ C.
(1) Show that f is also continuous at every point of U (i.e.,
$\lim_{h \to a} f(h) = f(a)$ for every a ∈ U).
(2) Let f(z) = u(z) + v(z)i, where u(z) and v(z) are real-valued.
Show that u and v are continuous on U.
Exercise B.7.5. Let f(z) be an analytic function on a region U,
and write $\Re(f(z)) = u(z)$, $\Im(f(z)) = v(z)$ for the real and imaginary
parts of f(z), respectively. We define the Laplacians of u and v by
$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \quad \text{and} \quad \Delta v = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}.$$
Show that Δu = Δv = 0. (Hint: use the Cauchy-Riemann differential
equations, i.e., Eq. (B.2).)
Exercise B.7.6. Prove the Fundamental Theorem of Algebra B.1.1:
if P(z) is a non-constant polynomial, then there is a root of P in C.
(Hint: suppose that P(z) has no roots in C. Then 1/P(z) would be
analytic on all of C. Show that it is also bounded, and use Liouville’s Theorem B.5.5.)
Appendix C

Projective space

C.1. The projective line

Let us begin with an example. Consider the function f(x) = 1/x. We
know from calculus that f is continuous (and differentiable) at every
point of R except x = 0, where it is not defined. Would it be possible to extend
the real line so that f (x) is continuous everywhere? The answer is
yes, it is possible, and the solution is to glue the “end” of the real line
at ∞ with the other “end” at −∞. We will describe the solution in
detail below. Formally, we need the projective line, which is a line
with points R ∪ {∞}, i.e., a real line plus a single point at infinity
that ties the line together (into a circle).
The formal definition of the projective line is as follows. It may
seem a little confusing at first, but it is fairly easy to work and com-
pute with it. First, we need to define a relation between vectors of
real numbers in the plane. Let a, b, x, y be real numbers such that
neither (x, y) nor (a, b) is the zero vector. We say that (x, y) ∼ (a, b)
if the vector (x, y) is a non-zero multiple of the vector (a, b). In other
words, if we consider (a, b) and (x, y) as points in the plane, we say
that (a, b) ∼ (x, y) if they both lie in one line on the plane that passes
through the origin. Again:
(x, y) ∼ (a, b) if and only if there is λ ∈ R such that x = λa, y = λb.

For instance, $(\sqrt{2}, \sqrt{2}) \sim (1, 1)$. We denote by [x, y] the set of all
vectors (a, b) such that (x, y) ∼ (a, b):
$$[x, y] = \{(a, b) : a, b \in \mathbb{R} \text{ such that } (a, b) \ne (0, 0) \text{ and } (x, y) \sim (a, b)\}.$$
Finally, we define the real projective line by
$$\mathbb{P}^1(\mathbb{R}) = \{[x, y] : x, y \in \mathbb{R} \text{ with } (x, y) \ne (0, 0)\}.$$
If you think about it, $\mathbb{P}^1(\mathbb{R})$ is the set of all lines through the origin
(each class [x, y] consists of all points, except the origin, on the
line that goes through (x, y) and (0, 0)). The important thing to
notice is that if $[x, y] \in \mathbb{P}^1(\mathbb{R})$ and y ≠ 0, then (x, y) ∼ (x/y, 1), so the
class of [x, y] contains a unique representative of the form (a, 1) for
some a = x/y ∈ R. This allows the following decomposition of $\mathbb{P}^1(\mathbb{R})$:
$$\mathbb{P}^1(\mathbb{R}) = \{[x, 1] : x \in \mathbb{R}\} \cup \{[1, 0]\}.$$
The points of the set {[x, 1]} are in bijection with R and, therefore, form
a real line. The point [1, 0], which is the only point in $\mathbb{P}^1(\mathbb{R})$ that
does not belong to the real line {[x, 1]}, is called the point at infinity
(see Figure 1).

Figure 1. Some points in the projective line, e.g., $[2, 3] \in \mathbb{P}^1(\mathbb{R})$, and their representatives of the form [x, 1], e.g., [2/3, 1],
except for [1, 0].

Notice that when x ∈ R gets large (i.e., x → ∞ or x → −∞),
the point $[x, 1] \in \mathbb{P}^1(\mathbb{R})$ corresponds to a line in the real plane that
is closer and closer to the horizontal line. Since the horizontal line
corresponds to the point $[1, 0] \in \mathbb{P}^1(\mathbb{R})$, we see that as x gets large (in
either the positive or negative direction!), the points [x, 1] get closer
and closer to [1, 0], the point at infinity. This is what we meant at
the beginning of this section by “gluing” both ends of the real line,
∞ and −∞, at one point.
Let us see that, with this definition, the function f : R − {0} → R,
f(x) = 1/x is continuous everywhere when extended to $\mathbb{P}^1(\mathbb{R})$. We
define instead an extended function $F : \mathbb{P}^1(\mathbb{R}) \to \mathbb{P}^1(\mathbb{R})$ by
$$F([x, y]) = [y, x].$$
Notice that a point on the real line of $\mathbb{P}^1$, i.e., a point of the form
[x, 1], is sent to the point [1, x] of $\mathbb{P}^1$, and (1, x) ∼ (1/x, 1) as long as
x ≠ 0. So [x, 1] with x ≠ 0 is sent to [1/x, 1] via F (i.e., the real point x
is sent to 1/x). Hence, F coincides with f on R − {0}. But F is perfectly
well-defined at x = 0, i.e., at the point [0, 1], and F([0, 1]) = [1, 0] so
that [0, 1] is sent to the point at infinity. Moreover, both one-sided limits
coincide:
$$\lim_{x \to 0^+} F([x, 1]) = \lim_{x \to 0^-} F([x, 1]) = F([0, 1]) = [1, 0].$$
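
The decomposition of the projective line and the map F are easy to experiment with. The following Python sketch is an illustration, not part of the original text: it works with rational coordinates (via `fractions.Fraction`) instead of real numbers so that all comparisons are exact, and the name `normalize` is a hypothetical choice.

```python
from fractions import Fraction

def normalize(x, y):
    """Canonical representative of [x, y]: (x/y, 1) if y != 0, else (1, 0)."""
    x, y = Fraction(x), Fraction(y)
    if x == 0 and y == 0:
        raise ValueError("(0, 0) does not define a point of the projective line")
    if y != 0:
        return (x / y, Fraction(1))
    return (Fraction(1), Fraction(0))

def F(point):
    """The extension of f(x) = 1/x to the projective line: [x, y] -> [y, x]."""
    x, y = point
    return normalize(y, x)

# Proportional vectors define the same point.
assert normalize(2, 3) == normalize(4, 6) == (Fraction(2, 3), 1)
# On the real line, F agrees with x -> 1/x.
assert F(normalize(5, 1)) == (Fraction(1, 5), 1)
# F is defined at [0, 1] and sends it to the point at infinity [1, 0].
assert F(normalize(0, 1)) == (1, 0)
# F swaps the point at infinity back to [0, 1].
assert F((1, 0)) == (0, 1)
```

Choosing one canonical representative per class is what makes equality of projective points a simple tuple comparison.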
x→0+ x→0

C.2. The projective plane

We may generalize the construction above of the projective line in
order to construct a projective plane that will consist of a real plane
plus a number of points at infinity, one for each direction in the plane;
i.e., the projective plane will be a real plane plus a projective line of
points at infinity.
Let a, b, c, x, y, z ∈ R be such that neither (a, b, c) nor (x, y, z) is
the zero vector. We say that
(x, y, z) ∼ (a, b, c) if and only if there is λ ∈ R such that x = λa,
y = λb, z = λc.
We also define classes of similar vectors by
$$[x, y, z] = \{(a, b, c) : a, b, c \in \mathbb{R} \text{ such that } (a, b, c) \ne (0, 0, 0) \text{ and } (x, y, z) \sim (a, b, c)\}.$$
Notice that, as before, the class [x, y, z] contains all the points
on the line in $\mathbb{R}^3$ that goes through (x, y, z) and (0, 0, 0), except the
origin. We define the projective plane to be the collection of all such
lines:
$$\mathbb{P}^2(\mathbb{R}) = \{[x, y, z] : x, y, z \in \mathbb{R} \text{ such that } (x, y, z) \ne (0, 0, 0)\}.$$
If z ≠ 0, then (x, y, z) ∼ (x/z, y/z, 1). Thus,
$$\mathbb{P}^2(\mathbb{R}) = \{[x, y, 1] : x, y \in \mathbb{R}\} \cup \{[a, b, 0] : a, b \in \mathbb{R},\ (a, b) \ne (0, 0)\}.$$
The points of the set {[x, y, 1] : x, y ∈ R} are in 1-to-1 correspondence
with the real plane $\mathbb{R}^2$, and the points of the form [a, b, 0] are called
the points at infinity and form a $\mathbb{P}^1(\mathbb{R})$, a projective line.
One interesting consequence of the definitions is that any two
parallel lines in the real plane {[x, y, 1]} intersect at a point at infinity
[a, b, 0]. Indeed, let L : y = mx + b and L′ : y = mx + b′ be distinct
parallel lines in the real plane. If points in the real plane {[x, y, 1]}
correspond to lines in $\mathbb{R}^3$, then lines in the real plane correspond to
planes in $\mathbb{R}^3$:
$$L = \{[x, y, z] : mx - y + bz = 0\}, \quad L' = \{[x, y, z] : mx - y + b'z = 0\}.$$
What is L ∩ L′? The intersection points are those [x, y, z] such that
mx − y + bz = mx − y + b′z = 0, which implies that (b − b′)z = 0.
Since L ≠ L′, we have b ≠ b′ and, therefore, we must have z = 0.
Hence
$$L \cap L' = \{[x, mx, 0] : x \in \mathbb{R},\ x \ne 0\} = \{[1, m, 0]\},$$
and so the intersection consists of a single point at infinity: [1, m, 0].
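
The computation above can be automated. In the sketch below (an illustration, not part of the original text), a projective line Ax + By + Cz = 0 is stored as its coefficient triple (A, B, C), and the intersection of two distinct lines is computed as the cross product of the two triples. This cross-product shortcut is a standard fact of linear algebra that the text does not use; the function names are hypothetical, and rational coefficients stand in for real ones.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def same_point(P, Q):
    """Two non-zero triples define the same projective point
    exactly when they are proportional."""
    return cross(P, Q) == (0, 0, 0)

def line(m, b):
    """Coefficients (A, B, C) of the projective line mx - y + bz = 0."""
    return (Fraction(m), Fraction(-1), Fraction(b))

def intersection(l1, l2):
    """A representative of the unique common point of two distinct lines."""
    P = cross(l1, l2)
    if P == (0, 0, 0):
        raise ValueError("the two lines coincide")
    return P

# Parallel lines y = 2x + 1 and y = 2x + 5 meet at the point at infinity [1, 2, 0].
assert same_point(intersection(line(2, 1), line(2, 5)), (1, 2, 0))
# Non-parallel lines y = x and y = -x + 2 meet at the affine point (1, 1).
assert same_point(intersection(line(1, 0), line(-1, 2)), (1, 1, 1))
```

The same function handles both cases: nothing special has to be done for parallel lines once the plane is completed with its points at infinity.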

C.3. Over an arbitrary ﬁeld

The projective line and plane can be defined over any field. Let K be
a field (e.g., K = Q, R, C or $\mathbb{F}_p$). The usual affine plane (or Euclidean
plane) is defined by
$$\mathbb{A}^2(K) = \{(x, y) : x, y \in K\}.$$
The projective plane over K is defined by
$$\mathbb{P}^2(K) = \{[x, y, z] : x, y, z \in K \text{ such that } (x, y, z) \ne (0, 0, 0)\}.$$
As before, (x, y, z) ∼ (a, b, c) if and only if there is λ ∈ K such that
(x, y, z) = λ · (a, b, c).

C.4. Curves in the projective plane

Let K be a ﬁeld and let C be a curve in aﬃne space, given by a
polynomial in two variables:

C : f (x, y) = 0

for some f(x, y) ∈ K[x, y], e.g., $C : y^2 - x^3 - 1 = 0$. We want to
extend C to a curve in the projective plane $\mathbb{P}^2(K)$. In order to do
this, we consider the points (x, y) in the curve to be points
[x/z, y/z, 1] of $\mathbb{P}^2(K)$. Thus, we have
$$C : \left(\frac{y}{z}\right)^2 - \left(\frac{x}{z}\right)^3 - 1 = 0$$
or, equivalently, $zy^2 - x^3 - z^3 = 0$. Notice that the polynomial
$F(x, y, z) = zy^2 - x^3 - z^3$ is homogeneous in its variables (each monomial
has degree 3) and F(x, y, 1) = f(x, y). The curve in $\mathbb{P}^2(K)$,
given by
$$\widehat{C} : F(x, y, z) = zy^2 - x^3 - z^3 = 0,$$
is the curve we were looking for, which extends our original curve C in
the affine plane. Notice that if the point (x, y) ∈ C, then $[x, y, 1] \in \widehat{C}$.
However, there may be some extra points in $\widehat{C}$ which were not present
in C, namely those points of $\widehat{C}$ at infinity. Recall that the points at
infinity are those with z = 0, so $F(x, y, 0) = -x^3 = 0$ implies that
x = 0 also, and the only point at infinity in $\widehat{C}$ is [0, 1, 0].
In general, if $C \subseteq \mathbb{A}^2(K)$ is given by f(x, y) = 0 and d is the
highest degree of a monomial in f, then $\widehat{C} \subseteq \mathbb{P}^2(K)$ is given by
$$\widehat{C} : F(x, y, z) = 0,$$
where $F(x, y, z) = z^d \cdot f\left(\frac{x}{z}, \frac{y}{z}\right)$. Conversely, if $\widehat{C} : F(x, y, z) = 0$ is
a curve in the projective plane, then C : F(x, y, 1) = 0 is a curve in
the affine plane. In this case, C is the projection of $\widehat{C}$ onto the chart
z = 1; we may also look at other charts, e.g., x = 1, which would
yield a curve C′ : F(1, y, z) = 0.
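
The passage from f to F is purely mechanical, as the following Python sketch suggests. It is an illustration, not part of the original text: a polynomial in x and y is stored as a dictionary mapping exponent pairs (i, j) to coefficients, and the names `homogenize`, `evaluate` and `at_infinity` are hypothetical.

```python
def homogenize(f):
    """Turn {(i, j): c} (terms c x^i y^j) into {(i, j, k): c} (terms c x^i y^j z^k)
    with i + j + k = d, where d is the highest degree of a monomial of f."""
    d = max(i + j for (i, j) in f)
    return {(i, j, d - i - j): c for (i, j), c in f.items()}

def evaluate(F, x, y, z):
    """Value of the homogeneous polynomial F at (x, y, z)."""
    return sum(c * x**i * y**j * z**k for (i, j, k), c in F.items())

def at_infinity(F):
    """The terms of F that survive after setting z = 0."""
    return {(i, j): c for (i, j, k), c in F.items() if k == 0}

# f(x, y) = y^2 - x^3 - 1
f = {(0, 2): 1, (3, 0): -1, (0, 0): -1}
F = homogenize(f)
assert F == {(0, 2, 1): 1, (3, 0, 0): -1, (0, 0, 3): -1}   # z y^2 - x^3 - z^3

# The affine point (2, 3) lies on C, so [2, 3, 1] and its multiples lie on the projective curve.
assert evaluate(F, 2, 3, 1) == 0
assert evaluate(F, 4, 6, 2) == 0
# Setting z = 0 leaves only -x^3, so [0, 1, 0] is the only point at infinity.
assert at_infinity(F) == {(3, 0): -1}
assert evaluate(F, 0, 1, 0) == 0
```

Because F is homogeneous, rescaling (x, y, z) multiplies its value by a power of the scalar, which is why vanishing of F is well-defined on projective points.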
Here is another example. Let C be given by
$$C : y - x^2 = 0$$
so that C is a parabola. Then $\widehat{C}$ is given by
$$\widehat{C} : F(x, y, z) = z^2 f\left(\frac{x}{z}, \frac{y}{z}\right) = zy - x^2 = 0.$$
The curve $\widehat{C}$ has a unique point at infinity, namely [0, 1, 0]. This
means that the two “arms” of the parabola meet at a single point at
infinity. Thus, a parabola has the shape of an ellipse in $\mathbb{P}^2(K)$. How
about hyperbolas? Let
$$C : x^2 - y^2 = 1.$$
Then $\widehat{C} : x^2 - y^2 = z^2$ and there are two points at infinity, namely
[1, 1, 0] and [1, −1, 0]. Thus, the four arms of the hyperbola in the
affine plane meet in two points, and the hyperbola also has the shape
of an ellipse in the projective plane $\mathbb{P}^2(K)$.

C.5. Singular and smooth curves

We say that a projective curve C : F(x, y, z) = 0 is singular at a point
P ∈ C if and only if $\frac{\partial F}{\partial x}(P) = \frac{\partial F}{\partial y}(P) = \frac{\partial F}{\partial z}(P) = 0$. In other words,
C is singular at P if the tangent vector at P vanishes. Otherwise, we
say that C is non-singular at P. If C is non-singular at every point,
we say that C is a smooth (or non-singular) curve.
For example, $C : zy^2 = x^3$ is singular at P = [0, 0, 1] because
$F(x, y, z) = zy^2 - x^3$ and
$$\frac{\partial F}{\partial x} = -3x^2, \quad \frac{\partial F}{\partial y} = 2yz, \quad \frac{\partial F}{\partial z} = y^2.$$
Thus, $\frac{\partial F}{\partial x}(P) = \frac{\partial F}{\partial y}(P) = \frac{\partial F}{\partial z}(P) = 0$ for P = [0, 0, 1].
Here is another example. The curve $D : z^2y^2 = x^4 + z^4$, with $F(x, y, z) = z^2y^2 - x^4 - z^4$, has
partial derivatives
$$\frac{\partial F}{\partial x} = -4x^3, \quad \frac{\partial F}{\partial y} = 2yz^2, \quad \frac{\partial F}{\partial z} = 2y^2z - 4z^3.$$
Thus, if P = [x, y, z] ∈ D(Q) is singular, then
$$-4x^3 = 0, \quad 2yz^2 = 0, \quad \text{and} \quad 2y^2z - 4z^3 = 0.$$
Figure 2. The chart {[x, y, 1]} of the curve $zy^2 = x^3$.

The first two equalities imply that x = 0 and yz = 0 (what would
happen if we were working over a field of characteristic 2, such as $\mathbb{F}_2$?).
If y = 0, then z = 0 by the third equation, but [0, 0, 0] is not a well-defined
point in $\mathbb{P}^2(\mathbb{Q})$, so this is impossible. However, if x = z = 0,
then all three equations hold for any value of y. Hence, P = [0, 1, 0] is a singular point.
Notice that the affine curve that corresponds to the chart z = 1 of D,
given by $y^2 = x^4 + 1$, is non-singular at all points in the affine plane
but is singular at a point at infinity, namely P = [0, 1, 0].
An elliptic curve of the form $E : y^2 = x^3 + Ax + B$, or in projective
coordinates given by $zy^2 = x^3 + Axz^2 + Bz^3$, is non-singular if and
only if $4A^3 + 27B^2 \ne 0$. The quantity $\Delta = -16 \cdot (4A^3 + 27B^2)$ is
called the discriminant of E.
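
The discriminant criterion and the definition of a singular point can be compared on small examples. The sketch below is an illustration over the integers (so it silently assumes characteristic 0), not part of the original text; the function names are hypothetical, and the partial derivatives are those of $F = zy^2 - x^3 - Axz^2 - Bz^3$ computed by hand.

```python
def discriminant(A, B):
    """Discriminant -16 (4A^3 + 27B^2) of y^2 = x^3 + Ax + B."""
    return -16 * (4 * A**3 + 27 * B**2)

def gradient(A, B, point):
    """Partial derivatives of F = z y^2 - x^3 - A x z^2 - B z^3 at (x, y, z)."""
    x, y, z = point
    return (-3 * x**2 - A * z**2,
            2 * y * z,
            y**2 - 2 * A * x * z - 3 * B * z**2)

# y^2 = x^3 + 1 is non-singular.
assert discriminant(0, 1) == -432

# y^2 = x^3 - 3x + 2 = (x - 1)^2 (x + 2) has discriminant 0,
# and the gradient vanishes at the point [1, 0, 1] of the curve.
assert discriminant(-3, 2) == 0
assert gradient(-3, 2, (1, 0, 1)) == (0, 0, 0)

# y^2 = x^3 is singular at [0, 0, 1], as computed in the text.
assert discriminant(0, 0) == 0
assert gradient(0, 0, (0, 0, 1)) == (0, 0, 0)

# The point at infinity [0, 1, 0] is never singular on a curve of this form.
assert gradient(-3, 2, (0, 1, 0)) == (0, 0, 1)
```

The last assertion shows that, in contrast with the quartic D above, the point at infinity of a cubic in this form causes no trouble.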
Figure 3. The chart {[x, y, 1]} of the curve $z^2y^2 = x^4 + z^4$ (above, non-singular) and the chart {[x, 1, z]} (below, the
curve is singular).
Appendix D

The p-adic numbers

In this appendix we brieﬂy introduce the p-adic integers Zp and the

p-adic numbers Qp . We strongly recommend [Gou97] to learn more
about the p-adics.
Let p ≥ 2 be a prime. The p-adic numbers may be thought of as
a generalization of Z/pZ. The main diﬀerence is that the p-adic num-
bers form a ring of characteristic zero, while Z/pZ has characteristic
p. In Z/pZ we only consider congruences modulo p, while in $\mathbb{Z}_p$ we
consider congruences modulo $p^n$ for all n > 0. The p-adic integers,
denoted by $\mathbb{Z}_p$, are defined as follows:

$$\mathbb{Z}_p = \{(a_1, a_2, \ldots) : a_n \in \mathbb{Z}/p^n\mathbb{Z} \text{ such that } a_{n+1} \equiv a_n \bmod p^n\}.$$

In other words, a p-adic integer is an infinite vector $(a_n)_{n=1}^{\infty}$ such that
the nth coordinate belongs to $\mathbb{Z}/p^n\mathbb{Z}$ and the sequence is coherent
under congruences; i.e., $a_{n+1} \in \mathbb{Z}/p^{n+1}\mathbb{Z}$ reduces to the previous
term $a_n$ modulo $p^n$. For instance,

(2, 2, 29, 29, 272, 758, . . .)

are the ﬁrst few terms of a 3-adic integer; notice that all the coordi-
nates are coherent with the previous terms under congruences modulo
powers of 3. The vector (2, 2, 2, 2, . . .) is another element of Z3 (which
we will denote simply by 2).

The p-adic integers have addition and multiplication operations,
defined coordinate-by-coordinate:
$$(a_n)_{n=1}^{\infty} + (b_n)_{n=1}^{\infty} = (a_n + b_n \bmod p^n)_{n=1}^{\infty},$$
and
$$(a_n)_{n=1}^{\infty} \cdot (b_n)_{n=1}^{\infty} = (a_n \cdot b_n \bmod p^n)_{n=1}^{\infty}.$$
The reader should check that the sum and product of two coherent
vectors are also coherent under congruences and, therefore, new elements
of $\mathbb{Z}_p$. These operations make $\mathbb{Z}_p$ a commutative ring with identity
element 1 = (1, 1, 1, 1, . . .) and zero element 0 = (0, 0, 0, 0, . . .).
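
The coherence condition and the coordinate-wise operations can be tried out on truncated vectors. The following Python sketch is an illustration, not part of the original text: a p-adic integer is represented only by its first few coordinates, stored as a list, and the function names are hypothetical.

```python
def is_coherent(coords, p):
    """Check a_{n+1} ≡ a_n mod p^n for the coordinates (a_1, a_2, ...)."""
    return all((coords[n] - coords[n - 1]) % p**n == 0
               for n in range(1, len(coords)))

def add(a, b, p):
    """Coordinate-wise sum, with the n-th coordinate reduced modulo p^n."""
    return [(x + y) % p**n for n, (x, y) in enumerate(zip(a, b), start=1)]

def mul(a, b, p):
    """Coordinate-wise product, with the n-th coordinate reduced modulo p^n."""
    return [(x * y) % p**n for n, (x, y) in enumerate(zip(a, b), start=1)]

alpha = [2, 2, 29, 29, 272, 758]   # the 3-adic example from the text
two = [2, 2, 2, 2, 2, 2]           # the element 2 of Z_3

assert is_coherent(alpha, 3) and is_coherent(two, 3)
assert not is_coherent([2, 3], 3)  # 3 does not reduce to 2 modulo 3

# Sums and products of coherent vectors are again coherent.
assert add(alpha, two, 3) == [1, 4, 4, 31, 31, 31]
assert mul(alpha, two, 3) == [1, 4, 4, 58, 58, 58]
assert is_coherent(add(alpha, two, 3), 3)
assert is_coherent(mul(alpha, two, 3), 3)
```

Only finitely many coordinates are stored, so the sketch checks the claims up to a fixed precision rather than proving them.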
For any prime p ≥ 2, the p-adic integers contain a copy of Z,
where the integer m is represented by the element
$$m = (m \bmod p,\ m \bmod p^2,\ m \bmod p^3, \ldots).$$
For example, the number 200 in $\mathbb{Z}_3$ is given by
$$200 = (2, 2, 11, 38, 200, 200, 200, 200, 200, 200, \ldots).$$
Thus, we may write $\mathbb{Z} \subseteq \mathbb{Z}_p$ (see Exercise D.2.1). However, there are
elements in $\mathbb{Z}_p$ that are not in Z, so $\mathbb{Z} \subsetneq \mathbb{Z}_p$. Here is an example for
p = 7: we are going to show that $\mathbb{Z}_7$, unlike Z, contains an element
whose square is 2 (which we will denote by “$\sqrt{2}$”). Indeed, 2 is a
quadratic residue in Z/7Z, and 2 has two square roots, namely 3 and
4 modulo 7. A standard theorem of number theory shows that, hence,
2 is in fact a quadratic residue modulo $7^n$ for all n ≥ 1. Thus, there
exist integers $a_n$ such that $a_n^2 \equiv 2 \bmod p^n$ for all n ≥ 1. Moreover, it
can also be shown that, if $a_n$ is chosen, then there is $a_{n+1} \bmod p^{n+1}$
with $a_{n+1}^2 \equiv 2 \bmod p^{n+1}$ and $a_{n+1} \equiv a_n \bmod p^n$ (we say that $a_n$ can
be lifted to $\mathbb{Z}/p^{n+1}\mathbb{Z}$; see Exercise D.2.2). Indeed, here are the first
few coordinates of an element α of $\mathbb{Z}_7$ such that $\alpha^2 = (2, 2, 2, \ldots)$:
$$\alpha = (3, 10, 108, 2166, 4567, \ldots).$$
Thus, α should be regarded as “$\sqrt{2}$” inside $\mathbb{Z}_7$, and −α is another
square root of 2.
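
The coordinates of α can be reproduced by lifting one step at a time. The sketch below is an illustration, not part of the original text: given $a_n$, it simply tries the p candidates $a_n + x p^n$ with 0 ≤ x < p and keeps the one whose square is congruent to b modulo $p^{n+1}$. The function names are hypothetical, and the search is a brute-force stand-in for the argument requested in Exercise D.2.2.

```python
def lift_square_root(a_n, b, p, n):
    """Given a_n with a_n^2 ≡ b mod p^n, return a_{n+1} = a_n + x p^n
    (0 <= x < p) with a_{n+1}^2 ≡ b mod p^(n+1)."""
    for x in range(p):
        candidate = a_n + x * p**n
        if (candidate * candidate - b) % p**(n + 1) == 0:
            return candidate
    raise ValueError("no lift exists")

def sqrt_coordinates(a_1, b, p, count):
    """First `count` coordinates of the square root of b in Z_p that reduces to a_1 mod p."""
    coords = [a_1 % p]
    for n in range(1, count):
        coords.append(lift_square_root(coords[-1], b, p, n))
    return coords

alpha = sqrt_coordinates(3, 2, 7, 5)
assert alpha == [3, 10, 108, 2166, 4567]       # the coordinates given in the text

# Starting from the other square root of 2 modulo 7 gives -alpha.
minus_alpha = sqrt_coordinates(4, 2, 7, 5)
assert all((a + m) % 7**n == 0
           for n, (a, m) in enumerate(zip(alpha, minus_alpha), start=1))
```

Each step inspects only p candidates, so the cost grows linearly with the number of coordinates requested.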
The usual integers, Z, are not a field because not every element
has a multiplicative inverse (only ±1 have inverses!). Similarly, the
p-adic integers $\mathbb{Z}_p$ do not form a field either; e.g., p = (p, p, p, . . .)
is not invertible in $\mathbb{Z}_p$, but many elements of $\mathbb{Z}_p$ are invertible. For
instance, if p > 2, then 2 is invertible in $\mathbb{Z}_p$ (in other words, there is
a number $\frac{1}{2} \in \mathbb{Z}_p$). Indeed, the inverse of 2 is given by
$$\frac{1}{2} = \left(\frac{1 + p}{2}, \frac{1 + p^2}{2}, \ldots, \frac{1 + p^n}{2}, \ldots\right).$$
For example, in $\mathbb{Z}_5$, the inverse of 2 is given by (3, 13, 63, 313, . . .).
It is easy to see that if $\alpha = (a_n)_{n=1}^{\infty}$ with $a_1 \not\equiv 0 \bmod p$, then α is
invertible in $\mathbb{Z}_p$. If $a_1 \equiv 0 \bmod p$, then α is not invertible. Moreover,
for any non-zero $\alpha \in \mathbb{Z}_p$ there is an r ≥ 0 such that $\alpha = p^r \beta$, where $\beta \in \mathbb{Z}_p$ is
invertible.
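
For an ordinary integer a that is not divisible by p, the coordinates of its inverse in $\mathbb{Z}_p$ are just the inverses of a modulo $p^n$. A minimal Python sketch, not part of the original text, follows; it assumes Python 3.8 or later, where the built-in `pow` accepts the exponent −1 together with a modulus, and the function name is hypothetical.

```python
def inverse_coordinates(a, p, count):
    """First `count` coordinates of 1/a in Z_p, for an integer a with p not dividing a."""
    if a % p == 0:
        raise ValueError("a is divisible by p, so it is not invertible in Z_p")
    return [pow(a, -1, p**n) for n in range(1, count + 1)]

half = inverse_coordinates(2, 5, 4)
assert half == [3, 13, 63, 313]                                   # as in the text
assert half == [(1 + 5**n) // 2 for n in range(1, 5)]             # the formula (1 + p^n)/2
assert all((2 * h) % 5**n == 1 for n, h in enumerate(half, start=1))
```

The resulting vector is automatically coherent, because an inverse modulo $p^{n+1}$ reduces to the inverse modulo $p^n$.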
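Even though $\mathbb{Z}_p$ is not a field, we can embed $\mathbb{Z}_p$ in a field in the
same way that Z sits inside Q. We define the field of p-adic numbers
by
$$\mathbb{Q}_p = \left\{\frac{\alpha}{p^k} : k \ge 0 \text{ and } \alpha \in \mathbb{Z}_p\right\}.$$
Thus, every non-zero element $\alpha \in \mathbb{Q}_p$ can be written as $\alpha = p^r \beta$ with r ∈ Z
and an invertible $\beta \in \mathbb{Z}_p^{\times}$.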
D.1. Hensel’s lemma

The following results are used to show the existence of a solution to
polynomial equations over local fields. Here we will only discuss the
application to the p-adics, Qp (which is an example of a local field).
Notice the similarities with Newton’s method.
Theorem D.1.1 (Hensel’s Lemma). Let p ≥ 2 be a prime, let $\mathbb{Q}_p$ be the field of
p-adic numbers and let $\mathbb{Z}_p$ be the p-adic integers. Let $\nu_p$ be the usual
p-adic valuation (i.e., $\nu_p(p^e n) = e$ if n ∈ Z and gcd(n, p) = 1). Let
f(x) be a polynomial with coefficients in $\mathbb{Z}_p$ and suppose there exists
$\alpha_0 \in \mathbb{Z}_p$ such that
$$\nu_p(f(\alpha_0)) > \nu_p(f'(\alpha_0)^2).$$
Then there exists a root $\alpha \in \mathbb{Q}_p$ of f(x). Moreover, the sequence
$$\alpha_{i+1} = \alpha_i - \frac{f(\alpha_i)}{f'(\alpha_i)}$$
converges to α. Furthermore,
$$\nu_p(\alpha - \alpha_0) \ge \nu_p\left(\frac{f(\alpha_0)}{f'(\alpha_0)}\right) > 0.$$
Corollary D.1.2 (Trivial case of Hensel’s lemma). Let p ≥ 2, and
let $\mathbb{Z}_p$ and $\mathbb{Q}_p$ be as before. Let f(x) be a polynomial with coefficients
in $\mathbb{Z}_p$ and suppose there exists $\alpha_0 \in \mathbb{Z}_p$ such that

$$f(\alpha_0) \equiv 0 \bmod p, \quad f'(\alpha_0) \not\equiv 0 \bmod p.$$

Then there exists a root $\alpha \in \mathbb{Q}_p$ of f(x), i.e., f(α) = 0.

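Example D.1.3. Let p be a prime number greater than 2. Are there
solutions to $x^2 + 7 = 0$ in the field $\mathbb{Q}_p$? If there are, $-7$ must be a
quadratic residue modulo p. Thus, let p be a prime such that
$$\left( \frac{-7}{p} \right) = 1,$$
where $\left( \frac{\cdot}{p} \right)$ is Legendre’s quadratic residue symbol. Hence, there exists
$\alpha_0 \in \mathbb{Z}$ such that $\alpha_0^2 \equiv -7 \bmod p$. We claim that $x^2 + 7 = 0$ has
a solution in $\mathbb{Q}_p$ if and only if $-7$ is a quadratic residue modulo
p. Indeed, if we let $f(x) = x^2 + 7$ (so $f'(x) = 2x$), the element
$\alpha_0 \in \mathbb{Z}_p$ satisfies the conditions of the (trivial case of) Hensel’s lemma:
$f(\alpha_0) \equiv 0 \bmod p$, and $f'(\alpha_0) = 2\alpha_0 \not\equiv 0 \bmod p$ because p is odd and
p does not divide $\alpha_0$.
Therefore, there exists a root $\alpha \in \mathbb{Q}_p$ of $x^2 + 7 = 0$.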
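The lifting behind this example can be sketched numerically. The code below is an illustration under stated assumptions: the prime p = 11 is an arbitrary sample choice, the helper names `is_quadratic_residue` and `lift_root` are hypothetical, and the Legendre symbol is evaluated with Euler's criterion. Each Newton step is carried out modulo $p^n$, producing the first coordinates of a root of $x^2 + 7$ in $\mathbb{Z}_p$.

```python
def is_quadratic_residue(b, p):
    """Euler's criterion for an odd prime p: b is a nonzero square mod p iff b^((p-1)/2) = 1 mod p."""
    return pow(b % p, (p - 1) // 2, p) == 1


def lift_root(p, a1, n_max):
    """Lift a root a1 of f(x) = x^2 + 7 mod p to roots mod p^n for n = 1..n_max."""
    f = lambda x: x * x + 7
    df = lambda x: 2 * x
    if f(a1) % p != 0 or df(a1) % p == 0:
        raise ValueError("a1 does not satisfy the hypotheses of Corollary D.1.2")
    coords = [a1 % p]
    for n in range(2, n_max + 1):
        m = p**n
        a = coords[-1]
        # Newton step a - f(a)/f'(a), computed in Z/p^n (f'(a) is a unit there)
        coords.append((a - f(a) * pow(df(a), -1, m)) % m)
    return coords


p = 11  # sample prime, chosen only for illustration
assert is_quadratic_residue(-7, p)
a1 = next(x for x in range(1, p) if (x * x + 7) % p == 0)  # brute-force root mod p
coords = lift_root(p, a1, 5)
print(coords)
```

For a prime such as 3 or 5, `is_quadratic_residue(-7, p)` returns `False`, so there is no root to lift and $x^2 + 7 = 0$ has no solution in $\mathbb{Q}_p$.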

Example D.1.4. Let p = 2. Are there any solutions to $x^2 + 7 = 0$
in $\mathbb{Q}_2$? Notice that if we let $f(x) = x^2 + 7$, then $f'(x) = 2x$ and, for
any $\alpha_0 \in \mathbb{Z}_2$, the number $f'(\alpha_0) = 2\alpha_0$ is congruent to 0 modulo 2.
Thus, we cannot use the trivial case of Hensel’s lemma (i.e., Corollary
D.1.2).
Let $\alpha_0 = 1 \in \mathbb{Z}_2$. Notice that $f(1) = 8$ and $f'(1) = 2$. Thus,
$$3 = \nu_2(8) > \nu_2(2^2) = 2$$
and the general case of Hensel’s lemma applies. Hence, there exists a
2-adic solution to $x^2 + 7 = 0$.
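The Newton-type iteration of Theorem D.1.1 can be followed explicitly in this example. The sketch below works with exact rational numbers, viewed inside $\mathbb{Q}_2$, and records the 2-adic valuation of $f(\alpha_i)$ at each step; the helper `nu` is a hypothetical name for the valuation $\nu_p$ on nonzero rationals.

```python
from fractions import Fraction


def nu(p, x):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is not a finite integer")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


f = lambda x: x * x + 7
df = lambda x: 2 * x

alpha = Fraction(1)  # alpha_0 = 1, as in the example
assert nu(2, f(alpha)) > nu(2, df(alpha) ** 2)  # hypothesis of Theorem D.1.1

valuations = []
for _ in range(5):
    valuations.append(nu(2, f(alpha)))
    alpha = alpha - f(alpha) / df(alpha)  # alpha_{i+1} = alpha_i - f(alpha_i)/f'(alpha_i)

print(valuations)  # [3, 4, 6, 10, 18]
```

The valuations of $f(\alpha_i)$ grow without bound, so $f(\alpha_i) \to 0$ in the 2-adic metric, which is the sense in which the sequence approaches a root.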

D.2. Exercises
Exercise D.2.1. Show that if q and t are distinct integers (in $\mathbb{Z}$),
then their representatives in $\mathbb{Z}_p$ for any prime $p \geq 2$, given by
$q = (q \bmod p^n)_{n=1}^{\infty}$ and $t = (t \bmod p^n)_{n=1}^{\infty}$, are also distinct in $\mathbb{Z}_p$.

Exercise D.2.2. Let p > 2 be a prime number.

(1) Let $b \in \mathbb{Z}$ with $\gcd(b, p) = 1$, and let $n \geq 1$. Suppose $a_n \in \mathbb{Z}$
is such that $a_n^2 \equiv b \bmod p^n$. Show that there exists $a_{n+1} \in \mathbb{Z}$
such that $a_{n+1}^2 \equiv b \bmod p^{n+1}$ and $a_{n+1} \equiv a_n \bmod p^n$.
(Hint: write $a_n^2 = b + kp^n$ and consider $f(x) = a_n + xp^n$.
Find x such that $f(x)^2 \equiv b \bmod p^{n+1}$.)
(2) Suppose $a_1^2 \equiv b \bmod p$, where $\gcd(b, p) = 1$. Show that the
vector $\alpha = (a_n)_{n=1}^{\infty}$, defined recursively by
$$a_{n+1} = a_n - \frac{a_n^2 - b}{2a_n} \bmod p^{n+1},$$
is a well-defined element of $\mathbb{Z}_p$ and, moreover, $\alpha^2 = b$, i.e.,
$$\alpha^2 = (b \bmod p,\ b \bmod p^2,\ b \bmod p^3, \ldots),$$
so $\alpha$ is a square root of b.
Exercise D.2.3. Find the first 4 coordinates of the 5-adic expansion
of 13 in Z5 .
Exercise D.2.4. Find the first 4 coordinates of the 5-adic expansions
of $\pm\sqrt{6}$ in $\mathbb{Z}_5$; i.e., find the first 4 coordinates of $\alpha$ and $-\alpha$ such that
$\alpha^2 = 6$ in $\mathbb{Z}_5$.
Appendix E

Parametrization of
torsion structures

In this appendix we provide one-parameter infinite families of elliptic
curves with all the possible torsion subgroups that may occur for elliptic
curves over $\mathbb{Q}$. The main reference for this appendix is [Kub76],
Table 3, p. 217.
In each table below, Figure 1 and Figure 2, we provide elliptic
curves $E_{a,b}$ whose equations depend on two rational parameters
$a, b \in \mathbb{Q}$, and such that the torsion subgroup $E_{a,b}(\mathbb{Q})_{\mathrm{tors}}$ has a given
subgroup G; i.e., the full torsion subgroup contains G as a subgroup,
but may be larger in certain cases (see Example E.1.1 below).
The families that appear in Figure 1 depend on two independent
parameters a, b, and the only condition that needs to be satisfied is
that the discriminant $\Delta_{a,b}$ of $E_{a,b}$ must be non-zero. This condition
on the discriminant is given in the second column of the table.
The families that appear in Figure 2 depend on one single rational
parameter $t \in \mathbb{Q}$, and a and b are rational functions in the variable t.
The curves $E_{a,b}$ that appear in this table are all of the form
$$E_{a,b} : y^2 + (1 - a)xy - by = x^3 - bx^2.$$
The point (0, 0) is a torsion point of the maximal order in the group.
The discriminant $\Delta_{a,b}$ of $E_{a,b}$ is always assumed to be non-zero.


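$E_{a,b}/\mathbb{Q}$              $\Delta_{a,b} \neq 0$              $G$
$y^2 = x^3 + ax^2 + bx$           $a^2b^2 - 4b^3 \neq 0$             $\mathbb{Z}/2\mathbb{Z}$
$y^2 + axy + by = x^3$            $a^3b^3 - 27b^4 \neq 0$            $\mathbb{Z}/3\mathbb{Z}$
$y^2 = x(x + a)(x + b)$           $0 \neq a \neq b \neq 0$           $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$

Figure 1. Two-parameter families of elliptic curves $E_{a,b}/\mathbb{Q}$
such that $E_{a,b}(\mathbb{Q})_{\mathrm{tors}}$ has a subgroup G.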

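Curves of the form $E_{a,b} : y^2 + (1 - a)xy - by = x^3 - bx^2$

$a$                                                     $b$                                                $G$
$a = 0$                                                 $b = t$                                            $\mathbb{Z}/4\mathbb{Z}$
$a = t$                                                 $b = t$                                            $\mathbb{Z}/5\mathbb{Z}$
$a = t$                                                 $b = t + t^2$                                      $\mathbb{Z}/6\mathbb{Z}$
$a = t^2 - t$                                           $b = t^3 - t^2$                                    $\mathbb{Z}/7\mathbb{Z}$
$a = \frac{(2t-1)(t-1)}{t}$                             $b = (2t-1)(t-1)$                                  $\mathbb{Z}/8\mathbb{Z}$
$a = t^2(t-1)$                                          $b = t^2(t-1)(t^2-t+1)$                            $\mathbb{Z}/9\mathbb{Z}$
$a = -\frac{t(t-1)(2t-1)}{t^2-3t+1}$                    $b = \frac{t^3(t-1)(2t-1)}{(t^2-3t+1)^2}$          $\mathbb{Z}/10\mathbb{Z}$
$a = \frac{t(1-2t)(3t^2-3t+1)}{(t-1)^3}$                $b = -a \cdot \frac{2t^2-2t+1}{t-1}$               $\mathbb{Z}/12\mathbb{Z}$
$a = 0$                                                 $b = t^2 - \frac{1}{16}$                           $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z}$
$a = \frac{10-2t}{t^2-9}$                               $b = \frac{-2(t-1)^2(t-5)}{(t^2-9)^2}$             $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/6\mathbb{Z}$
$a = \frac{(2t+1)(8t^2+4t+1)}{2(4t+1)(8t^2-1)t}$        $b = \frac{(2t+1)(8t^2+4t+1)}{(8t^2-1)^2}$         $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/8\mathbb{Z}$

Figure 2. One-parameter families of elliptic curves $E_{a,b}/\mathbb{Q}$
such that $E_{a,b}(\mathbb{Q})_{\mathrm{tors}}$ has a subgroup G.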
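The orders in Figure 2 can be spot-checked with exact rational arithmetic. The following is a minimal sketch, not part of [Kub76]: it implements the usual chord-and-tangent addition for a general Weierstrass equation and computes the order of (0, 0) on $E_{a,b}$ at the sample value t = 2, which is an arbitrary choice. The names `add`, `order_of_origin` and `FAMILIES` are hypothetical. The $\mathbb{Z}/10\mathbb{Z}$ row is entered as $a = -t(t-1)(2t-1)/(t^2-3t+1)$, the sign for which (0, 0) has order 10. For the non-cyclic groups the point (0, 0) has order 4, 6 and 8, respectively.

```python
from fractions import Fraction as F


def add(P, Q, inv):
    """Add P and Q on y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 (None = point at infinity)."""
    a1, a2, a3, a4, a6 = inv
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 + y2 + a1 * x2 + a3 == 0:  # Q = -P
            return None
        lam = (3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) / (2 * y1 + a1 * x1 + a3)  # tangent
    else:
        lam = (y2 - y1) / (x2 - x1)  # chord
    x3 = lam * lam + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - (y1 - lam * x1) - a3
    return (x3, y3)


def order_of_origin(a, b, bound=12):
    """Order of (0, 0) on E_{a,b}: y^2 + (1-a)xy - by = x^3 - bx^2, or None if it exceeds bound."""
    inv = (1 - a, -b, -b, F(0), F(0))
    P = (F(0), F(0))
    Q, n = P, 1
    while Q is not None:
        if n >= bound:
            return None
        Q, n = add(Q, P, inv), n + 1
    return n


def z10(t):
    d = t * t - 3 * t + 1
    return (-t * (t - 1) * (2 * t - 1) / d, t**3 * (t - 1) * (2 * t - 1) / d**2)


def z12(t):
    a = t * (1 - 2 * t) * (3 * t * t - 3 * t + 1) / (t - 1) ** 3
    return (a, -a * (2 * t * t - 2 * t + 1) / (t - 1))


def z2z8(t):
    num = (2 * t + 1) * (8 * t * t + 4 * t + 1)
    return (num / (2 * (4 * t + 1) * (8 * t * t - 1) * t), num / (8 * t * t - 1) ** 2)


# G -> (a(t), b(t)), transcribed from Figure 2
FAMILIES = {
    "Z/4Z": lambda t: (F(0), t),
    "Z/5Z": lambda t: (t, t),
    "Z/6Z": lambda t: (t, t + t * t),
    "Z/7Z": lambda t: (t * t - t, t**3 - t * t),
    "Z/8Z": lambda t: ((2 * t - 1) * (t - 1) / t, (2 * t - 1) * (t - 1)),
    "Z/9Z": lambda t: (t * t * (t - 1), t * t * (t - 1) * (t * t - t + 1)),
    "Z/10Z": z10,
    "Z/12Z": z12,
    "Z/2Z+Z/4Z": lambda t: (F(0), t * t - F(1, 16)),
    "Z/2Z+Z/6Z": lambda t: ((10 - 2 * t) / (t * t - 9), -2 * (t - 1) ** 2 * (t - 5) / (t * t - 9) ** 2),
    "Z/2Z+Z/8Z": z2z8,
}

t = F(2)  # sample parameter value
orders = {G: order_of_origin(*ab(t)) for G, ab in FAMILIES.items()}
for G, n in orders.items():
    print(G, n)
```

A check at a single value of t is only a sanity test of the transcription, not a proof that the order is the stated one for every admissible t.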

Example E.1.1. For each $t \in \mathbb{Q}$, according to Figure 2, the torsion
subgroup of the elliptic curve $E_{t,t} : y^2 + (1 - t)xy - ty = x^3 - tx^2$
contains $G = \mathbb{Z}/5\mathbb{Z}$ as a subgroup, as long as the discriminant
$\Delta_{t,t} = t^5(t^2 - 11t - 1)$ is non-zero (since $t^2 - 11t - 1$ has no rational
roots, $\Delta_{t,t} = 0$ if and only if $t = 0$). In
other words, the point (0, 0) of $E_{t,t}$ is a torsion point of order 5.
Notice, however, that this does not imply that the torsion subgroup
of $E_{t,t}(\mathbb{Q})$ is identical to $\mathbb{Z}/5\mathbb{Z}$. For instance, let t = 12. The
torsion subgroup of the elliptic curve
$$E_{12,12} : y^2 - 11xy - 12y = x^3 - 12x^2$$
is isomorphic to $\mathbb{Z}/10\mathbb{Z}$. The point (0, 0) is a point of order 5, but the
point (−6, −18) has exact order 10.
Example E.1.2. According to Figure 2, each curve in the family
$$y^2 + (1 + t - t^2)xy + (t^2 - t^3)y = x^3 + (t^2 - t^3)x^2$$
has a torsion point of exact order 7, namely P = (0, 0), as long as
the discriminant $\Delta = t^7(t - 1)^7(t^3 - 8t^2 + 5t + 1)$ is non-zero. For rational t,
the discriminant vanishes only for t = 0 and t = 1. By Mazur’s
Theorem 2.5.2, the only possible torsion subgroup for an elliptic curve
over $\mathbb{Q}$ that contains $\mathbb{Z}/7\mathbb{Z}$ as a subgroup is $\mathbb{Z}/7\mathbb{Z}$ itself. Thus, the
torsion subgroup of each elliptic curve in this family is exactly $\mathbb{Z}/7\mathbb{Z}$.
Similarly, if $E_{a,b}$ is an elliptic curve in one of the families in Figure
2 that correspond to G in the list
$\mathbb{Z}/7\mathbb{Z}$, $\mathbb{Z}/9\mathbb{Z}$, $\mathbb{Z}/10\mathbb{Z}$, $\mathbb{Z}/12\mathbb{Z}$, $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/6\mathbb{Z}$, or $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/8\mathbb{Z}$,
then the torsion subgroup of $E_{a,b}(\mathbb{Q})$ must be exactly G.
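The two discriminant formulas quoted in Examples E.1.1 and E.1.2 can be checked against the standard expression for the discriminant of a general Weierstrass equation in terms of the quantities $b_2, b_4, b_6, b_8$. The sketch below does this at a handful of rational sample values; the function names `discriminant` and `disc_tate` are hypothetical, and agreement at sample points is a sanity check rather than a proof of the polynomial identities.

```python
from fractions import Fraction as F


def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def disc_tate(a, b):
    """Discriminant of E_{a,b}: y^2 + (1 - a)xy - by = x^3 - bx^2."""
    return discriminant(1 - a, -b, -b, 0, 0)


samples = [F(2), F(3), F(-1), F(12), F(1, 2), F(-5, 3)]

# Example E.1.1: a = b = t
checks_z5 = [disc_tate(t, t) == t**5 * (t * t - 11 * t - 1) for t in samples]

# Example E.1.2: a = t^2 - t, b = t^3 - t^2
checks_z7 = [
    disc_tate(t * t - t, t**3 - t * t) == t**7 * (t - 1) ** 7 * (t**3 - 8 * t * t + 5 * t + 1)
    for t in samples
]

print(all(checks_z5), all(checks_z7))
print(disc_tate(F(12), F(12)))  # discriminant of E_{12,12}
```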
Bibliography

[Ahl79] Lars Ahlfors, Complex Analysis, McGraw-Hill
Science/Engineering/Math; 3rd edition (1979).
[AL70] A. O. L. Atkin and J. Lehner, Hecke operators on Γ0 (m), Math.
Ann. 185 (1970), 134-160.
[Bak90] Alan Baker, Transcendental Number Theory, Cambridge Uni-
versity Press, Cambridge, 1990.
[BSD63] B. Birch and H. P. F. Swinnerton-Dyer, Notes on elliptic curves
(I) and (II), J. Reine Angew. Math. 212 (1963), pp. 7-25, and
218 (1965), pp. 79-108.
[BCDT01] Christophe Breuil, Brian Conrad, Fred Diamond, and Richard
Taylor, On the modularity of elliptic curves over Q: Wild 3-
adic exercises, Journal of the American Mathematical Society
14 (2001), pp. 843-939.
[Cha06] Jasbir S. Chahal, Congruent numbers and elliptic curves, Amer.
Math. Monthly 113 (2006), no. 4, pp. 308-317.
[Chi95] Lindsay N. Childs, A Concrete Introduction to Higher Algebra,
Springer, New York, 1995.
[Con08] Keith Conrad, The congruent number problem, available at
http://www.math.uconn.edu/~kconrad/blurbs/ugradnumthy/congnumber.pdf
[CSS00] Gary Cornell, Joseph H. Silverman and Glenn Stevens (Edi-
tors), Modular Forms and Fermat’s Last Theorem, Springer,
2000.
[Cre97] John Cremona, Algorithms for Modular Elliptic Curves, Cam-
bridge University Press, 1997 (available for free online).


[DS05] Fred Diamond and Jerry Shurman, A First Course in Modular
Forms, Springer, New York, 2005.
[Dic05] Leonard E. Dickson, History of the Theory of Numbers, Volume
II: Diophantine Analysis, Dover Publications, 2005.
[Dok04] Tim Dokchitser, Computing special values of motivic L-
functions, Exper. Math. 13, No. 2 (2004), 137-149.
[Dou98] Darrin Doud, A procedure to calculate torsion of elliptic curves
over Q, Manuscripta Math. 95 (1998), 463-469.
[Duj09] Andrej Dujella’s website, at
http://web.math.hr/~duje/tors/tors.html
[Fre86] Gerhard Frey, Links between solutions of A−B = C and elliptic
curves. Number theory (Ulm, 1987), 31–62, Lecture Notes in
Math., 1380, Springer, New York, 1989.
[Gou97] Fernando Gouvea, p-adic Numbers: An Introduction, Springer
(Universitext), 2nd edition (1997).
[GJPST09] Grigor Grigorov, Andrei Jorza, Stefan Patrikis, William A.
Stein, and Corina Tarnita, Computational verification of the
Birch and Swinnerton-Dyer conjecture for individual elliptic
curves, Math. Comp. 78 (2009), 2397-2425.
[Kob93] Neal I. Koblitz, Introduction to Elliptic Curves and Modular
Forms, Second Edition, Springer-Verlag, New York, 1993.
[Kub76] Daniel S. Kubert, Universal bounds on the torsion of elliptic
curves, Proc. London Math. Soc. (3), 33, 1976, p. 193-237.
[Lan83] Serge Lang, Conjectured Diophantine estimates on elliptic
curves, Progress in Math. 35, Birkhäuser, 1983.
[Li75] Wen-Ching Winnie Li, Newforms and functional equations,
Math. Ann. 212 (1975), 285-315.
[Loz05] Álvaro Lozano-Robledo, Buscando puntos racionales en curvas
elípticas: Métodos explícitos, La Gaceta de la Real Sociedad
Matematica Española (J. of the Royal Mathematical Society of
Spain), Vol. 8 (2005), n. 2, pp. 471-488.
[Loz08] Julian Aguirre, Álvaro Lozano-Robledo and Juan Carlos
Peral, Elliptic curves of maximal rank, in Revista Matemática
Iberoamericana, proceedings of the conference “Segundas Jor-
nadas de Teoria de Números”.
[Lut37] E. Lutz, Sur l’equation $y^2 = x^3 - Ax - B$ dans les corps p-adic,
J. Reine Angew. Math. 177 (1937), 431-466.
[Mat93] Yuri V. Matiyasevich, Hilbert’s Tenth Problem, MIT Press,
Cambridge, Massachusetts, 1993.
[Maz72] Barry Mazur, Courbes elliptiques et symboles modulaires,
Lecture Notes in Mathematics, Vol. 317, 277-294.
[Maz77] Barry Mazur, Modular curves and the Eisenstein ideal, IHES
Publ. Math. 47 (1977), 33-186.
[Maz78] Barry Mazur, Rational isogenies of prime degree, Invent. Math.
44 (1978), 129-162.
[Mil10] Robert L. Miller, Empirical Evidence for the Birch and
Swinnerton-Dyer Conjecture, Ph.D. Thesis, 2010.
[Mil06] J. S. Milne, Elliptic Curves, Kea Books, 2006.
[Miy06] T. Miyake, Modular Forms, Springer Monographs in Mathe-
matics, New York, 2006.
[Nag35] T. Nagell, Solution de quelque problemes dans la theorie arith-
metique des cubiques planes du premier genre, Wid. Akad.
Skrifter Oslo I, 1935, Nr. 1.
[Rib90] Kenneth A. Ribet, On modular representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$
arising from modular forms. Invent. Math. 100 (1990), no. 2,
431–476.
[RuS02] Karl Rubin and Alice Silverberg, Ranks of Elliptic Curves, Bull.
Amer. Math. Soc. 39, no. 4, pg. 455-474.
[SK52] J. G. Semple and G. T. Kneebone, Algebraic Projective Geom-
etry, Oxford University Press, USA (1952).
[Ser77] J.-P. Serre, A course in arithmetic, Springer-Verlag, New York,
1973.
[Ser87] J.-P. Serre, Sur les représentations modulaires de degré 2 de
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$. (French) [On modular representations of degree 2 of
$\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$] Duke Math. J. 54 (1987), no. 1, 179–230.
[Ser97] J.-P. Serre, Galois Cohomology, Springer-Verlag, New York,
1997.
[ShT67] I. R. Shafarevich and J. Tate, The rank of elliptic curves, AMS
Transl. 8 (1967), 917-920.
[Shi73] Goro Shimura, Introduction to Arithmetic Theory of Automor-
phic Functions, Princeton University Press, 1973.
[Shi02] Goro Shimura, The Representation of Integers as Sums of
Squares, American Journal of Mathematics, Vol. 124, No. 5
(Oct., 2002), pp. 1059-1081.
[Sil86] Joseph H. Silverman, The Arithmetic of Elliptic Curves,
Springer-Verlag, New York, 1986.
[Sil94] Joseph H. Silverman, Advanced Topics in the Arithmetic of El-
liptic Curves, Springer-Verlag, New York, 1994.
[SiT92] Joseph H. Silverman and John Tate, Rational Points on Elliptic
Curves, Springer-Verlag, New York, 1992.
[Ste07] W. Stein, Modular Forms, a computational approach, American
Mathematical Society, 2007.
[Ste08] W. Stein, Elementary Number Theory: Primes, Congruences,
and Secrets: A Computational Approach, Undergraduate Texts
in Mathematics, Springer, New York, 2008.
[Ste75] N. M. Stephens, Congruence properties of congruent numbers,
Bull. London Math. Soc. 7 (1975), pp. 182-184.
[Tat74] J. Tate, The arithmetic of elliptic curves, Invent. Math., 23
(1974), 179-206.
[TW95] Richard Taylor and Andrew Wiles, Ring-theoretic properties of
certain Hecke algebras. Ann. of Math. (2) 141 (1995), no. 3,
553-572.
[Tun83] J. Tunnell, A Classical Diophantine Problem and Modular
Forms of Weight 3/2, Invent. Math. 72 (1983), pp. 323-334.
[Ver05] H. Verrill, Fundamental Domain Drawer Applet
http://www.math.uconn.edu/~alozano/fundomain/
[Was08] L. C. Washington, Elliptic Curves: Number Theory and Cryp-
tography, Second Edition (Discrete Mathematics and Its Appli-
cations), Chapman & Hall/CRC (April 3, 2008).
[Wil95] Andrew Wiles, Modular elliptic curves and Fermat’s last theo-
rem, Ann. of Math. 141 (1995), no. 3, pp. 443-551.