Algebraic Number Theory
a Computational Approach
William Stein
1 Introduction
1.1 Mathematical background
1.2 What is algebraic number theory?
1.2.1 Topics in this book
1.3 Some applications of algebraic number theory
4 Factoring Primes
4.1 The Problem
4.1.1 Geometric Intuition
4.1.2 Examples
4.2 A Method for Factoring Primes that Often Works
4.3 A General Method
4.3.1 Inessential Discriminant Divisors
13 Valuations
13.1 Valuations
13.2 Types of Valuations
13.3 Examples of Valuations
20 Exercises
Preface
This book is based on notes the author created for a one-semester undergraduate
course on Algebraic Number Theory, which the author taught at Harvard during
Spring 2004 and Spring 2005. This book was mainly inspired by [SD01, Ch. 1]
and Cassels’s article Global Fields in [Cas67].
This material is based upon work supported by the National Science Foundation
under Grant No. 0400386.
Chapter 1
Introduction
For example, if you have never worked with finite groups before, you should read
another book first. If you haven’t seen much elementary ring theory, there is still
hope, but you will have to do some additional reading and exercises. We will briefly
review the basics of the Galois theory of number fields.
Some of the homework problems involve using a computer, but there are examples
which you can build on. We will not assume that you have a programming
background or know much about algorithms. Most of the book uses Sage
(http://sagemath.org), which is free open source mathematical software. The
following is an example Sage session:
2 + 2
4
Note that we will not do anything nontrivial with zeta functions or L-functions.
1. Integer factorization using the number field sieve. The number field sieve
is the asymptotically fastest known algorithm for factoring general large
integers (that don’t have too special of a form). On December 12, 2009, the
number field sieve was used to factor the RSA-768 challenge, which is a 232
digit number that is a product of two primes:
rsa768 = 123018668453011775513049495838496272077285356959533479\
219732245215172640050726365751874520219978646938995647494277406384592\
519255732630345373154826850791702612214291346167042921431160222124047\
9274737794080665351419597459856902143413
n = 33478071698956898786044169848212690817704794983713768568912\
431388982883793878002287614711652531743087737814467999489
m = 36746043666799590428244633799627952632279158164343087642676\
032283815739666511279233373417143396810270092798736308917
n * m == rsa768
True
This record integer factorization cracked a certain 768-bit public key
cryptosystem (see http://eprint.iacr.org/2010/006), thus establishing a lower
bound on one’s choice of key size:
2. Primality testing: Agrawal and his students Saxena and Kayal from India
found in 2002 the first ever deterministic polynomial-time (in the number
of digits) primality test. Their methods involve arithmetic in quotients of
(Z/nZ)[x], which are best understood in the context of algebraic number
theory.
For example, Theorem 1.3.1 implies that for any n ≥ 4 and any number
field K, there are only finitely many solutions in K to x^n + y^n = 1.
A major open problem in arithmetic geometry is the Birch and Swinnerton-Dyer
conjecture. An elliptic curve E over a number field K is an algebraic curve
with at least one point with coordinates in K such that the set of complex
points E(C) is a topological torus. The Birch and Swinnerton-Dyer conjecture
gives a criterion for whether or not E(K) is infinite in terms of analytic
properties of the L-function L(E, s). See
http://www.claymath.org/millennium/Birch_and_Swinnerton-Dyer_Conjecture/.
Part I
Chapter 2
Theorem 2.1.2 (Structure Theorem for Finitely Generated Abelian Groups). Let
G be a finitely generated abelian group. Then there is an isomorphism
16 CHAPTER 2. BASIC COMMUTATIVE ALGEBRA
Exercise 2.1.3. Quick! Guess how many abelian groups there are of order less
than 12. Use Theorem 2.1.2 to classify all abelian groups of order less than 12.
How many do you think there are? How many are there?
We will prove the theorem as follows. We first remark that any subgroup of
a finitely generated free abelian group is finitely generated. Then we see how to
represent finitely generated abelian groups as quotients of finite rank free abelian
groups, and how to reinterpret such a presentation in terms of matrices over the
integers. Next we describe how to use row and column operations over the integers
to show that every matrix over the integers is equivalent to one in a canonical
diagonal form, called the Smith normal form. We obtain a proof of the theorem by
reinterpreting the Smith normal form in terms of groups. Finally, we observe that
the representation in the theorem is necessarily unique.
Proposition 2.1.4. If H is a subgroup of a finitely generated abelian group G, then
H is finitely generated.
The key reason that this is true is that G is a finitely generated module over the
principal ideal domain Z. We defer the proof of Proposition 2.1.4 to Section 2.2,
where we will give a complete proof of a beautiful generalization in the context of
Noetherian rings (the Hilbert basis theorem).
Corollary 2.1.5. Suppose G is a finitely generated abelian group. Then there
are finitely generated free abelian groups F1 and F2 and there is a homomorphism
ψ : F2 → F1 such that G ≈ F1 /ψ(F2 ).
Proof. Let x1 , . . . , xm be generators for G. Let F1 = Z^m and let ϕ : F1 → G be the
homomorphism that sends the ith generator (0, 0, . . . , 1, . . . , 0) of Z^m to xi . Then ϕ
is surjective, and by Proposition 2.1.4 the kernel ker(ϕ) of ϕ is a finitely generated
abelian group. Suppose there are n generators for ker(ϕ), let F2 = Z^n and fix a
surjective homomorphism ψ : F2 → ker(ϕ). Then F1 /ψ(F2 ) is isomorphic to G.
$\mathbf{Z}^m \xrightarrow{A} \mathbf{Z}^n \to G \to 0$
The cokernel of the homomorphism defined by A is the quotient of Z^n by the
image of A (i.e., the Z-span of the columns of A), and this cokernel is isomorphic
to G.
The following proposition implies that we may choose bases for F1 and F2
such that the matrix of A only has nonzero entries along the diagonal, so that the
structure of the cokernel of A is trivial to understand.
Proof of Proposition 2.1.6. The matrix P will be a product of matrices that define
elementary row operations and Q will be a product corresponding to elementary
column operations. The elementary row and column operations over Z are as
follows:
1. [Add multiple] Add an integer multiple of one row to another (or a multiple
of one column to another).
Suppose ai1 is a nonzero entry in the first column, with i > 1. Using
the division algorithm, write ai1 = a11 q + r, with 0 ≤ r < |a11 |. Now
add −q times the first row to the ith row. If r > 0, then go to
step 1 (so that an entry with absolute value at most r is the upper
left corner).
If at any point this operation produces a nonzero entry in the matrix with
absolute value smaller than |a11 |, start the process over by permuting rows
and columns to move that entry to the upper left corner of A. Since the
integers |a11 | are a decreasing sequence of positive integers, we will not have
to move an entry to the upper left corner infinitely often, so when this step is
done the upper left entry of the matrix is nonzero, and all other entries in the first
row and column are 0.
2. We may now assume that a11 is the only nonzero entry in the first row and
column. If some entry aij of A is not divisible by a11 , add the column of A
containing aij to the first column, thus producing an entry in the first column
that is nonzero. When we perform step 2, the remainder r will be greater
than 0. Permuting rows and columns results in a smaller |a11 |. Since |a11 | can
only shrink finitely many times, eventually we will get to a point where every
aij is divisible by a11 . If a11 is negative, multiply the first row by −1.
After performing the above operations, the first row and column of A are zero except
for a11 which is positive and divides all other entries of A. We repeat the above
steps for the matrix B obtained from A by deleting the first row and column. The
upper left entry of the resulting matrix will be divisible by a11 , since every entry of
B is. Repeating the argument inductively proves the proposition.
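The reduction described in the proof can be sketched in plain Python. This is only an illustrative sketch, not Sage's smith_form: the function name smith_normal_form is hypothetical, the matrix is a list of integer rows, and the pivot is chosen as the nonzero entry of smallest absolute value, which is one convenient way to organize the steps above. It returns only the diagonal form, not the transformation matrices P and Q.

```python
def smith_normal_form(rows):
    """Return the Smith normal form of an integer matrix (list of rows).

    Only elementary row and column operations over Z are used:
    swaps, adding an integer multiple of one row or column to another,
    and multiplying a row by -1.
    """
    A = [list(r) for r in rows]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            # Nonzero entries of the block A[t:, t:], smallest absolute value first.
            entries = [(abs(A[i][j]), i, j)
                       for i in range(t, m) for j in range(t, n) if A[i][j]]
            if not entries:
                return A  # the remaining block is zero
            _, i, j = min(entries)
            A[t], A[i] = A[i], A[t]              # move the pivot to row t
            for row in A:                        # ... and to column t
                row[t], row[j] = row[j], row[t]
            p = A[t][t]
            # Division algorithm: reduce column t below the pivot.
            for i in range(t + 1, m):
                q = A[i][t] // p
                A[i] = [x - q * y for x, y in zip(A[i], A[t])]
            # Division algorithm: reduce row t to the right of the pivot.
            for j in range(t + 1, n):
                q = A[t][j] // p
                for row in A:
                    row[j] -= q * row[t]
            if any(A[i][t] for i in range(t + 1, m)) or \
               any(A[t][j] for j in range(t + 1, n)):
                continue  # a nonzero remainder smaller than |p| appeared
            # Row t and column t are clear; check divisibility of the rest.
            bad = next(((i, j) for i in range(t + 1, m)
                        for j in range(t + 1, n) if A[i][j] % p), None)
            if bad is None:
                break
            for row in A:                        # add the offending column
                row[t] += row[bad[1]]
        if A[t][t] < 0:
            A[t] = [-x for x in A[t]]
    return A


print(smith_normal_form([[-1, 2], [-3, 4]]))   # [[1, 0], [0, 2]]
```

Because the Smith normal form is unique, any correct sequence of elementary operations ends at the same diagonal matrix, so the pivot strategy affects only the running time.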
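Example 2.1.7. The matrix $\begin{pmatrix} -1 & 2 \\ -3 & 4 \end{pmatrix}$ has Smith normal form $\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$, and the
matrix $\begin{pmatrix} 1 & 4 & 9 \\ 16 & 25 & 36 \\ 49 & 64 & 81 \end{pmatrix}$ has Smith normal form $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 72 \end{pmatrix}$. As a double check,
note that the determinants of a matrix and its Smith normal form match, up to
sign. This is because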
We compute each of the above Smith forms using Sage, along with the
corresponding transformation matrices. First the 2 × 2 matrix.
2.1. FINITELY GENERATED ABELIAN GROUPS 19
A = matrix(ZZ, 2, [-1,2, -3,4])
S, U, V = A.smith_form(); S
[1 0]
[0 2]
U*A*V
[1 0]
[0 2]
U
[ 0 1]
[ 1 -1]
V
[1 4]
[1 3]
The Sage matrix command takes as input the base ring, the number of rows, and
the entries. Next we compute with a 3 × 3 matrix.
A = matrix(ZZ, 3, [1,4,9, 16,25,36, 49,64,81])
S, U, V = A.smith_form(); S
[ 1 0 0]
[ 0 3 0]
[ 0 0 72]
U*A*V
[ 1 0 0]
[ 0 3 0]
[ 0 0 72]
U
[ 0 0 1]
[ 0 1 -1]
[ 1 -20 -17]
V
[ 47 74 93]
[ -79 -125 -156]
[ 34 54 67]
m = matrix(ZZ, 3, [2..10]); m
[ 2 3 4]
[ 5 6 7]
[ 8 9 10]
m.smith_form()[0]
[1 0 0]
[0 3 0]
[0 0 0]
Exercise 2.1.8. Recall Smith normal form defined in Proposition 2.1.6. With only
minor modifications, the proposition and its proof work over any principal
ideal domain. Find and apply these modifications, then find the Smith normal form
of the matrix $\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1+i & 2 \\ 0 & 1 & 5 \end{pmatrix}$.
[Hint: You can use Sage to verify your answer. However, you will need to
explicitly construct the Gaussian integers in order to input the matrix. You can do
this with the following code.]
K.<i> = QuadraticField(-1)
R = K.maximal_order()
M = matrix(R, 3, [1,2,3, 0,1+i,2, 0,1,5]); show(M)
# show(M.smith_form()[0]) # uncomment for the answer
Exercise 2.1.9. Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$.
1. Find the Smith normal form of A.
2. Prove that the cokernel of the map Z^3 → Z^3 given by multiplication by A is
isomorphic to Z/3Z ⊕ Z.
2.2. NOETHERIAN RINGS AND MODULES 21
that is exact at each point; thus f is injective, g is surjective, and im(f ) = ker(g).
Example 2.2.6. The sequence
$0 \to \mathbf{Z} \xrightarrow{2} \mathbf{Z} \to \mathbf{Z}/2\mathbf{Z} \to 0$
is an exact sequence, where the first map sends 1 to 2, and the second is the natural
quotient map.
Lemma 2.2.7. If
$0 \to L \xrightarrow{f} M \xrightarrow{g} N \to 0$
is a short exact sequence of R-modules, then M is noetherian if and only if both L
and N are noetherian.
Proof. First suppose that M is noetherian. Then L is a submodule of M , so L is
noetherian. Let N′ be a submodule of N ; then the inverse image of N′ in M is
a submodule of M , so it is finitely generated, hence its image N′ is also finitely
generated. Thus N is noetherian as well.
Next assume nothing about M , but suppose that both L and N are noetherian.
Suppose M′ is a submodule of M ; then M_0 = f (L) ∩ M′ is isomorphic to a
submodule of the noetherian module L, so M_0 is generated by finitely many
elements a1 , . . . , an . The quotient M′/M_0 is isomorphic (via g) to a submodule of the
noetherian module N , so M′/M_0 is generated by finitely many elements b1 , . . . , bm .
For each i ≤ m, let ci be a lift of bi to M′, modulo M_0 . Then the elements
a1 , . . . , an , c1 , . . . , cm generate M′, for if x ∈ M′, then there is some element y ∈ M_0
such that x − y is an R-linear combination of the ci , and y is an R-linear combination
of the ai .
0→I→R→S→0
Proof. Assume first that we have already shown that for any n the polynomial ring
R[x1 , . . . , xn ] is noetherian. Suppose S is finitely generated as a ring over R, so
there are generators s1 , . . . , sn for S. Then the map xi ↦ si extends uniquely to a
surjective homomorphism π : R[x1 , . . . , xn ] ↠ S, and Lemma 2.2.9 implies that S
is noetherian.
The rings R[x1 , . . . , xn ] and (R[x1 , . . . , xn−1 ])[xn ] are isomorphic, so it suffices
to prove that if R is noetherian then R[x] is also noetherian. (Our proof follows
[Art91, §12.5].) Thus suppose I is an ideal of R[x] and that R is noetherian. We
will show that I is finitely generated.
Let A be the set of leading coefficients of polynomials in I. (The leading
coefficient of a polynomial is the coefficient of the highest degree monomial, or 0 if the
polynomial is 0; thus 3x^7 + 5x^2 − 4 has leading coefficient 3.) We will first show
that A is an ideal of R. Suppose a, b ∈ A are nonzero with a + b ≠ 0. Then there
are polynomials f and g in I with leading coefficients a and b. If deg(f ) ≤ deg(g),
then a + b is the leading coefficient of x^{deg(g)−deg(f)} f + g, so a + b ∈ A; the argument
when deg(f ) > deg(g) is analogous. Suppose r ∈ R and a ∈ A with ra ≠ 0. Then
ra is the leading coefficient of rf , so ra ∈ A. Thus A is an ideal in R.
Since R is noetherian and A is an ideal of R, there exist nonzero a1 , . . . , an ∈ A
that generate A as an ideal. Since A is the set of leading coefficients of elements
of I, and the aj are in A, we can choose for each j ≤ n an element fj ∈ I with
leading coefficient aj . By multiplying the fj by some power of x, we may assume
that the fj all have the same degree d ≥ 1.
Let S_{<d} be the set of elements of I that have degree strictly less than d. This
set is closed under addition and under multiplication by elements of R, so S_{<d} is a
module over R. The module S_{<d} is a submodule of the R-module of polynomials of
degree less than d, which is noetherian by Proposition 2.2.8 because it is generated
by 1, x, . . . , x^{d−1}. Thus S_{<d} is finitely generated, and we may choose generators
h1 , . . . , hm for S_{<d}.
We finish by proving using induction on the degree that every g ∈ I is an R[x]-
linear combination of f1 , . . . , fn , h1 , . . . , hm . If g ∈ I has degree 0, then g ∈ S_{<d},
since d ≥ 1, so g is a linear combination of h1 , . . . , hm . Next suppose g ∈ I has degree
e, and that we have proven the statement for all elements of I of degree < e. If e < d,
then g ∈ S_{<d}, so g is in the R[x]-ideal generated by h1 , . . . , hm . Next suppose that
e ≥ d. Then the leading coefficient b of g lies in the ideal A of leading coefficients
of elements of I, so there exist ri ∈ R such that b = r1 a1 + · · · + rn an . Since fi has
leading coefficient ai , the difference g − x^{e−d} Σ_i ri fi has degree less than the degree e
of g. By induction g − x^{e−d} Σ_i ri fi is an R[x]-linear combination of f1 , . . . , fn , h1 , . . . , hm ,
so g is also an R[x]-linear combination of f1 , . . . , fn , h1 , . . . , hm . Since each fi and
hj lies in I, it follows that I is generated by f1 , . . . , fn , h1 , . . . , hm , so I is finitely
generated, as required.
Example 2.2.12. Let I = (12, 18) be the ideal of Z generated by 12 and 18. If
n = 12a + 18b ∈ I, with a, b ∈ Z, then 6 | n, since 6 | 12 and 6 | 18. Also,
6 = 18 − 12 ∈ I, so I = (6).
The ring Z in Sage is ZZ, which is Noetherian.
ZZ.is_noetherian()
True
I.is_principal()
True
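The computation in Example 2.2.12 can also be checked without Sage. The following is a small sketch in plain Python of the extended Euclidean algorithm; the helper name extended_gcd is chosen here for illustration. It produces a generator g of the ideal (a, b) of Z together with integers s, t such that g = s·a + t·b, which is exactly what is needed to see that g lies in the ideal.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g == s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


g, s, t = extended_gcd(12, 18)
print(g, s, t)   # 6 -1 1, i.e. 6 = (-1)*12 + 1*18
```

Since 6 divides both generators and 6 = 18 − 12 is itself a combination of them, the ideals (12, 18) and (6) coincide.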
Propositions 2.2.8 and 2.2.11 together imply that any finitely generated abelian
group is noetherian. This means that subgroups of finitely generated abelian groups
are finitely generated, which provides the missing step in our proof of the structure
theorem for finitely generated abelian groups.
Exercise 2.2.13. There is another way to show every principal ideal domain (for
example Z) is noetherian (contrast to the proof in Section 2.2.1). Let R be a PID
and (a) an arbitrary ideal. Use the facts that (b) ⊇ (a) if and only if b | a and R is
a UFD to show that any ascending chain of ideals starting with (a) must stabilize.
The proof of the following proposition uses repeatedly that any submodule of
a finitely generated Z-module is finitely generated, which uses that Z is noetherian
and that finitely generated modules over a noetherian ring are noetherian.
Proposition 2.3.5. Suppose K is a field and α, β ∈ K are two algebraic integers.
Then αβ and α + β are also algebraic integers.
Proof. Let m, n be the degrees of monic integral polynomials that have α, β as roots,
respectively. Then we can write α^m in terms of smaller powers of α and likewise
for β^n , so the elements α^i β^j for 0 ≤ i < m and 0 ≤ j < n span the Z-module
Z[α, β]. Since Z[α + β] is a submodule of the finitely-generated Z-module Z[α, β], it
is finitely generated, so α + β is integral. Likewise, Z[αβ] is a submodule of Z[α, β],
so it is also finitely generated, and αβ is integral.
(a^2 + 3).minpoly()
x^2 - 6*x + 7
Exercise 2.3.8. Find the minimal polynomial of √2 + √3 by hand. Check your
result with Sage.
Proof. (⇐=) Since f ∈ Z[x] is monic and f (α) = 0, we see immediately that α is
an algebraic integer.
(=⇒) Since α is an algebraic integer, there is some nonzero monic g ∈ Z[x] such
that g(α) = 0. By Lemma 2.3.9, we have g = f h, for some h ∈ Q[x], and h is monic
because f and g are. If f ∉ Z[x], then some prime p divides the denominator of
some coefficient of f . Let p^i be the largest power of p that divides some denominator
of some coefficient of f , and likewise let p^j be the largest power of p that divides some
denominator of a coefficient of h. Then p^{i+j} g = (p^i f )(p^j h), and if we reduce both
sides modulo p, then the left hand side is 0 but the right hand side is a product of
two nonzero polynomials in Fp [x], hence nonzero, a contradiction.
(1/2).minpoly()
x - 1/2
a.minpoly()
x^2 - 2
Finally we compute the minimal polynomial of α = √2/2 + 3, which is not integral,
hence Proposition 2.3.4 implies that α is not an algebraic integer:
(a/2 + 3).minpoly()
x^2 - 6*x + 17/2
The only elements of Q that are algebraic integers are the usual integers Z, since
Z[1/d] is not finitely generated as a Z-module. Watch out, since there are elements
of Q̄ (the algebraic closure of Q) that appear to have denominators when written
down, but are still algebraic integers. This is an artifact of how we write them down,
e.g., if we wrote our integers as a multiple of α = 2, then we would write 1 as α/2.
For example,
α = (1 + √5)/2
is an algebraic integer, since it is a root of the monic integral polynomial x^2 − x − 1.
We verify this using Sage below, though of course this is easy to do by hand (you
should try much more complicated examples in Sage).
alpha = (1 + a)/2
alpha.minpoly()
x^2 - x - 1
alpha.is_integral()
True
Since √5 can be expressed in terms of radicals, we can also compute this minimal
polynomial using the symbolic functionality in Sage.
x^2 - x - 1
x^8 - 8*x^6 + 18*x^4 - 104*x^2 + 1
a^2
b^3
We compute the minimal polynomial of the sum and product of ∛5 and √2. The
command absolute_minpoly gives the minimal polynomial of the element over the
rational numbers.
(a + b).absolute_minpoly()
(a*b).absolute_minpoly()
x^6 - 200
The minimal polynomial of the product ∛5 · √2 is trivial to compute by hand. In
light of the Cayley-Hamilton theorem, we can compute the minimal polynomial of
α = ∛5 + √2 by hand by computing the characteristic polynomial of the matrix given
by left multiplication by α on the basis
1, √2, ∛5, ∛5 √2, (∛5)^2, (∛5)^2 √2.
x^6 - 200
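To make the hand computation concrete, here is a sketch in plain Python (not the book's Sage code). It writes down the matrices C and S of multiplication by ∛5 and √2 on the six-element basis above, in that order, and computes characteristic polynomials with the Faddeev-LeVerrier recursion. The helper names basis_matrices, mat_mul, mat_add and char_poly are hypothetical, and the recursion is one standard way to obtain det(xI − M), swapped in here for whatever method one would use by hand.

```python
from fractions import Fraction

N = 6  # basis: c^j * s^k with c = cube root of 5, s = sqrt(2), index 2*j + k


def basis_matrices():
    """Matrices of multiplication by c and by s (columns are images)."""
    C = [[0] * N for _ in range(N)]
    S = [[0] * N for _ in range(N)]
    for j in range(3):
        for k in range(2):
            col = 2 * j + k
            if j < 2:
                C[2 * (j + 1) + k][col] = 1   # c * c^j = c^(j+1)
            else:
                C[k][col] = 5                 # c * c^2 = 5
            if k == 0:
                S[2 * j + 1][col] = 1         # s * 1 = s
            else:
                S[2 * j][col] = 2             # s * s = 2
    return C, S


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(N)] for i in range(N)]


def char_poly(A):
    """Coefficients of det(xI - A), leading coefficient first."""
    M = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    coeffs = [Fraction(1)]
    for k in range(1, N + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(N)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(N)]
             for i in range(N)]
    return coeffs


C, S = basis_matrices()
print([int(c) for c in char_poly(mat_mul(C, S))])  # product: x^6 - 200
print([int(c) for c in char_poly(mat_add(C, S))])  # sum
```

For the sum the coefficients come out as 1, 0, −6, −10, 12, −60, 17, which agrees with expanding (x^3 + 6x − 5)^2 − 2(3x^2 + 2)^2, the polynomial obtained by eliminating √2 from (x − √2)^3 = 5.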
Exercise 2.3.15. Let α = √2 + (1 + √5)/2.
1. Is α an algebraic integer?
Example 2.3.18. The field Q of rational numbers is a number field of degree 1, and
the ring of integers of Q is Z. The field K = Q(i) of Gaussian rationals has degree
2 and OK = Z[i].
Example 2.3.19. The golden ratio ϕ = (1 + √5)/2 is in the quadratic number field
K = Q(√5) = Q(ϕ); notice that ϕ satisfies x^2 − x − 1, so ϕ ∈ OK . To see that
OK = Z[ϕ] directly, we proceed as follows. By Proposition 2.3.4, the algebraic
integers in K are exactly the elements a + b√5 ∈ K, with a, b ∈ Q, that have integral
minimal polynomial. The matrix of a + b√5 with respect to the basis 1, √5 for
K is $m = \begin{pmatrix} a & 5b \\ b & a \end{pmatrix}$. The characteristic polynomial of m is f = (x − a)^2 − 5b^2 =
x^2 − 2ax + a^2 − 5b^2 , which is in Z[x] if and only if 2a ∈ Z and a^2 − 5b^2 ∈ Z. Thus
a = a′/2 with a′ ∈ Z, and (a′/2)^2 − 5b^2 ∈ Z, so 5b^2 ∈ (1/4)Z, so b ∈ (1/2)Z as well. If a
has a denominator of 2, then b must also have a denominator of 2 to ensure that
the difference a^2 − 5b^2 is an integer. This proves that OK = Z[ϕ].
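The integrality criterion from this example is easy to experiment with. The sketch below, in plain Python with exact rational arithmetic, tests whether a + b√5 is an algebraic integer by checking that 2a and a^2 − 5b^2 are integers; the function names are hypothetical. When b = 0 the characteristic polynomial is (x − a)^2 rather than the minimal polynomial, but the test still gives the right answer, since 2a ∈ Z and a^2 ∈ Z together force a ∈ Z.

```python
from fractions import Fraction


def char_poly_coeffs(a, b):
    """Return (c1, c0) with characteristic polynomial x^2 + c1*x + c0
    for multiplication by a + b*sqrt(5) on Q(sqrt(5))."""
    a, b = Fraction(a), Fraction(b)
    return (-2 * a, a * a - 5 * b * b)


def is_algebraic_integer(a, b):
    """True if a + b*sqrt(5) is an algebraic integer (a, b rational)."""
    c1, c0 = char_poly_coeffs(a, b)
    return c1.denominator == 1 and c0.denominator == 1


half = Fraction(1, 2)
print(is_algebraic_integer(half, half))  # True: the golden ratio
print(is_algebraic_integer(half, 0))     # False: 1/2 is not an integer
print(is_algebraic_integer(half, 1))     # False: denominators do not match
```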
Example 2.3.20. The ring of integers of K = Q(∛9) is Z[∛3], where ∛3 = (1/3)(∛9)^2 ∉
Z[∛9]. As we will see, in general the problem of computing OK given K may be very
hard, since it requires factoring a certain potentially large integer.
Exercise 2.3.21. From basic definitions, find the rings of integers of the fields
Q(√11) and Q(√−6).
As noted above, Z[i] is the ring of integers of Q(i). For every nonzero integer n,
the subring Z+niZ of Z[i] is an order. The subring Z of Z[i] is not an order, because
Z does not have finite index in Z[i]. Also the subgroup 2Z + iZ of Z[i] is not an
order because it is not a ring.
O3 = K.order(3*i); O3
O3.gens()
[1, 3*i]
True
1 + 2*i in O3
False
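The claims about Z + niZ and 2Z + iZ can be checked by hand or with a few lines of plain Python. In this sketch a Gaussian integer a + bi is modeled as a pair (a, b); the helper names are hypothetical and the code does not use Sage's order objects.

```python
def gauss_mul(z, w):
    """Product of Gaussian integers given as pairs (real, imaginary)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def in_Z_plus_niZ(z, n):
    """Membership of a + b*i in the subring Z + n*i*Z: n must divide b."""
    return z[1] % n == 0


def in_2Z_plus_iZ(z):
    """Membership of a + b*i in the subgroup 2Z + i*Z: a must be even."""
    return z[0] % 2 == 0


# 1 + 2i is not in Z + 3iZ, matching the Sage session above.
print(in_Z_plus_niZ((1, 2), 3))                          # False
# Z + 3iZ is closed under multiplication in this sample.
print(in_Z_plus_niZ(gauss_mul((1, 3), (2, 6)), 3))       # True
# 2Z + iZ contains i but not i*i = -1, so it is not a ring.
print(in_2Z_plus_iZ((0, 1)), in_2Z_plus_iZ(gauss_mul((0, 1), (0, 1))))
```

The closure of Z + niZ under multiplication holds in general, since (a + nbi)(c + ndi) = (ac − n^2 bd) + n(ad + bc)i.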
We will frequently consider orders because they are often much easier to write
down explicitly than OK . For example, if K = Q(α) and α is an algebraic integer,
then Z[α] is an order in OK , but frequently Z[α] ≠ OK .
Example 2.3.25. In this example [OK : Z[a]] = 2197. First we define the number
field K = Q(a) where a is a root of x^3 − 15x^2 − 94x − 3674, then we compute the
order Z[a] generated by a.
Oa.basis()
[1, a, a^2]
Next we compute a Z-basis for the maximal order OK of K, and compute that the
index of Z[a] in OK is 2197 = 13^3 .
OK = K.maximal_order()
OK.basis()
Oa.index_in(OK)
2197
Then we list the Galois conjugates of ∛2.
cuberoot2.galois_conjugates(L)
- zeta3 - 1
We know from linear algebra that determinants are multiplicative and traces
are additive, so for a, b ∈ L we have
Norm_{L/K}(ab) = Norm_{L/K}(a) · Norm_{L/K}(b)
and
tr_{L/K}(a + b) = tr_{L/K}(a) + tr_{L/K}(b).
Note that if f ∈ Q[x] is the characteristic polynomial of ℓ_a , then the constant
term of f is (−1)^{deg(f)} det(ℓ_a ), and the coefficient of x^{deg(f)−1} is − tr(ℓ_a ).
Proposition 2.4.3. Let a ∈ L and let σ1 , . . . , σd , where d = [L : K], be the distinct
field embeddings L ↪ Q̄ that fix every element of K. Then
Norm_{L/K}(a) = ∏_{i=1}^{d} σi (a) and tr_{L/K}(a) = Σ_{i=1}^{d} σi (a).
It is important in Proposition 2.4.3 that the product and sum be over all the
images σi (a), not over just the distinct images. For example, if a = 1 ∈ L, then
tr_{L/K}(a) = [L : K], whereas the sum of the distinct conjugates of a is 1.
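As a small illustration, the following plain Python sketch works in L = Q(√5) over K = Q, where an element a + b√5 is stored as a pair (a, b) and left multiplication has matrix with rows (a, 5b) and (b, a). The helper names are hypothetical. It compares the determinant and trace of that matrix with the product and sum over the two embeddings √5 ↦ ±√5.

```python
from fractions import Fraction


def mul(x, y):
    """(a + b*sqrt5)(c + d*sqrt5) as a pair."""
    (a, b), (c, d) = x, y
    return (a * c + 5 * b * d, a * d + b * c)


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def conj(x):
    """Image under the nontrivial embedding sqrt5 -> -sqrt5."""
    return (x[0], -x[1])


def norm(x):
    """Determinant of the matrix [[a, 5b], [b, a]] of multiplication by x."""
    a, b = x
    return a * a - 5 * b * b


def trace(x):
    """Trace of the same matrix."""
    return 2 * x[0]


phi = (Fraction(1, 2), Fraction(1, 2))       # the golden ratio
print(norm(phi), trace(phi))                 # -1 1
print(mul(phi, conj(phi)), add(phi, conj(phi)))  # same values, as pairs
print(trace((1, 0)))                         # 2 = [L : K], not 1
```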
The following corollary asserts that the norm and trace behave well in towers.
Corollary 2.4.4. Suppose K ⊂ L ⊂ M is a tower of number fields, and let a ∈ M .
Then
Norm_{M/K}(a) = Norm_{L/K}(Norm_{M/L}(a)) and tr_{M/K}(a) = tr_{L/K}(tr_{M/L}(a)).
S = {c1 a1 + · · · + cn an : ci ∈ Q, 0 ≤ ci ≤ 1} ⊂ K.
Thus for any ε > 0, there are elements a, b ∈ OK such that the coefficients of a − b
are all less than ε (otherwise the elements of OK would all be a “distance” of at least ε
from each other, so only finitely many of them would fit in S).
As mentioned above, the norms of elements of OK are integers. Since the norm
of an element is the determinant of left multiplication by that element, the norm is
a homogeneous polynomial of degree n in the indeterminate coefficients ci , which is
0 only on the element 0, so the constant term of this polynomial is 0. If the ci get
arbitrarily small for elements of OK , then the values of the norm polynomial get
arbitrarily small, which would imply that there are elements of OK with positive
norm too small to be in Z, a contradiction. So the set S contains only finitely many
elements of OK . Thus the denominators of the ci are bounded, so for some d, we
have that OK has finite index in A = (1/d)Za1 + · · · + (1/d)Zan . Since A is isomorphic to
Z^n , it follows from the structure theorem for finitely generated abelian groups that
OK is isomorphic as a Z-module to Z^n , as claimed.
β = 22/389 = 0.05655526992287917737789203084832904884318766066838046 . . .
and we compute
α = 0.056555.
Now suppose that, given only α, you would like to recover β. A standard technique is
to use continued fractions, which yields a sequence of good rational approximations
for α; by truncating right before a surprisingly big partial quotient, we obtain β:
v = continued_fraction(0.056555)
continued_fraction(0.056555)
[0, 17, 1, 2, 6, 1, 23, 1, 1, 1, 1, 1, 2]
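The same recovery can be reproduced in plain Python with exact arithmetic. This is a sketch under the assumption that α is given as the exact decimal 0.056555; the function names continued_fraction and convergents are defined here and are not Sage's. It computes the partial quotients, then takes the convergent just before the largest partial quotient, 23.

```python
from fractions import Fraction


def continued_fraction(x):
    """Partial quotients of a rational number x (exact, terminates)."""
    x = Fraction(x)
    terms = []
    while True:
        a = x.numerator // x.denominator
        terms.append(a)
        x -= a
        if x == 0:
            return terms
        x = 1 / x


def convergents(terms):
    """Convergents h_n / k_n of the continued fraction with given terms."""
    h_prev, k_prev, h, k = 0, 1, 1, 0
    out = []
    for a in terms:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        out.append(Fraction(h, k))
    return out


terms = continued_fraction(Fraction("0.056555"))
print(terms)  # [0, 17, 1, 2, 6, 1, 23, 1, 1, 1, 1, 1, 2]
cut = terms.index(max(terms))          # position of the big quotient 23
print(convergents(terms[:cut])[-1])    # 22/389
```

Cutting at the largest partial quotient is a heuristic: a large quotient means the preceding convergent is an unusually good approximation, which is what one expects when α was obtained by truncating a rational number of small height.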
1.64940734628582579415255223513033238849340192353916
Now suppose you very much want to find the (rescaled) minimal polynomial f (x) ∈
Z[x] of β just given this numerical approximation α. This is of great value even
without proof, since often in practice once you know a potential minimal polynomial
you can verify that it is in fact right. Exactly this situation arises in the explicit
construction of class fields (a more advanced topic in number theory) and in the
construction of Heegner points on elliptic curves. As we will see, the LLL
algorithm provides a polynomial time way to solve this problem, assuming α has been
computed to sufficient precision.
where
µ_{i,j} = (b_i · b_j^*) / (b_j^* · b_j^*).
A = matrix(ZZ, 2, [1,2, 3,4]); A
[1 2]
[3 4]
Bstar, mu = A.gramm_schmidt()
The rows of the matrix B∗ are obtained from the rows of A by the Gram-Schmidt
procedure.
Bstar
[ 1 2]
[ 4/5 -2/5]
mu
[ 0 0]
[11/5 0]
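The Bstar and mu shown above can be reproduced with a few lines of plain Python using exact fractions. This is a sketch of the Gram-Schmidt recursion with the coefficients µ_{i,j} = (b_i · b_j^*)/(b_j^* · b_j^*); the function names dot and gram_schmidt are chosen here, and the code assumes the input rows are linearly independent (otherwise a division by zero occurs).

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(rows):
    """Return (bstar, mu) for the given basis rows, without normalizing.

    bstar[i] = b_i - sum_{j < i} mu[i][j] * bstar[j].
    """
    basis = [[Fraction(x) for x in r] for r in rows]
    n = len(basis)
    bstar = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = list(b)
        for j in range(i):
            mu[i][j] = dot(b, bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu


bstar, mu = gram_schmidt([[1, 2], [3, 4]])
print(bstar)   # rows (1, 2) and (4/5, -2/5)
print(mu)      # mu[1][0] = 11/5, all other entries 0
```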
µ_{2,1} = (b_2 · b_1^*) / (b_1^* · b_1^*) = 0,
and
2^2 ≥ (3/4) · 1^2 .
A = matrix(ZZ, 2, [1,2, 3,4])
A.LLL()
[1 0]
[0 2]
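For small examples the reduction can be followed step by step with a direct, unoptimized implementation. The sketch below is the standard textbook LLL loop in plain Python with exact fractions, assuming the usual parameter 3/4 in the Lovász condition |b_k^*|^2 ≥ (3/4 − µ_{k,k−1}^2)|b_{k−1}^*|^2; it is not the algorithm Sage calls, and the function names are chosen here. It recomputes Gram-Schmidt from scratch at every step, which is wasteful but keeps the logic visible.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Gram-Schmidt vectors and coefficients for linearly independent rows."""
    n = len(basis)
    bstar, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = list(b)
        for j in range(i):
            mu[i][j] = dot(b, bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu


def lll(rows, delta=Fraction(3, 4)):
    """Return an LLL reduced basis of the lattice spanned by integer rows."""
    b = [[Fraction(x) for x in r] for r in rows]
    n = len(b)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(b)
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
        bstar, mu = gram_schmidt(b)
        lhs = dot(bstar[k], bstar[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(bstar[k - 1], bstar[k - 1])
        if lhs >= rhs:
            k += 1                       # Lovasz condition holds
        else:
            b[k], b[k - 1] = b[k - 1], b[k]
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in b]


print(lll([[1, 2], [3, 4]]))   # [[1, 0], [0, 2]]
```

On the matrix above the loop first replaces (3, 4) by (1, 0), swaps the two rows, and then reduces (1, 2) to (0, 2), at which point the condition 2^2 ≥ (3/4) · 1^2 holds.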
2. For 1 ≤ j ≤ i ≤ n, we have
|b_j | ≤ 2^{(i−1)/2} |b_i^*|.
3.0494159999999999
There is even a very fast variant of Stehle’s implementation that computes a basis
for L that is very likely LLL reduced but may in rare cases fail to be LLL reduced.
t = cputime()
B = A.LLL(algorithm="fpLLL:fast") # not tested
cputime(t) # random output
0.96842699999999837
2.5. RECOGNIZING ALGEBRAIC NUMBERS USING LLL 41
1. Form the lattice in R^{d+2} with basis the rows of the matrix A whose first
(d + 1) × (d + 1) part is the identity matrix, and whose last column has entries
2. Compute an LLL reduced basis for the Z-span of the rows of A, and let B be
the corresponding matrix. Let b1 = (a0 , a1 , . . . , a_{d+1}) be the first row of B and
notice that B is obtained from A by left multiplication by an invertible integer
matrix. Thus a0 , . . . , ad are the coefficients of a linear combination of the entries
in (2.5.1) that equals a_{d+1}. Moreover, since B is LLL reduced we expect that a_{d+1}
is relatively small.
2*x^3 - 3*x^2 + 10*x - 4
Chapter 3
Unique factorization into irreducible elements frequently fails for rings of integers of
number fields. In this chapter we will deduce a central property of the ring of integers
OK of an algebraic number field, namely that every nonzero ideal factors uniquely
as a product of prime ideals. Along the way, we will introduce fractional ideals
and prove that they form a free abelian group under multiplication. Factorization
of elements of OK (and much more!) is governed by the class group of OK , which is
the quotient of the group of fractional ideals by the principal fractional ideals (see
Chapter 7).
Frac(ZZ)
Rational Field
In Sage the Frac command usually returns a field canonically isomorphic to the
fraction field (not a formal construction).
44 CHAPTER 3. UNIQUE FACTORIZATION OF IDEALS
OK.basis()
[1/2*a + 1/2, a]
Frac(OK)
Remark 3.1.2. Note that in computers 1/2 * x means the same as (1/2)*x. For
more information about the order of operations in programming see
http://en.wikipedia.org/wiki/Order_of_operations. In Sage the ^ symbol is replaced
with Python’s exponentiation ** at execution.
The fraction field of an order – i.e., a subring of OK of finite index – is also the
number field again.
O2 = K.order(2*a); O2
Frac(O2)
For example, every field is integrally closed in its field of fractions, as is the ring
Z of integers. However, Z[√5] is not integrally closed in its field of fractions, since
(1 + √5)/2 is integral over Z and lies in Q(√5), but not in Z[√5].
as a Z[a0 , . . . , a_{n−1}]-linear combination of α^i for i < n, so the ring Z[a0 , . . . , a_{n−1}, α]
is also finitely generated as a Z-module. Thus Z[α] is finitely generated as a Z-
module because it is a submodule of a finitely generated Z-module, which implies
that α is integral over Z.
Without loss of generality we may assume that K ⊂ Q̄, so that OK = Z̄ ∩ K (here Z̄
denotes the ring of all algebraic integers in Q̄). Suppose α ∈ K
is integral over OK . Then since Z̄ is integrally closed, α is an element of Z̄, so
α ∈ K ∩ Z̄ = OK , as required.
I*J
with J an integral ideal. Thus dividing by α, we see that every fractional ideal is
of the form
aJ = {ab : b ∈ J}
for some a ∈ K and integral ideal J ⊂ OK .
For example, the set (1/2)Z of rational numbers with denominator 1 or 2 is a
fractional ideal of Z.
Definition 3.2.4 (Divides for Ideals). Suppose that I, J are ideals of OK . Then
we say that I divides J if I ⊃ J.
Lemma 3.2.5. Suppose I is a nonzero ideal of OK . Then there exist prime ideals
p1 , . . . , pn such that p1 · p2 · · · pn ⊂ I, i.e., I divides a product of prime ideals.
Proof. Let S be the set of nonzero ideals of OK that do not satisfy the conclusion
of the lemma. The key idea is to use that OK is noetherian to show that S is the
empty set. If S is nonempty, then since OK is noetherian, there is an ideal I ∈ S
that is maximal as an element of S. If I were prime, then I would trivially contain
a product of primes, so we may assume that I is not prime. Thus there exist
a, b ∈ OK such that ab ∈ I but a ∉ I and b ∉ I. Let J1 = I + (a) and J2 = I + (b).
Then neither J1 nor J2 is in S, since I is maximal, so both J1 and J2 contain a
product of prime ideals, say p1 · · · pr ⊂ J1 and q1 · · · qs ⊂ J2 . Then
Proof of Theorem 3.2.3. Note that we will only prove Theorem 3.2.3 in the case
when R = OK is the ring of integers of a number field K.
I = {a ∈ K : ap ⊂ OK }
p1 p2 · · · pm ⊂ (b) ⊂ p.
p^{−1} = {a ∈ K : ap ⊂ OK }
and I an integral ideal, so since (a) has inverse (1/a), it suffices to show that every
integral ideal I has an inverse. If not, then there is a nonzero integral ideal I that
is maximal among all nonzero integral ideals that do not have an inverse. Every
ideal is contained in a maximal ideal, so there is a nonzero prime ideal p such that
I ⊂ p. Multiplying both sides of this inclusion by p^{−1} and using that OK ⊂ p^{−1},
we see that
I ⊂ p^{−1} I ⊂ p^{−1} p = OK .
If I = p^{−1} I, then arguing as in the proof that p^{−1} is an inverse of p, we see
that each element of p^{−1} preserves the finitely generated Z-module I and is hence
integral. But then p^{−1} ⊂ OK , which, upon multiplying both sides by p, implies that
OK = pp^{−1} ⊂ p, a contradiction. Thus I ≠ p^{−1} I. Because I is maximal among
ideals that do not have an inverse, the ideal p^{−1} I does have an inverse J. Then
p^{−1} J is an inverse of I, since (Jp^{−1})I = J(p^{−1} I) = OK .
We can finally deduce the crucial Theorem 3.2.6, which will allow us to show
that any nonzero ideal of a Dedekind domain can be expressed uniquely as a product
of primes (up to order). Thus unique factorization holds for ideals in a Dedekind
domain, and it is this unique factorization that initially motivated the introduction
of ideals to mathematics over a century ago.
Proof. Suppose I is an ideal that is maximal among the set of all ideals in OK that
cannot be written as a product of primes. Every ideal is contained in a maximal
ideal, so I is contained in a nonzero prime ideal p. If Ip^{-1} = I, then by
Theorem 3.2.3 we can cancel I from both sides of this equation to see that p^{-1} = OK, a
contradiction. Since OK ⊂ p^{-1}, we have I ⊂ Ip^{-1}, and by the above observation I is
strictly contained in Ip^{-1}. By our maximality assumption on I, there are maximal
ideals p1, ..., pn such that Ip^{-1} = p1 · · · pn. Then I = p · p1 · · · pn, a contradiction.
Thus every ideal can be written as a product of primes.
Suppose p1 · · · pn = q1 · · · qm. If no qi is contained in p1, then for each i there
is an ai ∈ qi such that ai ∉ p1. But the product of the ai is in p1 · · · pn, which is a
subset of p1, which contradicts that p1 is a prime ideal. Thus qi = p1 for some i.
We can thus cancel qi and p1 from both sides of the equation by multiplying both
sides by the inverse. Repeating this argument finishes the proof of uniqueness.
K . factor (6)
Factoring Primes
Let p be a prime and OK the ring of integers of a number field. This chapter is about
how to write pOK as a product of prime ideals of OK . Paradoxically, computing
the explicit prime ideal factorization of pOK is easier than computing OK .
52 CHAPTER 4. FACTORING PRIMES
Bill Gates meant1 factoring products of two primes, which would break the
RSA cryptosystem (see e.g. [Ste09, §3.2]). However, perhaps Gates is an algebraic
number theorist, and he really meant what he said: then we might imagine that he
meant factorization of primes of Z in rings of integers of number fields. For example,
2^16 + 1 = 65537 is a “large” prime, and in Z[i] we have
(65537) = (65537, 2^8 + i) · (65537, 2^8 − i).
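The arithmetic behind this example can be checked by hand or by machine. The following is a small plain-Python sketch (not a Sage session; the helper name is_prime is ours, introduced only for illustration). It confirms that 2^16 + 1 is prime and that it equals the norm a^2 + b^2 of 2^8 + i, which is the reason it splits in Z[i].

```python
def is_prime(n):
    """Trial division; adequate for small n such as 65537."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

p = 2**16 + 1
# The norm of a + b*i in Z[i] is a^2 + b^2; here a = 2^8 and b = 1.
norm = (2**8) ** 2 + 1 ** 2
print(p, is_prime(p), norm == p)
```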
4.1.2 Examples
The following Sage session shows the commands needed to compute the factorization
of pOK for K the number field defined by a root of x5 + 7x4 + 3x2 − x + 1 and p = 2
and 5. We first create an element f ∈ Q[x] in Sage:
1
This quote is on page 265 of the first edition. In the second edition, on page 303, this sentence is
changed to “The obvious mathematical breakthrough that would defeat our public key encryption
would be the development of an easy way to factor large numbers.” This is less nonsensical;
however, fast factoring is not known to break all commonly used public-key cryptosystems. For
example, there are cryptosystems based on the difficulty of computing discrete logarithms in Fp^*
and on elliptic curves over Fp , which (presumably) would not be broken even if one could factor
large numbers quickly.
4.1. THE PROBLEM 53
R.<x> = QQ[]
f = x^5 + 7*x^4 + 3*x^2 - x + 1
[1, a, a^2, a^3, a^4]
I.factor()
I.is_prime()
True
I.factor()
(x + 2) * (x + 3)^2 * (x^2 + 4*x + 2)
[Figure 4.1.1]
OK = Z[a] it is relatively easy to factor pOK , at least assuming one can factor
polynomials in Fp [x]. The following factorization gives a hint as to why:
The exponent 2 of (5, 3 + a)^2 in the factorization of 5OK above suggests
“ramification”, in the sense that the cover X → Y has fewer points (counting their “size”,
i.e., their residue class degree) in its fiber over 5 than it has generically. See
Figure 4.1.1.
4.2. A METHOD FOR FACTORING PRIMES THAT OFTEN WORKS 55
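\[
\begin{array}{ccc}
\bigsqcup_i \operatorname{Spec}(O_K/p_i^{e_i}) & \longrightarrow & \operatorname{Spec}(O_K) \\
\bigsqcup_i \operatorname{Spec}(F_p[x]/(\bar f_i^{\,e_i})) & \longrightarrow & \operatorname{Spec}(Z[a]) \\
\operatorname{Spec}(F_p) & \longrightarrow & \operatorname{Spec}(Z)
\end{array}
\]
where $\bar f = \prod_i \bar f_i^{\,e_i}$ is the factorization of the image of f in Fp[x], and
$pO_K = \prod_i p_i^{e_i}$ is the factorization of pOK in terms of prime ideals of OK. On the
level of rings, the bottom horizontal map is the quotient map Z → Z/pZ ≅ Fp. The middle
horizontal map is induced by
\[ Z[x] \to \bigoplus_i F_p[x]/(\bar f_i^{\,e_i}), \]
and the top horizontal map is induced by
\[ O_K \to O_K/pO_K \cong \bigoplus_i O_K/p_i^{e_i}, \]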
0 → Z[a] → OK → H → 0,
OK/p′ ≅ (OK/pOK)/(f(ã)) ≅ (Z[a]/pZ[a])/(f(ã)) ⊂ Fp,
where the $\bar f_i$ are distinct monic irreducible polynomials. Let pi = (p, fi(a)) where
fi ∈ Z[x] is a lift of $\bar f_i$ in Fp[x]. Then
\[ pO_K = \prod_{i=1}^{t} p_i^{e_i}. \]
2945785
factor(D)
5 * 353 * 1669
The order Z[a] has the same discriminant as f(x), which is the same as the
discriminant of OK, so Z[a] = OK and we can apply the above theorem. (Here we use that
the square of the index of Z[a] in OK is the quotient of their discriminants, a fact
we will prove later in Section 6.2.)
R.<x> = QQ[]
discriminant(x^5 + 7*x^4 + 3*x^2 - x + 1)
2945785
We have
x5 + 7x4 + 3x2 − x + 1 ≡ (x + 2) · (x + 3)2 · (x2 + 4x + 2) (mod 5),
which yields the factorization of 5OK given before the theorem.
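The congruence above is easy to confirm independently of Sage. The following plain-Python sketch multiplies the claimed factors modulo 5 and compares the result with the reduction of f. The helper poly_mul_mod and the coefficient-list encoding (lowest degree first) are our own conventions, not part of the text.

```python
def poly_mul_mod(f, g, p):
    """Multiply two coefficient lists (lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

p = 5
# f = x^5 + 7x^4 + 3x^2 - x + 1, written lowest degree first.
f = [1, -1, 3, 0, 7, 1]
# The claimed factors: (x + 2), (x + 3), (x + 3), (x^2 + 4x + 2).
factors = [[2, 1], [3, 1], [3, 1], [2, 4, 1]]

product = [1]
for g in factors:
    product = poly_mul_mod(product, g, p)

f_mod_p = [c % p for c in f]
print(product == f_mod_p)
```

Only the multiplication is checked here; irreducibility of x^2 + 4x + 2 modulo 5 would need a separate check, for instance that it has no root in F5.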
If we replace a by b = 7a, then the index of Z[b] in OK will be a power of 7,
which is coprime to 5, so the above method will still work.
K.<a> = NumberField(x^5 + 7*x^4 + 3*x^2 - x + 1)
f = (7*a).minpoly('x')
f
f.disc()
235050861175510968365785
7^20
f.factor_mod(5)
(x + 4) * (x + 1)^2 * (x^2 + 3*x + 3)
Thus 5 factors in OK as
5OK = (5, 7a + 1)^2 · (5, 7a + 4) · (5, (7a)^2 + 3(7a) + 3).
If we replace a by b = 5a and try the above algorithm with Z[b], then the method
fails because the index of Z[b] in OK is divisible by 5.
f.factor_mod(5)
x^5
See Example 6.2.7 below for why it is called an inessential “discriminant divisor”
instead of an inessential “index divisor”.
Since [OK : Z[a]]^2 is the absolute value of Disc(f(x))/Disc(OK), where f(x) is
the characteristic polynomial of a, an inessential discriminant divisor divides the
discriminant of the characteristic polynomial of any element of OK.
Example 4.3.2 (Dedekind). Let K = Q(a) be the cubic field defined by a root a
of the polynomial f = x^3 + x^2 − 2x + 8. We will use Sage to show that 2 is an
inessential discriminant divisor for K.
K.factor(2)
Thus 2OK = p1 p2 p3, with the pi distinct, and one sees directly from the above
expressions that OK/pi ≅ F2 for each i. If OK = Z[a] for some a ∈ OK with
minimal polynomial f, then the reduction of f(x) in F2[x] must be a product of three
distinct linear factors, which is impossible, since the only linear polynomials in
F2[x] are x and x + 1.
4.3. A GENERAL METHOD 59
Problem 4.3.3. Let O be any order in OK and let p be a prime of Z. Find the
prime ideals of O that contain p.
Ip = {x ∈ O : x^m ∈ pO for some m ≥ 1} ⊂ O
O′ = {x ∈ K : xIp ⊂ Ip}.
Note that to give an algorithm one must also figure out how to explicitly compute
Ip /pIp and the kernel of this map (see the next section for more details).
5. Use the Euclidean algorithm in Fp [X] to find U1 (X) and U2 (X) such that
U1 m1 + U2 m2 = 1.
U1 m1 U1 m1 + U2 m2 U1 m1 = U1 m1 ,
A ≅ Ker(1 − ε) ⊕ Ker(ε).
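The role of the Bézout relation U1 m1 + U2 m2 = 1 can be seen in a simpler setting. The sketch below is an analogy in plain Python, not the algorithm itself: it replaces the coprime polynomials m1, m2 in Fp[X] by coprime integers and works in Z/(m1 m2)Z. The names ext_gcd and eps are ours. The element eps = U1 m1 comes out idempotent, exactly as in the displayed identity, and it is this idempotent that splits the ring as a direct sum.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = ext_gcd(b, a % b)
    return g, v, u - (a // b) * v

m1, m2 = 9, 20                  # coprime moduli (illustrative values)
g, u1, u2 = ext_gcd(m1, m2)     # u1*m1 + u2*m2 == 1
n = m1 * m2
eps = (u1 * m1) % n             # the candidate idempotent in Z/nZ

# eps^2 == eps because eps*(1 - eps) = (u1*m1)*(u2*m2) is divisible by n.
print(g, eps, (eps * eps) % n == eps)
```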
2. [Compute radical] Let I be the radical of pO, which is the ideal of elements
x ∈ O such that x^m ∈ pO for some positive integer m. Note that pO ⊂ I, i.e.,
I | pO; also I is the product of the primes that divide p, without multiplicity.
Using linear algebra over the finite field Fp , we compute a basis for I/pO
by computing the abelian subgroup of O/pO of all nilpotent elements. This
computes I, since pO ⊂ I.
A = O/I = (O/pO)/(I/pO).
The second equality comes from the fact that pO ⊂ I. Note that O/pO is
obtained by simply reducing the basis w1 , . . . , wn modulo p. Thus this step
entirely involves linear algebra modulo p.
5. [Compute the maximal ideals over p] Each maximal ideal pi lying over p is
the kernel of one of the compositions
O → A ≈ A1 × · · · × Ak → Ai .
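The radical computation in step 2 can be made concrete in a very small case. The following plain-Python sketch assumes O = Z[i] (our choice, for illustration only) and finds the nilpotent elements of O/pO by exhaustive search instead of linear algebra. Elements a + bi are stored as pairs (a, b); the function names are hypothetical.

```python
def mul(u, v, p):
    """Multiply a + b*i and c + d*i in Z[i]/pZ[i]; elements are pairs (a, b)."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def nilpotents(p):
    """Return the nilpotent elements of Z[i]/pZ[i] by exhaustive search."""
    found = []
    for a in range(p):
        for b in range(p):
            x = (a, b)
            # The ring has dimension 2 over F_p, so a nilpotent x has x^2 = 0.
            if mul(x, x, p) == (0, 0):
                found.append(x)
    return found

for p in (2, 3, 5):
    print(p, nilpotents(p))
```

For p = 2 the search finds the class of 1 + i, matching the fact that the radical of 2Z[i] is the ramified prime (1 + i). For p = 3 and p = 5 only 0 is nilpotent, so the radical is pO itself.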
Algorithm 4.3.9 finds all primes of O that contain the radical I of pO. Every
such prime clearly contains p, so to see that the algorithm is correct, we prove that
every prime ideal 𝔭 of O that contains p also contains I. If 𝔭 is a prime of O that
contains p, then pO ⊂ 𝔭. If x ∈ I then x^m ∈ pO for some m, so x^m ∈ 𝔭, which implies
that x ∈ 𝔭 by the primality of 𝔭. Thus 𝔭 contains I, as required. Note that we do not
find the powers of primes that divide p in Algorithm 4.3.9; that is left to another
algorithm that we will not discuss in this book.
Algorithm 4.3.9 was invented by J. Buchmann and H. W. Lenstra, though their
paper seems to have never been published; however, the algorithm is described in
detail in [Coh93, §6.2.5]. Incidentally, this chapter is based on Chapters 4 and 6 of
[Coh93], which is highly recommended, and goes into much more detail about these
algorithms.
Chapter 5
In this chapter, we prove the Chinese Remainder Theorem (CRT) for arbitrary com-
mutative rings, then apply CRT to prove that every ideal in a Dedekind domain R is
generated by at most two elements. We also prove that pn /pn+1 is (noncanonically)
isomorphic to R/p as an R-module, for any nonzero prime ideal p of R. The tools
we develop in this chapter will be used frequently to prove other results later.
64 CHAPTER 5. THE CHINESE REMAINDER THEOREM
Before proving the CRT in more generality, we prove (5.1.1). There is a natural
map
φ : Z → (Z/n1 Z) ⊕ · · · ⊕ (Z/nr Z)
given by projection onto each factor. Its kernel is
n1 Z ∩ · · · ∩ nr Z.
n1 Z ∩ · · · ∩ nr Z = n1 · · · nr Z.
This is half of the CRT; the other half is to prove that this map is surjective. In
this case, it is clear that i is also surjective, because i is an injective map between
finite sets of the same cardinality. We will, however, give a proof of surjectivity that
doesn’t use finiteness of the above two sets.
To prove surjectivity of i, note that since the ni are coprime in pairs,
gcd(n1 , n2 · · · nr ) = 1,
xn1 + yn2 · · · nr = 1.
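The surjectivity argument is constructive: the relation x n1 + y n2 · · · nr = 1 produces an integer that is 1 modulo n1 and 0 modulo the other moduli. A minimal plain-Python sketch of the resulting algorithm follows. The function name crt is ours, and the code assumes Python 3.8 or later (for math.prod and for pow with exponent −1).

```python
from math import prod

def crt(residues, moduli):
    """Return the unique c modulo prod(moduli) with c = residues[k] mod moduli[k].

    Assumes the moduli are pairwise coprime.
    """
    N = prod(moduli)
    c = 0
    for a, n in zip(residues, moduli):
        rest = N // n          # product of the other moduli, coprime to n
        y = pow(rest, -1, n)   # y*rest = 1 (mod n), i.e. x*n + y*rest = 1 for some x
        c += a * y * rest      # congruent to a mod n and to 0 mod the other moduli
    return c % N

print(crt([2, 3, 2], [3, 5, 7]))
```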
c = c · 1 = c · (x + y) = cx + cy ∈ IJ + IJ = IJ,
Proof. In the special case of a Dedekind domain, we could easily prove this lemma
using unique factorization of ideals as products of primes (Theorem 3.2.6); instead,
we give a direct general argument.
It suffices to prove the lemma in the case s = 3, since the general case then follows
from induction. By assumption, there are x1 ∈ I1, y2 ∈ I2 and a1 ∈ I1, b3 ∈ I3 such that
x1 + y2 = 1 and a1 + b3 = 1.
Multiplying these two relations yields
x1 a1 + x1 b3 + y2 a1 + y2 b3 = 1 · 1 = 1.
The first three terms are in I1 and the last term is in I2 I3 = I2 ∩I3 (by Lemma 5.1.2),
so I1 is coprime to I2 I3 .
Next we prove the general Chinese Remainder Theorem. We will apply this
result with R = OK in the rest of this chapter.
Thus given any an ∈ R, for n = 1, ..., r, there exists some a ∈ R such that a ≡ an
(mod In) for n = 1, ..., r; moreover, a is unique modulo ∏_{n=1}^r In.
Proof. Let ϕ : R → ⊕_{n=1}^r R/In be the natural map induced by reduction modulo
the In. An inductive application of Lemma 5.1.2 implies that the kernel ∩_{n=1}^r In
of ϕ is equal to ∏_{n=1}^r In, so the map ψ of the theorem is injective.
Each projection R → R/In is surjective, so to prove that ψ is surjective, it
suffices to show that (1, 0, ..., 0) is in the image of ϕ, and similarly for the other
factors. By Lemma 5.1.3, J = ∏_{n=2}^r In is coprime to I1, so there exist x ∈ I1 and
y ∈ J such that x + y = 1. Then y = 1 − x maps to 1 in R/I1 and to 0 in R/J,
hence to 0 in R/In for each n ≥ 2, since J ⊂ In.
Lemma 5.2.2. If I and J are nonzero integral ideals in R, then there exists an
a ∈ I such that the integral ideal (a)I^{-1} is coprime to J.
Before we give the proof in general, note that the lemma is trivial when I is
principal, since if I = (b), just take a = b, and then (a)I^{-1} = (a)(a^{-1}) = (1) is
coprime to every ideal.
Proof. Let p_1, ..., p_r be the prime divisors of J. For each n, let v_n be the largest
power of p_n that divides I. Since p_n^{v_n} ≠ p_n^{v_n+1}, we can choose an element
a_n ∈ p_n^{v_n} that is not in p_n^{v_n+1}. By Theorem 5.1.4 applied to the r + 1 coprime
integral ideals
\[ p_1^{v_1+1}, \; \dots, \; p_r^{v_r+1}, \; I \cdot \Bigl(\prod_{n=1}^{r} p_n^{v_n}\Bigr)^{-1}, \]
there exists a ∈ R such that
\[ a \equiv a_n \pmod{p_n^{v_n+1}} \]
for all n = 1, ..., r, and also a ≡ 0 modulo the last of these ideals.
To complete the proof we show that (a)I^{-1} is not divisible by any p_n, or
equivalently, that each p_n^{v_n} exactly divides (a). First we show that p_n^{v_n} divides (a).
Because a ≡ a_n (mod p_n^{v_n+1}), there exists b ∈ p_n^{v_n+1} such that a = a_n + b. Since
a_n ∈ p_n^{v_n} and b ∈ p_n^{v_n+1} ⊂ p_n^{v_n}, it follows that a ∈ p_n^{v_n}, so p_n^{v_n} divides (a).
Now assume for the sake of contradiction that p_n^{v_n+1} divides (a); then
a_n = a − b ∈ p_n^{v_n+1}, which contradicts that we chose a_n ∉ p_n^{v_n+1}. Thus p_n^{v_n+1}
does not divide (a), as claimed.
We can also use Theorem 5.1.4 to determine the R-module structure of p^n/p^{n+1}.
Proposition 5.2.4. Let p be a nonzero prime ideal of R, and let n ≥ 0 be an
integer. Then p^n/p^{n+1} ≅ R/p as R-modules.
Proof 1. Since p^n ≠ p^{n+1}, by unique factorization, there is an element b ∈ p^n
such that b ∉ p^{n+1}. Let ϕ : R → p^n/p^{n+1} be the R-module morphism defined
by ϕ(a) = ab. The kernel of ϕ is p since clearly ϕ(p) = 0 and if ϕ(a) = 0 then
ab ∈ p^{n+1}, so p^{n+1} | (a)(b), so p | (a), since p^{n+1} does not divide (b). Thus ϕ induces
an injective R-module homomorphism R/p ↪ p^n/p^{n+1}.
It remains to show that ϕ is surjective, and this is where we will use
Theorem 5.1.4. Suppose c ∈ p^n. By Theorem 5.1.4 there exists d ∈ R such that
\[ d \equiv c \pmod{p^{n+1}} \quad\text{and}\quad d \equiv 0 \pmod{(b)/p^{n}}. \]
We have p^n | (d) since d ∈ p^n and (b)/p^n | (d) by the second displayed condition, so
since p ∤ (b)/p^n, we have (b) = p^n · (b)/p^n | (d), hence d/b ∈ R. Finally
\[ \varphi\Bigl(\frac{d}{b}\Bigr) \equiv \frac{d}{b}\cdot b \equiv d \equiv c \pmod{p^{n+1}}, \]
so ϕ is surjective.
Exercise 5.2.5. (See [Mar77, Thm. 22(a)]) Let R be a Dedekind domain and p a
nonzero prime ideal in R. Show that #(R/p^m) = #(R/p)^m.
Note: #(R/p) is not finite in general! For example, the ring of formal power
series k[[t]] for some field k is a Dedekind domain and the residue field at the prime
(t) is k.
[Hint: Consider the exact sequence
Remark 5.2.6. There is one special case of the previous exercise that you probably
have seen before: the size of Z/4Z is the same as (Z/2Z)^2. In fact you might have
seen a proof of the fact that Z/n^m Z has the same cardinality as (Z/nZ)^m in a
standard group theory or abstract algebra course.
5.3.1 Sage
[[TODO]]
5.3.2 Magma
The Magma command ChineseRemainderTheorem implements the algorithm
suggested by Theorem 5.1.4. In the following example, we compute a prime over (3)
and a prime over (5) of the ring of integers of Q(∛2), and find an element of OK
that is congruent to ∛2 modulo one prime and 1 modulo the other.
> J;
Prime Ideal of OK
Two element generators:
[5, 0, 0]
[7, 1, 0]
> b := ChineseRemainderTheorem(I, J, OK!a, OK!1);
> K!b;
-4
> b - a in I;
true
> b - 1 in J;
true
5.3.3 PARI
There is also a CRT algorithm for number fields in PARI, but it is more cumbersome
to use. First we define Q(∛2) and factor the ideals (3) and (5).
? f = x^3 - 2;
? k = nfinit(f);
? i = idealfactor(k,3);
? j = idealfactor(k,5);
Next we form a matrix whose rows correspond to a product of two primes, one
dividing 3 and one dividing 5:
? m = matrix(2,2);
? m[1,] = i[1,];
? m[1,2] = 1;
? m[2,] = j[1,];
Note that we set m[1,2] = 1, so the exponent is 1 instead of 3. We apply the CRT
to obtain a lift in terms of the basis for OK .
? ?idealchinese
idealchinese(nf,x,y): x being a prime ideal factorization and y
a vector of elements, gives an element b such that
v_p(b-y_p)>=v_p(x) for all prime ideals p dividing x,
and v_p(b)>=0 for all other p.
? idealchinese(k, m, [x,1])
[0, 0, -1]~
? nfbasis(f)
[1, x, x^2]
Thus PARI finds the lift −(∛2)^2, and we finish by verifying that this lift is correct.
I couldn’t figure out how to test for ideal membership in PARI, so here we just
check that the prime ideal plus the element is not the unit ideal, which since the
ideal is prime, implies membership.
K = QQ[2^(1/3)]; K
K.complex_embeddings()
Let σ : K ↪ C^n be the map a ↦ (σ1(a), ..., σn(a)), and let V = Rσ(K) be the
R-span of the image σ(K) of K inside C^n.
72 CHAPTER 6. DISCRIMINANTS AND NORMS
XH = {v ∈ L : max{|v1 |, . . . , |vn |} ≤ H}
is finite.
Proof. If L is not discrete, then there is a point x ∈ L such that for every ε > 0
there is y ∈ L such that 0 < |x − y| < ε. By choosing smaller and smaller ε, we
find infinitely many elements x − y ∈ L all of whose coordinates are smaller than 1.
The set X1 is thus not finite. Thus if the sets XH are all finite, L must be discrete.
Next assume that L is discrete and let H > 0 be any positive number. Then
for every x ∈ XH there is an open ball Bx that contains x but no other element
of L. Since XH is closed and bounded, the Heine-Borel theorem implies that XH is
compact, so the open covering ∪Bx of XH has a finite subcover, which implies that
XH is finite, as claimed.
Proof. Let x1 , . . . , xm ∈ L be an R-vector space basis for RL, and consider the
Z-submodule M = Zx1 + · · · + Zxm of L. If the quotient L/M is infinite, then
there are infinitely many distinct elements of L that all lie in a fundamental domain
for M , so Lemma 6.1.2 implies that L is not discrete. This is a contradiction, so
L/M is finite, and the rank of L is m = dim(RL), as claimed.
Proposition 6.1.4. The R-vector space V = Rσ(K) spanned by the image σ(K)
of K has dimension n.
Proof. We prove this by showing that the image σ(OK ) is discrete. If σ(OK ) were
not discrete it would contain elements all of whose coordinates are simultaneously
arbitrarily small. The norm of an element a ∈ OK is the product of the entries of
σ(a), so the norms of nonzero elements of OK would go to 0. This is a contradiction,
since the norms of nonzero elements of OK are nonzero integers.
Since σ(OK ) is discrete in Cn , Lemma 6.1.3 implies that dim(V ) equals the
rank of σ(OK ). Since σ is injective, dim(V ) is the rank of OK , which equals n by
Proposition 2.4.5.
6.1.1 A Determinant
Suppose w1 , . . . , wn is a basis for OK , and let A be the matrix whose ith row is
σ(wi ). Consider the determinant det(A).
Example 6.1.5. The ring OK = Z[i] of integers of K = Q(i) has Z-basis w1 = 1,
w2 = i. The map σ : K → C2 is given by
The image σ(OK) is spanned by (1, 1) and (i, −i). The determinant is
\[ \begin{vmatrix} 1 & 1 \\ i & -i \end{vmatrix} = -2i. \]
Let OK = Z[√2] be the ring of integers of K = Q(√2). The map σ is
\[ \sigma(a + b\sqrt{2}) = (a + b\sqrt{2},\; a - b\sqrt{2}) \in \mathbf{R}^2, \]
and
\[ A = \begin{pmatrix} 1 & 1 \\ \sqrt{2} & -\sqrt{2} \end{pmatrix}, \]
which has determinant −2√2.
As the above example illustrates, the determinant det(A) most certainly need
not be an integer. However, as we will see, its square is an integer that does not
depend on our choice of basis for OK.
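The two determinants in Example 6.1.5, and the claim that their squares are integers, can be checked numerically. The following plain-Python sketch uses floating-point complex arithmetic, so equalities hold only up to rounding; the helper det2 is ours.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

# Z[i]: rows are sigma(1) = (1, 1) and sigma(i) = (i, -i).
A_gauss = [[1, 1], [1j, -1j]]

# Z[sqrt(2)]: rows are sigma(1) = (1, 1) and sigma(sqrt 2) = (sqrt 2, -sqrt 2).
r = math.sqrt(2)
A_sqrt2 = [[1, 1], [r, -r]]

d1 = det2(A_gauss)   # equals -2i
d2 = det2(A_sqrt2)   # equals -2*sqrt(2)
print(d1, d1 ** 2)   # the square is -4 up to rounding
print(d2, d2 ** 2)   # the square is 8 up to rounding
```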
6.2 Discriminants
Suppose w1 , . . . , wn is a basis for OK as a Z-module, which we view as a Q-vector
space. Let σ : K ,→ Cn be the embedding σ(a) = (σ1 (a), . . . , σn (a)), where
σ1 , . . . , σn are the distinct embeddings of K into C. Let A be the matrix whose
rows are σ(w1 ), . . . , σ(wn ).
Changing our choice of basis for OK is the same as left multiplying A by an
integer matrix U of determinant ±1, which multiplies det(A) by ±1. This leads us to
consider det(A)^2 instead, which does not depend on the choice of basis; moreover,
as we will see, det(A)^2 is an integer. Note that
\[ \det(A)^2 = \det(AA^{t}) = \det\Bigl(\sum_{k=1}^{n} \sigma_k(w_i)\sigma_k(w_j)\Bigr)_{1\le i,j\le n} = \det\bigl(\operatorname{Tr}(w_i w_j)_{1\le i,j\le n}\bigr), \]
so det(A)^2 can be defined purely in terms of the trace without mentioning the
embeddings σi. Moreover, if we change basis, hence multiplying A by some U with
determinant ±1, then det(UA)^2 = det(U)^2 det(A)^2 = det(A)^2. Because det(A) is
an algebraic integer and Tr(wi wj) ∈ Q, it follows that det(A)^2 is an algebraic integer
in Q. Thus det(A)^2 ∈ Z is well defined as a quantity associated to OK.
If we view K as a Q-vector space, then (x, y) ↦ Tr(xy) defines a bilinear pairing
K × K → Q on K, which we call the trace pairing. The following lemma asserts
that this pairing is nondegenerate, so det(Tr(wi wj)) ≠ 0 hence det(A) ≠ 0.
Proof. If the trace pairing is degenerate, then there exists 0 ≠ a ∈ K such that
for every b ∈ K we have Tr(ab) = 0. In particular, taking b = a^{-1} we see that
0 = Tr(aa^{-1}) = Tr(1) = [K : Q] > 0, which is absurd.
This also works for orders (notice the square factor below, which will be explained
by Proposition 6.2.6):
2^2 * 5 * 7^2
This is an intentional choice done for efficiency reasons, since computing the maxi-
mal order can take a long time. Nonetheless, it conflicts with standard mathematical
usage, so beware.
The following proposition asserts that the discriminant of an order O in OK is
bigger than disc(OK ) by a factor of the square of the index.
Proposition 6.2.6. Suppose O is an order in OK. Then
Disc(O) = Disc(OK) · [OK : O]^2.
Proof. Let A be a matrix whose rows are the images via σ of a basis for OK,
and let B be a matrix whose rows are the images via σ of a basis for O. Since
O ⊂ OK has finite index, there is an integer matrix C such that CA = B, and
|det(C)| = [OK : O]. Then
Disc(O) = det(B)^2 = det(CA)^2 = det(C)^2 det(A)^2 = [OK : O]^2 · Disc(OK).
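A small case makes the proposition tangible. The sketch below, in plain Python with exact integers, takes K = Q(√5), where OK = Z[g] with g = (1 + √5)/2, and the order O = Z[√5]; this choice of field and order is ours, for illustration. It works with the trace-pairing matrices A·A^t and B·B^t rather than with A and B themselves, so everything stays in Z.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Trace pairing of O_K on the basis 1, g:
# Tr(1) = 2, Tr(g) = 1, Tr(g^2) = Tr(g + 1) = 3.
gram_max = [[2, 1], [1, 3]]

# Basis of O in terms of the basis of O_K: 1 = 1, sqrt(5) = -1 + 2g.
C = [[1, 0], [-1, 2]]

# B = C*A gives B*B^t = C*(A*A^t)*C^t.
gram_order = matmul(matmul(C, gram_max), transpose(C))
index = abs(det2(C))
print(det2(gram_max), index, det2(gram_order))
```

The resulting matrix is the trace pairing on 1, √5, and its determinant is the discriminant of OK times the square of the index.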
-1 * 503
hence [OK : O] = 2.
Example 6.2.9. Consider the cubic field K = Q(∛2), and let O be the order Z[∛2].
Relative to the basis 1, ∛2, (∛2)^2 for O, the matrix of the trace pairing is
\[ A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 0 & 6 \\ 0 & 6 & 0 \end{pmatrix}. \]
Thus
disc(O) = det(A) = −108 = −2^2 · 3^3.
Suppose we do not know that the ring of integers OK is equal to O. By
Proposition 6.2.6, we have
Disc(OK) · [OK : O]^2 = −2^2 · 3^3,
so 3 | disc(OK), and [OK : O] | 6. Thus to prove O = OK it suffices to prove
that O is 2-maximal and 3-maximal, which could be accomplished as described in
Section 4.3.3.
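The trace-pairing matrix in this example can be recomputed from scratch. The following plain-Python sketch represents elements of Z[t]/(t^3 − 2) as coefficient triples and computes traces as traces of multiplication maps; all function names are ours. Expanding the determinant along the first row gives 3 · (0 · 0 − 6 · 6) = −108, so the sign is negative.

```python
def mul(u, v):
    """Multiply two elements of Z[t]/(t^3 - 2), given as coefficient triples."""
    c = [0] * 5
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            c[i + j] += a * b
    # Reduce using t^3 = 2 and t^4 = 2t.
    return (c[0] + 2 * c[3], c[1] + 2 * c[4], c[2])

BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # 1, t, t^2

def trace(u):
    """Trace of multiplication by u, read off the diagonal of its matrix."""
    return sum(mul(u, e)[k] for k, e in enumerate(BASIS))

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

gram = [[trace(mul(x, y)) for y in BASIS] for x in BASIS]
print(gram, det3(gram))
```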
• If M ⊂ L, then [L : M ] = #(L/M ).
• If M, L, N are any lattices in V , then
[L : N ] = [L : M ] · [M : N ].
J = K.fractional_ideal(17)
J.norm()
289
1445
We will use the following proposition in the next chapter when we prove finite-
ness of class groups.
Proposition 6.3.6. Fix a number field K. Let B be a positive integer. There are
only finitely many integral ideals I of OK with norm at most B.
Note that if we let Div(OK) denote the group of fractional ideals, then we have
an exact sequence
0 → OK^* → K^* → Div(OK) → CK → 0.
That the class group CK is finite follows from the first part of the following theo-
rem and that there are only finitely many ideals of norm less than a given integer
(Proposition 6.3.6).
80 CHAPTER 7. FINITENESS OF THE CLASS GROUP
Theorem 7.1.2 (Finiteness of the Class Group). Let K be a number field. There
is a constant C_{r,s} that depends only on the numbers r, s of real and pairs of complex
conjugate embeddings of K such that every ideal class of OK contains an integral
ideal of norm at most C_{r,s} · √|dK|, where dK = Disc(OK). Thus by Proposition 6.3.6
the class group CK of K is finite. In fact, one can take
\[ C_{r,s} = \Bigl(\frac{4}{\pi}\Bigr)^{s} \frac{n!}{n^{n}}. \]
The explicit bound in the theorem
\[ M_K = \Bigl(\frac{4}{\pi}\Bigr)^{s} \frac{n!}{n^{n}} \cdot \sqrt{|d_K|} \]
is called the Minkowski bound. There are other better bounds, but they depend on
unproven conjectures.
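The Minkowski bound is simple enough to evaluate directly. The following plain-Python sketch implements the formula above; the function name minkowski_bound and its argument order are ours, not Sage's API. Here n is the degree of K, s the number of pairs of complex conjugate embeddings, and disc the discriminant dK.

```python
from math import factorial, pi, sqrt

def minkowski_bound(n, s, disc):
    """Evaluate (4/pi)^s * n!/n^n * sqrt(|disc|) in floating point."""
    return (4 / pi) ** s * factorial(n) / n ** n * sqrt(abs(disc))

print(minkowski_bound(2, 1, -4))    # Q(i): equals 4/pi
print(minkowski_bound(2, 0, 40))    # Q(sqrt(10)): equals sqrt(10)
```

Since the result is a float, it should be compared against integers with a small safety margin when it lands close to one.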
The following two examples illustrate how to apply Theorem 7.1.2 to compute
CK in simple cases.
Example 7.1.3. Let K = Q[i]. Then n = 2, s = 1, and |dK| = 4, so the Minkowski
bound is
\[ \sqrt{4} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{1} \cdot \frac{2!}{2^{2}} = \frac{4}{\pi} < 2. \]
Thus every fractional ideal is equivalent to an ideal of norm 1. Since (1) is the only
ideal of norm 1, every ideal is principal, so CK is trivial.
Example 7.1.4. Let K = Q(√10). We have OK = Z[√10], so n = 2, s = 0,
|dK| = 40, and the Minkowski bound is
\[ \sqrt{40} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{0} \cdot \frac{2!}{2^{2}} = 2 \cdot \sqrt{10} \cdot \frac{1}{2} = \sqrt{10} = 3.162277\ldots. \]
We compute the Minkowski bound in Sage as follows:
K = QQ[sqrt(10)]; K
B = K.minkowski_bound(); B
sqrt(10)
B.n()
3.16227766016838
Theorem 7.1.2 implies that every ideal class has a representative that is an integral
ideal of norm 1, 2, or 3. The ideal 2OK is ramified in OK , so
2OK = (2, √10)^2.
7.1. THE CLASS GROUP 81
If (2, √10) were principal, say (α), then α = a + b√10 would have norm ±2. Then
the equation
x^2 − 10y^2 = ±2, (7.1.1)
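One standard way to see that (7.1.1) has no integer solutions is to reduce modulo 5, where the equation becomes x^2 ≡ ±2. The following plain-Python check is our own illustration of that reduction; it lists the squares modulo 5 and confirms that neither 2 nor −2 is among them.

```python
# Squares modulo 5, and the residues of +2 and -2 modulo 5.
squares_mod_5 = sorted({(x * x) % 5 for x in range(5)})
targets = sorted({2 % 5, -2 % 5})

# If x^2 - 10*y^2 = +-2 had an integer solution, then x^2 would be
# congruent to +-2 modulo 5, because 10*y^2 vanishes modulo 5.
has_solution_mod_5 = any(t in squares_mod_5 for t in targets)
print(squares_mod_5, targets, has_solution_mod_5)
```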
G = K.class_group(); G
G.0
G.0^2
G.0 == G((3, 2 + sqrt10))
True
Before proving Theorem 7.1.2, we prove a few lemmas. The strategy of the
proof is to start with any nonzero ideal I, and prove that there is some nonzero
a ∈ K having very small norm, such that aI is an integral ideal. Then Norm(aI) =
NormK/Q (a) Norm(I) will be small, since NormK/Q (a) is small. The trick is to
determine precisely how small an a we can choose subject to the condition that aI
is an integral ideal, i.e., that a ∈ I −1 .
Let S be a subset of V = R^n. Then S is convex if whenever x, y ∈ S then the
line segment connecting x and y lies entirely in S. We say that S is symmetric about
the origin if whenever x ∈ S then −x ∈ S also. If L is a lattice in the real vector space
V = R^n, then the volume of V/L is the volume of the compact real manifold V/L,
which is the same thing as the absolute value of the determinant of any matrix
whose rows form a basis for L.
Lemma 7.1.5 (Blichfeldt). Let L be a lattice in V = R^n, and let S be a bounded
closed convex subset of V that is symmetric about the origin. If Vol(S) ≥ 2^n Vol(V/L),
then S contains a nonzero element of L.
Proof. First assume that Vol(S) > 2^n Vol(V/L). If the map π : (1/2)S → V/L is
injective, then
\[ \frac{1}{2^{n}} \operatorname{Vol}(S) = \operatorname{Vol}\Bigl(\frac{1}{2}S\Bigr) \le \operatorname{Vol}(V/L), \]
a contradiction. Thus π is not injective, so there exist P1 ≠ P2 ∈ (1/2)S such that
P1 − P2 ∈ L. Because S is symmetric about the origin, −P2 ∈ (1/2)S. By convexity,
the average (1/2)(P1 − P2) of P1 and −P2 is also in (1/2)S. Thus 0 ≠ P1 − P2 ∈ S ∩ L, as
claimed.
Next assume that Vol(S) = 2^n · Vol(V/L). Then for all ε > 0 there is 0 ≠ Qε ∈
L ∩ (1 + ε)S, since Vol((1 + ε)S) > Vol(S) = 2^n · Vol(V/L). If ε < 1 then the Qε
are all in L ∩ 2S, which is finite since 2S is bounded and L is discrete. Hence there
exists nonzero Q = Qε ∈ L ∩ (1 + ε)S for arbitrarily small ε. Since S is closed,
Q ∈ L ∩ S.
Lemma 7.1.6. If L1 and L2 are lattices in V , then
Vol(V /L2 ) = Vol(V /L1 ) · [L1 : L2 ].
Proof. Let A be an automorphism of V such that A(L1 ) = L2 . Then A defines an
isomorphism of real manifolds V /L1 → V /L2 that changes volume by a factor of
|det(A)| = [L1 : L2 ]. The claimed formula then follows, since [L1 : L2 ] = |det(A)|,
by definition.
Fix a number field K with ring of integers OK . Let σ1 , . . . , σr be the real
embeddings of K and σr+1 , . . . , σr+s be half the complex embeddings of K, with one
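representative of each pair of complex conjugate embeddings. Let σ : K → V = R^n
be the embedding
\[ \sigma(x) = \bigl(\sigma_1(x), \sigma_2(x), \dots, \sigma_r(x), \operatorname{Re}(\sigma_{r+1}(x)), \dots, \operatorname{Re}(\sigma_{r+s}(x)), \operatorname{Im}(\sigma_{r+1}(x)), \dots, \operatorname{Im}(\sigma_{r+s}(x))\bigr), \]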
Warning 7.1.7. Note that this σ is not exactly the same as the one at the beginning
of Section 6.2 if s > 0.
(σ1 (wi ), · · · , σr (wi ), Re(σr+1 (wi )), . . . , Re(σr+s (wi )), Im(σr+1 (wi )), . . . , Im(σr+s (wi )))
and whose determinant has absolute value equal to the volume of V /L. By doing
the following three column operations, we obtain a matrix whose rows are exactly
the images of the wi under all embeddings of K into C, which is the matrix that
came up when we defined dK = Disc(OK ) in Section 6.2.
1. Add i = √−1 times each column with entries Im(σ_{r+j}(w_i)) to the column
with entries Re(σ_{r+j}(w_i)).
2. Multiply all columns with entries Im(σ_{r+j}(w_i)) by −2i, thus changing the
determinant by (−2i)^s.
3. Add each column that now has entries Re(σ_{r+j}(w_i)) + i Im(σ_{r+j}(w_i)) to the
column with entries −2i Im(σ_{r+j}(w_i)) to obtain columns Re(σ_{r+j}(w_i)) −
i Im(σ_{r+j}(w_i)).
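The three column operations can be traced on the smallest example with s > 0. The sketch below, in plain Python, assumes K = Q(i) with basis w1 = 1, w2 = i (so r = 0 and s = 1); the variable names are ours. It starts from the real matrix with rows (Re σ(w), Im σ(w)) and ends at the matrix with rows (σ(w), conjugate of σ(w)) from Example 6.1.5.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Rows (Re(sigma(w)), Im(sigma(w))) for w = 1 and w = i.
B = [[1, 0], [0, 1]]
M = [[complex(x) for x in row] for row in B]

for row in M:            # step 1: add i times the Im column to the Re column
    row[0] += 1j * row[1]
for row in M:            # step 2: multiply the Im column by -2i
    row[1] *= -2j
for row in M:            # step 3: add the new first column to the second
    row[1] += row[0]

# M now has rows (1, 1) and (i, -i); only step 2 changed the determinant.
print(M, det2(M))
```

The determinant picks up the factor (−2i)^s = −2i, so the real determinant has absolute value 2^(−s) times that of the complex one.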
Proof. Since σ(OK) has rank n as an abelian group, and Lemma 7.1.8 implies that
σ(OK) also spans V, it follows that σ(OK) is a lattice in V. For some nonzero
integer m we have mOK ⊂ I ⊂ (1/m)OK, so σ(I) is also a lattice in V. To prove the
displayed volume formula, combine Lemmas 7.1.6 and 7.1.8 to get
Proof of Theorem 7.1.2. Let K be a number field with ring of integers OK, let
σ : K ↪ V ≅ R^n be as above, and let f : V → R be the function defined by
Let S ⊂ V be any fixed choice of closed, bounded, convex, subset with positive
volume that is symmetric with respect to the origin. Since S is closed and bounded,
M = max{f (x) : x ∈ S}
exists.
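Suppose I is any fractional ideal of OK. Our goal is to prove that there is
an integral ideal aI with small norm. We will do this by finding an appropriate
a ∈ I^{-1}. By Lemma 7.1.9,
\[ c = \operatorname{Vol}(V/\sigma(I^{-1})) = 2^{-s}\sqrt{|d_K|} \cdot \operatorname{Norm}(I)^{-1} = \frac{2^{-s}\sqrt{|d_K|}}{\operatorname{Norm}(I)}. \]
Let λ = 2 · (c/v)^{1/n}, where v = Vol(S). Then
\[ \operatorname{Vol}(\lambda S) = \lambda^{n} \operatorname{Vol}(S) = 2^{n} \cdot \frac{c}{v} \cdot v = 2^{n} \cdot c = 2^{n} \operatorname{Vol}(V/\sigma(I^{-1})), \]
so by Lemma 7.1.5 there exists 0 ≠ b ∈ σ(I^{-1}) ∩ λS. Let a ∈ I^{-1} be such that
σ(a) = b. Since M is the largest norm of an element of S, the largest norm of an
element of σ(I^{-1}) ∩ λS is at most λ^n M, so
\[ \operatorname{Norm}_{K/\mathbf{Q}}(a) \le \lambda^{n} M. \]
Since a ∈ I^{-1}, the ideal aI is integral, and using n = r + 2s we get
\[ \operatorname{Norm}(aI) \le \lambda^{n} M \cdot \operatorname{Norm}(I) = 2^{n} \cdot \frac{c}{v} \cdot M \cdot \operatorname{Norm}(I) = 2^{r+s} \sqrt{|d_K|} \cdot M \cdot v^{-1}. \]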
Notice that the right hand side is independent of I. It depends only on r, s, |dK |, and
our choice of S. This completes the proof of the theorem, except for the assertion
that S can be chosen to give the claim at the end of the theorem which is shown in
Exercise 7.1.10.
7.2. CLASS NUMBER 1 85
Exercise 7.1.10. Show that in the proof of Theorem 7.1.2, S can be chosen so
that the final bound matches the statement of the theorem. This means S can be
chosen so that
\[ \operatorname{Norm}(aI) \le \Bigl(\frac{4}{\pi}\Bigr)^{s} \frac{n!}{n^{n}} \sqrt{|d_K|}. \]
[Hint: Consider the subset S of R^n defined by
\[ |x_1| + \cdots + |x_r| + 2\sqrt{x_{r+1}^{2} + x_{(r+1)+s}^{2}} + \cdots + 2\sqrt{x_{r+s}^{2} + x_{(r+s)+s}^{2}} \le 1. \]
Suppose a ∈ OK such that σ(a) ∈ S. What can you say about Norm_{K/Q}(a)? What
is Vol(S)? ]
Proof. Applying Theorem 7.1.2 to the unit ideal, we get the bound
\[ 1 \le \sqrt{|d_K|} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{s} \frac{n!}{n^{n}}. \]
Thus
\[ \sqrt{|d_K|} \ge \Bigl(\frac{\pi}{4}\Bigr)^{s} \frac{n^{n}}{n!}, \]
and the right hand quantity is strictly bigger than 1 for any s ≤ n/2 and any n > 1,
see Exercise 7.1.12.
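The final inequality is proved in the exercise, but it is easy to spot-check numerically. The following plain-Python sketch evaluates the right hand quantity for small degrees; the function name lower_bound and the cutoff n ≤ 30 are ours, and a finite check of this kind is evidence, not a proof.

```python
from math import factorial, pi

def lower_bound(n, s):
    """Evaluate (pi/4)^s * n^n / n! in floating point."""
    return (pi / 4) ** s * n ** n / factorial(n)

# Smallest value over all degrees 2 <= n <= 30 and all 0 <= s <= n/2.
worst = min(lower_bound(n, s)
            for n in range(2, 31)
            for s in range(0, n // 2 + 1))
print(worst, worst > 1)
```

In this range the minimum occurs at n = 2, s = 1, where the quantity is π/2.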
Conjecture 7.2.1. There are infinitely many number fields K such that the class
group of K has order 1.
For example, if we consider real quadratic fields K = Q(√d), with d positive and
square free, many class numbers are probably 1, as suggested by the Sage output
below. It looks like 1’s will keep appearing infinitely often, and indeed Cohen and
Lenstra conjecture that they do ([CL84]).
for d in [2..1000]:
    if is_fundamental_discriminant(d):
        h = QuadraticField(d, 'a').class_number()
        if h == 1:
            print(d, end=' ')
5 8 12 13 17 21 24 28 29 33 37 41 44 53 56 57 61 69
73 76 77 88 89 92 93 97 101 109 113 124 129 133 137
141 149 152 157 161 172 173 177 181 184 188 193 197
201 209 213 217 233 236 237 241 248 249 253 268 269
277 281 284 293 301 309 313 317 329 332 337 341 344
349 353 373 376 381 389 393 397 409 412 413 417 421
428 433 437 449 453 457 461 472 489 497 501 508 509
517 521 524 536 537 541 553 556 557 569 573 581 589
593 597 601 604 613 617 632 633 641 649 652 653 661
664 668 669 673 677 681 701 709 713 716 717 721 737
749 753 757 764 769 773 781 789 796 797 809 813 821
824 829 844 849 853 856 857 869 877 881 889 893 908
913 917 921 929 933 937 941 953 956 973 977 989 997
Lemma 7.3.1. Let K be a number field with ring of integers OK. Then the class
group Cl(K) is generated by the prime ideals 𝔭 of OK lying over primes p ∈ Z with
\[ p \le B_K = \sqrt{|d_K|} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{s} \cdot \frac{n!}{n^{n}}, \]
where s is the number of complex conjugate pairs of embeddings K ↪ C.
7.3. MORE ABOUT COMPUTING CLASS GROUPS 87
Proof. Theorem 7.1.2 asserts that every ideal class in Cl(K) is represented by an
ideal I with Norm(I) ≤ BK. Write I = ∏_{i=1}^m 𝔭_i^{e_i}, with each e_i ≥ 1. Then by
multiplicativity of the norm, each 𝔭_i also satisfies Norm(𝔭_i) ≤ BK. If 𝔭_i ∩ Z = pZ,
then p | Norm(𝔭_i), since p is the residue characteristic of OK/𝔭_i, so p ≤ BK. Thus I
is a product of primes 𝔭 that satisfy the norm bound of the lemma.
1. Use the algorithms of Chapter 4 to list all prime ideals p of OK that appear
in the factorization of a prime p ∈ Z with p ≤ BK .
2. Find the group generated by the ideal classes [p], where the p are the prime
ideals found in step 1. (In general, this step can become fairly complicated.)
The following three examples illustrate computation of Cl(K) for K = Q(i), Q(√5)
and Q(√−6).
Example 7.3.2. We compute the class group of K = Q(i). We have
n = 2, r = 0, s = 1, dK = −4,
so
\[ B_K = \sqrt{4} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{1} \cdot \frac{2!}{2^{2}} = \frac{4}{\pi} < 2. \]
In particular BK < 3, so Cl(K) is generated by the prime divisors of 2. We have
2OK = (1 + i)^2,
n = 2, r = 2, s = 0, dK = 5,
so
\[ B = \sqrt{5} \cdot \Bigl(\frac{4}{\pi}\Bigr)^{0} \cdot \frac{2!}{2^{2}} < 3. \]
Thus Cl(K) is generated by the primes that divide 2. We have OK = Z[γ], where
γ = (1 + √5)/2 satisfies x^2 − x − 1 = 0. The polynomial x^2 − x − 1 is irreducible mod 2, so
2OK is prime. Since it is principal, we see that Cl(K) = 1 is trivial.
Example 7.3.4. In this example, we compute the class group of K = Q(√−6). We have
n = 2, r = 0, s = 1, $d_K = -24$,
so
$$B = \sqrt{24} \cdot \frac{4}{\pi} \cdot \frac{2!}{2^2} \approx 3.1.$$
88 CHAPTER 7. FINITENESS OF THE CLASS GROUP
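Thus Cl(K) is generated by the prime ideals lying over 2 and 3. We have $\mathcal{O}_K = \mathbf{Z}[\sqrt{-6}]$, and $\sqrt{-6}$ satisfies $x^2 + 6 = 0$. Factoring $x^2 + 6$ modulo 2 and 3 we see that the class group is generated by the prime ideals
$$\mathfrak{p}_2 = (2, \sqrt{-6}) \quad\text{and}\quad \mathfrak{p}_3 = (3, \sqrt{-6}).$$
Also, $\mathfrak{p}_2^2 = 2\mathcal{O}_K$ and $\mathfrak{p}_3^2 = 3\mathcal{O}_K$, so $\mathfrak{p}_2$ and $\mathfrak{p}_3$ define elements of order dividing 2 in Cl(K).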
Is either $\mathfrak{p}_2$ or $\mathfrak{p}_3$ principal? Fortunately, there is an easy norm trick that allows us to decide. Suppose $\mathfrak{p}_2 = (\alpha)$, where $\alpha = a + b\sqrt{-6}$. Then
$$2 = \mathrm{Norm}(\mathfrak{p}_2) = |\mathrm{Norm}(\alpha)| = (a + b\sqrt{-6})(a - b\sqrt{-6}) = a^2 + 6b^2.$$
Trying the first few values of a, b ∈ Z, we see that this equation has no solutions, so $\mathfrak{p}_2$ cannot be principal. By a similar argument, we see that $\mathfrak{p}_3$ is not principal either. Thus $\mathfrak{p}_2$ and $\mathfrak{p}_3$ define elements of order 2 in Cl(K).
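The "trying the first few values" step can be made exhaustive, since $a^2 + 6b^2 = m$ forces $|b| \le \sqrt{m/6}$. The following plain-Python sketch (the helper name `norm_solutions` is chosen here for illustration) searches that finite range.

```python
from math import isqrt

def norm_solutions(m, d=6):
    """All integer pairs (a, b) with a^2 + d*b^2 == m, for m >= 0 and d > 0."""
    solutions = []
    b_max = isqrt(m // d)
    for b in range(-b_max, b_max + 1):
        rest = m - d * b * b
        a = isqrt(rest)
        if a * a == rest:
            solutions.extend(sorted({(-a, b), (a, b)}))
    return solutions

no_generator_p2 = norm_solutions(2)   # candidates for a generator of p_2
no_generator_p3 = norm_solutions(3)   # candidates for a generator of p_3
norm_six = norm_solutions(6)          # elements of norm 6
print(no_generator_p2, no_generator_p3, norm_six)
```

The first two searches come back empty, while the only elements of norm 6 are ±√−6.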
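Does the class of $\mathfrak{p}_2$ equal the class of $\mathfrak{p}_3$? Since $\mathfrak{p}_2$ and $\mathfrak{p}_3$ define classes of order 2, we can decide this by finding the class of $\mathfrak{p}_2 \cdot \mathfrak{p}_3$. We have
$$\mathfrak{p}_2 \cdot \mathfrak{p}_3 = (2, \sqrt{-6}) \cdot (3, \sqrt{-6}) = (6, 2\sqrt{-6}, 3\sqrt{-6}) \subset (\sqrt{-6}).$$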
In this chapter we will prove Dirichlet’s unit theorem, which is a structure theorem
for the group of units of the ring of integers of a number field. The answer is
remarkably simple: if K has r real and s pairs of complex conjugate embeddings,
then
$$\mathcal{O}_K^* \approx \mathbf{Z}^{r+s-1} \times T,$$
Theorem 8.1.2 (Dirichlet). The group UK is the product of a finite cyclic group
of roots of unity with a free abelian group of rank r + s − 1, where r is the number of
real embeddings of K and s is the number of complex conjugate pairs of embeddings.
(Note that we will prove a generalization of Theorem 8.1.2 in Section 12.1 below.)
We prove the theorem by defining a map $\varphi : U_K \to \mathbf{R}^{r+s}$, and showing that the kernel of ϕ is finite and the image of ϕ is a lattice in a hyperplane in $\mathbf{R}^{r+s}$. The trickiest part of the proof is showing that the image of ϕ spans a hyperplane, and we do this by a clever application of Blichfeldt's Lemma 7.1.5.
90 CHAPTER 8. DIRICHLET’S UNIT THEOREM
Remark 8.1.3. Theorem 8.1.2 is due to Dirichlet who lived 1805–1859. Thomas
Hirst described Dirichlet thus:
I think Koch’s observation nicely describes the proof we will give of Theorem 8.1.2.
Units have a simple characterization in terms of their norm.
Proof. Write $\mathrm{Norm} = \mathrm{Norm}_{K/\mathbf{Q}}$. If a is a unit, then $a^{-1}$ is also a unit, and $1 = \mathrm{Norm}(a)\,\mathrm{Norm}(a^{-1})$. Since both Norm(a) and $\mathrm{Norm}(a^{-1})$ are integers, it follows that Norm(a) = ±1. Conversely, if $a \in \mathcal{O}_K$ and Norm(a) = ±1, then the equation $a a^{-1} = 1 = \pm\mathrm{Norm}(a)$ implies that $a^{-1} = \pm\mathrm{Norm}(a)/a$. But Norm(a) is the product of the images of a in C by all embeddings of K into C, so Norm(a)/a is also a product of images of a in C, hence a product of algebraic integers, hence an algebraic integer. Thus $a^{-1} \in K \cap \overline{\mathbf{Z}} = \mathcal{O}_K$, which proves that a is a unit.
Let r be the number of real embeddings and s the number of pairs of complex conjugate embeddings of K into C, so n = [K : Q] = r + 2s. Define the log map
$$\varphi : U_K \to \mathbf{R}^{r+s}$$
by
$$\varphi(a) = (\log|\sigma_1(a)|, \ldots, \log|\sigma_{r+s}(a)|).$$
Here |z| is the usual absolute value of z = x + iy ∈ C (so $|z| = \sqrt{x^2 + y^2}$), and the maps $\sigma_i$ are the same as those described in Lemma 7.1.8. In particular, $\sigma_1, \ldots, \sigma_r$ represent all real embeddings K → R and $\sigma_{r+1}, \ldots, \sigma_{r+s}$ represent half of the complex embeddings K → C, with one representative for each pair of complex conjugate embeddings.
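To make the definition concrete, here is a small floating-point sketch of the log map for the real quadratic field K = Q(√5), where r = 2 and s = 0. The choice of field and the function signature are assumptions made here for illustration; the two real embeddings send √5 to ±√5.

```python
from math import log, sqrt

def phi(x, y):
    """Log map for a = x + y*sqrt(5): the two real embeddings send sqrt(5) to +/-sqrt(5)."""
    return [log(abs(x + y * sqrt(5))), log(abs(x - y * sqrt(5)))]

image_u = phi(0.5, 0.5)        # u = (1 + sqrt(5))/2, a unit of norm -1
image_minus_one = phi(-1, 0)   # -1 is a root of unity, so it lies in the kernel
print(image_u, sum(image_u))
```

The coordinates of `image_u` sum to zero up to rounding, reflecting that |Norm(u)| = 1, and the root of unity −1 maps to the zero vector.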
Proof. We have
where $\sigma : \mathcal{O}_K \to \mathbf{C}^{r+s}$ is given by $\sigma(a) = (\sigma_1(a), \ldots, \sigma_{r+s}(a))$ and X is the set $\{(z_1, \ldots, z_{r+s}) \in \mathbf{C}^{r+s} : |z_i| \le 1\}$. Since $\sigma(\mathcal{O}_K)$ is a lattice (see Proposition 2.4.5) and X is compact, the intersection $\sigma(\mathcal{O}_K) \cap X$ is finite. This implies Ker(ϕ) is finite.
Proof. Lemma 8.1.7 implies that ker(ϕ) is a finite group. It is a general fact that any finite subgroup G of the multiplicative group $K^*$ of a field is cyclic (see Exercise 8.1.9).
Exercise 8.1.9. Finish the proof of Lemma 8.1.8 by showing that for a field K, every finite subgroup G of the multiplicative group $K^*$ is cyclic.
[Hint: Every element in G satisfies a polynomial of the form $x^n - 1$. Recall that a polynomial of degree n over a field has at most n distinct roots. Now consider the orders of the elements of G. ]
To prove Theorem 8.1.2, it suffices to prove that Im(ϕ) is a lattice in the hyper-
plane H of (8.1.1), which we view as a vector space of dimension r + s − 1.
Define an embedding
$$\sigma : K \hookrightarrow \mathbf{R}^n \qquad (8.1.2)$$
given by $\sigma(x) = (\sigma_1(x), \ldots, \sigma_{r+s}(x))$, where we view C ≅ R × R via a + bi ↦ (a, b). Thus this is the embedding
and $d_1 \cdots d_n = A$.
Proof. Order the $w_i$ so that $w_1 \ne 0$. By hypothesis there exists a $w_j$ such that $w_j \ne w_1$, and again re-ordering we may assume that j = 2. Set $d_3 = \cdots = d_{r+s} = 1$. Suppose $d_1, d_2$ are any positive real numbers with $d_1 d_2 = A$. Since log(1) = 0,
$$\left|\sum_{i=1}^{n} w_i \log(d_i)\right| = |w_1 \log(d_1) + w_2 \log(d_2)| = |w_1 \log(d_1) + w_2 \log(A/d_1)| = |(w_1 - w_2)\log(d_1) + w_2 \log(A)|$$
Proof of Theorem 8.1.2. By Lemma 8.1.10, the image $\varphi(U_K)$ is discrete, so it remains to show that $\varphi(U_K)$ spans H. Let W be the R-span of the image $\varphi(U_K)$, and note that W is a subspace of H, by Lemma 8.1.6. We will show that W = H indirectly by showing that if $v \notin H^{\perp}$, where ⊥ is the orthogonal complement with respect to the dot product on $\mathbf{R}^{r+s}$, then $v \notin W^{\perp}$. This will show that $W^{\perp} \subset H^{\perp}$, hence that H ⊂ W, as required.
8.1. THE GROUP OF UNITS 93
$$c_1 \cdots c_r \cdot (c_{r+1} \cdots c_{r+s})^2 = A.$$
Let
$$S = \{(x_1, \ldots, x_n) \in \mathbf{R}^n : |x_i| \le c_i \text{ for } 1 \le i \le r,\ |x_i^2 + x_{i+s}^2| \le c_i^2 \text{ for } r < i \le r + s\} \subset \mathbf{R}^n.$$
Then S is closed, bounded, convex, symmetric with respect to the origin, and of dimension r + 2s, since S is a product of r intervals and s discs, each of which has these properties. Viewing S as a product of intervals and discs, we see that the volume of S is
$$\mathrm{Vol}(S) = \prod_{i=1}^{r} (2c_i) \cdot \prod_{i=1}^{s} (\pi c_{r+i}^2) = 2^r \cdot \pi^s \cdot A = 2^{r+s}\sqrt{|d_K|} = 2^n \cdot 2^{-s}\sqrt{|d_K|}.$$
$$|f(u) - t| \le B.$$
$$c_1 \cdots c_r \cdot (c_{r+1} \cdots c_{r+s})^2 = A$$
$$|t_{c_1,\ldots,c_{r+s}}| > B,$$
then the fact that $|f(u) - t| \le B$ would imply that |f(u)| > 0, which is exactly what we aimed to prove.
8.2. EXAMPLES WITH SAGE 95
The subgroup of cubes of u gives us the units with integer x, y (not both negative).
However, the norm of $u = \frac{1+\sqrt{5}}{2}$ is −1. So the 6th powers of u will generate solutions to Pell's Equation. We can also list the coefficients for these powers as follows.
[list(u^(6*i)) for i in [0..7]]
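The same computation can be sketched without Sage using exact integer arithmetic. Below, an element a + bγ with γ = (1 + √5)/2 is stored as the pair (a, b), and the relation γ² = γ + 1 is used to multiply; this representation and the helper names are choices made here for illustration.

```python
def mul(p, q):
    """Multiply a + b*g and c + d*g, where g = (1 + sqrt(5))/2 satisfies g^2 = g + 1."""
    a, b = p
    c, d = q
    return (a * c + b * d, a * d + b * c + b * d)

def power(p, k):
    """k-th power by repeated multiplication, starting from 1 = (1, 0)."""
    result = (1, 0)
    for _ in range(k):
        result = mul(result, p)
    return result

u = (0, 1)  # u = g
pell = []
for i in range(4):
    a, b = power(u, 6 * i)
    # a + b*g = (a + b/2) + (b/2)*sqrt(5); b is even for these powers
    x, y = a + b // 2, b // 2
    pell.append((x, y))
print(pell)
```

Each pair (x, y) in `pell` satisfies x² − 5y² = 1, so the sixth powers of u do give solutions to Pell's equation for d = 5.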
Remark 8.2.1. A great article about Pell’s equation is [Len02]. The MathSciNet
review begins: “This wonderful article begins with history and some elementary
facts and proceeds to greater and greater depth about the existence of solutions
to Pell equations and then later the algorithmic issues of finding those solutions.
The cattle problem is discussed, as are modern smooth number methods for solving
Pell equations and the algorithmic issues of representing very large solutions in a
reasonable way.”
The simplest solutions to Pell’s equation can be huge, even when d is quite small.
Read Lenstra’s paper for some examples from over two thousand years ago. Here
is one example for d = 10000019.
163580259880346328225592238121094625499142677693142915506747253000\
340064100365767872890438816249271266423998175030309436575610631639\
272377601680603795883791477817611974184075445702823789975945910042\
8895693238165048098039* a - \
517286692885814967470170672368346798303629034373575202975075605058\
714958080893991274427903448098643836512878351227856269086856679078\
304979321047765031073345259902622712059164969008633603603640331175\
6634562204182936222240930
Exercise 8.2.2. Let U be the group of units of the ring of integers of K = Q(√5).
(a) Prove that the set S of units x + y√5 ∈ U with x, y ∈ Z is a subgroup of U. (The main point is to show that the inverse of a unit with x, y ∈ Z again has coefficients in Z.)
(b) Let $U^3$ denote the subgroup of cubes of elements of U. Prove that $S = U^3$ by showing that $U^3 \subset S \subsetneq U$ and that there are no groups H with $U^3 \subsetneq H \subsetneq U$.
In this section we give examples for various (r, s) pairs. First we consider K = Q(i).
(0, 1)
U = K.unit_group(); U
U.0.value()
The signature method returns the number of real embeddings and the number of pairs of complex conjugate embeddings of K into C. The unit_group method, which we used above, returns the unit group $U_K$ as an abstract abelian group and a homomorphism $U_K \to \mathcal{O}_K^*$.
Next we consider $K = \mathbf{Q}(\sqrt[3]{2})$.
(1, 1)
U = K.unit_group(); U
[-1, a - 1]
a - 1
Below we use the places command, which returns the real embeddings and
representatives for the complex conjugate embeddings. We use the places to define
the log map ϕ, which plays such a big role in this chapter.
[Ring morphism:
  From: Number Field in a with defining polynomial x^3 - 2
  To:   Real Double Field
  Defn: a |--> 1.25992104989,
 Ring morphism:
  From: Number Field in a with defining polynomial x^3 - 2
  To:   Complex Double Field
  Defn: a |--> -0.629960524947 + 1.09112363597*I]
def phi(z):
    return [log(abs(sigma(z))) for sigma in S]
phi(u)
[-1.3473773483293832, 0.673688674164692]
phi(K(-1))
[0.0, 0.0]
(0, 3)
U = K.unit_group(); U
a^3 + a
[-0.16741548328589614, 0.04864390975267338, 0.11877157353322298]
phi(u2)
[0.30678570892329504, -1.0725146505489758, 0.7657289416256803]
phi(K(-1))
sum(phi(u1))
2.220446049250313e-16
sum(phi(u2))
-4.440892098500626e-16
Notice that the log image of u1 is clearly not a real multiple of the log image
of u2 (e.g., the scalar would have to be positive because of the first coefficient, but
negative because of the second). This illustrates the fact that the log images of u1
and u2 span a two-dimensional space.
Next we compute a field with r = 3 and s = 0. (A field with s = 0 is called
totally real.)
(3, 0)
U = K.unit_group(); U
1/2*a^2 + a - 1/2
[-0.7747670223461895, -0.3928487245813982, 1.1676157469275887]
phi(u2)
[0.9966812040934553, -1.6402241503223172, 0.6435429462288627]
A field with r = 0 is called totally complex. For example, the cyclotomic fields $\mathbf{Q}(\zeta_n)$ are totally complex, where $\zeta_n$ is a primitive nth root of unity. The degree of $\mathbf{Q}(\zeta_n)$ over Q is ϕ(n) and r = 0, so s = ϕ(n)/2 (assuming n > 2). Here ϕ is the Euler totient function: ϕ(n) is the number of integers k such that 0 < k ≤ n and gcd(k, n) = 1.
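Combining this with Theorem 8.1.2, the rank of the unit group of $\mathbf{Q}(\zeta_n)$ is ϕ(n)/2 − 1 for n > 2. A minimal plain-Python sketch of this bookkeeping follows; the function names are illustrative and are not Sage methods.

```python
from math import gcd

def euler_phi(n):
    """Number of integers k with 0 < k <= n and gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def cyclotomic_signature(n):
    """(r, s) for Q(zeta_n), assuming n > 2 so that there are no real embeddings."""
    return (0, euler_phi(n) // 2)

def unit_rank(n):
    """Rank r + s - 1 of the unit group, by Dirichlet's unit theorem."""
    r, s = cyclotomic_signature(n)
    return r + s - 1

table = [(n, cyclotomic_signature(n), unit_rank(n)) for n in (5, 7, 11, 12)]
print(table)
```

For instance n = 11 gives signature (0, 5) and unit rank 4.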
K.signature()
(0, 5)
U = K.unit_group(); U
a^7 + a^6
How far can we go computing unit groups of cyclotomic fields directly with
Sage?
There are better ways to compute units in cyclotomic fields than to just use
general purpose software. For example, there are explicit cyclotomic units that
can be written down and generate a subgroup of finite index in $U_K$. See [Was97, Ch. 8],
which would be a great book to read now that you’ve gone this far in the present
book. Also, using ideas explained in that book, it is probably possible to make the
unit_group command in Sage for cyclotomic fields extremely fast, which would be
an interesting project for a reader who also likes to code.
Chapter 9
In this chapter we will study extra structure in the case when K is Galois over Q.
We will learn about Frobenius elements, the Artin symbol, decomposition groups,
and how the Galois group of K is related to Galois groups of residue class fields.
These are the basic structures needed to attach L-functions to representations of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$, which will play a central role in the next few chapters.
# Aut(K/L) = [K : L],
Gal(K/L) = Aut(K/L).
106 CHAPTER 9. DECOMPOSITION AND INERTIA GROUPS
• The set Sp of prime ideals lying over a given nonzero prime ideal p of OL , i.e.,
the prime divisors of pOK
Theorem 9.2.2. Suppose K/L is a Galois extension of number fields, and let $\mathfrak{p}$ be a prime of $\mathcal{O}_L$. Write $\mathfrak{p}\mathcal{O}_K = \prod_{i=1}^{g} \mathfrak{P}_i^{e_i}$, and let $f_i = f_{\mathfrak{P}_i/\mathfrak{p}}$. Then G = Gal(K/L) acts transitively on the set $S_{\mathfrak{p}}$ of primes $\mathfrak{P}_i$, and
$$e_1 = \cdots = e_g, \qquad f_1 = \cdots = f_g.$$
Moreover, if we let e be the common value of the $e_i$, f the common value of the $f_i$, and n = [K : L], then
$$efg = n.$$
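As a quick sanity check of efg = n in the simplest Galois case, the sketch below treats K = Q(i) over L = Q, so n = 2. It assumes the polynomial factorization method of Chapter 4, which applies here because $\mathcal{O}_K = \mathbf{Z}[i] = \mathbf{Z}[x]/(x^2+1)$; the function name is chosen for illustration.

```python
def splitting_in_gaussian_integers(p):
    """Return (e, f, g) for a prime p in Z[i] = Z[x]/(x^2 + 1)."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if len(roots) == 2:
        return (1, 1, 2)  # two distinct linear factors: p splits
    if len(roots) == 1:
        return (2, 1, 1)  # a repeated linear factor: p ramifies
    return (1, 2, 1)      # x^2 + 1 irreducible mod p: p is inert

types = {p: splitting_in_gaussian_integers(p) for p in (2, 3, 5, 7, 13)}
print(types)
```

In every case the product e·f·g equals 2 = [K : Q], with 2 ramified, 5 and 13 split, and 3 and 7 inert.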
Proof. For simplicity, we will give the proof only in the case L = Q, but the proof works in general. Suppose p ∈ Z and $p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_g^{e_g}$, and $S = \{\mathfrak{p}_1, \ldots, \mathfrak{p}_g\}$. We will first prove that G acts transitively on S. Let $\mathfrak{p} = \mathfrak{p}_i$ for some i. Recall
Lemma 5.2.2 which we proved long ago using the Chinese Remainder Theorem
(Theorem 5.1.4). It showed there exists a ∈ p such that (a)/p is an integral ideal
that is coprime to pOK . The product
By unique factorization, since every pj appears in the left hand side, we must have
that for each j there is a σ with σ(pi ) = pj , i.e., G acts transitively on S.
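Choose some j and suppose that k ≠ j is another index. Because G acts transitively, there exists σ ∈ G such that $\sigma(\mathfrak{p}_k) = \mathfrak{p}_j$. Applying σ to the factorization $p\mathcal{O}_K = \prod_{i=1}^{g} \mathfrak{p}_i^{e_i}$, we see that
$$\prod_{i=1}^{g} \mathfrak{p}_i^{e_i} = \prod_{i=1}^{g} \sigma(\mathfrak{p}_i)^{e_i}.$$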
The rest of this section illustrates the theorem for quadratic fields and a cubic
field and its Galois closure.
9.2. DECOMPOSITION OF PRIMES: EF G = N 109
$$x^3 - 2 = (x + 2)(x^2 + 3x + 4) \in \mathbf{F}_5[x],$$
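This factorization is easy to confirm by hand or by brute force. The plain-Python sketch below multiplies the two factors modulo 5 and checks that the quadratic factor has no roots in $\mathbf{F}_5$; polynomials are stored as coefficient lists with the constant term first, a convention chosen here for illustration.

```python
def poly_mul_mod(f, g, p):
    """Multiply polynomials given as coefficient lists (constant term first), mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

p = 5
product = poly_mul_mod([2, 1], [4, 3, 1], p)   # (x + 2)(x^2 + 3x + 4)
target = [(-2) % p, 0, 0, 1]                   # x^3 - 2
quadratic_roots = [x for x in range(p) if (x * x + 3 * x + 4) % p == 0]
print(product == target, quadratic_roots)
```

Since a quadratic with no roots is irreducible, $x^3 - 2$ factors modulo 5 as a linear factor times an irreducible quadratic.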
Exercise 9.2.8 (See [Mar77, Ch. 4, Exercise 24]). Continue the notation from the
previous exercise.
(a) If p is totally ramified in K then it is totally ramified in L.
1 → Ip → Dp → Gal(kp /Fp ) → 1,
where Ip is the inertia subgroup of Dp , and #Ip = e = e(p/p). The most interesting
part of the proof is showing that the natural map Dp → Gal(kp /Fp ) is surjective.
We will also discuss the structure of Dp and introduce Frobenius elements, which
play a crucial role in understanding Galois representations.
Recall from Theorem 9.2.2 that G acts transitively on the set of primes p lying
over p. The orbit-stabilizer theorem implies that [G : Dp ] equals the cardinality of
the orbit of p, which by Theorem 9.2.2 equals the number g of primes lying over p,
so [G : Dp ] = g.
Lemma 9.3.2. The decomposition subgroups Dp corresponding to primes p lying
over a given p are all conjugate as subgroups of G.
Proof. See Exercise 9.3.3.
Thus p does not split in going from $K^D$ to K; it does some combination of ramifying and staying inert. To fill in more of the picture, the following proposition asserts that p splits completely and does not ramify in $K^D/\mathbf{Q}$.
Proposition 9.3.5. Fix a finite Galois extension K of Q, let $\mathfrak{p}$ be a prime lying over p with decomposition group D, and set $L = K^D$ and $\mathfrak{q} = \mathfrak{p} \cap \mathcal{O}_L$. Then $e(\mathfrak{q}/p) = f(\mathfrak{q}/p) = 1$, $g_L(p) = [L : \mathbf{Q}]$, $e(\mathfrak{p}/p) = e(\mathfrak{p}/\mathfrak{q})$ and $f(\mathfrak{p}/p) = f(\mathfrak{p}/\mathfrak{q})$.
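Proof. As mentioned right after Definition 9.3.1, the orbit-stabilizer theorem implies that $g_K(p) = [G : D]$, and by Galois theory [G : D] = [L : Q], so $g_K(p) = [L : \mathbf{Q}]$. By Proposition 9.3.4, we have $g_K(\mathfrak{q}) = 1$ so by Theorem 9.2.2,
$$e(\mathfrak{p}/\mathfrak{q}) \cdot f(\mathfrak{p}/\mathfrak{q}) = [K : L] = \frac{[K : \mathbf{Q}]}{[L : \mathbf{Q}]} = \frac{e(\mathfrak{p}/p) \cdot f(\mathfrak{p}/p) \cdot g_K(p)}{[L : \mathbf{Q}]} = e(\mathfrak{p}/p) \cdot f(\mathfrak{p}/p).$$
Now $e(\mathfrak{p}/\mathfrak{q}) \le e(\mathfrak{p}/p)$ and $f(\mathfrak{p}/\mathfrak{q}) \le f(\mathfrak{p}/p)$, so we must have $e(\mathfrak{p}/\mathfrak{q}) = e(\mathfrak{p}/p)$ and $f(\mathfrak{p}/\mathfrak{q}) = f(\mathfrak{p}/p)$. Since from Exercise 9.2.7 we have $e(\mathfrak{p}/p) = e(\mathfrak{p}/\mathfrak{q}) \cdot e(\mathfrak{q}/p)$ and $f(\mathfrak{p}/p) = f(\mathfrak{p}/\mathfrak{q}) \cdot f(\mathfrak{q}/p)$, it follows that $e(\mathfrak{q}/p) = f(\mathfrak{q}/p) = 1$.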
9.3. THE DECOMPOSITION GROUP 113
Exercise 9.3.7.
(a) Show the map Frobp is in fact a field homomorphism, that is Frobp (a + b) =
Frobp (a) + Frobp (b) and Frobp (ab) = Frobp (a) Frobp (b).
(d) Continuing part (c), note that by Exercise 8.1.9 $k^*$ is cyclic. Let a ∈ k be a generator for $k^*$, so a has multiplicative order $p^f - 1$ and $k = \mathbf{F}_p(a)$. Show that
$$\mathrm{Frob}_p^n(a) = a^{p^n} = a \iff (p^f - 1) \mid p^n - 1 \iff f \mid n.$$
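The second equivalence is pure arithmetic and can be spot-checked numerically before proving it. The sketch below is only a finite check over a small range of p, f, n chosen here for illustration; it is not a proof.

```python
def frobenius_power_fixes_generator(p, f, n):
    """True when a^(p^n) = a for an element a of multiplicative order p^f - 1."""
    return (p ** n - 1) % (p ** f - 1) == 0

# Compare with the condition f | n over a small grid of parameters.
checks = [
    frobenius_power_fixes_generator(p, f, n) == (n % f == 0)
    for p in (2, 3, 5, 7)
    for f in range(1, 7)
    for n in range(1, 25)
]
print(all(checks), len(checks))
```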
Remark 9.3.8. Exercise 9.3.7 shows that all finite fields are perfect. For more on
perfect fields see a standard abstract algebra text such as [?].
Theorem 9.3.9. The extension $k_{\mathfrak{p}}/\mathbf{F}_p$ is Galois and moreover, $\mathrm{Gal}(k_{\mathfrak{p}}/\mathbf{F}_p)$ is generated by the Frobenius map $\mathrm{Frob}_p$ defined by $a \mapsto a^p$.
Exercise 9.3.10. Prove that up to isomorphism there is exactly one finite field of each degree.
[Hint: By Theorem 9.3.9 all elements in a finite field satisfy an equation of the form $x^{p^f} - x = 0$, where p is the characteristic and f is the degree over the field $\mathbf{F}_p$. ]
ϕ : Dp → Gal(kp /Fp ).
Definition 9.3.12 (Inertia Group). The inertia group associated to p is the kernel
Ip of Dp → Gal(kp /Fp ).
Proof. The exact sequence (9.3.1) implies that $\#I_{\mathfrak{p}} = \#D_{\mathfrak{p}}/f$ where $f = f(\mathfrak{p}/p) = [k_{\mathfrak{p}} : \mathbf{F}_p]$. Applying Propositions 9.3.4 and 9.3.5, we have
$$\#D_{\mathfrak{p}} = [K : L] = \frac{[K : \mathbf{Q}]}{g} = \frac{efg}{g} = ef.$$
Dividing both sides by f proves the corollary.
Proposition 9.3.14. Let K/Q be a Galois extension with group G, and let p be a
prime of OK lying over a prime p. Then
Exercise 9.4.1. With the notation above, prove that Frobp is unique. That is, if
σ satisfies σ(a) ≡ ap (mod p) for all a ∈ OK then σ = Frobp .
[Hint: First show σ ∈ Dp , then argue as in the proof of Proposition 9.3.14. ]
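Just as the primes $\mathfrak{p}$ and decomposition groups $D_{\mathfrak{p}}$ are all conjugate, the Frobenius elements corresponding to primes $\mathfrak{p} \mid p$ are all conjugate as elements of G.
$$\mathrm{Frob}_{\sigma\mathfrak{p}} = \sigma\,\mathrm{Frob}_{\mathfrak{p}}\,\sigma^{-1}.$$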
In particular, the Frobenius elements lying over a given prime are all conjugate.
Warning 9.5.1. The topology on $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ is not the topology induced by taking as a basis of open neighborhoods around the origin the collection of finite-index normal subgroups of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$, see [?, Ch. 7] or Exercise 9.5.5. In particular, there exist nonopen normal subgroups of finite index which do not correspond to subgroups $\mathrm{Gal}(\overline{\mathbf{Q}}/K)$ for some finite Galois extension K/Q.
For ρ to be continuous means that if K is the fixed field of Ker(ρ), then K/Q is
9.5. THE ARTIN CONJECTURE 117
[Diagram: ρ factors through Gal(K/Q) via a map ρ′.]
Exercise 9.5.3. Suppose $\rho : \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_n(\mathbf{C})$ is continuous. Show that the image is finite.
Remark 9.5.4. The converse to Exercise 9.5.3 is false in general (see Exercise 9.5.5).
This is essentially the same warning as Warning 9.5.1, however it is worth pointing
out to avoid mistakes.2
Fix a Galois representation ρ and let K be the fixed field of ker(ρ), so ρ factors through Gal(K/Q). For each prime p ∈ Z that is not ramified in K, there is an element $\mathrm{Frob}_p \in \mathrm{Gal}(K/\mathbf{Q})$ that is well-defined up to conjugation by elements of Gal(K/Q). This means that $\rho'(\mathrm{Frob}_p) \in \mathrm{GL}_n(\mathbf{C})$ is well-defined up to conjugation. Thus the characteristic polynomial $F_p(x) \in \mathbf{C}[x]$ of $\rho'(\mathrm{Frob}_p)$ is a well-defined invariant of p and ρ. Let
We view L(ρ, s) as a function of a single complex variable s. One can prove that
L(ρ, s) is holomorphic on some right half plane, and extends to a meromorphic
function on all C.
2
See [?, Pg. 1].
This conjecture asserts that there is some way to analytically continue L(ρ, s) to the whole complex plane, except possibly at 1. (A standard fact from complex analysis is that this analytic continuation must be unique.) The simple pole at s = 1 corresponds to the trivial representation (the Riemann zeta function), and if n ≥ 2 and ρ is irreducible, then the conjecture is that L(ρ, s) extends to a holomorphic function on all of C.
The conjecture is known when n = 1. Assume for the rest of this paragraph
that ρ is odd, i.e., if $c \in \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ is complex conjugation, then det(ρ(c)) = −1.
When n = 2 and the image of ρ in PGL2 (C) is a solvable group, the conjecture is
known, and is a deep theorem of Langlands and others (see [Lan80]), which played a
crucial role in Wiles's proof of Fermat's Last Theorem. When n = 2 and the image
of ρ in PGL2 (C) is not solvable, the only possibility is that the projective image is
isomorphic to the alternating group A5 . Because A5 is the symmetry group of the
icosahedron, these representations are called icosahedral. In this case, Joe Buhler’s
Harvard Ph.D. thesis [Buh78] gave the first example in which ρ was shown to satisfy
Conjecture 9.5.6. There is a book [Fre94], which proves Artin's conjecture for 7 icosahedral representations (none of which are twists of each other). Kevin Buzzard
and the author proved the conjecture for 8 more examples [BS02]. Subsequently,
Richard Taylor, Kevin Buzzard, Nick Shepherd-Barron, and Mark Dickinson proved
the conjecture for an infinite class of icosahedral Galois representations (disjoint
from the examples) [BDSBT01]. The general problem for n = 2 is in fact now
completely solved, due to recent work of Khare and Wintenberger [KW08] that
proves Serre’s conjecture.
Chapter 10
This chapter is about elliptic curves and the central role they play in algebraic
number theory. Our approach will be less systematic and more a survey than most
of the rest of this book. The goal is to give you a glimpse of the forefront of research
by assuming many basic facts that can be found in other books (see, e.g., [Sil92]).
We will not define genus in this book, except to note that a nonsingular curve over K has genus one if and only if over $\overline{K}$ it can be realized as a nonsingular plane cubic curve.¹ Moreover, one can show (using the Riemann-Roch formula) that over any field a genus one curve with a rational point can always be defined by a projective cubic equation of the form
$$Y^2 Z + a_1 XYZ + a_3 YZ^2 = X^3 + a_2 X^2 Z + a_4 XZ^2 + a_6 Z^3.$$
$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6. \qquad (10.1.1)$$
¹ For a detailed and technical explanation of genus see [Har77, Ch. II.8] or [?, Ch. 7.3].
120 CHAPTER 10. ELLIPTIC CURVES AND L-FUNCTIONS
Thus one often presents an elliptic curve by giving a Weierstrass equation (10.1.1),
though there are significant computational advantages to other equations for curves
(e.g., Edwards coordinates – see work of Bernstein and Lange in [?]).
Using Sage we plot an elliptic curve over the finite field $\mathbf{F}_7$ and an elliptic curve defined over Q.
[Figure: plot of the points of the curve over $\mathbf{F}_7$.]
E.plot()
[Figure: plot of the curve over Q.]
10.1. GROUPS ATTACHED TO ELLIPTIC CURVES 121
Note that both plots above are of the affine equation $y^2 = x^3 + x$, and do not include the distinguished point O, which lies at infinity.
Remark 10.1.2. The command EllipticCurve in Sage can take as input a list [a4,a6] of coefficients and returns an elliptic curve given by a Weierstrass equation with $a_1 = a_2 = a_3 = 0$ and $a_4$, $a_6$ as specified.
If E is an elliptic curve over K, then we give the set E(K) of all K-rational points on E the structure of an abelian group with identity element O.² If we embed E in the projective plane, then this group is determined by the condition that three points sum to the zero element O if and only if they lie on a common line (some care needs to be taken when the points are not distinct). In our affine picture, a line passes through the point at infinity if it is vertical, or equivalently if it is of the form x = a for some fixed a ∈ K.
Example 10.1.3. On the curve $y^2 = x^3 - 5x + 4$, we have (0, 2) + (1, 0) = (3, 4). This is because (0, 2), (1, 0), and (3, −4) are on a common line (given by the equation y = 2 − 2x) hence they sum to zero:
(0, 2) + (1, 0) + (3, −4) = O.
Notice (3, 4), (3, −4), and O (the point at infinity on the curve) are also on a common line (given by x = 3), so (3, 4) = −(3, −4). We can illustrate this in Sage:
E = EllipticCurve([-5,4])
E(0,2) + E(1,0)
(3 : 4 : 1)
² As a reminder, we will not give rigorous proofs of any facts in this section. For a more detailed and technical explanation of the group structure for elliptic curves see [Sil92, Ch. III.2].
G = E.plot()
G += points([(0,2), (1,0), (3,4), (3,-4)],
            pointsize=90, color='red', zorder=10)
G += line([(-1,4), (4,-6)], color='black')
G += line([(3,-6), (3,6)], color='black')
G.show()
[Figure: the curve with the four marked points and the lines y = 2 − 2x and x = 3.]
Iterating the group operation often leads quickly to very complicated points:
7*E(0,2)
(14100601873051200/48437552041038241 :
-17087004418706677845235922/10660394576906522772066289 :
1)
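For readers without Sage at hand, the chord-and-tangent rule behind these computations can be sketched in plain Python with exact rational arithmetic. This is a minimal illustration for curves of the form $y^2 = x^3 + Ax + B$ only, with the point at infinity represented by `None`; the function names are chosen here and this is not how Sage implements the group law.

```python
from fractions import Fraction

A, B = Fraction(-5), Fraction(4)   # the curve y^2 = x^3 - 5x + 4

def on_curve(P):
    """Check the affine equation; None stands for the point at infinity."""
    if P is None:
        return True
    x, y = P
    return y * y == x ** 3 + A * x + B

def add(P, Q):
    """Chord-and-tangent addition: the third point on the line, reflected in the x-axis."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return None                            # vertical line: P + Q = O
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)     # slope of the tangent line
    else:
        lam = (y2 - y1) / (x2 - x1)            # slope of the chord
    x3 = lam * lam - x1 - x2
    y3 = lam * (x1 - x3) - y1
    return (x3, y3)

def multiply(n, P):
    """n*P by repeated addition (adequate for small n)."""
    R = None
    for _ in range(n):
        R = add(R, P)
    return R

P = (Fraction(0), Fraction(2))
Q = (Fraction(1), Fraction(0))
total = add(P, Q)
double_P = multiply(2, P)
seven_P = multiply(7, P)
print(total, double_P, seven_P)
```

Here `add(P, Q)` recovers the point (3, 4) of Example 10.1.3, and the coordinates of `seven_P` already have many digits.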
Remark 10.1.4. In the previous example we saw that iterating the group operation
led to points which used a lot of digits to write down. This notion can be made
formal and is called the height of the point. The height function is used to prove
the general Mordell-Weil theorem, see [Sil92, Ch. VIII.4].
Exercise 10.1.5. Let E be an elliptic curve given by a Weierstrass equation (10.1.1) with $a_1 = a_3 = 0$. Show that the points of order two are exactly the points on E with y-coordinate equal to 0.
[Hint: Recall that a point P has order 2 if P + P + O = O, which means the tangent line at P goes through the point at infinity. ]
That the above condition—three points on a line sum to zero—defines an abelian
group structure on E(K) is not obvious. Depending on your perspective, the trick-
iest part is seeing that the operation satisfies the associative axiom. The best way
to understand the group operation on E(K) is to view E(K) as being related to a
class group. As a first observation, note that the ring
is a Dedekind domain, so Cl(R) is defined, and every nonzero fractional ideal can
be written uniquely in terms of prime ideals. When K is a perfect field, the prime
ideals correspond to the Galois orbits of affine points of $E(\overline{K})$. Note that these do not include the point at infinity.
Let Div(E/K) be the free abelian group on the Galois orbits of points of $E(\overline{K})$, which as explained above is analogous to the group of fractional ideals of a number field (here we do include the point at infinity). We call the elements of Div(E/K) divisors. Let Pic(E/K) be the quotient of Div(E/K) by the principal divisors, i.e., the divisors associated to rational functions $f \in K(E)^*$ via
$$f \mapsto (f) = \sum_{P} \mathrm{ord}_P(f)[P].$$
Here K(E) is the fraction field of the ring R defined above. Note that the principal
divisor associated to f is analogous to the principal fractional ideal associated to a
nonzero element of a number field. The definition of ordP (f ) is analogous to the
“power of P that divides the principal ideal generated by f ”. Define the class group
Pic(E/K) to be the quotient of the divisors by the principal divisors, so we have
an exact sequence:
A key difference between elliptic curves and algebraic number fields is that the principal divisors in the context of elliptic curves all have degree 0, i.e., the sum of the coefficients of the divisor (f) is always 0. This might be a familiar fact to you: the number of zeros of a nonzero rational function on a projective curve equals the number of poles, counted with multiplicity. If we let $\mathrm{Div}^0(E/K)$ denote the subgroup of divisors of degree 0, then we have an exact sequence
To connect this with the group law on E(K), note that there is a natural map
Using the Riemann-Roch theorem, one can prove that this map is a bijection, which is moreover an isomorphism of abelian groups. Thus really when we discuss the group of K-rational points on E, we are talking about the class group $\mathrm{Pic}^0(E/K)$.
Recall that we proved (Theorem 7.1.2) that the class group $\mathrm{Cl}(\mathcal{O}_K)$ of a number field is finite. The group $\mathrm{Pic}^0(E/K) = E(K)$ of an elliptic curve can be either finite (e.g., for $y^2 + y = x^3 - x + 1$) or infinite (e.g., for $y^2 + y = x^3 - x$), and determining which is the case for any particular curve is one of the central unsolved problems in number theory.
The Mordell-Weil theorem (see Chapter 12) asserts that if E is an elliptic curve over a number field K, then there is a nonnegative integer r, referred to as the algebraic rank of E, such that
$$E(K) \approx \mathbf{Z}^r \oplus T, \qquad (10.1.2)$$
where T is a finite group. This is similar to Dirichlet’s unit theorem, which gives
the structure of the unit group of the ring of integers of a number field. The main
difference is that T need not be cyclic, and computing r appears to be much more
difficult than just finding the number of real and complex roots of a polynomial!
Example 10.1.6. Sage has algorithms which can compute this rank for us. For
example we can compute the ranks of the curves $y^2 + y = x^3 - x + 1$ and $y^2 + y = x^3 - x$ respectively.
EllipticCurve([0,0,1,-1,1]).rank()
is the x coordinate of R.
$$E[n] = \{P \in E(\overline{K}) : nP = O\}$$
ρE,n : GK → Aut(E[n]).
Warning 10.2.1. Though the action of GK leaves the group E[n] fixed, it may act
non-trivially on individual elements! Otherwise ρE,n would not be very interesting.
For any positive integer n, the group E[n] is isomorphic as an abstract abelian group to $(\mathbf{Z}/n\mathbf{Z})^2$. There are various related ways to see why this is true. One is to use the Weierstrass ℘-theory to parametrize E(C) by the complex numbers, i.e., to find an isomorphism C/Λ ≅ E(C), where Λ is a lattice in C and the isomorphism is given by z ↦ (℘(z), ℘′(z)) with respect to an appropriate choice of coordinates on E(C). It is then an easy exercise to verify that $(\mathbf{C}/\Lambda)[n] \cong (\mathbf{Z}/n\mathbf{Z})^2$. For a detailed and rigorous walk-through of this method see [?, Ch. 1.4].
Another way to understand E[n] is to use the fact that E(C)tor is isomorphic
to the quotient
H1 (E(C), Q)/ H1 (E(C), Z)
Remark 10.2.2. Notice that the arguments above used many analytic facts about
geometry over C (e.g. homology, analytic structure) in order to prove algebraic
facts (e.g. the number of torsion points) about E(K). This is part of a more
general concept called the Lefschetz principle which generally relates geometry over
an algebraically closed field of characteristic 0 to geometry over C. For more on
this see [Sil92, Ch. VI.6].
Remark 10.2.3. In fact, if p is a prime that does not divide n then E[n] ≈ (Z/nZ)2
over fields of characteristic p. However, the methods we used above do not apply to
the case of positive characteristic. Another method is to show the multiplication by
n map is separable and has degree n2 . For a detailed proof see [Sil92, Cor. III.6.4].
Exercise 10.2.4. Let E be an elliptic curve defined over a number field K. Fix an
integer n and consider the extension of K given by
Example 10.2.5. Consider the case when n = 2. From Exercise 10.1.5 we know that the points in E[2] are exactly the points with y-coordinate 0. Let E be the elliptic curve given by $E : y^2 = x^3 + x + 1$. If y = 0 then x has to be a root of the polynomial $x^3 + x + 1$, so the points in E[2] are defined over the splitting field of $x^3 + x + 1$. We can compute these points in Sage.
10.2. GALOIS REPRESENTATIONS 127
E = EllipticCurve([1,1])   # the curve y^2 = x^3 + x + 1 from the text
R.<x> = QQ[]               # make x a polynomial variable
f = x^3 + x + 1
K.<a> = NumberField(f)
M.<b> = K.galois_closure(); M
F = E.change_ring(M)
T = F.torsion_subgroup(); T
T.gens()
Note that this matches with what we expected: we computed two generators for
E[2] (the output of the last cell) corresponding to two generators of (Z/2Z)2 .
If n = p is a prime, then upon choosing a basis for the two-dimensional $\mathbf{F}_p$-vector space E[p], we obtain an isomorphism $\mathrm{Aut}(E[p]) \cong \mathrm{GL}_2(\mathbf{F}_p)$. We thus obtain a mod p Galois representation
This representation ρE,p is continuous if GL2 (Fp ) is endowed with the discrete topol-
ogy, because the field K(E[p]) is a Galois extension of K of finite degree by Exer-
cise 10.2.4.
In order to attach an L-function to E, one could try to embed GL2 (Fp ) into
GL2 (C) and use the construction of Artin L-functions from Section 9.5. Unfor-
tunately, this approach is doomed in general, since GL2 (Fp ) frequently does not
embed in GL2 (C). The following Sage session shows that for p = 5, 7, there are no
2-dimensional irreducible representations of GL2 (Fp ), so GL2 (Fp ) does not embed in
GL2 (C). The notation in the output below is [degree of rep, number of times
it occurs].
[ [ 1, 2 ], [ 2, 1 ] ]
[ [ 1, 2 ], [ 2, 3 ], [ 3, 2 ], [ 4, 1 ] ]
[ [ 1, 4 ], [ 4, 10 ], [ 5, 4 ], [ 6, 6 ] ]
[ [ 1, 6 ], [ 6, 21 ], [ 7, 6 ], [ 8, 15 ] ]
Instead of using the complex numbers, we use the p-adic numbers, as follows.
For each power p^m of p, we have defined a homomorphism
We combine together all of these representations (for all m ≥ 1) using the inverse
limit. Recall that the p-adic integers are
Z_p = \varprojlim Z/p^m Z,
which is the set of all compatible choices of integers modulo p^m for all m. We obtain
a (continuous) homomorphism
the same: choose a prime 𝔭 of O_M over ℓ, and let D_ℓ be the subgroup of Gal(M/K)
that leaves 𝔭 invariant. Then the submodule T_p(E)^{I_ℓ} of inertia invariants is a module
for D_ℓ and the characteristic polynomial F_ℓ(x) of Frob_ℓ on T_p(E)^{I_ℓ} is well defined
(since inertia acts trivially). Let R_ℓ(x) be the polynomial obtained by reversing the
coefficients of F_ℓ(x). One can prove that R_ℓ(x) ∈ Z[x] and that R_ℓ(x), for ℓ ≠ p,
does not depend on the choice of p. Define R_ℓ(x) for ℓ = p using a different prime
q ≠ p, so the definition of R_ℓ(x) does not depend on the choice of p.
This question was one of the central topics in number theory in the late 1990s
and early 2000s. An amazing fact is that the question has been answered in the
affirmative.
This is a corollary to the modularity theorem described in the next section, see
Corollary 10.2.10.
Theorem 10.2.9 (Wiles, Breuil, Conrad, Diamond, Taylor). Every elliptic curve
over Q is modular, i.e., the function f_E(z) is a cuspidal modular form.
Galois Cohomology
(1 + 2a)(1 − a^2) = 1 − a^2 + 2a − 2a^3 = 1 + 2a − a^2 − 2 = −1 + 2a − a^2.
Since a^3 = 1 you might think that Z[C_3] is isomorphic to the ring Z[ζ_3] of integers
of Q(ζ_3), but you would be wrong, since the ring of integers is isomorphic to Z^2 as
an abelian group, but Z[C_3] is isomorphic to Z^3 as an abelian group. Note that Q(ζ_3)
is a quadratic extension of Q.
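The product in Z[C_3] computed above can be checked mechanically. The following is a minimal sketch in plain Python (not Sage). The function name group_ring_mul and the coefficient-list encoding are illustrative choices, not notation from the text: an element c_0 + c_1 a + c_2 a^2 is stored as the list [c_0, c_1, c_2], and exponents are added modulo n because a^n = 1.

```python
def group_ring_mul(u, v):
    """Multiply two elements of Z[C_n] given as coefficient lists of length n.

    The list [c_0, ..., c_{n-1}] stands for c_0 + c_1*a + ... + c_{n-1}*a^(n-1),
    where a is a generator of the cyclic group C_n.
    """
    n = len(u)
    if len(v) != n:
        raise ValueError("both elements must live in the same group ring")
    out = [0] * n
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            # a^i * a^j = a^((i + j) mod n) since a^n = 1
            out[(i + j) % n] += ui * vj
    return out


# (1 + 2a)(1 - a^2) in Z[C_3]
product = group_ring_mul([1, 2, 0], [1, 0, -1])
print(product)  # [-1, 2, -1], i.e. -1 + 2a - a^2
```

The three independent coefficients reflect that Z[C_3] has rank 3 as an abelian group.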
Exercise 11.1.3. Is Z[ζ_3] isomorphic to the group ring of some group?
Hint: Note that the rank of the group ring as a Z-module is equal to the size of
the group. If Z[ζ_3] were a group ring then it would have to be isomorphic to Z[C_2].
Exercise 11.1.4.
(a) Write down two elements of Z[Z] and multiply them. This is not hard, but
is good practice with the concept of a group ring.
Exercise 11.1.6. Fix an abelian group A. Show the following are equivalent sets
of data. Specifically, given any one of the following objects, there is a natural way
to construct another.
Remark 11.1.7. In Exercise 11.1.6, part (a) is our definition of a G-module and
parts (c) and (d) are the data of a Z[G]-module. This shows that a G-module in
the above sense is the same as a Z[G]-module in the usual module sense.
Example 11.1.8. If G is any finite group and A any abelian group then we can always
make A into a G-module by giving it the trivial action. In particular, Z with the
trivial action is a module over any group G, as is Z/mZ for any positive integer m.
Another example is G = (Z/nZ)^*, which acts via multiplication on A = Z/nZ.
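For a small n, the claim that (Z/nZ)^* acts on Z/nZ can be verified by brute force. The sketch below is plain Python. It assumes the usual G-module axioms (the identity acts trivially, the action is compatible with the group law, and each group element acts additively), and the function name is_module_action is hypothetical.

```python
from math import gcd


def is_module_action(n):
    """Check by brute force that units mod n act on Z/nZ by multiplication."""
    units = [g for g in range(1, n + 1) if gcd(g, n) == 1]
    elems = range(n)
    # The identity element 1 must act trivially.
    if any((1 * a) % n != a for a in elems):
        return False
    for g in units:
        for a in elems:
            # Additivity: g.(a + b) = g.a + g.b
            for b in elems:
                if (g * (a + b)) % n != (g * a + g * b) % n:
                    return False
            # Compatibility with the group law: (gh).a = g.(h.a)
            for h in units:
                if (((g * h) % n) * a) % n != (g * ((h * a) % n)) % n:
                    return False
    return True


print(is_module_action(12))  # True
```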
Remark 11.1.9. The construction Z[G] from G is natural, in the sense that it defines
a functor between categories. Moreover, Z[G] is the most natural way to construct
a ring from a group in the sense that the group ring functor is a left adjoint to the
forgetful functor from rings to groups. These types of functors are sometimes called
“free” functors. If you are interested in free objects, see if you can come up with a
natural way to add structure to other objects. Could you make a set into a group?
How about a vector space?
There are also explicit, and increasingly complicated, definitions of Hn (G, A) for
each n ≥ 2 in terms of crossed homomorphisms, which are certain maps G×· · ·×G →
A modulo a subgroup. We will not need these maps, but for more information about
them see [Cp86, Ch. IV.2].
Example 11.2.2. The groups H^n(G, Z) and H^n(G, Z/pZ) (where p is a prime) are
computable in Sage. For example we can compute H^10(A_5, Z) and H^7(A_5, Z/5Z)
where A_5 is the alternating group of order 60 and Z/5Z is given the trivial
A_5-module structure.
G = AlternatingGroup(5); G
G.cohomology(10)
G.cohomology(7, 5)
The following theorem gives three properties of group cohomology, which uniquely
determine group cohomology.
1. We have H0 (G, A) = AG .
···
We will not prove this theorem. For proofs see [Cp86, Atiyah-Wall] and [Ser79,
Ch. 7]. The properties of the theorem uniquely determine group cohomology, so
one should in theory be able to use them to deduce anything that can be deduced
about cohomology groups. Indeed, in practice one frequently proves results about
higher cohomology groups Hn (G, A) by writing down appropriate exact sequences,
using explicit knowledge of H0 , and chasing diagrams.
Remark 11.2.5. Alternatively, we could view the defining properties of the theorem
as the definition of group cohomology, and could state a theorem that asserts that
group cohomology exists.
Remark 11.2.6. For those familiar with commutative and homological algebra, we
have
H^n(G, A) = Ext^n_{Z[G]}(Z, A),
where Z is the trivial G-module.
Remark 11.2.7. One can interpret H2 (G, A) as the group of equivalence classes of
extensions of G by A, where an extension is an exact sequence
0→A→M →G→1
0 → Z --m--> Z → Z/mZ
  → H^1(G, Z) --[m]--> H^1(G, Z) → H^1(G, Z/mZ)
  → H^2(G, Z) --[m]--> H^2(G, Z) → H^2(G, Z/mZ) → ···
From the first few terms of the sequence and the fact that Z surjects onto Z/mZ,
we see that [m] : H^1(G, Z) → H^1(G, Z) is injective. This is consistent with
Exercise 11.2.1 above that showed H^1(G, Z) = 0. Using this vanishing and the right side
of the exact sequence we obtain an isomorphism
H^1(G, Z/mZ) ≅ H^2(G, Z)[m],
where H^2(G, Z)[m] is the kernel of the map [m] : H^2(G, Z) → H^2(G, Z). By
Exercise 11.2.1, when a group acts trivially the H^1 is Hom, so
H^2(G, Z)[m] ≅ Hom(G, Z/mZ). (11.2.1)
One can prove that for any n > 0 and any module A the group H^n(G, A)
is annihilated by #G, i.e., every element has order dividing #G (see Remark 11.3.5).
Thus (11.2.1) allows us to understand H^2(G, Z), and this comprehension arose
naturally from the properties in Theorem 11.2.4 that determine the cohomology groups H^n.
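As a sanity check on (11.2.1), suppose G is cyclic of order n (an assumption made only for this illustration). A homomorphism G → Z/mZ is then determined by the image x of a generator, subject to n·x ≡ 0 (mod m), so the right-hand side of (11.2.1) can be counted by brute force. The sketch is plain Python and the function name hom_count is hypothetical.

```python
from math import gcd


def hom_count(n, m):
    """Count homomorphisms from a cyclic group of order n to Z/mZ.

    Each homomorphism sends a fixed generator to some x in Z/mZ with n*x = 0.
    """
    return sum(1 for x in range(m) if (n * x) % m == 0)


for n, m in [(6, 4), (5, 3), (12, 8)]:
    # The count agrees with gcd(n, m) in each case.
    print(n, m, hom_count(n, m), gcd(n, m))
```

So for cyclic G of order n, (11.2.1) says that the m-torsion of H^2(G, Z) has gcd(n, m) elements.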
The following proposition will be useful when proving the weak Mordell-Weil
theorem (see Theorem 12.2.3).
S3 = SymmetricGroup(3); S3    # assumption: S3 denotes the symmetric group on three letters
S3.cohomology(1, 3)
A3 = AlternatingGroup(3); A3
A3.cohomology(1, 3)
Following Section 9.5, we can put a topology on Gal(K^sep/K) by taking as a
basis of open neighborhoods of the identity the subgroups of the form Gal(K^sep/L),
where L/K is a finite Galois extension.
Exercise 11.4.1. Let H be a subgroup of G = Gal(K^sep/K). Show that H is
open if and only if H is closed and has finite index in G. [Hint: If H is open then
it contains a basis element N. By definition of the basis described above, N has
finite index in G. What does this say about the index of H in G? What about the
complement of H?]
Definition 11.4.2. Let A be a Gal(K sep /K)-module. We say that A is a continuous
Gal(K sep /K)-module if the map Gal(K sep /K) × A → A (see Exercise 11.1.6) is
continuous when A has the discrete topology.
Exercise 11.4.3. Let G = Gal(K sep /K) and A be a G-module. Show that A is a
continuous G-module if and only if the subgroup Ga = {σ ∈ G : σ(a) = a} is open
for every a ∈ A.
Now let A be a continuous Gal(K^sep/K)-module. Let
A(L) = A^{Gal(K^sep/L)} = {x ∈ A : σ(x) = x for all σ ∈ Gal(K^sep/L)},
and define
H^n(K, A) = \varinjlim_{L/K} H^n(L/K, A(L)),
where the limit is taken over all finite Galois extensions L/K.
It is not obvious that the groups Hn (K, A) are actually cohomology groups, i.e.
they satisfy the conclusion of Theorem 11.2.4. However one can show they have
analogous properties, see [Ser79, Ch. X.3] for references.
Remark 11.4.4. Those familiar with algebraic geometry should compare the groups
H^n(K, A) with the Čech cohomology groups on the étale site over Spec K. One can
show that Čech cohomology agrees with the derived functor groups of A ↦ A^G, see
[?, Ch. 10]. Therefore H^n(K, A) do indeed define a cohomology theory.
Example 11.4.5. The following are examples of continuous Gal(Q̄/Q)-modules:
Q̄, Q̄^*, Z̄, Z̄^*, E(Q̄), E(Q̄)[n], Tate_ℓ(E),
where E is an elliptic curve over Q. Can you identify the action for each module
A? What about A(L) for any finite Galois extension L/Q? It is important to notice that
Q̄^*(L) = L^*.
Theorem 11.4.6 (Hilbert 90). We have H^1(K, K̄^*) = 0.
where μ_n(K) is the group of nth roots of unity contained in K. The last equality follows
from Theorem 11.4.6.
Assume now that the group μ_n is contained in K. Using Galois cohomology we
obtain a relatively simple classification of all abelian extensions of K with cyclic
Galois group of order dividing n. Moreover, since the action of Gal(K̄/K) on μ_n is
trivial, by our hypothesis that μ_n ⊂ K, Exercise 11.2.1 implies
H^1(K, μ_n) = Hom(Gal(K̄/K), μ_n).
or equivalently, an isomorphism
K^*/(K^*)^n ≅ Hom(Gal(K̄/K), μ_n).
cyclic group μ_n. Unwinding the definitions, this says that every cyclic abelian
extension of K of degree dividing n is of the form K(a^{1/n}) for some element a ∈ K.
One can prove via calculations that K(a^{1/n}) is unramified outside n and the
primes that divide Norm(a). Moreover, and this is a much bigger result, one can
combine this with facts about class groups and unit groups to prove the following
theorem:
Sketch of Proof. Note that we may enlarge S as needed. To see this, choose a finite
set S′ ⊇ S and let L′ be the maximal extension with respect to S′ as in the statement
of the theorem. Because L is unramified outside of S, it is certainly unramified
outside of S′. By maximality of L′ this implies L ⊆ L′. Therefore it is sufficient to
show the larger extension L′/K is finite.
We first argue that we can enlarge S so that the ring
is a principal ideal domain. One can show that for any S, the ring O_{K,S} is a
Dedekind domain. The condition ord_p(aO_K) ≥ 0 means that in the prime ideal
factorization of the fractional ideal aO_K, we have that p occurs to a nonnegative
power. Thus we are allowing denominators at the primes in S. Since the class group
of O_K is finite, there are primes p_1, . . . , p_r that generate the class group as a group
(for example, take all primes with norm up to the Minkowski bound). Enlarge S to
contain the primes p_i.
Note that we have used that the class group of O_K is finite.
Next we want to show p_i O_{K,S} is the unit ideal. To see this, let m be the order
of p_i in the class group of O_K so that p_i^m = (α) for some α ∈ O_K. Note the
factorization of (1/α)O_K is p_i^{−m}, so by construction 1/α ∈ O_{K,S}. Since
α ∈ (p_i O_{K,S})^m, this shows (p_i O_{K,S})^m is the unit ideal. It follows from the
unique factorization of ideals in the Dedekind domain O_{K,S} that p_i O_{K,S} is the unit ideal.
Now we can show O_{K,S} is a principal ideal domain. Let P be a prime ideal
of O_{K,S}. Since the p_i generate the class group of O_K, the restriction of P to O_K
is equivalent modulo a principal ideal to a product of the primes p_i. Therefore P
is equivalent modulo a principal ideal to a product of ideals of the form p_i O_{K,S}.
Because we showed p_i O_{K,S} was the unit ideal, this means P is principal.
Next enlarge S so that all primes over nO_K are in S. Note that O_{K,S} is still a
PID. Let
K(S, n) = {a ∈ K^*/(K^*)^n : n | ord_p(a) for all p ∉ S}.
Then a refinement of the arguments at the beginning of this section shows that L is
generated by all nth roots of the elements of K(S, n) (specifically, their
representatives in K). Thus it suffices to prove that K(S, n) is finite.
If a ∈ O_{K,S}^* then ord_p(a) = 0 for all p ∉ S. So there is a natural map
φ : O_{K,S}^* → K(S, n)
Next we show that the image of φ has finite index in Z^m. Let h be the class number
of O_K. For each i there exists α_i ∈ O_K such that p_i^h = (α_i). But α_i ∈ O_{K,S}^* since
ord_p(α_i) = 0 for all p ∉ S (by unique factorization). Then
It follows that (hZ)^m ⊂ Im(φ), so the image of φ has finite index in Z^m. It follows
that O_{K,S}^* has rank equal to r + s − 1 + #S.
Note the last term comes from replacing the codomain of H^1(K, E[n]) → H^1(K, E)
by the kernel of H^1(K, E) --n--> H^1(K, E). From this we obtain a short exact sequence
Exercise 12.2.1. Consider the map E(K) → Hom(Gal(K̄/K), E[n]) defined above.
First show this map is well defined, i.e., σ(Q) − Q ∈ E[n]. Then show it does not
depend on the choice of P modulo nE(K) so it in fact descends to a homomorphism
on E(K)/nE(K).
Sketch of Proof. First one proves that if p ∤ n is a prime of good reduction for E,
then the natural reduction map π : E(K)[n] → Ẽ(O_K/p) is injective. The argument
that π is injective uses formal groups, whose development is outside the scope of
this course.¹ Next fix any Q as in the statement of the theorem. Above we saw
σ(Q) − Q ∈ E[n] for all σ ∈ Gal(K̄/K). Let I_p ⊂ Gal(L/K) be the inertia group
at p. By definition of the inertia group, I_p acts trivially on Ẽ(O_K/p). Thus for each
σ ∈ I_p we have
Since π is injective, it follows that σ(Q) = Q for σ ∈ I_p, i.e., that Q is fixed under
all of I_p. Repeating for all Q this shows I_p = 1 so L is unramified at p.
Note that we technically only defined π on E(K)[n] and σ(Q) − Q may not lie
in E(K). However, given a finite extension K′/K and a prime p′ lying over p,
the reduction map E(K′)[n] → Ẽ(O_{K′}/p′) is still injective. So we could apply the
argument to the field K′ given by adjoining the coordinates of Q to K.
Hom(Gal(K̄/K), (Z/nZ)^2).
By Theorem 12.2.2, the image consists of homomorphisms whose kernels cut out
an abelian extension of K unramified outside n and primes of bad reduction for E.
Since this is a finite set of primes, Theorem 12.1.1 implies that the homomorphisms
all factor through a finite quotient Gal(L/K) of Gal(Q̄/K). Thus there can be only
finitely many such homomorphisms, so the image of E(K)/nE(K) is finite. Thus
E(K)/nE(K) itself is finite, which proves the theorem in this case.
Next suppose E is an elliptic curve over a number field, but do not make the
hypothesis that the elements of E[n] have coordinates in K. Since the group E[n](C)
is finite and its elements are defined over Q̄, the extension L of K obtained by adjoining
to K all coordinates of elements of E[n](C) is a finite extension. It is also Galois,
as we saw when constructing Galois representations attached to elliptic curves. By
Proposition 11.3.2, we have an exact sequence
The kernel of the restriction map H^1(K, E[n]) → H^1(L, E[n]) is finite, since it is
isomorphic to the finite cohomology group H^1(L/K, E[n](L)). By the argument of
the previous paragraph, the image of E(K)/nE(K) in H^1(L, E[n]) under
E(K)/nE(K) ↪ H^1(K, E[n]) --res--> H^1(L, E[n])
¹ For an introduction to formal groups of elliptic curves see [Sil92, Ch. IV].
Part II
Adelic Viewpoint
Chapter 13
Valuations
The rest of this book is a partial rewrite of [Cas67] meant to make it more accessible.
I have attempted to add examples and details of the implicit exercises and remarks
that are left to the reader.
13.1 Valuations
Definition 13.1.1 (Valuation). A valuation | · | on a field K is a function defined
on K with values in R≥0 satisfying the following axioms:
The trivial valuation is the valuation for which |a| = 1 for all a 6= 0. We will
often tacitly exclude the trivial valuation from consideration.
From (2) we have
|1| = |1| · |1| ,
|a|_2 = |a|_1^c
for all a ∈ K.
Also we have
Note that Axioms (1), (2) and Equation (13.1.1) imply Axiom (3) with C = 2.
We take Axiom (3) instead of Equation (13.1.1) for the technical reason that we will
want to call the square of the absolute value of the complex numbers a valuation.
(Here the big absolute value on the outside of the left-hand side of the inequality
is the usual absolute value on real numbers, but the other absolute values are a
valuation on an arbitrary field K.)
Proof. We have
|a| = |b + (a − b)| ≤ |b| + |a − b|,
so |a| − |b| ≤ |a − b|. The same argument with a and b swapped implies that
|b| − |a| ≤ |a − b|, which proves the lemma.
forms a discrete subgroup of the reals under addition (because the elements of the
group G are bounded away from 0).
Proof. Since G is discrete there is a positive m ∈ G such that for any positive x ∈ G
we have m ≤ x. Suppose x ∈ G is an arbitrary positive element. By subtracting off
integer multiples of m, we find that there is a unique n such that
0 ≤ x − nm < m.
By Proposition 13.2.2, the set of log |a| for nonzero a ∈ K is free on one
generator, so there is a c < 1 such that |a|, for a ≠ 0, runs precisely through the
set
c^Z = {c^m : m ∈ Z}
(Note: we can replace c by c^{−1} to see that we can assume that c < 1).
Note that if we can take C = 1 for | · | then we can take C = 1 for any valuation
equivalent to | · | . To see that (13.2.1) is equivalent to Axiom (3) with C = 1,
suppose |b| ≤ |a|. Then |b/a| ≤ 1, so Axiom (3) asserts that |1 + b/a| ≤ 1, which
implies that |a + b| ≤ |a| = max{|a|, |b|}, and conversely.
We note at once the following consequence:
Proof. Note that |a + b| ≤ max{|a|, |b|} = |a|, which is true even if |b| = |a|. Also,
where for the last equality we have used that |b| < |a| (if max{|a + b|, |b|} = |b|,
then |a| ≤ |b|, a contradiction).
We will prove this modulo the claim (to be proved later in Section 14.1) that
valuations are equivalent if (and only if) they induce the same topology.
The topology induced by | |1 has as basis of open neighborhoods the set of open
balls
B1 (z, r) = {x ∈ K : |x − z|1 < r},
for r > 0, and likewise for | |2 . Since the absolute values |b|1 get arbitrarily close to
0, the set U of open balls B1 (z, |b|1 ) also forms a basis of the topology induced by
| |1 (and similarly for | |2 ). By (13.2.2) we have
so the two topologies both have U as a basis, hence are equal. That equal topologies
imply equivalence of the corresponding valuations will be proved in Section 14.1.
The set of a ∈ O with |a| < 1 forms an ideal p in O. The ideal p is maximal,
since if a ∈ O and a 6∈ p then |a| = 1, so |1/a| = 1/|a| = 1, hence 1/a ∈ O, so a is a
unit.
Proof. First suppose that | · | is discrete. Choose π ∈ p with |π| maximal, which we
can do since
S = {log |a| : a ∈ p, a ≠ 0} ⊂ (−∞, 0),
so the discrete set S is bounded above. Suppose a ∈ p. Then
|a/π| = |a|/|π| ≤ 1,
so a/π ∈ O. Thus
a = π · (a/π) ∈ πO.
Conversely, suppose p = (π) is principal. For any a ∈ p we have a = πb with
b ∈ O. Thus
|a| = |π| · |b| ≤ |π| < 1.
Thus {|a| : |a| < 1} is bounded away from 1, which is exactly the definition of
discrete.
Example 13.2.9. For any prime p, define the p-adic valuation | · |_p : Q → R as
follows. Write a nonzero α ∈ Q as p^n · a/b, where gcd(a, p) = gcd(b, p) = 1. Then
|p^n · a/b|_p := p^{−n} = (1/p)^n.
This valuation is both discrete and non-archimedean. The ring O is the local ring
Z_(p) = {a/b ∈ Q : p ∤ b},
which has maximal ideal generated by p. Note that ord(p^n · a/b) = n.
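The definition in Example 13.2.9 translates directly into a short computation. The following is a minimal sketch in plain Python using exact rational arithmetic; the function names ord_p and abs_p are illustrative, and p is assumed to be prime.

```python
from fractions import Fraction


def ord_p(alpha, p):
    """Return n where alpha = p^n * a/b with gcd(a, p) = gcd(b, p) = 1."""
    alpha = Fraction(alpha)
    if alpha == 0:
        raise ValueError("ord_p(0) is infinite")
    n = 0
    num, den = alpha.numerator, alpha.denominator
    while num % p == 0:   # strip factors of p from the numerator
        num //= p
        n += 1
    while den % p == 0:   # strip factors of p from the denominator
        den //= p
        n -= 1
    return n


def abs_p(alpha, p):
    """The p-adic valuation |alpha|_p = (1/p)^ord_p(alpha), with |0|_p = 0."""
    alpha = Fraction(alpha)
    if alpha == 0:
        return Fraction(0)
    return Fraction(1, p) ** ord_p(alpha, p)


print(ord_p(Fraction(50, 3), 5))   # 2
print(abs_p(Fraction(50, 3), 5))   # 1/25
print(abs_p(Fraction(3, 50), 5))   # 25
```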
We will use the following lemma later (e.g., in the proof of Corollary 14.2.4
and Theorem 13.3.2).
Lemma 13.2.10. A valuation | · | is non-archimedean if and only if |n| ≤ 1 for all
n in the ring generated by 1 in K.
Note that we cannot identify the ring generated by 1 with Z in general, be-
cause K might have characteristic p > 0.
Proof. If | · | is non-archimedean, then |1| ≤ 1, so by Axiom (3) with a = 1, we have
|1 + 1| ≤ 1. By induction it follows that |n| ≤ 1.
Conversely, suppose |n| ≤ 1 for all integer multiples n of 1. This condition is
also true if we replace | · | by any equivalent valuation, so replace | · | by one with
C ≤ 2, so that the triangle inequality holds. Suppose a ∈ K with |a| ≤ 1. Then by
the triangle inequality,
|1 + a|^n = |(1 + a)^n| ≤ \sum_{j=0}^{n} |\binom{n}{j}| · |a|^j ≤ 1 + 1 + · · · + 1 = n + 1.
Now take nth roots of both sides to get |1 + a| ≤ (n + 1)^{1/n},
and take the limit as n → ∞ to see that |1 + a| ≤ 1. This proves that one can take
C = 1 in Axiom (3), hence that | · | is non-archimedean.
We do not prove this here as we do not need it. For a proof, see [Art59, pg. 45,
67].
There are many non-archimedean valuations. On the rationals Q there is one
for every prime p > 0, the p-adic valuation, as in Example 13.2.9.
Remark 13.3.3. Before giving the proof, we pause with a brief remark about Os-
trowski. According to
http://www-gap.dcs.st-and.ac.uk/~history/Mathematicians/Ostrowski.html
p = {a ∈ Z : |a| < 1}
is nonzero. Also p is an ideal and if |ab| < 1, then |a| |b| = |ab| < 1, so |a| < 1 or
|b| < 1, so p is a prime ideal of Z. Thus p = pZ, for some prime number p. Since
every element of Z has valuation at most 1, if u ∈ Z with gcd(u, p) = 1, then u ∉ p,
so 1 < |a|^{log(c)/log(a)}, so 1 < |a| as well (i.e., any a ∈ Z with a > 1 automatically
satisfies |a| > 1). Also, taking the 1/log(c) power on both sides of (13.3.1) we see
that
|c|^{1/log(c)} ≤ |a|^{1/log(a)}. (13.3.2)
Because, as mentioned above, |a| > 1, we can interchange the role of a and c to
obtain the reverse inequality of (13.3.2). We thus have
|c| = |a|^{log(c)/log(a)}.
|c|^α = |2|^{α · log(c)/log(2)} = |2|^{log_{|2|}(e) · log(c)} = e^{log(c)} = c = |c|_∞.
Thus for all integers c ∈ Z with c > 1 we have |c|^α = |c|_∞, which implies that | · | is
equivalent to | · |_∞.
Let k be any field and let K = k(t), where t is transcendental. Fix a real number
c > 1. If p = p(t) is an irreducible polynomial in the ring k[t], we define a valuation
by
|p^a · u/v|_p = c^{−deg(p)·a}, (13.3.3)
where a ∈ Z and u, v ∈ k[t] with p ∤ u and p ∤ v.
Remark 13.3.4. This definition differs from the one on page 46 of [Cassels-Frohlich,
Ch. 2] in two ways. First, we assume that c > 1 instead of c < 1, since otherwise
| · |_p does not satisfy Axiom 3 of a valuation. Also, we write c^{−deg(p)·a} instead of
c^{−a}, so that the product formula will hold. (For more about the product formula,
see Section 18.1.)
In addition there is a non-archimedean valuation | · |_∞ defined by
|u/v|_∞ = c^{deg(u)−deg(v)}. (13.3.4)
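The role of the factor deg(p) in (13.3.3) can be seen by bookkeeping exponents of c. The sketch below is plain Python and assumes a rational function is described only by a list of pairs (deg(p), a), one for each irreducible p occurring in it with exponent a. This encoding and the function names are illustrative, and no polynomial factorization is performed. It adds up log_c of the valuations (13.3.3) and (13.3.4); the sum is 0 exactly when the product of the valuations is 1.

```python
def log_c_finite(deg_p, a):
    """log_c of |f|_p from (13.3.3), where p has degree deg_p and exponent a in f."""
    return -deg_p * a


def log_c_infinity(factors):
    """log_c of |f|_infinity from (13.3.4): deg(u) - deg(v) is the sum of deg(p) * a."""
    return sum(deg_p * a for deg_p, a in factors)


def log_c_product(factors):
    """log_c of the product of all valuations of f (finite places and infinity)."""
    return log_c_infinity(factors) + sum(log_c_finite(d, a) for d, a in factors)


# Hypothetical example: f = (t^2 + 1)^2 / t^3 over k = Q,
# i.e. one quadratic factor with exponent 2 and one linear factor with exponent -3.
factors = [(2, 2), (1, -3)]
print(log_c_product(factors))  # 0

# Using c^(-a) instead of c^(-deg(p) * a) at the finite places breaks this.
naive = log_c_infinity(factors) + sum(-a for _, a in factors)
print(naive)  # 2
```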
Lemma 13.3.5. The only nontrivial valuations on k(t) which are trivial on k are
equivalent to the valuation (13.3.3) or (13.3.4).
14.1 Topology
A valuation | · | on a field K induces a topology in which a basis for the neighbor-
hoods of a are the open balls
B(a, d) = {x ∈ K : |x − a| < d}
for d > 0.
Proof. If | · |_1 = | · |_2^r, then |x − a|_1 < d if and only if |x − a|_2^r < d if and only if
|x − a|_2 < d^{1/r}, so B_1(a, d) = B_2(a, d^{1/r}). Thus the bases of open neighborhoods of
a for | · |_1 and | · |_2 are identical.
A valuation satisfying the triangle inequality gives a metric for the topology by
defining the distance from a to b to be |a − b|. Assume for the rest of this section
that we only consider valuations that satisfy the triangle inequality.
is small when |ε| and |δ| are small (for fixed a, b).
Lemma 14.1.3. Suppose two valuations | · |1 and | · |2 on the same field K induce
the same topology. Then for any sequence {xn } in K we have
|xn |1 → 0 ⇐⇒ |xn |2 → 0.
Proof. It suffices to prove that if |x_n|_1 → 0 then |x_n|_2 → 0, since the proof of
the other implication is the same. Let ε > 0. The topologies induced by the two
absolute values are the same, so B_2(0, ε) can be covered by open balls B_1(a_i, r_i).
One of these open balls B_1(a, r) contains 0. There is ε′ > 0 such that
Since |x_n|_1 → 0, there exists N such that for n ≥ N we have |x_n|_1 < ε′. For such n,
we have x_n ∈ B_1(0, ε′), so x_n ∈ B_2(0, ε), so |x_n|_2 < ε. Thus |x_n|_2 → 0.
Proposition 14.1.4. If two valuations | · |_1 and | · |_2 on the same field induce the
same topology, then they are equivalent in the sense that there is a positive real α
such that | · |_1 = | · |_2^α.
we see that
m log |w|_1 + n log |z|_1 ≥ 0
if and only if
m log |w|_2 + n log |z|_2 ≥ 0.
Dividing through by log |z|_i, and rearranging, we see that for every rational number
α = −n/m,
log |w|_1 / log |z|_1 ≥ α ⇐⇒ log |w|_2 / log |z|_2 ≥ α.
Thus
log |w|_1 / log |z|_1 = log |w|_2 / log |z|_2,
so
log |w|_1 / log |w|_2 = log |z|_1 / log |z|_2.
Since this equality does not depend on the choice of z, we see that there is a
constant c (= log |z|_1 / log |z|_2) such that log |w|_1 / log |w|_2 = c for all w. Thus
log |w|_1 = c · log |w|_2, so |w|_1 = |w|_2^c, which implies that | · |_1 is equivalent to | · |_2.
14.2 Completeness
We recall the definition of metric on a set X.
d:X ×X →R
A Cauchy sequence is a sequence (xn ) in X such that for all ε > 0 there exists M
such that for all n, m > M we have d(xn , xm ) < ε. The completion of X is the set of
Cauchy sequences (xn ) in X modulo the equivalence relation in which two Cauchy
sequences (xn ) and (yn ) are equivalent if limn→∞ d(xn , yn ) = 0. A metric space is
complete if every Cauchy sequence converges, and one can show that the completion
of X with respect to a metric is complete.
For example, d(x, y) = |x − y| (usual archimedean absolute value) defines a
metric on Q. The completion of Q with respect to this metric is the field R of real
numbers. More generally, whenever | · | is a valuation on a field K that satisfies the
triangle inequality, then d(x, y) = |x − y| defines a metric on K. Consider for the
rest of this section only valuations that satisfy the triangle inequality.
an → a∗ w.r.t. | · |
sequence (a). Because the field operations on K are continuous, they induce well-
defined field operations on equivalence classes of Cauchy sequences componentwise.
Also, define a valuation on K_v by
|(a_n)_{n=1}^∞| = lim_{n→∞} |a_n|,
and note that this is well defined and extends the valuation on K.
To see that Kv is unique up to a unique isomorphism fixing K, we observe that
there are no nontrivial continuous automorphisms Kv → Kv that fix K. This is
because, by denseness, a continuous automorphism σ : Kv → Kv is determined by
what it does to K, and by assumption σ is the identity map on K. More precisely,
suppose a ∈ Kv and n is a positive integer. Then by continuity there is δ > 0 (with
δ < 1/n) such that if an ∈ Kv and |a − an | < δ then |σ(a) − σ(an )| < 1/n. Since
K is dense in Kv , we can choose the an above to be an element of K. Then by
hypothesis σ(an ) = an , so |σ(a) − an | < 1/n. Thus σ(a) = limn→∞ an = a.
We begin with the definition of the N -adic numbers for any positive integer N .
Section 14.2.1 is about the N -adics in the special case N = 10; these are fun because
they can be represented as decimal expansions that go off infinitely far to the left.
Section 14.2.3 is about how the topology of QN is nothing like the topology of R.
Finally, in Section 14.2.4 we state the Hasse-Minkowski theorem, which shows how
to use p-adic numbers to decide whether or not a quadratic equation in n variables
has a rational zero.
Definition 14.2.6 (N -adic valuation). Let N be a positive integer. For any positive
α ∈ Q, the N -adic valuation of α is e, where e is as in Lemma 14.2.5. The N -adic
valuation of 0 is ∞.
We denote the N -adic valuation of α by ordN (α). (Note: Here we are using
“valuation” in a different way than in the rest of the text. This valuation is not an
absolute value, but the logarithm of one.)
Definition 14.2.7 (N-adic metric). For x, y ∈ Q the N-adic distance between x
and y is
d_N(x, y) = N^{−ord_N(x−y)}.
We let d_N(x, x) = 0, since ord_N(x − x) = ord_N(0) = ∞.
For example, x, y ∈ Z are close in the N -adic metric if their difference is divisible
by a large power of N . E.g., if N = 10 then 93427 and 13427 are close because their
difference is 80000, which is divisible by a large power of 10.
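The example with 93427 and 13427 can be reproduced with a few lines of plain Python. This is a sketch restricted to integers, where ord_N of a nonzero integer is taken to be the largest e with N^e dividing it; the function names ord_N and dist_N are illustrative.

```python
from fractions import Fraction


def ord_N(x, N):
    """Largest e such that N^e divides the nonzero integer x (N >= 2)."""
    if N < 2:
        raise ValueError("N must be at least 2")
    if x == 0:
        raise ValueError("ord_N(0) is infinite")
    e = 0
    while x % N == 0:
        x //= N
        e += 1
    return e


def dist_N(x, y, N):
    """N-adic distance N^(-ord_N(x - y)) between integers, with d(x, x) = 0."""
    if x == y:
        return Fraction(0)
    return Fraction(1, N ** ord_N(x - y, N))


print(ord_N(93427 - 13427, 10))   # 4
print(dist_N(93427, 13427, 10))   # 1/10000
print(dist_N(93427, 93428, 10))   # 1
```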
Proposition 14.2.8. The distance dN on Q defined above is a metric. Moreover,
for all x, y, z ∈ Q we have
d(x, z) ≤ max(d(x, y), d(y, z)).
(This is the “nonarchimedean” triangle inequality.)
Proof. The first two properties of Definition 14.2.1 are immediate. For the third,
we first prove that if α, β ∈ Q then
ordN (α + β) ≥ min(ordN (α), ordN (β)).
Assume, without loss, that ord_N(α) ≤ ord_N(β) and that both α and β are nonzero.
Using Lemma 14.2.5 write α = N^e (a/b) and β = N^f (c/d) with a or c possibly
negative. Then
α + β = N^e (a/b + N^{f−e} · c/d) = N^e · (ad + bcN^{f−e})/(bd).
Since gcd(N, bd) = 1 it follows that ord_N(α + β) ≥ e. Now suppose x, y, z ∈ Q.
Then
x − z = (x − y) + (y − z),
so
ordN (x − z) ≥ min(ordN (x − y), ordN (y − z)),
hence dN (x, z) ≤ max(dN (x, y), dN (y, z)).
We can finally define the N -adic numbers.
Definition 14.2.9 (The N -adic Numbers). The set of N -adic numbers, denoted
QN , is the completion of Q with respect to the metric dN .
The set QN is a ring, but it need not be a field as you will show in Exercises 11
and 12. It is a field if and only if N is prime. Also, QN has a “bizarre” topology,
as we will see in Section 14.2.3.
α mod 10^r
1 mod 10
81 mod 10^2
981 mod 10^3
2981 mod 10^4
22981 mod 10^5
422981 mod 10^6
1! − 2! + 3! − 4! + · · · = . . . 637838364422981 in Q_10 !
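The residues in the table above can be recomputed directly, because n! is divisible by 10^r for all sufficiently large n, so only finitely many terms matter modulo 10^r. The following plain Python sketch uses a hypothetical helper name, alt_factorial_sum_mod.

```python
def alt_factorial_sum_mod(M):
    """Return 1! - 2! + 3! - 4! + ... reduced modulo M.

    Once n! is divisible by M, every later term vanishes mod M, so the loop stops.
    """
    total = 0
    fact = 1
    n = 1
    while True:
        fact = (fact * n) % M          # n! mod M
        if fact == 0:
            break
        total += fact if n % 2 == 1 else -fact
        n += 1
    return total % M


for r in range(1, 7):
    print(r, alt_factorial_sum_mod(10 ** r))
# 1 1
# 2 81
# 3 981
# 4 2981
# 5 22981
# 6 422981
```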
Here’s another example. Reducing 1/7 modulo larger and larger powers of 10
we see that
1/7 = . . . 857142857143 in Q_10.
Here’s another example, but with a decimal point.
1/70 = (1/10) · (1/7) = . . . 85714285714.3
We have
1/3 + 1/7 = . . . 66667 + . . . 57143 = 10/21 = . . . 23810,
which illustrates that addition with carrying works as usual.
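These 10-adic expansions are just modular inverses modulo powers of 10, so they can be checked with Python's built-in pow. This is a sketch assuming Python 3.8 or later, where pow accepts a negative exponent together with a modulus.

```python
M = 10 ** 12
inv7 = pow(7, -1, M)           # inverse of 7 modulo 10^12
print(inv7)                    # 857142857143
print((7 * inv7) % M)          # 1

# 1/3 + 1/7 = 10/21, compared modulo 10^5
m = 10 ** 5
lhs = (pow(3, -1, m) + pow(7, -1, m)) % m
rhs = (10 * pow(21, -1, m)) % m
print(pow(3, -1, m), pow(7, -1, m))   # 66667 57143
print(lhs, rhs)                       # 23810 23810
```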
a contradiction.
B(y, d(x, y)) ∩ B(x, r) ⊂ B(y, d(x, y)) ∩ B(x, d(x, y)) = ∅.
Proposition 14.2.13. The only connected subsets of QN are the singleton sets {x}
for x ∈ QN and the empty set.
3x^3 + 4y^3 + 5z^3 = 0
has a solution other than (0, 0, 0) in R and in Q_p for all primes p but has no solution
other than (0, 0, 0) in Q (for a proof see [Cas91, §18]).
Open Problem. Give an algorithm that decides whether or not a cubic
ax^3 + by^3 + cz^3 = 0
Proof. We note first that it will be enough to find, for each n, an element c_n ∈ K
such that
|c_n|_n > 1 and |c_n|_m < 1 for n ≠ m,
where 1 ≤ n, m ≤ N. For then as r → +∞, we have
c_n^r / (1 + c_n^r) = 1 / (1 + (1/c_n)^r) → 1 with respect to | · |_n, and → 0 with respect to | · |_m, for m ≠ n.
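The limit used at the start of this proof can be watched in a concrete case. The sketch below is plain Python and takes K = Q with the usual absolute value and the 3-adic valuation, and the element c = 3, which has |3|_∞ > 1 and |3|_3 < 1. These choices, and the helper names abs_p and x_r, are assumptions made for illustration only.

```python
from fractions import Fraction


def abs_p(alpha, p):
    """p-adic valuation of a rational number, with |0|_p = 0 (p prime)."""
    alpha = Fraction(alpha)
    if alpha == 0:
        return Fraction(0)
    n, num, den = 0, alpha.numerator, alpha.denominator
    while num % p == 0:
        num //= p
        n += 1
    while den % p == 0:
        den //= p
        n -= 1
    return Fraction(1, p) ** n


def x_r(r, c=Fraction(3)):
    """The element c^r / (1 + c^r) from the proof, here with c = 3."""
    return c ** r / (1 + c ** r)


for r in (1, 3, 5):
    x = x_r(r)
    # Distance to 1 in the usual absolute value, and size in the 3-adic valuation:
    print(r, abs(x - 1), abs_p(x, 3))
# 1 1/4 1/3
# 3 1/28 1/27
# 5 1/244 1/243
```

Both columns shrink as r grows: x_r tends to 1 in the real topology and to 0 in the 3-adic topology.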
Then c = b/a will do.
Remark 14.3.2. It is not completely clear that one can choose an a such that (14.3.1)
is satisfied. Suppose it were impossible. Then because the valuations are nontrivial,
we would have that for any a ∈ K if |a|_1 < 1 then |a|_2 < 1. This implies the
converse statement: if a ∈ K and |a|_2 < 1 then |a|_1 < 1. To see this, suppose there
is an a ∈ K such that |a|_2 < 1 and |a|_1 ≥ 1. Choose y ∈ K such that |y|_1 < 1.
Then for any integer n > 0 we have |y/a^n|_1 < 1, so by hypothesis |y/a^n|_2 < 1. Thus
|y|_2 < |a|_2^n < 1 for all n. Since |a|_2 < 1 we have |a|_2^n → 0 as n → ∞, so |y|_2 = 0, a
contradiction since y ≠ 0. Thus |a|_1 < 1 if and only if |a|_2 < 1, and we have proved
before that this implies that | · |_1 is equivalent to | · |_2.
Then put
c = a                       if |a|_N < 1,
c = a^r · b                 if |a|_N = 1,
c = (a^r / (1 + a^r)) · b   if |a|_N > 1,
where r ∈ Z is sufficiently large so that |c|_1 > 1 and |c|_n < 1 for 2 ≤ n ≤ N.
the ord of a at v. (Some authors, including me (!) also call this integer the valuation
of a with respect to v.) If p = (π 0 ), then π/π 0 is a unit, and conversely, so ord(a) is
independent of the choice of π.
Let Ov and pv be defined with respect to the completion Kv of K at v.
ϕ : Ov /pv → O/p,
Proof. We may view Ov as the set of equivalence classes of Cauchy sequences (an )
in K such that an ∈ O for n sufficiently large. For any ε, given such a sequence
Assume for the rest of this section that K is complete with respect to | · |.
Notice that O is uncountable since there are p choices for each p-adic “digit”. We
can do arithmetic with elements of Z_p, which can be thought of “backwards” as
numbers in base p. For example, with p = 3 we have
(1 + 2 · 3 + 3^2 + · · · ) + (2 + 2 · 3 + 3^2 + · · · )
= 3 + 4 · 3 + 2 · 3^2 + · · · not in canonical form
= 0 + 2 · 3 + 3 · 3 + 2 · 3^2 + · · · still not canonical
= 0 + 2 · 3 + 0 · 3^2 + · · ·
1 + 3 + O(3^3)
sage: sqrt(a)^2
1 + 2*3 + 3^2 + O(3^3)
sage: a * b
2 + O(3^3)
Type Zp? and Qp? in Sage for much more information about the various computer
models of p-adic arithmetic that are available.
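The carrying procedure in the base-3 example above can be sketched in plain Python. This is a minimal illustration, not Sage's implementation: the function name `padic_add` and the convention of storing digits least significant first are assumptions made here.

```python
def padic_add(x_digits, y_digits, p):
    """Add two truncated p-adic integers given as lists of base-p digits
    (least significant digit first) and return the canonical digits of the
    sum modulo p**len(x_digits)."""
    assert len(x_digits) == len(y_digits)
    result, carry = [], 0
    for a, b in zip(x_digits, y_digits):
        # bring the digit sum back into the range 0..p-1, carrying to the right
        carry, digit = divmod(a + b + carry, p)
        result.append(digit)
    return result  # the final carry lies beyond the working precision

# (1 + 2*3 + 3^2) + (2 + 2*3 + 3^2) modulo 3^3
print(padic_add([1, 2, 1], [2, 2, 1], 3))
```

The carries propagate toward higher powers of p, which is why p-adic numbers can be thought of as base-p numbers written "backwards".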
Theorem 15.1.4. Under the conditions of the preceding lemma, O is compact with
respect to the | · | -topology.
Proof. Let Vλ , for λ running through some index set Λ, be some family of open sets
that cover O. We must show that there is a finite subcover. We suppose not.
Let R be a set of representatives for O/p. Then O is the union of the finite
number of cosets a + πO, for a ∈ R. Hence for at least one a_0 ∈ R the set a_0 + πO
is not covered by finitely many of the V_λ. Then similarly there is an a_1 ∈ R such
that a_0 + a_1 π + π^2 O is not finitely covered. And so on. Let
a = a_0 + a_1 π + a_2 π^2 + ··· ∈ O.
Then a ∈ V_{λ_0} for some λ_0 ∈ Λ. Since V_{λ_0} is an open set, a + π^J · O ⊂ V_{λ_0} for some J
(since those are exactly the open balls that form a basis for the topology). This is
a contradiction because we constructed a so that none of the sets a + π^n · O is
covered by any finite subset of the V_λ.
Remark 15.1.7. The converse is also true. If K is locally compact with respect to a
non-archimedean valuation | · | , then
1. K is complete,
For there is a compact neighbourhood C of 0. Let π be any nonzero element with |π| < 1.
Then π^n · O ⊂ C for sufficiently large n, so π^n · O is compact, being closed. Hence O
is compact. Since | · | is a metric, O is sequentially compact, i.e., every fundamental
Proof. The whole group G is open, so there is a covering Uα of G by open sets each
of which has finite measure. Since C is compact, there is a finite subset of the Uα
that covers C. The measure of C is at most the sum of the measures of these finitely
many Uα , hence finite.
µn = q · µn+1 .
If we normalize µ by putting
µ(O) = 1
15.1. FINITE RESIDUE FIELD CASE 175
µ_n = q^{−n}.
We can express the result of the theorem in a more suggestive way. Let b ∈ K
with b ≠ 0, and let µ be a Haar measure on K^+ (not necessarily normalized as
in the theorem). Then we can define a new Haar measure µb on K + by putting
µb (E) = µ(bE) for E ⊂ K + . But Haar measure is unique up to a multiplicative
constant and so µb (E) = µ(bE) = c · µ(E) for all measurable sets E, where the
factor c depends only on b. Putting E = O, shows that the theorem implies that c
is just |b|, when | · | is the normalized valuation.
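The identity µ(bO) = |b| · µ(O) can be illustrated in a finite model. The sketch below assumes K = Q_p and replaces O = Z_p by the finite ring Z/p^n Z with normalized counting measure; it is only meaningful when the p-adic valuation of b is less than n, and the name `measure_of_multiples` is illustrative.

```python
from fractions import Fraction

def measure_of_multiples(b, p, n):
    """Normalized counting measure of b*(Z/p^n Z) inside Z/p^n Z,
    a finite stand-in for the Haar measure of b*Z_p inside Z_p."""
    modulus = p ** n
    image = {(b * x) % modulus for x in range(modulus)}
    return Fraction(len(image), modulus)

# 18 = 2 * 3^2, so |18|_3 = 1/9; 5 is a 3-adic unit, so |5|_3 = 1.
print(measure_of_multiples(18, 3, 4), measure_of_multiples(5, 3, 4))
```

Multiplication by a unit permutes the residues and preserves the measure, while multiplication by p^k shrinks it by the factor p^{-k}.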
Remark 15.1.14. The theory of locally compact topological groups leads to the
consideration of the dual (character) group of K + . It turns out that it is isomorphic
to K + . We do not need this fact for class field theory, so do not prove it here. For
a proof and applications see Tate’s thesis or Lang’s Algebraic Numbers, and for
generalizations see Weil’s Adeles and Algebraic Groups and Godement’s Bourbaki
seminars 171 and 176. The determination of the character group of K ∗ is local class
field theory.
The set of nonzero elements of K is a group K ∗ under multiplication. Multipli-
cation and inverses are continuous with respect to the topology induced on K ∗ as
a subset of K, so K ∗ is a topological group with this topology. We have
U1 ⊂ U ⊂ K ∗
where U is the group of units of O ⊂ K and U1 is the group of 1-units, i.e., those
units ε ∈ U with |ε − 1| < 1, so
U1 = 1 + πO.
The set U = {x ∈ K : |x| = 1} is a union of cosets of the open ball U_1 = 1 + πO about 1 of
radius 1, so is open, and because the metric is nonarchimedean U is also closed.
Likewise, U_1 is both open and closed.
The quotient K ∗ /U = {π n · U : n ∈ Z} is isomorphic to the additive group Z+
of integers with the discrete topology, where the map is
π n · U 7→ n for n ∈ Z.
K^* = {π^n ζ^m ε : n ∈ Z, m ∈ Z/(q − 1)Z, ε ∈ U_1} ≅ Z × Z/(q − 1)Z × U_1.
(How to apply Hensel’s lemma: Let f(x) = x^{q−1} − 1 and let a ∈ O be such that a
mod p generates (O/p)^*. Then |f(a)| < 1 and |f′(a)| = 1. By Hensel’s lemma there is
a ζ ∈ K such that f(ζ) = 0 and ζ ≡ a (mod p).)
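The Hensel step can be made concrete with Newton iteration. The following sketch assumes K = Q_p (so q = p) and works modulo p^n instead of in Z_p; the function name `hensel_root_of_unity` is hypothetical, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
def hensel_root_of_unity(a, p, n):
    """Lift a unit a mod p to the root z of x^(p-1) - 1 modulo p^n
    with z congruent to a mod p, by Newton iteration."""
    modulus = p ** n
    z = a % modulus
    precision = 1                      # z is a root modulo p^precision
    while precision < n:
        f = pow(z, p - 1, modulus) - 1
        df = (p - 1) * pow(z, p - 2, modulus)          # f'(z), a unit mod p
        z = (z - f * pow(df, -1, modulus)) % modulus   # Newton step
        precision *= 2                 # each step doubles the p-adic precision
    return z

z = hensel_root_of_unity(2, 5, 4)      # a 4th root of unity in Z/625Z lifting 2
print(z, pow(z, 4, 5 ** 4))
```

The hypotheses |f(a)| < 1 and |f′(a)| = 1 correspond here to f(a) ≡ 0 (mod p) and f′(a) being invertible mod p, which is what makes each Newton step well defined.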
Since U is compact and the cosets of U cover K^*, we see that K^* is locally
compact.
u · (1 + π n O) = (1 + a1 π + a2 π 2 + · · · ) · (1 + π n O)
= 1 + a1 π + a2 π 2 + · · · + π n O
= a1 π + a2 π 2 + · · · + (1 + π n O),
Lemma 15.1.16. The topological spaces K + and K ∗ are totally disconnected (the
only connected sets are points).
Proof. The proof is the same as that of Proposition 14.2.13. The point is that the
non-archimedean triangle inequality forces the complement of an open disc to be open,
hence any set with at least two distinct elements “falls apart” into a union
of two disjoint open subsets.
Remark 15.1.17. Note that K ∗ and K + are locally isomorphic if K has character-
istic 0. We have the exponential map
\[
a \mapsto \exp(a) = \sum_{n=0}^{\infty} \frac{a^n}{n!}
\]
Much of this chapter is preparation for what we will do later when we will prove
that if K is complete with respect to a valuation (and locally compact) and L is
a finite extension of K, then there is a unique valuation on L that extends the
valuation on K. Also, if K is a number field, v = | · | is a valuation on K, Kv is
the completion of K with respect to v, and L is a finite extension of K, we’ll prove
that
\[
K_v \otimes_K L = \bigoplus_{j=1}^{J} L_j,
\]
where the Lj are the completions of L with respect to the equivalence classes of
extensions of v to L. In particular, if L is a number field defined by a root of
f (x) ∈ Q[x], then
\[
\mathbf{Q}_p \otimes_{\mathbf{Q}} L = \bigoplus_{j=1}^{J} L_j,
\]
where the Lj correspond to the irreducible factors of the polynomial f (x) ∈ Qp [x]
(hence the extensions of | · |p correspond to irreducible factors of f (x) over Qp [x]).
In preparation for this clean view of the local nature of number fields, we will
prove that the norms on a finite-dimensional vector space over a complete field are
all equivalent. We will also explicitly construct tensor products of fields and deduce
some of their properties.
180 CHAPTER 16. NORMED SPACES AND TENSOR PRODUCTS
Note that setting ‖v‖ = 1 for all v ≠ 0 does not define a norm unless the absolute
value on K is trivial, as 1 = ‖av‖ = |a| ‖v‖ = |a|. We assume for the rest of this
section that | · | is not trivial.
Lemma 16.1.3. Suppose that K is a field that is complete with respect to a valuation
| · | and that V is a finite dimensional K vector space. Continue to assume, as
mentioned above, that | · | is nontrivial. Then any two norms on
V are equivalent.
Remark 16.1.4. As we shall see soon (see Theorem 17.1.8), the lemma is usually
false if we do not assume that K is complete. For example, when K = Q and | · |p is
the p-adic valuation, and V is a number field, then there may be several extensions
of | · |p to inequivalent norms on V .
If two norms are equivalent then the corresponding topologies on V are equal,
since every open ball for ‖ · ‖_1 is contained in an open ball for ‖ · ‖_2, and conversely.
(The converse is also true, since, as we will show, all norms on V are equivalent.)
where c_1 = \sum_{n=1}^{N} ‖v_n‖.
To finish the proof, we show that there is a c_2 ∈ R such that for all v ∈ V,
‖v‖_0 ≤ c_2 · ‖v‖.
16.2. TENSOR PRODUCTS 181
We will only prove this in the case when K is not merely complete with respect
to | · | but also locally compact. This will be the case of primary interest to us. For
a proof in the general case, see the original article by Cassels (page 53).
By what we have already shown, the function ‖v‖ is continuous in the ‖ · ‖_0-topology,
so by local compactness it attains its lower bound δ on the unit circle
{v ∈ V : ‖v‖_0 = 1}. (Why is the unit circle compact? With respect to ‖ · ‖_0, the
topology on V is the same as that of a product of copies of K. If the valuation
is archimedean then K ≅ R or C with the standard topology and the unit circle
is compact. If the valuation is non-archimedean, then we saw (see Remark 15.1.7)
that if K is locally compact, then the valuation is discrete, in which case we showed
that the unit disc is compact, hence the unit circle is also compact since it is closed.)
Note that δ > 0 by part 1 of Definition 16.1.1. Also, by definition of ‖ · ‖_0, for any
v ∈ V there exists a ∈ K such that ‖v‖_0 = |a| (just take the max coefficient in
our basis). Thus we can write any v ∈ V as a · w where a ∈ K and w ∈ V with
‖w‖_0 = 1. We then have
We define a new ring C containing K whose elements are the set of all expressions
\[
\sum_{n=1}^{N} a_n w_n
\]
as the wn .
There are injective ring homomorphisms
and
\[
j : B \hookrightarrow C, \qquad j\Bigl(\sum_{n=1}^{N} c_n w_n\Bigr) = \sum_{n=1}^{N} c_n w_n.
\]
Lemma 16.2.1. Let A and B be fields containing the field K and suppose that B is
a separable extension of finite degree N = [B : K]. Then C = A ⊗K B is the direct
sum of a finite number of fields Kj , each containing an isomorphic image of A and
an isomorphic image of B.
A ⊗_K B = A[b] ≅ A[x]/(f(x))
where gj (x) ∈ A[x] is irreducible. The gj (x) are distinct because f (x) is separable
(i.e., has distinct roots in any algebraic closure).
For each j, let b_j ∈ Ā be a root of g_j(x), where Ā is a fixed algebraic closure of
the field A. Let K_j = A(b_j). Then the map
ϕj : A ⊗K B → Kj (16.2.1)
are injections. Since B and Kj are both fields, λj is either the 0 map or injective.
However, λj is not the 0 map since λj (1) = 1 ∈ Kj .
Q(i) ⊗_Q Q(i) ≅ Q(i)[x]/(x^2 + 1) ≅ Q(i)[x]/((x − i)(x + i)) ≅ Q(i) ⊕ Q(i),
since (x − i) and (x + i) are coprime. The last isomorphism sends a + bx, with
a, b ∈ Q(i), to (a + bi, a − bi). Since Q(i) ⊕ Q(i) has zero divisors, the tensor product
Q(i) ⊗_Q Q(i) must also have zero divisors. For example, (1, 0) and (0, 1) are a zero
divisor pair on the right hand side, and we can trace back to the elements of the
tensor product that they define. First, by solving the system
a + bi = 1 and a − bi = 0
we see that (1, 0) corresponds to a = 1/2 and b = −i/2, i.e., to the element
\[
\frac{1}{2} - \frac{i}{2}\, x \in \mathbf{Q}(i)[x]/(x^2 + 1).
\]
This element in turn corresponds to
\[
\frac{1}{2} \otimes 1 - \frac{i}{2} \otimes i \in \mathbf{Q}(i) \otimes_{\mathbf{Q}} \mathbf{Q}(i).
\]
Similarly the other element (0, 1) corresponds to
\[
\frac{1}{2} \otimes 1 + \frac{i}{2} \otimes i \in \mathbf{Q}(i) \otimes_{\mathbf{Q}} \mathbf{Q}(i).
\]
As a double check, observe that
\[
\Bigl(\frac{1}{2} \otimes 1 - \frac{i}{2} \otimes i\Bigr) \cdot \Bigl(\frac{1}{2} \otimes 1 + \frac{i}{2} \otimes i\Bigr)
= \frac{1}{4} \otimes 1 + \frac{i}{4} \otimes i - \frac{i}{4} \otimes i - \frac{i^2}{4} \otimes i^2
= \frac{1}{4} \otimes 1 - \frac{1}{4} \otimes 1 = 0 \in \mathbf{Q}(i) \otimes_{\mathbf{Q}} \mathbf{Q}(i).
\]
Clearing the denominator of 2 and writing 1 ⊗ 1 = 1, we have (1 − i ⊗ i)(1 + i ⊗ i) = 0,
so i ⊗ i is a root of the polynomial x^2 − 1, and i ⊗ i is not ±1, so x^2 − 1 has more
than 2 roots in Q(i) ⊗_Q Q(i).
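The zero divisors found above can be verified by machine. The sketch below models Q(i) ⊗_Q Q(i) as Q(i)[x]/(x^2 + 1), where x stands for 1 ⊗ i; the class `GaussQ` and the function `mul_mod` are illustrative names introduced here, not part of the text.

```python
from fractions import Fraction

class GaussQ:
    """An element re + im*i of Q(i), with exact rational coordinates."""
    def __init__(self, re, im=0):
        self.re, self.im = Fraction(re), Fraction(im)
    def __add__(self, o):
        return GaussQ(self.re + o.re, self.im + o.im)
    def __sub__(self, o):
        return GaussQ(self.re - o.re, self.im - o.im)
    def __mul__(self, o):
        return GaussQ(self.re * o.re - self.im * o.im,
                      self.re * o.im + self.im * o.re)
    def __eq__(self, o):
        return (self.re, self.im) == (o.re, o.im)

def mul_mod(u, v):
    """Multiply a + b*x and c + d*x in Q(i)[x]/(x^2 + 1).
    Elements are pairs (constant term, coefficient of x)."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)   # uses x^2 = -1

zero, one = GaussQ(0), GaussQ(1)
half, i_half = GaussQ(Fraction(1, 2)), GaussQ(0, Fraction(1, 2))
e1 = (half, zero - i_half)   # 1/2 - (i/2) x, corresponding to (1, 0)
e2 = (half, i_half)          # 1/2 + (i/2) x, corresponding to (0, 1)
t = (zero, GaussQ(0, 1))     # i*x, corresponding to i (x) i
```

With these definitions, `mul_mod(e1, e2)` is the zero element, each of `e1` and `e2` is idempotent, and `mul_mod(t, t)` equals 1 even though `t` is neither 1 nor −1.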
In general, to understand A ⊗K B explicitly is the same as factoring either the
defining polynomial of B over the field A, or factoring the defining polynomial of A
over B.
Corollary 16.2.4. Let a ∈ B be any element and let f(x) ∈ K[x] be the characteristic
polynomial of a over K and let g_j(x) ∈ A[x] (for 1 ≤ j ≤ J) be the
characteristic polynomials of the images of a under B → A ⊗_K B → K_j over A,
respectively. Then
\[
f(x) = \prod_{j=1}^{J} g_j(x). \tag{16.2.2}
\]
Proof. We show that both sides of (16.2.2) are the characteristic polynomial T (x) of
the image of a in A ⊗K B over A. That f (x) = T (x) follows at once by computing
the characteristic polynomial in terms of a basis w1 , . . . , wN of A ⊗K B, where
w1 , . . . , wN is a basis for B over K (this is because the matrix of left multiplication
and
\[
\mathrm{Tr}_{B/K}(a) = \sum_{j=1}^{J} \mathrm{Tr}_{K_j/A}(a),
\]
Proof. This follows from Corollary 16.2.4. First, the norm is ± the constant term of
the characteristic polynomial, and the constant term of the product of polynomials is
the product of the constant terms (and one sees that the sign matches up correctly).
Second, the trace is minus the second coefficient of the characteristic polynomial,
and second coefficients add when one multiplies polynomials:
(x^n + a_{n−1} x^{n−1} + ···) · (x^m + a_{m−1} x^{m−1} + ···) = x^{n+m} + x^{n+m−1} (a_{m−1} + a_{n−1}) + ···.
One could also see both the statements by considering a matrix of left multiplication
by a first with respect to the basis of wn and second with respect to the basis coming
from the left side of (16.2.3).
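The two coefficient facts used in this proof are easy to check on a small example. The sketch below uses integer polynomials stored as coefficient lists, highest degree first; the helper `poly_mul` and the sample factors are assumptions chosen for illustration.

```python
def poly_mul(f, g):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

g1 = [1, 0, -2]        # x^2 - 2
g2 = [1, -3]           # x - 3
f = poly_mul(g1, g2)   # x^3 - 3x^2 - 2x + 6

# second coefficients add, constant terms multiply
print(f[1] == g1[1] + g2[1], f[-1] == g1[-1] * g2[-1])
```

Since the trace is minus the second coefficient and the norm is, up to sign, the constant term, this is the arithmetic behind the additivity of traces and multiplicativity of norms.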
Chapter 17
where the N th root is the non-negative real N th root of the nonnegative real number
NormL/K (a).
188CHAPTER 17. EXTENSIONS AND NORMALIZATIONS OF VALUATIONS
Existence. We do not give a proof of existence in the general case. Instead we give
a proof, which was suggested by Dr. Geyer at the conference out of which [Cas67]
arose. It is valid when K is locally compact, which is the only case we will use later.
We see at once that the function defined in (17.1.1) satisfies the condition (i)
that ‖a‖ ≥ 0 with equality only for a = 0, and (ii) ‖ab‖ = ‖a‖ · ‖b‖ for all a, b ∈ L.
The difficult part of the proof is to show that there is a constant C > 0 such that
‖a‖ ≤ 1 ⟹ ‖1 + a‖ ≤ C.
Note that we do not know (and will not show) that ‖ · ‖ as defined by (17.1.1) is a
norm as in Definition 16.1.1, since showing that ‖ · ‖ is a norm would entail showing
that it satisfies the triangle inequality, which is not obvious.
Choose a basis b_1, ..., b_N for L over K. Let ‖ · ‖_0 be the max norm on L, so for
a = \sum_{i=1}^{N} c_i b_i with c_i ∈ K we have
\[
\|a\|_0 = \Bigl\| \sum_{i=1}^{N} c_i b_i \Bigr\|_0 = \max\{ |c_i| : i = 1, \dots, N \}.
\]
(Note: in Cassels’s original article he let ‖ · ‖_0 be any norm, but we don’t because
the rest of the proof does not work, since we can’t use homogeneity as he claims
to do. This is because it need not be possible to find, for any nonzero a ∈ L some
element c ∈ K such that ‖ac‖_0 = 1. This would fail, e.g., if ‖a‖_0 ≠ |c| for any
c ∈ K.) The rest of the argument is very similar to our proof from Lemma 16.1.3
of uniqueness of norms on vector spaces over complete fields.
With respect to the ‖ · ‖_0-topology, L has the product topology as a product of
copies of K. The function a ↦ ‖a‖ is a composition of continuous functions on L
with respect to this topology (e.g., Norm_{L/K} is the determinant, hence polynomial),
hence ‖ · ‖ defines a nonzero continuous function on the compact set
S = {a ∈ L : ‖a‖_0 = 1}.
For any nonzero a ∈ L there exists c ∈ K such that ‖a‖_0 = |c|; to see this take c to
be a c_i in the expression a = \sum_{i=1}^{N} c_i b_i with |c_i| ≥ |c_j| for any j. Hence ‖a/c‖_0 = 1,
so a/c ∈ S and
\[
0 \le \delta \le \frac{\|a/c\|}{\|a/c\|_0} \le \Delta.
\]
Then by homogeneity
\[
0 \le \delta \le \frac{\|a\|}{\|a\|_0} \le \Delta.
\]
17.1. EXTENSIONS OF VALUATIONS 189
\[
\|1 + a\| \le \Delta \cdot \|1 + a\|_0
\le \Delta \cdot (\|1\|_0 + \|a\|_0)
\le \Delta \cdot (\|1\|_0 + \delta^{-1})
= C \quad \text{(say)},
\]
as required.
Remark 17.1.5. Geyer’s existence proof gives (17.1.1). But it is perhaps worth
noting that in any case (17.1.1) is a consequence of unique existence, as follows.
Suppose L/K is as above. Suppose M is a finite Galois extension of K that con-
tains L. Then by assumption there is a unique extension of | · | to M , which we
shall also denote by ‖ · ‖. If σ ∈ Gal(M/K), then
‖a‖_σ := ‖σ(a)‖
But now
NormL/K (a) = σ1 (a) · σ2 (a) · · · σN (a)
as required.
Corollary 17.1.6. Let w1 , . . . , wN be a basis for L over K. Then there are positive
constants c1 and c2 such that
\[
c_1 \le \frac{\Bigl\| \sum_{n=1}^{N} b_n w_n \Bigr\|}{\max\{ |b_n| : n = 1, \dots, N \}} \le c_2
\]
Then choose c_3, c_4 such that ‖ · ‖^{c_3} and | · |^{c_4} satisfy the triangle inequality, and
prove the modified corollary using the proof suggested by Cassels.
\[
K_v \otimes_K L \cong \bigoplus_{1 \le j \le J} L_j \tag{17.1.2}
\]
algebraically and topologically, where the right hand side is given the product topol-
ogy.
λj : L → Kv ⊗K L → Lj
‖a‖ ≤ 1 ⟹ ‖1 + a‖ ≤ C.
so for any a_1 ∈ L_1, a_2 ∈ L_2,
‖a_1‖ · ‖a_2‖ = 0.
Hence ‖ · ‖ induces a valuation in precisely one of the L_j, and it extends the given
valuation | · | of K_v. Hence ‖ · ‖ = ‖ · ‖_j for precisely one j.
It remains only to show that (17.1.2) is a homeomorphism. For
(b_1, ..., b_J) ∈ L_1 ⊕ ··· ⊕ L_J
put
\[
\|(b_1, \dots, b_J)\|_0 = \max_{1 \le j \le J} \|b_j\|_j.
\]
Then ‖ · ‖_0 is a norm on the right hand side of (17.1.2), considered as a vector space
over K_v and it induces the product topology. On the other hand, any two norms
are equivalent, since K_v is complete, so ‖ · ‖_0 induces the tensor product topology
on the left hand side of (17.1.2).
Corollary 17.1.9. Suppose L = K(a), and let f(x) ∈ K[x] be the minimal polynomial
of a. Suppose that
\[
f(x) = \prod_{1 \le j \le J} g_j(x)
\]
|x + iy| = x^2 + y^2 (normalized).
a : x 7→ ax
on K + multiplies any choice of Haar measure by |a|, and this characterizes the
normalized valuations among equivalent ones.
We have already verified the above characterization for non-archimedean valua-
tions, and it is clear for the ordinary absolute value on R, so it remains to verify it
17.2. EXTENSIONS OF NORMALIZED VALUATIONS 193
The constant C in axiom (3) of a valuation for the ordinary absolute value on C is
2, so the constant for the normalized valuation | · | is C ≤ 4:
|x + iy| ≤ 1 ⟹ |x + iy + 1| ≤ 4.
(x + 1)^2 + y^2 = x^2 + 2x + 1 + y^2 ≤ 1 + 2x + 1 ≤ 4
since x ≤ 1.
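A quick numerical look at the normalized valuation on C shows both that the bound 4 is attained and that the triangle inequality fails. This is only a floating-point sketch; the function name `normalized_abs` and the sampling of the unit circle at 360 points are choices made here.

```python
import cmath
import math

def normalized_abs(z):
    """Normalized valuation on C: the square of the usual absolute value."""
    z = complex(z)
    return z.real ** 2 + z.imag ** 2

# On the closed unit disc, |1 + z| is largest on the boundary circle.
worst = max(normalized_abs(1 + cmath.exp(2j * math.pi * k / 360))
            for k in range(360))
print(worst)                                          # attained at z = 1
print(normalized_abs(2), normalized_abs(1) + normalized_abs(1))
```

The second line of output compares |1 + 1| with |1| + |1| for the normalized valuation, which is why the weaker axiom with a constant C is needed here.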
Lemma 17.2.1. Suppose K is a field that is complete with respect to a normalized
valuation | · | and let L be a finite extension of K of degree N = [L : K]. Then the
normalized valuation ‖ · ‖ on L which is equivalent to the unique extension of | · |
to L is given by the formula
‖a‖ = |Norm_{L/K}(a)| for all a ∈ L. (17.2.1)
In the case when K need not be complete with respect to the valuation | · | on K,
we have the following theorem.
Theorem 17.2.2. Suppose | · | is a (nontrivial as always) normalized valuation of
a field K and let L be a finite extension of K. Then for any a ∈ L,
\[
\prod_{1 \le j \le J} \|a\|_j = \bigl|\mathrm{Norm}_{L/K}(a)\bigr|
\]
What next?! We are building up to giving a new proof of finiteness of the class
group that uses that the class group naturally has the discrete topology and is the
continuous image of a compact group.
Chapter 18
In this chapter, we will focus attention on number fields, and leave the function
field case to the reader.
The following lemma essentially says that the denominator of an element of a
global field is only “nontrivial” at a finite number of valuations.
|a| > 1.
a^n + c_{n−1} a^{n−1} + ··· + c_0 = 0,
196 CHAPTER 18. GLOBAL FIELDS AND ADELES
We know the lemma for Q, so there are only finitely many valuations | · | on Q such
that the right hand side of (18.1.1) is bigger than 1. Since each valuation of Q
has finitely many extensions to K, and there are only finitely many archimedean
valuations, it follows that there are only finitely many valuations on K such that
|a| > 1.
We will later give a more conceptual proof of this using Haar measure (see
Remark 18.3.9).
Proof. By Lemma 18.1.2, we have |a|v ≤ 1 for almost all v. Likewise, 1/ |a|v =
|1/a|v ≤ 1 for almost all v, so |a|v = 1 for almost all v.
Let w run through all normalized valuations of Q (or of F(t)), and write v | w if
the restriction of v to Q is equivalent to w. Then by Theorem 17.2.2,
\[
\prod_v |a|_v = \prod_w \Bigl( \prod_{v \mid w} |a|_v \Bigr) = \prod_w \bigl|\mathrm{Norm}_{K/\mathbf{Q}}(a)\bigr|_w,
\]
as claimed.
Ov = OK,v = {x ∈ Kv : |x| ≤ 1}
Definition 18.1.5 (Almost All). We say a condition holds for almost all elements
of a set if it holds for all but finitely many elements.
We will use the following lemma later (see Lemma 18.3.3) to prove that formation
of the adeles of a global field is compatible with base change.
Proof. The proof proceeds in two steps. First we deduce easily from Lemma 18.1.2
that for almost all v the left hand side of (18.1.2) is contained in the right hand
side. Then we use a trick involving discriminants to show the opposite inclusion for
all but finitely many primes.
Since Ov ⊂ Owi for all i, the left hand side of (18.1.2) is contained in the right
hand side if |ωi |wj ≤ 1 for 1 ≤ i ≤ n and 1 ≤ j ≤ g. Thus by Lemma 18.1.2, for all
but finitely many v the left hand side of (18.1.2) is contained in the right hand side.
We have just eliminated the finitely many primes corresponding to “denominators”
of some ωi , and now only consider v such that ω1 , . . . , ωn ∈ Ow for all w | v.
For any elements a1 , . . . , an ∈ Kv ⊗K L, consider the discriminant
D(a1 , . . . , an ) = det(Tr(ai aj )) ∈ Kv ,
where the trace is induced from the L/K trace. Since each ωi is in each Ow , for
w | v, the traces lie in Ov , so
d = D(ω1 , . . . , ωn ) ∈ Ov .
where d ∈ K. Since ω_1, ..., ω_n are a basis for L over K and the trace pairing is
nondegenerate, we have d ≠ 0, so by Theorem 18.1.4 we have |d|_v = 1 for all but
finitely many v. Then for all but finitely many v we have that a_m^2 ∈ O_v. For these
v, that a_m^2 ∈ O_v implies a_m ∈ O_v since a_m ∈ K_v, i.e., α is in the left hand side of
(18.1.2).
Example 18.1.7. Let K = Q and L = Q(√2). Let ω_1 = 1/3 and ω_2 = 2√2. In the
first stage of the above proof we would eliminate | · |_3 because ω_1 is not integral at
3. The discriminant is
\[
d = D\Bigl(\frac{1}{3},\, 2\sqrt{2}\Bigr) = \det\begin{pmatrix} \frac{2}{9} & 0 \\ 0 & 16 \end{pmatrix} = \frac{32}{9}.
\]
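The discriminant in this example can be recomputed directly from the trace matrix. The sketch below represents a + b√2 as the pair (a, b) of rationals; the helper names `mul`, `trace`, and `discriminant` are illustrative and specific to Q(√2).

```python
from fractions import Fraction

def mul(u, v):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), stored as pairs (a, b), (c, d)."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def trace(u):
    """Trace from Q(sqrt(2)) to Q: (a + b*sqrt(2)) + (a - b*sqrt(2)) = 2a."""
    return 2 * u[0]

def discriminant(w1, w2):
    """Determinant of the 2x2 matrix (Tr(w_i * w_j))."""
    t11 = trace(mul(w1, w1))
    t12 = trace(mul(w1, w2))
    t22 = trace(mul(w2, w2))
    return t11 * t22 - t12 * t12

w1 = (Fraction(1, 3), Fraction(0))   # omega_1 = 1/3
w2 = (Fraction(0), Fraction(2))      # omega_2 = 2*sqrt(2)
d = discriminant(w1, w2)
print(d)
```

The primes dividing the numerator or denominator of d, here 2 and 3, are the ones the second stage of the proof sets aside.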
The restricted topological product depends on the totality of the Yλ , but not on
the individual Yλ :
Lemma 18.2.3. Let Y′_λ ⊂ X_λ be open subsets, and suppose that Y_λ = Y′_λ for
almost all λ. Then the restricted topological product of the X_λ with respect to the
Y′_λ is canonically isomorphic to the restricted topological product with respect to the
Y_λ.
Lemma 18.2.4. Suppose that the Xλ are locally compact and that the Yλ are com-
pact. Then the restricted topological product X of the Xλ is locally compact.
Proof. For any finite subset S of Λ, the open subset XS ⊂ X is locally compact,
because by Lemma 18.2.2 it is a product of finitely many locally compact sets with
an infinite product of compact sets. (Here we are using Tychonoff’s theorem from
topology, which asserts that an arbitrary product of compact topological spaces is
compact (see Munkres’s Topology, a first course, chapter 5).) Since X = ∪S XS ,
and the XS are open in X, the result follows.
where each Mλ ⊂ Xλ has finite µλ -measure and Mλ = Yλ for almost all λ, and
where
\[
\mu\Bigl(\prod_{\lambda} M_\lambda\Bigr) = \prod_{\lambda} \mu_\lambda(M_\lambda).
\]
Definition 18.3.1 (Adele Ring). The adele ring AK of K is the topological ring
whose underlying topological space is the restricted topological product of the Kv
with respect to the Ov , and where addition and multiplication are defined compo-
nentwise:
K ,→ AK (18.3.2)
that sends x ∈ K to the adele every one of whose components is x. This is an adele
because x ∈ Ov for almost all v, by Lemma 18.1.2. The map is injective because
each map K → Kv is an inclusion.
Definition 18.3.2 (Principal Adeles). The image of (18.3.2) is the ring of principal
adeles.
It will cause no trouble to identify K with the principal adeles, so we shall speak
of K as a subring of AK .
Formation of the adeles is compatible with base change, in the following sense.
18.3. THE ADELE RING 201
L ≅ K ⊗_K L ⊂ A_K ⊗_K L
Proof. Let ω1 , . . . , ωn be a basis for L/K and let v run through the normalized
valuations on K. The left hand side of (18.3.3), with the tensor product topology,
is the restricted product of the tensor products
K_v ⊗_K L ≅ K_v · ω_1 ⊕ ··· ⊕ K_v · ω_n
Ov · ω1 ⊕ · · · ⊕ Ov · ωn . (18.3.4)
(An element of the left hand side is a finite linear combination ∑ x_i ⊗ a_i of adeles
x_i ∈ A_K and coefficients a_i ∈ L, and there is a natural isomorphism from the ring
of such formal sums to the restricted product of the K_v ⊗_K L.)
We proved before (Theorem 17.1.8) that
K_v ⊗_K L ≅ L_{w_1} ⊕ ··· ⊕ L_{w_g},
Corollary 18.3.4. Let A_K^+ denote the topological group obtained from the additive
structure on A_K. Suppose L is a finite separable extension of K. Then
A_L^+ = A_K^+ ⊕ ··· ⊕ A_K^+, ([L : K] summands).
A_L^+ = A_K^+ ⊗_K L ≅ ω_1 · A_K^+ ⊕ ··· ⊕ ω_n · A_K^+ ≅ A_K^+ ⊕ ··· ⊕ A_K^+.
If a ∈ L, write a = ∑ b_i ω_i, with b_i ∈ K. Then a maps via the above map to
where {bi } denotes the principal adele defined by bi . Under the final map, x maps
to the tuple
(b_1, ..., b_n) ∈ K ⊕ ··· ⊕ K ⊂ A_K^+ ⊕ ··· ⊕ A_K^+.
The dimensions of L and of K ⊕ · · · ⊕ K over K are the same, so this proves the
final claim of the corollary.
where | · |p and | · |∞ are respectively the p-adic and the usual archimedean absolute
values on Q. If b ∈ Q ∩ U , then in the first place b ∈ Z because |b|p ≤ 1 for all
p, and then b = 0 because |b|_∞ < 1. This proves that K^+ is discrete in A_Q^+. (If
we leave out one valuation, as we will see later (Theorem 18.4.4), this theorem is
false—what goes wrong with the proof just given?)
Next we prove that the quotient A_Q^+/Q^+ is compact. Let W ⊂ A_Q^+ consist of
the x = {x_v}_v ∈ A_Q^+ with
|x_∞|_∞ ≤ 1/2 and |x_p|_p ≤ 1 for all primes p.
We show that every adele y = {yv }v is of the form
y = a + x, a ∈ Q, x ∈ W,
which will imply that the compact set W maps surjectively onto A_Q^+/Q^+. Fix an
adele y = {y_v} ∈ A_Q^+. Since y is an adele, for each prime p we can find a rational
number
r_p = z_p / p^{n_p} with z_p ∈ Z and n_p ∈ Z_{≥0}
such that
|yp − rp |p ≤ 1,
and
rp = 0 almost all p.
More precisely, for the finitely many p such that
\[
y_p = \sum_{n \ge -|s|} a_n p^n \notin \mathbf{Z}_p,
\]
A_Q^+/Q^+ is surjective. But W is compact (being the topological product of the
compact spaces |x_∞|_∞ ≤ 1/2 and the Z_p for all p), hence A_Q^+/Q^+ is also compact.
Proof. We constructed such a set for K = Q when proving Theorem 18.3.5. For
general K the W coming from the proof determines component-wise a subset of
A_K^+ ≅ A_Q^+ ⊕ ··· ⊕ A_Q^+ that is a subset of a set with the properties claimed by the
corollary.
Remark 18.3.8. This statement is independent of the particular choice of the multiplicative
constant in the Haar measure on A_K^+. We do not here go into the question
of finding the measure of A_K^+/K^+ in terms of our explicitly given Haar measure. (See
Tate’s thesis, [Cp86, Chapter XV].)
Proof. This can be reduced similarly to the case of Q or F(t) which is immediate,
e.g., the W defined above has measure 1 for our Haar measure.
Alternatively, finite measure follows from compactness. To see this, cover A_K^+/K^+
with the translates of U , where U is a nonempty open set with finite measure. The
existence of a finite subcover implies finite measure.
Remark 18.3.9. We give an alternative proof of the product formula ∏ |a|_v = 1
for nonzero a ∈ K. We have seen that if x_v ∈ K_v, then multiplication by x_v
magnifies the Haar measure in K_v^+ by a factor of |x_v|_v. Hence if x = {x_v} ∈ A_K,
then multiplication by x magnifies the Haar measure in A_K^+ by ∏ |x_v|_v. But now
multiplication by a ∈ K takes K^+ ⊂ A_K^+ into K^+, so gives a well-defined bijection
of A_K^+/K^+ onto A_K^+/K^+ which magnifies the measure by the factor ∏ |a|_v. Hence
∏ |a|_v = 1 by Corollary 18.3.7. (The point is that if µ is the measure of A_K^+/K^+, then
µ = ∏ |a|_v · µ, so because µ is finite we must have ∏ |a|_v = 1.)
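For K = Q the product formula can be checked by direct computation. This is a minimal sketch, assuming a nonzero rational input; the function names are illustrative, and the primes are found by naive trial division, which is adequate only for small examples.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def padic_abs(x, p):
    """p-adic absolute value of a nonzero rational x, with |p|_p = 1/p."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v

def product_of_valuations(a):
    """Product of |a|_v over the real place and all primes dividing a."""
    a = Fraction(a)
    total = abs(a)                       # the archimedean valuation
    for p in prime_factors(abs(a.numerator)) | prime_factors(a.denominator):
        total *= padic_abs(a, p)         # all other primes contribute 1
    return total

print(product_of_valuations(Fraction(-12, 35)))
```

Only the primes dividing the numerator or denominator matter, in line with the earlier observation that |a|_v = 1 for almost all v.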
determine µ and we have µ(S 1 ) = 1 since X = [0, 1) is a measurable set that maps
bijectively onto S 1 and has measure 1. The situation for the map AK → AK /K +
is pretty much the same.
Lemma 18.4.1. There is a constant C > 0 that depends only on the global field K
with the following property:
Whenever x = {xv }v ∈ AK is such that
\[
\prod_v |x_v|_v > C, \tag{18.4.1}
\]
|y_v|_v ≤ 1/2 if v is real archimedean,
|y_v|_v ≤ 1/2 if v is complex archimedean,
|y_v|_v ≤ 1 if v is non-archimedean.
(As we will see, any positive real number ≤ 1/2 would suffice in the definition of
c1 above. For example, in Cassels’s article he uses the mysterious 1/10. He also
doesn’t discuss the subtleties of the complex archimedean case separately.)
Then 0 < c_0 < ∞ since A_K^+/K^+ is compact, and 0 < c_1 < ∞ because the
number of archimedean valuations v is finite. We show that
C = c_0 / c_1
will do. Thus suppose x is as in (18.4.1).
The set T of t = {t_v}_v ∈ A_K^+ such that
|t_v|_v ≤ (1/2) |x_v|_v if v is real archimedean,
|t_v|_v ≤ (1/2) √(|x_v|_v) if v is complex archimedean,
|t_v|_v ≤ |x_v|_v if v is non-archimedean
has measure
\[
c_1 \cdot \prod_v |x_v|_v > c_1 \cdot C = c_0. \tag{18.4.2}
\]
(Note: If there are complex valuations, then some of the |x_v|_v’s in the product
must be squared.)
Because of (18.4.2), in the quotient map A_K^+ → A_K^+/K^+ there must be a pair of
distinct points of T that have the same image in A_K^+/K^+, say
and
a = t′ − t″ ∈ K^+
is nonzero. Then
\[
|a|_v = |t'_v - t''_v|_v \le
\begin{cases}
|t'_v| + |t''_v| \le 2 \cdot \tfrac{1}{2} |x_v|_v \le |x_v|_v & \text{if } v \text{ is real archimedean, or} \\
\max(|t'_v|, |t''_v|) \le |x_v|_v & \text{if } v \text{ is non-archimedean,}
\end{cases}
\]
for all v. In the case of complex archimedean v, we must be careful because the
normalized valuation | · |_v is the square of the usual archimedean complex valuation
| · |_∞ on C, so e.g., it does not satisfy the triangle inequality. In particular, the
quantity |t′_v − t″_v|_v is at most the square of the maximum distance between two
points in the disc in C of radius (1/2) √(|x_v|_v), where by distance we mean the usual
distance. This maximum distance in such a disc is at most √(|x_v|_v), so |t′_v − t″_v|_v is
Corollary 18.4.2. Let v_0 be a normalized valuation and let δ_v > 0 be given for all
v ≠ v_0 with δ_v = 1 for almost all v. Then there is a nonzero a ∈ K with
|a|_v ≤ δ_v (all v ≠ v_0).
Remark 18.4.3. The character group of the locally compact group A_K^+ is isomorphic
to A_K^+ and K^+ plays a special role. See Chapter XV of [Cp86], Lang’s [Lan64],
Weil’s [Wei82], and Godement’s Bourbaki seminars 171 and 176. This duality lies
behind the functional equation of ζ and L-functions. Iwasawa has shown [Iwa53]
that the rings of adeles are characterized by certain general topologico-algebraic
properties.
We proved before that K is discrete in AK . If one valuation is removed, the
situation is much different.
18.4. STRONG APPROXIMATION 207
Proof. This proof was suggested by Prof. Kneser at the Cassels-Fröhlich conference.
Recall that if x = {x_v}_v ∈ A_{K,v_0} then a basis of open sets about x is the
collection of products
\[
\prod_{v \in S} B(x_v, \varepsilon_v) \times \prod_{v \notin S,\ v \ne v_0} \mathcal{O}_v,
\]
where B(x_v, ε_v) is an open ball in K_v about x_v, and S runs through finite sets of
normalized valuations (not including v_0). Thus denseness of K in A_{K,v_0} is equivalent
to the following statement about elements. Suppose we are given (i) a finite set S
of valuations v ≠ v_0, (ii) elements x_v ∈ K_v for all v ∈ S, and (iii) an ε > 0. Then
there is an element b ∈ K such that |b − x_v|_v < ε for all v ∈ S and |b|_v ≤ 1 for all
v ∉ S with v ≠ v_0.
By the corollary to our proof that A_K^+/K^+ is compact (Corollary 18.3.6), there
is a W ⊂ A_K that is defined by inequalities of the form |y_v|_v ≤ δ_v (where δ_v = 1 for
almost all v) such that every z ∈ A_K is of the form
z = y + c, y ∈ W, c ∈ K. (18.4.3)
x = w + b, w ∈ a · W, b ∈ K,
where a · W is the set of ay for y ∈ W . If now we let x have components the given
xv at v ∈ S, and (say) 0 elsewhere, then b = x − w has the properties required.
Remark 18.4.5. The proof gives a quantitative form of the theorem (i.e., with a
bound for |b|v0 ). For an alternative approach, see [Mah64].
In the next chapter we’ll introduce the ideles A∗K . Finally, we’ll relate ideles to
ideals, and use everything so far to give a new interpretation of class groups and
their finiteness.
Chapter 19
In this chapter, we introduce the ideles IK , and relate ideles to ideals, and use what
we’ve done so far to give an alternative interpretation of class groups and their
finiteness, thus linking the adelic point of view with the classical point of view of
the first part of this course.
Definition 19.1.2 (Idele Group). The idele group IK of K is the group A∗K of
invertible elements of the adele ring AK .
210 CHAPTER 19. IDELES AND IDEALS
Example 19.1.3. For a rational prime p, let x_p ∈ A_Q be the adele whose pth component
is p and whose vth component, for v ≠ p, is 1. Then x_p → 1 as p → ∞ in A_Q,
for the following reason. We must show that if U is a basic open set that contains
the adele 1 = {1}_v, the x_p for all sufficiently large p are contained in U. Since U
contains 1 and is a basic open set, it is of the form
\[
\prod_{v \in S} U_v \times \prod_{v \notin S} \mathbf{Z}_v,
\]
where S is a finite set, and the U_v, for v ∈ S, are arbitrary open subsets of Q_v that
contain 1. If q is a prime larger than any prime in S, then x_p for p ≥ q, is in U. This
proves convergence. If the inverse map were continuous on I_K, then the sequence
of x_p^{-1} would converge to 1^{-1} = 1. However, if U is an open set as above about 1,
Remark 19.1.9. Note also that the $I_K$-topology is that appropriate to a group of
operators on $A_K^+$: a basis of open sets is the $S(C, U)$, where $C, U \subset A_K^+$ are,
respectively, $A_K$-compact and $A_K$-open, and $S$ consists of the $x \in I_K$ such that
$(1 - x)C \subset U$ and $(1 - x^{-1})C \subset U$.
Lemma 19.1.11. The subset $I_K^1$ of $A_K$ is closed as a subset, and the $A_K$-subset
topology on $I_K^1$ coincides with the $I_K$-subset topology on $I_K^1$.
This works because if $w \in W$, then either $|w_v|_v = 1$ for all $v \notin S$, in which case
$1 < c(w) < 2C$, so $w \notin I_K^1$, or $|w_{v_0}|_{v_0} < 1$ for some $v_0 \notin S$, in which case
$$c(w) = \Bigl(\prod_{v \in S} |w_v|_v\Bigr) \cdot |w_{v_0}|_{v_0} \cdots < 2C \cdot \frac{1}{2C} \cdots < 1,$$
so again $w \notin I_K^1$.
We next show that the $I_K$- and $A_K$-topologies on $I_K^1$ are the same. If $x \in I_K^1$,
we must show that every $A_K$-neighborhood of $x$ contains an $I_K$-neighborhood and
vice-versa.
Let W ⊂ I1K be an AK -neighborhood of x. Then it contains an AK -neighborhood
of the type
where the finite set $S$ contains at least all archimedean valuations $v$ and all valuations
$v$ with $|x_v|_v \ne 1$. Since $\prod_v |x_v|_v = 1$, we may also suppose that $\varepsilon$ is so small
that (19.1.4) implies
$$\prod_v |w_v|_v < 2.$$
Then the intersection of (19.1.4) with $I_K^1$ is the same as that of (19.1.2) with $I_K^1$,
i.e., (19.1.4) defines an $A_K$-neighborhood.
By the product formula we have that K ∗ ⊂ I1K . The following result is of vital
importance in class field theory.
Theorem 19.1.12. The quotient I1K /K ∗ with the quotient topology is compact.
$$|w_v|_v \le |x_v|_v,$$
where $x = \{x_v\}_v$ is any idele of content greater than the $C$ of Lemma 18.4.1.
Let $y = \{y_v\}_v \in I_K^1$. Then the content of $x/y$ equals the content of $x$, so by
Lemma 18.4.1 there is an $a \in K^*$ such that
$$|a|_v \le \left|\frac{x_v}{y_v}\right|_v \quad \text{for all } v.$$
Then $ay \in W$, as required.
Remark 19.1.13. The quotient $I_K^1/K^*$ is totally disconnected in the function field
case. For the structure of its connected component in the number field case, see
papers of Artin and Weil in the “Proceedings of the Tokyo Symposium on Algebraic
Number Theory, 1955” (Science Council of Japan) or [AT90]. The determination
of the character group of $I_K/K^*$ is global class field theory.
19.2 Ideals and Divisors
Lemma 19.2.1. There is a natural bijection between FK and the group of nonzero
fractional ideals of OK . The correspondence is induced by
Endow $F_K$ with the discrete topology. Then there is a natural continuous map
$\pi : I_K \to F_K$ given by
$$x = \{x_v\}_v \mapsto \sum_v \operatorname{ord}_v(x_v) \cdot v.$$
This map is continuous since the inverse image of a valuation $v$ (a point) is the
product
$$\pi^{-1}(v) = \pi O_v^* \times \prod_{w \text{ archimedean}} K_w^* \times \prod_{w \ne v \text{ non-arch.}} O_w^*,$$
which is an open set in the restricted product topology on $I_K$. Moreover, the image
of $K^*$ in $F_K$ is the group of nonzero principal fractional ideals.
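To make the map from ideles to divisors concrete, here is a minimal sketch for the special case K = Q, written in plain Python rather than Sage so that it runs anywhere. It assumes an idele of Q is represented only by its finitely many non-unit finite components, given as a dictionary from primes to nonzero rationals; the names ord_p, divisor_of_idele, and ideal_generator are illustrative and do not come from the text.

```python
from fractions import Fraction

def ord_p(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is infinite")
    n, d, k = x.numerator, x.denominator, 0
    while n % p == 0:      # count factors of p in the numerator
        n //= p
        k += 1
    while d % p == 0:      # factors of p in the denominator count negatively
        d //= p
        k -= 1
    return k

def divisor_of_idele(finite_components):
    """Image of an idele of Q under x -> sum_v ord_v(x_v) * v.

    finite_components maps a prime p to the component x_p; every prime
    not listed is assumed to have a unit component (ord 0).
    Returns {p: ord_p(x_p)} with zero coefficients dropped.
    """
    divisor = {}
    for p, x in finite_components.items():
        e = ord_p(x, p)
        if e != 0:
            divisor[p] = e
    return divisor

def ideal_generator(divisor):
    """Positive rational generating the fractional ideal prod p^e of Z."""
    g = Fraction(1)
    for p, e in divisor.items():
        g *= Fraction(p) ** e
    return g

# The principal idele attached to a = -12/5 has every component equal to a.
a = Fraction(-12, 5)
d = divisor_of_idele({p: a for p in (2, 3, 5, 7)})
print(d, ideal_generator(d))
```

In this example the divisor of the principal idele of a is 2·(2) + 1·(3) − 1·(5), and the corresponding fractional ideal is generated by |a| = 12/5, which illustrates the statement that K* maps onto the principal fractional ideals.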
Recall that the class group CK of the number field K is by definition the quotient
of FK by the image of K ∗ .
Proof. We first prove that the map $I_K^1 \to F_K$ is surjective. Let $\infty$ be an archimedean
valuation on $K$. If $v$ is a non-archimedean valuation, let $x \in I_K^1$ be a 1-idele such that
$x_w = 1$ at every valuation $w$ except $v$ and $\infty$. At $v$, choose $x_v = \pi$ to be a generator
for the maximal ideal of $O_v$, and choose $x_\infty$ to be such that $|x_\infty|_\infty = 1/|x_v|_v$. Then
$x \in I_K$ and $\prod_w |x_w|_w = 1$, so $x \in I_K^1$. Also $x$ maps to $v \in F_K$.
Thus the group of ideal classes is the continuous image of the compact group
$I_K^1/K^*$ (see Theorem 19.1.12), hence compact. But a compact discrete group is
finite.
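The surjectivity step can be checked by hand when K = Q. The following sketch, in plain Python with hypothetical helper names abs_p and one_idele_for_prime, builds the 1-idele described in the proof for a prime p: the component at p is the uniformizer p, the component at the real place is chosen so that the product of all normalized absolute values is 1, and all other components are 1.

```python
from fractions import Fraction

def abs_p(x, p):
    """Normalized p-adic absolute value |x|_p = p^(-ord_p(x)) of a rational x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    n, d, k = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        k += 1
    while d % p == 0:
        d //= p
        k -= 1
    return Fraction(1, p) ** k

def one_idele_for_prime(p):
    """Return (x_p, x_inf) for the 1-idele of Q that maps to the divisor p.

    All components other than those at p and at infinity are taken to be 1.
    """
    x_p = Fraction(p)               # generator of the maximal ideal of Z_p
    x_inf = 1 / abs_p(x_p, p)       # forces |x_inf|_inf = 1 / |x_p|_p
    return x_p, x_inf

def content(x_p, x_inf, p):
    """Product of the absolute values of the two nontrivial components."""
    return abs_p(x_p, p) * abs(x_inf)

for p in (2, 3, 5, 7):
    x_p, x_inf = one_idele_for_prime(p)
    print(p, x_p, x_inf, content(x_p, x_inf, p))
```

For each prime the content comes out to 1, so the idele lies in the group of 1-ideles, and its only nonzero valuation is at p, so it maps to the divisor p.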
where n is the greatest common divisor of the degrees of elements of Pic(X), which
is 1 when X has a rational point.
Chapter 20
Exercises
IJ = {ab : a ∈ I, b ∈ J}
Job 1 Starting with an annual salary of $1000, and a $200 increase every year.
Job 2 Starting with a semiannual salary of $500, and an increase of $50 every
6 months.
1 From The Education of T.C. MITS (1942).
In all other respects, the two jobs are exactly alike. Which is the better offer
(after the first year)? Write a Sage program that creates a table showing how
much money you will receive at the end of each year for each job. (Of course
you could easily do this by hand – the point is to get familiar with Sage.)
6. Let OK be the ring of integers of a number field. Let FK denote the abelian
group of fractional ideals of OK .
7. In this problem, you will give an example to illustrate the failure of unique
factorization in the ring $O_K$ of integers of $\mathbf{Q}(\sqrt{-6})$.
(a) Give an element $\alpha \in O_K$ that factors in two distinct ways into irreducible
elements.
(b) Observe explicitly that the ideal $(\alpha)$ factors uniquely, i.e., the two distinct
factorizations in the previous part of this problem do not lead to two
distinct factorizations of the ideal $(\alpha)$ into prime ideals.
8. Factor the ideal (10) as a product of primes in the ring of integers of $\mathbf{Q}(\sqrt{11})$.
You’re allowed to use a computer, as long as you show the commands you use.
12. Let OK be the ring of integers of a number field. The Zariski topology on the
set X = Spec(OK ) of all prime ideals of OK has closed sets the sets of the
form
V (I) = {p ∈ X : p | I},
where I varies through all ideals of OK , and p | I means that I ⊂ p.
(a) Prove that the collection of closed sets of the form V (I) is a topology on
X.
(b) Let Y be the subset of nonzero prime ideals of OK , with the induced
topology. Use unique factorization of ideals to prove that the closed
subsets of Y are exactly the finite subsets of Y along with the set Y .
(c) Prove that the conclusion of (a) is still true if OK is replaced by an order
in OK , i.e., a subring that has finite index in OK as a Z-module.
14. Let $K = \mathbf{Q}(\zeta_{13})$, where $\zeta_{13}$ is a primitive 13th root of unity. Note that $K$ has
ring of integers $O_K = \mathbf{Z}[\zeta_{13}]$.
(a) Factor 2, 3, 5, 7, 11, and 13 in the ring of integers OK . You may use a
computer.
(b) For $p \ne 13$, find a conjectural relationship between the number of prime
ideal factors of $pO_K$ and the order of the reduction of $p$ in $(\mathbf{Z}/13\mathbf{Z})^*$.
(c) Compute the minimal polynomial f (x) ∈ Z[x] of ζ13 . Reinterpret your
conjecture as a conjecture that relates the degrees of the irreducible fac-
tors of f (x) (mod p) to the order of p modulo 13. Does your conjecture
remind you of quadratic reciprocity?
15. (a) Find by hand and with proof the ring of integers of each of the following
two fields: $\mathbf{Q}(\sqrt{5})$, $\mathbf{Q}(i)$.
(b) Find the ring of integers of $\mathbf{Q}(a)$, where $a^5 + 7a + 1 = 0$, using a computer.
16. Let $p$ be a prime. Let $O_K$ be the ring of integers of a number field $K$, and
suppose $a \in O_K$ is such that $[O_K : \mathbf{Z}[a]]$ is finite and coprime to $p$. Let $f(x)$
be the minimal polynomial of $a$. We proved in class that if the reduction
$\bar{f} \in \mathbf{F}_p[x]$ of $f$ factors as
$$\bar{f} = \prod_i g_i^{e_i},$$
where the $g_i$ are distinct irreducible polynomials in $\mathbf{F}_p[x]$, then the primes
appearing in the factorization of $pO_K$ are the ideals $(p, g_i(a))$. In class, we
did not prove that the exponents of these primes in the factorization of $pO_K$
are the $e_i$. Prove this.
(a) Prove that the ideals I1 = (a1 ), I2 = (a2 ), and I3 = (a3 ) are coprime in
pairs.
(b) Compute #Z[i]/(I1 I2 I3 ).
(c) Find a single element in Z[i] that is congruent to n modulo In , for each
n ≤ 3.
18. Find an example of a field K of degree at least 4 such that the ring OK of
integers of K is not of the form Z[a] for any a ∈ OK .
20. (*) Give an example of an order O in the ring of integers of a number field
and an ideal I such that I cannot be generated by 2 elements as an ideal.
Does the Chinese Remainder Theorem hold in O? [The (*) means that this
problem is more difficult than usual.]
21. For each of the following three fields, determine whether there is an order of
discriminant 20 contained in its ring of integers:
$$K = \mathbf{Q}(\sqrt{5}), \quad K = \mathbf{Q}(\sqrt[3]{2}), \quad \text{and} \ \ldots$$
$K$ any extension of $\mathbf{Q}$ of degree 2005. [Hint: for the last one, apply the exact
form of our theorem about finiteness of class groups to the unit ideal to show
that the discriminant of a degree 2005 field must be large.]
22. Prove that the quantity $C_{r,s}$ in our theorem about finiteness of the class group
can be taken to be $\left(\frac{4}{\pi}\right)^s \frac{n!}{n^n}$, as follows (adapted from [SD01, pg. 19]): Let $S$
be the set of elements $(x_1, \ldots, x_n) \in \mathbf{R}^n$ such that
$$|x_1| + \cdots + |x_r| + 2 \sum_{v=r+1}^{r+s} \sqrt{x_v^2 + x_{v+s}^2} \le 1.$$
[Hint: For convexity, use the triangle inequality and that for $0 \le \lambda \le 1$,
we have
$$\lambda \sqrt{x_1^2 + y_1^2} + (1 - \lambda) \sqrt{x_2^2 + y_2^2} \ge \sqrt{(\lambda x_1 + (1 - \lambda) x_2)^2 + (\lambda y_1 + (1 - \lambda) y_2)^2}$$
which is trivial. That $M \le n^{-n}$ follows from the inequality between the
arithmetic and geometric means.]
(b) Transforming pairs $x_v, x_{v+s}$ from Cartesian to polar coordinates, show
also that $v = 2^r (2\pi)^s D_{r,s}(1)$, where
$$D_{\ell,m}(t) = \int \cdots \int_{R_{\ell,m}(t)} y_1 \cdots y_m \, dx_1 \cdots dx_\ell \, dy_1 \cdots dy_m$$
$$x_1 + \cdots + x_\ell + 2(y_1 + \cdots + y_m) \le t.$$
$$D_{\ell,m}(t) = \frac{4^{-m} t^{\ell + 2m}}{(\ell + 2m)!}$$
23. Let K vary through all number fields. What torsion subgroups (UK )tor actu-
ally occur?
24. If UK ≈ Zn × (UK )tor , we say that UK has rank n. Let K vary through all
number fields. What ranks actually occur?
25. Let K vary through all number fields such that the group UK of units of K is
a finite group. What finite groups UK actually occur?
29. Let $S_3$ be the symmetric group on three symbols, which has order 6.
$$\rho : \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Gal}(K/\mathbf{Q}) \cong S_3 \subset \mathrm{GL}_2(\mathbf{C}).$$
is an isomorphism.
31. Suppose G is a finite group and A is a finite G-module. Prove that for any q,
the group Hq (G, A) is a torsion abelian group of exponent dividing the order
#A of A.
32. Let $K = \mathbf{Q}(\sqrt{5})$ and let $A = U_K$ be the group of units of $K$, which is a module
over the group G = Gal(K/Q). Compute the cohomology groups H0 (G, A)
and H1 (G, A). (You shouldn’t use a computer, except maybe to determine
UK .)
33. Let $K = \mathbf{Q}(\sqrt{-23})$ and let $C$ be the class group of $\mathbf{Q}(\sqrt{-23})$, which is a module
over the Galois group G = Gal(K/Q). Determine H0 (G, C) and H1 (G, C).
34. Let $E$ be the elliptic curve $y^2 = x^3 + x + 1$. Let $E[2]$ be the group of points
of order dividing 2 on E. Let
1. Let k be any field. Prove that the only nontrivial valuations on k(t) which are
trivial on k are equivalent to the valuation (13.3.3) or (13.3.4) of page 157.
2. A field with the topology induced by a valuation is a topological field, i.e., the
operations sum, product, and reciprocal are continuous.
5. Prove that the polynomial $f(x) = x^3 - 3x^2 + 2x + 5$ has all its roots in $\mathbf{Q}_5$,
and find the 5-adic valuations of each of these roots. (You might need to use
Hensel’s lemma, which we don’t discuss in detail in this book. See [Cas67,
App. C].)
7. Prove that $-9$ has a cube root in $\mathbf{Q}_{10}$ using the following strategy (this is a
special case of Hensel’s Lemma, which you can read about in an appendix to
Cassels’s article).
8. Compute the first 5 digits of the 10-adic expansions of the following rational
numbers:
$$\frac{13}{2}, \quad \frac{1}{389}, \quad \frac{17}{19}, \quad \text{the 4 square roots of 41.}$$
9. Let $N > 1$ be an integer. Prove that the series
$$\sum_{n=1}^{\infty} (-1)^{n+1} n! = 1! - 2! + 3! - 4! + 5! - 6! + \cdots$$
converges in $\mathbf{Q}_N$.
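A numerical experiment can suggest why the series converges, though it is not a proof. The sketch below, in plain Python with the hypothetical helper names partial_sums_mod and stabilization_index, reduces the partial sums modulo N^k and finds the first m for which N^k divides m!. Since m! divides every later factorial, all terms from the m-th onward vanish modulo N^k, so the partial sums are constant modulo N^k from that point on.

```python
from math import factorial

def partial_sums_mod(N, k, terms):
    """Partial sums of sum_{n>=1} (-1)^(n+1) n!, reduced modulo N^k."""
    modulus = N ** k
    total, sums = 0, []
    for n in range(1, terms + 1):
        total = (total + (-1) ** (n + 1) * factorial(n)) % modulus
        sums.append(total)
    return sums

def stabilization_index(N, k):
    """Smallest m such that N^k divides m! (and hence n! for every n >= m)."""
    modulus = N ** k
    m = 1
    while factorial(m) % modulus != 0:
        m += 1
    return m

N, k = 10, 3
m = stabilization_index(N, k)
sums = partial_sums_mod(N, k, 30)
# Entries sums[m-2], sums[m-1], ... are the partial sums S_{m-1}, S_m, ...
print(m, sums[m - 2:])
```

Increasing k pushes the stabilization point further out but never prevents it, which is the pattern the exercise asks you to prove.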
10. Prove that −9 has a cube root in Q10 using the following strategy (this is a
special case of “Hensel’s Lemma”).
14. Find the 3-adic expansion to precision 4 of each root of the following polynomial
over $\mathbf{Q}_3$:
$$f = x^3 - 3x^2 + 2x + 3 \in \mathbf{Q}_3[x].$$
Your solution should conclude with three expressions of the form
$$a_0 + a_1 \cdot 3 + a_2 \cdot 3^2 + a_3 \cdot 3^3 + O(3^4).$$
15. (a) Find the normalized Haar measure of the following subset of $\mathbf{Q}_7^+$:
$$U = B\left(28, \frac{1}{50}\right) = \left\{ x \in \mathbf{Q}_7 : |x - 28| < \frac{1}{50} \right\}.$$
(b) Find the normalized Haar measure of the subset $\mathbf{Z}_7^*$ of $\mathbf{Q}_7^*$.
17. Prove that the ring C defined in Section 9 really is the tensor product of A
and B, i.e., that it satisfies the defining universal mapping property for tensor
products. Part of this problem is for you to look up a functorial definition of
tensor product.
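18. Find a zero divisor pair in $\mathbf{Q}(\sqrt{5}) \otimes_{\mathbf{Q}} \mathbf{Q}(\sqrt{5})$.
19. (a) Is $\mathbf{Q}(\sqrt{5}) \otimes_{\mathbf{Q}} \mathbf{Q}(\sqrt{-5})$ a field?
(b) Is $\mathbf{Q}(\sqrt[4]{5}) \otimes_{\mathbf{Q}} \mathbf{Q}(\sqrt[4]{-5}) \otimes_{\mathbf{Q}} \mathbf{Q}(\sqrt{-1})$ a field?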
20. Suppose ζ5 denotes a primitive 5th root of unity. For any prime p, consider
the tensor product Qp ⊗Q Q(ζ5 ) = K1 ⊕ · · · ⊕ Kn(p) . Find a simple formula
for the number n(p) of fields appearing in the decomposition of the tensor
product Qp ⊗Q Q(ζ5 ). To get full credit on this problem your formula must
be correct, but you do not have to prove that it is correct.
22. Suppose K and L are number fields (i.e., finite extensions of Q). Is it possible
for the tensor product $K \otimes_{\mathbf{Q}} L$ to contain a nilpotent element? (A nonzero
element $a$ in a ring $R$ is nilpotent if there exists $n > 1$ such that $a^n = 0$.)
23. Let $K$ be the number field $\mathbf{Q}(\sqrt[5]{2})$.
(a) In how many ways does the 2-adic valuation | · |2 on Q extend to a valu-
ation on K?
(b) Let v = | · | be a valuation on K that extends | · |2 . Let Kv be the
completion of K with respect to v. What is the residue class field F of
Kv ?
24. Prove that the product formula holds for F(t) similar to the proof we gave
in class using Ostrowski’s theorem for Q. You may use the analogue of Os-
trowski’s theorem for F(t), which you had on a previous homework assignment.
(Don’t give a measure-theoretic proof.)
25. Prove Theorem 18.3.5, that “The global field $K$ is discrete in $A_K$ and the
quotient $A_K^+/K^+$ of additive groups is compact in the quotient topology.” in
the case when $K$ is a finite extension of $\mathbf{F}(t)$, where $\mathbf{F}$ is a finite field.
Bibliography
[Art23] E. Artin, Über eine neue Art von L-reihen, Abh. Math. Sem. Univ.
Hamburg 3 (1923), 89–108.
[Art91] M. Artin, Algebra, Prentice Hall Inc., Englewood Cliffs, NJ, 1991. MR
92g:00001
[AT90] E. Artin and J. Tate, Class field theory, second ed., Advanced Book
Classics, Addison-Wesley Publishing Company Advanced Book Pro-
gram, Redwood City, CA, 1990. MR 91b:11129
[Len02] H. W. Lenstra, Jr., Solving the Pell equation, Notices Amer. Math. Soc.
49 (2002), no. 2, 182–192. MR 2002i:11028
[S+11] W. A. Stein et al., Sage Mathematics Software (Version 4.6.2), The
Sage Development Team, 2011, http://www.sagemath.org.
[ST68] J-P. Serre and J. T. Tate, Good reduction of abelian varieties, Ann.
of Math. (2) 88 (1968), 492–517, http://wstein.org/papers/bib/
Serre-Tate-Good_Reduction_of_Abelian_Varieties.pdf.
[Wei82] A. Weil, Adeles and algebraic groups, Progress in Mathematics, vol. 23,
Birkhäuser Boston, Mass., 1982, With appendices by M. Demazure and
Takashi Ono. MR 83m:10032