
Galois

a galois theory note

Uploaded by

Wangyh0630

MA3D5 Galois Theory

This is a module on Galois Theory and the Theory of Fields and their appli-
cation to various famous historical problems. The aim is to gain a much more
precise understanding of what it means to solve a polynomial equation in one
variable—a problem you might have hoped you’d sewn up at school.
Why wasn’t it sewn up at school? The main reason is that we didn’t know
how to speak properly then: intrinsically the ideas are simple and natural, and
they explain everything we know already and much much more, but only once
we have the language, concepts and examples that you get from a maths degree.
There are 4 things you need to follow this:
(a) enthusiasm to fool around with polynomials, using long division, GCDs,
factorisations and roots of unity, and generally enjoying their company;
(b) basic linear algebra: linear independence, bases and dimension;
(c) basic group theory: subgroups and normal subgroups in permutation
groups, and the basic isomorphism theorems;
(d) basic ring theory: ideals I in a polynomial ring such as R = ℝ[x] and their quotient rings R/I, which
really just means more fun getting your hands on polynomials.
We will review them in the notes, and you could also do that in advance now. At
the pointy end, we will use a little more algebra (such as knowing the subgroups
of a quotient group G/N ), but that’s a long way off when things will be clearer.

It’s called Galois Theory in honour of Évariste Galois, who famously died
following a duel aged 20 in May 1832. He was born near Paris in Bourg-La-
Reine, which was twinned with Kenilworth on the 150th anniversary of his
death—you can see the sign on the Kenilworth Road. He lives forever in the
maths pantheon, though he’d barely started—it will be centuries before we can
even imagine what he may have taught us.
His miracle was to see the conceptual symmetry behind an array of polyno-
mial root calculations, even without the language we use—that was developed
over the 19th century and completely rewritten at least twice in the 20th. Many
basic calculations had been known for centuries, since shortly before the sym-
metric gardens were built in Kenilworth Castle for the visit of Queen Elizabeth I,
when symmetry was in fashion in Warwickshire. Galois spent the night before
his dawn duel writing down these ideas so we could have them.
It’s best not to leave revision to the night before the exam.

Contents
1 Introduction
1.1 Factorisation of polynomials and splitting field
1.2 Building up a factorisation
1.3 Subfields of the splitting field and the roots of f
1.4 Permuting the roots: the Galois Correspondence
1.5 One punchline: insoluble quintics
1.6 What’s where when?

2 Essays on what you already know
2.1 Working in the complex numbers
2.2 Subfields of the complex numbers
2.3 The complex numbers viewed from below

3 Solving cubic polynomials
3.1 The quadratic formula
3.2 The cubic formula
3.3 The cubic formula for real cubics

4 Reminders
4.1 A reminder about fields
4.2 Ring homomorphisms, first isomorphism theorem
4.3 Characteristic of a field
4.4 Automorphism group of a field: the main invariant
4.5 Polynomials in one variable, rings, ideals and all that

5 Factorisation in polynomial rings
5.1 Roots of polynomials
5.2 Factorisation in C[x] and Q[x]
5.3 Factorisation in Z[x]
5.3.1 A test for rational roots
5.3.2 Eisenstein’s criterion for irreducibility
5.3.3 Factorisation modulo a prime

6 Field extensions I
6.1 Extensions as vector spaces
6.2 Adjoining a square root to a subfield of C
6.3 Simple extensions
6.3.1 Standard models for simple extensions
6.3.2 A basis of a simple algebraic extension K(α)/K
6.4 Adjoining a root of a polynomial
6.5 Adjoining multiple algebraic elements
6.6 Algebraic extensions and finite extensions
6.7 Maps between fields I

7 Applications: antiquities
7.1 Euclidean geometry – nonexaminable
7.2 Translate into Field Theory
7.3 Impossible constructions

8 The automorphism group of a field
8.1 The automorphism group
8.2 The fixed field of a group of automorphisms
8.3 How does this all work for f = x³ − 2?

9 Field extensions II
9.1 Splitting fields
9.2 When is an extension normal?
9.3 Maps between fields II
9.3.1 Extending maps from subfields of normal extensions
9.3.2 Restricting maps from fields containing normal extensions
9.4 When is an extension separable?
9.4.1 Characterisation of separable polynomials

10 Galois Theory
10.1 Galois extensions
10.2 Fixed fields and stabilisers
10.2.1 The subfield lattice
10.2.2 The subgroup lattice
10.2.3 Order-reversing maps between the two lattices
10.3 The Galois Correspondence

11 Biquadratic extensions
11.1 [L : K] = 8 ⟹ Gal(L/K) ≅ D8
11.2 √(b(a² − b)) ∈ K ⟹ [L : K] = 4, Gal(L/K) ≅ C4
11.3 √(a² − b) ∈ K and [L : K] = 4 ⟹ Gal(L/K) ≅ C2 × C2
11.4 Otherwise [L : K] = 2, Gal(L/K) ≅ C2
11.5 Examples of each type

12 Finite fields
12.1 The existence of finite fields
12.2 The Frobenius map
12.3 The Galois group of finite fields

13 Radical solutions of polynomials
13.1 Radical and soluble extensions
13.2 Soluble groups and insoluble quintics
13.3 Abelian Galois groups of simple radical extensions
13.4 Soluble yoga and the proof

1 Introduction
We will work with fields. You should remind yourself of the formal definition of
field, but in short K is a field if it has the four operations +, −, ×, ÷. Familiar
examples include R and C. In fact R ⊂ C is a subfield: R is a subset of C that
contains 1 and is closed under the field operations of C. We say that R ⊂ C is
a field extension.
There are a few very powerful ideas we’ll use repeatedly. Here’s the first:
since R ⊂ C, we can regard C as a vector space over R. (Convince yourself of
that: we use +, − in C, and then can do ‘scalar multiplication’ from R just by
multiplying inside C. Treating C as an R-vector space temporarily forgets most
of the information about ×, ÷ in C.) In fact, we already have a favourite basis
for C as an R-vector space:

C = {a + ib | a, b ∈ R} so {1, i} is a basis of C over R.

Thus dim_R C = 2, the size of that R-basis (or any other R-basis), which is why
you surely picture the complex numbers as the Argand diagram, Figure 1, with
its “real” and “imaginary” axes, and the unit circle marked in to help work out
where the numbers are, and some points in polar and cartesian coordinates.

[Figure 1: Argand diagram showing the real axis R, the unit circle, the points 1 and i, the fifth roots of unity ω = exp(2πi/5), ω², ω³, ω⁴, and a general point z = r exp(iθ) = r(cos(θ) + i sin(θ)) at angle θ.]

Figure 1: The complex numbers on the Argand diagram. The sum of the vertices
of any regular polygon centred at 0 is zero: e.g. 1 + ω + ω² + ω³ + ω⁴ = 0. (That’s
just physics for you: where else could the centre of gravity possibly be?)

Here’s the second powerful idea. Rather than regarding C as the main object,
we could focus on R, and treat C as ‘R plus a bit more’: C is the smallest field
that contains both R and the symbol i (which satisfies i² = −1, as ever). We
say that we can extend R by i to get C, and we write C = R(i). This is
probably how you learned it in the first place, though without that notation.
OK. Bank all that for now. We will review those points formally later, and
in any case we’ll usually work with much smaller fields like Q.

1.1 Factorisation of polynomials and splitting field


Consider f = x³ − 2. We treat it first as an element of Q[x], the ring of all
polynomials in a variable x with coefficients in Q. It is irreducible in Q[x],
meaning that it does not factorise as f = gh with g, h ∈ Q[x] polynomials of degree
< 3; the point is that we insist here that g and h also have rational coefficients. (Later
we’ll pick up some easy technology to prove statements like that.)

Of course f has 3 roots in C: α, αω and αω², where α = ∛2 ∈ R and ω is a
choice of primitive cube root of unity. (Remind yourself of the roots of unity. In this
case, ω ≠ 1 is a complex root of x³ − 1, and you can choose ω = exp(2πi/3) or
ω = exp(4πi/3), after which ω² = ω̄, the complex conjugate of ω, is the other.) Correspondingly, f factorises
into linear factors in C[x]:

\[
\begin{aligned}
f &= (x - \alpha)(x - \alpha\omega)(x - \alpha\omega^2) \\
  &= x^3 - (\alpha + \alpha\omega + \alpha\omega^2)x^2 + (\alpha\cdot\alpha\omega + \alpha\cdot\alpha\omega^2 + \alpha\omega\cdot\alpha\omega^2)x - (\alpha\cdot\alpha\omega\cdot\alpha\omega^2) \\
  &= x^3 - \alpha(1 + \omega + \omega^2)x^2 + \alpha^2(1 + \omega + \omega^2)x - \alpha^3 = x^3 - 2,
\end{aligned}
\]

since x³ − 1 = (x − 1)(x² + x + 1), so ω² + ω + 1 = 0.


The factorisation of f did not involve many complex numbers at all: only α
and ω and products of them. To be less wasteful, therefore, we define L ⊂ C
to be the smallest field that contains α and ω; we call L the splitting field
of f , because that’s the smallest field over which f splits into linear factors.
The factorisation of f above makes sense in the polynomial ring L[x], and we
say f splits over L. Later we’ll see easily that

\[
L = \{a_1 + a_2\alpha + a_3\alpha^2 + a_4\omega + a_5\alpha\omega + a_6\alpha^2\omega \mid a_1, \dots, a_6 \in \mathbb{Q}\}, \tag{1}
\]

and that there are no Q-linear relations between {1, α, α², ω, αω, α²ω}, so they
form a Q-basis of L, and dim_Q L = 6. As a first sanity check we see that ω² ∈ L
can indeed be expressed in this basis: ω² = −1 − ω.
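
The claim that f splits over L can be checked by exact arithmetic in the basis of (1). The following is a minimal sketch, not the method of these notes: it assumes a hypothetical representation in which `x[i][j]` is the rational coefficient of α^i ω^j, and reduces products using only α³ = 2 and ω² = −1 − ω. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed representation: x[i][j] is the coefficient of alpha^i * omega^j,
# with i in 0..2 and j in 0..1, matching the six basis elements in (1).

def monomial(i, j, c=1):
    """Return c * alpha^i * omega^j as an element of L."""
    x = [[Fraction(0)] * 2 for _ in range(3)]
    x[i][j] = Fraction(c)
    return x

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(3)]

def neg(x):
    return [[-x[i][j] for j in range(2)] for i in range(3)]

def mul(x, y):
    """Multiply in L, reducing with omega^2 = -1 - omega and alpha^3 = 2."""
    raw = [[Fraction(0)] * 3 for _ in range(5)]
    for i, j, k, l in product(range(3), range(2), range(3), range(2)):
        raw[i + k][j + l] += x[i][j] * y[k][l]
    for row in raw:  # replace omega^2 by -1 - omega
        row[0] -= row[2]
        row[1] -= row[2]
    # replace alpha^3 by 2 and alpha^4 by 2 * alpha
    return [[raw[i][j] + (2 * raw[i + 3][j] if i < 2 else 0) for j in range(2)]
            for i in range(3)]

def poly_mul(p, q):
    """Multiply polynomials in L[x], given as coefficient lists, constant term first."""
    out = [monomial(0, 0, 0) for _ in range(len(p) + len(q) - 1)]
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] = add(out[a + b], mul(pa, qb))
    return out

ONE, ALPHA, OMEGA = monomial(0, 0), monomial(1, 0), monomial(0, 1)
roots = [ALPHA, mul(ALPHA, OMEGA), mul(ALPHA, mul(OMEGA, OMEGA))]

f = [ONE]
for r in roots:
    f = poly_mul(f, [neg(r), ONE])  # multiply by (x - r)

x_cubed_minus_2 = [monomial(0, 0, -2), monomial(0, 0, 0), monomial(0, 0, 0), ONE]
print(f == x_cubed_minus_2)
```

Nothing here uses decimal approximations: every coefficient is a `Fraction`, so the comparison at the end is an exact equality in L[x].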

1.2 Building up a factorisation


The basic question is: how did we find the roots of f , and can that be turned
into a systematic method of solving polynomials, or at least of understanding
what the problem is? Galois Theory addresses this kind of question head on.
Here’s how we might have proceeded. We might first spot that α = ∛2 is
a root of f . Since roots correspond to linear factors (over C, say), we can use
that to start to factorise:

\[
f = (x - \alpha)(x^2 + \alpha x + \alpha^2),
\]

where we figure out the quadratic factor by putting in unknown coefficients,
multiplying out and requiring the result to equal f . This factorisation does not take
place in Q[x]. The two factors do not have rational coefficients, but have coef-
ficients in the field K = Q(α), which by definition is the smallest subfield of C
that contains Q and α. It’s not hard to persuade yourself that

\[
K = \mathbb{Q}(\alpha) = \{a_0 + a_1\alpha + a_2\alpha^2 \mid a_0, a_1, a_2 \in \mathbb{Q}\}, \tag{2}
\]

and we’ll see a simple lemma later that explains this precisely. As a first sanity
check, 1 + α ∈ K, so what is its inverse? Well, since α³ = 2, we see that
(1 + α)(1 − α + α²) = 1 + α³ = 3, so (1 + α)⁻¹ = (1/3)(1 − α + α²).
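
The inverse computation can be replayed with exact rational arithmetic. This is a small sketch under one assumption: an element a0 + a1 α + a2 α² of K is stored as the triple (a0, a1, a2), and the helper name `mul_K` is made up for illustration.

```python
from fractions import Fraction

def mul_K(x, y):
    """Multiply two elements of K = Q(alpha), given as triples, using alpha^3 = 2."""
    raw = [Fraction(0)] * 5
    for i in range(3):
        for j in range(3):
            raw[i + j] += Fraction(x[i]) * Fraction(y[j])
    # alpha^3 = 2 and alpha^4 = 2 * alpha
    return (raw[0] + 2 * raw[3], raw[1] + 2 * raw[4], raw[2])

one_plus_alpha = (1, 1, 0)   # 1 + alpha
cofactor = (1, -1, 1)        # 1 - alpha + alpha^2

print(mul_K(one_plus_alpha, cofactor))   # the element 3 of K

inverse = tuple(Fraction(c, 3) for c in cofactor)
print(mul_K(one_plus_alpha, inverse))    # the element 1 of K
```

The same `mul_K` works for any two elements of K, which is one way to see that the set in (2) really is closed under multiplication.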
To complete the factorisation of f , we just need to split the remaining
quadratic factor. (It does not factorise over K, as we see in a second.) By
the quadratic formula, it has roots

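\[
x = \frac{-\alpha \pm \sqrt{\alpha^2 - 4\alpha^2}}{2}
  = \alpha\left(-\tfrac12 \pm \tfrac{\sqrt{-3}}{2}\right)
  = \alpha\left(-\tfrac12 \pm i\tfrac{\sqrt3}{2}\right)
  = \alpha\omega \text{ and } \alpha\omega^2.
\]
So there’s the factorisation (and, incidentally, the proof that the quadratic can’t
factorise over K, since K ⊂ R but αω, αω² ∉ R, because ω, ω² ∉ R).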
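
As a purely numerical cross-check (floating point, so only approximate), the quadratic formula applied to x² + αx + α² should return αω and αω². The variable names below are illustrative.

```python
import cmath
import math

alpha = 2 ** (1 / 3)                    # real cube root of 2 (approximate)
omega = cmath.exp(2j * math.pi / 3)     # primitive cube root of unity

# Quadratic formula for x^2 + alpha*x + alpha^2
disc = cmath.sqrt(alpha ** 2 - 4 * alpha ** 2)
quadratic_roots = [(-alpha + disc) / 2, (-alpha - disc) / 2]
expected = [alpha * omega, alpha * omega ** 2]

for got, want in zip(quadratic_roots, expected):
    print(cmath.isclose(got, want, abs_tol=1e-12), abs(got ** 3 - 2) < 1e-12)
```

Both roots have nonzero imaginary part, which is the numerical shadow of the statement that the quadratic does not factorise over K ⊂ R.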

1.3 Subfields of the splitting field and the roots of f


In factorising f we built a sequence of fields: Q ⊂ K ⊂ L. We built K by
extending Q by α, and saw in (2) that dim_Q K = 3. We then built L, the
splitting field, by extending K by ω: we didn’t mention it, but L = {b₀ + b₁ω |
b₀, b₁ ∈ K}; in fact, it’s visible already in (1). In particular, dim_K L = 2.
This splitting field L is a pretty small field: as a vector space, it has dimen-
sion 6 over Q by (1). You won’t have any intuition yet for what kind of things
might be subfields of L—although we already saw K—but out of the blue Fig-
ure 2 shows all the subfields of L: the lines indicate subfield inclusion going
up the page, and they are labelled by 2 or 3, the dimension of the bigger field
as a vector space over the smaller (and let’s very loosely interpret that to mean
something like adding in a square root or a cube root respectively).
Although it sounds crazy, if we’d known all that in advance it would have
given us a map to find the roots of f : we could try to follow any path from Q
to L, adding roots as indicated.

1.4 Permuting the roots: the Galois Correspondence


Here’s the third big idea. The permutation group

S3 = {id, (12), (13), (23), (123), (132)}

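acts on the three roots

\[
\lambda_1 = \alpha, \quad \lambda_2 = \alpha\omega, \quad \lambda_3 = \alpha\omega^2
\]

by permuting these three complex numbers: $\sigma \cdot \lambda_i = \lambda_{\sigma(i)}$. Consider, for example,
the permutation σ = (12) ∈ S3 . It flips λ1 ↔ λ2 and leaves λ3 alone. In
terms of α = λ1 and αω = λ2 , that gives (noting 1/ω = ω²):

\[
\begin{aligned}
\alpha &\mapsto \alpha\omega && \text{(i.e. } \lambda_1 \mapsto \lambda_2\text{)} \\
\alpha\omega &\mapsto \alpha && \text{(i.e. } \lambda_2 \mapsto \lambda_1\text{)} \\
\text{so}\quad \alpha\omega^2 = (\alpha\omega)^2/\alpha &\mapsto \alpha^2/(\alpha\omega) = \alpha/\omega = \alpha\omega^2 && \text{(i.e. } \lambda_3 \text{ is fixed)} \\
\text{and so e.g.}\quad \omega = \alpha\omega/\alpha &\mapsto \alpha/(\alpha\omega) = 1/\omega = \omega^2 = -1 - \omega.
\end{aligned}
\]

[Figure 2: lattice of subfields of L. Top: L = Q(α, ω). Middle: K = Q(α), Q(αω) and Q(αω²), each of dimension 3 over Q with L of dimension 2 over it; and Q(ω) = Q(√−3), of dimension 2 over Q with L of dimension 3 over it. Bottom: Q.]

Figure 2: Subfields of L, the splitting field of x³ − 2.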

You can check similarly how all the basis elements of L in the expression in (1)
are mapped, and this realises the permutation σ = (12) of the roots by a map
L −→ L of the splitting field to itself—an automorphism of L, as we say.
(We will need it to be a ring homomorphism, not just a bijection or a Q-linear
map, but let’s save that for later.) It moves the elements of L around. The two
subfields Q(α) and Q(αω) are swapped, while the elements of Q(αω²) are not
moved at all. An element a + bω ∈ Q(ω) is taken to a + b(−1 − ω) = (a − b) + (−b)ω,
so the subfield Q(ω) is fixed as a set, even though its elements are moved around
inside it.
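
The check suggested above can be mechanised. The sketch below is one possible way to do it, with assumptions stated loudly: elements of L are stored as coefficient tables `x[i][j]` for α^i ω^j, and the map `sigma` is defined on basis monomials by α ↦ αω, ω ↦ ω² and extended Q-linearly. The code then tests on the basis of (1) that this map respects multiplication, which is what being a ring homomorphism requires beyond linearity. Names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def monomial(i, j, c=1):
    """Return c * alpha^i * omega^j; x[i][j] is the coefficient of alpha^i * omega^j."""
    x = [[Fraction(0)] * 2 for _ in range(3)]
    x[i][j] = Fraction(c)
    return x

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(3)]

def scale(c, x):
    return [[c * x[i][j] for j in range(2)] for i in range(3)]

def mul(x, y):
    """Multiply in L using omega^2 = -1 - omega and alpha^3 = 2."""
    raw = [[Fraction(0)] * 3 for _ in range(5)]
    for i, j, k, l in product(range(3), range(2), range(3), range(2)):
        raw[i + k][j + l] += x[i][j] * y[k][l]
    for row in raw:
        row[0] -= row[2]
        row[1] -= row[2]
    return [[raw[i][j] + (2 * raw[i + 3][j] if i < 2 else 0) for j in range(2)]
            for i in range(3)]

def power(x, n):
    out = monomial(0, 0)
    for _ in range(n):
        out = mul(out, x)
    return out

ALPHA, OMEGA = monomial(1, 0), monomial(0, 1)
SIGMA_ALPHA = mul(ALPHA, OMEGA)   # sigma(alpha) = alpha * omega
SIGMA_OMEGA = mul(OMEGA, OMEGA)   # sigma(omega) = omega^2 = -1 - omega

def sigma(x):
    """Q-linear extension of alpha^i omega^j -> sigma(alpha)^i sigma(omega)^j."""
    out = monomial(0, 0, 0)
    for i, j in product(range(3), range(2)):
        image = mul(power(SIGMA_ALPHA, i), power(SIGMA_OMEGA, j))
        out = add(out, scale(x[i][j], image))
    return out

basis = [monomial(i, j) for i, j in product(range(3), range(2))]
root3 = mul(ALPHA, mul(OMEGA, OMEGA))  # alpha * omega^2

print(sigma(mul(ALPHA, OMEGA)) == ALPHA)   # lambda_2 is sent to lambda_1
print(sigma(root3) == root3)               # lambda_3 is fixed
print(all(sigma(mul(b, c)) == mul(sigma(b), sigma(c)) for b in basis for c in basis))
print(all(sigma(sigma(b)) == b for b in basis))   # sigma has order 2
```

Because both sides of the multiplicativity test are bilinear, checking it on all pairs of basis elements is enough to conclude it for all of L.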
Similarly, we could check whether each of the other permutations of S3 is
realised by a map of the splitting field L −→ L to itself. In this particular
example, in fact they all are.
It is easy to understand the set of all subgroups of S3 : Figure 3 lists them all,
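where the lines indicate subgroup inclusions going down the page. The integer
labels indicate the index of each subgroup: {id, (23)} ⊂ S3 is a subgroup of size
2 in a group of size 6, so has index 6/2 = 3.

[Figure 3: lattice of subgroups of S3. Top: {id}. Middle: {id, (23)}, {id, (13)} and {id, (12)}, each containing {id} with index 2 and of index 3 in S3; and {id, (123), (132)}, containing {id} with index 3 and of index 2 in S3. Bottom: S3.]

Figure 3: Subgroups of S3 = Aut(L).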

You will notice that Figures 2 and 3 are virtually identical, though they refer
to completely different mathematical objects: one describes subfields of L, the
other, subgroups of S3 , albeit with inclusions reversed. Amazing, but no fluke.
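
The subgroup side of the picture is small enough to enumerate by brute force. Here is a minimal sketch, assuming permutations of {0, 1, 2} stored as tuples (so the labels are shifted by one from the 1, 2, 3 used in the text); it lists every subset of S3 that contains the identity and is closed under composition, which for a finite set is the same as being a subgroup.

```python
from itertools import combinations, permutations

elements = list(permutations(range(3)))   # the 6 elements of S3
identity = (0, 1, 2)

def compose(p, q):
    """Return the permutation 'apply q, then p'."""
    return tuple(p[q[i]] for i in range(3))

def is_subgroup(subset):
    s = set(subset)
    return identity in s and all(compose(p, q) in s for p in s for q in s)

subgroups = [set(c)
             for size in range(1, 7)
             for c in combinations(elements, size)
             if is_subgroup(c)]

orders = sorted(len(h) for h in subgroups)
indices = [len(elements) // n for n in orders]
print(orders)    # sizes of the subgroups
print(indices)   # their indices in S3
```

The six subgroups found this way match the six fields in Figure 2, with each index on one side equal to the corresponding dimension over Q on the other.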

1.5 One punchline: insoluble quintics


We will prove that under some mild assumptions on a polynomial f , the lattice
of subfields of the splitting field L of f is the same as the lattice of subgroups of
the group of permutations of the roots of f that are realised by maps L −→ L
(with inclusions reversed as in the example). This is called the Galois Cor-
respondence. You could think of it as matching highly nontrivial information
about subfields (of a field that you may not even know!—since you probably
don’t know the roots of f ) with concrete information about a group that you
can figure out from far less detailed knowledge of f .
For example, the subfield lattice contains enough subtle information to
understand whether there is a formula in radicals (that is, involving √ and ∛
and so on) for the roots of a given f ∈ Q[x]. By the Galois Correspondence,
this information can also be detected by the nature of the subgroups appearing
in the parallel subgroup lattice. After all, from Figure 1 we might expect
that adjoining a radical (an nth root of something) promises some ‘cyclic’ maps
L → L, and if the subgroup lattice doesn’t have a path with ‘cyclic quotients’,
then there can’t be any radicals. (I’m oversimplifying, but not by much.)
If f = x⁵ − 10x + 5, we can deduce the subgroup lattice with nothing more
than some high-school calculus, a few permutation calculations and a bit of the
theory we will develop. To understand a group, we don’t need to understand
every element: some generators and relations will do. So we don’t need to
understand every automorphism L → L, nor every root of f . The absence of
some suitable subgroups in the subgroup lattice will mean at once that there
could not possibly be a formula in radicals for the roots of f .
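
The ‘high-school calculus’ for this quintic presumably amounts to counting real roots, and that part is easy to sketch. The code below is an illustration under that assumption, not the argument of the notes: it counts sign changes of f at a few integer points, uses the two real zeros of f′ = 5x⁴ − 10 to see there can be no further real roots, and also checks the hypotheses of Eisenstein’s criterion (§5.3.2 in the contents) at the prime 5.

```python
def f(x):
    return x ** 5 - 10 * x + 5

# Sign changes between consecutive sample points give real roots
# (intermediate value theorem).
samples = [-2, -1, 0, 1, 2]
values = [f(x) for x in samples]
sign_changes = sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

# f'(x) = 5x^4 - 10 vanishes only at x = +-c with c = 2^(1/4), so f has at
# most three real roots; a local max above 0 and a local min below 0 confirm three.
c = 2 ** 0.25
three_real_roots = sign_changes == 3 and f(-c) > 0 > f(c)

# Eisenstein at p = 5: p divides every non-leading coefficient, p does not
# divide the leading coefficient, and p^2 does not divide the constant term.
coeffs = [5, -10, 0, 0, 0, 1]   # constant term first
p = 5
eisenstein = (all(a % p == 0 for a in coeffs[:-1])
              and coeffs[-1] % p != 0
              and coeffs[0] % (p * p) != 0)

print(values, sign_changes, three_real_roots, eisenstein)
```

So f has exactly three real roots and hence one pair of complex conjugate roots; how that feeds into the subgroup lattice is what the later theory supplies.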

1.6 What’s where when?


First Quarter Before any formal theory, §3 derives the radical formula for
the solution of cubic equations that you’ve probably been waiting 5 years to see.
Then §4 covers the prerequisite material that you may already know. It states
things briefly in the context they are used later, and adds a couple of elementary
bonuses that could easily have been covered in previous algebra modules. §5
discusses factorisation and irreducibility and adds a few elementary tools for
factorising polynomials with rational coefficients.

Second Quarter The subject properly starts in §6 with an analysis of how
you can make a field like Q bigger by allowing in new elements. The simplest
part of this is already strong enough to solve some of the ancient problems and
motivate new ones in §7.

Third Quarter We look in detail at the subfields that are fixed by automorphisms
in §8, and then work out everything we claimed above for x³ − 2 in §8.3.
§9 identifies the most important properties of field extensions, and how these
relate to maps between fields. We put all this together in §10, which explains the
revolutionary idea behind Galois Theory, and the main result that illuminates
everything.
In short, the idea is this: if f (x) is a polynomial, then, whether we know
them or not, the set of its roots has symmetries, in much the same way that a
polygon in the plane has symmetries. It turns out that properties of the group
of all symmetries reflect precisely properties of the roots themselves. But that’s
great, because in our context these groups are small finite objects that we can
analyse by hand, whereas solving polynomials in general seems impossible.

Fourth Quarter We apply that first in §11 to the substantial set piece of ‘bi-
quadratic’ quartics, where we can get hold of the 4 roots in fairly explicit terms,
and then to the classification and analysis of all finite fields in §12. Finally, in
§13, we return to the most famous problem of them all: the insolubility of the
quintic.

2 Essays on what you already know
Here are three alternative introductions, and there’ll be another in §13.4 at the
end. You don’t have to read any of these, or you could skim them in case one
may be interesting later when you have something substantial to tie it to.

2.1 Working in the complex numbers


The complex numbers Let R denote the real numbers. The set of complex
numbers C is
C = {a + ib | a, b ∈ R} (3)
along with everything you know about it, where i is a symbol fixed once and for
all.
As an R-vector space, C has favourite basis {1, i}—that’s how we uniquely
described elements a + ib in the first place. In particular C has dimension 2 as
an R-vector space, which is why we usually understand it as a plane, the Argand
diagram in Figure 1. But C is also a field with multiplication and division. This
field arithmetic all follows once you decree that i² = −1 and insist that the
usual commutative, associative and distributive laws hold (and why not?).
It’s good to agree that right now. Without getting philosophical about it,
if you make your version of C and I make mine, we’ll never know whether we
made the same thing or not. Was my i the same as yours or was it your −i, for
example? And that only begins to be a question if we happen to construct the
same set at all: your preferred version of C might be something more fun, like

\[
C = \left\{ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}
\]

with the inclusion R ↪ C given by $a \mapsto \left(\begin{smallmatrix} a & 0 \\ 0 & a \end{smallmatrix}\right)$, which would be entirely
reasonable of you, but it would certainly not be the same set as the C we agreed on at the start.
But let’s not worry about this: it’s all easy enough to sort out later once we
know how to speak, and for now let’s agree to agree on that set in (3) with that
symbol i and the arithmetic you know well.

Complex conjugation There is an elegant trick we use all the time. If
z = a + ib ∈ C, we say z̄ = a − ib ∈ C is the complex conjugate of z. We see
this geometrically in the Argand diagram by reflecting the whole of C in the
real axis, leaving R fixed pointwise: then z̄ is just the reflection of z. The trick,
then, is that by observing

\[
z\bar z = a^2 + b^2 = |z|^2 \in \mathbb{R}
\]

(where |z| is the Euclidean length of z), we can handle division via multiplication:
if z ≠ 0 then

\[
\frac{1}{a + ib} = \frac{1}{z} = \frac{\bar z}{z\bar z} = \frac{a}{a^2 + b^2} - i\,\frac{b}{a^2 + b^2}.
\]

This works because both z and its reflection z̄ meet in the denominator, and
since this is symmetric in i and −i it has no real choice but to be real (or see
the actual calculation above, if you don’t like rhetoric).
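
For a concrete instance of the division trick, here is a tiny sketch with exact rational arithmetic. It assumes a complex number a + ib with integer a, b is stored as the pair (a, b); the function names are invented for the example.

```python
from fractions import Fraction

def inverse(a, b):
    """Return 1/(a + ib) as a pair (re, im), via a/(a^2+b^2) - i*b/(a^2+b^2)."""
    norm = a * a + b * b          # z times its conjugate, a real number
    return (Fraction(a, norm), Fraction(-b, norm))

def multiply(z, w):
    """Multiply pairs (re, im) using i^2 = -1."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

z = (3, 4)
z_inv = inverse(*z)
print(z_inv)                 # (3/25, -4/25) as Fractions
print(multiply(z, z_inv))    # (1, 0)
```

The only division performed is by the real number a² + b², which is the point of multiplying through by the conjugate.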

First view of Galois Theory Looking beyond its computational utility,
complex conjugation is an isomorphism of C with itself (since $\overline{z_1 z_2} = \bar z_1 \bar z_2$
and $\overline{z_1 + z_2} = \bar z_1 + \bar z_2$) that fixes the subfield R ⊂ C pointwise: we say it
is an R-automorphism of C. The main idea of Galois Theory is that for any
(finite-dimensional) field extension K ⊂ L we can collect all K-automorphisms
of L into a group, the so-called Galois group of the extension and denoted
Gal(L/K) or AutK (L), and that the structure of this group tells you a lot
about K ⊂ L. In the case R ⊂ C, the Galois group is isomorphic to Z/2, with
the identity map and complex conjugation as the only two R-automorphisms.
The Galois Correspondence, Theorem 10.17 below, says that the fact that Z/2
has no nontrivial subgroups means that there are no nontrivial intermediate
subfields R ⊂ L ⊂ C. That is easy to check by hand in this example (see
§2.2), and we will see far more subtle behaviour, but already this gives a new
perspective on why we can solve quadratic equations using only field operations
and square roots.
The various applications arise because many questions can be interpreted as
some input data involving K (polynomials, geometrical constructions) that have
solutions as data in a bigger field L (the roots of a polynomial, the coordinates
of a new geometrical figure), but we are given only fairly crude tools to make
field extensions (radicals, rulers and compasses). To decide whether the tools
we have are strong enough to build (or at least approximate) L from K, we need
to measure how complicated the extension K ⊂ L is. But fields are horribly
complicated objects, so we translate that question to the easier setting of groups
(they can also be horribly complicated, but not in the context we see them here).

Roots of unity and other radicals From your life so far, you are indeed
most excellent at working in the complex numbers. For example, you know
what it means to extract n-th roots of a real or complex number in analytical
and geometrical and computational terms:
(a) If x ∈ R and x ≥ 0 then there is a unique real number y ≥ 0 such that
yⁿ = x. In exactly this context you write y = ⁿ√x. (If pressed to find
this number, you could easily compute its decimal expansion by systematic
trial and error, or apply Newton–Raphson or similar.)
(b) If z ∈ C, then you can find a complex number t ∈ C so that tⁿ = z. It does
no harm to suppose z ≠ 0, and write z = r exp(iθ) in polar form, where
r ∈ R has r ≥ 0 and θ is well defined up to 2πZ. Then t = y exp(iθ/n)
where y = ⁿ√r. You wisely tend not to denote this by ⁿ√z since it is not
unique. Instead you do something much more precise, as follows.
(c) Let ω = exp(2πi/n) = cos(2π/n) + i sin(2π/n), so that the complex numbers
1 = ω⁰, ω, ω², …, ωⁿ⁻¹ are the n distinct complex roots of the polynomial
xⁿ − 1. Geometrically you can’t help but picture these as the
vertices of a regular n-gon, inscribed within the unit circle, with one of
the vertices pinned down at 1 ∈ C (to stop the whole thing rotating). Our
choice of ω makes it the first vertex anticlockwise after 1, ω² the next, and
so on. The 5th roots of unity are drawn at the vertices of a pentagon in
Figure 1.
(d) Now if t is any n-th root of z as above, then t, ωt, ω²t, …, ωⁿ⁻¹t are the
n distinct complex roots of the polynomial xⁿ − z.
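
Steps (b) to (d) translate directly into a few lines of floating-point code. This is only a numerical sketch (so equalities hold up to rounding), and the function name `nth_roots` is an illustrative choice.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n complex n-th roots of a nonzero z, as in steps (b)-(d)."""
    r, theta = cmath.polar(z)              # z = r * exp(i * theta)
    y = r ** (1 / n)                       # the real n-th root of r, as in (a)
    t = y * cmath.exp(1j * theta / n)      # one n-th root of z, as in (b)
    omega = cmath.exp(2j * math.pi / n)    # primitive n-th root of unity, as in (c)
    return [t * omega ** k for k in range(n)]   # all the roots, as in (d)

cube_roots = nth_roots(-8, 3)
print(all(abs(t ** 3 - (-8)) < 1e-9 for t in cube_roots))

# The fifth roots of unity from Figure 1 sum to (approximately) zero.
fifth_roots_of_unity = nth_roots(1, 5)
print(abs(sum(fifth_roots_of_unity)) < 1e-12)
```

The choice of θ returned by `cmath.polar` is one representative modulo 2π; a different representative would give the same list of roots in a rotated order.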

Solving polynomials by radicals When you solve ax² + bx + c = 0, you trick
it into being a question of finding radicals: you change coordinates so that a
square root does the job. At that point, most people just remember the famous
formula, and know that they could figure it out again by completing the square
if necessary.
We will see that there are also formulas for cubic and quartic polynomials.
Nobody remembers them, and they are harder to figure out—they were the
secret technology of their time. In higher degree, there are also formulas for
special cases. But the striking fact is that the roots of a typical polynomial of
degree ≥ 5 cannot be expressed in radicals at all: in particular, there cannot
be a radical formula to solve the general case. Even more remarkably, we can
prove that fact, and we will, precisely as Galois taught us.

2.2 Subfields of the complex numbers


Consider the familiar pair of fields R ⊂ C. Very often we look for solutions
in one of these two fields when we solve equations. To be concrete, let’s think
about attempting to solve polynomials with integer coefficients. That is, fix in
advance some integral degree n ≥ 1 and some integral coefficients a₀, a₁, …, aₙ,
with aₙ ≠ 0, and then find all values x for which

\[
a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0.
\]

For example, if you want to know how big a cube-shaped Tupperware you’ll
need to fit in the 2kg of mashed potatoes you have left over from Christmas,
you’ll look for a real number x that satisfies x³ − 2 = 0. (I may be glossing over
some of the physics of mashed potatoes here, but then only you really know
the density at which you made them, so only you can assign sensible units to that 2. I’ll
leave that with you.) We sometimes think of C as the big super field (where
we know there’ll be at least one solution, and no more than n of them, even
though we may not be very clear about what they mean in Tupperware terms)
and R as the smaller, specialised ‘real’ field (where we might prefer solutions
to lie, especially if the solutions have some meaning in the real, mashed potato
world of fitting things into the fridge).
There is no intermediate field lying strictly between the two: if L is a field
satisfying R ⊂ L ⊂ C, then L = R or L = C. (This is clear: if α = a + ib ∈ L \ R,
then b ≠ 0, so i = (1/b)(α − a) ∈ L, since a, b, α ∈ L, and so L = C.)
But in fact C has billions upon infinities of subfields K ⊂ C that are not
equal to R. One of them is Q, the familiar rational numbers, and every subfield
K ⊂ C must contain Q (because K contains 1, so it must also contain 2 = 1 + 1
and −3 = −(1 + 1 + 1), whence also 1/2 and −3/2, and so on).
If we solve a single polynomial f = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀ with
integer or rational coefficients, then in retrospect we didn’t need anything like
the whole of the field C to work in: we could have made do with the smallest
subfield of C that contains the roots of f . This sounds a bit theoretical: at the
outset, we don’t know the roots, and we probably think of that as the point. But
at least we know it has complex roots, so they are out there in C somewhere.
For example, x³ − 2 (the mashed potato problem) has a single positive real
root α = ∛2 (see footnote 1) and a pair of complex conjugate roots, αω and αω², where
ω = exp(2πi/3) = −1/2 + i√3/2 is a primitive cube root of unity. So there is a
choice of two much leaner fields than R or C to work in. If we are determined
only to care about the single real root, we could work in the subfield K of R
that contains all numbers we can get by combining α with rational numbers.
We denote that field K = Q(α), and note that Q ⊂ Q(α) ⊂ R.
Or we might be a bit more gregarious and admit all three roots, α, αω, αω²,
and work in the smallest subfield L of C that contains those: since L is a field
containing α ≠ 0 and αω, it also contains ω = αω/α, and so it might be easier
to describe it as the field containing all numbers we can get by combining α and
ω with rational numbers. We denote that field L = Q(α, ω), and note that

Q ⊂ K = Q(α) ⊂ L = Q(α, ω) ⊂ C

are all different, but all interesting in their own ways for x³ − 2.
To a large extent, this module studies fields like K and L—they are fabbo
examples that we’ll see again, so if you want to brush up your cube roots of
unity, be my guest. To get some idea of how economical they are, both K
and L are countable (thought of merely as sets), whereas both R and C are
uncountable, so K and L are minuscule in comparison. And while the Argand
diagram in Figure 1 is an absolutely brilliant visual aid for understanding C,
we stand no chance at all of drawing K or L on it in any reasonable way,
since they consist of infinitely many numbers spread around densely (in R or
C respectively) in an exquisite filigree of magic arithmetical dust—drawing Q
itself is just as impossible, though perhaps we don’t find it so mysterious.
So from one point of view, you could think of the brutal work that analysis
and topology do in completing Q to get R as grotesquely violent, smashing these
delicate little fields out of the way. The problem isn’t with C itself—once we
own R, it’s a very small step to get to C: they teach it at high school, completely
ignoring the damage that R has already caused. The problem is with that earlier
step where we thought that R would be a good idea.
But come on, that’s a bit rich: you probably only believe in α = ∛2 in the
first place thanks to the completeness of R. So let’s not malign R, but let’s note
that if we want to begin to understand the billions upon infinities of subfields
of C, we had better not ask them to contain R.

Footnote 1: approximately 1.25992104989487316476721060727831982242676643638515515135310740557628150781965814530849457

We love C because it is familiar and comes with a fantastic picture, but also
because it is algebraically closed, so the roots of polynomials behave as well as
they possibly could. But there is a smaller algebraically closed field A ⊂ C, the
field of algebraic numbers (see §6.6), that contains all the roots of all polynomials
with rational coefficients, but does not contain R. We could have decided to
work in A instead of C. Some people do, but not before they’ve taken this
module. (Unless you did Algebraic Number Theory last term, in which case
you know A very well. But then you’re probably waiting for a lorryload of
machinery to be delivered by this module before you believe it.)
One final point: we can work in our fields K and L purely symbolically. That
is, any element is some rational expression in α and ω, and to tell whether two
are the same is a simple matter of algebraic manipulation (in principle, at least).
In particular, we do not need to work with decimal expansions, which is good
because we will never own the whole expansion of α = ∛2, and if some other
real number β ≠ α had the same expansion for the first 1000 terms, we would
probably never know they were different. So we could program K and L on a
computer in precise terms without worrying about the (im)precision of floating
point calculations. Of course, since we just thought of it, it has been done
hundreds of times: all the calculations we will see involving only the subfield A ⊂
C are available on multiple computer algebra systems. (That’s a bit rich too,
though: decimal expansions are such an easy, flexible and powerful tool, that
many implementations of fields like K will carry around a decimal expansion
alongside the symbols to avoid the worst of the algebraic manipulation that I
pretended would be easy. There is some proverb about babies and bathwater
or possibly gift horses and their mouths. . . )
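
To make the "purely symbolic" point concrete, here is a minimal Python sketch of exact arithmetic in K = Q(α) with α³ = 2. It assumes the standard fact (not yet proved at this point in the notes) that every element of K can be written as a + bα + cα² with rational a, b, c; the class name and layout are illustrative choices, not anything the module prescribes.

```python
from fractions import Fraction

class CubeRootTwoField:
    """Element a + b*alpha + c*alpha^2 of K = Q(alpha), where alpha^3 = 2."""

    def __init__(self, a=0, b=0, c=0):
        self.coeffs = (Fraction(a), Fraction(b), Fraction(c))

    def __eq__(self, other):
        return self.coeffs == other.coeffs

    def __add__(self, other):
        return CubeRootTwoField(*(s + t for s, t in zip(self.coeffs, other.coeffs)))

    def __mul__(self, other):
        # Multiply as polynomials in alpha, then use alpha^3 = 2 and alpha^4 = 2*alpha.
        prod = [Fraction(0)] * 5
        for i, s in enumerate(self.coeffs):
            for j, t in enumerate(other.coeffs):
                prod[i + j] += s * t
        return CubeRootTwoField(prod[0] + 2 * prod[3], prod[1] + 2 * prod[4], prod[2])

    def __repr__(self):
        a, b, c = self.coeffs
        return f"{a} + {b}*alpha + {c}*alpha^2"

alpha = CubeRootTwoField(0, 1, 0)
two = CubeRootTwoField(2)
```

No decimal expansion appears anywhere: equality of two elements is just equality of three rational coefficients.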

2.3 The complex numbers viewed from below


For a moment, imagine a world where you only know about R and the ring R[x]
of polynomials with real coefficients, but you haven’t come across C.
For any polynomial f ∈ R[x] of degree d > 0, you can construct the quotient
ring L = R[x]/(f ) (see §4.5, §6.4). If you remember this algebra well, you can
work with L formally as cosets g + (f ) of the ideal (f ) ⊂ R[x], which is what
this notation really means. Or, in the same way as you treat Z/pZ informally,
but just as precisely, as the set {0, 1, 2, . . . , p − 1} (see §4.1), you can treat L
informally, but just as precisely, as the set R[x]<d of polynomials of degree < d,
where it is understood that you can work in R[x] while calculating, and then
finish any computation by taking the remainder after long division by f (and by
all means also take remainders at any stage of the calculation, if it pleases you);
the key point is the uniqueness of the remainder after division. Let’s consider
an example, which, if you haven’t seen it before, will blow your mind!
Let f = x² + 1. (You know that f is irreducible in R[x], meaning that it
does not factorise nontrivially there, but let’s put that important point to the
side for now.) To work in L = R[x]/(f) in practical terms, you just work with
polynomials and set f to be zero whenever you come across it: or, if you prefer,
set x² to be −1 wherever you see an x², which amounts to the same thing, since
x² + 1 = 0 and x² = −1 are not so very different. So you see why it is reasonable
to work in R[x]<2: whenever you see a higher power of x, whittle it down by
replacing occurrences of x² by −1. So in this sense it’s reasonable to think of
L = {a + bx | a, b ∈ R}, together with slightly odd ‘wrap-around’ behaviour
when calculating, just as it is reasonable to treat Z/pZ as {0, 1, . . . , p − 1}.
In fact, L is a field, whichever way you work with it. For example,

x × (−x) = −x² = −(−1) = 1

so −x does just what is required of 1/x, and so in L the multiplicative inverse
of x is −x. You could try working out the inverse of 1 + x as a fun exercise, and
enjoy any déjà vu you may experience.
Let’s consider polynomials over L. We’ve already used x as a variable, so
let’s use y just to avoid muddling them, and so we consider elements of the
polynomial ring L[y]. In particular, what about g = y² + 1? It’s certainly a
polynomial with coefficients in L. I can evaluate it at any element α = a + bx ∈ L
to get another element of L. For example,

g(2 − 3x) = (2 − 3x)² + 1 = 9x² − 12x + 5 = −9 − 12x + 5 = −4 − 12x

(where we replaced x² by −1, since that’s what we do when calculating in L).


But here’s a strange thing: see what happens when we evaluate g = g(y) at
the element α = x of L:

g(α) = g(y)|_{y=x} = x² + 1 = 0.

That is, x ∈ L is a root of g ∈ L[y]. And, if you try it you’ll see that −x is
another root. So y² + 1 = (y − x)(y + x) factorises in L[y].
But hang on a moment: factorising y² + 1 is what C is for. You’ve probably
already spotted what’s going on. At the outset, we called the variable x. We
could have called it anything: t or z or ξ or, oddly but legally, i. And if we had
called it i, then everything we’d talked about throughout the whole discussion
would have been exactly as though we were talking about C. More precisely,
there is an isomorphism of fields L → C taking a + bx to a + bi that takes the
subfield R of L identically to R inside C. There is also a second isomorphism
L → C that takes a + bx to a − bi. That’s fine: I have no idea which root of
y² + 1 the quantity x is and neither do you. There is no answer to that: we just
say to ourselves that we are happy that there are two maps L ↪ C, and we can
pick either one of them and stick to it.
We’ll formalise this trick in §6.7, and use it infinitely often: to some extent,
it’s one of the central points of field theory and is the main tool under the surface
of this module, so if you got something from that discussion that’s great, and if
not you can see it done properly later.
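
The "work in R[x], then take the remainder on division by x² + 1" recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: elements a + bx of L are stored as pairs (a, b), rational numbers stand in for real coefficients so that the arithmetic is exact, and the function names are made up for this sketch.

```python
from fractions import Fraction

def reduce_mod_x2_plus_1(coeffs):
    """Remainder of a polynomial (coefficients from degree 0 up) on division by x^2 + 1."""
    a, b = Fraction(0), Fraction(0)
    for k, c in enumerate(coeffs):
        # x^k reduces to 1, x, -1, -x in turn as k = 0, 1, 2, 3 mod 4.
        sign = -1 if k % 4 in (2, 3) else 1
        if k % 2 == 0:
            a += sign * c
        else:
            b += sign * c
    return (a, b)  # stands for a + b*x

def multiply(u, v):
    """Product of a + b*x and c + d*x in L, via the unreduced product in R[x]."""
    (a, b), (c, d) = u, v
    return reduce_mod_x2_plus_1([a * c, a * d + b * c, b * d])

def inverse(u):
    """Inverse of a nonzero a + b*x, namely (a - b*x) / (a^2 + b^2)."""
    a, b = Fraction(u[0]), Fraction(u[1])
    n = a * a + b * b
    return (a / n, -b / n)

x = (0, 1)
```

For instance, `multiply(x, x)` gives the pair standing for −1, and `inverse(x)` gives the pair standing for −x, matching the calculation in the text.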

3 Solving cubic polynomials
3.1 The quadratic formula
To find the roots of a quadratic polynomial f = x² + ax + b, with a, b ∈ C, you
complete the square by coordinate change x = y − a/2 and then look to solve

(y − a/2)² + a(y − a/2) + b = 0, which is y² − c = 0,

where c = ¼(a² − 4b). Since the nature of the square root you want to extract
now depends on the sign of c, you define the discriminant

∆ = a² − 4b.

With that notation, y = ±√c = ±½√∆, whence

\[ x = y - a/2 = \frac{-a \pm \sqrt{\Delta}}{2}, \]

as everybody knows.
Notice that the ± sign isn’t just there as an afterthought: if we fix α to be
one of the solutions of z² = ∆, then αω₂ is the other, where ω₂ is a primitive
square root of unity, that is, ω₂ = −1. That is, the ± is the visible action of
the group Z/2Z = {1, −1} ⊂ C of square roots of unity.
Viewed from the other end, if α, β ∈ C were the roots of f , we would have

f = (x − α)(x − β) = x² − (α + β)x + αβ,

so, in the original notation, a = −(α + β) and b = αβ. In terms of α, β, we
see that ∆ = α² − 2αβ + β² = (α − β)². This is a more compelling reason for
having ∆ defined at all: it’s not there to tidy up a formula, but to discriminate
between the case that f has a multiple root (namely α = β, so ∆ = 0) and two
distinct roots, ∆ ≠ 0.

3.2 The cubic formula


Completing the square is also known as the Tschirnhausen transformation,
and it works for polynomials of any degree. To find the roots of a cubic
polynomial f = x³ + ax² + bx + c, with a, b, c ∈ C, we first ‘complete the cube’, that
is, make the Tschirnhausen transformation x = y − a/3. (In degree n you’d use
y − a/n.) That presents f as

(y − a/3)³ + a(y − a/3)² + b(y − a/3) + c = y³ + py + q,

where p = −a²/3 + b and q = 2a³/27 − ab/3 + c.
Thus it is enough to find a radical formula for the roots of the cubic
polynomial y³ + py + q. Without loss of generality pq ≠ 0:
• p ≠ 0, otherwise the roots are the cube roots of −q, and
• q ≠ 0, otherwise the roots are y = 0 and the square roots of −p.
Out of the blue, define a type of cubic discriminant

\[ D = q^2 + \frac{4p^3}{27} \tag{4} \]

and fix a complex number α = √D; that is, α is either solution of z² = D.
Now fix a cube root

\[ \beta = \sqrt[3]{\frac{-q + \alpha}{2}}, \tag{5} \]

that is, β is any solution of z³ = ½(−q + α), and similarly, pick another cube
root

\[ \gamma = \sqrt[3]{\frac{-q - \alpha}{2}}. \tag{6} \]

There was choice in each of α, β, γ. You’ll have noticed that we’ve already
‘balanced’ α with −α, but I don’t yet know how to ‘balance’ the cube roots.
Let ω be a primitive cube root of unity: that is ω³ = 1 and ω ≠ 1. (We
could be more prescriptive: since ω³ − 1 = (ω − 1)(ω² + ω + 1) and ω ≠ 1 we
must have ω² + ω + 1 = 0, so by the quadratic formula ω = ½(−1 ± i√3). But I
don’t care which primitive root of unity ω is, so let’s not bother spelling it out.)
The key to choosing the cube roots is to consider the product (βγ)³, which
is indifferent to all those choices. From here on in, you could follow your nose,
but I’ll work it out:

\[
\begin{aligned}
(\beta\gamma)^3 = \beta^3\gamma^3 &= \left(\frac{-q+\alpha}{2}\right)\left(\frac{-q-\alpha}{2}\right) \\
&= \tfrac14 (q^2 - \alpha^2) \\
&= \tfrac14 (q^2 - D) \\
&= \tfrac14 \bigl(q^2 - (q^2 + 4p^3/27)\bigr) \\
&= -p^3/27 = (-p/3)^3.
\end{aligned}
\]

Thus βγ is one of the 3 cube roots of (−p/3)³, and so the 3 distinct cube roots are
βγ, ωβγ and ω²βγ. Visibly −p/3 is one of these cube roots, so after making an
arbitrary choice of β we may choose γ so that βγ = −p/3.
Now I claim we have the three roots:

y = β + γ or ωβ + ω²γ or ω²β + ωγ. (7)

We only need to check the first of these: the other two come from making
the other two possible choices for β, and then rechoosing γ so their product
remains −p/3, so they work if β + γ works. Since βγ = −p/3,

\[
\begin{aligned}
y^3 &= \beta^3 + 3\beta^2\gamma + 3\beta\gamma^2 + \gamma^3 \\
&= \frac{-q+\alpha}{2} - p\beta - p\gamma + \frac{-q-\alpha}{2} \\
&= -p(\beta+\gamma) - q = -py - q,
\end{aligned}
\]

so y³ + py + q = 0 as required.
To show off, we might write the solutions as

\[
y = \omega^i \sqrt[3]{\frac{-q + \sqrt{q^2 + \frac{4p^3}{27}}}{2}}
  + \omega^{3-i} \sqrt[3]{\frac{-q - \sqrt{q^2 + \frac{4p^3}{27}}}{2}}, \tag{8}
\]

for i = 0, 1, 2 and a primitive cube root of unity ω. But beware that this
disguises the crucial compatibility between the second cube root (which is γ) and
the first (which is β). (There is of course also an unspoken relation between the
two square roots: they are related by being exactly the same.)²
If you want to solve the original cubic x³ + ax² + bx + c, remember that
x = y − a/3, and p, q are expressed in a, b, c, so we can use these formulas for
that case too. (It’s not showing off to write (8) in terms of a, b, c, it’s madness.)
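
The recipe (4)–(7) translates almost line by line into floating-point Python. The sketch below is only a numerical illustration, which is exactly what the formula itself does not need: the function name is made up, and it sidesteps the choice of cube root for γ by setting γ = −p/(3β), which is legitimate because β ≠ 0 when p ≠ 0.

```python
import cmath

def cardano_roots(p, q):
    """Numerical roots of y^3 + p*y + q with p*q != 0, following (4)-(7)."""
    D = q * q + 4 * p ** 3 / 27            # the quantity D of (4)
    alpha = cmath.sqrt(D)                   # either square root of D will do
    s = (-q + alpha) / 2                    # nonzero because p != 0
    beta = cmath.exp(cmath.log(s) / 3)      # one cube root of (-q + alpha)/2
    gamma = -p / (3 * beta)                 # forced by beta * gamma = -p/3
    omega = cmath.exp(2j * cmath.pi / 3)    # a primitive cube root of unity
    return [omega ** i * beta + omega ** (3 - i) * gamma for i in range(3)]
```

Calling `cardano_roots(-7, 6)` returns three complex numbers whose imaginary parts are zero up to rounding; they approximate the roots of y³ − 7y + 6 = (y − 1)(y − 2)(y + 3).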

Where did D and β and γ and y = β + γ come from? Going back to
f = y³ + py + q, with pq ≠ 0, consider the substitution y = z − p/(3z). (If
y ∈ C, then there is a nonzero z that satisfies this, so we haven’t lost anything.)
Clearing the z³ from the denominator (fine: it’s not zero), presents f = 0 as

\[ z^6 + qz^3 - \frac{p^3}{27} = 0. \]

That’s a sextic polynomial in z, so things seem to have got worse, until you spot
that it’s really a quadratic in z³. (The substitution was chosen precisely to rig
that up.) So by the quadratic formula

\[ z^3 = \frac{-q \pm \sqrt{q^2 + \frac{4p^3}{27}}}{2}. \]

So D is the discriminant of this auxiliary quadratic equation, and choosing z = β
and z = γ to be any of the cube roots (for ± respectively), subject to the restriction
βγ = −p/3, we see y = z − p/(3z) gives

\[ y = \frac{y + y}{2} = \frac{\beta - p/(3\beta) + \gamma - p/(3\gamma)}{2} = \frac12\left(\beta + \gamma - \frac{p}{3}\,\frac{\beta + \gamma}{\beta\gamma}\right) = \beta + \gamma. \]

This is expressed more elegantly by rearranging the substitution:

z² − yz + (−p/3) = 0

shows that a pair of solutions for z whose product is −p/3 must sum to y.
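
For anyone who wants to see the substitution y = z − p/(3z) carried out rather than take it on trust, here is the expansion written out; it uses only the binomial theorem and the symbols already in play.

```latex
\begin{align*}
y^3 &= \left(z - \frac{p}{3z}\right)^3
     = z^3 - pz + \frac{p^2}{3z} - \frac{p^3}{27z^3}, \\
py  &= pz - \frac{p^2}{3z}, \\
y^3 + py + q &= z^3 + q - \frac{p^3}{27z^3}.
\end{align*}
```

The middle terms cancel in pairs, and multiplying the last line by z³ gives the sextic in z.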
2 Del Ferro (early 1500s) and Tartaglia and Cardano (around the 1530s) knew essentially

this formula [1]. Del Ferro was teaching it to his students 500 years ago in the University of
Bologna, the oldest university in Europe. To put it into local context, that was around the
time Queen Elizabeth I was born (1533). Forty years later (from 1575) she visited Kenilworth
Castle three times, in part to give Robert Dudley, the first Earl of Leicester, a chance to
impress her for a marriage proposal (unsuccessful, despite the nice new formal garden, which
you can see recreated today, and the live sea battles in the adjoining flooded fields, which you
cannot). Historians have overlooked the possibility that she was also checking out the facilities
for twinning with Galois’ birthplace (successful, but with a delay of 400 years). Anyway, that’s
not bad going by dF, T and C, bearing in mind that they didn’t really trust negative numbers,
let alone complex ones.

Shouldn’t D be a discriminant, defined conceptually? Yes, of course.
If y₁, y₂, y₃ are the three roots (with multiplicity) of any cubic polynomial f,
then the discriminant is correctly defined as

∆ = (y₁ − y₂)²(y₁ − y₃)²(y₂ − y₃)².

Clearly the value of ∆ is independent of the order we wrote down the three roots,
and ∆ = 0 if and only if f has multiple roots. You can check that ∆ = −27D,
so that the cubic has multiple roots if and only if D = 0.
This is worth stating properly in terms of the coefficients and remembering.
For a cubic polynomial f = y³ + py + q, the discriminant is

∆ = −27q² − 4p³ (= −3³q² − p³2² as a tonguetwister).
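
The identity ∆ = −27D can be checked exactly on an example. The sketch below starts from three rational roots that sum to zero (so the cubic is already in reduced form), reads off p and q from the roots, and compares the two sides; the function names are illustrative, and one example is of course a sanity check rather than a proof.

```python
from fractions import Fraction

def discriminant_from_roots(y1, y2, y3):
    return (y1 - y2) ** 2 * (y1 - y3) ** 2 * (y2 - y3) ** 2

def reduced_cubic_from_roots(y1, y2, y3):
    """Coefficients (p, q) of (y - y1)(y - y2)(y - y3) when y1 + y2 + y3 = 0."""
    assert y1 + y2 + y3 == 0, "the y^2 coefficient must vanish"
    p = y1 * y2 + y1 * y3 + y2 * y3
    q = -y1 * y2 * y3
    return p, q

y1, y2, y3 = Fraction(1), Fraction(2), Fraction(-3)
p, q = reduced_cubic_from_roots(y1, y2, y3)   # y^3 - 7y + 6
D = q ** 2 + Fraction(4, 27) * p ** 3
delta = discriminant_from_roots(y1, y2, y3)
```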

3.3 The cubic formula for real cubics


We continue the notation

f = x³ + ax² + bx + c = y³ + py + q

with x = y − a/3 and p, q expressed in terms of a, b, c, where pq ≠ 0 and the
three solutions are y = ωⁱβ + ω³⁻ⁱγ where βγ = −p/3, all as before.
Notice that if a, b, c ∈ R, then also p, q ∈ R, but not conversely, so let us
suppose p, q ∈ R. It follows that D = q² + 4p³/27 ∈ R.

Real cubics with multiple roots. f has multiple roots if and only if D = 0.
The form of D then implies that p < 0. So α = √D = 0 and by (5,6) we may
choose β = γ = −∛(q/2) ∈ R, checking the necessary compatibility condition is
satisfied for that choice:

βγ = ∛(q²/4) = ∛(−p³/27) = −p/3 (using D = 0).

By (7) the three roots are

y = 2β and (ω + ω²)β = −β (twice), where β = ∛(−q/2).

We can simplify the expression for β using the compatibility condition, or by
looking at what we’ve got: f = (y − 2β)(y + β)² = y³ − 3β²y − 2β³, so p = −3β²
and β = ±√(−p/3), with the sign chosen to be the opposite of the sign of q.

Real cubics with one real root. Suppose D > 0, so that α = √D is real.
Then β and γ are both cube roots of real numbers, so may also be chosen real:
notice that the constraint βγ = −p/3 ∈ R can still be satisfied, since we are free
to choose β as the real cube root, and then γ must also be chosen real.
Thus y = β + γ ∈ R is a formula for the real root, that involves extracting
only real square and cube roots.
We see also that the remaining two roots of f,

y = ωβ + ω²γ and y = ω²β + ωγ,

are a complex conjugate pair, since ω̄ = ω². In particular, if they were real,
then they would be equal, which they’re not since D ≠ 0.
Thus we have described the three distinct roots, one real and a strictly
complex conjugate pair, in terms of real square and cubic radicals and the cube
roots of unity. Perfection!

Real cubics with 3 real roots: the Casus Irreducibilis. Now suppose
D < 0, so that α = √D = ±i|α| is pure imaginary. By (5,6), β³ and γ³ are a
strictly complex conjugate pair, so the compatibility condition βγ = −p/3 ∈ R
forces us to choose their cube roots β and γ = β̄ to be a strictly complex
conjugate pair too. Thus by (8) the 3 roots are

\[ y = \beta + \bar\beta, \qquad \omega\beta + \bar\omega\bar\beta \qquad\text{and}\qquad \omega^2\beta + \bar\omega^2\bar\beta \]

which are visibly all real. The end.

But the vexing thing about this is that we have a brilliant radical formula
for all 3 real roots, but each one requires strictly complex calculations.³
It must surely be possible to have a formula for the real roots that involves
only real radicals: we managed that in the complex case, with only the cube
roots of unity carrying the imaginary data. The answer is no: the radicals
needed in a formula for the roots may indeed generate a larger field than the
roots themselves do. We’ll see this with more precision much later.

Real cubics with 3 real roots via trigonometric functions First an easy
point. Recall that for any θ, we have the triple angle formula

cos(3θ) = 4 cos³(θ) − 3 cos(θ).

Let me write y = cos(θ) and rearrange that as

y³ − ¾y − ¼cos(3θ) = 0. (9)

If by chance you ever come across a cubic in reduced form with p = −3/4 and
q = − cos(3θ)/4, for some value of θ, then you can say at once that it has the 3
real solutions

y = cos(θ) and cos(θ + 2π/3) and cos(θ + 4π/3).

3 Word is that Tartaglia and Cardano were not happy about that at all: the name for this
case, Casus Irreducibilis or ‘irreducible case’, echoes with their irritation over the centuries.

Second, another easy point. This solution can be adapted to work whenever
p, q ∈ R with D < 0. The trick is to consider a change of coordinates y = λt,
and choose λ so the result is of the form of (9): the only real constraint is
that the constant term had better lie in [−1/4, 1/4] so it can be ¼ × cosine of
something. The solutions are simply

y = λ cos(θ) and λ cos(θ + 2π/3) and λ cos(θ + 4π/3)

where λ = 2√(−p/3) (fine: p < 0 since D < 0) and θ is an inverse-cosine of
something fiddly in p, q that the algebra throws up. (See [?], §1.3 Exercise 10,
for detailed hints.)⁴
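
As a numerical sketch of the trigonometric method (not a substitute for the exercise), the Python below carries out the change of coordinates y = λt for real p, q with D < 0. Dividing the cubic by λ³ and matching it against (9) gives cos(3θ) = −4q/λ³; that matching step is worked out here rather than quoted from the notes, and the function name is made up.

```python
import math

def trig_roots(p, q):
    """Three real roots of y^3 + p*y + q for real p, q with D = q^2 + 4p^3/27 < 0."""
    D = q * q + 4 * p ** 3 / 27
    if D >= 0:
        raise ValueError("this method needs D < 0")
    lam = 2 * math.sqrt(-p / 3)                 # lambda; p < 0 because D < 0
    # After y = lam * t the cubic becomes t^3 - (3/4) t + q / lam^3 = 0,
    # so matching (9) needs cos(3*theta) = -4 q / lam^3.
    c = max(-1.0, min(1.0, -4 * q / lam ** 3))  # clamp against rounding error
    theta = math.acos(c) / 3
    return [lam * math.cos(theta + 2 * math.pi * k / 3) for k in range(3)]
```

Only real arithmetic is involved, at the price of calling `math.acos` and `math.cos`.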

What to make of that? [Nonexaminable] The first thing to say is great!


That method returns 3 real roots without any complex calculations. We paid a
price for this power: rather than working with simple radicals, we used sophis-
ticated trigonometric functions. But that’s not really a big deal, it is simply
a different approach: it uses tools that are better adapted to the particular
problem—finding roots of cubics with D < 0—and so it works better.
So wouldn’t it be better to throw away radicals and use trig functions from
the start? It all depends on what you mean by solving an equation. In this
module we progress by abstract algebra: that gives us a lot of one sort of power,
which simplifies many things, and conversely the problems we consider direct
the choice of algebra we develop. We allow ourselves radicals: after all, any
radical ⁿ√a is a root of a particular kind of polynomial equation zⁿ − a = 0.
But then certain values of cos(θ) are roots of particular kinds of polynomial
equation too, namely (9), although we wouldn’t usually think of cos in those
terms, and didn’t learn it like that. I first learned about cos and sin through
trigonometric calculations, and pretty soon after that as solutions to the differ-
ential equation
\[ \frac{d^2 y}{dx^2} + y = 0 \]
and by inverting certain integrals (for the enigmatically-named function s(u))
\[ u = \int_0^{s(u)} \frac{dx}{\sqrt{1 - x^2}}, \]
though they weren’t expressed in quite those terms.
To solve higher degree polynomials, this suggests two approaches. We could
simply award ourselves some class of solutions to some class of equations, and
then use those to solve others. That was done successfully in the mid-19th cen-
tury. Alternatively, and perhaps less arbitrarily, we could explore new classes
of functions that arise in nature, and see whether they can be applied to the
solution of polynomials. This was done successfully throughout the 19th cen-
tury. That solution uses elliptic functions—Galois knew them well—which are
generalisations of trigonometric functions that arise as the solutions of slightly
bigger differential equations or by inverting slightly more complicated integrals.
This module won’t take that route. We stick to radicals, and show how
Galois Theory relates their efficacy to the group of symmetries of the roots.
4 Viète knew how to do this, and much more, around 1590. Viète also cracked the codes

used by Philip II of Spain, one of the global superpowers at the time. We are discussing the
work of highly sophisticated scientists working at the cutting edge of knowledge at the time.

4 Reminders
Here’s what we start with. It’s part of the module to have all this on board.

4.1 A reminder about fields


A field K is a set in which you can do the four arithmetic operations +, −, ×, ÷,
subject to the usual rules. All the rules are important, but the only one you
forfeit your dinner money for is the one about not dividing by zero. Not even
for a joke.
The rationals Q, the reals R and the complexes C are all fields. From one
point of view, this module is the theory of fields, and so we will see many more.
But there’s a lot more to this module than that, and the applications we see
will direct our focus on some parts of field theory more than others.
Let’s build a zoo of fields. Some look a bit dangerous, but they’ve all been
domesticated over the years so you can safely put your fingers through the bars
and tickle them.

Quadratic fields Let d > 0 be a squarefree integer (that is, each prime in its
prime factorisation appears only once) and consider

Q(√d) = {a + b√d | a, b ∈ Q}.

Since √d ∈ R is a (positive) real number, this set is a (nonempty) subset of
R, and it is easy to check that it is closed under +, − and ×. To complete
confirmation that it is a field, you just need to figure out division, but that
works easily using the Galois conjugate a − b√d in a similar way to how you
divide complex numbers:

\[ \frac{1}{a + b\sqrt d} = \frac{a - b\sqrt d}{(a + b\sqrt d)(a - b\sqrt d)} = \frac{a - b\sqrt d}{a^2 - b^2 d} = \left(\frac{a}{a^2 - b^2 d}\right) + \left(\frac{-b}{a^2 - b^2 d}\right)\sqrt d. \]

Similarly, using the negative −d, we can consider

Q(√−d) = {a + b√−d | a, b ∈ Q}.

This is a subfield of C, which you can check in the same way. Notice that in
this case the conjugate operation a − b√−d is the same as complex conjugation,
while it was clearly something quite different for Q(√d).
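
The division rule can be turned into a small exact-arithmetic sketch in Python. The class below is an illustration only: its name and layout are made up, d is passed in by hand and is assumed squarefree and different from 1 (negative d gives the imaginary case), and only multiplication, conjugation and inversion are implemented.

```python
from fractions import Fraction

class QuadraticNumber:
    """Element a + b*sqrt(d) of Q(sqrt(d)) for a fixed squarefree integer d != 1."""

    def __init__(self, a, b, d):
        self.a, self.b, self.d = Fraction(a), Fraction(b), d

    def __eq__(self, other):
        return (self.a, self.b, self.d) == (other.a, other.b, other.d)

    def __mul__(self, other):
        # (a + b√d)(a' + b'√d) = (aa' + bb'd) + (ab' + a'b)√d
        return QuadraticNumber(self.a * other.a + self.b * other.b * self.d,
                               self.a * other.b + self.b * other.a, self.d)

    def conjugate(self):
        return QuadraticNumber(self.a, -self.b, self.d)

    def inverse(self):
        norm = self.a ** 2 - self.b ** 2 * self.d   # nonzero unless a = b = 0
        return QuadraticNumber(self.a / norm, -self.b / norm, self.d)
```

For example, `QuadraticNumber(1, 1, 2).inverse()` is the element −1 + √2, in line with (1 + √2)(−1 + √2) = 1.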

Fields of rational functions If K is any field and x is an indeterminate (a
variable), then we can consider the field of rational functions in x over K:

\[ K(x) = \left\{ \frac{f(x)}{g(x)} \;\middle|\; f, g \text{ are polynomials in } x \text{ with coefficients in } K,\ g \neq 0 \right\} \]

(where of course you can cancel factors in the usual way). If we weren’t so
focussed on fields, we might have started better by defining the ring of polynomials
in x over K (note the different, squarer brackets)

\[ K[x] = \left\{ a_0 + a_1 x + a_2 x^2 + \cdots + a_s x^s \;\middle|\; s \in \mathbb{Z}_{\ge 0},\ a_0, \dots, a_s \in K,\ a_s \neq 0 \right\} \cup \{0\}, \]

where my insistence that aₛ ≠ 0 is a convenient habit for later (so that I know
for certain that the polynomial has degree s), but does mean I have to remember
to include the zero polynomial at the end (which has degree −1, by convention).
It is easy to check that K[x] is an integral domain, and then K(x) is its field
of fractions:

K(x) = Frac(K[x]).

More generally, if x₁, x₂, . . . , xₙ are distinct indeterminates, then the ring of
polynomials in x₁, x₂, . . . , xₙ over K is

\[ K[x_1, x_2, \dots, x_n] = \left\{ \sum a_I x^I \;\middle|\; I \in \mathbb{Z}^n_{\ge 0},\ a_I \in K \right\}, \]

where I = (i₁, i₂, . . . , iₙ) ∈ Zⁿ≥0 is a multi-index of nonnegative integers and x^I
is shorthand for x₁^{i₁} x₂^{i₂} · · · xₙ^{iₙ}. It is easy to check that K[x₁, x₂, . . . , xₙ] is an
integral domain, and we define the rational function field in n variables over K
to be

K(x₁, x₂, . . . , xₙ) = Frac(K[x₁, x₂, . . . , xₙ]).

Finite fields You may know that Fp = Z/pZ is a field exactly when p is a
prime integer. This is a finite field and has exactly p elements. You can work
with it formally as cosets a + pZ, as the notation intends, or work informally
(but just as precisely) with the set {0, 1, 2, . . . , p − 1}, where it is understood
that you can work in Z while calculating, and then finish any computation by
taking the remainder after long division by p. You may also take remainders
at any stage during the calculation, and as doing so keeps numbers smaller you
probably do. To check that division is possible, suppose p > 0 is a prime
and let 1 ≤ a < p be an integer. Then a is necessarily coprime to p (that’s
where we’re using that p is prime), so by the Euclidean algorithm there are
integers u, v so that
ua + vp = 1.
Now if we choose the unique 1 ≤ b < p that satisfies b ≡ u mod p, that
Euclidean equation (taken mod p) shows that ba = 1, so b is the multiplicative
inverse of a.
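
The Euclidean argument is constructive, so it can be run. Here is a minimal Python sketch: `extended_gcd` returns integers u, v with ua + vb = gcd(a, b), and `inverse_mod_p` reduces u mod p to get the b of the text. The function names are illustrative choices.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    # g = u*b + v*(a % b) = v*a + (u - (a // b)*v)*b
    return g, v, u - (a // b) * v

def inverse_mod_p(a, p):
    """Multiplicative inverse of a in F_p, for a prime p and a not divisible by p."""
    g, u, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a is not invertible modulo p")
    return u % p
```

For example, `inverse_mod_p(3, 7)` returns 5, and indeed 3 × 5 = 15 ≡ 1 mod 7.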
Translated into the language of cosets, the arithmetic rules are

(a + pZ) + (b + pZ) = (a + b) + pZ,
(a + pZ) × (b + pZ) = ab + pZ,
−(a + pZ) = −a + pZ,
(a + pZ)⁻¹ = b + pZ, where ab ≡ 1 mod p;

you see that the only difference is the careful ‘ + pZ’ all over the place, which
means I can afford to be less careful to pin down b ∈ Z at the end.
Just in passing, note that there is a field with exactly 4 elements that we’ll
see later, but it is most certainly not R = Z/4Z, since that ring is not a field:
2 × 2 = 0 in R, so 2 ∈ R cannot have a multiplicative inverse; indeed, if
2 × b = 1 ∈ Z/4Z, then 2 = 2 × 1 = 2 × 2 × b = 0 × b = 0, which it isn’t.
More generally Z/nZ is not a field whenever n = ab is composite (with
1 < a, b < n): then a, b ≠ 0 ∈ Z/nZ, but ab = n = 0 ∈ Z/nZ, so neither a nor b can have a
multiplicative inverse. But this is all easy to sort out later in §12, and we will
be able to understand all finite fields then.

4.2 Ring homomorphisms, first isomorphism theorem


Remark 4.1. Please check the definitions of ring, subring and ideal from Al-
gebra 2. Here are some informal reminders and comments.
(a) Informally, a ring is a set with operations +, −, × (we usually abbreviate
a × b as a · b or ab), but it doesn’t necessarily permit division. By ring,
we will always mean commutative ring with 1. We denote the additive
identity of any ring by 0, so we don’t bother with notation like 0R and 0S
to distinguish the zero in two rings R and S respectively.
(b) If R is a ring, then a subring of R is a subset that contains 1 and is closed
under +, −, ×.
(c) If R is a ring, then a nonempty subset I ⊂ R is an ideal if it is closed
under addition and also under multiplication by elements of R: that is,

if a, b ∈ I and r ∈ R then also a + b ∈ I and ra ∈ I.

(d) If R is a ring and x an indeterminate, then the ring of all polynomials in x
with coefficients in R is denoted R[x]. We discuss this in detail in §4.5.
Definition 4.2. Let R and S be rings. A map ϕ : R → S is a ring homomor-
phism if for all a, b ∈ R, ϕ(a + b) = ϕ(a) + ϕ(b) and ϕ(ab) = ϕ(a)ϕ(b), and
also ϕ(1) = 1. We simply say homomorphism if it is clear that we refer to a
ring homomorphism (rather than, say, a group homomorphism.)
It follows that ϕ(0) = 0 and ϕ(−a) = −ϕ(a).
Example 4.3. Ring homomorphisms respect addition and multiplication, and
much follows at once from that. Suppose S is any ring.

(a) There is a unique homomorphism Z −→ S. Indeed 1 ↦ 1 ∈ S, and then
the image of any other integer is determined: a ↦ 1 + · · · + 1, a sum of
a ones in S. As a slight abuse of notation, we denote the image of a ∈ Z
also by a, but of course it is now a ∈ S. We use this in §4.3 below.

(b) If ϕ : R −→ S is a ring homomorphism, and x is an indeterminate, then ϕ naturally
extends to a ring homomorphism between polynomial rings over R and S,
which we also call ϕ:
ϕ : R[x] −→ S[x]
a₀ + a₁x + · · · + aₙxⁿ ⟼ ϕ(a₀) + ϕ(a₁)x + · · · + ϕ(aₙ)xⁿ
simply by applying ϕ to the coefficients while leaving the x’s alone.
(c) For any s ∈ S there is a ring homomorphism ϕₛ : Z[x] −→ S determined
by ϕₛ(x) = s. The axioms insist that ϕₛ(x²) = ϕₛ(x)² = s², ϕₛ(x³) = s³,
and so on. Also ϕₛ(axᵐ) = ϕₛ(a)ϕₛ(x)ᵐ = asᵐ (recalling the abuse of
notation ϕₛ(a) = a), and so ϕₛ extends to any polynomial f:
ϕₛ(f) = f(s)
where you could regard the right-hand side as treating f as an element of
an unmentioned polynomial ring S[x] and then evaluating it at s. This is
called the evaluation map (of x at s), because that’s what it is.
Ring homomorphisms work very naturally. If we hadn’t said any of those things,
you’d have done them correctly anyway.
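
To see the evaluation map doing what a homomorphism should, here is a small Python sketch with S = Z/nZ as an illustrative choice of target ring. Polynomials in Z[x] are stored as coefficient lists from degree 0 upwards; the helper names are made up for this sketch.

```python
def evaluate(f, s, n):
    """phi_s(f) = f(s) in S = Z/nZ, for f in Z[x] given by coefficients from degree 0 up."""
    value = 0
    for coefficient in reversed(f):      # Horner's rule
        value = (value * s + coefficient) % n
    return value

def poly_mul(f, g):
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product

def poly_add(f, g):
    length = max(len(f), len(g))
    f = f + [0] * (length - len(f))
    g = g + [0] * (length - len(g))
    return [a + b for a, b in zip(f, g)]
```

With f = x² + 1 stored as `[1, 0, 1]` and g = 3x − 2 stored as `[-2, 3]`, evaluating the product `poly_mul(f, g)` at any s agrees with multiplying the two separate evaluations mod n, which is the multiplicative axiom in action.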
The basic theorem is this:
Theorem 4.4. Let ϕ : R −→ S be a ring homomorphism. Then:
(a) ker ϕ = {r ∈ R | ϕ(r) = 0} ⊂ R is an ideal of R;
(b) Im ϕ ⊂ S is a subring of S;
(c) First Isomorphism Theorem: ϕ naturally induces an isomorphism

\[ \bar\varphi \colon R/\ker\varphi \xrightarrow{\;\cong\;} \operatorname{Im}\varphi, \qquad r + \ker\varphi \mapsto \varphi(r). \]

Proof. See Algebra 2. Alternatively, if you write out the things that would need
to be proved, you’ll see that each is a small exercise. Once you agree R/ ker ϕ
is a ring, the main points are to check that ϕ̄ is well defined and injective. □
Our main use of Theorem 4.4(c) is the following mild rephrasing.
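Corollary 4.5. Let ϕ : R −→ S be a ring homomorphism with kernel I ⊂ R. Then there is an injective
ring homomorphism ϕ̄ : R/I ↪ S so that the diagram commutes:

\[
\begin{array}{ccc}
R & \xrightarrow{\ \varphi\ } & S \\
\downarrow & \nearrow_{\bar\varphi} & \\
R/I & &
\end{array}
\]

where the vertical map is the natural one a ↦ a + I (that is, ϕ(a) = ϕ̄(a + I) for all a ∈ R).
We say ϕ factors through the quotient to give the inclusion R/I ↪ S,
which therefore gives an isomorphism of rings of R/I with the image of ϕ.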

4.3 Characteristic of a field
Lemma–Definition 4.6. Let K be any field. There is a unique ring homo-
morphism ι : Z −→ K. Setting ker ι = pZ, exactly one of the following cases
occurs:

• p = 0, and we say K has characteristic zero. In this case, K has a subfield
K₀ ⊂ K isomorphic to Q.
• p > 0 is a prime, and we say K has characteristic p. In this case, K has
a subfield K₀ ⊂ K isomorphic to Fp = Z/pZ.

In either case, K₀ ⊂ K is called the prime field of K, and any other subfield
L ⊂ K must contain K₀.
Thus the characteristic of K, denoted char K, is either 0 or a prime p > 0.
In the case p > 0, we may also say simply that K has positive character-
istic if we only care that p ≠ 0 and not its actual value.

Proof. If ι is injective (that is ker ι = {0} = 0Z), then K contains a copy of Z
and so also a copy of Q: any n and nonzero m lie in K, and so their quotient
n/m does too, since K is a field.
If ι is not injective, then it has a nonzero kernel I ⊂ Z. The kernel I is an ideal, and
so is of the form I = nZ for some integer n > 0 (by division with remainder, or
GCDs, or unique factorisation). Moreover, n must be a prime, say n = p > 0.
If not, then n = ab with a, b > 1, and so 0 = ι(ab) = ι(a)ι(b). But ι(a) ≠ 0 and
ι(b) ≠ 0 (a and b cannot be in ker ι, as they are smaller than n), and so they
are zerodivisors that cannot possibly have inverses in K, which is absurd in a field.
By the first isomorphism theorem, ι determines an isomorphism

\[ \bar\iota \colon \mathbb{Z}/\ker\iota \xrightarrow{\;\cong\;} \operatorname{Im}\iota \subset K. \]

In other words, ι determines an injective map from the quotient

\[ \bar\iota \colon \mathbb{F}_p = \mathbb{Z}/p\mathbb{Z} \hookrightarrow K. \]

For the final part, any subfield L ⊂ K must contain ι(Z), since it must contain 1,
and so must contain the field K₀ that it generates. □

For example Q, R, C, Q(x) and C(x₁, x₂) are all characteristic zero, since
they contain Q (or, equivalently, since adding 1 to itself repeatedly never gives 0).
The finite field Fp is characteristic p, as is, for example, the bigger field
Fp (t) of rational functions over Fp , since it clearly contains Fp as the subfield of
constant functions.
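
The "adding 1 to itself repeatedly" description can be run directly, at least for the rings Z/nZ. The sketch below uses a made-up function name and simply counts how many ones are needed to reach 0; for a prime n = p it recovers the characteristic of Fp. It says nothing about characteristic zero, where the loop would never stop.

```python
def characteristic_of_Z_mod(n):
    """Smallest k > 0 with 1 + ... + 1 (k times) = 0 in Z/nZ."""
    total, k = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        k += 1
    return k
```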

4.4 Automorphism group of a field: the main invariant
Let L be any field.
Definition 4.7. An automorphism of L is a bijective ring homomorphism
ϕ : L −→ L. (It follows easily that the inverse map is also a ring homomor-
phism.) The set of all automorphisms of L forms a group:

Aut(L) = {ϕ : L −→ L | ϕ is an automorphism of L}.

If K ⊂ L is any subfield, then a K-automorphism of L is an automorphism ϕ
that fixes K pointwise: ϕ(α) = α for all α ∈ K. The set of all K-automorphisms
of L forms a group:

Aut_K(L) = {ϕ : L −→ L | ϕ is a K-automorphism of L} ⊂ Aut(L).

Remark 4.8. If K₀ ⊂ L is its prime subfield and ϕ is any automorphism of
L, then ϕ(1) = 1, by definition, so ϕ fixes every element of K₀. That is, any
automorphism is automatically a K₀-automorphism, and Aut(L) = Aut_{K₀}(L).
Example 4.9. (a) Aut_R(C) = {id, z ↦ z̄} ≅ Z/2.

(b) Let L = Q(α) = {a + bα | a, b ∈ Q} where α = √2 ∈ R. Let ϕ ∈ Aut(L).
We know ϕ(q) = q for all q ∈ Q, but what about ϕ(α)? Well,

ϕ(α)² = ϕ(α²) = ϕ(2) = 2,

so ϕ(α) must be an element β ∈ L that satisfies β² = 2. There are
exactly two such elements in L: α and −α. The first of these gives the
identity map. The second is a + bα ↦ a − bα; you can check this is a ring
homomorphism, and later a simple lemma will guarantee it in many similar
cases. So Aut(L) = {id, ϕ} ≅ Z/2.
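
The "you can check this" in part (b) can be spot-checked by machine. The Python sketch below stores a + bα as the pair (a, b) with α² = 2 and tests, on a grid of sample elements, that a + bα ↦ a − bα respects addition and multiplication. The names are illustrative, and a finite sample is only a sanity check; the general verification is a two-line calculation by hand.

```python
from fractions import Fraction
from itertools import product

def mul(u, v):
    """(a + b*alpha)(c + d*alpha) with alpha^2 = 2, elements stored as pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def phi(u):
    """The candidate automorphism a + b*alpha -> a - b*alpha."""
    return (u[0], -u[1])

samples = [(Fraction(a), Fraction(b, 3)) for a, b in product(range(-2, 3), repeat=2)]
respects_structure = all(
    phi(add(u, v)) == add(phi(u), phi(v)) and phi(mul(u, v)) == mul(phi(u), phi(v))
    for u in samples for v in samples
)
```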

4.5 Polynomials in one variable, rings, ideals and all that


Let’s say once and for all what a polynomial is in this module; if you think it’s
all a bit formal, please compare with Bourbaki, but it is useful to be precise.

Polynomials. Let K be any field. Fix an indeterminate x. A polynomial


in x with coefficients in K is an expression

a0 + a1 x + · · · + as x^s (10)

where all ai ∈ K. The ai are called the coefficients of the polynomial, and
individually ai is called the coefficient of x^i . Each expression ai x^i in the
polynomial is called a term. We may abbreviate the expression (10) to

\[ \sum_{i=0}^{s} a_i x^i \quad\text{or even to}\quad \sum a_i x^i, \]

where in the latter it is understood that all but finitely many of the coefficients
ai are zero.
For example, 2 + 3x + 5x^3 is a polynomial, where it is understood that any
terms that do not appear have zero coefficient: in this case 0x^2 is missing, as
are any 0x^n for n > 3.
The powers x^i of x are independent quantities, with x^0 treated as 1 ∈ K
whenever convenient, and two polynomials are equal if and only if they have
the same coefficient of x^i for every i ≥ 0.
A daft but easy point: the zero polynomial is the unique polynomial which
has all coefficients ai = 0 for all i ≥ 0. We denote it 0, and note that 0 + 0x,
0 + 0x^3 − 0x^17 and so on are all the same zero polynomial. So if we say ‘let g ≠ 0
be a polynomial’, this simply says g is a polynomial with at least one nonzero
coefficient, and nothing more.
If f = ∑ ai x^i is a nonzero polynomial, then its degree, denoted deg(f ),
is the largest i ≥ 0 for which ai ≠ 0. By convention, the degree of the zero
polynomial is −1. Polynomials of degree ≤ 0 are called constant polynomials,
degree 1 are linear, degree 2 are quadratic, degree 3 are cubic, degree 4 are
quartic, degree 5 are quintic, and so on in Latin fashion.

The polynomial ring. The collection of all polynomials in x over K is a


ring, denoted K[x], with the natural operations +, − and ×. The ring K[x] is
an integral domain, that is, it has no zerodivisors: if f, g ∈ K[x] \ {0} then f g ≠ 0
(consider the coefficient of the highest degree term).
An element u ∈ K[x] is a unit if there is some v ∈ K[x] for which uv = 1.
(We may write v = u^−1 = 1/u if we’re sure u is a unit.) The subset of all
units of a ring R is denoted R^× , and clearly R^× is an abelian group under the
multiplication in R. Your proof that K[x] is an integral domain probably made
it clear that K[x]^× = K^× = K \ {0}, the set of nonzero constant polynomials.
An element f ∈ K[x] is called irreducible if there is no expression f = gh
of f as a product of elements g, h ∈ K[x] in which neither g nor h is a unit;
otherwise f is called reducible. Loosely speaking, a polynomial is reducible if
it factorises and irreducible if it does not factorise into polynomials of smaller
degree, but note that they must also have all their coefficients in K.

The division algorithm. We can do ‘division with remainder’ in the poly-


nomial ring K[x] over any field K. Please remind yourself how the calculation
works. The result is summarised as a theorem, from which everything we know
follows.
Theorem 4.10. Let K be a field and f, g ∈ K[x] any polynomials with f ≠ 0. Then there
exist unique polynomials q, r ∈ K[x] with

g = qf + r and deg r < deg f.
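As a reminder of how the calculation goes, here is a minimal sketch of division with remainder over K = Q. The representation (coefficient lists [a0, a1, ...], lowest degree first) and the names trim and poly_divmod are our own choices for illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(g, f):
    """Return (q, r) with g = q*f + r and deg r < deg f, over Q.

    Polynomials are coefficient lists [a0, a1, ...]; f must be nonzero.
    """
    f = trim([Fraction(c) for c in f])
    r = trim([Fraction(c) for c in g])
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = r[-1] / f[-1]          # kill the leading term of r
        q[shift] = coeff
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

# divide g = x^3 - 4x + 5 by f = x - 2
q, r = poly_divmod([5, -4, 0, 1], [-2, 1])
# q represents x^2 + 2x and r is the constant 5, which is g(2)
```

The only place the field is used is the division r[-1] / f[-1] by the leading coefficient of f, which is why the same loop fails in general over Z.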

For any f ∈ R, the subset

(f ) := {f r | r ∈ R}

is an ideal, and any such ideal is called a principal ideal, with f as its generator.
Principal ideals are closely related to division, since almost directly from the
definitions
f divides g ⇐⇒ g = hf for some h ⇐⇒ g ∈ (f ) ⇐⇒ (g) ⊂ (f ).
(The generator f of a principal ideal (f ) is not unique, but this division
statement shows at once that two principal ideals (f ) and (g) are equal if and only if
f = ug, where u ∈ R is a unit.)
In our context, R = K[x] is a principal ideal domain (PID), that is, every
ideal in K[x] is principal; in fact, something more precise is true.
Corollary 4.11. Let I ⊂ K[x] be an ideal. Then I is a principal ideal, and
exactly one of the following three cases occurs:
(a) I = {0} is the zero ideal;
(b) I contains a nonzero constant polynomial; in this case I = K[x];
(c) I = (f ) for some polynomial f ∈ K[x] with deg f ≥ 1.
Proof. See Algebra II, but simple to do. If I ≠ {0}, pick a nonzero element
f ∈ I of smallest degree. If deg(f ) = 0, then we are in case (b). If not, we claim
that I = (f ). Indeed, if g ∈ I, then long division of g by f yields g = qf + r,
where deg(r) < deg(f ). But r = g − qf ∈ I, so by the minimality of deg(f ) we
must have r = 0, and so g = qf ∈ (f ), as required.
Corollary 4.12. Let f, g ∈ K[x]. Then there exist h, a, b ∈ K[x] such that
af + bg = h.
The polynomial h is well defined up to a unit factor, and any such h satisfies
the conditions for being a greatest common divisor (≡ highest common factor),
so we denote h = gcd(f, g), with care about the unit if necessary.
The point is just that the ideal (f, g) ⊂ K[x] must be principal, so define h
by (f, g) = (h). The Euclidean algorithm constructs h as usual.
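The Euclidean algorithm, extended to keep track of a and b, can be sketched as follows over Q. The helper names (trim, add, mul, divide, bezout) and the choice to normalise h to be monic are ours; any other unit multiple of h would do equally well.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q for coefficient lists [a0, a1, ...]."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([x + sign * y for x, y in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def divide(g, f):
    """Division with remainder: g = q*f + r with deg r < deg f (f nonzero)."""
    q, r = [], trim(g)
    while len(r) >= len(f):
        term = [Fraction(0)] * (len(r) - len(f)) + [r[-1] / f[-1]]
        q = add(q, term)
        r = add(r, mul(term, f), sign=-1)
    return q, r

def bezout(f, g):
    """Return (h, a, b) with a*f + b*g = h and h a monic generator of (f, g)."""
    r0 = trim([Fraction(c) for c in f])
    r1 = trim([Fraction(c) for c in g])
    a0, a1 = [Fraction(1)], []
    b0, b1 = [], [Fraction(1)]
    while r1:
        q, r = divide(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, add(a0, mul(q, a1), sign=-1)
        b0, b1 = b1, add(b0, mul(q, b1), sign=-1)
    if r0:  # rescale by a unit so that h is monic
        lead = r0[-1]
        r0, a0, b0 = ([c / lead for c in poly] for poly in (r0, a0, b0))
    return r0, a0, b0

f = [-1, 0, 1]     # x^2 - 1
g = [2, -3, 1]     # x^2 - 3x + 2
h, a, b = bezout(f, g)   # h represents x - 1
```

Here a = 1/3 and b = −1/3, and indeed (1/3)(x^2 − 1) − (1/3)(x^2 − 3x + 2) = x − 1.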

Quotients of polynomial rings and fields. An ideal I ⊂ R is maximal
if I ≠ R and the only ideal that strictly contains I is R itself. If f ∈ K[x]
factorises as f = gh, for polynomials g, h of degree ≥ 1, then (f ) ⊂ (g) ⊂ K[x]
are strict inclusions, so in this case (f ) is not a maximal ideal.
Proposition 4.13. Let K be a field and f ∈ K[x]. The following are equivalent:
(a) f is an irreducible polynomial;
(b) (f ) ⊂ K[x] is a maximal ideal;
(c) K[x]/(f ) is a field.
Proof. See Algebra II, but again possible to do after calm expression of what
needs to be proved. 

5 Factorisation in polynomial rings
The polynomial ring K[x] over any field K is a unique factorisation domain.
That is, any polynomial f ∈ K[x] admits a factorisation

f = h1 h2 · · · hs

into irreducible polynomials hi ∈ K[x], where the hi are unique up to the order
in which we wrote them down, and up to multiplication by units. (That last
point captures, for example, that (−h1 )(−h2 ) = h1 h2 is regarded as the same
factorisation ‘up to units’.)
The harder question is when is a polynomial f ∈ K[x] irreducible. The
answer can be subtle, and depends on the field K.

5.1 Roots of polynomials


If f ∈ K[x] then α ∈ K is a root of f if f (α) = 0. Another corollary of the
division algorithm:
Corollary 5.1. Let f ∈ K[x]. Then the following are equivalent:
• α ∈ K is a root of f ;
• x − α divides f (as elements of K[x]);
• there is a polynomial g ∈ K[x] with f = (x − α)g.
Proof. Dividing f by (x − α) with remainder gives f = (x − α)g + r for some
polynomial g and some r, which must be a constant since deg r < deg(x−α) = 1.
Now setting x = α shows that r = f (α), so r = 0 exactly when α is a root of f .

5.2 Factorisation in C[x] and Q[x]


By the Fundamental Theorem of Algebra, any polynomial f ∈ C[x] factorises
into linear factors:

f = c(x − α1 )(x − α2 ) · · · (x − αn )

for c, αi ∈ C; the factors x − αi correspond to the roots αi of f .


But polynomials in Q[x] do not necessarily factorise in that way if we demand
that all the factors lie in Q[x]. For example x^2 + 1 ∈ Q[x] cannot factorise into
linear factors in Q[x]; in fact it is irreducible. How to see that? If you can work
fluently in C, then you could regard x^2 + 1 as an element of the bigger ring C[x],
factorise it there as
x^2 + 1 = (x − i)(x + i)
and then observe that the roots ±i do not lie in Q, so the factorisation could
not possibly take place in Q[x].
Factorisation in Z[x] is a bit easier, so first we show that it is equivalent to
factorisation in Q[x] and then see some simple tools that help in Z[x].

(Non-examinable) Remark We extend all the definitions so far to the ring
Z[x] of polynomials with integer coefficients. Its units are simply ±1, irre-
ducibility is defined in exactly the same way, and it too is a unique factorisation
domain. The main difference is that it is not a principal ideal domain: for exam-
ple I = (2, x) ⊂ Z[x] cannot be generated by a single integral polynomial. But
that doesn’t matter to us here: everything here is elementary, and we proceed
without using any formal definitions. 
Lemma 5.2 (Gauss’s Lemma). Let f = a0 + a1 x + · · · + an x^n ∈ Z[x] be a
polynomial with n = deg f > 0. Suppose in addition that gcd(a0 , . . . , an ) = 1.
If f = gh for g, h ∈ Q[x] with deg g > 0 and deg h > 0, then there is some
b ∈ Q∗ with bg ∈ Z[x] and b^−1 h ∈ Z[x].
This is completely elementary. It captures the hope that if you can factorise
an integer polynomial over Q, such as x^2 − 4x + 3 = ((2/5)x − 2/5)((5/2)x − 15/2),
then you could move any nontrivial common denominators between factors so
they cancel, and so you would have a factorisation over Z too.
Proof. Let a, b ∈ Q such that g = ag1 and h = bh1 with g1 , h1 ∈ Z[x], where
there is no common factor of all the coefficients of g1 , and similarly for h1 .
(Simply choose a to be c/d, where d ∈ Z is the least common multiple of the
denominators of the coefficients of g and then c is the highest common factor
of the coefficients of dg—at the blackboard you’d say let’s clear a common
denominator and then pull out the common factor of the coefficients.)
So f = gh = (ab)g1 h1 . Let ab = r/s ∈ Q in least terms with s > 0. If s = 1,
then we are done: r divides all coefficients of f so r = ±1 by hypothesis and
thus a = ±1/b. So suppose not and pick a prime factor p of s. Since p does not
divide r, it must divide every coefficient of the product g1 h1 , as the coefficients
of f are integers. We claim that p divides all the coefficients of either g1 or h1 ;
but that contradicts how we chose g1 and h1 , so we are done.
The claim is immediate once I bother to write some notation. Let g1 = ∑ bi x^i
and h1 = ∑ ci x^i and suppose, for contradiction, that bj is the least
index coefficient of g1 not divisible by p and ck is the least index coefficient
of h1 not divisible by p. Then the coefficient of x^N , where N = j + k, in the
product g1 h1 is

b0 cN + b1 cN−1 + · · · + bj−1 ck+1 + bj ck + bj+1 ck−1 + · · · + bN c0

and p divides every term except bj ck (as it divides all bi with i < j and all ci
with i < k). So p does not divide the coefficient of x^N in g1 h1 , a contradiction.
The following corollary is usually cited as Gauss’s Lemma.

Corollary 5.3. If f is irreducible in Z[x], then it is irreducible in Q[x].


The only point is that if f had a common prime integer factor then it would
not be irreducible in Z[x], and so we only have to consider those f that satisfy
the hypotheses of the lemma.

(Non-examinable) Remark If you’re not impressed, think of it this way. If
f ∈ Z[x] has integral coefficients and a ∈ Z is an integer root does that mean
that f factorises in Z[x] with a factor x − a? Of course it has no right to: the
division algorithm cannot work in the same way in Z[x] as it does in K[x] with
coefficients in a field (otherwise Z[x] would be a PID, which it’s not), so your
first thought of a proof cannot work. But if you try to build a counterexample,
you’ll see how any denominator in a factor conspires to find its way back to f .
The neat proof of Gauss’s Lemma grabs hold of that runaway denominator. 

5.3 Factorisation in Z[x]


5.3.1 A test for rational roots
Start with a lemma of no practical use for us.
Lemma 5.4. Let f ∈ Z[x] and suppose a ∈ Z is a root of f . Then a must
divide the constant term of f .

Proof. For a moment, regard f as an element of Q[x]. Since a is a root of f , we


have f = (x − a)g for some g ∈ Q[x]. Now Gauss’s lemma gives b ∈ Q∗ with
b(x − a) ∈ Z[x] and g/b ∈ Z[x]. By the first of these, clearly b is actually an
integer, and therefore, by the second, g had integer coefficients all along. Thus
the constant term of f is the (integral!) constant term of g times −a, and we
are done. 
If you understood that, then the following very useful proposition is obvious.
Proposition 5.5. Let f = a0 + · · · + an x^n ∈ Z[x], an ≠ 0, and suppose r/s ∈ Q
is a root of f in Q expressed in least terms. Then r divides a0 and s divides an .

Proof. For a moment, regard f as an element of Q[x]. Since r/s is a root of f ,


we have f = (x − r/s)g = (sx − r)(g/s) for some g ∈ Q[x]. Now Gauss’s lemma
gives b ∈ Q∗ with

b(sx − r) ∈ Z[x] and g/(sb) ∈ Z[x].

By the first of these, clearly b is actually an integer, as r and s are coprime.


Therefore, by the second, g/s has integer coefficients. Thus the highest degree
term of f = (sx − r)(g/s) is sx times the highest degree term of g/s, while the
constant term is −r times the constant term of g/s, and we are done. 
Example 5.6. (a) Is the polynomial f = x^3 − 4x + 5 irreducible in Q[x]?
If not, it must have a linear factor, as it’s a cubic, so there would be a
rational root r/s. By the proposition, the denominator s divides 1 and r
divides 5, so s ∈ {−1, 1} and r ∈ {−5, −1, 1, 5}. But you quickly check
that f (r/s) ≠ 0 for all of these 8 possible r/s (with some repetition), and
so f has no rational root and is therefore irreducible over Q.

(b) Is the polynomial f = 2x^3 + 3x^2 + 4x + 6 irreducible in Q[x]? I have no idea.
If it is reducible it must have a linear factor, as it’s a cubic, so there would
be a rational root r/s. By the proposition, s divides 2 and r divides 6. So
we may assume s = 1 or s = 2 and r ∈ {−6, −3, −2, −1, 1, 2, 3, 6}. You
can simply check all 12 possible r/s. If none of them is a root, then f is
irreducible.
(Spoiler: now that the proposition limited the possibilities for a rational
root, you’ll find soon enough that f (−3/2) = 0, and so f = 2(x + 3/2)(x^2 + 2).)
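The search in Example 5.6 is mechanical enough to hand to a machine. A minimal sketch, assuming integer coefficient lists [a0, ..., an] with a0 ≠ 0 and an ≠ 0 (the function names are ours):

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate a0 + a1 x + ... + an x^n by Horner's rule."""
    total = Fraction(0)
    for c in reversed(coeffs):
        total = total * x + c
    return total

def rational_roots(coeffs):
    """Rational roots of an integer polynomial with a0 != 0 and an != 0.

    By Proposition 5.5 every root r/s in least terms has r | a0 and s | an,
    so only finitely many candidates need to be tested.
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * r, s)
                  for r in divisors(a0) for s in divisors(an) for sign in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)

roots_a = rational_roots([5, -4, 0, 1])   # x^3 - 4x + 5: no rational roots
roots_b = rational_roots([6, 4, 3, 2])    # 2x^3 + 3x^2 + 4x + 6: the root -3/2
```

For a cubic, an empty list means irreducible over Q; for higher degree it only rules out linear factors.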

5.3.2 Eisenstein’s criterion for irreducibility


The elementary coefficient divisibility considerations in Gauss’s Lemma prove
more: a sufficient condition for being irreducible, the Eisenstein criterion. It
looks very specialised, and is, but it helps enormously when we want to make
examples.
Theorem 5.7. Let f = a0 + · · · + an x^n ∈ Z[x] and suppose there is a prime
p ∈ Z such that
• p does not divide an ;
• p divides each of a0 , . . . , an−1 ;
• p^2 does not divide a0 .
Then f is irreducible over Q.

Proof. Suppose not, and that f = gh is a factorisation into polynomials of
degree ≥ 1. By Gauss’s Lemma we may suppose g, h ∈ Z[x]. Let g = ∑ bi x^i
and h = ∑ ci x^i . Then a0 = b0 c0 .
Since p divides a0 but p^2 does not, we may assume p divides b0 but does not
divide c0 .
Since p does not divide an , it cannot divide every coefficient of g. Let bj be
the smallest index coefficient of g that p does not divide. Since deg h ≥ 1 we
have j ≤ deg g < n, so p divides aj . But now

aj = b0 cj + b1 cj−1 + · · · + bj−1 c1 + bj c0

and p divides aj and every term on the right except possibly bj c0 . So p must
divide bj c0 , but that’s a contradiction as we already agreed it doesn’t divide
either factor.

Example 5.8. (a) Let f = x^5 − 10x + 5, the quintic we mentioned in the
introduction that we will show has no solution in radicals. The prime p = 5
satisfies the hypotheses of Eisenstein’s criterion, and so f is irreducible
over Q.
(b) The polynomial f = (1/2)x^3 + x^2 − (4/3)x + 5/9 does not have Z coefficients,
but we can apply Eisenstein with the prime p = 2 to 18f = 9x^3 + 18x^2 − 24x + 10 ∈ Z[x],
and so 18f is irreducible over Q, and so the original f is irreducible over Q too.
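Checking the three hypotheses of Eisenstein's criterion is pure divisibility, so a sketch is short. We assume integer coefficients [a0, ..., an] with n ≥ 1 and that p really is prime (the function does not verify this); for part (b) we take 18f = 9x^3 + 18x^2 − 24x + 10, that is f = (1/2)x^3 + x^2 − (4/3)x + 5/9.

```python
def eisenstein_applies(coeffs, p):
    """True if the prime p satisfies Eisenstein's hypotheses for [a0, ..., an].

    Assumes p is prime and the list has length at least 2.
    A False answer says nothing about irreducibility.
    """
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0
            and all(a % p == 0 for a in coeffs[:-1])
            and a0 % (p * p) != 0)

quintic = [5, -10, 0, 0, 0, 1]     # x^5 - 10x + 5
scaled_cubic = [10, -24, 18, 9]    # 9x^3 + 18x^2 - 24x + 10

quintic_ok = eisenstein_applies(quintic, 5)
cubic_ok = eisenstein_applies(scaled_cubic, 2)
cubic_fails_at_3 = not eisenstein_applies(scaled_cubic, 3)   # 3 divides the leading coefficient
```

The criterion is only sufficient: the last line shows a prime for which the test fails on a polynomial that is nevertheless irreducible.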

Example 5.9. Prime cyclotomic polynomials. Let p > 0 be a prime, and
consider
f = x^(p−1) + x^(p−2) + · · · + x + 1.
Then in fact f is irreducible over Q. That does not follow directly from
Eisenstein’s criterion, but there is a little trick that extends Eisenstein’s applicability.
Note first that f (x) = (x^p − 1)/(x − 1). So setting x = 1 + y gives

\[
\begin{aligned}
f(x) = f(1+y) &= \frac{(1+y)^p - 1}{y} \\
&= \frac{1}{y}\left( py + \frac{p!}{2!\,(p-2)!}\,y^2 + \frac{p!}{3!\,(p-3)!}\,y^3 + \cdots + p\,y^{p-1} + y^p \right) \\
&= p + \frac{p!}{2!\,(p-2)!}\,y + \frac{p!}{3!\,(p-3)!}\,y^2 + \cdots + p\,y^{p-2} + y^{p-1}.
\end{aligned}
\]

Eisenstein applies with prime p to the last polynomial f (1+y) ∈ Z[y] (the prime
p in the binomial numerators cannot cancel since the denominator factors are
too small), so that is irreducible over Q. But any factorisation f = gh would
give a factorisation f (1 + y) = g(1 + y)h(1 + y), so the original f is irreducible
over Q too.
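For small primes the substitution can be checked directly: the coefficient of y^(k−1) in f (1 + y) is the binomial coefficient p!/(k!(p − k)!). A minimal sketch (function names are ours; math.comb needs Python 3.8 or later):

```python
from math import comb

def shifted_cyclotomic(n):
    """Coefficients of ((1 + y)^n - 1)/y, lowest degree first.

    The coefficient of y^(k-1) is the binomial coefficient C(n, k), k = 1..n.
    """
    return [comb(n, k) for k in range(1, n + 1)]

def eisenstein_applies(coeffs, p):
    """Eisenstein's hypotheses for [a0, ..., an] at a prime p (assumed prime)."""
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0
            and all(a % p == 0 for a in coeffs[:-1])
            and a0 % (p * p) != 0)

results = {p: eisenstein_applies(shifted_cyclotomic(p), p)
           for p in (2, 3, 5, 7, 11, 13)}

# For comparison, n = 4 is not prime: x^3 + x^2 + x + 1 = (x + 1)(x^2 + 1),
# and the shifted coefficients [4, 6, 4, 1] fail the test at p = 2.
composite_case = eisenstein_applies(shifted_cyclotomic(4), 2)
```

Every entry of results is True, while composite_case is False because 4 divides the constant term.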

5.3.3 Factorisation modulo a prime


Proposition 5.10. Let f = a0 + · · · + an x^n ∈ Z[x] and let p ∈ Z be a prime
that does not divide an . Define

f̄ = ā0 + ā1 x + · · · + ān x^n ∈ Fp [x]

where āi ∈ {0, 1, . . . , p − 1} denotes the least residue of ai modulo p (that is, the
remainder on dividing ai by p).
If f̄ is irreducible in Fp [x], then f is irreducible in Q[x].
Proof. Suppose f = gh is a factorisation into polynomials of degree ≥ 1; by
Gauss’s Lemma we may assume that g, h ∈ Z[x]. Then f̄ ∈ Fp [x] has degree n
since ān ≠ 0. As an is the product of the leading coefficients of g and h, neither
of those is divisible by p, so reducing modulo p preserves the degrees of g and h.
So f̄ = ḡ h̄ is a nontrivial factorisation in Fp [x].

Example 5.11. Consider f = x^3 + ax + b for odd a, b ∈ Z. Working modulo
p = 2 gives f̄ = x^3 + x + 1 ∈ F2 [x]. Since it is a cubic, it is either irreducible
over F2 or it has a root in F2 . But neither element of F2 = {0, 1} is a root, so
it is irreducible. Thus the original f is irreducible in Q[x].
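For quadratics and cubics, irreducibility over Fp is the same as having no root in Fp, so Example 5.11 can be sketched as a finite search. The names below are ours, and the shortcut is only valid in degree 2 or 3.

```python
def reduce_mod(coeffs, p):
    """Reduce integer coefficients [a0, ..., an] to least residues modulo p."""
    return [c % p for c in coeffs]

def has_root_mod(coeffs, p):
    """Test every element of F_p as a root, using Horner's rule modulo p."""
    for x in range(p):
        value = 0
        for c in reversed(coeffs):
            value = (value * x + c) % p
        if value == 0:
            return True
    return False

def low_degree_irreducible_mod(coeffs, p):
    """For degree 2 or 3 with leading coefficient not divisible by p:
    irreducible over F_p exactly when there is no root in F_p."""
    reduced = reduce_mod(coeffs, p)
    if len(reduced) not in (3, 4) or reduced[-1] == 0:
        raise ValueError("need degree 2 or 3 and p not dividing the leading coefficient")
    return not has_root_mod(reduced, p)

# x^3 + a x + b for a few odd a, b (arbitrary sample values)
examples = [[b, a, 0, 1] for a in (-3, 1, 7) for b in (-5, 1, 9)]
all_irreducible = all(low_degree_irreducible_mod(f, 2) for f in examples)

# with b even the reduction x^3 + x has the root 0, so the test is inconclusive
even_case = low_degree_irreducible_mod([2, 1, 0, 1], 2)
```

A False answer modulo one prime proves nothing about Q[x]; one may try another prime or another method.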

6 Field extensions I
Recall that a homomorphism ϕ : K −→ L between fields is simply a ring homo-
morphism: thus ϕ respects addition and multiplication and takes 1 to 1.
Proposition 6.1. Let K and L be fields and ϕ : K −→ L a homomorphism.
Then ϕ is injective.
Proof. Quicker to do this one yourself, but here goes. Suppose a, b ∈ K satisfy
ϕ(a) = ϕ(b). So ϕ(a − b) = ϕ(a) − ϕ(b) = 0. So we show that the only element
c ∈ K that maps to 0 ∈ L is c = 0. Well, if c ∈ K \ {0} and ϕ(c) = 0, then
ϕ(1) = ϕ(cc^−1 ) = ϕ(c)ϕ(c^−1 ) = 0, which contradicts ϕ(1) = 1.
Simplifying Remark + Notation 6.2. Proposition 6.1 says that whenever
we have a homomorphism between fields we are in precisely the situation

ϕ : K −→ K′ = ϕ(K) ⊂ L

where we use the notation K′ ⊂ L to denote the image, so that K′ ≅ K is a
subfield of L isomorphic to K (or K′ is a copy of K in L, as one might say).
Because of this, we frequently identify K with its image ϕ(K) ⊂ L, and
therefore treat the homomorphism as a subset inclusion K ⊂ L. We will do
that as an automatic reaction (writing K in place of K′ ) almost all the time.
Just occasionally we need to be careful, and then we can wheel out ϕ and K′ ,
but those places are clear in practice. We will flag them explicitly the first few
times. This turns out to be a great simplifying use of notation, and once we get
going it is not at all as convoluted or dishonest as it sounds.
In fact, you’ve done this all your life. I bet you write Q ⊂ R, when you prob-
ably mean ϕ : Q → R where ϕ(r/s) is the class of Cauchy sequences equivalent
to the constant sequence (r/s, r/s, . . . ).
It’s worth noting why we might sometimes need to be careful. There may be
more than one homomorphism from K to L, and they may have the same image
or different images, and knowing how K maps into L may be the point. When
that happens we will distinguish them carefully. If K = Q and char L = 0 then
there is a unique homomorphism so at least writing Q ⊂ L is unambiguous.
Definition 6.3. A field extension is a ring homomorphism ϕ : K −→ L of fields
K and L. In particular, any inclusion of fields K ⊂ L is a field extension, where
ϕ is the identity inclusion map (and so we don’t mention it).
To repeat the warning, it is tempting to think of all field extensions as subset
inclusions K ⊂ L, and we will do that whenever it is safe to do so, but we will
have to be more careful once we start to find more than one map K ↪ L.
As first familiar examples, Q ⊂ R and R ⊂ C are field extensions, as is their
composition Q ⊂ C. We will see many more interesting ones soon.
Notation 6.4. We often write L/K to denote a field extension, where the map
ϕ is implicit. This is just a standard notation, and is nothing to do with quotient
rings: K is never an ideal inside L unless K = L.

6.1 Extensions as vector spaces
We often treat C as a vector space over R. That’s how we picture C, with a
basis {1, i} spanning real and imaginary axes, as in Figure 1. In particular, C
is 2-dimensional as an R-vector space. It works because R ⊂ C, so doing scalar
multiplication from R is automatic. One of the basic ideas is that something
similar works for any field extension.
Fundamental Observation 6.5. If L/K is a field extension, then L is a vector
space over K.
Why? Addition and subtraction already work in L, so the point is the
scalar multiplication of elements of L by elements of K. When K ⊂ L, the
scalar multiplication from K is simply multiplication as elements of L. If more
generally the extension is given by a map ϕ : K → L, then we multiply v ∈ L by
α ∈ K via ϕ by defining α · v := ϕ(α)v, where multiplication on the right-hand
side is the multiplication inside L. That is, we do scalar multiplication by using
the copy K′ = ϕ(K) of K that lies in L. (We need to check that all the axioms
of a vector space hold, and you can do that easily and should.)
Definition 6.6. If L/K is an extension, then the degree of L/K, denoted
[L : K], is defined to be
[L : K] = dimK L
that is, the dimension of L as a K-vector space. Thus the degree is a positive
integer or ∞. Note that [L : K] = 1 if and only if L = K.
We say L/K is finite if [L : K] < ∞, otherwise L/K is infinite.
This definition forgets an enormous amount about what makes L a field, but
it illuminates something coarse but crucial about the relationship between K
and L that may otherwise be hidden. For example, is the following statement
immediately obvious to you before you consider K as a vector space over Fp ?
Proposition 6.7. If K is a finite field, then it has char(K) = p > 0 a prime,
and #K = p^n for some n ∈ N.
Proof. Z → K cannot be injective, since K is finite, so the prime field must be
Fp ⊂ K for some prime p > 0. So K is a finite-dimensional vector space over
Fp , and after choosing any basis its elements are simply all vectors (a1 , . . . , an )
with ai ∈ Fp , where n = [K : Fp ], and there are obviously p^n of those.
This leaves the more subtle question of whether finite fields of size p^n exist
(they do; see §12), or, put another way, can we extend the multiplication in Fp
to a vector space, in a compatible way with that vector space structure?
Remark 6.8. So, for example, there cannot be a field with 6 elements. How
else might you prove it? If you try to think of rings with 6 elements, you’ll
immediately try Z/(6), but of course that’s not a field because 2 × 3 = 0 in it.
But that by itself is pretty weak evidence. After all, Z/(4) is not a field, but
with a bit of persistence you could write addition and multiplication tables on

a set of 4 elements {0, 1, a, b} that have inverses and so determine a field. Try
that, and then try with 6 elements (which must fail by the proposition), and
decide whether your determined failure would be enough to convince you.
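If you would rather check your tables than trust them, here is one way to sketch a field with 4 elements. We assume the model F2[x]/(x^2 + x + 1), writing elements as pairs (c0, c1) for c0 + c1 a with a^2 = a + 1, so the b of the remark is a + 1; this particular encoding is our choice.

```python
from itertools import product

# Pairs (c0, c1) stand for c0 + c1*a with c0, c1 in F_2 and a^2 = a + 1.
ELEMENTS = list(product((0, 1), repeat=2))
ZERO, ONE = (0, 0), (1, 0)

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    # (c0 + c1 a)(d0 + d1 a) = c0 d0 + (c0 d1 + c1 d0) a + c1 d1 a^2, and a^2 = a + 1
    c0, c1 = u
    d0, d1 = v
    return ((c0 * d0 + c1 * d1) % 2, (c0 * d1 + c1 * d0 + c1 * d1) % 2)

every_nonzero_has_inverse = all(
    any(mul(u, v) == ONE for v in ELEMENTS) for u in ELEMENTS if u != ZERO)

distributive = all(
    mul(u, add(v, w)) == add(mul(u, v), mul(u, w))
    for u in ELEMENTS for v in ELEMENTS for w in ELEMENTS)

# Contrast: pairs of nonzero classes in Z/(6) whose product is zero.
zero_divisors_mod_6 = [(x, y) for x in range(1, 6) for y in range(1, 6)
                       if (x * y) % 6 == 0]
```

The last list only shows that Z/(6) is not a field; ruling out every possible multiplication on 6 elements is what Proposition 6.7 does for you.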
The first main theorem is easy, but amazingly powerful—it will solve prob-
lems in an instant that you would have no idea how to calculate otherwise.
When you read it, have in mind the situation of three fields K ⊂ L ⊂ M , even
though it is phrased more generally in terms of field extensions.
Theorem 6.9 (The Tower Law). If M/L and L/K are field extensions, then
M/K is a field extension and if M/L and L/K are both finite then

[M : K] = [M : L][L : K]

If either of M/L or L/K is infinite, then so is M/K.


Proof hint: Use a K-basis of L and an L-basis of M to concoct a K-basis of
M , and then count it.
It is enough to consider the case K ⊂ L ⊂ M , and so we do.
Proof. First, K ⊂ M so M is a vector space over K and so L ⊂ M is a K-vector
subspace of M . So if [M : K] is finite, then (a) [L : K] is finite too (by linear
algebra you can extend a K-basis of L to a K-basis of M ), and (b) [M : L] is
finite (any K-basis of M certainly spans over L). That takes care of the final
claim about infinite extensions.
So now consider finite extensions K ⊂ L and L ⊂ M . Pick a K-basis
a1 , . . . , an of L and an L-basis b1 , . . . , bm of M . We claim that the nm products
aj bi (taken inside M , remembering for a moment that it’s a field) form a K-basis
of M , and so we are done.
Consider the set B = {aj bi | 1 ≤ j ≤ n, 1 ≤ i ≤ m}. First we prove that B
spans M over K and then that its elements are K-linearly independent. (With
this hint, it is probably quicker now for you to complete the proof yourself.)
For spanning, suppose given e ∈ M . Since {bi } is an L-basis of M , there
are di ∈ L so that e = d1 b1 + · · · + dm bm . Now each di ∈ L, and since {aj } is
a K-basis of L there are cij ∈ K so that di = ci1 a1 + · · · + cin an . So we can
express e as a K-linear combination of the elements of B, as required: viz.

\[ e = \sum_{i=1}^{m} d_i b_i = \sum_{i=1}^{m} \Bigl( \sum_{j=1}^{n} c_{ij} a_j \Bigr) b_i = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij}\,(a_j b_i) \]

For linear independence, suppose

c11 a1 b1 + c12 a2 b1 + · · · + cm−1,n an bm−1 + cmn an bm = 0

Collecting coefficients of each bi together rewrites this as

\[ (c_{11} a_1 + c_{12} a_2 + \cdots + c_{1n} a_n)\, b_1 + \cdots + \Bigl( \sum_{j=1}^{n} c_{mj} a_j \Bigr) b_m = 0 \]

Since {bi } is an L-basis of M , each of the coefficients must be zero. Just looking
at the coefficient of b1 , that means

c11 a1 + c12 a2 + · · · + c1n an = 0

Since {aj } is a K-basis of L, each of the coefficients must be zero: that is,
c11 = · · · = c1n = 0. The same holds for the coefficients of b2 , . . . , bm , and thus
all cij = 0, which is what we had to prove. 
Definition 6.10. If K ⊂ L ⊂ M are fields, then L is said to be an interme-
diate field of K ⊂ M .

We could imagine phrasing this definition more generally for field extensions
K −→ L −→ M , but in the context we work that is not helpful. Of course,
K = L or L = M are allowed as intermediate fields.

6.2 Adjoining a square root to a subfield of C


This section is wholly subsumed by later sections so you could skip it, but it
explains a useful particular case that we see a lot.
Let K ⊂ C and suppose s ∈ K is not a square of another element in K:
thus, choosing α = √s ∈ C to be either square root of s, we have α ∉ K.
What is the smallest subfield L ⊂ C that contains both K and α? The
answer is simple: just write down all possible expressions using α and elements
of K, joined together with the four field operations of C. Thus, after collecting
all terms over a common denominator, an element ξ of L is of the form

\[ \xi = \frac{p(\alpha)}{q(\alpha)} \quad\text{where } p, q \in K[x] \text{ and } q(\alpha) \neq 0 \]

and conversely the collection of all such expressions is evidently a field (by the
usual operations with fractions) so must equal L.
But we can be much more economical in the way we describe elements of L.
In the first place, there is no point ever writing α^2 since we may replace it by s
wherever it appears. So we can write an element ξ of L in the form

\[ \xi = \frac{a + b\alpha}{c + d\alpha} \quad\text{where } a, b, c, d \in K \text{ and } (c, d) \neq (0, 0) \]

If α had been i = √−1, then you’d know how to use complex conjugation to
simplify this further. An analogous trick works for any α: multiply top and
bottom by the Galois conjugate c − dα of the denominator of ξ to give

\[
\begin{aligned}
\xi = \frac{a + b\alpha}{c + d\alpha} \times \frac{c - d\alpha}{c - d\alpha}
  &= \frac{ac - bds}{c^2 - d^2 s} + \left( \frac{bc - ad}{c^2 - d^2 s} \right) \alpha \\
  &= A + B\alpha \quad\text{for some } A, B \in K
\end{aligned}
\]

where you notice that the denominators could not possibly vanish since s is not
a square in K. So, relabelling, we can write L more cleanly as

L = {a + bα | a, b ∈ K}

which makes it clear that {1, α} is a basis of L over K, and

[L : K] = dimK L = 2

We will use the notation K(α) to denote the field L for ever more.

Example 6.11. Consider Q ⊂ R and α = √2. Then Q(α) is the subfield of
R consisting of all combinations of rational numbers and √2. The discussion
above says that

\[ \mathbb{Q}(\sqrt{2}) = \{\, a + b\sqrt{2} \mid a, b \in \mathbb{Q} \,\} \]

To see the discussion above in action, observe that we never write (√2)^2 as we
could write 2 instead, so elements of Q(√2) are naturally expressions such as

\[ \gamma = \frac{3 + 5\sqrt{2}}{7 + 11\sqrt{2}} \]

Multiplying numerator and denominator by 7 − 11√2 shows that

\[ \gamma = \frac{(3 + 5\sqrt{2})(7 - 11\sqrt{2})}{(7 + 11\sqrt{2})(7 - 11\sqrt{2})} = \frac{21 - 110}{49 - 242} + \left( \frac{35 - 33}{49 - 242} \right) \sqrt{2} = \frac{89}{193} - \frac{2}{193}\sqrt{2} \]

That’s all there is to it!
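The conjugate trick is a formula, so it can be sketched as a function and used to check the arithmetic for γ = (3 + 5√2)/(7 + 11√2). We store a + b√s as the pair (a, b) and assume s is a rational number that is not a square in Q; the name divide is ours.

```python
from fractions import Fraction

def divide(num, den, s=2):
    """Return (A, B) with (a + b√s)/(c + d√s) = A + B√s.

    num = (a, b) and den = (c, d) are pairs of rationals, with (c, d) != (0, 0)
    and s assumed not to be a square in Q, so the norm below is nonzero.
    """
    a, b = map(Fraction, num)
    c, d = map(Fraction, den)
    norm = c * c - d * d * s          # (c + d√s)(c − d√s)
    return ((a * c - b * d * s) / norm, (b * c - a * d) / norm)

gamma = divide((3, 5), (7, 11))       # the element γ of the example

# multiplying back by the denominator should recover the numerator 3 + 5√2
A, B = gamma
recovered = (A * 7 + 2 * B * 11, A * 11 + B * 7)
```

With these inputs the norm is 49 − 242 = −193, which is where the denominators 193 come from.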



Example 6.12. Let K = Q(√2). I claim that 3 is not a square in K.
If 3 were a square in K, then by the discussion above it would have the
form 3 = (a + b√2)^2 for some a, b ∈ Q. Expanding the bracket shows that
2ab√2 = 3 − a^2 − 2b^2 is rational: but that’s only true if a = 0 or b = 0, in which
case either 3/2 or 3 is a square in Q, which you know they are not.
So let L be the smallest subfield of C containing K and √3. (That’s the
same as saying L is the smallest subfield of C that contains both √2 and √3.)
What is [L : Q]? As √2 ∉ Q, the discussion above shows that [K : Q] = 2.
And since we know that √3 ∉ K, it also shows that [L : K] = 2. Now we
conclude by the Tower Law

[L : Q] = [L : K][K : Q] = 2 × 2 = 4

Looking at this concretely, following the proof of the Tower Law, any element
ξ ∈ L is of the form

ξ = a + b√3 for some a, b ∈ K

and elements of K similarly have the form

a = a0 + a1 √2, b = b0 + b1 √2 for some a0 , a1 , b0 , b1 ∈ Q

so we may expand ξ ∈ L as

ξ = a0 + a1 √2 + (b0 + b1 √2)√3 = a0 + a1 √2 + b0 √3 + b1 √2√3

which visibly depends on 4 parameters a0 , a1 , b0 , b1 ∈ Q with respect to the
basis {1, √2, √3, √6 = √2√3}.

Very soon we will have a nice simple proposition that will subsume all of
these ad hoc calculations.

6.3 Simple extensions


Suppose K ⊂ L is a field extension. Let α ∈ L. We define

\[ K(\alpha) = \left\{ \frac{p(\alpha)}{q(\alpha)} \;\middle|\; p, q \in K[x],\ q(\alpha) \neq 0 \right\} \]
= the smallest subfield of L that contains K and α.

That is, you get K(α) by applying all field operations to anything in K and α.
So 17α^3 − 3/α ∈ K(α), for example (when α ≠ 0).
Terminology 6.13. In the notation above, we say that K(α) is the field ob-
tained by adjoining α to K.

The field K(α) is an intermediate field between K and L,

K ⊂ K(α) ⊂ L,

where the lefthand inclusion is an equality if and only if α ∈ K, and the right-
hand inclusion is an equality if and only if it so happens that every element of
L can be expressed in terms of α and elements of K.
Definition 6.14. A (nontrivial) field extension K ⊂ L is a simple (or primi-
tive) extension if and only if L = K(α) for some α ∈ L.
A familiar example is that C = R(i) is a simple extension. Before looking at
more examples, note that the field K(α) has a subring

K[α] = {p(α) | p ∈ K[x]}


= the smallest K-algebra of L that contains α

Of course K[α] is an integral domain (it is contained in a field), and K(α) is its
field of fractions. It frequently happens that K[α] = K(α): it happens whenever
we can eliminate denominators using conjugacy tricks as in Example 6.11.

Example 6.15. Consider Q ⊂ C and ω = exp(2πi/5), a primitive 5th root of
unity. Then Q(ω) is the subfield of C consisting of all combinations of rational
numbers and ω. For example γ = 1 + ω^2 + ω^4 + ω^6 + ω^8 ∈ Q(ω), and I can at least

try to simplify γ by trading any ω^5 factors for 1 to get γ = 1 + ω^2 + ω^4 + ω + ω^3 .
If you remember at this stage that
x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1),
then you’ll notice that in fact γ = 0, so it wasn’t quite as interesting an element
as it at first appeared. (Or perhaps you spotted right away that ω^2 is also
a primitive 5th root of unity, so that γ is the zero sum of the vertices of the
pentagon in Figure 1 despite its initial appearance.)
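A floating-point sanity check of γ = 0 is easy to sketch, with the usual caveat that it illustrates the identity rather than proves it; the tolerance used below is an arbitrary choice of ours.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 5)     # a primitive 5th root of unity

# γ as first written, and after trading ω^5 for 1 in the exponents
gamma = sum(omega ** k for k in (0, 2, 4, 6, 8))
reduced = sum(omega ** (k % 5) for k in (0, 2, 4, 6, 8))

# both should vanish up to rounding error
gamma_is_zero = abs(gamma) < 1e-9
same_element = abs(gamma - reduced) < 1e-9
```

An exact check would work in Q[x]/(x^4 + x^3 + x^2 + x + 1) instead of with complex floats.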
Example 6.16. Consider Q ⊂ R and let α be π = 3.1415 · · · , the circumference
of a circle of diameter 1. Then Q(α) is the subfield of R consisting of all
combinations of rational numbers and π. For example, γ = π^2 + π^5 + 1/π ∈ Q(π).
There are no polynomial relations involving π and rational numbers (π is in fact
transcendental, which is stronger than being irrational), so I’m not going to try
to simplify that expression for γ at all.

6.3.1 Standard models for simple extensions


Consider an extension K ⊂ L and an element α ∈ L (which invariably in practice
will not lie in K, though there’s nothing to stop it). As ever, K[x] denotes the
polynomial ring in indeterminate x with coefficients in K.
Lemma 6.17. The evaluation-at-α map evα : K[x] −→ L defined by evα (f ) =
f (α), is a ring homomorphism, and moreover it is the unique ring homomor-
phism ϕ : K[x] −→ L that is the identity on K and satisfies ϕ(x) = α.
This is simply stating the rules you use for evaluating polynomials at some
fixed value.
Proof. Certainly evα (1) = 1, and more generally constant polynomials evaluate
to themselves so evα is the identity on K. Also
evα (f ) + evα (g) = f (α) + g(α) = (f + g)(α) = evα (f + g)
and
evα (f ) evα (g) = f (α)g(α) = (f g)(α) = evα (f g)
so evα is a ring homomorphism with evα (x) = α. If ϕ is any other such ring
homomorphism, then ϕ(ax^2 ) = aϕ(x)^2 = aα^2 for a ∈ K, and inductively
ϕ(ax^n ) = aα^n , so ϕ = evα by extending over sums to polynomials.
Proposition 6.18. Let K ⊂ L, α ∈ L and evα : K[x] −→ L the evaluation
map with evα (x) = α. Then exactly one of the following two cases occurs:

(a) evα is injective, and evα extends to an isomorphism K(x) ≅ K(α) ⊂ L.
(b) evα is not injective, and ker(evα ) = (f ) is generated by a monic irreducible
polynomial f ∈ K[x] of degree ≥ 1. In particular f (α) = 0, and if g(α) = 0
for a polynomial g ∈ K[x], then f divides g. Moreover evα induces an isomorphism

K[x]/(f ) ≅ K(α) ⊂ L,    h + (f ) ↦ evα (h).
This is practically identical to the dichotomy when defining the characteristic
char L of a field, with evα taking the role of the natural map Z → L.
Proof. Certainly evα is either injective or not. Moreover, any injective map ϕ
from an integral domain to a field extends to an injective map from its field
of fractions: ϕ(f /g) is defined to be ϕ(f )/ϕ(g), and if ϕ(f1 /g1 ) = ϕ(f2 /g2 )
then also ϕ(g2 f1 ) = ϕ(g1 f2 ), so g2 f1 = g1 f2 and dividing by both gi finishes
it. The extension of evα to K(x) surjects onto K(α) by definition of simple
extension §6.3, and it is injective because it is a homomorphism of fields.
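In the non-injective case, ker(evα ) is a nonzero ideal of K[x], and every
ideal of K[x] is principal, so ker(evα ) = (f ) where f is the monic polynomial
of least degree in the kernel. It has degree ≥ 1 because evα is the identity
on K. We show the polynomial f is irreducible as follows:
if f = gh where g and h both have degree < deg f , then

0 = f (α) = g(α)h(α)

so without loss of generality g(α) = 0; but that means g ∈ ker(evα ), which is
not possible as g is nonzero and has lower degree than f . The isomorphism at
the end is by the first isomorphism theorem: the image of evα is then a subfield
of L containing K and α, so it is all of K(α). □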

Definition 6.19. Let L/K be a field extension and α ∈ L. We say α is


algebraic over K if there is some monic polynomial f ∈ K[x] for which f (α) =
0. If there is no such polynomial, then α is transcendental over K.
If α is algebraic over K, then the minimal polynomial of α over K is the
(nonzero) monic polynomial f ∈ K[x] of minimal degree satisfying f (α) = 0.

Put another way, α algebraic over K means that evα is not injective and the
minimal polynomial of α over K is the unique monic generator of ker(evα ).
Remark 6.20. We can think of K[x]/(f ) as an in vitro model of the extension
K(α) ⊂ L of K by a root α ∈ L of the irreducible f . Rather than working
with K(α) in the cold, muddy field L, we can go back to the warm, dry lab
and work with K[x]/(f ). It’s shocking at first to realise that you have the same
model whichever root in L you choose: evβ (f ) = 0 for any other root β ∈ L.

6.3.2 A basis of a simple algebraic extension K(α)/K


Proposition 6.21. Let K ⊂ K(α) be a simple extension by an algebraic element
α with minimal polynomial f ∈ K[x] with deg f = n ≥ 1.
Then 1, α, ..., α^(n−1) ∈ K(α) is a K-basis of K(α) and [K(α) : K] = deg f .

This is a linear algebra statement, so for the proof we don’t need to know
that we can multiply polynomials together; but see comments afterwards.
Proof. Set V = K[x]<n , the K-vector space of polynomials of degree < deg f .
Consider W = K[x]/(f ) as a K-vector space: that is, build the quotient
ring, and then only allow yourself to do +, − and scalar multiplication.
Proposition 6.18 says K[x]/(f ) ≅ K(α) (as fields, which is even more than as
K-vector spaces) by x ↦ α, so it’s fine to use W as a model for K(α).
Define a map L : V −→ W by g ↦ g + (f ). It is K-linear (that is, a map
of K-vector spaces) by the golden rules of coset arithmetic: (ag + bh) + (f ) :=
(a + (f ))(g + (f )) + (b + (f ))(h + (f )). We prove that this is an isomorphism.
To show L is surjective, consider any w ∈ W , namely some coset w = g +(f ).
Let r be the remainder of division g = qf + r. Then w = r + (f ) and deg r < n,
so r is an element of V , and L(r) = w. Moreover L is injective because the
remainder is uniquely determined: if L(v) = 0, then v + (f ) = 0 + (f ), so
v ∈ (f ): but deg v < deg f , so we must have had v = 0 in the first place.
So L : V −→ W is an isomorphism, and we may use L to map any basis of V
to a basis of W . The basis {1, x, x^2, ..., x^(n−1)} of V gives the basis claimed. □
Remark 6.22. In fact, this operation is completely natural, and you may be
more comfortable not forgetting multiplication. We may put ring multiplication
on V by taking remainder on division by f after any multiplications, and since
f is irreducible V is then a field and V ≅ K(α).
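The recipe of Remark 6.22 (multiply, then take the remainder on division by f) can be sketched in a few lines. This is only an illustration under my own conventions: polynomials are coefficient lists with the constant term first, the base field is Q via Python's Fraction, and the helper names poly_rem and mul_mod are invented for this sketch. The example uses f = x^3 − 2.

```python
from fractions import Fraction

def poly_rem(g, f):
    """Remainder of g on division by a monic f; polynomials are coefficient
    lists [c0, c1, ...] with the constant term first."""
    r = [Fraction(c) for c in g]
    n = len(f) - 1                      # n = deg f
    while len(r) > n:
        lead = r.pop()                  # leading coefficient of r
        shift = len(r) - n              # subtract lead * x^shift * f
        for i in range(n):
            r[shift + i] -= lead * f[i]
    return r + [Fraction(0)] * (n - len(r))

def mul_mod(g, h, f):
    """Product of g and h in K[x]/(f), returned as a remainder of degree < deg f."""
    prod = [Fraction(0)] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            prod[i + j] += a * b
    return poly_rem(prod, f)

f = [-2, 0, 0, 1]                       # f = x^3 - 2
alpha = [0, 1, 0]                       # the class of x
alpha2 = mul_mod(alpha, alpha, f)       # α^2
alpha3 = mul_mod(alpha2, alpha, f)      # α^3 reduces to the constant 2
unit = mul_mod([1, 1, 0], [1, -1, 1], f)  # (1 + α)(1 − α + α^2) = 1 + α^3 = 3
print(alpha3, unit)
```

The last line shows field behaviour in action: since (1 + α)(1 − α + α^2) = 3, the inverse of 1 + α is (1 − α + α^2)/3, again a polynomial in α of degree < 3.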

6.4 Adjoining a root of a polynomial


Minimal polynomials arise from simple extensions, but we can view it the other
way round and start with a polynomial and create an extension in which it has
a root. If you think this is cheating, remember how you introduced i and C in
the first place: x^2 + 1 doesn’t have a root in R, so why not just extend R by
throwing in an element that behaves as a root?—and you don’t complain about
that any more. We are more precise, but it amounts to the same thing.
Theorem 6.23. Let f ∈ K[x] be a monic irreducible polynomial. Then there
is a field extension K ⊂ L and an element α ∈ L so that L = K(α) is a simple
extension and f (α) = 0. Moreover, f is the minimal polynomial of α over K
and [L : K] = deg f .

Clearly the fact f is monic has nothing to do with the result; it’s just there
so I can say ‘minimal polynomial’ at the end. We can adjoin a root of any
irreducible polynomial, or indeed any nontrivial polynomial at all.
Proof. Just set L = K[x]/(f ). L is a field because f is irreducible in K[x], and
the composition
K −→ K[x] −→ K[x]/(f ) = L
is a ring homomorphism and so is injective as K is a field. Let α = x + (f )
be the class of x in L. Any element of L is of the form h(x) + (f ), but that is
exactly h(α), so α generates L over K. Visibly f (α) = f (x) + (f ) = 0 + (f )
is zero in L, and since f is irreducible and monic it is the minimal polynomial
of α. 
Corollary 6.24. Let f ∈ K[x]. Then there is a field extension K ⊂ L and an
element α ∈ L so that L = K(α) is a simple extension and f (α) = 0.
Pick an irreducible (monic) factor g of f and apply the theorem.
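To see the construction in the most familiar case, here is a minimal sketch of R[x]/(x^2 + 1) compared against Python's built-in complex numbers. Elements are stored as pairs (a, b) standing for a + bx; the function name mul_in_model is my own, and integers stand in for real numbers to keep the arithmetic exact.

```python
def mul_in_model(u, v):
    """Multiply a + b*x and c + d*x in R[x]/(x^2 + 1), using x^2 = -1."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 reduces to -1.
    return (a * c - b * d, a * d + b * c)

x_class = (0, 1)                         # α = x + (f), the adjoined root
alpha_squared = mul_in_model(x_class, x_class)
print(alpha_squared)                     # the pair for -1, so f(α) = α^2 + 1 = 0

u, v = (3, 2), (1, -4)
print(mul_in_model(u, v), complex(*u) * complex(*v))
```

The two printed products agree coordinate by coordinate, which is all that the isomorphism R[x]/(x^2 + 1) ≅ C is saying at the level of multiplication.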

6.5 Adjoining multiple algebraic elements
Let K ⊂ L. If α ∈ L, we know a lot about the simple extension K ⊂ K(α).
More generally, if α, β ∈ L we define

K(α, β) = { p(α, β)/q(α, β) | p, q ∈ K[x1 , x2 ], q(α, β) ≠ 0 }
        = the smallest subfield of L that contains K and α and β

Notice that K(α, β) = K(α)(β), where the right-hand side is the simple exten-
sion of K(α) ⊂ L by β ∈ L. Indeed K(α)(β) ⊂ L is a subfield that contains α,
β and K, and K(α, β) is the smallest such, so K(α, β) ⊂ K(α)(β); on the other
hand K(α, β) ⊂ L contains β and K(α), and K(α)(β) is the smallest such, so
K(α)(β) ⊂ K(α, β). Similarly

K(α, β) = K(α)(β) = K(β)(α) (11)

In other words, we can adjoin multiple elements one after the other or all in one
go, and we get the same result.
In fact, for any subset S ⊂ L, we define

K(S) = the smallest subfield of L that contains K and S

noting, of course, that even though S may be infinite, any particular element
of K(S) is expressed using finitely many arithmetic operations on finitely many
elements of K and S. Note an equivalent way of formulating any of these
extensions that take place within a given big field is

K(S) = ⋂_{M ⊃ S} M

where the intersection takes place over all subfields M ⊂ L that contain both S
and K (although the notation leaves some of this information implicit).
Example 6.25. Consider M = Q(√2, √3). We saw in Example 6.12 that

Q(√2, √3) = {a + b√2 + c√3 + d√6 | a, b, c, d ∈ Q}

and [M : Q] = 4 with the visible basis {1, √2, √3, √6 = √2√3} by considering
M as a sequence of extensions by square roots Q ⊂ Q(√2) ⊂ M .

Remark 6.26. There’s a nice little surprise lurking in Example 6.25. Consider
α = √2 + √3 ∈ M and the intermediate field Q ⊂ F = Q(α) ⊂ M . Clearly
√6 ∈ F since √6 = (1/2)(α^2 − 5). So also α√6 = 2√3 + 3√2 ∈ F . Subtracting
copies of α from that shows that both √2 ∈ F and √3 ∈ F . So M = F = Q(α)
was a simple extension all along. Ah well. You try to be interesting...
It’s not true that all field extensions are simple, but it does happen more
than you might expect, and it always happens for finite extensions of Q. This
will follow instantly as an unintended consequence of the Main Theorem 10.17
of Galois Theory below: it’s called the Theorem on the Primitive Element.
Having said that, it often seems easier to work with a few simple generators
than with a single one, as in the example above. You could regard the question of
finding radical formulas to solve equations as the opposite of finding a primitive
element: sure the field of all roots is a simple extension, but can you span it by
a bunch of radicals instead?
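The arithmetic in Remark 6.26 is easy to check numerically. The sketch below uses floating point from Python's math module, so equalities are tested with a tolerance. The explicit formulas √2 = α√6 − 2α and √3 = 3α − α√6, and the quartic x^4 − 10x^2 + 1 (obtained by squaring α^2 − 5 = 2√6), are my own spelling out of "subtracting copies of α" and are not stated in the notes.

```python
import math

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
alpha = r2 + r3

# √6 = (α^2 − 5)/2, so √6 lies in Q(α).
sqrt6 = (alpha ** 2 - 5) / 2

# α√6 = 2√3 + 3√2; subtracting 2α or 3α isolates √2 and √3.
sqrt2 = alpha * sqrt6 - 2 * alpha
sqrt3 = 3 * alpha - alpha * sqrt6

# Squaring α^2 − 5 = 2√6 gives α^4 − 10α^2 + 1 = 0.
residual = alpha ** 4 - 10 * alpha ** 2 + 1

print(sqrt2 - r2, sqrt3 - r3, sqrt6 - r6, residual)  # all tiny
```

The quartic has degree 4 = [M : Q], which is consistent with α generating all of M.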
Example 6.27. At last we can verify the claim in §1(1) that if L = Q(α, ω)
with α = ∛2 and ω a primitive cube root of unity, then
[L : Q] = 6 and {1, α, α^2, ω, αω, α^2 ω} is a Q-basis of L
For example, consider the intermediate field
Q ⊂ K = Q(α) ⊂ L
Then by Proposition 6.21 K has Q-basis {1, α, α^2} and [K : Q] = 3. The
polynomial x^2 + x + 1 ∈ K[x] is irreducible over K (not only over Q) since
K ⊂ R but the roots are not real, so again by Proposition 6.21 L has K-basis
{1, ω} and [L : K] = 2. Thus by the Tower Law
[L : Q] = [L : K][K : Q] = 2 × 3 = 6
and the proof of the Tower Law gives a Q-basis of L as the product of the two
bases we just found. (You could redo the proof going via Q(ω) instead for fun.)
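Here is a small sketch of exact arithmetic in L using the product basis α^i ω^j with 0 ≤ i < 3 and 0 ≤ j < 2 that comes out of the Tower Law. The storage convention (a 3 × 2 table of rational coefficients) and the function name mul are my own choices; the only facts used are α^3 = 2 and ω^2 = −1 − ω.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply elements of L = Q(α, ω), stored as u[i][j] = coefficient of
    α^i ω^j for 0 <= i < 3 and 0 <= j < 2, using α^3 = 2 and ω^2 = -1 - ω."""
    out = [[Fraction(0)] * 2 for _ in range(3)]
    for i in range(3):
        for j in range(2):
            for k in range(3):
                for l in range(2):
                    c = Fraction(u[i][j]) * v[k][l]
                    e = i + k
                    if e >= 3:              # α^3 = 2
                        e -= 3
                        c *= 2
                    if j + l == 2:          # ω^2 = -1 - ω
                        out[e][0] -= c
                        out[e][1] -= c
                    else:
                        out[e][j + l] += c
    return out

alpha = [[0, 0], [1, 0], [0, 0]]
omega = [[0, 1], [0, 0], [0, 0]]
alpha_omega = mul(alpha, omega)

# αω is another root of x^3 - 2, so its cube should be the constant 2.
cube = mul(mul(alpha_omega, alpha_omega), alpha_omega)

# In this basis the third root αω^2 expands as -α - αω.
alpha_omega_sq = mul(alpha, mul(omega, omega))
print(cube, alpha_omega_sq)
```

The second computation is worth a moment's thought: αω^2 is a Q-combination of α and αω, so it cannot serve as a sixth basis vector alongside them.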

6.6 Algebraic extensions and finite extensions


Definition 6.28. An extension L/K is called algebraic if every element α ∈ L
is algebraic over K.
We can see directly from the definition that if L/K is a field extension and
α ∈ L is algebraic, then −α is algebraic too: there is a monic polynomial
f ∈ K[x] with f (α) = 0, and if we set g(x) = ±f (−x) to be the polynomial
f with signs changed on all odd powers of x if f has even degree (and even
powers if deg f is odd), then g(−α) = ±f (α) = 0. But working directly with
the polynomials in this way is rare. How would you show that α^17 + 5α is also
algebraic? Fortunately, there is a much simpler approach.
Recall that we say L/K is finite if dimK L < ∞, otherwise infinite.
Proposition 6.29. If L/K is a finite extension, then it is algebraic.
Proof. This is trivial. Suppose [L : K] = n. If α ∈ L, consider the n + 1
elements 1, α, α^2, ..., α^n. As they lie in an n-dimensional K-vector space, there
must be elements c0 , c1 , . . . , cn ∈ K, not all zero, for which
c0 · 1 + c1 α + c2 α^2 + · · · + cn α^n = 0.
Thus α satisfies a nontrivial polynomial relation over K. (You can divide
through by whichever cs ≠ 0 for largest s to make it monic.) □
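The proof is effectively an algorithm: write the powers of α in a basis and look for a linear dependence. Below is a sketch for α = √2 + √3 inside Q(√2, √3), taking the basis 1, √2, √3, √6 of Example 6.25 as given. The tuple encoding and the helper names mul and dependence are assumptions made for this illustration; the linear algebra is plain Gaussian elimination over Q.

```python
from fractions import Fraction

def mul(u, v):
    """Product in Q(√2, √3); (a, b, c, d) stands for a + b√2 + c√3 + d√6."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (a1*a2 + 2*b1*b2 + 3*c1*c2 + 6*d1*d2,
            a1*b2 + b1*a2 + 3*(c1*d2 + d1*c2),
            a1*c2 + c1*a2 + 2*(b1*d2 + d1*b2),
            a1*d2 + d1*a2 + b1*c2 + c1*b2)

def dependence(vectors):
    """Coefficients, not all zero, of a linear relation among the vectors
    (guaranteed to exist when there are more vectors than coordinates)."""
    rows, cols = len(vectors[0]), len(vectors)
    # Matrix whose k-th column is the k-th vector.
    m = [[Fraction(vectors[k][r]) for k in range(cols)] for r in range(rows)]
    pivots = []                          # pivot column of each reduced row
    for col in range(cols):
        r = len(pivots)
        pr = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if pr is None:
            # Free column: it is a combination of the pivot columns so far.
            coeffs = [Fraction(0)] * cols
            coeffs[col] = Fraction(1)
            for i, pc in enumerate(pivots):
                coeffs[pc] = -m[i][col]
            return coeffs
        m[r], m[pr] = m[pr], m[r]
        m[r] = [x / m[r][col] for x in m[r]]
        for i in range(rows):
            if i != r and m[i][col] != 0:
                m[i] = [x - m[i][col] * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
    raise ValueError("vectors are linearly independent")

alpha = (0, 1, 1, 0)                     # α = √2 + √3
powers = [(1, 0, 0, 0)]
for _ in range(4):                       # 1, α, ..., α^4: five vectors, dimension 4
    powers.append(mul(powers[-1], alpha))
relation = dependence(powers)
print(relation)   # c0, ..., c4 with c0 + c1 α + ... + c4 α^4 = 0
```

Because the elimination stops at the first dependent power, the relation it returns is already monic in the highest power that appears.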

Corollary 6.30. Let L/K be a field extension and α, β ∈ L be two algebraic
elements. Then α + β, αβ and α/β are algebraic, where of course we only claim
the last one in the case β ≠ 0.
Proof. By Proposition 6.21 K(α)/K is a finite extension, say [K(α) : K] = n.
Let p ∈ K[x] be the minimal polynomial of β over K. It is irreducible over K.
Suppose for a moment that p is irreducible over K(α) too. In that case [K(α, β) :
K(α)] = deg p. So by the Tower Law

[K(α, β) : K] = [K(α, β) : K(α)][K(α) : K] = n deg p

is finite too. So K(α, β)/K is algebraic, by Proposition 6.29. Thus any element
of K(α, β) is algebraic over K, in particular the arithmetic combinations of α
and β we wish to prove.
We presumed p was irreducible over K(α). But if not, things are even better.
In that case p must factorise p = p1 p2 · · · ps into irreducibles pi over K(α), and
β must be a root of one of the pi , without loss of generality p1 . (That is,
p1 ∈ K(α)[x] is the minimal polynomial of β over K(α).) So simply repeat the
previous argument with p1 in place of p. 
Example 6.31. This implies at once that the set A ⊂ C of all complex numbers
that are algebraic over Q,

A = {α ∈ C | f (α) = 0 for some nonzero f ∈ Q[x]}

is a subfield of C. This field is called the field of algebraic numbers. Indeed,


if α, β ∈ A, then by definition both are algebraic, and so the various arithmetic
combinations, α + β and so on, are algebraic too, and so they also lie in A.
With a tiny bit more work, it also proves that A is algebraically closed. Here
goes. Suppose f ∈ A[x] and let α ∈ C be a root of f . We need to prove that
α ∈ A, that is, α is algebraic over Q, or more prosaically by Proposition 6.29
that [Q(α) : Q] is finite. You could figure this out easily now we’ve said it out
loud, but I will spell it out anyway.
Let a0 , . . . , an ∈ A be the coefficients of f . Setting K = Q(a0 , . . . , an ) ⊂ A
we have that in fact f ∈ K[x] and α is algebraic over K. Since each ai is
algebraic over Q, Proposition 6.21 says that

Q ⊂ Q(a0 ) ⊂ Q(a0 , a1 ) ⊂ · · · ⊂ Q(a0 , . . . , an ) = K

is a sequence of finite extensions (some may be equalities of course), so [K : Q]


is finite by induction on the Tower Law. As the root α ∈ C is algebraic over K,
[K(α) : K] is finite, by Proposition 6.21 again, and so by the Tower Law [K(α) :
Q] = [K(α) : K][K : Q] is finite too.
But Q ⊂ Q(α) ⊂ K(α), so the Q-vector space Q(α) is a Q-sub-vector space
of the finite-dimensional Q-vector space K(α). Thus [Q(α) : Q] is finite, which is
what we needed to show.

Remark 6.32. It was trivial that finite extensions are algebraic. Conversely,
simple algebraic extensions are finite by Proposition 6.21. But the more general
converse is not true, as Q ⊂ A in Example 6.31 shows.
You can easily make up other gratuitous counterexamples: inductively
adjoining all the real pth roots of all primes p,
Q ⊂ Q(2^(1/2), 3^(1/3), 5^(1/5), ..., p^(1/p), ...),
for example. In any example you make up, you’ll have to show that what you
add inductively does not already lie in the field you have constructed so far. In
this case, once you calculate [Q(p^(1/p)) : Q] = p, you’ll see that the Tower Law
shows both p^(1/p) ∉ Q(2^(1/2), ..., q^(1/q)), where q is the prime before p, and then that
the whole extension is infinite. You can have fun making up some others (the
easy part) and trying to prove they are infinite (which can be difficult).

6.7 Maps between fields I


We prove three results that between them begin to show how to construct
homomorphisms between fields, and conversely show that they are rare.
Notation 6.33. If L/K and M/K are extensions, then we write

EmbK (L, M ) = {ϕ : L → M | ϕ is a K-homomorphism}

Notice that EmbK (L, M ) is not naturally a group: we cannot compose maps
because in general their domain and codomain are different.
Remark 6.34. There is one situation in which EmbK (L, M ) is a group under
composition in a straightforward way. Suppose L/K is finite, L ⊂ M and that
it so happens that ϕ(L) ⊂ L for every K-homomorphism ϕ. Since L/K is finite,
each ϕ is therefore an isomorphism L → L, and so EmbK (L, M ) is naturally
identical to AutK (L), which is a group under composition. (We used finiteness
here to ensure maps are bijections so they can be inverted.)
Fundamental Observation 6.35. Consider ϕ ∈ EmbK (L, M ) where L/K
and M/K are any field extensions. If α ∈ L is a root of a polynomial f ∈ K[x]
then ϕ(α) is a root of f in M .
Proof. The point is that ϕ respects addition and multiplication and is the
identity on K, so ϕ(f ) = f , if we extend ϕ naturally to a ring homomorphism
L[x] −→ M [x] as in §4.2 Example 4.3. With that first equality,

f (ϕ(α)) = ϕ(f )(ϕ(α)) = ϕ(f (α)) = ϕ(0) = 0

as claimed, where the second equality is because ϕ is a ring homomorphism. 


Proposition 6.36. Let L/K be a field extension. Suppose f ∈ K[x] is an
irreducible polynomial and that α, β ∈ L are two roots of f . Then there is a
K-isomorphism K(α) −→ K(β) that takes α to β.
Proof. Both fields are K-isomorphic to K[x]/(f ) with x ↦ α and x ↦ β
respectively, so composing the inverse of the first with the second gives an
isomorphism as required. □

Spoiler 6.37. It would be nice if the map ϕ : K(α) −→ K(β) we got extended
to a K-automorphism L −→ L. In general it does not, but later in Corollary 9.16
we will see a situation in which it always does, which will give us a lot of power.

Example 6.38. Thinking back to the roots α = ∛2, αω and αω^2 of f = x^3 − 2,
the proposition shows that there is an isomorphism between the three subfields
Q(α) ≅ Q(αω) ≅ Q(αω^2) of the field L = Q(α, ω) containing all the roots.
When motivating the subject in §1 we implicitly said this, and hoped that these
isomorphisms could be realised by automorphisms of L (which indeed they are,
as we see later). The proof above completely ignores L—it’s only there as a
place for α, β to live—and simply says that each of the three fields is isomorphic
to the same model extension K[x]/(f ) of K.

Corollary 6.39. Let L/K be a field extension, f ∈ K[x] an irreducible
polynomial and α ∈ L a root of f . Then

EmbK (K(α), L) −→ {β ∈ L | f (β) = 0},    ϕ ↦ ϕ(α)

is a bijection, so # EmbK (K(α), L) = #{β ∈ L | f (β) = 0}, the number of roots
of f in L.


Proof. Any K-homomorphism ϕ : K(α) −→ L is determined by the image β =
ϕ(α). Certainly β ∈ L must be a root of f , since 0 = ϕ(0) = ϕ(f (α)) =
ϕ(f )(ϕ(α)) = f (β), as f ∈ K[x] and ϕ fixes K. Conversely, Proposition 6.36
says that for any root β, sending α ↦ β does determine a K-homomorphism. □

Example 6.40. Continue with f = x^3 − 2 and set K = Q[x]/(f ), which is
a field as f is irreducible over Q. The corollary says that the number of
Q-homomorphisms of K to C is the same as the number of distinct roots of f
in C. There are 3 roots, so 3 homomorphisms, and their images are the 3
isomorphic subfields in Example 6.38.
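A numerical look at the three possible images of the class of x, as a rough floating-point sketch using Python's cmath. The variable names are mine, and the check only confirms that each candidate is a root of f up to rounding error.

```python
import cmath

alpha = 2 ** (1 / 3)                      # the real cube root of 2
omega = cmath.exp(2j * cmath.pi / 3)      # a primitive cube root of unity

# Candidate images of the class of x under a Q-homomorphism K -> C.
images = [alpha, alpha * omega, alpha * omega ** 2]

for beta in images:
    print(beta, abs(beta ** 3 - 2))       # each residual is zero up to rounding

# Only one of the three roots is real.
real_roots = [beta for beta in images if abs(complex(beta).imag) < 1e-9]
print(len(real_roots))
```

So if we replaced the target C by R, the same corollary would leave only one Q-homomorphism, namely the one sending the class of x to the real cube root.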

Theorem 6.41. Let L/K be a finite extension and M/K any extension. Then

# EmbK (L, M ) ≤ [L : K]

Proof. If L = K(α) is a simple extension, where α has minimal polynomial f
over K, then deg f = [L : K] is an upper bound on the number of roots of f
in M , so this follows as in Corollary 6.39.
We work by induction on [L : K]. In general, pick α ∈ L \ K and consider
K ⊂ K(α) ⊂ L. There is a map of sets
ρ : EmbK (L, M ) −→ EmbK (K(α), M )

just by restricting a map from L to a map from K(α).


Let ϕ ∈ EmbK (K(α), M ). The preimage ρ^(−1)(ϕ) is the set of
K-homomorphisms L −→ M that restrict to ϕ : K(α) −→ M . In particular, we may
consider ρ^(−1)(ϕ) as the set of K(α)-homomorphisms L −→ M (where K(α) ⊂ M via ϕ).

Since [L : K(α)] < [L : K], we have by induction that #ρ^(−1)(ϕ) ≤ [L : K(α)].
So

# EmbK (L, M ) ≤ max{#ρ^(−1)(ϕ) | ϕ ∈ EmbK (K(α), M )} × # EmbK (K(α), M )
             ≤ [L : K(α)][K(α) : K] = [L : K]

by the Tower Law. □

Remark 6.42. I like this proof. It shows how we can use the Tower Law to do
induction on the degree of an extension, and then use simple extensions, which
are very concrete and we can be completely explicit about, as the inductive step.
Remark 6.43. It may have struck you that the upper bound in Theorem 6.41
is a function of L/K and has nothing whatsoever to do with M . The role of M
comes later as we find out when this bound is an equality. The inductive step
of the proof already contains the clue: you can build more maps if M contains
more roots of the minimal polynomials of elements of L over K.
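As an illustration of the bound being attained, take L = Q(√2, √3) and M = R, so [L : Q] = 4. The sketch below checks, on a few sample elements, that each of the four sign choices √2 ↦ ±√2, √3 ↦ ±√3 respects multiplication. The tuple encoding (relative to the basis 1, √2, √3, √6 of Example 6.25) and the names mul and embedding are my own. Checking samples is not a proof, though the general verification is the same calculation with letters in place of numbers.

```python
from itertools import product

def mul(u, v):
    """Product in Q(√2, √3); (a, b, c, d) stands for a + b√2 + c√3 + d√6."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (a1*a2 + 2*b1*b2 + 3*c1*c2 + 6*d1*d2,
            a1*b2 + b1*a2 + 3*(c1*d2 + d1*c2),
            a1*c2 + c1*a2 + 2*(b1*d2 + d1*b2),
            a1*d2 + d1*a2 + b1*c2 + c1*b2)

def embedding(s, t):
    """The Q-linear map sending √2 to s√2 and √3 to t√3 (hence √6 to st√6)."""
    return lambda u: (u[0], s * u[1], t * u[2], s * t * u[3])

samples = [(1, 2, 0, -1), (0, 1, 1, 0), (3, -1, 4, 2)]
maps = [embedding(s, t) for s, t in product((1, -1), repeat=2)]

# Each sign choice respects multiplication on the sample elements.
multiplicative = all(phi(mul(u, v)) == mul(phi(u), phi(v))
                     for phi in maps for u in samples for v in samples)

# The four maps are distinct: they already differ on α = √2 + √3.
images_of_alpha = {phi((0, 1, 1, 0)) for phi in maps}
print(multiplicative, len(images_of_alpha))
```

Granting that these four maps really are Q-homomorphisms, Theorem 6.41 says there can be no others, so here the bound is an equality.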

7 Applications: antiquities
Euclidean geometry is a practical and applicable subject.
When you move an IKEA wardrobe from one room to another, you check
in advance whether it can be walked or carried, or whether sadly it needs dis-
mantling, before you gouge holes out of the walls and ceilings. If you want to
rise to become the Head Gardener at Buckingham Palace or Kew Gardens or
the University of Warwick, you’d better be able to be sure that the corners of
those beds are right angles and that if you have 10 m^2 coverage of compost the
area you’re mulching actually is 10 m^2: the Royal Horticultural Society practical
exams really do include Euclidean geometry (a small dose, proofs omitted).
Because it is such a powerful tool, it is useful to pin down exactly what you
can do with Euclidean geometry and what you cannot. We specify the tools and
begin to explore the myriad constructions in §7.1. If you are only interested in
how the field theory works you can skip straight to §7.2, remembering that the
only tools we have to mark points are to intersect lines and circles with lines or
circles, and there are conditions on which lines and circles we may draw.
The tools we use for constructions are limited at the outset, but the ideas we
use to prove that we cannot solve certain problems are not. We will translate
the whole exercise into modern coordinate geometry, which was not available to
Euclid, and then the Tower Law will solve the impossible.
If you don’t like what we achieve, then you can think about buying some new
tools and working through the whole exercise with them. But remember that
your toolbox is not infinite, so at some point you’ll have to have the courage to
make do with what you’ve got and go out into the field and get working.

7.1 Euclidean geometry – nonexaminable


Here’s how we’ll approach this:
(a) List the tools we have. Classically we are permitted a ruler and compass,
both of which have precise specification and very limited functionality.
(b) Get some practice using the tools we have. We can make a nice little
collection of things that we’ve figured out how to do.
(c) After trying hard, list some things that we seem not to be able to do.

(d) For each of those challenges, either try harder and come up with a con-
struction, or prove that there cannot be any construction at all.

Starting point  We work in the Euclidean plane, which we treat as R^2 together
with the usual notion of distance and angle. The challenge is to plot as
many points of R^2 as we can using only intersections of lines and circles according
to some fixed rules. At the outset the only points of R^2 we have marked for
us are the two points (1, 0) and (−1, 0).

The question is: which points of R^2 can we plot? We will gather together all
the points we plot into a set P ⊂ R^2; our starting points (−1, 0) and (1, 0) ∈ P
give us something to work with. We will quickly find that there are many other
points in P, but first we need to understand what a ruler and compass can do
and how we identify other points of P, that is, what it means to plot a point.

What can our ruler and compass do? These are our tools:

Ruler: If P, Q ∈ P are distinct points, then we may draw the straight line P Q
between them. Note that the line does not extend beyond the endpoints
P and Q.
Compass: If P, Q, R ∈ P (not necessarily distinct) then we may draw the circle
centred on P with radius equal to the length of the line QR. In practical
terms, we set our compass radius by putting the two sharp spikes at Q
and R respectively, and then move one spike to P and rotate the compass
through 360° using the other spike to describe a circle as we go; see the
picture below of what a compass is in this context (you can buy that one
online right now).

What counts as plotting a point? Given that we can draw certain lines
and circles with the ruler and compass, all of the following are points of P:
(a) the intersection point of two lines drawn with the ruler;
(b) the points of intersection of a line drawn with the ruler and a circle
drawn with the compass;

(c) the intersection points of two circles drawn with the compass.
Of course, the lines and circles in this prescription may only be drawn in accor-
dance with their definition above, so we need to know some points of P to be
able to draw them at all. We think of building up P recursively, including new
intersection points as we draw ever more lines and circles.
A point P ∈ R^2, other than the initial points (−1, 0) and (1, 0) ∈ P, is
an element of P if and only if it can be realised by the operations above applied
finitely many times.

The first computation We start out knowing P0 = (−1, 0) and Q0 = (1, 0).
We can draw the line P0 Q0 , and then set the compass to radius P0 Q0 and draw
the circles of that radius centred on P0 and on Q0 —that’s absolutely everything
we can do given only two points to work with. But already the intersection
points of those two circles are points S and S′, and so our knowledge of P
grows. With the modern technology of coordinates, and ancient technology of
Pythagoras’ Theorem, we may express those as

S, S′ = (0, ±√3)

since each lies at distance 2 from both (−1, 0) and (1, 0).
We may use the ruler now to draw the line SS′. Intersecting the lines P0 Q0 and
SS′ produces another new point R = (0, 0).
The picture sketches the situation. (I’ve borrowed that picture from Wolfram
alpha so it’s not labelled for us. Note too that our ruler would not allow the
vertical bisector to extend beyond the intersections of the circles.)

In complete generality this shows that we may always make the following con-
struction.
Construction 1. If P, Q ∈ P then the midpoint R of the line segment P Q also
lies in P.
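The first computation can be replayed in coordinates. The sketch below is a floating-point illustration only: the function circle_intersections is a helper written for this purpose and assumes two circles of equal radius that meet in exactly two points.

```python
import math

def circle_intersections(p, q, r):
    """Intersection points of the two circles of radius r centred at p and q
    (assumes 0 < |pq| < 2r, so that there are exactly two points)."""
    (px, py), (qx, qy) = p, q
    dx, dy = qx - px, qy - py
    d = math.hypot(dx, dy)
    mx, my = (px + qx) / 2, (py + qy) / 2     # midpoint of pq
    h = math.sqrt(r * r - (d / 2) ** 2)       # Pythagoras: height above pq
    ux, uy = -dy / d, dx / d                  # unit vector perpendicular to pq
    return (mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)

P0, Q0 = (-1.0, 0.0), (1.0, 0.0)
radius = math.hypot(Q0[0] - P0[0], Q0[1] - P0[1])   # compass set to |P0 Q0| = 2
S, S_prime = circle_intersections(P0, Q0, radius)

# By symmetry the line S S' crosses P0 Q0 at the midpoint of S S'.
R = ((S[0] + S_prime[0]) / 2, (S[1] + S_prime[1]) / 2)
print(S, S_prime, R)
```

The same function applied to any two marked points P, Q with radius |PQ| gives the two auxiliary points whose join cuts PQ at its midpoint, which is Construction 1.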

An arithmetic point of view  We may think of numbers as being the lengths
of lines: that is, our number set is

N = {|P Q| : P, Q ∈ P} ⊂ R

Construction 1 shows that we may divide by 2 in N . This is a lovely point
of view, and it immediately invites the question what other arithmetic can we
perform in N ? The answer is that N is almost a field, in that we can do
the usual arithmetic of positive numbers, except that it does not include any
negative numbers. With a bit of practice you can easily prove that with more
ruler-and-compass constructions: at this stage we only own two lengths and a
halving procedure, so there’s some way to go.
In fact we won’t use N as the main tool for our analysis, though it’s a nice
first way to think about what we can do. We could imagine including negative
numbers Ñ = N ∪ −N to make a field, but that’s clearly a clumsy way to do
things. Instead we introduce a simpler approach in §7.2 below.

Extending lines beyond their endpoints The previous trick can be used
to extend lines indefinitely. Circles drawn centred on S and S′ of radius SS′
meet on the line P Q but outside its endpoints. We can join those and repeat.
In this way we can draw as much of the x- and y-axes as we please—and indeed
extend any other lines we have a segment of.
Given the extension of P0 Q0 along the x-axis, we may draw a circle of radius
P0 Q0 (in the notation above) at Q0 to construct a point T as follows.
Construction 2. If P, Q ∈ P, then also T ∈ P, where |T Q| = |P Q| and Q is
the midpoint of the line segment P T .
In arithmetic terms, we may think of that as multiplying by 2. The same
set up constructs analogous points on the y-axis.
Construction 3. If R, S ∈ P, then T, T′ ∈ P where T T′ is the line at right
angles to RS meeting it in R with |RT | = |RT′| = |RS|.
In other words, we can construct the square lattice of points at distance |RS|
centered on any point P ∈ P the moment we construct R, S ∈ P.
You can continue with many constructions: divide by any integer, move a
line segment RS to a parallel line segment starting at a given P ∈ P, multiply
by an integer, rotate a line segment to be parallel to the x-axis, compute a
square root, and so on and on and on.

7.2 Translate into Field Theory


Let’s regroup. Let me state the mathematical setup independently of §7.1, and
then explain the connection for anyone who read §7.1.
Definition 7.1 (Ruler and compass extensions). This is a method to build a
sequence of field extensions K0 ⊂ K1 ⊂ K2 ⊂ · · · of subfields of R. Start with
K0 = Q, and then inductively, given Ki , make a simple extension Ki ⊂ Ki+1
as follows. Inside the plane R^2, consider the set of all points with coordinates
in Ki , denoted Pi = Ki^2 ⊂ R^2, and then construct a (possibly) new point S ∈ R^2
by any one of the following operations:

(a) S is the unique intersection point of two lines drawn with the ruler
(b) S is one of the points of intersection of a line drawn with the ruler and
a circle drawn with the compass
(c) S is one of the intersection points of two distinct circles drawn with the
compass
where a line means a (compact) straight line segment P Q joining two points
P, Q ∈ Pi and a circle means a circle centred at a point R ∈ Pi of radius equal
to the length of a line P Q joining points of Pi . (You may regard all italicised
mention of rulers and compasses as decorative flim-flam that plays no role and
may be deleted.)

There is one caveat: if in any of the operations there are either no intersection
points or infinitely many in R2 , then the operation is not permitted. (The main
point is to rule out lines that meet along a common nontrivial segment.)
Then S = (x, y) ∈ R2 and Ki+1 = Ki (x, y). (It will be clear in the proof of
Lemma 7.3 that either y ∈ Ki (x) or x ∈ Ki (y), so the extension is simple.)
Any sequence of field extensions K0 ⊂ K1 ⊂ · · · ⊂ KN made in this way is
called a sequence of ruler and compass field extensions.
Remark 7.2. The algebraic process in Definition 7.1 relates directly to the
geometry of §7.1: if you construct a point P = (x, y) ∈ R2 using the geometri-
cal ruler and compass constructions of §7.1, then those geometrical operations
describe a sequence of ruler and compass field extensions with x, y ∈ KN .
Lemma 7.3. Let Q = K0 ⊂ K1 ⊂ · · · ⊂ KN be a sequence of ruler and compass
field extensions. Then [KN : Q] = 2^M for some integer M ≤ N .
Proof. Obvious. End of proof. In detail: inductively we build a sequence of
field extensions Q ⊂ K1 ⊂ K2 ⊂ · · · by constructing points Q1 , Q2 , . . . whose
coordinates generate Ki+1 over Ki . (Many of these extensions may be trivial:
Ki+1 = Ki .) To extend Ki , we construct a point Qi+1 ∈ R2 by one of the follow-
ing three operations (which are the three operations of Definition 7.1 expressed
in coordinate geometry).
Working in each case with a, b, c, d, e, f ∈ Ki we solve one of:
(a) Intersection of two lines: simultaneously solve

ax + by = c and dx + ey = f

But that is done by linear algebra in Ki , and the solution is (x, y) ∈ Ki^2.
(b) Intersection of line and circle: simultaneously solve

ax + by = c and (x − d)^2 + (y − e)^2 = f^2

But that is done by substituting for x or y from the linear equation into
the quadratic and then solving the result with the usual quadratic formula:
thus x and y involve at worst a single square root over Ki .
(c) Intersection of two circles: simultaneously solve

(x − a)^2 + (y − b)^2 = c^2 and (x − d)^2 + (y − e)^2 = f^2

Subtracting one equation from the other leaves the affine linear equation
2(d − a)x + 2(e − b)y = c^2 − a^2 − b^2 − (f^2 − d^2 − e^2), namely the equation of
the line through the two intersection points (including the case the circles
are tangent) which puts us back in the previous case: so again x and y
involve at worst a single square root over Ki .

Thus each step either has a solution in Ki , so [Ki+1 : Ki ] = 1, or in Ki (√s) for
some s ∈ Ki , so [Ki+1 : Ki ] ≤ 2. The Tower Law concludes the proof. □
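Case (b) of the proof can be made completely explicit when Ki = Q. The sketch below substitutes y = (c − ax)/b into the circle equation, reads off the quadratic in x, and decides from its discriminant whether the step has degree 1 or 2. The function names are invented for this illustration, it assumes b ≠ 0, and it treats only a first step over Q, not a general Ki.

```python
from fractions import Fraction
from math import isqrt

def line_circle_quadratic(a, b, c, d, e, f):
    """Coefficients (A, B, C) of the quadratic A x^2 + B x + C = 0 satisfied by
    the x-coordinates where ax + by = c meets (x-d)^2 + (y-e)^2 = f^2.
    Assumes b != 0, so that y = (c - a x)/b can be substituted."""
    a, b, c, d, e, f = map(Fraction, (a, b, c, d, e, f))
    k = c - b * e
    # b^2 (x - d)^2 + (k - a x)^2 = b^2 f^2, expanded in powers of x.
    A = a * a + b * b
    B = -2 * (b * b * d + a * k)
    C = b * b * d * d + k * k - b * b * f * f
    return A, B, C

def is_rational_square(q):
    """True if the nonnegative rational q is the square of a rational."""
    return (q >= 0 and isqrt(q.numerator) ** 2 == q.numerator
            and isqrt(q.denominator) ** 2 == q.denominator)

def step_degree(a, b, c, d, e, f):
    """Degree of Q(x, y) over Q for one line-circle intersection point (x, y)."""
    A, B, C = line_circle_quadratic(a, b, c, d, e, f)
    disc = B * B - 4 * A * C
    if disc < 0:
        raise ValueError("the line misses the circle")
    return 1 if is_rational_square(disc) else 2

# The line y = x meets the unit circle at (±√2/2, ±√2/2): a genuine square root.
print(step_degree(1, -1, 0, 0, 0, 1))
# The x-axis meets the unit circle at (±1, 0): nothing new is adjoined.
print(step_degree(0, 1, 0, 0, 0, 1))
```

Either way the degree of a single step is 1 or 2, which is all the Tower Law needs to force [KN : Q] to be a power of 2.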

(Non-examinable) Remark On the face of it, the geometrical operations
of §7.1 merely describe finitely many points with finitely many coordinates (x, y).
But we hinted in §7.1 that the geometrical operations are strong enough to carry
out all field operations, so we can indeed construct any point of P0 = Q^2, and
then, in any inductive step, if we can construct Q = (x, y) ∈ R^2 then we can
also construct points having any two elements of Ki (x, y) as their coordinates.
You may object that we cannot carry out infinitely many geometrical ruler
and compass constructions, and that’s fair enough, but we can construct any
specified element of any Ki using only finitely many operations, and if you wish
you may prefer to spell that out every time I say Ki .
While we’re discussing §7.1, notice that the field Ñ ⊂ R is simply the union
of every field Ki that arises in every possible ruler and compass extension. If
it was ever convenient to have a ‘master field’ of geometrically constructible
numbers, then we could use this idea, but in general we are trying to get away
from the idea of working in a single fixed favourite field (such as Ñ or C), and
work instead from the bottom up. □

(Non-examinable) Remark  The kinds of geometrical problems we may encounter
typically rely on constructing some particular points; the details of
solution may then involve various lines and circles involving those. If we were
interested in mastering ruler-and-compass geometry to solve such problems, we
could think of the process in two steps: first construct line segments of certain
lengths relevant to the problem (that is, carry out the field theory of Defini-
tion 7.1); second, use those and a bit of practical ingenuity to construct the
points in R2 that solve the problem. We haven’t made any effort to master the
methods so we cannot have any confidence that we could carry out these steps,
but it would be a lovely weekend break to devote yourself wholly to that.
Here, though, we are much more negative. Rather than proving we can make
particular constructions, we will prove that there are some constructions that
cannot be made. We should think of this as being a hard problem: it has to
rule out the possibility of some ingenious construction that we do not know.
This becomes a classic deceit of mathematics. Once we’ve seen the elegant
solution to various nonexistence problems on the next page, it will be tempting
to regard this as the easy task, while regarding the messy business of making
constructions that actually do work as the hard task. Shame on us if we think
that! It is a form of laziness (not bothering to perfect hundreds of beautiful
geometric constructions for ourselves) and arrogance (pretending that short,
elegant ideas are ten-a-penny), so hopefully we do not fall into this trap.
Always remember: this nonexistence proof resisted the dedicated attention
of thousands of the finest minds from half a dozen of the most advanced cultures
for at least 2000 years, to say nothing of the number of duels undertaken on the
way. And when the great mathematicians of the French Academy finally saw
the first draft of the solution, they didn’t understand it. This is not easy: you
and I would not have thought of it for ourselves. □

7.3 Impossible constructions
Duplicating the cube: building a shed The gamekeeper tells you she
wants a cube-shaped shed whose volume is exactly 2 in the distance units of
your Euclidean plane cubed. (I didn’t ask, but it is probably a hide for shooting
moles, who are notoriously unafraid of cubes. In general, it’s best not to know
how they keep those lawns so nice, in the same way it’s best not to know what
sausages are made of.)
Your job, therefore, is to mark out a square of side length $\sqrt[3]{2}$ on the ground:
the floor, walls and ceiling of the shed can all be modelled on that, and then
it’s someone else’s job to assemble it. This is equivalent to you being able to
construct the point $(\sqrt[3]{2}, 0)$ in your Euclidean garden, or equivalently a sequence
of ruler and compass field extensions $\mathbb{Q} \subset K_N$ with $\sqrt[3]{2} \in K_N$.
But $[\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3$ since $x^3 - 2$ is irreducible over Q, while $[K_N : \mathbb{Q}]$ is a
power of 2 by Lemma 7.3, so by the Tower Law $\mathbb{Q}(\sqrt[3]{2})$ cannot be a subfield of
any such field $K_N$. So it can’t be done, mate – sorry.
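The irreducibility of x^3 - 2 can be checked mechanically: a cubic over Q that factorises must have a linear factor, so it is enough to rule out rational roots. The sketch below is an illustration only (the helper names are not from the notes) and assumes integer coefficients with nonzero constant term, listed from the highest degree down.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial (highest degree first),
    found by testing every candidate +-p/q from the rational root theorem."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for sign in (1, -1)}

    def evaluate(x):
        acc = Fraction(0)
        for c in coeffs:  # Horner's rule
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if evaluate(x) == 0)

def is_irreducible_cubic(coeffs):
    """A cubic over Q is irreducible exactly when it has no rational root."""
    return len(coeffs) == 4 and not rational_roots(coeffs)
```

Here `is_irreducible_cubic([1, 0, 0, -2])` holds, whereas x^3 - 8 has the rational root 2. The Tower Law step is then pure arithmetic: 3 never divides a power of 2.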

Trisecting angles: flowers for the visiting dignitary The French President
is visiting Kenilworth to celebrate Galois’ birthday. The builders have
put up a marquee and carved out a 30° angle in the earth for the corner of a
flower bed. Naturally the mayor would like a display of red, white and blue
flowers, and it’s your job to plant them equally.
It is easy to mark out an angle of 30° in the ground by ruler and compass:
we can construct rational numbers and square roots, so simply mark $(\frac{\sqrt{3}}{2}, \frac{1}{2})$
and draw a straight line from that to (0, 0). But can you divide the angle into
three equal subangles using your ruler and compass?

Figure 4: Trisecting an angle. [Diagram: the 30° angle at the origin, with the points (cos 30°, sin 30°) = (√3/2, 1/2) and (cos 10°, sin 10°) marked at distance 1.]

If you can do that, then you can certainly plot the point Q = (cos 10°, sin 10°)
by intersecting your dividing line with a circle of radius 1. But, as when finding
the roots of a real cubic using trigonometric functions in §3.3, we can use the
triple angle formula (9) or its sin(3θ) buddy

$\sin(3\theta) = 3\sin(\theta) - 4\sin^3(\theta)$

to show that x = sin 10° is a root of the polynomial $f = 4x^3 - 3x + \frac{1}{2}$. This
f is irreducible: any test from §5.3 will do; for example, $2f(x + 1) = 8x^3 + 24x^2 + 18x + 3$
is irreducible by Eisenstein’s criterion with p = 3. Therefore
[Q(sin 10°) : Q] = 3, and so sin 10° does not lie in any ruler and compass field
extension by Lemma 7.3 and the Tower Law. Sorry mayor: we can’t do that.
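The shift and the Eisenstein check are easy to get wrong by hand, so here is a small Python sketch that recomputes them. It is an illustration under stated assumptions: the function names are made up for this example, and polynomials are lists of integer coefficients with the lowest degree first.

```python
from math import comb

def shift(coeffs, a):
    """Coefficients (lowest degree first) of f(x + a), expanding each
    c * (x + a)^k with the binomial theorem."""
    out = [0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * a ** (k - j)
    return out

def eisenstein(coeffs, p):
    """Eisenstein's criterion at the prime p: p does not divide the leading
    coefficient, p divides all the others, and p^2 does not divide the
    constant term."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

two_f = [1, -6, 0, 8]          # 2f = 8x^3 - 6x + 1
shifted = shift(two_f, 1)      # coefficients of 2f(x + 1)
```

With these definitions `shifted` is `[3, 18, 24, 8]`, that is 8x^3 + 24x^2 + 18x + 3, and `eisenstein(shifted, 3)` holds, while the criterion does not apply to 2f itself at p = 3.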
If you want to be more constructive—perhaps you’re looking for promotion—
you could get back to the head gardener with a list of some constructible angles
that you can trisect such as 90◦ . That would be an exercise.

Squaring the circle: linearising the ducks OK, now you’re joking. Can
you take a circular pond and rebuild it as a square pond of exactly the same
depth (the ducks like that) and exactly the same volume? Not approximately:
make a square with exactly the same area as the original circle.
The circle has some constructible radius $r \in K_N$ for some sequence of ruler
and compass field extensions $\mathbb{Q} \subset K_N$. Its area is therefore $\pi r^2$. So the square
must have side length $r\sqrt{\pi}$. Are you seriously asking me whether $\sqrt{\pi}$ lies in a
field extension $L/K_N$ of degree [L : Q] some power of 2?
Well, prima facie it could. If it did, then we could also find an algebraic
extension containing π, since we can compute squares using ruler and compass.
But in fact π is transcendental: it is not the root of any nonzero polynomial with
rational coefficients (see [Stewart2015, §24.3]), so this cannot happen. I’d love to help,
mate, but this can’t be done either.

8 The automorphism group of a field
The first part was already in §4.4. The group of automorphisms of a field is the
main invariant, so worth reviewing.

8.1 The automorphism group


Let L be any field.
Definition 8.1. An automorphism of L is a bijective ring homomorphism
ϕ : L −→ L. (It follows easily that the inverse map is also a ring
homomorphism.) The set of all automorphisms of L forms a group:
Aut(L) = {ϕ : L −→ L | ϕ is an automorphism of L}
If K ⊂ L is any subfield, then a K-automorphism of L is an automorphism ϕ
that fixes K pointwise: ϕ(α) = α for all α ∈ K. The set of all K-automorphisms
of L forms a group:
$\mathrm{Aut}_K(L) = \{\varphi : L \to L \mid \varphi \text{ is a } K\text{-automorphism of } L\} \subset \mathrm{Aut}(L)$
Remark 8.2. If $K_0 \subset L$ is the prime subfield of L (so $K_0 = \mathbb{Q}$ or $\mathbb{F}_p$ for some
prime p > 0) and ϕ is any automorphism of L then ϕ necessarily fixes every
element of $K_0$ because ϕ(1) = 1. That is, any automorphism is automatically a
$K_0$-automorphism, and $\mathrm{Aut}(L) = \mathrm{Aut}_{K_0}(L)$.

8.2 The fixed field of a group of automorphisms


Definition 8.3. Let L be a field and σ ∈ Aut(L) any automorphism. The
fixed field of σ is the subset
$L^\sigma = \{a \in L \mid \sigma(a) = a\}$
If Σ ⊂ Aut(L) is any subset, then the fixed field of Σ is the subset
$L^\Sigma = \{a \in L \mid \sigma(a) = a \text{ for all } \sigma \in \Sigma\}$
It is easy to check that $L^\sigma \subset L$ and $L^\Sigma \subset L$ are both subfields of L. (If
$a, b \in L^\sigma$, then σ(a + b) = σ(a) + σ(b) = a + b, so $a + b \in L^\sigma$, and so on.)
Example 8.4. Let σ be complex conjugation on the field C. Then
$\mathbb{C}^\sigma = \{z \in \mathbb{C} \mid \bar{z} = z\} = \mathbb{R}$ and $[\mathbb{C} : \mathbb{C}^\sigma] = 2$
If $L = \mathbb{Q}(\alpha, i) \subset \mathbb{C}$, with $\alpha \in \mathbb{R}$, $\alpha^2 = 5$ and $i^2 = -1$, then [L : Q] = 4 and
L = {a + bα + ci + dαi | a, b, c, d ∈ Q}
Since σ takes the elements of the basis {1, α, i, αi} to elements of L, σ(L) ⊂ L
and so σ(L) = L. Thus σ ∈ Aut(L) and
$L^\sigma = \{\gamma \in L \mid \bar{\gamma} = \gamma\} = \mathbb{Q}(\alpha)$
which is the same as L ∩ R. In particular $[L : L^\sigma] = 2$.
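The second field of Example 8.4 is small enough to model directly. The following Python sketch is an illustration only, with names chosen for this example: an element a + bα + ci + dαi of Q(α, i), with α^2 = 5, is stored as the 4-tuple (a, b, c, d), and conjugation negates the last two coordinates.

```python
from fractions import Fraction as F

def mul(u, v):
    """Product in Q(alpha, i) with alpha^2 = 5 and i^2 = -1, in the basis
    1, alpha, i, alpha*i."""
    a, b, c, d = u
    e, f, g, h = v
    return (a * e + 5 * b * f - c * g - 5 * d * h,      # coefficient of 1
            a * f + b * e - c * h - d * g,              # coefficient of alpha
            a * g + c * e + 5 * b * h + 5 * d * f,      # coefficient of i
            a * h + b * g + c * f + d * e)              # coefficient of alpha*i

def conj(u):
    """Complex conjugation: fixes 1 and alpha, negates i and alpha*i."""
    a, b, c, d = u
    return (a, b, -c, -d)

def is_fixed(u):
    """True when u lies in the fixed field of conjugation."""
    return conj(u) == u
```

An element is fixed exactly when c = d = 0, which is the 2-dimensional Q-subspace Q(α) inside the 4-dimensional L, in line with [L : L^σ] = 2.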

Lemma 8.5. Suppose L is a field and G ⊂ Aut(L) is a finite subgroup. Then

$[L : L^G] \le \#G$

This is a slick piece of linear algebra, with a clever shift between coefficient
fields. We have to show that there is no $L^G$-linearly independent set of elements
of L of size > #G. The proof easily cooks up linear dependence relations over L,
and then uses G to see that there was one defined over $L^G$ all along.

Proof. Let $G = \{\sigma_1, \dots, \sigma_n\}$. Suppose $a_1, \dots, a_{n+1} \in L$. Set $K = L^G$, the fixed
field of G. We must find a K-linear dependence relation $\sum x_i a_i = 0$ with all
$x_i \in K$, but not all zero.
[Clever] Let

$$v_i = \begin{pmatrix} \sigma_1(a_i) \\ \vdots \\ \sigma_n(a_i) \end{pmatrix} \in L^n \quad \text{for } i = 1, 2, \dots, n + 1$$

These are n + 1 elements of the n-dimensional L-vector space $L^n$, so there is
certainly a linear dependence relation among them over L [note: over L not K:
that’s the price to pay for the clever trick]. Let

$x_1 v_1 + \cdots + x_k v_k = 0$

be a shortest such relation: $x_i \in L$, all $x_i \ne 0$, and without loss of generality
$x_1 = 1$ and $\sigma_1 = \mathrm{id} \in G$.
Write that relation as

$$\begin{pmatrix} \sigma_1(a_1) & \cdots & \sigma_1(a_k) \\ \vdots & & \vdots \\ \sigma_n(a_1) & \cdots & \sigma_n(a_k) \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_k \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \qquad (12)$$

That is, $(x_1, \dots, x_k)^T$ is a nontrivial solution (in fact, all $x_i \in L$ are nonzero)
of n simultaneous linear equations with coefficients in L.
We can hit the equations (12) with any σ ∈ G: that permutes the rows of the
matrix, so leaves exactly the same set of equations, but now exhibits the solution
$(\sigma(x_1), \dots, \sigma(x_k))^T$. But that must be the same solution as before: if not, then
since $\sigma(x_1) = \sigma(1) = 1 = x_1$ we could subtract it from the previous solution to
get a shorter linear dependence relation, which would be a contradiction as we
started with the shortest one.
That is $\sigma(x_i) = x_i$ for each i = 1, . . . , k and for every σ ∈ G. So each
$x_i \in L^G = K$, and the first row of (12) gives a nontrivial K-linear relation
among the $a_i$ (since $\sigma_1 = \mathrm{id}$). □

Example 8.4 gave two examples of a field L with automorphism σ with
degree $[L : L^\sigma] = 2$. Setting G = ⟨σ⟩ ⊂ Aut(L) they both had #G = 2, and so
Lemma 8.5 holds in these cases with equality. In fact, that is always the case.

Theorem 8.6. Let G ⊂ Aut(L) be a finite subgroup of the group of
automorphisms of a field L. Then $[L : L^G] = \#G$.
Proof. Let $K = L^G$. Every σ ∈ G fixes every element of K, so $G \subset \mathrm{Aut}_K(L)$.
Lemma 8.5 shows [L : K] ≤ #G. In particular, L/K is a finite extension so
we may apply Theorem 6.41 for L/K with M = L for the converse inequality

$\#G \le \#\mathrm{Aut}_K(L) = \#\mathrm{Emb}_K(L, L) \le [L : K]$ □

8.3 How does this all work for f = x^3 − 2?


Back to the beginning. Consider $f = x^3 - 2 \in \mathbb{Q}[x]$, which is irreducible: most
tests work, for example Eisenstein’s criterion with p = 2. So f has 3 distinct
roots in C, or equivalently it factorises into distinct linear factors over C: viz.

$f = (x - \alpha)(x - \alpha\omega)(x - \alpha\omega^2)$

where $\alpha = \sqrt[3]{2} \in \mathbb{R}$ and ω is a primitive cube root of unity. Let’s name the roots

$\alpha_1 = \alpha, \quad \alpha_2 = \alpha\omega, \quad \alpha_3 = \alpha\omega^2$

Smallest field over which f splits Rather than working in C, we prefer
a more economical approach. Let $L = \mathbb{Q}(\alpha_1, \alpha_2, \alpha_3) = \mathbb{Q}(\alpha, \omega)$, the smallest
subfield of C that contains all the roots of f. We have enough tools to analyse
Aut(L) and all the subfields of L in full detail.
First we get a grip on L itself. Let $K_1 = \mathbb{Q}(\alpha_1)$. We know $\mathbb{Q} \subset K_1 \subset L$ and
that $[K_1 : \mathbb{Q}] = 3$ by Proposition 6.21 since f is irreducible over Q.
Over $K_1$, the linear factor x − α splits off f, leaving the quadratic $x^2 + \alpha x + \alpha^2$,
which is irreducible over $K_1$ (since $K_1 \subset \mathbb{R}$ and the quadratic has no real roots).
The field L contains the two roots $\alpha_2, \alpha_3$ of this quadratic, so $L = K_1(\alpha_2)$
(or you may adjoin $\alpha_3$ if you prefer).
Thus $[L : K_1] = 2$ and so by the Tower Law [L : Q] = 6, and, if we need it,
L has a Q-basis $1, \alpha, \alpha^2, \omega, \alpha\omega, \alpha^2\omega$.

Some intermediate fields of L/Q We already have $K_1 = \mathbb{Q}(\alpha_1)$. Write
$K_2 = \mathbb{Q}(\alpha_2)$ and $K_3 = \mathbb{Q}(\alpha_3)$ for the subfields generated by each of the other
roots of f. (We may use the basis of L to show that $K_i \cap K_j = \mathbb{Q}$ for $i \ne j$.)
Since L = Q(α, ω), the subfield M = Q(ω) is also evident, and L = M(α).
These few fields sit in the configuration of Figure 5. We will prove that
this is the complete lattice of subfields of L. (In fact, this was optional: the
machinery below would have found these fields even if we hadn’t spotted them.)

Automorphisms of L permute the roots of f If σ ∈ Aut(L) then

f (σ(αi )) = σ(f (αi )) = σ(0) = 0 for any i = 1, 2, 3

so σ(αi ) must be one of the roots, α1 , α2 or α3 .

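Figure 5: Subfields of $L = \mathbb{Q}(\alpha_1, \alpha_2, \alpha_3) = \mathbb{Q}(\alpha, \omega)$, the splitting field of $x^3 - 2$. [Diagram: L = Q(α, ω) at the top; below it $K_1 = \mathbb{Q}(\alpha_1)$, $K_2 = \mathbb{Q}(\alpha_2)$ and $K_3 = \mathbb{Q}(\alpha_3)$, each of degree 3 over Q with L of degree 2 over it, and $M = \mathbb{Q}(\omega) = \mathbb{Q}(\sqrt{-3})$ with L of degree 3 over it.]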

Since we may express L as $L = \mathbb{Q}(\alpha_1, \alpha_2, \alpha_3)$, any automorphism σ is
completely determined by where it maps the three $\alpha_i$, that is, how it permutes the
three roots. Put differently, there is an injective group homomorphism

$\mathrm{Aut}_{\mathbb{Q}}(L) \longrightarrow S_3 = \text{permutation group on } \{1, 2, 3\}$ (13)

where $\sigma \in \mathrm{Aut}_{\mathbb{Q}}(L)$ is mapped to the permutation

$\{1, 2, 3\} \longrightarrow \{1, 2, 3\}, \quad i \mapsto j \text{ where } \sigma(\alpha_i) = \alpha_j$

So, having fixed the numbering of the roots, we may treat $\mathrm{Aut}_{\mathbb{Q}}(L) \subset S_3$ as a
subgroup of the permutation group.
What we don’t know yet is the image of (13), or, put another way, which of
the permutations of the roots can be realised by an automorphism of L. Let’s
sort that out by constructing some automorphisms.

Complex conjugation Denote complex conjugation by σ(x + iy) = x − iy.
It swaps $\alpha_2$ and $\alpha_3$ and fixes $\alpha_1 \in \mathbb{R}$. In particular, that means $\sigma \in \mathrm{Aut}_{\mathbb{Q}}(L)$,
and it acts as the permutation (23) ∈ $S_3$.

A bit of work to find another automorphism Consider the extension
L/M. We know that L = M(α), and the minimal polynomial of α over M is
$x^3 - 2 \in M[x]$ (it is irreducible over M because [L : M] = [L : Q]/[M : Q] = 6/2 = 3).
So by Corollary 6.39, elements of $\mathrm{Aut}_M(L) = \mathrm{Emb}_M(M(\alpha_1), L)$ are
determined by the roots of $x^3 - 2$ in L. Thus there is an M-homomorphism τ : L −→ L
with $\tau(\alpha_1) = \alpha_2$: that is, τ(ω) = ω and τ(α) = αω, and so it follows that also
$\tau(\alpha_2) = \tau(\alpha)\tau(\omega) = \alpha\omega^2 = \alpha_3$ and $\tau(\alpha_3) = \tau(\alpha)\tau(\omega)^2 = \alpha\omega^3 = \alpha_1$
Thus τ is the permutation (123) ∈ $S_3$.

Groups make life easy We have found permutations σ = (23) and τ = (123)
in the subgroup $\mathrm{Aut}_{\mathbb{Q}}(L) \subset S_3$. But those two permutations generate all others,
so $\mathrm{Aut}_{\mathbb{Q}}(L) = S_3$.
We know everything about $S_3$. For example, its subgroups are
$\{\mathrm{id}\}, \quad \langle\sigma\rangle \cong \mathbb{Z}/2, \quad \langle(13)\rangle \cong \mathbb{Z}/2, \quad \langle(12)\rangle \cong \mathbb{Z}/2, \quad \langle\tau\rangle \cong \mathbb{Z}/3 \quad \text{and} \quad S_3$
and they sit in a subgroup lattice as drawn in Figure 3 in §1.4. Of these, only
the two trivial ones and the Z/3 are normal subgroups: they comprise unions of
complete cycle types. The three subgroups of order 2 are conjugate: for example
$(123)(12)(321) = (23)$ shows $\langle(12)\rangle^{(123)} = \langle\sigma\rangle$
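Both claims, that (23) and (123) generate S_3 and that (123)(12)(321) = (23), can be confirmed by brute force. This Python sketch is an illustration with made-up names; it stores a permutation p of {1, 2, 3} as the tuple (p(1), p(2), p(3)) and composes right to left, which is an assumption about the convention (it is the one under which the displayed conjugation holds).

```python
IDENTITY = (1, 2, 3)
SIGMA = (1, 3, 2)    # the transposition (23)
TAU = (2, 3, 1)      # the 3-cycle (123): 1 -> 2 -> 3 -> 1

def compose(p, q):
    """The permutation 'first q, then p' (right-to-left composition)."""
    return tuple(p[q[i] - 1] for i in range(3))

def generated(gens):
    """Subgroup generated by gens: close up under left multiplication.
    In a finite group this already produces inverses."""
    seen = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

# (123)(12)(321), with (12) = (2, 1, 3) and (321) = (3, 1, 2)
conjugate = compose(TAU, compose((2, 1, 3), (3, 1, 2)))
```

Here `generated([SIGMA, TAU])` has all 6 elements, the cyclic subgroups generated by `SIGMA` and `TAU` have 2 and 3 elements, and `conjugate` equals `SIGMA`.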

Fixed fields of subgroups Without any calculation, Theorem 8.6 says
$[L : L^{\mathrm{Aut}(L)}] = \#\mathrm{Aut}(L) = \#S_3 = 3! = 6$
so the fixed field $L^{\mathrm{Aut}(L)} = \mathbb{Q}$.
Recall σ ∈ Aut(L) acts as the permutation (23) ∈ $S_3$. Let’s calculate the
fixed field $L^{\langle\sigma\rangle}$ of the subgroup $\langle\sigma\rangle \subset \mathrm{Aut}_{\mathbb{Q}}(L)$. (That is the same as $L^\sigma$, which
we may write for brevity, but in the end I would like the subgroup to be visible.)
Theorem 8.6 says $[L : L^\sigma] = \#\langle\sigma\rangle = 2$, so $[L^\sigma : \mathbb{Q}] = \dim_{\mathbb{Q}} L^\sigma = 3$ by the
Tower Law. Since $1, \alpha, \alpha^2$ are fixed by σ, all being real, we therefore have
$L^{\langle\sigma\rangle} = \{a_1 + a_2\alpha + a_3\alpha^2 \mid a_1, a_2, a_3 \in \mathbb{Q}\} = \mathbb{Q}(\alpha) = K_1$
Looking at the other subfields we happen to know, we observe that
$\sigma : K_2 \longrightarrow K_3$ and $\sigma : K_3 \longrightarrow K_2$
are isomorphisms, since $\bar{\omega} = \omega^2$, while
σ : M −→ M
is an isomorphism, but not the identity: in the Q-basis $x = 1$, $y = \sqrt{-3}$ of M, it is
reflection in the x-axis.
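What about the cycle τ = (1 2 3) that rotates the three roots
$\alpha \mapsto \alpha\omega \mapsto \alpha\omega^2 \mapsto \alpha$?
It generates a cyclic subgroup $H = \langle\tau\rangle \subset S_3$ of 3 elements, so again by the
theorem $[L : L^H] = \#H = 3$, so $[L^H : \mathbb{Q}] = 2$. But now note that $\omega \in L^H$, since
$\omega = \alpha\omega/\alpha = \alpha\omega^2/\alpha\omega = \alpha/\alpha\omega^2$ is invariant under τ
Thus $\mathbb{Q}(\omega) \subset L^H$, and so [Q(ω) : Q] = 2 implies that $L^H = \mathbb{Q}(\omega) = M$.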
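The dimensions of these fixed fields can be recomputed by linear algebra alone. In the Q-basis 1, α, α², ω, αω, α²ω of L, the maps σ and τ are Q-linear, and the fixed field of a set of maps is the common kernel of the maps T − I. The Python sketch below is an illustration only: the matrices are written out by hand from σ(ω) = ω² = −1 − ω and τ(α) = αω, and the helper names are made up for this example.

```python
from fractions import Fraction

def from_images(images):
    """6x6 matrix (list of rows) whose c-th column is the coordinate
    vector of the image of the c-th basis element."""
    return [[Fraction(images[c][r]) for c in range(6)] for r in range(6)]

# Basis order: 1, a, a^2, w, a*w, a^2*w  (a = cube root of 2, w = omega)
SIGMA = from_images([
    (1, 0, 0, 0, 0, 0),     # 1     -> 1
    (0, 1, 0, 0, 0, 0),     # a     -> a
    (0, 0, 1, 0, 0, 0),     # a^2   -> a^2
    (-1, 0, 0, -1, 0, 0),   # w     -> w^2 = -1 - w
    (0, -1, 0, 0, -1, 0),   # a*w   -> -a - a*w
    (0, 0, -1, 0, 0, -1),   # a^2*w -> -a^2 - a^2*w
])
TAU = from_images([
    (1, 0, 0, 0, 0, 0),     # 1     -> 1
    (0, 0, 0, 0, 1, 0),     # a     -> a*w
    (0, 0, -1, 0, 0, -1),   # a^2   -> a^2*w^2 = -a^2 - a^2*w
    (0, 0, 0, 1, 0, 0),     # w     -> w
    (0, -1, 0, 0, -1, 0),   # a*w   -> a*w^2 = -a - a*w
    (0, 0, 1, 0, 0, 0),     # a^2*w -> a^2*w^3 = a^2
])

def rank(rows):
    """Rank over Q by Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def fixed_dim(*maps):
    """dim over Q of the subspace fixed by every map: 6 minus the rank of
    the stacked matrices T - I."""
    stacked = [[T[r][c] - (1 if r == c else 0) for c in range(6)]
               for T in maps for r in range(6)]
    return 6 - rank(stacked)
```

With these matrices `fixed_dim(SIGMA)`, `fixed_dim(TAU)` and `fixed_dim(SIGMA, TAU)` come out as 3, 2 and 1, matching the degrees over Q of K_1, M and Q itself.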

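Figure 6: Fixed fields of subgroups of Aut(L). [Diagram: the lattice of Figure 5 with each field labelled as a fixed field: L at the top; $K_1 = L^{\langle(23)\rangle}$, $K_2 = L^{\langle(13)\rangle}$, $K_3 = L^{\langle(12)\rangle}$ and $M = L^{\langle(123)\rangle}$ in the middle; $\mathbb{Q} = L^{S_3}$ at the bottom.]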

There are no other subfields Suppose Q ⊂ N ⊂ L with [L : N] > 1 and
[N : Q] > 1; as [L : Q] = 6, these degrees are 2 and 3 in some order. Since 2 and
3 are both prime, the Tower Law also implies that L = N(β) for any β ∈ L \ N.
So by Corollary 6.39, elements of $\mathrm{Aut}_N(L) = \mathrm{Emb}_N(N(\beta), L)$ are
determined by the roots of the minimal polynomial p ∈ N[x] of β, which is an
irreducible polynomial of degree [L : N] > 1. So $H = \mathrm{Aut}_N(L) \subset \mathrm{Aut}_{\mathbb{Q}}(L)$ is a
nontrivial subgroup, and $N \subset L^H \ne L$.
That limits the possibilities for N very severely. Since $[L^H : \mathbb{Q}] \le 3$ we must
have $N = L^H$, by the Tower Law, so N is one of the intermediate fields we have
already considered, and there are no other subfields of L.

Conclusion The subfield lattice of L is identical to the subgroup lattice of
Aut(L) = $S_3$. Moreover we have a bijection that identifies them explicitly,
namely the fixed field operation

$\{\mathrm{id}\} \subset H \subset \mathrm{Aut}(L) \quad \longmapsto \quad L \supset L^H \supset \mathbb{Q} = L^{\mathrm{Aut}(L)}$

We will understand exactly when this works for other polynomials, and that is
enough power to understand when there can be a radical formula for their roots.

9 Field extensions II
9.1 Splitting fields
A splitting field for a polynomial f ∈ K[x] (not necessarily irreducible) is a
minimal field extension that contains all the roots of f , that is, where f splits
into linear factors.
Definition 9.1. An extension L/K is a splitting field for f ∈ K[x] if and
only if

(a) there are $\alpha_1, \dots, \alpha_n \in L$ and c ∈ K with f factorising as

$f = c(x - \alpha_1) \cdots (x - \alpha_n)$ into linear factors.

(b) If M ⊂ L is a subfield containing K and $\alpha_1, \dots, \alpha_n \in M$, then M = L; in other words,
$L = K(\alpha_1, \dots, \alpha_n)$.

Any such L/K is a finite extension since each $\alpha_i$ is algebraic over K, so the
Tower Law applies to $K \subset K(\alpha_1) \subset K(\alpha_1, \alpha_2) \subset \cdots \subset L$ to give [L : K] < ∞.
You can refer to L as the splitting field of f: it is implicit that L/K is an
extension, and the fact that you said ‘the’ rather than ‘a’ splitting field will be
justified soon when we prove that any two splitting fields for f over the given
field K are isomorphic. But first a point to be careful about.
Example 9.2. Let $f = x^3 - x^2 - 2x + 2 = (x^2 - 2)(x - 1) \in \mathbb{Q}[x]$. Then,
working in C, the roots of f are 1 and $\pm\sqrt{2}$, so $L = \mathbb{Q}(\sqrt{2})$ is a splitting field
for f. We could have written $L = \mathbb{Q}(1, \sqrt{2}, -\sqrt{2})$, but some of these are already
in Q, and some are in the field generated by the others, so we’ll be happy with
$\mathbb{Q}(\sqrt{2})$.
There is a nuance that is important to appreciate, but which the definition is
only clear about if you read it carefully. Consider the same polynomial f but
now regard it as an element of N[x] where $N = \mathbb{Q}(\sqrt{3})$. That has not made any
difference to f, nor to its roots in C: the $\sqrt{3}$ in N is irrelevant to the roots of f.
But now L is not a splitting field, because it is not an extension of the field
N over which f is defined. Instead, $L_2 = N(\sqrt{2}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ is a splitting
field. What is and what is not a splitting field depends crucially on the field
over which we defined f in the first place; and if we change that field, then the
splitting field may change. This is not a big deal in practice if we take care.
Example 9.3. Let $f = x^4 - 2x^2 - 15 \in \mathbb{Q}[x]$. If you spot that f factorises into
two quadratics over Q, then this is easy, but if not then maybe you spot instead
that it is a quadratic in $y = x^2$. It equals $y^2 - 2y - 15$, which has roots −3 and
5 by the quadratic formula. So we find the roots of f by solving $x^2 = -3$ and
$x^2 = 5$, namely $\pm\sqrt{-3}$ and $\pm\sqrt{5}$. So the splitting field is $L = \mathbb{Q}(\sqrt{-3}, \sqrt{5})$, the
smallest field extension of Q that contains those four roots.
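The factorisations used in Examples 9.2 and 9.3 can be double-checked by multiplying out. A minimal Python sketch, with illustrative function names and polynomials stored as coefficient lists with the lowest degree first:

```python
def polymul(f, g):
    """Product of two polynomials given as coefficient lists,
    lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def evaluate(f, x):
    """Value of the polynomial f at x."""
    return sum(c * x**k for k, c in enumerate(f))

quartic = polymul([3, 0, 1], [-5, 0, 1])   # (x^2 + 3)(x^2 - 5)
cubic = polymul([-2, 0, 1], [-1, 1])       # (x^2 - 2)(x - 1)
```

Here `quartic` is `[-15, 0, -2, 0, 1]`, that is x^4 - 2x^2 - 15, and `cubic` is `[2, -2, -1, 1]`, that is x^3 - x^2 - 2x + 2; the quadratic y^2 - 2y - 15 vanishes at y = -3 and y = 5.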

Remark 9.4. If K ⊂ C, then it is obvious that splitting fields exist over K.
Since C is algebraically closed, any f has a complete set of roots $\alpha_1, \dots, \alpha_n \in \mathbb{C}$,
and $L = K(\alpha_1, \dots, \alpha_n)$ is a splitting field for f. There’s nothing to prove:
certainly f splits into linear factors over L, and by definition L is the smallest
subfield of C that contains K and all the $\alpha_i$. In practice it is excessive to name all
the roots when defining L, as some will come for free ($x^3 - 1 \in \mathbb{Q}[x]$ has roots
$1, \omega, \omega^2 \in \mathbb{C}$, and Q ⊂ Q(ω) is a splitting field), but it’s convenient and harmless
to be profligate when we don’t have explicit values for the roots.
We prove the existence of splitting fields in complete generality using the
usual K[x]/(f) trick, without needing the sledgehammer of algebraic closure.
Proposition 9.5. Let f ∈ K[x] of degree n. Then there is a splitting field L/K
for f of degree [L : K] ≤ n!.
Proof. If f splits into linear factors in K[x] we are already done. If not, let
$g_1$ be a nonlinear irreducible factor of f: note $\deg g_1 \le n$. Then $K \subset L_1 =
K[x]/(g_1) = K(\alpha)$ is an extension that contains a root of $g_1$, namely α = the
class of x, and $[L_1 : K] = \deg g_1 \le n$. The factorisation of f into irreducibles
over K(α) is finer than over K: in particular, all K-linear factors of f are also
factors over K(α) and f has gained the linear factor x − α, so the number of
linear factors has strictly increased, and the maximum degree of an irreducible
factor is n − 1.
Repeating this process inductively, at the ith step adding at least one root
$L_{i+1} = L_i(\alpha_i)$ of an irreducible factor $g_i$ of degree ≤ n − i, so $[L_{i+1} : L_i] \le n - i$,
completes the proof: if there are s steps, the Tower Law shows

$[L : K] = [L : L_1][L_1 : K] = [L : L_{s-1}][L_{s-1} : L_{s-2}] \cdots [L_1 : K] \le (n - (s - 1)) \cdots (n - 1)n$

which is ≤ n! as claimed. □

(Non-examinable) Remark In fact, any field K has an algebraic closure
$K \subset \bar{K}$. We don’t prove this, but you can imagine somehow trying to take the
union of all algebraic field extensions. That doesn’t really make sense, because
it is hard to know how to compare two different extensions unless you already
own a big field (like C) where they all live. But it can be done. Set theory has
more powerful tools than simple induction: there is trickery with Zorn’s Lemma that
allows you to find a maximal element of particular infinite collections of field
extensions you may build, and you can show that that is an algebraic closure.
So we could have shown the existence of splitting fields for any polynomial
f ∈ K[x] for any field K using the sledgehammer of Remark 9.4 (simply adjoin
to K the roots of f in an algebraic closure of K), but the proof of Proposition 9.5
by-passes that, using much less technology. □
Theorem 9.6. Let L/K be a splitting field for f ∈ K[x]. If M/K is a field
extension in which f splits into linear factors, then there is a K-homomorphism
L → M.

Proof. We know that $L = K(\alpha_1, \dots, \alpha_s)$, where the $\alpha_i$ are (some of) the roots
of f. We work by induction on s. Let m be the minimal polynomial of $\alpha_1$ over K. By
definition, m divides f, so m splits into linear factors in M, because f does. Let
$\beta_1$ be any root of m in M. By Proposition 6.36, there is a K-homomorphism
$K(\alpha_1) \to K(\beta_1) \subset M$.
Now $L/K(\alpha_1)$ is a splitting field for $f/(x - \alpha_1) \in K(\alpha_1)[x]$, and
$L = K(\alpha_1)(\alpha_2, \dots, \alpha_s)$ is generated by fewer than s elements over $K(\alpha_1)$. So by
induction, there is a $K(\alpha_1)$-homomorphism L → M. That map is, in particular,
a K-homomorphism, and we are done. □
Corollary 9.7. Suppose L/K and L′/K are both splitting fields for f ∈ K[x].
Then there is a K-isomorphism L → L′.
Proof. By the theorem, there are (injective!) K-homomorphisms L → L′ and
L′ → L. Both fields L and L′ are finite-dimensional K-vector spaces, so they
must have the same dimension. Therefore both maps are bijections, hence are
K-isomorphisms. □

Remark 9.8. If L/K is a splitting field for f ∈ K[x], then it will be the
splitting field for many other g ∈ K[x]. For example $L = \mathbb{Q}(\sqrt{2})$ is the splitting
field for $x^2 - 2$. But by the quadratic formula it is the splitting field for any
$g = ax^2 + bx + c$ with $\Delta = b^2 - 4ac = 2n^2$ for some nonzero n ∈ Z; $x^2 - 8$, say, or for
that matter $(x^2 - 2)(x^2 - 8)$, since nobody said g had to be irreducible.
Put another way, if $\alpha + \beta\sqrt{2} \in L \setminus \mathbb{Q}$, then the product of $x - (\alpha + \beta\sqrt{2})$
with $x - (\alpha - \beta\sqrt{2})$ lies in Q[x] and has L as its splitting field.
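Spelling out the product in the last sentence may help. The following is a worked expansion, assuming α, β ∈ Q with β ≠ 0 (which is what α + β√2 ∉ Q amounts to):

```latex
\begin{align*}
\bigl(x - (\alpha + \beta\sqrt{2})\bigr)\bigl(x - (\alpha - \beta\sqrt{2})\bigr)
  &= (x - \alpha)^2 - 2\beta^2 \\
  &= x^2 - 2\alpha x + (\alpha^2 - 2\beta^2) \in \mathbb{Q}[x], \\
\Delta &= 4\alpha^2 - 4(\alpha^2 - 2\beta^2) = 8\beta^2 = 2\,(2\beta)^2 .
\end{align*}
```

So the discriminant is 2 times a nonzero rational square, which is the same shape as the condition Δ = 2n² above once denominators are cleared.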

9.2 When is an extension normal?


A splitting field L/K is designed so that some polynomial f ∈ K[x] not only
has a root in L but has all its roots in L. Something quite unexpected happens
for free: if L is a splitting field for f , then in fact any irreducible polynomial
g ∈ K[x] that has a root in L has all its roots in L.
Theorem 9.9. Let L/K be a splitting field for f ∈ K[x]. Let g ∈ K[x] be an
irreducible polynomial that has at least one root in L. Then g splits into linear
factors in L[x].

The condition that g be irreducible is clearly necessary, but only to eliminate
the obvious trivial counterexamples: $g = (x^2 + 1)(x^2 - 2)$ has a root in the
splitting field of $x^2 - 2$ but doesn’t split there.
Proof. Regard g ∈ L[x] and let M/L be a splitting field for g.
Suppose α ∈ M is a root of g (it may be in L or not) and consider the
diagram of inclusions:

    K(α) −→ L(α)
     ↗        ↗     ↘
    K  −→  L   −→   M

The key is to notice that L(α) is a splitting field over K(α) for the initial f
(harmlessly regarded as a polynomial in K(α)[x]). By the Tower Law

[L(α) : L][L : K] = [L(α) : K] = [L(α) : K(α)][K(α) : K]

If β ∈ M is another root of g, then analogous statements hold, including

[L(β) : L][L : K] = [L(β) : K] = [L(β) : K(β)][K(β) : K]

But then:
• K(α)/K and K(β)/K are isomorphic extensions, as usual: Proposition 6.36.

• K(α) −→ L(α) is a splitting field for the initial f, and so is the composition
$K(\alpha) \xrightarrow{\ \cong\ } K(\beta) \longrightarrow L(\beta)$, so there is a K(α)-isomorphism L(α) −→ L(β).
These have the respective numerical consequences:

[K(α) : K] = [K(β) : K] and [L(α) : K(α)] = [L(β) : K(β)]

So the Tower Law statements above imply at once that [L(α) : L] = [L(β) : L].
But that is enough. If some root α ∈ L, then [L(α) : L] = 1, so [L(β) : L] = 1
for any other root β, that is β ∈ L too. □

This property of splitting fields motivates the following definition.


Definition 9.10. A field extension M/K is normal if every irreducible g ∈
K[x] has the following property: if g has a root in M , then g has all its roots
in M , that is, g splits into linear factors in M [x].

Conversely, in the finite case, normal extensions are splitting fields, and
so we haven’t really generalised anything at all.
Corollary 9.11. Let L/K be a finite extension. The following are equivalent:
(a) L/K is a normal extension.

(b) L/K is the splitting field of some (not necessarily irreducible) f ∈ K[x].
Proof. Theorem 9.9 proves that (b) implies (a). The converse is trivial, as follows.
Since L/K is finite, $L = K(\alpha_1, \dots, \alpha_s)$. Let $m_i \in K[x]$ be the minimal
polynomial of $\alpha_i$. Since L is normal and $m_i$ has a root in L, $m_i$ has all its roots in L.
Thus L is the splitting field for $f = m_1 m_2 \cdots m_s$. □

It is clear how finiteness worked in the proof. In fact it is essential: for
example Q ⊂ C is normal in a trivial way, but it is certainly not the splitting
field of any polynomial (and is very very far from being finite).
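Corollary 9.12. If L/K is a finite extension, then there is a finite extension
N/L such that N/K is normal. In particular, N/L is normal too.

Proof. Again write $L = K(\alpha_1, \dots, \alpha_s)$ with $m_i \in K[x]$ the minimal polynomial
of $\alpha_i$, and write $f = m_1 m_2 \cdots m_s$. Then let N be a splitting field for f ∈ L[x].
Since the generators $\alpha_i$ of L are themselves roots of f, N is generated over K by
the roots of f, so N is also a splitting field for f over K. Thus N is normal over K
by the theorem, and a fortiori normal over L. □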
Definition 9.13. Let L/K be a finite extension. A normal closure of L/K is
an extension N/L such that N/K is normal, and N does not contain any strict
subfield containing L that is also normal over K.
The Corollary says that normal closures exist. They are clearly unique up to
K-isomorphism by its proof: each mi already has a root in L by construction,
so f = m1 m2 . . . ms had better split in any normal closure, in particular in the
smallest one, and so the closure simply is a splitting field of f .

9.3 Maps between fields II


9.3.1 Extending maps from subfields of normal extensions
Constructing maps of fields in general is hard—you can try to send generators
where you want them to go, but presumably there are relations to be respected
somewhere. It is much easier for splitting fields. The following are corollaries
of the uniqueness of splitting fields: whenever ‘finite and normal’ appears as a
hypothesis, it could have been ‘splitting field for some polynomial’.

Corollary 9.14. Let L/K be a finite normal extension and K ⊂ M ⊂ L be
an intermediate field. For any K-homomorphism ψ : M −→ L, there is a
K-automorphism $\varphi \in \mathrm{Aut}_K(L)$ extending it: $\psi = \varphi|_M$.
Proof. Suppose L is the splitting field for f over K, so in particular L is also
the splitting field for f over M. Since ψ fixes K, ψ(f) = f, so L is also a
splitting field for ψ(f), thought of in ψ(M)[x]. Now M ≅ ψ(M), and L/M and
L/ψ(M) are splitting fields for f and ψ(f), so by the uniqueness of splitting
fields (Theorem 9.6) there is an isomorphism L −→ L that restricts to ψ. □
That is great: it says that normal extensions have lots of automorphisms,
enough to move intermediate fields around as much as they would like to.

Example 9.15. Let L = Q(α, ω) where $\alpha = \sqrt[3]{2}$ and ω is a primitive cube
root of unity. L/Q is normal (and finite) because it is the splitting field of
$x^3 - 2$. Let M = Q(α) ⊂ L and $M_1 = \mathbb{Q}(\alpha\omega) \subset L$. There is certainly a
Q-homomorphism $M \longrightarrow M_1$, because both are simple extensions with the same
minimal polynomial: Proposition 6.36, as always. Corollary 9.14 says that there
is in fact a Q-automorphism of L that takes α ↦ αω.
In §8.3 we constructed these automorphisms almost by bare hands—we were
lucky that L is a simple extension of Q(α) so we could use Corollary 6.39
on homomorphisms from simple extensions. This illuminates the role of finite
normal extensions, or equivalently of splitting fields: they handle an inductive
sequence of simple extensions all in one go, which is, after all, how we proved
they exist at all.

The example is completely general: the same words prove the following.
Corollary 9.16. Let L/K be a finite normal extension. Suppose f ∈ K[x]
is irreducible and α, β are roots of f in L. Then there is ϕ ∈ AutK (L) with
ϕ(α) = β.
Proof. There is a K-homomorphism K(α) −→ K(β) by Proposition 6.36 since
α and β have the same minimal polynomial over K. Corollary 9.14 says that
this lifts to a K-automorphism of L, since L/K is a finite normal extension. □
This was what we saw hinted in Spoiler 6.37.

9.3.2 Restricting maps from fields containing normal extensions


The last minisection gave us some tools to understand how subfields behave
with respect to automorphisms of a normal extension. Looking upwards, what
about automorphisms of a bigger ‘super’field of a normal extension?
Proposition 9.17. Let L/K be a finite normal extension. Suppose M/L is a
finite extension (so M/K is also a finite extension). Then
(a) If ϕ : L −→ M is a K-homomorphism, then ϕ(L) = L.
(b) If $\tau \in \mathrm{Aut}_K(M)$, then τ(L) = L.
Proof. Let α ∈ L and let m be its minimal polynomial over K. Then

m(ϕ(α)) = ϕ(m(α)) = ϕ(0) = 0

so ϕ(α) is also a root of m, and so ϕ(α) ∈ L since L/K is normal. In other
words, ϕ takes any element α ∈ L to some element of L ⊂ M: that is ϕ(L) ⊂ L.
But L is a finite-dimensional K-vector space, by hypothesis, and ϕ is an
injective K-linear map, so ϕ(L) = L (by the rank-nullity formula).
That proves the first part; the second follows by applying the first to $\tau|_L$. □
Remark 9.18. You could summarise Proposition 9.17 by saying that
automorphisms of M are not strong enough to move L ⊂ M away from itself when
L/K is normal. That’s great, since it means we can restrict $\tau \in \mathrm{Aut}_K(M)$ to an
automorphism of L: that is, restriction to L gives a group homomorphism

$\mathrm{Aut}_K(M) \longrightarrow \mathrm{Aut}_K(L)$

If also M/K is normal, then this is surjective by Corollary 9.14.

9.4 When is an extension separable?


A rational polynomial f of degree n has n roots in C when counted with multi-
plicity, and it’s easy to check that these roots are distinct when f is irreducible.
Separability captures precisely the idea of whether the roots of an irreducible
polynomial are distinct or not over fields other than C.

Definition 9.19. A polynomial f ∈ K[x]\{0} is separable over K if and only
if it has n = deg f distinct roots in a splitting field L/K. If f is not separable
it is called inseparable.
(The uniqueness of splitting fields up to isomorphism means this definition
is independent of which splitting field L you used to track down the roots of f .)
Example 9.20. Suppose K ⊂ C and f ∈ K[x] is any polynomial. Fix deg f =
n ≥ 2, otherwise there’s nothing to discuss. Over C we know that
$f = c(x - \alpha_1)^{m_1} \cdots (x - \alpha_s)^{m_s}$
where $\alpha_1, \dots, \alpha_s$ are the distinct roots of f and $m_i$ are their multiplicities, and
$m_1 + \cdots + m_s = n$.
We claim that if f is irreducible, then every $m_i = 1$ and so s = n. Suppose
not: suppose $m_1 \ge 2$ and write $f = (x - \alpha_1)^{m_1} g$ for some g ∈ C[x]. Consider the
derivative f′ = df/dx of f and let h = gcd(f, f′) ∈ K[x]. Since f is irreducible
h = 1 or h = f (up to a scalar multiple). But h also divides f′, which has
smaller degree than f, so it must be that h = 1.
That’s all fine, but watch this:
$f' = m_1(x - \alpha_1)^{m_1 - 1} g + (x - \alpha_1)^{m_1} g'$
which has a factor of $(x - \alpha_1)$ since $m_1 \ge 2$. In other words, $x - \alpha_1$ divides h
as complex polynomials, but that’s nonsense since we just saw that h = 1.
In short: if K ⊂ C, then every irreducible polynomial f ∈ K[x] is separable over K.
At first sight, that looks like a cast-iron proof that all irreducible polynomials
are separable, but there is a loophole.
Example 9.21. First an example that doesn’t really address the point, but
contains the clue: consider f = x3 − 2 = x3 + 1 ∈ F3 [x], where the coefficient
ring is the field F3 = {0, 1, 2} of 3 elements (and therefore characteristic 3). Its
derivative is 3x2 , which is the zero polynomial since 3 = 0 in F3 . This vanishing
derivative is the loophole that allows one to get past the GCD calculation of
Example 9.20. By multiplying out you can check that f = (x + 1)3 so the 3
roots of f are not distinct. This is a fine example of an inseparable polynomial,
but it is merely a power of a polynomial in F3 [x] in disguise so not so brilliant.
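If you would rather not multiply out by hand, here is a minimal Python sketch of the same check. The representation (coefficient lists, lowest degree first) and the helper names `poly_mul` and `formal_derivative` are choices made for this illustration only.

```python
# Polynomials over F_p are lists of coefficients, lowest degree first.
P = 3

def poly_mul(f, g, p=P):
    """Multiply two coefficient lists modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def formal_derivative(f, p=P):
    """Termwise derivative: the coefficient of x^(i-1) is i * a_i mod p."""
    return [(i * a) % p for i, a in enumerate(f)][1:]

x_plus_1 = [1, 1]                                    # 1 + x
cube = poly_mul(poly_mul(x_plus_1, x_plus_1), x_plus_1)
f = [(-2) % P, 0, 0, 1]                              # x^3 - 2, reduced mod 3

print(cube)                    # coefficients of (x + 1)^3 in F_3[x]
print(f)                       # coefficients of x^3 - 2 in F_3[x]
print(formal_derivative(f))    # coefficients of the derivative in F_3[x]
```

The first two printed lists agree, and the derivative has only zero coefficients.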
We don’t find irreducible inseparable polynomials with such easy examples
(we’ll see why later), but the clue is there, and because of this all books have
the same bona fide example of an inseparable polynomial. Let’s do it.
Let $K = \mathbb{F}_p(t)$, where t is a variable and p ≥ 3 is an odd prime: that is, K
is the field of all rational functions u(t)/v(t), where u, v are polynomials with
coefficients in $\mathbb{F}_p$. The example is $f = x^p - t \in K[x]$.
Consider a field extension L = K(α) where α is a root of f: that is, $\alpha^p = t$.
Watch this: in L[x] we calculate
$$\begin{aligned}
(x - \alpha)^p &= x^p - p x^{p-1}\alpha + \tfrac{p(p-1)}{2} x^{p-2}\alpha^2 + \cdots + p x \alpha^{p-1} - \alpha^p \\
&= x^p - \alpha^p \\
&= x^p - t = f
\end{aligned}$$

where all the middle terms · · · of the binomial expansion vanish because their
coefficients include a factor of p. So in fact, f has exactly one root in L, and
that root has multiplicity p.
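The vanishing of the middle binomial coefficients modulo p is easy to test numerically. The following is only a sanity check for a few small primes, not a proof; the function name is made up for this sketch.

```python
from math import comb

def middle_binomials_vanish(p):
    """True if p divides C(p, k) for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

for p in (3, 5, 7, 11):
    print(p, middle_binomials_vanish(p))

# Primality matters: 4 does not divide C(4, 2) = 6.
print(4, middle_binomials_vanish(4))
```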
The crucial point of this example is that f is irreducible over K. That is
elementary to check. It’s good to know that this example exists, but let’s leave
that check as an optional exercise. (Hint: if $(x - \alpha)^r \in K[x]$ for some $1 \le r < p$,
then $\alpha^r \in K$, so $\alpha^{ar+bp} \in K$ for all a, b ∈ Z. You can deduce that α ∈ K, so
α = u(t)/v(t), and now taking pth powers gives $0 = t v^p - u^p \in \mathbb{F}_p[t]$, which you
should explain is impossible.)
Definition 9.22. Let L/K be an extension.
(a) An element α ∈ L is separable over K if its minimal polynomial is
separable over K.
(b) The extension L/K is separable if every α ∈ L is separable over K.
There are two things to know.
Theorem 9.23. Let f ∈ K[x] be an irreducible polynomial. Then f is inseparable
over K if and only if char K = p > 0 and $f = a_0 + a_1 x^p + a_2 x^{2p} + \cdots + a_n x^{np}$.
That statement needs reading with a little care: it only talks about irreducible
polynomials, and it is not true that every polynomial of that form is
irreducible: for example, $1 + x^3 \in \mathbb{F}_3[x]$ has that form but is not
irreducible; it equals $(1 + x)^3$, so the theorem says nothing about it. The proof
in §9.4.1 is elementary once we set things up correctly and you should not think
of this as a great mystery.
Theorem 9.24. Let L = K(α1 , . . . , αs ) be an extension of K with each αi
separable over K. Then the extension L/K is separable.
The proof of this is not hard, but would be yet another small diversion; a
nice way to proceed is to characterise separability as another kind of integral
degree [L : K]sep , and then prove a tower law for that, after which the theorem
follows. In fact, we do not need this theorem in the rest of the module so we
omit the proof, but you may use the result freely in your thinking.

9.4.1 Characterisation of separable polynomials


This would be easy if we could follow the calculation of Example 9.20, but that
uses differentiation, and that in turn involves analysis (in particular, limits)
in R or C. However, for the purposes of working with polynomials, that
dependence on analysis is an illusion. We may define a surrogate differentiation
in formal algebraic terms, and check it has the properties we need to emulate
Example 9.20.
Definition 9.25. Let K be any field and $f = a_0 + a_1 x + \cdots + a_n x^n \in K[x]$.
Then the formal derivative of f is the polynomial

$$Df = a_1 + 2a_2 x + \cdots + n a_n x^{n-1} \in K[x]$$

The rules of differentiation (in particular, linearity and the product rule)
hold for D (applied only to polynomials), as one checks directly from the
definition. Thus we may turn Example 9.20 into a general statement over any field.
Lemma 9.26. Let K be a field and f ∈ K[x] \ {0}. Then f is separable over
K if and only if f and Df are coprime in K[x].

In other words, f is inseparable if and only if f and Df have a common
factor h ∈ K[x] with deg h ≥ 1.
Proof. Let L/K be a splitting field for f. Suppose f is separable; then in L[x]
it factorises as

$$f = c\prod_{i=1}^{n}(x - \alpha_i) \quad\text{for distinct } \alpha_1, \dots, \alpha_n \in L$$

By the rules of differentiation

$$Df = c\sum_{j=1}^{n}\prod_{i \ne j}(x - \alpha_i)$$

Any factor $x - \alpha_i$ of f divides all summands of Df except the ith one, with
which it shares no common linear factor, and so f and Df are coprime in L[x].
If they had a nontrivial common factor in K[x], that would also be a common factor
in L[x], and so they are coprime in K[x] as claimed.
If f is not separable, then it has a multiple root α ∈ L. Thus $f = (x - \alpha)^m g \in L[x]$
for some m ≥ 2. By the product rule,

$$Df = m(x - \alpha)^{m-1} g + (x - \alpha)^m Dg$$

which has a common factor of at least x − α with f. If f and Df were coprime
in K[x], there would be u, v ∈ K[x] with

uf + vDf = 1

Treating this as an equality in L[x], the left-hand side has a factor of x − α, so
evaluating at α gives the contradiction 0 = 1. □
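Lemma 9.26 turns separability into a finite computation: run the Euclidean algorithm on f and Df. Here is a minimal Python sketch over the prime field $\mathbb{F}_p$ only; the list representation (lowest degree first) and all function names are assumptions made for this illustration.

```python
def trim(f):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def derivative(f, p):
    """Formal derivative of f over F_p."""
    return trim([(i * a) % p for i, a in enumerate(f)][1:])

def poly_mod(f, g, p):
    """Remainder of f on division by a nonzero g over F_p."""
    f = trim([a % p for a in f])
    g = trim([a % p for a in g])
    inv = pow(g[-1], -1, p)            # inverse of the leading coefficient
    while len(f) >= len(g):
        c = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - c * b) % p
        f = trim(f)
    return f

def poly_gcd(f, g, p):
    """A gcd of f and g over F_p (not normalised to be monic)."""
    f, g = trim([a % p for a in f]), trim([a % p for a in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    return f

def is_separable(f, p):
    """Lemma 9.26: f is separable iff gcd(f, Df) is a nonzero constant."""
    return len(poly_gcd(f, derivative(f, p), p)) == 1

print(is_separable([1, 0, 0, 1], 3))   # x^3 + 1 over F_3
print(is_separable([1, 0, 1], 3))      # x^2 + 1 over F_3
print(is_separable([0, 1, 2, 1], 5))   # x(x + 1)^2 over F_5
```

Note that when Df is the zero polynomial the gcd is f itself, which is exactly the loophole of Example 9.21.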
The proof of Theorem 9.23 follows at once:

deg Df < deg f and f is irreducible

so they are coprime unless Df is identically zero. The monomials $x^i$ are linearly
independent in K[x], so Df is the zero polynomial if and only if every coefficient
$i a_i$ of Df is zero. For this to happen, either $a_i$ was zero all along, or i ∈ K is zero,
which is to say that char K divides i. That is the statement of Theorem 9.23.
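A few sample derivatives may make the condition concrete. The polynomials below are chosen only for illustration: the first two have coefficients in $\mathbb{F}_3$, and the third is the example $f = x^p - t$ over $\mathbb{F}_p(t)$ from above.

```latex
% Illustration only: char K = 3 in the first two lines, char K = p in the third.
\begin{align*}
f &= 1 + x^3 + 2x^4 &\implies Df &= 3x^2 + 8x^3 = 2x^3 \neq 0 \\
f &= 1 + 2x^3 + x^6 &\implies Df &= 6x^2 + 6x^5 = 0 \\
f &= x^p - t        &\implies Df &= p\,x^{p-1} = 0
\end{align*}
```

Only polynomials in $x^p$ have vanishing derivative. The second line is $(1 + x^3)^2$, so it is not irreducible and Theorem 9.23 does not apply to it.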

10 Galois Theory
We make a small change of language, as does the rest of the world, in honour
of Évariste Galois.
Definition 10.1. Let L/K be a field extension. The Galois group of L/K is
simply another name for the group of K-automorphisms of L. It is denoted

Gal(L/K) = AutK (L)

This group contains a lot of information about the structure of the extension
L/K, even though algebraically it is a much simpler object. This section gives
a first idea of its power and significance. Polynomials relate to field theory via
the splitting field as follows.
Definition 10.2. Let f ∈ K[x] be a separable polynomial. Choose a splitting
field L of f , that is, a minimal field extension K ⊂ L in which f has all its roots.
The Galois group of f is by definition the Galois group of the extension L/K,
and is denoted
Gal(f ) = AutK (L)
Since the splitting field is defined only up to isomorphism, the group Gal(f ) is
also defined only up to isomorphism.
The Galois group is an invariant of f that contains information about the
symmetries of its roots. It is the most important invariant of a polynomial we
consider. The main point to make is that this is a very simple and concrete
invariant, even though it contains a lot of information and seems hard to
calculate at first. Let me explain.
Definition 10.3. A subgroup $H \subset S_n$ of the symmetric group on n symbols
{1, 2, . . . , n} is transitive if for any pair of symbols i ≠ j there is some σ ∈ H
such that σ(i) = j.
For example, $\mathbb{Z}/n \cong H = \langle\sigma\rangle \subset S_n$ with σ = (123 · · · n) is transitive:
$\sigma^{j-i}(i) = j$. By contrast, $\mathbb{Z}/2 \cong H = \langle(12)\rangle \subset S_n$ is not transitive for any
n ≥ 3, since no element of H can send 3 to anything other than itself.
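Transitivity is easy to test by machine for small groups. The sketch below is a hypothetical helper set, not part of the module: permutations are tuples acting on the symbols 0, ..., n−1 rather than 1, ..., n, and a subgroup is generated by repeatedly multiplying by the generators.

```python
def compose(s, t):
    """Composite permutation: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens):
    """All elements of the subgroup of S_n generated by gens."""
    n = len(gens[0])
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        h = frontier.pop()
        for g in gens:
            new = compose(g, h)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def is_transitive(group):
    """For a group, transitive iff the orbit of symbol 0 is everything."""
    n = len(next(iter(group)))
    return {g[0] for g in group} == set(range(n))

def cycle(n):
    """The n-cycle (1 2 ... n), written on symbols 0..n-1."""
    return tuple((i + 1) % n for i in range(n))

def swap01(n):
    """The transposition (1 2), written on symbols 0..n-1."""
    return (1, 0) + tuple(range(2, n))

print(is_transitive(generate([cycle(5)])))     # cyclic group of order 5 in S_5
print(is_transitive(generate([swap01(3)])))    # <(12)> inside S_3
```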
Lemma 10.4. Let f ∈ K[x] be an irreducible separable polynomial of degree n.
Then Gal(f ) is isomorphic to a transitive subgroup of the symmetric group Sn .
In explicit terms, if α1 , . . . , αn ∈ L are the roots of f in a splitting field L,
then each nontrivial element σ ∈ Gal(f ) permutes these roots in a nontrivial
way, and so permutes the indices {1, 2, . . . , n}. This gives an injective group
homomorphism $\mathrm{Gal}(f) \to S_n$. Furthermore, given any two indices i ≠ j there
is some σ ∈ Gal(f ) so that $\sigma(\alpha_i) = \alpha_j$.
Proof. The explicit statement is almost the whole proof: L = K(α1 , . . . , αn ),
and as in Corollary 6.39 any K-automorphism must map roots to roots; and if
it does not permute them nontrivially, then every element of L remains fixed
and the automorphism is the identity.

The last claim follows since splitting fields L/K are normal, so the automatic
isomorphism of simple extensions K(αi ) → K(αj ) (Proposition 6.36) lifts to an
automorphism of L by Corollary 9.14. 
This is wonderful. You understood subgroups of symmetric groups almost
as soon as you met permutations, and transitive subgroups are very particular
ones.
Remark 10.5. Suppose f is an irreducible cubic polynomial with integer coef-
ficients (so it is automatically separable). The lemma says that its Galois group
is a transitive subgroup of $S_3$. There are only two such groups: the cyclic group
$\langle(123)\rangle$ and $S_3$ itself. It follows at once from the Tower Law that the two cases
correspond to: (1) adjoining a root is enough to split f ; (2) adjoining a root
leaves an irreducible quadratic. The discriminant of the cubic determines which
case you are in without further work. The symmetries of the field extension will
then assist with finding the radicals that appear in a formula for the roots.
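The remark does not spell out the discriminant test, so here is a sketch using the standard criterion, which is an assumption imported from outside these notes: for an irreducible cubic over Q, the Galois group is the cyclic group of order 3 exactly when the discriminant is a square in Q. For a depressed cubic $x^3 + px + q$ the discriminant is $-4p^3 - 27q^2$. The function names are made up, and irreducibility is assumed rather than checked.

```python
from math import isqrt

def cubic_discriminant(p, q):
    """Discriminant of the depressed cubic x^3 + p*x + q."""
    return -4 * p**3 - 27 * q**2

def is_integer_square(n):
    """True if the integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

def galois_group_of_irreducible_cubic(p, q):
    """Assumes x^3 + p*x + q is irreducible over Q, with integers p and q."""
    d = cubic_discriminant(p, q)
    return "cyclic of order 3" if is_integer_square(d) else "S3"

print(galois_group_of_irreducible_cubic(0, -2))   # x^3 - 2
print(galois_group_of_irreducible_cubic(-3, 1))   # x^3 - 3x + 1
```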
When we come to the quintic we must analyse the transitive subgroups
of $S_5$. There are five up to conjugacy, and some will permit radical solutions
while others will not. We will easily find polynomials whose Galois groups realise
each of these five transitive subgroups. (Given a transitive subgroup $H \subset S_n$, it is
a hard open problem in general to exhibit a polynomial f with integer coefficients and
$\mathrm{Gal}(f) \cong H$. That is the inverse Galois problem and is another story.)
Although the language we use did not exist 200 years ago, Galois was using
this idea explicitly, thinking of permuting roots of polynomials in the complex
numbers, and relating it with razor-sharp precision to the solvability of polyno-
mials.

10.1 Galois extensions


Definition 10.6. An extension L/K is Galois if it is the splitting field of a
separable polynomial f ∈ K[x]. Note that f may be reducible.
Remark 10.7. It is common to define Galois extensions as those that are
finite, normal and separable. That’s fine, and it defines exactly the same class
of extensions as we have done. The catch is that we would need to prove
Theorem 9.24, and we did not do that.
Lemma 10.8. If L/K is Galois and K ⊂ M ⊂ L an intermediate field, then
L/M is Galois.
Proof. If L is the splitting field of f ∈ K[x], then it is also the splitting field of f
regarded as a polynomial in M [x]. Obviously the factorisation of f over L does
not change, and it had distinct roots in L by the separability assumption. 
Example 10.9. In contrast, in the notation of the lemma, M/K does not
need to be Galois. You can remember how this works by looking through the
subfields of L = Q(α, ω) in §8.3: Q(α)/Q is not a splitting field as x3 − 2 has

a root but not all its roots. On the other hand Q(ω) is the splitting field of
x2 + x + 1 over Q: so Q(ω)/Q is a normal extension, which, as we shall see
in Proposition 10.13, relates to the fact that it is the fixed field of a normal
subgroup $\langle(123)\rangle \subset \mathrm{Aut}(L)$.
Theorem 10.10. Let L/K be a Galois extension. Then

[L : K] = # Gal(L/K)

Proof. This is by induction on n = [L : K]. When n = 1, then L = K and there
is exactly 1 K-automorphism, namely the identity. So suppose n > 1.
Let L be the splitting field of f ∈ K[x] with deg f = d > 1, and fix α ∈
L \ K, a root of f . Let m ∈ K[x] be the minimal polynomial of α, which is an
irreducible factor of f . We break up the extension by an intermediate field

K ⊂ K(α) ⊂ L

and we handle L/K(α) by induction and K(α)/K by hand.

Step 1 L/K(α) is Galois by Lemma 10.8, so by induction

[L : K(α)] = # Gal(L/K(α)) (14)

Step 2 Since f is separable, it has distinct roots in L, and so the minimal
polynomial m also has distinct roots in L as it divides f. Thus Proposition 6.21
and Corollary 6.39 give

[K(α) : K] = deg m = # EmbK (K(α), L) (15)

Step 3 Restricting an automorphism of L from domain L to K(α) gives a
map K(α) → L: that is, there is a map of sets (the codomain is not a group)

$$\mathrm{res}_\alpha \colon \mathrm{Gal}(L/K) \to \mathrm{Emb}_K(K(\alpha), L), \qquad \sigma \mapsto \sigma|_{K(\alpha)}$$

We show that this map is surjective and that the fibres of the map are the (left)
cosets of Gal(L/K(α)) ⊂ Gal(L/K).
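Given ι : K(α) → L, there is a K-automorphism σ of L that restricts to ι
by Corollary 9.14, since L/K is normal. Thus $\mathrm{res}_\alpha$ is surjective. Consider any
other τ ∈ Gal(L/K) that restricts to ι. The map $\tau^{-1}\sigma$ is the identity on K(α),
so $\tau^{-1}\sigma \in \mathrm{Gal}(L/K(\alpha))$. In other words, τ and σ lie in the same left coset;
conversely, any two elements of the same left coset have the same restriction to K(α). So

$$\frac{\#\mathrm{Gal}(L/K)}{\#\mathrm{Gal}(L/K(\alpha))} = \#\left(\frac{\mathrm{Gal}(L/K)}{\mathrm{Gal}(L/K(\alpha))}\right) = \#\mathrm{Emb}_K(K(\alpha), L) \tag{16}$$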

Step 4 Putting the different equalities together gives
# Gal(L/K) = # Gal(L/K(α)) · # EmbK (K(α), L) by (16)
= [L : K(α)][K(α) : K] by (14) and (15)
= [L : K] by the Tower Law
as claimed. 
You could think of Step 2 as a central point of the theory: a dimension in
linear algebra is equal to a certain number of field maps; that is, the degree of
a minimal polynomial has two quite different interpretations. That is also the
point where we use separability in an essential way.
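To see the four steps with actual numbers, take the running example of §8.3, assuming the notation used there: K = Q, f = x³ − 2, α the real cube root of 2 and ω a primitive cube root of unity, so that L = Q(α, ω).

```latex
% Worked instance: K = \mathbb{Q}, f = x^3 - 2, \alpha = \sqrt[3]{2}, L = \mathbb{Q}(\alpha, \omega)
\begin{align*}
[K(\alpha) : K] &= \deg(x^3 - 2) = 3 = \#\mathrm{Emb}_K(K(\alpha), L)
  && (\alpha \mapsto \alpha,\ \omega\alpha,\ \omega^2\alpha) \\
[L : K(\alpha)] &= 2 = \#\mathrm{Gal}(L/K(\alpha))
  && (\omega \mapsto \omega,\ \omega^2) \\
\#\mathrm{Gal}(L/K) &= 2 \cdot 3 = 6 = [L : K]
\end{align*}
```

The three embeddings of K(α) label the three left cosets of the order 2 subgroup Gal(L/K(α)) in a group of order 6.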
Corollary 10.11. Let L/K be a Galois extension. Then the fixed field $L^G$ of
G = Gal(L/K) is $L^G = K$.
Proof. We have $K \subset L^G \subset L$, with $[L : K] = \#G = [L : L^G]$ by Theorems 10.10
and 8.6, so $L^G = K$ by the Tower Law. □
Remark 10.12. There is a curious point in Step 3 of the proof of Theorem 10.10.
The subgroup Gal(L/K(α)) ⊂ Gal(L/K) is not always a normal
subgroup, and so we did not identify the set of cosets Gal(L/K)/ Gal(L/K(α))
as a group: fortunately we only needed to count it, and $\mathrm{Emb}_K(K(\alpha), L)$ is a
natural set in bijection with it, so we could count that instead. We gain
something by understanding this point a little better (compare with Example 10.9).
Proposition 10.13. Let L/K be a Galois extension and K ⊂ M ⊂ L an
intermediate field. Then M/K is a normal extension if and only if Gal(L/M ) ⊂
Gal(L/K) is a normal subgroup.
This is one of those proofs that is a bit fiddly to read, and is actually rather
easier to prove yourself: take it calmly, write down what you need to check for
each of the implications, and then note that the tools we already have confirm
all the points you mentioned: there is nothing new in this proof.
Proof. Let G = Gal(L/K) and H = Gal(L/M ).
If M/K is normal, then σ(M ) = M for any σ ∈ G by Proposition 9.17(b).
Suppose h ∈ H and σ ∈ G. For any α ∈ M, let $\beta = \sigma^{-1}(\alpha) \in M$ so that
$$\sigma h \sigma^{-1}(\alpha) = \sigma h(\beta) = \sigma(\beta) = \alpha$$
Thus $\sigma h \sigma^{-1} \in G$ fixes M, whence $\sigma h \sigma^{-1} \in H$. That is, H ⊂ G is normal.
Conversely, suppose H ⊂ G is normal. Let α ∈ M with minimal polynomial
g ∈ K[x]; we must prove that g splits completely in M. If β is any root of g,
then there is some σ ∈ G with β = σ(α) by Corollary 9.16. For any h ∈ H,
certainly h(α) = α, so
$$\sigma h \sigma^{-1}(\beta) = \sigma h(\alpha) = \sigma(\alpha) = \beta$$
By normality, $\sigma H \sigma^{-1} = H$, so β is fixed by every element of H, and so $\beta \in L^H$.
Since L/M is Galois (Lemma 10.8), Corollary 10.11 shows that $\beta \in L^H = M$,
and so g splits over M, and M/K is normal. □

10.2 Fixed fields and stabilisers
This section has almost no content, which makes it hard to absorb, but it does
set up the language; keep in mind the worked example in §8.3.
Look at the lattice of subfields of Q(α, ω) in Figure 6 and the lattice of
subgroups of S3 in Figure 5 in §8.3, and then consider the following ridiculous
pronouncement: a lattice is a collection of vertices (usually labelled by some-
thing) with a collection of directed lines joining some pairs of vertices (usually
indicating some relation between the vertex labels). You could turn that into
something precise and general, but the informal idea of the two pictures is the
powerful idea we need, and the algebra will carry all the precision.

10.2.1 The subfield lattice


Let L/K be a finite extension. The subfield lattice of L/K is the set of all
intermediate fields

$$\mathcal{F}_{L/K} = \{M \subset L \mid M \text{ is a field and } M/K \text{ is an extension}\}$$

partially ordered by inclusion. We abbreviate $\mathcal{F}_{L/K}$ by $\mathcal{F}$ if the extension is
clear.
We draw this as a lattice, with the fields M labelling the vertices and lines
between vertices indicating inclusions. If at all possible, we draw it so that
inclusions go up the page. Figure 2 in §1.4 is a picture of the subfield lattice of
our usual Q(α, ω).
The group Gal(L/K) acts on FL/K in the following sense: if σ ∈ Gal(L/K)
and M ⊂ L is an extension of K, then σ(M ) ⊂ L is also an extension of K.

10.2.2 The subgroup lattice


Let G be a finite group. The subgroup lattice of G is the set of all subgroups

$$\mathcal{G}_G = \{H \subset G \mid H \text{ is a subgroup of } G\}$$

partially ordered by inclusion. We abbreviate $\mathcal{G}_G$ by $\mathcal{G}$ if the group is clear.


We draw this as a lattice, with the subgroups H labelling the vertices and
lines between vertices indicating inclusions. If at all possible, we draw it so
that inclusions go down the page. Figure 3 in §1.4 is a picture of the subgroup
lattice of S3 .
The group G acts on GG in the following sense: if σ ∈ G and H ⊂ G is a
subgroup of G, then σHσ −1 ⊂ G is also a subgroup of G.
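A small experiment shows the conjugation action in the familiar case G = S3. This is a sketch only: permutations are tuples on the symbols 0, 1, 2, and the helper names are invented for the illustration.

```python
from itertools import permutations

def compose(s, t):
    """Composite permutation: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def conjugate(sigma, H):
    """The subgroup sigma H sigma^{-1}, as a frozenset of permutations."""
    s_inv = inverse(sigma)
    return frozenset(compose(compose(sigma, h), s_inv) for h in H)

S3 = list(permutations(range(3)))
identity = (0, 1, 2)
H_swap = frozenset({identity, (1, 0, 2)})              # <(12)>
H_rot = frozenset({identity, (1, 2, 0), (2, 0, 1)})    # <(123)>

orbit_swap = {conjugate(s, H_swap) for s in S3}
orbit_rot = {conjugate(s, H_rot) for s in S3}
print(len(orbit_swap), len(orbit_rot))
```

The subgroup generated by a transposition is moved around among three conjugates, while the rotation subgroup is sent to itself by every σ: that is the group-theoretic shadow of Example 10.9.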

10.2.3 Order-reversing maps between the two lattices


Let L/K be an extension and denote Gal(L/K) by G. Recall from Definition 8.3
that if H ⊂ G, then the fixed field of H is

$$L^H = \{\alpha \in L \mid \sigma(\alpha) = \alpha \text{ for all } \sigma \in H\}$$

Definition 10.14. The stabiliser of α ∈ L in G is

StabG (α) = {σ ∈ G | σ(α) = α}

The stabiliser of A ⊂ L in G is

StabG (A) = {σ ∈ G | σ(α) = α for all α ∈ A}

Notation 10.15. When H ⊂ G is a subgroup, we have the special notation
$H^\dagger = L^H$. We treat † as a map $H \mapsto H^\dagger$

$$\dagger \colon \mathcal{G}_G \to \mathcal{F}_{L/K}$$

When K ⊂ M ⊂ L is an intermediate field, we have the special notation
$M^* = \mathrm{Stab}_G(M)$. We treat ∗ as a map $M \mapsto M^*$

$$* \colon \mathcal{F}_{L/K} \to \mathcal{G}_G$$

The two maps † and ∗ are order-reversing. The following points are immedi-
ate from the definition, and are famous for being easier and quicker to explain
to yourself than to read someone else’s attempt at a proof.
Lemma 10.16. Let L/K be a field extension and G = Gal(L/K).

(a) If $H_1 \subset H_2 \subset G$ then $H_2^\dagger \subset H_1^\dagger$.
(b) If $K \subset M_1 \subset M_2 \subset L$ then $M_2^* \subset M_1^*$.
(c) If H ⊂ G then $H \subset H^{\dagger *}$.
(d) If K ⊂ M ⊂ L then $M \subset M^{*\dagger}$.

Proof. An element $\alpha \in H_2^\dagger$ is fixed by every element of $H_2$, so a fortiori by all
elements of $H_1$, thus $\alpha \in H_1^\dagger$.
An element $h \in M_2^*$ fixes every element of $M_2$, so a fortiori fixes all elements
of $M_1$, thus $h \in M_1^*$.
If h ∈ H then $h \in \mathrm{Gal}(L/H^\dagger)$, so $h \in H^{\dagger *}$.
If α ∈ M, then σ(α) = α for all $\sigma \in \mathrm{Gal}(L/M) = M^*$. That is, $\alpha \in M^{*\dagger}$. □

10.3 The Galois Correspondence


Theorem 10.17. Let L/K be a Galois extension. Let F be the subfield lattice
of L/K and G be the subgroup lattice of Gal(L/K).
(a) The maps ∗ and † are mutual inverses that define an inclusion-reversing
bijection F ←→ G.
(b) If K ⊂ M ⊂ L is an intermediate field, then

[L : M ] = #M ∗ and [M : K] = # Gal(L/K)/#M ∗

(c) M/K is a normal extension if and only if $M^* \subset \mathrm{Gal}(L/K)$ is a normal
subgroup. In this case $\mathrm{Gal}(M/K) \cong \mathrm{Gal}(L/K)/M^*$.
Remark 10.18. By definition, M ∗ = Gal(L/M ), but the statement is about
the map ∗ so we use that notation. In (b), we could write Q = Gal(L/K)/M ∗
for the set of left cosets, as usual, and then [M : K] = #Q, but it feels safer to
avoid the quotient notation until (c) where we have normality so the quotient
is in fact a group.

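[Figure 7: Galois Correspondence at a normal intermediate field K ⊂ M ⊂ L.
The tower of fields K ⊂ M ⊂ L is drawn beside the chain of groups
G = Gal(L/K) ⊃ M^* = {σ ∈ G | σ|_M = id_M} ⊃ {id}. Labels: the step M ⊂ L has
degree #M^*; the step K ⊂ M is normal with Gal(M/K) ≅ G/M^*; and [L : K] = #G.]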

Proof. Let $M \in \mathcal{F}$ be an intermediate field; L/M is Galois by Lemma 10.8.
Then $[L : (M^*)^\dagger] = \#M^*$ by Theorem 8.6 and $\#M^* = [L : M]$ by Theorem 10.10,
so the self-evident inclusion $M \subset M^{*\dagger}$ of Lemma 10.16 is an equality.
Let $H \in \mathcal{G}$ be a subgroup; the extension $L/H^\dagger$ is Galois by Lemma 10.8.
Then $\#H = [L : H^\dagger]$ by Theorem 8.6 and $[L : H^\dagger] = \#\mathrm{Gal}(L/H^\dagger) = \#(H^\dagger)^*$
by Theorem 10.10, so the self-evident inclusion $H \subset H^{\dagger *}$ of Lemma 10.16 is an
equality.
Together, those prove (a). For (b), the first claim follows by Theorem 10.10
(as already used in (a)), and the second by that and the Tower Law.
The main claim of (c) is Proposition 10.13. Since M/K is normal, any
σ ∈ Gal(L/K) has σ(M ) = M by Proposition 9.17. So there is a group homomorphism
$$\mathrm{Gal}(L/K) \to \mathrm{Gal}(M/K), \qquad \sigma \mapsto \sigma|_M$$
It is surjective by Corollary 9.14 (L/K is normal, so any K-automorphism of
M lifts to one of L) and the kernel is $M^*$, so the isomorphism follows from the
first isomorphism theorem. □

11 Biquadratic extensions
Pick a field, any field, call it K, but be sure that char K ≠ 2. Let a, b ∈ K
with $b(a^2 - b) \ne 0$ and b ∈ K not a square in K. Let

$$f = (x^2 - a)^2 - b = x^4 - 2ax^2 + (a^2 - b) \in K[x]$$

and L/K be a splitting field of f . Such L/K are called biquadratic extensions.
Given such an f we would know how to find its roots: taking a square root
(by making a quadratic extension of K)

$$x^2 - a = \pm\beta \quad\text{where } \beta^2 = b$$

and then after rearranging we take further square roots (possibly making more
quadratic extensions) to solve

$$x^2 = a + \beta \quad\text{and}\quad x^2 = a - \beta$$

Thus the 4 roots are

$$\alpha,\ -\alpha,\ \alpha',\ -\alpha' \quad\text{where } \alpha^2 = a + \beta \text{ and } (\alpha')^2 = a - \beta$$

and L = K(α, α′). (It’s reassuring to check that f (±α) = f (±α′) = 0.) You
can check that α, −α, α′, −α′ ∈ L are four distinct elements (char K ≠ 2) so
that L/K is a Galois extension: indeed deg f = 4 and f has 4 distinct roots in
a splitting field. (Alternatively, perhaps more simply, $Df = 4x(x^2 - a)$ has no
common factor with f, since $b(a^2 - b) \ne 0$, so we may apply Lemma 9.26.)
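For a numerical sanity check of the four roots, one can work inside C with floating-point arithmetic. The values a = 1, b = 2 below are sample choices for illustration only, and the branch of each square root is whatever `cmath.sqrt` returns.

```python
import cmath

a, b = 1, 2                      # sample values with b*(a*a - b) != 0

def f(x):
    """The biquadratic polynomial (x^2 - a)^2 - b."""
    return (x * x - a) ** 2 - b

beta = cmath.sqrt(b)             # beta^2 = b
alpha = cmath.sqrt(a + beta)     # alpha^2 = a + beta
alpha_p = cmath.sqrt(a - beta)   # alpha'^2 = a - beta

roots = [alpha, -alpha, alpha_p, -alpha_p]
residuals = [abs(f(r)) for r in roots]
gaps = [abs(r - s) for i, r in enumerate(roots) for s in roots[i + 1:]]

print(max(residuals))            # close to 0: all four are roots of f
print(min(gaps))                 # bounded away from 0: the roots are distinct
```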
We can see L/K in a tower

            L = K(α, α′)
           /            \
        K(α)           K(α′)
           \            /
               K(β)
                |
                K                                    (17)

where $\beta^2 = b$, $\alpha^2 = a + \beta$ and $(\alpha')^2 = a - \beta$ as above. (We omit β from the
generators of the bigger fields since $\beta = \alpha^2 - a = a - (\alpha')^2$.) Each inclusion in
the tower (17) is either equality (if the square root exists in the smaller field)
or an extension of degree 2 (if it does not). Thus

[L : K] = 2, 4 or 8

Lemma 11.1. In the notation above, with b not a square in K:

(a) If $a^2 - b$ is not a square in K, then [K(α) : K] = [K(α′) : K] = 4.
(b) If neither $a^2 - b$ nor $b(a^2 - b)$ is a square in K, then [L : K] = 8.
Recall the notation: $\beta^2 = b$, $\alpha^2 = a + \beta$ and $(\alpha')^2 = a - \beta$.

Proof. Since b ∈ K is not a square, [K(β) : K] = 2. Now [K(α) : K] = 4 by the
tower law if we can show [K(α) : K(β)] = 2, or equivalently that α ∉ K(β).
If a + β ∈ K(β) is a square in K(β), then for some c, d ∈ K

$$a + \beta = (c + d\beta)^2 = (c^2 + d^2 b) + 2cd\beta$$

and so $a - \beta = (c - d\beta)^2$ is also a square in K(β). Thus

$$a^2 - b = (a + \beta)(a - \beta) = \bigl((c + d\beta)(c - d\beta)\bigr)^2 \quad\text{is a square in } K$$

contradicting the assumptions. So α ∉ K(β) and [K(α) : K] = 4. Similarly
α′ ∉ K(β) and [K(α′) : K] = 4.
For (b) we need [K(α, α′) : K(α)] = 2, or equivalently α′ ∉ K(α). It is
easier to work in the degree 2 extension K(α)/K(β) than the degree 4 extension
K(α)/K.
Suppose α′ ∈ K(α). Writing α′ = c + dα with c, d ∈ K(β) we have

$$a - \beta = (\alpha')^2 = c^2 + d^2(a + \beta) + 2cd\alpha$$

Thus (see footnote 5) cd = 0, since {1, α} is a basis for K(α)/K(β).
But d ≠ 0, as α′ ∉ K(β) by (a), so c = 0 and $a - \beta = d^2(a + \beta)$. Therefore

$$a^2 - b = (a - \beta)(a + \beta) = \bigl(d(a + \beta)\bigr)^2 \quad\text{is a square in } K(\beta)$$

So by Lemma 11.2 below, either $a^2 - b$ is a square in K or $b(a^2 - b)$ is a square
in K, both of which contradict the assumptions. □

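Lemma 11.2. Let M be any field with char M ≠ 2 and u, v ∈ M. If u is a square
in $M(\sqrt{v})$ then either u or uv is a square in M.
Proof. If $\sqrt{v} \in M$ there is nothing to prove. Otherwise $\{1, \sqrt{v}\}$ is a basis of
$M(\sqrt{v})$ over M, and the claim follows at once by observing that, for c, d ∈ M,
$$(c^2 + d^2 v) + 2cd\sqrt{v} = (c + d\sqrt{v})^2 = u \in M \implies cd = 0$$
and if d = 0 then $u = c^2$, while if c = 0 then $u = d^2 v$ so $uv = (dv)^2$. □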


Lemma 11.3. Suppose [K(α) : K] = [K(α′) : K] = 4.
(a) $q(x) = x^2 - (a + \beta)$ is the minimal polynomial of α over K(β), and
$q'(x) = x^2 - (a - \beta)$ is that of α′.
(Footnote 5: Do not even think about expanding c or d in the form r + sβ: merely
observe that the whole first bracketed expression and the coefficient 2cd both
lie in K(β).)
(b) Any σ ∈ Gal(L/K) can only map β, α, α′ in one of the following 8 ways:

            1     2     3     4     5     6     7     8
  σ(β)      β     β     β     β    −β    −β    −β    −β
  σ(α)      α     α    −α    −α     α′    α′   −α′   −α′
  σ(α′)     α′   −α′    α′   −α′    α    −α     α    −α

(There is no claim that any of these is realised by an automorphism.)


Proof. The assumption on degree implies that each step of the tower (17) from K
up to K(α) and to K(α′) has degree 2, which ensures irreducibility of the
polynomials in (a).
For (b), either σ(β) = β or −β by Corollary 6.39. If σ(β) = β, then σ fixes
K(β) so, by Corollary 6.39, σ must permute the roots of q(x) in some fashion,
and so σ(α) = α or −α. Similarly for q′(x), so σ(α′) = α′ or −α′.
If instead σ(β) = −β, then σ(q(x)) = q′(x), and so σ must exchange the
roots of q and q′ in some order: σ(α) = α′ or −α′, and σ(α′) = α or −α. □

11.1 [L : K] = 8 ⟹ Gal(L/K) ≅ D8
In this case, each extension in the tower (17) is of degree 2, so Lemma 11.3
applies. Note that it merely presents the possible outcomes for how σ acts on
β, α, α′; we do not know which of these 8 possibilities actually arise as
automorphisms. But by Theorem 10.10, # Gal(L/K) = [L : K] = 8, and so we know at
once that they must all arise as automorphisms.
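Selecting possibilities 6 and 2, write

$$\sigma\colon \begin{cases} \beta \mapsto -\beta \\ \alpha \mapsto \alpha' \\ \alpha' \mapsto -\alpha \end{cases}
\quad\text{and}\quad
\tau\colon \begin{cases} \beta \mapsto \beta \\ \alpha \mapsto \alpha \\ \alpha' \mapsto -\alpha' \end{cases}$$

We see that $\sigma^4 = \tau^2 = \mathrm{id}$ and $\tau\sigma = \sigma^3\tau$ (check both sides agree on β, α, α′) so

$$\mathrm{Gal}(L/K) \cong D_8$$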
the dihedral group with 8 elements, which is also the symmetries of the square:
you can even see that pictorially by its effect on the four roots of f as

      α′ •-----------• α
         2           1

         3           4
      −α •-----------• −α′                           (18)

in which σ is rotation anticlockwise by π/2 and τ is reflection in the
diagonal through α and −α. (Of course σ, τ fix a, b ∈ K and their effect on β is
then determined by $\beta = \alpha^2 - a$ and $-\beta = (\alpha')^2 - a$.)
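The check that both sides agree on β, α, α′ can be mechanised. In this sketch, which is only an illustration, an automorphism is recorded by the signed symbol it assigns to each of the three generators, and composition uses the rule that automorphisms respect signs. The symbol names and helper functions are invented here.

```python
# Each automorphism sends a generator to (sign, generator).
sigma = {"beta": (-1, "beta"), "alpha": (1, "alpha'"), "alpha'": (-1, "alpha")}
tau = {"beta": (1, "beta"), "alpha": (1, "alpha"), "alpha'": (-1, "alpha'")}
identity = {name: (1, name) for name in sigma}

def compose(f, g):
    """The composite f after g, using f(-x) = -f(x)."""
    out = {}
    for name, (sign_g, image_g) in g.items():
        sign_f, image_f = f[image_g]
        out[name] = (sign_g * sign_f, image_f)
    return out

def power(f, n):
    """The n-fold composite of f with itself."""
    result = identity
    for _ in range(n):
        result = compose(f, result)
    return result

print(power(sigma, 4) == identity)                          # sigma^4 = id
print(power(tau, 2) == identity)                            # tau^2 = id
print(compose(tau, sigma) == compose(power(sigma, 3), tau)) # tau sigma = sigma^3 tau
```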

Either that picture or the representation σ = (1234), τ = (24) ∈ D8 ⊂ S4
by permutations makes it easy to work out the subgroup lattice of Gal(L/K);
see Figure 8 for the picture. It may help to list the 8 elements by geometrical
and cycle type: other than the identity, they are

  geometric type                          elements        cycle type
  diagonal reflections                    τ, σ^2 τ        (24), (13)
  vertical and horizontal reflections     στ, σ^3 τ       (12)(34), (14)(23)
  central reflection / half rotation      σ^2             (13)(24)
  quarter rotations                       σ, σ^3          (1234), (1432)
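The subgroup lattice of a group this small can also be found by brute force, which is a useful check on any hand-drawn picture. The sketch below is not how one should do it for larger groups: it writes σ = (1234) and τ = (24) as tuples on the symbols 0, ..., 3 and simply tests every subset of suitable size for closure.

```python
from itertools import combinations

def compose(s, t):
    """Composite permutation: apply t first, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

identity = (0, 1, 2, 3)
sigma = (1, 2, 3, 0)        # (1234) written on symbols 0..3
tau = (0, 3, 2, 1)          # (24)

# Close {identity} under left multiplication by the generators.
D8 = {identity}
frontier = [identity]
while frontier:
    h = frontier.pop()
    for g in (sigma, tau):
        new = compose(g, h)
        if new not in D8:
            D8.add(new)
            frontier.append(new)

def is_subgroup(H):
    """A finite subset containing id and closed under products is a subgroup."""
    return identity in H and all(compose(x, y) in H for x in H for y in H)

def is_normal(H):
    """True if g H g^{-1} is contained in H for every g in D8."""
    return all(compose(compose(g, h), inverse(g)) in H for g in D8 for h in H)

subgroups = [frozenset(c)
             for r in (1, 2, 4, 8)                     # orders dividing 8
             for c in combinations(sorted(D8), r)
             if is_subgroup(set(c))]
orders = sorted(len(H) for H in subgroups)
non_normal = [H for H in subgroups if not is_normal(H)]

print(len(D8), orders, len(non_normal))
```

The count of subgroups by order matches the rows of Figure 8, and the only non-normal subgroups are the four of order 2 generated by a reflection.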

We can use this to work out the subfield lattice under the Galois correspondence.
The lattice picture is the same (that’s the main theorem) but we should
identify the subfields in more concrete terms than saying that the subfield
corresponding to each H ⊂ G is $H^\dagger = L^H$. Which elements of L generate $H^\dagger$?
First τ(α) = α so $K(\alpha) \subset L^{\tau}$; but L has degree 2 over both, so they must
be equal. Similarly $\sigma^2\tau(\alpha') = \alpha'$, so $L^{\sigma^2\tau} = K(\alpha')$. Those two fields intersect
in the common subfield K(β), of degree 2 over K, so that places K(β) in the diagram.
(You can check: $K(\beta) = L^{\langle\sigma^2, \tau\rangle}$ as its position indicates.)

How do we find the other fields? In a sense it’s simple: for each group
in the subgroup lattice, we must find functions of α and α′ that are invariant
under its permutations. Fortunately, we can do that here by trial and error.
It’s worth remembering, though, that L is 8 dimensional, so these subfields
are infinitesimally small transparent pins in a multidimensional hyper-haystack:
random expressions η = η(α, α′) are unlikely to lie in any strict subfield of L,
and so K(η) = L, which is great in itself but doesn’t solve the problem.

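[Figure 8: Galois correspondence for deg 8 biquadratic extensions. Non-normal
subfields and subgroups underlined, with their D8 orbits indicated by ↔.

Subfield lattice, top to bottom:
  L
  K(α)   K(α′)   K(β, γ)   K(δ′)   K(δ)
  K(β)   K(βγ)   K(γ)
  K

Subgroup lattice, top to bottom, in corresponding positions:
  ⟨id⟩
  ⟨τ⟩   ⟨σ^2 τ⟩   ⟨σ^2⟩   ⟨σ^3 τ⟩   ⟨στ⟩
  ⟨σ^2, τ⟩   ⟨σ⟩   ⟨σ^2, στ⟩
  D8

The non-normal entries are K(α) ↔ K(α′) and K(δ′) ↔ K(δ), and correspondingly
⟨τ⟩ ↔ ⟨σ^2 τ⟩ and ⟨σ^3 τ⟩ ↔ ⟨στ⟩.]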

It makes sense to try fairly symmetric expressions, and it will be enough to
consider
$$\gamma = \alpha\alpha' \qquad \delta = \alpha + \alpha' \qquad \delta' = \alpha - \alpha' \tag{19}$$

and to note that σ and τ act on these as follows:

$$\sigma\colon \begin{cases} \gamma \mapsto -\gamma \\ \delta \mapsto -\delta' \\ \delta' \mapsto \delta \end{cases}
\quad\text{and}\quad
\tau\colon \begin{cases} \gamma \mapsto -\gamma \\ \delta \mapsto \delta' \\ \delta' \mapsto \delta \end{cases}$$

Thus στ(δ) = δ and $\sigma^3\tau(\delta') = \delta'$, while γ ∈ K(δ) ∩ K(δ′), so the right-hand
side of the pictures is fixed in the same way as we did the left-hand side.
For the middle, note that γ ∉ K(β) (otherwise K(α) = K(α′)), so K(β, γ)
is a degree 2 extension of K(β). Since clearly $K(\beta, \gamma) \subset L^{\sigma^2}$, these two fields
are equal. To finish, note that σ(βγ) = βγ, and βγ ∉ K (since γ ∉ K(β) as
before), so $K(\beta\gamma) \subset L^{\sigma}$ must be an equality.
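The action of σ and τ on γ, δ, δ′ can be double-checked formally. The sketch below treats α and α′ as independent symbols, so it verifies the identities in a polynomial ring, from which they pass to L. The encoding (a dict from exponent pairs to coefficients, and a triple describing each automorphism) is an assumption of this illustration.

```python
# A polynomial in alpha, alpha' is a dict {(i, j): coeff} for alpha^i * alpha'^j.
# An automorphism is (swap, sign_a, sign_ap): alpha goes to sign_a times
# (alpha' if swap else alpha), and alpha' to sign_ap times the other one.

def apply(auto, poly):
    """Substitute the images of alpha and alpha' into poly."""
    swap, sign_a, sign_ap = auto
    out = {}
    for (i, j), c in poly.items():
        key = (j, i) if swap else (i, j)
        out[key] = out.get(key, 0) + c * sign_a**i * sign_ap**j
    return {k: v for k, v in out.items() if v != 0}

def neg(poly):
    """The negative of a polynomial."""
    return {k: -v for k, v in poly.items()}

sigma = (True, 1, -1)     # alpha -> alpha', alpha' -> -alpha
tau = (False, 1, -1)      # alpha -> alpha,  alpha' -> -alpha'

gamma = {(1, 1): 1}                  # alpha * alpha'
delta = {(1, 0): 1, (0, 1): 1}       # alpha + alpha'
delta_p = {(1, 0): 1, (0, 1): -1}    # alpha - alpha'

print(apply(sigma, gamma) == neg(gamma))
print(apply(sigma, delta) == neg(delta_p))
print(apply(sigma, apply(tau, delta)) == delta)      # sigma tau fixes delta
```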
Remark 11.4 (Seeing non-normality). Normal subfields are mapped to them-
selves by every element of Gal(L/K) (by Proposition 9.17(b)), while normal
subgroups are mapped to themselves under conjugation by any element of
Gal(L/K) (by definition). In contrast, being non-normal means the Gal(L/K)
action moves the subfields and subgroups around. All subfields and subgroups
that are not normal are underlined in Figure 8, and the arrows ↔ show how
they can move.
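For example, $H_1 = \langle\tau\rangle$ and $H_2 = \langle\sigma^2\tau\rangle$ have

$$\sigma\tau\sigma^{-1} = \sigma^{10}\tau = \sigma^2\tau \quad\text{so}\quad \sigma H_1 \sigma^{-1} = H_2$$

while in the corresponding places in the subfield lattice

$$\sigma L^{H_1} = \sigma K(\alpha) = K(\sigma\alpha) = K(\alpha') = L^{H_2}$$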

(Non-examinable) Remark (Biquadratic buddy). Inside the subfield lattice
of Figure 8 you can see, running down the left-hand side, a copy of the tower (17)
that we used to analyse $f = (x^2 - a)^2 - b$ in the first place. Figure 8 is
symmetric about the central axis, so there is a very similar tower running down the
right-hand side, with β, α, α′ replaced by γ, δ, δ′. Is there another biquadratic
polynomial whose roots are computed by that tower?
Of course! Let’s use a different variable to keep the polynomials apart, and,
recalling the definitions from (19), write

$$\begin{aligned}
g &= (y - \delta)(y + \delta)(y - \delta')(y + \delta') \\
  &= y^4 - \bigl(\delta^2 + (\delta')^2\bigr)y^2 + (\delta\delta')^2 \\
  &= y^4 - 4ay^2 + 4b \\
  &= (y^2 - 2a)^2 - 4(a^2 - b)
\end{aligned}$$

where $\delta^2 + (\delta')^2 = 4a$, $\delta\delta' = \alpha^2 - (\alpha')^2 = 2\beta$ and $\gamma^2 = a^2 - b$.
The biquadratic polynomial g ∈ K[y] was there all the time, and we could
have worked with it instead of or as well as f. It is called the companion
polynomial of f, and has the same biquadratic splitting field. □
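As a quick numerical check of the companion polynomial, here is a floating-point sketch in C. The sample values a = 1, b = 2 are illustrative only, and the variable names mirror (19).

```python
import cmath

a, b = 1, 2                       # sample values, as an illustration only

beta = cmath.sqrt(b)
alpha = cmath.sqrt(a + beta)
alpha_p = cmath.sqrt(a - beta)
delta, delta_p = alpha + alpha_p, alpha - alpha_p
gamma = alpha * alpha_p

def g(y):
    """Companion polynomial (y^2 - 2a)^2 - 4(a^2 - b)."""
    return (y * y - 2 * a) ** 2 - 4 * (a * a - b)

residuals = [abs(g(y)) for y in (delta, -delta, delta_p, -delta_p)]
print(max(residuals))                          # close to 0
print(abs(delta**2 + delta_p**2 - 4 * a))      # close to 0
print(abs(gamma**2 - (a * a - b)))             # close to 0
```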

11.2 √(b(a^2 − b)) ∈ K ⟹ [L : K] = 4, Gal(L/K) ≅ C4
First, $(\beta\alpha\alpha')^2 = b(a^2 - b)$ so βαα′ ∈ K by the case assumption. Therefore
α′ ∈ K(α, β) = K(α), whence L = K(α) = K(α′). Since b is not a square but
$b(a^2 - b)$ is, also $a^2 - b$ is not a square in K, so [L : K] = 4 by Lemma 11.1(a).
We first show that f is irreducible over K. It has no roots in K, so the
only thing to check is that f is not the product of two irreducible quadratic
polynomials. We check all pairwise products of linear factors: for example if

$$(x - \alpha)(x - \alpha') = x^2 - (\alpha + \alpha')x + \alpha\alpha'$$

was a factor of f lying in K[x], then αα′ ∈ K; but $(\alpha\alpha')^2 = a^2 - b$, so $a^2 - b$
would be a square in K, and then, since $b(a^2 - b)$ is a square, we would
conclude that b had a square root in K, which it does not.
Irreducibility gives us power: since f is irreducible, there is some σ ∈
Gal(L/K) with σ(α) = α′ by Corollary 9.16. Lemma 11.3 shows that necessarily
σ(β) = −β, and since βαα′ ∈ K is fixed by σ it follows that σ(α′) = −α.
Thus σ has order 4; it is a rotation of the square (18).
But we know # Gal(L/K) = 4, since L/K is Galois, so Gal(L/K) = hσi is
a cyclic group of order 4. Its subgroup structure is very simple, and the Galois
correspondence shows we already know all the subfields of L; see Figure 9.

[Figure 9: Galois correspondence for [L : K] = 4 with √(b(a^2 − b)) ∈ K.
Subfield lattice, top to bottom: L, K(β), K.
Subgroup lattice, top to bottom: ⟨id⟩, ⟨σ^2⟩, C4.]


11.3 √(a^2 − b) ∈ K and [L : K] = 4 ⟹ Gal(L/K) ≅ C2 × C2
First, $(\alpha\alpha')^2 = a^2 - b$ so αα′ ∈ K by the case assumption, so α′ ∈ K(α), whence
L = K(α) = K(α′). By Lemma 11.3, we know the minimal polynomials and can
use them to build automorphisms by hand. We must exhibit 4 automorphisms,
since # Gal(L/K) = [L : K] = 4 by the Galois Correspondence.
There is an element ν ∈ Gal(L/K(β)) that exchanges the roots of q(x): that
is, ν(α) = −α and ν(β) = β. Since αα′ ∈ K is fixed by ν, also ν(α′) = −α′.
There is an element of Gal(K(β)/K) that maps β to −β. This must lift to
an element ρ ∈ Gal(L/K). This automorphism takes α to one of the roots of
ρ(q(x)) = q′(x). If ρ(α) = α′, then we have 4 distinct automorphisms:

id, ν, ρ, νρ ∈ Gal(L/K)

where $\nu\rho(\alpha) = -\alpha'$, so $\nu\rho$ is distinct from the previous three. If instead $\rho(\alpha) = -\alpha'$, then we still get the same 4 automorphisms. So fix notation with $\rho(\alpha) = \alpha'$.
Each of $\nu, \rho, \nu\rho$ is an involution ($\nu^2 = \mathrm{id}$ etc.), and they all commute, so

$\mathrm{Gal}(L/K) = \{\mathrm{id}, \nu, \rho, \nu\rho\} \cong C_2 \times C_2$

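$L \longleftrightarrow \langle \mathrm{id} \rangle$
$K(\beta) \longleftrightarrow \langle\nu\rangle$, $K(\delta') \longleftrightarrow \langle\nu\rho\rangle$, $K(\delta) \longleftrightarrow \langle\rho\rangle$
$K \longleftrightarrow C_2 \times C_2$

Figure 10: Galois correspondence for $[L : K] = 4$ with $\sqrt{a^2 - b} \in K$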

By construction, $\nu$ fixes $K(\beta)$ but does not fix $L$, so $L^{\nu} = K(\beta)$. To find the other fixed fields we can look for elements fixed by automorphisms by trial and error. For example, $\rho(\alpha + \alpha') = \alpha' + \alpha$, so $\delta = \alpha + \alpha'$ is invariant and $L^{\rho} = K(\delta)$. It seems like a small miracle that $\nu\rho(\alpha - \alpha') = \alpha - \alpha' = \delta'$, even though you will have guessed it by symmetry.

11.4 Otherwise $[L : K] = 2$, $\mathrm{Gal}(L/K) \cong C_2$
In fact, in these cases f is not irreducible and L = K(β), but we didn’t follow
the logic of what does or does not have a root in K or K(β) to see that for free.
It isn’t hard to check: it just needs calm and accurate accounting of every case.

11.5 Examples of each type


We work over K = Q and polynomials f ∈ Q[x] throughout this section. In each
case we denote a splitting field of f by L/Q. That exists by Proposition 9.5,
though you could choose it to be the smallest subfield L ⊂ C that contains the
roots of f , if you preferred.
Example 11.5 (a = 0, b = 2). The biquadratic polynomial

$f = x^4 - 2$

is irreducible (Eisenstein, $p = 2$). Since none of $b$, $a^2 - b = -2$ and $b(a^2 - b) = -4$ is a square in $\mathbb{Q}$, we have $\mathrm{Gal}(L/\mathbb{Q}) = D_8$ by Lemma 11.1 and §11.1.
Alternatively, writing $f = (x^2 - \sqrt{2})(x^2 + \sqrt{2})$ and factorising the quadratics shows that $\mathbb{Q} \subset \mathbb{Q}(\sqrt[4]{2}) \subset L = \mathbb{Q}(\sqrt[4]{2}, i)$ is a tower of degree $4 \times 2 = 8$.
Example 11.6 (a = 1, b = −1). The biquadratic polynomial

$f = (x^2 - 1)^2 + 1 = x^4 - 2x^2 + 2$

is irreducible. Since none of $b$, $a^2 - b = 2$ and $b(a^2 - b) = -2$ is a square in $\mathbb{Q}$, we have $\mathrm{Gal}(L/\mathbb{Q}) = D_8$ by Lemma 11.1 and §11.1.
Example 11.7 (a = 2, b = 2). The biquadratic polynomial

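$f = (x^2 - 2)^2 - 2 = x^4 - 4x^2 + 2$

is irreducible. All 4 roots $\pm\sqrt{2 \pm \sqrt{2}} \in \mathbb{R}$. Neither $b$ nor $a^2 - b = 2$ is a square in $\mathbb{Q}$, so $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$, where $\alpha = \sqrt{2 + \sqrt{2}}$. Since $b(a^2 - b) = 4$ is a square in $\mathbb{Q}$, §11.2 shows that in fact $L = \mathbb{Q}(\alpha)$ and $\mathrm{Gal}(L/\mathbb{Q}) = C_4$.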

Example 11.8 (a = 3, b = 5). The biquadratic polynomial

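$f = (x^2 - 3)^2 - 5 = x^4 - 6x^2 + 4$

is irreducible. All 4 roots $\pm\sqrt{3 \pm \sqrt{5}} \in \mathbb{R}$. In this case $a^2 - b = 4$ is a square in $\mathbb{Q}$, so by §11.3 $L = \mathbb{Q}(\alpha)$ has degree 4 over $\mathbb{Q}$ and $\mathrm{Gal}(L/\mathbb{Q}) \cong C_2 \times C_2$.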
Example 11.9 (a = 0, b = −4). The biquadratic polynomial

$f = x^4 + 4 = (x^2 + 2x + 2)(x^2 - 2x + 2)$

is not irreducible, as an attempt to prove irreducibility would reveal. The quadratic formula shows that $L = \mathbb{Q}(i)$ is a splitting field, so $\mathrm{Gal}(L/\mathbb{Q}) = C_2$.
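The square tests used in Examples 11.5 to 11.9 are easy to mechanise. The following is a minimal sketch, not part of the notes: the helper names `is_rational_square` and `square_tests` are invented here, and the code only reports which of $b$, $a^2 - b$ and $b(a^2 - b)$ are squares in $\mathbb{Q}$ for $f = (x^2 - a)^2 - b$. Deciding the Galois group still needs the case analysis of §11.1 to §11.4.

```python
from fractions import Fraction
from math import isqrt


def is_rational_square(q):
    """Return True if the rational number q is the square of a rational."""
    q = Fraction(q)
    if q < 0:
        return False
    # A fraction in lowest terms is a square exactly when its numerator
    # and denominator are both perfect squares.
    n, d = q.numerator, q.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


def square_tests(a, b):
    """Squareness in Q of b, a^2 - b and b(a^2 - b) for f = (x^2 - a)^2 - b."""
    a, b = Fraction(a), Fraction(b)
    c = a * a - b
    return (is_rational_square(b), is_rational_square(c), is_rational_square(b * c))


# The (a, b) pairs of Examples 11.5 to 11.9.
for a, b in [(0, 2), (1, -1), (2, 2), (3, 5), (0, -4)]:
    print((a, b), square_tests(a, b))
```

Note that Examples 11.8 and 11.9 give the same triple of answers, which is why §11.3 also needs the hypothesis $[L : K] = 4$.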

12 Finite fields
The first finite fields we learn are

$\mathbb{F}_p = \mathbb{Z}/(p) \cong \{0, 1, 2, \dots, p - 1\}$

for a prime integer $p > 0$ with arithmetic modulo $p$.
In fact there are many others, and they are easy to understand. We know from Proposition 6.7 that if $L$ is a finite field, then it has characteristic $\operatorname{char} L = p > 0$ for some prime $p > 0$, and so $L$ is a finite extension of $\mathbb{F}_p$. Moreover, being a vector space over $\mathbb{F}_p$ of some finite dimension, $n$ say, we know too that $L$ has $\#L = p^n$ elements.
Example 12.1. Let $f = x^2 + x + 1 \in \mathbb{F}_2[x]$, which is irreducible because it has no root in $\mathbb{F}_2$ (indeed $f(0) = f(1) = 1$). We can make the simple extension of $\mathbb{F}_2$ by a root $\alpha$ of $f$:

$L = \mathbb{F}_2(\alpha) = \mathbb{F}_2[x]/(f) = \{0, 1, \alpha, \alpha + 1\}$

using Proposition 6.21 as usual to treat $\{1, \alpha\}$ as an $\mathbb{F}_2$-basis of $L$, where in this small case the coefficients with respect to the basis are only $\mathbb{F}_2 = \{0, 1\}$, so we can write out every element.
Since $\alpha^2 = -(\alpha + 1)$, we can compute the nontrivial multiplications

$\alpha^2 = \alpha + 1 \qquad \alpha \cdot (\alpha + 1) = 1 \qquad (\alpha + 1)^2 = \alpha$

noting that $2 = 0$ and $-1 = 1$ in $\mathbb{F}_2$. You can have fun with bigger finite fields and bigger irreducible polynomials.
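The multiplication table of Example 12.1 can be checked by machine. This is only an illustrative sketch: the encoding of $c_0 + c_1\alpha$ as the pair `(c0, c1)` and the function names `add` and `mul` are choices made here, not notation from the notes.

```python
# Elements of F_4 = F_2[x]/(x^2 + x + 1), written c0 + c1*alpha
# and stored as pairs (c0, c1) with entries in {0, 1}.
ZERO, ONE, ALPHA, ALPHA_PLUS_1 = (0, 0), (1, 0), (0, 1), (1, 1)


def add(u, v):
    """Add coordinate-wise modulo 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """Multiply using alpha^2 = alpha + 1 (recall -1 = 1 in F_2)."""
    a, b = u
    c, d = v
    # (a + b*alpha)(c + d*alpha) = ac + (ad + bc)*alpha + bd*alpha^2
    #                            = (ac + bd) + (ad + bc + bd)*alpha
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)


print(mul(ALPHA, ALPHA))                # alpha^2
print(mul(ALPHA, ALPHA_PLUS_1))         # alpha * (alpha + 1)
print(mul(ALPHA_PLUS_1, ALPHA_PLUS_1))  # (alpha + 1)^2
```

The same pattern works for any $\mathbb{F}_p[x]/(f)$ once the reduction rule for the top power of $\alpha$ is written down.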

12.1 The existence of finite fields


We show next that there is such a field for every $p > 0$ and $n \ge 1$, and that for each $q = p^n$ there is a unique such field up to isomorphism.
Theorem 12.2. Given integers $p > 0$ prime and $n \ge 1$, let $q = p^n$. The splitting field $L$ of $f = x^q - x \in \mathbb{F}_p[x]$ is a field with $\#L = p^n$ elements, and $L/\mathbb{F}_p$ is a Galois extension. Moreover any two fields of order $q = p^n$ are isomorphic.
Proof. Let $L/\mathbb{F}_p$ be a splitting field for $f$. Note that $f$ has distinct roots in $L$ by Lemma 9.26 since its formal derivative $Df = -1$ is coprime to $f$. That is, $f$ is separable, and so $L/\mathbb{F}_p$ is a Galois extension.
Let $M \subset L$ be the set of roots of $f$ in $L$; thus

$M = \{\alpha \in L \mid \alpha^q = \alpha\}$

and $\#M = q$. It is easy to check that $M$ is a field in its own right, so $M = L$ by minimality of the splitting field.
Suppose $N/\mathbb{F}_p$ is another field of $q = p^n$ elements. Consider the set $N^*$ of nonzero elements as a finite group under multiplication. Since the order of every element in a finite group divides the order of the group, $\beta^{q-1} = 1$ for any

$\beta \in N^*$. Multiplying by $\beta$, and noticing that the following equation also holds for $\beta = 0$, we have

$\beta^q = \beta$ for all $\beta \in N$

That is, $f$ splits completely in $N$, and since $\#N$ is the number of roots of $f$ in any splitting field, $N$ is a splitting field for $f$ over $\mathbb{F}_p$. Thus $N \cong L$ by the uniqueness of splitting fields, Corollary 9.7. □

12.2 The Frobenius map


Something amazing happens in characteristic p > 0: we have the following
remarkable map that breaks the rules we thought we knew in characteristic 0.
For finite fields, which we consider here, it is particularly simple and beautiful.

Definition 12.3. Let $K$ be a field of characteristic $p > 0$ (finite or infinite). The Frobenius map is the map

$\varphi_p : K \longrightarrow K, \qquad \alpha \mapsto \alpha^p$

Proposition 12.4. Let $K$ be a field of characteristic $p > 0$, so that $\mathbb{F}_p \subset K$, and let $\varphi_p : K \to K$ be the Frobenius map defined above.

(1) The Frobenius map is a homomorphism (and so is injective).
(2) The set of elements $\alpha \in K$ for which $\varphi_p(\alpha) = \alpha$ is exactly $\mathbb{F}_p$.
(3) If $K$ is a finite field, then $\varphi_p$ is surjective, so $\varphi_p \in \mathrm{Aut}_{\mathbb{F}_p}(K)$. In this case it has fixed field $K^{\varphi_p} = \mathbb{F}_p$.

Proof. (1) Write $\varphi = \varphi_p$ to simplify notation. We just check the rules. First $\varphi(1) = 1^p = 1$ and

$\varphi(\alpha\beta) = (\alpha\beta)^p = \alpha^p\beta^p = \varphi(\alpha)\varphi(\beta)$

It remains to check that $\varphi$ respects addition: this is where $\operatorname{char} K = p$ is used.

$\varphi(\alpha + \beta) = (\alpha + \beta)^p$
$= \alpha^p + p\alpha^{p-1}\beta + \tfrac{p(p-1)}{2}\alpha^{p-2}\beta^2 + \cdots + p\alpha\beta^{p-1} + \beta^p$
$= \alpha^p + \beta^p = \varphi(\alpha) + \varphi(\beta)$

where (as we've said before) all the middle terms $\cdots$ of the binomial expansion vanish because their coefficients include a factor of $p$.
(2) The set of elements on which $\varphi$ is the identity is certainly a subfield of $K$, since $\varphi$ is a homomorphism, and so it contains the prime subfield $\mathbb{F}_p \subset K$. But we may also interpret these elements as the roots in $K$ of the polynomial $x^p - x \in \mathbb{F}_p[x]$. That is not the zero polynomial, so it has at most $p$ roots. The elements of $\mathbb{F}_p$ already provide $p$ roots, so they must be all of them.

(3) Finally, if $K$ is finite then it is a finite-dimensional vector space over $\mathbb{F}_p$ and $\varphi : K \to K$ is an injective $\mathbb{F}_p$-linear map (it is additive by (1), and $\varphi(c\alpha) = c^p\alpha^p = c\,\varphi(\alpha)$ for $c \in \mathbb{F}_p$ by (2)), so it is an isomorphism of $\mathbb{F}_p$-vector spaces. In particular, it is a bijection, and we just saw it fixes $\mathbb{F}_p$, so it is an $\mathbb{F}_p$-automorphism as claimed.
Now that we know it is an automorphism of the given $K$, we may speak of its fixed field, but that is just another way of speaking of the elements on which $\varphi$ is the identity, which was checked in (2). □
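Proposition 12.4 can be tested on a small field. The sketch below assumes the model $\mathbb{F}_9 = \mathbb{F}_3[x]/(x^2 + 1)$, which is a field because $-1$ is not a square modulo 3; elements $a + bi$ are stored as pairs `(a, b)`, and all function names are invented for this illustration.

```python
P = 3  # the characteristic

# Elements a + b*i of F_9 = F_3[x]/(x^2 + 1), stored as pairs (a, b).
ELEMENTS = [(a, b) for a in range(P) for b in range(P)]


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    """Multiply using i^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(u, k):
    """Compute u^k by repeated multiplication (k >= 0)."""
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result


def frobenius(u):
    """The Frobenius map u -> u^p."""
    return power(u, P)


# (1) additivity, checked on every pair of elements
additive = all(
    frobenius(add(u, v)) == add(frobenius(u), frobenius(v))
    for u in ELEMENTS for v in ELEMENTS
)
# (2) the fixed points should be exactly the prime subfield F_3
fixed = [u for u in ELEMENTS if frobenius(u) == u]
# (3) Frobenius permutes the elements of the finite field
bijective = sorted(frobenius(u) for u in ELEMENTS) == sorted(ELEMENTS)

print(additive, fixed, bijective)
```

In this model the Frobenius map is "complex conjugation" $a + bi \mapsto a - bi$, which makes all three statements visible at a glance.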

(Non-examinable) Remark. The Frobenius map $\varphi_p : \mathbb{F}_p(t) \longrightarrow \mathbb{F}_p(t)$ is not surjective. We used this when making an inseparable polynomial in Example 9.21.
More generally, for infinite fields the Frobenius map is related to inseparability (though we do not discuss it in any detail), so although at first the possibility of inseparability seems frustrating, you may prefer to think of it as a symptom of the existence of something wonderful and unexpected, which we'll be glad to have once we get used to it.

12.3 The Galois group of finite fields


Theorem 12.5. Let $K$ be a finite field of characteristic $\operatorname{char} K = p > 0$, so that $\mathbb{F}_p \subset K$. Suppose $\#K = p^n$. Then

$\mathrm{Gal}(K/\mathbb{F}_p) = \langle\varphi_p\rangle \cong \mathbb{Z}/(n)$

is a cyclic group of order $n$ generated by the Frobenius morphism $\varphi_p : K \to K$.

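$\mathbb{F}_{5^{12}} = L \longleftrightarrow \{\mathrm{id}\}$
$\mathbb{F}_{5^6} = L^{\varphi^6} \longleftrightarrow \langle\varphi^6\rangle \cong \mathbb{Z}/(2)$
$\mathbb{F}_{5^4} = L^{\varphi^4} \longleftrightarrow \langle\varphi^4\rangle \cong \mathbb{Z}/(3)$
$\mathbb{F}_{5^3} = L^{\varphi^3} \longleftrightarrow \langle\varphi^3\rangle \cong \mathbb{Z}/(4)$
$\mathbb{F}_{5^2} = L^{\varphi^2} \longleftrightarrow \langle\varphi^2\rangle \cong \mathbb{Z}/(6)$
$\mathbb{F}_5 = L^{\varphi} \longleftrightarrow \mathrm{Gal}(L/\mathbb{F}_5) = \langle\varphi\rangle \cong \mathbb{Z}/(12)$

Figure 11: Galois Correspondence for $\mathbb{F}_{5^{12}}$ with Frobenius $\varphi = \varphi_5$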

Proof. The extension $K/\mathbb{F}_p$ is Galois by Theorem 12.2, so

$\#\mathrm{Gal}(K/\mathbb{F}_p) = [K : \mathbb{F}_p] = n$    (20)

by Theorem 10.10. The case $n = 1$ is trivial, so we may suppose $n > 1$. Therefore $\varphi_p \in \mathrm{Gal}(K/\mathbb{F}_p)$ is a nontrivial element by Proposition 12.4. We show that it has order $n$, so we are done; the order is at most $n$ by (20).
If $\varphi_p^m$ is the identity for some $1 \le m \le n$, then for every $\alpha \in K$

$\alpha^{p^m} = \varphi_p^m(\alpha) = \alpha$

and so $g = x^{p^m} - x$ has $p^n$ roots in $K$. So $p^m = \deg g \ge p^n$, whence $m = n$. □
Remark 12.6. It follows almost at once that $K = \mathbb{F}_{p^m}$ is a subfield of $L = \mathbb{F}_{p^n}$ if and only if $m$ divides $n$, and in that case

$K = L^{\varphi_p^m} \qquad [L : K] = n/m \qquad \mathrm{Gal}(L/K) = \langle\varphi_p^m\rangle \cong \mathbb{Z}/(n/m)$

For example, there is a unique field $L = \mathbb{F}_{5^{12}}$ of order $244{,}140{,}625 = 5^{12}$, a Galois extension of $\mathbb{F}_5$. Switching on Galois Theory, we easily calculate the subgroup lattice of $\mathrm{Gal}(L/\mathbb{F}_5) \cong \mathbb{Z}/(12)$ and deduce the subfield lattice in Figure 11. Every field extension in this picture is normal, so we can calculate the Galois group of each extension as the corresponding quotient of stabilisers.
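The bookkeeping in Remark 12.6 amounts to listing divisors. As a small sketch (the function name `subfield_lattice` is made up here), each divisor $m$ of $n$ gives a subfield $\mathbb{F}_{p^m}$ of $\mathbb{F}_{p^n}$, fixed by the subgroup $\langle\varphi_p^m\rangle$ of order $n/m$:

```python
def subfield_lattice(p, n):
    """List (m, p^m, n/m) for each divisor m of n.

    m    : degree of the subfield F_{p^m} over F_p
    p^m  : number of elements of that subfield
    n/m  : order of the subgroup <phi^m> of Gal(F_{p^n}/F_p) fixing it
    """
    return [(m, p ** m, n // m) for m in range(1, n + 1) if n % m == 0]


for m, size, group_order in subfield_lattice(5, 12):
    print(f"F_(5^{m}) has {size} elements; fixed by a subgroup of order {group_order}")
```

For $p = 5$, $n = 12$ this reproduces the six rows of Figure 11.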

13 Radical solutions of polynomials
We work with subfields of C in this chapter. That is unnecessarily restrictive,
but it allows us to go quickly to the main point. All definitions are valid over
any field. We discuss when a polynomial is soluble by radicals, which loosely
means its roots have expressions using nth roots; see Definition 13.6.
All polynomials of degree ≤ 4 are soluble by radicals by the usual quadratic
formula, Cardano’s formula §3.2(8), and a more complicated formula in degree 4
that we haven’t discussed.
Galois himself understood precisely what is going on. A group is soluble if
you can find a path through its subgroup lattice of normal subgroups Hi ⊂ Hi+1
of prime index; see Definition 13.8.
Theorem 13.1. Let K ⊂ C and f ∈ K[x] be an irreducible polynomial with
splitting field L/K. If f is soluble by radicals then Gal(L/K) is a soluble group.
In fact, the converse implication also holds. It is not hard; we have simply run out of time. What we have stated will give us enough control to exhibit quintics that cannot be solved by radicals.

13.1 Radical and soluble extensions


Definition 13.2. A field extension $M/K$ is radical if there is a sequence of subfields

$K = F_0 \subset F_1 \subset \cdots \subset F_{s-1} \subset F_s = M$

so that $F_i = F_{i-1}(\alpha_i)$ where $\alpha_i^{n_i} \in F_{i-1}$ for some $n_i > 0$, for each $i = 1, \dots, s$. (Radical extensions are finite by the Tower law.)
In other words, radical extensions of $K$ are built up by inductively adding $n_i$th roots:

$K \subset K(\sqrt[n_1]{a_1}) \subset K(\sqrt[n_1]{a_1}, \sqrt[n_2]{a_2}) \subset \cdots \subset K(\sqrt[n_1]{a_1}, \sqrt[n_2]{a_2}, \dots, \sqrt[n_s]{a_s})$

writing $\sqrt[n_i]{a_i}$ as an informal abbreviation of $\alpha_i$ with $\alpha_i^{n_i} = a_i \in F_{i-1}$.
Example 13.3. Let ω ∈ C be a primitive cube root of unity and M = Q(ω).
Then M/Q is a radical extension: it adjoins a root of x3 − 1.
It does not matter that $x^3 - 1$ is not irreducible nor that different roots may give different extensions. Of course you could factorise $x^3 - 1$, pick the factor you were really interested in, and say $M = \mathbb{Q}(\alpha)$ where $\alpha \in \mathbb{C}$ is a root of $x^2 + x + 1$. But to demonstrate that the extension is radical, you should not do that, since the definition demands polynomials of the form $x^n - a$.
Alternatively, you could have said $\alpha = \frac{1}{2}(-1 + \sqrt{-3})$, so that $\mathbb{Q} \subset M$ adjoins a root of $x^2 - (-3)$. Nobody said that there was only one polynomial of the form $x^n - a$ that would do the job.
Example 13.4. Let f = x3 − 3x − 3 ∈ Q[x]. Recall the drill for cubics from
§3.2–3.3: f is irreducible by Eisenstein and is already in reduced form (it has

no $x^2$ term) so there is no need for Tschirnhausen and we have $p = q = -3$ in the notation of §3.2. Then Cardano's formula specifies the roots of $f$. First we calculate the discriminant, $\Delta = -27D = -4p^3 - 27q^2 = -27 \times 5$. Since $\Delta < 0$ we know there will be one real root and a complex conjugate pair.
Fix a square root $\alpha = \sqrt{D} = \sqrt{5}$ and two complex numbers

$\beta = \sqrt[3]{\frac{-q + \alpha}{2}} = \sqrt[3]{\frac{3 + \sqrt{5}}{2}} \qquad \gamma = \sqrt[3]{\frac{-q - \alpha}{2}} = \sqrt[3]{\frac{3 - \sqrt{5}}{2}}$

subject to the constraint $\beta\gamma = -p/3$ on the choice of cube roots. Since $D > 0$ we may choose $\beta, \gamma \in \mathbb{R}$ so the constraint is satisfied automatically and $\gamma = 1/\beta$.
The three distinct complex roots of $f$ are then

$\alpha_i = \omega^i\beta + \omega^{3-i}\gamma = \omega^i\sqrt[3]{\frac{3 + \sqrt{5}}{2}} + \omega^{3-i}\sqrt[3]{\frac{3 - \sqrt{5}}{2}} \qquad i = 0, 1, 2$

Note that $\alpha_0 \in \mathbb{R}$ while $\alpha_1 = \overline{\alpha_2}$ is strictly complex. Those expressions show at once that the splitting field $L = \mathbb{Q}(\alpha_0, \alpha_1, \alpha_2)$ of $f$ is contained in the following radical extension $M/\mathbb{Q}$:

$F_0 = \mathbb{Q} \subset F_1 = \mathbb{Q}(\sqrt{5}) \subset F_2 = \mathbb{Q}(\beta) \subset F_3 = \mathbb{Q}(\beta, \omega) = M$

since $(\sqrt{5})^2 \in F_0$, $\beta^3 = (3 + \sqrt{5})/2 \in F_1$ and $\omega^3 = 1 \in F_2$. It is clear that the splitting field $L \subset M$, but the radical tower shows that $[M : \mathbb{Q}] = 12$, so $L \ne M$. (In fact, none of the radicals we used to make the extension lie in $L$; they were merely ingredients to cook up the roots $\alpha_i$.)
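A floating-point sanity check of Example 13.4 is cheap to run. The sketch below takes the real cube roots for $\beta$ and $\gamma$, as the example does, and confirms numerically that the three Cardano expressions are roots of $x^3 - 3x - 3$; the variable names are chosen here for illustration, and the tolerances are arbitrary.

```python
import cmath
import math


def f(x):
    return x ** 3 - 3 * x - 3


omega = cmath.exp(2j * math.pi / 3)            # primitive cube root of unity
beta = ((3 + math.sqrt(5)) / 2) ** (1 / 3)     # real cube root
gamma = ((3 - math.sqrt(5)) / 2) ** (1 / 3)    # real cube root

# alpha_i = omega^i * beta + omega^(3 - i) * gamma for i = 0, 1, 2
roots = [omega ** i * beta + omega ** (3 - i) * gamma for i in range(3)]

for i, r in enumerate(roots):
    print(i, r, abs(f(r)))
```

The printed residuals $|f(\alpha_i)|$ should be at the level of rounding error, with $\alpha_0$ essentially real and $\alpha_1, \alpha_2$ complex conjugates.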
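Example 13.5. Let $f = x^3 - 3x + 1 \in \mathbb{Q}[x]$. (This is the case $p = -3$, $q = 1$. It is irreducible by Eisenstein applied to $f(y - 1) = y^3 - 3y^2 + 3$.)
The discriminant of $f$ is $\Delta = -27D = -27 \times (-3)$, so Cardano's formula computes the roots of $f$ as:

$\alpha_i = \omega^i\beta + \omega^{3-i}\gamma = \omega^i\sqrt[3]{\frac{-1 + i\sqrt{3}}{2}} + \omega^{3-i}\sqrt[3]{\frac{-1 - i\sqrt{3}}{2}} \qquad i = 0, 1, 2$

Despite appearances, all $\alpha_i \in \mathbb{R}$, since $\Delta > 0$. Take $\omega = (-1 + i\sqrt{3})/2$ so $\beta = \sqrt[3]{\omega}$ is a primitive 9th root of unity and $\gamma = 1/\beta = \overline{\beta}$ (to satisfy the constraint $\beta\gamma = -p/3$), and then

$\alpha_0 = \beta + \overline{\beta} \qquad \alpha_1 = \beta^4 + \overline{\beta}^4 \qquad \alpha_2 = \beta^7 + \overline{\beta}^7$

are visibly real. Nevertheless, the radical extension $M/\mathbb{Q}$ we used for the formula certainly involved complex radicals: we constructed it as

$\mathbb{Q} \subset \mathbb{Q}(\sqrt{-3}) \subset \mathbb{Q}(\sqrt{-3}, \beta) = \mathbb{Q}(\sqrt{-3}, \beta, \omega) = M$

and then recognised it more simply as $\mathbb{Q} \subset M = \mathbb{Q}(\sqrt[9]{1})$. One can prove that the splitting field $L \subset \mathbb{R}$ of $f$ is not a subfield of a real radical extension $M \subset \mathbb{R}$.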
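The description of the roots in Example 13.5 via 9th roots of unity can also be checked numerically. This is a rough sketch under the assumption $\beta = e^{2\pi i/9}$ (one valid choice of cube root of $\omega$); names and tolerances are illustrative only.

```python
import cmath
import math


def f(x):
    return x ** 3 - 3 * x + 1


beta = cmath.exp(2j * math.pi / 9)   # a primitive 9th root of unity
omega = beta ** 3                    # then omega = (-1 + i*sqrt(3))/2

# alpha_0, alpha_1, alpha_2 written as beta^k + conj(beta^k) for k = 1, 4, 7
roots = [beta ** k + (beta ** k).conjugate() for k in (1, 4, 7)]

# The same numbers from Cardano's expression with gamma = conj(beta)
cardano = [omega ** i * beta + omega ** (3 - i) * beta.conjugate() for i in range(3)]

for r in roots:
    print(r.real, abs(f(r.real)))
```

Each root equals $2\cos(2\pi k/9)$ for $k = 1, 4, 7$, so the three values are real even though every intermediate quantity is complex.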

Definition 13.6. A field extension L/K is soluble if L is contained in a field
M with M/K radical. A polynomial f ∈ K[x] is soluble by radicals if its
splitting field L is soluble.
Thus, if f is soluble by radicals, every element of its splitting field has an
expression that involves only nth roots, field operations, and elements of K. In
particular, that is true of the roots of f . That’s not quite the same thing as
saying what the formula for the roots is, but it’s a good start.
We end this section by observing that normal closures of radical extensions
are still radical.
Proposition 13.7. Suppose char K = 0. If L/K is a radical extension, then
there is a finite extension M/L so that M/K is both radical and Galois.
Proof. Let M/L be the normal closure of L determined by Corollary 9.12. We
use the notation of Definition 13.2: K = F0 ⊂ · · · ⊂ Fs = L ⊂ M . The idea
is simple: inside M , work through the steps we would use to build a normal
closure (which M already is), and check that the extensions we build at each
step are radical.
At each step $F_i = F_{i-1}(\alpha_i)$ defining $L$, let $m_i$ be the minimal polynomial of $\alpha_i$ over $K$. (Note: minimal polynomial over $K$, so this is not necessarily a factor of $x^{n_i} - a_i$, since $a_i \in F_{i-1}$ might not be in $K$.) Since $m_i$ is irreducible, for any root $\beta_i$ there is some $\sigma \in \mathrm{Gal}(M/K)$ with $\sigma(\alpha_i) = \beta_i$ by Corollary 9.16. Thus $\beta_i$ is a root of $x^{n_i} - \sigma(a_i)$.
We work inductively through increasing $i = 1, 2, \dots, s$. When $i = 1$, we make radical extensions by roots of $m_1$, each of which is a root of $x^{n_1} - a_1$ since $a_1 \in K$ is fixed by any element of $\mathrm{Gal}(M/K)$. That constructs a radical extension $F_0 \subset F_1 \subset \widetilde{F}_1$ that is also normal, since it is the splitting field for $m_1$ inside $M$.
For $i = 2$, we make radical extensions by roots of $m_2$: indeed, each such root is a root of $x^{n_2} - \sigma(a_2)$ for some element $\sigma \in \mathrm{Gal}(M/K)$, and since $a_2 \in F_1$ and $\widetilde{F}_1/K$ is normal, each $\sigma(a_2) \in \widetilde{F}_1$, by Proposition 9.17, so such roots are indeed radicals over $\widetilde{F}_1$.
Continuing through $i = 1, 2, \dots, s$ creates a radical extension

$K \subset \widetilde{F}_1 \subset \widetilde{F}_2 \subset \cdots \subset \widetilde{F}_s = M_0 \supset F_s = L$

contained in $M$ that splits $m_1 m_2 \cdots m_s$ completely. But then $M = M_0$ by the minimality of normal closures (Definition 9.13), since $M_0$ is the splitting field of a polynomial over $K$ so is a finite, normal extension of $K$ containing $L$. □

13.2 Soluble groups and insoluble quintics


Definition 13.8. A group G is called soluble if there is a chain of subgroups

{id} = G0 ⊂ G1 ⊂ · · · ⊂ Gs = G (21)

where Gi ⊂ Gi+1 is a normal subgroup with abelian quotient Gi+1 /Gi for each
i = 0, 1, . . . , s − 1.

A chain of normal subgroups as in (21) is called a subnormal series, and
when the factors Gi+1 /Gi are all abelian it is called a soluble series.
Remark 13.9. When G is finite and soluble, we can find a soluble series in
which the quotients Gi+1 /Gi are of prime order, and so in particular are cyclic.
Example 13.10. Abelian groups are soluble: {id} ⊂ G is a suitable soluble
series. The symmetric group S3 is soluble: {id} ⊂ C3 ⊂ S3 is a suitable
subnormal series. You can check that S4 is soluble too, as indeed are any of its
subgroups such as D8 , the symmetries of the square.
On the other hand, $A_5$ and $S_5$ are not soluble. In fact, $A_5$ is simple (it has no nontrivial proper normal subgroups at all) and it's easy to see that the only nontrivial proper normal subgroup of $S_5$ is $A_5$, so there's no chance of making a soluble series for $S_5$ either.
Without using simplicity, we can see the insolubility of $A_5$ quite explicitly. For $g, h$ in a group $G$, we define their commutator $[g, h] = ghg^{-1}h^{-1}$. Every non-identity element of $A_5$ is a permutation of the form $(ijk)$, $(ij)(k\ell)$ or $(ijk\ell m)$ with $i, j, k, \ell, m \in \{1, 2, 3, 4, 5\}$ distinct. The following formulas show that every element of $A_5$ is a commutator:

$(ijk) = (ij\ell)(ikm)(i\ell j)(imk)$
$(ij)(k\ell) = (ijk)(ij\ell)(ikj)(i\ell j)$
$(ijk\ell m) = (ij)(km)(im\ell)(ij)(km)(i\ell m)$

Now suppose $H \subset A_5$ is a normal subgroup with $A_5/H$ abelian; denote the (surjective) quotient map by $\pi : A_5 \longrightarrow A_5/H$. For any $\sigma = ghg^{-1}h^{-1} \in A_5$

$\pi(\sigma) = \pi(ghg^{-1}h^{-1}) = \pi(g)\pi(h)\pi(g)^{-1}\pi(h)^{-1} = \mathrm{id}$ in $A_5/H$

so every element of $A_5$ lies in the kernel $H$ of $\pi$. Thus $H = A_5$, so there cannot be a soluble series for $A_5$.


The insolubility of S5 follows easily; see Proposition 13.15 below.
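The three commutator formulas of Example 13.10 can be verified by brute force over all labellings. The sketch below assumes products of permutations are read right to left (the rightmost factor acts first), which is the convention consistent with the conjugation computations later in Example 13.11; the helpers `cycle` and `compose` are written here and are not from the notes.

```python
from itertools import permutations


def cycle(*pts):
    """The cycle (pts[0] pts[1] ...) on {1,...,5}, as a tuple p with p[x-1] = image of x."""
    p = list(range(1, 6))
    for idx, x in enumerate(pts):
        p[x - 1] = pts[(idx + 1) % len(pts)]
    return tuple(p)


def compose(*perms):
    """Product of permutations, applied right to left (rightmost acts first)."""
    result = tuple(range(1, 6))
    for p in reversed(perms):
        # apply what has been accumulated so far, then p
        result = tuple(p[y - 1] for y in result)
    return result


def identities_hold(i, j, k, l, m):
    three_cycle = cycle(i, j, k) == compose(
        cycle(i, j, l), cycle(i, k, m), cycle(i, l, j), cycle(i, m, k))
    double_transposition = compose(cycle(i, j), cycle(k, l)) == compose(
        cycle(i, j, k), cycle(i, j, l), cycle(i, k, j), cycle(i, l, j))
    five_cycle = cycle(i, j, k, l, m) == compose(
        cycle(i, j), cycle(k, m), cycle(i, m, l),
        cycle(i, j), cycle(k, m), cycle(i, l, m))
    return three_cycle and double_transposition and five_cycle


# Check every assignment of distinct labels i, j, k, l, m from {1,...,5}.
ALL_OK = all(identities_hold(*labels) for labels in permutations(range(1, 6)))
print(ALL_OK)
```

Each right-hand side has the shape $ghg^{-1}h^{-1}$, so a successful check exhibits every non-identity element of $A_5$ as a commutator.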
Example 13.11. Let’s show that f = x5 − 10x + 5 ∈ Q[x] is not soluble by
radicals. It is irreducible by Eisenstein and so has 5 distinct roots in a splitting
field L, which we may take to be Q ⊂ L ⊂ C.

Step 1: high-school calculus. Observe that $f$ has 3 distinct real roots and a complex conjugate pair. Indeed $f' = 5(x^4 - 2)$ has two real turning points at $\pm\sqrt[4]{2}$, and then $f'' = 20x^3$ shows they are a positive local maximum at $-\sqrt[4]{2}$ and a negative local minimum at $+\sqrt[4]{2}$, with the function racing monotonically to $\mp\infty$ outside these values.
Name these 5 roots of $f$ as $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{R}$ and $\beta, \overline{\beta} \in \mathbb{C} \setminus \mathbb{R}$, in that order, so $L = \mathbb{Q}(\alpha_1, \alpha_2, \alpha_3, \beta, \overline{\beta})$ and we may treat $\mathrm{Gal}(L/\mathbb{Q})$ as a subgroup of $S_5$.
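The root count in Step 1 can be double-checked with a few integer evaluations. This sketch simply counts sign changes of $f = x^5 - 10x + 5$ along a list of sample points (the function name `sign_changes` and the choice of points are ours); each sign change traps at least one real root by the intermediate value theorem, and the calculus above shows there are at most 3.

```python
def f(x):
    return x ** 5 - 10 * x + 5


def sign_changes(points):
    """Count consecutive pairs of sample points where f changes sign."""
    values = [f(x) for x in points]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


points = [-2, -1, 0, 1, 2]
print([f(x) for x in points])
print(sign_changes(points))
```

Exact integer arithmetic is used throughout, so there is no rounding to worry about.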

Step 2: compute the Galois group. The set of roots that generate $L$ over $\mathbb{Q}$ is permuted (a little) under complex conjugation $\sigma$, so $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ and it is the permutation $(45)$ that fixes the $\alpha_i$ and exchanges $\beta$ and $\overline{\beta}$.

Now $L/\mathbb{Q}$ is Galois, so $\#\mathrm{Gal}(L/\mathbb{Q}) = [L : \mathbb{Q}]$, and that degree is divisible by 5, by the Tower Law since $\mathbb{Q} \subset \mathbb{Q}(\alpha_1) \subset L$ and $[\mathbb{Q}(\alpha_1) : \mathbb{Q}] = 5$ as $f$ is irreducible. Any subgroup of $S_5$ with order divisible by 5 contains an element of order 5 by Cauchy's theorem, and the only elements of order 5 in $S_5$ are the 5-cycles. (Or you can use Sylow's first theorem as a sledgehammer.)
It is well known that $S_5$ is generated by any 2-cycle and any 5-cycle together. We can quickly see why: if $\sigma = (45)$ and $\tau = (12345)$ then

$\tau^{-1}\sigma\tau = (34)$ and similarly we get all $(i, i + 1)$
$(34)(45)(34) = (35)$ and similarly all other transpositions

and any element of $S_5$ is a product of transpositions. So $\mathrm{Gal}(L/\mathbb{Q}) = S_5$.
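The claim that $(45)$ and $(12345)$ generate $S_5$ can be confirmed by closing up under multiplication. The following is a minimal sketch, with permutations of $\{0, \dots, 4\}$ stored as tuples (so the symbols are shifted down by one relative to the text) and helper names chosen here.

```python
def compose(p, q):
    """The permutation 'p after q' on {0,...,4}."""
    return tuple(p[q[x]] for x in range(5))


def generate(generators):
    """Subgroup generated by the given permutations.

    In a finite group, closing up under left multiplication by the
    generators is enough, since inverses are positive powers.
    """
    identity = tuple(range(5))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


sigma = (0, 1, 2, 4, 3)   # the transposition (45), written on {0,...,4}
tau = (1, 2, 3, 4, 0)     # the 5-cycle (12345), written on {0,...,4}

group = generate([sigma, tau])
print(len(group))
```

A result of 120 elements means the generated subgroup is all of $S_5$.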

Step 3: conclude by Theorem 13.1. Since $\mathrm{Gal}(L/\mathbb{Q}) = S_5$ and $S_5$ is not a soluble group, it follows that $f$ is not soluble by radicals.

End of examinable material . . .

. . . but the rest is a great workout of Galois theory

13.3 Abelian Galois groups of simple radical extensions


Lemma 13.12. Let K ⊂ C be a subfield and ζ ∈ C a primitive pth root of unity
for a prime p. Then K(ζ)/K is Galois and Gal(K(ζ)/K) is abelian.
Proof. The polynomial $f = x^p - 1 \in K[x]$ splits completely in $K(\zeta)$ with $p$ roots $\zeta^i$ for $i = 0, 1, \dots, p - 1$, and so is separable. Thus $K(\zeta)/K$ is Galois.
Since the minimal polynomial $m \in K[x]$ of $\zeta$ over $K$ divides $f$, it too splits completely over $K(\zeta)$ with roots a subset of the roots of $f$.
Any elements $\sigma, \tau \in \mathrm{Gal}(K(\zeta)/K)$ are determined by $\sigma(\zeta)$ and $\tau(\zeta)$, which are roots of $m$, say $\zeta^r$ and $\zeta^s$ respectively. So

$\sigma\tau(\zeta) = \sigma(\zeta^s) = (\zeta^r)^s = \zeta^{rs} = (\zeta^s)^r = \tau(\zeta^r) = \tau\sigma(\zeta)$

and $\sigma\tau = \tau\sigma$ as claimed. □
Lemma 13.13. Let K ⊂ C be a subfield with ζ ∈ K, where ζ is a primitive
pth root of unity for a prime p. If α ∈ C satisfies a = αp ∈ K, then K(α)/K is
Galois and Gal(K(α)/K) is abelian.

The idea is that $\alpha = \sqrt[p]{a}$ is a choice of $p$th root of $a$. The proof is virtually identical to the previous one.

Proof. The polynomial $f = x^p - a \in K[x]$ splits completely in $K(\alpha)$ with $p$ roots $\zeta^i\alpha$ for $i = 0, 1, \dots, p - 1$. Thus $K(\alpha)/K$ is Galois.
Since the minimal polynomial $m \in K[x]$ of $\alpha$ over $K$ divides $f$, it too splits completely over $K(\alpha)$ with roots a subset of the roots of $f$.
Any elements $\sigma, \tau \in \mathrm{Gal}(K(\alpha)/K)$ are determined by $\sigma(\alpha)$ and $\tau(\alpha)$, which are roots of $m$, say $\zeta^r\alpha$ and $\zeta^s\alpha$ respectively. So, since $\zeta \in K$ is fixed by $\sigma$ and $\tau$,

$\sigma\tau(\alpha) = \sigma(\zeta^s\alpha) = \zeta^{r+s}\alpha = \tau(\zeta^r\alpha) = \tau\sigma(\alpha)$

and $\sigma\tau = \tau\sigma$ as claimed. □
Corollary 13.14. Let K ⊂ C be a subfield and ζ ∈ C a primitive pth root of
unity for a prime p. If α ∈ C satisfies a = αp ∈ K, then L/K is Galois, where
M = K(ζ) and L = M (α), and the subgroup

Gal(L/M ) ⊂ Gal(L/K)

is a normal abelian subgroup with abelian quotient.

Proof. The polynomial $f = x^p - a \in K[x]$ splits completely over $L$ with $p$ distinct roots $\zeta^i\alpha$ for $i = 0, 1, \dots, p - 1$; in particular, $\alpha$ and $\zeta$ are either roots or combinations of roots of $f$, so $L$ is a splitting field for $f$. Thus $L/K$ is Galois.
Now switch on Galois theory: $M/K$ is normal by Lemma 13.12, so, by Theorem 10.17(c),

$M^* = \mathrm{Gal}(L/M) \subset \mathrm{Gal}(L/K)$

is a normal subgroup with quotient $\mathrm{Gal}(M/K)$ that is abelian by Lemma 13.12. Finally, since $L = M(\alpha)$ and $\zeta \in M$, $\mathrm{Gal}(L/M)$ is abelian by Lemma 13.13. □

13.4 Soluble yoga and the proof


Proposition 13.15. Let G be a group and H ⊂ G a subgroup.

(a) If G is soluble then so is H.


(b) If H ⊂ G is a normal subgroup, then

G is soluble ⇐⇒ H and G/H are both soluble

Proof. We sketch the idea. (See e.g. [Stewart] pp162–3 for fuller details.)
If $\{G_i\}$ is a soluble series for $G$ then we can check that $\{H_i = H \cap G_i\}$ is a soluble series for $H$: normality is preserved and the factors $H_i/H_{i-1}$ are isomorphic to subgroups of $G_i/G_{i-1}$, so also abelian.
If $H \subset G$ is normal, then we can check that $\{J_i = HG_i/H\}$ is a soluble series for $G/H$: the factors are quotients of $G_i/G_{i-1}$ (by applying the higher isomorphism theorems), so also abelian.
For the converse, we can essentially adjoin soluble series $\{H_i\}$ for $H$ and $\{G_j/H\}$ for $G/H$ to get one $\{H_i, G_j\}$ for $G$. □

Theorem 13.16. If $L/K$ is a radical Galois extension, then $\mathrm{Gal}(L/K)$ is soluble.
Proof. We proceed by induction on $n = [L : K]$; the case $n = 1$ holds trivially, so suppose $n > 1$.
As $L/K$ is radical, there is some $\alpha \in L \setminus K$ with $\alpha^p = a \in K$ for some prime $p$. The minimal polynomial $m \in K[x]$ of $\alpha$ over $K$ divides $x^p - a$, and so its roots are all of the form $\zeta^i\alpha$ for some $i$, where $\zeta$ is a primitive $p$th root of unity. Since $\deg(m) \ge 2$, it has at least two roots, and so $\zeta \in L$, being a rational combination of any two distinct $\zeta^i\alpha$.
Consider the radical extension $M = K(\zeta, \alpha)$ of $K$ that lies intermediate

$K \subset M \subset L$

By Corollary 13.14, $M/K$ is Galois, so by Theorem 10.17 $\mathrm{Gal}(L/M) \subset \mathrm{Gal}(L/K)$ is normal with quotient

$\mathrm{Gal}(L/K)/\mathrm{Gal}(L/M) \cong \mathrm{Gal}(M/K)$

which is soluble, also by Corollary 13.14.
Now $L/M$ is Galois and $[L : M] < [L : K]$, and it is radical since $L/K$ and $M/K$ are. So $\mathrm{Gal}(L/M)$ is soluble by induction.
Finally, the yoga of soluble groups Proposition 13.15 concludes the proof by setting $G = \mathrm{Gal}(L/K)$ and $H = \mathrm{Gal}(L/M)$. □

Corollary 13.17. Let $K \subset \mathbb{C}$ be a subfield and $f \in K[x]$ be irreducible. If $f$ is soluble in radicals, then $\mathrm{Gal}(L/K)$ is soluble, where $L$ is a splitting field for $f$.
Proof. Since $f$ is soluble, there is an extension $M/L$ such that $M/K$ is radical.
By Proposition 13.7, there is an extension $N/M$ such that $N/K$ is radical and Galois.
By Theorem 10.17 for $N/K$, since $L/K$ is normal, $\mathrm{Gal}(N/L) \subset \mathrm{Gal}(N/K)$ is a normal subgroup, and

$\mathrm{Gal}(N/K)/\mathrm{Gal}(N/L) \cong \mathrm{Gal}(L/K)$

The soluble yoga of Proposition 13.15 concludes the proof, setting $G = \mathrm{Gal}(N/K)$, which is soluble by Theorem 13.16, and $H = \mathrm{Gal}(N/L)$. □

Optional: the Introduction at The End
This isn’t a conclusion, because we have only just started the subject. Rather,
it is a fantasy introduction that spells out the whole module as if in advance.
Of course it will fail, but by trying to see how it fails, you may see better how
the whole module hangs together, and why we did what we did. As with the
other introductions at the beginning, you could choose not to read this at all.

Field extensions
Life starts with a field $K$ and the ring of polynomials $K[x]$ over it. Given an irreducible polynomial $f \in K[x]$ of degree $n$, we can construct a simple field extension in which $f$ has a root, namely $L = K[x]/(f)$; as a $K$-vector space $\dim_K L = n$. If we already owned such an extension $K \subset M$ and a root $\alpha \in M$, the field $L' = K(\alpha) \subset M$ is another field containing a root of $f$; but an easy argument shows $L \cong L'$, so we are not upsetting anything by doing this, though it feels a bit strange that the element (coset) $\alpha = x + (f) \in L$ models any of the roots of $f$, and we cannot tell which particular one it was. (In fact, the question doesn't even make sense: if we owned a field $M$ containing all the roots, we could map $L \hookrightarrow M$, taking $x$ to whichever root we pleased.)
Repeating this process inductively, we construct an extension $K \subset F$, a splitting field, defined by the property that all of the roots $\alpha_1, \dots, \alpha_n$ of $f$ lie in $F$ (in the sense that $f$ factorises as $f = c\prod_i (x - \alpha_i)$ in the ring $F[x]$) but not in any smaller subfield of $F$. If we already owned a field $K \subset M$ which contained all the roots of $f$, then $F' = K(\alpha_1, \dots, \alpha_n) \subset M$ is another splitting field for $f$; a neat little inductive argument on simple extensions shows that $F \cong F'$, so we are not upsetting anything by our approach.
The splitting field F was designed to contain all the roots of f . But by
magic, and a clever inductive argument on degrees of extensions, it contains all
the roots of any irreducible g ∈ K[x] for which it contains at least one root.
Extensions K ⊂ F with this property are said to be normal, and this plays a
role when we consider K-automorphisms.
Rather quietly, we have taken for granted that f is separable: that is,
that it has n = deg(f ) distinct roots in any splitting field. An argument with
formal differentiation (recovering what we thought we already knew) shows that
separability can only fail if f comprises only p-powers and K has characteristic
p > 0. That’s a very interesting case, but doesn’t feature in a major way in this
module. In particular, working with K = Q, or any other subfield of C, f is
automatically separable.

Field automorphisms
Given a splitting field $K \subset F$ for $f$, let $G = \mathrm{Gal}(f) = \mathrm{Gal}(F/K)$ be the group of all $K$-automorphisms of $F$, and write $d = [F : K] = \dim_K F$. A simple linear independence argument in linear algebra shows that there are at most $d$ $K$-automorphisms of $F$. But the normality of splitting fields readily provides $d$ automorphisms, by redeploying the argument about uniqueness of splitting fields, so $\#G = d$.
Of course, every element $g \in G$ fixes the whole of $K$ pointwise; that's what it means to be a $K$-automorphism. But prima facie, it could be the case that they fix even more: they may fix an intermediate field $K \subset K_0 \subset F$, say. Whether they do or not, a bit more linear algebra shows that $\dim_{K_0} F = \#G$. But we knew at the outset that $\dim_K F = d$, so $K_0 = K$ after all.

The Galois correspondence


We know that the fixed field of G = Gal(F/K) is just K itself, but of course if
we take a subgroup H ⊂ G, it will probably fix more, just because it has fewer
elements so it’s reasonable to expect that it moves things around less than G
did. Conversely, given an intermediate field K ⊂ L ⊂ F , those elements g ∈ G
that fix L pointwise form a subgroup, namely AutL (F ).
Thus there is a correspondence between subgroups H ⊂ Gal(F/K) and in-
termediate fields K ⊂ L ⊂ F . Some careful counting and basic group theory
(normal subgroups, quotient groups and the first isomorphism theorem), to-
gether with the trivial observation that if F splits f over K then it also splits
f over L (it doesn’t matter that f may not be irreducible over L) so that the
previous results for splitting fields apply to L-automorphisms of F too, shows
that this correspondence bijectively pairs up all the subgroups and intermediate
fields by their stabilising effects on one another, reversing inclusion as it does
so. That’s the Galois correspondence.

Roots of polynomials
You'll like this. Suppose $f \in \mathbb{Q}[x]$ has roots $\alpha_1, \dots, \alpha_n \in \mathbb{C}$. Thus $L = \mathbb{Q}(\alpha_1, \dots, \alpha_n) \subset \mathbb{C}$ is a splitting field for $f$. By the Galois Correspondence,
L has only finitely many strict subfields K1 , . . . , Ks . Each of these is a Q-vector
subspace of L of dimension strictly less than dimQ L. So there is no way that
their set-theoretic union can be the whole of L. So pick any α ∈ L that does
not lie in any Ki . Then clearly L = Q(α): indeed Q(α) is a subfield of L, but
it is certainly not contained in any strict subfield of L by choice of α. This
result, that L = Q(α) for some α, is called the Theorem on the Primitive
Element.
On the face of it, the Theorem on the Primitive Element only applies to
splitting fields L ⊃ Q, since it used the Galois correspondence, and that requires
normality. But if K ⊃ Q is any finite extension, then there is a normal closure
K ⊂ N and we can hit N/Q by Galois Theory: it has only finitely many
subfields, so a fortiori K does too, and the rest of the argument follows.
But in some ways our initial ambition was the opposite. Rather than trying
to re-express L in a simple way by using only one non-rational number, we
want to express L by adjoining radicals, one at a time. (Or, at least, if not L
itself, then some field that contains L—just as Cardano put up with the fact
that he might adjoin strictly complex radicals even when computing real roots.)

That is not always possible. If we build a sequence of extensions by roots of polynomials of the form $x^p - a$, then over on the other side of the Galois Correspondence we must be breaking down $G = \mathrm{Gal}(f)$ by normal subgroups with quotients like $\mathbb{Z}/p$; such groups are called soluble, and have a more precise definition than what I just said. We need a little care to say that correctly, but
having determined that G must decompose in a particular way, we can ask
whether actual Galois groups we know do indeed decompose like that. And
pretty quickly we spot that S5 does not, and with that motivation in hand, it
doesn’t take long to find a polynomial f with Galois group S5 . There we have
it: a polynomial that cannot be solved by radicals—centuries of work in rubble,
if you’re a glass-half-empty sort of person.

Where might we go next?


What would we do next if we had another module to play with?
(a) Prove the converse implication to Theorem 13.1.

(b) Understand much more fully which finite groups are soluble. For example,
what are the soluble transitive subgroups of Sn for n = 3, 4, 5, 6, and how
do they relate to polynomials that we know?
(c) More precisely, which groups can arise as Gal(f ) for some polynomial
f ∈ Q[x]? This is called the inverse Galois problem and it is not
solved. You would become famous if you solved it.
(d) We have worked mostly with polynomials defined over Q. What if we
worked instead over a transcendental extension Q(t)? The Galois Corre-
spondence still holds, of course (it holds over any field), but what does it
mean? What about other fields?

(e) What about characteristic p > 0, and in particular what information can
we recover in the non-separable case?

References
[1] https://mathshistory.st-andrews.ac.uk/Biographies/Ferro/
https://mathshistory.st-andrews.ac.uk/Biographies/Tartaglia/
https://mathshistory.st-andrews.ac.uk/Biographies/Cardan/
[Stewart2015]
