Galois
This is a module on Galois Theory and the Theory of Fields and their appli-
cation to various famous historical problems. The aim is to gain a much more
precise understanding of what it means to solve a polynomial equation in one
variable—a problem you might have hoped you’d sewn up at school.
Why wasn’t it sewn up at school? The main reason is that we didn’t know
how to speak properly then: intrinsically the ideas are simple and natural, and
they explain everything we know already and much much more, but only once
we have the language, concepts and examples that you get from a maths degree.
There are 4 things you need to follow this:
(a) enthusiasm to fool around with polynomials, using long division, GCDs,
factorisations and roots of unity, and generally enjoying their company;
(b) basic linear algebra: linear independence, bases and dimension;
(c) basic group theory: subgroups and normal subgroups in permutation
groups, and the basic isomorphism theorems;
(d) basic ring theory: ideals I in the ring R = ℝ[x] of real polynomials and their quotient rings R/I, which
really just means more fun getting your hands on polynomials.
We will review them in the notes, and you could also do that in advance now. At
the pointy end, we will use a little more algebra (such as knowing the subgroups
of a quotient group G/N ), but that’s a long way off when things will be clearer.
It’s called Galois Theory in honour of Évariste Galois, who famously died
following a duel aged 20 in May 1832. He was born near Paris in Bourg-La-
Reine, which was twinned with Kenilworth on the 150th anniversary of his
death—you can see the sign on the Kenilworth Road. He lives forever in the
maths pantheon, though he’d barely started—it will be centuries before we can
even imagine what he may have taught us.
His miracle was to see the conceptual symmetry behind an array of polyno-
mial root calculations, even without the language we use—that was developed
over the 19th century and completely rewritten at least twice in the 20th. Many
basic calculations had been known for centuries, since shortly before the sym-
metric gardens were built in Kenilworth Castle for the visit of Queen Elizabeth I,
when symmetry was in fashion in Warwickshire. Galois spent the night before
the duel writing down these ideas so we could have them.
It’s best not to leave revision to the night before the exam.
Contents
1 Introduction
1.1 Factorisation of polynomials and splitting field
1.2 Building up a factorisation
1.3 Subfields of the splitting field and the roots of f
1.4 Permuting the roots: the Galois Correspondence
1.5 One punchline: insoluble quintics
1.6 What’s where when?
4 Reminders
4.1 A reminder about fields
4.2 Ring homomorphisms, first isomorphism theorem
4.3 Characteristic of a field
4.4 Automorphism group of a field: the main invariant
4.5 Polynomials in one variable, rings, ideals and all that
6 Field extensions I
6.1 Extensions as vector spaces
6.2 Adjoining a square root to a subfield of C
6.3 Simple extensions
6.3.1 Standard models for simple extensions
6.3.2 A basis of a simple algebraic extension K(α)/K
6.4 Adjoining a root of a polynomial
6.5 Adjoining multiple algebraic elements
6.6 Algebraic extensions and finite extensions
6.7 Maps between fields I
7 Applications: antiquities
7.1 Euclidean geometry (nonexaminable)
7.2 Translate into Field Theory
7.3 Impossible constructions
9 Field extensions II
9.1 Splitting fields
9.2 When is an extension normal?
9.3 Maps between fields II
9.3.1 Extending maps from subfields of normal extensions
9.3.2 Restricting maps from fields containing normal extensions
9.4 When is an extension separable?
9.4.1 Characterisation of separable polynomials
10 Galois Theory
10.1 Galois extensions
10.2 Fixed fields and stabilisers
10.2.1 The subfield lattice
10.2.2 The subgroup lattice
10.2.3 Order-reversing maps between the two lattices
10.3 The Galois Correspondence
11 Biquadratic extensions
11.1 $[L : K] = 8 \implies \mathrm{Gal}(L/K) \cong D_8$
11.2 $\sqrt{b(a^2 - b)} \in K \implies [L : K] = 4,\ \mathrm{Gal}(L/K) \cong C_4$
11.3 $\sqrt{a^2 - b} \in K$ and $[L : K] = 4 \implies \mathrm{Gal}(L/K) \cong C_2 \times C_2$
11.4 Otherwise $[L : K] = 2$, $\mathrm{Gal}(L/K) \cong C_2$
11.5 Examples of each type
12 Finite fields
12.1 The existence of finite fields
12.2 The Frobenius map
12.3 The Galois group of finite fields
1 Introduction
We will work with fields. You should remind yourself of the formal definition of
field, but in short K is a field if it has the four operations +, −, ×, ÷. Familiar
examples include R and C. In fact R ⊂ C is a subfield: R is a subset of C that
contains 1 and is closed under the field operations of C. We say that R ⊂ C is
a field extension.
There are a few very powerful ideas we’ll use repeatedly. Here’s the first:
since R ⊂ C, we can regard C as a vector space over R. (Convince yourself of
that: we use +, − in C, and then can do ‘scalar multiplication’ from R just by
multiplying inside C. Treating C as an R-vector space temporarily forgets most
of the information about ×, ÷ in C.) In fact, we already have a favourite basis
for C as an R-vector space: $\{1, i\}$.
Thus $\dim_R C = 2$, the size of that R-basis (or any other R-basis), which is why
you surely picture the complex numbers as the Argand diagram, Figure 1, with
its “real” and “imaginary” axes, and the unit circle marked in to help work out
where the numbers are, and some points in polar and cartesian coordinates.
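[Figure 1 (Argand diagram): the real axis R and the imaginary axis of C, with the unit circle, the points 1 and i, the fifth roots of unity ω = exp(2πi/5), ω², ω³, ω⁴, and a general point z = r exp(iθ) = r(cos(θ) + i sin(θ)) at angle θ.]
Figure 1: The complex numbers on the Argand diagram. The sum of the vertices
of any regular polygon centred at 0 is zero: e.g. $1 + \omega + \omega^2 + \omega^3 + \omega^4 = 0$. (That’s
just physics for you: where else could the centre of gravity possibly be?)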
Here’s the second powerful idea. Rather than regarding C as the main object,
we could focus on R, and treat C as ‘R plus a bit more’: C is the smallest field
that contains both R and the symbol i (which satisfies $i^2 = -1$, as ever). We
say that we can extend R by i to get C, and we write C = R(i). This is
probably how you learned it in the first place, though without that notation.
OK. Bank all that for now. We will review those points formally later, and
in any case we’ll usually work with much smaller fields like Q.
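\begin{align*}
f &= (x - \alpha)(x - \alpha\omega)(x - \alpha\omega^2) \\
  &= x^3 - (\alpha + \alpha\omega + \alpha\omega^2)x^2 + (\alpha\cdot\alpha\omega + \alpha\cdot\alpha\omega^2 + \alpha\omega\cdot\alpha\omega^2)x - (\alpha\cdot\alpha\omega\cdot\alpha\omega^2) \\
  &= x^3 - \alpha(1 + \omega + \omega^2)x^2 + \alpha^2(1 + \omega + \omega^2)x - \alpha^3 = x^3 - 2,
\end{align*}
and that there are no Q-linear relations between $\{1, \alpha, \alpha^2, \omega, \alpha\omega, \alpha^2\omega\}$, so they
form a Q-basis of L, and $\dim_Q L = 6$. As a first sanity check we see that $\omega^2 \in L$
can indeed be expressed in this basis: $\omega^2 = -1 - \omega$.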
$$f = (x - \alpha)(x^2 + \alpha x + \alpha^2),$$
where we figure out the quadratic factor by putting in unknown coefficients,
multiplying out and asking that to equal f . This factorisation does not take
place in Q[x]. The two factors do not have rational coefficients, but have coef-
ficients in the field K = Q(α), which by definition is the smallest subfield of C
that contains Q and α. It’s not hard to persuade yourself that
$$K = \{a + b\alpha + c\alpha^2 \mid a, b, c \in Q\},$$
and we’ll see a simple lemma later that explains this precisely. As a first sanity
check, 1 + α ∈ K, so what is its inverse? Well, since $\alpha^3 = 2$, we see that
$(1 + \alpha)(1 - \alpha + \alpha^2) = 3$, so $(1 + \alpha)^{-1} = \frac13(1 - \alpha + \alpha^2)$.
To complete the factorisation of f , we just need to split the remaining
quadratic factor. (It does not factorise over K, as we see in a second.) By
the quadratic formula, it has roots
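$$x = \frac{-\alpha \pm \sqrt{\alpha^2 - 4\alpha^2}}{2} = \alpha\left(-\tfrac12 \pm \tfrac{\sqrt{-3}}{2}\right) = \alpha\left(-\tfrac12 \pm i\tfrac{\sqrt3}{2}\right) = \alpha\omega \text{ and } \alpha\omega^2.$$
So there’s the factorisation (and, incidentally, the proof that the quadratic can’t
factorise over K, since K ⊂ R but $\omega, \omega^2 \notin R$).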
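$$\lambda_1 = \alpha, \quad \lambda_2 = \alpha\omega, \quad \lambda_3 = \alpha\omega^2$$
[Figure 2: the lattice of subfields of L = Q(α, ω), showing the intermediate fields K = Q(α), Q(αω), Q(αω²) and Q(ω) = Q(√−3); the integer labels on the edges give the degrees of the extensions.]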
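$\alpha \mapsto \alpha\omega$ (i.e. $\lambda_1 \mapsto \lambda_2$)
$\alpha\omega \mapsto \alpha$ (i.e. $\lambda_2 \mapsto \lambda_1$)
so $\alpha\omega^2 = (\alpha\omega)^2/\alpha \mapsto \alpha^2/(\alpha\omega) = \alpha/\omega = \alpha\omega^2$ (i.e. $\lambda_3$ is fixed)
and so e.g. $\omega = \alpha\omega/\alpha \mapsto \alpha/(\alpha\omega) = 1/\omega = \omega^2 = -1 - \omega$.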
You can check similarly how all the basis elements of L in the expression in (1)
are mapped, and this realises the permutation σ = (12) of the roots by a map
L −→ L of the splitting field to itself—an automorphism of L, as we say.
(We will need it to be a ring homomorphism, not just a bijection or a K-linear
map, but let’s save that for later.) It moves the elements of L around. The two
subfields Q(α) and Q(αω) are swapped, while the elements of Q(αω 2 ) are not
moved at all. An element a+bω ∈ Q(ω) is taken to a+b(−1−ω) = (a−b)+(−b)ω,
so the subfield Q(ω) is fixed as a set, even though its elements are moved around
inside it.
Similarly, we could check whether each of the other permutations of S3 is
realised by a map of the splitting field L −→ L to itself. In this particular
example, in fact they all are.
It is easy to understand the set of all subgroups of S3 : Figure 3 lists them all,
where the lines indicate subgroup inclusions going down the page. The integer
labels indicate the index of each subgroup: {id, (23)} ⊂ S3 is a subgroup of size
2 in a group of size 6, so has index 6/2 = 3.
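[Figure 3: the lattice of subgroups of S3, from {id} at the top to S3 at the bottom; the integer labels on the edges give the index of each inclusion.]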
You will notice that Figures 2 and 3 are virtually identical, though they refer
to completely different mathematical objects: one describes subfields of L, the
other, subgroups of S3 , albeit with inclusions reversed. Amazing, but no fluke.
L → L, and if the subgroup lattice doesn’t have a path with ‘cyclic quotients’,
then there can’t be any radicals. (I’m oversimplifying, but not by much.)
If $f = x^5 - 10x + 5$, we can deduce the subgroup lattice with nothing more
than some high-school calculus, a few permutation calculations and a bit of the
theory we will develop. To understand a group, we don’t need to understand
every element: some generators and relations will do. So we don’t need to
understand every automorphism L → L, nor every root of f . The absence of
some suitable subgroups in the subgroup lattice will mean at once that there
could not possibly be a formula in radicals for the roots of f .
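The ‘high-school calculus’ step can be sanity-checked numerically. The sketch below is only an illustration (the function names are mine, and I am assuming the relevant calculus fact is the number of real roots): it counts sign changes of f at a few integer points. Since f′ = 5x⁴ − 10 has only two real zeros, f has at most three real roots by Rolle’s theorem, so three sign changes would mean exactly three real roots and two non-real ones.

```python
def f(x):
    """The quintic f = x^5 - 10x + 5 from the text."""
    return x**5 - 10 * x + 5


def sign_changes(points):
    """Count sign changes of f along an increasing list of sample points."""
    values = [f(x) for x in points]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


# Each sign change traps at least one real root (intermediate value theorem).
samples = [-2, -1, 0, 1, 2]
print(sign_changes(samples))  # prints 3
```

The sample values are −7, 14, 5, −4, 17, so the roots are trapped in the intervals (−2, −1), (0, 1) and (1, 2).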
Third Quarter We look in detail at the subfields that are fixed by automorphisms
in §8, and then work out everything we claimed above for $x^3 - 2$ in §8.3.
§9 identifies the most important properties of field extensions, and how these
relate to maps between fields. We put all this together in §10, which explains the
revolutionary idea behind Galois Theory, and the main result that illuminates
everything.
In short, the idea is this: if f (x) is a polynomial, then, whether we know
them or not, the set of its roots has symmetries, in a very similar way that a
polygon in the plane has symmetries. It turns out that properties of the group
of all symmetries reflect precisely properties of the roots themselves. But that’s
great, because in our context these groups are small finite objects that we can
analyse by hand, whereas solving polynomials in general seems impossible.
Fourth Quarter We apply that first in §11 to the substantial set piece of ‘bi-
quadratic’ quartics, where we can get hold of the 4 roots in fairly explicit terms,
and then to the classification and analysis of all finite fields in §12. Finally, in
§13, we return to the most famous problem of them all: the insolubility of the
quintic.
2 Essays on what you already know
Here are three alternative introductions, and there’ll be another in §13.4 at the
end. You don’t have to read any of these, or you could skim them in case one
may be interesting later when you have something substantial to tie it to.
$$z\bar z = a^2 + b^2 = |z|^2 \in R$$
(where |z| is the Euclidean length of z), we can handle division via multiplication:
if $z \neq 0$ then
$$\frac{1}{a + ib} = \frac{1}{z} = \frac{\bar z}{z\bar z} = \frac{a}{a^2 + b^2} - i\,\frac{b}{a^2 + b^2}.$$
This works because both z and its reflection $\bar z$ meet in the denominator, and
since this is symmetric in i and −i it has no real choice but to be real (or see
the actual calculation above, if you don’t like rhetoric).
Roots of unity and other radicals From your life so far, you are indeed
most excellent at working in the complex numbers. For example, you know
what it means to extract n-th roots of a real or complex number in analytical
and geometrical and computational terms:
(a) If x ∈ R and x ≥ 0 then there is a unique real number y ≥ 0 such that
$y^n = x$. In exactly this context you write $y = \sqrt[n]{x}$. (If pressed to find
this number, you could easily compute its decimal expansion by systematic
trial and error, or apply Newton–Raphson or similar.)
(b) If z ∈ C, then you can find a complex number t ∈ C so that $t^n = z$. It does
no harm to suppose $z \neq 0$, and write z = r exp(iθ) in polar form, where
r ∈ R has r ≥ 0 and θ is well defined up to 2πZ. Then t = y exp(iθ/n)
where $y = \sqrt[n]{r}$. You wisely tend not to denote this by $\sqrt[n]{z}$ since it is not
unique. Instead you do something much more precise, as follows.
(c) Let ω = exp(2πi/n) = cos(2π/n) + i sin(2π/n), so that the complex numbers
$1 = \omega^0, \omega, \omega^2, \dots, \omega^{n-1}$ are the n distinct complex roots of the
polynomial $x^n - 1$. Geometrically you can’t help but picture these as the
vertices of a regular n-gon, inscribed within the unit circle, with one of
the vertices pinned down at 1 ∈ C (to stop the whole thing rotating). Our
choice of ω makes it the first vertex anticlockwise after 1, $\omega^2$ the next, and
so on. The 5th roots of unity are drawn at the vertices of a pentagon in
Figure 1.
(d) Now if t is any n-th root of z as above, then $t, \omega t, \omega^2 t, \dots, \omega^{n-1} t$ are the
n distinct complex roots of the polynomial $x^n - z$.
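Steps (a) to (d) can be sketched in a few lines of Python. This is a rough illustration in floating point, not part of the notes: the function names are made up, the fixed number of Newton–Raphson steps is an arbitrary choice of mine, and the polar form comes from the standard library `cmath.polar`.

```python
import cmath
import math


def real_nth_root(x, n, steps=60):
    """Newton-Raphson for the unique y >= 0 with y**n == x, where x >= 0."""
    if x == 0:
        return 0.0
    y = max(x, 1.0)  # start at or above the root, so the iterates decrease to it
    for _ in range(steps):
        y = y - (y**n - x) / (n * y**(n - 1))
    return y


def complex_nth_roots(z, n):
    """The n roots t, wt, ..., w^(n-1) t of x^n - z, for z != 0, as in (b)-(d)."""
    r, theta = cmath.polar(z)                 # z = r exp(i theta)
    t = real_nth_root(r, n) * cmath.exp(1j * theta / n)
    w = cmath.exp(2j * math.pi / n)           # first vertex anticlockwise after 1
    return [t * w**k for k in range(n)]


fifth_roots_of_unity = complex_nth_roots(1, 5)
print(abs(sum(fifth_roots_of_unity)))  # tiny: the vertices of the pentagon sum to 0
```

Of course this leans on exactly the decimal approximations that the algebraic approach is designed to avoid, so treat it as a picture rather than a proof.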
Solving polynomials by radicals When you solve $ax^2 + bx + c = 0$, you trick
it into being a question of finding radicals: you change coordinates so that a
square root does the job. At that point, most people just remember the famous
formula, and know that they could figure it out again by completing the square
if necessary.
We will see that there are also formulas for cubic and quartic polynomials.
Nobody remembers them, and they are harder to figure out—they were the
secret technology of their time. In higher degree, there are also formulas for
special cases. But the striking fact is that the roots of a typical polynomial of
degree ≥ 5 cannot be expressed in radicals at all: in particular, there cannot
be a radical formula to solve the general case. Even more remarkably, we can
prove that fact, and we will, precisely as Galois taught us.
$$a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0.$$
For example, if you want to know how big a cube-shaped tupperware you’ll
need to fit in the 2kg of mashed potatoes you have left over from Christmas,
you’ll look for a real number x that satisfies $x^3 - 2 = 0$. (I may be glossing over
some of the physics of mashed potatoes here, but then only you really know
the density you made them, so only you can assign sensible units to that 2. I’ll
leave that with you.) We sometimes think of C as the big super field (where
we know there’ll be at least one solution, and no more than n of them, even
though we may not be very clear about what they mean in Tupperware terms)
and R as the smaller, specialised ‘real’ field (where we might prefer solutions
to lie, especially if the solutions have some meaning in the real, mashed potato
world of fitting things into the fridge).
There is no intermediate field lying strictly between the two: if L is a field
satisfying R ⊂ L ⊂ C, then L = R or L = C. (This is clear: if α = a + ib ∈ L \ R,
then $b \neq 0$, so $i = \frac{1}{b}(\alpha - a) \in L$, since a, b, α ∈ L, and so L = C.)
But in fact C has billions upon infinities of subfields K ⊂ C that are not
equal to R. One of them is Q, the familiar rational numbers, and every subfield
K ⊂ C must contain Q (because K contains 1, so it must also contain 2 = 1 + 1
and −3 = −(1 + 1 + 1), whence also $\frac12$ and $-\frac32$, and so on).
If we solve a single polynomial $f = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$ with
integer or rational coefficients, then in retrospect we didn’t need anything like
the whole of the field C to work in: we could have made do with the smallest
subfield of C that contains the roots of f . This sounds a bit theoretical: at the
outset, we don’t know the roots, and we probably think of that as the point. But
at least we know it has complex roots, so they are out there in C somewhere.
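For example, $x^3 - 2$ (the mashed potato problem) has a single positive real
root $\alpha = \sqrt[3]{2}$¹ and a pair of complex conjugate roots, $\alpha\omega$ and $\alpha\omega^2$, where
$\omega = -\frac12 + \frac{\sqrt3}{2}i$. The fields
$$Q \subset K = Q(\alpha) \subset L = Q(\alpha, \omega) \subset C$$
are all different, but all interesting in their own ways for $x^3 - 2$.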
To a large extent, this module studies fields like K and L—they are fabbo
examples that we’ll see again, so if you want to brush up your cube roots of
unity, be my guest. To get some idea of how economical they are, both K
and L are countable (thought of merely as sets), whereas both R and C are
uncountable, so K and L are minuscule in comparison. And while the Argand
diagram in Figure 1 is an absolutely brilliant visual aid for understanding C,
we stand no chance at all of drawing K or L on it in any reasonable way,
since they consist of infinitely many numbers spread around densely (in R or
C respectively) in an exquisite filigree of magic arithmetical dust—drawing Q
itself is just as impossible, though perhaps we don’t find it so mysterious.
So from one point of view, you could think of the brutal work that analysis
and topology do in completing Q to get R as grotesquely violent, smashing these
delicate little fields out of the way. The problem isn’t with C itself—once we
own R, it’s a very small step to get to C: they teach it at high school, completely
ignoring the damage that R has already caused. The problem is with that earlier
step where we thought that R would be a good idea.
But come on, that’s a bit rich: you probably only believe in $\alpha = \sqrt[3]{2}$ in the
first place thanks to the completeness of R. So let’s not malign R, but let’s note
that if we want to begin to understand the billions upon infinities of subfields
of C, we had better not ask them to contain R.
¹ approximately 1.25992104989487316476721060727831982242676643638515515135310740557628150781965814530849457
We love C because it is familiar and comes with a fantastic picture, but also
because it is algebraically closed, so the roots of polynomials behave as well as
they possibly could. But there is a smaller algebraically closed field A ⊂ C, the
field of algebraic numbers (see §6.6), that contains all the roots of all polynomials
with rational coefficients, but does not contain R. We could have decided to
work in A instead of C. Some people do, but not before they’ve taken this
module. (Unless you did Algebraic Number Theory last term, in which case
you know A very well. But then you’re probably waiting for a lorryload of
machinery to be delivered by this module before you believe it.)
One final point: we can work in our fields K and L purely symbolically. That
is, any element is some rational expression in α and ω, and to tell whether two
are the same is a simple matter of algebraic manipulation (in principle, at least).
In particular, we do not need to work with decimal expansions, which is good
because we will never own the whole expansion of $\alpha = \sqrt[3]{2}$, and if some other
real number $\beta \neq \alpha$ had the same expansion for the first 1000 terms, we would
probably never know they were different. So we could program K and L on a
computer in precise terms without worrying about the (im)precision of floating
point calculations. Of course, since we just thought of it, it has been done
hundreds of times: all the calculations we will see involving only the subfield A ⊂
C are available on multiple computer algebra systems. (That’s a bit rich too,
though: decimal expansions are such an easy, flexible and powerful tool, that
many implementations of fields like K will carry around a decimal expansion
alongside the symbols to avoid the worst of the algebraic manipulation that I
pretended would be easy. There is some proverb about babies and bathwater
or possibly gift horses and their mouths. . . )
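To make ‘purely symbolically’ concrete, here is a minimal sketch of exact arithmetic in K = Q(α), storing a + bα + cα² as a triple of rationals and using only the rule α³ = 2. The class name `K` and its layout are my own illustration, assuming the three-term description of K; real computer algebra systems are far more careful and far more general.

```python
from fractions import Fraction


class K:
    """Element a + b*alpha + c*alpha^2 of Q(alpha), where alpha^3 = 2."""

    def __init__(self, a=0, b=0, c=0):
        self.coeffs = (Fraction(a), Fraction(b), Fraction(c))

    def __eq__(self, other):
        return self.coeffs == other.coeffs

    def __add__(self, other):
        return K(*(s + t for s, t in zip(self.coeffs, other.coeffs)))

    def __mul__(self, other):
        a, b, c = self.coeffs
        d, e, f = other.coeffs
        # Multiply out, then replace alpha^3 by 2 and alpha^4 by 2*alpha.
        return K(a * d + 2 * (b * f + c * e),
                 a * e + b * d + 2 * c * f,
                 a * f + b * e + c * d)

    def __repr__(self):
        return "K({}, {}, {})".format(*self.coeffs)


one_plus_alpha = K(1, 1, 0)
print(one_plus_alpha * K(1, -1, 1))  # prints K(3, 0, 0)
```

No decimal expansion appears anywhere: equality of two elements is just equality of three rational numbers.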
polynomials and set f to be zero whenever you come across it: or, if you prefer,
set $x^2$ to be −1 wherever you see an $x^2$, which amounts to the same thing, since
$x^2 + 1 = 0$ and $x^2 = -1$ are not so very different. So you see why it is reasonable
to work in $R[x]_{<2}$: whenever you see a higher power of x, whittle it down by
replacing occurrences of $x^2$ by −1. So in this sense it’s reasonable to think of
L = {a + bx | a, b ∈ R}, together with slightly odd ‘wrap-around’ behaviour
when calculating, just as it is reasonable to treat Z/pZ as {0, 1, . . . , p − 1}.
In fact, L is a field, whichever way you work with it. For example,
$$g(\alpha) = g(y)|_{y=x} = x^2 + 1 = 0.$$
That is, x ∈ L is a root of g ∈ L[y]. And, if you try it you’ll see that −x is
another root. So $y^2 + 1 = (y - x)(y + x)$ factorises in L[y].
But hang on a moment: factorising $y^2 + 1$ is what C is for. You’ve probably
already spotted what’s going on. At the outset, we called the variable x. We
could have called it anything: t or z or ξ or, oddly but legally, i. And if we had
called it i, then everything we’d talked about throughout the whole discussion
would have been exactly as though we were talking about C. More precisely,
there is an isomorphism of fields L → C taking a + bx to a + bi that takes the
subfield R of L identically to R inside C. There is also a second isomorphism
L → C that takes a + bx to a − bi. That’s fine: I have no idea which root of
$y^2 + 1$ the quantity x is and neither do you; there is no answer to that: we just
say to ourselves that we are happy that there are two maps L ↪ C, and we can
pick either one of them and stick to it.
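If you want to see the ‘wrap-around’ rule and the two maps L → C side by side, the following sketch may help. It represents a + bx as the pair (a, b); the function names are invented for this illustration, and floating-point numbers stand in for real numbers.

```python
def mul(u, v):
    """Multiply a + b*x and c + d*x as polynomials, then replace x^2 by -1."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 becomes -1.
    return (a * c - b * d, a * d + b * c)


def to_complex(u):
    """One isomorphism L -> C: send a + b*x to a + b*i."""
    a, b = u
    return complex(a, b)


def to_complex_conj(u):
    """The other isomorphism L -> C: send a + b*x to a - b*i."""
    a, b = u
    return complex(a, -b)


x = (0, 1)
print(mul(x, x))  # prints (-1, 0): x really is a square root of -1 in L
```

Both maps turn `mul` into ordinary complex multiplication, which is the computational content of saying that either choice of root is as good as the other.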
We’ll formalise this trick in §6.7, and use it infinitely often: to some extent,
it’s one of the central points of field theory and is the main tool under the surface
of this module, so if you got something from that discussion that’s great, and if
not you can see it done properly later.
3 Solving cubic polynomials
3.1 The quadratic formula
To find the roots of a quadratic polynomial $f = x^2 + ax + b$, with a, b ∈ C, you
complete the square by coordinate change x = y − a/2 and then look to solve
$$y^2 = c,$$
where $c = \frac14(a^2 - 4b)$. Since the nature of the square root you want to extract
now depends on the sign of c, you define the discriminant
$$\Delta = a^2 - 4b.$$
With that notation, $y = \pm\sqrt{c} = \pm\frac12\sqrt{\Delta}$, whence
$$x = y - a/2 = \frac{-a \pm \sqrt{\Delta}}{2},$$
as everybody knows.
Notice that the ± sign isn’t just there as an afterthought: if we fix α to be
one of the solutions of $z^2 = \Delta$, then $\alpha\omega_2$ is the other, where $\omega_2$ is a primitive
square root of unity, that is, $\omega_2 = -1$. That is, the ± is the visible action of
the group Z/2Z = {1, −1} ⊂ C of square roots of unity.
Viewed from the other end, if α, β ∈ C were the roots of f , we would have
• $q \neq 0$, otherwise the roots are y = 0 and the square roots of −p.
Out of the blue, define a type of cubic discriminant
$$D = q^2 + \frac{4p^3}{27} \tag{4}$$
and fix a complex number $\alpha = \sqrt{D}$; that is, α is either solution of $z^2 = D$.
Now fix a cube root
$$\beta = \sqrt[3]{\frac{-q + \alpha}{2}}, \tag{5}$$
that is, β is any solution of $z^3 = \frac12(-q + \alpha)$, and similarly, pick another cube
root
$$\gamma = \sqrt[3]{\frac{-q - \alpha}{2}}. \tag{6}$$
There was choice in each of α, β, γ. You’ll have noticed that we’ve already
‘balanced’ α with −α, but I don’t yet know how to ‘balance’ the cube roots.
Let ω be a primitive cube root of unity: that is $\omega^3 = 1$ and $\omega \neq 1$. (We
could be more prescriptive: since $\omega^3 - 1 = (\omega - 1)(\omega^2 + \omega + 1)$ and $\omega \neq 1$ we
must have $\omega^2 + \omega + 1 = 0$, so by the quadratic formula $\omega = \frac12(-1 \pm i\sqrt3)$. But I
don’t care which primitive root of unity ω is, so let’s not bother spelling it out.)
The key to choosing the cube roots is to consider the product $(\beta\gamma)^3$, which
is indifferent to all those choices. From here on in, you could follow your nose,
but I’ll work it out:
\begin{align*}
(\beta\gamma)^3 = \beta^3\gamma^3 &= \left(\frac{-q + \alpha}{2}\right)\left(\frac{-q - \alpha}{2}\right) \\
&= \tfrac14(q^2 - \alpha^2) \\
&= \tfrac14(q^2 - D) \\
&= \tfrac14\bigl(q^2 - (q^2 + 4p^3/27)\bigr) \\
&= -p^3/27 = (-p/3)^3.
\end{align*}
Thus βγ is one of the 3 cube roots of $(-p/3)^3$, and so the 3 distinct roots are
βγ, ωβγ and $\omega^2\beta\gamma$. Visibly −p/3 is one of these cube roots, so after making an
arbitrary choice of β we may choose γ so that βγ = −p/3.
Now I claim we have the three roots:
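$$y = \beta + \gamma \quad\text{or}\quad \omega\beta + \omega^2\gamma \quad\text{or}\quad \omega^2\beta + \omega\gamma. \tag{7}$$
We only need to check the first of these: the other two come from making
the other two possible choices for β, and then rechoosing γ so their product
remains −p/3, so they work if β + γ works. Since βγ = −p/3,
\begin{align*}
y^3 &= \beta^3 + 3\beta^2\gamma + 3\beta\gamma^2 + \gamma^3 \\
&= \frac{-q + \alpha}{2} - p\beta - p\gamma + \frac{-q - \alpha}{2} \\
&= -p(\beta + \gamma) - q = -py - q,
\end{align*}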
so $y^3 + py + q = 0$ as required.
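Here is a numerical sketch of the recipe in (4) to (7), mainly to show the compatibility condition βγ = −p/3 doing its job. The function name is hypothetical, p and q are assumed nonzero, and taking the principal square and cube roots is just one arbitrary choice among those the text allows; γ is then forced rather than chosen.

```python
import cmath


def reduced_cubic_roots(p, q):
    """Roots of y^3 + p*y + q (p, q nonzero), following (4)-(7) in floating point."""
    D = q * q + 4 * p**3 / 27
    alpha = cmath.sqrt(D)                     # either square root of D would do
    beta = ((-q + alpha) / 2) ** (1 / 3)      # any cube root; here the principal one
    gamma = -p / (3 * beta)                   # forced by beta * gamma = -p/3
    omega = cmath.exp(2j * cmath.pi / 3)      # a primitive cube root of unity
    return [beta + gamma,
            omega * beta + omega**2 * gamma,
            omega**2 * beta + omega * gamma]


# y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3)
for y in reduced_cubic_roots(-7, 6):
    print(abs(y**3 - 7 * y + 6))  # each residual should be tiny
```

Note that β is never zero here, because $\beta^3\gamma^3 = -p^3/27 \neq 0$, so the division defining γ is safe in exact arithmetic; rounding error is another matter.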
To show off, we might write the solutions as
$$y = \omega^i \sqrt[3]{\frac{-q + \sqrt{q^2 + \frac{4p^3}{27}}}{2}} + \omega^{3-i} \sqrt[3]{\frac{-q - \sqrt{q^2 + \frac{4p^3}{27}}}{2}}, \tag{8}$$
for i = 0, 1, 2 and a primitive cube root of unity ω—but beware that this dis-
guises the crucial compatibility between the second cube root (which is γ) and
the first (which is β). (There is of course also an unspoken relation between the
two square roots: they are related by being exactly the same.)²
If you want to solve the original cubic $x^3 + ax^2 + bx + c$, remember that
x = y − a/3, and p, q are expressed in a, b, c, so we can use these formulas for
that case too. (It’s not showing off to write (8) in terms of a, b, c, it’s madness.)
this formula [1]. Del Ferro was teaching it to his students 500 years ago in the University of
Bologna, the oldest university in Europe. To put it into local context, that was around the
time Queen Elizabeth I was born (1533). Forty years later (from 1575) she visited Kenilworth
Castle three times, in part to give Robert Dudley, the first Earl of Leicester, a chance to
impress her for a marriage proposal (unsuccessful, despite the nice new formal garden, which
you can see recreated today, and the live sea battles in the adjoining flooded fields, which you
cannot). Historians have overlooked the possibility that she was also checking out the facilities
for twinning with Galois’ birthplace (successful, but with a delay of 400 years). Anyway, that’s
not bad going by dF, T and C, bearing in mind that they didn’t really trust negative numbers,
let alone complex ones.
Shouldn’t D be a discriminant, defined conceptually? Yes, of course.
If $y_1, y_2, y_3$ are the three roots (with multiplicity) of any cubic polynomial f,
then the discriminant is correctly defined as
$$\Delta = (y_1 - y_2)^2 (y_1 - y_3)^2 (y_2 - y_3)^2.$$
Clearly the value of ∆ is independent of the order we wrote down the three roots,
and ∆ = 0 if and only if f has multiple roots. You can check that ∆ = −27D,
so that the cubic has multiple roots if and only if D = 0.
This is worth stating properly in terms of the coefficients and remembering.
For a cubic polynomial $f = y^3 + py + q$, the discriminant is
$$\Delta = -4p^3 - 27q^2.$$
$$f = x^3 + ax^2 + bx + c = y^3 + py + q$$
Real cubics with multiple roots. f has multiple roots if and only if D = 0.
The form of D implies then that p < 0. So $\alpha = \sqrt{D} = 0$ and by (5,6) we may
choose $\beta = \gamma = -\sqrt[3]{q/2} \in R$, checking the necessary compatibility condition is
satisfied for that choice:
$$\beta\gamma = \sqrt[3]{q^2/4} = \sqrt[3]{-p^3/27} = -p/3 \quad\text{(using } D = 0\text{)}.$$
We see also that the remaining two roots of f ,
$$y = \omega\beta + \omega^2\gamma \quad\text{and}\quad y = \omega^2\beta + \omega\gamma,$$
Real cubics with 3 real roots: the Casus Irreducibilis. Now suppose
D < 0, so that $\alpha = \sqrt{D} = \pm i|\alpha|$ is pure imaginary. By (5,6), $\beta^3$ and $\gamma^3$ are a
strictly complex conjugate pair, so the compatibility condition βγ = −p/3 ∈ R
forces us to choose their cube roots β and $\gamma = \bar\beta$ to be a strictly complex
conjugate pair too. Thus by (8) the 3 roots are
$$y = \beta + \bar\beta, \quad \omega\beta + \overline{\omega\beta} \quad\text{and}\quad \omega^2\beta + \overline{\omega^2\beta}$$
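Real cubics with 3 real roots via trigonometric functions First an easy
point. Recall that for any θ, we have the triple angle formula
$$\cos(3\theta) = 4\cos^3(\theta) - 3\cos(\theta),$$
so that y = cos(θ) satisfies
$$y^3 - \tfrac34 y - \tfrac14\cos(3\theta) = 0. \tag{9}$$
If by chance you ever come across a cubic in reduced form with p = −3/4 and
q = − cos(3θ)/4, for some value of θ, then you can say at once that it has the 3
real solutions
$$y = \cos(\theta), \quad \cos(\theta + 2\pi/3), \quad \cos(\theta + 4\pi/3).$$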
³ Word is that Tartaglia and Cardano were not happy about that at all: the name for this
case, Casus Irreducibilis or ‘irreducible case’, echoes with their irritation over the centuries.
Second, another easy point. This solution can be adapted to work whenever
p, q ∈ R with D < 0. The trick is to consider a change of coordinates y = λt,
and choose λ so the result is of the form of (9): the only real constraint is
that the constant term had better lie in [−1/4, 1/4] so it can be $\frac14 \times$ cosine of
something. The solutions are simply
$$y = \lambda\cos(\theta) \quad\text{and}\quad \lambda\cos(\theta + 2\pi/3) \quad\text{and}\quad \lambda\cos(\theta + 4\pi/3)$$
where $\lambda = 2\sqrt{-p/3}$ (fine: p < 0 since D < 0) and θ is an inverse cosine of
something fiddly in p, q that the algebra throws up. (See [?], §1.3 Exercise 10,
for detailed hints.)⁴
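For the curious, here is one way the ‘something fiddly’ can be pinned down in code. Substituting y = λt with $\lambda = 2\sqrt{-p/3}$ turns $y^3 + py + q$ into $\lambda^3(t^3 - \frac34 t + q/\lambda^3)$, and comparing with (9) gives $\cos(3\theta) = -4q/\lambda^3$. That comparison is my own working rather than a formula quoted from the notes, and the function name is made up.

```python
import math


def trig_cubic_roots(p, q):
    """Three real roots of y^3 + p*y + q, assuming p, q real with D < 0."""
    D = q * q + 4 * p**3 / 27
    if D >= 0:
        raise ValueError("this method needs D < 0")
    lam = 2 * math.sqrt(-p / 3)              # D < 0 forces p < 0
    # Comparing with (9): cos(3*theta) = -4q / lam^3, which lies in [-1, 1]
    # precisely because D < 0.
    theta = math.acos(-4 * q / lam**3) / 3
    return [lam * math.cos(theta + 2 * math.pi * k / 3) for k in range(3)]


# y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3)
print(sorted(trig_cubic_roots(-7, 6)))  # approximately [-3, 1, 2]
```

No complex numbers appear at any stage, which is the whole attraction of the trigonometric route in this case.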
used by Philip II of Spain, one of the global superpowers at the time. We are discussing the
work of highly sophisticated scientists working at the cutting edge of knowledge at the time.
4 Reminders
Here’s what we start with. It’s part of the module to have all this on board.
Quadratic fields Let d > 0 be a squarefree integer (that is, each prime in its
prime factorisation appears only once) and consider
$$Q(\sqrt d) = \{a + b\sqrt d \mid a, b \in Q\}.$$
Since $\sqrt d \in R$ is a (positive) real number, this set is a (nonempty) subset of
R, and it is easy to check that it is closed under +, − and ×. To complete
confirmation that it is a field, you just need to figure out division, but that
works easily using the Galois conjugate $a - b\sqrt d$ in a similar way to how you
divide complex numbers:
$$\frac{1}{a + b\sqrt d} = \frac{a - b\sqrt d}{(a + b\sqrt d)(a - b\sqrt d)} = \frac{a - b\sqrt d}{a^2 - b^2 d} = \frac{a}{a^2 - b^2 d} + \frac{-b}{a^2 - b^2 d}\sqrt d.$$
This is a subfield of C, which you can check in the same way. Notice that in
this case the conjugate operation $a - b\sqrt{-d}$ is the same as complex conjugation,
while it was clearly something quite different for $Q(\sqrt d)$.
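The division rule for quadratic fields is short enough to try out exactly. In this sketch an element $u + v\sqrt d$ is stored as a pair (u, v) of rationals; the function names and the pair representation are my own illustrative choices, and d is assumed squarefree and different from 1 (negative d is allowed, which covers the complex case too).

```python
from fractions import Fraction


def inverse_in_quadratic_field(a, b, d):
    """Inverse of a + b*sqrt(d), as a pair (u, v) meaning u + v*sqrt(d).

    Assumes d is squarefree (and not 1) and that (a, b) != (0, 0),
    so the denominator a^2 - b^2*d is nonzero.
    """
    a, b = Fraction(a), Fraction(b)
    norm = a * a - b * b * d      # (a + b*sqrt(d)) * (a - b*sqrt(d))
    return (a / norm, -b / norm)


def multiply(x, y, d):
    """Product of two elements of Q(sqrt(d)), using sqrt(d)^2 = d."""
    (a, b), (c, e) = x, y
    return (a * c + b * e * d, a * e + b * c)


inv = inverse_in_quadratic_field(1, 1, 2)   # 1/(1 + sqrt(2)) = -1 + sqrt(2)
print(multiply((1, 1), inv, 2) == (1, 0))   # prints True
```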
(where of course you can cancel factors in the usual way). If we weren’t so focussed
on fields, we might have started better by defining the ring of polynomials
in x over K (note the different, squarer brackets)
$$K[x] = \{a_s x^s + \dots + a_1 x + a_0 \mid s \ge 0,\ a_i \in K,\ a_s \neq 0\} \cup \{0\},$$
where my insistence that $a_s \neq 0$ is a convenient habit for later (so that I know
for certain that the polynomial has degree s), but does mean I have to remember
to include the zero polynomial at the end (which has degree −1, by convention).
It is easy to check that K[x] is an integral domain, and then K(x) is its field
of fractions:
K(x) = Frac(K[x]).
More generally, if $x_1, x_2, \dots, x_n$ are distinct indeterminates, then the ring of
polynomials in $x_1, x_2, \dots, x_n$ over K is
$$K[x_1, x_2, \dots, x_n] = \Bigl\{ \sum a_I x^I \mid I \in Z^n_{\ge 0},\ a_I \in K \Bigr\},$$
Finite fields You may know that $F_p = Z/pZ$ is a field exactly when p is a
prime integer. This is a finite field and has exactly p elements. You can work
with it formally as cosets a + pZ, as the notation intends, or work informally
(but just as precisely) with the set {0, 1, 2, . . . , p − 1}, where it is understood
that you can work in Z while calculating, and then finish any computation by
taking the remainder after long division by p. You may also take remainders
at any stage during the calculation, and as doing so keeps numbers smaller you
probably do. To check that that division is possible, suppose p > 0 is a prime
and let 1 ≤ a < p be an integer. Then a is necessarily coprime to p (that’s
where we’re using that p is prime), so by the Euclidean algorithm there are
integers u, v so that
ua + vp = 1.
Now if we choose the unique 1 ≤ b < p that satisfies b ≡ u mod p, that
Euclidean equation (taken mod p) shows that ba = 1, so b is the multiplicative
inverse of a.
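The inverse computation just described can be sketched in a few lines of Python. This is only an illustration: the function names `extended_gcd` and `inverse_mod` are invented for this sketch, and the code simply follows the Euclidean equation ua + vp = 1 above.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, u, v) with u*a + v*b == g, where g = gcd(a, b)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(a: int, p: int) -> int:
    """Return the b with 1 <= b < p and b*a = 1 mod p (p assumed prime)."""
    g, u, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a is not coprime to p, so it has no inverse")
    # u may be negative or too large; take the representative in 1..p-1
    return u % p
```

For instance, `inverse_mod(3, 7)` gives 5, matching 3 × 5 = 15 ≡ 1 mod 7.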
Translated into the language of cosets, the arithmetic rules are
you see that the only difference is the careful ‘ + pZ’ all over the place, which
means I can afford to be less careful to pin down b ∈ Z at the end.
Just in passing, note that there is a field with exactly 4 elements that we’ll
see later, but it is most certainly not R = Z/4Z, since that ring is not a field:
2 × 2 = 0 in R, so 2 ∈ R cannot have a multiplicative inverse; indeed, if
2 × b = 1 ∈ Z/4Z, then 2 = 2 × 1 = 2 × 2 × b = 0 × b = 0, which it isn’t.
More generally Z/nZ is not a field whenever n = ab is composite with 1 < a, b < n:
then a, b ≠ 0 ∈ Z/nZ, but ab = n = 0 ∈ Z/nZ, so neither a nor b can have a
multiplicative inverse. But this is all easy to sort out later in §12, and we will
be able to understand all finite fields then.
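If you want to see the failure for composite n by experiment, here is a brute-force sketch in Python. The helper names `units` and `is_field` are hypothetical, and the search is deliberately naive: it just looks for a multiplicative inverse of every nonzero residue.

```python
def units(n: int) -> list[int]:
    """Nonzero residues a mod n that have some b with a*b = 1 mod n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]


def is_field(n: int) -> bool:
    """Z/nZ is a field exactly when every nonzero residue is a unit."""
    return n > 1 and len(units(n)) == n - 1
```

In Z/4Z only 1 and 3 turn out to be units, so 2 has no inverse, in line with the calculation above.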
(b) If ϕ : R −→ S is a map of rings, and x is an indeterminate, then ϕ naturally
extends to a ring homomorphism between polynomial rings over R and S,
which we also call ϕ:
ϕ : R[x] −→ S[x]
a_0 + a_1 x + · · · + a_n x^n ↦ ϕ(a_0) + ϕ(a_1)x + · · · + ϕ(a_n)x^n
simply by evaluating the coefficients at ϕ while leaving the x’s alone.
(c) For any s ∈ S there is a ring homomorphism ϕ_s : Z[x] −→ S determined
by ϕ_s(x) = s. The axioms insist that ϕ_s(x^2) = ϕ_s(x)^2 = s^2, ϕ_s(x^3) = s^3,
and so on. Also ϕ_s(a x^m) = ϕ_s(a)ϕ_s(x)^m = a s^m (recalling the abuse of
notation ϕ_s(a) = a), and so ϕ_s extends to any polynomial f:
ϕ_s(f) = f(s)
where you could regard the right-hand side as treating f as an element of
an unmentioned polynomial ring S[x] and then evaluating it at s. This is
called the evaluation map (of x at s), because that’s what it is.
Ring homomorphisms work very naturally. If we hadn’t said any of those things,
you’d have done them correctly anyway.
The basic theorem is this:
Theorem 4.4. Let ϕ : R −→ S be a ring homomorphism. Then:
(a) ker ϕ = {r ∈ R | ϕ(r) = 0} ⊂ R is an ideal of R;
(b) Im ϕ ⊂ S is a subring of S;
(c) First Isomorphism Theorem: ϕ naturally induces an isomorphism
$$\bar\varphi \colon R/\ker\varphi \xrightarrow{\ \cong\ } \operatorname{Im}\varphi, \qquad r + \ker\varphi \mapsto \varphi(r).$$
Proof. See Algebra 2. Alternatively, if you write out the things that would need
to be proved, you’ll see that each is a small exercise. Once you agree R/ ker ϕ
is a ring, the main points are to check that the induced map ϕ̄ : R/ ker ϕ −→ Im ϕ is well defined and injective.
Our main use of Theorem 4.4(c) is the following mild rephrasing.
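Corollary 4.5. Let ϕ : R −→ S be a ring homomorphism with kernel I ⊂ R. Then there is an injective
ring homomorphism ϕ̄ : R/I ↪ S so that the diagram commutes:
$$R \longrightarrow R/I \xrightarrow{\ \bar\varphi\ } S, \qquad \text{with composite } \varphi \colon R \longrightarrow S.$$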
4.3 Characteristic of a field
Lemma–Definition 4.6. Let K be any field. There is a unique ring homo-
morphism ι : Z −→ K. Setting ker ι = pZ, exactly one of the following cases
occurs:
In either case, K0 ⊂ K is called the prime field of K, and any other subfield
L ⊂ K must contain K0 .
Thus the characteristic of K, denoted char K, is either 0 or a prime p > 0.
In the case p > 0, we may also say simply that K has positive characteristic
if we only care that p ≠ 0 and not its actual value.
ι : F_p = Z/pZ ↪ K.
For the final part, any subfield L ⊂ K must contain ι(Z), since it must contain 1,
and so must contain the field K0 that that generates.
For example Q, R, C, Q(x) and C(x1 , x2 ) are all characteristic zero, since
they contain Q (or, equivalently, since adding 1 to itself repeatedly never gives 0).
The finite field Fp is characteristic p, as is, for example, the bigger field
Fp (t) of rational functions over Fp , since it clearly contains Fp as the subfield of
constant functions.
4.4 Automorphism group of a field: the main invariant
Let L be any field.
Definition 4.7. An automorphism of L is a bijective ring homomorphism
ϕ : L −→ L. (It follows easily that the inverse map is also a ring homomor-
phism.) The set of all automorphisms of L forms a group:
a_0 + a_1 x + · · · + a_s x^s    (10)
where all a_i ∈ K. The a_i are called the coefficients of the polynomial, and
individually a_i is called the coefficient of x^i. Each expression a_i x^i in the
polynomial is called a term. We may abbreviate the expression (10) to
$$\sum_{i=0}^{s} a_i x^i \quad\text{or even to}\quad \sum a_i x^i,$$
where in the latter it is understood that all but finitely many of the coefficients
a_i are zero.
For example, 2 + 3x + 5x^3 is a polynomial, where it is understood that any
terms that do not appear have zero coefficient: in this case 0x^2 is missing, as
are any 0x^n for n > 3.
The powers x^i of x are independent quantities, with x^0 treated as 1 ∈ K
whenever convenient, and two polynomials are equal if and only if they have
the same coefficient of x^i for every i ≥ 0.
A daft but easy point: the zero polynomial is the unique polynomial which
has all coefficients a_i = 0 for all i ≥ 0. We denote it 0, and note that 0 + 0x,
0 + 0x^3 − 0x^17 and so on are all the same zero polynomial. So if we say ‘let g ≠ 0
be a polynomial’, this simply says g is a polynomial with at least one nonzero
coefficient, and nothing more.
If f = Σ a_i x^i is a nonzero polynomial, then its degree, denoted deg(f),
is the largest i ≥ 0 for which a_i ≠ 0. By convention, the degree of the zero
polynomial is −1. Polynomials of degree ≤ 0 are called constant polynomials,
degree 1 are linear, degree 2 are quadratic, degree 3 are cubic, degree 4 are
quartic, degree 5 are quintic, and so on in latin fashion.
(f ) := {f r | r ∈ R}
is an ideal, and any such ideal is called a principal ideal, with f as its generator.
Principal ideals are closely related to division, since almost directly from the
definitions
f divides g ⇐⇒ g = hf for some h ⇐⇒ g ∈ (f ) ⇐⇒ (g) ⊂ (f ).
(The generator f of a principal ideal (f) is not unique, but this division statement
shows at once that two principal ideals (f) = (g) are equal if and only if
f = ug, where u ∈ R is a unit.)
In our context, R = K[x] is a principal ideal domain (PID), that is, every
ideal in K[x] is principal; in fact, something more precise is true.
Corollary 4.11. Let I ⊂ K[x] be an ideal. Then I is a principal ideal, and
exactly one of the following three cases occurs:
(a) I = {0} is the zero ideal;
(b) I contains a nonzero constant polynomial; in this case I = K[x];
(c) I = (f ) for some polynomial f ∈ K[x] with deg f ≥ 1.
Proof. See Algebra II, but simple to do. If I ≠ {0}, pick a nonzero element
f ∈ I of smallest degree. If deg(f ) = 0, then we are in case (b). If not, we claim
that I = (f ). Indeed, if g ∈ I, then long division of g by f yields g = qf + r,
where deg(r) < deg(f ). But r = g − qf ∈ I, so by the minimality of deg(f ) we
must have r = 0, and so g = qf ∈ (f ), as required.
Corollary 4.12. Let f, g ∈ K[x]. Then there exist h, a, b ∈ K[x] such that
af + bg = h.
The polynomial h is well defined up to a unit factor, and any such h satisfies
the conditions for being a greatest common divisor (≡ highest common factor),
so we denote h = gcd(f, g), with care about the unit if necessary.
The point is just that the ideal (f, g) ⊂ K[x] must be principal, so define h
by (f, g) = (h). The Euclidean algorithm constructs h as usual.
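As a concrete illustration of the Euclidean algorithm in K[x], here is a minimal Python sketch for K = Q. It is an assumption-laden toy, not the module's method: polynomials are represented as coefficient lists with the constant term first, and the names `trim`, `divmod_poly` and `gcd_poly` are invented here. The gcd is normalised to be monic, which is one way of fixing the unit factor.

```python
from fractions import Fraction


def trim(f):
    """Convert coefficients to Fractions and drop trailing (leading-degree) zeros."""
    f = [Fraction(c) for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def divmod_poly(g, f):
    """Long division g = q*f + r with deg r < deg f; f must be nonzero."""
    g, f = trim(g), trim(f)
    q = [Fraction(0)] * max(len(g) - len(f) + 1, 0)
    r = g
    while len(r) >= len(f):
        shift = len(r) - len(f)
        c = r[-1] / f[-1]            # kill the leading term of r
        q[shift] = c
        for i, fc in enumerate(f):
            r[i + shift] -= c * fc
        r = trim(r)
    return q, r


def gcd_poly(f, g):
    """Monic generator of the ideal (f, g) in Q[x]; [] if f = g = 0."""
    f, g = trim(f), trim(g)
    while g:
        _, r = divmod_poly(f, g)
        f, g = g, r
    return [c / f[-1] for c in f] if f else []
```

For example, with f = x^2 − 1 given as `[-1, 0, 1]` and g = x^2 − 3x + 2 given as `[2, -3, 1]`, `gcd_poly` returns the coefficients of x − 1.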
5 Factorisation in polynomial rings
The polynomial ring K[x] over any field K is a unique factorisation domain.
That is, any polynomial f ∈ K[x] admits a factorisation
f = h1 h2 · · · hs
into irreducible polynomials hi ∈ K[x], where the hi are unique up to the order
in which we wrote them down, and up to multiplication by units. (That last
point captures, for example, that (−h1 )(−h2 ) = h1 h2 is regarded as the same
factorisation ‘up to units’.)
The harder question is when is a polynomial f ∈ K[x] irreducible. The
answer can be subtle, and depends on the field K.
f = c(x − α1 )(x − α2 ) · · · (x − αn )
(Non-examinable) Remark We extend all the definitions so far to the ring
Z[x] of polynomials with integer coefficients. Its units are simply ±1, irre-
ducibility is defined in exactly the same way, and it too is a unique factorisation
domain. The main difference is that it is not a principal ideal domain: for exam-
ple I = (2, x) ⊂ Z[x] cannot be generated by a single integral polynomial. But
that doesn’t matter to us here: everything here is elementary, and we proceed
without using any formal definitions.
Lemma 5.2 (Gauss’s Lemma). Let f = a_0 + a_1 x + · · · + a_n x^n ∈ Z[x] be a
polynomial with n = deg f > 0. Suppose in addition that gcd(a_0, . . . , a_n) = 1.
If f = gh for g, h ∈ Q[x] with deg g > 0 and deg h > 0, then there is some
b ∈ Q* with bg ∈ Z[x] and b^{-1}h ∈ Z[x].
This is completely elementary: it’s capturing the hope that if you can factorise
an integer polynomial over Q, such as x^2 − 4x + 3 = ((2/5)x − 2/5)((5/2)x − 15/2),
then you could move any nontrivial common denominators between factors so
they cancel, and so you would have a factorisation over Z too.
Proof. Let a, b ∈ Q such that g = ag1 and h = bh1 with g1 , h1 ∈ Z[x], where
there is no common factor of all the coefficients of g1 , and similarly for h1 .
(Simply choose a to be c/d, where d ∈ Z is the least common multiple of the
denominators of the coefficients of g and then c is the highest common factor
of the coefficients of dg—at the blackboard you’d say let’s clear a common
denominator and then pull out the common factor of the coefficients.)
So f = gh = (ab)g1 h1 . Let ab = r/s ∈ Q in least terms with s > 0. If s = 1,
then we are done: r divides all coefficients of f so r = ±1 by hypothesis and
thus a = ±1/b. So suppose not and pick a prime factor p of s. Since p does not
divide r, it must divide every coefficient of the product g1 h1 , as the coefficients
of f are integers. We claim that p divides all the coefficients of either g1 or h1 ;
but that contradicts how we chose g1 and h1 , so we are done.
The claim is immediate once I bother to write some notation. Let g_1 = Σ b_i x^i
and h_1 = Σ c_i x^i and suppose, for contradiction, that b_j is the least
index coefficient of g_1 not divisible by p and c_k is the least index coefficient
of h_1 not divisible by p. Then the coefficient of x^N, where N = j + k, in the
product g_1 h_1 is
b_0 c_N + b_1 c_{N−1} + · · · + b_{j−1} c_{k+1} + b_j c_k + b_{j+1} c_{k−1} + · · · + b_N c_0
and p divides every term except b_j c_k (as it divides all b_i with i < j and all c_i with i < k). So p does
not divide the coefficient of x^N in g_1 h_1, a contradiction.
The following corollary is usually cited as Gauss’s Lemma.
(Non-examinable) Remark If you’re not impressed, think of it this way. If
f ∈ Z[x] has integral coefficients and a ∈ Z is an integer root does that mean
that f factorises in Z[x] with a factor x − a? Of course it has no right to: the
division algorithm cannot work in the same way in Z[x] as it does in K[x] with
coefficients in a field (otherwise Z[x] would be a PID, which it’s not), so your
first thought of a proof cannot work. But if you try to build a counterexample,
you’ll see how any denominator in a factor conspires to find its way back to f .
The neat proof of Gauss’s Lemma grabs hold of that runaway denominator.
(b) Is the polynomial f = 2x^3 + 3x^2 + 4x + 6 irreducible in Q[x]? I have no idea.
If it is reducible it must have a linear factor, as it’s a cubic, so there would
be a rational root r/s. By the proposition, s divides 2 and r divides 6. So
we may assume s = 1 or s = 2 and r ∈ {−6, −3, −2, −1, 1, 2, 3, 6}. You
can simply check all 12 possible r/s. If none of them is a root, then f is
irreducible by the proposition.
(Spoiler: now that the proposition limited the possibilities for a rational
root, you’ll find soon enough that f(−3/2) = 0, and so f = 2(x + 3/2)(x^2 + 2).)
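The search over the 12 candidates r/s is easy to mechanise. The following Python sketch is purely illustrative; the function names are made up here, and the coefficient list is written constant term first, so f = 2x^3 + 3x^2 + 4x + 6 becomes `[6, 4, 3, 2]`.

```python
from fractions import Fraction


def divisors(n: int) -> list[int]:
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_root_candidates(coeffs: list[int]) -> list[Fraction]:
    """All r/s with r dividing a_0 and s dividing a_n (both assumed nonzero)."""
    a0, an = coeffs[0], coeffs[-1]
    return sorted({Fraction(sign * r, s)
                   for r in divisors(a0)
                   for s in divisors(an)
                   for sign in (1, -1)})


def evaluate(coeffs: list[int], x: Fraction) -> Fraction:
    """Evaluate the polynomial at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


f = [6, 4, 3, 2]
candidates = rational_root_candidates(f)
roots = [x for x in candidates if evaluate(f, x) == 0]
```

Here `candidates` has 12 entries and `roots` contains only −3/2, in agreement with the spoiler.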
a_j = b_0 c_j + b_1 c_{j−1} + · · · + b_{j−1} c_1 + b_j c_0
and p divides aj and every term on the right except possibly bj c0 . So p must
divide bj c0 , but that’s a contradiction as we already agreed it doesn’t divide
either factor.
Example 5.9. Prime cyclotomic polynomials. Let p > 0 be a prime, and
consider
$$f = x^{p-1} + x^{p-2} + \cdots + x + 1.$$
Then in fact f is irreducible over Q. That does not follow directly from Eisenstein’s
criterion, but there is a little trick that extends Eisenstein’s applicability.
Note first that f(x) = (x^p − 1)/(x − 1). So setting x = 1 + y gives
$$\begin{aligned} f(x) = f(1+y) &= \frac{(1+y)^p - 1}{y} \\ &= \frac{1}{y}\Big( py + \frac{p!}{2!(p-2)!}y^2 + \frac{p!}{3!(p-3)!}y^3 + \cdots + py^{p-1} + y^p \Big) \\ &= p + \frac{p!}{2!(p-2)!}y + \frac{p!}{3!(p-3)!}y^2 + \cdots + py^{p-2} + y^{p-1}. \end{aligned}$$
Eisenstein applies with prime p to the last polynomial f(1 + y) ∈ Z[y] (the prime
p in the binomial numerators cannot cancel since the denominator factors are
too small), so that is irreducible over Q. But any factorisation f = gh of f would
give a factorisation f(1 + y) = g(1 + y)h(1 + y), so the original f is irreducible
over Q too.
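The divisibility conditions on the coefficients of f(1 + y) can be checked by machine for any particular p. This Python sketch is an illustration only: `shifted_cyclotomic` and `eisenstein_applies` are names invented here, and the criterion is coded in its usual form (p divides every coefficient except the leading one, and p^2 does not divide the constant term).

```python
from math import comb


def shifted_cyclotomic(p: int) -> list[int]:
    """Coefficients [c_0, ..., c_{p-1}] of ((1+y)^p - 1)/y, constant term first."""
    # (1+y)^p - 1 = sum_{k=1}^{p} C(p, k) y^k, so the coefficient of y^j is C(p, j+1)
    return [comb(p, j + 1) for j in range(p)]


def eisenstein_applies(coeffs: list[int], p: int) -> bool:
    """Check Eisenstein's hypotheses at p for a polynomial of degree >= 1."""
    *lower, leading = coeffs
    return (leading % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)
```

For p = 5 the coefficients are 5, 10, 10, 5, 1 and the criterion applies. Feeding in the non-prime 4 (with the prime 2) fails, as it should, since x^3 + x^2 + x + 1 = (x + 1)(x^2 + 1) is reducible.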
f̄ = ā_0 + ā_1 x + · · · + ā_n x^n ∈ F_p[x]
where ā_i ∈ {0, 1, . . . , p − 1} denotes the least residue of a_i modulo p (that is, the
remainder on dividing a_i by p).
If f̄ is irreducible in F_p[x], then f is irreducible in Z[x].
Proof. Suppose f = gh is a factorisation; we may assume that g, h ∈ Z[x]. Then
f̄ ∈ F_p[x] has degree n since ā_n ≠ 0. So f̄ = ḡ h̄ is a factorisation in F_p[x].
6 Field extensions I
Recall that a homomorphism ϕ : K −→ L between fields is simply a ring homo-
morphism: thus ϕ respects addition and multiplication and takes 1 to 1.
Proposition 6.1. Let K and L be fields and ϕ : K −→ L a homomorphism.
Then ϕ is injective.
Proof. Quicker to do this one yourself, but here goes. Suppose a, b ∈ K satisfy
ϕ(a) = ϕ(b). So ϕ(a − b) = ϕ(a) − ϕ(b) = 0. So we show that the only element
c ∈ K that maps to 0 ∈ L is c = 0. Well, if c ∈ K \ {0} and ϕ(c) = 0, then
ϕ(1) = ϕ(c c^{-1}) = ϕ(c)ϕ(c^{-1}) = 0, which contradicts ϕ(1) = 1.
Simplifying Remark + Notation 6.2. Proposition 6.1 says that whenever
we have a homomorphism between fields we are in precisely the situation
$$K \xrightarrow{\ \varphi\ } K' = \varphi(K) \subset L$$
6.1 Extensions as vector spaces
We often treat C as a vector space over R. That’s how we picture C, with a
basis {1, i} spanning real and imaginary axes, as in Figure 1. In particular, C
is 2-dimensional as an R-vector space. It works because R ⊂ C, so doing scalar
multiplication from R is automatic. One of the basic ideas is that something
similar works for any field extension.
Fundamental Observation 6.5. If L/K is a field extension, then L is a vector
space over K.
Why? Addition and subtraction already work in L, so the point is the
scalar multiplication of elements of L by elements of K. When K ⊂ L, the
scalar multiplication from K is simply multiplication as elements of L. If more
generally the extension is given by a map ϕ : K → L, then we multiply v ∈ L by
α ∈ K via ϕ by defining α · v := ϕ(α)v, where multiplication on the right-hand
side is the multiplication inside L. That is, we do scalar multiplication by using
the copy K′ = ϕ(K) of K that lies in L. (We need to check that all the axioms
of a vector space hold, and you can do that easily and should.)
Definition 6.6. If L/K is an extension, then the degree of L/K, denoted
[L : K], is defined to be
[L : K] = dimK L
that is, the dimension of L as a K-vector space. Thus the degree is a positive
integer or ∞. Note that [L : K] = 1 if and only if L = K.
We say L/K is finite if [L : K] < ∞, otherwise L/K is infinite.
This definition forgets an enormous amount about what makes L a field, but
it illuminates something coarse but crucial about the relationship between K
and L that may otherwise be hidden. For example, is the following statement
immediately obvious to you before you consider K as a vector space over Fp ?
Proposition 6.7. If K is a finite field, then it has char(K) = p > 0 a prime,
and #K = p^n for some n ∈ N.
Proof. Z → K cannot be injective, since K is finite, so the prime field must be
Fp ⊂ K for some prime p > 0. So K is a finite-dimensional vector space over
F_p, and after choosing any basis its elements are simply all vectors (a_1, . . . , a_n)
with a_i ∈ F_p, where n = [K : F_p], and there are obviously p^n of those.
This leaves the more subtle question of whether finite fields of size p^n exist
(they do; see §12), or, put another way, can we extend the multiplication in Fp
to a vector space, in a compatible way with that vector space structure?
Remark 6.8. So, for example, there cannot be a field with 6 elements. How
else might you prove it? If you try to think of rings with 6 elements, you’ll
immediately try Z/(6), but of course that’s not a field because 2 × 3 = 0 in it.
But that by itself is pretty weak evidence. After all, Z/(4) is not a field, but
with a bit of persistence you could write addition and multiplication tables on
a set of 4 elements {0, 1, a, b} that have inverses and so determine a field. Try
that, and then try with 6 elements (which must fail by the proposition), and
decide whether your determined failure would be enough to convince you.
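If you would like to check your tables for the 4-element case against a machine, here is one hedged sketch in Python. It assumes a standard construction that the notes only reach later: pairs (a, b) stand for a + bt with a, b ∈ F_2 and the rule t^2 = t + 1, so in the labelling {0, 1, a, b} one may take a = t and b = t + 1. The names `ELEMENTS`, `add` and `mul` are invented for this sketch.

```python
# Elements a + b*t with a, b in {0, 1}, subject to t^2 = t + 1 (arithmetic mod 2).
ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]
ZERO, ONE = (0, 0), (1, 0)


def add(u, v):
    """Componentwise addition mod 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """(a + b t)(c + d t) = ac + (ad + bc) t + bd t^2, then substitute t^2 = t + 1."""
    a, b = u
    c, d = v
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)


# Every nonzero element should have a multiplicative inverse.
inverses = {u: next(v for v in ELEMENTS if mul(u, v) == ONE)
            for u in ELEMENTS if u != ZERO}
```

The dictionary `inverses` pairs t with t + 1, and 1 with itself, so all three nonzero elements are invertible.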
The first main theorem is easy, but amazingly powerful—it will solve prob-
lems in an instant that you would have no idea how to calculate otherwise.
When you read it, have in mind the situation of three fields K ⊂ L ⊂ M , even
though it is phrased more generally in terms of field extensions.
Theorem 6.9 (The Tower Law). If M/L and L/K are field extensions, then
M/K is a field extension and if M/L and L/K are both finite then
[M : K] = [M : L][L : K]
Since {b_i} is an L-basis of M, each of the coefficients must be zero. Just looking
at the coefficient of b_1, that means
Since {a_j} is a K-basis of L, each of the coefficients must be zero: that is,
c_11 = · · · = c_1n = 0. The same holds for the coefficients of b_2, . . . , b_m, and thus
all c_ij = 0, which is what we had to prove.
Definition 6.10. If K ⊂ L ⊂ M are fields, then L is said to be an interme-
diate field of K ⊂ M .
We could imagine phrasing this definition more generally for field extensions
K −→ L −→ M , but in the context we work that is not helpful. Of course,
K = L or L = M are allowed as intermediate fields.
$$\xi = \frac{p(\alpha)}{q(\alpha)} \quad\text{where } p, q \in K[x] \text{ and } q(\alpha) \ne 0$$
and conversely the collection of all such expressions is evidently a field (by the
usual operations with fractions) so must equal L.
But we can be much more economical in the way we describe elements of L.
In the first place, there is no point ever writing α^2 since we may replace it by s
wherever it appears. So we can write an element ξ of L in the form
$$\xi = \frac{a + b\alpha}{c + d\alpha} \quad\text{where } a, b, c, d \in K \text{ and } (c, d) \ne (0, 0)$$
If α had been i = √−1, then you’d know how to use complex conjugation to
simplify this further. An analogous trick works for any α: multiply top and
bottom by the Galois conjugate c − dα of the denominator of ξ to give
$$\xi = \frac{a + b\alpha}{c + d\alpha} \times \frac{c - d\alpha}{c - d\alpha} = \frac{ac - bds}{c^2 - d^2 s} + \frac{bc - ad}{c^2 - d^2 s}\,\alpha = A + B\alpha \quad\text{for some } A, B \in K$$
where you notice that the denominators could not possibly vanish since s is not
a square in K. So, relabelling, we can write L more cleanly as
L = {a + bα | a, b ∈ K}
[L : K] = dimK L = 2
We will use the notation K(α) to denote the field L for ever more.
Example 6.11. Consider Q ⊂ R and α = √2. Then Q(α) is the subfield of
R consisting of all combinations of rational numbers and √2. The discussion
above says that
$$\mathbb{Q}(\sqrt2) = \{\, a + b\sqrt2 \mid a, b \in \mathbb{Q} \,\}$$
To see the discussion above in action, observe that we never write (√2)^2 as we
could write 2 instead, so elements of Q(√2) are naturally expressions such as
$$\gamma = \frac{3 + 5\sqrt2}{7 + 11\sqrt2}$$
Multiplying numerator and denominator by 7 − 11√2 shows that
$$\gamma = \frac{(3 + 5\sqrt2)(7 - 11\sqrt2)}{(7 + 11\sqrt2)(7 - 11\sqrt2)} = \frac{21 - 110}{49 - 242} + \frac{35 - 33}{49 - 242}\sqrt2 = \frac{89}{193} - \frac{2}{193}\sqrt2$$
[L : Q] = [L : K][K : Q] = 2 × 2 = 4
so we may expand ξ ∈ L as
$$\xi = a_0 + a_1\sqrt{2} + (b_0 + b_1\sqrt{2})\sqrt{3} = a_0 + a_1\sqrt{2} + b_0\sqrt{3} + b_1\sqrt{2}\sqrt{3}$$
which visibly depends on 4 parameters a_0, a_1, b_0, b_1 ∈ Q with respect to the
basis {1, √2, √3, √6 = √2√3}.
Very soon we will have a nice simple proposition that will subsume all of
these ad hoc calculations.
That is, you get K(α) by applying all field operations to anything in K and α.
So 17α^3 − 3/α ∈ K(α), for example.
Terminology 6.13. In the notation above, we say that K(α) is the field ob-
tained by adjoining α to K.
K ⊂ K(α) ⊂ L,
where the lefthand inclusion is an equality if and only if α ∈ K, and the right-
hand inclusion is an equality if and only if it so happens that every element of
L can be expressed in terms of α and elements of K.
Definition 6.14. A (nontrivial) field extension K ⊂ L is a simple (or primi-
tive) extension if and only if L = K(α) for some α ∈ L.
A familiar example is that C = R(i) is a simple extension. Before looking at
more examples, note that the field K(α) has a subring
Of course K[α] is an integral domain (it is contained in a field), and K(α) is its
field of fractions. It frequently happens that K[α] = K(α): it happens whenever
we can eliminate denominators using conjugacy tricks as in Example 6.11.
try to simplify γ by trading any ω^5 factors for 1 to get γ = 1 + ω^2 + ω^4 + ω + ω^3.
If you remember at this stage that
x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1),
then you’ll notice that in fact γ = 0, so it wasn’t quite as interesting an element
as it at first appeared. (Or perhaps you spotted right away that ω^2 is also
a primitive 5th root of unity, so that γ is the zero sum of the vertices of the
pentagon in Figure 1 despite its initial appearance.)
Example 6.16. Consider Q ⊂ R and let α be π = 3.1415 · · · , the circumference
of a circle of diameter 1. Then Q(α) is the subfield of R consisting of all
combinations of rational numbers and π. For example, γ = π^2 + π^5 + 1/π ∈ Q(π).
There are no polynomial relations involving π and rational numbers (π is in fact
transcendental), so I’m not going to try to simplify that expression for γ at all.
This is practically identical to the dichotomy when defining the characteristic
char L of a field, with evα taking the role of the natural map Z → L.
Proof. Certainly evα is either injective or not. Moreover, any injective map ϕ
from an integral domain to a field extends to an injective map from its field
of fractions: ϕ(f /g) is defined to be ϕ(f )/ϕ(g), and if ϕ(f1 /g1 ) = ϕ(f2 /g2 )
then also ϕ(g2 f1 ) = ϕ(g1 f2 ), so g2 f1 = g1 f2 and dividing by both gi finishes
it. The extension of evα to K(x) surjects onto K(α) by definition of simple
extension §6.3, and it is injective because it is a homomorphism of fields.
In the non-injective case, we show the polynomial f is irreducible as follows:
if f = gh where g and h both have degree < deg f , then
0 = f (α) = g(α)h(α)
Put another way, α algebraic over K means that evα is not injective and the
minimal polynomial of α over K is the unique monic generator of ker(evα ).
Remark 6.20. We can think of K[x]/(f ) as an in vitro model of the extension
K(α) ⊂ L of K by a root α ∈ L of the irreducible f . Rather than working
with K(α) in the cold, muddy field L, we can go back to the warm, dry lab
and work with K[x]/(f ). It’s shocking at first to realise that you have the same
model whichever root in L you choose: evβ (f ) = 0 for any other root β ∈ L.
This is a linear algebra statement, so for the proof we don’t need to know
that we can multiply polynomials together; but see comments afterwards.
Proof. Set V = K[x]_{<n}, the K-vector space of polynomials of degree < n = deg f.
Consider W = K[x]/(f ) as a K-vector space: that is, build the quotient
ring, and then only allow yourself to do +, − and scalar multiplication. Proposition 6.18
says K[x]/(f) ≅ K(α) (as fields, which is even more than as K-vector
spaces) by x ↦ α, so it’s fine to use W as a model for K(α).
Define a map L : V −→ W by g 7→ g + (f ). It is K-linear (that is, a map
of K-vector spaces) by the golden rules of coset arithmetic: (ag + bh) + (f ) :=
(a + (f ))(g + (f )) + (b + (f ))(h + (f )). We prove that this is an isomorphism.
To show L is surjective, consider any w ∈ W , namely some coset w = g +(f ).
Let r be the remainder of division g = qf + r. Then w = r + (f ) and deg r < n,
so r is an element of V , and L(r) = w. Moreover L is injective because the
remainder is uniquely determined: if L(v) = 0, then v + (f ) = 0 + (f ), so
v ∈ (f ): but deg v < deg f , so we must have had v = 0 in the first place.
So L : V −→ W is an isomorphism, and we may use L to map any basis of V
to a basis of W. The basis {1, x, x^2, . . . , x^{n−1}} of V gives the basis claimed.
Remark 6.22. In fact, this operation is completely natural, and you may be
more comfortable not forgetting multiplication. We may put ring multiplication
on V by taking remainder on division by f after any multiplications, and since
f is irreducible V is then a field and V ≅ K(α).
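To make the multiplication on V concrete, here is a small Python sketch for K = Q. It is only an illustration under stated assumptions: f is monic of degree n and given as a coefficient list with the constant term first, elements of V are lists of at most n coefficients, and the name `mul_mod` is invented here. The example uses f = x^3 − 2, so that x plays the role of the cube root of 2.

```python
from fractions import Fraction


def mul_mod(g, h, f):
    """Multiply g and h, then take the remainder on division by the monic f."""
    n = len(f) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, gi in enumerate(g):
        for j, hj in enumerate(h):
            prod[i + j] += Fraction(gi) * hj
    # Work down from the top degree, replacing x^k (k >= n) by means of
    # x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(n):
            prod[k - n + i] -= c * f[i]
    return prod[:n]


f = [-2, 0, 0, 1]                                # x^3 - 2
alpha_cubed = mul_mod([0, 1, 0], [0, 0, 1], f)   # x * x^2 reduces to 2
check = mul_mod([1, 1, 0], [1, -1, 1], f)        # (1 + x)(1 - x + x^2) = 1 + x^3 reduces to 3
```

The second product shows that (1 − α + α^2)/3 is the inverse of 1 + α in this model, so no denominators involving α are ever needed.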
Clearly the fact f is monic has nothing to do with the result; it’s just there
so I can say ‘minimal polynomial’ at the end. We can adjoin a root of any
irreducible polynomial, or indeed any nontrivial polynomial at all.
Proof. Just set L = K[x]/(f ). L is a field because f is irreducible in K[x], and
the composition
K −→ K[x] −→ K[x]/(f ) = L
is a ring homomorphism and so is injective as K is a field. Let α = x + (f )
be the class of x in L. Any element of L is of the form h(x) + (f ), but that is
exactly h(α), so α generates L over K. Visibly f (α) = f (x) + (f ) = 0 + (f )
is zero in L, and since f is irreducible and monic it is the minimal polynomial
of α.
Corollary 6.24. Let f ∈ K[x]. Then there is a field extension K ⊂ L and an
element α ∈ L so that L = K(α) is a simple extension and f (α) = 0.
Pick an irreducible (monic) factor g of f and apply the theorem.
6.5 Adjoining multiple algebraic elements
Let K ⊂ L. If α ∈ L, we know a lot about the simple extension K ⊂ K(α).
More generally, if α, β ∈ L we define
$$K(\alpha, \beta) = \Big\{ \frac{p(\alpha, \beta)}{q(\alpha, \beta)} \;\Big|\; p, q \in K[x_1, x_2],\ q(\alpha, \beta) \ne 0 \Big\}$$
= the smallest subfield of L that contains K and α and β
Notice that K(α, β) = K(α)(β), where the right-hand side is the simple exten-
sion of K(α) ⊂ L by β ∈ L. Indeed K(α)(β) ⊂ L is a subfield that contains α,
β and K, and K(α, β) is the smallest such, so K(α, β) ⊂ K(α)(β); on the other
hand K(α, β) ⊂ L contains β and K(α), and K(α)(β) is the smallest such, so
K(α)(β) ⊂ K(α, β). Similarly
In other words, we can adjoin multiple elements one after the other or all in one
go, and we get the same result.
In fact, for any subset S ⊂ L, we define
noting, of course, that even though S may be infinite, any particular element
of K(S) is expressed using finitely many arithmetic operations on finitely many
elements of K and S. Note an equivalent way of formulating any of these
extensions that take place within a given big field is
$$K(S) = \bigcap_{M \supset S} M$$
where the intersection takes place over all subfields M ⊂ L that contain both S
and K (although the notation leaves some of this information implicit).
Example 6.25. Consider M = Q(√2, √3). We saw in Example 6.12 that
$$\mathbb{Q}(\sqrt2, \sqrt3) = \{\, a + b\sqrt2 + c\sqrt3 + d\sqrt6 \mid a, b, c, d \in \mathbb{Q} \,\}$$
and [M : Q] = 4 with the visible basis {1, √2, √3, √6 = √2√3} by considering
M as a sequence of extensions by square roots Q ⊂ Q(√2) ⊂ M.
Remark 6.26. There’s a nice little surprise lurking in Example 6.25. Consider
α = √2 + √3 ∈ M and the intermediate field Q ⊂ F = Q(α) ⊂ M. Clearly
√6 ∈ F since √6 = ½(α^2 − 5). So also α√6 = 2√3 + 3√2 ∈ F. Subtracting
copies of α from that shows that both √2 ∈ F and √3 ∈ F. So M = F = Q(α)
was a simple extension all along. Ah well. You try to be interesting. . .
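The calculation in this remark can be replayed exactly in Python. In this sketch (names invented for illustration), a 4-tuple (a, b, c, d) stands for a + b√2 + c√3 + d√6, and the multiplication rule just encodes √2√3 = √6, √2√6 = 2√3 and √3√6 = 3√2.

```python
from fractions import Fraction


def mul(x, y):
    """Product in Q(sqrt2, sqrt3) with respect to the basis 1, sqrt2, sqrt3, sqrt6."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
            a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),
            a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),
            a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2)


def add(x, y):
    return tuple(s + t for s, t in zip(x, y))


def scale(q, x):
    return tuple(Fraction(q) * s for s in x)


ONE = (1, 0, 0, 0)
alpha = (0, 1, 1, 0)                                   # sqrt2 + sqrt3
alpha_sq = mul(alpha, alpha)                           # 5 + 2*sqrt6
sqrt6 = scale(Fraction(1, 2), add(alpha_sq, scale(-5, ONE)))
alpha_sqrt6 = mul(alpha, sqrt6)                        # 3*sqrt2 + 2*sqrt3
sqrt2 = add(alpha_sqrt6, scale(-2, alpha))             # alpha*sqrt6 - 2*alpha
sqrt3 = add(scale(3, alpha), scale(-1, alpha_sqrt6))   # 3*alpha - alpha*sqrt6
```

Each of `sqrt6`, `sqrt2` and `sqrt3` ends up as the expected basis vector, so all three lie in Q(α), built from α by field operations alone.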
It’s not true that all field extensions are simple, but it does happen more
than you might expect, and it always happens for finite extensions of Q. This
will follow instantly as an unintended consequence of the Main Theorem 10.17
of Galois Theory below: it’s called the Theorem on the Primitive Element.
Having said that, it often seems easier to work with a few simple generators
than with a single one, as in the example above. You could regard the question of
finding radical formulas to solve equations as the opposite of finding a primitive
element: sure the field of all roots is a simple extension, but can you span it by
a bunch of radicals instead?
Example 6.27. At last we can verify the claim in §1(1) that if L = Q(α, ω)
with α = ∛2 and ω a primitive cube root of unity, then
[L : Q] = 6 and {1, α, α^2, ω, αω, α^2 ω} is a Q-basis of L
For example, consider the intermediate field
Q ⊂ K = Q(α) ⊂ L
Then by Proposition 6.21 K has Q-basis {1, α, α2 } and [K : Q] = 3. The
polynomial x2 + x + 1 ∈ K[x] is irreducible over K (not only over Q) since
K ⊂ R but the roots are not real, so again by Proposition 6.21 L has K-basis
{1, ω} and [L : K] = 2. Thus by the Tower Law
[L : Q] = [L : K][K : Q] = 2 × 3 = 6
and the proof of the Tower Law gives a Q-basis of L as the product of the two
bases we just found. (You could redo the proof going via Q(ω) instead for fun.)
Corollary 6.30. Let L/K be a field extension and α, β ∈ L be two algebraic
elements. Then α + β, αβ and α/β are algebraic, where of course we only claim
the last one in the case β ≠ 0.
Proof. By the proposition K(α)/K is a finite extension, say [K(α) : K] = n.
Let p ∈ K[x] be the minimal polynomial of β over K. It is irreducible over K.
Suppose for a moment that p is irreducible over K(α) too. In that case [K(α, β) :
K(α)] = deg p. So by the Tower Law
is finite too. So K(α, β)/K is algebraic, by the proposition. Thus any element
of K(α, β) is algebraic over K, in particular the arithmetic combinations of α
and β we wish to prove.
We presumed p was irreducible over K(α). But if not, things are even better.
In that case p must factorise p = p1 p2 · · · ps into irreducibles pi over K(α), and
β must be a root of one of the pi , without loss of generality p1 . (That is,
p1 ∈ K(α)[x] is the minimal polynomial of β over K(α).) So simply repeat the
previous argument with p1 in place of p.
Example 6.31. This implies at once that the set A ⊂ C of all complex numbers
that are algebraic over Q,
Remark 6.32. It was trivial that finite extensions are algebraic. Conversely,
simple algebraic extensions are finite by Proposition 6.21. But the more general
converse is not true, as Q ⊂ A in Example 6.31 shows.
You can easily make up other gratuitous counterexamples: inductively adjoining
all the real pth roots of all primes p, $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2}, \sqrt[3]{3}, \sqrt[5]{5}, \dots, \sqrt[p]{p}, \dots)$,
for example. In any example you make up, you’ll have to show that what you
add inductively does not already lie in the field you have constructed so far. In
this case, once you calculate $[\mathbb{Q}(\sqrt[p]{p}) : \mathbb{Q}] = p$, you’ll see that the Tower Law
shows both $\sqrt[p]{p} \notin \mathbb{Q}(\sqrt{2}, \dots, \sqrt[q]{q})$, where q is the prime before p, and then that
the whole extension is infinite. You can have fun making up some others (the
easy part) and trying to prove they are infinite (which can be difficult).
Notice that EmbK (L, M ) is not naturally a group: we cannot compose maps
because in general their domain and codomain are different.
Remark 6.34. There is one situation in which EmbK (L, M ) is a group under
composition in a straightforward way. Suppose L/K is finite, L ⊂ M and that
it so happens that ϕ(L) ⊂ L for every K-homomorphism ϕ. Since L/K is finite,
each ϕ is therefore an isomorphism L → L, and so EmbK (L, M ) is naturally
identical to AutK (L), which is a group under composition. (We used finiteness
here to ensure maps are bijections so they can be inverted.)
Fundamental Observation 6.35. Consider ϕ ∈ EmbK (L, M ) where L/K
and M/K are any field extensions. If α ∈ L is a root of a polynomial f ∈ K[x]
then ϕ(α) is a root of f in M .
Proof. The point is that ϕ respects addition and multiplication and is the identity
on K, so ϕ(f) = f, if we extend ϕ naturally to a ring homomorphism
L[x] −→ M[x] as in §4.2 Example 4.3. With that first equality,
Spoiler 6.37. It would be nice if the map ϕ : K(α) −→ K(β) we got extended
to a K-automorphism L −→ L. In general it does not, but later in Corollary 9.16
we will see a situation in which it always does, which will give us a lot of power.
Example 6.38. Thinking back to the roots α = ∛2, αω and αω^2 of f = x^3 − 2,
the proposition shows that there is an isomorphism between the three subfields
Q(α) ≅ Q(αω) ≅ Q(αω^2) of the field L = Q(α, ω) containing all the roots.
When motivating the subject in §1 we implicitly said this, and hoped that these
isomorphisms could be realised by automorphisms of L (which indeed they are,
as we see later). The proof above completely ignores L—it’s only there as a
place for α, β to live—and simply says that each of the three fields is isomorphic
to the same model extension K[x]/(f ) of K.
Theorem 6.41. Let L/K be a finite extension and M/K any extension. Then
# EmbK (L, M ) ≤ [L : K]
Since [L : K(α)] < [L : K], we have by induction that #ρ^{−1}(ϕ) ≤ [L : K(α)].
So
Remark 6.42. I like this proof. It shows how we can use the tower law to do
induction on the degree of an extension, and then use simple extensions, which
are very concrete and we can be completely explicit about, as the inductive step.
Remark 6.43. It may have struck you that the upper bound in Theorem 6.41
is a function of L/K and has nothing whatsoever to do with M . The role of M
comes later as we find out when this bound is an equality. The inductive step
of the proof already contains the clue: you can build more maps if M contains
more roots of the minimal polynomials of elements of L over K.
7 Applications: antiquities
Euclidean geometry is a practical and applicable subject.
When you move an IKEA wardrobe from one room to another, you check
in advance whether it can be walked or carried, or whether sadly it needs dis-
mantling, before you gouge holes out of the walls and ceilings. If you want to
rise to become the Head Gardener at Buckingham Palace or Kew Gardens or
the University of Warwick, you’d better be able to be sure that the corners of
those beds are right angles and that if you have 10 m² coverage of compost the
area you’re mulching actually is 10 m²: the Royal Horticultural Society practical
exams really do include Euclidean geometry (a small dose, proofs omitted).
Because it is such a powerful tool, it is useful to pin down exactly what you
can do with Euclidean geometry and what you cannot. We specify the tools and
begin to explore the myriad constructions in §7.1. If you are only interested in
how the field theory works you can skip straight to §7.2, remembering that the
only tools we have to mark points are to intersect lines and circles with lines or
circles, and there are conditions on which lines and circles we may draw.
The tools we use for constructions are limited at the outset, but the ideas we
use to prove that we cannot solve certain problems are not. We will translate
the whole exercise into modern coordinate geometry, which was not available to
Euclid, and then the Tower Law will solve the impossible.
If you don’t like what we achieve, then you can think about buying some new
tools and working through the whole exercise with them. But remember that
your toolbox is not infinite, so at some point you’ll have to have the courage to
make do with what you’ve got and go out into the field and get working.
(d) For each of those challenges, either try harder and come up with a con-
struction, or prove that there cannot be any construction at all.
The question is: which points of R^2 can we plot? We will gather together all
the points we plot into a set P ⊂ R^2; our starting points (−1, 0) and (1, 0) ∈ P
give us something to work with. We will quickly find that there are many other
points in P, but first we need to understand what a ruler and compass can do
and how we identify other points of P, that is, what it means to plot a point.
What can our ruler and compass do? These are our tools:
Ruler: If P, Q ∈ P are distinct points, then we may draw the straight line P Q
between them. Note that the line does not extend beyond the endpoints
P and Q.
Compass: If P, Q, R ∈ P (not necessarily distinct) then we may draw the circle
centred on P with radius equal to the length of the line QR. In practical
terms, we set our compass radius by putting the two sharp spikes at Q
and R respectively, and then move one spike to P and rotate the compass
through 360◦ using the other spike to describe a circle as we go; see the
picture below of what a compass is in this context (you can buy that one
online right now).
What counts as plotting a point? Given that we can draw certain lines
and circles with the ruler and compass, all of the following are points of P:
(a) the intersection point of two lines drawn with the ruler;
(b) the points of intersection of a line drawn with the ruler and a circle
drawn with the compass;
(c) the intersection points of two circles drawn with the compass.
Of course, the lines and circles in this prescription may only be drawn in accor-
dance with their definition above, so we need to know some points of P to be
able to draw them at all. We think of building up P recursively, including new
intersection points as we draw ever more lines and circles.
A point P ∈ R^2, other than the initial points (−1, 0) and (1, 0) ∈ P, is
an element of P if and only if it can be realised by the operations above applied
finitely many times.
The first computation We start out knowing P0 = (−1, 0) and Q0 = (1, 0).
We can draw the line P0 Q0 , and then set the compass to radius P0 Q0 and draw
the circles of that radius centred on P0 and on Q0 —that’s absolutely everything
we can do given only two points to work with. But already the intersection
points of those two circles are points S and S′, and so our knowledge of P
grows. With the modern technology of coordinates, and ancient technology of
Pythagoras’ Theorem, we may express those as
S, S′ = (0, ±√3)
We may use the ruler now to draw the line SS′. Intersecting the lines P0 Q0 and
SS′ produces another new point R = (0, 0).
The picture sketches the situation. (I’ve borrowed that picture from Wolfram
alpha so it’s not labelled for us. Note too that our ruler would not allow the
vertical bisector to extend beyond the intersections of the circles.)
In complete generality this shows that we may always make the following con-
struction.
Construction 1. If P, Q ∈ P then the midpoint R of the line segment P Q also
lies in P.
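If you would like to see the first computation happen numerically, here is a minimal sketch in Python. It is not part of the construction itself: floating point stands in for exact geometry, and the helper names circle_circle and midpoint are my own invention for illustration.

```python
import math

def circle_circle(c1, r1, c2, r2):
    """Intersection points of two circles, each given by a centre and a radius."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []  # concentric or disjoint circles: no points to plot
    a = (r1**2 - r2**2 + d**2) / (2 * d)    # distance from c1 to the common chord
    h = math.sqrt(max(r1**2 - a**2, 0.0))   # half the length of that chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ux, uy = -(y2 - y1) / d, (x2 - x1) / d  # unit vector along the chord
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]

def midpoint(P, Q):
    """Midpoint of PQ via the two circles of radius |PQ| centred on P and Q."""
    r = math.dist(P, Q)
    S, S2 = circle_circle(P, r, Q, r)
    # Shortcut: the line through S and S2 meets PQ at the average of S and S2,
    # so we skip the explicit line-line intersection.
    return ((S[0] + S2[0]) / 2, (S[1] + S2[1]) / 2)

# Starting from P0 = (-1, 0) and Q0 = (1, 0):
S, S2 = circle_circle((-1.0, 0.0), 2.0, (1.0, 0.0), 2.0)
R = midpoint((-1.0, 0.0), (1.0, 0.0))
```

The ruler step is replaced by a symmetry shortcut here, which is fine for a sanity check but is not how the construction is defined.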
N = {|P Q| : P, Q ∈ P} ⊂ R
Extending lines beyond their endpoints The previous trick can be used
to extend lines indefinitely. Circles drawn centred on S and S′ of radius SS′
meet on the line P Q but outside its endpoints. We can join those and repeat.
In this way we can draw as much of the x- and y-axes as we please—and indeed
extend any other lines we have a segment of.
Given the extension of P0 Q0 along the x-axis, we may draw a circle of radius
P0 Q0 (in the notation above) at Q0 to construct a point T as follows.
Construction 2. If P, Q ∈ P, then also T ∈ P, where |T Q| = |P Q| and Q is
the midpoint of the line segment P T .
In arithmetic terms, we may think of that as multiplying by 2. The same
set up constructs analogous points on the y-axis.
Construction 3. If R, S ∈ P, then T, T′ ∈ P where T T′ is the line at right
angles to RS meeting it in R with |RT| = |RT′| = |RS|.
In other words, we can construct the square lattice of points at distance |RS|
centred on any point P ∈ P the moment we construct R, S ∈ P.
You can continue with many constructions: divide by any integer, move a
line segment RS to a parallel line segment starting at a given P ∈ P, multiply
by an integer, rotate a line segment to be parallel to the x-axis, compute a
square root, and so on and on and on.
(a) S is the unique intersection point of two lines drawn with the ruler
(b) S is one of the points of intersection of a line drawn with the ruler and
a circle drawn with the compass
(c) S is one of the intersection points of two distinct circles drawn with the
compass
where a line means a (compact) straight line segment P Q joining two points
P, Q ∈ P_i and a circle means a circle centred at a point R ∈ P_i of radius equal
to the length of a line P Q joining points of P_i. (You may regard all italicised
mention of rulers and compasses as decorative flim-flam that plays no role and
may be deleted.)
There is one caveat: if in any of the operations there are either no intersection
points or infinitely many in R^2, then the operation is not permitted. (The main
point is to rule out lines that meet along a common nontrivial segment.)
Then S = (x, y) ∈ R^2 and K_{i+1} = K_i(x, y). (It will be clear in the proof of
Lemma 7.3 that either y ∈ K_i(x) or x ∈ K_i(y), so the extension is simple.)
Any sequence of field extensions K_0 ⊂ K_1 ⊂ · · · ⊂ K_N made in this way is
called a sequence of ruler and compass field extensions.
Remark 7.2. The algebraic process in Definition 7.1 relates directly to the
geometry of §7.1: if you construct a point P = (x, y) ∈ R^2 using the geometri-
cal ruler and compass constructions of §7.1, then those geometrical operations
describe a sequence of ruler and compass field extensions with x, y ∈ K_N.
Lemma 7.3. Let Q = K_0 ⊂ K_1 ⊂ · · · ⊂ K_N be a sequence of ruler and compass
field extensions. Then [K_N : Q] = 2^M for some integer M ≤ N.
Proof. Obvious. End of proof. In detail: inductively we build a sequence of
field extensions Q ⊂ K_1 ⊂ K_2 ⊂ · · · by constructing points Q_1, Q_2, . . . whose
coordinates generate K_{i+1} over K_i. (Many of these extensions may be trivial:
K_{i+1} = K_i.) To extend K_i, we construct a point Q_{i+1} ∈ R^2 by one of the follow-
ing three operations (which are the three operations of Definition 7.1 expressed
in coordinate geometry).
Working in each case with a, b, c, d, e, f ∈ K_i we solve one of:
(a) Intersection of two lines: simultaneously solve
ax + by = c and dx + ey = f
But that is done by linear algebra in K_i, and the solution is (x, y) ∈ K_i^2.
(b) Intersection of line and circle: simultaneously solve
ax + by = c and (x − d)^2 + (y − e)^2 = f^2
But that is done by substituting for x or y from the linear equation into
the quadratic and then solving the result with the usual quadratic formula:
thus x and y involve at worst a single square root over K_i.
(c) Intersection of two circles: simultaneously solve
(x − a)^2 + (y − b)^2 = c^2 and (x − d)^2 + (y − e)^2 = f^2
Subtracting one equation from the other leaves the affine linear equation
2(d − a)x + 2(e − b)y = c^2 − a^2 − b^2 − (f^2 − d^2 − e^2), namely the equation of
the line through the two intersection points (including the case the circles
are tangent) which puts us back in the previous case: so again x and y
involve at worst a single square root over K_i.
Thus each step either has a solution in K_i, so [K_{i+1} : K_i] = 1, or in K_i(√s) for
some s ∈ K_i, so [K_{i+1} : K_i] ≤ 2. The Tower Law concludes the proof.
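To make the line and circle step concrete, here is a small sketch over K_i = Q using exact fractions. The function name line_circle_quadratic and the parameter r2 are my own. I pass the squared radius r2 rather than the radius, on the assumption that this is the quantity that lies in the field: the squared distance between two points with coordinates in K_i certainly does. The point is that the quadratic and its discriminant s never leave the field, so the new coordinates live in K_i(√s).

```python
from fractions import Fraction as F

def line_circle_quadratic(a, b, c, d, e, r2):
    """Quadratic A t^2 + B t + C = 0 satisfied by one coordinate of the points
    where the line a x + b y = c meets the circle (x - d)^2 + (y - e)^2 = r2.

    Returns (variable, A, B, C, discriminant). Only field operations are used,
    so every returned number lies in the field of the inputs (here Q).
    """
    if b != 0:
        var, k = "x", c - b * e   # eliminate y = (c - a x) / b
        A, B, C = a*a + b*b, -2 * (b*b*d + a*k), b*b*d*d + k*k - b*b*r2
    else:
        var, k = "y", c - a * d   # vertical line: eliminate x = c / a
        A, B, C = a*a, -2 * a*a*e, a*a*e*e + k*k - a*a*r2
    return var, A, B, C, B*B - 4*A*C

# The line x = 0 and the circle of squared radius 4 centred on (-1, 0):
# the quadratic is y^2 - 3 = 0, so the new coordinates lie in Q(sqrt 3).
print(line_circle_quadratic(F(1), F(0), F(0), F(-1), F(0), F(4)))
```

The two-circle case needs no separate code: subtracting the two circle equations gives a line, and then this function applies.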
(Non-examinable) Remark On the face of it, the geometrical operations
of §7.1 merely describe finitely many points with finitely many coordinates (x, y).
But we hinted in §7.1 that the geometrical operations are strong enough to carry
out all field operations, so we can indeed construct any point of K_0^2 = Q^2, and
then, in any inductive step, if we can construct Q = (x, y) ∈ R^2 then we can
also construct points having any two elements of K_i(x, y) as their coordinates.
You may object that we cannot carry out infinitely many geometrical ruler
and compass constructions, and that’s fair enough, but we can construct any
specified element of any K_i using only finitely many operations, and if you wish
you may prefer to spell that out every time I say K_i.
While we’re discussing §7.1, notice that the field Ñ ⊂ R is simply the union
of every field K_i that arises in every possible ruler and compass extension. If
it was ever convenient to have a ‘master field’ of geometrically constructible
numbers, then we could use this idea, but in general we are trying to get away
from the idea of working in a single fixed favourite field (such as Ñ or C), and
work instead from the bottom up.
7.3 Impossible constructions
Duplicating the cube: building a shed The gamekeeper tells you she
wants a cube-shaped shed whose volume is exactly 2 in the distance units of
your Euclidean plane cubed. (I didn’t ask, but it is probably a hide for shooting
moles, who are notoriously unafraid of cubes—in general, it’s best not to know
how they keep those lawns so nice, in the same way it’s best not to know what
sausages are made of.)
Your job, therefore, is to mark out a square of side length ∛2 on the ground:
the floor, walls and ceiling of the shed can all be modelled on that, and then
it’s someone else’s job to assemble it. This is equivalent to you being able to
construct the point (∛2, 0) in your Euclidean garden, or equivalently to finding
a sequence of ruler and compass field extensions Q ⊂ K_N with ∛2 ∈ K_N.
But [Q(∛2) : Q] = 3 since x^3 − 2 is irreducible over Q, while [K_N : Q] is a
power of 2 by Lemma 7.3, so by the Tower Law Q(∛2) cannot be a subfield of
any such field K_N. So it can’t be done, mate – sorry.
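The irreducibility of x^3 − 2 is easy to check by machine: a cubic over Q is reducible only if it has a rational root, and the rational root test leaves finitely many candidates. Here is a minimal sketch. The helper names are mine, and it assumes integer coefficients with nonzero constant and leading terms.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial; coeffs[i] multiplies x^i.

    Any rational root in lowest terms p/q has p dividing the constant term
    and q dividing the leading coefficient, so we just try them all.
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(a0) for q in divisors(an) for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r**i for i, c in enumerate(coeffs)) == 0)

print(rational_roots([-2, 0, 0, 1]))   # x^3 - 2: the list is empty
```

An empty list means no linear factor over Q, which for a cubic is the same as irreducible. This shortcut is special to degrees 2 and 3.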
Trisecting angles: flowers for the visiting dignitary The French President
is visiting Kenilworth to celebrate Galois’ birthday. The builders have
put up a marquee and carved out a 30° angle in the earth for the corner of a
flower bed. Naturally the mayor would like a display of red, white and blue
flowers, and it’s your job to plant them equally.
It is easy to mark out an angle of 30° in the ground by ruler and compass:
we can construct rational numbers and square roots, so simply mark (√3/2, 1/2)
and draw a straight line from that to (0, 0). But can you divide the angle into
three equal subangles using your ruler and compass?
[Figure: the points (cos 30°, sin 30°) = (√3/2, 1/2) and (cos 10°, sin 10°), the 30° angle, and a length labelled 1.]
If you can do that, then you can certainly plot the point Q = (cos 10◦ , sin 10◦ )
by intersecting your dividing line with a circle of radius 1. But, as when finding
the roots of a real cubic using trigonometric functions in §3.3, we can use the
triple angle formula (9) or its sin(3θ) buddy
to show that x = sin 10° is a root of a polynomial f = 4x^3 − 3x + 1/2. This
f is irreducible: any test from §5.3 will do; for example, 2f(x + 1) = 8x^3 +
24x^2 + 18x + 3 is irreducible by Eisenstein’s criterion with p = 3. Therefore
[Q(sin 10°) : Q] = 3, and so sin 10° does not lie in any ruler and compass field
extension by Lemma 7.3 and the Tower Law. Sorry mayor: we can’t do that.
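The shift and the Eisenstein check are both mechanical, so here is a short sketch that redoes them. The function names shift and eisenstein are my own. Coefficient lists are written lowest degree first, so 2f = 8x^3 − 6x + 1 becomes [1, -6, 0, 8].

```python
from math import comb

def shift(coeffs, h):
    """Coefficients of f(x + h), where coeffs[i] multiplies x^i in f."""
    out = [0] * len(coeffs)
    for i, c in enumerate(coeffs):
        for k in range(i + 1):
            # binomial expansion of c * (x + h)^i
            out[k] += c * comb(i, k) * h ** (i - k)
    return out

def eisenstein(coeffs, p):
    """True if Eisenstein's criterion applies to the polynomial at the prime p."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

g = shift([1, -6, 0, 8], 1)       # 2 f(x + 1)
print(g, eisenstein(g, 3))
```

Note that the criterion does not apply to 2f itself at p = 3 (the constant term is 1), which is why the shift is needed.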
If you want to be more constructive—perhaps you’re looking for promotion—
you could get back to the head gardener with a list of some constructible angles
that you can trisect such as 90◦ . That would be an exercise.
Squaring the circle: linearising the ducks OK, now you’re joking. Can
you take a circular pond and rebuild it as a square pond of exactly the same
depth (the ducks like that) and exactly the same volume—not approximately,
but make a square with exactly the same area as the original circle.
The circle has some constructible radius r ∈ K_N for some sequence of ruler
and compass field extensions Q ⊂ K_N. Its area is therefore πr^2. So the square
must have side length r√π. Are you seriously asking me whether √π lies in a
field extension L/K_N of degree [L : Q] some power of 2?
Well, prima facie it could. If it did, then we could also find an algebraic
extension containing π, since we can compute squares using ruler and compass.
But in fact π is transcendental—it is not the root of any polynomial with rational
coefficients (see [Stewart2015, §24.3]), so this cannot happen. I’d love to help,
mate, but this can’t be done either.
8 The automorphism group of a field
The first part was already in §4.4. The group of automorphisms of a field is the
main invariant, so worth reviewing.
Lemma 8.5. Suppose L is a field and G ⊂ Aut(L) is a finite subgroup. Then
[L : L^G] ≤ #G
This is a slick piece of linear algebra, with a clever shift between coefficient
fields. We have to show that there is no L^G-linearly independent set of elements
of L of size > #G. The proof easily cooks up linear dependence relations over L,
and then uses G to see that there was one defined over L^G all along.
Proof. Let G = {σ_1, . . . , σ_n}. Suppose a_1, . . . , a_{n+1} ∈ L. Set K = L^G, the fixed
field of G. We must find a K-linear dependence relation Σ x_i a_i = 0 with all
x_i ∈ K, but not all zero.
[Clever] Let
v_i = (σ_1(a_i), . . . , σ_n(a_i))^T ∈ L^n for i = 1, 2, . . . , n + 1
x_1 v_1 + · · · + x_k v_k = 0     (12)
That is, (x_1, . . . , x_k)^T is a nontrivial solution (in fact, all x_i ∈ L are nonzero)
of n simultaneous linear equations with coefficients in L.
We can hit the equations (12) with any σ ∈ G: that permutes the rows of the
matrix, so leaves exactly the same set of equations, but now exhibits the solution
(σ(x_1), . . . , σ(x_k))^T. But that must be the same solution as before: if not, then
since σ(x_1) = σ(1) = 1 = x_1 we could subtract it from the previous solution to
get a shorter linear dependence relation, which would be a contradiction as we
started with the shortest one.
That is σ(x_i) = x_i for each i = 1, . . . , k and for every σ ∈ G. So each
x_i ∈ L^G = K, and the first row of (12) gives a nontrivial K-linear relation
among the a_i (since σ_1 = id).
Theorem 8.6. Let G ⊂ Aut(L) be a finite subgroup of the group of automor-
phisms of a field L. Then [L : L^G] = #G.
Proof. Let K = L^G. Every σ ∈ G fixes every element of K, so G ⊂ Aut_K(L).
Lemma 8.5 shows [L : K] ≤ #G. In particular, L/K is a finite extension so
we may apply Theorem 6.41 for L/K with M = L for the converse inequality
f = (x − α)(x − αω)(x − αω^2)
where α = ∛2 ∈ R and ω a primitive cube root of unity. Let’s name the roots
α_1 = α,   α_2 = αω,   α_3 = αω^2
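[Figure: lattice of subfields of L = Q(α, ω), with M = Q(ω) = Q(√−3) among the intermediate fields; the edges are labelled with the degrees 2 and 3.]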
{1, 2, 3} −→ {1, 2, 3}
i ↦ j where σ(α_i) = α_j
So, having fixed the numbering of the roots, we may treat AutQ (L) ⊂ S3 as a
subgroup of the permutation group.
What we don’t know yet is the image of (13), or, put another way, which of
the permutations of the roots can be realised by an automorphism of L. Let’s
sort that out by constructing some automorphisms.
So by Corollary 6.39, elements of Aut_M(L) = Emb_M(M(α_1), L) are deter-
mined by the roots of x^3 − 2 in L. Thus there is an M-homomorphism τ : L −→ L
with τ(α_1) = α_2: that is, τ(ω) = ω and τ(α) = αω, and so it follows that also
τ(α_2) = τ(α)τ(ω) = αω^2 = α_3 and τ(α_3) = τ(α)τ(ω)^2 = αω^3 = α_1
Thus τ is the permutation (123) ∈ S_3.
Groups make life easy We have found permutations σ = (23) and τ = (123)
in the subgroup AutQ (L) ⊂ S3 . But those two permutations generate all others,
so AutQ (L) = S3 .
We know everything about S3 . For example, its subgroups are
{id},   ⟨σ⟩ ≅ Z/2,   ⟨(13)⟩ ≅ Z/2,   ⟨(12)⟩ ≅ Z/2,   ⟨τ⟩ ≅ Z/3   and   S_3
and they sit in a subgroup lattice as drawn in Figure 3 in §1.4. Of these, only
the two trivial ones and the Z/3 are normal subgroups: they comprise unions of
complete cycle types. The three subgroups of order 2 are conjugate: for example
(123)(12)(321) = (23) shows ⟨(12)⟩^{(123)} = ⟨σ⟩
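If you want to check the group theory by brute force, a few lines of Python will do it. This is only a sketch: permutations of {1, 2, 3} are stored as tuples of images, products are composed right to left (an assumption on my part, but it is the convention under which the conjugation above comes out as (23)), and the function names are made up for the occasion.

```python
def compose(p, q):
    """Right-to-left product: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def generated(gens):
    """Subgroup generated by gens, found by closing up under multiplication."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

sigma = (1, 3, 2)        # the transposition (23)
tau = (2, 3, 1)          # the 3-cycle (123): 1 -> 2 -> 3 -> 1
tau_inv = (3, 1, 2)      # (321)
swap12 = (2, 1, 3)       # the transposition (12)

print(len(generated([sigma, tau])))                    # order of <sigma, tau>
print(compose(tau, compose(swap12, tau_inv)) == sigma)  # the conjugation above
```

In a finite group, closing up under left multiplication by the generators is enough, since inverses are powers.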
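[Figure: lattice of subfields of L = Q(α, ω) as fixed fields: K_1 = L^⟨(23)⟩, K_2 = L^⟨(13)⟩, K_3 = L^⟨(12)⟩ and M = L^⟨(123)⟩, with Q = L^{S_3} at the bottom; the edges are labelled with the degrees 2 and 3.]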
We will understand exactly when this works for other polynomials, and that is
enough power to understand when there can be a radical formula for their roots.
9 Field extensions II
9.1 Splitting fields
A splitting field for a polynomial f ∈ K[x] (not necessarily irreducible) is a
minimal field extension that contains all the roots of f , that is, where f splits
into linear factors.
Definition 9.1. An extension L/K is a splitting field for f ∈ K[x] if and
only if
Any such L/K is a finite extension since each αi is algebraic over K, so the
Tower Law applies to K ⊂ K(α1 ) ⊂ K(α1 , α2 ) ⊂ · · · ⊂ L to give [L : K] < ∞.
You can refer to L as the splitting field of f : it is implicit that L/K is an
extension, and the fact that you said ‘the’ rather than ‘a’ splitting field will be
justified soon when we prove that any two splitting fields for f over the given
field K are isomorphic over K. But first a point to be careful about.
Example 9.2. Let f = x^3 − x^2 − 2x + 2 = (x^2 − 2)(x − 1) ∈ Q[x]. Then,
working in C, the roots of f are 1 and ±√2, so L = Q(√2) is a splitting field
for f. We could have written L = Q(1, √2, −√2), but some of these are already
in Q, and some are in the field generated by the others, so we’ll be happy with
Q(√2).
There is a nuance that is important to appreciate, but which the definition is
only clear about if you read it carefully. Consider the same polynomial f but
now regard it as an element of N[x] where N = Q(√3). That has not made any
difference to f, nor to its roots in C: the √3 in N is irrelevant to the roots of f.
But now L is not a splitting field, because it is not an extension of the field
N over which f is defined. Instead, L_2 = N(√2) = Q(√2, √3) is a splitting
field. What is and what is not a splitting field depends crucially on the field
over which we defined f in the first place; and if we change that field, then the
splitting field may change. This is not a big deal in practice if we take care.
Example 9.3. Let f = x^4 − 2x^2 − 15 ∈ Q[x]. If you spot that f factorises into
two quadratics over Q, then this is easy, but if not then maybe you spot instead
that it is a quadratic in y = x^2. It equals y^2 − 2y − 15, which has roots −3 and
5 by the quadratic formula. So we find the roots of f by solving x^2 = −3 and
x^2 = 5, namely ±√−3 and ±√5. So the splitting field is L = Q(√−3, √5), the
smallest field extension of Q that contains those four roots.
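The factorisations used in Examples 9.2 and 9.3 can be confirmed by multiplying out. Here is a tiny sketch with polynomials stored as coefficient lists, lowest degree first. The name poly_mul is mine.

```python
def poly_mul(f, g):
    """Product of two polynomials; f[i] and g[i] multiply x^i."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

# Example 9.2: (x^2 - 2)(x - 1) should be x^3 - x^2 - 2x + 2
print(poly_mul([-2, 0, 1], [-1, 1]))
# Example 9.3: (x^2 + 3)(x^2 - 5) should be x^4 - 2x^2 - 15
print(poly_mul([3, 0, 1], [-5, 0, 1]))
```

The second product is the factorisation into two quadratics over Q that Example 9.3 invites you to spot.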
Remark 9.4. If K ⊂ C, then it is obvious that splitting fields exist over K.
Since C is algebraically closed, any f has a complete set of roots α1 , . . . , αn ∈ C,
and L = K(α1 , . . . , αn ) is a splitting field for f . There’s nothing to prove:
certainly f splits into linear factors over L, and by definition L is the smallest
subfield of C that contains all the αi . In practice it is excessive to name all
the roots when defining L, as some will come for free (x^3 − 1 ∈ Q[x] has roots
1, ω, ω^2 ∈ C, and Q ⊂ Q(ω) is a splitting field), but it’s convenient and harmless
to be profligate when we don’t have explicit values for the roots.
We prove the existence of splitting fields in complete generality using the
usual K[x]/(f ) trick, without needing the sledgehammer of algebraic closure.
Proposition 9.5. Let f ∈ K[x] of degree n. Then there is a splitting field L/K
for f of degree [L : K] ≤ n!.
Proof. If f splits into linear factors in K[x] we are already done. If not, let
g1 be a nonlinear irreducible factor of f : note deg g1 ≤ n. Then K ⊂ L1 =
K[x]/(g1 ) = K(α) is an extension that contains a root of g1 , namely α = the
class of x, and [L1 : K] = deg g1 ≤ n. The factorisation of f into irreducibles
over K(α) is finer than over K: in particular, all K-linear factors of f are also
factors over K(α) and f has gained the linear factor x − α, so the number of
linear factors has strictly increased, and the maximum degree of an irreducible
factor is n − 1.
Repeating this process inductively, at the ith step adding at least one root
L_{i+1} = L_i(α_i) of an irreducible factor g_i of degree ≤ n − i, so [L_{i+1} : L_i] ≤ n − i,
completes the proof: if there are s steps, the Tower Law shows
which is ≤ n! as claimed.
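For the record, the Tower Law estimate in this proof can be spelled out as follows. This is my own reconstruction, assuming the indexing of the proof (L_1 = K(α), then L_{i+1} = L_i(α_i)) and writing L = L_s for the field reached after s steps; the exact indices are an assumption.

```latex
[L : K] \;=\; [L_1 : K]\,\prod_{i=1}^{s-1} [L_{i+1} : L_i]
\;\le\; n\,(n-1)\cdots(n-s+1) \;\le\; n!
```

The last inequality uses s ≤ n: each step gains at least one new linear factor, and f has only n of them.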
Proof. We know that L = K(α1 , . . . , αs ), where αi are (some of) the roots
of f . We work by induction on s. Let m be the minimal polynomial of α1 . By
definition, m divides f , so m splits into linear factors in M , because f does. Let
β1 be any root of m in M . By Proposition 6.36, there is a K-homomorphism
K(α1 ) → K(β1 ) ⊂ M .
Now L/K(α1 ) is a splitting field for f /(x − α1 ) ∈ K(α1 )[x], and L =
K(α1 )(α2 , . . . , αs ) is generated by fewer than s elements over K(α1 ). So by
induction, there is a K(α1 )-homomorphism L → M . That map is, in particu-
lar, a K-homomorphism, and we are done.
Corollary 9.7. Suppose L/K and L′/K are both splitting fields for f ∈ K[x].
Then there is a K-isomorphism L → L′.
Proof. By the theorem, there are (injective!) K-homomorphisms L → L′ and
L′ → L. Both fields L and L′ are finite-dimensional K-vector spaces, so they
must have the same dimension. Therefore both maps are bijections, hence are
K-isomorphisms.
Remark 9.8. If L/K is a splitting field for f ∈ K[x], then it will be the
splitting field for many other g ∈ K[x]. For example L = Q(√2) is the splitting
field for x^2 − 2. But by the quadratic formula it is the splitting field for any
g = ax^2 + bx + c with ∆ = b^2 − 4ac = 2n^2 for some nonzero n ∈ Z; x^2 − 8, say, or for
that matter (x^2 − 2)(x^2 − 8), since nobody said g had to be irreducible.
Put another way, if α + β√2 ∈ L \ Q, then the product of x − (α + β√2)
with x − (α − β√2) lies in Q[x] and has L as its splitting field.
      K(α) −→ L(α)
     ↗        ↗      ↘
K  −→  L    −→    M
The key is to notice that L(α) is a splitting field over K(α) for the initial f
(harmlessly regarded as a polynomial in K(α)[x]). By the Tower Law
But then:
• K(α)/K and K(β)/K are isomorphic extensions, as usual: Proposition 6.36.
• K(α) −→ L(α) is a splitting field for the initial f , and so is the composition
K(α) −→ K(β) −→ L(β) (the first arrow being an isomorphism), so there is a
K(α)-isomorphism L(α) −→ L(β).
These have the respective numerical consequences:
So the Tower law statements above imply at once that [L(α) : L] = [L(β) : L].
But that is enough. If some root α ∈ L, then [L(α) : L] = 1, so [L(β) : L] = 1
for any other root β, that is β ∈ L too.
Conversely, in the finite case, normal extensions are splitting fields, and
so we haven’t really generalised anything at all.
Corollary 9.11. Let L/K be a finite extension. The following are equivalent:
(a) L/K is a normal extension.
(b) L/K is the splitting field of some (not necessarily irreducible) f ∈ K[x].
Proof. Theorem 9.9 proves one direction. The converse is trivial, as follows.
Since L/K is finite, L = K(α1 , . . . , αs ). Let mi ∈ K[x] be the minimal polyno-
mial of αi . Since L is normal and mi has a root in L, mi has all its roots in L.
Thus L is the splitting field for f = m1 m2 · · · ms .
Proof. Again write L = K(α1 , . . . , αs ) with mi ∈ K[x] the minimal polynomial
of αi , and write f = m1 m2 . . . ms . Then let N be a splitting field for f ∈ L[x].
It is normal over L by the theorem, and so a fortiori normal over K.
Definition 9.13. Let L/K be a finite extension. A normal closure of L/K is
an extension N/L such that N/K is normal, and N does not contain any strict
subfield containing L that is also normal over K.
The Corollary says that normal closures exist. They are clearly unique up to
K-isomorphism by its proof: each mi already has a root in L by construction,
so f = m1 m2 . . . ms had better split in any normal closure, in particular in the
smallest one, and so the closure simply is a splitting field of f .
The example is completely general: the same words prove the following.
Corollary 9.16. Let L/K be a finite normal extension. Suppose f ∈ K[x]
is irreducible and α, β are roots of f in L. Then there is ϕ ∈ AutK (L) with
ϕ(α) = β.
Proof. There is a K-homomorphism K(α) −→ K(β) by Proposition 6.36 since
α and β have the same minimal polynomial over K. Corollary 9.14 says that
this lifts to a K-automorphism of L, since L/K is a finite normal extension.
This was what we saw hinted in Spoiler 6.37.
Definition 9.19. A polynomial f ∈ K[x]\{0} is separable over K if and only
if it has n = deg f distinct roots in a splitting field L/K. If f is not separable
it is called inseparable.
(The uniqueness of splitting fields up to isomorphism means this definition
is independent of which splitting field L you used to track down the roots of f .)
Example 9.20. Suppose K ⊂ C and f ∈ K[x] is any polynomial. Fix deg f =
n ≥ 2, otherwise there’s nothing to discuss. Over C we know that
f = c(x − α_1)^{m_1} · · · (x − α_s)^{m_s}
where α_1, . . . , α_s are the distinct roots of f and m_i are their multiplicities, and
m_1 + · · · + m_s = n.
We claim that if f is irreducible, then every m_i = 1 and so s = n. Suppose
not: suppose m_1 ≥ 2 and write f = (x − α_1)^{m_1} g for some g ∈ C[x]. Consider the
derivative f′ = df/dx of f and let h = gcd(f, f′) ∈ K[x]. Since f is irreducible
h = 1 or h = f (up to a scalar multiple). But h also divides f′, which has
smaller degree than f, so it must be that h = 1.
That’s all fine, but watch this:
f′ = m_1 (x − α_1)^{m_1 − 1} g + (x − α_1)^{m_1} g′
which has a factor of (x − α_1) since m_1 ≥ 2. In other words, x − α_1 divides h
as complex polynomials, but that’s nonsense since we just saw that h = 1.
In short: if K ⊂ C, then every irreducible polynomial f ∈ K[x] is separable over K.
At first sight, that looks like a cast-iron proof that all irreducible polynomials
are separable, but there is a loophole.
Example 9.21. First an example that doesn’t really address the point, but
contains the clue: consider f = x^3 − 2 = x^3 + 1 ∈ F_3[x], where the coefficient
ring is the field F_3 = {0, 1, 2} of 3 elements (and therefore characteristic 3). Its
derivative is 3x^2, which is the zero polynomial since 3 = 0 in F_3. This vanishing
derivative is the loophole that allows one to get past the GCD calculation of
Example 9.20. By multiplying out you can check that f = (x + 1)^3 so the 3
roots of f are not distinct. This is a fine example of an inseparable polynomial,
but it is merely a power of a polynomial in F_3[x] in disguise so not so brilliant.
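Both claims in this example (the cube of x + 1 and the vanishing derivative) are quick to verify with arithmetic modulo 3. Here is a minimal sketch with coefficient lists, lowest degree first. The helper names are mine.

```python
def poly_mul_mod(f, g, p):
    """Product of two polynomials with coefficients reduced modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def derivative_mod(f, p):
    """Formal derivative with coefficients reduced modulo p."""
    return [(i * c) % p for i, c in enumerate(f)][1:]

x_plus_1 = [1, 1]
cube = poly_mul_mod(poly_mul_mod(x_plus_1, x_plus_1, 3), x_plus_1, 3)
f = [(-2) % 3, 0, 0, 1]             # x^3 - 2 with coefficients in F_3
print(cube == f)                    # (x + 1)^3 agrees with x^3 - 2 over F_3
print(derivative_mod(f, 3))         # every coefficient of the derivative is 0
```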
We don’t find irreducible inseparable polynomials with such easy examples
(we’ll see why later), but the clue is there, and because of this all books have
the same bona fide example of an inseparable polynomial. Let’s do it.
Let K = F_p(t), where t is a variable and p ≥ 3 is an odd prime: that is K
is the field of all rational functions u(t)/v(t), where u, v are polynomials with
coefficients in F_p. The example is f = x^p − t ∈ K[x].
Consider a field extension L = K(α) where α is a root of f : that is, α^p = t.
Watch this: in L[x] we calculate
(x − α)^p = x^p − p x^{p−1} α + (p(p − 1)/2) x^{p−2} α^2 + · · · + p x α^{p−1} − α^p
= x^p − α^p
= x^p − t = f
where all the middle terms · · · of the binomial expansion vanish because their
coefficients include a factor of p. So in fact, f has exactly one root in L, and
that root has multiplicity p.
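The vanishing of the middle terms comes down to p dividing each binomial coefficient C(p, k) for 0 < k < p. That is easy to test for small p; the function name below is my own.

```python
from math import comb

def middle_binomials_vanish(p):
    """True if p divides C(p, k) for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

print([p for p in range(2, 20) if middle_binomials_vanish(p)])
```

Try a composite number such as 4 to see it fail: C(4, 2) = 6 is not divisible by 4.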
The crucial point of this example is that f is irreducible over K. That is
elementary to check. It’s good to know that this example exists, but let’s leave
that check as an optional exercise. (Hint: if (x − α)^r ∈ K[x] for 1 ≤ r < p,
then α^r ∈ K, so α^{ar+bp} ∈ K for all a, b ∈ Z. You can deduce that α ∈ K, so
α = u(t)/v(t), and now taking pth powers gives 0 = t v^p − u^p ∈ K[t], which you
explain is impossible.)
Definition 9.22. Let L/K be an extension.
(a) An element α ∈ L is separable over K if its minimal polynomial is
separable over K.
(b) The extension L/K is separable if every α ∈ L is separable over K.
There are two things to know.
Theorem 9.23. Let f ∈ K[x] be an irreducible polynomial. Then f is inseparable
over K if and only if char K = p > 0 and f = a_0 + a_1 x^p + a_2 x^{2p} + · · · + a_n x^{np}.
That statement needs reading with a little care: it only talks about irre-
ducible polynomials, and it is not true that every polynomial of that form is
irreducible: for example, 1 + x^3 ∈ F_3[x] has that form, but the theorem says
nothing about it because it is not irreducible; it equals (1 + x)^3. The proof in
§9.4.1 is elementary once we set things up correctly and you should not think
of this as a great mystery.
Theorem 9.24. Let L = K(α1 , . . . , αs ) be an extension of K with each αi
separable over K. Then the extension L/K is separable.
The proof of this is not hard, but would be yet another small diversion; a
nice way to proceed is to characterise separability as another kind of integral
degree [L : K]sep , and then prove a tower law for that, after which the theorem
follows. In fact, we do not need this theorem in the rest of the module so we
omit the proof, but you may use the result freely in your thinking.
The rules of differentiation (in particular, linearity and the product rule)
follow for Df (applied only to polynomials) immediately. Thus we may turn
Example 9.21 into a general statement over any field.
Lemma 9.26. Let K be a field and f ∈ K[x] \ {0}. Then f is separable over
K if and only if f and Df are coprime in K[x].
Any factor x − αi of f divides all summands of Df except the ith one, with
which it shares no common linear factor, and so f and Df are coprime in L[x].
If they had a nontrivial common factor in K[x], that would also be a common factor
in L[x], and so they are coprime in K[x] as claimed.
If f is not separable, then it has a multiple root α ∈ L. Thus f = (x−α)m g ∈
L[x] for some m ≥ 2. By the product rule,
uf + vDf = 1
so they are coprime unless Df is identically zero. The monomials x^i are linearly
independent in K[x], so Df is the zero polynomial if and only if every coefficient
i·a_i of Df is zero. For this to happen, either a_i was zero all along, or i ∈ K is zero,
which is to say that char K divides i. That is the statement of Theorem 9.23.
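Lemma 9.26 turns separability into a gcd computation, which is easy to mechanise. The following is a minimal sketch, assuming K = F_p for a prime p and storing a polynomial as its list of coefficients [a_0, a_1, ...]; the function names are ours and are not part of the notes.

```python
def trim(f):
    """Drop zero coefficients from the high-degree end."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f, p):
    """Formal derivative Df of f = sum a_i x^i, with coefficients reduced mod p."""
    return trim([(i * a) % p for i, a in enumerate(f)][1:])

def poly_mod(f, g, p):
    """Remainder of f on division by a nonzero g in F_p[x]."""
    f, g = trim([a % p for a in f]), trim([a % p for a in g])
    inv = pow(g[-1], -1, p)            # inverse of the leading coefficient of g
    while len(f) >= len(g):
        c = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, b in enumerate(g):      # subtract c * x^shift * g, killing the top term
            f[shift + i] = (f[shift + i] - c * b) % p
        f = trim(f)
    return f

def poly_gcd(f, g, p):
    """Euclidean algorithm in F_p[x]; the result is a gcd, not normalised to be monic."""
    f, g = trim([a % p for a in f]), trim([a % p for a in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    return f

def is_separable(f, p):
    """The test of Lemma 9.26: f is separable iff gcd(f, Df) is a nonzero constant."""
    return len(poly_gcd(f, derivative(f, p), p)) == 1

# 1 + x^3 over F_3 has Df = 0, so the gcd is f itself: not separable.
cube_over_f3 = is_separable([1, 0, 0, 1], 3)
# x^2 + x + 1 over F_2 has Df = 1: separable.
quadratic_over_f2 = is_separable([1, 1, 1], 2)
```

The first example is the polynomial (1 + x)^3 ∈ F_3[x] discussed after Theorem 9.23.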
10 Galois Theory
We make a small change of language, as does the rest of the world, in honour
of Évariste Galois.
Definition 10.1. Let L/K be a field extension. The Galois group of L/K is
simply another name for the group of K-automorphisms of L. It is denoted
Gal(L/K) = Aut_K(L)
This group contains a lot of information about the structure of the extension
L/K, even though algebraically it is a much simpler object. This section gives
a first idea of its power and significance. Polynomials relate to field theory via
the splitting field as follows.
Definition 10.2. Let f ∈ K[x] be a separable polynomial. Choose a splitting
field L of f , that is, a minimal field extension K ⊂ L in which f has all its roots.
The Galois group of f is by definition the Galois group of the extension L/K,
and is denoted
Gal(f ) = AutK (L)
Since the splitting field is defined only up to isomorphism, the group Gal(f ) is
also defined only up to isomorphism.
The Galois group is an invariant of f that contains information about the
symmetries of its roots. It is the most important invariant of a polynomial we
consider. The main point to make is that this is a very simple and concrete
invariant, even though it contains a lot of information and seems hard to
calculate at first. Let me explain.
Definition 10.3. A subgroup H ⊂ Sn of the symmetric group on n symbols
{1, 2, . . . , n} is transitive if for any pair of symbols i ≠ j there is some σ ∈ H
such that σ(i) = j.
For example, Z/n ≅ H = ⟨σ⟩ ⊂ S_n with σ = (123 ··· n) is transitive:
σ^{j−i}(i) = j. By contrast, Z/2 ≅ H = ⟨(12)⟩ ⊂ S_n is not transitive for any
n ≥ 3, since no element of H can permute 3 to anything other than itself.
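Transitivity is easy to test by brute force for small n. Here is a minimal sketch, assuming permutations are stored as tuples on the symbols 0, ..., n−1 (rather than 1, ..., n) and that a subgroup is given by generators; the helper names are hypothetical.

```python
def compose(s, t):
    """Return the permutation 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens):
    """All elements of the subgroup of S_n generated by gens."""
    n = len(gens[0])
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_transitive(group):
    """Definition 10.3: every symbol can be moved to every other symbol."""
    n = len(next(iter(group)))
    return all(any(g[i] == j for g in group)
               for i in range(n) for j in range(n))

n = 5
n_cycle = tuple((i + 1) % n for i in range(n))     # plays the role of (12...n)
transposition = (1, 0) + tuple(range(2, n))        # plays the role of (12)

cyclic_is_transitive = is_transitive(generate([n_cycle]))
swap_is_transitive = is_transitive(generate([transposition]))
```

With n = 5 the cyclic group has 5 elements and is transitive, while the group generated by the transposition has 2 elements and never moves the third symbol.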
Lemma 10.4. Let f ∈ K[x] be an irreducible separable polynomial of degree n.
Then Gal(f ) is isomorphic to a transitive subgroup of the symmetric group Sn .
In explicit terms, if α1 , . . . , αn ∈ L are the roots of f in a splitting field L,
then each nontrivial element σ ∈ Gal(f ) permutes these roots in a nontrivial
way, and so permutes the indices {1, 2, . . . , n}. This gives an injective group
homomorphism Gal(f) → S_n. Furthermore, given any two indices i ≠ j there
is some σ ∈ Gal(f ) so that σ(αi ) = αj .
Proof. The explicit statement is almost the whole proof: L = K(α1 , . . . , αn ),
and as in Corollary 6.39 any K-automorphism must map roots to roots; and if
it does not permute them nontrivially, then every element of L remains fixed
and the automorphism is the identity.
The last claim follows since splitting fields L/K are normal, so the automatic
isomorphism of simple extensions K(αi ) → K(αj ) (Proposition 6.36) lifts to an
automorphism of L by Corollary 9.14.
This is wonderful. You understood subgroups of symmetric groups almost
as soon as you met permutations, and transitive subgroups are very particular
ones.
Remark 10.5. Suppose f is an irreducible cubic polynomial with integer coef-
ficients (so it is automatically separable). The lemma says that its Galois group
is a transitive subgroup of S_3. There are only two such groups: the cyclic group
⟨(123)⟩ and S_3 itself. It follows at once from the Tower Law that the two cases
correspond to: (1) adjoining a root is enough to split f ; (2) adjoining a root
leaves an irreducible quadratic. The discriminant of the cubic determines which
case you are in without further work. The symmetries of the field extension will
then assist with finding the radicals that appear in a formula for the roots.
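To make the role of the discriminant concrete, here is a minimal sketch for a depressed cubic x^3 + px + q with integer p, q, using the formula ∆ = −4p^3 − 27q^2. It relies on the standard criterion (not proved in this section) that an irreducible cubic over Q has Galois group ⟨(123)⟩ exactly when ∆ is a square in Q. Irreducibility is assumed, not checked, and the function names are ours.

```python
from math import isqrt

def cubic_discriminant(p, q):
    """Discriminant of x^3 + p*x + q."""
    return -4 * p**3 - 27 * q**2

def galois_group_of_irreducible_cubic(p, q):
    """'C3' if the discriminant is a perfect square, else 'S3'.

    Assumes x^3 + p*x + q is irreducible over Q with integer p, q;
    an integer is a square in Q exactly when it is a square in Z.
    """
    d = cubic_discriminant(p, q)
    is_square = d >= 0 and isqrt(d) ** 2 == d
    return "C3" if is_square else "S3"

# x^3 - 3x + 1 has discriminant 81 = 9^2.
case_one = galois_group_of_irreducible_cubic(-3, 1)
# x^3 - 3x - 3 has discriminant -135, which is not a square.
case_two = galois_group_of_irreducible_cubic(-3, -3)
```

In the language of the remark, the first polynomial is in case (1) and the second in case (2).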
When we come to the quintic we must analyse the transitive subgroups
of S5 . There are five, and some will permit radical solutions while others will
not. We will easily find polynomials whose Galois groups realise each of these
five transitive subgroups. (Given a transitive subgroup H ⊂ Sn , it is a hard
open problem in general to exhibit a polynomial f with integer coefficients and
Gal(f) ≅ H. That is the inverse Galois problem and is another story.)
Although the language we use did not exist 200 years ago, Galois was using
this idea explicitly, thinking of permuting roots of polynomials in the complex
numbers, and relating it with razor-sharp precision to the solvability of polyno-
mials.
a root but not all its roots. On the other hand Q(ω) is the splitting field of
x2 + x + 1 over Q: so Q(ω)/Q is a normal extension, which, as we shall see
in Proposition 10.13, relates to the fact that it is the fixed field of a normal
subgroup ⟨(123)⟩ ⊂ Aut(L).
Theorem 10.10. Let L/K be a Galois extension. Then
[L : K] = # Gal(L/K)
K ⊂ K(α) ⊂ L
We show that this map is surjective and that the fibres of the map are the (left)
cosets of Gal(L/K(α)) ⊂ Gal(L/K).
Given ι : K(α) → L, there is a K-automorphism σ of L that restricts to ι
by Corollary 9.14, since L/K is normal. Thus resα is surjective. Consider any
other τ ∈ Gal(L/K) that restricts to ι. The map τ −1 σ is the identity on K(α),
so τ −1 σ ∈ Gal(L/K(α)). In other words, τ and σ lie in the same left coset, so
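\[
\frac{\#\operatorname{Gal}(L/K)}{\#\operatorname{Gal}(L/K(\alpha))}
= \#\,\frac{\operatorname{Gal}(L/K)}{\operatorname{Gal}(L/K(\alpha))}
= \#\operatorname{Emb}_K(K(\alpha), L) \tag{16}
\]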
Step 4 Putting the different equalities together gives
# Gal(L/K) = # Gal(L/K(α)) · # EmbK (K(α), L) by (16)
= [L : K(α)][K(α) : K] by (14) and (15)
= [L : K] by the Tower Law
as claimed.
You could think of Step 2 as a central point of the theory: a dimension in
linear algebra is equal to a certain number of field maps; that is, the degree of
a minimal polynomial has two quite different interpretations. That is also the
point where we use separability in an essential way.
Corollary 10.11. Let L/K be a Galois extension. Then the fixed field L^G of
G = Gal(L/K) is L^G = K.
Proof. We have K ⊂ L^G ⊂ L, with [L : K] = #G = [L : L^G] by Theorems 10.10
and 8.6, so L^G = K by the Tower Law.
Remark 10.12. There is a curious point in Step 3 of the proof of Theorem 10.10.
The subgroup Gal(L/K(α)) ⊂ Gal(L/K) is not always a normal
subgroup, and so we did not identify the set of cosets Gal(L/K)/ Gal(L/K(α))
as a group: fortunately we only needed to count it, and as there is a
natural set in bijection with it, we could count that instead. We gain something
by understanding this point a little better (compare with Example 10.9).
Proposition 10.13. Let L/K be a Galois extension and K ⊂ M ⊂ L an
intermediate field. Then M/K is a normal extension if and only if Gal(L/M ) ⊂
Gal(L/K) is a normal subgroup.
This is one of those proofs that is a bit fiddly to read, and is actually rather
easier to prove yourself: take it calmly, write down what you need to check for
each of the implications, and then note that the tools we already have confirm
all the points you mentioned: there is nothing new in this proof.
Proof. Let G = Gal(L/K) and H = Gal(L/M).
If M/K is normal, then σ(M) = M for any σ ∈ G by Proposition 9.17(b).
Suppose h ∈ H and σ ∈ G. For any α ∈ M, let β = σ^{-1}(α) ∈ M so that
σhσ^{-1}(α) = σh(β) = σ(β) = α
Thus σhσ^{-1} ∈ G fixes M, whence σhσ^{-1} ∈ H. That is, H ⊂ G is normal.
Conversely, suppose H ⊂ G is normal. Let α ∈ M with minimal polynomial
g ∈ K[x]; we must prove that g splits completely in M. If β is any root of g,
then there is some σ ∈ G with β = σ(α) by Corollary 9.16. For any h ∈ H,
certainly h(α) = α, so
σhσ^{-1}(β) = σh(α) = σ(α) = β
By normality, σHσ^{-1} = H, so β is fixed by every element of H, and so β ∈ L^H.
Since L/M is Galois (Lemma 10.8), Corollary 10.11 shows that β ∈ L^H = M,
and so g splits over M, and M/K is normal.
10.2 Fixed fields and stabilisers
This section has almost no content, which makes it hard to absorb, but it does
set up the language; keep in mind the worked example in §8.3.
Look at the lattice of subfields of Q(α, ω) in Figure 6 and the lattice of
subgroups of S3 in Figure 5 in §8.3, and then consider the following ridiculous
pronouncement: a lattice is a collection of vertices (usually labelled by some-
thing) with a collection of directed lines joining some pairs of vertices (usually
indicating some relation between the vertex labels). You could turn that into
something precise and general, but the informal idea of the two pictures is the
powerful idea we need, and the algebra will carry all the precision.
GG = {H ⊂ G | H is a subgroup of G}
Definition 10.14. The stabiliser of α ∈ L in G is
The stabiliser of A ⊂ L in G is
† : GG −→ FL/K
∗ : FL/K −→ GG
The two maps † and ∗ are order-reversing. The following points are immedi-
ate from the definition, and are famous for being easier and quicker to explain
to yourself than to read someone else’s attempt at a proof.
Lemma 10.16. Let L/K be a field extension and G = Gal(L/K).
[L : M] = #M^* and [M : K] = #Gal(L/K)/#M^*
(c) M/K is a normal extension if and only if M^* ⊂ Gal(L/K) is a normal
subgroup. In this case Gal(M/K) ≅ Gal(L/K)/M^*.
Remark 10.18. By definition, M ∗ = Gal(L/M ), but the statement is about
the map ∗ so we use that notation. In (b), we could write Q = Gal(L/K)/M ∗
for the set of left cosets, as usual, and then [M : K] = #Q, but it feels safer to
avoid the quotient notation until (c) where we have normality so the quotient
is in fact a group.
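[Figure: the Galois correspondence. The fields K ⊂ M ⊂ L on the left correspond under † and ∗ to the subgroups G = Gal(L/K) ⊃ M^* = {σ ∈ G | σ|_M = id_M} ⊃ {id} on the right. The degree [L : M] equals #M^* and [L : K] = #G; M/K is normal exactly when M^* ⊂ G is normal, and then Gal(M/K) ≅ G/M^*.]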
11 Biquadratic extensions
Pick a field, any field, call it K, but be sure that char K ≠ 2. Let a, b ∈ K
with b(a^2 − b) ≠ 0 and b ∈ K not a square in K. Let
f = (x^2 − a)^2 − b = x^4 − 2ax^2 + (a^2 − b)
and L/K be a splitting field of f. Such L/K are called biquadratic extensions.
Given such an f we would know how to find its roots: taking a square root
(by making a quadratic extension of K)
x^2 − a = ±β where β^2 = b
and then after rearranging we take further square roots (possibly making more
quadratic extensions) to solve
x^2 = a + β and x^2 = a − β
and L = K(α, α′). (It’s reassuring to check that f(±α) = f(±α′) = 0.) You
can check that α, −α, α′, −α′ ∈ L are four distinct elements (char K ≠ 2) so
that L/K is a Galois extension: indeed deg f = 4 and f has 4 distinct roots in
a splitting field. (Alternatively, perhaps more simply, Df = 4x(x^2 − a) has no
common factor with f, since b(a^2 − b) ≠ 0, so we may apply Lemma 9.26.)
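The reassuring check is quick to do numerically. The following is a minimal sketch over K = Q ⊂ C with floating-point arithmetic, assuming f = (x^2 − a)^2 − b (the form forced by x^2 − a = ±β and Df = 4x(x^2 − a)) and taking α, α′ to be square roots of a + β and a − β; the sample values a = 1, b = 2 are an arbitrary choice.

```python
import cmath

def biquadratic_roots(a, b):
    """Return (beta, alpha, alpha_prime) with
    beta^2 = b, alpha^2 = a + beta, alpha_prime^2 = a - beta."""
    beta = cmath.sqrt(b)
    alpha = cmath.sqrt(a + beta)
    alpha_prime = cmath.sqrt(a - beta)
    return beta, alpha, alpha_prime

def f(x, a, b):
    """Assumed form of the biquadratic polynomial."""
    return (x * x - a) ** 2 - b

a, b = 1, 2                       # sample values with b(a^2 - b) != 0
beta, alpha, alpha_prime = biquadratic_roots(a, b)
roots = [alpha, -alpha, alpha_prime, -alpha_prime]

# |f(root)| should vanish up to rounding error.
residuals = [abs(f(r, a, b)) for r in roots]

# Two products that decide the later case distinctions:
# (alpha * alpha')^2 = a^2 - b and (beta * alpha * alpha')^2 = b(a^2 - b).
product_sq = (alpha * alpha_prime) ** 2
triple_sq = (beta * alpha * alpha_prime) ** 2
```

The last two quantities are the ones whose square roots govern whether [L : K] is 4 or 8.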
We can see L/K in a tower
          L = K(α, α′)
      K(α)          K(α′)
             K(β)
              K                    (17)
[L : K] = 2, 4 or 8
(a) If a^2 − b is not a square in K, then [K(α) : K] = [K(α′) : K] = 4.
(b) If neither a^2 − b nor b(a^2 − b) is a square in K, then [L : K] = 8.
Recall the notation: β^2 = b, α^2 = a + β and (α′)^2 = a − β.
a − β = (α′)^2 = (c^2 + d^2(a + β)) + 2cdα
whole first bracketed expression and the coefficient 2cd both lie in K(β).
(b) Any σ ∈ Gal(L/K) can only map β, α, α′ in one of the following 8 ways:
            1     2     3     4     5     6     7     8
  σ(β)      β     β     β     β    −β    −β    −β    −β
  σ(α)      α     α    −α    −α     α′    α′   −α′   −α′
  σ(α′)     α′   −α′    α′   −α′    α    −α     α    −α
11.1 [L : K] = 8 ⟹ Gal(L/K) ≅ D_8
In this case, each extension in the tower (17) is of degree 2, so Lemma 11.3
applies. Note that it merely presents the possible outcomes for how σ acts on
β, α, α′; we do not know which of these 8 possibilities actually arise as
automorphisms. But by Theorem 10.10, # Gal(L/K) = [L : K] = 8, and so we know at
once that they must all arise as automorphisms.
Selecting possibilities 6 and 2, write
σ: β ↦ −β, α ↦ α′, α′ ↦ −α and τ: β ↦ β, α ↦ α, α′ ↦ −α′
Gal(L/K) ≅ D_8
the dihedral group with 8 elements, which is also the symmetries of the square:
you can even see that pictorially by its effect on the four roots of f as
      α′ = 2 •-----• 1 = α
             |     |
      −α = 3 •-----• 4 = −α′          (18)
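The claim that σ and τ generate a dihedral group of order 8 can be checked mechanically from their action on the four roots. This is a minimal sketch, assuming the roots are indexed 0, 1, 2, 3 for α, α′, −α, −α′ and that σ, τ act as in possibilities 6 and 2 of the table; the helper names are ours.

```python
def compose(s, t):
    """Return the permutation 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens):
    """All elements of the group generated by gens."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

# Root indices: 0 = alpha, 1 = alpha', 2 = -alpha, 3 = -alpha'.
identity = (0, 1, 2, 3)
sigma = (1, 2, 3, 0)   # alpha -> alpha' -> -alpha -> -alpha' -> alpha
tau = (0, 3, 2, 1)     # fixes alpha and -alpha, swaps alpha' and -alpha'

group = generate([sigma, tau])

sigma2 = compose(sigma, sigma)
sigma3 = compose(sigma, sigma2)
sigma4 = compose(sigma, sigma3)

# Dihedral relations: sigma^4 = tau^2 = id and tau sigma tau = sigma^{-1}.
relations_hold = (
    sigma4 == identity
    and compose(tau, tau) == identity
    and compose(tau, compose(sigma, tau)) == sigma3
)
is_abelian = compose(sigma, tau) == compose(tau, sigma)
```

The group has 8 elements, satisfies the dihedral relations, and is not abelian.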
Either that picture or the representation σ = (1234), τ = (24) ∈ D8 ⊂ S4
by permutations makes it easy to work out the subgroup lattice of Gal(L/K);
see Figure 8 for the picture. It may help to list the 8 elements by geometrical
and cycle type: other than the identity, they are
We can use this to work out the subfield lattice under the Galois correspondence.
The lattice picture is the same (that’s the main theorem) but we should
identify the subfields in more concrete terms than the subfield corresponding to
each H ⊂ G is H^† = L^H. Which elements of L generate H^†?
First τ(α) = α so K(α) ⊂ L^τ; but both have degree 2 inside L, so they must
be equal. Similarly σ^2τ(α′) = α′, so L^{σ^2τ} = K(α′). Those two fields intersect
in the common subfield K(β) of degree 2, so that places K(β) in the diagram.
(You can check: K(β) = L^{⟨σ^2, τ⟩} as its position indicates.)
How do we find the other fields? In a sense it’s simple: for each group
in the subgroup lattice, we must find functions of α and α′ that are invariant
under its permutations. Fortunately, we can do that here by trial and error.
It’s worth remembering, though, that L is 8 dimensional, so these subfields
are infinitesimally small transparent pins in a multidimensional hyper-haystack:
random expressions η = η(α, α′) are unlikely to lie in any strict subfield of L,
and so K(η) = L, which is great in itself but doesn’t solve the problem.
[Figure 8: subfield lattice of L/K alongside the subgroup lattice of D_8, with L ↔ ⟨id⟩ and K ↔ D_8.]
and to note that σ and τ act on these as follows:
σ: γ ↦ −γ, δ ↦ −δ′, δ′ ↦ δ and τ: γ ↦ −γ, δ ↦ δ′, δ′ ↦ δ
στσ^{-1} = σ^2τ, so σH_1σ^{-1} = H_2
= y^4 − 4ay^2 + 4b
= (y^2 − 2a)^2 − 4(a^2 − b) where δ^2 + (δ′)^2 = 4a and γ^2 = a^2 − b
The biquadratic polynomial g ∈ K[y] was there all the time, and we could
have worked with it instead of or as well as f . It is called the companion
polynomial of f , and has the same biquadratic splitting field.
11.2 √(b(a^2 − b)) ∈ K ⟹ [L : K] = 4, Gal(L/K) ≅ C_4
First, (βαα′)^2 = b(a^2 − b) so βαα′ ∈ K by the case assumption. Therefore
α′ ∈ K(α, β) = K(α), whence L = K(α) = K(α′). Since b is not a square, also
a^2 − b is not a square in K, so [L : K] = 4 by Lemma 11.1(a).
We first show that f is irreducible over K. It has no roots in K, so the
only thing to check is that f is not the product of two irreducible quadratic
polynomials. We check all pairwise products of linear factors: for example if
(x − α)(x − α′) = x^2 − (α + α′)x + αα′
[Figure 9: Galois correspondence for [L : K] = 4 with √(b(a^2 − b)) ∈ K, with L ↔ ⟨id⟩, K(β) ↔ ⟨σ^2⟩ and K ↔ C_4.]
11.3 √(a^2 − b) ∈ K and [L : K] = 4 ⟹ Gal(L/K) ≅ C_2 × C_2
First, (αα′)^2 = a^2 − b so αα′ ∈ K by the case assumption so α′ ∈ K(α), whence
L = K(α) = K(α′). By Lemma 11.3, we know the minimal polynomials and can
use them to build automorphisms by hand. We must exhibit 4 automorphisms,
since # Gal(L/K) = [L : K] = 4 by the Galois Correspondence.
There is an element ν ∈ Gal(L/K(β)) that exchanges the roots of q(x): that
is, ν(α) = −α and ν(β) = β. Since αα′ ∈ K is fixed by ν, also ν(α′) = −α′.
There is an element of Gal(K(β)/K) that maps β to −β. This must lift to
an element ρ ∈ Gal(L/K). This automorphism takes α to one of the roots of
ρ(q(x)) = q′(x). If ρ(α) = α′, then we have 4 distinct automorphisms:
id, ν, ρ, νρ ∈ Gal(L/K)
where νρ(α) = −α′ so is distinct from the previous three. If instead ρ(α) = −α′,
then we still get the same 4 automorphisms. So fix notation with ρ(α) = α′.
Each of ν, ρ, νρ is an involution, ν^2 = id etc., and they all commute, so
[Figure 10: Galois correspondence for [L : K] = 4 with √(a^2 − b) ∈ K, with L ↔ ⟨id⟩ and K ↔ C_2 × C_2.]
f = x^4 − 2
is irreducible. Since none of b, a^2 − b = 2 and b(a^2 − b) = −2 is a square in Q,
we have Gal(L/Q) = D_8 by Lemma 11.1 and §11.1.
Example 11.7 (a = 2, b = 2). The biquadratic polynomial
f = x^4 + 4 = (x^2 + 2x + 2)(x^2 − 2x + 2)
12 Finite fields
The first finite fields we learn are
F_p = Z/(p) ≅ {0, 1, 2, . . . , p − 1}
α^2 = α + 1,   α · (α + 1) = 1,   (α + 1)^2 = α
noting that 2 = 0 and −1 = 1 in F2 . You can have fun with bigger finite fields
and bigger irreducible polynomials.
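If you want to have that fun on a computer, here is a minimal sketch of the field with 4 elements, assuming its elements are written c_0 + c_1·α with c_0, c_1 ∈ F_2 and stored as pairs (c_0, c_1), and using only the relation α^2 = α + 1 displayed above. The names are ours.

```python
def add(u, v):
    """Add two elements c0 + c1*alpha, coefficients mod 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Multiply (a + b*alpha)(c + d*alpha) using alpha^2 = alpha + 1.

    ac + (ad + bc)*alpha + bd*alpha^2 = (ac + bd) + (ad + bc + bd)*alpha.
    """
    a, b = u
    c, d = v
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

ZERO, ONE, ALPHA = (0, 0), (1, 0), (0, 1)
ALPHA_PLUS_ONE = add(ALPHA, ONE)

alpha_squared = mul(ALPHA, ALPHA)
alpha_times_next = mul(ALPHA, ALPHA_PLUS_ONE)
next_squared = mul(ALPHA_PLUS_ONE, ALPHA_PLUS_ONE)
```

The three products reproduce the three displayed identities, and every nonzero element has an inverse, as it must in a field.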
M = {α ∈ L | α^q = α}
β ∈ N^*. Multiplying by β, and noticing that the following equation also holds
for β = 0, we have
β^q = β for all β ∈ N
That is, f splits completely in N, and since #N is the number of roots of f
in any splitting field, N is a splitting field for f over F_p. Thus N ≅ L by the
uniqueness of splitting fields, Corollary 9.7.
ϕ_p : K → K
α ↦ α^p
Proof. (1) Write ϕ = ϕ_p to simplify notation. We just check the rules. First
ϕ(1) = 1^p = 1 and
\[
\begin{aligned}
\varphi(\alpha+\beta) &= (\alpha+\beta)^p \\
&= \alpha^p + p\alpha^{p-1}\beta + \tfrac{p(p-1)}{2}\,\alpha^{p-2}\beta^2 + \cdots + p\alpha\beta^{p-1} + \beta^p \\
&= \alpha^p + \beta^p = \varphi(\alpha) + \varphi(\beta)
\end{aligned}
\]
where (as we’ve said before) all the middle terms · · · of the binomial expansion
vanish because their coefficients include a factor of p.
(2) The set of elements on which ϕ is the identity is certainly a subfield
of K, since ϕ is a homomorphism, and so it contains the prime subfield F_p ⊂ K.
But we may also interpret these elements as the roots in K of the polynomial
x^p − x ∈ F_p[x]. That is not the zero polynomial, so it has at most p roots. The
elements of F_p already provide p roots, so they must be all of them.
(3) Finally, if K is finite then it is a finite-dimensional vector space over F_p
and ϕ : K → K is an injective F_p-linear map, so it is an isomorphism of
F_p-vector spaces. In particular, it is a bijection, and we just saw it fixes F_p, so it
is an F_p-automorphism as claimed.
Now that we know it is an automorphism of the given K, we may speak of
its fixed field, but that is just another way of speaking of the elements on which
ϕ is the identity, which was checked in (2).
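The key step in part (1) is that p divides every middle binomial coefficient. Here is a minimal sketch that tests this divisibility directly; the function name is ours. It also shows that primality matters: for composite n some middle coefficient is not divisible by n.

```python
from math import comb

def middle_binomials_vanish(n):
    """True if n divides C(n, k) for every 0 < k < n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

primes_ok = [middle_binomials_vanish(p) for p in (2, 3, 5, 7, 11, 13)]

# C(4, 2) = 6 is not divisible by 4, so x -> x^4 is not additive on Z/(4).
composite_fails = middle_binomials_vanish(4)
```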
Gal(K/F_p) = ⟨ϕ_p⟩ ≅ Z/(n)
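[Figure 11: subfield lattice of L = F_{5^12} and subgroup lattice of Gal(L/F_5), with ϕ = ϕ_5]
  F_{5^12} = L            {id}
  F_{5^6} = L^{ϕ^6}       ⟨ϕ^6⟩ ≅ Z/(2)
  F_{5^4} = L^{ϕ^4}       ⟨ϕ^4⟩ ≅ Z/(3)
  F_{5^3} = L^{ϕ^3}       ⟨ϕ^3⟩ ≅ Z/(4)
  F_{5^2} = L^{ϕ^2}       ⟨ϕ^2⟩ ≅ Z/(6)
  F_5 = L^ϕ               Gal(L/F_5) = ⟨ϕ⟩ ≅ Z/(12)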
Proof. The extension K/F_p is Galois by Theorem 12.2, so
# Gal(K/F_p) = [K : F_p] = n (20)
α^{p^m} = ϕ_p^m(α) = α
and so g = x^{p^m} − x has p^n solutions in K. So p^m = deg g ≥ p^n, whence
m = n.
Remark 12.6. It follows almost at once that K = F_{p^m} is a subfield of L = F_{p^n}
if and only if m divides n, and in that case
K = L^{ϕ_p^m},   [L : K] = n/m,   Gal(L/K) = ⟨ϕ_p^m⟩ ≅ Z/(n/m)
For example, there is a unique field L = F_{5^12} of order 244,140,625 = 5^12, a
Galois extension of F_5. Switching on Galois Theory, we easily calculate the
subgroup lattice of Gal(L/F_5) ≅ Z/(12) and deduce the subfield lattice in Figure 11.
Every field extension in this picture is normal, so we can calculate the
Galois group of each extension as the corresponding quotient of stabilisers.
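The bookkeeping in Remark 12.6 is pure arithmetic with divisors, so it is easy to tabulate. Here is a minimal sketch for L = F_{p^n}; the function name and the tuple layout are our own choices.

```python
def subfield_lattice(p, n):
    """One row per subfield K = F_{p^m} of L = F_{p^n}, for each divisor m of n:
    (m, order of K, [L : K] = n/m = order of Gal(L/K))."""
    return [(m, p**m, n // m) for m in range(1, n + 1) if n % m == 0]

rows = subfield_lattice(5, 12)
degrees_over_f5 = [m for m, _, _ in rows]
order_of_L = rows[-1][1]
```

For p = 5 and n = 12 this lists the six subfields F_5, F_{5^2}, F_{5^3}, F_{5^4}, F_{5^6}, F_{5^12}, with Gal(L/K) cyclic of order 12, 6, 4, 3, 2, 1 respectively.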
13 Radical solutions of polynomials
We work with subfields of C in this chapter. That is unnecessarily restrictive,
but it allows us to go quickly to the main point. All definitions are valid over
any field. We discuss when a polynomial is soluble by radicals, which loosely
means its roots have expressions using nth roots; see Definition 13.6.
All polynomials of degree ≤ 4 are soluble by radicals by the usual quadratic
formula, Cardano’s formula §3.2(8), and a more complicated formula in degree 4
that we haven’t discussed.
Galois himself understood precisely what is going on. A group is soluble if
you can find a path through its subgroup lattice of normal subgroups H_i ⊂ H_{i+1}
of prime index; see Definition 13.8.
Theorem 13.1. Let K ⊂ C and f ∈ K[x] be an irreducible polynomial with
splitting field L/K. If f is soluble by radicals then Gal(L/K) is a soluble group.
In fact, the converse implication also holds; it is not hard, but we have simply run
out of time. What we have stated will give us enough control to exhibit quintics
that cannot be solved by radicals.
no x^2 term) so there is no need for Tschirnhausen and we have p = q = −3 in
the notation of §3.2. Then Cardano’s formula specifies the roots of f. First we
calculate the discriminant, ∆ = −27D = −4p^3 − 27q^2 = −27 × 5. Since ∆ < 0
we know there will be one real root and a complex conjugate pair.
Fix a square root α = √D = √5 and two complex numbers
\[
\beta = \sqrt[3]{\frac{-q+\alpha}{2}} = \sqrt[3]{\frac{3+\sqrt{5}}{2}}
\qquad
\gamma = \sqrt[3]{\frac{-q-\alpha}{2}} = \sqrt[3]{\frac{3-\sqrt{5}}{2}}
\]
subject to the constraint βγ = −p/3 on the choice of cube roots. Since D > 0
we may choose β, γ ∈ R so the constraint is satisfied automatically and γ = 1/β.
The three distinct complex roots of f are then
\[
\alpha_i = \omega^i\beta + \omega^{3-i}\gamma
= \omega^i\sqrt[3]{\frac{3+\sqrt{5}}{2}} + \omega^{3-i}\sqrt[3]{\frac{3-\sqrt{5}}{2}}
\qquad i = 0, 1, 2
\]
Note that α_0 ∈ R while α_1 = \bar{α}_2 is strictly complex. Those expressions show at
once that the splitting field L = Q(α_0, α_1, α_2) of f is contained in the following
radical extension M/Q:
F_0 = Q ⊂ F_1 = Q(√5) ⊂ F_2 = Q(β) ⊂ F_3 = Q(β, ω) = M
since (√5)^2 ∈ F_0, β^3 = (3 + √5)/2 ∈ F_1 and ω^3 = 1 ∈ F_2. It is clear that
the splitting field L ⊂ M, but the radical tower shows that [M : Q] = 12, so
L ≠ M. (In fact, none of the radicals we used to make the extension lie in L;
they were merely ingredients to cook up the roots α_i.)
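Cardano’s recipe in this example is easy to confirm numerically. The following is a minimal floating-point sketch, assuming the cubic is f = x^3 + px + q with p = q = −3 (the polynomial itself is our reading of "p = q = −3 in the notation of §3.2") and choosing the real cube roots for β and γ as in the text.

```python
import cmath
import math

p, q = -3, -3
delta = -4 * p**3 - 27 * q**2          # discriminant, equal to -27 * D
D = -delta / 27                        # D = 5 here

# Real cube roots, so that beta * gamma = -p/3 = 1 holds automatically.
beta = ((-q + math.sqrt(D)) / 2) ** (1 / 3)
gamma = ((-q - math.sqrt(D)) / 2) ** (1 / 3)

omega = cmath.exp(2j * math.pi / 3)    # primitive cube root of unity
roots = [omega**i * beta + omega**(3 - i) * gamma for i in range(3)]

def f(x):
    """The assumed cubic x^3 + p*x + q."""
    return x**3 + p * x + q

residuals = [abs(f(r)) for r in roots]
```

Up to rounding, roots[0] is real and roots[1], roots[2] are complex conjugates of one another, matching ∆ < 0.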
Example 13.5. Let f = x^3 − 3x + 1 ∈ Q[x]. (This is the case p = −3, q = 1.
It is irreducible by Eisenstein applied to f(y − 1) = y^3 − 3y^2 + 3.)
The discriminant of f is ∆ = −27D = −27 × (−3), so Cardano’s formula
computes the roots of f as:
\[
\alpha_i = \omega^i\beta + \omega^{3-i}\gamma
= \omega^i\sqrt[3]{\frac{-1+i\sqrt{3}}{2}} + \omega^{3-i}\sqrt[3]{\frac{-1-i\sqrt{3}}{2}}
\qquad i = 0, 1, 2
\]
Despite appearances, all α_i ∈ R, since ∆ > 0. Take ω = (−1 + i√3)/2 so
β = ∛ω is a primitive 9th root of unity and γ = 1/β = \bar{β} (to satisfy the
constraint βγ = −p/3), and then
α_0 = β + \bar{β},   α_1 = β^4 + \bar{β}^4,   α_2 = β^7 + \bar{β}^7
are visibly real. Nevertheless, the radical extension M/Q we used for the formula
certainly involved complex radicals: we constructed it as
Q ⊂ Q(√−3) ⊂ Q(√−3, β) = Q(√−3, β, ω) = M
and then recognised it more simply as Q ⊂ M = Q(⁹√1). One can prove that
the splitting field L ⊂ R of f is not a subfield of a real radical extension M ⊂ R.
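The description of the roots of x^3 − 3x + 1 through 9th roots of unity can be checked numerically. This is a minimal floating-point sketch, assuming β = exp(2πi/9) as the chosen cube root of ω; the variable names are ours.

```python
import cmath
import math

omega = complex(-1, math.sqrt(3)) / 2          # (-1 + i*sqrt(3)) / 2
beta = cmath.exp(2j * math.pi / 9)             # a primitive 9th root of unity
beta_bar = beta.conjugate()                    # equals 1/beta

# alpha_0, alpha_1, alpha_2 as sums of a power of beta and its conjugate.
roots = [beta**k + beta_bar**k for k in (1, 4, 7)]

def f(x):
    return x**3 - 3 * x + 1

residuals = [abs(f(r)) for r in roots]
imaginary_parts = [abs(r.imag) for r in roots]
```

Each root is 2cos(2πk/9) for k = 1, 4, 7, which is why the imaginary parts vanish.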
Definition 13.6. A field extension L/K is soluble if L is contained in a field
M with M/K radical. A polynomial f ∈ K[x] is soluble by radicals if its
splitting field L is soluble.
Thus, if f is soluble by radicals, every element of its splitting field has an
expression that involves only nth roots, field operations, and elements of K. In
particular, that is true of the roots of f . That’s not quite the same thing as
saying what the formula for the roots is, but it’s a good start.
We end this section by observing that normal closures of radical extensions
are still radical.
Proposition 13.7. Suppose char K = 0. If L/K is a radical extension, then
there is a finite extension M/L so that M/K is both radical and Galois.
Proof. Let M/L be the normal closure of L determined by Corollary 9.12. We
use the notation of Definition 13.2: K = F0 ⊂ · · · ⊂ Fs = L ⊂ M . The idea
is simple: inside M , work through the steps we would use to build a normal
closure (which M already is), and check that the extensions we build at each
step are radical.
At each step F_i = F_{i−1}(α_i) defining L, let m_i be the minimal polynomial
of α_i over K. (Note: minimal polynomial over K: so this is not necessarily a
factor of x^{n_i} − a_i, since a_i ∈ F_{i−1} might not be in K.) Since m_i is irreducible,
for any root β_i there is some σ ∈ Gal(M/K) with σ(α_i) = β_i by Corollary 9.16.
Thus β_i is a root of x^{n_i} − σ(a_i).
We work inductively through increasing i = 1, 2, . . . , s. When i = 1, we make
radical extensions by roots of m_1, each of which satisfies x^{n_1} − a_1 since a_1 ∈ K
is fixed by any element of Gal(M/K). That constructs a radical extension
F_0 ⊂ F_1 ⊂ F̃_1 that is also normal, since it is the splitting field for m_1 inside M.
For i = 2, we make radical extensions by roots of m_2: indeed, each such root
satisfies x^{n_2} − σ(a_2) for some element σ ∈ Gal(M/K), and since a_2 ∈ F_1 and
F̃_1/K is normal, each σ(a_2) ∈ F̃_1, by Proposition 9.17, so such roots are indeed
radicals over F̃_1.
Continuing through i = 1, 2, . . . , s creates a radical extension
K ⊂ F̃_1 ⊂ F̃_2 ⊂ · · · ⊂ F̃_s = M′ ⊃ F_s = L
{id} = G0 ⊂ G1 ⊂ · · · ⊂ Gs = G (21)
where Gi ⊂ Gi+1 is a normal subgroup with abelian quotient Gi+1 /Gi for each
i = 0, 1, . . . , s − 1.
A chain of normal subgroups as in (21) is called a subnormal series, and
when the factors Gi+1 /Gi are all abelian it is called a soluble series.
Remark 13.9. When G is finite and soluble, we can find a soluble series in
which the quotients Gi+1 /Gi are of prime order, and so in particular are cyclic.
Example 13.10. Abelian groups are soluble: {id} ⊂ G is a suitable soluble
series. The symmetric group S3 is soluble: {id} ⊂ C3 ⊂ S3 is a suitable
subnormal series. You can check that S4 is soluble too, as indeed are any of its
subgroups such as D8 , the symmetries of the square.
On the other hand, A_5 and S_5 are not soluble. In fact, A_5 is simple (it
has no nontrivial proper normal subgroups at all), and it’s easy to see that the only
nontrivial proper normal subgroup of S_5 is A_5, so there’s no chance of making a
soluble series for S_5 either.
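The contrast between S_3, S_4 and S_5 can be seen by brute force. The sketch below does not build a soluble series as in the definition; instead it uses a standard equivalent test that is not developed in these notes: a finite group is soluble exactly when repeatedly passing to the subgroup generated by commutators eventually reaches the trivial group. Permutations are tuples on 0, ..., n−1 and all helper names are ours.

```python
from itertools import permutations

def compose(s, t):
    """Return the permutation 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def generate(gens, n):
    """All elements of the subgroup of S_n generated by gens."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def commutator_subgroup(group, n):
    """Subgroup generated by all commutators g h g^-1 h^-1."""
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group for h in group
    }
    return generate(list(commutators), n)

def derived_series_orders(n):
    """Orders of S_n and its iterated commutator subgroups, until they stabilise."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        smaller = commutator_subgroup(group, n)
        if len(smaller) == len(group):
            return orders
        group = smaller
        orders.append(len(group))

orders_s3 = derived_series_orders(3)
orders_s4 = derived_series_orders(4)
orders_s5 = derived_series_orders(5)
```

For S_3 and S_4 the orders drop all the way to 1, while for S_5 they stall at 60, the order of A_5.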
Without using simplicity, we can see the insolubility of A_5 quite explicitly.
For g, h in a group G, we define their commutator [g, h] = ghg^{-1}h^{-1}. Every
element of A_5 is a permutation of the form (ijk), (ij)(kℓ) or (ijkℓm) with
i, j, k, ℓ, m ∈ {1, 2, 3, 4, 5} distinct. The following formulas show that every
element of A_5 is a commutator:
(ijk) = (ijℓ)(ikm)(iℓj)(imk)
(ij)(kℓ) = (ijk)(ijℓ)(ikj)(iℓj)
(ijkℓm) = (ij)(km)(imℓ)(ij)(km)(iℓm)
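These three identities can be verified by machine for every choice of distinct i, j, k, ℓ, m. The following is a minimal sketch, assuming products of permutations are read right to left (the rightmost factor acts first); with that convention all three formulas check out. The helper names are ours.

```python
from itertools import permutations

def cycle(*points):
    """The cycle (p1 p2 ... pk) on {1, ..., 5}, stored as a dict x -> image."""
    perm = {x: x for x in range(1, 6)}
    for a, b in zip(points, points[1:] + points[:1]):
        perm[a] = b
    return perm

def product_of(*perms):
    """Compose right to left: the rightmost permutation acts first."""
    result = {x: x for x in range(1, 6)}
    for perm in reversed(perms):
        result = {x: perm[result[x]] for x in result}
    return result

def formulas_hold(i, j, k, l, m):
    three_cycle = product_of(
        cycle(i, j, l), cycle(i, k, m), cycle(i, l, j), cycle(i, m, k)
    ) == cycle(i, j, k)
    double_transposition = product_of(
        cycle(i, j, k), cycle(i, j, l), cycle(i, k, j), cycle(i, l, j)
    ) == product_of(cycle(i, j), cycle(k, l))
    five_cycle = product_of(
        cycle(i, j), cycle(k, m), cycle(i, m, l),
        cycle(i, j), cycle(k, m), cycle(i, l, m)
    ) == cycle(i, j, k, l, m)
    return three_cycle and double_transposition and five_cycle

# Try every assignment of distinct labels i, j, k, l, m from {1, ..., 5}.
all_ok = all(formulas_hold(*labels) for labels in permutations(range(1, 6)))
```

In each formula the four (or six) factors have the shape g, h, g^{-1}, h^{-1}, so the left-hand side really is the commutator [g, h].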
Step 1: high-school calculus. Observe that f has 3 distinct real roots and
a complex conjugate pair. Indeed f′ = 5(x^4 − 2) shows that f has two real turning
points at ±⁴√2, and then f″ = 20x^3 shows they are a positive local maximum at −⁴√2
and a negative local minimum at +⁴√2, with the function racing monotonically
to ∓∞ outside these values.
Name these 5 roots of f as α_1, α_2, α_3 ∈ R and β, β̄ ∈ C \ R, in that order so
L = Q(α_1, α_2, α_3, β, β̄) and we may treat Gal(L/Q) as a subgroup of S_5.
Step 2: compute the Galois group. The set of roots that generate L over
Q is permuted (a little) under complex conjugation σ, so σ ∈ Gal(L/Q) and it
is the permutation (45) that fixes the α_i and exchanges β and β̄.
Now L/Q is Galois, so # Gal(L/Q) = [L : Q], and that degree is divisible
by 5, by the Tower Law since Q ⊂ Q(α1 ) ⊂ L and [Q(α1 ) : Q] = 5 as f is
irreducible. Any subgroup of S5 with order divisible by 2 × 5 contains a 5-cycle.
(Or you can use Sylow’s first theorem as a sledgehammer.)
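Step 2 produces a transposition and a 5-cycle inside Gal(L/Q), and the group-theoretic fact needed next is that such a pair generates all of S_5. Here is a minimal brute-force sketch of that fact, assuming permutations are tuples on the symbols 0, ..., 4 and taking the transposition of the last two symbols to play the role of (45); it tries every one of the 24 five-cycles. The helper names are ours.

```python
from itertools import permutations

def compose(s, t):
    """Return the permutation 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens):
    """All elements of the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def five_cycles():
    """All 24 five-cycles on {0, ..., 4}, each written as (0 r1 r2 r3 r4)."""
    cycles = []
    for rest in permutations(range(1, 5)):
        order = (0,) + rest
        perm = [0] * 5
        for a, b in zip(order, order[1:] + order[:1]):
            perm[a] = b
        cycles.append(tuple(perm))
    return cycles

transposition = (0, 1, 2, 4, 3)        # swaps the last two symbols
generated_orders = {len(generate([transposition, c])) for c in five_cycles()}
```

Every choice of 5-cycle gives a group of order 120 = #S_5.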
It is well known that S5 is generated by any 2-cycle and any 5-cycle together.
We can quickly see why: if σ = (45) and τ = (12345) then
στ(ζ) = (ζ^s)^r = ζ^{rs} = (ζ^r)^s = τσ(ζ)
and στ = τσ as claimed.
Lemma 13.13. Let K ⊂ C be a subfield with ζ ∈ K, where ζ is a primitive
pth root of unity for a prime p. If α ∈ C satisfies a = α^p ∈ K, then K(α)/K is
Galois and Gal(K(α)/K) is abelian.
The idea is that α = ᵖ√a is a choice of pth root of a. The proof is virtually
identical to the previous one.
Proof. The polynomial f = x^p − a ∈ K[x] splits completely in K(α) with p
roots ζ^i α for i = 0, 1, . . . , p − 1. Thus K(α)/K is Galois.
Since the minimal polynomial m ∈ K[x] of α over K divides f, it too splits
completely over K(α) with roots a subset of the roots of f.
Any elements σ, τ ∈ Gal(K(α)/K) are determined by σ(α) and τ(α), which
are roots of m, say ζ^r α and ζ^s α respectively. Since ζ ∈ K is fixed by σ and τ,
στ(α) = σ(ζ^s α) = ζ^s σ(α) = ζ^{r+s} α = τσ(α)
and στ = τσ as claimed.
Corollary 13.14. Let K ⊂ C be a subfield and ζ ∈ C a primitive pth root of
unity for a prime p. If α ∈ C satisfies a = α^p ∈ K, then L/K is Galois, where
M = K(ζ) and L = M(α), and the subgroup
Gal(L/M) ⊂ Gal(L/K)
Proof. We sketch the idea. (See e.g. [Stewart] pp. 162–3 for fuller details.)
If {G_i} is a soluble series for G then we can check that {H_i = H ∩ G_i}
is a soluble series for H: normality is preserved and the factors H_i/H_{i−1} are
subgroups of G_i/G_{i−1} so also abelian.
If H ⊂ G is normal, then we can check that {J_i = HG_i/H} is a soluble series
for G/H: the factors are quotients of G_i/G_{i−1} (by applying the higher isomorphism
theorems) so also abelian.
For the converse, we can essentially adjoin soluble series {H_i} for H and
{G_j/H} for G/H to get one {H_i, G_j} for G.
Theorem 13.16. If L/K is a radical Galois extension, then Gal(L/K) is sol-
uble.
Proof. We proceed by induction on n = [L : K]; the case n = 1 holds trivially,
so suppose n > 1.
As L/K is radical, there is some α ∈ L \ K with α^p = a ∈ K for some
prime p. The minimal polynomial m ∈ K[x] of α over K divides x^p − a, and
so its roots are all of the form ζ^i α for some i, where ζ is a primitive pth root
of unity. Since deg(m) ≥ 2, it has at least two roots, and so ζ ∈ L, being a
rational combination of any two distinct ζ^i α.
Consider the radical extension M = K(ζ, α) of K that lies intermediate
K⊂M ⊂L
Gal(L/K)/ Gal(L/M) ≅ Gal(M/K)
Gal(N/K)/ Gal(N/L) ≅ Gal(L/K)
The soluble yoga of Proposition 13.15 concludes the proof, setting G = Gal(N/K),
which is soluble by Theorem 13.16, and H = Gal(N/L).
Optional: the Introduction at The End
This isn’t a conclusion, because we have only just started the subject. Rather,
it is a fantasy introduction that spells out the whole module as if in advance.
Of course it will fail, but by trying to see how it fails, you may see better how
the whole module hangs together, and why we did what we did. As with the
other introductions at the beginning, you could choose not to read this at all.
Field extensions
Life starts with a field K and the ring of polynomials K[x] over it. Given
an irreducible polynomial f ∈ K[x] of degree n, we can construct a simple
field extension in which f has a root, namely L = K[x]/(f ); as a K-vector
space dim_K L = n. If we already owned such an extension K ⊂ M and a root
α ∈ M, the field L′ = K(α) ⊂ M is another field containing a root of f; but
an easy argument shows L ≅ L′, so we are not upsetting anything by doing
this, though it feels a bit strange that the element (coset) α = x + (f) ∈ L
models any of the roots of f, and we cannot tell which particular one it was. (In
fact, the question doesn’t even make sense: if we owned a field M containing
all the roots, we could map L ↪ M, taking x to whichever root we pleased.)
Repeating this process inductively, we construct an extension K ⊂ F, a
splitting field, defined by the property that all of the roots α_1, . . . , α_n of f lie
in F (in the sense that f factorises as f = c ∏_i (x − α_i) in the ring F[x]) but
not in any smaller subfield of F. If we already owned a field K ⊂ M which
contained all the roots of f, then F′ = K(α_1, . . . , α_n) ⊂ M is another splitting
field for f; a neat little inductive argument on simple extensions shows that
F ≅ F′, so we are not upsetting anything by our approach.
The splitting field F was designed to contain all the roots of f . But by
magic, and a clever inductive argument on degrees of extensions, it contains all
the roots of any irreducible g ∈ K[x] for which it contains at least one root.
Extensions K ⊂ F with this property are said to be normal, and this plays a
role when we consider K-automorphisms.
Rather quietly, we have taken for granted that f is separable: that is,
that it has n = deg(f ) distinct roots in any splitting field. An argument with
formal differentiation (recovering what we thought we already knew) shows that
separability can only fail if f comprises only p-powers and K has characteristic
p > 0. That’s a very interesting case, but doesn’t feature in a major way in this
module. In particular, working with K = Q, or any other subfield of C, f is
automatically separable.
Field automorphisms
Given a splitting field K ⊂ F for f (recall deg(f) = n), let G = Gal(f) =
Gal(F/K) be the group of all K-automorphisms of F. A simple linear
independence argument in linear algebra shows that there are at most n
K-automorphisms of F. But the normality of splitting fields readily provides n
automorphisms, by redeploying the argument about uniqueness of splitting fields,
so #G = n.
Of course, every element g ∈ G fixes the whole of K pointwise—that’s what
it means to be a K-automorphism. But prima facie, it could be the case that
they fix even more: they may fix an intermediate field K ⊂ K0 ⊂ F , say.
Whether they do or not, a bit more linear algebra shows that dimK0 L = #G.
But we knew at the outset that dimK L = n, so K0 = K after all.
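The smallest non-trivial case can be checked by hand. A sketch, taking K = Q and f = x^2 − 2 (an illustrative choice; σ is a name introduced here):

```latex
\[ F = \mathbb{Q}(\sqrt{2}) = \{\, a + b\sqrt{2} : a, b \in \mathbb{Q} \,\},
   \qquad \dim_{\mathbb{Q}} F = 2. \]
% A K-automorphism must send \sqrt{2} to a root of x^2 - 2, so there are exactly two:
\[ G = \{\mathrm{id}, \sigma\}, \qquad \sigma(a + b\sqrt{2}) = a - b\sqrt{2},
   \qquad \#G = 2 = \dim_{\mathbb{Q}} F. \]
% Fixed field: \sigma fixes a + b\sqrt{2} exactly when b = 0.
\[ \{\, z \in F : \sigma(z) = z \,\} = \mathbb{Q}. \]
```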
Roots of polynomials
You’ll like this. Suppose f ∈ Q[x] has roots α_1, ..., α_n ∈ C. Thus L =
Q(α_1, ..., α_n) ⊂ C is a splitting field for f. By the Galois Correspondence,
L has only finitely many strict subfields K_1, ..., K_s. Each of these is a Q-vector
subspace of L of dimension strictly less than dim_Q L. Since Q is infinite, a
Q-vector space is never a finite union of proper subspaces, so there is no way
that their set-theoretic union can be the whole of L. So pick any α ∈ L that does
not lie in any K_i. Then clearly L = Q(α): indeed Q(α) is a subfield of L, but
it is certainly not contained in any strict subfield of L by choice of α. This
result, that L = Q(α) for some α, is called the Theorem on the Primitive
Element.
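To see a primitive element in the flesh, take L = Q(√2, √3), the splitting field of (x^2 − 2)(x^2 − 3), and try α = √2 + √3. The sketch below assumes the standard fact that 1, √2, √3, √6 form a Q-basis of L, stores a + b√2 + c√3 + d√6 as the tuple (a, b, c, d), and checks that 1, α, α^2, α^3 are linearly independent over Q, so that Q(α) has dimension 4 and must be all of L. The helper names mul and rank are hypothetical, written for this example.

```python
from fractions import Fraction


def mul(u, v):
    """Multiply two elements of Q(sqrt2, sqrt3) in the basis 1, sqrt2, sqrt3, sqrt6."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),         # sqrt3 * sqrt6 = 3 sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),         # sqrt2 * sqrt6 = 2 sqrt3
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,               # sqrt2 * sqrt3 = sqrt6
    )


def rank(rows):
    """Rank of a matrix over Q, by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


one = (1, 0, 0, 0)
alpha = (0, 1, 1, 0)          # sqrt2 + sqrt3

# powers[k] holds alpha^k for k = 0, ..., 4
powers = [one]
for _ in range(4):
    powers.append(mul(powers[-1], alpha))

# dim_Q Q(alpha) is at least the rank of 1, alpha, alpha^2, alpha^3
degree = rank(powers[:4])

# alpha should satisfy x^4 - 10 x^2 + 1 = 0
relation = tuple(p4 - 10 * p2 + p0
                 for p0, p2, p4 in zip(powers[0], powers[2], powers[4]))

print(degree)      # 4
print(relation)    # (0, 0, 0, 0)
```

In the language of the argument above, the strict subfields of L other than Q are Q(√2), Q(√3) and Q(√6), and √2 + √3 visibly lies in none of them.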
On the face of it, the Theorem on the Primitive Element only applies to
splitting fields L ⊃ Q, since it used the Galois correspondence, and that requires
normality. But if K ⊃ Q is any finite extension, then there is a normal closure
K ⊂ N and we can hit N/Q by Galois Theory: it has only finitely many
subfields, so a fortiori K does too, and the rest of the argument follows.
But in some ways our initial ambition was the opposite. Rather than trying
to re-express L in a simple way by using only one non-rational number, we
want to express L by adjoining radicals, one at a time. (Or, at least, if not L
itself, then some field that contains L—just as Cardano put up with the fact
that he might adjoin strictly complex radicals even when computing real roots.)
That is not always possible. If we build a sequence of extensions by roots
of polynomials of the form x^p − a, then over on the other side of the Galois
Correspondence we must be breaking down G = Gal(f) by normal subgroups
with quotients like Z/p; such groups are called soluble, and have a more precise
definition than what I just said. We need a little care to say that correctly, but
having determined that G must decompose in a particular way, we can ask
whether actual Galois groups we know do indeed decompose like that. And
pretty quickly we spot that S5 does not, and with that motivation in hand, it
doesn’t take long to find a polynomial f with Galois group S5 . There we have
it: a polynomial that cannot be solved by radicals—centuries of work in rubble,
if you’re a glass-half-empty sort of person.
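The claim about S5 can be checked by brute force. One standard way to make "soluble" precise for a finite group is the derived series G ⊃ [G, G] ⊃ [[G, G], [G, G]] ⊃ ...: the group is soluble exactly when this chain reaches the trivial group. That formulation is brought in here for the sketch and may not be the definition the notes adopt. The code below represents permutations of {0, ..., n−1} as tuples and lists the orders along the derived series of Sn; all function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Composite permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def generated_subgroup(gens, n):
    """Subgroup of S_n generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a, b in group."""
    comms = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(comms, n)


def derived_series_orders(n):
    """Orders of the terms of the derived series of S_n, until it stabilises."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):   # series has stopped shrinking
            return orders
        orders.append(len(nxt))
        group = nxt


print(derived_series_orders(4))   # [24, 12, 4, 1]
print(derived_series_orders(5))   # [120, 60]
```

For S4 the chain S4 ⊃ A4 ⊃ V4 ⊃ 1 reaches the trivial group, so S4 is soluble. For S5 it gets stuck at A5, of order 60, which equals its own derived subgroup, so S5 is not.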
(b) Understand much more fully which finite groups are soluble. For example,
what are the soluble transitive subgroups of Sn for n = 3, 4, 5, 6, and how
do they relate to polynomials that we know?
(c) More precisely, which groups can arise as Gal(f ) for some polynomial
f ∈ Q[x]? This is called the inverse Galois problem and it is not
solved. You would become famous if you solved it.
(d) We have worked mostly with polynomials defined over Q. What if we
worked instead over a transcendental extension Q(t)? The Galois Corre-
spondence still holds, of course (it holds over any field), but what does it
mean? What about other fields?
(e) What about characteristic p > 0, and in particular what information can
we recover in the non-separable case?
References
[1] https://mathshistory.st-andrews.ac.uk/Biographies/Ferro/
https://mathshistory.st-andrews.ac.uk/Biographies/Tartaglia/
https://mathshistory.st-andrews.ac.uk/Biographies/Cardan/
[Stewart2015]