
Cambridge Part 2567

This document provides notes on mathematics courses at Cambridge University. It contains 5 chapters on topics including algebraic topology, probability and measure, graph theory, automata and formal languages, and Galois theory. The notes are available online for free download and modification under an open source license. They aim to provide a reasonably complete overview of the content of the courses.

Uploaded by

Chandan Gupta
Copyright
© © All Rights Reserved

Notes on the Cambridge University

Mathematical Tripos
Sky Wilshaw

Part II

University of Cambridge
2020–2022
Contents

1 Algebraic Topology

2 Probability and Measure

3 Graph Theory

4 Automata and Formal Languages

5 Galois Theory

Introduction

This book contains notes for the maths courses at Cambridge University. Please note that while
efforts have been made to ensure completeness and correctness, no guarantees can be made; this
is simply a reasonably complete way of collating information about the courses.
This book can be downloaded in PDF form for free at https://thirdsgames.co.uk/maths.html,
and the source code (for the book itself and for the individual courses) can be accessed at
https://github.com/zeramorphic/cambridge-maths-notes.
You are given the right to download the PDF of the book (or its component parts) for private use.
You are permitted to download and modify the source code of the repository (the book and the
course notes it contains), but may not distribute these modifications (including object files such as
PDFs generated from these modifications) to others. However, you are permitted to make public
forks of the repository in order to create pull requests, but this does not grant you permission to
distribute object files created from these forked repositories. It must be clear when visiting it that
such a repository is a fork of https://github.com/zeramorphic/cambridge-maths-notes, and
must include a link to the original repository. (Forks created on GitHub satisfy this requirement,
as the title contains the words ‘forked from’ and then a link to the original repository.)
This project makes use of free software in accordance with their license terms, including the
cobra CLI interface generator
https://pkg.go.dev/mod/github.com/spf13/[email protected]. For more information, read the
licenses of each dependency.

Course 1

Algebraic Topology

Contents
1.1 Motivation
1.1.1 Motivation
1.1.2 Notation
1.2 Homotopy
1.2.1 Definition
1.2.2 Contractible spaces


1.1 Motivation
1.1.1 Motivation
Topological spaces are difficult to study on their own, and so we will assign algebraic invariants
to these spaces which allow us to reason more easily about these spaces. To a topological space
X, a ‘numerical invariant’ is a number g( X ) ∈ R ∪ {∞} such that X ≃ Y (where ≃ denotes
homeomorphism) implies g( X ) = g(Y ). An example of a numerical invariant is the number
of path-connected components of X. An algebraic invariant is a group G ( X ) assigned to a
topological space X such that X ≃ Y implies G ( X ) ≃ G (Y ), where here ≃ denotes isomorphism.
We will construct two kinds of such invariants: the fundamental group, and invariants related
to homology. The invariants we construct will behave nicely under maps: if f : X → Y is a
continuous map, we induce a homomorphism f ⋆ : G ( X ) → G (Y ). We will prove the following
model theorems.

• If $\mathbb{R}^n \simeq \mathbb{R}^m$, then $n = m$.

• If $f \colon D^n \to D^n$ is continuous, then there exists $x \in D^n$ with $f(x) = x$.

The above theorems are easy to prove in the case n = 1 by appealing to path-connectedness and
the intermediate value theorem. Our study allows us to prove similar things about these higher
dimensional cases, among other things.

1.1.2 Notation
• A space is a topological space.

• A map is a continuous function, unless defined otherwise.

• If X and Y are spaces, X ≃ Y means that X and Y are homeomorphic.

• If G and H are groups, G ≃ H means that G and H are isomorphic.

• Some common spaces include:

– The one-point space {•};

– I = [0, 1] ⊂ R;

– $I^n = \underbrace{I \times \dots \times I}_{n \text{ times}}$, the n-dimensional closed unit cube;

– $D^n = \{v \in \mathbb{R}^n \mid \|v\| \le 1\}$, the n-dimensional closed unit disk (note that $I^n \simeq D^n$);

– $S^{n-1} = \{v \in \mathbb{R}^n \mid \|v\| = 1\}$, the (n − 1)-dimensional unit sphere.

• Common maps include:

– If X is a space, the identity map idX : X → X is defined by x 7→ x;

– If X and Y are spaces with p ∈ Y, the constant map c X,p : X → Y is defined by x 7→ p.



1.2 Homotopy
1.2.1 Definition

Definition. Let f 0 , f 1 : X → Y be continuous. We say f 0 is homotopic to f 1 , written f 0 ∼ f 1 ,


if there exists a continuous H : X × I → Y with H ( x, 0) = f 0 ( x ) and H ( x, 1) = f 1 ( x ).

We can think of H as a path from f 0 to f 1 in the set Hom( X, Y ) of functions X → Y, where the path is continuous with respect to a topology on Hom( X, Y ) that will not be defined here.

Lemma (Gluing lemma). Let $X = C_1 \cup C_2$, where $C_1, C_2$ are closed in $X$. Let $f \colon X \to Y$ be a function (not assumed continuous) such that the restrictions $f|_{C_1}$ and $f|_{C_2}$ are continuous. Then $f$ is continuous.

Proof. It suffices to show that the preimage of a closed set is closed. Let $K \subseteq Y$ be closed. Then $K_i = f^{-1}(K) \cap C_i = (f|_{C_i})^{-1}(K)$ is a closed set in $C_i$, and so is closed in $X$ because $C_i$ is closed. Since $f^{-1}(K) = K_1 \cup K_2$, the preimage $f^{-1}(K)$ is also closed in $X$.

Lemma. Homotopy is an equivalence relation.

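Proof. Reflexivity is trivial, because $H(x, t) = f(x)$ is continuous, as $H = f \circ \pi_1$ is the composition of continuous maps. Symmetry holds because if $H(x, t)$ is continuous, $H(x, 1 - t)$ is continuous as the composition of continuous maps. For transitivity, if $f_0 \sim f_1$ via $H$ and $f_1 \sim f_2$ via $H'$, we define
$$H''(x, t) = \begin{cases} H(x, 2t) & t < \tfrac{1}{2} \\ H'(x, 2t - 1) & t \ge \tfrac{1}{2} \end{cases}$$
and this is continuous by the gluing lemma, applied to the closed sets $X \times [0, \tfrac{1}{2}]$ and $X \times [\tfrac{1}{2}, 1]$, on whose overlap the two formulas agree because $H(x, 1) = f_1(x) = H'(x, 0)$.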

Note that we sometimes write f t ( x ) for a homotopy between f 0 and f 1 .


Example. Let f 1 : X → Rn be a map. Then f 0 : X → Rn defined by c X,0 has f 1 ∼ f 0 via the
homotopy H ( x, t) = t f 1 ( x ).
Example. Let $f_1 \colon S^1 \to S^2$ be defined by $f_1(x, y) = (x, y, 0)$: the inclusion map from the circle to the equator in the unit 2-sphere. Let $f_0 \colon S^1 \to S^2$ be the constant map $f_0(x, y) = (0, 0, 1)$. Then $f_0 \sim f_1$ via the homotopy $f_t(x, y) = \left(x \sin\tfrac{\pi t}{2},\, y \sin\tfrac{\pi t}{2},\, \cos\tfrac{\pi t}{2}\right)$.
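
As a quick sanity check on this example (a worked verification added here, not part of the original notes), one can confirm that each $f_t$ really takes values in $S^2$ when $x^2 + y^2 = 1$, and that the endpoints are the two maps claimed:

```latex
\begin{align*}
\|f_t(x, y)\|^2 &= (x^2 + y^2)\sin^2\tfrac{\pi t}{2} + \cos^2\tfrac{\pi t}{2}
                 = \sin^2\tfrac{\pi t}{2} + \cos^2\tfrac{\pi t}{2} = 1, \\
f_0(x, y) &= (0, 0, 1), \qquad f_1(x, y) = (x, y, 0).
\end{align*}
```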

Lemma. If f 0 , f 1 : X → Y are homotopic via f t , and g0 , g1 : Y → Z are homotopic via


gt , then the map H : X × I → Z defined by H ( x, t) = gt ( f t ( x )), also denoted gt ◦ f t is a
homotopy for g0 ◦ f 0 ∼ g1 ◦ f 1 .

Proof. This is a composition of continuous maps and hence continuous.



1.2.2 Contractible spaces

Definition. A space Y is contractible if idY ∼ cY,p for some p ∈ Y.

Example. If Y ⊆ Rn is convex and nonempty, Y is contractible via the homotopy H (y, t) =


(1 − t)y + tp for some p ∈ Y.

Proposition. Let Y be contractible. Then f 0 ∼ f 1 for any maps f 0 , f 1 : X → Y.

Proof. We have $f_0 = \mathrm{id}_Y \circ f_0 \sim c_{Y,p} \circ f_0 = c_{X,p}$, and similarly $f_1 \sim c_{X,p}$. By transitivity, $f_0 \sim f_1$.

Corollary. Let Y be contractible. Then Y is path-connected.

Proof. If Y is contractible, and p, q ∈ Y, then c{•},p ∼ c{•},q via some H : {•} × I → Y, by the proposition above. Then
γ(t) = H (•, t) is a path from p to q in Y.

Example. R \ {0} is not contractible, since it is not path-connected.


We will later prove that Rn \ {0} is not contractible for any n ≥ 1, but we require some more
theory before this can be proven.

Definition. Spaces X, Y are homotopy equivalent, denoted X ∼ Y, if there exist maps


f : X → Y and g : Y → X such that f ◦ g ∼ idY and g ◦ f ∼ idX .

Example. If X ≃ Y, X and Y are homotopy equivalent. Note that the definition of homotopy
equivalence is simply the definition of homeomorphism, except that the requirement that f ◦ g
and g ◦ f be equal to the identity is relaxed into simply being homotopic to the identity.

Proposition. X is contractible if and only if X ∼ {•}.


Course 2

Probability and Measure

Contents
2.1 Measures
2.1.1 Definitions
2.1.2 Rings and algebras


2.1 Measures
2.1.1 Definitions

Definition. Let $E$ be a (nonempty) set. A collection $\mathcal{E}$ of subsets of $E$ is called a σ-algebra if the following properties hold:
• $\emptyset \in \mathcal{E}$;
• $A \in \mathcal{E} \implies A^c = E \setminus A \in \mathcal{E}$;
• if $(A_n)_{n \in \mathbb{N}}$ is a countable collection of sets in $\mathcal{E}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{E}$.

Example. Let $\mathcal{E} = \{\emptyset, E\}$. This is a σ-algebra. Also, $\mathcal{P}(E) = \{A \subseteq E\}$ is a σ-algebra.

Remark. Since $\bigcap_n A_n = \left(\bigcup_n A_n^c\right)^c$, any σ-algebra $\mathcal{E}$ is closed under countable intersections as well as under countable unions. Note that $B \setminus A = B \cap A^c \in \mathcal{E}$, so σ-algebras are closed under set difference.

Definition. A set $E$ with a σ-algebra $\mathcal{E}$ is called a measurable space. The elements of $\mathcal{E}$ are called measurable sets.

Definition. A measure $\mu$ is a set function $\mu \colon \mathcal{E} \to [0, \infty]$ such that $\mu(\emptyset) = 0$ and, for every sequence $(A_n)_{n \in \mathbb{N}}$ of disjoint sets in $\mathcal{E}$, we have $\mu\left(\bigcup_{n \in \mathbb{N}} A_n\right) = \sum_{n \in \mathbb{N}} \mu(A_n)$. This is the countable additivity property of the measure.

Remark. If $E$ is countable, then for any $A \in \mathcal{P}(E)$ and measure $\mu$, we have $\mu(A) = \mu\left(\bigcup_{x \in A} \{x\}\right) = \sum_{x \in A} \mu(\{x\})$. Hence, measures are uniquely defined by the measure of each singleton. This corresponds to the notion of a probability mass function.

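Definition. For a collection $\mathcal{A}$ of subsets of $E$, we define the σ-algebra $\sigma(\mathcal{A})$ generated by $\mathcal{A}$ by
$$\sigma(\mathcal{A}) = \{A \subseteq E : A \in \mathcal{E} \text{ for all σ-algebras } \mathcal{E} \supseteq \mathcal{A}\}$$
So it is the smallest σ-algebra containing $\mathcal{A}$. Equivalently,
$$\sigma(\mathcal{A}) = \bigcap_{\mathcal{E} \supseteq \mathcal{A},\ \mathcal{E} \text{ a σ-algebra}} \mathcal{E}$$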
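
On a finite set, the generated σ-algebra can be computed directly, because closure under complements and pairwise unions already gives closure under countable unions. The following is a minimal sketch under that finiteness assumption; the function name `generated_sigma_algebra` and the example sets are illustrative and not from the notes.

```python
from itertools import combinations

def generated_sigma_algebra(E, generators):
    """Smallest family containing `generators` that is closed under
    complement and pairwise union; for finite E this is sigma(A)."""
    E = frozenset(E)
    family = {frozenset(), E} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # close under complements
        for A in current:
            comp = E - A
            if comp not in family:
                family.add(comp)
                changed = True
        # close under pairwise unions
        for A, B in combinations(current, 2):
            union = A | B
            if union not in family:
                family.add(union)
                changed = True
    return family

# Hypothetical example: E = {1, 2, 3, 4}, generators {1} and {1, 2}.
sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

In this example the generators cannot separate 3 from 4, so the resulting family is built from the three blocks {1}, {2} and {3, 4}.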

2.1.2 Rings and algebras


To construct good generators, we define the following.

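Definition. $\mathcal{A} \subseteq \mathcal{P}(E)$ is called a ring over $E$ if $\emptyset \in \mathcal{A}$ and $A, B \in \mathcal{A}$ implies $B \setminus A \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.

Rings are easier to manage than σ-algebras because they only involve finitary operations.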

Definition. $\mathcal{A}$ is called an algebra over $E$ if $\emptyset \in \mathcal{A}$ and $A, B \in \mathcal{A}$ implies $A^c \in \mathcal{A}$ and $A \cup B \in \mathcal{A}$.

Remark. Rings are closed under symmetric difference $A \mathbin{\triangle} B = (B \setminus A) \cup (A \setminus B)$, and are closed under intersections $A \cap B = (A \cup B) \setminus (A \mathbin{\triangle} B)$. Algebras are rings, because $B \setminus A = B \cap A^c = (B^c \cup A)^c$. Not all rings are algebras, because rings do not need to include the entire space.

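Proposition (Disjointification of countable unions). Consider $\bigcup_n A_n$ for $A_n \in \mathcal{E}$, where $\mathcal{E}$ is a σ-algebra (or a ring, if the union is finite). Then there exist $B_n \in \mathcal{E}$ that are disjoint such that $\bigcup_n A_n = \bigcup_n B_n$.

Proof. Define $\widetilde{A}_n = \bigcup_{j \le n} A_j$, then set $B_1 = \widetilde{A}_1$ and $B_{n+1} = \widetilde{A}_{n+1} \setminus \widetilde{A}_n$.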
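
For a finite list of sets, the construction in the proof can be sketched as follows. This is an illustrative sketch only: the name `disjointify` is an assumption, and the running union `seen` plays the role of the partial unions of the earlier sets.

```python
def disjointify(sets):
    """Given A_1, ..., A_N, return B_1, ..., B_N where B_n is A_n minus
    the union of the earlier sets, so the B_n are pairwise disjoint."""
    seen = set()      # running union of the sets processed so far
    result = []
    for A in sets:
        result.append(set(A) - seen)
        seen |= set(A)
    return result

# Hypothetical example
A = [{1, 2}, {2, 3}, {1, 4}]
B = disjointify(A)
```

Each `B[n]` is a subset of `A[n]`, the pieces do not overlap, and their union equals the union of the original sets.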

Definition. A set function on a collection $\mathcal{A}$ of subsets of $E$, where $\emptyset \in \mathcal{A}$, is a map $\mu \colon \mathcal{A} \to [0, \infty]$ such that $\mu(\emptyset) = 0$. We say $\mu$ is increasing if $\mu(A) \le \mu(B)$ for all $A \subseteq B$ in $\mathcal{A}$. We say $\mu$ is additive if $\mu(A \cup B) = \mu(A) + \mu(B)$ for disjoint $A, B \in \mathcal{A}$ with $A \cup B \in \mathcal{A}$. We say $\mu$ is countably additive if $\mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n)$ for disjoint sequences $A_n$ where $\bigcup_n A_n$ and each $A_n$ lie in $\mathcal{A}$. We say $\mu$ is countably subadditive if $\mu\left(\bigcup_n A_n\right) \le \sum_n \mu(A_n)$ for arbitrary sequences $A_n$ under the above conditions.

Remark. A measure satisfies all four of the above conditions.

Theorem (Carathéodory’s theorem). Let $\mu$ be a countably additive set function on a ring $\mathcal{A}$ of subsets of $E$. Then there exists a measure $\mu^\star$ on $\sigma(\mathcal{A})$ such that $\mu^\star|_{\mathcal{A}} = \mu$.

We will later prove that this extended measure is unique.

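Proof. For $B \subseteq E$, we define the outer measure $\mu^\star$ as
$$\mu^\star(B) = \inf \left\{ \sum_{n \in \mathbb{N}} \mu(A_n) : A_n \in \mathcal{A},\ B \subseteq \bigcup_{n \in \mathbb{N}} A_n \right\}$$
If there is no sequence $A_n$ such that $B \subseteq \bigcup_{n \in \mathbb{N}} A_n$, we declare the outer measure $\mu^\star(B)$ to be $\infty$. We define the class
$$\mathcal{M} = \{A \subseteq E \mid \forall B \subseteq E.\ \mu^\star(B) = \mu^\star(B \cap A) + \mu^\star(B \cap A^c)\}$$
This is the class of $\mu^\star$-measurable sets.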


Course 3

Graph Theory

Contents
3.1 Introduction
3.1.1 Definitions


3.1 Introduction
3.1.1 Definitions
We use the notation [n] for {1, . . . , n}. For a set X and k ∈ N, we define X (k) = {Y ⊆ X | |Y | = k }.

Definition. A graph is a pair (V, E), where V is a set of vertices and E is a set of edges where
E ⊆ V (2) . We use the notation V ( G ) to denote the set of vertices and E( G ) to denote the
set of edges, where G = (V, E) is a graph. We define | G | = |V ( G )|, and e( G ) = | E( G )|.

Example. The complete graph on n vertices, denoted Kn , is the graph with V = [n] and E = V (2) .
Note that we sometimes use juxtaposition of names of vertices to denote an edge between them,
so 13 represents the edge {1, 3}.
Remark. Edges are undirected. There are no edges from a vertex to itself. Edges between vertices
are unique if they exist. Most of the graphs covered in this course are finite.
Example. The empty graph on n vertices, denoted $\overline{K_n}$, is the graph with vertex set V = [n] and
E = ∅.
Example. The path of length n, denoted Pn , is the graph with V = [n + 1] and E = {{1, 2}, . . . , {n, n + 1}}.
Example. The cycle of length n, denoted Cn , is the graph with V = [n] and E = {{1, 2}, . . . , {n − 1, n}, {n, 1}}.

Definition. Let G be a graph, x ∈ V ( G ). The neighbourhood of x in G is

NG ( x ) = {y ∈ V ( G ) | { x, y} ∈ E( G )}

If y is a neighbour of x, we write x ∼ y.

Note that ∼ is irreflexive and not transitive in general.

Definition. The degree of a vertex x ∈ V ( G ) is defined as deg x = | N ( x )|.
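
The example graphs and the definitions of neighbourhood and degree translate directly into code. The following is a minimal sketch, assuming a graph is stored as a pair (V, E) with each edge a two-element `frozenset`; all function names are illustrative rather than taken from the notes.

```python
from itertools import combinations

def complete_graph(n):
    """K_n: vertex set [n], every 2-element subset is an edge."""
    V = set(range(1, n + 1))
    E = {frozenset(e) for e in combinations(V, 2)}
    return V, E

def path_graph(n):
    """P_n: the path of length n on vertex set [n + 1]."""
    V = set(range(1, n + 2))
    E = {frozenset({i, i + 1}) for i in range(1, n + 1)}
    return V, E

def cycle_graph(n):
    """C_n (n >= 3): vertex set [n], edges {i, i + 1} and {n, 1}."""
    V = set(range(1, n + 1))
    E = {frozenset({i, i % n + 1}) for i in range(1, n + 1)}
    return V, E

def neighbourhood(graph, x):
    """N_G(x): the vertices y with {x, y} an edge."""
    V, E = graph
    return {y for y in V if frozenset({x, y}) in E}

def degree(graph, x):
    return len(neighbourhood(graph, x))
```

Storing edges as unordered two-element sets mirrors the requirement that E is a subset of V^(2): edges are undirected and no vertex is adjacent to itself.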

Definition. Let G, H be graphs. A graph isomorphism is a bijection φ : V ( G ) → V ( H ) such


that {u, v} ∈ E( G ) ⇐⇒ { φ(u), φ(v)} ∈ E( H ).

Definition. We say H is a subgraph of G if V ( H ) ⊆ V ( G ) and E( H ) ⊆ E( G ).

If G is a graph, and xy ∈ E( G ), we define G − xy to be the graph (V ( G ), E( G ) \ { xy}). Similarly,


for x, y ∈ V ( G ), we define G + xy to be the graph (V ( G ), E( G ) ∪ { xy}).

Definition. Let x, y ∈ V ( G ). A walk from x to y in G is a sequence of vertices ( x, . . . , y) such that each consecutive pair of elements of the sequence is joined by an edge in G. A path from x to y in G is a walk in which all the vertices are distinct.

Definition. A graph is connected if every pair of vertices is joined by a path.

The concatenation of two paths or walks P and P′ is written PP′ .


Remark. The concatenation of two walks is a walk. The concatenation of two paths is not necessarily
a path, because the two paths may share a vertex other than the one at which they are joined.

Proposition. If W is an x–y walk for x ≠ y, W contains an x–y path, where ‘contains’ denotes
a subsequence.

Proof. Let W ′ be an x–y walk of minimal length contained in W. This is a path, because if there were a repeated
vertex, we could find a shorter walk by eliminating the detour between its two occurrences.
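
The idea of eliminating detours can also be carried out constructively. The sketch below is not the minimality argument of the proof but a greedy variant of it: scan the walk once and, whenever a vertex reappears, cut out everything since its earlier occurrence. The function name `path_from_walk` is an assumption.

```python
def path_from_walk(walk):
    """Remove detours from a walk (a list of vertices) until no vertex
    repeats; the result is a subsequence with the same endpoints."""
    path = []
    position = {}                 # vertex -> its index in `path`
    for v in walk:
        if v in position:
            # v was visited before: cut out the detour back to v
            cut = position[v]
            for u in path[cut + 1:]:
                del position[u]
            path = path[:cut + 1]
        else:
            position[v] = len(path)
            path.append(v)
    return path

# Hypothetical walk with two detours (2 -> 3 -> 2 and 4 -> 5 -> 4)
example = path_from_walk([1, 2, 3, 2, 4, 5, 4, 6])
```

Every consecutive pair in the output was already consecutive somewhere in the input walk, so the output is again a walk in the same graph.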
Course 4

Automata and Formal Languages

Contents
4.1 Introduction
4.1.1 Exposition
4.1.2 Basic definitions
4.1.3 Revisiting Numbers and Sets
4.1.4 Notation
4.2 Rewrite systems
4.2.1 Definitions
4.2.2 Relation to languages
4.2.3 Grammars


4.1 Introduction
4.1.1 Exposition
Computation, or computability, is central to modern mathematics. However, we very rarely
think about the precise definition of what it means for something to be ‘computable’. There is
an important difference between existence and algorithmic access to a witness. Contrast the
statements ‘every polynomial of degree n has a root’, and ‘there is an algorithm that, given a
polynomial of degree n, finds a root’. In many cases, there is an existence proof but no
algorithm to construct the relevant object.
In 1900, Hilbert gave a talk in Paris known as Mathematical Problems, in which he described a
list of 23 problems to be worked on in the coming 100 years. One of these problems, the tenth,
asks for an algorithm to determine whether Diophantine equations, that is, polynomial equations
with integer coefficients, have integer solutions. In 1928, Hilbert and Ackermann wrote the book
Grundzüge der theoretischen Logik, in which they described the famous Entscheidungsproblem:
given a formula φ, determine whether φ is a tautology (true regardless of how the variables are
interpreted).
In both cases, Hilbert expected that solutions to these questions exist. Positive solutions to such
problems do not require a definition of words like ‘algorithm’ or ‘procedure’, because we can
agree on what an algorithm is when we see an example. However, to disprove such statements, we
need to rigorously define what an algorithm is, in order to rule all possible algorithms out.

4.1.2 Basic definitions


To talk about computation, we must first define the objects on which computation takes place.
Naturally, one would assume the objects to be some kind of number, but even the above two
examples do not have inputs as numbers; instead, we see polynomials and formulas. Modern
computation relies on encodings of complicated objects as strings of a finite set of symbols, such
as the bits 0 and 1. We use a similar approach, using a set Ω, which is usually assumed to be
finite, called the set of symbols, and then we define Ω⋆ to be the set of finite sequences of objects
of Ω, called the set of Ω-strings.

4.1.3 Revisiting Numbers and Sets


Recall that a set X is called countable if there is a surjection N → X, and that X is called infinite if
there is an injection N → X.

Proposition. If X is nonempty and countable, then X ⋆ is infinite and countable.

Proof. Since $X \neq \emptyset$, there exists $x \in X$. $X^\star$ is infinite, as the function mapping $n \in \mathbb{N}$ to $\underbrace{xx \dots x}_{n \text{ times}}$ is injective. Because $X$ is countable, there exists a surjection $\pi \colon \mathbb{N} \to X$. Each natural $k \ge 1$ has a unique prime number decomposition $\prod_{i \in \mathbb{N}} p_i^{k_i}$ where $p_0 = 2, p_1 = 3, p_2 = 5, \dots$ are the primes indexed by the naturals (we may send $k = 0$ to the empty sequence). We will interpret the $k_i$ as encoding a sequence of elements of $X$, taking care to preserve the relevance of zero. Reading $k_0$ as the length of a sequence, the sequence $(k_1, \dots, k_{k_0})$ is a sequence of naturals. We then obtain the sequence $(\pi(k_1), \dots, \pi(k_{k_0}))$ in $X^\star$. By surjectivity of $\pi$, the function we have constructed $k \mapsto (\pi(k_1), \dots, \pi(k_{k_0}))$ is also surjective.
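
The decoding map in this proof can be sketched in a few lines. This is an illustration only: the helper names are assumptions, the prime generator uses naive trial division, and the surjection `pi` onto the two-letter set {a, b} is a hypothetical choice.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for small examples)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def exponents(k, count):
    """Exponents k_0, ..., k_{count-1} of the first `count` primes in k >= 1."""
    result = []
    gen = primes()
    for _ in range(count):
        p = next(gen)
        e = 0
        while k % p == 0:
            k //= p
            e += 1
        result.append(e)
    return result

def decode(k, pi):
    """Map k >= 1 to the sequence (pi(k_1), ..., pi(k_{k_0}))."""
    (length,) = exponents(k, 1)          # k_0 is read as the length
    ks = exponents(k, length + 1)[1:]    # k_1, ..., k_{k_0}
    return [pi(e) for e in ks]

# Hypothetical surjection from the naturals onto X = {"a", "b"}
pi = lambda n: "ab"[n % 2]
word = decode(2**3 * 3**0 * 5**1 * 7**2, pi)   # k_0 = 3, then exponents 0, 1, 2
```

Reading the exponent of 2 as the length is what lets zero exponents count as genuine entries of the sequence rather than as padding.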

Theorem (Cantor’s theorem). Let X be infinite. Then its power set P ( X ) is uncountable.

Proof. A simple diagonalisation argument shows there is no surjection from the naturals to the
power set P ( X ).

Proposition. If X is countable, then the set Fin( X ) ⊆ P ( X ) of all finite subsets of X is


countable.

Proof. We construct a surjection from X ⋆ to Fin( X ); then by composition with the surjection
obtained in the first proposition we construct a surjection N → Fin( X ). Consider the forgetful
function f : X ⋆ → Fin( X ), mapping ( x1 , . . . , xn ) to { x1 , . . . , xn }. Since X is countable, π : N → X
is surjective, hence for x ∈ X, π −1 ( x ) ⊆ N is a nonempty set of naturals. Therefore, let n x be the
least element of π −1 ( x ). Then, given F ∈ Fin( X ), consider the set {n x | x ∈ F }, order it in the
usual way, and represent this as a sequence. This is a sequence of naturals with | F | elements, and
its π-image is exactly F.

4.1.4 Notation
We will use the following notational conventions.
• The natural numbers N are defined as {0, 1, 2, . . . }.
• We use the standard set-theoretic construction of naturals as Von Neumann ordinals, n =
{0, 1, . . . , n − 1}. Therefore, a natural is the set of all lower naturals.
• $X^n$ is the set of $X$-strings of length $n$, defined as $X^n = n \to X$, treating $n$ as a set as above.
• We write $|\alpha| = \operatorname{domain}(\alpha)$ for the length of a sequence.
• $X^0 = 0 \to X$ is a type with only one element $\varepsilon$, which is the empty sequence.
• We can write $X^\star = \bigcup_{n \in \mathbb{N}} X^n$.
• Truncation of a sequence $\alpha \in X^n$ to the length $k \le n$ is exactly $\alpha|_k$: the unique sequence of length $k$ such that $\alpha|_k \subseteq \alpha$.
• Concatenation of sequences $\alpha, \beta \in X^\star$ where $|\alpha| = m$, $|\beta| = n$, is denoted $\alpha\beta \in X^{m+n}$, defined piecewise in the natural way.
• By recursion, we define $\alpha^0 = \varepsilon$ and $\alpha^{n+1} = \alpha\alpha^n$.
• We identify the sequence of length one with its entry: $x \in X$ can represent the sequence $(x) \in X^1$.
• If $Y, Z \subseteq X^\star$, we write $YZ = \{\alpha\beta \mid \alpha \in Y, \beta \in Z\}$.
• Similarly, if $Y = \{\alpha\}$, we can write $\alpha Z = \{\alpha\beta \mid \beta \in Z\}$.
• If $f \colon X \to Y$, we can lift this function to the space $X^\star \to Y^\star$ functorially to the function $\hat{f}$. Often, the hat is omitted.

4.2 Rewrite systems


4.2.1 Definitions

Definition. Let Ω be a finite set of symbols, and let Ω⋆ be the set of Ω-strings. We call
elements of Ω⋆ × Ω⋆ rewrite rules or production rules. Such elements (α, β) are written
α → β.

Informally, we interpret a rewrite rule α → β as a procedure that replaces an occurrence of α in a string with β.

Definition. A pair R = (Ω, P) is called a rewrite system if P is a finite set of rewrite rules.

Proposition. If Ω is finite, there are only countably many rewrite systems on Ω.

Proof. Ω⋆ is countable, so Ω⋆ × Ω⋆ is countable. Every P is an element of Fin(Ω⋆ × Ω⋆ ), hence


this is countable.

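Definition. If $R = (\Omega, P)$ is a rewrite system, and $\sigma, \tau \in \Omega^\star$, we write $\sigma \xrightarrow{R}_1 \tau$, pronounced ‘σ is rewritten to τ in one step’ or ‘R produces τ from σ in one step’, if there exist $\alpha, \beta, \gamma, \delta \in \Omega^\star$ such that $\sigma = \alpha\gamma\beta$, $\tau = \alpha\delta\beta$, and $\gamma \to \delta \in P$.
The relation $\xrightarrow{R}$ is the reflexive and transitive closure of $\xrightarrow{R}_1$. The sequence $\sigma_0 \xrightarrow{R}_1 \sigma_1 \xrightarrow{R}_1 \dots \xrightarrow{R}_1 \sigma_n$ is called an $R$-derivation of length $n$ of $\sigma_n$ from $\sigma_0$. We write $D(R, \sigma) = \{\tau \in \Omega^\star \mid \sigma \xrightarrow{R} \tau\}$ for the set of strings that can be rewritten, produced, or derived from $\sigma$.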
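
One-step rewriting and bounded derivations can be sketched as follows, assuming symbols are single characters so that Ω-strings are ordinary Python strings and rules are (γ, δ) pairs. The function names and the example rule are illustrative, not from the notes.

```python
def one_step(rules, sigma):
    """All tau obtained from sigma by replacing one occurrence of some
    gamma by delta, for a rule (gamma, delta)."""
    results = set()
    for gamma, delta in rules:
        start = sigma.find(gamma)
        while start != -1:
            results.add(sigma[:start] + delta + sigma[start + len(gamma):])
            start = sigma.find(gamma, start + 1)
    return results

def derivable(rules, sigma, max_steps):
    """Strings reachable from sigma by derivations of length <= max_steps."""
    reached = {sigma}
    frontier = {sigma}
    for _ in range(max_steps):
        frontier = {t for s in frontier for t in one_step(rules, s)} - reached
        reached |= frontier
    return reached

# Hypothetical rewrite system with the single rule ab -> ba
rules = [("ab", "ba")]
steps = one_step(rules, "abab")
closure = derivable(rules, "abab", 10)
```

Since D(R, σ) may be infinite in general, the sketch only explores derivations up to a fixed length; for this particular rule the set of reachable strings stabilises after a few steps.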

4.2.2 Relation to languages


In language, we can think of Ω as representing letters, and Ω⋆ representing words. We could
alternatively consider Ω to represent words, and Ω⋆ to represent sentences. Further, Ω could
represent sentences, and then Ω⋆ would represent texts.
However, not every element of Ω⋆ at each level is a valid word, sentence, or text. We therefore would
like to describe which elements of Ω⋆ are well-formed. Natural languages spoken by humans
are finite, and normally the way we determine whether a string is a word is by consulting a
dictionary, which at its core is a lookup table that determines whether any given string is or is
not a word.
Even though in practice languages are finite, Chomsky realised that it makes more sense to model
them as infinite sets, due to a property known as linguistic recursion that seems to be an important
feature of human language. Linguistic recursion can be seen through the following example:
when X is a sentence in English, ‘E observes that X’ is also a grammatical sentence in English. If
we define an upper sentence length in English, we have to arbitrarily define an upper limit on
this form of recursion.

There is a difference between a sentence being grammatical and being meaningful. One notable
example is the grammatically correct ‘colourless green ideas sleep furiously’ that does not have
meaning, to contrast with ‘furiously sleep ideas green colourless’ which is neither grammatically
correct nor meaningful. We can use grammar to distinguish these two sentences, but we cannot
distinguish algebraically whether a sentence has meaning.
Example. Consider the following generative grammar of rewrite rules for English.
S → NP VP
NP → Adj NP
NP → Noun
VP → Verb
VP → Verb Adv
This rewrite system allows us to derive the sentence ‘colourless green ideas sleep furiously’ from
S.
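
One possible derivation is written out below as a worked illustration. The rules listed above only produce word categories, so the final step assumes additional lexical rules (Adj → colourless, Adj → green, Noun → ideas, Verb → sleep, Adv → furiously) that are not part of the list above.

```latex
\begin{align*}
S &\to \mathrm{NP}\ \mathrm{VP} \\
  &\to \mathrm{Adj}\ \mathrm{NP}\ \mathrm{VP} \\
  &\to \mathrm{Adj}\ \mathrm{Adj}\ \mathrm{NP}\ \mathrm{VP} \\
  &\to \mathrm{Adj}\ \mathrm{Adj}\ \mathrm{Noun}\ \mathrm{VP} \\
  &\to \mathrm{Adj}\ \mathrm{Adj}\ \mathrm{Noun}\ \mathrm{Verb}\ \mathrm{Adv} \\
  &\to \text{colourless green ideas sleep furiously} \quad \text{(assumed lexical rules)}
\end{align*}
```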

4.2.3 Grammars

Definition. Let Σ be an alphabet of letters or terminal symbols, and let V be a set of variables
or nonterminal symbols, such that Σ, V are nonempty and disjoint. Let Ω = Σ ∪ V. a, b, c, . . .
refer to letters and A, B, C, . . . refer to variables. Elements of W = Σ⋆ ⊆ Ω⋆ are called
words. u, v, w, . . . refer to words. We denote W+ = Σ⋆ \ {ε} for the set of nonempty words.
A subset of W is called a language.

Note that there are uncountably many languages over any nonempty alphabet.

Definition. A tuple $G = (\Sigma, V, P, S)$ is called a grammar if $\Sigma, V$ are nonempty and disjoint and, denoting $\Omega = \Sigma \cup V$, the pair $R = (\Omega, P)$ is a rewrite system and $S \in V$ is the start symbol. Since grammars give rise to a natural rewrite system, our notation for rewrite systems may also be used for grammars. For example,
$$D(G, \sigma) = D(R, \sigma); \qquad \sigma \xrightarrow{G}_{(1)} \tau \iff \sigma \xrightarrow{R}_{(1)} \tau$$

We define the language generated by the grammar to be

L( G ) = D( G, S) ∩ W

Example. If there is no rule of the form S → α in P, then D( G, S) = {S} and thus L( G ) = ∅


because the start symbol is not a word. Likewise, if there is no rule of the form α → w for w ∈ W
in P, then D( G, S) contains no words, so L( G ) = ∅.
Example. Let $\Sigma = \{a\}$, $V = \{S\}$, $P_0 = \{S \to aaS, S \to a\}$, $G_0 = (\Sigma, V, P_0, S)$. We will show $L(G_0) = \{a^{2n+1} \mid n \in \mathbb{N}\}$. First, every element of $D(G_0, S)$ is of odd length, which can be seen by induction on the length of the derivation, since each production rule preserves parity of length. Conversely, each $a^{2n+1}$ can be produced by the rewrite rules, by applying $S \to aaS$ a total of $n$ times, and then applying $S \to a$.
Note that the only requirement of the proof was that odd length is preserved. Thus, the following
sets of production rules also produce the same language.

• P1 = {S → aSa, S → a}
• P2 = {S → Saa, S → a}
• P3 = {S → aaS, S → aaSaa, S → a}
Grammars that generate the same language are called equivalent; this notion is called equivalence of grammars.
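
The claim that these rule sets generate the same language can be checked experimentally up to a length bound. The sketch below is only a bounded check, not a proof; it assumes single-character symbols, relies on the fact that none of these rules shortens a string (so longer intermediate strings can be discarded), and the name `words_up_to` is illustrative.

```python
def words_up_to(rules, start, max_len, alphabet):
    """Words over `alphabet` of length <= max_len derivable from `start`.
    Assumes no rule shortens a string, so longer strings are pruned."""
    seen = {start}
    frontier = [start]
    while frontier:
        s = frontier.pop()
        for gamma, delta in rules:
            i = s.find(gamma)
            while i != -1:
                t = s[:i] + delta + s[i + len(gamma):]
                if len(t) <= max_len and t not in seen:
                    seen.add(t)
                    frontier.append(t)
                i = s.find(gamma, i + 1)
    return {w for w in seen if set(w) <= set(alphabet)}

P0 = [("S", "aaS"), ("S", "a")]
P1 = [("S", "aSa"), ("S", "a")]
P2 = [("S", "Saa"), ("S", "a")]
P3 = [("S", "aaS"), ("S", "aaSaa"), ("S", "a")]

languages = [words_up_to(P, "S", 7, "a") for P in (P0, P1, P2, P3)]
expected = {"a" * (2 * n + 1) for n in range(4)}   # a, aaa, aaaaa, aaaaaaa
```

All four bounded languages coincide with the odd powers of a up to length 7, which is consistent with the parity argument in the example.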
Course 5

Galois Theory

Contents
5.1 Introduction
5.1.1 Introduction
5.1.2 Solving quadratics, cubics and quartics
5.2 Polynomials
5.2.1 Definitions
5.2.2 Symmetric polynomials


5.1 Introduction
5.1.1 Introduction
Galois theory concerns itself with solving polynomial equations of higher degree, and discussing
how the symmetries of these polynomials relate to their solubility. The modern interpretation of
Galois theory is more interested in the fields that particular polynomials generate, rather than
their particular solutions; this naturally extends to studying symmetries of fields.

5.1.2 Solving quadratics, cubics and quartics


Methods for solving quadratic equations have been known since the time of the Babylonians.
Consider $X^2 + bX + c$, and complete the square into $(X + \tfrac{1}{2}b)^2 + c - \tfrac{b^2}{4}$. This leads directly into
the usual formula.

Alternatively, consider $(X - x_1)(X - x_2)$ and expand, giving $X^2 - (x_1 + x_2)X + x_1 x_2$. Thus, $x_1 + x_2 = -b$ and $x_1 x_2 = c$. We can write $x_1 = \tfrac{1}{2}[(x_1 + x_2) + (x_1 - x_2)]$, where $x_1 + x_2 = -b$ and $(x_1 - x_2)^2 = b^2 - 4c$.
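
Spelling out the last step as a worked derivation (added for illustration), the symmetric expressions in the roots determine the roots up to the choice of a square root:

```latex
\begin{align*}
(x_1 - x_2)^2 &= (x_1 + x_2)^2 - 4 x_1 x_2 = b^2 - 4c, \\
x_1 - x_2 &= \pm\sqrt{b^2 - 4c}, \\
x_1 &= \tfrac{1}{2}\left[(x_1 + x_2) + (x_1 - x_2)\right] = \frac{-b \pm \sqrt{b^2 - 4c}}{2}.
\end{align*}
```

Swapping the sign of the square root exchanges $x_1$ and $x_2$, which reflects the symmetry between the two roots.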

Cubics were solved much later, in the early 16th century, by the Italian mathematician del Ferro.
Consider the cubic X 3 + aX 2 + bX + c, written as ( X − x1 )( X − x2 )( X − x3 ). Multiplying, we
find
x1 + x2 + x3 = − a; x1 x2 + x2 x3 + x3 x1 = b; x1 x2 x3 = − c

Without loss of generality we can set $a = 0$ by replacing $X \mapsto X - \tfrac{a}{3}$. Now,
$$x_1 = \frac{1}{3}\Big[(x_1 + x_2 + x_3) + \underbrace{(x_1 + \omega x_2 + \omega^2 x_3)}_{u} + \underbrace{(x_1 + \omega^2 x_2 + \omega x_3)}_{v}\Big]$$
where $\omega = e^{2\pi i/3}$. The $u, v$ are known as Lagrange resolvents. Applying a cyclic permutation to $x_1, x_2, x_3$ in $u$ or $v$, we find $u \mapsto \omega u$ and $v \mapsto \omega^2 v$ (or $u \mapsto \omega^2 u$ and $v \mapsto \omega v$ for the inverse permutation). Hence, the cubes of $u$ and $v$ are invariant under cyclic permutations of $x_1, x_2, x_3$. Under a permutation $x_2 \mapsto x_3, x_3 \mapsto x_2$, $u$ and $v$ swap. Hence, $u^3 + v^3$ and $u^3 v^3$ are invariant under all permutations of roots. A general fact that we will prove later is that such invariant expressions can be written in terms of the coefficients of the polynomial. In this case, we have
$$u^3 + v^3 = -27c; \qquad u^3 v^3 = -27b^3$$
Now, $u^3$ and $v^3$ are the roots of the quadratic $Y^2 + 27cY - 27b^3$. This then provides a formula for the root $x_1$. This process is known as Cardano’s formula.
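
The resolvent identities can be checked numerically on a concrete cubic. The sketch below uses a hypothetical example with roots 1, 2, −3 (chosen so that a = 0) and verifies the sum of cubes together with the product uv = −3b, from which the product of the cubes follows.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity

# Hypothetical example: the roots 1, 2, -3 sum to zero, so a = 0 and
# the cubic is X^3 + bX + c with b and c given by the root relations.
x1, x2, x3 = 1, 2, -3
b = x1 * x2 + x2 * x3 + x3 * x1        # -7
c = -(x1 * x2 * x3)                    # 6

# Lagrange resolvents
u = x1 + omega * x2 + omega**2 * x3
v = x1 + omega**2 * x2 + omega * x3

sum_of_cubes = u**3 + v**3             # should equal -27c
product = u * v                        # should equal -3b
recovered_x1 = ((x1 + x2 + x3) + u + v) / 3
```

Floating-point arithmetic means the identities hold only up to rounding error, so comparisons should use a small tolerance.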

Similarly, the quartic X 4 + aX 3 + bX 2 + cX + d can be solved by producing an auxiliary cubic


equation, in a similar way to the auxiliary quadratic equation found for the cubic case above.
However, the same process does not work for the quintic; the auxiliary equation has a degree
which is too large. The underlying reason behind this is to do with group theory, and in particular,
the group structure of S5 and A5 . This will be explored later in the course.

5.2 Polynomials
5.2.1 Definitions
In this course, ring means a commutative nonzero ring. If $R$ is a ring, $R[X]$ denotes the ring of polynomials with elements $\sum_{i=0}^n a_i X^i$, and the usual operations of addition and multiplication. A polynomial $f \in R[X]$ can be interpreted as a function $f \colon R \to R$, given by $x \mapsto \sum_{i=0}^n a_i x^i$. It is, however, important to distinguish the polynomial and its associated function; the polynomial is not in general uniquely determined by the function. For example, let $R = \mathbb{Z}/p\mathbb{Z}$ for a prime $p$, so for all $a \in R$, we have $a^p = a$, and hence $X^p$ and $X$ are different polynomials yet represent the same function.
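
This distinction is easy to see computationally. In the sketch below (with p = 5 as an arbitrary choice, and polynomials represented as coefficient lists, lowest degree first, which is an assumption of this illustration), the two polynomials have different coefficients but agree at every point of Z/pZ.

```python
p = 5  # any prime works; 5 is an arbitrary choice

# As functions on Z/pZ, a -> a^p and a -> a agree at every point.
same_function = all(pow(a, p, p) == a % p for a in range(p))

# As polynomials (coefficient lists, lowest degree first) they differ.
X = [0, 1]
X_to_p = [0] * p + [1]
different_polynomials = X != X_to_p
```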

Recall from Groups, Rings and Modules that if R = K is a field, K [ X ] is a Euclidean domain
(and hence is a unique factorisation domain, a Noetherian ring, a principal ideal domain, and an
integral domain). Hence, there is a division algorithm: for polynomials f , g ∈ K [ X ] with g ≠ 0, there exist
unique q, r ∈ K [ X ] such that f = gq + r and deg r < deg g, where we denote deg 0 = −∞. If
g = X − a is linear, f = ( X − a)q + r where r = f ( a) ∈ K; this is the familiar remainder theorem.
Note that every nonconstant polynomial f ∈ K [ X ] is a product of irreducible polynomials since K [ X ] is a
unique factorisation domain, and there are greatest common divisors which can be computed
using Euclid’s algorithm in the usual way.
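
The division algorithm can be sketched for K = Q using exact rational arithmetic. This is an illustrative implementation under stated assumptions: polynomials are coefficient lists with the lowest degree first, the zero polynomial is the empty list, and the name `poly_divmod` is not from the notes.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g != 0 in Q[X]. Polynomials are coefficient lists,
    lowest degree first. Returns (q, r) with f = g*q + r, deg r < deg g."""
    f = [Fraction(a) for a in f]
    g = [Fraction(a) for a in g]
    while g and g[-1] == 0:
        g.pop()
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]          # cancel the leading term of r
        q[shift] = coeff
        for i, b in enumerate(g):
            r[i + shift] -= coeff * b
        while r and r[-1] == 0:
            r.pop()
    return q, r

# X^3 - 7X + 6 divided by X - 2: remainder f(2) = 0
q1, r1 = poly_divmod([6, -7, 0, 1], [-2, 1])
# X^2 + 1 divided by X - 1: remainder f(1) = 2
q2, r2 = poly_divmod([1, 0, 1], [-1, 1])
```

Both examples illustrate the remainder theorem: dividing by X − a leaves the constant remainder f(a).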

Proposition. Let $K$ be a field, and let $f \in K[X]$ be non-zero. Then $f$ has at most $\deg f$ roots in $K$.

Proof. If $f$ has no roots, the proof is complete. If $f$ has a root $a$, the remainder theorem gives
$f = (X - a)q + f(a) = (X - a)q$. For a root $b$ of $f$, either $b = a$ or $q(b) = 0$, since $K$ has no zero
divisors. By induction, $q$ has at most $\deg q$ roots, since $\deg q = \deg f - 1 < \deg f$. Hence $f$ has
at most $\deg q + 1 = \deg f$ roots, as required.

5.2.2 Symmetric polynomials

Definition. Let $R$ be a ring, and let $n \geq 1$. A polynomial $f \in R[X_1, \dots, X_n]$ is symmetric if,
for every permutation $\sigma \in S_n$, we have $f(X_{\sigma(1)}, \dots, X_{\sigma(n)}) = f(X_1, \dots, X_n)$, where $S_n$ is
the symmetric group of degree $n$.

Note that constant polynomials are symmetric, and the property of symmetry is closed under
addition and multiplication. Hence, the set of symmetric polynomials is a subring of
$R[X_1, \dots, X_n]$.

Example. $X_1 + \dots + X_n$ is symmetric. More generally, $p_k = X_1^k + \dots + X_n^k$ is symmetric.

Proposition. For $\sigma \in S_n$, let $f^\sigma(X_1, \dots, X_n) = f(X_{\sigma(1)}, \dots, X_{\sigma(n)})$. This gives an action
(on the right) of the group $S_n$ on $R[X_1, \dots, X_n]$. A polynomial $f \in R[X_1, \dots, X_n]$ is symmetric
if and only if $f$ is fixed under the action of $S_n$; in other words, $f^\sigma = f$ for all $\sigma \in S_n$.

Definition. The elementary symmetric functions or elementary symmetric polynomials are

\[ s_r(X_1, \dots, X_n) = \sum_{i_1 < \dots < i_r} X_{i_1} X_{i_2} \cdots X_{i_r} \]

For instance,
\[ s_2(X_1, X_2, X_3) = X_1 X_2 + X_1 X_3 + X_2 X_3 \]
It is clear that these are symmetric polynomials.

Definition. A monomial is an expression of the form $X^I = X_1^{I_1} \cdots X_n^{I_n}$ for $I \in \mathbb{N}^n$. The
(total) degree of a monomial is $\sum_{i=1}^n I_i$. A term is a scalar multiple of a monomial. A
polynomial can be written uniquely as a sum of non-zero terms with distinct monomials. The total
degree of a polynomial is the maximum total degree of its terms.
Monomials are equipped with a lexicographic ordering, where we say $X^I > X^J$
if either $I_1 > J_1$, or for some $r \in \{1, \dots, n-1\}$ we have $I_1 = J_1, \dots, I_r = J_r$ and
$I_{r+1} > J_{r+1}$. This is a total order.
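To make the ordering concrete, a monomial $X^I$ can be represented by its exponent vector $I$. The sketch below relies on the fact that Python compares tuples lexicographically, which matches the ordering just defined; the list of sample monomials is arbitrary.

```python
# Exponent vectors I in N^3 standing for monomials X^I in three variables.
monomials = [(1, 1, 1), (2, 1, 0), (0, 3, 0), (2, 0, 1), (1, 2, 0)]

def total_degree(exponents):
    """Total degree of the monomial with the given exponent vector."""
    return sum(exponents)

# Tuple comparison is lexicographic: compare first entries, then second, ...
largest_first = sorted(monomials, reverse=True)
leading = max(monomials)
```

Here all five monomials have total degree 3, and the largest one is $X_1^2 X_2$, represented by `(2, 1, 0)`.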

Theorem. Every symmetric polynomial in $n$ variables over a ring $R$ can be expressed as a
polynomial in the $s_r$ for $1 \leq r \leq n$, with coefficients in $R$. Further, there are no non-trivial
relations between the $s_r$.

Remark. Consider the ring homomorphism $\theta \colon R[Y_1, \dots, Y_n] \to R[X_1, \dots, X_n]$ given by
$\theta(Y_r) = s_r$ and $\theta(a) = a$ for $a \in R$. The first part of the above theorem states that $\operatorname{Im} \theta$ is
the set of symmetric polynomials. The second part states that $\theta$ is injective, since any element of
$\ker \theta$ is a polynomial relation between the $s_r$.
Note that we can equivalently define the $s_r$ by
\[ \prod_{i=1}^n (T + X_i) = T^n + s_1 T^{n-1} + \dots + s_{n-1} T + s_n \]

If we need to specify the number of variables, we use $s_{r,n}$ instead of $s_r$.
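The agreement of the two descriptions of the $s_r$ can be checked at a sample point. In the sketch below, the function names and the point $(2, 3, 5)$ are illustrative choices; one function sums products over subsets as in the definition, and the other expands $\prod (T + x_i)$ one factor at a time.

```python
from itertools import combinations
from math import prod

def elementary_by_definition(r, xs):
    """Sum of products x_{i_1} ... x_{i_r} over all i_1 < ... < i_r."""
    return sum(prod(subset) for subset in combinations(xs, r))

def elementary_by_product(xs):
    """Coefficients of prod (T + x_i), highest degree first: [1, s_1, ..., s_n]."""
    coeffs = [1]
    for x in xs:
        times_t = coeffs + [0]                    # running product times T
        times_x = [0] + [x * a for a in coeffs]   # running product times x
        coeffs = [a + b for a, b in zip(times_t, times_x)]
    return coeffs

point = (2, 3, 5)
from_product = elementary_by_product(point)
from_definition = [elementary_by_definition(r, point) for r in range(len(point) + 1)]
```

At this point both lists equal `[1, 10, 31, 30]`, corresponding to $(T + 2)(T + 3)(T + 5) = T^3 + 10T^2 + 31T + 30$.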

Proof. Let $d$ be the total degree of a symmetric polynomial $f$. Let $X^I$ be the largest (in lexicographic
order) monomial which occurs in $f$, with coefficient $c \neq 0$. Since $f$ is symmetric, any permutation of
the $X_i$ applied to $X^I$ yields another monomial that occurs in $f$. Hence, $I_1 \geq I_2 \geq \dots \geq I_n$,
because otherwise the rearranged monomial that satisfies this would be a strictly larger monomial in $f$.
We can therefore write
\[ X^I = X_1^{I_1 - I_2} (X_1 X_2)^{I_2 - I_3} \cdots (X_1 \cdots X_n)^{I_n} \]
Consider
\[ g = s_1^{I_1 - I_2} s_2^{I_2 - I_3} \cdots s_n^{I_n} \]
By construction, the largest monomial in $g$ is $X^I$, with coefficient 1. Since $g$ is symmetric, $f - cg$ is
symmetric. By induction, we may assume $f - cg$ is expressible as a polynomial in the $s_r$, as it has total
degree no larger than $d$, its leading monomial is smaller than $X^I$, and there are only finitely
many monomials of degree at most $d$. Hence $f = cg + (f - cg)$ is also expressible as a polynomial in the
$s_r$, as required.

Now we prove uniqueness by induction on $n$. Let $G \in R[Y_1, \dots, Y_n]$ be such that
$G(s_{1,n}, \dots, s_{n,n}) = 0$. We want to show that $G$ is the zero polynomial. If $n = 1$, the result is
trivial as $s_{1,1} = X_1$. Suppose $G \neq 0$. If $G = Y_n^k H$ with $Y_n$ not dividing $H$, then
$s_{n,n}^k H(s_{1,n}, \dots, s_{n,n}) = 0$. Since $s_{n,n} = X_1 \cdots X_n$, it is not a zero divisor in
$R[X_1, \dots, X_n]$. Hence $H(s_{1,n}, \dots, s_{n,n}) = 0$. So, without loss of generality, we can assume that
$G$ is not divisible by $Y_n$. Now, replacing $X_n$ with zero, $s_{k,n}$ is mapped to $s_{k,n-1}$ for $k \neq n$,
and $s_{n,n}$ is mapped to zero. Hence, $G(s_{1,n-1}, \dots, s_{n-1,n-1}, 0) = 0$. By induction,
$G(Y_1, \dots, Y_{n-1}, 0) = 0$. Hence $Y_n \mid G$, contradicting our assumption.

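Example. Consider, for $n \geq 3$,
\[ f = \sum_{i \neq j} X_i^2 X_j \]
The leading term is $X_1^2 X_2 = X_1 (X_1 X_2)$, so we consider
\begin{align*}
f - s_1 s_2 &= \sum_{i \neq j} X_i^2 X_j - \left( \sum_i X_i \right) \left( \sum_{j < k} X_j X_k \right) \\
&= \sum_{i \neq j} X_i^2 X_j - \left( \sum_{i \neq j} X_i^2 X_j + 3 \sum_{i < j < k} X_i X_j X_k \right) \\
&= -3 \sum_{i < j < k} X_i X_j X_k \\
&= -3 s_3
\end{align*}
Hence $f = s_1 s_2 - 3 s_3$.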
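The reduction used in the existence proof is an algorithm, and the example can be reproduced by running it. The following is a rough sketch rather than an optimised implementation: polynomials are dictionaries from exponent vectors to integer coefficients, and all function names are illustrative. The output maps exponent vectors $(e_1, \dots, e_n)$ of $s_1^{e_1} \cdots s_n^{e_n}$ to coefficients.

```python
from itertools import combinations, permutations

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            m = tuple(x + y for x, y in zip(a, b))
            out[m] = out.get(m, 0) + ca * cb
    return {m: c for m, c in out.items() if c != 0}

def elementary(r, n):
    """The elementary symmetric polynomial s_r in n variables."""
    return {
        tuple(1 if i in chosen else 0 for i in range(n)): 1
        for chosen in combinations(range(n), r)
    }

def to_elementary(f, n):
    """Express a symmetric polynomial f in n variables in terms of s_1..s_n."""
    f = {m: c for m, c in f.items() if c != 0}
    result = {}
    while f:
        lead = max(f)   # tuples compare lexicographically
        c = f[lead]
        exps = tuple(lead[k] - lead[k + 1] for k in range(n - 1)) + (lead[n - 1],)
        if min(exps) < 0:
            raise ValueError("input is not symmetric")
        # g = s_1^(I_1 - I_2) * s_2^(I_2 - I_3) * ... * s_n^(I_n)
        g = {(0,) * n: 1}
        for r, e in enumerate(exps, start=1):
            for _ in range(e):
                g = poly_mul(g, elementary(r, n))
        # Replace f by f - c*g, which has a strictly smaller leading monomial.
        for m, cg in g.items():
            new = f.get(m, 0) - c * cg
            if new:
                f[m] = new
            else:
                f.pop(m, None)
        result[exps] = c
    return result

# f = sum over i != j of X_i^2 X_j in three variables.
example = {m: 1 for m in set(permutations((2, 1, 0)))}
expression = to_elementary(example, 3)
```

For the example, `expression` is `{(1, 1, 0): 1, (0, 0, 1): -3}`, which encodes $s_1 s_2 - 3 s_3$.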

Consider $f = p_5 = \sum_i X_i^5$. Computing this in terms of elementary symmetric polynomials by
hand is somewhat tedious, but there are various formulae, known as Newton's formulae, which
can help in simplifying such expressions.

Theorem (Newton's formulae). Let $n \geq 1$. Then for all $k \geq 1$,

\[ p_k - s_1 p_{k-1} + \dots + (-1)^{k-1} s_{k-1} p_1 + (-1)^k k s_k = 0 \]

By convention, let $s_0 = 1$ and $s_r = 0$ if $r > n$.
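Before the proof, the identity can be sanity-checked at an integer point. The sketch below evaluates the left-hand side for $k = 1, \dots, 7$ with $n = 4$; the sample point and the function names are arbitrary choices, and the convention $s_r = 0$ for $r > n$ is handled by an empty sum.

```python
from itertools import combinations
from math import prod

def elementary_value(r, xs):
    """Value of s_r at xs; the sum is empty (zero) when r exceeds len(xs)."""
    return sum(prod(subset) for subset in combinations(xs, r))

def power_sum(k, xs):
    """Value of p_k = x_1^k + ... + x_n^k at xs."""
    return sum(x ** k for x in xs)

def newton_lhs(k, xs):
    """p_k - s_1 p_{k-1} + ... + (-1)^(k-1) s_{k-1} p_1 + (-1)^k k s_k."""
    alternating = sum(
        (-1) ** r * elementary_value(r, xs) * power_sum(k - r, xs)
        for r in range(k)
    )
    return alternating + (-1) ** k * k * elementary_value(k, xs)

sample_point = (2, -1, 3, 5)   # arbitrary integers, n = 4
residuals = [newton_lhs(k, sample_point) for k in range(1, 8)]
```

Every entry of `residuals` is zero, including those with $k > n$.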

Proof. It suffices to consider $R = \mathbb{Z}$ (or, for example, $R = \mathbb{R}$). Consider the generating function

\[ F(T) = \prod_{i=1}^n (1 - X_i T) = \sum_{r=0}^n (-1)^r s_r T^r \]

Note that for polynomials $f, g$ in $T$, their formal derivatives satisfy

\[ \frac{\frac{d}{dT}(fg)}{fg} = \frac{f'g + fg'}{fg} = \frac{f'}{f} + \frac{g'}{g} \]

Then, taking the logarithmic derivative with respect to $T$,

\begin{align*}
\frac{F'(T)}{F(T)} &= \frac{\frac{d}{dT} \prod_{i=1}^n (1 - X_i T)}{\prod_{i=1}^n (1 - X_i T)} \\
&= \sum_{i=1}^n \frac{\frac{d}{dT}(1 - X_i T)}{1 - X_i T} \\
&= -\sum_{i=1}^n \frac{X_i}{1 - X_i T} \\
&= -\frac{1}{T} \sum_{i=1}^n \sum_{r=1}^\infty X_i^r T^r \\
&= -\frac{1}{T} \sum_{r=1}^\infty p_r T^r
\end{align*}

Hence,
\[ -TF'(T) = s_1 T - 2 s_2 T^2 + \dots + (-1)^{n-1} n s_n T^n \]
but also
\[ -TF'(T) = F(T) \sum_{r=1}^\infty p_r T^r = \left( s_0 - s_1 T + \dots + (-1)^n s_n T^n \right) \left( p_1 T + p_2 T^2 + \cdots \right) \]

Equating the coefficients of powers of $T$, we find the identity as required by the theorem.

Example. The discriminant polynomial is

\[ D(X_1, \dots, X_n) = \Delta(X_1, \dots, X_n)^2 \]

where
\[ \Delta(X_1, \dots, X_n) = \prod_{i < j} (X_i - X_j) \]

This is used in defining the sign of a permutation: applying a permutation $\sigma$ to $\Delta$ multiplies $\Delta$
by the sign of $\sigma$. Hence $D$ is symmetric. Therefore, $D$ can be written in terms of the elementary
symmetric polynomials:
\[ D(X_1, \dots, X_n) = d(s_1, \dots, s_n) \]
where $d$ has integer coefficients. For example, $n = 2$ gives $D = (X_1 - X_2)^2 = s_1^2 - 4s_2$.

Definition. Let $f = T^n + \sum_{i=0}^{n-1} a_{n-i} T^i$ be a monic polynomial in $R[T]$. Its discriminant is

\[ \operatorname{Disc}(f) = d(-a_1, a_2, -a_3, \dots, (-1)^n a_n) \in R \]

Observe that if $f$ is a product of linear polynomials $f = \prod_{i=1}^n (T - x_i)$, then
$a_r = (-1)^r s_r(x_1, \dots, x_n)$, giving $\operatorname{Disc}(f) = \prod_{i<j} (x_i - x_j)^2 = D(x_1, \dots, x_n)$. In
particular, if $R = K$ is a field, $\operatorname{Disc}(f) = 0$ if and only if $f$ has a repeated root. For example,
$\operatorname{Disc}(T^2 + bT + c) = b^2 - 4c$.
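The quadratic case can be checked directly from the roots. In the sketch below, the function name and the sample roots are illustrative; the coefficients $b$ and $c$ are recovered from $T^2 + bT + c = (T - x_1)(T - x_2)$.

```python
from itertools import combinations
from math import prod

def discriminant_from_roots(roots):
    """Product of (x_i - x_j)^2 over all pairs i < j."""
    return prod((x - y) ** 2 for x, y in combinations(roots, 2))

def quadratic_coefficients(x1, x2):
    """Coefficients (b, c) of T^2 + bT + c = (T - x1)(T - x2)."""
    return -(x1 + x2), x1 * x2

distinct_roots = (3, -5)
b, c = quadratic_coefficients(*distinct_roots)
disc_distinct = discriminant_from_roots(distinct_roots)

repeated_roots = (4, 4)
b_rep, c_rep = quadratic_coefficients(*repeated_roots)
disc_repeated = discriminant_from_roots(repeated_roots)
```

For the distinct roots, `disc_distinct` equals `b**2 - 4*c`, which is 64; for the repeated root, both `disc_repeated` and `b_rep**2 - 4*c_rep` vanish.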
