Bamberg Lectures
FUNDAMENTAL CONCEPTS IN MATHEMATICS
Contents
Preface
5 Metric Spaces
5.1 Limit of a function and limit of a sequence are really the same thing
5.2 A good definition?
5.3 Metrics
5.4 Boundedness
5.5 Cauchy sequences
5.6 Exercises
6 Compactness
6.1 From finiteness to compactness
6.2 Covers and the definition of compact sets
6.3 Closed, bounded and compact
6.4 Aside: Brouwer’s fixed-point theorem
6.5 Exercises
8 Appendix
9 Index
Preface
Warning: These notes are by no means complete. They are skeletal, and it is up to the student to fill them in with examples
and proofs. This can be achieved either by reading a variety of good books that can be found in the library, or by
attending lectures. These notes may not give you insight into how the writer thought about the proofs or came up
with them. That level of insight will be conveyed in lectures.
Theorems, lemmas and corollaries in this course can roughly be categorised into the following, and we will
adjoin each result with the respective symbol:
?: The proof is beyond the expected skill level of the student if the student were asked to create a proof without
ever having read a proof of the result before. However, if the student can recall the “gist” of the proof, or the
trick involved, then the student should be able to do it.
†: The proof is beyond the expected skill level of the student if the student were asked to create a proof without
ever having read a proof of the result before. The proof is long, tedious or very complex and intricate. The lecturer
does not think that being able to accomplish a proof such as this has any value to the student’s development.
: The proof of this result is sufficiently simple that by the end of the course, the student would be expected to be
able to prove this result or one of a similar complexity, having never seen it before.
Acknowledgements
I would like to thank Murray Smith for reading drafts of these notes.
8 john bamberg
Learning outcomes
Proofs
This is really the first course in our undergraduate program where the
student learns how to write proofs in a variety of situations. At the very
least, the student should be able to do the auto-pilot component of a proof.
By this, I mean that the student should know what the necessary steps are
in a proof regardless of the difficult and ingenious step that some proofs
require. For example, if the student is asked to prove that a function f :
A → B is one-to-one, then they must begin the proof with a line like
“Suppose we have two elements a1 , a2 ∈ A such that f (a1 ) = f (a2 ). We
will show that a1 = a2 ”.
Philosophy of mathematics
From the very beginning of the course, the student will be exposed to the
philosophical edge of mathematics. Are there different types of infinities?
What does 0.999 · · · mean? There is no better way for a student to get
a thorough training in critical thinking than to embark on the theory of
cardinalities and number systems, and to question the foundations of their
subject. The student should know by the end of the course how we can
construct the various number systems (e.g., rationals, reals) from the natural
numbers, and why this was an important contribution to mathematics over
a century ago. We come across diagonal arguments, and non-constructive
proofs, and the student will have learnt how to picture difficult abstract
ideas by analogy to concrete examples.
units. The student must have a very good knowledge of the integers modulo
n and of polynomial rings. These are the basic structures of ring theory.
In this chapter we cover relations, basic set theory terminology, Russell’s Paradox, cartesian product of sets,
definition of a function, one-to-one, onto, and an introduction to writing proofs.
For two finite sets, it is straightforward to work out which set is bigger:
you simply count the elements of each set. What about infinite sets? Are
there some infinite sets that are bigger than other infinite sets? Does it make
sense to talk about infinity in the first instance?
twee, vier, zes, acht, tien, twaalf, veertien, zestien, achttien, twintig, . . . .
You don’t know the language and so you guess that this person is just
counting:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, . . . .
Then it is revealed to you that the language Grietje speaks is Dutch and that
she was counting the even numbers.
Question 1.1.2. How big is the set of even whole numbers? Is it smaller
than the set of whole numbers? If we count the even numbers in a different
language, do we see any difference between the even numbers and the
whole set of numbers?
We will see that mathematics makes the answer to this question clear
and concrete by giving a definition of what the size of a set is. It turns out
that the even numbers have precisely the same size as the integers, that
the rational numbers have the same size as the integers, and that the real
numbers are definitely bigger than the integers.
by association, yet did we really know what they were? What is the cosine
of a real number? Next we learn that a function is a rule (footnote 1: What is a rule?) that assigns one
value for every input: there cannot be two values for the same input.
Example 1.2.1. The equation f(x)^2 = 1 has two solutions: f(x) = 1 and
The relations we will encounter in this course deal with the interactions of
two things; they will be binary relations. The archetypical example will
be that of friendship on the set of human beings. Given two people, we
can establish whether they are friends or not. The reason to abstract and
clarify the idea of a relation is simply because they are so fundamental in
mathematics and occur so often. For example, we could say that a number
is less than another number by
x<y
or we could say that golden syrup (i.e. y) tastes better than honey (i.e.
x). Notice that we mostly use infix notation for relations. (Footnote 3: By “infix notation” we mean that the symbol
of interest goes between its arguments. So when we write that two things x and y
are equal, we demonstrate this by x = y.) A function is a relation too! When we write
f(x) := x^2
x f x^2.
This looks very weird, and in the case of functions, we do not really use
this notation in practice. Now a function f from X to Y is a relation, where
the left-hand things are in X and the right-hand things are in Y so that for
every element x of X there is a unique element of Y which is f-related to x.
(Footnote 4: A more formal definition will come later.)
So what is a relation anyway?
Well, it is nothing more than describing a pair of things, each having a left-
hand one and a right-hand one. The key device in making this idea concrete
is the notion of the Cartesian product of two sets. Given two sets A and
B, the Cartesian product (footnote 5) of A and B is the set of all (ordered) pairs (a, b)
with a ∈ A and b ∈ B. We could write this set as
A × B := {(a, b) : a ∈ A, b ∈ B}.
(Footnote 5: One of the axioms of Zermelo-Fraenkel Set Theory, the most accepted set of
axioms for the foundations of mathematics, is called the Axiom of Pairing, which allows
us to ‘make’ cartesian products of sets. In other words, making pairs is one of the
things in mathematics that is necessary to establish as a fundamental construct,
and it does not arise from any simpler construct.)
We have come to accept defining sets by a property (or for a fancier word,
predicate). So for example,
{x ∈ R : x^2 + 3x + 1 = 0}.
This set consists of two elements, −3/2 + √5/2 and −3/2 − √5/2, and it might be a
different set if we allow x to be more than a real number (like matrices, for
example). But does it make sense to define a set of all sets of size 10?
{A : |A| = 10}.
Example 1.5.1 (The set of non-penguins). Let S be the set of all things
which are not penguins. Since S is not a penguin, it is an element of itself:
S ∈ S.
Is A a member of itself?
If A is a member of itself, it would contradict its own definition as a set
containing all sets that are not members of themselves. On the other hand,
if A is not a member of itself, it would satisfy its predicate and hence be a
member of itself!
get used to. Some of the fun of doing a proof comes from the completion of
a water-tight and elegant proof. However, for most of the formative period,
proofs can be frustrating and not enjoyable at all. So we will keep things
simple. For at least half of this course we will throw away what I believe
is one of the greatest inhibitions to the enjoyment of proofs: the syntax.
Syntax, though important in writing clear and elegant proofs, can take a
while to master. It often inhibits the new learner in understanding how a
proof works and how to recognise whether it is correct or not. In particular,
the emphasis early in one’s learning ought to be focussed on the logic of a
proof, not whether the right expression was used in conveying an inference
or deduction.
Temporary rules:
• You can use “Choose ...” if you need to show that something exists.
• Give a reason for every logical deduction, no matter how trivial it is.
Example 1.6.1. We will prove that the sum of two odd numbers is an even
number.
f ( A) := { f (a) : a ∈ A}.
Example 1.7.3.
• The function f defined by f(x) = x^2 is not one-to-one since f(−1) =
f(1) but 1 ≠ −1.
• The “blood type” function is not one-to-one. (Margin note: The blood-type function has domain the set
of all people, and codomain {A, B, AB, O}. Never mind the rhesus for this example!)
• Pictorially, a function f on the reals is one-to-one if every horizontal
line drawn on the graph of f intersects the graph at most once. You
should picture “f(x) = x^2” and imagine drawing horizontal lines on
the graph.
Definition 1.7.5 (Onto (surjective)). A function f : X → Y is onto, if for
each element y ∈ Y, there is at least one element x ∈ X such that f(x) = y.
Equivalently: A function f : X → Y is onto if the image f(X) equals the entire set Y.
• The “blood-type” function is onto, since for every blood type, there is at
least one person in the world who has that blood type.
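For functions on small finite sets, the two definitions can be checked directly. The following is a minimal sketch, not part of the original notes; the helper names `is_one_to_one` and `is_onto` are made up for illustration, and the domain is assumed to be a list of distinct elements.

```python
def is_one_to_one(f, domain):
    """True if no two distinct elements of the domain share an output."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, codomain):
    """True if the image f(domain) equals the whole codomain."""
    return {f(x) for x in domain} == set(codomain)

square = lambda x: x * x
domain = [-2, -1, 0, 1, 2]

print(is_one_to_one(square, domain))          # False, since square(-1) == square(1)
print(is_onto(square, domain, [0, 1, 4]))     # True: the image is exactly {0, 1, 4}
print(is_onto(square, domain, [0, 1, 2, 4]))  # False: nothing maps to 2
```

Note that whether a function is onto depends on the codomain chosen, which is why the same rule gives different answers in the last two lines.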
Remark 1.7.8. In the above proof, the step “Choose x = y2 − 1” was only
possible after doing some work on the back of an envelope first. I worked
backwards to find an element x ∈ R\{0} such that f ( x) = y, and then wrote
my proof in the logically correct direction!
1.7.3 Counting
Counting is one of the most central ideas of mathematics, and it wasn’t until
Cantor’s work in the 19th century that we began to understand fully what
it means to count. To say that one set has more elements than another is a
trivial problem in the finite context, but what about infinite sets? Are there
more integers than even numbers? What Cantor realised is that counting
can be thought of as pairing elements in a unique and exhaustive way. For
example, we can pair up the integers and even numbers in the following
way:
. . . , (−6, −3), (−4, −2), (−2, −1), (0, 0), (2, 1), (4, 2), (6, 3), . . . .
You can see here that every even number will appear in the first coordinate
in precisely one of these pairs and every integer will appear in the second
coordinate in just one of these pairs. So what we require is that there is a
function from the even numbers to the integers that is one-to-one and onto.
We call this a bijection.
f(x) := (1/π) arctan(x) + 1/2.
It can be shown that f is one-to-one and onto, and so R and (0, 1) have the
same size!
Figure 1.4: (1/π) arctan(x) + 1/2.
To prove two functions f : X → Y and g : X → Y are equal:
Let x ∈ X. Show that f(x) = g(x). A common mistake made by
students is that they prove that two functions are equal for a specific
element of X. You must prove that they compute the same value for
EVERY element of the domain!
Example 1.7.13.
{ f (a) : a ∈ A}
of Y.
f ← (S ) := {x ∈ X : f ( x) ∈ S }.
In order for the notation f^(−1) to make sense, there has to be just one
inverse of an invertible function. We will assume without proof that the
operation of function composition is associative (footnote 9).
(Footnote 9: Jargon: A binary operation ⋆ is . . .
• associative if the following always holds: a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c.
• commutative if the following always holds: a ⋆ b = b ⋆ a.)
Lemma 1.7.18. Suppose a function f : X → Y is invertible. Then there is a
unique inverse of f.
Proof (restricted syntax): Suppose there are two inverses g : Y → X and
h : Y → X for f. We will show that h = g.
Then h ◦ (f ◦ g) = (h ◦ f) ◦ g since composition is associative.
Then h ◦ idY = idX ◦ g by definition of inverse.
Then h = g since composition with the identity function returns the original function.
Therefore, f has a unique inverse.
Example 1.7.19.
• The inverse of the identity function is itself, since for any set X, idX ◦
idX = idX .
Theorem 1.7.20. A function f : X → Y is invertible if and only if it is
bijective.
Proof: See the exercises at the end of the chapter.
Earlier we saw that the set of even numbers has the same size as the entire
set of numbers, and to see this, we showed that there was a bijection between
these two sets. If we can embed a set S inside the natural numbers,
like we did for even numbers, then we say that S is countable (footnote 10).
(Footnote 10: We will use a definition that includes finite sets, which differs from some texts
such as Martin Liebeck’s book “A concise introduction to pure mathematics”.)
Definition 1.8.1 (Countability). A set X is countable if there is a one-to-one
function f : X → N. A set is uncountable if it is not countable.
? Sketch: Define the following function f from N to Z:
f(n) := n/2 if n is even,
f(n) := (−n + 1)/2 otherwise.
(Margin note: A useful picture of how we ‘count’ Z: ... −3 −2 −1 0 1 2 3 ...)
1/1  →  1/2     1/3  →  1/4   ...
     ↙       ↗       ↙       ↗
2/1     2/2     2/3     2/4   ...
 ↓   ↗       ↙       ↗
3/1     3/2     3/3     3/4   ...
     ↙       ↗
4/1  →  4/2     4/3     4/4   ...
 ⋮       ⋮       ⋮       ⋮
Here is a (silly) game we could play that would take a long time to com-
plete. I give you a rational number in the interval (0, 1), and then you give
me a different rational number in this interval. Then we continue turn by
turn, writing down rational numbers in (0, 1) that we have not mentioned
in earlier turns. The person who cannot think of a new rational number or
repeats one that has already been said, loses.
The answer is of course yes, but the most elegant answer is to use a
diagonal argument. Here’s how it works. Suppose the following ten moves
have been made in the game, accurate to ten decimal places:
0.0125000000
0.1000000000
0.3467891023
0.0041200340
0.9654102948
0.8475839102
0.0194857291
0.3240580293
0.3333333333
0.2121212121
Now look along the diagonal, after the decimal point: the first digit of the
first number, the second digit of the second number, and so on. If the diagonal
digit is not equal to 3, then write the digit 3 next to the row. Otherwise,
write “7”.
0.0125000000 3
0.1000000000 3
0.3467891023 3
0.0041200340 3
0.9654102948 3
0.8475839102 7
0.0194857291 3
0.3240580293 3
0.3333333333 7
0.2121212121 3
Then the new number 0.3333373373 must differ from the first number in
the first place, from the second number in the second place, and so on! So
we are guaranteed of obtaining a new number!
We can use this argument to show that the real numbers cannot be placed
in bijection with the natural numbers.
Why does this argument not work for rational numbers? Well, the di-
agonal argument does not produce a rational number since the decimal
expression would be forced to be non-periodic.
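The diagonal construction on the ten moves above can be sketched in a few lines. This is an illustration only; the function name `diagonal_number` is made up, and the moves are stored as strings so that the digits can be read off exactly.

```python
def diagonal_number(rows):
    """Build a decimal that differs from the k-th row in its k-th decimal digit."""
    digits = []
    for k, row in enumerate(rows):
        d = row.split(".")[1][k]            # k-th digit after the decimal point
        digits.append("7" if d == "3" else "3")
    return "0." + "".join(digits)

moves = [
    "0.0125000000", "0.1000000000", "0.3467891023", "0.0041200340",
    "0.9654102948", "0.8475839102", "0.0194857291", "0.3240580293",
    "0.3333333333", "0.2121212121",
]

new = diagonal_number(moves)
print(new)           # 0.3333373373
print(new in moves)  # False: it differs from every row somewhere on the diagonal
```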
∅, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {2, 3, 4}, {1, 3, 4}, {1, 2, 4}, {1, 2, 3, 4}.
Notice that this set is bigger than {1, 2, 3, 4}. We will see that this is also
true for infinite sets.
(i) If A ⊆ B, then A ≼ B.
(iii) A ≼ B or B ≼ A.
Proof:
So how many subsets are there of a finite set of size n? Well, each subset
defines a characteristic n-tuple where we just write out the values of the
characteristic function. For example, for {2, 3} as a subset of {1, 2, 3, 4}, we
just write: (0, 1, 1, 0). The subsets are then in one-to-one correspondence
with the n-tuples whose entries have values either 0 or 1. So there
are 2^n subsets of a set of size n.
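The correspondence between subsets and 0/1 tuples can be made concrete as follows. This is a small sketch rather than anything from the notes; the generator name `subsets` is hypothetical.

```python
from itertools import product

def subsets(elements):
    """Yield (characteristic tuple, subset) pairs, one for each tuple of 0s and 1s."""
    for bits in product([0, 1], repeat=len(elements)):
        yield bits, {e for e, b in zip(elements, bits) if b == 1}

universe = [1, 2, 3, 4]
all_subsets = list(subsets(universe))

print(len(all_subsets))                      # 16, which is 2**4
print(((0, 1, 1, 0), {2, 3}) in all_subsets)  # True: the tuple for the subset {2, 3}
```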
Definition 1.9.8 (Power set). For a set A, the power set P(A) of A is the set
of all subsets of A.
We must be sure that we do not allow the
“set of all sets” to be a set here.
Theorem 1.9.9. For any set A, we have A ≺ P( A).
? Proof: First we show that A ≼ P(A), and then we will show that |A| ≠
|P(A)|. Let f : A → P(A) be the function defined by
f (a) := {a}.
X = {a ∈ A : a ∉ f(a)}.
A
a1 a2 a3 a4 a5 a6 a7 ···
A1 T T F T T F F ···
A2 F F T F T T F ···
A3 T F F F T F T ···
P( A) A4 T T F T T F F ···
A5 F F F F T F F ···
A6 F F F F F F T ···
A7 F T T F F F F ···
..
.
The set X we chose in the proof is what you get when you ‘flip’ the truth
values along the diagonal of this array.
A
a1 a2 a3 a4 a5 a6 a7 ··· X
A1 T T F T T F F ··· F
A2 F F T F T T F ··· T
A3 T F F F T F T ··· T
P( A) A4 T T F T T F F ··· F
A5 F F F F T F F ··· F
A6 F F F F F F T ··· T
A7 F T T F F F F ··· T
..
.
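For a small finite set the diagonal set can be tested against every possible function. The sketch below assumes A = {1, 2, 3} purely for illustration and checks all 512 functions from A to P(A); the name `diagonal_set` is made up.

```python
from itertools import combinations, product

A = [1, 2, 3]
power_set = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def diagonal_set(f):
    """The set X = {a in A : a not in f(a)}, for a function f given as a dict."""
    return frozenset(a for a in A if a not in f[a])

# Try every function f : A -> P(A); there are 8**3 = 512 of them.
missed_every_time = all(
    diagonal_set(dict(zip(A, images))) not in images
    for images in product(power_set, repeat=len(A))
)

print(len(power_set))      # 8
print(missed_every_time)   # True: X is never in the image of f, so no f is onto
```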
† Proof: A sketch of the proof might be done in lectures (if there is suffi-
cient time).
where
ai := 1 if i ∈ X, and ai := 0 otherwise.
4. S is one-to-one.
(a) 1 ∈ R, and
(b) x ∈ R =⇒ S ( x) ∈ R,
then R = N.
• Then 2 is defined to be the set containing 0 and 1, that is, 2 := {∅, {∅}}.
(i) x + 1 := S ( x)
1.11 Exercises
Exercise 1.11.1. In Google, the three operations of Boolean logic work like
this:
NOT Use the minus symbol −, e.g., "Poffertjes recipe" -"sour cream"
You can also use brackets to group clauses, and AND has natural precedence over OR.
(b) Write a Google expression with the terms "drink coffee" and
"less likely to develop dementia" which represents the fol-
lowing implication:
“If you drink coffee, then you are less likely to develop dementia.”
Exercise 1.11.3. Suppose A and B are sets. Give your answers in the nota-
tion of logic (not English).
f ( x) := 3x + 1 (mod 5)
1. Show that f ← ( B1 ∪ B2 ) = f ← ( B1 ) ∪ f ← ( B2 ).
2. Show that f ← ( B1 ∩ B2 ) = f ← ( B1 ) ∩ f ← ( B2 ).
This chapter covers elementary number theory and its analogues in the general theory of rings. We first look at
the “Division Rule” for the integers, the greatest common divisor of two numbers, and the Euclidean Algorithm. We
then look at the prime numbers, the atoms of the integers. This basic number theory leads to a fundamental number
system in abstract algebra, the integers modulo m, otherwise known as “clock-arithmetic”.
2.1 Divisibility
Let x and y be two integers. We say that x divides y (footnote 1) if there exists another
integer q such that
y = qx.
We use the “long bar” notation to denote this relation:
x | y.
(Footnote 1: Synonyms:
• x divides y
• y is a multiple of x
• x is a factor of y)
• The number 1 divides any nonzero integer and an integer x divides itself,
both because x = 1 · x.
Notice that in the above proof, the definition of “divides” had to be the
first deduction we made since it was all that we knew, and we reapplied the
definition of “divides” in the last deduction we made. So it is almost always
true that a mathematician, when devising a proof, constantly reflects on the
information known at each step, though being careful not to assume more
than what is allowed. At the same time, the mathematician keeps an eye on
the prize – the conclusion – and has a feeling of what the final steps need to
be.
Example 2.1.3. The set of even numbers 2N is closed under addition since
if we add two even numbers, the result is an even number. However, the set
of odd numbers is not closed under addition since 3 + 5 = 8.
Here are some operations on some familiar sets and we summarise when
the given set is closed under a particular operation.
Lemma 2.1.4 (The Division Rule). Let a be a positive integer and let b be
an integer. Then there are unique integers q and r such that
b = qa + r and 0 ≤ r < a.
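The Division Rule can be tried out numerically. This is a minimal sketch, with the hypothetical name `division_rule`; it relies on the fact that Python's built-in `divmod` uses floor division, so the remainder lies in the range 0 ≤ r < a whenever a is positive, even for negative b.

```python
def division_rule(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < a, for a positive integer a."""
    if a <= 0:
        raise ValueError("a must be a positive integer")
    q, r = divmod(b, a)
    return q, r

print(division_rule(234, 180))  # (1, 54)
print(division_rule(-7, 3))     # (-3, 2), since -7 == (-3)*3 + 2
```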
Example 2.1.6.
• gcd(−12, 18) = 6.
One way to work out the gcd of 234 and 180, say, is to divide the smaller
number into the larger one, take its remainder (which is 54) and then notice
that gcd(234, 180) = gcd(54, 180). The next result is the basis for the next
section.
(ii) Write 234 and 180 in the right column, the largest 234 first.
234
180
(iii) Divide the smaller number 180 into the larger 234, and write the
quotient in the left column adjacent to 180, and the remainder below
180.
234
1 180
54
(iv) Now move down one row and repeat the last step over and over until
we get a remainder of 0.
234
1 180
3 54
3 18
0
(v) The second-last number in the right column is the greatest common
divisor of 180 and 234, namely 18.
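The table procedure above can be sketched as a loop. This is an illustration under the assumption that both inputs are positive integers; `euclid_table` is a made-up name, and each returned row holds the quotient, the number divided by, and the remainder.

```python
def euclid_table(a, b):
    """Return (gcd, rows); each row is (quotient, divisor, remainder) for one step."""
    big, small = max(a, b), min(a, b)
    rows = []
    while small != 0:
        q, r = divmod(big, small)
        rows.append((q, small, r))
        big, small = small, r       # move down one row of the table
    return big, rows

g, rows = euclid_table(234, 180)
print(g)     # 18
print(rows)  # [(1, 180, 54), (3, 54, 18), (3, 18, 0)]
```

The quotients 1, 3, 3 and the remainders 54, 18, 0 match the left and right columns of the table for 234 and 180.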
Example 2.1.9. Consider 558 and 423. Here is the table we get when we
do the Euclidean Algorithm:
558
1 423
3 135
7 18
2 9
0
Lemma 2.1.10 (Bézout’s identity). If a and b are nonzero, then there are
integers m and n such that
gcd(a, b) = ma + nb.
In fact, Bézout’s identity almost says that the greatest common divisor of
a and b is the smallest positive integer linear combination of a and b. We fill in the
details in what follows.
Corollary 2.1.11. Let a and b be two nonzero integers. Then an integer
x can be expressed as ma + nb for two integers m and n if and only if
gcd(a, b) divides x.
? Proof: We do the “ =⇒ ” direction first. Suppose that x can be ex-
pressed as ma + nb for some integers m and n. Let d be an integer that
divides both a and b. Then by transitivity of the divides relation (Lemma
2.1.2), d divides ma and d divides nb. So d divides their sum, and hence d
divides x. Therefore, gcd(a, b) must divide x, since by definition, gcd(a, b)
divides both a and b.
Now for the converse. Suppose gcd(a, b) divides x. So there is an inte-
ger c such that x = c · gcd(a, b). By Bézout’s identity (Lemma 2.1.10),
there exists a pair of integers m0 and n0 such that
gcd(a, b) = m0 a + n0 b.
Hence, x = (cm0 )a + (cn0 )b and the result follows by letting m = cm0 and
n = cn0 .
9 = 558m + 423n.
558 1 0
-1 423 0 1
-3 135
-7 18
-2 9
To fill out the third column, we take the product of the first and third
column in the previous row and add it to the previous value.
First step              Second step             Third step
      558   1   0             558   1   0             558   1   0
 -1   423   0   1        -1   423   0   1        -1   423   0   1
 -3   135   1            -3   135   1            -3   135   1
 -7    18                -7    18  -3            -7    18  -3
 -2     9                -2     9                -2     9  22
We just read off from the last row of the table that m = 22 and n = −29.
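The bookkeeping in the extra columns can be sketched as a loop that carries the coefficients along with each remainder. This is one common way to write the extended Euclidean Algorithm, not necessarily the layout used in lectures; the name `bezout` is hypothetical and positive inputs are assumed.

```python
def bezout(a, b):
    """Return (g, m, n) with g = gcd(a, b) = m*a + n*b, for positive integers a and b."""
    r0, m0, n0 = a, 1, 0    # invariant: r0 == m0*a + n0*b
    r1, m1, n1 = b, 0, 1    # invariant: r1 == m1*a + n1*b
    while r1 != 0:
        q = r0 // r1
        r0, m0, n0, r1, m1, n1 = r1, m1, n1, r0 - q * r1, m0 - q * m1, n0 - q * n1
    return r0, m0, n0

g, m, n = bezout(558, 423)
print(g, m, n)              # 9 22 -29
print(558 * m + 423 * n)    # 9
```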
Definition 2.1.13. Two nonzero integers are coprime (footnote 5) if their gcd is equal
to 1.
(Footnote 5: Often the term ‘relatively prime’ is used for the same concept.)
Example 2.1.14.
Corollary 2.1.15. Two nonzero integers a and b are coprime if and only if
there exist integers m and n such that
1 = ma + nb.
Part (i) of Corollary 2.1.16 is really a fact about the least common multi-
ple of two integers.
pi | N − ( N − 1) = 1
Now we will give the most important result of number theory, the “Fundamental
Theorem of Arithmetic” (footnote 6).
(Footnote 6: The Fundamental Theorem of Arithmetic was established at least by the Greeks
of the time of Euclid, since it appears in Volume VII of The Elements.)
Theorem 2.2.2 (The Fundamental Theorem of Arithmetic). Let n be an
integer with n ≥ 2. Then n can be written uniquely as a product of prime
numbers:
n = p1 p2 · · · pk , p1 ≤ p2 ≤ · · · ≤ pk .
So if n = q1 q2 · · · qℓ where the qi are prime numbers, and q1 ≤ q2 ≤ · · · ≤
qℓ , then k = ℓ and qi = pi for all i ∈ {1, . . . , k}.
Clearly P(2) is true as 2 is a prime itself. (Footnote 7: Here we stress that a product of a list
of integers L allows L to have just one element!) So suppose that, for some
integer k ≥ 2, P(j) is true for every integer j with 2 ≤ j ≤ k. If k + 1 is prime, then we are done, so suppose k + 1
is not prime and that we can write k + 1 = ab where 1 < a, b < k + 1. Now
by our inductive hypothesis, a is a product of primes and b is a product
of primes. So ab is a product of primes! Therefore, P(k + 1) is true, and
hence by (strong) Mathematical Induction, P(n) is true for every integer
n ≥ 2.
Uniqueness: Suppose we can write n two ways as a product of primes
n = p1 p2 · · · pk = q1 q2 · · · qℓ
where
{p1 , p2 , . . . , pt } ∩ {q1 , q2 , . . . , qu } = ∅. (2.1)
Now p1 divides both sides, and so p1 | q1 q2 · · · qu . So from Corollary 2.1.16,
we know that p1 must divide at least one of the qi , for some i.
This implies that p1 = qi , as qi is prime, which is a contradiction to (2.1).
Therefore, {p1 , p2 , . . . , pk } = {q1 , q2 , . . . , qℓ }.
Remark 2.2.4. The ai are allowed to be zero, though in this case, we do not
list the prime in order for the expression to be unique. Since n > 2, not all
of the ai can be zero.
A consequence of the Fundamental Theorem of Arithmetic is that it is
not difficult to calculate the greatest common divisor and least common
multiple of two integers.
Lemma 2.2.5. Let x and y be integers, both at least 2, and write them out
in their canonical factorisations:
Therefore,
LCM(x, y) = xy / gcd(x, y).
Proof: To be done in lectures.
Let π(n) be the number of primes that are less than or equal to n. For
example, π(20) = 8 as 2, 3, 5, 7, 11, 13, 17 and 19 are the prime numbers
less than 20. Another way to express Bertrand’s Postulate is by the
inequality
(∀n ≥ 2) π(2n) − π(n) ≥ 1.
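The prime counting function is easy to compute for small n, which gives a spot check of Bertrand’s Postulate (a check over a finite range, not a proof). The sketch below uses plain trial division; the names `is_prime` and `prime_count` are illustrative.

```python
def is_prime(n):
    """Trial division: True if n is a prime number."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_count(n):
    """pi(n): the number of primes less than or equal to n."""
    return sum(1 for k in range(2, n + 1) if is_prime(k))

print(prime_count(20))  # 8

# There is at least one prime p with n < p <= 2n, for every n tested.
bertrand_holds = all(prime_count(2 * n) - prime_count(n) >= 1 for n in range(2, 200))
print(bertrand_holds)   # True
```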
However, Chebyshev proved something much stronger:
(∀n > 5) (1/3) · n/log n < π(2n) − π(n) < (7/5) · n/log n.
(Margin note: Pafnuty Chebyshev (1821-1894) was one of Russian history’s brightest minds
and he made far reaching contributions to mathematics of the 19th century. His
ability to write fluently in French meant that he was in his words a “world-wide
mathematician”, and his work then bore influence during his own lifetime.)
At this point, we have given some indication that the prime numbers
occur rather frequently. Here is a simple result which shows that they are
also rare.
Remark 2.3.4. The prime number theorem says that π(n) grows asymptotically
like n/ log(n), but it says nothing about the difference π(n) −
n/ log(n).
These are the numbers which are of the form 3k + 1, where k ∈ N ∪ {0}.
We have highlighted in bold the prime numbers in this sequence, which is
known as an arithmetic progression. In general, if we take a and b to be
coprime (footnote 8: Why would you take a and b to be coprime? If they had a common factor,
would we have a sensible sequence in which to study the occurrence of primes?),
we can ask how many primes occur in the arithmetic progression
a, a + b, a + 2b, a + 3b, . . . .
a, a + b, a + 2b, a + 3b, . . .
† Proof: The proof of this result is beyond the scope of this course.
a   b   Sequence
3   2   3, 5, 7
3   4   3, 7, 11
5   6   5, 11, 17, 23
where k ∈ {0, 1, . . . , 25}. This sequence of primes has length 26 and the
distances between them are quite large compared to the sequences we had
above. Can we do better than a sequence of length 26? This question was
answered by Ben Green and Terence Tao (an Australian!; see footnote 9):
Theorem 2.3.6 (The Green-Tao Theorem). The primes contain arbitrarily
long arithmetic progressions.
(Footnote 9: Terry Tao was awarded a 2006 Fields Medal, the mathematical equivalent of a
Nobel Prize. He is the first Australian to have been awarded a Fields Medal.)
We saw in the previous section that there are infinitely many primes, but
it was not a constructive proof. That is, we did not create an explicit set of
infinitely many primes, we only proved that there cannot be finitely many
primes. Mersenne primes are the simplest set of many prime numbers that
we know of, and yet, we still do not know if they can produce infinitely
many primes.
It turns out (see the Exercises at the end of this chapter) that n must be a
prime number in order for 2^n − 1 to be prime. Not all numbers of this form
are prime though:
n 2^n − 1 Is prime?
2 3 Yes
3 7 Yes
5 31 Yes
7 127 Yes
11 2047 No, 2047 = 23 · 89
13 8191 Yes
17 131071 Yes
19 524287 Yes
23 8388607 No, 8388607 = 47 · 178481
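The table above can be reproduced with a few lines of trial division. This is a sketch for small exponents only (real Mersenne searches use far more specialised tests); `is_prime` and `mersenne_table` are hypothetical names.

```python
def is_prime(n):
    """Trial division: True if n is a prime number."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def mersenne_table(exponents):
    """Return rows (n, 2**n - 1, is it prime?) for each exponent n."""
    return [(n, 2 ** n - 1, is_prime(2 ** n - 1)) for n in exponents]

for n, value, prime in mersenne_table([2, 3, 5, 7, 11, 13, 17, 19, 23]):
    print(n, value, "Yes" if prime else "No")
```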
The Greek civilisation had a particular fascination with what they called
perfect numbers, and they knew of four perfect numbers at the time of
Nicomachus in 100AD. A number n is perfect if it is equal to the sum of its
proper divisors. For example, the proper divisors of 6 are 1, 2 and 3, and
6 = 1 + 2 + 3.
(Margin note: The largest known Mersenne prime was discovered in 2008 and it is 2^p − 1 where
p = 43,112,609. We currently know of 47 Mersenne primes. There is a world-wide
“Great Internet Mersenne Prime Search” (GIMPS) that uses the CPU power of
thousands of home users’ machines who have volunteered their desktop power to aid
in the search for the next Mersenne prime. See www.mersenne.org for more.)
Example 2.4.2 (Examples of perfect numbers).
Perfect number Proper divisors
6 1, 2, 3
28 1, 2, 4, 7, 14
496 1, 2, 4, 8, 16, 31, 62, 124, 248
8128 1, 2, 4, 8, 16, 32, 64, 127, 254, 508, 1016, 2032, 4064
In Euclid’s Elements (Book IX), it was proved that there is a direct
connection from Mersenne primes to perfect numbers.
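The definition of a perfect number can be checked by brute force, and the four examples can also be generated from Mersenne primes. The statement of the connection is not reproduced in this excerpt, so the formula 2^(p−1)(2^p − 1) used below is the standard form of Euclid’s result and is included here as an assumption; the function names are illustrative.

```python
def proper_divisors(n):
    """All positive divisors of n other than n itself."""
    return [d for d in range(1, n) if n % d == 0]

def is_perfect(n):
    """True if n equals the sum of its proper divisors."""
    return n > 1 and sum(proper_divisors(n)) == n

print([n for n in range(2, 1000) if is_perfect(n)])  # [6, 28, 496]
print(is_perfect(8128))                              # True

# Assumed form of Euclid's construction: 2**(p-1) * (2**p - 1) with 2**p - 1 prime.
print([2 ** (p - 1) * (2 ** p - 1) for p in [2, 3, 5, 7]])  # [6, 28, 496, 8128]
```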
So a perfect number n satisfies σ(n) = 2n. Here are some other interest-
ing properties of σ:
Lemma 2.4.5.
σ(p^a) = 1 + p + p^2 + · · · + p^a = (p^(a+1) − 1)/(p − 1).
Proof: For both parts, we simply use the following corollary of the
Fundamental Theorem of Arithmetic: if we express a positive integer x
in terms of its canonical factorisation p1^a1 p2^a2 · · · pk^ak , then every positive
divisor of x has canonical factorisation of the form p1^b1 p2^b2 · · · pk^bk , where for
each i, we have bi ≤ ai .
(ii) Suppose a and b are coprime. Suppose we wrote out their canonical
factorisations as follows
We may assume that a and b are both greater than 1, since the result
would definitely hold in either of these two cases. With this assump-
tion, we may further assume that each of the ai and bi are greater than
0 (otherwise we would suppress the term). Then σ(a) = ∏_i σ(pi^ai),
and since we can easily read off the divisors of this integer by its
canonical factorisation, it then follows that σ(ab) = σ(a)σ(b).
The best known answer to date to this question (at the point of writing)
is due to Ochem and Rao (2012):
Theorem 2.4.8 (Ochem and Rao, 2012). An odd perfect number must be
larger than 10^1500.
Suppose it is 10:02am right now. What is the time 6 hours from now? If
you use a 24-hour convention, then you would say that it is 16:02. But if
you use a 12-hour convention, using “am” or “pm”, then the answer would
be 4:02pm. What were you doing when you worked out that it would
be 4:02 on the clock? You were counting to 12, and then starting back at 0
again. That is, 12 is the same as 0.
Here is another example. What day of the week is it on the 10th of the
next month? To work out the answer, you count out the number of days
until the 10th of the next month, divide by 7, and take the remainder. Then
you add this remainder to the day we are currently on.
Example 2.5.1. If it is now Wednesday August 22nd, then there are 19 days
until September 10. Now 19 = 2 × 7 + 5, so there is a remainder of 5 when
dividing 19 by 7. We then count forward by 5 in the days of the week and
discover that September 10 is a Monday.
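The same calculation as a short sketch, where counting forward in the days of the week is just addition modulo 7. The list `DAYS` and the function `day_after` are made up for illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def day_after(today, days_ahead):
    """Day of the week reached by counting days_ahead days forward from today."""
    return DAYS[(DAYS.index(today) + days_ahead) % 7]

print(19 % 7)                      # 5, the remainder when dividing 19 by 7
print(day_after("Wednesday", 19))  # Monday
print(day_after("Wednesday", 5))   # Monday, since only the remainder matters
```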
Example 2.5.3.
• 14 ≡5 4 since 14 − 4 = 10 is a multiple of 5.
• −7 ≡3 2 since −7 − 2 = −9 is a multiple of 3.
The congruence relation is a little bit like the equality relation, but more
flexible. It at least has the following properties (footnote 10). Compare the following
lemma to Lemma 2.1.2.
(Footnote 10: In fact, these ‘properties’ are the defining properties of a congruence. Congruences
are studied in general in some areas of abstract algebra.)
Lemma 2.5.4. For n ∈ N, congruence modulo n satisfies the following
properties:
Proof:
Zn := {0, 1, . . . , n − 1}
which we call the set of integers modulo n. We will see that this set can be
equipped with addition-like and multiplication-like operations that make it
into an interesting number system.
Lemma 2.5.7. For every integer x, there is a unique element y ∈ Zn such
that x ≡n y.
Proof: Let x be an integer. By the Division Rule (Lemma 2.1.4), there
exist unique integers q and r such that x = qn + r and 0 ≤ r < n. So
r ∈ Zn . So if we let y = r, we see that there is a unique element y ∈ Zn
such that x ≡n y.
Example 2.5.9.
• 5 ⊕4 3 = 8 (mod 4) = 0,
• −7 ⊕5 13 = 6 (mod 5) = 1,
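A small sketch of these computations in Python follows. It assumes, as Example 2.5.9 suggests, that a ⊕ b modulo n means "add, then take the residue in Z_n", and that ⊗ is the analogous product; the function names are made up for illustration.

```python
def residue(x: int, n: int) -> int:
    """The unique y in Z_n = {0, ..., n-1} that is congruent to x modulo n."""
    return x % n          # for n > 0, Python's % always lands in 0..n-1

def oplus(a: int, b: int, n: int) -> int:
    """Assumed meaning of a (+)_n b: the residue of a + b modulo n."""
    return residue(a + b, n)

def otimes(a: int, b: int, n: int) -> int:
    """Assumed meaning of a (x)_n b: the residue of a * b modulo n."""
    return residue(a * b, n)

print(oplus(5, 3, 4))     # 0
print(oplus(-7, 13, 5))   # 1
print(residue(-7, 3))     # 2, matching -7 congruent to 2 modulo 3
```

Note that `%` behaves differently for negative numbers in some other languages (C and Java may return a negative remainder), so this shortcut is specific to Python.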
ax + by = c
where a, b, c are given integers and we want to solve for integers x and y.
Here is an example that we see often in real life.
Example 2.5.11. What amounts of money can you make from $ 2 and $ 5
denominations? That is, for what values c can we solve 2x + 5y = c?
The next question we might ask is, if a linear diophantine equation has a
solution, does it have more, and how many more?
\[ x_0 + t\mathbb{Z}, \qquad t = \frac{b}{\gcd(a, b)}. \]
We will leave the proof as an exercise (see the last section of this chapter).
• Volunteer 2 works out what their number is modulo 11, call it a2 , and
writes it on the piece of paper inside the envelope.
• Volunteer 3 works out what their number is modulo 13, call it a3 , and
writes it on the piece of paper inside the envelope.
They give you their scribed pieces of paper. With this information you
can recover the original number that Alice chose! Let
\[
\begin{aligned}
x &\equiv_{m_1} a_1 \\
x &\equiv_{m_2} a_2 \\
  &\;\;\vdots \\
x &\equiv_{m_k} a_k.
\end{aligned}
\]
\[ b_i \left( \frac{M}{m_i} \right) \equiv_{m_i} 1. \]
Let $x = \sum_{i=1}^{k} a_i b_i \frac{M}{m_i}$. Then for each $i$, we have $x \equiv_{m_i} a_i b_i \frac{M}{m_i} \equiv_{m_i} a_i$.
\[ x - x_0 \equiv_{m_i} 0 \]
\[ x = \sum_{i=1}^{k} a_i b_i \frac{M}{m_i}. \]
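The formula for x can be tried out directly. The following is a minimal sketch, assuming the moduli are pairwise coprime and that M denotes the product of all the moduli; the function name `crt` is made up here, and `pow(value, -1, m)` (Python 3.8 or later) is used to find each inverse b_i.

```python
from math import prod

def crt(residues, moduli):
    """Return x in {0, ..., M-1} with x congruent to a_i modulo m_i for every i.

    Assumes the moduli are pairwise coprime; M is their product.
    """
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i                 # this is M / m_i
        b_i = pow(M_i, -1, m_i)        # b_i * (M / m_i) is 1 modulo m_i
        x += a_i * b_i * M_i
    return x % M

# A secret number below 7 * 11 * 13 = 1001 is recovered from its residues.
secret = 500
moduli = [7, 11, 13]
residues = [secret % m for m in moduli]
print(crt(residues, moduli))           # 500
```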
For a long time, the world used private key cryptography to transmit secret
information, until the breakthrough of Diffie and Hellman in the 1970s.
We will see an example of public key cryptography, the so-called RSA
cryptosystem.
“We in science are spoiled by the success of mathematics. Mathematics is the study
of problems so simple that they have good solutions.”
– Whitfield Diffie (1944–)
It can often be difficult to find the (multiplicative) inverse of a number
modulo m. For example, the inverse of 7 modulo 64 is 55, which may not
be easy to guess off-hand. We can use the Euclidean Algorithm to find the
inverse modulo m and we show how this is done by example.
46 john bamberg
So
\[ 21x \equiv_{1430} 21x + 1430y = 1 \]
and hence x will give us the inverse if we can work it out. We will use the
Euclidean Algorithm:
1430 1 0
-68 21 0 1
-10 2 1 -68
-1 0 -10 681
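The same bookkeeping can be sketched in Python. This is an illustrative implementation of the extended Euclidean Algorithm, not necessarily laid out the way the table above is; the function names are made up.

```python
def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                       # quotient of the current division
        old_r, r = r, old_r - q * r          # next remainder
        old_s, s = s, old_s - q * s          # coefficient of a
        old_t, t = t, old_t - q * t          # coefficient of b
    return old_r, old_s, old_t

def inverse_mod(a: int, m: int) -> int:
    """Return the inverse of a modulo m, if gcd(a, m) = 1."""
    g, _, t = extended_gcd(m, a)             # s*m + t*a = g
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return t % m

print(extended_gcd(1430, 21))   # (1, -10, 681): -10*1430 + 681*21 = 1
print(inverse_mod(21, 1430))    # 681
print(inverse_mod(7, 64))       # 55
```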
Sharing a ‘key’
The previous methods of enciphering depended on knowledge of the key
to decipher the message, so the key had to be kept private between Alice
and Bob. In the 1970s a radical new approach to cryptography was born:
public key cryptography. The general way it works is this: Bob has two
keys, one private and one public. The public key is used by Alice to encrypt
messages sent to Bob, and the private key is used by Bob to decrypt
messages.
Here is an analogy where Alice sends a treasure chest to Bob through
the post. Alice has a padlock and both Alice and Bob have a key to this
padlock. When Alice sends the treasure chest, she puts a padlock on the
treasure chest, and then when Bob receives the chest, he opens it with
his key. The problem with this approach is that Alice and Bob need to
meet privately to ensure they have identical keys. This is private key cryptography.
We can change this example to give an analogy for public key
cryptography:
Example 2.6.2. Alice has a treasure chest and padlock, and Bob has a
padlock as well, but it is a different padlock. Can you think of a way for
Alice to send the treasure chest to Bob so that Bob can open it, but they
never meet? (You are allowed to use the postal system more than once!)
Solution: Alice locks the chest with her padlock and sends it to Bob. Bob
then places his padlock on the chest and sends it back to Alice. We now
have two padlocks on the treasure chest. Alice takes her padlock off and
sends the chest back to Bob. Then Bob can open the treasure chest by removing his
padlock.
One-way functions
Just as we saw in the previous example, the main idea in public key cryptography
is a one-way function. A function f is said to be one-way if given
x it is “easy” to compute f(x), but given y, it is “hard” to determine an x
such that f(x) = y.
The RSA cryptosystem is named after the three mathematicians who invented
it around 1978: Ron Rivest, Adi Shamir and Leonard Adleman. It
was found out much later that the same cryptosystem had been discovered in
top-secret work at GCHQ in the early seventies (by Clifford Cocks).
Before a message can be sent, a public key is set up by the receiver
(Bob) that everyone has access to.
• Choose e such that e has an inverse modulo ϕ, and let d be this inverse.
Private Key d.
Example 2.6.3. The two chosen prime numbers are p = 47 and q = 59. So
n = 47 × 59 = 2773, ϕ = 46 × 58 = 2668.
de ≡ϕ 1
for some d. It turns out that e = 157 is one of many choices for e. In fact,
17 × e = 2669 ≡2668 1.
So d = 17.
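The numbers in this example are small enough to experiment with. The sketch below assumes that encryption and decryption are the maps x ↦ x^e (mod n) and y ↦ y^d (mod n); the function names and the sample message are made up for illustration, and keys this small offer no security.

```python
p, q = 47, 59
n = p * q                     # 2773
phi = (p - 1) * (q - 1)       # 2668
e, d = 157, 17                # public and private exponents from the example

assert (d * e) % phi == 1     # d and e are inverse to each other modulo phi

def encrypt(x: int) -> int:
    """Assumed encryption map: x -> x^e (mod n)."""
    return pow(x, e, n)

def decrypt(y: int) -> int:
    """Assumed decryption map: y -> y^d (mod n)."""
    return pow(y, d, n)

message = 920                 # an arbitrary number in {0, ..., n-1}
assert decrypt(encrypt(message)) == message
```

The built-in three-argument `pow` performs modular exponentiation without ever forming the huge integer x^e.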
\[ a^{(p-1)(q-1)} \equiv_{pq} 1. \]
\[ a^{(p-1)(q-1)} \equiv_{p} 1^{q-1} = 1. \]
\[ a^{(p-1)(q-1)} \equiv_{q} 1^{p-1} = 1. \]
Let’s see what happens when we decrypt something that’s been encrypted:
\[ D_{n,d}(E_{n,e}(x)) = (x^e)^d \pmod{n}. \]
So in order for decryption to give us the same thing back again, we need to
show that
\[ x^{de} \equiv_n x. \]
• Why cannot another user discover the sent message when they know the
public key?
2.7 Exercises
2. Find the set of all integer solutions in x to the following system of simultaneous
congruences:
x ≡6 2
x ≡7 1
x ≡11 3.
(a) 4 ≡13 17
(b) 6 ≡7 42
(c) −1 ≡4 11
(d) 11 ≡4 −1
(e) −5 ≡8 −21
10. Find the remainder r (between 0 and 6) that we get when we divide 682
by 7.
(a) 2 ⊕5 4
(b) −4 ⊕3 10
(c) 25 ⊗9 94
(d) −2 ⊗5 7
(e) 25634578912 ⊗2 65.
12. John H. Conway once said that 91 is the smallest integer which looks
like a prime, but isn’t. Use Fermat’s Little Theorem to show that 91 is
not prime. (Hint: Decompose 90 into binary: 90 = 64 + 16 + 8 + 2).
13. We will do a magic trick starting with 27 cards (the joker, the dia-
monds and the hearts), and the audience member selects 4 cards. The
assistant hides one card and you, the magician, will use the other three
cards to figure out what the fourth card is. First we assign a number
from 0 up to 26 to each card.
Card:   Joker A♦ 2♦ 3♦ 4♦ 5♦ 6♦ 7♦ 8♦ 9♦ 10♦ J♦ Q♦ K♦ A♥ 2♥ 3♥ 4♥ 5♥ 6♥ 7♥ 8♥ 9♥ 10♥ J♥ Q♥ K♥
Number: 0     1  2  3  4  5  6  7  8  9  10  11 12 13 14 15 16 17 18 19 20 21 22 23  24 25 26
The audience member selects four cards, and the assistant hides one of
the cards. The three cards that the magician can see are:
3♦  7♥  10♥
(a) Convert these cards to numbers from {0, . . . , 26} and find the re-
mainder of their sum s when divided by 4.
(b) There are 24 possibilities for the hidden card. How many numbers
are there from {0, . . . , 23} which are congruent to s modulo 4?
(c) From the last question, you know that there are N numbers from
{0, . . . , 23} that are congruent to s modulo 4. The assistant wants to
somehow give you a number from 0 up to N − 1 which will give rise
to the number that gives away the hidden card for you.
The assistant lays out the three cards to you in this order ...
7♥  10♥  3♦
i. When represented as numbers from {0, . . . , 26}, let di be the num-
ber of displayed cards appearing to the right of the i-th card that
are smaller than it. Find d1 and d2 .
ii. Compute p = 2d1 + d2 . You should get a number from {0, . . . , N −
1}.
iii. Now find R = 4p + (−s mod 4). The hidden card is the R-th
possible card. Be careful to start counting at 0 and to skip over the
three cards we already have; the hidden card may not be the R-th
card from the deck. Which card do you get?
14. Prove that a number is divisible by three if and only if the sum of
its digits is divisible by three. (Hint: The first step is to express the
unknown number N as some unknown sum of multiples of powers of
10.)
15. Find the day of the week you were born on by using the following
formula:
\[ W = \left( D + \lfloor 2.6((M + 9) \bmod 12) + 2.4 \rfloor - 2C + \lfloor 5Y/4 \rfloor + \lfloor C/4 \rfloor - \lfloor ((M + 9) \bmod 12)/10 \rfloor \right) \bmod 7 \]
where
The symbol $\lfloor x \rfloor$ means the “integer part” of x (i.e., round down to the
nearest integer).
3
Rings beyond numbers
We investigate polynomial rings and their properties, to see which of those for the integers also hold for polyno-
mials. We finish with algebraic numbers and the famous impossible problems of antiquity. In the middle, we come
across one of the most important themes of this course: equivalence relations. From this chapter until the end of the
notes, the idea of an equivalence relation will crop up continuously (no pun intended for Chapter ??).
By doing some elementary number theory first, we will have some motivation
to study the greater context: the theory of rings. The study of rings
really began in the work of Richard Dedekind¹ in his seminal work Vorlesungen
über Zahlentheorie (1879, 1894) and was invented as a generalisation
of different algebraic systems with common properties, and the early
beginnings in the theory of rings were very much inspired by the quest to
prove Fermat’s Last Theorem. The term ring was later coined by David
Hilbert.
¹ http://www-history.mcs.st-and.ac.uk/HistTopics/Ring_theory.html
In the definition we use below, we will always assume that a ring has
a unit; sometimes these are known specially as unital rings. It seems the
definition is very long, but once the student learns what a group is (in a later
course) it becomes much simpler.
(R \ {0}, ·) is a monoid
Distributive laws
Example 3.1.2.
• The set of integers forms a ring, and the set of n × n matrices over R forms
a ring with the usual addition and multiplication operations.
• Later we will see how polynomials form a ring. Polynomial rings are the
bread-and-butter of algebraic number theory and algebraic geometry,
and their influence and application to 20th-century mathematics cannot
be overstated.
• Every nonzero element has a multiplicative inverse. That is, for each nonzero element
a ∈ F, there is an element b ∈ F such that a · b = 1.
Example 3.1.4. The set of real numbers R and the set of complex numbers
C are both fields, and so too is the set of rational numbers Q. The integers
Z, however, do not form a field since 2 does not have a multiplicative
inverse. The n × n matrices over R form a field only if n = 1,
since for n ≥ 2 matrix multiplication is not commutative.
The ring and field properties can be explained by looking at their addi-
tion and multiplication tables (particularly if they are finite!).
Example 3.1.5. Consider Z4 . It is a ring and we can write out its addition
and multiplication tables as follows:
⊕4 | 0 1 2 3        ⊗4 | 0 1 2 3
 0 | 0 1 2 3         0 | 0 0 0 0
 1 | 1 2 3 0         1 | 0 1 2 3
 2 | 2 3 0 1         2 | 0 2 0 2
 3 | 3 0 1 2         3 | 0 3 2 1
For the left-hand table for ⊕4, we see that it is symmetric about the
diagonal, so this operation is commutative. We also see that it is a Latin
square² and so every element has an additive inverse (and it is unique). On
the other hand, the table for ⊗4 is symmetric, but not a Latin square. So
some elements here have multiplicative inverses, and some don’t, even if
we are careful to only look at the nonzero elements. If the nonzero elements
form a Latin square for ⊗4, then this is the same as saying that the ring is a
field.
² Recall from your school days that an n × n square made up of n numbers is a Latin
square if every number appears exactly once in each row and column.
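The Latin square test is easy to automate. The sketch below builds the tables for Z_n and checks the property; the function names are made up for illustration.

```python
def is_latin(rows) -> bool:
    """True if every row and every column contains each symbol exactly once."""
    size = len(rows)
    symbols = set(rows[0])
    if len(symbols) != size:
        return False
    columns = list(zip(*rows))
    return (all(len(r) == size and set(r) == symbols for r in rows)
            and all(set(c) == symbols for c in columns))

def addition_table(n: int):
    return [[(a + b) % n for b in range(n)] for a in range(n)]

def nonzero_multiplication_table(n: int):
    """Multiplication table of Z_n with the row and column of 0 deleted."""
    return [[(a * b) % n for b in range(1, n)] for a in range(1, n)]

print(is_latin(addition_table(4)))                 # True
print(is_latin(nonzero_multiplication_table(4)))   # False: 2 * 2 = 0 in Z_4
print(is_latin(nonzero_multiplication_table(5)))   # True: Z_5 is a field
```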
3.2 Ideals
The term ideal was introduced by Dedekind at the birth of the theory of
rings. An ideal is a special sub-ring of a ring which in some sense absorbs
elements of the ring when they are multiplied by elements inside the ideal.
First, let us consider the simplest ring, the ring of integers Z.
Example 3.2.1. Consider the even numbers 2Z of Z. It is closed under
addition and multiplication and so forms a ring; that is, 2Z is a subring of Z.
Now let 2x be a typical element of 2Z, and take another element of Z, say
z. Then 2x · z is divisible by 2 and hence the product of 2x with z is an even
number. So we have the property that
Notice that the last example gives us 7Z back again, and they will keep
repeating when we use higher values for the translate. For example
You might notice here that what we end up with is really not a new ring at
all: it looks and behaves just like Z7 ! The idea of rings being the same is
pursued further in a later course.
Lemma 3.2.3. The only ideals of Z, other than {0} and Z itself, are of the form nZ for some integer
n ≥ 2.
We will come back to ideals later once we have done equivalence rela-
tions and polynomial rings.
The elements of the partition are sometimes called cells or parts of the
partition.
Example 3.3.2. The even and odd integers form a partition of the integers.
As a partition, we would write it as
{E, O}
where E is the set of even integers and O is the set of odd integers. So a
partition⁴ is a “set of sets”. So in this sense, the set of integers is like the
population of Australia, only we divide it up according to West Australians
and non-West Australians!
⁴ One of the most common difficulties we see in this course is that the student does
not understand that a partition is a set of sets. That is why I deliberately used the
term “collection” in the definition to make this clear.
Example 3.3.3. The singleton subsets {s} of a set S form a partition, and
we would write it as
P := {{s} : s ∈ S }.
The whole set S itself gives a partition with just one part!
P := {S }.
Example 3.3.5.
[ x] := {y ∈ X : y ∼ x}.
Example 3.3.7.
[0] = 7Z
[1] = 1 + 7Z
..
.
[6] = 6 + 7Z
Notice in this example that we could write [7] for [0], we could write
[−13] for [1]. There are many ways to write down any equivalence class.
(i) x ∼ y
(ii) x ∈ [y]
(iii) [ x] = [y].
We can then use the second equation to substitute $a''b'$ in for $a'b''$:
\[ ab'' = a''b. \]
\[ \frac{a}{b} + \frac{c}{d} := \frac{ad + bc}{bd}. \]
We can see by our construction in the previous section what this looks like
in terms of an operation on equivalence classes:
It turns out (see Exercise 3.15.2) that the operations of addition and
multiplication on Frac(R) are well-defined and that Frac(R) is in fact a
field! The multiplicative identity is [(1, 1)], where 1 is the multiplicative
identity of R, and a non-zero element [(a, b)] has multiplicative inverse
[(b, a)]. Notice that a and b need not have multiplicative inverses in R! So
we saw that the rational numbers are realised as Frac(Z), where we make a
shorthand notation for the element [(a, b)] by writing a fraction a/b. Here are
some other examples.
Example 3.4.2.
• Let R[ x] be the set of polynomials with real coefficients. Then Frac(R[ x])
is the set of rational functions.
Let I be the unit interval [0, 1]. These are the real numbers x satisfying 0 ≤
x ≤ 1. Now take the Cartesian product I × I. We can see this geometrically
in the Cartesian plane.
[Figure: the unit square I × I with corners (0, 0), (1, 0), (0, 1) and (1, 1).]
Consider the following relation ∼ on I × I:
\[ (x, y) \sim (x', y') \iff y = y' \text{ and } (x = x' \text{ or } |x - x'| = 1). \]
It turns out that ∼ is an equivalence relation! Let’s see what it means geo-
metrically. We see that if we take a point on the left-hand side of the square,
then it is equivalent to a point on the right-hand side of the square at the
same longitude. So what we are doing is identifying the sides of the square.
We can think of curling the plane until the sides of the square meet; we end
up with a cylinder! So geometrically, we can model the quotient of I × I by
∼ with a cylinder.
We can also model some other surfaces with equivalence relations, and
this will be explored in the Appendix.
x3 + 3x2 + 1
and the emphasis there was to understand their sets of values, to draw
graphs of these values and find out when they are zero. We will not be so
interested in their values, rather, we will be interested in polynomials as
objects themselves and their entirety. This is what we have been doing with
numbers; we are not interested in single numbers on their own, rather about
the properties of numbers in general and how they interact.
Definition 3.6.1 (Polynomial). A polynomial over a ring R is an expression
\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]
is $x^2 - 3x + 2$.
Here are some technical things we need to care about, to avoid confusion.
The largest number i such that $a_i \neq 0$ is called the degree of the
polynomial $f = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ and the shorthand
notation is deg(f). If $a_i = 0$, we ignore writing this term of the polynomial
down. We will also write $x^i$ when the coefficient is 1.
Example 3.6.4. Let R be a ring and let P be the set of elements of R[ x] that
have degree 0 or −1. Then P is just like R! The function
a0 → a0
gk = h.
(i) d | f , d | g;
How did we find the greatest common divisor of f and g in the last
example? Well, the Euclidean Algorithm works for polynomials just as it
did for integers.
Example 3.6.14. We will now use the Euclidean Algorithm, but with polynomials
as input, to find the greatest common divisor of $f = x^3 + x - 2$ and
$g = x^4 - 1$. (We implicitly use long division throughout.)

quotient                 | remainder
                         | $x^4 - 1$
$x$                      | $x^3 + x - 2$
$-x - 2$                 | $-x^2 + 2x - 1$
$-\frac14 x + \frac14$   | $4x - 4$
                         | $0$

Long division establishes that
\[ x^4 - 1 = x(x^3 + x - 2) + (-x^2 + 2x - 1). \]
For the next step, we use long division again:
\[ x^3 + x - 2 = (-x - 2)(-x^2 + 2x - 1) + (4x - 4). \]
Then the last step is easy: $-x^2 + 2x - 1 = (-\frac14 x + \frac14)(4x - 4)$.
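The same computation can be sketched in Python over Q using exact fractions. This is an illustration only: polynomials are stored as coefficient lists with the constant term first (a representation chosen here, not taken from the notes), the function names are made up, and the gcd is rescaled to be monic.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Long division in Q[x]: return (q, r) with f = q*g + r, deg r < deg g."""
    f, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = [Fraction(c) for c in f]
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / Fraction(g[-1])      # cancel the leading term of r
        q[shift] = coeff
        for i, c in enumerate(g):
            r[i + shift] -= coeff * c
        r = trim(r)
    return q, r

def poly_gcd(f, g):
    """Euclidean Algorithm in Q[x]; returns the monic gcd (f, g not both zero)."""
    f, g = trim(f), trim(g)
    while g:
        _, r = poly_divmod(f, g)
        f, g = g, r
    lead = Fraction(f[-1])
    return [Fraction(c) / lead for c in f]

f = [-2, 1, 0, 1]        # x^3 + x - 2
g = [-1, 0, 0, 0, 1]     # x^4 - 1
print(poly_divmod(g, f)) # quotient x, remainder -x^2 + 2x - 1
print(poly_gcd(f, g))    # coefficients of x - 1
```

Using `Fraction` rather than floats keeps quotients such as −x/4 + 1/4 exact, which is the computational counterpart of working over a field.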
In the last example, we saw rational numbers that are not integers ap-
pearing in the left-hand column. This is why it is necessary for the poly-
nomial ring to be defined over a field. It would not ‘work’ if we used Z
instead of Q in the last example.
A consequence of the Euclidean Algorithm for polynomials is a Bézout
identity for polynomials.
gcd( f , g) = m · f + n · g.
The next theorem gives us a way to easily check for affine divisors of
polynomials.
Theorem 3.6.16 (The Factor Theorem). Let f ∈ F [ x] and let c ∈ F. Then
f (c) = 0 if and only if the polynomial x − c divides f .
Proof: To be done in lectures.
1 = m · f + n · p.
g = m · ( f · g) + n · p · g
and we know that p divides the bracketed term above. So p divides the
right-hand side and hence p divides g. Therefore, if p does not divide f ,
then p divides g (which is logically equivalent to proving “p divides f or p
divides g”).
I ( f · g) = I ( f 0 · g0 ) I ( f ) I (g).
\[ f = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]
So we see that 3 divides the non-leading coefficients and 9 does not divide
the constant coefficient. Therefore, by Eisenstein’s Criterion (Theorem
3.9.1), f ◦ g is irreducible over Q, and so by Lemma 3.9.4, f is also irre-
ducible over Q.
So far, we have seen that polynomials behave a little bit like numbers:
we have a Division Rule for F [ x], we have the greatest common divisor
function and the Euclidean Algorithm, and we have a notion of prime
numbers (i.e., the irreducible polynomials). We will now look at clock
arithmetic on R[ x] and an analogue of the ring Zn of integers modulo n.
f ≡ p g ⇐⇒ p | f − g.
Proof:
The proof of this result is simple, but perhaps a little tedious, so we leave
it to the reader to verify that it is a true statement.
\[ x^5 + 3x^2 - 2 = (x^3 - x + 3) \cdot p + x - 5. \]
[ f ] = [ x − 5].
[ax + b]
\[ [x] \otimes [x] = [x^2]. \]
Now p clearly divides the difference of $x^2$ and the constant polynomial −1,
so $x^2 \equiv_p -1$ and hence
\[ [x] \otimes [x] = [-1]. \]
Does this look familiar? Yes, there is a natural bijection⁹ between the
quotient $R[x]/\equiv_p$ and the complex numbers given by:
\[ [ax + b] \mapsto b + ai. \]
⁹ You can check that this really does work. If you add $[ax + b]$ and $[a'x + b']$, you
get $[(a + a')x + (b + b')]$, and their product is $[aa'x^2 + (a'b + ab')x + bb']$.
But $aa'x^2 + (a'b + ab')x + bb' \equiv_p (a'b + ab')x + bb' - aa'$, and so the
product of $[ax + b]$ and $[a'x + b']$ is what it would be if viewed as complex numbers.
Example 3.10.6 (A field with 4 elements). Consider the field Z2 consisting
of the two elements 0 and 1. This itself is a quotient ring of Z, where 0 represents
the even numbers and 1 represents the odd numbers. Recall that Z2
has the interesting property that 1 + 1 = 0. This makes the polynomial ring
Z2 [ x] particularly interesting. Consider all of the quadratic polynomials of
Z2 [ x]:
\[ x^2, \quad x^2 + 1, \quad x^2 + x, \quad x^2 + x + 1. \]
Clearly the first and third are reducible since x divides both of these. The
second one also turns out to be reducible:
\[ (x + 1)(x + 1) = x^2 + x + x + 1 = x^2 + (1 + 1)x + 1 = x^2 + 0x + 1 = x^2 + 1. \]
\[ [x] \otimes_p [x + 1] = [x^2 + x] = [x^2 + x + 1 + 1] = [p + 1] = [1]. \]
Notice that the multiplication tables are very different from the table we
got for Z4 (see Example 3.1.5). First of all, if we delete the first row and
column, we get a Latin square! In other words, Z2 [ x]/ ≡ p is a field.
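The four-element field can be built explicitly. The sketch below assumes p = x² + x + 1 over Z2 (the remaining quadratic in the list above) and stores the class [ax + b] as the pair (a, b); the reduction uses x² ≡ x + 1 modulo p. All names are made up for illustration.

```python
from itertools import product

# (a, b) stands for the class [a*x + b] with a, b in Z_2.
ELEMENTS = list(product((0, 1), repeat=2))

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """(ax + b)(cx + d) = ac*x^2 + (ad + bc)x + bd, then replace x^2 by x + 1."""
    a, b = u
    c, d = v
    return ((a * c + a * d + b * c) % 2, (a * c + b * d) % 2)

X, X_PLUS_1, ONE = (1, 0), (1, 1), (0, 1)
print(mul(X, X_PLUS_1))      # (0, 1), i.e. [x] times [x + 1] equals [1]

# Every nonzero element has a multiplicative inverse, so this is a field.
nonzero = [u for u in ELEMENTS if u != (0, 0)]
assert all(any(mul(u, v) == ONE for v in nonzero) for u in nonzero)
```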
gcd( f , p) = m · f + n · p.
which shows that F [ x]/ ≡ p has zero divisors. Therefore, F [ x]/ ≡ p is not a
field.
All up, we have shown that the quotient ring F [ x]/ ≡ p is a field if and
only if p is irreducible over F.
So the quotient R[ x]/ ≡ p can be realised in the usual sense in ring the-
ory, that it is a quotient R/I of a ring by an ideal. This is not an emphasis of
this course, but will be in a later course on ring theory.
Example 3.12.2.
• $\sqrt{2}$ is algebraic since it is a zero of $x^2 - 2$.
• The Golden Ratio $\frac{1 + \sqrt{5}}{2}$ is algebraic since it is a zero of $x^2 - x - 1$.
Example 3.12.8.
• $\sqrt{-2}$ is an algebraic integer as it is a zero of $x^2 + 2$ (whose coefficients
are integers).
• $\frac{1}{2}$ is not an algebraic integer.
Now we see how these sets of complex numbers behave under the usual
arithmetic operations.
Theorem 3.12.10. The algebraic numbers A form a field and the algebraic
integers form a ring.
† Proof: What we haven’t seen in this course is that we can describe algebraic
numbers as eigenvalues of matrices with rational entries. Suppose
c is an algebraic number with minimal polynomial f. If we write
$f = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, then it turns out that
f is in fact the characteristic polynomial of the following
matrix:
\[
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{n-1}
\end{pmatrix}
\]
This matrix is known as the companion matrix of f . So we use the follow-
ing facts:
( A ⊗ B)(C ⊗ D) = AC ⊗ BD.
• a · b is an eigenvalue of A ⊗ B.
2. Operation 1: Draw a line through two old points Pi and P j to get new
points where this line intersects other lines and circles.
\[ x^8 + \tfrac{1}{2}x^7 - \tfrac{7}{4}x^6 - \tfrac{3}{4}x^5 + \tfrac{15}{16}x^4 + \tfrac{5}{16}x^3 - \tfrac{5}{32}x^2 - \tfrac{1}{32}x + \tfrac{1}{256}. \]
\[ \pi \text{ is transcendental} \implies \sqrt{\pi} \cdot \sqrt{\pi} \text{ is not constructible} \implies \sqrt{\pi} \text{ is NOT constructible.} \]
\[ \deg(\cos(\pi/9)) = 3 \quad \left(\text{min. poly. } X^3 - \tfrac{3}{4}X - \tfrac{1}{8}\right) \implies \cos(\pi/9) \text{ is NOT constructible.} \]
First of all, the LDE Theorem does have a direct analogue and we do
this in Exercise 3.15.8. The Chinese Remainder Theorem has an analogue
in the direct factorisation of a quotient ring R/I by coprime ideals. Gauß’s
Lemma holds when taking the field of fractions:
If a polynomial with coefficients in a ring R is reducible over Frac(R), then it
is also reducible over R.
3.15 Exercises
(i) f ( x) = x3 + x − 1, g( x) = x − 1
(ii) f ( x) = x4 − 1, g( x) = −x2 + 2.
Exercise 3.15.6. Find the minimal polynomial over Q for the following
numbers:
(i) 1 + i,
(ii) 2 + 3i,
(iii) e2πi/5 .
Exercise 3.15.7. By using the fact that the algebraic numbers form an
algebraically closed field²⁰, show that at least one of π + e or π · e is transcendental.
²⁰ A field F is algebraically closed if every non-constant polynomial with
coefficients in F has a root in F.
Exercise 3.15.8. State and prove an analogue of the LDE Theorem 2.5.14
for the polynomial ring F [ x].
3.15.1 A different type of number system and its arithmetic: $Z(\sqrt{-5})$
This is the set of all formal sums of the form
\[ a + b\sqrt{-5} \]
where a, b ∈ Z. We will simply write a for $a + 0\sqrt{-5}$. We can define
addition and multiplication of such numbers:
Addition: $(a + b\sqrt{-5}) + (a' + b'\sqrt{-5}) = (a + a') + (b + b')\sqrt{-5}$
Multiplication: $(a + b\sqrt{-5}) \cdot (a' + b'\sqrt{-5}) = (aa' - 5bb') + (ab' + a'b)\sqrt{-5}$
Exercise 3.15.9. For $z = a + b\sqrt{-5} \in Z(\sqrt{-5})$ define the norm $N(z) = z \cdot \bar{z}$
where $\bar{z} = a - b\sqrt{-5}$. Prove that
\[ N(z \cdot z') = N(z) \cdot N(z') \]
for any $z, z' \in Z(\sqrt{-5})$.
Exercise 3.15.10. A unit of $Z(\sqrt{-5})$ is an element u such that there exists
an element v such that u · v = 1. What are the units of $Z(\sqrt{-5})$? (Hint:
You know that $N(z \cdot z') \ge N(z)$ for any $z, z' \in Z(\sqrt{-5})$ with $z' \neq 0$.)
Exercise 3.15.11. An irreducible in $Z(\sqrt{-5})$ is a non-unit which cannot
be written as a product of two non-units. Show that 3 and $1 + \sqrt{-5}$ are
irreducible.
Exercise 3.15.12. How many ways can 6 be written as a product of irreducibles
in $Z(\sqrt{-5})$?
Exercise 3.15.13. A prime in $Z(\sqrt{-5})$ is a nonzero element p, that is not a
unit, satisfying the following:
given x and y in $Z(\sqrt{-5})$ such that p divides x · y, then p divides x or y.
Show that every prime is an irreducible, but the converse is not true.
4
Normed vector spaces
In this chapter we look at a generalisation of Euclidean space that encapsulates ‘spaces of functions’ and other
objects whereby we can measure the difference between things as we would the vectors of Rn .
We can take subsets which are closed under these two operations, and
we call them subspaces. For example, the set of elements $(u_1, u_2, \ldots, u_n)$
whose sum $\sum u_i$ is zero forms a subspace. We can also define a basis of $R^n$
so that we can write every element as linear combinations of the basis elements.
For example, the vectors (1, 0, . . . , 0), (0, 1, 0, . . . , 0), . . . , (0, . . . , 0, 1)
form a linearly independent spanning set for $R^n$.
Example 4.1.2 (Vector space of code words). Let’s take the smallest field,
Z2 , with just the two elements 0 and 1. Much of the theory of codes is
about strings of 0’s and 1’s, of a common length, say n. We can add code
words, with the rule that 1 + 1 = 0. For example:
Example 4.1.3 (Vector space of functions). Consider the set of all functions
$V^X$ from a set X to a vector space V (think of $R^n$ if you like). We can
add functions by the so-called point-wise addition of functions, and we can
also define scalar multiplication on functions:
\[ (f + g)(x) := f(x) + g(x) \]
\[ (\lambda f)(x) := \lambda f(x). \]
There are subsets of $V^X$ that are closed under these operations, such as the
constant functions. Can we write every element of $V^X$ as a linear combination
of a distinguished set of functions?
from this abstract and very general definition, we can do linear algebra
in a similar way to what you did in first year mathematics with Rn . We
can study subspaces, spanning sets, bases, linear maps, eigenvalues and
so on. We do not need to spend much time on this generalised version of
linear algebra, since you can more or less assume that whatever property
you learnt about Rn has a direct analogue in a vector space V. The main
difference that you will see is that we can have vector spaces which do not
have a finite basis: infinite-dimensional vector spaces.
R A
R∩A
† Proof: To check all the axioms of a vector space can be quite tedious. In
this case, many of them follow simply from the definition of field: e.g. (1) –
(4) follow immediately. It just suffices to check the axioms (5)–(8), but we
will leave this to the reader.
q1 · e1 + q2 · e2 + · · · + qn · en
Is there a basis for the vector space R over Q? Does every vector space
have a basis?
We have already seen Russell’s Paradox and the Continuum Hypothesis
as fundamental philosophical questions in mathematics that stirred the
minds of early 20th century mathematicians. There is another taboo subject
in mathematics, and it is the acceptance or non-acceptance of the Axiom of
Choice.
“The Axiom of Choice is necessary to select a set from an infinite number of
socks, but not an infinite number of shoes.”
– Bertrand Russell
Axiom of choice: For any collection 𝒮 of nonempty sets, there exists a
function f that assigns to each set S in 𝒮 an element f(S) of S.
The function f here is called a choice function, a map which selects one
element from each set in a (possibly infinite) collection of sets. If we can always assume that
such a function exists, then the real numbers would be well-ordered⁶, that
is,
there is a total order ≼ such that every nonempty subset of the real numbers
has a minimum element (with respect to ≼).
⁶ I’ll suppress the details here.
This seems like nonsense: how can we order the real numbers so that any
open interval has a minimum element? Another reason to dislike the Axiom
of Choice is the Banach-Tarski paradox in measure theory. In 1924 Stefan
Banach and Alfred Tarski proved the following remarkable result: It is pos-
sible to take a solid ball in 3-dimensional space, cut it up into finitely many
pieces and, moving them using only rotation and translation, reassemble
the pieces into two balls of the same radius as the original. In other words,
we get two spheres exactly the same size as the original sphere merely by
cutting and shifting! Alternatively, we can cut up a ball the size of a pea
and reassemble it into a ball the size of the sun. This theorem has come to
be known as The Banach-Tarski Paradox not because it is a logical paradox
(like that of Russell), but rather because it goes against our intuition about
how the world works.
There are many, many results such as these that follow from assuming
the Axiom of Choice, so why do we bother with it at all? It turns out that
certain extremely useful results rely on an equivalent version of the Axiom
of Choice known as Zorn’s Lemma⁷. However, here are some of the results
which follow from Zorn’s Lemma:
⁷ Zorn’s Lemma: Suppose a partially ordered set P has the property that every
totally ordered subset has an upper bound in P. Then the set P contains at least one
maximal element.
4.3 Norms
w(000130520) = 4.
There are three properties of this function which share analogous proper-
ties with the Euclidean norm on Rn :
Non-degeneracy: w(v) = 0 if and only if v is the string of all 0’s. (Just like
the zero vector).
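As a quick illustration, here is a sketch of such a weight function in Python. It assumes that w counts the nonzero entries of a string, which is consistent with w(000130520) = 4 above; the function names are made up, and the check of the triangle inequality is restricted to binary code words added componentwise in Z2.

```python
def weight(word: str) -> int:
    """Assumed meaning of w: the number of nonzero entries of the string."""
    return sum(1 for symbol in word if symbol != "0")

def add_binary(u: str, v: str) -> str:
    """Componentwise addition of two binary words of equal length (1 + 1 = 0)."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(u, v))

print(weight("000130520"))            # 4
print(weight("0000"))                 # 0: only the all-zero string has weight 0

u, v = "1100", "0110"
assert weight(add_binary(u, v)) <= weight(u) + weight(v)   # a triangle inequality
```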
We will be looking at vector spaces where the scalars are the real num-
bers, since there is a natural ordering on R.
\[ \|\cdot\| : V \to R \]
4.4 Boundedness
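• $(\forall s \in S)\ s \le \ell$,
• $(\forall \varepsilon > 0)(\exists s \in S)\ \ell - \varepsilon < s$.
[Figure: a number line marking $\ell - \varepsilon$ and $\ell$, with some $s \in S$ between them.]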
Example 4.4.2. Let S := {2 − 1/n : n ∈ N}. There is no ‘maximum’ of this
set S, but 2 is a least upper bound for S.
(Margin note: We say that m is the maximum of S if m is an upper bound of S that also lies in S.)
Least upper bound: Let $\varepsilon > 0$. We need to find¹⁰ a suitable s ∈ S such that
$\ell - \varepsilon < s$. Let n be the next largest integer after $1/\varepsilon$ and let s = 2 − 1/n.
Then $n > 1/\varepsilon$ and so $\varepsilon > 1/n$. Thus $2 - \varepsilon < 2 - 1/n$ and hence
$2 - \varepsilon < s$. Since s ∈ S, we have found an element of S that is greater than
$2 - \varepsilon$ and so 2 is the least upper bound of S.
¹⁰ We will do a ‘backwards’ calculation in the margin:
\[
\begin{aligned}
\ell - \varepsilon < s &\Leftarrow 2 - \varepsilon < 2 - 1/n \\
&\Leftarrow -\varepsilon < -1/n \\
&\Leftarrow \varepsilon > 1/n \\
&\Leftarrow n > 1/\varepsilon
\end{aligned}
\]
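The choice of n in this argument can be checked for particular values of ε. The sketch below uses exact fractions to avoid rounding; the function name `witness` is made up, and n is taken to be the smallest integer strictly greater than 1/ε.

```python
import math
from fractions import Fraction

def witness(eps: Fraction) -> int:
    """Return the next integer strictly larger than 1/eps."""
    return math.floor(1 / eps) + 1

for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    n = witness(eps)
    s = 2 - Fraction(1, n)          # an element of S
    assert n > 1 / eps
    assert 2 - eps < s < 2          # s beats 2 - eps but stays below 2
    print(eps, n, s)
```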
Theorem 4.4.3 (The Least Upper Bound Property of R). Every nonempty
subset S of R which has an upper bound, has a least upper bound.
We will come back to the proof of this result once we properly define
what R is!
The generalisation of an open interval in higher dimensions is a sphere,
or ball, of Rn . We will go one step further and define balls for normed
vector spaces.
Figure 4.1: 1/n drawn as ‘n versus $s_n$’
One of the main successes of the theory of normed vector spaces is in the
generalisation of continuity from Euclidean spaces (over R or C) to ar-
bitrary vector spaces. We are then able to study functions or sequences
of polynomials, of matrices and sequences of continuous functions them-
selves! Continuity then gains a deeper meaning when we look for solutions
of differential equations, where the differentiable functions are the elements
of our vector space and the integration and differentiation operators are the
continuous maps!
[Figure: the ball $B_\delta(a)$ around a in V and the ball $B_\varepsilon(f(a))$ around f(a) in W.]
\[ \{w\} \subset B_\varepsilon(w). \]
Here our normed vector space is just [R, | · |]. We will show that χQ is not
continuous at any point a of R. To do this, we must take the negation of the
definition of continuous:
f is not continuous at a if
Example 4.6.4 (A fully worked out, difficult, example). Let V and W both
be $R^2$, but equip V with the Euclidean norm and W with the norm $\|\cdot\|_\infty$
from Example 4.3.3. Then the map f : V → W defined by
\[ f((x, y)) := (x - y, xy) \]
We should figure out first what these two entities are and what they look
like.
• The set $f(B_\delta((1, 1)))$ is just the set of elements $(x - y, xy)$ such that
$$\|(x, y) - (1, 1)\| = \sqrt{(x - 1)^2 + (y - 1)^2} < \delta.$$
• The set $B_\varepsilon((0, 1))$ is just the set of elements $(u, v)$ such that
The ‘$\varepsilon$’ was given to us, and we must find the $\delta$ which makes this work. The picture in the margin shows that if $\varepsilon = 5$, then $\delta = 2$ is a suit-
$\Leftarrow |x - 1| + |y - 1| < \varepsilon$ and $|xy - y| + |y - 1| < \varepsilon$
$\Leftarrow |x - 1| + |y - 1| < \varepsilon$ and $|y||x - 1| + |y - 1| < \varepsilon$
Margin note 5: Notice that we have a different norm in the codomain of $f$. So balls look like squares!
$$|y| = |1 + y - 1| \le 1 + |y - 1| < 1 + \delta \le 2.$$
linear operator.
example below, we will look at the case where V = R, but it can be done in
general if V itself is a normed vector space.
| f ( x)| < C.
$$|(f + g)(x)| \le u.$$
Since $\|f + g\|$ is the least upper bound of $\{|(f + g)(x)| : x \in X\}$, we must have
$$\|f + g\| \le u.$$
Example 4.7.4. What is the ‘distance’ (with respect to the sup norm) between the functions $f(x) := x$ and $g(x) := x^2$ on the closed interval $[0, 1]$?
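The question in Example 4.7.4 can be explored numerically. The following is only a rough sketch: it approximates the supremum by sampling a uniform grid, and the helper name `sup_distance` and the grid size are illustrative choices, not part of the notes.

```python
def sup_distance(f, g, a, b, samples=10001):
    """Approximate sup{|f(x) - g(x)| : a <= x <= b} on a uniform grid.

    This is a sampling estimate, not a proof: it can only underestimate
    the true supremum.
    """
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))


# Distance between f(x) = x and g(x) = x^2 on [0, 1] in the sup norm.
d = sup_distance(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(d)
```

The estimate agrees with what calculus gives: $|x - x^2|$ is largest at $x = 1/2$, where it equals $1/4$.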
$$\|f(x)\|_W \le C \cdot \|x\|_V.$$
Such functions with only finitely many discontinuities are square integrable and so the following defines a norm on this function space, and it is called the $L^2$-norm.
$$\|f\| := \sqrt{\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(\theta)|^2 \, d\theta}$$
It turns out that the cosine and sine functions form a basis for this func-
tion space, and the decomposition of f into this basis is known as a Fourier
series.
4.8 Exercises
(b) Now we show that $V^*$ has a natural norm on it. Show that $\|\cdot\| : V^* \to \mathbb{R}$ defined by
$$\|f\| := \sup\{|f(x)| : \|x\| \le 1\}$$
is a norm on $V^*$.
g(( x, y)) := x − y.
Show that $\|f\| = 2$.
Exercise 4.8.11. Consider sin and cos restricted to the interval $[0, \pi]$ (so they are elements of $B([0, \pi])$). Show that $\|\sin - \cos\| = \sqrt{2}$.
5
Metric Spaces
For the notions of continuity and limit in R or C, it is not the fact that these structures are ordered that makes
it all work, rather it is the notion of distance that prevails. In this chapter, we go beyond normed vector spaces to the
realm of ‘metric spaces’. These are one of the most widely available sources of interesting spaces and shapes that we
can study limits and continuity on.
5.1 Limit of a function and limit of a sequence are really the same
thing
$\lim_{n \to \infty} s_n = \ell$ means . . .
What we will see later is that both of these definitions are of the form
(∀B)(∃A) f ( A) ⊆ B
where A and B are suitably defined sets (like a punctured open set).
Using the ball-definition allows us to be flexible with the space we are
working in, and what notion of distance or closeness that we can choose to
adopt.
–Augustin-Louis Cauchy.
$$x_n := \left(\frac{n}{n+1}, \frac{1}{n}\right).$$
Does this sequence converge to something?
By the graph, it seems that the sequence gets closer and closer to the point $(1, 0)$. Let’s try and prove this. Suppose $\varepsilon > 0$. On the back of an envelope, I worked out that choosing $N > \sqrt{2}/\varepsilon$ will do the job, as you will now see.
Figure 5.2: The sequence $\{x_n\}$, where only the values of the sequence are plotted (and the domain isn’t).
Proof. Suppose $\varepsilon > 0$. Choose $N$ to be an integer greater than $\sqrt{2}/\varepsilon$, and suppose $n$ is an integer such that $n > N$.
Then $n > \sqrt{2}/\varepsilon$ as $N > \sqrt{2}/\varepsilon$.
Then $\frac{2}{n^2} < \varepsilon^2$ by rearranging the equation.
Then $\frac{1}{n^2} + \frac{1}{n^2} < \varepsilon^2$ since $2 = 1 + 1$.
Then $\frac{1}{(n+1)^2} + \frac{1}{n^2} < \varepsilon^2$ since $\frac{1}{(n+1)^2} < \frac{1}{n^2}$.
Then $\sqrt{\frac{1}{(n+1)^2} + \frac{1}{n^2}} < \varepsilon$ by taking square roots of each side.
Then $\|(\frac{-1}{n+1}, \frac{1}{n})\| < \varepsilon$ by definition of the Euclidean norm.
Then $\|(\frac{n}{n+1} - 1, \frac{1}{n})\| < \varepsilon$ since $\frac{n}{n+1} - 1 = \frac{n}{n+1} - \frac{n+1}{n+1} = \frac{-1}{n+1}$.
Then $\|x_n - (1, 0)\| < \varepsilon$ by definition of $x_n$.
Therefore, there exists an integer $N$ such that if $n > N$, then $\|x_n - (1, 0)\| < \varepsilon$. Therefore, $\{x_n\}$ converges to $(1, 0)$.
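The choice of $N$ in the proof above can be sanity-checked by computer. This is a minimal sketch, not a substitute for the proof: it only tests finitely many $n$, and the function names `choose_N` and `dist_to_limit` are made up for the illustration.

```python
import math


def x(n):
    """The n-th term of the sequence, x_n = (n/(n+1), 1/n)."""
    return (n / (n + 1), 1 / n)


def dist_to_limit(n):
    """Euclidean distance from x_n to the claimed limit (1, 0)."""
    a, b = x(n)
    return math.hypot(a - 1.0, b)


def choose_N(eps):
    """An integer strictly greater than sqrt(2)/eps, as in the proof."""
    return math.floor(math.sqrt(2) / eps) + 1


for eps in (0.5, 0.1, 0.01):
    N = choose_N(eps)
    # Check the next 2000 indices after N (a finite spot check only).
    ok = all(dist_to_limit(n) < eps for n in range(N + 1, N + 2001))
    print(eps, N, ok)
```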
Now we will look at generalising the notion of distance one step further
than norms on vector spaces. We will throw away the linear structure so
that we are left with a metric on a set.
5.3 Metrics
A norm $\|\cdot\|$ can be used to find the distance between two elements $x$ and $y$ of a vector space $V$ by simply computing $\|x - y\|$. This gives us a function that takes pairs from $V \times V$ and gives them a non-negative value in $\mathbb{R}$. We will see that this idea of measuring distance can be extended beyond norms on vector spaces; we just need a similar type of function on a set $X$ which has the same properties as the example arising from a norm.
Margin note: The great analyst Maurice Fréchet (1878–1973) introduced the idea of a metric on a set in his doctoral dissertation (Sur quelques points du calcul fonctionnel: “On some points of functional calculus”), though the term metric was first coined by Hausdorff.
Definition 5.3.1 (Metric). Given a nonempty set $X$, a function $d : X \times X \to \mathbb{R}$ is a metric if it satisfies the following axioms:
(i) $d(x, y) \ge 0$ for all $x, y \in X$,
Example 5.3.2 (A metric not arising from a norm; the discrete metric). Let
X be a set and define d : X × X → R by
$$d(x, y) := \begin{cases} 1 & \text{if } x \ne y \\ 0 & \text{if } x = y. \end{cases}$$
Clearly $d$ satisfies (i), (ii) and (iii) of Definition 5.3.1. For the triangle inequality, part (iv), let us consider $x, y, z \in X$. If $x = y$, then clearly $d(x, y) \le d(x, z) + d(z, y)$ holds as $d(x, y) = 0$ in this case. So suppose $x \ne y$. If $y \ne z$, then
$$d(x, y) = 1 \le d(x, z) + 1 = d(x, z) + d(z, y).$$
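For a finite set, the metric axioms can also be checked exhaustively. The sketch below assumes the usual four axioms (non-negativity, $d(x, y) = 0$ exactly when $x = y$, symmetry, and the triangle inequality); the helper `is_metric_on` is a hypothetical name introduced only for this illustration.

```python
from itertools import product


def discrete(x, y):
    """The discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1


def is_metric_on(points, d):
    """Brute-force check of the metric axioms on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # non-negativity
            return False
        if (d(x, y) == 0) != (x == y):       # zero exactly on the diagonal
            return False
        if d(x, y) != d(y, x):               # symmetry
            return False
    # triangle inequality over all triples
    return all(d(x, y) <= d(x, z) + d(z, y)
               for x, y, z in product(points, repeat=3))


print(is_metric_on(["a", "b", "c", 1, 2.5], discrete))
# A non-example: squared difference fails the triangle inequality.
print(is_metric_on([0, 1, 2], lambda x, y: (x - y) ** 2))
```

Such a check covers only the sampled points; the case analysis in the text is what proves the triangle inequality for an arbitrary set $X$.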
$$B_r(x) = \{y \in \mathbb{R}^2 : d_p(x, y) < r\}$$
$$= \{y \in \mathbb{R}^2 : \|y\| < r - \|x\| \text{ or } x = y\}$$
$$= \{y \in \mathbb{R}^2 : \|y\| < r - \|x\|\} \cup \{x\}$$
Figure 5.3: The ball $B_{3/2}((1, 0))$. Notice that it looks like a ‘Euclidean’ ball around the origin plus an extra point, namely, $(1, 0)$.
In particular, the sequence from Example 5.2.2 does not converge to
(1, 0). It turns out (see Exercise 5.6.1) that this metric does not arise from a
norm!
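The failure of convergence can be seen numerically. This sketch makes two assumptions that are not spelled out in the lines above: that the metric $d_p$ is $d_p(x, y) = \|x\| + \|y\|$ for $x \ne y$ and $0$ otherwise (which is what the ball formula suggests), and that the sequence of Example 5.2.2 is $x_n = (n/(n+1), 1/n)$.

```python
import math


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def post_office(x, y):
    """Assumed form of d_p: travel from x to y via the origin."""
    return 0.0 if x == y else norm(x) + norm(y)


def x(n):
    """Assumed sequence from Example 5.2.2."""
    return (n / (n + 1), 1 / n)


limit = (1.0, 0.0)
# The distances to (1, 0) never drop below 1, so no tail of the
# sequence enters a ball of radius 1 around (1, 0).
print([round(post_office(x(n), limit), 4) for n in (1, 10, 100, 1000)])
```

Under these assumptions the distance from $x_n$ to $(1, 0)$ is $\|x_n\| + 1 > 1$ for every $n$.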
$$d_r(x, y) = \min\{1, \|x - y\|\}.$$
What does a ball look like? Does the sequence from Example 5.2.2
converge? Does this metric arise from a norm?
5.4 Boundedness
Just as we did for normed vector spaces, we can readily extend the notion
of a bounded set, a bounded function or a bounded sequence to metric
spaces. A subset W of a metric space [ X, d ] is bounded if it is contained in
some ball:
W ⊆ Br (a), ∃a ∈ X, r ∈ R+ .
Likewise, a sequence is bounded if the whole sequence is contained in
some ball. Finally, a function whose codomain is a metric space is bounded
if its image is contained in a ball.
The brilliant insight of Fréchet and Hausdorff was that metric spaces share many of the properties of Euclidean space. For example, convergent sequences have unique limits and such sequences do not ramble off to infinity. In the generality of topological spaces, this is not true!
average of this x-coordinate with the width of the rectangle: $\frac{3}{2}$. The next rectangle we draw has width $\frac{3}{2}$ and height $2 / \frac{3}{2}$. This time, the diagonal meets the top side of the rectangle in a point with x-coordinate equal to $\frac{4}{3}$, and so we take the average of this x-coordinate with the width of the rectangle: $(\frac{4}{3} + \frac{3}{2})/2 = \frac{17}{12}$. Continuing in this way, the rectangle quickly converges to a square with side lengths equal to $\sqrt{2}$.
$$x_{n+1} := x_n/2 + 1/x_n, \quad x_1 = 1.$$
That is, the sequence would be a constant sequence where every element
would have square equal to 2. This is an example of a Cauchy sequence,
or if you like, convergence of a sequence without a designated limit. The
sequence gets closer and closer to itself! More mathematically, no matter
what window we allow, at some point, the envelope of the sequence has a
width less than that of the window.
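The first few terms of the recurrence $x_{n+1} := x_n/2 + 1/x_n$, $x_1 = 1$ can be computed exactly with rational arithmetic. The sketch below is only an illustration (the function name `iterates` is made up); it prints each new term, the gap to the previous term, and how far its square is from 2.

```python
from fractions import Fraction


def iterates(count):
    """Return the first `count` terms of x_{n+1} = x_n/2 + 1/x_n, x_1 = 1."""
    xs = [Fraction(1)]
    while len(xs) < count:
        x = xs[-1]
        xs.append(x / 2 + 1 / x)
    return xs


xs = iterates(6)
for prev, cur in zip(xs, xs[1:]):
    # term, distance to the previous term, and (term^2 - 2)
    print(cur, float(abs(cur - prev)), float(cur * cur - 2))
```

Every term is rational, so no term has square exactly 2, yet consecutive terms get closer and closer together.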
Example 5.5.2. Consider the metric space on the open unit interval $(0, 1)$ with the Euclidean metric $d$. We will show that the following sequence is a Cauchy sequence, but it has no limit in $(0, 1)$:
$$x_n := 1 - \frac{1}{10^n}.$$
Let $\varepsilon \in \mathbb{R}^+$. Choose $N$ to be an integer greater than $\log_{10}(\frac{1}{\varepsilon})$ and suppose $n > m > N$.
Then $m > \log_{10}(\frac{1}{\varepsilon})$.
Then $\frac{1}{10^m} < \varepsilon$.
Then $\frac{1}{10^m} - \frac{1}{10^n} < \varepsilon$.
Then $(1 - \frac{1}{10^n}) - (1 - \frac{1}{10^m}) < \varepsilon$.
Then $|x_n - x_m| < \varepsilon$.
Then $d(x_n, x_m) < \varepsilon$.
Therefore, $\{x_n\}$ is a Cauchy sequence.
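As a spot check of the argument in Example 5.5.2, the following sketch tests the Cauchy condition for a range of indices beyond $N$. It uses exact fractions for the terms; the names `choose_N` and `cauchy_holds`, and the finite window `span`, are assumptions made for the illustration.

```python
from fractions import Fraction
import math


def x(n):
    """x_n = 1 - 1/10^n, computed exactly."""
    return 1 - Fraction(1, 10 ** n)


def choose_N(eps):
    """An integer greater than log10(1/eps), and at least 1."""
    return max(1, math.floor(math.log10(1 / eps)) + 1)


def cauchy_holds(eps, span=12):
    """Check |x_n - x_m| < eps for N < m < n within a finite window."""
    N = choose_N(eps)
    return all(abs(x(n) - x(m)) < eps
               for m in range(N + 1, N + span)
               for n in range(m + 1, N + span + 1))


print([cauchy_holds(eps) for eps in (0.5, 1e-3, 1e-9)])
# The only candidate limit is 1, which is not in (0, 1):
print(float(1 - x(8)))
```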
Example 5.5.3. Back to our original example. The metric space is [Q, d ]
where d is the Euclidean metric, and the sequence is
xn+1 := xn /2 + 1/xn , x1 = 1.
5.6 Exercises
Exercise 5.6.1. Show (by proof by contradiction) that the Post-Office metric
(Example 5.3.6) does not arise from a norm.
Exercise 5.6.2. Here we explore the Manhattan metric on Rn :
$$d(x, y) := \sum_i |x_i - y_i|$$
$$s_n := \left(1 + \frac{(-1)^n}{n},\ (-1)^n + \frac{1}{n}\right), \quad n = 1, 2, 3, \ldots$$
$$d(s_n, \ell) > \varepsilon.$$
(b) It looks like this sequence is convergent. What do you think this se-
quence is convergent to?
(c) Simplify
|xm+1 − xm |.
$$|x_n - x_m| \le |x_{m+1} - x_m|.$$
$$|x_n - x_m| < \varepsilon.$$
Exercise 5.6.4. In this exercise, we will show that the sequence {1/n} is a
Cauchy sequence in the open interval (0, 2) (w.r.t., the Euclidean metric).
(ii) Suppose, without loss of generality, that n > m > N. Show that
$$\left|\frac{1}{m} - \frac{1}{n}\right| \le \frac{1}{m}$$
(iii) Write the remainder of your proof, starting with “Choose N = . . . and
suppose m, n ∈ N such that m, n > N. Then ...”.
$$f'(x) = \tfrac{1}{4}(f(x)^2 + x^2), \quad f(0) = 0,$$
By the end of this chapter, we will have a solution to this question that
uses the theory of complete metric spaces and contraction maps.
Example 5.7.4. [(0, 2), dE ] is not complete. Consider the following Cauchy
sequence:
xn := 2 − 1/n.
The sequence is increasing but has no limit in (0, 2).
d ( xn , xm ) < 1
d ( xn , xN +1 ) < 1
for all n > N which means that xn = xN +1 (for all n > N) by definition of
the discrete metric. So we see that {xn } converges to xN +1 .
We have already seen the vector space of all functions F ( X, V ) from a set
X to a vector space V, and that a sequence in a metric space [ X, d ] is just a
function N → X. Let us consider just the metric space [Q, dE ] and the set
of all Cauchy sequences R in [Q, dE ]. We will show that R is a ring under
suitably defined operations of addition and multiplication.
Lemma 5.8.1. The term-by-term sum and product of two Cauchy se-
quences of [Q, dE ] are also Cauchy sequences.
The zero-Cauchy sequence is just the constant sequence 0 and the unit-Cauchy sequence is just the constant sequence 1. We leave it as an exercise to verify that these two sequences serve as additive and multiplicative identities for R (respectively).
Margin note: That is, 0 is the sequence 0, 0, 0, . . ., and 1 is the sequence 1, 1, 1, . . ..
Theorem 5.8.3. The set of all null sequences of [Q, dE ] forms an ideal of
R.
R
The real numbers are defined to be the equivalence classes of ∼.
6
So what is Q as a subset of R?
5.10 Arithmetic on R
Now that we have finally defined the real numbers by simpler things (i.e., rational numbers), we can review how addition and multiplication are defined on real numbers in the way we have defined them. In fact, we have essentially already seen how arithmetic works on $\mathbb{R}$ when we defined the term-by-term sums and products of Cauchy sequences.
Addition on $\mathbb{R}$: Given two real numbers $[\{x_n\}]$ and $[\{y_n\}]$ (see footnote 4), define
$$[\{x_n\}] + [\{y_n\}] := [\{x_n + y_n\}].$$
Footnote 4: Remember, a real number is an equivalence class of Cauchy sequences of rationals.
Multiplication on R: Given two real numbers [{xn }] and [{yn }], define
So
$$\{x_n'\} + \{y_n'\} = \{x_n\} + \{y_n\} + \{u_n\} + \{v_n\}$$
and $\{u_n\} + \{v_n\}$ is also a null sequence as the set of null sequences is a subring of the Cauchy sequences. Therefore,
and hence
$$[\{x_n'\}] + [\{y_n'\}] = [\{x_n\}] + [\{y_n\}].$$
So addition is well-defined.
In fact, it is then not difficult (though tedious) to show that $\mathbb{R}$ forms a ring under these operations, as we would hope! To show that $\mathbb{R}$ is a field requires showing that every nonzero element has an inverse, which we have left as Exercise 5.18.4.
In other words, the real number we have written above is the equivalence
class of the sequence
{71/102n }.
{71/102n } ∼ {71/99}.
What about 0.999999 . . .? This is really the real number whose Cauchy
sequence of rationals is
$$\{1 - 1/10^n\}.$$
$$\{1/10^n\}$$
There are two other mainstream ways of defining the real numbers in math-
ematics. One is the famous ‘Dedekind cuts’ method, the other is more
abstract and arises in the theory of field extensions. The latter can be found
in a book on ring theory and you would find it under the topic of transcen-
dental extensions of the rational numbers. We will explain here the former
and easier notion of Dedekind cuts (see also page 194 of Liebeck’s book).
A nonempty subset X of the rational numbers Q is called a Dedekind cut
if it satisfies the following conditions:
• for any x ∈ X, we have that X contains all the rationals less than x;
• X has no maximum.
R
The real numbers are the set of all Dedekind cuts of Q.
$$\{q \in \mathbb{Q} : q^2 < 2 \text{ or } q \le 0\}.$$
X · Y := Q− ∪ {0} ∪ {xy : x ∈ X ∩ Q+ , y ∈ Y ∩ Q+ }.
Example 5.13.2. The trivial examples of open sets are the empty set ∅ and
the whole metric space X itself. An open ball itself is open.
The next lemma tells us that in a discrete metric space, every subset is
open. The two properties outlined in the lemma essentially give us the ax-
ioms of a topological space, which you will meet in 3rd year mathematics.
Proof:
and so there exists a minimum value, say $\varepsilon_{\min}$, of this set. Then $B_{\varepsilon_{\min}}(x)$ is contained in each $O_i$, and therefore is contained in $\cap_{i=1}^n O_i$. Thus, $\cap_{i=1}^n O_i$ is open.
A closed set is something like a closed interval, such as [0, 1]. One of the most common mistakes students make is to think that the opposite of open is closed, as it would be in everyday language. However, in mathematics, this is not true! We will see examples of sets which are neither open nor closed, and examples which are both open and closed.
Example 5.13.7. A set which is both closed and open is said to be clopen.
For example, the subset
$$\{x \in \mathbb{Q} : x^2 > 2\}$$
One of the most useful results in the theory of metric spaces is the char-
acterisation of closed subsets by convergent sequences.
And conversely . . .
f ← (S ) := {x ∈ X : f ( x) ∈ S }.
Theorem 5.14.1. Let [ X, d ] and [Y, e] be two metric spaces and let f : X →
Y be a function. Then f is continuous if and only if the preimage of any
open set of Y is an open set of X.
$$f_n(x) := x^n, \quad n \in \mathbb{N}.$$
This is what we call point-wise convergence. We see here that with this
notion of convergence, the limit of a sequence of continuous functions is a
discontinuous function!
We shall explore now the notion of uniform-convergence of functions
which preserves continuity. The difference between the two forms of con-
vergence is that instead of thinking about the values of fn and what they
tend to, we just look at the functions themselves.
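The contrast between the two notions can be illustrated for $f_n(x) = x^n$, assuming (as the later reference to $B([0, 1], \mathbb{R})$ suggests) that the domain is $[0, 1]$. The sketch compares values at fixed points with a grid estimate of the sup-distance to the pointwise limit; the names `pointwise_limit` and `sup_gap` are illustrative only.

```python
def f(n, x):
    """f_n(x) = x^n."""
    return x ** n


def pointwise_limit(x):
    """Pointwise limit of f_n on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0


def sup_gap(n, samples=100001):
    """Grid estimate of sup |f_n(x) - limit(x)| over [0, 1].

    The true supremum is 1 for every n; a fixed grid can only
    underestimate it, and the underestimate worsens as n grows.
    """
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(n, x) - pointwise_limit(x)) for x in grid)


# Pointwise: at each fixed x < 1 the values shrink to 0.
print([f(n, 0.9) for n in (1, 10, 100)])
# Uniformly: the sup-distance to the limit does not shrink.
print([round(sup_gap(n), 4) for n in (1, 5, 10)])
```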
These two examples are interesting from another perspective. The first
sequence of functions (Example 5.15.1) was not convergent in [B([0, 1], R), d∞ ],
and nor was it a Cauchy sequence (see Exercise 5.18.9). The second exam-
ple (Example 5.15.2) is a convergent sequence in [B([−π, π], R), d∞ ], and
so by Theorem 5.5.4, it is also a Cauchy sequence. In fact, we will now prove an important result about bounded functions: they form a complete metric space.
Theorem 5.15.3. For any nonempty set X, we have that [B( X, R), d∞ ] is
complete.
Theorem 5.15.4. For any nonempty set X, the subspace of bounded contin-
uous functions [C( X, R), d∞ ] is closed and hence complete.
e( f ( x1 ), f ( x2 )) 6 c · d ( x1 , x2 ).
Example 5.16.2.
Example 5.16.5.
$$f'(x) = \tfrac{1}{4}(f(x)^2 + x^2), \quad f(0) = 0,$$
and hence
$$f(x) = \int_0^x \tfrac{1}{4}(f(x)^2 + x^2)\,dx.$$
To make things more difficult, we will define a function $\Phi$ on the metric space of continuous functions $C([-\frac{1}{2}, \frac{1}{2}], \mathbb{R})$. For such a function $f$, define $\Phi(f)$ to be the function that maps $x$ to $\int_0^x \frac{1}{4}(f(x)^2 + x^2)\,dx$. It is not difficult to see that
$$\Phi(B_{1/2}(0)) \subset B_{1/2}(0),$$
that is, if we look at the space of functions that are distance at most $\frac{1}{2}$ from the zero function 0 with respect to the sup-metric, then $\Phi$ maps this set into itself. It turns out that $\Phi$ is a contraction map on $B_{1/2}(0)$ as we will now see.
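Let $f_1, f_2 \in B_{1/2}(0)$. Then
$$\|\Phi(f_1) - \Phi(f_2)\|_\infty = \left\|\int_0 \tfrac{1}{4}(f_1^2 - f_2^2)\right\|_\infty$$
where $\int_0 (f_1^2 - f_2^2)$ is the map that takes $x$ to the definite integral $\int_0^x (f_1^2 - f_2^2)\,dx$. Now we will use a fact from calculus that if $f$ is continuous on the interval $[a, b]$ and $c \in [a, b]$, then $\|\int_c f\| \le (b - a)\|f\|$. So
$$\begin{aligned}
\|\Phi(f_1) - \Phi(f_2)\|_\infty &\le \tfrac{1}{4}\left(\tfrac{1}{2} - \tfrac{-1}{2}\right)\|f_1^2 - f_2^2\|_\infty \\
&\le \tfrac{1}{4}\|f_1 - f_2\|_\infty \|f_1 + f_2\|_\infty \\
&\le \tfrac{1}{4}\|f_1 - f_2\|_\infty (\|f_1\|_\infty + \|f_2\|_\infty) \\
&\le \tfrac{1}{4}\|f_1 - f_2\|_\infty \left(\tfrac{1}{2} + \tfrac{1}{2}\right)
\end{aligned}$$
and so
$$\|\Phi(f_1) - \Phi(f_2)\|_\infty \le \tfrac{1}{4}\|f_1 - f_2\|_\infty.$$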
Therefore, $\Phi$ is a contraction map on $B_{1/2}(0)$ with Lipschitz constant at most $\frac{1}{4}$. Now $B_{1/2}(0)$ is a closed subset of a complete metric space, and so must also be complete. Therefore, by Banach’s Contraction Mapping Theorem 5.16.6, there exists a unique fixed-point of $\Phi$.
$$\Phi(f) = f,$$
which means that $f$ is a solution to our original differential equation. According to Mathematica, there is no nice way to write this function $f$ in terms of functions we know!
Figure 5.7: A graph of the unique solution to the DE.
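Although the fixed point has no closed form, iterating $\Phi$ from the zero function gives approximations to it. The following is a rough numerical sketch under stated assumptions: functions are represented by their values on a uniform grid of $[-\frac{1}{2}, \frac{1}{2}]$, and the integral is replaced by the trapezoidal rule, so `phi` is only a discretised stand-in for $\Phi$.

```python
def phi(f, xs):
    """Discretised Phi: x -> integral from 0 to x of (f(t)^2 + t^2)/4 dt.

    `f` holds function values on the grid `xs`; the middle grid point
    must be 0. The trapezoidal rule approximates the integral.
    """
    n = len(xs)
    mid = n // 2
    g = [0.25 * (fi * fi + xi * xi) for fi, xi in zip(f, xs)]
    out = [0.0] * n
    for i in range(mid + 1, n):              # integrate to the right of 0
        out[i] = out[i - 1] + 0.5 * (g[i - 1] + g[i]) * (xs[i] - xs[i - 1])
    for i in range(mid - 1, -1, -1):         # integrate to the left of 0
        out[i] = out[i + 1] - 0.5 * (g[i] + g[i + 1]) * (xs[i + 1] - xs[i])
    return out


def sup_norm(f):
    return max(abs(v) for v in f)


xs = [-0.5 + i / 200 for i in range(201)]    # xs[100] == 0.0
f = [0.0] * len(xs)                          # start from the zero function
gaps = []
for _ in range(12):
    g = phi(f, xs)
    gaps.append(sup_norm([a - b for a, b in zip(g, f)]))
    f = g

print(gaps[:4])      # distances between successive iterates
print(sup_norm(f))   # the approximate fixed point stays inside B_{1/2}(0)
```

The successive gaps shrink at least as fast as the Lipschitz bound $\frac{1}{4}$ predicts, which is the behaviour the Contraction Mapping Theorem relies on.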
So the Hausdorff distance is the greatest of all the distances from a point in
one set to the closest point in the other set. It turns out that [H (Rm ), h] is a
complete metric space.
Now we define a function $H$ on $\mathcal{H}(\mathbb{R}^m)$, known as the Hutchinson operator:
$$H(B) := w_1(B) \cup w_2(B) \cup \cdots \cup w_n(B).$$
This gives us a contraction map on H (Rm ). So by Banach’s Contraction
Mapping Theorem 5.16.6, there exists a unique closed and bounded set B
that is a fixed-point of H. This set B is the fractal we want to generate.
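To see the operator in action, here is a small sketch in one dimension. The two maps $w_1(x) = x/3$ and $w_2(x) = x/3 + 2/3$ are a hypothetical choice (they are not taken from the notes); with them, iterating from $[0, 1]$ produces the familiar stages of the Cantor set. A set is represented as a collection of closed intervals.

```python
from fractions import Fraction


def w1(interval):
    """Hypothetical contraction x -> x/3 applied to a closed interval."""
    a, b = interval
    return (a / 3, b / 3)


def w2(interval):
    """Hypothetical contraction x -> x/3 + 2/3 applied to a closed interval."""
    a, b = interval
    return (a / 3 + Fraction(2, 3), b / 3 + Fraction(2, 3))


def hutchinson(B):
    """H(B) = w1(B) ∪ w2(B), where B is a set of closed intervals."""
    return {w(interval) for interval in B for w in (w1, w2)}


B = {(Fraction(0), Fraction(1))}
for _ in range(3):
    B = hutchinson(B)

for a, b in sorted(B):
    print(a, b)
```

After three applications there are 8 intervals of length $1/27$ each; further iterations approach the fixed set of the operator.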
5.18 Exercises
Draw f1 , f2 and f3 .
$$\|f_n - f\|_\infty < \varepsilon.$$
...
$$f'(x) = f(x)$$
The latter property does not make much sense for the moment, so we
will explore a particular example. My topology teacher at La Trobe Uni-
versity, John Banks, described a compact set as something where you can
measure temperature. That is, there is a continuous function to the reals
whose image is bounded and attains a maximum. For example, the sphere
is such a set; on the planet earth we can measure the temperature sensibly
at each point of its surface. There is a maximum temperature, and it is a
continuous function.
Now a union of finitely many bounded sets is a bounded set, and so ∪ j∈J Uk
is bounded. So what we have done is shown that if we can reduce any cover
of K by open sets to a finite one, then it ensures that f ( K ) is bounded. This
is the first thing we need in order for a maximum and minimum of f ( K ) to
exist. We also would like f ( K ) to be closed, and this will also be guaranteed.
K ⊆ ∪i∈I Ui .
Example 6.2.2. Let $K$ be the set of positive reals $\mathbb{R}^+$, and let $U_i$ be the open interval $(0, i)$, for each $i \in \mathbb{N}$. Then $\{U_i : i \in \mathbb{N}\}$ is an open cover for $K$.
However, no matter what you do, you cannot find a finite subset of these
open intervals which will cover all of K.
? Proof: To see this, suppose {Ui : i ∈ I} is an open cover for [0, 1]. Now
consider the following set A:
Notice that A is nonempty since 0 ∈ A (as [0, 0] = {0} and we can find one
element of {Ui : i ∈ I} containing 0). Moreover, 1 is an upper bound for A,
and so by the least upper bound property for R, we see that a least upper
bound α for A exists.
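Suppose $\alpha < 1$. By definition of union, there exists $j \in I$ such that $\alpha \in U_j$. Now by definition of an open set, there exists $\varepsilon \in \mathbb{R}^+$ such that $B_\varepsilon(\alpha) \subseteq U_j$. On the other hand, $\alpha - \frac{\varepsilon}{2} < \alpha$ and so
$$[0, \alpha - \tfrac{\varepsilon}{2}]$$
is covered by finitely many $U_i$’s, since $\alpha - \frac{\varepsilon}{2} \in A$. Let this set of $U_i$’s be indexed by $J \subseteq I$ where $J$ is finite. Therefore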
{U j } ∪ {Ui : i ∈ J}
We will see later that the converse holds for Euclidean spaces, though
there are examples of metric spaces that are non-compact but closed and
bounded.
{{x} : x ∈ X}.
(ii) U ⊆ V ⊆ A =⇒ f (U ) ⊆ f (V ).
(iii) X ⊆ Y ⊆ B =⇒ f ← ( X ) ⊆ f ← (Y ).
We will be using the above properties in the proof of the next result.
1. No matter how much you stir a jar of honey, some point in the liquid
will end up in exactly the same place in the glass as before.
2. Take a map of Perth, and suppose that that map is laid out on a table
inside Perth. There will always be a point on the map which represents
that same point as its own position in Perth.
6.5 Exercises
(a) {(−n, n) : n ∈ N}
(b) {( x − 2, x + 2) : x ∈ Z}
(c) {( x − 1, x + 1) : x ∈ Z}
(a) {( x, y) ∈ R2 : x2 + y2 = 1}
(b) {( x, y) ∈ R2 : x2 + y2 6 1}
(c) {( x, y) ∈ R2 : x2 + y2 < 1}
(d) {( x, y) ∈ R2 : x2 + y2 > 1}
(e) {( x, y) ∈ R2 : x2 − y2 6 1}
Exercise 6.5.3. Let A have the discrete topology. Which subsets of A are
compact? Give a proof.
Exercise 6.5.4. Below, we look at an example in the metric space on the
closed interval [0, 1].
For $\frac{x}{y} \in \mathbb{Q}$, suppose we have integers $a, b, n$ so that
$$\frac{x}{y} = p^n \frac{a}{b}$$
and neither $a$ nor $b$ is divisible by $p$. Then the $p$-adic valuation of $\frac{x}{y}$, written $|x/y|_p$, is defined to be
$$|x/y|_p := p^{-n}.$$
Example 7.1.1.
• $|5|_5 = \frac{1}{5}$ whereas $|50|_5 = \frac{1}{25}$.
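The definition translates directly into a short computation: strip factors of $p$ from the numerator and denominator and count them. This is a minimal sketch; the function name `padic_abs` is made up, and setting $|0|_p = 0$ is the usual convention, assumed here since the case is not covered by the definition above.

```python
from fractions import Fraction


def padic_abs(q, p):
    """Return |q|_p = p^(-n), where q = p^n * a/b with p dividing neither a nor b."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)           # assumed convention: |0|_p = 0
    n = 0
    num, den = q.numerator, q.denominator
    while num % p == 0:              # factors of p in the numerator raise n
        num //= p
        n += 1
    while den % p == 0:              # factors of p in the denominator lower n
        den //= p
        n -= 1
    return Fraction(1, p) ** n


print(padic_abs(5, 5), padic_abs(50, 5), padic_abs(Fraction(3, 50), 5))
```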
Lemma 7.1.2. The p-adic valuation defines a norm on the rational num-
bers.
Proof (sketch):
$$d_p(u, u') := |u - u'|_p.$$
Example 7.1.3. Notice that $d_7(8, 1) = |7|_7 = \frac{1}{7}$ whereas $d_7(100, 1) = |99|_7 = 1$. But then $d_7(99, 1) = |98|_7 = \frac{1}{7^2}$.
$$1 + p + p^2 + p^3 + \ldots$$
converges in $[\mathbb{Q}, |\cdot|_p]$. Let $x_n$ be the partial sum $\sum_{i=0}^n p^i$ for each integer
xn − (−1) = pn+1 .
Recall that we can write an integer in base 2 by listing 0’s and 1’s. So for
example, the number 22 can be written in base 2 as
$$10110_2$$
Notice that the base 3 expansion here does not terminate, but it is periodic.
Now $[\mathbb{Q}, |\cdot|]$ is not complete, but often the partial sums $\sum_{i=-n}^{\infty} a_i p^{-i}$ (where $a_i \in \mathbb{Z}_p$) converge to a rational number.
Lemma 7.2.1. Let $p$ be a prime number. Any positive rational number can be written in the form
$$\sum_{i=n}^{\infty} a_i p^{-i}, \quad a_i \in \mathbb{Z}_p$$
where convergence of the series above is given by the Euclidean metric.
Lemma 7.2.2. Let $p$ be a prime number. Any positive rational number can be written uniquely in the form
$$\sum_{i=n}^{\infty} a_i p^i$$
With respect to the $p$-adic metric, the rational numbers are not complete, and we will see this by exhibiting a particular example. Consider the following sequence:
$$p,\ p + p^2,\ p + p^2 + p^3,\ \ldots \qquad x_n := \sum_{i=1}^{n} p^i.$$
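Proof. Let $\varepsilon > 0$. We want to find an $N \in \mathbb{N}$ such that if $n > m > N$, then $d(x_m, x_n) < \varepsilon$. Choose $N$ to be the next largest integer after $\log_p(1/\varepsilon) - 1$. It turns out that if $n > m > N$, then $\frac{1}{p^{m+1}} < \varepsilon$. Now $\sum_{i=0}^{n-m-1} p^i$ is coprime to $p$ and so it has $p$-adic valuation 1. Therefore,
$$\begin{aligned}
1/p^{m+1} = |p^{m+1}|_p \cdot \left|\sum_{i=0}^{n-m-1} p^i\right|_p &= \left|\sum_{i=m+1}^{n} p^i\right|_p \\
&= \left|\sum_{i=1}^{n} p^i - \sum_{i=1}^{m} p^i\right|_p = |x_n - x_m|_p.
\end{aligned}$$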
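The identity $|x_n - x_m|_p = 1/p^{m+1}$ used in this proof can be spot-checked for small cases. The sketch below is only a finite check with illustrative helper names (`padic_abs_int`, `x`); it restricts to nonzero integers, which is all the check needs.

```python
from fractions import Fraction


def padic_abs_int(k, p):
    """p-adic valuation |k|_p of a nonzero integer k."""
    n = 0
    while k % p == 0:
        k //= p
        n += 1
    return Fraction(1, p ** n)


def x(n, p):
    """Partial sum x_n = p + p^2 + ... + p^n."""
    return sum(p ** i for i in range(1, n + 1))


checks = [
    padic_abs_int(x(n, p) - x(m, p), p) == Fraction(1, p ** (m + 1))
    for p in (2, 3, 5, 7)
    for m in range(1, 7)
    for n in range(m + 1, m + 7)
]
print(all(checks))
```

So the terms get $p$-adically closer as $m$ grows, even though they grow without bound in the Euclidean sense.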
Lemma 7.3.2. The p-adic numbers form a ring, and the p-adic integers are
a subring of Q( p) .
In this course you will see proofs of statements which have layers of difficulty. Most of the proofs you’ve seen so far have been of statements such as the sum of two odd numbers is an even number or for every $x > 4$, we have $2^x > x^2$. Now you will see statements such as
You might recognise the above as being the definition of an onto function.
A similar type of statement was encountered in first year when we saw the
definition of limit. If you break down the statement into its fundamental
pieces, that is, into its quantifiers and clauses, you will see how to do the
proof.
This type of proof is used to prove that a function is onto or that a function
is continuous at a point.
Set equality
Two sets A and B are equal if they have the same elements. That is,
A = B ⇐⇒ A ⊆ B and B ⊆ A.
Proving containment
A ∪ B = {x ∈ X : x ∈ A or x ∈ B}.
A ∩ B = {x ∈ X : x ∈ A and x ∈ B}.
The power set of a set X, is the set of all subsets of X, and we denote it
P( X ). The Cartesian product of two sets A and B, is the set of all ordered
pairs (a, b) of elements a ∈ A and b ∈ B, and we write this set as
• A ⊆ B ⇐⇒ X\B ⊆ X\A.
9
Index