
Notes 5

The document consists of lecture notes on algebra and combinatorics by Robert Leek, covering topics such as combinatorics fundamentals, binary relations, finite and infinite counting, and graph theory. It includes detailed explanations of principles like the pigeonhole principle, examples, and proofs to illustrate the concepts. The notes serve as a comprehensive guide for understanding key combinatorial concepts and their applications.


Algebra and combinatorics

Lecture notes

Robert Leek

April 23, 2021


Contents

I Combinatorics

1 Fundamentals
1.1 The pigeonhole principle
1.2 Sets
1.3 The sum rule, product rule, and inclusion-exclusion formulae

2 Binary relations
2.1 Equivalence relations and partitions

3 Finite counting
3.1 Ordered choice
3.2 Unordered choice
3.3 More examples
3.4 The binomial theorem

4 Infinite counting
4.1 Countable sets
4.2 Uncountable sets

5 Graph theory
5.1 Degrees and degree sequences
5.2 Subgraphs, paths, and cycles
5.3 Connectedness and trees
5.4 Bipartite graphs

A Changelog

Notation

Index
Part I

Combinatorics

Chapter 1

Fundamentals

Note the following terminology we will be using throughout the course:

• The integers are those numbers used for counting and their negatives, including 0.

• The rational numbers are those that can be written as a fraction 𝑚/𝑛, where 𝑚, 𝑛 are integers and 𝑛 is non-zero. We call 𝑚 the numerator and 𝑛 the denominator of this fraction.

• The real numbers are those that can be written using decimal expansions, including terminating (51/100 = 0.51), recurring (5/22 = 0.2272727… = 0.22̅7̅), and non-recurring (2𝜋 = 6.283185…).

1.1 The pigeonhole principle


Theorem 1.1 (Pigeonhole principle). Let 𝑛 be a positive integer. If 𝑛 + 1 ‘pigeons’ are placed in 𝑛 ‘holes’, then some ‘hole’ contains at least two ‘pigeons’.¹

This statement may seem obvious, especially if you consider a specific value of 𝑛 (for example, the statement for 𝑛 = 2 says that if 3 pigeons are placed in 2 holes, then one of the holes must contain at least two pigeons). However, it is still important to give a proof. The following argument uses a style of proof called ‘proof by contradiction’. We start by assuming the statement to be proven is false and derive a contradiction: something which is shown to be true and false at the same time. This means that our assumption was false, so the statement must be true instead, concluding the proof.

¹ Pigeon is English for 鸽子, and pigeonholes are places in which pigeons can nest; the word also refers to open compartments used to store papers and mail. Note we are not placing actual pigeons in holes; ‘pigeons’ are just objects and ‘holes’ refer to some way of classifying or collecting these objects. This principle is also called Dirichlet’s drawer principle.

Proof. Suppose for a contradiction that the principle is false. This means that there is some positive integer $n$ for which $n + 1$ pigeons can be placed in $n$ holes so that every hole contains at most one pigeon. Fix such an $n$ and a placement of pigeons, and let $x_i$ be the number of pigeons in the $i$th hole. So $\sum_{i=1}^{n} x_i = n + 1$, as it is the total number of pigeons, whilst $x_i \leq 1$ for each $i = 1, \dots, n$ since each hole contains at most one pigeon. Summing these $n$ inequalities we obtain
$$n + 1 = \sum_{i=1}^{n} x_i \leq \sum_{i=1}^{n} 1 = \underbrace{1 + 1 + \cdots + 1}_{n \text{ 1's}} = n,$$
a contradiction. We therefore conclude that the principle must be true.
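
As a sanity check on the statement (not a replacement for the proof), the principle can be tested exhaustively for small $n$. The following is a minimal sketch; the function names `some_hole_has_two` and `check_pigeonhole` are illustrative and not part of the notes.

```python
from itertools import product


def some_hole_has_two(placement, n_holes):
    """Return True if some hole receives at least two pigeons.

    placement[p] is the hole (0 .. n_holes - 1) assigned to pigeon p.
    """
    counts = [0] * n_holes
    for hole in placement:
        counts[hole] += 1
    return max(counts) >= 2


def check_pigeonhole(n):
    """Check every placement of n + 1 pigeons into n holes."""
    return all(
        some_hole_has_two(placement, n)
        for placement in product(range(n), repeat=n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, check_pigeonhole(n))
```

An exhaustive check only covers the values of $n$ that are tried, which is why the proof by contradiction is still needed for the general statement.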

The pigeonhole principle arises frequently as a useful tool for both mathematical and non-
mathematical arguments, as in the following examples.

Example 1.2 (Socks). There are 10 black socks and 8 white socks in my drawer. How many socks must I take from the drawer (without looking) to guarantee that I have a matching pair among the socks I have taken out?

Solution. Three socks. To see this, think of each sock as a ‘pigeon’, and as you take them out of the drawer, imagine that each white sock is placed in a ‘white’ hole and each black sock is placed in a ‘black’ hole. Thus socks in the same hole are the same colour. After three socks have been taken, there are three pigeons (socks) in two holes, so by the pigeonhole principle (with $n = 2$) some hole contains more than one sock, so contains a matching pair.²

² Note that the ‘10’ and ‘8’ in the question don’t contribute to the final answer here.

Example 1.3 (Summing to 10). Prove that for any six integers between 1 and 9, not necessarily distinct, either two of these integers are equal, or two of these integers sum to 10.

Solution. Let $x_1, \dots, x_6$ denote the six integers, and create ‘holes’ $A$, $B$, $C$, $D$ and $E$. For each $j = 1, \dots, 6$, place the ‘pigeon’ $x_j$ into a hole according to the following rule:
$$\text{Put } x_j \text{ in hole } \begin{cases} A & \text{if } x_j = 1 \text{ or } x_j = 9, \\ B & \text{if } x_j = 2 \text{ or } x_j = 8, \\ C & \text{if } x_j = 3 \text{ or } x_j = 7, \\ D & \text{if } x_j = 4 \text{ or } x_j = 6, \\ E & \text{if } x_j = 5. \end{cases}$$
Note that this rule places each of the six integers into one of the five holes. The pigeonhole principle with $n = 5$ therefore implies that some hole contains at least two of the integers. Choose two integers which lie in the same hole; then these integers are either equal or sum to 10 by definition of the holes. For example, if $D$ contains two integers, then either they are equal or one is 4 and one is 6, giving a sum of 10.
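
The claim of Example 1.3 is small enough to confirm by brute force. The sketch below assumes a numeric labelling of the holes (`hole(x)` returns the smaller of `x` and `10 - x`, so 1 and 9 share a label, and so on); this labelling and the function names are illustrative choices, not notation from the notes.

```python
from itertools import combinations_with_replacement


def hole(x):
    """Hole label for an integer x in 1..9: x and 10 - x share a hole."""
    return min(x, 10 - x)


def has_equal_or_sum_10(values):
    """True if two of the given integers are equal or sum to 10."""
    return any(
        a == b or a + b == 10
        for i, a in enumerate(values)
        for b in values[i + 1:]
    )


# Every choice of six integers from 1..9, repetition allowed.
all_ok = all(
    has_equal_or_sum_10(choice)
    for choice in combinations_with_replacement(range(1, 10), 6)
)

if __name__ == "__main__":
    print(len({hole(x) for x in range(1, 10)}))  # number of distinct holes
    print(all_ok)
    print(has_equal_or_sum_10((1, 2, 3, 4, 5)))  # five integers can avoid both
```

The last line illustrates why six integers are needed: with only five integers, one per hole, neither conclusion is forced.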

As shown in these examples, the ‘pigeons’ may be any discrete³ objects, mathematical or otherwise, and the holes can be any collections to which pigeons may be assigned, provided that each pigeon is assigned to a unique hole.⁴
The next result is a more general version of the pigeonhole principle. First, we need to
introduce the ceiling notation. The subsequent proposition is left as an exercise.

Definition 1.4 (Floor and ceiling). Let 𝑥 be a real number.

• The floor of 𝑥 is the largest integer 𝑛 such that 𝑛 ⩽ 𝑥. We denote it by ⌊𝑥⌋.

• The ceiling of 𝑥 is the smallest integer 𝑛 such that 𝑥 ⩽ 𝑛. We denote it by ⌈𝑥⌉.

Proposition 1.5. Let 𝑥 be a real number. Then

⌊𝑥⌋ ⩽ 𝑥 < ⌊𝑥⌋ + 1,


⌈𝑥⌉ − 1 < 𝑥 ⩽ ⌈𝑥⌉.

Theorem 1.6 (General pigeonhole principle). Let $n$ and $k$ be positive integers. If $n$ ‘pigeons’ are placed in $k$ ‘holes’, then some ‘hole’ contains at least $\lceil n/k \rceil$ ‘pigeons’.

Proof. Suppose for a contradiction that there exist positive integers $n$ and $k$ for which it is possible to place $n$ pigeons in $k$ holes so that every hole contains fewer than $n/k$ pigeons. Let $x_i$ be the number of pigeons in the $i$th hole in such an arrangement, so $\sum_{i=1}^{k} x_i = n$, and $x_i < \frac{n}{k}$ for each $i = 1, \dots, k$. Summing these inequalities then gives
$$n = \sum_{i=1}^{k} x_i < \sum_{i=1}^{k} \frac{n}{k} = \frac{n}{k} \cdot k = n,$$
a contradiction. We conclude that some hole must contain at least $n/k$ pigeons. Since the number of pigeons in each hole must be an integer, the number of pigeons in this pigeonhole must be at least $\lceil n/k \rceil$.⁵

³ That is, we can only apply the pigeonhole principle to indivisible objects (i.e. those which only occur in integer quantities). For example, it is not true that if 3 litres of milk are poured into two buckets, some bucket must contain at least 2 litres, because it is possible to have 1.5 litres in each bucket.

⁴ Formally, this says that the rule for assigning pigeons to holes is a well-defined function, and the pigeonhole principle says that this function is not injective.

Note that the (mean) average number of pigeons per hole is 𝑛/𝑘. So the general pigeonhole
principle is equivalent to saying that at least one member of any collection of integers is greater
than or equal to the average of the collection, a fact you are probably familiar with.
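
The general pigeonhole principle can also be checked exhaustively for small values of $n$ and $k$. This is a sketch under the assumption that holes are labelled $0, \dots, k-1$; the helper names `ceil_div`, `fullest_hole` and `check_general_pigeonhole` are made up for illustration.

```python
from itertools import product


def ceil_div(n, k):
    """Ceiling of n / k for positive integers, using integer arithmetic only."""
    return -(-n // k)


def fullest_hole(placement, k):
    """Largest number of pigeons in any one of the k holes."""
    counts = [0] * k
    for h in placement:
        counts[h] += 1
    return max(counts)


def check_general_pigeonhole(n, k):
    """Check every placement of n pigeons into k holes."""
    return all(
        fullest_hole(p, k) >= ceil_div(n, k)
        for p in product(range(k), repeat=n)
    )


if __name__ == "__main__":
    print(ceil_div(13, 4), ceil_div(9, 4))
    print(all(check_general_pigeonhole(n, k)
              for n in range(1, 7) for k in range(1, 4)))
```

Computing the ceiling with integer arithmetic avoids any floating-point rounding in $n/k$.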

Example 1.7. A hand in the card game Bridge consists of 13 cards from a standard 52-card deck of cards.⁶ Prove that any such hand must contain at least four cards of the same suit.

Solution. Divide the 13 cards of the hand into four piles, one for each suit (these piles are the ‘holes’ and the cards are the ‘pigeons’). Since there are 13 cards and 4 piles, by the general pigeonhole principle (applied with 𝑛 = 13 and 𝑘 = 4) some pile must contain at least ⌈13/4⌉ = ⌈3.25⌉ = 4 cards.

Example 1.8. For any nine distinct points in a square of side length one, there are three points which form the vertices of a triangle whose longest side has length at most 1/√2 (here we consider a straight line to be a ‘flat’ triangle).

Solution. Divide the square into four quarters, each of which is a square of side length 1/2 (see Figure 1.1). So each of the nine points lies in some quarter (if a point is on a boundary then choose one of the corresponding quarters arbitrarily and say that it lies in that quarter). By the general pigeonhole principle some quarter must then contain at least three of the points (since ⌈9/4⌉ = 3). Choose three points which lie in the same quarter and let 𝑇 be the triangle with these three points as vertices. By Pythagoras’ theorem⁷, the maximum distance between points in the same quarter is √((1/2)² + (1/2)²) = 1/√2, so each side of 𝑇 has length at most 1/√2, as required.

⁵ If you find the final step of the proof of Theorem 1.6 confusing, think of this: if there are at least 4.5 people in a room, then there must be at least 5 = ⌈4.5⌉ people in the room, since you cannot have half a person. This is just the same argument applied to 𝑛/𝑘 rather than 4.5.

⁶ Each card has one of 13 ranks [A(ce), 2, 3, 4, 5, 6, 7, 8, 9, 10, J(ack), Q(ueen), K(ing)] and one of 4 suits [♣ (clubs), ♦ (diamonds), ♥ (hearts), ♠ (spades)]. Observe that 13 × 4 = 52.

⁷ This theorem states that the square of the longest side of a right-angled triangle (i.e. a triangle with one angle of 90° = 𝜋/2 radians) is equal to the sum of the squares of the lengths of the other two sides; symbolically, 𝑎² + 𝑏² = 𝑐². Pythagoras was an ancient Greek philosopher who was alive approximately 570–495 BCE. Although the theorem is attributed to him, he was certainly not the first to have discovered it. You are likely to recognise this theorem by another name: 勾股定理.

Figure 1.1: Dividing a square into four sub-squares.

1.2 Sets
A set 𝐴 is a collection of objects (e.g. integers, lines in the plane, functions, etc.), which we call
elements or members of 𝐴. Alternatively, we say that 𝐴 contains the object 𝑥 if 𝑥 is an element
of 𝐴, written 𝑥 ∈ 𝐴. If 𝑥 is not a member of 𝐴, we write 𝑥 ∉ 𝐴. Sometimes you will see a set
referred to as a class or collection, especially when the members of the set are themselves sets.
You should treat the words ‘class’, ‘set’, and ‘collection’ as meaning the same thing.
Some important examples of sets are ℤ (the set of integers), ℚ (the set of rational numbers),
and ℝ (the set of real numbers). As there are different conventions on the status of 0 as a
‘natural number’ or not, this course will not refer to them; instead, I will use ℕ0 to denote the
set of non-negative integers, and ℕ+ to denote the set of positive integers, and ℕ without the
appropriate sub/superscript will not be defined.
There are several ways to define a set:

(1). We can define a set by listing its elements. For example,

𝐴 ≔ {2, 3, 5, 7}.

This set has elements 2, 3, 5, and 7, and no other object is a member of 𝐴.

(2). We could restrict the elements of a set by imposing conditions on the elements; if 𝑋 is a set and 𝜑(𝑥) is a property of 𝑥, we can form the set 𝑌 ≔ {𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}.⁸ The members of 𝑌 are precisely the elements 𝑥 of 𝑋 (written 𝑥 ∈ 𝑋) that satisfy 𝜑(𝑥).⁹ For example, 𝐵 ≔ {𝑥 ∈ ℕ+ ∶ 𝑥 ⩽ 10 and 𝑥 is prime}.

(3). We can form new sets using operations, such as union and intersection. These will be
introduced in the following pages. For example, 𝐶 ≔ {2, 3} ∪ {5, 7}.

For any set 𝐴 and any object 𝑥, either 𝑥 is an element of 𝐴, which we denote by 𝑥 ∈ 𝐴, or 𝑥 is
not an element of 𝐴, which we denote by 𝑥 ∉ 𝐴.
If 𝐴 and 𝐵 are sets such that every element of 𝐴 is also an element of 𝐵, we write 𝐴 ⊆ 𝐵 and
say that 𝐴 is a subset of 𝐵. The Extension Axiom states that two sets are equal if and only if they
have the same elements, or in other words 𝐴 ⊆ 𝐵 and 𝐵 ⊆ 𝐴. 𝐴 is a proper, or strict, subset of 𝐵
if 𝐴 ⊆ 𝐵 and 𝐴 ≠ 𝐵 (so 𝐵 is not a subset of 𝐴); we write 𝐴 ⫋ 𝐵 to denote this.
We say that sets 𝐴 and 𝐵 are distinct if they are not equal; that is, if 𝐴 ≠ 𝐵, and say that 𝐴
and 𝐵 are disjoint if 𝐴 and 𝐵 have no elements in common. So, for example, the sets {2, 3} and
{2, 4} are distinct but not disjoint.¹⁰ If 𝐴 and 𝐵 do have an element in common then we say
that 𝐴 intersects 𝐵.
If we fix a set 𝑋, then we have two operations to convert between properties 𝜑(𝑥) of ele-
ments of 𝑋, and subsets of 𝑋:

Properties ⟷ Subsets
𝜑(𝑥) ⟼ {𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}
𝑥∈𝐴 ⟻ 𝐴

Note that by the Extension Axiom, if we take a subset 𝐴 ⊆ 𝑋, turn it into a property and back
into a subset using these operations, we get the original set back: {𝑥 ∈ 𝑋 ∶ 𝑥 ∈ 𝐴} = 𝐴. Let us
see what happens when we do the same thing starting with a property 𝜑(𝑥); for any 𝑥 ∈ 𝑋,

𝑥 ∈ {𝑥′ ∈ 𝑋 ∶ 𝜑(𝑥′ )} ⟺ 𝑥 ∈ 𝑋 and 𝜑(𝑥) ⟺ 𝜑(𝑥)


⁸ The $x$ inside the set notation $\{x \in X : \varphi(x)\}$ is a variable, just like how $t$ is a variable in $\int_a^b f(t)\,\mathrm{d}t$. We could replace $x$ with any other variable, for example writing $\{y \in X : \varphi(y)\}$.

⁹ In general, you should always make it clear which set you are restricting from. For example, if I tried to create the set $\{x : x^2 = x\}$, should $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ belong to this set? This example might seem trivial, but we can encounter contradictions if we allow any property without restricting to a fixed set. An example of this is Russell’s paradox: if there is a set of all sets $U$, then consider $X := \{A \in U : A \notin A\}$. Then either $X \in X$ or $X \notin X$; which is it?

¹⁰ Confusing the terms distinct and disjoint is a common error, so please take care to avoid it! Try the following exercise: is it true or false that if sets $A$ and $B$ are disjoint then they must be distinct? (Be warned: the answer is not as obvious as it may at first appear.)

So the two properties may not be ‘equal’, because they could be written or described differ-
ently, but they are equivalent. Therefore sets can simplify mathematical arguments by al-
lowing us to collect all elements 𝑥 ∈ 𝑋 satisfying a certain property 𝜑(𝑥) into a single object
{𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}.
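
The correspondence between properties and subsets maps closely onto set comprehensions. The following is a small sketch, assuming the ground set $X = \{1, \dots, 10\}$ and an illustrative primality test `is_prime`; none of these names come from the notes.

```python
X = set(range(1, 11))  # the fixed ground set X = {1, ..., 10}


def is_prime(x):
    """A property phi(x): x is a prime number (trial division)."""
    return x >= 2 and all(x % d != 0 for d in range(2, x))


# Property -> subset: {x in X : phi(x)}
B = {x for x in X if is_prime(x)}

# A differently worded property that picks out the same elements of X.
B_alt = {x for x in X if x in {2, 3, 5, 7}}

# Subset -> property -> subset: {x in X : x in B} gives B back.
round_trip = {x for x in X if x in B}

if __name__ == "__main__":
    print(sorted(B))
    print(B == B_alt, round_trip == B)
```

The two comprehensions are written differently but produce equal sets, which mirrors the remark that properties may be equivalent without being ‘equal’.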
Some other important consequences of the Extension Axiom are given below:

• It does not matter in which order the elements of a set are written; so for example

{2, 3, 5, 7} = {2, 5, 3, 7}.

• We do not count the number of times an object appears in a set. For example

{1, 2, 2} = {1, 2}.¹¹

• There is exactly one set with no elements. Indeed, if two sets both have no elements, then they have the same elements as each other, so are the same set! We call this set the empty set, and denote it by ∅. We say that a set 𝐴 is non-empty if 𝐴 is not the empty set (that is, 𝐴 has at least one element).

We define the following operations on sets, which you will also see in other courses. These
allow us to construct new sets from old.

• The union of sets $A$ and $B$ is defined by
$$A \cup B = \{x : x \in A \text{ or } x \in B\}.$$
For example, $\{1, 2\} \cup \{2, 3, 4\} = \{1, 2, 3, 4\}$. Similarly, for sets $A_1, \dots, A_r$, the union is defined by
$$\bigcup_{i=1}^{r} A_i = A_1 \cup \cdots \cup A_r = \{x : x \in A_1 \text{ or } x \in A_2 \text{ or } \dots \text{ or } x \in A_r\}.$$
The Venn diagrams in Figure 1.2 give visual illustrations of the union.

• The intersection of sets $A$ and $B$ is defined by
$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$
For example, $\{1, 2\} \cap \{2, 3, 4\} = \{2\}$. Similarly, for sets $A_1, \dots, A_r$, the intersection is defined by
$$\bigcap_{i=1}^{r} A_i = A_1 \cap \cdots \cap A_r = \{x : x \in A_1 \text{ and } x \in A_2 \text{ and } \dots \text{ and } x \in A_r\}.$$

¹¹ You should take careful note of this point, as appearances can deceive: the set $\{1, 2, 2\}$ has two elements, whilst the set $\{x_1, \dots, x_n\}$ has $n$ elements if and only if $x_1, \dots, x_n$ are distinct.

Figure 1.2: Venn diagrams depicting the union of sets (panels: 𝐴 ∪ 𝐵 and 𝐴 ∪ 𝐵 ∪ 𝐶).

Figure 1.3: Venn diagrams depicting the intersection of sets (panels: 𝐴 ∩ 𝐵 and 𝐴 ∩ 𝐵 ∩ 𝐶).

Figure 1.4: Venn diagrams depicting the set difference and complement operation (panels: 𝐴 ⧵ 𝐵, and 𝐴𝑐 within 𝑈).

• The difference of sets 𝐴 and 𝐵 is defined by¹²

𝐴 ⧵ 𝐵 = {𝑥 ∶ 𝑥 ∈ 𝐴 and 𝑥 ∉ 𝐵}.

For example, {1, 2} ⧵ {2, 3, 4} = {1}, and {2, 3, 4} ⧵ {1, 2} = {3, 4}.¹³

• In the context of a ‘ground set’ or ‘universal set’ 𝑈, the complement of a set 𝐴 is defined
by 𝐴𝑐 = 𝑈 ⧵ 𝐴. For example, if 𝑈 = {1, 2, 3, 4, 5} and 𝐴 = {1, 2, 4}, then 𝐴𝑐 = {3, 5}. We need
to first establish the universal set before taking a complement, and the complement of
a set depends on which ground set is used. See Figure 1.4 for Venn diagrams depicting
the set difference and complement operations.

• The Cartesian product of sets 𝐴 and 𝐵 is defined by

𝐴 × 𝐵 = {(𝑎, 𝑏) ∶ 𝑎 ∈ 𝐴 and 𝑏 ∈ 𝐵}.

That is, 𝐴 × 𝐵 is the set of ordered pairs whose first co-ordinate is a member of 𝐴 and
whose second co-ordinate is a member of 𝐵. So, for example

{1, 2} × {2, 3, 4} = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}.

See Figure 1.5 for a visual description of the Cartesian product operation. Similarly, the Cartesian product of sets $A_1, \dots, A_r$ is defined by
$$A_1 \times \cdots \times A_r = \{(x_1, \dots, x_r) : x_1 \in A_1 \text{ and } x_2 \in A_2 \text{ and } \dots \text{ and } x_r \in A_r\}.$$
So $A_1 \times A_2 \times \cdots \times A_r$ is the set of all ordered $r$-tuples¹⁴ whose $j$th coordinate is a member of $A_j$ for each $j$. As with multiplication, given a set $A$ and a positive integer $n$ we define
$$A^n = \underbrace{A \times A \times \cdots \times A}_{n \ A\text{'s}}.$$
This should match the definitions of $\mathbb{R}^2$, $\mathbb{R}^3$ etc. with which you are already familiar.

Figure 1.5: A visual depiction of the Cartesian product of two sets $A$ and $B$ (a grid of pairs such as $(a, x)$ and $(d, y)$, with the elements $a, b, c, d$ of $A$ along one axis and the elements $x, y$ of $B$ along the other).

¹² Some sources write $A - B$ instead of $A \setminus B$, but we will use $\setminus$ throughout this course to avoid confusion with ordinary subtraction.

¹³ Note from these examples that difference is not symmetric, unlike union and intersection. The order matters, just like with subtraction of numbers: $1 - 2 = -1 \neq 1 = 2 - 1$.

• The power set of a set 𝐴 is defined by

𝒫(𝐴) = {𝑋 ∶ 𝑋 ⊆ 𝐴}.

So 𝒫(𝐴) is a set whose elements are the subsets of 𝐴. For example, if 𝐴 = {1, 2, 3} then
𝒫(𝐴) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. Note that for any set 𝐴, the power set
𝒫(𝐴) contains both ∅ and 𝐴 as elements. This shows that sets can be elements of other
sets.

Brackets are used to indicate the order in which to perform operations. So, for example, (𝐴 ⧵ 𝐵) ⧵ 𝐶 means first subtract 𝐵 from 𝐴 and then subtract 𝐶 from the result, whilst 𝐴 ⧵ (𝐵 ⧵ 𝐶) means first subtract 𝐶 from 𝐵, and then subtract the result from 𝐴.¹⁵

¹⁴ An ordered 𝑟-tuple is a sequence of 𝑟 objects, so an ordered 2-tuple is an ordered pair, an ordered 3-tuple is an ordered triple, and so on. Make sure that you don’t confuse ordered pairs or 𝑟-tuples with sets; unlike for sets, the order of an ordered pair or 𝑟-tuple matters, so e.g. (1, 2) ≠ (2, 1). Also, elements may be repeated in a pair or 𝑟-tuple, so e.g. (3, 3) is a valid pair.

¹⁵ Exercise: prove that these two expressions, (𝐴 ⧵ 𝐵) ⧵ 𝐶 and 𝐴 ⧵ (𝐵 ⧵ 𝐶), are not the same in general.
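
The worked examples given for union, intersection, difference, complement and Cartesian product can be replayed with Python's built-in `set` type. This is only an illustrative sketch: the variable names are arbitrary, and the complement is computed relative to an explicitly chosen ground set `U`, as the definition requires.

```python
from itertools import product

A = {1, 2}
B = {2, 3, 4}

union = A | B            # A ∪ B
intersection = A & B     # A ∩ B
a_minus_b = A - B        # A \ B
b_minus_a = B - A        # B \ A (difference is not symmetric)

# Complement needs a ground set: here U = {1, ..., 5}.
U = {1, 2, 3, 4, 5}
complement = U - {1, 2, 4}

# Cartesian product A × B as a set of ordered pairs (tuples).
cartesian = set(product(A, B))

if __name__ == "__main__":
    print(union, intersection, a_minus_b, b_minus_a)
    print(complement)
    print(sorted(cartesian))
```

Tuples play the role of ordered pairs here, so `(1, 2)` and `(2, 1)` are different objects even though `{1, 2}` and `{2, 1}` are the same set.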

A major part of this course will be about counting the sizes of sets. Our first ‘counting’ result is the following theorem on the size of the power set of a set $A$.
We say that a set $A$ is finite if there is a non-negative integer $n$ such that $A$ has $n$ elements, otherwise $A$ is infinite. ℕ0, ℕ+ and ℤ are important examples of infinite sets. If $A$ is a finite set, then the size of $A$, denoted $|A|$, is the number of elements in $A$.¹⁶ The size of $A$ may also be called the order of $A$ or the cardinality of $A$.

Theorem 1.9. If $A$ is a set with $|A| = n$, then $|\mathcal{P}(A)| = 2^n$. Equivalently, a set with $n$ elements has $2^n$ subsets.

Proof. Let $A = \{x_1, \dots, x_n\}$. We can form a subset $B \subseteq A$ by going through the elements of $A$ and deciding whether each element should be in $B$ or not. As there are two choices for each element (in $B$ or not in $B$), there are
$$\underbrace{2 \times 2 \times \cdots \times 2}_{n \text{ 2's}} = 2^n$$
possible ways to choose $B$. This gives $2^n$ subsets of $A$, which are all distinct; if we make a different choice at element $x_j$ then the resulting subsets differ in element $x_j$.¹⁷
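
The counting argument in this proof translates directly into code: each subset corresponds to one sequence of in/out choices. The sketch below is illustrative (the name `power_set` is not from the notes) and uses `frozenset` because Python sets can only contain hashable objects.

```python
from itertools import product


def power_set(elements):
    """All subsets, built by making an in/out choice for each element."""
    elements = list(elements)
    subsets = set()
    for choices in product((False, True), repeat=len(elements)):
        # Keep exactly those elements whose choice is True.
        subsets.add(frozenset(x for x, keep in zip(elements, choices) if keep))
    return subsets


if __name__ == "__main__":
    for subset in sorted(power_set({1, 2, 3}), key=lambda s: (len(s), sorted(s))):
        print(set(subset) or "{}")
    print([len(power_set(range(n))) for n in range(6)])
```

Because different choice sequences give different subsets, the number of subsets collected equals the number of sequences, namely $2^n$.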

Sets may contain any object as a member, including other sets (for example, we have just
seen that the power set 𝒫(𝐴) of a set 𝐴 is a set). It is crucial to remember that membership or
elementhood (i.e. being an element of a set) is not transitive (that is, it is not true that just
because 𝐴 ∈ 𝐵 and 𝐵 ∈ 𝐶 we must have 𝐴 ∈ 𝐶). So, for example 9 is not a member of the set
{4, {5, 9}}: this set has two elements, namely the integer 4 and the set {5, 9}. Similarly, the set
{ℕ0 } has one element, namely the set ℕ0 , even though ℕ0 itself is a set with infinitely many
elements. Set operations must also be treated carefully – try justifying the following equality:

{1, 2, {3, 4}} ∩ {{1, 2}, 3, 4} = ∅.
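
These membership examples can be explored in Python, with one caveat that is an artefact of the language rather than of the mathematics: a Python `set` cannot contain another `set`, so nested sets are modelled with `frozenset`. The variable names below are illustrative.

```python
# The set {4, {5, 9}}: two elements, the integer 4 and the set {5, 9}.
S = {4, frozenset({5, 9})}

nine_in_S = 9 in S                        # membership is not transitive
inner_in_S = frozenset({5, 9}) in S

# The two sets from the displayed equality.
left = {1, 2, frozenset({3, 4})}          # elements: 1, 2, {3, 4}
right = {frozenset({1, 2}), 3, 4}         # elements: {1, 2}, 3, 4
common = left & right

if __name__ == "__main__":
    print(len(S), nine_in_S, inner_in_S)
    print(common)
```

Listing the three elements of each set side by side shows that no object appears in both lists, which is the justification the equality asks for.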

¹⁶ Note that this definition only applies to finite sets, so our later results on the sizes of sets specify that the sets in question are finite.

¹⁷ We will see similar proofs later in the course. You can alternatively prove this by induction (try this as an exercise).

1.3 The sum rule, product rule, and inclusion-exclusion formulae
The sum rule for sets specifies the size of the union 𝐶 ∪ 𝐷 of sets 𝐶 and 𝐷 which are disjoint
(remember this means 𝐶 and 𝐷 have no elements in common). Note that the theorem does
not hold for sets which are not disjoint.

Theorem 1.10 (Sum rule). Let $A_1, \dots, A_n$ be a finite list of pairwise-disjoint finite sets; that is, for all distinct $i, j = 1, \dots, n$, $A_i \cap A_j = \emptyset$. Then $\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i|$.

Proof. For each $i = 1, \dots, n$, let $m_i := |A_i|$ and write $A_i = \{a_{i,1}, \dots, a_{i,m_i}\}$. Then all the elements $a_{1,1}, \dots, a_{1,m_1}, a_{2,1}, \dots, a_{2,m_2}, \dots, a_{n,1}, \dots, a_{n,m_n}$ are distinct, since $A_1, \dots, A_n$ are pairwise-disjoint. Therefore $\bigcup_{i=1}^{n} A_i = \{a_{i,j} : i = 1, \dots, n,\ j = 1, \dots, m_i\}$ has $\sum_{i=1}^{n} m_i = \sum_{i=1}^{n} |A_i|$ elements.

If sets 𝐴 and 𝐵 are not disjoint then we cannot apply the sum rule to find |𝐴 ∪ 𝐵|. Instead we can calculate this by the inclusion-exclusion formula, provided that we know the size |𝐴 ∩ 𝐵| of the intersection. The form for two sets is the following.

Theorem 1.11 (Inclusion-exclusion formula for two sets). Suppose that 𝐴 and 𝐵 are finite sets.
Then
|𝐴 ∪ 𝐵| = |𝐴| + |𝐵| − |𝐴 ∩ 𝐵|.

Proof. Note that 𝐴 and 𝐵 ⧵ 𝐴 are disjoint sets whose union is 𝐴 ∪ (𝐵 ⧵ 𝐴) = 𝐴 ∪ 𝐵.¹⁸ So by Theorem 1.10 applied to the two disjoint sets 𝐴 and 𝐵 ⧵ 𝐴 we have

|𝐴 ∪ 𝐵| = |𝐴 ∪ (𝐵 ⧵ 𝐴)| = |𝐴| + |𝐵 ⧵ 𝐴|.

Next note that 𝐵 ∩ 𝐴 and 𝐵 ⧵ 𝐴 are disjoint sets whose union is (𝐵 ∩ 𝐴) ∪ (𝐵 ⧵ 𝐴) = 𝐵. So by Theorem 1.10 applied to the two disjoint sets 𝐵 ∩ 𝐴 and 𝐵 ⧵ 𝐴 we have

|𝐵| = |(𝐵 ∩ 𝐴) ∪ (𝐵 ⧵ 𝐴)| = |𝐵 ∩ 𝐴| + |𝐵 ⧵ 𝐴|.

Combining the two equations (substitute |𝐵 ⧵ 𝐴| = |𝐵| − |𝐴 ∩ 𝐵| into the first) completes the proof.

The inclusion-exclusion formula allows us to calculate information we don’t know from the information that we have available, as in the following example. Most commonly, we will know the sizes of the sets and their intersections and wish to calculate the size of the union, but other scenarios are possible, such as calculating |𝐵| from |𝐴|, |𝐴 ∩ 𝐵| and |𝐴 ∪ 𝐵|.

¹⁸ Make sure you understand why these sets are disjoint and have the claimed union.

Example 1.12. If WeChat tells us that I have 155 friends, you have 274 friends, and we have 25 friends in common, how many friends do we have between us?

Solution. We can describe this scenario in set terms: let 𝐴 be the set of my friends and 𝐵 be the set of your friends. Then we are told that |𝐴| = 155 and |𝐵| = 274. Also, 𝐴 ∩ 𝐵 is the set of friends we have in common, so |𝐴 ∩ 𝐵| = 25. The set of people who are either my friend or your friend is 𝐴 ∪ 𝐵, so applying the inclusion-exclusion formula we find that the number of such people is

|𝐴 ∪ 𝐵| = |𝐴| + |𝐵| − |𝐴 ∩ 𝐵| = 155 + 274 − 25 = 404.

So we have 404 friends between us.

To find the size of the union of three sets we sum the set sizes, then subtract the sizes of
the pairwise intersections, then add the three-way intersection, as follows.

Theorem 1.13 (Inclusion-exclusion formula for three sets). Suppose that 𝐴, 𝐵 and 𝐶 are finite
sets. Then

|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴| + |𝐵| + |𝐶 | − |𝐴 ∩ 𝐵| − |𝐴 ∩ 𝐶 | − |𝐵 ∩ 𝐶 | + |𝐴 ∩ 𝐵 ∩ 𝐶 |.

To prove this theorem we apply the inclusion-exclusion formula for two sets three times,
and also make use of a distributivity law which you should be familiar with from other courses:
for any sets 𝐴, 𝐵 and 𝐶 we have 𝐴 ∩ (𝐵 ∪ 𝐶 ) = (𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶 ).

Proof. Let 𝑌 = 𝐵 ∪ 𝐶. Then the inclusion-exclusion formula for two sets implies that

|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴 ∪ 𝑌 | = |𝐴| + |𝑌 | − |𝐴 ∩ 𝑌 |

However, the distributivity law and inclusion-exclusion formula for two sets imply that

|𝐴 ∩ 𝑌 | = |𝐴 ∩ (𝐵 ∪ 𝐶 )|
= |(𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶 )|
= |𝐴 ∩ 𝐵| + |𝐴 ∩ 𝐶 | − |(𝐴 ∩ 𝐵) ∩ (𝐴 ∩ 𝐶 )|
= |𝐴 ∩ 𝐵| + |𝐴 ∩ 𝐶 | − |𝐴 ∩ 𝐵 ∩ 𝐶 |

Also, the inclusion-exclusion formula for two sets implies that

|𝑌 | = |𝐵 ∪ 𝐶 | = |𝐵| + |𝐶 | − |𝐵 ∩ 𝐶 |

Now the result follows by substitution.



One significant application of this formula is in counting integers which are divisible by
one of several specified integers, as in the following example.

Example 1.14. How many integers between 1 and 1,000 inclusive are divisible by at least one
of the integers 2, 3 and 5?

Solution. Define sets 𝐴, 𝐵 and 𝐶 as follows.

𝐴 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 2},


𝐵 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 3},
𝐶 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 5}.

Then

𝐴 ∩ 𝐵 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 6},


𝐴 ∩ 𝐶 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 10},
𝐵 ∩ 𝐶 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 15},
𝐴 ∩ 𝐵 ∩ 𝐶 = {𝑛 ∈ ℕ+ ∶ 𝑛 ⩽ 1,000 and 𝑛 is divisible by 30}.

For any positive integers $m$ and $r$, the number of integers between 1 and $m$ which are divisible by $r$ is equal to $\lfloor m/r \rfloor$. To see this, note that for each positive integer $n$, if $r$ divides $n$ and $n \leq m$ then $n = \frac{n}{r} \cdot r \leq m$ and $1 \leq \frac{n}{r} \leq \frac{m}{r}$. Therefore
$$|\{n \in \mathbb{Z} : 1 \leq n \leq m \text{ and } n \text{ is divisible by } r\}| = \left|\left\{kr : k \in \mathbb{Z},\ 1 \leq k \leq \frac{m}{r}\right\}\right| = \left\lfloor \frac{m}{r} \right\rfloor.$$
So $|A| = 500$, $|B| = 333$, $|C| = 200$, $|A \cap B| = 166$, $|A \cap C| = 100$, $|B \cap C| = 66$, $|A \cap B \cap C| = 33$.
The set of integers divisible by at least one of 2, 3 and 5 is 𝐴 ∪ 𝐵 ∪ 𝐶, so by the inclusion-
exclusion formula the number of such integers is

|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴| + |𝐵| + |𝐶 | − |𝐴 ∩ 𝐵| − |𝐴 ∩ 𝐶 | − |𝐵 ∩ 𝐶 | + |𝐴 ∩ 𝐵 ∩ 𝐶 |
= 500 + 333 + 200 − 166 − 100 − 66 + 33
= 734.
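
The arithmetic in this example can be cross-checked against a direct count. The sketch below uses integer division for $\lfloor m/r \rfloor$; the helper name `count_multiples` is illustrative.

```python
def count_multiples(m, r):
    """Number of integers in 1..m divisible by r, i.e. floor(m / r)."""
    return m // r


m = 1000

# Inclusion-exclusion for three sets: multiples of 2, 3 and 5.
formula = (
    count_multiples(m, 2) + count_multiples(m, 3) + count_multiples(m, 5)
    - count_multiples(m, 6) - count_multiples(m, 10) - count_multiples(m, 15)
    + count_multiples(m, 30)
)

# Direct count of the union, for comparison.
brute_force = sum(
    1 for n in range(1, m + 1) if n % 2 == 0 or n % 3 == 0 or n % 5 == 0
)

if __name__ == "__main__":
    print(formula, brute_force)
```

The direct count is only practical because 1,000 is small; the formula needs just seven divisions however large $m$ is.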

The versions of this formula for two and three sets hint at the general formula which ap-
plies for any number of sets: we add the sizes of the given sets, then subtract the sizes of their
pairwise intersections. Then we add back the sizes of the three-way intersections, before sub-
tracting the sizes of the four-way intersections, and so forth until all intersections have been
included in the calculation. We leave the proof as an exercise.
Theorem 1.15 (General inclusion-exclusion formula). Suppose that $A_1, A_2, \dots, A_r$ are finite sets. Then
$$\left|\bigcup_{i=1}^{r} A_i\right| = \sum_{\substack{I \subseteq \{1, \dots, r\}, \\ I \neq \emptyset}} (-1)^{|I|+1} \left|\bigcap_{i \in I} A_i\right| = \sum_{k=1}^{r} (-1)^{k+1} \sum_{\substack{I \subseteq \{1, \dots, r\}, \\ |I| = k}} \left|\bigcap_{i \in I} A_i\right|.^{19}$$
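
The right-hand form of the general formula (sum over $k$, then over index sets of size $k$) can be sketched with `itertools.combinations`. The function name `union_size` and the test data are illustrative assumptions; in particular the ranges used for the ‘friends’ sets are just one way to build sets with the sizes from Example 1.12.

```python
from itertools import combinations


def union_size(sets):
    """Size of the union of finite sets via inclusion-exclusion."""
    sets = [set(s) for s in sets]
    total = 0
    for k in range(1, len(sets) + 1):
        sign = (-1) ** (k + 1)          # + for odd k, - for even k
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total


if __name__ == "__main__":
    # Sizes as in Example 1.12: |A| = 155, |B| = 274, |A ∩ B| = 25.
    A = set(range(155))
    B = set(range(130, 404))
    print(union_size([A, B]), len(A | B))

    # Multiples of 2, 3 and 5 up to 1,000, as in Example 1.14.
    multiples = [set(range(r, 1001, r)) for r in (2, 3, 5)]
    print(union_size(multiples), len(set().union(*multiples)))
```

This visits all $2^r - 1$ non-empty index sets, so it is a check of the formula rather than an efficient way to compute a union.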

Recall that the Cartesian product (or set product) of sets 𝐴1 , 𝐴2 , … , 𝐴𝑟 is

𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟 = {(𝑎1 , 𝑎2 , … , 𝑎𝑟 ) ∶ 𝑎1 ∈ 𝐴1 , 𝑎2 ∈ 𝐴2 , … , 𝑎𝑟 ∈ 𝐴𝑟 },

so the elements of 𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟 are ordered 𝑟-tuples whose 𝑗th co-ordinate is a member of 𝐴𝑗 for each 𝑗. One subtle consequence of this definition is that, given sets 𝐴, 𝐵 and 𝐶, the products 𝐴 × 𝐵 × 𝐶 and (𝐴 × 𝐵) × 𝐶 are not quite the same thing: the first is a set of ordered triples (𝑎, 𝑏, 𝑐), whilst the second is a set of ordered pairs ((𝑎, 𝑏), 𝑐) whose first co-ordinate is itself an ordered pair. However, there is a natural correspondence between the two sets given by

(𝑎, 𝑏, 𝑐) ⟷ ((𝑎, 𝑏), 𝑐).

Using this natural correspondence, the sets 𝐴 × 𝐵 × 𝐶 and (𝐴 × 𝐵) × 𝐶 are essentially equivalent for most purposes (in particular, they have the same size); a similar correspondence shows that the same is true of 𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟−1 × 𝐴𝑟 and (𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟−1) × 𝐴𝑟.
The product rule tells us the size of the Cartesian product of a collection of sets. Note that
unlike for the sum rule, we do not require that the sets are disjoint.

Theorem 1.16 (Product rule for two sets). If 𝐴 and 𝐵 are finite sets, then |𝐴 × 𝐵| = |𝐴| ⋅ |𝐵|.

Proof. Let $A = \{a_1, \dots, a_m\}$ and $B = \{b_1, \dots, b_n\}$. So $|A| = m$ and $|B| = n$. We can list the elements of $A \times B$ as
$$\left\{ \begin{array}{cccc} (a_1, b_1), & (a_1, b_2), & \dots & (a_1, b_n), \\ (a_2, b_1), & (a_2, b_2), & \dots & (a_2, b_n), \\ \vdots & \vdots & \ddots & \vdots \\ (a_m, b_1), & (a_m, b_2), & \dots & (a_m, b_n) \end{array} \right\}$$
Altogether there are $m$ rows and $n$ columns in the table, so it has $m \cdot n = |A||B|$ entries; since these are all distinct, we have $|A \times B| = |A||B|$.

An inductive argument based on the product rule for two sets yields a product rule for any
number of finite sets.
¹⁹ Notice the range of summation (underneath $\sum$) in the middle expression: we have two conditions, $I \subseteq \{1, \dots, r\}$ and $I \neq \emptyset$. So for every non-empty subset $I \subseteq \{1, \dots, r\}$, we calculate $(-1)^{|I|+1} \left|\bigcap_{i \in I} A_i\right|$ and add these results together. The same principle applies to general products ($\prod$), unions ($\bigcup$), and intersections ($\bigcap$); e.g. using notation from the previous section, $\bigcap_{i \in \{1, \dots, r\}} A_i = \bigcap_{i=1}^{r} A_i$.

Theorem 1.17 (Product rule for $r$ sets). For any positive integer $r$ and any finite sets $A_1, A_2, \dots, A_r$ we have
$$\left|\prod_{i=1}^{r} A_i\right| = |A_1 \times \cdots \times A_r| = |A_1| \times \cdots \times |A_r| = \prod_{i=1}^{r} |A_i|.$$

Proof. We proceed by induction on $r$.²⁰ The base case $r = 1$ is then a tautology; it states simply that $|A_1| = |A_1|$, which is true.
Now suppose that the statement holds for $r = k$ for some $k \in \mathbb{N}^+$, that is, that for any finite sets $A_1, \dots, A_k$ we have $|A_1 \times \cdots \times A_k| = \prod_{i=1}^{k} |A_i|$. Then, given any finite sets $A_1, \dots, A_{k+1}$, we have
$$\begin{aligned} |A_1 \times \cdots \times A_{k+1}| &= |(A_1 \times \cdots \times A_k) \times A_{k+1}| \\ &= |A_1 \times \cdots \times A_k| \, |A_{k+1}| \\ &= \left( \prod_{i=1}^{k} |A_i| \right) |A_{k+1}| \\ &= \prod_{i=1}^{k+1} |A_i|, \end{aligned}$$
where the first equality holds by the correspondence discussed earlier, the second equality holds by the product rule for two sets applied to the sets $A_1 \times \cdots \times A_k$ and $A_{k+1}$, the third equality holds by the inductive hypothesis, and the final equality is the definition of the product. So the statement holds for $r = k + 1$ also.
Having proved that the statement holds for $r = 1$, and that if it holds for $r = k$ then it also holds for $r = k + 1$, we conclude by the Principle of Mathematical Induction that it holds for every positive integer, as required.

An application of the product rule is in counting the number of factors of an integer, as in the following example.

Example 1.18. How many positive integers are factors of 1,200?

Solution. The prime factorisation of 1,200 is $2^4 \cdot 3 \cdot 5^2$, so the factors of 1,200 are the integers of the form $2^a \cdot 3^b \cdot 5^c$ for 0 ⩽ 𝑎 ⩽ 4, 0 ⩽ 𝑏 ⩽ 1 and 0 ⩽ 𝑐 ⩽ 2.²¹ That is, the factors are $2^a \cdot 3^b \cdot 5^c$ for (𝑎, 𝑏, 𝑐) ∈ 𝐴 × 𝐵 × 𝐶, where 𝐴 = {0, 1, 2, 3, 4}, 𝐵 = {0, 1} and 𝐶 = {0, 1, 2}. So by the product rule,
the number of factors is

|𝐴 × 𝐵 × 𝐶 | = |𝐴||𝐵||𝐶 | = 5 ⋅ 2 ⋅ 3 = 30.
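
The count in Example 1.18 can be cross-checked by brute force. The sketch below (variable names are illustrative) lists the divisors of 1,200 directly and compares them with the numbers generated from the exponent triples (𝑎, 𝑏, 𝑐).

```python
from itertools import product

N = 1200

# Direct search: every d in 1..N that divides N.
brute_force = [d for d in range(1, N + 1) if N % d == 0]

# Product-rule description: 2^a * 3^b * 5^c with a in 0..4, b in 0..1, c in 0..2.
by_exponents = {2**a * 3**b * 5**c
                for a, b, c in product(range(5), range(2), range(3))}

print(len(brute_force), len(by_exponents))
print(set(brute_force) == by_exponents)
```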

20
Exercise: prove Theorem 1.17 by counting choices as in the proof of Theorem 1.9.
21
This is a consequence of uniqueness of prime factorisation, which also implies that the integers of this form
are all distinct. We will formally prove this later in the course (see the fundamental theorem of arithmetic).
Chapter 2

Binary relations

Definition 2.1. A binary relation on a set 𝐴 is a subset $R \subseteq A^2$. For elements 𝑎, 𝑏 ∈ 𝐴, we say 𝑎 is related to 𝑏 (by 𝑅) if (𝑎, 𝑏) ∈ 𝑅. We will write 𝑎 𝑅 𝑏 if (𝑎, 𝑏) ∈ 𝑅, and write 𝑎 𝑅̸ 𝑏 otherwise.
We often just say relation instead of ‘binary relation’, but be aware that there are other types
of relation.

Usually, we think of a relation as a property that we can apply to a pair of objects. How-
ever, just as we showed with sets, if we fix a set 𝐴 then there is a correspondence between the
properties applied to pairs (𝑎, 𝑏) of elements of 𝐴, and binary relations on 𝐴:

Properties ⟷ Binary relations


𝜑(𝑎, 𝑏) ⟼ {(𝑎, 𝑏) ∈ 𝐴2 ∶ 𝜑(𝑎, 𝑏)}
(𝑎, 𝑏) ∈ 𝑅 ⟻ 𝑅

So provided we are only interested in whether a relation holds for certain pairs or not, this
mathematical definition of a relation as a type of set is very useful. As an example, even though
we do not usually think of the relation ⩾ between real numbers as a set, by this correspondence
we could view it as the set of pairs (𝑥, 𝑦) with 𝑥 ⩾ 𝑦. Moreover, we could even draw this relation
as a subset of ℝ2 (see Figure 2.1).

Example 2.2. The following are all valid relations.

(1). The relation 𝑅 ≔ {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)} on the set {1, 2, 3}. See Table 2.1.

(2). < on ℤ (or ℕ+ , ℕ0 , ℚ, ℝ) given by the standard definition of <.

(3). 𝑆 on ℕ0 given by 𝑥 𝑆 𝑦 if |𝑥 − 𝑦| ⩽ 2.

(4). The relation 𝐶 on the set of all people, where 𝑥 𝐶 𝑦 if 𝑥 and 𝑦 live in the same country.

Figure 2.1: The relation ⩾, including the line 𝑦 = 𝑥. Note that this line represents the equality relation on ℝ.

𝑅 1 2 3
1 ✗ ✓ ✓
2 ✗ ✓ ✓
3 ✗ ✗ ✓

Table 2.1: The relation in Example 2.2(1).


When a relation 𝑅 is described with a table, if we want to check whether 𝑥 𝑅 𝑦 or not, we look in the 𝑥-row and 𝑦-column: if there is a tick ✓, then the relation holds; otherwise, there is a cross ✗ and 𝑥 𝑅̸ 𝑦.

(5). For a fixed positive integer 𝑚, ≡𝑚 is the relation on ℤ given by 𝑥 ≡𝑚 𝑦 if and only if
𝑥 − 𝑦 = 𝑚𝑘 for some integer 𝑘.1

Below, we give three commonly useful properties that relations can have.

Definition 2.3. Let 𝑅 be a relation on a set 𝐴. We say that

• 𝑅 is reflexive if for any 𝑎 ∈ 𝐴 we have 𝑎 𝑅 𝑎.

• 𝑅 is symmetric if 𝑎 𝑅 𝑏 implies 𝑏 𝑅 𝑎, for any 𝑎, 𝑏 ∈ 𝐴.

• 𝑅 is transitive if 𝑎 𝑅 𝑏 and 𝑏 𝑅 𝑐 together imply 𝑎 𝑅 𝑐, for any 𝑎, 𝑏, 𝑐 ∈ 𝐴.

Notice that these properties require checking for all elements of 𝐴. Thus, to show that a particular one of these properties fails, we are only required to find one counterexample. Let us see which properties each of the relations in the examples above satisfies.

Example 2.4.

(1). The relation 𝑅 ≔ {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)} on the set {1, 2, 3} is neither reflexive (because 1 𝑅̸ 1) nor symmetric (because 1 𝑅 2 but 2 𝑅̸ 1). This can be easily seen from the table: a reflexive relation has ticks all along the diagonal from top-left to bottom-right, and the table of a symmetric relation is symmetric about this diagonal.
Transitivity is in general trickier to check. Let 𝑎, 𝑏, 𝑐 ∈ {1, 2, 3} be given such that 𝑎 𝑅 𝑏 and 𝑏 𝑅 𝑐. First, note that if either 𝑎 = 𝑏 or 𝑏 = 𝑐 then 𝑎 𝑅 𝑐 immediately. So let us assume that 𝑎 ≠ 𝑏 and 𝑏 ≠ 𝑐. Then since 𝑎 𝑅 𝑏, either 𝑏 = 2 or 𝑏 = 3. However, if 𝑏 = 3 then by 𝑏 𝑅 𝑐 we get that 𝑐 = 3 = 𝑏, which is a contradiction. Thus, 𝑏 = 2 and so

𝑎 𝑅 𝑏, 𝑎 ≠ 𝑏 ⇒ 𝑎 = 1,
𝑏 𝑅 𝑐, 𝑏 ≠ 𝑐 ⇒ 𝑐 = 3.

As 1 𝑅 3, we have thus shown that 𝑅 is transitive.

(2). < on ℤ is not reflexive since, for example, 3 ≮ 3. It is not symmetric since, for example,
2 < 3 but 3 ≮ 2. It is transitive since for any 𝑎, 𝑏, 𝑐 ∈ ℤ with 𝑎 < 𝑏 and 𝑏 < 𝑐 we have 𝑎 < 𝑐.

(3). The relation 𝑆 on ℕ0 given by 𝑥 𝑆 𝑦 if |𝑥 − 𝑦| ⩽ 2 is reflexive, since for any 𝑥 ∈ ℕ0 we have |𝑥 − 𝑥| = |0| = 0 ⩽ 2, so 𝑥 𝑆 𝑥. It is symmetric as for any 𝑥, 𝑦 ∈ ℕ0 with 𝑥 𝑆 𝑦 we have |𝑦 − 𝑥| = |𝑥 − 𝑦| ⩽ 2, so 𝑦 𝑆 𝑥. It is not transitive since, for example, 3 𝑆 4 and 4 𝑆 6 but 3 𝑆̸ 6.
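
For finite relations, the three properties of Definition 2.3 can be checked mechanically. The sketch below stores a relation as a Python set of pairs; the helper names are illustrative, and the relation 𝑆 is restricted to the finite window {0, … , 9} of ℕ0 (an assumption made only so that the check terminates).

```python
from itertools import product

def is_reflexive(A, R):
    # every element must be related to itself
    return all((a, a) in R for a in A)

def is_symmetric(R):
    # every pair in R must appear reversed as well
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    # whenever a R b and b R d, we need a R d
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A = {1, 2, 3}
R = {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)}      # Example 2.4(1)
print(is_reflexive(A, R), is_symmetric(R), is_transitive(R))

W = range(10)                                      # finite window of N_0
S = {(x, y) for x, y in product(W, repeat=2) if abs(x - y) <= 2}
print(is_reflexive(W, S), is_symmetric(S), is_transitive(S))
```

A single failing pair is enough to refute a property, which is why each helper is written as one `all(...)` over candidate counterexamples.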
1
This relation is investigated more in Chapter ??.

The relations 𝐶 and ≡𝑚 in Examples 2.2(4) and (5) are important examples of equivalence
relations, which we will explore in the rest of this chapter.

2.1 Equivalence relations and partitions


Definition 2.5. Suppose that ∼ is a relation on a set 𝐴. We say that ∼ is an equivalence relation
if it is reflexive, symmetric and transitive.

Example 2.6.

(1). For any set 𝐴, the relation 𝑈𝐴 ≔ 𝐴 × 𝐴 is an equivalence relation. Notice that 𝑎 𝑈𝐴 𝑏 holds for all 𝑎, 𝑏 ∈ 𝐴.

(2). For any set 𝐴, the equality relation 𝐸𝐴 ≔ {(𝑎, 𝑎) ∶ 𝑎 ∈ 𝐴} is an equivalence relation. This
follows from the properties of equality.

(3). Let us consider the relation 𝐶 from Example 2.2(4).

• Let 𝑥 be a person. Then 𝑥 lives in the same country as themselves, so 𝑥 𝐶 𝑥. Hence 𝐶 is reflexive.

• Let 𝑥, 𝑦 be people such that 𝑥 𝐶 𝑦. Then 𝑥 and 𝑦 live in the same country, and so
trivially 𝑦 𝐶 𝑥. Therefore 𝐶 is symmetric.

• Let 𝑥, 𝑦, 𝑧 be people such that 𝑥 𝐶 𝑦 and 𝑦 𝐶 𝑧. Then there are countries 𝐴, 𝐵 such
that 𝑥, 𝑦 live in 𝐴, and 𝑦, 𝑧 live in 𝐵. Thus 𝐴 = 𝐵, so 𝑥, 𝑧 live in 𝐴 and hence 𝑥 𝐶 𝑧.
Therefore 𝐶 is transitive.

Thus 𝐶 is an equivalence relation on the set of all people.

The principal reason that equivalence relations are useful is that they divide the set on
which they are defined into sets called equivalence classes which form a partition of 𝐴, mean-
ing that every element of 𝐴 lies in precisely one equivalence class. We now define these terms
formally.

Definition 2.7. Suppose that ∼ is an equivalence relation on a set 𝐴. For any 𝑎 ∈ 𝐴, the equivalence class of 𝑎 is the set [𝑎]∼ = {𝑏 ∈ 𝐴 ∶ 𝑎 ∼ 𝑏}; that is, the set of all elements of 𝐴 to which 𝑎 is related. The quotient set of ∼ is the set of all equivalence classes of members of 𝐴:

𝐴/∼ ≔ {[𝑎]∼ ∶ 𝑎 ∈ 𝐴}.



Example 2.8.

(1). The equivalence classes of the relation 𝐶 from Example 2.2(4) are the sets of people belonging to a fixed country (the country’s population), and the quotient set is the set of all such populations.

(2). The equivalence classes of the relation ≡3 on ℤ are


[−3]≡3 = {… , −9, −6, −3, 0, 3, … }
[−2]≡3 = {… , −8, −5, −2, 1, 4, … }
[−1]≡3 = {… , −7, −4, −1, 2, 5, … }
[0]≡3 = {… , −6, −3, 0, 3, 6, … }
[1]≡3 = {… , −5, −2, 1, 4, 7, … }
[2]≡3 = {… , −4, −1, 2, 5, 8, … }
[3]≡3 = {… , −3, 0, 3, 6, 9, … }

So the set of equivalence classes of ≡3 is ℤ/≡3 = {[0]≡3 , [1]≡3 , [2]≡3 }.
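
To see the three classes of ≡3 emerge, the following sketch groups a finite window of integers (−9 to 9, an arbitrary choice since ℤ itself is infinite) by comparing each element with one representative per class. This relies on the relation being an equivalence relation; the function name is illustrative.

```python
def equivalence_classes(elements, related):
    """Group elements into classes; assumes `related` is an equivalence relation."""
    classes = []
    for a in elements:
        for cls in classes:
            if related(a, cls[0]):      # compare with one representative only
                cls.append(a)
                break
        else:
            classes.append([a])         # a starts a new class
    return classes

window = range(-9, 10)
classes = equivalence_classes(window, lambda x, y: (x - y) % 3 == 0)
for cls in classes:
    print(cls)
```

Within the window, the classes of −3, 0 and 3 coincide, matching the repeated rows in the list above.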

There is an important connection between the equivalence of members and the equality
of their equivalence classes.

Lemma 2.9. Let ∼ be an equivalence relation on a set 𝑋 and let 𝑥, 𝑦 ∈ 𝑋 be given. Then 𝑥 ∼ 𝑦 if
and only if [𝑥]∼ = [𝑦]∼ .

Proof. If [𝑥]∼ = [𝑦]∼ then by reflexivity 𝑥 ∼ 𝑥, so 𝑥 ∈ [𝑥]∼ = [𝑦]∼ and hence 𝑦 ∼ 𝑥; by symmetry, 𝑥 ∼ 𝑦.


Suppose 𝑥 ∼ 𝑦 and let 𝑧 ∈ [𝑦]∼ be given. Then 𝑦 ∼ 𝑧, so by transitivity 𝑥 ∼ 𝑧; i.e. 𝑧 ∈ [𝑥]∼ .
Thus [𝑦]∼ ⊆ [𝑥]∼ . Now let 𝑤 ∈ [𝑥]∼ be given. Then 𝑥 ∼ 𝑤 and by symmetry 𝑦 ∼ 𝑥, so by transi-
tivity again 𝑦 ∼ 𝑤; i.e. 𝑤 ∈ [𝑦]∼ . Therefore [𝑥]∼ ⊆ [𝑦]∼ and hence [𝑥]∼ = [𝑦]∼ .

Equivalence relations are connected with partitions, which we define below.

Definition 2.10. A partition 𝑃 of a set 𝐴 is a collection of non-empty subsets of 𝐴 such that:

• for all 𝑎 ∈ 𝐴, there exists an 𝑋 ∈ 𝑃 with 𝑎 ∈ 𝑋, and

• 𝑃 is pairwise-disjoint (for all distinct 𝑋, 𝑌 ∈ 𝑃, 𝑋 ∩ 𝑌 = ∅).

We call a member 𝑋 ∈ 𝑃 a part (of 𝑃).



Proposition 2.11. Let 𝐴 be a set and let 𝑃 be a collection of non-empty subsets of 𝐴. Then 𝑃 is a
partition of 𝐴 if and only if for all 𝑎 ∈ 𝐴, there exists a unique part 𝑋 ∈ 𝑃 such that 𝑎 ∈ 𝑋.

Proof. Suppose 𝑃 is a partition of 𝐴. Then by definition, each element of 𝐴 belongs to a part of 𝑃. Let 𝑎 ∈ 𝐴 and let 𝑋, 𝑌 ∈ 𝑃 be given such that 𝑎 ∈ 𝑋 and 𝑎 ∈ 𝑌. Then 𝑎 ∈ 𝑋 ∩ 𝑌, so 𝑋 ∩ 𝑌 ≠ ∅. However,
𝑃 is pairwise-disjoint by definition, so we must have 𝑋 = 𝑌. Therefore there is a unique part
that contains 𝑎.
Now suppose for each 𝑎 ∈ 𝐴, there exists a unique part 𝑋 ∈ 𝑃 such that 𝑎 ∈ 𝑋. We need
to show that 𝑃 is pairwise-disjoint, so let 𝑋, 𝑌 ∈ 𝑃 be distinct. If 𝑋 ∩ 𝑌 ≠ ∅ then there exists
an 𝑎 ∈ 𝑋 ∩ 𝑌, which is a contradiction by our assumption. Therefore 𝑋 ∩ 𝑌 = ∅ and so 𝑃 is
pairwise-disjoint. Thus 𝑃 is a partition of 𝐴.

Informally, you can think of a partition as splitting up a set into one or more non-overlapping pieces.

Example 2.12.

(1). If 𝐴 is a set then {{𝑎} ∶ 𝑎 ∈ 𝐴} is a partition of 𝐴.

(2). If 𝐴 is a non-empty set then {𝐴} is a partition of 𝐴.

(3). There are five possible partitions of {1, 2, 3} (See Figure 2.2). These are

𝑃1 = {{1, 2, 3}}, 𝑃2 = {{1, 2}, {3}}, 𝑃3 = {{1, 3}, {2}}, 𝑃4 = {{1}, {2, 3}}, 𝑃5 = {{1}, {2}, {3}}.
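
The list of five partitions can be reproduced by a short recursive enumeration. This is only a sketch (the function name is illustrative): it builds each partition of a list by inserting the first element either into an existing part of a partition of the remaining elements, or into a new part of its own.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of parts (each part a list)."""
    items = list(items)
    if not items:
        yield []                 # the empty set has exactly one partition: no parts
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # put `first` into each existing part in turn
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # or give `first` a part of its own
        yield [[first]] + partition

partitions = list(set_partitions([1, 2, 3]))
for p in partitions:
    print(p)
```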

The following theorem tells us how to translate between equivalence relations and partitions of a set; moreover, these translations are inverses of each other.

Theorem 2.13. Let 𝐴 be a set.

(1). For any equivalence relation ∼ on 𝐴, the quotient set 𝑃∼ ≔ 𝐴/∼ is a partition of 𝐴.

(2). For any partition 𝑃 of 𝐴, we define the relation ∼𝑃 on 𝐴: for all 𝑎, 𝑏 ∈ 𝐴, 𝑎 ∼𝑃 𝑏 if there
exists a part 𝑋 ∈ 𝑃 such that 𝑎, 𝑏 ∈ 𝑋. Then ∼𝑃 is an equivalence relation on 𝐴.

(3). These definitions form a bijective correspondence between the collection of equivalence
relations on 𝐴, and the collection of partitions on 𝐴:

Equivalence relations ↔ Partitions


∼ → 𝑃∼
∼𝑃 ← 𝑃
Figure 2.2: The five partitions of {1, 2, 3}.

Proof.

(1). Let ∼ be an equivalence relation on 𝐴 and let 𝑎 ∈ 𝐴 be given. Then by reflexivity 𝑎 ∼ 𝑎, so 𝑎 ∈ [𝑎]∼ and hence [𝑎]∼ ≠ ∅. Let 𝑋 ∈ 𝑃∼ be given such that 𝑎 ∈ 𝑋. Then there exists a 𝑏 ∈ 𝐴 such that 𝑋 = [𝑏]∼ and so 𝑏 ∼ 𝑎. Thus by Lemma 2.9 𝑋 = [𝑎]∼ .

Therefore each equivalence class (i.e. member of 𝑃∼ ) is non-empty and [𝑎]∼ is the unique
part of 𝑃∼ that contains 𝑎. Thus 𝑃∼ = 𝐴/∼ is a partition of 𝐴.

(2). Let 𝑃 be a partition of 𝐴 and let 𝑎 ∈ 𝐴 be given. Then there exists an 𝑋 ∈ 𝑃 such that 𝑎 ∈ 𝑋,
so by definition 𝑎 ∼𝑃 𝑎. Also by its definition ∼𝑃 is symmetric.

Now let 𝑎, 𝑏, 𝑐 ∈ 𝐴 be given such that 𝑎 ∼𝑃 𝑏 and 𝑏 ∼𝑃 𝑐. Then there exist 𝑋, 𝑌 ∈ 𝑃 such
that 𝑎, 𝑏 ∈ 𝑋 and 𝑏, 𝑐 ∈ 𝑌. Since 𝑏 ∈ 𝑋 ∩ 𝑌, then by uniqueness 𝑋 = 𝑌. Thus 𝑎, 𝑐 ∈ 𝑋 and
so 𝑎 ∼𝑃 𝑐. Therefore ∼𝑃 is an equivalence relation on 𝐴.

(3). Let ∼ be an equivalence relation on 𝐴 and let 𝑎, 𝑏 ∈ 𝐴 be given. Then:

𝑎 ∼𝑃∼ 𝑏 ⟺ ∃𝑋 ∈ 𝑃∼ such that 𝑎, 𝑏 ∈ 𝑋


⟺ ∃𝑐 ∈ 𝐴 such that 𝑎, 𝑏 ∈ [𝑐]∼
⟺ ∃𝑐 ∈ 𝐴 such that 𝑎 ∼ 𝑐 and 𝑏 ∼ 𝑐

Now if 𝑎 ∼ 𝑏 then by reflexivity 𝑏 ∼ 𝑏 too, so 𝑎 ∼𝑃∼ 𝑏. On the other hand, if there exists a 𝑐 ∈ 𝐴
such that 𝑎 ∼ 𝑐 and 𝑏 ∼ 𝑐, then by symmetry 𝑐 ∼ 𝑏 and thus by transitivity 𝑎 ∼ 𝑏. Therefore
𝑎 ∼𝑃∼ 𝑏 if and only if 𝑎 ∼ 𝑏, and hence ∼𝑃∼ = ∼.
Let 𝑃 be a partition on 𝐴. Then for each 𝑎 ∈ 𝐴, there exists a unique 𝑋𝑎 ∈ 𝑃 such that 𝑎 ∈ 𝑋𝑎 .
So for all 𝑎, 𝑏 ∈ 𝐴:

𝑎 ∼𝑃 𝑏 ⟺ ∃𝑌 ∈ 𝑃 such that 𝑎, 𝑏 ∈ 𝑌 ⟺ 𝑏 ∈ 𝑋𝑎 .

Thus [𝑎]∼𝑃 = 𝑋𝑎 and so 𝑃∼𝑃 = 𝐴/∼𝑃 = {[𝑎]∼𝑃 ∶ 𝑎 ∈ 𝐴}.


To finish, we want to show that {[𝑎]∼𝑃 ∶ 𝑎 ∈ 𝐴} = 𝑃. By the above, each ∼𝑃 -equivalence class
is a part of 𝑃, so {[𝑎]∼𝑃 ∶ 𝑎 ∈ 𝐴} ⊆ 𝑃. On the other hand, each 𝑋 ∈ 𝑃 is non-empty so there exists
some 𝑎 ∈ 𝑋 and hence 𝑋 = 𝑋𝑎 = [𝑎]∼𝑃 . Therefore 𝑃∼𝑃 = 𝑃.

This theorem allows us to describe equivalence relations and partitions in terms of each
other; this can be useful when trying to solve a problem in one domain by translating into the
other. As an example, it is easier to describe all equivalence relations on a set using partitions
instead.
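
The two translations of Theorem 2.13 can be tried out on a small finite set. In the sketch below, a relation is a set of pairs and a partition is a set of frozensets; the function names and the example partition are assumptions made for illustration.

```python
def partition_from_relation(A, R):
    """Quotient set A/~ for an equivalence relation R on A."""
    return {frozenset(b for b in A if (a, b) in R) for a in A}

def relation_from_partition(P):
    """a ~ b exactly when a and b lie in a common part of P."""
    return {(a, b) for X in P for a in X for b in X}

A = {1, 2, 3, 4}
P = {frozenset({1, 2}), frozenset({3}), frozenset({4})}   # hypothetical partition of A

R = relation_from_partition(P)
print(sorted(R))

# Going partition -> relation -> partition returns the original partition,
# and relation -> partition -> relation returns the original relation.
print(partition_from_relation(A, R) == P)
print(relation_from_partition(partition_from_relation(A, R)) == R)
```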
Chapter 3

Finite counting

The product rule, sum rule and inclusion-exclusion formulae are the starting points for our
counting arguments, though we often use these implicitly by ‘counting choices’. Indeed, given
a collection of sets, then the sum rule or inclusion-exclusion formulae tell us how many ways
there are to choose one item from these sets (that is, a single item is taken from the union of
all the sets, so only one item is chosen in total). The product rule tells us instead how many
ways there are to choose one item from each set.

Example 3.1. One student society has 12 members, and another has 23 members; they have
no members in common. How many ways are there to choose from the memberships

(1). a single representative for the societies (who can be from either)?

(2). a representative from each society?

Solution. Let 𝐴 be the set of members of the first society, and 𝐵 the set of members of the
second society. Choosing a single representative for the societies means choosing an element
of 𝐴 ∪ 𝐵, so the answer to (1) is |𝐴 ∪ 𝐵| = |𝐴| + |𝐵| = 12 + 23 = 35 by the sum rule. Similarly,
choosing a representative from each society means choosing a pair (𝑎, 𝑏) ∈ 𝐴 × 𝐵, as 𝑎 is then
the representative for the first society and 𝑏 is then the representative for the second. So the
answer to (2) is |𝐴 × 𝐵| = |𝐴||𝐵| = 12 × 23 = 276 by the product rule. Another way to put this
is that there are 12 choices for the first representative, and for each of these choices there are
then 23 choices for the second representative, so in total there are 12 lots of 23 choices, giving
12 × 23 possibilities in total.

In this chapter, we will explore how to calculate the number of possible ways of choosing
multiple elements of a set, subject to certain restrictions; for example:


• Are we allowed to choose the same element more than once?

• Does the order in which the elements are chosen matter? That is, if we first choose 𝑎 and
then 𝑏, should we consider this choice to be different from choosing 𝑏 first and then 𝑎?

Several of our examples in this chapter will consider probabilities of events. These are naturally linked to counting results: if a random experiment has a finite number of possible outcomes, and every outcome is equally likely, then the probability of an event 𝐸 is

\[ \mathbb{P}(E) = \frac{\text{number of outcomes for which } E \text{ occurs}}{\text{total number of outcomes}}. \]

We say that a selection is made uniformly at random if every outcome is equally likely. For example, rolling a fair standard 6-sided die selects an element of the set {1, 2, 3, 4, 5, 6} uniformly at random.¹ Likewise, in the example above, if we choose uniformly at random a single representative from either society, then each person has probability 1/35 of being chosen.
When giving a numerical answer, it is sometimes more useful to round the number. Sup-
pose we have a real number 𝑥 written as a decimal ±𝑎𝑛 … 𝑎0 .𝑎−1 𝑎−2 … , where the 𝑎𝑖 ’s are
digits. The 1st significant figure is the first non-zero digit, starting from the left. Every digit
to the right of this digit is called a significant figure, so for example the 3rd significant figure
would be 2 places to the right of the 1st significant figure, even if that digit is 0. For a positive
integer 𝑛, to round to 𝑛 significant figures:

(1). If the (𝑛 + 1)th significant figure is 5 or more, then we add 1 to the 𝑛th significant figure.
If the 𝑛th significant figure was 9, then we instead change it to 0 and add 1 to the digit
to the left, repeating this process if this digit is also 9.

(2). After this, we replace all digits to the right of the 𝑛th significant figure by 0, and we can
ignore the 0 digits to the right of both the 𝑛th significant figure and the decimal point ‘.’.

To round to 𝑛 decimal places, we instead start with the 𝑛th digit to the right of the decimal
point ‘.’ and conduct the same process.
Depending on how 𝑥 was written as a decimal ±𝑎𝑛 … 𝑎0 .𝑎−1 𝑎−2 … , we may need to add
more 0 digits for this process to work. We write (𝑛 s.f.) or (𝑛 d.p.) after a number to indicate
that the number has been rounded to 𝑛 significant figures or 𝑛 decimal places respectively.
Unless stated otherwise, you must give the exact number for an answer! If you round to 𝑛 decimal places, or are rounding to 𝑛 significant figures and the 𝑛th significant figure is after the decimal point, then you must write out that digit, even if it is 0. Table 3.1 gives some examples demonstrating the above procedure.
1
All random selections considered in this course will be made uniformly at random; in other courses you will see examples of non-uniform, random selections.

𝑥           𝑒          √0.99      $\pi^3$     1/137
(1 s.f.)    3          1.0        30          0.007
(2 s.f.)    2.7        0.99       31          0.0073
(3 s.f.)    2.72       0.995      31.0        0.00730
(4 s.f.)    2.718      0.9950     31.01       0.007299
(5 s.f.)    2.7183     0.99499    31.006      0.0072993
(1 d.p.)    2.7        1.0        31.0        0.0
(2 d.p.)    2.72       0.99       31.01       0.01
(3 d.p.)    2.718      0.995      31.006      0.007
(4 d.p.)    2.7183     0.9950     31.0063     0.0073
(5 d.p.)    2.71828    0.99499    31.00628    0.00730

Table 3.1: Several examples of rounding.
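
The rounding procedure can be sketched with Python's `decimal` module, which keeps trailing zeros and so mirrors the "write out that digit, even if it is 0" convention. The function names `round_sig` and `round_dp` are illustrative, and the inputs are passed as truncated decimal strings, which is an assumption of this sketch rather than part of the notes.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(x: str, n: int) -> Decimal:
    """Round the decimal string x to n significant figures (5 or more rounds up)."""
    d = Decimal(x)
    if d.is_zero():
        return d
    # adjusted() is the power of ten of the first significant figure
    place = d.adjusted() - (n - 1)
    return d.quantize(Decimal(1).scaleb(place), rounding=ROUND_HALF_UP)

def round_dp(x: str, n: int) -> Decimal:
    """Round the decimal string x to n decimal places."""
    return Decimal(x).quantize(Decimal(1).scaleb(-n), rounding=ROUND_HALF_UP)

e_digits = "2.718281828"          # truncated expansion of e
one_over_137 = "0.00729927007"    # truncated expansion of 1/137

print(round_sig(e_digits, 4), round_dp(e_digits, 3))
print(round_sig(one_over_137, 3), round_dp(one_over_137, 2))
```

For large roundings such as 𝜋³ to 1 s.f., the result compares equal to 30 but is displayed by `Decimal` in exponent form.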

3.1 Ordered choice


We first consider the situation when the order in which items are chosen does matter. The use
of factorials simplifies many expressions in this chapter.2

Definition 3.2. For any non-negative integer 𝑛, we define 𝑛 factorial by

\[ n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1 = \prod_{m=1}^{n} m. \]³

Alternatively, we can define factorials using recursion4 :

• 0! ≔ 1,
2
Be careful not to mix up the mathematical and punctuational uses of the ‘!’ symbol! For example, if you solve
a problem and write ‘the answer is 10!’, do you mean 10 or 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 3,628,800?
3
The Π notation is for products, just like Σ notation is for sums. Given a finite list 𝑎1 , … , 𝑎𝑛 of numbers, $\prod_{m=1}^{n} a_m := a_1 \times \cdots \times a_n$. For the empty list, we define the empty product to be 1, which is equal to the number of lists with no elements.
4
A recursive definition of a function 𝑓 is a definition that depends upon other values of the function. In this
case, the value of (𝑛+1)! depends on the value of 𝑛!. Usually, a recursive definition will depend on smaller values.

• for all 𝑛 ∈ ℕ0 , (𝑛 + 1)! ≔ (𝑛 + 1) ⋅ (𝑛!).

Notice that both definitions agree, as can be shown by induction.
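
The agreement of the two definitions can also be checked numerically for small 𝑛. The sketch below implements the product form and the recursive form separately (the function names are illustrative) and compares them with the standard library's `math.factorial`.

```python
import math

def factorial_product(n: int) -> int:
    """n! as the product of 1, 2, ..., n (the empty product is 1 when n = 0)."""
    result = 1
    for m in range(1, n + 1):
        result *= m
    return result

def factorial_recursive(n: int) -> int:
    """n! via 0! = 1 and (n + 1)! = (n + 1) * n!."""
    return 1 if n == 0 else n * factorial_recursive(n - 1)

for n in range(6):
    print(n, factorial_product(n), factorial_recursive(n))
```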

Theorem 3.3. Let 𝑟 and 𝑛 be non-negative integers, and let 𝑆 be a set of size 𝑛.

(1). There are $n^r$ different possible ways to make 𝑟 choices from 𝑆, if the order of choices matters and repetition is allowed. In other words, there are $n^r$ sequences of 𝑟 elements of 𝑆 (allowing repetition). Moreover, if each choice from 𝑆 is made uniformly at random and independently from the previous choices, then each of these outcomes has equal probability $n^{-r}$.

(2). For 𝑟 ⩽ 𝑛 there are $\frac{n!}{(n-r)!}$ different possible ways to make 𝑟 choices from 𝑆, if the order of choices matters but repetition is forbidden.⁵ In other words there are $\frac{n!}{(n-r)!}$ sequences of 𝑟 elements of 𝑆 in which no element occurs more than once. Moreover, if each choice from 𝑆 is made uniformly at random from those elements of 𝑆 not previously chosen, then each of these outcomes has equal probability $\frac{(n-r)!}{n!}$.

(3). For 𝑟 > 𝑛 it is not possible to make 𝑟 choices from 𝑆 if repetition is forbidden.

Proof. (1). First, let us suppose that 𝑟 is positive. Then the ordered sequences of 𝑟 elements of 𝑆 with repetition allowed are precisely the elements of $S^r$, so by the product rule there are exactly $|S|^r = n^r$ such sequences. Moreover, for any such sequence (𝑥1 , … , 𝑥𝑟 ) and each 1 ⩽ 𝑖 ⩽ 𝑟, the probability that 𝑥𝑖 is chosen at the 𝑖-th choice is $n^{-1}$; since these events are independent, the probability that the sequence (𝑥1 , … , 𝑥𝑟 ) is the outcome of our random selections is $n^{-r}$.

If 𝑟 = 0 then there is only one possible way of not choosing any element (or equivalently, one empty sequence), and this coincides with the usual convention that $n^0 = 1$. As there is only one possible outcome, the probability of that outcome is $1 = n^{-0}$.

(2). Suppose we are constructing a selection (𝑥1 , … , 𝑥𝑟 ) of 𝑟 elements from 𝑆 by choosing one element at a time. So if we have already chosen 𝑥1 , … , 𝑥𝑘−1 (for 𝑘 ∈ {1, … , 𝑟}), then 𝑥𝑘 must belong to the set 𝑆 ⧵ {𝑥1 , … , 𝑥𝑘−1 } and so there are |𝑆 ⧵ {𝑥1 , … , 𝑥𝑘−1 }| = 𝑛 − 𝑘 + 1 possibilities. Therefore, if 𝑟 > 0 then there are

\[ \prod_{k=1}^{r} (n-k+1) = \frac{[n \times \cdots \times (n-r+1)] \times [(n-r) \times \cdots \times 1]}{(n-r) \times \cdots \times 1} = \frac{n!}{(n-r)!} \]

ways to successively choose 𝑟 elements from 𝑆 without repetition. Moreover, having already chosen 𝑥1 , … , 𝑥𝑘−1 , the probability of choosing any particular element of 𝑆 ⧵ {𝑥1 , … , 𝑥𝑘−1 } is $\frac{1}{n-k+1}$. Multiplying these conditional probabilities together, we find that the probability of constructing any particular sequence (𝑥1 , … , 𝑥𝑟 ) is

\[ \prod_{k=1}^{r} \frac{1}{n-k+1} = \frac{(n-r) \times \cdots \times 1}{[n \times \cdots \times (n-r+1)] \times [(n-r) \times \cdots \times 1]} = \frac{(n-r)!}{n!}. \]

As in (1), if 𝑟 = 0 then there is only $1 = \frac{n!}{(n-0)!}$ possible outcome (the empty sequence), and the probability of that outcome is $1 = \frac{(n-0)!}{n!}$.
5
Some authors use the notation 𝐴𝑛𝑟 for $\frac{n!}{(n-r)!}$.

(3). If 𝑟 > 𝑛 then, by the pigeonhole principle, whenever we make 𝑟 choices from the 𝑛 ele-
ments of 𝑆 there must be some element which is chosen at least twice, so it is not possible
to choose 𝑟 elements of 𝑆 without repeating some element.

Example 3.4. A bag contains 5 balls, labelled 1, 2, 3, 4, and 5. I draw out two balls in turn from
the bag.6 What is the probability that the number on the second ball drawn is precisely one
greater than the number on the first ball drawn, if we draw balls according to the following
rules:

(1). I replace the first ball before drawing the second?

(2). I do not replace the first ball before the second is drawn?

Solution. Let 𝑥 be the number on the first ball drawn and 𝑦 be the number on the second
ball drawn, so each outcome of drawing the balls is a pair (𝑥, 𝑦). There are four outcomes of
the selection which give the event described, namely (1, 2), (2, 3), (3, 4) and (4, 5), so we need to
calculate the total number of possible outcomes.

(1). We are making two choices from the set {1, 2, 3, 4, 5} with repetition allowed (because we replaced the first ball so it could be drawn again), and where order matters. So by Theorem 3.3(1) there are $5^2 = 25$ possible outcomes, each of which is equally likely, and so the probability is $\frac{4}{25} = 0.16$.
25
6
Many questions of this type involve drawing balls from bags, or rolling dice, or dealing cards from a deck.
Unless stated otherwise you should assume that each selection is uniformly random, that is, that each ball is
equally likely to be drawn, each card is equally likely to be dealt, each face of the die is equally likely to come up,
etc. Furthermore, if multiple dice are rolled, balls are drawn or cards are dealt, then you should assume that the
selections are independent of each other unless otherwise specified.

(2). We are again making two choices from the set {1, 2, 3, 4, 5} where order matters, but now repetition is forbidden because we cannot draw the first ball again once it has been drawn. So by Theorem 3.3(2) there are $\frac{5!}{(5-2)!} = 5 \times 4 = 20$ possible outcomes, each of which is equally likely, and thus the probability is $\frac{4}{20} = \frac{1}{5} = 0.2$.

Definition 3.5. A permutation of a set 𝑆 with 𝑟 elements is an 𝑟-tuple (𝑥1 , … , 𝑥𝑟 ) listing all elements of 𝑆 exactly once; i.e. 𝑆 = {𝑥1 , … , 𝑥𝑟 }.

For example, the permutations of {1, 2, 3} are

(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).

So there are 6 = 3! permutations of a set of three elements. We can also view permutations of a set 𝑆 as the bijections 𝑓 ∶ 𝑆 → 𝑆. If we fix an enumeration 𝑆 = {𝑥1 , … , 𝑥𝑟 } of a set with 𝑟 elements, then we get the following bijective correspondence between permutations as we have defined them above, and bijections on the set 𝑆:

𝑟-tuples ↔ Bijections
(𝑦1 , … , 𝑦𝑟 ) ↦ 𝑓 ∶ 𝑆 → 𝑆, 𝑥𝑖 ↦ 𝑦𝑖
(𝑓 (𝑥1 ), … , 𝑓 (𝑥𝑟 )) ↤ 𝑓 ∶𝑆→𝑆

Corollary 3.6. A set 𝑆 of 𝑛 elements has 𝑛! permutations.

Proof. The permutations of 𝑆 are exactly the sequences of 𝑛 elements of 𝑆 in which no element is repeated. This number is $\frac{n!}{(n-n)!} = \frac{n!}{0!} = n!$ by Theorem 3.3(2) applied with 𝑟 = 𝑛.

Example 3.7. There are 𝑛! ways for 𝑛 people to line up in a queue, since each order is a per-
mutation of the set of people in the queue.

We conclude this section with two more examples of ordered choice.

Example 3.8. How many anagrams7 are there of the word MATHS? In how many of these is
‘T’ immediately before ‘H’?

Solution. Each anagram of MATHS is a permutation of the set {M, A, T, H, S}, so by Corollary 3.6
there are 5! = 120 anagrams. For the second part, note that the anagrams of MATHS in which
‘T’ is immediately before ‘H’ can be thought of as the permutations of the set {M, A, TH, S}
(i.e. we treat ‘TH’ as a single letter), and Corollary 3.6 tells us there are 4! = 24 of these.
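
As a brute-force check of Example 3.8, the following sketch generates every arrangement of the letters of MATHS and filters those containing ‘TH’ as a block (variable names are illustrative).

```python
from itertools import permutations

anagrams = ["".join(p) for p in permutations("MATHS")]   # all 5 letters are distinct
with_th = [word for word in anagrams if "TH" in word]    # 'T' immediately before 'H'

print(len(anagrams), len(with_th))
```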
7
An anagram of a word is a word obtained by rearranging the letters, including the original word. In this
course, the anagrams do not have to be real words.

Example 3.9. I roll four standard dice.⁸ What is the probability that the numbers obtained are consecutive?⁹

Solution. If we imagine rolling the dice in turn, then we are making four choices from the set {1, 2, 3, 4, 5, 6} in which order matters and where repetition is allowed. So by Theorem 3.3(1) there are $6^4 = 1{,}296$ possible outcomes, each of which is equally likely. The outcomes for which the numbers obtained are consecutive are the permutations of {1, 2, 3, 4}, {2, 3, 4, 5} and {3, 4, 5, 6}. Each of these sets has 4! = 24 permutations by Corollary 3.6, so in total there are 24 × 3 = 72 such outcomes. Therefore the probability is $\frac{72}{1{,}296} = \frac{1}{18} = 0.056$ (2 s.f.).
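
The count of 72 favourable outcomes can be confirmed by listing all ordered rolls. In this sketch a roll counts as favourable when its sorted values increase in steps of exactly 1, which is the reading used in the solution above; the helper name `is_run` is illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=4))    # ordered outcomes of four dice

def is_run(roll):
    """True if the values, once sorted, increase in steps of exactly 1."""
    s = sorted(roll)
    return all(b == a + 1 for a, b in zip(s, s[1:]))

favourable = [roll for roll in rolls if is_run(roll)]
print(len(rolls), len(favourable), Fraction(len(favourable), len(rolls)))
```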

3.2 Unordered choice


There are many situations where the order in which certain objects are chosen is irrelevant.
For example, in a lottery only which numbers are drawn is important. We first introduce the
binomial coefficients and their notation, as they frequently occur when discussing unordered
choices.

Definition 3.10. Let 𝑛 and 𝑟 be non-negative integers with 𝑟 ⩽ 𝑛. Then the binomial coefficient of 𝑛 and 𝑟, also called 𝑛 choose 𝑟 and denoted $\binom{n}{r}$,¹⁰ is defined to be

\[ \binom{n}{r} := \frac{n!}{r! \cdot (n-r)!} = \frac{n(n-1)(n-2) \cdots (n-r+1)}{r!}. \]

The following properties of binomial coefficients are left as a simple exercise.

Proposition 3.11. Let 𝑛 ∈ ℕ0 be given. Then:

(1). For all 𝑟 ∈ {0, … , 𝑛}, $\binom{n}{r} = \binom{n}{n-r}$.

(2). $\binom{n}{0} = 1 = \binom{n}{n}$.

(3). If 𝑛 ⩾ 1 then $\binom{n}{1} = n = \binom{n}{n-1}$.

(4). If 𝑛 ⩾ 2 then $\binom{n}{2} = \frac{n(n-1)}{2} = \binom{n}{n-2}$.
8
A standard die has 6 sides, numbered 1 to 6 (usually with dots) and each face has a 1/6 chance of facing up when rolled.
9
A list of numbers is consecutive if the next term is 1 plus the previous term.
10
Other authors use 𝐶𝑛𝑟 .

For example, $\binom{10}{2} = \frac{10 \times 9}{2 \times 1} = 45$. This is much simpler than the following calculation:

\[ \binom{10}{2} = \frac{10!}{2! \times 8!} = \frac{3{,}628{,}800}{2 \times 40{,}320} = \frac{3{,}628{,}800}{80{,}640} = 45. \]

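As another example,

\[
\begin{aligned}
\binom{14}{7} &= \frac{14 \times 13 \times 12 \times 11 \times 10 \times 9 \times 8}{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1} \\
&= \frac{14}{7 \times 2} \times 13 \times \frac{12}{6} \times 11 \times \frac{10}{5} \times \frac{9}{3} \times \frac{8}{4} \\
&= 1 \times 13 \times 2 \times 11 \times 2 \times 3 \times 2 \\
&= 3{,}432.
\end{aligned}
\]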
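
These hand calculations, and the symmetry properties in Proposition 3.11, can be checked with `math.comb` from the Python standard library (available from Python 3.8). The helper `binomial_from_factorials` is an illustrative name for the defining formula.

```python
from math import comb, factorial

def binomial_from_factorials(n: int, r: int) -> int:
    """n! / (r! (n - r)!), the defining formula; the division is always exact."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(comb(10, 2), binomial_from_factorials(10, 2))
print(comb(14, 7), binomial_from_factorials(14, 7))

# Symmetry: choosing r elements is the same as choosing the n - r to leave out.
n = 14
print(all(comb(n, r) == comb(n, n - r) for r in range(n + 1)))
```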

Theorem 3.12. Let 𝑆 be a finite set of size 𝑛 and let 𝑟 ∈ {0, … , 𝑛} be given. Then there are $\binom{n}{r}$ different possible ways to make 𝑟 successive choices from 𝑆 if the order is irrelevant and repetition is forbidden. Equivalently, the number of subsets 𝑅 ⊆ 𝑆 of size 𝑟 is $\binom{n}{r}$. Furthermore, if each choice from 𝑆 is made uniformly at random from those elements of 𝑆 which have not previously been chosen, then each of these outcomes has equal probability $\binom{n}{r}^{-1}$.

To see that the first two statements are equivalent, recall that a set does not distinguish
between repeated elements, nor the order in which they appear.

Proof. Let 𝐴 denote the set of ordered choices (𝑥1 , … , 𝑥𝑟 ) of 𝑟 elements from 𝑆 without repetition, so by Theorem 3.3(2) $|A| = \frac{n!}{(n-r)!}$. Let 𝑚 denote the number of unordered choices of 𝑟 elements from 𝑆 without repetition, and for each 𝑟-element subset 𝐶 ⊆ 𝑆, let 𝐵𝐶 denote the set of permutations of 𝐶. Then by Corollary 3.6 |𝐵𝐶 | = 𝑟!, and as the collection {𝐵𝐶 ∶ 𝐶 ⊆ 𝑆, |𝐶 | = 𝑟} is pairwise-disjoint with union 𝐴, by the sum rule we get

\[ \frac{n!}{(n-r)!} = |A| = \Bigl| \bigcup_{C \subseteq S,\ |C| = r} B_C \Bigr| = \sum_{C \subseteq S,\ |C| = r} |B_C| = m \cdot r! \quad \Longrightarrow \quad m = \frac{n!}{r! \cdot (n-r)!} = \binom{n}{r}. \]

Recall from Theorem 3.3(2) that the probability of any particular ordered choice is $\frac{(n-r)!}{n!}$. Thus, as each unordered choice {𝑥1 , … , 𝑥𝑟 } arises from exactly 𝑟! ordered choices (one for each permutation of that choice), the probability of choosing {𝑥1 , … , 𝑥𝑟 } is $r! \cdot \frac{(n-r)!}{n!} = \binom{n}{r}^{-1}$.
𝑛!

Example 3.13. How many 5-card poker hands11 are there? How many contain two aces, a
king, and two other cards (i.e. not an ace or king)? What is the probability that a random 5-
card hand has this form?

Solution. Note that the order of cards is irrelevant and that repetition is forbidden. Thus, the answer to the first part is the number of ways to choose 5 cards out of 52 without repetition and ignoring order, which is

\[ \binom{52}{5} = \frac{52 \times 51 \times 50 \times 49 \times 48}{5 \times 4 \times 3 \times 2 \times 1} = \frac{52}{4} \times \frac{51}{3} \times \frac{50}{5 \times 2} \times 49 \times 48 = 13 \times 17 \times 5 \times 49 \times 48 = 2{,}598{,}960. \]

For the second part, there are $\binom{4}{2}$ ways to choose 2 of the 4 aces in the deck, and there are $\binom{4}{1}$ ways to pick one of the four kings in the deck. Then, whatever we have chosen so far, there are $\binom{44}{2}$ ways to choose two cards from the 44 = 11 × 4 cards which are not an ace or a king. Observe that all these choices are independent of each other, so by the product rule the total number of such hands is

\[ \binom{4}{2} \binom{4}{1} \binom{44}{2} = \frac{4 \times 3}{2} \times 4 \times \frac{44 \times 43}{2} = 6 \times 4 \times 946 = 22{,}704. \]

In particular, the probability that a random 5-card hand contains 2 aces, a king, and two other cards, is

\[ \frac{22{,}704}{2{,}598{,}960} = \frac{473}{54{,}145} = 0.00874 \text{ (3 s.f.)}. \]

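
The arithmetic in Example 3.13 can be reproduced with exact integer and rational arithmetic. This is a sketch using `math.comb` and `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 5)                            # all 5-card hands
target_hands = comb(4, 2) * comb(4, 1) * comb(44, 2) # 2 aces, 1 king, 2 other cards

probability = Fraction(target_hands, total_hands)    # reduced automatically
print(total_hands, target_hands, probability)
print(f"{float(probability):.3g}")                   # 3 significant figures
```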
Example 3.14. In the China Union Lotto we choose 6 numbers from 1 to 33, and another
number from 1 to 16. The lottery draw then selects 6 red balls, numbered 1–33, and a blue
ball, numbered 1–16. What is the probability of winning the jackpot by matching all seven
numbers? What is the probability of matching four red numbers?

Solution. The order in which the balls are selected is irrelevant, and a ball cannot be selected more than once. So the number of possibilities for the six red numbers drawn is $\binom{33}{6}$, and $\binom{16}{1} = 16$ for the one blue number. Each outcome is equally likely, so the probability that the outcome matches the seven numbers we chose is therefore

\[ \left( 16 \times \binom{33}{6} \right)^{-1} = \frac{1}{17{,}721{,}088} = 0.000000056 \text{ (2 s.f.)}. \]

For the second part, there are $\binom{6}{4}$ ways of choosing 4 of the red numbers to match, and there are $\binom{33-6}{2} = \binom{27}{2}$ choices for the two red balls that do not match. So the probability of matching exactly 4 of the red balls is

\[ \frac{\binom{6}{4} \binom{27}{2}}{\binom{33}{6}} = \frac{5{,}265}{1{,}107{,}568} = 0.0048 \text{ (2 s.f.)}. \]
11
Poker is a card game played with a standard 52-card deck, as described in the footnote of Example 1.7.
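
Both lottery probabilities can be recomputed exactly; in particular, the numerator for exactly four matching red numbers is 15 × 351. The sketch below uses illustrative variable names.

```python
from fractions import Fraction
from math import comb

red_draws = comb(33, 6)                  # possible sets of six red numbers
jackpot = Fraction(1, 16 * red_draws)    # all six reds and the blue number match

# Exactly four reds match: choose which 4 of our 6 numbers are drawn,
# then the other 2 drawn reds come from the 27 numbers we did not pick.
four_red = Fraction(comb(6, 4) * comb(27, 2), red_draws)

print(jackpot, four_red)
print(f"{float(jackpot):.2g}", f"{float(four_red):.2g}")
```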

Notice that we can ignore the blue numbers / balls in the second part of this question.

Finally, we consider the case where repetition is allowed but we don’t care about the order
of choices.12

Theorem 3.15. Let 𝑛, 𝑟 ∈ ℕ+ be given and let 𝑆 be a finite set of size 𝑛. Then there are $\binom{n+r-1}{r}$ different possible ways to make 𝑟 successive choices from 𝑆 if repetition is allowed but the order of choosing is irrelevant.

Proof. All that matters when considering the choices in this theorem is how many times each
element is chosen. To represent such a choice, we first list the elements of 𝑆 = {𝑥1 , … , 𝑥𝑛 } and
consider a string of symbols consisting of 𝑟 stars (⋆) and 𝑛 − 1 bars (∣) and count the number
of consecutive stars:
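$$\underbrace{\star \cdots \star}_{a_1 \text{ stars}} \;\mid\; \underbrace{\star \cdots \star}_{a_2 \text{ stars}} \;\mid\; \cdots \;\mid\; \underbrace{\star \cdots \star}_{a_n \text{ stars}}$$
(reading from left to right, the bars shown are the 1st, 2nd, … , (𝑛 − 1)th bar).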

The 𝑖th consecutive group of stars has 𝑎𝑖 stars, where 𝑎𝑖 is a non-negative integer. It is easily
seen that each unordered choice with repetition corresponds to exactly one such string of
symbols, where 𝑎𝑖 represents the number of times that 𝑥𝑖 was chosen. Thus, the number of
such choices is equal to the number of strings of symbols with the above form. We can create
a string by choosing 𝑛 − 1 positions to place the bars and put stars in the remaining places.
Therefore, by Proposition 3.11(1), there are $\binom{n+r-1}{n-1} = \binom{n+r-1}{r}$ such choices.
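
To make the correspondence in the proof concrete, here is a small sketch that lists every unordered choice with repetition and encodes it as a string of stars and bars. It assumes 𝑆 = {1, … , 𝑛}; the helper names `count_unordered_with_repetition` and `to_stars_and_bars` are hypothetical, introduced only for this illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def count_unordered_with_repetition(n, r):
    """Count the multisets of size r drawn from {1, ..., n} by listing them."""
    return sum(1 for _ in combinations_with_replacement(range(1, n + 1), r))


def to_stars_and_bars(choice, n):
    """Encode a multiset (a tuple of elements of {1, ..., n}) as stars and bars.

    The i-th group of stars has one star for each time i was chosen,
    and consecutive groups are separated by one of the n - 1 bars.
    """
    groups = ["*" * choice.count(i) for i in range(1, n + 1)]
    return "|".join(groups)


print(count_unordered_with_repetition(6, 2), comb(6 + 2 - 1, 2))
print(to_stars_and_bars((1, 1, 3), 3))
```

Each encoded string has exactly 𝑟 stars and 𝑛 − 1 bars, and distinct multisets give distinct strings, which is the bijection the proof relies on.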

Corollary 3.16. For positive integers 𝑟 and 𝑛, the number of non-negative integer solutions of
$x_1 + x_2 + \cdots + x_n = r$ is $\binom{n+r-1}{r}$.

12
Be very careful when using Theorem 3.15 to calculate probabilities, as unlike in Theorems 3.3 and 3.12 these
will often not be uniform. For example, if I roll two identical standard dice and ignore the order in which they
are rolled, by Theorem 3.15 there are $\binom{6+2-1}{2} = 21$ distinct possible outcomes, but these are not equally likely.
For instance, I am twice as likely to get a three and a four than I am to get two sixes. For this reason, when we
repeat random experiments where repetition is allowed, such as rolling dice or drawing balls from a bag with
replacement, we almost always count ordered outcomes and use Theorem 3.3. On the other hand, as we have
already seen, when we repeat random experiments with repetition forbidden, such as dealing cards from a deck
or drawing balls without replacement, we can either count outcomes with order using Theorem 3.3 or count
outcomes without order using Theorem 3.12; the latter is usually simpler.

                          respecting order         ignoring order
allowing repetition       $n^r$                    $\binom{n+r-1}{r}$
forbidding repetition     $\frac{n!}{(n-r)!}$      $\binom{n}{r}$

Table 3.2: The number of ways to make 𝑟 choices from 𝑛 objects, subject to various
conditions.

The importance of this corollary is that it is the number of ways of distributing 𝑟 identical
objects among 𝑛 people. So, for example, if I want to share 7 pound coins among 5 people
then there are $\binom{5+7-1}{7} = \binom{11}{7} = 330$ possible ways to do this.

Proof. The non-negative integer solutions of the equation $x_1 + \cdots + x_n = r$ correspond exactly
to the different ways to choose 𝑟 objects from {1, … , 𝑛} where we allow repetition
and the order does not matter: for each solution $(x_1, \ldots, x_n)$ we choose 𝑖 exactly $x_i$ times, for each
𝑖 = 1, … , 𝑛. Therefore by the above theorem, there are $\binom{n+r-1}{r}$ such solutions.

We can solve similar problems with different restrictions on 𝑥1 , 𝑥2 , … , 𝑥𝑛 in a similar fashion,
as in the following example.

Example 3.17. How many solutions are there in positive integers to 𝑥1 + 𝑥2 + 𝑥3 = 101?

Solution. Let $y_1 = x_1 - 1$, $y_2 = x_2 - 1$ and $y_3 = x_3 - 1$. Then $x_1 + x_2 + x_3 = 101$ is equivalent
to $y_1 + y_2 + y_3 = 98$, and $y_i$ is a non-negative integer if and only if $x_i$ is a positive integer. So
the number of positive integer solutions to $x_1 + x_2 + x_3 = 101$ is the number of non-negative
integer solutions to $y_1 + y_2 + y_3 = 98$, which is $\binom{100}{98} = \binom{100}{2} = 4{,}950$ by Corollary 3.16. Similarly,
if we want to count the integer solutions to $x_1 + \cdots + x_n = r$ where each $x_i \geq k$, this is equal to the number
of solutions to $y_1 + \cdots + y_n = r - nk$ where each $y_i$ is a non-negative integer.
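
The substitution argument can be sanity-checked by brute force. The sketch below counts solutions directly by recursion on the number of variables; the function `count_solutions` and its `lower` parameter (the common lower bound 𝑘) are our own illustrative names.

```python
from math import comb


def count_solutions(total, n, lower=0):
    """Count integer solutions of x_1 + ... + x_n = total with every x_i >= lower."""
    if n == 1:
        return 1 if total >= lower else 0
    # Choose x_1, leaving at least `lower` for each of the other n - 1 variables.
    return sum(
        count_solutions(total - x, n - 1, lower)
        for x in range(lower, total - lower * (n - 1) + 1)
    )


print(count_solutions(101, 3, lower=1), comb(100, 2))
print(count_solutions(7, 5), comb(11, 7))
```

The first line compares the direct count for Example 3.17 with the binomial coefficient from Corollary 3.16, and the second does the same for sharing 7 coins among 5 people.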

3.3 More examples


Table 3.2 summarises the results of sections 3.1 and 3.2.

Example 3.18. How many anagrams are there of ‘MISSISSIPPI’?



Solution. There are 11! possible ways to put the elements of {M, I1 , S1 , S2 , I2 , S3 , S4 , I3 , P1 , P2 , I4 }
in order by Corollary 3.6. Ignoring the subscripts, these orders give all the anagrams of ‘MIS-
SISSIPPI’. However, each anagram of ‘MISSISSIPPI’ is formed 4! ⋅ 4! ⋅ 2! times in this manner:
if we fix the positions of the ‘I’s, then there are 4! possible arrangements of I1 , I2 , I3 and I4 in
those positions. Moreover, for each of those arrangements, there are 4! possible arrangements
of S1 , S2 , S3 and S4 in the fixed positions of the ‘S’s, and finally there are 2! = 2 possible arrangements
of P1 and P2 in the ‘P’-positions. So in total the number of anagrams of ‘MISSISSIPPI’
is $\frac{11!}{4! \cdot 4! \cdot 2!} = 34{,}650$.
An alternative approach is to consider a sequence of 11 blank spaces, into which we will
place the letters of MISSISSIPPI. First, we decide where we are going to put the 4 ‘S’s. There are
11 empty spaces and we are choosing 4 of them, so there are $\binom{11}{4}$ possibilities.
For any of these choices, there are then 7 empty spaces remaining,
so there are $\binom{7}{4}$ possibilities for how to place the 4 ‘I’s among these. There are then 3 empty
spaces, so $\binom{3}{2}$ choices for how to place the 2 ‘P’s. Finally, there is now only 1 empty space, so
the remaining letter ‘M’ must be placed here; there is no other choice. We conclude that the
number of anagrams of ‘MISSISSIPPI’ is
$$\binom{11}{4}\binom{7}{4}\binom{3}{2} = \frac{11 \times 10 \times 9 \times 8}{4!} \times \frac{7 \times 6 \times 5 \times 4}{4!} \times \frac{3 \times 2}{2!} = \frac{11!}{4! \times 4! \times 2!} = 34{,}650,$$
as before.

Note that in the first method we considered the problem with order (i.e. treating all char-
acters as different), but then had to consider how many times we ‘overcounted’; we initially
counted with order but for the final answer, we do not care about the order of certain parts
(i.e. the repeated letters) and so have to divide by the number of different ways we achieve
the same anagram in the end. In the second method we viewed the mixed choice as a series
of separate unordered choices. In general both of these methods will work for mixed choice
problems, but one may be significantly simpler than the other, depending on the problem.
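
Both methods from Example 3.18 are easy to mirror in code. The following sketch (our own illustration, using only the Python standard library) computes the count once by dividing out the overcount and once as a product of unordered choices.

```python
from collections import Counter
from math import comb, factorial

word = "MISSISSIPPI"
counts = Counter(word)  # how many times each letter occurs

# Method 1: order all labelled letters, then divide by the overcount.
overcount = 1
for c in counts.values():
    overcount *= factorial(c)
method_1 = factorial(len(word)) // overcount

# Method 2: choose positions for each letter in turn from the remaining spaces.
method_2 = 1
remaining = len(word)
for c in counts.values():
    method_2 *= comb(remaining, c)
    remaining -= c

print(method_1, method_2)
```

The order in which the letters are processed in the second method does not affect the product, which matches the observation that either approach gives the same answer.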

Example 3.19. I deal a hand of 6 cards from a standard 52-card deck.

(1). What is the probability that I get 2 cards of one suit and 4 cards of another suit?

(2). What is the probability that I get 3 cards of one suit and 3 cards of another suit?

(3). What is the probability that I get exactly 2 Queens and exactly 3 Hearts?

Solution. First note that there are $\binom{52}{6}$ possible 6-card hands, each of which is equally likely to
be dealt.

(1). There are 4 possibilities for the suit from which we have 4 cards and $\binom{13}{4}$ possibilities for
the ranks of those cards. Having chosen these, there are then 3 possibilities for the suit
of the remaining 2 cards and $\binom{13}{2}$ possibilities for the ranks of those cards. So there are
$4 \cdot \binom{13}{4} \cdot 3 \cdot \binom{13}{2}$ such hands, and we conclude that the probability is

$$\frac{4 \cdot \binom{13}{4} \cdot 3 \cdot \binom{13}{2}}{\binom{52}{6}} = \frac{669{,}240}{20{,}358{,}520} = \frac{1{,}287}{39{,}151} = 0.033 \text{ (2 s.f.)}.$$

(2). There are $\binom{4}{2}$ ways to choose which 2 suits the cards come from.13 Having chosen these
suits, there are $\binom{13}{3}$ ways to choose 3 cards from the one suit, and then $\binom{13}{3}$ ways to choose
3 cards from the other suit. So there are $\binom{4}{2} \cdot \binom{13}{3} \cdot \binom{13}{3}$ such hands, and we conclude that
the probability is

$$\frac{\binom{4}{2}\binom{13}{3}\binom{13}{3}}{\binom{52}{6}} = \frac{490{,}776}{20{,}358{,}520} = \frac{4{,}719}{195{,}755} = 0.024 \text{ (2 s.f.)}.$$

(3). We consider two cases. If I do not get the Queen of Hearts, then I need to get two of the
remaining three Queens (there are $\binom{3}{2}$ possibilities for how this can be done), three of
the remaining twelve Hearts (there are $\binom{12}{3}$ possibilities for how this can be done) and
one of the 36 cards which is neither a Queen nor a Heart.14 So there are $36\binom{3}{2}\binom{12}{3}$ hands not
containing the Queen of Hearts with exactly two Queens and exactly three Hearts.

On the other hand, if I do get the Queen of Hearts, then I need to get one of the remaining
three Queens (3 possibilities), two of the remaining 12 Hearts ($\binom{12}{2}$ possibilities) and two
of the other 36 cards ($\binom{36}{2}$ possibilities). So there are $3\binom{12}{2}\binom{36}{2}$ hands containing the Queen
of Hearts with exactly two Queens and exactly three Hearts.

Overall this gives a probability of

$$\frac{36\binom{3}{2}\binom{12}{3} + 3\binom{12}{2}\binom{36}{2}}{\binom{52}{6}} = \frac{148{,}500}{20{,}358{,}520} = \frac{7{,}425}{1{,}017{,}926} \approx 0.0073 \text{ (2 s.f.)}.$$
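
The three counts in Example 3.19 can be reproduced with a few lines of arithmetic. This is a sketch for checking the numbers only; the names `part_1`, `part_2` and `part_3` are ours.

```python
from fractions import Fraction
from math import comb

total = comb(52, 6)  # number of equally likely 6-card hands

# (1) 4 cards of one suit and 2 of another: the two suits play different roles.
part_1 = 4 * comb(13, 4) * 3 * comb(13, 2)

# (2) 3 cards from each of two suits: the two suits are interchangeable.
part_2 = comb(4, 2) * comb(13, 3) ** 2

# (3) exactly 2 Queens and 3 Hearts, split by whether the Queen of Hearts is held.
without_queen_of_hearts = 36 * comb(3, 2) * comb(12, 3)
with_queen_of_hearts = 3 * comb(12, 2) * comb(36, 2)
part_3 = without_queen_of_hearts + with_queen_of_hearts

for count in (part_1, part_2, part_3):
    print(count, Fraction(count, total), float(Fraction(count, total)))
```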

Example 3.20. I roll seven standard dice. What is the probability that I get at least 3 sixes?
13
Note carefully the difference between the solutions of the different parts of this example. The fundamental
reason for this is that, for example, having two Spades and four Hearts is different from having two Hearts and
four Spades, whilst having three Hearts and three Spades is the same as having three Spades and three Hearts.
14
If 𝐻 represents the set of Heart cards and 𝑄 represents the set of Queen cards, then by the inclusion-exclusion
formula for two sets |𝐻 ∪ 𝑄| = |𝐻| + |𝑄| − |𝐻 ∩ 𝑄| = 13 + 4 − 1 = 16. Thus, there are 52 − 16 = 36 cards that are
neither Hearts nor Queens.

Solution. If we imagine rolling the dice in order, there are $6^7$ possible outcomes, each of which
is equally likely. We now count the outcomes in which I get at most two sixes (that is, the outcomes
which don’t satisfy our condition). There are $5^7$ outcomes in which I roll no sixes, since
in this case there are five possibilities (1, 2, 3, 4 or 5) for each die. There are $7 \times 5^6$ outcomes
in which I roll exactly 1 six, since there are 7 possibilities for which die shows six and 5 possibilities
for each of the other six dice, and likewise there are $\binom{7}{2} \times 5^5$ outcomes in which I roll exactly 2
sixes, since there are $\binom{7}{2}$ ways to choose the 2 dice which show six and 5 possibilities for each
of the other dice. So there are $5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5$ outcomes in which I do not roll at least 3
sixes, and therefore there are $6^7 - \left(5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5\right)$ outcomes in which I do roll at least 3
sixes. The probability of this event is therefore
$$\frac{6^7 - \left(5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5\right)}{6^7} = \frac{26{,}811}{279{,}936} = \frac{331}{3{,}456} = 0.096 \text{ (2 s.f.)}.$$
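
Since there are only 279,936 ordered outcomes, the complement count in Example 3.20 can be compared with a direct enumeration. The sketch below does both; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

dice = 7
total = 6 ** dice

# Complement: outcomes with exactly k sixes for k = 0, 1, 2.
at_most_two = sum(comb(dice, k) * 5 ** (dice - k) for k in range(3))
at_least_three = total - at_most_two

# Direct check: list every ordered outcome and count those with at least 3 sixes.
brute_force = sum(
    1 for roll in product(range(1, 7), repeat=dice) if roll.count(6) >= 3
)

probability = Fraction(at_least_three, total)
print(at_least_three, brute_force, probability, float(probability))
```

The enumeration is feasible here only because the sample space is small; the complement formula scales to any number of dice.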
As a general suggestion for approaching this kind of problem, try to apply the following
methods.
• If you can neatly partition the set you are trying to count, then do so, and then count each
part separately. This particularly applies if the question involves an inequality (e.g. ‘at
least 3 sixes’ in the above example). By considering each value in turn (e.g. considering
no sixes, 1 six and 2 sixes separately in the above example) we instead count sets defined
by equalities.15 For this to work it is essential that your partition is indeed a partition
(i.e. every outcome is in exactly one part – see Proposition 2.11).

• If you can describe your outcome as the result of a series of choices, then we can count
the possibilities at each choice and take the product to give the overall number of possibilities.
For example, in Example 3.19(2) we formed a ‘hand with 3 cards of one suit and
3 cards of another suit’ as the result of two consecutive choices: first we chose which
2 suits would appear in the hand, then we chose which cards from these suits would
appear, and multiplying these gave the number of such hands. For this to work it is
essential that for each outcome you are counting, there is precisely one sequence of
choices which results in that outcome; if not, then you will need to consider how you
overcounted.

• Finally, it often helps to take a complement; that is, to count the outcomes that do not
satisfy the given condition, since subtracting from the total number of outcomes then
gives the number which do satisfy the given condition. We did this in Example 3.20,
where, instead of counting the outcomes in which we rolled at least 3 sixes, we
counted those in which we rolled at most 2 sixes.

15
Another example is in Example 3.19(3), where we partitioned the hands we were interested in into 2 sets,
those containing the Queen of Hearts and those not containing the Queen of Hearts, and counted these separately.

3.4 The binomial theorem


In this section we will prove the binomial theorem and explore some of its corollaries. We
begin with the following lemma.

Proposition 3.21. For any integers 𝑟 and 𝑛 with 0 ⩽ 𝑟 < 𝑛 we have

$$\binom{n}{r} + \binom{n}{r+1} = \binom{n+1}{r+1}.$$
Algebraic proof. One approach is to write $\binom{n}{r} = \frac{n!}{r! \cdot (n-r)!}$ and similarly for the other terms, and
argue as follows:

\begin{align*}
\binom{n}{r} + \binom{n}{r+1} &= \frac{n!}{r! \cdot (n-r)!} + \frac{n!}{(r+1)! \cdot (n-r-1)!} \\
&= \frac{n!}{(r+1)!} \left( \frac{r+1}{(n-r)!} + \frac{1}{(n-r-1)!} \right) \\
&= \frac{n!}{(r+1)! \cdot (n-r)!} \left[ (r+1) + (n-r) \right] \\
&= \frac{n! \cdot (n+1)}{(r+1)! \cdot ((n+1) - (r+1))!} \\
&= \frac{(n+1)!}{(r+1)! \cdot ((n+1) - (r+1))!} \\
&= \binom{n+1}{r+1}.
\end{align*}

Combinatorial proof. An alternative argument uses Theorem 3.12 to count the number of sub-
sets of {1, … , 𝑛 + 1} with size 𝑟 + 1 in two different ways:

\begin{align*}
\binom{n+1}{r+1} &= |\{A \subseteq \{1, \ldots, n+1\} : |A| = r+1\}| \\
&= |\{A \subseteq \{1, \ldots, n+1\} : n+1 \in A \text{ and } |A| = r+1\}| \\
&\quad + |\{A \subseteq \{1, \ldots, n+1\} : n+1 \notin A \text{ and } |A| = r+1\}| && \text{by the sum rule} \\
&= |\{A \subseteq \{1, \ldots, n\} : |A| = r\}| + |\{A \subseteq \{1, \ldots, n\} : |A| = r+1\}| && (*) \\
&= \binom{n}{r} + \binom{n}{r+1}.
\end{align*}

1

1 1

1 2 1

1 3 3 1

1 4 6 4 1

1 5 10 10 5 1

1 6 15 20 15 6 1

1 7 21 35 35 21 7 1

1 8 28 56 70 56 28 8 1

1 9 36 84 126 126 84 36 9 1

Figure 3.1: This diagram lists the binomial coefficients, where the 𝑛th row (starting from
𝑛 = 0) shows $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$ from left to right. Each number (apart from the 1’s) is the
sum of the two numbers to its upper-left and upper-right. For example, 28 = 21 + 7. This
diagram is often called Pascal’s triangle in English. In China, it is known as 杨辉三角形.

The step in line (∗) is justified by the bijection

{𝐴 ⊆ {1, … , 𝑛 + 1} ∶ 𝑛 + 1 ∈ 𝐴 and |𝐴| = 𝑟 + 1} → {𝐴 ⊆ {1, … , 𝑛} ∶ |𝐴| = 𝑟},


𝐴 ↦ 𝐴 ⧵ {𝑛 + 1}.

The binomial coefficients can be arranged in a triangular array, where each row is gener-
ated by the previous one using Proposition 3.21 – see Figure 3.1.
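
As an illustration of how Proposition 3.21 generates the triangle, the following sketch builds each row from the previous one and compares the result with `math.comb`. The function name `pascal_rows` is our own.

```python
from math import comb


def pascal_rows(last_row):
    """Return rows 0, 1, ..., last_row of the triangle of binomial coefficients."""
    rows = [[1]]  # row 0
    for n in range(last_row):
        prev = rows[-1]  # row n, which has n + 1 entries
        # Entry r + 1 of row n + 1 is C(n, r) + C(n, r + 1), by Proposition 3.21.
        middle = [prev[r] + prev[r + 1] for r in range(n)]
        rows.append([1] + middle + [1])
    return rows


for row in pascal_rows(9):
    print(row)
```

Only additions are used to build the rows, so the factorial formula is never needed once the first row is known.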

Theorem 3.22 (Binomial theorem). For any integer 𝑛 ⩾ 0 and any 𝑎, 𝑏 ∈ ℝ, we have
$$(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}.$$

This theorem explains the name for binomial coefficients, as they are the coefficients in the
binomial expansion of (𝑎 + 𝑏)𝑛 .

Proof. For 𝑛 = 0,
$$(a+b)^n = 1 = \binom{0}{0} a^0 b^0 = \sum_{i=0}^{0} \binom{0}{i} a^i b^{n-i}.$$

Now let 𝑛 ∈ ℕ0 be given such that $(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}$. Then

\begin{align*}
(a+b)^{n+1} &= (a+b)(a+b)^n \\
&= (a+b) \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i} \\
&= \sum_{i=0}^{n} \binom{n}{i} a^{i+1} b^{n-i} + \sum_{i=0}^{n} \binom{n}{i} a^i b^{n+1-i} \\
&= \sum_{i=1}^{n+1} \binom{n}{i-1} a^i b^{n+1-i} + \sum_{i=0}^{n} \binom{n}{i} a^i b^{n+1-i} \\
&= \binom{n}{n} a^{n+1} b^0 + \sum_{i=1}^{n} \left[ \binom{n}{i-1} + \binom{n}{i} \right] a^i b^{n+1-i} + \binom{n}{0} a^0 b^{n+1} \\
&= a^{n+1} + \sum_{i=1}^{n} \binom{n+1}{i} a^i b^{n+1-i} + b^{n+1} && \text{by Proposition 3.21} \\
&= \sum_{i=0}^{n+1} \binom{n+1}{i} a^i b^{n+1-i}.
\end{align*}

Therefore by induction, we have that $(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}$ for all 𝑛 ∈ ℕ0 and 𝑎, 𝑏 ∈ ℝ.
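
A numerical spot-check of the binomial theorem is straightforward. The sketch below evaluates the right-hand side with exact arithmetic (integers and `Fraction`s standing in for real numbers, which is an assumption made to avoid rounding error) and compares it with $(a+b)^n$; `binomial_expansion` is a name introduced here.

```python
from fractions import Fraction
from math import comb


def binomial_expansion(a, b, n):
    """Evaluate the sum of C(n, i) * a^i * b^(n - i) for i = 0, ..., n."""
    return sum(comb(n, i) * a ** i * b ** (n - i) for i in range(n + 1))


a, b = Fraction(3, 2), Fraction(-5, 7)
for n in range(6):
    print(n, (a + b) ** n == binomial_expansion(a, b, n))
```

Taking 𝑎 = 𝑏 = 1, or 𝑎 = −1 and 𝑏 = 1, in the same function gives the two special cases used in the corollaries of this section.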

Corollary 3.23. For any non-negative integer 𝑛, $\sum_{i=0}^{n} \binom{n}{i} = 2^n$.

Proof. Applying the binomial theorem with 𝑎 = 𝑏 = 1 gives

$$2^n = (1+1)^n = \sum_{i=0}^{n} \binom{n}{i} 1^i 1^{n-i} = \sum_{i=0}^{n} \binom{n}{i}.$$

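Note that Corollary 3.23 gives another proof of Theorem 1.9, which stated that for any finite
set 𝑋 we have $|\mathcal{P}(X)| = 2^{|X|}$. Indeed, if 𝑆 is a finite set of size 𝑛, then
\begin{align*}
\text{total number of subsets of } S &= \sum_{i=0}^{n} (\text{number of subsets of } S \text{ with size } i) \\
&= \sum_{i=0}^{n} \binom{n}{i} && \text{by Theorem 3.12} \\
&= 2^n && \text{by Corollary 3.23.}
\end{align*}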

Corollary 3.24. For any positive integer 𝑛,

$$\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^n \binom{n}{n} = \sum_{i=0}^{n} (-1)^i \binom{n}{i} = 0.$$

Proof. By the binomial theorem with 𝑎 = −1 and 𝑏 = 1:

$$0 = (-1+1)^n = \sum_{i=0}^{n} \binom{n}{i} (-1)^i 1^{n-i} = \sum_{i=0}^{n} (-1)^i \binom{n}{i}.$$

Note that Corollary 3.24 does not hold for 𝑛 = 0, as in this case we just have the term

$$\binom{n}{0} = (-1+1)^0 = 0^0 = 1.$$

Corollary 3.25. Let 𝑆 be a non-empty finite set of size 𝑛. Then 𝑆 has $2^{n-1}$ subsets of even size
and $2^{n-1}$ subsets of odd size.

Proof. By Corollary 3.24,

$$0 = \sum_{i=0}^{n} (-1)^i \binom{n}{i} = \sum_{\substack{i=0,\ldots,n, \\ i \text{ even}}} (-1)^i \binom{n}{i} + \sum_{\substack{i=0,\ldots,n, \\ i \text{ odd}}} (-1)^i \binom{n}{i} = \sum_{\substack{i=0,\ldots,n, \\ i \text{ even}}} \binom{n}{i} - \sum_{\substack{i=0,\ldots,n, \\ i \text{ odd}}} \binom{n}{i}.$$

By Theorem 3.12 and the sum rule, it follows that 𝑆 has the same number of even-sized subsets
as it does odd-sized subsets. As their sum is equal to $2^n$ by Theorem 1.9, it follows there are
$2^{n-1}$ even-sized and $2^{n-1}$ odd-sized subsets of 𝑆.

Again, this result does not hold for 𝑛 = 0, since the empty set has one subset of even size
(itself) and no subset of odd size.
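
Corollary 3.25, including the exceptional case 𝑛 = 0, can be checked for small sets by listing subsets. This is a sketch with a hypothetical helper `subset_parity_counts`, taking 𝑆 = {1, … , 𝑛}.

```python
from itertools import combinations


def subset_parity_counts(n):
    """Return (number of even-sized subsets, number of odd-sized subsets) of {1, ..., n}."""
    elements = range(1, n + 1)
    even = odd = 0
    for size in range(n + 1):
        count = sum(1 for _ in combinations(elements, size))
        if size % 2 == 0:
            even += count
        else:
            odd += count
    return even, odd


for n in range(6):
    print(n, subset_parity_counts(n))
```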
Chapter 4

Infinite counting

Our notion of ‘size’ for finite sets is intrinsically connected with the existence of bijections
between sets. Indeed, the process of counting elements of a finite set (“one”, “two”, “three”,
“four”, ...) establishes a bijection between the set of objects we wish to count and a set of the
form {1, … , 𝑛} for some non-negative integer 𝑛. We can generalise this notion of size by comparing
two sets with each other:

Definition 4.1. Let 𝑋, 𝑌 be sets.

• If there exists an injection from 𝑋 to 𝑌, we write 𝑋 ≼ 𝑌.

• If there exists a bijection from 𝑋 to 𝑌, we say that 𝑋 and 𝑌 are equinumerous 1 and write
𝑋 ≈ 𝑌.

We recall some facts about bijections and injections, written in this new notation. The
proofs are left as an exercise to the reader.

Proposition 4.2.

(1). ≼ and ≈ are reflexive: for all sets 𝑋, 𝑋 ≼ 𝑋 and 𝑋 ≈ 𝑋.

(2). ≈ is symmetric: for all sets 𝑋, 𝑌, if 𝑋 ≈ 𝑌 then 𝑌 ≈ 𝑋.

(3). ≼ and ≈ are transitive: for all sets 𝑋, 𝑌 , 𝑍,

• if 𝑋 ≼ 𝑌 and 𝑌 ≼ 𝑍 then 𝑋 ≼ 𝑍, and


• if 𝑋 ≈ 𝑌 and 𝑌 ≈ 𝑍 then 𝑋 ≈ 𝑍.
1
This is a fancy word for “same size”. However, we are not defining the size of a set but comparing two sets, so
we use this terminology instead.


The following proposition shows that the notions of injectivity / bijectivity capture the no-
tions that one set is smaller-in-size than another / equal-in-size to another.

Proposition 4.3. Let 𝑚, 𝑛 ∈ ℕ0 be given and let 𝑋, 𝑌 be finite sets with size 𝑚, 𝑛 respectively.
Then:

(1). 𝑋 ≼ 𝑌 if and only if 𝑚 ⩽ 𝑛.

(2). 𝑋 ≈ 𝑌 if and only if 𝑚 = 𝑛.

Proof. First, we enumerate 𝑋 = {𝑥1 , … , 𝑥𝑚 } and 𝑌 = {𝑦1 , … , 𝑦𝑛 }. Assume 𝑚 ⩽ 𝑛 and define

𝑓 ∶ 𝑋 → 𝑌 , 𝑥𝑖 ↦ 𝑦𝑖 .

Since 𝑥1 , … , 𝑥𝑚 are distinct, 𝑓 is well-defined, and as 𝑦1 , … , 𝑦𝑛 are distinct, 𝑓 is injective. Thus


𝑋 ≼ 𝑌. Moreover, if 𝑚 = 𝑛 then 𝑓 is surjective and thus a bijection, so 𝑋 ≈ 𝑌.
Now suppose 𝑚 > 𝑛. Then by the pigeonhole principle, for every function 𝑔 ∶ 𝑋 → 𝑌 there
exist distinct 𝑥, 𝑥′ ∈ 𝑋 such that 𝑔(𝑥) = 𝑔(𝑥′) (the ‘pigeons’ are the members of 𝑋 and the ‘holes’ are the
members of 𝑌). So no function from 𝑋 to 𝑌 is injective, and thus 𝑋 ⋠ 𝑌.
Finally, suppose 𝑋 ≈ 𝑌, so in particular 𝑋 ≼ 𝑌 and 𝑌 ≼ 𝑋. Then 𝑚 ⩽ 𝑛 and 𝑛 ⩽ 𝑚, so 𝑚 = 𝑛,
concluding our proof.

You might expect that if two sets can be embedded into each other by injections, then those
sets should be equinumerous. This is in fact true and is called the Cantor-Schröder-Bernstein
theorem, but it is not easy to prove and so is not examinable. The proof is given as an additional
exercise.

Theorem 4.4 (Cantor-Schröder-Bernstein theorem). Let 𝑋 and 𝑌 be sets such that 𝑋 ≼ 𝑌 and
𝑌 ≼ 𝑋. Then 𝑋 ≈ 𝑌.

4.1 Countable sets


The simplest infinite sets are the countably-infinite sets.

Definition 4.5. We say a set 𝑋 is countably-infinite if ℕ+ ≈ 𝑋. If 𝑋 is finite or countably-infinite,


then we say 𝑋 is countable. Otherwise, 𝑋 is called uncountable.

If 𝑓 ∶ ℕ+ → 𝑋 is a bijection to some set 𝑋, then we can list the elements of 𝑋: 𝑓(1), 𝑓(2), 𝑓(3),
… , 𝑓(𝑛), … . Hence the countable sets are those that can be described by a list, either finite or
infinite. Below we give several examples of countably-infinite sets.

−3 𝑓 (6) −2 𝑓 (4) −1 𝑓 (2) 0 𝑓 (1) 1 𝑓 (3) 2 𝑓 (5) 3 𝑓 (7)

Figure 4.1: The bijection ℎ ∶ ℕ+ → ℤ in Example 4.6(4).

Example 4.6.

(1). ℕ+ is countably-infinite since ≈ is reflexive by Proposition 4.2(1).

(2). The function 𝑓 ∶ ℕ+ → ℕ0 , 𝑛 ↦ 𝑛 − 1 is a bijection, so ℕ+ ≈ ℕ0 . Thus ℕ0 is countably-infinite.

(3). Let 𝐸 denote the set of positive even integers. Then 𝑔 ∶ ℕ+ → 𝐸, 𝑛 ↦ 2𝑛 is a bijection, so
𝐸 is countably-infinite. This shows that an infinite set can be equinumerous to a proper
subset of itself, even though 𝐸 “only has half the elements of ℕ+ ”.

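(4). Define
$$h \colon \mathbb{N}^+ \to \mathbb{Z}, \quad n \mapsto \begin{cases} \dfrac{n-1}{2} & \text{if } n \text{ is odd,} \\[1ex] -\dfrac{n}{2} & \text{if } n \text{ is even.} \end{cases}$$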
Then ℎ is a bijection, so ℕ+ ≈ ℤ. Thus ℤ is countably-infinite.
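
The map ℎ of Example 4.6(4) can be written down together with an explicit inverse, which is one way to see that it is a bijection. In the sketch below, `h_inverse` is our own name for that inverse; the check over a finite range is an illustration, not a proof.

```python
def h(n):
    """Map a positive integer to an integer: odd n to (n - 1) / 2, even n to -n / 2."""
    return (n - 1) // 2 if n % 2 == 1 else -(n // 2)


def h_inverse(z):
    """Map an integer back: z >= 0 comes from the odd number 2z + 1, z < 0 from -2z."""
    return 2 * z + 1 if z >= 0 else -2 * z


print([h(n) for n in range(1, 8)])
print([h_inverse(z) for z in range(-3, 4)])
```

The first printed list is the start of the enumeration 0, −1, 1, −2, 2, … of ℤ pictured in Figure 4.1.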

Theorem 4.7. Let 𝑋 be a set.

(1). 𝑋 is countable if and only if 𝑋 ≼ ℕ+ .

(2). Assume 𝑋 is non-empty. Then 𝑋 is countable if and only if there exists a surjection 𝑓 ∶ ℕ+ →
𝑋.

Corollary 4.8. Let 𝑋, 𝑌 be sets with 𝑌 ⊆ 𝑋 and suppose 𝑋 is countable. Then:

(1). 𝑌 is countable, and

(2). if 𝑌 is infinite, then 𝑌 is countably-infinite.

Proof. First, assume 𝑋 is countable. Then the inclusion function 𝜄 ∶ 𝑌 → 𝑋, 𝑦 ↦ 𝑦 is an injection,
so 𝑌 ≼ 𝑋. As 𝑋 is countable, by Theorem 4.7(1) 𝑋 ≼ ℕ+ . Thus, by Proposition 4.2(3),
𝑌 ≼ ℕ+ , so by Theorem 4.7(1) again, 𝑌 is countable.
Now assume that 𝑌 is infinite, i.e. not finite. Then by definition 𝑌 must be countably-
infinite.

Thus every infinite subset of ℕ+ (or any countably-infinite set) is also countably-infinite,
such as the set of prime numbers, or $\{n^2 : n \in \mathbb{N}^+\}$. In fact, the product of two countable sets
is also countable, which can be shown using the following theorem. The proof is given as an
exercise, but we do show as a corollary that the set of rational numbers is countably-infinite.

Theorem 4.9. ℕ+ × ℕ+ is countably-infinite.

Corollary 4.10. ℚ is countably-infinite.

Proof. By Example 4.6(4) there is a bijection 𝑓 ∶ ℕ+ → ℤ. Thus

$$g \colon \mathbb{N}^+ \times \mathbb{N}^+ \to \mathbb{Q}, \quad (m, n) \mapsto \frac{f(m)}{n}$$

is a surjection. By the previous theorem, there is a bijection ℎ ∶ ℕ+ → ℕ+ × ℕ+ and thus 𝑔 ∘ ℎ ∶
ℕ+ → ℚ is a surjection. Thus by Theorem 4.7(2) ℚ is countable. Therefore, since ℚ is infinite, it
must be countably-infinite.

4.2 Uncountable sets


Not all sets are countable. In this section, we prove two theorems: the first gives us a method
to construct larger and larger sets, whilst the second shows that a specific set, the set of real
numbers, is uncountable. We first need some new notation.

Definition 4.11 (≺). For sets 𝑋, 𝑌, we write 𝑋 ≺ 𝑌 if 𝑋 ≼ 𝑌 and 𝑋 ≉ 𝑌.

Theorem 4.12 (Cantor’s theorem). Let 𝑋 be a set. Then 𝑋 ≺ 𝒫(𝑋).



Proof. We first show that there is an injection from 𝑋 to 𝒫(𝑋). Define

𝑓 ∶ 𝑋 → 𝒫(𝑋), 𝑥 ↦ {𝑥}.

Let 𝑥, 𝑦 ∈ 𝑋 be given such that 𝑓 (𝑥) = 𝑓 (𝑦). Then 𝑥 ∈ {𝑥} and {𝑥} = {𝑦}, so 𝑥 ∈ {𝑦}. However, 𝑦 is
the only member of {𝑦}, so we must have 𝑥 = 𝑦. Therefore 𝑓 is injective and so 𝑋 ≼ 𝒫(𝑋).
Now let 𝑔 ∶ 𝑋 → 𝒫(𝑋) be any function. We will show it is not surjective. Define 𝐴 ≔ {𝑥 ∈ 𝑋 ∶
𝑥 ∉ 𝑔(𝑥)} ∈ 𝒫(𝑋). Suppose there exists an 𝑥 ∈ 𝑋 such that 𝐴 = 𝑔(𝑥). Then by definition of 𝐴,

𝑥 ∈ 𝑔(𝑥) ⟺ 𝑥 ∈ 𝐴 ⟺ 𝑥 ∉ 𝑔(𝑥).

This is a contradiction. Therefore 𝐴 ∉ Im(𝑔), so 𝑔 is not surjective. In particular, 𝑔 is not bijective.
As 𝑔 was arbitrary, 𝑋 ≉ 𝒫(𝑋) and thus 𝑋 ≺ 𝒫(𝑋).
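
For a small finite set, the diagonal construction in this proof can be tested against every possible function 𝑔. The sketch below assumes 𝑋 = {0, 1, 2}, so there are 8³ = 512 functions to try; `diagonal_set` is a name introduced for the set 𝐴.

```python
from itertools import product


def diagonal_set(X, g):
    """Return A = {x in X : x not in g(x)} for a function g given as a dict."""
    return frozenset(x for x in X if x not in g[x])


X = [0, 1, 2]

# All subsets of X, built from every True/False pattern over its elements.
subsets = [
    frozenset(x for x, keep in zip(X, mask) if keep)
    for mask in product([False, True], repeat=len(X))
]

# Every function g: X -> P(X) is a choice of one subset per element of X.
all_missed = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
print(len(subsets), all_missed)
```

The finite case already follows from counting (3 < 8), but the same set 𝐴 is what defeats every 𝑔 when 𝑋 is infinite.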

Example 4.13. By Cantor’s theorem, ℕ+ ≺ 𝒫(ℕ+), and so by the Cantor-Schröder-Bernstein
theorem 𝒫(ℕ+) ⋠ ℕ+ (otherwise we would have ℕ+ ≈ 𝒫(ℕ+)). Thus by Theorem 4.7(1), 𝒫(ℕ+) is uncountable. Moreover, the sequence
(𝑋𝑛 ) of sets we define by recursion below consists of larger and larger infinite sets:

• 𝑋1 ≔ ℕ+ , and

• for all 𝑛 ∈ ℕ+ , 𝑋𝑛+1 ≔ 𝒫(𝑋𝑛 ).

Thus there are infinitely-many infinite sets, no two of which are equinumerous.

Theorem 4.14. ℝ is uncountable.

Proof. Let 𝑓 ∶ ℕ+ → [0, 1] be given. We will show that 𝑓 is not surjective. For each 𝑛 ∈ ℕ+ ,
let 0.𝑎𝑛,1 𝑎𝑛,2 … 𝑎𝑛,𝑘 … be a non-terminating decimal expansion of 𝑓 (𝑛), where each 𝑎𝑛,𝑘 is a
decimal digit from 0 to 9 inclusive.2 We define a new sequence as follows (see Figure 4.2): for
all 𝑛 ∈ ℕ+ ,
$$b_n := \begin{cases} a_{n,n} + 5 & \text{if } a_{n,n} < 5, \\ a_{n,n} - 5 & \text{if } a_{n,n} \geq 5. \end{cases}$$

Define 𝑥 ≔ 0.𝑏1 𝑏2 … 𝑏𝑘 … ∈ [0, 1] and assume 𝑥 ∈ Im(𝑓), so there exists an 𝑛 ∈ ℕ+ such that

2
For 1, we can take the decimal expansion 0.999… (recurring).

𝑛 𝑓 (𝑛)
1 0. 𝑎1,1 𝑎1,2 𝑎1,3 ⋯ 𝑎1,𝑛 ⋯
2 0. 𝑎2,1 𝑎2,2 𝑎2,3 ⋯ 𝑎2,𝑛 ⋯
3 0. 𝑎3,1 𝑎3,2 𝑎3,3 ⋯ 𝑎3,𝑛 ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋱
𝑛 0. 𝑎𝑛,1 𝑎𝑛,2 𝑎𝑛,3 ⋯ 𝑎𝑛,𝑛 ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋱
𝑥 0. 𝑏1 𝑏2 𝑏3 ⋯ 𝑏𝑘 ⋯

Figure 4.2: For any countably-infinite list of numbers in [0, 1], we will define
a real number 𝑥 not in this list by choosing its decimal digits to be different
from any decimal expansion in the sequence. Some care is needed as real
numbers can have 2 decimal expansions, such as 0.1 = 0.1000… = 0.0999….

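𝑥 = 𝑓(𝑛). Then:
\begin{align*}
& \sum_{k=1}^{\infty} \frac{a_{n,k}}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k} \\
\Rightarrow\; & \sum_{k=1}^{n-1} \frac{a_{n,k} - b_k}{10^k} = \sum_{k=n}^{\infty} \frac{b_k - a_{n,k}}{10^k} \\
\Rightarrow\; & \frac{1}{10^{n-1}} \sum_{k=1}^{n-1} (a_{n,k} - b_k) 10^{n-1-k} = \frac{b_n - a_{n,n}}{10^n} + \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^k} \\
\Rightarrow\; & \sum_{k=1}^{n-1} (a_{n,k} - b_k) 10^{n-1-k} = \frac{b_n - a_{n,n}}{10} + \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^{k-n+1}}. \tag{$*$}
\end{align*}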

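Notice that the left-hand side of the last equation is an integer. Moreover,
$$\left| \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^{k-n+1}} \right| \leq \sum_{k=n+1}^{\infty} \frac{|b_k - a_{n,k}|}{10^{k-n+1}} \leq \sum_{k=n+1}^{\infty} \frac{18}{10^{k-n+1}} = \frac{18}{10} \sum_{k=1}^{\infty} \frac{1}{10^k} = 0.2.$$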

As 𝑏𝑛 − 𝑎𝑛,𝑛 = ±5, it follows that the right-hand side of (∗) belongs to the set [−0.7, −0.3] ∪
[0.3, 0.7], which is a contradiction as the left-hand side is an integer. Therefore 𝑥 ∉ Im(𝑓) and
so 𝑓 is not surjective. As 𝑓 was arbitrary, there is no surjection from ℕ+ to [0, 1], so by Theorem 4.7(2) [0, 1] is uncountable.
Finally, as [0, 1] ⊆ ℝ, by Corollary 4.8(1) ℝ must also be uncountable.
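
The digit rule used to build 𝑥 can be illustrated on a finite list. In the sketch below each listed number is represented by a string of its first few decimal digits, which is an assumption made purely for illustration (the proof works with infinite expansions); the digit strings and the name `diagonal_digits` are hypothetical.

```python
def diagonal_digits(expansions):
    """Build the digits b_1, b_2, ... from the diagonal digits a_{n,n}.

    expansions[n - 1] holds the decimal digits of f(n) after the point.
    Adding 5 when a < 5 and subtracting 5 otherwise is the same as (a + 5) mod 10.
    """
    return "".join(
        str((int(row[i]) + 5) % 10) for i, row in enumerate(expansions)
    )


# Hypothetical first digits of f(1), ..., f(5).
listed = ["1415926", "7182818", "4142135", "9999999", "5000000"]
x_digits = diagonal_digits(listed)
print("0." + x_digits)
```

By construction the 𝑛th digit of the new number differs from the 𝑛th digit of the 𝑛th listed number by exactly 5, which is the gap the proof uses to rule out equality even when a number has two decimal expansions.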
Chapter 5

Graph theory

The standard formal definition of a graph is the following.

Definition 5.1. A graph 𝐺 = (𝑉 , 𝐸) consists of a set of vertices 1 𝑉 and a set 𝐸 ⊆ 𝒫(𝑉 ) of edges,
where each edge is an unordered pair {𝑢, 𝑣} of distinct vertices 𝑢, 𝑣 ∈ 𝑉.2 We call {𝑢, 𝑣} an edge
between 𝑢 and 𝑣, and will often write this edge simply as 𝑢𝑣.3
For a graph 𝐺, we denote its set of vertices by 𝑉 (𝐺 ) and its set of edges by 𝐸(𝐺 ). Note that
𝐺 = (𝑉 (𝐺 ), 𝐸(𝐺 )).

However, it may help you to think in terms of the following, more informal definition – a
graph consists of a set of points called vertices, some of which may be linked by lines called
edges, subject to the following rules:

• every edge links two distinct vertices, and

• any pair of distinct vertices is linked by at most one edge.

An example is shown in Figure 5.1. Notice that the lines do not have to be straight, and can
intersect; all that matters is whether there is a line joining two vertices or not.
We can think of an isomorphism as a relabelling of the vertices that preserves the edge
structure of the graph. By its definition, a graph is equal to another if they have the same set
of vertices, and the same set of edges. However, most of the time we only care about the con-
nections, not the actual names or labels of the vertices. This leads to the following definition.
1
Note that the word ‘vertices’ is the plural of ‘vertex’ (similar to ‘matrix’ and ‘matrices’). ‘Vertice’ and ‘vertexes’
are both incorrect! Throughout this course, all graphs have a non-zero, finite number of vertices.
2
This definition of a graph is sometimes called a simple graph. Writers using this term would typically say
that a graph may contain loops (an edge from a vertex to itself) and multiple edges between a pair of vertices.
However, we will not consider these in this course.
3
Note that 𝑢𝑣 = 𝑣𝑢 because these are unordered pairs.


𝑐 𝑏
𝑒

𝑑
𝑎

Figure 5.1: A graph 𝐺 with vertex set {𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓} and edge set
{𝑎𝑐, 𝑎𝑑, 𝑎𝑓, 𝑏𝑐, 𝑏𝑑, 𝑒𝑓}. The complement $\overline{G}$ is also shown in red.

Definition 5.2. Let 𝐺 and 𝐻 be graphs and let 𝑓 ∶ 𝑉(𝐺) → 𝑉(𝐻) be given. We say that 𝑓 is an
isomorphism if 𝑓 is bijective and for all distinct 𝑣, 𝑤 ∈ 𝑉(𝐺), 𝑣𝑤 ∈ 𝐸(𝐺) if and only if 𝑓(𝑣)𝑓(𝑤) ∈
𝐸(𝐻). We say 𝐺 and 𝐻 are isomorphic if there is an isomorphism from 𝐺 to 𝐻.4
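
For very small graphs, Definition 5.2 can be tested directly by trying every bijection between the vertex sets. The sketch below is a brute-force illustration only (it checks 𝑛! bijections, so it is not practical for large graphs); the function `find_isomorphism` and the three example graphs are hypothetical and are not the graphs drawn in the figures.

```python
from itertools import combinations, permutations


def find_isomorphism(vertices_g, edges_g, vertices_h, edges_h):
    """Return a dict f: V(G) -> V(H) that is an isomorphism, or None if none exists."""
    if len(vertices_g) != len(vertices_h):
        return None
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    for image in permutations(vertices_h):
        f = dict(zip(vertices_g, image))  # a bijection V(G) -> V(H)
        # vw is an edge of G if and only if f(v)f(w) is an edge of H.
        if all(
            (frozenset((v, w)) in eg) == (frozenset((f[v], f[w])) in eh)
            for v, w in combinations(vertices_g, 2)
        ):
            return f
    return None


# Hypothetical examples: a path, a relabelled path, and a star, all with 3 edges.
path = ("abcd", ["ab", "bc", "cd"])
relabelled_path = ("wxyz", ["xz", "zw", "wy"])
star = ("wxyz", ["wx", "wy", "wz"])

print(find_isomorphism(*path, *relabelled_path))
print(find_isomorphism(*path, *star))
```

The path and the star have the same order and size but are not isomorphic, which shows that counting vertices and edges alone is not enough to decide isomorphism.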

Figure 5.2 gives an example of two graphs that ‘look’ different but are isomorphic. We con-
clude this section by introducing more terminology that will be used in this chapter.

Definition 5.3 (Complement). Let 𝐺 be a graph. We define the complement of 𝐺, denoted $\overline{G}$,
to be the graph with the same vertex set as 𝐺, and edge set {𝑢𝑣 ∶ 𝑢, 𝑣 ∈ 𝑉(𝐺), 𝑢 ≠ 𝑣, 𝑢𝑣 ∉ 𝐸(𝐺)}.

Figure 5.1 also demonstrates the complement of a graph. Notice that $E(\overline{G})$ is the set complement
of 𝐸(𝐺) in the universal set {𝐴 ⊆ 𝑉(𝐺) ∶ |𝐴| = 2}.

Definition 5.4 (Order and size). The order of a graph 𝐺 is the number of vertices of 𝐺, and the
size of 𝐺 is the number of edges of 𝐺.
4
The words ‘isomorphism’ and ‘isomorphic’ can be broken down into two parts: ‘iso’, meaning ‘same’, and
‘morphism’/‘morphic’, which is related to ‘structure’. Thus, an isomorphism preserves the ‘same structure’, and
isomorphic graphs have the ‘same structure’.

𝐸 𝐷 𝑦

𝐵 𝑧

𝐹
𝐴 𝐶 𝑥
𝑤 𝑢

Figure 5.2: The two graphs drawn above are isomorphic, with the
isomorphism 𝑓 ∶ {𝐴, 𝐵, 𝐶 , 𝐷, 𝐸, 𝐹} → {𝑢, 𝑣, 𝑤, 𝑥, 𝑦, 𝑧} shown in green.

Definition 5.5. Let 𝐺 be a graph.

• An edge 𝑒 = 𝑢𝑣 is said to be incident to the vertices 𝑢 and 𝑣.

• If 𝑢𝑣 is an edge in 𝐺, we say that 𝑢 and 𝑣 are adjacent (in 𝐺) and are called neighbours.
For a vertex 𝑣 ∈ 𝑉 (𝐺 ), we denote the set of neighbours of 𝑣 by 𝑁𝐺 (𝑣). We can drop the
subscript if there is no ambiguity.

• A vertex 𝑣 is called isolated if it has no neighbours.

5.1 Degrees and degree sequences


Definition 5.6. The degree of a vertex 𝑣 in a graph 𝐺 is 𝑑(𝑣) = |𝑁𝐺(𝑣)|, that is, the number of
neighbours of 𝑣 (note that this is equal to the number of edges incident to 𝑣). If the graph 𝐺 in
question is not clear from context, we instead write $d_G(v)$ for the degree of 𝑣 in 𝐺 to make this clear.
The minimum degree 𝛿(𝐺) and maximum degree Δ(𝐺) of a graph 𝐺 are the smallest and
largest degrees of the vertices of 𝐺 respectively.

Note that if 𝐺 is a graph with 𝑛 vertices, then the degree of each vertex of 𝐺 is an integer
between 0 and 𝑛 − 1 inclusive. The next lemma relates the sum of all vertex degrees to the
number of edges.

Lemma 5.7 (Handshaking lemma). In any graph 𝐺 we have $\sum_{v \in V(G)} d(v) = 2|E(G)|$.

                 Vertices
Edges     𝑎   𝑏   𝑐   𝑑   𝑒   𝑓   Total
𝑎𝑐        ✓   ✗   ✓   ✗   ✗   ✗   2
𝑎𝑑        ✓   ✗   ✗   ✓   ✗   ✗   2
𝑎𝑓        ✓   ✗   ✗   ✗   ✗   ✓   2
𝑏𝑐        ✗   ✓   ✓   ✗   ✗   ✗   2
𝑏𝑑        ✗   ✓   ✗   ✓   ✗   ✗   2
𝑒𝑓        ✗   ✗   ✗   ✗   ✓   ✓   2
Degree    3   2   2   2   1   2   $\sum_{v \in V(G)} d(v) = 12 = 2|E(G)|$

Table 5.1: A table summarising the incidence information for the graph in Figure
5.1. The blue ticks (✓) indicate that the edge (for that row) is incident to the vertex
(for that column). The total number of ticks is the expression in the handshaking
lemma, shown in the lower-right corner.

Proof. We count the number of pairs (𝑣, 𝑒), where 𝑒 is an edge incident to a vertex 𝑣:

$$\sum_{v \in V(G)} d(v) = \sum_{v \in V(G)} \sum_{\substack{e \in E(G), \\ v \in e}} 1 = \sum_{e \in E(G)} \sum_{v \in e} 1 = \sum_{e \in E(G)} 2 = 2|E(G)|.$$

Example 5.8. The incidence information from Figure 5.1 is summarised in Table 5.1. By counting
the number of edges incident to a fixed vertex 𝑣, we get the degree 𝑑(𝑣) of that vertex. Note
that each edge is incident to exactly 2 vertices and that the minimum and maximum degree
are 1 and 3.
The proof of the handshaking lemma uses two different methods for counting the ticks (✓)
in such a table: either by adding up rows first or columns first.
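
The two ways of counting ticks can be reproduced for the graph of Figure 5.1. The sketch below uses the edge set given in the figure caption, stores each edge as a two-element `frozenset`, and also computes the complement; the helper name `degrees` is our own.

```python
from itertools import combinations

vertices = "abcdef"
edges = {frozenset(e) for e in ("ac", "ad", "af", "bc", "bd", "ef")}


def degrees(vertex_set, edge_set):
    """Return a dict giving, for each vertex, the number of edges incident to it."""
    return {v: sum(1 for e in edge_set if v in e) for v in vertex_set}


# The complement has every 2-element subset of the vertices that is not an edge of G.
complement_edges = {frozenset(p) for p in combinations(vertices, 2)} - edges

degree = degrees(vertices, edges)
complement_degree = degrees(vertices, complement_edges)

# Column totals (degrees) and row totals (2 per edge) count the same ticks.
print(sum(degree.values()), 2 * len(edges))
print(sorted(degree.values()), sorted(complement_degree.values()))
```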

Corollary 5.9. In any (finite) graph there are an even number of vertices with odd degree.

Proof. Let 𝐺 be a graph. Then by the handshaking lemma:

$$\sum_{v \in V(G)} d(v) = 2|E(G)| \;\Rightarrow\; \sum_{\substack{v \in V(G), \\ d(v) \text{ is odd}}} d(v) = 2|E(G)| - \sum_{\substack{v \in V(G), \\ d(v) \text{ is even}}} d(v).$$

As the right-hand side is even and each degree is an integer, it follows that the number of
vertices with odd degree must be even.

The next proposition uses the pigeonhole principle to prove that any graph must contain
two vertices with the same degree.

Proposition 5.10. Any graph with at least 2 vertices has 2 vertices of the same degree.

Proof. Let 𝐺 be a graph on 𝑛 ⩾ 2 vertices. Then we consider two cases.

Case 1: 𝐺 has a vertex 𝑣 of degree 𝑛 − 1. In this case, this vertex 𝑣 is adjacent to every other
vertex, so every vertex of 𝐺 has degree at least 1. That is, the degree of each vertex of 𝐺
lies in the set {1, 2, … , 𝑛 − 1}, which has size 𝑛 − 1. Hence by the pigeonhole principle at
least two of the 𝑛 vertices of 𝐺 must have the same degree.

Case 2: 𝐺 has no vertex of degree 𝑛 − 1. In this case, the degree of each vertex of 𝐺 lies in the
set {0, 1, … , 𝑛 − 2}, which also has size 𝑛 − 1, so again the pigeonhole principle implies
that at least two vertices of 𝐺 must have the same degree.

Definition 5.11. The degree sequence of a graph 𝐺 is the sequence of all degrees of vertices in
𝐺, written in non-decreasing order: (𝑑(𝑣1 ), … , 𝑑(𝑣𝑛 )), where 𝐺 has 𝑛 vertices 𝑣1 , … , 𝑣𝑛 and 𝑑(𝑣𝑖 ) ⩽
𝑑(𝑣𝑖+1 ) for all 𝑖 = 1, … , 𝑛 − 1.

Example 5.12.

(1). The degree sequence of 𝐺 in Figure 5.1 is (1, 2, 2, 2, 2, 3). The degree sequence of the
complement of 𝐺 is (2, 3, 3, 3, 3, 4).

(2). The degree sequence of both graphs in Figure 5.2 is (2, 2, 2, 3, 3, 4). In fact, isomorphic
graphs have the same degree sequence (left as an exercise for the reader).

Some of the results just given rule out some sequences as being degree sequences of a
graph. For example, (3, 3, 3, 3, 3) cannot be the degree sequence of a graph by Corollary 5.9,
since the sum of the degrees in the sequence is odd. Similarly (0, 1, 2, 3) cannot be the degree
sequence of a graph by Proposition 5.10 since no two of the degrees in the sequence are equal.
Many other sequences can also be seen to be impossible by similar means.
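The two tests used in this paragraph can be written as a small Python check. This is only a sketch of necessary conditions: the function name and the returned messages are made up for illustration, and a result of `None` does not show that a sequence really is the degree sequence of a graph.

```python
def obvious_obstruction(seq):
    """Return a reason why seq cannot be a degree sequence, or None if
    none of the necessary conditions from this section rules it out."""
    n = len(seq)
    # Every degree in a graph on n vertices lies between 0 and n - 1.
    if any(d < 0 or d > n - 1 for d in seq):
        return "degree out of range"
    # Handshaking lemma / Corollary 5.9: the degree sum must be even.
    if sum(seq) % 2 == 1:
        return "odd degree sum"
    # Proposition 5.10: with at least 2 vertices, two degrees must coincide.
    if n >= 2 and len(set(seq)) == n:
        return "all degrees distinct"
    return None


print(obvious_obstruction((3, 3, 3, 3, 3)))
print(obvious_obstruction((0, 1, 2, 3)))
print(obvious_obstruction((1, 2, 2, 2, 2, 3)))
```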
We conclude this section with some further definitions.

Definition 5.13 (Regular). A graph 𝐺 is regular if all vertices have the same degree; if that
common degree is 𝑘 ∈ ℕ0 , then we can say the graph is 𝑘-regular.

Example 5.14. The graph obtained from the vertices and edges of a regular dodecahedron (a
polyhedron made up of 12 flat pentagonal faces) is 3-regular; see Figure 5.3.

Figure 5.3: The graph obtained from the vertices and edges of a regular
dodecahedron.

5.2 Subgraphs, paths, and cycles


The next definition gives some commonly-encountered graphs, which are drawn in Figure 5.4.
Note in particular that 𝐶3 ’s and 𝐾3 ’s are the same type of graph; this graph is also known as the
triangle graph.

Definition 5.15. A complete graph 𝐺 is a graph that contains all possible edges: 𝐸(𝐺) = {𝐴 ⊆
𝑉(𝐺) ∶ |𝐴| = 2}. Notice that 𝐺 has size $\binom{|V(G)|}{2}$.
We call a complete graph with 𝑛 vertices a 𝐾𝑛 .
A path is a graph 𝐺 of the form 𝑉(𝐺) = {𝑣1 , … , 𝑣𝑛 }, 𝐸(𝐺) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛 − 1}, where
𝑛 ∈ ℕ+ is the order of 𝐺. We say that 𝑛 − 1 = |𝐸(𝐺)| is the length of 𝐺. The vertices 𝑣1 and 𝑣𝑛
are called the end-vertices of 𝐺. We call a path of length 𝑚 a 𝑃𝑚 (so a 𝑃𝑚 has 𝑚 + 1 vertices).
A cycle is a graph 𝐺 of the form 𝑉 (𝐺 ) = {𝑣1 , … , 𝑣𝑛 }, 𝐸(𝐺 ) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛 − 1} ∪ {𝑣1 𝑣𝑛 },
where 𝑛 ⩾ 3 is the order of 𝐺. We say that 𝑛 = |𝐸(𝐺 )| is the length of 𝐺. We call a cycle of length
𝑛 a 𝐶𝑛 .

Definition 5.16. A graph 𝐻 is a subgraph of another graph 𝐺 if 𝑉 (𝐻 ) ⊆ 𝑉 (𝐺 ) and 𝐸(𝐻 ) ⊆ 𝐸(𝐺 ).


If 𝑉 (𝐻 ) = 𝑉 (𝐺 ) then we call 𝐻 a spanning subgraph of 𝐺.
For a graph 𝐺 and a non-empty subset 𝐴 ⊆ 𝑉 (𝐺 ), we define the induced subgraph of 𝐺 on
𝐴 to be the graph 𝐺 [𝐴] with vertex set 𝐴 and edge set 𝐸(𝐺 [𝐴]) = {𝑒 ∈ 𝐸(𝐺 ) ∶ 𝑒 ⊆ 𝐴}; that is, 𝐺 [𝐴]
contains all edges between vertices in 𝐴 that are edges in 𝐺.
A(n isomorphic) copy of a graph 𝐻 in another graph 𝐺 is a subgraph 𝐺 ′ of 𝐺 that is isomor-
phic to 𝐻.
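The subgraph and induced-subgraph definitions translate directly into set operations. The sketch below assumes a graph is stored as a pair (vertex set, edge set) with edges as two-element frozensets; the function names and the example graph are hypothetical.

```python
def is_subgraph(H, G):
    """H is a subgraph of G if V(H) is a subset of V(G) and E(H) of E(G)."""
    (VH, EH), (VG, EG) = H, G
    return VH <= VG and EH <= EG


def induced_subgraph(G, A):
    """Return G[A]: vertex set A and every edge of G lying inside A."""
    V, E = G
    A = set(A)
    if not A or not A <= V:
        raise ValueError("A must be a non-empty subset of V(G)")
    return A, {e for e in E if e <= A}


# Hypothetical graph: the 4-cycle a-b-c-d-a together with the chord ac.
G = ({"a", "b", "c", "d"},
     {frozenset(p) for p in ["ab", "bc", "cd", "da", "ac"]})
H = ({"a", "b", "c"}, {frozenset("ab")})

print(induced_subgraph(G, {"a", "b", "c"}))
print(is_subgraph(H, G))
```

Here `H` is a subgraph of `G` but not an induced one, since `G[{a, b, c}]` also contains the edges bc and ac.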
Figure 5.4: Some examples of the graphs defined in Definition 5.15: the paths 𝑃0 to 𝑃5 , the
cycles 𝐶3 to 𝐶6 and the complete graphs 𝐾1 to 𝐾6 . Note that 𝐾1 = 𝑃0 , 𝐾2 = 𝑃1 and 𝐾3 = 𝐶3 .


Figure 5.5: A graph 𝐺, shown with an induced subgraph
𝐺[{𝑎, 𝑏, 𝑑, 𝑓}] = ({𝑎, 𝑏, 𝑑, 𝑓}, {𝑎𝑑, 𝑎𝑓, 𝑏𝑑, 𝑏𝑓, 𝑑𝑓}) and a non-induced subgraph
𝐻 = ({𝑎, 𝑏, 𝑐, 𝑑, 𝑓}, {𝑎𝑐, 𝑎𝑑, 𝑏𝑐}).

Example 5.17.

(1). The graph 𝐺 in Figure 5.5 is shown with several of its subgraphs, some of which are
induced subgraphs.

(2). The induced subgraphs of a 𝐾𝑛 are the 𝐾𝑚 ’s for 𝑚 ⩽ 𝑛. Moreover, if 𝐺 is a complete


graph with 𝑛 vertices and 𝐻 is a graph with at most 𝑛 vertices, then 𝐻 is isomorphic
to a subgraph of 𝐺: if 𝑓 ∶ 𝑉 (𝐻 ) → 𝑉 (𝐺 ) is an injection (which exists because |𝑉 (𝐻 )| ⩽
|𝑉 (𝐺 )|), then 𝑓 is an isomorphism from 𝐻 to the subgraph 𝐻 ′ = (Im(𝑓 ), {𝑓 (𝑢)𝑓 (𝑣) ∶ 𝑢𝑣 ∈
𝐸(𝐻 )}) of 𝐺.

(3). For every non-negative integer 𝑛, there is a path of length 𝑛 in a 𝑃𝑛+1 , and in a 𝐶𝑛+1
(provided 𝑛 ⩾ 2).

The following proposition summarises some basic facts about (induced) subgraphs.
Figure 5.6: The three paths of length 2 with vertex set {𝑎, 𝑏, 𝑐}.

Proposition 5.18. Let 𝐺 be a graph.

(1). If 𝐻 is a subgraph of 𝐺 and 𝐿 is a subgraph of 𝐻, then 𝐿 is a subgraph of 𝐺.

(2). Let 𝐴, 𝐵 ⊆ 𝑉 (𝐺 ) be given such that ∅ ≠ 𝐴 ⊆ 𝐵. Then 𝐺 [𝐵][𝐴] = 𝐺 [𝐴].

(3). If 𝐻 is a subgraph of 𝐺 then 𝐻 is a subgraph of 𝐺 [𝑉 (𝐻 )].

One question of particular interest to us is when copies of these (or other) graphs can be
found in other, larger graphs, and if so, how many such copies there are. For example, we
might ask whether or not a graph contains a path from one vertex to another, or whether a
graph contains a cycle. The remainder of this section investigates a couple of these questions.

Example 5.19. How many paths of length 2 are there in a complete graph 𝐺 with 𝑛 ⩾ 3 vertices?

Solution. Note that there are 3 paths of length 2 with the same vertex set {𝑎, 𝑏, 𝑐}, as shown
in Figure 5.6: their edge sets are {𝑎𝑏, 𝑎𝑐}, {𝑎𝑏, 𝑏𝑐} and {𝑎𝑐, 𝑏𝑐}. So one way to calculate the
number of these paths is to take the number of ways to choose three distinct vertices in 𝑉(𝐺),
which is $\binom{n}{3}$, and to multiply by 3, since each choice of three vertices supports three different
paths of length 2. So in total there are $3\binom{n}{3} = \frac{n(n-1)(n-2)}{2}$ paths of length 2 in a complete graph
on 𝑛 ⩾ 3 vertices.
Another approach is to note that any ordered triple (𝑥, 𝑦, 𝑧) of distinct vertices in 𝑉(𝐺)
gives a path of length 2, whose vertices are 𝑥, 𝑦 and 𝑧 and whose edges are 𝑥𝑦 and 𝑦𝑧. Recall
that the number of ways to choose three vertices out of 𝑛, with order but no repetition, is
$\frac{n!}{(n-3)!} = n(n-1)(n-2)$. However, this counts each path twice, since the triples (𝑥, 𝑦, 𝑧) and
(𝑧, 𝑦, 𝑥) give rise to the same path using this method. So in total there are $\frac{n(n-1)(n-2)}{2}$ paths of
length 2 in a complete graph with 𝑛 ⩾ 3 vertices.
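Both counting arguments can be checked by brute force for small 𝑛. The sketch below is an illustration only: it labels the vertices 0, …, 𝑛 − 1 (an arbitrary choice), identifies each path with its edge set, and compares the count with the two formulas above.

```python
from itertools import permutations
from math import comb


def count_paths_of_length_2(n):
    """Count the paths of length 2 in a complete graph on n vertices."""
    paths = set()
    for x, y, z in permutations(range(n), 3):
        # The triples (x, y, z) and (z, y, x) produce the same edge set,
        # so storing edge sets in a set removes the double counting.
        paths.add(frozenset({frozenset({x, y}), frozenset({y, z})}))
    return len(paths)


for n in range(3, 8):
    print(n, count_paths_of_length_2(n), 3 * comb(n, 3), n * (n - 1) * (n - 2) // 2)
```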

Similar arguments can be used to count the number of copies of other small graphs in
standard larger graphs. Often we want to find sufficient (and perhaps necessary) conditions
which ensure that any graph which satisfies these conditions must contain the subgraph we
are looking for. For example, the next lemma shows that any graph with minimum degree at
least two contains a cycle.

Lemma 5.20. Let 𝐺 be a graph with 𝛿(𝐺 ) ⩾ 2. Then 𝐺 contains a cycle.

Proof. Consider a path 𝑃 with longest possible length 𝑛 in 𝐺, and enumerate 𝑉 (𝑃) = {𝑣1 , … ,
𝑣𝑛+1 } so that 𝐸(𝑃) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛}. Observe that 𝑛 ≠ 0 since 𝛿(𝐺 ) ⩾ 2. Thus, since
𝑑𝐺 (𝑣𝑛+1 ) ⩾ 𝛿(𝐺 ) ⩾ 2, there exists a neighbour 𝑣𝑛+2 ∈ 𝑁𝐺 (𝑣𝑛+1 ) different from 𝑣𝑛 . If 𝑣𝑛+2 ∉
{𝑣1 , … , 𝑣𝑛−1 } then we can construct a longer path in 𝐺 by adding the edge 𝑣𝑛+1 𝑣𝑛+2 to 𝑃, which
is a contradiction. Therefore there exists a 𝑘 ∈ {1, … , 𝑛 − 1} such that 𝑣𝑘 𝑣𝑛+1 ∈ 𝐸(𝐺 ) and thus
the graph with vertex set {𝑣𝑘 , … , 𝑣𝑛+1 } and edge set {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 𝑘, … , 𝑛} ∪ {𝑣𝑘 𝑣𝑛+1 } is a cycle in
𝐺 (note that 𝑛 + 1 − 𝑘 + 1 ⩾ 3, which we require since cycles must have at least 3 vertices).
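The argument in this proof is constructive, and a variant of it can be sketched in Python. Note the swap: instead of a longest path, the sketch grows a path greedily until its last vertex has no neighbour off the path (a maximal path), which is all the argument needs. The adjacency-dictionary representation and the function name are assumptions of this sketch, and the graph is assumed to be simple.

```python
def find_cycle(adj):
    """adj maps each vertex to its set of neighbours; every vertex must
    have degree at least 2. Returns the vertices of a cycle, in order."""
    if not adj or any(len(nbrs) < 2 for nbrs in adj.values()):
        raise ValueError("requires minimum degree at least 2")
    start = next(iter(adj))
    path, on_path = [start], {start}
    while True:
        end = path[-1]
        fresh = [w for w in adj[end] if w not in on_path]
        if not fresh:
            break  # every neighbour of the last vertex lies on the path
        path.append(fresh[0])
        on_path.add(fresh[0])
    end, prev = path[-1], path[-2]
    # A neighbour of the last vertex other than its predecessor on the path.
    other = next(w for w in adj[end] if w != prev)
    return path[path.index(other):]


# Hypothetical example with minimum degree 2.
adj = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"}, "d": {"b", "c"}}
print(find_cycle(adj))
```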

The next theorem improves on Lemma 5.20 by showing that any graph with at least as many
edges as vertices contains a cycle.

Theorem 5.21. Any graph with at least as many edges as vertices has a cycle.
Proof. First, note that for any graph $|E(G)| \leqslant \binom{|V(G)|}{2} = \frac{|V(G)|(|V(G)|-1)}{2}$, so if |𝐸(𝐺)| ⩾ |𝑉(𝐺)|
then 𝐺 has at least 3 vertices. Also note that there is only one graph (up to isomorphism) with
3 vertices and at least 3 edges, 𝐶3 , which certainly contains a cycle.
Now let 𝑛 ⩾ 3 be an integer such that every graph with 𝑛 vertices and at least 𝑛 edges has
a cycle, and let 𝐺 be a graph with 𝑛 + 1 vertices and at least 𝑛 + 1 edges. If 𝛿(𝐺) ⩾ 2 then by
the previous lemma there is a cycle in 𝐺. Now suppose 𝛿(𝐺) < 2, so there exists a vertex 𝑣 with
degree 0 or 1. Define 𝐺′ ≔ 𝐺[𝑉(𝐺) ⧵ {𝑣}]. Then 𝐺′ has 𝑛 vertices and
𝐸(𝐺′) = 𝐸(𝐺) ⧵ {𝑣𝑤 ∶ 𝑤 ∈ 𝑁𝐺 (𝑣)}, so 𝐺′ has at least (𝑛 + 1) − 𝑑𝐺 (𝑣) ⩾ 𝑛 edges. Therefore, by
the induction hypothesis 𝐺′ has a cycle. As 𝐺′ is a subgraph of 𝐺, it follows that 𝐺 has a cycle too.
Therefore by induction on 𝑛, every graph with at least as many edges as vertices contains
a cycle.

5.3 Connectedness and trees


Informally, a graph is connected if for any two distinct vertices, it is possible to ‘walk’ from one
to the other. We formalise this notion in the following definition:

Definition 5.22. A walk 𝑊 in a graph 𝐺 is a finite sequence (𝑣0 , 𝑣1 , … , 𝑣𝑘 ) of vertices, with
𝑘 ∈ ℕ0 , such that 𝑣𝑖 𝑣𝑖+1 is an edge for every 𝑖 = 0, … , 𝑘 − 1. The length of 𝑊 is 𝑘, the number of
edges traversed. A walk 𝑊 is closed if 𝑣0 = 𝑣𝑘 . We say that 𝑊 is a walk from 𝑥 to 𝑦 if 𝑣0 = 𝑥 and
𝑣𝑘 = 𝑦.
We define the walk relation on 𝑉 (𝐺 ) as follows: if 𝑥 and 𝑦 are vertices in 𝐺, then 𝑥 ∼𝐺 𝑦 if
there is a walk from 𝑥 to 𝑦 in 𝐺. If the graph 𝐺 is clear from context, we can drop the subscript
and simply use ∼ instead.
A walk (𝑣0 , … , 𝑣𝑘 ) in 𝐺 with no repeated vertices can be turned into a path of length 𝑘 in 𝐺:

𝑃(𝑣0 , … , 𝑣𝑘 ) ≔ ({𝑣0 , … , 𝑣𝑘 }, {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 0, … , 𝑘 − 1}).

If the only repeated vertices are 𝑣0 = 𝑣𝑘 (so the walk is closed) and 𝑘 ⩾ 3 then

𝐶 (𝑣0 , … , 𝑣𝑘 ) ≔ ({𝑣0 , … , 𝑣𝑘−1 }, {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 0, … , 𝑘 − 1})

is a cycle of length 𝑘 in 𝐺.
A graph 𝐺 is connected if for any two vertices 𝑢 and 𝑣 of 𝐺 there is a walk in 𝐺 from 𝑢 to 𝑣; in
other words, 𝑥 ∼𝐺 𝑦 for all 𝑥, 𝑦 ∈ 𝑉 (𝐺 ). If a graph is not connected, we say it is disconnected. A
(connected) component of 𝐺 is a maximal connected subgraph 𝐶 of 𝐺: so 𝐶 is connected and
if 𝐷 is a connected subgraph of 𝐺 and 𝐶 is a subgraph of 𝐷, then 𝐶 = 𝐷.

An important application of walks in graphs, which unfortunately we do not have time to


investigate in this course, is the study of random walks, where at each step you randomly select
which edge incident to the current vertex will be traversed next. Such walks are widely applied
in physics, in modelling Brownian motion, for example. Figure 5.7 shows some examples of
walks in a graph.
We can characterise the components of a graph using the walk relation, which we show in
the next few results.

Proposition 5.23. Let 𝐺 be a graph and let 𝐻 be a spanning subgraph of 𝐺. If 𝐻 is connected


then 𝐺 is also connected.

Proof. Let 𝑥, 𝑦 ∈ 𝑉 (𝐺 ) = 𝑉 (𝐻 ) be given. Then since 𝐻 is connected, there exists a walk (𝑣0 , … ,
𝑣𝑘 ) from 𝑥 to 𝑦 in 𝐻, which means that 𝑣0 , … , 𝑣𝑘 ∈ 𝑉 (𝐻 ), 𝑣0 = 𝑥, 𝑣𝑘 = 𝑦, and 𝑣𝑖 𝑣𝑖+1 ∈ 𝐸(𝐻 ) ⊆
𝐸(𝐺 ) for all 𝑖 = 0, … , 𝑘 − 1. Thus (𝑣0 , … , 𝑣𝑘 ) is a walk from 𝑥 to 𝑦 in 𝐺 and therefore 𝐺 is con-
nected.

The following proposition is left as an exercise for the reader.

Proposition 5.24. Let 𝐺 be a graph. Then:

(1). ∼𝐺 is an equivalence relation on 𝑉 (𝐺 ).

(2). If (𝑣0 , … , 𝑣𝑘 ) is a walk in 𝐺, then 𝑣0 , … , 𝑣𝑘 all belong to the same equivalence class of ∼𝐺 .
Figure 5.7: The edges highlighted in green join consecutive vertices of a walk
without repeated vertices from 𝑏 to 𝑗: (𝑏, 𝑒, 𝑐, 𝑑, 𝑓, 𝑗). The edges highlighted in blue are
from a closed walk from 𝑓 to itself, but with another repeated vertex:
(𝑓, 𝑎, 𝑔, ℎ, 𝑘, 𝑎, 𝑖, 𝑓). Notice that the graph with vertex set {𝑎, 𝑓, 𝑔, ℎ, 𝑖, 𝑘} and edge set
{𝑎𝑓, 𝑎𝑔, 𝑎𝑖, 𝑎𝑘, 𝑓𝑖, 𝑔ℎ, ℎ𝑘} is not a cycle.

Corollary 5.25. Let 𝐺 be a graph. Then the connected components of 𝐺 are the induced sub-
graphs of the form 𝐺 [𝐴], where 𝐴 is an equivalence class of ∼𝐺 .

Proof. Let 𝐴 be an equivalence class of ∼𝐺 and let 𝑥, 𝑦 ∈ 𝐴 be given. Then 𝑥 ∼𝐺 𝑦, so there
exists a walk (𝑣0 , … , 𝑣𝑘 ) in 𝐺 from 𝑥 to 𝑦. However, by the previous proposition 𝑣0 , … , 𝑣𝑘 ∈ 𝐴,
so 𝑣𝑖 𝑣𝑖+1 ∈ 𝐸(𝐺[𝐴])⁶ for all 𝑖 = 0, … , 𝑘 − 1 and thus 𝑥 ∼𝐺[𝐴] 𝑦. Therefore by definition 𝐺[𝐴] is
connected.
Let 𝐷 be a connected subgraph of 𝐺 that contains 𝐺 [𝐴]. Suppose that 𝐴 ⫋ 𝑉 (𝐷), so there
is a vertex 𝑣 ∈ 𝑉 (𝐷)⧵𝐴. Pick 𝑎 ∈ 𝐴 ≠ ∅. Then as 𝐷 is connected, there is a walk from 𝑎 to 𝑣 in 𝐷
and thus in 𝐺, so 𝑣 ∈ [𝑎]∼𝐺 = 𝐴, which is a contradiction. Therefore 𝐴 = 𝑉 (𝐷), which implies
𝐸(𝐷) ⊆ 𝐸(𝐺 [𝐴]) ⊆ 𝐸(𝐷), and hence 𝐷 = 𝐺 [𝐴]. Thus 𝐺 [𝐴] is a connected component of 𝐺.
Now let 𝐶 be a component of 𝐺 and fix a vertex 𝑥 in 𝐶. Then for all 𝑦 ∈ 𝑉 (𝐶 ), there is a
walk (𝑣0 , … , 𝑣𝑘 ) from 𝑥 to 𝑦 in 𝐶. Define 𝐴 ≔ [𝑥]∼𝐺 , so by the previous proposition 𝑣0 , … , 𝑣𝑘 ∈ 𝐴
and in particular 𝑦 ∈ 𝐴. Thus 𝑉 (𝐶 ) ⊆ 𝐴 and furthermore 𝐸(𝐶 ) ⊆ 𝐸(𝐺 [𝐴]). However, since
𝐴 = 𝑉 (𝐺 [𝐴]) and 𝐺 [𝐴] is connected, by maximality 𝐶 = 𝐺 [𝐴], completing this proof.
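Corollary 5.25 suggests a way to compute components: find the equivalence classes of the walk relation. The sketch below does this with a breadth-first search, which is a standard technique chosen here for illustration and is not described in these notes; the adjacency-dictionary representation and the example graph are likewise assumptions.

```python
from collections import deque


def walk_classes(adj):
    """Return the equivalence classes of the walk relation, i.e. the
    vertex sets of the connected components. adj maps each vertex to
    its set of neighbours."""
    seen = set()
    classes = []
    for s in adj:
        if s in seen:
            continue
        cls = {s}  # vertices reachable by a walk from s found so far
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in cls:
                    cls.add(y)
                    queue.append(y)
        seen |= cls
        classes.append(cls)
    return classes


def is_connected(adj):
    return len(walk_classes(adj)) == 1


# Hypothetical graph with three components, one of them an isolated vertex.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"},
       "d": {"e"}, "e": {"d"}, "g": set()}
print(walk_classes(adj))
```

The component on a class 𝐴 is then the induced subgraph 𝐺[𝐴].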

Any graph consists of one or more connected components, and a graph is connected if and
only if it has exactly one connected component. Figure 5.8 shows a disconnected graph and
its components.

Proposition 5.26. Let 𝑢 and 𝑣 be vertices of a graph 𝐺. Then 𝐺 contains a walk from 𝑢 to 𝑣 if
and only if 𝐺 contains a path that includes both 𝑢 and 𝑣 (as end-vertices).

Proof. Let (𝑣0 , … , 𝑣𝑘 ) be a walk from 𝑢 to 𝑣 in 𝐺 with shortest possible length. If 𝑣𝑖 = 𝑣𝑗 for some
0 ⩽ 𝑖 < 𝑗 ⩽ 𝑘, then
(𝑣0 , 𝑣1 , … , 𝑣𝑖−1 , 𝑣𝑖 , 𝑣𝑗+1 , 𝑣𝑗+2 , … , 𝑣𝑘−1 , 𝑣𝑘 )

is a shorter walk of length 𝑘 − (𝑗 − 𝑖) < 𝑘 from 𝑢 to 𝑣 in 𝐺, which is a contradiction. Therefore


𝑣0 , … , 𝑣𝑘 are all distinct, so 𝑃(𝑣0 , … , 𝑣𝑘 ) is a path in 𝐺 that contains 𝑢 and 𝑣 as the end-vertices
of the path.
Now let 𝑃 be a path in 𝐺 that contains 𝑢 and 𝑣, so there is an enumeration of the vertices
𝑤1 , … , 𝑤𝑛 of 𝑃 such that 𝐸(𝑃) = {𝑤𝑖 𝑤𝑖+1 ∶ 𝑖 = 1, … , 𝑛 − 1}. As 𝑢, 𝑣 ∈ 𝑉 (𝑃), there exist 𝑖, 𝑗 ∈
{1, … , 𝑛} such that 𝑢 = 𝑤𝑖 and 𝑣 = 𝑤𝑗 . If 𝑖 ⩽ 𝑗 then (𝑤𝑖 , 𝑤𝑖+1 , … , 𝑤𝑗 ) is a walk from 𝑢 to 𝑣.
Otherwise, (𝑤𝑖 , 𝑤𝑖−1 , … , 𝑤𝑗 ) is a walk from 𝑢 to 𝑣.
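The shortcut step in the first half of this proof can be applied repeatedly to turn any walk into a path with the same end-vertices. The sketch below does this in a single pass; it is a variant of the argument (the proof itself starts from a shortest walk), and the function name and the example walk are hypothetical.

```python
def walk_to_path(walk):
    """Remove closed sub-walks from a walk, given as a list of vertices.
    The result has no repeated vertices and the same first and last vertex."""
    path = []
    position = {}  # position[v] = index of v in the current path
    for v in walk:
        if v in position:
            # v was visited before: cut out the closed walk from v back to v.
            cut = position[v]
            for w in path[cut + 1:]:
                del position[w]
            del path[cut + 1:]
        else:
            position[v] = len(path)
            path.append(v)
    return path


print(walk_to_path(["b", "e", "c", "e", "d"]))
print(walk_to_path(["u", "a", "b", "a", "c", "u", "d", "v"]))
```

Consecutive vertices of the output are consecutive somewhere in the input walk, so they are adjacent whenever the input is a walk in a graph.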

Proposition 5.26 allows us to give an equivalent definition in terms of paths: a graph 𝐺 is
connected if and only if any two vertices 𝑢 and 𝑣 of 𝐺 are contained in some path in 𝐺. As there
are infinitely-many walks between two vertices in the same component but only finitely-many
paths, this proposition is sometimes easier to use.

⁶ By definition, for any subset 𝐵 ⊆ 𝑉(𝐺) and all distinct 𝑏, 𝑐 ∈ 𝐵, 𝑏𝑐 ∈ 𝐸(𝐺) if and only if 𝑏𝑐 ∈ 𝐸(𝐺[𝐵]).

Figure 5.8: A disconnected graph with 3 components, shown in different colours. One
component consists of just the isolated vertex 𝑔.

Theorem 5.27. Any graph on 𝑛 vertices with at most 𝑛 − 2 edges is disconnected.

Proof. We argue by induction. First note that the statement is vacuously true for 𝑛 = 1 since
a graph cannot have a negative number of edges. Let 𝑛 ∈ ℕ+ be given such that every graph
with order 𝑛 and size at most 𝑛 − 2 is disconnected. Let 𝐺 be a graph with order 𝑛 + 1 and size
at most 𝑛 − 1. Then by the handshaking lemma,

\[
\sum_{v \in V(G)} d(v) = 2|E(G)| \leqslant 2(n - 1) < 2(n + 1).
\]

If 𝛿(𝐺) ⩾ 2 then $\sum_{v \in V(G)} d(v) \geqslant 2|V(G)| = 2(n + 1)$, which is a contradiction. Thus 𝛿(𝐺) < 2
and so there exists a vertex 𝑣 with degree 0 or 1. If 𝑑(𝑣) = 0 then [𝑣]∼𝐺 = {𝑣} ≠ 𝑉(𝐺) since
𝑛 + 1 ⩾ 2, so 𝐺 is disconnected.
Now suppose 𝑑(𝑣) = 1 and consider 𝐺′ ≔ 𝐺[𝑉(𝐺) ⧵ {𝑣}]. Then 𝐺′ has 𝑛 vertices and at most
|𝐸(𝐺)| − 𝑑𝐺 (𝑣) ⩽ (𝑛 − 1) − 1 = 𝑛 − 2 edges, so by the induction hypothesis 𝐺′ is disconnected.
Hence, there exist 𝑥, 𝑦 ∈ 𝑉 (𝐺 ′ ) such that 𝑥 ≁𝐺 ′ 𝑦. If 𝑥 ∼𝐺 𝑦 then by Proposition 5.26, there is a
path 𝑃 in 𝐺 with end-vertices 𝑥 and 𝑦, but 𝑃 cannot be a subgraph of 𝐺 ′ by the same propo-
sition. Thus, 𝑣 ∈ 𝑉 (𝑃) ⧵ {𝑥, 𝑦} and so 𝑑𝐺 (𝑣) ⩾ 𝑑𝑃 (𝑣) = 2, which is a contradiction. Therefore
𝑥 ≁𝐺 𝑦 and so 𝐺 is disconnected.
Thus by induction, for all positive integers 𝑛, every graph with 𝑛 vertices and at most 𝑛 − 2
edges is disconnected.

We now introduce a special class of connected graphs.

Definition 5.28. A graph is called acyclic if it does not contain a cycle. A tree is a connected
acyclic graph. A leaf of a tree is a vertex 𝑣 with 𝑑(𝑣) = 1.

Trees are an important class of graphs which have many applications: phylogenetic trees,
search trees, decision trees and so forth. They are called trees due to the way they ‘branch out’
– see Figure 5.9. Paths are also trees; they are the trees with maximum degree at most 2.
Our next theorem states that any connected graph contains a tree as a spanning subgraph,
a result with many important applications.⁷

⁷ Spanning trees are important because they are minimal connected spanning subgraphs: they use
the smallest number of edges needed to get from any vertex to any other vertex. For example, if
your graph is a road network, then provided you can keep open the roads (edges) of a spanning
tree, traffic will still be able to travel from any city (vertex) to any other city, even if all of
the other roads are closed.

Figure 5.9: An example of a tree with 20 vertices.

Theorem 5.29. Every connected graph contains a spanning tree.

Proof. Let 𝐺 be a connected graph with 𝑛 vertices and choose a vertex 𝑣 ∈ 𝑉 (𝐺 ). Let 𝑇1 denote
the tree with vertex set {𝑣}. We will construct a sequence (𝑇1 , … , 𝑇𝑛 ) of tree subgraphs of 𝐺,
where for every 𝑘 = 1, … , 𝑛, 𝑇𝑘 has 𝑘 vertices and 𝑇𝑘 is a subtree of 𝑇𝑘+1 (if 𝑘 < 𝑛). Then since
|𝑉 (𝑇𝑛 )| = 𝑛 = |𝑉 (𝐺 )|, 𝑇𝑛 must be a spanning tree of 𝐺.
Let 𝑚 ∈ ℕ+ be given and suppose we have constructed (𝑇1 , … , 𝑇𝑚 ) and 𝑚 < 𝑛. Then there
exists a vertex 𝑢 ∈ 𝑉(𝐺) ⧵ 𝑉(𝑇𝑚 ). As 𝐺 is connected, there is a walk (𝑤0 , … , 𝑤𝑙 ) from 𝑢 to 𝑣
in 𝐺. Let 𝑖 ∈ {1, … , 𝑙} be the least index such that 𝑤𝑖 ∈ 𝑉(𝑇𝑚 ); such an index exists because
𝑤𝑙 = 𝑣 ∈ 𝑉(𝑇𝑚 ). Then 𝑤𝑖−1 𝑤𝑖 ∈ 𝐸(𝐺) and 𝑤𝑖−1 ∈
𝑉(𝐺) ⧵ 𝑉(𝑇𝑚 ). We claim that 𝑇𝑚+1 ≔ (𝑉(𝑇𝑚 ) ∪ {𝑤𝑖−1 }, 𝐸(𝑇𝑚 ) ∪ {𝑤𝑖−1 𝑤𝑖 }) is a tree.
Let 𝑥 ∈ 𝑉 (𝑇𝑚 ) be given. Then since 𝑇𝑚 is connected, there exists a walk (𝑥0 , … , 𝑥𝑘 ) from 𝑥 to
𝑤𝑖 in 𝑇𝑚 , and so (𝑥0 , … , 𝑥𝑘 , 𝑤𝑖−1 ) is a walk from 𝑥 to 𝑤𝑖−1 in 𝑇𝑚+1 . Thus 𝑇𝑚+1 is also connected.
Now suppose there is a cycle 𝐶 in 𝑇𝑚+1 . This cycle cannot be a subgraph of 𝑇𝑚 as it is a tree
by assumption. Hence 𝑤𝑖−1 ∈ 𝑉 (𝐶 ), but 𝑑𝑇𝑚+1 (𝑤𝑖−1 ) = 1 as 𝑤𝑖 is the only neighbour of 𝑤𝑖−1
in 𝑇𝑚+1 . This is a contradiction, since cycles have minimum degree 2 (and thus any vertex on
a cycle subgraph must have degree at least 2 in the original graph). Therefore 𝑇𝑚+1 is acyclic
and thus a tree.
Hence by induction, there exists a sequence (𝑇1 , … , 𝑇𝑛 ) of tree subgraphs of 𝐺 as required,
concluding our proof.

The proof of the above theorem gives an algorithm for finding a spanning tree, which we
describe below:

Algorithm 5.30. The algorithm to find a spanning tree in a connected graph is described in
the steps below:

(1). Input a graph G, choose a vertex v ∈ V(G). Define n ≔ 1 and let T1 denote the tree with
vertex set {v}.

(2). If Tn spans G (V(Tn ) = V(G)) then we halt the algorithm and output Tn . Otherwise, we
continue to the next step.

(3). Find a pair of adjacent vertices u, t in G, where t is in Tn but u is not. Set

Tn+1 ≔ (V(Tn ) ∪ {u}, E(Tn ) ∪ {ut}),

increase n by 1 and go to step (2).
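The steps above can be sketched in Python as follows. The adjacency-dictionary representation, the function name and the error raised for disconnected input are assumptions of this sketch; the algorithm itself does not say how the pair of vertices in step (3) should be found, and the simple search used here is not efficient.

```python
def spanning_tree(adj, root=None):
    """Grow a spanning tree of a connected graph one vertex at a time.
    adj maps each vertex to its set of neighbours."""
    if not adj:
        raise ValueError("the graph must have at least one vertex")
    if root is None:
        root = next(iter(adj))
    tree_vertices = {root}  # step (1): the tree with a single vertex
    tree_edges = set()
    while tree_vertices != set(adj):  # step (2): stop once the tree spans G
        # Step (3): find adjacent vertices t (in the tree) and u (not in it).
        pair = next(((t, u) for t in tree_vertices for u in adj[t]
                     if u not in tree_vertices), None)
        if pair is None:
            raise ValueError("the graph is not connected")
        t, u = pair
        tree_vertices.add(u)
        tree_edges.add(frozenset((t, u)))
    return tree_vertices, tree_edges


# Hypothetical example: the 4-cycle a-b-c-d-a with the chord ac.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"a", "c"}}
print(spanning_tree(adj, root="a"))
```

Which spanning tree is returned depends on the order in which candidate pairs are examined, so different runs or representations may give different trees.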

Example 5.31.

(1). For any cycle 𝐶𝑛 , if we remove a single edge then we get a path with 𝑛 vertices and 𝑛 − 1
edges, that is, a 𝑃𝑛−1 , which is a spanning subtree. All spanning trees of a 𝐶𝑛 are 𝑃𝑛−1 ’s.

(2). Figure 5.10 shows a spanning tree for the connected graph from Figure 5.7. There are
2,082 spanning trees for this graph.

We conclude this section with a characterisation of trees.

Theorem 5.32. Let 𝐺 be a graph with 𝑛 vertices. Then any two of the following properties imply
the third:

• 𝐺 is connected.

• 𝐺 is acyclic.

• 𝐺 has size 𝑛 − 1.

Proof. First, we assume 𝐺 is a tree; i.e. it is connected and acyclic. As 𝐺 is acyclic, by Theorem
5.21 𝐺 has at most 𝑛 − 1 edges, whereas since 𝐺 is connected, by Theorem 5.27 𝐺 has at least
𝑛 − 1 edges. Therefore 𝐺 has exactly 𝑛 − 1 edges.
Figure 5.10: A spanning tree 𝑇 for the graph from Figure 5.7.

Now assume 𝐺 is connected and has 𝑛 − 1 edges. Then by Theorem 5.29, 𝐺 contains a
spanning tree 𝑇. By the first part of this proof, we know that 𝑇 has 𝑛 − 1 edges and as 𝐸(𝑇 ) ⊆
𝐸(𝐺 ), it follows that 𝐺 = 𝑇, so 𝐺 is acyclic.
Finally, assume 𝐺 is acyclic and has size 𝑛 − 1. Then each component 𝐶 of 𝐺 is connected
and acyclic, so again by the first part of this proof |𝐸(𝐶)| = |𝑉(𝐶)| − 1. If 𝐶1 , … , 𝐶𝑘 are the
components of 𝐺, then since each vertex and each edge belongs to exactly one component, by the
sum rule

\[
n = |V(G)| = \Bigl|\bigcup_{i=1}^{k} V(C_i)\Bigr| = \sum_{i=1}^{k} |V(C_i)|
\]
\[
\Rightarrow\quad n - 1 = |E(G)| = \Bigl|\bigcup_{i=1}^{k} E(C_i)\Bigr| = \sum_{i=1}^{k} |E(C_i)| = \sum_{i=1}^{k} \bigl(|V(C_i)| - 1\bigr) = \Bigl(\sum_{i=1}^{k} |V(C_i)|\Bigr) - k = n - k.
\]

Thus 𝑘 = 1, so 𝐺 is connected.
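Theorem 5.32 gives a convenient test for trees: it is enough to check that a graph is connected and has size 𝑛 − 1, without searching for cycles. The sketch below is one possible way to do this; the adjacency-dictionary representation, the depth-first search used for connectivity, and the choice to treat the graph with no vertices as not a tree are all assumptions of the sketch.

```python
def is_tree(adj):
    """Check the 'connected' and 'size n - 1' properties of Theorem 5.32.
    adj maps each vertex to its set of neighbours."""
    n = len(adj)
    if n == 0:
        return False
    # By the handshaking lemma, the degree sum is twice the size.
    size = sum(len(nbrs) for nbrs in adj.values()) // 2
    if size != n - 1:
        return False
    # Connectivity check by depth-first search from an arbitrary vertex.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == n


print(is_tree({1: {2}, 2: {1, 3}, 3: {2}}))
print(is_tree({1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: set()}))
```

The second example has 4 vertices and 3 edges but is disconnected (and contains a cycle), which shows that size 𝑛 − 1 alone does not make a graph a tree.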

5.4 Bipartite graphs


Definition 5.33. A graph 𝐺 is bipartite if its vertex set 𝑉 can be written as 𝑉 = 𝑉1 ∪𝑉2 where 𝑉1
and 𝑉2 are disjoint and every edge of 𝐺 is incident to one vertex of 𝑉1 and one vertex of 𝑉2 (so
no edges have both vertices in 𝑉1 or both vertices in 𝑉2 ). We refer to the sets 𝑉1 , 𝑉2 as a pair of
vertex classes of 𝐺.
Let 𝑚, 𝑛 ∈ ℕ0 be given such that they are not both zero. A complete 𝑚-by-𝑛 bipartite graph,
also called a 𝐾𝑚,𝑛 , is a bipartite graph 𝐺 with vertex classes 𝑉1 , 𝑉2 with sizes 𝑚 and 𝑛 respec-
tively, such that 𝐸(𝐺 ) = {𝑣1 𝑣2 ∶ 𝑣1 ∈ 𝑉1 , 𝑣2 ∈ 𝑉2 }. That is, the graph has all possible edges between
the two vertex classes.

Another useful way to think of bipartite graphs is as those graphs whose vertices are 2-colourable:
we can ‘colour’ the vertices in at most 2 possible colours in such a way that no adjacent vertices
have the same colour. The sets of vertices with the same colour are then vertex classes. See
Figure 5.11 for an example. Note that there may be several ways to choose vertex classes of a
bipartite graph.
Bipartite graphs arise naturally in applications where a graph represents connections be-
tween different types of objects. For example, a scheduling problem might consider a graph
whose vertices are students and classes, where there is an edge between a student and a class
if that student is taking that class; in this context an edge between two students, or between
two classes, would not make sense, and the vertex classes would be the set of vertices corre-
sponding to students, and the set of vertices corresponding to classes.
Figure 5.11: An example of a bipartite graph, with two vertex classes {𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓, 𝑔}
and {ℎ, 𝑖, 𝑗, 𝑘, 𝑙}, shown in separate rows. A different pair of vertex classes is
{𝑎, 𝑏, 𝑑, 𝑔, 𝑗, 𝑙} and {𝑐, 𝑒, 𝑓, ℎ, 𝑖, 𝑘}, which we have coloured differently. Note that adjacent
vertices lie in different rows / have different colours.

In a bipartite graph any walk must move from one vertex class to the other with each step.
As such a walk alternates between the two vertex classes, a closed walk, which ends in the
vertex class where it started, must have even length.

Proposition 5.34. Any closed walk in a bipartite graph has even length.

Proof. Let 𝐺 be a bipartite graph, and let 𝑊 = (𝑤0 , … , 𝑤𝑘 ) be a closed walk in 𝐺. Let 𝑉1 , 𝑉2 be a
pair of vertex classes of 𝐺, chosen so that 𝑤0 ∈ 𝑉2 . Since 𝑤𝑖 𝑤𝑖+1 is an edge of 𝐺 for each 𝑖 = 0, … , 𝑘 − 1,
the vertices 𝑤𝑖 and 𝑤𝑖+1 must lie in different vertex classes, and so it follows that 𝑤𝑖 ∈ 𝑉2 if 𝑖 is
even and 𝑤𝑖 ∈ 𝑉1 if 𝑖 is odd. In particular, as 𝑤𝑘 = 𝑤0 ∈ 𝑉2 , 𝑘 must be even.

The next proposition implies that in any graph, a shortest closed walk of odd length (if such
a walk exists) is an odd cycle.

Proposition 5.35. If a graph contains a closed walk of odd length, then it contains a cycle of
odd length.

Proof. Let 𝐺 be a graph and let 𝑊 = (𝑤0 , … , 𝑤𝑘 ) be a closed walk of shortest possible odd
length. Suppose this walk has a repeated vertex other than 𝑤0 = 𝑤𝑘 , so 𝑤𝑖 = 𝑤𝑗 for some
𝑖, 𝑗 = 0, … , 𝑘 with 𝑖 < 𝑗 and (𝑖, 𝑗) ≠ (0, 𝑘). Then (𝑤0 , … , 𝑤𝑖 , 𝑤𝑗+1 , … , 𝑤𝑘 ) and (𝑤𝑖 , … , 𝑤𝑗 ) are
closed walks with lengths 𝑘 − (𝑗 − 𝑖) and 𝑗 − 𝑖 respectively. As 𝑘 is odd, it follows that one of
these walks also has odd length shorter than 𝑘, which gives a contradiction. Therefore the only
repeated vertex is 𝑤0 = 𝑤𝑘 . If 𝑘 = 1 then 𝑤0 = 𝑤1 and 𝑤0 𝑤1 ∈ 𝐸(𝐺), which is a contradiction.
Thus 𝑘 ⩾ 3 and so 𝐶(𝑤0 , … , 𝑤𝑘 ) is a cycle of odd length in 𝐺.

Note that the analogous statement for even length walks and cycles is not true; a graph
with no cycles of even length may contain a closed walk of even length.

Theorem 5.36. A graph is bipartite if and only if it has no odd-length cycles.

Proof. Proposition 5.34 shows that a bipartite graph has no closed walks of odd length and thus
no odd-length cycles, so let 𝐺 be a connected graph that is not bipartite. Choose a vertex 𝑣 and
for every 𝑛 ∈ ℕ0 , let 𝐴𝑛 be the set of vertices 𝑥 such that the shortest path from 𝑣 to 𝑥 has
length 𝑛. As 𝐺 is connected, it follows that $V(G) = \bigcup_{n=0}^{\infty} A_n$. If we define
$A_{\text{even}} := \bigcup_{n=0}^{\infty} A_{2n}$ and $A_{\text{odd}} := \bigcup_{n=0}^{\infty} A_{2n+1}$,
then 𝐴even ∪ 𝐴odd = 𝑉(𝐺) and 𝐴even ∩ 𝐴odd = ∅.
Note that for all 𝑛 ∈ ℕ0 and all 𝑥 ∈ 𝑉(𝐺):

\[
x \in A_{n+1} \iff x \notin \bigcup_{k=0}^{n} A_k \text{ and } N(x) \cap A_n \neq \emptyset.
\]

Let 𝑚, 𝑛 ∈ ℕ0 , 𝑎 ∈ 𝐴𝑚 , 𝑏 ∈ 𝐴𝑛 be given such that 𝑚 < 𝑛 and 𝑎𝑏 is an edge in 𝐺. Then there is a
walk (𝑤0 , … , 𝑤𝑚 ) from 𝑣 to 𝑎, so (𝑤0 , … , 𝑤𝑚 , 𝑏) is a walk from 𝑣 to 𝑏, so by definition of 𝐴𝑛 we
must have 𝑛 ⩽ 𝑚 + 1. Therefore 𝑛 = 𝑚 + 1, so 𝑎 and 𝑏 cannot both belong to 𝐴even or
both belong to 𝐴odd . As these sets cannot be a pair of vertex classes for 𝐺, there is an edge with
both end-vertices in 𝐴even or both in 𝐴odd , and by the above it follows that there
must be an edge 𝑎𝑏 with both 𝑎 and 𝑏 in some 𝐴𝑚 . Thus there are walks (𝑤0 , … , 𝑤𝑚 ),
(𝑤′0 , … , 𝑤′𝑚 ) of length 𝑚 from 𝑣 to 𝑎 and 𝑏 respectively. Then (𝑤0 , … , 𝑤𝑚 , 𝑤′𝑚 , … , 𝑤′0 ) is a
closed walk with length 𝑚 + 1 + 𝑚 = 2𝑚 + 1.
Therefore by Proposition 5.35, there is an odd-length cycle.
Now suppose that 𝐺 is any graph and that each component 𝐶1 , … , 𝐶𝑘 of 𝐺 is bipartite,
and let 𝑈𝑖 , 𝑉𝑖 be a pair of vertex classes for 𝐶𝑖 , for every 𝑖 = 1, … , 𝑘. Define
$U := \bigcup_{i=1}^{k} U_i$, $V := \bigcup_{i=1}^{k} V_i$. Then

\[
U \cup V = \bigcup_{i=1}^{k} (U_i \cup V_i) = \bigcup_{i=1}^{k} V(C_i) = V(G),
\]

and since 𝑈𝑖 ∩ 𝑉𝑖 = ∅ for all 𝑖 = 1, … , 𝑘 and 𝑉(𝐶𝑖 ) ∩ 𝑉(𝐶𝑗 ) = ∅ for all distinct 𝑖, 𝑗 = 1, … , 𝑘,

\[
U \cap V = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k} (U_i \cap V_j) = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k} \emptyset = \emptyset.
\]

Moreover, suppose there is an edge 𝑒 ⊆ 𝑈, so then there exist 𝑖, 𝑗 = 1, … , 𝑘 and 𝑢 ∈ 𝑈𝑖 and


𝑣 ∈ 𝑈𝑗 such that 𝑒 = 𝑢𝑣. If 𝑖 = 𝑗 then we get a contradiction since 𝑈𝑖 , 𝑉𝑖 is a pair of vertex
classes for 𝐶𝑖 . If 𝑖 ≠ 𝑗 we get another contradiction by Corollary 5.25, because 𝑉 (𝐶𝑖 ) and 𝑉 (𝐶𝑗 )
are different equivalence classes of ∼𝐺 and 𝑢 ∼𝐺 𝑣. Therefore there is no edge between vertices
of 𝑈 and likewise no edge between vertices of 𝑉, so 𝑈, 𝑉 are a pair of vertex classes for 𝐺,
proving that 𝐺 is bipartite.

An immediate consequence of this theorem is that trees, and in particular paths, are
bipartite, as are 𝐶𝑛 ’s for even 𝑛 ⩾ 4. The proof of this theorem contains within it the following
algorithm, and also justifies that the algorithm will always halt on any input and is correct.
Figures 5.12 and 5.13 show example runs of this algorithm on some connected graphs.

Algorithm 5.37. This algorithm will either output a pair of vertex classes (for when the input
is bipartite) or will output a closed walk with odd length at least 3 (for when the input is not
bipartite):

(1). Input a graph G. For each component C of G, run through steps (a) to (d):

(a) Fix a vertex v ∈ V(C) and define A0 ≔ {v}, n ≔ 0.

(b) If $\bigcup_{k=0}^{n} A_k = V(C)$ then output

\[
\Bigl( \bigcup_{\substack{k=0, \\ k \text{ is even}}}^{n} A_k ,\; \bigcup_{\substack{k=0, \\ k \text{ is odd}}}^{n} A_k \Bigr)
\]

for this component. Otherwise, proceed to step (c).

(c) Define $A_{n+1} := \{u \in V(C) \setminus \bigcup_{i=0}^{n} A_i : N(u) \cap A_n \neq \emptyset\}$ and increase n by 1.

(d) If there are two vertices a, a′ ∈ An that are adjacent, find walks (w0 , … , wn ), (w′0 , … , w′n )
from v to a and a′ respectively. Output (w0 , … , wn , w′n , … , w′0 ) and halt the whole
algorithm. Otherwise, go to step (b).

(2). If C1 , … , Ck are the components of G, and for each i = 1, … , k, (Ui , Vi ) is the result of steps
(a) to (d) for component Ci , then output

\[
\Bigl( \bigcup_{i=1}^{k} U_i ,\; \bigcup_{i=1}^{k} V_i \Bigr).
\]
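The algorithm above can be sketched in Python as a layer-by-layer search. The adjacency-dictionary representation, the function names, the parent pointers used to recover the walks in step (d), and the convention of returning a pair with one entry set to `None` are all assumptions of this sketch; the graph is assumed to be simple, with no vertex named `None`.

```python
def walk_from_root(parent, x):
    """Follow parent pointers from x back to the start vertex and return
    the walk from the start vertex to x."""
    walk = []
    while x is not None:
        walk.append(x)
        x = parent[x]
    return walk[::-1]


def vertex_classes_or_odd_walk(adj):
    """Return (classes, None) if the graph is bipartite, and otherwise
    (None, walk) where walk is a closed walk of odd length."""
    U, V = set(), set()
    for v in adj:
        if v in U or v in V:
            continue  # v lies in a component that was already processed
        level = {v: 0}      # level[x] = n means that x belongs to A_n
        parent = {v: None}
        layer = [v]         # the current set A_n
        while layer:
            # Step (d): look for an edge with both end-vertices in A_n.
            for a in layer:
                for b in adj[a]:
                    if level.get(b) == level[a]:
                        back = walk_from_root(parent, b)[::-1]
                        return None, walk_from_root(parent, a) + back
            # Step (c): A_{n+1} consists of the unseen neighbours of A_n.
            next_layer = []
            for a in layer:
                for u in adj[a]:
                    if u not in level:
                        level[u] = level[a] + 1
                        parent[u] = a
                        next_layer.append(u)
            layer = next_layer
        # Step (b): the component is exhausted; split it by parity of level.
        for x, n in level.items():
            (U if n % 2 == 0 else V).add(x)
    return (U, V), None


# Hypothetical examples: a 4-cycle (bipartite) and a 5-cycle (not bipartite).
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
c5 = {"v": {"u", "y"}, "u": {"v", "x"}, "x": {"u", "w"}, "w": {"x", "y"}, "y": {"w", "v"}}
print(vertex_classes_or_odd_walk(c4))
print(vertex_classes_or_odd_walk(c5))
```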

Figure 5.12: All trees are bipartite. Each vertex is labelled with a number 𝑛 that indicates
the shortest length of paths between this vertex and a fixed starting vertex (the one
labelled with 0). At each stage, we colour the new vertices connected to the previous
stage, alternating between red for the even stages, and blue for the odd stages. So those
labelled with 𝑛 + 1 are connected by an edge with those labelled with 𝑛.

Figure 5.13: A non-bipartite graph. The algorithm starts at vertex 𝑣 (labelled with 0)
and there is an edge between the two vertices 𝑥, 𝑤 in 𝐴2 . Working our way back to 𝑣,
we find that there is a closed walk with length 5 in our graph: (𝑣, 𝑢, 𝑥, 𝑤, 𝑦, 𝑣).

Appendix A

Changelog

23rd April 2021: Added section 5.4 and made several corrections to section 5.3:

• updated the definition of 𝑃(𝑣0 , … , 𝑣𝑘 ) and 𝐶(𝑣0 , … , 𝑣𝑘 ) in Definition 5.22,

• added missing paragraph to the proof of Corollary 5.25,

• changed “𝛿(𝐺 ) ⩽ 2” to “𝛿(𝐺 ) ⩾ 2” in the proof of Theorem 5.27,

• noted that each 𝑇𝑘 is a subtree of 𝑇𝑘+1 in the first paragraph of the proof of Theorem
5.29,

• noted that all spanning trees of a 𝐶𝑛 are 𝑃𝑛 ’s in Example 5.31(1),

• amended Proposition 5.26 to note that 𝑢 and 𝑣 are end-vertices of the path, and

• changed “T” to “T1”, and “n ≔ 0” to “n ≔ 1” in step 1 of the spanning tree algorithm.

Also corrected the statement of Proposition 5.18(1).

20th April 2021: Included section 5.3.

19th April 2021: Added changelog as an appendix and included sections 5.1 and 5.2.

14th April 2021: Updated introduction to chapter 5.

11th April 2021: Fixed several minor typos and changed [0, 1] to [0, 1) in proof of Theorem
4.14.

8th April 2021: Included chapter 4 and corrected several issues:

• definition of 𝑅 as a set of ordered pairs in the examples of chapter 2,

• clarified the equivalence classes and quotient set of the equivalence relation 𝐶 in
chapter 2,
• fixed various index, hyperlink, and caption issues, and
• corrected other minor issues.

1st April 2021: Included first half of chapter 3 and corrected typo in relation-property corre-
spondence in chapter 2.

27th March 2021: Included chapter 2 and corrected typo in general inclusion-exclusion for-
mula (Theorem 1.15).

24th March 2021: Initial version (chapter 1).


