Notes 5
Lecture notes
Robert Leek
[email protected]
I Combinatorics
1 Fundamentals
1.1 The pigeonhole principle
1.2 Sets
1.3 The sum rule, product rule, and inclusion-exclusion formulae
2 Binary relations
2.1 Equivalence relations and partitions
3 Finite counting
3.1 Ordered choice
3.2 Unordered choice
3.3 More examples
3.4 The binomial theorem
4 Infinite counting
4.1 Countable sets
4.2 Uncountable sets
5 Graph theory
5.1 Degrees and degree sequences
5.2 Subgraphs, paths, and cycles
5.3 Connectedness and trees
5.4 Bipartite graphs
A Changelog
Notation
Index
Part I
Combinatorics
Chapter 1
Fundamentals
• The integers are those numbers used for counting and their negatives, including 0.
• The rational numbers are those that can be written as a fraction 𝑚/𝑛, where 𝑚, 𝑛 are integers
and 𝑛 is non-zero. We call 𝑚 the numerator and 𝑛 the denominator of this fraction.
• The real numbers are those that can be written using decimal expansions, including terminating
(51/100 = 0.51), recurring (5/22 = 0.2272727 … = 0.22̅7̅), and non-recurring (2𝜋 =
6.283185 …).
This statement may seem obvious, especially if you consider a specific value of 𝑛 (for example,
the statement for 𝑛 = 2 says that if 3 pigeons are placed in 2 holes, then one of the
holes must contain at least two pigeons). However, it is still important to give a proof. The
following argument uses a style of proof called ‘proof by contradiction’. We start by assuming the
statement to be proven is false and derive a contradiction: something which is shown to be
true and false at the same time. This means that our assumption was false, so the statement
must be true instead, concluding the proof.
1
‘Pigeon’ is English for 鸽子, and pigeonholes are places where pigeons can nest; the word also refers to an open
compartment used to store papers and mail. Note we are not placing actual pigeons in holes; ‘pigeons’ are just
objects and ‘holes’ refer to some way of classifying or collecting these objects. This principle is also called
Dirichlet’s drawer principle.
Proof. Suppose for a contradiction that the principle is false. This means that there is some
positive integer 𝑛 for which 𝑛 + 1 pigeons can be placed in 𝑛 holes so that every hole contains
at most one pigeon. Fix such an 𝑛 and a placement of pigeons, and let 𝑥𝑖 be the number of
pigeons in the 𝑖th hole. So ∑_{𝑖=1}^{𝑛} 𝑥𝑖 = 𝑛 + 1, as it is the total number of pigeons, whilst 𝑥𝑖 ⩽ 1 for
each 𝑖 = 1, … , 𝑛 since each hole contains at most one pigeon. Summing these 𝑛 inequalities
we obtain
\[ n + 1 = \sum_{i=1}^{n} x_i \le \sum_{i=1}^{n} 1 = \underbrace{1 + 1 + \cdots + 1}_{n \text{ 1's}} = n, \]
a contradiction. We therefore conclude that the principle must be true.
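For small values of 𝑛 the principle can also be checked by brute force, which is a useful sanity check alongside the proof. The following is a minimal sketch; the function names are illustrative and not part of the notes. A placement is modelled as a tuple whose 𝑖th entry is the hole chosen for the 𝑖th pigeon.

```python
from itertools import product


def some_hole_has_two(placement, n_holes):
    """Return True if some hole receives at least two pigeons."""
    counts = [0] * n_holes
    for hole in placement:
        counts[hole] += 1
    return max(counts) >= 2


def pigeonhole_holds(n):
    """Check every way of placing n + 1 pigeons into n holes."""
    placements = product(range(n), repeat=n + 1)
    return all(some_hole_has_two(p, n) for p in placements)
```

Such a check only covers the values of 𝑛 that are tried, so it illustrates the statement rather than replacing the proof.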
The pigeonhole principle arises frequently as a useful tool for both mathematical and non-
mathematical arguments, as in the following examples.
Example 1.2 (Socks). There are 10 black socks and 8 white socks in my drawer. How many socks
must I take from the drawer (without looking) to guarantee that I have a matching pair among
the socks I have taken out?
Solution. Three socks. To see this, think of each sock as a ‘pigeon’, and as you take them out of the
drawer, imagine that each white sock is placed in a ‘white’ hole and each black sock is placed
in a ‘black’ hole. Thus socks in the same hole are the same colour. After three socks have been
taken, there are three pigeons (socks) in two holes, so by the pigeonhole principle (with 𝑛 = 2)
some hole contains more than one sock, so contains a matching pair.2
2
Note that the ‘10’ and ‘8’ in the question don’t contribute to the final answer here.
Example 1.3 (Summing to 10). Prove that for any six integers between 1 and 9, not necessarily
distinct, either two of these integers are equal, or two of these integers sum to 10.
Solution. Let 𝑥1 , … , 𝑥6 denote the six integers, and create ‘holes’ 𝐴, 𝐵, 𝐶, 𝐷 and 𝐸. For each
𝑗 = 1, … , 6, place the ‘pigeon’ 𝑥𝑗 into a hole according to the following rule:
\[
\text{Put } x_j \text{ in hole }
\begin{cases}
A & \text{if } x_j = 1 \text{ or } x_j = 9,\\
B & \text{if } x_j = 2 \text{ or } x_j = 8,\\
C & \text{if } x_j = 3 \text{ or } x_j = 7,\\
D & \text{if } x_j = 4 \text{ or } x_j = 6,\\
E & \text{if } x_j = 5.
\end{cases}
\]
Note that this rule places each of the six integers into one of the five holes. The pigeonhole
principle with 𝑛 = 5 therefore implies that some hole contains at least two of the integers.
Choose two integers which lie in the same hole; then these integers are either equal or sum
to 10 by definition of the holes. For example, if 𝐷 contains two integers, then either they are
equal or one is 4 and one is 6, giving a sum of 10.
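The claim in Example 1.3 concerns only finitely many cases, so it can be confirmed exhaustively. The sketch below (with illustrative names that are not from the notes) checks every selection of six integers from 1 to 9, repetition allowed, and also encodes the hole rule as min(𝑥, 10 − 𝑥), so that the holes 𝐴, … , 𝐸 correspond to the labels 1, … , 5.

```python
from itertools import combinations, combinations_with_replacement


def has_equal_or_sum_to_ten(values):
    """True if two of the values are equal or sum to 10."""
    return any(a == b or a + b == 10 for a, b in combinations(values, 2))


def hole(x):
    """Label of the hole containing x: 1 and 9 share a hole, ..., 5 is alone."""
    return min(x, 10 - x)


# Every multiset of six integers between 1 and 9.
all_selections = combinations_with_replacement(range(1, 10), 6)
claim_holds = all(has_equal_or_sum_to_ten(s) for s in all_selections)
```

The selection 1, 2, 3, 4, 5 shows that five integers are not enough, so six cannot be reduced.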
As shown in these examples, the ‘pigeons’ may be any discrete3 object, mathematical or
otherwise, and the holes can be any collections to which pigeons may be assigned, provided
that each pigeon is assigned to a unique hole.4
The next result is a more general version of the pigeonhole principle. First, we need to
introduce the ceiling notation. The subsequent proposition is left as an exercise.
Theorem 1.6 (General pigeonhole principle). Let 𝑛 and 𝑘 be positive integers. If 𝑛 ‘pigeons’ are
placed in 𝑘 ‘holes’, then some ‘hole’ contains at least ⌈𝑛/𝑘⌉ ‘pigeons’.
Proof. Suppose for a contradiction that there exist positive integers 𝑛 and 𝑘 for which it is
possible to place 𝑛 pigeons in 𝑘 holes so that every hole contains fewer than 𝑛/𝑘 pigeons. Let
𝑥𝑖 be the number of pigeons in the 𝑖th hole in such an arrangement, so ∑_{𝑖=1}^{𝑘} 𝑥𝑖 = 𝑛, and 𝑥𝑖 < 𝑛/𝑘
for each 𝑖 = 1, … , 𝑘. Summing these inequalities then gives
\[ n = \sum_{i=1}^{k} x_i < \sum_{i=1}^{k} \frac{n}{k} = \frac{n}{k} \cdot k = n, \]
a contradiction. We conclude that some hole must contain at least 𝑛/𝑘 pigeons. Since the
number of pigeons in each hole must be an integer, the number of pigeons in this hole
must be at least ⌈𝑛/𝑘⌉.5
3
That is, we can only apply the pigeonhole principle to indivisible objects (i.e. those which only occur in
integer quantities). For example, it is not true that if 3 litres of milk are poured into two buckets, some bucket
must contain at least 2 litres, because it is possible to have 1.5 litres in each bucket.
4
Formally, this says that the rule for assigning pigeons to holes is a well-defined function, and the pigeonhole
principle says that this function is not injective.
Note that the (mean) average number of pigeons per hole is 𝑛/𝑘. So the general pigeonhole
principle is equivalent to saying that at least one member of any collection of integers is greater
than or equal to the average of the collection, a fact you are probably familiar with.
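To see the bound ⌈𝑛/𝑘⌉ in action, one can place pigeons at random and compare the fullest hole with the ceiling of the average. This is only an illustrative sketch: the helper names are made up for this example, and the ceiling is computed with integer arithmetic to avoid floating-point rounding.

```python
import random


def ceil_div(n, k):
    """Ceiling of n / k for positive integers, using integer arithmetic."""
    return -(-n // k)


def fullest_hole(n, k, rng):
    """Place n pigeons into k holes at random; return the largest count."""
    counts = [0] * k
    for _ in range(n):
        counts[rng.randrange(k)] += 1
    return max(counts)


rng = random.Random(0)  # fixed seed so that repeated runs behave identically
bound_respected = all(
    fullest_hole(n, k, rng) >= ceil_div(n, k)
    for n in range(1, 31)
    for k in range(1, 9)
)
```

However the pigeons happen to fall, the comparison succeeds, exactly as Theorem 1.6 guarantees.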
Example 1.7. A hand in the card game Bridge consists of 13 cards from a standard 52-card
deck of cards.6 Prove that any such hand must contain at least four cards of the same suit.
Solution. Divide the 13 cards of the hand into four piles, one for each suit (these piles are
the ‘holes’ and the cards are the ‘pigeons’). Since there are 13 cards and 4 piles, by the general
pigeonhole principle (applied with 𝑛 = 13 and 𝑘 = 4) some pile must contain at least ⌈13/4⌉ =
⌈3.25⌉ = 4 cards.
Example 1.8. For any nine distinct points in a square of side length one, there are three points
which form the vertices of a triangle whose longest side has length at most 1/√2 (here we
consider a straight line to be a ‘flat’ triangle).
Solution. Divide the square into four quarters, each of which is a square of side length 1/2 (see
Figure 1.1). So each of the nine points lies in some quarter (if a point is on a boundary then
choose one of the corresponding quarters arbitrarily and say that it lies in that quarter). By the
general pigeonhole principle some quarter must then contain at least three of the points (since
⌈9/4⌉ = 3). Choose three points which lie in the same quarter and let 𝑇 be the triangle with these
three points as vertices. By Pythagoras’ theorem7, the maximum distance between points in
the same quarter is √((1/2)² + (1/2)²) = 1/√2, so each side of 𝑇 has length at most 1/√2, as required.
5
If you find this final step confusing, think of this: if there are at least 4.5 people in a room, then there must be
at least 5 = ⌈4.5⌉ people in the room, since you cannot have half a person. This is just the same argument applied
to 𝑛/𝑘 rather than 4.5.
6
Each card has one of 13 ranks [A(ce), 2, 3, 4, 5, 6, 7, 8, 9, 10, J(ack), Q(ueen), K(ing)] and one of 4 suits [♣
(clubs), ♦ (diamonds), ♥ (hearts), ♠ (spades)]. Observe that 13 × 4 = 52.
7
This theorem states that the square of the length of the longest side of a right-angled triangle (i.e. a triangle
with one angle of 90° = 𝜋/2 radians) is equal to the sum of the squares of the lengths of the other two sides;
symbolically, 𝑎² + 𝑏² = 𝑐². Pythagoras was an ancient Greek philosopher who was alive approximately 570–495 BCE.
Although the theorem is attributed to him, he was certainly not the first to have discovered it. You are likely to
recognise this theorem by another name: 勾股定理 (the Gougu theorem).
1.2 Sets
A set 𝐴 is a collection of objects (e.g. integers, lines in the plane, functions, etc.), which we call
elements or members of 𝐴. Alternatively, we say that 𝐴 contains the object 𝑥 if 𝑥 is an element
of 𝐴, written 𝑥 ∈ 𝐴. If 𝑥 is not a member of 𝐴, we write 𝑥 ∉ 𝐴. Sometimes you will see a set
referred to as a class or collection, especially when the members of the set are themselves sets.
You should treat the words ‘class’, ‘set’, and ‘collection’ as meaning the same thing.
Some important examples of sets are ℤ (the set of integers), ℚ (the set of rational numbers),
and ℝ (the set of real numbers). As there are different conventions on the status of 0 as a
‘natural number’ or not, this course will not refer to them; instead, I will use ℕ0 to denote the
set of non-negative integers, and ℕ+ to denote the set of positive integers, and ℕ without the
appropriate sub/superscript will not be defined.
There are several ways to define a set:
𝐴 ≔ {2, 3, 5, 7}.
(2). We could restrict the elements of a set by imposing conditions on the elements; if 𝑋 is a
set and 𝜑(𝑥) is a property of 𝑥, we can form the set 𝑌 ≔ {𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}.8 The members
of 𝑌 are precisely the elements 𝑥 of 𝑋 (written 𝑥 ∈ 𝑋) that satisfy 𝜑(𝑥).9 For example,
𝐵 ≔ {𝑥 ∈ ℕ+ ∶ 𝑥 ⩽ 10 and 𝑥 is prime}.
(3). We can form new sets using operations, such as union and intersection. These will be
introduced in the following pages. For example, 𝐶 ≔ {2, 3} ∪ {5, 7}.
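The three ways of defining a set correspond closely to how sets are built in many programming languages. As an informal sketch (the helper is_prime is written just for this example), the sets 𝐴, 𝐵 and 𝐶 above can be written in Python as follows; all three turn out to be the same set.

```python
def is_prime(x):
    """Trial division; adequate for the small values used here."""
    return x >= 2 and all(x % d != 0 for d in range(2, int(x ** 0.5) + 1))


# (1) Listing the elements.
A = {2, 3, 5, 7}

# (2) Restricting a larger set by a property: x in {1, ..., 10} with x prime.
B = {x for x in range(1, 11) if is_prime(x)}

# (3) Combining existing sets with an operation (here, union).
C = {2, 3} | {5, 7}
```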
For any set 𝐴 and any object 𝑥, either 𝑥 is an element of 𝐴, which we denote by 𝑥 ∈ 𝐴, or 𝑥 is
not an element of 𝐴, which we denote by 𝑥 ∉ 𝐴.
If 𝐴 and 𝐵 are sets such that every element of 𝐴 is also an element of 𝐵, we write 𝐴 ⊆ 𝐵 and
say that 𝐴 is a subset of 𝐵. The Extension Axiom states that two sets are equal if and only if they
have the same elements, or in other words 𝐴 ⊆ 𝐵 and 𝐵 ⊆ 𝐴. 𝐴 is a proper, or strict, subset of 𝐵
if 𝐴 ⊆ 𝐵 and 𝐴 ≠ 𝐵 (so 𝐵 is not a subset of 𝐴); we write 𝐴 ⫋ 𝐵 to denote this.
We say that sets 𝐴 and 𝐵 are distinct if they are not equal; that is, if 𝐴 ≠ 𝐵, and say that 𝐴
and 𝐵 are disjoint if 𝐴 and 𝐵 have no elements in common. So, for example, the sets {2, 3} and
{2, 4} are distinct but not disjoint.10 If 𝐴 and 𝐵 do have an element in common then we say
that 𝐴 intersects 𝐵.
If we fix a set 𝑋, then we have two operations to convert between properties 𝜑(𝑥) of ele-
ments of 𝑋, and subsets of 𝑋:
Properties ⟷ Subsets
𝜑(𝑥) ⟼ {𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}
𝑥∈𝐴 ⟻ 𝐴
Note that by the Extension Axiom, if we take a subset 𝐴 ⊆ 𝑋, turn it into a property and back
into a subset using these operations, we get the original set back: {𝑥 ∈ 𝑋 ∶ 𝑥 ∈ 𝐴} = 𝐴. Let us
see what happens when we do the same thing starting with a property 𝜑(𝑥); for any 𝑥 ∈ 𝑋,
So the two properties may not be ‘equal’, because they could be written or described differ-
ently, but they are equivalent. Therefore sets can simplify mathematical arguments by al-
lowing us to collect all elements 𝑥 ∈ 𝑋 satisfying a certain property 𝜑(𝑥) into a single object
{𝑥 ∈ 𝑋 ∶ 𝜑(𝑥)}.
Some other important consequences of the Extension Axiom are given below:
• It does not matter in which order the elements of a set are written; so for example
• We do not count the number of times an object appears in a set. For example
• There is exactly one set with no elements. Indeed, if two sets both have no elements,
then they have the same elements as each other, so are the same set! We call this set the
empty set, and denote it by ∅. We say that a set 𝐴 is non-empty if 𝐴 is not the empty set
(that is, 𝐴 has at least one element).
We define the following operations on sets, which you will also see in other courses. These
allow us to construct new sets from old.
𝐴 ∪ 𝐵 = {𝑥 ∶ 𝑥 ∈ 𝐴 or 𝑥 ∈ 𝐵}.
For example, {1, 2} ∪ {2, 3, 4} = {1, 2, 3, 4}. Similarly, for sets 𝐴1 , … , 𝐴𝑟 , the union is defined
by
\[ \bigcup_{i=1}^{r} A_i = A_1 \cup \cdots \cup A_r = \{x : x \in A_1 \text{ or } x \in A_2 \text{ or } \ldots \text{ or } x \in A_r\}. \]
The Venn diagrams in Figure 1.2 give visual illustrations of the union.
𝐴 ∩ 𝐵 = {𝑥 ∶ 𝑥 ∈ 𝐴 and 𝑥 ∈ 𝐵}.
For example, {1, 2} ∩ {2, 3, 4} = {2}. Similarly, for sets 𝐴1 , … , 𝐴𝑟 , the intersection is defined
by
\[ \bigcap_{i=1}^{r} A_i = A_1 \cap \cdots \cap A_r = \{x : x \in A_1 \text{ and } x \in A_2 \text{ and } \ldots \text{ and } x \in A_r\}. \]
11
You should take careful note of this point, as appearances can deceive: the set {1, 2, 2} has two elements,
whilst the set {𝑥1 , … , 𝑥𝑛 } has 𝑛 elements if and only if 𝑥1 , … , 𝑥𝑛 are distinct.
[Figures: Venn diagrams illustrating 𝐴 ∪ 𝐵, 𝐴 ∪ 𝐵 ∪ 𝐶, 𝐴 ∩ 𝐵, 𝐴 ∩ 𝐵 ∩ 𝐶, 𝐴 ⧵ 𝐵, and 𝐴𝑐 within a universal set 𝑈.]
𝐴 ⧵ 𝐵 = {𝑥 ∶ 𝑥 ∈ 𝐴 and 𝑥 ∉ 𝐵}.
For example, {1, 2} ⧵ {2, 3, 4} = {1}, and {2, 3, 4} ⧵ {1, 2} = {3, 4}.13
• In the context of a ‘ground set’ or ‘universal set’ 𝑈, the complement of a set 𝐴 is defined
by 𝐴𝑐 = 𝑈 ⧵ 𝐴. For example, if 𝑈 = {1, 2, 3, 4, 5} and 𝐴 = {1, 2, 4}, then 𝐴𝑐 = {3, 5}. We need
to first establish the universal set before taking a complement, and the complement of
a set depends on which ground set is used. See Figure 1.4 for Venn diagrams depicting
the set difference and complement operations.
That is, 𝐴 × 𝐵 is the set of ordered pairs whose first co-ordinate is a member of 𝐴 and
whose second co-ordinate is a member of 𝐵. So, for example
{1, 2} × {2, 3, 4} = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}.
See Figure 1.5 for a visual description of the Cartesian product operation. Similarly, the
12
Some sources write 𝐴 − 𝐵 instead of 𝐴 ⧵ 𝐵, but we will use ⧵ throughout this course to avoid confusion with
ordinary subtraction.
13
Note from these examples that difference is not symmetric, unlike union and intersection. The order matters,
just like with subtraction of numbers: 1 − 2 = −1 ≠ 1 = 2 − 1.
[Figure 1.5: the Cartesian product 𝐴 × 𝐵 for 𝐴 = {𝑎, 𝑏, 𝑐, 𝑑} and 𝐵 = {𝑥, 𝑦}, drawn as a grid of the ordered pairs (𝑎, 𝑥), … , (𝑑, 𝑦).]
\[ A^n = \underbrace{A \times A \times \cdots \times A}_{n \text{ } A\text{'s}}. \]
This should match the definitions of ℝ², ℝ³ etc. with which you are already familiar.
𝒫(𝐴) = {𝑋 ∶ 𝑋 ⊆ 𝐴}.
So 𝒫(𝐴) is a set whose elements are the subsets of 𝐴. For example, if 𝐴 = {1, 2, 3} then
𝒫(𝐴) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. Note that for any set 𝐴, the power set
𝒫(𝐴) contains both ∅ and 𝐴 as elements. This shows that sets can be elements of other
sets.
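The worked examples for these operations can be reproduced directly with Python's built-in set type. The sketch below is only an illustration: power_set is a helper written for this example, and it returns frozensets because Python's ordinary sets cannot themselves be elements of a set.

```python
from itertools import combinations, product

A, B = {1, 2}, {2, 3, 4}

union = A | B               # {1, 2, 3, 4}
intersection = A & B        # {2}
difference = A - B          # {1}; note that B - A is {3, 4}

U = {1, 2, 3, 4, 5}         # the chosen universal set
complement = U - {1, 2, 4}  # complement of {1, 2, 4} within U

cartesian = set(product(A, B))  # ordered pairs (a, b) with a in A, b in B


def power_set(s):
    """All subsets of s, each represented as a frozenset."""
    items = sorted(s)
    return {
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    }
```

For instance, power_set({1, 2, 3}) contains both the empty set and {1, 2, 3} itself, matching the remark above.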
Brackets are used to indicate the order in which to perform operations. So, for example, (𝐴 ⧵
𝐵)⧵𝐶 means first subtract 𝐵 from 𝐴 and then subtract 𝐶 from the result, whilst 𝐴⧵(𝐵 ⧵𝐶 ) means
first subtract 𝐶 from 𝐵, and then subtract the result from 𝐴.15
14
An ordered 𝑟-tuple is a sequence of 𝑟 objects, so an ordered 2-tuple is an ordered pair, an ordered 3-tuple is
an ordered triple, and so on. Make sure that you don’t confuse ordered pairs or 𝑟-tuples with sets; unlike for sets,
the order of an ordered pair or 𝑟-tuple matters, so e.g. (1, 2) ≠ (2, 1). Also, elements may be repeated in a pair or
𝑟-tuple, so e.g. (3, 3) is a valid pair.
15
Exercise: prove that these two expressions, (𝐴 ⧵ 𝐵) ⧵ 𝐶 and 𝐴 ⧵ (𝐵 ⧵ 𝐶 ), are not the same in general.
A major part of this course will be about counting the sizes of sets. Our first ‘counting’
result is the following theorem on the size of the power set of a set 𝐴.
We say that a set 𝐴 is finite if there is a non-negative integer 𝑛 such that 𝐴 has 𝑛 elements,
otherwise 𝐴 is infinite. ℕ0 , ℕ+ and ℤ are important examples of infinite sets. If 𝐴 is a finite set,
then the size of 𝐴, denoted |𝐴|, is the number of elements in 𝐴.16 The size of 𝐴 may also be
called the order of 𝐴 or the cardinality of 𝐴.
Theorem 1.9. If 𝐴 is a set with |𝐴| = 𝑛, then |𝒫(𝐴)| = 2^𝑛. Equivalently, a set with 𝑛 elements has
2^𝑛 subsets.
Proof. Let 𝐴 = {𝑥1 , … , 𝑥𝑛 }. We can form a subset 𝐵 ⊆ 𝐴 by going through the elements of 𝐴
and deciding whether each element should be in 𝐵 or not. As there are two choices for each
element (in 𝐵 or not in 𝐵), there are
\[ \underbrace{2 \times 2 \times \cdots \times 2}_{n \text{ 2's}} = 2^n \]
possible ways to choose 𝐵. This gives 2^𝑛 subsets of 𝐴, which are all distinct; if we make a
different choice at element 𝑥𝑗 then the resulting subsets differ in element 𝑥𝑗 .17
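The counting argument in this proof translates into a short procedure: process the elements one at a time, and for each element keep every subset built so far both without and with that element. The following sketch (the function name is illustrative) doubles the list of subsets once per element, mirroring the product 2 × 2 × ⋯ × 2.

```python
def subsets_by_choices(elements):
    """Build every subset by making an in/out choice for each element."""
    subsets = [frozenset()]
    for x in elements:
        # Each existing subset is kept as it is (x left out)
        # and also extended by x (x put in), doubling the list.
        subsets = subsets + [s | {x} for s in subsets]
    return subsets
```

Calling subsets_by_choices(range(3)) produces 8 subsets, and converting the result to a set confirms that no subset is produced twice.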
Sets may contain any object as a member, including other sets (for example, we have just
seen that the power set 𝒫(𝐴) of a set 𝐴 is a set). It is crucial to remember that membership or
elementhood (i.e. being an element of a set) is not transitive (that is, it is not true that just
because 𝐴 ∈ 𝐵 and 𝐵 ∈ 𝐶 we must have 𝐴 ∈ 𝐶). So, for example 9 is not a member of the set
{4, {5, 9}}: this set has two elements, namely the integer 4 and the set {5, 9}. Similarly, the set
{ℕ0 } has one element, namely the set ℕ0 , even though ℕ0 itself is a set with infinitely many
elements. Set operations must also be treated carefully – try justifying the following equality:
16
Note that this definition only applies to finite sets, so our later results on the sizes of sets specify that the sets
in question are finite.
17
We will see similar proofs later in the course. You can alternatively prove this by induction (try this as an
exercise).
Theorem 1.10 (Sum rule). Let 𝐴1 , …, 𝐴𝑛 be a finite list of pairwise-disjoint finite sets – i.e. for all
distinct 𝑖, 𝑗 = 1, … , 𝑛, 𝐴𝑖 ∩ 𝐴𝑗 = ∅. Then |⋃_{𝑖=1}^{𝑛} 𝐴𝑖| = ∑_{𝑖=1}^{𝑛} |𝐴𝑖|.
Proof. For each 𝑖 = 1, … , 𝑛, let 𝑚𝑖 ≔ |𝐴𝑖 | and write 𝐴𝑖 = {𝑎_{𝑖,1}, … , 𝑎_{𝑖,𝑚𝑖}}. Then all the elements
𝑎_{1,1}, … , 𝑎_{1,𝑚1}, 𝑎_{2,1}, … , 𝑎_{2,𝑚2}, … , 𝑎_{𝑛,1}, … , 𝑎_{𝑛,𝑚𝑛} are distinct, since 𝐴1 , …, 𝐴𝑛 are pairwise-disjoint.
Therefore ⋃_{𝑖=1}^{𝑛} 𝐴𝑖 = {𝑎_{𝑖,𝑗} ∶ 𝑖 = 1, … , 𝑛, 𝑗 = 1, … , 𝑚𝑖} has ∑_{𝑖=1}^{𝑛} 𝑚𝑖 = ∑_{𝑖=1}^{𝑛} |𝐴𝑖| elements.
If sets 𝐴 and 𝐵 are not disjoint then we cannot apply the sum rule to find |𝐴 ∪ 𝐵|. Instead
we can calculate this by the inclusion-exclusion formula, provided that we know the size of
|𝐴 ∩ 𝐵|. The form for two sets is the following.
Theorem 1.11 (Inclusion-exclusion formula for two sets). Suppose that 𝐴 and 𝐵 are finite sets.
Then
|𝐴 ∪ 𝐵| = |𝐴| + |𝐵| − |𝐴 ∩ 𝐵|.
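Proof. Note that 𝐴 and 𝐵 ⧵ 𝐴 are disjoint sets whose union is 𝐴 ∪ (𝐵 ⧵ 𝐴) = 𝐴 ∪ 𝐵.18 So by
Theorem 1.10 applied to the two sets 𝐴 and 𝐵 ⧵ 𝐴 we have |𝐴 ∪ 𝐵| = |𝐴| + |𝐵 ⧵ 𝐴|. Similarly,
𝐵 ⧵ 𝐴 and 𝐴 ∩ 𝐵 are disjoint sets whose union is 𝐵, so |𝐵| = |𝐵 ⧵ 𝐴| + |𝐴 ∩ 𝐵|. Substituting
|𝐵 ⧵ 𝐴| = |𝐵| − |𝐴 ∩ 𝐵| into the first equation gives |𝐴 ∪ 𝐵| = |𝐴| + |𝐵| − |𝐴 ∩ 𝐵|.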
Example 1.12. If WeChat tells us that I have 155 friends, you have 274 friends, and we have 25
friends in common, how many friends do we have between us?
Solution. We can describe this scenario in set terms: let 𝐴 be the set of my friends and 𝐵 be
the set of your friends. Then we are told that |𝐴| = 155 and |𝐵| = 274. Also, 𝐴 ∩ 𝐵 is the set of
friends we have in common, so |𝐴∩𝐵| = 25. The set of people who are either my friend or your
friend is 𝐴 ∪ 𝐵, so applying the inclusion-exclusion formula we find that the number of such
people is
|𝐴 ∪ 𝐵| = |𝐴| + |𝐵| − |𝐴 ∩ 𝐵| = 155 + 274 − 25 = 404.
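The arithmetic in Example 1.12 can be mirrored with concrete sets. In the sketch below the friends are represented by made-up integer labels, chosen only so that the two sets have the sizes given in the example; the names mine and yours are likewise illustrative.

```python
# Hypothetical labels: friends 130..154 are the 25 friends in common.
mine = set(range(0, 155))      # 155 friends
yours = set(range(130, 404))   # 274 friends


def union_size(a, b):
    """Size of the union via the inclusion-exclusion formula for two sets."""
    return len(a) + len(b) - len(a & b)
```

Here union_size(mine, yours) agrees with len(mine | yours), the size obtained by forming the union directly.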
To find the size of the union of three sets we sum the set sizes, then subtract the sizes of
the pairwise intersections, then add the three-way intersection, as follows.
Theorem 1.13 (Inclusion-exclusion formula for three sets). Suppose that 𝐴, 𝐵 and 𝐶 are finite
sets. Then
|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴| + |𝐵| + |𝐶 | − |𝐴 ∩ 𝐵| − |𝐴 ∩ 𝐶 | − |𝐵 ∩ 𝐶 | + |𝐴 ∩ 𝐵 ∩ 𝐶 |.
To prove this theorem we apply the inclusion-exclusion formula for two sets three times,
and also make use of a distributivity law which you should be familiar with from other courses:
for any sets 𝐴, 𝐵 and 𝐶 we have 𝐴 ∩ (𝐵 ∪ 𝐶 ) = (𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶 ).
Proof. Let 𝑌 = 𝐵 ∪ 𝐶. Then the inclusion-exclusion formula for two sets implies that
|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴 ∪ 𝑌 | = |𝐴| + |𝑌 | − |𝐴 ∩ 𝑌 |.
However, the distributivity law and inclusion-exclusion formula for two sets imply that
|𝐴 ∩ 𝑌 | = |𝐴 ∩ (𝐵 ∪ 𝐶 )|
= |(𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶 )|
= |𝐴 ∩ 𝐵| + |𝐴 ∩ 𝐶 | − |(𝐴 ∩ 𝐵) ∩ (𝐴 ∩ 𝐶 )|
= |𝐴 ∩ 𝐵| + |𝐴 ∩ 𝐶 | − |𝐴 ∩ 𝐵 ∩ 𝐶 |,
and a further application of the formula for two sets gives
|𝑌 | = |𝐵 ∪ 𝐶 | = |𝐵| + |𝐶 | − |𝐵 ∩ 𝐶 |.
Substituting these expressions for |𝑌 | and |𝐴 ∩ 𝑌 | into the first equation gives the result.
One significant application of this formula is in counting integers which are divisible by
one of several specified integers, as in the following example.
Example 1.14. How many integers between 1 and 1,000 inclusive are divisible by at least one
of the integers 2, 3 and 5?
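Solution. Let 𝐴, 𝐵 and 𝐶 be the sets of integers between 1 and 1,000 inclusive which are divisible by 2, 3 and 5
respectively. Then 𝐴 ∩ 𝐵, 𝐴 ∩ 𝐶, 𝐵 ∩ 𝐶 and 𝐴 ∩ 𝐵 ∩ 𝐶 consist of those integers in this range which are divisible by
6, 10, 15 and 30 respectively.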
For any positive integers 𝑚 and 𝑟, the number of integers between 1 and 𝑚 which are divisible
by 𝑟 is equal to ⌊𝑚/𝑟⌋. To see this, note that for each positive integer 𝑛, if 𝑟 divides 𝑛 and 𝑛 ⩽ 𝑚
then 𝑛 = (𝑛/𝑟) ⋅ 𝑟 ⩽ 𝑚 and 1 ⩽ 𝑛/𝑟 ⩽ 𝑚/𝑟. Therefore
\[ \left|\{n \in \mathbb{Z} : 1 \le n \le m \text{ and } n \text{ is divisible by } r\}\right| = \left|\left\{kr : k \in \mathbb{Z},\ 1 \le k \le \frac{m}{r}\right\}\right| = \left\lfloor \frac{m}{r} \right\rfloor. \]
So |𝐴| = 500, |𝐵| = 333, |𝐶 | = 200, |𝐴 ∩ 𝐵| = 166, |𝐴 ∩ 𝐶 | = 100, |𝐵 ∩ 𝐶 | = 66, |𝐴 ∩ 𝐵 ∩ 𝐶 | = 33.
The set of integers divisible by at least one of 2, 3 and 5 is 𝐴 ∪ 𝐵 ∪ 𝐶, so by the inclusion-
exclusion formula the number of such integers is
|𝐴 ∪ 𝐵 ∪ 𝐶 | = |𝐴| + |𝐵| + |𝐶 | − |𝐴 ∩ 𝐵| − |𝐴 ∩ 𝐶 | − |𝐵 ∩ 𝐶 | + |𝐴 ∩ 𝐵 ∩ 𝐶 |
= 500 + 333 + 200 − 166 − 100 − 66 + 33
= 734.
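Both halves of this solution are easy to confirm by computation: the floor formula gives each set size, and a direct scan of the integers from 1 to 1,000 gives the size of the union. The sketch below uses illustrative names; in Python, integer division m // r computes ⌊𝑚/𝑟⌋ for positive integers.

```python
def count_multiples(m, r):
    """Number of integers between 1 and m inclusive divisible by r."""
    return m // r


m = 1000

# Inclusion-exclusion for three sets, using the floor formula for each term.
by_formula = (
    count_multiples(m, 2) + count_multiples(m, 3) + count_multiples(m, 5)
    - count_multiples(m, 6) - count_multiples(m, 10) - count_multiples(m, 15)
    + count_multiples(m, 30)
)

# Direct count, testing each integer in turn.
by_brute_force = sum(
    1 for n in range(1, m + 1) if n % 2 == 0 or n % 3 == 0 or n % 5 == 0
)
```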
The versions of this formula for two and three sets hint at the general formula which ap-
plies for any number of sets: we add the sizes of the given sets, then subtract the sizes of their
pairwise intersections. Then we add back the sizes of the three-way intersections, before sub-
tracting the sizes of the four-way intersections, and so forth until all intersections have been
included in the calculation. We leave the proof as an exercise.
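Theorem 1.15 (General inclusion-exclusion formula).19 Suppose that 𝐴1 , 𝐴2 , … , 𝐴𝑟 are finite sets.
Then
\[
\left| \bigcup_{i=1}^{r} A_i \right|
= \sum_{\substack{I \subseteq \{1,\ldots,r\},\\ I \neq \emptyset}} (-1)^{|I|+1} \left| \bigcap_{i \in I} A_i \right|
= \sum_{k=1}^{r} (-1)^{k+1} \sum_{\substack{I \subseteq \{1,\ldots,r\},\\ |I| = k}} \left| \bigcap_{i \in I} A_i \right|.
\]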
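The right-hand expression of the general formula can be evaluated mechanically: loop over 𝑘 = 1, … , 𝑟, and for each 𝑘 over all ways of choosing 𝑘 of the sets. The following is a minimal sketch under that reading (the function name is an assumption for this example); it is exponential in 𝑟, so it is meant for checking small cases, not as an efficient method.

```python
from itertools import combinations


def union_size(sets):
    """Size of the union of finite sets via general inclusion-exclusion."""
    r = len(sets)
    total = 0
    for k in range(1, r + 1):
        sign = (-1) ** (k + 1)  # +1 for odd k, -1 for even k
        for chosen in combinations(sets, k):
            # Intersection of the k chosen sets.
            common = set(chosen[0]).intersection(*chosen[1:])
            total += sign * len(common)
    return total


multiples = [set(range(r, 1001, r)) for r in (2, 3, 5)]
```

For the three sets in multiples this reproduces the count from Example 1.14, and in general the result should agree with the size of the union computed directly.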
𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟 = {(𝑎1 , 𝑎2 , … , 𝑎𝑟 ) ∶ 𝑎1 ∈ 𝐴1 , 𝑎2 ∈ 𝐴2 , … , 𝑎𝑟 ∈ 𝐴𝑟 },
Using this natural correspondence, the sets 𝐴 × 𝐵 × 𝐶 and (𝐴 × 𝐵) × 𝐶 are essentially equivalent
for most purposes (in particular, they have the same size); a similar correspondence shows
that the same is true of 𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟−1 × 𝐴𝑟 and (𝐴1 × 𝐴2 × ⋯ × 𝐴𝑟−1 ) × 𝐴𝑟 .
The product rule tells us the size of the Cartesian product of a collection of sets. Note that
unlike for the sum rule, we do not require that the sets are disjoint.
Theorem 1.16 (Product rule for two sets). If 𝐴 and 𝐵 are finite sets, then |𝐴 × 𝐵| = |𝐴| ⋅ |𝐵|.
Proof. Let 𝐴 = {𝑎1 , … , 𝑎𝑚 } and 𝐵 = {𝑏1 , … , 𝑏𝑛 }. So |𝐴| = 𝑚 and |𝐵| = 𝑛. We can list the elements
of 𝐴 × 𝐵 as
\[
\left\{
\begin{array}{cccc}
(a_1, b_1), & (a_1, b_2), & \ldots & (a_1, b_n), \\
(a_2, b_1), & (a_2, b_2), & \ldots & (a_2, b_n), \\
\vdots & \vdots & \ddots & \vdots \\
(a_m, b_1), & (a_m, b_2), & \ldots & (a_m, b_n)
\end{array}
\right\}
\]
Altogether there are 𝑚 rows and 𝑛 columns in the table, so it has 𝑚 ⋅ 𝑛 = |𝐴||𝐵| entries; since
these are all distinct, we have |𝐴 × 𝐵| = |𝐴||𝐵|.
An inductive argument based on the product rule for two sets yields a product rule for any
number of finite sets.
19
Notice the range of summation (underneath Σ) in the middle expression: we have two conditions, 𝐼 ⊆
{1, … , 𝑟} and 𝐼 ≠ ∅. So for every non-empty subset 𝐼 ⊆ {1, … , 𝑟}, we calculate (−1)^{|𝐼|+1} |⋂_{𝑖∈𝐼} 𝐴𝑖| and add these
results together. The same principle applies to general products (Π), unions (⋃), and intersections (⋂); e.g. using
notation from the previous section, ⋂_{𝑖∈{1,…,𝑛}} 𝐴𝑖 = ⋂_{𝑖=1}^{𝑛} 𝐴𝑖 .
Theorem 1.17 (Product rule for 𝑟 sets). For any positive integer 𝑟 and any finite sets 𝐴1 , 𝐴2 , … , 𝐴𝑟
we have
\[ \left| \prod_{i=1}^{r} A_i \right| = |A_1 \times \cdots \times A_r| = |A_1| \times \cdots \times |A_r| = \prod_{i=1}^{r} |A_i|. \]
Proof. We proceed by induction on 𝑟.20 The base case 𝑟 = 1 is then a tautology; it states simply
that |𝐴1 | = |𝐴1 |, which is true.
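Now suppose that the statement holds for 𝑟 = 𝑘 for some 𝑘 ∈ ℕ+ , that is, that for any finite
sets 𝐴1 , … , 𝐴𝑘 we have |𝐴1 × ⋯ × 𝐴𝑘 | = ∏_{𝑖=1}^{𝑘} |𝐴𝑖 |. Then, given any finite sets 𝐴1 , … , 𝐴𝑘+1 , we
have
\[ |A_1 \times \cdots \times A_{k+1}| = |(A_1 \times \cdots \times A_k) \times A_{k+1}| = |A_1 \times \cdots \times A_k| \cdot |A_{k+1}| = \prod_{i=1}^{k+1} |A_i|, \]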
where the first equality holds by the correspondence discussed earlier, the second equality
holds by the product rule for two sets applied to the sets 𝐴1 × ⋯ × 𝐴𝑘 and 𝐴𝑘+1 , and the
final equality holds by the inductive hypothesis. So the statement holds for 𝑟 = 𝑘 + 1 also.
Having proved that the statement holds for 𝑟 = 1, and that if it holds for 𝑟 = 𝑘 then it also
holds for 𝑟 = 𝑘 + 1, we conclude by the Principle of Mathematical Induction that it holds for
every positive integer, as required.
Solution. The prime factorisation of 1,200 is 2⁴ ⋅ 3 ⋅ 5², so the factors of 1,200 are the integers of
the form 2^𝑎 ⋅ 3^𝑏 ⋅ 5^𝑐 for 0 ⩽ 𝑎 ⩽ 4, 0 ⩽ 𝑏 ⩽ 1 and 0 ⩽ 𝑐 ⩽ 2.21 That is, the factors are 2^𝑎 ⋅ 3^𝑏 ⋅ 5^𝑐 for
(𝑎, 𝑏, 𝑐) ∈ 𝐴 × 𝐵 × 𝐶, where 𝐴 = {0, 1, 2, 3, 4}, 𝐵 = {0, 1} and 𝐶 = {0, 1, 2}. So by the product rule,
the number of factors is
|𝐴 × 𝐵 × 𝐶 | = |𝐴||𝐵||𝐶 | = 5 ⋅ 2 ⋅ 3 = 30.
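This count can be checked in two independent ways: by generating the integers 2^𝑎 ⋅ 3^𝑏 ⋅ 5^𝑐 from the triples in 𝐴 × 𝐵 × 𝐶, and by testing every integer from 1 to 1,200 for divisibility. The sketch below does both; the variable names are chosen for this illustration only.

```python
from itertools import product

# Exponent triples (a, b, c) in A x B x C.
triples = list(product(range(5), range(2), range(3)))

# The factor corresponding to each triple.
factors_from_triples = {2 ** a * 3 ** b * 5 ** c for a, b, c in triples}

# Direct search for the positive factors of 1200.
factors_by_search = {d for d in range(1, 1201) if 1200 % d == 0}
```

That the set built from the triples has as many elements as there are triples reflects the uniqueness of prime factorisation mentioned in the footnote.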
20
Exercise: prove Theorem 1.17 by counting choices as in the proof of Theorem 1.9.
21
This is a consequence of uniqueness of prime factorisation, which also implies that the integers of this form
are all distinct. We will formally prove this later in the course (see the fundamental theorem of arithmetic).
Chapter 2
Binary relations
Usually, we think of a relation as a property that we can apply to a pair of objects. How-
ever, just as we showed with sets, if we fix a set 𝐴 then there is a correspondence between the
properties applied to pairs (𝑎, 𝑏) of elements of 𝐴, and binary relations on 𝐴:
So provided we are only interested in whether a relation holds for certain pairs or not, this
mathematical definition of a relation as a type of set is very useful. As an example, even though
we do not usually think of the relation ⩾ between real numbers as a set, by this correspondence
we could view it as the set of pairs (𝑥, 𝑦) with 𝑥 ⩾ 𝑦. Moreover, we could even draw this relation
as a subset of ℝ2 (see Figure 2.1).
(1). The relation 𝑅 ≔ {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)} on the set {1, 2, 3}. See Table 2.1.
(3). 𝑆 on ℕ0 given by 𝑥 𝑆 𝑦 if |𝑥 − 𝑦| ⩽ 2.
(4). The relation 𝐶 on the set of all people, where 𝑥 𝐶 𝑦 if 𝑥 and 𝑦 live in the same country.
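[Figure 2.1: the relation ⩾ drawn as a subset of ℝ².]
Table 2.1: the relation 𝑅 on {1, 2, 3}; the entry in row 𝑎 and column 𝑏 is ✓ if 𝑎 𝑅 𝑏 and ✗ otherwise.
𝑅 | 1 | 2 | 3
1 | ✗ | ✓ | ✓
2 | ✗ | ✓ | ✓
3 | ✗ | ✗ | ✓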
(5). For a fixed positive integer 𝑚, ≡𝑚 is the relation on ℤ given by 𝑥 ≡𝑚 𝑦 if and only if
𝑥 − 𝑦 = 𝑚𝑘 for some integer 𝑘.1
Notice that these properties require checking for all elements of 𝐴. Thus, to show that a
particular one of these properties is false, we are only required to find one counterexample. Let
us see which properties each of the relations in the examples above satisfies.
Example 2.4.
(1). The relation 𝑅 ≔ {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)} on the set {1, 2, 3} is neither reflexive
(because (1, 1) ∉ 𝑅) nor symmetric (because 1 𝑅 2 but (2, 1) ∉ 𝑅). This can be easily seen from the
table: if we look at the diagonal entries from top-left to bottom-right, a reflexive relation
should have all ticks along this diagonal. Moreover, the table of a symmetric relation would be
symmetric about this diagonal line.
Transitivity is in general trickier to check. Let 𝑎, 𝑏, 𝑐 ∈ {1, 2, 3} be given such that 𝑎 𝑅 𝑏 and 𝑏 𝑅 𝑐.
First, note that if either 𝑎 = 𝑏 or 𝑏 = 𝑐 then 𝑎 𝑅 𝑐. So let us assume that 𝑎 ≠ 𝑏 and 𝑏 ≠ 𝑐.
Then since 𝑎 𝑅 𝑏, either 𝑏 = 2 or 𝑏 = 3. However, if 𝑏 = 3 then by 𝑏 𝑅 𝑐 we get that 𝑐 = 3 = 𝑏,
which is a contradiction. Thus, 𝑏 = 2 and so
𝑎 𝑅 𝑏, 𝑎 ≠ 𝑏 ⇒ 𝑎 = 1,
𝑏 𝑅 𝑐, 𝑏 ≠ 𝑐 ⇒ 𝑐 = 3.
Since 1 𝑅 3, we conclude that 𝑎 𝑅 𝑐, so 𝑅 is transitive.
(2). < on ℤ is not reflexive since, for example, 3 ≮ 3. It is not symmetric since, for example,
2 < 3 but 3 ≮ 2. It is transitive since for any 𝑎, 𝑏, 𝑐 ∈ ℤ with 𝑎 < 𝑏 and 𝑏 < 𝑐 we have 𝑎 < 𝑐.
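For a relation on a small finite set, each of these properties can be tested by checking every relevant pair or triple, which is a handy way to confirm a hand argument such as the one for 𝑅 above. The sketch below represents a relation as a Python set of ordered pairs, in line with the definition of a relation as a set; the function names are illustrative.

```python
def is_reflexive(relation, ground):
    """Every element of the ground set is related to itself."""
    return all((a, a) in relation for a in ground)


def is_symmetric(relation):
    """Whenever a is related to b, b is related to a."""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """Whenever a R b and b R c, also a R c."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


R = {(1, 2), (1, 3), (2, 2), (2, 3), (3, 3)}
```

Note that this approach only works for finite relations; for a relation such as < on ℤ a proof is still needed.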
The relations 𝐶 and ≡𝑚 in Examples 2.2(4) and (5) are important examples of equivalence
relations, which we will explore in the rest of this chapter.
Example 2.6.
(1). For any set 𝐴, the relation 𝑈𝐴 ≔ 𝐴 × 𝐴 is an equivalence relation. Notice that 𝑎 𝑈𝐴 𝑏 holds
for all 𝑎, 𝑏 ∈ 𝐴.
(2). For any set 𝐴, the equality relation 𝐸𝐴 ≔ {(𝑎, 𝑎) ∶ 𝑎 ∈ 𝐴} is an equivalence relation. This
follows from the properties of equality.
• Let 𝑥, 𝑦 be people such that 𝑥 𝐶 𝑦. Then 𝑥 and 𝑦 live in the same country, and so
trivially 𝑦 𝐶 𝑥. Therefore 𝐶 is symmetric.
• Let 𝑥, 𝑦, 𝑧 be people such that 𝑥 𝐶 𝑦 and 𝑦 𝐶 𝑧. Then there are countries 𝐴, 𝐵 such
that 𝑥, 𝑦 live in 𝐴, and 𝑦, 𝑧 live in 𝐵. Thus 𝐴 = 𝐵, so 𝑥, 𝑧 live in 𝐴 and hence 𝑥 𝐶 𝑧.
Therefore 𝐶 is transitive.
The principal reason that equivalence relations are useful is that they divide the set on
which they are defined into sets called equivalence classes which form a partition of 𝐴, mean-
ing that every element of 𝐴 lies in precisely one equivalence class. We now define these terms
formally.
Definition 2.7. Suppose that ∼ is an equivalence relation on a set 𝐴. For any 𝑎 ∈ 𝐴, the equivalence
class of 𝑎 is the set [𝑎]∼ = {𝑏 ∈ 𝐴 ∶ 𝑎 ∼ 𝑏}; that is, the set of all elements of 𝐴 to which 𝑎
is related. The quotient set of ∼ is the set of all equivalence classes of members of 𝐴:
𝐴/∼ ≔ {[𝑎]∼ ∶ 𝑎 ∈ 𝐴}.
Example 2.8.
(1). The equivalence classes of the relation 𝐶 from Example 2.2(4) are the sets of people living
in a fixed country (the country’s population), and the quotient set is the set of all such
populations.
⋮
[−3]≡3 = {… , −9, −6, −3, 0, 3, … }
[−2]≡3 = {… , −8, −5, −2, 1, 4, … }
[−1]≡3 = {… , −7, −4, −1, 2, 5, … }
[0]≡3 = {… , −6, −3, 0, 3, 6, … }
[1]≡3 = {… , −5, −2, 1, 4, 7, … }
[2]≡3 = {… , −4, −1, 2, 5, 8, … }
[3]≡3 = {… , −3, 0, 3, 6, 9, … }
⋮
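On a finite set, the equivalence classes can be computed by comparing each element with one representative of every class found so far. The sketch below does this for ≡3 restricted to the integers from −9 to 9; the function name and the choice of range are assumptions made for the illustration, and comparing against a single representative is only valid because the relation is an equivalence relation.

```python
def equivalence_classes(elements, related):
    """Group elements into classes, assuming `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):  # compare with the class representative
                cls.append(x)
                break
        else:
            classes.append([x])  # x starts a new class
    return classes


classes = equivalence_classes(range(-9, 10), lambda x, y: (x - y) % 3 == 0)
```

Only three distinct classes appear, which matches the list above: the classes written there repeat with period 3, for example [−3]≡3 = [0]≡3 = [3]≡3.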
There is an important connection between the equivalence of members and the equality
of their equivalence classes.
Lemma 2.9. Let ∼ be an equivalence relation on a set 𝑋 and let 𝑥, 𝑦 ∈ 𝑋 be given. Then 𝑥 ∼ 𝑦 if
and only if [𝑥]∼ = [𝑦]∼ .
Proposition 2.11. Let 𝐴 be a set and let 𝑃 be a collection of non-empty subsets of 𝐴. Then 𝑃 is a
partition of 𝐴 if and only if for all 𝑎 ∈ 𝐴, there exists a unique part 𝑋 ∈ 𝑃 such that 𝑎 ∈ 𝑋.
Informally, you can think of a partition as splitting up a set into one or more non-overlapping
pieces.
Example 2.12.
(3). There are five possible partitions of {1, 2, 3} (See Figure 2.2). These are
𝑃1 = {{1, 2, 3}}, 𝑃2 = {{1, 2}, {3}}, 𝑃3 = {{1, 3}, {2}}, 𝑃4 = {{1}, {2, 3}}, 𝑃5 = {{1}, {2}, {3}}.
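The list of partitions of a small set can be generated recursively: take a partition of the remaining elements, and either add the first element to one of its parts or give it a part of its own. The following sketch follows that idea; the function name is illustrative, and parts are represented as Python lists for simplicity.

```python
def partitions(elements):
    """Yield every partition of the given elements as a list of parts."""
    elements = list(elements)
    if not elements:
        yield []  # the empty set has exactly one partition, with no parts
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing part in turn...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ...or into a new part on its own.
        yield [[first]] + smaller
```

Applied to [1, 2, 3] this yields five partitions, the same five as 𝑃1 , … , 𝑃5 above.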
The following theorem tells us how to translate between equivalence classes and partitions
of a set; moreover, these translations are inverses of each other.
(1). For any equivalence relation ∼ on 𝐴, the quotient set 𝑃∼ ≔ 𝐴/∼ is a partition of 𝐴.
(2). For any partition 𝑃 of 𝐴, we define the relation ∼𝑃 on 𝐴: for all 𝑎, 𝑏 ∈ 𝐴, 𝑎 ∼𝑃 𝑏 if there
exists a part 𝑋 ∈ 𝑃 such that 𝑎, 𝑏 ∈ 𝑋. Then ∼𝑃 is an equivalence relation on 𝐴.
(3). These definitions form a bijective correspondence between the collection of equivalence
relations on 𝐴, and the collection of partitions on 𝐴:
[Figure 2.2: the five partitions of {1, 2, 3}.]
Proof.
Therefore each equivalence class (i.e. member of 𝑃∼ ) is non-empty and [𝑎]∼ is the unique
part of 𝑃∼ that contains 𝑎. Thus 𝑃∼ = 𝐴/∼ is a partition of 𝐴.
(2). Let 𝑃 be a partition of 𝐴 and let 𝑎 ∈ 𝐴 be given. Then there exists an 𝑋 ∈ 𝑃 such that 𝑎 ∈ 𝑋,
so by definition 𝑎 ∼𝑃 𝑎. Also by its definition ∼𝑃 is symmetric.
Now let 𝑎, 𝑏, 𝑐 ∈ 𝐴 be given such that 𝑎 ∼𝑃 𝑏 and 𝑏 ∼𝑃 𝑐. Then there exist 𝑋, 𝑌 ∈ 𝑃 such
that 𝑎, 𝑏 ∈ 𝑋 and 𝑏, 𝑐 ∈ 𝑌. Since 𝑏 ∈ 𝑋 ∩ 𝑌, then by uniqueness 𝑋 = 𝑌. Thus 𝑎, 𝑐 ∈ 𝑋 and
so 𝑎 ∼𝑃 𝑐. Therefore ∼𝑃 is an equivalence relation on 𝐴.
Now if 𝑎 ∼ 𝑏 then by reflexivity 𝑏 ∼ 𝑏 too, so 𝑎 ∼𝑃∼ 𝑏. On the other hand, if there exists a 𝑐 ∈ 𝐴
such that 𝑎 ∼ 𝑐 and 𝑏 ∼ 𝑐, then by symmetry 𝑐 ∼ 𝑏 and thus by transitivity 𝑎 ∼ 𝑏. Therefore
𝑎 ∼𝑃∼ 𝑏 if and only if 𝑎 ∼ 𝑏, and hence ∼𝑃∼ = ∼.
Let 𝑃 be a partition on 𝐴. Then for each 𝑎 ∈ 𝐴, there exists a unique 𝑋𝑎 ∈ 𝑃 such that 𝑎 ∈ 𝑋𝑎 .
So for all 𝑎, 𝑏 ∈ 𝐴:
𝑎 ∼𝑃 𝑏 ⟺ ∃𝑌 ∈ 𝑃 such that 𝑎, 𝑏 ∈ 𝑌 ⟺ 𝑏 ∈ 𝑋𝑎 .
This theorem allows us to describe equivalence relations and partitions in terms of each
other; this can be useful when trying to solve a problem in one domain by translating into the
other. As an example, it is easier to describe all equivalence relations on a set using partitions
instead.
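The two translations in the theorem above can be sketched in Python. This is only an illustration on a small finite set: the function names `relation_from_partition` and `partition_from_relation` are our own, and a relation is represented (by assumption) as a set of ordered pairs.

```python
from itertools import product

def relation_from_partition(partition):
    """Return ~_P as a set of ordered pairs: a ~_P b iff a and b lie in a common part."""
    return {(a, b) for part in partition for a, b in product(part, repeat=2)}

def partition_from_relation(elements, relation):
    """Return the quotient set: the set of equivalence classes [a] for a in elements."""
    return {frozenset(b for b in elements if (a, b) in relation) for a in elements}

A = {1, 2, 3}
P = {frozenset({1, 2}), frozenset({3})}  # the partition P2 from Example 2.12(3)
R = relation_from_partition(P)

# Translating a partition to a relation and back recovers the partition.
assert partition_from_relation(A, R) == P
```

Starting from an equivalence relation instead (for example, congruence modulo 3 on a finite range of integers) and applying the two functions in the opposite order recovers the relation in the same way.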
Chapter 3
Finite counting
The product rule, sum rule and inclusion-exclusion formulae are the starting points for our
counting arguments, though we often use these implicitly by ‘counting choices’. Indeed, given
a collection of sets, then the sum rule or inclusion-exclusion formulae tell us how many ways
there are to choose one item from these sets (that is, a single item is taken from the union of
all the sets, so only one item is chosen in total). The product rule tells us instead how many
ways there are to choose one item from each set.
Example 3.1. One student society has 12 members, and another has 23 members; they have
no members in common. How many ways are there to choose from the memberships
(1). a single representative for the societies (who can be from either)?
(2). a representative from each society?
Solution. Let 𝐴 be the set of members of the first society, and 𝐵 the set of members of the
second society. Choosing a single representative for the societies means choosing an element
of 𝐴 ∪ 𝐵, so the answer to (1) is |𝐴 ∪ 𝐵| = |𝐴| + |𝐵| = 12 + 23 = 35 by the sum rule. Similarly,
choosing a representative from each society means choosing a pair (𝑎, 𝑏) ∈ 𝐴 × 𝐵, as 𝑎 is then
the representative for the first society and 𝑏 is then the representative for the second. So the
answer to (2) is |𝐴 × 𝐵| = |𝐴||𝐵| = 12 × 23 = 276 by the product rule. Another way to put this
is that there are 12 choices for the first representative, and for each of these choices there are
then 23 choices for the second representative, so in total there are 12 lots of 23 choices, giving
12 × 23 possibilities in total.
In this chapter, we will explore how to calculate the number of possible ways of choosing
multiple elements of a set, subject to certain restrictions; for example:
• Does the order in which the elements are chosen matter? That is, if we first choose 𝑎 and
then 𝑏, should we consider this choice to be different from choosing 𝑏 first and then 𝑎?
Several of our examples in this chapter will consider probabilities of events. These are nat-
urally linked to counting results: if a random experiment has a finite number of possible out-
comes, and every outcome is equally likely, then the probability of an event 𝐸 is
\[ \mathbb{P}(E) = \frac{\text{number of outcomes for which } E \text{ occurs}}{\text{total number of outcomes}}. \]
We say that a selection is made uniformly at random if every outcome is equally likely. For ex-
ample, rolling a fair standard 6-sided die selects an element of the set {1, 2, 3, 4, 5, 6} uniformly
at random.1 Likewise, in the example above, if we choose uniformly at random a single representative
from either society, then each person has probability $\frac{1}{35}$ of being chosen.
When giving a numerical answer, it is sometimes more useful to round the number. Suppose
we have a real number 𝑥 written as a decimal $\pm a_n \dots a_0 . a_{-1} a_{-2} \dots$, where the $a_i$’s are
digits. The 1st significant figure is the first non-zero digit, starting from the left. Every digit
to the right of this digit is called a significant figure, so for example the 3rd significant figure
would be 2 places to the right of the 1st significant figure, even if that digit is 0. For a positive
integer 𝑛, to round to 𝑛 significant figures:
(1). If the (𝑛 + 1)th significant figure is 5 or more, then we add 1 to the 𝑛th significant figure.
If the 𝑛th significant figure was 9, then we instead change it to 0 and add 1 to the digit
to the left, repeating this process if this digit is also 9.
(2). After this, we replace all digits to the right of the 𝑛th significant figure by 0, and we can
ignore the 0 digits to the right of both the 𝑛th significant figure and the decimal point ‘.’.
To round to 𝑛 decimal places, we instead start with the 𝑛th digit to the right of the decimal
point ‘.’ and conduct the same process.
Depending on how 𝑥 was written as a decimal $\pm a_n \dots a_0 . a_{-1} a_{-2} \dots$, we may need to add
more 0 digits for this process to work. We write (𝑛 s.f.) or (𝑛 d.p.) after a number to indicate
that the number has been rounded to 𝑛 significant figures or 𝑛 decimal places respectively.
Unless stated otherwise, you must give the exact number for an answer! If you round to 𝑛
decimal places, or are rounding to 𝑛 significant figures and the 𝑛th significant figure is after the
decimal point, then you must write out that digit, even if it is 0. Table 3.1 gives some examples
demonstrating the above procedure.

Table 3.1
𝑥          𝑒         √0.99     𝜋³         1/137
(1 s.f.)   3         1.0       30         0.007
(2 s.f.)   2.7       0.99      31         0.0073
(3 s.f.)   2.72      0.995     31.0       0.00730
(4 s.f.)   2.718     0.9950    31.01      0.007299
(5 s.f.)   2.7183    0.99499   31.006     0.0072993
(1 d.p.)   2.7       1.0       31.0       0.0
(2 d.p.)   2.72      0.99      31.01      0.01
(3 d.p.)   2.718     0.995     31.006     0.007
(4 d.p.)   2.7183    0.9950    31.0063    0.0073
(5 d.p.)   2.71828   0.99499   31.00628   0.00730

1
All random selections considered in this course will be made uniformly at random; in other courses you will
see examples of non-uniform, random selections.
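The rounding procedure described above can be sketched in Python with the standard `decimal` module. This is an illustration only: the function names `round_sig` and `round_dp` are our own, and the input is assumed to be a decimal string with enough digits written out.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(x: str, n: int) -> Decimal:
    """Round the decimal string x to n significant figures (5 or more rounds up)."""
    d = Decimal(x)
    if d == 0:
        return d
    # d.adjusted() is the position of the 1st significant figure,
    # e.g. 1 for 31.006 and -3 for 0.0073.
    exponent = d.adjusted() - (n - 1)
    return d.quantize(Decimal((0, (1,), exponent)), rounding=ROUND_HALF_UP)

def round_dp(x: str, n: int) -> Decimal:
    """Round the decimal string x to n decimal places (5 or more rounds up)."""
    return Decimal(x).quantize(Decimal((0, (1,), -n)), rounding=ROUND_HALF_UP)

e_rounded = round_sig("2.718281828", 4)
inverse_137_rounded = round_dp("0.00729927007", 5)
```

Using `Decimal` rather than `float` keeps trailing zeros, so a result such as 0.00730 is printed with its final 0, as the notes require.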
• 0! ≔ 1,
2
Be careful not to mix up the mathematical and punctuational uses of the ‘!’ symbol! For example, if you solve
a problem and write ‘the answer is 10!’, do you mean 10 or 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 3,628,800?
3
The Π notation is for products, just like Σ notation is for sums. Given a finite list $a_1, \dots, a_n$ of numbers,
$\prod_{i=1}^{n} a_i := a_1 \times \dots \times a_n$. For the empty list, we define the empty product to be 1, which is equal to the number of
lists with no elements.
4
A recursive definition of a function 𝑓 is a definition that depends upon other values of the function. In this
case, the value of (𝑛+1)! depends on the value of 𝑛!. Usually, a recursive definition will depend on smaller values.
Theorem 3.3. Let 𝑟 and 𝑛 be non-negative integers, and let 𝑆 be a set of size 𝑛.
(1). There are $n^r$ different possible ways to make 𝑟 choices from 𝑆, if the order of choices matters
and repetition is allowed. In other words, there are $n^r$ sequences of 𝑟 elements of 𝑆
(allowing repetition). Moreover, if each choice from 𝑆 is made uniformly at random and
independently from the previous choices, then each of these outcomes has equal probability
$n^{-r}$.
(2). For 𝑟 ⩽ 𝑛 there are $\frac{n!}{(n-r)!}$ different possible ways to make 𝑟 choices from 𝑆, if the order of
choices matters but repetition is forbidden.5 In other words, there are $\frac{n!}{(n-r)!}$ sequences of 𝑟
elements of 𝑆 in which no element occurs more than once. Moreover, if each choice from 𝑆
is made uniformly at random from those elements of 𝑆 not previously chosen, then each
of these outcomes has equal probability $\frac{(n-r)!}{n!}$.
(3). For 𝑟 > 𝑛 it is not possible to make 𝑟 choices from 𝑆 if repetition is forbidden.
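The counts in Theorem 3.3 can be checked by brute force on a small set. The sketch below uses Python's `itertools`; the set `S` and the helper names are illustrative choices, not part of the notes.

```python
from itertools import permutations, product
from math import factorial

def ordered_with_repetition(S, r):
    """All sequences of r elements of S, repetition allowed."""
    return list(product(S, repeat=r))

def ordered_without_repetition(S, r):
    """All sequences of r elements of S in which no element occurs twice."""
    return list(permutations(S, r))

S = ["a", "b", "c", "d"]
n = len(S)
for r in range(7):
    # Part (1): n^r sequences when repetition is allowed.
    assert len(ordered_with_repetition(S, r)) == n ** r
    # Parts (2) and (3): n!/(n-r)! sequences for r <= n, and none for r > n.
    expected = factorial(n) // factorial(n - r) if r <= n else 0
    assert len(ordered_without_repetition(S, r)) == expected
```

Note that the case 𝑟 = 0 yields exactly one (empty) sequence in both functions, matching the conventions used in the proof.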
Proof. (1). First, let us suppose that 𝑟 is positive. Then the ordered sequences of 𝑟 elements
of 𝑆 with repetition allowed are precisely the elements of $S^r$, so by the product rule there
are exactly $|S|^r = n^r$ such sequences. Moreover, for any such sequence $(x_1, \dots, x_r)$ and each
1 ⩽ 𝑖 ⩽ 𝑟, the probability that $x_i$ is chosen at the 𝑖-th choice is $n^{-1}$; since these events are
independent, the probability that the sequence $(x_1, \dots, x_r)$ is the outcome of our random
selections is $n^{-r}$.
If 𝑟 = 0 then there is only one possible way of not choosing any element (or equivalently,
one empty sequence), and this coincides with the usual convention that $n^0 = 1$. As there
is only one possible outcome, the probability of that outcome is $1 = n^{-0}$.
As in (1), if 𝑟 = 0 then there is only $1 = \frac{n!}{(n-0)!}$ possible outcome, and the probability of that outcome is
$1 = \frac{(n-0)!}{n!}$.
(3). If 𝑟 > 𝑛 then, by the pigeonhole principle, whenever we make 𝑟 choices from the 𝑛 ele-
ments of 𝑆 there must be some element which is chosen at least twice, so it is not possible
to choose 𝑟 elements of 𝑆 without repeating some element.
Example 3.4. A bag contains 5 balls, labelled 1, 2, 3, 4, and 5. I draw out two balls in turn from
the bag.6 What is the probability that the number on the second ball drawn is precisely one
greater than the number on the first ball drawn, if we draw balls according to the following
rules:
(1). I replace the first ball before the second is drawn?
(2). I do not replace the first ball before the second is drawn?
Solution. Let 𝑥 be the number on the first ball drawn and 𝑦 be the number on the second
ball drawn, so each outcome of drawing the balls is a pair (𝑥, 𝑦). There are four outcomes of
the selection which give the event described, namely (1, 2), (2, 3), (3, 4) and (4, 5), so we need to
calculate the total number of possible outcomes.
(1). We are making two choices from the set {1, 2, 3, 4, 5} with repetition allowed (because
we replaced the first ball so it could be drawn again), and where order matters. So by
Theorem 3.3(1) there are $5^2 = 25$ possible outcomes, each of which is equally likely, and so the
probability is $\frac{4}{25} = 0.16$.
6
Many questions of this type involve drawing balls from bags, or rolling dice, or dealing cards from a deck.
Unless stated otherwise you should assume that each selection is uniformly random, that is, that each ball is
equally likely to be drawn, each card is equally likely to be dealt, each face of the die is equally likely to come up,
etc. Furthermore, if multiple dice are rolled, balls are drawn or cards are dealt, then you should assume that the
selections are independent of each other unless otherwise specified.
(2). We are again making two choices from the set {1, 2, 3, 4, 5} where order matters, but now
repetition is forbidden because we cannot draw the first ball again once it has been
drawn. So by Theorem 3.3(2) there are $\frac{5!}{(5-2)!} = 5 \times 4 = 20$ possible outcomes, each of
which is equally likely, and thus the probability is $\frac{4}{20} = \frac{1}{5} = 0.2$.
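Both parts of Example 3.4 are small enough to enumerate directly. The following sketch lists every ordered pair of draws and counts the favourable ones; the helper name `probability_second_is_one_more` is our own.

```python
from fractions import Fraction
from itertools import permutations, product

balls = [1, 2, 3, 4, 5]

def probability_second_is_one_more(outcomes):
    """Probability that y = x + 1 when each pair (x, y) is equally likely."""
    outcomes = list(outcomes)
    favourable = [(x, y) for x, y in outcomes if y == x + 1]
    return Fraction(len(favourable), len(outcomes))

# With replacement: ordered pairs with repetition allowed (Theorem 3.3(1)).
with_replacement = probability_second_is_one_more(product(balls, repeat=2))
# Without replacement: ordered pairs with repetition forbidden (Theorem 3.3(2)).
without_replacement = probability_second_is_one_more(permutations(balls, 2))
```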
Definition 3.5. A permutation of a set 𝑆 with 𝑟 elements is an 𝑟-tuple (𝑥1 , … , 𝑥𝑟 ) listing all
elements of 𝑆 exactly once; i.e. 𝑆 = {𝑥1 , … , 𝑥𝑟 }.
(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).
So there are 6 = 3! permutations of a set of three elements. We can also view permutations
of a set 𝑆 as the bijections 𝑓 ∶ 𝑆 → 𝑆. If we fix an enumeration 𝑆 = {𝑥1 , … , 𝑥𝑟 } of a set with 𝑟
elements, then we get the following bijective correspondence between permutations as we’ve
defined above, and bijections on a set 𝑆:
𝑟-tuples ↔ Bijections
(𝑦1 , … , 𝑦𝑟 ) ↦ 𝑓 ∶ 𝑆 → 𝑆, 𝑥𝑖 ↦ 𝑦𝑖
(𝑓 (𝑥1 ), … , 𝑓 (𝑥𝑟 )) ↤ 𝑓 ∶𝑆→𝑆
Proof. The permutations of 𝑆 are exactly the sequences of 𝑛 elements of 𝑆 in which no element
is repeated. This number is $\frac{n!}{(n-n)!} = \frac{n!}{0!} = n!$ by Theorem 3.3(2) applied with 𝑟 = 𝑛.
Example 3.7. There are 𝑛! ways for 𝑛 people to line up in a queue, since each order is a per-
mutation of the set of people in the queue.
Example 3.8. How many anagrams7 are there of the word MATHS? In how many of these is
‘T’ immediately before ‘H’?
Solution. Each anagram of MATHS is a permutation of the set {M, A, T, H, S}, so by Corollary 3.6
there are 5! = 120 anagrams. For the second part, note that the anagrams of MATHS in which
‘T’ is immediately before ‘H’ can be thought of as the permutations of the set {M, A, TH, S}
(i.e. we treat ‘TH’ as a single letter), and Corollary 3.6 tells us there are 4! = 24 of these.
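As a quick sanity check of Example 3.8, the anagrams of MATHS can be generated explicitly. This is a brute-force sketch, practical only because the word is short.

```python
from itertools import permutations

# Every rearrangement of the five distinct letters, as a string.
anagrams = {"".join(p) for p in permutations("MATHS")}

# Those in which 'T' is immediately before 'H'.
with_th = {word for word in anagrams if "TH" in word}
```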
7
An anagram of a word is a word obtained by rearranging the letters, including the original word. In this
course, the anagrams do not have to be real words.
Example 3.9. I roll four standard dice.8 What is the probability that the numbers obtained are
consecutive9 ?
Solution. If we imagine rolling the dice in turn, then we are making four choices from the set
{1, 2, 3, 4, 5, 6} in which order matters and where repetition is allowed. So by Theorem 3.3(1)
there are $6^4 = 1{,}296$ possible outcomes, each of which is equally likely. The outcomes for
which the numbers obtained are consecutive are the permutations of {1, 2, 3, 4}, {2, 3, 4, 5} and
{3, 4, 5, 6}. Each of these sets has 4! = 24 permutations by Corollary 3.6, so in total there are
24 × 3 = 72 such outcomes. Therefore the probability is $\frac{72}{1{,}296} = \frac{1}{18} = 0.056$ (2 s.f.).
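The count in Example 3.9 can be confirmed by enumerating all ordered rolls. The sketch assumes, as the solution does, that "consecutive" means the four numbers form a run such as 2, 3, 4, 5 once sorted, regardless of the order in which they were rolled.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=4))  # all 6^4 ordered outcomes

def is_consecutive(roll):
    """True if the sorted values increase by exactly 1 at each step."""
    s = sorted(roll)
    return all(b == a + 1 for a, b in zip(s, s[1:]))

favourable = sum(1 for roll in rolls if is_consecutive(roll))
probability = Fraction(favourable, len(rolls))
```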
Definition 3.10. Let 𝑛 and 𝑟 be non-negative integers with 𝑟 ⩽ 𝑛. Then the binomial coefficient
of 𝑛 and 𝑟, also called 𝑛 choose 𝑟, and denoted $\binom{n}{r}$,10 is defined to be
\[ \binom{n}{r} := \frac{n!}{r! \cdot (n-r)!} = \frac{n(n-1)(n-2) \cdots (n-r+1)}{r!}. \]
For example, $\binom{10}{2} = \frac{10 \times 9}{2 \times 1} = 45$. This is much simpler than the following calculation:
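As another example,
\begin{align*}
\binom{14}{7} &= \frac{14 \times 13 \times 12 \times 11 \times 10 \times 9 \times 8}{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1} \\
&= \frac{14}{7 \times 2} \times 13 \times \frac{12}{6} \times 11 \times \frac{10}{5} \times \frac{9}{3} \times \frac{8}{4} \\
&= 1 \times 13 \times 2 \times 11 \times 2 \times 3 \times 2 \\
&= 3{,}432.
\end{align*}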
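The second form in Definition 3.10 translates directly into a short computation. The function `binomial` below is an illustrative sketch; Python's standard library already provides the same value as `math.comb`.

```python
from math import comb, factorial

def binomial(n, r):
    """n choose r via the falling-factorial form n(n-1)...(n-r+1) / r!."""
    numerator = 1
    for k in range(r):
        numerator *= n - k  # multiplies in n, n-1, ..., n-r+1
    return numerator // factorial(r)

ten_choose_two = binomial(10, 2)
fourteen_choose_seven = binomial(14, 7)
```

The integer division is exact because the product of 𝑟 consecutive integers is always divisible by 𝑟!.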
Theorem 3.12. Let 𝑆 be a finite set of size 𝑛 and let 𝑟 ∈ {0, … , 𝑛} be given. Then there are $\binom{n}{r}$ different
possible ways to make 𝑟 successive choices from 𝑆 if the order is irrelevant and repetition is
forbidden. Equivalently, the number of subsets 𝑅 ⊆ 𝑆 of size 𝑟 is $\binom{n}{r}$. Furthermore, if each choice
from 𝑆 is made uniformly at random from those elements of 𝑆 which have not previously been
chosen, then each of these outcomes has equal probability $\binom{n}{r}^{-1}$.
To see that the first two statements are equivalent, recall that a set does not distinguish
between repeated elements, nor the order in which they appear.
Proof. Let 𝐴 denote the set of ordered choices $(x_1, \dots, x_r)$ of 𝑟 elements from 𝑆 without repetition,
so by Theorem 3.3(2) $|A| = \frac{n!}{(n-r)!}$. Let 𝑚 denote the number of unordered choices of
𝑟 elements from 𝑆 without repetition, and for each 𝑟-element subset 𝐶 ⊆ 𝑆, let $B_C$ denote the set of
permutations of 𝐶. Then by Corollary 3.6 $|B_C| = r!$, and as the collection $\{B_C : C \subseteq S, |C| = r\}$ is
pairwise-disjoint, by the sum rule we get
\[ \frac{n!}{(n-r)!} = |A| = \Bigl| \bigcup_{C \subseteq S,\ |C| = r} B_C \Bigr| = \sum_{C \subseteq S,\ |C| = r} |B_C| = m \cdot r! \quad \Longrightarrow \quad m = \frac{n!}{r! \cdot (n-r)!} = \binom{n}{r}. \]
Recall from Theorem 3.3(2) that the probability of any particular ordered choice is $\frac{(n-r)!}{n!}$. Thus,
as each unordered choice $\{x_1, \dots, x_r\}$ arises from exactly 𝑟! ordered choices (one for each permutation
of that choice), we get that the probability of choosing $\{x_1, \dots, x_r\}$ is
$r! \cdot \frac{(n-r)!}{n!} = \binom{n}{r}^{-1}$.
Example 3.13. How many 5-card poker hands11 are there? How many contain two aces, a
king, and two other cards (i.e. not an ace or king)? What is the probability that a random 5-
card hand has this form?
Solution. Note that the order of cards is irrelevant and that repetition is forbidden. Thus, the
answer to the first part is the number of ways to choose 5 cards out of 52 without repetition
and ignoring order, which is
\[ \binom{52}{5} = \frac{52 \times 51 \times 50 \times 49 \times 48}{5 \times 4 \times 3 \times 2 \times 1} = \frac{52}{4} \times \frac{51}{3} \times \frac{50}{5 \times 2} \times 49 \times 48 = 13 \times 17 \times 5 \times 49 \times 48 = 2{,}598{,}960. \]
For the second part, there are $\binom{4}{2}$ ways to choose 2 of the 4 aces in the deck, and there are $\binom{4}{1}$
ways to pick one of the four kings in the deck. Then, whatever we have chosen so far, there
are $\binom{44}{2}$ ways to choose two cards from the 44 = 11 × 4 cards which are not an ace or a king.
Observe that all these choices are independent of each other, so by the product rule the total
number of such hands is
\[ \binom{4}{2} \binom{4}{1} \binom{44}{2} = \frac{4 \times 3}{2} \times 4 \times \frac{44 \times 43}{2} = 6 \times 4 \times 946 = 22{,}704. \]
In particular, the probability that a random 5-card hand contains 2 aces, a king, and two other
cards, is
\[ \frac{22{,}704}{2{,}598{,}960} = \frac{473}{54{,}145} = 0.00874 \text{ (3 s.f.)}. \]
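The arithmetic in Example 3.13 can be reproduced with `math.comb` and exact fractions. The variable names below are illustrative.

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 5)  # all 5-card hands

# 2 of the 4 aces, 1 of the 4 kings, 2 of the 44 remaining cards.
two_aces_one_king = comb(4, 2) * comb(4, 1) * comb(44, 2)

probability = Fraction(two_aces_one_king, total_hands)
```

`Fraction` reduces to lowest terms automatically, which gives the simplified form of the probability without any manual cancelling.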
Example 3.14. In the China Union Lotto we choose 6 numbers from 1 to 33, and another
number from 1 to 16. The lottery draw then selects 6 red balls, numbered 1–33, and a blue
ball, numbered 1–16. What is the probability of winning the jackpot by matching all seven
numbers? What is the probability of matching four red numbers?
Solution. The order in which the balls are selected is irrelevant, and a ball cannot be selected
more than once. So the number of possibilities for the six red numbers drawn is $\binom{33}{6}$, and
$\binom{16}{1} = 16$ for the one blue number. Each outcome is equally likely, so the probability that the
outcome matches the seven numbers we chose is therefore
\[ \left( 16 \times \binom{33}{6} \right)^{-1} = \frac{1}{17{,}721{,}088} = 0.000000056 \text{ (2 s.f.)}. \]
For the second part, there are $\binom{6}{4}$ ways of choosing 4 of the red numbers to match, and there
are $\binom{33-6}{2} = \binom{27}{2}$ choices for the other two red balls. So the probability of matching 4 of the red balls
is
\[ \frac{\binom{6}{4} \binom{27}{2}}{\binom{33}{6}} = \frac{5{,}265}{1{,}107{,}568} = 0.0048 \text{ (2 s.f.)}. \]
11
Poker is a card game played with a standard 52-card deck, as described in the footnote of Example 1.7.
Notice that we can ignore the blue numbers / balls in the second part of this question.
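The two lottery probabilities in Example 3.14 can be recomputed exactly as follows. This is a sketch of the arithmetic only; the variable names are our own.

```python
from fractions import Fraction
from math import comb

red_draws = comb(33, 6)   # possible sets of six red balls
blue_draws = comb(16, 1)  # possible blue balls

# Jackpot: exactly one of the equally likely outcomes matches all seven numbers.
jackpot = Fraction(1, red_draws * blue_draws)

# Four red matches: 4 of our 6 numbers, plus 2 of the 27 numbers we did not pick.
four_red_count = comb(6, 4) * comb(27, 2)
four_red = Fraction(four_red_count, red_draws)
```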
Finally, we consider the case where repetition is allowed but we don’t care about the order
of choices.12
Theorem 3.15. Let 𝑛, 𝑟 ∈ ℕ+ be given and let 𝑆 be a finite set of size 𝑛. Then there are $\binom{n+r-1}{r}$
different possible ways to make 𝑟 successive choices from 𝑆 if repetition is allowed but the order
of choosing is irrelevant.
Proof. All that matters when considering the choices in this theorem is how many times each
element is chosen. To represent such a choice, we first list the elements of 𝑆 = {𝑥1 , … , 𝑥𝑛 } and
consider a string of symbols consisting of 𝑟 stars (⋆) and 𝑛 − 1 bars (∣), and count the number
of stars in each consecutive group:
\[ \underbrace{\star \cdots \star}_{a_1 \text{ stars}} \ \mid\ \underbrace{\star \cdots \star}_{a_2 \text{ stars}} \ \mid\ \cdots\ \mid\ \underbrace{\star \cdots \star}_{a_n \text{ stars}} \]
Here the bars, read from left to right, are the 1st bar, the 2nd bar, and so on up to the (𝑛 − 1)th bar.
The 𝑖th consecutive group of stars has $a_i$ stars, where $a_i$ is a non-negative integer. It is easily
seen that each unordered choice with repetition corresponds to exactly one such string of
symbols, where $a_i$ represents the number of times that $x_i$ was chosen. Thus, the number of
such choices is equal to the number of strings of symbols with the above form. We can create
a string by choosing 𝑛 − 1 positions to place the bars and put stars in the remaining places.
Therefore, by Proposition 3.11(1), there are $\binom{n+r-1}{n-1} = \binom{n+r-1}{r}$ such choices.
Corollary 3.16. For positive integers 𝑟 and 𝑛, the number of non-negative integer solutions of
$x_1 + x_2 + \dots + x_n = r$ is $\binom{n+r-1}{r}$.
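Theorem 3.15 and Corollary 3.16 can both be checked by brute force for small 𝑛 and 𝑟. In the sketch below, the function names are illustrative; `combinations_with_replacement` enumerates unordered choices with repetition, and the second function counts solutions directly.

```python
from itertools import combinations_with_replacement, product
from math import comb

def unordered_with_repetition(n, r):
    """Number of ways to make r unordered choices from n items, repetition allowed."""
    return sum(1 for _ in combinations_with_replacement(range(n), r))

def solutions(n, r):
    """Number of non-negative integer solutions of x_1 + ... + x_n = r."""
    return sum(1 for xs in product(range(r + 1), repeat=n) if sum(xs) == r)

for n in range(1, 5):
    for r in range(1, 6):
        assert unordered_with_repetition(n, r) == comb(n + r - 1, r)
        assert solutions(n, r) == comb(n + r - 1, r)
```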
12
Be very careful when using Theorem 3.15 to calculate probabilities, as unlike in Theorems 3.3 and 3.12 these
will often not be uniform. For example, if I roll two identical standard dice and ignore the order in which they
are rolled, by Theorem 3.15 there are $\binom{6+2-1}{2} = 21$ distinct possible outcomes, but these are not equally likely.
For instance, I am twice as likely to get a three and a four than I am to get two sixes. For this reason, when we
repeat random experiments where repetition is allowed, such as rolling dice or drawing balls from a bag with
replacement, we almost always count ordered outcomes and use Theorem 3.3. On the other hand, as we have
already seen, when we repeat random experiments with repetition forbidden, such as dealing cards from a deck
or drawing balls without replacement, we can either count outcomes with order using Theorem 3.3 or count
outcomes without order using Theorem 3.12; the latter is usually simpler.
Table 3.2: The number of ways to make 𝑟 choices from 𝑛 objects, subject to various
conditions.
The importance of this corollary is that it gives the number of ways of distributing 𝑟 identical
objects among 𝑛 people. So, for example, if I want to share 7 pound coins among 5 people
then there are $\binom{5+7-1}{7} = \binom{11}{7} = 330$ possible ways to do this.
Example 3.17. How many solutions are there in positive integers to 𝑥1 + 𝑥2 + 𝑥3 = 101?
Solution. There are 11! possible ways to put the elements of {M, I1 , S1 , S2 , I2 , S3 , S4 , I3 , P1 , P2 , I4 }
in order by Corollary 3.6. Ignoring the subscripts, these orders give all the anagrams of ‘MIS-
SISSIPPI’. However, each anagram of ‘MISSISSIPPI’ is formed 4! ⋅ 4! ⋅ 2! times in this manner:
if we fix the positions of the ‘I’s, then there are 4! possible arrangements of I1 , I2 , I3 and I4 in
those positions. Moreover, for each of those arrangements, there are 4! possible arrangements
of S1 , S2 , S3 and S4 in the fixed positions of the ‘S’s, and finally there are 2! = 2 possible arrangements
of P1 and P2 in the ‘P’-positions. So in total the number of anagrams of ‘MISSISSIPPI’
is $\frac{11!}{4! \cdot 4! \cdot 2!} = 34{,}650$.
An alternative approach is to consider a sequence of 11 blank spaces, into which we will
place the letters of MISSISSIPPI. First, we decide where we are going to put the 4 ‘S’s. There are
11 spaces to choose from, so there are $\binom{11}{4}$ possibilities, since we are choosing 4
of the 11 empty spaces. For any of these choices, there are then 7 empty spaces remaining,
so there are $\binom{7}{4}$ possibilities for how to place the 4 ‘I’s among these. There are then 3 empty
spaces, so $\binom{3}{2}$ choices for how to place the 2 ‘P’s. Finally, there is now only 1 empty space, so
the remaining letter ‘M’ must be placed here; there is no other choice. We conclude that the
number of anagrams of ‘MISSISSIPPI’ is
\[ \binom{11}{4} \binom{7}{4} \binom{3}{2} = \frac{11 \times 10 \times 9 \times 8}{4!} \times \frac{7 \times 6 \times 5 \times 4}{4!} \times \frac{3 \times 2}{2!} = \frac{11!}{4! \times 4! \times 2!} = 34{,}650, \]
as before.
Note that in the first method we considered the problem with order (i.e. treating all char-
acters as different), but then had to consider how many times we ‘overcounted’; we initially
counted with order but for the final answer, we do not care about the order of certain parts
(i.e. the repeated letters) and so have to divide by the number of different ways we achieve
the same anagram in the end. In the second method we viewed the mixed choice as a series
of separate unordered choices. In general both of these methods will work for mixed choice
problems, but one may be significantly simpler than the other, depending on the problem.
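The two methods described above can be written as two short functions and compared. This is a sketch with illustrative names; the brute-force check at the end uses the shorter word BANANA, since listing all 11! orderings of MISSISSIPPI would be slow.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial

def anagrams_by_overcounting(word):
    """First method: count all orderings, then divide by the overcount."""
    count = factorial(len(word))
    for multiplicity in Counter(word).values():
        count //= factorial(multiplicity)
    return count

def anagrams_by_placing_letters(word):
    """Second method: choose positions for each distinct letter in turn."""
    count, free = 1, len(word)
    for multiplicity in Counter(word).values():
        count *= comb(free, multiplicity)
        free -= multiplicity
    return count

# Brute-force comparison on a short word with repeated letters.
brute_force_banana = len(set(permutations("BANANA")))
```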
(1). What is the probability that I get 2 cards of one suit and 4 cards of another suit?
(2). What is the probability that I get 3 cards of one suit and 3 cards of another suit?
(3). What is the probability that I get exactly 2 Queens and exactly 3 Hearts?
(1). There are 4 possibilities for the suit from which we have 4 cards and $\binom{13}{4}$ possibilities for
the ranks of those cards. Having chosen these, there are then 3 possibilities for the suit
of the remaining 2 cards and $\binom{13}{2}$ possibilities for the ranks of those cards. So there are
$4 \cdot \binom{13}{4} \cdot 3 \cdot \binom{13}{2}$ such hands, and we conclude that the probability is
\[ \frac{4 \cdot \binom{13}{4} \cdot 3 \cdot \binom{13}{2}}{\binom{52}{6}} = \frac{669{,}240}{20{,}358{,}520} = \frac{1{,}287}{39{,}151} = 0.033 \text{ (2 s.f.)}. \]
(2). There are $\binom{4}{2}$ ways to choose which 2 suits the cards come from.13 Having chosen these
suits, there are $\binom{13}{3}$ ways to choose 3 cards from the one suit, and then $\binom{13}{3}$ ways to choose
3 cards from the other suit. So there are $\binom{4}{2} \cdot \binom{13}{3} \cdot \binom{13}{3}$ such hands, and we conclude that
the probability is
\[ \frac{\binom{4}{2} \binom{13}{3} \binom{13}{3}}{\binom{52}{6}} = \frac{490{,}776}{20{,}358{,}520} = \frac{4{,}719}{195{,}755} = 0.024 \text{ (2 s.f.)}. \]
(3). We consider two cases. If I do not get the Queen of Hearts, then I need to get two of the
remaining three Queens (there are $\binom{3}{2}$ possibilities for how this can be done), three of
the remaining twelve Hearts (there are $\binom{12}{3}$ possibilities for how this can be done) and
one of the 36 cards which is neither a Queen nor a Heart.14 So there are $36 \binom{3}{2} \binom{12}{3}$ hands not
containing the Queen of Hearts with exactly two Queens and exactly three Hearts.
On the other hand, if I do get the Queen of Hearts, then I need to get one of the remaining
three Queens (3 possibilities), two of the remaining 12 Hearts ($\binom{12}{2}$ possibilities) and two
of the other 36 cards ($\binom{36}{2}$ possibilities). So there are $3 \binom{12}{2} \binom{36}{2}$ hands containing the Queen
of Hearts with exactly two Queens and exactly three Hearts.
So the probability is
\[ \frac{36 \binom{3}{2} \binom{12}{3} + 3 \binom{12}{2} \binom{36}{2}}{\binom{52}{6}} = \frac{148{,}500}{20{,}358{,}520} = \frac{7{,}425}{1{,}017{,}926} \approx 0.0073 \text{ (2 s.f.)}. \]
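The three counts in this solution can be recomputed as below. The sketch assumes, as the denominators in the solution indicate, that the hand consists of 6 cards dealt from a standard 52-card deck; the variable names are our own.

```python
from fractions import Fraction
from math import comb

total = comb(52, 6)  # all 6-card hands

# (1) 4 cards of one suit and 2 of another: the two suits play different roles.
two_and_four = 4 * comb(13, 4) * 3 * comb(13, 2)

# (2) 3 cards from each of two suits: the two suits are interchangeable.
three_and_three = comb(4, 2) * comb(13, 3) ** 2

# (3) exactly 2 Queens and exactly 3 Hearts, split on the Queen of Hearts.
without_queen_of_hearts = 36 * comb(3, 2) * comb(12, 3)
with_queen_of_hearts = 3 * comb(12, 2) * comb(36, 2)
queens_and_hearts = without_queen_of_hearts + with_queen_of_hearts

probabilities = [Fraction(count, total)
                 for count in (two_and_four, three_and_three, queens_and_hearts)]
```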
Example 3.20. I roll seven standard dice. What is the probability that I get at least 3 sixes?
13
Note carefully the difference between the solutions of the different parts of this example. The fundamental
reason for this is that, for example, having two Spades and four Hearts is different from having two Hearts and
four Spades, whilst having three Hearts and three Spades is the same as having three Spades and three Hearts.
14
If 𝐻 represents the set of Heart cards and 𝑄 represents the set of Queen cards, then by the Inclusion-exclusion
formula for two sets |𝐻 ∪ 𝑄| = |𝐻 | + |𝑄| − |𝐻 ∩ 𝑄| = 13 + 4 − 1 = 16. Thus, there are 52 − 16 = 36 cards that are
neither Hearts nor Queens.
Solution. If we imagine rolling the dice in order, there are $6^7$ possible outcomes, each of which
is equally likely. We now count the outcomes in which I get at most two sixes (that is, the
outcomes which don’t satisfy our condition). There are $5^7$ outcomes in which I roll no sixes, since
in this case there are five possibilities (1, 2, 3, 4 or 5) for each die. There are $7 \times 5^6$ outcomes
in which I roll exactly 1 six, since there are 7 possibilities for which die shows six and 5
possibilities for each of the other six dice, and likewise there are $\binom{7}{2} \times 5^5$ outcomes in which I roll 2
sixes, since there are $\binom{7}{2}$ ways to choose the 2 dice which show six and 5 possibilities for each
of the other dice. So there are $5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5$ outcomes in which I do not roll at least 3
sixes, and therefore there are $6^7 - \left(5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5\right)$ outcomes in which I do roll at least 3
sixes. The probability of this event is therefore
\[ \frac{6^7 - \left(5^7 + 7 \times 5^6 + \binom{7}{2} \times 5^5\right)}{6^7} = \frac{26{,}811}{279{,}936} = \frac{331}{3{,}456} = 0.096 \text{ (2 s.f.)}. \]
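The complement count in Example 3.20 can be compared against a direct enumeration of all $6^7$ ordered rolls, which is small enough to list. The variable names below are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Outcomes with 0, 1 or 2 sixes, counted as in the solution.
at_most_two = 5**7 + 7 * 5**6 + comb(7, 2) * 5**5
at_least_three = 6**7 - at_most_two

# Direct enumeration of every ordered roll of seven dice.
brute_force = sum(1 for roll in product(range(1, 7), repeat=7) if roll.count(6) >= 3)

probability = Fraction(at_least_three, 6**7)
```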
As a general suggestion for approaching this kind of problem, try to apply the following
methods.
• If you can neatly partition the set you are trying to count, then do so, and then count each
part separately. This particularly applies if the question involves an inequality (e.g. ‘at
least 3 sixes’ in the above example). By considering each value in turn (e.g. considering
no sixes, 1 six and 2 sixes separately in the above example) we instead count sets defined
by equalities.15 For this to work it is essential that your partition is indeed a partition
(i.e. every outcome is in exactly one part – see Proposition 2.11).
• If you can describe your outcome as the result of a series of choices, then we can count
the possibilities at each choice and take the product to give the overall number of
possibilities. For example, in Example 3.19(2) we formed a ‘hand with 3 cards of one suit and
3 cards of another suit’ as the result of two consecutive choices: first we chose which
2 suits would appear in the hand, then we chose which cards from these suits would
appear, and multiplying these gave the number of such hands. For this to work it is
essential that for each outcome you are counting, there is precisely one sequence of
choices which results in that outcome; if not, then you will need to consider how you
overcounted.
• Finally, it often helps to take a complement; that is, to count the outcomes that do not
satisfy the given condition, since subtracting from the total number of outcomes then
gives the number which do satisfy the given condition. We did this in Example 3.20,
where instead of counting the outcomes in which we rolled at least 3 sixes, we
counted those in which we rolled at most 2 sixes.
15
Another example is in Example 3.19(3), where we partitioned the hands we were interested in into 2 sets,
those containing the Queen of Hearts and those not containing the Queen of Hearts, and counted these
separately.
\[ \binom{n}{r} + \binom{n}{r+1} = \binom{n+1}{r+1}. \]
Algebraic proof. One approach is to write $\binom{n}{r} = \frac{n!}{r! \cdot (n-r)!}$ and similarly for the other terms, and
argue as follows:
\begin{align*}
\binom{n}{r} + \binom{n}{r+1} &= \frac{n!}{r! \cdot (n-r)!} + \frac{n!}{(r+1)! \cdot (n-r-1)!} \\
&= \frac{n!}{(r+1)!} \left( \frac{r+1}{(n-r)!} + \frac{1}{(n-r-1)!} \right) \\
&= \frac{n!}{(r+1)! \cdot (n-r)!} \bigl[ (r+1) + (n-r) \bigr] \\
&= \frac{n! \cdot (n+1)}{(r+1)! \cdot ((n+1)-(r+1))!} \\
&= \frac{(n+1)!}{(r+1)! \cdot ((n+1)-(r+1))!} \\
&= \binom{n+1}{r+1}.
\end{align*}
Combinatorial proof. An alternative argument uses Theorem 3.12 to count the number of subsets
of {1, … , 𝑛 + 1} with size 𝑟 + 1 in two different ways:
\begin{align*}
\binom{n+1}{r+1} &= |\{A \subseteq \{1, \dots, n+1\} : |A| = r+1\}| \\
&= |\{A \subseteq \{1, \dots, n+1\} : n+1 \in A \text{ and } |A| = r+1\}| \\
&\quad + |\{A \subseteq \{1, \dots, n+1\} : n+1 \notin A \text{ and } |A| = r+1\}| && \text{by the sum rule} \\
&= |\{A \subseteq \{1, \dots, n\} : |A| = r\}| + |\{A \subseteq \{1, \dots, n\} : |A| = r+1\}| && (\ast) \\
&= \binom{n}{r} + \binom{n}{r+1}.
\end{align*}
                         1
                       1   1
                     1   2   1
                   1   3   3   1
                 1   4   6   4   1
               1   5  10  10   5   1
             1   6  15  20  15   6   1
           1   7  21  35  35  21   7   1
         1   8  28  56  70  56  28   8   1
       1   9  36  84 126 126  84  36   9   1
Figure 3.1: This diagram lists the binomial coefficients, where the 𝑛th row (starting from
𝑛 = 0) shows $\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n}$ from left-to-right. Each number (apart from the 1’s) is the
sum of the two numbers to its upper-left and upper-right. For example, 28 = 21 + 7. This
diagram is often called Pascal’s triangle in English. In China, it is known as 杨辉三角形.
The binomial coefficients can be arranged in a triangular array, where each row is gener-
ated by the previous one using Proposition 3.21 – see Figure 3.1.
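Generating each row from the previous one can be sketched as follows. The function name `pascal_rows` is our own; the rule used to build each row is exactly the identity of Proposition 3.21.

```python
from math import comb

def pascal_rows(count):
    """Return the first `count` rows of the triangle, starting from row n = 0."""
    rows = []
    for n in range(count):
        if n == 0:
            row = [1]
        else:
            prev = rows[-1]
            # Each interior entry is the sum of the two entries above it.
            row = [1] + [prev[i] + prev[i + 1] for i in range(n - 1)] + [1]
        rows.append(row)
    return rows

triangle = pascal_rows(10)
```

Row 𝑛 of the result can be compared entry by entry with `math.comb(n, r)` as an independent check.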
Theorem 3.22 (Binomial theorem). For any integer 𝑛 ⩾ 0 and any 𝑎, 𝑏 ∈ ℝ, we have
\[ (a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}. \]
This theorem explains the name for binomial coefficients, as they are the coefficients in the
binomial expansion of $(a+b)^n$.
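The identity can be spot-checked numerically. The sketch below evaluates the right-hand side for a few exact values of 𝑎 and 𝑏 (integers and fractions, chosen arbitrarily) and compares it with the left-hand side; it is a check on examples, not a proof.

```python
from fractions import Fraction
from math import comb

def binomial_expansion(a, b, n):
    """Right-hand side of the binomial theorem: sum of C(n, i) * a^i * b^(n-i)."""
    return sum(comb(n, i) * a**i * b ** (n - i) for i in range(n + 1))

samples = [(1, 1), (2, 3), (-1, 1), (Fraction(1, 2), Fraction(-3, 4))]
for a, b in samples:
    for n in range(9):
        assert binomial_expansion(a, b, n) == (a + b) ** n
```

Exact arithmetic (`int` and `Fraction`) is used so that the comparison is not affected by floating-point rounding.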
Proof. For 𝑛 = 0,
\[ (a+b)^n = 1 = \binom{0}{0} a^0 b^0 = \sum_{i=0}^{0} \binom{0}{i} a^i b^{n-i}. \]
Now let 𝑛 ∈ ℕ0 be given such that $(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}$. Then
Therefore by induction, we have that $(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}$ for all 𝑛 ∈ ℕ0 and 𝑎, 𝑏 ∈ ℝ.
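Note that Corollary 3.23 gives another proof of Theorem 1.9, which stated that for any finite
set 𝑋 we have $|\mathcal{P}(X)| = 2^{|X|}$. Indeed, for a finite set 𝑆 of size 𝑛,
\begin{align*}
\text{total number of subsets of } S &= \sum_{i=0}^{n} (\text{number of subsets of } S \text{ with size } i) \\
&= \sum_{i=0}^{n} \binom{n}{i} && \text{by Theorem 3.12} \\
&= 2^n && \text{by Corollary 3.23.}
\end{align*}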
Note that Corollary 3.24 does not hold for 𝑛 = 0, as in this case we just have the term
\[ \binom{n}{0} = (-1+1)^0 = 0^0 = 1. \]
Corollary 3.25. Let 𝑆 be a non-empty finite set of size 𝑛. Then 𝑆 has $2^{n-1}$ subsets of even size
and $2^{n-1}$ subsets of odd size.
By Theorem 3.12 and the sum rule, it follows that 𝑆 has the same number of even-sized subsets
as it does odd-sized subsets. As their sum is equal to $2^n$ by Theorem 1.9, it follows there are
$2^{n-1}$ even-sized and $2^{n-1}$ odd-sized subsets of 𝑆.
Again, this result does not hold for 𝑛 = 0, since the empty set has one subset of even size
(itself) and no subset of odd size.
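Corollary 3.25, together with the exceptional case 𝑛 = 0, can be checked by listing subsets of a small set. The helper name `subsets_by_parity` is illustrative.

```python
from itertools import combinations

def subsets_by_parity(n):
    """Return (number of even-sized subsets, number of odd-sized subsets) of an n-element set."""
    even = odd = 0
    for size in range(n + 1):
        count = sum(1 for _ in combinations(range(n), size))
        if size % 2 == 0:
            even += count
        else:
            odd += count
    return even, odd

for n in range(1, 9):
    assert subsets_by_parity(n) == (2 ** (n - 1), 2 ** (n - 1))

# The empty set has one even-sized subset (itself) and no odd-sized subset.
assert subsets_by_parity(0) == (1, 0)
```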
Chapter 4
Infinite counting
Our notion of ‘size’ for finite sets is intrinsically connected with the existence of bijections
between sets. Indeed, the process of counting elements of a finite set (“one”, “two”, “three”,
“four”, ...) establishes a bijection between the set of objects we wish to count and a set of the
form {1, … , 𝑛} for some non-negative integer 𝑛. We can generalise this notion of size by comparing
two sets with each other:
• If there exists a bijection from 𝑋 to 𝑌, we say that 𝑋 and 𝑌 are equinumerous 1 and write
𝑋 ≈ 𝑌.
We recall some facts about bijections and injections, written in this new notation. The
proofs are left as an exercise to the reader.
Proposition 4.2.
The following proposition shows that the notions of injectivity / bijectivity capture the no-
tions that one set is smaller-in-size than another / equal-in-size to another.
Proposition 4.3. Let 𝑚, 𝑛 ∈ ℕ0 be given and let 𝑋, 𝑌 be finite sets with size 𝑚, 𝑛 respectively.
Then:
𝑓 ∶ 𝑋 → 𝑌 , 𝑥𝑖 ↦ 𝑦𝑖 .
You might expect that if two sets can be embedded into each other by injections, then those
sets should be equinumerous. This is in fact true and is called the Cantor-Schröder-Bernstein
theorem, but it is not easy to prove and so is not examinable. The proof is given as an additional
exercise.
Theorem 4.4 (Cantor-Schröder-Bernstein theorem). Let 𝑋 and 𝑌 be sets such that 𝑋 ≼ 𝑌 and
𝑌 ≼ 𝑋. Then 𝑋 ≈ 𝑌.
If 𝑓 ∶ ℕ+ → 𝑋 is a bijection to some set 𝑋, then we can list the elements of 𝑋: 𝑓 (1), 𝑓 (2), 𝑓 (3),
… , 𝑓 (𝑛), … . Hence the countable sets are those that can be described by a list, either finite or
infinite. Below we give several examples of countably-infinite sets.
4.1. COUNTABLE SETS 51
Example 4.6.
(3). Let 𝐸 denote the set of positive even integers. Then 𝑔 ∶ ℕ+ → 𝐸, 𝑛 ↦ 2𝑛 is a bijection, so
𝐸 is countably-infinite. This shows that an infinite set can be equinumerous to a proper
subset of itself, even though 𝐸 “only has half the elements of ℕ+ ”.
(4). Define
ℎ ∶ ℕ+ → ℤ, 𝑛 ↦ (𝑛 − 1)/2 if 𝑛 is odd, and 𝑛 ↦ −𝑛/2 if 𝑛 is even.
Then ℎ is a bijection, so ℕ+ ≈ ℤ. Thus ℤ is countably-infinite.
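As a quick sanity check of this example, the map ℎ can be written out and compared against a candidate inverse. This is a sketch only: `h_inverse` is a hypothetical helper introduced here, not something defined in the notes.

```python
def h(n: int) -> int:
    """The map from Example 4.6(4): odd n go to (n - 1)/2, even n go to -n/2."""
    if n < 1:
        raise ValueError("h is only defined on positive integers")
    return (n - 1) // 2 if n % 2 == 1 else -(n // 2)


def h_inverse(z: int) -> int:
    """Hypothetical helper: the positive integer n with h(n) == z."""
    return 2 * z + 1 if z >= 0 else -2 * z


if __name__ == "__main__":
    # Lists 0, -1, 1, -2, 2, -3, 3: every integer appears exactly once.
    print([h(n) for n in range(1, 8)])
```

Having a two-sided inverse on every tested value is consistent with ℎ being a bijection; it is of course a finite check and not a proof.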
(2). Assume 𝑋 is non-empty. Then 𝑋 is countable if and only if there exists a surjection 𝑓 ∶ ℕ+ →
𝑋.
Thus every infinite subset of ℕ+ (or any countably-infinite set) is also countably-infinite,
such as the set of prime numbers, or {𝑛^2 ∶ 𝑛 ∈ ℕ+}. In fact, the product of two countable sets
is also countable, which can be shown using the following theorem. The proof is given as an
exercise, but we do show as a corollary that the set of rational numbers is countably-infinite.
𝑔 ∶ ℕ+ × ℕ+ → ℚ, (𝑚, 𝑛) ↦ 𝑓(𝑚)/𝑛
𝑓 ∶ 𝑋 → 𝒫(𝑋), 𝑥 ↦ {𝑥}.
Let 𝑥, 𝑦 ∈ 𝑋 be given such that 𝑓 (𝑥) = 𝑓 (𝑦). Then 𝑥 ∈ {𝑥} and {𝑥} = {𝑦}, so 𝑥 ∈ {𝑦}. However, 𝑦 is
the only member of {𝑦}, so we must have 𝑥 = 𝑦. Therefore 𝑓 is injective and so 𝑋 ≼ 𝒫(𝑋).
Now let 𝑔 ∶ 𝑋 → 𝒫(𝑋) be any function. We will show it is not surjective. Define 𝐴 ≔ {𝑥 ∈ 𝑋 ∶
𝑥 ∉ 𝑔(𝑥)} ∈ 𝒫(𝑋). Suppose there exists an 𝑥 ∈ 𝑋 such that 𝐴 = 𝑔(𝑥). Then by definition of 𝐴,
𝑥 ∈ 𝑔(𝑥) ⟺ 𝑥 ∈ 𝐴 ⟺ 𝑥 ∉ 𝑔(𝑥).
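The diagonal set 𝐴 can be computed explicitly when 𝑋 is a small finite set, which makes it possible to confirm by exhaustion that no function 𝑔 ∶ 𝑋 → 𝒫(𝑋) hits it. The sketch below is illustrative; the function names are assumptions made for this example, and functions are represented as Python dictionaries.

```python
from itertools import product


def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(x for x, keep in zip(xs, mask) if keep)
            for mask in product([False, True], repeat=len(xs))]


def diagonal_set(g, xs):
    """The set A = {x in X : x not in g(x)} used in the proof."""
    return frozenset(x for x in xs if x not in g[x])


def all_functions(xs, targets):
    """Yield every function from xs to targets, each as a dict."""
    xs = list(xs)
    for values in product(targets, repeat=len(xs)):
        yield dict(zip(xs, values))


if __name__ == "__main__":
    X = [0, 1, 2]
    missed = all(diagonal_set(g, X) not in g.values()
                 for g in all_functions(X, powerset(X)))
    print("A is never in the image of g:", missed)
```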
• 𝑋1 ≔ ℕ+ , and
Thus there are infinitely-many infinite sets, no two of which are equinumerous.
Proof. Let 𝑓 ∶ ℕ+ → [0, 1] be given. We will show that 𝑓 is not surjective. For each 𝑛 ∈ ℕ+ ,
let 0.𝑎_{𝑛,1} 𝑎_{𝑛,2} … 𝑎_{𝑛,𝑘} … be a non-terminating decimal expansion of 𝑓 (𝑛), where each 𝑎_{𝑛,𝑘} is a
decimal digit from 0 to 9 inclusive.2 We define a new sequence as follows (see Figure 4.2): for
all 𝑛 ∈ ℕ+ ,
\[ b_n := \begin{cases} a_{n,n} + 5 & \text{if } a_{n,n} < 5, \\ a_{n,n} - 5 & \text{if } a_{n,n} \geq 5. \end{cases} \]
Define 𝑥 ≔ 0.𝑏_1 𝑏_2 … 𝑏_𝑘 … ∈ [0, 1] and assume 𝑥 ∈ Im(𝑓 ), so there exists an 𝑛 ∈ ℕ+ such that
2 For 1, we can take the decimal expansion 0.999… (recurring).
𝑛 𝑓 (𝑛)
1 0. 𝑎1,1 𝑎1,2 𝑎1,3 ⋯ 𝑎1,𝑛 ⋯
2 0. 𝑎2,1 𝑎2,2 𝑎2,3 ⋯ 𝑎2,𝑛 ⋯
3 0. 𝑎3,1 𝑎3,2 𝑎3,3 ⋯ 𝑎3,𝑛 ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋱
𝑛 0. 𝑎𝑛,1 𝑎𝑛,2 𝑎𝑛,3 ⋯ 𝑎𝑛,𝑛 ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋱
𝑥 0. 𝑏1 𝑏2 𝑏3 ⋯ 𝑏𝑘 ⋯
Figure 4.2: For any countably-infinite list of numbers in [0, 1], we will define
a real number 𝑥 not in this list by choosing its decimal digits to be different
from any decimal expansion in the sequence. Some care is needed as real
numbers can have 2 decimal expansions, such as 0.1 = 0.1000… = 0.0999….
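𝑥 = 𝑓 (𝑛). Then:
\[
\begin{aligned}
& \sum_{k=1}^{\infty} \frac{a_{n,k}}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k} \\
\Rightarrow\ & \sum_{k=1}^{n-1} \frac{a_{n,k} - b_k}{10^k} = \sum_{k=n}^{\infty} \frac{b_k - a_{n,k}}{10^k} \\
\Rightarrow\ & \frac{1}{10^{n-1}} \sum_{k=1}^{n-1} (a_{n,k} - b_k) 10^{n-1-k} = \frac{b_n - a_{n,n}}{10^n} + \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^k} \\
\Rightarrow\ & \sum_{k=1}^{n-1} (a_{n,k} - b_k) 10^{n-1-k} = \frac{b_n - a_{n,n}}{10} + \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^{k-n+1}}. \qquad (\ast)
\end{aligned}
\]
Notice that the left-hand side of the last equation is an integer. Moreover,
\[
\left| \sum_{k=n+1}^{\infty} \frac{b_k - a_{n,k}}{10^{k-n+1}} \right|
\leq \sum_{k=n+1}^{\infty} \frac{|b_k - a_{n,k}|}{10^{k-n+1}}
\leq \sum_{k=n+1}^{\infty} \frac{18}{10^{k-n+1}}
= \frac{18}{10} \sum_{k=1}^{\infty} \frac{1}{10^k}
= 0.2.
\]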
As 𝑏_𝑛 − 𝑎_{𝑛,𝑛} = ±5, it follows that the right-hand side of (∗) belongs to the set [−0.7, −0.3] ∪
[0.3, 0.7], which is a contradiction as the left-hand side is an integer. Therefore 𝑥 ∉ Im(𝑓 ) and
so 𝑓 is not surjective. As 𝑓 was arbitrary, there is no surjection from ℕ+ onto [0, 1], so [0, 1] is uncountable.
Finally, as [0, 1] ⊆ ℝ, by Theorem 4.7(1) ℝ must also be uncountable.
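The digit-shifting rule in this proof is easy to try on a finite table of digits. The sketch below assumes a truncated list in which row 𝑛 holds the first few digits of 𝑓(𝑛); `diagonal_digits` is a name chosen for this illustration, and a finite table can only illustrate the construction, not replace the argument.

```python
def diagonal_digits(rows):
    """Given rows[n][k] = digit k+1 of the (n+1)-th number, return b_1, b_2, ...

    Each b_n is the n-th diagonal digit shifted by 5, as in the proof.
    """
    result = []
    for n, row in enumerate(rows):
        a = row[n]  # the diagonal digit a_{n,n} (0-indexed here)
        result.append(a + 5 if a < 5 else a - 5)
    return result


if __name__ == "__main__":
    table = [
        [1, 4, 1, 5],
        [9, 9, 9, 9],
        [0, 0, 0, 0],
        [5, 0, 5, 0],
    ]
    print(diagonal_digits(table))
```

Shifting by 5 rather than by 1 is what keeps the new number well away from every listed number, which is how the proof sidesteps the problem of numbers with two decimal expansions.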
Chapter 5
Graph theory
Definition 5.1. A graph 𝐺 = (𝑉 , 𝐸) consists of a set of vertices 1 𝑉 and a set 𝐸 ⊆ 𝒫(𝑉 ) of edges,
where each edge is an unordered pair {𝑢, 𝑣} of distinct vertices 𝑢, 𝑣 ∈ 𝑉.2 We call {𝑢, 𝑣} an edge
between 𝑢 and 𝑣, and will often write this edge simply as 𝑢𝑣.3
For a graph 𝐺, we denote its set of vertices by 𝑉 (𝐺 ) and its set of edges by 𝐸(𝐺 ). Note that
𝐺 = (𝑉 (𝐺 ), 𝐸(𝐺 )).
However, it may help you to think in terms of the following, more informal definition – a
graph consists of a set of points called vertices, some of which may be linked by lines called
edges, subject to the following rules:
An example is shown in Figure 5.1. Notice that the lines do not have to be straight, and can
intersect; all that matters is whether there is a line joining two vertices or not.
We can think of an isomorphism as a relabelling of the vertices that preserves the edge
structure of the graph. By its definition, a graph is equal to another if they have the same set
of vertices, and the same set of edges. However, most of the time we only care about the con-
nections, not the actual names or labels of the vertices. This leads to the following definition.
1 Note that the word ‘vertices’ is the plural of ‘vertex’ (similar to ‘matrix’ and ‘matrices’). ‘Vertice’ and ‘vertexes’
are both incorrect! Throughout this course, all graphs have a non-zero, finite number of vertices.
2 This definition of a graph is sometimes called a simple graph. Writers using this term would typically say
that a graph may contain loops (an edge from a vertex to itself) and multiple edges between a pair of vertices.
However, we will not consider these in this course.
3 Note that 𝑢𝑣 = 𝑣𝑢 because these are unordered pairs.
Figure 5.1: A graph 𝐺 with vertex set {𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓 } and edge set
{𝑎𝑐, 𝑎𝑑, 𝑎𝑓, 𝑏𝑐, 𝑏𝑑, 𝑒𝑓}. The complement 𝐺̅ is also shown in red.
Definition 5.2. Let 𝐺 and 𝐻 be graphs and let 𝑓 ∶ 𝑉 (𝐺 ) → 𝑉 (𝐻 ) be given. We say that 𝑓 is an
isomorphism if 𝑓 is bijective and for all distinct 𝑣, 𝑤 ∈ 𝑉 (𝐺 ), 𝑣𝑤 ∈ 𝐸(𝐺 ) if and only if 𝑓 (𝑣)𝑓 (𝑤) ∈
𝐸(𝐻 ). We say 𝐺 and 𝐻 are isomorphic if there is an isomorphism from 𝐺 to 𝐻.4
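The definition of an isomorphism translates almost directly into a check on small graphs. In the sketch below a graph is assumed to be a pair (set of vertices, set of edges), with each edge stored as a two-element `frozenset`; this representation and the names `edge`, `is_isomorphism` and `find_isomorphism` are choices made for the illustration. The search tries every bijection, so it is only practical for very small graphs.

```python
from itertools import permutations


def edge(u, v):
    """An unordered pair {u, v}, so that edge(u, v) == edge(v, u)."""
    return frozenset({u, v})


def is_isomorphism(f, graph_g, graph_h):
    """Check the definition for a map f given as a dict from V(G) to V(H)."""
    (vg, eg), (vh, eh) = graph_g, graph_h
    bijective = set(f) == vg and set(f.values()) == vh and len(vg) == len(vh)
    if not bijective:
        return False
    # vw is an edge of G exactly when f(v)f(w) is an edge of H.
    return all((edge(v, w) in eg) == (edge(f[v], f[w]) in eh)
               for v in vg for w in vg if v != w)


def find_isomorphism(graph_g, graph_h):
    """Return some isomorphism as a dict, or None if the graphs are not isomorphic."""
    (vg, _), (vh, _) = graph_g, graph_h
    if len(vg) != len(vh):
        return None
    domain = list(vg)
    for image in permutations(list(vh)):
        f = dict(zip(domain, image))
        if is_isomorphism(f, graph_g, graph_h):
            return f
    return None
```

For example, the path with edges 𝑎𝑏, 𝑏𝑐 is isomorphic to the path with edges 𝑥𝑧, 𝑧𝑦 via 𝑎 ↦ 𝑥, 𝑏 ↦ 𝑧, 𝑐 ↦ 𝑦, but not to a triangle.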
Figure 5.2 gives an example of two graphs that ‘look’ different but are isomorphic. We con-
clude this section by introducing more terminology that will be used in this chapter.
Figure 5.1 also demonstrates the complement 𝐺̅ of a graph 𝐺. Notice that 𝐸(𝐺̅) is the set
complement of 𝐸(𝐺 ) in the universal set {𝐴 ⊆ 𝑉 (𝐺 ) ∶ |𝐴| = 2}.
Definition 5.4 (Order and size). The order of a graph 𝐺 is the number of vertices of 𝐺, and the
size of 𝐺 is the number of edges of 𝐺.
4
The words ‘isomorphism’ and ‘isomorphic’ can be broken down into two parts: ‘iso’, meaning ‘same’, and
‘morphism’/‘morphic’, which is related to ‘structure’. Thus, an isomorphism preserves the ‘same structure’, and
isomorphic graphs have the ‘same structure’.
5.1. DEGREES AND DEGREE SEQUENCES 57
Figure 5.2: The two graphs drawn above are isomorphic, with the
isomorphism 𝑓 ∶ {𝐴, 𝐵, 𝐶 , 𝐷, 𝐸, 𝐹} → {𝑢, 𝑣, 𝑤, 𝑥, 𝑦, 𝑧} shown in green.
• If 𝑢𝑣 is an edge in 𝐺, we say that 𝑢 and 𝑣 are adjacent (in 𝐺) and are called neighbours.
For a vertex 𝑣 ∈ 𝑉 (𝐺 ), we denote the set of neighbours of 𝑣 by 𝑁𝐺 (𝑣). We can drop the
subscript if there is no ambiguity.
Note that if 𝐺 is a graph with 𝑛 vertices, then the degree of each vertex of 𝐺 is an integer
between 0 and 𝑛 − 1 inclusive. The next lemma relates the sum of all vertex degrees to the
number of edges.
Lemma 5.7 (Handshaking lemma). In any graph 𝐺 we have ∑_{𝑣∈𝑉(𝐺)} 𝑑(𝑣) = 2|𝐸(𝐺 )|.
                 Vertices
Edges     𝑎   𝑏   𝑐   𝑑   𝑒   𝑓    Total
𝑎𝑐        ✓   ✗   ✓   ✗   ✗   ✗    2
𝑎𝑑        ✓   ✗   ✗   ✓   ✗   ✗    2
𝑎𝑓        ✓   ✗   ✗   ✗   ✗   ✓    2
𝑏𝑐        ✗   ✓   ✓   ✗   ✗   ✗    2
𝑏𝑑        ✗   ✓   ✗   ✓   ✗   ✗    2
𝑒𝑓        ✗   ✗   ✗   ✗   ✓   ✓    2
Degree    3   2   2   2   1   2    ∑_{𝑣∈𝑉(𝐺)} 𝑑(𝑣) = 12 = 2|𝐸(𝐺 )|
Table 5.1: A table summarising the incidence information for the graph in Figure
5.1. The blue ticks (✓) indicate that the edge (for that row) is incident to the vertex
(for that column). The total number of ticks is the expression in the handshaking
lemma, shown in the lower-right corner.
Proof. We count the number of pairs (𝑣, 𝑒), where 𝑒 is an edge incident to a vertex 𝑣:
Example 5.8. The incidence information from Figure 5.1 is summarised in Table 5.1. By counting
the number of edges incident to a fixed vertex 𝑣, we get the degree 𝑑(𝑣) of that vertex. Note
that each edge is incident to exactly 2 vertices and that the minimum and maximum degree
are 1 and 3.
The proof of the handshaking lemma uses two different methods for counting the ticks (✓)
in such a table: either by adding up rows first or columns first.
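The two ways of counting ticks can be reproduced in a few lines for the graph of Figure 5.1. This is an illustrative sketch: edges are written as two-letter strings, and the helper name `degrees` is an assumption of this example.

```python
from collections import Counter

# Edge set of the graph in Figure 5.1, each edge written as a two-letter string.
EDGES = ["ac", "ad", "af", "bc", "bd", "ef"]


def degrees(edges):
    """Count, for each vertex, the number of edges incident to it."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


if __name__ == "__main__":
    by_rows = 2 * len(EDGES)                    # each row (edge) has 2 ticks
    by_columns = sum(degrees(EDGES).values())   # each column (vertex) has d(v) ticks
    print(by_rows, by_columns)
```

Both totals count the same set of ticks, which is exactly the double-counting idea behind the handshaking lemma.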
Corollary 5.9. In any (finite) graph there are an even number of vertices with odd degree.
As the right-hand side is even and each degree is an integer, it follows that the number of
vertices with odd degree must be even.
The next proposition uses the pigeonhole principle to prove that any graph must contain
two vertices with the same degree.
Proposition 5.10. Any graph with at least 2 vertices has 2 vertices of the same degree.
Case 1: 𝐺 has a vertex 𝑣 of degree 𝑛 − 1. In this case, this vertex 𝑣 is adjacent to every other
vertex, so every vertex of 𝐺 has degree at least 1. That is, the degree of each vertex of 𝐺
lies in the set {1, 2, … , 𝑛 − 1}, which has size 𝑛 − 1. Hence by the pigeonhole principle at
least two vertices of 𝐺 must have the same degree.
Case 2: 𝐺 has no vertex of degree 𝑛 − 1. In this case, the degree of each vertex of 𝐺 lies in the
set {0, 1, … , 𝑛 − 2}, which also has size 𝑛 − 1, so again the pigeonhole principle implies
that at least two vertices of 𝐺 must have the same degree.
Definition 5.11. The degree sequence of a graph 𝐺 is the sequence of all degrees of vertices in
𝐺, written in increasing order: (𝑑(𝑣1 ), … , 𝑑(𝑣𝑛 )), where 𝐺 has 𝑛 vertices 𝑣1 , … , 𝑣𝑛 and 𝑑(𝑣𝑖 ) ⩽
𝑑(𝑣𝑖+1 ) for all 𝑖 = 1, … , 𝑛 − 1.
Example 5.12.
(1). The degree sequence of 𝐺 in Figure 5.1 is (1, 2, 2, 2, 2, 3). The degree sequence of the
complement 𝐺̅ is (2, 3, 3, 3, 3, 4).
(2). The degree sequence of both graphs in Figure 5.2 is (2, 2, 2, 3, 3, 4). In fact, isomorphic
graphs have the same degree sequence (left as an exercise for the reader).
Some of the results just given rule out some sequences as being degree sequences of a
graph. For example, (3, 3, 3, 3, 3) cannot be the degree sequence of a graph by Corollary 5.9,
since the sum of the degrees in the sequence is odd. Similarly (0, 1, 2, 3) cannot be the degree
sequence of a graph by Proposition 5.10 since no two of the degrees in the sequence are equal.
Many other sequences can also be seen to be impossible by similar means.
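The checks just described can be bundled into a small filter. The sketch below tests only necessary conditions taken from this section (degrees between 0 and 𝑛 − 1, Corollary 5.9 and Proposition 5.10); the function name is made up for the illustration, and a sequence that passes is not guaranteed to be the degree sequence of any graph.

```python
def passes_necessary_checks(seq):
    """Return False if `seq` is ruled out by the results of this section."""
    n = len(seq)
    # Each degree must be an integer between 0 and n - 1 inclusive.
    if any(d < 0 or d > n - 1 for d in seq):
        return False
    # Corollary 5.9: the number of odd degrees must be even.
    if sum(d % 2 for d in seq) % 2 == 1:
        return False
    # Proposition 5.10: with at least 2 vertices, some degree must repeat.
    if n >= 2 and len(set(seq)) == n:
        return False
    return True
```

For instance, (1, 1, 3, 3) passes all three checks, yet it is not a degree sequence: two vertices of degree 3 in a graph of order 4 are adjacent to every other vertex, which forces the remaining two vertices to have degree at least 2.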
We conclude this section with some further definitions.
Definition 5.13 (Regular). A graph 𝐺 is regular if all vertices have the same degree; if that
common degree is 𝑘 ∈ ℕ0 , then we can say the graph is 𝑘-regular.
Example 5.14. The graph obtained from the vertices and edges of a regular dodecahedron 5 is
3-regular – see Figure 5.3.
5
A regular dodecahedron is a polyhedron made up of 12 flat pentagonal faces.
Figure 5.3: The graph obtained from the vertices and edges of a regular
dodecahedron.
Definition 5.15. A complete graph 𝐺 is a graph that contains all possible edges: 𝐸(𝐺 ) = {𝐴 ⊆
𝑉 (𝐺 ) ∶ |𝐴| = 2}. Notice that 𝐺 has size \binom{|𝑉(𝐺)|}{2}.
We call a complete graph with 𝑛 vertices a 𝐾𝑛 .
A path is a graph 𝐺 of the form 𝑉 (𝐺 ) = {𝑣1 , … , 𝑣𝑛 }, 𝐸(𝐺 ) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛 − 1}, where
𝑛 ∈ ℕ+ is the order of 𝐺. We say that 𝑛 − 1 = |𝐸(𝐺 )| is the length of 𝐺. The vertices 𝑣1 and 𝑣𝑛
are called the end-vertices of 𝐺. We call a path of length 𝑛 a 𝑃𝑛 .
A cycle is a graph 𝐺 of the form 𝑉 (𝐺 ) = {𝑣1 , … , 𝑣𝑛 }, 𝐸(𝐺 ) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛 − 1} ∪ {𝑣1 𝑣𝑛 },
where 𝑛 ⩾ 3 is the order of 𝐺. We say that 𝑛 = |𝐸(𝐺 )| is the length of 𝐺. We call a cycle of length
𝑛 a 𝐶𝑛 .
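[Figure: drawings of the paths 𝑃0 , … , 𝑃5 , the cycles 𝐶3 , … , 𝐶6 and the complete graphs 𝐾1 , … , 𝐾6 ; the drawings of 𝐾1 and 𝑃0 , of 𝐾2 and 𝑃1 , and of 𝐾3 and 𝐶3 coincide.]
[Figure: a graph 𝐺 on the vertices 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓 together with some of its subgraphs, including the induced subgraph 𝐺 [{𝑎, 𝑏, 𝑑, 𝑓 }] and a subgraph 𝐻.]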
Example 5.17.
(1). The graph 𝐺 in Figure 5.5 is shown with several of its subgraphs, some of which are
induced subgraphs.
(3). For every non-negative integer 𝑛, there is a path of length 𝑛 in a 𝑃𝑛+1 , and in a 𝐶𝑛+1
(provided 𝑛 ⩾ 2).
The following proposition summarises some basic facts about (induced) subgraphs.
5.2. SUBGRAPHS, PATHS, AND CYCLES 63
Figure 5.6: The three paths of length 2 with vertex set {𝑎, 𝑏, 𝑐}.
One question of particular interest to us is when copies of these (or other) graphs can be
found in other, larger graphs, and if so, how many such copies there are. For example, we
might ask whether or not a graph contains a path from one vertex to another, or whether a
graph contains a cycle. The remainder of this section investigates a couple of these questions.
Example 5.19. How many paths of length 2 are there in a complete graph 𝐺 with 𝑛 ⩾ 3 vertices?
Solution. Note that there are 3 paths of length 2 with the same vertex set {𝑎, 𝑏, 𝑐}, as shown
in Figure 5.6: their edge sets are {𝑎𝑏, 𝑎𝑐}, {𝑎𝑏, 𝑏𝑐} and {𝑎𝑐, 𝑏𝑐}. So one way to calculate the
number of these paths is to take the number of ways to choose three distinct vertices in 𝑉 (𝐺 ),
which is \binom{𝑛}{3}, and to multiply by 3, since each choice of three vertices supports three different
paths of length 2. So in total there are 3\binom{𝑛}{3} = 𝑛(𝑛 − 1)(𝑛 − 2)/2 paths of length 2 in a complete
graph on 𝑛 ⩾ 3 vertices.
Another approach is to note that any ordered triple (𝑥, 𝑦, 𝑧) of distinct vertices in 𝑉 (𝐺 )
gives a path of length 2, whose vertices are 𝑥, 𝑦 and 𝑧 and whose edges are 𝑥𝑦 and 𝑦𝑧. Recall
that the number of ways to choose three vertices out of 𝑛, with order but no repetition, is
𝑛!/(𝑛 − 3)! = 𝑛(𝑛 − 1)(𝑛 − 2). However, this counts each path twice, since the triples (𝑥, 𝑦, 𝑧) and
(𝑧, 𝑦, 𝑥) give rise to the same path using this method. So in total there are 𝑛(𝑛 − 1)(𝑛 − 2)/2 paths of
length 2 in a complete graph with 𝑛 ⩾ 3 vertices.
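Both counting arguments can be compared with a direct enumeration. In the sketch below a path of length 2 is identified with its set of two edges, each edge being a two-element `frozenset`; the function name and this encoding are assumptions made for the illustration.

```python
from itertools import combinations


def count_paths_of_length_2(n):
    """Count paths of length 2 in a complete graph on vertices 0, ..., n-1."""
    vertices = range(n)
    paths = set()
    for middle in vertices:
        others = [v for v in vertices if v != middle]
        for x, z in combinations(others, 2):
            # The path x - middle - z, stored as its set of two edges.
            paths.add(frozenset({frozenset({x, middle}),
                                 frozenset({middle, z})}))
    return len(paths)


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, count_paths_of_length_2(n), n * (n - 1) * (n - 2) // 2)
```

The enumeration itself suggests a third argument: choose the middle vertex in 𝑛 ways and then the two end-vertices in \binom{𝑛−1}{2} ways.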
Similar arguments can be used to count the number of copies of other small graphs in
standard larger graphs. Often we want to find sufficient (and perhaps necessary) conditions
which ensure that any graph which satisfies these conditions must contain the subgraph we
are looking for. For example, the next lemma shows that any graph with minimum degree at
least two contains a cycle.
Proof. Consider a path 𝑃 with longest possible length 𝑛 in 𝐺, and enumerate 𝑉 (𝑃) = {𝑣1 , … ,
𝑣𝑛+1 } so that 𝐸(𝑃) = {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 1, … , 𝑛}. Observe that 𝑛 ≠ 0 since 𝛿(𝐺 ) ⩾ 2. Thus, since
𝑑𝐺 (𝑣𝑛+1 ) ⩾ 𝛿(𝐺 ) ⩾ 2, there exists a neighbour 𝑣𝑛+2 ∈ 𝑁𝐺 (𝑣𝑛+1 ) different from 𝑣𝑛 . If 𝑣𝑛+2 ∉
{𝑣1 , … , 𝑣𝑛−1 } then we can construct a longer path in 𝐺 by adding the edge 𝑣𝑛+1 𝑣𝑛+2 to 𝑃, which
is a contradiction. Therefore there exists a 𝑘 ∈ {1, … , 𝑛 − 1} such that 𝑣𝑘 𝑣𝑛+1 ∈ 𝐸(𝐺 ) and thus
the graph with vertex set {𝑣𝑘 , … , 𝑣𝑛+1 } and edge set {𝑣𝑖 𝑣𝑖+1 ∶ 𝑖 = 𝑘, … , 𝑛} ∪ {𝑣𝑘 𝑣𝑛+1 } is a cycle in
𝐺 (note that 𝑛 + 1 − 𝑘 + 1 ⩾ 3, which we require since cycles must have at least 3 vertices).
The next theorem improves on Lemma 5.20 by showing that any graph with at least as many
edges as vertices contains a cycle.
Theorem 5.21. Any graph with at least as many edges as vertices has a cycle.
Proof. First, note that for any graph |𝐸(𝐺 )| ⩽ \binom{|𝑉(𝐺)|}{2} = |𝑉 (𝐺 )|(|𝑉 (𝐺 )| − 1)/2, so if |𝐸(𝐺 )| ⩾ |𝑉 (𝐺 )|
then 𝐺 has at least 3 vertices. Also note that there is only one graph (up to isomorphism) with
3 vertices and at least 3 edges, 𝐶3 , which certainly contains a cycle.
Now let 𝑛 ⩾ 3 be an integer such that every graph with 𝑛 vertices and at least 𝑛 edges has
a cycle, and let 𝐺 be a graph with 𝑛 + 1 vertices and at least 𝑛 + 1 edges. If 𝛿(𝐺 ) ⩾ 2 then by
the previous lemma there is a cycle in 𝐺. Now suppose 𝛿(𝐺 ) < 2, so there exists a vertex 𝑣 with
degree 0 or 1. Define 𝐺 ′ ≔ 𝐺 [𝑉 (𝐺 ) ⧵ {𝑣}]. Then 𝐺 ′ has 𝑛 vertices and
𝐸(𝐺 ′ ) = 𝐸(𝐺 ) ⧵ {𝑣𝑤 ∶ 𝑤 ∈ 𝑁𝐺 (𝑣)}, so 𝐺 ′ has at least (𝑛 + 1) − 𝑑𝐺 (𝑣) ⩾ 𝑛 edges. Therefore, by
the induction hypothesis 𝐺 ′ has a cycle. As 𝐺 ′ is a subgraph of 𝐺, it follows that 𝐺 has a cycle too.
Therefore by induction on 𝑛, every graph with at least as many edges as vertices contains
a cycle.
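For small orders the theorem can be confirmed by exhaustive search. The sketch below does not follow the inductive proof; it detects cycles with a union-find structure, a different standard technique, and the function names are assumptions made for this example.

```python
from itertools import combinations


def has_cycle(n, edges):
    """Detect a cycle in a graph on vertices 0, ..., n-1 using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True  # u and v are already joined, so uv closes a cycle
        parent[ru] = rv
    return False


def theorem_holds_for_order(n):
    """Check every graph with n vertices and at least n edges for a cycle."""
    all_edges = list(combinations(range(n), 2))
    for m in range(n, len(all_edges) + 1):
        for edges in combinations(all_edges, m):
            if not has_cycle(n, edges):
                return False
    return True


if __name__ == "__main__":
    print([theorem_holds_for_order(n) for n in range(1, 6)])
```

For orders 1 and 2 there are no graphs with at least as many edges as vertices, so the check passes vacuously, in line with the first paragraph of the proof.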
If the only repeated vertices are 𝑣0 = 𝑣𝑘 (so the walk is closed) and 𝑘 ⩾ 3 then
is a cycle of length 𝑘 in 𝐺.
A graph 𝐺 is connected if for any two vertices 𝑢 and 𝑣 of 𝐺 there is a walk in 𝐺 from 𝑢 to 𝑣; in
other words, 𝑥 ∼𝐺 𝑦 for all 𝑥, 𝑦 ∈ 𝑉 (𝐺 ). If a graph is not connected, we say it is disconnected. A
(connected) component of 𝐺 is a maximal connected subgraph 𝐶 of 𝐺: so 𝐶 is connected and
if 𝐷 is a connected subgraph of 𝐺 and 𝐶 is a subgraph of 𝐷, then 𝐶 = 𝐷.
Proof. Let 𝑥, 𝑦 ∈ 𝑉 (𝐺 ) = 𝑉 (𝐻 ) be given. Then since 𝐻 is connected, there exists a walk (𝑣0 , … ,
𝑣𝑘 ) from 𝑥 to 𝑦 in 𝐻, which means that 𝑣0 , … , 𝑣𝑘 ∈ 𝑉 (𝐻 ), 𝑣0 = 𝑥, 𝑣𝑘 = 𝑦, and 𝑣𝑖 𝑣𝑖+1 ∈ 𝐸(𝐻 ) ⊆
𝐸(𝐺 ) for all 𝑖 = 0, … , 𝑘 − 1. Thus (𝑣0 , … , 𝑣𝑘 ) is a walk from 𝑥 to 𝑦 in 𝐺 and therefore 𝐺 is con-
nected.
(2). If (𝑣0 , … , 𝑣𝑘 ) is a walk in 𝐺, then 𝑣0 , … , 𝑣𝑘 all belong to the same equivalence class of ∼𝐺 .
Figure 5.7: The edges highlighted in green are the edges between consecutive vertices of a walk
without repeated vertices from 𝑏 to 𝑗: (𝑏, 𝑒, 𝑐, 𝑑, 𝑓 , 𝑗). The edges highlighted in blue are
from a closed walk from 𝑓 to itself, but with another repeated vertex:
(𝑓 , 𝑎, 𝑔, ℎ, 𝑘, 𝑎, 𝑖, 𝑓 ). Notice that the graph with vertex set {𝑎, 𝑓 , 𝑔, ℎ, 𝑖, 𝑘} and edge set
{𝑎𝑓 , 𝑎𝑔, 𝑎𝑖, 𝑎𝑘, 𝑓 𝑖, 𝑔ℎ, ℎ𝑘} is not a cycle.
5.3. CONNECTEDNESS AND TREES 67
Corollary 5.25. Let 𝐺 be a graph. Then the connected components of 𝐺 are the induced sub-
graphs of the form 𝐺 [𝐴], where 𝐴 is an equivalence class of ∼𝐺 .
Any graph consists of one or more connected components, and a graph is connected if and
only if it has exactly one connected component. Figure 5.8 shows a disconnected graph and
its components.
Proposition 5.26. Let 𝑢 and 𝑣 be vertices of a graph 𝐺. Then 𝐺 contains a walk from 𝑢 to 𝑣 if
and only if 𝐺 contains a path that includes both 𝑢 and 𝑣 (as end-vertices).
Proof. Let (𝑣0 , … , 𝑣𝑘 ) be a walk from 𝑢 to 𝑣 in 𝐺 with shortest possible length. If 𝑣𝑖 = 𝑣𝑗 for some
0 ⩽ 𝑖 < 𝑗 ⩽ 𝑘, then
(𝑣0 , 𝑣1 , … , 𝑣𝑖−1 , 𝑣𝑖 , 𝑣𝑗+1 , 𝑣𝑗+2 , … , 𝑣𝑘−1 , 𝑣𝑘 )
Figure 5.8: The graph drawn above has 3 components, shown in different colours. One
component consists of just the isolated vertex 𝑔.
are infinitely-many walks between two vertices in the same component but only finitely-many
paths, this proposition is sometimes easier to use.
Proof. We argue by induction. First note that the statement is vacuously true for 𝑛 = 1 since
a graph cannot have a negative number of edges. Let 𝑛 ∈ ℕ+ be given such that every graph
with order 𝑛 and size at most 𝑛 − 2 is disconnected. Let 𝐺 be a graph with order 𝑛 + 1 and size
at most 𝑛 − 1. Then by the handshaking lemma,
If 𝛿(𝐺 ) ⩾ 2 then ∑_{𝑣∈𝑉(𝐺)} 𝑑(𝑣) ⩾ 2|𝑉 (𝐺 )| = 2(𝑛 + 1), which is a contradiction. Thus 𝛿(𝐺 ) < 2
and so there exists a vertex 𝑣 with degree 0 or 1. If 𝑑(𝑣) = 0 then [𝑣]∼𝐺 = {𝑣} ≠ 𝑉 (𝐺 ) since
𝑛 + 1 ⩾ 2, so 𝐺 is disconnected.
Now suppose 𝑑(𝑣) = 1 and consider 𝐺 ′ ≔ 𝐺 [𝑉 (𝐺 )⧵{𝑣}]. Then 𝐺 ′ has 𝑛 vertices and at most
|𝐸(𝐺 )|−𝑑𝐺 (𝑣) ⩽ (𝑛−1)−1 = 𝑛−2 edges, so by the induction hypothesis 𝐺 ′ is disconnected.
Hence, there exist 𝑥, 𝑦 ∈ 𝑉 (𝐺 ′ ) such that 𝑥 ≁𝐺 ′ 𝑦. If 𝑥 ∼𝐺 𝑦 then by Proposition 5.26, there is a
path 𝑃 in 𝐺 with end-vertices 𝑥 and 𝑦, but 𝑃 cannot be a subgraph of 𝐺 ′ by the same propo-
sition. Thus, 𝑣 ∈ 𝑉 (𝑃) ⧵ {𝑥, 𝑦} and so 𝑑𝐺 (𝑣) ⩾ 𝑑𝑃 (𝑣) = 2, which is a contradiction. Therefore
𝑥 ≁𝐺 𝑦 and so 𝐺 is disconnected.
Thus by induction, for all positive integers 𝑛, every graph with 𝑛 vertices and at most 𝑛 − 2
edges is disconnected.
Definition 5.28. A graph is called acyclic if it does not contain a cycle. A tree is a connected
acyclic graph. A leaf of a tree is a vertex 𝑣 with 𝑑(𝑣) = 1.
Trees are an important class of graphs which have many applications: phylogenetic trees,
search trees, decision trees and so forth. They are called trees due to the way they ‘branch out’
– see Figure 5.9. Paths are also trees; they are the trees with maximum degree at most 2.
Our next theorem states that any connected graph contains a tree as a spanning subgraph,
a result with many important applications.7
7
Spanning trees are important because they are minimal connected subgraphs, that is, the smallest number
of edges you need to be able to get from any vertex to any other vertex. For example, if your graph is a road
network, then provided you can keep open the roads (edges) of a spanning tree, then traffic will still be able to
travel from any city (vertex) to any other city, even if all of the other roads are closed.
Proof. Let 𝐺 be a connected graph with 𝑛 vertices and choose a vertex 𝑣 ∈ 𝑉 (𝐺 ). Let 𝑇1 denote
the tree with vertex set {𝑣}. We will construct a sequence (𝑇1 , … , 𝑇𝑛 ) of tree subgraphs of 𝐺,
where for every 𝑘 = 1, … , 𝑛, 𝑇𝑘 has 𝑘 vertices and 𝑇𝑘 is a subtree of 𝑇𝑘+1 (if 𝑘 < 𝑛). Then since
|𝑉 (𝑇𝑛 )| = 𝑛 = |𝑉 (𝐺 )|, 𝑇𝑛 must be a spanning tree of 𝐺.
Let 𝑚 ∈ ℕ+ be given and suppose we have constructed (𝑇1 , … , 𝑇𝑚 ) and 𝑚 < 𝑛. Then there
exists a vertex 𝑢 ∈ 𝑉 (𝐺 ) ⧵ 𝑉 (𝑇𝑚 ). As 𝐺 is connected, there is a walk (𝑤0 , … , 𝑤𝑙 ) from 𝑢 to 𝑣
in 𝐺. Let 𝑖 ∈ {1, … , 𝑙} be the least integer such that 𝑤𝑖 ∈ 𝑉 (𝑇𝑚 ). Then 𝑤𝑖−1 𝑤𝑖 ∈ 𝐸(𝐺 ) and 𝑤𝑖−1 ∈
𝑉 (𝐺 ) ⧵ 𝑉 (𝑇𝑚 ). We claim that 𝑇𝑚+1 ≔ (𝑉 (𝑇𝑚 ) ∪ {𝑤𝑖−1 }, 𝐸(𝑇𝑚 ) ∪ {𝑤𝑖−1 𝑤𝑖 }) is a tree.
Let 𝑥 ∈ 𝑉 (𝑇𝑚 ) be given. Then since 𝑇𝑚 is connected, there exists a walk (𝑥0 , … , 𝑥𝑘 ) from 𝑥 to
𝑤𝑖 in 𝑇𝑚 , and so (𝑥0 , … , 𝑥𝑘 , 𝑤𝑖−1 ) is a walk from 𝑥 to 𝑤𝑖−1 in 𝑇𝑚+1 . Thus 𝑇𝑚+1 is also connected.
Now suppose there is a cycle 𝐶 in 𝑇𝑚+1 . This cycle cannot be a subgraph of 𝑇𝑚 as it is a tree
by assumption. Hence 𝑤𝑖−1 ∈ 𝑉 (𝐶 ), but 𝑑𝑇𝑚+1 (𝑤𝑖−1 ) = 1 as 𝑤𝑖 is the only neighbour of 𝑤𝑖−1
in 𝑇𝑚+1 . This is a contradiction, since cycles have minimum degree 2 (and thus any vertex on
a cycle subgraph must have degree at least 2 in the original graph). Therefore 𝑇𝑚+1 is acyclic
and thus a tree.
Hence by induction, there exists a sequence (𝑇1 , … , 𝑇𝑛 ) of tree subgraphs of 𝐺 as required,
The proof of the above theorem gives an algorithm for finding a spanning tree, which we
describe below:
Algorithm 5.30. The algorithm to find a spanning tree in a connected graph is described in
the steps below:
(1). Input a graph G, choose a vertex v ∈ V(G). Define n ≔ 0 and let T0 denote the tree with
vertex set {v}.
(2). If Tn spans G (V(Tn ) = V(G)) then we halt the algorithm and output Tn . Otherwise, we
continue to the next step.
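The remaining steps of the algorithm are not reproduced here, so the following sketch assumes they mirror the proof of Theorem 5.29: in each round, one vertex outside the current tree is attached by a single edge to a vertex inside it. The function name, the edge-list input and the error raised for disconnected inputs are all choices made for this illustration.

```python
def spanning_tree(vertices, edges, start):
    """Grow a tree from `start`, adding one new vertex and one edge per round.

    `edges` is a list of pairs (u, v). Returns (tree vertices, tree edges).
    """
    tree_vertices = {start}
    tree_edges = set()
    while tree_vertices != set(vertices):
        for u, v in edges:
            # Look for an edge with exactly one end-vertex in the tree so far.
            if (u in tree_vertices) != (v in tree_vertices):
                tree_vertices |= {u, v}
                tree_edges.add(frozenset({u, v}))
                break
        else:
            raise ValueError("graph is not connected")
    return tree_vertices, tree_edges


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(spanning_tree(range(4), cycle, start=0))
```

Each round adds a vertex of degree 1 to the tree, which is why no cycle can appear, and the loop stops once the tree spans the graph.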
Example 5.31.
(1). For any cycle 𝐶𝑛 , if we remove a single edge then we get a 𝑃𝑛−1 which is a spanning subtree.
All spanning trees of a 𝐶𝑛 are 𝑃𝑛−1 ’s.
(2). Figure 5.10 shows a spanning tree for the connected graph from Figure 5.7. There are
2,082 spanning trees for this graph.
Theorem 5.32. Let 𝐺 be a graph with 𝑛 vertices. Then any two of the following properties imply
the third:
• 𝐺 is connected.
• 𝐺 is acyclic.
• 𝐺 has size 𝑛 − 1.
Proof. First, we assume 𝐺 is a tree; i.e. it is connected and acyclic. As 𝐺 is acyclic, by Theorem
5.21 𝐺 has at most 𝑛 − 1 edges, whereas since 𝐺 is connected, by Theorem 5.27 𝐺 has at least
𝑛 − 1 edges. Therefore 𝐺 has exactly 𝑛 − 1 edges.
Figure 5.10: A spanning tree 𝑇 for the graph from Figure 5.7.
5.4. BIPARTITE GRAPHS 73
Now assume 𝐺 is connected and has 𝑛 − 1 edges. Then by Theorem 5.29, 𝐺 contains a
spanning tree 𝑇. By the first part of this proof, we know that 𝑇 has 𝑛 − 1 edges and as 𝐸(𝑇 ) ⊆
𝐸(𝐺 ), it follows that 𝐺 = 𝑇, so 𝐺 is acyclic.
Finally, assume 𝐺 is acyclic and has size 𝑛 − 1. Then each component 𝐶 of 𝐺 is connected
and acyclic, so again by the first part of this proof |𝐸(𝐶 )| = |𝑉 (𝐶 )| − 1. If 𝐶1 , … , 𝐶𝑘 are the
components of 𝐺, then since each edge belongs to exactly one component, by the sum rule
\[
\begin{aligned}
n = |V(G)| &= \Bigl| \bigcup_{i=1}^{k} V(C_i) \Bigr| = \sum_{i=1}^{k} |V(C_i)| \\
\Rightarrow\ n - 1 = |E(G)| &= \Bigl| \bigcup_{i=1}^{k} E(C_i) \Bigr| = \sum_{i=1}^{k} |E(C_i)| = \sum_{i=1}^{k} (|V(C_i)| - 1) = \Bigl( \sum_{i=1}^{k} |V(C_i)| \Bigr) - k = n - k.
\end{aligned}
\]
Thus 𝑘 = 1, so 𝐺 is connected.
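As an illustration, the statement that any two of the three properties force the third can be tested on every graph of small order: it says that a graph never has exactly two of them. The sketch below uses union-find to decide connectedness and acyclicity, which is a technique assumed for this example and not the method of the proof above.

```python
from itertools import combinations


def connected_and_acyclic(n, edges):
    """Return (is connected, is acyclic) for a graph on vertices 0, ..., n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    components, acyclic = n, True
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            acyclic = False      # uv joins two vertices that were already linked
        else:
            parent[ru] = rv      # uv merges two components
            components -= 1
    return components == 1, acyclic


def never_exactly_two(n):
    """Check that no graph of order n has exactly two of the three properties."""
    all_edges = list(combinations(range(n), 2))
    for m in range(len(all_edges) + 1):
        for edges in combinations(all_edges, m):
            connected, acyclic = connected_and_acyclic(n, edges)
            if connected + acyclic + (m == n - 1) == 2:
                return False
    return True
```

Running `never_exactly_two(n)` for 𝑛 from 1 to 5 covers every graph on at most five labelled vertices.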
Another useful way to think of bipartite graphs is as those graphs whose vertices are 2-colourable:
we can ‘colour’ the vertices in at most 2 possible colours in such a way that no adjacent vertices
have the same colour. The sets of vertices with the same colour are then vertex classes. See
Figure 5.11 for an example. Note that there may be several ways to choose vertex classes of a
bipartite graph.
Bipartite graphs arise naturally in applications where a graph represents connections be-
tween different types of objects. For example, a scheduling problem might consider a graph
whose vertices are students and classes, where there is an edge between a student and a class
if that student is taking that class; in this context an edge between two students, or between
two classes, would not make sense, and the vertex classes would be the set of vertices corre-
sponding to students, and the set of vertices corresponding to classes.
Figure 5.11: An example of a bipartite graph, with two vertex classes {𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓 , 𝑔}
and {ℎ, 𝑖, 𝑗, 𝑘, 𝑙}, shown in separate rows. A different pair of vertex classes is
{𝑎, 𝑏, 𝑑, 𝑔, 𝑗, 𝑙} and {𝑐, 𝑒, 𝑓 , ℎ, 𝑖, 𝑘}, which we have coloured differently. Note that adjacent
vertices lie in different rows / have different colours.
In a bipartite graph any walk must move from one vertex class to the other with each step.
As a closed walk alternates between the two vertex classes and ends in the class where it
started, it must have even length.
Proposition 5.34. Any closed walk in a bipartite graph has even length.
Proof. Let 𝐺 be a bipartite graph, and let 𝑊 = (𝑤0 , … , 𝑤𝑘 ) be a closed walk in 𝐺. Let 𝑉1 , 𝑉2 be a pair of
vertex classes of 𝐺, chosen so that 𝑤0 ∈ 𝑉2 . Since 𝑤𝑖 𝑤𝑖+1 is an edge of 𝐺 for each 𝑖 = 0, … , 𝑘−1,
the vertices 𝑤𝑖 and 𝑤𝑖+1 must lie in different vertex classes, and so it follows that 𝑤𝑖 ∈ 𝑉2 if 𝑖 is
even and 𝑤𝑖 ∈ 𝑉1 if 𝑖 is odd. In particular, as 𝑤𝑘 = 𝑤0 ∈ 𝑉2 , 𝑘 must be even.
The next proposition implies that in any graph, a shortest closed walk of odd length (if such
a walk exists) is an odd cycle.
Proposition 5.35. If a graph contains a closed walk of odd length, then it contains a cycle of
odd length.
Proof. Let 𝐺 be a graph and let 𝑊 = (𝑤0 , … , 𝑤𝑘 ) be a closed walk of shortest possible odd
length. Suppose this walk has a repeated vertex other than 𝑤0 = 𝑤𝑘 , so 𝑤𝑖 = 𝑤𝑗 for some
𝑖, 𝑗 = 0, … , 𝑘 with 𝑖 < 𝑗 and (𝑖, 𝑗) ≠ (0, 𝑘). Then (𝑤0 , … , 𝑤𝑖 , 𝑤𝑗+1 , … , 𝑤𝑘 ) and (𝑤𝑖 , … , 𝑤𝑗 ) are
closed walks with lengths 𝑘 − (𝑗 − 𝑖) and 𝑗 − 𝑖 respectively. As 𝑘 is odd, it follows that one of these
walks also has odd length shorter than 𝑘, which gives a contradiction. Therefore the only repeated
vertex is 𝑤0 = 𝑤𝑘 . If 𝑘 = 1 then 𝑤0 = 𝑤1 and 𝑤0 𝑤1 ∈ 𝐸(𝐺 ), which is a contradiction. Thus 𝑘 ⩾ 3 and so
𝐶 (𝑤0 , … , 𝑤𝑘 ) is a cycle of odd length in 𝐺.
Note that the analogous statement for even length walks and cycles is not true; a graph
with no cycles of even length may contain a closed walk of even length.
Proof. Proposition 5.34 shows that a bipartite graph has no closed walks of odd length and thus no
odd-length cycles, so let 𝐺 be a connected graph that is not bipartite. Choose a vertex 𝑣 and for
every 𝑛 ∈ ℕ0 , let 𝐴_𝑛 be the set of vertices 𝑥 for which the shortest path from 𝑣 to 𝑥 has length 𝑛.
As 𝐺 is connected, it follows that 𝑉 (𝐺 ) = ⋃_{𝑛=0}^{∞} 𝐴_𝑛 . If we define 𝐴_even ≔ ⋃_{𝑛=0}^{∞} 𝐴_{2𝑛} and
𝐴_odd ≔ ⋃_{𝑛=0}^{∞} 𝐴_{2𝑛+1} , then 𝐴_even ∪ 𝐴_odd = 𝑉 (𝐺 ) and 𝐴_even ∩ 𝐴_odd = ∅.
Note that for all 𝑛 ∈ ℕ0 and all 𝑥 ∈ 𝑉 (𝐺 ):
\[ x \in A_{n+1} \iff x \notin \bigcup_{k=0}^{n} A_k \text{ and } N(x) \cap A_n \neq \emptyset. \]
\[ U \cap V = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k} (U_i \cap V_j) = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k} \emptyset = \emptyset. \]
are different equivalence classes of ∼𝐺 and 𝑢 ∼𝐺 𝑣. Therefore there is no edge between vertices
of 𝑈 and likewise no edge between vertices of 𝑉, so 𝑈, 𝑉 are a pair of vertex classes for 𝐺,
proving that 𝐺 is bipartite.
An immediate consequence of this theorem is that trees, and in particular paths, are bi-
partite, as are 𝐶𝑛 ’s for even 𝑛 ⩾ 4. The proof of this theorem contains within it the following
algorithm, and also justifies that the algorithm will always halt on any input and is correct.
Figures 5.12 and 5.13 show example runs of this algorithm on some connected graphs.
Algorithm 5.37. This algorithm will either output a pair of vertex classes (for when the input
is bipartite) or will output a closed walk with odd length at least 3 (for when the input is not
bipartite):
(1). Input a graph G. For each component C of G, run through steps (a) to (d):
\[ \Bigl( \bigcup_{\substack{k=0, \\ k \text{ is even}}}^{n} A_k ,\ \bigcup_{\substack{k=0, \\ k \text{ is odd}}}^{n} A_k \Bigr) \]
(2). If C1 , … , Ck are the components of G, and for each i = 1, … , k, (Ui , Vi ) is the result of steps
(a) to (d) for component Ci , then output
\[ \Bigl( \bigcup_{i=1}^{k} U_i ,\ \bigcup_{i=1}^{k} V_i \Bigr). \]
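Steps (a) to (d) are not reproduced here, so the sketch below assumes they build the layers 𝐴𝑛 by a breadth-first search, as in the proof of the theorem. It is a simplified illustration: the name `vertex_classes` and the adjacency-dictionary input are assumptions, and for a non-bipartite input it returns `None` rather than constructing the closed walk of odd length that Algorithm 5.37 outputs.

```python
from collections import deque


def vertex_classes(adjacency):
    """Return a pair of vertex classes (U, V), or None if the graph is not bipartite.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    distance = {}
    for start in adjacency:          # one breadth-first search per component
        if start in distance:
            continue
        distance[start] = 0
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adjacency[x]:
                if y not in distance:
                    distance[y] = distance[x] + 1   # y lies in the next layer
                    queue.append(y)
                elif distance[y] % 2 == distance[x] % 2:
                    return None      # an edge inside one class
    evens = {x for x, d in distance.items() if d % 2 == 0}
    odds = set(distance) - evens
    return evens, odds


if __name__ == "__main__":
    c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
    print(vertex_classes(c4), vertex_classes(c5))
```

Vertices at even distance from the start of their component form one class and those at odd distance form the other, which corresponds to the unions over even and odd layers in steps (1) and (2).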
Figure 5.12: All trees are bipartite. Each vertex is labelled with a number 𝑛 that indicates
the shortest length of paths between this vertex and a fixed starting vertex (the one
labelled with 0). At each stage, we colour the new vertices connected to the previous
stage, alternating between red for the even stages, and blue for the odd stages. So those
labelled with 𝑛 + 1 are connected by an edge with those labelled with 𝑛. Since
Figure 5.13: A non-bipartite graph. The algorithm starts at vertex 𝑣 (labelled with 0)
and there is an edge between the two vertices 𝑥, 𝑤 in 𝐴2 . Working our way back to 𝑣,
we find that there is a closed walk with length 5 in our graph: (𝑣, 𝑢, 𝑥, 𝑤, 𝑦, 𝑣).
Appendix A
Changelog
23rd April 2021: Added section 5.4 and made several corrections to section 5.3:
• noted that each 𝑇𝑘 is a subtree of 𝑇𝑘+1 in the first paragraph of the proof of Theorem
5.29,
• amended Proposition 5.26 to note that 𝑢 and 𝑣 are end-vertices of the path, and
19th April 2021: Added changelog as an appendix and included sections 5.1 and 5.2.
11th April 2021: Fixed several minor typos and changed [0, 1] to [0, 1) in proof of Theorem
4.14.
• clarified the equivalence classes and quotient set of the equivalence relation 𝐶 in
chapter 2,
• fixed various index, hyperlink, and caption issues, and
• corrected other minor issues.
1st April 2021: Included first half of chapter 3 and corrected typo in relation-property corre-
spondence in chapter 2.
27th March 2021: Included chapter 2 and corrected typo in general inclusion-exclusion for-
mula (Theorem 1.15).
Notation
[𝑎]∼ , 26
𝐾𝑛 , 60
𝑛!, 33
\binom{𝑛}{𝑟}, 37
ℕ0 , ℕ+ , 11
𝑑𝐺 (𝑣), 𝑑(𝑣), 57
ℙ(𝐸), 32
∏_{𝑚=1}^{𝑛} 𝑎𝑛 , 33
𝐴/∼, 26
𝑃𝑛 , 60
𝑎𝑅𝑏, 𝑎𝑅̸𝑏, 23
𝒫(𝐴), 16
⌈𝑥⌉, 9
𝐴 × 𝐵, 𝐴1 × ⋯ × 𝐴𝑟 , 15
𝐶𝑛 , 60
ℚ, 11
𝐴^𝑐 , 15
ℝ, 11
𝛿(𝐺 ), Δ(𝐺 ), 57
𝐴 ⧵ 𝐵, 15
{…}, 11
|𝐴|, 17, 49
𝑢𝑣, 55
𝐴 ⊆ 𝐵, 𝐴 ⫋ 𝐵, 12
≼, 49
≺, 52
∪, ⋃, 13
∅, 13
(𝑉 , 𝐸), 55
≈, 49
≡𝑚 , 25
⌊𝑥⌋, 9
𝐺 [𝐴], 60
ℤ, 11
𝑥 ∈ 𝐴, 𝑥 ∉ 𝐴, 11
∩, ⋂, 13
𝐾𝑚,𝑛 , 73
Index
杨辉三角形, 46
Acyclic, 69
Bijection, 49, 50
Binary
relation, 23
Binomial
coefficient, 37–42, 45–47, 60, 63
expansion, 46
theorem, 46
Bipartite graph, 73–76
complete, 73
Cantor’s theorem, 52
Cantor-Schröder-Bernstein theorem, 50
Cardinality, see size of set
Cards, 10, 39
Cartesian product, see product of sets
Ceiling, 9
Choice
ordered, 33–37, 40–42
unordered, 37–42
with repetition, 34, 35, 40, 41
without repetition, 34, 36, 38, 40, 41
Connected, 64–73
component, 65, 68
Copy of a graph, 64
Countable, 50
Countably-infinite, 50
Cycle
graph, 60, 61, 64, 69, 74, 75
length, 74, 75
Decimal, see Real numbers
expansions, 7
Decimal places, 32
Degree, 57–59
maximum, 57
minimum, 57, 64
sequence, 59
Denominator, 7
Disjoint, 12
Distributive law, 19
Divides, 20
Edge, 55, 57
Element, 11
common, 12
Empty set, 13
Equinumerous, 49
Equivalence
class, 26
relation, 26
Extension Axiom, 12
Leaf, 69
Length, see cycle / walk length
Quotient
set, 26
Reflexive, 25
Regular graph, 59
Rounding, see Decimal places
Set, 11–17
complement, 15, 44
difference, 15
distinct, 12
equality, 12
intersect, 12, 18, 19
Significant figures, 32
Size
of graph, 56
of set, 17–19, 21, 22, 47, 48
Subgraph, 60
Subset, 12
Sum rule, 18, 31
Symmetric, 25
Transitive, 25
Tree, 64–73
spanning, 70
Walk, 64–73
closed, 65, 74
length, 64, 74