
Lecture notes for

F17SC Discrete Mathematics

Dr Matteo Capoferri1 and Dr Adrian Turcanu2

Department of Mathematics
Heriot-Watt University

AY 2022–2023

1 Senior Course Leader, Edinburgh. E-mail: [email protected], Webpage: https://mcapoferri.com.
2 Dubai. E-mail: [email protected].
Abstract

This course is in five parts. The first part is concerned with the foundations of
set and function theory, alongside elements of combinatorics. The second part deals
with probability theory and applications. The third part consists in Graph Theory,
indispensable for students of computer science and applied mathematics. The fourth
part is concerned with recurrence relations and, finally, the fifth part provides an
introduction to matrix theory and linear algebra.

Each section is complemented by a list of exercises. Exercises for which a solution
is provided are marked with a (∗).

All the teaching materials for the course are available on Canvas.

All you need for the course is contained in these lecture notes and in the
accompanying Problem Sets, which will gradually become available on Canvas. However,
if you wish to consult a textbook, you may refer to

• P. Grossman, Discrete Mathematics for Computing (has a nice introduction to
set theory, combinatorics and graphs with a range of examples from computer
science)

• A. Chetwynd and P. Diggle, Discrete Mathematics (a short, very friendly
textbook, covering combinatorics and probability)

• K. H. Rosen, Discrete Mathematics and Its Applications (the graph theory
chapters are particularly useful)

• S. B. Maurer and A. Ralston, Discrete Algorithmic Mathematics (covers linear
algebra, probability and applications to algorithms)

• S. Lipschutz, Schaum’s Outline of Theory and Problems of Linear Algebra (covers
matrices and linear equations)

Updated: February 6, 2023.

Acknowledgements: I am grateful to Dr Anatoly Konechny for making his teaching
materials available to me. These notes are largely based on his.
Contents

1 Set Theory and Combinatorics
1.1 Sets
1.1.1 Exercises
1.2 Set Operations
1.2.1 Exercises
1.3 Computer Representation of Sets
1.3.1 Exercises
1.4 Relations
1.4.1 Exercises
1.5 Functions
1.5.1 Exercises
1.6 Mathematical Induction
1.6.1 Exercises
1.7 The Sum and Product Rules
1.7.1 Exercises
1.8 Permutations and Combinations
1.8.1 Exercises
1.9 The Inclusion-Exclusion Principle
1.9.1 Exercises

2 Probability
2.1 Probability Space
2.1.1 Exercises

Chapter 1

Set Theory and Combinatorics

1.1 Sets
Definition 1.1. A set is an unordered collection of objects.

The notation A = {1, 3, 5, 7} means that the set A contains the four elements
1, 3, 5, 7. We write a ∈ A to denote that a is an element of the set A. We write
a ∉ A to indicate that a does not belong to A. For example, given A = {1, 3, 5, 7}
we have that 5 ∈ A and 6 ∉ A.
There are a number of ways of denoting sets. Thus for the set of positive even
numbers we could write
A = {2, 4, 6, 8, . . .}
or
A = {x | x is a positive even number} ,
where the vertical bar | is to be read “such that”.
A set cannot contain the same object more than once. Also, the elements in a
set are not ordered so the sets {a, b, c} and {c, a, b} are the same. In general, two
sets are equal if they contain the same elements.

Definition 1.2. A set B is a subset of a set A if every element of B is also an
element of A.

If B is a subset of A we write B ⊂ A. For example, if A = {1, 3, 5, 7} and
B = {3, 7}, then B ⊂ A.

Remark 1.1. You should take care to distinguish between ‘B is a subset of A’
(B ⊂ A) and ‘B is an element of A’ (B ∈ A). For example, if

A = {1, {3, 5}, 3, 6, 7},

then all of the following are true:

3 ∈ A, {3, 5} ∈ A, 5 ∉ A, {3, 6} ⊂ A, {3} ⊂ A.


The set that has no elements is called the empty set and is denoted by ∅ . By
convention ∅ ⊂ A for any set A.
Remark 1.2. Note that the set ∅ and the set B = {∅} are different. The single
element of B is the empty set itself.

Definition 1.3. An ordered pair is a pair (a, b) , where a comes before b.


Two pairs (a, b) and (c, d) are equal whenever a = c and b = d . Note that
(3, 8) and (8, 3) are not equal.
Definition 1.4. If A and B are sets then the Cartesian product A × B is the set
of all ordered pairs (a, b) with a ∈ A and b ∈ B .

Example 1.5. Let A = {1, 2, 3} and B = {x, y} . Find the Cartesian product
A×B.
Solution. The Cartesian product A × B is

A × B = { (1, x), (1, y), (2, x), (2, y), (3, x), (3, y) }.
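
The construction in Example 1.5 can be reproduced on a computer. The following is a minimal Python sketch, not part of the course materials; the helper name `cartesian_product` is hypothetical and the elements x, y are modelled as the strings "x" and "y".

```python
from itertools import product

# Sets from Example 1.5; x and y are modelled as strings (an assumption).
A = [1, 2, 3]
B = ["x", "y"]


def cartesian_product(first, second):
    """Return the set of all ordered pairs (a, b) with a in first, b in second."""
    return set(product(first, second))


AxB = cartesian_product(A, B)
BxA = cartesian_product(B, A)

print(sorted(AxB))   # the six pairs listed in Example 1.5
print(AxB == BxA)    # ordered pairs: A x B and B x A differ
```

Because pairs are ordered, (1, "x") belongs to A × B while ("x", 1) belongs to B × A, which is why the two products are different sets.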

1.1.1 Exercises

Exercise 1.1.1 (∗). List the elements of the set

{x | x is the square of an integer and x < 25} .

Solution. We have

{x | x is the square of an integer and x < 25} = {0, 1, 4, 9, 16} .

Exercise 1.1.2 (∗). Find B × A for Example 1.5.

Solution. B × A = { (x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3) } .

1.2 Set Operations


Definition 1.6. If A and B are sets then the union of A and B , A∪B is defined
as
A ∪ B := {x | x ∈ A or x ∈ B}.

The symbol “:=” means “equal by definition”.

Definition 1.7. The intersection of A and B , A ∩ B is defined as

A ∩ B := {x | x ∈ A and x ∈ B}.

Note that the statement x ∈ A or x ∈ B does not exclude the possibility that
x is an element of both sets. For example, we have

{1, 3, 5, 7} ∪ {2, 5, 7} = {1, 2, 3, 5, 7}

and
{1, 3, 5, 7} ∩ {2, 5, 7} = {5, 7} .
Remark 1.3. It is useful to picture combinations of sets with the aid of Venn dia-
grams. This will be done in the lectures.

Definition 1.8. If A and B are sets, then the difference of A and B is denoted
by A \ B and defined as

A \ B := {x | x ∈ A and x ∉ B}.

For example, if A = {1, 3, 5, 7} and B = {4, 5, 6, 7} then A \ B = {1, 3} .


Often, all the sets under consideration are subsets of some larger set U called the
universal set.

Definition 1.9. The complement of a set A is $\overline{A} = U \setminus A$.

Thus $\overline{A} = \{x \in U \mid x \notin A\}$.

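Example 1.10. Let U = {1, 2, 3, 4, 5, 6, 7}, A = {1, 7}, B = {1, 2, 4, 6} and
C = {1, 3, 5, 7}. Find the sets
(a) $C \setminus A$, (b) $A \cap \overline{B}$, (c) $(A \cap B) \cup \overline{C}$, (d) $\overline{A \cup B}$, (e) $\overline{(C \setminus A)} \cap B$.
Solution. (a) $C \setminus A = \{3, 5\}$.
(b) $\overline{B} = \{3, 5, 7\}$, so $A \cap \overline{B} = \{7\}$.
(c) $A \cap B = \{1\}$ and $\overline{C} = \{2, 4, 6\}$. Hence $(A \cap B) \cup \overline{C} = \{1, 2, 4, 6\}$.
(d) $A \cup B = \{1, 2, 4, 6, 7\}$, so $\overline{A \cup B} = \{3, 5\}$.
(e) $\overline{C \setminus A} = \{1, 2, 4, 6, 7\}$, so $\overline{(C \setminus A)} \cap B = \{1, 2, 4, 6\}$.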
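
The answers to Example 1.10 can be checked with Python's built-in `set` type. This is a minimal sketch, assuming the complements in parts (b) to (e) are the ones implied by the solution; the helper `complement` is a hypothetical name introduced here for illustration.

```python
# Sets from Example 1.10.
U = {1, 2, 3, 4, 5, 6, 7}
A = {1, 7}
B = {1, 2, 4, 6}
C = {1, 3, 5, 7}


def complement(subset, universe=U):
    """Complement relative to the universal set: all elements of U not in subset."""
    return universe - subset


part_a = C - A                      # difference C \ A
part_b = A & complement(B)          # A intersected with the complement of B
part_c = (A & B) | complement(C)    # (A ∩ B) united with the complement of C
part_d = complement(A | B)          # complement of A ∪ B
part_e = complement(C - A) & B      # complement of (C \ A), intersected with B

print(part_a, part_b, part_c, part_d, part_e)

# De Morgan's laws hold for these particular sets.
print(complement(A | B) == complement(A) & complement(B))
print(complement(A & B) == complement(A) | complement(B))
```

The operators `|`, `&` and `-` correspond to union, intersection and difference respectively.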

Set operations satisfy the following laws.

A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
A ∪ B = B ∪ A
A ∩ B = B ∩ A
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
$\overline{A \cup B} = \overline{A} \cap \overline{B}$
$\overline{A \cap B} = \overline{A} \cup \overline{B}$
Note that the empty set ∅ satisfies the following properties A ∪ ∅ = A and
A ∩ ∅ = ∅ for any set A.

1.2.1 Exercises

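Exercise 1.2.1 (∗). Let U = {1, 2, 3, 4, 5, 6, 7, 8}, A = {3, 8}, B = {2, 4, 6, 8} and
C = {1, 3, 5, 7}.
Find the sets
(a) $C \setminus A$, (b) $A \cap \overline{B}$, (c) $(A \cap B) \cup \overline{C}$,
(d) $\overline{A \cap B}$, (e) $\overline{(C \setminus A)} \cap B$.

Solution. (a) $C \setminus A = \{1, 5, 7\}$.
(b) $A \cap \overline{B} = \{3, 8\} \cap \{1, 3, 5, 7\} = \{3\}$.
(c) $A \cap B = \{8\}$ and $\overline{C} = \{2, 4, 6, 8\}$, so $(A \cap B) \cup \overline{C} = \{2, 4, 6, 8\}$.
(d) $\overline{A \cap B} = \{1, 2, 3, 4, 5, 6, 7\}$.
(e) $\overline{C \setminus A} = \{2, 3, 4, 6, 8\}$, so $\overline{(C \setminus A)} \cap B = B$.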

Exercise 1.2.2 (∗). Let $C = A \cap B$ and $D = A \cap \overline{B}$. Find $C \cap D$.

Solution. $C \cap D = \emptyset$ since there are no elements that are in both $B$ and $\overline{B}$.

Exercise 1.2.3 (∗). What can you say about sets A and B if
(a) A∪B = A ? (b) A∩B = A ? (c) A∩B = B ∩A ? (d) A\B = A ?

Solution. (a) If A ∪ B = A then B ⊂ A.
(b) If A ∩ B = A then A ⊂ B.
(c) Since A ∩ B = B ∩ A is always true we cannot say anything about A and B.
(d) Suppose that A \ B = A and x ∈ A ∩ B. Then x ∈ A and x ∈ B so
x ∉ A \ B. Since A \ B = A, this contradicts x ∈ A, so no such x can exist and A ∩ B = ∅.

1.3 Computer Representation of Sets


There are many ways to represent sets on a computer. We could store the elements
of a set in an unordered fashion. If we did this, it would be time-consuming to
compute the union or intersection of two sets since these operations would require a
large amount of searching for elements. We outline below a method for storing the
elements of a set which makes computing set operations easy.

• Step 1. We order a finite universal set $U = \{u_1, u_2, \ldots, u_n\}$.

• Step 2. Given a subset A of U, we represent A by a bit string of length n
where the k-th bit in the string is 1 if $u_k \in A$ and 0 otherwise.

As an example, let U = {1, 2, 3, 4, 5, 6}, A = {2, 3} and B = {3, 4, 5, 6}. Then
the bit string for A is 0 1 1 0 0 0 and the bit string for B is 0 0 1 1 1 1.
Given the bit strings for A and B, one can easily compute the bit strings for $\overline{A}$,
A ∪ B and A ∩ B as follows.

− To find the bit string for $\overline{A}$ from the bit string for A we turn each 1 into a 0
and each 0 into a 1.

For the above example, the bit string for A is 0 1 1 0 0 0 so the bit string for $\overline{A}$
is 1 0 0 1 1 1.

− The k th bit in the bit string for A ∪ B is 1 if either (or both) of the bits in
the k th position of A and B is 1, and is 0 if both bits are 0.

− The k th bit in the bit string for A ∩ B is 1 if both of the bits in the k th
position of A and B are 1 , and is 0 if either (or both) of the two bits is 0.

As an example, let U = {1, 2, 3, 4, 5} , A = {1, 3, 5} and B = {3, 4} . The bit


string for A is 1 0 1 0 1 whereas the bit string for B is 0 0 1 1 0 . Therefore, the bit
string for A ∪ B is 1 0 1 1 1 and the bit string for A ∩ B is 0 0 1 0 0 .
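
The bit-string method described above can be sketched in Python as follows. This is an illustrative sketch rather than a prescribed implementation: bit strings are modelled as lists of 0s and 1s, and all function names (`to_bits`, `from_bits`, `bit_complement`, `bit_union`, `bit_intersection`) are hypothetical.

```python
def to_bits(subset, universe):
    """Bit string of subset: k-th bit is 1 if the k-th element of universe is in subset."""
    return [1 if u in subset else 0 for u in universe]


def from_bits(bits, universe):
    """Recover the subset represented by a bit string."""
    return {u for u, bit in zip(universe, bits) if bit == 1}


def bit_complement(bits):
    """Turn each 1 into a 0 and each 0 into a 1."""
    return [1 - b for b in bits]


def bit_union(first, second):
    """k-th bit is 1 if either (or both) of the k-th bits is 1."""
    return [p | q for p, q in zip(first, second)]


def bit_intersection(first, second):
    """k-th bit is 1 only if both k-th bits are 1."""
    return [p & q for p, q in zip(first, second)]


# Second worked example from the text; U is a list so that its order is fixed.
U = [1, 2, 3, 4, 5]
a = to_bits({1, 3, 5}, U)
b = to_bits({3, 4}, U)

print(a, b)
print(bit_union(a, b), bit_intersection(a, b), bit_complement(a))
print(from_bits(bit_union(a, b), U))
```

Each operation inspects every position exactly once, so its cost is proportional to the size n of the universal set, with no searching for elements.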

1.3.1 Exercises

Exercise 1.3.1 (∗). Let U = {1, 2, 3, 4, 5, 6, 7}, A = {3, 4, 5, 7} and B = {5, 6}.
Find the bit strings for A, B, $\overline{A}$, A ∪ B, and A ∩ B.

Solution. The bit string for A is 0 0 1 1 1 0 1, and the bit string for B is 0 0 0 0 1 1 0.
The bit strings for $\overline{A}$, A ∪ B and A ∩ B are 1 1 0 0 0 1 0, 0 0 1 1 1 1 1 and
0 0 0 0 1 0 0.

Exercise 1.3.2 (∗). Which subsets of a finite universal set U do these bit strings
represent?
(a) the string with all zeros, (b) the string with all ones.

Solution. (a) The empty set. (b) The set U .

Exercise 1.3.3 (∗). Explain how to find the bit string corresponding to A \ B.

Solution. For A \ B , the k th bit is 1 if the k th bit of the first string is 1 and the
k th bit of the second string is 0, and is 0 otherwise.
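
The rule for A \ B stated in this solution can be tried on the sets of Exercise 1.3.1. A minimal Python sketch, with bit strings modelled as lists of 0s and 1s and with hypothetical helper names:

```python
def to_bits(subset, universe):
    """Bit string of subset with respect to an ordered universe."""
    return [1 if u in subset else 0 for u in universe]


def bit_difference(first, second):
    """k-th bit is 1 if the k-th bit of first is 1 and the k-th bit of second is 0."""
    return [p & (1 - q) for p, q in zip(first, second)]


# Sets from Exercise 1.3.1.
U = [1, 2, 3, 4, 5, 6, 7]
A = {3, 4, 5, 7}
B = {5, 6}

difference_bits = bit_difference(to_bits(A, U), to_bits(B, U))
print(difference_bits)
print(difference_bits == to_bits(A - B, U))  # agrees with Python's own set difference
```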

1.4 Relations
Let A and B be sets. A relation between A and B links certain elements of A with
certain elements of B. More formally, a relation from A to B is a subset of A × B.
A relation on the set A is a relation from A to A.

Here are some examples of relations:

1. Let A be the set of students at Heriot-Watt University and let B be the set
of modules. The relation R consists of the pairs (a, b), a ∈ A, b ∈ B, such that
student a is enrolled in module b.

2. Let R be the relation on the set of people consisting of pairs (a, b) where a is
a parent of b.

3. Let A = {1, 2, 3, 4} and B = {x, y} . Then

R = { (1, y), (2, x), (3, x), (4, y) }

is a relation from A to B.

4. The following are relations on the set of integers:

R1 = {(a, b)|a ≥ b},

R2 = {(a, b)|a − b = 4}.

5. Let A = {1, 2, 3, 4} and let R be the relation on A given by

R = {(a, b)|a divides b }.

Then R contains the pairs

(1, 1) (1, 2) (1, 3) (1, 4) (2, 2) (3, 3) (4, 4) (2, 4) .



If S is a (small) finite set, then we can represent a relation R on S by a directed
graph, a network of blobs and arrows. We have a blob for each element of S, and
draw an arrow from x to y if (x, y) ∈ R.
Let S = {a, b, c, d} and let R be the relation defined by the subset
{(a, a), (a, b), (b, b), (b, c), (c, b), (c, d), (d, d)}.
The directed graph encoding this relation is shown below.

Definition 1.11. Let R be a relation on a set A. We say that R is


• reflexive if (a, a) ∈ R for all elements a in A;
• symmetric if whenever (a, b) ∈ R then (b, a) ∈ R;
• antisymmetric if whenever (a, b) ∈ R and (b, a) ∈ R then a = b;
• transitive if whenever (a, b) ∈ R and (b, c) ∈ R then (a, c) ∈ R.
A description of the above properties in terms of graphs was given in class.

Example 1.12. As an example, consider the relation in item 5 above.
It is reflexive since it contains all pairs of the form (a, a), namely,

(1, 1) (2, 2) (3, 3) (4, 4).

It is not symmetric since, for example, it contains (1, 2) but not (2, 1).
It is antisymmetric since there is no pair of elements a and b with a ≠ b such
that (a, b) and (b, a) belong to R.
It is transitive. To check this we have to show that if (a, b) and (b, c) belong to
R, then so does (a, c). This holds because if a divides b and b divides c, then a
divides c.
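
For a finite relation, the four properties of Definition 1.11 can be checked mechanically by going through all pairs. The sketch below does this for the "divides" relation of item 5; it is an illustration only, and the function names are hypothetical.

```python
def is_reflexive(relation, base_set):
    """(a, a) is in the relation for every a in the base set."""
    return all((a, a) in relation for a in base_set)


def is_symmetric(relation):
    """Whenever (a, b) is in the relation, so is (b, a)."""
    return all((b, a) in relation for (a, b) in relation)


def is_antisymmetric(relation):
    """Whenever (a, b) and (b, a) are both in the relation, a equals b."""
    return all(a == b for (a, b) in relation if (b, a) in relation)


def is_transitive(relation):
    """Whenever (a, b) and (b, c) are in the relation, so is (a, c)."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


# The relation of item 5: a divides b, on A = {1, 2, 3, 4}.
A = {1, 2, 3, 4}
R = {(a, b) for a in A for b in A if b % a == 0}

print(sorted(R))
print(is_reflexive(R, A), is_symmetric(R), is_antisymmetric(R), is_transitive(R))
```

Such a brute-force check only works for finite sets; for a relation on all integers one still needs an argument like the one given in the example.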

Example 1.13. As another example, let R be the relation on the set of people
consisting of pairs (a, b) where a likes b. Unfortunately it is not symmetric. It is
not antisymmetric or transitive. It is probably not even reflexive.

Definition 1.14. A relation that is reflexive, symmetric and transitive is called an
equivalence relation.

Example 1.15. Let R be the relation on the set A = {1, 3, 5, 9, 11, 18} defined by
the pairs (a, b) such that a − b is divisible by 4.

Given a set A with an equivalence relation R on it, we can break up all elements in
A into disjoint groups (subsets) such that within each group all elements are related
between themselves but no two elements from two different groups are related. Such
groups of elements are called equivalence classes. For Example 1.15 the equivalence
classes are C1 = {1, 5, 9}, C2 = {3, 11}, C3 = {18}.
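
The splitting into equivalence classes can be carried out by a simple grouping procedure. The following Python sketch assumes that the relation passed in really is an equivalence relation (it does not verify this); the function name `equivalence_classes` is hypothetical.

```python
def equivalence_classes(elements, related):
    """Group elements into classes, assuming `related` is an equivalence relation.

    Each element is compared with one representative of every class found so far;
    by symmetry and transitivity this is enough to decide membership.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            # x is related to no existing representative: start a new class.
            classes.append([x])
    return classes


# Example 1.15: a and b are related when a - b is divisible by 4.
A = [1, 3, 5, 9, 11, 18]
classes = equivalence_classes(A, lambda a, b: (a - b) % 4 == 0)
print(classes)
```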

1.4.1 Exercises

Exercise 1.4.1 (∗). Let A = {2, 3, 4, 5, 6} and let R be the relation on A given by

R = {(a, b)|a divides b }

Write down the ordered pairs in R.


Solution. R = {(2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (2, 4), (2, 6), (3, 6)}.

Exercise 1.4.2 (∗). Let R be the relation on the set of real numbers consisting
of pairs (a, b) with a ≤ b. Determine whether the relation is reflexive, symmetric,
antisymmetric, or transitive.
Solution. This relation is reflexive because a ≤ a for every real number a. It is not
symmetric because, e.g., 2 ≤ 3 but 3 ≰ 2. It is antisymmetric because a ≤ b and
b ≤ a together imply a = b. It is transitive because a ≤ b and b ≤ c imply a ≤ c.

Exercise 1.4.3. Show that Example 1.15 defines an equivalence relation.

1.5 Functions
Let X and Y be sets.
Definition 1.16. A function from X to Y is a rule that assigns to each element of
X exactly one element of Y . We write f : X → Y . The set X is called the domain
of f and the set Y is called the codomain of f .

Here are some examples of functions.

Example 1.17. The function f : R → R defined by the formula $f(x) = x^2$ is a
real-valued function of the kind studied in calculus courses.

Example 1.18. Let A = {1, 5, 7}, B = {3, 8, 22, 30}. A function f : A → B is


defined as
f (1) = 22 , f (5) = 3 , f (7) = 22 .

Example 1.19. The function f : N → N is defined by f(1) = 1 and
f(n + 1) = (n + 1) f(n). The latter is called a recurrence relation. It allows one to
compute values of f starting from the one given explicitly. For example,
f(2) = f(1 + 1) = 2 f(1) = 2 · 1 = 2. This function computes the factorial: f(n) = n!.
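
The way a recurrence relation determines all values from the initial one can be seen by unrolling it in a loop. A minimal Python sketch of Example 1.19, comparing the result with the library function `math.factorial`:

```python
import math


def f(n):
    """Compute f(n) from f(1) = 1 and f(m + 1) = (m + 1) * f(m), for n >= 1."""
    value = 1                      # f(1)
    for m in range(1, n):
        value = (m + 1) * value    # f(m + 1) = (m + 1) f(m)
    return value


print([f(n) for n in range(1, 7)])
print(all(f(n) == math.factorial(n) for n in range(1, 11)))
```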

Example 1.20. The next example is called the McCarthy 91 function and was
given by John McCarthy, one of the founders of artificial intelligence.
Here f : N → Z is defined for positive integers n by

$$f(n) = \begin{cases} n - 10 & \text{if } n \geq 101, \\ f(f(n + 11)) & \text{if } n \leq 100. \end{cases}$$

If we try to find f (1) directly, we put n = 1 in the definition. This shows that f (1)
depends on f (12) which in turn depends on f (23) etc. It seems hopeless to find
f (1) in this way. The trick is to work backwards. Putting n = 100 in the definition
gives
f (100) = f (f (111)) = f (101) = 91
Putting n = 99 in the definition gives

f (99) = f (f (110)) = f (100) = 91 .

Continuing in this way we can show that f (n) = 91 for all n ≤ 100 .
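
Although finding f(1) by hand looks hopeless, a computer can follow the nested definition directly. The sketch below is a literal transcription of the definition into Python (the name `mccarthy91` is chosen here for illustration) and confirms the claim numerically for the tested range; it is a check, not a proof.

```python
def mccarthy91(n):
    """McCarthy 91 function, transcribed directly from its recursive definition."""
    if n >= 101:
        return n - 10
    return mccarthy91(mccarthy91(n + 11))


print(mccarthy91(100), mccarthy91(99), mccarthy91(1))
print(all(mccarthy91(n) == 91 for n in range(1, 101)))
print(mccarthy91(101), mccarthy91(150))
```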

If f : X → Y is any function, the set of images

Ran(f ) := {y ∈ Y | y = f (x) for some x ∈ X}

is called the range of f .



In Example 1.17 the range of f is the set of all non-negative real numbers:
Ran(f ) = {y ∈ R | y ≥ 0}. In Example 1.18 the range is the set Ran(f ) = {3, 22}.
For the McCarthy 91 function (Example 1.20) the range is Ran(f ) = {y ∈ N | y ≥
91}.
Definition 1.21. A function is called surjective if its range is equal to its codomain.
None of the above examples defines a surjective function. However, it is easy to
construct surjective functions from the above examples by simply modifying the
codomain. For instance, $f(x) = x^2$ becomes surjective if we consider it as
a function from all real numbers into the non-negative real numbers: $f : \mathbb{R} \to \mathbb{R}_{\geq 0}$.
Definition 1.22. A function is called injective if no two distinct elements of the
domain have the same image.
In the above only the factorial function of Example 1.19 is injective.

Given two functions f : A → B, g : B → C we can define their composition
g ◦ f : A → C by the rule (g ◦ f)(x) = g(f(x)). The notation g ◦ f needs to be read
from right to left: we first apply f, then g.

Example 1.23. Let f : R → R be defined as $f(x) = 2x^2$ and let g : R → R be defined
as $g(x) = 3x + 4$. Then

$(f \circ g)(x) = f(g(x)) = f(3x + 4) = 2(3x + 4)^2 = 2(9x^2 + 24x + 16) = 18x^2 + 48x + 32$

while

$(g \circ f)(x) = g(f(x)) = g(2x^2) = 3(2x^2) + 4 = 6x^2 + 4$.
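
The two compositions of Example 1.23 can be compared numerically with the expanded formulas. A minimal Python sketch; `compose` is a hypothetical helper, and agreement at a few sample points is only a sanity check on the algebra.

```python
def compose(outer, inner):
    """Return the composition outer ∘ inner: first apply inner, then outer."""
    return lambda x: outer(inner(x))


def f(x):
    return 2 * x**2


def g(x):
    return 3 * x + 4


f_after_g = compose(f, g)   # (f ∘ g)(x) = f(g(x))
g_after_f = compose(g, f)   # (g ∘ f)(x) = g(f(x))

for x in [-2, -1, 0, 1, 2, 3]:
    assert f_after_g(x) == 18 * x**2 + 48 * x + 32
    assert g_after_f(x) == 6 * x**2 + 4

print(f_after_g(1), g_after_f(1))   # the two orders give different functions
```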

The identity function on a set A is the function i : A → A defined so that


i(x) = x.
Definition 1.24. Let f : A → B and g : B → A be functions. If g ◦ f : A → A is
the identity function on A and f ◦ g : B → B is the identity function on B then it
is said that f is the inverse of g (and g is the inverse of f ).

A function f that has an inverse is said to be invertible.


The following theorem is true.
Theorem 1.25. A function f is invertible if and only if it is surjective and injective.
Remark 1.4. A function that is both injective and surjective is often called bijective.

1.5.1 Exercises

Exercise 1.5.1. For each of the following functions find its range. Determine
whether the function is surjective and whether it is injective.

(a)(∗) f : R → R, f(x) = 2x + 1,
(b) g : R → R, $g(x) = x^4 + 1$,
(c)(∗) h : X → Y, h(s) = number of ones in s,
(d) j : X → Y, j(s) = the first bit of s,

where X is the set of all finite non-empty strings of bits and Y is the set of all
non-negative integers.

Solution. (a) The range is R because for any y ∈ R the equation
y = f(x) = 2x + 1 has the solution x = (y − 1)/2. This also gives the inverse function
$f^{-1}(y) = (y - 1)/2$. Therefore, by Theorem 1.25, f is injective and surjective.

(c) The function h is surjective. To show this take any non-negative integer n.
If n > 0, let $s_n$ be the string of length n containing only 1's; then $h(s_n) = n$.
If n = 0, take the string consisting of a single 0, for which h(0) = 0.
This function is not injective. To demonstrate this it suffices to note that, e.g.,
h(101) = 2 = h(011).

Exercise 1.5.2 (∗). Let the functions f, g and h have R as their domain and
codomain, and be defined as

$$f(x) = 4x - 3, \qquad g(x) = x^2 + 1, \qquad h(x) = \begin{cases} 1 & \text{if } x \geq 0, \\ 0 & \text{if } x < 0. \end{cases}$$

Find rules for the following functions (with domain and codomain R)

(a) f ◦ f , (b) h ◦ f , (c) h ◦ g .

Solution. (a) We have (f ◦ f)(x) = f(4x − 3) = 4(4x − 3) − 3 = 16x − 15.

(b) To calculate (h ◦ f)(x) we first obtain f(x) = 4x − 3. Then if 4x − 3 ≥ 0
the final result is 1 and if 4x − 3 < 0 the final result is 0. Since 4x − 3 ≥ 0 means
x ≥ 3/4, we can summarise the answer as

$$(h \circ f)(x) = \begin{cases} 1 & \text{if } x \geq 3/4, \\ 0 & \text{if } x < 3/4. \end{cases}$$

(c) Since $g(x) = x^2 + 1$ is always greater than or equal to 1, and in particular
non-negative, (h ◦ g)(x) = 1 for all x.

Exercise 1.5.3 (*). Find the inverse function for each of the following functions,
or explain why no inverse exists.

(a) g : R → R, $g(x) = \sqrt{x^2}$,

(b) j : S → S
where S is the set of non-empty finite strings of lower case letters, and j(s) gives
a string obtained by moving the last character to the beginning of the string, e.g.
j(yncrfm) = myncrf.

Solution. (a) This function is the same as the absolute value function: $g(x) = \sqrt{x^2} = |x|$.
Since g(−1) = 1 = g(1), this function is not injective and therefore, by
Theorem 1.25, the inverse function does not exist.

(b) The inverse function $j^{-1}(s)$ performs the inverse operation on the string: it
takes the first letter and moves it to the end of the string, e.g. $j^{-1}$(sdfj) = dfjs.

1.6 Mathematical Induction


Induction is one of the most commonly used proof techniques in Discrete Mathe-
matics and Computer Science. It is used for proving statements which depend on
the integers.

Suppose we want to find a formula for the sum of the first n odd integers:
• For n = 1 the sum is $1 = 1^2$.
• For n = 2 the sum is $1 + 3 = 4 = 2^2$.
• For n = 3 the sum is $1 + 3 + 5 = 9 = 3^2$.
• For n = 4 the sum is $1 + 3 + 5 + 7 = 16 = 4^2$.
Based on the above, it seems reasonable to conjecture that
$1 + 3 + 5 + 7 + \ldots + (2n - 1) = n^2$.
But how does one rigorously prove this?

To start with, let us look at what it means for a statement (or proposition) to
depend on a positive integer n .

Example 1.26. Let P(n) be the statement

$1 + 3 + 5 + 7 + \ldots + (2n - 1) = n^2$.

Then:

• P(3) states that $1 + 3 + 5 = 3^2$;

• P(4) states that $1 + 3 + 5 + 7 = 4^2$.

Both P(3) and P(4) are true.

Example 1.27. Let Q(n) be the statement $3^n > (n + 2)^2$. Then

• Q(2) states that $3^2 > 4^2$;

• Q(3) states that $3^3 > 5^2$.

Q(2) is false, whereas Q(3) is true.

Example 1.28. Let P(n) be the statement: $n^2 + n$ is even. Write down P(1),
P(3), P(k) and P(k + 1).
Solution. P(1) states that $1^2 + 1$ is even. P(3) states that $3^2 + 3$ is even. P(k)
states that $k^2 + k$ is even. P(k + 1) states that $(k + 1)^2 + (k + 1)$ is even.

Example 1.29. Let P (n) be the statement 50 − 2n ≥ 0 . Write down P (1) , P (5)
and P (30) . Which of them are true?
Solution. P (1) states that 50 − 2 ≥ 0 which is true. P (5) states that 50 − 10 ≥ 0
which is true. P (30) states that 50 − 60 ≥ 0 which is false.

Example 1.30. Let $\{a_n\}$ be the sequence defined by $a_1 = 4$ and

$a_{n+1} = 2a_n - 3$, n ≥ 1.

Let P(n) be the statement $a_n = 2^{n-1} + 3$. Write down P(1), P(2) and P(3).
Which of them are true?
Solution. P(1) states that $a_1 = 2^0 + 3$, that is, $a_1 = 4$. Hence P(1) is true. P(2)
states that $a_2 = 2^1 + 3$, that is, $a_2 = 5$. P(3) states that $a_3 = 2^2 + 3$, that is,
$a_3 = 7$.
We use $a_{n+1} = 2a_n - 3$ to find $a_2$ and $a_3$: $a_2 = 2a_1 - 3 = 2(4) - 3$, so $a_2 = 5$,
and $a_3 = 2a_2 - 3 = 2(5) - 3$, so $a_3 = 7$. Hence both P(2) and P(3) are true.

Let P(n) be a proposition which depends on n, where n is a positive integer.
A proof by mathematical induction that P(n) is true for all positive integers n
consists of two steps.

Mathematical induction

Step 1. Show that P (1) is true.

Step 2. Let k be any positive integer. Show that if P(k) is true
then P(k + 1) is true.

Example 1.31. Use Mathematical Induction to prove that for any positive integer n

$1 + 3 + 5 + 7 + \ldots + (2n - 1) = n^2$.

Solution. Let P(n) be the statement

$1 + 3 + 5 + 7 + \ldots + (2n - 1) = n^2$.

P(1) states that $1 = 1^2$, which is correct. Hence P(1) holds.
Assume P(k) is true: $1 + 3 + 5 + 7 + \ldots + (2k - 1) = k^2$. Then

$1 + 3 + 5 + 7 + \ldots + (2k - 1) + (2k + 1) = [1 + 3 + 5 + 7 + \ldots + (2k - 1)] + 2k + 1$.

Since P(k) holds,

$1 + 3 + 5 + 7 + \ldots + (2k - 1) + (2k + 1) = k^2 + 2k + 1 = (k + 1)^2$.

This shows that P(k + 1) follows from P(k).
By the principle of mathematical induction, P(n) is true for any positive integer n.
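
A quick numerical experiment can support a conjecture like this one before it is proved, although checking finitely many cases never replaces the induction argument. A minimal Python sketch (the function name is hypothetical) that mirrors both the statement and the inductive step:

```python
def sum_of_first_odds(n):
    """Return 1 + 3 + 5 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))


# The statement P(n), checked for n = 1, ..., 1000.
print(all(sum_of_first_odds(n) == n**2 for n in range(1, 1001)))

# The inductive step: adding the next odd number 2k + 1 to k^2 gives (k + 1)^2.
print(all(k**2 + (2 * k + 1) == (k + 1) ** 2 for k in range(1, 1001)))
```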

Remark 1.5. In the above example, P(n) is a statement. It is not an algebraic
expression. It is not acceptable to write $P(n) = n^2$.

Example 1.32. Let $\{a_n\}$ be the sequence defined by $a_1 = 1$ and

$a_{n+1} = a_n + 8n$, n ≥ 1.

Prove by induction that $a_n = (2n - 1)^2$ for all positive integers n.
Solution. Let P(n) be the statement: $a_n = (2n - 1)^2$.
P(1) is the statement: $a_1 = (2 - 1)^2$. Since $a_1 = 1$, P(1) is true.
Assume P(k) is true: $a_k = (2k - 1)^2$. Then

$$\begin{aligned} a_{k+1} &= a_k + 8k \\ &= (2k - 1)^2 + 8k \\ &= 4k^2 + 4k + 1 \\ &= (2k + 1)^2. \end{aligned}$$

Since 2(k + 1) − 1 = 2k + 1, we see that P(k + 1) follows from P(k). By the
principle of mathematical induction, P(n) is true for any positive integer n.

To prove things by induction you should:

(a) State P (n) .

(b) Show that P (1) is true.

(c) Do the inductive step : show that if P (k) is true then P (k + 1) is true.

It is worth stressing that one needs to carry out ALL of the above steps. Indeed,
there are a number of pitfalls in using the principle of mathematical induction as
the next three examples show.

I. It does not suffice to prove only that P(1) is true. Indeed, let P(n) be the
statement: $n^2 = n$. Then P(1) is true. But P(n) is true only when n = 1
or n = 0 and is false for n ≥ 2.

II. It does not suffice to prove only that if P (k) is true then P (k + 1) is true.
Indeed, let P (n) be the statement: n + 1 = n . Assume that P (k) holds so
that k + 1 = k . Adding 1 to each side gives k + 2 = k + 1 so P (k + 1) is
true. Of course, P (1) is false so we cannot get started.

III. Another typical error is to give a proof of P (k) implies P (k + 1) which only
works for some values of k , for example k ≥ 2 . Let P (n) be the statement:
any n students taking the same examination must all score the same mark.
Clearly P (1) is true.
Assume that P (k) is true and let any group of k + 1 candidates be given.
Let A and B be particular candidates and C the remaining group of k − 1
students. Then A and C form a group of k students so by P (k) they all
have the same mark. The same is true of B and C . Thus A and B and the
group C score the same mark so P (k + 1) is true. By induction, P (n) holds
for all n .
The mistake is that the argument requires the group C to be non-empty. If k = 1
then C is empty, so nothing links the marks of A and B.
We have proved that P(k) implies P(k + 1) only for k ≥ 2, but
there is no way of establishing P(2), which is false.

Example 1.33. Let $\{a_n\}$ be the sequence defined by $a_1 = 1$ and

$a_{n+1} = \sqrt{2 + a_n}$, n ≥ 1.

Prove by induction that $a_n \leq 2$ for all positive integers n.
Solution. Let P(n) be the statement: $a_n \leq 2$. Since $a_1 = 1$, P(1) is true.
Assume that P(k) holds. Thus $a_k \leq 2$. Then

$$\begin{aligned} a_{k+1} &= \sqrt{2 + a_k} \\ &\leq \sqrt{2 + 2} \\ &= 2, \end{aligned}$$

that is, P(k + 1) is true. By induction P(n) is true for all n.
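
The bound can be observed numerically. The sketch below assumes the recurrence is a_{n+1} = √(2 + a_n) with a_1 = 1, which is the reading consistent with the estimate √(2 + 2) = 2 used in the solution; the function name `sequence` is hypothetical, and floating-point values only approximate the exact terms.

```python
import math


def sequence(n_terms):
    """First n_terms terms of a_1 = 1, a_{n+1} = sqrt(2 + a_n) (assumed recurrence)."""
    terms = [1.0]
    for _ in range(n_terms - 1):
        terms.append(math.sqrt(2 + terms[-1]))
    return terms


terms = sequence(50)
print(terms[:5])
print(all(t <= 2 for t in terms))   # every computed term stays at or below 2
```

The terms increase towards 2 without exceeding it, in line with the statement proved by induction.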

The next example uses the factorial function n! . The number 5! (read five
factorial or factorial 5) means 5 × 4 × 3 × 2 × 1 . Note that 1! = 1 and that by
convention 0! = 1 . The reader should check by calculation that 4! = 24 and use a
calculator to check that 7! = 5040 .

Example 1.34. Prove by induction that $2^n < n!$ for n ≥ 4.
Solution. Let P(n) be the statement: $2^n < n!$.
We first check that P(4) is true. This is correct since $4! = 24 > 16 = 2^4$.
Assume that k ≥ 4 and that P(k) holds, so that $2^k < k!$. Then, using 2 < k + 1
in the second inequality,

$$\begin{aligned} 2^{k+1} &= 2 \cdot 2^k \\ &< 2\,(k!) \\ &< (k + 1)\,k! \\ &= (k + 1)!. \end{aligned}$$

Hence $2^{k+1} < (k + 1)!$, so P(k + 1) is true.
By induction P(n) is true for all n ≥ 4.
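
The role of the starting value n = 4 becomes clear when the inequality is tested for small n. A minimal Python sketch (the name `holds` is hypothetical); it checks finitely many cases only and does not replace the proof.

```python
import math


def holds(n):
    """Truth value of the statement P(n): 2^n < n!."""
    return 2**n < math.factorial(n)


# P(n) fails for n = 1, 2, 3, which is why the induction starts at n = 4.
print([n for n in range(1, 10) if not holds(n)])
print(all(holds(n) for n in range(4, 200)))
```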

1.6.1 Exercises

Exercise 1.6.1 (∗). Let P (n) be the statement 3n − 100 ≥ 0 . Write down P (1) ,
P (4) and P (40) . Which of them are true?

Solution. P(1) states that 3(1) − 100 ≥ 0, which is false.
P(4) states that 3(4) − 100 ≥ 0, which is false.
P(40) states that 3(40) − 100 ≥ 0, which is true.

Exercise 1.6.2 (∗). Let $\{a_n\}$ be the sequence defined by $a_1 = 3$ and

$a_{n+1} = 5a_n - 8$, n ≥ 1.

Let P(n) be the statement $a_n = 5^{n-1} + 2$. Write down P(1) and P(2). Which of
them are true?

Solution. P(1) states that $a_1 = 5^0 + 2$, that is, $a_1 = 3$. Hence P(1) is true.
P(2) states that $a_2 = 5^1 + 2$, that is, $a_2 = 7$. Since $a_2 = 5a_1 - 8 = 5(3) - 8 = 7$,
P(2) is true.

Exercise 1.6.3 (∗). Use Mathematical Induction to prove that for any positive
integer n

$1 + 2 + 2^2 + 2^3 + \ldots + 2^n = 2^{n+1} - 1$.

Solution. Let P(n) be the statement

$1 + 2 + 2^2 + 2^3 + \ldots + 2^n = 2^{n+1} - 1$.

P(1) states that $1 + 2 = 2^2 - 1 = 3$, which is correct.
Assume P(k) is true: $1 + 2 + 2^2 + 2^3 + \ldots + 2^k = 2^{k+1} - 1$. Then by P(k),

$1 + 2 + 2^2 + 2^3 + \ldots + 2^k + 2^{k+1} = 2^{k+1} - 1 + 2^{k+1}$.

Since $2^{k+1} + 2^{k+1} = 2^{k+2}$,

$1 + 2 + 2^2 + 2^3 + \ldots + 2^k + 2^{k+1} = 2^{k+2} - 1$.

This shows that P(k + 1) follows from P(k).
By the principle of mathematical induction, P(n) is true for any positive integer n.

Exercise 1.6.4 (∗). Prove by induction that $3^n < n!$ for n ≥ 7.

Solution. Let P(n) be the statement $3^n < n!$.
We first check that P(7) is true. This is correct since 7! = 5040 and $3^7 = 2187$.
Assume that k ≥ 7 and that P(k) holds, so that $3^k < k!$. Then

$3^{k+1} = 3 \cdot 3^k < 3\,(k!) < (k + 1)\,k! = (k + 1)!$.

Hence $3^{k+1} < (k + 1)!$, so P(k + 1) is true. By induction P(n) is true for all n ≥ 7.

Exercise 1.6.5 (∗). Let $\{a_n\}$ be the sequence defined by $a_1 = x$ and

$$a_{n+1} = \frac{a_n^2 + 12}{7}, \qquad n \geq 1,$$

where 3 < x < 4. Prove by induction that $3 < a_n < 4$ for all n.

Solution. Let P(n) be the statement: $3 < a_n < 4$.
Since $a_1 = x$ and 3 < x < 4, P(1) is true.
Assume that P(k) holds, so that $3 < a_k < 4$. Then

$$\begin{aligned} 7a_{k+1} &= a_k^2 + 12 \\ &< 16 + 12 \\ &= 28, \end{aligned}$$

so $a_{k+1} < 4$. Also,

$7a_{k+1} = a_k^2 + 12 > 9 + 12 = 21$,

so $a_{k+1} > 3$ and P(k + 1) is true. By induction P(n) is true for all n.

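Exercise 1.6.6 (∗). For n ≥ 2 the sequence $\{s_n\}$ is given by

$$s_n = \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) \cdots \left(1 - \frac{1}{n}\right)$$

so that $s_2 = \frac{1}{2}$ and $s_3 = \frac{1}{3}$. By computing more values of $s_n$ if needed, guess a
formula for $s_n$ and use induction to prove it.

Solution. Using the formula for $s_n$ we obtain

$$s_2 = \frac{1}{2}, \qquad s_3 = \frac{1}{3}, \qquad s_4 = \frac{1}{4},$$

and we conjecture that $s_n = \frac{1}{n}$.
Let P(n) be the statement: $s_n = \frac{1}{n}$.
By the above calculations, P(2) is true.
Assume that P(k) holds, so that

$$s_k = \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) \cdots \left(1 - \frac{1}{k}\right) = \frac{1}{k}.$$

Then

$$s_{k+1} = \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) \cdots \left(1 - \frac{1}{k}\right)\left(1 - \frac{1}{k+1}\right) = \frac{1}{k}\left(1 - \frac{1}{k+1}\right).$$

Since

$$\frac{1}{k}\left(1 - \frac{1}{k+1}\right) = \frac{1}{k+1},$$

P(k + 1) is true. By induction P(n) is true for all n ≥ 2.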
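
The "compute more values and guess" step of this exercise is easy to automate with exact rational arithmetic. A minimal Python sketch using the standard `fractions` module; the function name `s` follows the notation of the exercise.

```python
from fractions import Fraction


def s(n):
    """Return (1 - 1/2)(1 - 1/3)...(1 - 1/n) as an exact fraction, for n >= 2."""
    result = Fraction(1)
    for i in range(2, n + 1):
        result *= 1 - Fraction(1, i)
    return result


print([s(n) for n in range(2, 7)])
print(all(s(n) == Fraction(1, n) for n in range(2, 200)))
```

Using `Fraction` instead of floating-point numbers avoids rounding errors, so the comparison with 1/n is exact for every tested n.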

Exercise 1.6.7 (∗). Let {sn } be the sequence defined by

sn = 1(1!) + 2(2!) + 3(3!) + . . . + n(n!)

Check that s1 = 1 , s2 = 5 and s3 = 23 . By computing more values of sn if needed,


guess a formula for sn and use induction to prove it.

Solution. Using the formula for sn we obtain

s1 = 1 s2 = 5 s3 = 23 s4 = 119.

Since
2! = 2 3! = 6 4! = 24 5! = 120
we conjecture that sn = (n + 1)! − 1 .
Let P (n) be the statement: sn = (n + 1)! − 1 . The above calculations show
that P (n) is true for n = 1, 2, 3, 4 .
Assume that P (k) holds so that sk = (k + 1)! − 1 . Then

sk+1 = 1(1!) + 2(2!) + . . . + k(k!) + (k + 1) (k + 1)! = (k + 1)! − 1 + (k + 1) (k + 1)!.

Since

(k + 1)! − 1 + (k + 1) (k + 1)! = (k + 1)! [1 + k + 1] − 1 = (k + 2)! − 1,

P (k + 1) is true. By induction P (n) is true for all n .
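A numerical check is no substitute for the induction proof, but it is a quick way to test a conjectured formula before trying to prove it. The following is a minimal Python sketch for the sequence of Exercise 1.6.7; the function names `s` and `conjecture_holds` are illustrative choices, not part of the notes.

```python
from math import factorial


def s(n: int) -> int:
    """Partial sum s_n = 1(1!) + 2(2!) + ... + n(n!)."""
    return sum(i * factorial(i) for i in range(1, n + 1))


def conjecture_holds(n_max: int) -> bool:
    """Check the conjecture s_n = (n + 1)! - 1 for n = 1, ..., n_max."""
    return all(s(n) == factorial(n + 1) - 1 for n in range(1, n_max + 1))


print([s(n) for n in range(1, 5)])  # the values s_1, ..., s_4 computed in the solution
print(conjecture_holds(50))         # True means no counterexample up to n = 50
```

Finding no counterexample only supports the conjecture; the statement for all n still needs the induction argument given above.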



1.7 The Sum and Product Rules


Consider the following example of a counting problem.

Example 1.35. The mathematics teaching committee requires one student repre-
sentative from either the first year or the second year or the third year. If there
are 500 first year, 300 second year and 100 third year students, how many possible
different representatives are there?
Solution. There are 500 different choices for picking a representative from
the first year students, 300 from the second year and 100 from the third year. Since the
students are all different and we have to pick only one representative, from any one
of the three years, we can add the numbers of choices for each year to obtain
500 + 300 + 100 = 900 different possible representatives.

The solution to the above problem follows from a general rule.

The sum rule

If there are n(A) ways to do A and, distinct from them, n(B) ways
to do B, then the number of ways to do A or B is n(A) + n(B).
Similarly, one adds n(A), n(B), n(C) when one must do A or B or C;
and so on.

Consider a different problem.

Example 1.36. The mathematics student committee is composed of three students:
one from each of the first year, the second year and the third year. If there are 500
first year, 300 second year and 100 third year students, how many possible different
committees are there?
Solution. There are 500 choices for a representative from the first year students.
For each of those representatives there are 300 ways to pick a representative from
the second year students. So we have 500 × 300 = 150,000 ways to pick the two
representatives from the first and second year students. For each choice of those (for
each pair of representatives) we have 100 choices for a third year representative. In
total this gives us 150,000 × 100 = 15,000,000 different possible committees. Thus
in this problem we had to multiply the numbers for each respective choice:
500 × 300 × 100.

The solution to the above problem follows from a general rule.



The product rule

If there are n(A) ways to do A and n(B) ways to do B, then the
number of ways to do A and B is n(A) × n(B). It is assumed that
A and B are independent; that is, the number of choices in B is the
same regardless of which choice in A is taken. Similarly, one multiplies
three terms, n(A) × n(B) × n(C), if one has to make choices in A, B,
and C independently; and so on.

To solve the following example we need to use both of the above rules.

Example 1.37. In a certain computer system a file name must be a string of letters
and digits which is 1,2, or 3 symbols long. The first symbol must be a letter and
the system does not make any distinction between uppercase and lowercase letters.
How many legitimate file names are there?
Solution. A file name either has one symbol, or two symbols or three symbols. In
each case we have 26 choices for the first letter (the 26 letters of the alphabet).
The number of one-symbol names is then 26. The number of 2-symbol names is
26 × 36 = 936 because for the second symbol we can have either a letter or a
digit, 36 = 26 + 10 choices. We multiply the number of choices for the first and
for the second symbol because both choices are done independently. Similarly the
number of three-symbol names is 26 × 36 × 36 = 33696. Summing the numbers for
the three different cases (by the sum rule) we obtain that the system at hand can
accommodate 26 + 936 + 33696 = 34658 different files.
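The count in Example 1.37 can be cross-checked by listing every legitimate name. The sketch below, in Python, compares the sum-and-product-rule count with a brute-force enumeration; the function names are illustrative, and lowercase letters stand in for the case-insensitive alphabet.

```python
from itertools import product
from string import ascii_lowercase, digits

LETTERS = ascii_lowercase            # 26 letters (case is ignored by the system)
SYMBOLS = ascii_lowercase + digits   # 36 symbols: 26 letters and 10 digits


def count_names_by_rules(max_len: int = 3) -> int:
    """Product rule within each length, sum rule across the lengths."""
    return sum(len(LETTERS) * len(SYMBOLS) ** (length - 1)
               for length in range(1, max_len + 1))


def count_names_by_enumeration(max_len: int = 3) -> int:
    """Count the names one at a time: a letter followed by length - 1 symbols."""
    total = 0
    for length in range(1, max_len + 1):
        for _first in LETTERS:
            for _rest in product(SYMBOLS, repeat=length - 1):
                total += 1
    return total


print(count_names_by_rules(), count_names_by_enumeration())
```

Both functions should return the same number, which illustrates that the two rules simply organise a count that could in principle be done by listing.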

1.7.1 Exercises

Exercise 1.7.1 (∗). A binary word is made of 0’s and 1’s. How many binary words
are there with length exactly 7? How many binary words are there with length up
to 3?

Solution. For each digit we have two choices: 0 or 1. Thus by the product rule
we have 2 × 2 × 2 × 2 × 2 × 2 × 2 = $2^7$ = 128 different words of length 7. To count the
number of words with length up to 3 we need to add up the numbers of words of
length 1,2,3 (here we use the sum rule). Using the product rule as before we obtain
$2^1 + 2^2 + 2^3 = 14$ binary words with length up to 3.

Exercise 1.7.2. In a restaurant for the main dish there are 10 meat, 5 fowl, and 8
fish choices. How many ways are there to pick a main dish?

Exercise 1.7.3 (∗). A salesman is to visit 5 towns. Assuming that he goes to one
town after another, visiting all of them once, in how many orders can he make his
trip?

Solution. The salesman has 5 choices for the first town in his itinerary. Since each
town is to be visited only once he has 4 choices for the second town in the trip,
and so on. By the product rule the number of different trips is
5 × 4 × 3 × 2 × 1 = 120.

Exercise 1.7.4 (∗). How many numbers between 1 and 1000 have exactly one 7 in
them?

Solution. The condition of the problem means that we are considering numbers of
up to 3 digits. Hence we can count separately the numbers at hand with length
1,2,3 and then add them up. Clearly we have only one such number of length 1: 7.
For two-digit numbers the digit 7 can either be the first digit or the second one. If
it stands first then we have 9 choices for the second digit: 0, 1, 2, 3, 4, 5, 6, 8, 9. If 7
is the second digit we have only 8 choices for the first digit because 0 is not allowed.
Thus we have 9 + 8 = 17 required numbers of length 2. For length 3 the digit 7 can
again be in the 1st, 2nd or 3rd position. As before, taking care of the zero, we obtain
9 × 9 = 81, 8 × 9 = 72 and 8 × 9 = 72 different numbers for each case respectively.
The number 1000 contains no 7, so it does not contribute. Finally using the
sum rule we add up all different cases and obtain 1 + 17 + 81 + 72 + 72 = 243.
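Case analyses like the one in Exercise 1.7.4 are easy to get slightly wrong, so a direct count is a useful sanity check. A minimal Python sketch (the function name is an illustrative choice):

```python
def count_exactly_one_seven(upper: int = 1000) -> int:
    """Count the integers m with 1 <= m <= upper whose decimal form has exactly one 7."""
    return sum(1 for m in range(1, upper + 1) if str(m).count("7") == 1)


print(count_exactly_one_seven())    # all numbers from 1 to 1000
print(count_exactly_one_seven(99))  # one- and two-digit numbers only
```

The second call isolates the cases of length 1 and 2, so it can be compared with the partial count 1 + 17 from the solution.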

1.8 Permutations and Combinations


Permutations and combinations are the basic building blocks of combinatorial math-
ematics.

Definition 1.38. A permutation of n objects taken k at a time is any ordered
selection of k distinct objects from a given set of n objects.
Note that the order in which the selection is made matters. Thus for a set of 5
letters {B, S, W, K, L} the lists (B, W, K) and (W, B, K) are two different permutations
of the 5 given letters taken 3 at a time.
Repetitions are not allowed. Thus for the above example the list (B, B, L) is not
a permutation. We will often say “ordered selection without repetitions” instead of
“permutation” as this is self-descriptive.

Example 1.39. In how many ways can a committee with 10 members select a
chairperson, a treasurer and an administrator?

Solution. Suppose that the first person to be selected is to be the chairperson. There
are 10 different ways to do it. Suppose the second person selected is the treasurer, for
whom we have 9 choices. Finally, suppose the last person chosen is the administrator,
for whom there are 8 possible people available. By the product rule the number of
selections is 10 × 9 × 8 = 720.

In the above example we had to make an ordered selection of 3 distinct people


from a group of 10 people. For general permutations the following theorem holds.

Theorem 1.40. The number of ordered selections of k distinct objects from a set
of n objects is n(n − 1)(n − 2) . . . (n − k + 2)(n − k + 1).

Remark 1.6. The above number can be alternatively written as

$n(n-1)(n-2) \cdots (n-k+2)(n-k+1) = \frac{n!}{(n-k)!}$ (1.1)

where we used the factorial notation

n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1

Note that by convention one defines

0! = 1 .

With this convention the expression on the right hand side in (1.1) gives the same
result for n = k as the expression on the left hand side.

Proof of Theorem 1.40. There are n ways to select the first object. After the first
object is selected there are n − 1 objects left and therefore n − 1 choices
for the second object. We continue selecting until the k-th and last object, for which there
are n − k + 1 choices. By the product rule these selections can be made in
n(n − 1)(n − 2) . . . (n − k + 1) ways.

Remark 1.7. Observe that using formula (1.1) for k = n we find that the number of
different ways to order n objects is n!.
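Theorem 1.40 and formula (1.1) can be compared with an explicit listing of ordered selections. The Python sketch below uses the standard-library helpers `itertools.permutations` and `math.perm`; the function name `ordered_selections` is an illustrative choice.

```python
from itertools import permutations
from math import factorial, perm


def ordered_selections(n: int, k: int) -> int:
    """Number of ordered selections of k distinct objects out of n, via formula (1.1)."""
    return factorial(n) // factorial(n - k)


letters = ["B", "S", "W", "K", "L"]
listed = list(permutations(letters, 3))  # every ordered selection of 3 of the 5 letters

print(len(listed), ordered_selections(5, 3), perm(5, 3))
print(ordered_selections(10, 3))  # the committee count of Example 1.39
print(ordered_selections(5, 5))   # k = n: the number of ways to order 5 objects
```

The listing contains both ("B", "W", "K") and ("W", "B", "K") as separate entries, reflecting that order matters for permutations.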
Consider now a different kind of selection, called a combination.

Definition 1.41. A combination of n objects taken k at a time is any selection of


k distinct objects from a given set of n objects.

In this case the order of objects selected does not matter. We will often say
“unordered selection without repetitions” rather than “combination”.

Theorem 1.42. The number of unordered selections of k objects from a set of n
objects without repetitions is

$\frac{n(n-1) \cdots (n-k+1)}{k!} = \frac{n!}{(n-k)!\,k!}$ (1.2)

The combination of factorials on the right hand side is called a binomial coefficient
and is denoted

$\binom{n}{k} := \frac{n!}{(n-k)!\,k!}$. (1.3)
Proof. We already know that the number of ordered selections is given by formula
(1.1). To pick an ordered selection of k elements one can first make an unordered
selection of k elements and then choose an order. We know that there are k! different
ways to order k elements. Thus by the product rule

$\frac{n!}{(n-k)!} = \binom{n}{k} \times k!$

where $\binom{n}{k}$ stands for the number of unordered selections. Dividing both sides by k!
we obtain formula (1.2).


The binomial coefficients (1.3) appear in the binomial formula that gives the
coefficients in the expansion of a power

$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$. (1.4)

For example

$(x + y)^5 = \binom{5}{0} x^0 y^5 + \binom{5}{1} x^1 y^4 + \binom{5}{2} x^2 y^3 + \binom{5}{3} x^3 y^2 + \binom{5}{4} x^4 y^1 + \binom{5}{5} x^5 y^0$
$= y^5 + 5xy^4 + 10x^2 y^3 + 10x^3 y^2 + 5x^4 y + x^5$.
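The binomial formula (1.4) can be tested numerically for particular values of x, y and n. The following Python sketch relies on `math.comb` for the binomial coefficient (1.3); the two function names are illustrative.

```python
from math import comb


def binomial_coefficients(n: int) -> list[int]:
    """The coefficients of (x + y)^n, listed for k = 0, 1, ..., n."""
    return [comb(n, k) for k in range(n + 1)]


def expand_numerically(x: int, y: int, n: int) -> int:
    """Evaluate the right hand side of the binomial formula (1.4)."""
    return sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))


print(binomial_coefficients(5))                # coefficients in the n = 5 example
print(expand_numerically(2, 3, 5), (2 + 3) ** 5)  # both sides of (1.4) for x = 2, y = 3
```

Agreement for a few sample values does not prove (1.4), but a mismatch would immediately reveal a mistake in an expansion worked out by hand.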

Example 1.43. In the card game bridge, a hand has 13 cards. There are 52 cards
in a deck. How many different bridge hands are there?
Solution. We have to make an unordered selection of 13 cards out of 52. Repetitions
are not allowed. By the above theorem there are $\binom{52}{13} = 635013559600$ different
bridge hands.

Example 1.44. A full house in poker is a collection of 5 cards in which 3 of them


are from one denomination and 2 from another. In a pack of cards there are 13
denominations (2,3,...,queen,king,ace) and 4 cards of each. How many different full
houses are there?

Solution. The order in which we pick cards for a full house does not matter so we
can specify a convenient order. Let us first pick a denomination for three cards.
There are 13 choices. Next we have to pick 3 cards out of 4 in this denomination.
Since the order does not matter we have $\binom{4}{3} = 4$ choices. Next let us pick the second
denomination. We have 12 choices for it. To pick 2 cards out of 4 in the chosen
denomination we have $\binom{4}{2} = 6$ choices. Finally by the product rule we multiply
the choices to obtain 13 × 4 × 12 × 6 = 3744 different full houses.
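The counts in Examples 1.43 and 1.44 are quick to reproduce. This short Python sketch uses `math.comb`; the function names are illustrative, and the deck parameters are given default values matching a standard 52-card pack.

```python
from math import comb


def bridge_hands(deck_size: int = 52, hand_size: int = 13) -> int:
    """Unordered selections of hand_size cards out of deck_size (Theorem 1.42)."""
    return comb(deck_size, hand_size)


def full_houses(denominations: int = 13, suits: int = 4) -> int:
    """Product rule: denomination of the triple, its 3 cards, then the pair likewise."""
    triple = denominations * comb(suits, 3)
    pair = (denominations - 1) * comb(suits, 2)
    return triple * pair


print(bridge_hands())
print(full_houses())
```

Writing the full-house count as a product of two factors mirrors the order of choices made in the solution.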

1.8.1 Exercises

Exercise 1.8.1. Find in how many ways an ordered selection of 2 elements out of
a set of 5 elements can be made. No repetitions are allowed. How many unordered
selections of 2 elements from the same set are there?

Exercise 1.8.2. There are 14 available ingredients for making perfume. How many
different perfumes can be made if one is to use 4 ingredients for each one?

Exercise 1.8.3. A travelling salesman is to visit any 5 different cities from a list
of 10. How many different itineraries can he have if he visits the cities successively,
one after another?

Exercise 1.8.4 (∗). What is the number of ways in which a subset of one or two
elements can be picked up from a set of n elements?

Solution. Clearly there are n ways to pick a subset of one element. To pick a subset
of two elements we have to make an unordered selection with no repetitions of 2
elements out of n. Hence we have $\binom{n}{2} = \frac{n!}{2 \times (n-2)!} = \frac{n(n-1)}{2}$ two-element subsets. By
the sum rule we obtain $n + \frac{n(n-1)}{2} = \frac{n(n+1)}{2}$.

Exercise 1.8.5 (∗). A university department has 12 male professors and 3 female
professors. It is decided that every committee of professors in this department should

have at least one female member. How many different committees of 12 people can
be formed?

Solution. If we temporarily disregard the requirement that there is at least one
female professor in the committee we have $\binom{15}{12} = 455$ different committees. To
ensure the condition of having at least one female professor we need to subtract
from the above number the number of all-male committees, which is just 1 as there
are only 12 male professors. Hence the final answer is 454.

Exercise 1.8.6. The computer science students are entering a team of 5 in the
university relay race. There are 20 undergraduates and 10 postgraduates. How
many teams are possible if
(a)(∗) any student is eligible
(b) there must be 2 postgraduates and 3 undergraduates

Solution. (a) We have an unordered selection without repetitions of 5 students out
of 30 = 10 + 20. This gives us $\binom{30}{5} = 142506$ possible teams.

Exercise 1.8.7 (∗). How many ways are there to pick a combination of k distinct
numbers from {1, 2, . . . , n} if 1 and 2 cannot both be picked?

Solution. We are dealing here with unordered selections. If we temporarily disregard
the condition on 1 and 2 not being picked together we obtain $\binom{n}{k}$ selections. To take
care of the additional condition we must subtract from that result the number of
forbidden combinations. In such combinations both 1 and 2 are picked, and the
remaining k − 2 numbers are chosen from the other n − 2. The number
of all such combinations is $\binom{n-2}{k-2}$. Therefore the final answer is $\binom{n}{k} - \binom{n-2}{k-2}$.
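The "count everything, then subtract the forbidden cases" argument of Exercise 1.8.7 can be checked against a direct enumeration for small n and k. In this Python sketch the function names are illustrative, and the guard for k < 2 is an added assumption: with fewer than two numbers picked, no selection can contain both 1 and 2.

```python
from itertools import combinations
from math import comb


def count_by_formula(n: int, k: int) -> int:
    """All k-subsets of {1, ..., n} minus those containing both 1 and 2."""
    forbidden = comb(n - 2, k - 2) if k >= 2 else 0
    return comb(n, k) - forbidden


def count_by_enumeration(n: int, k: int) -> int:
    """List every k-subset and keep those that do not contain both 1 and 2."""
    return sum(1 for c in combinations(range(1, n + 1), k)
               if not (1 in c and 2 in c))


print(count_by_formula(6, 3), count_by_enumeration(6, 3))
```

The enumeration is only practical for small n, but agreement there gives some confidence that the subtraction step removes exactly the forbidden selections.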

1.9 The Inclusion-Exclusion Principle

If a set A contains a finite number of elements then it is called a finite set and we
write |A| for the number of elements in it. An infinite set is a set that is not finite.
The number |A| is called the cardinality of A. Hence |{x, y, z}| = 3 and the set of
positive integers is infinite.

Inclusion-Exclusion Principle

If A and B are finite sets, then

|A ∪ B| = |A| + |B| − |A ∩ B|. (1.5)

For instance, suppose we want to find the number of cards in an ordinary deck
that are clubs or aces. Let A be the set of cards that are aces and let C be the set
of cards that are clubs.
We have |A| = 4 , |C| = 13 and |A ∩ C| = 1 (the ace of clubs). Then the
Inclusion-Exclusion Principle gives us
|A ∪ C| = |A| + |C| − |A ∩ C| = 4 + 13 − 1 = 16.

Example 1.45. A survey in Scotland finds that 86% of the population have heard
of product A while 80% have heard of product B. Also, 70% of the population
have heard about both products. What percentage of the population have not
heard about either product?
Solution. Let A be the set of people who have heard about product A and define
B similarly. Let N be the population of Scotland.
From the given data, |A| = (0.86)N , |B| = (0.8)N and |A ∩ B| = (0.7)N .
Then

|A ∪ B| = |A| + |B| − |A ∩ B| = (0.86)N + (0.8)N − (0.7)N = (0.96)N.

Hence 96% of the population have heard of either product A or product B and 4%
have not heard about either product.

The Inclusion-Exclusion Principle for the union of three sets reads


|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C| (1.6)

In the lectures I will indicate why (1.6) holds by means of a Venn diagram.

Example 1.46. How many integers from 1 to 100 are divisible by 2 or 3 or 5?


Solution. Let U = {1, 2, . . . , 100} and
A = {n ∈ U | 2 divides n} , B = {n ∈ U | 3 divides n} , C = {n ∈ U | 5 divides n} .
We need to find |A ∪ B ∪ C|.
We have that |A| = 100/2 = 50 , |C| = 100/5 = 20 , and |B| is the integer part of
100/3 , that is |B| = 33 .

For a number to be in the set A ∩ B it must be divisible by 6. Hence |A ∩ B| = 16 .


Numbers in the set A ∩ B ∩ C are divisible by 30. Hence |A ∩ B ∩ C| = 3 .
Similarly, |A ∩ C| = 10 and |B ∩ C| = 6 so

|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|

and
|A ∪ B ∪ C| = 50 + 33 + 20 − 16 − 10 − 6 + 3 = 74.
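The inclusion-exclusion count (1.6) for divisibility problems can be compared with a direct count. The Python sketch below uses illustrative function names; the inclusion-exclusion version assumes the three divisors are pairwise coprime (as 2, 3 and 5 are), so that being divisible by two of them is the same as being divisible by their product.

```python
def count_divisible_brute(limit: int, divisors: list[int]) -> int:
    """Count m in 1..limit divisible by at least one of the given divisors."""
    return sum(1 for m in range(1, limit + 1)
               if any(m % d == 0 for d in divisors))


def count_divisible_incl_excl(limit: int, a: int, b: int, c: int) -> int:
    """Formula (1.6), assuming a, b, c are pairwise coprime."""
    singles = limit // a + limit // b + limit // c
    pairs = limit // (a * b) + limit // (a * c) + limit // (b * c)
    triple = limit // (a * b * c)
    return singles - pairs + triple


print(count_divisible_brute(100, [2, 3, 5]), count_divisible_incl_excl(100, 2, 3, 5))
```

Integer division `limit // d` plays the role of "the integer part of 100/3" in the solution. For divisors that share a common factor, the products would have to be replaced by least common multiples.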

Example 1.47. At the University of Scabia all first year computer science students
have to study at least one of the three modules : accounting, ethics and mathematics.
There are 237 first year computer science students. Also, 150 study maths, 120
study accounting, 100 study ethics, 50 study both accounting and maths, 20 study
both accounting and ethics and 70 study both maths and ethics.

(a) Find the number who study all three subjects.

(b) Find the number who study accounting and maths but not ethics.

Solution. Let A be the set of students studying accounting, M those doing maths
and E those doing ethics. From the given data,

|A| = 120 |E| = 100 |M | = 150

|A ∩ M | = 50 |A ∩ E| = 20 |E ∩ M | = 70
Also, since everyone is studying at least one of the three modules,

|A ∪ E ∪ M | = 237

Using

|A ∪ E ∪ M | = |A| + |E| + |M | − |A ∩ E| − |A ∩ M | − |E ∩ M | + |A ∩ E ∩ M |

237 = 120 + 100 + 150 − 20 − 50 − 70 + |A ∩ E ∩ M |


|A ∩ E ∩ M | = 7
Hence there are 7 students who study all three subjects.
The set of students not studying ethics is $\overline{E}$ .
Hence the number who study accounting and maths but not ethics is $|A \cap M \cap \overline{E}|$ .
Let $X = A \cap M \cap E$ and $Y = A \cap M \cap \overline{E}$ . Then

$X \cap Y = \emptyset$ , $X \cup Y = A \cap M$ .

By (1.5) , |X ∪ Y | = |X| + |Y | − |X ∩ Y | so

$|A \cap M| = |A \cap M \cap E| + |A \cap M \cap \overline{E}|$

Hence
$50 = 7 + |A \cap M \cap \overline{E}|$
and $|A \cap M \cap \overline{E}| = 43$ .
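The two steps of Example 1.47 amount to rearranging (1.6) for the triple intersection and then splitting a pairwise intersection into two disjoint parts. A small Python sketch of that arithmetic follows; the function names are illustrative, not notation from the notes.

```python
def triple_intersection(union: int, singles: list[int], pairs: list[int]) -> int:
    """Solve (1.6) for |A ∩ B ∩ C| given the union, the three sizes and the three pairwise sizes."""
    return union - sum(singles) + sum(pairs)


def in_two_but_not_third(pair: int, triple: int) -> int:
    """Split a pairwise intersection into the part inside and the part outside the third set."""
    return pair - triple


all_three = triple_intersection(237, [120, 100, 150], [20, 50, 70])
print(all_three)                            # students taking all three modules
print(in_two_but_not_third(50, all_three))  # accounting and maths but not ethics
```

A negative result from either function would signal that the given data are inconsistent.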

1.9.1 Exercises

Exercise 1.9.1 (∗). A and B are sets with |A| = 30 and |B| = 20 . Find |A ∪ B|
if

(a) A ∩ B = ∅ ,

(b) |A ∩ B| = 5 ,

(c) B ⊂ A .

Solution. (a) If A ∩ B = ∅ then |A ∪ B| = |A| + |B| = 50 .


(b) |A ∪ B| = |A| + |B| − |A ∩ B| = 45 .
(c) If B ⊂ A then A ∪ B = A and |A ∪ B| = |A| = 30 .

Exercise 1.9.2 (∗). A , B and C are sets with |A| = |B| = |C| = 100 .
Find |A ∪ B ∪ C| in the following cases.
(a) A = B = C .
(b) A ∩ B = A ∩ C = B ∩ C = ∅ .
(c) |A ∩ B| = |A ∩ C| = |B ∩ C| = 50 and A ∩ B ∩ C = ∅ .
(d) |A ∩ B| = |A ∩ C| = |B ∩ C| = 50 and |A ∩ B ∩ C| = 20 .
Solution. (a) If A = B = C then A ∪ B ∪ C = A and |A ∪ B ∪ C| = |A| = 100 .
(b) If A ∩ B = A ∩ C = B ∩ C = ∅ then
|A ∪ B ∪ C| = |A| + |B| + |C| = 300

(c) |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| = 150 .


(d) |A∪B ∪C| = |A|+|B|+|C|−|A∩B|−|A∩C|−|B ∩C|+|A∩B ∩C| = 170 .

Exercise 1.9.3 (∗). A , B and C are sets with |A| = 10, |B| = 100, |C| = 1000 .
Find |A ∪ B ∪ C| if

(a) A ⊂ B and B ⊂ C ,

(b) A ∩ B = A ∩ C = B ∩ C = ∅ ,

(c) |A ∩ B| = |A ∩ C| = |B ∩ C| = 5 and |A ∩ B ∩ C| = 1 .

Solution. (a) If A ⊆ B and B ⊆ C then A ∪ B ∪ C = C and |A ∪ B ∪ C| =


|C| = 1000 .
(b) |A ∪ B ∪ C| = |A| + |B| + |C| = 1110 .
(c) |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C| = 1096 .

Exercise 1.9.4 (∗). How many integers from 1 to 1000 are divisible by 7 or 11?

Solution. For integers between 1 and 1000, let A be the set of those divisible by 7
and B be the set of those divisible by 11.
We need to find |A ∪ B|. Now, |A| is the integer part of 1000/7 , that is
|A| = 142 . Also |B| = 90 . For a number to be in the set A ∩ B it must be divisible
by 77. Hence |A ∩ B| = 12 .

|A ∪ B| = |A| + |B| − |A ∩ B| = 142 + 90 − 12 = 220.

Exercise 1.9.5 (∗). How many integers from 1 to 200 are divisible by 2 or 5 or 7?

Solution. For integers between 1 and 200, let A be the set of those divisible by 2,
B the set of those divisible by 5 and C the set of those divisible by 7. We need to
find |A ∪ B ∪ C|.
We have that |A| = 200/2 = 100 , |B| = 40 and |C| = 28 .
For a number to be in the set A∩B it must be divisible by 10. Hence |A∩B| = 20.
Similarly, |A ∩ C| = 14 and |B ∩ C| = 5.
Numbers in the set A ∩ B ∩ C are divisible by 70. Hence |A ∩ B ∩ C| = 2 .

|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|

|A ∪ B ∪ C| = 100 + 40 + 28 − 20 − 14 − 5 + 2 = 131.

Exercise 1.9.6 (∗). There are 73 students in the first year Humanities class at the
University of Scabia. Among them a total of 52 can play the piano, 25 can play
the violin, and 20 can play the flute; 17 can play the piano and violin, 12 can play
the piano and flute, and 7 can play the violin and flute. Only Hamish Ratho can
play all three instruments. How many of the class cannot play any of them?

Solution. Let A be the set of students who can play the piano, B the set of
students who can play the violin and C the set of students who can play the flute.
Then
|A| = 52 |B| = 25 |C| = 20
|A ∩ B| = 17 |A ∩ C| = 12 |B ∩ C| = 7 |A ∩ B ∩ C| = 1.
Using

|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|,

we have that

|A ∪ B ∪ C| = 52 + 25 + 20 − 17 − 12 − 7 + 1 = 62.

The number of students who cannot play any of the three instruments is 73 − 62 =
11 .

Exercise 1.9.7 (∗). At the University of Scabia all first year students have to
study at least one of the three modules : computing, physics and mathematics.
There are 1400 first year students. Also, 700 study maths, 600 study physics,
500 study computing, 200 study both physics and maths, 150 study both physics
and computing and 120 study both maths and computing.

(a) Find the number who study all three subjects.

(b) Find the number who study computing and maths but not physics.

Solution. Let A be the set of students studying maths, B those doing physics and
C those doing computing.
From the given data,

|A| = 700 |B| = 600 |C| = 500

|A ∩ B| = 200 |A ∩ C| = 120 |B ∩ C| = 150 |A ∪ B ∪ C| = 1400


Using

|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|



we have that

1400 = 700 + 600 + 500 − 200 − 120 − 150 + |A ∩ B ∩ C|,

so that |A ∩ B ∩ C| = 70 . Hence there are 70 students who study all three subjects.
The set of students not studying physics is $\overline{B}$ . Hence the number who study
computing and maths but not physics is $|A \cap C \cap \overline{B}|$ .
Let $X = A \cap C \cap B$ and $Y = A \cap C \cap \overline{B}$ . Then

$X \cap Y = \emptyset$ , $X \cup Y = A \cap C$.

Using
|X ∪ Y | = |X| + |Y | − |X ∩ Y | = |X| + |Y |
we have that
$|A \cap C| = |A \cap C \cap B| + |A \cap C \cap \overline{B}|$.
Hence $|A \cap C \cap \overline{B}| = 120 - 70 = 50$ .
Chapter 2

Probability

2.1 Probability Space


Probability theory provides mathematical models for understanding uncertain sit-
uations. We call such situations random experiments. Some examples of random
experiments are:

• flipping of a coin;

• rolling a die;

• next week’s lottery draw;

• measuring the time a program needs to process its input, when the amount
of data that customers will put in is random;

• the top link in a Google search (it will either lead to the desired information
or it won’t);

• generating a number using the C++ random number function “rand()”.

A probability model, although it cannot be used to determine (in advance) the


outcome of a random experiment, should capture the essential features of the random
experiment. Thus it has to be consistent with any observable “statistical regularity”
displayed by the random phenomenon.

A probability model for a random experiment has two ingredients: a sample


space and a probability function.

Definition 2.1. The sample space S for a random experiment is the set of all
possible outcomes of the experiment.

Here are some examples of sample spaces:


Experiment                                          Sample space

Flip a coin once                                    S = {H, T}

Roll a die once                                     S = {1, 2, 3, 4, 5, 6}

Flip a coin repeatedly until you obtain a “head”    S = {H, TH, TTH, TTTH, . . . }


The sample spaces in these examples are all discrete (the elements can be counted)
with the first two spaces being finite and the last one infinite. In applications a
sample space can also be a continuous set. In this course we will only deal with discrete
sample spaces.
Definition 2.2. Any subset of a sample space is called an event.
Consider for example the experiment of rolling a die. The sample space is S =
{1, 2, 3, 4, 5, 6}. The subset A = {2, 4, 6} is the event that the outcome of the roll is
even. The subset B = {1, 2, 3, 4} is the event that the outcome is at most 4. A ∩ B = {2, 4}
is the event that the outcome is even and at most 4. A ∪ B = {1, 2, 3, 4, 6} is the event
that the outcome is even or at most 4. Note that any element of the sample space S
can be viewed as a one-element subset and thus also represents an event. For example,
in the die-rolling model C = {2} is the event that the outcome is 2.
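Events are just subsets of the sample space, so they map directly onto a set data type. The following Python sketch models the die example with built-in sets; here B denotes the event "the outcome is at most 4", and all variable names are illustrative.

```python
# Sample space for one roll of a die
S = {1, 2, 3, 4, 5, 6}

# Events are subsets of S
A = {m for m in S if m % 2 == 0}  # the outcome is even
B = {m for m in S if m <= 4}      # the outcome is at most 4
C = {2}                           # the outcome is 2

print(sorted(A & B))  # intersection: even AND at most 4
print(sorted(A | B))  # union: even OR at most 4
print(sorted(S - A))  # complement of A in S: the outcome is odd
print(C <= A)         # subset test: the outcome 2 is an even outcome
```

The operators `&`, `|` and `-` correspond to ∩, ∪ and set difference, which is all that is needed to build new events from old ones.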

Example 2.3 (*). Suggest suitable sample spaces, and identify the subset corre-
sponding to the event A, for the following situations:
(a) A coin, which shows heads (H) or tails (T), is tossed three times; A is the event
that the coin shows heads exactly twice.
(b) A game of football is played; A is the event that the match is drawn.
(c) A couple have two children; A is the event that both are girls.
Solution. (a) We can take

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

and
A = {HHT, HTH, THH}.

(b) One option is to take

S = {(n, m) | n, m are integers ≥ 0}

where n is the first team’s score and m is the second team’s score. In this case

A = {(n, m) | n = m}.

Alternatively we can consider S = {W, D, L}, with W meaning that the first team
wins, D that the match is a draw, and L that the first team loses. In this case A = {D}.

(c) We can take S = {GG, GB, BG, BB} where, for example, GB means first child
is a girl, second child is a boy. In this case, A = {GG}. Alternatively we can
take S = {0, 1, 2} with each sample point corresponding to number of girls born.
In this case, A = {2}.

The second ingredient in a probability model is the probability function.

Definition 2.4. The probability function is a function P (E) that assigns a value
between 0 and 1 to any event E ⊂ S. The value 0 ≤ P (E) ≤ 1 is called the
probability of event E.

P (E) indicates, on a scale from 0 to 1, how “likely” it is that the outcome of


the random experiment will be some element of the event E ⊂ S. P (E) = 0 means
that it is impossible that the outcome will be in set E. P (E) = 1 means that it is
certain that the outcome will be in set E.
The values that a probability function takes have to satisfy certain consistency
requirements, formulated below.

Probability axioms

1. For all subsets E ⊂ S of the sample space 0 ≤ P (E) ≤ 1;

2. P (∅) = 0, P (S) = 1;

3. For any two disjoint subsets: A, B ⊂ S, A ∩ B = ∅ we have

P (A ∪ B) = P (A) + P (B) .

Axiom 3 can be extended to the case where one has more than two events: if the
events E1 , E2 , . . . , En are mutually exclusive, that is Ei ∩ Ej = ∅ for any i ≠ j, then

P (E1 ∪ E2 ∪ · · · ∪ En ) = P (E1 ) + P (E2 ) + · · · + P (En ) .

In particular for any finite subset E the probability P (E) is given by the sum of
the probabilities assigned to its individual points.
This gives a recipe for specifying and computing a probability function. By
Axiom 2 the sum of the probabilities assigned to all sample points must be equal to
1:

for S = {1, 2, 3, . . . , n} one has $\sum_{i=1}^{n} p_i = 1$, (2.1)

where we used the notation $p_i = P(\{i\})$. To specify a probability space it suffices
to specify values $p_i \ge 0$ for each of its sample points satisfying (2.1).

Example 2.5. Consider again the experiment in which we roll a die. The sample
space is S = {1, 2, 3, 4, 5, 6}. If we think that the die is fair, that is all possibilities
are equally likely, then we should set $p_1 = p_2 = \dots = p_6 = \frac{1}{6}$. Then condition (2.1)
is satisfied.

In general for a finite sample space S with N elements one can define an equiprobable
probability function by assigning $p_i = \frac{1}{N}$ for each sample point i ∈ S. This
means that all outcomes are equally likely.

WARNING: not all probability functions on finite sets are equiprobable!
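The recipe "assign a probability to each sample point, then add them up over the event" translates directly into code. In the Python sketch below, exact fractions avoid rounding issues; the function names and the weights of the loaded die are hypothetical, chosen only to illustrate a probability function that is not equiprobable.

```python
from fractions import Fraction


def equiprobable(sample_space):
    """Assign probability 1/N to each of the N sample points."""
    n = len(sample_space)
    return {outcome: Fraction(1, n) for outcome in sample_space}


def prob(event, p):
    """P(E) as the sum of the probabilities of the sample points in E."""
    return sum((p[outcome] for outcome in event), Fraction(0))


S = {1, 2, 3, 4, 5, 6}
fair = equiprobable(S)

# Hypothetical loaded die: a 6 comes up half of the time (an assumption for illustration)
loaded = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
          4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

even = {2, 4, 6}
print(prob(S, fair), prob(S, loaded))        # condition (2.1) for both models
print(prob(even, fair), prob(even, loaded))  # the same event under the two models
```

The same event "the outcome is even" receives different probabilities under the two models, which is the point of the warning above.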

For any event E we call its complement $\overline{E} = S \setminus E$ the event “not E” (any
outcome other than those in E). The following theorem holds.

Theorem 2.6. For any event E ⊂ S we have

$P(\overline{E}) = 1 - P(E)$ . (2.2)
Proof. The sample space S can be represented as a union of two disjoint sets:
$S = E \cup \overline{E}$. Therefore by Axioms 2 and 3: $1 = P(S) = P(E) + P(\overline{E})$. Now subtract
P (E) from both sides to obtain (2.2).
Axiom 3 tells us about the probability of a union of two sets when those sets are
disjoint. A natural question arises — what can be said about P (A ∪ B) when A and
B are not disjoint? The answer is given by the following theorem.

Theorem 2.7. For arbitrary events A, B ⊂ S we have


P (A ∪ B) = P (A) + P (B) − P (A ∩ B) . (2.3)
Proof. From the Venn diagram it can be seen that we can represent each set A, B,
A ∪ B in terms of disjoint unions:
A = (A \ B) ∪ (A ∩ B) , B = (B \ A) ∪ (A ∩ B) ,
A ∪ B = (A \ B) ∪ (A ∩ B) ∪ (B \ A) .
By axiom 3 we have
P (A) = P (A \ B) + P (A ∩ B) , P (B) = P (B \ A) + P (A ∩ B) ,
P (A ∪ B) = P (A \ B) + P (A ∩ B) + P (B \ A)
Adding together the first two equations and subtracting the third one we obtain
P (A) + P (B) − P (A ∪ B) = [P (A \ B) + P (A ∩ B)] + [P (B \ A) + P (A ∩ B)]
−[P (A \ B) + P (A ∩ B) + P (B \ A)]
= P (A ∩ B)
and (2.3) follows by rearranging the terms.
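Formula (2.3) can be checked exhaustively on a small sample space. The Python sketch below assumes the equiprobable model for a fair die and tests every pair of events; the names `P`, `all_events` and `theorem_2_7_holds` are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

S = (1, 2, 3, 4, 5, 6)  # sample space of a fair die


def P(event) -> Fraction:
    """Equiprobable probability function: |E| / |S|."""
    return Fraction(len(event), len(S))


def all_events(space):
    """Every subset of the sample space, from the empty set up to the whole space."""
    return [frozenset(c) for r in range(len(space) + 1)
            for c in combinations(space, r)]


def theorem_2_7_holds() -> bool:
    """Check P(A ∪ B) = P(A) + P(B) - P(A ∩ B) for every pair of events."""
    events = all_events(S)
    return all(P(A | B) == P(A) + P(B) - P(A & B)
               for A in events for B in events)


print(len(all_events(S)), theorem_2_7_holds())
```

This is a check for one particular probability function only; the proof above covers every probability function satisfying the axioms.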

Note the similarity of the statement of last theorem to the Inclusion-Exclusion


Principle. The similarity extends even further. We will be using the following
theorem.
Theorem 2.8. For any three events A, B, and C we have

P (A∪B∪C) = P (A)+P (B)+P (C)−P (A∩B)−P (A∩C)−P (B∩C)+P (A∩B∩C) .

The proof uses the same idea as in the proof of Theorem 2.7. You are warmly
encouraged to try and write it down yourself as an exercise!

Example 2.9 (*). If P (A) = 2/3 and P (B) = 3/4 what are the largest and the
smallest values that P (A ∩ B) can take?
Solution. The intersection A ∩ B is a subset of A and also a subset of B. Thus
its probability can be no larger than P (A) and no larger than P (B). So
P (A ∩ B) ≤ 2/3 = P (A), the smaller of the two. Moreover it takes that value if A ⊂ B, in
which case P (A ∩ B) = P (A) = 2/3. Thus the largest P (A ∩ B) can be is 2/3. Next
consider the equality: P (A ∩ B) = P (A) + P (B) − P (A ∪ B) = 17/12 − P (A ∪ B). The
largest P (A ∪ B) can be is 1. Thus the smallest P (A ∩ B) can be is 17/12 − 1 = 5/12.

Example 2.10. One often hears that the theory of probability started in the sev-
enteenth century, when a French nobleman, the Chevalier de Méré, proposed the
following problem in 1654 to his friend Pascal: Why is one more likely to obtain a
6 in four throws of a die than to obtain a double 6 in 24 throws of two dice? This
problem is known as de Méré’s paradox. We use the word paradox, because, based
on the fact that there are 6 possible results when we roll a die and 36 possible results
when we roll two dice, some people thought that the two events above should have
the same probability. Indeed, notice that the number of throws, divided by the num-
ber of possible results, is equal to 2/3 in both cases (4/6 = 24/36 = 2/3 = 0.66...).
Nowadays, we can easily compute the probability of each event. We find that the
probability of obtaining at least one 6 in four rolls of a (fair or non-biased) die is
$1 - (5/6)^4 = 671/1296 \approx 0.5177$, while the probability of getting at least one double
6 in throwing two dice 24 times is $1 - (35/36)^{24} \approx 0.4914$. One can conclude that
the Chevalier de Méré must have spent a lot of time throwing dice to discover such
a small difference.
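The two probabilities in de Méré's problem follow from the complement rule (2.2): the probability of at least one success is one minus the probability of no success at all. A short Python sketch with exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

# At least one 6 in four rolls of a fair die: 1 - P(no 6 in four rolls)
p_one_six = 1 - Fraction(5, 6) ** 4

# At least one double 6 in 24 throws of two fair dice: 1 - P(no double 6 in 24 throws)
p_double_six = 1 - Fraction(35, 36) ** 24

print(p_one_six, float(p_one_six))
print(float(p_double_six))
print(p_one_six > Fraction(1, 2) > p_double_six)
```

The last line makes the "paradox" concrete: the first bet is slightly favourable and the second slightly unfavourable, even though the ratio of throws to possible results is 2/3 in both cases.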

2.1.1 Exercises

Exercise 2.1.1 (*). In the following experiments decide what are the outcomes and
state if they are equally likely.

(a) A red, blue and a yellow ball are put into a bag and a ball drawn out at random.

(b) Two yellow balls, one red ball and one blue are in a bag. One ball is drawn at
random.

(c) Same collection of balls. Two balls are drawn at once.

(d) Two children throw a die to see who starts the game. If the die shows a prime
number Joe starts, otherwise Ellen starts.

Solution. (b) We can take the outcomes to be {Y, R, B}, the colour of the ball
drawn. It is natural to assume that each of the four balls is equally likely to be
drawn, with probability 1/4. As there are two yellow balls in the bag, the probability
of drawing a yellow ball is 2 × 1/4 = 1/2, whereas the blue ball and the red ball
each have probability 1/4. The outcomes are therefore not equally likely.
(c) The outcomes now are unordered pairs {Y Y, Y B, Y R, BR}. There are six equally
likely ways of choosing two of the four balls: one gives Y Y, two give Y B, two give
Y R and one gives BR. Hence the probability of drawing the pair Y B is the same as
the probability of drawing Y R, namely 1/3, and it is twice as large as the
probability of drawing Y Y (which is the same as the probability of BR), namely 1/6.
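
The probabilities in part (c) can be confirmed by listing every way of drawing two balls. The sketch below assumes that each pair of physical balls is equally likely to be drawn; the labels Y1 and Y2, which tell the two yellow balls apart, and the helper colour_pair are illustrative names only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Label the two yellow balls separately so that all draws are equally likely.
balls = ["Y1", "Y2", "R", "B"]


def colour_pair(draw):
    """Reduce a draw of labelled balls to an unordered pair of colours.

    Colours are written in the order Y, B, R to match the notation YY, YB, YR, BR.
    """
    return "".join(sorted((ball[0] for ball in draw), key="YBR".index))


draws = list(combinations(balls, 2))  # the 6 equally likely draws
counts = Counter(colour_pair(draw) for draw in draws)
probabilities = {pair: Fraction(n, len(draws)) for pair, n in counts.items()}

for pair, p in probabilities.items():
    print(pair, p)
```

Counting labelled balls rather than colours is what makes the classical "favourable over total" rule applicable here, since the four colour pairs themselves are not equally likely.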

Exercise 2.1.2 (*). Prove that if A ⊂ B then P (A) ≤ P (B).

Solution. We can represent B as the disjoint union

B = A ∪ (B \ A).

Hence by the third axiom of probability we have P (B) = P (A) + P (B \ A). Since
by the first axiom P (B \ A) ≥ 0, we conclude that P (B) ≥ P (A).

Exercise 2.1.3 (*). A weather forecaster assigns probabilities as follows:

P (A) = P ( rain today ) = 30% , P (B) = P ( rain tomorrow ) = 40% ,

P (C) = P ( rain today and tomorrow ) = 20% .


What is the probability that it will rain tomorrow but not today?

Solution. Since C = A ∩ B, we are given P (A) = 0.3, P (B) = 0.4, P (A ∩ B) = 0.2.
We are asked to find P (B ∩ Ā). The set B can be represented as a disjoint union:
B = (B ∩ A) ∪ (B ∩ Ā). Thus using P (B) = P (B ∩ A) + P (B ∩ Ā) we find P (B ∩ Ā) =
P (B) − P (B ∩ A) = 0.4 − 0.2 = 0.2. This means that there is a 20% chance that it
will rain tomorrow but not today.

Exercise 2.1.4 (*). Events A and B are such that P (A) = 0.75, P (B) = 0.65.
Explain why events A and B cannot be mutually exclusive. (Two events A and B
are called mutually exclusive if A ∩ B = ∅.) What can you say about P (A ∪ B) and
P (A ∩ B)?

Solution. If the events A and B were mutually exclusive then by formula (2.3) we
would have P (A ∪ B) = P (A) + P (B) but this is impossible because P (A) + P (B) =
1.4 > 1. As in Example 2.9, the probability P (A∩B) is the largest when the set with
smaller probability, the set B, is a subset of the set with the larger probability, the set
A. In this case P (A ∩ B) = 0.65. In the same situation P (A ∪ B) takes the smallest
possible value: 0.75. The largest P (A ∪ B) can be is 1. By formula (2.3) this is the
situation when P (A ∩ B) = P (A) + P (B) − P (A ∪ B) = 0.75 + 0.65 − 1 = 0.4 is the
smallest. All in all, we found that 0.4 ≤ P (A ∩ B) ≤ 0.65 and 0.75 ≤ P (A ∪ B) ≤ 1.
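
The bounds found in this solution follow a general pattern, which can be sketched as a pair of small functions. The names intersection_bounds and union_bounds are hypothetical, and the max(0, ...) and min(1, ...) guards are our addition to cover pairs of events with P (A) + P (B) < 1, a case that does not arise in this exercise.

```python
from fractions import Fraction


def intersection_bounds(p_a, p_b):
    """Smallest and largest possible values of P(A ∩ B) given P(A) and P(B)."""
    lower = max(0, p_a + p_b - 1)  # attained when P(A ∪ B) is as large as possible
    upper = min(p_a, p_b)          # attained when one event contains the other
    return lower, upper


def union_bounds(p_a, p_b):
    """Smallest and largest possible values of P(A ∪ B) given P(A) and P(B)."""
    lower = max(p_a, p_b)          # attained when one event contains the other
    upper = min(1, p_a + p_b)      # a probability can never exceed 1
    return lower, upper


# Exercise 2.1.4: P(A) = 0.75 and P(B) = 0.65, written as exact fractions.
p_a, p_b = Fraction(3, 4), Fraction(13, 20)
print(intersection_bounds(p_a, p_b))
print(union_bounds(p_a, p_b))
```

For the values of Exercise 2.1.4 the two calls return the fractions 2/5 and 13/20 for the intersection, and 3/4 and 1 for the union, in agreement with 0.4 ≤ P (A ∩ B) ≤ 0.65 and 0.75 ≤ P (A ∪ B) ≤ 1.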

Exercise 2.1.5. Events A, B and C are such that P (A) = 0.7, P (B) = 0.6,
P (C) = 0.5, P (A∩B) = 0.4, P (A∩C) = 0.3, P (B∩C) = 0.2 and P (A∩B∩C) = 0.1.
Find

(a) P (A ∪ B),

(b) P (A ∪ B ∪ C),

(c) P (Ā ∩ B̄ ∩ C).

Exercise 2.1.6. An experiment consists of three successive throws of a die.
Considering each outcome as equally probable, find

(a) the probability of having 3 ‘six’,

(b) the probability of having at least one ‘six’, and

(c) the probability of having two ‘six’ one after another.
