Wagner Notes-1
Introduction to Combinatorics
Hello World, and Thanks!
© 2022 David G. Wagner
Department of Combinatorics and Optimization
Faculty of Mathematics
University of Waterloo
Contents
I Introduction to Enumeration 9
2.2 The Theory in General. . . . . . . . . . . . . . . . . . . . . . . . 49
2.2.1 Generating series. . . . . . . . . . . . . . . . . . . . . . . 50
2.2.2 The Sum, Product, and String Lemmas. . . . . . . . . . 52
2.3 Compositions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.4 Subsets with Restrictions. . . . . . . . . . . . . . . . . . . . . . 60
2.5 Proof of Inclusion/Exclusion. . . . . . . . . . . . . . . . . . . . 63
2.6 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3 Binary Strings. 71
3.1 Regular Expressions and Rational Languages. . . . . . . . . . 72
3.2 Unambiguous Expressions. . . . . . . . . . . . . . . . . . . . . 75
3.2.1 Translation into generating series. . . . . . . . . . . . . 76
3.2.2 Block decompositions. . . . . . . . . . . . . . . . . . . . 77
3.2.3 Prefix decompositions. . . . . . . . . . . . . . . . . . . . 80
3.3 Recursive Decompositions. . . . . . . . . . . . . . . . . . . . . 81
3.3.1 Excluded substrings. . . . . . . . . . . . . . . . . . . . . 82
3.4 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4 Recurrence Relations. 91
4.1 Fibonacci Numbers. . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.2 Homogeneous Linear Recurrence Relations. . . . . . . . . . . . 94
4.3 Partial Fractions. . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.3.1 The Main Theorem. . . . . . . . . . . . . . . . . . . . . . 103
4.3.2 Inhomogeneous Linear Recurrence Relations. . . . . . 105
4.4 Quadratic Recurrence Relations. . . . . . . . . . . . . . . . . . 108
4.4.1 The general binomial series. . . . . . . . . . . . . . . . . 109
4.4.2 Catalan numbers. . . . . . . . . . . . . . . . . . . . . . . 110
4.5 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
II Introduction to Graph Theory 117
7 Trees. 163
7.1 Trees and Minimally Connected Graphs. . . . . . . . . . . . . . 163
7.2 Spanning Trees and Connectedness. . . . . . . . . . . . . . . . 166
7.3 Search Trees. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
7.4 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Part I
Introduction to Enumeration
Overview.
Suppose I pay $5 for a lottery ticket – what is the chance that I win a share
of the top prize? It depends on the details, of course. There are a certain
number of ways to win, and a certain number of ways to lose. Enumeration
is the art and science of figuring out this kind of thing. This is the subject of
the first part of these course notes.
There are two broad principles of the subject. The combinatorial ap-
proach is to construct explicit one-to-one correspondences between sets to
show that they have the same size. The algebraic approach is to translate the
information about the problem from combinatorics into algebra, and then to
use algebraic techniques to determine the sizes of the sets. We will see many
examples of both approaches.
In Chapter 1 we begin by introducing the basic building blocks of the the-
ory: subsets, lists and permutations, multisets, binomial coefficients, and so
on. In Section 1.2 the use of these objects is illustrated by analyzing various
applications and examples.
In Chapter 2 we introduce the idea of generating series. This begins with
the Binomial Theorem and Binomial Series, which are of fundamental im-
portance for later results. The general theory of generating series is devel-
oped in Section 2.2, and its use is illustrated by analyzing “compositions”
in Section 2.3.
In Chapter 3 we consider the enumeration of various sets of binary strings,
namely those which can be described by regular expressions – the “rational
languages”. This provides an interesting and varied class of examples to
which the results of Chapters 2 and 4 apply.
In Chapter 4 we consider sequences which satisfy a homogeneous linear
recurrence relation with initial conditions, the sequences arising in Chapters
2 and 3 being examples. This technique allows us to calculate the numbers
which answer the various counting problems we have been considering.
By using Partial Fractions we can derive an even better solution to such
problems, although the calculations involved are also more complicated.
(We include a proof of Partial Fractions for completeness.) In Section 4.4 we
briefly discuss recurrence relations which are quadratic rather than linear.
Two additional topics are discussed in Chapters ?? and ??.
Chapter 1
Basic Principles.
In the next few pages we will often be constructing an object of some kind
by repeatedly making a sequence of choices. In order to count the total
number of objects we could construct we must know how many choices are
available at each step, but we must know more: we also need to know how to
combine these numbers correctly. A generally good guideline is to look for
the words “AND” and “OR” in the description of the sequence of choices
available. Here are a few simple examples.
Example 1.1. On a table before you are 7 apples, 8 oranges, and 5 ba-
nanas.
• Choose an apple and a banana.
There are 7 choices for an apple AND 5 choices for a banana: 7×5 =
35 choices in all.
• Choose an apple or an orange.
There are 7 choices for an apple OR 8 choices for an orange: 7 + 8 =
15 choices in all.
• Choose an apple and either an orange or a banana.
There are 7 × (8 + 5) = 91 possible choices.
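The AND/OR bookkeeping in Example 1.1 can be checked by brute force. Here is a small Python sketch (my illustration, not part of the notes); the fruit labels are arbitrary placeholders.

```python
from itertools import product, chain

apples = [f"apple{i}" for i in range(7)]
oranges = [f"orange{i}" for i in range(8)]
bananas = [f"banana{i}" for i in range(5)]

# AND corresponds to a Cartesian product of the choice sets.
assert len(list(product(apples, bananas))) == 7 * 5                         # 35
# OR corresponds to a disjoint union of the choice sets.
assert len(list(chain(apples, oranges))) == 7 + 8                           # 15
# "an apple AND (an orange OR a banana)"
assert len(list(product(apples, chain(oranges, bananas)))) == 7 * (8 + 5)   # 91
```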
which is the set of all ordered pairs of elements (a, b) with a ∈ A and b ∈ B.
In general, the cardinalities of these sets are related by the formula
|A × B| = |A| · |B|.
A ∪ B = {c : c ∈ A or c ∈ B},
pn = n · pn−1 ,
provided that n is positive. (In this equation there are n choices for the first
element v of the list, AND pn−1 choices for the list of S ∖ {v} which follows
it.) It is important to note here that each list of S will be produced exactly
once by this construction.
n(n − 1)(n − 2) · · · 3 · 2 · 1.
Notice that if k > n then the number 0 will appear as one of the factors in
the product n(n − 1) · · · (n − k + 2)(n − k + 1). This makes sense, because if
k > n then there are no partial lists of length k of an n-element set. On the
other hand, if 0 ≤ k ≤ n then we could also write this product as
$$n(n-1)\cdots(n-k+2)(n-k+1) = \frac{n!}{(n-k)!}.$$
The numbers $\binom{n}{k}$ are read as “n choose k” and are called binomial coefficients.
Usually, when faced with a formula to prove, one’s first thought is to prove
it by algebraic calculations, or perhaps with an induction argument, or maybe
with a combination of the two. But often that is not the easiest way, nor is
it the most informative. A much better strategy is one which gives some in-
sight into the meaning of all of the parts of the formula. If we can interpret
all the numbers as counting things, addition as “OR”, and multiplication
as “AND”, then we can hope to find an explanation of the formula by con-
structing some objects in the correct way.
that way the formula becomes self–evident, and there is nothing more
to prove.
• $\binom{n-1}{k-1}$ is the number of (k − 1)-element subsets of {1, 2, ..., n − 1};
• $\binom{n-1}{k}$ is the number of k-element subsets of {1, 2, ..., n − 1};
• addition corresponds to disjoint union of sets.
So, this equation is saying that choosing a k-element subset A of {1, 2, ..., n}
is equivalent to either choosing a (k −1)-element subset of {1, 2, ..., n−1}
or a k-element subset of {1, 2, ..., n − 1}. This is perhaps not as clear as
the previous example, but the two cases depend upon whether the cho-
sen k-element subset A of {1, 2, ..., n} is such that n ∈ A OR n ∉ A. If
n ∈ A then A ∖ {n} is a (k − 1)-element subset of {1, 2, ..., n − 1}, while if
n ∉ A then A is a k-element subset of {1, 2, ..., n − 1}. This construction
explains the correspondence, proving the formula.
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n!}{(n-k)!\,(n-(n-k))!} = \binom{n}{n-k}$$
and $\binom{n}{0} = \binom{n}{n} = 1$, it can be used to compute any number of binomial coefficients.
n\k 0 1 2 3 4 5 6 7 8
0 1
1 1 1
2 1 2 1
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5 1
6 1 6 15 20 15 6 1
7 1 7 21 35 35 21 7 1
8 1 8 28 56 70 56 28 8 1
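The table is easy to regenerate from Pascal's recurrence $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ together with $\binom{n}{0} = \binom{n}{n} = 1$. A short Python sketch (mine, not part of the notes):

```python
def pascal_triangle(rows):
    """Return rows 0..rows-1 of Pascal's triangle using the additive recurrence."""
    triangle = [[1]]
    for n in range(1, rows):
        prev = triangle[-1]
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        triangle.append(row)
    return triangle

for row in pascal_triangle(9):
    print(row)          # reproduces the table for n = 0, ..., 8
```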
1.1.4 Multisets.
O O O O O O O O O O O O O
and cross out some t − 1 of these circles to choose a (t − 1)-element subset:
O O O O X O O O O O X O O
Now the t − 1 crosses chop the remaining sequence of n circles into t seg-
ments of consecutive circles. (Some of these segments might be empty,
which is to say of length zero.) Let mi be the length of the i-th segment
of consecutive O-s in this construction. Then m1 + m2 + · · · + mt = n, so
that (m1 , m2 , ..., mt ) is an n-element multiset with t types. Conversely, if
(m1 , m2 , ..., mt ) is an n-element multiset with t types then write down a se-
quence of m1 O-s, then an X, then m2 O-s, then an X, and so on, finishing
with an X and then mt O-s. The positions of the X-s will indicate a (t − 1)-
element subset of the positions {1, 2, ..., n + t − 1}.
The construction of the above paragraph shows how to translate between
(t − 1)-element subsets of {1, 2, ..., n + t − 1} and n-element multisets with
t types of element. This one–to–one correspondence completes the proof of
the theorem.
finite sets. In simple cases as we have seen so far this is not always neces-
sary, but it is good style. In more complicated situations, as we will see in
Chapters 2 to 4, it is a very useful way to organize one’s thoughts.
Example 1.12 (Subsets and indicator vectors.). Let P(n) be the set of all
subsets of {1, 2, ..., n}, and let {0, 1}n be the set of all indicator vectors
α = (a1 , a2 , ..., an ) in which each coordinate is either 0 or 1. There is a
bijection between these two sets. For a subset S ⊆ {1, 2, ..., n}, define the
vector α(S) = (a1 (S), a2 (S), ..., an (S)) by saying that for each 1 ≤ i ≤ n,
$$a_i(S) = \begin{cases} 1 & \text{if } i \in S, \\ 0 & \text{if } i \notin S. \end{cases}$$
Example 1.13 (Subsets and multisets.). The proof of Theorem 1.9 can be
phrased in terms of bijections, as follows.
Let M(n, t) be the set of all multisets of size n ∈ N with elements of t ≥
1 types. Let B(a, k) be the set of all k-element subsets of {1, 2, ..., a}. We
establish a bijection between M(n, t) and B(n+t−1, t−1) in what follows.
Theorem 1.5 implies that $|B(n+t-1, t-1)| = \binom{n+t-1}{t-1}$, completing the
proof of Theorem 1.9. Here is a precise description of this bijection.
Let S be any (t − 1)-element subset of {1, 2, ..., n + t − 1}. We can sort
the elements of S in increasing order: S = {s1 , s2 , ..., st−1 } in which s1 <
s2 < · · · < st−1 . For notational convenience, let s0 = 0 and let st = n + t.
Now define a sequence µ = (m1 , m2 , ..., mt ) by letting mi = si − si−1 − 1
for all 1 ≤ i ≤ t.
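As an illustration (not in the original notes), the following Python sketch implements this correspondence in both directions and confirms that it produces every multiset of size n with t types exactly once.

```python
from itertools import combinations

def subset_to_multiset(S, n, t):
    """S is a (t-1)-subset of {1,...,n+t-1}; return (m_1,...,m_t) with sum n."""
    s = [0] + sorted(S) + [n + t]                 # s_0 = 0, s_t = n + t
    return tuple(s[i] - s[i - 1] - 1 for i in range(1, t + 1))

def multiset_to_subset(m, t):
    """Inverse map: the positions of the X's in the word of O's and X's."""
    S, pos = [], 0
    for i in range(t - 1):
        pos += m[i] + 1                           # skip m_i O's, then one X
        S.append(pos)
    return set(S)

n, t = 5, 3
multisets = {subset_to_multiset(S, n, t) for S in combinations(range(1, n + t), t - 1)}
assert len(multisets) == 21                       # C(n+t-1, t-1) = C(7, 2) = 21
assert all(sum(m) == n for m in multisets)
```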
1.1.6 Inclusion/Exclusion.
In a vase is a bouquet of flowers. Each flower is (at least one of) fresh,
fragrant, or colourful:
(a) 11 flowers are fresh;
(b) 7 flowers are fragrant;
(c) 8 flowers are colourful;
(d) 6 flowers are fresh and fragrant;
(e) 5 flowers are fresh and colourful;
(f) 2 flowers are fragrant and colourful;
(g) 2 flowers are fresh, fragrant, and colourful.
How many flowers are in the bouquet?
The Principle of Inclusion/Exclusion is a systematic method for answering
such questions, which involve overlapping conditions that can be satisfied
simultaneously.
(Figure: a Venn diagram of three overlapping circles labelled fresh, fragrant, and colourful.)
the original data gives the number of flowers counted in the central trian-
gle. The subsequent steps (h) to (m) calculate the rest of the numbers in the
diagram, moving outwards from the center.
The above works very well for three properties (fresh, fragrant, colour-
ful) but becomes increasingly difficult to apply as the number of properties
increases. Figure 1.2 shows a Venn diagram for four sets, for instance. In-
stead, consider this alternative to the calculation in Example 1.14:
$$11 + 7 + 8 - 6 - 5 - 2 + 2 = 15.$$
This looks much easier to apply, and it gives the right answer. Why? That
is the Principle of Inclusion/Exclusion, which we now explain in general.
Let A1 , A2 , ..., Am be finite sets. We want a formula for the cardinality of
the union of these sets A1 ∪ A2 ∪ · · · ∪ Am . First a bit of notation: if S is a
nonempty subset of {1, 2, ..., m} then let AS denote the intersection of the
sets Ai for all i ∈ S. So, for example, with this notation we have A{2,3,5} =
A2 ∩ A3 ∩ A5 .
We prove Theorem 1.15 in Section 2.5, but all that is required is the Binomial
Theorem 2.2.
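As a quick illustration (my own, not part of the notes), the following Python sketch evaluates the inclusion/exclusion count for the bouquet data given above, and also checks the formula itself on explicit sets.

```python
from itertools import combinations

def union_size(sets):
    """|A_1 ∪ ... ∪ A_m| by inclusion/exclusion over all nonempty index sets S."""
    m, total = len(sets), 0
    for r in range(1, m + 1):
        for S in combinations(range(m), r):
            inter = set.intersection(*(sets[i] for i in S))
            total += (-1) ** (r - 1) * len(inter)
    return total

# The bouquet: singles, pairwise intersections, and the triple intersection.
total_flowers = (11 + 7 + 8) - (6 + 5 + 2) + 2
print(total_flowers)                               # 15

# Sanity check of the formula on arbitrary explicit sets.
A = [{1, 2, 3, 4}, {3, 4, 5}, {1, 5, 6, 7}]
assert union_size(A) == len(set().union(*A))
```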
Example 1.16. What is the probability that a random subset of {1, 2, ..., 8}
has at most 3 elements?
Here an outcome is a subset of {1, 2, ..., 8}, and there are 28 = 256 such
subsets. The number of subsets of {1, 2, ..., 8} with at most 3 elements is
$$\binom{8}{0} + \binom{8}{1} + \binom{8}{2} + \binom{8}{3} = 1 + 8 + 28 + 56 = 93.$$
The probability is therefore
$$\frac{93}{256} = 0.363281\ldots$$
to six decimal places.
$$\frac{24}{720} = \frac{1}{30} = 0.03333\ldots$$
in total. Of these, exactly t of them have both elements of the same type
– choose one of the t types and take two elements of that type. Thus, the
probability in question is
$$\frac{2t}{(t+1)\,t} = \frac{2}{t+1}.$$
The values for the first few t are given in the following table to four
decimal places:
t 1 2 3 4 5 6 7
2/(t + 1) 1.0000 0.6667 0.5000 0.4000 0.3333 0.2857 0.2500
(Of course, $\binom{2}{3} = 0$, but that doesn’t matter.)
The Vandermonde convolution formula can be proven algebraically by
induction on m + n, but the proof is finicky and doesn’t give much insight
into what the formula “means”. (The formula can also be deduced easily
from the Binomial Theorem 2.2, as we will see in Example 2.3.)
Here is a direct combinatorial proof, illustrating the strategy of thinking
about what the numbers mean. On the LHS, $\binom{m+n}{k}$ is the number of k-
element subsets S of the set {1, 2, ..., m + n}. On the RHS, the number can
be produced as follows:
• choose a value of j in the range 0 ≤ j ≤ k, and
• choose a j-element subset A of {1, 2, ..., m}, and
• choose a (k − j)-element subset B of {m + 1, ..., m + n}.
(Notice that the set {m + 1, ..., m + n} has n elements, so it has $\binom{n}{k-j}$ subsets
of size k − j.) Now the formula is proved by describing a bijection between
the k-element subsets S of {1, 2, ..., m+n} counted on the LHS, and the pairs
(A, B) of subsets counted on the RHS. To describe this correspondence, let
M = {1, 2, ..., m} and N = {m + 1, ..., m + n}. Notice that M ∩ N = ∅ and
M ∪ N = {1, 2, ..., m + n} and |M | = m and |N | = n. Now, given a k-element
subset S of {1, 2, ..., m + n} we let
A = S ∩ M and B = S ∩ N.
Example 1.20. Let p(n) denote the probability that in a randomly chosen
group of n people, at least two of them are born on the same day of the
year. What does the function p(n) look like?
To simplify the analysis, we will ignore the existence of leap years and as-
sume that every year has exactly 365 days. Moreover, we will also assume
Figure 1.3 gives a graph of this function. It is a rather surprising fact that
p(23) = 0.507297, so that if you randomly choose a set of 23 people on earth
then there is a slightly better than 50% chance that at least two of them will
have the same birthday. (Approximately – we have ignored leap years and
twins.)
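A one-line computation reproduces the quoted value. The sketch below is mine, not part of the notes, and uses the same 365-equally-likely-days simplification.

```python
def p(n, days=365):
    """Probability that among n people at least two share a birthday."""
    q = 1.0                       # probability that all n birthdays are distinct
    for k in range(n):
        q *= (days - k) / days
    return 1.0 - q

print(round(p(23), 6))            # 0.507297, matching the text
```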
This example is designed to illustrate the fact that the probabilities depend
on which model is used to analyze the situation. There are two reasonable
possibilities for this problem, which I will call the dice model and the mul-
tiset model.
In the “dice model” we keep track of the fact that the candies are stacked
up in the roll from bottom to top, so there is a natural sequence (c1 , c2 , ...., c10 )
of flavours one sees when the roll is opened. For example, the sequences
(G, P, R, Y, Y, G, O, R, Y, O)
and
(Y, G, O, P, R, R, Y, G, O, Y )
model. Therefore,
$$m(k) = \frac{\binom{13-k}{3}}{\binom{14}{4}}.$$
k d(k) m(k)
0 0.107374 0.285714
1 0.268435 0.219780
2 0.301990 0.164835
3 0.201327 0.119880
4 0.088080 0.083916
5 0.026424 0.055944
6 0.005505 0.034965
7 0.000786 0.019980
8 0.000074 0.009990
9 0.000004 0.003996
10 0.000000 0.000999
For the second point, the above analysis of the multiset model can be
generalized to prove Exercise 1.11: for any integers n ≥ 0 and t ≥ 2,
$$\binom{n+t-1}{t-1} = \sum_{k=0}^{n} \binom{n-k+t-2}{t-2}.$$
Poker is played with a standard deck of 52 cards, divided into four suits:
spades ♠, hearts ♥, diamonds ♦, and clubs ♣.
Within each suit are 13 cards of different values:
A (Ace), 2, 3, 4, 5, 6, 7, 8, 9, 10, J (Jack), Q (Queen), K (King).
An Ace can be high (above K) or low (below 2) at the player’s choice.
Many variations on the game exist, but the common theme is to make
the best five-card hand according to the ranking of poker hands. This order
is determined by how unlikely it is to be dealt such a hand. From best to
worst, the types of poker hand are as follows:
• Straight Flush: five cards of the same suit with consecutive values.
For example, 8♥ 9♥ 10♥ J♥ Q♥.
• Four of a Kind (or Quads): four cards of the same value, with any fifth
card. For example, 7♠ 7♥ 7♦ 7♣ 4♦.
• Full House (or Tight, or Boat): three cards of the same value, and a
pair of cards of another value. For example, 9♠ 9♥ 9♦ A♦ A♣.
• Flush: five cards of the same suit, but not with consecutive values.
For example, 3♥ 7♥ 10♥ J♥ K♥.
• Straight: five cards with consecutive values, but not of the same suit.
For example, 8♥ 9♣ 10♠ J♥ Q♦.
• Three of a Kind (or Trips): three cards of the same value, and two
more cards not of the same value. For example, 8♠ 8♥ 8♦ K♦ 5♣.
• Two Pair: this is self-explanatory.
For example, J♥ J♣ 6♦ 6♣ A♠.
• One Pair: this is also self-explanatory.
For example, Q♠ Q♦ 8♦ 7♣ 2♠.
• Busted Hand: anything not covered above.
For example, K♠ Q♦ 8♦ 7♣ 2♠.
There are $\binom{52}{5} = 2598960$ possible 5-element subsets of a standard deck
of 52 cards, so this is the total number of possible poker hands. How many
of these hands are of each of the above types? The answers can be found
easily on the WWWeb, so there’s no sense trying to keep them secret. Here
they are: N is the number of outcomes of each type, and $p = N/\binom{52}{5}$ is the
probability of each type of outcome (rounded to six decimal places).
Hand N p
Straight Flush 40 0.000015
Quads 624 0.000240
Full House 3744 0.001441
Flush 5108 0.001965
Straight 10200 0.003925
Trips 54912 0.021128
Two Pair 123552 0.047539
One Pair 1098240 0.422569
Busted 1302540 0.501177
Example 1.22.
• To construct a Straight hand there are 10 choices for the consecutive
values of the cards (A2345, 23456, ... up to 10JQKA), and $4^5$ choices
for the suits on the cards. However, four of these choices for suits
give all five cards the same suit – these lead to straight flushes and
must be discounted. Hence the total number of straights is
$$10 \cdot (4^5 - 4) = 10200.$$
• To construct a Busted hand there are $\binom{13}{5} - 10$ choices for 5 values
of cards which are not consecutive (no straight) and have no pairs.
Having chosen these values there are $4^5 - 4$ choices for the suits on
the cards which do not give all five cards the same suit (no flush).
Hence the total number of busted hands is
$$\left( \binom{13}{5} - 10 \right) \cdot (4^5 - 4) = 1302540.$$
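Both counts are easy to confirm with math.comb. The check below is my own sketch, not part of the notes.

```python
from math import comb

straights = 10 * (4**5 - 4)                  # 10 runs of values, minus the straight flushes
busted = (comb(13, 5) - 10) * (4**5 - 4)     # no pair, no straight, no flush

assert straights == 10200
assert busted == 1302540
assert comb(52, 5) == 2598960                # total number of 5-card hands
print(busted / comb(52, 5))                  # ≈ 0.501177, as in the table
```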
1.2.5 Derangements.
n! − |A1 ∪ A2 ∪ · · · ∪ An |.
{1, 2, ..., n}. Consider the example n = 8 and S = {2, 3, 6}. In this case
A{2,3,6} = A2 ∩ A3 ∩ A6 is the set of those permutations of {1, ..., 8} such that
c2 = 2 and c3 = 3 and c6 = 6. Such a permutation looks like □ 2 3 □ □ 6 □ □
in which the boxes are filled with the numbers {1, 4, 5, 7, 8} in some order.
Since there are 5! lists of the set {1, 4, 5, 7, 8} it follows that |A{2,3,6}| = 5! =
120 in this case. The general case is similar. If ∅ 6= S ⊆ {1, 2, ..., n} is a
k-element subset then the permutations of {1, 2, ..., n} in AS are obtained by
fixing ci = i for all i ∈ S, and then listing the remaining n − k elements of
{1, ..., n} ∖ S in the remaining spaces. Since there are (n − k)! such lists we
see that |AS | = (n − k)!.
Since |AS | = (n − k)! for every k-element subset of {1, 2, ..., n}, and there
are $\binom{n}{k}$ such k-element subsets, Inclusion/Exclusion implies that
$$|A_1 \cup \cdots \cup A_n| = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} (n-k)! = n! \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k!}.$$
$$n! - n! \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k!} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$$
Since the total number of permutations of n objects is n!, the probability that
a randomly chosen permutation of {1, 2, ..., n} is a derangement is
$$D_n = \sum_{k=0}^{n} \frac{(-1)^k}{k!}.$$
The following table lists the first several values of the function Dn (with
the decimals rounded to six places). Notice that for n ≥ 7 the value of Dn
changes very little. If you recall the Taylor series expansion of the exponen-
tial function
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!},$$
you will recognize $D_n$ as the n-th partial sum of this series at x = −1, so that
$D_n \to e^{-1} = 0.367879\ldots$ as n → ∞.
n Dn (exact) Dn (decimal)
0 1/1 1.000000
1 0/1 0.000000
2 1/2 0.500000
3 1/3 0.333333
4 3/8 0.375000
5 11/30 0.366667
6 53/144 0.368056
7 103/280 0.367857
8 2119/5760 0.367882
9 16687/45360 0.367879
10 16481/44800 0.367879
In summary, for the original Example 1.23, the probability that no-one gets
their own coat is very close to 36.79%.
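The table and the limiting behaviour can be verified in a few lines. The following Python sketch (mine, not from the notes) computes Dn exactly and compares it with a brute-force count of derangements for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def D(n):
    """Probability that a random permutation of {1,...,n} is a derangement."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))

def derangements_brute(n):
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))

for n in range(8):
    assert derangements_brute(n) == D(n) * factorial(n)
print(D(9), float(D(9)))          # 16687/45360 ≈ 0.367879
```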
1.3 Exercises.
Exercise 1.2. Consider rolling six fair 6-sided dice, which are distin-
guishable, so that there are $6^6 = 46656$ equally likely outcomes. Count
how many outcomes are of each of the following types. (The answers
add up to 46656.)
(a) Six-of-a-kind.
(b) Five-of-a-kind and a single.
(c) Four-of-a-kind and a pair.
(d) Four-of-a-kind and two singles.
(e) Two triples.
(f) A triple, a pair, and a single.
(g) A triple and three singles.
(h) Three pairs.
(i) Two pairs and two singles.
(j) One pair and four singles.
(k) Six singles.
Exercise 1.4.
(a) Prove that
is an equivalence relation.
(b) Prove Proposition 1.11.
Exercise 1.9. Let n be a positive integer. Let Sn be the set of all ordered
pairs of sets (A, B) in which A ⊆ B ⊆ {1, 2, ..., n}. Let Tn be the set of
all functions f : {1, 2, ..., n} → {1, 2, 3}.
(a) What is |Tn |? (Explain.)
(b) Define a bijection g : Sn → Tn . Explain why g((A, B)) ∈ Tn
for any (A, B) ∈ Sn . (You do not need to explain why g is a
bijection.)
(c) Define the inverse function g −1 : Tn → Sn of your bijection g
from part (b). (You do not need to explain why g and g −1 are
mutually inverse bijections.)
(d) By counting Sn and Tn in two different ways, deduce that
$$3^n = \sum_{k=0}^{n} \binom{n}{k} 2^k.$$
A(7, 3) B(7, 3)
(7, 0, 0) (7, 0, 0)
(5, 2, 0) (6, 1, 0)
(4, 0, 3) (5, 2, 0)
(3, 4, 0) (4, 3, 0)
(2, 2, 3) (5, 1, 1)
(1, 6, 0) (4, 2, 1)
(1, 0, 6) (3, 3, 1)
(0, 4, 3) (3, 2, 2)
Example 2.1 (The Geometric Series). The simplest infinite case of power
series is if all the coefficients equal one. Then
G = 1 + x + x2 + x3 + x4 + · · · .
Multiply this by x:
xG = x + x2 + x3 + x4 + · · · .
Subtracting the second equation from the first leaves G − xG = 1, so that
G = 1/(1 − x).
We develop two of the most useful facts that we will need in what follows.
The proofs are also good illustrations of calculating with generating series.
Proof. Recall that P(n) is the set of all subsets of {1, 2, ..., n}, and that {0, 1}n
is the set of all indicator vectors α = (a1 , a2 , ..., an ) in which each coordi-
nate is either 0 or 1. Example 1.12 gives a bijection between these two sets,
which you should recall. For example, when n = 8 the subset {2, 3, 5, 7}
corresponds to the indicator vector (0, 1, 1, 0, 1, 0, 1, 0). The constructions
S 7→ α(S) and α 7→ S(α) are mutually inverse bijections between the sets
P(n) and {0, 1}n . From this, we concluded that |P(n)| = |{0, 1}n | = 2n , but
we can deduce more. Notice that if S is a subset with k elements then it
corresponds to an indicator vector α that sums to k. It is sometimes helpful
to record this information in a little table, like this:
P(n) ↔ {0, 1}^n
S ↔ α = (a1, a2, ..., an)
|S| = a1 + a2 + · · · + an.
Now we can simplify both sides separately. On the LHS, we know from
Theorem 1.5 that there are $\binom{n}{k}$ k-element subsets of an n-element set, for
each 0 ≤ k ≤ n. Therefore,
$$\sum_{S \in \mathcal{P}(n)} x^{|S|} = \sum_{k=0}^{n} \binom{n}{k} x^k.$$
On the RHS, summing over all the indicator vectors α ∈ {0, 1}n is equivalent
to summing over all a1 ∈ {0, 1} and all a2 ∈ {0, 1} and so on,... until all
an ∈ {0, 1}. This gives
$$\sum_{\alpha \in \{0,1\}^n} x^{a_1 + a_2 + \cdots + a_n} = \sum_{a_1=0}^{1} \sum_{a_2=0}^{1} \cdots \sum_{a_n=0}^{1} x^{a_1 + a_2 + \cdots + a_n}
= \left( \sum_{a_1=0}^{1} x^{a_1} \right) \left( \sum_{a_2=0}^{1} x^{a_2} \right) \cdots \left( \sum_{a_n=0}^{1} x^{a_n} \right)
= \left( \sum_{a=0}^{1} x^a \right)^{\!n} = (1 + x)^n.$$
This proves the Binomial Theorem. With practice and familiarity, it be-
comes a one-line proof:
$$\sum_{k=0}^{n} \binom{n}{k} x^k = \sum_{S \in \mathcal{P}(n)} x^{|S|} = \sum_{\alpha \in \{0,1\}^n} x^{a_1 + a_2 + \cdots + a_n} = (1 + x)^n.$$
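The identity can be checked coefficient by coefficient for small n. The sketch below (my illustration, not part of the notes) sums $x^{|S|}$ over all subsets explicitly.

```python
from itertools import combinations
from math import comb

def subset_generating_series(n):
    """Coefficients of the sum over S ⊆ {1,...,n} of x^|S|, indexed by degree."""
    coeffs = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(1, n + 1), k):
            coeffs[k] += 1
    return coeffs

n = 6
assert subset_generating_series(n) == [comb(n, k) for k in range(n + 1)]   # (1 + x)^n
```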
Consider the set M(t) of all multisets with t ≥ 1 types of elements, re-
gardless of the size of the multiset. That is, an element of M(t) is a sequence
µ = (m1 , m2 , ..., mt ) of t natural numbers, and the size of the multiset is
|µ| = m1 + m2 + · · · + mt. By Theorem 1.9, for each n ∈ N there are $\binom{n+t-1}{t-1}$
elements of M(t) of size n. By analogy with the Binomial Theorem 2.2, we
could collect these numbers as the coefficients of a power series:
$$\sum_{n=0}^{\infty} \binom{n+t-1}{t-1} x^n.$$
The Binomial Series is an algebraic formula for this summation.
(Properly, this is the binomial series with negative integer exponent. The
general binomial series is discussed in Subsection 4.4.1).
Proof. The key observation is that the set of all multisets with t ≥ 1 types
of elements is M(t) = Nt , the Cartesian product of t copies of the natural
numbers N. This leads to a calculation similar to the proof of the Binomial
Theorem above, based on this structure:
M(t) = N^t
µ = (m1, ..., mt)
|µ| = m1 + · · · + mt
$A_n = \omega^{-1}(n) = \{\alpha \in A : \omega(\alpha) = n\}$
is finite. That is, for every n ∈ N there are only finitely many elements
α ∈ A of weight n.
(We usually suppress the superscript from the notation.) Remember – the
indeterminate x does not have a value. It is just used to keep track of the
weight of each object α ∈ A in the exponent.
Proof.
$$\Phi_A(x) = \sum_{\alpha \in A} x^{\omega(\alpha)} = \sum_{n=0}^{\infty} \sum_{\alpha \in A:\ \omega(\alpha)=n} x^{\omega(\alpha)} = \sum_{n=0}^{\infty} x^n \sum_{\alpha \in A_n} 1 = \sum_{n=0}^{\infty} |A_n|\, x^n.$$
Since we will be doing a lot of long calculations with power series, and
because of Proposition 2.7, it is useful to have a handy notation for extract-
ing coefficients from them.
Definition 2.8. Let $G(x) = g_0 + g_1 x + g_2 x^2 + \cdots = \sum_{n=0}^{\infty} g_n x^n$ be any
power series. Then for any k ∈ N, the coefficient of $x^k$ in G(x) is denoted by
$$[x^k]\, G(x) = g_k.$$
Lemma 2.10 (The Sum Lemma.). Let A and B be disjoint sets, so that
A ∩ B = ∅. Assume that ω : (A ∪ B) → N is a weight function on the
union of A and B. We may regard ω as a weight function on each of A or B
separately (by restriction). Under these conditions,
$$\Phi_{A \cup B}(x) = \Phi_A(x) + \Phi_B(x).$$
Lemma 2.12 (The Product Lemma.). Let A and B be sets with weight
functions ω : A → N and ν : B → N, respectively. Define η : A × B → N
by putting η(α, β) = ω(α) + ν(β) for all (α, β) ∈ A × B. Then η is a weight
function on A × B, and
$$\Phi^{\eta}_{A \times B}(x) = \Phi_A(x) \cdot \Phi_B(x).$$
a finite (disjoint) union of finite sets. It follows that there are only finitely
many elements of A × B of weight n. Now,
$$\Phi^{\eta}_{A \times B}(x) = \sum_{(\alpha,\beta) \in A \times B} x^{\eta(\alpha,\beta)} = \sum_{\alpha \in A} \sum_{\beta \in B} x^{\omega(\alpha) + \nu(\beta)}
= \left( \sum_{\alpha \in A} x^{\omega(\alpha)} \right) \cdot \left( \sum_{\beta \in B} x^{\nu(\beta)} \right) = \Phi_A(x) \cdot \Phi_B(x).$$
The Product Lemma 2.12 can be extended to the Cartesian product of any
finite number of sets, by induction on the number of factors. (Exercise 2.7.)
Finally, the String Lemma combines both disjoint union and Cartesian
product, as follows. Let A be a set with a weight function ω : A → N.
For any k ∈ N, the Cartesian product of k copies of A is denoted by Ak .
The entries of Ak are k-tuples (α1 , α2 , ..., αk ) with each αi ∈ A. Notice that
A0 = {ε} is the one-element set whose only element is the empty string
ε = () of length zero. We can define a weight function ωk on Ak by putting
ωk (α1 , ..., αk ) = ω(α1 ) + · · · + ω(αk ).
It is a good exercise to check that this is a weight function. Note that the
weight of the empty string is zero. Repeated application of the Product
Lemma 2.12 shows that for all k ∈ N,
ΦAk (x) = (ΦA (x))k .
We can take the union of the sets A^k for all k ∈ N:
$$A^* = \bigcup_{k=0}^{\infty} A^k.$$
Notice that the sets in this union are pairwise disjoint, since each Ak consists
of strings with exactly k coordinates. We define a function ω ∗ : A∗ → N by
saying that ω ∗ = ωk when restricted to Ak .
Proof. If γ ∈ A has weight zero, ω(γ) = 0, then for any natural number
k ∈ N, a sequence of k γ-s in Ak also has weight zero: ωk (γ, γ, ..., γ) = 0. So,
by the way ω ∗ : A∗ → N is defined, there are infinitely many elements of
weight zero in A∗ , so that ω ∗ is not a weight function.
Conversely, assume that every element of A has weight at least 1. Then,
for each k ∈ N, every element of Ak has weight at least k. Now consider
any n ∈ N and all the strings (α1 , ..., αk ) ∈ A∗ of weight n. By the previous
sentence, if there are any such strings of length k then 0 ≤ k ≤ n. For each
0 ≤ k ≤ n, Ak has only finitely many elements of weight n. It follows that
A∗ has only finitely many elements of weight n. Therefore, ω ∗ is a weight
function on A∗ .
Lemma 2.14 (The String Lemma.). Let A be a set with a weight function
ω : A → N such that there are no elements of A of weight zero. Then
$$\Phi_{A^*}(x) = \frac{1}{1 - \Phi_A(x)}.$$
Proof. By the Infinite Sum and Product Lemmas 2.11 and 2.12,
$$\Phi_{A^*}(x) = \sum_{k=0}^{\infty} \Phi_{A^k}(x) = \sum_{k=0}^{\infty} (\Phi_A(x))^k = \frac{1}{1 - \Phi_A(x)}.$$
2.3 Compositions.
γ = (c1 , c2 , ..., ck ),
Notice that there is exactly one composition of length zero: this is ε = (),
the empty string with no entries. Compositions are related to multisets,
but there are two important differences: the parts of a composition must
be positive integers, not just nonnegative, and the length of a composition
might not be specified, while the number of types of element in a multiset
must be fixed.
In this section we will apply the results of Subsection 2.2.2 to obtain for-
mulas for the generating series of various sets of compositions defined by
imposing some extra conditions. In Chapter 4 we will see how to use this
information to actually count such things.
by the geometric series. From the String Lemma 2.14 it follows that
$$\Phi_C(x) = \Phi_{P^*}(x) = \frac{1}{1 - x/(1-x)} = \frac{1-x}{1-2x} = 1 + \frac{x}{1-2x}.$$
This proves part (b).
Expanding C(x) = ΦC(x) using the geometric series, we obtain
$$C(x) = 1 + \sum_{j=0}^{\infty} 2^j x^{j+1} = 1 + \sum_{n=1}^{\infty} 2^{n-1} x^n.$$
Since |Cn | = [xn ]C(x) is the coefficient of xn in C(x), this proves part (c).
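Part (c) can be double-checked by listing compositions directly. A small Python sketch (mine, not part of the notes):

```python
def compositions(n):
    """All compositions of n into positive parts (n >= 0)."""
    if n == 0:
        return [()]
    out = []
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            out.append((first,) + rest)
    return out

for n in range(1, 10):
    assert len(compositions(n)) == 2 ** (n - 1)
print(compositions(4))    # the 8 compositions of 4
```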
Example 2.18. Let F be the set of all compositions in which each part is
either one or two.
• The allowed sizes for a part are 1 or 2, so P = {1, 2} is the set of
allowed parts. The generating series for a single part is x + x2 .
• The length can be any natural number k ∈ N. By the Product
Lemma, the generating series for compositions in F of length k is
(x + x2 )k .
• Since F = {1, 2}^*, the String Lemma implies that
$$F(x) = \Phi_F(x) = \sum_{n=0}^{\infty} f_n x^n = \sum_{k=0}^{\infty} (x + x^2)^k = \frac{1}{1 - x - x^2}.$$
Example 2.19. Let H be the set of all compositions in which each part is
at least two.
• The allowed sizes for a part are P = {2, 3, 4, ...}. The generating
series for a single part is
$$\Phi_P(x) = \sum_{c=2}^{\infty} x^c = x^2 + x^3 + x^4 + \cdots = \frac{x^2}{1-x}.$$
• By the Product Lemma, the generating series for compositions of length k in which
every part is at least two is
$$\left( \frac{x^2}{1-x} \right)^{\!k}.$$
• Since H = P^*, the String Lemma implies that
$$H(x) = \Phi_H(x) = \sum_{n=0}^{\infty} h_n x^n = \sum_{k=0}^{\infty} \left( \frac{x^2}{1-x} \right)^{\!k}
= \frac{1}{1 - x^2/(1-x)} = \frac{1-x}{1-x-x^2} = 1 + \frac{x^2}{1-x-x^2}.$$
Example 2.20. Let J be the set of all compositions in which each part is
odd.
• The allowed sizes for a part are P = {1, 3, 5, ...}. The generating
series for a single part is
$$\Phi_P(x) = \sum_{i=0}^{\infty} x^{2i+1} = x + x^3 + x^5 + \cdots = \frac{x}{1-x^2}.$$
Example 2.21. The sets F, H, and J in Examples 2.18 to 2.20 have very
similar generating series. In fact, after a little thought one sees that for
all n ≥ 2,
$$[x^n]\, H(x) = [x^{n-1}]\, J(x) = [x^{n-2}]\, F(x) = [x^{n-2}]\, \frac{1}{1 - x - x^2}.$$
This means that for all n ≥ 2, we have hn = jn−1 = fn−2 , so for the sizes
of sets we have |Hn | = |Jn−1 | = |Fn−2 |. We have proven these equalities
even though we don’t yet know what those numbers actually are! This
seems slightly magical, but it works.
Since these sets have the same sizes there must be bijections between
them to explain this fact. Constructing such bijections is left to Exercise
2.17. As a starting point, here are the sets for n = 7:
H7 J6 F5
(7) (5, 1) (2, 2, 1)
(5, 2) (1, 5) (2, 1, 2)
(2, 5) (3, 3) (1, 2, 2)
(4, 3) (3, 1, 1, 1) (2, 1, 1, 1)
(3, 4) (1, 3, 1, 1) (1, 2, 1, 1)
(3, 2, 2) (1, 1, 3, 1) (1, 1, 2, 1)
(2, 3, 2) (1, 1, 1, 3) (1, 1, 1, 2)
(2, 2, 3) (1, 1, 1, 1, 1, 1) (1, 1, 1, 1, 1)
(It need not be the case that the bijections match up these sets of com-
positions line by line in this table.) In Section 4.1 we will determine the
coefficients of the power series 1/(1 − x − x2 ), answering the counting
problem for these sets F, H, and J.
Example 2.22. Let Q be the set of all compositions in which each part is
at least two, and the number of parts is even.
• The allowed sizes for a part are P = {2, 3, 4, ...}. The generating
series for a single part is
$$\Phi_P(x) = \sum_{c=2}^{\infty} x^c = x^2 + x^3 + x^4 + \cdots = \frac{x^2}{1-x}.$$
• By the Product Lemma, the generating series for compositions with every part at least two
of length 2j is
$$\left( \frac{x^2}{1-x} \right)^{\!2j}.$$
• Since Q = (P^2)^*, the String Lemma implies that
$$Q(x) = \Phi_Q(x) = \sum_{n=0}^{\infty} q_n x^n = \sum_{j=0}^{\infty} \left( \frac{x^2}{1-x} \right)^{\!2j}
= \frac{1}{1 - x^4/(1-x)^2} = \frac{(1-x)^2}{(1-x)^2 - x^4}
= \frac{1 - 2x + x^2}{1 - 2x + x^2 - x^4} = 1 + \frac{x^4}{1 - 2x + x^2 - x^4}.$$
Here qn = [xn ]Q(x) = |Qn | is the number of compositions in Q of
size n. In Example 4.11 we will see how to calculate the first several
values of |Qn |.
The theory above for compositions can be used to obtain generating series
for subsets of natural numbers subject to some restrictions on the “gaps”
between consecutive elements of the subset. This is because of the following
correspondence between such subsets and nonempty compositions.
this composition γ is not empty, so that γ is in the set C ∖ {ε}. The size of
this composition is
$$|\gamma| = \sum_{i=1}^{k+1} c_i = \sum_{i=1}^{k+1} (a_i - a_{i-1}) = a_{k+1} - a_0 = n + 1,$$
U ↔ C ∖ {ε}
(n, A) ↔ γ
n = |γ| − 1
|A| = ℓ(γ) − 1.
Example 2.24. For each n ∈ N, let rn be the number of subsets of {1, ..., n}
that do not contain two consecutive numbers (like a and a + 1). We obtain
a formula for the generating series $R(x) = \sum_{n=0}^{\infty} r_n x^n$ using the ideas of
Proposition 2.23.
For n ∈ N, let Rn be the set of pairs (n, A) with A as in the statement
of the problem, and let $R = \bigcup_{n=0}^{\infty} R_n$. Then |Rn| = rn for all n ∈ N, and
we want to determine the generating series for the set R with respect to
the weight function ω(n, A) = n.
Combining the contributions for all lengths ℓ ≥ 1 using the Sum Lemma,
we have
$$x R(x) = M(x) = \frac{x}{1-x} + \sum_{\ell=2}^{\infty} \frac{x^{2\ell-2}}{(1-x)^{\ell}}
= \frac{x}{1-x} + \frac{x^2}{(1-x)^2} \sum_{j=0}^{\infty} \left( \frac{x^2}{1-x} \right)^{\!j}$$
$$= \frac{x}{1-x} + \frac{x^2}{(1-x)^2} \cdot \frac{1}{1 - x^2/(1-x)}
= \frac{x - x^2 - x^3}{(1-x)(1-x-x^2)} + \frac{x^2}{(1-x)(1-x-x^2)} = \frac{x + x^2}{1 - x - x^2}.$$
It follows that
$$R(x) = \frac{1 + x}{1 - x - x^2}.$$
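The coefficients of (1 + x)/(1 − x − x²) can be compared with a brute-force count of such subsets. The following sketch is my own check, not part of the notes.

```python
from itertools import combinations

def r_brute(n):
    """Number of subsets of {1,...,n} containing no two consecutive integers."""
    count = 0
    for k in range(n + 1):
        for S in combinations(range(1, n + 1), k):
            if all(b - a >= 2 for a, b in zip(S, S[1:])):
                count += 1
    return count

def r_series(n_max):
    """Coefficients of (1 + x)/(1 - x - x^2): r_0 = 1, r_1 = 2, r_n = r_{n-1} + r_{n-2}."""
    r = [1, 2]
    while len(r) <= n_max:
        r.append(r[-1] + r[-2])
    return r

r = r_series(12)
assert all(r_brute(n) == r[n] for n in range(10))
```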
which was part of the proof of the Binomial Theorem. If T is any n-element
set then
$$\sum_{S \subseteq T} x^{|S|} = (1 + x)^n$$
Recall the notation from Subsection 1.1.6: for any finite number of sets
A1, A2, ..., Am and ∅ ≠ S ⊆ {1, 2, ..., m}, let
$$A_S = \bigcap_{i \in S} A_i.$$
Proof. Let V = A1 ∪ · · · ∪ Am, and let Nm = {1, 2, ..., m}. For each v ∈ V let
T(v) = {i ∈ Nm : v ∈ Ai}. Notice that T(v) ≠ ∅ for all v ∈ V. Also notice
that for ∅ ≠ S ⊆ Nm we have v ∈ AS if and only if ∅ ≠ S ⊆ T(v). Therefore,
using Lemma 2.25 above, we have
$$\sum_{\emptyset \neq S \subseteq N_m} (-1)^{|S|-1} |A_S| = \sum_{\emptyset \neq S \subseteq N_m} (-1)^{|S|-1} \sum_{v \in A_S} 1
= \sum_{v \in V} \sum_{\emptyset \neq S \subseteq T(v)} (-1)^{|S|-1} = \sum_{v \in V} 1 = |V|,$$
as was to be shown.
Example 2.27 (The Euler totient function). For a positive integer n, the
Euler totient of n is the number ϕ(n) of integers b in the range 1 ≤ b ≤ n
such that b and n are relatively prime. That is, ϕ(n) = #{1 ≤ b ≤ n : gcd(b, n) = 1}.
Let p1, p2, ..., pm be the distinct primes dividing n, let Nm = {1, 2, ..., m}, and
for each 1 ≤ i ≤ m let Ai be the set of integers in {1, 2, ..., n} divisible by pi.
Then, for every ∅ ≠ S ⊆ Nm,
$$|A_S| = n \prod_{i \in S} \frac{1}{p_i}.$$
Therefore, by Inclusion/Exclusion,
$$\varphi(n) = n - n \sum_{\emptyset \neq S \subseteq N_m} (-1)^{|S|-1} \prod_{i \in S} \frac{1}{p_i}
= n \sum_{S \subseteq N_m} (-1)^{|S|} \prod_{i \in S} \frac{1}{p_i}
= n \prod_{i=1}^{m} \left( 1 - \frac{1}{p_i} \right).$$
2.6 Exercises.
Exercise 2.4. Calculate $[x^n]\,(1 + x)^{-2}(1 - 2x)^{-2}$. Give the simplest ex-
pression you can find.
Exercise 2.5.
(a) Let a ≥ 1 be an integer. For each n ∈ N, extract the coefficient of
x^n from both sides of this power series identity:
$$\frac{(1+x)^a}{(1-x^2)^a} = \frac{1}{(1-x)^a}$$
to show that
$$\binom{n+a-1}{a-1} = \sum_{k=0}^{\lfloor n/2 \rfloor} \binom{a}{n-2k} \binom{k+a-1}{a-1}.$$
Exercise 2.7. Extend the Product Lemma 2.12 to the product of finitely
many sets with weight functions.
Exercise 2.9.
(a) Make a list of all the four-letter “words” that can be formed from
the “alphabet” {a, b}. Define the weight of a word to be the
number of occurrences of ab in it. Determine how many words
there are of weight 0, 1 and 2. Determine the generating series.
(b) Do the same for five-letter words over the same alphabet, but,
preferably, without listing all the words separately.
(c) Do the same for six-letter words.
Exercise 2.10.
(a) Consider throwing two six–sided dice, one red and one green.
The weight of a throw is the total number of pips showing on
the top faces of both dice (that is, the usual score). Make a table
showing the number of throws of each weight, and write down
the generating series.
(b) Do the same as for part (a), but throwing three dice: one red, one
green, and one white.
Exercise 2.12. Let S be the set of ordered pairs (a, b) of integers with
0 ≤ |b| ≤ a. Each part gives a function ω defined on the set S. De-
termine whether or not ω is a weight function on the set S. If it is
not, then explain why not. If it is a weight function, then determine
the generating series ΦS (x) of S with respect to ω, and write it as a
polynomial or a quotient of polynomials.
(a) For (a, b) in S, let ω((a, b)) = a.
(b) For (a, b) in S, let ω((a, b)) = a + b.
(c) For (a, b) in S, let ω((a, b)) = 2a + b.
$$[x^n]\, \frac{1}{1-x}\, \Phi_A(x).$$
Exercise 2.16.
(a) Let An be the set of all compositions of size n in which every
part is at most three. Obtain a formula for the generating series
$\sum_{n=0}^{\infty} |A_n|\, x^n$.
(b) Let Bn be the set of all compositions of size n in which every
part is a positive integer that is not divisible by three. Obtain a
formula for the generating series $\sum_{n=0}^{\infty} |B_n|\, x^n$.
(c) Deduce that for all n ≥ 3, |Bn | = |An | − |An−3 |.
(d)* Can you find a combinatorial proof of part (c)?
Exercise 2.19. For each part, determine the generating series for the
number of subsets S of {1, 2, ..., n} subject to the stated restriction.
(a) Consecutive elements of S differ by at most 2.
(b) Consecutive elements of S differ by at least 3.
(c) Consecutive elements of S differ by at most 3.
(d) Consecutive elements of S differ by a number congruent to 1
(modulo 3).
(e) Consecutive elements of S differ by a number congruent to 2
(modulo 3).
(f) If S = {a1 < a2 < · · · < ak } then ai ≡ i (mod 2) for all 1 ≤ i ≤ k.
(g) Fix integers 1 ≤ g < h. If S = {a1 < a2 < · · · < ak } then
g ≤ ai − ai−1 ≤ h for 2 ≤ i ≤ k.
(c) What can you say about the degree of Ap (x)? What can you say
about the value of Ap (1)?
Chapter 3
Binary Strings.
We will see how to describe various subsets of binary strings in a way which
allows us to determine their generating series (with respect to length).
αβ = a1 a2 · · · am b1 b2 · · · bn .
Example 3.4. Consider the sets A = {011, 01} and B = {101, 1101}.
There are four ways to concatenate a string in A followed by a string
in B:
011.101, 011.1101, 01.101, 01.1101.
Here, the dot . indicates the point at which the concatenation takes place.
However, this information is not recorded when passing from α ∈ A
and β ∈ B to their concatenation αβ. Thus the concatenation product
AB consists of the three distinct strings 01101, 011101, and 0111101; note that
011.101 and 01.1101 both produce the string 011101.
The problem with Example 3.7 is that to describe the second set we would
need an expression like $\bigcup_{j=0}^{\infty} 0^j 1^j$. However, an infinite union like this is
not allowed according to Definition 3.2. The underlying difficulty is that a
regular expression has a “finite memory” and cannot remember arbitrarily
large numbers, like the j ∈ N needed in the above expression. There is a
close connection between rational languages and finite state machines, and
this is a central topic in the theory of computation.
union of sets.
$$\frac{1}{1 - (x + x)} = \frac{1}{1 - 2x}.$$
$$\frac{1}{1-x} \cdot \frac{1}{1 - x \cdot 1/(1-x)} = \frac{1}{1 - 2x}.$$
Sketch of proof. The proof of this is, as usual, recursive. Or, one could say it
goes by induction on the complexity of the expression R, and uses the fact
that R is unambiguous. Certainly, each of ε, 0, and 1 are unambiguous and
lead to the correct generating series for the sets {ε}, {0}, and {1}, respec-
tively. The induction step follows from Lemma 3.9 and the Sum, Product,
and String Lemmas of Subsection 2.2.2, because each of the operations is
unambiguous.
11.000.1.0.111.00.1.000.1111.00.1.0.11
are unambiguous expressions for the set {0, 1}∗ of all binary strings.
They produce each binary string block by block.
which is good.
Example 3.19. Let G be the set of binary strings in which every block
of 1s has odd length. What is the generating series for G with respect
to length? We will modify the block decomposition 0∗ (1∗ 10∗ 0)∗ 1∗ for all
binary strings. The expression 1∗ 1 in the middle produces a block of 1s.
The expression 1∗ = ε ∪ 1∗ 1 produces either the empty string or a block
(with the convention that gn = 0 for all n < 0). This gives the initial
conditions g0 = 1, g1 = 2, g2 = 3 (which can be checked directly), and the
recurrence gn = gn−1 + 2gn−2 − gn−3 for all n ≥ 3. It is easy to calculate
the first several of these numbers.
n 0 1 2 3 4 5 6 7 8
gn 1 2 3 6 10 19 33 61 108
Example 3.20. Let H be the set of binary strings in which each block of
0s has length one. It is not hard to see that (ε ∪ 0) (1∗ 1 0)∗ 1∗ is a block
decomposition for H, and is therefore unambiguous. This expression
leads to the formula
$$H(x) = \sum_{n=0}^{\infty} h_n x^n = (1 + x) \cdot \frac{1}{1 - x^2/(1-x)} \cdot \frac{1}{1-x} = \frac{1 + x}{1 - x - x^2}.$$
This resembles Examples 2.21 and 2.24 yet again! These are the Fibonacci
numbers – see Example 4.1. With $1/(1 - x - x^2) = \sum_{n=0}^{\infty} f_n x^n$, we see
that for n ≥ 1,
hn = [xn ]H(x) = fn + fn−1 = fn+1 .
Let’s reality check this for n = 5. The strings in H5 are 11111, five strings
with one 0, six strings with two 0s, and 01010. And indeed, f6 = 13.
Neat!
Example 3.21. Consider the regular expression (0∗ 1)∗ 0∗ . We claim that
this is unambiguous. To see this, let σ = b1 b2 · · · bn be a binary string
produced by this expression. As an example, take
00111010110001011000.
How can this string be produced? The repetition (0∗ 1)∗ produces a bi-
nary string by chopping it into pieces after each occurrence of the bit 1.
But any string produced by this expression is either empty or ends with
a 1. The final “suffix” 0∗ in the expression allows the possibility that the
string might end with some 0s. The string above is produced as
001.1.1.01.01.1.0001.01.1.000
and this is the only way it is produced. This rule – “chop the string into
pieces after each occurrence of the bit 1” – gives a unique way to produce
each binary string from the expression (0∗ 1)∗ 0∗ . It follows that (0∗ 1)∗ 0∗ is
an unambiguous expression for the set {0, 1}∗ of all binary strings. And,
S(x) = 1 + (x + x)S(x).
Let κ ∈ {0, 1}∗ be a nonempty binary string. We say that σ ∈ {0, 1}∗ contains
κ if there are (possibly empty) binary strings α, β such that σ = α κ β. If
σ does not contain κ then σ avoids or excludes κ. Let Aκ ⊂ {0, 1}^* be the set of
strings that avoid κ. We will develop a general method for calculating the
generating series Aκ (x).
Example 3.24. As an easy first example, consider the case κ = 01011. Let
A be the set of strings avoiding 01011, and let B be the set of strings that
have exactly one occurrence of 01011, at the very end (that is, as a suffix).
Consider the strings in A ∪ B. Such a string is either empty, or it ends
with either a 0 or a 1. If this string is not empty, then removing the last
bit leaves a (possibly empty) string in A (because of the way the sets A
and B are defined). This translates into the relation
A ∪ B = ε ∪ A (0 ∪ 1)
. . . . 0 1 0 1 1
0 1 0 1 1 . . . .
. 0 1 0 1 1 . . .
. . 0 1 0 1 1 . .
. . . 0 1 0 1 1 .
In each row after the first, there is at least one position at which the
shifted 01011 disagrees with the substring in the first row. So 01011 can-
not overlap itself in a nontrivial way. This gives the relation B = A 01011,
yielding the equation $B(x) = x^5 A(x)$. Substituting this into the first
equation gives $1 + 2x\,A(x) = (1 + x^5)\,A(x)$, which is easily solved. We
conclude that $A(x) = 1/(1 - 2x + x^5)$.
A ∪ B = ε ∪ A (0 ∪ 1)
and the same equation A(x) + B(x) = 1 + 2x A(x) for the generating
series.
The second equation is the slightly trickier part, because the string
01101 can overlap itself in a nontrivial way. As in the previous exam-
ple, we still have the set inclusion B ⊆ A 01101. But the reverse inclu-
sion does not hold in this case: for example 011.01101 = 01101.101 is in
A 01101 but not in B, because it contains a substring 01101 that is not at
the very end.
. . . . 0 1 1 0 1
0 1 1 0 1 . . . .
. 0 1 1 0 1 . . .
. . 0 1 1 0 1 . .
. . . 0 1 1 0 1 .
Looking at all the possible ways that 01101 can overlap itself, we see that
in this case A 01101 = B ∪ B 101. This gives A 01101 = B (ε ∪ 101) and
$x^5 A(x) = (1 + x^3)B(x)$. Substituting $B(x) = x^5 A(x)/(1 + x^3)$ into the first
equation and solving for A(x) yields
$$A(x) = \frac{1 + x^3}{1 - 2x + x^3 - 2x^4 + x^5}.$$
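Since this algebra is easy to get wrong, a brute-force check is reassuring. The Python sketch below (not in the original notes) counts strings avoiding 01101 directly and compares them with the coefficients of the series just derived.

```python
from itertools import product

def avoid_count(kappa, n):
    """Number of binary strings of length n that avoid the substring kappa."""
    return sum(kappa not in "".join(bits) for bits in product("01", repeat=n))

def series_coeffs(numer, denom, n_max):
    """Power series coefficients of numer(x)/denom(x); denom[0] must be 1."""
    coeffs = []
    for n in range(n_max + 1):
        c = numer[n] if n < len(numer) else 0
        c -= sum(denom[j] * coeffs[n - j] for j in range(1, min(n, len(denom) - 1) + 1))
        coeffs.append(c)
    return coeffs

numer = [1, 0, 0, 1]                 # 1 + x^3
denom = [1, -2, 0, 1, -2, 1]         # 1 - 2x + x^3 - 2x^4 + x^5
a = series_coeffs(numer, denom, 12)
assert all(avoid_count("01101", n) == a[n] for n in range(13))
```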
The proof of the general result follows exactly the same pattern.
Theorem 3.26. Let κ ∈ {0, 1}^* be a nonempty string of length n, and let
A = Aκ be the set of binary strings that avoid κ. Let C be the set of all
nonempty proper suffixes γ of κ such that κγ = ηκ for some nonempty prefix η of
κ. Let $C(x) = \sum_{\gamma \in C} x^{\ell(\gamma)}$. Then
$$A(x) = \frac{1 + C(x)}{(1 - 2x)(1 + C(x)) + x^n}.$$
Proof. Let B be the set of strings that contain exactly one occurrence of κ, at
the very end. Any string in A ∪ B is either empty or in the set A{0, 1}, so
that
A ∪ B = {ε} ∪ A {0, 1}.
This yields A(x) + B(x) = 1 + 2x A(x).
The key observation is that if β ∈ B and ρ is a proper prefix of β then
ρ ∈ A. Consequently, B ⊆ A κ. Conversely, consider any σ = α κ in A κ.
Solving this for B(x) and substituting this into the first equation gives
$$1 + 2x\, A(x) = A(x) \cdot \left( 1 + \frac{x^n}{1 + C(x)} \right).$$
3.4 Exercises.
Exercise 3.2. Let A = (10 ∪ 101) and B = (001 ∪ 100 ∪ 1001). For each
of AB and BA, is the expression unambiguous? What is the generating
series (by length) of the set it produces?
Exercise 3.3. Let A = (00 ∪ 101 ∪ 11) and B = (00 ∪ 001 ∪ 10 ∪ 110).
Prove that A∗ is unambiguous, and that B∗ is ambiguous. Find the
generating series (by length) for the set A∗ produced by A∗ .
Exercise 3.4. For each of the following sets of binary strings, write an
unambiguous expression which produces that set.
(a) Binary strings that have no block of 0s of size 3, and no block of
1s of size 2.
(b) Binary strings that have no substring of 0s of length 3, and no
substring of 1s of length 2.
(c) Binary strings in which the substring 011 does not occur.
(d) Binary strings in which the blocks of 0s have even length and
the blocks of 1s have odd length.
Exercise 3.5. Let G = 0∗ ((11)∗ 1(00)∗ 00 ∪ (11)∗ 11(00)∗ 0)∗ , and let G be
the set of binary strings produced by G.
(a) Give a verbal description of the strings in the set G.
(b) Find the generating series (by length) of G.
(c) For n ∈ N, let gn be the number of strings in G of length n. Give
a recurrence relation and enough initial conditions to uniquely
determine gn for all n ∈ N.
Exercise 3.6.
(a) Show that the generating series (by length) for binary strings in
which every block of 0s has length at least 2 and every block of
1s has length at least 3 is
$$\frac{(1 - x + x^3)(1 - x + x^2)}{1 - 2x + x^2 - x^5}.$$
(b) Give a recurrence relation and enough initial conditions to
determine the coefficients of this power series.
Exercise 3.7.
(a) For n ∈ N, let hn be the number of binary strings of length n
such that each even-length block of 0s is followed by a block of
exactly one 1 and each odd-length block of 0s is followed by a
block of exactly two 1s. Show that
$$h_n = [x^n]\, \frac{1 + x}{1 - x^2 - 2x^3}.$$
(b) Give a recurrence relation and enough initial conditions to
determine hn for all n ∈ N.
Exercise 3.8. Let K be the set of binary strings in which any block of
1s which immediately follows a block of 0s must have length at least
as great as the length of that block of 0s. (Note: this is not a rational
language.)
(a) Derive a formula for $K(x) = \sum_{\alpha \in K} x^{\ell(\alpha)}$.
Exercise 3.9.
(a) Fix an integer m ≥ 1. Find the generating series (by length) of
the set of binary strings in which no block has length greater
than m.
(b) Fix integers m, k ≥ 1. Find the generating series (by length)
of the set of binary strings in which no block of 0s has length
greater than m, and no block of 1s has length greater than k.
Exercise 3.10. Let L be the set of binary strings in which each block of
1s has odd length, and which do not contain the substring 0001. Let
Ln be the set of strings in L of length n, and let $L(x) = \sum_{n=0}^{\infty} |L_n| x^n$. Show that
$$L(x) = \frac{1 + x - x^2}{1 - x - 2x^2 + x^3 + x^4}.$$
Exercise 3.11. Let M be the set of binary strings in which each block
of 0s has length at most two, and which do not contain 00111 as a
substring. Let Mn be the set of strings in M of length n, and let
$M(x) = \sum_{n=0}^{\infty} |M_n| x^n$. Show that
$$M(x) = \frac{1 + x + x^2}{1 - x - x^2 - x^3 + x^5}.$$
Exercise 3.12. Let N be the set of binary strings in which each block
of 0s has odd length, and each block of 1s has length one or two. Let
Nn be the set of strings in N of length n, and let $N(x) = \sum_{n=0}^{\infty} |N_n| x^n$.
(a) Show that
$$N(x) = \frac{1 + 2x + x^2 - x^4}{1 - 2x^2 - x^3} = -2 + x + \frac{3 + x - 3x^2}{1 - 2x^2 - x^3}.$$
(b) Derive an exact formula for |Nn | as a function of n.
Exercise 3.15. Let V be the set of binary strings that do not contain
0110 as a substring. Show that the generating series (by length) for V
is
$$V(x) = \Phi_V(x) = \frac{1 + x^3}{1 - 2x + x^3 - x^4}.$$
Exercise 3.16.
(a) Let W be the set of binary strings that do not contain 0101 as a
substring. Obtain a formula for the generating series (by length)
of W.
(b) Fix a positive integer r ≥ 1 and consider the binary string (01)r .
(Part (a) is the case r = 2.) Obtain a formula for the generating
series of the set of binary strings that do not contain (01)r .
Chapter 4
Recurrence Relations.
n 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
fn 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610
$$F(x) = \frac{1}{1 - x - x^2}.$$
We have seen this generating series before, relating to the sets of composi-
tions F, H, and J in Example 2.21. Obtaining a formula for Fibonacci num-
bers will thus solve the counting problem for each of these sets of composi-
tions.
The key is the denominator of the series, in this case 1−x−x2 .
1 − x − x2 = (1 − αx)(1 − βx)
for some complex numbers α and β, called the inverse roots of the poly-
nomial. To do this we can use the Quadratic Formula, but since we are
looking for the inverse roots of a polynomial we have to be careful. Sub-
stitute x = 1/t and multiply both sides by t2 to get
t2 − t − 1 = (t − α)(t − β).
Example 4.4. Having found the inverse roots of the denominator, the
next step is to apply the Partial Fractions Theorem 4.12, which will be
explained (and proved) in Section 4.3. In this case it implies that there
are complex numbers A and B such that
$$\frac{1}{1 - x - x^2} = \frac{1}{(1 - \alpha x)(1 - \beta x)} = \frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x}.$$
There are a few different ways to find the numbers A and B, as we will
see. Here we can multiply by 1 − x − x2 = (1 − αx)(1 − βx) and collect
like powers of x:
Now all that remains is to put the pieces of this calculation together.
(Each term is expanded using the geometric series, a special case of the binomial series.)
$$\frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x} = A \sum_{n=0}^{\infty} \alpha^n x^n + B \sum_{n=0}^{\infty} \beta^n x^n = \sum_{n=0}^{\infty} (A\alpha^n + B\beta^n)\, x^n.$$
It follows that for all n ∈ N, the Fibonacci numbers are given by the
formula
$$f_n = [x^n] F(x) = [x^n] \frac{1}{1 - x - x^2} = A\alpha^n + B\beta^n
= \frac{5 + \sqrt{5}}{10} \left( \frac{1 + \sqrt{5}}{2} \right)^{\!n} + \frac{5 - \sqrt{5}}{10} \left( \frac{1 - \sqrt{5}}{2} \right)^{\!n}.$$
That seems kind of weird, since we know that the Fibonacci numbers are
integers. But notice that $\beta = (1 - \sqrt{5})/2 \approx -0.618$, so that as n → ∞, $\beta^n \to 0$.
In fact, for all n ∈ N, fn is the integer closest to $\frac{5 + \sqrt{5}}{10} \left( \frac{1 + \sqrt{5}}{2} \right)^{\!n}$.
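Both the closed form and the nearest-integer remark are easy to verify numerically. A Python sketch (mine, not part of the notes), using the indexing f0 = f1 = 1 from the table above:

```python
from math import sqrt

def fib(n):                        # f_0 = f_1 = 1, f_n = f_{n-1} + f_{n-2}
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2

for n in range(30):
    closed = (5 + sqrt(5)) / 10 * phi**n + (5 - sqrt(5)) / 10 * psi**n
    nearest = round((5 + sqrt(5)) / 10 * phi**n)
    assert round(closed) == fib(n) == nearest
```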
for all n ≥ N . The values g0 , g1 , ..., gN −1 are the initial conditions of the
recurrence. The relation is linear because the LHS is a linear combi-
nation of the entries of the sequence g; it is homogeneous because the
RHS of the equation is zero.
for the generating series. This is an instance of a general fact about a se-
quence with a homogeneous linear recurrence relation. Here is another ex-
ample before we see the general theory.
Now we split the LHS into separate summations, reindex them, and
write everything in terms of the power series G(x):
$$\sum_{n=3}^{\infty} g_n x^n - 3 \sum_{n=3}^{\infty} g_{n-2} x^n - 2 \sum_{n=3}^{\infty} g_{n-3} x^n = 0$$
$$(G(x) - g_0 - g_1 x - g_2 x^2) - 3x^2 \sum_{j=1}^{\infty} g_j x^j - 2x^3 \sum_{k=0}^{\infty} g_k x^k = 0.$$
It follows that
$$G(x) = \frac{2 + 5x}{1 - 3x^2 - 2x^3}.$$
Notice how the polynomial 1 − 3x2 − 2x3 in the denominator of this for-
mula is related to the linear recurrence relation gn − 3gn−2 − 2gn−3 = 0 for
n ≥ 3. We can explain the numerator, too, if we make the convention that
gn = 0 for all integers n < 0. Then, using the initial conditions, we have
g0 − 3g−2 − 2g−3 = 2 − 0 − 0 = 2 for n = 0,
g1 − 3g−1 − 2g−2 = 5 − 0 − 0 = 5 for n = 1,
g2 − 3g0 − 2g−1 = 6 − 3 · 2 − 0 = 0 for n = 2,
gn − 3gn−2 − 2gn−3 = 0 for n ≥ 3.
Q(x) = 1 + a1 x + a2 x2 + · · · + ad xd
Proof. To prove this theorem, we just copy the calculation in Example 4.7,
but do it in the most general case. For convenience, let a0 = 1. Assume that
part (a) holds, and let
Q(x) = 1 + a1 x + a2 x2 + · · · + ad xd .
Consider the product Q(x)G(x):
$$Q(x)G(x) = \left( \sum_{j=0}^{d} a_j x^j \right) \left( \sum_{n=0}^{\infty} g_n x^n \right)
= \sum_{j=0}^{d} \sum_{n=0}^{\infty} a_j g_n x^{n+j} = \sum_{k=0}^{\infty} \left( \sum_{j=0}^{d} a_j g_{k-j} \right) x^k.$$
In the last step we have re-indexed the double sum using k = n + j, and
used the convention that gn = 0 for all n < 0.
The coefficient of xk in this formula is gk + a1 gk−1 + · · · + ad gk−d . This
is the LHS of the recurrence relation for g applied when n = k. Thus, this
coefficient is zero for k ≥ N . On the other hand, for 0 ≤ k ≤ N − 1, we see
that it is $\sum_{j=0}^{d} a_j g_{k-j} = b_k$ by the way the numbers bk are defined. That is,
$$Q(x)G(x) = \sum_{k=0}^{N-1} b_k x^k = P(x),$$
and it follows that G(x) = P (x)/Q(x). This shows that (a) implies (b).
Conversely, assume that (b) holds and that G(x) = P (x)/Q(x) is as given.
We essentially run the argument for the first part of the proof in reverse.
The equations bk = gk + a1 gk−1 + · · · + ad gk−d for 0 ≤ k ≤ N − 1 (with the
convention that gn = 0 for n < 0) determine the initial conditions g0 , g1 ,...
gN −1 inductively. For n ≥ N , the coefficient of xn in P (x) = Q(x)G(x) is
zero. This implies that gk + a1 gk−1 + · · · + ad gk−d = 0 for all k ≥ N , showing
that (b) implies (a).
n 0 1 2 3 4 5 6 7 8
dn 1 −1 1 1 5 4 5 −5 −22
n 0 1 2 3 4 5 6 7
sn 1 2 1 3 7 9 15 29
To get the generating series, Theorem 4.8 implies immediately that the
denominator is 1 − x − 2x3 . To obtain the numerator we apply the recur-
rence for small values of n, with the convention that sn = 0 if n < 0.
$$s_n - s_{n-1} - 2s_{n-3} = \begin{cases} 1 & \text{if } n = 0, \\ 2 - 1 = 1 & \text{if } n = 1, \\ 1 - 2 = -1 & \text{if } n = 2. \end{cases}$$
Hence
$$S(x) = \frac{1 + x - x^2}{1 - x - 2x^3}.$$
Example 4.11. Let’s revisit Example 2.22, concerning the set Q of all com-
positions in which each part is at least two, and the number of parts is
even. We derived the generating series
$$Q(x) = \sum_{n=0}^{\infty} q_n x^n = \frac{1 - 2x + x^2}{1 - 2x + x^2 - x^4}.$$
n 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
qn 1 0 0 0 1 2 3 4 6 10 17 28 45 72 116
We have determined that |Q14 | = 116, but we have not listed all these
compositions individually. That is pretty cool, when you think about it.
$$\frac{1}{Q(x)},\ \frac{x}{Q(x)},\ \frac{x^2}{Q(x)},\ \ldots,\ \frac{x^{d-1}}{Q(x)}$$
$$\sum_{i=1}^{s} \sum_{j=1}^{d_i} \frac{C_i^{(j)}}{(1 - \lambda_i x)^j} = 0. \qquad (4.1)$$
We must show that $C_i^{(j)} = 0$ for all 1 ≤ i ≤ s and 1 ≤ j ≤ di. Suppose not.
Then there is some index 1 ≤ p ≤ s for which at least one of the coefficients
$C_p^{(1)}, C_p^{(2)}, \ldots, C_p^{(d_p)}$ is not zero. Letting $C_p^{(t)} \neq 0$ be the one with the largest
superscript, we also have $C_p^{(t+1)} = \cdots = C_p^{(d_p)} = 0$.
Now multiply equation (4.1) by $(1 - \lambda_p x)^t$. Separating out the terms with
i = p and using the fact that $C_p^{(t+1)} = \cdots = C_p^{(d_p)} = 0$, we see that
$$\sum_{j=1}^{t} C_p^{(j)} (1 - \lambda_p x)^{t-j} + \sum_{i \neq p} \sum_{j=1}^{d_i} C_i^{(j)} \frac{(1 - \lambda_p x)^t}{(1 - \lambda_i x)^j} = 0.$$
The LHS is a rational function of the variable x which does not have a pole
at the point x = 1/λp , so we can substitute this value for x. But every term
on the LHS has a factor of (1 − λp x) except for the term with i = p and j = t.
Thus, upon making the substitution x = 1/λp this equation becomes
$C_p^{(t)} = 0.$
But this contradicts our choice of p and t. This contradiction shows that all
the coefficients $C_i^{(j)}$ in equation (4.1) must be zero, and it follows that the set
B is linearly independent.
Since B is a set of d linearly independent vectors in a vector space VQ
of dimension at most d, it follows that B is a basis for VQ , and the proof is
complete.
G(x) = \frac{2 + 5x}{1 - 3x^2 - 2x^3}
from Example 4.7. This satisfies the hypotheses of the Partial Fractions
Theorem 4.12. The denominator 1 − 3x^2 − 2x^3 vanishes when x = −1, so
that 1 + x is a factor. Some calculation shows that

\frac{2 + 5x}{1 - 3x^2 - 2x^3} = \frac{A}{1 + x} + \frac{B}{(1 + x)^2} + \frac{C}{1 - 2x},

with A = 1, B = −1, and C = 2.
It follows that g_n = [x^n] G(x) = 2^{n+1} + n(−1)^{n+1} for all n ∈ N. This can
be “reality checked” by comparison with the initial conditions g0 = 2,
g1 = 5, and g2 = 6, and the recurrence relation gn − 3gn−2 − 2gn−3 = 0 for
all n ≥ 3 defining this sequence in Example 4.7. The first few values are
n 0 1 2 3 4 5 6
gn 2 5 6 19 28 69 122
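As a further check, a few lines of Python (ours, purely illustrative) compare the closed form with the recurrence from Example 4.7.

# Check the closed form against the recurrence of Example 4.7:
# g_n = 3*g_{n-2} + 2*g_{n-3} with g_0 = 2, g_1 = 5, g_2 = 6.
g = [2, 5, 6]
for n in range(3, 12):
    g.append(3 * g[n - 2] + 2 * g[n - 3])

closed_form = [2 ** (n + 1) + n * (-1) ** (n + 1) for n in range(12)]
print(g == closed_form)   # True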
G(x) = R(x) + \frac{P(x)}{Q(x)}
for some polynomials P (x), Q(x), and R(x) with deg P (x) < deg Q(x) and
Q(0) = 1. Factor Q(x) to obtain its inverse roots and their multiplicities:

Q(x) = (1 - \lambda_1 x)^{d_1} (1 - \lambda_2 x)^{d_2} \cdots (1 - \lambda_s x)^{d_s},

with λ_1, λ_2, ..., λ_s pairwise distinct. Then there are polynomials p_i(n) for 1 ≤ i ≤ s, with deg p_i(n) < d_i, such
that for all n > deg R(x),

g_n = p_1(n) \lambda_1^n + p_2(n) \lambda_2^n + \cdots + p_s(n) \lambda_s^n.
Proof. The conclusion of the theorem only concerns terms with n > deg R(x),
so we can basically ignore the polynomial R(x). In truth, all it is doing is
getting in the way, and preventing the formula from holding for smaller
values of n. So we are going to concentrate on the quotient P (x)/Q(x), to
which the Partial Fractions Theorem 4.12 applies.
Consider the factor (1 − λi x)di of the denominator Q(x). In the partial
fractions expansion of P (x)/Q(x), this contributes
\frac{C_i^{(1)}}{1 - \lambda_i x} + \frac{C_i^{(2)}}{(1 - \lambda_i x)^2} + \cdots + \frac{C_i^{(d_i)}}{(1 - \lambda_i x)^{d_i}}.

Expanding each term by the Binomial Series,

\sum_{j=1}^{d_i} \frac{C_i^{(j)}}{(1 - \lambda_i x)^j} = \sum_{j=1}^{d_i} C_i^{(j)} \sum_{n=0}^{\infty} \binom{n+j-1}{j-1} \lambda_i^n x^n = \sum_{n=0}^{\infty} \left( \sum_{j=1}^{d_i} C_i^{(j)} \binom{n+j-1}{j-1} \right) \lambda_i^n x^n.
Notice that \binom{n+j-1}{j-1} is a polynomial function of n of degree j − 1. It follows
that
p_i(n) = \sum_{j=1}^{d_i} C_i^{(j)} \binom{n+j-1}{j-1}
is a polynomial function of n of degree at most d_i − 1. The contribution of
the inverse root λ_i to the coefficient g_n = [x^n] G(x) is thus p_i(n) λ_i^n. By the
form of the partial fractions expansion we see that

g_n = p_1(n) \lambda_1^n + p_2(n) \lambda_2^n + \cdots + p_s(n) \lambda_s^n

for all n > deg R(x), completing the proof.
Theorem 4.14 implies that there are constants A, B, C such that for suffi-
ciently large n, h_n = (A + Bn) 2^n + C(−1)^n. To determine these constants
we need to take data from the sequence h from a point later than the de-
gree of the polynomial R(x) appearing in Theorem 4.14. From Theorem
4.8, in this case the degree of the numerator of the generating series H(x)
is no more than five, since the general case of the recurrence holds for
n ≥ 6. Writing
h_3 = 2 = (A + 3B)·8 − C = 8A + 24B − C,
h_4 = −4 = (A + 4B)·16 + C = 16A + 64B + C,
h_5 = 3 = (A + 5B)·32 − C = 32A + 160B − C,
and solving this linear system gives A = −5/16, B = 1/16, and C = −3. Therefore
h_n = (n − 5) 2^{n−4} − 3(−1)^n
for all n ≥ 3. The values for hn with 0 ≤ n ≤ 2 are given in the initial
conditions.
gn = gn−1 + 2 gn−2 − n + 1
n 0 1 2 3 4 5 6 7 8
gn 1 2 3 5 8 14 25 47 90
What is gn as a function of n ∈ N?
We solve Example 4.16 by generalizing the method above just a little bit.
First, write the recurrence in the form
gn − gn−1 − 2 gn−2 = −n + 1
Examples 4.16 and 4.17 illustrate a general fact: if the generating series
of the RHS in an inhomogeneous linear recurrence relation is a rational
function, then the generating series for the entries of the sequence is also a
rational function. Thus, the sequence in fact satisfies a homogeneous linear
recurrence relation, so we are actually back in the case we have already
considered. Proving this in general is the main point of this subsection.
The following terminology is not standard but will be convenient. A
function q : N → C is polyexp if there are polynomial functions qi (n) and
complex numbers β_i ∈ C for 1 ≤ i ≤ t such that

q(n) = q_1(n) β_1^n + q_2(n) β_2^n + \cdots + q_t(n) β_t^n        (4.2)

for all n ∈ N. A function is eventually polyexp if there is an M ∈ N such that it agrees with a
polyexp function for all n ≥ M.
(c) The generating series G(x) = \sum_{n=0}^{\infty} g_n x^n is a rational function (a
quotient of polynomials in x).
(d) The sequence g = (g0 , g1 , g2 , ...) is an eventually polyexp function.
Proof. Theorem 4.8 shows that conditions (a) and (c) are equivalent. Theo-
rem 4.14 shows that (c) implies (d). That (d) implies (c) is left as Exercise
4.11. It is clear that (a) implies (b). All that remains is to show that (b) im-
plies (c).
Thus, assume that g satisfies the linear recurrence relation
gn + a1 gn−1 + · · · + ad gn−d = q(n)
for all n ≥ N , with initial conditions g0 , g1 , ..., gN −1 , in which q : N → C is an
eventually polyexp function as in equation (4.2) for all n ≥ M .
UNDER CONSTRUCTION
Here, the coefficients A(x), B(x), and C(x) are power series in x.
There are two solutions to the equation in Definition 4.19, and they can
be found using the Quadratic Formula:
G_{\pm}(x) = \frac{-B(x) \pm \sqrt{B(x)^2 - 4 A(x) C(x)}}{2 A(x)}.
Rigorous justification for this kind of algebra with power series is discussed
in detail in CO 330. If G(x) is a generating series for some combinatorial ob-
jects then it has only nonnegative coefficients and nonnegative exponents.
This can be used to decide which case of the ± sign to take. In general, only
one of G+ (x) or G− (x) is the correct generating series.
In Section 2.1 we saw the Binomial Theorem and The Binomial Series with
negative integer exponents. That is, for a natural number n ∈ N,
(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.
Proposition 4.22. \sqrt{1 - 4x} = 1 - 2 \sum_{k=1}^{\infty} \frac{1}{k} \binom{2k-2}{k-1} x^k.
Proof. By Theorem 4.21,

\sqrt{1 - 4x} = \sum_{k=0}^{\infty} (-1)^k 4^k \binom{1/2}{k} x^k.

For k = 0, the coefficient of x^0 is (-1)^0 4^0 \binom{1/2}{0} = 1.
For k ≥ 1, we can calculate as follows:

(-1)^k 4^k \binom{1/2}{k}
  = (-1)^k 4^k \frac{(1/2)(-1/2)(-3/2) \cdots (1/2 - k + 1)}{k!}
  = -4^k \frac{(1/2)(1/2)(3/2) \cdots (k - 3/2)}{k!}
  = -2^k \frac{(1)(1)(3)(5) \cdots (2k - 3)}{k!}
  = -\frac{2}{k} \cdot \frac{(1)(3) \cdots (2k-3)}{(k-1)!} \cdot \frac{(2)(4) \cdots (2k-2)}{(k-1)!}
  = -\frac{2}{k} \binom{2k-2}{k-1}.
(Where did we use the fact that k ≥ 1 in this calculation?)
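For the skeptical, a few lines of Python (ours, purely illustrative) confirm the identity used in this calculation with exact rational arithmetic.

from fractions import Fraction
from math import comb

def binom_half(k):
    # binom(1/2, k) = (1/2)(1/2 - 1)...(1/2 - k + 1) / k!
    value = Fraction(1)
    for j in range(k):
        value *= Fraction(1, 2) - j
    for j in range(1, k + 1):
        value /= j
    return value

for k in range(1, 9):
    lhs = (-1) ** k * 4 ** k * binom_half(k)
    rhs = Fraction(-2, k) * comb(2 * k - 2, k - 1)
    assert lhs == rhs
print("the two expressions agree for k = 1, ..., 8")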
W(x) = \sum_{n=0}^{\infty} w_n x^n.

From the decomposition W = ε ∪ 0W1W we obtain

W(x) = 1 + x W(x)^2.
Now we can solve this equation x W(x)^2 − W(x) + 1 = 0 using the Quadratic
Formula:

W_{\pm}(x) = \frac{1 \pm \sqrt{1 - 4x}}{2x}.
Proposition 4.22 gives the power series for \sqrt{1 - 4x}, so that

\frac{1 \pm \sqrt{1 - 4x}}{2x} = \frac{1}{2x} \pm \frac{1}{2x} \left( 1 - 2 \sum_{n=0}^{\infty} \frac{1}{n+1} \binom{2n}{n} x^{n+1} \right).
To get nonnegative coefficients, and to cancel the term 1/2x, we need to take
the minus sign from the ±. The result is
W(x) = \sum_{n=0}^{\infty} \frac{1}{n+1} \binom{2n}{n} x^n.
Thus, the number of WFPs of size n is the n-th Catalan number C_n = \frac{1}{n+1}\binom{2n}{n},
for each n ∈ N.
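Comparing coefficients of x^{n+1} in W(x) = 1 + x W(x)^2 also gives the convolution w_{n+1} = \sum_{k=0}^{n} w_k w_{n-k}. The following Python sketch (ours, purely illustrative) checks this recurrence against the closed form.

from math import comb

# w_{n+1} = sum_{k=0}^{n} w_k * w_{n-k}, versus the closed form C_n.
w = [1]
for n in range(10):
    w.append(sum(w[k] * w[n - k] for k in range(n + 1)))

catalan = [comb(2 * n, n) // (n + 1) for n in range(11)]
print(w == catalan)   # True
print(catalan)        # [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796]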
Since the generating series for the set W is W(x) = (1 − \sqrt{1 − 4x})/2x,
which is not a rational function, it follows from Theorem 3.13 that the set of
WFPs is not a rational language.
4.5 Exercises.
Exercise 4.1. For each of the sets of compositions from Exercise 2.15,
do the following.
• Derive a recurrence relation and initial conditions for the coefficients
of the corresponding generating series G(x) = \sum_{n=0}^{\infty} g_n x^n.
• Calculate the coefficients g0 , g1 , ... up to g9 .
(a) Give a linear recurrence relation that (together with the initial
conditions above) determines the sequence of coefficients (cn :
n ≥ 0) uniquely.
(b) Derive a formula for cn as a function of n ≥ 0.
[Hint: 1 − 5x + 8x^2 − 4x^3 = (1 − x)(1 − 4x + 4x^2).]
(a) Write down a linear recurrence relation and enough initial con-
ditions to determine the sequence (an : n ∈ N) uniquely.
(b) Given that 1 − 3x^2 − 2x^3 = (1 − 2x)(1 + x)^2, obtain a formula for
an as a function of n ∈ N.
(a) Give a linear recurrence relation that (together with the initial
conditions above) determines the sequence of coefficients (cn :
n ≥ 0) uniquely.
(b) Derive a formula for cn as a function of n ≥ 0.
Exercise 4.8.
(a) Obtain a formula for the coefficients of the rational function
B(x) = \sum_{n=0}^{\infty} b_n x^n = \frac{1 + 3x - x^2}{1 - 3x^2 - 2x^3}.
Exercise 4.10.
(a) Find rational numbers A, B, C such that for all n ∈ N,
n^2 = A \binom{n+2}{2} + B(n + 1) + C.
F_d(x) = x \frac{d}{dx} F_{d-1}(x).
(e) Let F_d(x) = P_d(x)/(1 − x)^{1+d}. Derive a recurrence relation for
the polynomials P_d(x).
Exercise 4.11. Show that the converse of Theorem 4.14 holds. That is,
assume that
Then

G(x) = \sum_{n=0}^{\infty} g_n x^n = R(x) + \frac{P(x)}{Q(x)},
in which P (x) and R(x) are polynomials, and deg P (x) < deg Q(x)
and deg R(x) < N .
Exercise 4.12.
(a) Show that \frac{1}{\sqrt{1 - 4x}} = \sum_{k=0}^{\infty} \binom{2k}{k} x^k.

(b) Deduce that for all n ∈ N, \sum_{j=0}^{n} \binom{2j}{j} \binom{2n-2j}{n-j} = 4^n.
(c)* Can you think of a combinatorial proof of part (b)?
Part II
Introduction to Graph Theory
Overview.
In Chapter 8 we address the question of which graphs can be drawn in
the plane without crossing edges – the answer is surprisingly recent, due
to Kuratowski in 1930. The five Platonic solids – familiar from high-school
geometry – also make an appearance.
In Chapter 9 we discuss colourings of graphs. The famous Four Colour
Theorem takes center stage – ideas related to it have had a profound influ-
ence on the development of graph theory (and combinatorics, more gener-
ally) since the question was first posed in the 1850s.
Finally, Chapter 10 develops the theory of matchings, particularly in bi-
partite graphs. The main results are again surprisingly recent, due to König
and to Hall in the 1930s. And again, they have had a strong influence on
graph theory and optimization, right up to the present day.
In Chapters ?? and ?? we discuss a few other topics involving graphs,
algorithms, optimization, or linear algebra.
Chapter 5
Graphs and Isomorphism.
5.1 Graphs.
G = ({1, 2, 3, 4, 5}, {{1, 2}, {1, 3}, {1, 4}, {2, 3}, {3, 4}, {3, 5}}).
This graph has vertex-set V (G) = {1, 2, 3, 4, 5}
and edge-set
E(G) = {{1, 2}, {1, 3}, {1, 4}, {2, 3}, {3, 4}, {3, 5}}.
That is mathematically precise, but difficult for a human being to grasp in-
tuitively.
1. The neighbourhood of a vertex v ∈ V is the set N(v) = {w ∈ V : vw ∈ E}
of vertices adjacent to v.
2. A vertex v ∈ V and edge e ∈ E are incident if v ∈ e. The two
vertices incident with an edge e ∈ E are the ends of e.
3. The degree of a vertex v ∈ V is the number of edges incident with
it. This number is denoted by deg(v). That is, deg(v) = |{e ∈ E : v ∈ e}|.
To prove the Handshake Lemma, consider the set X of all pairs (w, f) with w ∈ V,
f ∈ E, and w ∈ f. We count X in two different ways. First, for every vertex v ∈ V there are deg(v) pairs
(w, f ) in X with vertex v as the first coordinate. Therefore,
|X| = \sum_{v \in V} \deg(v).
Second, for every edge e ∈ E there are 2 pairs (w, f ) in X with edge e as the
second coordinate. Therefore,
|X| = 2 · |E|.
Proof. Let S0 be the set of vertices of even degree, and let S1 be the set of
vertices of odd degree. Consider the Handshake Lemma (modulo 2):
0 ≡ 2·|E| ≡ \sum_{v \in V} \deg(v) ≡ \sum_{v \in S_0} \deg(v) + \sum_{v \in S_1} \deg(v) ≡ \sum_{v \in S_0} 0 + \sum_{v \in S_1} 1 ≡ |S_1| \pmod{2}.

Thus |S_1|, the number of vertices of odd degree, is even.
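Both counts are easy to verify by machine for the small graph of Section 5.1; here is a minimal Python sketch (ours, purely illustrative).

# The graph G of Section 5.1, as a vertex set and a list of edges.
V = {1, 2, 3, 4, 5}
E = [{1, 2}, {1, 3}, {1, 4}, {2, 3}, {3, 4}, {3, 5}]

deg = {v: sum(1 for e in E if v in e) for v in V}
print(deg)                                    # {1: 3, 2: 2, 3: 4, 4: 2, 5: 1}
print(sum(deg.values()) == 2 * len(E))        # True: the Handshake Lemma
print(sum(1 for v in V if deg[v] % 2 == 1))   # 2: an even number of odd-degree vertices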
5.3 Examples.
Example 5.9 (Circulants). Fix n ≥ 2 and let Zn = {[0], [1], ..., [n − 1]}
denote the integers (modulo n). Let S be any subset of Zn with [0] 6∈ S.
The circulant Cn(S) has vertex-set Zn, with [a] and [b] adjacent if and only if [a − b] ∈ S or [b − a] ∈ S.
To simplify the notation, we write C10 (1, 3, 4) instead of C10 ({[1], [3], [4]}),
and so on. From the definition of Cn (S), if [a] ∈ S then it makes no difference
whether or not [−a] ∈ S. In either case, the resulting graph will be the same.
Notice that the circulant Cn (1) “looks the same as” the cycle Cn . Also,
the circulant Cn (Zn r {[0]}) “looks the same as” the complete graph Kn .
Less obviously, the circulant C12 (1, 3, 5) in Figure 5.5 “looks the same as”
the complete bipartite graph K6,6 . This idea of two graphs looking the same
is made precise in Section 5.4.
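Here is a small Python sketch (ours, purely illustrative) that builds the edge set of a circulant directly from the definition above.

def circulant_edges(n, S):
    # Vertices are the residues 0, 1, ..., n-1; [a] and [b] are adjacent
    # exactly when [a - b] or [b - a] lies in S.
    S = {s % n for s in S} | {(-s) % n for s in S}
    return {frozenset({a, b}) for a in range(n) for b in range(n)
            if a != b and (a - b) % n in S}

print(len(circulant_edges(10, {1, 3, 4})))   # 30: each of the 10 vertices has degree 6
print(len(circulant_edges(6, {1})))          # 6: the same edges as the cycle C_6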
Example 5.11 (Word Graphs). For ℓ ≥ 1, the word graph Word(ℓ) has
vertex-set consisting of all ℓ-letter words in the English language. Two
words are adjacent in Word(ℓ) if and only if they differ by the substitution
of exactly one letter in one position.
For instance, in Word(4) we can hop along the edges from frog to toad as
follows.
frog − flog − flag − flat − feat − beat − boat − goat − goad − toad
When I found this path in Word(4) I didn’t know that “trog” (British slang
for a stupid oafish person) and “trad” (traditional jazz or folk music) are
actual English words. Here is a shorter path:
Example 5.12 (Unit Square Graphs). Fix a positive real number r > 0.
Choose n points in the unit square [0, 1] × [0, 1] independently and uni-
formly at random. Join two of these points by a line segment if they are
within distance r of each other. These points and lines can be thought of
as the vertices and edges of a graph.
Figures 5.7 to 5.10 give examples of unit square graphs. They all have the
same randomly chosen vertex-set of n = 30 points, for four different values
of r > 0.
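The construction is easy to simulate; here is a minimal Python sketch (ours, purely illustrative; the random seed is arbitrary).

import random

def unit_square_graph(n, r, seed=0):
    # n points chosen uniformly at random in [0,1] x [0,1]; two points are
    # joined whenever they lie within distance r of each other.
    random.seed(seed)
    pts = [(random.random(), random.random()) for _ in range(n)]
    E = [(i, j) for i in range(n) for j in range(i + 1, n)
         if (pts[i][0] - pts[j][0]) ** 2 + (pts[i][1] - pts[j][1]) ** 2 <= r ** 2]
    return pts, E

pts, E = unit_square_graph(30, 0.3)
print(len(E))   # the number of edges increases as r increases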
5.4 Isomorphism.
Example 5.13. Consider the three graphs pictured in Figure 5.11. The
graphs G and H are the same graph, since they have the same set of
vertices V (G) = V (H) and the same set of edges E(G) = E(H). That is,
G = H; these graphs are equal even though the pictures “look different”.
The graph J is not equal to the graph G because these graphs do not have
the same set of vertices: V(G) ≠ V(J). But from the picture it is clear that
J “looks the same as” G.
When G ≅ H, we also say that G is isomorphic with H, or that G and H are
isomorphic. Informally, an isomorphism is a bijection between vertices that
sends edges to edges and non-edges to non-edges. In other words, it is a
bijection f such that both f and its inverse function f −1 preserve adjacency.
We sometimes write f : G → H for an isomorphism f : V (G) → V (H) to
simplify the notation.
[Figure 5.11: the graphs G, H, and J.]

The relation ≅ of isomorphism is an equivalence relation, for the following reasons.
• For any graph G, the identity function ι : G → G is an isomorphism, so
that G ≅ G. This is the reflexive property. (Here, the identity function
ι : V → V is such that ι(v) = v for all v ∈ V.)
• If f : G → H is an isomorphism then f⁻¹ : H → G is an isomorphism.
Thus, if G ≅ H then H ≅ G. This is the symmetric property.
• If f : G → H and g : H → J are isomorphisms, then g ◦ f : G → J is
an isomorphism. This is the transitive property.
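The definition is easy to check mechanically. The Python sketch below (ours, purely illustrative; the pair of 4-cycles is a made-up example, not the graphs of Figure 5.11) tests whether a given bijection is an isomorphism.

def is_isomorphism(f, VG, EG, VH, EH):
    # f is an isomorphism when it is a bijection V(G) -> V(H) that sends
    # edges to edges and non-edges to non-edges.
    if set(f) != set(VG) or set(f.values()) != set(VH) or len(set(f.values())) != len(VG):
        return False
    EG = {frozenset(e) for e in EG}
    EH = {frozenset(e) for e in EH}
    return all((frozenset({f[v], f[w]}) in EH) == (frozenset({v, w}) in EG)
               for v in VG for w in VG if v != w)

# Two 4-cycles on different vertex sets (hypothetical example).
VG, EG = [1, 2, 3, 4], [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
VH, EH = ['a', 'b', 'c', 'd'], [{'a', 'c'}, {'c', 'b'}, {'b', 'd'}, {'d', 'a'}]
print(is_isomorphism({1: 'a', 2: 'c', 3: 'b', 4: 'd'}, VG, EG, VH, EH))  # True
print(is_isomorphism({1: 'a', 2: 'b', 3: 'c', 4: 'd'}, VG, EG, VH, EH))  # False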
Example 5.15.
• For the graphs in Figure 5.11, the following function f : V (G) →
V (J) is an isomorphism.
v 1 2 3 4 5 6
f (v) a d f b c e
Example 5.17. The two graphs pictured in Figure 5.13 have the same
degree sequence (4 3 3 3 3 3 3 2) but they are not isomorphic, for the fol-
lowing reason. They both have only one vertex of degree 4 (marked
with a circle), and only one vertex of degree 2 (marked with a square).
By Lemma 5.16(a), if f : G → H were an isomorphism then the circle
would map to the circle, and the square would map to the square. But
in G there is a vertex adjacent to both marked vertices, while in H there
is no such vertex. Therefore, there is no isomorphism from G to H.
The point of the last condition in Definition 5.18.1 (that (W, F ) is a graph) is
that if one chooses an edge e of G to be in the subgraph H, then one must
also choose both of the vertices at the ends of e to be in H as well. For edge-
deletion, we also write G r e instead of G r {e} to simplify notation. For
vertex-deletion, we also use the notation G r S for G[V r S], and G r v for
G r {v}.
The empty graph (∅, ∅) is a subgraph of every graph. Every graph G
is a spanning subgraph of the complete graph KV (G) with the same vertex-
set. To get a feel for these concepts, consider the following questions. What
does a proper spanning subgraph look like? What does a proper induced
subgraph look like? What does an induced spanning subgraph look like?
Informally, we often say that “G has H as a subgraph” or “G contains H”
to mean that there is a subgraph of G that is isomorphic with H. This can
also be qualified with the adjectives “proper”, “spanning”, or “induced”.
For instance, the 4-cube Q4 in Figure 5.6 has C16 as a proper spanning sub-
graph. (It is not too hard to find one.) A spanning cycle in a graph is called
a Hamilton cycle .
The graph Gc has the same set of vertices as G, and vertices v and w are
adjacent in Gc if and only if v and w are not adjacent in G. This could also
be described as Gc = KV (G) r E(G).
Example 5.23.
• For r, s ≥ 1, the r-by-s grid is Pr □ Ps. The points and lines on a go
board form the grid P19 □ P19.
• The game board for nine-man morris (Figure 5.15) is a spanning
subgraph of C8 □ P3.
• The d-dimensional cube Qd of Example 5.10 is isomorphic to the
Cartesian power (K2)^d = K2 □ K2 □ · · · □ K2 of K2 with d factors.
• For r, d ∈ N, the Hamming graph is H(r, d) = (Kr)^d. Considering
the 64 squares of a chessboard as vertices, with two vertices joined
by an edge when they are one rook’s move apart – that is, either in
the same rank (row) or same file (column) – we have the Hamming
graph H(8, 2) = K8 □ K8.
• For ℓ ≥ 1, the word graph Word(ℓ) is an induced subgraph of the
Hamming graph H(26, ℓ).
That is, for a bipartition (A, B) of a graph, every vertex is in exactly one of
A or B, and every edge has one end in A and one end in B.
The converse to Proposition 5.25(d) is also true. We could prove it now, but
it will be much easier once we have developed the ideas in Chapter 7 (see
Theorem 7.10).
For some purposes we would like to allow a graph to have more than one
edge between a given pair of vertices, or to have an edge with both ends at
the same vertex, or to have a direction on an edge from one end to the other.
(Chapters 8 and ?? are two such situations.) See Figure 5.16 for a picture
of such a general graph. We will not give a formal definition that covers all
possibilities – in practice, different people use different definitions which
are chosen for their convenience in a given context.
We will say that an (undirected) multigraph is a triple G = (V, E, B) in
which V is a set of vertices, E is a set of edges, and B : V × E → {0, 1, 2} is an
incidence function. The interpretation is that B(v, e) is the number of ends of
the edge e that are incident at the vertex v. We require that every edge e ∈ E
P
has exactly two ends (which may be equal) – that is, v∈V B(v, e) = 2. A
These definitions will suffice for our purposes. They could be combined
to represent a mixed multigraph which contains both undirected edges and
directed arcs. However, the vast majority of these notes involve only the
simple graphs of Definition 5.1.
5.7 Exercises.
Exercise 5.1.
(a) Find a path from hard to easy in Word(4).
(b) Find a path from wrong to right in Word(5).
[Figure 5.17: the graphs G, H, and J.]
Exercise 5.2. Let G be a graph with at least two vertices. Show that G
has two different vertices of the same degree.
Exercise 5.3. Assume that all vertices of G = (V, E) have degree either
1 or 3. Show that if |V | = |E| then the number of vertices of degree 1
is equal to the number of vertices of degree 3.
(b) Show that if G is regular with at least one edge, then |A| = |B|.
(c) Use part (b) to give another proof of Proposition 5.25(c).
Exercise 5.6. For each pair of the graphs pictured in Figure 5.17, give
either an isomorphism between them or a reason why they are not
isomorphic to each other.
[Figure 5.18: the graphs G, H, and J.]
Exercise 5.7. For each pair of the graphs pictured in Figure 5.18, give
either an isomorphism between them or a reason why they are not
isomorphic to each other.
Exercise 5.8. Are any of the graphs in Figure 5.17 isomorphic to any
of the graphs in Figure 5.18?
Exercise 5.9. For each pair of the graphs pictured in Figure 5.19, give
either an isomorphism between them or a reason why they are not
isomorphic to each other.
Exercise 5.11.
(a) There are two 3-regular graphs with 6 vertices (up to isomor-
phism). Draw pictures of them and explain why they are not
isomorphic, and why any 6-vertex 3-regular graph is isomor-
phic to one or the other.
(b) There are several 3-regular graphs with 8 vertices (up to isomor-
phism). Draw pictures of them and explain why they are not
isomorphic, and why any 8-vertex 3-regular graph is isomor-
phic to one of the graphs on your list.
Exercise 5.17. Show that for any graphs G and H, the Cartesian prod-
uct G □ H is bipartite if and only if both G and H are bipartite.
Exercise 5.18. Let’s say that a graph with a Hamilton (spanning) cycle
is Hamiltonian. (A graph with a spanning path is weakly Hamiltonian.)
(a) Show that if G is bipartite and Hamiltonian then |V (G)| is even.
(b) For which r, s ≥ 2 is the grid Pr □ Ps Hamiltonian? Explain.
(c) Show that for n ≥ 3, the product Cn □ K2 is Hamiltonian.
(d) Deduce that for d ≥ 2, the d-cube Qd is Hamiltonian. (Hamilton
cycles in hypercubes are called “binary Gray codes”.)
Chapter 6
Walks and Connectedness.
The words “path” and “cycle” have been used before, for the graphs in Ex-
ample 5.8. In Sections 5.4 and 5.5, we broadened the use of these words
to mean any graph isomorphic to some Pn or Cn . A walk that is a path as
in Definition 6.1 is supported on a path, and a walk that is a cycle as in
Definition 6.1 is supported on a cycle.
In a multigraph, to specify a walk one must also specify the choice of
edge between successive vertices at each step: W = (v0 e1 v1 e2 v2 ... ek vk ). If
e ≠ f then (v e w f v) is a cycle of length two, and a loop (v e v) is a cycle of
length one.
Notice that for concatenation of walks, the length (number of steps) is addi-
tive: ℓ(WZ) = ℓ(W) + ℓ(Z).
Proof. Let P = (v0 v1 ... vk) and Q = (z0 z1 ... zℓ) be distinct paths in G from
v = v0 = z0 to w = vk = zℓ. Since P ≠ Q there is an index a with 0 ≤ a <
min{k, ℓ} such that v0 = z0, v1 = z1, ..., va = za, but va+1 ≠ za+1. Since vk = w
is on the path Q, there is a smallest index b such that a + 1 ≤ b ≤ k and vb is
on the path Q. Let 0 ≤ c ≤ ℓ be the index such that zc = vb. This index c is
determined uniquely because Q has no repeated vertices. If 0 ≤ c ≤ a then
vc = zc = vb would be a repeated vertex of P. But P is a path, so that this
does not happen. Therefore, a + 1 ≤ c ≤ ℓ. Also, if b = a + 1 then c ≥ a + 2,
because otherwise va+1 = vb = zc = za+1 would contradict the way the index
a was determined.
The vertices va , va+1 , ..., vb are pairwise distinct since P has no repeated
vertices, and the only ones of these vertices that are also on Q are va = za
and vb = zc . The vertices za , za+1 , ..., zc−1 , zc are pairwise distinct since Q has
no repeated vertices. It follows that the closed walk (va va+1 ... vb zc−1 zc−2 ... za) is a cycle in G.
6.2 Connectedness.
Proof. The relation of reachability has the following three properties. Let
u, v, w ∈ V be vertices.
• It is reflexive: (v) is a walk, so that v reaches v.
• It is symmetric: if (z0 z1 ... zk ) is a (v, w)-walk then (zk zk−1 ... z0 ) is a
(w, v)-walk. Thus, if v reaches w then w reaches v.
• It is transitive: if (z0 z1 ... zk) is a (u, v)-walk and (t0 t1 ... tℓ) is a (v, w)-
walk, then (z0 z1 ... zk t1 ... tℓ) is a (u, w)-walk. Thus, if u reaches v and
v reaches w, then u reaches w.
Note that the empty graph (∅, ∅) is not connected according to Defini-
tion 6.11, because it has no connected components at all. A graph is con-
nected if and only if it has exactly one connected component.
At this point we can now describe paths and cycles structurally, or intrin-
sically, using the concept of connectedness. A cycle is a connected 2-regular
graph. A path is a connected graph in which all vertices have degree at
most 2, and which is not a cycle.
Proof. First, assume that G is connected. Then G is not the empty graph. Let
v ∈ V be any vertex. For any vertex w ∈ V , v reaches w. By Theorem 6.5
there is a (v, w)-path in G. This shows that (a) implies (b). That (b) implies
(c) is clear. If (c) holds then let v ∈ V be such a vertex of G. Since V ≠ ∅,
G is nonempty. Since v reaches every vertex of G, and reachability is an
equivalence relation, G has exactly one connected component – that is, G is
connected.
In the 18th century, Königsberg was the easternmost major city of Prus-
sia. (It is now Kaliningrad, on the Baltic coast in an exclave of Russia be-
tween Poland and Lithuania.) On Sunday afternoons, the good people of
Königsberg would promenade about town, trying to walk across each of
the seven bridges exactly once each. (See Figure 6.2.) No-one was ever able
The term “Euler tour” is conventional, but “Eulerian trail” would have been
better. Our goal is to give a structural characterization of Eulerian graphs.
Vertices of degree zero are irrelevant in the context of Euler tours, so we can
safely assume that every vertex has degree at least one.
(b) If an Eulerian graph has no vertices of odd degree then every Euler
tour is a closed trail.
(c) If an Eulerian graph has two vertices v, w of odd degree then every
Euler tour is a trail with ends v and w.
Example 6.18. The graph of Königsberg has four vertices of odd degree,
so it does not have an Euler tour.
6.5 Exercises.
|V| \geq \frac{d(d-1)^k - 2}{d - 2}.
(f) When d ≥ 3 and g = 2k ≥ 4 is even, give a lower bound on |V |
similar to the bound in part (e).
(g) Give an example meeting the bound in part (f) when d = 3 and
g = 6.
Exercise 6.14. Recall the Odd graphs from Exercise 5.22. Fix d ≥ 3,
and let X = {1, 2, ..., 2d − 1}, so the vertices of Od are the (d − 1)-
element subsets of X.
(a) Let β : X → X be any bijection. Define a function f : V (Od ) →
V (Od ) by putting f (S) = {β(s) : s ∈ S} for each vertex S of Od .
Show that f is an automorphism of Od .
(b) For any two vertices v, w of Od , there is an automorphism f :
Od → Od such that f (v) = w.
(c) For any two edges v0 v1 and w0 w1 of Od , there is an automor-
phism f : Od → Od such that f (v0 ) = w0 and f (v1 ) = w1 .
(d) For any two paths v0 v1 v2 and w0 w1 w2 of length two in Od , there
is an automorphism f : Od → Od such that f (vi ) = wi for i ∈
{0, 1, 2}.
(e) For any two paths v0 v1 v2 v3 and w0 w1 w2 w3 of length three in Od ,
there is an automorphism f : Od → Od such that f (vi ) = wi for
i ∈ {0, 1, 2, 3}.
Chapter 7
Trees.
Proposition 7.3. A tree T = (V, E) with at least two vertices has at least
two leaves.
For the induction step, assume that G has at least one edge, and that the
result holds for all graphs with fewer edges than G. Let e ∈ E be any edge
of G, and consider the spanning subgraph G0 = G r e with n0 = n vertices,
m0 = m−1 edges, and c0 = c(Gre) connected components. By the induction
hypothesis, m0 ≥ n0 − c0 and equality holds if and only if G0 is a forest.
First, assume that e is a bridge in G, so that c0 = c + 1 by Corollary 6.22. It
follows that m = m0 + 1 ≥ (n0 − c0 ) + 1 = n − (c0 − 1) = n − c, proving part of
the induction step. Also, m = n − c if and only if m0 = n0 − c0 . By induction,
this occurs if and only if G0 is a forest. We claim that since e is a bridge, G is
a forest if and only if G r e is a forest. Proof of this claim is left as Exercise
7.2, and completes this part of the induction step.
Second, assume that e is not a bridge in G, so that c′ = c. It follows that
m = m′ + 1 ≥ (n′ − c′) + 1 = n − c + 1 > n − c. The claimed inequality holds,
but not with equality. Since e is not a bridge it is contained in a cycle (by
Theorem 6.20), and so G is not a forest. This completes the induction step
and the proof.
n = |V| = n_0 + n_1 + n_2 + n_3 + · · ·
and, by the Handshake Lemma, 2(n − 1) = 2|E| = n_1 + 2n_2 + 3n_3 + · · ·.
Subtracting the second equation from twice the first, this yields
2n_0 + n_1 = 2 + n_3 + 2n_4 + 3n_5 + · · ·
If n = 1 then n0 = 1 and nd = 0 for all d ≥ 1. If n ≥ 2 then n0 = 0, and
we deduce that n1 ≥ 2. This gives a second (more quantitative) proof of
Proposition 7.3.
Proof. If (i) and (ii) hold then G is a tree. Corollary 7.6 implies that (iii) holds.
If (i) and (iii) hold then the inequality of Corollary 7.6 holds with equality,
so that G is a tree and hence has no cycles.
If (ii) and (iii) hold then let G1 , G2 , ..., Gc be the connected components of
G. Let Gi have ni vertices and mi edges. Each Gi is connected and by (ii) has
no cycles, so is a tree, so that mi = ni − 1 by Corollary 7.6. Now (iii) implies
that
1 = n − m = (n1 + n2 + · · · + nc ) − (m1 + m2 + · · · + mc )
= (n1 − m1 ) + (n2 − m2 ) + · · · + (nc − mc ) = c.
Theorem 7.10. A graph is bipartite if and only if it does not contain any
odd cycles.
Proof. First, notice that a graph is bipartite if and only if every one of its
connected components is bipartite. Also, a graph contains an odd cycle if
and only if at least one of its connected components contains an odd cycle.
These observations allow us to reduce the proof of this theorem to the con-
nected case. That is, if the statement holds for all connected graphs then it
holds for all graphs. Thus, we may assume that G = (V, E) is a connected
graph.
If G contains an odd cycle then G is not bipartite, by Proposition 5.25(d).
Conversely, assume that G is not bipartite. Since G is connected, Theorem
7.9 implies that G has a spanning tree T . An easy induction shows that
trees are bipartite (this is left as Exercise 7.3). Let (A, B) be a bipartition of
T . Since G is not bipartite, this is not a bipartition of G. Thus, there is an
edge vw ∈ E with both ends in A or both ends in B. By symmetry, we may
assume that v, w ∈ A. By Proposition 7.4 there is exactly one (v, w)-path P
in T . Since both ends of P are in the set A of the bipartition (A, B) of T , the
path P has an even number of steps. Now C = (V (P ), E(P ) ∪ {vw}) is a
cycle in G with an odd number of edges, completing the proof.
The proof of Theorem 7.9 suggests the following strategy for finding a
spanning tree of a connected graph. Given a connected graph G, if G con-
tains a cycle then delete an edge of that cycle. Repeat this until what is left
has no cycles. Then what is left is a spanning tree of G. This is difficult to
turn into an algorithm because of the phrase “if G contains a cycle”. We
would need a subroutine that takes a connected graph G as input, and pro-
duces as output either a cycle in G or a certificate that G contains no cycles.
To do this, it seems to be easier just to find a spanning tree first.
Informally, Algorithm 7.11 does the following. The loop over vertices v ∈ V
defines an arbitrary bijection g : V → {1, 2, ..., n}. The values of g will be
used to label the components of the output spanning forest. The loop over
edges vw ∈ E tests whether an edge has the same label at both ends: g(v) = g(w).
If so, then that edge is skipped. But if g(v) ≠ g(w) then the edge is included
in the output, and all of the vertices labelled g(w) are re-labelled to have the
same label as v ∈ V. After all edges have been examined, the algorithm
outputs the subgraph (V, F) and the labelling g.
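Here is a small Python sketch of the procedure just described (ours, purely illustrative; the input and output conventions are not those of Algorithm 7.11 itself).

def spanning_forest(V, E):
    # Label each vertex with a distinct number, then scan the edges; an edge
    # whose ends carry different labels is kept, and one label is then
    # overwritten by the other everywhere.
    g = {v: i + 1 for i, v in enumerate(V)}   # arbitrary bijection V -> {1, ..., n}
    F = []
    for v, w in E:
        if g[v] != g[w]:
            F.append((v, w))
            old = g[w]
            for z in V:                        # re-label the component of w
                if g[z] == old:
                    g[z] = g[v]
    return F, g

V = ['a', 'b', 'c', 'd']
E = [('a', 'b'), ('b', 'c'), ('a', 'c'), ('c', 'd')]
F, g = spanning_forest(V, E)
print(F)   # [('a', 'b'), ('b', 'c'), ('c', 'd')] -- the edge ac is skipped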
Proof. To prove part (a) we only need to show that (V, F ) does not contain a
cycle. We prove this together with part (b) by showing that these statements
hold after every passage through the loop over e = vw ∈ E. Before this loop
starts we have F = ∅, and both statements (a) and (b) hold.
Now consider one iteration of the loop, and the edge e = vw ∈ E which
is being examined. At this stage the subgraph (V, F ) and function g satisfy
statements (a) and (b), by induction on the number of passages through the
loop. There are two cases. If g(v) = g(w) then the set F and function g do
not change, and so statements (a) and (b) continue to hold. If g(v) ≠ g(w)
then let F′ = F ∪ {vw} be the updated set F, and let g′ : V → {1, 2, ..., n}
be the updated function. Since (b) holds for (V, F), the vertices v and w are
in different components of (V, F). But v and w are in the same component
of (V, F′). Therefore, e = vw is a bridge of (V, F′). By Exercise 7.2, since
(V, F) is a forest it follows that (V, F′) is a forest. Because of the way g′ is
updated, and because (b) holds for (V, F) and g, it follows that for all z ∈ V,
g′(z) = g′(v) if and only if z is in the component of (V, F′) containing v. From
this the statement (b) follows for the subgraph (V, F′) and function g′. This
completes the induction step, and the proof of parts (a) and (b).
For part (c), since (V, F ) is a subgraph of G, every component of (V, F )
is contained in a component of G. To prove (c), it suffices to show that if x
reaches y in G then x reaches y in (V, F ). To do this the following observation
is useful: if at any stage of the algorithm we have g(x) = g(y), then this
continues to hold at all later stages. (This can be proved by induction on the
number of iterations of the loop over the edges.) The contrapositive form is
even more useful: if at any stage of the algorithm we have g(x) ≠ g(y), then
g(x) ≠ g(y) also held at all earlier stages.
[Figure 7.1: a graph on the fifteen vertices a, b, c, d, e, p, q, r, s, t, v, w, x, y, z.]
Now suppose that x, y ∈ V are such that x reaches y in G, but that x and y
are in different components of the output (V, F ). Since we have proved (b),
we know that g(x) ≠ g(y). Let W be an (x, y)-walk in G. Then there is an
edge e = vw on W for which g(v) ≠ g(w). Since (a) and (b) hold, it follows
that e is not an edge in F. Now consider the iteration of the loop at which the
edge e = vw is considered. By the observation in the previous paragraph,
at this stage we also have g(v) ≠ g(w). But then the algorithm would have
included e in the set F . This contradiction shows that if x reaches y in G,
then x reaches y in (V, F ), completing the proof.
Example 7.13. Table 7.1 shows Algorithm 7.11 applied to the graph pic-
tured in Figure 7.1. The columns are indexed by the vertices of the graph.
The first row indicates the initial (arbitrary) values of the function g. In
the first column, the remaining rows are indexed by the edges, in the
order they are considered by the algorithm. In the row corresponding
to an edge, the first-named vertex is marked with an asterisk (*). If the
labels of the two ends are equal, this is marked with an equal sign (=) at the other end.
E a b c d e p q r s t v w x y z F
g(·) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
ac ∗ . 1 . . . . . . . . . . . . ac
ws . . . . . . . . 12 . . ∗ . . . ws
dt . . . ∗ . . . . . 4 . . . . . dt
cr . . ∗ . . . . 1 . . . . . . . cr
vw . . . . . . . . 11 . ∗ 11 . . . vw
pb . 6 . . . ∗ . . . . . . . . . pb
ye . . . . 14 . . . . . . . . ∗ . ye
bx . ∗ . . . . . . . . . . 6 . . bx
xz . . . . . . . . . . . . ∗ . 6 xz
qw . . . . . . ∗ . 7 . 7 7 . . . qw
ar ∗ . . . . . . = . . . . . . .
er 14 . 14 . ∗ . . 14 . . . . . . . er
vq . . . . . . = . . . ∗ . . . .
bd . ∗ . 6 . . . . . 6 . . . . . bd
qs . . . . . . ∗ . = . . . . . .
ry . . . . . . . ∗ . . . . . = .
tz . . . . . . . . . ∗ . . . . =
px . . . . . ∗ . . . . . . = . .
g(·) 14 6 14 6 14 6 7 14 7 6 7 7 6 14 6
Otherwise, the numbers in that row indicate the process
of re-labelling the vertices, and the last column contains the name of the
edge, which is included in the output. The subgraph (V, F ) output from
this example is shown in Figure 7.2.
[Figure 7.2: the subgraph (V, F) produced in Example 7.13.]
Proof. We prove that T is a tree and that parts (b) and (c) hold by induction
on the number of iterations through the “while (∆ ≠ ∅)” loop. At the end
we finish the proof that (a) holds.
Before the loop begins, T = ({v∗}, ∅) is a tree, the functions pr and ℓ are
defined on the set W, and parts (b) and (c) hold. Now consider an iteration
of the loop in which the edge e = xy ∈ ∆ is considered. Let T = (W, F) and
pr and ℓ be the data before the edge e is added, and let T′ = (W′, F′) and pr′
and ℓ′ be the data after the edge e is added. By induction, T = T′ r y is a
tree. By construction, y is a vertex of degree one in T′. Exercise 6.10 (or the
Two-out-of-Three Theorem 7.8) now implies that T′ is a tree. Parts (b) and
(c) only need to be checked for the new vertex y. By induction, the unique
path in T from x to v∗ is obtained by following the steps v → pr(v) until
pr(v) = null, and the length of this path is ℓ(x). It follows that the unique
path in T′ from y to v∗ is obtained by first stepping from y to pr′(y) = x and
then following these steps, so that its length is ℓ′(y) = ℓ(x) + 1, as required.
F    W    ℓ(·)    ∂W
∅ x 0 xp, xb, xd, xz
xb b 1 xp, xd, xz, bp, bd
xz z 1 xp, xd, bp, bd, zt
bp p 2 xd, bd, zt
bd d 2 zt, dt
zt t 2 ∅
Example 7.16. Table 7.2 shows Algorithm 7.14 applied to the graph pic-
tured in Figure 7.1 with root vertex x. The first row indicates the data
initialization stage. The first column indicates the edges included in the
output set F , in the order they are chosen by the algorithm. The second
column indicates the new vertex that is adjoined to the set W at each
step, and the third column is the level of the new vertex. The fourth col-
umn is the boundary ∂W of the updated set W . The output from this
example is shown in Figure 7.3.
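Here is a short Python sketch in the spirit of Algorithm 7.14 (ours, purely illustrative; which boundary edge gets chosen is left arbitrary, as in the algorithm, and the small input graph is a made-up example).

def search_tree(V, E, v_star):
    # Grow a tree T = (W, F) from the root v_star by repeatedly choosing an
    # edge xy of the boundary (x in W, y not in W), recording the parent
    # pr(y) = x and the level l(y) = l(x) + 1.
    adj = {v: set() for v in V}
    for v, w in E:
        adj[v].add(w)
        adj[w].add(v)
    W, F = {v_star}, []
    pr, level = {v_star: None}, {v_star: 0}
    boundary = {(v_star, y) for y in adj[v_star]}
    while boundary:
        x, y = boundary.pop()                  # an arbitrary boundary edge
        F.append((x, y))
        pr[y], level[y] = x, level[x] + 1
        W.add(y)
        boundary = {(u, z) for u in W for z in adj[u] if z not in W}
    return F, pr, level

V = [0, 1, 2, 3, 4, 5]
E = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
F, pr, level = search_tree(V, E, 0)
print(len(F) == len(V) - 1)   # True: the input graph is connected, so T spans it
print(level[5])               # 4: vertex 5 is four steps from the root 0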
7.4 Exercises.
Exercise 7.1. Draw pictures of all trees with six or fewer vertices (up
to isomorphism). (Be brave and do seven!)
[Figure 7.3: the search tree produced in Example 7.16, with the level ℓ(v) written beside each vertex.]
Chapter 8
Planar Graphs.
We consider the question: which graphs can be drawn in the plane with-
out crossing edges? This is not only interesting in its own right, but also
has practical implications. For instance, in the manufacture of very large-
scale integrated circuits, electrical conductors are printed onto a polymer
substrate. These conductors are the edges of a graph, and to avoid short
circuits these edges must not cross one another. Another example is a map
in which cities, towns, and crossroads are represented by vertices, and high-
ways are represented by edges.
Several results of this chapter are stated most naturally in terms of (undi-
rected) multigraphs, and so we adopt this point of view from the beginning.
For this chapter, the word “graph” will tacitly mean an undirected multi-
graph. Where necessary, we will be clear about restricting to simple graphs.
Condition 3. says that the curves representing edges have the right end-
points. Condition 4. says that edges don’t go through vertices except at
their ends. Condition 5. says that distinct edges cannot intersect except at
their endpoints.
Here is a link to an amusing puzzle based on plane embeddings of graphs.
Example 8.2.
• The complete graphs K1 , K2 , K3 , K4 are planar, but K5 seems to be
non-planar. (This is proved in Example 8.20.)
• The complete bipartite graphs K1,r and K2,r are planar, but K3,3
seems to be non-planar. (This is proved in Example 8.22.)
If the footprint of a curve γ is the union of finitely many straight line segments, then γ is a tame curve. This will
suffice for our purposes.
For tame curves Theorem 8.3 is evident, and it was not until the early 1800s
that people realized that this statement requires proof. For non-tame (“wild”)
curves, the proof is surprisingly tricky. Details can be found in any introduc-
tory book on point-set topology. We say that a simple closed curve separates
the points in its interior from the points in its exterior.
Which graphs can be drawn in the plane? In Example 8.2 we saw that K5
and K3,3 seem to be non-planar. This is the starting point for a complete
characterization of planar graphs.
Informally, subdivision inserts a new vertex in the middle of the edge e. For
example, the graph of Figure 5.15 can be obtained by repeated subdivision
of edges starting from the graph C4 □ P3.
Proof. First, assume that (P, Γ) is a plane embedding of G, and let e = vw ∈ E
with pv = γe(0) and pw = γe(1). Let pz = γe(1/2), and define two simple
curves by γe0 (t) = γe (t/2) and γe00 (t) = γe (1 − t/2) for t ∈ [0, 1]. Together
with the points and curves from the embedding of G r e, this gives a plane
embedding of G • e. The converse implication is Exercise 8.1(b).
Proof of the easy implication. By Examples 8.20 and 8.22 (below), the graphs
K5 and K3,3 are non-planar. By Lemmas 8.4 and 8.7, if G contains a (re-
peated) subdivision of K5 or K3,3 then G is non-planar.
The other direction of the proof – that every non-planar graph contains a (re-
peated) subdivision of K5 or K3,3 – is done in CO 342. In fact, one can adapt
the proof of Kuratowski’s Theorem to obtain an algorithm which takes as
input a graph G, and produces as output either a plane embedding of G or
a Kuratowski subgraph – a subgraph of G which is a (repeated) subdivision
of K5 or K3,3 . Moreover, the running time of this algorithm is a polyno-
mial function of the size of the input graph, and it is both practically and
theoretically tractable.
Example 8.11. Figure 8.5 shows a typical face in a plane embedding, and
its boundary.
small) ε > 0, let Bε(q) be the closed disc of radius ε centered at the point
q. Let Bε(X) be the union of the sets Bε(q) for all points q ∈ fp(X) in the
footprint of X. Since all the curves in the embedding are tame, we can take
ε > 0 small enough that Bε(X) is disjoint from the footprint of everything
in G except the component X and the edge e. Let C be the boundary of
the component of the set R² r Bε(X) which contains the point py. Again
by tameness, we can take ε > 0 sufficiently small that C is a simple closed
curve which intersects γe at a single point and is otherwise disjoint from the
footprint of G. This shows that the two faces of (P, Γ) on either side of the
curve γe are the same face: that is, F1 = F2.
n − m + f = c + 1.
Lemma 8.15 is the key point at which restriction to the case of simple
graphs is important. For multigraphs with loops or multiple edges, its con-
clusion need not hold.
Lemma 8.15. If G is a simple graph with at least two edges, then every face
of every plane embedding of G has degree at least three.
Proof. It is clear that |V| = \sum_{d=1}^{\infty} n_d. The Handshake Lemma implies that

\sum_{d=1}^{\infty} d\, n_d = \sum_{v \in V} \deg(v) = 2|E|.
(In the summations over d, only finitely many terms are non-zero.) Since G
is connected with at least three vertices, Corollary 7.6 implies that G has at
least two edges. Let (P, Γ) be any plane embedding of G. By Lemma 8.15,
every face of this embedding has degree at least three. By the Faceshaking
Lemma 8.13,
2|E| = \sum_{F \in \mathcal{F}} \deg(F) \geq 3|\mathcal{F}|,    (8.1)
12 = 6|V| − 6|E| + 6|\mathcal{F}|
   ≤ 6|V| − 6|E| + 4|E| = 6|V| − 2|E|
   = 6 \sum_{d=1}^{\infty} n_d − \sum_{d=1}^{\infty} d\, n_d = \sum_{d=1}^{\infty} (6 − d)\, n_d.
Rearranging this yields part (a), and part (b) follows from the characteriza-
tion of equality in (8.1).
Corollary 8.17. Every nonempty planar simple graph has a vertex of degree
at most five.
Proof. Consider any connected component of the graph. If it has at most two
vertices then it has a vertex of degree at most one. Otherwise, Proposition
8.16 applies. Since the LHS of the inequality of part (a) is positive, the result
follows.
Example 8.18. Corollary 8.17 does not hold more generally for all planar
multigraphs. Think of two vertices joined by seven edges – it has seven
faces of degree two and two vertices of degree seven.
Proof. Let (P, Γ) be any plane embedding of G. Euler’s Formula implies that
|V | − |E| + |F| = 2. The inequality (8.1) also holds, and we use it to eliminate
|F| as follows:
We have to be just a little bit more careful to show that K3,3 is not planar,
since it does satisfy the inequality |E| = 9 ≤ 12 = 3|V | − 6.
Example 8.22. K3,3 is not planar, since it is connected and has girth 4 and
2|V | − 4 = 8 < 9 = |E|.
The Platonic solids are familiar from high school geometry and have been
known since antiquity. We can use graph theory to show that there are only
five possible examples. (Actually constructing them as regular polyhedra
in three-dimensional space is a bit of geometry beyond graph theory.)
Lemma 8.25 shows that the Faceshaking Lemma for a plane embedding is
equivalent to the Handshake Lemma for its dual graph.
UNDER CONSTRUCTION
Figure 8.12: A plane embedding (black) and its dual embedding (red).
8.5 Exercises.
Exercise 8.1.
(a) Finish the proof of Lemma 8.5.
(b) Finish the proof of Lemma 8.7.
Exercise 8.2. For each of the graphs in Figure 8.13, give either a plane
embedding or a Kuratowski subgraph.
Exercise 8.3. For each of the graphs in Figure 8.14, give either a plane
embedding or a Kuratowski subgraph.
Exercise 8.9. Show that if G is planar and 3-regular then the line-
graph L(G) of G is planar.
Exercise 8.10. Prove that if G is simple and has at least two edges
then every face of every plane embedding of G has degree at least
three (Lemma 8.15).
Exercise 8.14. Show that each of the five Platonic solids is unique up
to isomorphism.
f4 = 6 and f7 = 0.
(a) Show that for any such graph there is an a ∈ N such that
f4 = 6 + a and f7 = 2a.
(b) Give an example of such a graph with f7 = 2.
(c) Give an example of such a graph with f7 = 4.
(d)* (Not required.) Give examples of such graphs for all a ∈ N.
Chapter 9
Graph Colouring.
some k ∈ N.)
1. A (proper) X-colouring of G is a function f : V → X such that if
vw ∈ E then f (v) 6= f (w).
2. If |X| = k then such a function f is a (proper) k-colouring of G.
3. The chromatic number χ(G) of G is the smallest natural number
k ∈ N for which G has a (proper) k-colouring.
Example 9.3. In Definition 9.2, we can take X = V for the set of colours,
and then the identity function ι : V → V is a proper |V |-colouring of G.
This shows that the chromatic number exists and that χ(G) ≤ |V |. It is
easy to see that for complete graphs, χ(Kn ) = n. The converse is also
true: if χ(G) = |V (G)|, then G is a complete graph.
Example 9.4. Graphs with small chromatic number are easy to under-
stand.
• The only graph with chromatic number zero is the empty graph.
• A graph has chromatic number one if and only if it has no edges
and at least one vertex.
• A graph has chromatic number two if and only if it is bipartite and
has at least one edge.
Proposition 9.5. Let G be a graph and let dmax (G) be the maximum degree
of a vertex in G. Then χ(G) ≤ 1 + dmax (G).
Proof. Let k = 1 + dmax(G) and X = {1, 2, ..., k}. We construct a proper k-colouring f : V → X greedily: list the vertices in any order and colour them one at a time, giving each vertex a colour of X that does not already appear on one of its neighbours. Since a vertex has at most dmax(G) neighbours, at most dmax(G) colours are forbidden at each step, so at least one of the k colours is always available. The resulting f is a proper k-colouring, and therefore χ(G) ≤ 1 + dmax(G).
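The greedy procedure in this proof is easy to implement; here is a minimal Python sketch (ours, purely illustrative).

def greedy_colouring(V, E):
    # Colour the vertices one at a time, giving each vertex the smallest
    # colour not already used on one of its neighbours.  At most
    # 1 + (maximum degree) colours are ever needed.
    adj = {v: set() for v in V}
    for v, w in E:
        adj[v].add(w)
        adj[w].add(v)
    colour = {}
    for v in V:                                   # any fixed ordering of V
        used = {colour[w] for w in adj[v] if w in colour}
        colour[v] = min(c for c in range(1, len(V) + 2) if c not in used)
    return colour

# A 5-cycle needs 3 colours; greedy uses at most 1 + 2 = 3.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(greedy_colouring(V, E))   # {1: 1, 2: 2, 3: 1, 4: 2, 5: 3}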
Theorem 9.8 (The Four Colour Theorem). Every planar graph can be
properly coloured with at most four colours.
As an easy warm-up exercise, we can colour planar graphs with at most six
colours.
Proposition 9.9 (The Six Colour Theorem). Every planar graph can be
properly coloured with at most six colours.
The idea behind the proof of the Six Colour Theorem is good – perhaps
with a little more care in the induction step we can reduce the number of
colours required. In 1879, Kempe thought he had solved the Four Colour
Problem along these lines. In 1890, Heawood found a flaw in Kempe’s ar-
gument – but he also saw that it worked for five colours.
Theorem 9.10 (The Five Colour Theorem). Every planar graph can be
properly coloured with at most five colours.
Theorem 9.11 (Erdős, 1959). For all k ≥ 2 and g ≥ 3, there is a graph
with girth at least g and chromatic number at least k.
Lemma 9.13. If G has girth at least four then M(G) has girth at least four.
Proof. Examples for k = 2 and k = 3 are easily found. By Lemma 9.13 and
Proposition 9.14, the sequence
9.4 Exercises.
Exercise 9.1. Let G be a graph for which χ(G) = |V (G)|. Prove that G
is a complete graph.
Exercise 9.6. Let G be a 3-regular planar graph with its edges parti-
tioned into three perfect matchings (see Example 10.3). Show that the
dual of any plane embedding of G is 4-colourable.
Exercise 9.8.
(a) Show that the chromatic number of M(C5 ) is 4.
(b) Show that if G is k-colourable then M(G) is (k + 1)-colourable.
(c) Show that if M(G) is k-colourable then M(G) has a k-colouring
f : V (M(G)) → {1, 2, ..., k} such that f⁻¹(k) = {z}. (That is, z is
the only vertex of M(G) at which f takes the value k.)
(d) Prove Proposition 9.14.
Chapter 10
Bipartite Matching.
[Figure 10.1: a bipartite graph with work teams A, B, ..., J on one side and jobs 1, 2, ..., 10 on the other.]
Example 10.3 (The Game of Slither). The game of Slither can be played
on any graph; for concreteness, imagine a grid like P5 □ P5. There are
two players who take alternate turns – the last player to be able to move
wins. The first player chooses any edge of G to form a path with two
vertices. Next, the second player chooses an edge to extend this path (at
either end) if possible, or else loses. The chosen subgraph must remain a
path. Next, play returns to the first player, who must also either extend
the path or lose, and so it continues.
A matching M is perfect if every vertex of G is saturated. In Exercise
10.4, you are asked to show that if G has a perfect matching then the first
player has a winning strategy for Slither.
Proof. One direction follows directly from Example 10.5: if M has an aug-
menting path P then M′ = M △ E(P) is a matching in G that is strictly
bigger than M. Thus, if M is a maximum matching then there are no M-
augmenting paths.
Conversely, assume that M is not a maximum matching. There is a max-
imum matching M̃ of G, and |M| < |M̃|. Now consider the spanning sub-
graph H = (V, M ∪ M̃) of G. Since H is the union of two matchings, every
vertex has degree at most two. So every connected component of H is either
a path or a cycle. Moreover, each path component is both M-alternating and
M̃-alternating, and each cycle component has edges alternating between M
and M̃ as well. In particular, each cycle component has the same number
of edges in M as in M̃. Now, since |M̃| > |M|, there must be some com-
ponent of H with more edges in M̃ than in M. By the above, this is an
M-augmenting path.
This proves (a). For part (b), let matching M and cover C satisfy |M | = |C|.
If M′ is any other matching then, by part (a), |M′| ≤ |C| = |M|. Thus, M is
a maximum matching. The proof that C is a minimum cover is similar.
Example 10.9. Notice that for an odd cycle C2k+1 , a maximum matching
has k edges and a minimum cover has k + 1 vertices. So there can be
a gap between the two sides in the inequality of Proposition 10.8. Tak-
ing a disjoint union of many odd cycles, one sees that this gap can be
arbitrarily large. It is interesting to try to find connected examples with
arbitrarily large “gap” min |C| − max |M |. (See Exercise 10.6.)
The key step is to examine the relations between matchings and alternating
paths.
Proof. For part (a), let P = (v0 v1 ... vk ) be an M -alternating path with v0 ∈ X0
and vk = y ∈ Y . If y is M -unsaturated then P is an M -augmenting path.
For part (b), suppose that e = xb ∈ E is an edge with x ∈ X and b ∈
B r Y . Then there is an M -alternating path (v0 v1 ... vk ) with v0 ∈ X0 and
vk = x ∈ A. Since v0 ∈ A and vk ∈ A and G is bipartite, it follows that k
is even. Now, since v0 is M -unsaturated and the path is M -alternating and
k is even, it follows that vk−1 vk ∈ M is a matching edge. Thus, vk b 6∈ M is
not a matching edge. But now (v0 v1 ... vk b) is an M -alternating path, so that
b ∈ Y , a contradiction.
dmax edges, so that C is incident with at most dmax |C| edges. Since C is a
cover, we conclude that m = |E(G)| ≤ dmax |C| = dmax |M |, from which
the result follows.
Informally, Algorithm 10.14 does the following. The first five lines ini-
tialize the data. The “do while” loop starting on line six grows the sets X
and Y in stages, starting from the initial set of M -unsaturated vertices in
A. At each stage, the set U is the set of vertices in X which have just been
added, and U 0 is the set of vertices which will be added to X in the next
stage. When U 0 = ∅, no new vertices will be added to X, and the loop ends.
Inside the loop, if the break command is executed then the algorithm jumps
out of the loop immediately. After this loop is the output section: the value
of flag indicates whether the output is a cover or an M -augmenting path.
(The last “do while” loop produces the M -augmenting path from the vertex
y and the parent function.)
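Here is a compact Python sketch of one pass of this procedure (ours, purely illustrative; the data conventions and return values are not those of Algorithm 10.14 itself). Starting from the M-unsaturated vertices of A it grows X and Y along alternating paths; if an M-unsaturated vertex of B is reached it returns an augmenting path, and otherwise it returns the cover (A \ X) ∪ Y, which has the same size as M.

def xy_step(A, B, E, M):
    Aset = set(A)
    adj = {a: set() for a in A}
    for u, v in E:
        a, b = (u, v) if u in Aset else (v, u)
        adj[a].add(b)
    match = {}                                    # endpoint -> partner under M
    for u, v in M:
        match[u], match[v] = v, u
    parent = {}
    X = {a for a in A if a not in match}          # the M-unsaturated vertices of A
    Y, frontier = set(), set(X)
    while frontier:
        new = set()
        for x in frontier:
            for b in adj[x]:
                if b in Y:
                    continue
                Y.add(b)
                parent[b] = x
                if b not in match:                # unsaturated: augmenting path found
                    path = [b]
                    while path[-1] in parent:
                        path.append(parent[path[-1]])
                    return "augmenting path", list(reversed(path))
                parent[match[b]] = b
                new.add(match[b])
        X |= new
        frontier = new
    return "cover", (Aset - X) | Y

A, B = ['a', 'b'], [1, 2]
E = [('a', 1), ('a', 2), ('b', 1)]
print(xy_step(A, B, E, [('a', 1)]))   # ('augmenting path', ['b', 1, 'a', 2])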
U    Y    U′
b, h 2 → b, 4 → b, 5 → h, 8 → h a → 2, d → 4, e → 5, g → 8
a, d, e, g 1 → a, 3 → a break c→1
Proposition 10.17. Let G = (V, E) be a graph with bipartition (A, B). Let
(M, Q) be the output when Algorithm 10.16 is applied to the input G. Then
M is a maximum matching and Q is a minimum cover of G.
Shawn is especially interested in the cases when all the work teams can be
assigned jobs. For instance, after a close look at the graph in Figure 10.1
one sees that the five teams {B , D , E , G , H} are among themselves only
qualified to do the four jobs {2, 4, 5, 8}. So there is no way to get all the
teams working on jobs they are qualified for. Hall’s Theorem 10.18 states
that obvious “bottlenecks” like this are the only problems which could arise.
Let G = (V, E) be a graph. For S ⊆ V, let the neighbourhood of S be
N(S) = {w ∈ V : vw ∈ E for some v ∈ S}.
Proof. Let G have bipartition (A, B). The case k = 0 is trivial, and the case
k = 1 is when G is itself a matching. We continue by induction on k. By
Exercise 5.5, |A| = |B|, so that a matching in G is perfect if and only if it is
A-saturating. For the induction step, we first find an A-saturating matching
in G. To do this we verify Hall’s Condition. Consider any subset S ⊆ A.
Recall the boundary ∂S of S, and notice that ∂S ⊆ ∂N(S). Now k|S| = |∂S| ≤ |∂N(S)| = k|N(S)|, so that |N(S)| ≥ |S|, and Hall’s Condition holds.
Example 10.20. The Petersen graph has exactly six perfect matchings.
Any two perfect matchings of the Petersen graph have exactly one edge
in common. In particular, it is not possible to partition the edges of the
Petersen graph into three pairwise disjoint perfect matchings.
10.4 Exercises.
Exercise 10.1.
(a) Show that a tree has at most one perfect matching.
(b) For which values of r, s ∈ N does the grid Pr □ Ps have a perfect
matching?
Exercise 10.2.
(a) For each n ∈ N, how many perfect matchings are there in Kn ?
(b) For each r, s ∈ N, how many perfect matchings are there in Kr,s ?
Exercise 10.3.
(a) Let G be a graph with a bipartition (A, B) and a Hamilton cycle.
Let a ∈ A and b ∈ B. Show that Gr{a, b} has a perfect matching.
(b) Show that if two squares of opposite colours are removed from
a (standard 8-by-8) chessboard, then the remaining squares can
be covered using 31 dominoes.
Exercise 10.4. In the game of Slither (Example 10.3), show that if the
graph G has a perfect matching then the first player can always win.
Exercise 10.7. Let G be a graph with bipartition (A, B), and let C and
C′ be covers in G.
(a) Show that Ĉ = ((C ∩ C′) ∩ A) ∪ ((C ∪ C′) ∩ B) is also a cover.
(b) Show that if C and C′ are minimum covers in G then Ĉ is also a
minimum cover.
Exercise 10.8. For each part, find a maximum matching and a mini-
mum cover in the pictured graph, by repeated application of the XY-
algorithm beginning with the indicated matching.
(a) Figure 10.3.
(b) Figure 10.4.
(c) Figure 10.5.
Exercise 10.11. Let G be a graph with bipartition (A, B). Assume that
there are no vertices of degree zero in A. Also assume that for every
edge ab ∈ E, with a ∈ A and b ∈ B, we have deg(a) ≥ deg(b). Show
that G has an A-saturating matching.
and
(a) Let d be a positive integer such that d divides the degree of every
vertex of G. Prove that one can obtain a d-regular graph starting
from G by repeatedly splitting vertices.
(b) A graph G = (V, E) with bipartition (A, B) is (k, `)-biregular
when every vertex in A has degree k and every vertex in B has
degree `. Let m, a, b ≥ 1 be integers, and let G be a (ma, mb)-
biregular bipartite graph. Prove that G contains an (a, b)-biregular
spanning subgraph.