Introduction To Higher Mathematics - Combinatorics and Graph - Melody Chan
Melody Chan
1 Combinatorics
1.1 The pigeonhole principle
1.2 Putting things in order
1.3 Bijections
1.4 Subsets
1.5 The principle of inclusion-exclusion
1.6 The Erdős-Ko-Rado theorem
Exercises
2 Graph theory
2.1 Graphs
2.2 Trees
2.3 Graph coloring
2.4 Ramsey theory
2.5 The probabilistic method
Exercises
Combinatorics
Theorem 1.1. (Pigeonhole principle) Let b and n be positive integers with b > n. If
we place b balls into n boxes, then some box must contain at least two balls.
If that seems obvious, good. It is worth pausing for a moment and asking our-
selves how we would prove such a statement: i.e. convince a doubtful person beyond
any doubt whatsoever using a reasoned argument. How would you do that?
Proof. This proof is an exercise. Try to imitate the proof of Theorem 1.1.
This silly-sounding principle is actually quite useful. We just need to free our
minds from thinking literally about boxes and balls (or pigeons and pigeonholes, as I
prefer) and more generally about placing some finite set of things into a fixed number
of categories. Here are some examples.
By an integer we mean a whole number . . . , −2, −1, 0, 1, 2, . . ..
Example 1.3. No matter how you choose 6 positive integers, two of them will differ
by a multiple of 5.
Proof. We observe that two numbers differ by a multiple of 5 precisely when their
remainders upon division by 5 are the same. There are only 5 possible remainders
0, 1, 2, 3, 4
so the Pigeonhole Principle implies that given any six numbers, some pair of them
have the same remainder.
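The argument in Example 1.3 can be turned into a short search. The following is only a sketch: the function name pair_differing_by_multiple and the sample numbers are made up for illustration. The pigeonholes are the remainders, and the first time a remainder repeats we have found our pair.

```python
def pair_differing_by_multiple(numbers, m=5):
    """Return two of the given numbers whose difference is a multiple of m, or None."""
    seen = {}  # remainder -> first number seen with that remainder (the pigeonholes)
    for x in numbers:
        r = x % m
        if r in seen:
            return (seen[r], x)
        seen[r] = x
    return None

# Six positive integers, so some remainder mod 5 must repeat.
pair = pair_differing_by_multiple([3, 14, 22, 40, 51, 68])
print(pair)  # (3, 68), and 68 - 3 = 65 is a multiple of 5
```

With only five numbers the search may come back empty, e.g. for 1, 2, 3, 4, 5, which is why six are needed.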
Example 1.4. Given any five points in a square of side length 1, some two of them
are at distance < 0.75 of each other.
You might enjoy playing with this example. For example, if you are allowed to
choose only four points instead of five, you can put them at the four corners of the
square. But if you have to choose five points, we’re claiming that two of them will
be within < 0.75 of each other no matter what. Try it!
How can we prove this using the Pigeonhole Principle? What are the pigeons?
What are the pigeonholes? What we would really like is to be able to identify four
regions of the square that cover it, such that each region is relatively small: any two
points in a given region have distance < 0.75.
Proof. Divide the square into four squares of side length 1/2. Given five points, the
Pigeonhole Principle implies that two of the points lie within the same small square, and
hence are at distance at most √2/2 ≈ 0.707 from each other (this number is the
length of the diagonal of the small square).
How many ways are there to put n distinct symbols in any order?
Let’s try writing down all the ways to order 1, . . . , n for small values of n. Writ-
ing down small examples is often a good strategy to get started.
n = 1: 1
n = 2: 12, 21
n = 3: 123, 132, 213, 231, 312, 321
n = 4: 1234, 1243, 1324, 1342, 1423, 1432,
2134, 2143, 2314, 2341, 2413, 2431,
3124, 3142, 3214, 3241, 3412, 3421,
4123, 4132, 4213, 4231, 4312, 4321
We get 1, 2, 6, 24, . . . These are the factorials: for a positive integer n, we write n! for the product
n · (n − 1) · · · 3 · 2 · 1.
Let us notice that n · (n − 1)! = n!. (Incidentally, for this reason, we define
0! = 1.) Our evidence suggests:
Proposition 1.5. Let n be a positive integer. There are n! ways to put n distinct
symbols in order.
Informally, we might argue: There are n different ways to pick the first symbol.
Having done that, n − 1 unused symbols remain, so there are n − 1 ways to pick the
second symbol. Proceeding in this way, we conclude that there are n(n−1) · · · 1 = n!
ways to put all n symbols in order.
That’s pretty good, although for my taste, it’s not quite a proof. What does “pro-
ceeding in this way” really mean? How did we know to multiply all the numbers 1
through n, instead of, say, adding them, or doing something even weirder with them?
Here is an illustration of the inductive step of the proof. Suppose we already know
that there are 6 ways to put 3 symbols in order, and we wish to deduce that there are
24 ways to put symbols 1, 2, 3, 4 in order. We divide the orderings according to what
the first symbol is. There are 4 groups, and each group has size 6.
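The grouping in this illustration is easy to reproduce by machine. Here is a small sketch (the variable names are ours, not part of the text) that lists all orderings of 1, 2, 3, 4 and sorts them into groups by first symbol.

```python
from collections import defaultdict
from itertools import permutations

# Group the orderings of 1, 2, 3, 4 according to their first symbol.
groups = defaultdict(list)
for p in permutations([1, 2, 3, 4]):
    groups[p[0]].append(p)

group_sizes = {first: len(ps) for first, ps in groups.items()}
print(group_sizes)                # {1: 6, 2: 6, 3: 6, 4: 6}
print(sum(group_sizes.values()))  # 24
```

Each group has size 6 because, once the first symbol is fixed, what remains is an ordering of the other 3 symbols.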
Let’s generalize:
Proposition 1.6. (Multiplication rule) Suppose $m_1, \dots, m_n$ are positive integers,
for some n ≥ 1. Suppose we have a collection C of strings of symbols of length n,
such that
• there are $m_1$ symbols appearing as the first symbol of a string in C, and,
• for each i = 2, . . . , n, any initial substring of i − 1 symbols appearing in C may
be extended in exactly $m_i$ ways to an initial substring of i symbols appearing
in C.
Then
$|C| = m_1 \cdot m_2 \cdots m_n.$
Proof. Exercise. Prove it by induction on n, imitating the proof of Proposition 1.5.
That sounds more complicated than it should. It’s basically saying that if you
need to make a sequence of n choices, and the number of choices you have at each
juncture doesn’t depend on past actions, then all in all you may multiply the number
of choices you have at each juncture. (Think of the strings of symbols as all of the
possible written records of the choices you made.)
Example 1.7. A binary string is a sequence of 0s and 1s. How many binary strings
of length k are there?
Solution. By the multiplication rule, it’s
$\underbrace{2 \cdot 2 \cdots 2}_{k} = 2^k.$
Example 1.8. Suppose there are 100 students in a class. How many ways are there
to choose a president, a vice-president, and a treasurer? No student may hold more
than one position.
Solution. There are 100 choices of president. Having chosen a president, there are
99 choices of vice-president. Having made those choices, there are 98 choices of
treasurer. There are 100 · 99 · 98 choices in all. Notice, by the way, that
$100 \cdot 99 \cdot 98 = \frac{100!}{97!},$
using a whole lot of cancellation.
Generalizing:
$\underbrace{n \cdot (n-1) \cdots (n-k+1)}_{k} = \frac{n!}{(n-k)!}$
Proof. This is a special case of Proposition 1.6. This number is sometimes denoted
P (n, k), and referred to as the number of permutations of n objects taken k at a
time.
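As a sanity check on the formula for P(n, k), here is a minimal sketch. The helper P below is our own name for the formula; the comparison uses a small case (n = 6, k = 3) where we can afford to list every ordered choice.

```python
from itertools import permutations
from math import factorial

def P(n, k):
    """Number of ways to choose k of n objects and put them in order: n!/(n-k)!."""
    return factorial(n) // factorial(n - k)

# Example 1.8: president, vice-president, treasurer from 100 students.
officers = P(100, 3)
print(officers)  # 970200, which is 100 * 99 * 98

# Brute force for a small case: ordered choices of 3 objects out of 6.
count = sum(1 for _ in permutations(range(6), 3))
print(count, P(6, 3))  # 120 120
```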
1.3 Bijections
This is as good a time as any to mention that bijections between finite sets are a
combinatorialist’s favorite tool for counting.
Definition.
Now:
If X and Y are finite sets in bijective correspondence then they have the
same number of elements.
(“Bijective correspondence” simply means that there exists a bijection from one to
the other.) The claim is obvious, right? Then we should be able to prove it...
Proof. Let |X| = n. Well, what does that actually mean? Presumably, it means that
there is a bijection between X and the set {1, . . . , n}. If so, then by composing that
bijection with a bijection between X and Y , we obtain a bijection between Y and
{1, . . . , n}, so |Y | = n also.
Think of the set {1, . . . , n} as the n-element set on the shelf at the National Bureau
of Standards.
The point is that one way to count the elements in a set Y is simply to establish
a bijection between Y and some set X whose cardinality you already know. This
comes up constantly, and is extremely satisfying! In fact, one could argue that this is
the best, most explicit way to count things; the various other ways we discuss are
but tricks we use when unable to produce explicit bijections.
We will see tons of examples, starting straightaway in the next section.
1.4 Subsets
Definition. For n ≥ 0 an integer, we denote the set {1, . . . , n} by [n].
Solution. We claim the answer is $2^n$. It’s good enough to count, by the bijective
correspondence principle, subsets of your favorite n-element set. Mine is [n] =
{1, . . . , n}.
Note that there is a natural bijection between the subsets of [n] and the length n
binary strings. Namely, given A ⊆ [n], let $e_A$ be the binary string whose ith digit is
1 if i ∈ A and 0 if i ∉ A. (I think of $e_A$ as a “membership-recording string.”) Argue
on your own that this is a bijection. Finally, we are done by Example 1.7, because
we already counted the binary strings of length n: there are $2^n$ of them.
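The membership-recording string is simple to compute, and so is its inverse. The sketch below uses function names of our own choosing (membership_string, subset_from_string) and checks, for n = 3, that the two maps undo each other and that the 8 subsets give 8 different strings.

```python
from itertools import combinations

def membership_string(A, n):
    """Binary string of length n whose i-th digit is 1 exactly when i is in A."""
    return "".join("1" if i in A else "0" for i in range(1, n + 1))

def subset_from_string(s):
    """Inverse map: the set of positions (counted from 1) holding the digit 1."""
    return {i for i, digit in enumerate(s, start=1) if digit == "1"}

n = 3
subsets = [set(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]
strings = [membership_string(A, n) for A in subsets]
print(strings)
print(membership_string({1, 3}, 4))  # 1010
```

Having an explicit inverse is one way to "argue on your own" that the map is a bijection.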
Definition. Let X be any set. The power set of X is the set of all subsets of X; it is
denoted P(X).
k-element subsets of an n-element set. Such expressions are called binomial coeffi-
cients; we will see why in a bit.
Proposition 1.11. We have
$\binom{n}{k} \cdot k! = P(n, k).$
(Recall that P(n, k) denotes the number of ways to choose k elements from an n-
element set and order them.) Therefore,
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$ (1.1)
Proof. This proof is an exercise for you, imitating the argument above.
You can also write C(n, k) instead of $\binom{n}{k}$, for the number of combinations of n things
$\binom{1}{0}$ $\binom{1}{1}$
$\binom{2}{0}$ $\binom{2}{1}$ $\binom{2}{2}$
This is called Pascal’s triangle; it was known to the ancients. A portion of it is shown
in Figure 1.1.
Looking at Figure 1.1, let us make some observations. These are just guesses for the
moment:
Conjecture 1.13.
1. Each number (besides all the 1s) is the sum of the two numbers directly above
it.
2. The triangle is left-right symmetric.
3. Every number down the middle is even.
4. Every row is unimodal, meaning it first increases and then decreases.
5. If n is prime, then each $\binom{n}{k}$, other than $\binom{n}{0}$ and $\binom{n}{n}$, is a multiple of n.
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
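Before trying to prove the guesses in Conjecture 1.13, one can at least test them on the rows shown. The sketch below does that for rows 0 through 9; it is evidence, not a proof. The precise formulations are our own assumptions (in particular, "down the middle" is read as the entries $\binom{2m}{m}$ with m ≥ 1, which excludes the 1 at the very top), and math.comb requires Python 3.8 or later.

```python
from math import comb

def pascal_row(n):
    """Row n of Pascal's triangle: the binomial coefficients C(n, 0), ..., C(n, n)."""
    return [comb(n, k) for k in range(n + 1)]

def is_unimodal(row):
    """True if the row weakly increases up to its maximum and weakly decreases after."""
    peak = row.index(max(row))
    rising = all(row[i] <= row[i + 1] for i in range(peak))
    falling = all(row[i] >= row[i + 1] for i in range(peak, len(row) - 1))
    return rising and falling

rows = [pascal_row(n) for n in range(10)]

# 1. Each interior entry is the sum of the two entries directly above it.
rule1 = all(rows[n][k] == rows[n - 1][k - 1] + rows[n - 1][k]
            for n in range(2, 10) for k in range(1, n))
# 2. Left-right symmetry.
rule2 = all(row == row[::-1] for row in rows)
# 3. Middle entries C(2m, m) are even for m >= 1.
rule3 = all(comb(2 * m, m) % 2 == 0 for m in range(1, 5))
# 4. Unimodality of every row.
rule4 = all(is_unimodal(row) for row in rows)
# 5. For prime n, the entries strictly inside row n are multiples of n.
rule5 = all(comb(p, k) % p == 0 for p in (2, 3, 5, 7) for k in range(1, p))

print(rule1, rule2, rule3, rule4, rule5)  # True True True True True
```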
What we mean by “indistinguishable” is that we care only about how many balls
go in each box; the balls themselves look all the same.
Solution. Convince yourself that you may count instead sequences consisting of k
symbols ∗ and n − 1 symbols |. The bars represent box-separators. For example, the
sequence
∗ ∗ | | ∗ ∗∗
means that box 1 gets 2 balls, box 2 gets no balls, and box 3 gets 3 balls.
The number of such sequences is $\binom{n-1+k}{k}$.
Example 1.15. Same question as above, but now we require that every box gets at
least one ball.
Solution. Now we line up the k ∗s and then we choose n − 1 of the k − 1 empty
spaces between the ∗s to place our n − 1 bars. (Here, we assume n ≤ k, otherwise
the answer is definitely 0.) Convince yourself that these sequences are in bijection
with the things we are actually trying to count. So the answer is $\binom{k-1}{n-1}$.
Alternatively, by a bijection obtained by removing one ball from each box, we
may count the number of ways to put k − n balls in n boxes. So we have reduced to
the previous question. The answer is $\binom{k-1}{n-1}$ again.
This method is sometimes referred to as the “stars and bars” method, though at
least one mathematician from the Midwest (I wish I could remember who told me
this) has suggested that it should be called the “cows and fences” method.
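Both stars-and-bars counts can be checked by brute force for small numbers. In the sketch below, count_fillings and boxes_from_sequence are names invented for this illustration, and ASCII `*` and `|` stand in for the stars and bars.

```python
from itertools import product
from math import comb

def boxes_from_sequence(seq):
    """Read a stars-and-bars string such as '**||***' as a list of ball counts per box."""
    return [len(part) for part in seq.split("|")]

def count_fillings(k, n, at_least_one=False):
    """Brute-force count of ways to put k indistinguishable balls into n labeled boxes."""
    low = 1 if at_least_one else 0
    return sum(1 for counts in product(range(low, k + 1), repeat=n)
               if sum(counts) == k)

print(boxes_from_sequence("**||***"))                # [2, 0, 3]
print(count_fillings(5, 3), comb(3 - 1 + 5, 5))      # 21 21
print(count_fillings(5, 3, True), comb(5 - 1, 3 - 1))  # 6 6
```

The brute force tries every tuple of box counts, so it is only practical for small k and n; the point is simply to see the two formulas agree with a direct count.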
Finally we come to the binomial theorem, explaining the name binomial coeffi-
cients. This theorem explains the following pattern:
$(x + y)^0 = 1$
$(x + y)^1 = 1x + 1y$
$(x + y)^2 = 1x^2 + 2xy + 1y^2$
$(x + y)^3 = 1x^3 + 3x^2 y + 3xy^2 + 1y^3$
$(x + y)^4 = 1x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + 1y^4$
$\vdots$
(x + y)(x + y)(x + y) = xxx + xxy + xyx + xyy + yxx + yxy + yyx + yyy.
In such an expansion, there are $2^n$ terms (why?). Now collecting them together using
the commutative law, we see that the term $x^k y^{n-k}$ appears a total of $\binom{n}{k}$ times: this
Proof. As in our proof sketch of Theorem 1.19, we verify that for any $x \in X_1 \cup \cdots \cup X_n$,
x is counted exactly once in total in the expression on the right hand side.
Let
$J = \{i \in [n] : x \in X_i\}.$
For example, if x is in $X_1$ and $X_3$ and no other sets, then J = {1, 3}. Then on the
right hand side, terms involving x (i.e., the ones referring to sets containing x) are
the ones of the form $\bigcap_{i \in I} X_i$ for I ⊆ J. These are counted with sign +1 if |I| is odd
and −1 if |I| is even.
So it boils down to verifying that
Example 1.21. (The problem of derangements). There are n people sitting on a com-
pletely full airplane. The airline decides to wreak maximum havoc by reassigning
seats in such a way that no person remains in the same seat. How many ways are
there to do this? (Such a reassignment is called a derangement.)
is the number of seat reassignments in which the people numbered J stay in their
seat; there are (n − j)! ways to do this. Moreover, there are $\binom{n}{j}$ subsets of size j.
Applying PIE, grouping together the subsets of [n] by size, gives that
$|X_1 \cup \cdots \cup X_n| = \sum_{j=1}^{n} (-1)^{j-1} \binom{n}{j} (n-j)!.$
So the number we want is n! minus the expression above, which may also be
written
$n! \sum_{j=0}^{n} \frac{(-1)^j}{j!},$
as you may check. (So, by the way, this says that the fraction of all permutations of
[n] that are derangements, as n → ∞, actually approaches 1/e.)
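For small n we can compare the inclusion-exclusion count with a direct enumeration. This is a minimal sketch with function names of our own (derangements_brute, derangements_pie); seats and people are numbered from 0 for convenience.

```python
from itertools import permutations
from math import comb, e, factorial

def derangements_brute(n):
    """Count seat reassignments of n people in which nobody keeps their seat."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

def derangements_pie(n):
    """n! minus the PIE count of reassignments in which somebody keeps their seat."""
    somebody_stays = sum((-1) ** (j - 1) * comb(n, j) * factorial(n - j)
                         for j in range(1, n + 1))
    return factorial(n) - somebody_stays

for n in range(1, 8):
    print(n, derangements_brute(n), derangements_pie(n))

# The fraction of derangements is already close to 1/e for n = 7.
print(derangements_pie(7) / factorial(7), 1 / e)
```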
Example 1.22. (Counting surjective functions). Let n and k be positive integers. How
many surjective functions [n] → [k] are there?
Solution. This example is similar to Example 1.21. First, the multiplication rule says
that there are $k^n$ functions [n] → [k]. Now for each i ∈ {1, . . . , k}, let $X_i$ denote
the set of functions f : [n] → [k] such that i is not in the image of f . In other words,
there is no x ∈ [n] such that f (x) = i. The answer we seek, then, is
$k^n - |X_1 \cup \cdots \cup X_k|.$
Now let’s use PIE. We note that if J ⊆ [k] has size j, then
$\left| \bigcap_{i \in J} X_i \right| = (k - j)^n.$
So PIE says
$k^n - |X_1 \cup \cdots \cup X_k| = k^n - \sum_{\emptyset \ne J \subseteq [k]} (-1)^{|J|-1} \left| \bigcap_{i \in J} X_i \right|$
$= k^n - \sum_{j=1}^{k} \sum_{J \in \binom{[k]}{j}} (-1)^{j-1} (k - j)^n$
$= k^n - \sum_{j=1}^{k} (-1)^{j-1} \binom{k}{j} (k - j)^n$
$= \sum_{j=0}^{k} (-1)^{j} \binom{k}{j} (k - j)^n.$
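The final formula is easy to test against a direct count when n and k are small. In this sketch (function names are ours), a function [n] → [k] is represented as a tuple of n values, each between 0 and k − 1, and it is surjective exactly when all k values occur.

```python
from itertools import product
from math import comb

def surjections_brute(n, k):
    """Count functions [n] -> [k] whose image is all of [k], by listing every function."""
    return sum(1 for f in product(range(k), repeat=n) if len(set(f)) == k)

def surjections_pie(n, k):
    """The inclusion-exclusion formula: sum over j of (-1)^j C(k, j) (k - j)^n."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))

print(surjections_brute(6, 3), surjections_pie(6, 3))  # 540 540
print(surjections_pie(2, 3))  # 0, since 2 inputs cannot hit 3 values
```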
By the way, there is a closely related quantity that people sometimes refer to as
the Stirling numbers, or the Stirling numbers of the second kind. (You can look up the
Stirling numbers of the first kind yourself; we are not covering them in this class.)
the Stirling numbers of the second kind. So, if we wish, we may conclude that the
number of surjective functions [n] → [k] is k! · S(n, k).
In fact, we may interpret S(n, k) as counting a closely related structure:
Proposition 1.23. S(n, k) is the number of ways to divide n objects into exactly k
groups.
Proof. Exercise, by considering a forgetful function
{surjective functions [n] → [k]} −→ {divisions of 1, . . . , n into k groups.}
Here is one construction of an intersecting set system: Fix any element i ∈ [n],
and now let
$\mathcal{F} = \left\{ S \in \binom{[n]}{r} : i \in S \right\}.$
Note that every member of F contains i, so F is an intersecting family. Furthermore,
$|\mathcal{F}| = \binom{n-1}{r-1}.$
Do you agree?
The following theorem says that the family we found above is as big as it gets.
This theorem was proved by Erdős-Ko-Rado, but the proof we give here is a later
proof, due to Katona in 1971.
1→2→3→4→1
1→2→4→3→1
1→3→2→4→1
1→3→4→2→1
1→4→2→3→1
1→4→3→2→1
So there are (n − 1)! cyclic orderings of [n].
Now, let us say that a set is an interval of a cyclic order if its elements appear
consecutively in the cyclic order. For example, the set {3, 4} is an interval of the first,
second, fourth, and
sixth cyclic orders above.
Let $\mathcal{F} \subseteq \binom{[n]}{r}$ be an intersecting family. Let us double-count the elements of the
following set:
so
$|\mathcal{F}| \le \frac{(n-1)! \cdot r}{r!\,(n-r)!} = \frac{(n-1)!}{(r-1)!\,(n-r)!} = \binom{n-1}{r-1},$
which is exactly what we wanted to show.
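For very small n and r, the bound can be confirmed by exhaustive search over all families of r-element subsets. The following is a rough sketch, not part of the proof: max_intersecting_family is a name made up here, the search is exponential in the number of r-subsets, and we only try cases with n ≥ 2r, the range in which the Erdős-Ko-Rado bound applies.

```python
from itertools import combinations
from math import comb

def max_intersecting_family(n, r):
    """Size of the largest family of r-subsets of [n] in which every two members meet."""
    sets = [frozenset(c) for c in combinations(range(1, n + 1), r)]
    best = 0
    for mask in range(1 << len(sets)):  # every family, encoded as a bitmask
        family = [s for i, s in enumerate(sets) if (mask >> i) & 1]
        if len(family) > best and all(a & b for a, b in combinations(family, 2)):
            best = len(family)
    return best

print(max_intersecting_family(4, 2), comb(3, 1))  # 3 3
print(max_intersecting_family(5, 2), comb(4, 1))  # 4 4
```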
Exercises
Section 1.1. The Pigeonhole Principle
1.1. Suppose there are four rows of seats in a classroom, each row containing six seats. How
many students are necessary to guarantee that no matter how they seat themselves, some row
will be full?
Note that if you think the answer is N , then you need to argue both that N students force
a row to be full, but fewer than N students may seat themselves such that no row is full.
1.2. Prove that no matter how you choose five points in an equilateral triangle of side length 1,
some pair of them will be at distance at most 0.5 from each other.
1.3. Let us say that two positive integers have a common factor if they are both multiples
of some integer > 1. For example, of the three numbers 6, 10, and 15, each pair has a common
factor.
(a) Can you find 50 numbers between 1, . . . , 100 such that every pair of them has a common
factor?
(b) Prove that for any 51 numbers between 1, . . . , 100, some pair will have no common
factor.
1.4. Suppose we have 12 dots arranged in a 2 × 6 square grid, as shown. Prove that no matter
how you choose seven of these dots, some three of them are the vertices of an isosceles triangle.
(Let us agree that three dots lying on a line do not form a triangle at all.)
1.5. Explore: Suppose now that we have a 3 × 3 square grid of dots. What is the smallest
number N such that no matter how you choose N of the dots, some three of them form an
isosceles triangle?
What about a 4 × 4 square grid of dots? (I don’t know the answer.)
1.6. There are 100 students in a class, and we wish to choose a president, vice-president, and
treasurer. The only problem is that each student has a nemesis in the class, i.e., the class is
comprised of 50 pairs of nemeses, who can’t stand each other.
How many ways are there to choose a president, vice-president, and treasurer, so that no
two nemeses are chosen?
1.7.
(a) How many 4-digit numbers are there that are not a multiple of 10?
(b) How many 4-digit numbers are there whose digits sum to an even number?
1.11. Let $a_1, \dots, a_n$ be any positive numbers. Consider all the possible ways of writing a +
or − sign before each $a_i$. Prove that at most $2^{n-1}$ of these expressions produce a positive sum.
For example, take 1, 3, and 4. Then
+1 + 3 + 4 +1−3+4 −1+3+4
1.12. Let n be a positive integer. Let an be the number of ways to write n as a sum of odd
positive integers. Let bn be the number of ways to write n as a sum of distinct positive integers.
In each case, order does not matter. For example, if n = 7, then we have
1 + 1 + 1 + 1 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 3, 1 + 3 + 3, 1 + 1 + 5, 7
and
1 + 2 + 4, 3 + 4, 2 + 5, 1 + 6, 7
so an = bn = 5.
Prove that an = bn for all n.
1.13. Prove the statements (2), (3), (4), and (5) of Conjecture 1.13. You may need to formulate
more rigorous versions of them first.
1.14. Let n ≥ 0 be an integer. Let $A_n$ ⊆ P([n]) be the set of subsets of [n] that do not contain
any consecutive pairs of numbers. For example,
1.15. Let n ≥ 0 be an integer. How many nested pairs of subsets of [n] are there? In other
words, compute the size of the set
(B1) How many of them are surjective and take the value 1 exactly four times?
(B2) How many of them are surjective and nondecreasing?
(B3) How many of them take the value 1 exactly four times and are nondecreasing?
(B4) How many of them are surjective and nondecreasing and take the value 1 exactly four
times?
(C1) How many of them are surjective or take the value 1 exactly four times?
(C2) How many of them are surjective or nondecreasing?
(C3) How many of them take the value 1 exactly four times or are nondecreasing?
(C4) How many of them are surjective or nondecreasing or take the value 1 exactly four
times?
You may wish to use the Venn diagram in Figure 1.2 to organize your counting.
1.17. Of the numbers 1, . . . , 225, how many of them are relatively prime to 225? (Two posi-
tive integers are called relatively prime if their greatest common divisor is 1.)
1.18. Explore: In the setting of Example 1.21, suppose one of the n! possible seat reassign-
ments is chosen uniformly at random. On average, how many people stay in their seat?
Figure 1.2: Venn diagram of functions [6] → [3] for Exercise 1.16.
Graph theory
Definition. A graph G is a finite set V , called the vertices, together with a set
E of unordered pairs of vertices, called the edges.1
for example, to always take V = [n] for some integer n and then never to worry about that again.
Not only that, graphs are quite useful as structures on which to build models of
various real-world phenomena, namely phenomena that have to do with pairwise
interactions across a collection of agents. For example, many models of networks
(social networks, computer networks, transportation networks, etc.) are built using
the data structure of graphs.
2.1 Graphs
Let G = (V, E) be a graph, throughout.
Definition.
1. Say an edge e is incident to v if it contains v; then we’ll say that v is an
endpoint of e.
2. If vw is an edge of G, we will say that v and w are adjacent vertices, or that
they are neighbors.
3. The degree of a vertex v in G, denoted deg(v), is the number of edges incident
to v. It is sometimes called the valence instead.
Suppose there are 41 people at a party. To simplify things, let us agree that two
people either know each other or don’t. Then I claim that someone at the party knows
an even number of people there. How could we possibly deduce that from so little
information?
Proposition 2.1. If every vertex of a graph has odd degree, then |V | is even.
Proof. Consider the set
Anyways, each edge is incident to two vertices, so |H| = 2|E|. So |H| is even. But
alternatively, we have that
$|H| = \sum_{v \in V} \deg(v).$
We are told that every summand on the right hand side is an odd number, whereas
|H| is even. The only way that can happen is if the number of summands is even,
i.e. |V | is even.
Actually, we proved something stronger than the statement in Proposition 2.1,
namely that in any graph, the number of odd-degree vertices is even.
This proof is a good example of the technique of double-counting, that is, count-
ing the same thing two different ways in order to deduce the equality of the two
counts.
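The double count is easy to watch in action. The sketch below builds an arbitrary "who knows whom" graph on 41 people (the random edges and the seed are purely illustrative) and checks that the degrees add up to twice the number of edges, so that the number of odd-degree vertices is even.

```python
import random
from itertools import combinations

def degrees(vertices, edges):
    """Degree of each vertex: the number of edges incident to it."""
    deg = {v: 0 for v in vertices}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg

random.seed(0)  # any seed works; the checks below hold for every graph
vertices = list(range(1, 42))  # 41 party guests
edges = [pair for pair in combinations(vertices, 2) if random.random() < 0.3]

deg = degrees(vertices, edges)
odd_vertices = [v for v in vertices if deg[v] % 2 == 1]

print(sum(deg.values()) == 2 * len(edges))  # True
print(len(odd_vertices) % 2 == 0)           # True
print(any(d % 2 == 0 for d in deg.values()))  # True: someone knows an even number of people
```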
• The complete graph on n vertices has vertex set [n], say, and edge set
$\binom{[n]}{2} := \{S \subseteq [n] : |S| = 2\}.$
It is denoted $K_n$.
Question 2.2. How many edges are there in Kn ?
• Empty graph on n vertices, with no edges.
• A cycle on n vertices, denoted Cn . A cycle of length 3 is often called a triangle.
• A path on n vertices, denoted Pn .
• Complete bipartite graph Km,n . Here we have V = A ∪ B where A, B are
disjoint sets of sizes m and n respectively, and an edge between every a ∈ A
and b ∈ B.
• The random graph G(n, p). Okay, this isn’t really a single graph, but rather a
way of picking a graph on n vertices at random, i.e. a probability distribution
on graphs. Fix a vertex set [n], for some n ≥ 1, and let p be a real number
between 0 and 1. For each pair of vertices i ≠ j, we flip a weighted coin,
weighted heads with probability p and tails 1 − p. If it comes up heads, insert
an edge from i to j; if it comes up tails, don’t. All in all, you flip the coin $\binom{n}{2}$
times.
For example, if p = 0 then you always get the empty graph. If p = 1 then
you always get the complete graph. As you slowly crank p up from 0 to 1,
you see graphs that have more and more edges. This turns out to be very
interesting. Often, given a property enjoyed by some graphs (like “connected”
or “triangle-free”), you see some “threshold” behavior, i.e. as you increase
p, the probability that your graph G(n, p) has the property suddenly spikes at
a particular p depending on n (in a way that can be made precise).
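The coin-flipping recipe for G(n, p) translates directly into a few lines of code. This is a minimal sketch; the function name random_graph and the use of Python's random module as the "weighted coin" are our own choices.

```python
import random
from itertools import combinations

def random_graph(n, p, rng=random):
    """Sample the edge list of G(n, p): one independent coin flip per pair of vertices."""
    return [(i, j) for i, j in combinations(range(1, n + 1), 2)
            if rng.random() < p]  # heads with probability p

print(len(random_graph(6, 0)))  # 0: the empty graph
print(len(random_graph(6, 1)))  # 15: the complete graph K_6
print(random_graph(6, 0.5))     # some graph in between; varies from run to run
```

Running the last line repeatedly with larger n and different p is a pleasant way to see the threshold behavior described above.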
[Figure 2.2: two graphs, one with vertices labeled 1, 2, 3, 4 and the other with vertices labeled A, B, C, D.]
care about the exact names of the vertices. For example, at some level I would like to
treat the two graphs in Figure 2.2 as essentially the same, even though they are not
literally the same. Their vertex sets are different, for instance.
So we will define two graphs to be isomorphic if one is obtained from the other
by relabelling the vertices. More precisely, we have the following definition.
(Remember, ij is short for {i, j}, and similarly for f (i)f (j).)
If I give you two very small graphs, such as the ones in Figure 2.2, it is easy to tell
whether they are isomorphic. But the general problem of determining whether two
graphs are isomorphic is not at all simple. The question of just how hard that problem
is, called the Graph Isomorphism Problem, is one of the deep problems in compu-
tational complexity. In fact, a huge breakthrough on this problem by László Babai
was just achieved this year. Please see this article2 and this article3 by science/math
writer Erica Klarreich.
Now let us make a very simple observation: If G is acyclic, then it remains acyclic
after removing any edges. Do you agree? This suggests that it could be interesting to
study maximally acyclic graphs. By this we mean a graph G that is acyclic but such
that throwing in any single additional edge between two of its vertices introduces a
cycle: G is maximal with respect to the property of being acyclic.
We study maximal acyclic graphs in the next chapter. They are called trees, and
they are of fundamental importance.
2 https://www.quantamagazine.org/algorithm-solves-graph-isomorphism-in-record-time-20151214
3 https://www.quantamagazine.org/graph-isomorphism-vanquished-again-20170114/
2.2 Trees
First, let me define what it means for a graph to be connected: it means that one may
take a walk in the graph from any one vertex to any other.
This is the kind of property that you would want if, say, you ran an airline. You
would want your network of cities and direct flights to be connected: a customer
could travel from any city to any other on your flights.
In other words: T is a tree if it is connected, but removing any edge would make
it disconnected. It’s connected, but just barely: it has no redundancy whatsoever.
Let us understand the structure of trees. First, we observe that trees are acyclic.
After all, if instead there were a cycle in a tree T , then remove an edge e lying in the
cycle. The resulting graph is still connected (why?).
In fact, the following holds.
There are many equivalent characterizations of trees, some of which will be en-
countered in the Exercises.
Proposition 2.4. Trees have leaves. More precisely, if T is a tree on at least two
vertices, then T has at least two leaves.
Proof. First, let’s give the idea. Suppose you are sitting around on the middle of an
edge ij of T , and you want to find a leaf. What would you do? You would start
walking towards i (or to j), and then keep walking, and walking, until you reach a
dead end. That dead end is a leaf. If you go in the other direction instead, you’d get a
second leaf. The point is that you cannot cycle back to any previously visited vertex,
because trees are acyclic.
Now we give the official proof. Consider a maximal path P in T . Saying that P
is maximal means that it is not contained in a longer path of T .
Let ℓ and m be the endpoints of P . Since T has at least one edge, P is nonempty
so ℓ and m are incident to an edge in P and have degree at least 1.
On the other hand, they have degree at most 1. Suppose that ℓ is adjacent to
another vertex x. Well, if x is not on P , then we have produced a longer path than
P , contradicting our choice of P . But if x is on P , then we have produced a cycle in
T , contradicting that T has no cycles.
Either way, we produced a contradiction, so we conclude that ℓ is not adjacent to
any vertex besides its neighbor in P . We argue similarly for m.
Corollary 2.5. Every tree on n vertices has exactly n − 1 edges.
Proof. Here’s a sketch of a proof: If n = 1 the statement is pretty clear. For bigger
n, remove a leaf and its incident edge, and induct.
doing that until you are left with only two vertices. Call the sequence you wrote down
the Prüfer code of T , denoted $P_n(T)$.
Our goal is to show that $P_n$ provides a bijection between the trees on vertex set
[n] and the $n^{n-2}$ sequences of length n − 2 with alphabet [n].
Lemma 2.7. Let T be a tree on vertex set [n]. Then the leaves of T are precisely the
vertices not appearing in the Prüfer code of T .
Proof. If a number i appears in Pn (T ), that means that at some point a leaf was
ripped off of i. So i is not a leaf.
If i is not a leaf, then it has at least two incident edges. Then at least one of these
edges must have been ripped off at some point. Consider the exact moment when the
first of these edges was ripped off; it couldn’t have been that i was the leaf, since it
had degree at least 2; therefore a leaf was ripped off i! So i appears in Pn (T ).
Incidentally, this Lemma shows that if I give you a Prüfer code Pn (T ) but don’t
tell you T , you can at least determine one small thing that’s true about T . Namely, if
v is the largest vertex not in Pn (T ), then the first thing we did was to rip the leaf v
off of the vertex whose label is recorded first in Pn (T ).
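Here is a sketch of the encoding direction only, for experimenting. It assumes the convention suggested by the remark above: at each step, rip off the leaf with the largest label and write down the vertex it was attached to, stopping when two vertices remain. (Many references remove the smallest leaf instead.) The names prufer_code and is_connected are ours, and a tree is given as a list of edges on the vertex set {1, . . . , n}.

```python
from itertools import combinations

def prufer_code(n, edges):
    """Prüfer code of a tree on {1..n}, removing the largest-labeled leaf each time (assumed convention)."""
    neighbors = {v: set() for v in range(1, n + 1)}
    for u, w in edges:
        neighbors[u].add(w)
        neighbors[w].add(u)
    code = []
    while len(neighbors) > 2:
        leaf = max(v for v, nb in neighbors.items() if len(nb) == 1)
        (parent,) = neighbors.pop(leaf)  # a leaf has exactly one neighbor
        neighbors[parent].discard(leaf)
        code.append(parent)
    return code

def is_connected(n, edges):
    """Check by a simple search from vertex 1 that every vertex can be reached."""
    reached, stack = {1}, [1]
    while stack:
        v = stack.pop()
        for u, w in edges:
            for a, b in ((u, w), (w, u)):
                if a == v and b not in reached:
                    reached.add(b)
                    stack.append(b)
    return len(reached) == n

# All trees on [4]: connected graphs with exactly 3 of the 6 possible edges.
n = 4
all_edges = list(combinations(range(1, n + 1), 2))
trees = [t for t in combinations(all_edges, n - 1) if is_connected(n, t)]
codes = {tuple(prufer_code(n, t)) for t in trees}

print(prufer_code(4, [(1, 2), (2, 3), (3, 4)]))  # [3, 2]
print(len(trees), len(codes))                    # 16 16
```

For n = 4 the 16 trees receive 16 different codes, matching $4^{4-2} = 16$. Recovering a tree from its code is deliberately not shown here.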
But amazingly, this is all we need, by induction! Putting it all together, we are
ready to prove Cayley’s Theorem.
• after deleting the leaf ℓ, the remaining tree T′ on vertices [n] \ {ℓ} has Prüfer
code $s_2 \cdots s_{n-2}$.
But now we are done! After all, the conditions above determine a unique T , since
by induction, the tree T′ exists and is unique, and to produce T from T′, we simply
attach the leaf ℓ to $s_1$. So T also exists and is unique.
This proof should annoy you a little, because we have refrained from directly
describing a procedure for taking a Prüfer code and recovering a tree. This is a nice
problem for you to try.
We will argue this informally in class, just to get a feel for chromatic number. A
3-coloring of C5 and a 2-coloring of C4 are shown below.
[Figure: a 3-coloring of C5 and a 2-coloring of C4.]
Here is a personal anecdote. Back in 2005, I was a student at the Kneisel Hall
Chamber Music Festival in Blue Hill, Maine. Each of the 55 or so students in the
program is assigned to two chamber groups. A chamber group is a small ensemble,
usually composed of 3-6 students. Every summer the director has to slot the 27 or so
chamber groups into a daily rehearsal schedule. Of course, she can’t have the same
student assigned to two groups rehearsing at the same time.
Question. What does this scheduling problem have to do with graph coloring?
When I arrived that summer, a friend “volunteered” me to write a computer pro-
gram to solve the Kneisel Hall scheduling problem, so that the director would no
longer have to spend hours and hours overnight each summer doing the scheduling
by hand. Almost every summer since then, I still use graph-coloring algorithms to
do the scheduling of all the rehearsals and coachings at Kneisel Hall—as a sort of
mathematician’s “in-kind donation.”
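To make the connection concrete, here is one way such a scheduler might begin. This is a toy sketch only, not the program described above: the groups and student names are invented, and the greedy rule used here always produces a valid schedule but not necessarily one with the fewest rehearsal slots. Chamber groups are the vertices, two groups are joined by an edge when they share a student, and rehearsal slots play the role of colors.

```python
from itertools import combinations

def conflict_edges(groups):
    """Pairs of groups that share at least one student (the edges of the conflict graph)."""
    return {(a, b) for a, b in combinations(sorted(groups), 2)
            if groups[a] & groups[b]}

def greedy_schedule(groups):
    """Give each group the first slot not already used by a conflicting group."""
    edges = conflict_edges(groups)
    slot = {}
    for g in sorted(groups):
        taken = {slot[h] for h in slot if (h, g) in edges or (g, h) in edges}
        slot[g] = next(s for s in range(len(groups)) if s not in taken)
    return slot

# Hypothetical data: group name -> set of students.
groups = {
    "duo": {"Ana", "Dee"},
    "quartet": {"Ben", "Dee", "Eli", "Fay"},
    "trio": {"Ana", "Ben", "Cy"},
    "sextet": {"Gus", "Hal", "Ida", "Jo", "Kim", "Lee"},
}
schedule = greedy_schedule(groups)
print(schedule)
```

A schedule is valid exactly when it is a proper coloring of the conflict graph, and the number of slots needed is at least the chromatic number.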
Definition. We say that G is planar if it can be drawn in the plane, e.g., on a sheet of
paper, without edge crossings.
I will be just slightly informal about what exactly a drawing is, but what you pic-
ture in your mind is likely to be correct. Namely, a drawing consists of a placement
of the vertices of G at distinct points of your sheet of paper, and edges drawn in a rea-
sonable way, e.g. using a finite sequence of line segments or something sufficiently
close to that.4
It is important to note that planarity is a property of an abstract graph: it asserts
the existence of a drawing in the plane without edge-crossings. It does not assert that
any particular drawing has no edge-crossings. So, for example, the graph K4 is a
planar graph, even though if you were to draw K4 the first way that pops into your
head, it might be that you drew an edge-crossing.
Suppose you want to draw a map of the world, with all the countries colored in
various shades. As is usual with maps, you impose the condition that two bordering
countries receive different shades.
What does this have to do with graphs? Draw a vertex inside each country. Then,
if two countries share a border, draw an edge between their vertices crossing the
border. The resulting graph is called the dual graph, and the map-coloring question
becomes a question of properly coloring the dual graph.
Mapmakers over the ages have long known, heuristically at least, that four colors
are always enough.
Theorem 2.10. (The four color theorem) Every planar graph is 4-colorable.
4 For those of you who think I’m being too uptight about this, may I inform you, or remind you, that
v − e + f = 2.
The result above is called Euler’s Formula, and it is deep. You will surely return to
it in the Geometry unit of this class next semester. By the way, a special case of
this formula states that any convex polyhedron in R^3 (e.g., a cube, a tetrahedron, a
dodecahedron, or whatever) satisfies v − e + f = 2 for its vertices, edges, and faces.
Do you see why this statement about polyhedra follows from Proposition 2.12?
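As a quick sanity check of the polyhedral special case, here is a minimal Python sketch that evaluates v − e + f for the five Platonic solids. The table of counts and the helper name euler_characteristic are illustrative choices, not part of the notes; the (v, e, f) values are the standard ones for these solids.

```python
# Standard (vertices, edges, faces) counts for the five Platonic solids.
PLATONIC_SOLIDS = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}


def euler_characteristic(v, e, f):
    """Return v - e + f, the quantity appearing in Euler's formula."""
    return v - e + f


for name, (v, e, f) in PLATONIC_SOLIDS.items():
    print(f"{name}: v - e + f = {euler_characteristic(v, e, f)}")
```

Of course, checking five examples proves nothing; it only illustrates what the formula asserts.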
A consequence of Euler’s formula is that every planar graph has a vertex of de-
gree at most 5.
Lemma 2.13. Every planar graph has a vertex of degree at most 5.
Proof. We may as well prove that every connected planar graph G has a vertex of
degree at most 5. Suppose on the contrary that every vertex of G has degree 6
or more. Consider a plane drawing of G with v vertices, e edges, and f faces. Now
by double-counting edge-face incidences, i.e. ordered pairs (a, F) in which a is an
edge on the boundary of the face F, we see that
2e ≥ 3f,
since every edge borders at most two faces and every face is bounded by at least
three edges.
And, by double-counting vertex-edge incidences, i.e. ordered pairs (x, a) in which
x is an endpoint of the edge a, we see that
2e ≥ 6v,
since we assumed that every vertex is incident to at least 6 edges.
All together, we have
v − e + f ≤ e · (1/3 − 1 + 2/3) = 0,
which contradicts Euler's formula v − e + f = 2.
[Figure: the vertex v surrounded by its five neighbors v1, v2, v3, v4, v5, in cyclic order.]
Let us make the following simple observation: if the proper coloring of G′ may be
tweaked in some way so that not all five colors are used on v1 , . . . , v5 , then we again
win: there is a color remaining for v. In particular, we may assume that c(vi ) = i for
each i = 1, . . . , 5, say.
Now we ask: is there a path from v1 to v3 in G′ whose vertices are alternately
colored 1 and 3? Let us call such a path an alternating 1-3-path for short.
If not, then we may tweak the given coloring of G′ so that we change the color of
v1 from 1 to 3, without changing the color of any of the other vi. Do you see
why? The idea is to switch the colors 1, 3 on the set of vertices that can be reached
from v1 by an alternating 1-3-path. After we do that, we can give color 1 to v. Thus,
we are done unless v1 and v3 are connected by an alternating 1-3-path. A similar
argument says that we are done unless v2 and v4 are connected by an alternating
2-4-path.
But it is not possible for both such paths to exist. Indeed, they would cross! The
alternating 1-3-path together with v forms a closed curve separating v2 from v4, and
the two paths cannot share a vertex because they use different colors. We
are done.
Remember that a triangle just means a K3 subgraph. Do you agree that this is a
rephrasing of the original claim?
Proof. For psychological ease, let us label the vertices 1, . . . , 6, and call the colors
red and blue. Now of the five edges incident to vertex 1, some three of them are the
same color, by the Pigeonhole Principle; suppose that they are the edges to i, j, and
k. We may suppose the edges 1i, 1j, and 1k are red; the same argument with colors
reversed applies if they are all blue.
Now if any of the three edges ij, ik, or jk is red, then that edge, together with
the two red edges joining its endpoints to vertex 1, forms a red triangle. So we may
assume that they are all blue.
But then the edges ij, ik, and jk form a monochromatic triangle, so we are done.
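The statement is small enough to confirm by exhaustive search. The following Python sketch (the function names are my own, chosen for illustration) runs through all 2^15 colorings of the edges of K6 and checks that each has a monochromatic triangle; the same search on K5 finds a coloring with none.

```python
from itertools import combinations, product


def has_mono_triangle(n, color):
    """color maps each edge (i, j) with i < j to 0 (red) or 1 (blue)."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_coloring_has_mono_triangle(n):
    """Brute force over all 2-edge-colorings of the complete graph on n vertices."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False
    return True


print("K6:", every_coloring_has_mono_triangle(6))
print("K5:", every_coloring_has_mono_triangle(5))
```

A brute-force check like this is no substitute for the proof, and it becomes hopeless very quickly as the number of vertices grows.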
Now suppose there are 7 people at a party. Is it necessarily the case that there are
three people who all know each other, or that there are three people who all don’t
know each other? Well, yes—you can even find such a group of three people among
any six of them. So the number 6 is some kind of threshold, a smallest number of
vertices which forces the existence of a red K3 or a blue K3 .
Now, is there any party you could throw that’s big enough to guarantee that either
some 100 people all know each other or some 1000 people all don’t know each other?
(You definitely need to invite at least 1000 people, otherwise you could invite 999
pairwise-strangers. But 1000 isn’t going to do the trick.) What do you think?
The question is whether there is always a threshold like the number 6 we found
earlier, or whether instead it is possible to plan bigger and bigger parties such that
the people who know each other, and conversely the people who don’t, do not come
in large clumps: clumps of size 100 and 1000 respectively.
Definition. Let k, l ≥ 1 be integers. We define the (k, l)-Ramsey number, denoted
R(k, l), as follows. It is the smallest number n, if such a number exists, such that
every 2-edge-coloring of Kn using colors red and blue has a red Kk or has a blue
Kl .
So we have rephrased the question as: Do Ramsey numbers exist? For example,
we argued that R(3, 3) exists and is at most 6, and we argued that in fact R(3, 3) = 6 by
exhibiting an explicit 2-edge-coloring of K5 without a monochromatic triangle.
Proposition 2.15. For all integers k, l ≥ 1, the Ramsey numbers R(k, l) exist.
This is rather amazing. We shall model the following proof on our proof that
R(3, 3) ≤ 6.
Proof. First, let us establish that for every k ≥ 1 and l ≥ 1, we have
R(1, l) = R(k, 1) = 1.
Indeed, K1 has no edges, so a single vertex is vacuously both a red K1 and a blue K1.
Next, for k, l ≥ 2, we argue by induction on k + l that R(k, l) exists and that
R(k, l) ≤ R(k − 1, l) + R(k, l − 1).
In other words, we shall show that if n = R(k−1, l)+R(k, l−1), any 2-edge-coloring
of Kn has a red Kk or a blue Kl .
Indeed, given a 2-edge-coloring of Kn , consider the n − 1
edges incident to vertex 1. Among these, the Pigeonhole Principle implies that either
there are at least R(k − 1, l) red edges or there are at least R(k, l − 1) blue edges.
Let us suppose there are at least R(k − 1, l) red edges incident to vertex 1; the other
case can be argued analogously.
Now look at R(k−1, l) of the vertices on the other ends of these red edges. By the
inductive hypothesis, there is either a red Kk−1 or a blue Kl among them. In the
second case, we win. In the first case, we also win, since the red Kk−1 together with
the vertex 1 forms a red Kk .
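The proof is effectively a recipe for an upper bound: start from R(1, l) = R(k, 1) = 1 and repeatedly apply R(k, l) ≤ R(k − 1, l) + R(k, l − 1). Here is a minimal Python sketch of that recipe; the name ramsey_upper_bound is hypothetical, and the function returns an upper bound that the recursion yields, not the Ramsey number itself.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ramsey_upper_bound(k, l):
    """Upper bound on R(k, l) obtained by unwinding the recursion in the proof."""
    if k == 1 or l == 1:
        return 1  # base case: R(1, l) = R(k, 1) = 1
    return ramsey_upper_bound(k - 1, l) + ramsey_upper_bound(k, l - 1)


for k, l in [(3, 3), (3, 4), (4, 4), (5, 5)]:
    print(f"R({k}, {l}) <= {ramsey_upper_bound(k, l)}")
```

For (3, 3) the recursion returns 6, which matches the value found earlier; in general there is no reason to expect the bound to be sharp.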
Three colors. Let k, l, m ≥ 1 be integers. Does there exist an N such that every
3-edge-coloring of KN contains a red Kk , a blue Kl , or a green Km ?
Definition. Let k, l, m ≥ 1 be integers. We define the (k, l, m)-Ramsey number,
denoted R(k, l, m), as follows. It is the smallest number n, if such a number exists,
such that every 3-edge-coloring of Kn using colors red, blue, and green has a red
Kk , a blue Kl , or a green Km .
Proposition 2.16. For all integers k, l, m ≥ 1, the Ramsey numbers R(k, l, m) exist.
Proof. We claim that R(k, l, m) exists and that
R(k, l, m) ≤ R(k, R(l, m)).
The idea is to suppose that we are temporarily unable to distinguish between blue
and green.
Let n = R(k, R(l, m)) and suppose we have a 3-edge-coloring of Kn . Consider
it as a 2-edge-coloring of Kn with colors “red” and “blue-green.” Then either there is a
red Kk or a blue-green complete subgraph on R(l, m) vertices. In the first case we win.
In the second case, remembering again the difference between blue and green, we see
that the original coloring had a complete subgraph on R(l, m) vertices which was
2-edge-colored blue and green. By definition of
R(l, m), there is a blue Kl or a green Km , and again we are done.
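The "color-blind" argument can be turned into a numerical bound by nesting two-color bounds. The sketch below is one way to do that, under stated assumptions: two_color_bound is the bound coming from R(k, l) ≤ R(k − 1, l) + R(k, l − 1) with R(1, l) = R(k, 1) = 1, and the nesting uses the fact that both R(k, l) and this bound can only grow when l grows. Both function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def two_color_bound(k, l):
    """Upper bound on R(k, l) from the recursion with R(1, l) = R(k, 1) = 1."""
    if k == 1 or l == 1:
        return 1
    return two_color_bound(k - 1, l) + two_color_bound(k, l - 1)


def three_color_bound(k, l, m):
    """Upper bound on R(k, l, m) via R(k, l, m) <= R(k, R(l, m))."""
    # Merge blue and green into one color, then bound the two-color problem.
    return two_color_bound(k, two_color_bound(l, m))


print("R(3, 3, 3) <=", three_color_bound(3, 3, 3))
```

For example, since R(3, 3) = 6, the call above amounts to bounding R(3, 6) by the two-color recursion.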
This same idea may be extended to arbitrarily many colors.
Proposition 2.17. Let n ≥ 1 be an integer. For integers m1 , . . . , mn ≥ 1, there
exists a number N with the following property: any n-edge-coloring of the complete
graph KN contains a complete subgraph Kmi whose edges are colored i, for some
i ∈ {1, . . . , n}.
We write R(m1 , . . . , mn ) for the smallest such N. It is
called the (m1 , . . . , mn )-Ramsey number.
Proof. Exercise, imitating the proof of Proposition 2.16.
What about lower bounds on Ramsey numbers? Let’s unpack this: the statement
R(k, l) > n
means that there exists a 2-edge-coloring of Kn with no red Kk and no blue Kl . For
example, we showed that R(3, 3) > 5 by producing an appropriate 2-edge-coloring
of K5 .
Question 2.18. What is the best lower bound for R(3, 4) that you can find?
Imagine trying to find a lower bound for R(100, 100). It seems you would have
to painstakingly construct a 2-edge-coloring of a truly enormous complete graph and demonstrate
in some way that it had no monochromatic K100 inside it. That sounds hard!
But there is an amazing way to get around this. It hinges on a crucial idea that
has been a theme of this course so far: sometimes it is possible to prove something
exists without actually constructing it. It’s better, of course, to give a constructive
proof, but sometimes that can be hard. For example, we have seen several arguments by
contradiction that show something exists without necessarily finding it. Instead, we
simply assume it does not exist, and then derive an absurdity.
In the next section we will discuss a remarkable method that can sometimes be
used to prove that something exists without actually constructing it: the probabilistic
method. We will apply this method to proving lower bounds on Ramsey numbers
R(k, l).
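\[
\binom{n}{k} \cdot 2^{1-\binom{k}{2}}
\le \frac{(2^{k/2})^k}{k!} \cdot 2^{1-\binom{k}{2}}
= \frac{1}{k!} \cdot 2^{\frac{k^2}{2}+1-\binom{k}{2}}
= \frac{1}{k!} \cdot 2^{1+\frac{k}{2}} < 1.
\]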
Proof of Proposition 2.19. Let n be a positive integer satisfying the inequality (2.2).
We want to show that there is a 2-edge-coloring of Kn with no monochromatic Kk .
There are finitely many 2-edge-colorings of Kn ; pick one at random. We shall
demonstrate that the probability that it contains no monochromatic Kk is positive.
That’s the probabilistic method.
Well, there are $2^{\binom{n}{2}}$ 2-edge-colorings of Kn , since Kn has $\binom{n}{2}$ edges. Let’s
start by focusing our eyes on one Kk subgraph (for instance, the one on vertices
{1, . . . , k}) and asking in how many of the 2-edge colorings of Kn is this particular
Kk monochromatic. The answer is
\[ 2 \cdot 2^{\binom{n}{2}-\binom{k}{2}}. \]
Do you see why? Of course, the answer does not depend on the particular Kk sub-
graph of Kn that we chose.
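If you would like to test your answer to "Do you see why?", a small brute-force count can help. The Python sketch below (function name hypothetical) counts the 2-edge-colorings of Kn in which the Kk on vertices 0, . . . , k − 1 is monochromatic, and compares the count with 2 · 2^(C(n,2) − C(k,2)), where C(n, 2) denotes the number of edges of Kn. It assumes k ≥ 2 and is only feasible for very small n.

```python
from itertools import combinations, product
from math import comb


def count_colorings_with_fixed_mono_clique(n, k):
    """Count 2-edge-colorings of K_n whose K_k on vertices 0..k-1 is monochromatic."""
    edges = list(combinations(range(n), 2))
    # An edge (a, b) with a < b lies inside the chosen K_k exactly when b < k.
    clique_edges = [i for i, (a, b) in enumerate(edges) if b < k]
    count = 0
    for bits in product((0, 1), repeat=len(edges)):
        if len({bits[i] for i in clique_edges}) == 1:
            count += 1
    return count


n, k = 5, 3
brute = count_colorings_with_fixed_mono_clique(n, k)
formula = 2 * 2 ** (comb(n, 2) - comb(k, 2))
print(brute, formula)
```

The two printed numbers agree: there are 2 ways to color the edges of the chosen Kk with a single color, and the remaining edges may be colored freely.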
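Now to figure out exactly how many of the 2-edge-colorings of Kn have at least
one monochromatic Kk subgraph, we would do some inclusion/exclusion. But let’s
not. Instead, let’s make the simple observation that since there are $\binom{n}{k}$ Kk subgraphs
of Kn , the number of 2-edge-colorings of Kn with at least one monochromatic Kk is at most
\[ \binom{n}{k} \cdot 2 \cdot 2^{\binom{n}{2}-\binom{k}{2}}. \]
Dividing by the total number $2^{\binom{n}{2}}$ of 2-edge-colorings, the probability that a
random 2-edge-coloring of Kn has a monochromatic Kk is at most
\[ \frac{\binom{n}{k} \cdot 2 \cdot 2^{\binom{n}{2}-\binom{k}{2}}}{2^{\binom{n}{2}}}, \]
which is equal to
\[ \binom{n}{k} \cdot 2^{1-\binom{k}{2}}. \]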
This last quantity is smaller than 1 by hypothesis.
In other words, if n satisfies the stated inequality, then some 2-edge-coloring of
Kn has no monochromatic Kk . So R(k, k) > n.
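To get a feel for the numbers, here is a short Python sketch that finds, for a given k ≥ 3, the largest n for which C(n, k) · 2^(1 − C(k, 2)) < 1, where C(n, k) is the binomial coefficient. It works with exact integers by rewriting the inequality as 2 · C(n, k) < 2^C(k, 2). The function names are illustrative, and the sketch relies on C(n, k) being increasing in n for n ≥ k.

```python
from math import comb, isqrt


def satisfies_bound(n, k):
    """True when C(n, k) * 2^(1 - C(k, 2)) < 1, tested in exact integer arithmetic."""
    return 2 * comb(n, k) < 2 ** comb(k, 2)


def largest_n(k):
    """Largest n satisfying the inequality, for k >= 3."""
    if k < 3:
        raise ValueError("this sketch assumes k >= 3")
    n = k  # n = k always works when k >= 3, since 2 * 1 < 2^C(k, 2)
    while satisfies_bound(n + 1, k):
        n += 1
    return n


for k in range(3, 9):
    # isqrt(2 ** k) is the floor of 2^(k/2)
    print(k, isqrt(2 ** k), largest_n(k))
```

Each printed row gives k, the value ⌊2^(k/2)⌋ used in the calculation above, and the largest n the inequality allows; the argument shows R(k, k) > n for every n up to that last value.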
To this day, to the best of my knowledge, the probabilistic lower bounds on Ram-
sey numbers are better than the constructive lower bounds on Ramsey numbers; this
is in fact a topic of current research.
In fact, something more tantalizing can be gleaned from the bound we got. Note
that we calculated that when n = ⌊2^{k/2}⌋, the probability that a random 2-edge-coloring
of Kn has a monochromatic Kk is at most (1/k!) · 2^{1+k/2}. Not only is that number
smaller than 1 when k ≥ 3, it is much, much smaller than 1 for k large.
This says that a great way to construct a 2-edge coloring of Kn with no
monochromatic Kk is to pick one at random. You should then check that the one
you picked is a good one, but it’s very likely to be good. (The reason that this isn’t
the same as a constructive lower bound on Ramsey numbers is that it only deals with
one k at a time, and not all k at once.)
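The "pick one at random, then check" recipe is easy to try out. Below is a minimal Python sketch under stated assumptions: n is taken to be ⌊2^(k/2)⌋, each edge is colored red or blue independently and uniformly, and the check is a plain search over all k-subsets of vertices, so it is only practical for small k. All names here are hypothetical.

```python
import random
from itertools import combinations
from math import isqrt


def random_coloring(n, rng):
    """Color each edge (i, j), i < j, of K_n red or blue uniformly at random."""
    return {edge: rng.choice(("red", "blue")) for edge in combinations(range(n), 2)}


def has_mono_clique(n, k, color):
    """True if some k vertices span edges of a single color (assumes k >= 2)."""
    for verts in combinations(range(n), k):
        if len({color[e] for e in combinations(verts, 2)}) == 1:
            return True
    return False


def sample_good_coloring(k, rng, max_tries=1000):
    """Sample colorings of K_n, n = floor(2^(k/2)), until one has no monochromatic K_k."""
    n = isqrt(2 ** k)
    for _ in range(max_tries):
        color = random_coloring(n, rng)
        if not has_mono_clique(n, k, color):
            return n, color
    return n, None


n, color = sample_good_coloring(6, random.Random(0))
print(n, "found a good coloring" if color is not None else "no luck")
```

The checking step is the expensive part: it looks at every k-subset of the n vertices, which is exactly why this recipe does not scale to something like k = 100.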
With hindsight, it’s reasonable that we appealed to a probabilistic method for
lower bounds on Ramsey numbers: a lower bound asserts the existence of a structure
that is rather disordered: no large clumps of red and no large clumps of blue. Roughly
speaking: if you try to write down 2-edge-colorings in some systematic way, you will
tend to find these large monochromatic clumps. In some rough, philosophical sense,
that’s due to the very fact that you are being systematic. Instead, a random
2-edge-coloring of Kn with n = ⌊2^{k/2}⌋ is very, very likely to do the trick and not be very clumpy.
This is a demonstration, at least morally, of the aptly named principle to be
found all over mathematics and computer science: the difficulty of “finding hay in a
haystack.”5 It can be hard to find hay in a haystack! Despite the fact that you know
your haystack to be filled 99.9% (or whatever) with hay, every time you reach into
the haystack and pull something out, you find that it is a needle.
One final remark: you might complain that we haven’t really used probability,
as opposed to just counting, in a meaningful way. Indeed, in our proof of
Proposition 2.19, we bounded the number of 2-edge-colorings of Kn containing a
monochromatic Kk , and showed that this number is smaller than the total number
of 2-edge-colorings of Kn . The complaint is a good one. In this particular instance,
everything we did can be phrased simply in terms of counting and without mention
of probabilities. On the other hand, in many cases it turns out that the perspective
of probability and the probabilistic method, and the techniques it brings to bear, are
very useful, even when applied in purely combinatorial situations. That is a story for
another time, which you can learn from [1] in the references. I’ve also included in
5 I heard this compelling catch-phrase in a talk by mathematician/computer scientist Avi Wigderson,
but I am not sure whether he was the one who coined it.
Exercises
Section 2.1. Graphs
2.1. Prove that in any graph, there are two vertices having equal degree. (Possible hint: assume
not, and think about the set of vertex degrees.)
2.2. Let G and G′ be the two 6-vertex graphs shown below. How many isomorphisms (see
page 24) from G to G′ are there? Explain your answer briefly.
[Figure: the graph G on vertices 1, . . . , 6 and the graph G′ on vertices A, . . . , F.]
2.4. Draw six nonisomorphic trees on vertex set {1, 2, 3, 4, 5, 6}, and compute each of their
Prüfer codes.
2.5. Let n ≥ 2 be an integer. Of the n^{n−2} trees on vertex set [n], prove that exactly 2n^{n−3} of
them contain the edge {1, 2}.
Here are two possible hints.
(i) Double-count the elements of the set
{(T, {i, j}) : T is a tree with vertices [n] and {i, j} is an edge of T}.
(ii) Alternatively, prove carefully that a tree contains the edge {1, 2} if and only if its Prüfer
code ends in 1 or 2.
2.6. Let n ≥ 2 be an integer. Of the n^{n−2} trees on vertex set [n], in how many of them is
vertex n a leaf?
2.7. Explore: Let n > m ≥ 2 be integers. Of the n^{n−2} trees on vertex set {1, . . . , n}, how
many have exactly m leaves?
What is the average number of leaves in a tree on n vertices?
2.9. Let G be a graph and let D be the maximum degree of its vertices. Prove that
χ(G) ≤ D + 1.
2.10. Let G be a graph on n vertices. The complement of G, denoted Ḡ, is the graph obtained
from G by changing edges to non-edges and non-edges to edges. In other words, ij ∈ E(Ḡ)
if and only if ij ∉ E(G).
Prove that
χ(G) · χ(Ḡ) ≥ n.
2.11. The four-color theorem asserts that every planar graph is 4-colorable. Is the converse
true? That is, is every 4-colorable graph planar?
2.13. Consider an N × N square grid of dots, with rows and columns labeled 1, . . . , N . For
n ≤ N , let S, T ⊆ [N ] be subsets of size n. An n × n square subgrid is the set of the n^2 dots
in rows S and columns T . In particular, square subgrids don’t have to be contiguous.
(a) Given n ≥ 1, is it true that there exists a number N such that every 2-coloring of the
dots in an N × N square grid contains an n × n monochromatic square subgrid?
(b) If n = 2, can you calculate the smallest N with the property of part (a)?
(a) The set Xn of walks from (0, 0) to (2n, 0) and consisting of 2n steps, each
step in direction (1, 1) or (1, −1), with the property that the walk never crosses
below the x-axis.
(b) The set Yn of sequences of n symbols ( and n symbols ) that are correctly
parenthesized, meaning that in any initial subsequence, one never encounters
more ) than (. For example:
( ( ) ( ( ) ) ).
(c) The set Zn of bijective assignments of the numbers 1, . . . , 2n to a 2 × n grid
of boxes such that the rows increase from left to right and the columns increase
from top to bottom. For example:
1 3 4
2 5 6
Can you compute the size of X4 ? X5 ? Make a conjecture about the sequence of
sizes of the sets X1 , X2 , X3 , . . . and prove your conjecture.
1 This exercise is inspired by Exercise 6.19 in Richard Stanley’s classic text Enumerative
Combinatorics, Volume II. The exercise there is just like this one except that it has 66 parts
instead of 4, and asks you to come up with $\binom{66}{2}$ bijections.
[1] N. Alon and J. H. Spencer. The probabilistic method. Wiley Series in Discrete
Mathematics and Optimization. John Wiley & Sons, Inc., Hoboken, NJ, fourth
edition, 2016.
[2] M. Bóna. A walk through combinatorics. World Scientific Publishing Co., Inc.,
River Edge, NJ, 2002. An introduction to enumeration and graph theory, With a
foreword by Richard Stanley.
[3] R. Diestel. Graph theory, volume 173 of Graduate Texts in Mathematics.
Springer-Verlag, New York, 1997. Translated from the 1996 German original.