
DISCRETE MATHEMATICS THROUGH

GUIDED DISCOVERY:
Classnotes for MTH 355
Kenneth P. Bogart 1
Department of Mathematics
Dartmouth College

Mary E. Flahive 2
Department of Mathematics
Oregon State University

1 This author was supported by National Science Foundation Grant Number DUE-0087466 for his development of the original notes.
2 This author was supported by National Science Foundation Grant Number DUE-0410641 for this adaption of the original notes.

Preface
Much of your experience in lower division mathematics courses probably had
the following flavor: You attended class and listened to lectures where theory
and examples were presented. Your text usually gave a parallel development.
Then your turn came. For every assigned problem or proof there was a
method already taught in class that was the key, and your job was to decide
which method applied and then to apply it. In upper-level math major
courses, your goal should be to discover some of the ideas and methods for
yourself, as many as you can. These notes are intended as an introduction to
discrete mathematics and also as an introduction to mathematical thinking
within a classroom mode of learning called Guided Discovery.
Guided Discovery approaches mathematics very much like a mathematician does when on unfamiliar ground: Looking at special cases, trying to
discover patterns, wandering up blind alleys, possibly being frustrated, but
finally putting it all together into a solution of a problem or a proof of
a theorem. This thinking process is as important for you as the discrete
mathematics you will learn in the process. The notes consist principally of
sequences of problems designed for you to discover solutions to problems
yourself. Often you are guided to such solutions through simplified examples that set the stage. As you work through later problems, you will recall
earlier techniques that can either be used directly or be slightly modified to
get a solution. The point of learning in this way is that you are not just
applying methods that someone else has developed for you but rather you
are learning how to discover ideas and methods for yourself. Understanding
small points and taking small steps is the usual way of doing mathematics,
and is the usual path to all mathematical results including very significant
ones.
The notes are designed to be worked through linearly, with the problems
in the first chapter introducing you to the habit of thinking for yourself as
well as introducing you to discrete mathematics. During class you will work
in groups, with some class discussion that will help give an overall context.

Contents

I COURSE NOTES

1 Beginning Combinatorics
1.1 What is Combinatorics?
1.2 Basic Counting Principles
1.3 Functions and their Directed Graphs
1.4 Another Application of the Sum Principle
1.5 The Generalized Pigeonhole Principle (Optional)
1.6 An Overview
1.6.1 An overview of problem solving

2 The Principle of Mathematical Induction
2.1 Inductive Processes
2.2 The Principle of Mathematical Induction
2.2.1 Recurrences
2.3 The General Product Principle
2.3.1 Counting the number of functions

3 Equivalence Relations
3.1 Equivalence Relations
3.2 Equivalence Classes
3.3 Counting Subsets
3.3.1 Pascal's Triangle
3.3.2 Catalan numbers (Optional)
3.4 Ordered-functions and Multisets
3.5 The Existence of Ramsey Numbers (Optional)

4 Graph Theory
4.1 Undirected Graphs
4.2 Trees
4.3 Labelled Trees and Prüfer Codes
4.3.1 More information from Prüfer codes (Optional)
4.4 Monochromatic Subgraphs (Optional)
4.5 Spanning Trees
4.5.1 Counting the Number of Spanning Trees (Optional)
4.6 Finding Shortest Paths in Graphs
4.7 Some Asymptotic Combinatorics (Optional)

5 Generating Functions
5.1 Using Pictures to Visualize Counting
5.1.1 Pictures of trees (Optional)
5.2 Generating Functions
5.2.1 Generating polynomials
5.2.2 Generating functions
5.2.3 Product Principle for Generating Functions (Optional)
5.3 Solving Recurrences with Generating Functions

6 The Principle of Inclusion and Exclusion
6.1 The Size of a Union of Sets
6.2 The Principle of Inclusion and Exclusion
6.3 Counting the Number of Onto Functions
6.4 The Menage Problem
6.5 The Chromatic Polynomial of a Graph
6.5.1 Deletion-Contraction (Optional)

7 Distribution Problems
7.1 The Idea of Distributions
7.2 Counting Partitions
7.2.1 Multinomial Coefficients (Optional)
7.3 Additional Problems

II REVIEW MATERIAL

A More on Functions and Digraphs
A.1 Functions
A.2 Digraphs
B More on the Principle of Mathematical Induction
C More on Equivalence Relations

Part I

COURSE NOTES

Chapter 1

Beginning Combinatorics
In most textbooks, an introductory explanation is followed by problems
which use the information just given. Since much of the learning in guided
discovery occurs as you work on problems, it is natural that definitions of
concepts be given within the problems and that the text material often
occurs after you have already worked problems developing the concepts.
Because of this, be sure to read over all the problems including any you
don't work right away.
Table 1.1: The meaning of the tags on the problem numbers.
•  essential for the course
◦  motivational material
+  summary
→  especially interesting

As you flip through the pages of these notes you will see that most
of the problems are marked by a symbol to the left of the number. The
problems that are marked with a bullet (•) are essential for understanding
later material and are where the main ideas of the book are developed.
(Your instructor may leave out some of these problems because he or she
plans not to cover future problems that rely on them.) Other problems are
marked with open circles (◦) which indicate that they are designed to provide
a motivation for important concepts by hinting at or partially developing
ideas that can be helpful in solving subsequent problems. A few problems
that summarize ideas that have come before are marked with a plus sign (+).
You will also find some problems marked with an arrow (→). These point
to problems that are particularly interesting; some of them are difficult but
not all are. Frequently these problems are not intrinsically difficult but only
might seem hard in light of what has come before and they will be easier
when you return to them after working on more problems.

1.1 What is Combinatorics?

Combinatorial mathematics arises from studying how objects can be combined into arrangements. For example, we might be combining sports teams into a tournament, samples of tires into groups for testing on cars, students into classes
to compare approaches to teaching a subject, or members of a tennis club
into pairs to play tennis. There are many questions that can be asked about
such arrangements of objects. Here the primary focus will be questions
about how many ways objects can be combined into arrangements of the
desired type. These are called counting problems. One way to count is
to enumerate a complete list of all the objects with the desired properties,
and another way is to count how many objects have the properties without
actually making a complete list. Enumerative combinatorics is usually interested in the second approach, although obtaining a solution might involve
writing a partial list.
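As a small illustration of the two ways of counting just described, the following Python sketch first enumerates every way to choose a pair from a tennis club and then counts the pairs without listing them. The four member names are invented for the example, and the library routine math.comb is used only as a convenient black box for the count; it is not a method developed in these notes.

```python
from itertools import combinations
from math import comb

# Hypothetical club roster, invented for this illustration.
members = ["Ana", "Ben", "Caz", "Dee"]

# First way: enumerate a complete list of the unordered pairs.
pairs = list(combinations(members, 2))

# Second way: count the pairs without building the list.
pair_count = comb(len(members), 2)

print(pairs)
print(len(pairs), pair_count)
```

For a club this small the complete list is easy to write down, but the list grows quickly with the number of members, which is one reason the second approach is usually preferred.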
Sometimes combinatorial mathematicians ask if a certain arrangement
is possible. For instance, if there are ten baseball teams and each team has
to play each other team once, can the whole series be scheduled if the fields
are available at enough hours for forty games? Sometimes combinatorial
mathematicians ask if all the arrangements able to be made have a certain
desirable property. For example, do all ways of testing five brands of tires
on five different cars compare each brand with each other brand on at least
one common car?
Counting problems (and problems of the other sorts described above)
arise throughout physics, biology, computer science, statistics, and many
other subjects. In order to demonstrate all these relationships detours would
have to be taken into all of these subjects. Instead, although there will be
some important applications, the discussions will usually be phrased around
either your everyday experience or your mathematical experience so that
you won't have to learn a new context before learning the mathematics.

1.2 Basic Counting Principles

1. Five schools plan to send their baseball team to a tournament in which
each team must play each other team exactly once. How many games
must be played?
2. Now some number n of schools plan to send their baseball teams to
a tournament in which each team must play each other team exactly
once. Think of the teams as numbered 1 through n.
(a) How many games does Team 1 have to play?
(b) How many additional games (other than the one with Team 1)
does Team 2 have to play?
(c) How many additional games (other than those with the first i − 1
teams) does Team i have to play?
(d) In terms of your answers to the previous parts of this problem,
what is the total number of games that must be played?
Hint. If you have trouble doing this problem, work on n = 6 before
studying the general n.
3. One of the schools sending its team to the tournament has to travel
some distance, and so the school is making sandwiches for team members to eat along the way. There are three choices for the kind of bread
and five choices for the kind of filling. How many different kinds of
sandwiches are available?
An ordered pair (a, b) consists of two members (which are often called
coordinates) that are labeled here as a and b. Then a is called the first
member of the pair and b is the second member of the pair. What is an
ordered triple?
You almost certainly used ordered pairs, at least implicitly, to solve the
first three problems. At the time, did you recognize you were doing so?
+ 4. (a) If M is a set with m elements and N is a set with n elements,
how many ordered pairs are there whose first element is a member
of M and whose second element is a member of N ? (Note that
when such a question is asked in any problem in these notes,
you are required both to answer the question and to provide a
justification for the answer you give.)
(b) Explain carefully how Problem 3 can be viewed mathematically
as asking you to count the number of ordered pairs from two
specific sets.

While working the next problems, be sure to ask yourself whether the
problem amounts to counting the number of ordered pairs or ordered triples
or ordered n-tuples of elements chosen from appropriate sets. What is an
ordered n-tuple?
5. Since a sandwich by itself is pretty boring, students from the school
in Problem 3 are offered a choice of a drink (from among five different
kinds), a sandwich (with the choices as in Problem 3), and a fruit
(from among four different kinds). In how many ways may a student
make a choice of a lunch, if every lunch is a choice of these three items?
Now that you have worked with your group on a few problems, you understand more about the setup of the course. For instance, you've been
working on the problems in groups and your instructor has principally been
listening to what is going on in the groups. Any guidance from your instructor has primarily been in the form of asking a question or occasionally giving
a hint (a question or hint that might at first seem unrelated to the problem
at hand). The reason for this is that in this course your instructor is a guide.
Problems in the notes (along with perhaps cryptic hints from your instructor) are designed to lead you and your group to discover for yourselves and
to prove for yourselves. There is considerable pedagogical evidence that this
can lead to deep learning and understanding. In addition, it is satisfying
and fun to discover things for yourself.
6. The coach of the team in Problem 3 knows of an ice cream shop along
the way where she plans to stop to buy each team member a triple-decker cone. The store offers 12 different flavors of ice cream, and
triple-decker cones are made only in homemade waffle cones. (Here
repeated flavors are allowed; in fact, a triple-decker with three scoops of
the same flavor is even possible. Be sure to count Strawberry, Vanilla,
Chocolate as different from Chocolate, Vanilla, Strawberry, etc.)
(a) How many possible triple-decker cones will be available to the
team members?
(b) How many triple-deckers have three different kinds of ice cream?
You probably have noticed some standard mathematical words and phrases
(for instance, "set", "ordered pair", "function") are creeping into the problems.
One of the goals of these notes is to show how the solution of most counting
problems uses standard mathematical objects. As was said earlier, since
most of the intellectual content of these notes is in the problems, it is natural that definitions of concepts will often be within problems.[1] For example,
Problem 4 is meant to suggest that the question asked in Problem 3 was
really a problem of counting all the ordered pairs consisting of a bread choice
and a filling choice. The notation A × B is usually used to represent the set
of all ordered pairs whose first member is in A and whose second member
is in B, and A × B is called the Cartesian product of A and B. Therefore you can think of Problem 3 as asking you for the size of the Cartesian
product of M and N, where M is the set of all bread types and N is the set
of all possible fillings; that is, the number of different kinds of sandwiches
equals the number of elements in the Cartesian product M × N.
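The Cartesian product can also be built explicitly on a computer. The sketch below is only an illustration: the bread and filling names are hypothetical (Problem 3 specifies only that there are three breads and five fillings), and itertools.product is used as one convenient way to form all ordered pairs.

```python
from itertools import product

# Hypothetical menu; only the counts (3 breads, 5 fillings) come from Problem 3.
breads = ["white", "rye", "wheat"]
fillings = ["ham", "turkey", "cheese", "egg", "tuna"]

# The Cartesian product: the set of all ordered pairs (bread, filling).
sandwiches = set(product(breads, fillings))

print(len(sandwiches))
print(("rye", "tuna") in sandwiches)
```

Each sandwich appears here as an ordered pair whose first member is a bread and whose second member is a filling, which is exactly the point of view suggested by Problem 4.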
7. The idea of a function is ubiquitous in mathematics. A function f
from a set S to a set T is a relationship between the two sets that
associates to each element x in the set S exactly one member f (x) in
the set T . The ideas of function and relationship will be revisited in
more detail and from different points of view from time to time.
(a) Using f, g, . . . to stand for various functions, list all the different
functions from the set {1, 2} to the set {a, b}. For example, you
might start with the function f given by
f (1) = a and f (2) = b .
(b) Let us look at the last part in a different way. Instead of asking for
a list of all the functions, suppose you simply asked how many
functions are there from the set {1, 2} to the set {a, b}. Now
devise a way to count the number of functions without writing
an exhaustive list.
(c) How many functions are there from the 3-element set {1, 2, 3} to
the 2-element set {a, b}?
(d) How many functions are there from the 2-element set {a, b} to
the 3-element set {1, 2, 3}?
(e) How many functions are there from any 3-element set to any
12-element set?
(f) Re-do Problem 6(a) by constructing a function from the 3-element
set of positions in the triple-decker to the 12-element set of
flavors. Give an explicit verbal description of your function.
[1] When you come across an unfamiliar term in a problem, most likely it was defined
earlier, and you should be able to find the term listed in the index.

This idea of using functions to count is very powerful, and is one of
the foundations of combinatorics. It also illustrates the added insight that
can be gained by looking at a problem from more than one point of view.
Take time right now to check to see if this has already happened in earlier
problems.
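If you would like to check a hand-written list of functions such as the one in Problem 7(a), a computer can generate the list for you. The sketch below assumes a function is stored as a Python dictionary sending each x to f(x); the helper name all_functions is made up for this illustration. It is meant for checking small cases, not as a replacement for the counting argument asked for in Problem 7(b).

```python
from itertools import product

def all_functions(domain, codomain):
    """Return every function from domain to codomain, each as a dict x -> f(x)."""
    domain = list(domain)
    # Choosing an image for each domain element in turn is the same as
    # choosing one tuple from codomain x codomain x ... x codomain.
    return [dict(zip(domain, images))
            for images in product(codomain, repeat=len(domain))]

functions = all_functions([1, 2], ["a", "b"])
for f in functions:
    print(f)
```

The comment inside the helper is the useful part: a function from a k-element set corresponds to an ordered k-tuple of elements of the co-domain.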
8. A function f is called a one-to-one function (often called an injection) if f is a function which has the property that whenever x is
different from y, then f (x) is different from f (y).
(a) How many one-to-one functions are there from a 2-element set to
a 3-element set?
Hint. Since 2 and 3 are fairly small numbers, you might first list
all the functions, and then decide which are one-to-one. But you
should then work out a method for counting them without an
exhaustive listing.
(b) How many one-to-one functions are there from a 3-element set to
a 2-element set?
(c) How many one-to-one functions are there from a 3-element set to
a 12-element set?
(d) Re-do Problem 6(b) by showing that a solution amounts to counting all the one-to-one functions between two sets. In order to
completely make this connection, you must explicitly define these
two sets as well as a function between them that describes the
possible ice cream cones.
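The definition of a one-to-one function in Problem 8 translates directly into a test that can be run on small examples. The following is a brute-force sketch under the same assumption as before (a function is stored as a dictionary x -> f(x)); the names is_one_to_one and one_to_one_functions are hypothetical helpers, and listing every function first is practical only for small sets.

```python
from itertools import product

def is_one_to_one(f):
    """f is a dict x -> f(x); it is one-to-one when no two inputs share an output."""
    return len(set(f.values())) == len(f)

def one_to_one_functions(domain, codomain):
    """List the one-to-one functions from domain to codomain by brute force."""
    domain = list(domain)
    candidates = (dict(zip(domain, images))
                  for images in product(codomain, repeat=len(domain)))
    return [f for f in candidates if is_one_to_one(f)]

print(is_one_to_one({1: "a", 2: "b"}))
print(is_one_to_one({1: "a", 2: "a"}))
```

A listing like this can confirm an answer for small sets, but the hint in Problem 8(a) still applies: the goal is a method of counting that does not depend on an exhaustive list.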
9. A group of three hungry members of the team in Problem 6 notices it
would be cheaper to buy three pints of ice cream to share among the
three of them. (And they would also then get more ice cream!)
(a) In how many ways may they choose three pints of three different
flavors?
(b) In how many ways may they choose three pints of two different
flavors? How is this problem different from Problem 6?
(c) In how many ways may they choose three pints with at least two
different flavors?
(d) In how many ways may they choose three pints of ice cream with
no restrictions on repeating flavors?
In the last part of Problem 9 you probably found it helpful to break
the question into certain cases that you could solve by previous methods.
After doing that you could then figure out the answer by using the answers
from all the cases. Because this is a fairly common strategy, some special
terminology is helpful. Two sets are said to be disjoint if they have no
elements in common. For example, the sets {1, 3, 12} and {6, 4, 8, 2} are
disjoint, but {1, 3, 12} and {3, 5, 7} are not disjoint sets. Three or more sets
are said to be mutually disjoint if no two of the sets have any elements
in common. To solve Problem 9(d), the set of all possible choices of three
pints of ice cream can be broken into three mutually disjoint sets: Three
pints of different flavors; three pints with two different flavors; three pints
of the same flavor. The total number of choices is the sum of the sizes of
these three mutually disjoint sets.
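The two definitions above can be checked on the sets from the text. In this sketch the third set {5, 7} and the helper name mutually_disjoint are additions made for the illustration; isdisjoint is the standard method on Python sets.

```python
from itertools import combinations

def mutually_disjoint(sets):
    """True when no two of the given sets have an element in common."""
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))

print({1, 3, 12}.isdisjoint({6, 4, 8, 2}))
print({1, 3, 12}.isdisjoint({3, 5, 7}))
print(mutually_disjoint([{1, 3, 12}, {6, 4, 8, 2}, {5, 7}]))
```

Note that mutual disjointness is checked pair by pair, matching the phrase "no two of the sets".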
10. (a) What can you say about the size of the union of a finite number
of mutually disjoint finite sets?
(b) What can you say about the size of the union of m mutually
disjoint sets, each of the same size n? This is a fundamental
principle of counting or enumerative combinatorics.
(c) Find at least one of the previous problems that can be solved
by the counting principle in part (b). Explain how to solve your
chosen problem using that principle.
The problems you've just completed contain among them kernels of the
fundamentals of enumerative combinatorics. For example, in your solution
to Problem 10(a) you just stated the Sum Principle (illustrated in Figure 1.1), and in Problem 10(b), the Product Principle (illustrated in
Figure 1.2). These are two of the most basic principles of combinatorics,
and they form a foundation on which many other counting principles are
developed.
Figure 1.1: The union of these two disjoint sets has size 17.

Figure 1.2: The union of four disjoint sets of size 5.

When a set S is a union of m mutually disjoint sets B_1, B_2, ..., B_m, then
the sets B_1, B_2, ..., B_m are said to form a partition of the set S. (Note that
a partition of S is a set of sets.) In order that the set S is not confused with
the sets B_i into which it has been divided, the sets B_1, B_2, ..., B_m are often
called the blocks of the partition.
11. Reword your solution to the last part of Problem 9 in terms of partitioning some set into blocks.
Using the language of partitions, the Sum Principle and the Product
Principle translate to:

The Sum Principle
For any partition of a finite set S, the size of S is the sum of the sizes of
the blocks of the partition.

The Product Principle
If a finite set S is partitioned into m blocks of the same size n, then S has
size mn.
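Both principles can be observed on a concrete partition. The sketch below uses a made-up example in the spirit of Figure 1.2 (the set {1, ..., 20} split into four blocks of size 5); the helper is_partition is hypothetical and follows the definition used in these notes, namely mutually disjoint blocks whose union is S. Checking an example is of course not a proof.

```python
from itertools import combinations

def is_partition(blocks, S):
    """True when the blocks are mutually disjoint and their union is S."""
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))
    union = set().union(*blocks)
    return disjoint and union == set(S)

# Made-up example: split {1, ..., 20} into four blocks of size 5.
S = set(range(1, 21))
blocks = [set(range(start, start + 5)) for start in (1, 6, 11, 16)]

assert is_partition(blocks, S)
# Sum Principle: the size of S is the sum of the sizes of the blocks.
print(len(S) == sum(len(b) for b in blocks))
# Product Principle: m blocks of the same size n give a set of size m * n.
print(len(S) == len(blocks) * 5)
```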
You'll notice that both of these principles refer to a partition of a finite
set. The language could be modified a bit to cover infinite sets, but the
sets considered in this book are primarily finite. In order to avoid possible
complications in the future, the term "size" will only be used for finite sets.
12. Prove the Product Principle follows logically from the Sum Principle.
13. Explain how to interpret 2 + 3 = 5 in terms of partitioning some set.
Because of this, the Sum Principle for the case when the partition has
two blocks is the definition of the binary operation of addition on the
set of positive integers. Explain this in your own words.


14. If A is a subset of some universe U, define its set complement to be
U \ A := {x ∈ U : x ∉ A} .
Use the Sum Principle for two blocks to prove |U \ A| = |U| − |A|.
You'll prove the Sum Principle for any finite number of blocks in the
next chapter. For right now you may accept it as true and use it wherever
you like. Of course you must be careful that your proofs in the next chapter
do not depend on any results which you've proved using the general Sum
Principle.
15. In a biology lab study of the effects of basic fertilizer ingredients on
plants, 16 plants are treated with potash, 16 plants are treated with
phosphate, and a total of eight plants among these are treated with
both phosphate and potash. No other treatments are used. How many
plants receive at least one treatment? If 33 plants are studied, how
many receive no treatment?
16. Use partitions to prove a formula for the size |A ∪ B| of the union A ∪ B
of any two (finite but not necessarily disjoint) sets A and B in terms
of the sizes |A| of A, |B| of B, and |A ∩ B| of the intersection A ∩ B.
The formula you proved in the last problem is a special case of the Principle of Inclusion and Exclusion, which is considered more thoroughly
in Chapter 6.

1.3 Functions and their Directed Graphs

A typical way to define a function f from a set S (called the domain of the
function) to a set T (which in discrete mathematics is commonly referred to
as its co-domain) is: "A function f is a relation from S to T which relates
each element of S to one and only one member of T." The notation f(x)
is used to represent the element of T that is related to the element x of S,
and the standard shorthand f : S → T is used for "f is a function from
S to T". Please note that the word "relation" has a precise meaning in
mathematics. Do you know it? Refer to Appendix A if you need a review
of this terminology.
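The definition quoted above can be turned into a mechanical check. The sketch below assumes that a relation from S to T is stored as a set of ordered pairs (x, y) with x in S and y in T, which is one common way to make the word "relation" precise; the helper name is_function is made up for the illustration.

```python
def is_function(relation, S, T):
    """relation is a set of ordered pairs (x, y) with x in S and y in T.

    It is a function from S to T when each element of S is related to
    one and only one member of T.
    """
    if not all(x in S and y in T for x, y in relation):
        return False
    return all(sum(1 for (a, _) in relation if a == x) == 1 for x in S)

S = {-2, -1, 0, 1, 2}
T = {0, 1, 2, 3, 4}
squares = {(x, x * x) for x in S}   # the relation "y = x^2"
print(is_function(squares, S, T))
# Relating 0 to a second element of T breaks the "one and only one" condition.
print(is_function(squares | {(0, 1)}, S, T))
```

The test counts, for each x in S, how many pairs have x as their first member; the relation is a function exactly when every count is 1.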
Relations between subsets of the set of real numbers can be graphed in
the Cartesian plane, and it is helpful to remember how you used a graph of
a relation defined on the real numbers to determine whether it is actually a
function. Namely, to do this you learned that you should check that each
vertical straight line crosses the graph of the relation in at most one point.
You might also recall how to determine whether such a function is one-to-one
by examining its graph. If each horizontal line crosses the graph in at most
one point, the function is one-to-one. If even one horizontal line crosses the
graph in more than one point, the function is not one-to-one.
The domain and co-domain of the functions considered in this book
will both usually be finite and will often contain objects which are not real
numbers, and this means that graphs in the Cartesian plane are usually not
available. But all is not lost, since there is another kind of graph called a
directed graph (or digraph) that is especially useful when dealing with
functions between finite sets. Figure 1.3 has several examples of digraphs of
functions.
In analyzing the graphs in Figure 1.3 you should notice the following:
When you want to draw the digraph that represents a relation from a set
S to a set T , you can draw a line of dots called vertices to represent the
elements of S and another (usually parallel) line of vertices to represent
the elements of T . (Part (e) is slightly different.) You then can draw an
arrow from the vertex for x ∈ S to the vertex for y ∈ T if and only if x is
related to y. Such arrows are called (directed) edges. Because there is
an inherent order in a relation, every edge is an arrow and not just a line
segment. When the relation is a function, f : S → T, one arrow is drawn
from each x ∈ S to its corresponding f(x) ∈ T. Familiarize yourself with
this technique by working through the digraphs in Figure 1.3. Note that
in part (e) the function is from a set S to itself and the picture has been
simplified by drawing only one set of vertices representing the elements of
S. Digraphs can often be more enlightening if you experiment to find an
attractive placement of the vertices rather than putting them in a row.
There is a simple test for whether a digraph of a relation from S to T is
a digraph of a function from S to T :
17. Returning to the digraphs in Figure 1.3, determine which are functions.
Then formulate a precise sentence stating what properties the arrows
and vertices in a digraph must possess so that the digraph represents
a function f from S to T .
18. (a) In how many ways can you pass out six different candies to two
children? Set up your solution as a problem about counting functions.
(b) In how many ways can you pass out the candy if each child must
get at least one piece? Exactly three pieces?

Figure 1.3: What is a digraph of a function?
(a) The function given by f(x) = x^2 on the domain {1,2,3,4,5}.
(b) The function from the set {0,1,2,3,4,5,6,7} to the set of triples of zeros and ones given by f(x) = the binary representation of x.
(c) The function from the set {-2,-1,0,1,2} to the set {0,1,2,3,4} given by f(x) = x^2.
(d) Not the digraph of a function.
(e) The function from {0, 1, 2, 3, 4, 5} to {0, 1, 2, 3, 4, 5} given by f(x) = x + 2 mod 6.

19. (a) In how many ways can you pass out nine different candies to
three children? Set up your solution as a problem about counting
functions.
(b) In how many ways can you pass out the candy if each child must
get at least one piece?
(c) Exactly three pieces?
20. Suppose you have n distinguishable balls. How many ways can you
paint each of them with one color, chosen from red, black, green and
blue?
The most mathematically elegant solutions to the last problems probably
involve using functions. Until now, your experience with functions probably
has only involved a formula in calculus. In contrast, in discrete mathematics
a function can be an algorithm or it might be given by a verbal description.
From now on, for any positive integer n the notation [n] will be used for
the set {1, . . . , n}. For example, [4] equals the set {1, 2, 3, 4}. This symbol
is not used in all branches of mathematics, but for us [n] will always mean
the set {1, . . . , n}.
21. If B_1, B_2, ..., B_m is a partition of a finite set S, define a relation f
from S to [m] by
f(s) = i if and only if s ∈ B_i .
(a) Use the definition of partition to explain completely why f is a
function with domain S.
(b) This correspondence defines a relation from the set of all m-block
partitions of S to the set of all functions from S to [m]. Is this
relation a function whose domain is the set of all m-block partitions of S? Either find a counterexample or prove the relation is
always a function.
22. A function f : S → T is an onto function (also called a surjection)
if each element of T is f(x) for at least one x ∈ S.
(a) Choose finite sets S and T and a function from S to T that is
one-to-one but not onto. Draw the digraph of your function.
(b) Choose finite sets S and T and a function from S to T that is
onto but not one-to-one. Draw the digraph of your function.
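The definition of an onto function can be tested in the same style as before. In this sketch a function is again assumed to be stored as a dictionary x -> f(x), and the helper name is_onto is hypothetical. The two examples are the functions described in parts (c) and (e) of Figure 1.3, so you will still need to find your own examples for Problem 22.

```python
def is_onto(f, T):
    """f is a dict x -> f(x); it is onto T when every element of T is some f(x)."""
    return set(T) <= set(f.values())

# Figure 1.3(e): f(x) = x + 2 mod 6 from {0, ..., 5} to {0, ..., 5}.
shift = {x: (x + 2) % 6 for x in range(6)}
# Figure 1.3(c): f(x) = x^2 from {-2, ..., 2} to {0, ..., 4}.
squares = {x: x * x for x in range(-2, 3)}

print(is_onto(shift, range(6)))
print(is_onto(squares, range(5)))
```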

23. Visual inspection of the digraph of a function can reveal whether or
not the function is one-to-one or is onto. In each of the following
parts devise a test similar to the one for testing when a digraph is the
digraph of a function.
(a) Give a written statement of a test for whether or not the digraph
of a function is the digraph of a one-to-one function. (Remember
that in order to be a one-to-one function, a relation must be a
function.)
(b) Write a statement of a test for whether or not the digraph of a
function is the digraph of an onto function.
If you think you might need more work on functions, consider working
through Appendix A outside of class. It covers functions and also digraphs
in more detail.
24. A function from a set X to a set Y which is both one-to-one and onto
is usually called a bijection (but is sometimes called a 1-1 correspondence). What does the digraph of a bijection from a set S to a
set T look like?
25. If f is a bijection from X to Y and g is a bijection from Y to Z, must
the function composition g ∘ f be a bijection from X to Z? Explain.
26. If you reverse all the arrows in the digraph of a bijection f with domain
S and co-domain T , you get the digraph of another function g. Why
is g a function from T to S? Why is g a bijection? What is f (g(x))?
What is g(f (x))?
The digraphs marked (a), (b), and (e) in Figure 1.3 are digraphs of
bijections. Your description in Problem 24 illustrates another fundamental
principle of combinatorial mathematics:

The Bijection Principle


Two sets have the same size if and only if there is a bijection between the
sets.
It is surprising how this innocent-sounding principle frequently provides an
insight into some otherwise very complicated counting arguments.
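For small finite sets, the definitions of one-to-one, onto, and bijection can be checked mechanically. The following is a minimal sketch, assuming a function is stored as a Python dictionary that sends each element of the domain to its image; the helper names `is_one_to_one`, `is_onto`, and `is_bijection` are illustrative and do not come from these notes.

```python
def is_one_to_one(f):
    """f is a dict sending each element of the domain to its image."""
    images = list(f.values())
    # One-to-one means no image value is repeated.
    return len(images) == len(set(images))


def is_onto(f, codomain):
    """Every element of the co-domain must be f(x) for at least one x."""
    return set(f.values()) == set(codomain)


def is_bijection(f, codomain):
    return is_one_to_one(f) and is_onto(f, codomain)


f = {1: "a", 2: "b", 3: "c"}
g = {1: "a", 2: "a", 3: "b"}
print(is_bijection(f, {"a", "b", "c"}))  # True
print(is_bijection(g, {"a", "b"}))       # False
```

Notice that whenever `is_bijection` returns `True`, the domain and co-domain have the same number of elements, in agreement with the Bijection Principle.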
A binary representation of a positive integer $m$ is an ordered list
$a_1 a_2 \ldots a_k$ of zeros and ones such that
$m = a_1 2^{k-1} + a_2 2^{k-2} + \cdots + a_k 2^0$.
Our definition allows leading zeros: for instance, both 011 and 11 represent
the number 3. Often such an ordered list of k zeros and ones is called a
binary k-string, and each of its digits is called a bit, which is shorthand
for binary digit. In the above example, 011 is a binary 3-string representation
of 3, while 11 is a binary 2-string representation of 3.
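The definition can be tried out directly. Here is a small sketch, assuming a binary k-string is given as a Python string of the characters 0 and 1; the function name `value_of_binary_string` is made up for this illustration.

```python
def value_of_binary_string(bits):
    """Return a_1*2^(k-1) + a_2*2^(k-2) + ... + a_k*2^0 for a string of 0s and 1s."""
    k = len(bits)
    # enumerate counts from 0, so the digit at position j is a_(j+1).
    return sum(int(a) * 2 ** (k - 1 - j) for j, a in enumerate(bits))


print(value_of_binary_string("011"))  # 3
print(value_of_binary_string("11"))   # 3
```

Both calls give 3, which illustrates that leading zeros change the length of the string but not the number it represents.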
27. Let n be a fixed positive integer. For this problem, let S be the set of
all binary n-string representations of numbers between 0 and $2^n - 1$,
and let T be the set of all subsets of [n]. Note that the empty set is
a subset of every set.
(a) For n = 2, write out the sets S and T and then describe a natural
bijection from S to T where in your statement of the bijection,
0 and 1 play the roles of "does not belong to" and "belongs,"
respectively.
(b) Using the strategy from part (a), describe a bijection from S to T
for general n. Explain why your map is a bijection from S to T .
(c) Explain how part (b) enables you to find the number of subsets
of [n].
In the last problem you used the fact that the empty set is a subset of
every set. This fact may seem a little strange at first, but logical reasoning
will convince you that it is true. Indeed, assume by way of contradiction
that there exists a set $X$ such that $\emptyset$ is not a subset of $X$. By the definition
of subset this means there must be an element of $\emptyset$ which is not an element
of X. What does this contradict?
28. (a) Let $n \ge 2$. Use the Bijection Principle to prove the number of
2-element subsets of [n] equals the number of subsets of [n] that
have $n - 2$ elements. Be sure to construct a specific function
which you prove is a bijection.
(b) Let $n \ge 3$. Generalize the reasoning used in part (a) to prove the
number of 3-element subsets of [n] equals the number of subsets
of [n] which have $n - 3$ elements.
(c) Let $n \ge k$. State a natural conjecture suggested by parts (a)
and (b). Is your conjecture true or false? Justify this by either
proving your conjecture or providing a counterexample to the
conjecture.
29. For each subset $A$ of [n], define a function (traditionally denoted by
$\chi_A$) as follows:
\[
\chi_A(i) =
\begin{cases}
1 & \text{if } i \in A, \\
0 & \text{if } i \notin A.
\end{cases}
\]
The function $\chi_A$ is called the characteristic function or the indicator function of $A$. Notice that the characteristic function is a
function from [n] to {0, 1}.
(a) Draw the digraph of the function $\chi_A$ for n = 4 and A = {1, 3}.
(b) Draw the digraph of the function $\chi_A$ for n = 6 and A = {1, 3}.
30. Let S be the set of all subsets of [n]. Let T be the set of all functions
from [n] to {0, 1}. Define a map $f : S \to T$ by $f(A) = \chi_A$ for all $A \in S$.
Explain why f is a bijection. What does this say about the number
of functions from [n] to [2]?
The proofs in Problems 27 and 30 use essentially the same bijection, but
they interpret sequences of zeros and ones differently and so end up being
different proofs.
You'll return to the question of counting the number of functions between
any two finite sets in Section 2.3.1.
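The characteristic function defined in Problem 29 is easy to experiment with. The sketch below assumes a function from [n] to {0, 1} is stored as a Python dictionary; the name `characteristic_function` and the sample set {2, 5} are chosen only for illustration.

```python
def characteristic_function(A, n):
    """Return the characteristic function of A as a dict from [n] = {1, ..., n} to {0, 1}."""
    return {i: (1 if i in A else 0) for i in range(1, n + 1)}


chi = characteristic_function({2, 5}, 5)
print(chi)  # {1: 0, 2: 1, 3: 0, 4: 0, 5: 1}
```

Notice that the function depends on n as well as on A, because its domain is all of [n] and not just the elements of A.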

1.4 Another Application of the Sum Principle

31. US coins are all marked with the year in which they were made. How
many coins do you need to guarantee that on at least two of them,
the date has the same last digit? (The phrase "to guarantee that on
at least two of them, ..." means that you can find two coins with the
same last digit. You might be able to find three with that last digit,
or you might be able to find one pair with the last digit 1 and one pair
with the last digit 9, or any combination of equal last digits, as long
as there is at least one pair with the same last digit.)
Footnote 2: The symbol $\chi$ is the Greek letter chi, pronounced "Ki," where the "i" sounds like
"eye."
There are many ways to explain your answer to Problem 31. For example, you can separate the coins into stacks or blocks according to the last
digit of their date. That is, you can put all the coins with a given last digit
in a stack together (putting no other coins in that stack), and repeat this
process until all coins have been placed in a stack. Using the terminology
introduced earlier, this gives a partition of your set of coins into blocks of
coins with the same last digit. If no two coins have the same last digit, then
each block has at most one coin. Since there are only ten digits, there are at
most ten non-empty blocks and by the Sum Principle there can be at most
ten coins. Note that if there were only ten coins, it would be possible to
have all different last digits, but with eleven coins some block must have at
least two coins in order for the sum of the sizes of at most ten blocks to be
11. This is one explanation of why eleven coins are needed in Problem 31.
This type of situation arises often in combinatorial situations, and so rather
than always using the Sum Principle to explain your reasoning, you can use
another principle which is a variant of the Sum Principle.

The Pigeonhole Principle


If a set with more than n elements is partitioned into n blocks, then at
least one block has more than one element.
The Pigeonhole Principle gets its name from the idea of a grid of little
boxes that might be used to sort mail or as mailboxes for a group of people
in an office. The boxes in such grids are sometimes called pigeonholes in
analogy with the stacks of boxes used to house homing pigeons back when
homing pigeons were used to carry messages. People will sometimes state
this principle in a more colorful way as "if more than n pigeons are put into
n pigeonholes, then some pigeonhole contains more than one pigeon."
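The stacking argument for the coins can be imitated on a computer. This is a small sketch, assuming each coin is represented only by the year stamped on it; the function names and the sample years are hypothetical.

```python
from collections import defaultdict


def stack_by_last_digit(years):
    """Partition a list of coin dates into blocks according to last digit."""
    stacks = defaultdict(list)
    for year in years:
        stacks[year % 10].append(year)
    return dict(stacks)


def has_repeated_last_digit(years):
    """True if some block of the partition holds more than one coin."""
    return any(len(block) > 1 for block in stack_by_last_digit(years).values())


ten_coins = list(range(1990, 2000))   # ten coins with ten different last digits
eleven_coins = ten_coins + [2004]     # any eleventh coin must repeat a digit
print(has_repeated_last_digit(ten_coins))     # False
print(has_repeated_last_digit(eleven_coins))  # True
```

The ten-coin example shows that ten coins are not enough to force a repeat, while the eleven-coin example is one instance of what the Pigeonhole Principle guarantees for every choice of eleven coins.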
32. Prove the Pigeonhole Principle follows from the Sum Principle.
33. Prove that any function from [n] to a set of size less than n cannot be
one-to-one.
Hint. You must prove that regardless of the function f chosen, there
are always two elements, say x and y, such that f (x) = f (y).
34. Prove that if f is a one-to-one function between finite sets of equal
size, then f must be onto. Compare this with Problem 22(a).
35. Prove that if f is an onto function between finite sets of equal size,
then f must be one-to-one. Compare this with Problem 22(b).
36. Let f be a function between finite sets of equal size. Then f is a
bijection if and only if f is either onto or one-to-one. Find a counterexample to this result when the domain and co-domain are both
infinite.
37. (Some familiarity with arithmetic modulo 10 and modulo 100 is helpful
for this problem.) In this problem you prove there exists an integer
n such that if you take the first n powers of any prime other than
two or five, the last two digits of at least one of these powers must
be 01. (That is, the same integer n has the property for all primes
$p \ne 2, 5$.)
(a) All powers of 5 end in the digit 5, and all powers of 2 are even.
For any prime $p \ne 2, 5$, prove there are at most four values of the
last digit of any power $p^i$. What does that say about the values
of $p^i \pmod{10}$?
(b) How many values are possible for $p^i \pmod{100}$? Referring to this
number of values as $N_0$, in the remaining parts of this problem
you will prove that $n = N_0$ can be used in the statement of the
result you want to prove.
(c) Use the Pigeonhole Principle to prove that among the first $N_0 + 1$
powers of any prime $p \ne 2, 5$, there exist $i \ne j$ such that $p^i - p^j$
is divisible by 100. Then prove $p^{i-j} \equiv 1 \pmod{100}$.
(d) Find the smallest value of n that works in the statement of the
result. Give a complete proof that this value works.

1.5 The Generalized Pigeonhole Principle (Optional)

Although this section is used in other optional sections of these notes, it is
independent of the main body of the notes.

The Generalized Pigeonhole Principle


If a set with more than kn elements is partitioned into n blocks, then at
least one block has at least k + 1 elements.

38. Prove the Generalized Pigeonhole Principle follows from the Sum Principle.
39. Draw five circles labelled Al, Sue, Don, Pam, and Jo.
(a) Find a way to draw red and green lines between people (circles)
so that every pair of people is joined by a line and there is neither
a triangle consisting entirely of red lines nor a triangle consisting
of green lines.
(b) Suppose Bob joins the original group of five people. Now can you
draw a combination of red and green lines that have the same
property as those in part (a)? Explain.
40. Show that in a set of six people, there is either a subset of at least three
people who all know each other or a subset of at least three people
none of whom know each other. (Here it is assumed that if Person 1
knows Person 2, then Person 2 knows Person 1.) Does the conclusion
hold when there are five people in the set rather than six?
Problems 39 and 40 together show that six is the smallest number R
with the property that if there are R people in a room, then there is either
a set of (at least) three mutual acquaintances or a set of (at least) three
mutual strangers. Another way to say the same thing is to say that six is
the smallest number so that no matter how you connect six points in the
plane (no three on a line) with red and green lines, you can find either a
red triangle or a green triangle. There is a name for this: The Ramsey
Number R(m, n) is the smallest number R such that if you have R people
in a room, then there is a set of at least m mutual acquaintances or at least
n mutual strangers.
Problem 39 hints at a geometric description of Ramsey Numbers which
uses the idea of a complete graph on R vertices. A complete graph on
R vertices consists of R points in the plane, together with line segments
(or curves) connecting each pair of vertices. As you may guess, a complete
graph is a special case of something called a graph, which will be defined
more carefully in Section 4.1. The points in a graph are called vertices (or
nodes) and the line segments are called edges. The notation $K_n$ is used to
represent a complete graph on n vertices.
The geometric description of R(3, 3) may be translated into the language
of graph theory by saying R(3, 3) is the smallest number R such that if you
color the edges of a $K_R$ with two colors, then you can find in the picture a
$K_3$ all of whose edges have the same color. The graph theory description
of R(m, n) is that R(m, n) is the smallest number R such that if you color
the edges of a $K_R$ with the colors red and green, then you can find in your
picture either a $K_m$ all of whose edges are red or a $K_n$ all of whose edges are
green. Because you could have said your colors in the opposite order, you
may conclude that R(m, n) = R(n, m). In particular R(n, n) is the smallest
number R such that if you color the edges of a $K_R$ with two colors, then
your picture contains a $K_n$ all of whose edges have the same color. The
results of Problems 39 and 40 combine to prove R(3, 3) = 6.
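Because a complete graph on five or six vertices has only a few edges, the statement R(3, 3) = 6 can also be checked by brute force. The sketch below is a computational check and not a replacement for the arguments asked for in Problems 39 and 40; it assumes the vertices are labeled 0, 1, ..., r − 1, and the function name is made up for this illustration.

```python
from itertools import combinations, product


def every_coloring_has_mono_triangle(r):
    """Try every red/green coloring of the edges of a complete graph on r vertices."""
    vertices = range(r)
    edges = list(combinations(vertices, 2))
    triangles = list(combinations(vertices, 3))
    for colors in product("RG", repeat=len(edges)):
        color = dict(zip(edges, colors))
        # A triangle a < b < c is one-colored when its three edges agree.
        if not any(
            color[(a, b)] == color[(a, c)] == color[(b, c)]
            for a, b, c in triangles
        ):
            return False  # this coloring has no one-colored triangle
    return True


print(every_coloring_has_mono_triangle(5))  # False
print(every_coloring_has_mono_triangle(6))  # True
```

The search looks at $2^{10}$ colorings for five vertices and $2^{15}$ colorings for six vertices, so it finishes quickly. The same approach becomes impractical very fast as the number of vertices grows, which is one reason so few Ramsey Numbers are known.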
41. Since R(3, 3) = 6, an uneducated guess might be that R(4, 4) = 8.
Show that this is not the case.

Hint. To get started, try to write down what it means to say R(4, 4)
does not equal 8.
42. Show that among ten people, there are either four mutual acquaintances or three mutual strangers. What does this say about R(4, 3)?
43. Show that among an odd number of people there is at least one person
who is an acquaintance of an even number of people and therefore also
a stranger to an even number of people.
Hint. Let $a_i$ be the number of acquaintances of person $i$.
44. Find a way to color the edges of a $K_8$ with red and green so that there
is no red $K_4$ and no green $K_3$.
45. Find R(4, 3).
Hint. There is a relevant problem that you have not used yet.
As of this writing, relatively few Ramsey Numbers are known. Some
that have been found are: R(3, n) for all n < 10; R(4, 4) = 18; R(4, 5) = 25.

1.6 An Overview

Now is a good time for you to go back into this chapter with a fresh outlook,
informed by your working through the problems in this chapter. Because
most of the techniques are developed in problems, you might not realize how
much you have learned and most likely it will help to write a summary of
important facts. Every once in a while you should do that: go through the
notes with an eye for what you have learned and how you might approach
older problems differently. You should review at the end of each chapter,
and some of you might prefer more frequent reviews, done either on your
own or in your group.
+ 46. As a group, identify four or five important principles of counting developed in this chapter. Also, identify at least four techniques used. (For
this, I would call the Bijection Principle a counting principle, whereas
I consider the idea of using a function to count to be a technique.
There is not a clear line of separation between them.)
+ 47. When you originally solved Problems 1 to 9, you probably did not
think explicitly in terms of any basic principles, although you probably used some principles implicitly. Revisit those problems from the
point of view of categorizing the counting principle used in your solution. For this, compose a $9 \times 3$ table whose rows are labeled Problem 1,
Problem 2, ..., Problem 9 and whose columns are labeled Sum Principle, Product Principle, Bijection Principle. In each row, indicate with an X each principle that is natural to use to solve the problem which names the row.
In some problems, several principles might be used.
+ 48. As a group, discuss the function interpretation of Problems 19 and 20
and compose a similar problem of your own which can be solved using
this technique. Share the problems with the other groups and evaluate
their solutions. Be sure the solutions include a clear explanation of
why the constructed function can be used to solve the problem. In
addition, solutions must correctly identify and count the size of the
domain and co-domain of the function.
+ 49. Evaluate this solution to Problem 27(b):
Define the function $f : S \to T$ by $f(a_1 a_2 \cdots a_n)$ equals
the set of all subscripts such that $a_i = 1$. This is a bijection
since $S$ and $T$ both have $2^n$ elements.

1.6.1 An overview of problem solving

You should think about how your approach to counting problems has matured over the course of this chapter. There are some fairly general techniques which you have used and should continue to use. Among these are:
• As you work on a problem, think about why you are doing what you
are doing. Is it helping you? If your current approach does not feel
right, try to see why.
• Is this a problem you can decompose into simpler problems?
• Can you compose a simple example (even a silly one) of what the
problem is asking you to do?
• If a problem is asking you to do something for every value of a positive
integer n, then what happens with small values of n like 0, 1, and 2?
• When you are having trouble counting the number of possibilities for
a specific large number N, consider temporarily replacing N with a
smaller value, say n. After you solve the problem for the smaller n, see
if your reasoning (or perhaps some variation of it) transfers to give a
solution for the original N.

Throughout, do not worry about making mistakes, because mistakes often
lead mathematicians to their best insights.

Chapter 2

The Principle of Mathematical Induction

2.1 Inductive Processes

Proof by mathematical induction is a more subtle method of reasoning than
what first meets the eye. It is also much more widely applicable than you
might have guessed based on your previous experience with the technique.
In this chapter you'll use mathematical induction to prove the Sum Principle
and the Product Principle, as well as other counting techniques you used in
the first chapter.
This chapter assumes you've had some prior experience with proof by
mathematical induction. If that assumption is not true for you, you should
first work through Appendix B before beginning this section. Most likely the
first examples of proof by induction you worked involved proving identities
such as
\[
\sum_{i=1}^{n} i = \frac{n(n+1)}{2} \tag{2.1}
\]
or
\[
\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}. \tag{2.2}
\]

There's a common thread to these arguments: Looking at (2.1) more closely,
you are asked to prove a sequence of statements which are indexed by positive
integers n. The first four statements in this sequence of statements are
\[
1 = \frac{1 \cdot 2}{2}; \quad 1 + 2 = \frac{2 \cdot 3}{2}; \quad 1 + 2 + 3 = \frac{3 \cdot 4}{2}; \quad 1 + 2 + 3 + 4 = \frac{4 \cdot 5}{2}.
\]

This illustrates one reason why someone might think of using induction to
prove (2.1): The identity gives a sequence of statements indexed by integers n that are at least a certain size (in this case, all integers $n \ge 1$).
In other words, the identity yields a family of statements S(n) parametrized
by integers $n \ge 1$. This means that for each integer $n \ge 1$,
S(n) is the statement
\[
\sum_{i=1}^{n} i = \frac{n(n+1)}{2}.
\]
That is, for each integer $n \ge 1$, S(n) is the statement that the sum $\sum_{i=1}^{n} i$ equals $n(n+1)/2$
for that particular n. Any result that can be proved by induction must be
able to be written as a sequence of statements which can be parameterized
by integers greater than or equal to some base integer b. Because this initial requirement
is satisfied by (2.1), you can see that induction might be able to be used to
prove this result. Next is an explanation of why induction can be successfully
used to prove (2.1).
For the moment, let us suspend belief and suppose you had no idea
whether or not (2.1) is true for any $n \ge 1$, much less that it is true for all
$n \ge 1$. How can you proceed? In the spirit of scientific reasoning, you might
think of checking that S(n) is true for some small integers n. Maybe you
would even use a computer to verify the statement S(n) for all values of n
up to a quite large size. If you do this, you will find that S(n) is in fact true
for every value of $n \ge 1$ which you check, and this gives you some confidence
that the statement is true in general for all $n \ge 1$. Of course this is not yet
a proof, but rather simply a justification for continuing to try to prove the
statement.
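The computer check described above might look like the following sketch. It tests the statements S(n) coming from (2.1) and (2.2) for n up to 200; the function names and the bound 200 are arbitrary choices made for this illustration, and exact fractions are used for (2.2) so that rounding cannot affect the comparison.

```python
from fractions import Fraction


def S1_holds(n):
    """The statement S(n) for identity (2.1), cleared of the denominator 2."""
    return 2 * sum(range(1, n + 1)) == n * (n + 1)


def S2_holds(n):
    """The statement S(n) for identity (2.2), using exact rational arithmetic."""
    total = sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))
    return total == Fraction(n, n + 1)


print(all(S1_holds(n) for n in range(1, 201)))  # True
print(all(S2_holds(n) for n in range(1, 201)))  # True
```

A check like this covers only finitely many values of n, so it is evidence for the identities and not a proof of them.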
In order to get induction to work, an underlying inductive process
must be discovered. Namely, how is the statement S(N ) for any integer N
related to the statements that precede it, statements that you may already
have checked are true? For this particular sequence of statements, the relationship is simple: For any $N \ge 2$, the truth of the statement S(N) relies
only on the truth of the immediately preceding statement $S(N-1)$. The
reason for this is that the sum on the left-hand side in the statement S(N)
is built up from the left-hand sum in the statement $S(N-1)$ by adding
the integer N to the sum. The uncovering of an inductive process for this
problem is further evidence that a proof by mathematical induction might
be able to be constructed.
For similar reasons, the second sequence of statements (2.2) can most
likely be proved by induction. Indeed, the underlying inductive process is really the same. First identify the sequence of statements S(1), S(2), . . . , S(n), . . . .
Here
S(n) is the statement
\[
\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{n}{n+1}.
\]
Then observe that the left-hand sum in statement $S(N+1)$ is obtained by
adding $1/((N+1)(N+2))$ to the left-hand sum in the immediately preceding
statement S(N). Because of this, the inductive process is basically the same
for (2.1) and (2.2), although the algebra involved is somewhat different.
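Written as equations, the two inductive relationships just described are as follows. This display only records how each left-hand sum is built from the preceding one; it does not carry out the algebra that Problem 50 asks for.

```latex
\[
\sum_{i=1}^{N} i \;=\; \Bigl(\sum_{i=1}^{N-1} i\Bigr) + N ,
\qquad
\sum_{i=1}^{N+1} \frac{1}{i(i+1)} \;=\; \Bigl(\sum_{i=1}^{N} \frac{1}{i(i+1)}\Bigr) + \frac{1}{(N+1)(N+2)} .
\]
```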
50. Prove each of (2.1) and (2.2) by induction on $n \ge 1$.
51. What postage do you think can be made using only three-cent and
five-cent stamps? If you have an unlimited supply of these two types
of stamps, do you think that there is a number N such that for every
$n \ge N$, you can make n cents worth of postage?
In the last problem you probably did some calculations to convince yourself that n cents worth of postage can be made for n = 3, 5, 6 (but not for
n = 1, 2, 4, 7) and also that apparently postage can be made for all integers
that are at least 8. A proof of this fact will now be given in full detail.
First of all, the sequence of statements to be proved is:
S(n): n cents of postage can be made using only 3-cent and 5-cent
stamps.
A key observation (and one that highlights how this problem is different
from proving the identities (2.1) and (2.2)) is that you cannot prove S(13)
is true by working only with the fact that S(12) is true. The reason for this
is simple: you do not have a 1-cent stamp at your disposal. In general you
can never conclude that n cents of postage can be made simply by knowing
that $n - 1$ cents of postage can be made; that is, in this example the truth
of the statement S(n) cannot be proved using only the truth of $S(n-1)$.
As is done in any proof by induction, there will first be an analysis of
some small cases. You've already noted in the last problem that the base
integer must be $b \ge 8$; this is forced by the fact that 7 cents of postage cannot
be made using only 3-cent and 5-cent stamps. Also, you've undoubtedly
noted that you can make 8 cents with one 3-cent stamp and one 5-cent
stamp; 9 cents can be made with three 3-cent stamps and 10 cents with two 5-cent stamps.
Maybe you've checked larger integers as well, but at this point the truth of
each of S(8), S(9), S(10) has been verified.
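The small cases can also be explored with a short program. The sketch below decides whether n cents can be made by asking whether removing one stamp leaves an amount that can be made; the function name `can_make` and the bound 20 are assumptions chosen for this illustration.

```python
def can_make(n, stamps=(3, 5)):
    """Return True if n cents can be formed from the given stamp values."""
    reachable = [False] * (n + 1)
    reachable[0] = True  # zero cents needs no stamps
    for total in range(1, n + 1):
        # total is attainable if removing one stamp leaves an attainable amount
        reachable[total] = any(
            total >= s and reachable[total - s] for s in stamps
        )
    return reachable[n]


print([n for n in range(1, 21) if not can_make(n)])  # [1, 2, 4, 7]
```

The output agrees with the hand calculations: among the amounts up to 20 cents, only 1, 2, 4, and 7 cents cannot be made. Changing the `stamps` argument lets you experiment with other pairs of stamp values.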
It has already been observed that you cannot get 13 cents from the fact
you can get 12 cents. One way to get 13 cents is to add a 3-cent stamp to
the two fives used to get 10 cents. The surprising thing is that this simple
observation is the key to describing an inductive process. Because of this,
if you can make postage for three consecutive amounts, then you can make
postage of any larger amount by simply adding the correct number of 3-cent
stamps to an earlier attainable amount of postage. (Don't be distracted by
the fact that this might not be the only way to get a certain amount of
postage, since for instance 15 cents can be made by three 5-cent stamps as
well as the five 3-cent stamps that would be obtained in one step from the postage
for 12 cents.) Said another way, the idea here is to devise an induction
statement which lumps together
8, 9, 10;
and
11, 12, 13;
and
14, 15, 16;
and so on. In general for all $n \ge 3$,
$3n - 1$, $3n$, $3n + 1$
are to be considered together. This analysis leads to a revision from the
original sequence of statements S(n) to the following sequence of statements T(n): for any $n \ge 3$,
T(n) is the statement: It is possible to form postage totaling
exactly $3n - 1$, $3n$, and $3n + 1$ cents using only 3-cent and 5-cent
stamps.
Notice that here the base integer is b = 3. (Why can't the base case be
either b = 1 or b = 2?)
Here is an inductive proof that T(n) is true for all $n \ge 3$:
Since 8 = 3 + 5, 9 = 3 + 3 + 3 and 10 = 5 + 5, it is possible to make
k-cent postage for all k = 8, 9, 10 and this proves the base case T(3) is
a true statement. Next consider any integer $N \ge 3$ for which T(N) is
true, and then show that the truth of $T(N+1)$ follows from the truth
of T(N). Because T(N) is true, postage can be made for each of
$3N - 1$, $3N$, and $3N + 1$.
If you add one 3-cent stamp to the postage for $3N - 1$, you obtain
$3N - 1 + 3 = 3(N + 1) - 1$ cents of postage. Likewise, by adding one
3-cent stamp to the postage for each of $3N$ and $3N + 1$, you can obtain
postage for each of $3(N + 1)$ and $3(N + 1) + 1$. Therefore, it has been
shown: if T(N) is assumed to be true, then
It is possible to form postage totaling each of
$3(N + 1) - 1$, $3(N + 1)$, and $3(N + 1) + 1$ cents
using only 3-cent and 5-cent stamps.
This is precisely the statement $T(N+1)$. Thus by the Principle of
Mathematical Induction, using only 3- and 5-cent stamps, you can make
n cents in postage for every $n \ge 8$.

The postage problem for any finite number of stamp types is also referred
to as the Frobenius Problem and the Coin Exchange Problem.
Appendix B contains another review of the fundamentals of proofs using
the Principle of Mathematical Induction.

2.2 The Principle of Mathematical Induction

All proofs using the Principle of Mathematical Induction have four parts:
An identification of the sequence of statements to be proved, a base step,
an inductive step, and the inductive conclusion.
It is helpful to identify these four parts in the proof of the postage problem in the last section. First of all, the sequence of statements S(n) which
must be proved was identified. In the postage problem this involved some
trial-and-error, whereas for (2.1) and (2.2) the statement of the problem
already identified the parametrized statement to be proved. For the postage
problem, the base step is the case n = 3. Next locate the sentence "Next
consider any integer $N \ge 3$ for which T(N) is true, and then show that the
truth of $T(N+1)$ follows from the truth of T(N)." This is an outline of
what must be accomplished in the inductive step of the proof. In particular,
"Because T(N) is true, postage can be made for each of
$3N - 1$, $3N$, and $3N + 1$"
is called the inductive hypothesis. In the inductive step the statement
is derived for n = N + 1 from the inductive hypothesis, proving that the
truth of the statement when n = N implies the truth of the statement
when n = N + 1. The last sentence in the proof, "Thus by the Principle of
Mathematical Induction, using only 3- and 5-cent stamps, you can make n
cents in postage for every $n \ge 8$," is the inductive conclusion.
One way of looking at the Principle of Mathematical Induction is that it
tells you that if you know the first case of a theorem and you can derive
every other case of the theorem from a smaller case, then the theorem is true
in all cases. However, the particular way in which this reasoning process has
been used above is rather restrictive because in the inductive step you have
derived the next case of the statement from the immediately preceding case
of the statement. Instead, there is another formulation of mathematical
induction which is often referred to as the Strong Principle of Mathematical
Induction. It is equivalent to the principle of mathematical induction that
you've been using and it is called "strong" because it can be more easily
applied to a wider range of problems.
In the following box the basic template for a proof by mathematical
induction is highlighted.

The Principle of Mathematical Induction


In order to prove a sequence of statements indexed by integers $k \ge b$ it is
sufficient to do the following:
1. Determine the sequence of statements to be proved;
2. (Base Step) Prove the statement for k = b;
3. (Inductive Step) Prove that (for any $N \ge b$) the truth of the statements for k = b, k = b + 1, . . . , k = N implies the truth of the
statement for k = N + 1;
4. (Inductive Conclusion) By the Principle of Mathematical Induction,
every statement in the sequence is true.

The only change in this formulation of the principle is in the Inductive
Step. This statement of the Inductive Step is preferable because it is more
easily applied to a wider range of situations than the non-strong principle.
For example, this statement of the Principle of Mathematical Induction
allows the postage stamp problem to be proved in a more straightforward
manner by directly using the original observation that the truth of S(N)
follows from the truth of $S(N-3)$.
Step 1: Identify the sequence of statements to be proved.
For all $n \ge 8$, let
S(n): n cents of postage can be made using only 3- and 5-cent
stamps.
Step 2: Base Step.
Since 8 = 3 + 5, 9 = 3 + 3 + 3 and 10 = 5 + 5, it is possible to make
k-cent postage for all k = 8, 9, 10 and this proves the base cases with
n = 8, 9, 10. (Why must you explicitly verify the first three statements?)
Step 3: The Inductive Step.
Assume that for some $N \ge 10$, S(k) is known to be true for all k = 8, . . . , N, and then
show that the truth of $S(N+1)$ follows. Because $8 \le N - 2 \le N$, you know $S(N-2)$
is true, so postage amounting to $N - 2$ cents can be made, and adding one
3-cent stamp to this allows $N + 1$ cents of postage to be made.
Step 4: Inductive Conclusion.
By the Principle of Mathematical Induction, using only 3- and 5-cent
stamps you can make n cents in postage for every $n \ge 8$.
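The four-step proof translates almost line by line into a recursive procedure that actually produces the stamps. The following is a sketch under the assumption that a collection of stamps is represented as a Python list of the values 3 and 5; the function name `postage` is hypothetical.

```python
def postage(n):
    """Return a list of 3- and 5-cent stamps totaling n cents, for n >= 8."""
    if n < 8:
        raise ValueError("the argument in the text covers only n >= 8")
    # Base Step: the three cases verified directly.
    base_cases = {8: [3, 5], 9: [3, 3, 3], 10: [5, 5]}
    if n in base_cases:
        return list(base_cases[n])
    # Inductive Step: postage for n - 3 cents exists, so add one 3-cent stamp.
    return postage(n - 3) + [3]


print(postage(13))  # [5, 5, 3]
```

The recursion always lands on one of the three base cases because it steps down by 3 each time. The procedure gives one way of making each amount, which need not be the only way.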

52. What postage do you think can be made using only 5-cent and 6-cent
stamps? Do you think that there is a number N such that for any
$n \ge N$, you can make n cents worth of postage? Either explain why
such an integer N does not exist or give an inductive process inherent
in the problem from which a proof by induction could be constructed.
(A complete proof by induction is not required for this problem.)
53. Explain why it is true that for any integer N there always exists n > N
for which it is impossible to make n-cents worth of postage using only
2-cent and 4-cent stamps. Generalize this observation.
54. Suppose a and b are two positive integers for which there exists an
integer N such that for any $n \ge N$, n cents of postage can always be
made using only a-cent and b-cent stamps. Using the information you
have collected in the preceding problems, conjecture a value of N which
has this property. (N will depend on a and b). Test your conjecture
on some more examples.
55. Divide your group into pairs to play three or four games of Sylver
Coinage (a.k.a Postal Kiosk),1 in which players take turns naming positive integers n for which n-cents of postage cannot be made using only
numbers which have already been named. The player who chooses 1
loses. For instance, 3, 4, 5, 2, 1 is a possible game, but 3, 5, 7, 6, 4, 1 is
not (why?). Without referring to human mortality, explain why any
game of Sylver Coinage ends after finitely many steps.
56. Suppose that f is a function which is defined by the inductive process
$f(1) = 1$ and $f(n) = n + f(n-1)$. For practice, find f(2), f(3),
and f (4), and then prove that f (n) = n(n + 1)/2. Notice that this
gives another proof of (2.1), because the sum there satisfies the two
conditions defining the function f .
57. What properties in the definition of function allow you to say that
every function from [m + 1] to [10] is built up from a function from
[m] to [10] in a unique way? Let $S_n$ be the set of all functions from [n]
to [10]. Give an inductive process relating the set $S_{m+1}$ to the set $S_m$,
and use that to find an equation for the size of $S_{m+1}$ in terms of the
size of $S_m$. Compare this with Problem 30.
Footnote 1: The authors of the book Winning Ways for Your Mathematical Plays attribute
the game of Sylver Coinage to an 1884 article by the mathematician James Joseph Sylvester.
58. Using the simple observation that a subset of [n] either does or does
not contain n, find an inductive relationship between the number of all
subsets of [n] and the number of all subsets of $[n-1]$.
Note: What does the symbol [n] mean?
59. Use the relationship in Problem 58 to construct an inductive proof that
the set [n] has $2^n$ subsets. Compare this with Problem 27.
60. Let k be a fixed positive integer, and let $s_n$ be the number of functions
from [n] to [k]. Find an equation relating $s_{n+1}$ to $s_n$. Explain.
+ 61. Compare the inductive relationships in the last two problems. Is there
a reason for one to be a special case of the other?
62. For a fixed positive integer n, a composition of n is an ordered list
of positive integers whose sum is n. For this problem, $c_n$ will denote
the number of compositions of n. For example, a complete list of all
compositions of n = 3 is 3; 2 1; 1 2; 1 1 1. Consequently, $c_3 = 4$.
(a) Find $c_n$ for some small values of n.
(b) Use the calculations from part (a) to describe an inductive procedure for obtaining $c_n$ from earlier $c_i$.
(c) Carefully prove that $c_n = 2^{n-1}$ holds for all $n \ge 1$.
(d) Use the result of Problem 59 to obtain part (c) of this problem
directly.
63. A roller coaster car has n rows of seats, each of which has room for two
people. Suppose n men and n women get into the car with a man and
a woman in each row, and let $a_n$ be the number of configurations that
can occur.
(a) Find and write down an inductive relationship between $a_n$ and
$a_{n+1}$.
(b) Use your inductive relationship to conjecture a formula for $a_n$.
(c) Prove your formula by induction.
64. (a) Beginning with the 3-string 000, write a list of all eight binary 3-strings which is ordered in such a way that each 3-string differs from
the previous one in the list by changing just one digit. This should
be considered cyclically; that is, the last element also differs in
only one digit from the first. There are many such sequences, and
each is called a (cyclic) Gray Code for 3-strings. That is, a Gray Code
for binary n-strings is a listing of all $2^n$ binary n-strings in such
a way that each n-string differs from the previous one in exactly
one place. Describe how to get your Gray Code for 3-strings from
some Gray Code for 2-strings.
(b) Can you describe how to get some Gray Code for 4-strings from
the one you found for 3-strings?
(c) Give a written description of the inductive step of an inductive
proof of the existence of Gray Codes for n-strings for all $n \ge 1$.
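The definition of a cyclic Gray Code in Problem 64 can be turned into a small checker for testing candidate lists. The following is a minimal sketch in Python; the function names are illustrative choices, not part of these notes, and the code only verifies a list rather than constructing one.

```python
def differs_in_one_place(a: str, b: str) -> bool:
    """True when two equal-length strings differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def is_cyclic_gray_code(strings: list[str], n: int) -> bool:
    """Check that `strings` lists every binary n-string exactly once and that
    consecutive entries (the last one wrapping to the first) differ in one digit."""
    all_strings = {format(i, f"0{n}b") for i in range(2 ** n)}
    if len(strings) != 2 ** n or set(strings) != all_strings:
        return False
    return all(
        differs_in_one_place(strings[i], strings[(i + 1) % len(strings)])
        for i in range(len(strings))
    )


print(is_cyclic_gray_code(["00", "01", "11", "10"], 2))  # True
print(is_cyclic_gray_code(["00", "01", "10", "11"], 2))  # False: 01 -> 10 changes two digits
```

You might use such a checker to test the lists you write down for 3-strings and 4-strings before describing the inductive step in words.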
One of the original uses of Gray codes was to reduce coding errors in a
pulse communication system. Frank Gray of Bell Labs received a patent for
Gray codes² in 1953, but this type of sequence is now known to have been
used even earlier (in the 1870s) in a telegraph device by Émile Baudot, a
French engineer.
65. Use the definition of a Gray Code to find a bijection between the even-sized subsets of [n] and the odd-sized subsets of [n].
Hint. Consider the characteristic functions of subsets.

Figure 2.1: The Towers of Hanoi Puzzle

66. The Towers of Hanoi Puzzle (refer to Figure 2.1) has three rods rising
from a rectangular base and $n \ge 1$ rings of different sizes. At the
beginning of the puzzle all rings are stacked on one rod in order of
decreasing size. A legal move consists of moving a ring from one rod
to another in such a way that the ring does not land on a smaller one.
Let $m_n$ be the (minimal) number of moves required to move all the
rings from the initial rod to any other rod. Find $m_n$ for all $n \le 3$ and
use those solutions to develop a strategy for obtaining the number of
moves for N + 1 rings from the number of moves for N rings. Describe
this as an inductive process. Guess a formula for $m_n$ and use induction
to show your guess is correct.
2 Pulse Code Communication, United States Patent Number 2632058 (March 17, 1953)

67. Are more moves required if the Towers of Hanoi Puzzle had the additional stipulation that the stack had to wind up on a pre-determined
rod? Explain.
Each of the foregoing problems had an underlying inductive process, and
that process was the basis for the inductive step of the proof by mathematical
induction. Next a complete proof of Problem 62 will be given.
The sequence of statements to be proved is

S(n): $c_n = 2^{n-1}$ for all $n \ge 1$,

where $c_n$ is the number of compositions of the integer n (as was defined
in Problem 62).
The base case of n = 1: Since 1 is the only composition of 1, $c_1 =
1 = 2^{1-1}$ and the base case is true.
Next, for fixed $N \ge 1$ assume that

$c_n = 2^{n-1}$ for all of n = 1, . . . , N,

and use this to prove that $c_{N+1} = 2^N$. In other words, an inductive
procedure must be identified which gives the number of compositions of
the integer N + 1 in terms of the number of compositions for smaller $n \le N$.
First of all, beginning with any composition of N, if a summand of 1
is added to the end of the sum, the result is a composition of N + 1.
Because all of the original compositions of N are different, there are
exactly $c_N$ different compositions of N + 1 which end in a summand
of 1. Another way to obtain compositions of N + 1 is to increase the last
summand of a composition of N by 1. Again, these are all different from
each other, and none of them can be a composition of the first type, each
of which had 1 for its last summand. Since these two procedures result
in mutually disjoint sets, so far, $2c_N$ compositions of N + 1 have been
obtained.
BUT there is still the question of whether or not every composition
of N + 1 arises in one of these two ways. In other words, do these two
blocks of compositions of N + 1 partition the set of all compositions
of N + 1 or do they form a partition of a smaller set? Because the last
summand of any composition of N + 1 is either 1 (and the other
summands form a composition of N) or is greater than 1 (and becomes
a composition of N when the last summand is decreased by 1), the two
blocks given do partition the set of all compositions of N + 1. It has
thus been shown that $c_{N+1} = 2c_N = 2^N$ since $c_N = 2^{N-1}$ by the inductive
assumption. Therefore, by the Principle of Mathematical Induction,
$c_n = 2^{n-1}$ holds for all $n \ge 1$.
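The two procedures used in the inductive step can be mirrored in a short program. The sketch below is only an illustration in Python (the function name `compositions` is an assumption made here): it builds the compositions of n from those of n - 1 by either appending a summand of 1 or increasing the last summand by 1, and then checks the count for small n. It does not replace the argument that every composition arises in one of these two ways.

```python
def compositions(n: int) -> list[tuple[int, ...]]:
    """Build the compositions of n from those of n - 1 using the two
    procedures of the proof."""
    if n == 1:
        return [(1,)]  # base case: 1 is the only composition of 1
    result = []
    for comp in compositions(n - 1):
        result.append(comp + (1,))                  # add a summand of 1 at the end
        result.append(comp[:-1] + (comp[-1] + 1,))  # increase the last summand by 1
    return result


for n in range(1, 8):
    comps = compositions(n)
    assert all(sum(c) == n for c in comps)   # every list sums to n
    assert len(set(comps)) == len(comps)     # the two blocks never overlap
    assert len(comps) == 2 ** (n - 1)        # matches c_n = 2^(n-1)

print(sorted(compositions(3)))  # [(1, 1, 1), (1, 2), (2, 1), (3,)]
```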

After carefully analyzing the above proof, return to your solutions to
the other problems in this section to make sure you have written complete
explanations in your earlier proofs. Your understanding of mathematical
induction will be relied on throughout the remainder of these notes. Work
through Appendix B if you'd like more practice with this proof technique.
68. Write down at least two ways that the Sum Principle can be expressed
as a sequence of statements indexed by the positive integers. Can you
think of a third way? Settle on a way that allows you to construct an
inductive process.
69. Prove the Sum Principle by mathematical induction. (As commented
earlier in Problem 13, the base case of a partition into two blocks is
the definition of addition of two positive integers.)
Since in Problem 12 you proved the Product Principle follows logically from
the Sum Principle, you have now completed the proofs of both the Sum and
Product Principles.

2.2.1 Recurrences

In the last section you considered many situations that involved counting
items which are defined inductively. For instance, in Problem 58 you found
the relationship

$s_n = 2s_{n-1}$. (2.3)

Also, when $s_n$ stands for the number of functions from [n] to [k], for a fixed
integer k, the inductive process you found in Problem 60 gave the equation

$s_{n+1} = k s_n$. (2.4)

Equations (2.3) and (2.4) are examples of recurrence equations, which are
sometimes called recurrence relations, or simply recurrences. A recurrence
is an equation that expresses the n-th term of a sequence $a_n$ in terms of
earlier values $a_i$. Other examples of recurrences are

$a_n = a_{n-1} + 7$, (2.5)

$a_n = 3a_{n-1} + 2^n$, (2.6)

$a_n = a_{n-3} + 3a_{n-2}$, (2.7)

$a_n = a_1 a_{n-1} + a_2 a_{n-2} + \cdots + a_{n-1} a_1$. (2.8)
A solution to a recurrence is any sequence that satisfies the recurrence. For
instance, the sequence given by $s_n = 2^n$ is a solution to recurrence (2.3).
Since $s_n = 17 \cdot 2^n$ and $s_n = 13 \cdot 2^n$ are two other solutions to recurrence (2.3),
a recurrence can have infinitely many solutions, but in a given problem there
is often only one solution of interest. For example, if you are interested in
the number of subsets of a set, then the solution to recurrence (2.3) that
you care about is $s_n = 2^n$. The reason for this is that it is the only solution
that begins with $s_0 = 1$, the number of subsets of the empty set. (Notice
that this is consistent with what you have already proved in Problem 27.)
Usually $s_0$ is called the initial value of the recurrence.
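To see how the initial value singles out one solution, recurrence (2.3) can simply be iterated from different starting values. This is a minimal Python sketch; the function name `iterate_recurrence` is a choice made for this illustration.

```python
def iterate_recurrence(initial: int, steps: int) -> list[int]:
    """Return s_0, s_1, ..., s_steps for the recurrence s_n = 2 * s_{n-1}."""
    values = [initial]
    for _ in range(steps):
        values.append(2 * values[-1])  # apply recurrence (2.3) once
    return values


print(iterate_recurrence(1, 5))   # [1, 2, 4, 8, 16, 32]   (s_n = 2^n)
print(iterate_recurrence(17, 3))  # [17, 34, 68, 136]      (s_n = 17 * 2^n)
```

Each choice of initial value produces a different sequence, and every term after the first is forced by the recurrence.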
70. Use induction to show that there is one and only one solution to Recurrence (2.3) which begins with initial value $s_0 = 1$.
71. A linear first-order recurrence is a recurrence which expresses $a_n$ in
terms of $a_{n-1}$ (to the first power) and other functions of n, but does not
include any of $a_0, \ldots, a_{n-2}$ in the equation. Which of recurrences (2.3)
through (2.8) are first-order recurrences?
72. Show that there is one and only one sequence $\{a_n\}$ which has all of the
following properties:
(a) it is defined for every nonnegative integer n (which means that it
is an infinite sequence);
(b) it has a fixed initial value (which means $a_0$ has a pre-determined
value, say a);
(c) and it satisfies a linear first-order recurrence (that is, $a_n = c\,a_{n-1} +
d$ for some constants c, d).
73. Use Recurrence (2.4) to find the number of functions from [n] to [k].
74. (a) In repaying a mortgage loan with initial amount A, annual interest
rate p% (on a monthly basis), and a monthly payment of m, what
recurrence describes the amount owed after n months of payments
in terms of the amount owed after n - 1 months? Some technical
details: You make the first payment after one month. The monthly
interest rate is .01p/12, and it is applied to the amount you owed
immediately after making your previous monthly payment.
(b) Find a formula for the amount owed after n months.
(c) Find a formula for the number of months needed to bring the
amount owed to zero. Another technical point: If you were to make
the standard monthly payment m in the last month, you might
actually end up owing a negative amount of money. Therefore it is
okay if the result of your formula for the number of months needed
gives a non-integer number of months. The bank would just round
up to the next integer and adjust your payment so your balance
comes out to zero.
(d) What should the monthly payment be to pay off the loan over a
period of 30 years?
75. A tennis club has 2n members and it wants to pair the members in
twos for singles matches.
(a) Find a recurrence for the number of different ways to divide all the
2n members into sets of two. Be sure to give the initial value.
(b) Give a recurrence for the number of ways to divide 2n people into
sets of two for tennis games in which the first server is determined.
(c) In each of the previous two parts, use your recurrences to write the
number of ways as a product.
76. Draw n mutually intersecting circles in the plane so that each one
crosses each other one exactly twice and no three intersect in the same
point. Find a recurrence for the number $r_n$ of regions into which the
plane is divided by n circles. (One circle divides the plane into two
regions, the inside and the outside.) Find the number of regions with
n circles. For what values of n can you draw a Venn diagram showing
all the possible intersections of n sets using circles to represent each of
the sets?
Hint. Suppose n - 1 circles have been drawn in such a way that they
define $r_{n-1}$ regions. When you draw a new circle, each time it crosses
another circle it finishes dividing one region into two parts and starts
dividing a new region into two parts.

2.3 The General Product Principle

Although the Product Principle in Chapter 1 can be applied directly to solve
problems such as Problems 5 and 6, the reasoning can be cumbersome. An
easier way to work this type of counting problem is to think in terms of
making a sequence of choices as in the next problem.
77. Suppose you make a sequence of m choices, where
• the first choice can be made in $k_1$ ways, and
• for each way of making the first i - 1 choices, the i-th choice can
be made in $k_i$ ways.
Explain why this is an inductive process. In how many different ways
may you make your sequence of m choices? At this time you need not
prove your answer is correct, but simply write your answer in the next
box.

The General Product Principle

Suppose you make a sequence of m choices, where
• the first choice can be made in $k_1$ ways, and
• for each way of making the first i - 1 choices, the i-th choice can be
made in $k_i$ ways.
Then the total number of different ways to make this sequence of m choices
is:

+ 78. Return to Chapter 1 and re-do Problems 5, 6, and 20 by applying
the General Product Principle. In each case, write the problem as
a sequence of choices, count $k_i$ for each choice, and then apply the
principle.
79. Use the General Product Principle to count the number of bijections
on the set [k]. Explain. (It might be a good idea to work through some
of the cases k = 2, 3, and 4 before doing the general case.)
80. Let S(2) denote the statement of the General Product Principle for a
sequence of m = 2 choices. Let S denote the set of all different ways
to make these two choices.
(a) Write out S(2) explicitly.
(b) Remember that the first choice can be made in any of $k_1$ ways and
let $I_1, I_2, \ldots, I_{k_1}$ be the specific items that can be chosen for the
first choice. Letting $B_j$ be the subset of S for which the first choice
made was the item $I_j$, find the size of $B_j$ for each j.
(c) Explain why $B_1, \ldots, B_{k_1}$ is a partition of S.
(d) What principle allows you to conclude that the size of S equals
$k_1 k_2$?
81. Use mathematical induction to prove the General Product Principle.

2.3.1 Counting the number of functions

Problem 73 is now revisited from the perspective of the General Product
Principle.

82. Use the General Product Principle to prove the number of functions
from an m-element set to an n-element set is $n^m$. A common notation
for the set of all functions from a set M to a set N is $N^M$.
+ 83. Now suppose you are thinking about the set S of functions f from [m]
to [n]. (For example, the set of functions from the three possible places
for scoops in an ice-cream cone to 12 flavors of ice cream.) Suppose
f(1) can be chosen in $k_1$ ways. (In the ice cream problem, $k_1 = 12$
holds because there were 12 ways to choose the first scoop.) Suppose
that for each choice of f(1) there are $k_2$ choices for f(2). (For the ice
cream cones, $k_2 = 12$ when the second flavor could be the same as the
first, but $k_2 = 11$ when the flavors had to be different.) In general,
suppose that for each choice of f(1), f(2), . . . , f(i - 1), there are $k_i$
choices for f(i). What has been assumed so far about the functions in
S may be summarized as:
• There are $k_1$ choices for f(1).
• For each choice of f(1), f(2), . . . , f(i - 1), there are $k_i$ choices
for f(i).
How many functions are in the set S? Is there any practical difference
between the result of this problem and the General Product Principle?
The point of Problem 83 is that originally the statement of the General Product
Principle was somewhat informal. To be more mathematically precise, it is
a statement about counting sets of functions.
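The view of a sequence of choices as a function can be made concrete by listing functions as tuples. The following Python sketch is illustrative only (the helper names are assumptions made here): a function f from [m] to [n] is stored as the tuple (f(1), ..., f(m)), and the ice cream example with three scoops and 12 flavors is counted both with and without repeated flavors.

```python
from itertools import product


def all_functions(m: int, n: int) -> list[tuple[int, ...]]:
    """Each tuple (f(1), ..., f(m)) with entries in [n] is one function [m] -> [n]."""
    return list(product(range(1, n + 1), repeat=m))


def functions_with_distinct_values(m: int, n: int) -> list[tuple[int, ...]]:
    """Keep only the functions whose values f(1), ..., f(m) are all different."""
    return [f for f in all_functions(m, n) if len(set(f)) == m]


# Three scoops, 12 flavors: k_1 = k_2 = k_3 = 12 when flavors may repeat.
print(len(all_functions(3, 12)) == 12 ** 3)                        # True
# With distinct flavors the choices are k_1 = 12, k_2 = 11, k_3 = 10.
print(len(functions_with_distinct_values(3, 12)) == 12 * 11 * 10)  # True
```

Brute-force enumeration like this is only practical for small m and n, but it gives a quick check on a count obtained from the General Product Principle.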
84. This problem revisits the question: How many subsets does a set S
with n elements have? (Compare with Problems 27 and 58.)
(a) For the specific case of n = 3, describe a sequence of three decisions
which could be made to yield subsets of [3]. Apply the General
Product Principle to find the number of subsets of [3]. Re-work
this from a function point of view.
(b) Use the functional interpretation of the General Product Principle
to prove that a set with n elements has $2^n$ subsets.
85. In how many ways can you pass out k distinct pieces of fruit to n
children (with no restriction on how many pieces of fruit a child may
get)?
86. Assuming $k \le n$, in how many ways can you pass out k distinct pieces
of fruit to n children if each child may get at most one? What is the
number if k > n? Assume for both questions that you pass out all the
fruit. Note that each of these is a list of k distinct things chosen from
a set S (of children).

87. In combinatorics, an (ordered) list of k distinct things chosen from a
set S is called a k-element permutation of S. How many k-element
permutations does an n-element set have?
88. Find the number of one-to-one functions from a k-element set to an
n-element set. Now review your solutions to Problem 8.
Donald Knuth invented the notation $n^{\underline{k}}$, read "n to the k falling" or "n
to the k down", for the number you just found:

$n^{\underline{k}} = n(n-1) \cdots (n-k+1) = \prod_{i=1}^{k} (n-i+1)$. (2.9)
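Equation (2.9) translates directly into a product over i = 1, ..., k. The sketch below, written in Python with function names chosen only for this illustration, computes the falling factorial from (2.9) and compares it with a brute-force count of k-element permutations for small cases.

```python
from itertools import permutations
from math import prod


def falling_factorial(n: int, k: int) -> int:
    """n to the k falling: the product of (n - i + 1) for i = 1, ..., k, as in (2.9)."""
    return prod(n - i + 1 for i in range(1, k + 1))


def count_k_permutations(n: int, k: int) -> int:
    """Count the ordered lists of k distinct elements of [n] by listing them."""
    return sum(1 for _ in permutations(range(1, n + 1), k))


for n in range(6):
    for k in range(n + 2):  # includes one case with k > n, where both counts are 0
        assert falling_factorial(n, k) == count_k_permutations(n, k)

print(falling_factorial(12, 3))  # 1320
```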

Chapter 3

Equivalence Relations

3.1 Equivalence Relations

Equivalence relations have been in the background of some of the problems
you've already worked. For instance, in Problem 9 with three distinct flavors
at first it was probably tempting to say there are 12 flavors for the first pint,
11 for the second, and 10 for the third, so there are 12 · 11 · 10 ways to
choose the pints of ice cream. However, once the pints have been chosen,
bought, and put into a bag, there is no way to tell which one was bought
first, which second, and which third. The number 12 · 11 · 10 is the number
of lists of three distinct flavors, in which the order in which the pints are
bought makes a difference. Two of the lists become equivalent after the ice
cream purchase if they have the same flavors of ice cream. In other words,
two of these lists are equivalent (are related for this problem) if they list the
same subset of the set of twelve ice cream flavors. To visualize this relation
with a digraph, one vertex would be needed for each of the 12 · 11 · 10 = 1320
lists (which is not feasible to draw). Even with five flavors of ice cream the
number of vertices would be 5 · 4 · 3 = 60. So for now the easier-to-draw
question of choosing three pints of different ice cream flavors from a choice
of four flavors of ice cream will be considered. For this, there are 4 · 3 · 2 = 24
different lists.
89. Suppose you have four flavors of ice cream: V(anilla), C(hocolate),
S(trawberry) and P(each). Draw the directed graph whose vertices
are all lists of three distinct flavors of the ice cream, and whose edges
connect two lists if they list the same three flavors. This graph makes
it pretty clear in how many really different ways you may choose 3
flavors from four. How many?

90. Now again suppose you are choosing three distinct flavors of ice cream
out of four possible flavors, but instead of putting scoops in a cone or
choosing pints, the three scoops will be arranged symmetrically in a
circular dish. The original lists are the same as in the last problem.
Namely, you can describe a selection of ice cream in terms of which
one goes in the dish first, which one goes in second (say to the right of
the first), and which one goes in third (say to the right of the second
scoop, which makes it to the left of the first scoop). But here two of
these lists will be considered equivalent if once they are in the dish no
one can tell which scoop went in first. Think about what makes two
lists of flavors equivalent, and draw the directed graph whose vertices
consist of all 24 lists of three of the flavors of ice cream and whose edges
connect two lists between which you cannot distinguish as dishes of ice
cream. How many dishes of ice cream can be distinguished from one
another?
91. Consider two seating arrangements of people around a round table to be
equivalent if you can get from one arrangement to the other by having
everyone get up, move one chair to the right, and then sit down again.
Explain how the digraph in Figure 3.1 describes the seating arrangements of four people at a round table. In your explanation, be sure to
tell what the arrows signify and what the disconnected appearance of
the digraph signifies. How many non-equivalent seating arrangements
of four people are there?

Figure 3.1: Digraph of arranging four people at a round table. (The vertices of the digraph are the 24 lists of the four people A, B, C, D.)

In the last few problems, you began with a set of lists and first you had
to decide when two lists were equivalent representations of the objects you
were trying to count. Then you drew the directed graph for that particular
relation of equivalence. Go back to your digraphs and check that each vertex
has an arrow to itself. This is what is meant when it is said that a relation is
reflexive. Also, check that whenever you have an arrow from one vertex to
a second, there is an arrow from the second back to the first. This is what is
meant when a relation is said to be symmetric. There is another property
of those relations you have graphed. Namely, whenever you have an arrow
from $L_1$ to $L_2$ and an arrow from $L_2$ to $L_3$, then there is an arrow from
$L_1$ to $L_3$. This is what is meant when a relation is said to be transitive.
A relation on a set is called an equivalence relation on the set when it
satisfies all three of these properties.
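The three properties can also be checked mechanically when a relation on a finite set is stored as a set of ordered pairs (one pair for each arrow of the digraph). This is a minimal Python sketch; the function names and the two small example relations are assumptions made for illustration.

```python
def is_reflexive(elements: set, relation: set) -> bool:
    """Every vertex has an arrow to itself."""
    return all((x, x) in relation for x in elements)


def is_symmetric(relation: set) -> bool:
    """Every arrow from x to y is matched by an arrow from y back to x."""
    return all((y, x) in relation for (x, y) in relation)


def is_transitive(relation: set) -> bool:
    """Arrows x -> y and y -> w always come with an arrow x -> w."""
    return all(
        (x, w) in relation
        for (x, y) in relation
        for (z, w) in relation
        if y == z
    )


def is_equivalence_relation(elements: set, relation: set) -> bool:
    return is_reflexive(elements, relation) and is_symmetric(relation) and is_transitive(relation)


elements = {1, 2, 3}
same_clump = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}
at_most = {(x, y) for x in elements for y in elements if x <= y}

print(is_equivalence_relation(elements, same_clump))  # True
print(is_equivalence_relation(elements, at_most))     # False: not symmetric
```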
92. If R is a relation on a finite set, how can the digraph be used to check
whether or not the relation is reflexive? To check symmetry? To check
transitivity?
93. The adjacency matrix of a relation R on an n-element set $\{x_1, \ldots, x_n\}$
is defined to be the $n \times n$ matrix A whose (i, j) entry equals the number of edges from $x_i$ to $x_j$ in the associated digraph. For at least three
relations on the set [4], find the adjacency matrix A and calculate $A^2$.
Use this information to formulate a conjecture about what is recorded
in the entries of $A^2$.
94. Let B be the matrix obtained from the matrix $A^2$ by changing every
positive entry to 1. Explain how the matrix $A - B$ can be used to determine whether the relation is transitive. If the relation is not transitive,
explain how the matrix $A - B$ can be used to find a counterexample to
transitivity.
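For experimenting with Problems 93 and 94, the adjacency matrix and its square can be computed with a few lines of code. The sketch below is one possible setup in Python, with helper names and the sample relation chosen here purely for illustration; interpreting the entries of $A^2$ is left to you.

```python
def adjacency_matrix(n: int, relation: set) -> list[list[int]]:
    """n x n matrix whose (i, j) entry is 1 when (i, j) is in the relation on [n], else 0."""
    return [
        [1 if (i, j) in relation else 0 for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def matrix_product(a: list[list[int]], b: list[list[int]]) -> list[list[int]]:
    """Ordinary product of two square matrices of the same size."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


sample_relation = {(1, 2), (2, 3), (3, 4)}  # a hypothetical relation on [4]
A = adjacency_matrix(4, sample_relation)
A_squared = matrix_product(A, A)

for row in A_squared:
    print(row)
```

Replacing `sample_relation` with your own relations on [4] gives a quick way to collect data for the conjecture.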
Check that each relation of equivalence in Problems 89, 90 and 91 satisfies the three properties, and so each is an equivalence relation. Carefully
visualize the same three properties in the relations of equivalence that you
use in the remaining problems of this chapter. Work through Appendix C
if you would like more practice with checking whether or not a relation is
an equivalence relation.
You undoubtedly have noticed that for each of the equivalence relations
you've considered so far, the digraph is divided into "clumps" of mutually
connected vertices. In the next section you will show that this "clumping" property holds for all equivalence relations, and that it is exactly this
property which makes equivalence relations a powerful tool for counting.

3.2 Equivalence Classes

Next is an investigation of the "clumping" behavior you observed in the
digraphs of the equivalence relations in the last section.
95. Describe the equivalence relation inherent in seating n people around
a round table in such a way that they are equally spaced around the
table. Check that it is an equivalence relation on the set of all n! lists
of n people. This equivalence relation induces a partition of the set
of lists. What size blocks occur? How many non-equivalent seating
arrangements are there? Compare this problem with Problem 91.
96. In this problem, consider the question of making necklaces by arranging
n distinguishable beads on a string. Assume that once all n beads are
placed on the string its ends are carefully knotted so the knot cannot
be seen and that the beads are equally spaced on the string. Regard
two necklaces as equivalent if when both are placed on a table, a person
can pick up one of the necklaces, move it around in space, and put it
back down so that it exactly matches the other necklace. Describe this
as an equivalence relation on some set. What size blocks of necklaces
occur? How many non-equivalent necklaces can be constructed using
n distinguishable beads?
97. Verify explicitly that your relationship of equivalent necklaces is reflexive, symmetric, and transitive, and so is an equivalence relation.
Some standard notation is useful for working with equivalence relations:
When R is a relation on the set X, the notation xRy will indicate that
$x, y \in X$ are related under R. Using this notation, the relation
• R is called reflexive if xRx for every $x \in X$.
• R is called symmetric if xRy holds whenever yRx holds.
• R is called transitive if whenever both xRy and yRz, then xRz as
well.
• R is an equivalence relation if R is reflexive, symmetric and transitive.

Suppose that R is an equivalence relation on a set X and for each $x \in X$,
define the set $C_x$ by $C_x = \{y \in X : yRx\}$. That is, $C_x$ is the subset of X
consisting of all elements of X which are equivalent to the element x under
the given equivalence relation. (Can any of the sets $C_x$ be empty? Yes? No?
Why?) The sets $C_x$ are called the equivalence classes of the relation R.
Find the equivalence classes for the seating relation whose digraph is given
in Figure 3.1.
98. In Problem 95 the equivalence classes correspond to seating arrangements. For each of Problems 89, 90, and 96, describe (that is, using
complete English sentences) what the equivalence classes correspond
to. Three answers are expected here.
99. Let R be an equivalence relation on a set X.
(a) If the classes $C_x$ and $C_z$ have an element y in common, what
can you conclude about the sets $C_x$ and $C_z$ (besides the fact that
they have an element in common!)? Be explicit about what property(ies) of equivalence relations justify your answer.
(b) Why is every element of X in some class $C_x$? Be explicit about
what property(ies) of equivalence relations you are using to answer
this question.
(c) Explain why two distinct sets $C_x$ and $C_z$ are disjoint. What do
these sets have to do with the "clumping" you saw in the digraphs
of Problems 89 and 90?
You have just proved that if R is an equivalence relation on the set X, then
each element of X is in exactly one equivalence class of R. That is,
Theorem 1. If R is an equivalence relation on X, then the set of equivalence
classes of R forms a partition of X.
In each of Problems 89, 90, 95, and 96 when you counted the number of
equivalence classes of an equivalence relation there was a special structure
to the problems that made this somewhat easy to do. For example, in
Problem 89, there was a set of 4 · 3 · 2 = 24 lists of three distinct flavors
chosen from V, C, S, and P. Each list was equivalent to 3 · 2 · 1 = 3! = 6 lists
(including itself) since the order in which you selected the three flavors was
unimportant. This says that each of the equivalence classes has size 6, and
the set of all 4 · 3 · 2 lists was a union of some number n of equivalence classes,
each of size 6. By the Product Principle, if you have a union of n disjoint
sets, each of size 6, the union has 6n elements. But you already knew that
the union was the set of all 24 lists of three distinct letters chosen from the
four letters. Thus, 6n = 24, proving there are n = 4 equivalence classes.
In Problem 90: If you choose the flavors V, C, and S, and arrange them
in the dish with C to the right of V and S to the right of C, then the scoops
are in different relative positions than if you arrange them instead with S to
the right of V and C to the right of S. Thus the order in which the scoops
go into the dish is somewhat important, only "somewhat" because putting in
V first, then C to its right and S to its right is the same as putting in S first,
then V to its right and C to its right. In this case, each list of three flavors
is equivalent to only three lists, including itself, and so if n is the number
of equivalence classes, you have 3n = 24. This gives 24/3 = 8 equivalence
classes.
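The two counts just described can be confirmed by grouping the 24 lists into equivalence classes by computer. The following Python sketch is an illustration under stated assumptions: lists are written as three-letter strings over V, C, S, P, two lists are taken to be equivalent for Problem 89 when they have the same set of flavors, and for Problem 90 when one is a rotation of the other. The helper names are made up for this example.

```python
from itertools import permutations

FLAVORS = "VCSP"
lists = ["".join(p) for p in permutations(FLAVORS, 3)]  # the 4 * 3 * 2 = 24 lists


def classes(items, key):
    """Group items into blocks; two items land in the same block when key() agrees."""
    blocks = {}
    for item in items:
        blocks.setdefault(key(item), []).append(item)
    return list(blocks.values())


def rotation_key(s: str) -> str:
    """A canonical representative of a list up to rotation (VCS, CSV, SVC agree)."""
    return min(s[i:] + s[:i] for i in range(len(s)))


same_flavors = classes(lists, key=frozenset)   # Problem 89: order is unimportant
same_dish = classes(lists, key=rotation_key)   # Problem 90: only rotation is ignored

print(len(same_flavors), {len(b) for b in same_flavors})  # 4 {6}
print(len(same_dish), {len(b) for b in same_dish})        # 8 {3}
```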
100. Given the partition {1, 3}, {2, 4, 6}, {5} of the set X = {1, 2, 3, 4, 5, 6},
define two elements of X to be related if they are in the same block of
the partition. That is, define 1 to be related to 3 (and 1 and 3 each
related to itself), define 2 and 4, 2 and 6, and 4 and 6 to be related
(and each of 2, 4, and 6 to be related to itself), and define 5 to be
related to itself. Show that this relation is an equivalence relation.
101. Suppose $P = \{S_1, S_2, S_3, \ldots, S_k\}$ is a partition of S. Define two elements of S to be related if they are in the same set $S_i$, and otherwise
not to be related. Show that this relation is an equivalence relation on
the set S. Show that the equivalence classes of the equivalence relation
are the sets $S_i$.
In Problem 101 you just proved that every partition of a set gives rise to
(or induces) an equivalence relation, and that the classes of the induced
equivalence relation are the blocks of the original partition.
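The passage from a partition to its induced relation, and back to the blocks, can be traced on the partition of Problem 100. The Python sketch below is only a check on that one example (the function names are illustrative), not a proof of the general statement.

```python
partition = [{1, 3}, {2, 4, 6}, {5}]
X = {1, 2, 3, 4, 5, 6}


def relation_from_partition(blocks):
    """Relate x to y exactly when x and y lie in the same block."""
    return {(x, y) for block in blocks for x in block for y in block}


def equivalence_class(x, elements, relation):
    """C_x = {y in X : y R x}."""
    return {y for y in elements if (y, x) in relation}


R = relation_from_partition(partition)
recovered = {frozenset(equivalence_class(x, X, R)) for x in X}

print(len(R))                                           # 14 ordered pairs: 4 + 9 + 1
print(recovered == {frozenset(b) for b in partition})   # True
```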
102. In how many ways can you attach two identical red beads and two
identical blue beads to the corners of a square (with one bead per
corner) if the square is free to move around in (three-dimensional)
space? What is the underlying equivalence relation and its equivalence
classes? Write out the equivalence classes as sets of lists. For this
problem it might be helpful to just draw some pictures of the possible
configurations since there are not that many.
103. (This has already appeared as Problem 75. Equivalence relations can
be used for a different approach.) A tennis club has 2n members, and
you want to pair the members by twos for singles matches.
(a) In how many different ways can you list all 2n members of the
club?
(b) Define an equivalence relation on the set of all lists which can be
used to identify the lists which give rise to the same pairings.
(c) Find the size of an equivalence class, being careful to explain why
all equivalence classes have the same size.

(d) In how many ways can you pair up all the members of the club for
singles matches?
(e) Suppose that in addition to specifying who plays whom for each
pairing you also specify who serves first. Now in how many ways
can you specify your pairs? Use your solution to part (d), if possible.
104. Suppose you plan to put six distinct computers in a network as shown
in Figure 3.2 where the computers are the nodes.
(a) What is the total number of ways to assign computers to the nodes
(or the vertices) of this regular hexagon?
(b) The edges (line segments) show which computers can communicate directly with which others. Consider two ways of assigning
computers to the nodes of the network to be different if there are
at least two computers that communicate directly in one assignment and that do not communicate directly in the other. Define an
equivalence relation which models the equivalence in this situation.
(c) Prove every equivalence class has 48 elements.
(d) In how many different ways can you assign computers to the network?
Figure 3.2: A computer network.

3.3 Counting Subsets

The symbol $\binom{n}{k}$ is used to represent the number of ways to choose a k-element subset from an n-element set. You may have already seen this
notation elsewhere, but do not be concerned if you have not seen it because it will be developed completely here. The symbol should be read as
"n choose k". (Another common way to read the binomial coefficient notation is "the number of combinations of n things taken k at a time", but we've
found that this can be confusing and so please do not read the notation that
way in this course.) Sometimes $\binom{n}{k}$ is written as C(n, k), or $_nC_k$, but $\binom{n}{k}$
is more standard in discrete mathematics and should be the only notation
you use here.

105. Using only the given definition of $\binom{n}{k}$ (that it equals the number of ways
to choose a k-element subset from an n-element set) and the Bijection
Principle, prove that

$\binom{n}{k} = \binom{n}{n-k}$ for all $0 \le k \le n$.
106. Use the distributive law to check that

$(x + y)^4 = x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4$.

Explain why this means that $\binom{4}{1} = 4$ and $\binom{4}{2} = 6$. (Think this through
carefully!) Give a general argument for why the coefficient of $x^k y^{n-k}$
in $(x + y)^n$ equals $\binom{n}{k}$. For this reason, the symbol $\binom{n}{k}$ is called a
binomial coefficient.
In the next problems you will use equivalence relations to find a formula
for calculating $\binom{n}{k}$. Although you may have seen this formula before, do
not calculate the actual numbers in these problems but rather simply use
binomial coefficients in your answers.
107. A basketball team has 12 members of which only five can play at any
given time during a game.
(a) In how many ways may the coach choose the five players?
(b) To be more realistic, the five players normally consist of two guards,
two forwards, and one center. If there are five guards, four forwards, and three centers on the team, in how many ways may the
coach choose the team to be sent out on the court?
(c) If one of the centers is equally skilled at playing forward, how many
different teams may the coach send out to play?
108. In how many ways can you pass out k (identical) ping-pong balls to
n children if each child may get at most one? (Ask yourself "What is
a problem like this doing in the middle of a bunch of problems about
counting subsets of a set? Is it related? Or is it supposed to give a
break from sets?")
109. Letting S denote the set of all 3-element permutations of {a, b, c, d, e},
Table 3.1 lists all elements of S exactly once and the list is given in
such a way that each row of the table lists all permutations of a certain
3-element subset of {a, b, c, d, e}. Since each 3-element permutation
appears exactly once, the rows of the table determine a partition of
the set S. From Problem 101 you know any partition is the set of
equivalence classes of some equivalence relation. Find the equivalence
relation for this partition of S.

Table 3.1: The 3-element permutations of {a, b, c, d, e} organized by rows
according to which 3-element set they permute.

abc  acb  bac  bca  cab  cba
abd  adb  bad  bda  dab  dba
abe  aeb  bae  bea  eab  eba
acd  adc  cad  cda  dac  dca
ace  aec  cae  cea  eac  eca
ade  aed  dae  dea  ead  eda
bcd  bdc  cbd  cdb  dbc  dcb
bce  bec  cbe  ceb  ebc  ecb
bde  bed  dbe  deb  ebd  edb
cde  ced  dce  dec  ecd  edc
110. Rather than restricting to n = 5 and k = 3, it is possible to partition
the set of all k-element permutations of an n-element set (which can
be assumed to be the set [n]) into equivalence classes.
Let S be the set of all k-element permutations of [n], and for $s_1, s_2 \in S$,
define
$$s_1 \, R \, s_2 \iff s_1 \text{ has the same elements as } s_2.$$
(a) Prove R is an equivalence relation on S.
(b) How many elements are in any equivalence class?
(c) What is the size of S? (You found this earlier. In which problem?)
(d) Write a carefully worded sentence that describes a bijection between
the set of equivalence classes of R and the set of k-element
subsets of [n].
(e) What formula does this give you for the number $\binom{n}{k}$ of k-element
subsets of an n-element set?

In the last problem sequence you proved the following formula.

Calculation of Binomial Coefficients

For any $k, n \ge 0$ with $k \le n$, the number of k-element subsets of an
n-element set is
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
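
If you would like to test the boxed formula on small cases, the following Python sketch compares it with a direct count of k-element subsets. The function names are illustrative only and are not part of these notes.

```python
from itertools import combinations
from math import factorial


def binomial_by_formula(n, k):
    """Evaluate n! / (k! (n - k)!) using exact integer arithmetic."""
    return factorial(n) // (factorial(k) * factorial(n - k))


def binomial_by_counting(n, k):
    """Count the k-element subsets of [n] = {1, ..., n} one at a time."""
    return sum(1 for _ in combinations(range(1, n + 1), k))


# Compare the two counts for every 0 <= k <= n <= 6.
for n in range(7):
    for k in range(n + 1):
        assert binomial_by_formula(n, k) == binomial_by_counting(n, k)
```

For n = 5 and k = 3 both functions return 10, which is the number of rows in Table 3.1.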

111. Use the above formula to find numerical answers to Problem 107.
112. You can write n as a sum of n ones.
(a) How many plus signs did you use in the sum?
(b) In how many ways can you write n as a sum of a list of k positive
numbers? Such a list is called a composition of the integer n
into k parts.
(c) Find the total number of compositions of n, that is, into any number of parts. Compare this with your solution to Problem 62.
113. The answer in Problem 112(b) can be expressed as a binomial coefficient. This means it should be possible to interpret a composition into
k parts as a subset of some set. Find a bijection between compositions
of n into k parts and certain subsets of some set. Be sure to explain
explicitly what your function is; that is, tell how to get the subset from
the composition.
114. Give a recurrence for the number of ways to divide 4n people into sets
of four for games of bridge. (Do not worry about how they sit around
the bridge table or who is the first dealer.)
115. A town has n street lights running along the north side of Main Street.
The poles on which they are mounted need to be painted so that they
do not rust. In how many ways may they be painted with red, white,
blue, and green if an even number of them are to be painted green?
Hint. First try for small n.
116. This is the third time you've seen this problem. Here, binomial coefficients should be used to construct a solution. A tennis club has 2n
members, and you want to pair up the members by twos for singles
matches.
(a) How many ways can you list all possible pairs of these 2n tennis
players, where each list consists of n unordered pairs?
(b) Define an equivalence relation on the set of lists from part (a) which
identifies lists which yield the same pairings for singles matches if
you do not care who serves first.
(c) Find the number of pairings of 2n people for singles matches if you
do not care who serves first.
You now have many methods for solving the following problem. Try to
solve it several ways. Do not settle for doing it just one way!
117. A tennis club has 4n members. To specify a doubles match, you choose
two teams of two people. In how many ways can you arrange the members into doubles matches so that each player is in one doubles match?
In how many ways can you do it if you also specify who serves first on
each team?

3.3.1 Pascal's Triangle

118. Let C be the set of all k-element subsets of [n] that contain the number n, and let D be the set of all k-element subsets of [n] that do not
contain n.
(a) Find the sets C and D for n = 5 and k = 2.
(b) Let $C'$ be the set of $(k-1)$-element subsets of $[n-1]$. Describe a
bijection from C to $C'$. (A verbal description is fine.)
(c) Let $D'$ be the set of k-element subsets of $[n-1]$. Describe a bijection
from D to $D'$. (A verbal description is fine.)
(d) Based on the two previous parts, express the sizes of C and D in
terms of binomial coefficients involving $n-1$.
(e) Apply the Sum Principle to C and D to obtain a formula that
expresses $\binom{n}{k}$ in terms of two binomial coefficients involving $n-1$.
This formula is a recurrence in the two variables n, k.
Write your solution to Problem 118 (e) in the following box:

Pascal's Equation

119. Let S(k, n) denote the number of partitions of the set [k] into n
non-empty blocks. Show that for $k \ge n > 1$, S(k, n) satisfies the
recurrence
$$S(k, n) = S(k-1, n-1) + n\,S(k-1, n).$$
Hint. In any partition of [k], the number k is either in a block by itself
or it is not.
You will return to S(k, n) in Section 7.2.
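
Before proving the recurrence in Problem 119, you may want to see it hold on small cases. The sketch below is a brute-force check, not a proof: it lists every partition of [k] and counts those with n blocks. The helper names `set_partitions` and `S` are illustrative choices.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` into non-empty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give `first` a new block of its own.
        yield [[first]] + partition


def S(k, n):
    """Number of partitions of [k] into exactly n non-empty blocks."""
    return sum(1 for p in set_partitions(list(range(1, k + 1))) if len(p) == n)


# Check the recurrence for all k >= n > 1 with k at most 7.
for k in range(2, 8):
    for n in range(2, k + 1):
        assert S(k, n) == S(k - 1, n - 1) + n * S(k - 1, n)
```

The enumeration grows very quickly with k, so this is only practical for small values.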
In Problem 118 (e) you derived Pascal's Equation, which is the basis
for the famous Pascal's Triangle, the triangle in Figure 3.3 whose rows
and columns are numbered so that the top row is the 0-th row and the initial
entry of a row is called the 0-th number in the row. Then the n-th row is
set up as follows: the number of k-element subsets of an n-element set is
the k-th number over in the n-th row. Your formula from Problem 118 (e)
does not say anything about $\binom{n}{k}$ when k = 0 or k = n, but otherwise it says
that each entry is the sum of the two that are above it and just to the left
or right.
Figure 3.3: Pascal's Triangle

                        1
                      1   1
                    1   2   1
                  1   3   3   1
                1   4   6   4   1
              1   5  10  10   5   1
            1   6  15  20  15   6   1
          1   7  21  35  35  21   7   1
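
The rule described above (each interior entry is the sum of the two entries above it, and each row begins and ends with 1) can be turned into a short program. This is a minimal sketch; the function name `pascal_rows` is an illustrative choice, and the 1's at the ends of each row are supplied separately because the sum rule does not cover them.

```python
def pascal_rows(last_row):
    """Return rows 0 through `last_row` of Pascal's Triangle as lists."""
    row = [1]
    rows = [row]
    for _ in range(last_row):
        # Interior entries are sums of adjacent entries of the previous row;
        # the entries at both ends are always 1.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
        rows.append(row)
    return rows


for r in pascal_rows(7):
    print(" ".join(str(entry) for entry in r))
```

Calling `pascal_rows(7)` reproduces the eight rows of Figure 3.3.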

120. Just for practice, use your formula to get the 8-th row of Pascal's
Triangle.
121. Without writing out any more complete rows, write enough of Pascal's
Triangle to get a numerical answer for the first question in Problem 9.
Try to do this as efficiently as possible. You should be able to get the
answer by writing down at most 10 numbers (including the answer).
122. Give an inductive proof of the Binomial Theorem, using the fact that
$(x + y)^n = (x + y)(x + y)^{n-1}$. In case you do not know the Binomial
Theorem, you can discover it as you work through the proof. Your goal
is to express $(x + y)^n$ as the sum of terms of the form
$$\text{something} \cdot x^k y^{n-k}$$
where for each k the "something" is a binomial coefficient.

3.3.2 Catalan numbers (Optional)

123. In part of a certain city, all streets run either north-south or east-west,
and there are no dead ends. Suppose you are standing on a street
corner. In how many ways can you walk to a corner that is four blocks
north and six blocks east, using as few blocks as possible?
124. The last problem has a geometric interpretation in a coordinate plane.
A lattice path in the plane is a sequence of line segments which either
go from a point (i, j) to the point (i + 1, j) or from a point (i, j) to the
point (i, j + 1), where i and j are integers. (Lattice paths always move
either up or to the right.) The path length is the number of such line
segments in the path.
(a) What is the length of a lattice path from (0, 0) to (m, n)?
(b) Show there are exactly $\binom{m+n}{n}$ lattice paths from (0, 0) to (m, n).
(c) How many lattice paths are there from (i, j) to (m, n), assuming
i, j, m, and n are all integers? Be careful: What happens to the
answer if i > m or j > n? Remember that paths go up or to the
right.
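
The count stated in Problem 124 (b) can be checked numerically before you prove it. The sketch below counts lattice paths directly from the definition (every path arrives at a point either from the left or from below) and compares the result with the binomial coefficient. The function name is an illustrative choice, and the check covers only small m and n.

```python
from math import comb


def count_lattice_paths(m, n):
    """Count lattice paths from (0, 0) to (m, n) using only right and up steps."""
    # paths[i][j] will hold the number of lattice paths from (0, 0) to (i, j).
    paths = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                paths[i][j] = 1  # the path with no steps
            else:
                from_left = paths[i - 1][j] if i > 0 else 0
                from_below = paths[i][j - 1] if j > 0 else 0
                paths[i][j] = from_left + from_below
    return paths[m][n]


# Compare with the binomial coefficient for small m and n.
for m in range(6):
    for n in range(6):
        assert count_lattice_paths(m, n) == comb(m + n, n)
```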
125. Admission to a school play requires a ten-dollar donation per person.
Assume that each person who comes to the play only pays for themselves and that the payment is made with either a ten-dollar bill or a
twenty-dollar bill. The teacher who is collecting the money forgot to
get change before the event. If there are always at least as many people
who have paid with a ten as a twenty as they arrive, the teacher will
not have to give anyone an IOU for change. Suppose 2n people come
to the play, and exactly half of them pay with ten-dollar bills.
(a) Describe a bijection between the set of sequences of tens and twenties given to the teacher and the set of lattice paths from (0, 0) to
(n, n). (Be sure to explain why your map is a function, one-to-one,
and onto.)
(b) What is the geometric interpretation of a sequence that does not
require the teacher to give any IOUs?

Notice that a lattice path from (0, 0) to (n, n) stays inside (or on the
edges of) the square whose sides are the x-axis, the y-axis, the line x = n
and the line y = n. It may or may not stay within or on the triangle whose
sides are the x-axis, the line x = n and the line y = x. Any lattice path that
does stay within this triangle is called a Catalan path. Figure 3.4 shows
the lattice points which form the triangle of interest for n = 4, where the
sides are the x-axis, the line x = 4 and the line y = x. The numbers given
in the figure are the number of Catalan paths to the indicated point. Check
that these numbers are correct.
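Figure 3.4: The Catalan paths from (0, 0) to (i, i) for i = 0, 1, 2, 3, 4. The
number of paths to the point (i, i) is written just above the point.
[Figure: the numbers of Catalan paths to (0, 0), (1, 1), (2, 2), (3, 3), and (4, 4) are 1, 1, 2, 5, and 14, respectively.]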
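
One way to check the numbers in Figure 3.4 is to count, for each lattice point in the triangle, the paths arriving from the left and from below while never leaving the triangle. The following sketch does this; the function name `catalan_path_counts` is an illustrative choice, and the code does nothing beyond the bookkeeping you could do by hand.

```python
def catalan_path_counts(n):
    """Count Catalan paths from (0, 0) to every point (i, j) with 0 <= j <= i <= n."""
    counts = {}
    for i in range(n + 1):
        for j in range(i + 1):  # stay on or below the line y = x
            if i == 0 and j == 0:
                counts[(i, j)] = 1
            else:
                # Points outside the triangle contribute no paths.
                from_left = counts.get((i - 1, j), 0)
                from_below = counts.get((i, j - 1), 0)
                counts[(i, j)] = from_left + from_below
    return counts


counts = catalan_path_counts(4)
print([counts[(i, i)] for i in range(5)])
```

The printed list holds the counts for the diagonal points (0, 0) through (4, 4), which can be compared with the figure.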

126. Let $\mathcal{P}$ be a lattice path from (0, 0) to (n, n) which is not Catalan.
(a) Show the path $\mathcal{P}$ must have at least one point on the line y = x + 1.
Let the point P be the point whose x-coordinate is least among all
these points of intersection with the line y = x + 1.
(b) Take the part of the path $\mathcal{P}$ which lies from (0, 0) to this point P, and
reflect it about the line y = x + 1. (That is, replace every upstep
with a step one unit to the left and every right step with a step
one unit down.) Show that this new path is a lattice path from
$(-1, 1)$ to (n, n). If you are having trouble for a general n, try it
on a few non-Catalan lattice paths from (0, 0) to (4, 4).
In Problem 126 you transformed a lattice path $\mathcal{P}$ from (0, 0) to (n, n)
which is not Catalan into a lattice path from $(-1, 1)$ to (n, n). This method
of construction is called the Feller Reflection Principle.
127. In Problem 126 you used the Feller Reflection Principle to show that
each non-Catalan path from (0, 0) to (n, n) determines a lattice path
from $(-1, 1)$ to (n, n). This process therefore defines a function $\phi$ from
the set of all non-Catalan paths from (0, 0) to (n, n) into the set of all
lattice paths from $(-1, 1)$ to (n, n).
(a) Check that $\phi$ is injective for the special case of n = 4. Write a
convincing proof that $\phi$ is injective for all n.
(b) Prove $\phi$ is a bijection.
(c) How many non-Catalan paths are there from (0, 0) to (n, n)?
(d) How many Catalan paths are there from (0, 0) to (n, n)? This is
called the Catalan number $C_n$.
Refer to Richard Stanley's website in the Applied Mathematics Department at MIT for at least sixty-six combinatorial interpretations of the Catalan numbers.

3.4 Ordered-functions and Multisets

Suppose you wish to place k distinct books (here distinct will always mean
that the objects are distinguishable one from the other) onto the shelves of
a bookcase with n shelves. Assume that the shelves are long enough so that
all of the books would fit on any of the shelves. Also, let us imagine that
after you are done putting books on the shelves, you push the books on a
shelf as far to the left as possible. This means that you are only thinking
about how the books sit relative to each other, not about the exact places
where you put any book. Since the books are all different, you can number
them as the first book, the second book, and so on.
128. (a) In how many ways can you place the first of the k books on one of
the n shelves?
(b) When you go to place the second book, if you decide to place it on
the shelf that already has a book does it matter if you place it to
the left or right of the book that is already there? In how many
ways can you place the second book into the bookcase?
(c) In how many ways can you place the i-th book into the bookcase?
(d) In how many ways can you place all k books into the bookcase?
129. Suppose you wish to place the k distinct books so that in addition to
the other requirements each shelf gets at least one book. Now in how
many ways can you place the books?
Hint. Do something before you start the process described in Problem 128.

The suggestion intended by the hint in the last problem was to consider
a two-step process. There are other ways to solve the problem. For instance,
imagine first lining up the k books in a row, giving $k-1$ spaces between
books. Choose $n-1$ of these spaces in which to slide a piece of paper as
a divider. Now put the books before the first divider on shelf one, and the
books after divider i on shelf i + 1. This gives an arrangement of the books
on the shelves so that every shelf has a book.
130. Use the method just described to solve Problem 129.
For any given arrangement of books in the bookcase, the assignment of
a book to the shelf on which it was put is a function from the set of books
to the set of shelves. But this function does not give all information since it
only records which shelf each book is on. It does not say which book sits to
the left of which others on the shelf, information which is an important part
of how the books are arranged on the shelves. In other words, the order
in which the shelves receive their books matters. So, in order to record all
relevant information for the arrangement, we must assign an ordered list of
books to each shelf. In these notes such a map will be called an ordered-function,
and the word is hyphenated because an ordered-function from S to
T is in general not a function from S to T. The phrase "ordered-function"
is not a standard one and in fact there is not yet a standard name for the
result of an ordered distribution problem.
More precisely, an ordered-function from a set S to a set T is a map
that assigns an ordered list of elements of S (books) to some elements of
T (bookshelves) in such a way that each element of S appears on one and
only one of the lists. An ordered-onto-function is an ordered-function
from S to T that assigns a list to every element of T . In Problem 128 you
counted the number of ordered-functions from [k] to [n] and in Problem 129
the number of ordered-onto-functions from [k] to [n].
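
To make the definition concrete, here is one possible way to represent an ordered-function in Python: a dictionary that assigns an ordered list of books to some of the shelves. The function names and the example data are hypothetical, and the sketch assumes that "assigns a list to every element of T" means a non-empty list, as in Problem 129.

```python
def is_ordered_function(assignment, S, T):
    """Check the definition of an ordered-function from S to T.

    `assignment` maps some elements of T (shelves) to ordered lists of
    elements of S (books). S is assumed to have no repeated elements.
    """
    if not set(assignment) <= set(T):
        return False
    placed = [s for shelf_list in assignment.values() for s in shelf_list]
    # Every element of S must appear on one and only one of the lists.
    return len(placed) == len(S) and set(placed) == set(S)


def is_ordered_onto_function(assignment, S, T):
    """An ordered-function in which every element of T receives a non-empty list."""
    return is_ordered_function(assignment, S, T) and all(
        len(assignment.get(t, [])) > 0 for t in T
    )


books = [1, 2, 3]
shelves = ["top", "bottom"]
arrangement = {"top": [2, 1], "bottom": [3]}
swapped = {"top": [1, 2], "bottom": [3]}

# The two arrangements put the same books on the same shelves,
# but they are different ordered-functions because the order differs.
print(is_ordered_onto_function(arrangement, books, shelves), arrangement != swapped)
```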
131. Let S be the set of all arrangements of k different books in an n-shelf
bookcase.
(a) Consider the following relation on S: Two arrangements are related
if and only if the two arrangements have the same number of books
on each shelf. Prove this relation is an equivalence relation.
(b) Each equivalence class has the same number of elements. Show
that this is true and find the number.
(c) Find the number of different equivalence classes, and write this
number as a binomial coefficient.
(d) Find the number of arrangements of k identical books on n shelves.
Explain.
132. Use an equivalence relation to count how many ways you can put k
identical books onto n distinct shelves if each shelf must get at least
one book.
A multiset chosen from a set S is a collection of elements from S in
which elements may be repeated. To determine a multiset you must say how
many times (allowing the possibility of zero times) each member appears in
the multiset.
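
As an illustration of this definition, a multiset chosen from [n] can be recorded by listing how many times each member of [n] appears. The sketch below uses Python's `collections.Counter` for the tally; the function name `multiset_from_list` is an illustrative choice.

```python
from collections import Counter


def multiset_from_list(items, n):
    """Record a multiset chosen from [n] as a map: member -> number of appearances."""
    if any(x not in range(1, n + 1) for x in items):
        raise ValueError("every item must be a member of [n]")
    counts = Counter(items)
    # Members that do not appear are recorded with multiplicity zero.
    return {member: counts[member] for member in range(1, n + 1)}


# The order in which the elements are listed does not matter.
print(multiset_from_list([2, 2, 4], 4))
print(multiset_from_list([4, 2, 2], 4) == multiset_from_list([2, 4, 2], 4))
```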
133. (a) Find a bijection between arrangements of k identical books on n
shelves and multisets chosen from [n].
(b) What is the number of multisets of size k that can be chosen
from [n]?
134. Multisets can be used to give a more elegant solution to the earlier
Problem 9: A group of three hungry members of the team in Problem 6
notices it would be cheaper to buy three pints of ice cream to share
among the three of them. In how many ways may they choose three
pints of ice cream with no restrictions on repeating flavors? Be sure to
use multisets to solve this problem.
135. How many solutions are there in nonnegative integers to the equation
$x_1 + x_2 + \cdots + x_m = r$, where m and r are fixed positive integers?
136. In how many ways can you distribute k identical objects to n distinct
recipients so that each recipient gets at least m objects? (Here you
assume that $k \ge mn$.)
137. Your answer to Problem 133 (b) is expressible as a binomial coefficient.
Since a binomial coefficient counts subsets, find a bijection between
subsets of something and multisets chosen from some set S.

3.5 The Existence of Ramsey Numbers (Optional)

In Chapter 1 the Ramsey Number R(m, n) was defined to be the smallest
number R such that if there are R people in a room, then there is a subset
of the set of people which has either at least m mutual acquaintances or at
least n mutual strangers. However, if you return to those earlier problems
(page 20) you'll see that you never proved Ramsey Numbers exist for general
m and n.

Provided you can show that there is some integer R such that for any
group of R people there are either m mutual acquaintances or n mutual
strangers, the Ramsey Number R(m, n) must exist, because it is the smallest
such R (and every nonempty set of positive integers has a least element).
In this section you will use mathematical induction to prove that $\binom{m+n-2}{m-1}$
is one such R. This simultaneously proves that Ramsey numbers exist and
that $R(m, n) \le \binom{m+n-2}{m-1}$. The question is, what should be inducted on, m
or n? In other words, do you use the fact that with $\binom{m+n-3}{m-2}$ people in a
room there are at least $m-1$ mutual acquaintances or n mutual strangers,
or do you use the fact that with at least $\binom{m+n-3}{n-2}$ people in a room there
are at least m mutual acquaintances or at least $n-1$ mutual strangers? It
turns out that both will be used. That is, it is helpful to induct on m and
n. One way to do that is to use yet another variation of induction, which
will be called the Principle of Double Mathematical Induction.

The Principle of Double Mathematical Induction

In order to prove that a sequence of statements S(m, n) indexed by integers
$m \ge a$ and $n \ge b$ are all true, you can do both of the following steps:
1. Prove the statement S(a, b) for the fixed integers a and b is true.
2. Show that the truth of the statement S(m, n) for all values of m and
n with $a + b \le m + n \le K$ implies the truth of the statement S(m, n)
for all pairs of integers m, n with m + n = K + 1.
Then the statement S(m, n) is true for all pairs of integers $m \ge a$ and
$n \ge b$.

138. (a) Use Double Induction and Pascal's Equation to prove that for any
choice of $\binom{m+n-2}{m-1}$ people in a room there are either at least m
mutual acquaintances or at least n mutual strangers.
(b) Prove that R(m, n) exists for every pair of integers $m, n \ge 1$ and
is no more than $\binom{m+n-2}{m-1}$.
139. Prove that $R(m, n) \le R(m-1, n) + R(m, n-1)$.
Hint. Begin by explicitly stating what you need to prove. You do not
need to use induction.
140. (a) What does the inequality in Problem 139 say about R(4, 4)?
(b) Consider 17 people arranged in a circle such that each person is
acquainted with the first, second, fourth, and eighth person to the
right and the first, second, fourth, and eighth person to the left.
Can you find a set of four mutual acquaintances? Can you find a
set of four mutual strangers?
(c) What is R(4, 4)?

Chapter 4

Graph Theory

4.1 Undirected Graphs

The idea of a directed graph (or digraph) was introduced in Section 1.3.
In this chapter you will work with undirected graphs, usually simply called
graphs, which consist of vertices and edges. Vertices and edges are described
in much the same way as points and lines are described in geometry, which
means that vertices and edges are taken as undefined objects. Although
graphs with an infinite number of vertices or an infinite number of edges
can be studied, here the sets of edges and vertices are both finite.
A graph consists of a finite set V called a vertex set and a finite set E
called an edge set. Each member of V is called a vertex (plural: vertices)
and each member of E is called an edge (plural: edges). Associated with
each edge are two (not necessarily different) vertices called its endpoints.
Representations of graphs are drawn using points to represent the vertices
and line segments to represent the edges. The edges can be curved if you
like, but you must be sure that each endpoint is a vertex.
Figure 4.1: Three ways to draw a complete graph K4 on four vertices.

A complete graph on n vertices consists of n points in the plane,
together with all possible edges connecting each pair of vertices. The notation
$K_n$ is used to represent a complete graph on n vertices, and Figure 4.1 gives
three different ways to draw $K_4$.

141. Let n be a positive integer.
(a) How many edges are in the complete graph Kn ? Explain.
(b) In how many ways may the edges of Kn be colored with two colors,
say red and blue?
In a graph it is possible to have an edge that connects a vertex to itself
(called a loop) and it is possible to have two or more edges between two
vertices. A graph that has no loops and no multiple edges is called a simple
graph.
142. Prove there are $2^{\binom{n}{2}}$ simple graphs on a set of n vertices.

Figure 4.2: Three different graphs

Figure 4.2 gives three pictures of graphs. Each gray circle in the figure
represents a vertex and each line segment represents an edge. You will note
that the vertices have been labelled, and in general you can choose to label
vertices or not. The degree of a vertex is the number of times it appears
as the endpoint of edges. For instance, the degree of y in the third graph in
the figure is 4.
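
The definition of degree can be made concrete with a small computation. In the sketch below a graph is stored as a list of vertices and a list of edges, each edge being a pair of endpoints. The example graph is hypothetical (it is not one of the graphs in Figure 4.2) and includes a multiple edge, a loop, and a vertex that is on no edge.

```python
def degrees(vertices, edges):
    """Return a map from each vertex to the number of times it is an endpoint."""
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1  # for a loop (u == v) the vertex is counted twice
    return degree


# Hypothetical example: two edges between a and b, an edge from b to c,
# a loop at c, and a vertex d that is not on any edge.
example_vertices = ["a", "b", "c", "d"]
example_edges = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "c")]
print(degrees(example_vertices, example_edges))
```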

143. Find the degree of each vertex in the graph on the left in Figure 4.2.
For each graph in Figure 4.2 is the number of vertices of odd degree
even or odd?
144. The sum of the degrees of the vertices of a (not necessarily simple)
graph is related in a natural way to the number of edges. In this
problem, you are asked to discover this relationship, but you do not
yet have to prove it.
(a) As a group, draw graphs with one edge, two edges, three edges,
and perhaps four edges, and calculate the sum of the degrees in
each graph. Conjecture a relationship between this sum and the
number of edges in the graph.
(b) What is the contribution of a given edge to the sum of the degrees?
How is the sum affected by deleting an edge (but not its endpoints)
from the graph?
145. Explain how a graph with E + 1 edges can be viewed as being constructed in a natural way from a (not necessarily unique) graph with
E edges.
146. Use induction on the number of edges to prove your conjecture in Problem 144.
In the last two problems you capitalized on the simple observation that
deleting an edge from a graph with E + 1 edges results in a graph with E
edges. In other words, the result in Problem 145 provided a useful inductive
process for a proof in Problem 146. A word of caution: You cannot assume
a subgraph with E edges inherits all properties of the original graph with
E + 1 edges. For instance, in the middle graph of Figure 4.2 every vertex
has even degree, but deleting any edge from the graph results in a graph
with two vertices of odd degree. In the graph on the left, the deletion of any
edge results in a subgraph which is no longer connected.
So, a graph can have properties which are not necessarily inherited by
its subgraphs. This affects the construction of proofs by mathematical induction: If the argument in your inductive step for graphs with E + 1 edges
relies on properties of a subgraph obtained by deleting an edge, then you
must prove that the graph with E edges possesses every property you use.
For the proof in Problem 146, the inductive assumption is: Every graph with
at most E edges has the property that the sum of the degrees of its vertices
equals twice the number of edges. Note this statement assumes no properties of the graph other than it has at most E edges. (For example, it might
have vertices of odd degree or it might be disconnected.) Because of this,
you are able to apply the inductive assumption to any subgraph obtained by
deleting an edge from the original graph with E + 1 edges without proving
the subgraph has any other properties. In general, you should exercise great
care when using mathematical induction to prove graph-theoretical facts.
147. In Problem 146 you used induction on the number of edges to prove
your conjecture. Now consider constructing a proof by induction on the
number of vertices. Is there an inherent inductive process? Thinking
through how you might construct such a proof, explain why inducting
on the vertices is more complicated (or at least messier) than inducting
on the edges.
148. Find a proof of your conjecture in Problem 144 that does not use induction.
149. (Refer to Problem 143.) What can you say about the number of vertices
of odd degree in a graph? Explain.
150. Given a set of people, consider the relation of being acquainted.
(a) For any nonempty set of people, define what is meant by an acquaintance graph on the set by describing its vertices and which
vertices are connected by edges.
(b) Using graph theory, prove that within any group with an odd number of people, there is at least one person who knows an even
number of people.

4.2 Trees

A walk in a graph is an alternating sequence $v_0 e_1 v_1 \ldots e_k v_k$ of vertices and
edges such that for all i the consecutive vertices $v_{i-1}, v_i$ are the endpoints
of the edge $e_i$. Note that the definition of the term walk does not require
all of the vertices $v_0, \ldots, v_k$ to be different; any number of them might be
the same. Likewise, there can be repetitions among the edges $e_1, \ldots, e_k$ in
a walk. The literature in graph theory is not consistent in defining a walk.
Notice that the definition used here allows a walk to have no edges, which
would be the lazy walk that remains at the initial vertex.
A graph is called connected if for any pair of vertices there is a walk
which starts at one vertex and ends at the other vertex.
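
The definition of connectedness suggests a direct test: start at one vertex and collect every vertex that some walk can reach. The sketch below does this with a breadth-first search. The function name and the small example graphs are hypothetical, the vertices are assumed to be distinct, and treating a graph with no vertices as connected is a convention chosen here.

```python
from collections import deque


def is_connected(vertices, edges):
    """Return True if every vertex can be reached by a walk from the first vertex."""
    vertices = list(vertices)
    if not vertices:
        return True  # convention chosen for this sketch
    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    reached = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return len(reached) == len(vertices)


print(is_connected([1, 2, 3], [(1, 2), (2, 3)]))  # a path on three vertices
print(is_connected([1, 2, 3], [(1, 2)]))          # vertex 3 is on no edge
```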
151. Which of the graphs in Figure 4.2 are connected?
152. A path in a graph is a walk with no repeated vertices. Find the longest
path in the third graph of Figure 4.2.
A cycle in a graph is a walk which has all of the following properties:
its first and last vertices are the same but it has no other repeated vertices;
it has at least one edge; it has no repeated edges.
153. Which graphs in Figure 4.2 have cycles? What is the largest number
of edges in a cycle in the second graph in Figure 4.2? What is the
smallest number of edges in a cycle in the third graph in Figure 4.2?
154. A connected graph with no cycles is called a tree. Which graphs (if
any) in Figure 4.2 are trees?
155. In a tree with n vertices, given two vertices $v_1, v_2$, how many paths
connect $v_1$ to $v_2$? Give a complete explanation.
156. Find all trees with at most 4 vertices. Give an argument which shows
that every tree with at least two vertices has at least one vertex of
degree 1.
157. For any tree with $n \ge 2$ vertices, remove one of its vertices of degree 1
and the edge containing that vertex (but do not remove the other endpoint of the edge). Prove that the graph that remains is a tree.
158. On the basis of your examples of trees with at most 4 vertices, make a
conjecture about the relationship between the number of vertices and
edges in a tree.
159. Prove your conjecture in Problem 158 by induction. Be sure to begin
by asking yourself what inductive process you can use. Is it easier
to induct on the number of vertices or the number of edges? Or on
something else?
160. A hydrocarbon molecule is a molecule whose only atoms are either hydrogen atoms or carbon atoms, and there is at least one carbon atom
and at least one hydrogen atom. In a simple molecular model of a hydrocarbon, a carbon atom will bond to exactly four other atoms, and
a hydrogen atom will bond to exactly one other atom. Such a model
is shown in Figure 4.3. A hydrocarbon compound can be represented
by a graph whose vertices are labelled with C's and H's where each C
vertex has degree four and each H vertex has degree one. A hydrocarbon is called an alkane if the graph is a tree. Common examples
are methane (natural gas), butane (one version of which is shown in
Figure 4.3), propane, hexane (ordinary gasoline), and octane (to make
gasoline burn more slowly).
(a) How many vertices are labelled H in the graph of an alkane with
exactly n vertices labelled C?
Figure 4.3: A model of a butane molecule.

(b) An alkane is called butane if it has exactly four carbon atoms. Why
was it said above that one version of butane is shown in Figure 4.3?
161. What is the minimum number of vertices of degree one in a tree with
$n \ge 2$ vertices? Prove that you are correct. See if you can find (and
give) more than one proof.

4.3 Labelled Trees and Prüfer Codes

Next you will explore the idea of labelled trees. Figure 4.4 gives all different labellings of a fixed tree with 3 vertices. Notice that the convention
for labelling the vertices of trees is that the tree which has edges between
vertices 1 and 2 and between vertices 2 and 3 is different from the tree that
has edges between vertices 1 and 3 and between vertices 2 and 3.
Figure 4.4: The three labelled trees on three vertices

162. How many labelled trees are there on the vertex set [2]? On the vertex
set [3]? How many labelled trees are there on four vertices? How
many labelled trees are there with five vertices? You do not have a
lot of data to formulate a guess, but try to guess a formula for the
number of labelled trees with vertex set [n]. When you get to four and
especially five vertices, draw all the unlabelled trees you can think of,
and then figure out in how many different ways you can put labels on
the vertices.
The next problems will develop a method for proving the formula you
just guessed in the last problem. In order to do this, an auxiliary sequence
is defined.
Given a tree with $n \ge 2$ vertices which has been labelled in any way using
the elements of [n], define the auxiliary sequence $b_1, b_2, \ldots$ in the following
inductive manner:
Step 1: If the tree has two vertices, the sequence consists of one term, the
larger label, which means the sequence is $b_1 = 2$. Otherwise, let $a_1$
be the lowest-numbered vertex of degree 1 in the tree. (How do you
know there is such a vertex?) Let $b_1$ be the label of the unique vertex
in the tree adjacent to $a_1$ and write down $b_1$. (Why is $b_1$ unique?) For
example, in the first graph in Figure 4.2, $a_1$ is 1 and $b_1$ is 2.
Step 2: Suppose $a_1$ through $a_{i-1}$ have already been identified, and let $a_i$ be
the lowest-numbered vertex of degree 1 in the tree you get by deleting
vertices $a_1$ through $a_{i-1}$ and all edges containing at least one of these
vertices. (How do you know the resulting graph is always a tree?) Let
$b_i$ be the unique vertex in this new tree adjacent to $a_i$. For example,
in the first graph in Figure 4.2, $a_2 = 2$ and $b_2 = 3$. Then $a_3 = 5$ and
$b_3 = 4$.
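
The two steps above can be carried out mechanically. The following Python sketch repeatedly finds the lowest-numbered vertex of degree 1, records its neighbour, and deletes it, stopping when a single vertex remains. The function name is an illustrative choice. Because Figure 4.2 is not reproduced here, the edge list in the example is an assumption: it was chosen to agree with the values quoted in the text ($a_1 = 1$, $b_1 = 2$, $a_2 = 2$, $b_2 = 3$, $a_3 = 5$, $b_3 = 4$), and it may not match the drawing in every detail.

```python
def auxiliary_sequence(n, edges):
    """Compute the sequence b_1, b_2, ... for a tree labelled with 1, ..., n."""
    neighbours = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    b = []
    while len(neighbours) > 1:
        # a_i: the lowest-numbered vertex of degree 1 in the current tree.
        a_i = min(v for v in neighbours if len(neighbours[v]) == 1)
        # b_i: the unique vertex adjacent to a_i.
        (b_i,) = neighbours[a_i]
        b.append(b_i)
        # Delete a_i together with the edge that contains it.
        neighbours[b_i].remove(a_i)
        del neighbours[a_i]
    return b


# Assumed edge list for a labelled tree on eight vertices (see the note above).
assumed_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (3, 7), (7, 8)]
print(auxiliary_sequence(8, assumed_edges))
```

The sketch assumes its input really is a tree; it makes no attempt to detect graphs with cycles or disconnected graphs.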
163. Use the letter B to stand for the sequence of $b_i$'s inductively obtained
in this way. Use your earlier work to answer the questions posed in the
above two-step algorithm.
164. For the tree (the first graph) in Figure 4.2, the sequence B is 2344378.
At this point, work with your group to draw some other labelled trees
on eight vertices and construct the sequence B associated with each
tree.
165. How long is the sequence B computed from a labelled tree with n
vertices?
166. From your examples, explain why you can always predict the last member of the sequence B. Explain.
167. Is it possible for $a_1$ to be in B? Can you tell from B what $a_1$ is? Now
use the sequence $b_2, \ldots, b_n$ to find $a_2$. Explain how the sequence B can
be used to find the sequence $a_1, \ldots, a_n$ (in order, of course).

For a labelled tree T , the associated sequence P (T ) := b1 , b2 , . . . , bn2 is


called a Pr
ufer coding or the Pr
ufer code of T . For instance, the Pr
ufer
code for the labelled tree T of Figure 4.2 is P (T ) = 234437. Notice that
the last term of B is not included in the Pr
ufer code because it is known to
be n.
Let $\mathcal{T}$ be the set of all labelled trees on nine vertices. For each tree
$T \in \mathcal{T}$, let P(T) be the Prüfer code for T. This defines a relation on the set
$\mathcal{T}$. Why is it a function with domain $\mathcal{T}$?
168. Find a co-domain for this function. (At this point you are not asked
to find the smallest co-domain.)
169. Play the following game in your group: In turn, each of you should
secretly write down a tree, determine its Prüfer code, and then share
the code with the whole group. The other members of the group then
should find all labelled trees that have your sequence as their Prüfer code.
How many labelled trees are found? What does your answer say about
the function P?
170. Now, as a group write down any sequence of seven integers from [9].
Try to find a tree $T \in \mathcal{T}$ for which your sequence is P(T). Do this for
several different sequences. Use this information to find the smallest
co-domain for the function P.
The idea of writing the last sequence of problems as a game originated
with the Fall 2003 Math 399 class at Oregon State.
You are now probably convinced that there is enough information in a
Prüfer code to in fact identify the tree. Now it is time to prove this fact.
First some notation. For fixed $n \ge 2$, let $\mathcal{T}$ be the set of all labelled trees
on n vertices, and S be all sequences with $n - 1$ elements chosen from [n] in
which the last element of the sequence is n.
171. Prove the function $\{ (T, P(T)) : T \in \mathcal{T} \}$ is a 1-1 function.
Hint: Use Problem 167.
172. Considering S to be its co-domain, prove the function $\{ (T, P(T)) : T \in \mathcal{T} \}$
is onto S, being careful to prove the pre-image is always a
tree.
173. Find the number of labelled trees with n vertices. Is this the formula
you conjectured earlier in Problem 162?
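For small n, the count can be checked by brute force. The sketch below is an illustration only and is no substitute for the proofs asked for in Problems 171 and 172. It assumes the usual way of rebuilding a tree from its code (repeatedly join the lowest-numbered remaining leaf to the next code entry); the names `tree_from_code` and `count_trees_by_decoding` are hypothetical.

```python
from itertools import product


def tree_from_code(code, n):
    """Rebuild the edge list of a labelled tree on 1..n from b_1, ..., b_{n-2}."""
    # A vertex appearing k times in the code is assumed to have degree k + 1.
    degree = {v: 1 for v in range(1, n + 1)}
    for b in code:
        degree[b] += 1

    edges = []
    for b in code:
        # The lowest-numbered vertex that is currently a leaf plays the role of a_i.
        a = min(v for v in degree if degree[v] == 1)
        edges.append((a, b))
        degree[a] -= 1  # a is now used up (degree 0)
        degree[b] -= 1
    # Exactly two vertices of degree 1 remain; they form the last edge.
    u, w = [v for v in degree if degree[v] == 1]
    edges.append((u, w))
    return edges


def count_trees_by_decoding(n):
    """Count the distinct trees obtained from all sequences in [n]^(n-2)."""
    trees = set()
    for code in product(range(1, n + 1), repeat=n - 2):
        trees.add(frozenset(frozenset(e) for e in tree_from_code(code, n)))
    return len(trees)


for n in range(2, 7):
    print(n, count_trees_by_decoding(n))
```

Comparing the printed counts with your conjecture from Problem 162 gives a sanity check before you write the proof.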
In addition to providing a way to count labelled trees, there is a good bit
of other interesting information encoded in the Prüfer code for a tree. You
can begin to see this by working the next two problems and the problems
in the optional section that follows them.
174. What can you say about the vertices of degree one from the Prüfer
code for a tree labelled with the integers from 1 to n; that is, what
vertex or vertices in the sequence b1 , b2 , . . . , b_{n-1} can have degree 1?
175. What can you say about the Prüfer code for a tree in which exactly
two vertices have degree 1? Does this characterize such trees?

4.3.1

More information from Prüfer codes (Optional)

176. What can you determine about the degree of the vertex labelled i from
the Prüfer code of the tree?
Hint. If a vertex has degree 1, how many times does it appear in the
Prüfer code of the tree? What about a vertex of degree 2?
177. What is the number of (labelled) trees on n vertices with three vertices
of degree 1? (Assume they are labelled with the integers 1 through n.)
Hint. How many vertices appear exactly once in the Prüfer code of the
tree and how many appear exactly twice?
178. How many labelled trees on n vertices have exactly four vertices of
degree 1?
179. The degree sequence of a graph is a list of the degrees of the vertices
in non-increasing order. For example the degree sequence of the first
graph in Figure 4.2 is (3, 3, 2, 2, 1, 1, 1, 1). For a graph with vertices
labelled 1 through n, the ordered degree sequence of the graph is
the sequence d1 , d2 , . . . , dn in which di is the degree of vertex i.
(a) How many labelled trees have n vertices and the ordered degree
sequence d1 , d2 , . . . , dn ?
(b) How many labelled trees have n vertices and a degree sequence in
which the degree d appears id times?

4.4

Monochromatic Subgraphs (Optional)

For a fixed positive integer m, recall $K_m$ denotes the complete graph on m
vertices. Here we consider the vertex set of $K_m$ to be labelled using the
elements of [m]. Let C be the set of all colorings of the edges of $K_m$ with
two colors, say red and blue. In Problem 141 you proved $|C| = 2^{\binom{m}{2}}$.

For $1 \le n \le m$, define the set $S := \{s : s \subseteq [m],\ |s| = n\}$. (What is
|S|?) For any coloring $c \in C$ and any $s \in S$, consider the subgraph of $K_m$
whose vertex set is s and whose edge set contains all the (colored) edges of
$K_m$ which connect pairs of vertices in s. This graph is a colored complete
graph on n vertices, and it will be denoted by K(c, s).
180. For m = 4, n = 2, find C and S. Work together as a group to find
K(c, s) for at least three choices of $(c, s) \in C \times S$.
181. Let m = 4.
(a) For n = 2, show that for every coloring $c \in C$ there exists $s \in S$
such that every edge in K(c, s) is the same color. Such subgraphs
are called monochromatic for the coloring c.
(b) Show that n = 2 is the largest possible integer which has the
property given in part (a).
Define mono(c, s) to be 1 if K(c, s) is monochromatic for the coloring c
and to be 0 otherwise.
182. Explain why
$$\frac{1}{2^{\binom{m}{2}}} \sum_{c \in C} \sum_{s \in S} \mathrm{mono}(c, s) = \frac{1}{2^{\binom{m}{2}}} \sum_{s \in S} \sum_{c \in C} \mathrm{mono}(c, s)$$
and why this common value is the average number of monochromatic
n-subgraphs, averaged over all colorings of $K_m$.
183. For a fixed subset $s \in S$, prove $\sum_{c \in C} \mathrm{mono}(c, s) = 2 \cdot 2^{\binom{m}{2} - \binom{n}{2}}$.
184. Show that your expression in Problem 182 equals $2 \binom{m}{n} 2^{-\binom{n}{2}}$.
185. Prove: If $\binom{m}{n} < 2^{\binom{n}{2} - 1}$ then there exists a coloring $c \in C$ such that no
K(c, s) is monochromatic.
Hint. Use Problem 184.
The very clever technique used in the last problem is a simple form of
the probabilistic method of Paul Erdős.
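For very small m the average in Problem 182 can be computed by listing every coloring. The Python sketch below is only an illustration under stated assumptions: the function name `average_monochromatic` and the color labels "R" and "B" are hypothetical, and n is assumed to be at least 2 so that each n-subset contains at least one edge.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def average_monochromatic(m, n):
    """Average number of monochromatic n-subgraphs over all 2-colorings of K_m."""
    edges = list(combinations(range(1, m + 1), 2))
    subsets = list(combinations(range(1, m + 1), n))
    total = 0
    # Each coloring assigns "R" or "B" to every edge of K_m.
    for colors in product("RB", repeat=len(edges)):
        coloring = dict(zip(edges, colors))
        for s in subsets:
            # mono(c, s) = 1 exactly when all edges inside s share one color.
            if len({coloring[e] for e in combinations(s, 2)}) == 1:
                total += 1
    return Fraction(total, 2 ** len(edges))


def closed_form(m, n):
    """The expression from Problem 184, written as an exact fraction."""
    return Fraction(2 * comb(m, n), 2 ** comb(n, 2))


print(average_monochromatic(4, 3), closed_form(4, 3))
print(average_monochromatic(5, 3), closed_form(5, 3))
```

The enumeration grows like $2^{\binom{m}{2}}$, so it is only practical for m up to about 6.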
The next problem sequence uses Problem 185 to obtain a lower bound
on the Ramsey Numbers R(n, n). These numbers were discussed earlier in
the optional Section 1.5 (on page 19ff). That earlier material is only used
in the next problem, which connects Ramsey Numbers with monochromatic
subgraphs.

186. Use the definition of the Ramsey Number R(n, n) to explain why the
following is true: If $R(n, n) \le m$, then for all colorings $c \in C$ there
exists $s \in S$ such that K(c, s) is monochromatic.
187. Explain why R(n, n) > m for all integers m such that $\binom{m}{n} < 2^{\binom{n}{2} - 1}$.
188. Prove: If $n \le m$ are positive integers such that $m < \sqrt[n]{2^{\binom{n}{2} - 1}\, n!}$ then
$\binom{m}{n} < 2^{\binom{n}{2} - 1}$.
Hint. Use the fact that $\binom{m}{n} \le \frac{m^n}{n!}$.
189. Prove that $R(n, n) > \sqrt[n]{2^{\binom{n}{2} - 1}\, n!}$.
In the last section of this chapter, you will show how the inequality
in Problem 189 can be used to obtain a prediction of the asymptotic size
of R(n, n).
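To get a feel for the size of the bound in Problem 189, it can be evaluated numerically. The sketch below is an illustration, not part of the notes: the function name `ramsey_lower_bound` is hypothetical, and logarithms (via `math.lgamma`) are used only to avoid overflow for larger n.

```python
from math import comb, exp, lgamma, log


def ramsey_lower_bound(n):
    """Evaluate the n-th root of 2^(C(n,2) - 1) * n! using logarithms."""
    # log(2^(C(n,2)-1) * n!) = (C(n,2) - 1) * log 2 + log n!
    log_value = (comb(n, 2) - 1) * log(2) + lgamma(n + 1)
    return exp(log_value / n)


for n in (3, 4, 5, 10, 20, 40):
    print(n, round(ramsey_lower_bound(n), 3))
```

Watching how the printed values grow as n doubles is a useful warm-up for the asymptotic questions at the end of the chapter.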

4.5

Spanning Trees

Many of the applications of trees arise from trying to find an efficient way
to connect all the vertices of a graph by a path. For example, in a telephone
network, at any given time there are a certain number of wires (or microwave
channels, or cellular channels) available for use. These wires or channels go
from one specific place to another specific place, and so the wires or channels
may be thought of as edges of a graph and the places where the wires connect
may be thought of as vertices of that graph. A tree whose vertices are all
of the vertices of a graph G and whose edges are some of the edges of
G is called a spanning tree of G. A spanning tree for a telephone
network gives a way to route calls between any two vertices in the network
that uses the minimum number of wires. For example, Figure 4.5 contains
all spanning trees of the graph on the far left of the figure.
190. As a group, draw an example of a connected graph which is not a
tree and has six vertices. List all of its spanning trees. (The question
of counting the number of spanning trees will be considered later in
Section 4.5.1.)
191. Explain why every connected graph has a spanning tree. It is possible
to find an explanation that starts with the graph and works down
towards the spanning tree and to find another explanation that starts
with just the vertices and works up towards the spanning tree. Try
to find both kinds.

Figure 4.5: A graph and all its spanning trees.

Our motivation for talking about spanning trees was the idea of finding
a minimum number of edges needed to connect all the vertices of a communication network together. In many cases the edges of a communication
network have costs associated with them. For example, one cell-phone operator might charge another one when a customer of one uses an antenna of
the other.
Suppose a company has offices in a number of cities and wants to put together a communication network connecting its various locations with high-speed communication lines, and to do so at minimum cost. This can be
modeled by a graph whose vertices are the cities in which it has offices and
whose edges represent possible communications lines between the cities. Of
course there will not necessarily be lines between each pair of cities; in fact
the company might not want to pay for a line connecting city i and city j if
it can already connect them indirectly by using other lines it has chosen.
192. Provide this company with a written description of a graph-theoretic
model of its problem.
You will want to choose a spanning tree of minimum cost among all
spanning trees of the communications graph. This special tree is often called
a minimal spanning tree (often abbreviated as MST) for the graph. For
this type of application, nonnegative numbers (called weights) are assigned
to the edges of the graph and the sum of the numbers on the edges of a
spanning tree is called the cost of the spanning tree.
193. (a) Put weights on the edges of your graph from Problem 190. Find a
minimal spanning tree for your graph. Can you find two?

(b) Draw another connected graph with weighted edges, and find a
minimal spanning tree for it.
194. Describe an inductive method (or better, two methods different in at
least one aspect) for finding a minimal spanning tree in a connected
graph whose edges are labelled with costs.
Hint. Think of forming the MST by carefully selecting one edge of the
tree at a time.
The method you designed in Problem 194 is called a greedy method,
because each time you made a choice of an edge you chose the least costly
edge available to you.
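One possible greedy method, offered here as a sketch rather than as the method you were expected to find, repeatedly takes the cheapest remaining edge that does not close a cycle. The Python below assumes this particular strategy; the function name `greedy_spanning_tree`, the `(weight, u, w)` edge format, and the small example graph are all hypothetical.

```python
def greedy_spanning_tree(n, weighted_edges):
    """Greedy spanning tree for a connected graph on vertices 1..n.

    weighted_edges is a list of (weight, u, w) triples.
    """
    # parent[] records which component each vertex currently belongs to.
    parent = list(range(n + 1))

    def find(x):
        # Follow parent links to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Consider the edges from least costly to most costly.
    for weight, u, w in sorted(weighted_edges):
        root_u, root_w = find(u), find(w)
        if root_u != root_w:
            # The edge joins two different components, so no cycle is created.
            parent[root_u] = root_w
            tree.append((u, w, weight))
    return tree


example = [(1, 1, 2), (4, 1, 3), (3, 2, 3), (2, 2, 4), (5, 3, 4)]
tree = greedy_spanning_tree(4, example)
print(tree, sum(weight for _, _, weight in tree))
```

A second greedy method, different in one aspect, would grow a single tree outward from a starting vertex instead of merging components.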

4.5.1

Counting the Number of Spanning Trees (Optional)

There are two operations on graphs which can be used to get a recurrence for
finding the number of spanning trees of a graph. Each operation is applied
to an edge e of a graph G.
The first operation is called deletion: you simply delete the edge e from
the graph by removing it from the edge set (without removing either of its
endpoints). Work through Figure 4.6 for an example of how a sequence of
edge deletions can be used to get a spanning tree.
Figure 4.6: Deleting two appropriate edges from this graph gives a spanning
tree.

The second operation is called contraction of an edge. Intuitively, you
contract an edge by shrinking its length until its endpoints coincide and
letting the rest of the graph go along for the ride. To be more precise, the
edge e with endpoints v and w is contracted as follows:
Step 1 remove from the edge set all edges having either v or w (or both)
for an endpoint;
Step 2 remove v and w from the vertex set;
Step 3 add a new vertex E to the vertex set;
Step 4 for each remaining vertex that had an edge removed in Step 1, add
an edge from the vertex to E;
Step 5 add an edge from E to E for any edge other than e whose endpoints
were in the set {v, w}.
195. Work through the examples in Figure 4.7 to get a better understanding
of the idea. You'll find that the wording for this process of contraction
is more complicated than the process.
Figure 4.7: The results of contracting three different edges in a graph.

The notation G \ e (read as "G minus e") will be used to represent the
graph that results from deleting the edge e from G, and G/e (read as "G
contract e") for the result of contracting the edge e from G.
196. How does the number of spanning trees of G not containing the edge e
relate to the number of spanning trees of G \ e? How does the number
of spanning trees of G containing e relate to the number of spanning
trees of G/e? Explain.
197. Use #(G) to represent the number of spanning trees of a graph G (so
that, for example, #(G/e) equals the number of spanning trees of G/e).
Find an expression for #(G) in terms of #(G/e) and #(G \ e). The
equation that results is called the deletion-contraction recurrence.
In what sense is it a recurrence?
198. Use the recurrence of the last problem repeatedly to show that the
graph in Figure 4.8 has twenty-one spanning trees.
+ 199. Describe an algorithm for counting the number of spanning trees in a
connected graph.
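One way such a counting algorithm might look is sketched below in Python. It assumes that the recurrence of Problem 197 takes the form #(G) = #(G \ e) + #(G/e) for an edge e that is not a loop, and it keeps parallel edges while discarding loops (a loop can never belong to a spanning tree). The function name `count_spanning_trees` and the edge-list representation are hypothetical, and the contracted vertex reuses the label v instead of a new name E.

```python
def count_spanning_trees(vertices, edges):
    """Count spanning trees of a multigraph by deletion and contraction."""
    vertices = frozenset(vertices)
    # Loops never appear in a spanning tree, so they can be dropped.
    edges = [(u, w) for (u, w) in edges if u != w]
    if len(vertices) == 1:
        return 1  # a single vertex has exactly one (empty) spanning tree
    if not edges:
        return 0  # several vertices but no edges: not connected

    (v, w), rest = edges[0], edges[1:]
    # Deletion: spanning trees that avoid the chosen edge.
    deleted = count_spanning_trees(vertices, rest)
    # Contraction: merge w into v, keeping any parallel edges that result.
    merged = [(v if a == w else a, v if b == w else b) for a, b in rest]
    contracted = count_spanning_trees(vertices - {w}, merged)
    return deleted + contracted


triangle = [(1, 2), (1, 3), (2, 3)]
five_cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(count_spanning_trees({1, 2, 3}, triangle))
print(count_spanning_trees({1, 2, 3, 4, 5}, five_cycle))
```

The running time is exponential in the number of edges, so this is suitable only for small graphs such as the ones drawn in class.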

Figure 4.8: A graph.

4.6

Finding Shortest Paths in Graphs

Suppose that a company has a main office in one city and regional offices in
other cities, and it happens that most of the communication in the company
is between the main office and the regional offices. The company would like
to find a spanning tree which minimizes not the total cost over all possible
communication links (all edges), but rather the total cost of communication
between the main office and each of the regional offices. The weighted
length of a path in the graph is the sum of the weights of its edges, and the
distance between two vertices is the least weighted length of any path between the two vertices. There are two optimization (actually minimization)
problems inherent here:
Given a vertex v, what is the distance between v and each other vertex?
Given a vertex v, can you find a spanning tree in G such that the length
of the path in the spanning tree from v to each vertex x is the distance
from v to x in G?
Consider the following inductive process, which is known as Dijkstra's
algorithm. The algorithm is applied to a simple weighted graph whose
vertices are labelled 1 to n.
Step 1 Let d(1) := 0. Let d(i) := $\infty$ for all other i.
Let v(1) := 1. Let v(j) := 0 for all other j.
For each i and j, let w(i, j) be the weight of the edge between i and j,
or $\infty$ if there is no such edge.
Let k := 1. Let t := 1.
Step 2 For each i, if d(i) > d(k) + w(k, i) let d(i) := d(k) + w(k, i).
Step 3 Among those i with v(i) = 0, choose one for which d(i) is a minimum, and let k := i. Increase t by 1. Let v(i) := 1.
Step 4 Repeat the previous two steps until t = n.
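The four steps translate almost line by line into code. The following Python sketch keeps the names d, v, w, k, and t from the steps above; the function name `dijkstra_distances`, the dictionary input format, and the sample graph are assumptions made for illustration.

```python
from math import inf


def dijkstra_distances(n, weights):
    """Follow Steps 1-4 for a simple weighted graph on vertices 1..n.

    weights maps a pair (i, j) to the weight of the edge between i and j.
    """
    # Step 1: initialise d, v, w, k and t (index 0 is unused).
    w = [[inf] * (n + 1) for _ in range(n + 1)]
    for (i, j), weight in weights.items():
        w[i][j] = w[j][i] = weight
    d = [inf] * (n + 1)
    d[1] = 0
    v = [0] * (n + 1)
    v[1] = 1
    k, t = 1, 1

    # Step 4: repeat Steps 2 and 3 until t = n.
    while t < n:
        # Step 2: try to improve each d(i) by going through vertex k.
        for i in range(1, n + 1):
            if d[i] > d[k] + w[k][i]:
                d[i] = d[k] + w[k][i]
        # Step 3: among unmarked vertices, pick one with the smallest d(i).
        k = min((i for i in range(1, n + 1) if v[i] == 0), key=lambda i: d[i])
        t += 1
        v[k] = 1
    return d[1:]


sample = {(1, 2): 7, (1, 3): 9, (1, 6): 14, (2, 3): 10, (2, 4): 15,
          (3, 4): 11, (3, 6): 2, (4, 5): 6, (5, 6): 9}
print(dijkstra_distances(6, sample))
```

To obtain a spanning tree as well, one could record, each time d(i) is lowered in Step 2, the vertex k that caused the improvement.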
200. As a group, draw two connected weighted graphs which are not trees
which have at least seven vertices and at least fourteen edges. Modify
Dijkstra's algorithm to find spanning trees in each of the graphs.
201. Show that at the end of Dijkstra's algorithm each d(i) equals the distance from vertex 1 to vertex i.
202. In every connected graph, is there always a spanning tree such that
for every vertex i, the distance from vertex 1 to vertex i given by the
algorithm in Problem 200 is the distance from vertex 1 to vertex i in
the tree?

4.7

Some Asymptotic Combinatorics (Optional)


While the formula for $\binom{n}{k}$ given in the box on page 49 is very useful, it does
not give a sense of how big the binomial coefficient $\binom{n}{k}$ is. You can get a very
rough idea, for example, of the size of $\binom{2n}{n}$ by recognizing that $(2n)^{\underline{n}}/n!$ can
be written as $\frac{2n}{n} \cdot \frac{2n-1}{n-1} \cdots \frac{n+1}{1}$, and each quotient is at least 2, so the product
is at least $2^n$. If this were an accurate estimate, it would mean the fraction
of n-element subsets of a 2n-element set would be about $2^n/2^{2n} = 1/2^n$,
which becomes very small as n becomes large. However, it is pretty clear
this approximation is not a very good one, because some of the terms in
that product are much larger than 2. In fact, if $\binom{2n}{k}$ were the same for every
k, then each would be the fraction $\frac{1}{2n+1}$ of $2^{2n}$. This is much larger than the
approximation $\frac{1}{2^n}$. But our intuition (and also Pascal's Triangle) suggests
that $\binom{2n}{n}$ is much larger than $\binom{2n}{1}$ and is likely larger than $\binom{2n}{n-1}$ so you can
be sure the approximation is a bad one. In order to make accurate estimates
of binomial coefficients, James Stirling developed a formula to approximate
n! when n is large, namely n! is about $\sqrt{2\pi n}\, n^n/e^n$. In fact the ratio of n!
to this expression approaches 1 as n becomes infinite.

Stirling's Formula

$$n! \sim \sqrt{2\pi n}\, \frac{n^n}{e^n},$$

which is read as "n! is asymptotic to $\sqrt{2\pi n}\, \frac{n^n}{e^n}$."
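The claim that the ratio approaches 1 is easy to observe numerically. The Python sketch below is an illustration only; the function name `stirling_ratio` is hypothetical, and the computation is done with logarithms (using `math.lgamma` for log n!) so that large n does not overflow.

```python
from math import exp, lgamma, log, pi


def stirling_ratio(n):
    """Return n! divided by sqrt(2*pi*n) * n^n / e^n, computed via logarithms."""
    log_factorial = lgamma(n + 1)  # log(n!)
    log_stirling = 0.5 * log(2 * pi * n) + n * log(n) - n
    return exp(log_factorial - log_stirling)


for n in (1, 5, 10, 100, 1000):
    print(n, stirling_ratio(n))
```

The printed ratios stay slightly above 1 and move toward 1 as n grows, which is the behavior the notation $\sim$ records.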

Proving Stirling's Formula requires more of a detour than is advisable
here. However, there is an elementary proof which you can work through in
the problems at the end of Section 1 of Chapter 1 of Introductory Combinatorics by Kenneth P. Bogart, Harcourt Academic Press (2000).

203. Use Stirling's Formula to show that the fraction of subsets of size n
in a 2n-element set is approximately $1/\sqrt{\pi n}$, which is a much bigger
fraction than $1/2^n$.
For the final problems in this chapter, you now return to the question of
calculating the asymptotic size of the Ramsey Numbers R(n, n). Here you
should restrict to large enough n so that Stirling's Formula is reasonably
accurate, accurate enough that you may replace n! by the approximation
given in Stirling's formula. This is not a tight argument; a proof requires
an $\varepsilon$-argument.
204. Use Stirling's Formula to convert (for large enough n) the upper bound
for R(n, n) in Problem 138(b) to an upper bound which is a multiple
of a power of 2.
205. Use Stirling's Formula to convert (for large enough n) the lower bound
for R(n, n) in Problem 189 to $\sqrt{2}^{\,n}$.
206. Show the Ramsey Numbers R(n, n) grow exponentially with n.

Chapter 5

Generating Functions
5.1

Using Pictures to Visualize Counting

Suppose you want to choose a snack of three pieces of fruit from among
apples, pears and bananas. Since choosing two or three of the same fruit
has not been precluded, all your choices can be symbolically represented as
[display: a sum of fruit pictures; images not recoverable]
(Why doesn't [picture not recoverable] appear?) Here a picture of a piece of fruit represents
taking a piece of that fruit. For instance, [apple] stands for taking an apple;
[apple][pear] represents taking an apple and a pear; and [apple][apple] for taking two apples.
In this representation you can think of the plus sign as standing for
the "exclusive or". For example, [apple] + [banana] would stand for "I take an apple
or a banana but not both." This similarity with mathematical notation is
extended by condensing the expression to
[display: the same sum written with exponents; images not recoverable]     (5.1)
where in this notation [apple]^3 stands for choosing three apples, while [apple]^2[banana]
represents a choice of two apples and a banana, and so on. What the notation
in (5.1) is really doing is giving a convenient way to list all three-element
multisets chosen from the set {[apple], [pear], [banana]}. This approach was inspired by
George Pólya's paper "Picture Writing," in the December 1956 issue of The
American Mathematical Monthly. While we are taking a somewhat more
formal approach than Pólya, it is still completely in the spirit of his work.
Suppose now that you plan to choose between one and three apples,
between one and two pears, and between one and two bananas, and that no
other restrictions are placed on the total number of fruit to be chosen. In a
somewhat clumsy way you could describe the fruit selections as
[display: a sum of products of powers of the apple, pear, and banana pictures, with dots for left-out terms; images not recoverable]     (5.2)

207. Using an A in place of the picture of an apple, a P in place of the
picture of a pear, and a B in place of the picture of a banana, write out
the entire expression intended in (5.2), that is, without any dots for
left-out terms. (You may use pictures instead of letters if you prefer,
but it gets tedious quite quickly!) Now expand the product
$(A + A^2 + A^3)(P + P^2)(B + B^2)$ and compare the result with your expression.
208. Substitute an x for each of A, P and B in the expression you found in
Problem 207. Expand the result in powers of x and give an interpretation of the coefficient of $x^n$.
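If you would like to check your hand expansion, the product in Problem 207 can be expanded mechanically. The Python sketch below is illustrative only; the function name `expand_fruit_product` and the string format of the terms are assumptions. It lists one monomial per choice of exponents and then groups the monomials by total degree, which is what substituting x for A, P, and B does.

```python
from collections import Counter
from itertools import product


def expand_fruit_product():
    """Expand (A + A^2 + A^3)(P + P^2)(B + B^2) term by term."""
    apple_powers, pear_powers, banana_powers = [1, 2, 3], [1, 2], [1, 2]
    choices = list(product(apple_powers, pear_powers, banana_powers))
    # One monomial for each way of picking a term from each factor.
    terms = [f"A^{a} P^{p} B^{b}" for a, p, b in choices]
    # Replacing A, P, B by x turns A^a P^p B^b into x^(a+p+b).
    by_total_degree = Counter(a + p + b for a, p, b in choices)
    return terms, by_total_degree


terms, by_total_degree = expand_fruit_product()
print(" + ".join(terms))
print(dict(sorted(by_total_degree.items())))
```

The second printed line gives, for each exponent n, the coefficient of $x^n$; comparing it with your answer to Problem 208 is a useful check.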
You saw that expanding
([apple] + [apple]^2 + [apple]^3)([pear] + [pear]^2)([banana] + [banana]^2)     (5.3)
gives the expression in (5.2). This means that (5.2) and (5.3) each describes
the number of multisets you can choose from the set {[apple], [pear], [banana]} in which
[apple] appears between one and three times, and [pear] and [banana] each appears once or
twice. Interpret (5.2) as describing each individual multiset you can choose,
and interpret (5.3) as saying that you first decide how many apples you
will take, and then decide how many pears to take, and then decide how
many bananas. At this stage it might seem a bit magical that doing ordinary
algebra with the second formula yields the first. In fact, by defining addition
and multiplication with these pictures more formally we could explain in
detail why things work out. This more formal exposition will not be given
here.
In the descriptions of the ways to choose fruit you've seen that the pictures of the fruit can be treated as if they were variables. In the theory of
generating functions (which will be developed in the next section), variables
or polynomials or even power series are associated with members of a set.
This is an adaptation of language introduced by George Pólya to describe
how to associate variables with the members of a set. A picture of a member of a set S means a variable, or perhaps a product of powers of variables
or even a polynomial in the variables. A function P that assigns a picture
P(s) to each member $s \in S$ will be called a picture function. The picture
enumerator for a picture function P defined on a set S will be the sum of
the pictures of the elements in S, which will be written symbolically as
$$E_P(S) = \sum_{s \in S} P(s) .$$

Using this terminology in the original problem: If S is the set of the
three fruit, then A is the picture of an apple, and A + P + B is the picture
enumerator of the picture function on S. Likewise, when S is the set of all
multisets of fruit which have from one to three apples, one to two pears, and
one to two bananas, then in Problem 207 you found the picture enumerator
for S. This language has been chosen because the picture enumerator lists
(that is, enumerates) all elements of S according to their pictures.
209. The product $A^2 P^3$ represents taking two apples and three pears, which
means choosing the picture of the ordered pair (2 apples, 3 pears) to
be the juxtaposition [apple]^2 [pear]^3, the product of the pictures of a multiset
of two apples and a multiset of three pears.
Show that if $S_1$ and $S_2$ are sets with picture functions $P_1$ and $P_2$ defined
on them, and if the picture of an ordered pair $(x_1, x_2) \in S_1 \times S_2$ is
defined to be $P((x_1, x_2)) = P_1(x_1) P_2(x_2)$, then the picture enumerator
of P on the set $S_1 \times S_2$ is $E_{P_1}(S_1) E_{P_2}(S_2)$. This is called the Product
Principle for Picture Enumerators.
210. Use the Product Principle for Picture Enumerators to explain why (5.2)
and (5.3) are equal.
211. What should be the picture of taking no apples? Find a polynomial in
the variable A that says you may take between zero and three apples.
Hint. If $A^1$ is the picture of taking one apple and $A^2$ is the picture
of taking two apples, what would make a good picture of taking no
apples?
212. Write a picture enumerator that says you can take between zero and
three apples, between zero and three pears, and between zero and three
bananas.
213. Suppose you want to choose a snack of between zero and three apples,
between zero and three pears, and between zero and three bananas.
(a) Write a polynomial in one variable x in which the coefficient of $x^n$
is the number of ways to choose a snack with n pieces of fruit.
(b) Suppose an apple costs 20 cents, a banana costs 25 cents, and a
pear costs 30 cents. What should you substitute for A, P, and B in
Problem 212 in order to get a polynomial in which the coefficient
of $x^n$ is the number of ways to choose a selection of fruit that costs
n cents?
(c) Suppose an apple has 40 calories, a pear has 60 calories, and a
banana has 80 calories. What should you substitute for A, P,
and B in Problem 212 in order to get a polynomial in which the
coefficient of $x^n$ is the number of ways to select fruit with a total
of n calories?
214. In this problem, you want to choose a subset of the set [n]. For each i
from 1 to n, use $x_i$ to be the picture of choosing i to be in the subset.
(a) What is the picture enumerator for either choosing i or not choosing i to be in the subset?
(b) What is the picture enumerator for all possible choices of subsets
of [n]? What should be substituted for $x_i$ in order to get a polynomial in x such that the coefficient of $x^k$ is the number of ways
to choose a k-element subset of [n]? Explain.
(c) You have just proved a special case of what theorem?

5.1.1

Pictures of trees (Optional)

In the following exercises, a tree with n vertices will always be considered
to have its vertices labelled by the elements of [n]. For such a tree, define the
picture of the vertex i to be $x_i$, and the picture of the edge with endpoints
$x_i$ and $x_j$ to be $x_i x_j$. Then the picture of the tree T is defined to be the
product
$$P(T) = \prod_{\{i,j\} \in \mathcal{T}} x_i x_j \qquad (5.4)$$
where $\mathcal{T} = \{\, \{i, j\} : i \text{ and } j \text{ are connected by an edge in the tree } T \,\}$.


215. Draw a tree with seven vertices, and find its picture. Show that the
picture you found can be re-written as $\prod_{i=1}^{7} x_i^{\deg(i)}$. Do this for several
examples. Explain why for any tree T with n vertices, its tree picture
P(T) can be re-written as $\prod_{i=1}^{n} x_i^{\deg(i)}$.
216. For each $n \ge 1$, let $S_n$ be the set of all trees with vertex set [n]. For
each tree $T \in S_n$ use the picture P(T) given in (5.4). Find the picture
enumerators $E_P(S_n)$ for each of n = 2, 3, 4. In each case, factor the
polynomials as completely as possible.
217. Explain why $x_1 x_2 \cdots x_n$ is always a factor of the picture of any tree on
n vertices.

218. (a) Write down the picture of a tree on five vertices with one vertex
of degree four, say vertex i.
(b) If a tree on five vertices has a vertex of degree three, what are the
possible degrees of the other vertices? What can you say about
the picture of a tree with a vertex of degree three?
(c) If a tree on five vertices has no vertices of degree three or four,
what can you say about the picture of the tree?
(d) Write down the picture enumerator for all trees on five vertices.
Hint. Remember the formula involving degrees and edges.
219. As above, for $n \ge 1$ let $S_n$ be the set of all trees with vertex set [n].
Prove that the picture enumerator $E_P(S_n)$ equals
$$E_P(S_n) = x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2} .$$
220. The enumerator for trees by degree sequence is the sum over all trees
of $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$, where $d_i$ is the degree of the vertex i. Explain why
$x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2}$ is the enumerator by degree sequence
for trees on the vertex set [n].
221. Find the number of labelled trees on n vertices and prove your formula
is correct. (You also established this formula using Prüfer codes in
Section 4.3. Since you most likely used some results from that section
in the proof of Problem 219, this new proof is not entirely independent
of Prüfer codes.)

5.2

Generating Functions

5.2.1

Generating polynomials

In your solution to Problem 214 you saw that the process of expanding the
polynomial $(1 + x)^n$ as given in the Binomial Theorem can be thought of as
a way of generating the binomial coefficients $\binom{n}{k}$ as the coefficients of $x^k$
in the expansion of $(1 + x)^n$. For this reason, $(1 + x)^n$ is called the generating
polynomial for the binomial coefficients $\binom{n}{k}$.
More generally, the generating polynomial for a finite sequence $a_0, \ldots, a_n$
is the polynomial $\sum_{i=0}^{n} a_i x^i$. In Problem 213(a) you converted the picture
enumerator for selecting between zero and three each of apples, pears, and
bananas to the generating polynomial of the finite sequence $a_0, \ldots, a_n$ in
which $a_i$ is the number of such fruit snacks which contain i pieces of fruit.
When you substituted $x^c$ for each fruit picture (where c is the number of
calories in that particular kind of fruit), the resulting polynomial was the
generating polynomial for the number of fruit selections with i calories. Also
remember that the original picture enumerator was obtained by multiplying
three picture enumerators:
$1 + A + A^2 + A^3$,     $1 + P + P^2 + P^3$,     $1 + B + B^2 + B^3$.

When $x^c$ is substituted for each fruit picture where c is now the cost of the
fruit (as in Problem 213(b)), these picture enumerators become
$1 + x^{20} + x^{40} + x^{60}$,     $1 + x^{30} + x^{60} + x^{90}$,     $1 + x^{25} + x^{50} + x^{75}$,

where in each case the coefficient of $x^i$ gives the number of selections of
that particular fruit which cost i cents. The Product Principle of Picture
Enumerators therefore translates directly into a Product Principle for Generating Polynomials. Before stating this principle, note that in each of the
above instances there was:
a finite set S of possible fruit selections (for instance, from zero to three
apples);
an associated value function defined from S to the nonnegative integers (for instance, the cost of the fruit selection or the number of
calories in the fruit selection);
and a polynomial that is the generating polynomial for the number of
elements $s \in S$ which have the value i. This polynomial will be called
the generating polynomial associated with the value.

The Product Principle for Generating Polynomials

Let $S_1$, $S_2$ be finite sets with value functions $v_1$, $v_2$. If $G_i(x)$
is the generating polynomial associated with the value $v_i$
then the coefficient of $x^k$ in the polynomial $G_1(x) G_2(x)$ is
the number of ordered pairs $(s_1, s_2) \in S_1 \times S_2$ such that
$v_1(s_1) + v_2(s_2) = k$.
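The principle can be watched in action on the three cost polynomials quoted earlier in this section. The Python sketch below is an illustration, not part of the notes: the dictionary representation (exponent mapped to coefficient) and the names `multiply` and `fruit_cost_polynomial` are assumptions.

```python
def multiply(p, q):
    """Multiply two polynomials stored as {exponent: coefficient} dictionaries."""
    result = {}
    for i, a in p.items():
        for j, b in q.items():
            # x^i * x^j contributes a*b to the coefficient of x^(i+j).
            result[i + j] = result.get(i + j, 0) + a * b
    return result


def fruit_cost_polynomial():
    """Product of the cost polynomials for 0-3 apples, pears, and bananas."""
    apples = {0: 1, 20: 1, 40: 1, 60: 1}    # 1 + x^20 + x^40 + x^60
    pears = {0: 1, 30: 1, 60: 1, 90: 1}     # 1 + x^30 + x^60 + x^90
    bananas = {0: 1, 25: 1, 50: 1, 75: 1}   # 1 + x^25 + x^50 + x^75
    return multiply(multiply(apples, pears), bananas)


g = fruit_cost_polynomial()
for cost in (45, 60, 100):
    print(cost, g.get(cost, 0))
```

Each printed coefficient counts the fruit selections whose total cost is the given number of cents, so the values can be confirmed by listing the selections by hand.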

5.2.2

Generating functions

Generating functions are also defined for infinite sequences, and they are
defined in such a way that the generating function for an infinite sequence
$\{a_i\}_{i \ge 0}$ with only finitely many nonzero terms (say $a_i = 0$ for all i > N)
is the same as the generating polynomial for the finite sequence $a_0, \ldots, a_N$.
The generating function for $\{a_i : i \ge 0\}$ is the expression $\sum_{i=0}^{\infty} a_i x^i$,
a formal power series. For formal power series, the series is simply a
convenient way of representing the terms of sequences that interest us. You
will see that they are convenient for our purposes because the sum and
product of formal power series are defined in a way that captures properties
of the sequences that are important in discrete mathematics. The sum of
two series is defined to be coefficient-wise addition; that is,

Addition of Formal Power Series

$$\left( \sum_{i=0}^{\infty} a_i x^i \right) + \left( \sum_{j=0}^{\infty} b_j x^j \right) = \sum_{k=0}^{\infty} (a_k + b_k) x^k .$$

Before defining multiplication of two formal power series, remember that


in calculus (and in analysis in general) you are interested in whether or not
a power series is a function, and so in analysis it is important to know for
what values of x the power series converges. On the other hand, in discrete mathematics power series can be purely formal objects, which means
that even though you use the phrase "generating function," the power series
is not required to actually represent a function and so there is no need to
worry about convergence. As a historical aside: Before settling on the
current definition of the word "function," the word evolved through several
meanings, starting with very imprecise meanings and ending with the current definition. The terminology "generating function" may be thought of
as an example of one of the earlier uses of the term "function." Now on to
multiplication.
222. (a) What is the coefficient of $x^2$ in the polynomial
$(a_0 + a_1 x + a_2 x^2)(b_0 + b_1 x + b_2 x^2 + b_3 x^3)$?
What is the coefficient of $x^4$?
(b) In part (a), why is there a $b_0$ and a $b_1$ in your expression for the
coefficient of $x^2$ but there is not a $b_0$ or a $b_1$ in your expression for
the coefficient of $x^4$?
(c) What is the coefficient of $x^4$ in
$(a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4)(b_0 + b_1 x + b_2 x^2 + b_3 x^3 + b_4 x^4)$?
Express this coefficient in the form
$$\sum_{i=0}^{4} \text{something},$$
where the "something" is an expression you need to figure out.

223. The point of Problem 222 is that when the sequences $\{a_i\}$ and $\{b_j\}$ are
finite (or, equivalently for our purposes, when $\{a_i\}$ and $\{b_j\}$ are infinite
sequences with $a_i = 0$ for i > n and $b_j = 0$ for j > m), then there is a
very nice formula for the coefficient of $x^k$ in the product
$$\left( \sum_{i=0}^{n} a_i x^i \right) \left( \sum_{j=0}^{m} b_j x^j \right) .$$
Write this formula explicitly and justify your conclusion.
224. Assuming that the rules of polynomial arithmetic apply to formal power
series, write down a formula for the coefficient $c_k$ of $x^k$ in the product
$$\left( \sum_{i=0}^{\infty} a_i x^i \right) \left( \sum_{j=0}^{\infty} b_j x^j \right) .$$

The expression you obtained in Problem 224 defines the product of formal
power series. That is, the product of two formal power series $\sum_{i=0}^{\infty} a_i x^i$ and
$\sum_{j=0}^{\infty} b_j x^j$ is the unique power series $\sum_{k=0}^{\infty} c_k x^k$, where $c_k$ is the expression
you found in Problem 224. For convenience of referral, write the correct
coefficient $c_k$ in the following formula:

Multiplication of Formal Power Series

$$\left( \sum_{i=0}^{\infty} a_i x^i \right) \left( \sum_{j=0}^{\infty} b_j x^j \right) = \sum_{k=0}^{\infty} \Big[ \qquad\qquad \Big] x^k .$$

Since your expression for the product of two formal power series was
derived using usual polynomial algebra, it should not be surprising that
multiplication of formal power series satisfies the usual rules of polynomial
algebra, such as the associative law and the commutative law. We could
explicitly state these rules and prove that they are all valid for addition
and multiplication of formal power series. However, for the purposes of this
book that is excessive, except to point out that the polynomial 1 is the
multiplicative identity. (Verify that is true.)
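The following Python sketch is one way to experiment with these rules on polynomials. It is an illustration only, not part of the original notes: a polynomial is stored as a list of coefficients (index = power of x), the function name `multiply` is made up for this example, and the sample polynomials are arbitrary. The product is formed by distributing every term of one factor over every term of the other, exactly as in ordinary polynomial algebra.

```python
def multiply(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            # The term a_i x^i times the term b_j x^j contributes to x^(i+j).
            result[i + j] += a_i * b_j
    return result


p = [1, 2]       # 1 + 2x
q = [3, 0, 1]    # 3 + x^2
r = [1, 1, 1]    # 1 + x + x^2

print(multiply(p, q))                                              # [3, 6, 1, 2]
print(multiply(p, q) == multiply(q, p))                            # commutative law
print(multiply(multiply(p, q), r) == multiply(p, multiply(q, r)))  # associative law
print(multiply([1], q) == q)                                       # 1 is the identity
```

Checking a few examples is of course not a proof, but it can suggest what the general coefficient formula has to look like.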
225. Use the definition of multiplication to find the product
$(1 - x) \sum_{k=0}^{n} x^k$ and the product $(1 - x) \sum_{k=0}^{\infty} x^k$.
226. A formal power series $\sum_{i=0}^{\infty} a_i x^i$ is said to be invertible with
inverse $\sum_{j=0}^{\infty} b_j x^j$ if the product equals the identity 1. Show that $1 - x$
is invertible, and that $x + x^2$ is not invertible. Determine which power
series are invertible. Try to find a condition which is easy to verify.
The usual notation for inverse will be used here: The inverse of
$\sum_{i=0}^{\infty} a_i x^i$ is written as
\[ \left( \sum_{i=0}^{\infty} a_i x^i \right)^{-1} . \]
What is $(1 - x)^{-1}$; that is, what is the inverse of the formal power series $1 - x$?
Because the algebra of generating functions is the same whether the sequence is finite or infinite, the Product Principle for Generating Polynomials
(as given on page 81) can be shown to hold for all generating functions, and
mathematical induction can be used to extend this principle from two sets
to any finite number of sets. (Refer to Section 5.2.3.)

The Product Principle for Generating Functions


Suppose each of the sets $S_1, S_2, \ldots, S_n$ has a value function defined
from the set to the nonnegative integers. For each $i$, let $G_i(x)$ be the
generating function associated with the value on the set $S_i$. Then the
generating function for the number of n-tuples of each possible total
value is the product
\[ G(x) = G_1(x) G_2(x) \cdots G_n(x) . \]
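A small numerical experiment can make this principle concrete. The Python sketch below is an illustration under assumed data: the three value lists are hypothetical, and all function names are invented for this example. It builds each generating function from a list of values, multiplies them as polynomials, and compares the result with a direct count of the triples grouped by total value.

```python
from collections import Counter
from itertools import product


def generating_coefficients(values):
    """Coefficient list: entry k counts the elements whose value is k."""
    counts = Counter(values)
    return [counts[k] for k in range(max(values) + 1)]


def multiply(a, b):
    """Multiply two polynomials given as coefficient lists."""
    result = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            result[i + j] += a_i * b_j
    return result


def product_of_generating_functions(value_lists):
    """Coefficients of G_1(x) G_2(x) ... G_n(x)."""
    G = [1]
    for values in value_lists:
        G = multiply(G, generating_coefficients(values))
    return G


def count_tuples_by_total(value_lists):
    """Count the n-tuples directly, grouped by their total value."""
    totals = Counter(sum(choice) for choice in product(*value_lists))
    return [totals[k] for k in range(max(totals) + 1)]


# Hypothetical data: each inner list holds the values of the elements of one set.
value_lists = [[1, 2, 3, 4], [0, 2, 2, 5], [1, 1, 3]]

print(product_of_generating_functions(value_lists))
print(count_tuples_by_total(value_lists))
```

The two printed lists should agree entry by entry; trying other value lists is a quick way to test your understanding of why.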

227. Suppose once again that $i$ is an integer between 1 and $n$. In Problem 225
you encountered the formal power series $\sum_{k=0}^{\infty} x^k$ in which the
coefficient of every $x^k$ is 1, an example of a geometric series. In
this problem it will be useful to interpret this series as a generating
function in which the coefficient 1 is the number of multisets of size $k$
chosen from the singleton set $\{i\}$. Namely, there is only one way to
choose a multiset of size $k$ from $\{i\}$: choose $i$ exactly $k$ times. Express
the generating function in which the coefficient of $x^k$ is the number of
$k$-element multisets chosen from $[n]$ as a power of another power series.
What does Problem 133 tell you about what this generating function
equals?
228. Express the generating function for the number of multisets of size k
chosen from [n] (where n is fixed but k can be any nonnegative integer)
as the inverse of something relatively simple. Check your expression is
consistent with your solution to Problem 225, where you answered this
question for n = 1.
For future reference, fill in the coefficients in the following power series
representation:
\[ (1 - x)^{-n} = \sum_{k=0}^{\infty} \Big[ \qquad \Big] x^k . \]

229. Use the above formula to write the inverse of $(1 - x)^2$ as a formal power
series. Comparing it to $(1 - x)^{-1}$, does this give you any insight into
what might be called the formal derivative of a power series? Explain.
(Again note that this differentiation process is referred to as formal
since no underlying limit process has been established.)
230. (a) Write down the generating function for the number of ways to
distribute identical pieces of candy to n = 3 children so that no
child gets more than 4 pieces.
(b) Using the fact that
\[ (1 - x)(1 + x + x^2 + \cdots + x^4) = 1 - x^5 , \]
write your generating function from part (a) as a quotient of polynomials.
(c) Under the restrictions given in part (a), use the information from
the last part to calculate how many ways you can pass out exactly
ten pieces of candy to the three children.
231. Let m and n be fixed nonnegative integers. Express the generating
function for the number of k-element multisets of an n-element set
such that no element appears more than m times as a quotient of two
polynomials. Use this expression to get a formula for the number of
k-element multisets of an n-element set such that no element appears
more than m times.
232. Let n, j, and m be positive integers with j < m.
(a) What is the generating function for the number of multisets chosen
from an n-element set so that each element appears at least j times
and less than m times?
(b) Write the generating function from part (a) as a quotient of polynomials.
(c) Write the quotient from part (b) as the product of a polynomial
and a power series.
233. Let n be a fixed positive integer. Suppose there is an unlimited supply
of identical pieces of candy. What is the generating function for the
number of ways to pass out k pieces of candy to n children in such a way
that each child gets between three and six pieces of candy (inclusive)?
Use generating functions to find a formula for the number of ways to
pass out k pieces of candy.
234. Suppose you have some chairs which you are going to paint with red,
white, blue, green, yellow and purple paint. Suppose that you may
paint any number of chairs red or white, at most one chair blue, at
most three chairs green, only an even number of chairs yellow, and
only a multiple of four chairs purple. In how many ways can you paint
k chairs?
Hint. It is useful to write each factor in the product as a quotient of
polynomials and then do some cancellation (that is, use inverses).

5.2.3 Product Principle for Generating Functions (Optional)

235. (Here is an outline of a proof of the Product Principle for Generating
Functions which does not rely on the Product Principle for Picture
Enumerators.) Suppose that you have two sets $S_1$ and $S_2$. Let $v_1$
(here $v$ stands for value) be a function from $S_1$ to the nonnegative
integers and let $v_2$ be a function from $S_2$ to the nonnegative integers.
Define a new function $v$ on the set $S_1 \times S_2$ by $v(x_1, x_2) = v_1(x_1) + v_2(x_2)$.
Suppose further that $\sum_{i=0}^{\infty} a_i x^i$ is the generating function for
the number of elements $x_1$ of $S_1$ of value $i$, that is, with $v_1(x_1) = i$.
Suppose also that $\sum_{j=0}^{\infty} b_j x^j$ is the generating function for the number
of elements $x_2$ of $S_2$ of value $j$, that is, with $v_2(x_2) = j$. Prove that the
coefficient of $x^k$ in
\[ \left( \sum_{i=0}^{\infty} a_i x^i \right) \left( \sum_{j=0}^{\infty} b_j x^j \right) \]
is the number of ordered pairs $(x_1, x_2)$ in $S_1 \times S_2$ with total value $k$, that
is, with $v_1(x_1) + v_2(x_2) = k$. This is called the Product Principle
for Generating Functions.
Hint. If this problem appears difficult, the most likely reason is because
the definitions are all new and symbolic. Focus on what it means
for $\sum_{k=0}^{\infty} c_k x^k$ to be the generating function for ordered pairs of total
value $k$. In particular, how do we get an ordered pair with total value
$k$? What do we need to know about the values of the components of
the ordered pair?
236. Use mathematical induction to prove the Product Principle for Generating Functions (as given on page 84).

5.3 Solving Recurrences with Generating Functions

In this section you will learn how generating functions can be used to obtain
a closed formula for the solution of recurrences. Such a formula is useful
because it allows easier calculation of specific terms in the solution sequence.
For instance, suppose the recurrence $a_i = 3a_{i-1} + 3^i$ were used to model a
certain population of bacteria, where $a_i$ is the number of bacteria after $i$
hours. An explicit formula for $a_i$ (involving no previous terms) would be
very handy, since it would not be practical to use the recurrence itself to
compute the size of the colony after many hours (why?).
Using pencil and paper, carry out the following algebraic manipulation of
the recurrence $a_i = 3a_{i-1} + 3^i$. First, multiply both sides
of the recurrence by $x^i$ and then sum both the left-hand side and right-hand
side from $i = 1$ to infinity. In the left-hand side use the fact that
\[ \sum_{i=1}^{\infty} a_i x^i = \left( \sum_{i=0}^{\infty} a_i x^i \right) - a_0 \]
and in the right-hand side, use the fact that
\[ \sum_{i=1}^{\infty} a_{i-1} x^i = x \sum_{i=1}^{\infty} a_{i-1} x^{i-1} = x \sum_{j=0}^{\infty} a_j x^j = x \sum_{i=0}^{\infty} a_i x^i \]
(where $j$ is substituted for $i - 1$, a surprisingly useful trick) to rewrite the
equation in terms of the power series $\sum_{i=0}^{\infty} a_i x^i$. Solve the resulting equation
for the power series to obtain
\[ \sum_{i=0}^{\infty} a_i x^i = \frac{a_0 - 1}{1 - 3x} + \frac{1}{(1 - 3x)^2} . \]
(You can save a lot of writing by using a variable such as y to represent the
power series you are solving for.) Writing each summand on the right-hand
side as a power series, you can equate coefficients to obtain a closed form
for the recurrence in terms of the initial population, $a_0$.
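If you want to check your pencil-and-paper algebra, the following Python sketch offers one possible test. It is an illustration only; the function names and the sample initial value are assumptions made for this example. It reads the recurrence as a_i = 3a_(i-1) + 3^i, generates terms of the sequence, and multiplies the truncated series by (1 - 3x)^2. If the rational form displayed above is right, the product should be the polynomial (a_0 - 1)(1 - 3x) + 1, with every coefficient from x^2 onward equal to zero.

```python
def recurrence_terms(a0, count):
    """First `count` terms of a_i = 3*a_(i-1) + 3**i, starting from a_0 = a0."""
    terms = [a0]
    for i in range(1, count):
        terms.append(3 * terms[-1] + 3**i)
    return terms


def multiply_truncated(a, b, count):
    """Coefficients of x^0 .. x^(count-1) in the product of two series."""
    result = [0] * count
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j < count:
                result[i + j] += a_i * b_j
    return result


def check_rational_form(a0, count):
    """Compare (sum a_i x^i)(1 - 3x)^2 with (a_0 - 1)(1 - 3x) + 1, term by term."""
    series = recurrence_terms(a0, count)
    left = multiply_truncated(series, [1, -6, 9], count)   # (1 - 3x)^2 = 1 - 6x + 9x^2
    right = [a0, -3 * (a0 - 1)] + [0] * (count - 2)        # a_0 - 3(a_0 - 1)x
    return left == right


print(recurrence_terms(7, 5))        # a hypothetical initial population of 7
print(check_rational_form(7, 12))
```

The check only uses the first few coefficients, so it supports the algebra rather than proving it, and it deliberately stops short of producing the closed form asked for in the next problem.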
237. Find a closed form for the recurrence $a_i = 3a_{i-1} + 3^i$ with initial
value $a_0$.
The next sequence of problems works with a mathematical model of a
fictional population of rabbits. For purposes of modeling the rabbit population three assumptions are made:
• Rabbits are mature and begin to reproduce after one month.
• Each mature pair produces two new pairs at the end of each month.
• No rabbit dies during the period of observation.
The example of a rabbit population is used for historic reasons, and the
goal is a classical sequence of numbers called the Fibonacci numbers. Fibonacci is the name that Leonardo de Pisa was given posthumously; it is
a shortening of "son of Bonacci" in Italian. Leonardo de Pisa introduced
this mathematical model of a biological population in his book, Liber Abaci,
which was published in 1202. In time for the 800th anniversary, Springer-Verlag published Fibonacci's Liber Abaci: A Translation into Modern English of Leonardo Pisano's Book of Calculation by Laurence Sigler.
238. Begin at the end of month 0 with 10 pairs (where a pair means one
female and one male) of baby rabbits. Let $a_n$ be the number of rabbit
pairs at the end of month $n$. Show that $a_0 = 10$ and $a_n = a_{n-1} + 2a_{n-2}$.
This is an example of a second-order recurrence which is also linear
and has constant coefficients. Using a method similar to the one used
at the beginning of this section, show that
\[ \sum_{i=0}^{\infty} a_i x^i = \frac{10}{1 - x - 2x^2} . \]
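A short simulation can help you see where the recurrence comes from. The Python sketch below is an illustration, not part of the original notes. It rests on one reading of the three modeling assumptions: at the end of each month, every pair that was already mature produces two new baby pairs, and every baby pair becomes mature. The function name and the number of months simulated are arbitrary choices.

```python
def simulate_rabbits(initial_baby_pairs, months):
    """Total number of pairs at the end of months 0, 1, ..., `months`."""
    babies, mature = initial_baby_pairs, 0
    totals = [babies + mature]
    for _ in range(months):
        newborn = 2 * mature   # each pair mature during the month produces two pairs
        mature += babies       # last month's babies are now mature
        babies = newborn
        totals.append(babies + mature)
    return totals


totals = simulate_rabbits(10, 12)
print(totals[:6])   # [10, 10, 30, 50, 110, 210]

# Compare the simulated totals with the recurrence a_n = a_(n-1) + 2 a_(n-2).
print(all(totals[n] == totals[n - 1] + 2 * totals[n - 2]
          for n in range(2, len(totals))))
```

Tracking the baby and mature pairs separately, as the loop does, is also a reasonable starting point for the written argument the problem asks for.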

In the last problem you represented the generating function for $\{a_i\}$
as a quotient of polynomials. Such a quotient is often referred to as a
rational representation of the generating function for the recurrence. If
you know how to express the reciprocal of the denominator in this representation
as a power series, then it is relatively easy to find the coefficients of your
generating function. You did that in Problem 237 for the linear first-order
recurrence given in the beginning of the section. In Problem 238 you do not
have the corresponding formal power series directly available. Try to think
of a way to get it.
239. In Fibonacci's original problem, there is one pair of baby rabbits at the
end of month 0 and each pair of mature rabbits produces one new pair
at the end of each month. Otherwise the situation is the same as in
Problem 238.
(a) Find the recurrence.
(b) Under these assumptions, find a rational representation of the generating function for the number of pairs of rabbits at the end of n
months.
240. (a) Use the quadratic formula to factor $1 - x - x^2$, and then use the
factors to find the partial fraction decomposition of
\[ \frac{1}{1 - x - x^2} . \]
Hint. It is useful to represent the roots of the equation $1 - x - x^2 = 0$
by $r_1, r_2$ while you are working through the partial fraction
decomposition.
(b) Use the partial fraction decomposition you found in part (a) to
write the generating function you found in Problem 239 as $\sum_{n=0}^{\infty} a_n x^n$,
the standard form for power series.
(c) Solve for an explicit formula for $a_n$. This is called Binet's Formula.
241. Explain why there exists a real number $b$ such that, for large values
of $n$, the value of the $n$th Fibonacci number is almost exactly (but not
quite) some constant times $b^n$. Find $b$ and the constant.

Chapter 6

The Principle of Inclusion and Exclusion

6.1 The Size of a Union of Sets

One of the first counting principles in these notes was the Sum Principle
which says that the size of a union of disjoint sets is the sum of their sizes.
Computing the size of the union of overlapping sets quite naturally requires
information about how they overlap. Taking such information into account
allows the development of a powerful extension of the Sum Principle known
as the Principle of Inclusion and Exclusion.
242. In Problem 15, just two fertilizers were used to treat all sample plants in
a certain biology lab. Now suppose there are three fertilizer treatments:
15 plants are treated with nitrates, 16 with potash, 16 with phosphate,
7 with nitrate and potash, 9 with nitrate and phosphate, 8 with potash
and phosphate and 4 with all three. Now how many plants have been
treated? If 32 plants were studied, how many received no treatment at
all?
243. Give a formula for the size of $A \cup B \cup C$ in terms of the sizes of $A$, $B$,
$C$ and the various intersections of these sets.
244. Conjecture a formula for the size of a union of sets
\[ A_1 \cup A_2 \cup \cdots \cup A_n = \bigcup_{i=1}^{n} A_i \]
in terms of the sizes of the sets $A_i$ and their various intersections.
Express your conjecture in words before attempting to write a formula.

The hardest part of generalizing your answer in Problem 243 to Problem 244 is probably finding a good notation to express your conjecture. In
fact, for many people it would be easier to express the conjecture in words
than to express it as a mathematical formula. Some notation which will make
your task easier is similar to the notation
\[ E_P(S) = \sum_{s \in S} P(s) \]
that was used to stand for the sum of the pictures of the elements of a set
$S$ when picture enumerators were introduced. Here, define
\[ \bigcap_{i \in I} A_i \]
to mean the intersection over all elements $i$ in the set $I$ of $A_i$; for example,
\[ \bigcap_{i \in \{1,3,4,6\}} A_i = A_1 \cap A_3 \cap A_4 \cap A_6 . \tag{6.1} \]
You've already used this kind of notation (involving an operator with a
descriptor below it) in summation notation for sums and in product notation
for products. In this case the operator is set intersection and the descriptor
identifies the values of a dummy variable that we are interested in.
245. Use notation similar to (6.1) to express the answer to Problem 244.
Note there are many different correct ways to do this, and try to write
down more than one. Choose the neatest one you can. Be sure to say
why you chose it because your view of what makes a formula nice may
be different from the formula others supply.
246. A group of n students with backpacks goes to a restaurant. The manager invites everyone to check his or her backpack at the check desk
and everyone does. While they are eating, a child playing in the check
room randomly moves around the claim check stubs.
(a) What is the total number of ways to pass back the backpacks?
(b) Let Ai be the set of backpack distributions in which student i
gets the correct backpack. In how many of the distributions of
backpacks-to-students does at least one student get his or her own
backpack? It might be a good idea to first consider cases with
n = 3, 4, and 5.

247. In this problem you will compute the probability that, at the end of
the meal, at least one student in the previous problem receives his or
her own backpack. Here the probability is the fraction of the total
number of ways to return the backpacks in which at least one student
gets his or her own backpack.
(a) What is the probability that at least one student gets the correct
backpack?
(b) What is the probability that no student gets his or her own backpack?
(c) As the number of students becomes large, what does the probability that no student gets the correct backpack approach?
Hint. Think calculus and Taylor polynomials.
Problem 246 is classically called the Hat Check Problem; the name
comes from substituting hats for backpacks. It is also sometimes called the
Derangement Problem. A derangement of an n-element set is a permutation of that set (thought of as a bijection on [n]) that maps no element of
the set to itself; that is, a permutation with no fixed points. One can think
of a way of handing back the backpacks as a permutation f of the students
in which f (i) is the owner of the backpack that student i receives. Then a
derangement is a way to pass back the backpacks so that no student gets
his or her own.
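For small n the definition of a derangement can be explored by brute force. The Python sketch below is an illustration, with function names invented for this example. It numbers the students 0, 1, ..., n - 1 rather than 1, ..., n, lists every permutation, and counts those with no fixed point. It is far too slow for large n, which is one reason a formula is worth finding.

```python
from itertools import permutations
from math import factorial


def is_derangement(perm):
    """True if the permutation (a tuple with perm[i] = image of i) has no fixed point."""
    return all(perm[i] != i for i in range(len(perm)))


def count_derangements(n):
    """Count the derangements of {0, 1, ..., n-1} by checking every permutation."""
    return sum(1 for perm in permutations(range(n)) if is_derangement(perm))


for n in range(1, 8):
    count = count_derangements(n)
    # Print n, the number of derangements, and the fraction of all permutations.
    print(n, count, count / factorial(n))
```

Watching how the printed fraction behaves as n grows is a useful experiment before attempting Problem 247.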

6.2 The Principle of Inclusion and Exclusion

The formula you've discovered in Problem 245 is usually called the Principle of Inclusion and Exclusion for unions of sets. The reason for this
name is the pattern in the formula: It first adds (includes) all the sizes of the
sets, then subtracts (excludes) all the sizes of the intersections of two sets,
then adds (includes) all the sizes of the intersections of three sets, and so
on. There are a variety of proofs of this principle. Perhaps one of the most
straightforward is an inductive proof that relies on the inductive process
\[ A_1 \cup A_2 \cup \cdots \cup A_n = (A_1 \cup A_2 \cup \cdots \cup A_{n-1}) \cup A_n , \]
which expresses the n-fold union as a union of two sets. What formula for
$|A \cup B|$ did you discover in Problem 16?
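The include-then-exclude pattern described above can be tested on explicit sets. The Python sketch below is an illustration; the function name and the four sample sets are made up. It adds the sizes of all intersections of an odd number of sets, subtracts the sizes of all intersections of an even number of sets, and compares the total with the size of the union computed directly.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets):
    """Alternating sum of the sizes of all intersections of 1, 2, 3, ... of the sets."""
    total = 0
    for size in range(1, len(sets) + 1):
        sign = 1 if size % 2 == 1 else -1   # include odd sizes, exclude even sizes
        for chosen in combinations(sets, size):
            total += sign * len(set.intersection(*chosen))
    return total


# Hypothetical overlapping sets of integers.
sets = [set(range(0, 10)), set(range(5, 15)), {1, 6, 11, 20}, {3, 6, 9, 12}]

print(union_size_by_inclusion_exclusion(sets))
print(len(set().union(*sets)))
```

The two printed numbers should be equal. The loop visits every nonempty collection of the sets, so the work doubles with each additional set.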
248. Use induction to give a proof of your formula for the Principle of Inclusion and Exclusion.

249. A more elegant proof can be obtained by asking for a picture enumerator for $A_1 \cup A_2 \cup \cdots \cup A_n$. So assume $A$ is a set with a picture function
$P$ defined on it and that each set $A_i$ is a subset of $A$.
(a) By thinking about how the formula for the size of a union was obtained, write down instead a conjecture for the picture enumerator
of a union. You could use a notation like $E_P\left( \bigcap_{i \in S} A_i \right)$ for the picture enumerator of the intersection of the sets $A_i$ for $i$ in a subset
$S$ of $[n]$.
(b) If $x \in \bigcup_{i=1}^{n} A_i$, what is the coefficient of $P(x)$ in (the inclusion-exclusion
side of) your formula for $E_P\left( \bigcup_{i=1}^{n} A_i \right)$?
Hint. Let $T$ be the set of all $i$ such that $x \in A_i$. In terms of $x$,
what is different about the $i$ in $T$ and those not in $T$? You may
come to a point where the binomial theorem would be helpful.
(c) If $x \notin \bigcup_{i=1}^{n} A_i$, what is the coefficient of $P(x)$ in (the inclusion-exclusion
side of) your formula for $E_P\left( \bigcup_{i=1}^{n} A_i \right)$?
(d) How have you proved your conjecture for the picture enumerator
of the union of the sets $A_i$?
(e) How can you get the formula for the Principle of Inclusion and
Exclusion from your formula for the picture enumerator of the
union?
Frequently the Principle of Inclusion and Exclusion is applied to situations similar to that of Problem 246(b). That is, there is a set $A$ and
subsets $A_1, A_2, \ldots, A_n$ and what is required is the size of the set of elements
in $A$ which are not in the union. This set is known as the complement
of the union of the $A_i$'s in $A$. This will either be denoted by $A \setminus \bigcup_{i=1}^{n} A_i$
or by $\overline{\bigcup_{i=1}^{n} A_i}$, where the latter is used when the universe $A$ is clear from
context. The same mathematical symbol can have different meanings in different contexts. Here an over-bar is used to denote the complement of the
set, whereas in analysis or topology this sometimes means the closure of the
set. There, and elsewhere, the complement of the set $A$ might be denoted
by $A^c$.
The Principle of Inclusion and Exclusion can refer to both the formula
for the union and the one for its complement.
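250. Prove the formula
\[ \left| \overline{\bigcup_{i=1}^{n} A_i} \right| = |A| - \sum_{S \subseteq [n],\ S \neq \emptyset} (-1)^{|S|-1} \left| \bigcap_{i \in S} A_i \right| . \]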

A very elegant way of writing the formula in Problem 250 can be obtained
by setting $\bigcap_{i \in \emptyset} A_i$ equal to $A$. (Rewrite the formula using this convention.)
An aside for those interested in logic and set theory: Given a family of
subsets $A_i$ of a set $A$, define $\bigcap_{i \in S} A_i$ to be the set of all members $x$ of $A$
that are in $A_i$ for all $i \in S$. (Note that this allows $x$ to be in some other
$A_j$'s as well.) Then if $S = \emptyset$, the intersection consists of all members $x$ of $A$
that satisfy the statement "if $i \in \emptyset$, then $x \in A_i$." But since the hypothesis
of the if-then statement is false, the statement itself is true for all $x \in A$.
Therefore $\bigcap_{i \in \emptyset} A_i = A$.
251. Each person attending a party has been asked to bring a prize. The
person planning the party has arranged to give out all the donated
prizes (and only those prizes), but any person may win any number of
prizes. Suppose there are $n$ guests, and for each $1 \le i \le n$ let $A_i$ be
the set of all distributions of prizes in which person $i$ gets the prize he
or she brought. Let $A$ be the set of all distributions of prizes.
(a) Find $|A|$ and each $|A_i|$.
Hint. Think functions.
(b) In how many ways can the prizes be given out so that nobody gets
the prize that he or she brought?
(c) What is the probability that nobody gets the prize she or he
brought?
(d) Is there a limiting value for the probability in part (c) as the number of party guests increases? If so, find the limiting value.
252. There are m students attending a seminar in a room with n seats. The
seminar is a long one, and in the middle the group takes a break.
(a) In how many ways may the students return to the room and sit
down so that nobody is in the same seat as before?
(b) What happens in a probabilistic sense as m and n become large?
253. Suppose that n children join hands in a circle for a game at nursery
school. The game involves everyone falling down (and letting go). In
how many ways may they join hands in a circle again so that nobody
has the same person immediately to the right both times the group
joins hands?

6.3 Counting the Number of Onto Functions

254. Let $S$ be the set of all functions $f$ from $[k]$ to $[n]$. The sets $A_1, \ldots, A_n$
are defined as follows: For any $i$, $f \in S$ will be a member of the set $A_i$
if and only if $f(x) \neq i$ for every $x \in [k]$. For $k = 3$ and $n = 5$, find all
$A_1, \ldots, A_5$. What is $A_2 \cap A_3$?
255. If $f$ is an onto function from $[k]$ to $[n]$, how many of the sets $A_i$ (as
defined in the previous problem) does $f$ belong to? What is the number
of onto functions from $[k]$ to $[n]$?
256. If a die is rolled eight times, a sequence of eight numbers from the
set [6] is obtained; namely, the number of dots on top on the first roll,
the number on the second roll, and so on.
(a) What is the number of ways of rolling the die eight times so that
each of the numbers one through six appears at least once in the
sequence?
(b) What is the probability that a sequence is rolled in which all six
numbers between one and six appear?
257. Continuing with the last problem: How many times must a die be rolled
in order to ensure the probability is at least 1/2 that all six numbers
appear in the sequence? In order to answer this question, you will need
to experiment and use a computational device like a programmable
calculator or some kind of computer algebra package.
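One possible way to run this experiment is sketched below in Python. It is an illustration only, with invented function names, and it does not use the counting formula from Problem 255. Instead it tracks, roll by roll, the exact probability that a given number of distinct faces have appeared so far: a roll either repeats a face already seen or shows a new one.

```python
from fractions import Fraction


def probability_all_faces(rolls, faces=6):
    """Exact probability that every face appears at least once in `rolls` rolls."""
    # distribution[j] = probability that exactly j distinct faces have appeared
    distribution = [Fraction(0)] * (faces + 1)
    distribution[0] = Fraction(1)
    for _ in range(rolls):
        updated = [Fraction(0)] * (faces + 1)
        for seen, p in enumerate(distribution):
            if p == 0:
                continue
            updated[seen] += p * Fraction(seen, faces)               # repeat a seen face
            if seen < faces:
                updated[seen + 1] += p * Fraction(faces - seen, faces)  # a new face
        distribution = updated
    return distribution[faces]


def rolls_needed(target, faces=6):
    """Smallest number of rolls whose probability of showing all faces is >= target."""
    rolls = faces
    while probability_all_faces(rolls, faces) < target:
        rolls += 1
    return rolls


for rolls in range(6, 16):
    print(rolls, float(probability_all_faces(rolls)))

print(rolls_needed(Fraction(1, 2)))
```

Because the arithmetic uses exact fractions, multiplying `probability_all_faces(8)` by 6**8 gives a whole number that you can compare with your answer to Problem 256(a).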

6.4 The Ménage Problem

A certain town has a large number of 8-year-old twins, who are all in the
same third-grade class. The teacher asks n sets of twins to sit around a
round table.
258. Let $A_i$ be the set of all such seatings in which the children in the $i$-th
set of twins are sitting next to each other. Find $|A_i|$. Find $|A_i \cap A_j|$
for $i \neq j$.
259. For each of n = 4 and n = 5, find the number of ways n sets of twins
can be seated if no one may sit next to his or her twin.
260. For general n, in how many ways can the n sets of twins be seated if
no one may sit next to his or her twin?
261. In this problem you are again seating n sets of twins around a round
table, and now each set of twins has one boy and one girl. In how
many ways can they sit so that no person is next to his or her twin
and the genders alternate around the table? This problem is called the
Ménage Problem.

Hint. Reason somewhat as you did in Problem 260, noting that if the
set of all twins who do sit side-by-side is nonempty, then the gender
of the person at each place at the table is determined once one pair in
that set is seated, or, for that matter, once one person is seated.

6.5 The Chromatic Polynomial of a Graph

A coloring of the vertices of a graph by the elements of a set C (of colors)
is an assignment of an element of C to each vertex of the graph; that is,
a function from the vertex set V of the graph to C. A coloring is called
a proper coloring if for each edge joining two distinct vertices,¹ the two
vertices it joins have different colors. You may have heard of the famous
Four Color Theorem of graph theory that says every drawing of a graph in
the plane in which no two edges cross (though they may touch at a vertex)
has a proper coloring with four colors. Here a different, though related,
problem is considered: In how many ways can you properly color a graph
(regardless of whether it can be drawn in the plane or not) using k or fewer
colors?
262. Given a graph which might or might not be connected, define a relation
on its set $V$ of vertices by:
For $v, w \in V$, $v R w \iff$ there is a path from $v$ to $w$.
Prove this relation is an equivalence relation. Its equivalence classes
are called the connected components of the graph.
Notice that the connected components depend on the edge set of the
graph. That is, if a graph has vertex set $V$ and edge set $E$ and another
graph has the same vertex set $V$ and edge set $E'$, these two graphs could have
different connected components. It is traditional to use the Greek letter $\gamma$
(called gamma) to represent the number of connected components of a graph.
In particular, $\gamma(V, E)$ represents the number of connected components of
the graph with vertex set $V$ and edge set $E$. The Principle of Inclusion and
Exclusion may be used to compute the number of ways to color a graph
properly using colors from a set $C$ of $c$ colors.

¹ If a graph has a loop connecting some vertex to itself, the loop must of course connect
a vertex to a vertex of the same color. Because of this, in this definition the only edges
considered are those with two distinct vertices.

263. Let $G$ be a graph with vertex set $V$ and edge set $E = \{e_1, e_2, \ldots, e_{|E|}\}$,
and let $F$ be any subset of $E$. Suppose $C$ is a set of $c$ colors with which
to color the vertices.
(a) In terms of $\gamma(V, F)$, in how many ways can you color the vertices
of $G$ so that every edge in $F$ connects two vertices of the same
color?
Hint. Use the connected components. For each edge in $F$ to connect two vertices of the same color, we must have all the vertices
in a connected component of the graph with vertex set $V$ and edge
set $F$ colored the same color.
(b) Given a coloring of $G$, for each edge $e_i$ in $E$ consider the set $A_i$
of all colorings in which both endpoints of $e_i$ are colored the same
color. In which sets $A_i$ does a proper coloring lie?
(c) Find a formula for the number of proper colorings of $G$ using colors
in the set $C$. Your formula will probably involve summing over
all subsets $F$ of the edge set of the graph and using the number
$\gamma(V, F)$ of connected components of the graph with vertex set $V$
and edge set $F$.
The formula you found in Problem 263 involves powers of the number
of colors, and so it is a polynomial function of $c$. People often use $x$ as the
notation for the number of colors used to color $G$. Frequently people will
use $\chi_G(x)$ to represent the number of ways to color $G$ with $x$ colors, and
call $\chi_G(x)$ the chromatic polynomial of $G$.
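Values of the chromatic polynomial of a small graph can be tabulated by brute force, which gives data to test a conjectured formula against. The Python sketch below is an illustration; the function name and the example graph (a cycle on four vertices) are assumptions made for this example. It tries every assignment of colors to vertices and keeps those in which each edge joining two distinct vertices has differently colored endpoints.

```python
from itertools import product


def count_proper_colorings(vertices, edges, num_colors):
    """Count the proper colorings of a graph using colors 0 .. num_colors - 1."""
    count = 0
    for colors in product(range(num_colors), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        # Loops (edges with u == v) are ignored, as in the definition of proper coloring.
        if all(coloring[u] != coloring[v] for u, v in edges if u != v):
            count += 1
    return count


# Hypothetical example: a cycle on the four vertices a, b, c, d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

for x in range(1, 6):
    print(x, count_proper_colorings(vertices, edges, x))
```

The number of assignments tried is the number of colors raised to the number of vertices, so this approach is only practical for small graphs and few colors.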

6.5.1 Deletion-Contraction (Optional)

264. In Section 4.5.1 (on pages 71ff) you developed the deletion-contraction
recurrence and used it to count the number of spanning trees in a
graph.
(a) Figure out how the chromatic polynomial of a graph is related to
the chromatic polynomials of the graphs resulting from deletion of
an edge e and from contraction of that same edge e.
(b) Try to find a recurrence like the one for counting spanning trees
that expresses the chromatic polynomial of a graph in terms of the
chromatic polynomials of G \ e and G/e for an arbitrary edge e.
(c) Use the recurrence from the last part to give another proof that
the number of ways to color a graph with x colors is a polynomial
function of x.
265. Use the deletion-contraction recurrence to reduce the computation of
the chromatic polynomial of the graph in Figure 6.1 to computing chromatic polynomials that you can easily compute. (You can simplify your
computations by thinking about the effect on the chromatic polynomial
of deleting an edge that is a loop, or deleting one of several edges between the same two vertices.)

Figure 6.1: A graph.

266. (a) In how many ways may you properly color the vertices of a path on
n vertices with x colors? Describe any dependence of the chromatic
polynomial of a path on the number of vertices.
(b) In how many ways may you properly color the vertices of a cycle on
n vertices with x colors? Describe any dependence of the chromatic
polynomial of a cycle on the number of vertices.
267. In how many ways may you properly color the vertices of a tree on n
vertices with x colors?
268. What do you observe about the signs of the coefficients of the chromatic
polynomial of the graph in Figure 6.1? What about the signs of the
coefficients of the chromatic polynomial of a path? Of a cycle? Of
a tree? Make a conjecture about the signs of the coefficients of a
chromatic polynomial and prove it.

Chapter 7

Distribution Problems

7.1 The Idea of Distributions

Many of the problems you solved in earlier chapters may be considered to be
distribution problems: problems which involve distributing objects (such as
pieces of fruit or ping-pong balls) to recipients (such as children). For example, in Problem 108 you probably worked through the fact that the number
of ways to pass out k ping-pong balls to n children so that no child gets
more than one ball is the number of ways that you can choose a k-element
subset of an n-element set. You can think that the children are the recipients and the identical ping-pong balls are the objects you are distributing,
and that the distribution is done in such a way that each recipient gets at
most one ball. Those children who receive a ball form the k-element subset
of the n-element set of children. (They form a subset because the balls are
identical.)
Another popular model for distributions is to think of putting balls in
boxes or books in bookcases rather than distributing objects to recipients.
Passing out identical objects is modeled by putting identical balls into boxes.
Passing out distinct objects is modeled by putting distinct balls into boxes.
So, when you are passing out objects to recipients, you may think of the
objects as being either identical or distinct. You may also think of the
recipients as being either identical (grocery bags in the case of putting fruit
into bags in the grocery store) or distinct (children in the case of passing
fruit out to children). You may restrict the distributions to those that give
at least one object to each recipient, or those that give exactly one object
to each recipient, or those that give at most one object to each recipient,
or you may have no such restrictions. If the objects are distinct, it may be
that the order in which the objects are received is relevant (think about ice
cream in 3-decker cones) or that the order in which the objects are received
is irrelevant (think about dropping a handful of candy into a child's trick-or-treat
bag). The next three problems are a review of formulas you've already
developed.
+ 269. Consider the distribution of k distinct objects to n distinct recipients,
with different conditions on how the objects are received. The first row
in the following table is already filled in. Fill in the other rows.

Conditions                               | Number of Ways | Mathematical Model
No conditions                            | $n^k$          | functions
Each gets at least one                   |                |
Each gets at most one                    |                |
Each gets exactly one                    |                |
Order matters                            |                |
Order matters, each gets at least one    |                |

+ 270. Consider the distribution of k identical objects to n distinct recipients,
with different conditions on how the objects are received. Fill in all
entries in the table.

Conditions                               | Number of Ways | Mathematical Model
No conditions                            |                |
Each gets at least one                   |                |
Each gets at most one                    |                |
Each gets exactly one                    |                |

+ 271. Consider the distribution of k objects to n identical recipients, with
different conditions on how the objects are received. Fill in the entries
of the table (except for the entries with ?).

Objects   | Conditions                                 | Number of Ways
Distinct  | Each gets at most one                      |
Distinct  | Each gets exactly one                      |
Distinct  | Order matters                              | ?
Distinct  | Order matters and each gets at least one   | ?
Identical | Each gets at most one                      |
Identical | Each gets exactly one                      |

272. In how many ways may you stack 5 distinct books into 3 identical
boxes so that each box contains at least one book? (Assume the order
of books in a stack makes a difference.) This problem is different from
arranging 5 distinct books in a bookcase with 3 bookshelves in such a
way that each shelf gets at least one book. Why?
In the last problem you might have thought to first partition the five
books into three blocks and then follow by ordering the books within the
blocks of the partition. This turns out not to be a useful combinatorial way
of visualizing the problem because the number of ways to order the books
in the various blocks depends on the sizes of the blocks and not just the
number of blocks.
273. In this problem you want to count the number of ways to stack k
distinct books into n identical boxes so that there is at least one book
in every box.
(a) First consider the set S of all arrangements of the k distinct books
on n distinct shelves such that every shelf has at least one book.
When are two of the arrangements in S the same as far as what you
are asked to count in this problem? Define this idea of sameness
as an equivalence relation on S. Explain why every equivalence
class has the same size. What is this common size?
(b) Find the number of ways to stack k distinct books into n identical
boxes so that there is a stack in every box. This number is usually
denoted by L(k, n) and is called a Lah number.
274. Explain why the number of ways to stack k distinct books into n identical
boxes is $\sum_{i=1}^{n} L(k, i)$.
275. Show the Lah numbers L(k, n) satisfy the recurrence
$$L(k, n) = L(k-1, n-1) + (n + k - 1)\,L(k-1, n).$$
This is similar to Pascal's Equation, which you proved in Section 3.3.1.
Hint. Either k is in an ordered block by itself or it is not.
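One way to gain confidence in a recurrence before proving it is to compare it with a direct count for small values. The Python sketch below is illustrative only: the function names `lah_brute` and `lah_recurrence` are assumptions made for this example, and the recurrence is taken in the form $L(k, n) = L(k-1, n-1) + (n + k - 1)L(k-1, n)$ with L(0, 0) = 1 assumed as the starting value. The brute-force count lists every way to cut an ordering of the books into n non-empty stacks and then forgets the order of the boxes.

```python
from itertools import combinations, permutations

def lah_brute(k, n):
    """Count the ways to stack k distinct books into n identical boxes, none empty."""
    if k == 0 or n == 0:
        return 1 if k == n else 0
    seen = set()
    for order in permutations(range(1, k + 1)):
        # Choose n - 1 of the k - 1 gaps to cut the row of books into n stacks.
        for cuts in combinations(range(1, k), n - 1):
            bounds = (0,) + cuts + (k,)
            stacks = frozenset(order[a:b] for a, b in zip(bounds, bounds[1:]))
            seen.add(stacks)  # a frozenset ignores which box holds which stack
    return len(seen)

def lah_recurrence(k, n):
    """Evaluate the recurrence, assuming L(0, 0) = 1 and L = 0 on the other borders."""
    if k == 0 or n == 0:
        return 1 if k == n else 0
    return lah_recurrence(k - 1, n - 1) + (n + k - 1) * lah_recurrence(k - 1, n)

# Compare the two counts for every 1 <= n <= k <= 5.
agree = all(lah_brute(k, n) == lah_recurrence(k, n)
            for k in range(1, 6) for n in range(1, k + 1))
print(agree)
```

Agreement on small cases is evidence, not a proof; the combinatorial argument suggested by the hint is still needed.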
276. Fill in the two entries with ?s in the table in Problem 271.

7.2 Counting Partitions

In Problem 21 you showed any partition of [k] into n non-empty blocks
corresponds to a function from [k] to [n]. For instance, {1, 2, 4}, {3}, {5} is
an n = 3 block partition of [5] that can be described by the function
f(1) = 1, f(2) = 1, f(3) = 2, f(4) = 1, f(5) = 3;
that is, f(x) is the number of the block in which x lies. Since re-ordering the
partition blocks changes the function, this correspondence is a relation from the set
of n-block partitions of [k] to the set of all functions from [k] to [n] but is
not a function. In fact, a partition of [k] into n blocks is a distribution of k
distinct objects to n indistinguishable recipients.
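The dependence on the order of the blocks can be seen on the example partition of [5]. The following Python sketch is only an illustration; the helper names `block_function` and `fibers` are hypothetical and do not come from the notes. It numbers the blocks in the order they are listed, builds the corresponding function, and then recovers the partition from the function.

```python
def block_function(blocks):
    """Send each element x to the 1-based position of the block containing x."""
    return {x: i for i, block in enumerate(blocks, start=1) for x in block}

def fibers(f):
    """Recover the partition from f: x and y share a block when f(x) == f(y)."""
    groups = {}
    for x, label in f.items():
        groups.setdefault(label, set()).add(x)
    return {frozenset(block) for block in groups.values()}

f = block_function([{1, 2, 4}, {3}, {5}])   # the listing used in the text
g = block_function([{3}, {1, 2, 4}, {5}])   # the same blocks in another order

print(f == g)                    # False: the two functions differ
print(fibers(f) == fibers(g))    # True: both describe the same partition
```

Listing the same three blocks in a different order gives a different function, which is why the correspondence is a relation rather than a function.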
We use the notation S(k, n) to stand for the number of n-block partitions
of [k], where by convention S(0, 0) = 1. For historical reasons, S(k, n) is
called a Stirling Number of the second kind.
277. What is S(k, 1); S(k, k)? How should you define S(k, n) for n > k?
278. Find $S(k, k-1)$ and $S(k, 2)$ for any $k \ge 2$.
279. Given a function f from [k] to [n], we can define a partition of [k]
by putting x and y in the same block of the partition if and only if
f (x) = f (y). (This relation is inverse to the relation in Problem 21.)
How many blocks does the partition have if f is an onto function? How
is the number of onto functions from [k] to [n] related to a Stirling
Number? Be as precise in your answer as you can.
In Problem 119 you proved a recurrence for computing the Stirling Numbers
S(k, n) that is similar to Pascal's Equation for computing binomial
coefficients.
280. Use the Stirling recurrence to create a table of values of S(k, n) for
$1 \le k \le 5$ and $1 \le n \le k$. This table is sometimes called Stirling's
Triangle because of the analogy with Pascal's Triangle.
281. Extend Stirling's Triangle far enough to allow you to answer the following
question. A caterer is preparing three bag lunches for hikers. If
the caterer has nine different sandwiches, in how many ways can these
nine sandwiches be distributed into three identical lunch bags so that
each bag gets at least one?
The total number of partitions of a k-element set is denoted by B(k) and
is called the k-th Bell Number. Verify that B(1) = 1, B(2) = 2, B(3) = 5
by explicitly exhibiting all the partitions.
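Listing partitions by hand becomes tedious quickly, so a small enumerator can serve as a check on the lists you write down. The Python sketch below is one possible way to generate all partitions of a small set; the function names are assumptions made for this example. It places the first element either into one of the blocks of a partition of the remaining elements or into a new block of its own.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give `first` a block of its own.
        yield [[first]] + partition

def count_partitions(k, n=None):
    """Count the partitions of [k]; if n is given, count only those with n blocks."""
    return sum(1 for p in set_partitions(list(range(1, k + 1)))
               if n is None or len(p) == n)

for k in (1, 2, 3):
    print(k, count_partitions(k), list(set_partitions(list(range(1, k + 1)))))
```

For k = 1, 2, 3 the counts are 1, 2, and 5, matching the values stated above. Calling `count_partitions(k, n)` counts the n-block partitions of [k], which can be used to check entries of Stirling's Triangle.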
282. For a given $k \ge 1$, what is the relationship between B(k) and the
Stirling Numbers S(k, n)?

283. (a) Prove the Bell Numbers satisfy
$$B(k) = \sum_{n=0}^{k-1} \binom{k-1}{n} B(n),$$
where B(0) = 1, by asking yourself what happens when you delete the entire
block containing k.
(b) Use this Bell recurrence to calculate B(k) for k = 4, 5, 6.
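The Bell recurrence translates directly into a short computation. The sketch below is a minimal illustration, assuming the recurrence in the form $B(k) = \sum_{n=0}^{k-1} \binom{k-1}{n} B(n)$ with the starting value B(0) = 1; the function name `bell` is chosen for this example. It is meant for checking hand calculations, not as a replacement for them.

```python
from math import comb

def bell(k):
    """Return B(k) using B(m) = sum over n < m of C(m-1, n) * B(n), with B(0) = 1."""
    values = [1]                      # values[n] holds B(n)
    for m in range(1, k + 1):
        values.append(sum(comb(m - 1, n) * values[n] for n in range(m)))
    return values[k]

print([bell(k) for k in (1, 2, 3)])   # [1, 2, 5], the values verified in the text
```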

7.2.1 Multinomial Coefficients (Optional)

284. Fix nonnegative integers $k_1, \ldots, k_n$. Prove there are
$$\frac{n!}{\prod_{i=1}^{n} (i!)^{k_i}\, k_i!}$$
ways to partition n distinct items into $k = k_1 + k_2 + \cdots + k_n$ blocks so
that there are $k_i$ blocks of size i for each i. The sequence $k_1, k_2, \ldots, k_n$
is called the type vector of the partition.
285. Describe how to compute S(n, k) in terms of all type vectors $(k_1, k_2, \ldots, k_n)$
such that $k_1 + k_2 + \cdots + k_n = k$.
286. In how many ways may we label the elements of [k] with n distinct
labels (numbered 1 through n) so that label i is used $j_i$ times? This
number is called a multinomial coefficient and denoted by
$$\binom{k}{j_1, j_2, \ldots, j_n}.$$
What if the $j_i$'s do not add to k?
Hint. Think about listing the elements of [k] and labeling the first $j_1$
elements with label number 1.
287. Write the binomial coefficient $\binom{k}{m}$ as a multinomial coefficient.
288. Explain how multinomial coefficients can be used to compute the number
of functions from [k] to [n].
289. Explain how to use multinomial coefficients to compute the number of
onto functions from [k] to [n].
Hint. How are the relevant $j_i$'s in the multinomial coefficients you use
here different from the $j_i$'s in the previous problem?
290. How may multinomial coefficients be used to obtain an expression for
the kth power of a multinomial $x_1 + x_2 + \cdots + x_n$? This result is called the
Multinomial Theorem.
Hint. Review the Binomial Theorem.

7.3 Additional Problems

1. Answer each of the following questions.
(a) In how many ways can you pass out k identical pieces of candy to
n children?
(b) In how many ways can you pass out k distinct pieces of candy to
n children?
(c) In how many ways can you pass out k identical pieces of candy to
n children so that each gets at most one? (Assume $k \le n$.)
(d) In how many ways can you pass out k distinct pieces of candy to
n children so that each gets at most one? (Assume $k \le n$.)
(e) In how many ways can you pass out k identical pieces of candy to
n children so that each gets at least one? (Assume $k \ge n$.)

2. The neighborhood improvement committee has been given r trees to
distribute to s families living along one side of a street. Unless otherwise
specified, it does not matter where a family plants the trees it gets.
(a) In how many ways can the committee distribute all of them if
the trees are distinct, there are more families than trees, and each
family can get at most one tree?
(b) In how many ways can the committee distribute all of them if the
trees are distinct and any family can get any number?
(c) In how many ways can the committee distribute all the trees if the
trees are identical, there are no more trees than families, and any
family receives at most one?
(d) In how many ways can the committee distribute all the trees if
they are identical and anyone may receive any number of trees?
(e) In how many ways can all the trees be distributed and planted if
the trees are distinct, any family can get any number, and a family
must plant its trees in an evenly spaced row along the road?
(f) Answer the question in part (e) assuming that every family must
get a tree.
(g) Answer the question in part (d) assuming that each family must
get at least one tree.


Part II

REVIEW MATERIAL

Appendix A

More on Functions and Digraphs

A.1 Functions

Exercise A.1. Consider the functions from $S = \{-2, -1, 0, 1, 2\}$ to
$T = \{1, 2, 3, 4, 5\}$ defined by $f(x) = x + 3$ and $g(x) = x^5 - 5x^3 + 5x + 3$. Write
down the set of all ordered pairs $(x, f(x))$ for $x \in S$, and the set of all ordered
pairs $(x, g(x))$ for $x \in S$. Are the two functions the same or different?
Exercise A.1 points out how two functions which appear to be different are
actually the same. Most of the time when we are thinking about
functions it is fine to think of a function casually as a relationship between
two sets. In Exercise A.1 the set of ordered pairs you wrote down for each
function is called the relation of the function. When we want to distinguish
between the casual and the careful in talking about relationships, our
casual term will be "relationship" and our careful term will be "relation."
So relation is a technical word in mathematics, and as such it has a technical
definition: a relation from a set S to a set T is a set of ordered pairs whose
first elements are in S and whose second elements are in T. Another way to
say this is that a relation from S to T is a subset of the Cartesian product
$S \times T$.
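The comparison in Exercise A.1 can also be carried out mechanically. The short Python sketch below is an illustration only; it takes $S = \{-2, -1, 0, 1, 2\}$, $f(x) = x + 3$, and $g(x) = x^5 - 5x^3 + 5x + 3$, builds the relation of each function as a set of ordered pairs, and compares the two sets.

```python
S = [-2, -1, 0, 1, 2]

def f(x):
    return x + 3

def g(x):
    return x**5 - 5 * x**3 + 5 * x + 3

# The relation of a function is the set of all pairs (x, value at x).
relation_f = {(x, f(x)) for x in S}
relation_g = {(x, g(x)) for x in S}

print(sorted(relation_f))
print(relation_f == relation_g)   # True: as functions on S, f and g are the same
```

The formulas differ, but the sets of ordered pairs coincide, and it is the set of ordered pairs that determines the function.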
A typical way to define a function f from a set S (called the domain
of the function) to a set T (called the co-domain) is that f is a relation
from S to T which relates each element of S to one and only one member of
T. We use f(x) to stand for the element of T that is related to the element
x of S, and we use the standard shorthand $f : S \to T$ for "f is a function
from S to T."
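The "one and only one" condition in this definition can be tested directly on a finite relation. The following Python sketch is a minimal illustration under the assumption that a relation is stored as a set of ordered pairs; the function name `is_function` and the sample sets are hypothetical.

```python
def is_function(relation, S, T):
    """True when `relation` is a subset of S x T relating each x in S to exactly one y in T."""
    if not all(x in S and y in T for x, y in relation):
        return False          # not even a relation from S to T
    return all(sum(1 for a, _ in relation if a == x) == 1 for x in S)

S = {1, 2, 3}
T = {"a", "b"}

print(is_function({(1, "a"), (2, "a"), (3, "b")}, S, T))             # True
print(is_function({(1, "a"), (1, "b"), (2, "a"), (3, "b")}, S, T))   # False: 1 is related to two elements
print(is_function({(1, "a"), (2, "b")}, S, T))                       # False: 3 is related to nothing
```

Note that the first relation is a function even though 1 and 2 are both related to the same element of T; the condition restricts first elements, not second elements.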

Exercise A.2. Here are some questions that will help you get used to the
formal idea of a relation and the related formal idea of a function. S will
stand for a finite set of size s and T will stand for a finite set of size t.
(a) What is the size of the largest relation from S to T ?
(b) What is the size of the smallest relation from S to T ?
(c) What is the size of the relation of a function from S to T ? That is,
how many ordered pairs are in the relation of a function from S to T ?
(d) Before working this and the next exercise, review the definitions of
one-to-one function and onto function in Chapter 1. How many different
elements must appear as second elements of the ordered pairs in the
relation of a one-to-one function from S to T?
(e) What is the minimum size that S can have if there is an onto function
from S to T?
Exercise A.3. When f is a function from S to T , the sets S and T play
a big role in determining whether a function is one-to-one or onto. For the
remainder of this exercise, let S and T stand for the set of nonnegative real
numbers.
(a) If $f : S \to T$ is given by $f(x) = x^2$, is f one-to-one? Is f onto?
(b) Now assume for the rest of the exercise that $S'$ is the set of all real
numbers and $g : S' \to T$ is given by $g(x) = x^2$. Is g one-to-one? Is g
onto?
(c) Assume for the rest of the exercise that $T'$ is the set of all real numbers
and $h : S \to T'$ is given by $h(x) = x^2$. Is h one-to-one? Is h onto?
(d) And if the function $j : S' \to T'$ is given by $j(x) = x^2$, is j one-to-one?
Is j onto?
(e) If $f : S \to T$ is a function, we say that f maps x to y as another way
to say that f (x) = y. Suppose S = T = {1, 2, 3}. Give a function from
S to T that is not onto. Notice that two different members of S have
mapped to the same element of T . Thus when we say that f associates
one and only one element of T to each element of S, it is quite possible
that the one and only one element f (1) that f maps 1 to is exactly the
same as the one and only one element f (2) that f maps 2 to.

A.2 Digraphs

Figure A.1 illustrates a digraph of the "comes before in alphabetical order"
relation on the letters a, b, c, and d. We draw the arrow from a to b, for
example, because a comes before b in alphabetical order. We try to choose
the locations for the vertices so that the arrows capture what we are trying to
illustrate as well as possible. Sometimes this entails re-drawing our directed
graph several times until we think the arrows capture the relationship well.

Figure A.1: The Alphabet Digraph.
Exercise A.4. Draw the digraph of the "is a proper subset of" relation
on the set of subsets of a two-element set. (Remember the empty set is
a subset.) How many arrows would you have had to draw if this exercise
asked you to draw the digraph for the subsets of a three-element set?
Exercise A.5. (a) Draw the digraph of the relation from the set {A, M,
P, S} to the set {Sam, Mary, Pat, Ann, Polly, Sarah} given by "is the
first letter of."
(b) Draw the digraph of the relation from the set {Sam, Mary, Pat, Ann,
Polly, Sarah} to the set {A, M, P, S} given by "has as its first letter."
Exercise A.6. When we draw the digraph of a function f , we draw an arrow
from the vertex representing x to the vertex representing f (x). One of the
relations you considered in Exercise A.5 is the relation of a function.
(a) Which relation is the relation of a function?
(b) How does the digraph help you visualize that one relation is a function
and the other is not?
Exercise A.7. Digraphs of functions help you to visualize whether or not
they are onto or one-to-one. For example, let both S and T be the set
$\{-2, -1, 0, 1, 2\}$ and let $S'$ and $T'$ both be the set $\{0, 1, 2\}$. Let
$f(x) = 2 - |x|$.
(a) Draw the digraph of the function f, assuming its domain is S and its
range is T. Use the digraph to explain whether or not this function
maps S onto T.
(b) Use the digraph of the previous part to explain whether or not the
function is one-to-one.
(c) Draw the digraph of the function f assuming its domain is S and its
range is $T'$. Use the digraph to explain whether or not the function is
onto.
(d) Use the digraph of the previous part to explain whether or not the
function is one-to-one.
(e) Draw the digraph of the function f , assuming its domain is S 0 and its
range is T 0 . Use the digraph to explain whether the function is onto.
(f) Use the digraph of the previous part to explain whether the function
is one-to-one.
(g) Suppose that the function f has domain S 0 and range T . Draw the
digraph of f and use it to explain whether f is onto.
(h) Use the digraph of the previous part to explain whether or not f is
one-to-one.
A function from a set X to a set Y which is both one-to-one and onto
is frequently called a bijection, especially in combinatorics. Your work in
Exercise A.7 should show you that a digraph is the digraph of a bijection
from X to Y when all four of the following properties hold:
• The vertices of the digraph represent the elements of X and Y (so that
X is the possible domain and Y is the possible co-domain).
• Each vertex representing an element of X has one and only one arrow
leaving it (so that f is indeed a function).
• Each vertex representing an element of Y has at least one arrow
entering it (so that it is onto).
• Each vertex representing an element of Y has at most one arrow
entering it (so it is one-to-one).
Of course, the last two properties can be combined to the requirement
that each vertex representing an element of Y has exactly one arrow entering
it.
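The four properties amount to counting the arrows leaving each vertex of X and the arrows entering each vertex of Y. The Python sketch below shows one way to carry out that count; representing a digraph as a list of (tail, head) pairs and the name `is_bijection_digraph` are assumptions made for this illustration.

```python
def is_bijection_digraph(arrows, X, Y):
    """Check the four properties: every arrow runs from X to Y, each x in X has
    exactly one arrow leaving it, and each y in Y has exactly one arrow entering it."""
    leaving = {x: 0 for x in X}
    entering = {y: 0 for y in Y}
    for tail, head in arrows:
        if tail not in leaving or head not in entering:
            return False      # an arrow that does not run from X to Y
        leaving[tail] += 1
        entering[head] += 1
    is_a_function = all(count == 1 for count in leaving.values())
    onto = all(count >= 1 for count in entering.values())
    one_to_one = all(count <= 1 for count in entering.values())
    return is_a_function and onto and one_to_one

X = {1, 2, 3}
Y = {"a", "b", "c"}
print(is_bijection_digraph([(1, "a"), (2, "b"), (3, "c")], X, Y))   # True
print(is_bijection_digraph([(1, "a"), (2, "a"), (3, "c")], X, Y))   # False: "b" has no arrow entering it
```

Keeping separate counts for arrows leaving and arrows entering lets the same check work when X and Y are the same set.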


Appendix B

More on the Principle of Mathematical Induction
You should have already seen the Principle of Mathematical Induction in
other courses. If you've been able to work through Section 2.2 of Chapter 2
easily, there is no need to read this chapter. This section is provided in case
you've found your background to be weak, and spending time outside class
reviewing seems to be advisable.
Exercise B.1. (a) Write down a list of all subsets of {1, 2}. Do not forget
the empty set! Group the sets containing 2 separately from the others.
(b) Write down a list of the subsets of {1, 2, 3}. Group the sets containing
3 separately from the others.
(c) Look for a natural way to match up the subsets containing 2 in part (a)
with those not containing 2. Look for a way to match up the subsets
in part (b) containing 3 with those not containing 3.
(d) On the basis of the previous part, you should be able to find a bijection
between the collection of subsets of {1, 2, . . . , n} containing n and those
not containing n. (If you are having difficulty figuring out the bijection,
try rethinking Parts (a) and (b), perhaps by doing a similar exercise
with the set {1, 2, 3, 4}.) Describe the bijection and explain why it is a
bijection. Explain why the number of subsets of {1, 2, . . . , n} containing
n equals the number of subsets of $\{1, 2, \ldots, n-1\}$.
(e) Parts (a) and (b) suggest strongly that the number of subsets of an
n-element set is $2^n$. In particular, the empty set has $2^0$ subsets; a
one-element set has $2^1$ subsets: itself and the empty set; and in parts (a) and (b)
we saw that two-element and three-element sets have $2^2$ and $2^3$ subsets,
respectively. So there are certainly some values of n for which
an n-element set has $2^n$ subsets. One way to prove that an n-element
set has $2^n$ subsets for all values of n is to argue by contradiction. For
this purpose, suppose there is a nonnegative integer n such that an
n-element set does not have exactly $2^n$ subsets. In that case there
may be more than one such n, and so choose k to be the smallest such
n. (Notice that $k - 1$ is still a positive integer, because k cannot be 0,
1, 2, or 3.) Since k was the smallest value of n for which the statement
"An n-element set has $2^n$ subsets" is false, what do you know about
the number of subsets of a $(k-1)$-element set? What do you know
about the number of subsets of the k-element set $\{1, 2, \ldots, k\}$ that do
not contain k? What do you know about the number of subsets of
$\{1, 2, \ldots, k\}$ that do contain k? What does the Sum Principle tell you
about the number of subsets of $\{1, 2, \ldots, k\}$? Notice that this contradicts
the way in which we chose k, and the only assumption that went
into our choice of k was that "there is a nonnegative integer n such
that an n-element set does not have exactly $2^n$ subsets." Since this
assumption has led us to a contradiction, it must be false. What can
you now conclude about the statement "for every nonnegative integer
n, an n-element set has exactly $2^n$ subsets"?
Exercise B.2. Notice that the nth odd integer is $2n - 1$, and so the expression
$$1 + 3 + 5 + \cdots + (2n - 1) \tag{B.1}$$
is the sum of the first n odd integers. Experiment a bit with the sum for
the first few positive integers and guess its value in terms of n. Now apply
the technique of Exercise B.1 to prove that you are right.
In Exercises B.1 and B.2, our proofs had several distinct elements: We
had a statement involving an integer n. We knew the statement was true
for the first few nonnegative integers in Exercise B.1 and for the first few
positive integers in Exercise B.2. We wanted to prove that the statement
was true for all nonnegative integers in Exercise B.1 and for all positive
integers in Exercise B.2. In both cases we used the method of proof by
contradiction: for that purpose we assumed that there was a value of n for
which our formula was not true. We then chose k to be the smallest value
of n for which our formula was not true. This meant that when n was $k - 1$,
our formula was true (or else that $k - 1$ was not a nonnegative integer in
Exercise B.1 or that $k - 1$ was not a positive integer in Exercise B.2). What
we did next was the crux of the proof. We showed that the truth of our
statement for $n = k - 1$ implied the truth of our statement for $n = k$. This
gave us a contradiction to the assumption that there was an n that made
the statement false. In fact, we will see that we can bypass entirely the use
of proof by contradiction. We used it to help you discover the central ideas
of the technique of proof by mathematical induction. The central core of
mathematical induction is the proof that the truth of a statement about the
integer n for $n = k - 1$ implies the truth of the statement for $n = k$. For
example, once we know that a set of size 0 has $2^0$ subsets, if we have proved
our implication, we can then conclude that a set of size 1 has $2^1$ subsets,
from which we can conclude that a set of size 2 has $2^2$ subsets, from which
we can conclude that a set of size 3 has $2^3$ subsets, and so on up to a set of
size n having $2^n$ subsets for any nonnegative integer n we choose. In other
words, although it was the idea of proof by contradiction that led us to think
about such an implication, we can now do without the contradiction at all.
What we need to prove a statement about n by this method is a place to
start; that is, a value b for which we know the statement to be true, and
then a proof that the truth of our statement for $n = k - 1$ implies the truth
of the statement for $n = k$, regardless of which $k > b$ is considered.

The Principle of Mathematical Induction

In order to prove a statement about an integer n, if you can
• Prove the statement when $n = b$, for some fixed integer b;
• Show that the truth of the statement for $n = k - 1$ implies the truth
of the statement for $n = k$ whenever $k > b$;
then you can conclude the statement is true for all integers $n \ge b$.

As an example, let us return to Exercise B.1. The statement we wish to
prove is the statement that "A set of size n has $2^n$ subsets."

Our statement is true when $n = 0$, because a set of size 0 is the
empty set, for which the only subset is the empty set, giving $1 = 2^0$
subsets. (This step of our proof is called a base step.)

Now suppose that $k > 0$ and every set with $k - 1$ elements has
$2^{k-1}$ subsets. Suppose $S = \{a_1, a_2, \ldots, a_k\}$ is a set with k elements.
We partition the subsets of S into two blocks. Block $B_1$ consists
of the subsets that do not contain $a_k$ and Block $B_2$ consists of
the subsets that do contain $a_k$. Each set in $B_1$ is a subset of
$\{a_1, a_2, \ldots, a_{k-1}\}$, and each subset of $\{a_1, a_2, \ldots, a_{k-1}\}$ is in $B_1$.
Thus $B_1$ is the set of all subsets of $\{a_1, a_2, \ldots, a_{k-1}\}$. Therefore
by our assumption in the first sentence of this paragraph, the size
of $B_1$ is $2^{k-1}$. Consider the function from $B_2$ to $B_1$ which takes
a subset of S including $a_k$ and removes $a_k$ from it. The set $B_2$ is
the domain of this function, because every set in $B_2$ contains $a_k$.
This function is onto, because if T is a set in $B_1$, then $T \cup \{a_k\}$
is a set in $B_2$ which the function sends to T. This function is
one-to-one because if V and W are two different sets in $B_2$, then
removing $a_k$ from both of them gives two different sets in $B_1$.
Thus we have a bijection between $B_1$ and $B_2$, so $B_1$ and $B_2$ have
the same size. Therefore by the Sum Principle the size of $B_1 \cup B_2$
is $2^{k-1} + 2^{k-1} = 2^k$. Therefore, S has $2^k$ subsets. This shows that
if a set of size $k - 1$ has $2^{k-1}$ subsets, then a set of size k has $2^k$
subsets. Therefore by the principle of mathematical induction, a
set of size n has $2^n$ subsets for every nonnegative integer n.
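The two blocks and the bijection used in the inductive step can be examined concretely for small k. The Python sketch below is only a check of the construction on examples, not a substitute for the proof; the names `subsets` and `check_inductive_step` are assumptions made for this illustration, and the set S is taken to be {1, 2, ..., k} with k playing the role of the last element.

```python
from itertools import combinations

def subsets(elements):
    """Return all subsets of `elements` as frozensets."""
    return [frozenset(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]

def check_inductive_step(k):
    """For S = {1, ..., k}, split the subsets of S by whether they contain k
    and test the claims made about the two blocks in the inductive step."""
    S = list(range(1, k + 1))
    last = S[-1]
    all_subsets = subsets(S)
    B1 = {s for s in all_subsets if last not in s}
    B2 = {s for s in all_subsets if last in s}
    image = {s - {last} for s in B2}          # remove the last element from each set in B2
    return (B1 == set(subsets(S[:-1]))        # B1 is all subsets of the smaller set
            and image == B1                   # removing the last element is onto B1
            and len(image) == len(B2)         # and one-to-one
            and len(B1) + len(B2) == 2 ** k)  # Sum Principle

print(all(check_inductive_step(k) for k in range(1, 8)))   # True
```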
The first sentence of the last paragraph is called the inductive hypothesis.
In an inductive proof we always make an inductive hypothesis as part
of proving that the truth of our statement when $n = k - 1$ implies the truth
of our statement when $n = k$. The last paragraph itself is called the
inductive step of our proof. In an inductive step we derive the statement for
$n = k$ from the statement for $n = k - 1$, thus proving that the truth of our
statement when $n = k - 1$ implies the truth of our statement when $n = k$.
The last sentence in the last paragraph is called the inductive conclusion.
All inductive proofs should have a base step, an inductive hypothesis,
an inductive step, and an inductive conclusion. There are a couple of details
worth noticing. First, in this exercise, our base step was the case $n = 0$, or in
other words, we had $b = 0$. However, in other proofs, b could be any integer,
positive, negative, or 0. Second, our proof that the truth of our statement
for $n = k - 1$ implies the truth of our statement for $n = k$ required that k be
at least 1, so that there would be an element $a_k$ we could remove in order
to describe our bijection. However, the second condition in the statement
of the Principle of Mathematical Induction only requires that we be able to
prove the implication for $k > 0$, so we were allowed to assume $k > 0$.
Exercise B.3. Use mathematical induction to prove your formula from
Exercise B.2.
Exercise B.4. Experiment with various values of n in the sum
$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n+1)} = \sum_{i=1}^{n} \frac{1}{i(i+1)}.$$
Guess a formula for this sum and prove your guess is correct by induction.

Exercise B.5. For large values of n, which is larger, $n^2$ or $2^n$? A graph
might help you decide what "large value" means here. Use mathematical
induction to prove that you are correct.
Exercise B.6. What is wrong with the following attempt at an inductive
proof that all integers in any consecutive set of n integers are equal for every
positive integer n?
For an arbitrary integer i, all integers from i to i are equal, so our
statement is true when n = 1. Now suppose k > 1 and all integers
in any consecutive set of $k - 1$ integers are equal. Let S be a set
of k consecutive integers. By the inductive hypothesis, the first
$k - 1$ elements of S are equal and the last $k - 1$ elements of S
are equal. Therefore all the elements in the set S are equal. Thus
by the Principle of Mathematical Induction, for every positive n,
every n consecutive integers are equal.

Appendix C

More on Equivalence Relations
Exercise C.1. Which of the reflexive, symmetric and transitive properties
does the < relation on the integers have?
Exercise C.2. A relation R on the set of ordered pairs of positive integers
that you learned about in grade school in another notation is the relation
that says (m, n) is related to (h, k) if mk = hn. Show that this relation is
an equivalence relation. In what context did you learn about this relation
in grade school?
Exercise C.3. Another relation that you may have learned about in school,
perhaps in the guise of "clock arithmetic," is the relation of equivalence
modulo n. For integers (positive, negative, or zero) a and b, we write
$$a \equiv b \pmod{n}$$
to mean that $a - b$ is an integer multiple of n, and in this case, we say that
a is congruent to b modulo n. Show that the relation of congruence
modulo n is an equivalence relation.
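Before writing the proof, it can help to test the three defining properties on a finite sample of integers. The Python sketch below is such an experiment; the function names and the sample range are assumptions made for this illustration, and n is assumed to be a positive integer. A finite check of this kind can expose a mistake in a definition, but it does not replace the proof the exercise asks for.

```python
from itertools import product

def congruent(a, b, n):
    """a is congruent to b modulo n: a - b is an integer multiple of n (n > 0)."""
    return (a - b) % n == 0

def passes_equivalence_checks(n, sample=range(-8, 9)):
    """Test reflexivity, symmetry, and transitivity on every choice from `sample`."""
    reflexive = all(congruent(a, a, n) for a in sample)
    symmetric = all(congruent(b, a, n)
                    for a, b in product(sample, repeat=2) if congruent(a, b, n))
    transitive = all(congruent(a, c, n)
                     for a, b, c in product(sample, repeat=3)
                     if congruent(a, b, n) and congruent(b, c, n))
    return reflexive and symmetric and transitive

print(all(passes_equivalence_checks(n) for n in range(1, 7)))   # True
```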
Exercise C.4. Define a relation on the set of all lists of n distinct integers
chosen from {1, 2, . . . , n}, by saying two lists are related if they have the
same elements (though perhaps in a different order) in the first k places,
and the same elements (though perhaps in a different order) in the last
$n - k$ places. Show this relation is an equivalence relation.
