
Notre Dame Math 43900

Instructor: David Galvin


Fall 2020

Abstract
This project contains all information relevant to the Fall 2020 running of Math
43900 — Problem Solving in Math at University of Notre Dame. This includes
• basic course information;
• weekly problem sets and solutions; and
• various pieces of supplementary material, added as they come up during the
semester.
It will replace the usual course website/Sakai page — everything that you need to
know about the course will be found here.
Here is the Zoom address for remote class participants:

https://notredame.zoom.us/j/95331468489

(meeting ID 953 3146 8489, passcode 314159).

Contents
1 Basic course information
1.1 Zoom url
1.2 COVID-19 policies
1.3 The absolute basics
1.4 Course description
1.4.1 Official description
1.4.2 A more helpful description
1.5 Assessment
1.6 Course policies
1.6.1 Attendance
1.6.2 Honor code
1.6.3 Class conduct

2 Week 1 (August 11) — Induction
2.1 A few problems to discuss in class
2.2 Induction
2.3 Some problems to work on for week 2
2.4 Solutions to induction problems

3 Week 2 (August 18) — Pigeonhole principle
3.1 Some problems to work on for week 3
3.2 Solutions to pigeon hole problems

4 Week 3 (August 25) — Binomial coefficients
4.1 Warm-up problems
4.2 Problems to think about for week 4
4.3 Solutions to warm-up problems
4.4 Solutions to Binomial coefficient problems

5 Week 4 (September 1) — Graphs
5.1 Some problems to think about for week 5
5.2 Solutions to graph theory problems

6 Week 5 (September 8) — Calculus
6.1 Problems to think about for week 6
6.2 Solutions to calculus problems

7 Week 6 (September 15) — Recurrences
7.1 Some problems to think about for week 7
7.2 Solutions to recurrence problems

8 Week 7 (September 22) — Writing solutions
8.1 Some problems to think about for week 8
8.2 Solutions to problems

9 Week 8 (September 29)

10 Week 9 (October 6) — Inequalities
10.1 Problems to think about for week 10
10.2 Solutions to problems on inequalities

11 Week 10 (October 13) — Modular arithmetic and greatest common divisor
11.1 Problems to think about for week 11
11.2 Solutions to problems on modular arithmetic

12 Week 11 (October 20) — Probability
12.1 Problems to think about for week 12
12.2 Solutions to problems on probability

13 Week 12 (October 27) — Polynomials
13.1 Problems to think about for week 13
13.2 Solutions to problems on polynomials

14 Week 13 (November 3) — Games
14.1 Problems to think about for week 14
14.2 Solutions to problems on games

15 Week 14 (November 10)

1 Basic course information
NOTE: Initial course policies are in flux, and can be expected to change right
up to the start of semester!

1.1 Zoom url


Here is the Zoom address for remote class participants:

https://notredame.zoom.us/j/95331468489

(meeting ID 953 3146 8489, passcode 314159).

1.2 COVID-19 policies


From https://here.nd.edu/health-safety/ (retrieved July 24, 2020):

All students, faculty, and staff in a University campus space will be required
to wear face coverings at all times when they are, or may be, in the presence
of other individuals, except when alone in a private room (office, assigned
residence hall room) or in a private vehicle.

Adherence to this policy concerning face coverings (worn covering both mouth and nose)
is an absolute requirement of in-person attendance. This is for my health and safety,
for your health and safety, and for the health and safety of all those that we come in
contact with, some of whom may be at high risk for suffering devastating consequences
from COVID-19.
Again from https://here.nd.edu/health-safety/ (retrieved July 24, 2020):

• Stay at least 6 feet (about 2 arms’ length) from other people


• Do not gather in groups
• Stay out of crowded spaces and avoid mass gatherings

The practical import of this for the weekly meetings is that you should not move the chairs
from their sticker-centered positions, and that you should avoid gathering in groups before
or after meetings.
Again from https://here.nd.edu/health-safety/ (retrieved July 24, 2020):

All members of the Notre Dame community (students, faculty and staff)
should conduct a daily health check by taking their temperature and assessing
symptoms.

So: if we are sick, or show symptoms, we should not attend meetings in person!!!
Because it is likely that at any given time, there will be some people who have to skip
the in-person meetings, all sessions will be streamed online, at

https://notredame.zoom.us/j/95331468489

(meeting ID 953 3146 8489, passcode 314159).


Also, it is possible that some days *I* will have to skip the in-person meetings, in which
case the class meeting will go on-line. This may happen at short notice, so I encourage
you to check your email each Tuesday afternoon, to see if there has been an announcement
from me about moving that week’s meeting on-line.

1.3 The absolute basics


• Course instructor: David Galvin, [email protected].

• Meeting times: Tues 4.05-4.55, Hayes-Healy 129, and

https://notredame.zoom.us/j/95331468489

(meeting ID 953 3146 8489, passcode 314159).

1.4 Course description


1.4.1 Official description
The main goal of this course is to develop problem solving strategies in mathematics.

1.4.2 A more helpful description


A very specific goal of this course is to help people prepare for the annual William Lowell
Putnam Mathematical Competition.
Every year, around 4500 US & Canadian undergraduates from around 550 institutions
participate in this competition, which takes place on-campus. The competition typically
consists of two three-hour sessions (morning and afternoon), with each session having
six problems. The problems are hard, not because they are made up of lots of parts,
or involve extensive computation, or require very advanced mathematics to solve. They
are hard because they each require a moment of cleverness, intuition and ingenuity
to reach a solution. Typically, the median score out of 120 (10 possible points per
question) is 1! The Putnam Competition may be one of the most challenging and rewarding
tests of mathematical skill that you will ever encounter. See
https://www.maa.org/math-competitions/putnam-competition for more information, including information
about prizes and recognition for high performers.
This year the Putnam Competition is scheduled for

Saturday, February 20

(a departure from the traditional December date). The precise details of how the
Competition will be administered have not yet been announced.

Each meeting of Math 43900 will (usually) be built around a specific theme (pigeon-hole
principle, induction & recursion, inequalities, probability, et cetera). We’ll talk about the
general theme, then spend time trying to solve some relevant problems. At the end of
each meeting I’ll hand out a set of problems on that theme, that you can cut your teeth
on. Usually we’ll begin the next session with presentations of solutions to some of those
problems. On occasional meetings, I might give out a problem set at the beginning, and
have everyone pick a problem or two to work on individually for the meeting period (a
sort of “mock Putnam”).
As the last paragraph indicates, the course also has the general goal of introducing
useful problem-solving techniques, and bits and pieces of useful mathematics that might
not have a natural home in other courses in the math curriculum.
Those who want to get the most out of the Putnam Competition are also encouraged
to take part in the Virginia Tech Regional Mathematics Contest. This usually
happens six weeks or so before the Putnam (again, on campus). More information on this
competition will be available at the start of September.

1.5 Assessment
The grade for the class will be determined solely by active participation in class
(participating in class discussions, occasionally presenting problem solutions on the board) and
by participation in the 2020 Putnam Competition.

1.6 Course policies


1.6.1 Attendance
Attendance at weekly meetings is required. The meetings will be held in-person, but to
accommodate those who are unable to attend in person due to health reasons, all sessions
will be streamed online, at

https://notredame.zoom.us/j/95331468489

(meeting ID 953 3146 8489, passcode 314159). Recordings of each class should also be
available, soon after each one ends — more details on this when semester begins.

1.6.2 Honor code


You have all taken the Honor Code pledge, to not participate in or tolerate academic
dishonesty. For this course, that means that although you may discuss assignments with
your colleagues, you must write the final version of each of your assignments on your own;
if you use any external sources to assist you (such as discussions on mathstackexchange,
computer programmes, et cetera), you should cite them clearly.

1.6.3 Class conduct
At the meetings you should feel free to engage in lively discussion about the course topics;
don’t be shy! But non-course-related interruptions should be kept to a minimum. In
particular, you should turn off or switch to silent all phones, etc., before the start of each
meeting. If for some good reason you need to have your phone on during meetings, please
mention it to me in advance.

2 Week 1 (August 11) — Induction
2.1 A few problems to discuss in class
Here are a few problems that we can solve together in class during the first meeting, to get
our feet wet. They are all particular favorites of mine, for one reason or another.

1. A room has 100 lockers, numbered 1 through 100, initially all closed. I run through
the room, and open every locker. Then I run through the room again, and close
the lockers numbered 2, 4, 6, et cetera (all the even numbered lockers). Next I run
through the room, and change the status of the lockers numbered 3, 6, 9, et cetera
(opening the closed ones, and closing the open ones). I keep going in this manner
(on the ith run through the room, I change the status of lockers numbered i, 2i, 3i,
et cetera), until on my 100th run through I change the status of locker number 100
only.
At the end of all this, which lockers are open?

2. In the picture below, which of the two shaded in regions (the red region and the blue
region, if you are looking at the pdf online) has the greater area? (The boundary
is a perfect quarter-circle. The two circle-like curves inside the quarter circle are
perfect semi-circles, whose diameters are radii of the quarter-circle.)

3. Alice and Bob want to know Carole’s birthday. She tells them that it is one of ten
dates, shown in the table below.

Then she whispers the month of her birthday to Alice, and the day of her birthday
to Bob. The following conversation ensues:

• Alice: Bob, I know that you don’t know Carole’s birthday.


• Bob: Now I know!
• Alice: And now so do I!

When is Carole’s birthday?¹

¹ This is a fairly straightforward example of a “knowledge puzzle”. If you want a real challenge, try to
solve this one: Carole (truthfully) tells Alice and Bob that she is thinking of two distinct positive integers,
both bigger than 1, whose sum is at most 100. She whispers the sum of the numbers to Alice, and she
whispers the product to Bob. The following conversation ensues:
• Alice: Bob, I know that you don’t know the two numbers.
• Bob: Now I know them!
• Alice: And now so do I!
What are the two numbers? (You should assume that Alice and Bob are very smart and very logical.)
ADDED AUGUST 18: This is the famous Sum and Product problem, and the solution is that the
numbers are 4 and 13. See https://en.wikipedia.org/wiki/Sum_and_Product_Puzzle for a detailed
discussion.
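
The locker problem (problem 1 above) lends itself to experimenting before theorizing. The following Python sketch is not part of the original notes; the function name `open_lockers` is made up for illustration. It simply carries out the runs exactly as the problem describes them.

```python
def open_lockers(count=100):
    """Simulate the runs through the room; return the numbers of the open lockers."""
    is_open = [False] * (count + 1)  # index 0 is unused; all lockers start closed
    for run in range(1, count + 1):
        # On run number `run`, change the status of lockers run, 2*run, 3*run, ...
        for locker in range(run, count + 1, run):
            is_open[locker] = not is_open[locker]
    return [k for k in range(1, count + 1) if is_open[k]]


if __name__ == "__main__":
    print(open_lockers())
```

Running this and looking at the output is a reasonable way to guess the pattern; explaining why the pattern holds is the actual problem.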

2.2 Induction
Often one can tackle a problem that involves one or more parameters by checking what
happens with small values of the parameters, noticing a pattern, and then establishing that
the pattern holds in general. The most powerful mathematical technique for establishing
the correctness of a pattern is induction.

Basic induction
Suppose that $P(n)$ is an assertion about the natural number $n$. Induction is essentially
the following: if there is some $a$ for which $P(a)$ is true, and if for all $n \ge a$ we have
that the truth of $P(n)$ implies the truth of $P(n+1)$, then we can conclude that $P(n)$
is true for all $n \ge a$. This should be fairly obvious: knowing $P(a)$ and the implication
“$P(n) \Longrightarrow P(n+1)$” at $n = a$, we immediately deduce $P(a+1)$. But now knowing
$P(a+1)$, the implication “$P(n) \Longrightarrow P(n+1)$” at $n = a+1$ allows us to deduce
$P(a+2)$; and so on. Induction is the mathematical tool that makes the “and so on” above
rigorous. Induction works because of the following fundamental fact, often referred to as
the well-ordering principle:

A non-empty subset of the natural numbers must have a least element.

To see why well-ordering allows induction to work, suppose that we know that $P(a)$ is
true for some $a$, and that we can argue that for all $n \ge a$, the truth of $P(n)$ implies the
truth of $P(n+1)$. Suppose now (for a contradiction) that there are some $n \ge a$ for which
$P(n)$ is not true. Let $F = \{n \mid n \ge a,\ P(n) \text{ not true}\}$. By assumption $F$ is non-empty, so
has a least element, $n_0$ say. We know $n_0 \ne a$, since $P(a)$ is true; so $n_0 \ge a+1$. That
means that $n_0 - 1 \ge a$, and since $n_0 - 1 \notin F$ (if it were, $n_0$ would not be the least element)
we know $P(n_0 - 1)$ is true. But then, by assumption, $P((n_0 - 1) + 1) = P(n_0)$ is true, a
contradiction!
Example: Prove that a set of size $n \ge 1$ has $2^n$ subsets (including the empty set and the
set itself).
Solution: Let $P(n)$ be the statement “a set of size $n$ has $2^n$ subsets”. We prove that $P(n)$
is true for all $n \ge 1$ by induction. We first establish a base case. When $n = 1$, the generic
set under consideration is $\{x\}$, which has $2 = 2^1$ subsets ($\{x\}$ and $\emptyset$); so $P(1)$ is true.
Next we establish the inductive step. Suppose that for some $n \ge 1$, $P(n)$ is true.
Consider $P(n+1)$. The generic set under consideration now is $\{x_1, \ldots, x_n, x_{n+1}\}$. We can
construct a subset of $\{x_1, \ldots, x_n, x_{n+1}\}$ by first forming a subset of $\{x_1, \ldots, x_n\}$, and then
either adding the element $x_{n+1}$ to this subset, or not. This tells us that the number of
subsets of $\{x_1, \ldots, x_n, x_{n+1}\}$ is 2 times the number of subsets of $\{x_1, \ldots, x_n\}$. Since $P(n)$
is assumed true, we know that $\{x_1, \ldots, x_n\}$ has $2^n$ subsets (this step is usually referred
to as applying the inductive hypothesis); so $\{x_1, \ldots, x_n, x_{n+1}\}$ has $2 \times 2^n = 2^{n+1}$ subsets.
This shows that the truth of $P(n)$ implies that of $P(n+1)$, and the proof by induction is
complete.
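
The inductive step in this example is constructive, so it can be mirrored directly in code. The Python sketch below is an illustration only (the name `all_subsets` is not from the notes): each time a new element arrives, every existing subset is kept and also copied with the new element added, which doubles the count.

```python
def all_subsets(elements):
    """List every subset of `elements`, built the way the inductive step builds them."""
    subsets = [frozenset()]  # the empty set has exactly one subset
    for x in elements:
        # Keep each old subset, and also add x to each old subset: the count doubles.
        subsets = subsets + [s | {x} for s in subsets]
    return subsets


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, len(all_subsets(range(n))))
```

The code starts from the empty set, while the proof in the text takes $n = 1$ as its base case; the doubling argument is the same either way.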

Strong Induction
Induction is a great tool because it gives you somewhere to start from in an argument.
And sometimes, the more you start with, the further you’ll go. That’s why the principle
of Strong Induction is worth keeping in mind: if there is some $a$ for which $P(a)$ is true,
and if for each $n > a$ we have that the truth of $P(m)$ for all $m$, $a \le m < n$, implies the
truth of $P(n)$, then we can conclude that $P(n)$ is true for all $n \ge a$.
The proof that this works is almost the same as the proof that induction works. What’s
good about strong induction is that when you are at the part of the argument where you
have to deduce the truth of $P(n)$ from some assumptions about earlier assertions,
you now have a lot more to work with: each of $P(a), P(a+1), \ldots, P(n-1)$, rather than
just $P(n-1)$ alone. Sometimes this is helpful, and sometimes it’s absolutely necessary.
Example: Prove that every integer $n \ge 2$ can be written as $n = p_1 \cdots p_\ell$ where the $p_i$’s
are (not necessarily distinct) prime numbers.
Solution: Let $P(n)$ be the statement: “$n$ can be written as $n = p_1 \cdots p_\ell$ where the $p_i$’s
are (not necessarily distinct) prime numbers”. We’ll prove that $P(n)$ is true for all $n \ge 2$
by strong induction.
$P(2)$ is true, since $2 = 2$ works.
Now consider $P(n)$ for some $n > 2$. We want to show how the (simultaneous) truth
of $P(2), \ldots, P(n-1)$ implies the truth of $P(n)$. If $n$ is prime, then $n = n$ works to show
that $P(n)$ holds. If $n$ is not a prime, then it’s composite, so $n = ab$ for some numbers $a, b$
with $2 \le a < n$ and $2 \le b < n$. We’re allowed to assume that $P(a)$ and $P(b)$ are true, that
is, that $a = p_1 \cdots p_\ell$ where the $p_i$’s are (not necessarily distinct) prime numbers, and that
$b = q_1 \cdots q_m$ where the $q_i$’s are (not necessarily distinct) prime numbers. It follows that
\[ n = ab = p_1 \cdots p_\ell \, q_1 \cdots q_m. \]
This is a product of (not necessarily distinct) prime numbers, and so $P(n)$ is true.
So, by strong induction, we conclude that $P(n)$ is true for all $n \ge 2$.
Notice that we would have gotten exactly nowhere with this argument if, in trying to
prove $P(n)$, all we had been allowed to assume was $P(n-1)$.
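
A strong induction proof often translates into a recursive procedure, with the recursive calls playing the role of the induction hypothesis. The Python sketch below illustrates this for the factorization example; the name `prime_factors` is chosen here, and finding the divisor by trial division is an assumption of the sketch, not something the proof specifies.

```python
import math


def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:
            # n = a * b with 2 <= a < n and 2 <= b < n, so both recursive calls
            # are on strictly smaller inputs (the strong induction hypothesis).
            return prime_factors(a) + prime_factors(n // a)
    return [n]  # no proper divisor was found, so n is prime and "n = n" works


if __name__ == "__main__":
    print(prime_factors(360))
```

Note that the recursion calls itself on `a` and on `n // a`, neither of which is `n - 1` in general; this is the code-level reason why ordinary induction gets nowhere here.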

Presenting a proof by induction


A proof by induction (any sort of proof, indeed), should be presented in complete sentences.
If you read the proof aloud, giving the mathematical symbols their usual English-language
names, it should form a coherent narrative. And remember that for a write-up of the
solution to a problem, the goal is not to convince yourself that you have solved the problem;
it is to convince someone else, who doesn’t get to ask you for clarification as they read
your solution.
Here is a template for the presentation of an induction proof.
Claim 2.1. For every natural number $n$,
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]
Proof: We proceed by induction on $n$.²
Base case: The base case $n = 1$ is obvious.³
Induction step: Let $n$ be an arbitrary natural number. Assume⁴
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]
From this we get
\begin{align*}
1 + 2 + 3 + \cdots + n + (n+1) &= (1 + 2 + 3 + \cdots + n) + (n+1) \\
&= \frac{n(n+1)}{2} + n + 1 \\
&= \frac{n^2 + n + 2n + 2}{2} \\
&= \frac{n^2 + 3n + 2}{2} \\
&= \frac{(n+1)(n+2)}{2} \\
&= \frac{(n+1)((n+1)+1)}{2}.
\end{align*}
The equality of the first and last expressions in this chain is the case $n+1$ of the assertion⁵,
so we have verified the induction step.
By induction the assertion is true for all $n$. □
In proving an identity — an equality between two expressions, both depending on some
variable(s) — by induction, it is often very helpful to start with one side of the n + 1 case
of the identity, and manipulate it via a sequence of equalities in a way that introduces one
side of the n case of the identity into the mix; this can then be replaced with the other
side of the n case, and then the whole thing might be massage-able into the other side of
the n + 1 identity. That’s exactly how we proceeded above.

² Here you might choose to say specifically that you are proving the predicate $P(n)$: “$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$”
for $n \in \mathbb{N}$; this is usually not necessary for proving a simple statement, but it can be very
useful, when proving a more complex statement, especially one involving multiple variables, to introduce
explicit notation for the predicate.
³ It’s ok to say this, if the base case really is obvious!
⁴ Or, if you have named the predicate $P(n)$, “assume $P(n)$”.
⁵ Or, “is $P(n+1)$”.

2.3 Some problems to work on for week 2
1. Find (with proof!) a formula for the sum of the first $n$ odd natural numbers.

2. A function $f(n)$ satisfies $f(1) = 5/2$ and, for each $n > 1$,
\[ f(n) = f(n-1)^2 - 2. \]
Find, with proof, a simple expression for $f(n)$.

3. Define polynomials $f_n(x)$ for $n \ge 0$ by $f_0(x) = 1$, $f_n(0) = 0$ for $n \ge 1$, and
\[ \frac{d}{dx} f_{n+1}(x) = (n+1) f_n(x+1) \]
for $n \ge 0$. Find, with proof, the explicit factorization of $f_{100}(1)$ into powers of
distinct primes.

4. Let $x_1, \ldots, x_n$ be $n$ positive numbers satisfying
\[ x_1 + x_2 + \cdots + x_n = \frac{1}{2}. \]
Prove that
\[ \frac{(1 - x_1)}{(1 + x_1)} \cdot \frac{(1 - x_2)}{(1 + x_2)} \cdots \frac{(1 - x_n)}{(1 + x_n)} \ge \frac{1}{3}. \]

5. Show that every positive integer can be written in the form $\pm 1^2 \pm 2^2 \pm \cdots \pm n^2$ for
some $n \ge 1$ and some choice of signs.

6. The numbers 1 through $2n$ are partitioned into two sets $A$ and $B$ of size $n$, in an
arbitrary manner. The elements $a_1, \ldots, a_n$ of $A$ are sorted in increasing order, that
is, $a_1 < a_2 < \cdots < a_n$, while the elements $b_1, \ldots, b_n$ of $B$ are sorted in decreasing
order, that is, $b_1 > b_2 > \cdots > b_n$. Find (with proof!) the value of the sum
\[ \sum_{i=1}^{n} |a_i - b_i|. \]

2.4 Solutions to induction problems
1. Find (with proof!) a formula for the sum of the first $n$ odd natural numbers.

Solution⁶: We claim that⁷ for each natural number $n$,
\[ 1 + 3 + 5 + \cdots + (2n-1) = n^2. \]
We prove this by induction on $n$⁸, with the base case⁹ $n = 1$ being obvious¹⁰.

For the induction step¹¹, assume that for some $n \ge 1$ we have $1 + 3 + 5 + \cdots + (2n-1) = n^2$.¹²
We want to deduce that
\[ 1 + 3 + 5 + \cdots + (2n-1) + (2n+1) = (n+1)^2 \]
(we are then done, by induction)¹³. To make this deduction, observe that¹⁴
\begin{align*}
1 + 3 + 5 + \cdots + (2n-1) + (2n+1) &= [1 + 3 + 5 + \cdots + (2n-1)] + (2n+1) \\
&= n^2 + (2n+1) \quad \text{(induction hypothesis)} \\
&= (n+1)^2.
\end{align*}
This completes the induction step, and so the proof (by induction) of the claim.¹⁵

⁶ For this problem, I’m going to write out a complete solution, to illustrate how a proof by induction
might be properly laid out. I’ll make comments along the way in the footnotes. Notice that the proof is
written in complete sentences. A well-written proof should read sensibly as a piece of English prose, as
long as you replace all the mathematical symbols with their usual English equivalents.
⁷ For any problem, not just one involving a proof by induction, that asks you to come up with an
answer to a question (rather than verify that a given answer is correct), you MUST begin with a clear
statement of your answer.
⁸ If you are using proof by induction, say so!
⁹ Clearly say that you are establishing a base case; that is part of a proof by induction.
¹⁰ It is ok to dismiss the base case like this, if it really is obvious.
¹¹ When you move on to the induction step, clearly say so.
¹² It’s helpful to say explicitly what the induction hypothesis is. It helps you as you write your proof,
and it helps the reader as she reads it.
¹³ This is a little overkill, but it can sometimes be helpful to write down the explicit goal, to focus
yourself on where you are going with the proof. Of course, if you are stating the goal hopefully like this, you
must be clear that this is something that you are going to prove/wish to prove, rather than something
that you have already proven!
¹⁴ Notice that what follows is a chain of equalities, each one following from the previous, either by
some basic algebra or by the induction hypothesis. So: I’m arguing from true/known statements, to the
statement I want. I’m not arguing from the statement I want to be true, to a true statement (logically,
that tells me nothing).
¹⁵ It’s always good to end a proof by induction with a clear statement that all necessary steps have been
completed. In part this is for your benefit: if you haven’t done all you should have done, you’ll likely
notice this while writing “I’ve done all I should have done”, and will then be able to correct the problem.

2. A function $f(n)$ satisfies $f(1) = 5/2$ and, for each $n > 1$,
\[ f(n) = f(n-1)^2 - 2. \]
Find, with proof, a simple expression for $f(n)$.

Solution¹⁶: Computing a few values strongly suggests
\[ f(n) = 2^{2^{n-1}} + \frac{1}{2^{2^{n-1}}}. \]
We prove this by induction, with the base case obvious, and for the induction step
\begin{align*}
f(n+1) &= f(n)^2 - 2 \\
&= \left( 2^{2^{n-1}} + \frac{1}{2^{2^{n-1}}} \right)^2 - 2 \\
&= 2^{2^n} + 2 + \frac{1}{2^{2^n}} - 2 \\
&= 2^{2^{(n+1)-1}} + \frac{1}{2^{2^{(n+1)-1}}}.
\end{align*}
(Notice that it was computationally quite convenient to deduce the formula for
$f(n+1)$ from that of $f(n)$ here; that is just as valid as deducing the formula for
$f(n)$ from that of $f(n-1)$).
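
“Computing a few values” is easy to do exactly by machine. The sketch below (an illustration only; the names `f_recursive` and `f_closed` are made up here) uses Python's `fractions.Fraction` so that no rounding is involved, and compares the recurrence against the conjectured closed form.

```python
from fractions import Fraction


def f_recursive(n):
    """Compute f(n) from f(1) = 5/2 and f(k) = f(k-1)^2 - 2, in exact arithmetic."""
    value = Fraction(5, 2)
    for _ in range(n - 1):
        value = value * value - 2
    return value


def f_closed(n):
    """The conjectured closed form 2^(2^(n-1)) + 1 / 2^(2^(n-1))."""
    power = 2 ** (2 ** (n - 1))
    return Fraction(power) + Fraction(1, power)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, f_recursive(n), f_recursive(n) == f_closed(n))
```

Agreement for small $n$ is evidence for the conjecture, not a proof; the induction above is still what establishes it for all $n$.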

3. Define polynomials $f_n(x)$ for $n \ge 0$ by $f_0(x) = 1$, $f_n(0) = 0$ for $n \ge 1$, and
\[ \frac{d}{dx} f_{n+1}(x) = (n+1) f_n(x+1) \]
for $n \ge 0$. Find, with proof, the explicit factorization of $f_{100}(1)$ into powers of
distinct primes.

Solution¹⁷:
The answer is $101^{99}$. It is a fairly easy induction that $f_n(x) = x(x+n)^{n-1}$ (left to
the reader). Once this relation is established, the result follows immediately.
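
The relation $f_n(x) = x(x+n)^{n-1}$ is the kind of pattern one finds by computing the first few polynomials. The Python sketch below is one possible way to do that, under assumptions made here for illustration: polynomials are stored as coefficient lists (lowest degree first), and all function names are hypothetical. It builds $f_{n+1}$ from $f_n$ by shifting, scaling and integrating with constant term 0, then compares with the claimed formula.

```python
from fractions import Fraction
from math import comb


def shift_by_one(coeffs):
    """Coefficients of p(x + 1), given the coefficients of p(x)."""
    out = [Fraction(0)] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += c * comb(k, j)  # expand c * (x + 1)^k by the binomial theorem
    return out


def next_f(coeffs, n):
    """Given f_n, return f_{n+1}: the antiderivative of (n+1) f_n(x+1) vanishing at 0."""
    shifted = shift_by_one(coeffs)
    return [Fraction(0)] + [(n + 1) * c / (k + 1) for k, c in enumerate(shifted)]


def closed_form(n):
    """Coefficients of x (x + n)^(n-1), for n >= 1."""
    return [Fraction(0)] + [Fraction(comb(n - 1, k) * n ** (n - 1 - k)) for k in range(n)]


def check_closed_form(up_to):
    """Check f_n against x (x + n)^(n-1) for n = 1, ..., up_to."""
    f = [Fraction(1)]  # f_0(x) = 1
    for n in range(up_to):
        f = next_f(f, n)
        if f != closed_form(n + 1):
            return False
    return True


if __name__ == "__main__":
    print(check_closed_form(12))
```

Since the value of a polynomial at 1 is the sum of its coefficients, `sum(closed_form(100))` evaluates the claimed formula at $x = 1$.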

¹⁶ Here I will just give a sketch of the solution; this should not be mistaken for a polished final solution.
¹⁷ This was problem B2 of the 1985 Putnam competition.
¹⁸ I found this problem on a Putnam prep class handout prepared by Amites Sarkar, Western Washington
University.

4. Let $x_1, \ldots, x_n$ be $n$ positive numbers satisfying
\[ x_1 + x_2 + \cdots + x_n = \frac{1}{2}. \]
Prove that
\[ \frac{(1 - x_1)}{(1 + x_1)} \cdot \frac{(1 - x_2)}{(1 + x_2)} \cdots \frac{(1 - x_n)}{(1 + x_n)} \ge \frac{1}{3}. \]

Solution¹⁸:
We prove the result by induction on $n$, with the base case $n = 1$ very easy (the
inequality holds with equality in this case).
For the induction step, assume that for some $n \ge 1$, whenever $x_1, \ldots, x_n$ are $n$
positive numbers satisfying
\[ x_1 + x_2 + \cdots + x_n = \frac{1}{2} \]
then
\[ \frac{(1 - x_1)}{(1 + x_1)} \cdot \frac{(1 - x_2)}{(1 + x_2)} \cdots \frac{(1 - x_n)}{(1 + x_n)} \ge \frac{1}{3}. \]
Let $x_1, \ldots, x_n, x_{n+1}$ be $n + 1$ positive numbers satisfying
\[ x_1 + x_2 + \cdots + x_n + x_{n+1} = \frac{1}{2}. \]
Setting $x'_i = x_i$ for $i = 1, \ldots, n-1$, and $x'_n = x_n + x_{n+1}$, we apply the induction
hypothesis to the numbers $x'_1, \ldots, x'_n$ (which are positive and sum to $1/2$) to conclude
\[ \frac{(1 - x'_1)}{(1 + x'_1)} \cdot \frac{(1 - x'_2)}{(1 + x'_2)} \cdots \frac{(1 - x'_n)}{(1 + x'_n)} \ge \frac{1}{3}. \tag{$\star$} \]
We claim that
\[ \frac{(1 - x_n)}{(1 + x_n)} \cdot \frac{(1 - x_{n+1})}{(1 + x_{n+1})} \ge \frac{(1 - x_n - x_{n+1})}{(1 + x_n + x_{n+1})}. \tag{$\star\star$} \]
If ($\star\star$) is true, then
\begin{align*}
\frac{(1 - x_1)}{(1 + x_1)} \cdots \frac{(1 - x_{n-1})}{(1 + x_{n-1})} \cdot \frac{(1 - x_n)}{(1 + x_n)} \cdot \frac{(1 - x_{n+1})}{(1 + x_{n+1})}
&\ge \frac{(1 - x_1)}{(1 + x_1)} \cdots \frac{(1 - x_{n-1})}{(1 + x_{n-1})} \cdot \frac{(1 - x_n - x_{n+1})}{(1 + x_n + x_{n+1})} \\
&= \frac{(1 - x'_1)}{(1 + x'_1)} \cdots \frac{(1 - x'_n)}{(1 + x'_n)} \\
&\ge \frac{1}{3},
\end{align*}
the last inequality by ($\star$).
So what remains is to prove ($\star\star$); but after a little algebra this reduces to
\[ 2 x_n^2 x_{n+1} + 2 x_n x_{n+1}^2 \ge 0, \]
which is clearly true.
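
An inequality like this one can be sanity-checked on random inputs before (or after) proving it. The Python sketch below is an illustration, with hypothetical function names; it uses exact rational arithmetic so that the equality case $n = 1$ is not disturbed by floating-point rounding.

```python
import random
from fractions import Fraction


def random_point(n, rng):
    """Return n positive rationals that sum to exactly 1/2."""
    weights = [rng.randint(1, 1000) for _ in range(n)]
    total = sum(weights)
    return [Fraction(w, 2 * total) for w in weights]


def product(xs):
    """The product of (1 - x) / (1 + x) over all x in xs."""
    result = Fraction(1)
    for x in xs:
        result *= (1 - x) / (1 + x)
    return result


def check(trials=200, max_n=8, seed=0):
    """Return True if no random trial violates product >= 1/3."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_point(rng.randint(1, max_n), rng)
        if product(xs) < Fraction(1, 3):
            return False
    return True


if __name__ == "__main__":
    print(check())
```

A passing check only says that no counterexample turned up among the sampled points; it does not replace the induction argument.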

5. Show that every positive integer can be written in the form $\pm 1^2 \pm 2^2 \pm \cdots \pm n^2$ for
some $n \ge 1$ and some choice of signs.

Solution: We prove this by induction. We have to be a little bit careful here, not
to be confused by using “$n$” in the induction hypothesis, and then thinking that we
have to represent $n$ in the form $\pm 1^2 \pm 2^2 \pm \cdots \pm n^2$; in fact, no restriction is placed
here on the number of terms we should use in the representation of any particular
natural number.
So, for each natural number $m$ let $P(m)$ be the proposition

there is some natural number $n$ such that $m$ can be represented in the
form $\pm 1^2 \pm 2^2 \pm \cdots \pm n^2$.

We will prove $P(m)$ by strong induction on $m$.¹⁹

We start by verifying the base cases $m = 1$ through $m = 4$: $P(1)$ is obvious. $P(2)$ is
less so, but after a little experimentation we find
\[ 2 = -1 - 4 - 9 + 16. \]
$P(3)$ is obvious: $3 = -1 + 4$. $P(4)$ is also easy: $4 = -1 - 4 + 9$.

So from here on we may assume $m \ge 5$, and that we have already established $P(k)$
for all $k < m$. By this induction hypothesis we have a representation of $m - 4$ (note
$m - 4 \ge 1$) in the form
\[ m - 4 = \varepsilon_1 1^2 + \varepsilon_2 2^2 + \cdots + \varepsilon_n n^2 \]
for some $n$, with each $\varepsilon_i \in \{+1, -1\}$. Now note that
\[ (n+1)^2 - (n+2)^2 - (n+3)^2 + (n+4)^2 = 4, \]
so
\[ m = \varepsilon_1 1^2 + \varepsilon_2 2^2 + \cdots + \varepsilon_n n^2 + (n+1)^2 - (n+2)^2 - (n+3)^2 + (n+4)^2, \]
and we are done with the induction.
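
This proof is constructive: it says exactly how to write down a choice of signs for any $m$. The Python sketch below follows that recipe (the names `BASE`, `signs_for` and `evaluate` are made up for this illustration). It starts from the base representation of the appropriate residue and appends blocks of signs $+, -, -, +$, each of which adds 4.

```python
# Sign patterns for the four base cases, taken from the proof:
# 1 = +1, 2 = -1 - 4 - 9 + 16, 3 = -1 + 4, 4 = -1 - 4 + 9.
BASE = {
    1: [1],
    2: [-1, -1, -1, 1],
    3: [-1, 1],
    4: [-1, -1, 1],
}


def signs_for(m):
    """Return signs e_1, ..., e_n with e_1 * 1^2 + ... + e_n * n^2 == m, for m >= 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    base = (m - 1) % 4 + 1          # the base case in {1, 2, 3, 4} congruent to m mod 4
    signs = list(BASE[base])
    for _ in range((m - base) // 4):
        signs += [1, -1, -1, 1]     # (n+1)^2 - (n+2)^2 - (n+3)^2 + (n+4)^2 = 4
    return signs


def evaluate(signs):
    """Compute the signed sum of squares described by a list of signs."""
    return sum(s * (i + 1) ** 2 for i, s in enumerate(signs))


if __name__ == "__main__":
    for m in (5, 10, 2020):
        print(m, len(signs_for(m)), evaluate(signs_for(m)))
```

The representations produced this way are far from the shortest possible; the proof only needs one to exist.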

¹⁹ Note that here we have given the proposition to be proved a name, $P(m)$. This is handy in this case,
since it is quite a wordy proposition.

6. The numbers 1 through $2n$ are partitioned into two sets $A$ and $B$ of size $n$, in an
arbitrary manner. The elements $a_1, \ldots, a_n$ of $A$ are sorted in increasing order, that
is, $a_1 < a_2 < \cdots < a_n$, while the elements $b_1, \ldots, b_n$ of $B$ are sorted in decreasing
order, that is, $b_1 > b_2 > \cdots > b_n$. Find (with proof!) the value of the sum
\[ \sum_{i=1}^{n} |a_i - b_i|. \]

Solution: The wording of the question strongly suggests that the answer is inde-
pendent of the choice of A and B, so we should start with a particularly nice choice
of A and B, see what answer we get, conjecture that this is always the answer, and
then try to prove the conjecture.
Letting A = {1, 2, 3, . . . , n} and B = {2n, 2n − 1, . . . , n + 1}, we find that
n
X
|ai − bi | = (2n − 1) + (2n − 3) + . . . + 1 = n2
i=1

(it is an easy induction that the sum of the first n odd positive integers is n2 ).
Xn
So, we try to prove by induction that |ai − bi | = n2 . The base case n = 1 is
i=1
trivial.
For n ≥ 2, we can consider two cases. Case 1 is when 1 and 2n end up in different
partition classes. Let’s start by considering 1 ∈ A, 2n ∈ B. In this case, |a1 − b1 | will
contribute 2n − 1 to the sum. What about the remaining terms? Notice that A \ {1}
and B \ {2n} form a partition {a2 , . . . , an } ∪ {b2 , . . . , bn } of {2, . . . , 2n − 1}, with the
a’s increasing and the b’s decreasing. Setting a01 = a2 − 1, a02 = a3 − 1, etc., up to
a0n−1 = an − 1, and also setting b01 = b2 − 1, b02 = b3 − 1, etc., up to b0n−1 = bn − 1, we
get that A0 and B 0 form a partition {a01 , . . . , a0n−1 } ∪ {b01 , . . . , b0n−1 } of {1, . . . , 2n − 2},
with the a0 ’s increasing and the b0 ’s decreasing. By induction,
n−1
X
|a0i − b0i | = (n − 1)2 ,
i=1

and so, since |a0i − b0i | = |ai+1 − bi+1 | for each i,


n
X n
X
|ai − bi | = (2n − 1) + |ai+1 − bi+1 |
i=1 i=2
n−1
X
= (2n − 1) + |a0i − b0i |
i=1
= (2n − 1) + (n − 1)2
= n2 .

If 2n ∈ A, 1 ∈ B, an almost identical argument gives the same result.


Case 2 is where 1 and 2n end up in the same partition class. We start with the case
where 1, 2n ∈ A. Let x be such that 1, 2, . . . , x ∈ A, but x + 1 ∈ B. Then $a_i = i$ for $i \le x$, $a_n = 2n$ and $b_n = x + 1$, so
$$\sum_{i=1}^{n} |a_i - b_i| = \sum_{i=1}^{x} (b_i - i) + \sum_{i=x+1}^{n-1} |a_i - b_i| + (2n - (x+1)).$$

(Note that this is valid even if x = n − 1, the largest it can possibly be; the second
sum in this case is empty and so 0). If we modify A and B to form $A' = \{a'_1, \ldots, a'_n\}$,
$B' = \{b'_1, \ldots, b'_n\}$ by swapping 1 and x + 1, then
$$\begin{aligned}
\sum_{i=1}^{n} |a'_i - b'_i| &= \sum_{i=1}^{x} (b_i - (i+1)) + \sum_{i=x+1}^{n-1} |a_i - b_i| + (2n - 1) \\
&= \sum_{i=1}^{x} (b_i - i) + \sum_{i=x+1}^{n-1} |a_i - b_i| + (2n - (x+1)) \\
&= \sum_{i=1}^{n} |a_i - b_i|.
\end{aligned}$$

But now (with $A'$ and $B'$) we are back in case 1, so
$$\sum_{i=1}^{n} |a_i - b_i| = \sum_{i=1}^{n} |a'_i - b'_i| = n^2.$$

A very similar reduction works if 1, 2n ∈ B.
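
The claim can also be checked by brute force for small n. The following is a minimal sketch, assuming (as in the solution above) that {1, . . . , 2n} is split into two classes of size n; the function names are illustrative and not part of the original notes.

```python
from itertools import combinations

def partition_sum(n, A):
    """Sum of |a_i - b_i| when A is one class of {1..2n} and B is its complement."""
    full = set(range(1, 2 * n + 1))
    a = sorted(A)                                # a_1 < a_2 < ... < a_n
    b = sorted(full - set(A), reverse=True)      # b_1 > b_2 > ... > b_n
    return sum(abs(x - y) for x, y in zip(a, b))

def check_all_partitions(n):
    """True if every choice of A (of size n) gives the sum n^2."""
    return all(partition_sum(n, A) == n * n
               for A in combinations(range(1, 2 * n + 1), n))

for n in range(1, 7):
    print(n, check_all_partitions(n))
```

A check like this is no substitute for the induction, but it is a cheap way to gain confidence in a conjecture before trying to prove it.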

3 Week 2 (August 18) — Pigeonhole principle
“If n + 1 pigeons settle themselves into a roost that has only n pigeonholes,
then there must be at least one pigeonhole that has at least two pigeons.”

This very simple principle, sometimes called the box principle, and sometimes Dirichlet’s
box principle, can be very powerful.
The proof is trivial: number the pigeonholes 1 through n, and consider the case where
$a_i$ pigeons land in hole i. If each $a_i \le 1$, then $\sum_{i=1}^{n} a_i \le n$, contradicting the fact that (since
there are n + 1 pigeons in all) $\sum_{i=1}^{n} a_i = n + 1$.
Since it’s a simple principle, to get some power out of it it has to be applied cleverly
(in the examples, there will be at least one such clever application). Applying the principle
requires identifying what the pigeons should be, and what the pigeonholes should be;
sometimes this is far from obvious.
The pigeonhole principle has many obvious generalizations. I’ll just state one of them:

“if more than mn pigeons settle themselves into a roost that has no more than
n pigeonholes, then there must be at least one pigeonhole that has at least
m + 1 pigeons”.

Example: 10 points are placed randomly in a 1 by 1 square. Show that there must be
some pair of points that are within distance $\sqrt{2}/3$ of each other.
Solution: Divide the square into 9 smaller squares, each of dimension 1/3 by 1/3. These
are the pigeonholes. The ten randomly chosen points are the pigeons. By the pigeonhole
principle, at least one of the 1/3 by 1/3 squares must have at least two of the ten points
in it. The maximum distance between two points in a 1/3 by 1/3 square is the distance
between two opposite corners. By Pythagoras this is $\sqrt{(1/3)^2 + (1/3)^2} = \sqrt{2}/3$, and we
are done.
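
The choice of pigeonholes can be turned directly into a search procedure. The sketch below is illustrative only: `close_pair` is a hypothetical helper name, and points on the right or top edge of the square are assigned to the last cell, which is an assumption the text does not need to make.

```python
import math
import random

def close_pair(points):
    """Return two points lying in the same 1/3-by-1/3 cell of the unit square.

    With at least 10 points the pigeonhole principle guarantees a hit;
    with fewer points the function may return None.
    """
    seen = {}
    for x, y in points:
        # Cell index (column, row); clamp so that x = 1 or y = 1 falls in the last cell.
        cell = (min(int(3 * x), 2), min(int(3 * y), 2))
        if cell in seen:
            return seen[cell], (x, y)
        seen[cell] = (x, y)
    return None

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(10)]
p, q = close_pair(pts)
# Small tolerance for floating-point rounding.
print(math.dist(p, q) <= math.sqrt(2) / 3 + 1e-12)
```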
Example: Show that there are two people in New York City who have exactly the same
number of hairs on their head.
Solution: Trivial, because surely there are at least two baldies in NYC! But even if we
weren’t sure of that: a quick websearch shows that a typical human head has around
150,000 hairs, and it is then certainly reasonable to assume that no one has more than
5,000,000 hairs on their head. Set up 5,000,001 pigeonholes, numbered 0 through 5,000,000,
and place a resident of NYC (a “pigeon”) into bin i if (s)he has i hairs on her head.
Another websearch shows that the population of NYC is around 8,300,000, so there are
more pigeons than pigeonholes, and some pigeonhole must have multiple pigeons in it.
Example: Show that every sequence of nm + 1 real numbers must contain EITHER a
decreasing subsequence of length n + 1 OR an increasing subsequence of length m + 1.

(In a sequence $a_1, a_2, \ldots$, an increasing subsequence is a subsequence $a_{i_1}, a_{i_2}, \ldots$ [with $i_1 < i_2 < \cdots$] satisfying $a_{i_1} \le a_{i_2} \le \cdots$, and a decreasing subsequence is defined analogously).
Solution: Let the sequence be a1 , . . . , anm+1 . For each k, 1 ≤ k ≤ nm + 1, let f (k) be
the length of the longest decreasing subsequence that starts with ak , and let g(k) be the
length of the longest increasing subsequence that starts with ak . Notice that f (k), g(k) ≥ 1
always.
If there is a k with either f (k) ≥ n + 1 or g(k) ≥ m + 1, we are done. If not, then
for every k we have 1 ≤ f (k) ≤ n and 1 ≤ g(k) ≤ m. Set up nm pigeonholes, with each
pigeonhole labeled by a different pair (i, j), 1 ≤ i ≤ n, 1 ≤ j ≤ m (there are exactly nm
such pairs). For each k, 1 ≤ k ≤ nm + 1, put ak in pigeonhole (i, j) iff f (k) = i and
g(k) = j. There are nm + 1 pigeons, so one pigeonhole, say hole (r, s), has at least
two pigeons in it.
In other words, there are two terms of the sequence, say ap and aq (where without loss
of generality p < q), with f (p) = f (q) = r and g(p) = g(q) = s.
Suppose ap ≥ aq . Then we can find a decreasing subsequence of length r + 1 starting
from ap , by starting ap , aq , and then proceeding with any decreasing subsequence of length
r that starts with aq (one such exists, since f (q) = r). But that says that f (p) ≥ r + 1,
contradicting f (p) = r.
On the other hand, suppose ap ≤ aq . Then we can find an increasing subsequence of
length s + 1 starting from ap , by starting ap , aq , and then proceeding with any increasing
subsequence of length s that starts with aq (one such exists, since g(q) = s). But that says
that g(p) ≥ s + 1, contradicting g(p) = s.
So, whether ap ≥ aq or ap ≤ aq , we get a contradiction, and we CANNOT ever be in
the case where there is NO k with either f (k) ≥ n + 1 or g(k) ≥ m + 1. This completes
the proof.
Remark: This beautiful result was discovered by P. Erdős and G. Szekeres in 1935; the
incredibly clever application of pigeonholes was given by A. Seidenberg in 1959.
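
The labels f(k) and g(k) used in the proof are easy to compute, which gives a way to test the statement on small cases. This is a sketch under the assumption that testing all permutations of nm + 1 distinct values is representative enough; the function names are hypothetical.

```python
from itertools import permutations

def longest_from_each(seq):
    """Return lists f, g where f[k] (resp. g[k]) is the length of the longest
    decreasing (resp. increasing) subsequence starting at seq[k].
    Ties are allowed, matching the weak inequalities used in the text."""
    n_terms = len(seq)
    f = [1] * n_terms
    g = [1] * n_terms
    # Work from the right so that f[j], g[j] are final for every j > k.
    for k in range(n_terms - 1, -1, -1):
        for j in range(k + 1, n_terms):
            if seq[k] >= seq[j]:
                f[k] = max(f[k], 1 + f[j])
            if seq[k] <= seq[j]:
                g[k] = max(g[k], 1 + g[j])
    return f, g

def erdos_szekeres_holds(n, m):
    """Check the statement on every permutation of nm + 1 distinct numbers."""
    for perm in permutations(range(n * m + 1)):
        f, g = longest_from_each(perm)
        # The proof shows that no two terms share the same label (f, g).
        if len(set(zip(f, g))) != len(perm):
            return False
        if max(f) < n + 1 and max(g) < m + 1:
            return False
    return True

print(erdos_szekeres_holds(2, 2))
```

The check that all labels are distinct mirrors the heart of the argument: two terms with the same label would give a contradiction, so nm + 1 terms cannot fit into nm labels.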

3.1 Some problems to work on for week 3
1. Given m integers a1 , . . . , am , show that there is a consecutive subsequence whose
sum is divisible by m. (A consecutive subsequence means a subsequence

ai , ai+1 , ai+2 , . . . ai+j−1

of length j, where j could be as small as one.)

2. (a) 51 different integers are chosen between 1 and 100, inclusive. Show that some
two of them are coprime (have no prime factor in common).
(b) 51 different integers are chosen between 1 and 100, inclusive. Show that there
are some two of them such that one divides the other.

3. Prove that from a set of ten distinct two-digit numbers, it is possible to select two
nonempty disjoint subsets whose members have the same sum.

4. Let A and B be 2 by 2 matrices with integer entries such that A, A + B, A + 2B,


A + 3B and A + 4B are all invertible matrices whose inverses have integer entries.
Show that A + 5B is invertible and that its inverse has integer entries.

5. A regular icosahedron is a convex polyhedron having 12 vertices and 20 faces; the


faces are congruent equilateral triangles. On each face of a regular icosahedron is
written a nonnegative integer such that the sum of all 20 integers is 39. Show that
there are two faces that share a vertex and have the same integer written on them.

6. Show that among any 256 people, there are either some 5 of them who mutually know
each other, or some 5 who mutually don’t know each other. (The relation “knowing”
is assumed to be symmetric — if I know you, you know me, and vice-versa.)

7. The Fibonacci numbers are defined by the recurrence f0 = 0, f1 = 1 and fn =


fn−1 + fn−2 for n ≥ 2. Show that the Fibonacci sequence is periodic modulo any
positive integer. (I.e, show that for each k ≥ 1, the sequence whose nth term is the
remainder of fn on division by k is a periodic sequence).

8. Start at the bottom right hand corner of a square, and draw a wrap-around straight
line.20 Show that if the slope of the line is irrational, then eventually the line will
get arbitrarily close to every point of the square.21

20
meaning: if the line hits the right-hand boundary of the square, three-quarters of the way up, then it
immediately jumps to the left-hand boundary, three-quarters of the way up, and keeps the same slope;
and if it hits the top boundary one-ninth of the way along, then it immediately jumps to the bottom
boundary, one-ninth of the way along, and keeps the same slope; et cetera.
21
meaning: for every point x in the square, and for every ε > 0, at some point the line will pass through
a point that is within ε of x.

3.2 Solutions to pigeon hole problems
1. Given m integers a1 , . . . , am , show that there is a consecutive subsequence whose
sum is divisible by m. (A consecutive subsequence means a subsequence

ai , ai+1 , ai+2 , . . . ai+j−1

of length j, where j could be as small as one.)

Solution: Look at the m numbers $a_1$, $a_1 + a_2$, etc., up to $a_1 + \cdots + a_m$. If any
one of these is divisible by m, we are done. If not, then these m numbers have
between them at most m − 1 remainders on division by m (1 through m − 1), so by
the pigeon-hole principle, some two of them must have the same remainder on division
by m.
Say those two are $a_1 + \cdots + a_k$ and $a_1 + \cdots + a_k + \cdots + a_\ell$ for some $\ell > k$. Then the
difference of these two, $a_{k+1} + \cdots + a_\ell$, is divisible by m.
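
The argument is constructive, and can be sketched in code. One assumption differs slightly from the write-up above: the empty prefix (sum 0) is included among the pigeons, which folds the case "some prefix sum is divisible by m" into the pigeonhole step. The name `divisible_run` is illustrative.

```python
def divisible_run(a):
    """Return (i, j) with 0 <= i < j <= m such that sum(a[i:j]) is divisible
    by m = len(a). Uses m + 1 prefix sums and m possible remainders."""
    m = len(a)
    first_seen = {0: 0}      # remainder of the empty prefix sum
    total = 0
    for j, x in enumerate(a, start=1):
        total += x
        r = total % m        # Python's % is non-negative for positive m
        if r in first_seen:
            return first_seen[r], j
        first_seen[r] = j
    raise AssertionError("unreachable by the pigeonhole principle")

a = [3, 1, 4, 1, 5]
i, j = divisible_run(a)
print(a[i:j], sum(a[i:j]))   # prints [1, 4] 5
```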

2. (a) 51 different integers are chosen between 1 and 100, inclusive. Show that some
two of them are coprime (have no prime factor in common).
Solution: The two parts to this problem were favorites of Paul Erdős. The first
is often called “Pósa’s soup problem”; see http://www.math.uwaterloo.ca/navigation/ideas/articles/honsberger/index.shtml for an explanation.
Among 51 numbers chosen from between 1 and 100, two must be consecutive,
and so coprime (use pigeon-hole principle with 50 pigeon-holes labelled “1, 2”,
“3, 4”, etc., up to “99, 100”).
(b) 51 different integers are chosen between 1 and 100, inclusive. Show that there
are some two of them such that one divides the other.
Solution: Every positive whole number can be expressed uniquely as $n = m 2^k$
where m is odd and k is a non-negative whole number. Create 50 pigeon-holes
labelled “1”, “3”, etc., up to “99”. Place number n in pigeon-hole labelled “m”
if $n = m 2^k$ for some non-negative whole number k. By the pigeon-hole principle,
there is some odd m such that there are two distinct numbers $n_1$, $n_2$ among the
51 with $n_1 = m 2^{k_1}$ and $n_2 = m 2^{k_2}$. The smaller of these divides the larger.
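
A short sketch of this pigeonholing, with hypothetical function names. It also illustrates that 51 cannot be lowered to 50: the numbers 51 through 100 have 50 distinct odd parts, and none of them divides another.

```python
import random

def odd_part(n):
    """The odd m with n = m * 2**k."""
    while n % 2 == 0:
        n //= 2
    return n

def dividing_pair(chosen):
    """Return (small, large) with the same odd part, so small divides large,
    or None if all odd parts are distinct."""
    seen = {}
    for n in sorted(chosen):
        m = odd_part(n)
        if m in seen:
            return seen[m], n
        seen[m] = n
    return None

random.seed(0)
sample = random.sample(range(1, 101), 51)
small, large = dividing_pair(sample)
print(large % small == 0)                 # prints True
print(dividing_pair(range(51, 101)))      # prints None
```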

3. Prove that from a set of ten distinct two-digit numbers, it is possible to select two
nonempty disjoint subsets whose members have the same sum.

Solution: This was problem 1 of the 1972 International Mathematical Olympiad.


There are $2^{10} - 1 = 1023$ non-empty subsets. The smallest sum that any of these sets
can have is 10, and the largest is 99 + 98 + 97 + 96 + 95 + 94 + 93 + 92 + 91 + 90 = 945.
So there are at most 936 possible sums among 1023 non-empty subsets; by PHP some
two, A and B, must have the same sum. These sets might not be disjoint, but
$A' := A \setminus (A \cap B)$ and $B' := B \setminus (A \cap B)$ are disjoint sets. Both are non-empty.

(Justification: it could only happen that $A'$ is empty if A ⊆ B; but since
A and B are not the same, we would then have B = A ∪ C for some
non-empty C, so the sum of the elements in B would be greater than that
in A, a contradiction. And similarly we can’t have $B'$ empty.)

Also, since we have removed the same set of elements from both A and B to get $A'$
and $B'$, and the sum of the elements of A is the same as that of B, it follows that
the sum of the elements of $A'$ is the same as that of $B'$.
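
The same steps can be followed mechanically. This sketch assumes the input is a list of ten distinct two-digit numbers (the example list is made up) and uses an illustrative function name.

```python
from itertools import combinations

def equal_sum_disjoint(nums):
    """Find two non-empty disjoint subsets of nums with equal sums.

    Scans non-empty subsets, remembering one subset per sum; the first
    repeated sum gives sets A and B, and the common part is then removed.
    """
    seen = {}
    for r in range(1, len(nums) + 1):
        for subset in combinations(nums, r):
            s = sum(subset)
            if s in seen:
                A, B = set(seen[s]), set(subset)
                return A - B, B - A
            seen[s] = subset
    return None

nums = [10, 23, 37, 41, 52, 68, 74, 85, 91, 99]   # hypothetical example input
A1, B1 = equal_sum_disjoint(nums)
print(sorted(A1), sorted(B1), sum(A1) == sum(B1))
```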

4. Let A and B be 2 by 2 matrices with integer entries such that A, A + B, A + 2B,


A + 3B and A + 4B are all invertible matrices whose inverses have integer entries.
Show that A + 5B is invertible and that its inverse has integer entries.

Solution: This was Problem A4 of the 1994 Putnam competition. The following
solution is due to Kiran Kedlaya:
First recall that an integer matrix A has an integer inverse if and only if det(A) = ±1.

Proof: clearly if $A^{-1}$ has integer entries, then $1 = \det(AA^{-1}) = \det(A)\det(A^{-1})$,
so det(A) divides 1. Conversely, if det(A) = ±1 then $A^{-1}$ equals 1/det(A)
times the transpose of the signed cofactor matrix of A, which has integer entries.

So let $f(n) = \det(A + nB)$. Clearly f is a quadratic polynomial in n with integer
coefficients (just write it out in terms of the entries); our claim is that if $f(i) \in \{1, -1\}$
for i = 0, 1, 2, 3, 4, then f(n) = ±1 for all n.
Note that since f has integer coefficients, x − y divides f (x) − f (y) for any integers
x and y.

Proof: If $f(x) = a_n x^n + \cdots + a_0$, then $f(x) - f(y) = a_n(x^n - y^n) + a_{n-1}(x^{n-1} - y^{n-1}) + \cdots$, and each term is divisible by x − y.

But if x and y both belong to {0, 1, 2, 3, 4} and |x − y| ≥ 3, then |f(x) − f(y)| ≤
|f(x)| + |f(y)| = 2, so f(x) − f(y), being divisible by x − y, must be 0. Using this, we conclude f(3) = f(0) =
f(4) = f(1) (apply what we just said to the two sides of each equality). Thus the
quadratic polynomial f(n) − f(1) has four zeroes; that’s too many, so it must be the
zero polynomial, and f(n) = f(1) = ±1 for all n.

Aside: It’s not enough to know that A, A + B, A + 2B, A + 3B are all invertible
with integer entries in their inverses, to conclude that for all n, A + nB is invertible
with integer entries in its inverse. To see this, consider:
$$A = \begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
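
A quick computation confirms the aside. The sketch below reads the matrices off the display above (A with rows (−1, 1) and (1, −2), B the identity) and evaluates det(A + nB) for n = 0, . . . , 4; the helper names are illustrative.

```python
def det2(M):
    """Determinant of a 2-by-2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    return a * d - b * c

A = [[-1, 1], [1, -2]]
B = [[1, 0], [0, 1]]

def det_A_plus_nB(n):
    """det(A + nB), the polynomial called f(n) in the solution."""
    return det2([[A[i][j] + n * B[i][j] for j in range(2)] for i in range(2)])

print([det_A_plus_nB(n) for n in range(5)])   # prints [1, -1, -1, 1, 5]
```

The determinant is ±1 for n = 0, 1, 2, 3 but equals 5 at n = 4, so A + 4B has no integer inverse.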

5. A regular icosahedron is a convex polyhedron having 12 vertices and 20 faces; the


faces are congruent equilateral triangles. On each face of a regular icosahedron is

written a nonnegative integer such that the sum of all 20 integers is 39. Show that
there are two faces that share a vertex and have the same integer written on them.

Solution: This was problem A1 from the 2013 Putnam competition.


A key fact to know about the icosahedron is how many faces meet at each vertex.
Suppose it is x. Then the number of pairs (v, F ) where v is a vertex and F is a face
that has v as a vertex is 12x (each of 12 vertices contributes x to the number); but
it is also 3 × 20 (each of 20 faces contributes 3 to the number). So x = 5.
Suppose that there are no two faces that share a vertex and have the same integer
written on them. Then, for each vertex, the smallest possible sum of the five numbers
on the faces meeting that vertex is 0 + 1 + 2 + 3 + 4 = 10. Consider the double sum
$$\sum_{\text{vertices } x} \ \sum_{\text{faces } F \text{ touching } x} (\text{number on } F).$$

We have just argued that for each x, the inner sum is at least 10, so the double
sum is at least 120. But because each face is a triangle, the double sum counts
each number exactly three times, and hence the sum is 3 × 39 = 117. This is a
contradiction; hence, there are two faces that share a vertex and have the same
integer written on them.

6. Show that among any 256 people, there are either some 5 of them who mutually know
each other, or some 5 who mutually don’t know each other. (The relation “knowing”
is assumed to be symmetric — if I know you, you know me, and vice-versa.)

Solution: This is (a version of) Ramsey’s theorem, a central theorem in combina-


torics. We’ll prove the more general statement:

among any $4^{n-1}$ people, there are either some n of them who mutually
know each other, or some n who mutually don’t know each other.

(Note that when n = 5 we have $4^{n-1} = 256$.)


Note that $4^{n-1} = 2^{2n-2}$. Pick a person $a_1$ arbitrarily. Of the remaining $2^{2n-2} - 1$
people, it must be the case that either $a_1$ knows at least $2^{2n-3}$ of them, or doesn’t
know at least $2^{2n-3}$ of them (this is pigeon-hole principle, essentially; if she knows
fewer than $2^{2n-3}$, and doesn’t know fewer than $2^{2n-3}$, this accounts for fewer than
$2^{2n-2} - 1$ people).
If $a_1$ knows at least $2^{2n-3}$ people, then label $a_1$ with a “K”, and select arbitrarily
a subset of the people she knows of size $2^{2n-3}$. Remove all other people from
consideration. If she doesn’t know at least $2^{2n-3}$ people, then label her with a “D”,
and select arbitrarily a subset of the people she doesn’t know of size $2^{2n-3}$. Remove
all other people from consideration.
Repeat: select an arbitrary person $a_2$ from among the $2^{2n-3}$ people left under
consideration after $a_1$ has been dealt with; among the $2^{2n-3} - 1$ people that $a_2$
may know or not know (not counting $a_1$), she either knows at least $2^{2n-4}$ of them,
or doesn’t know this many; in the former case, label $a_2$ “K” and select a subset
of size $2^{2n-4}$ of people (other than $a_1$) that she knows, removing all others from
consideration; in the latter case, label $a_2$ “D” and select a subset of size $2^{2n-4}$ of
people (other than $a_1$) that she doesn’t know, removing all others from consideration.
Iterate this process until we have selected $a_1, a_2, \ldots, a_{2n-2}$. Notice that when we
consider $a_{2n-2}$, there is one person left unconsidered (if $a_{2n-2}$ knows this person, she
gets label “K”; if not, label “D”). Call this last person $a_{2n-1}$.
Two labels have been used to label $a_1$ through $a_{2n-2}$, so, by pigeon-hole, one of the
labels must be used at least n − 1 times [note that we could have said the same thing
if we had only up to $a_{2n-3}$; so the “$4^{n-1}$” at the beginning of the problem could
be replaced by “$4^{n-1}/2$”]. Say that that label is “K”. Then any collection of n − 1
of the $a_i$’s with label “K”, together with $a_{2n-1}$, form a collection of n people who
mutually know each other (that $a_i$ knows $a_j$ for i < j follows from the fact that $a_i$
has label “K”). On the other hand, if that label is “D” then any collection of n − 1
of the $a_i$’s with label “D”, together with $a_{2n-1}$, form a collection of n people who
mutually don’t know each other.
7. The Fibonacci numbers are defined by the recurrence f0 = 0, f1 = 1 and fn =
fn−1 + fn−2 for n ≥ 2. Show that the Fibonacci sequence is periodic modulo any
positive integer. (I.e, show that for each k ≥ 1, the sequence whose nth term is the
remainder of fn on division by k is a periodic sequence).
Solution: I found this on the Northwestern Putnam prep class webpage.
Consider the sequence obtained from the Fibonacci sequence by taking the remainder
of each term on division by k (so the result is a sequence, all terms in {0, . . . , k − 1}).
Suppose that there are two consecutive terms in this sequence, say the mth and
(m + 1)st, taking values a, b, and two other consecutive terms, say the nth and
(n + 1)st, taking the same values a, b (with m < n). Then the (m + 2)nd and (n + 2)nd
terms of the reduced sequence agree.
[WHY? Because the (m + 2)nd term is the remainder of $F_{m+2}$ on division by k,
which is the remainder of $F_m + F_{m+1}$ on division by k, which is the remainder of
a + b on division by k (replacing each summand by its remainder does not change
the remainder of the sum), which is the remainder of $F_n + F_{n+1}$ on division by k,
which is the remainder of $F_{n+2}$ on division by k.]
The same argument shows that the reduced sequence is periodic beyond the mth
term, with period (at most) n − m. In fact it is periodic from the very start: since
$F_{j-1} = F_{j+1} - F_j$, a pair of consecutive terms also determines the term before it,
so the coincidence can be pushed back one step at a time until m = 0.
So all we need to do to find periodicity is to find two consecutive terms in the
sequence, that agree with two other consecutive terms. There are only $k^2$ possibilities
for a pair of consecutive values in the sequence, and infinitely many consecutive
values, so by PHP there has to be a coincidence of the required kind.
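
The pigeonhole bound also says how far one has to look: at most $k^2$ steps. The sketch below searches for the first return of the pair of consecutive remainders (0, 1); it assumes the sequence is periodic from the start (which holds because the recurrence can be run backwards), and the function name is illustrative.

```python
def fib_period_mod(k):
    """Smallest p >= 1 with (F_p, F_{p+1}) congruent to (F_0, F_1) modulo k."""
    start = (0, 1 % k)           # 1 % k handles k = 1, where every remainder is 0
    a, b = start
    for p in range(1, k * k + 1):
        a, b = b, (a + b) % k    # advance one step of the recurrence, reduced mod k
        if (a, b) == start:
            return p
    raise AssertionError("unreachable: only k*k pairs of remainders exist")

print([fib_period_mod(k) for k in (2, 3, 10)])   # prints [3, 8, 60]
```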

8. Start at the bottom right hand corner of a square, and draw a wrap-around straight
line.22 Show that if the slope of the line is irrational, then eventually the line will
get arbitrarily close to every point of the square.23

Solution: To be added. This is referred to as irrational winding/flow/rotation
around the torus, see e.g.

https://en.wikipedia.org/wiki/Linear_flow_on_the_torus,

and see

https://www.desmos.com/calculator/ghfkyjx2lc

for an interactive Desmos demonstration.

22
meaning: if the line hits the right-hand boundary of the square, three-quarters of the way up, then it
immediately jumps to the left-hand boundary, three-quarters of the way up, and keeps the same slope;
and if it hits the top boundary one-ninth of the way along, then it immediately jumps to the bottom
boundary, one-ninth of the way along, and keeps the same slope; et cetera.
23
meaning: for every point x in the square, and for every ε > 0, at some point the line will pass through
a point that is within ε of x.

4 Week 3 (August 25) — Binomial coefficients
Binomial coefficients crop up quite a lot in Putnam problems. This handout presents some
ways of thinking about them.

Introduction to binomial coefficients


 
The binomial coefficient $\binom{n}{k}$, with n ∈ N and k ∈ Z, can be defined many ways; possibly
the most helpful definition from the point of view of problem-solving is the following
combinatorial one:

$\binom{n}{k}$ is the number of subsets of size k of a set of size n.

In particular, this definition immediately tells us that for all n ≥ 0 we have $\binom{n}{k} = 0$ if
k > n or if k < 0, and that $\binom{n}{0} = \binom{n}{n} = 1$ (and so in particular $\binom{0}{0} = 1$).
The binomial coefficients can also be defined by a recurrence relation: for n ≥ 1, and
all k ∈ Z, we have the recurrence
$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}, \qquad \text{(Pascal's identity)}$$
with initial conditions $\binom{0}{k} = 0$ if $k \ne 0$, and $\binom{0}{0} = 1$. To see that this recurrence does
indeed generate the binomial coefficients, think about the combinatorial interpretation:
the subsets of {1, . . . , n} of size k ($\binom{n}{k}$ of them) partition into those that don’t include
element n ($\binom{n-1}{k}$ of them) and those that do include element n ($\binom{n-1}{k-1}$ of them). The
recurrence allows us to quickly compute small binomial coefficients via Pascal’s triangle:
the zeroth row of the triangle has length one, and consists just of the number 1. Below
that, the first row has two 1’s, one below and to the left of the 1 in the zeroth row, and
one below and to the right of the 1 in the zeroth row. The second row has three entries, a
1 below and to the left of the leftmost 1 in the first row, a 1 below and to the right of the
rightmost 1 in the first row, and in the center a 2. Each subsequent row contains one more
entry than the previous row, starting with a 1 below and to the left of the leftmost 1 in
the previous row, ending with a 1 below and to the right of the rightmost 1 in the previous
row, and with all other entries being the sum of the two entries in the previous row above
to the left and to the right of the entry being considered (see the picture below).

1

1 1

1 2 1

1 3 3 1

1 4 6 4 1

1 5 10 10 5 1

1 6 15 20 15 6 1
etc.
Pascal’s triangle in numbers
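
The row-by-row construction just described translates directly into a few lines of code. This is a minimal sketch (the function name is illustrative); it builds each row from the previous one using only Pascal's identity and compares the result with Python's built-in `math.comb`.

```python
from math import comb

def pascal_rows(num_rows):
    """Rows 0 .. num_rows - 1 of Pascal's triangle (num_rows >= 1), built
    from the recurrence: each interior entry is the sum of the two above it."""
    rows = [[1]]
    for n in range(1, num_rows):
        prev = rows[-1]
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows

for row in pascal_rows(7):
    print(row)

# Every entry agrees with the subset-counting definition as computed by math.comb.
print(all(row[k] == comb(n, k)
          for n, row in enumerate(pascal_rows(7))
          for k in range(n + 1)))
```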

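The kth entry in row n (counting from 0 rather than 1 both down and across) is then
$\binom{n}{k}$ (this is just a restatement of Pascal’s identity) (see the picture below).

$$\begin{array}{c}
\binom{0}{0} \\[2pt]
\binom{1}{0}\ \binom{1}{1} \\[2pt]
\binom{2}{0}\ \binom{2}{1}\ \binom{2}{2} \\[2pt]
\binom{3}{0}\ \binom{3}{1}\ \binom{3}{2}\ \binom{3}{3} \\[2pt]
\binom{4}{0}\ \binom{4}{1}\ \binom{4}{2}\ \binom{4}{3}\ \binom{4}{4} \\[2pt]
\binom{5}{0}\ \binom{5}{1}\ \binom{5}{2}\ \binom{5}{3}\ \binom{5}{4}\ \binom{5}{5} \\[2pt]
\binom{6}{0}\ \binom{6}{1}\ \binom{6}{2}\ \binom{6}{3}\ \binom{6}{4}\ \binom{6}{5}\ \binom{6}{6}
\end{array}$$
etc.
Pascal’s triangle symbolically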
 
Finally, there is an algebraic expression for $\binom{n}{k}$, that makes sense for all n, k ≥ 0,
using the factorial function (defined combinatorially as the number of ways of arranging n
distinct objects in order, and algebraically by n! = n(n − 1)(n − 2) . . . (3)(2)(1) for n ≥ 1,
with 0! = 1):
$$\binom{n}{k} = \frac{n(n-1)\cdots(n-(k-1))}{k!} = \frac{n!}{k!\,(n-k)!}$$
(the second form requiring k ≤ n).

To see this, note that n(n − 1) . . . (n − (k − 1)) is fairly evidently the number of ordered
lists of k distinct elements from {1, . . . , n} (often referred to in textbooks as “permutations
of n items taken k at a time” — ugh). When the ordered lists are turned into (unordered)
subsets, each subset appears k! times (once for each of the k! ways of putting k distinct
objects into an ordered list), so we need to divide the ordered count by k! to get the
unordered count.
When dealing with binomial coefficients, it is very helpful to bear all three definitions
in mind, but in particular the first two.

Identities
The binomial coefficients satisfy a staggering number of identities. The simplest of these
are easily understood using either the combinatorial or algebraic definitions; for the more
involved ones, that include sums, the algebraic definition is usually next to useless, and
often the easiest way to prove the identity is combinatorially, by showing that both sides
of the identity count the same thing in different ways (illustration below), though it is
often possible also to prove these identities by induction, using the recurrence relation.
Another approach that is helpful is that of generating functions.
Here are some of the basic binomial coefficient identities:
1. (Symmetry)
$$\binom{n}{k} = \binom{n}{n-k}$$
(Proof: trivial from the algebraic definition; combinatorially, left-hand side counts
selection of subsets of size k from a set of size n, by naming the selected elements;
right-hand side also counts selection of subsets of size k from a set of size n, this
time by naming the unselected elements).
2. (Lower summation)
$$\sum_{k=0}^{n} \binom{n}{k} = 2^n$$
(Proof: close to impossible using the algebraic definition; combinatorially, very
straightforward: left-hand side counts the number of subsets of a set of size n, by
first deciding the size of the subset, and then choosing the subset itself; right-hand
side also counts the number of subsets of a set of size n, by going through the n
elements one-by-one and deciding whether they are in the subset or not).
3. (Upper summation)
$$\sum_{m=k}^{n} \binom{m}{k} = \binom{n+1}{k+1}.$$

4. (Parallel summation)
$$\sum_{k=0}^{n} \binom{m+k}{k} = \binom{n+m+1}{n}.$$

5. (Square summation)
$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$$

6. (Vandermonde identity, or Vandermonde convolution)
$$\sum_{k=0}^{r} \binom{m}{k} \binom{n}{r-k} = \binom{n+m}{r}.$$
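
Before trying to prove an identity like these, it is worth checking it numerically for small parameters. The sketch below does this for all six identities with Python's `math.comb`; the function name and parameter ranges are illustrative choices, and the check assumes 0 ≤ k ≤ n, m ≥ 0 and r ≥ 0.

```python
from math import comb

def check_identities(n, m, k, r):
    """Check the six identities above for one choice of parameters
    (0 <= k <= n, m >= 0, r >= 0). Summation variables are renamed to j."""
    return all([
        comb(n, k) == comb(n, n - k),                                   # symmetry
        sum(comb(n, j) for j in range(n + 1)) == 2 ** n,                # lower summation
        sum(comb(j, k) for j in range(k, n + 1)) == comb(n + 1, k + 1), # upper summation
        sum(comb(m + j, j) for j in range(n + 1)) == comb(n + m + 1, n),  # parallel summation
        sum(comb(n, j) ** 2 for j in range(n + 1)) == comb(2 * n, n),   # square summation
        sum(comb(m, j) * comb(n, r - j) for j in range(r + 1)) == comb(n + m, r),  # Vandermonde
    ])

print(all(check_identities(n, m, k, r)
          for n in range(6) for m in range(6)
          for k in range(n + 1) for r in range(n + m + 1)))
```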

The binomial theorem


This is the most important identity involving binomial coefficients: for all real x and y,
and n ≥ 0,
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k.$$
This can be proved by induction using Pascal’s identity, but the proof is quite awkward.
Here’s a nice combinatorial proof. First, note that the identity is trivial if either x = 0 or
y = 0, so we may assume $x, y \ne 0$. Dividing through by $x^n$ and writing z = y/x, the identity is the same as
$$(1 + z)^n = \sum_{k=0}^{n} \binom{n}{k} z^k.$$

We will prove this combinatorially when z is a positive integer. The left-hand side counts
the number of words of length n from alphabet {0, 1, 2, . . . , z}, by deciding on the letters
one after the other. The right-hand side also counts the number of words of length n
from alphabet {0, 1, 2, . . . , z}, as follows: first decide how many of the letters of the word
are from {1, . . . , z} (this is the k of the summation). Next, decide the location of these
k letters (this is the $\binom{n}{k}$). Finally, decide what specific letters go into those spots, one
after another (this is the $z^k$) (note that the remaining n − k letters must all be 0’s).
This only shows the identity for positive integer z. But now we use the fact that
both the right-hand and left-hand sides are polynomials of degree n, so if they agree at
n + 1 different values of z, they must agree at all values of z (otherwise, their difference
is a not-identically-zero polynomial of degree at most n with n + 1 distinct roots, an
impossibility). And indeed, the two sides agree not just at n + 1 different values of z, but
at infinitely many (all positive integers z). So from the combinatorial argument that shows
that the two sides are equal for positive integers z, we infer that they are equal for all real
z. This argument is often called the polynomial principle.
There is a version of the binomial theorem also for non-positive-integral exponents: for
all real α,
$$(1 + z)^\alpha = \sum_{k \ge 0} \binom{\alpha}{k} z^k$$
where $\binom{\alpha}{k}$ is defined in the obvious way:
$$\binom{\alpha}{k} = \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!} \quad \left(\text{for } k \ge 1;\ \binom{\alpha}{0} = 1\right),$$
and the equality is valid for all real |z| < 1. (Check: when α is a positive integer, this
reduces to the standard binomial theorem).
For example, if $\ell$ is a positive integer, then
$$\binom{-\ell}{k} = \frac{(-\ell)(-\ell - 1) \cdots (-\ell - k + 1)}{k!} = (-1)^k \binom{\ell + k - 1}{k},$$
and so
$$\frac{1}{(1 - z)^\ell} = \sum_{k \ge 0} \binom{\ell + k - 1}{k} z^k.$$
This generalizes the familiar identity
$$\frac{1}{1 - z} = 1 + z + z^2 + \cdots.$$
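
The generalized coefficient is straightforward to compute exactly. The sketch below (with an illustrative function name) uses exact rational arithmetic from Python's `fractions` module, and checks the sign-flip identity for negative integer upper index against `math.comb`.

```python
from fractions import Fraction
from math import comb

def gen_binom(alpha, k):
    """alpha(alpha-1)...(alpha-k+1)/k! for an integer or Fraction alpha and an
    integer k >= 0 (the empty product gives 1 when k = 0)."""
    result = Fraction(1)
    for i in range(k):
        result *= Fraction(alpha - i, i + 1)
    return result

# Negative integer upper index: binom(-l, k) = (-1)^k * binom(l + k - 1, k).
print(all(gen_binom(-l, k) == (-1) ** k * comb(l + k - 1, k)
          for l in range(1, 6) for k in range(8)))

# A non-integer example: the coefficient of z^2 in (1 + z)^(1/2).
print(gen_binom(Fraction(1, 2), 2))   # prints -1/8
```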
Modulo the convergence analysis, the proof of the binomial theorem for general exponents
is fairly easy: the coefficient of $z^k$ in the Taylor series of $(1 + z)^\alpha$ is
$$\frac{1}{k!} \frac{d^k}{dz^k} (1 + z)^\alpha \Big|_{z=0} = \binom{\alpha}{k}.$$

Compositions and weak compositions


A composition of a positive integer n into k parts is a vector $(x_1, x_2, \ldots, x_k)$, with each entry
a strictly positive integer, and with $\sum_{i=1}^{k} x_i = n$. For example, (2, 1, 1, 3) is a composition
of 7, as is (1, 3, 1, 2); and, because a composition is a vector (ordered list), these two are
considered different compositions.
A weak composition of a positive integer n into k parts is a vector $(x_1, x_2, \ldots, x_k)$,
with each entry a non-negative (possibly 0) integer, and with $\sum_{i=1}^{k} x_i = n$. For example,
(2, 0, 1, 3) is a weak composition of 6, but not a composition.
How many weak compositions of n are there, into k parts? Put down n + k − 1 stars in
a row. Choose k − 1 of them to turn into bars. The resulting arrangement of stars-and-bars
encodes a weak composition of n into k parts — the number of stars before the first bar
is x1 , the number of stars between the first and second bar is x2 , and so on, up to the
number of stars after the last bar, which is xk (notice that only k − 1 bars are needed to
determine k intervals of stars). Conversely, every weak composition of n into k parts is
encoded by one such selection of k − 1 bars from the initial list of n + k − 1 stars. For

example, the configuration * * | | * | * * * encodes the weak composition (2, 0, 1, 3) of 6 into
4 parts. So, the number of weak compositions of n into k parts is a binomial coefficient,
$\binom{n+k-1}{k-1}$.
How many compositions of n are there, into k parts? Each such composition $(x_1, x_2, \ldots, x_k)$
gives rise to a weak composition $(x_1 - 1, x_2 - 1, \ldots, x_k - 1)$ of n − k into k parts, and all
weak compositions of n − k into k parts are achieved by this process. So, the number of
compositions of n into k parts is the same as the number of weak compositions of n − k
into k parts, which is $\binom{(n-k)+k-1}{k-1} = \binom{n-1}{k-1}$.
For example: I like plain cake, chocolate cake, blueberry cake and pumpkin cake donuts
from Dunkin’ Donuts. In how many different ways can I buy a dozen donuts that I like? I
must buy x1 plain, x2 chocolate, x3 blueberry and x4 pumpkin, with x1 + x2 + x3 + x4 = 12,
and with each $x_i$ a non-negative integer (possibly 0). So the number of different purchases
I can make is the number of weak compositions of 12 into 4 parts, so $\binom{12+4-1}{4-1} = \binom{15}{3} = 455$.

4.1 Warm-up problems


We won’t talk about these in class, but they are worth working on if they are new to you.
1. Give a combinatorial proof of the upper summation identity.
2. Give a combinatorial proof of the parallel summation identity.
3. Give a combinatorial proof of the square summation identity.
4. Give a combinatorial proof of the Vandermonde identity.
5. Evaluate
$$\sum_{k=0}^{n} (-1)^k \binom{n}{k}$$
for n ≥ 1.
6. (a) Let an be the number of 0-1 strings of length n that do not have two consecutive
1’s. Find a recurrence relation for an (starting with initial conditions a0 = 1,
a1 = 2).
(b) Let an,k be the number of 0-1 strings of length n that have exactly k 1’s and that
do not have two consecutive 1’s. Express an,k as a (single) binomial coefficient.
(c) Use the results of the previous two parts to give a combinatorial proof (showing
that both sides count the same thing) of the identity
$$F_n = \sum_{k \ge 0} \binom{n-k-1}{k}$$
where $F_n$ is the nth Fibonacci number ($F_1 = 1$, $F_2 = 1$, $F_n = F_{n-1} + F_{n-2}$ for
n ≥ 3).

4.2 Problems to think about for week 4
1. Find a simple expression (not involving a sum) for
$$1^2 \binom{n}{1} + 2^2 \binom{n}{2} + 3^2 \binom{n}{3} + \cdots + n^2 \binom{n}{n}.$$
2. n points are arranged on a circle. All possible diagonals are drawn. Assuming that
no three of the diagonals meet at a single point, how many intersections of diagonals
are there inside the circle?
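3. (a) The kth falling power of x is $x^{\underline{k}} = x(x-1)(x-2)\cdots(x-(k-1))$. Prove (see footnote 24)
that for all real x, y, and all n ≥ 1,
$$(x + y)^{\underline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\underline{n-k}}\, y^{\underline{k}}.$$

(b) The kth rising power of x is $x^{\overline{k}} = x(x+1)(x+2)\cdots(x+(k-1))$. Prove (see footnote 25) that
for all real x, y, and all n ≥ 1,
$$(x + y)^{\overline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\overline{n-k}}\, y^{\overline{k}}.$$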
k

4. Prove that the expression
$$\frac{\gcd(m, n)}{n} \binom{n}{m}$$
is an integer for all pairs of integers n ≥ m ≥ 1.
5. Evaluate
$$\sum_{k=0}^{n} \binom{n}{k} F_{k+1}$$
for n ≥ 0, where $F_1, F_2, F_3, F_4, F_5, \ldots$ are the Fibonacci numbers 1, 1, 2, 3, 5, . . ..
6. Show (see footnote 26) that for every n, m ≥ 0,
$$\int_0^1 x^n (1 - x)^m \, dx = \frac{1}{(n + m + 1) \binom{n+m}{n}}.$$
$^{24}$ Hint: Think combinatorially.
$^{25}$ Hint: deduce it from 3(a).
$^{26}$ So,
\[ \binom{n+m}{m} = \frac{1}{(n+m+1)\int_0^1 x^n(1-x)^m\,dx}. \]
This suggests a way of defining $\binom{\alpha}{\beta}$ for any non-negative real numbers α, β with α > β, namely
\[ \binom{\alpha}{\beta} = \frac{1}{(\alpha+1)\int_0^1 x^{\beta}(1-x)^{\alpha-\beta}\,dx}. \]
This is a quite standard way to extend the binomial coefficients beyond integer arguments.

7. Define a selfish set to be a set which has its own cardinality (number of elements) as
an element. Find the number of subsets of {1, 2, . . . , n} which are minimal selfish
sets, that is, selfish sets none of whose proper subsets is selfish.

8. Three distinct vertices are chosen at random from the vertices of a regular (2n + 1)-
sided polygon. If all such choices are equally likely, what is the probability that
the center of the polygon lies in the interior of the triangle determined by the three
chosen vertices?

4.3 Solutions to warm-up problems


1. Give a combinatorial proof of the upper summation identity.

Solution: RHS is number of subsets of $\{1,\ldots,n+1\}$ of size $k+1$, counted directly.

LHS counts same, by first specifying largest element in subset (if largest element is
$k+1$, remaining $k$ must be chosen from $\{1,\ldots,k\}$, $\binom{k}{k}$ ways; if largest element is
$k+2$, remaining $k$ must be chosen from $\{1,\ldots,k+1\}$, $\binom{k+1}{k}$ ways; etc.).
2. Give a combinatorial proof of the parallel summation identity.

Solution: RHS is number of subsets of $\{1,\ldots,n+m+1\}$ of size n, counted directly.

LHS counts same, by first specifying the smallest element not in subset (if smallest
missed element is 1, all n elements must be chosen from $\{2,\ldots,n+m+1\}$, $\binom{m+n}{n}$
ways, the k = n term; if smallest missed element is 2, then 1 is in subset and remaining
n − 1 elements must be chosen from $\{3,\ldots,n+m+1\}$, $\binom{m+n-1}{n-1}$ ways, the
k = n − 1 term; etc., down to: if smallest missed element is n + 1, then $\{1,\ldots,n\}$ is
in subset and remaining 0 elements must be chosen from $\{n+2,\ldots,n+m+1\}$, $\binom{m+0}{0}$
ways, the k = 0 term).

3. Give a combinatorial proof of the square summation identity.

Solution: RHS is number of subsets of $\{\pm 1,\ldots,\pm n\}$ of size n, counted directly.

LHS counts same, by first specifying k, the number of positive elements chosen, then
selecting k positive elements ($\binom{n}{k}$ ways), then selecting the k negative elements
that are not chosen (so the n − k that are, for n in total) ($\binom{n}{k}$ ways).
4. Give a combinatorial proof of the Vandermonde identity.

Solution: Let $A = \{x_1,\ldots,x_m\}$ and $B = \{y_1,\ldots,y_n\}$ be disjoint sets. RHS is
number of subsets of $A \cup B$ of size r, counted directly. LHS counts same, by first
specifying k, the number of elements chosen from A, then selecting k elements from
A ($\binom{m}{k}$ ways), then selecting the remaining r − k elements from B ($\binom{n}{r-k}$ ways).
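The four identities can be spot-checked numerically. The sketch below assumes the usual statements of the identities (they are stated earlier in the notes, and are written here in the form that matches the counting arguments above); the function names are illustrative only.

```python
from math import comb

def upper_summation_holds(n, k):
    # sum_{j=k}^{n} C(j, k) = C(n+1, k+1): classify (k+1)-subsets by largest element j+1
    return sum(comb(j, k) for j in range(k, n + 1)) == comb(n + 1, k + 1)

def parallel_summation_holds(m, n):
    # sum_{k=0}^{n} C(m+k, k) = C(n+m+1, n): classify n-subsets by smallest missing element
    return sum(comb(m + k, k) for k in range(n + 1)) == comb(n + m + 1, n)

def square_summation_holds(n):
    # sum_{k=0}^{n} C(n, k)^2 = C(2n, n)
    return sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n)

def vandermonde_holds(m, n, r):
    # sum_{k=0}^{r} C(m, k) C(n, r-k) = C(m+n, r); comb returns 0 when the lower index is too big
    return sum(comb(m, k) * comb(n, r - k) for k in range(r + 1)) == comb(m + n, r)
```

A check over small parameters is of course no proof, but it is a cheap way to catch a misremembered index.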
5. Evaluate
\[ \sum_{k=0}^{n} (-1)^k \binom{n}{k} \]
for n ≥ 1.

Solution: Applying the binomial theorem with x = 1, y = −1 get
\[ 0 = (1-1)^n = \sum_{k=0}^{n} \binom{n}{k} 1^{n-k}(-1)^k = \sum_{k=0}^{n} (-1)^k \binom{n}{k}, \]
so the sum is 0.
6. (a) Let $a_n$ be the number of 0-1 strings of length n that do not have two consecutive
1’s. Find a recurrence relation for $a_n$ (starting with initial conditions $a_0 = 1$,
$a_1 = 2$).
Solution: By considering whether the last term is a 0 or a 1, get the Fibonacci
recurrence: $a_n = a_{n-1} + a_{n-2}$. (If the last term is 0, the first n − 1 terms form any
valid string; if it is 1, the term before it must be 0, and the first n − 2 terms form any
valid string.)
(b) Let $a_{n,k}$ be the number of 0-1 strings of length n that have exactly k 1’s and that
do not have two consecutive 1’s. Express $a_{n,k}$ as a (single) binomial coefficient.
Solution: Add a 0 to the beginning and end of such a string. By reading off
$a_1$, the number of 0’s before the first 1, then $a_2$, the number of 0’s between the
first 1 and the second, and so on up to $a_{k+1}$, the number of 0’s after the last 1,
we get a composition $(a_1,\ldots,a_{k+1})$ of n + 2 − k into k + 1 parts; and each such
composition can be encoded (uniquely) by such a string. So $a_{n,k}$ is the number
of compositions of n + 2 − k into k + 1 parts, and so equals $\binom{n+1-k}{k}$.
(c) Use the results of the previous two parts to give a combinatorial proof (showing
that both sides count the same thing) of the identity
\[ F_n = \sum_{k \ge 0} \binom{n-k-1}{k} \]
where $F_n$ is the nth Fibonacci number ($F_1 = 1$, $F_2 = 1$, $F_n = F_{n-1} + F_{n-2}$ for
n ≥ 3).
Solution: From the recurrence in the first part, we get $a_n = F_{n+2}$, so $F_n$ counts
the number of 0-1 strings of length n − 2 with no two consecutive 1’s. We can
count such strings by first deciding on k, the number of 1’s, and by the second
part, the number of such strings is $\binom{n-1-k}{k}$. Summing over k we get the
result.
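All three parts can be checked by brute force for small n. This is a minimal sketch; `count_strings` and `fib` are illustrative helper names, not part of the notes.

```python
from itertools import product
from math import comb

def count_strings(n, k=None):
    """Count 0-1 strings of length n with no two consecutive 1's
    (and with exactly k 1's, if k is given)."""
    total = 0
    for letters in product("01", repeat=n):
        word = "".join(letters)
        if "11" in word:
            continue
        if k is None or word.count("1") == k:
            total += 1
    return total

def fib(n):
    """Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def fib_via_binomials(n):
    """Right-hand side of the identity in part (c), for n >= 1."""
    return sum(comb(n - k - 1, k) for k in range(n))
```

Here `count_strings(n)` plays the role of $a_n$ and `count_strings(n, k)` the role of $a_{n,k}$.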

4.4 Solutions to Binomial coefficient problems
1. Find a simple expression (not involving a sum) for
\[ 1^2\binom{n}{1} + 2^2\binom{n}{2} + 3^2\binom{n}{3} + \cdots + n^2\binom{n}{n}. \]

Solution: This was on the Putnam in 1962. It was question A5. These days, A5 is
typically a much more involved question!

We claim that the (or at least an) answer is $n(n+1)2^{n-2}$.

From the binomial theorem
\[ (1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k. \qquad (\star) \]
Differentiating both sides with respect to x twice, get
\[ n(n-1)(1+x)^{n-2} = \sum_{k=0}^{n} k(k-1)\binom{n}{k} x^{k-2} = \sum_{k=0}^{n} (k^2-k)\binom{n}{k} x^{k-2}, \]
and evaluating at x = 1 get
\[ n(n-1)2^{n-2} = (1^2-1)\binom{n}{1} + (2^2-2)\binom{n}{2} + (3^2-3)\binom{n}{3} + \cdots + (n^2-n)\binom{n}{n}. \qquad (\star\star) \]
Differentiating both sides of $(\star)$ with respect to x once, get
\[ n(1+x)^{n-1} = \sum_{k=0}^{n} k\binom{n}{k} x^{k-1}, \]
and evaluating at x = 1 get
\[ n2^{n-1} = 1\binom{n}{1} + 2\binom{n}{2} + 3\binom{n}{3} + \cdots + n\binom{n}{n}. \qquad (\star\star\star) \]
Adding $(\star\star)$ and $(\star\star\star)$ get
\[ 1^2\binom{n}{1} + 2^2\binom{n}{2} + 3^2\binom{n}{3} + \cdots + n^2\binom{n}{n} = n(n-1)2^{n-2} + n2^{n-1} = n(n+1)2^{n-2}. \]
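The closed form is easy to test against the sum directly. A minimal sketch (function names are illustrative); the closed form is written as $n(n+1)2^n/4$ so that it stays in integer arithmetic even when n = 1.

```python
from math import comb

def weighted_sum(n):
    # 1^2 C(n,1) + 2^2 C(n,2) + ... + n^2 C(n,n)
    return sum(k * k * comb(n, k) for k in range(1, n + 1))

def closed_form(n):
    # n(n+1) 2^(n-2), computed without fractional powers of 2
    return n * (n + 1) * 2 ** n // 4
```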

2. n points are arranged on a circle. All possible diagonals are drawn. Assuming that
no three of the diagonals meet at a single point, how many intersections of diagonals
are there inside the circle?

Solution: This is an old classic.
We claim that the answer is $\binom{n}{4}$.
Each intersection inside the circle determines a unique collection of four of the points
on the circle, by: two lines meet at each intersection, and each of the two lines has
two endpoints. Conversely, each set of four points on the circle determines a unique
point of intersection, by: if the four points are, in clockwise order, a, b, c, d, then
the associated point of intersection is the intersection of the lines ac and bd.
It follows that there are exactly as many intersections of diagonals inside the circle
as there are sets of four points on the circle; and there are $\binom{n}{4}$ such sets of points.
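The bijection can be checked combinatorially: label the points 0, 1, …, n − 1 in order around the circle, and note that two chords cross inside the circle exactly when their endpoints interleave. A minimal sketch under that labelling assumption (the function name is illustrative):

```python
from itertools import combinations
from math import comb

def crossing_pairs(n):
    """Count pairs of chords between n labelled points on a circle whose
    endpoints strictly interleave, i.e. which cross inside the circle."""
    chords = list(combinations(range(n), 2))  # each chord (a, b) has a < b
    count = 0
    for (a, b), (c, d) in combinations(chords, 2):
        if a < c < b < d or c < a < d < b:
            count += 1
    return count
```

The list of chords includes the sides of the polygon, but a side can never interleave with another chord, so it contributes nothing to the count.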

3. (a) The kth falling power of x is $x^{\underline{k}} = x(x-1)(x-2)\cdots(x-(k-1))$. Prove$^{27}$
that for all real x, y, and all n ≥ 1,
\[ (x+y)^{\underline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\underline{n-k}}\, y^{\underline{k}}. \]

Solution: This part and the next are standard Binomial coefficient identities.
This problem was B1 on the 1962 Putnam competition.
An argument by induction is possible. But there is also a combinatorial
argument: Let x and y be positive integers. The number of words in alphabet
$\{1,\ldots,x\} \cup \{x+1,\ldots,x+y\}$ of length n with no two repeating letters, counted
by selecting letter-by-letter, is $(x+y)^{\underline{n}}$. If instead we count by first selecting k,
the number of letters from $\{x+1,\ldots,x+y\}$ used, then locating the k positions
in which those letters appear, then selecting the n − k letters from $\{1,\ldots,x\}$
letter-by-letter in the order that they appear in the word, and finally selecting
the k letters from $\{x+1,\ldots,x+y\}$ letter-by-letter in the order that they
appear in the word, we get a count of $\sum_{k=0}^{n} \binom{n}{k} x^{\underline{n-k}}\, y^{\underline{k}}$. So the identity is true
for positive integers x, y.
The LHS and RHS are polynomials in x and y of degree n, so the difference
is a polynomial in x and y of degree at most n, which we want to show is
identically 0. Write the difference as $P(x,y) = p_0(x) + p_1(x)y + \cdots + p_n(x)y^n$
where each $p_i(x)$ is a polynomial in x of degree at most n. Setting x = 1 we
get a polynomial P(1, y) in y of degree at most n. This is 0 for all integers
y > 0 (by our combinatorial argument), so by the polynomial principle$^{28}$ it is
identically 0. So each $p_i(x)$ evaluates to 0 at x = 1. But the same argument
shows that each $p_i(x)$ evaluates to 0 at any positive integer x. So again by the
polynomial principle, each $p_i(x)$ is identically 0 and so P(x, y) is. This proves
the identity for all real x, y.

$^{27}$ Hint: Think combinatorially.
$^{28}$ A polynomial p(x) that is 0 at infinitely many points must be identically 0. In fact, a polynomial of
degree at most n that is 0 at n + 1 points must be identically 0.

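(b) The kth rising power of x is $x^{\overline{k}} = x(x+1)(x+2)\cdots(x+(k-1))$. Prove$^{29}$ that
for all real x, y, and all n ≥ 1,
\[ (x+y)^{\overline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\overline{n-k}}\, y^{\overline{k}}. \]

Solution: Set $x' = -x$ and $y' = -y$; we have
\[ (x+y)^{\overline{n}} = (-x'-y')^{\overline{n}} = (-1)^n (x'+y')^{\underline{n}} \]
and
\[ \sum_{k=0}^{n} \binom{n}{k} x^{\overline{n-k}}\, y^{\overline{k}}
 = \sum_{k=0}^{n} \binom{n}{k} (-x')^{\overline{n-k}} (-y')^{\overline{k}}
 = \sum_{k=0}^{n} \binom{n}{k} (-1)^{n-k} (x')^{\underline{n-k}} (-1)^k (y')^{\underline{k}}
 = (-1)^n \sum_{k=0}^{n} \binom{n}{k} (x')^{\underline{n-k}} (y')^{\underline{k}}, \]
so the identity follows from the falling power binomial theorem (the previous
question).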
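Both the falling and rising power versions of the binomial theorem can be tested in exact arithmetic at non-integer points, which is where the polynomial principle is doing the work. A minimal sketch; `falling`, `rising` and `binomial_sum` are illustrative names.

```python
from fractions import Fraction
from math import comb

def falling(x, k):
    """x(x-1)(x-2)...(x-(k-1)); the empty product (k = 0) is 1."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

def rising(x, k):
    """x(x+1)(x+2)...(x+(k-1)); the empty product (k = 0) is 1."""
    result = 1
    for i in range(k):
        result *= x + i
    return result

def binomial_sum(power, x, y, n):
    """sum_{k=0}^{n} C(n,k) power(x, n-k) power(y, k)."""
    return sum(comb(n, k) * power(x, n - k) * power(y, k) for k in range(n + 1))
```

For example, with `x = Fraction(-7, 3)` and `y = Fraction(5, 2)` one can compare `falling(x + y, n)` with `binomial_sum(falling, x, y, n)`.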
4. Prove that the expression
\[ \frac{\gcd(m,n)}{n}\binom{n}{m} \]
is an integer for all pairs of integers n ≥ m ≥ 1.
Solution: (Putnam competition, 2000, problem B2) We know that gcd(m, n) =
am + bn for some integers a, b; but then
\[ \frac{\gcd(m,n)}{n}\binom{n}{m} = a\,\frac{m}{n}\binom{n}{m} + b\,\frac{n}{n}\binom{n}{m}. \]
Since
\[ \frac{m}{n}\binom{n}{m} = \binom{n-1}{m-1} \]
(the committee-chair identity again, or easy algebra), we get
\[ \frac{\gcd(m,n)}{n}\binom{n}{m} = a\binom{n-1}{m-1} + b\binom{n}{m}, \]
and so (since a, b, $\binom{n-1}{m-1}$ and $\binom{n}{m}$ are all integers) we get the desired result.
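The claim is easy to test by machine for small n and m. A minimal sketch (the function name is illustrative):

```python
from math import comb, gcd

def quotient_is_integer(n, m):
    """True if gcd(m, n) * C(n, m) is divisible by n."""
    return (gcd(m, n) * comb(n, m)) % n == 0
```

Note that $\binom{n}{m}/n$ alone need not be an integer (take n = 4, m = 2, where $\binom{4}{2}/4 = 6/4$), so the factor gcd(m, n) really is needed.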
$^{29}$ Hint: deduce it from 3(a).

5. Evaluate
\[ \sum_{k=0}^{n} \binom{n}{k} F_{k+1} \]
for n ≥ 0, where $F_1, F_2, F_3, F_4, F_5, \ldots$ are the Fibonacci numbers 1, 1, 2, 3, 5, ….

Solution: This is a well-known binomial coefficient identity.

We claim that the answer is $F_{2n+1}$.

When n = 0 we get a sum of 1; when n = 1 we get a sum of 2; when n = 2 we get
a sum of 5; when n = 3 we get a sum of 13; when n = 4 we get a sum of 34; this
suggests strongly
\[ \sum_{k=0}^{n} \binom{n}{k} F_{k+1} = F_{2n+1}. \]
One way to prove this is to iteratively apply the Fibonacci recurrence to $F_{2n+1}$: on
the zeroth iteration,
\[ F_{2n+1} = \binom{0}{0} F_{2n+1}. \]
The first iteration leads to
\[ F_{2n+1} = F_{2n} + F_{2n-1} = \binom{1}{0} F_{2n} + \binom{1}{1} F_{2n-1}. \]
The second leads to
\[ F_{2n+1} = F_{2n} + F_{2n-1} = (F_{2n-1} + F_{2n-2}) + (F_{2n-2} + F_{2n-3})
 = \binom{2}{0} F_{2n-1} + \binom{2}{1} F_{2n-2} + \binom{2}{2} F_{2n-3}. \]
This suggests that we prove the more general statement, that for each 0 ≤ s ≤ n,
\[ F_{2n+1} = \sum_{j=0}^{s} \binom{s}{j} F_{2n+1-s-j}. \qquad (\star) \]
The case s = n yields
\[ F_{2n+1} = \sum_{j=0}^{n} \binom{n}{j} F_{n+1-j}, \]
which is the same as what we have to prove (by the symmetry relation $\binom{n}{j} = \binom{n}{n-j}$).
We can prove $(\star)$ by induction on s (for each fixed n), with the case s = 0 trivial.
For larger s, we begin with the s − 1 case of the induction hypothesis, then use the
Fibonacci recurrence to break each Fibonacci number into the sum of two earlier
ones, then use Pascal’s identity to gather together terms involving the same Fibonacci
number. (Details omitted.)
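Since the details of the induction are omitted, it is reassuring to check both the final identity and the intermediate statement for every 0 ≤ s ≤ n numerically. A minimal sketch; the function names are illustrative.

```python
from math import comb

def fib(n):
    """Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def binomial_fib_sum(n):
    # sum_{k=0}^{n} C(n,k) F_{k+1}
    return sum(comb(n, k) * fib(k + 1) for k in range(n + 1))

def iterated_expansion(n, s):
    # sum_{j=0}^{s} C(s,j) F_{2n+1-s-j}, claimed to equal F_{2n+1} for 0 <= s <= n
    return sum(comb(s, j) * fib(2 * n + 1 - s - j) for j in range(s + 1))
```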

6. Show$^{30}$ that for every n, m ≥ 0,
\[ \int_0^1 x^n(1-x)^m\,dx = \frac{1}{(n+m+1)\binom{n+m}{n}}. \]

First solution: The result is trivial when one or both of n, m = 0, so we may
assume n + m ≥ 2. Using integration by parts, we get
\[ \int_0^1 x^n(1-x)^m\,dx = \frac{m}{n+1}\int_0^1 x^{n+1}(1-x)^{m-1}\,dx. \]
This suggests that for each s ≥ 2 we use induction on m to prove the result for all
pairs (n, m) with n + m = s. The case m = 0 has been observed already. For m > 0
we have
\[ \int_0^1 x^n(1-x)^m\,dx = \frac{m}{n+1}\int_0^1 x^{n+1}(1-x)^{m-1}\,dx
 = \frac{m}{n+1}\cdot\frac{1}{((n+1)+(m-1)+1)\binom{(n+1)+(m-1)}{n+1}}
 = \frac{m}{(n+1)(n+m+1)\binom{n+m}{n+1}}. \]
The result follows if we can show
\[ (n+1)\binom{n+m}{n+1} = m\binom{n+m}{n}. \]
This is easily verified, either algebraically, or via the “committee-chair” identity: to
choose a committee-with-chair of size n + 1 from n + m people, we either choose the
n + 1 people for the committee and elect one of them chair ($(n+1)\binom{n+m}{n+1}$ ways)
or select the n non-chair members from the n + m people, and choose the chair from
among those not yet chosen ($m\binom{n+m}{n}$ ways).

$^{30}$ So,
\[ \binom{n+m}{m} = \frac{1}{(n+m+1)\int_0^1 x^n(1-x)^m\,dx}. \]
This suggests a way of defining $\binom{\alpha}{\beta}$ for any non-negative real numbers α, β with α > β, namely
\[ \binom{\alpha}{\beta} = \frac{1}{(\alpha+1)\int_0^1 x^{\beta}(1-x)^{\alpha-\beta}\,dx}. \]
This is a quite standard way to extend the binomial coefficients beyond integer arguments.
Second solution (essentially the same as the first, written differently): The result
is trivial when one or both of n, m = 0, so we may assume both n, m ≥ 1. Consider
n ≥ 1 fixed. Using integration by parts, we get
\[ \int_0^1 x^n(1-x)^m\,dx = \frac{m}{n+1}\int_0^1 x^{n+1}(1-x)^{m-1}\,dx \]
(recall m ≥ 1). Repeating integration by parts m − 1 more times, we get
\[ \int_0^1 x^n(1-x)^m\,dx
 = \left(\frac{m}{n+1}\right)\left(\frac{m-1}{n+2}\right)\cdots\left(\frac{1}{n+m}\right)\int_0^1 x^{n+m}\,dx
 = \frac{m!}{(n+m)(n+m-1)\cdots(n+1)}\cdot\frac{1}{n+m+1}
 = \frac{1}{(n+m+1)\binom{n+m}{m}}
 = \frac{1}{(n+m+1)\binom{n+m}{n}}. \]
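The formula can be confirmed in exact rational arithmetic without any numerical integration: expand $(1-x)^m$ by the binomial theorem and integrate term by term, which gives $\int_0^1 x^n(1-x)^m\,dx = \sum_{j=0}^{m} (-1)^j\binom{m}{j}\frac{1}{n+j+1}$. The sketch below uses that expansion, which is a different route from the two solutions above; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def beta_integral(n, m):
    """Exact value of the integral of x^n (1-x)^m over [0, 1],
    obtained by expanding (1-x)^m and integrating term by term."""
    return sum(Fraction((-1) ** j * comb(m, j), n + j + 1) for j in range(m + 1))

def closed_form(n, m):
    """1 / ((n+m+1) C(n+m, n))."""
    return Fraction(1, (n + m + 1) * comb(n + m, n))
```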

7. Define a selfish set to be a set which has its own cardinality (number of elements) as
an element. Find the number of subsets of {1, 2, . . . , n} which are minimal selfish
sets, that is, selfish sets none of whose proper subsets is selfish.

Solution: (Putnam competition 1996, problem B1) The answer is the nth Fibonacci
number. The solution given here is taken from
http://mathforum.org/kb/thread.jspa?forumID=13&threadID=20920&messageID=50140.
Write #S for the number of elements in S. Note that a set S is a minimal selfish set
if and only if #S is the smallest element of S.
Induct on n. For n = 1, there is one selfish set, namely {1}, so the number of
minimal selfish sets is 1. For n = 2, there are two selfish sets, namely {1} and {1, 2},
so the number of minimal selfish sets is 1.
Fix n ≥ 2. Assume that there are a minimal selfish subsets of {1, 2, . . . , n} and b
minimal selfish subsets of {1, 2, . . . , n − 1}. I will construct exactly a + b minimal
selfish subsets of {1, 2, . . . , n + 1}.
Construction A: Let S be a minimal selfish subset of {1, 2, . . . , n}. Then S is a
minimal selfish subset of {1, 2, . . . , n + 1}. This construction produces a minimal
selfish subsets.
Every minimal selfish subset T of {1, 2, . . . , n + 1} not containing n + 1 is obtained
from Construction A. Proof: T is a subset of {1, 2, . . . , n}.

Construction B: Let S be a minimal selfish subset of {1, 2, . . . , n − 1}. Define
T = {n + 1} ∪ {x + 1 : x ∈ S}. Then T is a minimal selfish set: its smallest element
is #S + 1 = #T . This construction produces exactly b minimal selfish subsets, all
different from the subsets in Construction A since they all contain n + 1.
Every minimal selfish subset T of {1, 2, . . . , n + 1} containing n + 1 is obtained from
Construction B. Proof: 1 ∉ T since otherwise min T = 1 < 2 ≤ #T. Consider the
set S = {x − 1 : x ∈ T, x ≠ n + 1}. Then #S = #T − 1 is the smallest element of
S, so S is a minimal selfish set. Finally T = {n + 1} ∪ {x + 1 : x ∈ S}.
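The count can be confirmed by brute force straight from the definition (rather than from the "smallest element" characterization used in the proof). A minimal sketch, practical only for small n; the function names are illustrative.

```python
from itertools import combinations

def is_selfish(s):
    """A set is selfish if it contains its own cardinality."""
    return len(s) in s

def is_minimal_selfish(s):
    """Selfish, and no proper subset is selfish."""
    if not is_selfish(s):
        return False
    items = sorted(s)
    for size in range(len(items)):  # sizes 0 .. len-1: proper subsets only
        for sub in combinations(items, size):
            if is_selfish(set(sub)):
                return False
    return True

def count_minimal_selfish(n):
    """Number of minimal selfish subsets of {1, ..., n}."""
    universe = range(1, n + 1)
    return sum(
        1
        for size in range(1, n + 1)
        for sub in combinations(universe, size)
        if is_minimal_selfish(set(sub))
    )
```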

8. Three distinct vertices are chosen at random from the vertices of a regular (2n + 1)-
sided polygon. If all such choices are equally likely, what is the probability that
the center of the polygon lies in the interior of the triangle determined by the three
chosen vertices?

Solution: (UCLA Putnam preparation class) Let the vertices be labelled clockwise
$a_1, a_2, \ldots, a_{2n+1}$. Without loss of generality, let $a_1$ be one of the points. There are
$\binom{2n}{2}$ possibilities for the remaining points. If $a_2$ is the nearest point chosen to $a_1$,
going clockwise around the polygon, then there is only one point ($a_{n+2}$) that can
complete a triangle that encloses the center. If $a_3$ is the nearest point to $a_1$ then
two points ($a_{n+2}$ and $a_{n+3}$) can complete a triangle. In general, if $a_k$ (k ≥ 2) is the
nearest point chosen to $a_1$, going clockwise around the polygon, then there are k − 1
points ($a_{n+2}$ through $a_{n+k}$) that can complete a triangle that encloses the center.
The largest value k can take is n + 1. So the number of pairs of points that can
complete a good triangle with $a_1$ is $1 + 2 + \cdots + n = \binom{n+1}{2}$, and the required
probability is
\[ \frac{\binom{n+1}{2}}{\binom{2n}{2}} = \frac{n+1}{2(2n-1)}. \]
Notice that this tends to 1/4 as n tends to infinity, suggesting the following: if a
circle of radius 1 is given, a fixed point x on the perimeter is marked, two numbers
a, b are generated uniformly from the interval [0, 2π], and two points are marked
on the perimeter of the circle, one at arc-distance a from x and the other at arc
distance b (measured in a clockwise direction), then the probability that the center
of the circle lies in the triangle formed by the three marked points should be 1/4.
Does this make intuitive sense?
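The probability can be checked by enumerating all triples of vertices. The sketch below relies on a standard fact that is not spelled out above: for three points on a circle, the centre lies inside the triangle exactly when each of the three arcs between consecutive chosen points is less than half the circle. With 2n + 1 equally spaced vertices, that means each arc spans at most n steps. The function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def center_probability(n):
    """Probability that a random triangle on the vertices of a regular
    (2n+1)-gon contains the centre, by direct enumeration."""
    sides = 2 * n + 1
    good = 0
    total = 0
    for i, j, k in combinations(range(sides), 3):
        total += 1
        arcs = (j - i, k - j, sides - (k - i))  # steps between consecutive chosen vertices
        if max(arcs) <= n:  # every arc is shorter than half the circle
            good += 1
    return Fraction(good, total)
```

For instance, `center_probability(2)` (the pentagon) is 1/2, in agreement with (n + 1)/(2(2n − 1)) at n = 2.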

Here’s a video$^{31}$ that discusses this problem, a three-dimensional analog, and also
talks a little at the beginning about the Putnam competition:
https://youtu.be/OkmNXy7er84

$^{31}$ h/t Thomas.

5 Week 4 (September 1) — Graphs
A graph G = (V, E) consists of a set V of vertices and a set E of edges, each of which is a
two-element subset of V . Think of the vertices as points put down on a piece of paper,
and of the edges as arcs joining pairs of points. There is no inherent geometry to a graph
— all that matters is which pairs of points are joined, not the exact position of the points,
or the nature of the arcs joining them.
Thinking about the data of a problem as a graph can sometimes be helpful. Although
some Putnam problems in the past have been non-trivial results from graph theory in
disguise, there is no real need to know much graph theory, so in this discussion I’ll just
mention some basic ideas that might be useful. A little more background on graph theory
can be found, for example, at http://www.math.ucsd.edu/~jverstra/putnam-week6.pdf.
Problem: n people go to a party, and each one counts up the number of other people she
knows at the party. Show that there are an even number of people who come up with an
answer that is an odd number. (Assuming that “knowing” is a two-way relation; I know
you if and only if you know me.)
Solution: Model the problem as a graph. V is the set of n party goers, and E consists of
all pairs of people who know each other. For person i, denote by $d_i$ the number of edges
that involve i ($d_i$ is the degree of vertex i). We have
\[ \sum_{i=1}^{n} d_i = 2|E| \]
since as we run over all vertices and count degrees, each edge gets counted exactly twice
(once for each vertex in that edge). So the sum of degrees is even. But if there were an
odd number of vertices with odd degree, the sum would be odd; so there are indeed an
even number of people who know an odd number of people.
The useful fact that is true about all graphs that lies at the heart of the solution is
this: in any graph G = (V, E),
\[ \sum_{i=1}^{n} d_i = 2|E|. \]
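The degree-sum fact is easy to watch in action on random graphs. A minimal sketch; the representation of a graph as a list of vertex pairs, and the helper names, are assumptions made for illustration.

```python
import random
from itertools import combinations

def random_graph(n, p, seed):
    """Edge list of a random graph on vertices 0..n-1: each pair is an edge with probability p."""
    rng = random.Random(seed)
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]

def degrees(n, edges):
    """Degree of each vertex; every edge adds 1 to the degree of both endpoints."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def odd_degree_count(n, edges):
    """Number of vertices of odd degree."""
    return sum(1 for d in degrees(n, edges) if d % 2 == 1)
```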
Problem: Show that two of the people at the party have the same number of friends.
Solution: The possible values for $d_i$ are 0 through n − 1, n of them, so the pigeon-hole
principle doesn’t immediately apply. But: it’s not possible for there to be one vertex with
degree 0, and another with degree n − 1. So the possible values of $d_i$ are either 1 through
n − 1, n − 1 of them, or 0 through n − 2, n − 1 of them, and in either case the pigeon-hole
principle gives that there are two people with the same number of friends.
The useful fact that is true about all graphs that lies at the heart of the solution is
this: in any graph G = (V, E) with at least two vertices, there must be two vertices with
the same degree.
A walk in a graph from vertex u to vertex v is a list of (not necessarily distinct) edges,
with u in the first, v in the last, and every pair of consecutive edges sharing a vertex in
common; graphically, a walk is a way to trace a path from u to v, always using complete
arcs of the drawing, and never taking pencil off paper.
Problem: How many different walks are there from u to v, that use k edges?
Solution: Form the adjacency matrix A of the graph: rows and columns indexed by vertices,
entry (a, b) is 1 if {a, b} is an edge, and 0 otherwise. Then form the matrix $A^k$. The (u, v)
entry of this matrix is exactly the number of different walks from u to v that use k edges.
The proof uses induction on k, and the definition of matrix multiplication. The key point
is that the number of walks from u to v that use k edges is the sum, over all neighbours w
of u (i.e., vertices w such that {u, w} is an edge), of the number of walks from w to v that
use k − 1 edges. I’ll skip the details.
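The skipped induction can at least be checked on an example: compute $A^k$ by repeated multiplication and compare each entry with a direct recursive count based on the "key point" above. This is a minimal sketch in plain Python; the example graph and the function names are made up for illustration, and a walk with 0 edges is taken (by convention here) to exist only from a vertex to itself.

```python
def mat_mult(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def mat_power(a, k):
    """k-th power of a square matrix, by repeated multiplication starting from the identity."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mult(result, a)
    return result

def count_walks(adj, u, v, k):
    """Walks from u to v with k edges: sum over neighbours w of u of walks from w to v with k-1 edges."""
    if k == 0:
        return int(u == v)
    return sum(count_walks(adj, w, v, k - 1) for w in range(len(adj)) if adj[u][w])

# Hypothetical example: a triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to 2.
EXAMPLE = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
```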
The relation “there is a walk between” is an equivalence relation on vertices, so any
graph can be partitioned into equivalence classes, with each class having the property that
between any two vertices in the class, there is a walk, but there is no walk between any
two vertices in different classes. These classes are called components of the graph. If a
graph has just one component, meaning that between any two vertices in the graph, there
is a walk, it is said to be connected.
Problem: Given a graph G, under what circumstances is it possible to take a walk that
uses every edge of the graph exactly once, and ends up at the same vertex that it started
at?
Solution: Such a walk is called an Euler circuit, after the man who first studied them
(google “Bridges of Konigsberg”). Such a circuit is a tracing of the graphical representation
of the graph, with each arc traced out exactly once, the pencil never leaving the paper,
ending where it started. Two fairly obvious necessary conditions for the existence of an
Euler circuit are:

• the graph is connected, and

• every degree is even (because each time an Euler circuit visits a vertex, it eats up
two edges — one going in and one coming out).

Euler proved that these necessary conditions are sufficient: a connected graph has an Euler
circuit if and only if all vertex degrees are even. The details are given in any basic graph
theory textbook.
What if we don’t require the tracing to end at the same vertex it began?
Problem: Given a graph G, and two distinct vertices u and v, under what circumstances
is it possible to take a walk from u to v that uses every edge of the graph exactly once?
Solution: Such a walk is called an Euler trail. Euler proved that a connected graph has
an Euler trail from u to v if and only if all vertex degrees are even except the degrees of u
and v, which must be odd. It follows easily from his result on Euler circuits: just add an
edge from u to v, apply the Euler circuit theorem, and delete the added edge.
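The two criteria are simple to test mechanically. The sketch below only checks the degree and connectivity conditions from the theorems; it does not construct the circuit or trail. Graphs are given as a vertex count and an edge list, and all names are illustrative.

```python
from collections import deque

def _degrees_and_neighbours(n, edges):
    deg = [0] * n
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        nbrs[u].append(v)
        nbrs[v].append(u)
    return deg, nbrs

def _is_connected(n, nbrs):
    """Breadth-first search from vertex 0 (assumes n >= 1)."""
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in nbrs[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n

def has_euler_circuit(n, edges):
    """Connected and every degree even."""
    deg, nbrs = _degrees_and_neighbours(n, edges)
    return _is_connected(n, nbrs) and all(d % 2 == 0 for d in deg)

def has_euler_trail(n, edges, u, v):
    """Connected and the odd-degree vertices are exactly u and v (u != v)."""
    deg, nbrs = _degrees_and_neighbours(n, edges)
    odd = {x for x in range(n) if deg[x] % 2 == 1}
    return _is_connected(n, nbrs) and odd == {u, v}
```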

Problem: Given a graph G with n vertices, under what circumstances is it possible to
list the vertices in some order $v_1, \ldots, v_n$, in such a way that each of $\{v_1, v_2\}$, $\{v_2, v_3\}$, …,
$\{v_{n-1}, v_n\}$, $\{v_n, v_1\}$ is an edge?
Solution: Such a list is called a Hamiltonian cycle, after the man who first studied them
(google “icosian game”). Unlike with Eulerian trails, there is no simple set of necessary-
and-sufficient conditions known to allow one to determine whether such a thing exists in a
given graph. There is one useful sufficient condition, due to Dirac, that has an elementary
but involved proof that can be found in any graph theory textbook.
Dirac’s theorem: A graph G with n ≥ 3 vertices has a Hamiltonian cycle if all vertices have
degree at least n/2.
A connected graph with the fewest possible number of edges is called a tree. It turns
out that all trees on n vertices have the same number of edges, namely n − 1. One way to
see this is to imagine building up the tree from a set of n totally disconnected vertices,
edge-by-edge. At each step, you should add an edge that bridges two components, since
adding an edge within a component does not help; in the end such an edge can be removed
without hurting connectivity. Since two components get merged each time an edge is
added, exactly n − 1 are needed to get to a single component.
A characterization of trees is that they are connected, but have no cycles — a cycle in
a graph is a list of distinct vertices $u_1, u_2, \ldots, u_k$ such that each of $\{u_1, u_2\}$, $\{u_2, u_3\}$, …,
$\{u_{k-1}, u_k\}$, $\{u_k, u_1\}$ is an edge.
A planar graph is a graph that can be drawn in the plane with no two arcs meeting
except at a vertex (if they have one in common). A planar drawing of a graph partitions
the plane into connected regions, called faces. Euler discovered a remarkable formula that
relates the number of vertices, edges and faces in a planar graph:
Euler’s formula: Let G be a connected planar graph with V vertices, E edges and F faces. Then
V − E + F = 2.

Proof sketch: By induction on F . If F = 1 then the graph has only one face, so it has no
cycle and (being connected) must be a tree. A tree on V vertices has V − 1 edges, and so
fits the formula.
Now suppose F > 1. Then the graph contains a cycle. If we remove an edge e of
that cycle then the graph stays connected, F drops by one, V stays the same, and E drops
by 1. Now by induction, V − (E − 1) + (F − 1) = 2 and this gives V − E + F = 2.
Problem: Show that five points can’t be connected up with arcs in the plane in such a
way that no two arcs meet each other except at a vertex (if they have one in common).
Solution: Suppose such a connection was possible. The resulting planar graph would
have 5 vertices and $\binom{5}{2} = 10$ edges, so by Euler’s formula would have 7 faces. The sum,
over the faces, of the number of edges bounding the faces, is then at least 21, since each
face has at least three bounding edges. But this sum is at most twice the number of
edges, since each edge can be on the boundary of at most two faces; so it is at most
20, a contradiction.

A bipartite graph is a graph whose vertex set can be partitioned into two classes, X
and Y , such that the graph only has edges that go from X to Y (and so none that are
entirely within X or entirely within Y ). It’s fairly easy to see that any odd-length cycle
is not bipartite, so any graph that has an odd-length cycle sitting inside it is also not
bipartite. This turns out to be a characterization of bipartite graphs; the proof can be
found in any textbook on graph theory.
Theorem: A graph is bipartite if and only if it has no odd cycle.
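One standard way to test bipartiteness in practice is to try to 2-colour the vertices by breadth-first search; the search fails exactly when some edge joins two vertices of the same colour, which signals an odd cycle. The following is a minimal sketch of that approach (not taken from the notes); graphs are given as a vertex count and an edge list, and the names are illustrative.

```python
from collections import deque

def two_colouring(n, edges):
    """Return a list of colours (0 or 1) with every edge joining different colours,
    or None if no such colouring exists."""
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    colour = [None] * n
    for start in range(n):  # handle each component separately
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in nbrs[u]:
                if colour[w] is None:
                    colour[w] = 1 - colour[u]
                    queue.append(w)
                elif colour[w] == colour[u]:
                    return None  # an edge inside one colour class
    return colour

def cycle_edges(k):
    """Edge list of the cycle on vertices 0..k-1."""
    return [(i, (i + 1) % k) for i in range(k)]
```

The two colour classes of a successful colouring serve as the classes X and Y in the definition.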
A matching in a graph is a set of edges, no two of which share a vertex. A perfect
matching is a matching that involves all the vertices. A famous result, whose proof can be
found in any graph theory textbook, is Hall’s marriage theorem. A consequence of it says
that if there are n women and n men, each woman likes exactly d men, and each man is
liked by exactly d women (for some d ≥ 1), then it is possible to pair the men and women
off into n pairs, such that each woman is paired with a man she likes. Here’s the statement
in graph-theory language:
Theorem: Let G be a bipartite graph that is regular (all vertices have the same degree
d ≥ 1). Then G has a perfect matching.

5.1 Some problems to think about for week 5
1. In a town there are three newly built houses, and each needs to be connected by a
line to the gas, water, and electricity factories. The lines are only allowed to run
along the ground. Is there a way to make all nine connections without any of the
lines crossing each other?

2. n teams play each other in a round-robin tournament (so each team plays each of
the other teams exactly once). There are no ties. Show that there exists an ordering
of the teams, $(a_1, a_2, a_3, \ldots, a_n)$, such that team $a_1$ beats team $a_2$, team $a_2$ beats
team $a_3$, …, team $a_{n-1}$ beats team $a_n$.

3. An airline operates flights out of 2n airports. In all, the airline operates $n^2 + 1$
different routes (all there-and-back: South Bend to Chicago is considered the same
route as Chicago to South Bend). Prove that there are three airports a, b, c such
that it is possible to fly directly from a to b, then b to c, then c back to a.

4. Show that in any connected graph, any two paths both of longest possible length
must have a vertex in common.$^{32}$

5. The complete graph $K_n$ on vertex set $\{1, \ldots, n\}$ is the graph in which all $\binom{n}{2}$
possible edges are present. Suppose that the edges of $K_n$ are colored with two colors,
say Red and Blue (meaning, each edge is either assigned the color Red or the color
Blue, but not both). Prove that it is possible to partition $\{1, \ldots, n\}$ as $A \cup B$, such
that there is a Red path that covers all the vertices in A and a Blue path that
covers all the vertices in B. (By this is meant: the elements of A can be ordered
as $(a_1, a_2, \ldots, a_\ell)$ in such a way that each of the edges $a_1a_2, a_2a_3, \ldots, a_{\ell-1}a_\ell$ is
colored Red, and similarly for B.)

6. During a particularly boring Zoom lecture, each of five participants fell asleep exactly
twice. For each pair of these five, there was some moment when both were sleeping
simultaneously. Prove that at some point, at least three of them were sleeping
simultaneously.

7. Is there a way to list the $2^n$ subsets of $\{1, \ldots, n\}$ (with each subset appearing on the
list once and only once) in such a way that the first element of the list is the empty
set, and every element on the list is obtained from the previous element either by
adding an element or deleting an element?

8. Let G be a finite group of order n generated by a and b. Prove or disprove: there
is a sequence $g_1, g_2, g_3, \ldots, g_{2n}$ such that every element of G occurs exactly twice in
the sequence, and, for each i = 1, 2, …, 2n, $g_{i+1}$ equals $g_i a$ or $g_i b$. (Interpret $g_{2n+1}$
as $g_1$.)

$^{32}$ A path in a graph is a list of distinct vertices $v_0, v_1, \ldots, v_n$, with an edge from $v_{i-1}$ to $v_i$ for i = 1, …, n
(and maybe other edges, too). The length of the path just described is n (equals the number of edges
traversed in going from $v_0$ to $v_n$ along the path).

5.2 Solutions to graph theory problems
1. In a town there are three newly built houses, and each needs to be connected by a
line to the gas, water, and electricity factories. The lines are only allowed to run
along the ground. Is there a way to make all nine connections without any of the
lines crossing each other?

Solution: This is a standard result in graph theory.

The question is asking whether the graph on six vertices, 1, 2, . . . , 6, with an edge
from i to j if and only if 1 ≤ i ≤ 3 and 4 ≤ j ≤ 6, can be drawn in the plane without
crossing edges. It has 6 vertices and 9 edges, so if it could, any representation would
have 5 faces (Euler’s formula). Each face is bounded by at least 4 edges (note that
the graph we are working with clearly has no triangles), so summing “#(bounding
edges)” over all faces, get at least 20. But this sum counts each edge at most twice,
so we get at most 18, a contradiction that reveals that there is no such planar
representation.

2. n teams play each other in a round-robin tournament (so each team plays each of
the other teams exactly once). There are no ties. Show that there exists an ordering
of the teams, $(a_1, a_2, a_3, \ldots, a_n)$, such that team $a_1$ beats team $a_2$, team $a_2$ beats
team $a_3$, …, team $a_{n-1}$ beats team $a_n$.

Solution: This was on the 1958 Putnam competition, but is also a standard result
in graph theory.

Proof by induction on n, with n = 1, and indeed n = 2, trivial. So assume n ≥ 3.

Fix an arbitrary vertex x. By induction there exists an ordering of the remaining
teams, $(a_1, a_2, a_3, \ldots, a_{n-1})$, such that team $a_1$ beats team $a_2$, team $a_2$ beats team
$a_3$, …, team $a_{n-2}$ beats team $a_{n-1}$.
If x beats $a_1$, then the ordering $(x, a_1, \ldots, a_{n-1})$ works. If x loses to everyone,
then the ordering $(a_1, \ldots, a_{n-1}, x)$ works. If neither of these things happens, then
there must be an i such that x loses to $a_i$, but beats $a_{i+1}$, and then the ordering
$(a_1, \ldots, a_i, x, a_{i+1}, \ldots, a_{n-1})$ works.
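The induction is really an insertion algorithm: add the teams one at a time, placing each new team x immediately before the first team in the current ordering that x beats (or at the end if x beats nobody). Every team before that position beat x, so the chain stays valid. A minimal sketch, assuming the results are given as a matrix `wins` with `wins[i][j]` true when team i beat team j; the names and the random-tournament helper are illustrative.

```python
import random

def random_tournament(n, seed):
    """Results matrix of a random round-robin tournament with no ties."""
    rng = random.Random(seed)
    wins = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < 0.5:
                wins[i][j] = True
            else:
                wins[j][i] = True
    return wins

def winning_chain(wins):
    """Order the teams so that each team beats the next one in the list."""
    order = []
    for x in range(len(wins)):
        # position of the first team that x beats; end of the list if there is none
        pos = next((i for i, t in enumerate(order) if wins[x][t]), len(order))
        order.insert(pos, x)
    return order

def is_winning_chain(wins, order):
    return all(wins[a][b] for a, b in zip(order, order[1:]))
```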

3. An airline operates flights out of 2n airports. In all, the airline operates $n^2 + 1$
different routes (all there-and-back: South Bend to Chicago is considered the same
route as Chicago to South Bend). Prove that there are three airports a, b, c such
that it is possible to fly directly from a to b, then b to c, then c back to a.

Solution: This is Mantel’s theorem, one of the first results proved in the vast area
of extremal graph theory.

Suppose there were no such three cities a, b, c. For each city x, denote by d(x) the
number of cities with direct connection to x. If there is a connection between cities
x and y, then there cannot be a third city z directly connected to both (or we would
have a triangle). So the other neighbours of x and the other neighbours of y form
disjoint sets among the remaining 2n − 2 cities, giving

(d(x) − 1) + (d(y) − 1) ≤ 2n − 2, d(x) + d(y) ≤ 2n.

Now in the sum of d(x) + d(y) over all pairs of connected cities, for each x we have
that d(x) appears exactly d(x) times (once for each city directly connected to x), so
\[ \sum_{x,y \text{ connected}} (d(x) + d(y)) = \sum_x d^2(x) \le (n^2 + 1) \cdot 2n. \]

Now use the Cauchy-Schwarz-Bunyakovski inequality to bound:
\[ 2n \sum_x d^2(x) \ge \Big( \sum_x d(x) \Big)^2 = (2(n^2 + 1))^2 \]
(note that in $\sum_x d(x)$, each route gets counted exactly twice, once for each endpoint).
We conclude that
\[ 4(n^2 + 1)^2 \le 4n^2(n^2 + 1), \]
which fails to hold for any n ≥ 0.

Here’s an alternate solution taken from
https://mks.mff.cuni.cz/kalva/putnam/psoln/psol5612.html: Model the problem as one about
graphs: we are given a graph on 2n vertices with $n^2 + 1$ edges, and we want to find a triangle.
Induction. For n = 2, the result is obviously true, because there is only one graph
with 4 points and 5 edges and it certainly contains a triangle. Suppose the result
is true for some n ≥ 2. Consider a graph G with 2n + 2 vertices and
$(n+1)^2 + 1 = n^2 + 2n + 2$ edges. Take any two vertices x and y joined by an edge. We
consider two cases. If there are fewer than 2n + 1 other edges joined to either x or y
(or both), then if we remove x and y we get a graph with 2n vertices and at least
$n^2 + 1$ edges, which must contain a triangle (by induction), so G does also. If there
are at least 2n + 1 other edges joined to either x or y (or both) then, since there are
only 2n other vertices, by the pigeon-hole principle there is at least one vertex joined
to both, and that gives a triangle.
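For very small n the statement can also be confirmed by brute force, which is a useful sanity check on the bound $n^2 + 1$. The sketch below is only an illustration (the helper names are made up here, and the exhaustive search is feasible only for tiny n); it also checks that the complete bipartite graph with n vertices on each side has $n^2$ edges and no triangle, so the bound cannot be lowered.

```python
from itertools import combinations

def has_triangle(n_vertices, edges):
    """True if the graph on vertices 0..n_vertices-1 contains a triangle.

    Edges are pairs (a, b) with a < b.
    """
    edge_set = set(edges)
    return any(
        (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
        for a, b, c in combinations(range(n_vertices), 3)
    )

def every_graph_has_triangle(n):
    """Exhaustively check all graphs on 2n vertices with n^2 + 1 edges."""
    all_pairs = list(combinations(range(2 * n), 2))
    return all(
        has_triangle(2 * n, edges)
        for edges in combinations(all_pairs, n * n + 1)
    )

def complete_bipartite(n):
    """Edges of K_{n,n}: n^2 edges, triangle-free."""
    return [(a, b) for a in range(n) for b in range(n, 2 * n)]
```

For n = 3 the search covers 3003 graphs, so it runs essentially instantly.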

4. Show that in any connected graph, any two paths both of longest possible length
must have a vertex in common. (A path in a graph is a list of distinct vertices
$v_0, v_1, \ldots, v_n$, with an edge from $v_{i-1}$ to $v_i$ for i = 1, . . . , n, and maybe other
edges, too. The length of the path just described is n, the number of edges
traversed in going from $v_0$ to $v_n$ along the path.)

Solution: Suppose, for a contradiction, that the graph has two paths of maximum
possible length, say $P = x_0 x_1 \cdots x_n$ and $Q = y_0 y_1 \cdots y_n$, with no vertex in common
between the $x_i$’s and the $y_i$’s.
Since the graph is connected, there is a path in the graph that connects P and Q.
By considering a minimal such path, we get that there is a vertex $x_a$ of P, a vertex
$y_b$ of Q, and a path R in the graph from $x_a$ to $y_b$ that uses no vertices of P, Q other
than $x_a$, $y_b$. Without loss of generality (e.g., by relabeling vertices if necessary) we
can assume that the path from $x_0$ to $x_a$ is at least half the length of P, and that the
path from $y_b$ to $y_n$ is at least half the length of Q. But then the path that starts at
$x_0$, follows P to $x_a$, then follows R to $y_b$, then follows Q to $y_n$, has length at least
n + 1 (since R has at least one edge), so is longer than P, Q, a contradiction.
 
5. The complete graph $K_n$ on vertex set {1, . . . , n} is the graph in which all $\binom{n}{2}$
possible edges are present. Suppose that the edges of $K_n$ are colored with two colors,
say Red and Blue (meaning, each edge is either assigned the color Red or the color
Blue, but not both). Prove that it is possible to partition {1, . . . , n} as A ∪ B, such
that there is a Red path that covers all the vertices in A and a Blue path that
covers all the vertices in B. (By this is meant: the elements of A can be ordered
as $(a_1, a_2, \ldots, a_\ell)$ in such a way that the edges $a_1a_2, a_2a_3, \ldots, a_{\ell-1}a_\ell$ are all
colored Red, and similarly for B.)
Solution: I learned of this problem in a recent paper of András Gyárfás, at
http://arxiv.org/pdf/1509.05539.pdf.
Let A and B be disjoint subsets of {1, . . . , n}, with a Red path covering A and a
Blue path covering B, and with A ∪ B as large as possible subject to this condition.
If A ∪ B = {1, . . . , n} then we are done. If not, then we may assume that both A
and B are not empty, since if one of them, B say, was empty, then we could replace
B by {x} where x is any vertex not in A, and the result would be a valid pair (A, B)
with the size of the union one larger, a contradiction of maximality (note that a Blue
path covers a single vertex).
Let A be covered by the Red path given by the ordering $(a_1, a_2, \ldots, a_\ell)$, and let B be
covered by the Blue path given by the ordering $(b_1, b_2, \ldots, b_k)$. Let x be any vertex
not in A ∪ B. If either the edge $a_\ell x$ is Red or the edge $b_k x$ is Blue, then we can
either add x to A or add x to B and get a valid pair that covers more vertices, a
contradiction. So we may assume that $a_\ell x$ is Blue and $b_k x$ is Red.
Now look at edge $a_\ell b_k$. If this is Red, then we can replace A by $\{a_1, \ldots, a_\ell, b_k, x\}$
and replace B by $\{b_1, \ldots, b_{k-1}\}$ and get a valid pair that covers more vertices, a
contradiction. If $a_\ell b_k$ is Blue, then we can replace A by $\{a_1, \ldots, a_{\ell-1}\}$ and replace
B by $\{b_1, \ldots, b_k, a_\ell, x\}$ and again get a valid pair that covers more vertices, a
contradiction.
We conclude that A ∪ B = {1, . . . , n}.
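The case analysis in this argument can be run as an algorithm: add the vertices one at a time and repair the two paths exactly as in the proof. The following Python sketch is one way to do that. It is an incremental reading of the maximality argument rather than anything taken from the cited paper, and the names `red_blue_paths` and `is_red`, as well as the random demo coloring, are assumptions made for illustration.

```python
import random

def red_blue_paths(n, is_red):
    """Split range(n) into a Red path and a Blue path.

    is_red(u, v) must be symmetric and return True when edge uv is Red.
    Returns the two vertex orderings (either may be empty).
    """
    red, blue = [], []
    for x in range(n):
        if not red or is_red(red[-1], x):
            red.append(x)                 # extend the Red path
        elif not blue or not is_red(blue[-1], x):
            blue.append(x)                # extend the Blue path
        elif is_red(red[-1], blue[-1]):
            # red[-1]-x is Blue and blue[-1]-x is Red: move the Blue endpoint
            # onto the Red path, then x follows it along a Red edge.
            red.append(blue.pop())
            red.append(x)
        else:
            # The edge between the two endpoints is Blue: symmetric repair.
            blue.append(red.pop())
            blue.append(x)
    return red, blue

# Demo on a random coloring of K_12 (hypothetical data, fixed seed).
rng = random.Random(1)
n = 12
color = {frozenset((u, v)): rng.random() < 0.5
         for u in range(n) for v in range(u + 1, n)}

def is_red(u, v):
    return color[frozenset((u, v))]

red, blue = red_blue_paths(n, is_red)
```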
6. During a particularly boring Zoom lecture, each of five participants fell asleep exactly
twice. For each pair of these five, there was some moment when both were sleeping
simultaneously. Prove that at some point, at least three of them were sleeping
simultaneously.

Solution: Let the participants be named A, B, C, D, E, and let $A_1, A_2, B_1, B_2$, et cetera,
denote the time intervals during which each participant was asleep.
Consider a graph with vertex set $A_1, A_2, B_1, B_2$, et cetera, with an edge between two
vertices if the two associated time intervals have a point in common. This graph has
at least 10 edges, because there are $10 = \binom{5}{2}$ pairs of participants, and it is given
that for each pair there was some moment when both were sleeping simultaneously.
A graph with 10 vertices and at least 10 edges must have a cycle. Suppose that
there is a cycle with vertices v1 , v2 , . . . , vk (encountered in that order along the cycle).
Consider the intervals of time I1 , I2 , . . . , Ik associated with these k vertices. We
claim that some three of these intervals must have a common point — giving rise to
a time at which at least three of the participants were sleeping simultaneously.
We prove the claim by contradiction, assuming that no three of the intervals have a
common point. Without loss of generality, assume that I1 is the interval whose start
time is earliest. I2 starts at least as late as I1 , and must end later; otherwise where
I3 starts is a triple intersection point. So I3 is completely disjoint from I1 , and on a
number line lies completely to the right of I1 .
Now $I_4$ must start after $I_2$ ends (else there will be a triple intersection point among
$I_2, I_3, I_4$), so $I_4$ must lie completely to the right of $I_2$, and hence completely to the
right of $I_1$.
Continuing the argument in this manner, we find that Ik lies completely to the right
of I1 . This contradicts that v1 , v2 , . . . , vk forms a cycle, i.e. that Ik and I1 intersect.
So we conclude that there must be a triple intersection point among the intervals.
7. Is there a way to list the $2^n$ subsets of {1, . . . , n} (with each subset appearing on the
list once and only once) in such a way that the first element of the list is the empty
set, and every element on the list is obtained from the previous element either by
adding an element or deleting an element?

Solution: I learned of this problem from Imre Leader.

We’ll prove by induction on n that it is possible, and that moreover it is possible
to do so in such a way that the last element listed is a singleton (so that the list
can be considered a cycle). In fact, the question is asking for a Hamiltonian path (a
walk that visits every vertex once) in the graph whose vertex set is the power set
of {1, . . . , n}, with two vertices adjacent if they have symmetric difference of size
exactly 1; this graph is called the n-dimensional hypercube (when n = 2 it is just a
square, when n = 3 it is the usual 3-cube). For n = 1 the list ∅, {1} works. We’ll prove
by induction on n ≥ 2 that the n-dimensional hypercube has a Hamiltonian cycle, from
which we can clearly construct a Hamiltonian path of the required kind by deleting an
edge out of ∅.
The case n = 2 is trivial: ∅, {1}, {1, 2}, {2}, ∅ works.
For n ≥ 3, start with a Hamiltonian cycle C, ∅, {1}, . . . , {2}, ∅, of the (n − 1)-dimensional
hypercube (we know there’s one by induction), and then also consider

the sequence C′, {n}, {1, n}, . . . , {2, n}, {n}, obtained by unioning every term of
C with {n}. C′ has the property that it is a cyclic list of the elements of the
n-dimensional hypercube that are not listed in C, and also has the property that
adjacent elements have symmetric difference of size exactly 1. A Hamiltonian cycle
of the n-dimensional hypercube is now obtained by starting with all the elements
of C except the final ∅, then going to the second-from-last element of C′, and then
listing the remaining elements of C′ (except for the final {n}) in reverse order. This
list ends at {n}, which is adjacent to the starting point ∅.
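The inductive construction is short enough to write out directly: the list for n is the list for n − 1, followed by the same list reversed with n added to every set. Here is a small Python sketch of it (the function name `subset_cycle` is just an illustrative choice); this is the construction usually called the reflected Gray code.

```python
def subset_cycle(n):
    """List all subsets of {1, ..., n} for n >= 1, starting at the empty set.

    Consecutive subsets differ in exactly one element, and for n >= 2 the
    last subset is {n}, so the list closes up into a cycle.
    """
    cycle = [frozenset(), frozenset({1})]
    for m in range(2, n + 1):
        # Old list, then the old list reversed with m added to every set.
        cycle = cycle + [s | {m} for s in reversed(cycle)]
    return cycle

listing = subset_cycle(4)
```

For n = 2 this returns ∅, {1}, {1, 2}, {2}, matching the base case above.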

8. Let G be a finite group of order n generated by a and b. Prove or disprove: there
is a sequence $g_1, g_2, g_3, \ldots, g_{2n}$ such that every element of G occurs exactly twice in
the sequence, and, for each i = 1, 2, . . . , 2n, $g_{i+1}$ equals $g_i a$ or $g_i b$. (Interpret $g_{2n+1}$
as $g_1$.)

Solution: This was from the 1990 Putnam competition. It is (an instance of)
a very basic result in graph theory, probably the first ever result proved, namely
the necessary and sufficient conditions for the existence of an Eulerian circuit in a
directed graph.

See https://mks.mff.cuni.cz/kalva/putnam/psoln/psol9010.html for a solution.

6 Week 5 (September 8) — Calculus
Calculus is a rich source for competition problems. The Putnam problem setters try to
assume minimal mathematical background, so the topics from Calculus that come up
will tend to focus on material from Calc 1 & 2. Here are the things you should for sure be
familiar with:

• The definitions of limits, continuity and derivative. Some questions will ask to
compute interesting limits, or make certain continuity assumptions, or give some
information about the values of derivatives of a function, and of course it will be
helpful to be comfortable with these concepts!

• The three basic theorems of continuity and differentiability:

– The intermediate value theorem: if continuous f : [a, b] → R is negative at a
and positive at b it must be 0 at some point between a and b.
– The extreme value theorem: a continuous function f defined on a closed interval
[a, b] is bounded, and there are numbers c, d such that f (c) = max{f (x) : x ∈
[a, b]} and f (d) = min{f (x) : x ∈ [a, b]} (i.e., not only is f bounded, but it
reaches its bounds).
– The mean value theorem: if f : [a, b] → R is differentiable (and so, necessarily,
continuous) then there is some c ∈ (a, b) with $f'(c) = \frac{f(b) - f(a)}{b - a}$ (i.e., the
average slope is matched at some point by the exact slope).

• The meaning of first and second derivatives, in terms of local maxima and minima
of functions.

• The idea of approximating an integral via a Riemann sum, and recognizing a sum as
a Riemann sum — sometimes a complicated sum becomes very easy to understand
if you realize that it is a Riemann sum for some integral.

• The fundamental theorem of calculus, which has two distinct parts:

– if new function g is defined from old continuous function f by $g(x) = \int_a^x f(t)\,dt$
(some fixed a), then g is differentiable, and $g'(x) = f(x)$, and
– if for some continuous f, the function g has the property that $g'(x) = f(x)$ (i.e.,
g is an antiderivative of f) then $\int_a^b f(x)\,dx = g(b) - g(a)$.

• Taylor’s theorem, with remainder term: suppose f is infinitely differentiable at and
near a. Then
\[ f(x) \approx f(a) + (x-a)f'(a) + (x-a)^2\frac{f''(a)}{2!} + \cdots + (x-a)^n\frac{f^{(n)}(a)}{n!}. \]
More precisely, there is some number c between a and x for which
\[ f(x) = f(a) + (x-a)f'(a) + (x-a)^2\frac{f''(a)}{2!} + \cdots + (x-a)^n\frac{f^{(n)}(a)}{n!} + (x-a)^{n+1}\frac{f^{(n+1)}(c)}{(n+1)!}. \]

• All the basic integrals, and all the basic integration techniques — integration by
parts, u-substitutions, trigonometric substitutions, et cetera.

Paraphrasing my colleague Andrei Jorza: “you will rarely need any new calculus
technique that you haven’t seen before; the difficulty is to patch together all the things you
know to obtain a solution. While cleverness will take you a long way in problem solving
calculus, this is no place for being squeamish about algebraic manipulations.”
The book Putnam and Beyond (available online) has a huge number of Putnam-style
calculus problems. You’ll also find a fair number at
https://www3.nd.edu/~ajorza/courses/2018f-m43900/handouts/lecture3.pdf (Andrei Jorza’s
43900 page from Fall 2018). Many of this week’s problems come from that list.

6.1 Problems to think about for week 6
1. Find, with explanation, the maximum value of $f(x) = x^3 - 3x$ on the set of all real
numbers satisfying $x^4 + 36 \le 13x^2$.
2. Suppose f : R → R is a continuous function satisfying |f(x) − f(y)| ≥ |x − y| for all
x, y. Show that f is both injective and surjective.
3. Let f : [0, 1] → R be a continuous function. Show that for every x ∈ [0, 1], the series
\[ \sum_{n=1}^{\infty} \frac{f(x^n)}{2^n} \]
converges.
4. Let
\[ f(n) = \sum_{k=1}^{n} \frac{1}{\sqrt{k} + \sqrt{k+1}}. \]
Evaluate f(9999).
5. For which real numbers c is it true that
\[ \frac{1}{2}\left(e^x + e^{-x}\right) \le e^{cx^2} \]
for all real numbers x?
6. Given 0 < α < β, find
\[ \lim_{\lambda \to 0} \left( \int_0^1 (\beta x + \alpha(1 - x))^{\lambda}\,dx \right)^{1/\lambda}. \]
7. Compute
\[ \int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\,dx. \]
8. Let f : R → R be a continuous function. Define $g(x) = f(x)\int_0^x f(t)\,dt$. Show that
if g is non-increasing then f must be the identically 0 function.
9. Compute
\[ \int_0^{\pi/2} \frac{dx}{1 + \tan^{\sqrt{2}}(x)}. \]
10. Let A and B be points on the same branch of the hyperbola xy = 1. Let P be a
point on the hyperbola between A and B such that the triangle AP B has largest
area. Show that the area bounded by the hyperbola and the chord AP is the same as
the area bounded by the hyperbola and the chord BP.
11. Compute
\[ \lim_{n \to \infty} \left( \frac{1}{\sqrt{4n^2 - 1^2}} + \frac{1}{\sqrt{4n^2 - 2^2}} + \cdots + \frac{1}{\sqrt{4n^2 - n^2}} \right). \]

6.2 Solutions to calculus problems
1. Find, with explanation, the maximum value of $f(x) = x^3 - 3x$ on the set of all real
numbers satisfying $x^4 + 36 \le 13x^2$.

Solution: This was from the 1986 Putnam competition, question A1.

The answer is 18.

$x^4 + 36 \le 13x^2$ is equivalent to $x^4 - 13x^2 + 36 \le 0$, which is equivalent to
$(x^2 - 4)(x^2 - 9) \le 0$, which is equivalent to $4 \le x^2 \le 9$, which is equivalent to
−3 ≤ x ≤ −2 or 2 ≤ x ≤ 3.
Now $f(x) = x^3 - 3x$ is a continuous function with critical points (points of zero
derivative) at $3x^2 - 3 = 0$, or x = ±1. Neither of these is in the range of interest,
so to find the maximum on the range of interest, we need only evaluate f(x) at
x = −3, −2, 2 and 3. We have f(−3) = −18, f(−2) = −2, f(2) = 2 and f(3) = 18, so
the largest value is f(3) = 18.
2. Suppose f : R → R is a continuous function satisfying |f (x) − f (y)| ≥ |x − y| for all
x, y. Show that f is both injective and surjective.

Solution: This is from Putnam and beyond by Gelca and Andreescu.

Injectivity: Suppose that x ≠ y. Then it must be the case that f(x) ≠ f(y); for if
not, then 0 = |f(x) − f(y)| ≥ |x − y| > 0, a contradiction.
Surjectivity: For any y > 0, we have |f (0) − f (y)| ≥ y, so

either f (y) ≥ f (0) + y or f (y) ≤ f (0) − y,

and similarly we have |f (0) − f (−y)| ≥ y, so

either f (−y) ≥ f (0) + y or f (−y) ≤ f (0) − y.

Suppose we have f(y) ≥ f(0) + y and f(−y) ≥ f(0) + y. By the intermediate value
theorem, somewhere in (0, y) f takes on the value f(0) + y/2, and somewhere in
(−y, 0) it also takes on the value f(0) + y/2. This contradicts injectivity. We get a
similar contradiction if f(y) ≤ f(0) − y and f(−y) ≤ f(0) − y.
So either
f (y) ≥ f (0) + y and f (−y) ≤ f (0) − y,
or
f (y) ≤ f (0) − y and f (−y) ≥ f (0) + y.
In either case, by intermediate value theorem f takes on all values in the interval
[f (0) − y, f (0) + y] (and in particular takes them on as the argument runs between
−y and y). Since f (0) + y can be made arbitrarily large, and f (0) − y arbitrarily
small, by appropriate choice of y > 0, we conclude that f takes on all real values.

3. Let f : [0, 1] → R be a continuous function. Show that for every x ∈ [0, 1], the series
\[ \sum_{n=1}^{\infty} \frac{f(x^n)}{2^n} \]
converges.

Solution: I found this on Andrei Jorza’s Putnam prep class page.

Since f is a continuous function on a bounded closed interval, there is M > 0 such
that |f(x)| ≤ M for all x ∈ [0, 1]. For x ∈ [0, 1] we also have $x^n \in [0, 1]$, so
$|f(x^n)| \le M$ for every n. So
\[ \sum_{n=1}^{\infty} \frac{|f(x^n)|}{2^n} \]
(a sum of non-negative terms) converges: the partial sums form a non-decreasing
sequence, bounded above by
\[ \sum_{n=1}^{\infty} \frac{M}{2^n} = M. \]
So, for all x, $\sum_{n=1}^{\infty} \frac{f(x^n)}{2^n}$ is absolutely convergent, and so convergent.

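4. Let
\[ f(n) = \sum_{k=1}^{n} \frac{1}{\sqrt{k} + \sqrt{k+1}}. \]
Evaluate f(9999).

Solution: This problem was on the 2014 U. Illinois Urbana-Champaign mock Putnam.
Rationalize!
\begin{align*}
f(n) &= \sum_{k=1}^{n} \frac{1}{\sqrt{k} + \sqrt{k+1}} \\
&= \sum_{k=1}^{n} \frac{\sqrt{k} - \sqrt{k+1}}{(\sqrt{k} + \sqrt{k+1})(\sqrt{k} - \sqrt{k+1})} \\
&= \sum_{k=1}^{n} \left(\sqrt{k+1} - \sqrt{k}\right) \\
&= \sqrt{n+1} - 1,
\end{align*}
the third line because the denominator equals k − (k + 1) = −1, and the last line by a
telescoping sum. So $f(9999) = \sqrt{10000} - 1 = 99$.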
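A quick numerical check of the telescoping identity is easy to run. This is only a floating-point sanity check written for illustration, not a proof; the function name `f` mirrors the problem statement.

```python
import math

def f(n):
    """Direct (floating-point) evaluation of the sum defining f(n)."""
    return sum(1 / (math.sqrt(k) + math.sqrt(k + 1)) for k in range(1, n + 1))

value = f(9999)   # should agree with sqrt(10000) - 1 up to rounding error
```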

5. For which real numbers c is it true that
\[ \frac{1}{2}\left(e^x + e^{-x}\right) \le e^{cx^2} \]
for all real numbers x?

Solution: This was Problem B1 from the 1980 Putnam Competition. See for example
https://faculty.math.illinois.edu/~hildebr/putnam/problems/mock13sol.pdf
for a solution.

6. Given 0 < α < β, find
\[ \lim_{\lambda \to 0} \left( \int_0^1 (\beta x + \alpha(1 - x))^{\lambda}\,dx \right)^{1/\lambda}. \]

Solution: This was problem B2 of the 1979 Putnam competition.

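Solution due to John Scholes: Making the substitution t = βx + α(1 − x), so
dt = (β − α)dx, the integral becomes
\[ \frac{1}{\beta - \alpha} \int_{\alpha}^{\beta} t^{\lambda}\,dt = \frac{1}{1 + \lambda} \left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right). \]
So we need to compute
\[ \lim_{\lambda \to 0} \left[ \frac{1}{1 + \lambda} \left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right) \right]^{1/\lambda}. \]
Now
\[ \lim_{\lambda \to 0^+} \frac{1}{(1 + \lambda)^{1/\lambda}} = \lim_{k \to \infty} \frac{1}{\left(1 + \frac{1}{k}\right)^k} = \frac{1}{e}, \]
and
\[ \lim_{\lambda \to 0^-} \frac{1}{(1 + \lambda)^{1/\lambda}} = \lim_{k \to -\infty} \frac{1}{\left(1 + \frac{1}{k}\right)^k} = \lim_{\ell \to +\infty} \left(1 - \frac{1}{\ell}\right)^{\ell} = \frac{1}{e}, \]
so
\[ \lim_{\lambda \to 0} \frac{1}{(1 + \lambda)^{1/\lambda}} = \frac{1}{e}. \]
For the other part of the limit, write
\[ \left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right)^{1/\lambda} = \exp\left( \frac{1}{\lambda} \log\left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right) \right). \]
The exponent is an indeterminate of the form 0/0 at λ = 0, so we evaluate the limit
of the exponent as λ → 0 by an application of L’Hôpital’s rule. Differentiating with
respect to λ, and using $\frac{d}{d\lambda}\beta^{\lambda+1} = \beta^{\lambda+1}\log\beta$, the limit is
\[ \lim_{\lambda \to 0} \frac{\beta^{\lambda+1}\log\beta - \alpha^{\lambda+1}\log\alpha}{\beta^{\lambda+1} - \alpha^{\lambda+1}} = \frac{\beta\log\beta - \alpha\log\alpha}{\beta - \alpha}. \]
So by continuity,
\[ \lim_{\lambda \to 0} \left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right)^{1/\lambda} = \exp\left( \frac{\beta\log\beta - \alpha\log\alpha}{\beta - \alpha} \right) = \left( \frac{\beta^{\beta}}{\alpha^{\alpha}} \right)^{1/(\beta - \alpha)}. \]
It follows that
\[ \lim_{\lambda \to 0} \left( \int_0^1 (\beta x + \alpha(1 - x))^{\lambda}\,dx \right)^{1/\lambda} = \frac{1}{e} \left( \frac{\beta^{\beta}}{\alpha^{\alpha}} \right)^{1/(\beta - \alpha)}. \]
(As a sanity check, as α → β this tends to β, as it should.)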

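7. Compute
\[ \int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\,dx. \]

Solution: This was from Putnam and Beyond by Gelca and Andreescu.

We have
\begin{align*}
\int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\,dx
&= \int \frac{(x + e^x + \sin x) - (1 + e^x + \cos x)}{x + e^x + \sin x}\,dx \\
&= \int \left( 1 - \frac{1 + e^x + \cos x}{x + e^x + \sin x} \right) dx \\
&= x - \int \frac{1 + e^x + \cos x}{x + e^x + \sin x}\,dx \\
&= x - \int \frac{du}{u} \quad \text{where } u = x + e^x + \sin x \\
&= x - \log|u| + C \\
&= x - \log|x + e^x + \sin x| + C.
\end{align*}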
8. Let f : R → R be a continuous function. Define $g(x) = f(x)\int_0^x f(t)\,dt$. Show that
if g is non-increasing then f must be the identically 0 function.

Solution: This was from the book Putnam and Beyond by Gelca and Andreescu.
Define h : R → R by
\[ h(x) = \frac{1}{2} \left( \int_0^x f(t)\,dt \right)^2. \]

Notice that, by the fundamental theorem of calculus (and the chain rule), h is
differentiable and

\[ h'(x) = g(x). \]

Now g(x) is non-increasing and g(0) = 0, so g(x) is non-negative on (−∞, 0) and
non-positive on (0, ∞). But g = h′, so this implies that h is non-decreasing on
(−∞, 0), and non-increasing on (0, ∞). And h(0) = 0, while h(x) ≥ 0 for all x, so it
must be the case that h(x) = 0 for all x. This tells us that
\[ \int_0^x f(t)\,dt = 0 \]
for all real x; and differentiating with respect to x tells us that f(x) = 0 for all x.
9. Compute
\[ \int_0^{\pi/2} \frac{dx}{1 + \tan^{\sqrt{2}}(x)}. \]

Solution: This was problem A3 of the Putnam Competition from 1980.

Solution due to John Scholes: The integral evaluates to π/4.

For any positive real α, set $f_\alpha(x) = 1/(1 + \tan^{\alpha} x)$. We have
\[ f_\alpha(\pi/2 - x) = \frac{1}{1 + \tan^{\alpha}(\pi/2 - x)} = \frac{1}{1 + \cot^{\alpha} x} = \frac{\tan^{\alpha} x}{1 + \tan^{\alpha} x} = 1 - f_\alpha(x). \]
So
\begin{align*}
\int_0^{\pi/2} f_\alpha(x)\,dx &= \int_0^{\pi/4} f_\alpha(x)\,dx + \int_{\pi/4}^{\pi/2} f_\alpha(x)\,dx \\
&= \int_0^{\pi/4} f_\alpha(x)\,dx + \int_0^{\pi/4} f_\alpha(\pi/2 - x)\,dx \qquad (\star) \\
&= \int_0^{\pi/4} f_\alpha(x)\,dx + \int_0^{\pi/4} (1 - f_\alpha(x))\,dx \\
&= \int_0^{\pi/4} 1\,dx \\
&= \frac{\pi}{4}.
\end{align*}

In particular, when $\alpha = \sqrt{2}$ the result is π/4.
Explanation of $(\star)$: we calculate
\[ \int_0^{\pi/4} f_\alpha(\pi/2 - x)\,dx \]
by making the substitution u = π/2 − x. We have du = −dx, so dx = −du; at x = 0,
u = π/2; at x = π/4, u = π/4; and the integrand $f_\alpha(\pi/2 - x)$ becomes $f_\alpha(u)$. So:
\[ \int_0^{\pi/4} f_\alpha(\pi/2 - x)\,dx = -\int_{\pi/2}^{\pi/4} f_\alpha(u)\,du = \int_{\pi/4}^{\pi/2} f_\alpha(u)\,du = \int_{\pi/4}^{\pi/2} f_\alpha(x)\,dx. \]


Note that the $\sqrt{2}$ was a complete red herring(!), just introduced to make sure that
the integrand does not have an elementary antiderivative.
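The symmetry argument says the value is π/4 for every positive exponent, which is easy to test numerically. The sketch below uses a plain midpoint rule; the function name `integral` and the sample exponents are arbitrary choices made for illustration.

```python
import math

def integral(alpha, steps=1000):
    """Midpoint-rule estimate of the integral of 1/(1 + tan(x)**alpha) on [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * sum(
        1 / (1 + math.tan((i + 0.5) * h) ** alpha) for i in range(steps)
    )

estimates = {alpha: integral(alpha) for alpha in (0.5, math.sqrt(2), 2, 5)}
```

The midpoint nodes are symmetric about π/4, so the pairing f(x) + f(π/2 − x) = 1 used in the proof makes the estimate agree with π/4 up to rounding, whatever the exponent.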

10. Let A and B be points on the same branch of the hyperbola xy = 1. Let P be a
point on the hyperbola between A and B such that the triangle AP B has largest
area. Show that the area bounded by the hyperbola and the chord AP is the same as
the area bounded by the hyperbola and the chord BP.

Solution: This was problem A1 on the 2015 Putnam.

Solution by Kiran Kedlaya: Without loss of generality, assume that A and B lie
in the first quadrant with $A = (t_1, 1/t_1)$, $B = (t_2, 1/t_2)$, and $t_1 < t_2$. If $P = (t, 1/t)$
with $t_1 \le t \le t_2$, then the area of triangle AP B is
\[ \frac{1}{2} \begin{vmatrix} 1 & 1 & 1 \\ t_1 & t & t_2 \\ 1/t_1 & 1/t & 1/t_2 \end{vmatrix} = \frac{t_2 - t_1}{2 t_1 t_2} \left( t_1 + t_2 - t - \frac{t_1 t_2}{t} \right). \]

When $t_1, t_2$ are fixed, this is maximized when $t + t_1 t_2 / t$ is minimized, which by
AM-GM holds exactly when $t = \sqrt{t_1 t_2}$.
The line AP is given by $y = \frac{t_1 + t - x}{t t_1}$, and so the area of the region bounded by
the hyperbola and AP is
\[ \int_{t_1}^{t} \left( \frac{t_1 + t - x}{t t_1} - \frac{1}{x} \right) dx = \frac{t}{2 t_1} - \frac{t_1}{2 t} - \log\frac{t}{t_1}, \]
which at $t = \sqrt{t_1 t_2}$ is equal to $\frac{t_2 - t_1}{2\sqrt{t_1 t_2}} - \log\sqrt{t_2/t_1}$. Similarly, the area of the region
bounded by the hyperbola and P B is $\frac{t_2}{2t} - \frac{t}{2 t_2} - \log\frac{t_2}{t}$, which at $t = \sqrt{t_1 t_2}$ is also
$\frac{t_2 - t_1}{2\sqrt{t_1 t_2}} - \log\sqrt{t_2/t_1}$, as desired.

Second solution: For any λ > 0, the map $(x, y) \mapsto (\lambda x, \lambda^{-1} y)$ preserves both
areas and the hyperbola xy = 1. We may thus rescale the picture so that A, B are
symmetric across the line y = x, with A above the line. As P moves from A to B,
the area of AP B increases until P passes through the point (1, 1), then decreases.
Consequently, P = (1, 1) achieves the maximum area, and the desired equality is
obvious by symmetry. Alternatively, since the hyperbola is convex, the maximum
is uniquely achieved at the point where the tangent line is parallel to AB, and by
symmetry that point is P .

11. Compute
\[ \lim_{n \to \infty} \left( \frac{1}{\sqrt{4n^2 - 1^2}} + \frac{1}{\sqrt{4n^2 - 2^2}} + \cdots + \frac{1}{\sqrt{4n^2 - n^2}} \right). \]

Solution: This was from Putnam and Beyond by Gelca and Andreescu.

The limit is π/6. We have
\[ \sum_{k=1}^{n} \frac{1}{\sqrt{4n^2 - k^2}} = \sum_{k=1}^{n} \frac{1}{n\sqrt{4 - (k/n)^2}} = \sum_{k=1}^{n} \frac{1}{n} \cdot \frac{1}{\sqrt{4 - (k/n)^2}}. \]

This is a Riemann sum, specifically for the function $f(x) = 1/\sqrt{4 - x^2}$, on the
interval [0, 1], with n subintervals each of length 1/n, and evaluating at the right-hand
end of each subinterval. Since f is integrable on [0, 1], and indeed
\[ \int_0^1 \frac{dx}{\sqrt{4 - x^2}} = \arcsin\frac{1}{2} = \frac{\pi}{6}, \]
we get that the limit is π/6.
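To see the Riemann sum converging, one can simply evaluate the finite sums for growing n. This is a rough numerical illustration only; the function name `partial_sum` is an assumption.

```python
import math

def partial_sum(n):
    """The n-term sum 1/sqrt(4n^2 - k^2), k = 1..n, from the problem."""
    return sum(1 / math.sqrt(4 * n * n - k * k) for k in range(1, n + 1))

# Gaps to pi/6 for increasing n; they should shrink roughly like 1/n.
gaps = [abs(partial_sum(n) - math.pi / 6) for n in (10, 100, 1000, 10000)]
```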

7 Week 6 (September 15) — Recurrences
We met recurrences in the induction hand-out.
Sometimes we are either given a sequence of numbers via a recurrence relation, or we
can argue that there is such a relation that governs the growth of a sequence. A sequence
$(b_n)_{n \ge a}$ is defined via a recurrence relation if some initial values, $b_a, b_{a+1}, \ldots, b_k$ say, are
given, and then a rule is given that allows, for each n > k, $b_n$ to be computed as long as
we know the values $b_a, b_{a+1}, \ldots, b_{n-1}$.
Sequences defined by a recurrence relation, and proofs by induction, go hand-in-glove.
Here’s an illustrative example.
Example: Let $a_n$ be the number of different ways of covering a 1 by n strip with 1 by 1
and 1 by 3 tiles. Prove that $a_n < (1.5)^n$.
Solution: We start by figuring out how to calculate $a_n$ via a recurrence. Some initial
values of $a_n$ are easy to compute: for example, $a_1 = 1$, $a_2 = 1$ and $a_3 = 2$. For n ≥ 4, we
can tile the 1 by n strip EITHER by first tiling the initial 1 by 1 strip with a 1 by 1 tile,
and then finishing by tiling the remaining 1 by n − 1 strip in any of the $a_{n-1}$ admissible
ways; OR by first tiling the initial 1 by 3 strip with a 1 by 3 tile, and then finishing by
tiling the remaining 1 by n − 3 strip in any of the $a_{n-3}$ admissible ways. It follows that
for n ≥ 4 we have $a_n = a_{n-1} + a_{n-3}$. So $a_n$ (for n ≥ 1) is determined by the recurrence
\[ a_n = \begin{cases} 1 & \text{if } n = 1, \\ 1 & \text{if } n = 2, \\ 2 & \text{if } n = 3, \text{ and} \\ a_{n-1} + a_{n-3} & \text{if } n \ge 4. \end{cases} \]

Notice that this gives us enough information to calculate $a_n$ for all n ≥ 1: for example,
$a_4 = a_3 + a_1 = 3$, $a_5 = a_4 + a_2 = 4$, and $a_6 = a_5 + a_3 = 6$.
Now we prove, by strong induction, that $a_n < (1.5)^n$. That $a_1 = 1 < (1.5)^1$, $a_2 = 1 < (1.5)^2$
and $a_3 = 2 < (1.5)^3$ is obvious. For n ≥ 4, we have
\begin{align*}
a_n &= a_{n-1} + a_{n-3} \\
&< (1.5)^{n-1} + (1.5)^{n-3} \\
&= (1.5)^n \left( \frac{2}{3} + \left(\frac{2}{3}\right)^3 \right) \\
&= (1.5)^n \left( \frac{26}{27} \right) \\
&< (1.5)^n,
\end{align*}
(the second line using the inductive hypothesis) and we are done by induction.
Notice that we really needed strong induction here, and we really needed all three
of the base cases n = 1, 2, 3 (think about what would happen if we tried to use regular
induction, or what would happen if we only verified n = 1 as a base case).
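The recurrence translates directly into a few lines of code, which also lets one check the bound for many values of n at once. This is just an illustrative sketch; the function name `tilings` is an assumption.

```python
def tilings(n_max):
    """Return {n: a_n} for 1 <= n <= max(n_max, 3), where a_n counts tilings
    of a 1-by-n strip with 1-by-1 and 1-by-3 tiles."""
    a = {1: 1, 2: 1, 3: 2}          # the three base cases
    for n in range(4, n_max + 1):
        a[n] = a[n - 1] + a[n - 3]  # first tile is 1-by-1, or it is 1-by-3
    return a

a = tilings(40)
bound_holds = all(a[n] < 1.5 ** n for n in a)
```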

Solving via generating functions
Given a sequence $(a_n)_{n \ge 0}$, we can form its generating function, the function
\[ F(x) = \sum_{n=0}^{\infty} a_n x^n. \]

Often we can use a recurrence relation to produce a functional equation that F(x) satisfies,
then solve that equation to find a compact (non-infinite-summation) expression for F(x),
then finally use knowledge of calculus power-series to extract an exact expression for $a_n$.
There are so many different varieties of this method, that I won’t describe it in general,
just give an example. The Perrin sequence is defined by $p_0 = 3$, $p_1 = 0$, $p_2 = 2$, and
\[ p_n = p_{n-2} + p_{n-3} \quad \text{for } n > 2. \]

(A quite interesting sequence: see
http://en.wikipedia.org/wiki/Perrin_number#Primes_and_divisibility.) The generating
function of the sequence is
\[ P(x) = p_0 + p_1 x + p_2 x^2 + p_3 x^3 + p_4 x^4 + \ldots. \]

Plugging in the given values for $p_0$, $p_1$ and $p_2$, and the recurrence’s right-hand side for all
others, we get
\begin{align*}
P(x) &= 3 + 2x^2 + (p_0 + p_1)x^3 + (p_1 + p_2)x^4 + \ldots \\
&= 3 + 2x^2 + x^3(p_0 + p_1 x + \ldots) + x^2(p_1 x + p_2 x^2 + \ldots) \\
&= 3 + 2x^2 + x^3 P(x) + x^2 (P(x) - p_0) \\
&= 3 - x^2 + (x^3 + x^2) P(x).
\end{align*}

We can now solve for P(x) as a rational function in x, and expand using partial fractions:
\begin{align*}
P(x) &= \frac{3 - x^2}{1 - x^2 - x^3} \\
&= \frac{A}{1 - \alpha_1 x} + \frac{B}{1 - \alpha_2 x} + \frac{C}{1 - \alpha_3 x}
\end{align*}
where A, B and C are some constants and $(1 - \alpha_1 x)(1 - \alpha_2 x)(1 - \alpha_3 x) = 1 - x^2 - x^3$, or
equivalently $(x - \alpha_1)(x - \alpha_2)(x - \alpha_3) = x^3 - x - 1$. In other words, $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the
solutions to $x^3 - x - 1 = 0$ (it happens that one of them, say $\alpha_1$, is real, and is roughly
1.32 [it’s called the plastic number] and the other two are a complex conjugate pair with
absolute value smaller than $\alpha_1$).
Using
\[ \frac{1}{1 - kx} = 1 + kx + k^2 x^2 + \ldots, \]
we now get that
\[ P(x) = (A + B + C) + (A\alpha_1 + B\alpha_2 + C\alpha_3)x + (A\alpha_1^2 + B\alpha_2^2 + C\alpha_3^2)x^2 + (A\alpha_1^3 + B\alpha_2^3 + C\alpha_3^3)x^3 + \ldots, \]

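and so we can read off a formula for $p_n$ (by uniqueness of power-series representations):
\[ p_n = A\alpha_1^n + B\alpha_2^n + C\alpha_3^n. \]

But what are A, B and C? One way to figure them out is to use the initial conditions, to
get a set of simultaneous equations:
\begin{align*}
A + B + C &= 3 \\
A\alpha_1 + B\alpha_2 + C\alpha_3 &= 0 \\
A\alpha_1^2 + B\alpha_2^2 + C\alpha_3^2 &= 2.
\end{align*}
It turns out that the unique solution to this system is A = B = C = 1. This solution
satisfies the first equation above, evidently; it satisfies the second since
\[ (x - \alpha_1)(x - \alpha_2)(x - \alpha_3) = x^3 - x - 1 = x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3 \tag{1} \]
implies $\alpha_1 + \alpha_2 + \alpha_3 = 0$; and it satisfies the last since
\[ (\alpha_1 + \alpha_2 + \alpha_3)^2 = (\alpha_1^2 + \alpha_2^2 + \alpha_3^2) + 2(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3), \]
and from (1), which also gives $\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 = -1$, this reduces to
$0 = \alpha_1^2 + \alpha_2^2 + \alpha_3^2 - 2$ so $\alpha_1^2 + \alpha_2^2 + \alpha_3^2 = 2$.
So we have an exact formula for $p_n$:
\[ p_n = \alpha_1^n + \alpha_2^n + \alpha_3^n \]
where $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the roots of $x^3 - x - 1$.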
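The closed form is easy to check against the recurrence numerically. The sketch below is an illustration only: it finds the real root of the cubic by Newton's method, gets the complex pair from the quadratic factor $x^2 + ax + (a^2 - 1)$ left after dividing out the real root a, and compares rounded power sums with the recurrence. All function names are assumptions, and floating-point rounding limits it to moderate n.

```python
import cmath

def perrin(n_max):
    """Perrin numbers p_0..p_n_max from the recurrence p_n = p_{n-2} + p_{n-3}."""
    p = [3, 0, 2]
    for n in range(3, n_max + 1):
        p.append(p[n - 2] + p[n - 3])
    return p[: n_max + 1]

def cubic_roots():
    """The three roots of x^3 - x - 1."""
    a = 1.5
    for _ in range(50):                       # Newton's method for the real root
        a -= (a ** 3 - a - 1) / (3 * a ** 2 - 1)
    # x^3 - x - 1 = (x - a)(x^2 + a x + a^2 - 1); solve the quadratic factor.
    disc = cmath.sqrt(4 - 3 * a * a)
    return a, (-a + disc) / 2, (-a - disc) / 2

roots = cubic_roots()
closed_form = [round(sum(r ** n for r in roots).real) for n in range(21)]
```

Here `roots[0]` comes out near 1.3247, the plastic number mentioned above.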


Notice that without even doing the explicit computation of A, B and C, we have
learned something from the generating function approach about pn , namely the following:
since $\alpha_1 \approx 1.32$ is real and $(\alpha_2, \alpha_3)$ is a complex conjugate pair with absolute value smaller
than $\alpha_1$, we have that for large n,

\[ p_n \approx A\alpha_1^n \approx A(1.32)^n. \]

In other words, with very little work, we have isolated the rough growth rate of pn .
If this business of generating functions interests you, you can find out much more in
Herb Wilf’s beautiful book generatingfunctionology (just google it; it’s freely available
online).

Solving via characteristic function


Mimicking what we did with the Perrin sequence, we can easily prove the following theorem:
let $(a_n)$ be a sequence defined recursively, via the defining relation
\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \quad \text{for } n \ge k, \]
(for some constants $c_i$) together with initial values $a_0, a_1, a_2, \ldots, a_{k-1}$. Form the polynomial
\[ C(x) = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \ldots - c_{k-1} x - c_k \]

(called the characteristic polynomial of the recurrence). If $C(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k)$
factors into distinct linear terms, then there are constants $A_1, A_2, \ldots, A_k$ such that for all
n ≥ 0,
\[ a_n = A_1\alpha_1^n + \ldots + A_k\alpha_k^n \]
with the $A_i$’s explicitly findable by solving the k by k system of linear equations
\begin{align*}
A_1 + \ldots + A_k &= a_0 \\
A_1\alpha_1 + \ldots + A_k\alpha_k &= a_1 \\
A_1\alpha_1^2 + \ldots + A_k\alpha_k^2 &= a_2 \\
&\;\;\vdots \\
A_1\alpha_1^{k-1} + \ldots + A_k\alpha_k^{k-1} &= a_{k-1}.
\end{align*}

Even without solving this system, if C(x) has a unique root (say $\alpha_1$) of greatest absolute
value, and $A_1 \ne 0$, then we know the asymptotic growth rate
\[ a_n \sim A_1\alpha_1^n \]
as n → ∞.
Similar statements can be made when C(x) has repeated roots, with the form of the
final answer changing depending on what is the right expression to use in the partial
fractions expansion step of the generating function method. I won’t make a general
statement, because it would be way too cumbersome (but ask me if you want to see more!);
instead here’s an example:
Suppose $a_n = 4a_{n-1} - 4a_{n-2}$ for all n ≥ 2, with $a_0 = 0$ and $a_1 = 1$. The generating
function method gives that the generating function A(x) satisfies
\[ A(x) = \frac{x}{1 - 4x + 4x^2}. \]
The correct partial fractions expansion now is
\[ A(x) = \frac{A}{1 - 2x} + \frac{B}{(1 - 2x)^2}. \]

The coefficient of $x^n$ in A/(1 − 2x) is $A \cdot 2^n$. For $B/(1 - 2x)^2$, we use:
\[ \frac{1}{1 - kx} = 1 + kx + k^2 x^2 + \ldots, \]
so differentiating
\[ \frac{k}{(1 - kx)^2} = k + 2k^2 x + 3k^3 x^2 + \ldots + (n + 1)k^{n+1} x^n + \ldots, \]
so
\[ \frac{1}{(1 - kx)^2} = 1 + 2kx + 3k^2 x^2 + \ldots + (n + 1)k^n x^n + \ldots, \]

so the coefficient of $x^n$ in $B/(1 - 2x)^2$ is $B(n + 1)2^n$. [This trick of figuring out new
power series from old by differentiation is quite useful!] This gives
\[ a_n = A \cdot 2^n + (n + 1)B \cdot 2^n. \]

Using $a_0 = 0$ we get A + B = 0, and using $a_1 = 1$ we get 2A + 4B = 1, so A = −1/2,
B = 1/2 and
\[ a_n = n2^{n-1} \]
(a fact that if we had guessed correctly, we could have easily proven by induction).
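A short script can confirm the formula for the repeated-root example by running the recurrence directly. The function name `a_seq` is an illustrative assumption.

```python
def a_seq(n_max):
    """Terms a_0..a_n_max of a_n = 4 a_{n-1} - 4 a_{n-2}, with a_0 = 0, a_1 = 1."""
    a = [0, 1]
    for n in range(2, n_max + 1):
        a.append(4 * a[n - 1] - 4 * a[n - 2])
    return a

terms = a_seq(30)
matches = all(terms[n] == n * 2 ** (n - 1) for n in range(1, 31))
```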

Solving via matrices


The Perrin recurrence can be encoded in matrix form: start with $p_{n+2} = p_n + p_{n-1}$, and
add the trivial identities $p_{n+1} = p_{n+1}$ and $p_n = p_n$ to get
\[ \begin{pmatrix} p_n \\ p_{n+1} \\ p_{n+2} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} p_{n-1} \\ p_n \\ p_{n+1} \end{pmatrix} \tag{2} \]
valid for n ≥ 1. Iteratively applying, we get that for all n ≥ 1,
\[ \begin{pmatrix} p_n \\ p_{n+1} \\ p_{n+2} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}^{n} \begin{pmatrix} 3 \\ 0 \\ 2 \end{pmatrix}. \tag{3} \]

Write this as $v_n = A^n v$. If we can diagonalize A (that is, find invertible S with $SAS^{-1} = D$,
with D a diagonal matrix), then we can write $A = S^{-1}DS$, so $A^n = S^{-1}D^nS$, from which
we can easily find $v_n$ and so $p_n$ explicitly. If you know enough linear algebra, you’ll quickly
see that this approach requires finding eigenvalues, which are roots of a certain cubic, and
the computations quickly reduce to exactly the same ones as those of the previous two
methods described. I mention this method just to bring up the matrix point of view of
recurrences, which can sometimes be quite helpful.
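One practical payoff of the matrix point of view, separate from diagonalization, is that $A^n$ can be computed by repeated squaring, giving $p_n$ in about $\log n$ matrix multiplications with exact integer arithmetic. The sketch below illustrates this; the helper names `mat_mul`, `mat_pow` and `perrin_matrix` are assumptions, and repeated squaring is a standard technique substituted here for the diagonalization described above.

```python
def mat_mul(X, Y):
    """Product of two 3-by-3 integer matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_pow(M, n):
    """M**n by repeated squaring (n >= 0)."""
    result = [[int(i == j) for j in range(3)] for i in range(3)]  # identity
    while n:
        if n & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        n >>= 1
    return result

A = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]

def perrin_matrix(n):
    """p_n, read off as the first entry of A^n applied to (3, 0, 2)^T."""
    row = mat_pow(A, n)[0]
    return 3 * row[0] + 0 * row[1] + 2 * row[2]

first_terms = [perrin_matrix(n) for n in range(11)]
```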

Solving general recurrences


I’ve only talked about recurrences with constant coefficients, but of course recurrences
can be far more general. While the generating function method is very good to bear in
mind for more general problems, there’s really no general approach that’s sure to work;
solving recurrences generally involves ad hoc tools like playing with lots of small examples,
and spotting, conjecturing and proving patterns (often by induction).

A non-Putnam warm-up exercise
Using the trick of repeatedly differentiating the identity
\[ \frac{1}{1 - x} = 1 + x + x^2 + \ldots, \]
find a nice expression for the coefficients of the power series (about 0) of $1/(1 - x)^k$. Use
this to derive, via generating functions, the identity
\[ \sum_{i=0}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6}, \]
and if you are feeling masochistic, go on to find a nice closed-form for
\[ \sum_{i=0}^{n} i^3 \]
using the same idea.

7.1 Some problems to think about for week 7


1. Define a sequence recursively by $x_0 = 0$, $x_1 = 1$ and, for $n \geq 1$,
$$x_{n+1} = \frac{x_n}{n+1} + \frac{nx_{n-1}}{n+1}.$$
Either prove that $\lim_{n \to \infty} x_n$ exists (and find the limit), or show that the limit does not exist.
2. For $n \geq 0$, let $a_n = [(1+\sqrt{2})^n]$, where $[x]$ indicates the floor of $x$, that is, the largest integer that is no larger than $x$ (so, e.g., $[2] = 2$ and $[2.1] = 2$). Prove that $a_n$ is even if and only if $n$ is odd.
3. Define a sequence $(p_n)_{n \geq 1}$ recursively by $p_1 = 3$, $p_2 = 7$ and, for $n \geq 3$,
$$p_n = 4 + p_{n-1} + 2p_{n-2} + \ldots + 2p_1$$
(so, for example, $p_3 = 4 + p_2 + 2p_1 = 17$ and $p_4 = 4 + p_3 + 2p_2 + 2p_1 = 41$). Find a closed-form expression for $p_n$ for general $n$.
4. Let $(x_n)_{n \geq 0}$ be a sequence of nonzero real numbers such that $x_n^2 - x_{n-1}x_{n+1} = 1$ for $n = 1, 2, 3, \ldots$. Prove that there exists a real number $a$ such that $x_{n+1} = ax_n - x_{n-1}$ for all $n \geq 1$.
5. Define a sequence by $a_k = k$ for $k = 1, 2, \ldots, 2020$ and
$$a_{k+1} = a_k + a_{k-2020}$$
for $k \geq 2021$. Show that the sequence has 2020 consecutive terms each divisible by 2021. Note: Sean has pointed out that there is an error in this statement; need to correct!

6. The last question was clearly written with the years 2020 and 2021 in mind. Does the conclusion remain true for an arbitrary year? That is, fix $m \geq 1$. Define a sequence by $a_k = k$ for $k = 1, 2, \ldots, m+1$ and
$$a_{k+1} = a_k + a_{k-m}$$
for $k \geq m+1$. For which $m$ is it true that the sequence has $m$ consecutive terms each divisible by $m+1$?

7. Let $a_{m,n}$ denote the coefficient of $x^n$ in the expansion of $(1+x+x^2)^m$. Prove that for all integers $k \geq 0$,
$$0 \leq \sum_{i=0}^{\lfloor 2k/3 \rfloor} (-1)^i a_{k-i,i} \leq 1.$$
(Here $\lfloor a \rfloor$ denotes the round-down of $a$ to the nearest integer at or below $a$; so for example $\lfloor 3.4 \rfloor = 3$, $\lfloor 2.999 \rfloor = 2$ and $\lfloor 5 \rfloor = 5$.)

8. Define $(a_n)_{n \geq 0}$ by
$$\frac{1}{1-2x-x^2} = \sum_{n \geq 0} a_nx^n.$$
Show that for each $n \geq 0$, there is an $m = m(n)$ such that $a_m = a_n^2 + a_{n+1}^2$.

7.2 Solutions to recurrence problems


First, the warm-up problem:
Differentiating the left-hand side $k$ times, we get
$$\frac{k!}{(1-x)^{k+1}},$$
and differentiating the right-hand side $k$ times, we get a power series where the coefficient of $x^{n-k}$ is $n(n-1)\ldots(n-(k-1))$, so the coefficient of $x^n$ is $(n+k)(n+k-1)\ldots(n+1)$. Dividing through by $k!$, the coefficient of $x^n$ in $1/(1-x)^{k+1}$ is
$$\frac{(n+k)(n+k-1)\ldots(n+1)}{k!} = \binom{n+k}{k}.$$
Let $a_n = \sum_{i=0}^n i^2$; $a_n$ satisfies the recurrence $a_0 = 0$ and $a_n = a_{n-1} + n^2$ for $n > 0$. Letting $A(x) = a_0 + a_1x + a_2x^2 + \ldots$ be the generating function of the $a_n$’s, we get
$$\begin{aligned}
A(x) &= 0 + (a_0 + 1^2)x + (a_1 + 2^2)x^2 + \ldots \\
&= xA(x) + (1^2x + 2^2x^2 + 3^2x^3 + \ldots) \\
&= xA(x) + [1 \cdot 0 + 1]x + [2 \cdot 1 + 2]x^2 + [3 \cdot 2 + 3]x^3 + \ldots \\
&= xA(x) + (1 \cdot 0\,x + 2 \cdot 1\,x^2 + 3 \cdot 2\,x^3 + \ldots) + (1x + 2x^2 + 3x^3 + \ldots) \\
&= xA(x) + x^2(2 \cdot 1 + 3 \cdot 2\,x + \ldots) + x(1 + 2x + 3x^2 + \ldots) \\
&= xA(x) + x^2\frac{d^2}{dx^2}\left(\frac{1}{1-x}\right) + x\frac{d}{dx}\left(\frac{1}{1-x}\right) \\
&= xA(x) + \frac{2x^2}{(1-x)^3} + \frac{x}{(1-x)^2} \\
&= xA(x) + \frac{x^2+x}{(1-x)^3}
\end{aligned}$$
so
$$A(x) = \frac{x^2+x}{(1-x)^4} = \frac{x^2}{(1-x)^4} + \frac{x}{(1-x)^4}.$$
This means that $a_n$ consists of two parts: the coefficient of $x^{n-2}$ in $1/(1-x)^4$ and the coefficient of $x^{n-1}$ in $1/(1-x)^4$. By what we established earlier (with $k = 3$), this is
$$\binom{n+1}{3} + \binom{n+2}{3} = \frac{n(n+1)(2n+1)}{6}.$$
If you were feeling masochistic, you might have used the same method to discover
$$\sum_{i=0}^n i^3 = \left(\frac{n(n+1)}{2}\right)^2.$$
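As a sanity check on the generating-function computation, the following sketch (in Python; the function names are made up for illustration) evaluates $a_n$ as the sum of the two coefficients of $1/(1-x)^4$ identified above, using the formula $\binom{n+k}{k}$ for the coefficient of $x^n$ in $1/(1-x)^{k+1}$, with the convention that coefficients of negative powers are 0.

```python
from math import comb

def coeff_inverse_power(n, k):
    """Coefficient of x^n in 1/(1-x)^(k+1), i.e. C(n+k, k); zero when n < 0."""
    return comb(n + k, k) if n >= 0 else 0

def sum_squares_via_gf(n):
    """a_n = [x^(n-2)] 1/(1-x)^4 + [x^(n-1)] 1/(1-x)^4."""
    return coeff_inverse_power(n - 2, 3) + coeff_inverse_power(n - 1, 3)

def sum_squares_direct(n):
    """0^2 + 1^2 + ... + n^2, added up term by term."""
    return sum(i * i for i in range(n + 1))
```

Comparing `sum_squares_via_gf(n)` with `sum_squares_direct(n)` and with `n * (n + 1) * (2 * n + 1) // 6` for a range of `n` is a quick way to catch an index slip in a derivation like this one.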

Now onto the main problem set:

1. Define a sequence recursively by $x_0 = 0$, $x_1 = 1$ and, for $n \geq 1$,
$$x_{n+1} = \frac{x_n}{n+1} + \frac{nx_{n-1}}{n+1}.$$
Either prove that $\lim_{n \to \infty} x_n$ exists (and find the limit), or show that the limit does not exist.

Solution: (University of Illinois at Urbana Champaign Mock Putnam 2012) Rewrite the recurrence as
$$(n+1)x_{n+1} = x_n + nx_{n-1}, \quad (n \geq 1).$$
Setting $d_n = x_{n+1} - x_n$, we get from this that
$$d_n = \frac{-nd_{n-1}}{n+1}, \quad (n \geq 1)$$
with $d_0 = 1$. Iterating this, we get an explicit expression for $d_n$, via a “telescoping” (or canceling) product:
$$d_n = \frac{(-1)^n}{n+1}, \quad (n \geq 0),$$
so that
$$x_n = x_0 + \sum_{i=0}^{n-1} d_i = \sum_{i=0}^{n-1} \frac{(-1)^i}{i+1}.$$
By Leibniz’ test for alternating series, this series converges, and by thinking about the Taylor series for $\log(1+x)$ about 0 (log here is base $e$), and evaluating the Taylor series at $x = 1$, we get that the sum converges to $\log 2$.
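A quick numerical experiment supports the answer. The sketch below (Python, floating point; the function name `x_sequence` is an illustrative choice) iterates the recurrence directly, so that the last term can be compared with $\log 2 \approx 0.6931$.

```python
import math

def x_sequence(count):
    """Return [x_0, ..., x_count] from x_{n+1} = (x_n + n*x_{n-1}) / (n+1)."""
    xs = [0.0, 1.0]
    for n in range(1, count):
        xs.append((xs[n] + n * xs[n - 1]) / (n + 1))
    return xs

xs = x_sequence(100000)
gap = abs(xs[-1] - math.log(2))   # shrinks roughly like 1/(2n), as expected for an alternating series
```

This is exactly the "try small cases" step: printing the first dozen terms already suggests the partial sums of $1 - 1/2 + 1/3 - \ldots$.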
2. For $n \geq 0$, let $a_n = [(1+\sqrt{2})^n]$, where $[x]$ indicates the floor of $x$, that is, the largest integer that is no larger than $x$ (so, e.g., $[2] = 2$ and $[2.1] = 2$). Prove that $a_n$ is even if and only if $n$ is odd.

Solution: (University of Illinois at Urbana Champaign Mock Putnam 2011) Consider
$$b_n = (1+\sqrt{2})^n + (1-\sqrt{2})^n.$$
By the binomial theorem,
$$\begin{aligned}
b_n &= \sum_{k=0}^n \binom{n}{k}\left((\sqrt{2})^k + (-\sqrt{2})^k\right) \\
&= 2\sum_{k \text{ even}} \binom{n}{k} 2^{k/2}.
\end{aligned}$$
From this we conclude that $b_n$ is an even integer.

Now $1-\sqrt{2}$ is a negative number, and in absolute value it is strictly smaller than 1; hence $(1-\sqrt{2})^n$ is a negative number strictly between $-1$ and $0$ when $n$ is odd, and is a positive number strictly between 0 and 1 when $n$ is even and positive. (For $n = 0$ we have $a_0 = 1$, which is odd, directly.)
So, when $n$ is even and positive, $(1+\sqrt{2})^n$ equals $b_n$ (even) minus a number strictly between 0 and 1, so $[(1+\sqrt{2})^n] = b_n - 1$, which is odd; and when $n$ is odd, $(1+\sqrt{2})^n$ equals $b_n$ (even) plus a number strictly between 0 and 1, so $[(1+\sqrt{2})^n] = b_n$, which is even.
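The parity claim can also be checked exactly, without floating point. The sketch below (Python; `floor_power` is an illustrative name) writes $(1+\sqrt{2})^n = p + q\sqrt{2}$ with integers $p, q$, and uses the fact that $\lfloor q\sqrt{2} \rfloor$ is the integer square root of $2q^2$.

```python
from math import isqrt

def floor_power(n):
    """Exact value of floor((1 + sqrt 2)^n), using integer arithmetic only."""
    p, q = 1, 0                       # (1 + sqrt 2)^0 = 1 + 0*sqrt 2
    for _ in range(n):
        p, q = p + 2 * q, p + q       # multiply p + q*sqrt 2 by 1 + sqrt 2
    return p + isqrt(2 * q * q)       # floor(q * sqrt 2) = isqrt(2 q^2)

parity_ok = all((floor_power(n) % 2 == 0) == (n % 2 == 1) for n in range(200))
```

Working with the pair $(p, q)$ is the same idea as introducing the conjugate $(1-\sqrt{2})^n$ in the solution: both avoid ever approximating $\sqrt{2}$.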

3. Define a sequence $(p_n)_{n \geq 1}$ recursively by $p_1 = 3$, $p_2 = 7$ and, for $n \geq 3$,
$$p_n = 4 + p_{n-1} + 2p_{n-2} + \ldots + 2p_1$$
(so, for example, $p_3 = 4 + p_2 + 2p_1 = 17$ and $p_4 = 4 + p_3 + 2p_2 + 2p_1 = 41$). Find a closed-form expression for $p_n$ for general $n$.

Solution: The first few values are $p_1 = 3$, $p_2 = 7$, $p_3 = 17$, $p_4 = 41$. A pattern seems to be emerging: $p_n = 2p_{n-1} + p_{n-2}$, with $p_1 = 3$, $p_2 = 7$. We verify this for all $n \geq 3$. It’s certainly true for $n = 3$. For $n > 3$,
$$\begin{aligned}
p_n &= 4 + p_{n-1} + 2p_{n-2} + 2p_{n-3} + \ldots + 2p_3 + 2p_2 + 2p_1 \\
&= (p_{n-1} + p_{n-2}) + 4 + p_{n-2} + 2p_{n-3} + \ldots + 2p_3 + 2p_2 + 2p_1 \\
&= (p_{n-1} + p_{n-2}) + p_{n-1} \quad \text{(the defining recurrence, applied to } p_{n-1}\text{)} \\
&= 2p_{n-1} + p_{n-2},
\end{aligned}$$
as required. With this new recurrence, it is easy to apply the method of generating functions, as described in the introduction, to get
$$p_n = \frac{(1+\sqrt{2})^{n+1}}{2} + \frac{(1-\sqrt{2})^{n+1}}{2}.$$

Remark: This problem arose in my research. A graph is a collection of points, some


pairs of which are joined by edges. A Widom-Rowlinson coloring of a graph is a
coloring of the points using 3 colors, red, white and blue, in such a way that no
point colored red is joined by an edge to a point colored blue. I was looking at how
many Widom-Rowlinson colorings there are of the graph Pn that consists of n points,
numbered 1 up to n, with edges from 1 to 2, from 2 to 3, etc., up to from n − 1 to n.
It turns out that there are pn such colorings, where pn satisfies the first recurrence.
In trying to find a closed form for pn , I realized that pn satisfies the Fibonacci-like
recurrence described in the solution above, and so was able to solve for pn explicitly
using generating functions.
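For small $n$ the count can be confirmed by brute force. The following sketch (Python; all function names are illustrative) enumerates the $3^n$ colorings of the path, keeps those with no red point adjacent to a blue point, and compares with the recurrence $p_n = 2p_{n-1} + p_{n-2}$ and with the closed form. For the closed form it uses the observation that $\frac{1}{2}\left((1+\sqrt{2})^{n+1} + (1-\sqrt{2})^{n+1}\right)$ is the integer part $p$ in $(1+\sqrt{2})^{n+1} = p + q\sqrt{2}$.

```python
from itertools import product

def count_wr_colorings(n):
    """Count red/white/blue colorings of the path on n points with no red-blue edge."""
    count = 0
    for coloring in product("RWB", repeat=n):
        if all({a, b} != {"R", "B"} for a, b in zip(coloring, coloring[1:])):
            count += 1
    return count

def p_recurrence(n):
    """p_n from p_1 = 3, p_2 = 7, p_n = 2 p_{n-1} + p_{n-2}."""
    a, b = 3, 7
    for _ in range(n - 1):
        a, b = b, 2 * b + a
    return a

def p_closed_form(n):
    """Integer part p of (1 + sqrt 2)^(n+1) = p + q*sqrt 2."""
    p, q = 1, 0
    for _ in range(n + 1):
        p, q = p + 2 * q, p + q
    return p
```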

4. Let $(x_n)_{n \geq 0}$ be a sequence of nonzero real numbers such that $x_n^2 - x_{n-1}x_{n+1} = 1$ for $n = 1, 2, 3, \ldots$. Prove that there exists a real number $a$ such that $x_{n+1} = ax_n - x_{n-1}$ for all $n \geq 1$.

Solution: (1993 Putnam problem A2) Set
$$f_n = \frac{x_n + x_{n-2}}{x_{n-1}}$$
for $n \geq 2$ (well-defined since all $x_i$ are non-zero), so that
$$f_{n+1} = \frac{x_{n+1} + x_{n-1}}{x_n}.$$
We claim that $f_{n+1} = f_n$ for all $n \geq 2$. Indeed, this equality is equivalent to
$$\frac{x_n + x_{n-2}}{x_{n-1}} = \frac{x_{n+1} + x_{n-1}}{x_n}$$
or
$$x_n^2 + x_nx_{n-2} = x_{n-1}x_{n+1} + x_{n-1}^2$$
or
$$x_n^2 - x_{n-1}x_{n+1} = x_{n-1}^2 - x_nx_{n-2};$$
but this last is true since both sides equal 1 for $n \geq 2$. It follows that there is some $a$ such that for all $n \geq 2$,
$$ax_{n-1} = x_n + x_{n-2}$$
or
$$x_n = ax_{n-1} - x_{n-2},$$
so for all $n \geq 1$
$$x_{n+1} = ax_n - x_{n-1},$$
as was required to show.

5. Define a sequence by $a_k = k$ for $k = 1, 2, \ldots, 2021$ and
$$a_{k+1} = a_k + a_{k-2020}$$
for $k \geq 2021$. Show that the sequence has 2020 consecutive terms each divisible by 2021.

Solution: (Putnam competition 2006, problem A3) The given recurrence can be used to extend the sequence to $a_0$ ($a_0$ is the unique integer satisfying $a_{2021} = a_{2020} + a_0$, so $a_0 = 1$), and indeed in the same way it can be extended to all negative $k$. So we can consider that we are working with a doubly-infinite sequence, indexed by $\mathbb{Z}$.
For future reference, it will be useful to know the following:

• $a_0 = 1$;
• $a_{-1}$ is defined by $a_{2020} = a_{2019} + a_{-1}$, so also $a_{-1} = 1$;
• $a_{-2}$ is defined by $a_{2019} = a_{2018} + a_{-2}$, so also $a_{-2} = 1$;
• and this continues down to $a_{-2019}$, which is defined by $a_2 = a_1 + a_{-2019}$, so also $a_{-2019} = 1$.

Here things change a little.

• We have that $a_{-2020}$, which is defined by $a_1 = a_0 + a_{-2020}$, takes value 0.
• Also, $a_{-2021}$, which is defined by $a_0 = a_{-1} + a_{-2021}$, takes value 0.
• This continues down to $a_{-4039}$, which is defined by $a_{-2018} = a_{-2019} + a_{-4039}$, and so also takes value 0.

Notice that we have found 2020 consecutive values of the sequence, albeit with negative indices, that take value 0, namely $a_{-2020}$ through $a_{-4039}$.
In the section on pigeon-hole principle, we showed that the Fibonacci numbers, viewed similarly as a doubly-infinite sequence, are periodic modulo any modulus; that is, if we take any integer $n$ and consider the remainder of the terms of the doubly-infinite Fibonacci sequence modulo $n$, we get a periodic sequence. The same proof works to show that the given sequence $(a_k)_{k \in \mathbb{Z}}$ is periodic modulo 2021.
So, it is enough to show that the doubly-infinite sequence $(a_k)_{k \in \mathbb{Z}}$ has 2020 consecutive terms (some perhaps with negative indices), each divisible by 2021; the periodicity will then give us such a long sequence of terms with positive indices, also. But now we are done, since $a_{-2020}$ through $a_{-4039}$ gives such a sequence.
6. The last question was clearly written with the years 2020 and 2021 in mind. Does the conclusion remain true for an arbitrary year? That is, fix $m \geq 1$. Define a sequence by $a_k = k$ for $k = 1, 2, \ldots, m+1$ and
$$a_{k+1} = a_k + a_{k-m}$$
for $k \geq m+1$. For which $m$ is it true that the sequence has $m$ consecutive terms each divisible by $m+1$?

Solution: It is true for every $m \geq 1$. Indeed, the very same proof goes through, with 2020 replaced by $m$ and 2021 replaced by $m+1$.
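For small $m$ the claim can be tested by direct search. The sketch below (Python; `has_zero_run` and the search limit are illustrative choices, not from the notes) generates the sequence modulo $m+1$ and looks for $m$ consecutive zero residues among positive indices. The limit used relies on an assumption spelled out in the comment: the residues are purely periodic with period at most $(m+1)^{m+1}$, since a block of $m+1$ consecutive residues determines the sequence in both directions.

```python
def has_zero_run(m, limit):
    """Search a_1, ..., a_limit (mod m+1) for m consecutive zero residues."""
    mod = m + 1
    a = [k % mod for k in range(1, m + 2)]        # a_1, ..., a_{m+1}
    while len(a) < limit:
        a.append((a[-1] + a[-1 - m]) % mod)       # a_{k+1} = a_k + a_{k-m}
    run = 0
    for value in a:
        run = run + 1 if value == 0 else 0
        if run >= m:
            return True
    return False

def search_limit(m):
    # Assumed bound: one full period, at most (m+1)^(m+1) terms, plus slack for the run itself.
    return (m + 1) ** (m + 1) + 3 * m + 5

results = {m: has_zero_run(m, search_limit(m)) for m in range(1, 6)}
```

For example, with $m = 2$ the residues mod 3 begin $1, 2, 0, 1, 0, 0, \ldots$, so the run $a_5, a_6$ appears almost immediately.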


7. Let $a_{m,n}$ denote the coefficient of $x^n$ in the expansion of $(1+x+x^2)^m$. Prove that for all integers $k \geq 0$,
$$0 \leq \sum_{i=0}^{\lfloor 2k/3 \rfloor} (-1)^i a_{k-i,i} \leq 1.$$
(Here $\lfloor a \rfloor$ denotes the round-down of $a$ to the nearest integer at or below $a$; so for example $\lfloor 3.4 \rfloor = 3$, $\lfloor 2.999 \rfloor = 2$ and $\lfloor 5 \rfloor = 5$.)

Solution: (Putnam competition 1997, problem B4) Note that for each $k \geq 0$, if $i > \lfloor 2k/3 \rfloor$, then $a_{k-i,i} = 0$ also. So we may as well consider
$$\sum_{i \geq 0} (-1)^i a_{k-i,i},$$
as it takes exactly the same value as $\sum_{i=0}^{\lfloor 2k/3 \rfloor} (-1)^i a_{k-i,i}$.
Observe that, by definition of $a_{m,n}$,
$$(1+x+x^2)^m = \sum_{n \geq 0} a_{m,n}x^n,$$
so that the two-variable generating function
$$\sum_{m,n \geq 0} a_{m,n}x^ny^m$$
is equal to
$$1 + y(1+x+x^2) + y^2(1+x+x^2)^2 + \ldots$$
which simplifies to
$$\frac{1}{1 - y(1+x+x^2)}.$$
Evaluating both sides at $x = -y$, we get
$$\frac{1}{1-y+y^2-y^3} = \sum_{m,n \geq 0} (-1)^n a_{m,n}y^{m+n}.$$
Note that the right-hand side above can be rewritten as
$$\sum_{k \geq 0}\left(\sum_{i \geq 0} (-1)^i a_{k-i,i}\right)y^k;$$
so the sum we are trying to evaluate is the coefficient of $y^k$ in $1/(1-y+y^2-y^3)$.
But now $1-y+y^2-y^3 = (1-y^4)/(1+y)$, so
$$\frac{1}{1-y+y^2-y^3} = \frac{1}{1-y^4} + \frac{y}{1-y^4}.$$
From this we see that the coefficient of $y^k$ in $1/(1-y+y^2-y^3)$ is 1 if $k$ is a multiple of 4, or one greater than a multiple of 4, and is 0 otherwise; in any case it is between 0 and 1, as was required to be shown.
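The conclusion (the sum is 1 when $k \equiv 0$ or $1 \pmod 4$, and 0 otherwise) is easy to test for small $k$. This sketch (Python; the function names are illustrative) builds the coefficients of $(1+x+x^2)^m$ by repeated polynomial multiplication and evaluates the alternating sum directly.

```python
def trinomial_row(m):
    """Coefficients of (1 + x + x^2)^m, lowest degree first."""
    row = [1]
    for _ in range(m):
        new = [0] * (len(row) + 2)
        for i, c in enumerate(row):
            new[i] += c          # contribution of the factor's 1
            new[i + 1] += c      # contribution of the factor's x
            new[i + 2] += c      # contribution of the factor's x^2
        row = new
    return row

def alternating_sum(k):
    """Sum over 0 <= i <= floor(2k/3) of (-1)^i * a_{k-i,i}."""
    return sum((-1) ** i * trinomial_row(k - i)[i] for i in range(2 * k // 3 + 1))
```

For $i \leq \lfloor 2k/3 \rfloor$ we have $i \leq 2(k-i)$, so the index `i` always lies inside the row of length $2(k-i)+1$.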

8. Define $(a_n)_{n \geq 0}$ by
$$\frac{1}{1-2x-x^2} = \sum_{n \geq 0} a_nx^n.$$
Show that for each $n \geq 0$, there is an $m = m(n)$ such that $a_m = a_n^2 + a_{n+1}^2$.

Solution: (1999 Putnam, problem A3) Using partial fractions, we can get the Taylor series of $1/(1-2x-x^2)$, and discover that
$$a_n = \frac{1}{2\sqrt{2}}\left((1+\sqrt{2})^{n+1} - (1-\sqrt{2})^{n+1}\right).$$
After a little (messy) algebra, we find that
$$a_n^2 + a_{n+1}^2 = a_{2n+2},$$
so $m = 2n+2$ works.
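Before doing the messy algebra it is reassuring to check the identity numerically. The sketch below (Python; `a_sequence` is an illustrative name) uses the recurrence read off from the denominator, $a_n = 2a_{n-1} + a_{n-2}$ with $a_0 = 1$, $a_1 = 2$, which follows from multiplying both sides of the defining identity by $1 - 2x - x^2$ and comparing coefficients.

```python
def a_sequence(count):
    """First `count` coefficients of 1/(1 - 2x - x^2)."""
    a = [1, 2]
    while len(a) < count:
        a.append(2 * a[-1] + a[-2])
    return a[:count]

a = a_sequence(100)
identity_holds = all(a[n] ** 2 + a[n + 1] ** 2 == a[2 * n + 2] for n in range(48))
```

The first few values are 1, 2, 5, 12, 29, and indeed $1^2 + 2^2 = 5 = a_2$ and $2^2 + 5^2 = 29 = a_4$.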

8 Week 7 (September 22) — Writing solutions
This week’s handout is concerned with the art of writing (that is, presenting) solutions. I’ve made a few notes myself, but I also strongly recommend that you look at both of these essays:

• (short, with very practical advice and a few examples): http://web.evanchen.cc/handouts/english/english.pdf

• (long, with lots of examples of both good and bad writing): https://artofproblemsolving.com/news/articles/how-to-write-a-solution

As you go over this week’s problems and (more importantly) as you take a Putnam competition, I encourage you to bear in mind the advice contained in these notes, and in the above references.
Note: I have shamelessly appropriated much of this from Ravi Vakil’s Stanford Putnam Preparation website (http://math.stanford.edu/~vakil/putnam05/05putnam7.pdf), and from Ioana Dumitriu’s University of Washington “The Art of Problem Solving” website (http://www.math.washington.edu/~putnam/index.html). Both of these websites are filled with what I think is great advice.
For next Tuesday, fully write up a solution to at least one of the problems that appear later in the handout, after you have read the handout and (at the very least) the 4-page essay by Evan Chen (http://web.evanchen.cc/handouts/english/english.pdf).

Some general problem-solving tips


Remember that problem solving is a full-contact sport: throw everything you know at the
problem you are tackling! Sometimes, the solution can come from an unexpected quarter.
Here are some slogans to keep in mind when solving problems:

• Try small cases!

• Plug in small numbers!

• Do examples!

• Look for patterns!

• Draw pictures!

• Write lots!

• Talk it out!

• Choose good notation!

• Look for symmetry!

• Break into cases!

• Work backwards!

• Argue by contradiction!

• Consider extreme cases!

• Modify the problem!

• Make a generalization!

• Don’t give up after five minutes!

• Don’t be afraid of a little algebra!

• Take a break!

• Sleep on it!

• Ask questions!

And above all:

• Enjoy!

Some specific mathematical tips


Here are some very simple things to remember, that can be very helpful, but that people
tend to forget to do.

1. Try a few small cases out. Try a lot of cases out. Remember that the three hours
of the Putnam competition is a long time — you have time to spare! If a question
asks what happens when you have n things, or 2015 things, try it out with 1, 2, 3, 4
things, and try to form a conjecture. This is especially valuable for questions about
sequences defined recursively.

2. Don’t be afraid to use lots and lots of paper.

3. Don’t be afraid of diving into some algebra. (Again, three hours is a long time . . ..)
You shouldn’t waste that much time, thanks to the 15-minute rule.

4. If a question asks to determine whether something is true or false, and the direction
you initially guess doesn’t seem to be going anywhere, then try guessing the opposite
possibility.

5. Be willing to try (seemingly) stupid things.

6. Look for symmetries. Try to connect the problem to one you’ve seen before. Ask
yourself “how would [person X] approach this problem”? (It’s quite reasonable here
for [Person X] to be [Chuck Norris]!)

7. Putnam problems always have slick solutions. That leads to a helpful meta-approach:
“The only way this problem could have a nice solution is if this particular approach
worked out, unlikely as it seems, so I’ll try it out, and see what happens.”

8. Show no fear. If you think a problem is probably too hard for you, but you have an
idea, try it anyway. (Three hours is a long time.)

Some specific writing tips


You’re working hard on a problem, focussing all your energy, applying all the good tips
above, and all of a sudden a bulb lights above your head: you have figured out how to
solve one of the problems! Awesome! But here’s the hard, unforgiving truth: that warm
tingle you’re feeling is no guarantee

• that you have actually solved the problem (are you sure you have all the cases
covered? Are you sure that all the little details work out? . . .)
and

• that you will get any or all credit for it.

Solving the problem is only half the battle. Now you have to launch into the other half,
convincing the grader that you have actually done so.
Life is tough, and Putnam graders may be even tougher. But here’s a list of things
which, when done properly, will yield a nice write-up which will appease any reasonable
grader.

1. All that scrap paper, filled with your musings on the problem to date? Put it aside;
you must write a clean, coherent solution on a fresh piece of paper.

2. Before you start writing, organize your thoughts. Make a list of all steps to the
solution. Figure out what intermediary results you will need to prove. For example,
if the problem involves induction, always start with the base case, and continue with
proving that “true for n implies true for n + 1”. Make sure that the steps follow
from each other logically, with no gaps.

3. After tracing a “road to proof” either in your head or (preferably) on scratch paper,
start writing up the solution on a fresh piece of paper. The best way to start this is
by writing a quick outline of what you propose to do. Sometimes, the grader will
just look at this outline, say “Yes, she knows what she is doing on this problem!”,
and give the credit.

4. Lead with a clear statement of your final solution.

5. Complete each step of the “road” before you continue to the next one.

6. When making statements like “it follows trivially” or “it is easy to see”, listen for
quiet, nagging doubts. If you yourself aren’t 100% convinced, how will you convince
someone else? Even if it seems to follow trivially, check again. Small exceptions may
not be obvious. The strategy “I am sure it’s true, even if I don’t see it; if I state that
it’s obvious, maybe the grader will believe I know how to prove it” has occasionally
led its user to a score of 0 out of 10.

7. Organize your solution on the page; avoid writing in corners or perpendicular to the
normal orientation. Avoid, if possible, post-factum insertion (if you discover you’ve
missed something, rather than making a mess of the paper by trying to write it over,
start anew. You have the time!)

8. Before writing each phrase, formulate it completely in your mind. Make sure it
expresses an idea. Starting to write one thing, then changing course in mid-sentence
and saying another thing is a sure way to create confusion.

9. Be as clear as possible. Avoid, if possible, long-winded phrases. Use as many words


as you need – just do it clearly. Also, avoid acronyms.

10. If necessary, state intermediary results as “claims” or “lemmas” which you can prove
right after stating them. If you cannot prove one of these results, but can prove the
problem’s statement from it, state that you will assume it, then show the path from
it to the solution. You may get partial credit for it.

11. Rather than using vague statements like “and so on” or “repeating this process”,
formulate and prove by induction.

12. When you’re done writing up the solution, go back and re-read it. Put yourself in the
grader’s shoes: can someone else read your write-up and understand the solution?
Must one look for things in the corners? Are there “miraculous” moments? Etc.

8.1 Some problems to think about for week 8


1. Either find a function f : N → N with the property that f (f (n)) = n + 1 for all
n ∈ N, or prove that no such function exists.

2. Evaluate
$$\frac{1}{\pi \ln 2}\int_1^2 \frac{\arctan(1+x)}{x}\,dx.$$
Express your answer as a rational number. (Here arctan is defined in the standard way, so that for $0 \leq x < \infty$ we have $0 \leq \arctan x < \pi/2$.)

3. At some point early in the season, Chris Paul’s free throw percentage (number of successful free throws/number of free throw attempts) was below 80%. At the end of the season, it was above 80%. Must it have been exactly 80% at some moment? (Note that going in the other direction, the answer is an easy “no”: he could start the season with a successful free throw followed by an unsuccessful one, so his average would go from above 80% to below 80% without ever being exactly 80%.)

4. An integer is selected at random from the set {1, 2, . . . , 2020} (with each integer
having a chance 1/2020 of being selected). You are required to determine the correct
integer in an odd number of guesses. After each guess you are told whether the
actual integer is less than, equal to, or greater than your guess. You are not allowed
to guess an integer which has already been ruled out by the earlier answers. Show
that there is a strategy that allows you to determine the integer, in such a way that
there is a more than 2/3 chance of making an odd number of guesses.

5. Let S be the smallest set of positive integers such that

• $2 \in S$
• $n \in S$ whenever $n^2 \in S$
• $(n+5)^2 \in S$ whenever $n \in S$.

Which positive integers are not in S? (Here “smallest” means that any proper subset
of S fails at least one of the properties listed above).

6. Define a sequence $(x_n)_{n=0}^\infty$ recursively by $x_0 = 1$ and, for $n \geq 0$,
$$x_{n+1} = \log\left(e^{x_n} - x_n\right).$$

(Here “log” is log to the base e). Show that the infinite series

x0 + x1 + x2 + · · ·

converges, and find its sum.

7. Remember that a composite number is a positive integer n such that n = ab with


a, b both positive integers greater than 1 (and not necessarily distinct).
Show that every composite number is expressible as xy + xz + yz + 1, with x, y, and
z positive integers.

8. Let $N_n$ be the number of ordered $n$-tuples $(a_1, \ldots, a_n)$ of positive integers with the property that
$$\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} = 1.$$
Is $N_{10}$ odd or even?

8.2 Solutions to problems
1. Either find a function f : N → N with the property that f (f (n)) = n + 1 for all
n ∈ N, or prove that no such function exists.

Solution: We claim that no such function exists. Indeed, suppose that there was such a function. Let $f(0) = k$.
We claim that for each $j = 0, 1, 2, \ldots$, we have $f(j) = k + j$. We prove this by induction on $j$, with the base case $j = 0$ immediate. For $j > 0$, we have the induction hypothesis that $f(j-1) = k+j-1$. To force $f(f(j-1)) = j$ we require $f(k+j-1) = j$. But now $f(f(k+j-1)) = f(j)$, and also $f(f(k+j-1)) = k+j$, so $f(j) = k+j$, completing the induction. [34]
We have now fully described the function $f$, by $f(j) = k+j$, $j = 0, 1, 2, \ldots$, for some natural number $k$. But now $f(f(j)) = f(k+j) = 2k+j$ for all $j$. But also $f(f(j)) = j+1$ for all $j$. So $j+1 = 2k+j$, forcing $k = 1/2$, a contradiction.
(This was Question 3 on the 2018 Virginia Tech Regional Math Competition.)

[34] This may be easier to see going step-by-step: we must have $f(k) = 1$ (to force $f(f(0)) = 1$). But now $f(f(k)) = f(1)$, and also $f(f(k)) = k+1$, so $f(1) = k+1$. Now we must have $f(k+1) = 2$ (to force $f(f(1)) = 2$). But now $f(f(k+1)) = f(2)$, and also $f(f(k+1)) = k+2$, so $f(2) = k+2$; and so on.

2. Evaluate
$$\frac{1}{\pi \ln 2}\int_1^2 \frac{\arctan(1+x)}{x}\,dx.$$
Express your answer as a rational number. (Here arctan is defined in the standard way, so that for $0 \leq x < \infty$ we have $0 \leq \arctan x < \pi/2$.)

Solution: We claim that the answer is 3/8. Denote by $I$ the integral
$$\int_1^2 \frac{\arctan(1+x)}{x}\,dx.$$
Integrating by parts we get
$$\begin{aligned}
I &= \left[(\ln x)\arctan(1+x)\right]_1^2 - \int_1^2 \frac{(\ln x)\,dx}{1+(1+x)^2} \\
&= (\ln 2)\arctan(3) - \int_1^2 \frac{(\ln x)\,dx}{2+2x+x^2}.
\end{aligned}$$
Now denote by $J$ the integral
$$\int_1^2 \frac{(\ln x)\,dx}{2+2x+x^2}.$$
Make the substitution $x = 2/y$ to get
$$\begin{aligned}
J &= \int_2^1 \frac{-2(\ln 2 - \ln y)\,dy}{y^2(2 + 4/y + 4/y^2)} \\
&= \int_1^2 \frac{(\ln 2)\,dy}{1+(1+y)^2} - J,
\end{aligned}$$
so
$$\begin{aligned}
2J &= \int_1^2 \frac{(\ln 2)\,dy}{1+(1+y)^2} \\
&= (\ln 2)\left[\arctan(1+y)\right]_1^2 \\
&= (\ln 2)(\arctan 3 - \arctan 2).
\end{aligned}$$
It follows that
$$I = \frac{(\ln 2)(\arctan 3 + \arctan 2)}{2}.$$
Now
$$\tan(\arctan 3 + \arctan 2) = \frac{3+2}{1-6} = -1$$
(using the formula
$$\tan(A+B) = \frac{\tan A + \tan B}{1 - (\tan A)(\tan B)}),$$
so, since $\arctan 3 + \arctan 2$ lies between $\pi/2$ and $\pi$,
$$\arctan 3 + \arctan 2 = \frac{3\pi}{4}.$$
It follows that
$$I = \frac{3\pi \ln 2}{8},$$
so the answer to the question is 3/8, as claimed.
(This was Question 1 on the 2018 Virginia Tech Regional Math Competition; the solution given here is modified from the official solution [35].)
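When an integral is claimed to have a clean value, a numerical estimate is a cheap safeguard against sign or factor errors. The sketch below (Python; `simpson` is a hand-written helper, and the choice of 1000 subintervals is an arbitrary assumption) approximates the integral from this problem with composite Simpson's rule and divides by $\pi \ln 2$, so the result can be compared with $3/8 = 0.375$.

```python
import math

def simpson(f, a, b, intervals=1000):
    """Composite Simpson's rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

integral = simpson(lambda x: math.atan(1 + x) / x, 1.0, 2.0)
value = integral / (math.pi * math.log(2))   # to be compared with 3/8
```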

[35] https://intranet.math.vt.edu/people/plinnell/Vtregional/S18/index.html

3. At some point early in the season, Chris Paul’s free throw percentage (number of successful free throws/number of free throw attempts) was below 80%. At the end of the season, it was above 80%. Must it have been exactly 80% at some moment? (Note that going in the other direction, the answer is an easy “no”: he could start the season with a successful free throw followed by an unsuccessful one, so his average would go from above 80% to below 80% without ever being exactly 80%.)

Solution (from Kiran Kedlaya’s Putnam Competition archive [36]; this was Putnam A1 2004): Yes. To see this, let $S(N)$ be Paul’s number of successful free throws after $N$ attempts, and suppose that there is no $N$ with $S(N) = 0.8N$. Then there would be an $N_0$ such that $S(N_0) < 0.8N_0$ and $S(N_0+1) > 0.8(N_0+1)$; that is, Paul’s free throw percentage is under 80% at some point, and after one subsequent free throw (necessarily made), his percentage is over 80%. If he makes $m$ of his first $N_0$ free throws, then $m/N_0 < 4/5$ and $(m+1)/(N_0+1) > 4/5$. This means that $5m < 4N_0 < 5m+1$, which is impossible since then $4N_0$ is an integer strictly between the consecutive integers $5m$ and $5m+1$.

[36] https://kskedlaya.org/putnam-archive/

Remark: This same argument works for any fraction of the form $(n-1)/n$ for some integer $n > 1$, but not for any other real number between 0 and 1.
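The remark can be explored by exhaustive search over short seasons. In the sketch below (Python; the names `jumps_over` and `some_sequence_jumps`, and the season length 14, are assumptions made purely for illustration) a season is a tuple of 0s (misses) and 1s (makes), and the target fraction is $p/q$. All comparisons use integers, so there is no rounding issue.

```python
from itertools import product

def jumps_over(shots, p, q):
    """True if the success rate goes from below p/q to above p/q
    without equalling p/q at any attempt in between."""
    made = 0
    below_since_last_hit = False
    for attempts, success in enumerate(shots, start=1):
        made += success
        if q * made == p * attempts:
            below_since_last_hit = False      # rate is exactly p/q here
        elif q * made < p * attempts:
            below_since_last_hit = True       # rate is below p/q
        elif below_since_last_hit:
            return True                       # rate is above p/q, reached without hitting it
    return False

def some_sequence_jumps(p, q, length=14):
    """Check every make/miss sequence of the given length."""
    return any(jumps_over(s, p, q) for s in product((0, 1), repeat=length))
```

With these definitions, `some_sequence_jumps(4, 5)` searches for a counterexample to the 80% statement among all 14-shot seasons, while `some_sequence_jumps(3, 5)` tests a fraction not of the form $(n-1)/n$ (miss, make, make already jumps from 50% to about 67%).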

4. An integer is selected at random from the set {1, 2, . . . , 2020} (with each integer
having a chance 1/2020 of being selected). You are required to determine the correct
integer in an odd number of guesses. After each guess you are told whether the
actual integer is less than, equal to, or greater than your guess. You are not allowed
to guess an integer which has already been ruled out by the earlier answers. Show
that there is a strategy that allows you to determine the integer, in such a way that
there is a more than 2/3 chance of making an odd number of guesses.

Solution (from Kiran Kedlaya’s Putnam Competition archive; this was Putnam
B4, 2002): Use the following strategy: guess 1, 3, 4, 6, 7, 9, . . . , 3k, 3k + 1, . . . until
the target number n is revealed to be equal to or lower than one of these guesses.
If n ≡ 1 (mod 3), it will be guessed on an odd turn. If n ≡ 0 (mod 3), it will be
guessed on an even turn. If n ≡ 2 (mod 3), then n + 1 will be guessed on an even
turn, forcing a guess of n on the next turn. Thus the probability of success with this
strategy is 1347/2020 > 2/3.
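The count 1347 can be confirmed by simulating the strategy on every possible target. This is a minimal sketch in Python; the function name `guesses_needed` is illustrative, and the code assumes, as in the solution, that once a guess $3k$ is revealed to be too high, the target $3k-1$ is forced and is guessed on the very next turn.

```python
def guesses_needed(target):
    """Number of guesses used by the strategy 1, 3, 4, 6, 7, 9, ... to hit `target`."""
    turn = 0
    guess = 1
    step = 2                       # gaps between guesses alternate 2, 1, 2, 1, ...
    while True:
        turn += 1
        if guess == target:
            return turn
        if guess > target:
            # Only happens when guess = 3k and target = 3k - 1: one more (forced) guess.
            return turn + 1
        guess += step
        step = 3 - step

odd_count = sum(guesses_needed(t) % 2 == 1 for t in range(1, 2021))
```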

5. Let S be the smallest set of positive integers such that

• $2 \in S$
• $n \in S$ whenever $n^2 \in S$
• $(n+5)^2 \in S$ whenever $n \in S$.

Which positive integers are not in S? (Here “smallest” means that any proper subset
of S fails at least one of the properties listed above).

Solution (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam A1, 2017): We denote the condition $2 \in S$ by (a); the condition $n \in S$ whenever $n^2 \in S$ by (b), and the condition $(n+5)^2 \in S$ whenever $n \in S$ by (c).
We claim that the positive integers not in $S$ are 1 and all multiples of 5.
If $S$ consists of all other natural numbers, then $S$ satisfies the given conditions: note that the only perfect squares not in $S$ are 1 and numbers of the form $(5k)^2$ for some positive integer $k$, and it readily follows that both (b) and (c) hold (that (a) holds for this set is trivial).
Now suppose that $T$ is another set of positive integers satisfying (a), (b), and (c). Note from (b) and (c) that if $n \in T$ then $n + 5 \in T$, and so $T$ satisfies the following property:

(d) if $n \in T$, then $n + 5k \in T$ for all $k \geq 0$.

The following must then be in $T$, with implications labeled by conditions (b) through (d):
$$2 \overset{c}{\Rightarrow} 49 \overset{c}{\Rightarrow} 54^2 \overset{d}{\Rightarrow} 56^2 \overset{b}{\Rightarrow} 56 \overset{d}{\Rightarrow} 121 \overset{b}{\Rightarrow} 11$$
$$11 \overset{d}{\Rightarrow} 16 \overset{b}{\Rightarrow} 4 \overset{d}{\Rightarrow} 9 \overset{b}{\Rightarrow} 3$$
$$16 \overset{d}{\Rightarrow} 36 \overset{b}{\Rightarrow} 6$$
Since $2, 3, 4, 6 \in T$, by (d) $S \subseteq T$, and so $S$ is the smallest set.

6. Define a sequence $(x_n)_{n=0}^\infty$ recursively by $x_0 = 1$ and, for $n \geq 0$,
$$x_{n+1} = \log\left(e^{x_n} - x_n\right).$$
(Here “log” is log to the base $e$). Show that the infinite series
$$x_0 + x_1 + x_2 + \cdots$$
converges, and find its sum.

Solution (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam B1, 2016): Note that the function $e^x - x$ is strictly increasing for $x > 0$: its derivative is $e^x - 1$, which is positive for $x > 0$ because $e^x$ is strictly increasing and takes value 1 at 0. Also, the value of $e^x - x$ at 0 is 1. By induction on $n$, it follows that $x_n > 0$ for all $n$.
By exponentiating the equation defining $x_{n+1}$, we obtain the expression
$$x_n = e^{x_n} - e^{x_{n+1}}.$$
We use this equation repeatedly to acquire increasingly precise information about the sequence $\{x_n\}$.

• Since $x_n > 0$, we have $e^{x_n} > e^{x_{n+1}}$, so $x_n > x_{n+1}$.
• Since the sequence $\{x_n\}$ is decreasing and bounded below by 0, it converges to some limit $L$.
• Taking limits as $n \to \infty$ in the equation $x_n = e^{x_n} - e^{x_{n+1}}$ (and using that the exponential function is continuous) yields $L = e^L - e^L$, whence $L = 0$.
• Since $L = 0$, the sequence $\{e^{x_n}\}$ converges to 1.

Now note that, applying $x_n = e^{x_n} - e^{x_{n+1}}$ repeatedly, we get a telescoping sum for the partial sum $x_0 + \cdots + x_n$:
$$\begin{aligned}
x_0 + \cdots + x_n &= (e^{x_0} - e^{x_1}) + \cdots + (e^{x_n} - e^{x_{n+1}}) \\
&= e^{x_0} - e^{x_{n+1}} = e - e^{x_{n+1}}.
\end{aligned}$$
By taking limits as $n$ goes to infinity, we see that the sum $x_0 + x_1 + \cdots$ converges to the value $e - 1$.
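A few lines of floating-point arithmetic make the answer plausible before any proof is attempted. The sketch below (Python; `partial_sum` and the choice of 50 terms are illustrative assumptions) adds up the first terms of the sequence so the total can be compared with $e - 1 \approx 1.71828$.

```python
import math

def partial_sum(terms):
    """x_0 + ... + x_{terms-1} for x_0 = 1, x_{n+1} = log(exp(x_n) - x_n)."""
    x, total = 1.0, 0.0
    for _ in range(terms):
        total += x
        x = math.log(math.exp(x) - x)
    return total

approx = partial_sum(50)          # to be compared with math.e - 1
```

The terms shrink very fast (roughly $x_{n+1} \approx x_n^2/2$ once $x_n$ is small), so a handful of terms already agrees with $e - 1$ to many digits; beyond that point the computed terms are dominated by rounding error and contribute essentially nothing.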
7. Remember that a composite number is a positive integer n such that n = ab with
a, b both positive integers greater than 1 (and not necessarily distinct).
Show that every composite number is expressible as xy + xz + yz + 1, with x, y, and
z positive integers.
Solution (from John Scholes’ collection of Putnam solutions [37]; this is Putnam B1,
1988): If n is composite then n = ab with a, b ≥ 2, and a, b both integers. But
ab = 1(a − 1) + 1(b − 1) + (a − 1)(b − 1) + 1,
so we may take x = 1, y = a − 1 and z = b − 1, all of which are positive integers, to
get a representation of n as n = xy + xz + yz + 1, with x, y, and z positive integers.
[37] https://prase.cz/kalva/putnam.html

8. Let $N_n$ be the number of ordered $n$-tuples $(a_1, \ldots, a_n)$ of positive integers with the property that
$$\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} = 1.$$
Is $N_{10}$ odd or even?
Solution: (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam A5, 1997) There are an even number of solutions (in positive integers) to
$$\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_{10}} = 1$$
with $a_1 \neq a_2$ (these solutions can be grouped into disjoint pairs, with $(a_1, a_2, a_3, \ldots, a_{10})$ being paired with $(a_2, a_1, a_3, \ldots, a_{10})$). So we need only find the parity of the number of solutions with $a_1 = a_2$. Similarly, we can reduce to considering those 10-tuples with $a_3 = a_4$, $a_5 = a_6$, $a_7 = a_8$, $a_9 = a_{10}$.
We have reduced to considering solutions (in positive integers) to
$$\frac{2}{a_1} + \frac{2}{a_3} + \frac{2}{a_5} + \frac{2}{a_7} + \frac{2}{a_9} = 1.$$
As before, we need only consider the parity of the number of solutions with $a_1 = a_3$ and $a_5 = a_7$, which reduces us to considering solutions (in positive integers) to
$$\frac{4}{a_1} + \frac{4}{a_5} + \frac{2}{a_9} = 1.$$
We need only consider the parity of the number of solutions with $a_1 = a_5$, which reduces us to considering solutions (in positive integers) to
$$\frac{8}{a_1} + \frac{2}{a_9} = 1.$$
This equation is equivalent to
$$(a_1 - 8)(a_9 - 2) = 16$$
which, by inspection, has 5 solutions in positive integers (specifically: $(a_1, a_9) = (9, 18)$ or $(10, 10)$ or $(12, 6)$ or $(16, 4)$ or $(24, 3)$). It follows that $N_{10}$ is odd.
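The final "by inspection" step is small enough to verify by machine. The sketch below (Python; `final_solutions` is an illustrative name) lists all positive integer pairs with $8/a_1 + 2/a_9 = 1$, written in the cleared-denominator form $8a_9 + 2a_1 = a_1a_9$. The search ranges use the factorization $(a_1-8)(a_9-2) = 16$ from the solution, which forces $9 \leq a_1 \leq 24$ and $3 \leq a_9 \leq 18$.

```python
def final_solutions():
    """All positive integer pairs (a1, a9) with 8/a1 + 2/a9 = 1."""
    found = []
    for a1 in range(9, 25):            # a1 - 8 is a positive divisor of 16
        for a9 in range(3, 19):        # a9 - 2 is a positive divisor of 16
            if 8 * a9 + 2 * a1 == a1 * a9:
                found.append((a1, a9))
    return found

pairs = final_solutions()
parity = len(pairs) % 2                # 1 means the count is odd
```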

9 Week 8 (September 29)
No handout this week.

10 Week 9 (October 6) — Inequalities
Many Putnam problems involve showing that a particular inequality between two expressions holds always, or holds under certain circumstances. There is a huge variety of general inequalities between sets of numbers satisfying certain conditions, that are quite reasonable for you to quote as “well-known”. Many of these are also useful to know about for other, more serious, mathematical applications. I’ve listed some of them here, mostly without proofs, with stars next to the most important ones. If you are interested in knowing more about inequalities, consider looking at the lovely book The Cauchy-Schwarz Master Class by Steele (readily available online), or the classic Inequalities by Hardy, Littlewood and Pólya (QA 303 .H223i at the library).

Squares are positive ⋆

Surprisingly many inequalities reduce to the obvious fact that $x^2 \geq 0$ for all real $x$, with
equality iff $x = 0$. I’ll highlight one example in what follows.

The triangle inequality


For real or complex $x$ and $y$, $|x+y| \leq |x|+|y|$. More generally, if $x, y$ are vectors in $\mathbb{R}^n$, with
$x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$, and $\|\cdot\|$ is the Euclidean norm ($\|x\| = \sqrt{x_1^2 + \cdots + x_n^2}$),
then $\|x + y\| \leq \|x\| + \|y\|$.
To see why this is called the triangle inequality, consider three points $u, v, w$ in $\mathbb{R}^n$.
The straight-line distance from $u$ to $v$ is $\|u - v\|$, from $u$ to $w$ is $\|u - w\|$ and from $w$ to $v$
is $\|w - v\|$. Since $u - v = (u - w) + (w - v)$, the inequality says
\[ \|u - v\| \leq \|u - w\| + \|w - v\|. \]
So to get from $u$ to $v$ it is at least as efficient to go directly (along one side of the triangle
$uvw$) as it is to go via the point $w$ (along the other two sides of the triangle).

AM–GM–HM inequality ⋆
For positive $a_1, \ldots, a_n$,
\[ \frac{n}{\frac{1}{a_1} + \cdots + \frac{1}{a_n}} \leq \sqrt[n]{a_1 \cdots a_n} \leq \frac{a_1 + \cdots + a_n}{n} \]
with equalities in both inequalities iff all $a_i$ are equal. The three expressions above are the
harmonic mean, the geometric mean and the arithmetic mean of the $a_i$.

For $n = 2$, here’s a proof of the second inequality: $\sqrt{a_1 a_2} \leq (a_1 + a_2)/2$ iff $4a_1 a_2 \leq
(a_1 + a_2)^2$ iff $a_1^2 - 2a_1 a_2 + a_2^2 \geq 0$ iff $(a_1 - a_2)^2 \geq 0$, which is true by the “squares are
positive” inequality; there’s equality all along iff $a_1 = a_2$.

For $n = 2$ the first inequality is also equivalent to $\sqrt{a_1 a_2} \leq (a_1 + a_2)/2$.

Power means inequality
For a non-zero real $r$ and positive $a_1, \ldots, a_n$ define
\[ M^r(a_1, \ldots, a_n) = \left(\frac{a_1^r + \cdots + a_n^r}{n}\right)^{1/r}, \]
and set $M^0(a_1, \ldots, a_n) = \sqrt[n]{a_1 \cdots a_n}$. For real numbers $r < s$,
\[ M^r(a_1, \ldots, a_n) \leq M^s(a_1, \ldots, a_n) \]
with equality iff all $a_i$ are equal.

Notice that $M^{-1}(a_1, \ldots, a_n)$ is the harmonic mean of the $a_i$’s, $M^0(a_1, \ldots, a_n)$ is their
geometric mean and $M^1(a_1, \ldots, a_n)$ is their arithmetic mean, so this inequality generalizes
the AM–GM–HM inequality.
There is a weighted power means inequality: let $w_1, \ldots, w_n$ be positive reals that add
to 1, and define
\[ M_w^r(a_1, \ldots, a_n) = (w_1 a_1^r + \cdots + w_n a_n^r)^{1/r} \]
for non-zero real $r$, with $M_w^0(a_1, \ldots, a_n) = a_1^{w_1} \cdots a_n^{w_n}$. For real numbers $r < s$,
\[ M_w^r(a_1, \ldots, a_n) \leq M_w^s(a_1, \ldots, a_n). \]
(This reduces to the power means inequality when all $w_i = 1/n$.)
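
To get a feel for how $M^r$ moves with $r$, here is a small numerical sketch (an illustration only; the helper name `power_mean` and the sample numbers are arbitrary choices, not from the notes). It evaluates $M^r$ for a few exponents, treating $r = 0$ as the geometric mean, so that the chain HM ≤ GM ≤ AM shows up as three entries of an increasing list.

```python
import math


def power_mean(values, r):
    """Power mean M^r of positive numbers; r = 0 is the geometric mean."""
    n = len(values)
    if r == 0:
        return math.exp(sum(math.log(a) for a in values) / n)
    return (sum(a ** r for a in values) / n) ** (1.0 / r)


if __name__ == "__main__":
    sample = [1.0, 2.0, 4.0, 8.0]        # arbitrary positive numbers
    for r in [-2, -1, 0, 0.5, 1, 2, 3]:  # increasing exponents
        print(r, power_mean(sample, r))
```

A numerical check like this proves nothing, but it is a quick way to catch a misremembered direction of an inequality before quoting it.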

Cauchy-Schwarz-Bunyakovsky inequality ⋆
Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ be real numbers. We have
\[ (x_1 y_1 + \cdots + x_n y_n)^2 \leq \left(x_1^2 + \cdots + x_n^2\right)\left(y_1^2 + \cdots + y_n^2\right). \]
Equality holds if one of the sequences $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$ is identically zero. If neither
is identically zero, then there is equality iff there is some real number $t_0$ such that
$x_i = t_0 y_i$ for each $i$.
Here’s a quick proof: If either sequence is identically 0, both sides are zero. So assume
that neither is identically 0. For any real $t$ we have
\[ \sum_{i=1}^n (x_i - t y_i)^2 \geq 0. \]
But also,
\[ \sum_{i=1}^n (x_i - t y_i)^2 = \sum_{i=1}^n x_i^2 - 2t \sum_{i=1}^n x_i y_i + t^2 \sum_{i=1}^n y_i^2, \]
so for all real $t$,
\[ \sum_{i=1}^n x_i^2 - 2t \sum_{i=1}^n x_i y_i + t^2 \sum_{i=1}^n y_i^2 \geq 0. \]

This means that, viewed as a polynomial in $t$, the expression above must have either
complex roots or a repeated real root, i.e., that
\[ \left(2 \sum_{i=1}^n x_i y_i\right)^2 \leq 4 \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right), \]
which is exactly the inequality we wanted. (Notice the key point: squares are positive!)
If the inequality is an equality, then the polynomial has a repeated root, which means
there is some real $t_0$ at which the polynomial evaluates to 0. But the polynomial at this
point is equal to $\sum_{i=1}^n (x_i - t_0 y_i)^2$, and the only way this can happen is if each $x_i - t_0 y_i$ is 0,
as claimed.
This is really a very general inequality: if you are familiar with inner products from
linear algebra, the Cauchy-Schwarz-Bunyakovsky inequality really says that if $x, y$ are
vectors in an inner product space over the reals then
\[ |\langle x, y \rangle|^2 \leq \langle x, x \rangle \langle y, y \rangle. \]
Equivalently
\[ |\langle x, y \rangle| \leq \|x\| \, \|y\|. \]
There is equality iff $x$ and $y$ are linearly dependent.

Hölder’s inequality
Fix $p > 1$ and define $q$ by $1/p + 1/q = 1$. Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ be real numbers.
We have
\[ |x_1 y_1 + \cdots + x_n y_n| \leq \left(|x_1|^p + \cdots + |x_n|^p\right)^{1/p} \left(|y_1|^q + \cdots + |y_n|^q\right)^{1/q}. \]
Notice that Hölder becomes Cauchy-Schwarz-Bunyakovsky in the case $p = 2$.

Chebyshev’s sum inequality


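If $a_1 \geq \cdots \geq a_n$ and $b_1 \geq \cdots \geq b_n$ are sequences of reals, then
\[ \frac{a_1 b_1 + \cdots + a_n b_n}{n} \geq \left(\frac{a_1 + \cdots + a_n}{n}\right)\left(\frac{b_1 + \cdots + b_n}{n}\right). \]
The same holds if $a_1 \leq \cdots \leq a_n$ and $b_1 \leq \cdots \leq b_n$; if either $a_1 \geq \cdots \geq a_n$ and $b_1 \leq \cdots \leq b_n$
or $a_1 \leq \cdots \leq a_n$ and $b_1 \geq \cdots \geq b_n$, then
\[ \frac{a_1 b_1 + \cdots + a_n b_n}{n} \leq \left(\frac{a_1 + \cdots + a_n}{n}\right)\left(\frac{b_1 + \cdots + b_n}{n}\right). \]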

The rearrangement inequality
If $a_1 \leq \cdots \leq a_n$ and $b_1 \leq \cdots \leq b_n$ are sequences of reals, and $a_{\pi(1)}, \ldots, a_{\pi(n)}$ is a
permutation (rearrangement) of $a_1 \leq \cdots \leq a_n$, then
\[ a_n b_1 + \cdots + a_1 b_n \leq a_{\pi(1)} b_1 + \cdots + a_{\pi(n)} b_n \leq a_1 b_1 + \cdots + a_n b_n. \]
If $a_1 < \cdots < a_n$ and $b_1 < \cdots < b_n$, then there is equality in the first inequality iff $\pi$ is the
reverse permutation $\pi(i) = n + 1 - i$, and there is equality in the second inequality iff $\pi$ is
the identity permutation $\pi(i) = i$.
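
For small $n$ the rearrangement inequality can be checked exhaustively. The sketch below is an illustration under my own naming (`rearrangement_bounds` is not from the notes): it runs over every permutation of the $a_i$ and compares the smallest and largest sums found with the reverse-order and same-order sums.

```python
from itertools import permutations


def rearrangement_bounds(a, b):
    """Return (reverse_sum, min_sum, max_sum, sorted_sum) over all pairings."""
    a, b = sorted(a), sorted(b)
    # Sum of a_pi(i) * b_i for every rearrangement of a.
    sums = [sum(x * y for x, y in zip(p, b)) for p in permutations(a)]
    reverse_sum = sum(x * y for x, y in zip(reversed(a), b))
    sorted_sum = sum(x * y for x, y in zip(a, b))
    return reverse_sum, min(sums), max(sums), sorted_sum


if __name__ == "__main__":
    low, smallest, largest, high = rearrangement_bounds([1, 3, 4, 7], [2, 5, 6, 9])
    print(low == smallest, largest == high)
```

The search is factorial in $n$, so it is only a sanity check for short lists.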

Jensen’s inequality ⋆
A real function $f(x)$ is convex on the interval $[c, d]$ if for all $c \leq a < b \leq d$, the line segment
joining $(a, f(a))$ to $(b, f(b))$ lies entirely above the graph $y = f(x)$ on the interval $(a, b)$,
or equivalently, if for all $0 \leq t \leq 1$ we have
\[ f((1 - t)a + tb) \leq (1 - t)f(a) + tf(b). \]
If $f(x)$ is convex on the interval $[c, d]$, and $c \leq a_1 \leq \cdots \leq a_n \leq d$, then
\[ f\left(\frac{a_1 + \cdots + a_n}{n}\right) \leq \frac{f(a_1) + \cdots + f(a_n)}{n} \]
(note that when $n = 2$, this is just the $t = 1/2$ case of the definition of convexity).
We say that $f(x)$ is concave on $[c, d]$ if for all $c \leq a < b \leq d$, and for all $0 \leq t \leq 1$, we
have
\[ f((1 - t)a + tb) \geq (1 - t)f(a) + tf(b). \]
If $f(x)$ is concave on the interval $[c, d]$, and $c \leq a_1 \leq \cdots \leq a_n \leq d$, then
\[ f\left(\frac{a_1 + \cdots + a_n}{n}\right) \geq \frac{f(a_1) + \cdots + f(a_n)}{n}. \]
As an example, consider the convex function $f(x) = x^2$; for this function Jensen says
that
\[ \left(\frac{a_1 + \cdots + a_n}{n}\right)^2 \leq \frac{a_1^2 + \cdots + a_n^2}{n}, \]
which is equivalent to the power means inequality $M^1(a_1, \ldots, a_n) \leq M^2(a_1, \ldots, a_n)$; and
when $f(x) = -\ln x$ (with all $a_i$ positive) we get
\[ \sqrt[n]{a_1 \cdots a_n} \leq \frac{a_1 + \cdots + a_n}{n}, \]
the AM-GM inequality.
If $f$ is twice differentiable, there is an easy check for convexity/concavity: $f$ is convex
on intervals where the second derivative is positive, and concave on intervals where the
second derivative is negative. If $f$ is just once differentiable, there is a similar test: $f$ is
convex on intervals where the first derivative is increasing, and concave on intervals where
the first derivative is decreasing.

Four miscellaneous comments
1. Maximization/minimization problems are often problems about inequalities in dis-
guise. For example, to find the minimum of f (a, b) as (a, b) ranges over a set R,
it is enough to first guess that the minimum is m, then find an (a, b) ∈ R with
f (a, b) = m, and then use inequalities to show that f (a, b) ≥ m for all (a, b) ∈ R.
2. If an expression is presented as a sum of n squares, it is sometimes helpful to think
of it as the (square of the) distance between two points in n dimensional space, and
then think of the problem geometrically.
3. Sometimes a little calculus is all that is needed. For example, here is a very useful
inequality:
\[ 1 + x \leq e^x \quad \text{for all } x \in \mathbb{R}. \]
To prove this for $x \geq 0$ note that both sides are equal at $x = 0$, and the derivative
of $1 + x$, which is 1, is no larger than the derivative of $e^x$, which is $e^x$, for all $x \geq 0$;
so the two sides start together but the right-hand side always grows at least as fast
as the left-hand side, so the right-hand side is always at least as big. A similar argument
proves the inequality for $x \leq 0$: $1 + x$, with derivative 1, falls faster as we move
along the $x$-axis negatively away from 0, than does $e^x$, which has derivative positive
but strictly less than 1 for $x < 0$. (To formalize this second half of the argument,
consider $f(y) = 1 - y$ and $g(y) = e^{-y}$, defined for $y \geq 0$. We have $f(0) = g(0)$, and
$f'(y) = -1 \leq -e^{-y} = g'(y)$ for $y \geq 0$, so $f(y) \leq g(y)$ for $y \geq 0$. Setting $y = -x$, it
follows that for $x \leq 0$, $1 + x \leq e^x$.)
4. If $f(x)$ is a positive, increasing function on $(0, \infty)$, then by considering Riemann
sums we have
\[ \int_0^n f(x)\,dx \leq \sum_{j=1}^n f(j) \leq \int_1^{n+1} f(x)\,dx \]
(assuming the left-hand integral converges). For example, consider $f(x) = x^k$ for
$k > 0$. We have
\[ \int_0^n x^k\,dx = \frac{n^{k+1}}{k+1} \]
and
\[ \int_1^{n+1} x^k\,dx = \frac{(n+1)^{k+1}}{k+1} - \frac{1}{k+1}. \]
It is easy to check that
\[ \left(\frac{(n+1)^{k+1}}{k+1} - \frac{1}{k+1}\right) \Big/ \left(\frac{n^{k+1}}{k+1}\right) \to 1 \quad \text{as } n \to \infty, \]
so we have a quick proof that for each fixed $k > 0$ (not necessarily an integer)
\[ \lim_{n \to \infty} \frac{1^k + \cdots + n^k}{n^{k+1}} = \frac{1}{k+1}; \]
in other words, the sum of the first $n$ perfect $k$th powers grows like $n^{k+1}/(k+1)$.
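
The integral sandwich in comment 4 is easy to test numerically. The following sketch (function names `power_sum_ratio` and `sandwich_holds` are hypothetical, chosen for this illustration) computes the ratio $(1^k + \cdots + n^k)/n^{k+1}$ and checks the two integral bounds for a non-integer exponent.

```python
def power_sum_ratio(n, k):
    """Return (1^k + ... + n^k) / n^(k+1)."""
    return sum(j ** k for j in range(1, n + 1)) / n ** (k + 1)


def sandwich_holds(n, k):
    """Check integral_0^n x^k dx <= sum_{j<=n} j^k <= integral_1^{n+1} x^k dx."""
    total = sum(j ** k for j in range(1, n + 1))
    lower = n ** (k + 1) / (k + 1)
    upper = ((n + 1) ** (k + 1) - 1) / (k + 1)
    return lower <= total <= upper


if __name__ == "__main__":
    k = 0.5  # any fixed k > 0 should behave the same way
    for n in [10, 1000, 100000]:
        print(n, power_sum_ratio(n, k), "target", 1 / (k + 1))
```

The printed ratios should drift toward $1/(k+1)$ as $n$ grows, which is the limit statement above.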

Some warm-up problems
You should find that these are all fairly easy to prove by direct applications of an appropriate
inequality from the list above.

1. $n! < \left(\dfrac{n+1}{2}\right)^n$ for $n = 2, 3, 4, \ldots$.

2. $\sqrt{3(a + b + c)} \geq \sqrt{a} + \sqrt{b} + \sqrt{c}$ for positive $a, b, c$.

3. Minimize $x_1 + \cdots + x_n$ subject to $x_i \geq 0$ and $x_1 \cdots x_n = 1$.

4. Minimize
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \]
subject to $x, y, z \geq 0$ and $xyz = 1$.

5. If a triangle has side lengths $a, b, c$ and opposite angles (measured in radians) $A, B, C$,
then
\[ \frac{aA + bB + cC}{a + b + c} \geq \frac{\pi}{3}. \]

6. Identify which is bigger:
\[ 2019!^{(2020)} \quad \text{or} \quad 2020!^{(2019)}. \]
(Here $n!^{(k)}$ indicates iterating the factorial function $k$ times, so for example $4!^{(2)} = 24!$.)

7. Identify which is bigger:
\[ 2020^{2020} \quad \text{or} \quad 2021^{2019}. \]

8. Minimize
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \]
on the interval $0 < x < \pi/2$.

10.1 Problems to think about for week 10
1. For positive integers $m, n$, show
\[ \frac{(m+n)!}{(m+n)^{m+n}} < \frac{m!}{m^m} \frac{n!}{n^n}. \]

2. Let $T$ be an acute triangle. Inscribe a rectangle $R$ in $T$ with one side along a side of
$T$. Then inscribe a rectangle $S$ in the triangle formed by the side of $R$ opposite the
side on the boundary of $T$, and the other two sides of $T$, with one side along the
side of $R$. For any polygon $X$, let $A(X)$ denote the area of $X$. Find the maximum
value, or show that no maximum exists, of
\[ \frac{A(R) + A(S)}{A(T)}, \]
where $T$ ranges over all triangles and $R, S$ over all rectangles as above.

3. Minimize
\[ (u - v)^2 + \left(\sqrt{2 - u^2} - \frac{9}{v}\right)^2 \]
in the range $0 \leq u \leq \sqrt{2}$, $v > 0$.

4. Show that for non-negative reals $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$,
\[ (a_1 \cdots a_n)^{1/n} + (b_1 \cdots b_n)^{1/n} \leq ((a_1 + b_1) \cdots (a_n + b_n))^{1/n}. \]

5. Show that for every positive integer $n$,
\[ \left(\frac{2n-1}{e}\right)^{\frac{2n-1}{2}} \leq 1 \cdot 3 \cdot 5 \cdots (2n-1) < \left(\frac{2n+1}{e}\right)^{\frac{2n+1}{2}}. \]

6. Suppose that $f(x)$ is a polynomial with all real coefficients, satisfying $f(x) + f'(x) > 0$
for all $x$. Show that $f(x) > 0$ for all $x$.

7. Given that $\{x_1, \ldots, x_n\} = \{1, \ldots, n\}$ (i.e., the numbers $x_1, \ldots, x_n$ are 1 through $n$
in some order), find (with proof!) the maximum value of
\[ x_1 x_2 + x_2 x_3 + \cdots + x_{n-1} x_n + x_n x_1. \]

8. Let $(x_k)_{k=1}^{\infty}$ and $(y_k)_{k=1}^{\infty}$ be sequences satisfying
\[ y_1 \geq y_2 \geq y_3 \geq \cdots \]
and, for each $k \geq 1$,
\[ x_1 x_2 \cdots x_k \geq y_1 y_2 \cdots y_k. \]
Show that for each $k \geq 1$,
\[ x_1 + x_2 + \cdots + x_k \geq y_1 + y_2 + \cdots + y_k. \]

Solutions to warm-up problems
All of these problems were taken from a Northwestern Putnam prep problem set.

1. $n! < \left(\dfrac{n+1}{2}\right)^n$ for $n = 2, 3, 4, \ldots$.

Solution: Use the AM–GM inequality, with $(a_1, \ldots, a_n) = (1, \ldots, n)$; the inequality is
strict because these numbers are not all equal when $n \geq 2$.

2. $\sqrt{3(a + b + c)} \geq \sqrt{a} + \sqrt{b} + \sqrt{c}$ for positive $a, b, c$.

Solution: Use the power means inequality, with $(a_1, a_2, a_3) = (a, b, c)$ and $r =
1/2$, $s = 1$.

3. Minimize $x_1 + \cdots + x_n$ subject to $x_i \geq 0$ and $x_1 \cdots x_n = 1$.

Solution: Guess: the minimum is $n$, achieved when all $x_i = 1$. Then use the AM–GM
inequality to show
\[ \frac{x_1 + \cdots + x_n}{n} \geq \sqrt[n]{x_1 \cdots x_n} = 1 \]
for positive $x_i$ satisfying $x_1 \cdots x_n = 1$.

4. Minimize
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \]
subject to $x, y, z \geq 0$ and $xyz = 1$.

Solution: Apply Cauchy-Schwarz with the vectors $\left(\sqrt{y+z}, \sqrt{z+x}, \sqrt{x+y}\right)$ and
\[ \left(\frac{x}{\sqrt{y+z}}, \frac{y}{\sqrt{z+x}}, \frac{z}{\sqrt{x+y}}\right) \]
to get
\[ (x + y + z)^2 \leq \left(\frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y}\right) 2(x + y + z), \]
leading to
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \geq \frac{x+y+z}{2}. \]
By the AM–GM inequality,
\[ \frac{x+y+z}{3} \geq \sqrt[3]{xyz} = 1, \]
so
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \geq \frac{3}{2}. \]
This lower bound can be achieved by taking $x = y = z = 1$, so the minimum is $3/2$.

5. If a triangle has side lengths $a, b, c$ and opposite angles (measured in radians) $A, B, C$,
then
\[ \frac{aA + bB + cC}{a + b + c} \geq \frac{\pi}{3}. \]

Solution: Assume, without loss of generality, that $a \leq b \leq c$. Then also $A \leq B \leq C$,
so by the Chebyshev inequality,
\[ \frac{aA + bB + cC}{3} \geq \left(\frac{a+b+c}{3}\right)\left(\frac{A+B+C}{3}\right) = \left(\frac{a+b+c}{3}\right)\frac{\pi}{3}, \]
from which the result follows.

6. Identify which is bigger:
\[ 2019!^{(2020)} \quad \text{or} \quad 2020!^{(2019)}. \]
(Here $n!^{(k)}$ indicates iterating the factorial function $k$ times, so for example $4!^{(2)} = 24!$.)

Solution: For $n \geq 1$, $n!$ is increasing in $n$ ($1 \leq n < m$ implies $n! < m!$). So, starting
from the easy
\[ 2019! > 2020, \]
apply the factorial function 2019 more times to get
\[ 2019!^{(2020)} > 2020!^{(2019)}. \]

7. Identify which is bigger:
\[ 2020^{2020} \quad \text{or} \quad 2021^{2019}. \]

Solution: Consider $f(x) = (2020 - x) \ln(2020 + x)$. We have $e^{f(0)} = 2020^{2020}$ and
$e^{f(1)} = 2021^{2019}$, so we want to see what $f$ does on the interval $[0, 1]$: increase or
decrease? The derivative is
\[ f'(x) = -\ln(2020 + x) + \frac{2020 - x}{2020 + x}, \]
which is negative on $[0, 1]$ (since, for example,
\[ \frac{2020 - x}{2020 + x} \leq 1 = \ln e < \ln(2020 + x) \]
on that interval). So $f$ is decreasing on $[0, 1]$, giving $f(0) > f(1)$ and hence
\[ 2020^{2020} > 2021^{2019}. \]

8. Minimize
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \]
on the interval $0 < x < \pi/2$.

Solution: We can use the rearrangement inequality on the pairs $(\sin^3 x, \cos^3 x)$
(which satisfies $\sin^3 x \leq \cos^3 x$ on $(0, \pi/4]$, and $\sin^3 x \geq \cos^3 x$ on $[\pi/4, \pi/2)$), and
$(1/\cos x, 1/\sin x)$ (which also satisfies $1/\cos x \leq 1/\sin x$ on $(0, \pi/4]$, and $1/\cos x \geq
1/\sin x$ on $[\pi/4, \pi/2)$), to get
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \geq \frac{\sin^3 x}{\sin x} + \frac{\cos^3 x}{\cos x} = \sin^2 x + \cos^2 x = 1 \]
on the whole interval. Since 1 can be achieved (at $x = \pi/4$) the minimum is 1.
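
A few of the warm-up answers above can be sanity-checked by direct computation. The sketch below is an illustration only (the function names are made up for this purpose): it tests warm-up 1 in exact integer arithmetic, compares the two numbers of warm-up 7 using Python's arbitrary-precision integers, and evaluates the expression of warm-up 8 on a grid.

```python
import math


def warmup1_holds(n):
    """n! < ((n+1)/2)^n, rewritten as 2^n * n! < (n+1)^n to stay in integers."""
    return 2 ** n * math.factorial(n) < (n + 1) ** n


def warmup7_larger():
    """Compare 2020^2020 with 2021^2019 exactly, using big integers."""
    return "2020^2020" if 2020 ** 2020 > 2021 ** 2019 else "2021^2019"


def warmup8_value(x):
    """sin^3(x)/cos(x) + cos^3(x)/sin(x) for 0 < x < pi/2."""
    return math.sin(x) ** 3 / math.cos(x) + math.cos(x) ** 3 / math.sin(x)


if __name__ == "__main__":
    print(all(warmup1_holds(n) for n in range(2, 200)))
    print(warmup7_larger())
    print(min(warmup8_value(0.01 * i) for i in range(1, 157)))
```

None of this replaces the proofs, but an exact big-integer comparison is a cheap guard against getting the direction of an inequality backwards.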

10.2 Solutions to problems on inequalities


1. For positive integers $m, n$, show
\[ \frac{(m+n)!}{(m+n)^{m+n}} < \frac{m!}{m^m} \frac{n!}{n^n}. \]

Solution: This was from the 2004 Putnam competition, Problem B2.

Rearranging, this is the same as
\[ \binom{m+n}{m} \left(\frac{m}{m+n}\right)^m \left(\frac{n}{m+n}\right)^n < 1. \]
This suggests looking at the binomial expansion of
\[ \left(\frac{m}{m+n} + \frac{n}{m+n}\right)^{m+n}. \]
The whole binomial expansion sums to 1; one term of the expansion is
\[ \binom{m+n}{m} \left(\frac{m}{m+n}\right)^m \left(\frac{n}{m+n}\right)^n. \]
Since all terms are strictly positive, we get the required inequality.

2. Let $T$ be an acute triangle. Inscribe a rectangle $R$ in $T$ with one side along a side of
$T$. Then inscribe a rectangle $S$ in the triangle formed by the side of $R$ opposite the
side on the boundary of $T$, and the other two sides of $T$, with one side along the
side of $R$. For any polygon $X$, let $A(X)$ denote the area of $X$. Find the maximum
value, or show that no maximum exists, of
\[ \frac{A(R) + A(S)}{A(T)}, \]

where T ranges over all triangles and R, S over all rectangles as above.

Solution: This problem was on the 1985 Putnam Competition, Problem A2.

We claim that the answer is 2/3. Here’s a picture to assist in visualizing the problem:

Assume, without loss of generality, that the horizontal base of $T$ has length 1. Let
the base of $R$ have length $x$, and the base of $S$ have length $y$, where $0 < y < x < 1$.
By similar triangles, if $T$ has height $h$ then $R$ has height $h(1 - x)$ and $S$ has height
$h(x - y)$. Since $A(T) = h/2$, we have
\[ \frac{A(S)}{A(T)} = 2y(x - y) \]
and
\[ \frac{A(R)}{A(T)} = 2x(1 - x), \]
so the quantity we want to maximize is
\[ 2y(x - y) + 2x(1 - x), \]
subject to the constraint $0 < y < x < 1$.

For each fixed $x$, as $y$ varies over $0 < y < x$ the quantity $2y(x - y) + 2x(1 - x)$
achieves its maximum at $y = x/2$, where it takes value $x^2/2 + 2x(1 - x) = (4x - 3x^2)/2$.
This achieves its maximum over $0 < x < 1$ at $x = 2/3$, where it takes the value $2/3$.

3. Minimize
\[ (u - v)^2 + \left(\sqrt{2 - u^2} - \frac{9}{v}\right)^2 \]
in the range $0 \leq u \leq \sqrt{2}$, $v > 0$.

Solution: This was on the Putnam competition 1984 problem B2.

The expression to be minimized is the (square of the) distance between a point of
the form $(u, \sqrt{2 - u^2})$ on $0 < u < \sqrt{2}$, and a point of the form $(v, 9/v)$ on $v > 0$;
in other words, we are looking for the (square of the) distance between the circle
$x^2 + y^2 = 2$ in the first quadrant and the hyperbola $xy = 9$ in the same quadrant. By
symmetry, it strongly seems that the two closest points are $(3, 3)$ on the hyperbola
and $(1, 1)$ on the circle (squared distance 8). To prove that this is the minimum,
note that the tangent lines to the two curves at those two points are parallel, that
the distance between them at these points is the perpendicular distance between
the two tangent lines, and that the hyperbola (in the first quadrant) lies completely
above its tangent line, while the circle (in the first quadrant) lies completely below
its tangent line; so the distance between any other two points is at least the distance
between the two tangent lines.

4. Show that for non-negative reals $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$,
\[ (a_1 \cdots a_n)^{1/n} + (b_1 \cdots b_n)^{1/n} \leq ((a_1 + b_1) \cdots (a_n + b_n))^{1/n}. \]

Solution: This was from the 2003 Putnam competition, problem A2.

If any $a_i$ is 0, the result is trivial, so we may assume all $a_i > 0$. Dividing through by
$(a_1 \cdots a_n)^{1/n}$ and writing $c_i = b_i/a_i$, the inequality becomes
\[ 1 + (c_1 \cdots c_n)^{1/n} \leq ((1 + c_1) \cdots (1 + c_n))^{1/n} \]
for $c_i \geq 0$. Raising both sides to the power $n$, this is the same as
\[ \sum_{k=0}^n \binom{n}{k} (c_1 \cdots c_n)^{k/n} \leq \sum_{k=0}^n e_k \]
where $e_k$ is the sum of the products of the $c_i$’s, taken $k$ at a time. So it is enough to
show that for each $k$,
\[ \binom{n}{k} (c_1 \cdots c_n)^{k/n} \leq \sum_{A \subseteq \{1, \ldots, n\},\ |A| = k} \ \prod_{i \in A} c_i. \]
We apply the AM-GM inequality to the numbers $\prod_{i \in A} c_i$ as $A$ ranges over all subsets
of size $k$ of $\{1, \ldots, n\}$. Note that each $c_i$ appears exactly $\binom{n-1}{k-1}$ times in all these
numbers. So we get
\[ (c_1 \cdots c_n)^{\binom{n-1}{k-1} / \binom{n}{k}} \leq \frac{\sum_{A \subseteq \{1, \ldots, n\},\ |A| = k} \prod_{i \in A} c_i}{\binom{n}{k}}. \]
Since $\binom{n-1}{k-1} \big/ \binom{n}{k} = k/n$, this is the same as
\[ (c_1 \cdots c_n)^{k/n} \leq \frac{\sum_{A \subseteq \{1, \ldots, n\},\ |A| = k} \prod_{i \in A} c_i}{\binom{n}{k}}, \]
which is exactly what we wanted to show.

Much quicker solution, shown to me by Jonathan Sheperd: If there is any
$i$ for which $a_i + b_i = 0$, then the inequality trivially holds. If not, divide both sides
by the right-hand side to get the equivalent inequality
\[ \left(\prod_{i=1}^n \frac{a_i}{a_i + b_i}\right)^{1/n} + \left(\prod_{i=1}^n \frac{b_i}{a_i + b_i}\right)^{1/n} \leq 1. \]
Applying the arithmetic mean – geometric mean inequality to both terms on the
left-hand side, we find that the left-hand side is at most
\[ \frac{1}{n} \left(\sum_{i=1}^n \frac{a_i}{a_i + b_i}\right) + \frac{1}{n} \left(\sum_{i=1}^n \frac{b_i}{a_i + b_i}\right) \]
which is the same as
\[ \frac{1}{n} \left(\sum_{i=1}^n \frac{a_i + b_i}{a_i + b_i}\right), \]
which is indeed at most (in fact exactly) 1.

5. Show that for every positive integer $n$,
\[ \left(\frac{2n-1}{e}\right)^{\frac{2n-1}{2}} \leq 1 \cdot 3 \cdot 5 \cdots (2n-1) < \left(\frac{2n+1}{e}\right)^{\frac{2n+1}{2}}. \]

Solution: This was from the 1996 Putnam Competition, problem B2.

We estimate the integral of $\ln x$, which is increasing and concave and hence easy to estimate. Take
the integral from 1 to $2n - 1$. Since $\ln x$ is increasing, this is at most $2(\ln 3 + \ln 5 + \cdots + \ln(2n - 1))$
(compare the integral over each interval $[2j - 1, 2j + 1]$ with $2 \ln(2j + 1)$). But the
antiderivative of $\ln x$ is $x \ln x - x$, so the integral evaluates to $(2n - 1) \ln(2n - 1) - 2n + 2$.
Hence $(2n - 1) \ln(2n - 1) - (2n - 1) < (2n - 1) \ln(2n - 1) - 2n + 2 \leq 2(\ln 3 + \ln 5 +
\cdots + \ln(2n - 1))$. Dividing by 2 and exponentiating gives the left-hand inequality.
Similarly, the integral from $e$ to $2n + 1$ is greater than $2(\ln 3 + \ln 5 + \cdots + \ln(2n - 1))$,
and an explicit evaluation of the antiderivative here leads to the right-hand side of
the inequality. The choice of lower bound $e$ for the integral here is just the right
thing to make the computations work out nicely.

6. Suppose that $f(x)$ is a polynomial with all real coefficients, satisfying $f(x) + f'(x) > 0$
for all $x$. Show that $f(x) > 0$ for all $x$.

Solution: Source: I got this problem from a Northwestern Putnam preparation
class.

$f(x)$ and $f(x) + f'(x)$ have the same leading coefficient, so the same limiting behavior
as $x$ goes to $\pm\infty$, namely they both tend to $+\infty$ (since $f(x) + f'(x) > 0$ always, the
limits cannot be $-\infty$).
$f(x)$ cannot have a repeated root: at a repeated root, the derivative is also 0, so
$f(x) + f'(x) = 0$ at this point. So all of $f(x)$’s real roots (if it has any) are simple.
Since $f(x)$ goes to $+\infty$ as $x$ approaches both $\pm\infty$, it must thus have an even number
of real zeroes.
Suppose it has any. Let $x_1$ and $x_2$ be the first two. Between $x_1$ and $x_2$, at some point
the derivative is 0 (Rolle’s theorem); at that point $f(x) + f'(x)$ must be negative
(since $f(x)$ is negative here). This contradiction shows that $f(x)$ has no real roots, so
can’t change sign, so must be always positive.

Remark: The example of $f(x) = -e^{-2x}$ shows that the hypothesis that $f(x)$ is a
polynomial is crucial here.

7. Given that $\{x_1, \ldots, x_n\} = \{1, \ldots, n\}$ (i.e., the numbers $x_1, \ldots, x_n$ are 1 through $n$
in some order), find (with proof!) the maximum value of
\[ x_1 x_2 + x_2 x_3 + \cdots + x_{n-1} x_n + x_n x_1. \]

Solution: This was from the 1996 Putnam Competition, problem B3

Here is a solution written by Kiran Kedlaya:


View $x_1, \ldots, x_n$ as an arrangement of the numbers $1, 2, \ldots, n$ on a circle. We prove
that the optimal arrangement is
\[ \ldots, n - 4, n - 2, n, n - 1, n - 3, \ldots \]
To show this, note that if $a, b$ is a pair of adjacent numbers and $c, d$ is another pair
(read in the same order around the circle) with $a < d$ and $b > c$, then the segment
from $b$ to $c$ can be reversed, increasing the sum by
\[ ac + bd - ab - cd = (d - a)(b - c) > 0. \]
Now relabel the numbers so they appear in order as follows:
\[ \ldots, a_{n-4}, a_{n-2}, a_n = n, a_{n-1}, a_{n-3}, \ldots \]
where without loss of generality we assume $a_{n-1} > a_{n-2}$. By considering the pairs
$a_{n-2}, a_n$ and $a_{n-1}, a_{n-3}$ and using the trivial fact $a_n > a_{n-1}$, we deduce $a_{n-2} > a_{n-3}$.
We then compare the pairs $a_{n-4}, a_{n-2}$ and $a_{n-1}, a_{n-3}$, and using that $a_{n-1} > a_{n-2}$, we
deduce $a_{n-3} > a_{n-4}$. Continuing in this fashion, we prove that $a_n > a_{n-1} > \cdots > a_1$
and so $a_k = k$ for $k = 1, 2, \ldots, n$, i.e. that the optimal arrangement is as claimed. In
particular, the maximum value of the sum is
\begin{align*}
& 1 \cdot 2 + (n - 1) \cdot n + 1 \cdot 3 + 2 \cdot 4 + \cdots + (n - 2) \cdot n \\
&= 2 + n^2 - n + (1^2 - 1) + \cdots + [(n - 1)^2 - 1] \\
&= n^2 - n + 2 - (n - 1) + \frac{(n - 1)n(2n - 1)}{6} \\
&= \frac{2n^3 + 3n^2 - 11n + 18}{6}.
\end{align*}

Alternate solution: We prove by induction that the value given above is an upper
bound; it is clearly a lower bound because of the arrangement given above. Assume
this is the case for $n - 1$. The optimal arrangement for $n$ is obtained from some
arrangement for $n - 1$ by inserting $n$ between some pair $x, y$ of adjacent terms. This
operation increases the sum by $nx + ny - xy = n^2 - (n - x)(n - y)$, which is an
increasing function of both $x$ and $y$. In particular, this difference is maximal when $x$
and $y$ equal $n - 1$ and $n - 2$. Fortunately, this yields precisely the difference between
the claimed upper bound for $n$ and the assumed upper bound for $n - 1$, completing
the induction.

8. Let $(x_k)_{k=1}^{\infty}$ and $(y_k)_{k=1}^{\infty}$ be sequences satisfying
\[ y_1 \geq y_2 \geq y_3 \geq \cdots \]
and, for each $k \geq 1$,
\[ x_1 x_2 \cdots x_k \geq y_1 y_2 \cdots y_k. \]
Show that for each $k \geq 1$,
\[ x_1 + x_2 + \cdots + x_k \geq y_1 + y_2 + \cdots + y_k. \]

Solution: This was on the 2013 University of Illinois at Urbana-Champaign Mock
Putnam. A solution can be found on the UIUC Mock Putnam website,
https://faculty.math.illinois.edu/~hildebr/putnam/mockputnamproblems.html.
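
Returning to Problems 5 and 7 of this set: both answers lend themselves to a quick computational check for small $n$. The sketch below is an illustration, not part of the original solutions, and its function names are invented here. For Problem 7 it tries every circular arrangement (fixing $x_1 = n$, which loses nothing since the sum is unchanged by rotating the circle) and compares the best value with the closed form; for Problem 5 it compares logarithms of the three quantities.

```python
import math
from itertools import permutations


def max_cyclic_sum(n):
    """Brute-force maximum of x1*x2 + x2*x3 + ... + xn*x1 over arrangements of 1..n."""
    best = 0
    for rest in permutations(range(1, n)):
        x = (n,) + rest  # rotation invariance lets us pin n in first position
        best = max(best, sum(x[i] * x[(i + 1) % n] for i in range(n)))
    return best


def closed_form(n):
    """The claimed maximum (2n^3 + 3n^2 - 11n + 18) / 6, for n >= 2."""
    return (2 * n ** 3 + 3 * n ** 2 - 11 * n + 18) // 6


def odd_product_bounds_hold(n):
    """Check the Problem 5 bounds on 1*3*5*...*(2n-1) after taking logarithms."""
    log_product = sum(math.log(2 * j - 1) for j in range(1, n + 1))
    lower = (2 * n - 1) / 2 * (math.log(2 * n - 1) - 1)
    upper = (2 * n + 1) / 2 * (math.log(2 * n + 1) - 1)
    return lower <= log_product < upper


if __name__ == "__main__":
    print([(n, max_cyclic_sum(n), closed_form(n)) for n in range(2, 9)])
    print(all(odd_product_bounds_hold(n) for n in range(1, 200)))
```

The brute force grows like $(n-1)!$, so it is only practical for single-digit $n$, but agreement there is reassuring evidence that the closed form was simplified correctly.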

11 Week 10 (October 13) — Modular arithmetic and
greatest common divisor
Modular arithmetic is something that everyone (not just mathematicians) is familiar
with from a very early age, though maybe not in a formal way. For example, whenever you
observe something like “it is 11 o’clock now, so in three hours’ time it will be two o’clock”, you
are performing addition modulo 12, saying “11 + 3 = 2”. In this section we will formalize
modular arithmetic, and present numerous properties and applications that highlight its
usefulness.

Modular arithmetic
For integers a and b, and positive integer k, say that a is congruent to b (modulo k), written
“a ≡ b (mod k)”, if a and b leave the same remainder on division by k, or equivalently if
a − b is a multiple of k, or equivalently if a − b = mk for some integer m. Congruence
(modulo k) is an equivalence relation on the integers, that partitions Z into k classes, called
residue classes. For example, when k = 3 the three classes are {. . . , −6, −3, 0, 3, 6, . . .},
{. . . , −5, −2, 1, 4, 7, . . .} and {. . . , −4, −1, 2, 5, 8, . . .}.
Many of the standard arithmetic operations go through unchanged to modular arith-
metic. For example, it is easy to establish that if

a ≡ b (mod k) and c ≡ d (mod k)

then each of

a + c ≡ b + d (mod k)
a − c ≡ b − d (mod k)
ac ≡ bd (mod k)

hold. Repeated application of this last relation also quickly gives that for all positive
integers $n$,
\[ a^n \equiv b^n \pmod{k}. \]
Modular arithmetic can be a great time-saver when working with problems concerning
divisibility. We give a quick and useful example.
Claim: The remainder of any integer, on division by 9, is the same as the remainder of
the sum of its digits on division by 9.
Proof: Write the number in decimal form as $\sum_{i=0}^n a_i 10^i$ (with each $a_i \in \{0, \ldots, 9\}$). Since
$10 \equiv 1 \pmod 9$, we immediately have $10^i \equiv 1^i \equiv 1 \pmod 9$, and so $a_i 10^i \equiv a_i \pmod 9$,
and so $\sum_{i=0}^n a_i 10^i \equiv \sum_{i=0}^n a_i \pmod 9$, which is exactly what we wanted to show.
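
The claim is easy to test by machine. A minimal sketch (the helper name `digit_sum` is my own) that compares a number with its digit sum modulo 9:

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


if __name__ == "__main__":
    for n in [9, 10, 4444, 123456789, 10 ** 20 + 7]:
        print(n, n % 9, digit_sum(n) % 9)
```

The last two columns printed should always agree, which is the content of the claim.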
Here are three more quick examples illustrating how modular arithmetic can make life
easy:

Question: What are the last two digits of $3^{72}$?
Answer: We are being asked: what number $x$ between 0 and 99 is such that $3^{72} \equiv
x \pmod{100}$? By repeated squaring we have
\begin{align*}
3 &\equiv 3 \pmod{100} \\
3^2 &\equiv 9 \pmod{100} \\
3^4 &\equiv 9^2 \equiv 81 \pmod{100} \\
3^8 &\equiv 81^2 \equiv 6561 \equiv 61 \pmod{100} \\
3^{16} &\equiv 61^2 \equiv 3721 \equiv 21 \pmod{100} \\
3^{32} &\equiv 21^2 \equiv 441 \equiv 41 \pmod{100} \\
3^{64} &\equiv 41^2 \equiv 1681 \equiv 81 \pmod{100}
\end{align*}
and so
\[ 3^{72} \equiv 3^{64} \cdot 3^8 \equiv 81 \cdot 61 \equiv 4941 \equiv 41 \pmod{100}, \]
so the last two digits of $3^{72}$ are 41.
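
The repeated-squaring idea used above can be written as a short routine. This is a sketch for illustration (the name `power_mod` is hypothetical); Python's built-in three-argument `pow(base, exponent, modulus)` does the same job and is what one would normally use.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1 % modulus
    square = base % modulus  # holds base^(2^i) mod modulus at step i
    while exponent > 0:
        if exponent & 1:     # this power of two appears in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result


if __name__ == "__main__":
    # The table of squarings: 3^(2^i) mod 100 for i = 0..6.
    print([power_mod(3, 2 ** i, 100) for i in range(7)])
    print(power_mod(3, 72, 100), pow(3, 72, 100))
```

Since $72 = 64 + 8$ in binary, only the $3^8$ and $3^{64}$ entries of the table get multiplied into the result, just as in the hand computation.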
Problem: Prove that $2^{70} + 3^{70}$ is divisible by 13.
Solution: We could use the same technique as above to discover that $2^{70} \equiv 10 \pmod{13}$
and $3^{70} \equiv 3 \pmod{13}$ so that $2^{70} + 3^{70} \equiv 10 + 3 \equiv 0 \pmod{13}$. But there is a much easier
way: $2^2 \equiv -3^2 \pmod{13}$, so $2^{70} \equiv (-1)^{35} 3^{70} \equiv -3^{70} \pmod{13}$, so $2^{70} + 3^{70} \equiv 0 \pmod{13}$.
Problem: Find all integers $x, y$ satisfying $x^2 - 5y^2 = 6$.
Solution: Some experimentation shows that no small numbers $x$ and $y$ work. We might
suspect, then, that the equation has no integer solutions. One way to verify this is to work
modulo 4. If there were an $x$ and $y$ with $x^2 - 5y^2 = 6$, then for that $x$ and $y$ we would have
$x^2 - 5y^2 \equiv 6 \pmod 4$.
If $x \equiv 0, 1, 2, 3 \pmod 4$ then $x^2 \equiv 0, 1, 0, 1 \pmod 4$, and if $y \equiv 0, 1, 2, 3 \pmod 4$ then
$5y^2 \equiv 0, 1, 0, 1 \pmod 4$. So, modulo 4, $x^2 - 5y^2$ is equivalent to one of $-1, 0, 1$, or one of
$0, 1, 3$, and so we cannot have $x^2 - 5y^2 \equiv 6 \equiv 2 \pmod 4$.

Hence the equation indeed has no solutions.
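
Both of the last two arguments come down to finite computations with residues, so they can be confirmed mechanically. A brief sketch (the helper `residues` and the variable names are assumptions made for this example):

```python
def residues(expr, modulus):
    """Sorted residues taken by expr(x, y) mod modulus as x, y run over all classes."""
    return sorted({expr(x, y) % modulus
                   for x in range(modulus) for y in range(modulus)})


# 2^70 + 3^70 modulo 13, using the built-in modular power.
sum_mod_13 = (pow(2, 70, 13) + pow(3, 70, 13)) % 13

# Every value x^2 - 5y^2 can take modulo 4.
values_mod_4 = residues(lambda x, y: x * x - 5 * y * y, 4)

if __name__ == "__main__":
    print(sum_mod_13, values_mod_4, 6 % 4 in values_mod_4)
```

Checking a candidate modulus this way is often how one finds the right modulus in the first place: try a few small ones until the target residue is missing from the list.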

The greatest common divisor


Closely related to modular arithmetic is the greatest common divisor function in number
theory. Here is a brief introduction to some ideas around the greatest common divisor.

1. Divisibility: For integers a, b, a|b (a divides b) if there is an integer k with ak = b.


For positive integers $a$ and $b$, the greatest common divisor of $a, b$, $\gcd(a, b)$ (sometimes
just written $(a, b)$) is the largest positive number that is a divisor of both $a$ and $b$
(this exists, since 1 is a common divisor, and all common divisors are at most the
minimum of $a$ and $b$). This means that if $d = \gcd(a, b)$, and $e$ is any positive number
with $e|a$ and $e|b$, then $e \leq d$; but in fact it turns out that moreover $e|d$. This very
useful fact follows easily from looking at the prime factorizations of $a$ and $b$, see
below.
The least common multiple of $a, b$, $\mathrm{lcm}(a, b)$ is the smallest positive number $f$ such
that $a|f$ and $b|f$; for any positive number $g$ with $a|g$ and $b|g$ we have $f \leq g$; but
in fact, as with gcd, it turns out that we even have $f|g$ in this case.
If $\gcd(a, b) = 1$ (so no factors in common other than 1) then $a$ and $b$ are said to be
coprime or relatively prime.
2. Primes: If $p > 1$ only has 1 and $p$ as divisors, it is said to be prime; otherwise it is
composite.
The fundamental fact about prime numbers (other than that there are infinitely many of
them!) is that every number $n > 1$ has a prime factorization:
\[ n = p_1^{a_1} \cdots p_k^{a_k} \]
with each $p_i$ a prime, and each $a_i > 0$. Moreover, the factorization is unique if we
assume that $p_1 < \cdots < p_k$.
The prime factorization gives one way (not the most computationally efficient way)
of accessing $\gcd(a, b)$ and $\mathrm{lcm}(a, b)$. Indeed, if
\[ a = p_1^{a_1} \cdots p_k^{a_k} \quad \text{and} \quad b = p_1^{b_1} \cdots p_k^{b_k} \]
(with some of the $a_i$ and $b_i$ possibly 0), then
\[ \gcd(a, b) = p_1^{\min(a_1, b_1)} \cdots p_k^{\min(a_k, b_k)} \quad \text{and} \quad \mathrm{lcm}(a, b) = p_1^{\max(a_1, b_1)} \cdots p_k^{\max(a_k, b_k)}. \]
Using $\min(x, y) + \max(x, y) = x + y$, we get the nice identity
\[ ab = \gcd(a, b)\,\mathrm{lcm}(a, b). \]
Since any common divisor of $a$ and $b$ must be of the form $\prod_{i=1}^k p_i^{\gamma_i}$ for some $\gamma_i$’s satisfying
$\gamma_i \leq \min(a_i, b_i)$, we quickly get the fact, alluded to earlier, that if $d = \gcd(a, b)$ and
$e$ is a common divisor of $a$ and $b$, then not only do we have $e \leq d$ but also $e|d$.
3. Euclidean algorithm: Euclid described a simple way to compute $\gcd(a, b)$. Assume
$a > b$. Write
\[ a = kb + j \]
where $0 \leq j < b$. If $j = 0$, then $\gcd(a, b) = b$. If $j > 0$, then it is fairly easy to check
that $\gcd(a, b) = \gcd(b, j)$. Repeat the process with the smaller pair $b, j$, and keep
repeating as long as necessary. For example, suppose I want $\gcd(63, 36)$:
\begin{align*}
63 &= 1 \cdot 36 + 27 \\
36 &= 1 \cdot 27 + 9 \\
27 &= 3 \cdot 9.
\end{align*}
We conclude $9 = \gcd(27, 9) = \gcd(36, 27) = \gcd(63, 36)$.

4. Bézout’s Theorem: Given $a, b$, there are integers $x, y$ such that $ax + by = \gcd(a, b)$.
Moreover, the set of numbers $c$ that can be expressed in the form $ax' + by' = c$ for
integers $x', y'$ is exactly the set of multiples of $\gcd(a, b)$.
The proof comes from working the Euclidean algorithm backwards. I’ll just do an
example, with the pair 63, 36. We have
\begin{align*}
9 &= 36 - 1 \cdot 27 \\
&= 36 - 1 \cdot (63 - 1 \cdot 36) \\
&= -1 \cdot 63 + 2 \cdot 36
\end{align*}
so we can take $x = -1$ and $y = 2$.

Once we have found an $x$ and $y$, the rest is easy. Suppose $c = k \gcd(a, b)$ is a multiple
of $\gcd(a, b)$; then $(kx)a + (ky)b = k \gcd(a, b) = c$. On the other hand, if $ax' + by' = c$ then
since $\gcd(a, b)|a$ and $\gcd(a, b)|b$ we have $\gcd(a, b)|c$, so $c$ is a multiple of $\gcd(a, b)$.
The most common form of Bézout’s Theorem is that if $(a, b) = 1$ then every integer
$k$ can be written as a linear combination of $a$ and $b$; in particular there are $x, y$ with
$ax + by = 1$.
In the language of modular arithmetic, this says that if $(a, k) = 1$, then there is a
number $x$ such that $ax \equiv 1 \pmod k$. We may think of $x$ as an inverse of $a$ (modulo
$k$); this is a starting point for thinking about division in modular arithmetic.

5. Useful facts/theorems concerning modular arithmetic:

(a) Inverses (repeating a previous observation): If $p$ is a prime, and $a \not\equiv 0 \pmod p$,
then there is a whole number $b$ such that $ab \equiv 1 \pmod p$; more generally if $a$
and $k$ are coprime then there is a whole number $b$ such that $ab \equiv 1 \pmod k$.
(b) Fermat’s theorem: If $p$ is a prime, and $a \not\equiv 0 \pmod p$, then $a^{p-1} \equiv 1 \pmod p$.
More generally, for arbitrary $m$ (prime or composite) define $\varphi(m)$ to be the
number of numbers in the range 1 through $m$ that are coprime with $m$. If
$(a, m) = 1$ then $a^{\varphi(m)} \equiv 1 \pmod m$. (When $m = p$ this reduces to Fermat’s
Theorem.)
We refer to $\varphi$ as Euler’s totient function. A quick way to calculate its value: if
$m$ has prime factorization $m = p_1^{a_1} \cdots p_k^{a_k}$, then
\[ \varphi(m) = m \prod_{i=1}^k \left(1 - \frac{1}{p_i}\right). \]

(c) Chinese Remainder Theorem: Suppose $n_1, n_2, \ldots, n_k$ are pairwise relatively
prime. If $a_1, a_2, \ldots, a_k$ are any integers, there is a number $x$ that simultaneously
satisfies $x \equiv a_i \pmod{n_i}$ for each $i = 1, \ldots, k$. Moreover, modulo $n_1 n_2 \cdots n_k$, this solution is unique.
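
The facts in this list translate directly into a few short routines. What follows is a sketch under my own naming (`extended_gcd`, `mod_inverse`, `totient` and `crt` are illustrative helpers, not functions from the notes or from any particular library): the extended Euclidean algorithm produces the Bézout coefficients, which give modular inverses, which in turn let us assemble a Chinese Remainder Theorem solution one congruence at a time.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, k):
    """Return b in [0, k) with a*b = 1 (mod k); requires gcd(a, k) = 1."""
    g, x, _ = extended_gcd(a % k, k)
    if g != 1:
        raise ValueError("a and k are not coprime")
    return x % k


def totient(m):
    """Euler's totient, straight from the definition (fine for small m)."""
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)


def crt(remainders, moduli):
    """Solve x = a_i (mod n_i) for pairwise coprime n_i; result is mod their product."""
    x, modulus = 0, 1
    for a_i, n_i in zip(remainders, moduli):
        # Choose t so that x + modulus * t is congruent to a_i modulo n_i.
        t = ((a_i - x) * mod_inverse(modulus, n_i)) % n_i
        x += modulus * t
        modulus *= n_i
    return x % modulus


if __name__ == "__main__":
    print(extended_gcd(63, 36))          # Bezout coefficients for the worked example
    print(mod_inverse(3, 7), totient(12), pow(5, totient(12), 12))
    print(crt([2, 3, 2], [3, 5, 7]))
```

Computing $\varphi(m)$ from the definition is slow for large $m$; the product formula above is the practical route once the prime factorization is known.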

11.1 Problems to think about for week 11
1. Prove that the product of three consecutive integers is divisible by 504 if the middle
one is a perfect cube.

2. Find all integers n such that (2n + n)|(8n + n).

3. Compute the sum of the digits of the sum of the digits of the sum of the digits of
the number 44444444 .

4. Several positive integers are written on a chalk board. One can choose two of them,
erase them, and replace them with their greatest common divisor and least common
multiple. Prove that eventually the numbers on the board do not change.

5. How many prime numbers have the following (decimal) form: digits alternating
between 1’s and 0’s, beginning and ending with 1?
   
6. Let a ≥ b ≥ 0 be integers and let p be a prime number. Show that $\binom{pa}{pb}$ and $\binom{a}{b}$
are congruent modulo p.

7. Is it possible to place 2021 integers on a circle such that for every pair of adjacent
numbers the ratio of the larger one to the smaller one is a prime?

8. Let n > 1 be an integer and p a prime such that $n \mid (p - 1)$ and $p \mid (n^3 - 1)$. Prove
that 4p − 3 is a perfect square.

11.2 Solutions to problems on modular arithmetic


1. Prove that the product of three consecutive integers is divisible by 504 if the middle
one is a perfect cube.

Solution: Let the middle integer be $m^3$ where m is an integer. Then the product of
the three integers is

\[ (m^3 - 1)\,m^3\,(m^3 + 1) = (m - 1)(m^2 + m + 1)\,m^3\,(m + 1)(m^2 - m + 1). \]

The prime factorization of 504 is $504 = 2^3 \times 3^2 \times 7$.


We first show that 7 divides $(m - 1)(m^2 + m + 1)\,m^3\,(m + 1)(m^2 - m + 1)$ by looking at m
modulo 7; in each case one of the factors is divisible by 7.

• If m ≡ 0 (modulo 7) then $m^3$ ≡ 0 (modulo 7).
• If m ≡ 1 (modulo 7) then m − 1 ≡ 0 (modulo 7).
• If m ≡ 2 (modulo 7) then $m^2 + m + 1$ ≡ 0 (modulo 7).
• If m ≡ 3 (modulo 7) then $m^2 - m + 1$ ≡ 0 (modulo 7).
• If m ≡ 4 (modulo 7) then $m^2 + m + 1$ ≡ 0 (modulo 7).
• If m ≡ 5 (modulo 7) then $m^2 - m + 1$ ≡ 0 (modulo 7).
• If m ≡ 6 (modulo 7) then m + 1 ≡ 0 (modulo 7).

Next we show that $3^2$ divides $(m - 1)(m^2 + m + 1)\,m^3\,(m + 1)(m^2 - m + 1)$ by looking at m
modulo 3.

• If m ≡ 0 (modulo 3) then $3^2$ divides the product because of the $m^3$ factor.
• If m ≡ 1 (modulo 3) then m − 1 ≡ 0 (modulo 3) and $m^2 + m + 1$ ≡ 0 (modulo
3), so $3^2$ divides the product.
• If m ≡ 2 (modulo 3) then m + 1 ≡ 0 (modulo 3) and $m^2 - m + 1$ ≡ 0 (modulo
3), so $3^2$ divides the product.

Finally we show that $2^3$ divides $(m - 1)(m^2 + m + 1)\,m^3\,(m + 1)(m^2 - m + 1)$ by looking at
m modulo 2, and modulo 4.

• If m ≡ 0 (modulo 2) then $2^3$ divides the product because of the $m^3$ factor.
• If m ≡ 1 (modulo 2) and m ≡ 1 (modulo 4) then m − 1 ≡ 0 (modulo 4) and
m + 1 ≡ 0 (modulo 2), so $2^3$ divides the product.
• If m ≡ 1 (modulo 2) and m ≡ 3 (modulo 4) then m − 1 ≡ 0 (modulo 2) and
m + 1 ≡ 0 (modulo 4), so $2^3$ divides the product.

We conclude that $2^3 \cdot 3^2 \cdot 7$ divides $(m - 1)(m^2 + m + 1)\,m^3\,(m + 1)(m^2 - m + 1)$, as required.

Source: Kenyon College Putnam prep class.
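As a sanity check on the case analysis (not a substitute for it), the claim can be tested directly for a range of middle values. This is a small Python sketch; the function name `product_around_cube` is an illustrative choice.

```python
def product_around_cube(m):
    """Product of the three consecutive integers whose middle one is m**3."""
    c = m ** 3
    return (c - 1) * c * (c + 1)

# Check divisibility by 504 = 2^3 * 3^2 * 7 for m = -200, ..., 200.
all_divisible = all(product_around_cube(m) % 504 == 0 for m in range(-200, 201))
```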

2. Find all integers n such that $(2^n + n) \mid (8^n + n)$.

Solution: This was on a Putnam prep page from Northwestern.

The solutions are n = 0, 1, 2, 4 and 6.


We may assume n ≥ 0: note that for n ≤ −1, neither $2^n + n$ nor $8^n + n$ is an integer;
moreover $2^n + n$ and $8^n + n$ are both negative, with $2^n + n > 8^n + n$, and the ratio
$(8^n + n)/(2^n + n)$ lies strictly between 1 and 2, so even with a
broad interpretation of “divides”, no integer below 0 will work.
We have $8^n + n = 2^n 4^n + n = (2^n + n)4^n - n4^n + n$, so if $2^n + n$ divides $8^n + n$, then

\[ (2^n + n) \mid (n4^n - n). \]

We have $n4^n - n = n2^n 2^n - n = (2^n + n)n2^n - n^2 2^n - n$, so if $2^n + n$ divides $n4^n - n$,
then
\[ (2^n + n) \mid (n^2 2^n + n). \]

We have $n^2 2^n + n = (2^n + n)n^2 - n^3 + n$, so if $2^n + n$ divides $n^2 2^n + n$, then

\[ (2^n + n) \mid (n^3 - n). \]

It is an easy (but slightly tedious, so omitted) induction that for n ≥ 10, $2^n + n > n^3 - n > 0$,
so we conclude that if $2^n + n$ divides $8^n + n$, then n ≤ 9.
It is another tedious but easy check that n = 0, 1, 2, 4 and 6 all lead to integers, but
not any other n ≤ 9.
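The "tedious but easy check" is a natural job for a computer. The sketch below, in Python, simply tests the divisibility for non-negative n up to an arbitrary cutoff of 200; the cutoff and the name `divides` are illustrative choices, and the argument above is what guarantees that nothing beyond n = 9 can work.

```python
def divides(n):
    """True if 2**n + n divides 8**n + n (for n >= 0, both are positive integers)."""
    return (8 ** n + n) % (2 ** n + n) == 0

solutions = [n for n in range(0, 200) if divides(n)]
```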

3. Compute the sum of the digits of the sum of the digits of the sum of the digits of
the number $4444^{4444}$.

Solution: We start with

\[ 4444^{4444} < 10000^{10000} = 10^{40000}. \]

Among all numbers below $10^{40000}$, none has a larger sum of digits than $10^{40000} - 1$ (a
string of 40000 9’s). So the sum of the digits of $4444^{4444}$ is at most 9 × 40000 < 1000000.
Among all numbers below 1000000, none has a larger sum of digits than 999999. So
the sum of the digits of the sum of the digits of $4444^{4444}$ is at most 54. Among all
numbers at most 54, none has a larger sum of digits than 49. So the sum of the
digits of the sum of the digits of the sum of the digits of $4444^{4444}$ is at most 13.
Now we use a useful fact: the remainder of a number, on division by 9, is the same
as the remainder of the sum of its digits on division by 9 (this is because 10 ≡ 1 (mod 9),
so every power of 10 is congruent to 1 modulo 9). This fact implies that
the sum of the digits of the sum of the digits of the sum of the digits of $4444^{4444}$
leaves the same remainder on division by 9 as $4444^{4444}$ itself does.
To calculate the remainder of $4444^{4444}$ on division by 9, we can use a repeated
squaring trick. It’s easy to see that

4444 ≡ 7 (mod 9)

and so

$4444^{2}$ ≡ 49 ≡ 4 (mod 9)
$4444^{4}$ ≡ 16 ≡ 7 (mod 9)
$4444^{8}$ ≡ 49 ≡ 4 (mod 9)
$4444^{16}$ ≡ 16 ≡ 7 (mod 9)
$4444^{32}$ ≡ 49 ≡ 4 (mod 9)
$4444^{64}$ ≡ 16 ≡ 7 (mod 9)
$4444^{128}$ ≡ 49 ≡ 4 (mod 9)
$4444^{256}$ ≡ 16 ≡ 7 (mod 9)
$4444^{512}$ ≡ 49 ≡ 4 (mod 9)
$4444^{1024}$ ≡ 16 ≡ 7 (mod 9)
$4444^{2048}$ ≡ 49 ≡ 4 (mod 9)
$4444^{4096}$ ≡ 16 ≡ 7 (mod 9).

It follows that

\[ 4444^{4444} = 4444^{4096} \cdot 4444^{256} \cdot 4444^{64} \cdot 4444^{16} \cdot 4444^{8} \cdot 4444^{4} \equiv 7 \cdot 7 \cdot 7 \cdot 7 \cdot 4 \cdot 7 \equiv 7 \pmod{9}. \]

So $4444^{4444}$ leaves a remainder of 7 on division by 9, and also the sum of the digits
of the sum of the digits of the sum of the digits of $4444^{4444}$ leaves a remainder of 7
on division by 9; but we’ve calculated that this last is at most 13. The only number
at most 13 that leaves a remainder of 7 on division by 9 is 7 itself; so the sum of
the digits of the sum of the digits of the sum of the digits of $4444^{4444}$ must be 7.

Source: International Mathematical Olympiad 1975, problem 4.

Remark: This was an IMO (International Mathematical Olympiad) problem. Appropriately
enough, if you had got this question fully correct at the IMO, you would
have scored 7 points!
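With exact integer arithmetic the answer can also be confirmed by brute force, which is a useful check on the estimate-plus-congruence argument. The sketch below is in Python; it extracts digits with `divmod` rather than converting to a string, an assumption made because recent Python versions limit the length of integer-to-string conversions by default. The name `digit_sum` is illustrative.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    total = 0
    while n:
        n, d = divmod(n, 10)   # peel off the last decimal digit
        total += d
    return total

number = 4444 ** 4444
s1 = digit_sum(number)         # sum of digits
s2 = digit_sum(s1)             # sum of digits of the sum of digits
s3 = digit_sum(s2)             # ... and once more
remainder = pow(4444, 4444, 9) # fast modular exponentiation (repeated squaring)
```

Each of `s1`, `s2`, `s3` and `remainder` leaves the same remainder on division by 9, exactly as the useful fact predicts.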

4. Several positive integers are written on a chalk board. One can choose two of them,
erase them, and replace them with their greatest common divisor and least common
multiple. Prove that eventually the numbers on the board do not change.

Solution: I found this problem on a Stanford Putnam prep class page.

Here’s a very quick, slick solution shown to me by Do Trong Thanh: If you pick
two numbers a, b with a|b or b|a, then since gcd(a, b) = min{a, b} and lcm(a, b) =
max{a, b} in this case, the numbers do not change. In general gcd(a, b)|lcm(a, b),
so if it is not the case that a|b or b|a, then after the swap it is the case for that
particular pair. Initially there are only finitely many pairs (a, b) with a ∤ b and b ∤ a;
either eventually we replace all these pairs with pairs of which one divides the other
(in which case we are done), or we eventually commit to avoiding all remaining such
pairs (in which case we are done).
Here’s my laborious solution: When we take a pair of numbers (a, b), and replace them
with (gcd(a, b), lcm(a, b)), we preserve something, namely the product of the pair of
numbers (that ab = gcd(a, b)lcm(a, b) is easily seen from the prime factorization of a
and b: if
\[ a = \prod_{i=1}^{n} p_i^{a_i}, \qquad b = \prod_{i=1}^{n} p_i^{b_i} \]
(with maybe some of the $a_i$, $b_i$ zero) then
\[ ab = \prod_{i=1}^{n} p_i^{a_i + b_i}, \]
\[ \gcd(a, b) = \prod_{i=1}^{n} p_i^{\min\{a_i, b_i\}}, \qquad \mathrm{lcm}(a, b) = \prod_{i=1}^{n} p_i^{\max\{a_i, b_i\}}, \]
so
\[ \gcd(a, b)\,\mathrm{lcm}(a, b) = \prod_{i=1}^{n} p_i^{\min\{a_i, b_i\} + \max\{a_i, b_i\}}. \]
Thus ab = gcd(a, b)lcm(a, b) follows from x + y = min{x, y} + max{x, y}, valid for
any non-negative integers x, y.)
For any fixed positive number, there are only finitely many ways to write it as the
product of a fixed number of positive numbers (if the target of the product is N ,
and we are using d numbers, then each of the d numbers must be a divisor of N , so
the number of ways of writing N as a product of d terms is at most $a(N)^d$, where
a(N ) is the number of divisors of N ). This shows that there are only finitely many
possibilities for the numbers written on the board.
Consider the sum of the numbers. How does this change with the swap operation?
It depends on how a + b compares to gcd(a, b) + lcm(a, b). Experimentation suggests
that gcd(a, b) + lcm(a, b) ≥ a + b, with equality iff the pair (a, b) coincides (in some
order) with the pair (lcm(a, b), gcd(a, b)). To prove this, first consider a = b, for
which the result is trivial. For all other cases, assume without loss of generality that
a > b. We have
lcm(a, b) ≥ a > b ≥ gcd(a, b).
If any one of a = lcm(a, b), b = gcd(a, b) holds then by the conservation of product
the other must too, and the result we are trying to prove is true. So now we may
assume
lcm(a, b) > a > b > gcd(a, b),
and what we want to show is that this implies gcd(a, b) + lcm(a, b) > a + b. Let
n = ab = gcd(a, b)lcm(a, b), so
\[ \gcd(a, b) + \mathrm{lcm}(a, b) = \gcd(a, b) + \frac{n}{\gcd(a, b)} \quad \text{and} \quad a + b = b + \frac{n}{b}. \]
A little calculus shows that the function f(x) = x + n/x is decreasing on the interval
$(0, \sqrt{n}]$. Since $\gcd(a, b) < b < \sqrt{n}$, this shows that
\[ \gcd(a, b) + \frac{n}{\gcd(a, b)} > b + \frac{n}{b}, \]
which is exactly what we want to show.


So, suppose we have the bunch of numbers in front of us, and we perform the swap
operation infinitely often. All swaps preserve the product. Some swaps also preserve
the sum; these swaps are exactly the swaps that don’t change the set of numbers.
All other swaps increase the sum. We can only increase the sum finitely many times
(there are only finitely many different configurations of numbers). Therefore there
must be some point (curiously, not boundable as a function of the original numbers!)
after which we make no more sum-increasing swaps; from that point on, the numbers
remain unchanged.
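The process is easy to simulate, and watching it run makes the two invariants (constant product, non-decreasing sum) concrete. The following Python sketch repeatedly picks a random pair in which neither number divides the other and performs the swap; the function name `stabilise`, the fixed seed, and the starting board 4, 6, 9, 10 are all illustrative assumptions.

```python
import random
from math import gcd

def stabilise(numbers, seed=0):
    """Apply gcd/lcm swaps to 'bad' pairs until none remain; return (board, swaps)."""
    rng = random.Random(seed)
    board = list(numbers)
    swaps = 0
    while True:
        # a pair is 'bad' if neither number divides the other
        bad = [(i, j) for i in range(len(board)) for j in range(i + 1, len(board))
               if board[i] % board[j] and board[j] % board[i]]
        if not bad:
            return sorted(board), swaps
        i, j = rng.choice(bad)
        g = gcd(board[i], board[j])
        board[i], board[j] = g, board[i] * board[j] // g   # (gcd, lcm)
        swaps += 1

final_board, swap_count = stabilise([4, 6, 9, 10])
```

Only swaps on bad pairs are simulated, since the other swaps leave the board unchanged; each simulated swap strictly increases the sum, which is why the loop is guaranteed to stop.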

5. How many prime numbers have the following (decimal) form: digits alternating
between 1’s and 0’s, beginning and ending with 1?

Solution: This was from an NYU Putnam prep class webpage.

The only prime of this form is 101.


The number $x_n$ = 1010 . . . 101, with n 0’s, can be written as

\[ 1 + 100 + 10000 + 1000000 + \ldots + 1000 \ldots 000 = 1 + 100 + 100^2 + \ldots + 100^n, \]

in other words, $x_n = P_n(100)$ where $P_n(x)$ is the polynomial $1 + x + x^2 + \ldots + x^n$.

We want to know for which n the value $P_n(100)$ is prime.
For n = 0, it is not ($P_0(100) = 1$), and for n = 1 it is ($P_1(100) = 101$). So we assume
n ≥ 2.
Since $(x - 1)(1 + x + x^2 + \ldots + x^n) = x^{n+1} - 1$, we have
\[ 99 P_n(100) = 100^{n+1} - 1 = 10^{2(n+1)} - 1 = \left(10^{n+1}\right)^2 - 1 = \left(10^{n+1} - 1\right)\left(10^{n+1} + 1\right). \]

What happens if $P_n(100)$ is prime? It must divide one of $10^{n+1} - 1$, $10^{n+1} + 1$. But,
for n ≥ 2,

\[ P_n(100) = 1 + 100 + 100^2 + \ldots + 100^n > 1 + 10^{2n} > 1 + 10^{n+1}, \]

so $P_n(100)$ is too big to divide either $10^{n+1} + 1$ or $10^{n+1} - 1$. Hence for n ≥ 2, $P_n(100)$
can’t be prime.
The conclusion is that the only prime of the given form is 101.
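The factorization of $99 P_n(100)$ also gives a practical way to exhibit a proper factor of each of these numbers: take a gcd with $10^{n+1} - 1$ or $10^{n+1} + 1$. The Python sketch below does this; the function names `alternating` and `nontrivial_factor` are illustrative.

```python
from math import gcd

def alternating(n):
    """The number 1010...101 with n zeros, i.e. 1 + 100 + ... + 100**n."""
    return (100 ** (n + 1) - 1) // 99

def nontrivial_factor(n):
    """A factor g of alternating(n) with 1 < g < alternating(n), or None."""
    x = alternating(n)
    for candidate in (10 ** (n + 1) - 1, 10 ** (n + 1) + 1):
        g = gcd(x, candidate)
        if 1 < g < x:
            return g
    return None

factor_of_10101 = nontrivial_factor(2)
factor_of_101 = nontrivial_factor(1)   # None: this method finds nothing for 101
```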

6. Let a ≥ b ≥ 0 be integers and let p be a prime number. Show that $\binom{pa}{pb}$ and $\binom{a}{b}$
are congruent modulo p.

Solution: INCOMPLETE This was from the 1977 Putnam competition, Problem
A5.

Solution from John Scholes. Denote by f(n) the highest power of p dividing n (so,
e.g., $f(2^3 5^8 p^7) = p^7$, if p ≠ 2, 5). The multiples of p in (pa)! are pa, p(a − 1), . . ., 2p,
and p. Hence $f((pa)!) = p^a f(a!)$. Similarly, $f((pb)!) = p^b f(b!)$ and $f((p(a - b))!) = p^{a-b} f((a - b)!)$. Hence
\[ f\left(\binom{a}{b}\right) = f\left(\binom{pa}{pb}\right). \]
This says that $\binom{pa}{pb} - \binom{a}{b}$ can be expressed as $x p^y$ where x and y are non-negative
integers, and x is not divisible by p. If y > 0, this gives the result.
I’m not sure what happens for this line of attack if y = 0.
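Although the argument above is incomplete, the statement itself is easy to test numerically, which at least confirms that there is something to prove. This Python sketch (assuming Python 3.8 or later for `math.comb`) checks the congruence for a few small primes; the name `congruent` and the ranges are illustrative, and a passing check is of course not a proof.

```python
from math import comb

def congruent(p, a, b):
    """True if C(pa, pb) and C(a, b) are congruent modulo p."""
    return (comb(p * a, p * b) - comb(a, b)) % p == 0

holds_for_small_cases = all(
    congruent(p, a, b)
    for p in (2, 3, 5, 7, 11)
    for a in range(0, 12)
    for b in range(0, a + 1)
)
```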
Here is the solution as posted in the American Mathematical Monthly shortly after
the 1977 competition:

7. Is it possible to place 2021 integers on a circle such that for every pair of adjacent
numbers the ratio of the larger one to the smaller one is a prime?

Solution: I found this on Andrei Jorza’s webpage, from his 2018 Putnam prep class.

No. We argue by contradiction.


Suppose it were possible. Consider two consecutive numbers on the circle, a and b.
Either b = pa for some prime p, in which case label the arc of the circle between a
and b “p” (and call the arc an UP arc), or a = pb, in which case label the arc “1/p”
(and call it a DOWN arc).

Starting from a particular (arbitrarily chosen) number, A, say, on the circle, the
number one step away clockwise from A is A multiplied by the label on the arc of the
circle between A and that number one step away clockwise. In general, the number
k steps away from A (clockwise) is A multiplied by all the labels encountered along
those k arcs. So the number 2021 steps away from A (clockwise) is A multiplied by
all the labels on the arcs (since 2021 steps takes us all the way around the circle).
But this last number is A itself. So we have an equation:
A = A × (product of the primes on the UP arcs) / (product of the primes on the DOWN arcs)
or
product of primes on UP arcs = product of primes on DOWN arcs.
But there are 2021 arcs, an odd number, so one side of the above equation has an
odd number of primes in it (counted with repetition), and the other side has an even number, contradicting
the fundamental theorem of arithmetic.
Notice that all we used here was that 2021 is odd.

8. Let n > 1 be an integer and p a prime such that $n \mid (p - 1)$ and $p \mid (n^3 - 1)$. Prove
that 4p − 3 is a perfect square.

Solution: (Northwestern Putnam Preparation) We have $p \mid (n - 1)(n^2 + n + 1)$ so
either $p \mid (n - 1)$ or $p \mid (n^2 + n + 1)$. The first is impossible (it implies p ≤ n − 1, but
$n \mid (p - 1)$ says n ≤ p − 1 so p ≥ n + 1). So we have $p \mid (n^2 + n + 1)$.
By $n \mid (p - 1)$ we have kn = p − 1 for some k ≥ 1, and by $p \mid (n^2 + n + 1)$ we have
$\ell p = n^2 + n + 1$ for some ℓ ≥ 1. From the first, we have ℓp = ℓkn + ℓ, and so
$\ell k n + \ell = n^2 + n + 1$ or

\[ n^2 + (1 - \ell k)n + (1 - \ell) = 0. \]

This quadratic has the integer n as a root, so the discriminant must be a
perfect square, i.e.,
\[ (1 - \ell k)^2 - 4(1 - \ell) = x^2 \]
for some integer x.
One possibility is ℓ = 1 (the left-hand side above becomes $(1 - k)^2$, certainly a
square). But ℓ = 1 says $p = n^2 + n + 1$, so $4p - 3 = 4n^2 + 4n + 1 = (2n + 1)^2$, a
perfect square.
If ℓ > 1 then rewriting the above as $(\ell k - 1)^2 + 4(\ell - 1) = x^2$, we see that

\[ 4(\ell - 1) \geq 2(\ell k - 1) + 1 \]

(why? $(\ell k - 1)^2$ is a perfect square, and when we add 4(ℓ − 1) we get a larger perfect
square; the first perfect square after $(\ell k - 1)^2$ is $(\ell k)^2$, which differs from $(\ell k - 1)^2$
by 2(ℓk − 1) + 1, so we need to add at least this much).
4(ℓ − 1) ≥ 2(ℓk − 1) + 1 implies k = 1 (if k ≥ 2 then 2(ℓk − 1) ≥ 4ℓ − 2 > 4(ℓ − 1)).
But k = 1 says p = n + 1, so $(n + 1) \mid (n^2 + n + 1)$, so $(n + 1) \mid n^2$, impossible for n > 0
(why? $(n + 1) \mid (n^2 - 1)$, so if $(n + 1) \mid n^2$ then $(n + 1) \mid 1$).
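A short search over small primes illustrates the conclusion, including the stronger fact that emerges from the proof, namely $p = n^2 + n + 1$. The Python sketch below uses naive trial division; the names `is_prime` and `examples`, and the bound 300, are illustrative choices.

```python
from math import isqrt

def is_prime(p):
    """Naive trial-division primality test (fine for small p)."""
    return p >= 2 and all(p % d for d in range(2, isqrt(p) + 1))

def examples(limit):
    """All pairs (n, p) with p prime below limit, n > 1, n | p - 1 and p | n^3 - 1."""
    found = []
    for p in range(2, limit):
        if not is_prime(p):
            continue
        for n in range(2, p):
            if (p - 1) % n == 0 and (n ** 3 - 1) % p == 0:
                found.append((n, p))
    return found

pairs = examples(300)
all_squares = all(isqrt(4 * p - 3) ** 2 == 4 * p - 3 for _, p in pairs)
```

For instance the pair n = 2, p = 7 appears in the list, and 4 · 7 − 3 = 25.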

12 Week 11 (October 20) — Probability
Discrete probability may be thought about along the following lines: an experiment is
performed, with a set S, the sample space, of possible observable outcomes (e.g., roll a die
and note the uppermost number when the die lands; then S would be {1, 2, 3, 4, 5, 6}). S
may be finite or countable for our purposes. An event is a subset A of S; the event occurs
if the observed outcome is one of the elements of A (e.g., if A = {2, 4, 6}, which we might
describe as the event that an even number is rolled, then we would say that A occurred if
we rolled a 4, and that it did not occur if we rolled a 5). The compound event A ∪ B is
the event that at least one of A, B occur; A ∩ B is the event that both A and B occur,
and Ac (= S \ A) is the event that A did not occur.
A probability function is a function P that assigns to each event A a real number P(A), which
is intended to measure how likely A is to occur, or, in what proportion of a very large
number of independent repetitions of the experiment does A occur. P should satisfy the
following three rules:
1. P (A) ≥ 0 always (events occur with non-negative probability);
2. P (S) = 1 (something always happens); and
3. if A and B are disjoint events (no outcomes in common) then P (A ∪ B) = P (A) +
P (B); more generally, if $A_1, A_2, \ldots$ is a countable collection of mutually disjoint
events, then
\[ P\left(\bigcup_i A_i\right) = \sum_i P(A_i). \]

Three consequences of the rules are the following relations that one would expect:
1. If A ⊆ B then P (A) ≤ P (B);
2. P (∅) = 0; and
3. P (Ac ) = 1 − P (A).
Usually one constructs the probability function in the following way: intuition/experiment/some
underlying theory suggests that a particular s ∈ S will occur a proportion $p_s$ of the time,
when the experiment is repeated many times; a reality check here is that $p_s$ should be
non-negative, and that $\sum_{s \in S} p_s = 1$. One then sets
\[ P(A) = \sum_{s \in A} p_s; \]
it is readily checked that this function satisfies all the axioms.


In the particular case when S is finite and intuition/experiment/some underlying theory
suggests that all outcomes s ∈ S are equally likely to occur, we get the classical “definition”
of the probability of an event:
\[ P(A) = \frac{|A|}{|S|}, \]
and calculating probabilities comes down to counting.
Example: I toss a coin 100 times. How likely is it that I get exactly 50 heads?
Solution: All $2^{100}$ lists of outcomes of 100 tosses are equally likely, so each one should
occur with probability $1/2^{100}$. The number of outcomes in which there are exactly 50
heads is $\binom{100}{50}$, so the required probability is
\[ \frac{\binom{100}{50}}{2^{100}}. \]
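To get a feel for the size of this number, it can be evaluated exactly and then converted to a decimal. This is a small Python sketch (assuming Python 3.8 or later for `math.comb`); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Exact probability of exactly 50 heads in 100 fair tosses.
prob_50_heads = Fraction(comb(100, 50), 2 ** 100)
approx = float(prob_50_heads)

# Sanity check: the probabilities of 0, 1, ..., 100 heads add up to 1.
total = sum(Fraction(comb(100, k), 2 ** 100) for k in range(101))
```

The value is just under 8%, which is smaller than many people guess.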
A random variable X is a function that assigns to each outcome of an experiment
a (usually real) numerical value. For example, if I toss a coin 100 times, I may not be
interested in the particular list of heads and tails I get, just in the total number of heads,
so I could define X to be the function that takes in a string of 100 heads and tails, and
returns as the numerical value the number of heads in the string. The density function of
the random variable X is the function $p_X(x) = P(X = x)$, where “X = x” is shorthand
for the event “the set of all outcomes for which X evaluates to x”. For tossing a coin 100
times and counting the number of heads, the density function is
\[ p_X(x) = \begin{cases} \binom{100}{x} 2^{-100} & \text{if } x = 0, 1, 2, \ldots, 100 \\ 0 & \text{otherwise.} \end{cases} \]

More generally, we have the following:


Binomial distribution: I toss a coin n times, and each time it comes up heads with
some probability p, independently of the other tosses. Let X be the number of heads that comes up. The random variable
X is said to have the binomial distribution with parameters n and p, and has density function
\[ p_X(x) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n-x} & \text{if } x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise.} \end{cases} \]

Note that, by the binomial theorem,
\[ \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = (p + (1 - p))^n = 1. \]

The expected value of a probability distribution/random variable is a measure of the
average value of a long sequence of readings from that distribution; it is calculated as a
weighted average:
\[ E(X) = \sum_x x\,p_X(x) \]
with reading x being given weight $p_X(x)$. For example, if X is the binomial distribution
with parameters n and p, then E(X) is
\begin{align*}
\sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n-k} &= np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1 - p)^{n-k} \\
&= np\,(p + (1 - p))^{n-1} \\
&= np,
\end{align*}
as we would expect.
It is worth knowing that expectation is a linear function:
Linearity of expectation: If a probability distribution/random variable X can be
written as the sum X1 + . . . + Xn of n (usually simpler) probability distributions/random
variables, then
E(X) = E(X1 ) + . . . + E(Xn )
Example: n boxes have labels 1 through n. n cards with numbers 1 through n written
on them (one number per card) are distributed at random among the n boxes (one card per box). On
average how many boxes get the card whose number is the same as the label on the box?
Solution: Let $X_i$ be the random variable that takes the value 1 if card i goes into box
i, and 0 otherwise; $p_{X_i}(1) = 1/n$, $p_{X_i}(0) = 1 - (1/n)$ and $p_{X_i}(x) = 0$ for all other x’s, so
$E(X_i) = 1/n$. Let X be the random variable that counts the number of boxes that get
the right card; since $X = X_1 + \ldots + X_n$ we have
\[ E(X) = E(X_1) + \ldots + E(X_n) = n(1/n) = 1 \]
(independent of n!) [This is the famous problem of derangements.]
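For small n the expectation can be computed exactly by listing all n! ways of distributing the cards, which makes the "independent of n" conclusion tangible. The following is a Python sketch; the function name `expected_fixed_points` is illustrative, and cards and boxes are numbered from 0 for convenience.

```python
from fractions import Fraction
from itertools import permutations

def expected_fixed_points(n):
    """Average number of boxes i that receive card i, over all n! arrangements."""
    perms = list(permutations(range(n)))
    matches = sum(sum(1 for box, card in enumerate(perm) if box == card)
                  for perm in perms)
    return Fraction(matches, len(perms))

averages = [expected_fixed_points(n) for n in range(1, 8)]
```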
One of the rules of probability is that for disjoint events A, B, we have P (A ∪ B) =
P (A) + P (B). If A and B have overlap, this formula overcounts by including outcomes in
A ∩ B twice, so should be corrected to
P (A ∪ B) = P (A) + P (B) − P (A ∩ B).
For three events A, B, C, a Venn diagram readily shows that
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (A ∩ B) − P (A ∩ C) − P (B ∩ C) + P (A ∩ B ∩ C).
There is a natural generalization:
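Inclusion-exclusion (also called the sieve formula):
\begin{align*}
P\left(\bigcup_{i=1}^{n} A_i\right) = {} & \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) \\
& + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \ldots \\
& + (-1)^{\ell-1} \sum_{i_1<i_2<\ldots<i_\ell} P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}) + \ldots \\
& + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n).
\end{align*}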

Inclusion-exclusion is often helpful because calculating probabilities of intersections is
easier than calculating probabilities of unions.
Example: In the problem of derangements discussed above, what is the exact probability
that no box gets the correct card?
Solution: Let $A_i$ be the event that box i gets the right card. We have $P(A_i) = 1/n$, and
more generally for $i_1 < i_2 < \ldots < i_\ell$ we have
\[ P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}) = \frac{(n - \ell)!}{n!} \]
(there are n! distributions of cards, and to make sure that boxes $i_1$ through $i_\ell$ get the
right card, we are forced to place these ℓ cards each in a predesignated box; but the
remaining n − ℓ cards can be completely freely distributed among the remaining boxes).
By inclusion-exclusion,
\[ P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{\ell=1}^{n} (-1)^{\ell-1} \binom{n}{\ell} \frac{(n - \ell)!}{n!}, \]
the binomial term coming from selecting $i_1 < i_2 < \ldots < i_\ell$. We want the probability of
none of the boxes getting the right card, which is the probability of the complement of $\bigcup_{i=1}^{n} A_i$:
\begin{align*}
P\left(\left(\bigcup_{i=1}^{n} A_i\right)^c\right) &= 1 - \sum_{\ell=1}^{n} (-1)^{\ell-1} \binom{n}{\ell} \frac{(n - \ell)!}{n!} \\
&= \sum_{\ell=0}^{n} (-1)^{\ell} \binom{n}{\ell} \frac{(n - \ell)!}{n!} \\
&= \sum_{\ell=0}^{n} \frac{(-1)^{\ell}}{\ell!}.
\end{align*}

Note that this is the sum of the first n + 1 terms in the power series of $e^x$ around 0,
evaluated at x = −1, so as n gets larger the probability of there being no box with the
right card approaches 1/e.
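The formula can be compared against a direct count for small n, and against 1/e for a moderately large n. This Python sketch does both; the function names are illustrative, and boxes and cards are numbered from 0.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial

def no_match_probability(n):
    """Inclusion-exclusion formula: sum of (-1)^l / l! for l = 0..n."""
    return sum(Fraction((-1) ** l, factorial(l)) for l in range(n + 1))

def no_match_by_enumeration(n):
    """Fraction of the n! arrangements in which no box gets its own card."""
    perms = list(permutations(range(n)))
    good = sum(1 for perm in perms if all(box != card for box, card in enumerate(perm)))
    return Fraction(good, len(perms))

formula_matches_count = all(
    no_match_probability(n) == no_match_by_enumeration(n) for n in range(1, 8)
)
gap_to_limit = abs(float(no_match_probability(20)) - 1 / e)
```

Convergence to 1/e is very fast because the series is alternating with factorially shrinking terms.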
There is a counting version of inclusion-exclusion, that is very useful to know:
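Inclusion-exclusion (counting version):
\begin{align*}
\left|\bigcup_{i=1}^{n} A_i\right| = {} & \sum_{i=1}^{n} |A_i| - \sum_{i<j} |A_i \cap A_j| \\
& + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \ldots \\
& + (-1)^{\ell-1} \sum_{i_1<i_2<\ldots<i_\ell} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}| + \ldots \\
& + (-1)^{n-1} |A_1 \cap A_2 \cap \cdots \cap A_n|.
\end{align*}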

Example: How many numbers are there, between 1 and n, that are relatively prime to n
(have no factors in common)?
Solution: Let n have prime factorization $p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$. Let $A_i$ be the set of numbers
between 1 and n that are multiples of $p_i$. We have $|A_i| = n/p_i$, and more generally for
$i_1 < i_2 < \ldots < i_\ell$ we have
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}| = \frac{n}{p_{i_1} \cdots p_{i_\ell}}. \]

We want to know $\left|\left(\bigcup_{i=1}^{k} A_i\right)^c\right|$ (complement taken inside of {1, . . . , n}), because
this is exactly the set of numbers between 1 and n that share no factors in common with n. By
inclusion-exclusion,
\begin{align*}
\left|\left(\bigcup_{i=1}^{k} A_i\right)^c\right| &= n \left(1 - \sum_{i=1}^{k} \frac{1}{p_i} + \sum_{i<j} \frac{1}{p_i p_j} - \ldots + \frac{(-1)^k}{p_1 p_2 \cdots p_k}\right) \\
&= n \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right).
\end{align*}
The function
\[ \varphi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right), \]
counting the number of numbers between 1 and n that are relatively prime to n, is called
the Euler totient function.
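The inclusion-exclusion count can be carried out literally, subset by subset, and compared with a direct gcd-based count. The Python sketch below (assuming Python 3.8 or later for `math.prod`) does this; the function names are illustrative.

```python
from itertools import combinations
from math import gcd, prod

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def coprime_count_by_sieve(n):
    """n minus |A_1 u ... u A_k|, summing (-1)^size * n / (product of chosen primes)."""
    primes = prime_factors(n)
    total = 0
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            total += (-1) ** size * (n // prod(subset))   # prod(()) == 1
    return total

def coprime_count_direct(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

agree = all(coprime_count_by_sieve(n) == coprime_count_direct(n) for n in range(1, 300))
```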
The only mention I’ll make of probability with uncountable underlying sample spaces
is this: if R is a region in the plane, then a natural model for “selecting a point from R,
all points equally likely”, is to say that for each subset R′ of R, the probability that the
selected point will be in R′ is area(R′)/area(R), that is, proportional to the area of R′.
This idea naturally extends to more general spaces.
Example: I place a small coin at a random location on a 3 foot by 5 foot table. How
likely is it that the coin is within one foot of some edge of the table?
Solution: There’s a 1 foot by 3 foot region at the center of the table, consisting of exactly
those points that are not within one foot of some edge of the table; assuming that the
coin is equally likely to be placed at any location, the probability of landing in this region
is (1 × 3)/(3 × 5) = .2, so the probability of landing within one foot of some edge is
1 − .2 = .8.
Conditional probability: Given events A, B, with P (B) ≠ 0, the conditional probability
of A given B is
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \]
(knowing that B has occurred, the only sample points that lead to A occurring are those
in A ∩ B, and the probability of these points occurring should be measured relative to
P (B), not 1).

From this definition, we get the law of total probability: if B1 , . . . , Bn form a partition
of the sample space of an experiment (disjoint from each other, union covers whole space),
and A is any event, then

P (A) = P (A|B1 )P (B1 ) + · · · + P (A|Bn )P (Bn ).

This formula can often be used to calculate the probability that the nth term in a random
sequence takes a certain value, if it is known what is the probability distribution of the
(n − 1)th term. For example, suppose $X_1, X_2, X_3, \ldots$ is some sequence of random variables,
where the exact distribution of $X_n$ depends on the distribution of $X_{n-1}$, and that all the
$X_i$’s take on only the values 0, 1 and 2. Then
\begin{align*}
P(X_n = 0) = {} & P(X_n = 0 | X_{n-1} = 0) P(X_{n-1} = 0) \\
& + P(X_n = 0 | X_{n-1} = 1) P(X_{n-1} = 1) \\
& + P(X_n = 0 | X_{n-1} = 2) P(X_{n-1} = 2),
\end{align*}
and we can extend this to the level of expectation:
\begin{align*}
E(X_n) = {} & E(X_n | X_{n-1} = 0) P(X_{n-1} = 0) \\
& + E(X_n | X_{n-1} = 1) P(X_{n-1} = 1) \\
& + E(X_n | X_{n-1} = 2) P(X_{n-1} = 2).
\end{align*}
This last formula can be thought of as calculating expectation of a term in a sequence, by
conditioning on the previous term, and will be a very useful tool in some of the problems
below.

12.1 Problems to think about for week 12


1. Shanille O’Keal shoots free throws on a basketball court. She hits the first and
misses the second, and thereafter the probability that she hits the next shot is equal
to the proportion of shots she has hit so far. What is the probability that she hits
exactly 50 of her first 100 shots?

2. A bag contains 2021 red balls and 2020 black balls. We remove two balls at a time
repeatedly and (i) discard both if they are the same color and (ii) discard the black
ball and return the red ball to the bag if their colors differ. What is the probability
that this process will terminate with exactly one red ball in the bag?

3. You have coins $C_1, C_2, \ldots, C_n$. For each k, coin $C_k$ is biased so that, when tossed,
it has probability 1/(2k + 1) of falling heads. If the n coins are tossed, what is
the probability that the number of heads is odd? Express the answer as a rational
function of n.

4. An unbiased coin (i.e., heads and tails will each occur with probability 1/2) is tossed
n times. Find a formula, in closed form (no summation), for the expected value of
|H − T |, where H is the number of heads and T is the number of tails.

5. A dart, thrown at random, hits a square target. Assuming that any two parts of
the target of equal area are equally likely to be hit, find the probability that the
point hit is nearer to the center than to any edge. Express your answer in the form
$(a\sqrt{b} + c)/d$ where a, b, c and d are integers.
6. Two real numbers x and y are chosen at random in the interval (0, 1) with respect
to the uniform distribution. What is the probability that the closest integer to x/y
is even? Express the answer in the form r + sπ, where r and s are rational numbers.
7. Let k be a positive integer. Suppose that the integers 1, 2, 3, . . . , 3k + 1 are written
down in random order. What is the probability that at no time during the process,
the sum of the integers that have been written up to that time is a positive integer
divisible by 3? Your answer should be in closed form, but may include factorials.
8. Let S be the set of 2 by 2 matrices each of whose entries is one of the 15 squares
0, 1, 4, 9, . . . , 196. Prove that if one selects more than $15^4 - 15^2 - 15 + 2$ matrices
from S, then two of those selected must commute.

12.2 Solutions to problems on probability


1. Shanille O’Keal shoots free throws on a basketball court. She hits the first and
misses the second, and thereafter the probability that she hits the next shot is equal
to the proportion of shots she has hit so far. What is the probability that she hits
exactly 50 of her first 100 shots?

Solution: Some doodling with small examples suggests the following: if Shanille
throws a total of n free throws, n ≥ 3, then for each k in the range [1, n − 1] the
probability that she makes exactly k shots is 1/(n − 1) (independent of k).
We can prove this by induction on n, with n = 3 very easy. For n > 3, we start with
the extreme case k = 1. The probability that she makes exactly one shot in total is
the probability that she misses each of shots 3 through n, which is
\[ \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{n-3}{n-2} \cdot \frac{n-2}{n-1} = \frac{1}{n-1}. \]
For k > 1, there are two (mutually exclusive) ways that she can make k shots in
total:
(a) Make k − 1 of the first n − 1, and make the last; the probability of this happening
is, by induction,
\[ \frac{1}{n-2} \cdot \frac{k-1}{n-1}, \]
or
(b) make k of the first n − 1, and miss the last; the probability of this happening
is, by induction,
\[ \frac{1}{n-2} \cdot \frac{n-1-k}{n-1}. \]
Thus the net probability of making k shots is
\[ \frac{1}{n-2} \cdot \frac{k-1}{n-1} + \frac{1}{n-2} \cdot \frac{n-1-k}{n-1} = \frac{1}{n-1}, \]
and we are done by induction.
The answer to the given question is 1/99 (n = 100).

Source: Putnam Competition 2002, Problem B1.
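The "doodling with small examples" can be automated: track the exact distribution of the number of hits shot by shot, conditioning on the previous count as described earlier. This is a Python sketch with exact rational arithmetic; the function name `hit_distribution` is illustrative.

```python
from fractions import Fraction

def hit_distribution(n):
    """Exact distribution of the number of hits after n shots (n >= 2)."""
    dist = {1: Fraction(1)}              # after 2 shots: exactly one hit
    for shots in range(2, n):            # 'shots' = number of shots taken so far
        new = {}
        for hits, prob in dist.items():
            p_hit = Fraction(hits, shots)            # proportion hit so far
            new[hits + 1] = new.get(hits + 1, 0) + prob * p_hit
            new[hits] = new.get(hits, 0) + prob * (1 - p_hit)
        dist = new
    return dist

answer = hit_distribution(100)[50]
small_case = hit_distribution(10)        # every value should equal 1/9
```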

2. A bag contains 2021 red balls and 2020 black balls. We remove two balls at a time
repeatedly and (i) discard both if they are the same color and (ii) discard the black
ball and return the red ball to the bag if their colors differ. What is the probability
that this process will terminate with exactly one red ball in the bag?

Solution: It helps to generalize to r red balls and b black balls, since as the
process goes along the number of balls of the two colors will not be equal. A little
experimentation suggests the following: if the process is started with an odd number
r ≥ 1 of red balls, and b ≥ 0 black balls, then it always ends with one red ball. We prove
this by induction on r + b. Formally: for each n ≥ 1, P (n) is the proposition “if the
process is started with an odd number r ≥ 1 of red balls, and b ≥ 0 black balls, with
r + b = n, then it always ends with one red ball”, and we prove P (n) by induction
on n.
Base case n = 1 is trivial, as is base case n = 2. For base case n = 3, we either start
with three red balls, in which case after one step we are down to one red, or we start
with one red and two blacks. In this case, one third of the time we first pick the two
blacks, and we are down to one red, while two thirds of the time we first pick a black
and a red, and we are down to one red and one black, leaving us with one red after
step two.
Now consider n ≥ 4, and start with r red balls and b black balls, r odd and r + b = n.
If on the first step we pick two reds, then we are left with r − 2 red balls and b black
balls. Note that r − 2 is odd and (r − 2) + b = n − 2, so by induction in this case we
always end with one red. If on the first step we pick two blacks, then we are left
with r red balls and b − 2 black balls. Note that r is odd and r + (b − 2) = n − 2, so
by induction in this case we always end with one red. Finally, if on the first step we
pick a black and a red, then we are left with r red balls and b − 1 black balls. Note
that r is odd and r + (b − 1) = n − 1, so by induction in this case we always end
with one red. This completes the induction.
Since 2021 is odd, the probability of ending with one red, starting with 2021 red
balls and 2020 black balls is 1.

Source: I heard this problem from David Cook.
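Because the claim is about every possible run of the process, not just typical ones, it can be checked for small bags by exploring all reachable end states. The Python sketch below does this recursively; the function name `possible_endings` is illustrative, and the check is restricted to small r and b (an assumption made to stay well within Python's default recursion depth).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def possible_endings(r, b):
    """Set of final (red, black) states reachable from r red and b black balls."""
    if r + b < 2:                       # fewer than two balls: process has stopped
        return frozenset({(r, b)})
    ends = set()
    if r >= 2:                          # drew two reds: discard both
        ends |= possible_endings(r - 2, b)
    if b >= 2:                          # drew two blacks: discard both
        ends |= possible_endings(r, b - 2)
    if r >= 1 and b >= 1:               # drew one of each: discard the black
        ends |= possible_endings(r, b - 1)
    return frozenset(ends)

always_one_red = all(
    possible_endings(r, b) == frozenset({(1, 0)})
    for r in range(1, 20, 2)            # odd numbers of red balls
    for b in range(0, 20)
)
```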

3. You have coins $C_1, C_2, \ldots, C_n$. For each k, coin $C_k$ is biased so that, when tossed,
it has probability 1/(2k + 1) of falling heads. If the n coins are tossed, what is
the probability that the number of heads is odd? Express the answer as a rational
function of n.

Solution: Let $p_n$ be the required probability. We have $p_1 = 1/3$. For $n \ge 2$ we can
express $p_n$ in terms of $p_{n-1}$ as follows: we get an odd number of heads either by
getting an odd number of heads among the first $n - 1$ (probability $p_{n-1}$) and a tail
on the $n$th coin (probability $2n/(2n + 1)$), or by getting an even number of heads
among the first $n - 1$ (probability $1 - p_{n-1}$) and a head on the $n$th coin (probability
$1/(2n + 1)$). This leads to the recurrence
\[ p_n = \frac{2n\,p_{n-1}}{2n+1} + \frac{1 - p_{n-1}}{2n+1} = \frac{1 + (2n-1)p_{n-1}}{2n+1}, \]
valid for $n \ge 2$. We claim that the solution to this recurrence is $p_n = n/(2n + 1)$.
We prove this claim by induction on $n$. The base case $n = 1$ is clear. For $n \ge 2$ we
have, using the inductive hypothesis in the second equality,
\[ p_n = \frac{1 + (2n-1)p_{n-1}}{2n+1} = \frac{1 + (2n-1)\frac{n-1}{2(n-1)+1}}{2n+1} = \frac{n}{2n+1}, \]
completing the induction.

Source: Putnam Competition 2001, A2.
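
As a quick check of the induction, the recurrence can be iterated in exact rational arithmetic and compared with the claimed closed form. This is only a sketch; the function name `odd_heads_probability` is ours.

```python
from fractions import Fraction

def odd_heads_probability(n):
    """Iterate p_k = (1 + (2k - 1) p_{k-1}) / (2k + 1), starting from p_1 = 1/3."""
    p = Fraction(1, 3)
    for k in range(2, n + 1):
        p = (1 + (2 * k - 1) * p) / (2 * k + 1)
    return p

# Compare with the claimed solution n / (2n + 1) for the first fifty values of n.
matches = [odd_heads_probability(n) == Fraction(n, 2 * n + 1) for n in range(1, 51)]
```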

4. An unbiased coin (i.e., heads and tails will each occur with probability 1/2) is tossed
n times. Find a formula, in closed form (no summation), for the expected value of
|H − T |, where H is the number of heads and T is the number of tails.

Solution: This was on the Putnam Competition in 1974, Problem A4. Here’s a
solution from https://mks.mff.cuni.cz/kalva/putnam/putn74.html:
If $i$ heads are tossed, then there are $n - i$ tails, so $H - T = 2i - n$, and
\[ E(|H - T|) = \frac{1}{2^n} \sum_{i=0}^{n} |2i - n| \binom{n}{i}. \]
For $i = 0, \ldots, [n/2]$, the summand is $(n - 2i)\binom{n}{i}$, which is equal to
$(2(n - i) - n)\binom{n}{n-i}$, the summand for $n - i$; so
\[ E(|H - T|) = \frac{1}{2^{n-1}} \sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} \]
(this clearly works for odd $n$, since as $i$ runs from 0 to $[n/2]$, $n - i$ runs from $n$ to
$[n/2] + 1$; it also works for even $n$, since in this case the one term that contributes in
both ranges, $i = [n/2]$, contributes 0).
We’ll show
\[ \sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} = n \binom{n-1}{[n/2]}, \]
which leads to
\[ E(|H - T|) = \frac{n \binom{n-1}{[n/2]}}{2^{n-1}}. \]
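We’ll use the committee-chair identity,
\[ i \binom{n}{i} = n \binom{n-1}{i-1}, \]
and
\[ \sum_{i=0}^{[n/2]} \binom{n}{i} = \begin{cases} 2^{n-1} & \text{if $n$ odd} \\ 2^{n-1} + \frac{1}{2}\binom{n}{[n/2]} & \text{if $n$ even} \end{cases} \]
(to see this, use $\binom{n}{r} = \binom{n}{n-r}$ and $\sum_{i=0}^{n} \binom{n}{i} = 2^n$).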
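For odd $n$, we have
\begin{align*}
\sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} &= n 2^{n-1} - 2 \sum_{i=0}^{[n/2]} i \binom{n}{i} \\
&= n 2^{n-1} - 2n \sum_{i=1}^{[n/2]} \binom{n-1}{i-1} \\
&= n 2^{n-1} - 2n \sum_{i=0}^{[n/2]-1} \binom{n-1}{i} \\
&= n 2^{n-1} - n \left( 2^{n-1} - \binom{n-1}{[n/2]} \right) \\
&= n \binom{n-1}{[n/2]},
\end{align*}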

as claimed. The case of even n is dealt with similarly.
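
The closed form is easy to test against the definition for small $n$, including the even case that is left to the reader above. A minimal sketch in exact arithmetic (the function names are ours):

```python
from fractions import Fraction
from math import comb

def expected_abs_difference(n):
    """E|H - T| for n fair tosses, computed straight from the definition."""
    total = sum(abs(2 * i - n) * comb(n, i) for i in range(n + 1))
    return Fraction(total, 2 ** n)

def closed_form(n):
    """The claimed value n * C(n - 1, [n/2]) / 2^(n - 1)."""
    return Fraction(n * comb(n - 1, n // 2), 2 ** (n - 1))

agreement = [expected_abs_difference(n) == closed_form(n) for n in range(1, 60)]
```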

5. A dart, thrown at random, hits a square target. Assuming that any two parts of
the target of equal area are equally likely to be hit, find the probability that the
point hit is nearer to the center than to any edge. Express your answer in the form
$(a\sqrt{b} + c)/d$ where $a$, $b$, $c$ and $d$ are integers.

Solution: Place the dartboard on the x-y plane, with vertices at (0, 0), (0, 2), (2, 2)
and (2, 0), so the center is at (1, 1). We want to compute the area of the set of points
inside this square which are closer to (1, 1) than to any of the lines x = 0, x = 2, y = 0, y = 2. We’ll just
consider the triangle T with vertices (0, 0), (1, 0), (1, 1); by symmetry, the part of the
desired region lying in T has one eighth of the desired area.
For a point (x, y) in T, the distance to (1, 1) is $\sqrt{(x-1)^2 + (y-1)^2}$, and the distance
to the nearest of the four edges is just y. So the curve $\sqrt{(x-1)^2 + (y-1)^2} = y$,
or
\[ y = \frac{x^2 - 2x + 2}{2}, \]
cuts T into two regions, one (containing (1, 1)) being the points that are closer
to (1, 1) than to the nearest edge. This curve hits the line x = y
at $(2 - \sqrt{2}, 2 - \sqrt{2})$. So the desired area inside T is the total area of T (which
is 1/2) minus the area bounded by x = y from (0, 0) to $(2 - \sqrt{2}, 2 - \sqrt{2})$, then
$y = (x^2 - 2x + 2)/2$ to (1, 1/2), then x = 1 to (1, 0), then the x-axis back to
(0, 0). This area is the area of the triangle with vertices (0, 0), $(2 - \sqrt{2}, 2 - \sqrt{2})$, and
$(2 - \sqrt{2}, 0)$ (which is $(2 - \sqrt{2})^2/2$), plus
\[ \int_{2-\sqrt{2}}^{1} \frac{x^2 - 2x + 2}{2}\,dx = \frac{1}{3}(4\sqrt{2} - 5); \]
grand total $(1/3)(4 - 2\sqrt{2})$. It follows that the desired area inside T is
\[ \frac{1}{2} - \frac{1}{3}(4 - 2\sqrt{2}) = \frac{1}{6}(4\sqrt{2} - 5), \]
and so the total desired area is eight times this, or $(4/3)(4\sqrt{2} - 5)$.
Since the total area of the square is 4, the desired probability is thus
\[ \frac{1}{3}(4\sqrt{2} - 5) \approx .218951. \]

Source: Putnam Competition 1989, Problem B1. Note that this is a rare example
of a Putnam question with a typo: the correct (and officially sanctioned) answer is
as given above, but notice that c = −5, which is not a positive integer; and when the
problem appeared on the 1989 Putnam Competition, it ended with “Express your
answer in the form $(a\sqrt{b} + c)/d$ where a, b, c and d are positive integers” (emphasis
mine).
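
The value $(4\sqrt{2} - 5)/3$ can be checked numerically without redoing the integral. The sketch below (our own check, not part of the official solution) lays a fine grid over the square $[0, 2]^2$ and counts the cell midpoints that are nearer to the center than to any edge; the grid size 600 is an arbitrary choice.

```python
from math import hypot, sqrt

def grid_estimate(m):
    """Fraction of the midpoints of an m-by-m grid on [0, 2]^2 that are
    nearer to the center (1, 1) than to any edge of the square."""
    h = 2.0 / m
    hits = 0
    for i in range(m):
        x = (i + 0.5) * h
        for j in range(m):
            y = (j + 0.5) * h
            to_center = hypot(x - 1.0, y - 1.0)
            to_edge = min(x, 2.0 - x, y, 2.0 - y)
            if to_center < to_edge:
                hits += 1
    return hits / (m * m)

exact = (4 * sqrt(2) - 5) / 3
estimate = grid_estimate(600)
```

The grid estimate only approximates the area, so it should agree with `exact` to a few decimal places rather than exactly.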
6. Two real numbers x and y are chosen at random in the interval (0, 1) with respect
to the uniform distribution. What is the probability that the closest integer to x/y
is even? Express the answer in the form r + sπ, where r and s are rational numbers.

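Solution: The closest integer to $x/y$ is 0 if $2x < y$. It is $2n$ (for $n > 0$) if
$2x/(4n + 1) < y < 2x/(4n - 1)$. (We can ignore $y/x = 2/(2m + 1)$ since it has
probability zero.) Inside the unit square the first region is a triangle of area 1/4, and for
each $n > 0$ the second is a triangle of area $1/(4n - 1) - 1/(4n + 1)$.
Hence the required probability is $p = 1/4 + (1/3 - 1/5) + (1/7 - 1/9) + \ldots$. But
now recall that $\pi/4 = 1 - 1/3 + 1/5 - 1/7 + \ldots$, so $p = 5/4 - \pi/4$.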

Source: Putnam competition 1993, problem B3.
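
The last step, summing the series, is easy to confirm numerically. A small sketch (the function name and the number of terms are our choices):

```python
from math import pi

def partial_probability(terms):
    """1/4 plus the first `terms` terms of sum_{n >= 1} (1/(4n - 1) - 1/(4n + 1))."""
    total = 0.25
    for n in range(1, terms + 1):
        total += 1.0 / (4 * n - 1) - 1.0 / (4 * n + 1)
    return total

approx = partial_probability(200_000)
claimed = 5 / 4 - pi / 4
```

The tail of the series after $N$ terms is roughly $1/(8N)$, so with 200,000 terms the partial sum should agree with $5/4 - \pi/4$ to about six decimal places.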

7. Let k be a positive integer. Suppose that the integers 1, 2, 3, . . . , 3k + 1 are written
down in random order. What is the probability that at no time during the process,
the sum of the integers that have been written up to that time is a positive integer
divisible by 3? Your answer should be in closed form, but may include factorials.

Solution: This was a Putnam Competition problem, and the official solution,
published in the American Mathematical Monthly, is very nicely presented, so I
reproduce it here verbatim:
“The number of ways to write down 1, 2, 3, . . . , 3k + 1 in random order is (3k + 1)!, so
we want to count the number of ways in which none of the “partial sums” is divisible
by 3. First, consider the integers modulo 3 : 1, 2, 0, 1, 2, 0, . . . , 1, 2, 0, 1. To write
these with none of the partial sums divisible by 3, we must start with a 1 or a 2.
After that, we can include or omit 0’s at will without affecting whether any of the
partial sums are divisible by 3, so suppose [initially] we omit all 0’s. The remaining
sequence of 1’s and 2’s must then be of the form

1, 1, 2, 1, 2, 1, 2, . . .

or
2, 2, 1, 2, 1, 2, 1, . . .
(once you start, the rest of the sequence is forced by the condition that no partial
sum is divisible by 3). However, a sequence of the form 2, 2, 1, 2, 1, 2, 1, . . . has one
more 2 than 1, and we need to have one more 1 than 2. So the only possibility for
our sequence modulo 3, once the 0’s are omitted, is 1, 1, 2, 1, 2, 1, 2, . . .. There are
2k + 1 numbers in this sequence, and the k 0’s can be returned to the sequence
arbitrarily except at the beginning. So the number of ways to form the complete
sequence modulo 3 equals the number of ways to distribute the k identical 0’s over
2k + 1 boxes (the “slots” after the 1’s and 2’s), which by a standard “stars and bars”
argument is $\binom{3k}{k}$. Once this is done, there are k! ways to replace the k 0’s in the
sequence modulo 3 by the actual integers 3, 6, . . . , 3k. Also, there are k! ways to
“reconstitute” the 2’s and (k + 1)! ways for the 1’s. So the answer is
\[ \frac{\binom{3k}{k}\, k!\, k!\, (k+1)!}{(3k+1)!}.\text{”} \]

Source: Putnam competition 2007, problem A3.
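
For small k the formula can be confirmed by brute force over all orderings. This is only a sanity check of the quoted answer (the function names are ours), and it is limited to k = 1 and k = 2 because the number of orderings grows like (3k + 1)!.

```python
from fractions import Fraction
from itertools import accumulate, permutations
from math import comb, factorial

def brute_force(k):
    """Probability that no partial sum of a random ordering of 1..3k+1 is divisible by 3."""
    n = 3 * k + 1
    good = 0
    for order in permutations(range(1, n + 1)):
        if all(s % 3 != 0 for s in accumulate(order)):
            good += 1
    return Fraction(good, factorial(n))

def formula(k):
    """C(3k, k) * k! * k! * (k + 1)! / (3k + 1)!"""
    numerator = comb(3 * k, k) * factorial(k) ** 2 * factorial(k + 1)
    return Fraction(numerator, factorial(3 * k + 1))

small_cases = {k: (brute_force(k), formula(k)) for k in (1, 2)}
```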

8. Let S be the set of 2 by 2 matrices each of whose entries is one of the 15 squares
0, 1, 4, 9, . . . , 196. Prove that if one selects more than $15^4 - 15^2 - 15 + 2$ matrices
from S, then two of those selected must commute.

Solution: (Putnam Competition 1990, problem B3) Let A be the set of diagonal
matrices from S (matrices with 0’s off the main diagonal), and B the set of matrices
with all four entries the same. Since $|S| = 15^4$, $|A| = 15^2$, $|B| = 15$ and $|A \cap B| = 1$,
inclusion-exclusion tells us that

\[ |(A \cup B)^c| = 15^4 - 15^2 - 15 + 1. \]

Let us select at least $15^4 - 15^2 - 15 + 3$ elements from S. If the all-zero matrix (the
unique element of $A \cap B$) is among those we select, then there is a pair that commutes,
since the all-zero matrix commutes with everything. So let’s assume that the all-zero
matrix was not selected.
If two diagonal matrices are selected, or two matrices with all four entries the same
are selected, then there is a pair that commutes, since any two diagonal matrices
commute, and any two matrices with all four entries the same commute. So let’s
assume that at most one diagonal matrix was selected, and at most one matrix with
all four entries the same.
This leaves at least $15^4 - 15^2 - 15 + 1$ matrices selected from $(A \cup B)^c$; in other
words, all matrices from $(A \cup B)^c$. The problem is completed by exhibiting two
matrices from $(A \cup B)^c$ that commute; the pair [1, 1; 0, 1] and [1, 4; 0, 1] works.
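
Both the count and the commuting pair are small enough to verify directly. A minimal sketch (the helper `matmul2` and the variable names are ours):

```python
from itertools import product

def matmul2(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

squares = [i * i for i in range(15)]          # 0, 1, 4, ..., 196

# Count matrices [a, b; c, d] over the squares that are neither diagonal nor constant.
outside = sum(
    1
    for a, b, c, d in product(squares, repeat=4)
    if not (b == 0 and c == 0) and not (a == b == c == d)
)

M = ((1, 1), (0, 1))
N = ((1, 4), (0, 1))
commute = matmul2(M, N) == matmul2(N, M)
```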

13 Week 12 (October 27) — Polynomials
This week’s problems are all about polynomials, which come up in virtually every Putnam
competition.

Things to know about polynomials


• Fundamental Theorem of Algebra: Every polynomial $p(x) = x^n + a_1 x^{n-1} +
a_2 x^{n-2} + \ldots + a_{n-1} x + a_n$, with real or complex coefficients, has a root in the complex
numbers, that is, there is $c \in \mathbb{C}$ such that $p(c) = 0$.

• Factorization: In fact, every polynomial $p(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_{n-1} x + a_n$,
with real or complex coefficients, has exactly n roots, in the sense that there is a
vector $(c_1, \ldots, c_n)$ (perhaps with some repetitions) such that

\[ p(x) = (x - c_1)(x - c_2) \cdots (x - c_n). \]

If a $c$ appears in this vector exactly $k$ times, it is called a root or zero of multiplicity
$k$. The next bullet point gives a very useful consequence of this.

• Two different polynomials of the same degree can’t agree too often: If
$p(x)$ and $q(x)$ (over $\mathbb{R}$ or $\mathbb{C}$) both have degree at most $n$, and there are $n + 1$ distinct
numbers $x_1, \ldots, x_{n+1}$ such that $p(x_i) = q(x_i)$ for $i = 1, \ldots, n + 1$, then $p(x)$ and $q(x)$
are equal for all $x$. [Because then $p(x) - q(x)$ is a polynomial of degree at most $n$
with at least $n + 1$ roots, so must be identically zero].

• Complex conjugates: If the coefficients of $p(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots +
a_{n-1} x + a_n$ are all real, then the complex roots occur in complex-conjugate pairs: if
$s + it$ (with $s$, $t$ real, and $i = \sqrt{-1}$) is a root, then $s - it$ is also a root.

• Coefficients in terms of roots: If $(c_1, \ldots, c_n)$ is the vector of roots of a polynomial
$p(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_{n-1} x + a_n$ (over $\mathbb{R}$ or $\mathbb{C}$), then each of the coefficients
can be expressed simply in terms of the roots: $a_1$ is the negative of the sum of the
$c_i$’s; $a_2$ is the sum of the products of the $c_i$’s, taken two at a time; $a_3$ is the negative
of the sum of the products of the $c_i$’s, taken three at a time, etc. Concisely:
\[ a_k = (-1)^k \sum_{A \subseteq \{1,\ldots,n\},\ |A| = k}\ \prod_{i \in A} c_i. \]

• Elementary symmetric polynomials: The kth elementary symmetric polynomial
in variables $x_1, \ldots, x_n$ is
\[ \sigma_k = \sum_{A \subseteq \{1,\ldots,n\},\ |A| = k}\ \prod_{i \in A} x_i \]
(these polynomials have already appeared in the last bullet point). A polynomial
$p(x_1, \ldots, x_n)$ in n variables is symmetric if for every permutation $\pi$ of $\{1, \ldots, n\}$, we
have
\[ p(x_1, \ldots, x_n) \equiv p(x_{\pi(1)}, \ldots, x_{\pi(n)}). \]
(For example, $x_1^2 + x_2^2 + x_3^2 + x_4^2$ is symmetric, but $x_1^2 + x_2^2 + x_3^2 + x_1 x_4$ is not.)
Every symmetric polynomial in variables $x_1, \ldots, x_n$ can be expressed as a polynomial
in the $\sigma_k$’s (for example, $x_1^2 + x_2^2 = \sigma_1^2 - 2\sigma_2$ when $n = 2$).

• Some special values tell things about the coefficients: (Rather obvious, but
worth keeping in mind) If $p(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_{n-1} x + a_n$, then

\begin{align*}
p(0) &= a_n \\
p(1) &= a_0 + a_1 + a_2 + \ldots + a_n \\
p(-1) &= a_n - a_{n-1} + a_{n-2} - a_{n-3} + \ldots + (-1)^n a_0.
\end{align*}

The derivative of a polynomial also gives information about the coefficients, in a
probabilistic setting. Suppose that B is a set of objects, which have various sizes,
from 0 to n. Let $b_i$ be the number of objects of size i, and form the polynomial

\[ B(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n \]

(this is often called the generating polynomial of B, by size). We have

– $B(1) = |B|$, and
– $B'(1)/B(1)$ = expected size of an object chosen uniformly and randomly from B

(the latter since the probability that a randomly chosen object from B has size i is
$b_i/|B| = b_i/B(1)$, so the expected size is $\sum_i i b_i / B(1) = B'(1)/B(1)$). One can also
express the variance of the size of an object chosen
uniformly and randomly from B, in terms of $B(1)$, $B'(1)$ and $B''(1)$, but I’ll leave
that as an exercise!

• Intermediate value theorem: If p(x) is a polynomial with real coefficients (or
in fact any continuous real function) such that for some a < b, p(a) and p(b) have
different signs, then there is some c, a < c < b, with p(c) = 0.

• Lagrange interpolation: Suppose that p(x) is a real polynomial of degree at most n, whose
graph passes through the points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$, with the $x_i$ distinct. Then we can write
\[ p(x) = \sum_{i=0}^{n} y_i \prod_{j \ne i} \frac{x - x_j}{x_i - x_j}. \]

• The Rational Roots theorem: Suppose that p(x) is a polynomial of degree n
with integer coefficients, and that it has a rational root a/b with a and b having no
common factors. Then the leading coefficient of p(x) (the coefficient of $x^n$) is a
multiple of b, and the constant term is a multiple of a. An immediate corollary of
this is that if p(x) is a monic polynomial (integer coefficients, leading coefficient 1),
then any rational root must in fact be an integer; conversely, if a real number x is a
root of a monic polynomial but is not an integer, it must be irrational (for example,
$\sqrt{2}$ is a root of the monic polynomial $x^2 - 2$, but is clearly not an integer, so it must be irrational)!

• Gauss’ lemma: Here is a weak form of Gauss’ lemma, but one that is very useful: if
c is an integer root of a monic polynomial p(x) (integer coefficients, leading coefficient
1), then p(x) factors as (x − c)q(x), where q(x) is also a monic polynomial (the
surprise being not that q(x) has leading coefficient 1, but that it has all integer
coefficients).

• One more fact about integer polynomials: Let p(x) be a (not necessarily monic)
polynomial with integer coefficients. For any integers a, b,

(a − b)|(p(a) − p(b)).

(So also,
(p(a) − p(b))|(p(p(a)) − p(p(b))),
etc.)
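
Several of the facts above are easy to experiment with on a computer. As one example, here is a minimal sketch of the Lagrange interpolation formula in exact rational arithmetic; the function name `lagrange_eval` and the sample cubic are our own choices for illustration.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree at most n passing through
    the n + 1 given points (x_i, y_i); the x_i are assumed to be distinct."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def sample_cubic(x):
    return x ** 3 - 2 * x + 5

# Four points determine a cubic, so interpolation should reproduce it everywhere.
points = [(t, sample_cubic(t)) for t in (-1, 0, 2, 3)]
recovered = [lagrange_eval(points, t) == sample_cubic(t) for t in range(-5, 11)]
```

This also illustrates the "can't agree too often" bullet: the interpolating polynomial and the cubic agree at four points and both have degree at most 3, so they are the same polynomial.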

13.1 Problems to think about for week 13
1. (Verifying the last fact from the introduction) If p(x) is a polynomial with integer
coefficients, and a and b are distinct integers, verify that
\[ \frac{p(b) - p(a)}{b - a} \]
is always an integer.

2. For which real values of p and q are the roots of the polynomial $x^3 - px^2 + 11x - q$
three consecutive integers? Give the roots in these cases.

3. Let p(x) be a polynomial with integer coefficients, for which p(0) and p(1) are odd.
Can p(x) have any integer zeroes?

4. (a) Determine all polynomials p(x) such that p(0) = 0 and p(x + 1) = p(x) + 1 for
all x.
(b) Determine all polynomials p(x) such that p(0) = 0 and $p(x^2 + 1) = (p(x))^2 + 1$
for all x.

5. Does there exist a non-zero polynomial f (x) for which xf (x − 1) = (x + 1)f (x) for
all x?

6. Determine, with proof, all positive integers n for which there is a polynomial p(x) of
degree n satisfying the following three conditions:

(a) p(k) = k for k = 1, 2, . . . , n,


(b) p(0) is an integer, and
(c) p(−1) = 2020.

7. Let $p(x) = x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0$ be a polynomial with integer coefficients.
Suppose that there exist four distinct integers a, b, c, d with p(a) = p(b) = p(c) =
p(d) = 5. Prove that there is no integer k with p(k) = 8.

8. Is there an infinite sequence $a_0, a_1, a_2, \ldots$ of nonzero real numbers such that for
$n = 1, 2, 3, \ldots$ the polynomial

\[ p_n(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n \]

has exactly n distinct real roots?

9. A Boolean function is a function $f : \{0, 1\}^n \to \{0, 1\}$. A multilinear polynomial
$p : \mathbb{R}^n \to \mathbb{R}$ is a (real) linear combination of linear monomials, that is, expressions of the
form $\prod_{i \in S} x_i$ where S is a subset of $\{1, \ldots, n\}$. Show that for every Boolean function
$f$ there is a unique multilinear polynomial $p_f$ that agrees with $f$ on $\{0, 1\}^n$.

13.2 Solutions to problems on polynomials
1. (Verifying the last fact from the introduction) If p(x) is a polynomial with integer
coefficients, and a and b are distinct integers, verify that
\[ \frac{p(b) - p(a)}{b - a} \]
is always an integer.

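Solution: We’ll use the useful factorization formula

\[ a^k - b^k = (a - b)\left( b^{k-1} + a b^{k-2} + \cdots + a^{k-2} b + a^{k-1} \right). \]

Write
\[ p(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n, \]
with each $c_i$ an integer, and $c_n \ne 0$. We have
\begin{align*}
p(b) - p(a) &= \sum_{k=0}^{n} c_k b^k - \sum_{k=0}^{n} c_k a^k \\
&= \sum_{k=0}^{n} c_k \left( b^k - a^k \right) \\
&= \sum_{k=1}^{n} c_k (b - a) \left( b^{k-1} + a b^{k-2} + \cdots + a^{k-2} b + a^{k-1} \right) \\
&= (b - a) \sum_{k=1}^{n} c_k \left( b^{k-1} + a b^{k-2} + \cdots + a^{k-2} b + a^{k-1} \right)
\end{align*}
(the $k = 0$ term in the second line is zero, so the later sums start at $k = 1$).
Since $\sum_{k=1}^{n} c_k \left( b^{k-1} + a b^{k-2} + \cdots + a^{k-2} b + a^{k-1} \right)$ is an integer, we conclude that
$(p(b) - p(a))/(b - a)$ is an integer.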
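
A quick experiment makes the statement concrete. The sketch below picks one arbitrary integer polynomial (the coefficients are ours, chosen only for illustration) and checks that b − a divides p(b) − p(a) over a small range of integers.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs[k] is the coefficient of x**k."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

# An arbitrary illustrative polynomial: 7 - 3x + 2x^3 + 5x^4.
coeffs = [7, -3, 0, 2, 5]

divisible = [
    (evaluate(coeffs, b) - evaluate(coeffs, a)) % (b - a) == 0
    for a in range(-6, 7)
    for b in range(-6, 7)
    if a != b
]
```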

2. For which real values of p and q are the roots of the polynomial $x^3 - px^2 + 11x - q$
three consecutive integers? Give the roots in these cases.

Solution: The solution is: either p = 6, q = 6 (in which case the roots are 1, 2, 3), or
p = −6, q = −6 (in which case the roots are −1, −2, −3).
A polynomial with roots being three consecutive integers is of the form
\[ (x - (a - 1))(x - a)(x - (a + 1)) = x^3 - 3a x^2 + (3a^2 - 1)x - (a^3 - a) \]
for some integer a. So, matching coefficients, we must have $3a^2 - 1 = 11$, or $a = \pm 2$.
When a = 2 we get roots 1, 2, 3 and p = 6, q = 6; when a = −2 we get roots
−3, −2, −1 and p = −6, q = −6.

Source: From a Harvey Mudd Putnam prep class.
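
The same answer drops out of a direct search using the "coefficients in terms of roots" fact from the introduction. This is a sketch only; the search window of −50 to 50 for the middle root is an arbitrary assumption (the equation $3a^2 - 1 = 11$ shows nothing outside it can work).

```python
def consecutive_root_solutions(search=range(-50, 51)):
    """Find (roots, p, q) such that x^3 - p x^2 + 11 x - q has roots a-1, a, a+1."""
    found = []
    for a in search:
        r1, r2, r3 = a - 1, a, a + 1
        pairwise = r1 * r2 + r1 * r3 + r2 * r3   # coefficient of x
        if pairwise == 11:
            p = r1 + r2 + r3                     # sum of the roots
            q = r1 * r2 * r3                     # product of the roots
            found.append(((r1, r2, r3), p, q))
    return found

solutions = consecutive_root_solutions()
```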

3. Let p(x) be a polynomial with integer coefficients, for which p(0) and p(1) are odd.
Can p(x) have any integer zeroes?

Solution: No. If k is an even integer we have $p(k) \equiv p(0) \equiv 1 \pmod{2}$ (Why?
Suppose $a \equiv b \pmod{m}$. Then $a^\ell \equiv b^\ell \pmod{m}$ for any $\ell$, so $c a^\ell \equiv c b^\ell \pmod{m}$
for any integer $c$, so (summing), $p(a) \equiv p(b) \pmod{m}$ for any polynomial $p$ with integer coefficients). By the same
token, if k is odd then $p(k) \equiv p(1) \equiv 1 \pmod{2}$. So we never have $p(k) \equiv 0 \pmod{2}$,
and never have p(k) = 0.

Source: From a Northwestern Putnam prep class.

4. (a) Determine all polynomials p(x) such that p(0) = 0 and p(x + 1) = p(x) + 1 for
all x.
Solution: The only such polynomial is the identity polynomial.
By induction, p(x) = x for all positive integers x, so p(x) − x is a polynomial
with infinitely many zeros, so must be identically 0. We conclude that p(x) = x
is the only possible polynomial satisfying the given conditions.

(b) Determine all polynomials p(x) such that p(0) = 0 and $p(x^2 + 1) = (p(x))^2 + 1$
for all x.
Solution: The only such polynomial is the identity polynomial.
We have $p(0) = 0$, $p(1) = p(0)^2 + 1 = 1$, $p(2) = p(1)^2 + 1 = 2$, $p(5) = p(2)^2 + 1 = 5$,
$p(26) = p(5)^2 + 1 = 26$ and in general, by induction, if the sequence $(a_n)$ is
defined recursively by $a_0 = 0$ and $a_{n+1} = a_n^2 + 1$, then $p(a_n) = a_n$. Since
the sequence $(a_n)$ is strictly increasing, we find that there are infinitely many
distinct values x for which p(x) = x; as in the last part, this tells us that
p(x) = x is the only possible polynomial satisfying the given conditions.

Source: Modified from Putnam competition, 1971 problem A2.

5. Does there exist a non-zero polynomial f (x) for which xf (x − 1) = (x + 1)f (x) for
all x?

Solution: No. For positive integer n, taking x = n in the equation above, we have
\[ f(n) = \frac{n}{n+1} f(n-1) = \frac{n-1}{n+1} f(n-2) = \ldots = 0 \cdot f(-1) = 0. \]
Hence f(x) has infinitely many zeros, and must be identically zero; $f(x) \equiv 0$.

Source: From a Northwestern Putnam prep class.

6. Determine, with proof, all positive integers n for which there is a polynomial p(x) of
degree n satisfying the following three conditions:

(a) p(k) = k for k = 1, 2, . . . , n,
(b) p(0) is an integer, and
(c) p(−1) = 2020.

Solution/source: The possible values of n are 42, 46 and 2020.

This was modified from a UIUC mock Putnam; see
http://www.math.illinois.edu/~hildebr/putnam/problems/mock12sol.pdf for a solution
(replace “2012” everywhere in that solution with “2020”, and “2013” with “2021”).
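
Here is one route to this answer, given as a sketch and not necessarily the argument in the linked solution. Condition (a) says that p(x) − x vanishes at 1, 2, . . . , n, so $p(x) = x + c(x - 1)(x - 2) \cdots (x - n)$ for some constant c. Then $p(0) = (-1)^n c\, n!$ and $p(-1) = -1 + (-1)^n c\, (n + 1)!$, so p(−1) = 2020 forces p(0) = 2021/(n + 1). So p(0) is an integer exactly when n + 1 divides 2021 = 43 · 47. The short search below lists the resulting values of n (the variable names are ours).

```python
TARGET = 2020 + 1   # p(-1) + 1, which must equal (n + 1) * p(0)

# n is admissible exactly when n + 1 divides TARGET; n + 1 cannot exceed TARGET.
admissible = [n for n in range(1, TARGET) if TARGET % (n + 1) == 0]
```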

7. Let $p(x) = x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0$ be a polynomial with integer coefficients.
Suppose that there exist four distinct integers a, b, c, d with p(a) = p(b) = p(c) =
p(d) = 5. Prove that there is no integer k with p(k) = 8.

Solution: Set q(x) = p(x) − 5. We have q(a) = q(b) = q(c) = q(d) = 0 and so
q(x) = r(x)(x − a)(x − b)(x − c)(x − d), where r(x) is some rational polynomial; but
in fact (by Gauss’ Lemma), r(x) is a polynomial over integers.

Aside: Why is r(x) above a polynomial over the integers? Suppose $x^n + a_{n-1} x^{n-1} +
\ldots + a_1 x + a_0$ (call this expression 1), with all $a_i$ integers, factors as $(x - c)(x^{n-1} +
r_{n-2} x^{n-2} + \ldots + r_1 x + r_0)$ (call this expression 2), where c is an integer. Then
necessarily the $r_i$ are rational numbers; but in fact, we can show that they are
all integers. This is obvious when c = 0, so assume $c \ne 0$. Expanding out the
factorization and equating coefficients, we get

\begin{align*}
a_{n-1} &= r_{n-2} - c \\
a_{n-2} &= r_{n-3} - c r_{n-2} \\
a_{n-3} &= r_{n-4} - c r_{n-3} \\
&\ \,\vdots \\
a_2 &= r_1 - c r_2 \\
a_1 &= r_0 - c r_1 \\
a_0 &= -c r_0.
\end{align*}

Now evaluating both expression 1 and expression 2 at x = c, we get

\[ c^n + a_{n-1} c^{n-1} + \ldots + a_2 c^2 + a_1 c + a_0 = 0. \]

Plugging in $a_0 = -c r_0$ yields

\[ c \left( c^{n-1} + a_{n-1} c^{n-2} + \ldots + a_2 c + a_1 - r_0 \right) = 0. \]

Utilizing $c \ne 0$, we conclude that

\[ c^{n-1} + a_{n-1} c^{n-2} + \ldots + a_2 c + a_1 = r_0 \]

and so, since the left-hand side is clearly an integer, so is the right-hand side, $r_0$.
Now plugging $a_1 = r_0 - c r_1$ into this last equality, and dividing by c, we get

\[ c^{n-2} + a_{n-1} c^{n-3} + \ldots + a_2 = r_1 \]

so $r_1$ is also an integer. Continuing in this manner we get, for general k,

\[ c^{n-(k+1)} + a_{n-1} c^{n-(k+2)} + \ldots + a_{k+1} = r_k \]

for $k \le n - 2$ (this could be formally proved by induction), which allows us to
conclude that all of the $r_i$’s are integers.

Back to solution: Now suppose there is an integer k with p(k) = 8. Then q(k) = 3,
so r(k)(k − a)(k − b)(k − c)(k − d) = 3. Since r(k), (k − a), (k − b), (k − c) and
(k − d) are all integers, and 3 is prime, one of the five must be ±3 and the remaining
four must be ±1. It follows that at least three of (k − a), (k − b), (k − c) and (k − d)
must be ±1, and so at least two of them must take the same value; this contradicts
the fact that a, b, c and d are distinct.

Source: From a Northwestern Putnam prep class.

8. Is there an infinite sequence $a_0, a_1, a_2, \ldots$ of nonzero real numbers such that for
$n = 1, 2, 3, \ldots$ the polynomial

\[ p_n(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n \]

has exactly n distinct real roots?

Solution: We can explicitly construct such a sequence. Start with $a_0 = 1$ and
$a_1 = -1$ (so case n = 1 works fine). We’ll construct the $a_i$’s inductively, always
alternating in sign. Suppose we have $a_0, a_1, \ldots, a_{n-1}$. The polynomial $p_{n-1}(x) =
a_0 + a_1 x + a_2 x^2 + \ldots + a_{n-1} x^{n-1}$ has real distinct roots $x_1 < \ldots < x_{n-1}$. Choose
$y_1, \ldots, y_n$ so that

\[ y_1 < x_1 < y_2 < x_2 < \ldots < y_{n-1} < x_{n-1} < y_n. \]

The sequence $p_{n-1}(y_1), p_{n-1}(y_2), \ldots, p_{n-1}(y_n)$ alternates in sign (think about the
graph of $y = p_{n-1}(x)$). As long as we choose $a_n$ sufficiently close to 0, the sequence
$p_n(y_1), p_n(y_2), \ldots, p_n(y_n)$ alternates in sign (this is by continuity). So, choose such
an $a_n$. Now choose a $y_{n+1}$ sufficiently large that $p_n(y_{n+1})$ has the opposite sign
to $p_n(y_n)$ (this is where alternating the signs of the $a_i$’s comes in: such a $y_{n+1}$
exists exactly because $a_n$ and $a_{n-1}$ have opposite signs). We get that the sequence
$p_n(y_1), p_n(y_2), \ldots, p_n(y_{n+1})$ alternates in sign. Hence $p_n(x)$ has n distinct real roots:
one between $y_1$ and $y_2$, one between $y_2$ and $y_3$, etc., up to one between $y_n$ and $y_{n+1}$.
This accounts for all its roots, and we are done.

Source: Putnam competition, 1990 problem B5.

9. A Boolean function is a function $f : \{0, 1\}^n \to \{0, 1\}$. A multilinear polynomial
$p : \mathbb{R}^n \to \mathbb{R}$ is a (real) linear combination of linear monomials, that is, expressions of the
form $\prod_{i \in S} x_i$ where S is a subset of $\{1, \ldots, n\}$. Show that for every Boolean function
$f$ there is a unique multilinear polynomial $p_f$ that agrees with $f$ on $\{0, 1\}^n$.

Solution: Existence: Consider an input $c = (c_1, c_2, \ldots, c_n)$ for which the output
is 1. The expression
\[ p_c = x_1' x_2' \cdots x_n', \]
where
\[ x_k' = \begin{cases} x_k & \text{if } c_k = 1 \\ 1 - x_k & \text{if } c_k = 0 \end{cases} \]
has the property that, for $(x_1, \ldots, x_n) \in \{0, 1\}^n$,
\[ p_c(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } (c_1, \ldots, c_n) = (x_1, \ldots, x_n) \\ 0 & \text{otherwise.} \end{cases} \]

Notice that when expanded out, $p_c$ is a linear combination of linear monomials, so is
a multilinear polynomial. It follows that the multilinear polynomial
\[ p = \sum_{c \,:\, f(c) = 1} p_c \]
evaluates to 1 for all c for which f(c) = 1, and evaluates to 0 otherwise; i.e., it agrees
with f on $\{0, 1\}^n$.

Uniqueness: Suppose that q is another multilinear polynomial that agrees with
f on $\{0, 1\}^n$. Then the multilinear polynomial p − q has the property that on all
of $\{0, 1\}^n$, it evaluates to 0. The following claim shows that p − q must be identically
0, so p = q and p is unique.

Claim: If t is a multilinear polynomial in variables $x_1, \ldots, x_n$ that evaluates to 0 on
all of $\{0, 1\}^n$, then t is identically 0.

Proof: By induction on n. For the base case n = 1, we have $t(x_1) = a x_1 + b$.
Evaluating at $x_1 = 0$ (and using t(0) = 0) we get b = 0, and then evaluating at
$x_1 = 1$ we get that also a = 0, so t is identically 0.
For the induction step, consider t a multilinear polynomial in variables x1 , . . . , xn .
Write t as
t(x1 , . . . , xn ) = r(x1 , . . . , xn−1 )xn + s(x1 , . . . , xn−1 )
where r, s are multilinear polynomials in variables x1 , . . . , xn−1 . We have

t(x1 , . . . , xn−1 , 0) = s(x1 , . . . , xn−1 ),

and since the left-hand side above is 0 for every choice of (x1 , . . . , xn−1 ) ∈ {0, 1}n−1 ,
so is the right-hand side, and it follows (by the induction hypothesis) that s is
identically 0. So
t(x1 , . . . , xn ) = r(x1 , . . . , xn−1 )xn .
Now we have
t(x1 , . . . , xn−1 , 1) = r(x1 , . . . , xn−1 ),
and since the left-hand side above is 0 for every choice of (x1 , . . . , xn−1 ) ∈ {0, 1}n−1 ,
so is the right-hand side, and it follows (again by the induction hypothesis) that r is
identically 0. So in fact t is identically 0, completing the induction.

Source: This is a standard result in discrete Fourier analysis.
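
The existence half of the argument is constructive, so it can be carried out mechanically. The sketch below stores a multilinear polynomial as a dictionary from index sets S to coefficients (our own representation, with variables indexed from 0 rather than 1), builds $p = \sum_{c : f(c) = 1} p_c$ exactly as above, and checks it against two sample Boolean functions.

```python
from itertools import product
from math import prod

def multiply_by_factor(poly, k, use_x):
    """Multiply poly by x_k (use_x True) or by (1 - x_k) (use_x False).
    poly maps frozenset S -> coefficient of prod_{i in S} x_i; x_k must not occur in poly."""
    out = {}
    for S, coef in poly.items():
        T = S | {k}
        if use_x:
            out[T] = out.get(T, 0) + coef
        else:
            out[S] = out.get(S, 0) + coef
            out[T] = out.get(T, 0) - coef
    return out

def multilinear_extension(f, n):
    """Sum of the indicator polynomials p_c over all inputs c with f(c) = 1."""
    total = {}
    for c in product((0, 1), repeat=n):
        if f(c) == 1:
            pc = {frozenset(): 1}
            for k, ck in enumerate(c):
                pc = multiply_by_factor(pc, k, ck == 1)
            for S, coef in pc.items():
                total[S] = total.get(S, 0) + coef
    return {S: v for S, v in total.items() if v != 0}

def evaluate(poly, x):
    return sum(coef * prod(x[i] for i in S) for S, coef in poly.items())

xor_poly = multilinear_extension(lambda c: c[0] ^ c[1], 2)          # x0 + x1 - 2 x0 x1
majority = lambda c: 1 if sum(c) >= 2 else 0
majority_poly = multilinear_extension(majority, 3)
```

For XOR on two bits this produces $x_0 + x_1 - 2x_0x_1$, which by the uniqueness claim is the only multilinear polynomial that agrees with XOR on $\{0, 1\}^2$.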

14 Week 13 (November 3) — Games
These problems are all about games played between two players. Usually when these
problems appear in the Putnam competition, you are asked to determine which player
wins when both players play as well as possible. Once you have decided which player wins
(maybe based on analyzing small examples), you need to prove this in general. Often this
entails demonstrating a winning strategy: for each possible move by the losing player, you
can try to identify a single appropriate response for the winning player, such that if the
winning player always uses these responses as the game goes on, then she will indeed win.
It’s important to remember that you must produce a response for the winning player for
every possible move of the losing player — not just a select few.

14.1 Problems to think about for week 14
1. Two players alternately draw diagonals between vertices of a regular polygon. They
may connect two vertices if they are non-adjacent (i.e. not a side) and if the diagonal
formed does not cross any of the previous diagonals formed. The last player to draw
a diagonal wins.
Who wins if the polygon has 2020 vertices?
2. Two players play a game in which the first player places a king on an empty 8 by 8
chessboard, and then, starting with the second player, they alternate moving the
king (in accordance with the rules of chess) to a square that has not been previously
occupied. The player who moves last wins. Which player has a winning strategy?
3. There are nine cards laid out on a table, face up, numbered 1 through 9. Two players,
A and B, take turns picking up cards (and once a card is picked up, it is out of play).
As soon as one of the players has among his chosen cards three of them that sum to
fifteen, that player wins.
(a) If both players play perfectly, what happens?
(b) What game are the players really playing?
4. Alan and Barbara play a game in which they take turns filling entries of an initially
empty 1024 by 1024 array. Alan plays first. At each turn, a player chooses a real
number and places it in a vacant entry. The game ends when all the entries are filled.
Alan wins if the determinant of the resulting matrix is nonzero; Barbara wins if it is
zero. Which player has a winning strategy?
5. I shuffle a regular deck of cards (26 red, 26 black), and begin to turn them face-up,
one after another. At some point during this process, you say “STOP!”. You can
say stop as early as before I’ve even turned over the first card, or as late as when
there is only one card left to be turned over; the only rule is that at some point you
must say it. Once you’ve said stop, I turn over the next card. If it is red, you win
the game, and if it is black, you lose.
If you play the strategy “say stop before even a single card has been turned over”,
you have a 50% chance of winning the game. Is there a more clever strategy that
gives you a better than 50% chance of winning the game?
6. Alice and Bob play the following game. They start with a pile of 9 matches. They
take turns, Alice playing first. Each player may remove between 1 and 3 matches.
The player who picks up the last match wins. Who has a winning strategy? And
what is it? And what if, instead of 9 matches, we start with a pile of n matches?
7. Two players, A and B, take turns naming positive integers, with A playing first. No
player may name an integer that can be expressed as a linear combination, with
positive integer coefficients, of previously named integers. The player who names “1”
loses. Show that no matter how A and B play, the game will always end.

8. Suppose n ≥ 2 light bulbs are arranged in a row, numbered 1 through n. Under
each bulb is a button. Pressing the button will change the state of the bulb above
it (from on to off or vice versa), and will also change the neighbors’ states. (Most
bulbs have two neighbors, but the bulbs on the end have only one.) The bulbs start
off randomly (some on and some off). For which n is it guaranteed to be possible
that by flipping some switches, you can turn all the bulbs off?

14.2 Solutions to problems on games


1. Two players alternately draw diagonals between vertices of a regular polygon. They
may connect two vertices if they are non-adjacent (i.e. not a side) and if the diagonal
formed does not cross any of the previous diagonals formed. The last player to draw
a diagonal wins.
Who wins if the polygon has 2020 vertices?

Solution: Player 1 wins.


It’s easy to prove (by induction) that if the game is played on an n-sided polygon
(n ≥ 4) then it will have exactly n − 3 moves. So on a 2020-sided polygon, there will
be 2017 moves, and player 1 must move last (and win). No strategy is involved!

Source: University of Texas Putnam prep.

2. Two players play a game in which the first player places a king on an empty 8 by 8
chessboard, and then, starting with the second player, they alternate moving the
king (in accordance with the rules of chess) to a square that has not been previously
occupied. The player who moves last wins. Which player has a winning strategy?

Solution: Player 2 has a winning strategy. She can imagine the board as being
covered with non-overlapping 2-by-1 dominoes (there are many ways to cover an 8
by 8 board with dominoes). Wherever player 1 puts the king, player 2 moves it to
the other square in the corresponding domino. She then repeats this strategy until
the game is over. (Player 2’s approach here is referred to as a pairing strategy).

Source: University of Texas Putnam prep.

3. There are nine cards laid out on a table, face up, numbered 1 through 9. Two players,
A and B, take turns picking up cards (and once a card is picked up, it is out of play).
As soon as one of the players has among his chosen cards three of them that sum to
fifteen, that player wins.

(a) If both players play perfectly, what happens?


(b) What game are the players really playing?

Solution: The winning triples are

• 1,5,9
• 1,6,8
• 2,4,9
• 2,5,8
• 2,6,7
• 3,4,8
• 3,5,7
• 4,5,6

Arranging the 9 numbers in a three by three grid:


6 7 2
1 5 9
8 3 4

we see that the winning triples are the three rows, three columns and two diagonals.
So the players are in fact playing tic-tac-toe, which (after some case analysis) is seen
to be a draw when both players play optimally.
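
The correspondence with tic-tac-toe can be checked mechanically: the triples of distinct cards summing to fifteen should be exactly the eight lines of the grid above. A small sketch (variable names are ours):

```python
from itertools import combinations

# All sets of three distinct cards from 1..9 that sum to fifteen.
triples = {frozenset(t) for t in combinations(range(1, 10), 3) if sum(t) == 15}

grid = [[6, 7, 2],
        [1, 5, 9],
        [8, 3, 4]]

# The eight lines of the grid: three rows, three columns, two diagonals.
lines = {frozenset(row) for row in grid}
lines |= {frozenset(grid[r][c] for r in range(3)) for c in range(3)}
lines.add(frozenset(grid[i][i] for i in range(3)))
lines.add(frozenset(grid[i][2 - i] for i in range(3)))
```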

4. Alan and Barbara play a game in which they take turns filling entries of an initially
empty 1024 by 1024 array. Alan plays first. At each turn, a player chooses a real
number and places it in a vacant entry. The game ends when all the entries are filled.
Alan wins if the determinant of the resulting matrix is nonzero; Barbara wins if it is
zero. Which player has a winning strategy?

Solution: Barbara has a winning strategy. For example, whenever Alan plays x in
row i, Barbara can play −x in some other place in row i (since there are an even
number of places in row i, Alan will never place the last entry in a row if Barbara
plays this strategy). So Barbara can ensure that all row-sums of the final matrix are
0, so that the column vector of all 1’s is in the kernel of the final matrix, so it has
determinant zero.

Source: Putnam competition 2008, problem A2.
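
Barbara's strategy can be tried out on a small board. The sketch below is an illustration under simplifying assumptions: it uses a 6 by 6 array instead of 1024 by 1024, Alan plays random integers in random vacant cells, and Barbara always answers in the first vacant cell of the same row. The helper names play and det are hypothetical, and the determinant is computed exactly with rational arithmetic.

```python
import random
from fractions import Fraction

def play(n, rng):
    """Alan fills a random vacant entry with a random integer; Barbara answers with
    the negative in another vacant entry of the same row. n must be even."""
    M = [[None] * n for _ in range(n)]
    vacant = [(i, j) for i in range(n) for j in range(n)]
    while vacant:
        i, j = vacant.pop(rng.randrange(len(vacant)))     # Alan's move
        x = rng.randint(-9, 9)
        M[i][j] = x
        k = next(c for c in range(n) if M[i][c] is None)  # row i still has a vacant cell
        M[i][k] = -x                                      # Barbara's reply
        vacant.remove((i, k))
    return M

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n, d = len(A), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            return Fraction(0)        # no pivot in this column: singular matrix
        if p != c:
            A[c], A[p] = A[p], A[c]   # a row swap flips the sign
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return d

rng = random.Random(1)
M = play(6, rng)
print([sum(row) for row in M], det(M))
```

Every row sum comes out as 0, so the all-ones vector is in the kernel and the determinant vanishes, whatever Alan does.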

5. I shuffle a regular deck of cards (26 red, 26 black), and begin to turn them face-up,
one after another. At some point during this process, you say “STOP!”. You can
say stop as early as before I’ve even turned over the first card, or as late as when
there is only one card left to be turned over; the only rule is that at some point you
must say it. Once you’ve said stop, I turn over the next card. If it is red, you win
the game, and if it is black, you lose.
If you play the strategy “say stop before even a single card has been turned over”,
you have a 50% chance of winning the game. Is there a more clever strategy that
gives you a better than 50% chance of winning the game?

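Solution: Here’s the quick-and-dirty solution: The game is fairly easily seen to be
equivalent to the following: exactly as before, except now when you say “STOP”, I
turn over the bottom card in the pile of cards that remains (given the cards already
seen, the next card and the bottom card of the remaining pile are equally likely to be
red). In this formulation, it is clear that there cannot be a strategy that gives you
better than a 50% chance of winning, because the bottom card is fixed before the
game starts and is red with probability 1/2 no matter when you stop.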
Here’s a more prosaic solution. Suppose that instead of being played with a balanced
deck, the game is played with a deck that has a red cards and b black cards. We claim
that there is no strategy better than the naive one of saying stop before a single card
has been turned over; note that with this strategy you win with probability a/(a + b).
We prove this by induction on a + b. If a + b = 1, then
(whether a = 1 or a = 0) the result is trivial. Suppose a + b ≥ 2. To get a strategy
that potentially improves on the proposed best strategy, you must at least wait for
the first card to be turned over. Two things can happen:
• The first card turned over is red; this happens with probability a/(a + b). Once
this happens, you are playing a new version of the game, with a − 1 red cards
and b black cards, and by induction your best winning strategy has you winning
with probability (a − 1)/(a − 1 + b).
• The first card turned over is black; this happens with probability b/(a + b).
Once this happens, you are playing a new version of the game, with a red cards
and b − 1 black cards, and by induction your best winning strategy has you
winning with probability a/(a + b − 1).
Your probability of winning the original game is therefore at most
\[ \frac{a}{a+b} \times \frac{a-1}{a+b-1} + \frac{b}{a+b} \times \frac{a}{a+b-1} = \frac{a}{a+b}. \]

Source: Asked me by David Wilson, in an interview for a job at Microsoft Research.
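
The induction can be mirrored by a small dynamic program. The sketch below (the function name best is made up for this illustration) computes the largest achievable win probability with a red and b black cards still face down, using exact fractions, and compares it with the naive value a/(a + b).

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def best(a, b):
    """Largest achievable win probability with a red and b black cards still face down."""
    stop_now = Fraction(a, a + b)
    if a + b == 1:
        return stop_now                              # only one card left: you must stop
    wait = Fraction(0)
    if a > 0:
        wait += Fraction(a, a + b) * best(a - 1, b)  # the first card turned over is red
    if b > 0:
        wait += Fraction(b, a + b) * best(a, b - 1)  # the first card turned over is black
    return max(stop_now, wait)

print(best(26, 26))
```

For the standard deck this prints 1/2, and for every small deck the optimal value equals a/(a + b), in line with the claim.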

6. Alice and Bob play the following game. They start with a pile of 9 matches. They
take turns, Alice playing first. Each player may remove between 1 and 3 matches.
The player who picks up the last match wins. Who has a winning strategy? And
what is it? And what if, instead of 9 matches, we start with a pile of n matches?

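Solution: To be added. It seems that Player 2 can force a win if the number of
matches is a multiple of 4 (by answering each removal of k matches with a removal
of 4 − k), and Player 1 can force a win otherwise (by first removing enough matches
to leave a multiple of 4). Example of strategy stealing.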
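
The conjectured pattern can be checked for small piles by backward induction. This is a minimal sketch; the function name first_player_wins and the parameter max_take are chosen for this illustration only.

```python
def first_player_wins(n, max_take=3):
    """True when the player about to move from a pile of n matches can force a win."""
    win = [False] * (n + 1)  # win[0] is False: the previous player took the last match
    for m in range(1, n + 1):
        # A position is winning if some legal removal leaves the opponent in a losing one.
        win[m] = any(not win[m - t] for t in range(1, min(max_take, m) + 1))
    return win[n]

losing = [n for n in range(1, 21) if not first_player_wins(n)]
print(first_player_wins(9), losing)
```

With 9 matches Alice wins, and the losing pile sizes for the player to move, up to 20, are exactly the multiples of 4.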

7. Two players, A and B, take turns naming positive integers, with A playing first. No
player may name an integer that can be expressed as a linear combination, with
positive integer coefficients, of previously named integers. The player who names “1”
loses. Show that no matter how A and B play, the game will always end.

Solution: Suppose the first k moves consist of naming x_1, ..., x_k. Let g_k be the
greatest common divisor of the x_i’s. Consider the set of numbers expressible as
a linear combination of the x_i’s over positive integers. Each x in this set is an
integer multiple of g_k (g_k divides the right-hand side of x = Σ_i a_i x_i, so it divides
the left-hand side). We claim that there is some m such that all multiples of g_k greater
than m g_k are in this set.
If we can prove this claim, we are done. The sequence (g_1, g_2, g_3, ...) is non-increasing.
It stays constant in going from g_i to g_{i+1} exactly when x_{i+1} is a multiple of g_i, and
drops exactly when x_{i+1} is not a multiple of g_i. By our claim, once the sequence has
reached a certain g, it can only stay there for a finite length of time. So eventually
that sequence becomes constantly 1. But once the sequence reaches 1, there are only
finitely many numbers that can be legitimately played, and so eventually 1 must be
played.
Here’s what we’ll prove, which is equivalent to the claim: if x_1, ..., x_k are relatively
prime positive integers (greatest common divisor equals 1) then there exists an m such
that all numbers greater than m can be expressed as a positive linear combination
of the x_i’s. We prove this by induction on k. When k = 1, x_k = 1 and the result
is trivial. For k > 1, consider x_1, ..., x_{k−1}. These may not be relatively prime;
say their greatest common divisor is d. By induction, there’s an m′ such that all
positive integer multiples of d greater than m′d can be expressed as a positive linear
combination of the x_1, ..., x_{k−1}. Now d and x_k must be relatively prime (otherwise
the x_i’s would not be relatively prime), which means that there must be some positive
integer e (which we may assume is between 1 and x_k − 1) with ed ≡ 1 (modulo
x_k). If we add any multiple of x_k to e to get e′, we still get e′d ≡ 1 (modulo x_k).
Pick a multiple large enough that e′ > m′. By induction, e′d can be expressed as a
positive integer combination of x_1, ..., x_{k−1}. So too can 2e′d, 3e′d, ..., x_k e′d. These
x_k numbers cover all the residue classes modulo x_k. Let m be one less than the largest
of these numbers. For ℓ > m, we can express ℓ as a positive linear combination of
x_1, ..., x_k as follows: first, determine the residue class of ℓ modulo x_k, say it’s p.
Then add the appropriate positive integer multiple of x_k to pe′d (which can be
expressed as a positive integer combination of x_1, ..., x_{k−1}).

Source: This is the game of Sylver coinage, invented by John H. Conway; see
http://en.wikipedia.org/wiki/Sylver_coinage. It is named after J. J. Sylvester, who
proved that if a and b are relatively prime positive integers, then the largest positive
integer that cannot be expressed as a linear combination of a and b with nonnegative
integer coefficients is (a − 1)(b − 1) − 1.
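
Sylvester's formula is easy to test by brute force. The sketch below is an illustration only: the function name largest_unreachable is hypothetical, coefficients are taken to be nonnegative integers, and the search stops at a·b on the assumption (which is the content of Sylvester's result) that everything beyond that is reachable.

```python
from math import gcd

def largest_unreachable(a, b):
    """Largest integer that is not of the form i*a + j*b with integers i, j >= 0.
    Assumes gcd(a, b) == 1 and a, b >= 2; only numbers up to a*b are examined."""
    limit = a * b
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for n in range(1, limit + 1):
        # n is reachable if removing one copy of a or of b leaves a reachable number.
        reachable[n] = (n >= a and reachable[n - a]) or (n >= b and reachable[n - b])
    return max(n for n in range(limit + 1) if not reachable[n])

for a, b in [(3, 5), (4, 7), (5, 9)]:
    assert gcd(a, b) == 1
    print(a, b, largest_unreachable(a, b), (a - 1) * (b - 1) - 1)
```

In each of these cases the brute-force value agrees with (a − 1)(b − 1) − 1.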

8. Suppose n ≥ 2 light bulbs are arranged in a row, numbered 1 through n. Under
each bulb is a button. Pressing the button will change the state of the bulb above
it (from on to off or vice versa), and will also change the neighbors’ states. (Most
bulbs have two neighbors, but the bulbs at the two ends have only one.) The bulbs
start off randomly (some on and some off). For which n is it guaranteed to be possible
that by pressing some buttons, you can turn all the bulbs off?

Solution: To be added
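
While the solution is pending, small cases can be explored by brute force. The sketch below is an exploration aid, not a proof, and the function name always_solvable is made up here. It uses two observations: pressing a button twice cancels out, and a starting configuration can be cleared exactly when it equals the toggle pattern produced by some set of presses.

```python
from itertools import product

def always_solvable(n):
    """True if every starting configuration of n bulbs can be switched fully off."""
    reachable = set()
    for presses in product((0, 1), repeat=n):  # each button is pressed 0 or 1 times
        state = [0] * n                        # toggle pattern produced by these presses
        for i, pressed in enumerate(presses):
            if pressed:
                for j in (i - 1, i, i + 1):    # the bulb itself and its neighbors
                    if 0 <= j < n:
                        state[j] ^= 1
        reachable.add(tuple(state))
    # Every configuration is clearable exactly when all 2^n toggle patterns occur.
    return len(reachable) == 2 ** n

print([n for n in range(2, 13) if not always_solvable(n)])
```

The printed list shows the values of n up to 12 for which some starting configuration cannot be cleared, which suggests a pattern to prove.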

15 Week 14 (November 10)
Here’s a collection of problems contributed by participants of this year’s running of Math
43900. Some fun things to think about over the winter break.

1. 100 people play a game, organized by a game-master. The game goes as follows:
The 100 participants line up in single file. The game-master puts either a red or a
blue hat on each participant’s head. Every participant can see the hats of the people
in front of them in the line — but not their own hat, nor those of anyone behind
them. The game-master starts at the back of the line, and asks the last person to
call out the colour of their hat. They must answer “red” or “blue”. If they answer
correctly, they stay in line; if they give the wrong answer, they are immediately and
silently removed from the room. (So while everyone hears the answer, no-one knows
whether an answer was right.) The game-master then proceeds up the line, repeating
the same procedure with each of the 100 participants. Before the game begins, the
participants are allowed to confer on a strategy to help them. What should they do,
if they want as many people as possible still in line at the end of the game? [38]

2. Let R be the set of real numbers. Find all functions f : R → R satisfying

f(f(x)f(y)) + f(x + y) = f(xy)

for all real x, y.



3. Let f(n) be the number of 1s in the base 2 representation of n. Let
\[ k = \sum_{n=1}^{\infty} \frac{f(n)}{n + n^2}. \]
Is e^k rational?

4. Satan has captured you and your friend. He puts a full Othello board (an 8 by 8
board of tiles, that are black on one side and white on the other, such that only white
or black is visible for each tile — see picture below for a typical instance) in front of
you. He says he will send your friend out of the room, rearrange the board however
he likes, then designate one “magic square”. You may then flip one tile from black to
white or white to black [39]. Your friend will then return and must be able to identify
the chosen “magic square”. You are allowed to discuss strategy as long as you like
beforehand, but after your friend leaves the room you cannot communicate. What
strategy do you come up with? Note: you and your friend both view the board in
the same orientation (you both know which side is the “top” side of the board). [40]
[38] There is an open-endedness to this question. You can seek to have a number of people certain to
still be in line at the end; or maybe to have a high expected number of people left in line; or maybe some
hybrid of these goals.
[39] “May” here really means “may”: you can flip a tile if you wish, or you can choose not to flip.
[40] The person who contributed this problem says: This is a puzzle that was posed to me several years
ago by a friend who also studied here. I was unable to solve it then, and have not found a solution since,
but I trust my friend that a solution does actually exist. I hope someone in the class has better luck with
it than I did!

5. You are given six boxes, B_1, ..., B_6, and, for each ℓ = 0, ..., 5, two identical coins
each of denomination 2^ℓ [41]. In how many ways can you distribute the coins among
the boxes, so that for each k = 1, ..., 5, box B_k contains coins of total value 2^k?
What if 6 is replaced by arbitrary n ∈ N?

6. Three players enter a room, and a red or blue hat is placed on each person’s head.
The color of each hat is determined by a coin toss, with the outcome of one coin
toss having no effect on the others. Each person can see the other players’ hats but
not their own.
No communication of any sort is allowed, except for an initial strategy session
between the three players before the game begins. Once they have had a chance to
look at the other hats, the players must simultaneously guess the color of their own
hats, or pass. The group shares a hypothetical $3 million prize if at least one player
guesses correctly and no players guess incorrectly.
What strategy for the group maximizes its chances of winning the prize?

7. A group of lions live on an island covered in grass but with no other animals. The
lions are identical, perfectly rational and aware that all the others are rational. They
are also aware that all the other lions are aware that all the others are rational, and
so on.
Naturally, the lions are extremely hungry but they do not attempt to fight each other
because they are identical in physical strength and so would inevitably all end up
dead. As they are all perfectly rational, each lion prefers a hungry life to a certain
death. With no alternative, they can survive by eating an essentially unlimited
supply of grass, but they would all prefer to consume something meatier.
[41] So: two coins of denomination 1, two coins of denomination 2, two coins of denomination 4, et cetera.

One day, a lamb miraculously appears on the island [42]. If any lion consumes the
defenceless lamb, it will become too full to defend itself from the other lions, and
thus will be eaten.
Suppose there are n lions at the moment the lamb appears. What will happen? [43]

8. How many primes among the positive integers, written as usual in base 10, are such
that their digits are alternating 1’s and 0’s, beginning and ending with a 1?

9. Prove that every nonzero coefficient of the Taylor series of

(1 − x + x^2)e^x

about x = 0 is a rational number whose numerator (in lowest terms) is either 1 or a
prime number.

10. Let P(x) be a polynomial of degree n such that P(x) = Q(x)P′′(x), where Q(x) is a
quadratic polynomial and P′′(x) is the second derivative of P(x). Show that if P(x)
has at least two distinct roots then it must have n distinct roots.

11. Suppose we have a floor made of parallel strips of wood, each one unit wide. If we
drop a needle one unit long onto the floor, what is the probability that the needle
will cross one of the lines between two strips of wood on the floor?

12. Players A and B each have a well shuffled standard deck of cards, with no jokers.
The players deal their cards one at a time, from the top of the deck, checking for an
exact match. Player A wins if, once the packs are fully dealt, no matches are found.
Player B wins if at least one match occurs.
What is the probability that player A wins?

13. Here’s a carnival game: you are brought into a circular room with a number q of
identical doors, k of which have a prize (a fun math problem) behind them, and the
rest of which have nothing behind them. You are allowed to choose a door, open it,
and see what is behind it. Then you are allowed again to choose a door, open it, and
see what is behind it, and in fact you are allowed to do this q times [44]. After you
make each choice, the locations of the k prizes are randomly re-distributed among
the q doors.
[42] Unfortunate creature, it would seem. But actually the lamb has a chance of surviving this hell!
[43] It will be helpful to assume:
• that lions cannot share;
• that the lions all move at the same speed;
• that any defenceless creature (lamb or full lion), being rational, bows to the inevitable and
does not try to run away as soon as it sees a hungry lion move towards it;
• and that at all points, the distances between all pairs of creatures on the island are distinct.
[44] Remember, q is the total number of doors (both prize and non-prize).

As time goes by, the carnival owners add new doors frequently [45]; but since they
aren’t very good at math and can’t come up with any new problems, they keep the
number of prize doors the same [46]. As more and more doors are installed, what is
the probability that you lose the game (i.e., that in all of your q choices, you never
see a prize)? [47]

[45] So q keeps growing.
[46] So k is fixed.
[47] More precisely, what nice value is this probability approaching?
