Notre Dame Math 43900 Fall 2020
Abstract
This project contains all information relevant to the Fall 2020 running of Math
43900 — Problem Solving in Math at University of Notre Dame. This includes
• basic course information;
• weekly problem sets and solutions; and
• various pieces of supplementary material, added as they come up during the
semester.
It will replace the usual course website/Sakai page — everything that you need to
know about the course will be found here.
Here is the Zoom address for remote class participants:
https://notredame.zoom.us/j/95331468489
Contents
1 Basic course information
1.1 Zoom url
1.2 COVID-19 policies
1.3 The absolute basics
1.4 Course description
1.4.1 Official description
1.4.2 A more helpful description
1.5 Assessment
1.6 Course policies
1.6.1 Attendance
1.6.2 Honor code
1.6.3 Class conduct
8 Week 7 (September 22) — Writing solutions
8.1 Some problems to think about for week 8
8.2 Solutions to problems
1 Basic course information
NOTE: Initial course policies are in flux, and can be expected to change right
up to the start of semester!
https://notredame.zoom.us/j/95331468489
All students, faculty, and staff in a University campus space will be required
to wear face coverings at all times when they are, or may be, in the presence
of other individuals, except when alone in a private room (office, assigned
residence hall room) or in a private vehicle.
Adherence to this policy concerning face coverings (worn covering both mouth and nose)
is an absolute requirement of in-person attendance. This is for my health and safety,
for your health and safety, and for the health and safety of all those that we come in
contact with, some of whom may be at high risk for suffering devastating consequences
from COVID-19.
Again from https://here.nd.edu/health-safety/ (retrieved July 24, 2020):
The practical import of this for the weekly meetings is that you should not move the chairs
from their sticker-centered positions, and that you should avoid gathering in groups before
or after meetings.
Again from https://here.nd.edu/health-safety/ (retrieved July 24, 2020):
All members of the Notre Dame community (students, faculty and staff)
should conduct a daily health check by taking their temperature and assessing
symptoms.
So: if we are sick, or show symptoms, we should not attend meetings in person!!!
Because it is likely that at any given time, there will be some people who have to skip
the in-person meetings, all sessions will be streamed online, at
https://notredame.zoom.us/j/95331468489
Saturday, February 20
(a departure from the traditional December date). The precise details of how the
Competition will be administered have not yet been announced.
Each meeting of Math 43900 will (usually) be built around a specific theme (pigeon-hole
principle, induction & recursion, inequalities, probability, et cetera). We’ll talk about the
general theme, then spend time trying to solve some relevant problems. At the end of
each meeting I’ll hand out a set of problems on that theme, that you can cut your teeth
on. Usually we’ll begin the next session with presentations of solutions to some of those
problems. On occasional meetings, I might give out a problem set at the beginning, and
have everyone pick a problem or two to work on individually for the meeting period (a
sort of “mock Putnam”).
As the last paragraph indicates, the course also has the general goal of introducing
useful problem-solving techniques, and bits and pieces of useful mathematics that might
not have a natural home in other courses in the math curriculum.
Those who want to get the most out of the Putnam Competition are also encouraged
to take part in the Virginia Tech Regional Mathematics Contest. This usually
happens six weeks or so before the Putnam (again, on campus). More information on this
competition will be available at the start of September.
1.5 Assessment
The grade for the class will be determined solely by active participation in class
(participating in class discussions, occasionally presenting problem solutions on the board) and
by participation in the 2020 Putnam Competition.
https://notredame.zoom.us/j/95331468489
(meeting ID 953 3146 8489, passcode 313159). Recordings of each class should also be
available, soon after each one ends — more details on this when semester begins.
1.6.3 Class conduct
At the meetings you should feel free to engage in lively discussion about the course topics;
don’t be shy! But non-course-related interruptions should be kept to a minimum. In
particular, you should turn off or switch to silent all phones, etc., before the start of each
meeting. If for some good reason you need to have your phone on during meetings, please
mention it to me in advance.
2 Week 1 (August 11) — Induction
2.1 A few problems to discuss in class
Here are a few problems that we can solve together in class during the first meeting, to get
our feet wet. They are all particular favorites of mine, for one reason or another.
1. A room has 100 lockers, numbered 1 through 100, initially all closed. I run through
the room, and open every locker. Then I run through the room again, and close
the lockers numbered 2, 4, 6, et cetera (all the even numbered lockers). Next I run
through the room, and change the status of the lockers numbered 3, 6, 9, et cetera
(opening the closed ones, and closing the open ones). I keep going in this manner
(on the ith run through the room, I change the status of lockers numbered i, 2i, 3i,
et cetera), until on my 100th run through I change the status of locker number 100
only.
At the end of all this, which lockers are open?
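For anyone who wants to experiment before looking for a proof, the runs described above can be simulated directly. This is only a sketch for forming a conjecture; the function name `open_lockers` is made up for this illustration, and a simulation is of course not a proof.

```python
def open_lockers(n: int = 100) -> list[int]:
    """Simulate the n runs through the room and return the lockers left open."""
    is_open = [False] * (n + 1)  # index 0 is unused; all lockers start closed
    for run in range(1, n + 1):
        # On run number `run`, change the status of lockers run, 2*run, 3*run, ...
        for locker in range(run, n + 1, run):
            is_open[locker] = not is_open[locker]
    return [k for k in range(1, n + 1) if is_open[k]]


print(open_lockers())
```

Running it for a few different values of `n` is a quick way to spot the pattern that then needs to be explained.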
2. In the picture below, which of the two shaded in regions (the red region and the blue
region, if you are looking at the pdf online) has the greater area? (The boundary
is a perfect quarter-circle. The two circle-like curves inside the quarter circle are
perfect semi-circles, whose diameters are radii of the quarter-circle.)
3. Alice and Bob want to know Carole’s birthday. She tells them that it is one of ten
dates, shown in the table below.
Then she whispers the month of her birthday to Alice, and the day of her birthday
to Bob. The following conversation ensues:
1
This is a fairly straightforward example of a “knowledge puzzle”. If you want a real challenge, try to
solve this one: Carole (truthfully) tells Alice and Bob that she is thinking of two distinct positive integers,
both bigger than 1, whose sum is at most 100. She whispers the sum of the numbers to Alice, and she
whispers the product to Bob. The following conversation ensues:
• Alice: Bob, I know that you don’t know the two numbers.
• Bob: Now I know them!
• Alice: And now so do I!
What are the two numbers? (You should assume that Alice and Bob are very smart and very logical.)
ADDED AUGUST 18: This is the famous Sum and Product problem, and the solution is that the
numbers are 4 and 13. See https://en.wikipedia.org/wiki/Sum_and_Product_Puzzle for a detailed
discussion.
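The three statements in the Sum and Product conversation can also be turned into three successive filters on the list of candidate pairs. The sketch below is one way to do that, assuming the rules exactly as stated in this footnote (two distinct integers, both bigger than 1, sum at most 100); the function name `sum_and_product` is made up for this illustration.

```python
from collections import Counter, defaultdict


def sum_and_product(limit: int = 100) -> list[tuple[int, int]]:
    """Return the pairs (x, y), 2 <= x < y, x + y <= limit, consistent with the conversation."""
    pairs = [(x, y) for x in range(2, limit) for y in range(x + 1, limit) if x + y <= limit]
    product_count = Counter(x * y for x, y in pairs)

    by_sum = defaultdict(list)
    for x, y in pairs:
        by_sum[x + y].append((x, y))

    # Alice: "I know that you don't know": every way of splitting her sum
    # gives a product that more than one candidate pair shares.
    good_sums = {s for s, ps in by_sum.items()
                 if all(product_count[x * y] > 1 for x, y in ps)}
    after_alice = [(x, y) for x, y in pairs if x + y in good_sums]

    # Bob: "Now I know them": his product occurs exactly once among the survivors.
    products_left = Counter(x * y for x, y in after_alice)
    after_bob = [(x, y) for x, y in after_alice if products_left[x * y] == 1]

    # Alice: "And now so do I": her sum occurs exactly once among what is left.
    sums_left = Counter(x + y for x, y in after_bob)
    return [(x, y) for x, y in after_bob if sums_left[x + y] == 1]


print(sum_and_product())  # [(4, 13)]
```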
2.2 Induction
Often one can tackle a problem that involves one or more parameters by checking what
happens with small values of the parameters, noticing a pattern, and then establishing that
the pattern holds in general. The most powerful mathematical technique for establishing
the correctness of a pattern is induction.
Basic induction
Suppose that P (n) is an assertion about the natural number n. Induction is essentially
the following: if there is some a for which P (a) is true, and if for all n ≥ a we have
that the truth of P (n) implies the truth of P (n + 1), then we can conclude that P (n)
is true for all n ≥ a. This should be fairly obvious: knowing P (a) and the implication
“P (n) =⇒ P (n + 1)” at n = a, we immediately deduce P (a + 1). But now knowing
P (a + 1), the implication “P (n) =⇒ P (n + 1)” at n = a + 1 allows us to deduce
P(a + 2); and so on. Induction is the mathematical tool that makes the “and so on” above
rigorous. Induction works because of the following fundamental fact, often referred to as
the well-ordering principle: every non-empty set of natural numbers has a least element.
To see why well-ordering allows induction to work, suppose that we know that P(a) is
true for some a, and that we can argue that for all n ≥ a, the truth of P(n) implies the
truth of P(n + 1). Suppose now (for a contradiction) that there are some n ≥ a for which
P(n) is not true. Let F = {n | n ≥ a, P(n) not true}. By assumption F is non-empty, so
has a least element, n_0 say. We know n_0 ≠ a, since P(a) is true; so n_0 ≥ a + 1. That
means that n_0 − 1 ≥ a, and since n_0 − 1 ∉ F (if it was, n_0 would not be the least element)
we know P(n_0 − 1) is true. But then, by assumption, P((n_0 − 1) + 1) = P(n_0) is true, a
contradiction!
Example: Prove that a set of size n ≥ 1 has 2^n subsets (including the empty set and the
set itself).
Solution: Let P(n) be the statement “a set of size n has 2^n subsets”. We prove that P(n)
is true for all n ≥ 1 by induction. We first establish a base case. When n = 1, the generic
set under consideration is {x}, which has 2 = 2^1 subsets ({x} and ∅); so P(1) is true.
Next we establish the inductive step. Suppose that for some n ≥ 1, P(n) is true.
Consider P(n + 1). The generic set under consideration now is {x_1, ..., x_n, x_{n+1}}. We can
construct a subset of {x_1, ..., x_n, x_{n+1}} by first forming a subset of {x_1, ..., x_n}, and then
either adding the element x_{n+1} to this subset, or not. This tells us that the number of
subsets of {x_1, ..., x_n, x_{n+1}} is 2 times the number of subsets of {x_1, ..., x_n}. Since P(n)
is assumed true, we know that {x_1, ..., x_n} has 2^n subsets (this step is usually referred
to as applying the inductive hypothesis); so {x_1, ..., x_n, x_{n+1}} has 2 × 2^n = 2^(n+1) subsets.
This shows that the truth of P(n) implies that of P(n + 1), and the proof by induction is
complete.
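The inductive step in this proof is really a recipe for listing subsets: take every subset of the first n elements, and for each one either add the new element or not. Here is a minimal sketch of that recipe (the name `all_subsets` is made up for this illustration), which can be used to check the count 2^n for small n.

```python
def all_subsets(elements: list) -> list[frozenset]:
    """List the subsets of `elements`, built the way the inductive step builds them."""
    if not elements:
        return [frozenset()]  # the empty set has exactly one subset
    *rest, last = elements
    smaller = all_subsets(rest)  # subsets of the first n elements
    # Each smaller subset appears twice: once without `last`, once with it.
    return smaller + [s | {last} for s in smaller]


for n in range(1, 6):
    print(n, len(all_subsets(list(range(n)))))
```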
Strong Induction
Induction is a great tool because it gives you somewhere to start from in an argument.
And sometimes, the more you start with, the further you’ll go. That’s why the principle
of Strong Induction is worth keeping in mind: if there is some a for which P (a) is true,
and if for each n > a we have that the truth of P (m) for all m, a ≤ m < n, implies the
truth of P (n), then we can conclude that P (n) is true for all n ≥ a.
The proof that this works is almost the same as the proof that induction works. What’s
good about strong induction is that when you are at the part of the argument where you
have to deduce the truth of P(n) from some assumptions about earlier assertions,
you now have a lot more to work with: each of P(a), P(a + 1), ..., P(n − 1), rather than
just P(n − 1) alone. Sometimes this is helpful, and sometimes it’s absolutely necessary.
Example: Prove that every integer n ≥ 2 can be written as n = p_1 ⋯ p_ℓ where the p_i’s
are (not necessarily distinct) prime numbers.
Solution: Let P(n) be the statement: “n can be written as n = p_1 ⋯ p_ℓ where the p_i’s
are (not necessarily distinct) prime numbers”. We’ll prove that P(n) is true for all n ≥ 2
by strong induction.
P(2) is true, since 2 = 2 works.
Now consider P(n) for some n > 2. We want to show how the (simultaneous) truth
of P(2), ..., P(n − 1) implies the truth of P(n). If n is prime, then n = n works to show
that P(n) holds. If n is not a prime, then it’s composite, so n = ab for some numbers a, b
with 2 ≤ a < n and 2 ≤ b < n. We’re allowed to assume that P(a) and P(b) are true, that
is, that a = p_1 ⋯ p_ℓ where the p_i’s are (not necessarily distinct) prime numbers, and that
b = q_1 ⋯ q_m where the q_i’s are (not necessarily distinct) prime numbers. It follows that
n = ab = p_1 ⋯ p_ℓ q_1 ⋯ q_m.
This is a product of (not necessarily distinct) prime numbers, and so P(n) is true.
So, by strong induction, we conclude that P(n) is true for all n ≥ 2.
Notice that we would have gotten exactly nowhere with this argument if, in trying to
prove P (n), all we had been allowed to assume was P (n − 1).
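The strong induction argument translates almost word for word into a recursive procedure: if n is prime, return n itself; otherwise split n = ab and recurse on both (smaller) factors. The following is a sketch of that translation, with the hypothetical name `prime_factors`; it is meant to mirror the proof, not to be an efficient factoring method.

```python
from math import isqrt


def prime_factors(n: int) -> list[int]:
    """Return a list of primes whose product is n, following the strong induction proof."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            # n = a * b with 2 <= a, b < n: apply the "induction hypothesis" to both.
            return prime_factors(a) + prime_factors(n // a)
    return [n]  # no proper divisor found, so n is prime and "n = n works"


print(sorted(prime_factors(360)))  # [2, 2, 2, 3, 3, 5]
```

Note that the recursive calls are made on a and b, which can be much smaller than n − 1; this is the point where ordinary induction would not be enough.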
$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}.$$⁴
From this we get
$$\begin{aligned}
1 + 2 + 3 + \cdots + n + (n+1) &= (1 + 2 + 3 + \cdots + n) + (n+1) \\
&= \frac{n(n+1)}{2} + n + 1 \\
&= \frac{n^2 + n + 2n + 2}{2} \\
&= \frac{n^2 + 3n + 2}{2} \\
&= \frac{(n+1)(n+2)}{2} \\
&= \frac{(n+1)((n+1)+1)}{2}.
\end{aligned}$$
The equality of the first and last expressions in this chain is the case n + 1 of the assertion⁵,
so we have verified the induction step.
By induction the assertion is true for all n.
In proving an identity — an equality between two expressions, both depending on some
variable(s) — by induction, it is often very helpful to start with one side of the n + 1 case
of the identity, and manipulate it via a sequence of equalities in a way that introduces one
side of the n case of the identity into the mix; this can then be replaced with the other
side of the n case, and then the whole thing might be massage-able into the other side of
the n + 1 identity. That’s exactly how we proceeded above.
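Before investing in an induction proof of an identity, it can be worth checking the identity numerically for small values of the variable. The snippet below does this for the sum formula just proved; it is only a sanity check (a numerical check is not a proof), and the function name `identity_holds` is made up for this illustration.

```python
def identity_holds(n: int) -> bool:
    """Check 1 + 2 + ... + n = n(n + 1)/2 for one particular n."""
    left_side = sum(range(1, n + 1))
    right_side = n * (n + 1) // 2  # n(n + 1) is always even, so // is exact
    return left_side == right_side


print(all(identity_holds(n) for n in range(1, 1001)))  # True
```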
2
Here you might choose to say specifically that you are proving the predicate P(n): “1 + 2 + 3 + ... + n =
n(n + 1)/2” for n ∈ N; this is usually not necessary for proving a simple statement, but it can be very
useful, when proving a more complex statement, especially one involving multiple variables, to introduce
explicit notation for the predicate.
3
It’s ok to say this, if the base case really is obvious!
4
Or, if you have named the predicate P (n), “assume P (n)”.
5
Or, “is P(n + 1)”.
2.3 Some problems to work on for week 2
1. Find (with proof!) a formula for the sum of the first n odd natural numbers.
f(n) = f(n − 1)^2 − 2.
$$\frac{d}{dx} f_{n+1}(x) = (n+1) f_n(x+1)$$
for n ≥ 0. Find, with proof, the explicit factorization of f_100(1) into powers of
distinct primes.
5. Show that every positive integer can be written in the form ±1^2 ± 2^2 ± ⋯ ± n^2 for
some n ≥ 1 and some choice of signs.
6. The numbers 1 through 2n are partitioned into two sets A and B of size n, in an
arbitrary manner. The elements a_1, ..., a_n of A are sorted in increasing order, that
is, a_1 < a_2 < ... < a_n, while the elements b_1, ..., b_n of B are sorted in decreasing
order, that is, b_1 > b_2 > ... > b_n. Find (with proof!) the value of the sum
$$\sum_{i=1}^{n} |a_i - b_i|.$$
2.4 Solutions to induction problems
1. Find (with proof!) a formula for the sum of the first n odd natural numbers.
1 + 3 + 5 + ⋯ + (2n − 1) = n^2.
(we are then done, by induction)¹³. To make this deduction, observe that¹⁴
This completes the induction step, and so the proof (by induction) of the claim.¹⁵
f(n) = f(n − 1)^2 − 2.
6
For this problem, I’m going to write out a complete solution, to illustrate how a proof by induction
might be properly laid out. I’ll make comments along the way in the footnotes. Notice that the proof is
written in complete sentences. A well-written proof should read sensibly as a piece of English prose, as
long as you replace all the mathematical symbols with their usual English equivalents.
7
For any problem, not just one involving a proof by induction, that asks you to come up with an
answer to a question (rather than verify that a given answer is correct) — you MUST begin with a clear
statement of your answer.
8
If you are using proof by induction, say so!
9
Clearly say that you are establishing a base case; that is part of a proof by induction.
10
It is ok to dismiss the base case like this, if it really is obvious.
11
When you move on to the induction step, clearly say so.
12
It’s helpful to say explicitly what the induction hypothesis is. It helps you as you write your proof,
and it helps the reader as she reads it.
13
This is a little overkill, but it can sometimes be helpful to write down the explicit goal, to focus
yourself on where you are going with the proof. Of course, if you are stating the goal hopefully like this, you
must be clear that this is something that you are going to prove/wish to prove, rather than something
that you have already proven!
14
Notice that what follows is a chain of equalities, each one following from the previous, either by
some basic algebra or by the induction hypothesis. So: I’m arguing from true/known statements, to the
statement I want. I’m not arguing from the statement I want to be true, to a true statement (logically,
that tells me nothing).
15
It’s always good to end a proof by induction with a clear statement that all necessary steps have been
completed. In part this is for your benefit — if you haven’t done all you should have done, you’ll likely
notice this while writing “I’ve done all I should have done”, and will then be able to correct the problem.
Find, with proof, a simple expression for f (n).
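$$\begin{aligned}
f(n+1) &= f(n)^2 - 2 \\
&= \left( 2^{2^{n-1}} + \frac{1}{2^{2^{n-1}}} \right)^2 - 2 \\
&= 2^{2^n} + 2 + \frac{1}{2^{2^n}} - 2 \\
&= 2^{2^{(n+1)-1}} + \frac{1}{2^{2^{(n+1)-1}}}.
\end{aligned}$$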
(Notice that it was computationally quite convenient to deduce the formula for
f (n + 1) from that of f (n) here; that is just as valid as deducing the formula for
f (n) from that of f (n − 1)).
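A closed form like this is easy to test with exact rational arithmetic before (or after) writing the induction. The sketch below compares the recurrence f(n) = f(n − 1)^2 − 2 with the formula 2^(2^(n−1)) + 1/2^(2^(n−1)) appearing in the chain above. The starting value f(1) = 5/2 is an assumption here: it is simply what the formula gives at n = 1, since the initial condition is not restated in this part of the notes. The function names are made up for this illustration.

```python
from fractions import Fraction


def f_recursive(n: int, f1: Fraction = Fraction(5, 2)) -> Fraction:
    """Apply f(k) = f(k - 1)^2 - 2 starting from the assumed value f(1) = 5/2."""
    value = f1
    for _ in range(n - 1):
        value = value * value - 2
    return value


def f_closed(n: int) -> Fraction:
    """The closed form 2^(2^(n-1)) + 1/2^(2^(n-1))."""
    t = 2 ** (2 ** (n - 1))
    return Fraction(t) + Fraction(1, t)


print(all(f_recursive(n) == f_closed(n) for n in range(1, 9)))  # True
```

Using `Fraction` rather than floating point matters here: the values grow doubly exponentially, so rounding errors would swamp the comparison almost immediately.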
Solution¹⁷:
The answer is 101^99. It is a fairly easy induction that f_n(x) = x(x + n)^(n−1) (left to
reader). Once this relation is established, the result follows immediately.
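As a check on the formula left to the reader, one can verify with exact integer polynomial arithmetic that f_n(x) = x(x + n)^(n−1) satisfies the derivative relation from the problem for n ≥ 1, and that it gives f_100(1) = 101^99. The sketch below represents a polynomial as its list of coefficients, lowest degree first; all function names are made up for this illustration, and the check covers only finitely many n, so it does not replace the induction.

```python
from math import comb


def poly_mul(p: list[int], q: list[int]) -> list[int]:
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def shifted_power(c: int, k: int) -> list[int]:
    """Coefficients of (x + c)^k, from the binomial theorem."""
    return [comb(k, j) * c ** (k - j) for j in range(k + 1)]


def f(n: int) -> list[int]:
    """Coefficients of the conjectured f_n(x) = x (x + n)^(n-1), for n >= 1."""
    return [0] + shifted_power(n, n - 1)


def derivative(p: list[int]) -> list[int]:
    return [j * p[j] for j in range(1, len(p))]


def recurrence_holds(n: int) -> bool:
    """Check d/dx f_{n+1}(x) = (n + 1) f_n(x + 1), where f_n(x + 1) = (x + 1)(x + n + 1)^(n-1)."""
    f_n_shifted = poly_mul([1, 1], shifted_power(n + 1, n - 1))
    return derivative(f(n + 1)) == [(n + 1) * c for c in f_n_shifted]


print(all(recurrence_holds(n) for n in range(1, 30)))  # True
print(sum(f(100)) == 101 ** 99)  # evaluating at x = 1 sums the coefficients: True
```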
Solution¹⁸:
16
Here I will just give a sketch of the solution; this should not be mistaken for a polished final solution.
17
This was problem B2 of the 1985 Putnam competition.
18
I found this problem on a Putnam prep class handout prepared by Amites Sarkar, Western Washington
University.
We prove the result by induction on n, with the base case n = 1 very easy (the
inequality holds with equality in this case).
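For the induction step, assume that for some n ≥ 1, whenever x_1, ..., x_n are n
positive numbers satisfying
$$x_1 + x_2 + \cdots + x_n = \frac{1}{2}$$
then
$$\frac{1 - x_1}{1 + x_1} \cdot \frac{1 - x_2}{1 + x_2} \cdots \frac{1 - x_n}{1 + x_n} \geq \frac{1}{3}.$$
Let x_1, ..., x_n, x_{n+1} be n + 1 positive numbers satisfying
$$x_1 + x_2 + \cdots + x_n + x_{n+1} = \frac{1}{2}.$$
Setting x'_i = x_i for i = 1, ..., n − 1, and x'_n = x_n + x_{n+1}, we apply the induction
hypothesis to the numbers x'_1, ..., x'_n (which are positive and sum to 1/2) to conclude
$$\frac{1 - x'_1}{1 + x'_1} \cdots \frac{1 - x'_n}{1 + x'_n} \geq \frac{1}{3}. \qquad (\star)$$
We claim that
$$\frac{1 - x_n}{1 + x_n} \cdot \frac{1 - x_{n+1}}{1 + x_{n+1}} \geq \frac{1 - x_n - x_{n+1}}{1 + x_n + x_{n+1}}. \qquad (\star\star)$$
Given the claim, we have
$$\begin{aligned}
\frac{1 - x_1}{1 + x_1} \cdots \frac{1 - x_{n-1}}{1 + x_{n-1}} \cdot \frac{1 - x_n}{1 + x_n} \cdot \frac{1 - x_{n+1}}{1 + x_{n+1}}
&\geq \frac{1 - x_1}{1 + x_1} \cdots \frac{1 - x_{n-1}}{1 + x_{n-1}} \cdot \frac{1 - x_n - x_{n+1}}{1 + x_n + x_{n+1}} \\
&= \frac{1 - x'_1}{1 + x'_1} \cdots \frac{1 - x'_n}{1 + x'_n} \\
&\geq \frac{1}{3},
\end{aligned}$$
the last inequality by (⋆).
So what remains is to prove (⋆⋆); but after a little algebra (clearing the positive
denominators and expanding) this reduces to 2 x_n x_{n+1} (x_n + x_{n+1}) ≥ 0, which holds
because x_n and x_{n+1} are positive.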
5. Show that every positive integer can be written in the form ±1^2 ± 2^2 ± ⋯ ± n^2 for
some n ≥ 1 and some choice of signs.
Solution: We prove this by induction. We have to be a little bit careful here, not
to be confused by using “n” in the induction hypothesis, and then thinking that we
have to represent n in the form ±1^2 ± 2^2 ± ⋯ ± n^2; in fact, no restriction is placed
here on the number of terms we should use in the representation of any particular
natural number.
So, for each natural number m let P(m) be the proposition
2 = −1 − 4 − 9 + 16.
m − 4 = ε_1 1^2 + ε_2 2^2 + ⋯ + ε_n n^2
so
6. The numbers 1 through 2n are partitioned into two sets A and B of size n, in an
arbitrary manner. The elements a_1, ..., a_n of A are sorted in increasing order, that
is, a_1 < a_2 < ... < a_n, while the elements b_1, ..., b_n of B are sorted in decreasing
order, that is, b_1 > b_2 > ... > b_n. Find (with proof!) the value of the sum
$$\sum_{i=1}^{n} |a_i - b_i|.$$
19
Note that here we have given the proposition to be proved a name, P (m). This is handy in this case,
since it is quite a wordy proposition.
Solution: The wording of the question strongly suggests that the answer is inde-
pendent of the choice of A and B, so we should start with a particularly nice choice
of A and B, see what answer we get, conjecture that this is always the answer, and
then try to prove the conjecture.
Letting A = {1, 2, 3, ..., n} and B = {2n, 2n − 1, ..., n + 1}, we find that
$$\sum_{i=1}^{n} |a_i - b_i| = (2n - 1) + (2n - 3) + \cdots + 1 = n^2$$
(it is an easy induction that the sum of the first n odd positive integers is n^2).
So, we try to prove by induction that $\sum_{i=1}^{n} |a_i - b_i| = n^2$. The base case n = 1 is
trivial.
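The conjecture that the sum is always n^2 can be tested by brute force for small n before attempting the proof. This sketch runs over every way of choosing the set A (the set B is then forced) and computes the sum; the function names are made up for this illustration, and the check is evidence for the conjecture, not a proof of it.

```python
from itertools import combinations


def partition_sum(n: int, a_set: tuple[int, ...]) -> int:
    """Compute the sum of |a_i - b_i| when A = a_set and B is the rest of {1, ..., 2n}."""
    a = sorted(a_set)                                              # increasing order
    b = sorted(set(range(1, 2 * n + 1)) - set(a_set), reverse=True)  # decreasing order
    return sum(abs(x - y) for x, y in zip(a, b))


def conjecture_holds(n: int) -> bool:
    """Check that every partition of {1, ..., 2n} into two n-sets gives the sum n^2."""
    return all(partition_sum(n, A) == n * n
               for A in combinations(range(1, 2 * n + 1), n))


print([conjecture_holds(n) for n in range(1, 7)])  # [True, True, True, True, True, True]
```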
For n ≥ 2, we can consider two cases. Case 1 is when 1 and 2n end up in different
partition classes. Let’s start by considering 1 ∈ A, 2n ∈ B. In this case, |a_1 − b_1| will
contribute 2n − 1 to the sum. What about the remaining terms? Notice that A \ {1}
and B \ {2n} form a partition {a_2, ..., a_n} ∪ {b_2, ..., b_n} of {2, ..., 2n − 1}, with the
a’s increasing and the b’s decreasing. Setting a'_1 = a_2 − 1, a'_2 = a_3 − 1, etc., up to
a'_{n−1} = a_n − 1, and also setting b'_1 = b_2 − 1, b'_2 = b_3 − 1, etc., up to b'_{n−1} = b_n − 1, we
get that A' and B' form a partition {a'_1, ..., a'_{n−1}} ∪ {b'_1, ..., b'_{n−1}} of {1, ..., 2n − 2},
with the a'_i increasing and the b'_i decreasing. By induction,
$$\sum_{i=1}^{n-1} |a'_i - b'_i| = (n-1)^2,$$
(Note that this is valid even if x = n − 1, the largest it can possibly be; the second
sum in this case is empty and so 0). If we modify A and B to form A' = {a'_1, ..., a'_n},
B' = {b'_1, ..., b'_n} by swapping 1 and x + 1, then
$$\begin{aligned}
\sum_{i=1}^{n} |a'_i - b'_i| &= \sum_{i=1}^{x} (b_i - (i+1)) + \sum_{i=x+1}^{n-1} |a_i - b_i| + (2n - 1) \\
&= \sum_{i=1}^{x} (b_i - i) + \sum_{i=x+1}^{n-1} |a_i - b_i| + (2n - (x+1)) \\
&= \sum_{i=1}^{n} |a_i - b_i|.
\end{aligned}$$
3 Week 2 (August 18) — Pigeonhole principle
“If n + 1 pigeons settle themselves into a roost that has only n pigeonholes,
then there must be at least one pigeonhole that has at least two pigeons.”
This very simple principle, sometimes called the box principle, and sometimes Dirichlet’s
box principle, can be very powerful.
The proof is trivial: number the pigeonholes 1 through n, and consider the case where
a_i pigeons land in hole i. If each a_i ≤ 1, then $\sum_{i=1}^{n} a_i \leq n$, contradicting the fact that (since
there are n + 1 pigeons in all) $\sum_{i=1}^{n} a_i = n + 1$.
Since it’s a simple principle, to get some power out of it it has to be applied cleverly
(in the examples, there will be at least one such clever application). Applying the principle
requires identifying what the pigeons should be, and what the pigeonholes should be;
sometimes this is far from obvious.
The pigeonhole principle has many obvious generalizations. I’ll just state one of them:
“if more than mn pigeons settle themselves into a roost that has no more than
n pigeonholes, then there must be at least one pigeonhole that has at least
m + 1 pigeons”.
Example: 10 points are placed randomly in a 1 by 1 square. Show that there must be
some pair of points that are within distance √2/3 of each other.
Solution: Divide the square into 9 smaller squares, each of dimension 1/3 by 1/3. These
are the pigeonholes. The ten randomly chosen points are the pigeons. By the pigeonhole
principle, at least one of the 1/3 by 1/3 squares must have at least two of the ten points
in it. The maximum distance between two points in a 1/3 by 1/3 square is the distance
between two opposite corners. By Pythagoras this is √((1/3)^2 + (1/3)^2) = √2/3, and we
are done.
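The argument above is constructive: dropping each point into its 1/3 by 1/3 cell finds a close pair as soon as two points share a cell. Here is a small sketch of that procedure on random points; the function name `close_pair` and the choice to assign boundary points to the higher-indexed cell are assumptions made for this illustration.

```python
import math
import random


def close_pair(points: list[tuple[float, float]]):
    """Return two points lying in the same 1/3-by-1/3 cell of the unit square, if any."""
    cells = {}
    for p in points:
        # Cell indices 0, 1, 2 in each direction; min(...) keeps x = 1 or y = 1 in the last cell.
        key = (min(int(3 * p[0]), 2), min(int(3 * p[1]), 2))
        if key in cells:
            return cells[key], p  # two pigeons in one pigeonhole
        cells[key] = p
    return None  # can only happen with at most 9 points


points = [(random.random(), random.random()) for _ in range(10)]
p, q = close_pair(points)
print(math.dist(p, q) <= math.sqrt(2) / 3 + 1e-9)  # True (small tolerance for rounding)
```

Note that the pair returned need not be the closest pair overall; the pigeonhole principle only guarantees that some pair is this close.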
Example: Show that there are two people in New York City who have exactly the same
number of hairs on their head.
Solution: Trivial, because surely there are at least two baldies in NYC! But even if we
weren’t sure of that: a quick websearch shows that a typical human head has around
150,000 hairs, and it is then certainly reasonable to assume that no one has more than
5,000,000 hairs on their head. Set up 5,000,001 pigeonholes, numbered 0 through 5,000,000,
and place a resident of NYC (a “pigeon”) into bin i if (s)he has i hairs on his or her head.
Another websearch shows that the population of NYC is around 8,300,000, so there are
more pigeons than pigeonholes, and some pigeonhole must have multiple pigeons in it.
Example: Show that every sequence of nm + 1 real numbers must contain EITHER a
decreasing subsequence of length n + 1 OR an increasing subsequence of length m + 1.
(In a sequence a_1, a_2, ..., an increasing subsequence is a subsequence a_{i_1}, a_{i_2}, ... [with i_1 <
i_2 < ...] satisfying a_{i_1} ≤ a_{i_2} ≤ ..., and a decreasing subsequence is defined analogously).
Solution: Let the sequence be a_1, ..., a_{nm+1}. For each k, 1 ≤ k ≤ nm + 1, let f(k) be
the length of the longest decreasing subsequence that starts with a_k, and let g(k) be the
length of the longest increasing subsequence that starts with a_k. Notice that f(k), g(k) ≥ 1
always.
If there is a k with either f(k) ≥ n + 1 or g(k) ≥ m + 1, we are done. If not, then
for every k we have 1 ≤ f(k) ≤ n and 1 ≤ g(k) ≤ m. Set up nm pigeonholes, with each
pigeonhole labeled by a different pair (i, j), 1 ≤ i ≤ n, 1 ≤ j ≤ m (there are exactly nm
such pairs). For each k, 1 ≤ k ≤ nm + 1, put a_k in pigeonhole (i, j) iff f(k) = i and
g(k) = j. There are nm + 1 pigeons but only nm pigeonholes, so one pigeonhole, say hole
(r, s), has at least two pigeons in it.
In other words, there are two terms of the sequence, say a_p and a_q (where without loss
of generality p < q), with f(p) = f(q) = r and g(p) = g(q) = s.
Suppose a_p ≥ a_q. Then we can find a decreasing subsequence of length r + 1 starting
from a_p, by starting a_p, a_q, and then proceeding with any decreasing subsequence of length
r that starts with a_q (one such exists, since f(q) = r). But that says that f(p) ≥ r + 1,
contradicting f(p) = r.
On the other hand, suppose a_p ≤ a_q. Then we can find an increasing subsequence of
length s + 1 starting from a_p, by starting a_p, a_q, and then proceeding with any increasing
subsequence of length s that starts with a_q (one such exists, since g(q) = s). But that says
that g(p) ≥ s + 1, contradicting g(p) = s.
So, whether a_p ≥ a_q or a_p ≤ a_q, we get a contradiction, and we CANNOT ever be in
the case where there is NO k with either f(k) ≥ n + 1 or g(k) ≥ m + 1. This completes
the proof.
Remark: This beautiful result was discovered by P. Erdős and G. Szekeres in 1935; the
incredibly clever application of pigeonholes was given by A. Seidenberg in 1959.
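The labels f(k) and g(k) used in the proof are easy to compute, which makes it possible to watch the argument at work: the pairs (f(k), g(k)) always come out pairwise distinct, and that is exactly what forces a long monotone subsequence. The sketch below computes the labels by working backwards through the sequence, using the weak inequalities (≤ and ≥) from the definition above; the function names are made up for this illustration.

```python
def longest_from_each(seq, keeps_going):
    """best[k] = length of the longest subsequence starting at seq[k] in which
    every consecutive pair (x, y) satisfies keeps_going(x, y)."""
    n = len(seq)
    best = [1] * n
    for k in range(n - 1, -1, -1):
        for j in range(k + 1, n):
            if keeps_going(seq[k], seq[j]):
                best[k] = max(best[k], 1 + best[j])
    return best


def labels(seq):
    """The pairs (f(k), g(k)) from the proof, one for each term of the sequence."""
    f = longest_from_each(seq, lambda x, y: x >= y)  # decreasing, starting at a_k
    g = longest_from_each(seq, lambda x, y: x <= y)  # increasing, starting at a_k
    return list(zip(f, g))


def has_long_monotone(seq, n, m):
    """True if some decreasing subsequence has length n + 1 or some increasing one has length m + 1."""
    return any(f >= n + 1 or g >= m + 1 for f, g in labels(seq))


seq = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # length 10 = 3 * 3 + 1
print(labels(seq))
print(has_long_monotone(seq, 3, 3))  # True
```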
3.1 Some problems to work on for week 3
1. Given m integers a_1, ..., a_m, show that there is a consecutive subsequence whose
sum is divisible by m. (A consecutive subsequence means a subsequence
2. (a) 51 different integers are chosen between 1 and 100, inclusive. Show that some
two of them are coprime (have no prime factor in common).
(b) 51 different integers are chosen between 1 and 100, inclusive. Show that there
are some two of them such that one divides the other.
3. Prove that from a set of ten distinct two-digit numbers, it is possible to select two
nonempty disjoint subsets whose members have the same sum.
6. Show that among any 256 people, there are either some 5 of them who mutually know
each other, or some 5 who mutually don’t know each other. (The relation “knowing”
is assumed to be symmetric — if I know you, you know me, and vice-versa.)
8. Start at the bottom right-hand corner of a square, and draw a wrap-around straight
line.²⁰ Show that if the slope of the line is irrational, then eventually the line will
get arbitrarily close to every point of the square.²¹
20
meaning: if the line hits the right-hand boundary of the square, three-quarters of the way up, then it
immediately jumps to the left-hand boundary, three-quarters of the way up, and keeps the same slope;
and if it hits the top boundary one-ninth of the way along, then it immediately jumps to the bottom
boundary, one-ninth of the way along, and keeps the same slope; et cetera.
21
meaning: for every point x in the square, and for every ε > 0, at some point the line will pass through
a point that is within ε of x.
3.2 Solutions to pigeon hole problems
1. Given m integers a_1, ..., a_m, show that there is a consecutive subsequence whose
sum is divisible by m. (A consecutive subsequence means a subsequence
2. (a) 51 different integers are chosen between 1 and 100, inclusive. Show that some
two of them are coprime (have no prime factor in common).
Solution: The two parts to this problem were favorites of Paul Erdős. The first
is often called “Pósa’s soup problem”; see http://www.math.uwaterloo.ca/
navigation/ideas/articles/honsberger/index.shtml for an explanation.
Among 51 numbers chosen from between 1 and 100, two must be consecutive,
and so coprime (use pigeon-hole principle with 50 pigeon-holes labelled “1, 2”,
“3, 4”, etc., up to “99, 100”).
(b) 51 different integers are chosen between 1 and 100, inclusive. Show that there
are some two of them such that one divides the other.
Solution: Every positive whole number can be expressed uniquely as n = m·2^k
where m is odd and k is a non-negative whole number. Create 50 pigeon-holes
labelled “1”, “3”, etc., up to “99”. Place number n in pigeon-hole labelled “m”
if n = m·2^k for some non-negative whole number k. By the pigeon-hole principle,
there is some odd m such that there are two distinct numbers n_1, n_2 among the
51 with n_1 = m·2^(k_1) and n_2 = m·2^(k_2). The smaller of these divides the larger.
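The pigeonholes in part (b) can be computed directly: strip the factors of 2 from each number to get its odd part m, and look for two numbers with the same odd part. A sketch of this follows; the names `odd_part` and `dividing_pair` are made up for this illustration.

```python
import random


def odd_part(n: int) -> int:
    """Return the odd m with n = m * 2^k."""
    while n % 2 == 0:
        n //= 2
    return n


def dividing_pair(numbers):
    """Return (small, large) with the same odd part, so that small divides large; None if no such pair."""
    seen = {}  # pigeonhole label m -> a number already placed in that hole
    for n in sorted(numbers):
        m = odd_part(n)
        if m in seen:
            return seen[m], n
        seen[m] = n
    return None


chosen = random.sample(range(1, 101), 51)
small, large = dividing_pair(chosen)
print(small, large, large % small == 0)
```

The bound 51 cannot be lowered: the 50 numbers 51, 52, ..., 100 have pairwise distinct odd parts, and `dividing_pair(range(51, 101))` returns `None`.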
3. Prove that from a set of ten distinct two-digit numbers, it is possible to select two
nonempty disjoint subsets whose members have the same sum.
(Justification: it could only happen that A' is empty if A ⊆ B; but since
A and B are not the same, we would then have B = A ∪ C for some
non-empty C, so the sum of the elements in B would be greater than that
in A, a contradiction. And similarly we can’t have B' empty.)
Also, since we have removed the same set of elements from both A and B to get A'
and B', and the sum of the elements of A is the same as that of B, it follows that
the sum of the elements of A' is the same as that of B'.
Solution: This was Problem A4 of the 1994 Putnam competition. The following
solution is due to Kiran Kedlaya:
First recall that an integer matrix A has an integer inverse if and only if det(A) = ±1.
But if x and y both belong to {0, 1, 2, 3, 4} and |x − y| ≥ 3, then |f(x) − f(y)| ≤
|f(x)| + |f(y)| = 2, so f(x) − f(y), being a multiple of x − y, must be 0. Using this, we
conclude f(3) = f(0) = f(4) = f(1) (apply what we just said to the two sides of each
equality). Thus the quadratic polynomial f(n) − f(1) has four zeroes; that’s too many,
so it must be the zero polynomial, and f(n) = f(1) = ±1 for all n.
Aside: It’s not enough to know that A, A + B, A + 2B, A + 3B are all invertible
with integer entries in their inverses, to conclude that for all n, A + nB is invertible
with integer entries in its inverse. To see this, consider:
$$A = \begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
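The point of this example can be checked by computing det(A + nB) for n = 0, ..., 4, using the criterion recalled above (an integer matrix has an integer inverse if and only if its determinant is ±1). A short sketch, with helper names made up for this illustration:

```python
A = ((-1, 1), (1, -2))
B = ((1, 0), (0, 1))


def det2(M) -> int:
    """Determinant of a 2-by-2 matrix given as a tuple of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def a_plus_nb(n: int):
    """The matrix A + nB."""
    return tuple(tuple(A[i][j] + n * B[i][j] for j in range(2)) for i in range(2))


dets = [det2(a_plus_nb(n)) for n in range(5)]
print(dets)  # [1, -1, -1, 1, 5]
```

So A, A + B, A + 2B and A + 3B all have determinant ±1, but A + 4B has determinant 5 and hence no integer inverse.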
written a nonnegative integer such that the sum of all 20 integers is 39. Show that
there are two faces that share a vertex and have the same integer written on them.
We have just argued that for each x, the inner sum is at least 10, so the double
sum is at least 120. But because each face is a triangle, the double sum counts
each number exactly three times, and hence the sum is 3 × 39 = 117. This is a
contradiction; hence, there must be two faces that share a vertex and have the same
integer written on them.
6. Show that among any 256 people, there are either some 5 of them who mutually know
each other, or some 5 who mutually don’t know each other. (The relation “knowing”
is assumed to be symmetric — if I know you, you know me, and vice-versa.)
among any $4^{n-1}$ people, there are either some n of them who mutually
know each other, or some n who mutually don’t know each other.
may know or not know (not counting $a_1$), she either knows at least $2^{2n-4}$ of them,
or doesn’t know this many; in the former case, label $a_2$ “K” and select a subset
of size $2^{2n-4}$ of people (other than $a_1$) that she knows, removing all others from
consideration; in the latter case, label $a_2$ “D” and select a subset of size $2^{2n-4}$ of
people (other than $a_1$) that she doesn’t know, removing all others from consideration.
Iterate this process until we have selected $a_1, a_2, \ldots, a_{2n-2}$. Notice that when we
consider $a_{2n-2}$, there is one person left unconsidered (if $a_{2n-2}$ knows this person, she
gets label “K”; if not, label “D”). Call this last person $a_{2n-1}$.
Two labels have been used to label $a_1$ through $a_{2n-2}$, so, by pigeon-hole, one of the
labels must be used at least n − 1 times [note that we could have said the same thing
if we had only up to $a_{2n-3}$; so the “$4^{n-1}$” at the beginning of the problem could
be replaced by “$4^{n-1}/2$”]. Say that that label is “K”. Then any collection of n − 1
of the $a_i$’s with label “K”, together with $a_{2n-1}$, form a collection of n people who
mutually know each other (that $a_i$ knows $a_j$ for i < j follows from the fact that $a_i$
has label “K”). On the other hand, if that label is “D” then any collection of n − 1
of the $a_i$’s with label “D”, together with $a_{2n-1}$, form a collection of n people who
mutually don’t know each other.
7. The Fibonacci numbers are defined by the recurrence $f_0 = 0$, $f_1 = 1$ and
$f_n = f_{n-1} + f_{n-2}$ for n ≥ 2. Show that the Fibonacci sequence is periodic modulo any
positive integer. (I.e., show that for each k ≥ 1, the sequence whose nth term is the
remainder of $f_n$ on division by k is a periodic sequence.)
Solution: I found this on the Northwestern Putnam prep class webpage.
Consider the sequence obtained from the Fibonacci sequence by taking the remainder
of each term on division by k (so the result is a sequence, all terms in {0, . . . , k − 1}).
Suppose that there are two consecutive terms in this sequence, say the mth and
(m + 1)st, taking values a, b, and two other consecutive terms, say the nth and
(n + 1)st, taking the same values a, b (with m < n). Then the (m + 2)nd and (n + 2)nd
terms of the reduced sequence agree.
[WHY? Because the (m + 2)nd term is the remainder of $F_{m+2}$ on division by k,
which is the remainder of $F_m + F_{m+1}$ on division by k, which is the remainder of
a + b on division by k (since $F_m$ and $F_{m+1}$ leave remainders a and b), which is
the remainder of $F_n + F_{n+1}$ on division by k (since $F_n$ and $F_{n+1}$ also leave
remainders a and b), which is the remainder of $F_{n+2}$ on division by k.]
The same argument shows that the reduced sequence is periodic beyond the mth
term, with period (at most) n − m.
So all we need to do to find periodicity is to find two consecutive terms in the
sequence, that agree with two other consecutive terms. There are only $k^2$ possibilities
for a pair of consecutive values in the sequence, and infinitely many pairs of consecutive
terms, so by the pigeon-hole principle there has to be a coincidence of the required kind.
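To see the argument in action, here is a minimal sketch that finds the period of the Fibonacci sequence modulo k by watching for the starting pair (0, 1) to recur. The function name `fib_period_mod` is mine. The loop is guaranteed to stop because the map (a, b) → (b, a + b) modulo k is invertible, so the sequence of pairs must return to its starting pair; the pigeon-hole count above says this happens within $k^2$ steps.

```python
def fib_period_mod(k):
    """Smallest p >= 1 with (f_p, f_{p+1}) == (f_0, f_1) modulo k."""
    start = (0, 1 % k)  # 1 % k handles the degenerate case k = 1
    a, b = start
    p = 0
    while True:
        a, b = b, (a + b) % k  # advance one step of the recurrence mod k
        p += 1
        if (a, b) == start:
            return p

for k in (2, 3, 5, 10):
    print(k, fib_period_mod(k))
```

Running it for a range of k also confirms that the period never exceeds the pigeon-hole bound $k^2$.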
8. Start at the bottom right hand corner of a square, and draw a wrap-around straight
line [22]. Show that if the slope of the line is irrational, then eventually the line will
get arbitrarily close to every point of the square [23].
https://en.wikipedia.org/wiki/Linear_flow_on_the_torus,
and see
https://www.desmos.com/calculator/ghfkyjx2lc
[22] meaning: if the line hits the right-hand boundary of the square, three-quarters of the way up, then it
immediately jumps to the left-hand boundary, three-quarters of the way up, and keeps the same slope;
and if it hits the top boundary one-ninth of the way along, then it immediately jumps to the bottom
boundary, one-ninth of the way along, and keeps the same slope; et cetera.
[23] meaning: for every point x in the square, and for every ε > 0, at some point the line will pass through
a point that is within ε of x.
4 Week 3 (August 25) — Binomial coefficients
Binomial coefficients crop up quite a lot in Putnam problems. This handout presents some
ways of thinking about them.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
etc.
Pascal’s triangle in numbers
The kth entry in row n (counting from 0 rather than 1 both down and across) is then
$\binom{n}{k}$ (this is just a restatement of Pascal’s identity) (see the picture below).
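$$\begin{array}{c}
\binom{0}{0} \\
\binom{1}{0}\ \binom{1}{1} \\
\binom{2}{0}\ \binom{2}{1}\ \binom{2}{2} \\
\binom{3}{0}\ \binom{3}{1}\ \binom{3}{2}\ \binom{3}{3} \\
\binom{4}{0}\ \binom{4}{1}\ \binom{4}{2}\ \binom{4}{3}\ \binom{4}{4} \\
\binom{5}{0}\ \binom{5}{1}\ \binom{5}{2}\ \binom{5}{3}\ \binom{5}{4}\ \binom{5}{5} \\
\binom{6}{0}\ \binom{6}{1}\ \binom{6}{2}\ \binom{6}{3}\ \binom{6}{4}\ \binom{6}{5}\ \binom{6}{6}
\end{array}$$
etc.
Pascal’s triangle symbolically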
Finally, there is an algebraic expression for $\binom{n}{k}$, that makes sense for all n, k ≥ 0,
using the factorial function (defined combinatorially as the number of ways of arranging n
distinct objects in order, and algebraically by n! = n(n − 1)(n − 2) . . . (3)(2)(1) for n ≥ 1,
with 0! = 1):
$$\binom{n}{k} = \frac{n(n-1)\cdots(n-(k-1))}{k!} = \frac{n!}{k!\,(n-k)!}.$$
To see this, note that n(n − 1) . . . (n − (k − 1)) is fairly evidently the number of ordered
lists of k distinct elements from {1, . . . , n} (often referred to in textbooks as “permutations
of n items taken k at a time” — ugh). When the ordered lists are turned into (unordered)
subsets, each subset appears k! times (once for each of the k! ways of putting k distinct
objects into an ordered list), so we need to divide the ordered count by k! to get the
unordered count.
When dealing with binomial coefficients, it is very helpful to bear all three definitions
in mind, but in particular the first two.
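As an illustration, the three definitions can be compared directly on small cases. This is only a sketch; the function names `binom_subsets`, `binom_pascal` and `binom_factorial` are mine, and the factorial version is restricted to 0 ≤ k ≤ n, where the formula n!/(k!(n − k)!) makes sense.

```python
from itertools import combinations
from math import factorial

def binom_subsets(n, k):
    """Combinatorial definition: count the k-element subsets of an n-element set."""
    return sum(1 for _ in combinations(range(n), k))

def binom_pascal(n, k):
    """Recursive definition via Pascal's identity."""
    if k < 0 or k > n:
        return 0
    if k == 0:
        return 1
    return binom_pascal(n - 1, k - 1) + binom_pascal(n - 1, k)

def binom_factorial(n, k):
    """Algebraic definition n! / (k! (n-k)!), taken to be 0 outside 0 <= k <= n."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

# Print row 6 of Pascal's triangle three ways.
for f in (binom_subsets, binom_pascal, binom_factorial):
    print([f(6, k) for k in range(7)])
```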
Identities
The binomial coefficients satisfy a staggering number of identities. The simplest of these
are easily understood using either the combinatorial or algebraic definitions; for the more
involved ones, that include sums, the algebraic definition is usually next to useless, and
often the easiest way to prove the identity is combinatorially, by showing that both sides
of the identity count the same thing in different ways (illustration below), though it is
often possible also to prove these identities by induction, using the recurrence relation.
Another approach that is helpful is that of generating functions.
Here are some of the basic binomial coefficient identities:
1. (Symmetry)
$$\binom{n}{k} = \binom{n}{n-k}$$
(Proof: trivial from the algebraic definition; combinatorially, left-hand side counts
selection of subsets of size k from a set of size n, by naming the selected elements;
right-hand side also counts selection of subsets of size k from a set of size n, this
time by naming the unselected elements).
2. (Lower summation)
$$\sum_{k=0}^{n} \binom{n}{k} = 2^n$$
(Proof: close to impossible using the algebraic definition; combinatorially, very
straightforward: left-hand side counts the number of subsets of a set of size n, by
first deciding the size of the subset, and then choosing the subset itself; right-hand
side also counts the number of subsets of a set of size n, by going through the n
elements one-by-one and deciding whether they are in the subset or not).
3. (Upper summation)
$$\sum_{m=k}^{n} \binom{m}{k} = \binom{n+1}{k+1}.$$
4. (Parallel summation)
$$\sum_{k=0}^{n} \binom{m+k}{k} = \binom{n+m+1}{n}.$$
5. (Square summation)
$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$$
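Before trying to prove identities like these, it can be reassuring to test them numerically on small cases. The following is a sketch using Python's `math.comb`; the function name `check_identities` and the cutoff `n_max` are my own choices, and a passing check is of course evidence, not a proof.

```python
from math import comb

def check_identities(n_max=8):
    """Check the five basic identities for all small parameter values."""
    for n in range(n_max + 1):
        # Lower summation and square summation.
        assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n
        assert sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n)
        for k in range(n + 1):
            # Symmetry and upper summation.
            assert comb(n, k) == comb(n, n - k)
            assert sum(comb(m, k) for m in range(k, n + 1)) == comb(n + 1, k + 1)
        for m in range(n_max + 1):
            # Parallel summation.
            assert sum(comb(m + k, k) for k in range(n + 1)) == comb(n + m + 1, n)
    return True

print(check_identities())
```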
We will prove this combinatorially when z is a positive integer. The left-hand side counts
the number of words of length n from alphabet {0, 1, 2, . . . , z}, by deciding on the letters
one after the other. The right-hand side also counts the number of words of length n
from alphabet {0, 1, 2, . . . , z}, as follows: first decide how many of the letters of the word
are from {1, . . . , z} (this is the k of the summation). Next, decide the location of these
k letters (this is the $\binom{n}{k}$). Finally, decide what specific letters go into those spots, one
after another (this is the $z^k$) (note that the remaining n − k letters must all be 0’s).
This only shows the identity for positive integer z. But now we use the fact that
both the right-hand and left-hand sides are polynomials of degree n, so if they agree at
n + 1 different values of z, they must agree at all values of z (otherwise, their difference
is a not-identically-zero polynomial of degree at most n with n + 1 distinct roots, an
impossibility). And indeed, the two sides agree not just at n + 1 different values of z, but
at infinitely many (all positive integers z). So from the combinatorial argument that shows
that the two sides are equal for positive integers z, we infer that they are equal for all real
z. This argument is often called the polynomial principle.
There is a version of the binomial theorem also for non-positive-integral exponents: for
all real α,
$$(1 + z)^{\alpha} = \sum_{k \ge 0} \binom{\alpha}{k} z^k$$
where $\binom{\alpha}{k}$ is defined in the obvious way:
$$\binom{\alpha}{k} = \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!} \quad \left(\text{for } k \ge 1;\ \binom{\alpha}{0} = 1\right),$$
and the equality is valid for all real |z| < 1. (Check: when α is a positive integer, this
reduces to the standard binomial theorem).
For example, if ℓ is a positive integer, then
$$\binom{-\ell}{k} = \frac{(-\ell)(-\ell-1)\cdots(-\ell-k+1)}{k!} = (-1)^k \binom{\ell+k-1}{k},$$
and so
$$\frac{1}{(1-z)^{\ell}} = \sum_{k \ge 0} \binom{\ell+k-1}{k} z^k.$$
This generalizes the familiar identity
$$\frac{1}{1-z} = 1 + z + z^2 + \cdots.$$
Modulo the convergence analysis, the proof of the binomial theorem for general exponents
is fairly easy: the coefficient of $z^k$ in the Taylor series of $(1 + z)^{\alpha}$ is
$$\frac{1}{k!} \frac{d^k}{dz^k} (1+z)^{\alpha} \Big|_{z=0} = \binom{\alpha}{k}.$$
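A quick floating-point sanity check of the general binomial theorem, as a sketch: `gen_binom` and `binomial_series` are names I have made up, the series is truncated at an arbitrary 200 terms, and the comparison is only meaningful for |z| < 1, where the series converges.

```python
def gen_binom(alpha, k):
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-k+1)/k! as a float."""
    result = 1.0
    for i in range(k):
        result *= (alpha - i) / (i + 1)
    return result

def binomial_series(alpha, z, terms=200):
    """Partial sum of sum_{k>=0} binom(alpha, k) z^k; assumes |z| < 1."""
    return sum(gen_binom(alpha, k) * z ** k for k in range(terms))

# Compare the truncated series with the closed form (1 + z)^alpha.
print(binomial_series(0.5, 0.3), 1.3 ** 0.5)
print(binomial_series(-3, -0.2), 0.8 ** -3)
```

The second line is the case ℓ = 3 of the expansion of 1/(1 − z)^ℓ, evaluated at z = 0.2.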
example, the configuration **||*|*** encodes the weak composition (2, 0, 1, 3) of 6 into
4 parts. So, the number of weak compositions of n into k parts is a binomial coefficient,
$$\binom{n+k-1}{k-1}.$$
How many compositions of n are there, into k parts? Each such composition $(x_1, x_2, \ldots, x_k)$
gives rise to a weak composition $(x_1 - 1, x_2 - 1, \ldots, x_k - 1)$ of n − k into k parts, and all
weak compositions of n − k into k parts are achieved by this process. So, the number of
compositions of n into k parts is the same as the number of weak compositions of n − k
into k parts, which is
$$\binom{(n-k)+k-1}{k-1} = \binom{n-1}{k-1}.$$
For example: I like plain cake, chocolate cake, blueberry cake and pumpkin cake donuts
from Dunkin’ Donuts. In how many different ways can I buy a dozen donuts that I like? I
must buy $x_1$ plain, $x_2$ chocolate, $x_3$ blueberry and $x_4$ pumpkin, with $x_1 + x_2 + x_3 + x_4 = 12$,
and with each $x_i$ a non-negative integer (possibly 0). So the number of different purchases
I can make is the number of weak compositions of 12 into 4 parts, so $\binom{12+4-1}{4-1} = \binom{15}{3} = 455$.
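The donut count is small enough to check by brute force. The sketch below simply enumerates all candidate tuples and compares the counts with the formulas $\binom{n+k-1}{k-1}$ (weak compositions) and $\binom{n-1}{k-1}$ (compositions); the function names are mine, and the enumeration is only practical for small n and k.

```python
from itertools import product
from math import comb

def count_weak_compositions(n, k):
    """Number of k-tuples of non-negative integers summing to n (brute force)."""
    return sum(1 for parts in product(range(n + 1), repeat=k) if sum(parts) == n)

def count_compositions(n, k):
    """Number of k-tuples of positive integers summing to n (brute force)."""
    return sum(1 for parts in product(range(1, n + 1), repeat=k) if sum(parts) == n)

# A dozen donuts of 4 kinds, compared with the stars-and-bars formula.
print(count_weak_compositions(12, 4), comb(12 + 4 - 1, 4 - 1))
```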
4.2 Problems to think about for week 4
1. Find a simple expression (not involving a sum) for
$$1^2\binom{n}{1} + 2^2\binom{n}{2} + 3^2\binom{n}{3} + \cdots + n^2\binom{n}{n}.$$
2. n points are arranged on a circle. All possible diagonals are drawn. Assuming that
no three of the diagonals meet at a single point, how many intersections of diagonals
are there inside the circle?
3. (a) The kth falling power of x is $x^{\underline{k}} = x(x-1)(x-2)\cdots(x-(k-1))$. Prove [24]
that for all real x, y, and all n ≥ 1,
$$(x+y)^{\underline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\underline{n-k}}\, y^{\underline{k}}.$$
(b) The kth rising power of x is $x^{\overline{k}} = x(x+1)(x+2)\cdots(x+(k-1))$. Prove [25] that
for all real x, y, and all n ≥ 1,
$$(x+y)^{\overline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\overline{n-k}}\, y^{\overline{k}}.$$
This is a quite standard way to extend the binomial coefficients beyond integer arguments.
7. Define a selfish set to be a set which has its own cardinality (number of elements) as
an element. Find the number of subsets of {1, 2, . . . , n} which are minimal selfish
sets, that is, selfish sets none of whose proper subsets is selfish.
8. Three distinct vertices are chosen at random from the vertices of a regular (2n + 1)-
sided polygon. If all such choices are equally likely, what is the probability that
the center of the polygon lies in the interior of the triangle determined by the three
chosen vertices?
specifying k, the number of elements chosen from A, then selecting k elements from
A ($\binom{m}{k}$ ways), then selecting the remaining r − k elements from B ($\binom{n}{r-k}$ ways).
5. Evaluate
$$\sum_{k=0}^{n} (-1)^k \binom{n}{k}$$
for n ≥ 1.
so the sum is 0.
6. (a) Let $a_n$ be the number of 0-1 strings of length n that do not have two consecutive
1’s. Find a recurrence relation for $a_n$ (starting with initial conditions $a_0 = 1$,
$a_1 = 2$).
Solution: By considering whether the last term is a 0 or a 1, get the Fibonacci
recurrence: $a_n = a_{n-1} + a_{n-2}$.
(b) Let $a_{n,k}$ be the number of 0-1 strings of length n that have exactly k 1’s and that
do not have two consecutive 1’s. Express $a_{n,k}$ as a (single) binomial coefficient.
Solution: Add a 0 to the beginning and end of such a string. By reading off
$a_1$, the number of 0’s before the first 1, then $a_2$, the number of 0’s between the
first 1 and the second, and so on up to $a_{k+1}$, the number of 0’s after the last 1,
we get a composition $(a_1, \ldots, a_{k+1})$ of n + 2 − k into k + 1 parts; and each such
composition can be encoded (uniquely) by such a string. So $a_{n,k}$ is the number
of compositions of n + 2 − k into k + 1 parts, and so equals $\binom{n+1-k}{k}$.
(c) Use the results of the previous two parts to give a combinatorial proof (showing
that both sides count the same thing) of the identity
$$F_n = \sum_{k \ge 0} \binom{n-k-1}{k}$$
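The counts in parts (a) and (b), and the identity in part (c), can be checked by enumerating strings. This is a brute-force sketch with names of my own choosing (`count_no_11`, `fib_by_binomials`); the sum in `fib_by_binomials` is cut off where the binomial coefficients become zero, and it assumes the indexing $F_1 = F_2 = 1$.

```python
from itertools import product
from math import comb

def count_no_11(n, k=None):
    """Count 0-1 strings of length n with no two consecutive 1's.

    If k is given, count only those with exactly k 1's.
    """
    count = 0
    for bits in product((0, 1), repeat=n):
        if any(bits[i] and bits[i + 1] for i in range(n - 1)):
            continue
        if k is None or sum(bits) == k:
            count += 1
    return count

def fib_by_binomials(n):
    """Right-hand side of the identity in part (c), for n >= 1."""
    return sum(comb(n - k - 1, k) for k in range((n - 1) // 2 + 1))

print([count_no_11(n) for n in range(8)])
print([fib_by_binomials(n) for n in range(1, 10)])
```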
4.4 Solutions to Binomial coefficient problems
1. Find a simple expression (not involving a sum) for
$$1^2\binom{n}{1} + 2^2\binom{n}{2} + 3^2\binom{n}{3} + \cdots + n^2\binom{n}{n}.$$
Solution: This was on the Putnam in 1962. It was question A5. These days, A5 is
typically a much more involved question!
2. n points are arranged on a circle. All possible diagonals are drawn. Assuming that
no three of the diagonals meet at a single point, how many intersections of diagonals
are there inside the circle?
Solution: This is an old classic.
We claim that the answer is $\binom{n}{4}$.
Each intersection inside the circle determines a unique collection of four of the points
on the circle, by: two lines meet at each intersection, and each of the two lines has
two endpoints. Conversely, each set of four points on the circle determines a unique
point of intersection, by: if the four points are, in clockwise order, a, b, c, d, then
the associated point of intersection is the intersection of the lines ac and bd.
It follows that there are exactly as many intersections of diagonals inside the circle,
as there are sets of four points on the circle; and there are $\binom{n}{4}$ such sets of points.
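The bijection can be checked purely combinatorially, without any geometry: label the points 0, 1, . . . , n − 1 around the circle, and note that two chords cross inside the circle exactly when their endpoints interleave. The sketch below (function name `count_crossings` is mine) counts crossing pairs of chords; under the problem's assumption that no three diagonals are concurrent, each crossing pair gives a distinct intersection point.

```python
from itertools import combinations
from math import comb

def count_crossings(n):
    """Count pairs of chords among n points on a circle whose endpoints interleave."""
    chords = list(combinations(range(n), 2))  # each chord is (a, c) with a < c
    crossings = 0
    for (a, c), (b, d) in combinations(chords, 2):
        if len({a, b, c, d}) < 4:
            continue  # chords sharing an endpoint do not cross inside the circle
        # The chords cross exactly when one of b, d lies strictly between a and c.
        if (a < b < c) != (a < d < c):
            crossings += 1
    return crossings

print([count_crossings(n) for n in range(3, 9)])
print([comb(n, 4) for n in range(3, 9)])
```

The sides of the polygon are included among the chords here, but they never interleave with anything, so they do not affect the count.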
Solution: This part and the next are standard Binomial coefficient identities.
This problem was B1 on the 1962 Putnam competition.
An argument by induction is possible. But there is also a combinatorial
argument: Let x and y be positive integers. The number of words in alphabet
{1, . . . , x} ∪ {x + 1, . . . , x + y} of length n with no repeated letters, counted
by selecting letter-by-letter, is $(x+y)^{\underline{n}}$. If instead we count by first selecting k,
the number of letters from {x + 1, . . . , x + y} used, then locating the k positions
in which those letters appear, then selecting the n − k letters from {1, . . . , x}
letter-by-letter in the order that they appear in the word, and finally selecting
the k letters from {x + 1, . . . , x + y} letter-by-letter in the order that they
appear in the word, we get a count of $\sum_{k=0}^{n} \binom{n}{k} x^{\underline{n-k}}\, y^{\underline{k}}$. So the identity is true
for positive integers x, y.
The LHS and RHS are polynomials in x and y of degree n, so the difference
is a polynomial in x and y of degree at most n, which we want to show is
identically 0. Write the difference as $P(x, y) = p_0(x) + p_1(x)y + \ldots + p_n(x)y^n$
where each $p_i(x)$ is a polynomial in x of degree at most n. Setting x = 1 we
get a polynomial P(1, y) in y of degree at most n. This is 0 for all integers
y > 0 (by our combinatorial argument), so by the polynomial principle [28] it is
identically 0. So each $p_i(x)$ evaluates to 0 at x = 1. But the same argument
shows that each $p_i(x)$ evaluates to 0 at any positive integer x. So again by the
polynomial principle, each $p_i(x)$ is identically 0 and so P(x, y) is. This proves
the identity for all real x, y.
[27] Hint: Think combinatorially.
[28] A polynomial p(x) that is 0 at infinitely many points must be identically 0. In fact, a polynomial of
degree at most n that is 0 at n + 1 distinct points must be identically 0.
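The falling power identity can be spot-checked exactly, including at non-integer and negative arguments, using rational arithmetic. This is a sketch only: `falling` and `falling_binomial_holds` are names I have introduced, and checking finitely many rational points is a sanity test rather than a proof.

```python
from fractions import Fraction
from math import comb

def falling(x, k):
    """k-th falling power x(x-1)...(x-(k-1)); the empty product (k = 0) is 1."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

def falling_binomial_holds(x, y, n):
    """Check the falling power binomial theorem exactly at the point (x, y)."""
    lhs = falling(x + y, n)
    rhs = sum(comb(n, k) * falling(x, n - k) * falling(y, k) for k in range(n + 1))
    return lhs == rhs

print(falling_binomial_holds(3, 5, 4))
print(falling_binomial_holds(Fraction(1, 2), Fraction(-7, 3), 6))
```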
(b) The kth rising power of x is $x^{\overline{k}} = x(x+1)(x+2)\cdots(x+(k-1))$. Prove [29] that
for all real x, y, and all n ≥ 1,
$$(x+y)^{\overline{n}} = \sum_{k=0}^{n} \binom{n}{k} x^{\overline{n-k}}\, y^{\overline{k}}.$$
so the identity follows from the falling power binomial theorem (the previous
question).
4. Prove that the expression
$$\frac{\gcd(m,n)}{n}\binom{n}{m}$$
is an integer for all pairs of integers n ≥ m ≥ 1.
Solution: (Putnam competition, 2000, problem B2) We know that gcd(m, n) =
am + bn for some integers a, b; but then
$$\frac{\gcd(m,n)}{n}\binom{n}{m} = a\,\frac{m}{n}\binom{n}{m} + b\,\frac{n}{n}\binom{n}{m}.$$
Since
$$\frac{m}{n}\binom{n}{m} = \binom{n-1}{m-1}$$
(the committee-chair identity again, or easy algebra), we get
$$\frac{\gcd(m,n)}{n}\binom{n}{m} = a\binom{n-1}{m-1} + b\binom{n}{m},$$
and so (since a, b, $\binom{n-1}{m-1}$ and $\binom{n}{m}$ are all integers) we get the desired result.
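A short sketch that tests the claim for small n and m, and also the committee-chair identity used in the proof (in the cross-multiplied form m·C(n, m) = n·C(n − 1, m − 1)). The function name `gcd_binomial_quotient` is mine; it raises an error if the expression ever fails to be an integer.

```python
from math import comb, gcd

def gcd_binomial_quotient(n, m):
    """Return gcd(m, n) * C(n, m) / n, raising ValueError if it is not an integer."""
    quotient, remainder = divmod(gcd(m, n) * comb(n, m), n)
    if remainder != 0:
        raise ValueError(f"not an integer for n={n}, m={m}")
    return quotient

# Exercise every pair with 1 <= m <= n <= 40.
values = [gcd_binomial_quotient(n, m) for n in range(1, 41) for m in range(1, n + 1)]
print(len(values), gcd_binomial_quotient(6, 4))
```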
[29] Hint: deduce it from 3(a).
5. Evaluate
$$\sum_{k=0}^{n} F_{k+1}\binom{n}{k}$$
for n ≥ 0, where $F_1, F_2, F_3, F_4, F_5, \ldots$ are the Fibonacci numbers 1, 1, 2, 3, 5, . . ..
Fibonacci recurrence to break each Fibonacci number into the sum of two earlier
ones, then use Pascal’s identity to gather together terms involving the same Fibonacci
number. (Details omitted.)
This suggests that for each s ≥ 2 we use induction on m to prove the result for all
pairs (n, m) with n + m = s. The case m = 0 has been observed already. For m > 0
we have
$$\begin{aligned}
\int_0^1 x^n (1-x)^m \, dx &= \frac{m}{n+1} \int_0^1 x^{n+1} (1-x)^{m-1} \, dx \\
&= \frac{m}{n+1} \cdot \frac{1}{((n+1)+(m-1)+1)\binom{(n+1)+(m-1)}{n+1}} \\
&= \frac{m}{(n+1)(n+m+1)\binom{n+m}{n+1}}.
\end{aligned}$$
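The integral can be evaluated exactly for small n and m by expanding $(1-x)^m$ with the binomial theorem and integrating term by term, which gives a way to check both the integration-by-parts step and the final expression. This is a sketch; `beta_integral` is my own name, and the closed form tested against is the last displayed expression, which applies for m ≥ 1.

```python
from fractions import Fraction
from math import comb

def beta_integral(n, m):
    """Exact integral of x^n (1-x)^m over [0, 1].

    Expands (1-x)^m = sum_j (-1)^j C(m, j) x^j and integrates term by term.
    """
    return sum(Fraction((-1) ** j * comb(m, j), n + j + 1) for j in range(m + 1))

def closed_form(n, m):
    """m / ((n+1)(n+m+1) C(n+m, n+1)), valid for m >= 1."""
    return Fraction(m, (n + 1) * (n + m + 1) * comb(n + m, n + 1))

print(beta_integral(2, 3), closed_form(2, 3))
```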
This is a quite standard way to extend the binomial coefficients beyond integer arguments.
or select the n non-chair members from the n + m people ($\binom{n+m}{n}$ ways), and choose
the chair from among those not yet chosen (m ways).
Second solution (essentially the same as the first, written differently): The result
is trivial when one or both of n, m = 0, so we may assume both n, m ≥ 1. Consider
n ≥ 1 fixed. Using integration by parts, we get
$$\int_0^1 x^n (1-x)^m \, dx = \frac{m}{n+1} \int_0^1 x^{n+1} (1-x)^{m-1} \, dx$$
7. Define a selfish set to be a set which has its own cardinality (number of elements) as
an element. Find the number of subsets of {1, 2, . . . , n} which are minimal selfish
sets, that is, selfish sets none of whose proper subsets is selfish.
Solution: (Putnam competition 1996, problem B1) The answer is the nth Fibonacci
number. The solution given here is taken from
http://mathforum.org/kb/thread.jspa?forumID=13&threadID=20920&messageID=50140.
Write #S for the number of elements in S. Note that a set S is a minimal selfish set
if and only if #S is the smallest element of S.
Induct on n. For n = 1, there is one selfish set, namely {1}, so the number of
minimal selfish sets is 1. For n = 2, there are two selfish sets, namely {1} and {1, 2},
so the number of minimal selfish sets is 1.
Fix n ≥ 2. Assume that there are a minimal selfish subsets of {1, 2, . . . , n} and b
minimal selfish subsets of {1, 2, . . . , n − 1}. I will construct exactly a + b minimal
selfish subsets of {1, 2, . . . , n + 1}.
Construction A: Let S be a minimal selfish subset of {1, 2, . . . , n}. Then S is a
minimal selfish subset of {1, 2, . . . , n + 1}. This construction produces a minimal
selfish subsets.
Every minimal selfish subset T of {1, 2, . . . , n + 1} not containing n + 1 is obtained
from Construction A. Proof: T is a subset of {1, 2, . . . , n}.
Construction B: Let S be a minimal selfish subset of {1, 2, . . . , n − 1}. Define
T = {n + 1} ∪ {x + 1 : x ∈ S}. Then T is a minimal selfish set: its smallest element
is #S + 1 = #T. This construction produces exactly b minimal selfish subsets, all
different from the subsets in Construction A since they all contain n + 1.
Every minimal selfish subset T of {1, 2, . . . , n + 1} containing n + 1 is obtained from
Construction B. Proof: 1 ∉ T since otherwise min T = 1 < 2 ≤ #T. Consider the
set S = {x − 1 : x ∈ T, x ≠ n + 1}. Then #S = #T − 1 is the smallest element of
S, so S is a minimal selfish set. Finally T = {n + 1} ∪ {x + 1 : x ∈ S}.
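The answer can be confirmed for small n by brute force, working directly from the definitions (a set is selfish if it contains its own cardinality, and minimal selfish if no proper subset is selfish) rather than from the characterization used in the proof. The function name `count_minimal_selfish` is mine, and the search is exponential, so it is only meant for small n.

```python
from itertools import combinations

def count_minimal_selfish(n):
    """Count minimal selfish subsets of {1, ..., n} directly from the definition."""
    selfish = set()
    for size in range(1, n + 1):
        for subset in combinations(range(1, n + 1), size):
            if size in subset:  # the set contains its own cardinality
                selfish.add(frozenset(subset))
    # Keep those with no selfish proper subset (t < s is proper-subset for frozensets).
    return sum(1 for s in selfish if not any(t < s for t in selfish))

print([count_minimal_selfish(n) for n in range(1, 11)])
```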
8. Three distinct vertices are chosen at random from the vertices of a regular (2n + 1)-
sided polygon. If all such choices are equally likely, what is the probability that
the center of the polygon lies in the interior of the triangle determined by the three
chosen vertices?
Solution: (UCLA Putnam preparation class) Let the vertices be labelled clockwise
$a_1, a_2, \ldots, a_{2n+1}$. Without loss of generality, let $a_1$ be one of the points. There are
$\binom{2n}{2}$ possibilities for the remaining points. If $a_2$ is the nearest point chosen to $a_1$,
going clockwise around the polygon, then there is only one point ($a_{n+2}$) that can
complete a triangle that encloses the center. If $a_3$ is the nearest point to $a_1$ then
two points ($a_{n+2}$ and $a_{n+3}$) can complete a triangle. In general, if $a_k$ (k ≥ 2) is the
nearest point chosen to $a_1$, going clockwise around the polygon, then there are k − 1
points ($a_{n+2}$ through $a_{n+k}$) that can complete a triangle that encloses the center.
The largest value k can take is n + 1. So the number of pairs of points that can
complete a good triangle with $a_1$ is $1 + 2 + \ldots + n = \binom{n+1}{2}$, and the required
probability is
$$\frac{\binom{n+1}{2}}{\binom{2n}{2}} = \frac{n+1}{2(2n-1)}.$$
Notice that this tends to 1/4 as n tends to infinity, suggesting the following: if a
circle of radius 1 is given, a fixed point x on the perimeter is marked, two numbers
a, b are generated uniformly from the interval [0, 2π], and two points are marked
on the perimeter of the circle, one at arc-distance a from x and the other at arc
distance b (measured in a clockwise direction), then the probability that the center
of the circle lies in the triangle formed by the three marked points should be 1/4.
Does this make intuitive sense?
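The formula (n + 1)/(2(2n − 1)) can be checked by enumerating all triples of vertices. The sketch below uses the standard fact that the center of a circle lies inside an inscribed triangle exactly when each of the three arcs between consecutive vertices is less than a semicircle; for a (2n + 1)-gon that means each gap, measured in steps between vertices, is at most n. The name `center_probability` is mine.

```python
from fractions import Fraction
from itertools import combinations

def center_probability(n):
    """Exact probability that a random triangle on a regular (2n+1)-gon contains the center."""
    sides = 2 * n + 1
    good = total = 0
    for a, b, c in combinations(range(sides), 3):  # a < b < c
        total += 1
        gaps = (b - a, c - b, sides - (c - a))  # steps between consecutive chosen vertices
        if max(gaps) <= n:  # every arc is shorter than a semicircle
            good += 1
    return Fraction(good, total)

for n in (1, 2, 5, 50):
    print(n, center_probability(n), float(center_probability(n)))
```

For large n the printed values approach 1/4, in line with the limiting statement above.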
Here’s a video [31] that discusses this problem, a three-dimensional analog, and also
talks a little at the beginning about the Putnam competition:
https://youtu.be/OkmNXy7er84
[31] h/t Thomas.
5 Week 4 (September 1) — Graphs
A graph G = (V, E) consists of a set V of vertices and a set E of edges, each of which is a
two-element subset of V . Think of the vertices as points put down on a piece of paper,
and of the edges as arcs joining pairs of points. There is no inherent geometry to a graph
— all that matters is which pairs of points are joined, not the exact position of the points,
or the nature of the arcs joining them.
Thinking about the data of a problem as a graph can sometimes be helpful. Although
some Putnam problems in the past have been non-trivial results from graph theory in
disguise, there is no real need to know much graph theory, so in this discussion I’ll just
mention some basic ideas that might be useful. A little more background on graph theory
can be found, for example, at
http://www.math.ucsd.edu/~jverstra/putnam-week6.pdf.
Problem: n people go to a party, and each one counts up the number of other people she
knows at the party. Show that there are an even number of people who come up with an
answer that is an odd number. (Assuming that “knowing” is a two-way relation; I know
you if and only if you know me.)
Solution: Model the problem as a graph. V is the set of n party goers, and E consists of
all pairs of people who know each other. For person i, denote by $d_i$ the number of edges
that involve i ($d_i$ is the degree of vertex i). We have
$$\sum_{i=1}^{n} d_i = 2|E|$$
since as we run over all vertices and count degrees, each edge gets counted exactly twice
(once for each vertex in that edge). So the sum of degrees is even. But if there were an
odd number of vertices with odd degree, the sum would be odd; so there are indeed an
even number of people who know an odd number of people.
The useful fact that is true about all graphs that lies at the heart of the solution is
this: in any graph G = (V, E),
$$\sum_{i=1}^{n} d_i = 2|E|.$$
Problem: Show that two of the people at the party have the same number of friends.
Solution: The possible values for $d_i$ are 0 through n − 1, n of them, so the pigeon-hole
principle doesn’t immediately apply. But: it’s not possible for there to be one vertex with
degree 0, and another with degree n − 1. So the possible values of $d_i$ are either 1 through
n − 1, n − 1 of them, or 0 through n − 2, n − 1 of them, and in either case the pigeon-hole
principle gives that there are two people with the same number of friends.
The useful fact that is true about all graphs that lies at the heart of the solution is
this: in any graph G = (V, E) with at least two vertices, there must be two vertices with
the same degree.
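Both party facts are easy to test on randomly generated graphs. This is a sketch under my own conventions: vertices are 0, . . . , n − 1, a graph is a list of two-element tuples, and `random_graph` and `degrees` are hypothetical helper names, not anything from a graph library.

```python
import random
from itertools import combinations

def random_graph(n, p, seed):
    """Edge list of a random graph on vertices 0..n-1; each pair is an edge with probability p."""
    rng = random.Random(seed)
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]

def degrees(n, edges):
    """Degree of each vertex, as a list indexed by vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

edges = random_graph(9, 0.4, seed=1)
deg = degrees(9, edges)
print(sum(deg) == 2 * len(edges))          # degree sum is twice the number of edges
print(sum(d % 2 for d in deg) % 2 == 0)    # evenly many vertices of odd degree
print(len(set(deg)) < len(deg))            # some two vertices share a degree
```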
A walk in a graph from vertex u to vertex v is a list of (not necessarily distinct) edges,
with u in the first, v in the last, and every pair of consecutive edges sharing a vertex in
common; graphically, a walk is a way to trace a path from u to v, always using complete
arcs of the drawing, and never taking pencil off paper.
Problem: How many different walks are there from u to v, that use k edges?
Solution: Form the adjacency matrix A of the graph: rows and columns indexed by vertices,
entry (a, b) is 1 if {a, b} is an edge, and 0 otherwise. Then form the matrix $A^k$. The (u, v)
entry of this matrix is exactly the number of different walks from u to v, that use k edges.
The proof uses induction on k, and the definition of matrix multiplication. The key point
is that the number of walks from u to v that use k edges is the sum, over all neighbours w
of u (i.e., vertices w such that {u, w} is an edge), of the number of walks from w to v that
use k − 1 edges. I’ll skip the details.
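Here is a sketch comparing the two counts on a small example: the matrix power on one side, and the neighbour-by-neighbour recursion from the key point on the other. The example graph (a 4-cycle with one chord) and all function names are my own, and plain nested lists are used so that nothing beyond the standard library is needed.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(A, k):
    """k-th power of a square matrix, for k >= 0."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        result = mat_mul(result, A)
    return result

def count_walks(A, u, v, k):
    """Number of k-edge walks from u to v, by summing over the neighbours w of u."""
    if k == 0:
        return int(u == v)
    return sum(count_walks(A, w, v, k - 1) for w in range(len(A)) if A[u][w])

# Hypothetical example: the 4-cycle 0-1-2-3-0 together with the chord {0, 2}.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
print(mat_pow(A, 3)[0][2], count_walks(A, 0, 2, 3))
```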
The relation “there is a walk between” is an equivalence relation on vertices, so any
graph can be partitioned into equivalence classes, with each class having the property that
between any two vertices in the class, there is a walk, but there is no walk between any
two vertices in different classes. These classes are called components of the graph. If a
graph has just one component, meaning that between any two vertices in the graph, there
is a walk, it is said to be connected.
Problem: Given a graph G, under what circumstances is it possible to take a walk that
uses every edge of the graph exactly once, and ends up at the same vertex that it started
at?
Solution: Such a walk is called an Euler circuit, after the man who first studied them
(google “Bridges of Konigsberg”). Such a circuit is a tracing of the graphical representation
of the graph, with each arc traced out exactly once, the pencil never leaving the paper,
ending where it started. Two fairly obvious necessary conditions for the existence of an
Euler circuit are:
• the graph is connected; and
• every degree is even (because each time an Euler circuit visits a vertex, it eats up
two edges, one going in and one coming out).
Euler proved that these necessary conditions are sufficient: a connected graph has an Euler
circuit if and only if all vertex degrees are even. The details are given in any basic graph
theory textbook.
What if we don’t require the tracing to end at the same vertex it began?
Problem: Given a graph G, and two distinct vertices u and v, under what circumstances
is it possible to take a walk from u to v that uses every edge of the graph exactly once?
Solution: Such a walk is called an Euler trail. Euler proved that a connected graph has
an Euler trail from u to v if and only if all vertex degrees are even except the degrees of u
and v, which must be odd. It follows easily from his result on Euler circuits: just add an
edge from u to v, apply the Euler circuits theorem, and delete the added edge.
Problem: Given a graph G with n vertices, under what circumstances is it possible to
list the vertices in some order v1 , . . . , vn , in such a way that each of {v1 , v2 }, {v2 , v3 }, . . .,
{vn−1 , vn }, {vn , v1 } are all edges?
Solution: Such a list is called a Hamiltonian cycle, after the man who first studied them
(google “icosian game”). Unlike with Eulerian trails, there is no simple set of
necessary-and-sufficient conditions known to allow one to determine whether such a thing exists in a
given graph. There is one useful sufficient condition, due to Dirac, that has an elementary
but involved proof that can be found in any graph theory textbook.
Dirac’s theorem: A graph G with n ≥ 3 vertices has a Hamiltonian cycle if all vertices have
degree at least n/2.
A connected graph with the fewest possible number of edges is called a tree. It turns
out that all trees on n vertices have the same number of edges, namely n − 1. One way to
see this is to imagine building up the tree from a set of n totally disconnected vertices,
edge-by-edge. At each step, you should add an edge that bridges two components, since
adding an edge within a component does not help; in the end such an edge can be removed
without hurting connectivity. Since two components get merged each time an edge is
added, exactly n − 1 are needed to get to a single component.
A characterization of trees is that they are connected, but have no cycles — a cycle in
a graph is a list of distinct vertices u1 , u2 , . . . , uk such that each of {u1 , u2 }, {u2 , u3 }, . . .,
{uk−1 , uk }, {uk , u1 } are all edges.
A planar graph is a graph that can be drawn in the plane with no two arcs meeting
except at a vertex (if they have one in common). A planar drawing of a graph partitions
the plane into connected regions, called faces. Euler discovered a remarkable formula that
relates the number of vertices, edges and faces in a planar graph:
Euler’s formula: Let G be a connected planar graph with V vertices, E edges and F faces. Then
V − E + F = 2.
Proof sketch: By induction on F . If F = 1 then the graph has only one face, so it must
be a tree. A tree on V vertices has V − 1 edges, and so fits the formula.
Now suppose F > 1. Then the graph contains a cycle. If we remove an edge e of
that cycle then F drops by one, V stays the same, and E drops by 1. Now by induction,
V − (E − 1) + (F − 1) = 2 and this gives V − E + F = 2.
Problem: Show that five points can’t be connected up with arcs in the plane in such a
way that no two arcs meet each other except at a vertex (if they have one in common).
Solution: Suppose such a connection was possible. The resulting planar graph would
have 5 vertices and $\binom{5}{2} = 10$ edges, so by Euler’s formula would have 7 faces. The sum,
over the faces, of the number of edges bounding the faces, is then at least 21, since each
face has at least three bounding edges. But this sum is at most twice the number of
edges, since each edge can be on the boundary of at most two faces; so it is at most
20, a contradiction.
A bipartite graph is a graph whose vertex set can be partitioned into two classes, X
and Y , such that the graph only has edges that go from X to Y (and so none that are
entirely within X or entirely within Y ). It’s fairly easy to see that any odd-length cycle
is not bipartite, so any graph that has an odd-length cycle sitting inside it is also not
bipartite. This turns out to be a characterization of bipartite graphs; the proof can be
found in any textbook on graph theory.
Theorem: A graph is bipartite if and only if it has no odd cycle.
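In practice one tests whether a graph is bipartite by trying to 2-colour it with a breadth-first search: the attempt fails exactly when some edge joins two vertices of the same colour, which is what happens on an odd cycle. The sketch below is one standard way to do this, not something taken from these notes; the function name `two_colour` and the edge-list representation on vertices 0, . . . , n − 1 are my own assumptions.

```python
from collections import deque

def two_colour(n, edges):
    """Return a list of colours (0 or 1) giving a bipartition, or None if none exists."""
    adjacent = [[] for _ in range(n)]
    for u, v in edges:
        adjacent[u].append(v)
        adjacent[v].append(u)
    colour = [None] * n
    for start in range(n):  # handle each component separately
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacent[u]:
                if colour[v] is None:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None  # an edge inside one class: not bipartite
    return colour

def cycle(n):
    """Edge list of the cycle on vertices 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

print(two_colour(6, cycle(6)))
print(two_colour(5, cycle(5)))
```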
A matching in a graph is a set of edges, no two of which share a vertex. A perfect
matching is a matching that involves all the vertices. A famous result, whose proof can be
found in any graph theory textbook, is Hall’s marriage theorem. A consequence of it says
that if there are n women and n men, each woman likes exactly d men, and each man is
liked by exactly d women (with d ≥ 1), then it is possible to pair the men and women off
into n pairs, such that each woman is paired with a man she likes. Here’s the statement in
graph-theory language:
Theorem: Let G be a bipartite graph that is regular (all vertices have the same degree d ≥ 1).
Then G has a perfect matching.
5.1 Some problems to think about for week 5
1. In a town there are three newly built houses, and each needs to be connected by a
line to the gas, water, and electricity factories. The lines are only allowed to run
along the ground. Is there a way to make all nine connections without any of the
lines crossing each other?
2. n teams play each other in a round-robin tournament (so each team plays each of
the other teams exactly once). There are no ties. Show that there exists an ordering
of the teams, (a1 , a2 , a3 , . . . , an ), such that team a1 beats team a2 , team a2 beats
team a3 , . . ., team an−1 beats team an .
4. Show that in any connected graph, any two paths both of longest possible length
must have a vertex in common.32
5. The complete graph $K_n$ on vertex set {1, . . . , n} is the graph in which all $\binom{n}{2}$
possible edges are present. Suppose that the edges of $K_n$ are colored with two colors,
say Red and Blue (meaning, each edge is either assigned the color Red or the color
Blue, but not both). Prove that it is possible to partition {1, . . . , n} as A ∪ B, such
that there is a Red path that covers all the vertices in A and a Blue path that
covers all the vertices in B. (By this is meant: the elements of A can be ordered
as $(a_1, a_2, \ldots, a_\ell)$ in such a way that the edges $a_1a_2, a_2a_3, \ldots, a_{\ell-1}a_\ell$ are all
colored Red, and similarly for B.)
6. During a particularly boring Zoom lecture, each of five participants fell asleep exactly
twice. For each pair of these five, there was some moment when both were sleeping
simultaneously. Prove that at some point, at least three of them were sleeping
simultaneously.
7. Is there a way to list the $2^n$ subsets of {1, . . . , n} (with each subset appearing on the
list once and only once) in such a way that the first element of the list is the empty
set, and every element on the list is obtained from the previous element either by
adding an element or deleting an element?
5.2 Solutions to graph theory problems
1. In a town there are three newly built houses, and each needs to be connected by a
line to the gas, water, and electricity factories. The lines are only allowed to run
along the ground. Is there a way to make all nine connections without any of the
lines crossing each other?
The question is asking whether the graph on six vertices, 1, 2, . . . , 6, with an edge
from i to j if and only if 1 ≤ i ≤ 3 and 4 ≤ j ≤ 6, can be drawn in the plane without
crossing edges. It has 6 vertices and 9 edges, so if it could, any representation would
have 5 faces (Euler’s formula). Each face is bounded by at least 4 edges (note that
the graph we are working with clearly has no triangles), so summing “#(bounding
edges)” over all faces, we get at least 20. But this sum counts each edge at most twice,
so we get at most 18, a contradiction that reveals that there is no such planar
representation.
2. n teams play each other in a round-robin tournament (so each team plays each of
the other teams exactly once). There are no ties. Show that there exists an ordering
of the teams, (a1 , a2 , a3 , . . . , an ), such that team a1 beats team a2 , team a2 beats
team a3 , . . ., team an−1 beats team an .
Solution: This was on the 1958 Putnam competition, but is also a standard result
in graph theory.
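One standard way to see this result is by insertion: add the teams one at a time, placing each new team immediately before the first team in the current ordering that it beats (or at the end if it beats nobody). The sketch below illustrates that idea; it is not necessarily the argument intended in these notes, and the function names and the random tournament used for the demonstration are assumptions.

```python
import random

def hamiltonian_ordering(n, beats):
    """Order teams 0..n-1 so that each team beats the next one in the list."""
    order = []
    for team in range(n):
        pos = len(order)                   # default: team loses to everyone so far
        for i, other in enumerate(order):
            if beats(team, other):         # first team in the list that `team` beats
                pos = i
                break
        order.insert(pos, team)            # the team before pos (if any) beats `team`
    return order

def random_tournament(n, seed=0):
    """Hypothetical helper: a random round-robin result as a `beats` predicate."""
    rng = random.Random(seed)
    wins = set()
    for i in range(n):
        for j in range(i + 1, n):
            wins.add((i, j) if rng.random() < 0.5 else (j, i))
    return lambda a, b: (a, b) in wins

beats = random_tournament(8, seed=1)
order = hamiltonian_ordering(8, beats)
print(order, all(beats(order[i], order[i + 1]) for i in range(7)))
```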
Solution: This is Mantel’s theorem, one of the first results proved in the vast area
of extremal graph theory.
Suppose there were no such three cities a, b, c. For each city x, denote by d(x) the
number of cities with direct connection to x. If there is a connection between cities
x and y, then there cannot be a third city z directly connected to both (or we would
have a triangle), so
Now in the sum of d(x) + d(y) over all pairs of connected cities, for each x we have
that d(x) appears exactly d(x) times (once for each city directly connected to x), so
X X
(d(x) + d(y)) = d2 (x) ≤ (n2 + 1)2n.
x
4. Show that in any connected graph, any two paths both of longest possible length
must have a vertex in common.33
Solution: Suppose, for a contradiction, that the graph has two paths of maximum
possible length, say P = x0 x1 · · · xn and Q = y0 y1 · · · yn , with no vertex in common
between the xi ’s and the yi ’s.
33
A path in a graph is a list of distinct vertices v0 , v1 , . . . , vn , with an edge from vi−1 to vi for i = 1, . . . , n
(and maybe other edges, too). The length of the path just described is n (equals the number of edges
traversed in going from v0 to vn along the path).
Since the graph is connected, there is a path in the graph that connects P and Q.
By considering a minimal such path, we get that there is a vertex xa of P , a vertex
yb of Q, and a path R in the graph from xa to yb that uses no vertices of P, Q other
than xa , yb . Without loss of generality (e.g., by relabeling vertices if necessary) we
can assume that the path from x0 to xa is at least half the length of P , and that the
path from yb to yn is at least half the length of Q. But then the path that starts at
x0 , follows P to xa , then follows R to yb , then follows Q to yn , is longer than P, Q, a
contradiction.
5. The complete graph $K_n$ on vertex set {1, . . . , n} is the graph in which all $\binom{n}{2}$
possible edges are present. Suppose that the edges of $K_n$ are colored with two colors,
say Red and Blue (meaning, each edge is either assigned the color Red or the color
Blue, but not both). Prove that it is possible to partition {1, . . . , n} as A ∪ B, such
that there is a Red path that covers all the vertices in A and a Blue path that
covers all the vertices in B. (By this is meant: the elements of A can be ordered
as $(a_1, a_2, \ldots, a_\ell)$ in such a way that the edges $a_1a_2, a_2a_3, \ldots, a_{\ell-1}a_\ell$ are all
colored Red, and similarly for B.)
Solution: I learned of this problem in a recent paper of András Gyárfás, at
http://arxiv.org/pdf/1509.05539.pdf.
Let A and B be disjoint subsets of {1, . . . , n}, with a Red path covering A and a
Blue path covering B, and with A ∪ B as large as possible subject to this condition.
If A ∪ B = {1, . . . , n} then we are done. If not, then we may assume that both A
and B are not empty, since if one of them, B say, was empty, then we could replace
B by {x} where x is any vertex not in A, and the result would be a valid pair (A, B)
with the size of the union one larger, a contradiction of maximality (note that a Blue
path covers a single vertex).
Let A be covered by the Red path given by the ordering $(a_1, a_2, \ldots, a_\ell)$, and let B be
covered by the Blue path given by the ordering $(b_1, b_2, \ldots, b_k)$. Let x be any vertex
not in A ∪ B. If either the edge $a_\ell x$ is Red or the edge $b_k x$ is Blue, then we can
either add x to A or add x to B and get a valid pair that covers more vertices, a
contradiction. So we may assume that $a_\ell x$ is Blue and $b_k x$ is Red.
Now look at edge $a_\ell b_k$. If this is Red, then we can replace A by $\{a_1, \ldots, a_\ell, b_k, x\}$
and replace B by $\{b_1, \ldots, b_{k-1}\}$ and get a valid pair that covers more vertices, a
contradiction. If $a_\ell b_k$ is Blue, then we can replace A by $\{a_1, \ldots, a_{\ell-1}\}$ and replace
B by $\{b_1, \ldots, b_k, a_\ell, x\}$ and again get a valid pair that covers more vertices, a
contradiction.
We conclude that A ∪ B = {1, . . . , n}.
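The case analysis in this proof can be read as a procedure: process the vertices one at a time, and whenever a new vertex x cannot be appended to either path, move the last vertex of one path onto the other. The following sketch follows that reading. The function names, the `is_red` predicate and the random coloring are assumptions made for the demonstration.

```python
import random

def red_blue_partition(n, is_red):
    """Split 1..n into a Red path and a Blue path, following the proof's cases."""
    red_path, blue_path = [], []
    for x in range(1, n + 1):
        if not red_path:
            red_path.append(x)                  # a single vertex is a Red path
        elif not blue_path:
            blue_path.append(x)                 # a single vertex is a Blue path
        elif is_red(red_path[-1], x):
            red_path.append(x)
        elif not is_red(blue_path[-1], x):
            blue_path.append(x)
        else:
            # here: (last red vertex, x) is Blue and (last blue vertex, x) is Red
            a, b = red_path[-1], blue_path[-1]
            if is_red(a, b):
                blue_path.pop()
                red_path += [b, x]              # ..., a, b, x is a Red path
            else:
                red_path.pop()
                blue_path += [a, x]             # ..., b, a, x is a Blue path
    return red_path, blue_path

def random_coloring(n, seed=0):
    """Hypothetical helper: a random Red/Blue coloring of the edges of K_n."""
    rng = random.Random(seed)
    red = {(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)
           if rng.random() < 0.5}
    return lambda u, v: (min(u, v), max(u, v)) in red

is_red = random_coloring(12, seed=3)
A, B = red_blue_partition(12, is_red)
print(A, B)
```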
6. During a particularly boring Zoom lecture, each of five participants fell asleep exactly
twice. For each pair of these five, there was some moment when both were sleeping
simultaneously. Prove that at some point, at least three of them were sleeping
simultaneously.
Let the participants be named A, B, C, D, E, and let A1 , A2 , B1 , B2 , et cetera, denote
the time intervals during which each participant was asleep.
Consider a graph with vertex set A1 , A2 , B1 , B2 , et cetera, with an edge between two
vertices if the two associated time intervals have a point in common. This graph has
at least 10 edges, because there are $\binom{5}{2} = 10$ pairs of participants, and it is given
that for each pair there was some moment when both were sleeping simultaneously.
A graph with 10 vertices and at least 10 edges must have a cycle. Suppose that
there is a cycle with vertices v1 , v2 , . . . , vk (encountered in that order along the cycle).
Consider the intervals of time I1 , I2 , . . . , Ik associated with these k vertices. We
claim that some three of these intervals must have a common point — giving rise to
a time at which at least three of the participants were sleeping simultaneously.
We prove the claim by contradiction, assuming that no three of the intervals have a
common point. Without loss of generality, assume that I1 is the interval whose start
time is earliest. I2 starts at least as late as I1 , and must end later; otherwise where
I3 starts is a triple intersection point. So I3 is completely disjoint from I1 , and on a
number line lies completely to the right of I1 .
Now I4 must start after I2 ends (else there will be triple intersection point between
I2 , I3 , I4 ), so I4 must lie completely to the right of I2 , and hence completely to the
right of I1 .
Continuing the argument in this manner, we find that Ik lies completely to the right
of I1 . This contradicts that v1 , v2 , . . . , vk forms a cycle, i.e. that Ik and I1 intersect.
So we conclude that there must be a triple intersection point among the intervals.
7. Is there a way to list the $2^n$ subsets of {1, . . . , n} (with each subset appearing on the
list once and only once) in such a way that the first element of the list is the empty
set, and every element on the list is obtained from the previous element either by
adding an element or deleting an element?
the sequence C′, {n}, {1, n}, . . . , {2, n}, {n}, obtained by unioning every term of
C with {n}. C′ has the property that it is a cyclic list of the elements of the
n-dimensional hypercube that are not listed in C, and also has the property that
adjacent elements have symmetric difference of size exactly 1. A Hamiltonian cycle
of the n-dimensional hypercube is now obtained by starting with all the elements
of C except the final ∅, then going to the second-from-last element of C′, and then
listing the remaining elements of C′ (except for the final ∅) in reverse order.
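The inductive construction above (list the subsets that avoid n, then the same subsets with n added, in reverse order) is the reflected Gray code. Here is a minimal sketch of it. The function name `gray_subsets` is an assumption, and the code builds a list rather than the closed cycle used in the proof.

```python
def gray_subsets(n):
    """List all subsets of {1..n}, starting at the empty set, changing one element per step."""
    if n == 0:
        return [frozenset()]
    prev = gray_subsets(n - 1)
    # first the subsets without n, then the same subsets with n added, reversed
    return prev + [s | {n} for s in reversed(prev)]

for s in gray_subsets(3):
    print(sorted(s))
```

The last subset on the list is {n}, which is one step from the empty set, so the list also closes up into a cycle.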
Solution: This was from the 1990 Putnam competition. It is (an instance of)
a very basic result in graph theory, probably the first ever result proved, namely
the necessary and sufficient conditions for the existence of an Eulerian circuit in a
directed graph.
6 Week 5 (September 8) — Calculus
Calculus is a rich source for competition problems. The Putnam problem setters try to
assume minimal mathematical background, so the topics from Calculus that come up
will tend to focus on material from Calc 1 & 2. Here are the things you should for sure be
familiar with:
• The definitions of limits, continuity and derivative. Some questions will ask to
compute interesting limits, or make certain continuity assumptions, or give some
information about the values of derivatives of a function, and of course it will be
helpful to be comfortable with these concepts!
• The meaning of first and second derivatives, in terms of local maxima and minima
of functions.
• The idea of approximating an integral via a Riemann sum, and recognizing a sum as
a Riemann sum — sometimes a complicated sum becomes very easy to understand
if you realize that it is a Riemann sum for some integral.
More precisely, there is some number c between a and x for which
• All the basic integrals, and all the basic integration techniques — integration by
parts, u-substitutions, trigonometric substitutions, et cetera.
Paraphrasing my colleague Andrei Jorza: “you will rarely need any new calculus
technique that you haven’t seen before; the difficulty is to patch together all the things you
know to obtain a solution. While cleverness will take you a long way in problem solving
calculus, this is no place for being squeamish about algebraic manipulations.”
The book Putnam and Beyond (available online) has a huge number of Putnam-style
calculus problems. You’ll also find a fair number at https://www3.nd.edu/~ajorza/
courses/2018f-m43900/handouts/lecture3.pdf (Andrei Jorza’s 43900 page from Fall
2018). Many of this week’s problems come from that list.
6.1 Problems to think about for week 6
1. Find, with explanation, the maximum value of $f(x) = x^3 - 3x$ on the set of all real
numbers satisfying $x^4 + 36 \le 13x^2$.
2. Suppose f : R → R is a continuous function satisfying |f (x) − f (y)| ≥ |x − y| for all
x, y. Show that f is both injective and surjective.
3. Let f : [0, 1] → R be a continuous function. Show that for every x ∈ [0, 1], the series
$$\sum_{n=1}^{\infty} \frac{f(x^n)}{2^n}$$
converges.
4. Let
$$f(n) = \sum_{k=1}^{n} \frac{1}{\sqrt{k} + \sqrt{k+1}}.$$
Evaluate f(9999).
5. For which real numbers c is it true that
$$\frac{1}{2}\left(e^{x} + e^{-x}\right) \le e^{cx^2}$$
for all real numbers x?
6. Given 0 < α < β, find
$$\lim_{\lambda \to 0} \left( \int_0^1 (\beta x + \alpha(1 - x))^{\lambda}\, dx \right)^{1/\lambda}.$$
7. Compute
$$\int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\, dx.$$
8. Let f : R → R be a continuous function. Define $g(x) = f(x) \int_0^x f(t)\,dt$. Show that
if g is non-increasing then f must be the identically 0 function.
9. Compute
$$\int_0^{\pi/2} \frac{dx}{1 + (\tan x)^{\sqrt{2}}}.$$
10. Let A and B be points on the same branch of the hyperbola xy = 1. Let P be a
point on the chord AB such that the triangle AP B has largest area. Show that the
area bounded by the hyperbola and the chord AP is the same as the area bounded
by the hyperbola and the chord BP .
11. Compute
$$\lim_{n \to \infty} \left( \frac{1}{\sqrt{4n^2 - 1^2}} + \frac{1}{\sqrt{4n^2 - 2^2}} + \cdots + \frac{1}{\sqrt{4n^2 - n^2}} \right).$$
6.2 Solutions to calculus problems
1. Find, with explanation, the maximum value of $f(x) = x^3 - 3x$ on the set of all real
numbers satisfying $x^4 + 36 \le 13x^2$.
Solution: This was from the 1986 Putnam competition, question A1.
Injectivity: Suppose that x ≠ y. Then it must be the case that f(x) ≠ f(y); for if
not, then 0 = |f(x) − f(y)| ≥ |x − y| > 0, a contradiction.
Surjectivity: For any y > 0, we have |f (0) − f (y)| ≥ y, so
3. Let f : [0, 1] → R be a continuous function. Show that for every x ∈ [0, 1], the series
$$\sum_{n=1}^{\infty} \frac{f(x^n)}{2^n}$$
converges.
(a sum of positive terms) converges: the partial sums form an increasing sequence,
bounded above by
$$\sum_{n=1}^{\infty} \frac{M}{2^n} = M.$$
So, for all x, $\sum_{n=1}^{\infty} \frac{f(x^n)}{2^n}$ is absolutely convergent, and so convergent.
4. Let
$$f(n) = \sum_{k=1}^{n} \frac{1}{\sqrt{k} + \sqrt{k+1}}.$$
Evaluate f(9999).
5. For which real numbers c is it true that
$$\frac{1}{2}\left(e^{x} + e^{-x}\right) \le e^{cx^2}$$
for all real numbers x?
Solution: This was Problem B1 from the 1980 Putnam Competition. See for example
https://faculty.math.illinois.edu/~hildebr/putnam/problems/mock13sol.pdf
for a solution.
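So we need to compute
$$\lim_{\lambda \to 0} \left( \frac{1}{1+\lambda} \cdot \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right)^{1/\lambda}.$$
Now
$$\lim_{\lambda \to 0^+} \frac{1}{(1+\lambda)^{1/\lambda}} = \lim_{k \to \infty} \frac{1}{\left(1 + \frac{1}{k}\right)^{k}} = \frac{1}{e},$$
and
$$\lim_{\lambda \to 0^-} \frac{1}{(1+\lambda)^{1/\lambda}} = \lim_{k \to -\infty} \frac{1}{\left(1 + \frac{1}{k}\right)^{k}} = \lim_{\ell \to +\infty} \left(1 - \frac{1}{\ell}\right)^{\ell} = \frac{1}{e},$$
so
$$\lim_{\lambda \to 0} \frac{1}{(1+\lambda)^{1/\lambda}} = \frac{1}{e}.$$
For the other part of the limit, write
$$\left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right)^{1/\lambda} = \exp\left( \frac{1}{\lambda} \log \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right).$$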
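The exponent is an indeterminate of the form 0/0 at λ = 0, so we evaluate the limit
of the exponent as λ → 0 by an application of L’Hôpital’s rule (differentiating with
respect to λ); it is
$$\lim_{\lambda \to 0} \frac{\beta^{\lambda+1} \log \beta - \alpha^{\lambda+1} \log \alpha}{\beta^{\lambda+1} - \alpha^{\lambda+1}} = \frac{\beta \log \beta - \alpha \log \alpha}{\beta - \alpha}.$$
So by continuity,
$$\lim_{\lambda \to 0} \left( \frac{\beta^{\lambda+1} - \alpha^{\lambda+1}}{\beta - \alpha} \right)^{1/\lambda} = \exp\left( \frac{\beta \log \beta - \alpha \log \alpha}{\beta - \alpha} \right) = \left( \frac{\beta^{\beta}}{\alpha^{\alpha}} \right)^{1/(\beta - \alpha)}.$$
It follows that
$$\lim_{\lambda \to 0} \left( \int_0^1 (\beta x + \alpha(1 - x))^{\lambda}\, dx \right)^{1/\lambda} = \frac{1}{e} \left( \frac{\beta^{\beta}}{\alpha^{\alpha}} \right)^{1/(\beta - \alpha)}.$$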
7. Compute
$$\int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\, dx.$$
Solution: This was from Putnam and Beyond by Gelca and Andreescu.
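We have
$$\begin{aligned}
\int \frac{x + \sin x - \cos x - 1}{x + e^x + \sin x}\, dx
&= \int \frac{x + e^x + \sin x - \cos x - e^x - 1}{x + e^x + \sin x}\, dx \\
&= \int \left( 1 - \frac{1 + e^x + \cos x}{x + e^x + \sin x} \right) dx \\
&= x - \int \frac{1 + e^x + \cos x}{x + e^x + \sin x}\, dx \\
&= x - \int \frac{du}{u} \quad \text{where } u = x + e^x + \sin x \\
&= x - \log |u| \\
&= x - \log |x + e^x + \sin x|.
\end{aligned}$$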
8. Let f : R → R be a continuous function. Define $g(x) = f(x) \int_0^x f(t)\,dt$. Show that
if g is non-increasing then f must be the identically 0 function.
Solution: This was from the book Putnam and Beyond by Gelca and Andreescu.
Define h : R → R by
$$h(x) = \frac{1}{2} \left( \int_0^x f(t)\,dt \right)^2.$$
By the chain rule and the fundamental theorem of calculus, $h'(x) = g(x)$.
Now g(x) is non-increasing and g(0) = 0, so g(x) is non-negative on (−∞, 0) and
non-positive on (0, ∞). But $g = h'$, so this implies that h is non-decreasing on
(−∞, 0), and non-increasing on (0, ∞). And h(0) = 0, while h(x) ≥ 0 for all x, so it
must be the case that h(x) = 0 for all x. This tells us that
$$\int_0^x f(t)\,dt = 0$$
for all real x; and differentiating with respect to x tells us that f(x) = 0 for all x.
9. Compute
$$\int_0^{\pi/2} \frac{dx}{1 + (\tan x)^{\sqrt{2}}}.$$
Note that the $\sqrt{2}$ was a complete red herring(!), just introduced to make sure that
the integrand does not have an elementary antiderivative.
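The red-herring remark can be illustrated numerically: substituting x → π/2 − x shows that the integrand and its reflection add up to 1, so the integral is π/4 whatever the exponent. The sketch below checks this with a midpoint rule. The function names, the step count and the alternative exponent 3 are assumptions chosen for the demonstration.

```python
import math

def midpoint_integral(f, a, b, steps):
    """Composite midpoint rule; it never evaluates f at the endpoints a and b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def integrand(x, exponent=math.sqrt(2)):
    return 1.0 / (1.0 + math.tan(x) ** exponent)

approx = midpoint_integral(integrand, 0.0, math.pi / 2, 2000)
other = midpoint_integral(lambda x: integrand(x, 3.0), 0.0, math.pi / 2, 2000)
print(approx, other, math.pi / 4)
```

Because the midpoints are symmetric about π/4, the rule pairs up x with π/2 − x, and the agreement with π/4 holds up to rounding error.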
10. Let A and B be points on the same branch of the hyperbola xy = 1. Let P be a
point on the chord AB such that the triangle AP B has largest area. Show that the
area bounded by the hyperbola and the chord AP is the same as the area bounded
by the hyperbola and the chord BP .
Solution by Kiran Kedlaya: Without loss of generality, assume that A and B lie
in the first quadrant with A = (t1 , 1/t1 ), B = (t2 , 1/t2 ), and t1 < t2 . If P = (t, 1/t)
with t1 ≤ t ≤ t2 , then the area of triangle AP B is
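$$\frac{1}{2} \begin{vmatrix} 1 & 1 & 1 \\ t_1 & t & t_2 \\ 1/t_1 & 1/t & 1/t_2 \end{vmatrix} = \frac{t_2 - t_1}{2 t_1 t_2} \left( t_1 + t_2 - t - \frac{t_1 t_2}{t} \right).$$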
Second solution: For any λ > 0, the map $(x, y) \mapsto (\lambda x, \lambda^{-1} y)$ preserves both
areas and the hyperbola xy = 1. We may thus rescale the picture so that A, B are
symmetric across the line y = x, with A above the line. As P moves from A to B,
the area of AP B increases until P passes through the point (1, 1), then decreases.
Consequently, P = (1, 1) achieves the maximum area, and the desired equality is
obvious by symmetry. Alternatively, since the hyperbola is convex, the maximum
is uniquely achieved at the point where the tangent line is parallel to AB, and by
symmetry that point is P .
11. Compute
$$\lim_{n \to \infty} \left( \frac{1}{\sqrt{4n^2 - 1^2}} + \frac{1}{\sqrt{4n^2 - 2^2}} + \cdots + \frac{1}{\sqrt{4n^2 - n^2}} \right).$$
Solution: This was from Putnam and Beyond by Gelca and Andreescu.
7 Week 6 (September 15) — Recurrences
We met recurrences in the induction hand-out.
Sometimes we are either given a sequence of numbers via a recurrence relation, or we
can argue that there is such a relation that governs the growth of a sequence. A sequence
(bn )n≥a is defined via a recurrence relation if some initial values, ba , ba+1 , . . . , bk say, are
given, and then a rule is given that allows, for each n > k, bn to be computed as long as
we know the values ba , ba+1 , . . . , bn−1 .
Sequences defined by a recurrence relation, and proofs by induction, go hand-in-glove.
Here’s an illustrative example.
Example: Let $a_n$ be the number of different ways of covering a 1 by n strip with 1 by 1
and 1 by 3 tiles. Prove that $a_n < (1.5)^n$.
Solution: We start by figuring out how to calculate an via a recurrence. Some initial
values of an are easy to compute: for example, a1 = 1, a2 = 1 and a3 = 2. For n ≥ 4, we
can tile the 1 by n strip EITHER by first tiling the initial 1 by 1 strip with a 1 by 1 tile,
and then finishing by tiling the remaining 1 by n − 1 strip in any of the an−1 admissible
ways; OR by first tiling the initial 1 by 3 strip with a 1 by 3 tile, and then finishing by
tiling the remaining 1 by n − 3 strip in any of the an−3 admissible ways. It follows that
for n ≥ 4 we have an = an−1 + an−3 . So an (for n ≥ 1) is determined by the recurrence
$$a_n = \begin{cases} 1 & \text{if } n = 1, \\ 1 & \text{if } n = 2, \\ 2 & \text{if } n = 3, \text{ and} \\ a_{n-1} + a_{n-3} & \text{if } n \ge 4. \end{cases}$$
Notice that this gives us enough information to calculate an for all n ≥ 1: for example,
a4 = a3 + a1 = 3, a5 = a4 + a2 = 4, and a6 = a5 + a3 = 6.
Now we prove, by strong induction, that $a_n < (1.5)^n$. That $a_1 = 1 < (1.5)^1$, $a_2 = 1 < (1.5)^2$
and $a_3 = 2 < (1.5)^3$ is obvious. For n ≥ 4, we have
$$\begin{aligned}
a_n &= a_{n-1} + a_{n-3} \\
&< (1.5)^{n-1} + (1.5)^{n-3} \\
&= (1.5)^n \left( \frac{2}{3} + \left( \frac{2}{3} \right)^3 \right) \\
&= (1.5)^n \cdot \frac{26}{27} \\
&< (1.5)^n,
\end{aligned}$$
(the second line using the inductive hypothesis) and we are done by induction.
Notice that we really needed strong induction here, and we really needed all three
of the base cases n = 1, 2, 3 (think about what would happen if we tried to use regular
induction, or what would happen if we only verified n = 1 as a base case).
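The recurrence is easy to run by machine, which gives a quick check of both the small values quoted above and the bound. A minimal sketch follows. The function name `tilings` and the cutoff n = 60 are assumptions, and the comparison is done in exact integer arithmetic (a_n · 2^n < 3^n) to avoid floating-point issues.

```python
def tilings(n_max):
    """a[n] = number of tilings of a 1-by-n strip with 1-by-1 and 1-by-3 tiles."""
    a = {1: 1, 2: 1, 3: 2}                 # the three base cases from the text
    for n in range(4, n_max + 1):
        a[n] = a[n - 1] + a[n - 3]
    return a

a = tilings(60)
print([a[n] for n in range(1, 11)])
# a_n < (1.5)^n is the same as a_n * 2^n < 3^n
print(all(a[n] * 2 ** n < 3 ** n for n in range(1, 61)))
```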
Solving via generating functions
Given a sequence $(a_n)_{n \ge 0}$, we can form its generating function, the function
$$F(x) = \sum_{n=0}^{\infty} a_n x^n.$$
Often we can use a recurrence relation to produce a functional equation that F (x) satisfies,
then solve that equation to find a compact (non-infinite-summation) expression for F (x),
then finally use knowledge of calculus power-series to extract an exact expression for an .
There are so many different varieties of this method, that I won’t describe it in general,
just give an example. The Perrin sequence is defined by $p_0 = 3$, $p_1 = 0$, $p_2 = 2$, and
$p_n = p_{n-2} + p_{n-3}$ for n ≥ 3. Let P(x) be its generating function:
$$P(x) = p_0 + p_1 x + p_2 x^2 + p_3 x^3 + p_4 x^4 + \ldots.$$
Plugging in the given values for p0 , p1 and p2 , and the recurrence’s right-hand side for all
others, we get
We can now solve for P (x) as a rational function in x, and expand using partial fractions:
$$P(x) = \frac{3 - x^2}{1 - x^2 - x^3} = \frac{A}{1 - \alpha_1 x} + \frac{B}{1 - \alpha_2 x} + \frac{C}{1 - \alpha_3 x}$$
where A, B and C are some constants and $(1 - \alpha_1 x)(1 - \alpha_2 x)(1 - \alpha_3 x) = 1 - x^2 - x^3$, or
equivalently $(x - \alpha_1)(x - \alpha_2)(x - \alpha_3) = x^3 - x - 1$. In other words, $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the
solutions to $x^3 - x - 1 = 0$ (it happens that one of them, say $\alpha_1$, is real, and is roughly
1.32 [it’s called the plastic number] and the other two are a complex conjugate pair with
absolute value smaller than $\alpha_1$).
Using
$$\frac{1}{1 - kx} = 1 + kx + k^2 x^2 + \ldots,$$
we now get that
$$P(x) = (A + B + C) + (A\alpha_1 + B\alpha_2 + C\alpha_3)x + (A\alpha_1^2 + B\alpha_2^2 + C\alpha_3^2)x^2 + (A\alpha_1^3 + B\alpha_2^3 + C\alpha_3^3)x^3 + \ldots,$$
and so we can read off a formula for $p_n$ (by uniqueness of power-series representations):
$$p_n = A\alpha_1^n + B\alpha_2^n + C\alpha_3^n.$$
But what are A, B and C? One way to figure them out is to use the initial conditions, to
get a set of simultaneous equations:
$$\begin{aligned}
A + B + C &= 3 \\
A\alpha_1 + B\alpha_2 + C\alpha_3 &= 0 \\
A\alpha_1^2 + B\alpha_2^2 + C\alpha_3^2 &= 2.
\end{aligned}$$
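It turns out that the unique solution to this system is A = B = C = 1. This solution
satisfies the first equation above, evidently; it satisfies the second since
$$(x - \alpha_1)(x - \alpha_2)(x - \alpha_3) = x^3 - x - 1 = x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3 \qquad (1)$$
implies $\alpha_1 + \alpha_2 + \alpha_3 = 0$; and it satisfies the last since
$$(\alpha_1 + \alpha_2 + \alpha_3)^2 = \alpha_1^2 + \alpha_2^2 + \alpha_3^2 + 2(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3),$$
and from (1) this reduces to $0 = \alpha_1^2 + \alpha_2^2 + \alpha_3^2 - 2$ so $\alpha_1^2 + \alpha_2^2 + \alpha_3^2 = 2$.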
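So we have an exact formula for $p_n$:
$$p_n = \alpha_1^n + \alpha_2^n + \alpha_3^n.$$
Since $\alpha_2$ and $\alpha_3$ have absolute value smaller than $\alpha_1$, for large n this gives
$$p_n \approx A\alpha_1^n \approx A(1.32)^n$$
(with A = 1). In other words, with very little work, we have isolated the rough growth rate of $p_n$.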
If this business of generating functions interests you, you can find out much more in
Herb Wilf’s beautiful book generatingfunctionology (just google it; it’s freely available
online).
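The growth rate can be watched directly. The sketch below generates the Perrin numbers, taking the recurrence to be $p_n = p_{n-2} + p_{n-3}$ (this is the recurrence consistent with the denominator $1 - x^2 - x^3$ of P(x)), approximates the real root of $x^3 - x - 1$ by bisection, and compares. The function names and the bisection bracket [1, 2] are assumptions made for the demonstration.

```python
def perrin(n_max):
    """Perrin numbers p_0..p_n_max, assuming p_n = p_{n-2} + p_{n-3}."""
    p = [3, 0, 2]
    for n in range(3, n_max + 1):
        p.append(p[n - 2] + p[n - 3])
    return p

def plastic_number(iterations=60):
    """Real root of x^3 - x - 1 by bisection; the cubic changes sign on [1, 2]."""
    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** 3 - mid - 1 > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

p = perrin(40)
rho = plastic_number()
print(rho)
for n in (10, 20, 30, 40):
    print(n, p[n], rho ** n)       # p_n stays within 1/2 of rho^n from n = 10 on
```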
(for some constants ci ) together with initial values a0 , a1 , a2 , . . . , ak−1 . Form the polynomial
(called the characteristic polynomial of the recurrence). If $C(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k)$
factors into distinct linear terms, then there are constants $A_1, A_2, \ldots, A_k$ such that for all
n ≥ 0,
$$a_n = A_1 \alpha_1^n + \ldots + A_k \alpha_k^n$$
with the $A_i$’s explicitly findable by solving the k by k system of linear equations
$$\begin{aligned}
A_1 + \ldots + A_k &= a_0 \\
A_1 \alpha_1 + \ldots + A_k \alpha_k &= a_1 \\
A_1 \alpha_1^2 + \ldots + A_k \alpha_k^2 &= a_2 \\
&\;\;\vdots \\
A_1 \alpha_1^{k-1} + \ldots + A_k \alpha_k^{k-1} &= a_{k-1}.
\end{aligned}$$
Even without solving this system, if C(x) has a unique root (say $\alpha_1$) of greatest absolute
value, and $A_1 \ne 0$, then we know the asymptotic growth rate
$$a_n \sim A_1 \alpha_1^n$$
as n → ∞.
Similar statements can be made when C(x) has repeated roots, with the form of the
final answer changing depending on what is the right expression to use in the partial
fractions expansion step of the generating function method. I won’t make a general
statement, because it would be way too cumbersome (but ask me if you want to see more!);
instead here’s an example:
Suppose $a_n = 4a_{n-1} - 4a_{n-2}$ for all n ≥ 2, with $a_0 = 0$ and $a_1 = 1$. The generating
function method gives that the generating function A(x) satisfies
$$A(x) = \frac{x}{1 - 4x + 4x^2}.$$
The correct partial fractions expansion now is
$$A(x) = \frac{A}{1 - 2x} + \frac{B}{(1 - 2x)^2}.$$
so the coefficient of $x^n$ in $B/(1 - 2x)^2$ is $B(n + 1)2^n$. [This trick of figuring out new
power series from old by differentiation is quite useful!] This gives
$$a_n = A 2^n + (n + 1) B 2^n.$$
Write this as $v_n = A^n v$. If we can diagonalize A (that is, find invertible S with $SAS^{-1} = D$,
with D a diagonal matrix), then we can write $A = S^{-1}DS$, so $A^n = S^{-1}D^nS$, from which
we can easily find $v_n$ and so $p_n$ explicitly. If you know enough linear algebra, you’ll quickly
see that this approach requires finding eigenvalues, which are roots of a certain cubic, and
the computations quickly reduce to exactly the same ones as those of the previous two
methods described. I mention this method just to bring up the matrix point of view of
recurrences, which can sometimes be quite helpful.
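Even without diagonalizing, the matrix point of view gives a fast way to compute terms, by repeated squaring. The sketch below is one possible set-up for the Perrin sequence and is an assumption, not necessarily the matrix used in these notes: it takes $v_n = (p_n, p_{n+1}, p_{n+2})$ and a 3 by 3 step matrix encoding $p_{n+3} = p_{n+1} + p_n$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_pow(M, n):
    """M^n by repeated squaring (n >= 0)."""
    size = len(M)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        n >>= 1
    return result

# Assumed step matrix: (p_n, p_{n+1}, p_{n+2}) -> (p_{n+1}, p_{n+2}, p_{n+1} + p_n)
STEP = [[0, 1, 0],
        [0, 0, 1],
        [1, 1, 0]]

def perrin_via_matrix(n):
    """First entry of STEP^n applied to v_0 = (p_0, p_1, p_2) = (3, 0, 2)."""
    P = mat_pow(STEP, n)
    v0 = [3, 0, 2]
    return sum(P[0][j] * v0[j] for j in range(3))

print([perrin_via_matrix(n) for n in range(12)])
```

The eigenvalues of the step matrix are the roots of $x^3 - x - 1$, which is where the cubic mentioned above comes from.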
A non-Putnam warm-up exercise
Using the trick of repeatedly differentiating the identity
$$\frac{1}{1 - x} = 1 + x + x^2 + \ldots,$$
find a nice expression for the coefficients of the power series (about 0) of $1/(1 - x)^k$. Use
this to derive, via generating functions, the identity
$$\sum_{i=0}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6},$$
and if you are feeling masochistic, go on to find a nice closed-form for
$$\sum_{i=0}^{n} i^3.$$
6. The last question was clearly written with the years 2020 and 2021 in mind. Does
the conclusion remain true for an arbitrary year? That is, fix m ≥ 1. Define a
sequence by ak = k for k = 1, 2, . . . , m + 1 and
ak+1 = ak + ak−m
for k ≥ m + 1. For which m is it true that the sequence has m consecutive terms
each divisible by m + 1?
(Here ⌊a⌋ denotes the round-down of a to the nearest integer at or below a; so for
example ⌊3.4⌋ = 3, ⌊2.999⌋ = 2 and ⌊5⌋ = 5.)
Show that for each n ≥ 0, there is an m = m(n) such that $a_m = a_n^2 + a_{n+1}^2$.
and differentiating the right-hand side k times, we get a power series where the coefficient
of $x^{n-k}$ is $n(n - 1) \cdots (n - (k - 1))$, so the coefficient of $x^n$ is $(n + k)(n + k - 1) \cdots (n + 1)$.
Dividing through by k!, the coefficient of $x^n$ in $1/(1 - x)^{k+1}$ is
$$\frac{(n + k)(n + k - 1) \cdots (n + 1)}{k!} = \binom{n + k}{k}.$$
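This coefficient formula can be double-checked without any differentiation: multiplying a power series by 1/(1 − x) replaces its coefficients by their running sums, so the coefficients of $1/(1 - x)^{k+1}$ are obtained from the series 1 by taking partial sums k + 1 times. A minimal sketch, with an assumed function name:

```python
from math import comb

def geometric_power_coeffs(k, n_terms):
    """First n_terms coefficients of 1/(1-x)^(k+1), via repeated partial sums."""
    coeffs = [1] + [0] * (n_terms - 1)        # the constant power series 1
    for _ in range(k + 1):
        running = 0
        for i in range(n_terms):              # multiply by 1/(1-x)
            running += coeffs[i]
            coeffs[i] = running
    return coeffs

for k in range(4):
    print(k, geometric_power_coeffs(k, 8), [comb(n + k, k) for n in range(8)])
```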
Let $a_n = \sum_{i=0}^{n} i^2$; $a_n$ satisfies the recurrence $a_0 = 0$ and $a_n = a_{n-1} + n^2$ for n > 0. Letting
$A(x) = a_0 + a_1 x + a_2 x^2 + \ldots$ be the generating function of the $a_n$’s, we get
with $d_0 = 1$. Iterating this, we get an explicit expression for $d_n$, via a “telescoping”
(or canceling) product:
$$d_n = \frac{(-1)^n}{n + 1}, \quad (n \ge 0),$$
so that
$$x_n = x_0 + \sum_{i=0}^{n-1} d_i = \sum_{i=0}^{n-1} \frac{(-1)^i}{i + 1}.$$
By Leibniz’ test for alternating sums, this series converges, and by thinking about
the Taylor series for log(1 + x) about 0 (log here is base e), and evaluating the Taylor
series at x = 1, we get that the sum converges to log 2.
2. For n ≥ 0, let $a_n = \left[ (1 + \sqrt{2})^n \right]$, where [x] indicates the floor of x, that is, the largest
integer that is no larger than x (so, e.g., [2] = 2 and [2.1] = 2). Prove that $a_n$ is even
if and only if n is odd.
Solution: The first few values are p1 = 3, p2 = 7, p3 = 17, p4 = 41.
A pattern seems to be emerging: pn = 2pn−1 + pn−2 , with p1 = 3, p2 = 7. We verify
this by induction on n. It’s certainly true for n = 3. For n > 3,
as required. With this new recurrence, it is easy to apply the method of generating
functions, as described in the introduction, to get
$$p_n = \frac{(1 + \sqrt{2})^{n+1}}{2} + \frac{(1 - \sqrt{2})^{n+1}}{2}.$$
4. Let $(x_n)_{n \ge 0}$ be a sequence of nonzero real numbers such that $x_n^2 - x_{n-1} x_{n+1} = 1$ for
n = 1, 2, 3, . . .. Prove that there exists a real number a such that $x_{n+1} = a x_n - x_{n-1}$
for all n ≥ 1.
but this last is true since both sides equal 1 for n ≥ 2. It follows that there is some
a such that for all n ≥ 2,
axn−1 = xn + xn−2
or
xn = axn−1 − xn−2 ,
so for all n ≥ 1
xn+1 = axn − xn−1 ,
as was required to show.
ak+1 = ak + ak−2020
for k ≥ 2021. Show that the sequence has 2020 consecutive terms each divisible by
2021.
Solution: (Putnam competition 2006, problem A3) The given recurrence can be
used to extend the sequence to a0 (a0 is the unique integer satisfying a2021 = a2020 +a0 ,
so a0 = 1), and indeed in the same way it can be extended to all negative k. So we
can consider that we are working with a doubly-infinite sequence, indexed by Z.
For future reference, it will be useful to know the following:
• a0 = 1;
• a−1 is defined by a2020 = a2019 + a−1 , so also a−1 = 1;
• a−2 is defined by a2019 = a2018 + a−2 , so also a−2 = 1;
• and this continues down to a−2019 , which is defined by a2 = a1 + a−2019 , so also
a−2019 = 1.
Notice that we have found 2020 consecutive values of the sequence, albeit with
negative indices, that take value 0, namely a−2020 through a−4039 .
In the section on pigeon-hole principle, we showed that the Fibonacci numbers,
viewed similarly as a doubly-infinite sequence, are periodic modulo any modulus;
that is, if we take any integer n and consider the remainder of the terms of the
doubly-infinite Fibonacci sequence modulo n, we get a periodic sequence. The same
proof works to show that the given sequence (ak )k∈Z is periodic modulo 2021.
So, it is enough to show that the doubly-infinite sequence (ak )k∈Z has 2020 consecutive
terms (some perhaps with negative indices), each divisible by 2021; the periodicity
will then give us such a long sequence of terms with positive indices, also. But now
we are done, since a−2020 through a−4039 gives such a sequence.
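The same backward-extension argument applies with 2020 replaced by any m ≥ 1, and for small m the conclusion can be confirmed by brute force. The sketch below works modulo m + 1 and searches forward for m consecutive zero terms. The function name and the search bound are assumptions: the bound (m + 1)^(m+1) + 2m comes from the fact that a purely periodic sequence of residues determined by m + 1 consecutive terms has period at most (m + 1)^(m+1).

```python
def first_zero_run(m):
    """Smallest k >= 1 with a_k, ..., a_{k+m-1} all divisible by m+1, or None.

    Here a_k = k for k = 1..m+1 and a_{k+1} = a_k + a_{k-m} afterwards.
    """
    mod = m + 1
    limit = mod ** (m + 1) + 2 * m             # enough terms to cover one full period
    terms = [k % mod for k in range(1, m + 2)]  # terms[i] holds a_{i+1} mod (m+1)
    while len(terms) < limit:
        terms.append((terms[-1] + terms[-1 - m]) % mod)
    run = 0
    for index, value in enumerate(terms, start=1):
        run = run + 1 if value == 0 else 0
        if run == m:
            return index - m + 1
    return None

for m in range(1, 6):
    print(m, first_zero_run(m))
```

For m = 2, for example, the sequence starts 1, 2, 3, 4, 6, 9, and the terms 6 and 9 give the required run.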
6. The last question was clearly written with the years 2020 and 2021 in mind. Does
the conclusion remain true for an arbitrary year? That is, fix m ≥ 1. Define a
sequence by ak = k for k = 1, 2, . . . , m + 1 and
ak+1 = ak + ak−m
for k ≥ m + 1. For which m is it true that the sequence has m consecutive terms
each divisible by m + 1?
(Here ⌊a⌋ denotes the round-down of a to the nearest integer at or below a; so for
example ⌊3.4⌋ = 3, ⌊2.999⌋ = 2 and ⌊5⌋ = 5.)
Solution: (Putnam competition 1997, problem B4) Note that for each k ≥ 0, if
i > ⌊2k/3⌋, then $a_{k-i,i} = 0$ also. So we may as well consider
$$\sum_{i \ge 0} (-1)^i a_{k-i,i},$$
as it takes exactly the same value as $\sum_{i=0}^{\lfloor 2k/3 \rfloor} (-1)^i a_{k-i,i}$.
Observe that, by definition of $a_{m,n}$,
$$(1 + x + x^2)^m = \sum_{n \ge 0} a_{m,n} x^n,$$
is equal to
$$1 + y(1 + x + x^2) + y^2 (1 + x + x^2)^2 + \ldots$$
which simplifies to
$$\frac{1}{1 - y(1 + x + x^2)}.$$
Evaluating both sides at x = −y, we get
$$\frac{1}{1 - y + y^2 - y^3} = \sum_{m,n \ge 0} (-1)^n a_{m,n} y^{m+n}.$$
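The alternating sums can be computed directly for small k. Since $(1 - y + y^2 - y^3)(1 + y) = 1 - y^4$, the left-hand side equals $(1 + y)/(1 - y^4)$, whose coefficients repeat 1, 1, 0, 0 (this simplification is ours, added for the check). The sketch below builds $a_{m,n}$ by expanding $(1 + x + x^2)^m$ and compares. The function names are illustrative.

```python
def trinomial_row(m):
    """Coefficients a_{m,0}, ..., a_{m,2m} of (1 + x + x^2)^m."""
    row = [1]
    for _ in range(m):
        new = [0] * (len(row) + 2)
        for i, c in enumerate(row):            # multiply by 1 + x + x^2
            new[i] += c
            new[i + 1] += c
            new[i + 2] += c
        row = new
    return row

def alternating_sum(k):
    """Sum over i >= 0 of (-1)^i * a_{k-i,i}."""
    total = 0
    for i in range(k + 1):
        row = trinomial_row(k - i)
        if i < len(row):                       # a_{k-i,i} = 0 beyond the degree
            total += (-1) ** i * row[i]
    return total

print([alternating_sum(k) for k in range(12)])
```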
Show that for each n ≥ 0, there is an m = m(n) such that $a_m = a_n^2 + a_{n+1}^2$.
Solution: (1999 Putnam, problem A3) Using partial fractions, we can get the Taylor
series of $1/(1 - 2x - x^2)$, and discover that
$$a_n = \frac{1}{2\sqrt{2}} \left( (1 + \sqrt{2})^{n+1} - (1 - \sqrt{2})^{n+1} \right).$$
After a little (messy) algebra, we find that
so m = 2n + 2 works.
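The claim m = 2n + 2 is easy to test with exact integers. The sketch assumes that $a_n$ is the coefficient of $x^n$ in $1/(1 - 2x - x^2)$, so that $a_0 = 1$, $a_1 = 2$ and $a_n = 2a_{n-1} + a_{n-2}$. This recurrence is read off from the denominator and is not quoted from the problem statement. The function name is illustrative.

```python
def series_coefficients(n_max):
    """Coefficients of 1/(1 - 2x - x^2): a_0 = 1, a_1 = 2, a_n = 2 a_{n-1} + a_{n-2}."""
    a = [1, 2]
    for n in range(2, n_max + 1):
        a.append(2 * a[n - 1] + a[n - 2])
    return a

a = series_coefficients(42)
print(a[:8])
print(all(a[n] ** 2 + a[n + 1] ** 2 == a[2 * n + 2] for n in range(20)))
```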
8 Week 7 (September 22) — Writing solutions
This week’s handout is concerned with the art of writing (presenting) solutions. I’ve
made a few notes myself, but I also strongly recommend that you look at both of
these essays:
• (long, with lots of examples of both good and bad writing): https://artofproblemsolving.
com/news/articles/how-to-write-a-solution
As you go over this week’s problems and (more importantly) as you take a Putnam
competition, I encourage you to bear in mind the advice contained in these notes, and in
the above references.
Note: I have shamelessly appropriated much of this from Ravi Vakil’s Stanford Putnam
Preparation website (http://math.stanford.edu/~vakil/putnam05/05putnam7.pdf),
and from Ioana Dumitriu’s UWashington’s “The Art of Problem Solving” website (http:
//www.math.washington.edu/~putnam/index.html). Both of these websites are filled
with what I think is great advice.
For next Tuesday fully write up a solution to at least one of the problems that appear
later in the handout, after you have read the handout and (at the very least) the 4-page
essay by Evan Chen (http://web.evanchen.cc/handouts/english/english.pdf).
• Do examples!
• Draw pictures!
• Write lots!
• Talk it out!
• Break into cases!
• Work backwards!
• Argue by contradiction!
• Make a generalization!
• Take a break!
• Sleep on it!
• Ask questions!
• Enjoy!
1. Try a few small cases out. Try a lot of cases out. Remember that the three hours
of the Putnam competition is a long time — you have time to spare! If a question
asks what happens when you have n things, or 2015 things, try it out with 1, 2, 3, 4
things, and try to form a conjecture. This is especially valuable for questions about
sequences defined recursively.
3. Don’t be afraid of diving into some algebra. (Again, three hours is a long time . . ..)
You shouldn’t waste that much time, thanks to the 15-minute rule.
4. If a question asks to determine whether something is true or false, and the direction
you initially guess doesn’t seem to be going anywhere, then try guessing the opposite
possibility.
6. Look for symmetries. Try to connect the problem to one you’ve seen before. Ask
yourself “how would [person X] approach this problem”? (It’s quite reasonable here
for [Person X] to be [Chuck Norris]!)
7. Putnam problems always have slick solutions. That leads to a helpful meta-approach:
“The only way this problem could have a nice solution is if this particular approach
worked out, unlikely as it seems, so I’ll try it out, and see what happens.”
8. Show no fear. If you think a problem is probably too hard for you, but you have an
idea, try it anyway. (Three hours is a long time.)
• that you have actually solved the problem (are you sure you have all the cases
covered? Are you sure that all the little details work out? . . .)
and
Solving the problem is only half the battle. Now you have to launch into the other half:
convincing the grader that you have actually done so.
Life is tough, and Putnam graders may be even tougher. But here’s a list of things
which, when done properly, will yield a nice write-up which will appease any reasonable
grader.
1. All that scrap paper, filled with your musings on the problem to date? Put it aside;
you must write a clean, coherent solution on a fresh piece of paper.
2. Before you start writing, organize your thoughts. Make a list of all steps to the
solution. Figure out what intermediary results you will need to prove. For example,
if the problem involves induction, always start with the base case, and continue with
proving that “true for n implies true for n + 1”. Make sure that the steps follow
from each other logically, with no gaps.
3. After tracing a “road to proof” either in your head or (preferably) on scratch paper,
start writing up the solution on a fresh piece of paper. The best way to start this is
by writing a quick outline of what you propose to do. Sometimes, the grader will
just look at this outline, say “Yes, she knows what she is doing on this problem!”,
and give the credit.
5. Complete each step of the “road” before you continue to the next one.
6. When making statements like “it follows trivially” or “it is easy to see”, listen for
quiet, nagging doubts. If you yourself aren’t 100% convinced, how will you convince
someone else? Even if it seems to follow trivially, check again. Small exceptions may
not be obvious. The strategy “I am sure it’s true, even if I don’t see it; if I state that
it’s obvious, maybe the grader will believe I know how to prove it” has occasionally
led its user to a score of 0 out of 10.
7. Organize your solution on the page; avoid writing in corners or perpendicular to the
normal orientation. Avoid, if possible, post-factum insertion (if you discover you’ve
missed something, rather than making a mess of the paper by trying to write it over,
start anew. You have the time!)
8. Before writing each phrase, formulate it completely in your mind. Make sure it
expresses an idea. Starting to write one thing, then changing course in mid-sentence
and saying another thing is a sure way to create confusion.
10. If necessary, state intermediary results as “claims” or “lemmas” which you can prove
right after stating them. If you cannot prove one of these results, but can prove the
problem’s statement from it, state that you will assume it, then show the path from
it to the solution. You may get partial credit for it.
11. Rather than using vague statements like “and so on” or “repeating this process”,
formulate and prove by induction.
12. When you’re done writing up the solution, go back and re-read it. Put yourself in the
grader’s shoes: can someone else read your write-up and understand the solution?
Must one look for things in the corners? Are there “miraculous” moments? And so on.
2. Evaluate
\[ \frac{1}{\pi \ln 2} \int_1^2 \frac{\arctan(1 + x)}{x}\,dx. \]
Express your answer as a rational number. (Here arctan is defined in the standard
way, so that for 0 ≤ x < ∞ we have 0 ≤ arctan x < π/2.)
3. At some point early in the season, Chris Paul’s free throw percentage (number of
successful free throws/number of free throw attempts) was below 80%. At the end
of the season, it was above 80%. Must it have been exactly 80% at some moment?
(Note that going in the other direction, the answer is an easy “no”: he could start the
season with a successful free throw followed by an unsuccessful one, so his average
would go from above 80% to below 80% without ever being exactly 80%).
4. An integer is selected at random from the set {1, 2, . . . , 2020} (with each integer
having a chance 1/2020 of being selected). You are required to determine the correct
integer in an odd number of guesses. After each guess you are told whether the
actual integer is less than, equal to, or greater than your guess. You are not allowed
to guess an integer which has already been ruled out by the earlier answers. Show
that there is a strategy that allows you to determine the integer, in such a way that
there is a more than 2/3 chance of making an odd number of guesses.
5. Let S be the smallest set of positive integers such that
• $2 \in S$
• $n \in S$ whenever $n^2 \in S$
• $(n + 5)^2 \in S$ whenever $n \in S$.
Which positive integers are not in S? (Here “smallest” means that any proper subset
of S fails at least one of the properties listed above).
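6. Let $x_0, x_1, x_2, \ldots$ be the sequence with $x_0 = 1$ and $x_{n+1} = \log(e^{x_n} - x_n)$ for $n \ge 0$.
(Here “log” is log to the base e). Show that the infinite series
\[ x_0 + x_1 + x_2 + \cdots \]
converges, and find its sum.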
8. Let $N_n$ be the number of ordered n-tuples $(a_1, \ldots, a_n)$ of positive integers with the
property that
\[ \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} = 1. \]
Is $N_{10}$ odd or even?
8.2 Solutions to problems
1. Either find a function f : N → N with the property that f (f (n)) = n + 1 for all
n ∈ N, or prove that no such function exists.
Solution: We claim that no such function exists. Indeed, suppose that there was
such a function. Let f (0) = k.
We claim that for each j = 0, 1, 2, . . ., we have f (j) = k + j. We prove this by
induction on j, with the base case j = 0 immediate. For j > 0, we have the
induction hypothesis that f (j − 1) = k + j − 1. To force f (f (j − 1)) = j we require
f (k + j − 1) = j. But now f (f (k + j − 1)) = f (j), and also f (f (k + j − 1)) = k + j,
so f (j) = k + j, completing the induction.
We have now fully described the function f , by f (j) = k + j, j = 0, 1, 2, . . ., for some
k ≥ 1. But now f (f (j)) = f (k + j) = 2k + j for all j. But also f (f (j)) = j + 1 for
all j. So j + 1 = 2k + j, forcing k = 1/2, a contradiction.
(This was Question 3 on the 2018 Virginia Tech Regional Math Competition.)
2. Evaluate
\[ \frac{1}{\pi \ln 2} \int_1^2 \frac{\arctan(1 + x)}{x}\,dx. \]
Express your answer as a rational number. (Here arctan is defined in the standard
way, so that for 0 ≤ x < ∞ we have 0 ≤ arctan x < π/2.)
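Solution: Write $I$ for the integral $\int_1^2 \frac{\arctan(1 + x)}{x}\,dx$. Integrating by parts
(integrating $1/x$ and differentiating $\arctan(1 + x)$) gives
\[ I = \big[ (\ln x) \arctan(1 + x) \big]_1^2 - J = (\ln 2)(\arctan 3) - J, \qquad \text{where } J = \int_1^2 \frac{\ln x}{1 + (1 + x)^2}\,dx. \]
Make the substitution x = 2/y to get
\[ J = \int_2^1 \frac{-2(\ln 2 - \ln y)\,dy}{y^2 (2 + 4/y + 4/y^2)} = \int_1^2 \frac{(\ln 2)\,dy}{1 + (1 + y)^2} - J, \]
so
\[ 2J = \int_1^2 \frac{(\ln 2)\,dy}{1 + (1 + y)^2} = (\ln 2) \big[ \arctan(1 + y) \big]_1^2 = (\ln 2)(\arctan 3 - \arctan 2). \]
It follows that
\[ I = \frac{(\ln 2)(\arctan 3 + \arctan 2)}{2}. \]
Now
\[ \tan(\arctan 3 + \arctan 2) = \frac{3 + 2}{1 - 6} = -1 \]
(using the formula
\[ \tan(A + B) = \frac{\tan A + \tan B}{1 - (\tan A)(\tan B)} \]
), so
\[ \arctan 3 + \arctan 2 = \frac{3\pi}{4}. \]
It follows that
\[ I = \frac{3\pi \ln 2}{8}, \]
so the answer to the question is 3/8, as claimed.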
(This was Question 1 on the 2018 Virginia Tech Regional Math Competition; the
solution given here is modified from the official solution35 .)
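As a numerical sanity check on the value 3/8 (this is not part of the official solution), here is a minimal sketch that approximates the integral with a composite Simpson rule. The helper name `simpson` and the choice of 1000 subintervals are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# The integral from 1 to 2 of arctan(1 + x) / x, divided by pi * ln 2.
integral = simpson(lambda x: math.atan(1 + x) / x, 1.0, 2.0)
ratio = integral / (math.pi * math.log(2))
print(ratio)
```

The printed ratio should agree with 3/8 to many decimal places, which is a useful way to catch a sign or bookkeeping slip before writing up.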
3. At some point early in the season, Chris Paul’s free throw percentage (number of
successful free throws/number of free throw attempts) was below 80%. At the end
of the season, it was above 80%. Must it have been exactly 80% at some moment?
(Note that going in the other direction, the answer is an easy “no”: he could start the
season with a successful free throw followed by an unsuccessful one, so his average
would go from above 80% to below 80% without ever being exactly 80%).
Solution (from Kiran Kedlaya’s Putnam Competition archive36 ; this was Putnam
A1 2004): Yes. To see this, let S(N ) be Paul’s number of successful free throws
after N attempts, and suppose that there is no N with S(N ) = 0.8N . Then there
would be an N0 such that S(N0 ) < 0.8N0 and S(N0 + 1) > 0.8(N0 + 1); that is,
Paul’s free throw percentage is under 80% at some point, and after one subsequent
free throw (necessarily made), his percentage is over 80%. If he makes m of his first
N0 free throws, then m/N0 < 4/5 and (m + 1)/(N0 + 1) > 4/5. This means that
5m < 4N0 < 5m + 1, which is impossible since then 4N0 is an integer between the
consecutive integers 5m and 5m + 1.
Remark: This same argument works for any fraction of the form (n − 1)/n for some
integer n > 1, but not for any other real number between 0 and 1.
35 https://intranet.math.vt.edu/people/plinnell/Vtregional/S18/index.html
36 https://kskedlaya.org/putnam-archive/
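The key step, that a single made free throw cannot jump over 4/5, can be checked by brute force. The sketch below is only an illustration: the function name `jump_over` and the bound of 200 attempts are assumptions, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def jump_over(target, max_attempts=200):
    """Return some (m, N) with m/N < target < (m+1)/(N+1), or None if there is none."""
    for attempts in range(1, max_attempts + 1):
        for made in range(attempts + 1):
            before = Fraction(made, attempts)
            after = Fraction(made + 1, attempts + 1)  # one more successful throw
            if before < target < after:
                return (made, attempts)
    return None

no_jump_for_80 = jump_over(Fraction(4, 5))  # 4/5 has the form (n - 1)/n
jump_for_60 = jump_over(Fraction(3, 5))     # 3/5 does not
print(no_jump_for_80, jump_for_60)
```

For 3/5 the search finds that going from 1 of 2 to 2 of 3 skips over 60%, consistent with the remark that only fractions of the form (n − 1)/n behave like 80%.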
4. An integer is selected at random from the set {1, 2, . . . , 2020} (with each integer
having a chance 1/2020 of being selected). You are required to determine the correct
integer in an odd number of guesses. After each guess you are told whether the
actual integer is less than, equal to, or greater than your guess. You are not allowed
to guess an integer which has already been ruled out by the earlier answers. Show
that there is a strategy that allows you to determine the integer, in such a way that
there is a more than 2/3 chance of making an odd number of guesses.
Solution (from Kiran Kedlaya’s Putnam Competition archive; this was Putnam
B4, 2002): Use the following strategy: guess 1, 3, 4, 6, 7, 9, . . . , 3k, 3k + 1, . . . until
the target number n is revealed to be equal to or lower than one of these guesses.
If n ≡ 1 (mod 3), it will be guessed on an odd turn. If n ≡ 0 (mod 3), it will be
guessed on an even turn. If n ≡ 2 (mod 3), then n + 1 will be guessed on an even
turn, forcing a guess of n on the next turn. Thus the probability of success with this
strategy is 1347/2020 > 2/3.
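The count 1347 can be confirmed by simulating the strategy for every possible target. This is a minimal sketch; the function name `guesses_needed` is an illustrative assumption.

```python
from fractions import Fraction

def guesses_needed(target, n=2020):
    """Number of guesses the strategy 1, 3, 4, 6, 7, 9, ... uses to hit target."""
    turn = 0
    for guess in range(1, n + 1):
        if guess % 3 == 2:
            continue  # the strategy skips numbers congruent to 2 mod 3
        turn += 1
        if guess == target:
            return turn
        if guess > target:
            # All earlier guesses were too low, so target = guess - 1 is forced.
            return turn + 1
    raise ValueError("target out of range")

odd_count = sum(1 for t in range(1, 2021) if guesses_needed(t) % 2 == 1)
success_probability = Fraction(odd_count, 2020)
print(odd_count, success_probability > Fraction(2, 3))
```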
5. Let S be the smallest set of positive integers such that
• $2 \in S$
• $n \in S$ whenever $n^2 \in S$
• $(n + 5)^2 \in S$ whenever $n \in S$.
Which positive integers are not in S? (Here “smallest” means that any proper subset
of S fails at least one of the properties listed above).
Solution (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam A1,
2017): We denote the condition $2 \in S$ by (a); the condition $n \in S$ whenever $n^2 \in S$
by (b), and the condition $(n + 5)^2 \in S$ whenever $n \in S$ by (c).
We claim that the positive integers not in S are 1 and all multiples of 5.
If S consists of all other natural numbers, then S satisfies the given conditions: note
that the only perfect squares not in S are 1 and numbers of the form $(5k)^2$ for some
positive integer k, and it readily follows that both (b) and (c) hold (that (a) holds
for this set is trivial).
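The claim that the candidate set satisfies (a), (b) and (c) is easy to spot-check by machine. The sketch below tests the three conditions for n up to an arbitrary bound of 2000; the name `in_candidate` and the bound are illustrative assumptions, and this checks only closure, not minimality.

```python
def in_candidate(n):
    """Membership in the claimed set: positive integers other than 1 and multiples of 5."""
    return n >= 2 and n % 5 != 0

bound = 2000
cond_a = in_candidate(2)
# (b): whenever n^2 is in the set, n is in the set.
cond_b = all(in_candidate(n) for n in range(1, bound + 1) if in_candidate(n * n))
# (c): whenever n is in the set, (n + 5)^2 is in the set.
cond_c = all(in_candidate((n + 5) ** 2) for n in range(1, bound + 1) if in_candidate(n))
print(cond_a, cond_b, cond_c)
```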
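Now suppose that T is another set of positive integers satisfying (a), (b), and (c).
Note from (b) and (c) that if $n \in T$ then $n + 5 \in T$, and so T satisfies the following
property:
(d) $n + 5 \in T$ whenever $n \in T$.
The following must then be in T , with implications labeled by conditions (b) through
(d):
\[ 2 \overset{c}{\Rightarrow} 49 \overset{c}{\Rightarrow} 54^2 \overset{d}{\Rightarrow} 56^2 \overset{b}{\Rightarrow} 56 \overset{d}{\Rightarrow} 121 \overset{b}{\Rightarrow} 11 \]
\[ 11 \overset{d}{\Rightarrow} 16 \overset{b}{\Rightarrow} 4 \overset{d}{\Rightarrow} 9 \overset{b}{\Rightarrow} 3 \]
\[ 16 \overset{d}{\Rightarrow} 36 \overset{b}{\Rightarrow} 6 \]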
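6. Let $x_0, x_1, x_2, \ldots$ be the sequence with $x_0 = 1$ and $x_{n+1} = \log(e^{x_n} - x_n)$ for $n \ge 0$.
(Here “log” is log to the base e). Show that the infinite series
\[ x_0 + x_1 + x_2 + \cdots \]
converges, and find its sum.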
Solution (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam B1,
2016): Note that the function $e^x - x$ is strictly increasing for x > 0: its derivative
is $e^x - 1$, which is positive for x > 0 because $e^x$ is strictly increasing and takes value
1 at 0. Also, the value of $e^x - x$ at 0 is 1. By induction on n, it follows that $x_n > 0$
for all n.
By exponentiating the equation defining $x_{n+1}$, we obtain the expression
\[ x_n = e^{x_n} - e^{x_{n+1}}. \]
Now note that, applying $x_n = e^{x_n} - e^{x_{n+1}}$ repeatedly, we get a telescoping sum for
the partial sum $x_0 + \cdots + x_n$:
\[ x_0 + \cdots + x_n = (e^{x_0} - e^{x_1}) + \cdots + (e^{x_n} - e^{x_{n+1}}) = e^{x_0} - e^{x_{n+1}} = e - e^{x_{n+1}}. \]
Since every $x_n$ is positive, the partial sums are increasing and bounded above by e − 1,
so the series converges; in particular $x_n \to 0$, and so $e^{x_{n+1}} \to 1$.
By taking limits as n goes to infinity, we see that the sum $x_0 + x_1 + \cdots$ converges to
the value e − 1.
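The value e − 1 can be checked numerically by iterating the recurrence, starting from x_0 = 1 (the starting value is forced by e^{x_0} = e in the telescoping sum above). This is a rough floating-point sketch; the function name `partial_sums` and the choice of 10 terms are assumptions.

```python
import math

def partial_sums(terms=10):
    """Partial sums of x_0 + x_1 + ..., where x_0 = 1 and x_{n+1} = ln(e^{x_n} - x_n)."""
    x = 1.0
    total = 0.0
    sums = []
    for _ in range(terms):
        total += x
        sums.append(total)
        x = math.log(math.exp(x) - x)  # next term of the sequence
    return sums

sums = partial_sums()
print(sums[-1], math.e - 1)
```

The terms shrink very quickly (each is roughly half the square of the previous one once they are small), so a handful of terms already matches e − 1 to machine precision.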
7. Remember that a composite number is a positive integer n such that n = ab with
a, b both positive integers greater than 1 (and not necessarily distinct).
Show that every composite number is expressible as xy + xz + yz + 1, with x, y, and
z positive integers.
Solution (from John Scholes’ collection of Putnam solutions37 ; this is Putnam B1,
1988): If n is composite then n = ab with a, b ≥ 2, and a, b both integers. But
ab = 1(a − 1) + 1(b − 1) + (a − 1)(b − 1) + 1,
so we may take x = 1, y = a − 1 and z = b − 1, all of which are positive integers, to
get a representation of n as n = xy + xz + yz + 1, with x, y, and z positive integers.
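The construction is explicit enough to turn into a few lines of code. In this sketch (the function name `representation` is an illustrative assumption) the smallest nontrivial factor a of n is found by trial division and the triple (1, a − 1, b − 1) is returned.

```python
import math

def representation(n):
    """Return (x, y, z) with n == x*y + x*z + y*z + 1 if n is composite, else None."""
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:
            b = n // a
            # ab = 1*(a-1) + 1*(b-1) + (a-1)*(b-1) + 1
            return (1, a - 1, b - 1)
    return None  # n is 1 or prime

print(representation(12))
```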
8. Let $N_n$ be the number of ordered n-tuples $(a_1, \ldots, a_n)$ of positive integers with the
property that
\[ \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} = 1. \]
Is $N_{10}$ odd or even?
Solution: (from Kiran Kedlaya’s Putnam Competition archive; this is Putnam A5,
1997) There are an even number of solutions (in positive integers) to
\[ \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_{10}} = 1 \]
with $a_1 \ne a_2$ (these solutions can be grouped into disjoint pairs, with $(a_1, a_2, a_3, \ldots, a_{10})$
being paired with $(a_2, a_1, a_3, \ldots, a_{10})$). So we need only find the parity of the number
of solutions with $a_1 = a_2$. Similarly, we can reduce to considering those 10-tuples
with $a_3 = a_4$, $a_5 = a_6$, $a_7 = a_8$, $a_9 = a_{10}$.
We have reduced to considering solutions (in positive integers) to
\[ \frac{2}{a_1} + \frac{2}{a_3} + \frac{2}{a_5} + \frac{2}{a_7} + \frac{2}{a_9} = 1. \]
As before, we need only consider the parity of the number of solutions with $a_1 = a_3$
and $a_5 = a_7$, which reduces us to considering solutions (in positive integers) to
\[ \frac{4}{a_1} + \frac{4}{a_5} + \frac{2}{a_9} = 1. \]
We need only consider the parity of the number of solutions with $a_1 = a_5$, which
reduces us to considering solutions (in positive integers) to
\[ \frac{8}{a_1} + \frac{2}{a_9} = 1. \]
This equation is equivalent to
\[ (a_1 - 8)(a_9 - 2) = 16 \]
which, by inspection, has 5 solutions in positive integers (specifically: $(a_1, a_9) = (9, 18)$
or (10, 10) or (12, 6) or (16, 4) or (24, 3)). It follows that $N_{10}$ is odd.
37 https://prase.cz/kalva/putnam.html
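The final count in the solution to problem 8 can be confirmed by a direct search. Any solution has both factors of (a_1 − 8)(a_9 − 2) = 16 positive, so a_1 ≤ 24 and a_9 ≤ 18; the search bound of 200 below is therefore a generous, arbitrary choice, and the variable names are illustrative.

```python
# Solve 8/a1 + 2/a9 = 1 in positive integers, cleared of denominators:
# 8*a9 + 2*a1 == a1*a9.
solutions = [
    (a1, a9)
    for a1 in range(1, 200)
    for a9 in range(1, 200)
    if 8 * a9 + 2 * a1 == a1 * a9
]
print(solutions, len(solutions) % 2)
```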
9 Week 8 (September 29)
No handout this week.
10 Week 9 (October 6) — Inequalities
Many Putnam problems involve showing that a particular inequality between two expressions
holds always, or holds under certain circumstances. There is a huge variety of general
inequalities between sets of numbers satisfying certain conditions, that are quite reasonable
for you to quote as “well-known”. Many of these are also useful to know about for other,
more serious, mathematical applications. I’ve listed some of them here, mostly without
proofs, with stars next to the most important ones. If you are interested in knowing more
about inequalities, consider looking at the lovely book The Cauchy-Schwarz Master Class
by Steele (readily available online), or the classic Inequalities by Hardy, Littlewood and
Pólya (QA 303 .H223i at the library).
So to get from u to v it is at least as efficient to go directly (along one side of the triangle
uvw) as it is to go via the point w (along the other two sides of the triangle).
AM–GM–HM inequality ⋆⋆
For positive $a_1, \ldots, a_n$
\[ \frac{n}{\frac{1}{a_1} + \cdots + \frac{1}{a_n}} \le \sqrt[n]{a_1 \cdots a_n} \le \frac{a_1 + \cdots + a_n}{n} \]
with equalities in both inequalities iff all $a_i$ are equal. The three expressions above are the
harmonic mean, the geometric mean and the arithmetic mean of the $a_i$.
For n = 2, here’s a proof of the second inequality: $\sqrt{a_1 a_2} \le (a_1 + a_2)/2$ iff $4 a_1 a_2 \le
(a_1 + a_2)^2$ iff $a_1^2 - 2 a_1 a_2 + a_2^2 \ge 0$ iff $(a_1 - a_2)^2 \ge 0$, which is true by the “squares are
positive” inequality; there’s equality all along iff $a_1 = a_2$.
For n = 2 the first inequality is equivalent to $\sqrt{a_1 a_2} \le (a_1 + a_2)/2$.
Power means inequality
For a non-zero real r and positive $a_1, \ldots, a_n$ define
\[ M^r(a_1, \ldots, a_n) = \left( \frac{a_1^r + \cdots + a_n^r}{n} \right)^{1/r}, \]
and set $M^0(a_1, \ldots, a_n) = \sqrt[n]{a_1 \cdots a_n}$. For real numbers r < s,
\[ M^r(a_1, \ldots, a_n) \le M^s(a_1, \ldots, a_n). \]
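To get a feel for the power means, here is a small sketch that computes M^r for a sample list. The function name `power_mean` and the sample values are illustrative assumptions. The exponents r = −1, 0, 1 recover the harmonic, geometric and arithmetic means, so this also illustrates the AM–GM–HM chain.

```python
import math

def power_mean(values, r):
    """Power mean M^r of a list of positive numbers; r = 0 gives the geometric mean."""
    n = len(values)
    if r == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** r for v in values) / n) ** (1 / r)

sample = [1.0, 2.0, 4.0, 8.0]
# Harmonic, geometric, arithmetic and quadratic means, in that order.
means = [power_mean(sample, r) for r in (-1, 0, 1, 2)]
print(means)
```

The printed list should be increasing, as the inequality predicts for r < s.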
Cauchy-Schwarz-Bunyakovsky inequality ⋆⋆
Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ be real numbers. We have
\[ \left( \sum_{i=1}^n x_i y_i \right)^2 \le \left( \sum_{i=1}^n x_i^2 \right) \left( \sum_{i=1}^n y_i^2 \right). \]
Equality holds if one of the sequences $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$ is identically zero. If neither
is identically zero, then there is equality iff there is some real number $t_0$ such that
$x_i = t_0 y_i$ for each i.
Here’s a quick proof: If either sequence is identically 0, both sides are zero. So assume
that neither is identically 0. For any real t we have
\[ \sum_{i=1}^n (x_i - t y_i)^2 \ge 0. \]
But also,
\[ \sum_{i=1}^n (x_i - t y_i)^2 = \sum_{i=1}^n x_i^2 - 2t \sum_{i=1}^n x_i y_i + t^2 \sum_{i=1}^n y_i^2. \]
This means that viewed as a polynomial in t, the expression above must have either
complex roots or a repeated real root, i.e., that
\[ \left( 2 \sum_{i=1}^n x_i y_i \right)^2 \le 4 \left( \sum_{i=1}^n x_i^2 \right) \left( \sum_{i=1}^n y_i^2 \right), \]
which is exactly the inequality we wanted. (Notice the key point — squares are positive!).
If the inequality is an equality, then the polynomial has a repeated root, which means
there is some real $t_0$ at which the polynomial evaluates to 0. But the polynomial at this
point is equal to $\sum_{i=1}^n (x_i - t_0 y_i)^2$, and the only way this can happen is if each $x_i - t_0 y_i$ is 0,
as claimed.
This is really a very general inequality: if you are familiar with inner products from
linear algebra, the Cauchy-Schwarz-Bunyakovsky inequality really says that if x, y are
vectors in an inner product space over the reals then
\[ \langle x, y \rangle^2 \le \langle x, x \rangle \langle y, y \rangle. \]
Equivalently
\[ |\langle x, y \rangle| \le \|x\| \, \|y\|. \]
There is equality iff x and y are linearly dependent.
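A quick experiment can make both the inequality and its equality case concrete. The sketch below uses integer vectors so that the arithmetic is exact; the function name `cs_gap`, the seed and the vector sizes are illustrative assumptions.

```python
import random

def cs_gap(xs, ys):
    """(sum x_i^2)(sum y_i^2) - (sum x_i y_i)^2, which Cauchy-Schwarz says is >= 0."""
    dot = sum(x * y for x, y in zip(xs, ys))
    return sum(x * x for x in xs) * sum(y * y for y in ys) - dot * dot

random.seed(0)
gaps = [
    cs_gap(
        [random.randint(-9, 9) for _ in range(5)],
        [random.randint(-9, 9) for _ in range(5)],
    )
    for _ in range(1000)
]
# Proportional vectors (here y = 2x) give equality, so the gap is exactly 0.
proportional_gap = cs_gap([1, 2, 3], [2, 4, 6])
print(min(gaps), proportional_gap)
```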
Hölder’s inequality
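Fix p > 1 and define q by 1/p + 1/q = 1. Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ be real numbers.
We have
\[ \sum_{i=1}^n |x_i y_i| \le \left( \sum_{i=1}^n |x_i|^p \right)^{1/p} \left( \sum_{i=1}^n |y_i|^q \right)^{1/q}. \]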
The rearrangement inequality
If a1 ≤ . . . ≤ an and b1 ≤ . . . ≤ bn are sequences of reals, and aπ(1) , . . . , aπ(n) is a
permutation (rearrangement) of a1 ≤ . . . ≤ an , then
an b1 + . . . + a1 bn ≤ aπ(1) b1 + . . . + aπ(n) bn ≤ a1 b1 + . . . + an bn .
If a1 < . . . < an and b1 < . . . < bn , then there is equality in the first inequality iff π is the
reverse permutation π(i) = n + 1 − i, and there is equality in the second inequality iff π is
the identity permutation π(i) = i.
Jensen’s inequality ⋆⋆
A real function f (x) is convex on the interval [c, d] if for all c ≤ a < b ≤ d, the line segment
joining (a, f (a)) to (b, f (b)) lies entirely above the graph y = f (x) on the interval (a, b),
or equivalently, if for all 0 ≤ t ≤ 1 we have
\[ f((1 - t)a + tb) \le (1 - t) f(a) + t f(b). \]
If f (x) is convex on the interval [c, d], and $c \le a_1 \le \ldots \le a_n \le d$, then
\[ f\left( \frac{a_1 + \cdots + a_n}{n} \right) \le \frac{f(a_1) + \cdots + f(a_n)}{n} \]
(note that when n = 2, this is just the definition of convexity).
We say that f (x) is concave on [c, d] if for all c ≤ a < b ≤ d, and for all 0 ≤ t ≤ 1, we
have
\[ f((1 - t)a + tb) \ge (1 - t) f(a) + t f(b). \]
If f (x) is concave on the interval [c, d], and $c \le a_1 \le \ldots \le a_n \le d$, then
\[ f\left( \frac{a_1 + \cdots + a_n}{n} \right) \ge \frac{f(a_1) + \cdots + f(a_n)}{n}. \]
As an example, consider the convex function $f(x) = x^2$; for this function Jensen says
that
\[ \left( \frac{a_1 + \cdots + a_n}{n} \right)^2 \le \frac{a_1^2 + \cdots + a_n^2}{n}, \]
which is equivalent to the power means inequality $M^1(a_1, \ldots, a_n) \le M^2(a_1, \ldots, a_n)$; and
when $f(x) = -\ln x$ we get
\[ \sqrt[n]{a_1 \cdots a_n} \le \frac{a_1 + \cdots + a_n}{n}, \]
the AM-GM inequality.
If f is twice differentiable, there is an easy check for convexity/concavity: f is convex
on intervals where the second derivative is positive, and concave on intervals where the
second derivative is negative. If f is just once differentiable, there is a similar test: f is
convex on intervals where the first derivative is increasing, and concave on intervals where
the first derivative is decreasing.
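The two examples of Jensen’s inequality mentioned above (the convex functions x^2 and −ln x) can be tried on concrete numbers. In this sketch the helper name `jensen_gap` and the sample data are illustrative assumptions.

```python
import math

def jensen_gap(f, values):
    """Average of f over the values minus f of their average (>= 0 for convex f)."""
    n = len(values)
    return sum(f(v) for v in values) / n - f(sum(values) / n)

data = [0.5, 1.0, 2.0, 7.0]
gap_square = jensen_gap(lambda x: x * x, data)            # power means, M^1 <= M^2
gap_neg_log = jensen_gap(lambda x: -math.log(x), data)    # AM-GM in disguise
print(gap_square, gap_neg_log)
```

Both gaps come out positive, and both shrink to 0 if all the sample values are made equal.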
Four miscellaneous comments
1. Maximization/minimization problems are often problems about inequalities in dis-
guise. For example, to find the minimum of f (a, b) as (a, b) ranges over a set R,
it is enough to first guess that the minimum is m, then find an (a, b) ∈ R with
f (a, b) = m, and then use inequalities to show that f (a, b) ≥ m for all (a, b) ∈ R.
2. If an expression is presented as a sum of n squares, it is sometimes helpful to think
of it as the (square of the) distance between two points in n dimensional space, and
then think of the problem geometrically.
3. Sometimes a little calculus is all that is needed. For example, here is a very useful
inequality:
\[ 1 + x \le e^x \text{ for all } x \in \mathbb{R}. \]
To prove this for x ≥ 0 note that both sides are equal at x = 0, and the derivative
of 1 + x, which is 1, is no larger than the derivative of $e^x$, which is $e^x$, for all x ≥ 0;
so the two sides start together but the right-hand side is always growing at least as fast
as the left-hand side, so the right-hand side is always at least as big. A similar argument
proves the inequality for x ≤ 0: 1 + x, with derivative 1, falls faster as we move
along the x-axis negatively away from 0, than does $e^x$, which has derivative positive
but strictly less than 1 for x < 0. (To formalize this second half of the argument,
consider f (y) = 1 − y and $g(y) = e^{-y}$, defined for y ≥ 0. We have f (0) = g(0), and
$f'(y) = -1 \le -e^{-y} = g'(y)$ for y ≥ 0, so f (y) ≤ g(y) for y ≥ 0. Substituting y = −x, it
follows that for x ≤ 0, $1 + x \le e^x$.)
4. If f (x) is a positive, increasing function on (0, ∞), then by considering Riemann
sums we have
\[ \int_0^n f(x)\,dx \le \sum_{k=1}^n f(k) \le \int_1^{n+1} f(x)\,dx \]
(assuming the left-hand integral converges). For example, consider $f(x) = x^k$ for
k > 0. We have
\[ \int_0^n x^k\,dx = \frac{n^{k+1}}{k+1} \]
and
\[ \int_1^{n+1} x^k\,dx = \frac{(n+1)^{k+1}}{k+1} - \frac{1}{k+1}. \]
It is easy to check that
\[ \left( \frac{(n+1)^{k+1}}{k+1} - \frac{1}{k+1} \right) \Big/ \frac{n^{k+1}}{k+1} \to 1 \text{ as } n \to \infty, \]
so we have a quick proof that for each fixed k > 0 (not necessarily an integer)
\[ \lim_{n \to \infty} \frac{1^k + \cdots + n^k}{n^{k+1}} = \frac{1}{k+1}; \]
in other words, the sum of the first n perfect kth powers grows like $n^{k+1}/(k + 1)$.
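The sandwich between the two integrals, and the resulting limit, can be observed numerically. The sketch below uses the non-integer exponent k = 1.5 and n = 100000, both arbitrary illustrative choices, and the names `lower`, `upper` and `total` are assumptions.

```python
k = 1.5
n = 100_000

# Integral of x^k from 0 to n, and from 1 to n + 1.
lower = n ** (k + 1) / (k + 1)
upper = ((n + 1) ** (k + 1) - 1) / (k + 1)

# The sum 1^k + 2^k + ... + n^k.
total = sum(j ** k for j in range(1, n + 1))

ratio = total / n ** (k + 1)  # should be close to 1 / (k + 1) = 0.4
print(lower <= total <= upper, ratio)
```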
Some warm-up problems
You should find that these are all fairly easy to prove by direct applications of an appropriate
inequality from the list above.
1. $n! < \left( \frac{n+1}{2} \right)^n$ for n = 2, 3, 4, . . ..
2. $\sqrt{3(a + b + c)} \ge \sqrt{a} + \sqrt{b} + \sqrt{c}$ for positive a, b, c.
4. Minimize
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \]
subject to x, y, z ≥ 0 and xyz = 1.
$2019!^{(2020)}$ or $2020!^{(2019)}$.
(Here $n!^{(k)}$ indicates iterating the factorial function k times, so for example $4!^{(2)} = 24!$.)
8. Minimize
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \]
on the interval 0 < x < π/2.
10.1 Problems to think about for week 10
1. For positive integers m, n, show
\[ \frac{(m+n)!}{(m+n)^{m+n}} < \frac{m!}{m^m} \cdot \frac{n!}{n^n}. \]
2. Let T be an acute triangle. Inscribe a rectangle R in T with one side along a side of
T . Then inscribe a rectangle S in the triangle formed by the side of R opposite the
side on the boundary of T , and the other two sides of T , with one side along the
side of R. For any polygon X, let A(X) denote the area of X. Find the maximum
value, or show that no maximum exists, of
\[ \frac{A(R) + A(S)}{A(T)}, \]
where T ranges over all triangles and R, S over all rectangles as above.
3. Minimize
\[ (u - v)^2 + \left( \sqrt{2 - u^2} - \frac{9}{v} \right)^2 \]
in the range $0 \le u \le \sqrt{2}$, v ≥ 0.
4. Show that for non-negative reals a1 , . . . , an and b1 , . . . , bn ,
\[ (a_1 \cdots a_n)^{1/n} + (b_1 \cdots b_n)^{1/n} \le ((a_1 + b_1) \cdots (a_n + b_n))^{1/n}. \]
6. Suppose that f (x) is a polynomial with all real coefficients, satisfying $f(x) + f'(x) > 0$
for all x. Show that f (x) > 0 for all x.
7. Given that {x1 , . . . , xn } = {1, . . . , n} (i.e., the numbers x1 , . . . , xn are 1 through n
in some order), find (with proof!) the maximum value of
x1 x2 + x2 x3 + · · · + xn−1 xn + xn x1 .
8. Let $(x_k)_{k=1}^{\infty}$ and $(y_k)_{k=1}^{\infty}$ be sequences satisfying
y1 ≥ y2 ≥ y3 ≥ · · ·
and, for each k ≥ 1,
x1 x2 · · · xk ≥ y1 y2 · · · yk .
Show that for each k ≥ 1,
x1 + x2 + · · · + xk ≥ y1 + y2 + · · · + yk .
Solutions to warm-up problems
All of these problems were taken from a Northwestern Putnam prep problem set.
1. $n! < \left( \frac{n+1}{2} \right)^n$ for n = 2, 3, 4, . . ..
Solution: Use the AM–GM inequality, with $(a_1, \ldots, a_n) = (1, \ldots, n)$.
2. $\sqrt{3(a + b + c)} \ge \sqrt{a} + \sqrt{b} + \sqrt{c}$ for positive a, b, c.
Solution: Use the power means inequality, with $(a_1, a_2, a_3) = (a, b, c)$ and r =
1/2, s = 1.
Solution: Guess: the minimum is n, achieved when all $x_i = 1$. Then use AM–GM
inequality to show
\[ \frac{x_1 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 \cdots x_n} = 1 \]
for positive $x_i$ satisfying $x_1 \cdots x_n = 1$.
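4. Minimize
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \]
subject to x, y, z ≥ 0 and xyz = 1.
Solution: Apply Cauchy-Schwarz with the vectors $(\sqrt{y+z}, \sqrt{z+x}, \sqrt{x+y})$ and
\[ \left( \frac{x}{\sqrt{y+z}}, \frac{y}{\sqrt{z+x}}, \frac{z}{\sqrt{x+y}} \right) \]
to get
\[ (x + y + z)^2 \le \left( \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \right) \cdot 2(x + y + z), \]
leading to
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \ge \frac{x+y+z}{2}. \]
By the AM–GM inequality,
\[ \frac{x+y+z}{3} \ge \sqrt[3]{xyz} = 1, \]
so
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \ge \frac{3}{2}. \]
This lower bound can be achieved by taking x = y = z = 1, so the minimum is 3/2.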
5. If a triangle has side lengths a, b, c and opposite angles (measured in radians) A, B, C,
then
\[ \frac{aA + bB + cC}{a + b + c} \ge \frac{\pi}{3}. \]
$2019!^{(2020)}$ or $2020!^{(2019)}$.
(Here $n!^{(k)}$ indicates iterating the factorial function k times, so for example $4!^{(2)} = 24!$.)
Solution: Consider $f(x) = (2020 - x) \ln(2020 + x)$. We have $e^{f(0)} = 2020^{2020}$ and
$e^{f(-1)} = 2019^{2021}$, so we want to see what f does on the interval [−1, 0]: increase or
decrease? The derivative is
\[ f'(x) = -\ln(2020 + x) + \frac{2020 - x}{2020 + x}, \]
which is negative on [−1, 0] (since, for example,
\[ \frac{2020 - x}{2020 + x} \le \frac{2021}{2019} < 2 = \ln e^2 < \ln(2020 + x) \]
on that interval). So f is decreasing on [−1, 0], giving f (−1) > f (0), that is,
\[ 2020^{2020} < 2019^{2021}. \]
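Since Python integers have arbitrary precision, the final comparison can be confirmed exactly, alongside the logarithmic version used in the argument. Note that with f(x) = (2020 − x) ln(2020 + x), it is x = −1 that gives 2021 ln 2019, the logarithm of 2019^2021. The variable names here are illustrative.

```python
import math

def f(x):
    """f(x) = (2020 - x) ln(2020 + x); exp(f(0)) = 2020^2020, exp(f(-1)) = 2019^2021."""
    return (2020 - x) * math.log(2020 + x)

exact_comparison = 2020 ** 2020 < 2019 ** 2021  # exact big-integer arithmetic
log_comparison = f(0) < f(-1)                   # same comparison after taking logs
print(exact_comparison, log_comparison)
```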
8. Minimize
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \]
on the interval 0 < x < π/2.
Solution: We can use the rearrangement inequality on the pairs $(\sin^3 x, \cos^3 x)$
(which satisfies $\sin^3 x \le \cos^3 x$ on (0, π/4], and $\sin^3 x \ge \cos^3 x$ on [π/4, π/2)), and
$(1/\cos x, 1/\sin x)$ (which also satisfies $1/\cos x \le 1/\sin x$ on (0, π/4], and $1/\cos x \ge
1/\sin x$ on [π/4, π/2)), to get
\[ \frac{\sin^3 x}{\cos x} + \frac{\cos^3 x}{\sin x} \ge \frac{\sin^3 x}{\sin x} + \frac{\cos^3 x}{\cos x} = \sin^2 x + \cos^2 x = 1 \]
on the whole interval. Since 1 can be achieved (at x = π/4) the minimum is 1.
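A crude grid search is a reasonable way to guess, or double-check, the minimizer before looking for a proof. In the sketch below the function name `g` and the grid of 1999 interior points are illustrative assumptions.

```python
import math

def g(x):
    """The expression sin^3(x)/cos(x) + cos^3(x)/sin(x) from problem 8."""
    return math.sin(x) ** 3 / math.cos(x) + math.cos(x) ** 3 / math.sin(x)

# Evenly spaced points strictly inside (0, pi/2); the midpoint pi/4 is on the grid.
grid = [math.pi / 2 * i / 2000 for i in range(1, 2000)]
best_x = min(grid, key=g)
print(best_x, g(best_x))
```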
Solution: This was from the 2004 Putnam competition, Problem B2.
where T ranges over all triangles and R, S over all rectangles as above.
Solution: This problem was on the 1985 Putnam Competition, Problem A2.
We claim that the answer is 2/3. Here’s a picture to assist in visualizing the problem:
Assume, without loss of generality, that the horizontal base of T has length 1. Let
the base of R have length x, and the base of S have length y, where 0 < y < x < 1.
We have
\[ \frac{A(S)}{A(T)} = 2y(x - y) \]
and
\[ \frac{A(R)}{A(T)} = 2x(1 - x), \]
so the quantity we want to maximize is
\[ 2x(1 - x) + 2y(x - y). \]
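Adding the two area ratios gives 2x(1 − x) + 2y(x − y), and a grid search over 0 < y < x < 1 supports the claimed maximum of 2/3. This is only a numerical sketch: the function name `ratio` and the grid size of 300 steps are illustrative assumptions.

```python
def ratio(x, y):
    """(A(R) + A(S)) / A(T) when T has base 1, R has base x and S has base y."""
    return 2 * x * (1 - x) + 2 * y * (x - y)

steps = 300
# Each entry is (value, x, y); max() compares by value first.
best = max(
    (ratio(i / steps, j / steps), i / steps, j / steps)
    for i in range(1, steps)
    for j in range(1, i)
)
print(best)
```

On this grid the maximum is attained at x = 2/3, y = 1/3, where the ratio is 2/3.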
3. Minimize
\[ (u - v)^2 + \left( \sqrt{2 - u^2} - \frac{9}{v} \right)^2 \]
in the range $0 \le u \le \sqrt{2}$, v ≥ 0.
The expression to be minimized is the (square of the) distance between a point of
the form $(u, \sqrt{2 - u^2})$ on $0 < u < \sqrt{2}$, and a point of the form (v, 9/v) on v > 0;
in other words, we are looking for the (square of the) distance between the circle
$x^2 + y^2 = 2$ in the first quadrant and the hyperbola xy = 9 in the same quadrant. By
symmetry, it strongly seems that the two closest points are (3, 3) on the hyperbola
and (1, 1) on the circle (squared distance 8). To prove that this is the minimum,
note that the tangent lines to the two curves at those two points are parallel, that
the distance between them at these points is the perpendicular distance between
the two tangent lines, and that the hyperbola (in the first quadrant) lies completely
above its tangent line, while the circle (in the first quadrant) lies completely below
its tangent line; so the distance between any other two points is at least the distance
between the two tangent lines.
Solution: This was from the 2003 Putnam competition, problem A2.
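If any $a_i$ is 0, the result is trivial, so we may assume all $a_i > 0$. Dividing through by
$(a_1 \cdots a_n)^{1/n}$ and writing $c_i = b_i / a_i$, the inequality becomes
\[ 1 + (c_1 \cdots c_n)^{1/n} \le \big( (1 + c_1) \cdots (1 + c_n) \big)^{1/n}. \]
Raising both sides to the power n and expanding, this is
\[ \sum_{k=0}^n \binom{n}{k} (c_1 \cdots c_n)^{k/n} \le \sum_{k=0}^n e_k, \]
where $e_k$ is the sum of the products of the $c_i$’s, taken k at a time. So it is enough to
show that for each k,
\[ \binom{n}{k} (c_1 \cdots c_n)^{k/n} \le \sum_{A \subseteq \{1,\ldots,n\},\ |A| = k} \ \prod_{i \in A} c_i. \]
We apply the AM-GM inequality to the numbers $\prod_{i \in A} c_i$ as A ranges over all subsets
of size k of {1, . . . , n}. Note that each $c_i$ appears exactly $\binom{n-1}{k-1}$ times in all these
numbers. So we get
\[ (c_1 \cdots c_n)^{\binom{n-1}{k-1} / \binom{n}{k}} \le \frac{\sum_{A \subseteq \{1,\ldots,n\},\ |A| = k} \prod_{i \in A} c_i}{\binom{n}{k}}. \]
Since $\binom{n-1}{k-1} / \binom{n}{k} = k/n$, this is the same as
\[ (c_1 \cdots c_n)^{k/n} \le \frac{\sum_{A \subseteq \{1,\ldots,n\},\ |A| = k} \prod_{i \in A} c_i}{\binom{n}{k}}, \]
which is exactly what we wanted to show.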
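\[ \left( \prod_{i=1}^n \frac{a_i}{a_i + b_i} \right)^{1/n} + \left( \prod_{i=1}^n \frac{b_i}{a_i + b_i} \right)^{1/n} \le 1. \]
Applying the arithmetic mean – geometric mean inequality to both terms on the
left-hand side, we find that the left-hand side is at most
\[ \frac{1}{n} \left( \sum_{i=1}^n \frac{a_i}{a_i + b_i} \right) + \frac{1}{n} \left( \sum_{i=1}^n \frac{b_i}{a_i + b_i} \right) \]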
Solution: This was from the 1996 Putnam Competition, problem B2.
We estimate the integral of ln x, which is increasing and concave and hence easy to estimate. Take
the integral from 1 to 2n − 1. This is less than 2(ln 3 + ln 5 + . . . + ln(2n − 1)). But the
antiderivative of ln x is x ln x − x, so the integral evaluates to (2n − 1) ln(2n − 1) − 2n + 2.
Hence (2n − 1) ln(2n − 1) − (2n − 1) < (2n − 1) ln(2n − 1) − 2n + 2 < 2(ln 3 + ln 5 +
. . . + ln(2n − 1)). Exponentiating gives the left-hand inequality.
Similarly, the integral from e to 2n + 1 is greater than 2(ln 3 + ln 5 + . . . + ln(2n − 1)),
and an explicit evaluation of the antiderivative here leads to the right-hand side of
the inequality. The choice of lower bound e for the integral here is just the right
thing to make the computations work out nicely.
6. Suppose that f (x) is a polynomial with all real coefficients, satisfying $f(x) + f'(x) > 0$
for all x. Show that f (x) > 0 for all x.
f (x) and $f(x) + f'(x)$ have the same leading coefficient, so the same limiting behavior
as x goes to ±∞, namely they both tend to +∞ (since $f(x) + f'(x) > 0$ always, the
limits cannot be −∞).
f (x) cannot have a repeated root: at a repeated root, the derivative is also 0, so
$f(x) + f'(x) = 0$ at this point. So all of f (x)’s real roots (if it has any) are simple.
Since f (x) goes to +∞ as x approaches both ±∞, it must thus have an even number
of real zeroes.
Suppose it has any. Let $x_1$ and $x_2$ be the first two. Between $x_1$ and $x_2$, at some point
the derivative is 0 (Rolle’s theorem); at that point $f(x) + f'(x)$ must be negative
(since f (x) is negative here). This contradiction shows that f (x) has no real roots, so
can’t change sign, so must be always positive.
Remark: The example of $f(x) = -e^{-2x}$ shows that the hypothesis that f (x) is a
polynomial is crucial here.
x1 x2 + x2 x3 + · · · + xn−1 xn + xn x1 .
. . . , n − 4, n − 2, n, n − 1, n − 3, . . .
To show this, note that if a, b is a pair of adjacent numbers and c, d is another pair
(read in the same order around the circle) with a < d and b > c, then the segment
from b to c can be reversed, increasing the sum by
ac + bd − ab − cd = (d − a)(b − c) > 0.
where without loss of generality we assume $a_{n-1} > a_{n-2}$. By considering the pairs
$a_{n-2}, a_n$ and $a_{n-1}, a_{n-3}$ and using the trivial fact $a_n > a_{n-1}$, we deduce $a_{n-2} > a_{n-3}$.
We then compare the pairs $a_{n-4}, a_{n-2}$ and $a_{n-1}, a_{n-3}$, and using that $a_{n-1} > a_{n-2}$, we
deduce $a_{n-3} > a_{n-4}$. Continuing in this fashion, we prove that $a_n > a_{n-1} > \cdots > a_1$
and so $a_k = k$ for k = 1, 2, . . . , n, i.e. that the optimal arrangement is as claimed. In
particular, the maximum value of the sum is
\[ \begin{aligned}
& 1 \cdot 2 + (n-1) \cdot n + 1 \cdot 3 + 2 \cdot 4 + \cdots + (n-2) \cdot n \\
&= 2 + n^2 - n + (1^2 - 1) + \cdots + [(n-1)^2 - 1] \\
&= n^2 - n + 2 - (n-1) + \frac{(n-1)n(2n-1)}{6} \\
&= \frac{2n^3 + 3n^2 - 11n + 18}{6}.
\end{aligned} \]
Alternate solution: We prove by induction that the value given above is an upper
bound; it is clearly a lower bound because of the arrangement given above. Assume
this is the case for n − 1. The optimal arrangement for n is obtained from some
arrangement for n − 1 by inserting n between some pair x, y of adjacent terms. This
operation increases the sum by nx + ny − xy = n2 − (n − x)(n − y), which is an
increasing function of both x and y. In particular, this difference is maximal when x
and y equal n − 1 and n − 2. Fortunately, this yields precisely the difference between
the claimed upper bound for n and the assumed upper bound for n − 1, completing
the induction.
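For small n the claimed maximum can be compared against an exhaustive search over all arrangements. The sketch below fixes n in the first position, since rotating a circular arrangement does not change the sum; the function names `max_cyclic_sum` and `claimed_max`, and the range of n tested, are illustrative assumptions.

```python
from itertools import permutations

def max_cyclic_sum(n):
    """Brute-force maximum of x1*x2 + x2*x3 + ... + xn*x1 over arrangements of 1..n."""
    best = 0
    for rest in permutations(range(1, n)):
        order = (n,) + rest  # fix n first; rotations give the same cyclic sum
        total = sum(order[i] * order[(i + 1) % n] for i in range(n))
        best = max(best, total)
    return best

def claimed_max(n):
    """The closed form (2n^3 + 3n^2 - 11n + 18) / 6 from the solution."""
    return (2 * n ** 3 + 3 * n ** 2 - 11 * n + 18) // 6

for n in range(3, 8):
    print(n, max_cyclic_sum(n), claimed_max(n))
```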
8. Let $(x_k)_{k=1}^{\infty}$ and $(y_k)_{k=1}^{\infty}$ be sequences satisfying
y1 ≥ y2 ≥ y3 ≥ · · ·
x1 + x2 + · · · + xk ≥ y1 + y2 + · · · + yk .
Solution: This was on the 2013 University of Illinois at Urbana-Champaign Mock
Putnam. Here is a solution taken from the UIUC Mock Putnam website,
https://faculty.math.illinois.edu/~hildebr/putnam/mockputnamproblems.html:
11 Week 10 (October 13) — Modular arithmetic and
greatest common divisor
Modular arithmetic is something that everyone (not just mathematicians) is familiar
with from a very early age, though maybe not in a formal way. For example, whenever you
observe something like “it is 11 o’clock now, so in three hours’ time it will be two o’clock”, you
are performing addition modulo 12, saying “11 + 3 = 2”. In this section we will formalize
modular arithmetic, and present numerous properties and applications that highlight its
usefulness.
Modular arithmetic
For integers a and b, and positive integer k, say that a is congruent to b (modulo k), written
“a ≡ b (mod k)”, if a and b leave the same remainder on division by k, or equivalently if
a − b is a multiple of k, or equivalently if a − b = mk for some integer m. Congruence
(modulo k) is an equivalence relation on the integers, that partitions Z into k classes, called
residue classes. For example, when k = 3 the three classes are {. . . , −6, −3, 0, 3, 6, . . .},
{. . . , −5, −2, 1, 4, 7, . . .} and {. . . , −4, −1, 2, 5, 8, . . .}.
Many of the standard arithmetic operations go through unchanged to modular
arithmetic. For example, it is easy to establish that if
\[ a \equiv b \pmod{k} \quad \text{and} \quad c \equiv d \pmod{k} \]
then each of
\[ a + c \equiv b + d \pmod{k} \]
\[ a - c \equiv b - d \pmod{k} \]
\[ ac \equiv bd \pmod{k} \]
hold. Repeated application of this last relation also quickly gives that for all positive
integers $n$,
\[ a^n \equiv b^n \pmod{k}. \]
Modular arithmetic can be a great time-saver when working with problems concerning
divisibility. We give a quick and useful example.
Claim: The remainder of any integer, on division by 9, is the same as the remainder of
the sum of its digits on division by 9.
Proof: Write the number in decimal form as $\sum_{i=0}^{n} a_i 10^i$ (with each $a_i \in \{0, \ldots, 9\}$). Since
$10 \equiv 1 \pmod{9}$, we immediately have $10^i \equiv 1^i \equiv 1 \pmod{9}$, and so $a_i 10^i \equiv a_i \pmod{9}$,
and so
\[ \sum_{i=0}^{n} a_i 10^i \equiv \sum_{i=0}^{n} a_i \pmod{9}, \]
which is exactly what we wanted to show.
Here are three more quick examples illustrating how modular arithmetic can make life
easy:
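Question: What are the last two digits of $3^{72}$?
Answer: We are being asked: what number $x$ between 0 and 99 is such that $3^{72} \equiv
x \pmod{100}$? By repeated squaring we have
\[
\begin{aligned}
3 &\equiv 3 \pmod{100} \\
3^2 &\equiv 9 \pmod{100} \\
3^4 &\equiv 9^2 \equiv 81 \pmod{100} \\
3^8 &\equiv 81^2 \equiv 6561 \equiv 61 \pmod{100} \\
3^{16} &\equiv 61^2 \equiv 3721 \equiv 21 \pmod{100} \\
3^{32} &\equiv 21^2 \equiv 441 \equiv 41 \pmod{100} \\
3^{64} &\equiv 41^2 \equiv 1681 \equiv 81 \pmod{100}
\end{aligned}
\]
and so
\[ 3^{72} \equiv 3^{64} \cdot 3^{8} \equiv 81 \cdot 61 \equiv 4941 \equiv 41 \pmod{100}, \]
so the last two digits of $3^{72}$ are 41.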
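The repeated squaring used here is easy to mechanize. Below is a minimal sketch (the function name `power_mod` is mine); Python's built-in three-argument `pow` does the same job.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent % modulus by repeated squaring."""
    result = 1
    square = base % modulus          # holds base^(2^j) mod modulus at step j
    while exponent > 0:
        if exponent & 1:             # this power of two appears in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result

print(power_mod(3, 72, 100))  # 41, matching the hand computation
```

Only about log2(exponent) squarings are needed, which is why the hand computation stops after seven lines.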
Problem: Prove that $2^{70} + 3^{70}$ is divisible by 13.
Solution: We could use the same technique as above to discover that $2^{70} \equiv 10 \pmod{13}$
and $3^{70} \equiv 3 \pmod{13}$ so that $2^{70} + 3^{70} \equiv 10 + 3 \equiv 0 \pmod{13}$. But there is a much easier
way: $2^2 \equiv -3^2 \pmod{13}$, so $2^{70} \equiv (-1)^{35} 3^{70} \equiv -3^{70} \pmod{13}$, so $2^{70} + 3^{70} \equiv 0 \pmod{13}$.
Problem: Find all integers $x, y$ satisfying $x^2 - 5y^2 = 6$.
Solution: Some experimentation shows that no small numbers $x$ and $y$ work. We might
suspect, then, that the equation has no integer solutions. One way to verify this is to work
modulo 4. If there were an $x$ and $y$ with $x^2 - 5y^2 = 6$, then for that $x$ and $y$ we would have
\[ x^2 - 5y^2 \equiv 6 \pmod{4}. \]
If $x \equiv 0, 1, 2, 3 \pmod{4}$ then $x^2 \equiv 0, 1, 0, 1 \pmod{4}$, and if $y \equiv 0, 1, 2, 3 \pmod{4}$ then
$5y^2 \equiv 0, 1, 0, 1 \pmod{4}$. So, modulo 4, $x^2 - 5y^2$ is equivalent to one of $-1, 0, 1$, or one of
useful fact follows easily from looking at the prime factorizations of a and b, see
below.
The least common multiple of a, b, lcm(a, b), is the smallest positive number f such
that a|f and b|f ; for any positive number g with a|g and b|g we have f ≤ g; but
in fact, as with gcd, it turns out that we even have f |g in this case.
If gcd(a, b) = 1 (so no factors in common other than 1) then a and b are said to be
coprime or relatively prime.
2. Primes: If p > 1 only has 1 and p as divisors, it is said to be prime; otherwise it is
composite.
The fundamental fact about prime numbers (other than that there are infinitely many of
them!) is that every number n > 1 has a prime factorization:
\[ n = p_1^{a_1} \cdots p_k^{a_k} \]
with each $p_i$ a prime, and each $a_i > 0$. Moreover, the factorization is unique if we
assume that $p_1 < \cdots < p_k$.
The prime factorization gives one way (not the most computationally efficient way)
of accessing gcd(a, b) and lcm(a, b). Indeed, if
\[ a = p_1^{a_1} \cdots p_k^{a_k} \quad \text{and} \quad b = p_1^{b_1} \cdots p_k^{b_k} \]
(with some of the $a_i$ and $b_i$ possibly 0), then
\[ \gcd(a, b) = p_1^{\min(a_1, b_1)} \cdots p_k^{\min(a_k, b_k)} \quad \text{and} \quad \mathrm{lcm}(a, b) = p_1^{\max(a_1, b_1)} \cdots p_k^{\max(a_k, b_k)}. \]
Using $\min(x, y) + \max(x, y) = x + y$, we get the nice identity
\[ ab = \gcd(a, b)\,\mathrm{lcm}(a, b). \]
Since any common divisor of a and b must be of the form $\prod_{i=1}^{k} p_i^{\gamma_i}$ for some $\gamma_i$’s satisfying
$\gamma_i \le \min(a_i, b_i)$, we quickly get the fact, alluded to earlier, that if d = gcd(a, b) and
e is a common divisor of a and b, then not only do we have e ≤ d but also e|d.
3. Euclidean algorithm: Euclid described a simple way to compute gcd(a, b). Assume
a > b. Write
a = kb + j
where 0 ≤ j < b. If j = 0, then gcd(a, b) = b. If j > 0, then it is fairly easy to check
that gcd(a, b) = gcd(b, j). Repeat the process with the smaller pair b, j, and keep
repeating as long as necessary. For example, suppose I want gcd(63, 36):
\[
\begin{aligned}
63 &= 1 \cdot 36 + 27 \\
36 &= 1 \cdot 27 + 9 \\
27 &= 3 \cdot 9.
\end{aligned}
\]
We conclude 9 = gcd(27, 9) = gcd(36, 27) = gcd(63, 36).
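The repeated division step translates directly into a loop. This is a minimal sketch (Python's standard library already offers `math.gcd`; the version below just mirrors the description above).

```python
def gcd(a, b):
    """Euclidean algorithm: replace (a, b) by (b, a mod b) until the remainder is 0."""
    while b != 0:
        a, b = b, a % b   # a = k*b + j with 0 <= j < b; continue with the pair (b, j)
    return a

print(gcd(63, 36))  # 9, as in the worked example
```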
4. Bézout’s Theorem: Given a, b, there are integers x, y such that ax + by = gcd(a, b).
Moreover, the set of numbers $c$ that can be expressed in the form $c = ax' + by'$ for
integers $x', y'$ is exactly the set of multiples of gcd(a, b).
The proof comes from working the Euclidean algorithm backwards. I’ll just do an
example, with the pair 63, 36. We have
\[
\begin{aligned}
9 &= 36 - 1 \cdot 27 \\
  &= 36 - 1 \cdot (63 - 1 \cdot 36) \\
  &= -1 \cdot 63 + 2 \cdot 36.
\end{aligned}
\]
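Working the algorithm backwards can also be done recursively. The sketch below (the name `extended_gcd` is my own) returns the gcd together with one pair of Bézout coefficients.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a >= 0, b >= 0."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # g = b*x + (a % b)*y = b*x + (a - (a // b)*b)*y = a*y + b*(x - (a // b)*y)
    return g, y, x - (a // b) * y

g, x, y = extended_gcd(63, 36)
print(g, x, y)  # 9 -1 2, i.e. 9 = -1*63 + 2*36
```

The coefficients are not unique; this routine happens to reproduce the pair found by hand.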
11.1 Problems to think about for week 11
1. Prove that the product of three consecutive integers is divisible by 504 if the middle
one is a perfect cube.
3. Compute the sum of the digits of the sum of the digits of the sum of the digits of
the number $4444^{4444}$.
4. Several positive integers are written on a chalk board. One can choose two of them,
erase them, and replace them with their greatest common divisor and least common
multiple. Prove that eventually the numbers on the board do not change.
5. How many prime numbers have the following (decimal) form: digits alternating
between 1’s and 0’s, beginning and ending with 1?
6. Let $a \ge b \ge 0$ be integers and let $p$ be a prime number. Show that $\binom{pa}{pb}$ and $\binom{a}{b}$
are congruent modulo $p$.
7. Is it possible to place 2021 integers on a circle such that for every pair of adjacent
numbers the ratio of the larger one to the smaller one is a prime?
8. Let $n > 1$ be an integer and $p$ a prime such that $n \mid (p - 1)$ and $p \mid (n^3 - 1)$. Prove
that $4p - 3$ is a perfect square.
Solution: Let the middle integer be $m^3$ where $m$ is an integer. Then the product of
the three integers is
• If $m \equiv 3 \pmod{7}$ then $m^2 - m + 1 \equiv 0 \pmod{7}$ and $7 \mid (m-1)(m^2+m+1)m^3(m+1)(m^2-m+1)$.
• If $m \equiv 4 \pmod{7}$ then $m^2 + m + 1 \equiv 0 \pmod{7}$ and $7 \mid (m-1)(m^2+m+1)m^3(m+1)(m^2-m+1)$.
• If $m \equiv 5 \pmod{7}$ then $m^2 - m + 1 \equiv 0 \pmod{7}$ and $7 \mid (m-1)(m^2+m+1)m^3(m+1)(m^2-m+1)$.
• If $m \equiv 6 \pmod{7}$ then $m + 1 \equiv 0 \pmod{7}$ and $7 \mid (m-1)(m^2+m+1)m^3(m+1)(m^2-m+1)$.
We have $n4^n - n = n2^n 2^n - n = (2^n + n)n2^n - n^2 2^n - n$, so if $2^n + n$ divides $n4^n - n$,
then
\[ (2^n + n) \mid (n^2 2^n + n). \]
It is an easy (but slightly tedious, so omitted) induction that for $n \ge 10$, $2^n + n >
n^3 - n$, so we conclude that if $2^n + n$ divides $8^n + n$, then $n \le 9$.
It is another tedious but easy check that n = 0, 1, 2, 4 and 6 all lead to integers, but
not any other n ≤ 9.
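The "tedious but easy check" is a one-liner with exact integer arithmetic. This sketch assumes, as the text above indicates, that the condition being tested is that $2^n + n$ divides $8^n + n$; the search range of 200 is an arbitrary choice of mine.

```python
def divides(n):
    """True when 2^n + n divides 8^n + n (exact big-integer arithmetic)."""
    return (8**n + n) % (2**n + n) == 0

solutions = [n for n in range(0, 200) if divides(n)]
print(solutions)
```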
3. Compute the sum of the digits of the sum of the digits of the sum of the digits of
the number $4444^{4444}$.
Note that $4444^{4444} < 10000^{10000} = 10^{40000}$.
Among all numbers below $10^{40000}$, none has a larger sum of digits than $10^{40000} - 1$ (a
string of 40000 9’s). So the sum of the digits of $4444^{4444}$ is at most $9 \times 40000 < 1000000$.
Among all numbers below 1000000, none has a larger sum of digits than 999999. So
the sum of the digits of the sum of the digits of $4444^{4444}$ is at most 54. Among all
numbers at most 54, none has a larger sum of digits than 49. So the sum of the
digits of the sum of the digits of the sum of the digits of $4444^{4444}$ is at most 13.
Now we use a useful fact: the remainder of a number, on division by 9, is the same
as the remainder of the sum of the digits on division by 9. This fact implies that
the sum of the digits of the sum of the digits of the sum of the digits of $4444^{4444}$
leaves the same remainder on division by 9 as $4444^{4444}$ itself does.
To calculate the remainder of $4444^{4444}$ on division by 9, we can use a repeated
multiplication trick. It’s easy that
\[ 4444 \equiv 7 \pmod{9} \]
and so
\[
\begin{aligned}
4444^{2} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{4} &\equiv 16 \equiv 7 \pmod{9} \\
4444^{8} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{16} &\equiv 16 \equiv 7 \pmod{9} \\
4444^{32} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{64} &\equiv 16 \equiv 7 \pmod{9} \\
4444^{128} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{256} &\equiv 16 \equiv 7 \pmod{9} \\
4444^{512} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{1024} &\equiv 16 \equiv 7 \pmod{9} \\
4444^{2048} &\equiv 49 \equiv 4 \pmod{9} \\
4444^{4096} &\equiv 16 \equiv 7 \pmod{9}.
\end{aligned}
\]
It follows that
\[ 4444^{4444} = 4444^{4096} \cdot 4444^{256} \cdot 4444^{64} \cdot 4444^{16} \cdot 4444^{8} \cdot 4444^{4} \equiv 7 \cdot 7 \cdot 7 \cdot 7 \cdot 4 \cdot 7 \equiv 7 \pmod{9}. \]
So $4444^{4444}$ leaves a remainder of 7 on division by 9, and also the sum of the digits
of the sum of the digits of the sum of the digits of $4444^{4444}$ leaves a remainder of 7
on division by 9; but we’ve calculated that this last is at most 13. The only number
at most 13 that leaves a remainder of 7 on division by 9 is 7 itself; so the sum of
the digits of the sum of the digits of the sum of the digits of $4444^{4444}$ must be 7.
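Since Python has exact big integers, the answer can also be confirmed by brute force. A minimal sketch follows; the digits are peeled off with `divmod` rather than by converting to a string, because recent Python versions limit the length of integer-to-string conversions by default.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        n, digit = divmod(n, 10)   # strip the last decimal digit
        total += digit
    return total

N = 4444**4444
answer = digit_sum(digit_sum(digit_sum(N)))
print(answer)                 # 7
print(pow(4444, 4444, 9))     # 7, the residue used in the argument
```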
4. Several positive integers are written on a chalk board. One can choose two of them,
erase them, and replace them with their greatest common divisor and least common
multiple. Prove that eventually the numbers on the board do not change.
Here’s a very quick, slick solution shown to me by Do Trong Thanh: If you pick
two numbers a, b with a|b or b|a, then since gcd(a, b) = min{a, b} and lcm(a, b) =
max{a, b} in this case, the numbers do not change. In general gcd(a, b)|lcm(a, b),
so if it is not the case that a|b or b|a, then after the swap it is the case for that
particular pair. Initially there are only finitely many pairs (a, b) with $a \nmid b$ and $b \nmid a$;
either eventually we replace all these pairs with pairs of which one divides the other
(in which case we are done), or we eventually commit to avoiding all remaining such
pairs (in which case we are done).
Here’s my laborious solution: When we take a pair of numbers (a, b), and replace them
with (gcd(a, b), lcm(a, b)), we preserve something, namely the product of the pair of
numbers (that ab = gcd(a, b)lcm(a, b) is easily seen from the prime factorization of a
and b: if
\[ a = \prod_{i=1}^{n} p_i^{a_i}, \quad b = \prod_{i=1}^{n} p_i^{b_i} \]
then
\[ \gcd(a, b) = \prod_{i=1}^{n} p_i^{\min\{a_i, b_i\}}, \quad \mathrm{lcm}(a, b) = \prod_{i=1}^{n} p_i^{\max\{a_i, b_i\}}. \]
Thus ab = gcd(a, b)lcm(a, b) follows from x + y = min{x, y} + max{x, y}, valid for
any non-negative integers x, y.)
For any fixed positive number, there are only finitely many ways to write it as the
product of a fixed number of positive numbers (if the target of the product is N ,
and we are using d numbers, then each of the d numbers must be a divisor of N , so
the number of ways of writing N as a product of d terms is at most $a(N)^d$, where
a(N ) is the number of divisors of N ). This shows that there are only finitely many
possibilities for the numbers written on the board.
Consider the sum of the numbers. How does this change with the swap operation?
It depends on how a + b compares to gcd(a, b) + lcm(a, b). Experimentation suggests
that gcd(a, b) + lcm(a, b) ≥ a + b, with equality iff the pair (a, b) coincides (in some
order) with the pair (lcm(a, b), gcd(a, b)). To prove this, first consider a = b, for
which the result is trivial. For all other cases, assume without loss of generality that
a > b. We have
lcm(a, b) ≥ a > b ≥ gcd(a, b).
If any one of a = lcm(a, b), b = gcd(a, b) holds then by the conservation of product
the other must too, and the result we are trying to prove is true. So now we may
assume
lcm(a, b) > a > b > gcd(a, b),
and what we want to show is that this implies gcd(a, b) + lcm(a, b) > a + b. Let
$n = ab = \gcd(a, b)\,\mathrm{lcm}(a, b)$, so
\[ \gcd(a, b) + \mathrm{lcm}(a, b) = \gcd(a, b) + \frac{n}{\gcd(a, b)} \quad \text{and} \quad a + b = b + \frac{n}{b}. \]
A little calculus shows that the function $f(x) = x + n/x$ is decreasing on the interval
$(0, \sqrt{n}]$. Since $\gcd(a, b) < b < \sqrt{n}$, this shows that
\[ \gcd(a, b) + \frac{n}{\gcd(a, b)} > b + \frac{n}{b} \]
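Both the inequality and the stabilization claim are easy to test numerically. The following is a sketch under my own choices (function names, the bound 60, and a fixed sweep order for picking pairs); it checks that gcd + lcm ≥ a + b with equality exactly when the pair is unchanged, and runs the board process on a small example.

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def sum_never_decreases(limit):
    """Check gcd + lcm >= a + b, with equality iff {a, b} == {gcd, lcm}, for 1 <= a, b <= limit."""
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            g, l = gcd(a, b), lcm(a, b)
            if g + l < a + b:
                return False
            if (g + l == a + b) != ({a, b} == {g, l}):
                return False
    return True

def stabilise(board):
    """Apply the swap to every pair it changes, sweeping until no pair changes."""
    board = list(board)
    changed = True
    while changed:
        changed = False
        for i in range(len(board)):
            for j in range(i + 1, len(board)):
                a, b = board[i], board[j]
                g, l = gcd(a, b), lcm(a, b)
                if {a, b} != {g, l}:
                    board[i], board[j] = g, l
                    changed = True
    return sorted(board)

print(sum_never_decreases(60))
print(stabilise([4, 6, 9]))   # [1, 6, 36]; the product 216 is preserved
```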
5. How many prime numbers have the following (decimal) form: digits alternating
between 1’s and 0’s, beginning and ending with 1?
What happens if $P_n(100)$ is prime? It must divide one of $10^{n+1} - 1$, $10^{n+1} + 1$. But,
for $n \ge 2$,
so $P_n(100)$ is too big to divide either $10^{n+1} + 1$ or $10^{n+1} - 1$. Hence for $n \ge 2$, $P_n(100)$
can’t be prime.
The conclusion is that the only prime of the given form is 101.
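A quick experiment agrees with this conclusion for the first few numbers of the given form. The sketch assumes $P_n(100) = 1 + 100 + \cdots + 100^n$ (so $P_1(100) = 101$), which is my reading of the notation; the primality test is plain trial division, adequate only because these numbers have small factors.

```python
def is_prime(m):
    """Trial division; fine for the small cases checked here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def alternating(n):
    """The number 1010...101 with n + 1 ones, i.e. 1 + 100 + ... + 100^n."""
    return sum(100**i for i in range(n + 1))

primes_found = [alternating(n) for n in range(1, 7) if is_prime(alternating(n))]
print(primes_found)
```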
6. Let $a \ge b \ge 0$ be integers and let $p$ be a prime number. Show that $\binom{pa}{pb}$ and $\binom{a}{b}$
are congruent modulo $p$.
Solution: INCOMPLETE This was from the 1977 Putnam competition, Problem
A5.
Solution from John Scholes. Denote by $f(n)$ the highest power of $p$ dividing $n$ (so,
e.g., $f(2^3 5^8 p^7) = p^7$, if $p \ne 2, 5$). The multiples of $p$ in $(pa)!$ are $pa, p(a-1), \ldots, 2p$,
and $p$. Hence $f((pa)!) = p^a f(a!)$. Similarly, $f((pb)!) = p^b f(b!)$ and $f((p(a-b))!) =
p^{a-b} f((a-b)!)$. Hence
\[ f\binom{a}{b} = f\binom{pa}{pb}. \]
This says that $\binom{pa}{pb} - \binom{a}{b}$ can be expressed as $xp^y$ where $x$ and $y$ are non-negative
integers, and $x$ is not divisible by $p$. If $y > 0$, this gives the result.
I’m not sure what happens for this line of attack if y = 0.
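While the argument above is incomplete, the congruence itself is easy to test on small cases. This is only a numerical sanity check (the helper name and the ranges are mine), not a proof.

```python
from math import comb

def congruence_holds(p, a, b):
    """Check that C(pa, pb) and C(a, b) are congruent modulo p."""
    return (comb(p * a, p * b) - comb(a, b)) % p == 0

all_ok = all(
    congruence_holds(p, a, b)
    for p in (2, 3, 5, 7, 11)
    for a in range(0, 9)
    for b in range(0, a + 1)
)
print(all_ok)
```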
Here is the solution as posted in the American Mathematical Monthly shortly after
the 1977 competition:
7. Is it possible to place 2021 integers on a circle such that for every pair of adjacent
numbers the ratio of the larger one to the smaller one is a prime?
Solution: I found this on Andrei Jorza’s webpage, from his 2018 Putnam prep class.
Starting from a particular (arbitrarily chosen) number, A, say, on the circle, the
number one step away clockwise from A is A multiplied by the label on the arc of the
circle between A and that number one step away clockwise. In general, the number
k steps away from A (clockwise) is A multiplied by all the labels encountered along
those k arcs. So the number 2021 steps away from A (clockwise) is A multiplied by
all the labels on the arcs (since 2021 steps takes us all the way around the circle).
But this last number is A itself. So we have an equation:
\[ A = A \times \frac{\text{product of a bunch of primes (the primes on the UP arcs)}}{\text{product of a bunch of primes (the primes on the DOWN arcs)}} \]
or
product of primes on UP arcs = product of primes on DOWN arcs.
But there are 2021 arcs, an odd number, so one side of the above equation has an
odd number of primes in it, and the other side has an even number, contradicting
the fundamental theorem of arithmetic.
Notice that all we used here was that 2021 is odd.
8. Let $n > 1$ be an integer and $p$ a prime such that $n \mid (p - 1)$ and $p \mid (n^3 - 1)$. Prove
that $4p - 3$ is a perfect square.
\[ n^2 + (1 - \ell k)n + (1 - \ell) = 0. \]
\[ 4(\ell - 1) \ge 2(\ell k - 1) + 1 \]
(why? $(\ell k - 1)^2$ is a perfect square, and when we add $4(\ell - 1)$ we get a larger perfect
square; the first perfect square after $(\ell k - 1)^2$ is $(\ell k)^2$, which differs from $(\ell k - 1)^2$
by $2(\ell k - 1) + 1$, so we need to add at least this much).
$4(\ell - 1) \ge 2(\ell k - 1) + 1$ implies $k = 1$ (if $k \ge 2$ then $2(\ell k - 1) \ge 4\ell - 2 > 4(\ell - 1)$).
But $k = 1$ says $p = n + 1$, so $(n + 1) \mid (n^2 + n + 1)$, so $(n + 1) \mid n^2$, impossible for $n > 0$
(why? $(n + 1) \mid (n^2 - 1)$, so if $(n + 1) \mid n^2$ then $(n + 1) \mid 1$).
12 Week 11 (October 20) — Probability
Discrete probability may be thought about along the following lines: an experiment is
performed, with a set S, the sample space, of possible observable outcomes (e.g., roll a dice
and note the uppermost number when the dice lands; then S would be {1, 2, 3, 4, 5, 6}). S
may be finite or countable for our purposes. An event is a subset A of S; the event occurs
if the observed outcome is one of the elements of A (e.g., if A = {2, 4, 6}, which we might
describe as the event that an even number is rolled, then we would say that A occurred if
we rolled a 4, and that it did not occur if we rolled a 5). The compound event A ∪ B is
the event that at least one of A, B occur; A ∩ B is the event that both A and B occur,
and Ac (= S \ A) is the event that A did not occur.
A probability function is a function P that assigns to each event A a real number P(A), which
is intended to measure how likely A is to occur, or, in what proportion of a very large
number of independent repetitions of the experiment A occurs. P should satisfy the
following three rules:
1. P (A) ≥ 0 always (events occur with non-negative probability);
2. P (S) = 1 (something always happens); and
3. if A and B are disjoint events (no outcomes in common) then P (A ∪ B) = P (A) +
P (B); more generally, if $A_1, A_2, \ldots$ is a countable collection of mutually disjoint
events, then
\[ P\left(\cup_i A_i\right) = \sum_i P(A_i). \]
Three consequences of the rules are the following relations that one would expect:
1. If A ⊆ B then P (A) ≤ P (B);
2. P (∅) = 0; and
3. P (Ac ) = 1 − P (A).
Usually one constructs the probability function in the following way: intuition/experiment/some
underlying theory suggests that a particular $s \in S$ will occur a proportion $p_s$ of the time,
when the experiment is repeated many times; a reality check here is that $p_s$ should be
non-negative, and that $\sum_{s \in S} p_s = 1$. One then sets
\[ P(A) = \sum_{s \in A} p_s; \]
and calculating probabilities comes down to counting.
Example: I toss a coin 100 times. How likely is it that I get exactly 50 heads?
Solution: All $2^{100}$ lists of outcomes of 100 tosses are equally likely, so each one should
occur with probability $1/2^{100}$. The number of outcomes in which there are exactly 50
heads is $\binom{100}{50}$, so the required probability is
\[ \frac{\binom{100}{50}}{2^{100}}. \]
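To get a feel for the size of this number, it can be evaluated exactly; a minimal sketch using only the standard library:

```python
from fractions import Fraction
from math import comb

# Exact probability of exactly 50 heads in 100 fair tosses.
probability = Fraction(comb(100, 50), 2**100)
print(float(probability))   # roughly 0.08
```

So an exact 50/50 split happens only about one time in twelve or thirteen.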
A random variable X is a function that assigns to each outcome of an experiment
a (usually real) numerical value. For example, if I toss a coin 100 times, I may not be
interested in the particular list of heads and tails I get, just in the total number of heads,
so I could define X to be the function that takes in a string of 100 heads and tails, and
returns as the numerical value the number of heads in the string. The density function of
the random variable X is the function $p_X(x) = P(X = x)$, where “P (X = x)” is shorthand
for the probability of the event “the set of all outcomes for which X evaluates to x”. For tossing a coin 100
times and counting the number of heads, the density function is
\[
p_X(x) =
\begin{cases}
\binom{100}{x} 2^{-100} & \text{if } x = 0, 1, 2, \ldots, 100 \\
0 & \text{otherwise.}
\end{cases}
\]
with reading x being given weight $p_X(x)$. For example, if X is the binomial distribution
with parameters n and p, then E(X) is
\[
\begin{aligned}
\sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} &= np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k} \\
&= np(p + (1-p))^{n-1} \\
&= np,
\end{aligned}
\]
as we would expect.
It is worth knowing that expectation is a linear function:
Linearity of expectation: If a probability distribution/random variable X can be
written as the sum X1 + . . . + Xn of n (usually simpler) probability distributions/random
variables, then
E(X) = E(X1 ) + . . . + E(Xn )
Example: n boxes have labels 1 through n. n cards with numbers 1 through n written
on them (one number per card) are distributed among the n boxes (one card per box). On
average how many boxes get the card whose number is the same as the label on the box?
Solution: Let $X_i$ be the random variable that takes the value 1 if card i goes into box
i, and 0 otherwise; $p_{X_i}(1) = 1/n$, $p_{X_i}(0) = 1 - (1/n)$ and $p_{X_i}(x) = 0$ for all other x’s, so
$E(X_i) = 1/n$. Let X be the random variable that counts the number of boxes that get
the right card; since $X = X_1 + \ldots + X_n$ we have
\[ E(X) = E(X_1) + \ldots + E(X_n) = n(1/n) = 1 \]
(independent of n!) [This is the famous problem of derangements.]
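For small n the expectation can be computed exactly by listing every way of distributing the cards. A minimal sketch (the function name is mine; boxes and cards are numbered from 0 for convenience):

```python
from fractions import Fraction
from itertools import permutations

def expected_matches(n):
    """Average number of boxes holding their own card, over all n! distributions."""
    perms = list(permutations(range(n)))
    total = sum(
        sum(1 for box, card in enumerate(perm) if box == card)
        for perm in perms
    )
    return Fraction(total, len(perms))

for n in range(1, 8):
    print(n, expected_matches(n))
```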
One of the rules of probability is that for disjoint events A, B, we have P (A ∪ B) =
P (A) + P (B). If A and B have overlap, this formula overcounts by including outcomes in
A ∩ B twice, so should be corrected to
P (A ∪ B) = P (A) + P (B) − P (A ∩ B).
For three events A, B, C, a Venn diagram readily shows that
\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C). \]
There is a natural generalization:
Inclusion-exclusion (also called the sieve formula):
\[
\begin{aligned}
P\left(\cup_{i=1}^{n} A_i\right) = {} & \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) \\
& + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots \\
& + (-1)^{\ell-1} \sum_{i_1 < i_2 < \cdots < i_\ell} P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}) + \cdots \\
& + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n).
\end{aligned}
\]
Inclusion-exclusion is often helpful because calculating probabilities of intersections is
easier than calculating probabilities of unions.
Example: In the problem of derangements discussed above, what is the exact probability
that no box gets the correct card?
Solution: Let $A_i$ be the event that box i gets the right card. We have $P(A_i) = 1/n$, and
more generally for $i_1 < i_2 < \cdots < i_\ell$ we have
\[ P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}) = \frac{(n-\ell)!}{n!} \]
(there are n! distributions of cards, and to make sure that boxes $i_1$ through $i_\ell$ get the
right card, we are forced to place these $\ell$ cards each in a predesignated box; but the
remaining $n - \ell$ cards can be completely freely distributed among the remaining boxes).
By inclusion-exclusion,
\[ P\left(\cup_{i=1}^{n} A_i\right) = \sum_{\ell=1}^{n} (-1)^{\ell-1} \binom{n}{\ell} \frac{(n-\ell)!}{n!}, \]
the binomial term coming from selecting $i_1 < i_2 < \cdots < i_\ell$. We want the probability of
none of the boxes getting the right card, which is the complement of $\cup_{i=1}^{n} A_i$:
\[
\begin{aligned}
P\left(\left(\cup_{i=1}^{n} A_i\right)^c\right) &= 1 - \sum_{\ell=1}^{n} (-1)^{\ell-1} \binom{n}{\ell} \frac{(n-\ell)!}{n!} \\
&= \sum_{\ell=0}^{n} (-1)^{\ell} \binom{n}{\ell} \frac{(n-\ell)!}{n!} \\
&= \sum_{\ell=0}^{n} \frac{(-1)^{\ell}}{\ell!}.
\end{aligned}
\]
Note that this is the sum of the first n + 1 terms in the power series of $e^x$ around 0,
evaluated at x = −1, so as n gets larger the probability of there being no box with the
right card approaches 1/e.
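The alternating sum can be compared with a direct count for small n. A sketch, with function names of my own choosing:

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial

def no_match_probability_brute(n):
    """Fraction of the n! distributions in which no box gets its own card."""
    perms = list(permutations(range(n)))
    good = sum(1 for perm in perms if all(perm[i] != i for i in range(n)))
    return Fraction(good, len(perms))

def no_match_probability_formula(n):
    """Sum over l = 0..n of (-1)^l / l!, from inclusion-exclusion."""
    return sum(Fraction((-1) ** l, factorial(l)) for l in range(n + 1))

for n in range(1, 8):
    print(n, no_match_probability_brute(n), no_match_probability_formula(n))

# The limit 1/e is approached very quickly.
gap_to_limit = abs(float(no_match_probability_formula(20)) - exp(-1))
print(gap_to_limit)
```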
There is a counting version of inclusion-exclusion, that is very useful to know:
Inclusion-exclusion (counting version):
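\[
\begin{aligned}
\left|\cup_{i=1}^{n} A_i\right| = {} & \sum_{i=1}^{n} |A_i| - \sum_{i<j} |A_i \cap A_j| \\
& + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \cdots \\
& + (-1)^{\ell-1} \sum_{i_1 < i_2 < \cdots < i_\ell} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}| + \cdots \\
& + (-1)^{n-1} |A_1 \cap A_2 \cap \cdots \cap A_n|.
\end{aligned}
\]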
Example: How many numbers are there, between 1 and n, that are relatively prime to n
(have no factors in common)?
Solution: Let n have prime factorization $p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$. Let $A_i$ be the set of numbers
between 1 and n that are multiples of $p_i$. We have $|A_i| = n/p_i$, and more generally for
$i_1 < i_2 < \cdots < i_\ell$ we have
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_\ell}| = \frac{n}{p_{i_1} \cdots p_{i_\ell}}. \]
We want to know $\left|\left(\cup_{i=1}^{k} A_i\right)^c\right|$ (complement taken inside of {1, . . . , n}), because
this is exactly the set of numbers between 1 and n that share no factors in common with n. By
inclusion-exclusion,
\[
\begin{aligned}
\left|\left(\cup_{i=1}^{k} A_i\right)^c\right| &= n\left(1 - \sum_{i=1}^{k} \frac{1}{p_i} + \sum_{i<j} \frac{1}{p_i p_j} - \cdots + \frac{(-1)^k}{p_1 p_2 \cdots p_k}\right) \\
&= n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right).
\end{aligned}
\]
The function
\[ \varphi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right) \]
counting the number of numbers between 1 and n that are relatively prime to n, is called
the Euler totient function.
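The product formula can be checked against a direct count. A minimal sketch (function names are mine; the factorization is by trial division, so it is only meant for small n):

```python
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def totient_formula(n):
    """n * prod(1 - 1/p) over primes p dividing n, in exact integer arithmetic."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)
    return result

def totient_count(n):
    """Directly count the m in 1..n with gcd(m, n) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

print(totient_formula(36), totient_count(36))   # both 12
```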
The only mention I’ll make of probability with uncountable underlying sample spaces
is this: if R is a region in the plane, then a natural model for “selecting a point from R,
all points equally likely”, is to say that for each subset $R'$ of R, the probability that the
selected point will be in $R'$ is $\mathrm{Area}(R')/\mathrm{Area}(R)$, that is, proportional to the area of $R'$.
This idea naturally extends to more general spaces.
Example: I place a small coin at a random location on a 3 foot by 5 foot table. How
likely is it that the coin is within one foot of some edge of the table?
Solution: There’s a 1 foot by 3 foot region at the center of the table, consisting of exactly
those points that are not within one foot of some edge of the table; assuming that the
coin is equally likely to be placed at any location, the probability of landing in this region
is (1 × 3)/(3 × 5) = .2, so the probability of landing within one foot of some edge is
1 − .2 = .8.
Conditional probability: Given events A, B, with $P(B) \ne 0$, the conditional
probability of A given B is
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \]
(knowing that B has occurred, the only sample points that lead to A occurring are those
in A ∩ B, and the probability of these points occurring should be measured relative to
P (B), not 1).
From this definition, we get the law of total probability: if $B_1, \ldots, B_n$ form a partition
of the sample space of an experiment (disjoint from each other, union covers whole space),
and A is any event, then
\[ P(A) = \sum_{i=1}^{n} P(A|B_i) P(B_i). \]
This formula can often be used to calculate the probability that the nth term in a random
sequence takes a certain value, if it is known what is the probability distribution of the
(n − 1)th term. For example, suppose $X_1, X_2, X_3, \ldots$ is some sequence of random variables,
where the exact distribution of $X_n$ depends on the distribution of $X_{n-1}$, and that all the
$X_i$’s take on only the values 0, 1 and 2. Then
2. A bag contains 2021 red balls and 2020 black balls. We remove two balls at a time
repeatedly and (i) discard both if they are the same color and (ii) discard the black
ball and return the red ball to the bag if their colors differ. What is the probability
that this process will terminate with exactly one red ball in the bag?
3. You have coins C1 , C2 , . . . , Cn . For each k, coin Ck is biased so that, when tossed,
it has probability 1/(2k + 1) of falling heads. If the n coins are tossed, what is
the probability that the number of heads is odd? Express the answer as a rational
function of n.
4. An unbiased coin (i.e., heads and tails will each occur with probability 1/2) is tossed
n times. Find a formula, in closed form (no summation), for the expected value of
|H − T |, where H is the number of heads and T is the number of tails.
5. A dart, thrown at random, hits a square target. Assuming that any two parts of
the target of equal area are equally likely to be hit, find the probability that the
point hit is nearer to the center than to any edge. Express your answer in the form
$(a\sqrt{b} + c)/d$ where a, b, c and d are integers.
6. Two real numbers x and y are chosen at random in the interval (0, 1) with respect
to the uniform distribution. What is the probability that the closest integer to x/y
is even? Express the answer in the form r + sπ, where r and s are rational numbers.
7. Let k be a positive integer. Suppose that the integers 1, 2, 3, . . . , 3k + 1 are written
down in random order. What is the probability that at no time during the process,
the sum of the integers that have been written up to that time is a positive integer
divisible by 3? Your answer should be in closed form, but may include factorials.
8. Let S be the set of 2 by 2 matrices each of whose entries is one of the 15 squares
0, 1, 4, 9, . . . , 196. Prove that if one selects more than $15^4 - 15^2 - 15 + 2$ matrices
from S, then two of those selected must commute.
Solution: Some doodling with small examples suggests the following: if Shanille
throws a total of n free throws, n ≥ 3, then for each k in the range [1, n − 1] the
probability that she makes exactly k shots is 1/(n − 1) (independent of k).
We can prove this by induction on n, with n = 3 very easy. For n > 3, we start with
the extreme case k = 1. The probability that she makes exactly one shot in total is
the probability that she misses each of shots 3 through n, which is
\[ \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{n-3}{n-2} \cdot \frac{n-2}{n-1} = \frac{1}{n-1}. \]
For k > 1, there are two (mutually exclusive) ways that she can make k shots in
total:
(a) Make k − 1 of the first n − 1, and make the last; the probability of this happening
is, by induction,
\[ \frac{1}{n-2} \cdot \frac{k-1}{n-1}, \]
or
(b) make k of the first n − 1, and miss the last; the probability of this happening
is, by induction,
\[ \frac{1}{n-2} \cdot \frac{n-1-k}{n-1}. \]
Thus the net probability of making k shots is
\[ \frac{1}{n-2} \cdot \frac{k-1}{n-1} + \frac{1}{n-2} \cdot \frac{n-1-k}{n-1} = \frac{1}{n-1}, \]
and we are done by induction.
The answer to the given question is 1/99 (n = 100).
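The uniform distribution can be confirmed with exact arithmetic. The sketch below assumes the usual setup for this problem, which is not restated above: Shanille makes her first shot, misses her second, and thereafter makes each shot with probability equal to the proportion of shots made so far. The function name is mine.

```python
from fractions import Fraction

def made_shot_distribution(n):
    """dist[k] = P(exactly k shots made among the first n), under the assumed rule."""
    dist = {1: Fraction(1)}            # after 2 shots: exactly one made
    for shots in range(2, n):          # shots = number of throws taken so far
        new = {}
        for made, prob in dist.items():
            p_make = Fraction(made, shots)
            new[made + 1] = new.get(made + 1, 0) + prob * p_make
            new[made] = new.get(made, 0) + prob * (1 - p_make)
        dist = new
    return dist

dist = made_shot_distribution(100)
print(dist[50])   # 1/99
```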
2. A bag contains 2021 red balls and 2020 black balls. We remove two balls at a time
repeatedly and (i) discard both if they are the same color and (ii) discard the black
ball and return the red ball to the bag if their colors differ. What is the probability
that this process will terminate with exactly one red ball in the bag?
Solution: It helps to generalize to r red balls and b black balls, since as the
process goes along the number of balls of the two colors will not be equal. A little
experimentation suggests the following: if the process is started with an odd number
r ≥ 1 of red balls, and b ≥ 0 black balls, then it always ends with one red ball. We prove
this by induction on r + b. Formally: for each n ≥ 1, P (n) is the proposition “if the
process is started with an odd number r ≥ 1 of red balls, and b ≥ 0 black balls, with
r + b = n, then it always ends with one red ball”, and we prove P (n) by induction
on n.
Base case n = 1 is trivial, as is base case n = 2. For base case n = 3, we either start
with three red balls, in which case after one step we are down to one red, or we start
with one red and two blacks. In this case, one third of the time we first pick the two
blacks, and we are down to one red, while two thirds of the time we first pick a black
and a red, and we are down to one red and one black, leaving us with one red after
step two.
Now consider n ≥ 4, and start with r red balls and b black balls, r odd and r + b = n.
If on the first step we pick two reds, then we are left with r − 2 red balls and b black
balls. Note that r − 2 is odd and (r − 2) + b = n − 2, so by induction in this case we
always end with one red. If on the first step we pick two blacks, then we are left
with r red balls and b − 2 black balls. Note that r is odd and r + (b − 2) = n − 2, so
by induction in this case we always end with one red. Finally, if on the first step we
pick a black and a red, then we are left with r red balls and b − 1 black balls. Note
that r is odd and r + (b − 1) = n − 1, so by induction in this case we always end
with one red. This completes the induction.
Since 2021 is odd, the probability of ending with one red, starting with 2021 red
balls and 2020 black balls is 1.
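A simulation makes the certainty of the outcome easy to believe. This sketch assumes the two balls are drawn uniformly at random without replacement and that the process stops when fewer than two balls remain; the function name and the seed are my own choices.

```python
import random

def run_bag(red, black, rng):
    """Run the removal process; return the final (red, black) counts."""
    while red + black >= 2:
        # Draw two balls without replacement.
        first_red = rng.random() < red / (red + black)
        r2, b2 = (red - 1, black) if first_red else (red, black - 1)
        second_red = rng.random() < r2 / (r2 + b2)
        if first_red and second_red:
            red -= 2                   # same color: discard both reds
        elif not first_red and not second_red:
            black -= 2                 # same color: discard both blacks
        else:
            black -= 1                 # colors differ: discard black, return red
    return red, black

rng = random.Random(0)
outcomes = {run_bag(2021, 2020, rng) for _ in range(20)}
print(outcomes)   # {(1, 0)}
```

The number of red balls only ever changes by 2, so its parity never changes; that invariant is what the induction is really exploiting.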
3. You have coins C1 , C2 , . . . , Cn . For each k, coin Ck is biased so that, when tossed,
it has probability 1/(2k + 1) of falling heads. If the n coins are tossed, what is
the probability that the number of heads is odd? Express the answer as a rational
function of n.
4. An unbiased coin (i.e., heads and tails will each occur with probability 1/2) is tossed
n times. Find a formula, in closed form (no summation), for the expected value of
|H − T |, where H is the number of heads and T is the number of tails.
Solution: This was on the Putnam Competition in 1974, Problem A4. Here’s a
solution from https://mks.mff.cuni.cz/kalva/putnam/putn74.html:
If i heads are tossed, then n − i tails, so H − T = 2i − n, and
\[ E(|H - T|) = \frac{1}{2^n} \sum_{i=0}^{n} |2i - n| \binom{n}{i}. \]
For i = 0, . . . , [n/2], the summand is $(n - 2i)\binom{n}{i}$, which is equal to
$(2(n - i) - n)\binom{n}{n-i}$, the summand for n − i; so
\[ E(|H - T|) = \frac{1}{2^{n-1}} \sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} \]
(this clearly works for odd n, since as i runs from 0 to [n/2], n − i runs from n to
[n/2] + 1; it also works for even n, since in this case the one term that contributes in
both ranges, i = [n/2], contributes 0).
We’ll show
\[ \sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} = n \binom{n-1}{[n/2]}, \]
which leads to
\[ E(|H - T|) = \frac{n \binom{n-1}{[n/2]}}{2^{n-1}}. \]
We’ll use the committee-chair identity,
\[ i \binom{n}{i} = n \binom{n-1}{i-1}, \]
and
\[
\sum_{i=0}^{[n/2]} \binom{n}{i} =
\begin{cases}
2^{n-1} & \text{if } n \text{ odd} \\
2^{n-1} + \frac{1}{2} \binom{n}{[n/2]} & \text{if } n \text{ even}
\end{cases}
\]
(to see this, use $\binom{n}{r} = \binom{n}{n-r}$ and $\sum_{i=0}^{n} \binom{n}{i} = 2^n$).
For odd n, we have
\[
\begin{aligned}
\sum_{i=0}^{[n/2]} (n - 2i) \binom{n}{i} &= n2^{n-1} - 2 \sum_{i=0}^{[n/2]} i \binom{n}{i} \\
&= n2^{n-1} - 2n \sum_{i=1}^{[n/2]} \binom{n-1}{i-1} \\
&= n2^{n-1} - 2n \sum_{i=0}^{[n/2]-1} \binom{n-1}{i} \\
&= n2^{n-1} - n\left(2^{n-1} - \binom{n-1}{[n/2]}\right) \\
&= n \binom{n-1}{[n/2]},
\end{aligned}
\]
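The closed form can be checked against the defining sum for many n at once, odd and even alike. A minimal sketch with exact rational arithmetic (function names are mine):

```python
from fractions import Fraction
from math import comb

def expected_abs_difference(n):
    """E|H - T| computed directly from the definition."""
    return Fraction(sum(abs(2 * i - n) * comb(n, i) for i in range(n + 1)), 2**n)

def closed_form(n):
    """n * C(n-1, [n/2]) / 2^(n-1)."""
    return Fraction(n * comb(n - 1, n // 2), 2 ** (n - 1))

for n in range(1, 7):
    print(n, expected_abs_difference(n), closed_form(n))
```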
5. A dart, thrown at random, hits a square target. Assuming that any two parts of
the target of equal area are equally likely to be hit, find the probability that the
point hit is nearer to the center than to any edge. Express your answer in the form
$(a\sqrt{b} + c)/d$ where a, b, c and d are integers.
Solution: Place the dartboard on the x-y plane, with vertices at (0, 0), (0, 2), (2, 2)
and (2, 0), so center at (1, 1). We want to compute the area of the set of points
inside this square which are closer to (1, 1) than any of x = 0, 2, y = 0, 2. We’ll just
consider the triangle T bounded by vertices (0, 0), (1, 0), (1, 1); by symmetry, the part
of the desired region inside T has one eighth of the desired area.
For a point (x, y) in T, the distance to (1, 1) is $\sqrt{(x - 1)^2 + (y - 1)^2}$, and the distance
to the nearest of x = 0, 2, y = 0, 2 is just y. So the curve $\sqrt{(x - 1)^2 + (y - 1)^2} = y$,
or
\[ y = \frac{x^2 - 2x + 2}{2}, \]
cuts T into two regions, one (containing (1, 1)) being the points that are closer
to (1, 1) than the nearest of x = 0, 2, y = 0, 2. This curve hits the line x = y
at $(2 - \sqrt{2}, 2 - \sqrt{2})$. So the desired area inside T is the total area of T (which
is 1/2) minus the area bounded by x = y from (0, 0) to $(2 - \sqrt{2}, 2 - \sqrt{2})$, then
$y = (x^2 - 2x + 2)/2$ to (1, 1/2), then x = 1 to (1, 0), then the x-axis back to
(0, 0). This area is the area of the triangle bounded by (0, 0), $(2 - \sqrt{2}, 2 - \sqrt{2})$, and
$(2 - \sqrt{2}, 0)$ (which is $(2 - \sqrt{2})^2/2$), plus
\[ \int_{2-\sqrt{2}}^{1} \frac{x^2 - 2x + 2}{2} \, dx = \frac{1}{3}(4\sqrt{2} - 5); \]
grand total $(1/3)(4 - 2\sqrt{2})$. It follows that the desired area inside T is
\[ \frac{1}{2} - \frac{1}{3}(4 - 2\sqrt{2}) = \frac{1}{6}(4\sqrt{2} - 5), \]
and so the total desired area is eight times this, or $(4/3)(4\sqrt{2} - 5)$.
Since the total area of the square is 4, the desired probability is thus
\[ \frac{1}{3}(4\sqrt{2} - 5) \approx .218951. \]
Source: Putnam Competition 1989, Problem B1. Note that this is a rare example
of a Putnam question with a typo: the correct (and officially sanctioned) answer is
as given above, but notice that c = −5, which is not a positive integer; and when the
problem appeared on the 1989 Putnam Competition, it ended with “Express your
answer in the form $(a\sqrt{b} + c)/d$ where a, b, c and d are positive integers” (emphasis
mine).
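The value $(4\sqrt{2} - 5)/3$ can be cross-checked numerically. The following Python sketch is only a rough grid estimate, not part of the official solution: it samples midpoints of a grid on the square [0, 2] × [0, 2] and counts those nearer to the center than to any edge. The grid size is an arbitrary choice.

```python
from math import sqrt


def closer_to_center_fraction(grid: int = 400) -> float:
    """Fraction of grid-cell midpoints in [0,2]x[0,2] nearer to (1,1) than to any edge."""
    hits = 0
    step = 2.0 / grid
    for i in range(grid):
        x = (i + 0.5) * step
        for j in range(grid):
            y = (j + 0.5) * step
            dist_center = sqrt((x - 1) ** 2 + (y - 1) ** 2)
            dist_edge = min(x, 2 - x, y, 2 - y)  # distance to the nearest side
            if dist_center < dist_edge:
                hits += 1
    return hits / (grid * grid)


exact = (4 * sqrt(2) - 5) / 3
estimate = closer_to_center_fraction()
print(f"exact = {exact:.6f}, grid estimate = {estimate:.6f}")
```

The estimate should agree with the exact value to a couple of decimal places; a finer grid tightens the agreement at the cost of running time.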
6. Two real numbers x and y are chosen at random in the interval (0, 1) with respect
to the uniform distribution. What is the probability that the closest integer to x/y
is even? Express the answer in the form r + sπ, where r and s are rational numbers.
Hence the required probability is p = 1/4 + (1/3 − 1/5) + (1/7 − 1/9) + . . . . But
now recall that π/4 = 1 − 1/3 + 1/5 − 1/7 + . . . , so p = 5/4 − π/4.
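A quick numerical check of the last step, as a hedged Python sketch: it sums the series 1/4 + (1/3 − 1/5) + (1/7 − 1/9) + . . . and also runs a seeded simulation of the experiment, comparing both with 5/4 − π/4. The helper names, the seed and the sample sizes are arbitrary choices made for illustration.

```python
import math
import random


def series_partial_sum(terms: int) -> float:
    """1/4 + sum over k = 1..terms of (1/(4k-1) - 1/(4k+1))."""
    return 0.25 + sum(1 / (4 * k - 1) - 1 / (4 * k + 1) for k in range(1, terms + 1))


def simulate(trials: int, seed: int = 0) -> float:
    """Estimate the probability that the closest integer to x/y is even."""
    rng = random.Random(seed)
    even = 0
    counted = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if y == 0.0:  # random() can return exactly 0.0; skip to avoid dividing by zero
            continue
        counted += 1
        if math.floor(x / y + 0.5) % 2 == 0:
            even += 1
    return even / counted


closed_form = 5 / 4 - math.pi / 4
print(closed_form, series_partial_sum(100_000), simulate(200_000))
```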
Solution: This was a Putnam Competition problem, and the official solution,
published in the American Mathematical Monthly, is very nicely presented, so I
reproduce it here verbatim:
“The number of ways to write down 1, 2, 3, . . . , 3k + 1 in random order is (3k + 1)!, so
we want to count the number of ways in which none of the “partial sums” is divisible
by 3. First, consider the integers modulo 3 : 1, 2, 0, 1, 2, 0, . . . , 1, 2, 0, 1. To write
these with none of the partial sums divisible by 3, we must start with a 1 or a 2.
After that, we can include or omit 0’s at will without affecting whether any of the
partial sums are divisible by 3, so suppose [initially] we omit all 0’s. The remaining
sequence of 1’s and 2’s must then be of the form
1, 1, 2, 1, 2, 1, 2, . . .
or
2, 2, 1, 2, 1, 2, 1, . . .
(once you start, the rest of the sequence is forced by the condition that no partial
sum is divisible by 3). However, a sequence of the form 2, 2, 1, 2, 1, 2, 1, . . . has one
more 2 than 1, and we need to have one more 1 than 2. So the only possibility for
our sequence modulo 3, once the 0’s are omitted, is 1, 1, 2, 1, 2, 1, 2, . . .. There are
2k + 1 numbers in this sequence, and the k 0’s can be returned to the sequence
arbitrarily except at the beginning. So the number of ways to form the complete
sequence modulo 3 equals the number of ways to distribute the k identical 0’s over
2k + 1 boxes (the “slots” after the 1’s and 2’s), which by a standard “stars and bars”
argument is $\binom{3k}{k}$. Once this is done, there are k! ways to replace the k 0’s in the
sequence modulo 3 by the actual integers 3, 6, . . . , 3k. Also, there are k! ways to
“reconstitute” the 2’s and (k + 1)! ways for the 1’s. So the answer is
\[ \frac{\binom{3k}{k} \, k! \, k! \, (k + 1)!}{(3k + 1)!}. \]”
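For small k the quoted formula can be compared against brute force. The sketch below assumes, as the quoted solution indicates, that the quantity sought is the probability that no partial sum of a random ordering of 1, 2, . . . , 3k + 1 is divisible by 3; the function names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate, permutations
from math import comb, factorial


def prob_no_partial_sum_divisible_by_3(k: int) -> Fraction:
    """Brute force over all orderings of 1..3k+1 (only feasible for very small k)."""
    n = 3 * k + 1
    good = sum(
        1
        for perm in permutations(range(1, n + 1))
        if all(s % 3 != 0 for s in accumulate(perm))
    )
    return Fraction(good, factorial(n))


def closed_form(k: int) -> Fraction:
    """C(3k, k) * k! * k! * (k+1)! / (3k+1)!"""
    return Fraction(
        comb(3 * k, k) * factorial(k) ** 2 * factorial(k + 1),
        factorial(3 * k + 1),
    )


for k in (1, 2):
    assert prob_no_partial_sum_divisible_by_3(k) == closed_form(k)
```

For k = 3 there are already 10! orderings, so the brute-force side is kept to k = 1 and k = 2.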
8. Let S be the set of 2 by 2 matrices each of whose entries is one of the 15 squares
0, 1, 4, 9, . . . , 196. Prove that if one selects more than $15^4 - 15^2 - 15 + 2$ matrices
from S, then two of those selected must commute.
Solution: (Putnam Competition 1990, problem B3) Let A be the set of diagonal
matrices from S (matrices with 0’s off the main diagonal), and B the set of matrices
with all four entries the same. Since $|S| = 15^4$, $|A| = 15^2$, |B| = 15 and |A ∩ B| = 1,
inclusion-exclusion tells us that
\[ |A \cup B| = 15^2 + 15 - 1. \]
Let us select at least $15^4 - 15^2 - 15 + 3$ elements from S. If the all-zero matrix (the
unique entry in A ∩ B) is among those we select, then there is a pair that commutes,
since the all-zero matrix commutes with everything. So let’s assume that the all-zero
matrix was not selected.
If two diagonal matrices are selected, or two matrices with all four entries the same
are selected, then there is a pair that commutes, since any two diagonal matrices
commute, and any two matrices with all four entries the same commute. So let’s
assume that at most one diagonal matrix was selected, and at most one matrix with
all four entries the same.
This leaves at least $15^4 - 15^2 - 15 + 1$ matrices selected from $(A \cup B)^c$; in other
words, all matrices from $(A \cup B)^c$. The problem is completed by exhibiting two
matrices that commute from $(A \cup B)^c$; the pair [1, 1; 0, 1] and [1, 4; 0, 1] works.
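The counts used in this argument, and the commuting pair exhibited at the end, are easy to confirm by machine. A minimal Python sketch follows; matrices are stored as 4-tuples (top-left, top-right, bottom-left, bottom-right), which is a representation chosen here for convenience.

```python
from itertools import product

SQUARES = [i * i for i in range(15)]  # 0, 1, 4, ..., 196


def matmul2(m, n):
    """Product of two 2x2 matrices given as (a, b, c, d) row by row."""
    (a, b, c, d), (e, f, g, h) = m, n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def commute(m, n):
    return matmul2(m, n) == matmul2(n, m)


S = list(product(SQUARES, repeat=4))
A = {m for m in S if m[1] == 0 and m[2] == 0}   # diagonal matrices
B = {m for m in S if len(set(m)) == 1}          # all four entries equal
complement_size = len(S) - len(A | B)

assert len(S) == 15 ** 4 and len(A) == 15 ** 2 and len(B) == 15 and len(A & B) == 1
assert complement_size == 15 ** 4 - 15 ** 2 - 15 + 1

# The pair from the solution lies outside A and B, and commutes.
m1, m2 = (1, 1, 0, 1), (1, 4, 0, 1)
assert m1 not in A | B and m2 not in A | B and commute(m1, m2)
```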
13 Week 12 (October 27) — Polynomials
This week’s problems are all about polynomials, which come up in virtually every Putnam
competition.
• Factorization: In fact, every polynomial $p(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n$,
with real or complex coefficients, has exactly n roots, in the sense that there is a
vector $(c_1, \dots, c_n)$ (perhaps with some repetitions) such that
\[ p(x) = (x - c_1)(x - c_2) \cdots (x - c_n). \]
• Two different polynomials of the same degree can’t agree too often: If
p(x) and q(x) (over R or C) both have degree at most n, and there are n + 1 distinct
numbers $x_1, \dots, x_{n+1}$ such that $p(x_i) = q(x_i)$ for i = 1, . . . , n + 1, then p(x) and q(x)
are equal for all x. [Because then p(x) − q(x) is a polynomial of degree at most n
with at least n + 1 roots, so must be identically zero].
(these polynomials have already appeared in the last bullet point). A polynomial
$p(x_1, \dots, x_n)$ in n variables is symmetric if for every permutation π of {1, . . . , n}, we
have
\[ p(x_1, \dots, x_n) \equiv p(x_{\pi(1)}, \dots, x_{\pi(n)}). \]
(For example, $x_1^2 + x_2^2 + x_3^2 + x_4^2$ is symmetric, but $x_1^2 + x_2^2 + x_3^2 + x_1 x_4$ is not.)
Every symmetric polynomial in variables $x_1, \dots, x_n$ can be expressed as a polynomial
in the $\sigma_k$’s (that is, as a linear combination of products of the $\sigma_k$’s).
• Some special values tell things about the coefficients: (Rather obvious, but
worth keeping in mind) If $p(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_{n-1} x + a_n$, then
\begin{align*}
p(0) &= a_n \\
p(1) &= a_0 + a_1 + a_2 + \dots + a_n \\
p(-1) &= a_n - a_{n-1} + a_{n-2} - a_{n-3} + \dots + (-1)^n a_0.
\end{align*}
\[ B(x) = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n \]
multiple of b, and the constant term is a multiple of a. An immediate corollary of
this is that if p(x) is a monic polynomial (integer coefficients, leading coefficient 1),
then any rational root must in fact be an integer; conversely, if a real number x is a
root of a monic polynomial but is not an integer, it must be irrational (for example,
$\sqrt{2}$ is a root of the monic polynomial $x^2 - 2$, but is clearly not an integer, so it must be irrational)!
• Gauss’ lemma: Here is a weak form of Gauss’ lemma, but one that is very useful: if
c is an integer root of a monic polynomial p(x) (integer coefficients, leading coefficient
1), then p(x) factors as (x − c)q(x), where q(x) is also a monic polynomial (the
surprise being not that q(x) has leading coefficient 1, but that it has all integer
entries).
• One more fact about integer polynomials: Let p(x) be a (not necessarily monic)
polynomial with integer coefficients. For any integers a, b,
(a − b)|(p(a) − p(b)).
(So also,
(p(a) − p(b))|(p(p(a)) − p(p(b))),
etc.)
13.1 Problems to think about for week 13
1. (Verifying the last fact from the introduction) If p(x) is a polynomial with integer
coefficients, and a and b are distinct integers, verify that
\[ \frac{p(b) - p(a)}{b - a} \]
is always an integer.
2. For which real values of p and q are the roots of the polynomial $x^3 - px^2 + 11x - q$
three consecutive integers? Give the roots in these cases.
3. Let p(x) be a polynomial with integer coefficients, for which p(0) and p(1) are odd.
Can p(x) have any integer zeroes?
4. (a) Determine all polynomials p(x) such that p(0) = 0 and p(x + 1) = p(x) + 1 for
all x.
(b) Determine all polynomials p(x) such that p(0) = 0 and $p(x^2 + 1) = (p(x))^2 + 1$
for all x.
5. Does there exist a non-zero polynomial f (x) for which xf (x − 1) = (x + 1)f (x) for
all x?
6. Determine, with proof, all positive integers n for which there is a polynomial p(x) of
degree n satisfying the following three conditions:
\[ p_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \]
13.2 Solutions to problems on polynomials
1. (Verifying the last fact from the introduction) If p(x) is a polynomial with integer
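coefficients, and a and b are distinct integers, verify that
\[ \frac{p(b) - p(a)}{b - a} \]
is always an integer.
Write
\[ p(x) = c_0 + c_1 x + c_2 x^2 + \dots + c_n x^n, \]
with each $c_i$ an integer, and $c_n \neq 0$. We have
\begin{align*}
p(b) - p(a) &= \sum_{k=0}^{n} c_k b^k - \sum_{k=0}^{n} c_k a^k \\
&= \sum_{k=0}^{n} c_k \left( b^k - a^k \right) \\
&= \sum_{k=1}^{n} c_k (b - a) \left( b^{k-1} + a b^{k-2} + \dots + a^{k-2} b + a^{k-1} \right) \\
&= (b - a) \sum_{k=1}^{n} c_k \left( b^{k-1} + a b^{k-2} + \dots + a^{k-2} b + a^{k-1} \right)
\end{align*}
(the k = 0 term vanishes, since $b^0 - a^0 = 0$). Since
$\sum_{k=1}^{n} c_k \left( b^{k-1} + a b^{k-2} + \dots + a^{k-2} b + a^{k-1} \right)$ is an integer, we conclude that
(p(b) − p(a))/(b − a) is an integer.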
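The factorization above can be illustrated on a concrete polynomial. In this Python sketch, coefficient lists are written lowest degree first (a convention chosen here), and the integer quotient is computed from the sum over k ≥ 1 rather than by division.

```python
def evaluate(coeffs, x):
    """Evaluate c_0 + c_1 x + ... + c_n x^n by Horner's rule (coeffs = [c_0, ..., c_n])."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def quotient_via_factorization(coeffs, a, b):
    """sum over k >= 1 of c_k * (b^(k-1) + a b^(k-2) + ... + a^(k-1)), an integer."""
    return sum(
        c * sum(a ** j * b ** (k - 1 - j) for j in range(k))
        for k, c in enumerate(coeffs)
        if k >= 1
    )


coeffs = [3, -2, 0, 5]  # p(x) = 3 - 2x + 5x^3, an arbitrary example
a, b = 4, -7
difference = evaluate(coeffs, b) - evaluate(coeffs, a)
q = quotient_via_factorization(coeffs, a, b)
assert difference == (b - a) * q
print(difference, b - a, q)
```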
2. For which real values of p and q are the roots of the polynomial $x^3 - px^2 + 11x - q$
three consecutive integers? Give the roots in these cases.
Solution: The solution is: either p = 6, q = 6 (in which case the roots are 1, 2, 3), or
p = −6, q = −6 (in which case the roots are −1, −2, −3).
A polynomial with roots being three consecutive integers is of the form
\[ (x - (a - 1))(x - a)(x - (a + 1)) = x^3 - 3a x^2 + (3a^2 - 1)x - (a^3 - a) \]
for some integer a. So, matching coefficients, we must have $3a^2 - 1 = 11$, or a = ±2.
When a = 2 we get roots 1, 2, 3 and p = 6, q = 6; when a = −2 we get roots
−3, −2, −1 and p = −6, q = −6.
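The coefficient matching can be mirrored in a few lines of Python. This is only an illustrative search over a finite window of middle roots a (the window size is arbitrary); the algebra above is what shows that no other a can work.

```python
def cubic_from_middle_root(a: int):
    """Return (p, s, q) with x^3 - p x^2 + s x - q = (x-(a-1))(x-a)(x-(a+1))."""
    r1, r2, r3 = a - 1, a, a + 1
    p = r1 + r2 + r3                      # sum of roots, equals 3a
    s = r1 * r2 + r1 * r3 + r2 * r3       # pairwise products, equals 3a^2 - 1
    q = r1 * r2 * r3                      # product of roots, equals a^3 - a
    return p, s, q


solutions = []
for a in range(-50, 51):
    p, s, q = cubic_from_middle_root(a)
    if s == 11:                           # the x-coefficient must be 11
        solutions.append({"p": p, "q": q, "roots": (a - 1, a, a + 1)})

print(solutions)
```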
3. Let p(x) be a polynomial with integer coefficients, for which p(0) and p(1) are odd.
Can p(x) have any integer zeroes?
4. (a) Determine all polynomials p(x) such that p(0) = 0 and p(x + 1) = p(x) + 1 for
all x.
Solution: The only such polynomial is the identity polynomial.
By induction, p(x) = x for all positive integers x, so p(x) − x is a polynomial
with infinitely many zeros, so must be identically 0. We conclude that p(x) = x
is the only possible polynomial satisfying the given conditions.
(b) Determine all polynomials p(x) such that p(0) = 0 and $p(x^2 + 1) = (p(x))^2 + 1$
for all x.
Solution: The only such polynomial is the identity polynomial.
We have p(0) = 0, $p(1) = p(0)^2 + 1 = 1$, $p(2) = p(1)^2 + 1 = 2$, $p(5) = p(2)^2 + 1 = 5$,
$p(26) = p(5)^2 + 1 = 26$ and in general, by induction, if the sequence $(a_n)$ is
defined recursively by $a_0 = 0$ and $a_{n+1} = a_n^2 + 1$, then $p(a_n) = a_n$. Since
the sequence $(a_n)$ is strictly increasing, we find that there are infinitely many
distinct values x for which p(x) = x; as in the last part, this tells us that
p(x) = x is the only possible polynomial satisfying the given conditions.
5. Does there exist a non-zero polynomial f (x) for which xf (x − 1) = (x + 1)f (x) for
all x?
Solution: No. For positive integer n, taking x = n in the equation above, we have
\[ f(n) = \frac{n}{n+1} f(n - 1) = \frac{n-1}{n+1} f(n - 2) = \dots = 0 \cdot f(-1) = 0. \]
Hence f(x) has infinitely many zeros, and must be identically zero; f(x) ≡ 0.
6. Determine, with proof, all positive integers n for which there is a polynomial p(x) of
degree n satisfying the following three conditions:
(a) p(k) = k for k = 1, 2, . . . , n,
(b) p(0) is an integer, and
(c) p(−1) = 2020.
Solution: Set q(x) = p(x) − 5. We have q(a) = q(b) = q(c) = q(d) = 0 and so
q(x) = r(x)(x − a)(x − b)(x − c)(x − d), where r(x) is some rational polynomial; but
in fact (by Gauss’ Lemma), r(x) is a polynomial over integers.
Aside: Why is r(x) above a polynomial over integers? Suppose $x^n + a_{n-1} x^{n-1} +
\dots + a_1 x + a_0$ (call this expression 1), with all $a_i$ integers, factors as $(x - c)(x^{n-1} +
r_{n-2} x^{n-2} + \dots + r_1 x + r_0)$ (call this expression 2), where c is an integer. Then
necessarily the $r_i$ are rational numbers; but in fact, we can show that they are
all integers. This is obvious when c = 0, so assume $c \neq 0$. Expanding out the
factorization and equating coefficients, we get
\begin{align*}
a_{n-1} &= r_{n-2} - c \\
a_{n-2} &= r_{n-3} - c r_{n-2} \\
a_{n-3} &= r_{n-4} - c r_{n-3} \\
&\cdots \\
a_2 &= r_1 - c r_2 \\
a_1 &= r_0 - c r_1 \\
a_0 &= -c r_0.
\end{align*}
\[ c^n + a_{n-1} c^{n-1} + \dots + a_2 c^2 + a_1 c + a_0 = 0. \]
and so, since the left-hand side is clearly an integer, so is the right-hand side, r0 .
Now plugging a1 = r0 − cr1 into this last inequality, and dividing by c, we get
Back to solution: Now suppose there is an integer k with p(k) = 8. Then q(k) = 3,
so r(k)(k − a)(k − b)(k − c)(k − d) = 3. Since r(k), (k − a), (k − b), (k − c) and
(k − d) are all integers, and 3 is prime, one of the five must be ±3 and the remaining
four must be ±1. It follows that at least three of (k − a), (k − b), (k − c) and (k − d)
must be ±1, and so at least two of them must take the same value; this contradicts
the fact that a, b, c and d are distinct.
\[ p_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \]
The sequence $p_{n-1}(y_1), p_{n-1}(y_2), \dots, p_{n-1}(y_n)$ alternates in sign (think about the
graph of $y = p_{n-1}(x)$). As long as we choose $a_n$ sufficiently close to 0, the sequence
$p_n(y_1), p_n(y_2), \dots, p_n(y_n)$ alternates in sign (this is by continuity). So, choose such
an $a_n$. Now choose a $y_{n+1}$ sufficiently large that $p_n(y_{n+1})$ has the opposite sign
to $p_n(y_n)$ (this is where alternating the signs of the $a_i$’s comes in: such a $y_{n+1}$
exists exactly because $a_n$ and $a_{n-1}$ have opposite signs). We get that the sequence
$p_n(y_1), p_n(y_2), \dots, p_n(y_{n+1})$ alternates in sign. Hence $p_n(x)$ has n distinct real roots:
one between $y_1$ and $y_2$, one between $y_2$ and $y_3$, etc., up to one between $y_n$ and $y_{n+1}$.
This accounts for all its roots, and we are done.
9. A Boolean function is a function $f : \{0, 1\}^n \to \{0, 1\}$. A multilinear polynomial
$p : \mathbb{R}^n \to \mathbb{R}$ is a (real) linear combination of linear monomials, that is, expressions of the
form $\prod_{i \in S} x_i$ where S is a subset of {1, . . . , n}. Show that for every Boolean function
there is a unique multilinear polynomial $p_f$ that agrees with f on $\{0, 1\}^n$.
evaluates to 1 for all c for which f(c) = 1, and evaluates to 0 otherwise; i.e., it agrees
with f on $\{0, 1\}^n$.
and since the left-hand side above is 0 for every choice of $(x_1, \dots, x_{n-1}) \in \{0, 1\}^{n-1}$,
so is the right-hand side, and it follows (by the induction hypothesis) that s is
identically 0. So
\[ t(x_1, \dots, x_n) = r(x_1, \dots, x_{n-1}) x_n. \]
Now we have
\[ t(x_1, \dots, x_{n-1}, 1) = r(x_1, \dots, x_{n-1}), \]
and since the left-hand side above is 0 for every choice of $(x_1, \dots, x_{n-1}) \in \{0, 1\}^{n-1}$,
so is the right-hand side, and it follows (again by the induction hypothesis) that r is
identically 0. So in fact t is identically 0, completing the induction.
14 Week 13 (November 3) — Games
These problems are all about games played between two players. Usually when these
problems appear in the Putnam competition, you are asked to determine which player
wins when both players play as well as possible. Once you have decided which player wins
(maybe based on analyzing small examples), you need to prove this in general. Often this
entails demonstrating a winning strategy: for each possible move by the losing player, you
can try to identify a single appropriate response for the winning player, such that if the
winning player always uses these responses as the game goes on, then she will indeed win.
It’s important to remember that you must produce a response for the winning player for
every possible move of the losing player — not just a select few.
14.1 Problems to think about for week 14
1. Two players alternately draw diagonals between vertices of a regular polygon. They
may connect two vertices if they are non-adjacent (i.e. not a side) and if the diagonal
formed does not cross any of the previous diagonals formed. The last player to draw
a diagonal wins.
Who wins if the polygon has 2020 vertices?
2. Two players play a game in which the first player places a king on an empty 8 by 8
chessboard, and then, starting with the second player, they alternate moving the
king (in accordance with the rules of chess) to a square that has not been previously
occupied. The player who moves last wins. Which player has a winning strategy?
3. There are nine cards laid out on a table, face up, numbered 1 through 9. Two players,
A and B, take turns picking up cards (and once a card is picked up, it is out of play).
As soon as one of the players has among his chosen cards three of them that sum to
fifteen, that player wins.
(a) If both players play perfectly, what happens?
(b) What game are the players really playing?
4. Alan and Barbara play a game in which they take turns filling entries of an initially
empty 1024 by 1024 array. Alan plays first. At each turn, a player chooses a real
number and places it in a vacant entry. The game ends when all the entries are filled.
Alan wins if the determinant of the resulting matrix is nonzero; Barbara wins if it is
zero. Which player has a winning strategy?
5. I shuffle a regular deck of cards (26 red, 26 black), and begin to turn them face-up,
one after another. At some point during this process, you say “STOP!”. You can
say stop as early as before I’ve even turned over the first card, or as late as when
there is only one card left to be turned over; the only rule is that at some point you
must say it. Once you’ve said stop, I turn over the next card. If it is red, you win
the game, and if it is black, you lose.
If you play the strategy “say stop before even a single card has been turned over”,
you have a 50% chance of winning the game. Is there a more clever strategy that
gives you a better than 50% chance of winning the game?
6. Alice and Bob play the following game. They start with a pile of 9 matches. They
take turns, Alice playing first. Each player may remove between 1 and 3 matches.
The player who picks up the last match wins. Who has a winning strategy? And
what is it? And what if, instead of 9 matches, we start with a pile of n matches?
7. Two players, A and B, take turns naming positive integers, with A playing first. No
player may name an integer that can be expressed as a linear combination, with
positive integer coefficients, of previously named integers. The player who names “1”
loses. Show that no matter how A and B play, the game will always end.
8. Suppose n ≥ 2 light bulbs are arranged in a row, numbered 1 through n. Under
each bulb is a button. Pressing the button will change the state of the bulb above
it (from on to off or vice versa), and will also change the neighbors’ states. (Most
bulbs have two neighbors, but the bulbs on the end have only one.) The bulbs start
off randomly (some on and some off). For which n is it guaranteed to be possible
that by flipping some switches, you can turn all the bulbs off?
2. Two players play a game in which the first player places a king on an empty 8 by 8
chessboard, and then, starting with the second player, they alternate moving the
king (in accordance with the rules of chess) to a square that has not been previously
occupied. The player who moves last wins. Which player has a winning strategy?
Solution: Player 2 has a winning strategy. She can imagine the board as being
covered with non-overlapping 2-by-1 dominoes (there are many ways to cover an 8
by 8 board with dominoes). Wherever player 1 puts the king, player 2 moves it to
the other square in the corresponding domino. She then repeats this strategy until
the game is over. (Player 2’s approach here is referred to as a pairing strategy).
3. There are nine cards laid out on a table, face up, numbered 1 through 9. Two players,
A and B, take turns picking up cards (and once a card is picked up, it is out of play).
As soon as one of the players has among his chosen cards three of them that sum to
fifteen, that player wins.
• 1,5,9
• 1,6,8
• 2,4,9
• 2,5,8
• 2,6,7
• 3,4,8
• 3,5,7
• 4,5,6
we see that the winning triples are the three rows, three columns and two diagonals.
So the players are in fact playing tic-tac-toe, which (after some case analysis) is seen
to be a draw when both players play optimally.
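The correspondence with tic-tac-toe can be checked mechanically. The sketch below enumerates all triples from 1 through 9 summing to 15 and compares them with the lines of a 3 by 3 magic square. The particular square used (the standard one with 5 in the center) is an assumption made here for illustration, since any rotation or reflection of it works equally well.

```python
from itertools import combinations

# All 3-element subsets of {1, ..., 9} with sum 15.
triples = {frozenset(t) for t in combinations(range(1, 10), 3) if sum(t) == 15}

# Assumed layout: the standard 3x3 magic square.
MAGIC = [
    [2, 7, 6],
    [9, 5, 1],
    [4, 3, 8],
]

rows = [frozenset(row) for row in MAGIC]
cols = [frozenset(MAGIC[r][c] for r in range(3)) for c in range(3)]
diags = [
    frozenset(MAGIC[i][i] for i in range(3)),
    frozenset(MAGIC[i][2 - i] for i in range(3)),
]
lines = set(rows + cols + diags)

# The winning triples are exactly the eight tic-tac-toe lines.
assert triples == lines
print(sorted(sorted(t) for t in triples))
```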
4. Alan and Barbara play a game in which they take turns filling entries of an initially
empty 1024 by 1024 array. Alan plays first. At each turn, a player chooses a real
number and places it in a vacant entry. The game ends when all the entries are filled.
Alan wins if the determinant of the resulting matrix is nonzero; Barbara wins if it is
zero. Which player has a winning strategy?
Solution: Barbara has a winning strategy. For example, whenever Alan plays x in
row i, Barbara can play −x in some other place in row i (since there are an even
number of places in row i, Alan will never place the last entry in a row if Barbara
plays this strategy). So Barbara can ensure that all row-sums of the final matrix are
0, so that the column vector of all 1’s is in the kernel of the final matrix, so it has
determinant zero.
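Barbara's strategy can be simulated on a small board of even size. The following Python sketch makes several simplifying assumptions that are not part of the problem: Alan plays random small integers in random vacant cells, the board is 4 by 4 or 6 by 6 rather than 1024 by 1024, and the determinant is computed exactly with rational arithmetic.

```python
import random
from fractions import Fraction


def play(size: int, rng: random.Random):
    """Alan plays randomly; Barbara answers -x in another vacant cell of the same row."""
    assert size % 2 == 0
    board = [[None] * size for _ in range(size)]
    empty = [(r, c) for r in range(size) for c in range(size)]
    while empty:
        r, c = empty.pop(rng.randrange(len(empty)))       # Alan's move
        x = Fraction(rng.randint(-9, 9))
        board[r][c] = x
        c2 = next(j for j in range(size) if board[r][j] is None)
        board[r][c2] = -x                                  # Barbara's reply
        empty.remove((r, c2))
    return board


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n):
                m[r][j] -= factor * m[col][j]
    return det


final = play(4, random.Random(0))
assert all(sum(row) == 0 for row in final)   # every row sums to zero
assert determinant(final) == 0
```

Barbara always finds a vacant cell in the row because each row has an even number of vacancies whenever it is Alan's turn.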
5. I shuffle a regular deck of cards (26 red, 26 black), and begin to turn them face-up,
one after another. At some point during this process, you say “STOP!”. You can
say stop as early as before I’ve even turned over the first card, or as late as when
there is only one card left to be turned over; the only rule is that at some point you
must say it. Once you’ve said stop, I turn over the next card. If it is red, you win
the game, and if it is black, you lose.
If you play the strategy “say stop before even a single card has been turned over”,
you have a 50% chance of winning the game. Is there a more clever strategy that
gives you a better than 50% chance of winning the game?
Solution: Here’s the quick-and-dirty solution: The game is fairly easily seen to be
equivalent to the following: exactly as before, except now when you say “STOP”, I
turn over the bottom card in the pile of cards that remains. In this formulation, it is
clear that there cannot be a strategy that gives you better than a 50% chance of
winning.
Here’s a more prosaic solution. Suppose that, instead of being played with a balanced
deck, the game is played with a deck that has a red cards and b black cards. We claim
that there is no strategy better than the naive one of saying stop
before a single card has been turned over; note that with this strategy you win
with probability a/(a + b). We prove this by induction on a + b. If a + b = 1, then
(whether a = 1 or a = 0) the result is trivial. Suppose a + b ≥ 2. To get a strategy
that potentially improves on the proposed best strategy, you must at least wait for
the first card to be turned over. Two things can happen:
• The first card turned over is red; this happens with probability a/(a + b). Once
this happens, you are playing a new version of the game, with a − 1 red cards
and b black cards, and by induction your best winning strategy has you winning
with probability (a − 1)/(a − 1 + b).
• The first card turned over is black; this happens with probability b/(a + b).
Once this happens, you are playing a new version of the game, with a red cards
and b − 1 black cards, and by induction your best winning strategy has you
winning with probability a/(a + b − 1).
Your probability of winning the original game is therefore at most
\[ \frac{a}{a+b} \times \frac{a-1}{a+b-1} + \frac{b}{a+b} \times \frac{a}{a+b-1} = \frac{a}{a+b}. \]
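The induction can also be run as a computation. This Python sketch computes, by backward induction with exact fractions, the best achievable winning probability with a red and b black cards remaining, and confirms that it never exceeds a/(a + b). The function name is illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def best_win_probability(red: int, black: int) -> Fraction:
    """Optimal chance that the card turned after 'STOP' is red (red + black >= 1)."""
    total = red + black
    stop_now = Fraction(red, total)
    if total == 1:
        return stop_now            # only one card left: you must say stop
    wait = Fraction(0)
    if red:
        wait += Fraction(red, total) * best_win_probability(red - 1, black)
    if black:
        wait += Fraction(black, total) * best_win_probability(red, black - 1)
    return max(stop_now, wait)


# No strategy beats the naive one, for every deck composition up to 26 + 26.
for a in range(27):
    for b in range(27):
        if a + b >= 1:
            assert best_win_probability(a, b) == Fraction(a, a + b)
```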
6. Alice and Bob play the following game. They start with a pile of 9 matches. They
take turns, Alice playing first. Each player may remove between 1 and 3 matches.
The player who picks up the last match wins. Who has a winning strategy? And
what is it? And what if, instead of 9 matches, we start with a pile of n matches?
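Solution: To be added. It seems that Player 2 can force a win if the number of
matches is a multiple of 4 (by answering a removal of k matches with a removal of
4 − k), and Player 1 can force a win otherwise (by first reducing the pile to a
multiple of 4). Example of strategy stealing.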
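The conjectured pattern can be confirmed for small piles by backward induction. In this Python sketch (names are illustrative), a position is winning for the player about to move if some legal removal leaves the opponent in a losing position.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(matches: int) -> bool:
    """True if the player about to move can force taking the last match."""
    # With 0 matches left the previous player took the last one, so the mover has lost.
    return any(
        not mover_wins(matches - take)
        for take in (1, 2, 3)
        if take <= matches
    )


def winning_move(matches: int):
    """A removal that leaves a multiple of 4, or None if no winning move exists."""
    take = matches % 4
    return take if take != 0 else None


for n in range(1, 41):
    assert mover_wins(n) == (n % 4 != 0)

print(mover_wins(9), winning_move(9))
```

With 9 matches Alice wins by removing 1, leaving 8; afterwards she answers each removal of k by removing 4 − k.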
7. Two players, A and B, take turns naming positive integers, with A playing first. No
player may name an integer that can be expressed as a linear combination, with
positive integer coefficients, of previously named integers. The player who names “1”
loses. Show that no matter how A and B play, the game will always end.
a linear combination of the $x_i$’s over positive integers. Each x in this set is an
integer multiple of $g_k$ ($g_k$ divides the right-hand side of $x = \sum_i a_i x_i$, so it divides the
left-hand side). We claim that there is some m such that all multiples of $g_k$ greater
than $m g_k$ are in this set.
If we can prove this claim, we are done. The sequence $(g_1, g_2, g_3, \dots)$ is non-increasing.
It stays constant in going from $g_i$ to $g_{i+1}$ exactly when $x_{i+1}$ is a multiple of $g_i$, and
drops exactly when $x_{i+1}$ is not a multiple of $g_i$. By our claim, once the sequence has
reached a certain g, it can only stay there for a finite length of time. So eventually
that sequence becomes constantly 1. But once the sequence reaches 1, there are only
finitely many numbers that can be legitimately played, and so eventually 1 must be
played.
Here’s what we’ll prove, which is equivalent to the claim: if $x_1, \dots, x_k$ are relatively
prime positive integers (greatest common divisor equals 1) then there exists an m such
that all numbers greater than m can be expressed as a positive linear combination
of the $x_i$’s. We prove this by induction on k. When k = 1, $x_k = 1$ and the result
is trivial. For k > 1, consider $x_1, \dots, x_{k-1}$. These may not be relatively prime;
say their greatest common divisor is d. By induction, there’s an $m'$ such that all
positive integer multiples of d greater than $m'd$ can be expressed as a positive linear
combination of $x_1, \dots, x_{k-1}$. Now d and $x_k$ must be relatively prime (otherwise
the $x_i$’s would not be relatively prime), which means that there must be some positive
integer e (which we may assume is between 1 and $x_k - 1$) with ed ≡ 1 (modulo
$x_k$). If we add any multiple of $x_k$ to e to get $e'$, we still get $e'd \equiv 1$ (modulo $x_k$).
Pick a multiple large enough that $e' > m'$. By induction, $e'd$ can be expressed as a
positive integer combination of $x_1, \dots, x_{k-1}$. So too can $2e'd, 3e'd, \dots, x_k e'd$. These
$x_k$ numbers cover all the residue classes modulo $x_k$. Let m be one less than the largest
of these numbers. For $\ell > m$, we can express $\ell$ as a positive linear combination of
$x_1, \dots, x_k$ as follows: first, determine the residue class of $\ell$ modulo $x_k$, say it’s p.
Then add the appropriate positive integer multiple of $x_k$ to $pe'd$ (which can be
expressed as a positive integer combination of $x_1, \dots, x_{k-1}$).
Source: This is the game of Sylver coinage, invented by John H. Conway; see
http://en.wikipedia.org/wiki/Sylver_coinage. It is named after J. J. Sylvester, who
proved that if a and b are relatively prime positive integers, then the largest positive
integer that cannot be expressed as a linear combination of a and b with non-negative
integer coefficients is (a − 1)(b − 1) − 1.
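Sylvester's formula is easy to test by brute force. The Python sketch below assumes coefficients are allowed to be zero (non-negative integer combinations), which is the reading under which (a − 1)(b − 1) − 1 is the largest non-representable integer, and it assumes a, b ≥ 2. The function name is illustrative.

```python
from math import gcd


def largest_nonrepresentable(a: int, b: int) -> int:
    """Largest n >= 1 not of the form s*a + t*b with integers s, t >= 0."""
    assert a > 1 and b > 1 and gcd(a, b) == 1
    limit = a * b  # every integer from (a-1)(b-1) upward is representable
    representable = [False] * (limit + 1)
    representable[0] = True
    for n in range(1, limit + 1):
        representable[n] = (n >= a and representable[n - a]) or (
            n >= b and representable[n - b]
        )
    return max(n for n in range(limit + 1) if not representable[n])


for a, b in [(2, 3), (3, 5), (4, 9), (7, 11), (5, 12)]:
    assert largest_nonrepresentable(a, b) == (a - 1) * (b - 1) - 1
```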
Solution: To be added
15 Week 14 (November 10)
Here’s a collection of problems contributed by participants of this year’s running of Math
43900. Some fun things to think about over the winter break.
1. 100 people play a game, organized by a game-master. The game goes as follows:
The 100 participants line up in single file. The game-master puts either a red or a
blue hat on each participant’s head. Every participant can see the hats of the people
in front of them in the line — but not their own hat, nor those of anyone behind
them. The game-master starts at the back of the line, and asks the last person to
call out the colour of their hat. They must answer “red” or “blue”. If they answer
correctly, they stay in line; if they give the wrong answer, they are immediately and
silently removed from the room. (So while everyone hears the answer, no-one knows
whether an answer was right.) The game-master then proceeds up the line, repeating
the same procedure with each of the 100 participants. Before the game begins, the
participants are allowed to confer on a strategy to help them. What should they do,
if they want as many people as possible still in line at the end of the game? [38]
4. Satan has captured you and your friend. He puts a full Othello board (an 8 by 8
board of tiles, that are black on one side and white on the other, such that only white
or black is visible for each tile — see picture below for a typical instance) in front of
you. He says he will send your friend out of the room, rearrange the board however
he likes, then designate one “magic square”. You may then flip one tile from black to
white or white to black [39]. Your friend will then return and must be able to identify
the chosen “magic square”. You are allowed to discuss strategy as long as you like
beforehand, but after your friend leaves the room you cannot communicate. What
strategy do you come up with? Note: you and your friend both view the board in
the same orientation (you both know which side is the “top” side of the board). [40]
[38] There is an open-endedness to this question. You can seek to have a number of people certain to
still be in line at the end; or maybe to have a high expected number of people left in line; or maybe some
hybrid of these goals.
[39] “May” here really means “may”: you can flip a tile if you wish, or you can choose not to flip.
[40] The person who contributed this problem says: This is a puzzle that was posed to me several years
ago by a friend who also studied here. I was unable to solve it then, and have not found a solution since,
but I trust my friend that a solution does actually exist. I hope someone in the class has better luck with
it than I did!
5. You are given six boxes, $B_1, \dots, B_6$, and, for each $\ell = 0, \dots, 5$, two identical coins
each of denomination $2^\ell$ [41]. In how many ways can you distribute the coins among
the boxes, so that for each k = 1, . . . , 5, box $B_k$ contains coins of total value $2^k$?
What if 6 is replaced by arbitrary n ∈ N?
6. Three players enter a room, and a red or blue hat is placed on each person’s head.
The color of each hat is determined by a coin toss, with the outcome of one coin
toss having no effect on the others. Each person can see the other players’ hats but
not their own.
No communication of any sort is allowed, except for an initial strategy session
between the three players before the game begins. Once they have had a chance to
look at the other hats, the players must simultaneously guess the color of their own
hats, or pass. The group shares a hypothetical $3 million prize if at least one player
guesses correctly and no players guess incorrectly.
What strategy for the group maximizes its chances of winning the prize?
7. A group of lions live on an island covered in grass but with no other animals. The
lions are identical, perfectly rational and aware that all the others are rational. They
are also aware that all the other lions are aware that all the others are rational, and
so on.
Naturally, the lions are extremely hungry but they do not attempt to fight each other
because they are identical in physical strength and so would inevitably all end up
dead. As they are all perfectly rational, each lion prefers a hungry life to a certain
death. With no alternative, they can survive by eating an essentially unlimited
supply of grass, but they would all prefer to consume something meatier.
[41] So: two coins of denomination 1, two coins of denomination 2, two coins of denomination 4, et cetera.
One day, a lamb miraculously appears on the island [42]. If any lion consumes the
defenceless lamb, it will become too full to defend itself from the other lions, and
thus will be eaten.
Suppose there are n lions at the moment the lamb appears. What will happen? [43]
8. How many primes among the positive integers, written as usual in base 10, are such
that their digits are alternating 1’s and 0’s, beginning and ending with a 1?
\[ (1 - x + x^2)e^x \]
10. Let P(x) be a polynomial of degree n such that $P(x) = Q(x)P''(x)$, where Q(x) is a
quadratic polynomial and $P''(x)$ is the second derivative of P(x). Show that if P(x)
has at least two distinct roots then it must have n distinct roots.
11. Suppose we have a floor made of parallel strips of wood, each one unit wide. If we
drop a needle one unit long onto the floor, what is the probability that the needle
will cross one of the lines between two strips of wood on the floor?
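Before computing anything exactly, a Monte Carlo estimate gives a number to aim for. The sketch below assumes the usual model, which the problem does not spell out: the distance d from the needle's midpoint to the nearest line is uniform on [0, 1/2], the acute angle theta between the needle and the lines is uniform on [0, pi/2], and the two are independent. Under that model the needle crosses a line exactly when d <= (1/2) sin(theta). The function name and the default parameters are our own choices.

```python
import math
import random

def buffon_estimate(trials=200_000, seed=0):
    """Estimate the crossing probability for a unit needle on unit-width strips."""
    rng = random.Random(seed)  # local generator, so runs are reproducible
    crossings = 0
    for _ in range(trials):
        d = rng.uniform(0.0, 0.5)               # midpoint to nearest line
        theta = rng.uniform(0.0, math.pi / 2)   # acute angle with the lines
        if d <= 0.5 * math.sin(theta):
            crossings += 1
    return crossings / trials

print(buffon_estimate())
```

The same crossing condition, integrated over d and theta instead of sampled, gives the exact answer.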
12. Players A and B each have a well-shuffled standard deck of cards, with no jokers.
The players deal their cards one at a time from the top of their decks, checking at
each step whether the two cards just dealt match exactly (same rank and same suit).
Player A wins if, once the decks are fully dealt, no match has been found.
Player B wins if at least one match occurs.
What is the probability that player A wins?
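The exact probability can be computed, which is handy for checking a conjectured closed form. The sketch below assumes that only the relative order of the two decks matters, so the game is the same as asking whether a uniformly random permutation of 52 cards has no fixed point. It counts such permutations with the standard recurrence D(n) = (n − 1)(D(n − 1) + D(n − 2)); the function name is our own.

```python
from fractions import Fraction
from math import factorial

def derangements(n):
    """Number of permutations of n items with no fixed point."""
    a, b = 1, 0  # D(0) and D(1)
    if n == 0:
        return a
    for m in range(2, n + 1):
        a, b = b, (m - 1) * (a + b)  # D(m) = (m - 1) * (D(m - 1) + D(m - 2))
    return b

p_a_wins = Fraction(derangements(52), factorial(52))
print(float(p_a_wins))
```

Running the same computation for smaller deck sizes shows how quickly the probability settles down as the deck grows.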
13. Here’s a carnival game: you are brought into a circular room with a number q of
identical doors, k of which have a prize (a fun math problem) behind them, and the
rest of which have nothing behind them. You are allowed to choose a door, open it,
and see what is behind it. Then you are allowed again to choose a door, open it, and
see what is behind it, and in fact you are allowed to do this q times⁴⁴. After you
make each choice, the locations of the k prizes are randomly re-distributed among
the q doors.
⁴² Unfortunate creature, it would seem. But actually the lamb has a chance of surviving this hell!
⁴³ It will be helpful to assume:
• that lions cannot share;
• that the lions all move at the same speed;
• that any defenceless creature (lamb or full lion), being rational, bows to the
inevitable and does not try to run away as soon as it sees a hungry lion move
towards it;
• and that at all points, the distances between all pairs of creatures on the island are distinct.
⁴⁴ Remember, q is the total number of doors (both prize and non-prize).
As time goes by, the carnival owners add new doors frequently⁴⁵; but since they
aren’t very good at math and can’t come up with any new problems, they keep the
number of prize doors the same⁴⁶. As more and more doors are installed, what is
the probability that you lose the game (i.e., that in all of your q choices, you never
see a prize)?⁴⁷
⁴⁵ So q keeps growing.
⁴⁶ So k is fixed.
⁴⁷ More precisely, what nice value is this probability approaching?
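A short numerical experiment can help with guessing the limiting value in the carnival game. The sketch below assumes that, because the prizes are re-distributed after every choice, each of the q choices independently hits a prize with probability k/q, so the chance of never seeing a prize is (1 − k/q)^q. The function name and the sample value k = 2 are illustrative choices of ours.

```python
def lose_probability(q, k):
    """Chance of missing a prize on all q independent tries, each with hit chance k/q."""
    return (1 - k / q) ** q

# Watch the value settle as the number of doors grows with k held fixed.
for q in (10, 100, 1000, 10_000, 100_000):
    print(q, lose_probability(q, k=2))
```

Repeating the loop for a few other values of k makes the dependence on k easier to spot.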