Problem Solving Tactics
A Di Pasquale, N Do & D Mathews
Angelo Di Pasquale, [...] in Mathematics at The University of Melbourne studying algebraic
curves. He is currently Director of Training for the Australian Mathematical Olympiad
Committee (AMOC), and Australian Team Leader at the International Mathematical
Olympiad. He enjoys composing Olympiad problems for mathematics contests.
Norman Do represented Australia at the 1997 International Mathematical Olympiad. In 2010 he
obtained a PhD in Mathematics from The University of Melbourne. He is now a lecturer
at Monash University, where he researches problems that combine geometry, topology,
combinatorics, and mathematical physics. He is currently the Chair of the AMOC [...]
Australian Mathematics Trust Enrichment Series
ISBN 978-1-876420-75-8
Published by AMT Publishing
• anyone who wishes to qualify for an Olympiad training school in mathematics, either in
  Australia or overseas
• anyone who has attended an Olympiad training school in mathematics and who would
  like to be better prepared should they qualify again for an invitation
• interested students, teachers and parents, as it will give an idea of the sorts of
  mathematics considered there
• any mathematically able students, hobbyists or problem solvers, whether local or abroad,
  who would find this publication enriching.
The focus of the user of this book should not be on reading solutions but on trying
to solve problems.
That is why solutions are not provided to the problems at the beginning of each chapter. It
is also why we recommend that a problem be tried thoroughly with the showcased idea of the
section in mind, before the solution is studied.
We recognise that some problems are relatively easy exercises while others are of the difficulty
of the International Mathematical Olympiad—the pinnacle of problem-solving mathematics
for high school students the world over. So the reader definitely should not expect to be able
to solve all of the problems straight away.
Acknowledgments
Some problems are the inventions of staff members at AMOC training schools. However, many
of the problems have come from contests such as the Australian Mathematical Olympiad
(AMO), national mathematical Olympiads of some other countries, the Asian Pacific Mathematics
Olympiad (APMO), the International Mathematical Olympiad (IMO) and problems
shortlisted for the IMO. Since many of these problems have appeared in multiple contests,
in many cases it has been hard to identify their true origin and so we would simply like to
acknowledge all of the above sources.
Although this book has three listed authors, the ideas it contains are the product of many
AMOC staff interacting with each other and with students over many years.
We also express our appreciation to Ross Atkins, Andrew Elvey Price, Ivan Guo, Konrad
Pilch, Chaitanya Rao, Sally Tsang, Graham White and Sampson Wong, who assisted in
proofreading for mathematical content and accuracy, and provided other feedback.
The document was typeset using LaTeX. The drawings were done with the help of GeoGebra
and TikZ.
1 Methods of proof
    1.0 Problems
    1.1 Logic and deduction
    1.2 Converse
    1.3 If and only if
    1.4 Contrapositive
    1.5 Proof by contradiction
    1.6 Proof by induction
    1.7 Strong induction
    1.8 Proof by exhaustion (case bashing)
    1.9 Pigeonhole principle
    1.10 Advanced pigeonhole principle
    1.11 Extremal principle
    1.12 Telescoping
2 Number theory
    2.0 Problems
    2.1 Fundamental theorem of arithmetic
    2.2 Pigeonhole principle
    2.3 Dealing with digits
    2.4 Floor function
    2.5 Square roots and conjugates
    2.6 Powers of two
    2.7 Euclid’s algorithm
    2.8 Integers base-n
    2.9 Construction problems
3 Diophantine equations
    3.0 Problems
    3.1 Factorisation
    3.2 Monotonicity
    3.3 Bounding arguments
    3.4 Polynomial modulus
    3.5 Quadratic discriminants
    3.6 Modular arithmetic
    3.7 Divisibility and gcds
    3.8 Reduction of variables
    3.9 Infinite descent
    3.10 Vieta jumping
    3.11 Cyclotomic recognition
4 Plane geometry
    4.0 Problems
    4.1 Angle chasing
    4.2 Cyclic quadrilaterals
    4.3 One step at a time
    4.4 Triangle centres
    4.5 Constructions
    4.6 Exploit symmetry
    4.7 Extend to the circumcircle
    4.8 Reverse reconstruction
    4.9 Trigonometry
    4.10 Areas
    4.11 Relate to known diagrams
    4.12 Create beautiful pictures
9 Polynomials
    9.0 Problems
    9.1 Identity theorem
    9.2 Division algorithm
    9.3 Fundamental theorem of algebra
    9.4 Vieta’s formulas
    9.5 Integer polynomials
    9.6 Complex numbers
    9.7 Algebraic trickery
    9.8 Irreducibility
    9.9 Factorisation
    9.10 Polynomials modulo p (upstairs–downstairs)
    9.11 Polynomials modulo P(x)
    9.12 Lagrange interpolation
    9.13 Root focus
11 Inequalities
    11.0 Problems
    11.1 Squares are non-negative
    11.2 AM–GM inequality
    11.3 Rearrangement inequality
    11.4 Cauchy–Schwarz inequality
13 Combinatorics
    13.0 Problems
    13.1 Addition and multiplication
    13.2 Subtraction and division
    13.3 Binomial identities
    13.4 Bijections
    13.5 The supermarket principle
    13.6 Pigeonhole principle
    13.7 Principle of inclusion–exclusion
    13.8 Double counting
    13.9 Injections
    13.10 Recursion
    13.11 Double counting via tables
    13.12 Combinatorial reciprocal principle
17 Appendices
    17.1 How do complex numbers work?
    17.2 Function notation
    17.3 Directed angles
    17.4 Some useful triangle formulas
Index
1 Methods of proof
Proofs are the essence of mathematics. They are nothing more than logical arguments which
present the solution to a mathematical problem beyond all possible doubt. There is no set
format for a proof, no particular way in which it must be presented on a page. Some people
may try to tell you that proofs must appear in two columns, with statements on the left and
explanations on the right—but this is complete nonsense! As long as you have sufficiently
clear assumptions, a logical argument that is fully explained, and the correct conclusion, then
you have a proof.¹
So how do you know whether or not you have a rigorous proof to a mathematical problem?
Well, this is a difficult question to answer. Being able to write correct proofs without leaving
out important details is a skill which can only be learned through experience—that is, by
reading and writing them yourself. However, the following little test might help you on your
way. Imagine that you have an inquisitive younger sibling who is looking over your shoulder as
you write your proof and constantly interrogating you. ‘What does this mean?’ she might ask,
or ‘Why are you doing that?’ or ‘How does that make sense?’ If you claim that something
is obvious, she will ask why and if you are tempted to be vague, she just won’t understand.
Your goal, of course, is to make sure that she understands and to answer all of her questions
before she has even asked them. In order to accomplish this, you should provide a clear
explanation for every single fact you write down that is not blatantly obvious.
Apart from the logical aspect of your proof, there are a few other points to keep in mind.
Proofs should always be clear, concise, and most definitely legible. Keep in mind that a nice
proof to a problem is usually shorter than a messy one. This is often achieved by using good
notation and providing the right definitions. A common problem is for people to provide
extraneous material, which does not contribute to the main argument, or to write long essays,
which are not enlightening at all. One way to avoid these pitfalls is to break a problem
into smaller chunks, which can be individually proved and then reassembled to provide the
complete proof. In general, a good proof should take the reader on a pleasant mathematical
journey ending at the desired result.
1.0 Problems
1. Prove that if a + b is an irrational number, then at least one of a or b is irrational.
¹ There is a rigorous notion of what constitutes a mathematical proof, but for our purposes and for the
purposes of most modern mathematics, our informal explanation is sufficient.
2. Show that at any party, there are always at least two people with exactly the same
number of friends at the party.
3. The equal temperament tuning² of musical instruments is based on the fact that $2^{19/12}$ is
very close to 3.
Show that there can be no perfect tuning³ by proving that if $2^x = 3$, then x must be
irrational.
4. If m and n are positive integers, prove that $\sqrt[m]{n}$ is either a positive integer or irrational.
5. Prove that there are infinitely many prime numbers of the form 6n + 5, where n is a
positive integer.
6. For every positive integer n, prove that
$$\frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \cdots + \frac{1}{(n-1) \times n} = \frac{n-1}{n}.$$
[...]
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}.$$
[...]
$$1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4},$$
and go on to conclude that
$$1^3 + 2^3 + 3^3 + \cdots + n^3 = (1 + 2 + 3 + \cdots + n)^2.$$
11. Prove that every positive integer can be uniquely expressed as a sum of different numbers,
where each number is of the form $2^n$ for some non-negative integer n.
12. Suppose x is a real number such that $x + \frac{1}{x}$ is an integer.
Prove that $x^n + \frac{1}{x^n}$ is also an integer for any positive integer n.
13. Any finite collection of lines in the plane divides the plane up into regions.
Prove that it is possible to colour each of these regions either black or white in such a
way that no two regions which share a common edge have the same colour.
² This tuning has perfect octaves and almost perfect fifths. Octaves are tuned so that the ratio of the frequency
of the pitch of the higher note to that of the note an octave lower is 2 : 1. Perfect fifths are tuned so that
the ratio of the frequency of the pitch of the higher note to the note a fifth below is 3 : 2. Standard equal
temperament tuning says that rising by an almost perfect fifth 12 times should be the same as rising by a
perfect octave seven times. Thus $\left(\frac{3}{2}\right)^{12} \approx 2^7$.
³ By perfect tuning, we mean that all octaves and fifths are perfect.
14. Show that if there are five points in a square with side length 1 metre, then there exist
two of them which are less than 75 centimetres apart.
15. Four points are given inside a square with side length 8 metres.
(a) Prove that two of them are less than $\sqrt{65}$ metres apart.
(b) Can you prove, beyond a shadow of a doubt, that two of them are less than 8
metres apart?
16. (a) Prove that if x and y can each be written as the sum of the squares of two integers,
then so can xy.
(b) Prove that if x and y are both of the form $a^2 + 2b^2$ (a, b ∈ Z), then so is xy.
(c) Let k be a fixed integer. Prove that if x and y are both of the form $a^2 + kb^2$
(a, b ∈ Z), then so is xy.
17. Consider the non-empty subsets of {1, 2, . . . , n}. For each of these subsets, consider the
reciprocal of the product of its elements.
Determine the sum of all of these numbers.
18. A finite set of chords is drawn in a circle such that each of them passes through the
midpoint of another chord.
Prove that all of the chords must be diameters.
19. Show that if we take n + 1 numbers from the set {1, 2, 3, . . . , 2n}, there must exist two
which have no common factor greater than 1.
Does this remain true if we take n numbers?
20. A circular island is divided into states by a number of chords of the circle. Consider a
tour that starts and ends in the same state without passing through the intersection of
any two borders.
Prove that the tour must involve an even number of border crossings.
21. Suppose you are given a balance scale and a collection of weights whose masses are
$1, 3, 3^2, 3^3, \ldots$.
(a) Prove that using these masses you can determine the weight of any object whose
mass is a positive integer.
(b) Prove that apart from interchanging the contents of the left and right pans of the
scale, the configuration of masses on the pans that correctly determines the weight
is unique.
(c) Prove that the weights $1, 3, 3^2, 3^3, \ldots$ are the only integral weights that uniquely
determine the weight of every integral mass.
22. A group of people played in a tennis tournament where each person played exactly one
match against every other person.
Prove that it is always possible to put the players in a line so that the first player beat
the second, the second player beat the third, all the way down to the last player.
23. A polygon is divided into triangles by diagonals whose endpoints are the vertices of the
polygon in such a way that no two of the diagonals intersect inside the polygon.
Prove that it is possible to colour the vertices of the polygon with three colours so that
the three vertices of each triangle have different colours.
24. The Tower of Hanoi is a mathematical puzzle consisting of three rods and n discs of
distinct sizes, which can slide onto any of the three rods. The puzzle starts with the
discs neatly stacked in order of size on one rod, the smallest at the top, as shown in
the diagram below. The aim is to transfer the entire stack to another rod by moving a
disc from the top of one stack to the top of another stack in such a way that no disc is
placed on top of a smaller disc.
Prove that the task can be accomplished in $2^n - 1$ moves but not in fewer moves.
25. Thirty coins lie on a table, with 17 of them showing heads. Your task, should you
choose to accept it, is to separate the coins into two piles, not necessarily of the same
size, each of which has the same number of heads showing. Unfortunately, you happen
to be blindfolded and cannot feel the difference between the two sides of a coin.
How can you perform the task?
26. Each of the numbers $1, 2, 3, \ldots, n^2$ is written in one of the squares of an n × n chessboard.
Show that there exist two squares which share a vertex or an edge whose entries differ
by at least n + 1.
27. Each square of an 8 × 8 chessboard has a real number written in it in such a way that
each number is equal to the geometric mean⁴ of all the numbers a knight’s move away
from it.
Is it true that all of the numbers must be equal?
28. There are 1000 positive numbers written at different points on the circumference of
a circle. If the numbers x, y, z appear in a row in that order, then it is known that
$xz = y^2$.
Prove that all of the numbers are equal.
29. There are m horizontal lines and n vertical lines drawn in the plane. Each point of
intersection between a pair of lines is coloured in one of 100 colours.
Find values of m and n such that, no matter how the colouring is performed, there
always exists a rectangle whose vertices are the same colour.
30. Prove that from any set of 10 distinct two-digit numbers, it is possible to select two
disjoint subsets whose members have the same sum.
31. Does there exist a convex polyhedron such that no two of its faces have the same number
of edges?
1.1 Logic and deduction
If X, then Y.
For example, we could take X to be the statement ‘Rex is a dog’ and Y to be the statement
‘Rex is an animal’. Being such lazy creatures, we mathematicians have invented the following
shorthand for such statements, which is often read as ‘X implies Y’.
X ⇒ Y
Now if X ⇒ Y and Y ⇒ Z, then you can automatically deduce that X ⇒ Z. One of the
easiest ways to write a proof is to string together a chain of deductions in this manner, starting
with the assumptions of the problem and ending with the conclusion of the problem. This is
often called a direct proof, an example of which follows.
Problem Prove the quadratic formula, which states that if $ax^2 + bx + c = 0$ and $a \neq 0$, then
$$x = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{or} \quad x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$
Solution Since $a \neq 0$, we may divide both sides of the equation by a.
$$ax^2 + bx + c = 0 \;\Rightarrow\; x^2 + \frac{b}{a}x + \frac{c}{a} = 0.$$
Now we use an algebraic trick, known as completing the square, to write the left-hand side as
$$x^2 + \frac{b}{a}x + \frac{c}{a} = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}.$$
This is a good time to mention that you should never ever take equations like this for granted.
So grab a pen and some paper and check that it’s true for yourself! Once you’ve done that,
you should be convinced that the quadratic equation now takes the following form.
$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \;\Rightarrow\; x + \frac{b}{2a} = \frac{\pm\sqrt{b^2 - 4ac}}{2a} \;\Rightarrow\; x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Of course, we were careful to consider both the positive and the negative square roots, as one
should always do, and this completes the proof.
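If you would like a quick sanity check of the quadratic formula before moving on, the following Python sketch may help. It is only an illustration and certainly not a proof: the function name `quadratic_roots` is our own invention, and we assume the coefficients are ordinary floating-point numbers, so the answers are only approximate. We use `cmath.sqrt` so that a negative value of $b^2 - 4ac$ does not cause trouble.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0, assuming a is non-zero."""
    if a == 0:
        raise ValueError("the quadratic formula requires a to be non-zero")
    # cmath.sqrt returns a complex number, so negative discriminants are fine
    root = cmath.sqrt(b * b - 4 * a * c)
    # one root for the positive square root, one for the negative square root
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


def residual(a, b, c, x):
    """Substitute x back into the left-hand side a*x**2 + b*x + c."""
    return a * x * x + b * x + c


# x**2 - 3x + 2 = (x - 1)(x - 2), so the roots should be 2 and 1
x1, x2 = quadratic_roots(1, -3, 2)
print(x1, x2)
print(abs(residual(1, -3, 2, x1)), abs(residual(1, -3, 2, x2)))
```

Substituting each root back into the original equation, as `residual` does, is a good habit whenever you solve an equation by hand as well.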
1.2 Converse
For every statement of the form X ⇒ Y , there is something known as the converse, which is
the statement Y ⇒ X. You may be tempted to think that these two statements mean the
same thing, that is, if X ⇒ Y is true, then Y ⇒ X is true. But this is most definitely not
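the case. For example, using our two example statements from earlier, it is clear that
‘Rex is a dog’ ⇒ ‘Rex is an animal’
is true, whereas its converse
‘Rex is an animal’ ⇒ ‘Rex is a dog’
is false, since there is the possibility that Rex could be a lizard or some other animal. So if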
we know that a statement is true, there is no guarantee whatsoever that its converse is also
true. But sometimes it is, as in the following well-known example.
Problem Pythagoras’ theorem states that, if a right-angled triangle has side lengths a, b, c,
where c is the length of the hypotenuse, then $a^2 + b^2 = c^2$.
Assuming that Pythagoras’ theorem is true, prove the converse of Pythagoras’ theorem.
Solution Of course, the first thing we must do is write down what the converse actually is.
It states that, if a triangle has side lengths a, b, c which satisfy $a^2 + b^2 = c^2$, then the
triangle is right-angled and c is the length of its hypotenuse.
Let’s construct a right-angled triangle whose legs have lengths a and b, and let the hypotenuse
have length d. The reason for this is that we can now invoke Pythagoras’ theorem, which
we already know to be true. It tells us that $a^2 + b^2 = d^2$. Using this in conjunction with our
assumption that $a^2 + b^2 = c^2$, we deduce that $c^2 = d^2$, which implies that c = d because
lengths are positive.
Therefore, the triangle with side lengths a, b, c that we were given possesses exactly the same
side lengths as the right-angled triangle that we have constructed. This means that the two
triangles are, in fact, congruent. So the given triangle was indeed right-angled, as we intended
to prove. Furthermore, the equation $a^2 + b^2 = c^2$ implies that c is the longest side length in
the triangle, and hence is the length of the hypotenuse.
In the previous solution, we relied on Pythagoras’ theorem to prove its converse. This is a
rather general strategy, so keep the following point in mind. If you are given a true statement
and asked to prove its converse, then it is often advantageous, sometimes crucial, to use the
original statement itself.
1.3 If and only if
Pythagoras’ theorem and its converse Suppose that a triangle has side lengths a, b, c,
where c is the longest side. Then the triangle is right-angled if and only if $a^2 + b^2 = c^2$.
In general, the statements ‘If X, then Y’ and ‘If Y, then X’ can be combined to create the
single statement ‘X if and only if Y’. You can probably guess that the mathematical notation
for this is simply
X ⇔ Y.
Now if someone actually asks you to prove a statement of the form X ⇔ Y , then what do
you do? The simplest approach is to split the problem into two parts. First prove X ⇒ Y ,
then prove Y ⇒ X. The next problem not only demonstrates this point but also provides us
with a useful way to test whether or not a number is divisible by 7. You should think about
why this is so.
Problem If a and b are integers, prove that 10a + b is divisible by 7 if and only if a − 2b is
divisible by 7.
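Solution As with most ‘if and only if’ statements, the proof naturally divides into two parts.
First suppose that 10a + b is divisible by 7. Since a − 2b = 21a − 2(10a + b), and both 21a and
10a + b are divisible by 7, it follows that a − 2b is divisible by 7.
Conversely, suppose that a − 2b is divisible by 7. Since 10a + b = 10(a − 2b) + 21b, and both
a − 2b and 21b are divisible by 7, it follows that 10a + b is divisible by 7.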
Hopefully, you will have noticed that the two parts are very similar in nature. Although
this is reasonably common, there will be times when one direction is significantly easier to
prove than the other. And, as we mentioned in the previous section, once you’ve proved the
statement in one direction, you can often use it to your advantage to prove the statement in
the other direction.
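To see how the statement about 10a + b and a − 2b turns into a practical divisibility test, here is a small Python sketch. The function name `divisible_by_7` is our own, and the idea of taking b to be the last digit and a to be the remaining digits is one natural reading of the problem rather than something the problem dictates. Each pass through the loop replaces n = 10a + b by |a − 2b|, which is smaller than n and is divisible by 7 exactly when n is.

```python
def divisible_by_7(n):
    """Test divisibility by 7 using: 10a + b is divisible by 7 iff a - 2b is."""
    n = abs(n)
    while n >= 10:
        # write n = 10a + b, where b is the last digit of n
        a, b = divmod(n, 10)
        # a - 2b may be negative, but the sign does not affect divisibility
        n = abs(a - 2 * b)
    # the only single-digit multiples of 7 are 0 and 7
    return n in (0, 7)


# brute-force check of the 'if and only if' statement for a range of integers
for a in range(-50, 51):
    for b in range(-50, 51):
        assert ((10 * a + b) % 7 == 0) == ((a - 2 * b) % 7 == 0)

print(divisible_by_7(343))   # True, since 343 = 7 * 7 * 7
print(divisible_by_7(1000))  # False
```

The brute-force loop only confirms the statement for finitely many values of a and b, so it is no substitute for the proof.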
1.4 Contrapositive
The contrapositive is a way of turning a logical statement on its head to give an equivalent
logical statement. For example, instead of saying ‘If X, then Y’, we can say ‘If not Y, then not X’,
where ‘not X’ is the opposite of X and ‘not Y’ is the opposite of Y. By calling a statement
and its contrapositive equivalent, we mean that if the statement is true, then its contrapositive
is true, while if the statement is false, then its contrapositive is false. In other words, proving
either one will automatically prove the other. Note that if you take the contrapositive of
the contrapositive, then you actually end up with the statement you started with. As an
example, consider the following two statements, which are the contrapositives of each other.
Problem If a and b are real numbers such that ab is irrational, then at least one of a and b
must be irrational.
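If you are not yet convinced that a statement and its contrapositive always stand or fall together, while a statement and its converse need not, you can check every case by brute force. The Python sketch below assumes the usual truth-table reading of X ⇒ Y, namely that it is false only when X is true and Y is false. The helper name `implies` is our own.

```python
from itertools import product


def implies(p, q):
    """Truth value of 'p implies q': false only when p is true and q is false."""
    return (not p) or q


# all four combinations of truth values for X and Y
rows = list(product([False, True], repeat=2))

# X => Y compared with its contrapositive (not Y) => (not X)
contrapositive_agrees = all(
    implies(x, y) == implies(not y, not x) for x, y in rows
)

# X => Y compared with its converse Y => X
converse_agrees = all(implies(x, y) == implies(y, x) for x, y in rows)

print(contrapositive_agrees)  # True
print(converse_agrees)        # False
```

The converse fails to agree in the row where X is true and Y is false, and also in the row where X is false and Y is true.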
1.5 Proof by contradiction
Problem Prove that there are infinitely many primes.
Solution Let’s suppose that there are only finitely many primes, so that we can list them
all as $p_1, p_2, \ldots, p_n$. Euclid asks us to consider the number $N = p_1 \times p_2 \times \cdots \times p_n + 1$. The
main fact that we’ll use is that two consecutive integers cannot be divisible by the same prime.
Since N − 1 is divisible by $p_1$, it’s impossible for N to be divisible by $p_1$ as well. Similarly, it’s
impossible for N to be divisible by $p_2$ or $p_3$ or any of the primes in our supposedly complete
list. So what does this all mean? It means that if we look at the prime factors of N , they
must come from outside our list. This contradicts the fact that we started with a complete
list of the primes. Since we’ve shown that no finite list of primes can be complete, it follows
that there are infinitely many primes.
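Euclid's construction is easy to try out on a computer. The Python sketch below takes a short list of primes, which we pretend is complete, forms N, and then finds a prime factor of N by trial division. The helper `smallest_prime_factor` is our own and is only suitable for small numbers. Notice that N itself need not be prime. The argument only promises that its prime factors lie outside the list.

```python
from math import prod


def smallest_prime_factor(n):
    """Return the smallest prime factor of an integer n >= 2 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


primes = [2, 3, 5, 7, 11, 13]  # our supposedly complete list
N = prod(primes) + 1           # 2 * 3 * 5 * 7 * 11 * 13 + 1 = 30031

# N leaves remainder 1 when divided by each listed prime
assert all(N % p == 1 for p in primes)

q = smallest_prime_factor(N)
print(N, q, q in primes)  # 30031 59 False
```

Here 30031 = 59 × 509, so N is not prime, yet both of its prime factors are missing from the list, exactly as the proof predicts.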
If you’re seeing proof by contradiction for the first time, then it can be a little baffling. Make
sure you understand the previous argument completely before looking at the next example.
Eventually, you should not only be able to understand proofs by contradiction, but also come
up with your own.
Problem Prove that $\sqrt{2}$ is irrational.
Solution Let us assume on the contrary that $\sqrt{2}$ is actually a rational number and hope to
find a contradiction. If $\sqrt{2}$ were rational, then we could express it as $\frac{p}{q}$, where p and q are
integers which have no common factors greater than 1. Start with the equation $\frac{p}{q} = \sqrt{2}$ and
remove square roots and fractions to obtain
$$\frac{p}{q} = \sqrt{2} \;\Rightarrow\; \frac{p^2}{q^2} = 2 \;\Rightarrow\; p^2 = 2q^2.$$
5 Of course from a different point of view, the Sun is close to the Earth, relative to other astronomical objects!
Since the right-hand side is even, the left-hand side is even. So $p^2$, and hence p, is even.
Therefore, we can write p = 2m for some integer m and substitute this back into the previous
equation.
$$(2m)^2 = 2q^2 \;\Rightarrow\; 2m^2 = q^2$$
Now it’s time for the left-hand side to be even, which forces the right-hand side to be even.
So $q^2$, and hence q, is even. But hold on a second! We’ve assumed that p and q have no
common factors greater than 1 and also proved that both p and q are even! This contradiction
means that our original assumption, that $\sqrt{2}$ is rational, was incorrect. Therefore, we must
conclude that $\sqrt{2}$ is irrational, which was what we set out to do.
Already, we’ve learnt about the converse, the contrapositive and contradiction, which may
create some confusion. So take some time to learn the differences between these three concepts!
1.6 Proof by induction
make sure that if the domino labelled k falls, then it knocks over the domino labelled
k + 1.
[Figure: a row of dominoes labelled 1, 2, 3, ..., k, k + 1, k + 2]
Analogously, suppose that you want to prove that some statement is true for all positive
integers n. An incredibly silly way to do this would be to prove it for n = 1, then for n = 2,
then for n = 3, and so on, but you would never get to the end of it. Instead, all you need to
do is
• prove the base case, that is, prove the statement for n = 1
• prove the inductive step, that is, prove that whenever the statement is true for n = k (the
  inductive hypothesis), then it is also true for n = k + 1.
Using this idea to prove a statement for all positive integers n is called proof by induction.
This may all seem pretty confusing to you at the moment, but the following example might
help to clarify the situation.
Problem Prove that, for every positive integer n,
$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}.$$
Solution For the base case, it is a simple matter to verify that the statement is true for
n = 1. Indeed, the left-hand side is simply 1 while the right-hand side is $\frac{1(1+1)}{2}$. These two
being equal takes care of the base case.
Next, for the inductive step, we can make use of the inductive hypothesis
$$1 + 2 + 3 + \cdots + k = \frac{k(k+1)}{2}$$
in order to prove that the statement is true for n = k + 1. But if we know the value of
1 + 2 + 3 + · · · + k, then surely it must be an easy matter for us to determine the value of
1 + 2 + 3 + · · · + k + (k + 1). In fact, we have
$$1 + 2 + 3 + \cdots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2}.$$
And this is exactly the statement that we are trying to prove, with n = k + 1. This takes
care of the inductive step.
But wait a moment! Do we actually know that the statement is true for n = k? Well, no
we don’t. But what we have shown is that if the statement is true for n = k, then it must
also be true for n = k + 1. And since we already know it’s true for n = 1, then it must also
be true for n = 2. And since it’s true for n = 2, then it must also be true for n = 3. And
since it’s true for n = 3, then it must also be true for n = 4, and so on. In this way we have
managed to prove by induction that, for every positive integer n,
$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}.$$
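The two ingredients of this induction, the base case and the inductive step, can both be spot-checked numerically. The Python sketch below does so for the first thousand values of k. The function name `formula` is our own, and a finite check like this is of course not a proof. It merely confirms that the algebra in the inductive step was carried out correctly.

```python
def formula(n):
    """The claimed closed form n(n + 1)/2 for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


# base case: the statement for n = 1
assert formula(1) == 1

# inductive step: adding k + 1 to the formula for k gives the formula for k + 1
for k in range(1, 1001):
    assert formula(k) + (k + 1) == formula(k + 1)

# direct comparison with the sum itself
assert all(sum(range(1, n + 1)) == formula(n) for n in range(1, 1001))

print(formula(100))  # 5050
```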
Problem An L-tromino is a shape formed by joining three unit squares along their edges to
form an L shape.
Prove that, if any square of a $2^{100} \times 2^{100}$ chessboard is removed, then the part of the board
which remains can be tiled by L-trominoes.
Solution Hopefully, one of the first things you realise when reading this problem is the fact
that the number 100 is not important at all! In fact, we will prove that the statement is true
for a $2^n \times 2^n$ chessboard, where n is any positive integer. Needless to say, we will proceed by
induction.
The base case n = 1 is, as usual, extremely simple though entirely necessary to take care of.
We simply note that removing a square from a 2 × 2 chessboard always leaves an L-tromino.
Next, for the inductive step, we assume the inductive hypothesis. In other words, we assume that it is
possible to tile a $2^k \times 2^k$ chessboard with any square removed. Our goal now is to prove that
it is possible to tile a $2^{k+1} \times 2^{k+1}$ chessboard with any square removed. How can we relate
these two facts? The idea is to divide the larger $2^{k+1} \times 2^{k+1}$ board into four blocks, each
one a copy of the smaller $2^k \times 2^k$ board. After rotating the board, we may assume that the
removed square lies in the top-left block. The inductive hypothesis guarantees that we can
tile the remaining part of the top-left block, but what to do with the remaining three blocks?
1.7 Strong induction 11
If we want to rely on the inductive hypothesis again, it would be nice to remove one square
from each of the remaining three blocks. We can do this by placing an L-tromino as shown
in blue in the diagram above. Once again, the inductive hypothesis guarantees that we can
tile the remaining parts of the remaining three blocks. This completes the inductive step.
Therefore we have proved by induction that, for every positive integer n, a 2n × 2n chessboard
with any square removed can be tiled by L-trominoes.
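The inductive step is really a recipe for building the tiling, and it can be sketched as a short recursive program. The following Python is only an illustration under our own naming (the function `tile` and its labelling scheme are not from the text); it recurses down to a single square rather than stopping at the 2 × 2 base case, which amounts to the same thing.

```python
def tile(n, hole):
    """Tile a 2^n x 2^n board with the square `hole` = (row, col) removed.

    Returns a grid in which the removed square holds 0 and the three squares
    of each L-tromino share a distinct positive label.
    """
    size = 2 ** n
    board = [[0] * size for _ in range(size)]
    count = 0

    def solve(top, left, size, hole_r, hole_c):
        nonlocal count
        if size == 1:
            return  # a single square that is already accounted for
        count += 1
        label = count
        half = size // 2
        mid_r, mid_c = top + half, left + half
        # Top-left corner of each block, and the square of that block
        # which touches the centre of the current board.
        blocks = [(top, left), (top, mid_c), (mid_r, left), (mid_r, mid_c)]
        centres = [(mid_r - 1, mid_c - 1), (mid_r - 1, mid_c),
                   (mid_r, mid_c - 1), (mid_r, mid_c)]
        for (br, bc), (cr, cc) in zip(blocks, centres):
            if br <= hole_r < br + half and bc <= hole_c < bc + half:
                # This block contains the missing square: recurse as is.
                solve(br, bc, half, hole_r, hole_c)
            else:
                # Cover the centre square with the central tromino and
                # treat it as the missing square of this block.
                board[cr][cc] = label
                solve(br, bc, half, cr, cc)

    solve(0, 0, size, hole[0], hole[1])
    return board


if __name__ == "__main__":
    for row in tile(3, (2, 5)):
        print(" ".join(f"{v:2d}" for v in row))
```

Running it for n = 3 prints an 8 × 8 grid in which each label other than 0 appears exactly three times, in the shape of an L.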
Proof by induction is such an extremely important concept that we will finish this section
with one further example.
$$2^n < n!.$$
Solution This time, the statement we are trying to prove is only true for integers n ≥ 4.
This causes no problem since we can simply start from the base case n = 4 instead.
This is easy enough to verify, because $2^4 = 16 < 24 = 4!$.
Next, for the inductive step, we assume the inductive hypothesis. In other words, we assume that $2^k < k!$
for some value of k ≥ 4. From this assumption, we would like to obtain the fact that the
statement is true for n = k + 1, that is, $2^{k+1} < (k + 1)!$. This is easily done, by starting with
the inductive hypothesis and multiplying both sides by 2, which gives
$$2^{k+1} = 2 \cdot 2^k < 2 \cdot k! < (k + 1) \cdot k! = (k + 1)!,$$
where the second inequality holds because 2 < k + 1.
And there we have it! Knowing that the statement is true for n = 4 tells us that it is true for
n = 5. And knowing that the statement is true for n = 5 tells us that it is true for n = 6, and
so on. So we have managed to prove by induction that $2^n < n!$, for every integer n ≥ 4.
prove it for k + 1. This type of induction is certainly legitimate and is often called strong
induction. This is a more advanced method of proof which can be extremely useful, such as
in the solution to the following problem.
Problem You might already know that the Fibonacci sequence is defined by $F_1 = 1$, $F_2 = 1$
and $F_{m+1} = F_m + F_{m-1}$ for m ≥ 2.
Prove that every positive integer can be expressed as a sum of terms from the Fibonacci
sequence, no two of which appear consecutively.
This result is often called Zeckendorf ’s theorem. It gives us a pretty interesting way to
represent positive integers, certainly more interesting than the customary base 10!
We can divide the problem into seven cases according to the remainder that n leaves after
division by 7.
Case 1: n = 7q + 1
In this case, the factor n − 1 = 7q is divisible by 7.
6 Your inquisitive younger sibling is asking why $k + 1 - F_m$ is smaller than $F_{m-1}$. Please provide a good
reason.
Case 2: n = 7q + 2
In this case, the factor $n^2 + n + 1 = 49q^2 + 35q + 7$ is divisible by 7.
Case 3: n = 7q + 3
In this case, the factor $n^2 - n + 1 = 49q^2 + 35q + 7$ is divisible by 7.
Case 4: n = 7q + 4
In this case, the factor $n^2 + n + 1 = 49q^2 + 63q + 21$ is divisible by 7.
Case 5: n = 7q + 5
In this case, the factor $n^2 - n + 1 = 49q^2 + 63q + 21$ is divisible by 7.
Case 6: n = 7q + 6
In this case, the factor n + 1 = (7q + 6) + 1 = 7q + 7 is divisible by 7.
Case 7: n = 7q
In this case, the factor n = 7q is divisible by 7.
Therefore, in all seven cases, at least one of the factors of $n^7 - n$ is divisible by 7. Since these
cases account for every possible situation, we can conclude that if n is an integer, then $n^7 - n$
is divisible by 7, as desired.7
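If you would like to double-check the case analysis by machine, the following Python sketch does so. It assumes the factorisation $n^7 - n = n(n - 1)(n + 1)(n^2 + n + 1)(n^2 - n + 1)$, which is the one whose factors appear in the seven cases; the function names are our own.

```python
def factors_of_n7_minus_n(n):
    """The five factors used in the seven cases."""
    return [n, n - 1, n + 1, n * n + n + 1, n * n - n + 1]


def case_analysis_holds(n):
    """Check that the factors multiply to n^7 - n and that one is divisible by 7."""
    factors = factors_of_n7_minus_n(n)
    product = 1
    for f in factors:
        product *= f
    return product == n ** 7 - n and any(f % 7 == 0 for f in factors)


if __name__ == "__main__":
    print(all(case_analysis_holds(n) for n in range(-100, 101)))
```

Of course, checking finitely many values proves nothing by itself. The point of the seven cases is that they cover every integer.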
Pigeonhole principle If you place n + 1 pigeons into n pigeonholes, then at least one
pigeonhole will contain at least 2 pigeons . . . as long as you don’t cut them up!
This is now correct and, despite being blatantly obvious, is surprisingly useful. As with many
of the techniques that we will learn, the power of the pigeonhole principle arises from the
ingenious ways in which it can be applied, and what better way to demonstrate this than
with an example.
The pigeonhole principle is particularly useful for problems where you have to show the
existence of something without being able to construct it explicitly.
For example, suppose that you were asked to prove that there are two people in Australia
with the same number of hairs on their head. Your first instinct might be to go and find two
completely bald people! But if that seems too difficult for you, then you can also solve the
problem with a little thought and an application of the pigeonhole principle. For there are
over twenty million people in Australia, and each one of them has fewer than one million hairs
on their head. In fact, the average head full of hair will have approximately one hundred
thousand hairs. So the population of Australia will form our pigeons and we will put them
into pigeonholes numbered from 0 up to 999999 according to the number of hairs on their
head. The pigeonhole principle then guarantees that there will be at least two who end up in
the same pigeonhole.
7 A much quicker way to deal with this question is provided by Fermat’s little theorem, found in section 2.12.
The next couple of problems illustrate how the pigeonhole principle can be used in number
theory and combinatorics.
Problem Prove that any set of n integers has a non-empty subset whose sum is divisible
by n.
Solution Let the integers be $a_1, a_2, \ldots, a_n$ and consider the n partial sums
$s_k = a_1 + a_2 + \cdots + a_k$ for k = 1, 2, . . . , n.
We treat these n numbers as our pigeons and place them into pigeonholes according to their
value modulo n. If the numbers $s_1, s_2, \ldots, s_n$ are all distinct modulo n, then there must be
one of them, say $s_k$, which is divisible by n. In other words, $a_1 + a_2 + \cdots + a_k$ is divisible
by n.
Otherwise, the pigeonhole principle guarantees that two of the numbers, say $s_i$ and $s_j$, are
the same modulo n. This means that $s_j - s_i$ is divisible by n, where we may assume without
loss of generality that i < j. In other words, $a_{i+1} + a_{i+2} + \cdots + a_j$ is divisible by n.
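The argument is constructive enough to be turned into a few lines of Python. The sketch below is ours, not the authors': it walks through the partial sums $a_1 + \cdots + a_k$ modulo n and, as a small simplification, also records the empty sum 0, which merges the two cases of the proof into one.

```python
def block_with_sum_divisible_by_n(a):
    """Return a non-empty run of consecutive entries of `a` whose sum is
    divisible by n = len(a)."""
    n = len(a)
    # Pigeonholes: remainder modulo n -> index k of the first partial sum
    # s_k landing there. The empty sum s_0 = 0 is included.
    first_seen = {0: 0}
    s = 0
    for k, x in enumerate(a, start=1):
        s = (s + x) % n
        if s in first_seen:
            i = first_seen[s]
            return a[i:k]  # a_{i+1} + ... + a_k is divisible by n
        first_seen[s] = k
    raise AssertionError("unreachable: n + 1 pigeons in n pigeonholes")


if __name__ == "__main__":
    block = block_with_sum_divisible_by_n([3, 8, 1, 9, 4])
    print(block, sum(block))
```

Notice that the subset found always consists of consecutive terms, which is slightly more than the problem asked for.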
Solution One rather foolish way to solve this problem would be to consider all of the possible
colourings of the grid, of which there are $2^{3 \times 7} = 2097152$. Of course, this is a rather inelegant
solution, not to mention the fact that it could take months! We’ll take a shortcut to the
solution by making use of the pigeonhole principle. One feature of the problem which suggests
that we should do so is the fact that we are asked to prove the existence of something, without
demonstrating precisely what that something is. But even though we know that we would
like to invoke the pigeonhole principle, some ingenuity is still required to know exactly how
to proceed.
Let us call four squares formed by the intersection of two rows and two columns a quartet
and refer to such a quartet as monochromatic if all four squares are the same colour. Our
first observation is that there are eight possibilities for each column, as shown in the diagram
below.
[Diagram: the eight possible colourings of a column, labelled 1 to 8.]
Our second observation is that if one of these occurs twice, then we must necessarily have a
monochromatic quartet. Our third observation is that if you use column 7 in conjunction
with columns 3, 5 or 6, then you must have a monochromatic quartet and similarly, if you
use column 8 in conjunction with columns 1, 2 or 4, then you must have a monochromatic
quartet. The problem now divides naturally into three cases.
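For the curious, the foolish approach is actually within reach of a computer if we organise it column by column. The Python sketch below is ours and rests on an assumption read off from the count $2^{3 \times 7}$: the grid has 3 rows and 7 columns, with each square coloured in one of two colours. A column is encoded as a 3-bit number, and two columns create a monochromatic quartet exactly when they agree on the same colour in at least two rows.

```python
def makes_quartet(c1, c2):
    """True if columns c1 and c2 (3-bit colourings) share a colour in two rows."""
    both_ones = c1 & c2
    both_zeros = ~(c1 | c2) & 0b111
    return bin(both_ones).count("1") >= 2 or bin(both_zeros).count("1") >= 2


def count_quartet_free(width):
    """Count colourings of a 3 x width grid with no monochromatic quartet."""
    def extend(columns):
        if len(columns) == width:
            return 1
        total = 0
        for c in range(8):
            # Only keep going if the new column clashes with no earlier one.
            if not any(makes_quartet(c, d) for d in columns):
                total += extend(columns + [c])
        return total

    return extend([])


if __name__ == "__main__":
    print(count_quartet_free(7))
```

The count for seven columns comes out as zero, in agreement with the claim that a monochromatic quartet is unavoidable, whereas six columns can still be coloured safely.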
Problem Show that if we take n + 1 numbers from the set {1, 2, 3, . . . , 2n}, there must exist
one which is divisible by another.
Solution Since this problem asks for the existence of two numbers satisfying certain con-
ditions, it seems like a prime candidate for the pigeonhole principle. We are given n + 1
numbers, which will most likely be our pigeons, so it seems sensible to look for n pigeonholes.
Furthermore, these should be constructed so that given any two numbers from the same
pigeonhole, one is divisible by the other. If we can construct n pigeonholes from the set
{1, 2, 3, . . . , 2n} which satisfy these conditions, then we will be done.
Let’s play around with some small numbers then. Clearly, 1 and 2 can be in the same
pigeonhole, although 3 cannot lodge with them. The next number which can join them is 4,
and after that 8. In fact, we can put all numbers of the form $2^a$ into one pigeonhole. If we
start with 3 in a pigeonhole, then the next number which can join it is 6, then 12, then 24,
and so on. So the numbers which are of the form $3 \times 2^a$ can all be put into one pigeonhole.
Similarly, the numbers which are of the form $5 \times 2^a$ can all be put into one pigeonhole.
Continuing in this way, each positive integer is in exactly one pigeonhole.8
$$\begin{aligned}
X_1 &= \{1, 2, 4, 8, 16, \ldots\} \\
X_2 &= \{3, 6, 12, 24, 48, \ldots\} \\
X_3 &= \{5, 10, 20, 40, 80, \ldots\} \\
&\;\;\vdots \\
X_k &= \{2k - 1, 2(2k - 1), 4(2k - 1), 8(2k - 1), 16(2k - 1), \ldots\} \\
&\;\;\vdots
\end{aligned}$$
Now that we’ve constructed the pigeonholes, the rest is easy. Given n + 1 numbers from the
set {1, 2, 3, . . . , 2n}, each of them belongs to one of the n pigeonholes $X_1, X_2, \ldots, X_n$. But by the
pigeonhole principle, two of them must end up in the same pigeonhole. These two numbers
are of the form $m \times 2^a$ and $m \times 2^b$, for some positive integer m. Thus the larger one is
divisible by the smaller one.
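Here is how the pigeonholes might look in Python. This is a sketch with names of our own choosing: each number is filed under its largest odd divisor, and the first time two numbers share a pigeonhole we have found our pair. The last few lines check every possible selection for n = 6.

```python
from itertools import combinations


def odd_part(x):
    """Largest odd divisor of x, which labels the pigeonhole of x."""
    while x % 2 == 0:
        x //= 2
    return x


def divisible_pair(numbers):
    """Return (a, b) from `numbers` with a dividing b, or None if the
    pigeonholes never collide. The numbers are assumed to be distinct."""
    holes = {}
    for x in sorted(numbers):
        m = odd_part(x)
        if m in holes:
            return holes[m], x  # both are m times a power of two
        holes[m] = x
    return None


# Exhaustive check for n = 6: every choice of 7 numbers from {1, ..., 12}.
n = 6
always_found = all(
    divisible_pair(choice) is not None
    for choice in combinations(range(1, 2 * n + 1), n + 1)
)

if __name__ == "__main__":
    print(divisible_pair([3, 5, 6, 7, 8]), always_found)
```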
8 Your younger sibling wants to know why this is the case.
In the previous section, we proved that there are at least two people in Australia with the
same number of hairs on their heads. But it seems likely that, with so many Australians
around, we should be able to find a larger group of people, all with the same number of
hairs on their heads. This idea is the basis for the following more advanced version of the
pigeonhole principle.
Pigeonhole principle If you place kn + 1 pigeons into n pigeonholes, then at least one
pigeonhole will contain at least k + 1 pigeons . . . as long as you don’t cut them up!
Solution We have 17 objects and need to prove the existence of three of them satisfying
certain conditions. Given this information alone, it seems that the problem is begging for the
pigeonhole principle to be used on it! Not only that, the numbers 17 and 3 imply that we
should take the given points to be our pigeons and look for eight pigeonholes. Fortunately,
there is a particularly obvious way to divide a cube of side length 2 into eight pieces, that is,
by slicing it into eight unit cubes.
From here, the solution to the problem is evident, since the pigeonhole principle tells us
that three of our points must lie in the same unit cube. These three points would form a
triangle with the largest area if they were at non-adjacent vertices of the unit cube.9 Note
that three such points form an equilateral triangle with side length $\sqrt{2}$, which has area $\frac{\sqrt{3}}{2}$, as
required.
There is yet another version of the pigeonhole principle which you might as well know about,
even if we won’t require it to solve any of the problems in this particular chapter.
Infinite pigeonhole principle If you place infinitely many pigeons into finitely many
pigeonholes, then at least one pigeonhole will contain infinitely many pigeons.
Problem Consider an m × n rectangular grid, where each of the mn unit squares is labelled
with a real number. Suppose that the number in each square is the average of the numbers in
all neighbouring squares.
Prove that all of the labels must be the same.
Solution We start with the simple fact that every finite set of real numbers has a largest
element, a fact which is not true for infinite sets. So, we can consider the largest number
appearing in the grid and call it M . Consider the numbers in the adjacent squares. They are
all less than or equal to M and yet their average is M . The only way that this can happen is
if all of the numbers in the adjacent squares are also M . So what we’ve proved is that every
square adjacent to one containing the number M also contains the number M . It now follows
that every square in the grid is labelled with the number M .
9 Your inquisitive younger sibling is asking why this produces a triangle of maximal area. Can you provide an
explanation?
The next problem illustrates the fact that the extremal principle is particularly valuable when
used in conjunction with proof by contradiction.
Problem Consider 10 black points and 10 white points in the plane, no three of which lie
on a line.
Prove that it is possible to connect each black point with a white point by a line segment
such that no two line segments intersect.
Solution Let us call a set of 10 line segments which connect each black point with a white
point a matching. If we don’t care whether or not the line segments intersect, there are only
finitely many (actually 10!) matchings possible. Let us call the sum of the lengths of the 10
line segments the length of the matching. The idea now is that if you don’t want your lines
to cross, then you probably don’t want to connect points far away from each other. With
this in mind, we consider a matching which has the shortest length and prove that it avoids
intersecting line segments.
In order to obtain a contradiction, suppose that the line segments AB and CD intersect at
a point X, where A and C are black points while B and D are white points. Then we can
replace these with the line segments AD and BC. By the triangle inequality, which essentially
states that the shortest distance between two points is a straight line segment, we have
$$AD + BC < (AX + XD) + (CX + XB) = AB + CD.$$
So we have constructed a new matching which is shorter than the one which was supposed to
be the shortest. This contradiction implies that the shortest matching cannot contain two
line segments which intersect, so we are done.
There are two important points we should make about the previous proof. First, it was
necessary to mention that there are only finitely many matchings. For if there were infinitely
many, then it might not have been possible to choose one with the shortest length. Second,
we were careful to choose a shortest matching rather than the shortest matching, since it is
certainly possible that there could be many of them. Our proof works no matter which of the
shortest matchings we decide to consider.
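To watch the extremal principle at work, here is a small experiment in Python. It is only a sketch under our own assumptions: we use 6 black and 6 white points rather than 10 so that all 6! matchings can be tried by brute force, and we pick the points at random (so that, in practice, no three lie on a line). The function names are ours.

```python
import itertools
import math
import random


def orientation(p, q, r):
    """Signed area test: positive if p, q, r turn anticlockwise."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segment ab and segment cd intersect at an interior point."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def shortest_matching(black, white):
    """Try every matching and return one of minimal total length."""
    n = len(black)
    best = min(
        itertools.permutations(range(n)),
        key=lambda perm: sum(math.dist(black[i], white[perm[i]]) for i in range(n)),
    )
    return [(black[i], white[best[i]]) for i in range(n)]


def has_crossing(matching):
    return any(
        segments_cross(a, b, c, d)
        for (a, b), (c, d) in itertools.combinations(matching, 2)
    )


random.seed(2024)
black = [(random.random(), random.random()) for _ in range(6)]
white = [(random.random(), random.random()) for _ in range(6)]
matching = shortest_matching(black, white)

if __name__ == "__main__":
    print(has_crossing(matching))
```

The proof above tells us that the program can only ever report that there is no crossing.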
1.12 Telescoping
This is a useful method of evaluating sums. The idea is simply to rewrite things so that
cancellation systematically occurs. Specifically, if we wish to find a closed form for the sum
$a_1 + a_2 + \cdots + a_n$, then we seek a function $F$ which satisfies
$$a_i = F(i + 1) - F(i) \quad \text{for } i = 1, 2, \ldots, n.$$
The sum telescopes because all the middle terms cancel out as follows.
$$a_1 + a_2 + \cdots + a_n = [F(2) - F(1)] + [F(3) - F(2)] + \cdots + [F(n + 1) - F(n)] = F(n + 1) - F(1)$$
$$2i - 1 = i^2 - (i - 1)^2.$$
This corresponds to $F(i) = (i - 1)^2$. It follows that the given sum telescopes to
$$F(n + 1) - F(1) = n^2 - 0^2 = n^2.$$
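A few lines of Python make the cancellation concrete. This sketch (with a function name of our own) uses the choice $F(i) = (i - 1)^2$ from above, for which the terms $F(i + 1) - F(i)$ are the odd numbers 2i − 1.

```python
def F(i):
    return (i - 1) ** 2


def telescoped_sum(n):
    """Add up the terms F(i + 1) - F(i) for i = 1, ..., n one at a time."""
    terms = [F(i + 1) - F(i) for i in range(1, n + 1)]
    return terms, sum(terms)


if __name__ == "__main__":
    terms, total = telescoped_sum(6)
    print(terms, total, F(7) - F(1))
```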
2.0 Problems
1. For any right-angled triangle whose side lengths are positive integers, we define the
tri-product of the triangle to be the product of its three side lengths.
Find with proof the greatest common divisor of all tri-products of all such right-angled
triangles.
m! + 5 = n3 .
is divisible by 2000.
16. Show that there are infinitely many positive integers m for which
is divisible by 2006.
17. Let n be a given positive integer. Prove that the sequence
$$a, \; a^a, \; a^{a^a}, \; a^{a^{a^a}}, \; \ldots$$
18. Let V = {1, 2, . . . , 24, 25}. Alexander is looking for a subset W of V with the property
that no two different members of W have a product which is a perfect square.
(a) Find the maximal possible size for W .
(b) Find the number of such sets W with this maximal size.
19. Prove that for each positive integer n, there exists a unique number divisible by $2^n$
whose decimal representation consists of n digits, each of them equal to 1 or 2.
20. Prove the following useful lemma. Suppose that positive integers a, b, x, y satisfy
$$a^x = b^y.$$
$$a = r^m \quad \text{and} \quad b = r^n.$$
0 1 2 3 4
1 0 3 2 5
2 3 0 1 6
3 2 1 0 7
4 5 6 7 0
26. Determine the value of the following sum, where p and q are relatively prime positive
integers.
$$\frac{p}{q} + \frac{2p}{q} + \cdots + \frac{(q - 1)p}{q}$$
27. For any positive integer n, let d(n) denote the number of positive divisors of n.
For which n does the sequence
n, d(n), d(d(n)), . . .
28. Let S = {2!, 3!, 4!, . . .}. Some members of S may be written as a product of smaller
members of S, such as 4! = 3! × 2! × 2!. Let’s call such a number a factorial-composite.
If a member of S cannot be written as a product of smaller members of S, such as 3!,
let’s call such a number a factorial-prime.
$$\gcd(a^m - 1, a^n - 1) = a^{\gcd(m,n)} - 1$$
31. Show that for every positive integer n, there are infinitely many terms of the Fibonacci
sequence which are divisible by n.
32. The geometric mean of m non-negative real numbers is the mth root of their product.
(a) Prove that, for every positive integer n, there exists a set of n distinct positive
integers such that the geometric mean of every subset is an integer.
(b) Does there exist an infinite set of distinct positive integers such that the geometric
mean of every finite subset is an integer?
33. (a) For an irrational number x, consider the sequence {x}, {2x}, {3x}, . . . , where as
usual $\{x\} = x - \lfloor x \rfloor$ denotes the fractional part of x.
For every positive integer n, show that there exists a term of the sequence which
lies between 0 and $\frac{1}{n}$.
(b) Hence, for any real numbers a and b satisfying 0 ≤ a < b ≤ 1, prove that there
exists a term of this sequence which lies between a and b.
(c) Use this fact to show that there exists a power of two whose decimal representation
starts with the digits 123456789.
34. Find all powers of two with the property that, after deleting the first digit of its decimal
representation, one again obtains a power of two.
2 Thus the analogue of the fundamental theorem of arithmetic (factorisation into primes is unique up to the
order of factors) is not true in this setting.
36. Determine all positive integers which are relatively prime to all terms of the infinite
sequence
$$a_n = 2^n + 3^n + 6^n - 1, \quad n \ge 1.$$
37. (a) Prove the following useful lemma. Let p be an odd prime and a a positive integer
such that p | a − 1.
If $p^{\alpha} \parallel a - 1$, prove that
n
is a power of 2.
42. Prove there are infinitely many positive integers n which can be expressed as the sum
of the squares of k positive integers for every integer 1 ≤ k ≤ n − 14.
43. Show that for every positive integer n there exist n distinct positive integers such that
their sum is a perfect 2009th power, and their product is a perfect 2010th power.
44. Let b, n > 1 be integers. Suppose that for each k > 1 there exists an integer $a_k$ such
that
$$k \mid b - a_k^n.$$
X, 2X, 3X, . . . , nX
3 See the problem in section 2.14 for the definition of a Carmichael number.
Fundamental theorem of arithmetic Every positive integer has a unique prime factorisation.
Perhaps the most important word appearing in the above statement is the word unique. It
tells us that a prime factorisation acts like a fingerprint. In other words, it provides a way
to identify positive integers. And it’s quite a useful one, because if you’re given numbers in
prime factorised form, then it’s a simple matter to test for divisibility, count the number of
divisors, calculate greatest common divisors, determine lowest common multiples, and so on.
Problem Call a set of positive integers funky if every pair of elements has greatest common
divisor not equal to 1, while every triple of elements has greatest common divisor equal to 1.
Show that there exists a funky set with n elements for each positive integer n.
Solution The easiest way to show the existence of a funky set with n elements is simply to
construct a funky set with n elements. This is exactly what we are going to do, and our final
set of numbers will be called $\{a_1, a_2, \ldots, a_n\}$. Since the problem involves greatest common
divisors, it makes sense to consider prime factorisations. In fact, let’s create a table with n
rows, one for each of the numbers $a_1, a_2, \ldots, a_n$, and infinitely many columns, one for each of
the primes $p_1, p_2, p_3, \ldots$. We could then place a tick in the row corresponding to a and the
column corresponding to p if and only if p appears in the prime factorisation of a. Your job
now, if you choose to accept it, is to place ticks in the table in order to guarantee that we
end up with a funky set.
For each pair of elements to have greatest common divisor larger than 1 simply means that
for 1 ≤ i < j ≤ n, there exists a column which contains ticks in the rows corresponding to $a_i$
and $a_j$. For each triple of elements to have greatest common divisor equal to 1 simply means
that no column can contain three or more ticks. However, this is easy to do! For each pair
1 ≤ i < j ≤ n, just find a column (any single one of the infinitely many columns, as long as it
has not been used already) in which to
put exactly two ticks in the rows corresponding to $a_i$ and $a_j$. For example, the construction
for n = 5 looks something like the following.
      p1   p2   p3   p4   p5   p6   p7   p8   p9   p10  ···
a1    ✓    ✓    ✓    ✓
a2    ✓                   ✓    ✓    ✓
a3         ✓              ✓              ✓    ✓
a4              ✓              ✓         ✓         ✓
a5                   ✓              ✓         ✓    ✓
It’s worth noting that we could have recorded in the table the highest power of p which
divides a, although this particular problem didn’t require us to do so. Another lesson we
can learn from this problem is that thinking in terms of tables can often help to simplify a
problem immensely.
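The table of ticks translates directly into a short program. The following Python sketch is our own illustration: it gives each pair of rows its own prime, taking the primes in increasing order and the pairs in dictionary order, which is just one of many valid ways to fill in the table. Note that the numbers produced are only distinct once n ≥ 3.

```python
from itertools import combinations
from math import gcd


def first_primes(count):
    """The first `count` primes, found by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def funky_set(n):
    """One prime (one column of the table) for each pair of rows."""
    pairs = list(combinations(range(n), 2))
    elements = [1] * n
    for p, (i, j) in zip(first_primes(len(pairs)), pairs):
        elements[i] *= p  # tick in row i, column p
        elements[j] *= p  # tick in row j, column p
    return elements


def is_funky(elements):
    pairs_ok = all(gcd(a, b) > 1 for a, b in combinations(elements, 2))
    triples_ok = all(gcd(gcd(a, b), c) == 1
                     for a, b, c in combinations(elements, 3))
    return pairs_ok and triples_ok


if __name__ == "__main__":
    print(funky_set(5), is_funky(funky_set(5)))
```

For n = 5 this produces 210, 4862, 14421, 35815 and 79373, each a product of four of the first ten primes.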
2.2 Pigeonhole principle
Problem Show that for every positive integer n not divisible by 2 or 5, there exists a
multiple of n all of whose digits are ones.
Solution Our pigeons will be the first n + 1 numbers from the sequence 1, 11, 111, 1111, . . . ,
and our pigeonholes will be the n possible remainders modulo n. So the pigeonhole principle
asserts that two of these numbers must be congruent to each other modulo n. Suppose that
they consist of i ones and j ones, where we can assume without loss of generality that i > j.
Then
$$\underbrace{111\ldots1}_{i} - \underbrace{111\ldots1}_{j} = \underbrace{111\ldots1}_{i-j}\,\underbrace{000\ldots0}_{j} \equiv 0 \pmod{n}.$$
However, since n is not divisible by 2 or 5, we are allowed to divide both sides of this
congruence equation by $10^j$. This leaves us with
$$\underbrace{111\ldots1}_{i-j} \equiv 0 \pmod{n},$$
so there certainly does exist a multiple of n all of whose digits are ones.
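The proof is easily turned into a search. In the Python sketch below (our own, with a hypothetical function name), the pigeons are the remainders of 1, 11, 111, . . . modulo n, computed one digit at a time so that the numbers never get large. As soon as two remainders coincide, the difference of the two lengths gives a suitable number of ones.

```python
def ones_in_repunit_multiple(n):
    """Return k such that the number made of k ones is a multiple of n."""
    if n % 2 == 0 or n % 5 == 0:
        raise ValueError("n must not be divisible by 2 or 5")
    first_seen = {}  # remainder modulo n -> number of ones
    remainder = 0
    for length in range(1, n + 2):  # the first n + 1 pigeons
        remainder = (10 * remainder + 1) % n
        if remainder in first_seen:
            return length - first_seen[remainder]  # this is i - j
        first_seen[remainder] = length
    raise AssertionError("unreachable by the pigeonhole principle")


if __name__ == "__main__":
    k = ones_in_repunit_multiple(7)
    print(k, int("1" * k) % 7)
```

For example, with n = 7 the search returns 6, and indeed 111111 = 7 × 15873.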
$$\overline{a_n a_{n-1} \ldots a_1 a_0} \equiv a_0 + a_1 + a_2 + \cdots + a_n \pmod{9}$$
$$\overline{a_n a_{n-1} \ldots a_1 a_0} \equiv a_0 - a_1 + a_2 - \cdots + (-1)^n a_n \pmod{11}$$
Problem Find all integers n > 1 for which there exist distinct positive integers a and b such
that $n^a + 1$ can be obtained from $n^b + 1$ by reversing the order of its decimal digits, and vice
versa.
Solution Let’s assume without loss of generality that a < b. The first observation that we
can make about the two numbers $n^a + 1$ and $n^b + 1$ is that they must have the same number
of digits. It seems that we should be able to use this fact to show that n can’t be all that
large. Indeed, if we consider that the larger of these two numbers is less than 10 times the
smaller, we obtain the inequality
$$n^b + 1 < 10(n^a + 1) \quad \Rightarrow \quad n^a(n^{b-a} - 10) < 9.$$
By inspection, the latter inequality simply cannot hold if n ≥ 11. Furthermore, if n = 10,
then $n^a + 1$ takes the form 100 . . . 01, so reversing the order of its digits returns exactly the
same number.
Thus, we are left to consider the case 2 ≤ n ≤ 9. Since $n^a + 1$ and $n^b + 1$ have exactly the
same digits, they must leave the same remainders modulo 9. Therefore,
$$n^a + 1 \equiv n^b + 1 \pmod{9} \quad \Rightarrow \quad n^a(n^{b-a} - 1) \equiv 0 \pmod{9}.$$
In conclusion, the one and only integer which satisfies the conditions of the problem is 3.
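A computer search cannot replace the argument, but it can at least exhibit the example behind the answer. The Python sketch below is ours, and the search bounds (n up to 12 and exponents up to 20) are arbitrary choices rather than anything dictated by the problem.

```python
def reversal_pairs(max_n=12, max_exponent=20):
    """List (n, a, b) with a < b such that reversing the decimal digits of
    n^a + 1 gives n^b + 1."""
    found = []
    for n in range(2, max_n + 1):
        values = [n ** e + 1 for e in range(1, max_exponent + 1)]
        for a in range(1, max_exponent + 1):
            for b in range(a + 1, max_exponent + 1):
                if str(values[a - 1])[::-1] == str(values[b - 1]):
                    found.append((n, a, b))
    return found


if __name__ == "__main__":
    print(reversal_pairs())
```

Among the triples found is (3, 3, 4), which corresponds to $3^3 + 1 = 28$ and $3^4 + 1 = 82$.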
$$\frac{1}{\alpha} + \frac{1}{\beta} = 1.$$
Solution First, we note that α > 1 and β > 1, which implies that the two sequences must
be strictly increasing. Next, we’ll show that no integer can occur in both sequences. To do
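this, suppose that $\lfloor i\alpha \rfloor = \lfloor j\beta \rfloor = n$. Since α and β are irrational, it cannot be the case that
iα = n or jβ = n, so we have the strict inequalities
$$n < i\alpha < n + 1 \quad \text{and} \quad n < j\beta < n + 1.$$
Rearranging these inequalities yields
$$\frac{n}{\alpha} < i < \frac{n + 1}{\alpha} \quad \text{and} \quad \frac{n}{\beta} < j < \frac{n + 1}{\beta},$$
which add to give
$$n = \frac{n}{\alpha} + \frac{n}{\beta} < i + j < \frac{n + 1}{\alpha} + \frac{n + 1}{\beta} = n + 1.$$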
However, this is a contradiction because i + j is an integer, so it can’t lie between the
consecutive integers n and n + 1. Therefore, no integer can occur in both sequences.
Finally, we’ll show that every integer occurs in one of the sequences. To do this, suppose that
the integer n doesn’t occur in either sequence. Then there must exist integers i and j such
that
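$$i\alpha < n < n + 1 < (i + 1)\alpha \quad \text{and} \quad j\beta < n < n + 1 < (j + 1)\beta.$$
Rearranging these inequalities yields
$$i < \frac{n}{\alpha}, \quad i + 1 > \frac{n + 1}{\alpha}, \quad j < \frac{n}{\beta} \quad \text{and} \quad j + 1 > \frac{n + 1}{\beta}.$$
These add to give
$$i + j < \frac{n}{\alpha} + \frac{n}{\beta} = n \quad \text{and} \quad i + j + 2 > \frac{n + 1}{\alpha} + \frac{n + 1}{\beta} = n + 1.$$
However, this is a contradiction because it implies that the integer i + j must lie strictly between
the consecutive integers n − 1 and n. Therefore, every positive integer occurs in at least one of the sequences.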
Piecing all the information together, we find that the sequences together include every positive
integer exactly once.
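To see the result in action, we can pick a concrete pair of irrational numbers. The Python sketch below assumes $\alpha = \sqrt{2}$ and $\beta = 2 + \sqrt{2}$, a choice of ours which satisfies $\frac{1}{\alpha} + \frac{1}{\beta} = 1$. To avoid rounding errors, the floors are computed exactly with integer square roots, using $\lfloor i\sqrt{2} \rfloor = \lfloor \sqrt{2i^2} \rfloor$.

```python
from math import isqrt


def two_sequences(limit):
    """Terms up to `limit` of floor(i * sqrt(2)) and floor(j * (2 + sqrt(2)))."""
    first, second = [], []
    i = 1
    while isqrt(2 * i * i) <= limit:
        first.append(isqrt(2 * i * i))
        i += 1
    j = 1
    while 2 * j + isqrt(2 * j * j) <= limit:
        second.append(2 * j + isqrt(2 * j * j))
        j += 1
    return first, second


if __name__ == "__main__":
    first, second = two_sequences(30)
    print(first)
    print(second)
```

Between them, the two lists printed contain each of the integers from 1 to 30 exactly once.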
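$$\overline{r_1 + r_2} = \overline{r_1} + \overline{r_2} \qquad \overline{r_1 - r_2} = \overline{r_1} - \overline{r_2}$$
$$\overline{r_1 \times r_2} = \overline{r_1} \times \overline{r_2} \qquad \overline{r_1 \div r_2} = \overline{r_1} \div \overline{r_2}$$
$$\overline{r^n} = \overline{r}^{\,n},$$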
where on the left, the exponent appears inside the conjugate, while on the right, the exponent
appears outside the conjugate.
Problem Show that if we raise $\sqrt{2} - 1$ to the power of a positive integer, then the result is
of the form $\sqrt{m} - \sqrt{m - 1}$ for some positive integer m.
Solution If we expand $(\sqrt{2} - 1)^n$ by using the binomial theorem5, we obtain
$$(\sqrt{2} - 1)^n = a + b\sqrt{2}$$
for some integers a and b. Now consider the conjugate expression to obtain
$$(-\sqrt{2} - 1)^n = a - b\sqrt{2}.$$
However, since $\sqrt{2} - 1$ is between 0 and 1, it follows that $(\sqrt{2} - 1)^n$ must also be between 0
and 1. So it must be the case that $(\sqrt{2} - 1)^n = \sqrt{m} - \sqrt{m - 1}$.
The case for n odd can be handled in a similar way.
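The integers a and b can be computed without ever touching a square root, which gives a simple way to experiment. The Python sketch below is ours: it multiplies repeatedly by $-1 + \sqrt{2}$ using exact integer arithmetic, and then takes m to be the larger of $a^2$ and $2b^2$, on the assumption (which the code's output lets you test) that these two numbers always differ by exactly 1.

```python
def sqrt2_minus_1_power(n):
    """Return (a, b, m) with (sqrt(2) - 1)^n = a + b*sqrt(2) and
    m = max(a^2, 2b^2)."""
    a, b = 1, 0  # (sqrt(2) - 1)^0 = 1 + 0*sqrt(2)
    for _ in range(n):
        # (a + b*sqrt(2)) * (-1 + sqrt(2)) = (2b - a) + (a - b)*sqrt(2)
        a, b = 2 * b - a, a - b
    m = max(a * a, 2 * b * b)
    return a, b, m


if __name__ == "__main__":
    for n in range(1, 7):
        a, b, m = sqrt2_minus_1_power(n)
        print(n, a, b, m, (2 ** 0.5 - 1) ** n, m ** 0.5 - (m - 1) ** 0.5)
```

For instance, n = 2 gives $3 - 2\sqrt{2} = \sqrt{9} - \sqrt{8}$, so that m = 9.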
Problem For each positive integer n, determine the remainder when $3^{2^n} - 1$ is divided
by $2^{n+3}$.
5 See section 13.3 if you don’t know what this is.
Solution Given an expression like $3^{2^n} - 1$, you can’t help but want to use the difference of
perfect squares factorisation, otherwise known by the acronym DOPS. This gives
$$3^{2^n} - 1 = (3^{2^{n-1}} + 1)(3^{2^{n-1}} - 1)$$
and once again, DOPS can be happily applied to the second factor. This gives
$$3^{2^n} - 1 = (3^{2^{n-1}} + 1)(3^{2^{n-2}} + 1)(3^{2^{n-2}} - 1).$$
So if we continue with this veritable feast of DOPS, we will finally be left with
$$3^{2^n} - 1 = (3^{2^{n-1}} + 1)(3^{2^{n-2}} + 1) \cdots (3^2 + 1)(3^1 + 1)(3^1 - 1).$$
We’d like to know the remainder after this number is divided by $2^{n+3}$. But with so many
brackets, all being even, it seems that our number might actually be divisible by $2^{n+3}$. Let’s
see if this is the case. A keen observation will reveal that for integers m,
So all of the brackets, apart from the final two, are divisible by 2, but not 4. Since there are
n − 1 such brackets, and the final two brackets contribute a factor of 8, our number must be
divisible by $2^{n-1} \times 8 = 2^{n+2}$, but not by $2^{n+3}$. In other words, it is an odd multiple of $2^{n+2}$.
It follows that $3^{2^n} - 1$ leaves a remainder of
$2^{n+2}$ when divided by $2^{n+3}$.
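Since the numbers involved grow extremely quickly, a check by computer is best done with modular exponentiation. The following Python sketch (the function name is ours) uses the built-in three-argument `pow`, which reduces modulo $2^{n+3}$ at every step.

```python
def remainder(n):
    """Remainder when 3^(2^n) - 1 is divided by 2^(n + 3)."""
    modulus = 2 ** (n + 3)
    return (pow(3, 2 ** n, modulus) - 1) % modulus


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, remainder(n), 2 ** (n + 2))
```

For n = 1 and n = 2 this agrees with the hand calculations 8 = 0 × 16 + 8 and 80 = 2 × 32 + 16.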
Division algorithm For any two integers a and b ≠ 0, there is a unique way to write
a = qb + r, where 0 ≤ r < |b|.
Problem Find a number N which is divisible by 39 and such that N + 1 is divisible by 106.
Solution We’d like to solve the equations N = 39a and N + 1 = 106b for integers a and b.
However, these imply that 106b − 39a = 1, which is known as a linear Diophantine equation
and can be solved with the help of Euclid’s algorithm. In the left column below, we apply
Euclid’s algorithm, by repeatedly applying the division algorithm as shown.
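$$\begin{array}{rclcrcl}
106 &=& 2 \times 39 + 28 & \qquad & 1 &=& 1 \times 6 - 1 \times 5 \\
39 &=& 1 \times 28 + 11 & & &=& 2 \times 6 - 1 \times 11 \\
28 &=& 2 \times 11 + 6 & & &=& 2 \times 28 - 5 \times 11 \\
11 &=& 1 \times 6 + 5 & & &=& 7 \times 28 - 5 \times 39 \\
6 &=& 1 \times 5 + 1 & & &=& 7 \times 106 - 19 \times 39
\end{array}$$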
On the left, we have applied Euclid’s algorithm to the numbers 106 and 39, thereby showing
that gcd(106, 39) = 1. On the right, we have sneakily reversed Euclid’s algorithm. The last
line of Euclid’s algorithm allows us to write the number 1 as a multiple of 5 (namely −5),
plus a multiple of 6 (namely 6). We can express this idea by saying that 1 is a combination of
5 and 6. The second last line of Euclid’s algorithm allows us to exchange the number 5 for a
combination of 6 and 11. Therefore, we can write the number 1 as a combination of 6 and 11.
The third last line of Euclid’s algorithm allows us to exchange the number 6 for a combination
of 11 and 28. Therefore, we can write the number 1 as a combination of 11 and 28. We can
continue in this way until we have finally expressed the number 1 as a combination of 39 and
106, which is what we set out to do. Given that 7 × 106 − 19 × 39 = 1, it is clear that we
should take N = 19 × 39 = 741.
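Reversing Euclid's algorithm by hand is good exercise, but the bookkeeping is also easy to automate. The Python sketch below is a standard recursive version of the idea, written by us with a hypothetical function name: on the way back up from the last line of Euclid's algorithm, it exchanges each remainder for a combination of the two numbers above it, just as in the right column.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    if b == 0:
        return a, 1, 0
    # Division algorithm: a = (a // b) * b + (a % b)
    g, x, y = extended_gcd(b, a % b)
    # g = b*x + (a % b)*y = a*y + b*(x - (a // b)*y)
    return g, y, x - (a // b) * y


if __name__ == "__main__":
    g, x, y = extended_gcd(106, 39)
    print(g, x, y)          # 1 = 106*x + 39*y
    N = -y * 39
    print(N, N % 39, (N + 1) % 106)
```

For the numbers 106 and 39 it returns the same combination as above, namely 7 × 106 − 19 × 39 = 1, which leads to N = 741.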
Next, we’ll look at a problem which mixes Euclid’s algorithm with some Fibonacci fun! The
result is a rather amazing fact concerning the greatest common divisor of two Fibonacci
numbers.
Solution Let’s break the problem into bite-sized pieces which we’ll finally put together to
produce a complete proof.
To pass from the first line to the second line, we simply use the division algorithm,
while to pass from the second line to the third line, we use the fact that $F_m \mid F_{qm}$ and
$\gcd(F_{qm+1}, F_{qm}) = 1$.
If you’re not quite sure about how the proof works, then the following example may be
enlightening.
Problem Is it possible to choose 2000 distinct non-negative integers, all less than 100000,
no three of which are consecutive terms of an arithmetic progression?
Solution Since we want to keep our numbers small, let’s use the following greedy algorithm.6
Start with the number $a_0 = 0$ and let $a_{n+1}$ be the smallest integer greater than $a_n$ which doesn’t form
a three term arithmetic progression with any two of $a_0, a_1, a_2, \ldots, a_n$. So it’s clear that we can
take $a_1 = 1$, but then we have to skip 2 to avoid the arithmetic progression (0, 1, 2). However,
we can take $a_2 = 3$ and continue in this fashion to obtain an increasing sequence of non-negative
integers. Before you read on, try to write down the first 20 terms of the sequence and see if
you can spot a pattern.
If you can’t spot the pattern, then at least you should have noticed that there are large jumps
at certain parts of the sequence. The first jump is at $a_2 = 3$ and the next largest jump occurs
at $a_4 = 9$. There are two more large jumps at $a_8 = 27$ and $a_{16} = 81$. The numbers 3, 9, 27
and 81, are all powers of 3, which suggests that something interesting might be happening in
base-3. So let’s write out a table which displays the numbers of our sequence in base-3.
6 Informally, a greedy algorithm makes an optimal choice at each step. Such a tactic is not generally guaranteed
to produce an optimal set overall. However, in many questions, including this one, it is sufficient to solve the
problem.
n         0     1     2     3     4     5     6     7     8
a_n       0     1     3     4     9    10    12    13    27
base-3    0     1    10    11   100   101   110   111  1000

n         9    10    11    12    13    14    15    16    17
a_n      28    30    31    36    37    39    40    81    82
base-3 1001  1010  1011  1100  1101  1110  1111 10000 10001
Rather amazingly, the base-3 representation of $a_n$ only contains the digits 0 and 1, just like
numbers written in binary. And even more amazingly, it seems that the ternary representation
of $a_n$ is simply the binary representation of n. Of course, everything we’ve accomplished so
far has merely been pattern spotting, so let’s try to prove the following conjecture.
Let $a_n$ be the non-negative integer whose base-3 representation is the binary representation
of n. Then no three terms of the sequence $a_0, a_1, a_2, \ldots$ form an arithmetic
progression.
In order to prove this, we assume that the numbers x, y, z are distinct terms of the sequence
which form an arithmetic progression, that is, x + z = 2y. So x and z both have ternary7
representations containing only the digits 0 and 1, while 2y has a ternary representation
containing only the digits 0 and 2. It’s clear that if we add x and z in ternary, then there can
be no carries. So in order for their sum to contain only the digits 0 and 2, each occurrence of
the digit 1 in x must match up with a corresponding digit 1 in z and vice versa. In other
words, x and z must actually be the same number, contradicting the fact that x, y, z are
distinct. Therefore, we can be sure that no three terms of the sequence $a_0, a_1, a_2, \ldots$ form an
arithmetic progression.
Now it just suffices to show that the 2000 numbers $a_0, a_1, a_2, \ldots, a_{1999}$ of our sequence are all
less than 100000. Since the sequence is increasing, we need only consider the number $a_{1999}$.
However, 1999 in binary is 11111001111 and 11111001111 in ternary is only 88249, much less
than 100000.
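The conjecture and the final bound can be checked numerically. The following is a minimal Python sketch; the helper names `a` and `has_three_term_ap` are our own, and testing only the first 300 terms is an arbitrary choice that illustrates the proof rather than replacing it.

```python
from itertools import combinations

def a(n: int) -> int:
    """Read the binary digits of n as a base-3 numeral."""
    return int(format(n, "b"), 3)

def has_three_term_ap(terms) -> bool:
    """Return True if some x < y < z among terms satisfy x + z = 2y."""
    present = set(terms)
    for x, z in combinations(sorted(present), 2):
        if (x + z) % 2 == 0 and (x + z) // 2 in present:
            return True
    return False

first_terms = [a(n) for n in range(300)]
print(first_terms[:9])                 # [0, 1, 3, 4, 9, 10, 12, 13, 27]
print(has_three_term_ap(first_terms))  # False
print(a(1999))                         # 88249
```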
Problem Does there exist an infinite increasing sequence $t_1 < t_2 < t_3 < \cdots$ of positive
integers such that, for any integer c, the sequence
\[ t_1 + c, \; t_2 + c, \; t_3 + c, \; \ldots \]
contains only a finite number of primes?
Solution Consider a particular value of c, such as 73. How can we easily guarantee that the
sequence $t_1 + 73, t_2 + 73, t_3 + 73, \ldots$ contains only a finite number of primes? One way is to
make sure that, eventually, the numbers in the sequence $t_1, t_2, t_3, \ldots$ are divisible by 73. In
fact, we would like this to be true for almost any value of c, which means that the numbers
$t_1, t_2, t_3, \ldots$ should probably have lots of factors. A good candidate for such a sequence is
$t_n = n!$. In fact, for any c with |c| ≥ 2, the number n! + c is divisible by |c| and larger than it
whenever n > |c|. So it’s certainly true that there are finitely many primes in the sequence
1! + c, 2! + c, 3! + c, . . . as long as |c| ≥ 2. Furthermore, when c = 0, there are clearly no primes
in the sequence apart from 2.
7 This is another term for base-3.
So we’re left to deal with the cases c = −1 and c = +1. Unfortunately, there is no easy way
to see that n! ± 1 is composite, so our initial choice of sequence may have to be tweaked. We
would like to change the sequence somehow, maintaining the abundance of factors in our
numbers, but using numbers which are obviously composite when 1 is added or subtracted.
So what sort of numbers are obviously composite when 1 is added or subtracted? One answer
to this question is perfect cubes, since we have the factorisations
\[ X^3 - 1 = (X - 1)(X^2 + X + 1) \quad\text{and}\quad X^3 + 1 = (X + 1)(X^2 - X + 1). \]
Sometimes, you may be asked to perform a particular construction for all positive integers n.
In that case, it should be easier to find a construction in the case n + 1 if you use the fact
that you have already found a construction in the case n. This means that your construction
will involve induction, quite a common phenomenon in number theory problems.
Solution In the language of modular arithmetic, the problem is asking us to prove that
there are infinitely many integers m such that $m^2 \equiv -7 \pmod{2^n}$. Our first observation is
that once you’ve found one such m, then infinitely many such m are easy to find. This is
because
\[ m^2 \equiv -7 \pmod{2^n} \;\Rightarrow\; (m + 2^n)^2 \equiv -7 \pmod{2^n}. \]
So the problem is really asking us to prove that there exists at least one integer m such that
$m^2 \equiv -7 \pmod{2^n}$. For the first several values of n, we can simply list such values of m.
\[ 1^2 \equiv -7 \pmod{2} \qquad 1^2 \equiv -7 \pmod{4} \qquad 1^2 \equiv -7 \pmod{8} \]
\[ 3^2 \equiv -7 \pmod{16} \qquad 5^2 \equiv -7 \pmod{32} \qquad 11^2 \equiv -7 \pmod{64} \]
The idea behind our construction is as follows: if $m^2 \equiv -7 \pmod{2^n}$, then it must be true
that either
\[ m^2 \equiv -7 \pmod{2^{n+1}} \quad\text{or}\quad m^2 \equiv -7 + 2^n \pmod{2^{n+1}}. \]
In the former case, we can simply use m again. But in the latter case, we can use $m + 2^{n-1}$
and in this case we have
\begin{align*}
(m + 2^{n-1})^2 &\equiv m^2 + m 2^n + 2^{2n-2} \pmod{2^{n+1}} \\
&\equiv -7 + 2^n + m 2^n + 2^{2n-2} \pmod{2^{n+1}} \\
&\equiv -7 + 2^n (m + 1 + 2^{n-2}) \pmod{2^{n+1}} \\
&\equiv -7 \pmod{2^{n+1}}.
\end{align*}
This is certainly true for n ≥ 3, since we know that m is necessarily odd, which forces
$(m + 1 + 2^{n-2})$ to be even. So given the fact that $m^2 \equiv -7 \pmod{2^n}$, we’ve shown that
either $m^2 \equiv -7 \pmod{2^{n+1}}$ or $(m + 2^{n-1})^2 \equiv -7 \pmod{2^{n+1}}$. Therefore, by induction, for
every positive integer n, we have a value of m for which $m^2 \equiv -7 \pmod{2^n}$.
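The inductive step translates directly into a short loop. This is only a sketch of the lifting argument above; the function name `sqrt_of_minus_seven` is our own, and the values it returns need not match the ones listed in the table (for example, it produces 5 rather than 3 modulo 16).

```python
def sqrt_of_minus_seven(n: int) -> int:
    """Return some m with m*m ≡ -7 (mod 2**n), for n >= 1."""
    m = 1                      # 1**2 ≡ -7 modulo 2, 4 and 8
    for k in range(3, n):      # lift a solution mod 2**k to one mod 2**(k+1)
        if (m * m + 7) % 2 ** (k + 1) != 0:
            m += 2 ** (k - 1)  # the "latter case" of the construction
    return m

for n in (1, 2, 3, 4, 5, 6, 20, 64):
    m = sqrt_of_minus_seven(n)
    assert (m * m + 7) % 2 ** n == 0
```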
Problem If p is a prime, show that $\binom{2p}{p} - 2$ is a multiple of p.
Solution Writing out the binomial coefficient, we need to show that
\[ 2 \cdot \frac{(2p - 1)(2p - 2) \cdots (p + 1)}{(p - 1)(p - 2) \cdots 1} \equiv 2 \pmod{p}. \]
If p = 2, then the statement is obvious, since both sides are congruent to 0. On the other
hand, for p > 2 we can divide both sides by 2 and then multiply by the denominator to obtain
\[ (2p - 1)(2p - 2) \cdots (p + 1) \equiv (p - 1)(p - 2) \cdots 1 \pmod{p}. \]
This statement is equivalent to the previous one precisely because we are working modulo a
prime, so we may divide by any number which is not 0 modulo p. The last equation is clearly
true, because
\[ p + k \equiv k \pmod{p} \quad\text{for } k = 1, 2, \ldots, p - 1. \]
In fact, more is true. If p > 3 is prime, then $\binom{2p}{p} - 2$ is a multiple of $p^3$. See if you can prove
it!
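Both divisibility claims are easy to test for small primes. The sketch below uses `math.comb` from the Python standard library; the list of primes is an arbitrary sample, and a numerical check is of course not a proof.

```python
from math import comb

def central_binomial_minus_two(p: int) -> int:
    """Return C(2p, p) - 2."""
    return comb(2 * p, p) - 2

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]

# Divisibility by p holds for every prime in the sample.
print(all(central_binomial_minus_two(p) % p == 0 for p in small_primes))

# Divisibility by p**3 holds for the primes larger than 3.
print(all(central_binomial_minus_two(p) % p ** 3 == 0 for p in small_primes if p > 3))
```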
You can verify that x ≡ 53 (mod 60) is a solution. In fact it represents all of the solutions.
This is a special case of the following.
2.12 From Fermat to Euler 35
This theorem is particularly useful if you wish to show the existence of numbers satisfying
certain congruence conditions.
Problem Prove that for each positive integer n, there exist n consecutive positive integers,
none of which is an integral power of a prime number.
And this is where the Chinese remainder theorem comes to our rescue! This is because
it guarantees that there exists a solution as long as the numbers $p_1 q_1, p_2 q_2, \ldots, p_n q_n$ are
pairwise relatively prime. Therefore, if we simply pick the primes $p_1, p_2, \ldots, p_n, q_1, q_2, \ldots, q_n$ to be
distinct, then we can invoke the Chinese remainder theorem to find a solution for x modulo m,
where $m = p_1 q_1 p_2 q_2 \cdots p_n q_n$.
Note that x is a solution to the problem if and only if x + m is. This shows that for a given
n there are infinitely many solutions.
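To make the construction concrete, here is a small Python sketch. The system of congruences is not shown in the text above, so we assume the natural choice: x + i is divisible by the product of the i-th pair of distinct primes, for i = 1, ..., n. The helper names `first_primes`, `crt` and `consecutive_non_prime_powers` are hypothetical.

```python
from math import prod

def first_primes(count: int) -> list:
    """Return the first `count` primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def crt(residues, moduli) -> int:
    """Solve x ≡ r_i (mod m_i) for pairwise coprime moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return x % total

def consecutive_non_prime_powers(n: int):
    """Return (x, moduli) with x + i divisible by moduli[i - 1] for i = 1..n."""
    primes = first_primes(2 * n)
    moduli = [primes[2 * i] * primes[2 * i + 1] for i in range(n)]
    x = crt([-(i + 1) for i in range(n)], moduli)  # assumed system: x + i ≡ 0
    return x, moduli

x, moduli = consecutive_non_prime_powers(4)
print(x, moduli)
```

Each of x + 1, ..., x + n is then divisible by two distinct primes, so none of them can be a power of a single prime.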
A convenient equivalent form of Fermat’s little theorem is: If p is a prime and a is any
positive integer, then
\[ a^p \equiv a \pmod{p}. \]
Problem A positive integer n is called groovy if, for every positive integer a, $n^2$ divides
$a^n - 1$ whenever n divides $a^n - 1$.
Show that all primes are groovy.
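Solution Let p be a prime and suppose that p divides $a^p - 1$, so that
\[ a^p \equiv 1 \pmod{p}. \]
By Fermat’s little theorem, we also have
\[ a^p \equiv a \pmod{p}. \]
Comparing the two congruences gives
\[ a \equiv 1 \pmod{p} \]
and we can write a = mp + 1 for some integer m. Therefore, by the binomial theorem, we
have
\[ a^p - 1 = (mp + 1)^p - 1 = \sum_{k=0}^{p} \binom{p}{k} m^k p^k - 1. \]
If we consider this expression modulo $p^2$, then each term with k ≥ 2 disappears, due to the
appearance of $p^k$. Therefore, we conclude that
\[ a^p - 1 \equiv \binom{p}{0} m^0 p^0 + \binom{p}{1} m^1 p^1 - 1 \equiv 1 + mp^2 - 1 \equiv 0 \pmod{p^2}, \]
so $p^2$ divides $a^p - 1$ and p is groovy.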
Sometimes you’re not working modulo a prime, in which case Fermat’s little theorem just
isn’t good enough. Luckily Euler’s theorem, which generalises Fermat’s little theorem, comes
to the rescue!
Euler’s theorem If n is a positive integer and gcd(a, n) = 1, then
\[ a^{\varphi(n)} \equiv 1 \pmod{n}. \]
Here, ϕ(n) denotes the number of integers 1 ≤ k ≤ n such that gcd(k, n) = 1 and is usually
referred to as the Euler phi function.⁸ For example, ϕ(12) = 4, since the four numbers 1, 5, 7
and 11 are relatively prime to 12. We’ll use Euler’s theorem to provide a quick proof to a
problem we previously encountered in section 2.2.
Problem Show that for every positive integer n not divisible by 2 or 5, there exists a
multiple of n all of whose digits are ones.
Solution Consider the number $X = \frac{1}{9}(10^{\varphi(9n)} - 1)$. It’s an integer whose decimal
representation consists entirely of ones, ϕ(9n) of them to be precise. However, since
gcd(10, 9n) = gcd(10, n) = 1 by assumption, we can invoke Euler’s theorem to show that
\[ 10^{\varphi(9n)} \equiv 1 \pmod{9n} \;\Rightarrow\; \tfrac{1}{9}\left(10^{\varphi(9n)} - 1\right) \equiv 0 \pmod{n}. \]
Therefore, X is divisible by n, as desired.
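The construction is explicit enough to compute. In the sketch below, `phi` counts the integers coprime to n directly (slow but adequate for small n) and `repunit_multiple` builds the number X from the solution; both names are our own.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler phi function by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def repunit_multiple(n: int) -> int:
    """Return X = (10**phi(9n) - 1) / 9 for n not divisible by 2 or 5."""
    if gcd(n, 10) != 1:
        raise ValueError("n must not be divisible by 2 or 5")
    return (10 ** phi(9 * n) - 1) // 9

print(phi(12))  # 4
for n in (3, 7, 13, 21, 99):
    x = repunit_multiple(n)
    assert set(str(x)) == {"1"} and x % n == 0
```

The multiple found this way is usually far from the smallest one, since ϕ(9n) is only an upper bound on the number of ones needed.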
Fermat’s little theorem and Euler’s theorem are most successfully applied in conjunction with
the following useful result, which is worth a section on its own.
8 There is a formula for ϕ(n) in terms of the prime factorisation of n. See if you can find and prove it!
2.13 The gcd trick 37
This useful result can be proven using the representation of gcd(x, y) as a linear combination
of x and y using Euclid’s algorithm. See section 2.7. The proof is not hard and you should
try to work it out for yourself.
Since p is prime, we know that gcd(p, q − 1) must be equal to 1 or p. The former case is
absurd, since it would imply that $1 \equiv 2^{\gcd(p, q-1)} \equiv 2 \pmod{q}$.
We are left to ponder the latter case, when gcd(p, q − 1) = p. But this simply means that
p | q − 1, which is only possible if p < q.
The previous problem provides us with a beautiful proof that there are infinitely many primes.
This is because if there were only finitely many, we could take p to be the largest one. Then
$2^p - 1$ would be divisible by some prime q, which would have to be larger than p. Since this
contradicts the fact that p was the largest prime, we conclude that there must be infinitely
many primes.
1, 3, 2, 6, 4, 5, 1, 3, 2, 6, 4, 5, . . . ,
which is a 6-cycle.
The number 3 is called a generator modulo 7 because its successive powers generate everything
which is relatively prime to 7 modulo 7.
More generally, a number a is called a generator⁹ modulo n if the powers of a, namely,
\[ 1, a, a^2, a^3, \ldots, \]
generate the complete set of elements modulo n which are relatively prime to n.
9 The term primitive root modulo n is used in some literature. It means the same thing.
This theorem is most often used in the case where n is a prime. It can be used very powerfully
in conjunction with the gcd trick. The important point is that if g is a generator modulo n,
then d = ϕ(n) is the smallest positive integer for which $g^d \equiv 1 \pmod{n}$.
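The 6-cycle of powers of 3 modulo 7 and the definition of a generator can be reproduced with a few lines of Python. This is an illustrative sketch with hypothetical helper names `powers_mod` and `is_generator`.

```python
from math import gcd

def powers_mod(a: int, n: int) -> list:
    """List 1, a, a^2, ... modulo n, stopping just before the powers return to 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    seen, value = [], 1
    while True:
        seen.append(value)
        value = value * a % n
        if value == 1:
            return seen

def is_generator(a: int, n: int) -> bool:
    """True if the powers of a hit every residue relatively prime to n."""
    units = {k for k in range(1, n) if gcd(k, n) == 1}
    return set(powers_mod(a, n)) == units

print(powers_mod(3, 7))    # [1, 3, 2, 6, 4, 5]
print(is_generator(3, 7))  # True
print(is_generator(2, 7))  # False
```

The length of the list returned by `powers_mod(g, n)` is the smallest positive d with g^d ≡ 1 (mod n), which equals ϕ(n) exactly when g is a generator.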
\[ a^n \equiv a \pmod{n} \]
For a long time it was suspected that there are infinitely many Carmichael numbers. A proof
of this fact was finally found in 1994.
10 These numbers are of interest because in the light of Fermat’s little theorem, they look prime, but are not!
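As a rough illustration, the following Python sketch searches for composite n that satisfy the displayed congruence for every a. We are assuming this is the defining property intended here, since the definition itself is not shown above; the function names are our own.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def looks_prime_but_is_not(n: int) -> bool:
    """True if n is composite yet a**n ≡ a (mod n) for every residue a."""
    return n > 1 and not is_prime(n) and all(pow(a, n, n) == a for a in range(n))

print([n for n in range(2, 2000) if looks_prime_but_is_not(n)])  # [561, 1105, 1729]
```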
3 Diophantine equations
A Diophantine equation is an equation that asks for integer solutions. The equation is usually
posed as, or at least can be reduced to, a polynomial expression in several variables. For
example, $a^2 + b^3 c = d^4$ is a Diophantine equation.
3.0 Problems
1. Show that for every integer n > 1 it is possible to write $\frac{1}{n}$ as a sum of two reciprocals
of distinct positive integers.
3x + 4y = 2xy.
$(x - y)^2 = x + y.$
4. Find all ordered triples of positive integers such that each of them is a factor of the
sum of the other two.
$x^3 y + x + y = xy + 2xy^2.$
(c) Find all pairs of integers (u, v) such that $u^2 + 2uv - v^2$ is divisible by 7 but
$u^2 + uv + 2v^2$ is not divisible by 7.
9. Suppose that a and b are positive integers such that
\[ 11^{2011} \mid a^2 + b^2. \]
Prove that
\[ 11^{2012} \mid ab. \]
\[ k = \frac{n^2 - 29}{3n + 11}, \]
where n is also an integer.
11. Determine all positive integers k such that for all positive integers a and b the following
statement is true.
ab + bc + ca = 2 + abc.
a | bc − 1, b | ac − 1 and c | ab − 1.
$y^2 (x - 1) = x^5 - 1.$
17. Find all integers a, b and c with 1 < a < b < c such that
(a − 1)(b − 1)(c − 1)
is a divisor of abc − 1.
18. Suppose that a, b, c and d are positive integers satisfying ab = cd.
Prove that neither a + b + c + d nor $a^2 + b^2 + c^2 + d^2$ can be prime numbers.
19. Find all integer solutions to the equation
\[ 8x^3 - 4 = y(6x - y^2). \]
20. Can the product of five consecutive positive integers be a perfect square?
\[ x^{p-1} + x^{p-2} + \cdots + x + 1 \]
is congruent to 0 or 1 modulo p.
23. Find all ordered triples of positive integers (a, b, c) such that for all positive integers t
(at + 1)(bt + 1)(ct + 1) − 1
lcm(at, bt, ct)
is also a positive integer.
24. Given any set $A = \{a_1, a_2, a_3, a_4\}$ of four distinct positive integers, we denote the sum
$a_1 + a_2 + a_3 + a_4$ by $s_A$. Let $n_A$ denote the number of pairs (i, j) with 1 ≤ i < j ≤ 4
for which $a_i + a_j$ divides $s_A$.
Find all sets A of four distinct positive integers which achieve the largest possible value
of $n_A$.
25. Find all pairs (a, b) of positive integers such that
\[ \frac{a^2 b + a + b}{ab^2 + b + 7} \]
is also a positive integer.
26. The positive integers a and b are such that
$a^5 + a^3 b + b^4 = 0.$
$a^2 b^2 - 4a - 4b$
is a perfect square.
$a^2 + 4b$ and $b^2 + 4a$
32. Determine the maximum value of $m^2 + n^2$, where m and n are integers 1 ≤ m, n ≤ 2015
and
\[ (n^2 - mn - m^2)^2 = 1. \]
\[ \frac{a^2 + b^2}{ab + 1} \]
is a positive integer, then it is a perfect square.
34. Find all positive integers k of the form
\[ k = \frac{x + 1}{y} + \frac{y + 1}{x}, \]
where x and y are also positive integers.
35. Let a and b be positive integers such that
\[ \frac{(4a^2 - 1)^2}{4ab - 1} \]
is also a positive integer.
Prove that a = b.
\[ \frac{x^7 - 1}{x - 1} = y^5 - 1. \]
3.1 Factorisation
If someone asked you to solve the equation xy = 100, where x and y are integers, then
hopefully you’d find the problem easy. That’s because x must be a divisor of 100. After
listing out all of the divisors of 100 (remembering, of course, that they may be positive or
negative), we find the values for x and then solve to find the corresponding values for $y = \frac{100}{x}$.
It’s often possible to rearrange a Diophantine equation so that we have an integer on one side
and a factored expression on the other side. It is then a simple matter of examining every
possible factorisation of the integer and lining it up with the factored expression.
3.2 Monotonicity
Suppose, for example, we know that x is a perfect square. Suppose also we can show that
$a^2 < x < (a + 3)^2$ by other means. Then we may conclude that the only possibilities we need
pursue further are $x = (a + 1)^2$ and $x = (a + 2)^2$. This is because the function $f(t) = t^2$ is an
increasing function on the positive integers. This sort of technique is valid for any monotonic
(i.e. increasing or decreasing) function, as in the following example.
Problem Find all ways of writing the number 1 as a sum of the reciprocals of three positive
integers.
Solution The equation $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} = 1$ is equivalent to the equation
\[ abc = ab + ac + bc. \]
Note that none of a, b, c equal 1. Due to symmetry, we may order the variables and say
WLOG that 2 ≤ a ≤ b ≤ c. Using this we note that the RHS ≤ 3bc and so abc ≤ 3bc. Thus
a ≤ 3. We may now take the two cases a = 2 and a = 3.
Case 1: a = 3
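Then we have 3b + 3c = 2bc. This may be rewritten as
\[ (2b - 3)(2c - 3) = 9. \]
Since 3 = a ≤ b ≤ c, both factors are at least 3, which forces 2b − 3 = 2c − 3 = 3 and hence
(a, b, c) = (3, 3, 3).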
Case 2: a = 2
In a similar way to case 1 we arrive at
(b − 2)(c − 2) = 4.
Remembering that 2 ≤ b ≤ c we find that only b = 3 and b = 4 are possible, which lead
quickly to the solutions (a, b, c) = (2, 3, 6) and (a, b, c) = (2, 4, 4).
In summary we have solutions (3, 3, 3), (2, 3, 6) and (2, 4, 4). However, remember that we
used a WLOG argument at the beginning due to symmetry. So we must remember to include
all permutations of our solutions too, giving us 10 distinct solutions in all.
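The count of 10 can be confirmed by brute force. The sketch below searches all ordered triples up to an assumed bound of 50, which is more than enough given that the argument above shows no variable exceeds 6.

```python
from itertools import product

LIMIT = 50  # assumed search bound; the solution above shows 6 already suffices

ordered = [
    (a, b, c)
    for a, b, c in product(range(1, LIMIT + 1), repeat=3)
    if a * b * c == a * b + a * c + b * c  # integer form of 1/a + 1/b + 1/c = 1
]
unordered = sorted({tuple(sorted(t)) for t in ordered})

print(unordered)     # [(2, 3, 6), (2, 4, 4), (3, 3, 3)]
print(len(ordered))  # 10
```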
$a^2 + b + c = abc.$
This certainly is a candidate for bounding! WLOG c ≥ b. If b ≥ 3, then bc ≥ c+c+c > b+c+1.
So we must have either b = 1 or b = 2.
Case 1: b = 1
The original equation can be rearranged to
\[ c = \frac{a^2 + 1}{a - 1} = a + 1 + \frac{2}{a - 1}. \]
Since $\frac{2}{a-1}$ must be an integer, this leads to a = 2 or a = 3. The solutions corresponding
to this case are then (a, b, c) = (2, 1, 5) and (3, 1, 5).
Case 2: b = 2
The original equation can be rearranged to
\[ c = \frac{a^2 + 2}{2a - 1}. \]
Thus
\[ 4c = \frac{(2a)^2 + 8}{2a - 1} = 2a + 1 + \frac{9}{2a - 1}. \]
Hence 2a − 1 | 9, which leads to a = 1, 2, 5. The solutions corresponding to this case
are (a, b, c) = (1, 2, 3), (2, 2, 2) and (5, 2, 3).
If you are wondering how we mysteriously came up with the idea of going from $c = \frac{a^2+2}{2a-1}$ to
$4c = \frac{(2a)^2+8}{2a-1} = 2a + 1 + \frac{9}{2a-1}$, please read on because section 3.4 will provide illumination!
\begin{align*}
& a^2 + 2 \equiv 0 \pmod{2a - 1} \\
\Leftrightarrow\ & (2a)^2 + 8 \equiv 0 \pmod{2a - 1} \\
\Leftrightarrow\ & (1)^2 + 8 \equiv 0 \pmod{2a - 1}
\end{align*}
Thus 2a − 1 | 9. Checking the six factors of 9, including the negative ones, gives rise to six
solutions, namely, a = −4, −1, 0, 1, 2 or 5.
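A direct search agrees with the six values found by working modulo 2a − 1. The search window of −100 to 100 is an arbitrary assumption; the argument above is what guarantees that nothing lies outside it.

```python
# 2a - 1 is odd, so it is never zero and the remainder is always defined.
solutions = [a for a in range(-100, 101) if (a * a + 2) % (2 * a - 1) == 0]
print(solutions)  # [-4, -1, 0, 1, 2, 5]
```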
You would be right in thinking that this would be a prime candidate for trying the method
of using a polynomial modulus. And yes, it works just fine. However, study the following
solution which showcases the quadratic discriminant method.
\[ a^2 + 2 = 2ak + 3k \]
to have integer solutions. Viewing this as a quadratic in a, this is only possible if the
discriminant $4k^2 + 12k - 8$ is a perfect square. Thus
\[ 4k^2 + 12k - 8 = u^2, \]
for some non-negative integer u. Note that $u^2$ is even and so u is also. Write u = 2v.
Rearranging the above relation yields
\[ k^2 + 3k - 2 - v^2 = 0. \]
This is a quadratic in k. Once again its discriminant $9 + 4(2 + v^2)$ is a perfect square. Thus
\[ 4v^2 + 17 = w^2, \]
for some non-negative integer w, which factorises as (w − 2v)(w + 2v) = 17.
Since w + 2v ≥ 0, only the positive factors of 17 need to be checked. The only possibility is
w − 2v = 1 and w + 2v = 17. This yields w = 9 and v = 4. Thus u = 8. Solving for k yields
k = 3 or −6. Finally solving for a yields the four solutions a = −10, −2, −1 or 7.
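The four solutions, and the fact that the discriminant really is a perfect square for each of them, can be checked with a short search over the equation a^2 + 2 = 2ak + 3k. The window of −200 to 200 and the helper `is_square` are our own choices for illustration.

```python
from math import isqrt

def is_square(m: int) -> bool:
    """True if m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m

pairs = []
for a in range(-200, 201):
    if (a * a + 2) % (2 * a + 3) == 0:            # k = (a^2 + 2) / (2a + 3) is an integer
        k = (a * a + 2) // (2 * a + 3)
        assert is_square(4 * k * k + 12 * k - 8)  # discriminant of the quadratic in a
        pairs.append((a, k))

print(pairs)  # [(-10, -6), (-2, -6), (-1, 3), (7, 3)]
```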
\[ x^4 + 131y^4 = 3z^4 + 2000 \]
Solution Since ϕ(5) = 4, we try considering the equation modulo 5. The equation simplifies
as
\[ x^4 + y^4 \equiv 3z^4 \pmod{5}. \]
Observe that $a^4 \equiv 0, 1 \pmod{5}$ for any integer a. It follows that LHS ≡ 0, 1 or 2 (mod 5)
and RHS ≡ 0 or 3 (mod 5). So we must have LHS ≡ RHS ≡ 0 (mod 5). But this is only
possible if x ≡ y ≡ z ≡ 0 (mod 5). However, this in turn implies that $5^4 \mid 2000$, which is a
contradiction.
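The residue bookkeeping in this argument is small enough to enumerate. The following sketch lists the possible residues of each side modulo 5 and confirms that 2000 is not divisible by 5^4; the variable names are ours.

```python
fourth_powers = sorted({a ** 4 % 5 for a in range(5)})
lhs = sorted({(x + y) % 5 for x in fourth_powers for y in fourth_powers})
rhs = sorted({3 * z % 5 for z in fourth_powers})

print(fourth_powers)   # [0, 1]
print(lhs)             # [0, 1, 2]
print(rhs)             # [0, 3]
print(2000 % 5 ** 4)   # 125, so 5**4 does not divide 2000
```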
Solution Viewing the equation as a quadratic in m, if it did have solutions, then the
discriminant
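\[ 36n^2 - 20(7n^2 - 1985) = 4(9925 - 26n^2) \]
should be a perfect square. Thus
\[ 9925 - 26n^2 \]
is a perfect square.
Squares can only be congruent to 0, 1 or 4 modulo 8. Hence considering this expression
modulo 8 yields
\begin{align*}
5 - 2n^2 &\equiv 0, 1, 4 \pmod{8} \\
&\equiv 1 \pmod{8} \quad \text{(since LHS is odd)} \\
\Rightarrow\ 2n^2 &\equiv 4 \pmod{8} \\
\Rightarrow\ n^2 &\equiv 2 \pmod{4}.
\end{align*}
This is impossible, since every square is congruent to 0 or 1 modulo 4.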
\[ x^4 + 2x^2 y + y^3 = 0. \]
Solution Let gcd(x, y) = d. Thus we may write x = sd and y = td, where gcd(s, t) = 1.
The equation becomes
\[ s^4 d + 2s^2 t + t^3 = 0. \]
Now if p is any prime factor of s, we see that $p \mid t^3$ and thus p | t. But since gcd(s, t) = 1,
this forces s = ±1 and thus $d = -2t - t^3$. Thus the complete set of solutions is given by
\[ x = \pm t(2 + t^2) \]
\[ y = -t^2 (2 + t^2), \]
\[ a^2 + 2b^2 = c^2. \]
Solution By dividing out by λ = gcd(a, b, c), we may assume that a, b and c are pairwise
coprime. Considering the equation modulo 4 shows us that a and c are both odd and that b
is even. Thus after writing b = 2d and rearranging we have
\[ 2d^2 = \frac{c - a}{2} \cdot \frac{c + a}{2} \]
and that $\gcd\left(\frac{c-a}{2}, \frac{c+a}{2}\right) = \gcd(a, c) = 1$. Thus whichever one of $\frac{c-a}{2}$ and $\frac{c+a}{2}$ is odd must
be a perfect square, say $u^2$, and the other must be twice a square, say $2v^2$. Adding these
together yields
\[ c = u^2 + 2v^2, \quad a = \pm(u^2 - 2v^2) \quad\text{and}\quad d = uv. \]
Thus the most general solutions are given by
\[ a = |u^2 - 2v^2| \lambda \]
\[ b = 2uv\lambda \]
\[ c = (u^2 + 2v^2) \lambda, \]
where λ is any positive integer and (u, v) are any pair of relatively prime positive integers.¹
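It is reassuring to confirm that the parametrisation really does produce solutions. This Python sketch checks the identity for a range of coprime pairs (u, v); the function name `solution` and the parameter name `lam` (standing in for λ) are our own, and the range of values tested is arbitrary.

```python
from math import gcd

def solution(u: int, v: int, lam: int = 1):
    """Return (a, b, c) from the parametrisation, with lam playing the role of λ."""
    return (abs(u * u - 2 * v * v) * lam, 2 * u * v * lam, (u * u + 2 * v * v) * lam)

for u in range(1, 30):
    for v in range(1, 30):
        if gcd(u, v) == 1:
            a, b, c = solution(u, v)
            assert a * a + 2 * b * b == c * c

print(solution(1, 1))  # (1, 2, 3)
print(solution(3, 1))  # (7, 6, 11)
```

The check only shows that every triple of this form is a solution; that every solution has this form is what the argument above establishes.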
\[ a^3 + b^3 = c^3 + d^3 + e^3 \]
has infinitely many integer solutions where gcd(a, b, c, d, e) = 1 and a, b, c, d and e are all
distinct.
Solution We try the substitution a = x − s, b = x + s. The left-hand side becomes $2x^3 + 6s^2 x$.
A little more thought encourages us to try a similar thing on the right-hand side, namely
c = x − t, and d = x + t. Using these, the equation simplifies to
\[ 6x(s - t)(s + t) = e^3. \]
We have now reduced the equation to four variables. Trying s = 2 and t = 1 simplifies matters
even further to
\[ 18x = e^3. \]
1 A very similar approach yields a formula for all Pythagorean triples, that is, triples of positive integers that
are the side lengths of a right-angled triangle.
3.9 Infinite descent 49
Choosing e = 18y finishes off the problem. We have discovered an infinite one-parameter
family of solutions given by
\[ a = 18^2 y^3 - 2 \]
\[ b = 18^2 y^3 + 2 \]
\[ c = 18^2 y^3 - 1 \]
\[ d = 18^2 y^3 + 1 \]
\[ e = 18y. \]
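The family can be tested directly, including the side conditions on the gcd and on distinctness. This is a small sketch; the function name `family` and the range of y tested are our own choices.

```python
from functools import reduce
from math import gcd

def family(y: int):
    """Return (a, b, c, d, e) from the one-parameter family, with x = 18^2 * y^3."""
    x = 18 ** 2 * y ** 3
    return (x - 2, x + 2, x - 1, x + 1, 18 * y)

for y in range(1, 50):
    a, b, c, d, e = family(y)
    assert a ** 3 + b ** 3 == c ** 3 + d ** 3 + e ** 3
    assert reduce(gcd, (a, b, c, d, e)) == 1
    assert len({a, b, c, d, e}) == 5

print(family(1))  # (322, 326, 323, 325, 18)
```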
\[ x^3 + 3y^3 + 9z^3 = 9xyz. \]
Solution Assume we have a solution (x, y, z) with |x| + |y| + |z| > 0. Then there is a solution
with |x| + |y| + |z| minimal. The equation implies $3 \mid x^3$ and so 3 | x. Thus x = 3a for some
integer a. Substitute this into the equation and divide the equation by 3 to find
\[ 9a^3 + y^3 + 3z^3 = 9ayz. \]
The same argument now shows that 3 | y, say y = 3b, and then that 3 | z, say z = 3c.
It only remains to note that (a, b, c) is also a solution to the original equation and satisfies
$0 < |a| + |b| + |c| = \frac{|x| + |y| + |z|}{3}$. This contradicts the minimality of |x| + |y| + |z|. Thus there
is no solution with |x| + |y| + |z| > 0. Hence the only solution is x = y = z = 0.
Suppose that we have a Diophantine equation in two variables x and y which is a monic²
quadratic in x. If we have one solution (x, y) = (a, b) in integers, it turns out that we can
apply Vieta’s formula³ for the sum of the roots to generate another integer solution.
\[ x^2 + y^2 - 4xy - 4 = 0 \]
\[ x^2 - 8x = 0. \]
Using Vieta’s formula, the sum of the roots is 8. Since x = 0 is one of the roots, the other
must be x = 8. So (8, 2) is a solution. By symmetry, (2, 8) is also a solution.
Next substitute y = 8 into the original equation. We obtain the quadratic equation
\[ x^2 - 32x + 60 = 0. \]
Again by Vieta’s formula, the sum of the roots is 32. Since x = 2 is one of the roots, the
other must be x = 30. So (30, 8) is a solution. By symmetry, so is (8, 30).
This procedure may be continued indefinitely. In general if (x, y) = (a, b) is a solution, we
may construct another solution by considering the quadratic equation
\[ x^2 - 4bx + b^2 - 4 = 0. \]
From Vieta’s formula, the sum of the roots is 4b. Since x = a is one of the roots, the
other must be 4b − a. Thus (x, y) = (4b − a, b) is also a solution and by symmetry, so is
(x, y) = (b, 4b − a).
If we assume that 0 ≤ a < b, then b < 4b − a. Thus any solution (a, b) leads to a bigger
solution (b, 4b − a). Since (0, 2) is a solution, we can thus construct infinitely many solutions
using this procedure starting from (0, 2).
By using the technique of infinite descent, it is possible to prove that all integer solutions
(x, y), with 0 ≤ x ≤ y, to the given equation occur as a consecutive pair of the generated
sequence 0, 2, 8, 30, . . . , where the rule for the sequence is
\[ a_0 = 0, \quad a_1 = 2, \quad a_{n+1} = 4a_n - a_{n-1} \]
for n ≥ 1.
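The jump from a solution (a, b) to the bigger solution (b, 4b − a) is easy to iterate. The sketch below starts from (0, 2) and checks each generated pair against the equation x^2 + y^2 − 4xy − 4 = 0; the function name `vieta_chain` is hypothetical.

```python
def vieta_chain(count: int) -> list:
    """Generate `count` solutions by repeatedly replacing (a, b) with (b, 4b - a)."""
    a, b = 0, 2
    chain = []
    for _ in range(count):
        chain.append((a, b))
        a, b = b, 4 * b - a
    return chain

chain = vieta_chain(5)
print(chain)  # [(0, 2), (2, 8), (8, 30), (30, 112), (112, 418)]
assert all(x * x + y * y - 4 * x * y - 4 == 0 for x, y in chain)
```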
From this definition it is easy to compute the first few cyclotomic polynomials to be
\begin{align*}
\Phi_1(x) &= x - 1 \\
\Phi_2(x) &= x + 1 \\
\Phi_3(x) &= x^2 + x + 1 \\
\Phi_4(x) &= x^2 + 1 \\
\Phi_5(x) &= x^4 + x^3 + x^2 + x + 1 \\
\Phi_6(x) &= x^2 - x + 1 \\
\Phi_7(x) &= x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 \\
\Phi_8(x) &= x^4 + 1.
\end{align*}
Additionally, it is not too hard to show directly from the definition that
\[ \Phi_p(x) = x^{p-1} + x^{p-2} + \cdots + x + 1 \]
for any prime p.
Sometimes part of an algebraic expression can be seen to be part of a cyclotomic polynomial.
Since any cyclotomic polynomial is a factor of an expression of the form $x^n - 1$, there is a
fair chance that the theorems of Euler or Fermat from section 2.12 might be helpful.
Solution We astutely recognise that the RHS is equal to 8 plus the cyclotomic polynomial
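$\Phi_{17}(y) = y^{16} + y^{15} + \cdots + y^2 + y + 1$. Rewrite the equation as
\[ x^3 - 8 = y^{16} + y^{15} + \cdots + y^2 + y + 1. \]
Note that the RHS is a factor of $y^{17} - 1$.
Let p be any prime factor of the RHS. Note that this implies that p ∤ y. Then it follows that
\[ y^{17} \equiv 1 \pmod{p}. \]
Let d be the smallest positive integer for which $y^d \equiv 1 \pmod{p}$. Then d | 17, so d = 1 or d = 17.
Case 1: d = 1
Then we have y ≡ 1 (mod p), and so,
\[ y^{16} + y^{15} + \cdots + y^2 + y + 1 \equiv 17 \pmod{p}. \]
Since p divides the RHS, this forces p | 17, that is, p = 17.
Case 2: d = 17
By Fermat’s little theorem, $y^{p-1} \equiv 1 \pmod{p}$, so 17 | p − 1, that is, p ≡ 1 (mod 17).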
We have shown that every prime factor p of the RHS satisfies either p = 17 or p ≡ 1 (mod 17).
Hence it follows that every factor of the RHS is congruent to 0 or 1 modulo 17.
However, since
\[ x^3 - 8 = (x - 2)(x^2 + 2x + 4) \]
we see that x − 2 ≡ 0, 1 (mod 17).
But if x ≡ 2 (mod 17), then $x^2 + 2x + 4 \equiv 12 \pmod{17}$.
If x ≡ 3 (mod 17), then $x^2 + 2x + 4 \equiv 2 \pmod{17}$.
In both cases the other factor is not congruent to 0 or 1 modulo 17, a contradiction. Thus
the equation has no integer solutions.
4 Plane geometry
Welcome dear reader to a domain where many fear to tread, and many stumble. The domain
is plane geometry, surely one of the most beautiful areas of mathematics.
In one sense, geometry is the easiest mathematical topic to learn for it is the epitome of logical
rigour. But in another sense, geometry is the most difficult subject to teach. It relies heavily
on ingenuity. Solving a geometry problem is more often dependent on noticing something
that is difficult to see, perhaps more so than elsewhere. But perseverance is an excellent
character trait for problem solving.
We shall assume you know all the basic theorems of plane geometry: congruent triangles,
similar triangles, basic circle theorems, cyclic quadrilaterals and special points of a triangle.
There are a number of pitfalls unique to geometry, into which many an inexperienced problem-
solver falls. They are incredibly easy to avoid but far too many fail to avoid them, because
they think these matters are not important.
Your chances of solving the problem are as good as your diagram. If you
don’t draw a good diagram in a geometry problem, unless it is very, very easy, you have
significantly lowered your chances of solving it.
Size matters. We admire the responsible sentiment of those who don’t want to use
up too much paper. But with a big accurate diagram, you’re likely to see things more
clearly and solve the problem. With a dozen small terrible diagrams you’re not going
to get anywhere. A good diagram should take up most of the page.
Accuracy is a virtue. If you draw freehand, you draw sloppily. If you draw accurately,
using ruler and compass, your diagram will be spot on the mark. And with a perfect
diagram, you can see very concretely the real situation the problem is asking about.
You can spot things, make guesses, get a feel for how all the points and lines relate, and
so on. Does one of your angles look like a right angle? Does one of your quadrilaterals
look cyclic? Do those three points look collinear? Your accurate diagram will give you
all sorts of suggestions, if you look hard enough at it, and you can try and prove them.
4.0 Problems
1. Two parallel lines are tangent to a circle with centre O. A third line, also tangent to
the circle, meets the two parallel lines at A and B.
Prove that AO is perpendicular to OB.
2. Suppose that two circles are externally tangent at P. Let a common tangent touch the
circles at A and B.
Prove that triangle APB is right-angled.
3. Let ABC be a triangle with incentre I. Suppose that X is the midpoint of the arc BC
not containing A on the circumcircle of triangle ABC.
Prove that X is the circumcentre of triangle BIC.
4. Let D, E, F be points on sides AB, BC, CA of triangle ABC such that DE = BE
and FE = CE.
Prove that the circumcentre of triangle ADF lies on the bisector of ∠DEF .
5. Given a triangle ABC, let the median from vertex A intersect the circumcircle of the
triangle again at K. A circle Γ passes through A and B such that BC is tangent to Γ.
Let L be the intersection of AK and Γ different from A.
Prove that BLCK is a parallelogram.
6. Let two circles intersect at A and B. Suppose that a common tangent to the two circles
meets them at P and Q.
If the line AB meets PQ at M, show that M is the midpoint of PQ.
7. ABC is a triangle, right-angled at C. The internal angle bisectors of angle BAC and
angle ABC meet BC and CA at P and Q, respectively. Let M and N be the feet of
the perpendiculars from P and Q to AB, respectively.
Find the size of ∠MCN.
8. Two circles $\Gamma_1$ and $\Gamma_2$ intersect at A and B. It is known that $\Gamma_2$ passes through point
O, where O is the centre of $\Gamma_1$. Point X lies on the arc AOB, and AX intersects $\Gamma_1$
again at Y.
Prove that XY = XB.
9. Let AD be an altitude and H be the orthocentre of △ABC. Let line AD meet the
circumcircle of △ABC at X.
(a) Prove that HD = DX.
(b) Prove that the circumradius of the triangles formed by H and any two vertices is
equal to the circumradius of △ABC.
10. Two circles of equal radius intersect at A and B. The point C is on one of the circles
such that B is the midpoint of the arc AC.
Prove that AC is tangent to the other circle.
11. Two circles $C_1$ and $C_2$ intersect at A and B. Let P be a point on $C_1$ and Q be a point
on $C_2$, so that P and Q lie on opposite sides of the line AB. Suppose further that
∠APB + ∠AQB = 90°.
Prove that if $O_1$ is the centre of $C_1$ and $O_2$ is the centre of $C_2$, then the triangles $O_1 A O_2$
and $O_1 B O_2$ are right-angled.
12. Let ABCD be a quadrilateral such that CD bisects ∠ACB. Suppose that
21. The bisectors of the angles A and B of the triangle ABC meet the sides BC and CA
at the points D and E, respectively.
Assuming that AE + BD = AB, determine the size of ∠C.
22. Suppose that P is a point inside triangle ABC that satisfies ∠ABP = ∠ACP and
∠CBP = ∠CAP .
Prove that P is the orthocentre of the triangle.
23. Let P be an interior point of △ABC, and let AP, BP and CP intersect BC, CA and
AB at D, E and F, respectively.
Prove that
\[ \frac{AP}{AD} + \frac{BP}{BE} + \frac{CP}{CF} = 2 \quad\text{and}\quad \frac{AE}{EC} + \frac{AF}{FB} = \frac{AP}{PD}. \]
24. Convex quadrilateral PQRS satisfies PQ = QR and PQ is not parallel to SR. The
diagonals PR and QS intersect at T. The perpendicular bisectors of PR and QS
intersect at V.
Prove that PQTV is cyclic.
25. Triangle ABC satisfies ∠ABC = 2∠ACB. Point P, located inside the triangle, satisfies
PB = PC and AP = AB.
Prove that ∠BAC = 3∠PAC.
26. A triangle ABC satisfies ∠ACB > ∠ABC. The internal bisector of ∠BAC meets BC
at D. The point E on AB is such that ∠EDB = 90°. The point F on AC is such that
∠BED = ∠DEF.
Show that ∠BAD = ∠FDC.
27. Let A, B and C be three collinear points with B between A and C. Equilateral triangles
ABD, BCE and CAF are constructed with D and E on one side of the line AC and
F on the opposite side.
(a) Prove that the centroids of the triangles are the vertices of an equilateral triangle.
(b) Prove that the centroid of this triangle lies on the line AC.
28. Let ABC be an acute triangle. Let AD be the altitude on BC, and let H be any
interior point on AD. Lines BH and CH, when extended, intersect AC and AB at E
and F, respectively.
Prove that ∠EDH = ∠FDH.
29. Let C be a circle with centre O and let A and B be points on the circle such that
∠AOB = 90°. Let $C_1$ and $C_2$ be two circles internally tangent to C at points A and B,
respectively. Furthermore $C_1$ and $C_2$ are tangent to each other and have centres $O_1$
and $O_2$, respectively. Circle $C_3$ is located inside angle AOB. It has centre $O_3$ and is
externally tangent to $C_1$ and $C_2$ and is internally tangent to C.
Prove that $O O_1 O_3 O_2$ is a rectangle.
30. Let ABC be an acute triangle. Let M be the midpoint of BC and P be the point on
AM such that MB = MP. Let H be the foot of the perpendicular from P to BC.
The lines through H perpendicular to PB and PC meet AB and AC at Q and R,
respectively.
Show that BC is tangent to the circle through Q, H and R at H.
31. Triangle ABC is acute-angled with circumcircle Γ and orthocentre H so that AB ≠ AC.
Let AH meet BC and Γ at D and E, respectively. Let F be the midpoint of BC. The
line tangent to circle DEF at D meets the lines AB and AC at M and L, respectively.
Prove that MD = DL.
32. Let ABC be a triangle with a right angle at A and area Δ, and let S be its circumcircle.
Let $S_1$ be the circle tangent to sides AB and AC and internally tangent to S. Let $S_2$
be the circle tangent to rays AB and AC and externally tangent to S. Let $r_1$ and $r_2$
denote the respective radii of $S_1$ and $S_2$.
Prove that $r_1 r_2 = 4\Delta$.
33. Angle A is the smallest in triangle ABC. The points B and C divide the circumcircle
of the triangle into two arcs. Let U be an interior point of the arc between B and C
which does not contain A. The perpendicular bisectors of AB and AC meet the line
AU at V and W, respectively. The lines BV and CW meet at T.
Show that AU = TB + TC.
34. Points X, Y and Z are located inside triangle ABC and satisfy
\[ \angle YAC = \angle ZAB = \tfrac{1}{3}\angle A \]
\[ \angle ZBA = \angle XBC = \tfrac{1}{3}\angle B \]
\[ \angle XCB = \angle YCA = \tfrac{1}{3}\angle C. \]
35. Let ω be the incircle of triangle ABC. Let L, N and E be the points of tangency of
ω with the sides AB, BC and CA, respectively. Lines LE and BC intersect at the
point H and lines LN and AC intersect at the point J. (All the points H, J, N and E
lie on the same side of the line AB.) Let O and P be the midpoints of EJ and NH,
respectively.
Prove that
\[ \text{Area}(HJNE) = 4\sqrt{\text{Area}(ABOP) \times \text{Area}(COP)}. \]
36. Let ABCD be a convex quadrilateral in which the diagonals AC and BD are perpendic-
ular and the opposite sides AB and DC are not parallel. The perpendicular bisectors
of AB and CD meet at point P inside ABCD.
Prove that ABCD is cyclic if and only if triangles ABP and CDP have equal areas.
37. Consider five points A, B, C, D and E such that ABCD is a parallelogram and BCED
is a cyclic quadrilateral. Let ℓ be a line passing through A. Suppose that ℓ intersects
the interior of the segment DC at F and intersects line BC at G. Suppose also that
EF = EG = EC.
Prove that ℓ is the bisector of angle DAB.
38. Let ABCD be a convex quadrilateral which does not have any two sides of equal length.
Prove that ABCD is cyclic if and only if there exist points Q and R on line BD, one
strictly between B and D, the other outside of the segment BD, such that
39. A circle with centre O passes through the vertices A and C of triangle ABC and
intersects the segments AB and BC again at distinct points K and N, respectively.
The circumscribed circles of the triangles ABC and KBN intersect at exactly two
distinct points B and M.
Prove that angle OMB is a right angle.
4.1 Angle chasing 59
We will now see how angle chasing can make light work of a problem.
Solution
[Diagram: points A, B, C, D]
Theorem
Even problems that don’t seem to involve any circles often feature cyclic quadrilaterals.
It’s worth remembering that discovering a cyclic quadrilateral is a valuable find because it
gives you new information that you could not otherwise obtain by standard angle-chasing
techniques.
Problem In triangle ABC, points D and E are located on the side BC such that AD is an
altitude and AE is an angle bisector. The point M on AE is such that BM is perpendicular
to AE and the point N on AC is such that EN is perpendicular to AC.
Prove that the points D, M , N are collinear.
Solution
[Diagram: triangle ABC with D and E on side BC]
With three right angles floating around the diagram, you can be confident that there are also
cyclic quadrilaterals. In fact, we know that ABDM is cyclic because ∠ADB = ∠AM B = 90◦ .
We also know that ADEN is cyclic because ∠ADE + ∠AN E = 90◦ + 90◦ = 180◦ .
Now there are many ways to show that the points D, M , N are collinear, but our particular
approach will be to prove that ∠BDM + ∠N DC = 180◦ .
As with many plane geometry problems, we start by labelling a sensible angle. Here we will
use ∠BAC = 2α. Then we use this to label as many other angles in the diagram as possible.
For a start, we have ∠BAE = ∠CAE = α.
The cyclic quadrilateral ABDM tells us that
∠BDM = 180◦ − ∠BAM = 180◦ − ∠BAE = 180◦ − α.
The cyclic quadrilateral ADEN tells us that
∠N DC = ∠N DE = ∠N AE = α.
Therefore, ∠BDM + ∠N DC = (180◦ − α) + α = 180◦ , as required.
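If you would like some reassurance before (or after) hunting for a proof, a quick numerical check is easy to set up. The sketch below is not part of the solution: it places the triangle in coordinates and measures how far D, M and N are from being collinear. The helper names foot and collinearity_defect are our own, and E is located using the angle bisector theorem BE : EC = AB : AC, a standard fact that the solution above does not need.

```python
import math


def foot(p, u, v):
    """Foot of the perpendicular from point p to the line through u and v."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    t = ((p[0] - u[0]) * dx + (p[1] - u[1]) * dy) / (dx * dx + dy * dy)
    return (u[0] + t * dx, u[1] + t * dy)


def collinearity_defect(A, B, C):
    """Cross product DM x DN; it is zero exactly when D, M, N are collinear."""
    b = math.dist(A, C)
    c = math.dist(A, B)
    # Assumption: E divides BC with BE : EC = c : b (angle bisector theorem).
    E = ((b * B[0] + c * C[0]) / (b + c), (b * B[1] + c * C[1]) / (b + c))
    D = foot(A, B, C)  # AD is an altitude
    M = foot(B, A, E)  # BM is perpendicular to AE
    N = foot(E, A, C)  # EN is perpendicular to AC
    return (M[0] - D[0]) * (N[1] - D[1]) - (M[1] - D[1]) * (N[0] - D[0])
```

For example, collinearity_defect((1, 3), (0, 0), (5, 0)) comes out as zero up to rounding error. Of course, a check like this proves nothing; it only tells you the statement is worth proving.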
Problem Let ABCD be a square and P be a point on its side BC. The circle passing
through points A, B and P intersects BD once more at point Q. The circle passing through
points C, P and Q intersects BD once more at point R.
Prove that points A, R and P are collinear.
Solution First of all draw the entire diagram carefully. But after that it pays to draw just
a part of the diagram and see what we can glean. We will examine the situation with each
circle separately. So first let’s just examine the set-up with circle ABP Q.
[Diagrams: square ABCD with P on BC; the same square with Q on BD and circle ABPQ, with angles of 45° and α marked]
Some of the other angles in the diagram are ∠BQP = ∠BAP = 90◦ −α and ∠QP C = 135◦ −α.
Well, that’s all pretty interesting. Now let’s think about drawing in the circle around C, P, Q.
Actually, let’s just think about drawing in CQ—one step at a time!
[Diagram: square ABCD with P on BC, Q on BD and segment CQ drawn, with angles of 45° and α marked]
When you only look at this part of the diagram, it’s pretty clear that by some sort of symmetry,
AQ = CQ. You could prove this by saying that C is the reflection of A in the line BD, and
Q is on BD. Or you could prove ADQ and CDQ are congruent. Anyway, we now have
AQ = CQ = P Q. So CP Q is isosceles and ∠QCP = ∠QP C = 135◦ − α.
Now we’ll put the second circle in.
[Diagram: square ABCD with P on BC, Q on BD and the second circle drawn]
1. Centroid Let ABC be a triangle with medians AX, BY and CZ meeting at centroid G.
Prove that the line segments XY , Y Z and ZX divide triangle ABC into four
smaller triangles which are congruent to each other.
2 If you don’t know about centroids, orthocentres, and so on, you can look them up on the internet.
4.4 Triangle centres 63
The triangle XYZ is called the medial triangle. Prove that it can be obtained by
performing a dilation³ on triangle ABC with centre G and factor −1/2.
Prove that G is the centroid of the medial triangle.
Prove that the three medians of a triangle divide it into six triangles of equal area.
Prove that
AG/GX = BG/GY = CG/GZ = 2.
2. Orthocentre Let ABC be a triangle with altitudes AD, BE and CF meeting at
orthocentre H.
Draw triangle ABC, the three altitudes AD, BE, CF and triangle DEF .
Write down all six cyclic quadrilaterals which appear in the diagram.
Label every angle in the diagram in terms of ∠A, ∠B and ∠C.
What is the relationship between H and triangle DEF ?
What are the orthocentres of triangles AHB, BHC and CHA?
Draw triangle ABC, the perpendicular bisectors of the sides XO, Y O, ZO and
the medial triangle XY Z.
Write down all three cyclic quadrilaterals appearing in your diagram.
Label every angle in the diagram in terms of ∠A, ∠B and ∠C.
What is the relationship between O and triangle XY Z?
4. Incentre Let the incircle of triangle ABC have centre I and touch AB, BC and CA
at P , Q and R, respectively.
Draw triangle ABC, the angle bisectors AI, BI, CI, the points of tangency of the
incircle P , Q, R, the segments P I, QI, RI and triangle P QR.
Write down all three cyclic quadrilaterals appearing in your diagram.
Label every angle in the diagram in terms of ∠A, ∠B and ∠C.
Label the lengths AQ, AR, BR, BP , CP and CQ in terms of the side lengths a,
b and c of the triangle.
Prove that the area of triangle ABC is given by rs, where r is the radius of the
incircle and s = (a + b + c)/2 is the semiperimeter.
Draw triangle ABC, the internal angle bisector AIa , the external angle bisectors
BIa , CIa , the points of tangency of the excircle P , Q, R, the segments P Ia , QIa ,
RIa and triangle P QR.
Write down all three cyclic quadrilaterals appearing in your diagram.
Label every angle in the diagram in terms of ∠A, ∠B and ∠C.
3 See section 7.3.
Label the lengths AQ, AR, BR, BP , CP and CQ in terms of the side lengths a,
b and c of the triangle.
Prove that the area of triangle ABC is given by ra (s − a), where ra is the radius
of the excircle opposite A and s is the semiperimeter.
What is the relationship between I and triangle Ia Ib Ic ?
What is the relationship between triangle Ia Ib Ic and the circumcircle of triangle
ABC?
4.5 Constructions
There’s a simple rule of thumb for deciding whether a geometry question is hard, or really
hard. If it can be solved without drawing any extra lines beyond the ones you’re given (and
maybe a couple more natural ones), it’s not that hard. If it requires you to make some
constructions of your own, it’s really hard.
Again, nobody can teach you to look at a diagram and say where to make a construction—this
is a truly creative business. But by doing enough problems it’s possible to get a feel for how a
certain situation works, enough that you can try inserting various extra bits, here and there.
Don’t be afraid to do so: if it doesn’t work, or your diagram becomes too cluttered, just draw
another one!
Problem Suppose that A, B, M are points on a circle such that M is the midpoint of the
arc AB. Let C be an arbitrary point on the arc AM B such that AC is longer than BC.
Let D be the foot of the perpendicular from M to AC.
Prove that AD = DC + CB.
Solution
[Diagram: circle through A, B and M, with P on the extension of AC and the angle x marked]
A much easier task than proving one length is equal to the sum of two lengths is proving that
one length is equal to another. With this in mind, we extend the line AC to the point P such
that CP = CB. Of course, what we now need to prove is that AD = DP .
But if AD = DP , then we would know that M lies on the perpendicular bisector of AP .
Since M also lies on the perpendicular bisector of AB, it must be the case that M is the
circumcentre of triangle ABP . Let’s aim to prove this using an angle chase.
First, we let ∠AP B = x. Since we have constructed triangle BCP to be isosceles, we know
that ∠P BC = x and ∠P CB = 180◦ − 2x. From this, it follows that ∠ACB = 2x and since
ABCM is a cyclic quadrilateral, we also have ∠AM B = 2x.
Now look at what we have here. The chord AB subtends an angle 2x at M with AM = BM
and an angle x at P . Since P and M lie on the same side of AB, the point M is indeed the
circumcentre of triangle ABP . We now know that M D splits the isosceles triangle AM P
into two congruent triangles, so AD = DP .
By the way, as with most geometry problems, different solutions can be found. For the one
just discussed, it is possible to define a point Q on segment AD such that CD = DQ. Thus
it remains to show that AQ = BC. This can be done by proving that triangles AM Q and
BM C are congruent. See if you can angle chase out the details!
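As a sanity check on the statement AD = DC + CB, here is a small numerical sketch on the unit circle. The parametrisation is our own choice and not part of the solution: M sits at angle π/2, the points A and B at π/2 ± w, and C at π/2 − u with 0 < u < w < π, so that C lies on the arc between M and B and AC is longer than BC.

```python
import math


def broken_chord_defect(w, u):
    """Return AD - (DC + CB) for the configuration on the unit circle.

    Assumes 0 < u < w < pi (our own parametrisation of the problem).
    """
    def on_circle(theta):
        return (math.cos(theta), math.sin(theta))

    M = on_circle(math.pi / 2)
    A = on_circle(math.pi / 2 + w)
    B = on_circle(math.pi / 2 - w)
    C = on_circle(math.pi / 2 - u)
    # D is the foot of the perpendicular from M to the line AC.
    dx, dy = C[0] - A[0], C[1] - A[1]
    t = ((M[0] - A[0]) * dx + (M[1] - A[1]) * dy) / (dx * dx + dy * dy)
    D = (A[0] + t * dx, A[1] + t * dy)
    return math.dist(A, D) - (math.dist(D, C) + math.dist(C, B))
```

Trying a few admissible pairs such as (w, u) = (2.0, 0.7) should give a defect of zero up to rounding.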
Problem A triangle ABC has squares P ABK and QACL constructed on its exterior. The
altitude AD of triangle ABC is extended to meet P Q at point M .
Prove that M is the midpoint of P Q.
Solution
[Diagram: triangle ABC with squares PABK and QACL, altitude AD, and the points R and S on line MD]
First, let’s do a bit of angle chasing. If we let ∠CAD = α and ∠BAD = β, then we have
∠ACD = 90◦ − α and ∠ABD = 90◦ − β. Also, since angles on a straight line add to 180◦ ,
we have ∠QAM = 90◦ − α and ∠P AM = 90◦ − β.
There are several ways to proceed from here, but one way is as follows. Note that QA = AC
and ∠QAM = ∠ACD. These are quite similar situations, and by drawing a single line
segment, we can create a pair of congruent triangles—certainly a useful thing to do! So let R
be the foot of the perpendicular from Q to the line M D. This gives us the congruent triangles
QAR and ACD, as desired. But what we have done on the right side of the diagram, we can
similarly do on the left. So let S be the foot of the perpendicular from P to the line M D, so
that we have congruent triangles P AS and ABD. In particular, we have managed to prove
that QR = AD = P S.
Therefore, M is horizontally halfway between P and Q. This means that M must be the
midpoint of P Q, and we are done.
(You could make this last statement a little more rigorous by showing that triangle M QR is
congruent to triangle M P S, so that M P = M Q.)
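Complex numbers make it easy to test this result numerically, since multiplying by i rotates a vector through 90°. The sketch below is only a check, not a proof, and it rests on an assumption of ours: the vertices A, B, C are given as complex numbers in anticlockwise order, so that the formulas for P and Q place the squares outside the triangle. The function name is hypothetical.

```python
def squares_midpoint_check(A, B, C):
    """Return (P, Q, mid, dot) for triangle ABC given as complex numbers.

    Assumes A, B, C are in anticlockwise order, so the squares are exterior.
    dot is the dot product of (mid - A) with (C - B); it vanishes exactly
    when the midpoint of PQ lies on the altitude through A.
    """
    P = A - 1j * (B - A)  # AP is AB rotated through 90 degrees clockwise
    Q = A + 1j * (C - A)  # AQ is AC rotated through 90 degrees anticlockwise
    mid = (P + Q) / 2
    v = mid - A
    w = C - B
    dot = v.real * w.real + v.imag * w.imag
    return P, Q, mid, dot
```

In fact the formulas show that mid − A = i(C − B)/2, which is a one-line complex number proof of the result.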
Problem In rectangle ABCD, let M and N be the midpoints of BC and CD, respectively.
Let DM and BN intersect at P .
Prove that ∠M AN = ∠BP M .
Solution We use the symmetry of the rectangle to redraw the situation near A, over
near B. That is, let Q be the midpoint of AD and construct BQ. Then, by symmetry,
∠QBN = ∠M AN .
[Diagram: rectangle ABCD with Q the midpoint of AD and M the midpoint of BC]
All that we need to prove now is that BQ and M D are parallel. But once we observe that
BM and QD are equal and parallel, it follows that BQDM is a parallelogram and so BQ is
parallel to M D as desired.
There is one diagram that is particularly exciting! Take a triangle and its circumcircle. Draw
its altitudes and extend them to the circumcircle. Now, as an exercise, try to discover as
many interesting properties of this diagram as you can.
Here are a couple of such properties that you should try to prove for yourself.
Problem With notation as above, let P be a point on the circumcircle of triangle ABC.
Suppose that A″ is the intersection of lines PA′ and BC, B″ is the intersection of lines PB′
and CA, and C″ is the intersection of lines PC′ and AB.
Prove that the points A″, B″ and C″ are collinear.
Solution Try drawing the diagram yourself. It looks quite complicated, unless it is large and
multicoloured. A carefully drawn diagram suggests that H also lies on the line through A″,
B″ and C″. If this were true, then it suffices to prove that any two of A″, B″, C″ are collinear
with H. With this tactic in mind let’s try and prove that B″, C″ and H are collinear. For
this, it is enough to prove that ∠BHC″ = ∠B′HB″.
[Diagram: triangle ABC with orthocentre H; the points A′, B′, C′ and P on the circumcircle; and the points A″, B″, C″]
Actually, we are not done because we have run into diagram dependence issues. Although the
argument that A″, B″ and H are collinear is similar to the argument that B″, C″ and H
are collinear, it is only similarish. There are some differences in the angle chase. Finally, the
whole argument depended on the particular diagram drawn and the position of P relative to
the other points on the circumcircle. All these issues need to be ironed out. See if you can do
it!⁵
As a final kicker, there is a really short solution to this problem involving Pascal’s theorem.⁶
See if you can find it.
Problem Point O lies inside square ABCD such that ∠OAB = ∠OBA = 15◦ .
Prove that triangle ODC is equilateral.
[Diagrams: square ABCD with the point O and angles of 75° marked; square ABCD with the point E and angles of 60° marked]
Let E be the point inside ABCD such that EDC is equilateral. Then we have a diagram
which looks the same, but by our assumption we know a different set of angles, and lots of
lengths are equal. We now aim to show that ∠EAB = ∠EBA = 15◦ , so that E and O are
the same point. (This follows since there is only one possible point O inside ABCD satisfying
the conditions ∠OAB = ∠OBA = 15◦ .)
As triangle CDE is equilateral we have CE = CD = CB. So triangle CBE is isosceles.
But since ∠BCE = 30◦ we have ∠CEB = ∠CBE = 75◦ and so ∠EBA = 15◦ . Similarly,
∠EAB = 15◦ , as desired. Therefore O = E and triangle ODC = EDC is equilateral.
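The conclusion is easy to confirm numerically. In the sketch below we assume a unit square with A = (0, 0), B = (1, 0), C = (1, 1) and D = (0, 1); these coordinates are our own choice. The condition ∠OAB = ∠OBA = 15° then forces O = (1/2, (1/2) tan 15°).

```python
import math


def side_lengths_ODC():
    """Side lengths OD, DC, CO in a unit square (coordinates are our assumption)."""
    O = (0.5, 0.5 * math.tan(math.radians(15)))
    C, D = (1.0, 1.0), (0.0, 1.0)
    return math.dist(O, D), math.dist(D, C), math.dist(C, O)
```

All three lengths come out equal to the side of the square, up to rounding, which is what the proof above predicts.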
5 For help on how to do this you might like to consult section 17.3.
6 See section 6.9.
4.9 Trigonometry
Sine rule Let ABC be a triangle with side lengths a = BC, b = CA, c = AB and
circumradius R. Then we have
a/sin A = b/sin B = c/sin C = 2R.
Solution The sine rule automatically gives us the value of AO in terms of the side lengths
and angles of triangle ABC. The natural approach is to use trigonometry to express the
length AH in terms of the side lengths and angles of triangle ABC.
As we discussed earlier in this chapter, any angles involving the orthocentre, the feet of the
altitudes, and the vertices of the triangle can easily be found in terms of ∠A, ∠B and ∠C.
So our task shouldn’t be too difficult!
[Diagram: triangle ABC with circumcentre O, orthocentre H and feet of the altitudes D, E, F]
Let D, E and F be the feet of the altitudes as per the diagram. Considering triangle AEH,
we obtain
AE/AH = sin ∠C ⇒ AH = AE/sin ∠C.
Considering triangle AEB, we obtain
AE/AB = cos ∠A ⇒ AE = AB cos ∠A.
Putting these two pieces of information together and invoking the sine rule yet again, we end
up with
AH = (AB cos ∠A)/sin ∠C = 2R cos ∠A = 2AO cos ∠A.
Given that AH = AO, we can divide both sides by AO to give cos ∠A = 1/2, which in turn implies
that ∠A = 60◦.
It turns out that if the triangle is not restricted to being acute, then there is another value of
∠A for which AH = AO. See if you can find (and prove) what it is.
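The identity AH = 2R cos ∠A for an acute triangle can be tested numerically. The sketch below finds H by intersecting two altitudes, and computes R and cos ∠A from the side lengths. It uses two standard facts that are not stated in this section and are assumptions of the sketch: R = abc/(4 × area), and the cosine rule. The function name is our own.

```python
import math


def ah_and_2r_cos_a(A, B, C):
    """Return (AH, 2R cos A) for a triangle given by coordinate pairs."""
    # H solves (H - A).(C - B) = 0 and (H - B).(A - C) = 0 (two altitudes).
    a1, b1 = C[0] - B[0], C[1] - B[1]
    c1 = a1 * A[0] + b1 * A[1]
    a2, b2 = A[0] - C[0], A[1] - C[1]
    c2 = a2 * B[0] + b2 * B[1]
    det = a1 * b2 - a2 * b1
    H = ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    area = abs((B[0] - A[0]) * (C[1] - A[1]) - (B[1] - A[1]) * (C[0] - A[0])) / 2
    R = a * b * c / (4 * area)                # assumed formula for the circumradius
    cos_A = (b * b + c * c - a * a) / (2 * b * c)  # cosine rule
    return math.dist(A, H), 2 * R * cos_A
```

For the acute triangle A = (1, 3), B = (0, 0), C = (5, 0) both values agree (they equal 5/3).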
4.10 Areas
One simple approach to geometry problems, often neglected, is to consider areas. This means
more than remembering the formulas for the area of a triangle
∆ = (1/2) × base × height and ∆ = (1/2) ab sin C.
The following ideas are useful.
Area properties
Two triangles with the same base length and the same height have equal area.
If two similar figures have corresponding lengths in the ratio a : b, then their areas are
in the ratio a² : b².
The following problem is a variation on these ideas.
Problem Let ABCD be a quadrilateral. Let the midpoints of AB, BC, CD and DA be E,
F , G and H, respectively.
Prove that EF GH has half the area of ABCD.
Solution The first thing to notice is that EFGH is a parallelogram. Why? By the midpoint
theorem, both EF and GH are parallel to AC and half its length. Therefore EF ∥ GH and
EF = GH. Similarly FG = HE and FG ∥ HE. This is a commonly used fact. Now the
diagram looks nicer!
[Diagram: quadrilateral ABCD with the midpoints E, F, G, H of its sides joined to form EFGH]
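Each of the four corner triangles is similar to a triangle cut off by a diagonal of ABCD, with
corresponding lengths in the ratio 1 : 2, so
|AHE| = (1/4)|ADB|, |CFG| = (1/4)|CBD|, |BEF| = (1/4)|BAC|, |DGH| = (1/4)|DAC|.
Note that
|ADB| + |CBD| = |BAC| + |DAC| = |ABCD|.
Adding together the equations above gives
|AHE| + |CFG| + |BEF| + |DGH| = 2|ABCD|/4 = |ABCD|/2.
So these four triangles give half the area of ABCD. Now EFGH is what you get if you
remove these four triangles from ABCD. So |EFGH| = (1/2)|ABCD| as required.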
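The result is easy to test on a computer. The sketch below uses the shoelace formula for the area of a polygon, a standard formula that is not used in the solution above; the function names are our own.

```python
def shoelace_area(points):
    """Area of a simple polygon whose vertices are listed in order."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def midpoint_polygon(points):
    """Polygon formed by the midpoints of consecutive sides."""
    n = len(points)
    return [
        ((points[k][0] + points[(k + 1) % n][0]) / 2,
         (points[k][1] + points[(k + 1) % n][1]) / 2)
        for k in range(n)
    ]
```

For the quadrilateral with vertices (0, 0), (4, 0), (5, 3), (1, 4) the two areas are 14.5 and 7.25.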
Problem Suppose that each of the three main diagonals AD, BE and CF divide the convex
hexagon ABCDEF into two regions of equal area.
Prove that the three diagonals meet at a common point.
Solution Label the intersection points and lengths along the diagonals as shown. (The
diagram has been drawn assuming that the diagonals don’t meet at a common point.)
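[Diagram: convex hexagon ABCDEF whose main diagonals form a small triangle GHI; the lengths a, b, c, d, e, f are marked along the outer parts of the diagonals and g, h, i along the sides of triangle GHI]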
We want to show that the three points G, H and I coincide. Our strategy is to prove that
g = h = i = 0.
Each of the main diagonals divides the hexagon into two pieces of equal area. It follows
that |ABCD| = |DEFA| = |BCDE| = |EFAB| = |CDEF| = |FABC|.
Now ABCD and BCDE overlap along BCDH, so we have
|ABH| = |DEH|.
Therefore,
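(1/2) ab sin ∠AHB = (1/2)(d + g)(e + i) sin ∠DHE,
and since the angles are equal, this reduces to
ab = (d + g)(e + i).
Analogous equations hold at the other two intersection points. Multiplying the three
equations together, the left-hand side is abcdef, while on the right-hand side each of
a, b, c, d, e, f appears increased by one of g, h, i.
All the quantities involved are non-negative, so this is ridiculous unless we have g = h = i = 0.
Hence AD, BE and CF must meet at a point.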
Note that this solution is diagram dependent. There is one other way to draw the diagram.
Specifically H could lie inside quadrilateral CDEF (instead of F ABC as in the diagram).
But this case can be handled in much the same way.
Problem Suppose that N is the midpoint of the side BC of triangle ABC. Construct
right isosceles triangles AMB and APC on sides AB and AC outside the triangle, where
∠AMB = ∠APC = 90◦.
Prove that M N P is also a right isosceles triangle.
It’s hard to relate points from one of the isosceles triangles to the other.
We are used to drawing squares, not isosceles triangles, on the sides of triangles, in
some proofs of Pythagoras’ theorem. This is a situation with lots of nice things going
on.
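[Diagram: triangle ABC with squares drawn on the sides AB and AC, whose centres are M and P; N is the midpoint of BC]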
We see that M and P are the centres of these two squares. In fact, since BM = (1/2)BG
and BN = (1/2)BC, triangles BMN and BGC are similar. So, MN = (1/2)GC, and MN ∥ GC.
Similarly, NP = (1/2)BE and NP ∥ BE. Thus it suffices to show that GC = BE and GC ⊥ BE.
These facts are some of the nice properties of this diagram that you may already know from
elsewhere. Either way, note that ∠GAC = ∠GAB + ∠BAC = ∠EAC + ∠BAC = ∠BAE
and GA = BA and AC = AE. Therefore, triangles GAC and BAE are congruent. (In fact
one is a rotation of the other about A by 90◦ .) It follows that GC = BE and GC ⊥ BE as
required.
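Here is a short numerical check using complex numbers, where multiplication by i is a rotation through 90°. It is a sketch resting on an assumption of ours: A, B, C are listed in anticlockwise order, so that the region to the right of the directed segments A to B and C to A lies outside the triangle. The helper names are hypothetical.

```python
def apex_outside(U, V):
    """Apex W of the right isosceles triangle on UV with the right angle at W.

    W lies to the right of the directed segment from U to V.
    """
    return (U + V) / 2 - 1j * (V - U) / 2


def right_isosceles_defect(A, B, C):
    """|(M - N) - i(P - N)|, which is zero when NM is NP rotated by 90 degrees.

    Assumes A, B, C are complex numbers in anticlockwise order.
    """
    M = apex_outside(A, B)
    P = apex_outside(C, A)
    N = (B + C) / 2
    return abs((M - N) - 1j * (P - N))
```

A defect of zero says that NM is obtained from NP by a quarter turn, so NM = NP and ∠MNP = 90°.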
4.12 Create beautiful pictures 73
Problem Let ABD be a triangle and let C be a point on the side BD, lying strictly between
B and D. Suppose that BC = 2CD, ∠ACB = 60◦ and ∠ADC = 45◦ .
Determine ∠BAD.
Solution Simple techniques will not suffice here. Trigonometry can provide a solution but
we’ll avoid it, and take the artistic approach.
We get the feeling that there might be a construction which illuminates the diagram. We
hope to discover something we didn’t know before about the diagram. For instance, we might
discover that some point is a centroid, or incentre, or another well-known point. We might
discover a cyclic quadrilateral, a tangent line, or something else useful.
Since DC : CB = 1 : 2, it might be worthwhile to make a construction so that C will be a
centroid of some triangle. But that does not seem to work, and so we will instead try to
make C the incentre of a triangle. Incentres, being the intersection of angle bisectors, always
give us lots of information about angles, and information about ratios, which is what we want.
So, paint a picture where C is the incentre of some triangle. A little angle chasing gives
∠DAC = 15◦ . So construct a point E such that ∠DAE = 30◦ and ∠ADE = 90◦ . Then C is
the incentre of triangle AED. Furthermore, AED is a 30◦ -60◦ -90◦ triangle. Label points as
shown with H being the intersection of EC and AD. Note that EHD is also a 30◦ -60◦ -90◦
triangle.
[Diagram: triangle ABD with C on BD and angles of 15°, 60° and 45° marked]
Each configuration is accompanied by one, two or three stars. This is an estimated rating
of how difficult the result is to prove using geometric methods.¹ More stars mean that it is
more difficult to prove.
The configurations are purposely kept minimal since the idea is to be able to recognise the
minimal elements needed to infer something. The resulting incidence is generally indicated by
fattening and colouring the points or curves involved. Thus four fat points usually indicate
that the four points are concyclic and so on.
This chapter is clearly very different to all the other chapters. Try to become very familiar
with it!
1 That is, without resorting to grubby computational methods such as trigonometry, complex numbers or
coordinate geometry!
76 5 Important configurations in geometry
AB ≠ AC ⇒ ABCD is cyclic.
AB = AC ⇒ ABCD is a kite.
[Diagram: points A, B, D, C, with the two angles at A equal to α and DB = DC = a]
A2 Pivot theorem ★
[Diagram]
Variations: tangents²
2 In both these variations, two points are allowed to coalesce into a single point resulting in a tangent incidence
rather than two distinct intersection points.
Variations
A4 Similar switch ★
[Diagrams: segments AB and CD together with the point P, drawn twice]
There is a spiral symmetry about P which sends segment AB to segment CD if and only if
there is a spiral symmetry about P which sends segment AC to segment BD.
The above configuration is often seen as a byproduct of two intersecting circles. Point P is
the centre of each spiral symmetry.
[Diagram: points A, B, C and P]
(Compare this with the four lines and four circles diagram in the C-List.)
B2 Perpendicularity ★
AC ⊥ BD ⇔ AB² + CD² = AD² + BC².
[Diagrams: the three cases, namely a convex quadrilateral, a triangle and a non-convex quadrilateral]
[Diagram: points R, S, T and K, L, M]
I = incentre
Ia = excentre
DB = DC = DI = DIa .
[Diagram: points B, C and D with the incentre I and the excentre Ia]
B7 Simson line ★
[Diagram]
H = orthocentre
Simson line bisects P H.
[Diagram: orthocentre H and two equal segments marked x]
Pappus’ theorem
[Diagram]
Variations
AB + CD = BC + AD
[Diagrams: three quadrilaterals ABCD; the second and third are each labelled AB − CD = BC − AD]
C1 Nine-point circle ★
Midpoints of sides, feet of altitudes and midpoints of segments connecting vertices to
orthocentre are all concyclic.
[Diagram]
C2 Euler line ★
H = orthocentre
N = nine-point centre
G = centroid
O = circumcentre
[Diagram: H, N, G and O on a line, with HN : NG : GO = 3 : 1 : 2]
C4 Newton–Gauss line ★★
Midpoints of the diagonals of a complete quadrilateral³ are collinear.
3 A complete quadrilateral is the figure determined by four lines, no three of which are concurrent, and their
six points of intersection. The three pairs of points which are not already connected by the original four
lines determine the diagonals of the complete quadrilateral.
4 The symmedian is the reflection of the median about the angle bisector.
Diagonals concurrent ⇔ AB · CD · EF = BC · DE · F A.
[Diagram: hexagon ABCDEF]
if and only if
[Diagram: two segments of equal length m]
Variation: excentre
[Diagram: two segments of equal length m]
M = midpoint BC
X = midpoint AD
[Diagram: incentre I, points B, D, M, C on a line, and the point X]
Variation: excentre
M = midpoint BC
Y = midpoint ADa
[Diagram: excentre Ia, points B, M, C and Da, and the point Y]
M = midpoint BC
D = incircle contact point
Da = excircle contact point
[Diagrams: triangle ABC with M and Da on BC and the excentre Ia; triangle ABC with D and M on BC]
[Diagrams: configurations involving the incentre I and the excentre Ia]
P , B, D collinear ⇔ AB · CD = AD · BC.
[Diagram: points P, B and D]
5 A cyclic quadrilateral is said to be harmonic if the products of its opposite sides are equal.
[Diagram: configuration involving the incentre I and the excentre Ia]
6 A mixtilinear incircle is a circle which is internally tangent to a triangle’s circumcircle and two of its sides.
There are three mixtilinear incircles associated with any triangle. A mixtilinear excircle is similar but is
instead externally tangent to the triangle’s circumcircle.
6 Incidence geometry
An incidence basically describes a situation when something more specific occurs than might
otherwise be thought. For example, three lines are usually not concurrent, but sometimes
you may be asked to prove that three lines are concurrent. So ‘three lines concurrent’ is an
example of an incidence. Other examples include ‘three points collinear’ and ‘four points
concyclic’.
Closely related to incidences is the concept of locus. For our purposes, a locus is a set of
points satisfying some condition. We often think of a locus as the curve traced out by a
point following some rule. For example, a circle may be thought of as the locus of points
at a fixed distance from a given point. A line may be thought of as the locus of points at a
fixed distance from a given line and lying on a given side of the given line.
When asked to find the locus of a set of points it is important to know beforehand what the
answer is likely to be. This can be done by drawing a careful diagram and building up a
picture of the locus. Most loci tend to be lines or circles, but sometimes ellipses or even whole
regions can occur.
An example might be to find the locus of points P such that ∠AP B = 45◦ , where A and B
are given fixed points. We know that the angle subtended in a circle on one side of a chord
is constant, so we can see that the locus is the union of two circular arcs each of which is
three-quarters of a full circle.
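To see this locus concretely, take A = (−1, 0) and B = (1, 0); these coordinates, and the function names, are our own choice for illustration. One of the two arcs lies on the circle with centre (0, 1) and radius √2, and runs from B round to A through the top of the circle, covering 270° of it. The sketch below samples points on that arc and measures ∠APB at each one.

```python
import math


def angle_apb(P, A, B):
    """Angle APB in degrees."""
    ax, ay = A[0] - P[0], A[1] - P[1]
    bx, by = B[0] - P[0], B[1] - P[1]
    return math.degrees(math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by))


def angles_on_upper_arc(n):
    """Angle APB at n points spread along the arc above AB (assumed set-up)."""
    A, B = (-1.0, 0.0), (1.0, 0.0)
    centre, radius = (0.0, 1.0), math.sqrt(2)
    result = []
    for k in range(1, n + 1):
        # Directions from the centre strictly between -45 and 225 degrees.
        t = math.radians(-45 + 270 * k / (n + 1))
        P = (centre[0] + radius * math.cos(t), centre[1] + radius * math.sin(t))
        result.append(angle_apb(P, A, B))
    return result
```

Every sampled angle should be 45° up to rounding. The second arc is the mirror image of this one in the line AB.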
6.0 Problems
1. Show that the orthocentre and the circumcentre of a triangle are isogonal conjugates of
each other.1
3. Find the locus of points P such that for fixed points A and B,
AP/BP = k,
where k is a positive constant.
4. Let A, B, C and D be four distinct points on a line, in that order. The circles with
diameters AC and BD intersect at the points X and Y . The line XY meets BC at
the point Z. Let P be a point on the line XY different from Z. The line CP intersects
the circle with diameter AC at the points C and M , and the line BP intersects the
circle with diameter BD at the points B and N .
Prove that the lines AM , DN and XY are concurrent.
5. The segment AB is fixed and point M is a variable point on that segment. Squares SA
and SB are constructed on AM and M B and on the same side of AB.
Find the locus of the midpoint of the segment joining the centres of the two squares.
6. Let circles C1 and C2 intersect at A and B. The tangent to C1 at A meets C2 again
at P . The tangent to C2 at A meets C1 again at Q. Let M be the point on line AB
such that AB = BM .
Prove that A, P , M and Q are concyclic.
7. Let A be a fixed point on a fixed circle Γ. Let B and C be variable points on Γ and
let D be a point on BC such that AD²/(BD · DC) is a fixed constant.
Find the locus of D as B and C vary over Γ.
8. Of all triangles with given base length and height, which one has the largest inradius?
9. Points A, B and C range over the circumference of a fixed circle.
Find the locus of the incentre of triangle ABC.
10. Let K, L, M and N be four collinear points on the possibly extended sides AB, BC,
CD and DA, respectively, of quadrilateral ABCD.
Prove that
(AK/KB) · (BL/LC) · (CM/MD) · (DN/NA) = +1,
where this is an equation in directed lengths.²
11. Let ABCDEF be a cyclic convex hexagon with vertices labelled clockwise.
Prove that
AB · CD · EF = BC · DE · F A,
if and only if the diagonals AD, BE and CF are concurrent.
12. Let ABCD be a cyclic quadrilateral. Let A1 and C1 be the respective feet of the
perpendiculars from A and C to BD, and let B1 and D1 be the respective feet of the
perpendiculars from B and D to AC.
Prove that A1 B1 C1 D1 is cyclic.
13. Let ABCD be a quadrilateral whose diagonals AC and BD are perpendicular. Let P ,
Q, R and S be the midpoints of the sides, and let T , U , V and W be the feet of the
altitudes from these midpoints to the opposite sides.
Prove that P , Q, R, S, T , U , V and W all lie on a circle.
2 By this we mean that AK/KB is positive if K is between A and B, and negative otherwise.
2AB = AC + BC.
Prove that the midpoints of AC and BC and the incentre and circumcentre of triangle
ABC are concyclic.
15. Let M be the midpoint of the side AC of a triangle ABC and let H be the foot of
the altitude from B. Let P and Q be the orthogonal projections of A and C onto the
bisector of the angle at B.
Prove that the four points H, P , M and Q lie on the same circle.
16. Show that the tangents to the circumcircle of a triangle at two of its vertices meet on
the symmedian from the third vertex.
17. In an acute-angled triangle ABC let AD and BE be altitudes and let AP and BQ be
internal angle bisectors. Denote by I and O the incentre and the circumcentre of ABC,
respectively.
Prove that D, E and I are collinear if and only if P , Q and O are collinear.
18. Let ABC be an acute-angled triangle and let D, E and F be the feet of the perpendiculars
from A, B and C onto the sides BC, CA and AB, respectively. Let P , Q and R be
the feet of the perpendiculars from A, B and C onto the lines EF , F D and DE,
respectively.
Prove that the lines AP , BQ and CR are concurrent.
19. Let ABCD be a quadrilateral such that its opposite sides AB and CD meet at P and
its other opposite sides AD and BC meet at Q. Let X be the intersection of P Q and
CA. Let Y be the intersection of P Q and DB.
Show that
P X · QY = P Y · XQ.
20. Let ABCD be a given convex quadrilateral with sides BC and AD equal in length and
not parallel. Let E and F be variable points on the sides BC and AD, respectively,
and which satisfy BE = DF . The lines AC and BD meet at P , the lines BD and EF
meet at Q and the lines EF and AC meet at R. Consider all triangles P QR as E and
F vary as mentioned earlier.
Show that the circumcircles of all these triangles have a common point other than P .
21. Let I be the incentre of triangle ABC. A circle which is tangent to the circumcircle of
triangle ABC on the inside also touches CA and BC at D and E, respectively.
Show that I is the midpoint of DE.
22. Let ABCD be a convex quadrilateral that just happens to have both an incentre I and
a circumcentre O.3
Prove that the intersection of the diagonals of the quadrilateral lies on the line OI.
Problem Let ABC and ADE be similar triangles whose vertices are labelled clockwise. Let
P be the second common point of the circumcircles of the triangles besides A.
Show that P must lie on the line connecting B and D.
Solution
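[Diagram: similar triangles ABC and ADE, whose circumcircles meet at A and P]
Look at the diagram. Equal angles abound and we have the chain of equalities
∠BPA = ∠BCA = ∠DEA = ∠DPA.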
The first follows from the cyclic quadrilateral ABCP , the second follows from the similar
triangles ABC and ADE, while the third follows from the cyclic quadrilateral ADEP . But
seeing that B and D lie on the same side of the line AP , the equality ∠BP A = ∠DP A tells
us that P must lie on the line passing through B and D.
Are we done? No, we certainly are not! This is an opportune time to mention a common
pitfall in geometry known as diagram dependence. We only solved the problem for the diagram
shown. It is possible to have other diagrams where the relative positions of the points are
different, and our angle chase is a bit different. For instance, if triangle ADE were rotated
clockwise until D lay on ray AP beyond P , then it is no longer true that ∠DEA = ∠DP A,
but instead we would have ∠DEA = 180◦ − ∠DP A. See if you can identify all the different
configurations possible and solve in each case.4
Three other points in the diagram turn out to be collinear. What are they? Prove it!
Problem Let P be a point on the circumcircle of triangle ABC. Let D, E and F be the
feet of the perpendiculars from P to the lines BC, AC and AB, respectively.
Show that D, E and F are collinear.
4 You might like to consult section 17.3 for a possible way to deal with this without resorting to a large number
of case distinctions.
6.2 Menelaus’ theorem 109
Solution
[Diagram: triangle ABC with P on its circumcircle and D on BC]
Since BAF is a straight line, it suffices to show that ∠BF D = ∠AF E. From cyclic
quadrilaterals P F BD and P F AE we know that ∠BF D = ∠BP D and ∠AF E = ∠AP E. If
we subtract ∠BP E from both these angles, then it suffices to prove that ∠EP D = ∠AP B.
However, both these angles are equal to ∠BCA—the first is because CDEP is cyclic, and
the second is because BAP C is cyclic.
This solution is not complete since we have not addressed issues of diagram dependence. We
leave this for the reader to complete.
The result from the preceding problem can be extended. It is known as Simson’s theorem.
Simson’s theorem Let P be any point in the plane of triangle ABC. Then the feet of
the perpendiculars from P to the (possibly extended) sides of the triangle are collinear if
and only if P lies on the circumcircle of the triangle. In such a case the line containing the
three feet of the perpendiculars is called the Simson line.
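Both directions of Simson’s theorem can be explored numerically. The sketch below puts the circumcircle on the unit circle (our own choice) and computes a cross product that vanishes exactly when the three feet are collinear. The function names are hypothetical.

```python
import math


def foot(p, u, v):
    """Foot of the perpendicular from point p to the line through u and v."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    t = ((p[0] - u[0]) * dx + (p[1] - u[1]) * dy) / (dx * dx + dy * dy)
    return (u[0] + t * dx, u[1] + t * dy)


def simson_defect(P, A, B, C):
    """Cross product DE x DF of the feet D, E, F; zero means collinear."""
    D = foot(P, B, C)
    E = foot(P, A, C)
    F = foot(P, A, B)
    return (E[0] - D[0]) * (F[1] - D[1]) - (E[1] - D[1]) * (F[0] - D[0])


def on_unit_circle(degrees):
    """Point of the unit circle at the given angle."""
    t = math.radians(degrees)
    return (math.cos(t), math.sin(t))
```

With A, B, C at 100°, 220° and 330° on the unit circle, the defect vanishes (up to rounding) for P at 30° on the circle, but not for a point such as P = (0.3, 0.2) inside it.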
Menelaus’ theorem If X, Y and Z lie on the three (possibly extended) sides BC, AC
and AB of a triangle ABC, then the three points X, Y and Z are collinear if and only if
\[ \frac{AZ}{ZB} \cdot \frac{BX}{XC} \cdot \frac{CY}{YA} = -1, \]
where the segments are considered to have directed length.
The part about directed lengths in the statement of Menelaus’ theorem simply means that
the ratios take into account the directions of the vectors $\overrightarrow{AZ}$, $\overrightarrow{ZB}$, and so forth. Thus
$\frac{AZ}{ZB}$ is a positive ratio if Z lies on segment AB, and is a negative ratio otherwise.
Problem Suppose that ABC is a triangle with circumcircle Γ in which the three tangents
to Γ at A, B and C meet the three opposite sides at X, Y and Z, respectively.
Prove that X, Y and Z are collinear.
Solution
First, triangles XAB and XCA are similar. This follows from the alternate segment theorem,
which asserts that ∠XAB = ∠BCA. Thus we may write
\[ \frac{XA}{XC} = \frac{XB}{XA} = \frac{AB}{AC}. \]
Combining some of these equalities yields
\[ \frac{BX}{XC} = -\frac{AB^2}{AC^2}. \]
We compute similar expressions for the other two ratios. These all multiply together and
cancel out to give −1. Thus X, Y and Z are collinear by Menelaus’ theorem.
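The computation in this solution can be checked numerically. The sketch below is only an illustration under our own assumptions: the circumcircle is taken to be the unit circle, so that the tangent at a point T consists of the points W with W · T = 1, and the name `tangent_ratio` is hypothetical. It returns the directed ratio in which a tangent cuts the opposite side.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def unit(degrees):
    t = math.radians(degrees)
    return (math.cos(t), math.sin(t))

def tangent_ratio(t, u, v):
    """Tangent to the unit circle at t meets line UV at W; return directed UW/WV."""
    # Write W = U + s (V - U) and impose W . T = 1.
    s = (1 - dot(u, t)) / (dot(v, t) - dot(u, t))
    return s / (1 - s)

A, B, C = unit(80), unit(200), unit(330)   # a scalene triangle on the unit circle
bx_xc = tangent_ratio(A, B, C)             # tangent at A meets BC at X
cy_ya = tangent_ratio(B, C, A)             # tangent at B meets CA at Y
az_zb = tangent_ratio(C, A, B)             # tangent at C meets AB at Z
product = az_zb * bx_xc * cy_ya

# Squared chord lengths on the unit circle: |U - V|^2 = 2 - 2 U.V
ab2 = 2 - 2 * dot(A, B)
ac2 = 2 - 2 * dot(A, C)
print(bx_xc, -ab2 / ac2, product)
```

The first two printed values agree, and the product of the three directed ratios is −1 up to rounding, as Menelaus’ theorem requires.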
Problem Let P be a point in the plane of a triangle ABC. Reflect the lines P A, P B and
P C through the angle bisectors at A, B and C, respectively.
Prove that these three reflected lines are concurrent.
Solution Let p_a, p_b and p_c denote the distances from P to the lines BC, CA and AB,
respectively. Similarly, for any point Q in the plane of triangle ABC, we let q_a, q_b and q_c
denote the distances from Q to the lines BC, CA and AB, respectively.
Observe that Q lies on the line obtained by reflecting the line PA through the angle bisector
at A if and only if
\[ \frac{q_b}{q_c} = \frac{p_c}{p_b}. \]
(Can you locate the appropriate similar triangles to establish why this is true?)
Next, define Q to be the point which lies on the reflected line through A and the reflected
line through B. Therefore, the previous observation gives us the two equations
\[ \frac{q_b}{q_c} = \frac{p_c}{p_b} \quad\text{and}\quad \frac{q_c}{q_a} = \frac{p_a}{p_c}. \]
Multiplying these two equations yields
\[ \frac{q_b}{q_a} = \frac{p_a}{p_b}, \]
which implies that Q lies on the reflected line through C as well. Therefore, the three reflected
lines are concurrent at Q.
Ceva’s theorem If X, Y and Z lie on the three (possibly extended) sides BC, AC and
AB of a triangle ABC, then the three lines (called cevians) AX, BY and CZ are concurrent
if and only if
\[ \frac{AZ}{ZB} \cdot \frac{BX}{XC} \cdot \frac{CY}{YA} = +1, \]
where the segments are considered to have directed length.
Note that this is exactly the same expression as for Menelaus’ theorem except that we have +1
on the right-hand side instead of −1.
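As a quick sanity check of Ceva's theorem, the following Python sketch draws the three cevians through a chosen point P and evaluates the product of directed ratios. The triangle, the points P and the helper names `intersect`, `ratio` and `ceva_product` are our own illustrative choices, not part of the text.

```python
def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def ratio(u, w, v):
    """Directed ratio UW/WV for a point W on the line UV."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    return (((w[0] - u[0]) * dx + (w[1] - u[1]) * dy) /
            ((v[0] - w[0]) * dx + (v[1] - w[1]) * dy))

def ceva_product(a, b, c, p):
    """(AZ/ZB)(BX/XC)(CY/YA) for the cevians AX, BY, CZ through p."""
    x = intersect(a, p, b, c)
    y = intersect(b, p, c, a)
    z = intersect(c, p, a, b)
    return ratio(a, z, b) * ratio(b, x, c) * ratio(c, y, a)

A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
inside = ceva_product(A, B, C, (1.5, 1.0))    # P inside the triangle
outside = ceva_product(A, B, C, (5.0, 4.0))   # P outside: two ratios are negative
print(inside, outside)
```

In both cases the product is +1 up to rounding. For the outside point two of the three ratios are negative, which shows why directed lengths are needed.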
Problem Outside triangle ABC, points K, L and M are constructed in such a way that
Solution Let x = ∠M AB, y = ∠KBC and z = ∠LCA and let AK, BL and CM intersect
BC, CA and AB at points X, Y and Z, respectively, as in the diagram.
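[Figure: triangle ABC with the points K, L and M constructed outside it. The angles x, y and z are marked at A, B and C, and the lines AK, BL and CM meet BC, CA and AB at X, Y and Z.]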
We obtain similar expressions for the other two ratios and thus compute after much cancelling
out that
\[ P = \frac{KB}{KC} \cdot \frac{LC}{LA} \cdot \frac{MA}{MB}. \]
Finally, we use the sine rule in triangle KBC to find
\[ \frac{KB}{KC} = \frac{\sin z}{\sin y}. \]
We obtain similar expressions for the other two ratios so that we finally compute that P = 1.
Therefore, AX, BY and CZ are concurrent by Ceva’s theorem.
Trigonometric form of Ceva’s theorem If angles are marked as in the figure, then the
cevians are concurrent if and only if
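\[ \frac{\sin\alpha_1}{\sin\alpha_2} \cdot \frac{\sin\beta_1}{\sin\beta_2} \cdot \frac{\sin\gamma_1}{\sin\gamma_2} = +1. \]
[Figure: triangle ABC with the cevian AX meeting BC at X. The cevian splits the angle at A into α1 = ∠CAX and α2 = ∠XAB, and θ = ∠AXC. The angles β1, β2 at B and γ1, γ2 at C are marked analogously.]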
The proof of this is quite straightforward and may be carried out by using the sine rule six
times. For example,
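\[ \frac{CX}{\sin\alpha_1} = \frac{AC}{\sin\theta} \quad\text{and}\quad \frac{XB}{\sin\alpha_2} = \frac{AB}{\sin(180^\circ - \theta)} \]
and so we obtain equations such as
\[ \frac{\sin\alpha_1}{\sin\alpha_2} = \frac{AB}{AC} \cdot \frac{CX}{XB}. \]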
Problem Let ABC be a triangle with altitudes AD, BE, CF , medians AX, BY , CZ, and
orthocentre H. Let A′, B′ and C′ be the midpoints of AH, BH and CH, respectively.
Prove that the nine points A′, B′, C′, D, E, F, X, Y and Z all lie on a circle.6
Solution
There are midpoints galore in this problem. In fact, six of the nine points which we will prove
are concyclic are defined as midpoints. Therefore, it seems like a prime opportunity to use
the midpoint theorem, which states that if Z is the midpoint of AB and Y is the midpoint of
AC, then YZ is parallel to BC and half its length. Applied in triangle ABH, we obtain that
B′Z is parallel to AH, while applied in triangle ACH, we obtain that C′Y is also parallel to
AH. Applied in triangle ABC, we obtain that YZ is parallel to BC, while applied in triangle
HBC, we obtain that B′C′ is parallel to BC.
In summary, B′Z and C′Y are parallel to each other and to AH. Furthermore, YZ and B′C′
are parallel to each other and to BC. However, since AH is perpendicular to BC, B′C′YZ
must be a rectangle. Similar arguments lead to the fact that C′A′ZX and A′B′XY are also
rectangles.
Therefore, if we let N be the midpoint of B′Y, then N is the centre of the circle circumscribing
the rectangle B′C′YZ, as well as the centre of the circle circumscribing A′B′XY, both of
which have B′Y as a diameter. It follows that A′, B′, C′, X, Y and Z all lie on a circle.
Now note that the line YZ bisects AD and is perpendicular to it. In other words, the reflection
of A in the line YZ is the point D. To paraphrase again, triangle AYZ is congruent to
triangle DYZ. However, triangle AYZ is also congruent to triangle XZY. In particular, we
have the equal angles ∠YDZ = ∠YXZ, so that the quadrilateral XYZD is cyclic. Therefore,
the point D, and by a similar argument the points E and F, lie on the circumcircle of
triangle XYZ. It follows that A′, B′, C′, D, E, F, X, Y and Z all lie on a circle.
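The nine-point circle can also be confirmed numerically for a particular triangle. In the Python sketch below the coordinates of A, B and C are an arbitrary choice of ours and the helper names are hypothetical. We compute the circle through the three midpoints X, Y and Z and measure how far the nine distances to its centre differ from one another.

```python
import math

def mid(u, v):
    return ((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)

def foot(p, a, b):
    """Foot of the perpendicular from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def circumcentre(a, b, c):
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    return ((sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d,
            (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d)

A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)
D, E, F = foot(A, B, C), foot(B, C, A), foot(C, A, B)   # feet of the altitudes
H = intersect(A, D, B, E)                               # orthocentre
X, Y, Z = mid(B, C), mid(C, A), mid(A, B)               # midpoints of the sides
nine = [X, Y, Z, D, E, F, mid(A, H), mid(B, H), mid(C, H)]
N = circumcentre(X, Y, Z)
radii = [math.dist(N, p) for p in nine]
spread = max(radii) - min(radii)
print(N, radii[0], spread)
```

The spread is zero up to rounding, so all nine points are equidistant from N.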
Here is an alternative solution which illustrates that there are many different ways to complete
an angle chase. It also highlights the ‘one step at a time’ method described in section 4.3.
6 For obvious reasons, this circle is called the nine-point circle of triangle ABC.
Solution Draw the triangle with only the extra points X, Y , Z and D marked.
Thus
∠Y DC = ∠Y CD = ∠XZY.
The first equality comes from the fact that triangle ADC is right-angled at D. Hence its
circumcentre is the midpoint of AC, namely Y , and so Y A = Y D = Y C. The second equality
comes from the fact that XZY C is a parallelogram, and thus has opposite angles equal.
Hence ∠Y DX = ∠Y DC = ∠XZY , which establishes that D lies on circle XY Z. Similarly,
points E and F lie on circle XY Z. Thus the six points X, Y , Z, D, E and F all lie on the
same circle.
Now draw the triangle with only the extra points D, E, F, A′ and H marked.
The circle with diameter AH passes through points E and F due to the right angles at E
and F. Thus A′ is the centre of circle EAFH, and so ∠EA′F = 2∠A.
Furthermore, ∠HDF = ∠HBF = 90° − ∠A from cyclic quadrilateral HDBF. Similarly,
∠HDE = ∠HCE = 90° − ∠A. Therefore,
\[ \angle EDF = \angle HDE + \angle HDF = 180^\circ - 2\angle A = 180^\circ - \angle EA'F. \]
This means that A′ lies on circle DEF. Similarly, points B′ and C′ also lie on circle DEF.
Since the circle through D, E and F is unique and contains the points X, Y, Z, A′, B′ and
C′, we conclude that all nine points lie on this same circle.
Problem Let D, E and F be points on the sides BC, CA and AB of triangle ABC.
Prove that the circumcircles of triangles AEF , DBF and DEC are concurrent.
Solution
Let the circumcircles of triangles AEF and DEC meet again at P. Then since the quadrilaterals
AFPE and DPEC are cyclic, we have
\[ \angle BFP = 180^\circ - \angle AFP = \angle AEP = 180^\circ - \angle CEP = \angle CDP = 180^\circ - \angle BDP. \]
Since ∠BFP + ∠BDP = 180°, the quadrilateral BFPD is also cyclic. Therefore, the
circumcircle of triangle DBF also passes through P.7
This is an extremely useful result and is well worth remembering. Many geometry problems
happen to include this set-up lurking as a subdiagram.
7 This argument is diagram dependent because it did not address the possibility of P lying outside of the
triangle. See section 17.3 to see how to deal with this and more.
In addition, we define the power of a point P with respect to a circle Γ to be the real number
P A · P B, where P , A and B are collinear and A and B are points on Γ. We treat the lengths
as being directed so that the power is positive if P is outside Γ, and negative if P is inside Γ.
Note that the power of a point theorem ensures that this is a well-defined real number. That
is, we get the same result no matter how A and B are chosen on Γ, provided that P , A and
B are collinear.
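To see the well-definedness claim in action, the following Python sketch computes the directed product PA · PB along several different lines through P. The circle, the points P and the function name `power_via_secant` are illustrative assumptions of ours. Each line is parametrised as P + t u with u a unit vector, so the two directed lengths are the roots of a quadratic in t.

```python
import math

def power_via_secant(p, centre, r, theta):
    """Directed product PA * PB along the line through p with direction angle theta.

    Returns None when that line misses the circle.
    """
    ux, uy = math.cos(theta), math.sin(theta)
    wx, wy = p[0] - centre[0], p[1] - centre[1]
    b = ux * wx + uy * wy            # t^2 + 2 b t + c = 0 at the intersections
    c = wx * wx + wy * wy - r * r
    disc = b * b - c
    if disc < 0:
        return None
    t1 = -b + math.sqrt(disc)
    t2 = -b - math.sqrt(disc)
    return t1 * t2

centre, r = (0.0, 0.0), 2.0
outside = [power_via_secant((3.0, 0.0), centre, r, t) for t in (0.3, 0.6, math.pi)]
inside = [power_via_secant((0.5, 0.5), centre, r, t) for t in (0.0, 1.0, 2.0)]
print(outside)   # each value is close to 3^2 - 2^2 = 5
print(inside)    # each value is close to 0.5 - 4 = -3.5
```

Whatever the direction of the line, the product equals OP² − r², positive outside the circle and negative inside it.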
Problem Two circles Γ1 and Γ2 intersect in two points A and B. Point P lies on the line
AB. A line passing through P intersects Γ1 at U and V . Another line passing through P
intersects Γ2 at X and Y .
Prove that the four points U , V , X and Y are concyclic.
Solution
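Since P lies on the line AB, the power of a point theorem applied to Γ1 gives
\[ PU \cdot PV = PA \cdot PB. \]
Similarly, the power of a point theorem applied to Γ2 gives
\[ PX \cdot PY = PA \cdot PB. \]
Thus
\[ PU \cdot PV = PX \cdot PY, \]
and so using the power of a point theorem again we deduce that U, V, X and Y are
concyclic.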
Radical axis theorem Given three circles, the three radical axes associated with the
three pairs of circles are either concurrent or parallel.
Solution Let the three circles be Γ1 , Γ2 and Γ3 . Let λ be the radical axis of Γ1 and Γ2 and
let µ be the radical axis of Γ2 and Γ3 . Suppose that λ and µ intersect at P .8
Since P lies on λ, it has equal power with respect to Γ1 and Γ2 . Similarly, since P lies on µ,
it has equal power with respect to Γ2 and Γ3 . It follows that P has equal power with respect
to Γ1 and Γ3 and consequently lies on the radical axis of Γ1 and Γ3 .
Thus the three radical axes all pass through P .
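The argument can be mirrored in a few lines of Python. This is a sketch under our own conventions: a circle is stored as a pair (centre, radius), the power of P is computed as the squared distance to the centre minus r², and the names `radical_axis` and `radical_centre` are hypothetical. Subtracting the two power expressions shows that each radical axis is a straight line, and two of these lines are intersected by Cramer's rule.

```python
def power(p, circle):
    (cx, cy), r = circle
    return (p[0] - cx) ** 2 + (p[1] - cy) ** 2 - r * r

def radical_axis(c1, c2):
    """Coefficients (a, b, d) of the line a x + b y = d where the two powers agree."""
    (x1, y1), r1 = c1
    (x2, y2), r2 = c2
    return (2 * (x2 - x1), 2 * (y2 - y1),
            (x2 * x2 + y2 * y2 - r2 * r2) - (x1 * x1 + y1 * y1 - r1 * r1))

def radical_centre(c1, c2, c3):
    a1, b1, d1 = radical_axis(c1, c2)
    a2, b2, d2 = radical_axis(c2, c3)
    det = a1 * b2 - a2 * b1   # zero when the centres are collinear (parallel axes)
    return ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)

g1 = ((0.0, 0.0), 2.0)
g2 = ((5.0, 0.0), 3.0)
g3 = ((2.0, 4.0), 1.5)
P = radical_centre(g1, g2, g3)
powers = [power(P, g) for g in (g1, g2, g3)]
print(P, powers)
```

The three powers coincide, so P also lies on the radical axis of the first and third circles. When the three centres are collinear the determinant vanishes, which is the parallel case of the theorem.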
The concurrence of three radical axes is highly useful. Look out for them!
Solution
8 If λ and µ are parallel, then P is undefined and so this argument is faulty. Can you deal with this case?
Let β be the circle with diameter BE and let γ be the circle with diameter CF . Let P and Q
be the two points where β and γ intersect. We want to show that line P Q passes through A.
We have two circles, and their common chord which is supposed to pass through A. Note
that there are two other obvious lines, namely AB and AC, also passing through A. This
looks like a candidate for the radical axis theorem!
We seek a third circle whose radical axes with β and γ are the lines AB and AC, respectively.
There is only one such candidate. It should pass through the intersection, W say, of line AB
and β and it should also pass through the intersection, X say, of line AC and γ. If we can
show that BW XC is cyclic, then applying the radical axis theorem to circles BW XC, β and
γ will tell us that BW , CX and P Q are concurrent. Since BW and CX intersect at A, this
means that P Q would also pass through A.
Thus it remains to show that BW XC is cyclic. Since BE is a diameter of β, we know that
∠F W E = ∠BW E = 90◦ . Thus W lies on the circle with diameter F E. Similarly X lies on
the circle with diameter F E. So F W XE is a cyclic quadrilateral.
Therefore,
If you look at an accurately drawn diagram, it seems that W E, F X and P Q are also
concurrent. This is the same as saying the orthocentre of triangle AEF also lies on line P Q.
See if you can prove it! You only need the radical axis theorem applied once more. But which
circles should you apply it to?
6.8 Ellipses
Ellipses have certain highly useful properties. For example, an ellipse can be thought of as
the locus of points P such that the sum P A + P B is a given constant for fixed points A and
B. The points A, B are called the foci. These two points have a nice optical property: if
the inside boundary of an ellipse is made of reflective material, then any light emitted from
one focus, A say, will, after reflecting off the boundary, pass through the other focus B.
Thus if P is any point on the ellipse and ℓ is the tangent at P, then the angle that AP makes
with ℓ equals the angle that BP makes with ℓ, that is, ‘angle of incidence equals angle of reflection’.
Solution
(a) For any point P in the plane write f (P ) = P A + P B + P C. We wish to find the
minimum value for f (P ).
First we investigate the possibility that P might lie outside the triangle. The outside of
the triangle can be divided into six regions defined by the lines AB, AC and BC and
P could lie in any of these.
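[Figure: the lines AB, AC and BC divide the outside of triangle ABC into six regions, labelled I to VI.]
Case 1: The point P lies in one of the regions I, III or V.
WLOG P lies in region I, the region beyond vertex C, so that C lies inside triangle PAB.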
Consider the ellipse with foci at A and B such that P is on the ellipse. Since
triangle P AB lies inside the ellipse and C lies inside triangle P AB, we conclude
that C lies strictly inside the ellipse. Consequently, we have AP + BP > AC + BC.
Therefore,
f (P ) = AP + BP + CP > AP + BP > AC + BC = f (C).
Case 2: The point P lies in one of the regions II, IV or VI.
WLOG P lies in region II.
9 This problem is considered again in section 12.6.
So in both cases we have shown that for any point P lying outside the triangle, there
exists another point Q on the boundary of the triangle such that f (P ) > f (Q). So from
here on we may restrict our attention to points not lying outside the triangle.
Consider any position of a point P inside or on the boundary of triangle ABC such that
f (P ) = P A + P B + P C is minimal.10 If P were on the boundary of the triangle, it is
easy to see that P would have to be a foot of an altitude of the triangle. Consequently,
P cannot be at a vertex.
Consider the ellipse ε with foci at A and B and such that P is on ε. Consider also the
circle γ centred at C passing through P .
If γ is not tangent to ε, then it intersects ε in at least two points. In this case, any point
Q lying strictly inside the intersection of γ and ε would necessarily have CQ < CP and
AQ + BQ < AP + BP and so f (Q) < f (P ), a contradiction. (Even if Q lay outside the
triangle, this would still be a contradiction, because f (P ) was taken to be the global
minimum.)
Therefore, γ is tangent to ε at P. Let ℓ be the common tangent of γ and ε at P. Since
we know the angle of incidence is equal to the angle of reflection at P in ε and we
also know that ℓ ⊥ PC, this allows us to deduce that ∠APC = ∠BPC. Similarly, we
deduce that ∠BPA = ∠CPA.
Hence all three angles around P are equal. Therefore, P is a point where all three
angles are equal to 120◦ .
10 A minimal value exists because f is a continuous function whose domain is now restricted to a closed and
bounded subset of the plane.
(b) We construct the point as follows. Erect equilateral triangles and their circumcircles
outwards on AB and BC. These circles both subtend 120◦ angles on AB and BC inside
triangle ABC and hence their intersection point is the desired point P .
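The 120° conclusion can be checked numerically. The sketch below does not use the construction in (b). Instead it approximates the minimiser of f(P) = PA + PB + PC by Weiszfeld's fixed-point iteration, a standard numerical method that we substitute here purely for convenience, and then measures the three angles at the resulting point. The triangle and the names `fermat_point` and `angle_at` are our own, and the triangle is assumed to have all angles less than 120°.

```python
import math

def fermat_point(a, b, c, iterations=500):
    """Approximate the minimiser of PA + PB + PC by Weiszfeld's iteration."""
    pts = (a, b, c)
    x = sum(p[0] for p in pts) / 3          # start at the centroid
    y = sum(p[1] for p in pts) / 3
    for _ in range(iterations):
        w = [1 / math.hypot(x - px, y - py) for px, py in pts]
        s = sum(w)
        x, y = (sum(wi * p[0] for wi, p in zip(w, pts)) / s,
                sum(wi * p[1] for wi, p in zip(w, pts)) / s)
    return (x, y)

def angle_at(p, u, v):
    """The angle UPV in degrees."""
    a1 = math.atan2(u[1] - p[1], u[0] - p[0])
    a2 = math.atan2(v[1] - p[1], v[0] - p[0])
    d = abs(a1 - a2)
    return math.degrees(min(d, 2 * math.pi - d))

def total(p, pts):
    return sum(math.dist(p, q) for q in pts)

A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
P = fermat_point(A, B, C)
angles = [angle_at(P, A, B), angle_at(P, B, C), angle_at(P, C, A)]
print(P, angles)
```

All three angles come out as 120° to within rounding, and moving P slightly in any direction increases PA + PB + PC.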
Pascal’s theorem Let A, B, C, D, E and F be any six points on any conic section. Let
X = AB ∩ DE, Y = BC ∩ EF and Z = CD ∩ F A. Then X, Y and Z are collinear.
Note that the theorem is true no matter what order the six points appear on the conic. The
theorem is also true in its limiting cases. For example, if B is permitted to approach and
then coincide with A, then the line AB becomes the tangent at A. In the context of using
Pascal’s theorem, the tangent at A is often written as AA. Some illustrative diagrams can be
found on page 87.
In most applications the conic section11 will be a circle. Sometimes it might be a pair of
lines.12
A useful way to remember what intersects what is that AB, DE are pairs of opposite sides of
the hexagon ABCDEF , as are BC, EF and CD, F A.
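Pascal's theorem for a circle is easily tested by computer. In the Python sketch below the six angles are an arbitrary choice of ours, picked so that no two opposite sides of the hexagon are parallel, and the helper names are hypothetical. We intersect the three pairs of opposite sides and test the resulting points for collinearity.

```python
import math

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def cross(o, u, v):
    """Twice the signed area of triangle o, u, v; zero exactly when collinear."""
    return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

def on_unit_circle(degrees):
    t = math.radians(degrees)
    return (math.cos(t), math.sin(t))

A, B, C, D, E, F = (on_unit_circle(t) for t in (0, 50, 110, 190, 250, 310))
X = intersect(A, B, D, E)   # AB meets DE
Y = intersect(B, C, E, F)   # BC meets EF
Z = intersect(C, D, F, A)   # CD meets FA
defect = cross(X, Y, Z)
print(X, Y, Z, defect)
```

The defect is zero up to rounding. Reordering the six points around the circle changes X, Y and Z but not the conclusion, provided the relevant pairs of lines are not parallel.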
Problem Let ABC be an acute triangle with circumcentre O. Let AA1 be a diameter of
the circumcircle of triangle ABC. The tangent line to the circumcircle at A1 intersects the
line BC at point D. The line OD intersects the sides AB and AC of the triangle at points P
and Q, respectively.
Prove that OP = OQ.
Solution
It would suffice to prove that K also lies on DP for then K would lie on both DP and AA1 ,
ensuring that K = O. This is where Pascal’s theorem makes an entry. Indeed K = CX ∩ AA1 ,
D = A1 A1 ∩ BC and P = AB ∩ A1 X. It only remains to see if there is a Pascal configuration
corresponding to these lines and points.
The Pascal configuration virtually reveals itself! A1 A1 is a side (corresponding to the tangent
of the circle at A1 ), A1 X and AA1 are sides, as are XC and CB. So if the hexagon is
AA1 A1 XCB with points in that order, then we apply Pascal’s theorem to find that the points
K = AA1 ∩ XC, D = A1 A1 ∩ CB and P = A1 X ∩ BA are collinear as desired.
7 Transformation geometry
Geometric transformations are often used in an effort to understand a particular geometry
problem better. The transformations considered in this chapter are spiral symmetries and
affine transformations. Reflections are discussed in section 12.2.
Every spiral symmetry may be described as a linear complex function.1 Understanding what
functions correspond to what spiral symmetries helps to understand their compositions.
An anticlockwise rotation of angle θ composed with a dilation of factor r ≠ 0 about the same
point is called a spiral symmetry. If it is performed about the origin, it may be described by
\[ f(z) = az, \]
where a = re^{iθ}. If it is instead performed about the point p, it may be described by
\[ f(z) = az + (1 - a)p. \]
Consequently, a spiral symmetry is described by
\[ f(z) = az + b \]
for any complex number b provided that a ≠ 0 or 1. The centre of the spiral symmetry is
given by p = b/(1 − a). If a = 1, we also have the identity spiral symmetry f(z) = z. Furthermore,
if we throw in the translation functions f (z) = z + b for each complex number b, we have
the complete set of linear functions of the complex plane. In loose terms a translation can
be thought of as a spiral symmetry whose centre is infinitely far away and whose angle of
rotation is 0◦ . From here on, whenever we speak of a spiral symmetry, we implicitly include
the translations amongst them.
To summarise, the full family of spiral symmetries, now also including the translations, is
given by the set of functions
f (z) = az + b,
where a ≠ 0. The rotation angle is θ = arg(a) and the dilation factor is r = |a|.
Consider what happens if we compose two such spiral symmetries. Suppose we do
f1 (z) = a1 z + b1
1 You might like to check out chapter 8 and section 17.1 if you would like a refresher on complex numbers.
followed by
f2 (z) = a2 z + b2 .
We obtain
\[ f_3(z) = f_2(f_1(z)) = a_3 z + b_3, \]
where $a_3 = a_1 a_2$ and $b_3 = a_2 b_1 + b_2$.
If $a_1 = r_1 e^{i\theta_1}$, $a_2 = r_2 e^{i\theta_2}$ and $a_3 = r_3 e^{i\theta_3}$, then
\[ r_3 = r_1 r_2 \quad\text{and}\quad \theta_3 = \theta_1 + \theta_2. \]
Note also that in general the order of composition does matter. So, it would not necessarily
be the case that f1(f2(z)) = f2(f1(z)). For instance, the respective centres of the spiral
symmetries are usually different.
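Python's built-in complex numbers make it easy to experiment with these formulas. In the sketch below a spiral symmetry f(z) = az + b is stored as the pair (a, b). The representation, the sample centres and the helper names are our own choices for illustration.

```python
import cmath

def apply(f, z):
    a, b = f
    return a * z + b

def compose(f2, f1):
    """The pair (a3, b3) describing z -> f2(f1(z))."""
    (a2, b2), (a1, b1) = f2, f1
    return (a1 * a2, a2 * b1 + b2)

def centre(f):
    """Fixed point b / (1 - a); undefined for translations, where a = 1."""
    a, b = f
    return b / (1 - a)

def spiral_about(p, r, theta):
    """Spiral symmetry about p with dilation factor r and rotation angle theta."""
    a = cmath.rect(r, theta)
    return (a, (1 - a) * p)

f1 = spiral_about(1 + 1j, 2.0, cmath.pi / 3)
f2 = spiral_about(-2 + 0.5j, 0.5, cmath.pi / 2)
f3 = compose(f2, f1)       # first f1, then f2
g3 = compose(f1, f2)       # first f2, then f1

z = 0.7 - 1.3j
print(apply(f3, z), apply(f2, apply(f1, z)))   # the same point
print(abs(f3[0]), cmath.phase(f3[0]))          # r1 * r2 and theta1 + theta2
print(centre(f3), centre(g3))                  # different centres
```

Here r3 = 2 · 0.5 = 1, so both compositions are rotations through 5π/6, but about different centres.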
There are some important subgroups of the family of spiral symmetries.
Translations
A general translation is represented by f (z) = z + α for any complex number α. It is
easy to verify that the composition of two translations is a translation.
Rotations
A rotation is represented by f (z) = az + b for |a| = 1. Note that for our purposes, the
family of rotations includes all the translations. It is easy to verify that the composition
of two rotations is a rotation.
Dilations
A dilation is represented by f (z) = az + b for a ∈ R and a ≠ 0. Note that for our
purposes, the family of dilations includes all the translations. It is easy to verify that
the composition of two dilations is a dilation.
There are also geometric transformations, known as affine transformations, which we will
discuss later in the chapter.
7.0 Problems
1. Given three parallel lines, show how to construct with straightedge and compass a point
on each line so that the points form an equilateral triangle.
2. If the opposite sides of a hexagon are equal and parallel, prove that the diagonals joining
opposite vertices are concurrent.
3. Show that the three medians of a triangle are concurrent.
4. Given three circles in the plane, show how to construct a point on each so that the
points form an equilateral triangle. Discuss under what circumstances this is even
possible.
5. Given two triangles in the plane which have corresponding sides parallel, show that the
lines joining the corresponding vertices are concurrent.
6. Let ABC be a triangle and let D, E and F be the midpoints of BC, AC and AB,
respectively.
Prove that the vectors $\overrightarrow{AD}$, $\overrightarrow{BE}$ and $\overrightarrow{CF}$ form a triangle.
7. Show how to construct, using straightedge and compass, a square whose vertices all lie
on the sides of a given triangle.
8. Two circles are internally tangent at T . A chord AB of the outer circle is tangent to
the inner circle at P .
Prove that T P bisects ∠AT B.
9. Two common tangents of two intersecting circles meet at a point A. Let B be a point of
intersection of the two circles, and C and D be the points in which one of the tangents
touches the circles.
Prove that the line AB is tangent to the circumcircle of triangle BCD.
10. Let ABC be a triangle. Triangles A′BC, B′CA and C′AB are erected externally on
the sides of triangle ABC such that
11. Let X and Y be the centres of the squares erected externally on sides AB and AC of
triangle ABC. Let M be the midpoint of BC.
Prove that M X and M Y are equal and perpendicular.
12. Triangle ABC has squares P ABK and QACL constructed on the exterior of the sides
AB and AC, respectively. Let AH be an altitude of triangle ABC with H on BC.
Prove that A, H and the midpoint of P Q are collinear.
13. Given triangle ABC, we erect similar isosceles triangles BCP , ACR and ABQ externally
on sides BC and AC and internally on side AB.
Prove that P QRC is a parallelogram.
14. Given a quadrilateral ABCD, we erect the four equilateral triangles ABP and CDR
externally and BCQ and ADS internally on its sides.
Show that PQRS is a parallelogram.2
15. Chords AB and CD of circle Γ intersect at a point E inside Γ. The circle ω is internally
tangent to the figure bounded by segments AE and EC and arc AC (not containing B
or D) of Γ, touching arc AC at point F. A line ℓ containing the centre O of Γ, intersects
segments AE and DE at points P and Q, respectively, and satisfies EP = EQ. Line
EF intersects ℓ at point M.
Prove that the line through M parallel to the line AB is tangent to Γ.
2 This is a generalisation of the previous problem.
16. Let M be the midpoint of the altitude of triangle ABC from vertex A. Let I be the
incentre of the triangle. Let Y be the point of tangency of the excircle opposite vertex
A with side BC.
Prove that M , Y and I are collinear.
17. The incircle of triangle ABC has centre I and touches BC at D. Let E be the midpoint
of BC.
Prove that the line through E and I passes through the midpoint of the segment AD.
18. Two circles Γ1 and Γ2 lie inside and are internally tangent to a third circle Γ at points
A1 and A2 , respectively. A common external tangent touches Γ1 at T1 and Γ2 at T2 .
Let P be the intersection of lines A1 T1 and A2 T2 .
(a) Prove that P lies on Γ.
(b) Hence prove that A1 T1 T2 A2 is cyclic.
(c) Hence also prove that if Γ1 and Γ2 intersect at two points, then the extension of
their common chord also passes through P .
19. Let Q = A0 A1 A2 A3 be a quadrilateral in the plane. Given a point M0 in the plane we
define the sequence M0 , M1 , M2 , . . . of points in the plane as follows. If n ≡ i (mod 4),
then Mn+1 is the point obtained from rotating the point Mn anticlockwise about Ai by
90◦ .
(a) Prove that if M2008 = M0 for one point M0 in the plane, then we also have
M2008 = M0 for all points M0 in the plane.
(b) Suppose that Q is a parallelogram. Prove that if M2008 = M0 , then Q is a square.
Under what circumstances is the converse true?
(c) Suppose that Q is a general quadrilateral. Find simple necessary and sufficient
conditions on Q such that M2008 = M0 .
20. Let O, A, B, C, D and E be six points in the plane such that the 10 triangles which
have O as a vertex all have area at least 1.
(a) Prove that one of those 10 triangles has area at least √2.
(b) Is the result still true if we only consider O plus four more points?
21. A point P is permitted to vary over the interior of a fixed triangle ABC. The line AP
meets BC in A1 . Points B1 and C1 are defined similarly.
As P varies over the interior of the triangle, what is the maximum possible area of
triangle A1 B1 C1 ?
22. Circles Γ1 and Γ2 intersect at points A and B. A common tangent to the two circles
touches Γ1 at P1 and Γ2 at P2 . Points M1 and M2 are such that line M1 M2 is the
perpendicular bisector of AB, and
24. Let H be a strictly convex hexagon inscribed in an equilateral triangle. Suppose that
H has all of its sides of equal length.
Prove that the three main diagonals of H are concurrent.
25. Let ABCD be a convex quadrilateral with BA = BC. Denote the incircles of triangles
ABC and ADC by ω1 and ω2 , respectively. Suppose that there exists a circle ω tangent
to the ray BA beyond A and to the ray BC beyond C, which is also tangent to the
lines AD and CD.
Prove that the common external tangents of ω1 and ω2 intersect on ω.
7.1 Translations
Problem Given a segment AB and circles S1 and S2 , using a straightedge and compass,
show how to construct points P and Q, one on each circle, satisfying PQ ∥ AB and PQ = AB.
Solution
Suppose that P and Q are located as required on the two circles with P on S1 , say. Then the
translation τ , which takes A to B, will also take P ∈ S1 to Q ∈ S2 . Thus if we apply the
translation τ to S1 ,3 any place where τ (S1 ) intersects S2 will be a candidate for Q. Finally,
we can invert the translation to recover P from Q.
In the above problem there may be 0, 1, 2, 3, 4, or infinitely many possible positions for the
segment P Q. Can you analyse under what conditions these occur?
7.2 Rotations
Problem Suppose ABC is a triangle. We erect three equilateral triangles P BC, QCA and
RAB externally on the sides of triangle ABC. Let K, L and M be their respective centroids.
Show that triangle KLM is equilateral.
Solution Note that triangles AM B, BKC and CLA are all isosceles and have 120◦ angles
at M , K and L, respectively. We capitalise on the fact that these three angles sum to 360◦ .
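Consider the three 120° clockwise rotations T_M, T_K and T_L about M, K and L, respectively.
Note that
\[ T_M(A) = B, \quad T_K(B) = C \quad\text{and}\quad T_L(C) = A. \]
We know that the composition T_L ∘ T_K ∘ T_M of these rotations has angle sum 360° and thus
must be a translation. However,
\[ T_L \circ T_K \circ T_M (A) = T_L(T_K(B)) = T_L(C) = A. \]
Thus the composition is a translation which fixes the point A. Hence the composition must
be the identity transformation.
Consider now what happens to the point M. Since T_M fixes M, if we write M′ = T_K(M), then
we must have T_L(M′) = M. Therefore,
\[ MK = KM', \quad ML = LM' \quad\text{and}\quad \angle MKM' = \angle MLM' = 120^\circ. \]
Thus MKM′ and MLM′ are congruent isosceles triangles on the common base MM′, with K and L
on opposite sides of it. Hence MK = ML and ∠KML = 30° + 30° = 60°, so that triangle KLM
is equilateral.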
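Both the result and the key step of this argument can be verified with complex numbers in Python. The sketch below is only an illustration: the triangle is our own choice, taken in anticlockwise order so that the clockwise rotations behave as described, and the helper names are hypothetical. A rotation z → az + b is stored as the pair (a, b).

```python
import cmath

W = cmath.exp(-1j * cmath.pi / 3)   # rotation through 60 degrees clockwise

def outer_centre(u, v):
    """Centroid of the equilateral triangle erected on UV, to the right of U -> V."""
    apex = u + (v - u) * W
    return (u + v + apex) / 3

def rotation(centre, theta):
    a = cmath.exp(1j * theta)
    return (a, (1 - a) * centre)

def apply(f, z):
    return f[0] * z + f[1]

def compose(f2, f1):
    """The pair describing z -> f2(f1(z))."""
    (a2, b2), (a1, b1) = f2, f1
    return (a1 * a2, a2 * b1 + b2)

A, B, C = 0j, 4 + 0j, 1 + 3j   # anticlockwise, so the right-hand side is external
M, K, L = outer_centre(A, B), outer_centre(B, C), outer_centre(C, A)

t = -2 * cmath.pi / 3          # 120 degrees clockwise
T_M, T_K, T_L = rotation(M, t), rotation(K, t), rotation(L, t)
a, b = compose(T_L, compose(T_K, T_M))
sides = [abs(K - L), abs(L - M), abs(M - K)]
print(apply(T_M, A), a, b, sides)
```

The composition has a = 1 and b = 0, so it is the identity, and the three side lengths of triangle KLM agree.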
7.3 Dilations
One of the important things to remember about dilations is that if X is the centre of a
dilation f , then X, Y and f (Y ) are always collinear for any point Y .
Problem Prove that the centroid, orthocentre and circumcentre of a triangle are collinear.4
Solution
4 The line common to all three points is called the Euler line.
Consider the dilation f with scale factor −1/2 about the centroid G of triangle ABC. Note
that f maps triangle ABC to its medial triangle A′B′C′ which has the same centroid G′ = G
as triangle ABC.
Let H be the orthocentre of triangle ABC and let H′ = f(H) be the orthocentre of triangle
A′B′C′. The perpendicular bisectors of the sides of ABC are the altitudes of the medial
triangle and so O = H′. But H, G and H′ = f(H) are collinear thanks to the dilation. Thus
H, G and O are collinear.
In fact we have proven more! Namely that H, G and O occur in that order on the line and
that HG : GO = 2 : 1.
Continuing these ideas, it is possible to deduce that if N is the circumcentre of the medial
triangle, then N also lies on the same line in between H and G. Furthermore,
HN : N G : GO = 3 : 1 : 2.
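These ratios are easy to confirm for a specific triangle. In the Python sketch below the coordinates are an arbitrary choice of ours and the helper names are hypothetical. H is found by intersecting two altitudes, O and N as circumcentres, and G as the average of the vertices, so that the four points are computed independently of one another.

```python
import math

def mid(u, v):
    return ((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)

def foot(p, a, b):
    """Foot of the perpendicular from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def circumcentre(a, b, c):
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    return ((sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d,
            (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d)

A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)
G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)
O = circumcentre(A, B, C)
H = intersect(A, foot(A, B, C), B, foot(B, C, A))
N = circumcentre(mid(B, C), mid(C, A), mid(A, B))   # circumcentre of medial triangle

hn, ng, go = math.dist(H, N), math.dist(N, G), math.dist(G, O)
print(H, N, G, O)
print(hn / ng, go / ng)
```

The two printed ratios are 3 and 2, in agreement with HN : NG : GO = 3 : 1 : 2.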
Problem For a pair of non-congruent circles neither of which lies inside the other, we define
their focal point to be the point of intersection of their pair of external common tangents. For
three circles in the plane, none of which lies inside the other, we thus have three focal points,
one for each pair of circles.
Prove that these three focal points are collinear.5
Solution
Let the circles be S1 , S2 and S3 and let r1 , r2 and r3 be their respective radii. Let P1 , P2
and P3 be the focal points of the pairs (S2 , S3 ), (S3 , S1 ) and (S1 , S2 ) of circles, respectively.
The intersection of the two common external tangents to a pair of circles is a centre of dilation
of the circles, and the dilation factor is equal to the ratio of the two radii.
5 This is known as Monge’s theorem. If you understand the proof of it, you should be able to state and prove
a similar result involving two internal focal points (the intersection point of the pair of internal tangents)
and an external focal point. In fact you should be able to state and prove a result even when one circle is
inside another.
With this in mind, let D1 be the dilation with factor +r3/r2, centred at P1. Note that D1 sends
S2 to S3. Similarly, let D2 be the dilation with factor +r1/r3, centred at P2, which sends S3
to S1, and let D3 be the dilation with factor +r2/r1, centred at P3, which sends S1 to S2.
Thus the composition D3 ◦ D2 ◦ D1 sends S2 to itself and has dilation factor
\[ \frac{r_2}{r_1} \cdot \frac{r_1}{r_3} \cdot \frac{r_3}{r_2} = 1. \]
Thus the composition is a translation which fixes the circle S2. Hence the composition must
be the identity transformation.
Consider now the line ℓ = P1P2. Since ℓ passes through P1, we see that the dilation D1 leaves
ℓ fixed (although points within ℓ do move along ℓ). After next applying D2 to ℓ we see that ℓ
still remains fixed because ℓ passes through P2. Finally applying D3 to ℓ must leave ℓ fixed
because D3 ◦ D2 ◦ D1 is the identity transformation.
However, if P3 were not on ℓ, then D3 would move ℓ. Thus P3 is also on ℓ and hence collinear
with P1 and P2.
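The whole argument can be replayed with complex numbers. In the Python sketch below the three circles are an arbitrary choice of ours and the helper names are hypothetical. The external focal point of two circles is the centre of the positive dilation taking one to the other, which gives the formula used in `focal_point`, and a dilation z → kz + (1 − k)p is stored as the pair (k, (1 − k)p).

```python
def focal_point(c_i, r_i, c_j, r_j):
    """Centre of the dilation with factor +r_j / r_i sending circle i to circle j."""
    return (r_i * c_j - r_j * c_i) / (r_i - r_j)

def dilation(p, k):
    return (k, (1 - k) * p)

def compose(f2, f1):
    """The pair describing z -> f2(f1(z))."""
    (a2, b2), (a1, b1) = f2, f1
    return (a1 * a2, a2 * b1 + b2)

c1, r1 = 0 + 0j, 1.0
c2, r2 = 5 + 1j, 2.0
c3, r3 = 2 + 6j, 3.5

P1 = focal_point(c2, r2, c3, r3)   # focal point of (S2, S3)
P2 = focal_point(c3, r3, c1, r1)   # focal point of (S3, S1)
P3 = focal_point(c1, r1, c2, r2)   # focal point of (S1, S2)

D1 = dilation(P1, r3 / r2)
D2 = dilation(P2, r1 / r3)
D3 = dilation(P3, r2 / r1)
a, b = compose(D3, compose(D2, D1))

# Twice the signed area of triangle P1 P2 P3; zero exactly when collinear.
defect = ((P2 - P1) * (P3 - P1).conjugate()).imag
print(P1, P2, P3)
print(a, b, defect)
```

The composition has a = 1 and b = 0, so it is the identity, and the three focal points are collinear up to rounding.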
6 This problem is solved using complex numbers in section 8.4.
Consider the spiral symmetry S_C centred at C which takes L to P. Note that S_C has dilation
factor r and rotation angle θ. Consider also the spiral symmetry S_B centred at B which
takes P to K. Note that S_B has dilation factor 1/r and rotation angle θ.
The composition S_B ∘ S_C has dilation factor 1 and rotation angle 2θ. It is thus a rotation
about some point. If we can show that this point is M, then we are done because the
composition takes L to K.
Since the only point fixed by a rotation is the centre of rotation, it suffices to show that M is
fixed under the composition. This is rather easy. Indeed suppose that S_C(M) = M′, then
triangles CMM′ and CLP are similar. Thus MM′ ⊥ BC. Since M is the midpoint of BC
we also have that triangles BMM′ and CMM′ are congruent. Thus S_B(M′) = M. Hence M is
indeed fixed under the composition.
In fact we have proven not only that KM = LM but also that ∠KM L = 2θ, which is even
more than we were asked to prove!
Note that properties 6, 7 and 8 are not independent of each other. For example, if we choose
to transform an ellipse into a circle, then we lose the freedom to transform any triangle into
any triangle.
Problem Let G be the centroid of triangle ABC and let M be the midpoint of BC. Let X
and Y be on AB and AC, respectively, such that the points X, G and Y are collinear and so
that XY is parallel to BC. Suppose that XC and BG intersect at Q and that Y B and GC
intersect at P .
Show that triangle M P Q is similar to triangle ABC.
Note that we could just as easily have assumed that ABC was a right isosceles triangle and
used coordinate geometry.
8 Complex numbers
Many geometry problems can be solved by simply tossing them onto the complex number
plane and then doing a few routine calculations. Arithmetical operations involving complex
numbers1 can be seen to have geometric interpretations. Indeed if α ∈ C is a constant, we
have the following table of interpretations for any complex number z.
Algebra                      Geometry
z ± α                        Translation
z · α, z/α (for α ≠ 0)       Spiral symmetry
                             Dilation if α ∈ R
                             Rotation if |α| = 1
z̄ (conjugate)                Reflection
|z|                          Length
arg(z)                       Angle
Roots of z^n = 1             Vertices of regular n-gon
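Several rows of this table can be tried out directly with Python's built-in complex numbers. The following sketch is purely illustrative, and the sample values and the name `roots_of_unity` are our own.

```python
import cmath

def roots_of_unity(n):
    """The n solutions of z^n = 1, in anticlockwise order starting from 1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

z = 3 + 4j
print(z + (1 - 2j))      # translation by 1 - 2i
print(z * 1j)            # rotation by 90 degrees anticlockwise, since |i| = 1
print(z * 2)             # dilation by factor 2, since 2 is real
print(z.conjugate())     # reflection in the real axis
print(abs(z))            # length of the segment from 0 to z
print(cmath.phase(1j))   # angle of i, namely pi / 2

hexagon = roots_of_unity(6)
sides = [abs(hexagon[(k + 1) % 6] - hexagon[k]) for k in range(6)]
print(sides)             # six equal sides: a regular hexagon
```

Since the sixth roots of unity all lie on the unit circle and consecutive ones are equally spaced, they form a regular hexagon with side length 1.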
8.0 Problems
1. Suppose that points A and B in the plane are represented by the complex numbers α
and β.
Find geometric interpretations for the arithmetic, geometric and harmonic means of α
and β.
(2 + i)(3 + i) = 5 + 5i.
7. Draw any quadrilateral. On each side draw a square lying outside the given quadrilateral.
Draw line segments joining the centres of opposite squares.
Show that the two line segments are equal in length and perpendicular.
8. Let $Q = A_0A_1A_2A_3$ be a quadrilateral in the plane. Given a point $M_0$ in the plane we
define the sequence $M_0, M_1, M_2, \ldots$ of points in the plane as follows. If $n \equiv i \pmod 4$,
then $M_{n+1}$ is the point obtained from rotating the point $M_n$ anticlockwise about $A_i$ by
$90^\circ$.
(a) Prove that if $M_{2008} = M_0$ for one point $M_0$ in the plane, then we also have
$M_{2008} = M_0$ for all points $M_0$ in the plane.
(b) Suppose that Q is a parallelogram. Prove that if $M_{2008} = M_0$, then Q is a square.
Under what circumstances is the converse true?
(c) Suppose that Q is a general quadrilateral. Find simple necessary and sufficient
conditions on Q such that $M_{2008} = M_0$.
9. Let ABCDE be a convex pentagon such that
maximised or minimised?
(c) What results can you derive if P is allowed to vary over a circle concentric with
the circumcircle?
11. (a) Show there is an equiangular (i.e. having all internal angles equal) hexagon with
side lengths 1, 2, 3, 4, 5 and 6 in some order.
(b) Do the same for a 15-gon.
(c) Generalise to any n-gon, where n is not a power of a prime.
(d) If n has at least three different prime factors, show that we can construct an
equiangular n-gon with side lengths $1^2, 2^2, \ldots, n^2$ in some order.
12. For which integers n ≥ 3 is it possible to find a convex n-gon with all its interior angles
equal but all its side lengths different positive integers?
13. If P (x), Q(x), R(x) and S(x) are all polynomials such that
14. Suppose
$$(1 + x + x^2 + \cdots + x^{10})^{200} = a_0 + a_1 x + \cdots + a_{2000} x^{2000}.$$
Problem Show that the midpoints of a quadrilateral form a parallelogram whose diagonals
intersect at the centroid of the original quadrilateral.
Solution Let the quadrilateral be ABCD. Let the midpoints of AB, BC, CD and DA be
E, F , G and H, respectively. If we toss the quadrilateral onto the complex plane, then we
have
$$E = \tfrac12(A + B), \quad F = \tfrac12(B + C), \quad G = \tfrac12(C + D) \quad\text{and}\quad H = \tfrac12(D + A).$$
[Figure: quadrilateral ABCD with the midpoints E, F, G and H of its sides AB, BC, CD and DA.]
We know that EFGH is a parallelogram if and only if EF is equal and parallel to HG. That
is, if
$$F - E = G - H.$$
A routine calculation shows in fact that they both equal $\tfrac12(C - A)$. Thus EFGH is indeed a
parallelogram.
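If you would like to see this calculation confirmed numerically, the following Python sketch tests it on a sample quadrilateral. It is only an illustration: the function names are invented for this example, the vertices are stored as Python complex numbers, and we assume that the centroid of the quadrilateral means the average of its four vertices.

```python
def midpoint(p, q):
    """Midpoint of two points given as complex numbers."""
    return (p + q) / 2


def midpoint_parallelogram_check(A, B, C, D, tol=1e-9):
    """Check that EFGH is a parallelogram whose diagonals meet at the centroid."""
    E, F, G, H = midpoint(A, B), midpoint(B, C), midpoint(C, D), midpoint(D, A)

    # F - E and G - H should both equal (C - A) / 2.
    sides_ok = abs((F - E) - (G - H)) < tol and abs((F - E) - (C - A) / 2) < tol

    # The diagonals EG and FH of a parallelogram meet at their common midpoint.
    centroid = (A + B + C + D) / 4  # assumption: centroid = average of the vertices
    diagonals_ok = (abs(midpoint(E, G) - centroid) < tol
                    and abs(midpoint(F, H) - centroid) < tol)
    return sides_ok and diagonals_ok
```

For instance, `midpoint_parallelogram_check(0j, 4 + 0j, 5 + 3j, 1 + 2j)` should return `True`. Of course, a numerical check is no substitute for the one-line algebraic proof.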
8.2 Angles
Given an angle ∠AOB, where O is at the origin of the complex plane, we can express
$$\angle AOB = \arg B - \arg A = \arg\frac{B}{A}.$$
Problem Let A and B be two fixed points on a circle Γ and let XY be a variable diameter
of Γ. Find the locus of points P defined by the intersection of AX and BY as XY varies
over Γ.
Solution A careful diagram suggests that P lies on a circle passing through A and B. We
can verify this by showing that ∠AP B is constant. This is equivalent to showing that the
complex number $z = \frac{A - X}{B - Y}$ has constant argument. This in turn is equivalent to showing that
$\frac{z}{\bar z}$ is constant.
[Figure: circle Γ with centre O, the points A and X on the circle, and the point P.]
3 Try and prove this fact for yourself. It’s not hard.
We may assume that the centre O of Γ is the origin of the complex plane. Thus XY being
a diameter means that Y = −X. Let r be the radius of the circle. Thus for any complex
number α lying on the circle, we have $\alpha\bar\alpha = r^2$.
We compute
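$$\begin{aligned}
\frac{z}{\bar z} &= \frac{A - X}{B - Y} \cdot \frac{\bar B - \bar Y}{\bar A - \bar X} \\
&= \frac{A - X}{B - Y} \cdot \frac{\frac{r^2}{B} - \frac{r^2}{Y}}{\frac{r^2}{A} - \frac{r^2}{X}} \\
&= \frac{AX}{BY} \\
&= -\frac{A}{B}.
\end{aligned}$$
But $\arg z = \tfrac12 \arg\frac{z}{\bar z}$. Thus $\angle APB = \tfrac12\angle AOB$, which is a constant.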
Do not be content to finish here with the problem! The only place where we actually
used $Y = -X$ was to simplify $\frac{A \cdot X}{B \cdot Y}$ to $-\frac{A}{B}$. In fact, if $\frac{X}{Y}$ is constant, we still derive that
$\frac{z}{\bar z}$ is constant. But saying that $\frac{X}{Y}$ is constant is the same as saying that XY is a chord of constant length.
Thus our solution by complex numbers shows that the problem can in fact be generalised.
Can you discuss the significance of the negative sign?
Can you discuss how we account for both parts, that is, the major and minor arcs of the
circle which form the locus of P ?
$(2 + i)^3 = 2 + 11i.$
Problem Let P AB and P XY be equilateral triangles oriented the same way. Let K, L and
M be the midpoints of P A, BX and P Y , respectively.
Prove that triangle KLM is equilateral.
Solution Toss the figure onto the complex plane so that P = 0. Let $\omega = e^{\pi i/3}$. Multiplication
by ω corresponds to a rotation of 60◦ anticlockwise. Thus we may write
B = ωA and Y = ωX.
[Figure: equilateral triangles PAB and PXY with the midpoints K, L and M.]
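One way to finish from here is to write K, L and M in terms of A and X and then check that $M - K = \omega(L - K)$, which says that rotating L about K by 60° gives M. The Python sketch below tests that relation numerically. It is an illustration under the set-up above (P = 0, B = ωA, Y = ωX); the function names are invented for this example.

```python
import cmath

OMEGA = cmath.exp(1j * cmath.pi / 3)  # rotation by 60 degrees anticlockwise


def klm(A, X):
    """Midpoints K of PA, L of BX and M of PY, with P at the origin."""
    B, Y = OMEGA * A, OMEGA * X
    K = A / 2
    L = (B + X) / 2
    M = Y / 2
    return K, L, M


def is_equilateral(K, L, M, tol=1e-9):
    """True if rotating L about K by 60 degrees anticlockwise lands on M."""
    return abs((M - K) - OMEGA * (L - K)) < tol
```

For example, `is_equilateral(*klm(3 + 1j, -2 + 5j))` should return `True`. The algebra behind it uses only $\omega^2 - \omega + 1 = 0$ and $\omega^3 = -1$.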
Problem Let similar triangles AKB, BLC and CM A be constructed on the exterior of
triangle ABC.
Prove that the centroids of triangles ABC and KLM coincide.
Solution
[Figure: triangle ABC with the similar triangles AKB, BLC and CMA constructed on its exterior.]
Let z be the complex number such that $\arg z = \angle BAK$ and $|z| = \frac{AK}{AB}$. Due to the similarity
of the three triangles we may write
Problem Let P be a point inside triangle ABC such that ∠P BA = ∠P CA. Let K and L
be the feet of the perpendiculars from P to AB and AC, respectively. Let M be the midpoint
of BC.
Prove that KM = LM .5
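[Figure: triangle ABC with the feet K and L of the perpendiculars from P, the midpoint M of BC, and the equal angles θ at B and C.]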
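Triangles PKB and PLC are similar but are oppositely oriented. Assume that
$\frac{BK}{PK} = \frac{CL}{PL} = r$.
Point B is obtained by rotating P about K by $90^\circ$ clockwise and dilating by a factor r. This
corresponds to multiplying by the complex number $-ir$. Thus $B - K = -ir(P - K)$, that is,
$$B = -irP + (1 + ir)K.$$
Similarly,
$$C = irP + (1 - ir)L.$$
Since M is the midpoint of BC, we have $M = \tfrac12(B + C) = \tfrac12(1 + ir)K + \tfrac12(1 - ir)L$. Hence
$$\begin{aligned}
& |K - M| = |L - M| \\
\Leftrightarrow\ & \left|\tfrac12(1 - ir)K - \tfrac12(1 - ir)L\right| = \left|\tfrac12(1 + ir)L - \tfrac12(1 + ir)K\right| \\
\Leftrightarrow\ & |K - L||1 - ir| = |K - L||1 + ir|.
\end{aligned}$$
This is true because r is a real number and so $1 - ir$ and $1 + ir$ are complex conjugates with
common magnitude $\sqrt{1 + r^2}$.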
5 This problem was solved using spiral symmetries in section 7.4.
Note that in the above solution point A was not used at all in the calculation. This is
because the important part consisted of the two similar triangles P KB and P LC. In such a
calculational solution it is probably better only to draw in what you need to complete the
calculation. In this case point A and the segments AK and AL could be omitted.
$$x^{n-1} + x^{n-2} + \cdots + 1 = 0.$$
Solution Dilate the n-gon by a factor of $\frac1r$ and toss it onto the complex plane so that each
$A_k$ is an nth root of unity. Specifically, $A_k = \omega^k$, where $\omega = e^{2\pi i/n}$. Furthermore, since P is a
point on the circumcircle we may identify P with a complex number ρ satisfying
$$\rho\bar\rho = |\rho|^2 = 1.$$
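[Figure: regular heptagon $A_0A_1 \ldots A_6$ inscribed in the unit circle, with $A_0 = 1$, $A_1 = \omega$, $A_2 = \omega^2$, and the point P on the circle.]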
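Recall that
$$1 + \omega + \omega^2 + \cdots + \omega^{n-1} = 0.$$
$$\begin{aligned}
PA_k^2 &= |\rho - \omega^k|^2 \\
&= (\rho - \omega^k)\overline{(\rho - \omega^k)} \\
&= (\rho - \omega^k)(\bar\rho - \bar\omega^k) \\
&= \rho\bar\rho + (\omega\bar\omega)^k - \rho\bar\omega^k - \bar\rho\omega^k \\
&= |\rho|^2 + |\omega|^{2k} - \rho\bar\omega^k - \bar\rho\omega^k \\
&= 2 - \rho\bar\omega^k - \bar\rho\omega^k.
\end{aligned}$$
Thus
$$\begin{aligned}
\sum_{k=0}^{n-1} PA_k^2 &= \sum_{k=0}^{n-1} \left(2 - \rho\bar\omega^k - \bar\rho\omega^k\right) \\
&= 2n - \rho\sum_{k=0}^{n-1} \omega^{n-k} - \bar\rho\sum_{k=0}^{n-1} \omega^k \\
&= 2n - \rho\sum_{k=0}^{n-1} \omega^k - \bar\rho\sum_{k=0}^{n-1} \omega^k \\
&= 2n - 0 - 0 \\
&= 2n.
\end{aligned}$$
Thus for a regular n-gon of unit circumradius the sum required is 2n. If the circumradius
is r, then the answer is $2nr^2$, which is independent of P.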
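A quick numerical experiment makes the independence from P easy to believe. The following Python sketch places the regular n-gon on a circle of radius r centred at the origin and lets P sit at angle theta on the same circle. The function name and the parametrisation by theta are assumptions made for this illustration.

```python
import cmath


def sum_squared_distances(n, r, theta):
    """Sum of PA_k^2 over the vertices of a regular n-gon with circumradius r."""
    vertices = [r * cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    P = r * cmath.exp(1j * theta)  # a point on the circumcircle
    return sum(abs(P - v) ** 2 for v in vertices)
```

Whatever value of theta you try, the result should agree with $2nr^2$ up to rounding error. For example, n = 5 and r = 2 should give a value very close to 40.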
Problem Let p ≥ 3 be a prime number and let P be a convex p-gon with all its interior
angles equal and all its side lengths positive integers.
Prove that P is regular.
Solution Suppose that the side lengths of P in order were $a_0, a_1, \ldots, a_{p-1}$. It would then
follow that
$$\sum_{k=0}^{p-1} a_k\omega^k = 0,$$
where $\omega = e^{2\pi i/p}$. (Can you see why?)
So ω is a root of the polynomial
$$f(x) = \sum_{k=0}^{p-1} a_k x^k.$$
It is also a root of $g(x) = 1 + x + x^2 + \cdots + x^{p-1}$. Thus ω is a root of
$$d(x) = \gcd(f(x), g(x)).$$
However, g(x) is irreducible.⁶ Thus d(x) = g(x) and so g(x) is a factor of f(x). Since f(x) and
g(x) have the same degree, we must have $f(x) = a_0 g(x)$. Consequently, $a_0 = a_1 = \cdots = a_{p-1}$
and therefore, P is regular.
Solution Let
$$p(x) = \sum_{j=0}^{2000} a_j x^j = (1 + x + \cdots + x^{10})^{200}.$$
Let $\omega = e^{2\pi i/11}$. Note that $p(\omega^k) = 0$ provided that $11 \nmid k$.
We now substitute $x = \omega^k$ for k = 0, 1, 2, . . . , 10 into p(x). This yields the following 11
equations.
$$\begin{aligned}
\sum_{j=0}^{2000} a_j &= 11^{200} \\
\sum_{j=0}^{2000} a_j\omega^{j} &= 0 \\
\sum_{j=0}^{2000} a_j\omega^{2j} &= 0 \\
&\ \,\vdots \\
\sum_{j=0}^{2000} a_j\omega^{10j} &= 0
\end{aligned}$$
6 To prove that $g(x) = 1 + x + x^2 + \cdots + x^{p-1}$ is irreducible over Z whenever p is a prime, take the change
of variables x = y + 1, expand the brackets, then use the upstairs–downstairs technique modulo p from
section 9.10.
9 Polynomials
Polynomials are just algebraic expressions of a particular type. However, they actually consti-
tute much more than a boring branch of algebra. In fact, polynomials have many interesting
properties and many amazing connections to other flavoursome areas of mathematics. This
need for lateral thinking provides a fertile breeding ground for mathematical problems, and
in this chapter we’ll sample a variety of those goodies. In order to proceed, you’ll need
some basic knowledge about polynomials, though not too much. Let’s start by introducing a bit
of jargon, while various results concerning polynomials will appear, mostly without proof,
scattered throughout the chapter.
We refer to $a_0$ as the constant term and we refer to $a_n$ as the leading coefficient, provided
that it is non-zero. If $a_n$ is equal to 1, then we say that the polynomial is monic.
Equality almost always holds in the last line. The only time it does not is when the
leading terms of P (x) and Q(x) cancel each other out.
We say that P (x) is a divisor or a factor of Q(x) if there exists a polynomial M (x) such
that P (x)M (x) = Q(x).
9.0 Problems
1. Let P (x) be a real polynomial satisfying
4. Find the cubic monic polynomial whose roots are the cubes of the roots of
$x^3 - x^2 + x - 2 = 0.$
$(x + 1)^n + x^n + 1$
9. Find a polynomial P(x) such that P(x) is divisible by $x^2 + 1$ and P(x) + 1 is divisible
by $x^3 + x^2 + 1$.
10. Find all real polynomials P (x) such that
xP (x − 1) = (x − 2)P (x).
11. Let P (x) be an integer polynomial such that the equation P (x) = 5 has five distinct
integer solutions.
Prove that the equation P (x) = 8 has no integer solutions.
12. Find all real polynomials P (x) such that
$P(x)P(x + 1) = P(x^2).$
13. (a) Does there exist a non-constant integer polynomial P (x) such that the sequence
P (1), P (2), P (3), . . . consists only of primes?
(b) Does there exist a non-constant real polynomial P (x) such that the sequence
P (1), P (2), P (3), . . . consists only of primes?
14. Find all real polynomials P (x) such that, if a is a real number and P (a) is an integer,
then a must also be an integer.
15. (a) Prove that the polynomial
is irreducible over Q.
(b) Let $P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ be an integer polynomial and suppose
that there exists a prime p such that
is irreducible over Q.
16. (a) Show that there is a unique polynomial p(x) with integer coefficients in the set
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9} such that p(−2) = p(−5) = 10.
(b) What happens if p(−2) = p(−5) = 2009 instead?
17. Let p(x) be a polynomial with integer coefficients. Suppose that there are integers
x1 , x2 , . . . , xn such that
Prove that x1 = x3 .
18. (a) Which polynomials can be written as a finite sum of cubes of polynomials with
real coefficients?
(b) Which polynomials can be written as a finite sum of cubes of polynomials with
integer coefficients?
19. Let a1 , a2 , . . . , a100 , b1 , b2 , . . . , b100 be distinct real numbers. For each i and j the number
ai − bj is written in the ith row and jth column of a 100 × 100 table. Suppose that the
product of the numbers in each column is equal to 1.
Prove that the product of the numbers in each row is equal to −1.
20. Show that the set of real numbers x which satisfy the inequality
$$\sum_{k=1}^{70} \frac{k}{x - k} \ge \frac54$$
$$p(k) = \frac{k}{k + 1}, \quad\text{for } k = 0, 1, 2, \ldots, n.$$
$$p(k) = 2^k, \quad\text{for } k = 0, 1, 2, \ldots, n.$$
and
p(1) = p(3) = p(5) = p(7) = p(9) = 1.
26. Let k be a given positive integer and let F (x) be an integer polynomial satisfying
0 ≤ F (c) ≤ k, for c = 0, 1, . . . , k + 1.
$p(x) = x^6 + x^5 + x^4 + x^3 + x^2 + k.$
28. Let P (x) and Q(x) be polynomials whose coefficients are all equal to 1 or 7.
If P (x) divides Q(x), prove that 1 + deg P (x) divides 1 + deg Q(x).
29. Find all polynomials p(x, y) with real coefficients such that
30. The graph of a monic cubic polynomial contains the vertices of exactly one square.
What is the area of the square?
31. Prove that every real polynomial can be multiplied by a non-zero real polynomial to
obtain a polynomial whose exponents are all divisible by 1000.
32. Find all integer polynomials f (x) such that f (a) = f (b) for infinitely many pairs of
integers a, b with a ≠ b.
33. Let F (x) be a real polynomial of degree n.
Show that F (x) has n real roots if and only if it is not possible to write
Identity theorem
If a polynomial has infinitely many roots, then it is the zero polynomial.
If two polynomials satisfy P (x) = Q(x) for infinitely many values of x, then the
polynomials P (x) and Q(x) are equal.
If two polynomials of degree at most n satisfy P (x) = Q(x) for n + 1 values of x, then
the polynomials P (x) and Q(x) are equal.
You should definitely try your hand at proving the identity theorem. It should be well within
your grasp, particularly if you are acquainted with the following important results.
Factor theorem and remainder theorem The factor theorem states that the number
r is a root of P (x) if and only if P (x) is divisible by x − r. More generally, the remainder
theorem states that P (r) = c if and only if the remainder after dividing P (x) by x − r is c.
This proof is a little sloppy. The following is a brief sketch of how you should write it down,
given that you want to produce a completely rigorous proof.
You might start by explicitly defining the sequence 0, 1, 2, 5, . . . that we are interested in by
defining a sequence a0 , a1 , . . . which satisfies
Next, you would show that this is an increasing sequence. Then you would prove that
P (an ) = an for each n, by induction.
If you have done all of this correctly, then you now have infinitely many values of x for which
P (x) = x and can invoke the identity theorem just as we did above.
a = qb + r, where 0 ≤ r < |b|. One of the lessons you will hopefully learn from this chapter
is the fact that many concepts from number theory have natural analogues in the world of
polynomials. The division algorithm provides just one example.
Division algorithm for polynomials For any two polynomials A(x) and B(x) ≠ 0, there
is a unique way to write
A(x) = Q(x)B(x) + R(x),
where deg R(x) < deg B(x).
You can convince yourself that this is true by performing polynomial long division on a few
well chosen pairs of polynomials.
Problem Let P(x) be a real polynomial satisfying P(a) = A and P(b) = B, where a ≠ b.
Determine the remainder when P (x) is divided by (x − a)(x − b).
Solution The division algorithm asserts that there is a unique way to write
P (x) = Q(x)(x − a)(x − b) + R(x),
where R(x) is linear or a constant. Our goal, of course, is to determine R(x). Plugging the
values a and b into this equation, we find that
R(a) = P (a) = A and R(b) = P (b) = B.
Therefore, the two points (a, A) and (b, B) lie on the graph of y = R(x), which we know to
be a line. So we can easily deduce that the desired remainder is
$$R(x) = \frac{A - B}{a - b}\,x + \frac{aB - Ab}{a - b}.$$
Of course, the answer is nicely symmetric, as we should have expected.
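To see the formula in action, you might compare it with the remainder produced by honest polynomial long division. The Python sketch below does this with exact rational arithmetic. It is only an illustration: the helper names are invented, and polynomials are stored as coefficient lists with the highest degree first.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest degree first) at x by Horner's rule."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def poly_divmod(num, den):
    """Long division of num by den; returns (quotient, remainder) as lists."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)  # the leading coefficient is now zero
    return quotient, num


def remainder_by_formula(P, a, b):
    """Remainder of P(x) on division by (x - a)(x - b), from the formula above."""
    A, B = poly_eval(P, a), poly_eval(P, b)
    slope = (A - B) / Fraction(a - b)
    intercept = (a * B - A * b) / Fraction(a - b)
    return [slope, intercept]
```

For example, take $P(x) = x^4 - 3x^2 + 2x + 5$ with a = 2 and b = −1, so that $(x - a)(x - b) = x^2 - x - 2$. Both `poly_divmod([1, 0, -3, 2, 5], [1, -1, -2])[1]` and `remainder_by_formula([1, 0, -3, 2, 5], 2, -1)` should represent the remainder 4x + 5.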
For the remainder of this chapter, we will never again mention polynomials with complex
coefficients. So, you might be wondering, why is the fundamental theorem of algebra so useful
to us? It’s simply because we’ll be dealing with polynomials whose coefficients are integers,
rationals and real numbers, and these are all just particular examples of complex numbers.
Problem Find all real polynomials P (x) which satisfy the equation
(x − 16)P (2x) = 16(x − 1)P (x).
Now substitute this into both sides of the given equation—a good way to start many a
polynomial problem. The left-hand side is
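$$A(x - 16)(2x - r_1)\cdots(2x - r_n) = 2^n A(x - 16)\left(x - \frac{r_1}{2}\right)\cdots\left(x - \frac{r_n}{2}\right),$$
while the right-hand side is
$$16A(x - 1)(x - r_1)\cdots(x - r_n).$$
Comparing the leading coefficients of the two sides gives
$$2^n A = 16A,$$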
from which it follows that n = 4. We can now compare roots to obtain the fact that the
following two sets contain exactly the same elements.
$$\left\{16, \frac{r_1}{2}, \frac{r_2}{2}, \frac{r_3}{2}, \frac{r_4}{2}\right\} = \{1, r_1, r_2, r_3, r_4\}$$
Without loss of generality, let $r_1 = 16$ which implies that $\frac{r_1}{2} = 8$. Then, without loss of
generality, let $r_2 = 8$ which implies that $\frac{r_2}{2} = 4$.
Proceeding in this fashion yields
$$P(x) = A(x - 2)(x - 4)(x - 8)(x - 16),$$
where A can be any real number. As usual, you should substitute this back into the original
equation to check that it actually works, and you will find that it certainly does.
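If you would rather let a computer do the substitution, here is a small Python sketch. It assumes that continuing the pattern $r_1 = 16$, $r_2 = 8$ gives the roots 16, 8, 4 and 2, so that $P(x) = A(x - 2)(x - 4)(x - 8)(x - 16)$. The function names are invented for this example.

```python
def P(x, A=1):
    """Candidate solution A(x - 2)(x - 4)(x - 8)(x - 16)."""
    return A * (x - 2) * (x - 4) * (x - 8) * (x - 16)


def satisfies_equation(x, A=1):
    """Check (x - 16) P(2x) = 16 (x - 1) P(x) at a single value of x."""
    return (x - 16) * P(2 * x, A) == 16 * (x - 1) * P(x, A)
```

Both sides of the equation are polynomials of degree at most five, so by the identity theorem it is enough for `satisfies_equation` to return `True` at six different values of x. Using integer inputs keeps the arithmetic exact.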
There are nice relations between these two descriptions, which you obtain by expanding out
the second expression and comparing coefficients with the first.
Observe that the signs on the right-hand sides of these equations alternate between negative
and positive. The first equation involves the sum of the roots, the second equation involves
the sum of the products of the roots taken two at a time, the third equation involves the sum
of the products of the roots taken three at a time, and so on. Such expressions, known as
elementary symmetric functions, appear in the following important result.
Problem Suppose that P(x) is a degree three polynomial with roots p, q, r such that
$$P\left(\tfrac15\right) + P\left(-\tfrac15\right) = 82P(0).$$
Solution If we let $P(x) = ax^3 + bx^2 + cx + d$, then the given equation becomes
$$\left(\frac{a}{125} + \frac{b}{25} + \frac{c}{5} + d\right) + \left(-\frac{a}{125} + \frac{b}{25} - \frac{c}{5} + d\right) = 82d.$$
This simplifies to
$$\frac{2}{25}\,b + 2d = 82d,$$
from which we quickly obtain
$$\frac{b}{d} = 1000.$$
But from Vieta’s formulas, we know that
$$\frac{1}{pq} + \frac{1}{qr} + \frac{1}{rp} = \frac{-a(p + q + r)}{-apqr} = \frac{b}{d} = 1000.$$
Now let’s turn our attention to something just a little more difficult.
Although the left-hand side of the equation appears to involve a symmetric expression of the
roots of P (x), the root 1 is missing. So our approach might be to find a polynomial whose
roots are precisely $r_1, r_2, \ldots, r_n$. That polynomial would be
$$\frac{P(x)}{x - 1} = x^n + x^{n-1} + x^{n-2} + \cdots + 1.$$
Subsequently, your strategy might be to expand out the given expression, rely on Vieta’s
formulas, and hope for the best. But such an approach would create a huge mess, requiring
you to put the given expression over a common denominator, followed by excessive amounts
of algebra. However, a much slicker approach is available.
Since this polynomial has degree n + 1, it must have n + 1 complex roots counting multiplicity.
So just as we previously excluded the root 1, we now must exclude the root 0. In summary,
$s_1, s_2, \ldots, s_n$ are precisely the roots of the polynomial
$$Q(x) = \frac{(1 - x)^{n+1} - 1}{x}.$$
Now if we write $Q(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, then Vieta’s formulas tell us that
$$\frac{1}{s_1} + \frac{1}{s_2} + \cdots + \frac{1}{s_n} = \frac{s_2 s_3 \cdots s_n + \cdots + s_1 s_2 \cdots s_{n-1}}{s_1 s_2 \cdots s_n} = -\frac{a_1}{a_0}.$$
In order to obtain the coefficients of Q(x), we simply need to expand $\frac{(1 - x)^{n+1} - 1}{x}$ using the
good old binomial formula.
$$Q(x) = -\binom{n+1}{1} + x\binom{n+1}{2} - x^2\binom{n+1}{3} + \cdots + (-1)^{n+1}x^n$$
Therefore, we conclude that
$$\frac{1}{s_1} + \frac{1}{s_2} + \cdots + \frac{1}{s_n} = \frac{\binom{n+1}{2}}{\binom{n+1}{1}} = \frac{n}{2}.$$
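A numerical sanity check is easy to write. The roots of Q(x) are the non-zero solutions of $(1 - x)^{n+1} = 1$, which we take here to be the numbers 1 − ζ, where ζ runs over the (n + 1)th roots of unity other than 1. The Python sketch below sums their reciprocals directly and compares the result with the binomial quotient. The function names are invented for this illustration.

```python
import cmath
from math import comb


def reciprocal_sum(n):
    """Sum of 1/s over the non-zero roots s = 1 - zeta of (1 - x)^(n+1) = 1."""
    total = 0
    for k in range(1, n + 1):
        zeta = cmath.exp(2j * cmath.pi * k / (n + 1))  # a root of unity, not 1
        total += 1 / (1 - zeta)
    return total


def vieta_value(n):
    """The quotient of binomial coefficients obtained from Vieta's formulas."""
    return comb(n + 1, 2) / comb(n + 1, 1)
```

For each n you try, `reciprocal_sum(n)` should agree with `vieta_value(n)` and with n/2, up to rounding error in the complex arithmetic.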
The following is another useful little result, very easy to prove, which applies only to integer
polynomials.
a − b | P (a) − P (b)
P (P (P (P (a)))) = a.
Solution Consider the sequence defined by a0 = a and an+1 = P (an ) for all non-negative
integers n. The sequence consists entirely of integers and we have a4 = P (P (P (P (a)))) = a0 .
It follows that a5 = a1 , a6 = a2 , a7 = a3 , and so on. In other words, the sequence is periodic
with period four.
By the lemma above, we have
for every positive integer k. In particular, we have the following chain of divisibilities.
a1 − a0 | a2 − a1 | a3 − a2 | a0 − a3 | a1 − a0
If a term appearing in this chain is equal to 0, then they must all be, since 0 doesn’t divide
any non-zero integer. In this case, we would have a1 = a0 or equivalently, P (a) = a, which
immediately implies that P (P (a)) = a.
Otherwise, all terms appearing in the chain are non-zero. Since a | b implies that b = 0 or
|a| ≤ |b|, we have the following chain of inequalities.
If you fully understand the previous solution, then you should have no problem in proving
the following, more general, statement.
P (P (· · · P (P (a)) · · · )) = a,
then P (P (a)) = a.
If you are familiar with complex conjugation, you should be able to prove the conjugate root
theorem without too much trouble.
Problem Prove that any real polynomial can be written as a product of real linear and real
quadratic polynomials.
Solution The conjugate root theorem allows us to list the real roots of P (x) as r1 , r2 , . . . , rk
and the non-real roots as $z_1, \bar z_1, z_2, \bar z_2, \ldots, z_m, \bar z_m$. In other words, for some real number A,
we can write
This means that each complex conjugate pair of roots contributes a real quadratic factor to
P (x). Clearly, each real root contributes a real linear factor to P (x), so we’re done.
Solution Normally it’s quite difficult to tell, simply by looking at a polynomial, whether
or not it has a real root greater than 1. However, it’s often easier to tell whether or not a
polynomial has a real root greater than 0, as we’ll soon see.
Motivated by these reasons, let’s set x = y + 1 and substitute into the expression for P (x) to
obtain
Note that P (x) has a real root greater than 1 if and only if P (y + 1) has a real root greater
than 0. However, it’s clear by inspection that P (y + 1) is positive whenever y is positive. So
we can conclude that P (y + 1) has no real root greater than 0. Therefore, P (x) has no real
root greater than 1.
For the next problem, we use the fact that the product of two sums of two perfect squares is
another sum of two perfect squares. This follows from the algebraic identity
$$(A^2 + B^2)(C^2 + D^2) = (AC - BD)^2 + (AD + BC)^2.$$
You could prove this simply by expanding both sides and verifying that they are equal. For
the complex numbers expert, a more insightful approach is to recognise that the result is
equivalent to
|A + Bi| × |C + Di| = |(A + Bi) × (C + Di)|.
Either way, this is one of those remarkable formulas that every good mathemagician should
know!
Furthermore, this identity is not just true when A, B, C and D are numbers, but any
quantities that you can add, subtract and multiply. In particular it is true if A, B, C and D
are functions, and of course these include polynomials.
Problem Let P (x) be a real polynomial such that P (x) ≥ 0 for all real x. Prove that it’s
possible to write
P (x) = F (x)2 + G(x)2
for real polynomials F (x) and G(x).
Solution If P (x) has a real root r with odd multiplicity, then the sign of the graph y = P (x)
will change from positive to negative or vice versa at x = r. Therefore, every real root of
P (x) must have even multiplicity.
Now we use the result from section 9.6 to write
Here, the real number A must be non-negative to ensure that P (x) ≥ 0 for all real x. Therefore,
we can write
$$P(x) = R(x)^2 Z_1(x) Z_2(x) \cdots Z_m(x),$$
where
$$R(x) = \sqrt{A}\,(x - r_1)(x - r_2)\cdots(x - r_k)$$
and
$$Z_i(x) = (x - z_i)(x - \bar z_i).$$
If $z_i = u_i + v_i i$ with $u_i$ and $v_i$ real, then $Z_i(x) = (x - u_i)^2 + v_i^2$.
Therefore, we can write each of $Z_1(x), Z_2(x), \ldots, Z_m(x)$ as a sum of two perfect squares.
Since $R(x)^2 = R(x)^2 + 0^2$ is also a sum of two perfect squares, we can repeatedly use our
mathemagical identity
$$(A^2 + B^2)(C^2 + D^2) = (AC - BD)^2 + (AD + BC)^2$$
to write the product $P(x) = R(x)^2 Z_1(x) Z_2(x) \cdots Z_m(x)$ as a sum of two perfect squares.
9.8 Irreducibility
By the fundamental theorem of algebra, every polynomial can be factorised into linear factors,
if we are allowed to use complex numbers. But given a polynomial with integer coefficients,
one might ask whether or not it can be factorised into two polynomials, each with integer
coefficients themselves. Of course, you can take out a factor of 1 or perhaps some larger
integer, but this is totally boring. So we say that a polynomial F (x) is reducible over Z if it
can be written as a product F (x) = G(x)H(x), where G(x) and H(x) are integer polynomials
of positive degree. If this is not possible, then we say that F (x) is irreducible over Z. Similarly
if F (x) = G(x)H(x), where G(x) and H(x) are rational polynomials of positive degree, we
say that F (x) is reducible over Q and otherwise, that it is irreducible over Q.
Clearly, if an integer polynomial is reducible over Z, then it’s reducible over Q, because every
integer is certainly rational. More surprisingly, the converse is true as well.
Gauss’ lemma If an integer polynomial is reducible over Q, then it’s reducible over Z. So,
for an integer polynomial, reducibility over Z and reducibility over Q are the same thing.
Showing that a given polynomial is irreducible can be very difficult. We’ll see a few different
techniques as we go on, but for the following problem, we’ll use a rather direct approach.
F (x) = (x − a1 )(x − a2 ) · · · (x − an ) − 1
is irreducible over Z.
Solution Suppose that F (x) = G(x)H(x), where G(x) and H(x) are integer polynomials of
positive degree. Clearly, F (x) has degree n, so we know that
deg G + deg H = n.
Therefore, G(ak ) and H(ak ) must be integers which multiply to give −1, which leads to
G(ak ) = 1 and H(ak ) = −1 or vice versa. In either case, we have
G(ak ) + H(ak ) = 0.
has the distinct roots a1 , a2 , . . . , an , and possibly more. So, if P (x) is a non-zero polynomial,
then it would have to have degree at least n. But the degree of P (x) is at most the maximum
of deg G and deg H, which is strictly less than n by assumption. So this case cannot occur.
The other possibility is that P (x) is the zero polynomial. But then G(x) = −H(x) and so,
$F(x) = -G(x)^2$.
There is a problem here: the leading coefficient of $-G(x)^2$ is clearly negative, while the
leading coefficient of F (x) is clearly positive.
From these contradictions, we can conclude that F (x) is irreducible over Z.
We finish this section with a useful fact concerning irreducible integer polynomials which you
might like to prove.
9.9 Factorisation
Yet another piece of algebraic trickery that arises in polynomial problems is the difference of
perfect powers factorisation.
Solution The trick is to write x = 10, so that we can express the terms of the sequence as
$$1 + x^4 + x^8 + \cdots + x^{4k}.$$
Now we apply two tactics. The first is to use the factorisation above to write this geometric
series in closed form, and the second is to further factorise the result, which features differences
of perfect squares.
$$\frac{x^{4k+4} - 1}{x^4 - 1} = \frac{x^{2k+2} + 1}{x^2 + 1} \times \frac{x^{2k+2} - 1}{x^2 - 1}$$
Now the roots of $x^2 - 1$ are ±1 and it’s easy to check that these are both roots of $x^{2k+2} - 1$.
Similarly, the roots of $x^2 + 1$ are ±i, and it’s easy to check that these are both roots of
$x^{2k+2} + 1$ if k is even, or $x^{2k+2} - 1$ if k is odd.
In either case, after dividing the numerator by the denominator, we are left with the product
of two integer polynomials. For k ≥ 2, both of these polynomials are guaranteed to have
positive degree. Furthermore, it’s easy to check (and you should do so right now) that they
cannot be equal to 1 when x = 10. Thus, every term of the sequence after the first is a
product of two numbers greater than 1 and hence, not prime.
For the first term of the sequence, our clever polynomial argument doesn’t work. However,
we can compute directly that 10001 = 73 × 137.
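The argument above is easy to test with exact integer arithmetic. In the Python sketch below, `term(k)` is the number $1 + 10^4 + \cdots + 10^{4k}$ and `factor_pair(k)` produces the two factors suggested by the solution, choosing which numerator absorbs $x^2 + 1$ according to the parity of k. Both function names are invented for this illustration.

```python
def term(k):
    """The number 1 + 10^4 + ... + 10^(4k); for example, term(1) is 10001."""
    return sum(10 ** (4 * j) for j in range(k + 1))


def factor_pair(k, x=10):
    """Two integer factors of term(k) coming from the factorisation above."""
    top_plus = x ** (2 * k + 2) + 1
    top_minus = x ** (2 * k + 2) - 1
    if k % 2 == 0:
        # x^2 + 1 divides x^(2k+2) + 1 and x^2 - 1 divides x^(2k+2) - 1.
        first, r1 = divmod(top_plus, x * x + 1)
        second, r2 = divmod(top_minus, x * x - 1)
    else:
        # Both x^2 + 1 and x^2 - 1 divide x^(2k+2) - 1.
        first, r1 = top_plus, 0
        second, r2 = divmod(top_minus, (x * x + 1) * (x * x - 1))
    assert r1 == 0 and r2 == 0, "the divisions should be exact"
    return first, second
```

For k = 2 the pair should multiply to 100010001, and for every k ≥ 2 both factors should exceed 1. For k = 1 the second factor collapses to 1, which is why the first term needs the separate computation 10001 = 73 × 137.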
It’s almost always useful to take p to be prime, in which case you can rely on the division
algorithm along with many of the results you already know and love which hold for polynomials
in general. For example, we require p to be a prime for the following simple, though useful,
statement to be true.
Theorem Let p be a prime. Suppose that
then
F (x) ≡ 0 (mod p) or G(x) ≡ 0 (mod p).
Another particularly useful true statement that requires p to be prime is the following analogue
of the fundamental theorem of arithmetic.
Unique factorisation for polynomials modulo p If p is a prime, then any factorisation
of a polynomial into irreducible factors modulo p is unique up to the order of its factors.
(We did not mention it earlier, but unique factorisation also holds for ordinary integer
polynomials.)
One of the most useful reasons to consider polynomials modulo p is to prove irreducibility. If
F (x) can be factorised as F (x) = G(x)H(x), then we know that F (x) ≡ G(x)H(x) (mod p).
In other words, if a polynomial is reducible over the integers, then it’s reducible over the
integers modulo p. The contrapositive of this statement is particularly useful: if a polynomial
is irreducible over the integers modulo p, then it’s irreducible over the integers.
The beauty of this technique is that modulo p there are only finitely many polynomials of
degree lower than a given polynomial. This means that you can check them all! Furthermore,
you are free to choose whichever value of p happens to work for you. However, one must
beware! If a polynomial is reducible over the integers modulo p, then it definitely does not
follow that it’s reducible over the integers.
F (x) = x5 − x2 + 1
is irreducible over Z.
Solution We simply consider F (x) modulo 2. For the remainder of this proof, all congruences
are assumed to be considered modulo 2.
Suppose that F (x) is reducible over the integers modulo 2. Then we can write
F (x) ≡ G(x)H(x),
$$x^5 - x^2 + 1 \equiv (x^4 + x)\cdot x + 1$$
$$x^5 - x^2 + 1 \equiv (x^4 + x^3 + x^2)\cdot(x + 1) + 1$$
Similarly, there are only four polynomials modulo 2 and of degree 2, namely, $x^2$, $x^2 + 1$, $x^2 + x$
and $x^2 + x + 1$. To make things even simpler, we have
$$x^2 \equiv x \cdot x, \quad x^2 + 1 \equiv (x + 1)\cdot(x + 1) \quad\text{and}\quad x^2 + x \equiv x\cdot(x + 1),$$
$$x^5 - x^2 + 1 \equiv (x^3 + x^2)\cdot(x^2 + x + 1) + 1,$$
this is also not a factor of F (x). So F (x) is irreducible over the integers modulo 2 and hence,
irreducible over the integers.
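Since there are only finitely many candidate factors modulo 2, the whole check can be handed to a computer. The Python sketch below stores a polynomial over the integers modulo 2 as a bitmask, where bit i is the coefficient of $x^i$, so that $x^5 + x^2 + 1$ becomes `0b100101`. The encoding and the function names are choices made for this illustration.

```python
def gf2_mod(a, b):
    """Remainder of a on division by b, for polynomials modulo 2 as bitmasks."""
    while a.bit_length() >= b.bit_length():
        # Subtract (XOR) a shifted copy of b to cancel the leading term of a.
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def is_irreducible_mod2(f):
    """Brute-force irreducibility test for a polynomial of positive degree."""
    degree = f.bit_length() - 1
    # A reducible polynomial has a factor of degree between 1 and degree // 2.
    for g in range(2, 1 << (degree // 2 + 1)):
        if gf2_mod(f, g) == 0:
            return False
    return True
```

Here `is_irreducible_mod2(0b100101)` should return `True`, in agreement with the hand calculation, while `gf2_mod(0b100101, 0b111)` should return 1, matching the remainder found on division by $x^2 + x + 1$.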
Often, we think of working with integers as upstairs and working with integers modulo p
as downstairs. For more difficult problems, we may need to move between upstairs and
downstairs quite often, transferring information between the two levels.
$F(x) = x^n + 5x^{n-1} + 3$
Solution In this solution, downstairs will refer to working modulo 3. Suppose that F (x) is
reducible, so that we can write F (x) = G(x)H(x) for polynomials G(x) and H(x) of positive
degree.
Downstairs, we have
$$F(x) \equiv x^{n-1}(x + 5),$$
which, by unique factorisation modulo p, can only be non-trivially factorised into two
polynomials as
$$x^k(x + 5) \times x^{n-1-k},$$
for some k = 0, 1, 2, . . . , n − 2. So, without loss of generality, we have
If k is positive, then moving upstairs tells us that the constant terms of both G(x) and H(x)
are divisible by 3. This implies that the constant term of F (x) is divisible by 9, a rather
blatant contradiction.
So, k = 0 and our equations downstairs become
Going back upstairs we must have deg G(x) ≥ 1 and deg H(x) ≥ n−1. But G(x)H(x) = F (x),
which has degree n. Thus, deg G(x) = 1 and deg H(x) = n − 1.
Since G(x) is linear it has a rational root, which we can express as $\frac{r}{s}$ in lowest terms. But
then $\frac{r}{s}$ is a root of F(x) and the rational root theorem implies that r | 3 and s | 1. In short,
F(x) must have ±1 or ±3 as a root.
We can substitute these four values into the expression $x^n + 5x^{n-1} + 3$ to verify that they are
certainly not roots, which grants us our desired contradiction.
Solution There are a number of ways to solve this problem. One is to find the roots of
$x^3 + x^2 + x + 1$ and show that they’re also roots of the given polynomial. But we’re here to
learn about polynomial modular arithmetic, so our approach will be to show that the given
polynomial is congruent to zero modulo $x^3 + x^2 + x + 1$.
Since $x^4 - 1 = (x - 1)(x^3 + x^2 + x + 1)$, we have
$$x^4 \equiv 1 \pmod{x^3 + x^2 + x + 1}.$$
So by working modulo $x^3 + x^2 + x + 1$, we obtain
$$\begin{aligned}
x^{4a+3} + x^{4b+2} + x^{4c+1} + x^{4d} &= x^3(x^4)^a + x^2(x^4)^b + x(x^4)^c + (x^4)^d \\
&\equiv x^3 + x^2 + x + 1 \\
&\equiv 0.
\end{aligned}$$
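If you want to watch the congruence at work, the following Python sketch builds $x^{4a+3} + x^{4b+2} + x^{4c+1} + x^{4d}$ as a list of integer coefficients and reduces it modulo $x^3 + x^2 + x + 1$ by long division. The function names are invented for this illustration, and coefficient lists are written with the highest degree first.

```python
def poly_mod_monic(num, den):
    """Remainder of num on division by a monic polynomial den."""
    num = list(num)
    while len(num) >= len(den):
        lead = num[0]
        for i in range(len(den)):
            num[i] -= lead * den[i]
        num.pop(0)  # the leading coefficient is now zero
    return num


def build_polynomial(a, b, c, d):
    """Coefficients of x^(4a+3) + x^(4b+2) + x^(4c+1) + x^(4d)."""
    exponents = [4 * a + 3, 4 * b + 2, 4 * c + 1, 4 * d]
    coeffs = [0] * (max(exponents) + 1)
    for e in exponents:
        coeffs[len(coeffs) - 1 - e] += 1
    return coeffs
```

For any non-negative integers a, b, c and d, the call `poly_mod_monic(build_polynomial(a, b, c, d), [1, 1, 1, 1])` should return a list consisting entirely of zeros.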
Solution If yk = 0, then the problem is easy, since the identity theorem forces us to take
P (x) = 0. Otherwise, the problem is almost as easy, since we know all of the roots of the
polynomial.
$$P(x) = A(x - x_1)\cdots\widehat{(x - x_k)}\cdots(x - x_{n+1}).$$
That strange looking hat notation is telling the term (x − xk ) to politely leave the building—it
is excluded from the product. We can then substitute x = xk to derive an expression for A.
The final answer is
$$P(x) = y_k\,\frac{(x - x_1)\cdots\widehat{(x - x_k)}\cdots(x - x_{n+1})}{(x_k - x_1)\cdots\widehat{(x_k - x_k)}\cdots(x_k - x_{n+1})}.$$
Solution Simple! We just add up the polynomials that we obtained in the previous problem.
So the final answer is
$$P(x) = \sum_{k=1}^{n+1} y_k \, \frac{(x - x_1) \cdots \widehat{(x - x_k)} \cdots (x - x_{n+1})}{(x_k - x_1) \cdots \widehat{(x_k - x_k)} \cdots (x_k - x_{n+1})}.$$
It’s this result that is known as the Lagrange interpolation formula. Once again, remember
that it’s not the actual formula itself that you should commit to memory, but the method
used to arrive at the formula.
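To see the method in action, here is a small Python sketch that evaluates the interpolating polynomial directly from the sum above. The function name `lagrange_eval` and the three sample points, which we took from y = x^2 + 1, are our own illustrative choices.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate at x the interpolating polynomial through the given points.

    points is a list of (x_k, y_k) pairs with distinct x_k.
    """
    x = Fraction(x)
    total = Fraction(0)
    for k, (xk, yk) in enumerate(points):
        term = Fraction(yk)
        for j, (xj, _) in enumerate(points):
            if j != k:  # the "hatted" factor is skipped
                term *= (x - xj) / Fraction(xk - xj)
        total += term
    return total

# Three points taken from y = x^2 + 1, so that quadratic should be recovered.
points = [(0, 1), (1, 2), (3, 10)]
print([lagrange_eval(points, t) for t in range(5)])
```

Exact rational arithmetic is used so that the comparison with x^2 + 1 is not blurred by rounding.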
Problem If p is an odd prime and n is a positive integer, prove that the polynomial
$F(x) = x^n + x + p$
is irreducible over Z.
So, if the complex roots of G(x) are r1 , r2 , . . . , rk , then with the help of Vieta’s formulas, we
have the equation
|r1 r2 · · · rk | = 1.
If all these complex roots had magnitude greater than 1, then their product would be greater
than 1, a contradiction. Hence there must be some complex root r of G(x) which satisfies
|r| ≤ 1.
But any root of G(x) is also a root of F (x). So F (x) has a complex root r which satisfies
|r| ≤ 1. This is a very interesting piece of information indeed! This was certainly not
something that was obvious from the problem’s statement.
Continuing, this implies that
$r^n + r + p = 0$
for some r ∈ C with |r| ≤ 1.
Now think about what this equation means geometrically. It means that the triangle in the
complex plane whose vertices are $r^n$, $r^n + r$ and $r^n + r + p = 0$ has side lengths $|r^n| \le 1$,
$|r| \le 1$ and $p \ge 3$. But this directly contradicts the triangle inequality, so we conclude that
$x^n + x + p$ is irreducible over Z.
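The key estimate can also be probed numerically. The Python sketch below samples points r of the closed unit disc on a polar grid and records the smallest value of |r^n + r + p| that it meets. The function name `min_modulus`, the grid size and the sample pairs (n, p) are our own assumptions, and a finite grid can only illustrate the triangle inequality, never prove it.

```python
import cmath

def min_modulus(n: int, p: int, steps: int = 60) -> float:
    """Smallest |r^n + r + p| over a polar grid of points with |r| <= 1."""
    best = float("inf")
    for i in range(steps + 1):
        radius = i / steps
        for j in range(steps):
            r = cmath.rect(radius, 2 * cmath.pi * j / steps)
            best = min(best, abs(r**n + r + p))
    return best

for n, p in [(2, 3), (5, 3), (7, 5), (10, 11)]:
    # The triangle inequality predicts a value of at least p - 2.
    print(n, p, min_modulus(n, p))
```

Since p − 2 ≥ 1 for every odd prime p, a minimum of at least p − 2 means that the sampled points stay well away from being roots.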
10 Functional equations
Functional equations are simply equations involving functions! To read this chapter, you’ll
need to have some familiarity with the concept of a function and other basic notions, some of
which may be found in section 17.2. To solve functional equations, you’ll need an assortment
of standard and not-so-standard techniques, many of which may be found in this chapter. As
is common for mathematical Olympiad problems, the concepts themselves are generally quite
simple, but they will be applied in particularly tricky ways.
Hopefully, you are already familiar with the following standard notation, which we’ll use
consistently throughout this chapter.
10.0 Problems
1. Find all functions f : R → R such that
f (x − f (y)) = 1 − x − y
$x f(x) + f(1 - x) = x^3 - x$
$f(x + y) f(x - y) = 2x + f(x^2 - y^2)$
g(g(x)) = x
f (m + f (n)) = f (m) + n
f (x) + f (y) = 1 + f (x + y)
f (m + f (n)) = f (m) − n
9. (a) Find all strictly increasing or strictly decreasing functions f : R → R such that
f (x + f (y)) = f (x) + y
f (x + y) = f (x)f (y)
for all x, y ∈ R.
11. Prove that there exists no function f : Z → Z which satisfies the equation
f (f (n)) = n + 1
f (x + y) + f (x + z) + f (y + z) ≥ 3f (x + 2y + 3z)
for all x, y, z ∈ R.
f (f (m) + f (n)) = m + n
f (f (m)f (n)) = mn
f (n) + f (f (n)) = 2n
f (4x) − f (3x) = 2x
for all x ∈ R.
22. Let T denote the set of all ordered pairs (a, b) of non-negative integers.
Find all functions f : T → R such that
$$f(a, b) = \begin{cases} 0 & \text{if } ab = 0, \\ 1 + \dfrac{f(a + 1, b - 1) + f(a - 1, b + 1)}{2} & \text{otherwise.} \end{cases}$$
23. Let $\mathbb{R}^+_0$ be the set of non-negative real numbers.
Find all functions $f : \mathbb{R}^+_0 \to \mathbb{R}^+_0$ such that
$$f(x + y - z) + f(2\sqrt{xz}) + f(2\sqrt{yz}) = f(x + y + z)$$
for every $x, y, z \in \mathbb{R}^+_0$ such that x + y ≥ z.
f (x + y) = f (x) + f (y).
However, the choice of domain and codomain that we are interested in will have a massive
effect on what the solutions look like. This is one of the simplest and most fundamental
functional equations, and its solution should be familiar to any competent problem solver!
f (x + y) = f (x) + f (y)
The idea is to work out the value of f (x) for more and more values of x until we have the
entire function.
Before we proceed, here is a useful strategy for solving functional equations. It’s often a good
idea to guess what the solutions are and, in this case, you should be able to find at least
one without too much trouble! You should keep the suspected answer in the back of your
mind while solving the functional equation, yet remain open to other possibilities. For this
particular problem, one of the easiest solutions to find is f (x) = x. But then you might also
notice that f (x) = 2x is a solution and, in fact, so is f (x) = cx for any real number c.
Solution So how do you start on a problem like this? For almost any functional equation, a
good starting point is the substitution of values for x and y. It’s a free world, so you can use
whatever numbers you like, provided they are in the function’s domain. In practice, however,
simple substitutions such as setting the variables to be 0, 1, −1 or equal to each other are the
best to start with. For this problem, the substitution x = y = 0 yields f (0) + f (0) = f (0)
which then implies that f (0) = 0.
Next, you might try to figure out the value of f (1). But, try as you might, you won’t be
able to do it. If you are too narrow-minded and believe that f (x) = x is the only solution,
then you could get stuck at this point. That’s why you need to remain open-minded to other
possibilities, because if f (x) = cx is a solution for any real number c, then f (1) can take on
any value whatsoever!
With this in mind, let f (1) = c so that we can try to prove that f (x) = cx for all non-negative
integers x. In order to obtain the value of f (2), it’s natural to substitute x = y = 1, which
leads to
f (2) = f (1) + f (1) = 2c.
In fact, putting y = 1 in the functional equation gives f(x + 1) = f(x) + c, and it follows by
induction that f(x) = cx for all non-negative integers x.
f (x + y) = f (x) + f (y)
Solution Note that all the hard work required to solve the previous problem carries over to
this problem as well. So we already know that f (x) = cx for all non-negative integers x and
some real number c.
Since we have already deduced that f(0) = 0, it makes sense to try the substitution y = −x
in the functional equation. This leads to
$$f(x) + f(-x) = f(0) = 0, \quad\text{that is,}\quad f(-x) = -f(x).$$
This piece of information tells us that f (x) = cx holds for every integer, whether positive,
negative or zero.
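Now that we have the function on all of Z, it’s time to extend our net to all of Q. The trick
is to notice that Cauchy’s functional equation implies that
$$f(x + y + z) = f(x + y) + f(z) = f(x) + f(y) + f(z).$$
Hence, we can expand out sums of not only two numbers, but three numbers, or four numbers,
or even n numbers. In particular, for any integer m and positive integer n, we have
$$f\Big(\underbrace{\tfrac{m}{n} + \tfrac{m}{n} + \cdots + \tfrac{m}{n}}_{n \text{ times}}\Big) = \underbrace{f\big(\tfrac{m}{n}\big) + f\big(\tfrac{m}{n}\big) + \cdots + f\big(\tfrac{m}{n}\big)}_{n \text{ times}}.$$
Therefore, $f(m) = n f\left(\frac{m}{n}\right)$, which implies that
$$f\left(\frac{m}{n}\right) = \frac{f(m)}{n} = \frac{cm}{n}.$$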
We’ve now deduced that f (x) = cx for all rational numbers x and some real number c. Once
again, you should check that functions of this type do indeed satisfy the given functional
equation.
We have solved Cauchy’s functional equation where the domain is N0 , Z and Q. So what are
the solutions to Cauchy’s functional equation when the domain is R? You could be forgiven
for believing that all the solutions are linear, as they were in the previous cases, but that just
isn’t true! Although f (x) = cx is a solution, there are in fact many, many more, all of which
are crazy and none of which can be described by any compact formula. To eliminate these
crazy solutions, it’s necessary to impose an extra condition. For example, it’s known that
the solutions to Cauchy’s functional equation on the real numbers are given by f (x) = cx, if
we are also given that f is continuous, that f is monotonic, or that f is bounded on some
interval.
Here’s the idea. Say you have a functional equation and you think that $f(x) = x^3$ is the only
solution. Then the function $g(x) = f(x) - x^3$ would be pretty simple because it would be the
zero function. Another simple function would be $h(x) = \frac{f(x)}{x^3}$ or perhaps even $k(x) = \sqrt[3]{f(x)}$.
This motivates us to use one of these three substitutions to simplify the original functional
equation. Sometimes, but certainly not always, you may end up with a much easier problem.
Solution The first thing you might notice is the fact that this looks like Cauchy’s functional
equation, except for that pesky 2xy term which ruins everything. You could try similar
methods to those used to solve Cauchy’s functional equation, but the 2xy term makes things
a little difficult.
The second thing you might notice is the fact that 2xy arises when you expand $(x + y)^2$. In
fact, the given functional equation bears an uncanny resemblance to the formula
$$(x + y)^2 = x^2 + y^2 + 2xy.$$
In fact, this verifies that $f(x) = x^2$ is a solution; maybe not the only solution, but a solution
nonetheless.
Now it’s time to guess and hope! Let’s substitute $g(x) = f(x) - x^2$, that is, $f(x) = g(x) + x^2$,
in the hope that g(x) satisfies a simpler functional equation. This leads to
$$g(x + y) = g(x) + g(y).$$
You guessed it: Cauchy’s functional equation! We already know that the solutions to this are
given by g(x) = cx for some real number c. Retracing our steps, we find that $f(x) = x^2 + cx$
for some real number c. You can easily substitute this back into the original functional
equation to verify that it is indeed a solution.
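The final check is easy to hand over to a computer. The Python sketch below assumes that the functional equation in question has the form f(x + y) = f(x) + f(y) + 2xy, which is what the discussion of the 2xy term suggests. The names `make_f` and `satisfies_equation` and the rational sample points are our own choices.

```python
from fractions import Fraction
from itertools import product

def make_f(c):
    """Candidate solution f(x) = x^2 + c*x."""
    return lambda x: x * x + c * x

samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

def satisfies_equation(f):
    # Assumed form of the equation: f(x + y) = f(x) + f(y) + 2xy.
    return all(f(x + y) == f(x) + f(y) + 2 * x * y
               for x, y in product(samples, repeat=2))

print(all(satisfies_equation(make_f(Fraction(c, 2))) for c in range(-4, 5)))
```

Passing on a finite sample is only a sanity check, of course, but it is a quick way to catch an algebra slip before you write up a solution.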
10.3 Substitutions
Suppose that you stumble upon a magic black box in which you can place an object and
another object will emerge from the other side. If you want to know how the box works,
you might try passing different objects through it and observing the output, until you can
deduce something about the behaviour of the magic black box. Similarly, if you stumble upon
a functional equation and want to solve it, a natural approach is to substitute particular
numbers or, better still, algebraic expressions for the variables.
Solution An obvious first substitution is y = 0, because it knocks out one of the terms on
the right-hand side. In fact, it leads to the equation
$f(f(x)) = f(x^2)$.
Trying the substitution x = y = 0, we discover that f (f (0)) = f (0). You could continue
trying various combinations of x and y equal to simple numbers like 0, 1 and −1, but you
probably wouldn’t get too far with this approach.
The idea here is to look for algebraic substitutions, rather than numerical ones. In particular,
we would like to find algebraic substitutions which provide nice cancellation. For example,
the appearance of f (f (x) + y) on the left-hand side motivates us to try y = −f (x), which
yields a rather interesting looking equation.
Furthermore, the term $f(x^2 - y)$ on the right-hand side motivates us to try $y = x^2$, which
turns up yet another interesting looking equation.
You probably can’t help but notice that these two equations bear quite a resemblance to each
other. In fact, we can use them to eliminate the unwieldy term $f(f(x) + x^2)$ and obtain
For x = 0, this equation simply tells us that f(0) = 0. But if x ≠ 0, then f(x) ≠ 0, and so we
can divide through by f(x) ending up with
$$f(x) = x^2.$$
Therefore, the only possible solution is $f(x) = x^2$ and you can check that it does indeed
satisfy the original functional equation.
This problem involved some algebraic trickery! Such deviousness is often the result of a
long period of time spent playing around and substituting different combinations of the
variables involved. In the end, this is a matter of intuition and experience. However, note
what motivated us to substitute $y = x^2$. We turned $f(x^2 - y)$ into f(0). You should always
be on the lookout for substitutions which give similarly nice results.
f (a) = b. The big advantage of having a surjective function f is that you can then substitute
f (a) = b for any value of b in the codomain.
A function is said to be bijective if it is both injective and surjective. The big advantage of
having a bijective function f is that there exists an inverse function $f^{-1}$ which satisfies
These three properties are supremely important when solving functional equations. It’s a
standard tactic, when confronted with a difficult problem, to try to prove that the function
involved has some of these properties. All of the previous discussion might sound completely
cryptic to you and, if that’s the case, you’ll probably feel more enlightened after considering
the following example.
Solution Think about holding x constant. For example, let x = 0 and vary y. Then the
right-hand side can attain any real value. Since the left-hand side is f applied to some
expression, this proves that f is surjective.
You might be a little suspicious of this argument, but we can also express it in more algebraic
terms. To prove that f is surjective, for any real number b we must be able to find a real
number a such that f(a) = b. This value of a is simply $f(b - f(0)^2)$, since the functional
equation with x = 0 and $y = b - f(0)^2$ implies that
$$f(f(b - f(0)^2)) = b.$$
If the extra algebra didn’t help your understanding, that’s fine—it was probably just compli-
cating what is actually quite a simple argument.
Now we turn our attention to showing that f is injective. To do this, we assume that
f (y1 ) = f (y2 ) and hope to deduce that y1 = y2 . But if f (y1 ) = f (y2 ), then we have
If you examine the previous proof closely, you’ll see that we relied on two particular features
of the functional equation. The first is that the left-hand side involves only f (y), but not y.
The second is that the right-hand side involves only y, but not f (y). Variables which appear
in this way often provide the key to proving the fact that a function is injective, surjective or
bijective. Unfortunately, this particular example doesn’t really demonstrate just how helpful
it is to know that f is bijective. But we’ll see various illustrative examples of this as we
progress.
Solution The following chain of implications tells us that the function f must be injective.
Now we notice that the given information about f and g allows us to calculate the expression
f (g(f (x))) in two different ways.
Therefore,
$f(x)^2 = f(x^3)$
It follows that f (−1), f (0) and f (1) are all equal to 0 or 1 and so two of them are equal to
each other. This directly contradicts the fact that f is injective, so we conclude that there
cannot exist functions satisfying the conditions of the problem.
Note that the associative trick is particularly useful where the expression f (f (x)) is known.
For example, if
f(f(x)) = x + 1,
then computing f(f(f(x))) in two different ways shows that
f(x + 1) = f(x) + 1.
2 An operation ∗ is said to be associative if f ∗ (g ∗ h) = (f ∗ g) ∗ h always holds true. In our setting here, the
operation is function composition. Other examples of associative operations include the ordinary arithmetic
operations of addition and multiplication.
10.6 Exploit symmetry
Solution The crucial observation is the following. If we substitute m = f (p) for any integer p,
the left-hand side is a symmetric expression in p and n. In other words, it remains the same
when we swap p and n. However, the right-hand side is not symmetric in p and n.
Since swapping p and n in the above equation leaves the left-hand side unchanged, it must
also leave the right-hand side unchanged. Thus
Hence
f (p)n = f (n)p
for all integers p and n.
Substituting p = 1, we deduce that f (n) = f (1)n. Thus any solution to this functional
equation must be of the form f (n) = cn.
However, as with all functional equations, we must check our solutions. If we plug f (n) = cn
into the original functional equation, we deduce that c = 1. So, the only possible solution is
f (n) = n.
10.7 Involutions
An involution is a function which is its own inverse. That is, a function f which satisfies
f (f (x)) = x
for all x. Involutions can be surprisingly useful and crop up in the most mysterious ways.
Solution Though it might look scary and complicated, this functional equation is actually
simpler than most. For example, it only contains one variable x. All the equation does is
relate two different values of g, that is, the value at x and the value at $\frac{2x}{3x-2}$. This begs the
following question: if x is related to $\frac{2x}{3x-2}$, then what is $\frac{2x}{3x-2}$ related to? With this in mind,
let’s replace x with $\frac{2x}{3x-2}$ in the functional equation and hope for the best.
$$\frac{2x}{3x-2} - g\left(\frac{2x}{3x-2}\right) = \frac{1}{2}\, g\left(\frac{2\left(\frac{2x}{3x-2}\right)}{3\left(\frac{2x}{3x-2}\right) - 2}\right)$$
Yes, it looks messy, but a little algebra should convince you that
$$\frac{2\left(\frac{2x}{3x-2}\right)}{3\left(\frac{2x}{3x-2}\right) - 2} = \frac{4x}{6x - 2(3x - 2)} = x.$$
You can think of these as two simultaneous equations, which you can solve for both g(x) and
$g\left(\frac{2x}{3x-2}\right)$. Do this and you’ll find that
$$g(x) = \frac{4x(x - 1)}{3x - 2},$$
which is indeed the solution to the functional equation.
What’s going on here? Obviously there’s something special about the expression $\frac{2x}{3x-2}$. In
fact if you let $f(x) = \frac{2x}{3x-2}$, then f(f(x)) = x, which means that it’s an involution!
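Both claims can be checked with exact rational arithmetic. In the Python sketch below, the form of the functional equation, x − g(x) = g(2x/(3x − 2))/2, is our own reading of the problem and should be treated as an assumption, as should the choice of sample points.

```python
from fractions import Fraction

def f(x):
    """The map x -> 2x / (3x - 2), undefined at x = 2/3."""
    return 2 * x / (3 * x - 2)

def g(x):
    """The claimed solution g(x) = 4x(x - 1) / (3x - 2)."""
    return 4 * x * (x - 1) / (3 * x - 2)

# Rational sample points, none of which equals 2/3.
samples = [Fraction(n, 7) for n in range(-20, 21)]

is_involution = all(f(f(x)) == x for x in samples)
# Assumed form of the functional equation: x - g(x) = g(f(x)) / 2.
solves_equation = all(x - g(x) == g(f(x)) / 2 for x in samples)
print(is_involution, solves_equation)
```

The same two-line test works for any suspected involution, so it is worth trying whenever a functional equation relates g(x) to g of some strange looking expression.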
(i) f (xf (y)) = yf (x) for all positive real numbers x and y, and
Solution As a first step, let’s prove that the function must be bijective. Think about
holding x constant. For example, set x = 1 and vary y. Then the right-hand side can attain
any positive real value. Since the left-hand side is f applied to some expression, this proves
that f is surjective.
If we assume that f (y1 ) = f (y2 ), then it is certainly true that
Let’s use these facts to determine the value of f (1). The big advantage of having a surjective
function lies in the fact that there exists a positive real number k for which f (k) = 1.
Substituting y = k in the functional equation gives
Therefore, we have f (1) = f (k) = 1. This is a fairly indirect way to find f (1), but the
technique is surprisingly common and useful. Finding just one value of the function, like
f (1), can often lead to a windfall of other benefits. For example, if we now set x = 1 in the
functional equation, we obtain the fact that
f (f (y)) = y.
This is interesting because it means that the number xf (x) is a fixed point of f for any value
of x. Such a fact seems to imply that f must have a whole lot of fixed points. On the other
hand, you can’t have too many fixed points, for the second condition guarantees that every
fixed point is less than or equal to 1000. All this discussion really motivates us to concentrate
on the fixed points of f . We have the following observations.
Hence, ab is also a fixed point of f. It follows that, if a is a fixed point of f, then so are
$a^2, a^3, a^4, \ldots$.
Now let’s piece these three statements together in a clever way to show that 1 is the only
fixed point of f .
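Suppose that a > 1 is a fixed point. Then the fixed points
$$a^2, a^3, a^4, \ldots$$
would eventually become greater than 1000, a contradiction. Suppose instead that a < 1 is a
fixed point. Then the fixed points
$$\frac{1}{a^2}, \frac{1}{a^3}, \frac{1}{a^4}, \ldots$$
would eventually become greater than 1000, another contradiction.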
Thus, the only possible fixed point of f is 1. And yet we also know that xf (x) is a fixed point
for any value of x. The only way that this can happen is if xf(x) = 1 or, in other words, if
$$f(x) = \frac{1}{x}$$
for any value of x.
As always, you should substitute this solution back into the functional equation to make sure
that it actually works.
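Here is one way to carry out that substitution by machine for condition (i). The Python sketch uses exact fractions as sample inputs, and the sample set is our own choice; it also lists which sample points are fixed points of f.

```python
from fractions import Fraction
from itertools import product

def f(x):
    return 1 / x

samples = [Fraction(n, d) for n in range(1, 8) for d in range(1, 5)]

# Condition (i): f(x f(y)) = y f(x) for all positive x and y.
holds = all(f(x * f(y)) == y * f(x) for x, y in product(samples, repeat=2))
fixed_points = sorted({x for x in samples if f(x) == x})
print(holds, fixed_points)
```

This only tests finitely many positive rationals, so it complements the algebraic verification rather than replacing it.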
Of course, there was a great deal of algebraic trickery involved in this solution. Rather than
remarking bewilderedly ‘I could never think of that!’, you should examine the solution again
and instead ask ‘What factors are present in this problem that would lead me to try that
technique?’ For all, well almost all, mathematicians are ordinary mortals. And most likely
any solution you read to a difficult problem, however short, has involved a great deal of effort.
Solution Since we’ve already shown that f is bijective, we know that there exists a real
number k for which f (k) = 0. Substituting x = k into the functional equation immediately
implies that
f (f (y)) = y,
which means that f is an involution.
The trick now is to employ the ‘one expression, two ways’ philosophy. Note that the expression
xf (x) is invariant—that is, does not change—when you replace x with f (x). This follows
from our newly obtained result which states that f (f (x)) = x. So when we replace x with
f (x) in the given functional equation, we obtain
And this is where we should tread carefully, for herein lies a trap for the unsuspecting
functional equation enthusiast. A beginner might now conclude that there are two solutions,
namely, f (x) = x and f (x) = −x. But this reasoning is entirely incorrect, because all we
have deduced is that for each real number x, the value of f (x) is either x or −x. There are
still infinitely many possibilities, since the function could conceivably take the value x at
some places and −x at others!
Such crazy solutions to the functional equation seem improbable though, so let’s try to
eliminate them. To do this, suppose that there exist two distinct real numbers x and y such
that f (x) = x and f (y) = −y. In fact, we might as well assume that x and y are non-zero,
since f (0) = 0 in either case. Now substitute these values into the functional equation to
obtain
$$f(x^2 - y) = x^2 + y.$$
However, we know that $f(x^2 - y) = \pm(x^2 - y)$. If we take the positive sign, we obtain y = 0
and if we take the negative sign, we obtain x = 0. Both cases contradict the fact that x and
y are non-zero.
Therefore, we can finally conclude that the only solutions to the functional equation are
f (x) = x and f (x) = −x.
If you really want to make sure that you never fall into the ‘somewhere versus everywhere’
trap, then you should return to the problem from section 10.3 and solve it without the
condition that f(x) ≠ 0 for x ≠ 0.
where p1 , . . . , pk are primes and a1 , a2 , . . . , ak are positive integers. In fact, if we allow negative
powers of primes, then we may uniquely express any positive rational number as well.
These concepts will be particularly useful for dealing with completely multiplicative functions.
These are functions f which satisfy
The unsuspecting reader might assume that, much like other functional equations we’ve seen,
the only solution is given by f (n) = n. But he or she could not be further from the truth.
Solution A first step would be to substitute simple numbers such as m = n = 1 into the
functional equation f (mn) = f (m)f (n). This would give us the fact that
f (1) = 1.
Thus, if we take the prime factorisation $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, then we have
$$f(n) = f(p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}) = f(p_1)^{a_1} f(p_2)^{a_2} \cdots f(p_k)^{a_k}.$$
So the function can be completely described by its values at the primes, namely,
If you still believe that f (n) = n is the only solution, then you could try to go further and
show that f (p) = p for every prime p. Unfortunately, you won’t get anywhere! This is because
we’ve deduced as much as we can possibly deduce.
In fact, as long as we take f (1) = 1, we can set f (2), f (3), f (5), f (7), . . . to be any arbitrary
positive integer sequence and any such choice will give us a completely multiplicative function
f : N+ → N+.
You shouldn’t just believe this, so check it for yourself. In conclusion, this functional equation
has infinitely many solutions given by the previous description.
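If you would rather let a computer do the checking, the following Python sketch builds a completely multiplicative function from an arbitrary assignment of values at the primes. The helper names `factorise` and `make_multiplicative`, the particular prime values and the `default` parameter for unlisted primes are hypothetical choices of ours.

```python
def factorise(n):
    """Prime factorisation of n >= 1 as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def make_multiplicative(prime_values, default=1):
    """Build f with f(p) = prime_values.get(p, default) at each prime p."""
    def f(n):
        result = 1
        for p, a in factorise(n).items():
            result *= prime_values.get(p, default) ** a
        return result
    return f

# An arbitrary (hypothetical) choice of values at the primes.
f = make_multiplicative({2: 7, 3: 1, 5: 10, 7: 2}, default=3)
print(all(f(m * n) == f(m) * f(n) for m in range(1, 60) for n in range(1, 60)))
```

Try changing the dictionary of prime values: any positive integers will do, which is exactly the point of the solution.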
10.11 Well-ordering of N+
One property of N+ not shared by Q or R is the fact that every subset of N+ has a smallest
element. It seems so simple, yet this property can be incredibly useful, so much so that it has
the grand sounding name of well-ordering.
f (n + 1) > f (f (n))
Solution The first thing you should do is try to construct such a function. You’ll probably
find that f (n) = n is an easy one to come up with, but other possibilities tend to fail because
you obtain various inequalities leading to other inequalities leading to contradictions. It’s
hard to see why these fail in general. Using the well-ordering of N+ allows us to consider
minimality, which will aid us considerably here.
The idea is to suppose that f (k) is the smallest member of the set S = {f (1), f (2), f (3), . . .}.
Of course, there might be more than one such k, but we only need one of them. The functional
equation tells us that
f (k) > f (f (k − 1)).
This seems to contradict the fact that f (k) is the smallest member of S. The only way to
avoid this contradiction is to have k = 1. In this way the inequality doesn’t arise because the
domain of f is the set of positive integers. This sleight of hand proves that f (1) is the unique
smallest member of S.3
3 The term ‘unique smallest’ is a little imprecise. After all, S only has one smallest member anyway. What we
mean is that if m is the smallest member of S, then the only number that satisfies f (n) = m, is n = 1. The
term has an analogous meaning for ‘unique second smallest’, and so on, later in the proof.
Next, what’s the second smallest member of S? It must be f (k) for some k ≥ 2. Again, there
might be more than one such k, but we only need one of them. The functional equation tells
us that
f (k) > f (f (k − 1)).
If f (k − 1) > 1, we would have f (f (k − 1)) > f (1). This would contradict the fact that f (k)
is the second smallest member of S. So we must have f (k − 1) = 1. This implies that the
smallest member of S is 1 and that k = 2. Thus another sleight of hand proves that f (2) is
the unique second smallest member of S.
What’s the third smallest member of S? It is f (k) for some k ≥ 3. The functional equation
tells us that
f (k) > f (f (k − 1)).
If f (k − 1) > 2, we would have f (f (k − 1)) > f (2) > f (1). This would contradict the fact
that f (k) is the third smallest member of S. We certainly can’t have f (k − 1) = 1 because
this occurs only for k = 2. So we must have f (k − 1) = 2. This implies that the second
smallest member of S is 2 and that k = 3.
This argument can be continued inductively4 to show that the members of S written in
increasing order are
f (1) < f (2) < f (3) < · · · .
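It also shows that the members of S written in increasing order are
1 < 2 < 3 < · · · .
Comparing the two lists, we conclude that f(n) = n for every positive integer n.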
4 This proof cannot be considered complete as it currently stands. Can you complete the proof by writing out
the details of the induction yourself?
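Before writing out the induction, you can experiment with a finite version of the problem. The Python sketch below makes a simplifying assumption of our own: it restricts attention to functions from {1, ..., N} to itself and imposes f(n + 1) > f(f(n)) only for 1 ≤ n < N. The function name `solutions` is also ours.

```python
from itertools import product

def solutions(N):
    """All f: {1..N} -> {1..N} with f(n + 1) > f(f(n)) for 1 <= n < N."""
    found = []
    for values in product(range(1, N + 1), repeat=N):
        f = lambda n: values[n - 1]
        if all(f(n + 1) > f(f(n)) for n in range(1, N)):
            found.append(values)
    return found

for N in range(1, 6):
    print(N, solutions(N))
```

Each tuple lists the values f(1), f(2), ..., f(N). Watching which candidates survive for small N is a good way to build intuition for the minimality argument.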
11 Inequalities
An inequality is a mathematical problem which asks you to prove that some expression is
always greater than (or equal to) some other expression. For example, you might have to
show that for any real numbers a, b, c, the following holds.
$$a^2 + b^2 + c^2 \ge ab + bc + ca$$
Although daunting at first, you should be able to solve a whole variety of inequalities after
mastering a handful of techniques and tricks. That is exactly what this chapter will help you
to achieve.
11.0 Problems
1. Farmer Brown wants a rectangular paddock of area A next to a long straight river, with
a fence on the other three sides.
What is the minimum length of fencing that is required?
2. (a) Use the AM–GM inequality to find the maximum value of xyz, where x, y, z are
positive real numbers satisfying x + 2y + 3z = 3.
(b) Use the AM–GM inequality to find the minimum value of x + y + z, where x, y, z
are positive real numbers satisfying $xy^2z^3 = 108$.
(a + b)(b + c)(c + a) ≥ 8.
$$\frac{a}{b + c} + \frac{b}{c + a} + \frac{c}{a + b} \ge \frac{3}{2}.$$
6. Use the rearrangement inequality to prove the following, where a, b, c are positive real
numbers.
(a) $a^2 b + b^2 c + c^2 a \le a^3 + b^3 + c^3$
(b) $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} \le \frac{a}{b^2} + \frac{b}{c^2} + \frac{c}{a^2}$
(c) abc bca cab ≤ aab bbc cca
7. Use the Cauchy–Schwarz inequality to find the minimum value of
$x^2 + y^2 + z^2$,
11. Let A, B, C be the angles of a triangle. If the triangle is acute, use Jensen’s inequality
to prove the following inequalities. Also determine which of these inequalities remain
true if the triangle is obtuse.
(a) $\sin A + \sin B + \sin C \le \frac{3\sqrt{3}}{2}$
(b) $\cos A + \cos B + \cos C \le \frac{3}{2}$
(c) $\cos A \cos B \cos C \le \frac{1}{8}$
12. If a1 , a2 , . . . , an are distinct positive integers, prove that
$$\frac{a_1}{1^2} + \frac{a_2}{2^2} + \cdots + \frac{a_n}{n^2} \ge \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n}.$$
$(2 + a_1)(2 + a_2) \cdots (2 + a_n) \ge 3^n$.
14. Let a, b, c be positive real numbers such that abc = 1. Prove that
$$\frac{bc}{b^5 + c^5 + bc} + \frac{ca}{c^5 + a^5 + ca} + \frac{ab}{a^5 + b^5 + ab} \le 1$$
and determine when equality holds.
16. Prove that if a, b, c are positive real numbers with sum less than 1, then
$$\frac{abc(1 - a - b - c)}{(a + b + c)(1 - a)(1 - b)(1 - c)} \le \frac{1}{81}.$$
17. Suppose that the numbers x, y, z are greater than or equal to 1 and satisfy
$$\frac{1}{x} + \frac{1}{y} + \frac{1}{z} = 2.$$
Prove that
$$\sqrt{x + y + z} \ge \sqrt{x - 1} + \sqrt{y - 1} + \sqrt{z - 1}.$$
19. Let x, y, z be positive real numbers which satisfy xyz = 1. Show that
$$\frac{x^3}{(1 + y)(1 + z)} + \frac{y^3}{(1 + z)(1 + x)} + \frac{z^3}{(1 + x)(1 + y)} \ge \frac{3}{4}.$$
Prove that
$$\sqrt[n]{a_1 a_2 \cdots a_n} \ge 100(n - 1).$$
21. Let n ≥ 3 be an integer, and let a2 , a3 , . . . , an be positive real numbers such that
a2 a3 · · · an = 1. Prove that
$$(1 + a_2)^2 (1 + a_3)^3 \cdots (1 + a_n)^n > n^n.$$
You should check this right now with pen and paper to make sure that we’re not lying to you!
Once you’ve done that, the inequality should be almost self-evident since the left-hand side is
a sum of squares, which we know to be non-negative. Furthermore, the first term is zero only
when x = ±1, while the second term is zero only when x = 2. So it’s impossible for them to
be 0 simultaneously, and this guarantees that the left-hand side is not only non-negative, but
always positive.
Of course, the strategy we used here involved taking all terms to the left-hand side and then
magically discovering that it could be expressed as a sum of squares. Although this may not
have been apparent to you from the outset, spotting squares becomes much easier after one
has had some square-spotting practice.
Problem Farmer Black wants a rectangular paddock of area A with a fence on each side.
What is the minimum length of fencing that is required?
Solution Let the side lengths of Farmer Black’s rectangular paddock be x and y. We are
given that xy = A and would like to find the minimum possible value of 2x + 2y. Your
intuition probably tells you that this will occur when x and y are equal to each other, in
which case the rectangular paddock would actually be a square with perimeter $4\sqrt{A}$. So let’s
try to prove that
$$2x + 2y \ge 4\sqrt{A},$$
or equivalently,
$$\frac{x + y}{2} \ge \sqrt{xy}.$$
After squaring both sides, collecting all terms on the left-hand side, and factorising, the
inequality becomes
$$(x - y)^2 \ge 0.$$
The inequality is so named because the left-hand side is known as the arithmetic mean (a
fancy name for what we usually call the average), while the right-hand side is known as the
geometric mean.
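Farmer Black's paddock makes a nice numerical experiment. In the Python sketch below, the area A = 36 and the list of trial side lengths are arbitrary choices of ours; the code simply compares the perimeter of each trial rectangle with the perimeter of the square of the same area.

```python
import math

def perimeter(x: float, area: float) -> float:
    """Perimeter of a rectangle with one side x and the given area."""
    return 2 * x + 2 * area / x

area = 36.0
bound = 4 * math.sqrt(area)  # perimeter of the square with this area

# Try side lengths 0.1, 0.2, ..., 30.0 and record the best one found.
trials = [k / 10 for k in range(1, 301)]
best_x = min(trials, key=lambda x: perimeter(x, area))
print(best_x, perimeter(best_x, area), bound)
```

A search like this can suggest where the minimum lies, but only the inequality itself proves that no rectangle does better.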
$$\frac{1}{ab} + \frac{1}{bc} + \frac{1}{ca} \ge 27.$$
$$\frac{1}{ab} + \frac{1}{bc} + \frac{1}{ca} = \frac{a + b + c}{abc} = \frac{1}{abc}.$$
In other words, we would like to maximise the product abc.
The AM–GM inequality is often the key when dealing with inequalities which involve sums
and products. Applying it to a, b, c, we obtain
$$\frac{a + b + c}{3} \ge \sqrt[3]{abc} \quad\Rightarrow\quad \frac{1}{3} \ge \sqrt[3]{abc} \quad\Rightarrow\quad abc \le \frac{1}{27}.$$
It follows that
$$\frac{1}{ab} + \frac{1}{bc} + \frac{1}{ca} = \frac{1}{abc} \ge 27,$$
as desired.
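A quick computer search shows where equality occurs. The Python sketch below assumes, as the solution does, that a, b, c are positive with a + b + c = 1, and it restricts attention to fractions with denominator 30; that grid and the helper name `lhs` are our own choices.

```python
from fractions import Fraction

def lhs(a, b, c):
    """The sum 1/(ab) + 1/(bc) + 1/(ca)."""
    return 1 / (a * b) + 1 / (b * c) + 1 / (c * a)

# All triples (i/30, j/30, k/30) of positive fractions with sum 1.
N = 30
triples = [(Fraction(i, N), Fraction(j, N), Fraction(N - i - j, N))
           for i in range(1, N) for j in range(1, N - i)]

smallest = min(triples, key=lambda t: lhs(*t))
print(smallest, lhs(*smallest))
```

The denominator 30 was picked so that the point a = b = c = 1/3 lies on the grid.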
E = a1 b1 + a2 b2 + · · · + an bn .
Then E is maximised when the sequences (a1 , a2 , . . . , an ) and (b1 , b2 , . . . , bn ) are sorted the
same way.
If, on the other hand, we are trying to minimise E, then this occurs when the sequences
(a1 , a2 , . . . , an ) and (b1 , b2 , . . . , bn ) are sorted the opposite way.
Two sequences of numbers (a1 , a2 , . . . , an ) and (b1 , b2 , . . . , bn ) are said to be sorted the same
way if the largest ai is in the same position as the largest bi , the second largest ai is in the
same position as the second largest bi , and so on. On the other hand, they are said to be
sorted the opposite way if the largest ai is in the same position as the smallest bi , the second
largest ai is in the same position as the second smallest bi , and so on.
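For short sequences, the rearrangement inequality can be confirmed by brute force. The Python sketch below tries every rearrangement of one sequence against a sorted copy of the other; the two sample sequences and the helper name `dot` are arbitrary choices of ours.

```python
from itertools import permutations

def dot(a, b):
    """The sum a1*b1 + a2*b2 + ... + an*bn."""
    return sum(x * y for x, y in zip(a, b))

a = [3, 1, 4, 1, 5]
b = [2, 7, 1, 8, 2]

# Every value of E obtained by rearranging b against the sorted copy of a.
values = [dot(sorted(a), perm) for perm in permutations(b)]
same_way = dot(sorted(a), sorted(b))
opposite_way = dot(sorted(a), sorted(b, reverse=True))
print(min(values), opposite_way, max(values), same_way)
```

Replace the two lists with your own numbers and check that the largest and smallest values always come from the same-way and opposite-way arrangements.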
For future convenience, we’ll refer to the expression
a1 b1 + a2 b2 + · · · + an bn
$$\frac{x_1^2}{x_2} + \frac{x_2^2}{x_3} + \cdots + \frac{x_{n-1}^2}{x_n} + \frac{x_n^2}{x_1} \ge x_1 + x_2 + \cdots + x_n.$$
Solution The left-hand side of the inequality looks suspiciously like the product of two
sequences,
$$(x_1^2, x_2^2, \ldots, x_n^2) \quad\text{and}\quad \left(\frac{1}{x_2}, \frac{1}{x_3}, \ldots, \frac{1}{x_1}\right).$$
You should be able to see that the right-hand side can also be written as a product of the
same two sequences rearranged, that is,
$$(x_1^2, x_2^2, \ldots, x_n^2) \quad\text{and}\quad \left(\frac{1}{x_1}, \frac{1}{x_2}, \ldots, \frac{1}{x_n}\right).$$
So, by the rearrangement inequality, the desired result will follow once we prove that these
latter two sequences are sorted the opposite way. But this is certainly true since
$$x_i^2 \le x_j^2 \quad\text{if and only if}\quad \frac{1}{x_i} \ge \frac{1}{x_j}.$$
In other words, if $x_i^2$ is the largest term of the first sequence, then $\frac{1}{x_i}$ is the smallest term
of the second sequence; if $x_i^2$ is the second largest term of the first sequence, then $\frac{1}{x_i}$ is the
second smallest term of the second sequence; and so on.
The rearrangement inequality is a surprisingly useful result, so you should take the time to
really understand the previous argument before moving on.
\[ a^a b^b c^c \ge a^b b^c c^a. \]
11.4 Cauchy–Schwarz inequality 193
Solution This inequality involves products of powers, whereas the rearrangement inequality
involves sums of products. This suggests that we should take the logarithm of both sides.
And we can do this without any change of the inequality sign, since log x is an increasing
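function on the set of positive real numbers. Thus, we obtain the equivalent inequality
\[ a \log a + b \log b + c \log c \ge b \log a + c \log b + a \log c. \]
The left-hand side is the product of the two sequences
\[ (a, b, c) \quad\text{and}\quad (\log a, \log b, \log c). \]
These are sorted the same way, since the log function is increasing, as we already mentioned.
So by the rearrangement inequality, the left-hand side is certainly greater than or equal to
the product of the two sequences
\[ (b, c, a) \quad\text{and}\quad (\log a, \log b, \log c), \]
which is exactly the right-hand side.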
As a final comment on this section we note that the rearrangement inequality generalises
to more than two sequences of numbers. Here we state the version for three sequences of
numbers.3
Rearrangement inequality for three sequences of numbers Let (a1 , a2 , . . . , an ),
(x1 , x2 , . . . , xn ) and (y1 , y2 , . . . , yn ) be three sequences of positive real numbers. Suppose we
seek rearrangements (b1 , b2 , . . . , bn ) of (x1 , x2 , . . . , xn ) and (c1 , c2 , . . . , cn ) of (y1 , y2 , . . . , yn )
which maximise the expression
E = a1 b1 c1 + a2 b2 c2 + · · · + an bn cn .
Then E is maximised if the three sequences are sorted the same way.
Note that in contrast to the two sequence version of the rearrangement inequality, there is no
simple criterion for minimising E.
x1 : y1 = x2 : y2 = · · · = xn : yn .
The Cauchy–Schwarz inequality is often neglected by the budding inequality problem solver,
mainly because it is difficult to know when and how to use it. However, it can be a mighty
weapon in the hands of an expert. Here are some general tips.
Problem Prove the AM–HM inequality4 , which states that if a1 , a2 , . . . , an are positive real
numbers, then
\[ \frac{a_1 + a_2 + \cdots + a_n}{n} \ge \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}. \]
Furthermore, show that equality occurs if and only if a1 = a2 = · · · = an .
Solution Since a1 , a2 , . . . , an are positive real numbers, we can simply take the two sequences
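\[ (\sqrt{a_1}, \sqrt{a_2}, \ldots, \sqrt{a_n}) \quad\text{and}\quad \left(\frac{1}{\sqrt{a_1}}, \frac{1}{\sqrt{a_2}}, \ldots, \frac{1}{\sqrt{a_n}}\right) \]
and apply the Cauchy–Schwarz inequality to obtain
\[ (a_1 + a_2 + \cdots + a_n)\left(\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}\right) \ge n^2, \]
which rearranges to the AM–HM inequality. Equality occurs if and only if
\[ \sqrt{a_1} : \frac{1}{\sqrt{a_1}} = \sqrt{a_2} : \frac{1}{\sqrt{a_2}} = \cdots = \sqrt{a_n} : \frac{1}{\sqrt{a_n}}, \]
which is equivalent to $a_1 = a_2 = \cdots = a_n$.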
Problem Let a, b, c be positive real numbers such that abc = 1. Prove that
\[ \frac{1}{a^3(b+c)} + \frac{1}{b^3(c+a)} + \frac{1}{c^3(a+b)} \ge \frac{3}{2}. \]
Solution A common starting strategy is to massage the given inequality into a nicer looking
form by using substitutions or other algebraic trickery. In this case, it would be nice to get
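rid of the unsightly cubes appearing in the denominator of each term. We can improve the
situation greatly if we opt for the clever substitution $(a, b, c) = \left(\frac{1}{x}, \frac{1}{y}, \frac{1}{z}\right)$. This will help to
bring some of the terms into the numerator (almost always a good thing) and will also give
us the equally nice constraint $xyz = 1$. As an example, the first term on the left-hand side
would become
\[ \frac{1}{a^3(b+c)} = \frac{1}{\frac{1}{x^3}\left(\frac{1}{y} + \frac{1}{z}\right)} = \frac{x^3}{\frac{1}{y} + \frac{1}{z}} = \frac{x^3 yz}{y+z} = \frac{x^2}{y+z}, \]
where the last step uses $xyz = 1$. So it suffices to prove that
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \ge \frac{3}{2}, \]
where $x, y, z$ are positive real numbers with $xyz = 1$.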
This inequality not only looks much friendlier, but also seems to be a prime candidate for the
Cauchy–Schwarz inequality. To obtain the left-hand side, it makes sense to let one of our
sequences be
\[ \left(\frac{x}{\sqrt{y+z}}, \frac{y}{\sqrt{z+x}}, \frac{z}{\sqrt{x+y}}\right). \]
When using Cauchy–Schwarz, you should always look for nice cancellation, and this is most
easily achieved if we let the other sequence be
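\[ \left(\sqrt{y+z}, \sqrt{z+x}, \sqrt{x+y}\right). \]
The Cauchy–Schwarz inequality then gives
\[ \left(\frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y}\right)\bigl((y+z) + (z+x) + (x+y)\bigr) \ge (x+y+z)^2, \]
and dividing both sides by $2(x+y+z)$ yields
\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \ge \frac{x+y+z}{2}. \]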
We often call M2 the quadratic mean (QM), M1 the arithmetic mean (AM), M0 the geometric
mean5 (GM), and M−1 the harmonic mean (HM). You can see just how powerful this result
is from the fact that the AM–GM inequality and the AM–HM inequality are simply special
cases.
Problem Suppose that x, y, z are positive real numbers which satisfy xyz = 1.
If r > s > 0, prove that
\[ x^r + y^r + z^r \ge x^s + y^s + z^s. \]
Solution The inequality seems reminiscent of the power means inequality with n = 3, which
can be rearranged to give the following.
\[ x^r + y^r + z^r \ge (x^s + y^s + z^s)\left(\frac{x^s + y^s + z^s}{3}\right)^{\frac{r}{s} - 1} \]
5 Actually, $M_0$ looks like the oddball here. Even though we can't directly put $r = 0$ into the definition of $M_r$,
it turns out that $M_r$ converges to the geometric mean as $r \to 0$.
196 11 Inequalities
In fact, this is a particular case of the following more general result, which you should now
be able to prove on your own. It’s definitely a handy little inequality to have under your belt
and will reappear before the end of the chapter.
A useful inequality Suppose that $x_1, x_2, \ldots, x_n$ are positive real numbers which satisfy
$x_1 x_2 \cdots x_n = 1$. If $r > s > 0$, then
\[ x_1^r + x_2^r + \cdots + x_n^r \ge x_1^s + x_2^s + \cdots + x_n^s. \]
it’s worth checking whether f is convex (like a smile) or concave (like a frown).6 More often
than not, Jensen’s inequality can be used in such cases.
Jensen’s inequality If the real numbers x1 , x2 , . . . , xn lie on an interval where the func-
tion f is convex, then
\[ \frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n} \ge f\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right). \]
If they lie on an interval where the function f is concave, then the inequality is reversed.
Problem Let x1 , x2 , . . . , xn be positive real numbers whose sum is 1. What is the minimum
value of the following expression?
\[ \frac{x_1}{1 + x_2 + \cdots + x_n} + \frac{x_2}{1 + x_1 + x_3 + \cdots + x_n} + \cdots + \frac{x_n}{1 + x_1 + \cdots + x_{n-1}} \]
Indeed, the function f is convex on the interval [0, 1] and since the numbers x1 , x2 , . . . , xn all
lie on this interval, we have
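\[
\begin{aligned}
\frac{x_1}{2 - x_1} + \frac{x_2}{2 - x_2} + \cdots + \frac{x_n}{2 - x_n} &= f(x_1) + f(x_2) + \cdots + f(x_n) \\
&\ge n f\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \\
&= n f\left(\frac{1}{n}\right) \\
&= \frac{n}{2n - 1}.
\end{aligned}
\]
This shows that the expression is always greater than or equal to $\frac{n}{2n-1}$. To finish we need to
demonstrate that this value can actually be attained. Simply taking $x_1 = x_2 = \cdots = x_n = \frac{1}{n}$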
does the job.
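As a numerical sanity check of this bound, here is a minimal sketch. It assumes $f(x) = \frac{x}{2-x}$, which is what the displayed sum suggests; the function names, the choice $n = 5$ and the random sampling are our own. It compares the value at the equal point with the smallest value found over many random points whose coordinates sum to 1.

```python
import random

def expression(xs):
    """Sum of x_i / (2 - x_i); assumes the x_i are positive and sum to 1."""
    return sum(x / (2 - x) for x in xs)

def random_point(n, rng):
    """n random positive numbers rescaled so that they sum to 1."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

n = 5
bound = n / (2 * n - 1)            # the claimed minimum n / (2n - 1)
rng = random.Random(0)             # fixed seed so the run is repeatable
smallest_seen = min(expression(random_point(n, rng)) for _ in range(10000))

print(bound, expression([1 / n] * n), smallest_seen >= bound - 1e-12)
```

Random sampling proves nothing, of course, but it quickly exposes a wrong guess for the extremal value before you invest in a proof.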
11.7 Substitutions
Substitutions can often help to simplify an inequality. In particular, you should always be on
the lookout for useful substitutions, such as the following.
Consider a substitution which makes an ugly denominator look nicer, even if it makes
the numerator look uglier.
Consider a substitution which makes an ugly constraint look nicer, even if it makes the
inequality look uglier.
If a, b, c are the side lengths of a triangle, it's almost always useful to try the substitution
$(a, b, c) = (y + z, z + x, x + y)$. In fact, a, b, c are the side lengths of a triangle if and
only if $x = \frac{b+c-a}{2}$, $y = \frac{c+a-b}{2}$, $z = \frac{a+b-c}{2}$ are all positive.
We call this the incircle substitution because the incircle of a triangle with side lengths
a, b, c divides the sides into segments of lengths x, y, z. (See section 12.9.)
Solution Whenever fractions are concerned, it’s almost always better for ugliness to occur
in the numerator rather than the denominator. In this particular inequality, we can introduce
the following substitution to move the ugliness from the bottom of each fraction to the top.
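\[
\left.\begin{aligned} x &= b + 2c \\ y &= c + 2a \\ z &= a + 2b \end{aligned}\right\}
\quad\Longleftrightarrow\quad
\left\{\begin{aligned} 9a &= 4y + z - 2x \\ 9b &= 4z + x - 2y \\ 9c &= 4x + y - 2z \end{aligned}\right.
\]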
After performing these substitutions and tidying up (something which you should try on your
own with pen and paper), the inequality looks a lot more palatable.
\[ 4\left(\frac{y}{x} + \frac{z}{y} + \frac{x}{z}\right) + \left(\frac{x}{y} + \frac{y}{z} + \frac{z}{x}\right) \ge 15 \]
And it’s not too difficult to see that this inequality is true, since each bracket is greater than
or equal to 3 by the AM–GM inequality.
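If you want to double-check the inverse of the substitution before doing the pen-and-paper work, a short exact computation helps. This is a sketch only: the function names `forward`, `backward` and `transformed_lhs` are ours, and it checks just the substitution and the transformed inequality stated above, using rational arithmetic so there is no rounding.

```python
from fractions import Fraction
import random

def forward(a, b, c):
    """The substitution x = b + 2c, y = c + 2a, z = a + 2b."""
    return b + 2 * c, c + 2 * a, a + 2 * b

def backward(x, y, z):
    """The claimed inverse: 9a = 4y + z - 2x, and cyclically."""
    return (Fraction(4 * y + z - 2 * x) / 9,
            Fraction(4 * z + x - 2 * y) / 9,
            Fraction(4 * x + y - 2 * z) / 9)

def transformed_lhs(x, y, z):
    """Left-hand side of the inequality after substituting."""
    return 4 * (y / x + z / y + x / z) + (x / y + y / z + z / x)

rng = random.Random(1)
round_trips = True
smallest = None
for _ in range(1000):
    a, b, c = (Fraction(rng.randint(1, 50)) for _ in range(3))
    x, y, z = forward(a, b, c)
    round_trips = round_trips and backward(x, y, z) == (a, b, c)
    value = transformed_lhs(x, y, z)
    smallest = value if smallest is None else min(smallest, value)

print(round_trips, smallest >= 15)
```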
Many interesting inequalities are created using this technique. One way to solve them is to
reverse engineer them. In other words, your task is to discover the original chunks which were
added or multiplied together to create the inequality. As you will see, this does not involve
guesswork alone, but also intelligence, experience, and sometimes luck.
a = y + z, b = z + x, c = x + y,
And now it seems far more plausible that each term on the left is smaller than the corresponding
term on the right. All that remains is to show that
\[ \sqrt{2x} + \sqrt{2y} \le 2\sqrt{x + y}, \]
which is easily proved by squaring, simplifying, and resorting to the AM–GM inequality. Then
adding this inequality with the two others obtained by permuting the variables yields the
desired result.
Solution This time, the chunks are much harder to uncover, but they look like the following.
\[ \frac{abc}{a^3 + b^3 + abc} \le \frac{c}{a + b + c} \quad\Longleftrightarrow\quad a^2bc + b^2ca + c^2ab \le ca^3 + cb^3 + c^2ab. \]
After collecting all terms on the left-hand side and factorising, this latter inequality simply
becomes
\[ -c(a + b)(a - b)^2 \le 0, \]
which is obviously true. Now the solution to the original problem follows immediately, since
we can simply sum the three corresponding inequalities.
The trick in this solution was to break the number 1 into three chunks, each one depending
on a, b and c.
Solution It might seem like a maniacal thing to do, but we’re going to multiply both sides
of the inequality by (a + b)(b + c)(c + a). Since the inequality is cyclic in nature, it would be
foolish for us not to use cyclic sum notation. We want to prove
\[ \sum_{\text{cyc}} (a + b - 2c)(c + a)(a + b) \ge 0, \]
which is equivalent to
\[ \sum_{\text{cyc}} \left(a^3 + 2a^2b + b^2c + b^2a - 2c^2a - 2c^2b - a^2c\right) \ge 0. \]
This might seem even messier than when we began, but remember that in cyclic sum notation,
we have equations like
\[ \sum_{\text{cyc}} a^2b = \sum_{\text{cyc}} b^2c = \sum_{\text{cyc}} c^2a = a^2b + b^2c + c^2a. \]
So, this observation allows us to cancel a lot of the terms until we are left with something
which almost looks nice.
\[ \sum_{\text{cyc}} \left(a^3 + a^2b - 2a^2c\right) \ge 0 \]
11.10 Homogeneous inequalities 201
It would be fantastic if we could prove that the summand $a^3 + a^2b - 2a^2c$ was non-negative, but
this fact just isn't true for arbitrary positive real numbers a, b, c.
However, remember that in cyclic sum notation, the term $a^2b$ is the same as the term $c^2a$.
Plugging this into the inequality and factorising gives
\[ \sum_{\text{cyc}} \left(a^3 + c^2a - 2a^2c\right) = \sum_{\text{cyc}} a(a - c)^2 \ge 0, \]
which is now obviously true, using the fact that squares are non-negative.
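Cyclic-sum manipulations like these are easy to get wrong by hand, so it is worth spot-checking them on integers. The sketch below uses helper names of our own choosing (`cyc`, `cleared`, `middle`, `factorised`). It confirms that the three cyclic sums in the solution agree, and shows that a single summand of the middle form can be negative.

```python
import random

def cyc(term, a, b, c):
    """Cyclic sum: term(a, b, c) + term(b, c, a) + term(c, a, b)."""
    return term(a, b, c) + term(b, c, a) + term(c, a, b)

def cleared(a, b, c):
    """Summand after multiplying through by (a + b)(b + c)(c + a)."""
    return (a + b - 2 * c) * (c + a) * (a + b)

def middle(a, b, c):
    """Summand after cancelling terms: a^3 + a^2 b - 2 a^2 c."""
    return a**3 + a**2 * b - 2 * a**2 * c

def factorised(a, b, c):
    """Summand after swapping a^2 b for c^2 a: a (a - c)^2."""
    return a * (a - c) ** 2

rng = random.Random(2)
samples = [tuple(rng.randint(1, 100) for _ in range(3)) for _ in range(1000)]
agree = all(cyc(cleared, *s) == cyc(middle, *s) == cyc(factorised, *s)
            for s in samples)

# The three cyclic sums agree, yet one summand alone may be negative.
print(agree, middle(1, 1, 2))
```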
Solution First, we note that the inequality is homogeneous, since plugging in the values ra,
rb and rc gives
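\[ \left(1 + \frac{ra}{rb}\right)\left(1 + \frac{rb}{rc}\right)\left(1 + \frac{rc}{ra}\right) \ge 2\left(1 + \frac{ra + rb + rc}{\sqrt[3]{ra \cdot rb \cdot rc}}\right), \]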
which is identical to the original inequality after some cancellation. This means that if we
can prove the inequality for (a, b, c), then it must hold true for (ra, rb, rc) for any positive
real number r. In particular, if we can prove the inequality in the case a + b + c = 1, then
it follows that the inequality is true for any positive value of a + b + c. Similarly, if we can
prove the inequality for abc = 1, then it follows that the inequality is true for any positive
value of abc.
In this particular case, fixing abc = 1 gets rid of the unsightly cube root for us and the
inequality takes the following form
\[ \left(1 + \frac{a}{b}\right)\left(1 + \frac{b}{c}\right)\left(1 + \frac{c}{a}\right) \ge 2(1 + a + b + c). \]
Expanding and simplifying leaves us with
\[ a^2c + b^2c + b^2a + c^2a + c^2b + a^2b \ge 2(a + b + c). \]
The rest of the inequality can be handled by cleverly pairing the six terms on the left-hand
side as shown and applying the AM–GM inequality to each pair.
\[
\begin{aligned}
(a^2b + a^2c) + (b^2c + b^2a) + (c^2a + c^2b) &\ge 2\sqrt{a^4bc} + 2\sqrt{b^4ca} + 2\sqrt{c^4ab} \\
&= 2\left(a^{3/2} + b^{3/2} + c^{3/2}\right) \\
&\ge 2(a + b + c)
\end{aligned}
\]
The very last step here is a simple application of the useful inequality stated in section 11.5.
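\[
\begin{aligned}
a_1 &\ge a_2 \ge \cdots \ge a_n \\
b_1 &\ge b_2 \ge \cdots \ge b_n \\
a_1 &\ge b_1 \\
a_1 + a_2 &\ge b_1 + b_2 \\
a_1 + a_2 + a_3 &\ge b_1 + b_2 + b_3 \\
&\;\;\vdots \\
a_1 + a_2 + \cdots + a_{n-1} &\ge b_1 + b_2 + \cdots + b_{n-1} \\
a_1 + a_2 + \cdots + a_{n-1} + a_n &= b_1 + b_2 + \cdots + b_{n-1} + b_n
\end{aligned}
\]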
Problem Prove that if a, b, c are positive real numbers with product 1, then
\[ a^4b + ab^4 + a^4c + ac^4 + b^4c + bc^4 \ge 2ab + 2bc + 2ac. \]
Solution Note first that since $abc = 1$ we may replace $ab$ with $a^2b^2c$, $bc$ with $ab^2c^2$ and $ac$
with $a^2bc^2$, thus making the inequality homogeneous. Then if we write it using symmetric
sum notation, the inequality takes the form
\[ \sum_{\text{sym}} a^4b^1c^0 \ge \sum_{\text{sym}} a^2b^2c. \]
Since the triple (4, 1, 0) majorises the triple (2, 2, 1), we may invoke Muirhead’s inequality to
finish off the problem.
Solution We suppress the brute force part here! Suffice to say that after homogenising,
clearing denominators and rearranging, the inequality takes the form
\[ \sum_{\text{sym}} \left(x^{10}yz + 4x^7y^5 + x^6y^3z^3\right) \ge \sum_{\text{sym}} \left(x^8y^2z^2 + 2x^6y^5z + 2x^6y^4z^2 + x^5y^5z^2\right). \]
11.12 Weighted inequalities 203
It would be nice if we could match up sequences on the LHS to majorise sequences on the
RHS, but unfortunately this cannot be done. The sequence (10, 1, 1) on the LHS majorises
all of the sequences associated with the RHS. The sequence (7, 5, 0) on the LHS majorises all
the sequences on the RHS except for (8, 2, 2). But the sequence (6, 3, 3) on the LHS does not
majorise anything on the RHS. The solution to our conundrum is to use the AM–GM on the
LHS so as to smooth out the strong (10, 1, 1) term with the weak (6, 3, 3) term. This yields
\[ x^{10}yz + x^6y^3z^3 \ge 2x^8y^2z^2. \]
We are now done because (8, 2, 2) majorises (5, 5, 2) while (7, 5, 0) majorises (6, 5, 1) and
(6, 4, 2).
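Checking majorisation by eye gets tedious with several exponent triples, so here is a small helper. It is a sketch of the partial-sum test from the definition; the name `majorises` is ours. The examples are the exponent triples read off from the two displayed symmetric-sum inequalities in this section.

```python
def majorises(p, q):
    """Return True if the exponent sequence p majorises q.

    Both sequences are first sorted in decreasing order. They must have
    equal totals, and every partial sum of p must be at least the
    corresponding partial sum of q.
    """
    p, q = sorted(p, reverse=True), sorted(q, reverse=True)
    if len(p) != len(q) or sum(p) != sum(q):
        return False
    partial_p = partial_q = 0
    for x, y in zip(p, q):
        partial_p += x
        partial_q += y
        if partial_p < partial_q:
            return False
    return True

rhs = [(8, 2, 2), (6, 5, 1), (6, 4, 2), (5, 5, 2)]
print(majorises((4, 1, 0), (2, 2, 1)))
print([majorises((10, 1, 1), t) for t in rhs])
print([majorises((7, 5, 0), t) for t in rhs])
print([majorises((6, 3, 3), t) for t in rhs])
```

The last three lines show, for each left-hand triple, which right-hand triples it majorises, which is exactly the bookkeeping needed to decide where AM–GM smoothing is required.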
The expansion and collection of like terms which was suppressed in the above solution would
require an enormous amount of accuracy, time and perseverance if performed without using
symmetric sum notation. Even with this notation it is somewhat unwieldy. A useful notation
that can be used is to write $[i, j, k]$ to mean $\sum_{\text{sym}} x^i y^j z^k$. Of course one must understand how
to manipulate the notation. But it's not that difficult. Perhaps the hardest part is to deduce
that
\[
\begin{aligned}
[i, j, k][p, q, r] = {} & [i + p, j + q, k + r] + [i + p, j + r, k + q] \\
& + [i + q, j + p, k + r] + [i + q, j + r, k + p] \\
& + [i + r, j + p, k + q] + [i + r, j + q, k + p],
\end{aligned}
\]
which represents the expansion of $\left(\sum_{\text{sym}} x^i y^j z^k\right)\left(\sum_{\text{sym}} x^p y^q z^r\right)$.
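The product rule for the bracket notation can be spot-checked numerically. The sketch below assumes that the symmetric sum runs over all six orderings of the variables, so repeated terms are counted when exponents coincide; the function names are our own. It evaluates both sides at an integer point.

```python
from itertools import permutations

def bracket(exps, point):
    """[i, j, k] at point = (x, y, z): sum of u^i v^j w^k over all 6 orderings (u, v, w)."""
    i, j, k = exps
    return sum(u**i * v**j * w**k for u, v, w in permutations(point))

def product_expansion(e1, e2, point):
    """Right-hand side of the rule: add each ordering of e2 to e1, then sum the brackets."""
    i, j, k = e1
    return sum(bracket((i + p, j + q, k + r), point)
               for p, q, r in permutations(e2))

point = (2, 3, 5)                      # any integers will do
lhs = bracket((2, 1, 0), point) * bracket((1, 1, 0), point)
rhs = product_expansion((2, 1, 0), (1, 1, 0), point)
print(lhs == rhs)
```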
Note that most expansions are simpler, such as
Weighted power means inequality For positive real numbers $a_1, a_2, \ldots, a_n$, and positive
real numbers $w_1, w_2, \ldots, w_n$ (called weights), define
\[ M_0 = \left(a_1^{w_1} a_2^{w_2} \cdots a_n^{w_n}\right)^{\frac{1}{w_1 + w_2 + \cdots + w_n}} \quad\text{and}\quad M_r = \left(\frac{w_1 a_1^r + w_2 a_2^r + \cdots + w_n a_n^r}{w_1 + w_2 + \cdots + w_n}\right)^{\frac{1}{r}} \]
If they lie on an interval where the function f is concave, then the inequality is reversed.
Problem Prove the Cauchy–Schwarz inequality from the weighted AM–HM inequality,
where the variables in the Cauchy–Schwarz inequality are positive real numbers.
Solution We prove only the version for two sets of three variables but the proof generalises
quite easily for two sets of n variables. The Cauchy–Schwarz inequality for two sets of three
variables is
\[ (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) \ge (a_1b_1 + a_2b_2 + a_3b_3)^2. \]
The weighted AM–HM inequality for three variables $x_1, x_2, x_3$, and corresponding weights
$w_1, w_2, w_3$, is
\[ \frac{w_1x_1 + w_2x_2 + w_3x_3}{w_1 + w_2 + w_3} \ge \frac{w_1 + w_2 + w_3}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \frac{w_3}{x_3}}, \]
which rearranges as
\[ (w_1x_1 + w_2x_2 + w_3x_3)\left(\frac{w_1}{x_1} + \frac{w_2}{x_2} + \frac{w_3}{x_3}\right) \ge (w_1 + w_2 + w_3)^2. \]
This looks very similar to the Cauchy–Schwarz inequality. In fact it is identical if we put
\[ w_i = a_ib_i \quad\text{and}\quad x_i = \frac{a_i}{b_i}. \]
Note that the change of variables at the end of the solution is reversible via
\[ a_i = \sqrt{w_ix_i} \quad\text{and}\quad b_i = \frac{\sqrt{w_i}}{\sqrt{x_i}}. \]
This demonstrates that the weighted AM–HM inequality and the Cauchy–Schwarz inequality
are in fact the same inequality but stated in different variables!
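To see the change of variables in action, the following sketch compares both sides of the two inequalities exactly, using rational numbers. The function names and the random sample values are ours, for illustration only.

```python
from fractions import Fraction
import random

def cauchy_schwarz_sides(a, b):
    """(sum a_i^2)(sum b_i^2) and (sum a_i b_i)^2."""
    lhs = sum(x * x for x in a) * sum(y * y for y in b)
    rhs = sum(x * y for x, y in zip(a, b)) ** 2
    return lhs, rhs

def weighted_am_hm_sides(w, x):
    """(sum w_i x_i)(sum w_i / x_i) and (sum w_i)^2."""
    lhs = (sum(wi * xi for wi, xi in zip(w, x))
           * sum(wi / xi for wi, xi in zip(w, x)))
    rhs = sum(w) ** 2
    return lhs, rhs

def to_weighted(a, b):
    """The change of variables w_i = a_i b_i and x_i = a_i / b_i."""
    w = [ai * bi for ai, bi in zip(a, b)]
    x = [ai / bi for ai, bi in zip(a, b)]
    return w, x

rng = random.Random(3)
a = [Fraction(rng.randint(1, 20), rng.randint(1, 20)) for _ in range(3)]
b = [Fraction(rng.randint(1, 20), rng.randint(1, 20)) for _ in range(3)]
w, x = to_weighted(a, b)

# Both inequalities have literally the same two sides after the substitution.
print(cauchy_schwarz_sides(a, b) == weighted_am_hm_sides(w, x))
```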
Using weighted means it is possible to prove the more general Hölder’s inequality.
In fact this generalises again to more than two sets of variables. The full version is a bit
unwieldy and would rarely be used, so we just state the version for three sets of variables
from which the generalisation should be apparent.
\[ \left(\sum_{i=1}^{n} a_i^p\right)^{\frac{1}{p}} \left(\sum_{i=1}^{n} b_i^q\right)^{\frac{1}{q}} \left(\sum_{i=1}^{n} c_i^r\right)^{\frac{1}{r}} \ge \sum_{i=1}^{n} a_ib_ic_i. \]
Here is one more example of how a difficult inequality can be proved using weighted means.
\[ \frac{a f(a^2 + 8bc) + b f(b^2 + 8ac) + c f(c^2 + 8ab)}{a + b + c} \ge f\left(\frac{a(a^2 + 8bc) + b(b^2 + 8ac) + c(c^2 + 8ab)}{a + b + c}\right) \]
\[ (a + b + c)^3 \ge a^3 + b^3 + c^3 + 24abc. \]
However, this is very easy to prove after the LHS is expanded. We leave the proof of this last
part to the reader.
12 Geometric inequalities
Geometric inequalities, even simple looking ones, can be fiendishly difficult. Of course, it
helps to have some mastery of geometry as well as some mastery of inequalities. We will look
at a whole new bag of tricks for solving geometric inequalities in this chapter.
12.0 Problems
1. If ABCD is a convex quadrilateral, which point P minimises the sum
PA + PB + PC + PD?
AP + BP < AC + BC.
3. Let ABC be a triangle and let X, Y and Z be the midpoints of BC, CA and AB,
respectively.
Prove that
\[ \frac{3}{4}(AB + BC + CA) < AX + BY + CZ < AB + BC + CA. \]
7. Two planets A and B lie in space, near to a long, straight asteroid belt which can be
considered as a straight line.
What is the shortest path from A to B via the asteroid belt?
(a) Determine the point M on the circle which maximises the value of $AM^2 + BM^2$.
(b) Determine the point M on the circle which minimises the value of $AM^2 + BM^2$.
10. Determine the point M inside acute triangle ABC which minimises the value of
AM × BC + BM × AC + CM × AB.
12. Let P , Q, R and S be points on the sides AB, BC, CD and DA of a parallelogram
ABCD, respectively.
Prove that
P Q + QR + RS + SP ≥ 2AC,
where AC is the shorter diagonal of the parallelogram.
13. Let ABC be a triangle. Let K, L and M be points on BC, AC and AB, respectively.
Let X, Y and Z be points on LM , M K and KL, respectively. Let E1 , E2 , E3 , E4 , E5 ,
E6 and E denote the areas of triangles AM Y , CKY , BKZ, ALZ, BM X, CLX and
ABC, respectively.
Show that
\[ E \ge 8\sqrt[6]{E_1E_2E_3E_4E_5E_6}. \]
15. Let M and N be the midpoints of sides AD and BC of the convex quadrilateral ABCD.
Prove that
2M N ≤ AB + CD,
with equality if and only if AB is parallel to CD.
16. Consider a quadrilateral of area A and with consecutive side lengths a, b, c and d.
Prove that
ac + bd ≥ 2A,
with equality if and only if the quadrilateral is cyclic and its diagonals are perpendicular.
17. Consider the hexagon formed in a triangle by drawing the three tangents to the incircle
which are parallel to the sides of the triangle.
Prove that the perimeter of the hexagon is less than or equal to $\frac{2}{3}$ times the perimeter
of the triangle.
18. In the plane we are given 5 distinct points A, B, C, P and Q, no three of which are
collinear.
Prove that
AB + BC + CA + P Q < AP + AQ + BP + BQ + CP + CQ.
AB = BC, CD = DE and EF = F A.
Prove that
\[ \frac{BC}{BE} + \frac{DE}{DA} + \frac{FA}{FC} \ge \frac{3}{2}, \]
and determine when equality occurs.
20. Let P be a point inside triangle ABC.
Prove that one of the three angles ∠PAB, ∠PBC, ∠PCA is at most 30°.
21. In an acute triangle ABC, let O be the circumcentre and P be the foot of the altitude
from A. Suppose that
∠BCA ≥ ∠ABC + 30°.
Prove that
∠CAB + ∠COP < 90°.
22. Let ABCDEF be a convex hexagon whose opposite sides are parallel. Let RA , RC and
RE be the radii of circles F AB, BCD and DEF , respectively.
If P is the perimeter of the hexagon, prove that
\[ R_A + R_C + R_E \ge \frac{P}{2}. \]
AC ≤ AB + BC,
Solution The form of this geometric inequality suggests that we should try to construct
a triangle whose side lengths are AB, BC and 2BM . The natural way to construct a line
segment of length 2BM in our diagram is to extend BM to a point D, where BM = DM .
Alternatively, we can describe D as the unique point in the plane which makes ABCD a
parallelogram.
[Figure: parallelogram ABCD.]
Problem Two towns A and B lie on the same side of a long, straight river.
What is the shortest path from A to B via the river?
1 In most of our problems, including this one, triangle refers to a non-degenerate triangle, that is, one in which
the three vertices are not collinear.
12.2 Reflection principle 211
Solution It’s clear that such a shortest path must consist of two straight line segments, AP
and P B, where P is some point on the river. But where should we put the point P ? The
trick is to imagine a ghost town B′, located at the reflection of town B through the river, as
shown in the diagram.
[Figure: towns A and B on the same side of the river, with B′ the reflection of B through the river.]
Then for any point P on the river, the distance PB is exactly the same as the distance PB′.
By the triangle inequality, we have
\[ AP + PB = AP + PB' \ge AB', \]
One thing you might notice from this solution is that, in order to obtain the shortest path,
the line segments AP and P B should meet the river at equal angles. This is analogous to the
physical principle which states that when a beam of light is reflected by a mirror, the angle
of incidence equals the angle of reflection.
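The reflection trick translates directly into a computation. In the sketch below we assume, purely for illustration, that the river is the x-axis and both towns have positive y-coordinate; the function names and the sample coordinates are ours. The point P is where the segment from A to the ghost town B′ crosses the river.

```python
import math

def shortest_via_river(a, b):
    """Return (P, length) for the shortest path A -> river -> B.

    Assumes the river is the x-axis and that a and b are (x, y) pairs
    with y > 0. B is reflected to B' = (bx, -by); P is the point where
    the segment AB' crosses the x-axis, and the length is |AB'|.
    """
    ax, ay = a
    bx, by = b
    t = ay / (ay + by)                 # fraction of the way from A to B'
    p = (ax + t * (bx - ax), 0.0)
    return p, math.hypot(bx - ax, ay + by)

def path_length(a, p, b):
    """Length of the two-segment path A -> P -> B."""
    return math.dist(a, p) + math.dist(p, b)

A, B = (0.0, 3.0), (4.0, 1.0)
P, best = shortest_via_river(A, B)
print(P, best, path_length(A, P, B))
```

For these sample towns the segments AP and PB both make a 45° angle with the river, in line with the equal-angle observation above.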
Let’s now try to use the reflection principle to solve a more substantial problem.
[Figure: the quadrilateral, with the points P, W, X, Z and Q marked.]
Solution It might look like a crazy problem, but you’ll see just how easy it is with the help
of the reflection principle. Perhaps surprisingly, we will reflect not once, not twice, not thrice,
but four times!
First take the quadrilateral and reflect it through its side AD.
Then take the new quadrilateral and reflect it through its side BC.
Then take the new quadrilateral and reflect it through its side AD.
Finally, take the new quadrilateral and reflect it through its side AB.
The end result should be a diagram consisting of the original quadrilateral on the far left,
and four reflected copies.
[Figure: the original quadrilateral and its four reflected copies, with the image points P′, W′, X′, Y′, Z′ and Q′.]
\[ P' \to W' \to X' \to Y' \to Z' \to Q'. \]
Furthermore, since reflections keep lengths the same, we can be sure that PW = P′W′,
WX = W′X′, XY = X′Y′, YZ = Y′Z′ and ZQ = Z′Q′. Of course, in order to minimise
the length of the path from P′ to Q′ on our new diagram, we should simply take the straight
line segment P′Q′. We can then copy this back on to our original quadrilateral to obtain the
desired shortest path.
As for the path's length, it can be calculated using Pythagoras' theorem, since we know
that Q′ is 40 units across and 9 units down from P′. So the length of the shortest path with
the desired properties is $\sqrt{40^2 + 9^2} = 41$.
We’ll use the reflection principle to full effect in the solution of the following interesting
problem, first posed by Fagnano way back in the 18th century.
Problem Let ABC be an acute triangle and let X, Y and Z be points on the sides BC,
CA and AB, respectively.
Where should X, Y and Z be located so that the perimeter of triangle XY Z is minimal?
Solution First, we’ll minimise the perimeter of triangle XY Z, where X is some fixed point
on BC. The trick is to reflect X in the sides AC and AB to obtain the points P and Q,
respectively.
[Figure: triangle ABC with X on BC, Y on CA and Z on AB.]
\[ XY + YZ + ZX = PY + YZ + ZQ \ge PQ, \]
with equality if and only if Y lies on the intersection of AC and PQ, while Z lies on the
intersection of AB and PQ. So for a fixed point X on BC, we can find Y and Z which
minimise the perimeter of triangle XYZ. In fact, we know that the minimal perimeter in
this case is precisely the length of PQ.
Now observe that, since P and Q were obtained by reflection we have
AP = AQ = AX.
Furthermore,
When stating Fagnano’s problem, we were careful to mention that ABC is an acute triangle.
What happens when it’s right-angled or obtuse?
12.3 Transformations
We’ve already seen that reflections are particularly useful for solving geometric inequalities,
but they’re certainly not the only transformations which come in handy. In general, transfor-
mations which preserve lengths, known as isometries, may also be useful. In fact, a common
approach is to use an isometry to create a situation in which the triangle inequality can be
employed.
Problem Two towns A and B lie on different sides of a long, straight river with parallel
banks.
Where is the best place to build a bridge, perpendicular to the banks of the river, such that
the path from A to B via the bridge is shortest?
Solution Suppose that the bridge touches the north bank at P and the south bank at Q, so
that the width of the river is the distance PQ. Now consider the translation which takes B
towards the river by the distance PQ, to a point B′. Then the path from A to B via the bridge has length
\[ AP + PQ + QB = (AP + PB') + PQ. \]
[Figure: towns A and B on opposite sides of the river, the bridge PQ, and the translated point B′.]
\[ AP + PB' \ge AB', \]
12.4 Trigonometry
Trigonometry is often useful for solving geometric inequalities, because we can use various
trigonometric identities as well as the fact that sin θ ≤ 1 and cos θ ≤ 1 for all angles θ. Here
is a nice illustrative example.
Problem Let P be a point inside triangle ABC and let the feet of the perpendiculars from P
to the sides BC, CA, AB be D, E, F , respectively.
Find the point P which maximises the value of
\[ \frac{PD \cdot PE \cdot PF}{PA \cdot PB \cdot PC}. \]
Solution Consider the angles a1 and a2 labelled in the following diagram.
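[Figure: triangle ABC with interior point P, the feet D, E, F of the perpendiculars from P, and the angles $a_1$ and $a_2$ at A.]
\[
\begin{aligned}
\frac{PF}{PA} \cdot \frac{PE}{PA} &= \sin a_1 \sin a_2 \\
&= \frac{1}{2}\cos(a_1 - a_2) - \frac{1}{2}\cos(a_1 + a_2) && \text{(product to sum)} \\
&\le \frac{1}{2} - \frac{1}{2}\cos A \\
&= \sin^2 \frac{A}{2} && \text{(double angle formula)}
\end{aligned}
\]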
2
Equality holds here if and only if a1 = a2 or, in other words, when P lies on the angle bisector
from A. Writing down the other two analogous inequalities, multiplying them together, and
taking the square root gives
\[ \frac{PD \cdot PE \cdot PF}{PA \cdot PB \cdot PC} \le \sin\frac{A}{2} \cdot \sin\frac{B}{2} \cdot \sin\frac{C}{2}, \]
with equality if and only if P lies on all three angle bisectors.
So the maximum value is achieved when P is the incentre of the triangle.
12.5 Parametrisation
When solving a geometric inequality, it’s sometimes possible to write the expression that
you’re trying to minimise or maximise in terms of certain parameters, such as angles or
lengths. This can often help to turn a geometric inequality into an algebraic inequality.
Problem Let I be the incentre of triangle ABC and let the angle bisectors from A, B and
C meet the opposite sides at A′, B′ and C′, respectively.
Prove that
\[ \frac{AI \cdot BI \cdot CI}{AA' \cdot BB' \cdot CC'} \le \frac{8}{27}. \]
Solution The diagram is brimming with angle bisectors while the inequality is brimming
with ratios, which suggests that we might be able to use the angle bisector theorem to our
advantage.
[Figure: triangle ABC with incentre I, where the angle bisectors meet the opposite sides at A′, B′ and C′.]
A natural approach is to try to rewrite the expression $\frac{AI}{AA'}$ in terms of some nicer lengths. So
let's aim to write it in terms of a = BC, b = CA and c = AB, the side lengths of the triangle.
We can't get the expression $\frac{AI}{AA'}$ directly from the angle bisector theorem. However, we can
get the related expression $\frac{AI}{IA'}$ from using the angle bisector theorem in triangle ABA′ or
ACA′.
\[ \frac{AI}{IA'} = \frac{AB + AC}{BA' + CA'} = \frac{b + c}{a} \quad\Rightarrow\quad \frac{IA'}{AI} = \frac{a}{b + c} \]
Now we have all the information we need to determine $\frac{AI}{AA'}$.
This concludes the parametrisation of the geometric inequality in terms of the side lengths of
the triangle. We have now left geometry behind and all that remains is to prove the algebraic
inequality
\[ \left(\frac{b + c}{a + b + c}\right)\left(\frac{c + a}{a + b + c}\right)\left(\frac{a + b}{a + b + c}\right) \le \frac{8}{27}, \]
where the only restriction on a, b, c is that they are positive real numbers which obey the
triangle inequality. Applying the AM–GM in the obvious way, we obtain
The inequality is still true if the point I is allowed to be any point inside the triangle. However,
the proof of this requires a different, but not difficult, approach. See if you can prove it!
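The parametrisation can be checked on a concrete triangle. The sketch below relies on two standard facts that are not spelled out in the text and are treated here as assumptions: the incentre is the weighted average of the vertices with weights a, b, c, and A′ divides BC in the ratio c : b. The function names and sample triangle are ours. It compares the measured ratio AI/AA′ with $\frac{b+c}{a+b+c}$, the factor appearing in the algebraic inequality.

```python
import math

def incentre_ratio(A, B, C):
    """Return (AI / AA', (b + c) / (a + b + c)) for triangle ABC."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    s = a + b + c
    # Incentre as the weighted average of the vertices with weights a, b, c.
    I = ((a * A[0] + b * B[0] + c * C[0]) / s,
         (a * A[1] + b * B[1] + c * C[1]) / s)
    # Foot of the bisector from A: BA' : A'C = c : b (angle bisector theorem).
    A1 = ((b * B[0] + c * C[0]) / (b + c),
          (b * B[1] + c * C[1]) / (b + c))
    return math.dist(A, I) / math.dist(A, A1), (b + c) / s

def ratio_product(A, B, C):
    """(AI * BI * CI) / (AA' * BB' * CC'), measured from coordinates."""
    return (incentre_ratio(A, B, C)[0]
            * incentre_ratio(B, C, A)[0]
            * incentre_ratio(C, A, B)[0])

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)   # a 3-4-5 right triangle
measured, formula = incentre_ratio(A, B, C)
print(measured, formula, ratio_product(A, B, C) <= 8 / 27)
```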
AB · CD + BC · DA ≥ AC · BD.
If the four points do not lie on a line, then equality occurs if and only if ABCD is cyclic
with the points A, B, C and D lying in that order around the circle.
Problem In triangle ABC, the angle bisector from A meets the circumcircle again at P .
Similarly, the angle bisector from B meets the circumcircle again at Q and the angle bisector
from C meets the circumcircle again at R.
Prove that
AB + BC + CA < AP + BQ + CR.
Solution
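[Figure: triangle ABC with its circumcircle and the points P, Q, R where the angle bisectors meet it again.]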
Observe that P bisects the arc between B and C on the circumcircle since AP is the angle
bisector from A. In particular, this means that we have
\[ \angle PBC = \angle PCB = \frac{\angle A}{2} \quad\Rightarrow\quad PB = PC. \]
Now points A, B, P and C are cyclic in that order, so we may use the equality form of
Ptolemy’s inequality to find
\[ AB \cdot PC + PB \cdot AC = AP \cdot BC. \]
It seems that Ptolemy’s inequality is a good idea when there are isosceles triangles around.
In the following problem, which was posed by Fermat, we’ll see that it’s an even better idea
when there are equilateral triangles.
Problem Let ABC be a triangle with all angles less than 120°.
Find a point P such that
PA + PB + PC
is minimal. (See footnote 3.)
Solution The first thing we’re going to do is find a bound for P B + P C. And the way we’re
going to do that is to use Ptolemy’s inequality with an equilateral triangle. But where is
the equilateral triangle, you might be wondering? Well, let’s just construct the point X so
that triangle BCX is equilateral and external to our original triangle. Now using Ptolemy’s
inequality in BP CX gives
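\[ BP \cdot CX + PC \cdot XB \ge BC \cdot PX. \]
[Figure: triangle ABC with the equilateral triangle BCX constructed externally on side BC.]
Since CX = XB = BC, this simplifies to PB + PC ≥ PX.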
So we’ve learnt that using an equilateral triangle gives us nice cancellation in Ptolemy’s
inequality. We now have
PA + PB + PC ≥ PA + PX
and you just can’t help but use the triangle inequality on this last expression. That gives
PA + PB + PC ≥ AX.
But can we actually achieve equality? Yes we can! For equality to occur in Ptolemy’s
inequality, we need the point P to lie on the circumcircle of triangle BCX on the minor arc
BC. For equality to occur in the triangle inequality, we need P to lie on the line segment AX.
So P should be the intersection of the circumcircle of triangle BCX and the line segment
AX. In fact, the minimal value of P A + P B + P C will simply be the length of AX.
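The construction can be carried out numerically. The sketch below uses function names and sample coordinates of our own, and assumes, as in the problem, that all angles of the triangle are less than 120°. It builds X, intersects segment AX with the circumcircle of BCX, and compares PA + PB + PC with AX. Because X itself lies on that circle, the other intersection of line AX with the circle follows from the product of the roots of the quadratic in the parameter t.

```python
import math

def rotate(p, centre, angle):
    """Rotate point p about centre by angle (in radians)."""
    x, y = p[0] - centre[0], p[1] - centre[1]
    c, s = math.cos(angle), math.sin(angle)
    return (centre[0] + c * x - s * y, centre[1] + s * x + c * y)

def side(p, b, c):
    """Sign tells which side of line BC the point p lies on."""
    return (c[0] - b[0]) * (p[1] - b[1]) - (c[1] - b[1]) * (p[0] - b[0])

def fermat_point(a, b, c):
    """Return (P, |AX|), assuming every angle of ABC is below 120 degrees."""
    # X completes the equilateral triangle on BC, on the far side from A.
    x = rotate(c, b, math.pi / 3)
    if side(x, b, c) * side(a, b, c) > 0:
        x = rotate(c, b, -math.pi / 3)
    # Circumcircle of the equilateral triangle BCX: centre is its centroid.
    centre = ((b[0] + c[0] + x[0]) / 3, (b[1] + c[1] + x[1]) / 3)
    radius = math.dist(centre, b)
    # Points A + t(X - A) on the circle solve a quadratic with roots 1 and t.
    d = (x[0] - a[0], x[1] - a[1])
    ao = (a[0] - centre[0], a[1] - centre[1])
    t = (ao[0] ** 2 + ao[1] ** 2 - radius ** 2) / (d[0] ** 2 + d[1] ** 2)
    return (a[0] + t * d[0], a[1] + t * d[1]), math.dist(a, x)

def total(p, a, b, c):
    """PA + PB + PC."""
    return math.dist(p, a) + math.dist(p, b) + math.dist(p, c)

A, B, C = (1.0, 3.0), (0.0, 0.0), (4.0, 0.0)
P, ax_length = fermat_point(A, B, C)
print(P, total(P, A, B, C), ax_length)
```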
3 A slightly weaker version of this problem was solved in section 6.8.
12.7 Locus and tangency 219
Note that the conditions on P imply that ∠BPC = 120°. But there was nothing special
about the side BC. We could just as well have constructed our equilateral triangle on CA or
AB, in which case we would have arrived at the result ∠CPA = 120° or ∠APB = 120°.
So it turns out that P is the unique point in triangle ABC such that
\[ \angle APB = \angle BPC = \angle CPA = 120^\circ. \]
The locus of points M such that ∠AM B is constant is the union of two circular arcs
symmetric about AB.
The locus of points M such that $AM^2 + BM^2$ is constant is a circle whose centre is
the midpoint of AB.
The locus of points M such that AM + BM is constant is an ellipse whose foci are A
and B.
Problem Of all the triangles with a given base and a given perimeter, which one has the
greatest area?
Solution
[Figure: the ellipse with foci A and B, and a point M on it.]
Suppose that the given base is AB and that the given perimeter is p. Consider the ellipse
whose foci are A and B such that, for every point M on the ellipse,
AB + AM + BM = p.
In the previous problem, the notion of locus was crucial to the solution. In the next, we’ll see
how the notion of tangency also comes into play.
Problem If A and B are two points on the same side of a line $\ell$, determine all points P on
$\ell$ such that ∠APB is maximised.
Solution First, we consider the case when AB is parallel to $\ell$. Then there is a unique circle
which passes through A and B and is tangent to $\ell$ at a point T.
Every point P on $\ell$ apart from T lies on the same side of AB and outside the circle. Therefore,
for every point P on $\ell$, we have the inequality
\[ \angle APB \le \angle ATB, \]
with equality if and only if P = T.
[Figure: left, AB parallel to $\ell$ with the tangent point T; right, the line AB meeting $\ell$ at X, with tangent points $T_1$ and $T_2$ on either side of X.]
Now, consider the case when the line through AB meets $\ell$ at some point X. Then there are
two circles which pass through A and B and are tangent to $\ell$ at points, which we'll call $T_1$
and $T_2$, one on each side of X.
Every point P on $\ell$ apart from $T_1$, but on the same side of X as $T_1$, lies on the same side of
AB and outside the circle passing through $T_1$. Therefore, for every point P on $\ell$ on the same
side of X as $T_1$, we have the inequality
\[ \angle APB \le \angle AT_1B, \]
with equality if and only if $P = T_1$. Similarly, for every point P on $\ell$ on the same side of X
as $T_2$, we have the inequality
\[ \angle APB \le \angle AT_2B, \]
with equality if and only if $P = T_2$.
It is possible that $\angle AT_1B = \angle AT_2B$, in which case both points $T_1$ and $T_2$ are solutions to
the problem. If $r_1$ and $r_2$ are the radii of circles $ABT_1$ and $ABT_2$, respectively, then by the
extended sine rule we have
\[ 2r_1 = \frac{AB}{\sin \angle AT_1B} = \frac{AB}{\sin \angle AT_2B} = 2r_2. \]
12.8 Isoperimetric inequalities 221
This implies that the two circles have equal radius and so AB is perpendicular to $\ell$. Conversely,
if AB is perpendicular to $\ell$, then by symmetry both $T_1$ and $T_2$ are solutions to the problem.
On the other hand, if we assume without loss of generality that $\angle AT_2B < \angle AT_1B$, then for
every point P on $\ell$, we have the inequality $\angle APB \le \angle AT_1B$, with equality if and only if
$P = T_1$.
12.8 Isoperimetric inequalities
Among all plane figures of a given perimeter, the circle has the maximum area.
Equivalently, among all plane figures of a given area, the circle has the minimum
perimeter.
Among all n-gons of a given perimeter, the regular n-gon has the maximum area.
Equivalently, among all n-gons of a given area, the regular n-gon has the minimum
perimeter.
Problem Prove that among all polygons with given side lengths in a given order, the cyclic
one has the maximal area.
Solution Implicit in the statement of this problem is the fact that if someone gives you a
polygon, then you can always construct a cyclic polygon with the same side lengths in the
same order, and this polygon is unique. We won’t discuss a proof of this fact here, but you
should certainly think about why it is true.
First, if the polygon were not convex, then turning a concave part of the polygon inside out
would yield a polygon of greater area.4 Thus we may assume that the polygon is convex.
Now suppose that P is a convex polygon with given side lengths in a given order which cannot
be circumscribed by a circle. Let Q be the unique cyclic polygon with the same side lengths
in the same order. The circumcircle of Q is naturally divided into the polygon itself as well
as a number of segments. Now consider the shape obtained by gluing those segments onto
the corresponding sides of P .
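[Figure: the polygon P, the cyclic polygon Q inside its circumcircle, and P with the circular segments of Q glued onto its sides.]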
4 This is very sloppy, because the new polygon might cross itself. See if you can iron out the details of this.
This shape5 has the same perimeter as the circumcircle of Q. So, by the isoperimetric
inequality, its area is smaller than the area of the circumcircle of Q. However, after removing
the segments from each shape, we obtain the fact that the area of P is less than the area
of Q. So we can conclude that, among all convex polygons with given side lengths in a given
order, the cyclic one has the maximal area.
a = y + z, b = x + z, and c = x + y,
[Figure: triangle ABC with its incircle; the tangent lengths from A, B and C are x, y and z, respectively.]
Solution Using the incircle substitution, the inequality transforms to something which looks
much nicer, that is,
(y + z)(z + x)(x + y) ≥ 8xyz.
This follows from multiplying the three inequalities
$$y + z \ge 2\sqrt{yz}, \quad z + x \ge 2\sqrt{zx} \quad \text{and} \quad x + y \ge 2\sqrt{xy}$$
together. Each of these can be proved by using the AM–GM inequality or the fact that
squares are non-negative.
Problem Prove that R ≥ 2r, where R and r are respectively the circumradius and inradius
of a triangle.
Solution Perhaps surprisingly, one solution to this problem arises from considering the
area A of the triangle. In particular, we have the three formulas
$$A = \frac{abc}{4R}, \quad A = rs, \quad \text{and} \quad A = \sqrt{s(s-a)(s-b)(s-c)}.$$
Here, a, b and c are the side lengths of the triangle and $s = \frac{a+b+c}{2}$ denotes the semiperimeter.
If you’ve never seen these before, then now is definitely the time to try to prove them. The
third one is particularly interesting and is often referred to as Heron’s formula. From these,
we can deduce the equation
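$$\frac{abcrs}{4R} = s(s-a)(s-b)(s-c),$$
from which
$$\frac{r}{4R} = \frac{(s-a)(s-b)(s-c)}{abc}.$$
However, since $s - a = x$, $s - b = y$ and $s - c = z$ under the incircle substitution, the result of the previous problem implies that
$$abc \ge 8(s-a)(s-b)(s-c).$$
Thus, it follows that
$$\frac{r}{4R} \le \frac{1}{8},$$
which simplifies to give $R \ge 2r$.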
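As a numerical sanity check of the inequality just proved, the following sketch computes R and r from the side lengths using the area formulas above. The function names `radii` and `euler_holds`, and the idea of sampling random triangles through the incircle substitution, are illustrative assumptions rather than part of the original argument.

```python
import math
import random


def radii(a, b, c):
    """Return (R, r) for a triangle with side lengths a, b, c."""
    s = (a + b + c) / 2                                  # semiperimeter
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))    # Heron's formula
    return a * b * c / (4 * area), area / s              # from A = abc/(4R) and A = rs


def euler_holds(trials=1000, seed=0):
    """Check R >= 2r on random triangles with a = y+z, b = z+x, c = x+y."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (rng.uniform(0.1, 10.0) for _ in range(3))
        R, r = radii(y + z, z + x, x + y)
        if R < 2 * r - 1e-9:          # small tolerance for rounding error
            return False
    return True
```

A check like this proves nothing, but it is a quick way to catch a slip such as a misplaced factor of 8.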
This result is often known as Euler’s inequality because it follows immediately from the
following interesting result.
Euler’s theorem in geometry The distance d between the circumcentre and the incentre
of a triangle is given by
$$d = \sqrt{R^2 - 2Rr}.$$
You might like to prove this result by yourself, thereby providing an alternative proof of
Euler’s inequality. As a treat, we’ll finish with the most elegant proof of this chapter—another
proof of Euler’s inequality.
Solution We start with a simple observation: the smallest circle which touches or crosses
all three sides of a triangle is the incircle.
Recall that the midpoints of the sides of a triangle form what is known as the medial triangle.
It is always similar to the original triangle and exactly half the size. Therefore, the circumcircle
of the medial triangle6 has radius $\frac{R}{2}$.
Observe that the circumcircle of the medial triangle passes through the midpoints of all three
sides of the original triangle. So, from our earlier simple observation, the circumcircle of the
medial triangle is at least as large as the incircle.7
Hence, we may conclude that
$$\frac{R}{2} \ge r,$$
which can be equivalently written as R ≥ 2r.
6 This is also the nine-point circle.
7 In fact, as a carefully drawn diagram suggests, the incircle is always internally tangent to the nine-point
circle. But this is more difficult to prove.
13 Combinatorics
In a very loose sense, combinatorics is the area of mathematics concerned with counting. By
this, we don’t mean calling out the numbers 1, 2, 3, . . . or anything like that. Combinatorics
is rather about counting in a clever way, or counting without counting. Actually, this isn’t
a particularly good description of combinatorics either. Pretty much anything that doesn’t
quite fit into another mathematical category tends to be described as combinatorial. This
makes combinatorics a treasure trove of mathematical delights!
13.0 Problems
1. How many whole numbers from 1 to 2003 are not divisible by 2, 7 or 11?
2. A number is said to be ordered if each digit is greater than or equal to the digit on its
left.
How many three-digit numbers are ordered?
3. How many eight-digit phone numbers begin with the number 9, are even, and have the
third, fourth and fifth digits in decreasing order?
4. How many odd numbers are there in the 2003rd row of Pascal’s triangle?
5. The number 3 can be expressed as a sum of one or more positive integers (taking order
into account) in four ways:
3 = 1 + 2 = 2 + 1 = 1 + 1 + 1.
9. Prove that
$$\sum_{k=m}^{n} \binom{n}{k}\binom{k}{m} = 2^{n-m}\binom{n}{m}.$$
11. Prove in at least two different ways that if m, n are positive integers and 1 ≤ k ≤ n,
then
$$\binom{n}{0}\binom{m}{k} + \binom{n}{1}\binom{m}{k-1} + \cdots + \binom{n}{k}\binom{m}{0} = \binom{m+n}{k}.$$
S1 ⊆ S2 ⊆ · · · ⊆ Sk ⊆ {1, 2, . . . , n}.
14. In a circus, there are n clowns who dress and paint themselves up using a selection of
12 distinct colours. Each clown is required to use at least five different colours. One
day the ringmaster of the circus demands that no two clowns have exactly the same set
of colours, and no more than 20 clowns may use any one particular colour.
Find the largest number n so that the ringmaster’s demand can be met.
15. In a maths competition, 2n students take part. Each student submits a different problem
and all 2n problems are collected, shuffled, and then handed back to each participant.
The distribution is considered fair if there are n participants receiving the problems
proposed by the other n participants.
Prove that the number of ways in which the problems can be distributed in a fair way
is a perfect square.
16. An n × n matrix which has entries coming from the set S = {1, 2, . . . , 2n − 1} is called a
silver matrix if, for each i = 1, 2, . . . , n, the ith row and the ith column together contain
all the elements of S.
Prove the following.
17. An 8 × 8 square is divided into 64 unit squares, and must be covered by 64 black and 64
white isosceles right-angled triangles whose side lengths are 1, 1 and $\sqrt{2}$, respectively.
Each unit square must be covered by 2 triangles. A covering is said to be awesome if
any two triangles sharing a common side are of distinct colours.
How many different awesome coverings are there?
18. A chess master who has 11 weeks to prepare for a tournament decides to play at least
one game every day, but in order to conserve his energy he decides not to play more
than 12 games in any seven-day period.
Show there are consecutive days during which he plays exactly 21 games.
19. Let S = {0000000, 0000001, . . . , 1111111} be the set of all binary sequences of length 7.
The distance between two elements s1 , s2 ∈ S is the number of places in which s1 and
s2 differ.
Show that if T ⊂ S and |T | > 16, then T contains two elements whose distance is at
most 2.
20. A rectangular chessboard has 5 rows and 2008 columns. Each square is painted either
red or blue.
Determine the largest integer N which guarantees that, no matter how the board is
painted, there are two rows which have matching colours in at least N columns.
21. Let n > 0 be an integer. We are given a balance scale and n weights of masses
$2^0, 2^1, \ldots, 2^{n-1}$, respectively. We are to place each of the n weights on the balance scale,
one after another, in such a way that the right pan is never heavier than the left pan.
At each step we choose one of the weights that has not yet been placed on the balance
scale, and place it on either the left pan or the right pan, until all of the weights have
been placed.
Determine the number of ways in which this can be done.
22. Up until now the national library of the small city state of Sepharia has had n shelves,
each shelf holding at least one book. The library recently bought k new shelves. The
books will be rearranged, and the librarian has announced that each of the now n + k
shelves will hold at least one book. Call a book privileged if the shelf on which it
will stand in the new arrangement is to hold fewer books than the shelf where it was
previously located.
Prove there are at least k + 1 privileged books in the national library of Sepharia.
23. Let n and k be positive integers with k ≥ n and k − n an even number. Let 2n lamps
labelled 1, 2, . . . , 2n be given, each of which can be either on or off. Initially all the
lamps are off.
We consider sequences of steps: at each step one of the lamps is switched from on to off
or from off to on.
Let N be the number of such sequences consisting of k steps and resulting in the state
where lamps 1 through n are all on, and lamps n + 1 through 2n are all off.
Let M be the number of such sequences consisting of k steps resulting in the state
where lamps 1 through n are all on, and lamps n + 1 through 2n are all off, but where
none of the lamps n + 1 through 2n is ever switched on.
Determine the ratio $\frac{N}{M}$.
24. A group of 21 girls and 21 boys took part in a mathematical contest. Suppose that
(i) each contestant solved at most six problems, and
(ii) for each girl and each boy, at least one problem was solved by both of them.
Prove that there was a problem that was solved by at least three girls and at least three
boys.
Solution There are n choices of position for the first person to stand in the line. After the
first position has been chosen, there are n − 1 positions remaining for the second person. And
after the second position has been chosen, there are n − 2 positions remaining for the third
person, and so on.
Therefore, the number of ways to choose the order in which n people stand in a line is
n × (n − 1) × (n − 2) × · · · × 2 × 1.
You probably already know that this number is written as n! and is called ‘n factorial’.
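The multiplication argument above can be checked by listing the orderings explicitly for small n. This is only a sketch; the helper name `count_orderings` is an illustrative assumption.

```python
from itertools import permutations
from math import factorial


def count_orderings(n):
    """Count the ways n people can stand in a line by listing every ordering."""
    return sum(1 for _ in permutations(range(n)))


# Compare the explicit count with n x (n - 1) x ... x 2 x 1 for small n.
for n in range(1, 7):
    assert count_orderings(n) == factorial(n)
```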
Solution There are 8 choices for the first digit, 8 choices for the second digit and 8 choices
for the third digit, that is, any number from 1 to 8.
Hence, there are $8^3 = 512$ such numbers.
Problem Each letter in Morse code is a sequence of at most four dots and dashes. How
many letters are possible?
Solution There are $2^1 = 2$ letters with 1 signal, $2^2 = 4$ letters with 2 signals, $2^3 = 8$ letters
with 3 signals, and $2^4 = 16$ letters with 4 signals.
Hence, the answer is 2 + 4 + 8 + 16 = 30.
So Morse code only just copes with the 26 letters of the English alphabet!
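The count of 30 can be confirmed by generating every sequence of at most four dots and dashes. The function name `morse_letters` below is a hypothetical helper introduced for illustration.

```python
from itertools import product


def morse_letters(max_len=4):
    """List every sequence of dots and dashes with 1 to max_len signals."""
    return ["".join(signals)
            for length in range(1, max_len + 1)
            for signals in product(".-", repeat=length)]


letters = morse_letters()
# Group the sequences by length to mirror the sum 2 + 4 + 8 + 16.
by_length = [sum(1 for s in letters if len(s) == k) for k in range(1, 5)]
```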
Problem How many five-digit positive integers have at least two digits the same?
13.3 Binomial identities 229
Problem In how many ways can you choose a four-flavour combination from 10 ice-cream
flavours?
Solution Well, if you actually cared about the order in which you chose them, then the
answer would simply be 10×9×8×7. But since the order doesn’t matter, we have overcounted
each combination by a factor of 4 × 3 × 2 × 1—the number of ways of reordering the four
flavours that we have chosen.
Therefore, the answer is
$$\frac{10 \times 9 \times 8 \times 7}{4 \times 3 \times 2 \times 1} = 210.$$
This can be conveniently expressed using factorial notation as
$$\binom{10}{4} = \frac{10!}{4!\,6!}.$$
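The overcounting argument can be seen directly by counting ordered and unordered selections side by side. This is a small sketch in which the flavours are simply labelled 0 to 9, a labelling chosen here for convenience.

```python
from itertools import combinations, permutations
from math import comb

flavours = range(10)

# Ordered choices of four flavours: 10 x 9 x 8 x 7 of them.
ordered = sum(1 for _ in permutations(flavours, 4))

# Unordered choices: each one appears 4 x 3 x 2 x 1 times among the ordered ones.
unordered = sum(1 for _ in combinations(flavours, 4))
```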
Problem In how many ways is it possible to arrange the letters of the word recurrence?
Solution There are 10! ways to rearrange the letters, but again we have overcounted. The
three occurrences of the letter r can be rearranged in 3! ways among themselves. In general,
if a letter occurs k times, then there are k! ways of rearranging the occurrences of that letter
among themselves. Since the letters r and e each occur three times, the letter c twice and the
letters u and n once, the answer is
$$\binom{10}{3, 3, 2, 1, 1} = \frac{10!}{3!\,3!\,2!\,1!\,1!} = 50400.$$
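The same division by repeated-letter factorials works for any word. The sketch below applies it to an arbitrary string and cross-checks it by brute force on a shorter word; the helper name `arrangements` and the test word "error" are illustrative choices.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def arrangements(word):
    """Number of distinct rearrangements of the letters of word."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)      # undo the overcounting for each repeated letter
    return total


# Brute-force cross-check on a short word with a repeated letter.
distinct_error = len(set(permutations("error")))
```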
Symmetry
$$\binom{n}{k} = \binom{n}{n-k}$$
Binomial theorem
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$$
Addition formula
$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
In-and-out formula
$$n\binom{n-1}{k-1} = k\binom{n}{k}$$
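Before proving identities like these, it can help to test them numerically. The sketch below checks the symmetry, addition and in-and-out formulas for small n and k, and the binomial theorem at the sample values x = 2 and y = 3; the function name and the choice of sample values are assumptions made for illustration.

```python
from math import comb


def check_identities(n_max=12):
    """Verify the four identities for 1 <= k <= n <= n_max."""
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            assert comb(n, k) == comb(n, n - k)                      # symmetry
            assert comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k) # addition formula
            assert n * comb(n - 1, k - 1) == k * comb(n, k)          # in-and-out formula
        # Binomial theorem with x = 2, y = 3, so (x + y)^n = 5^n.
        assert sum(comb(n, k) * 2**k * 3**(n - k) for k in range(n + 1)) == 5**n
    return True
```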
Solution We provide two very distinct proofs of this fact. In the next section we provide
a third proof! This is a rather common feature of binomial identities and it pays to be
comfortable with the various types of proof.
Solution
Method 1: Algebra
We use the in-and-out formula to write $k\binom{n}{k} = n\binom{n-1}{k-1}$. Then the left-hand side becomes
$$n\binom{n-1}{0} + n\binom{n-1}{1} + \cdots + n\binom{n-1}{n-1} = n2^{n-1},$$
where $\sum_{k=0}^{n-1} \binom{n-1}{k} = (1 + 1)^{n-1}$ is a consequence of the binomial theorem.
Method 2: Calculus
Substituting y = 1 into the binomial formula, we obtain
$$(x + 1)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \binom{n}{3}x^3 + \cdots + \binom{n}{n}x^n.$$
13.4 Bijections
Another way to prove combinatorial identities is to look for combinatorial interpretations for
both sides of the identity. This is really a good technique and a useful skill to have. Here we
discuss two problems we considered earlier but present bijective solutions.
Solution The identity is equivalent to the fact that, for every positive integer n,
$$\sum_{k \text{ odd}} \binom{n}{k} = \sum_{k \text{ even}} \binom{n}{k}.$$
If S is a set with n elements, the left-hand side counts the number of subsets of S with an
odd number of elements while the right-hand side counts the number of subsets of S with an
even number of elements.
Now that we have a combinatorial interpretation for both sides of the equation, it makes sense
to look for a one-to-one correspondence—a bijection to use more precise terminology—between
the objects described by the left-hand side and the objects described by the right-hand side.
In other words, we want to find a rule for turning a subset of S with an odd number of
elements into a subset of S with an even number of elements, and vice versa.
We start by picking some fixed s ∈ S. The rule is to add the element s to your set if it doesn’t
already contain it, and to remove the element s from your set if it does already contain it.
Since we are either adding or removing a single element, this rule certainly turns a subset
of S with an odd number of elements into a subset of S with an even number of elements,
and vice versa. It is a bijection because the rule can be reversed. In fact, the rule is its own
inverse!1 This completes our combinatorial proof.
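The toggling rule is easy to test on a small set. In the sketch below, `subsets` and `toggle` are hypothetical helper names, and the set {1, 2, 3, 4, 5} with s = 1 is just a sample choice.

```python
from itertools import combinations


def subsets(S):
    """All subsets of S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def toggle(subset, s):
    """Add s if it is missing, remove it if it is present."""
    return subset ^ {s}


S = {1, 2, 3, 4, 5}
s = 1
odd = [x for x in subsets(S) if len(x) % 2 == 1]
even = [x for x in subsets(S) if len(x) % 2 == 0]
images = {toggle(x, s) for x in odd}
```

Here `images` is exactly the collection of even-sized subsets, and applying `toggle` twice returns the original subset, which is what makes the rule a bijection.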
Solution The idea is to find a set of objects which can be counted in two ways—one of which
produces the left-hand side while the other produces the right-hand side. In this particular
case, the form of the left-hand side gives a strong hint as to what that set might be.
Each term of the left-hand side is of the form $k\binom{n}{k}$, which suggests that, from a set of n people,
we would like to choose a committee of k people. This can be done in $\binom{n}{k}$ ways. Furthermore,
we would like to choose a president from this committee. This can be done in k ways. As k
ranges from 1 up to n, we are choosing committees of every possible size. So what we have
shown is that the left-hand side counts the number of ways to choose a committee, with one
person designated the president, from a set of n people.
Of course, all that remains is to show that the number of ways to choose a committee, with one
person designated the president, from a set of n people also happens to be $n2^{n-1}$. Whereas
we earlier counted by choosing the committee first and then the president, we will now count
by choosing the president first and then the remainder of the committee. But this is easy,
because there are n choices for the president and for each of the remaining n − 1 people, there
are two choices. Each person is either in the committee or is not. Thus the number of ways of
choosing a committee with president is $n2^{n-1}$. This completes our combinatorial proof.
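The two ways of counting committees with a president can be compared by listing them for small n. This is a sketch only; the function name `committees_with_president` is an illustrative assumption.

```python
from itertools import combinations
from math import comb


def committees_with_president(n):
    """List every (president, committee) pair drawn from n people."""
    people = range(n)
    return [(president, committee)
            for k in range(1, n + 1)                     # committee size
            for committee in combinations(people, k)     # committee first ...
            for president in committee]                  # ... then the president


def both_counts_agree(n):
    """Compare the listing with the sum of k*C(n, k) and with n * 2^(n-1)."""
    listed = len(committees_with_president(n))
    return listed == sum(k * comb(n, k) for k in range(n + 1)) == n * 2**(n - 1)
```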
Problem The supermarket has an unlimited supply of apple tarts, chocolate muffins and
cheesecakes.
How many ways are there to buy seven cakes from the supermarket?
Solution You buy your seven cakes and take them to the checkout, where you put them
on the conveyor belt. Just to make things easy, you place all the apple tarts together first,
then the chocolate muffins, and then the cheesecakes. To make things really easy for the
stressed-out checkout attendant, you place some of those conveyor belt dividers between each
type of cake.
Now there will be nine objects on the conveyor belt: seven cakes and two dividers. (If you
buy no apple tarts, for example, the first object will be a divider.) Choosing which of these
nine objects are the dividers uniquely determines what cakes you have bought. So the answer
is $\binom{9}{2} = 36$.
1 As we saw in section 10.7, a rule or a function which is its own inverse is called an involution.
13.6 Pigeonhole principle 233
You should convince yourself that this is actually correct. That is, for each way of buying
the prescribed cakes, there is a way of placing the dividers; and conversely, for each way of
placing the dividers, there is a way of buying cakes so that the dividers are placed there. Both
of these are quite obvious once you understand what is going on, but often this is difficult to
express!
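One way to convince yourself is to list the purchases directly and compare with the divider count. The sketch below does this; the function name `count_purchases` is a hypothetical helper.

```python
from itertools import combinations_with_replacement
from math import comb


def count_purchases(types, total):
    """Count the ways to buy `total` cakes when each type is in unlimited supply."""
    return sum(1 for _ in combinations_with_replacement(types, total))


cakes = ["apple tart", "chocolate muffin", "cheesecake"]
purchases = count_purchases(cakes, 7)

# Seven cakes and two dividers give nine objects, two of which are dividers.
dividers = comb(7 + len(cakes) - 1, len(cakes) - 1)
```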
Now for a less supermarket-oriented and slightly more subtle example. There are certainly
other solutions to it, but perhaps this is the most elegant.
Problem Find the number of three-digit numbers whose digit sum is 10.
Solution Represent a number by a collection of ‘units’ (1s) and ‘dividers’ (∆s). For example,
325 is represented by
1 1 1 ∆ 1 1 ∆ 1 1 1 1 1.
Now every such three-digit number corresponds to a way of placing two dividers among ten
1s. However the correspondence is not bijective! This is because the ‘conveyor belt’
∆1∆111111111
would correspond to the number 019, which is not a three-digit number. You can’t have a
divider in the first place. Nor can all ten 1s come before both dividers, since the first ‘digit’ would
then be 10. Apart from these exclusions, the correspondence is bijective.
(You should make sure of this.)
Ignoring the initial 1, we count the number of ways of placing two dividers amongst the
remaining nine 1s. There are 11 objects in total and so the number of ways of nominating
two of them to be dividers is $\binom{11}{2} = 55$. Exactly one of these arrangements puts all of the 1s before both dividers, so the answer is 55 − 1 = 54.
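A direct count is a useful check on a divider argument like this one. The sketch below lists every belt that starts with a 1, reads off the three block sizes, and compares the result with a brute-force search over 100 to 999. It finds 55 belts but only 54 valid numbers, because the belt whose leading block contains all ten 1s does not give a digit. The helper name `digits_from_belt` is an illustrative assumption.

```python
from itertools import combinations
from math import comb


def digits_from_belt(divider_positions, length=12):
    """Read off the three block sizes of a belt of ten 1s and two dividers."""
    first, second = divider_positions
    return first, second - first - 1, length - second - 1


belts = list(combinations(range(1, 12), 2))   # position 0 is reserved for a 1
valid = []
for positions in belts:
    blocks = digits_from_belt(positions)
    if max(blocks) <= 9:                      # every block must be a single digit
        valid.append(100 * blocks[0] + 10 * blocks[1] + blocks[2])

brute_force = [n for n in range(100, 1000) if sum(map(int, str(n))) == 10]
```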
Problem Given 27 distinct odd positive integers less than 100, prove there is a pair of them
whose sum is 102.
Solution Partition the odd positive integers less than 100 into the 26 sets
{3, 99}, {5, 97}, . . . , {49, 53}, {1}, {51}.
Note that each of the first 24 sets consists of two numbers whose sum is 102, while each of the
last two sets consists of a single number. Furthermore, the 26 sets form a partition of all the odd
positive integers less than 100. So given 27 numbers (pigeons), some set (pigeonhole) has two
different numbers (pigeons) chosen (living) from (in) it! Those two numbers add to 102.
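The pigeonholes can be built mechanically, which is a good way to check how many there are. The sketch below pairs each odd number below 100 with its partner summing to 102 whenever that partner is also available. For the bound of 100 stated in the problem it finds 26 pigeonholes: 24 two-element sets together with the singletons {1} and {51}. The function name and its parameters are illustrative assumptions.

```python
def pigeonholes(limit=100, target=102):
    """Partition the odd positive integers below limit into sets summing to target."""
    odds = list(range(1, limit, 2))
    holes, seen = [], set()
    for k in odds:
        if k in seen:
            continue
        partner = target - k
        if partner != k and partner in odds:
            holes.append({k, partner})       # two numbers whose sum is the target
            seen.update({k, partner})
        else:
            holes.append({k})                # no distinct partner below the limit
            seen.add(k)
    return holes


holes = pigeonholes()
pairs = [h for h in holes if len(h) == 2]
singles = [h for h in holes if len(h) == 1]
```

Since there are 26 pigeonholes, any 27 of the numbers must include both members of some two-element set.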
Solution Recall that a positive integer is a perfect square if and only if the exponent of
every prime in its prime factorisation is even. So given a product of powers of these primes
$n = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, define the parity pattern of n to be $(\alpha_1, \ldots, \alpha_r)$, where the $\alpha_i$ are considered
modulo 2.
Clearly when you multiply two numbers, you add their parity patterns term by term
modulo 2, and if you can divide, then you subtract their parity patterns modulo 2. We just
need to show there is a non-empty subset of our r + 1 integers whose product has parity
pattern (0, . . . , 0).
So, consider all $2^{r+1}$ subsets of our r + 1 integers, and consider their parity patterns. But
there are only $2^r$ possible parity patterns. By the pigeonhole principle there are two distinct
subsets X and Y of $\{a_1, \ldots, a_{r+1}\}$ whose products have the same parity pattern v.
If X and Y were disjoint, then the product of the elements of X ∪ Y would have parity
pattern 2v = (0, . . . , 0), as required.
If X and Y overlap, let D = X ∩ Y. We claim that the product of the elements of (X ∪ Y) \ D
is a perfect square. Indeed, consider the equality
$$\prod_{x \in X} x \prod_{y \in Y} y = \prod_{x \in X \setminus D} x \prod_{y \in Y \setminus D} y \prod_{d \in D} d^2.$$
The LHS has parity pattern (0, . . . , 0) because $\prod_{x \in X} x$ and $\prod_{y \in Y} y$ have the same parity pattern.
Also $\prod_{d \in D} d^2$ has parity pattern (0, . . . , 0). It follows that
$$\prod_{x \in X \setminus D} x \prod_{y \in Y \setminus D} y$$
also has parity pattern (0, . . . , 0) and so the product of the elements of (X ∪ Y) \ D is a perfect
square, as required.
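The argument is constructive enough to run. The sketch below records the parity pattern of each subset product, stops at the first repeated pattern, and returns the elements lying in exactly one of the two colliding subsets. The function names, and the sample input of four numbers built from the primes 2, 3 and 5, are assumptions made for illustration.

```python
from itertools import combinations
from math import isqrt, prod


def parity_pattern(n, primes):
    """Exponents of the given primes in n, each reduced modulo 2."""
    pattern = []
    for p in primes:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        pattern.append(exponent % 2)
    return tuple(pattern)


def square_subset(numbers, primes):
    """Find a non-empty sub-collection of numbers whose product is a perfect square."""
    seen = {}                                  # parity pattern -> subset of indices
    for size in range(len(numbers) + 1):
        for subset in combinations(range(len(numbers)), size):
            pattern = parity_pattern(prod(numbers[i] for i in subset), primes)
            if pattern in seen:
                # Keep the indices lying in exactly one of the two subsets.
                chosen = set(subset) ^ set(seen[pattern])
                return [numbers[i] for i in sorted(chosen)]
            seen[pattern] = subset
    return None


example = square_subset([6, 10, 15, 12], [2, 3, 5])
```

Subsets are tracked by index rather than by value, so repeated numbers in the input are handled as distinct elements.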
|X ∪ Y | = |X| + |Y | − |X ∩ Y |
|X ∪ Y ∪ Z| = |X| + |Y | + |Z| − |X ∩ Y | − |Y ∩ Z| − |Z ∩ X| + |X ∩ Y ∩ Z|
2 Why not draw the corresponding Venn diagrams to see what is going on?
13.8 Double counting 235
Principle of inclusion–exclusion
$$|X_1 \cup \cdots \cup X_n| = \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, n\}} (-1)^{k-1} |X_{i_1} \cap \cdots \cap X_{i_k}|.$$
(S \ X1 ) ∪ · · · ∪ (S \ Xn ) = S \ (X1 ∩ · · · ∩ Xn )
and
(S \ X1 ) ∩ · · · ∩ (S \ Xn ) = S \ (X1 ∪ · · · ∪ Xn ),
which yield
$$|X_1 \cap \cdots \cap X_n| = \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, n\}} (-1)^{k-1} |X_{i_1} \cup \cdots \cup X_{i_k}|.$$
Solution This problem really involves permutations, which we consider to be bijections from
the set {1, 2, . . . , n} to itself or equivalently, arrangements of the numbers 1, 2, . . . , n. What
we are seeking is the number Dn of derangements of the set {1, 2, . . . , n}. A derangement is a
permutation of a set which leaves no element fixed.
For i = 1, 2, . . . , n, let $X_i$ denote the set of permutations which fix the number i. The number
of elements in $X_i$ is easy to count and it is simply (n − 1)!. Furthermore, we know that
$X_{i_1} \cap X_{i_2} \cap \cdots \cap X_{i_k}$ is simply the set of permutations which fix the number $i_1$, fix the number $i_2$, and so
on, up to the number $i_k$. This is a set which is also easy to count and its number of elements
is simply (n − k)!. Note that the total number of ways of choosing k of the $X_i$ is equal to $\binom{n}{k}$.
$$\begin{aligned}
|X_1 \cup \cdots \cup X_n| &= \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, n\}} (-1)^{k-1} |X_{i_1} \cap \cdots \cap X_{i_k}| \\
\Rightarrow \quad n! - D_n &= \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} (n-k)! \\
\Rightarrow \quad D_n &= n! - \sum_{k=1}^{n} (-1)^{k-1} \frac{n!}{k!} \\
&= n! \left( \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!} \right).
\end{aligned}$$
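The closed form can be compared with a brute-force count of permutations without fixed points. This is a minimal sketch; both function names are illustrative assumptions.

```python
from itertools import permutations
from math import factorial


def derangements_formula(n):
    """Evaluate n! (1/0! - 1/1! + 1/2! - ... + (-1)^n / n!) in exact integers."""
    return sum((-1) ** k * factorial(n) // factorial(k) for k in range(n + 1))


def derangements_brute_force(n):
    """Count the permutations of 0, ..., n-1 that leave no element fixed."""
    return sum(1 for perm in permutations(range(n))
               if all(perm[i] != i for i in range(n)))
```

Each term n!/k! is an integer, so the integer division in the formula is exact.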
Problem At a party each person knew exactly 22 others. For any pair of people X and Y
who knew one another, there was no other person at the party whom they both knew. For
any pair of people X and Y , who did not know one another, there were exactly 6 other people
whom they both knew.
How many people were at the party?
Solution This problem has an obvious graph theory interpretation, where vertices represent
people and edges represent a mutual acquaintanceship between two persons. Define a vee to
be a triple of persons in which exactly two of the three pairs know each
other. We count the number of vees in two different ways.3
Suppose there are n people at the party. Concentrating on vertices we see that each vertex
contributes $\binom{22}{2} = 231$ vees, since each vertex has 22 edges emanating from it and no two
of its neighbours know each other. Thus the total
number of vees is 231n.
On the other hand adding the degrees of each vertex gives 22n, but this overcounts the
number of edges by a factor of two. Therefore, the total number of edges is 11n. This means
that the total number of pairs of vertices not connected by an edge is
$$\binom{n}{2} - 11n.$$
Each such non-edge makes a vee with 6 other vertices. Thus the total number of vees is
$$6\left(\binom{n}{2} - 11n\right).$$
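Equating the two counts of vees gives an equation in n. The sketch below simply searches for the party sizes that satisfy it; the function name and the search limit are illustrative assumptions, and the search starts at 23 because each person knows 22 others.

```python
from math import comb


def party_sizes(limit=1000):
    """Party sizes n for which 231n equals 6(C(n, 2) - 11n)."""
    return [n for n in range(23, limit)
            if 231 * n == 6 * (comb(n, 2) - 11 * n)]
```

Within this range the only solution is n = 100, which agrees with solving 231 = 3(n − 1) − 66 by hand.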
Problem Let $p_n(k)$ be the number of permutations of the set {1, 2, . . . , n} which have
exactly k fixed points.
Prove that
$$\sum_{k=0}^{n} k\,p_n(k) = n!.$$
Solution We find a combinatorial interpretation for the sum. We are counting each per-
mutation, but counting each with multiplicity equal to the number of its fixed points. So
permutations without any fixed points, that is, derangements, are not counted at all; permu-
tations with one fixed point are counted once; those with two fixed points are counted twice;
and so on. Consider a permutation as the numbers 1 to n, written in some order.
For each permutation with one or more fixed points, colour in one of the fixed points as
our ‘favourite’. Then the set of ‘permutations with favourite fixed points’ has precisely
$\sum_{k=0}^{n} k\,p_n(k)$ elements. This is a combinatorial interpretation for the sum.
For example, here are the ‘permutations with favourite fixed points’ for the case n = 3.
How many of these objects are there? Well, with 1 as the favourite fixed point, there are
(n − 1)! permutations—one for each permutation of the other objects. Similarly, with 2 as the
favourite fixed point, there are also (n − 1)! permutations. There are (n − 1)! permutations
for each individual favourite fixed point.
This gives n × (n − 1)! = n! of them overall.4
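The identity can be checked by tabulating $p_n(k)$ for small n. This is a sketch only; the helper names are illustrative assumptions, and the permutations act on 0, . . . , n − 1 rather than 1, . . . , n for convenience.

```python
from itertools import permutations
from math import factorial


def fixed_point_counts(n):
    """Return [p_n(0), p_n(1), ..., p_n(n)] by listing all permutations."""
    counts = [0] * (n + 1)
    for perm in permutations(range(n)):
        fixed = sum(1 for i, x in enumerate(perm) if i == x)
        counts[fixed] += 1
    return counts


def weighted_sum(n):
    """The sum of k * p_n(k) over k = 0, ..., n."""
    return sum(k * count for k, count in enumerate(fixed_point_counts(n)))
```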
13.9 Injections
Problem A permutation $(x_1, x_2, \ldots, x_{2n})$ of the set {1, 2, . . . , 2n}, where n is a positive
integer, is said to be good if
$$|x_i - x_{i+1}| = n$$
for at least one i in {1, 2, . . . , 2n − 1}, and is said to be bad otherwise.
Show that, for each n, there are more good permutations than bad permutations.
Solution Since we want to show there are more of one thing than another, we don’t construct
a bijection, but an injection! We find a map from bad permutations to good permutations
which is injective but not surjective: this will show there are more good permutations.
First think about what a good permutation means. For given x ∈ {1, . . . , 2n}, there is only
one y in the set for which |x − y| = n. So a permutation is good if and only if we see at least
one of the pairs
(1, n + 1), (2, n + 2), . . . , (n, 2n)
occurring as adjacent numbers in the permutation (in any order). A bad permutation is one
where we see none of these pairs of numbers adjacent. In what follows, it will be convenient
to define the notation i ∗ n as follows.
$$i * n = \begin{cases} i + n & \text{if } 1 \le i \le n \\ i - n & \text{if } n + 1 \le i \le 2n \end{cases}$$
Now take any bad permutation. Here is how to make a good one out of it. Suppose $x_1 = i$.
Then i ∗ n can’t occur as $x_2$, but must occur later on. Thus the permutation looks like
(i, A, i ∗ n, B),
where A is a non-empty sequence, and B is another (possibly empty) sequence, neither of
which contains a pair of the form (j, j ∗ n). We make our permutation good by putting i and
i ∗ n together to make
(A, i, i ∗ n, B).
Thus we have a map φ from the set of bad permutations to the set of good permutations
given by
φ(i, A, i ∗ n, B) = (A, i, i ∗ n, B).
We first show our map φ is not surjective. Any good permutation in the image of φ is of the
form (A, i, i ∗ n, B), where A is non-empty and the only adjacent pair of the form (j, j ∗ n)
occurs if j = i. Clearly there are more good permutations than these. For example, any
permutation of the form (1, n + 1, . . .) is a good permutation that is not in the image of φ.
Now we show φ is injective. Suppose that $\phi(i_1, A_1, i_1 * n, B_1) = \phi(i_2, A_2, i_2 * n, B_2)$. Then
from the definition of φ we have
$$(A_1, i_1, i_1 * n, B_1) = (A_2, i_2, i_2 * n, B_2).$$
4 There are many other solutions to this problem. See if you can find any others.
However, from our earlier observation the only adjacent pair of the form (j, j ∗ n) on the LHS
occurs at $j = i_1$. Similarly, the only adjacent pair of the form (j, j ∗ n) on the RHS occurs at
$j = i_2$. This allows us to deduce that $i_1 = i_2$. Then it follows that $A_1 = A_2$ and $B_1 = B_2$.
Therefore, φ is injective.
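The map φ is simple enough to test exhaustively for small n. In the sketch below, `partner` plays the role of i ∗ n and permutations are stored as tuples; these names and the representation are illustrative assumptions.

```python
from itertools import permutations


def partner(i, n):
    """The value i * n from the text: the unique y with |i - y| = n."""
    return i + n if i <= n else i - n


def is_good(perm, n):
    """A permutation is good if some adjacent pair differs by exactly n."""
    return any(abs(a - b) == n for a, b in zip(perm, perm[1:]))


def phi(perm, n):
    """Send the bad permutation (i, A, i*n, B) to (A, i, i*n, B)."""
    i = perm[0]
    j = perm.index(partner(i, n))
    return perm[1:j] + (i,) + perm[j:]


def check(n):
    """Return (number of bad, number of good, whether phi is an injection into good)."""
    everything = list(permutations(range(1, 2 * n + 1)))
    bad = [p for p in everything if not is_good(p, n)]
    good = [p for p in everything if is_good(p, n)]
    images = {phi(p, n) for p in bad}
    injective_into_good = len(images) == len(bad) and all(is_good(q, n) for q in images)
    return len(bad), len(good), injective_into_good
```

For each small n, `check(n)` reports fewer bad permutations than good ones, as the injection argument predicts.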
13.10 Recursion
As a first example of the technique of recursion, we solve the problem about the absent-minded
postman that we saw in section 13.7.
Solution Suppose that $D_n$ is the number of derangements of n objects. The trick here is to
relate $D_n$ to some of the previous values $D_1, D_2, \ldots, D_{n-1}$.
Unfortunately, we have not accounted for every possible derangement of n letters. The
ones we are missing are precisely those where letter n is delivered to address j and
letter j is delivered to address n for some j ∈ {1, 2, . . . , n − 1}. But in this case, notice
that the remaining letters and addresses form a derangement on n − 2 letters. Therefore,
for each j there are $D_{n-2}$ ways in which this can happen. But since we can take any
j ∈ {1, 2, . . . , n − 1} there are
$$(n - 1)D_{n-2}$$
cases in total, where the letter n has swapped addresses with another letter.
$$D_n = (n - 1)(D_{n-1} + D_{n-2}).$$
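The recurrence gives a quick way to tabulate derangement numbers. The sketch below assumes the starting values $D_0 = 1$ and $D_1 = 0$ (the value of $D_0$ is a convention adopted here, not something stated in the text) and also records the differences $D_n - nD_{n-1}$, which alternate between −1 and 1.

```python
def derangements(n_max):
    """Return [D_0, D_1, ..., D_n_max] using D_n = (n - 1)(D_{n-1} + D_{n-2})."""
    D = [1, 0]                       # assumed starting values D_0 = 1, D_1 = 0
    for n in range(2, n_max + 1):
        D.append((n - 1) * (D[n - 1] + D[n - 2]))
    return D[:n_max + 1]


D = derangements(10)
# The differences D_n - n * D_{n-1} for n = 1, ..., 10.
E = [D[n] - n * D[n - 1] for n in range(1, 11)]
```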
Letting
$$E_n = D_n - nD_{n-1},$$
Since this is true for all integers n ≥ 1 we also have the same equation for n replaced
successively by n − 1, n − 2, . . . , 1. Upon summing these equations, all the middle terms on
the left-hand side cancel out leaving us with
Problem In town A there are n girls and n boys such that each girl knows each boy. In
town B there are n girls and 2n − 1 boys such that girl k knows boys 1, 2, 3, . . . , 2k − 1, and
only these boys. Let A(n, r) denote the number of different ways in which r girls from town A
can dance with r boys from town A, forming r pairs where the girl knows the boy. Similarly,
let B(n, r) denote the number of different ways in which r girls from town B can dance with
r boys from town B, forming r pairs where the girl knows the boy.
Prove that A(n, r) = B(n, r), for r = 1, 2, . . . , n.
Solution We can calculate the number A(n, r) very easily. The number of ways of choosing
r girls is $\binom{n}{r}$ and the number of ways of choosing r boys is $\binom{n}{r}$. The number of ways of pairing
them up is r!. Therefore, we have
$$A(n, r) = \binom{n}{r}^2 r! = \frac{n!^2}{(n-r)!^2\, r!}.$$
To show that
$$B(n, r) = \frac{n!^2}{(n-r)!^2\, r!}$$
directly is a very difficult task indeed. However, what we can do is relate B(n, r) to the values
B(n − 1, r − 1) and B(n − 1, r) in the following way.
We establish a recurrence relation for B(n, r). Let n ≥ 2 and 2 ≤ r ≤ n. There are two cases
for a desired selection of r pairs of girls and boys.
B(n, r) = 0, if r > n
B(n, 1) = 1 + 3 + 5 + · · · + (2n − 1) = $n^2$.
It is directly verified that the numbers A(n, r) satisfy the same initial conditions and recurrence
relations, from which it follows that A(n, r) = B(n, r) for all n and r ≤ n. We leave it to the
reader to do the algebra that verifies this.
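For small n, the claim A(n, r) = B(n, r) can also be checked by brute force. The sketch below counts the pairings in town B directly from the acquaintance rule (girl k knows boys 1, . . . , 2k − 1) and compares them with the formula for A(n, r). The function names are illustrative assumptions, and the search is only practical for small n.

```python
from itertools import combinations
from math import comb, factorial


def count_pairings(girls, knows, used=frozenset()):
    """Ways to give each listed girl a distinct boy she knows."""
    if not girls:
        return 1
    first, rest = girls[0], girls[1:]
    return sum(count_pairings(rest, knows, used | {boy})
               for boy in knows[first] if boy not in used)


def A(n, r):
    """Town A: choose r girls, choose r boys, then pair them up."""
    return comb(n, r) ** 2 * factorial(r)


def B(n, r):
    """Town B: girl k knows boys 1, ..., 2k - 1."""
    knows = {k: range(1, 2 * k) for k in range(1, n + 1)}
    return sum(count_pairings(girls, knows)
               for girls in combinations(range(1, n + 1), r))
```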
J1 J2 J3 J4 J5 J6 J7
C1 0 1 0 0 0 0 1
C2 1 0 1 0 1 0 1
C3 1 1 1 0 0 0 0
C4 1 1 0 0 1 1 0
C5 0 1 0 1 0 1 1
Think vertically.
For each pair of judges, we know that they agree in at most k places. Since there are $\binom{b}{2}$
pairs of judges, we conclude that
$$N \le k\binom{b}{2}.$$
13.12 Combinatorial reciprocal principle 241
Think horizontally.
Suppose a contestant has received m passes and n fails, where m + n = b. The m judges
who awarded the contestant a pass contribute $\binom{m}{2}$ agreements, while the n judges who
awarded the contestant a fail contribute $\binom{n}{2}$ agreements.
Therefore, looking along a row, the number of agreements is
$$\binom{m}{2} + \binom{n}{2} = \frac{m(m-1)}{2} + \frac{n(n-1)}{2} = \frac{m^2 + n^2}{2} - \frac{b}{2}.$$
All that remains is to put these two pieces of information together. What we have shown is
that
$$a\left(\frac{b-1}{2}\right)^2 \le N \le k\binom{b}{2}.$$
It follows that
$$\frac{k}{a} \ge \frac{(b-1)^2}{4} \Big/ \binom{b}{2} = \frac{b-1}{2b}.$$
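The two ways of counting agreements can be tried out on the table of judges and contestants shown earlier. The sketch below counts N once by pairs of judges (vertically) and once by contestants (horizontally), and also checks the row bound $\left(\frac{b-1}{2}\right)^2$ under the assumption that b is odd, which the bound requires but which this excerpt does not state explicitly. All function names are illustrative.

```python
from itertools import combinations
from math import comb

# Rows are contestants C1..C5, columns are judges J1..J7 (1 = pass, 0 = fail).
table = ["0100001", "1010101", "1110000", "1100110", "0101011"]
b = len(table[0])


def agreements_by_pairs(rows):
    """Think vertically: for each pair of judges, count the contestants they agree on."""
    return sum(1 for i, j in combinations(range(len(rows[0])), 2)
               for row in rows if row[i] == row[j])


def agreements_by_rows(rows):
    """Think horizontally: m passes and n fails give C(m, 2) + C(n, 2) agreements."""
    total = 0
    for row in rows:
        m = row.count("1")
        total += comb(m, 2) + comb(len(row) - m, 2)
    return total


def min_row_agreements(judges):
    """Smallest possible number of agreements in a single row."""
    return min(comb(m, 2) + comb(judges - m, 2) for m in range(judges + 1))
```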
To help you understand what this is saying, write down some examples and then try to prove
the principle. Once it is well understood, this principle can solve some otherwise difficult, or
even apparently unapproachable, problems.
Problem Students from 13 different countries participated in the 491st International Math-
ematics Bonanza. Each student belonged to one of five different age groups.
Prove that there were at least nine participants in the Bonanza who had more fellow
participants in his or her age group than fellow participants from his or her own country.
Solution Given a student x, let $A_x$ denote the number of students (including x) in the same
age group as x, and let $C_x$ denote the number of students from the same country. So by the
combinatorial reciprocal principle,
$$\sum_x \frac{1}{A_x} = 5 \quad \text{and} \quad \sum_x \frac{1}{C_x} = 13.$$
Thus
$$\sum_x \left( \frac{1}{C_x} - \frac{1}{A_x} \right) = 8.$$
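The reciprocal sums used here are easy to experiment with: if every student contributes the reciprocal of the size of their own group, each group contributes exactly 1 in total. The sketch below uses exact fractions to avoid rounding; the function name and the sample data are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction


def reciprocal_sum(labels):
    """Sum, over all students, of 1 / (size of that student's group)."""
    sizes = Counter(labels)
    return sum(Fraction(1, sizes[label]) for label in labels)


# Six hypothetical students: their age groups and their countries.
ages = ["u14", "u14", "u16", "u16", "u16", "u18"]
countries = ["AU", "NZ", "AU", "FR", "FR", "FR"]
```

With this sample data, `reciprocal_sum(ages)` and `reciprocal_sum(countries)` both equal 3, the number of distinct age groups and of distinct countries respectively.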
[Figure: a graph with two components, a graph with multiple edges, and a pair of isomorphic graphs.]
For most of our purposes, we will assume that a graph is finite and has no loops (a loop is an
edge connecting a vertex with itself) or multiple edges (two or more edges connecting the
same pair of vertices). However, there are situations where considering loops and multiple
edges is useful.
244 14 Graph theory
14.0 Problems
1. Remembering that we view two graphs as being the same if they are isomorphic, how
many graphs are there with at most five vertices? How many of these are planar?¹
2. At a party with 100 people, in any set of four there is at least one person who is mutually
acquainted with the other three.
Given that there are three people who are mutually unacquainted with each other,
prove that the remaining 97 people must know everyone at the party.
3. In a country with n towns, there are some one-way roads connecting pairs of towns.
It’s known that for any two towns you can drive from one of them to the other one.
Prove that there is a town from which it’s possible to drive to any other town.
4. Show that if every vertex in a graph has degree k, then the graph contains a path of
length k. (See section 14.7 for the definition of a path.)
5. Consider a 3 × 3 × 3 cube made from 27 unit cubes. These smaller cubes are rooms
which have doors on each of their faces.
Is it possible to start in the central room and visit every other room exactly once
without ever leaving the large cube?
6. Consider a tennis tournament in which each of n people played against every other
person exactly once.
(a) Show that it is possible to label the players P1, P2, . . . , Pn in such a way that P1
beat P2, P2 beat P3, and so on to Pn−1 beat Pn.
(b) Suppose it is known that there is no loop of players Q1, Q2, . . . , Qk such that Q1
beat Q2, Q2 beat Q3, and so on up to Qk−1 beat Qk, and Qk beat Q1, for some k ≥ 3.
Prove that there is a unique way to rank the players so that each player beat
everyone below them.
7. A snozzberry has the shape of a convex polyhedron such that three faces meet at every
vertex, and each face is either a pentagon or a hexagon. Each pentagon is surrounded by
five hexagons while each hexagon is surrounded by three hexagons and three pentagons.
How many faces does a snozzberry have?
8. My wife and I were invited to a dinner party attended by four other couples, making
a total of 10 people. A certain amount of handshaking took place subject to two
conditions: no one shook his or her own hand and no couple shook hands with each
other. Afterwards, I became curious and asked everybody else at the party how many
people they shook hands with.
Given that I received nine different answers, how many hands did I shake?
9. An n-domino consists of two squares which share an edge, where each square is labelled
by a number from 1 up to n.
(a) Prove that there are $\binom{n+1}{2}$ dominoes.
(b) For which values of n is it possible to arrange the dominoes in a line so that
adjacent halves of neighbouring dominoes have the same label?
10. Prove that a graph is bipartite if and only if it has no cycles of odd length.
¹ Some of the problems in this section use terminology that is defined later in the chapter.
11. If G is a graph with V vertices and E edges, prove that the following statements are
equivalent to each other.
13. Prove that the complete bipartite graph K3,3 is not planar.
14. Seventeen people correspond by mail with one another, each one with all the rest. In
their letters only three different topics are discussed and each pair of people deals with
only one of these topics.
Prove that there are at least three people who write to each other about the same topic.
15. (a) During a meeting of 2n people, more than n² handshakes took place, with no pair
of people shaking hands more than once.
(i) Prove that there must have been three people who all shook hands with each
other.
(ii) Is the conclusion still true if exactly n² handshakes took place?
(b) What is the maximal number of edges that a graph on n vertices can have such
that the graph contains no triangle?
16. (a) Prove that, at any party with nine people, there must exist four mutual friends or
three mutual strangers.
(b) Show that this is not true for a party with eight people.
17. A Platonic solid is a convex polyhedron in which each vertex is surrounded by the same
number of congruent regular polygons.
Prove that there are exactly five Platonic solids and determine the number of vertices,
edges and faces of each.
18. Consider a planar graph whose faces, including the infinitely large one, are all triangles.
If each vertex is coloured red, green or blue, prove that the number of faces whose
vertices are all different colours is even.
19. There are 1985 participants at an international meeting. In each set of three participants,
there are at least two who speak the same language.
Given that no one speaks more than five languages, prove that there are at least 200
participants who speak the same language.
20. Each edge of the complete graph K9 is coloured either blue, or red, or left uncoloured.
Find the smallest value of n such that whenever n edges are coloured, there necessarily
exists a monochromatic triangle.
14.1 Degree
Let’s start with a few fundamental definitions in graph theory. If one of the endpoints of the
edge e is the vertex v, then we say that e and v are incident. If two vertices u and v are
incident to the same edge, then we say that u and v are adjacent. Now define the degree of a
vertex v to be the number of edges incident to v and denote it by deg(v).
Problem Show that at any party, there are always at least two people with exactly the
same number of friends at the party.
This is the first of many party problems which we will examine. In fact, you can think of
every graph theory problem as a party problem in disguise. Vertices represent people, while
edges represent mutual acquaintance. For the time being, we won’t deal with the case of
celebrities or forgetful people, that is, where A knows B but B doesn’t know A. Furthermore,
we don’t count people as knowing themselves, so that the graph has no loops, and we don’t
allow people to know each other twice over, so that the graph has no multiple edges. The
degree of a vertex represents the number of people at the party that a person knows so, in
some sense, degree is a popularity index!
Throughout the chapter, we will switch between party language and graph language, depending
on the circumstances. This particular problem can be translated into graph theory terminology
in the following way.
Show that in any graph, there exist two vertices with the same degree.
Solution With the goal of obtaining a contradiction, suppose that each person knows a
different number of people at the party.
If there are n partygoers, then they can know 0, 1, 2, . . . , n − 1 people, and these are the only
possibilities. Since there are n people and n possibilities for the number of people they know,
there must be one person who knows 0 people, one person who knows 1 person, and so on,
up to one person who knows all n − 1 other people at the party. However, it’s impossible for
there to be two people at the party, one who knows no one else and one who knows everybody
else. This is the desired contradiction, so we can conclude that there exist two people who
know the same number of people at the party.
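For small parties the claim can also be confirmed by brute force. The sketch below runs through every graph on 2 to 5 labelled vertices; the function names are ours, and the check is an illustration rather than a substitute for the proof.

```python
from itertools import combinations, product

def has_repeated_degree(n, edges):
    """Return True if two of the n vertices have the same degree."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return len(set(deg)) < n

def check_all_graphs(n):
    """Test every graph on n labelled vertices (no loops, no multiple edges)."""
    pairs = list(combinations(range(n), 2))
    for mask in product([0, 1], repeat=len(pairs)):
        edges = [p for p, bit in zip(pairs, mask) if bit]
        if not has_repeated_degree(n, edges):
            return False
    return True

print(all(check_all_graphs(n) for n in range(2, 6)))
```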
At any party, if you ask everyone in the room, including yourself, how many hands they shook,
and add up all of the answers, then you will always end up with an even number. In fact, the
sum will be twice the number of handshakes that have occurred during the party. We can
state this in the language of graph theory in the following way.
Handshaking lemma In any graph, the sum of the degrees of all the vertices is equal to
twice the number of edges.
This is true because each edge contributes two to the sum of the degrees, one for each vertex
incident to it. A simple corollary of the handshaking lemma is the fact that the number of
vertices of odd degree in any graph must be even.
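Both the lemma and its corollary are easy to test on a computer. Here is a small sketch on a hypothetical party of 12 people in which each pair shakes hands at random; the party size and the seed are arbitrary choices of ours.

```python
import random

random.seed(0)
n = 12
# A hypothetical party: each pair of people shakes hands with probability 1/2.
edges = [(u, v) for u in range(n) for v in range(u + 1, n) if random.random() < 0.5]

deg = [0] * n
for u, v in edges:
    deg[u] += 1  # each handshake adds one to the count of both people involved
    deg[v] += 1

odd_vertices = [v for v in range(n) if deg[v] % 2 == 1]
print(sum(deg) == 2 * len(edges), len(odd_vertices) % 2 == 0)
```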
Problem Is it possible to build a house with exactly eight rooms, each with three doors,
and such that exactly three of the house’s doors lead outside?
Solution If you could build such a house, then you could construct a corresponding graph
in the following way. Let there be nine vertices, one for each room and one to represent the
outside of the house. Place an edge between two vertices if there is a door between the two
corresponding areas. The conditions of the problem assert that every vertex of our graph has
degree three.
Since there are nine vertices, the sum of the degrees is 9 × 3 = 27. But the handshaking
lemma tells us that there should be 13½ edges in our graph, which is clearly impossible!
Problem Consider a squash tournament in which each of n people plays against every other
person exactly once. Let Lk and Wk be the number of losses and wins of the kth player,
respectively.
Prove that
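\[ L_1^2 + L_2^2 + \cdots + L_n^2 = W_1^2 + W_2^2 + \cdots + W_n^2. \]
Solution Moving everything to one side and factorising each difference of squares, the
required identity is equivalent to
\[ (L_1 - W_1)(L_1 + W_1) + (L_2 - W_2)(L_2 + W_2) + \cdots + (L_n - W_n)(L_n + W_n) = 0. \]
In the directed graph which represents this tournament, the total degree for the kth player is
indeg(v) + outdeg(v) = Lk + Wk = n − 1.
Since this is a constant, we can divide the previous equation through by n − 1 to obtain
\[ L_1 + L_2 + \cdots + L_n = W_1 + W_2 + \cdots + W_n. \]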
But this is now quite simple, since the left-hand side represents the sum of the indegrees,
which is equal to the number of edges in the graph. Similarly, the right-hand side represents
the sum of the outdegrees, which is also equal to the number of edges in the graph.
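If you would like some numerical reassurance, the identity can be tested on a randomly generated tournament. This is a sketch only; the coin-flip results and the helper name random_tournament are our own inventions.

```python
import random

def random_tournament(n, rng):
    """Play every pair once with a coin flip; return the lists of wins and losses."""
    wins = [0] * n
    losses = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            winner, loser = (i, j) if rng.random() < 0.5 else (j, i)
            wins[winner] += 1
            losses[loser] += 1
    return wins, losses

rng = random.Random(2024)
wins, losses = random_tournament(9, rng)
print(sum(w * w for w in wins) == sum(l * l for l in losses))
```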
A connected graph is one in which it’s possible to walk between any two vertices along
the edges. In other words, the entire graph consists of only one piece.
A cycle is a sequence of distinct vertices v1 , v2 , . . . , vn with v1 adjacent to v2 , v2 adjacent
to v3 , and so on, with vn adjacent to v1 .
A tree is a connected graph which has no cycles.
Problem If G is a graph with V vertices and E edges, prove that G is a tree if and only if
it is connected and E = V − 1.
The solution to the previous problem should give you some ideas which you can use to prove
the following important results.
The complete graph Kn is the graph consisting of n vertices with an edge between every
pair of vertices.
A bipartite graph is one whose vertices can be coloured black and white such that every
edge is incident to one black vertex and one white vertex.
The complete bipartite graph Km,n is the graph consisting of m black vertices and n
white vertices with an edge between each black and each white vertex.
The best way to remember jargon like this is to put it to good use!
Problem In the parliament of Frenemia each member is friends with exactly one other
member and enemies with exactly one other member.
Prove that the members of parliament can be divided into two chambers so that no chamber
contains a pair of mutual friends or a pair of mutual enemies.
Solution As is often the case, our first step will be to phrase this in graph theory language.
In a graph, the edges are coloured red and blue in such a way that each vertex is
incident to one red edge and one blue edge. Prove that the graph is bipartite.
We will give a procedure for colouring the vertices black and white such that every edge is
incident to one black vertex and one white vertex.
If you try drawing some graphs which satisfy the conditions of the problem, then you’ll find
that they all consist of a bunch of cycles, each one with an even number of vertices.
With this in the back of our minds, let’s start by taking any vertex and colouring it black.
Now take another vertex adjacent to it and colour it white. And take another vertex adjacent
to this one and colour it black. Continue walking around the graph, alternately colouring
vertices black and white until you get stuck. Clearly, this can only happen in one of two
ways: either you reach a dead end, or you meet a vertex which has already been coloured.
The former case actually never arises, since every vertex has degree two while a dead end is a
vertex of degree one. In the second case—and you should carefully think about why this is
so—we must return to the vertex at which we started.
In summary, we have traversed a cycle, alternately colouring vertices black and white. Now a
problem arises if the cycle has an odd number of vertices. But we’ve been told that the edges
alternate between red and blue, so there must be an even number of vertices along the cycle.
At this stage, we have successfully managed to colour some of the vertices black and white.
If we have coloured every vertex, then we are done, but if not, then we simply repeat the
process, starting at an uncoloured vertex.
This recipe will eventually colour every vertex in such a way that each edge is incident to one
black vertex and one white vertex, so the graph is certainly bipartite.
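The colouring recipe translates directly into a short program. In the sketch below we assume a hypothetical parliament of 20 members whose friendships and enmities are two random perfect matchings; the traversal uses a stack rather than literally walking around each cycle, but it alternates colours in the same way.

```python
import random

def two_colour(n, edges):
    """Colour vertices 0..n-1 black and white; return None if some edge is one-coloured."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue  # already coloured while processing an earlier component
        colour[start] = "black"
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if colour[v] is None:
                    colour[v] = "white" if colour[u] == "black" else "black"
                    stack.append(v)
    if any(colour[u] == colour[v] for u, v in edges):
        return None
    return colour

def random_matching(n, rng):
    """Pair up the vertices 0..n-1 at random (n must be even)."""
    order = list(range(n))
    rng.shuffle(order)
    return [(order[i], order[i + 1]) for i in range(0, n, 2)]

rng = random.Random(7)
n = 20
red = random_matching(n, rng)   # friendships
blue = random_matching(n, rng)  # enmities
colouring = two_colour(n, red + blue)
print(colouring is not None)
```

The two chambers are then the black members and the white members.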
not know each other, in which case we call them strangers. We will use vertices to represent
people, red edges to represent friends and blue edges to represent strangers. Therefore, every
party is simply a complete graph with each edge coloured red or blue.
Problem Prove that, at any party with six people, there must exist three mutual friends or
three mutual strangers.
Prove that given a complete graph on six vertices with each edge coloured red or
blue, there exists a monochromatic² triangle.
Consider a random partygoer A. Of the five edges incident to A, the pigeonhole principle
guarantees that at least three of them are the same colour. Without loss of generality, let
these edges be red and let them join A to the party people B, C and D. If BC is red, then
triangle ABC is red. If CD is red, then triangle ACD is red. If DB is red, then triangle ADB
is red. So to avoid a red triangle, the edges BC, CD and DB must all be blue, which forces
triangle BCD to be blue. So we simply cannot avoid having a monochromatic triangle.
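Since a party of six has only 15 pairs of people, the statement can also be checked exhaustively. The sketch below tries all 2^15 colourings; running the same test with five people shows that six cannot be lowered, because the pentagon with its diagonals coloured differently from its sides has no monochromatic triangle. The function names are ours.

```python
from itertools import combinations, product

def has_mono_triangle(n, colour):
    """colour maps each pair (u, v) with u < v to 'red' or 'blue'."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_colouring_has_mono_triangle(n):
    """Try every red/blue colouring of the complete graph on n vertices."""
    pairs = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(pairs, choice)))
        for choice in product(["red", "blue"], repeat=len(pairs))
    )

print(every_colouring_has_mono_triangle(6))
print(every_colouring_has_mono_triangle(5))
```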
The people tried in vain for many years until the great mathematician Leonhard Euler proved
that it was impossible. His first move was to reduce the map to a graph, with each vertex
representing a land mass and each edge a bridge as in the following diagram.
² The word monochromatic simply means one-coloured.
[Figure: the Königsberg graph.]
In this section only, we will allow a graph to have loops and multiple edges. So what we
would like to know is whether it’s possible to walk around the graph, traversing every edge
exactly once. Such a walk is known as an Euler trail.
Solution Let’s suppose that the Königsberg graph has an Euler trail and hope for a
contradiction.
We’ll start with the obvious fact that it must start somewhere and end somewhere. Between
the start and end, each time we visit a vertex, we must have walked along two edges incident
to it, one going in and one coming out. Thus every vertex other than the start and end must
have even degree. So if there exists an Euler trail, then there can be at most two vertices of
odd degree. But you can see for yourself that the graph has four vertices of odd degree, a
contradiction.
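You can see the four odd degrees for yourself with a few lines of code. The sketch below encodes the usual picture of the seven bridges as a multigraph; the labels A (for the island), B, C and D are our own.

```python
from collections import Counter

# The seven bridges as a multigraph. The labels A (the island), B, C, D are ours.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = sorted(v for v in degree if degree[v] % 2 == 1)
print(dict(degree), odd)
```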
This proof shows that, if a graph has an Euler trail, then it must have at most two vertices
of odd degree. However, by the handshaking lemma, we know that any graph has an even
number of vertices of odd degree. So for a graph to have an Euler trail, the number of vertices
of odd degree must be either 0 or 2. It turns out that the converse of this statement is true
as long as the graph is connected.
Problem Prove that a graph has an Euler trail if and only if it is connected and has 0 or 2
vertices of odd degree.
Nailing down the details of this proof requires quite a bit of work. We will only present the
main ideas of the proof and leave it up to you, dear reader, to sort out all of the details!
Solution Suppose that a connected graph has 0 or 2 vertices of odd degree. If there are 2
vertices of odd degree, call one of them S and one of them F . If there are 0 vertices of odd
degree, pick any vertex and call it both S and F . We will prove that there exists an Euler
trail which starts at S and finishes at F .
We will proceed by strong induction on the number of edges.
For a connected graph with one edge, it’s easy to find an Euler trail!
Now assume that we have a graph with E edges and that the result is true for any graph
with fewer than E edges. The idea is to start at S and commence walking randomly, never
traversing the same edge twice. Since there are only finitely many edges, sooner or later you
will get stuck somewhere. It’s impossible to get stuck at a vertex with even degree, because
each time we walk in, there will always be an edge along which we can walk out. In fact,
it turns out (and you should think about why) that the only place we can get stuck is at F .
If our walk so far has traversed every edge of the graph, then we are done! Otherwise, we’ve
missed some of the edges and these remaining edges must form a number of smaller connected
graphs G1 , G2 , . . . , Gn . Each of these components Gk must have all its vertices of even degree
(think carefully about why this must be so). And each such Gk intersects our previously
constructed walk at some vertex vk . Furthermore, each Gk has fewer than E edges, so the
inductive hypothesis guarantees that there is an Euler trail for each Gk which starts and
finishes at vk .
Now we modify our original walk as follows. Start the same way but for each k whenever you
reach one of the vertices vk , take a detour along the Euler trail for Gk . Our new walk now
traverses all the edges from the original walk, along with all the edges from the Gk . So it’s
an Euler trail for the original graph.
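The walk-and-detour idea can be turned into a program. The sketch below is a stack-based variant, commonly known as Hierholzer's algorithm, in which the detours are spliced in automatically as the walk backtracks; it is our illustration rather than a transcription of the proof, and it allows loops and multiple edges.

```python
from collections import defaultdict

def euler_trail(edges):
    """Return an Euler trail as a list of vertices, or None if there is none."""
    if not edges:
        return None
    adj = defaultdict(list)  # vertex -> list of (neighbour, edge index)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    odd = [v for v in adj if len(adj[v]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None
    start = odd[0] if odd else edges[0][0]
    used = [False] * len(edges)
    stack, trail = [start], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()              # discard edges already walked from the other end
        if adj[v]:
            w, i = adj[v].pop()
            used[i] = True
            stack.append(w)           # keep walking along an unused edge
        else:
            trail.append(stack.pop()) # stuck here, so record the vertex and back up
    if len(trail) != len(edges) + 1:
        return None                   # some edges were unreachable: graph not connected
    return trail[::-1]

konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]
print(euler_trail(konigsberg))
print(euler_trail([(0, 1), (1, 2), (2, 0), (2, 3)]))
```

When there are two vertices of odd degree, the trail returned starts at one of them and finishes at the other, just as in the proof.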
That was quite a difficult argument, so take some time to think about it carefully. The idea of
the random walk which cannot be stopped turns out to be a very useful one in graph theory.
14.7 Paths
A walk is any sequence of vertices v1 , v2 , . . . , vn with v1 adjacent to v2 , v2 adjacent to v3 , and
so on, with vn−1 adjacent to vn .
A path is any walk with distinct vertices.
Problem In Eulerland, there are 100 cities and two airlines, Air Gauss and Air Jordan.
For any two cities in Eulerland, exactly one of the companies provides direct flights in both
directions between them. It’s known that there are two cities a and b such that it is impossible
to travel from a to b using only Air Jordan flights.
Prove that it’s possible to travel between any two cities in Eulerland using only Air Gauss
flights.
Solution We can use vertices to represent cities and edges to represent flights, but how can
we represent the two airlines? Simple! We use red edges to represent Air Gauss and blue
edges to represent Air Jordan. It turns out that a whole variety of graph theory problems
involve colouring things in. In graph theory language, the problem translates to the following.
Suppose that each edge of the complete graph on 100 vertices is coloured red or
blue. It’s known that there are two vertices a and b such that there is no red path
from a to b.
Prove that there is a blue path between any two vertices.
Let A denote the set of all vertices which are connected to a by a red path, including a
itself. Similarly, let B denote the set of all vertices which are connected to b by a red path,
including b itself. And finally, let C denote the set of remaining vertices. (It is possible that
C is empty.) We have the following observations.
Every edge between a vertex in A and a vertex in B must be blue—for otherwise, there
would be a red path from a to b.
Every edge between a vertex in A and a vertex in C must be blue—for otherwise, that
vertex in C could be reached by a red path from a.
Every edge between a vertex in B and a vertex in C must be blue—for otherwise, that
vertex in C could be reached by a red path from b.
We will now prove that for two arbitrary vertices u and v, there exists a blue path between
them. We’ve already established that there exists a blue edge, and hence a blue path, between
any two vertices which are in different groups. So all that remains is to consider when u and
v lie in the same group.
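The statement being proved says that the red graph and the blue graph cannot both be disconnected. For small numbers of cities this can be confirmed by trying every colouring, as in the following sketch; the function names are ours and the search stops at six vertices only to keep the running time short.

```python
from itertools import combinations, product

def is_connected(n, edges):
    """Check whether the graph on vertices 0..n-1 with the given edges is connected."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

def red_or_blue_connected(n):
    """For every red/blue colouring of the complete graph, one colour must be connected."""
    pairs = list(combinations(range(n), 2))
    for mask in product([0, 1], repeat=len(pairs)):
        red = [p for p, bit in zip(pairs, mask) if bit]
        blue = [p for p, bit in zip(pairs, mask) if not bit]
        if not is_connected(n, red) and not is_connected(n, blue):
            return False
    return True

print(all(red_or_blue_connected(n) for n in range(2, 7)))
```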
Problem In the country of König, it’s possible to travel by plane between any two of the
cities, although you might have to take several flights, stopping at other intermediate cities
along the way. By a journey, we mean a sequence of flights which never visits the same city
twice. Let m be the maximum possible number of flights on a journey between two cities in
König.
Prove that any two journeys of length m must have at least one city in common.
Solution Clearly, we can interpret cities as vertices, flights as edges, and journeys as paths.
If we define the length of a path to be the number of edges that it traverses, then the problem
can be restated as follows.
If the longest path in a connected graph has length m, prove that any two paths
of length m in the graph must share a vertex.
To obtain a contradiction, assume there exist two paths P1 and P2 of length m in the graph
that don’t share a vertex. Let P1 join vertex A to vertex B and let P2 join vertex C to vertex
D. The aim is to find a path with length greater than m, which will then contradict the fact
that m is the maximum length of a path.
How might we construct such a path? Well, since the graph is connected, there must be a
path from a vertex on P1 to a vertex on P2 . In fact, let X be a vertex on P1 and Y a vertex
on P2 such that there exists a path from X to Y of minimal length. Using such a path of
minimal length ensures that it is a path which cannot pass through any other vertices of P1
or P2 , because otherwise it would not be of minimal length.
[Figure: the path P1 from A to B passing through X, and the path P2 from D to C passing through Y.]
14.9 Count and count again 255
Note that either the path from A to X or the path from B to X has length at least m/2.
Without loss of generality, let it be from A to X. Similarly, either the path from C to Y or
the path from D to Y has length at least m/2. Without loss of generality, let it be from C to
Y. Now the path from A to X to Y to C passes through no vertex twice, and has length
strictly greater than m. However, this contradicts the fact that m is the maximum length of
a path. Therefore, any two paths of length m in the graph must share a vertex.
Problem There are several boys and girls at a party, where each girl dances with at least
one boy but no boy dances with every girl.
Prove that there exist two boys B1 and B2 and two girls G1 and G2 such that B1 dances
with G1 , B2 dances with G2 , B1 does not dance with G2 and B2 does not dance with G1 .
It’s quite difficult to attack this problem directly. As we’ll soon see, the best approach is to
consider the boy who dances with the most girls. Like some other proofs in this book, this
may appear as surprising as a magician pulling a rabbit out of a hat! But the point, dear
reader, is for you to carefully read over the proof and ask how you could have thought of
it yourself. What features of the problem lead you to think about the extremal principle?
And why would you apply it in this particular way? If you understand the answers to these
questions, then you will soon be ready to start pulling rabbits out of your own hat!
Solution Of course, vertices will represent people while edges represent couples who dance
together. It’s useful to use black vertices for the boys and schematically place them at the
top of the diagram and to use white vertices for the girls and schematically place them at the
bottom of the diagram.
We’ll start by considering one of the boys, Max say, who dances with the maximum number
of girls. Now there is a girl, Anne say, who doesn’t dance with Max. And Anne must dance
with some boy, Bob say, who isn’t Max. So if one of Max’s dance partners, of whom there
are maximally many, does not dance with Bob, then we’re done!
But if Bob danced with all of Max’s dance partners, as well as with Anne, then Bob would
have danced with more girls than Max. This contradiction means that there must be some
girl, Carly say, who dances with Max but not Bob.
So we can take B1 , B2 , G1 and G2 to be Max, Bob, Carly and Anne, respectively.
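The proof is constructive, so it can be followed step by step on any dance card. Here is a sketch on hypothetical party data; the names of the boys and girls, and the function find_quadruple, are made up for illustration.

```python
def find_quadruple(dances, girls):
    """dances maps each boy to the set of girls he dances with.

    Assumes every girl dances with at least one boy and no boy dances with every girl.
    Returns (B1, B2, G1, G2) as in the problem statement.
    """
    # 'Max': a boy who dances with the maximum number of girls.
    max_boy = max(dances, key=lambda boy: len(dances[boy]))
    # 'Anne': a girl who does not dance with Max.
    anne = next(g for g in girls if g not in dances[max_boy])
    # 'Bob': a boy who dances with Anne, so he cannot be Max.
    bob = next(b for b in dances if anne in dances[b])
    # 'Carly': a partner of Max who does not dance with Bob; she exists by maximality.
    carly = next(g for g in dances[max_boy] if g not in dances[bob])
    return max_boy, bob, carly, anne

# Hypothetical party data.
girls = {"g1", "g2", "g3", "g4"}
dances = {
    "b1": {"g1", "g2", "g3"},
    "b2": {"g3", "g4"},
    "b3": {"g1"},
}
B1, B2, G1, G2 = find_quadruple(dances, girls)
print(B1, B2, G1, G2)
```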
We have already witnessed the virtues of double counting in section 13.8 and here, we’ll see
that it can be particularly useful when dealing with graphs.
Problem In a senate, there are 30 senators and each pair of them are either mutual friends
or mutual enemies. Each senator has exactly 6 enemies and every group of 3 senators forms a
commission.
Find the total number of commissions whose members are either all mutual friends or all
mutual enemies.
Each edge of the complete graph K30 is coloured either red or blue. Each vertex
is incident to exactly 6 blue edges.
Find the total number of monochromatic triangles.
If we concentrate on one particular vertex, we see that it’s incident to 6 blue edges and
23 red edges. This means that there are 6 × 23 = 138 colourful vees (pairs of edges of
different colours meeting at that vertex) formed from the edges incident to one
particular vertex.
Since there are 30 vertices altogether, the total number of colourful vees is
30 × 138 = 4140.
Note that there are $\binom{30}{3} = 4060$ triangles in our graph. Since X of these are
monochromatic, we know that 4060 − X of these are not. But in a monochromatic triangle, there
are no colourful vees, while in a triangle which is not monochromatic, there are two
colourful vees.
Therefore, the total number of colourful vees is
2(4060 − X) = 8120 − 2X.
Now we simply need to equate these two results and we end up with
4140 = 8120 − 2X
⇒ X = 1990.
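The answer does not depend on who is enemies with whom, which we can test on one particular senate. In the sketch below we assume, purely for illustration, that the senators sit around a circular table and that two senators are enemies exactly when they sit within three seats of each other.

```python
from itertools import combinations

n = 30

def is_blue(i, j):
    """Assumed example: enemies (blue) sit within three seats of each other on a circle."""
    d = (i - j) % n
    return min(d, n - d) <= 3

# Every senator should be incident to exactly 6 blue edges.
blue_degree = [sum(is_blue(i, j) for j in range(n) if j != i) for i in range(n)]

# Count the triangles whose three edges all have the same colour.
monochromatic = sum(
    1
    for a, b, c in combinations(range(n), 3)
    if is_blue(a, b) == is_blue(a, c) == is_blue(b, c)
)
print(set(blue_degree), monochromatic)
```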
Euler’s formula For a connected planar graph with V vertices, E edges and F faces,
V − E + F = 2. When using Euler’s formula, we always include the infinitely large face on
the outside.
Problem Suppose a planar graph with E edges divides the plane into F faces.
Prove that 3F ≤ 2E.
Solution Let’s count the number E of edges. Imagine cutting the planar graph along its
edges. We get a collection of polygons and one figure which might be called an ‘anti-polygon’
corresponding to the outside face. Since each original edge splits into two edges, the total
number of newly formed edges is 2E. However, each of our F newly formed polygons, including
the outside anti-polygon, has at least three edges. Thus the number of newly formed edges is
at least 3F . It follows that 3F ≤ 2E.
Problem Prove that K5 , the complete graph on five vertices, is not planar.
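One possible line of attack, sketched here on the assumption that the inequality 3F ≤ 2E and Euler’s formula are the intended tools, runs as follows. The graph K5 is connected and has V = 5 vertices and E = 10 edges. If it were planar, then Euler’s formula would determine the number of faces F.

```latex
\[
F = 2 - V + E = 2 - 5 + 10 = 7
\quad\Longrightarrow\quad
3F = 21 > 20 = 2E .
\]
```

This would contradict the inequality 3F ≤ 2E, so K5 cannot be planar.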
Problem If a graph is planar, prove that it has at least one vertex of degree less than or
equal to five.
Solution In the hope of finding a contradiction, let’s start with the assumption that there
exists a planar graph, all of whose vertices have degree at least six.
Suppose that this graph has V vertices, E edges and divides the plane into F faces. Then
the sum of the degrees is at least 6V , so the handshaking lemma asserts that 6V ≤ 2E or
equivalently,
\[ V \le \frac{E}{3}. \]
From the previous problem, we know that $F \le \frac{2E}{3}$.
Substituting these two inequalities into Euler’s formula, we obtain
\[ 2 = V - E + F \le \frac{E}{3} - E + \frac{2E}{3} = 0, \]
an obvious contradiction.
14.11 Polyhedra
If you take a bunch of polygons and glue them together so that no side is left unglued, then
the resulting object is usually called a polyhedron. Typical examples include the tetrahedron,
the square pyramid, the cube and the soccer ball with its pentagonal and hexagonal patches.
The corners of the polygons are called vertices, the sides of the polygons are called edges and
the polygons themselves are called faces. We say that a polyhedron is convex if, for each
plane which lies along a face, the polyhedron lies on one side of that plane. So, for example,
the cube is convex while the polyhedron formed by gluing three congruent cubes together to
form an L shape is not.
The fact that every polyhedron is a graph is a rather simple statement. This is because if
you have a polyhedron and simply ignore the faces, then what you have left over is just a
bunch of vertices connected in pairs by edges, or in other words, a graph.
A more interesting statement is the following fact. Every convex polyhedron corresponds to a
planar graph. So why is this true? Well, suppose that your polyhedron is made from some
sort of rubbery material, like a balloon. If you pop the balloon by removing one of the faces,
then what remains is a rubbery sheet with the vertices and edges still drawn on it. Now
just stretch this out flat onto a table and there you have your planar graph. Note that this
planar graph has the same number of vertices, edges and faces as the original polyhedron.
The face that we removed from the polyhedron now corresponds to the infinitely large face of
the planar graph. In particular, Euler’s formula applies to all convex polyhedra.
Problem Suppose that you have a convex polyhedron and you are told that each face is a
quadrilateral or a hexagon and that three faces meet at every vertex. Furthermore, every
quadrilateral face shares an edge with four hexagonal faces, while every hexagonal face shares
an edge with three quadrilateral faces and three hexagonal faces.
Deduce the number of quadrilateral faces and the number of hexagonal faces of the polyhedron.
Solution Let V be the number of vertices, E the number of edges, Q the number of
quadrilateral faces, and H the number of hexagonal faces of the polyhedron. We will deduce
these four values by writing down four equations that they satisfy and solving them.
The first equation comes from applying the handshaking lemma to the polyhedron.
Since three faces meet at every vertex, every vertex of the polyhedron has degree three.
Therefore, the sum of the degrees is simply 3V and we obtain the equation
3V = 2E. (1)
The second equation comes from double counting the edges via face contributions. Each
quadrilateral contributes four edges and each hexagon contributes six edges. But each
edge has been double counted due to the fact that it is adjacent to two faces so we find
that 4Q + 6H = 2E, or equivalently,
E = 2Q + 3H. (2)
The third equation comes from a clever double counting argument. The trick here is to
count the number of times a quadrilateral face shares an edge with a hexagonal face.
This happens four times for each quadrilateral face. In other words, the number is 4Q.
Arguing in a different way, we can say that this happens three times for each hexagonal
face. In other words, the number is 3H. Of course, these numbers must be the same, so
we must have
4Q = 3H. (3)
The fourth equation comes from Euler’s formula, where the total number of faces is
Q + H.
V − E + Q + H = 2. (4)
It is not hard to solve equations (1)–(4). For example, using (3) we have H = (4/3)Q. Put this
into (2) to find E = 6Q. Put this into (1) to find V = 4Q. Finally put everything into (4) to
obtain Q = 6. We may then go on to find H = 8, E = 36 and V = 24.
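The substitutions can be mirrored in a few lines of exact arithmetic. In this sketch every quantity is written as a multiple of Q, in the same order as above; the variable names are ours.

```python
from fractions import Fraction

# Express H, E and V as multiples of Q, following equations (3), (2) and (1).
h_per_q = Fraction(4, 3)            # (3): 4Q = 3H
e_per_q = 2 + 3 * h_per_q           # (2): E = 2Q + 3H
v_per_q = Fraction(2, 3) * e_per_q  # (1): 3V = 2E

# (4): V - E + Q + H = 2 becomes (v - e + 1 + h) * Q = 2.
Q = 2 / (v_per_q - e_per_q + 1 + h_per_q)
H, E, V = h_per_q * Q, e_per_q * Q, v_per_q * Q
print(V, E, Q, H)
```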
Problem Given a graph with n vertices and E edges, prove that the number of triangles is
at least
\[ \frac{4E^2}{3n} - \frac{nE}{3}. \]
Solution The basic idea of the solution is rather simple and elegant. Start by labelling the
vertices v1 , v2 , . . . , vn , and suppose that they have degrees d1 , d2 , . . . , dn , respectively.
Next take two vertices vi and vj which are connected by an edge. Then there are di − 1 edges
which emanate from vi to the rest of the graph and dj − 1 edges which emanate from vj to
the rest of the graph. Since there are n − 2 other vertices, then by the pigeonhole principle a
triangle will be created if
(di − 1) + (dj − 1) > n − 2.
In fact, we can guarantee that the edge between vi and vj will be involved in at least
(di − 1) + (dj − 1) − (n − 2) = di + dj − n
triangles.
[Figure: the edge joining vi and vj, together with the other vertices of the graph.]
If we sum over all possible edges, then we will count each triangle at most three times. So,
letting T denote the total number of triangles, we have the inequality
\[ 3T \ge \sum_{\text{edges}} (d_i + d_j - n) = \sum_{\text{edges}} (d_i + d_j) - nE. \]
Since we are aiming for the result $3T \ge \frac{4E^2}{n} - nE$, all that remains to be proved is the
inequality
\[ \sum_{\text{edges}} (d_i + d_j) \ge \frac{4E^2}{n}. \]
Note that in this sum over edges, the number di appears once for each edge that is incident
to the vertex vi . That is, di occurs in the sum di times. So we can write this inequality
equivalently as
\[ \sum_{i=1}^{n} d_i^2 \ge \frac{4E^2}{n} = \frac{(2E)^2}{n} = \frac{1}{n} \left( \sum_{i=1}^{n} d_i \right)^2. \]
Here, we’ve used the handshaking lemma to express 2E as the sum of the degrees.
We’ve finally arrived at a point where the problem is reduced to algebra. One way to finish
off the problem is to invoke the power means inequality3 , the QM–AM in particular.
From the power means inequality, we know
$$\sqrt{\frac{d_1^2 + d_2^2 + \cdots + d_n^2}{n}} \ge \frac{d_1 + d_2 + \cdots + d_n}{n}$$
and this is exactly what we need to complete the proof.
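As a sanity check on the bound (not a substitute for the proof), here is a minimal Python sketch that counts triangles in small random graphs and compares the count with $\frac{4E^2}{3n} - \frac{nE}{3}$. The function names and the random graph model are illustrative assumptions, not part of the original solution.

```python
import itertools
import random

def count_triangles(n, edges):
    """Count the triangles in a simple graph on the vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

def bound_holds(n, edges):
    """Check T >= 4E^2/(3n) - nE/3, cleared of fractions: 3nT >= 4E^2 - n^2 E."""
    E = len(edges)
    T = count_triangles(n, edges)
    return 3 * n * T >= 4 * E * E - n * n * E

def random_graph(n, p, rng):
    """Hypothetical helper: include each possible edge with probability p."""
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]
```

The complete graph on 4 vertices has E = 6 and T = 4, and there the two sides of the cleared inequality are both 48, so the bound can be attained with equality.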
3 If you’re unsure of what this means, then you might like to consult section 11.5.
15 Games and invariants
Games? Who said games? We always seem ready to play a game. Why? Because games are
fun! Games often turn up in mathematics problems and if there is one important point you
should remember, it is this:
Play the game!
The reason is that you gain insight into what the problem is all about. In fact you might
need to play the game for quite a while before you notice something that just might be the
key element in solving the problem.
15.0 Problems
1. Amy and Ben play the game of misère noughts and crosses on a 3 × 3 square array. On
Amy’s turn, she can place an X in any vacant square, while on Ben’s turn, he can place
an O in any vacant square. The players take turns to place their symbol, with Amy
going first. Any player who gets three in a row (horizontally, vertically or diagonally)
immediately loses the game. The game is considered drawn if there is no winner after
all squares have been filled.
(a) Which player, if any, has a winning strategy?
(b) Answer the same question if the game is played on a three-dimensional 3 × 3 × 3
cubic array.
2. Initially, there are n coins on a table and two players take turns to remove a number of
them. The number of coins removed must belong to the set S and a player wins by
removing the last coin.
Find the winning and losing positions in the following cases.
(a) S = {1, 2, 3, 4, 5, 6}
(b) S = {1, 2, 3, . . . , k}
(c) S = {1, 3, 4}
(d) S = {1, 3, 8}
(e) S = {1, 2, 4, 8, 16, 32, 64, 128, . . .}
(f) S = {p | p is a prime or equal to 1}
3. A 16 × 16 square grid is constructed from sixteen 4 × 4 smaller square grids called boxes.
The game of Ukodus is played on the 16 × 16 grid as follows. Two players write, in
turn, numbers from the set {1, 2, . . . , 16} in different squares. The numbers in each row,
column and box of the 16 × 16 grid must be different. The loser is the one who is not
able to write a number.
Which player has a winning strategy?
4. At the start of a game, the numbers 1 and 2 are each written 10 times on a blackboard.
Two players take turns to erase two of the numbers, replacing them with a 1 if they are
different and with a 2 if they are the same. The first player wins if the last number on
the board is 1, while the second player wins if it is 2.
Which player has a winning strategy?
6. (a) Show that it’s possible to tile an m × n chessboard with 4 × 1 rectangles if and
only if 4 divides m or 4 divides n.
(b) Show that it’s possible to tile an m × n chessboard with k × 1 rectangles if and
only if k divides m or k divides n.
7. On the island of Trichroma, there are 10 blue, 15 red and 20 yellow chameleons. If
two chameleons of different colours meet, they both simultaneously change to the third
colour.
Is it possible for all of the chameleons to eventually be the same colour?
(i) it contains exactly two squares with an X, one of which must be the middle square
of the playable set, or
(ii) it contains exactly two squares with an O, one of which must be the middle square
of the playable set.
At any stage Lex is permitted to choose any playable set and simultaneously change
every X into an O, and every O into an X.
After a finite number of such changes, Lex has reached a configuration with exactly one
square containing an X.
Which squares could it be?
9. Two players start with the number 1 and take turns to multiply it by an integer from 2
to 9. The winner is the first player to obtain a number greater than or equal to 1000.
Which player has a winning strategy?
10. A coin is placed in each square of a 4 × 4 grid so that all are showing heads apart
from the coin in the first row and second column. You are allowed to flip all of the
coins along a row, along a column, or along a line parallel to one of the diagonals. In
particular, you are allowed to flip the coin in any corner square.
Prove that it’s impossible for all of the coins to show heads.
11. A positive integer is written in each square of an m × n chessboard. You are allowed to
add or subtract the same integer from any two squares which share a common side, as
long as the resulting numbers are both non-negative.
When is it possible to reduce all of the numbers to zero?
13. The numbers 2, 3 and 6 are written on a blackboard. You are allowed to replace two of
them, say a and b, with the numbers
$$\frac{3a}{5} + \frac{4b}{5} \quad\text{and}\quad \frac{4a}{5} - \frac{3b}{5}.$$
Prove that it is impossible for a number greater than 7 to appear on the blackboard.
14. A class of students is lined up in order of height with the tallest person at the front. At
any stage a student who is not in the first or second position is permitted to move two
places towards the front.
For what size class is it possible that the students could end up in the reverse order of
height?
15. In the game of Fifteen we have 15 unit squares, numbered from 1 to 15. They are
arranged in order in a 4 × 4 square array leaving a space of one unit square in the
bottom-right corner. Any square adjacent to the space is free to slide into the space.
(a) By using these sliding moves is it possible to swap the positions of the squares
labelled 14 and 15, as shown in the diagram below?
(b) How many different positions are achievable in this game?
 1  2  3  4          1  2  3  4
 5  6  7  8    ?     5  6  7  8
 9 10 11 12   −→     9 10 11 12
13 14 15            13 15 14
16. A real number is written in each square of an m × n grid. If the sum of the numbers in
a row or column is negative, then you are allowed to switch the signs of the numbers in
that row or column.
Prove that the sum of the numbers in each row and column will eventually be non-
negative no matter in what order the rows and columns are chosen.
17. Fifty coins of various denominations lie in a row. Ollie picks up a coin from one end of
the row, then Ellie picks up a coin from one end of the row of remaining coins. They
alternate in this way until they each have 25 coins.
Prove that Ollie can guarantee to win at least as much money as Ellie.
18. A deck of n cards labelled 1, 2, . . . , n is shuffled. If the top card is labelled k, then the
order of the top k cards is reversed.
Prove that after finitely many of these shuffles, the card labelled 1 will eventually be on
top of the deck.
19. Two players play a game involving a knight on an 8 × 8 chessboard. The first player
places the knight on the board and the second player makes a knight’s move.1 The
two players then take turns to make a knight’s move, but may not place the piece on a
square it has already visited.
If the player who is unable to move is considered the loser, which player has a winning
strategy?
20. A game consists of circular discs numbered from 1 to 20 that are allowed to slide freely
along a track, which is a closed loop. Furthermore, there is a section of the track on
which exactly four discs can fit, which can be swivelled around its centre so that the
order of the four discs is reversed.
(a) From any given initial configuration, is it possible to move the discs into a configu-
ration where all 20 numbers are in order clockwise around the track?
(b) How many different positions are achievable in this game?
(c) Answer the same questions if there are 21 discs instead.
21. There are three amoebas sitting at the points (0, 0), (0, 1) and (1, 0) of the coordinate
plane. Every now and then an amoeba splits into two separate amoebas, one of which
will move one unit upwards, while the other moves one unit to the right. They do this
in such a way that no two amoebas ever sit at the same point.
Is it possible for the amoebas to split in such a way that the points (0, 0), (0, 1) and
(1, 0) will eventually be unoccupied?
22. The game of Domination is played on an m × n chessboard. Two players take turns to
place a domino so that it covers two adjacent squares of the chessboard. Dominoes are
not allowed to overlap and the player who cannot move loses.
(a) Which player has a winning strategy if Domination is played on a 3 × 3 board?
(b) Which player has a winning strategy if Domination is played on an m × n board,
where m and n are even?
(c) Which player has a winning strategy if Domination is played on an m × n board,
where m is odd and n is even?
23. Two people play a game involving n coins on a table. The first player takes at least
one, but not all, of the coins. The players then take turns to take at least one coin, but
no more than was taken on the previous move. The player who takes the last coin is
considered the winner.
For which values of n does the second player have a winning strategy?
1 A chess knight moves either two squares horizontally and one square vertically or two squares vertically and
one square horizontally.
24. Chomp is played with an m × n grid of chocolate. Two players take turns to eat a
square of chocolate, along with every square which is above and to the right of it.
Unfortunately the bottom-left square is poisonous, so that the player who is forced to
eat it is considered the loser.
25. Initially, the number 2 is written on a blackboard. Two players take turns to erase the
number N from the blackboard and replace it with the number N + d, where d is one
of the divisors of N satisfying 0 < d < N .
(a) If the player who first writes a number greater than 12345 loses the game, which
player has a winning strategy?
(b) If the player who first writes a number greater than 123456 loses the game, which
player has a winning strategy?
26. Write the numbers 1, 4, 5, 8, 6, 2, 3, 9 and 7 in that order, around a circle. A move
consists of changing three consecutive numbers (a, b, c) to (a + 1, b − 2, c + 1).
Can you change all of the numbers to 5 using such moves?
27. Ten girls sitting around a circular table are playing a game with N cards. Initially, one
girl is holding all of the cards. Each minute, if there is at least one girl holding at least
two cards, one of them must pass a card to each of her two neighbours. The game ends
if no girl is holding more than one card.
(a) Prove that if N ≥ 10, then it is impossible for the game to end.
(b) Prove that if N < 10, then the game must eventually end.
28. Consider a chocolate bar in the shape of an equilateral triangle, with sides of length n,
divided by grid lines into equilateral triangles of side length 1. Two players take turns
to break off a triangular piece along one of the grid lines and pass the remaining block
of chocolate to the other player. A player who is unable to move or who leaves an
equilateral triangle of side length 1 is declared the loser.
For which values of n does the second player have a winning strategy?
29. On an infinite grid, n² pieces are arranged in an n × n block of squares, one piece per
square. A move consists of jumping a piece horizontally or vertically over a piece in
an adjacent occupied square to an unoccupied square immediately beyond. The piece
which has been jumped over is then removed.
Find those values of n for which this game can end with only one piece remaining.
30. Consider the unit squares formed by the integer lattice points in the x-y plane. At n
squares north of the x-axis is a princess whom her loyal subjects wish to rescue. In
each square below the x-axis is one pawn. (These are the princess’s loyal subjects.)
But the pawns can only move by jumping either horizontally or vertically over another
immediately adjacent pawn to a vacant square, after which the pawn that has been
jumped over is removed.
For which n is it possible that one of the princess’s loyal subjects can reach her by
landing on her square?
31. Amy and Ben play a game of 11-in-a-row on an infinite two-dimensional square array.
They take turns to choose a vacant square and mark it. Amy goes first and marks
squares with an X, while Ben marks squares with an O. A player wins by being the
first to mark 11 consecutive squares vertically, horizontally or diagonally.
(a) Show that Amy can prevent Ben from winning.
(b) Show that Ben can prevent Amy from winning.
(c) Show that (a) and (b) are still true if they are playing 9-in-a-row instead.2
2 For the general case of n-in-a-row, it is known that the first player can force a win for n ≤ 5 and that the
second player can force a draw for n ≥ 8. As of September 2014, the status for n = 6, 7 is unknown.
15.1 Number invariants
You are given the configuration A and a set of legal moves which change the
configuration. Can you use these moves to end up with the configuration B?
If the answer is yes, then you could prove this by simply demonstrating a sequence of legal
moves which takes configuration A to configuration B. But if the answer is no, then you have
to be much trickier. More often than not, you will need to use the idea of an invariant. An
invariant is something—for example, a number—which we can associate to every configuration
such that performing a legal move doesn’t change it. So if the value of the invariant for
configuration A differs from its value for configuration B, then the task is impossible to
achieve, no matter how hard you try. All of this probably sounds quite cryptic and will
remain so until we see some examples in action.
Problem Given some numbers, we may choose two of them, say a and b, and replace them
with the single number a + b.
Prove that if we start with the numbers
1, 2, 3, . . . , 100
and apply the operation 99 times, we always end up with the same final number.
Solution It should be obvious that, after applying the operation 99 times, we will be left
with only one number. And it should seem intuitively clear that, no matter what order we
choose to perform the additions, the final number will be the sum of the original numbers.
This is certainly true, but we can state it in the language of invariants by associating to the
numbers a1 , a2 , . . . , an the sum
I = a1 + a2 + · · · + an .
The number I is an invariant because it doesn’t change when we apply the operation.
Therefore, it must be the same for both the initial and final configurations. Indeed, we have
1 + 2 + · · · + 100 = 5050,
so the final number is always 5050.
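To see the invariant at work, the following Python sketch performs the 99 replacements in a random order and returns the last remaining number. The helper names and the use of a seeded random order are assumptions made for illustration.

```python
import random

def combine_all(numbers, op, rng):
    """Repeatedly replace two randomly chosen numbers a, b by op(a, b)."""
    nums = list(numbers)
    while len(nums) > 1:
        rng.shuffle(nums)          # pick the pair in a random order
        a = nums.pop()
        b = nums.pop()
        nums.append(op(a, b))
    return nums[0]

def final_sum(seed):
    """Start from 1, 2, ..., 100 and always replace a, b by a + b."""
    return combine_all(range(1, 101), lambda a, b: a + b, random.Random(seed))
```

Whatever seed is used, `final_sum` should return 5050, because the sum of the numbers on the list never changes.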
Problem Given some numbers, we may choose two of them, say a and b, and replace them
with the single number ab + a + b.
Prove that if we start with the numbers
$$1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{100}$$
and apply the operation 99 times, we always end up with the same final number.
Solution Again, it should be obvious that, after applying the operation 99 times, we will be
left with only one number. Although this problem is more difficult than the previous one, it’s
still easily solved once you stumble upon the correct invariant. We simply associate to the
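numbers a1 , a2 , . . . , an the value
I = (a1 + 1)(a2 + 1) · · · (an + 1).
Removing the numbers a and b from our list divides the value of I by
(a + 1)(b + 1),
while adding the number ab + a + b to our list multiplies the value of I by
ab + a + b + 1 = (a + 1)(b + 1).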
So, the number I is an invariant because it doesn’t change when we apply the operation.
The value of I for the initial configuration can be determined using the following veritable
feast of cancellation.
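$$I = \left(\frac{1}{1} + 1\right)\left(\frac{1}{2} + 1\right)\left(\frac{1}{3} + 1\right)\cdots\left(\frac{1}{100} + 1\right) = \frac{2}{1} \cdot \frac{3}{2} \cdot \frac{4}{3} \cdots \frac{101}{100} = 101$$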
Since the invariant must be the same for both the initial and final configurations, the final
number x satisfies x + 1 = 101. So the final number will always be 100.
Of course, the difficulty in this solution lies in finding the actual invariant to use. A good
problem will often leave clues which can guide you in the right direction. For example, an
observation that might lead you to discover the correct invariant for this problem is the fact
that ab + a + b almost factorises as (a + 1)(b + 1) − 1.
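The same experiment can be run for the operation ab + a + b. This sketch uses exact rational arithmetic and tracks the product of (x + 1) over all numbers x on the list, which is the invariant suggested by the factorisation above. The function names are illustrative assumptions.

```python
from fractions import Fraction
import random

def invariant(nums):
    """Product of (x + 1) over all numbers x currently on the list."""
    result = Fraction(1)
    for x in nums:
        result *= x + 1
    return result

def final_number(seed):
    """Start from 1, 1/2, ..., 1/100 and replace a, b by ab + a + b in random order."""
    rng = random.Random(seed)
    nums = [Fraction(1, k) for k in range(1, 101)]
    start = invariant(nums)
    while len(nums) > 1:
        rng.shuffle(nums)
        a, b = nums.pop(), nums.pop()
        nums.append(a * b + a + b)
    assert invariant(nums) == start  # the invariant is unchanged at the end
    return nums[0]
```

Every seed should give the final number 100, in agreement with I = 101 for the starting list.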
15.2 Parity
Another extremely useful invariant is parity, which essentially means oddness or evenness.
You should always keep your eyes open for a parity argument.
Problem Given some numbers, we may choose two of them, say a and b, and replace them
with the difference |a − b|. Suppose that we start with the numbers
1, 2, 3, . . . , 2n,
Solution Since we always end up with an odd number, a likely candidate for an invariant is
the parity of the sum of the numbers. Initially, we have
$$1 + 2 + 3 + \cdots + 2n = \frac{2n(2n + 1)}{2} = n(2n + 1),$$
which is odd.
To show that the parity of the sum of the numbers doesn’t change after each operation,
suppose that we remove the numbers a and b, where we assume without loss of generality
that a ≥ b. Then the sum of the numbers would change by
−a − b + (a − b) = −2b,
which is even. Hence the parity of the sum of the numbers is an invariant.
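A quick simulation illustrates the parity argument. The sketch below assumes that n is odd, since that is what makes the initial sum n(2n + 1) odd, and that the operation is repeated until a single number remains. Both assumptions, and the function name, are ours.

```python
import random

def final_difference(n, seed):
    """Start from 1, 2, ..., 2n and replace a, b by |a - b| until one number is left."""
    rng = random.Random(seed)
    nums = list(range(1, 2 * n + 1))
    while len(nums) > 1:
        rng.shuffle(nums)
        a, b = nums.pop(), nums.pop()
        nums.append(abs(a - b))
    return nums[0]
```

For odd n, every run should end with an odd number, whichever pairs are chosen along the way.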
Problem You are given an 8 × 8 chessboard where the squares are coloured black and white
in the usual way. You are allowed to switch the colours of all the squares in a row or column.
Can you end up with exactly one black square on the board?
Solution The first thing you should do is take out some pen and paper, draw a chessboard,
and try to obtain exactly one black square. Playing around with the problem in this way
should convince you, sooner or later, that you probably cannot do it. In fact, you might even
be able to conjecture that it’s impossible to obtain an odd number of black squares. So let’s
try to show that the parity of the number of black squares is an invariant.
We note that if a row or column has x black squares, then after switching the colours, it will
have 8 − x black squares. So the change in the number of black squares is
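−x + (8 − x) = 8 − 2x,
which is even. Hence the parity of the number of black squares is an invariant. Since the
board starts with 32 black squares, it is impossible to end up with exactly one black square.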
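The following sketch plays with the board in the way the solution recommends, but lets the computer do the flipping. It records the number of black squares after each random row or column switch. The board representation (True for black) and the helper names are assumptions for illustration.

```python
import random

def standard_board():
    """8 x 8 chessboard; True marks a black square."""
    return [[(r + c) % 2 == 0 for c in range(8)] for r in range(8)]

def flip_row(board, r):
    board[r] = [not sq for sq in board[r]]

def flip_col(board, c):
    for row in board:
        row[c] = not row[c]

def black_count(board):
    return sum(sq for row in board for sq in row)

def black_counts_after_random_flips(steps, seed):
    """Apply random row/column switches and record the black count after each."""
    rng = random.Random(seed)
    board = standard_board()
    counts = [black_count(board)]
    for _ in range(steps):
        if rng.random() < 0.5:
            flip_row(board, rng.randrange(8))
        else:
            flip_col(board, rng.randrange(8))
        counts.append(black_count(board))
    return counts
```

Every recorded count should be even, so a count of exactly 1 never appears.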
Problem A magic lolly machine has the property that if two of Stephen’s Stupendous
Smarties are inserted, then three of Justin’s Jumbo Jaffas come out. Also, if three of
Stephen’s Stupendous Smarties are inserted, then two of Justin’s Jumbo Jaffas come out.
The reverse also occurs, that is, if two of Justin’s Jumbo Jaffas are inserted, then three of
Stephen’s Stupendous Smarties come out. And if three of Justin’s Jumbo Jaffas are inserted,
then two of Stephen’s Stupendous Smarties come out.
(a) If I want to turn two Jaffas into exactly 61 Jaffas, what is the minimum number of
Smarties that I also end up with?
(b) Can I turn one Jaffa and one Smartie into 10 Jaffas and no Smarties?
Solution
(a) The first thing you should do is take out some pen and paper, pretend you have a magic
lolly machine, and work out how many Jaffas and Smarties you can obtain. For example,
you might come up with the following possibilities, created from just two Jaffas.
Jaffas        2   0   3   1   7   1  13   1  28   0  63  61
Smarties      0   3   1   4   0   9   1  19   1  43   1   4
Difference    2  −3   2  −3   7  −8  12 −18  27 −43  62  57
It seems that if I want to turn two Jaffas into 61 Jaffas, then I might have to create at
least four Smarties.
If we’re looking for an invariant, we could think about the sum of the number of Jaffas
and Smarties, but this doesn’t seem to be very useful.
The difference, on the other hand, is very interesting indeed. In fact, it seems that
J − S (mod 5)
is an invariant, where J is the number of Jaffas and S is the number of Smarties.
Let’s prove this by considering the effect of one operation of the magic lolly machine.
The pair (J, S) can become any of
(J − 2, S + 3), (J − 3, S + 2), (J + 3, S − 2) or (J + 2, S − 3).
In all of these cases, the difference changes from J − S to J − S ± 5, so we can now
conclude that J − S (mod 5) is indeed an invariant.
Initially, we only have two Jaffas and J − S ≡ 2 (mod 5). So to end up with 61 Jaffas,
we need at least four Smarties in order to maintain J − S ≡ 2 (mod 5).
But be careful! While the invariant allows us to decide that certain (J, S) combinations
are not possible, it doesn’t necessarily allow us to decide which ones actually are possible.
So we still need to show that one can obtain 61 Jaffas and four Smarties, but this we
have already accomplished in the table above.
(b) Consider the problem of turning one Jaffa and one Smartie into 10 Jaffas and no
Smarties. Our invariant does not exclude this as a possibility, since J − S ≡ 0 (mod 5)
for both cases. However, observing that we need at least two Jaffas or at least two
Smarties to use the magic lolly machine tells us that this task is impossible.
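Both parts can be explored with a breadth-first search over the pairs (J, S). The sketch below is a bounded search: it only visits states with J + S at most `max_total`, a cap that we introduce so that the search terminates. The names `MOVES` and `reachable` are illustrative.

```python
from collections import deque

# Change in (Jaffas, Smarties) for each of the four machine operations.
MOVES = [(-2, 3), (-3, 2), (3, -2), (2, -3)]

def reachable(start, max_total):
    """All (J, S) states reachable from start while keeping J + S <= max_total."""
    seen = {start}
    queue = deque([start])
    while queue:
        j, s = queue.popleft()
        for dj, ds in MOVES:
            nj, ns = j + dj, s + ds
            if nj >= 0 and ns >= 0 and nj + ns <= max_total and (nj, ns) not in seen:
                seen.add((nj, ns))
                queue.append((nj, ns))
    return seen
```

With `reachable((2, 0), 65)`, every state found should satisfy J − S ≡ 2 (mod 5) and the state (61, 4) should be among them. Starting from (1, 1), no operation is possible, so the only reachable state is (1, 1) itself. The search can only show that states are reachable within the cap. It is the invariant that rules out 61 Jaffas with fewer than four Smarties.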
Problem Consider an 8 × 8 chessboard, where the top-right and bottom-left squares have
been removed.
Is it possible to tile this mutilated chessboard with 2 × 1 rectangles?
Solution The first thing you should do is take out some pen and paper, draw a mutilated
chessboard, and try to tile it with 2 × 1 rectangles. However, I can tell you right now that
you will fail, not because your tiling skills are poor, but because the task is impossible!
Perhaps surprisingly, the key to this problem is the standard black-and-white colouring of the
chessboard. This is because a 2 × 1 rectangle will always occupy two adjacent squares on the
chessboard and hence, cover one black square and one white square. Therefore, any part of
the chessboard that can be tiled with 2 × 1 rectangles must have the same number of black
and white squares.
Now we note that the standard chessboard has 32 squares of each colour, while the mutilated
chessboard is obtained by removing two squares of the same colour. Since there are now 30
squares of one colour remaining and 32 squares of the other colour, it is impossible to tile the
mutilated chessboard with 2 × 1 rectangles.
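The colour count is easy to confirm by machine. In this sketch the squares are indexed by (row, column) from 0 to 7, and a square is called black when row + column is even. That convention, like the function name, is an assumption and only affects which colour is called which.

```python
def colour_counts(removed):
    """Count the black and white squares of an 8 x 8 board after removing some squares."""
    counts = {"black": 0, "white": 0}
    for r in range(8):
        for c in range(8):
            if (r, c) not in removed:
                colour = "black" if (r + c) % 2 == 0 else "white"
                counts[colour] += 1
    return counts
```

Removing the opposite corners (0, 7) and (7, 0) leaves 32 squares of one colour and 30 of the other, so the 31 rectangles needed cannot each cover one square of each colour.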
We were lucky in this problem, because the standard 8 × 8 chessboard came with a colouring
which helped our cause, free of charge. But sometimes, as in the next problem, you have to
invent your own colouring.
15.4 Colouring invariants
Solution The tactic is to find a colouring of the chessboard such that any 4 × 1 rectangle
on the board occupies one square of each colour. Of course, this means that we require four
colours, which we will call 0, 1, 2 and 3. Working along the bottom row of the chessboard,
we may as well label the first four squares 0, 1, 2 and 3, in that order. After that, every
square in the row must be coloured according to the repeating pattern 0, 1, 2, 3, 0, 1, 2, 3, and
so on. If we apply the same argument along the columns, we might end up with the following
colouring.
1 2 3 0 1 2 3 0 1 2
0 1 2 3 0 1 2 3 0 1
3 0 1 2 3 0 1 2 3 0
2 3 0 1 2 3 0 1 2 3
1 2 3 0 1 2 3 0 1 2
0 1 2 3 0 1 2 3 0 1
3 0 1 2 3 0 1 2 3 0
2 3 0 1 2 3 0 1 2 3
1 2 3 0 1 2 3 0 1 2
0 1 2 3 0 1 2 3 0 1
We call this a modulo 4 colouring, because if we label the rows and columns 0, 1, 2, . . ., then
the square in row i and column j is coloured i + j modulo 4.
This colouring certainly obeys the rule that a 4 × 1 rectangle on the board always occupies
one square of each colour. Of course, we’re hoping that there are not the same number of
squares of each colour. One way to verify this is to simply count them and you would indeed
find that this is true. However, that is rather pedestrian, so let’s use a slicker, more stylish,
approach.
We simply note that it is quite easy to demonstrate a tiling of the entire board except for
the 2 × 2 square in the top-right corner. The tiled part of the board must certainly contain
the same number of squares of each colour, otherwise we wouldn’t have been able to tile it.
However, the remaining part of the board does not because there is one square coloured 0,
two squares coloured 1, one square coloured 2 and no squares coloured 3. Hence there cannot
be the same number of squares of each colour on the entire chessboard. We conclude that a
10 × 10 chessboard cannot be tiled with 4 × 1 rectangles.
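For readers who prefer the pedestrian approach after all, here is a short Python sketch that counts the squares of each colour in a modulo k colouring. The function name is an assumption. The colouring rule (i + j modulo k, with rows and columns labelled from 0) is the one described above.

```python
from collections import Counter

def colour_counts_mod(k, rows, cols):
    """Count squares of each colour when square (i, j) is coloured (i + j) mod k."""
    return Counter((i + j) % k for i in range(rows) for j in range(cols))
```

For the 10 × 10 board with k = 4, the colours 0, 1, 2 and 3 occur 25, 26, 25 and 24 times, so they are not equally represented. This matches the counts found in the 2 × 2 corner.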
For other problems, you might need to use a modulo n colouring for some other positive
integer n. Or something completely different might be needed! Using the notation (i, j) to
represent the square in row i and column j, other useful colourings include the following.
Colour (i, j) according to (i (mod 2), j (mod 2)). This uses four colours but is different
from the modulo 4 colouring used in the above solution.
15.5 Monovariants
An invariant is something which doesn’t change when you perform a particular move. On
the other hand, a monovariant is a value which always gets larger or always gets smaller
when you perform a particular move. For example, if you keep spending money without ever
earning any, then you will never again have as much money as when you started spending.
Here the monovariant is obviously the amount of money that you have. This idea is crucial
to solving many problems, including the following.
Problem Given some numbers, we may choose two of them, say a and b, and replace them
with the numbers
$$a + \frac{b}{2} \quad\text{and}\quad b - \frac{a}{2}.$$
If we start with a set of non-zero numbers S and keep applying the operation, show that we
can never again obtain the set S.
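We will determine the change in M, the sum of the squares of the numbers, after we replace
two of the numbers, say a and b.
$$\text{change in } M = \left(a + \frac{b}{2}\right)^2 + \left(b - \frac{a}{2}\right)^2 - a^2 - b^2 = \frac{a^2}{4} + \frac{b^2}{4} \ge 0$$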
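The behaviour of the sum of squares can be watched directly. This sketch applies the operation to randomly chosen pairs, using exact fractions, and records the sum of squares after each step. The function names and the choice of starting numbers in the usage note are assumptions.

```python
from fractions import Fraction
import random

def sum_of_squares(nums):
    return sum(x * x for x in nums)

def apply_operation(nums, i, j):
    """Replace nums[i] and nums[j] (playing the roles of a and b) by a + b/2 and b - a/2."""
    a, b = nums[i], nums[j]
    nums[i], nums[j] = a + b / 2, b - a / 2

def track(start, steps, seed):
    """Apply the operation to random pairs and record the sum of squares each time."""
    rng = random.Random(seed)
    nums = [Fraction(x) for x in start]
    history = [sum_of_squares(nums)]
    for _ in range(steps):
        i, j = rng.sample(range(len(nums)), 2)
        apply_operation(nums, i, j)
        history.append(sum_of_squares(nums))
    return history
```

For example, `track([1, -2, 3, 5], 30, 0)` should give a sequence that never decreases, and whose first step is a strict increase because the starting numbers are non-zero.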
2 2 4 4
For some reason, squares often feature in monovariant problems, as they did in the previous
one. Next, we will see how the idea of a monovariant can help when it doesn’t seem like it
should.
Problem A unit fraction is a number of the form $\frac{1}{n}$, where n is a positive integer.
Prove that every rational number between 0 and 1 can be expressed as a sum of finitely many
distinct unit fractions.
Solution We will show that this can be achieved using the greedy algorithm.3 You should
convince yourself that the solution to the problem follows immediately once we have proven
the following statement.
Given a rational number 0 < r < 1, subtract the largest unit fraction less than or
equal to r. Prove that, if we continue to do this, we must eventually reach the
number 0 after finitely many subtractions. Furthermore, we never subtract the
same unit fraction more than once.
3 That is, take as much as you can at each stage.
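The greedy procedure in the statement above can be sketched in a few lines of Python with exact fractions. The function name is an assumption. The sketch also records the numerator (in lowest terms) of what remains after each subtraction, so that the monovariant can be inspected.

```python
from fractions import Fraction

def greedy_unit_fractions(r):
    """Greedy expansion of a rational 0 < r < 1 into unit fractions.

    Returns (denominators, numerators): the denominators m of the unit
    fractions subtracted, and the numerators of the remaining number
    before the first step and after every step.
    """
    r = Fraction(r)
    denominators = []
    numerators = [r.numerator]
    while r > 0:
        # Smallest m with 1/m <= r, that is, m = ceil(denominator / numerator).
        m = -(-r.denominator // r.numerator)
        denominators.append(m)
        r -= Fraction(1, m)
        numerators.append(r.numerator)
    return denominators, numerators
```

For r = 4/13 this gives the denominators 4, 18, 468, so 4/13 = 1/4 + 1/18 + 1/468, and the recorded numerators are 4, 3, 1, 0.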
The idea is to show that the numerator of the rational number is a monovariant which
decreases until we reach 0.
Write the number r as
$$r = \frac{a}{b},$$
where a and b are relatively prime positive integers.
Let the largest unit fraction less than or equal to r be $\frac{1}{m}$, so that
$$\frac{1}{m} \le \frac{a}{b} < \frac{1}{m-1}.$$
Subtracting this unit fraction from r leaves
$$\frac{a}{b} - \frac{1}{m} = \frac{am - b}{bm}.$$
However, the inequality $\frac{a}{b} < \frac{1}{m-1}$ implies that am − b < a.
This means that the numerator of the fraction strictly decreases after each step, until we
eventually reach the number 0.
It should be clear that we never subtract the same unit fraction more than once because
if $\frac{1}{m}$ is the largest unit fraction less than or equal to r, then $r - \frac{1}{m}$ is too small to be able to
subtract $\frac{1}{m}$ again.
15.6 Invariants as cost
Problem Initially, there is a pawn placed in each square in the bottom four rows of an 8 × 8
chessboard. If two pawns are in adjacent squares of the same row, you are allowed to remove
them and add a pawn in the row above.
Is it possible to place a pawn in the top row of the chessboard?
Solution Let’s number the rows in order so that 1 is the lowest row while 8 is the highest
row. The idea behind this problem is as follows: since two pawns in row R can become one
pawn in row R + 1, we should consider the cost of a pawn in row R + 1 to be twice the cost
of a pawn in row R.
So suppose that pawns in row 1 cost $1, pawns in row 2 cost $2, pawns in row 3 cost $4,
and so on, so that pawns in row 8 cost $128. The total value of the pawns initially on the
chessboard is
8 × ($1 + $2 + $4 + $8) = $120.
Since this cost is invariant, it’s impossible to place a pawn in the top row of the chessboard,
which would cost $128.
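The bookkeeping in this argument is small enough to check by machine. The sketch below stores only the number of pawns in each row and ignores which squares they occupy, a simplification that permits at least every move of the real game. The names are illustrative assumptions.

```python
def total_cost(pawns):
    """pawns[k] is the number of pawns in row k + 1; a pawn in row R costs 2**(R - 1) dollars."""
    return sum(count * 2 ** k for k, count in enumerate(pawns))

def promote(pawns, k):
    """Remove two pawns from row k + 1 and add one pawn to row k + 2.

    Simplification: adjacency within the row is not checked.
    """
    if k + 1 >= len(pawns) or pawns[k] < 2:
        raise ValueError("illegal move")
    pawns[k] -= 2
    pawns[k + 1] += 1

INITIAL = [8, 8, 8, 8, 0, 0, 0, 0]   # bottom four rows full
TOP_ROW_COST = 2 ** 7                # cost of one pawn in row 8
```

The initial cost is 120 dollars and `promote` leaves it unchanged, so it can never reach the 128 dollars that a pawn in the top row would require.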
(2 3 1 4)
Note that because permutations are functions, we write the order of composition of permuta-
tions from right-to-left, as shown in the above example.
It turns out that permutations have a parity invariant which is highly useful.
A pair i < j is said to be an inversion if j occurs before i in the permutation. For example,
the permutation (2 3 1 4) has inversions (2, 1) and (3, 1). Thus (2 3 1 4) is an even permutation.
The permutation (1 2 4 3) has (4, 3) as its only inversion and so it is an odd permutation. The
composition (2 4 1 3) has three inversions so it is an odd permutation.
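Counting inversions by hand is error-prone for longer permutations, so here is a minimal Python sketch of the definition. A permutation is given as a tuple of its entries in order. The function names are assumptions.

```python
from itertools import combinations

def inversions(perm):
    """List the pairs (j, i) with i < j such that j occurs before i in perm."""
    return [(perm[p], perm[q])
            for p, q in combinations(range(len(perm)), 2)
            if perm[p] > perm[q]]

def parity(perm):
    """Return 'even' or 'odd' according to the number of inversions."""
    return "even" if len(inversions(perm)) % 2 == 0 else "odd"
```

Applied to the examples in the text, `inversions((2, 3, 1, 4))` gives the pairs (2, 1) and (3, 1), `parity((1, 2, 4, 3))` is odd, and (2, 4, 1, 3) has three inversions.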
The important thing you need to know about permutation parity is the following.
These can be proved via the following exercises which we leave for you to do.
Thus conclude that the parity of a permutation is simply the parity of the number of
transpositions which can be used to express it.5
The following illustrates how permutations and parity may be used to solve a problem quickly.
Problem Andrew, Brenda and Chris are competing in a three-person race. At some point
in the race Andrew is winning, with Brenda coming second and Chris third. From here on
until the end of the race it is noted that there are 36 times when their relative order changes.
The race ends with Brenda finishing third.
If at no point in time did all three draw level, who won the race?
Solution The original order may be notated as [A,B,C] (Andrew first, Brenda second, Chris
third). Each change in the order is a transposition which is an odd permutation. Since there
are 36 such transpositions, the net result is an even permutation of [A,B,C].
We are given that B comes last, so the order is either [A,C,B] or [C,A,B]. But [A,C,B] is an
odd permutation, whereas [C,A,B] is an even permutation. Thus the final order must be
Chris first, Andrew second and Brenda third.
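The argument can be tested by simulating many races. The sketch assumes, as the solution does, that each change in the order is a swap of two runners in adjacent positions. The function names and the random model are ours.

```python
import random

def is_even(order):
    """Parity of an ordering of 'A', 'B', 'C', measured against alphabetical order."""
    inv = sum(1 for p in range(len(order)) for q in range(p + 1, len(order))
              if order[p] > order[q])
    return inv % 2 == 0

def simulate_race(changes, seed):
    """Apply random swaps of adjacent runners, starting from the order A, B, C."""
    rng = random.Random(seed)
    order = ["A", "B", "C"]
    for _ in range(changes):
        k = rng.randrange(2)   # swap positions k and k + 1
        order[k], order[k + 1] = order[k + 1], order[k]
    return order
```

After 36 swaps the order is always an even permutation, so whenever Brenda finishes third the simulated result should be Chris, Andrew, Brenda.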
Either one player wins and the other loses or the game results in a draw.
Problem There are two piles on a table, one containing 1000 coins and the other containing
1001 coins. Two players take turns to remove at least one coin from one of the piles. The
winner is the player who removes the last coin from the table. Since a draw is impossible, the
fundamental theorem of combinatorial games guarantees that there exists a winning strategy
for one of the players.
Which player has a winning strategy?
Solution The easiest way to determine which player has a winning strategy is simply to
come up with a winning strategy which works!
5 This also proves that no matter how we write a permutation as a composition of transpositions (and there are
many different ways of doing this) the parity of the number of transpositions is always the same. Specifically,
it is equal to the parity of the permutation.
In this game, the first player has a winning strategy which we can describe as follows: take
coins from the larger pile to leave two piles of the same size.
For this strategy to work, the first player must always see two piles of different sizes on their
move. This is certainly true for their first move since the piles contain 1000 and 1001 coins.
Subsequently, the second player is always faced with two piles of the same size which they
must turn into two piles of different sizes. So the first player continues to see two piles of
different sizes when it is their turn to move.
But we still have to show why the second player has no chance of winning against this strategy.
The reason for this is that, as we have already mentioned, the second player always leaves
two piles of different sizes. However, to win the game, you must leave two piles of equal size,
both with zero coins. Since the second player can never win the game as long as the first
player follows this plan, what we have described is a winning strategy for the first player.
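To watch the strategy in action, the sketch below pits the first player's equalising strategy against an opponent who moves at random. The random opponent and the function name are assumptions. A random opponent cannot replace the proof, but it is a useful sanity check.

```python
import random

def play(piles, seed):
    """First player equalises the piles; second player moves at random.

    Returns 1 or 2, the player who removes the last coin.
    """
    rng = random.Random(seed)
    a, b = piles
    player = 1
    while True:
        if player == 1:
            # Strategy from the text: take from the larger pile to equalise.
            assert a != b
            a = b = min(a, b)
        else:
            # Random legal move: take at least one coin from a non-empty pile.
            if a > 0 and (b == 0 or rng.random() < 0.5):
                a -= rng.randint(1, a)
            else:
                b -= rng.randint(1, b)
        if a == 0 and b == 0:
            return player
        player = 3 - player
```

Starting from piles of 1000 and 1001 coins, `play` should return 1 for every seed.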
Often, a winning strategy doesn’t present itself very easily but can still be found by using a
technique called position analysis. If you are about to move and you have a strategy which
allows you to win, then we call the current position of the game a winning position. On the
other hand, if you are about to move and you do not have a strategy which allows you to
win, then we call the current position of the game a losing position. One corollary of the
fundamental theorem of combinatorial games is the following extremely useful result.
Theorem In a combinatorial game with no draws, every position of the game can be
categorised as winning or losing. Furthermore, it must be the case that
from a winning position, it is possible to move to a losing position, and
from a losing position, every possible move leads to a winning position.
The idea of position analysis is to use this result to analyse enough small cases until we can
see some general patterns. If all goes well, then we should be able to describe all of the
winning positions, all of the losing positions, and perhaps even a winning strategy.
Problem Initially, there are n coins on a table and two players take turns to remove 1, 2, 3
or 4 coins. A player wins if he removes the last coin.
Find the winning and losing positions.
Solution In this problem, we can describe the position of the game by the number of coins
on the table. It is clear that 1, 2, 3 and 4 are all winning positions because we can simply
remove all of the coins on the table if it is our move.
But what happens when there are 5 coins on the table? Well, the only possibilities are for us
to leave 1, 2, 3 or 4 coins on the table, thereby leaving our opponent in a winning position.
So 5 must be a losing position.
But if 5 is a losing position, then 6, 7, 8 and 9 must all be winning positions. That is because
from these positions, we can remove 1, 2, 3 or 4 coins to leave 5 coins on the table, thereby
leaving our opponent in a losing position.
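Continuing this argument gives us the following table.
Coins     1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
Position  W  W  W  W  L  W  W  W  W  L   W   W   W   W   L
(W = winning position, L = losing position)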
At this stage, it seems like a safe bet that the losing positions are simply the multiples of 5.
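To prove that this is true, all we need to do is demonstrate that
from a non-multiple of 5, it is possible to move to a multiple of 5, and
from a multiple of 5, it is impossible to move to a multiple of 5.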
Both of these statements are obvious! For the first, we use the fact that a non-multiple of 5
must be of the form 5k + 1, 5k + 2, 5k + 3 or 5k + 4 and subtracting 1, 2, 3 or 4 from each of
these, respectively, leaves a multiple of 5.
For the second, we note that the difference between two multiples of 5 is still a multiple of 5
and certainly cannot be 1, 2, 3 or 4.
Note that not only have we determined the winning and losing positions, we have also
uncovered a simple winning strategy: always reduce the number of coins to a multiple of 5.
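The position analysis above can also be carried out mechanically. The following is a minimal sketch, assuming the rules of the problem (remove 1, 2, 3 or 4 coins, and the player who takes the last coin wins); the function name `classify_positions` is our own choice for illustration.

```python
def classify_positions(max_coins, moves=(1, 2, 3, 4)):
    """Return a list `wins` where wins[n] is True if n coins is a winning position."""
    # wins[0] is False: with no coins left, the player to move has already lost.
    wins = [False] * (max_coins + 1)
    for n in range(1, max_coins + 1):
        # n is winning exactly when some legal move leaves a losing position.
        wins[n] = any(m <= n and not wins[n - m] for m in moves)
    return wins


wins = classify_positions(30)
losing = [n for n in range(1, 31) if not wins[n]]
print(losing)  # [5, 10, 15, 20, 25, 30]
```

Changing the `moves` tuple lets you experiment with other versions of the game before trying to guess and prove a pattern.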
Problem Next to a square table is a pile of circular coins, all the same size. Two players
take turns putting a coin on the table so that it doesn’t touch any other coin. The player
who cannot do so loses the game.
Show that the first player can always win.
Solution The first player can win by placing the first coin at the centre of the table. Now,
wherever the second player places their coin, the first player simply copies them by placing
their coin symmetrically opposite.
After each of the first player’s moves, the configuration of coins is symmetric under a 180°
rotation of the table. Thus, whenever the second player can place a coin on the table, the
first player can also do so by placing their coin symmetrically opposite.6 So by using this
copycat strategy, the first player can always win.
Problem Alice and Bob play a game on a large7 grid where they take turns to choose a
square and mark it. Alice moves first and marks squares with an X while Bob marks squares
with an O. They play until one of the players marks a row or a column of five consecutive
squares, and this player wins the game. If no player marks a row or column of five consecutive
squares, then the game is declared a draw.
Show that Bob can prevent Alice from winning.
6 It is possible that a coin may overlap a symmetrically opposite coin. But this occurs only if the coin lies over
the centre of the table. This is ruled out by the first player’s very first move.
7 A 100 × 200 grid is a nice large grid for the purposes of this problem.
Solution Label the board as shown in the diagram, repeating the pattern in all directions.
1 2 3 3 1 2 3 3
1 2 4 4 1 2 4 4
3 3 1 2 3 3 1 2
4 4 1 2 4 4 1 2
1 2 3 3 1 2 3 3
1 2 4 4 1 2 4 4
3 3 1 2 3 3 1 2
4 4 1 2 4 4 1 2
Each square is paired with the neighbouring square that contains the same label.
Suppose that whenever Alice plays, Bob plays in the neighbouring square with the same label.
In this way, Alice can never occupy both squares of such a pair.
But the pairing was chosen very carefully so that any block of five consecutive squares in a
row or column contains such a pair. Needless to say, you should check that this is true for
yourself. Hence, Alice can never mark a row or a column of five consecutive squares.
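If you would like to confirm the check by computer, here is a minimal sketch. It assumes the 4 × 4 block of labels from the diagram repeats in all directions, with labels 1 and 2 paired vertically and labels 3 and 4 paired horizontally; the helper names are ours.

```python
TILE = [
    [1, 2, 3, 3],
    [1, 2, 4, 4],
    [3, 3, 1, 2],
    [4, 4, 1, 2],
]


def label(r, c):
    return TILE[r % 4][c % 4]


def partner(r, c):
    """Return the neighbouring square that carries the same label as (r, c)."""
    lab = label(r, c)
    if lab in (1, 2):   # labels 1 and 2 are paired vertically
        candidates = [(r - 1, c), (r + 1, c)]
    else:               # labels 3 and 4 are paired horizontally
        candidates = [(r, c - 1), (r, c + 1)]
    matches = [p for p in candidates if label(*p) == lab]
    assert len(matches) == 1  # the pairing is unambiguous
    return matches[0]


def contains_pair(squares):
    chosen = set(squares)
    return any(partner(r, c) in chosen for (r, c) in squares)


# By periodicity it is enough to start each block of five in one 4 x 4 tile.
ok = True
for r in range(4):
    for c in range(4):
        horizontal = [(r, c + k) for k in range(5)]
        vertical = [(r + k, c) for k in range(5)]
        ok = ok and contains_pair(horizontal) and contains_pair(vertical)
print(ok)  # True
```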
Since Bob can prevent Alice from winning, the fundamental theorem of combinatorial games
states that he must have a winning strategy or both players can force a draw. In the next
section, we will use a sneaky technique to show that the latter is the case.
Problem Alice and Bob play a game on a large grid where they take turns to choose a
square and mark it. Alice moves first and marks squares with an X while Bob marks squares
with an O. They play until one of the players marks a row or a column of five consecutive
squares, and this player wins the game. If no player marks a row or column of five consecutive
squares, then the game is declared a draw.
Show that Alice can prevent Bob from winning.
Solution The main idea is that Alice’s extra move at the start of the game can never hurt
her chances of winning. This intuition doesn’t constitute a proof in itself, but it can
be turned into one by using the concept of strategy stealing.
In order to obtain a contradiction, suppose that Bob has a winning strategy. In other words,
it is possible to write a book which describes how Bob can win, no matter how Alice plays.
Suppose now that Alice manages to steal this second player’s strategy book and plays as
follows. She simply places her first move randomly on the grid, ignores the fact that she has
moved, and then pretends that she is the second player. She continues to do this until the
book tells her to move in the square where she moved first. Since she has already done this,
she can use this move to mark a different random square on the grid. Continuing in this way,
we see that any book which provides a winning strategy for the second player can be used to
provide a winning strategy for the first player! This blatantly contradicts the fundamental
theorem of combinatorial games. So we must conclude that there is no winning strategy for
the second player at all. In other words, Alice can prevent Bob from winning.
Strategy stealing is brimming with trickery, so have a read through the above solution again
until it’s well understood. After that, you may have a look at the following example of
strategy stealing at its best.
Problem In the game Double Chess, the rules of chess are changed so that White and Black
alternately make two legal moves at a time.
Show that Black doesn’t have a winning strategy.
Solution In order to obtain a contradiction, let us suppose to the contrary that Black does
have a winning strategy. Then it must be the case that whatever White does on the first
move, Black can win from the resulting position.
So what would happen if White started by moving a knight out and then back to the square
that it was originally on? At this stage, the board looks exactly the same as it did initially,
but it’s now Black’s turn to move. Remember that, by assumption, Black can force a win
from this position. But if that is the case, couldn’t White have just mirrored Black’s strategy
from the very beginning of the game? Of course this is possible and so White also seems to
have a winning strategy. This contradicts the fundamental theorem of combinatorial games,
because White and Black cannot both have winning strategies. So our original assumption
must have been wrong and we conclude that Black doesn’t have a winning strategy.
16 Combinatorial geometry
Combinatorial geometry is an exotic hybrid. Usually a combinatorial problem is posed in a
geometric setting. The usual ideas in combinatorics are often insufficient to take into account
extra constraints that come from the geometric setting. But in this chapter we will see ideas
that can be used for such problems.
16.0 Problems
1. Several points lie in the plane such that the area of the triangle formed by any three of
them is no more than 1.
Show that all the points lie in a triangle of area no more than 4.
2. Let K be a set of points in R^3 such that every triangle with vertices in K has a side of
length at most 1.
Prove that there exist two spheres S1 and S2, both of radius 1, such that K ⊆ S1 ∪ S2.
3. Let b, r be positive integers. Suppose we are given 2r red points and 2b blue points in
the plane such that no three points are collinear.
(a) Show there exists a line in the plane with exactly b blue points and r red points
on each side.
(b) Can you generalise this question to three dimensions?
4. Each point on the perimeter of an equilateral triangle is coloured either black or white.
Is it always possible to find three points of the same colour which are also the vertices
of a right-angled triangle?
5. Prove that any convex polygon of area 1 lies inside a rectangle of area 2.
6. Show that for all n ≥ 4 there exists a convex hexagon which can be dissected into n
congruent triangles.
7. (a) Show that for all n ≥ 4, any cyclic quadrilateral can be dissected into n cyclic
quadrilaterals.
(b) What if the original quadrilateral is not cyclic?
8. A Platonic solid is a convex polyhedron in which each vertex is surrounded by the same
number of congruent regular polygons.
Prove that there are exactly five Platonic solids and determine the number of vertices,
edges and faces of each.
9. (a) Let S be a set of n ≥ 3 convex sets in the plane with empty intersection. Show
that there exist three members of S with empty intersection.
(b) Let S be a set of n ≥ 4 convex sets in space with empty intersection. Show that
there exist four members of S with empty intersection.
(These results are known as Helly’s theorem.)
10. There exist rectangles which can be dissected into squares of different sizes.
Can you find an example of this using just nine squares?1
11. Show that it is not possible to dissect a cube into finitely many cubes of different sizes.
12. (a) Show that any polygon of n sides can be cut into triangles.2
(b) Show that this can be done by cutting along just n − 3 diagonals of the polygon
resulting in n − 2 triangles.
(c) There are examples where we can get away with fewer than n − 2 triangles. What
is the least possible number of triangles?
(d) Show that any polyhedron can be dissected into tetrahedra.
13. For any set of points S, we say that S admits distance d, if there are two points in S
such that the distance between them is d.
(a) We wish to colour every point of the plane with finitely many colours in such a way
that no colour admits distance 1. Let χ be the minimal such number of colours for
which this is possible.
Show that 3 ≤ χ ≤ 9.
(b) Can you show that 4 ≤ χ ≤ 7?3
(c) If we only colour the rational points of the plane4 , show that χ = 2.
16. A cube of side length 30 contains 999 blue points in its interior, no four of which are
coplanar. Consider also the eight vertices of the cube which are coloured red.
Among the 1007 coloured points considered, prove that four of them, including at least
one blue point, form the vertices of a tetrahedron whose volume is less than 9.
1 This is called squaring a rectangle. It is also possible to square a square, but doing this requires at least 21
squares.
2 It is the non-convex case that is particularly tricky!
3 The number χ is known as the chromatic number of the plane. As of September 2014, it is still an open
problem to determine the exact value of χ. The bound given here is the best known so far. As for the
corresponding problem in three-dimensional space, it is known that 6 ≤ χ ≤ 15.
4 One can also prove that χ = 2 for the rational points of three-dimensional space.
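17. Determine all integers n ≥ 4 for which there exist n points A1, A2, . . . , An in the plane
and real numbers r1, r2, . . . , rn satisfying the following two conditions.
(i) No three of the points A1, A2, . . . , An are collinear.
(ii) For each triple i, j, k with 1 ≤ i < j < k ≤ n, we have
Area(△Ai Aj Ak) = ri + rj + rk.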
18. A square ABCD is given. A triangulation of the square is a partition of the square into
triangles such that any two triangles are either disjoint, share only a common vertex,
or share only a common side. A good triangulation of the square is a triangulation in
which all the triangles are acute.
19. Find all possible sets S of n ≥ 3 points in the plane such that every perpendicular
bisector of every pair of distinct points of S is an axis of symmetry of S.
20. Each point of the coordinate plane is coloured using finitely many colours. Let O denote
the origin. For each point X different from O, let C(X) be the circle with centre O and
radius OX + α(X)/OX,
where α(X) is the angle measured clockwise in radians that OX makes with the positive
x-axis.
Prove there exists a point Y such that α(Y ) > 0, and the colour of Y appears on the
circle C(Y ).
21. For any set S of five points in the plane, no three of which are collinear, let M (S) and
m(S) denote the largest and smallest areas, respectively, of triangles determined by
three points from S.
What is the minimum possible value of M (S)/m(S)?
22. One is given a finite set of points in the plane, each point having integer coordinates.
Is it always possible to colour some of the points in the set red and the remaining points
white, in such a way that for any straight line L parallel to either one of the coordinate
axes the difference (in absolute value) between the numbers of white points and red
points on L is not greater than 1?
24. Determine whether or not there exist two disjoint infinite sets A and B of points in the
plane satisfying the following two conditions.
(i) No three points of A ∪ B are collinear and the distance between any two is at
least 1.
(ii) There is a point of A in any triangle whose vertices are in B, and there is a point
of B in any triangle whose vertices are in A.
25. We are given n ≥ 2 distinct lines in the plane such that no two lines are parallel and
such that the lines are not all concurrent through a single point.
Prove that there exists a point through which exactly two of the lines pass.5
26. Let n, k be positive integers and let S be a set of n points in the plane for which no
three points of S are collinear, and for every point P of S there are at least k points
of S equidistant from P .
Prove that
k < 1/2 + √(2n).
27. Consider a square of side length a positive integer n. Suppose that there are (n + 1)^2
points in the interior of the square.
Show that three of these points define a (possibly degenerate) triangle of area at most 1/2.
Problem Is it possible to colour every point on a circle using the two colours white and
blue so that there is no isosceles triangle whose vertices all have the same colour?
Solution After experimenting for a while you may be convinced that it is not possible. So
assume for the sake of contradiction that it is possible.
Then certainly we can pick two points A and B of the same colour, say white. If X is the
point on the circle such that the arc lengths XA and AB are equal, then triangle AXB is
isosceles and so X must be blue. Similarly the point Y is blue where Y satisfies AB = BY .
[Figure: the points A, B, X, Y and P marked on the circle.]
Consider now the point P , say, on the minor arc of the circle halfway between A and B. Since
AP = P B, P cannot be white. Furthermore, since XP = Y P , point P cannot be blue.
Thus P cannot be any colour, which is a contradiction.
This proof is not quite complete because it may occur that the points A, B, X, Y and P are
not all distinct. For instance, if ABX forms an equilateral triangle, then X = Y .
We leave it for you to think about how to overcome this problem.
Actually, there is a really short solution to this problem as follows.
Solution Let ABCDE be any regular pentagon inscribed in the circle. Then by the
pigeonhole principle, three of its vertices must be the same colour.
However any three points of a regular pentagon form an isosceles triangle!
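The last claim is easy to believe once you notice that a regular pentagon has only two distinct distances between its vertices. As a sanity check, here is a short sketch that tests all ten triples numerically; the unit circle and the tolerance are assumptions made for the illustration.

```python
import math
from itertools import combinations

# Vertices of a regular pentagon inscribed in the unit circle.
vertices = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
            for k in range(5)]


def is_isosceles(p, q, r, tol=1e-9):
    sides = [math.dist(p, q), math.dist(q, r), math.dist(r, p)]
    return any(abs(sides[i] - sides[j]) < tol
               for i, j in combinations(range(3), 2))


all_isosceles = all(is_isosceles(*triple) for triple in combinations(vertices, 3))
print(all_isosceles)  # True
```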
Problem We are given a set of discs in the plane with pairwise disjoint interiors. Each disc
is tangent to at least six other discs of the family.
Prove that there are infinitely many discs in the set.
Solution Assume that the family is finite. Then there is a disc, D say, of minimal radius r.
Thus there are at least six discs tangent to D, each of radius at least r. However, there is only
room for six such discs around D, and then only if all six have radius exactly r.
We may apply the same argument to each of these discs, thus generating infinitely many discs
of radius r at ever increasing distances from D. This is a contradiction.
Problem A closed6 and bounded7 shape S in the plane has the property that any two
points of S can be connected by a semicircular arc which lies completely in S.
Find all possibilities for the figure S.
Solution Since S is closed and bounded, there exist points A, B ∈ S such that the distance
d = AB is maximal.8 Let α be the semicircular arc lying in S which joins A to B and let γ
be the full circle defined by α.
Let P be any point on α. Then we know that there is a semicircular arc β lying in S which
joins B to P . If β lies outside of γ, then there exists a point on β, whose distance from A is
greater than d.9 This is a contradiction.
[Figure: the semicircular arc α joining A to B, and the arc β joining B to a point P on α.]
Thus β lies inside γ. This is true for all points P ∈ α. Since the set of such β covers the
entire interior of γ, we see that the interior of γ is a subset of S.
Since S is closed, it follows that the boundary of γ is also a subset of S. Finally, since any
point lying outside of γ would give rise to a distance in S greater than d, we conclude that S
is in fact the closed disc described by γ.
6 We say that a set is closed if it contains its boundary. For example, the solid disc {(x, y) | x^2 + y^2 ≤ 1} is
closed. So is the unit circle {(x, y) | x^2 + y^2 = 1}. But the disc {(x, y) | x^2 + y^2 < 1} is not closed because it
is missing some (in fact all) of its boundary, namely the unit circle.
7 We say that a set is bounded if there is a real number R > 0 such that the distance between the origin and
any point of the shape is at most R. The three examples in the previous footnote are all bounded because
all points of those shapes lie within distance 1 of the origin.
8 The existence of two such points is a consequence of the fact that a continuous function on a compact (i.e.
closed and bounded) set achieves a maximal value. In this case the compact set is S × S = {(X, Y) | X, Y ∈ S}
and the function is f : S × S → R, f (X, Y ) = XY .
9 For example, a point which lies very close to B on β would be such an offending point.
Problem Given n points in the plane, no three of which are collinear, show it is possible to
join them up in sequence so that we have a broken line consisting of n − 1 segments, no two
of which cross each other.
Solution There does not seem to be anything obviously extremal to look at here. However,
if you drew a really long path through the points, it seems likely that the path would contain
lots of intersections. A shorter path probably would contain fewer intersections. The shortest
path hopefully would contain none. Let us prove that this is indeed the case.
Let A1 A2 . . . An be a path of minimal length. Suppose that Aj Aj+1 crosses Ai Ai+1 (i < j).
Then consider the quadrilateral Ai Aj Ai+1 Aj+1 . The diagonals intersect at a point P inside
the quadrilateral.
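[Figure: the path before and after replacing the crossing segments Ai Ai+1 and Aj Aj+1 by the segments Ai Aj and Ai+1 Aj+1.]
By the triangle inequality, Ai Aj < Ai P + P Aj and Ai+1 Aj+1 < Ai+1 P + P Aj+1. Adding these gives
Ai Aj + Ai+1 Aj+1 < Ai Ai+1 + Aj Aj+1. Hence the path A1 . . . Ai Aj Aj−1 . . . Ai+1 Aj+1 . . . An,
obtained by reversing the section from Ai+1 to Aj, is shorter than the original path.
This contradicts minimality, so a path of minimal length has no crossings.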
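The claim that a shortest path through the points has no crossings can be tested by brute force on a small example. The sketch below is only an experiment, not a proof: it assumes seven random points (which are in general position with probability 1) and tries every ordering; all function names are ours.

```python
import itertools
import math
import random


def orient(p, q, r):
    """Twice the signed area of triangle pqr."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def path_length(path):
    return sum(math.dist(path[i], path[i + 1]) for i in range(len(path) - 1))


def has_crossing(path):
    segments = list(zip(path, path[1:]))
    return any(segments_cross(*s, *t)
               for s, t in itertools.combinations(segments, 2))


random.seed(1)
points = [(random.random(), random.random()) for _ in range(7)]
shortest = min(itertools.permutations(points), key=path_length)
print(has_crossing(shortest))  # False
```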
16.3 Perturbation
Sometimes objects such as lines or points may not be quite in the position that you want
them. A minuscule jiggle of the configuration can sometimes rectify this. We illustrate this
with another way to approach the previous problem.
Problem Given n points in the plane, no three of which are collinear, show it is possible to
join them up in sequence so that we have a broken line consisting of n − 1 segments, no two
of which cross each other.
Solution The points lie in the x-y plane. So if we label the points P1 , P2 , . . . , Pn according
to increasing x-coordinate, then we could simply join P1 to P2 , P2 to P3 , and so on.
However, it may not be the case that the x-coordinates are all distinct. Surely we can rotate
the configuration in the plane so that all x-coordinates are distinct! Indeed all we have to
do is rotate the configuration so that the y-axis is not parallel to any of the lines formed by
joining all (n choose 2) pairs of points.
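Here is a minimal sketch of the rotation idea. It assumes we may simply try rotation angles in steps of 0.1 radians until the rotated x-coordinates are well separated; the step size, the tolerance and the sample points are arbitrary choices for illustration.

```python
import math
from itertools import combinations


def rotate(p, theta):
    x, y = p
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def path_by_rotation(points, step=0.1, min_gap=1e-9):
    """Order the points by x-coordinate after a rotation making all x distinct."""
    theta = 0.0
    while True:
        xs = sorted(rotate(p, theta)[0] for p in points)
        if all(b - a > min_gap for a, b in zip(xs, xs[1:])):
            break
        theta += step  # jiggle the configuration a little more
    return sorted(points, key=lambda p: rotate(p, theta)[0])


def orient(p, q, r):
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def has_crossing(path):
    segments = list(zip(path, path[1:]))
    return any(orient(a, b, c) * orient(a, b, d) < 0
               and orient(c, d, a) * orient(c, d, b) < 0
               for (a, b), (c, d) in combinations(segments, 2))


# Three pairs of points share an x-coordinate, so sorting by x alone is ambiguous.
points = [(0, 0), (0, 3), (1, 1), (1, 5), (2, 0), (2, 4)]
path = path_by_rotation(points)
print(path)                # [(0, 3), (0, 0), (1, 5), (1, 1), (2, 4), (2, 0)]
print(has_crossing(path))  # False
```

After the rotation, consecutive segments occupy adjacent ranges of x-coordinates, which is why no two of them can cross.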
16.4 Induction
More than likely, a problem for which we can build larger examples out of smaller examples
can be approached by mathematical induction.
Problem Given 1002 distinct points in the plane, we join every pair of points with a line
segment and colour its midpoint red.
Show that there are at least 2001 red points.
Solution We prove by induction the more general statement that for n ≥ 2 points there
are at least 2n − 3 red points.
The result is clearly true for n = 2.
Suppose now that the result is true for n = 2, 3, . . . , m where m ≥ 2. Consider an arrangement
of m + 1 points. By using a perturbation argument we may assume that all points have
distinct x-coordinates. Label them as A1 , A2 , . . . , Am , Am+1 by increasing x-coordinate.
By the inductive assumption we have at least 2m − 3 red points from the midpoints of
A1 , A2 , . . . , Am . Can we find two more red points by using Am+1 ? Yes! The midpoints of
Am+1 Am−1 and Am+1 Am are distinct and both are to the right of all red points considered
so far. Thus we have at least 2m − 1 = 2(m + 1) − 3 red points in all. This completes the
induction.
As an extension, can you determine where equality occurs in the above problem?
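Before or after proving a bound like 2n − 3, it can be reassuring to test it on examples. The sketch below assumes points with integer coordinates and compares coordinate sums instead of midpoints (two midpoints coincide exactly when the sums do), which avoids fractions. It also evaluates one equally spaced collinear configuration, for which the count is exactly 2n − 3.

```python
import random
from itertools import combinations


def count_red_points(points):
    """Number of distinct midpoints of segments joining two of the points."""
    # The midpoint of p and q is (p + q)/2, so distinct sums mean distinct midpoints.
    return len({(p[0] + q[0], p[1] + q[1]) for p, q in combinations(points, 2)})


random.seed(0)
grid = [(x, y) for x in range(10) for y in range(10)]
bound_holds = all(
    count_red_points(random.sample(grid, n)) >= 2 * n - 3
    for n in range(2, 12)
    for _ in range(50)
)
print(bound_holds)  # True

collinear = [(k, 0) for k in range(6)]
print(count_red_points(collinear))  # 9, which equals 2 * 6 - 3
```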
Problem We are given 2n + 1 blue points in the plane such that no three are collinear and
no four are concyclic.
For every pair of blue points A, B, show that there exists a circle passing through A, B with
n − 1 blue points inside it, three points on its boundary and n − 1 blue points outside it.
Solution Consider any two blue points A and B and orient the plane so that AB is vertical.
For each point P on the perpendicular bisector of AB there is an associated circle ΓP having
centre P and passing through A and B. The idea is to consider what happens as P varies
from far to the left of AB to far to the right of AB.
Suppose there are a blue points to the left of the line AB and b blue points to the right of
the line AB. WLOG a ≥ b. Since a + b = 2n − 1, a and b are of opposite parity and so we
must have a > b.
There is a point Pleft , such that the corresponding circle contains in its interior all of the a
blue points located to the left of AB but none of the b blue points to the right of AB. There
is also a point Pright , such that the corresponding circle contains in its interior all of the b
blue points located to the right of AB but none of the a blue points to the left of AB.10
As P varies continuously between Pleft and Pright the circle ΓP will meet the remaining 2n − 1
blue points one by one. When ΓP meets the ith such blue point, let ai be given by
ai = I(ΓP ) − O(ΓP ),
where I(ΓP ) is the number of blue points lying inside ΓP and O(ΓP ) is the number of blue
points lying outside ΓP . Note that ai is even because I(ΓP ) and O(ΓP ) have the same parity
due to I(ΓP ) + O(ΓP ) = 2n − 2.
[Figure: a circle ΓP through A and B, with blue points inside and outside it.]
If the first point that ΓP meets is to the left of AB, then a1 = a − 1 − b. If it is to the right of
AB, then a1 = a − (b − 1). Either way we have a1 ≥ a − b − 1 ≥ 0 because a > b. Similarly,
if the last point that ΓP meets is to the left of AB, then a_{2n−1} = b − (a − 1). If it is to the
right of AB, then a_{2n−1} = b − 1 − a. Either way we have a_{2n−1} ≤ b − a + 1 ≤ 0 because
b < a. To summarise we have a1 ≥ 0 and a_{2n−1} ≤ 0.
What happens when ΓP goes from meeting the ith blue point to meeting the (i + 1)th blue
point?
If both such points are to the left of AB, then I(ΓP ) decreases by 1 while O(ΓP )
increases by 1. Thus ai+1 = ai − 2.
If both points are to the right of AB, then I(ΓP ) increases by 1 while O(ΓP ) decreases
by 1. Thus ai+1 = ai + 2.
If the points are on opposite sides of AB, then I(ΓP ) and O(ΓP ) remain unchanged.
Thus ai+1 = ai .
In all cases we have |ai+1 − ai| ≤ 2. Finally, applying the discrete intermediate value theorem
to the sequence ½a1, ½a2, . . . , ½a_{2n−1} guarantees that ai = 0 for some i. Then the circle
corresponding to this i satisfies the conclusion of the problem.
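The conclusion can be tested experimentally. The sketch below assumes random integer points, resampled until no three are collinear and no four are concyclic. For each pair A, B it looks for a third point C such that the circle through A, B and C has exactly n − 1 points strictly inside. It uses the standard in-circle determinant, and all names are ours.

```python
import random
from itertools import combinations


def orient(p, q, r):
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def in_circle(a, b, c, d):
    """Positive if d is strictly inside the circle abc, negative if outside, 0 if on it."""
    (ax, ay, aw), (bx, by, bw), (cx, cy, cw) = [
        (p[0] - d[0], p[1] - d[1], (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2)
        for p in (a, b, c)
    ]
    det = (ax * (by * cw - bw * cy)
           - ay * (bx * cw - bw * cx)
           + aw * (bx * cy - by * cx))
    return det * orient(a, b, c)  # makes the sign independent of orientation


def general_position(points):
    return (all(orient(*t) != 0 for t in combinations(points, 3))
            and all(in_circle(*q) != 0 for q in combinations(points, 4)))


def random_points(count, rng):
    while True:
        pts = [(rng.randrange(-1000, 1000), rng.randrange(-1000, 1000))
               for _ in range(count)]
        if general_position(pts):
            return pts


def balanced_circle_exists(points, a, b):
    n = (len(points) - 1) // 2
    others = [p for p in points if p not in (a, b)]
    return any(
        sum(in_circle(a, b, c, d) > 0 for d in others if d != c) == n - 1
        for c in others
    )


rng = random.Random(2014)
n = 4
blue = random_points(2 * n + 1, rng)
result = all(balanced_circle_exists(blue, a, b) for a, b in combinations(blue, 2))
print(result)  # True
```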
polygon whose angles are all less than or equal to 180°. The area occupied by this polygon is
called the convex hull of the nails.
The notion of convexity is defined as follows. A set S is convex if for any two points A and B
in S, the whole line segment AB lies entirely in S. It is easily shown that the intersection of
convex sets is also a convex set. The convex hull of a set T is defined to be the intersection of
all convex sets containing T . It is in fact the smallest convex set containing T .
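For a finite set of points, the convex hull can be computed directly. The sketch below uses Andrew's monotone chain method, which is one standard algorithm and not something taken from this chapter; it returns the vertices of the hull in anticlockwise order.

```python
def cross(o, a, b):
    """Positive if the turn o -> a -> b is anticlockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull in anticlockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                # build the lower boundary from left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):      # build the upper boundary from right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


nails = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
print(convex_hull(nails))  # [(0, 0), (4, 0), (4, 3), (0, 3)]
```

The point (2, 1) lies inside the rectangle formed by the other four nails, so it is not a vertex of the hull.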
Solution It is tempting to oversimplify by asserting that the five points form a pentagon
and so one of its interior angles is at least 108°. Although this is true, that interior angle
might also be more than 180°. This is where the convex hull becomes useful.
Consider the perimeter of the convex hull. If it contains all five points, then we may use the
above argument since all interior angles are at most 180°.
If the convex hull is a quadrilateral, divide the quadrilateral into two triangles as shown. The
fifth point must be inside one of those triangles. Using line segments, join that fifth point to
the vertices of the triangle in which it lies. Since these segments define three angles around
the fifth point, none of which exceeds 180°, and whose sum is 360°, it follows that one of
those angles is between 120° and 180°.
If the convex hull is a triangle, again we find a triangle with a point (in fact two points) in its
interior leading to an angle between 120° and 180°.
If the convex hull is a segment, then all points are collinear and so three of them form a 180°
angle.
Problem The square ABCD contains n points P1 , P2 , . . . , Pn in its interior such that no
three of the n + 4 points A, B, C, D, P1 , . . . , Pn are collinear.
(a) Show that it is possible to subdivide the square into triangles in such a way that the
vertices of each triangle are among the n + 4 given points.
(b) Show that the number of resulting triangles in (a) is always the same no matter how
the subdivision is carried out.
Solution
(a) It is easy to establish using induction that a subdivision is always possible. Indeed each
extra point would land inside some triangle. Subdividing this triangle into three further
triangles by joining the interior point to the three vertices loses the original triangle
but creates three smaller triangles. This construction yields 2n + 2 triangles.
Note that such a proof by induction that all subdivisions as described in the problem lead
to the same number of triangles is faulty. This is because not all such subdivisions can be
constructed inductively in this way from subdivisions with fewer triangles.11 So instead for
part (b) we resort to Euler’s formula along with a counting argument.
(b) Let T be the number of triangles. Then F = T + 1 because of the infinitely large outside
face that has four sides. Each edge belongs to two faces. Since each triangle has three
edges and the outside face has four edges we have
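E = (3T + 4)/2.
Since there are V = n + 4 vertices, Euler’s formula V − E + F = 2 gives
(n + 4) − (3T + 4)/2 + (T + 1) = 2, and so T = 2n + 2, independent of the subdivision.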
Solution Divide the triangle up into four equilateral triangles of area 1. Since we have nine
points in total, by the pigeonhole principle at least three of these points lie inside or on the
boundary of one of these four triangles and thus define a triangle of area at most 1.
In fact more is true! We can sharpen the result from 1 to 4/13 as follows.
Suppose that we place one of the six points inside the triangle, and use this point to subdivide
the triangle into three smaller triangles. Next place a second point of the six points. This
will fall inside one of the three smaller triangles and we can use our second point to subdivide
this smaller triangle into three even smaller triangles, making five triangles in all. Continuing
in this fashion placing one point at a time until all six points are placed results in the original
triangle being subdivided into 13 triangles.
Thus one of these triangles has area at most 4/13.
11 There are many examples of this. Can you find one?
16.9 Colouring
Questions about colouring points in the plane often use the pigeonhole principle.
Problem Let S be a disc. The points of S are painted in finitely many colours.
Show that for every n ≥ 3 there exist infinitely many congruent polygons with n sides
contained in S such that all of them have their vertices painted in the same single colour.
Solution This problem initially seems quite daunting. The number of colours and the value
of n are allowed to be quite large. The best way to start is to examine simple cases. We
begin with the simplest case where we have only two colours and our congruent polygons are
triangles.
Consider any regular pentagon P , lying entirely in the interior of S. By the pigeonhole
principle three of the vertices of P have the same colour. However, there are infinitely many
disjoint translates of the set of five vertices of P which also lie in S. By the preceding
argument each of these contains a monochromatic triangle. Thus we have infinitely many
such monochromatic triangles.
But the triangles we are considering only come in one of two congruence types. Thus applying
the infinite pigeonhole principle, one of these congruence types contains infinitely many
monochromatic triangles.
We now have infinitely many congruent monochromatic triangles. Finally, a second application
of the infinite pigeonhole principle allows us to conclude that one of the two colours contains
infinitely many such congruent monochromatic triangles.
We leave it to the reader to work out how to solve the problem in its full generality as stated.
For m colours, the issue is basically to find a number k so that whenever P has k vertices
there are always n vertices of P of the same colour.
A couple of comments are in order here. First, P did not have to be regular. In the case
P is not regular, we have perhaps up to (5 choose 3) = 10 different congruence types of triangle for
each translate of P . But that is still finitely many. Second, why did we choose P to be a
pentagon? The answer is that a pentagon has enough vertices so that if we colour them using
two colours, a monochromatic triangle always appears. So P could have had any number of
vertices greater than or equal to 5 and the proof would still work.
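The pigeonhole step can be checked exhaustively for small cases. The sketch below tries every colouring of k vertices; the function name `forces_monochromatic` is our own, and the search is only practical for small values of k and few colours.

```python
from collections import Counter
from itertools import product


def forces_monochromatic(k, colours, n):
    """True if every colouring of k vertices with `colours` colours
    contains n vertices of the same colour."""
    return all(max(Counter(colouring).values()) >= n
               for colouring in product(range(colours), repeat=k))


print(forces_monochromatic(5, 2, 3))  # True: a pentagon always works
print(forces_monochromatic(4, 2, 3))  # False: colour two vertices each way
```

Experimenting with more colours or a larger n is one way to get a feel for the number k needed in the general problem.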
17 Appendices
Basic arithmetic
Computations with numbers involving i can be done by treating i as an unknown quantity,
but replacing i^2 with −1 every time it occurs.1
Complex numbers include ones such as 2 − 3i. This lies 3 units directly below the number 2.
In fact considering all expressions of the form a + bi (a, b ∈ R), we fill up the entire plane.
This is called the complex plane.
Addition and subtraction are very easy to do. For example,
since i^2 = −1.
Division is easy once you think of rationalising the denominator. For example,
One of the amazing things about complex numbers is the fundamental theorem of algebra.
This states that any non-constant polynomial with complex coefficients has at least one
complex root.
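The arithmetic rules described above can be tried out on a computer. The sketch below assumes the two sample numbers 2 − 3i and 1 + 2i, chosen purely for illustration. It multiplies them by expanding brackets and replacing i^2 with −1, divides by rationalising the denominator, and compares both results with Python's built-in complex type, which writes i as `j`.

```python
def multiply(a, b, c, d):
    """(a + bi)(c + di) = ac + adi + bci + bd i^2 = (ac - bd) + (ad + bc)i."""
    return (a * c - b * d, a * d + b * c)


def divide(a, b, c, d):
    """(a + bi)/(c + di): multiply top and bottom by the conjugate c - di."""
    real, imag = multiply(a, b, c, -d)
    denominator = c * c + d * d  # (c + di)(c - di) is a real number
    return (real / denominator, imag / denominator)


z = 2 - 3j
w = 1 + 2j

print(z + w)                   # (3-1j)
print(multiply(2, -3, 1, 2))   # (8, 1), that is, 8 + i
print(z * w)                   # (8+1j)

quotient = divide(2, -3, 1, 2)
print(abs(complex(*quotient) - z / w) < 1e-12)  # True
```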
1 A similar thing is done when you are working with surds such as √2. You treat it as an unknown quantity
except that you can replace (√2)^2 with 2.
To deal with quantities like $\sqrt{-1}$ we have to go beyond the real numbers and invent i. However,
for the quantities like $\sqrt{i}$ and $\sqrt{-i}$ we don’t need to invent any more numbers. They are
roots of the polynomials $x^2 - i = 0$ and $x^2 + i = 0$, respectively. The fundamental theorem of
algebra guarantees that both these polynomials have roots. In fact you might like to check by
hand that
$$\sqrt{i} = \pm\frac{1+i}{\sqrt{2}} \quad\text{and}\quad \sqrt{-i} = \pm\frac{i-1}{\sqrt{2}}.$$
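For readers who want to compare with their own working, the check takes one line for each root. We expand the square and replace $i^2$ with $-1$, exactly as described under basic arithmetic:

```latex
\[
\left(\frac{1+i}{\sqrt{2}}\right)^2 = \frac{1 + 2i + i^2}{2} = \frac{2i}{2} = i,
\qquad
\left(\frac{i-1}{\sqrt{2}}\right)^2 = \frac{i^2 - 2i + 1}{2} = \frac{-2i}{2} = -i.
\]
```

The negatives of these two numbers have the same squares, which accounts for the $\pm$ signs.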
So far we have considered complex numbers to be of the form a + bi (a, b ∈ R). This is called
the Cartesian form. The real numbers a and b basically tell us the x-coordinate and the
y-coordinate of the complex number so we can plot it in the complex plane if we wish.
Polar form
There is another form that is very useful called the polar form. Take a non-zero complex
number z = a + bi in Cartesian form. Draw a line segment from the number 0 to the number z
in the complex plane. Using Pythagoras’ theorem, the length r of this segment is $r = \sqrt{a^2 + b^2}$
and the angle θ the segment makes when measured anticlockwise from the positive x-axis
satisfies a = r cos θ and b = r sin θ.
[Figure: the complex number z = a + bi plotted in the complex plane, showing a, bi and the angle θ.]
The elements r and θ make up the polar form and we can write
then
$z_1 z_2 = r_1 r_2 \operatorname{cis}(\theta_1 + \theta_2)$.
This property is not at all obvious but it is most important. If you want to try and prove it
(it’s not that hard), you might find the compound angle formulas for sin(x + y) and cos(x + y)
helpful.
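Before attempting the proof, it can be reassuring to test the rule numerically. The Python sketch below assumes that cis θ abbreviates cos θ + i sin θ, which is what the standard library function `cmath.rect` computes. The two sample numbers are our own choice.

```python
import cmath
import math

z1 = 1 + 1j                    # r1 = sqrt(2), theta1 = pi/4
z2 = -1 + math.sqrt(3) * 1j    # r2 = 2,       theta2 = 2*pi/3

r1, theta1 = cmath.polar(z1)   # length and angle (in radians)
r2, theta2 = cmath.polar(z2)

# Multiply the lengths and add the angles, then convert back to Cartesian form.
via_polar = cmath.rect(r1 * r2, theta1 + theta2)
direct = z1 * z2

print(via_polar, direct)
print(cmath.isclose(via_polar, direct))
```

The two printed products agree up to rounding error. This is only a spot check on one pair of numbers, not a proof.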
17.1 How do complex numbers work?
Here θ is measured, not in degrees, but in radians.³ Once you know this amazing result it is
very easy to prove the rule for multiplying complex numbers in polar form. In fact you can
also prove the trigonometric compound angle formulas from this!
Here is an example of the power of the polar form. Suppose we are searching for complex
numbers z such that
$z^3 = 1$,
that is, cube roots of 1. Write $z = re^{i\theta}$. Then we have
$r^3 e^{3i\theta} = 1$.
The polar form for 1 has |1| = 1 and arg(1) = 0. Thus we have $r^3 = 1$ and 3θ = 0 and so
r = 1 and θ = 0.
However, we have missed something, namely, that the arg function is defined modulo 2π (in
radians). Thus we also need to check 3θ = ±2π, ±4π, ±6π, …. These lead to just two other
values (modulo 2π), namely $\theta = \frac{2\pi}{3}$ and $\theta = \frac{4\pi}{3}$.
So we have not one but three cube roots of 1, two of which are non-real complex numbers. If
you plot all three cube roots in the complex plane, you will see that they form the vertices
of an equilateral triangle inscribed in the unit circle. As a curiosity try computing their
Cartesian forms.
In general if n is a positive integer, the n roots of $z^n = 1$ form the vertices of a regular n-gon
inscribed in the unit circle.
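The roots can also be generated by computer. The following Python sketch lists the solutions of $z^n = 1$ as $e^{2\pi i k/n}$ for k = 0, 1, ..., n − 1, and checks each one by raising it to the n-th power. The function name `roots_of_unity` is ours. Readers who want to find the Cartesian forms of the cube roots by hand should do so before running it.

```python
import cmath
import math

def roots_of_unity(n):
    """Return the n complex solutions of z**n = 1 as e^(2*pi*i*k/n), k = 0..n-1."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

for z in roots_of_unity(3):
    # Each cube should equal 1 up to rounding error.
    print(f"{z.real:+.4f} {z.imag:+.4f}i   cube equals 1: {cmath.isclose(z**3, 1)}")
```

Calling `roots_of_unity(n)` for other values of n and plotting the results gives the regular n-gon described above.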
² You might be wondering, ‘How do I even compute e to the power of i?’ Unfortunately, a discussion of what
this even means would take us too far afield. An internet search could be most illuminating.
³ Note that 360 degrees is equal to 2π radians.
$f : \mathbb{N}^+ \to \mathbb{R}$,
this means that f is a function defined for all elements of $\mathbb{N}^+$ (the set of positive integers)
and taking values in the set $\mathbb{R}$ (the set of real numbers). An example of such a function is
f(x) = −x.
The set that appears immediately after the colon ($\mathbb{N}^+$ in our example) is called the domain
of the function.
The set that appears after the arrow ($\mathbb{R}$ in our example) is called the codomain.
The set of values that f actually achieves (in our case it is the set of negative integers) is
called the image.
Note that if we tried to define the function $f : \mathbb{N}^+ \to \mathbb{N}^+$ satisfying f(x) = −x, then no such
function exists! Indeed changing the domain or codomain can drastically alter things.
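A small Python sketch can make the distinction concrete. Since the domain is infinite, we only sample its first ten elements. The names `domain_sample` and `image_sample` are ours.

```python
def f(x):
    return -x

domain_sample = range(1, 11)                  # the first ten positive integers
image_sample = [f(x) for x in domain_sample]  # the values f actually achieves

print(image_sample)
# No value is a positive integer, so the positive integers cannot serve as codomain.
print(any(y >= 1 for y in image_sample))      # False
```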
17.3 Directed angles
All directed angles are considered modulo 180°. In algebraic terms this means that
∠(m, n) = ∠(m, n) + 180°.
∠(n, m) = −∠(m, n).
As an example, here is how you can prove part of the generalised pivot theorem as shown in
section 6.5. We shall prove that if D, E and F lie on lines BC, AC and AB, respectively,
and if P is the second intersection point of circles AEF and CDE, then P also lies on circle
DBF .
But EC and EA both define the same line, thus we may deduce that
It only remains to notice that F A and F B define the same line as do DC and DB. Thus
Of course in reality you would have solved this question by doing an angle chase on the
diagram. But in the write-up, using directed angles helps you to avoid having to deal with
any case distinctions, such as what happens if P is outside of triangle ABC or if any of the
points D, E or F are on the extensions of the sides of the triangle.
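The claim that no case distinctions are needed can be tested numerically. The Python sketch below is a sanity check of the statement, not a proof, and does not use directed angles at all. It represents points as complex numbers, places D, E and F anywhere on the lines BC, AC and AB, constructs P, and measures how far P is from circle DBF. The function names, the parameters d, e, f and the sample triangle are all our own choices.

```python
def circumcentre(a, b, c):
    """Circumcentre of the triangle with vertices a, b, c (complex numbers)."""
    b, c = b - a, c - a  # translate so that a sits at the origin
    o = b * c * (b.conjugate() - c.conjugate()) / (b.conjugate() * c - b * c.conjugate())
    return a + o

def reflect(p, u, v):
    """Reflect the point p in the line through u and v."""
    return u + (v - u) * ((p - u) / (v - u)).conjugate()

def pivot_gap(A, B, C, d, e, f):
    """Distance by which P misses circle DBF; d, e, f place D, E, F on BC, AC, AB."""
    D = B + d * (C - B)
    E = A + e * (C - A)
    F = A + f * (B - A)
    o1 = circumcentre(A, E, F)
    o2 = circumcentre(C, D, E)
    # Both circles pass through E, so their second common point is the
    # mirror image of E in the line joining the two centres.
    P = reflect(E, o1, o2)
    o3 = circumcentre(D, B, F)
    return abs(abs(P - o3) - abs(D - o3))

A, B, C = 0j, 4 + 0j, 1 + 3j
print(pivot_gap(A, B, C, 0.3, 0.6, 0.25) < 1e-9)  # D, E, F on the sides
print(pivot_gap(A, B, C, 1.5, -0.4, 2.0) < 1e-9)  # D, E, F on the extensions
```

Parameter values between 0 and 1 put a point on the side itself, while values outside that range put it on an extension, so both kinds of configuration are covered by the same code.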
⁴ This one is especially useful because depending on the order of A, B, C, D around the circle you either have
the classic ‘bow tie’ configuration (angles standing on the same arc) or you have that opposite angles in a
convex quadrilateral add to 180°.
⁵ This is a limiting case of the preceding property.
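Area Formulas
$$
\begin{aligned}
\triangle &= \tfrac{1}{2} \times \text{base} \times \text{height} \\
&= rs \\
&= \frac{abc}{4R} \\
&= \sqrt{s(s-a)(s-b)(s-c)} \quad \text{(Heron’s formula)} \\
&= \tfrac{1}{2}\, bc \sin\alpha.
\end{aligned}
$$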
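To see the five expressions agree, we can evaluate them on the 3-4-5 right triangle. The Python sketch below assumes the usual meanings of the symbols, which are not spelled out in this section: s is the semiperimeter, r the inradius, R the circumradius, and α the angle opposite side a. For this triangle we take r = 1 and R = 5/2 as known values.

```python
import math

# A 3-4-5 right triangle: a = 5 is the hypotenuse, so alpha (opposite a) is 90 degrees.
a, b, c = 5.0, 4.0, 3.0
alpha = math.pi / 2
s = (a + b + c) / 2   # semiperimeter
r = 1.0               # inradius of the 3-4-5 triangle (assumed known)
R = a / 2             # circumradius: half the hypotenuse of a right triangle

areas = {
    "half base times height": 0.5 * b * c,  # the two legs are perpendicular
    "rs": r * s,
    "abc / 4R": a * b * c / (4 * R),
    "Heron": math.sqrt(s * (s - a) * (s - b) * (s - c)),
    "half bc sin(alpha)": 0.5 * b * c * math.sin(alpha),
}
for name, value in areas.items():
    print(f"{name}: {value}")
```

Every line prints the same area, namely 6.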
[Figure: triangle ABC with a cevian AD of length d, where D lies on BC. Here AC = b, AB = c, CD = x and DB = y.]
If AD is the internal angle bisector, then we have the angle bisector theorem
$$\frac{b}{c} = \frac{x}{y}$$
as well as
$$x = \frac{ab}{b+c}, \qquad y = \frac{ac}{b+c}, \qquad d^2 = bc - xy.$$
If AD is the median, then we have Apollonius’ theorem.
$$2d^2 + 2\left(\frac{a}{2}\right)^2 = b^2 + c^2$$
If AD is a general cevian, then we have Stewart’s theorem.
$$b^2 y + c^2 x = a(xy + d^2)$$
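These identities are easy to test with coordinates. The Python sketch below assumes the labelling CD = x, DB = y, AD = d, with a, b, c the sides opposite A, B, C. The sample triangle and the helper name `cevian` are our own. It uses `math.dist`, which needs Python 3.8 or later.

```python
import math

# Coordinates chosen for illustration: C at the origin, B on the x-axis.
C, B, A = (0.0, 0.0), (6.0, 0.0), (1.0, 3.0)
a, b, c = math.dist(B, C), math.dist(A, C), math.dist(A, B)

def cevian(x):
    """Return (x, y, d) for the point D on BC with CD = x."""
    D = (x, 0.0)
    return x, a - x, math.dist(A, D)

# Stewart's theorem for an arbitrary cevian.
x, y, d = cevian(2.0)
stewart_ok = math.isclose(b**2 * y + c**2 * x, a * (x * y + d**2))

# Angle bisector: D is placed at x = ab/(b + c), dividing BC in the ratio b : c.
x, y, d = cevian(a * b / (b + c))
bisector_ok = math.isclose(y, a * c / (b + c)) and math.isclose(d**2, b * c - x * y)

# Median: D is the midpoint of BC.
x, y, d = cevian(a / 2)
apollonius_ok = math.isclose(2 * d**2 + 2 * (a / 2) ** 2, b**2 + c**2)

print(stewart_ok, bisector_ok, apollonius_ok)
```

Changing the coordinates of A, or the value passed to `cevian` in the Stewart check, gives further test cases.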
Index