Mathematics Magazine


EDITORIAL POLICY

Mathematics Magazine aims to provide lively and appealing mathematical
exposition. The Magazine is not a research journal, so the terse style
appropriate for such a journal (lemma-theorem-proof-corollary) is not
appropriate for the Magazine. Articles should include examples, applications,
historical background, and illustrations, where appropriate. They should be
attractive and accessible to undergraduates and would, ideally, be helpful in
supplementing undergraduate courses or in stimulating student investigations.
Manuscripts on history are especially welcome, as are those showing
relationships among various branches of mathematics and between mathematics
and other disciplines.
A more detailed statement of author guidelines appears in this Magazine,
Vol. 83, at pages 73–74, and is available at the Magazine's website
www.maa.org/pubs/mathmag.html. Manuscripts to be submitted should not be
concurrently submitted to, accepted for publication by, or published by
another journal or publisher.
Please submit new manuscripts by email directly to the editor at
[email protected]. A brief message containing contact information and with
an attached PDF file is preferred. Word-processor and DVI files can also be
considered. Alternatively, manuscripts may be mailed to Mathematics Magazine,
132 Bodine Rd., Berwyn, PA 19312-1027. If possible, please include an email
address for further correspondence.
Cover image by Samia Khalaf, assisted by
Jason Challas. Samia, working her way to-
wards a career in art and design, is an an-
imation student at West Valley College in
Saratoga, California, where Jason teaches.
As noted on page 169, all animal transfor-
mations are completely reversible. Page 239
art by Susan Stromquist.
MATHEMATICS MAGAZINE (ISSN 0025-570X) is pub-
lished by the Mathematical Association of America at
1529 Eighteenth Street, N.W., Washington, D.C. 20036
and Hanover, PA, bimonthly except July/August.
The annual subscription price for MATHEMATICS MAGAZINE to an individual
member of the Association is $131. Student and unemployed members receive a
66% dues discount; emeritus members receive a 50% discount; and new members
receive a 20% dues discount for the first two years of membership.
Subscription correspondence and notice of change
of address should be sent to the Membership/
Subscriptions Department, Mathematical Association
of America, 1529 Eighteenth Street, N.W., Washington,
D.C. 20036. Microfilmed issues may be obtained from
University Microfilms International, Serials Bid Coordinator,
300 North Zeeb Road, Ann Arbor, MI 48106.
Advertising correspondence should be addressed to
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (866) 821-1221
Fax: (202) 387-1208
E-mail: [email protected]
Further advertising information can be found online at
www.maa.org
Change of address, missing issue inquiries, and other
subscription correspondence:
MAA Service Center, [email protected]
All at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036
Copyright © by the Mathematical Association of America (Incorporated), 2010,
including rights to this journal issue as a whole and, except where otherwise
noted, rights to each individual contribution. Permission to make copies of
individual articles, in paper or electronic form, including posting on
personal and class web pages, for educational and scientific use is granted
without fee provided that copies are not made or distributed for profit or
commercial advantage and that copies bear the following copyright notice:
Copyright the Mathematical Association
of America 2010. All rights reserved.
Abstracting with credit is permitted. To copy otherwise, or to republish,
requires specific permission of the MAA's Director of Publication and
possibly a fee.
Periodicals postage paid at Washington, D.C. and additional mailing offices.
Postmaster: Send address changes to Membership/
Subscriptions Department, Mathematical Association
of America, 1529 Eighteenth Street, N.W., Washington,
D.C. 20036-1385.
Printed in the United States of America
Vol. 83, No. 3, June 2010

MATHEMATICS
MAGAZINE
EDITOR
Walter Stromquist
ASSOCIATE EDITORS
Bernardo M. Ábrego
California State University, Northridge
Paul J. Campbell
Beloit College
Annalisa Crannell
Franklin & Marshall College
Deanna B. Haunsperger
Carleton College
Warren P. Johnson
Connecticut College
Victor J. Katz
University of District of Columbia, retired
Keith M. Kendig
Cleveland State University
Roger B. Nelsen
Lewis & Clark College
Kenneth A. Ross
University of Oregon, retired
David R. Scott
University of Puget Sound
Paul K. Stockmeyer
College of William & Mary, retired
Harry Waldman
MAA, Washington, DC
LETTER FROM THE EDITOR
The cover refers to Mad Vet puzzles, in which animals are transformed into
other animals. These puzzles are the starting point for the article by Gene
Abrams and Jessica Sklar in this issue. They show how each of these puzzles is
related to a particular semigroup. Understand the semigroup and solve the
puzzle! From there they find connections to graph theory and to current
research.
Other animals (some horses, but also beasts like Lebesgue measure) take the
stage when Julia Barnes and Lorelei Koss invite us to their carnival. It is a
carnival of mappings, exploring the implications of G. D. Birkhoff's Ergodic
Theorem.
Ever drill a hole through the center of a sphere? In calculus problems,
perhaps. Vincent Coll and Jeff Dodd consider what other solids you might drill
through instead. The diameters of the Earth and of a hydrogen atom are
mentioned.
Danielle Arett and Suzanne Dorée tell us about Tower of Hanoi graphs. They
explore properties of these graphs and use them to derive combinatorial
identities. Arett was Dorée's student at Augsburg College when this work
began.
In the Notes section, Todd Will gives us a definitive treatment of a
sums-of-squares problem, partly by combining (and sometimes reconciling) old
results. There are also pieces by Ron Hirshon on random walks with barriers
(or gambling games, if we prefer), Christopher Frayer on polynomial root
squeezing, and Alexander Kheifets and James Propp on integration by parts. At
the back of the issue are problems, solutions, and results from the 50th
International Mathematical Olympiad.
But let us begin with some beginnings. Ko-Wei Lih introduces us to a magic
square from 18th-century Korea, long before Euler's work on the latin squares.
Could Choe's square have influenced Benjamin Franklin? He would surely have
been interested, and it was in print before he was ten years old.
Walter Stromquist, Editor
ARTICLES
A Remarkable Euler Square before Euler
KO-WEI LIH
Institute of Mathematics
Academia Sinica
Nankang, Taipei 115, Taiwan
[email protected]
Orthogonal Latin squares and Choe's configuration
A Latin square of order $n$ is formed when the cells of an $n \times n$ square
array are filled with elements taken from a set of cardinality $n$ so that all
cells along any row or any column are occupied with distinct elements. A
notion of orthogonality between two Latin squares can be defined as follows.
We may juxtapose two Latin squares $A$ and $B$ of order $n$ into one square
array so that each cell is occupied with an ordered pair, first component from
$A$ and second component from $B$. When all $n^2$ of these ordered pairs are
distinct, we say that $A$ is orthogonal to $B$. Obviously, this orthogonality
relation is symmetric. The juxtaposition of two orthogonal Latin squares is
called a Graeco-Latin square by Euler, who was the first to study the
properties of Latin and Graeco-Latin squares in a short paper [2] written in
1776. His motivation was to produce magic squares from Graeco-Latin squares.
We call a Graeco-Latin square an Euler square in this article.
A magic square of order $n$ is an arrangement of the numbers
$1, 2, \ldots, n^2$ into an $n \times n$ square array so that the sum of
numbers along any row, any column, or either of the two main diagonals is
equal to the fixed number $n(n^2 + 1)/2$.
To make things simpler, we always suppose that a Latin square of order $n$ is
filled with numbers from the set $\{1, 2, \ldots, n\}$. Euler used the simple
algorithm of mapping the pair $(x, y)$ into the number $n(x - 1) + y$ to
convert a Graeco-Latin square of order $n$ into an array of order $n$. We call
this mapping the canonical mapping in the sequel. It is easy to see that the
range of this mapping is the set $\{1, 2, \ldots, n^2\}$ and the sum of
numbers along any row or column of the array is $n(n^2 + 1)/2$. If we can
arrange to have both main diagonals sum to $n(n^2 + 1)/2$, then a magic square
is produced.
The highest order of an Euler square explicitly constructed in [2] is five.
The following is an example from [2] in matrix form with entry $xy$
representing a pair $(x, y)$ in the Euler square. Applying the canonical
mapping to this square, we obtain the magic square on the right.

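\[
\begin{pmatrix}
34 & 45 & 51 & 12 & 23 \\
25 & 31 & 42 & 53 & 14 \\
11 & 22 & 33 & 44 & 55 \\
52 & 13 & 24 & 35 & 41 \\
43 & 54 & 15 & 21 & 32
\end{pmatrix}
\qquad
\begin{pmatrix}
14 & 20 & 21 & 2 & 8 \\
10 & 11 & 17 & 23 & 4 \\
1 & 7 & 13 & 19 & 25 \\
22 & 3 & 9 & 15 & 16 \\
18 & 24 & 5 & 6 & 12
\end{pmatrix}
\]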
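The canonical mapping is easy to check by machine. The following Python sketch applies it to the order-five example and tests the magic property; the helper names `canonical_map` and `is_magic` are ours, introduced only for illustration, and the code assumes entries are stored as two-digit integers $xy$.

```python
# Euler's 5x5 square from the text; the entry "xy" stands for the pair (x, y).
EULER_5 = [
    [34, 45, 51, 12, 23],
    [25, 31, 42, 53, 14],
    [11, 22, 33, 44, 55],
    [52, 13, 24, 35, 41],
    [43, 54, 15, 21, 32],
]

def canonical_map(square):
    """Send each two-digit entry xy to n*(x - 1) + y, where n is the order."""
    n = len(square)
    return [[n * (e // 10 - 1) + e % 10 for e in row] for row in square]

def is_magic(square):
    """True if all rows, columns and both main diagonals sum to n(n^2 + 1)/2."""
    n = len(square)
    target = n * (n * n + 1) // 2
    sums = [sum(row) for row in square]
    sums += [sum(col) for col in zip(*square)]
    sums.append(sum(square[k][k] for k in range(n)))
    sums.append(sum(square[k][n - 1 - k] for k in range(n)))
    return all(s == target for s in sums)

magic_5 = canonical_map(EULER_5)
```

Here `magic_5` reproduces the right-hand array, and `is_magic(magic_5)` confirms the common sum 65.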
Math. Mag. 83 (2010) 163–167. doi:10.4169/002557010X494805. © Mathematical Association of America
Orthogonal Latin squares have been known to predate Euler in Europe. A
comprehensive history of Latin squares can be found in [1]. However, it is
surprising that an Euler square of order higher than five was already in
existence in the Orient, prior to Euler's paper. In a Korean mathematical
treatise Kusuryak (Summary of the Nine Branches of Numbers) written by Choe
Sŏk-chŏng (1646–1715), an Euler square of order nine appeared. Choe, a
Confucian scholar and at one time the prime minister of the Choson Dynasty,
wrote his treatise presumably after his retirement in 1710. Figure 1 is a
facsimile of the pages copied from [5] (vol. 1, pp. 698–699) exhibiting Choe's
configurations. The $9 \times 9$ square on the right is our main concern in
this note. (The square begins with the rightmost column on the left-hand page
and extends over most of the right-hand page.)
Figure 1. A facsimile of Choe's configurations
The reader is referred to [3] and [4] for background information on the
history of Korean mathematics. Choe's treatise was entirely written in Chinese
characters. He did not reveal any clue as to how he arrived at his
configurations. A modern matrix form $M$ of his square is displayed as follows.
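\[
M =
\begin{pmatrix}
51 & 63 & 42 & 87 & 99 & 78 & 24 & 36 & 15 \\
43 & 52 & 61 & 79 & 88 & 97 & 16 & 25 & 34 \\
62 & 41 & 53 & 98 & 77 & 89 & 35 & 14 & 26 \\
27 & 39 & 18 & 54 & 66 & 45 & 81 & 93 & 72 \\
19 & 28 & 37 & 46 & 55 & 64 & 73 & 82 & 91 \\
38 & 17 & 29 & 65 & 44 & 56 & 92 & 71 & 83 \\
84 & 96 & 75 & 21 & 33 & 12 & 57 & 69 & 48 \\
76 & 85 & 94 & 13 & 22 & 31 & 49 & 58 & 67 \\
95 & 74 & 86 & 32 & 11 & 23 & 68 & 47 & 59
\end{pmatrix}
\]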



Hong-Yeop Song has called attention to this square in [6]. As observed in [6], the
following square is obtained when the canonical mapping is applied to M.
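\[
\begin{pmatrix}
37 & 48 & 29 & 70 & 81 & 62 & 13 & 24 & 5 \\
30 & 38 & 46 & 63 & 71 & 79 & 6 & 14 & 22 \\
47 & 28 & 39 & 80 & 61 & 72 & 23 & 4 & 15 \\
16 & 27 & 8 & 40 & 51 & 32 & 64 & 75 & 56 \\
9 & 17 & 25 & 33 & 41 & 49 & 57 & 65 & 73 \\
26 & 7 & 18 & 50 & 31 & 42 & 74 & 55 & 66 \\
67 & 78 & 59 & 10 & 21 & 2 & 43 & 54 & 35 \\
60 & 68 & 76 & 3 & 11 & 19 & 36 & 44 & 52 \\
77 & 58 & 69 & 20 & 1 & 12 & 53 & 34 & 45
\end{pmatrix}
\]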
Choe's square $M$ is a juxtaposition of the following two Latin squares $L$
and $R$. We write $M = L \bowtie R$, where $\bowtie$ is a notation for the
juxtaposition operation.
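\[
L =
\begin{pmatrix}
5 & 6 & 4 & 8 & 9 & 7 & 2 & 3 & 1 \\
4 & 5 & 6 & 7 & 8 & 9 & 1 & 2 & 3 \\
6 & 4 & 5 & 9 & 7 & 8 & 3 & 1 & 2 \\
2 & 3 & 1 & 5 & 6 & 4 & 8 & 9 & 7 \\
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
3 & 1 & 2 & 6 & 4 & 5 & 9 & 7 & 8 \\
8 & 9 & 7 & 2 & 3 & 1 & 5 & 6 & 4 \\
7 & 8 & 9 & 1 & 2 & 3 & 4 & 5 & 6 \\
9 & 7 & 8 & 3 & 1 & 2 & 6 & 4 & 5
\end{pmatrix}
\]
\[
R =
\begin{pmatrix}
1 & 3 & 2 & 7 & 9 & 8 & 4 & 6 & 5 \\
3 & 2 & 1 & 9 & 8 & 7 & 6 & 5 & 4 \\
2 & 1 & 3 & 8 & 7 & 9 & 5 & 4 & 6 \\
7 & 9 & 8 & 4 & 6 & 5 & 1 & 3 & 2 \\
9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \\
8 & 7 & 9 & 5 & 4 & 6 & 2 & 1 & 3 \\
4 & 6 & 5 & 1 & 3 & 2 & 7 & 9 & 8 \\
6 & 5 & 4 & 3 & 2 & 1 & 9 & 8 & 7 \\
5 & 4 & 6 & 2 & 1 & 3 & 8 & 7 & 9
\end{pmatrix}
\]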

It is also observed in [6] that each pair of corresponding rows of $L$ and $R$
form a palindrome. Let $P_n = (p_{i,j})$ be an $n \times n$ permutation matrix
with $p_{i,j} = 1$ when $j = n + 1 - i$. Then this observation amounts to the
matrix equality $R = LP_9$.
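These statements about $L$ and $R$ can be checked directly. The sketch below is a minimal illustration with helper names of our own choosing (`is_latin`, `times_P`, `juxtapose`, `is_orthogonal`); it uses the fact that right multiplication by $P_n$ reverses every row.

```python
# The Latin square L from the text.
L = [
    [5, 6, 4, 8, 9, 7, 2, 3, 1],
    [4, 5, 6, 7, 8, 9, 1, 2, 3],
    [6, 4, 5, 9, 7, 8, 3, 1, 2],
    [2, 3, 1, 5, 6, 4, 8, 9, 7],
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [3, 1, 2, 6, 4, 5, 9, 7, 8],
    [8, 9, 7, 2, 3, 1, 5, 6, 4],
    [7, 8, 9, 1, 2, 3, 4, 5, 6],
    [9, 7, 8, 3, 1, 2, 6, 4, 5],
]

def is_latin(square):
    """Every row and every column contains each of 1..n exactly once."""
    full = set(range(1, len(square) + 1))
    return (all(set(row) == full for row in square)
            and all(set(col) == full for col in zip(*square)))

def times_P(square):
    """Right multiplication by P_n: entry (i, j) becomes entry (i, n + 1 - j)."""
    return [row[::-1] for row in square]

def juxtapose(a, b):
    """Cell-by-cell ordered pairs, first component from a, second from b."""
    return [list(zip(ra, rb)) for ra, rb in zip(a, b)]

def is_orthogonal(a, b):
    """All n^2 ordered pairs of the juxtaposition are distinct."""
    pairs = {p for row in juxtapose(a, b) for p in row}
    return len(pairs) == len(a) ** 2

R = times_P(L)                                   # R = L P_9
M = [[10 * x + y for x, y in row] for row in juxtapose(L, R)]
```

With these definitions, `M` is Choe's square written with two-digit entries, and `is_orthogonal(L, R)` holds.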
In the next section, we list new observations about nice properties of M. In the last
section we will explain how M can be constructed by a matrix product method. The
construction will make clear why these properties hold.
More nice properties of Choe's square
Sums of centrally symmetric cells. Any pair of cells in a matrix of odd order
is said to be centrally symmetric if they are located symmetrically with
respect to the center cell. In the square $L$ (or $R$), any pair of entries at
centrally symmetric cells sum to 10. It follows that, in Choe's square $M$, if
we read each entry as a two-digit integer, any pair of centrally symmetric
entries sums to 110. (In the magic square formed by the canonical map, any
pair of centrally symmetric entries sums to 82.)
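A partition into orthogonal Latin squares. We split $M$ right down the central
vertical line (reading each two-digit entry as two adjacent cells, so that $M$
becomes a $9 \times 18$ array of digits) to get two matrices $L'$ and $R'$,
each of which is a Latin square.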


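\[
L' =
\begin{pmatrix}
5 & 1 & 6 & 3 & 4 & 2 & 8 & 7 & 9 \\
4 & 3 & 5 & 2 & 6 & 1 & 7 & 9 & 8 \\
6 & 2 & 4 & 1 & 5 & 3 & 9 & 8 & 7 \\
2 & 7 & 3 & 9 & 1 & 8 & 5 & 4 & 6 \\
1 & 9 & 2 & 8 & 3 & 7 & 4 & 6 & 5 \\
3 & 8 & 1 & 7 & 2 & 9 & 6 & 5 & 4 \\
8 & 4 & 9 & 6 & 7 & 5 & 2 & 1 & 3 \\
7 & 6 & 8 & 5 & 9 & 4 & 1 & 3 & 2 \\
9 & 5 & 7 & 4 & 8 & 6 & 3 & 2 & 1
\end{pmatrix}
\]
\[
R' =
\begin{pmatrix}
9 & 7 & 8 & 2 & 4 & 3 & 6 & 1 & 5 \\
8 & 9 & 7 & 1 & 6 & 2 & 5 & 3 & 4 \\
7 & 8 & 9 & 3 & 5 & 1 & 4 & 2 & 6 \\
6 & 4 & 5 & 8 & 1 & 9 & 3 & 7 & 2 \\
5 & 6 & 4 & 7 & 3 & 8 & 2 & 9 & 1 \\
4 & 5 & 6 & 9 & 2 & 7 & 1 & 8 & 3 \\
3 & 1 & 2 & 5 & 7 & 6 & 9 & 4 & 8 \\
2 & 3 & 1 & 4 & 9 & 5 & 8 & 6 & 7 \\
1 & 2 & 3 & 6 & 8 & 4 & 7 & 5 & 9
\end{pmatrix}
\]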

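Again, $R' = L'P_9$ and the juxtaposition $L' \bowtie R'$ is an Euler square.
\[
\begin{pmatrix}
59 & 17 & 68 & 32 & 44 & 23 & 86 & 71 & 95 \\
48 & 39 & 57 & 21 & 66 & 12 & 75 & 93 & 84 \\
67 & 28 & 49 & 13 & 55 & 31 & 94 & 82 & 76 \\
26 & 74 & 35 & 98 & 11 & 89 & 53 & 47 & 62 \\
15 & 96 & 24 & 87 & 33 & 78 & 42 & 69 & 51 \\
34 & 85 & 16 & 79 & 22 & 97 & 61 & 58 & 43 \\
83 & 41 & 92 & 65 & 77 & 56 & 29 & 14 & 38 \\
72 & 63 & 81 & 54 & 99 & 45 & 18 & 36 & 27 \\
91 & 52 & 73 & 46 & 88 & 64 & 37 & 25 & 19
\end{pmatrix}
\]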

However, the canonical mapping does not convert the juxtaposition
$L' \bowtie R'$ into a magic square.
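The splitting can be reproduced mechanically. The following sketch assumes, as in the displays above, that $M$ is stored with two-digit entries; the names `L_prime`, `R_prime` and the helper functions are hypothetical and serve only this illustration.

```python
# Choe's square M with two-digit entries, as displayed in the text.
M = [
    [51, 63, 42, 87, 99, 78, 24, 36, 15],
    [43, 52, 61, 79, 88, 97, 16, 25, 34],
    [62, 41, 53, 98, 77, 89, 35, 14, 26],
    [27, 39, 18, 54, 66, 45, 81, 93, 72],
    [19, 28, 37, 46, 55, 64, 73, 82, 91],
    [38, 17, 29, 65, 44, 56, 92, 71, 83],
    [84, 96, 75, 21, 33, 12, 57, 69, 48],
    [76, 85, 94, 13, 22, 31, 49, 58, 67],
    [95, 74, 86, 32, 11, 23, 68, 47, 59],
]

# Read each row as 18 digits, then cut down the central vertical line.
wide = [[d for e in row for d in divmod(e, 10)] for row in M]
L_prime = [row[:9] for row in wide]
R_prime = [row[9:] for row in wide]

def is_latin(square):
    """Every row and every column contains each of 1..n exactly once."""
    full = set(range(1, len(square) + 1))
    return (all(set(row) == full for row in square)
            and all(set(col) == full for col in zip(*square)))

# Juxtapose L' and R', then apply the canonical mapping (x, y) -> 9(x - 1) + y.
pairs = [list(zip(ra, rb)) for ra, rb in zip(L_prime, R_prime)]
mapped = [[9 * (x - 1) + y for x, y in row] for row in pairs]

row_sums = [sum(row) for row in mapped]
main_diagonal_sum = sum(mapped[k][k] for k in range(9))
```

The rows of `mapped` still sum to 369, but `main_diagonal_sum` does not, which is why no magic square results.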


Exchanges of four pairs of centrally symmetric cells. We consider the
following four pairs of centrally symmetric cells in the matrix
$M = (m_{i,j})$:
\[
\{m_{i,i},\, m_{10-i,10-i}\}, \quad \{m_{i,5},\, m_{10-i,5}\}, \quad
\{m_{5,i},\, m_{5,10-i}\}, \quad \{m_{i,10-i},\, m_{10-i,i}\}.
\]
For each $i$, $1 \le i \le 4$, if we simultaneously interchange the entries in
each of the above four pairs, we get an Euler square $M_i$. Each $M_i$ can be
converted into a magic square by the canonical mapping. If we split each $M_i$
along the central vertical line to get two Latin squares $L'_i$ and $R'_i$,
then $R'_i = L'_i P_9$ and the juxtaposition $L'_i \bowtie R'_i$
is again an Euler square.
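A short computation confirms the first two claims for each $i$. The sketch below is only an illustration: `exchange`, `is_euler` and `is_magic_after_canonical_map` are names we introduce, and indices are shifted to Python's 0-based convention.

```python
M = [
    [51, 63, 42, 87, 99, 78, 24, 36, 15],
    [43, 52, 61, 79, 88, 97, 16, 25, 34],
    [62, 41, 53, 98, 77, 89, 35, 14, 26],
    [27, 39, 18, 54, 66, 45, 81, 93, 72],
    [19, 28, 37, 46, 55, 64, 73, 82, 91],
    [38, 17, 29, 65, 44, 56, 92, 71, 83],
    [84, 96, 75, 21, 33, 12, 57, 69, 48],
    [76, 85, 94, 13, 22, 31, 49, 58, 67],
    [95, 74, 86, 32, 11, 23, 68, 47, 59],
]

def exchange(square, i):
    """Return M_i: swap the four centrally symmetric pairs for 1 <= i <= 4."""
    sq = [row[:] for row in square]
    a, b, c = i - 1, 9 - i, 4   # 0-based positions of i, 10 - i, and 5
    swaps = [((a, a), (b, b)), ((a, c), (b, c)),
             ((c, a), (c, b)), ((a, b), (b, a))]
    for (r1, c1), (r2, c2) in swaps:
        sq[r1][c1], sq[r2][c2] = sq[r2][c2], sq[r1][c1]
    return sq

def is_latin(square):
    full = set(range(1, len(square) + 1))
    return (all(set(row) == full for row in square)
            and all(set(col) == full for col in zip(*square)))

def is_euler(square):
    """Both digit components are Latin and all two-digit entries are distinct."""
    left = [[e // 10 for e in row] for row in square]
    right = [[e % 10 for e in row] for row in square]
    distinct = len({e for row in square for e in row}) == len(square) ** 2
    return is_latin(left) and is_latin(right) and distinct

def is_magic_after_canonical_map(square):
    n = len(square)
    mapped = [[n * (e // 10 - 1) + e % 10 for e in row] for row in square]
    target = n * (n * n + 1) // 2
    sums = [sum(r) for r in mapped] + [sum(col) for col in zip(*mapped)]
    sums.append(sum(mapped[k][k] for k in range(n)))
    sums.append(sum(mapped[k][n - 1 - k] for k in range(n)))
    return all(s == target for s in sums)

results = {i: (is_euler(exchange(M, i)),
               is_magic_after_canonical_map(exchange(M, i)))
           for i in range(1, 5)}
```

The dictionary `results` records, for each $i$, whether $M_i$ is an Euler square and whether its image under the canonical mapping is magic.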
Our method to construct Choe's square
First we define a formal Kronecker product of two matrices. Let
$U = (u_{i,j})$ be an $m \times m$ matrix and $V = (v_{i,j})$ be an
$n \times n$ matrix. Define $U \otimes V$ to be an $mn \times mn$ matrix
\[
\begin{pmatrix}
Y_{1,1} & Y_{1,2} & \cdots & Y_{1,m} \\
Y_{2,1} & Y_{2,2} & \cdots & Y_{2,m} \\
\vdots & \vdots & \ddots & \vdots \\
Y_{m,1} & Y_{m,2} & \cdots & Y_{m,m}
\end{pmatrix},
\]
where $Y_{i,j}$ is an $n \times n$ matrix whose $(s, t)$-entry is equal to the
pair $(u_{i,j}, v_{s,t})$.
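There are six permutations of the numbers 1, 2, and 3. They can be grouped
into two $3 \times 3$ orthogonal Latin squares $A$ and $B$ such that
$B = AP_3$.
\[
A =
\begin{pmatrix}
2 & 3 & 1 \\
1 & 2 & 3 \\
3 & 1 & 2
\end{pmatrix}
\qquad
B =
\begin{pmatrix}
1 & 3 & 2 \\
3 & 2 & 1 \\
2 & 1 & 3
\end{pmatrix}
\]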


Now $A \otimes A$ is the following matrix.
\[
\begin{pmatrix}
(2,2) & (2,3) & (2,1) & (3,2) & (3,3) & (3,1) & (1,2) & (1,3) & (1,1) \\
(2,1) & (2,2) & (2,3) & (3,1) & (3,2) & (3,3) & (1,1) & (1,2) & (1,3) \\
(2,3) & (2,1) & (2,2) & (3,3) & (3,1) & (3,2) & (1,3) & (1,1) & (1,2) \\
(1,2) & (1,3) & (1,1) & (2,2) & (2,3) & (2,1) & (3,2) & (3,3) & (3,1) \\
(1,1) & (1,2) & (1,3) & (2,1) & (2,2) & (2,3) & (3,1) & (3,2) & (3,3) \\
(1,3) & (1,1) & (1,2) & (2,3) & (2,1) & (2,2) & (3,3) & (3,1) & (3,2) \\
(3,2) & (3,3) & (3,1) & (1,2) & (1,3) & (1,1) & (2,2) & (2,3) & (2,1) \\
(3,1) & (3,2) & (3,3) & (1,1) & (1,2) & (1,3) & (2,1) & (2,2) & (2,3) \\
(3,3) & (3,1) & (3,2) & (1,3) & (1,1) & (1,2) & (2,3) & (2,1) & (2,2)
\end{pmatrix}
\]

Next we substitute $3(a - 1) + b$ for the entry $(a, b)$ in $A \otimes A$. The
result is the matrix $L$. Any pair of entries at centrally symmetric cells in
$A$ sum to 4. Therefore, the above substitution implies that any pair of
entries at centrally symmetric cells in $A \otimes A$ sum to 10.
Similarly, we may compute $B \otimes B$ and perform the same substitution and
the outcome is the matrix $R$. Again, any pair of entries at centrally
symmetric cells in $B \otimes B$ sum to 10.
We also note that $(A \otimes A)P_9 = AP_3 \otimes AP_3 = B \otimes B$.
Consequently, the properties of $L'$ and $R'$ described in subsection 2.2
follow.
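The whole construction fits in a few lines of code. The sketch below is one possible rendering, not the author's own program: `formal_kron` and `substitute` are hypothetical names, and the row and column ordering follows the block layout in the definition of the formal Kronecker product.

```python
A = [[2, 3, 1],
     [1, 2, 3],
     [3, 1, 2]]
B = [row[::-1] for row in A]          # B = A P_3 (each row reversed)

def formal_kron(U, V):
    """Formal Kronecker product: block (i, j) holds the pairs (u_ij, v_st)."""
    m, n = len(U), len(V)
    return [[(U[i][j], V[s][t]) for j in range(m) for t in range(n)]
            for i in range(m) for s in range(n)]

def substitute(pairs, n=3):
    """Replace each pair (a, b) by n(a - 1) + b."""
    return [[n * (a - 1) + b for a, b in row] for row in pairs]

L_built = substitute(formal_kron(A, A))
R_built = substitute(formal_kron(B, B))

# Juxtapose the two Latin squares and write each pair as a two-digit entry.
M_built = [[10 * x + y for x, y in zip(rl, rr)]
           for rl, rr in zip(L_built, R_built)]
```

The first row of `M_built` is 51 63 42 87 99 78 24 36 15, the first row of Choe's square.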


Acknowledgment. The author is grateful to Yaokun Wu for introducing him to the presentation of Hong-Yeop
Song [6] from which he first learned about Choe's remarkable square.
REFERENCES
1. L. D. Andersen, History of latin squares, Department of Mathematical Sciences, Aalborg University, Research
Report Series R-2007-32, 2007. To appear in The History of Combinatorics, R. Wilson and J. Watkins, eds.
2. L. Euler, De Quadratis Magicis. Opera Omnia, Ser. I, Vol. 7, 441–457, Commentationes Arithmeticae 2 (1849)
593–602. Also available online at http://www.eulerarchive.org.
3. Jun Yong Hoon, Mathematics in context: a case in early nineteenth-century Korea, Science in Context 19
(2006) 475–512. doi:10.1017/S0269889706001049
4. Yong Woon Kim, Korean mathematics, in I. Grattan-Guinness, ed., Companion Encyclopedia of the History
and Philosophy of the Mathematical Sciences, Vol. 1, Routledge, London, 1994, 111–117.
5. Kim Yong Woon, ed., Mathematics Section (Suhak Pyun) of the Compendium of the History of Korean Science
and Technology (Hanguk Kwahak Kisul Sa Jaryo Taekye), Yogang Chulpansa, Seoul, 1985.
6. Hong-Yeop Song, Choi's orthogonal latin squares is at least 67 years earlier than Euler's, A presentation to the
2008 Global KMS Conference, Jeju, Korea.
Summary. Orthogonal Latin squares have been known to predate Euler in Europe. However, it is surprising
that an Euler square of order nine was already in existence prior to Euler in the Orient. It appeared in a Korean
mathematical treatise written by Choe Sŏk-chŏng (1646–1715). Choe's square has several nice properties that
have never been fully appreciated before. In this paper, an analysis of Choe's remarkable square is provided and
a method of its construction is supplied.
KO-WEI LIH received a B.S. from the National Taiwan University in 1970. He worked under Joseph R. Shoenfield
at Duke University, receiving his Ph.D. in 1976. He is a Research Fellow at the Institute of Mathematics,
Academia Sinica, where he has been since 1976. He switched his main research area from mathematical logic to
discrete mathematics in the early 1980s. He has great devotion to the promotion and popularization of science
and mathematics in Taiwan. His interest in magic configurations discovered by East Asian scholars before the
20th century led him to the study of Choe's remarkable square. In addition to history of mathematics, his favored
hobbies include reading literature and enjoying art works.
The Graph Menagerie:
Abstract Algebra and the Mad Veterinarian
GENE ABRAMS
University of Colorado
Colorado Springs, CO 80933-7150
[email protected]
JESSICA K. SKLAR
Pacific Lutheran University
Tacoma, WA 98447-0003
[email protected]
Jessica owns three adorable cats: Boo, Kodiak, and Yoshi. Yoshi, unfortunately, has a
bad habit: He likes to damage Jessica's carpet. Sometimes Jessica wishes she had a
machine that would magically change Yoshi into a tidier pet . . . a goldfish, perhaps. Of
course, a goldfish is much smaller than a cat, so perhaps Yoshi could instead be turned
into two goldfish. Or maybe two goldfish and a turtle? But goldfish and turtles aren't
too cuddly; Jessica might regret the change, so she would want the machine to be able
to turn two goldfish and a turtle back into a cat.
In the parlance of recreational mathematics, Jessica sometimes wishes she were a
Mad Veterinarian. Mad Vet scenarios were originally presented by Harris [7], who
posed questions as to which collections of animals can be transformed by Mad Vet
machines into other collections. Recently, such scenarios have been used as the basis
of various problem solving and Math Circle activities; see, for instance, [13]. In this
article we take a different approach, using Mad Vet scenarios to explore the concepts
of groups, semigroups, and directed graphs.
We have two main goals in analyzing Mad Vet scenarios. Corresponding to any
Mad Vet scenario there is a naturally defined semigroup, which may or may not be a
group. Our first main goal is to help readers gain some intuition about when a given
semigroup is actually a group; to this end, we provide a number of
not-so-run-of-the-mill examples involving these algebraic structures.
Our second main goal is to illustrate a practice common in mathematics: namely,
answering a question in one area by recasting it in another area, answering the recast
question there, and then using that result to answer the original question. There are
numerous examples of such powerful cross-disciplinary pollination, including Euler's
solution to the classic Königsberg Bridges Problem; see, for instance, Chapter 1 in
Biggs et al. [4]. We provide a beautiful example of this technique, posing an abstract
algebraic question and answering it using graph theory.
Along the way, we provide numerous examples and specific computations. We also
present some follow-up questions and information which could be used to supplement
the material in an abstract algebra course. We assume that the reader is familiar with
first-semester abstract algebraic concepts such as groups and equivalence relations. A
good source for these topics is Fraleigh [5].
1. Mad Vet scenarios
A Mad Vet scenario posits a Mad Veterinarian in possession of a finite number of
transmogrifying machines, where
Math. Mag. 83 (2010) 168–179. doi:10.4169/002557010X494814. © Mathematical Association of America
1. Each machine transmogrifies a single animal of a given species into a finite
nonempty collection of animals from any number of species;
2. Each machine can also operate in reverse; and
3. There is a one-to-one correspondence between the species with which the Mad Vet
works and the transmogrifying machines; moreover, each species' corresponding
machine takes as its input exactly one animal of that species.
These three requirements do not explicitly appear in the puzzles posed by Harris [7],
but they are certainly implicit there.
Let's consider an example.
Scenario #1. Suppose a Mad Veterinarian has three machines with the following
properties.
Machine 1 turns one ant into one beaver;
Machine 2 turns one beaver into one ant, one beaver and one cougar;
Machine 3 turns one cougar into one ant and one beaver.
Starting with one ant, the Mad Vet could produce infinitely many different
collections of animals. For example, she could use Machine 1 to turn the ant into a beaver,
and then use Machine 2 repeatedly to continually increase the number of ants and cougars
in her collection. Alternatively, she could use Machine 1 followed by Machine 2, and
put the resulting cougar into Machine 3, yielding a collection of two ants and two
beavers. Then using Machine 1 twice in reverse, she'd obtain a collection consisting
of exactly four ants.
We now mathematize these Mad Vet scenarios. Given a scenario involving $n$ distinct
species of animals, we let $A_i$ be the species of animal taken as input (in the forward
direction) by Machine $i$, and denote by $d_{i,j}$ the number of animals of species $A_j$
which are produced by Machine $i$. For example, in Scenario #1, $A_1$ = Ant,
$A_2$ = Beaver and $A_3$ = Cougar, and we have, for instance, $d_{1,1} = 0$,
$d_{1,2} = 1$, and $d_{1,3} = 0$.
Writing $\mathbb{N}$ for the set $\{0, 1, 2, \ldots\}$ and $\mathbf{0}$ for the trivial
vector $(0, 0, \ldots, 0)$ of length $n$, we define a menagerie to be an element of the set
\[ S = \mathbb{N}^n \setminus \{\mathbf{0}\}. \]
There is a natural bijective correspondence between menageries and nonempty
collections of animals from species $A_1, A_2, \ldots, A_n$. For instance, in Scenario #1 a
collection of two beavers and five cougars would correspond to $(0, 2, 5)$ in $S$.
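As a rough illustration of this bookkeeping, the following Python sketch stores Scenario #1 as the matrix of the numbers $d_{i,j}$ and replays the moves described earlier. The names `D1` and `apply_machine` are hypothetical helpers introduced here, not notation from the article.

```python
# Row i - 1 lists d_{i,1}, d_{i,2}, d_{i,3} for Scenario #1
# (species order: ant, beaver, cougar).
D1 = [
    (0, 1, 0),  # Machine 1: ant    -> beaver
    (1, 1, 1),  # Machine 2: beaver -> ant + beaver + cougar
    (1, 1, 0),  # Machine 3: cougar -> ant + beaver
]

def apply_machine(menagerie, i, d=D1, reverse=False):
    """Apply Machine i (1-based) once; return None if the move is impossible."""
    m = list(menagerie)
    out = d[i - 1]
    if not reverse:
        if m[i - 1] < 1:                      # need one animal of species A_i
            return None
        m[i - 1] -= 1
        m = [x + y for x, y in zip(m, out)]
    else:
        if any(x < y for x, y in zip(m, out)):  # need the machine's full output
            return None
        m = [x - y for x, y in zip(m, out)]
        m[i - 1] += 1
    return tuple(m)

# Replay: Machine 1, Machine 2, Machine 3, then Machine 1 twice in reverse.
state = (1, 0, 0)
history = [state]
for machine, rev in [(1, False), (2, False), (3, False), (1, True), (1, True)]:
    state = apply_machine(state, machine, reverse=rev)
    history.append(state)
```

After the loop, `history` lists the menageries visited on the way from one ant to four ants.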
2. Mad Vet graphs
We give here a brief introduction to some standard graph theory concepts. For a more
thorough examination of the topic, see, for example, West [11] or Wilson and Watkins
[12]. (Note that graph theory definitions vary widely from text to text; for instance,
what we will call a path is what West calls a walk [11].) A directed graph consists
of a set V of vertices and a set E of edges; the graph is finite if both V and E are
finite. Each edge e in E has an initial vertex, i (e), and terminal vertex, t (e), and is
represented in the graph by an arrow pointing from i (e) to t (e). Loops (that is, edges
e for which i (e) = t (e)) are allowed, as are multiple edges (that is, edges that have a
common initial vertex and a common terminal vertex). A vertex is a sink if it is not
the initial vertex of any edge.
Given any Mad Vet scenario, its corresponding Mad Vet graph is the directed graph
with $V = \{A_1, A_2, \ldots, A_n\}$, and having, for each $A_i$, $A_j$ in $V$, exactly
$d_{i,j}$ edges with initial vertex $A_i$ and terminal vertex $A_j$. Note that any Mad Vet
graph is sink-free, due to the third defining feature of a Mad Vet scenario.
EXAMPLE. Scenario #1 has the following Mad Vet graph.
[Figure: the Mad Vet graph of Scenario #1, a directed graph on the vertices $A_1$, $A_2$, $A_3$.]
We return to directed graphs in Section 6.
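Since the drawing depends only on the numbers $d_{i,j}$, the edge list of a Mad Vet graph can be generated from them. The sketch below does this for Scenario #1; `mad_vet_edges` and `sinks` are illustrative names of our own, and vertices are labelled by their indices 1, 2, 3.

```python
# d_{i,j} for Scenario #1 (species order: ant, beaver, cougar).
D1 = [
    (0, 1, 0),  # Machine 1
    (1, 1, 1),  # Machine 2
    (1, 1, 0),  # Machine 3
]

def mad_vet_edges(d):
    """One edge (i, j) for each animal of species A_j produced by Machine i."""
    return [(i + 1, j + 1)
            for i, row in enumerate(d)
            for j, count in enumerate(row)
            for _ in range(count)]

def sinks(d):
    """Vertices that are not the initial vertex of any edge."""
    return [i + 1 for i, row in enumerate(d) if sum(row) == 0]

edges = mad_vet_edges(D1)
```

For Scenario #1 this yields six edges, including a loop at $A_2$, and `sinks(D1)` is empty.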
3. Menagerie equivalence classes
Now we come to the key idea. In the context of a Mad Vet scenario, there is a
relationship between various menageries. Clearly, a set consisting of two ants and a cougar is
different from a set consisting of an ant and three beavers. But if the vet has machines
that can be used to replace the first collection of animals with the second (and vice
versa), it would make sense to somehow identify the menageries (2, 0, 1) and (1, 3, 0)
in S. We have here a naturally arising relation on S, defined formally as follows.
Given $a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ in $S$, we say that
$a$ is related to $b$, and write $a \sim b$, if there is a sequence of Mad Vet machines
that will transmogrify the collection of animals associated with menagerie $a$ into the
collection of animals associated with menagerie $b$. Using the three properties of a Mad
Vet scenario, it is straightforward to show that $\sim$ is an equivalence relation on $S$.
The equivalence class of $a$ in $S$ under $\sim$ is
\[ [a] = \{b \in S : b \sim a\}; \]
such equivalence classes partition $S$.
We now focus on the set
\[ W = \{[a] : a \in S\} \]
of equivalence classes of $S$ under $\sim$. Though the elements of $W$ are actually sets
themselves, we will work with them primarily as individual elements of the set $W$.
EXAMPLE. Suppose that our Mad Vet of Scenario #1 starts with the menagerie
$(1, 0, 0)$, that is, a collection consisting of just one ant. Then
$(1, 0, 0) \sim (0, 1, 0)$ (using Machine 1); in fact, our previous discussion shows that
\[ (1, 0, 0) \sim (0, 1, 0) \sim (1, 1, 1) \sim (2, 2, 0) \sim (4, 0, 0). \]
Using equivalence class notation, we've shown
\[ [(1, 0, 0)] = [(0, 1, 0)] = [(1, 1, 1)] = [(2, 2, 0)] = [(4, 0, 0)], \]
that is, that these five expressions all represent the same element of $W$.
Now, let (a, b, c) be any menagerie in this Mad Vet scenario. We claim that (a, b, c)
is equivalent to one of the menageries (1, 0, 0), (2, 0, 0), or (3, 0, 0). If c > 0, then
using Machine 3 $c$ times we see that $(a, b, c) \sim (a + c, b + c, 0)$; then if
$b + c > 0$, we can use Machine 1 in reverse $b + c$ times to show that
$(a + c, b + c, 0) \sim (a + b + 2c, 0, 0)$. By the transitivity of $\sim$, we conclude
that $(a, b, c) \sim (i, 0, 0)$ for some positive integer $i$ (namely, $i = a + b + 2c$).
We noted above that $(1, 0, 0) \sim (4, 0, 0)$, which implies that
$(2, 0, 0) \sim (5, 0, 0)$, $(3, 0, 0) \sim (6, 0, 0)$, and, more generally, that
$(j, 0, 0) \sim (i, 0, 0)$ for any positive integers $i$ and $j$ that are congruent
modulo 3. Thus, the only elements of $W$ are
\[ [(1, 0, 0)], \quad [(2, 0, 0)], \quad \text{and} \quad [(3, 0, 0)]. \]
We now rule out any redundancy among these three elements of $W$. Given a
menagerie $m = (a, b, c)$, define the sum $s_m = a + b + 2c$. If we apply Machine 1
to $m$, we obtain menagerie $x = (a - 1, b + 1, c)$; if we apply Machine 2 to $m$
we obtain $y = (a + 1, b, c + 1)$; finally, if we apply Machine 3 to $m$ (once for each
of its $c$ cougars) we obtain $z = (a + c, b + c, 0)$. Since
\[ s_x = (a - 1) + (b + 1) + 2c = s_m = (a + c) + (b + c) = s_z \]
and
\[ s_y = (a + 1) + b + 2(c + 1) = s_m + 3, \]
we have that if menageries $m$ and $n$ are related under $\sim$ then $s_m$ and $s_n$
are congruent modulo 3. Since $s_{(1,0,0)} = 1$, $s_{(2,0,0)} = 2$ and
$s_{(3,0,0)} = 3$, the equivalence classes of menageries $(1, 0, 0)$, $(2, 0, 0)$ and
$(3, 0, 0)$ under $\sim$ are all distinct. Hence, for this Mad Vet scenario, $W$ is the
3-element set
\[ \{[(1, 0, 0)], [(2, 0, 0)], [(3, 0, 0)]\}. \]
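The invariant can also be explored experimentally. The sketch below runs a breadth-first search over menageries of Scenario #1, using machines forward and in reverse. Because $S$ is infinite, the search is capped at a fixed number of animals; the cap, like the names `neighbors`, `reachable` and `s`, is an assumption of this illustration only.

```python
from collections import deque

# d_{i,j} for Scenario #1 (species order: ant, beaver, cougar).
D1 = [(0, 1, 0), (1, 1, 1), (1, 1, 0)]

def neighbors(m, d=D1):
    """Menageries obtained from m by one machine use, forward or in reverse."""
    result = []
    for i, out in enumerate(d):
        if m[i] >= 1:                                  # forward use
            fwd = list(m)
            fwd[i] -= 1
            result.append(tuple(x + y for x, y in zip(fwd, out)))
        if all(x >= y for x, y in zip(m, out)):        # reverse use
            rev = [x - y for x, y in zip(m, out)]
            rev[i] += 1
            result.append(tuple(rev))
    return result

def reachable(start, cap):
    """Breadth-first search restricted to menageries with at most cap animals."""
    seen = {start}
    queue = deque([start])
    while queue:
        m = queue.popleft()
        for nxt in neighbors(m):
            if sum(nxt) <= cap and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def s(m):
    """The sum s_m = a + b + 2c from the text."""
    a, b, c = m
    return a + b + 2 * c

found = reachable((1, 0, 0), cap=6)
residues = {s(m) % 3 for m in found}
```

Every menagerie found from $(1, 0, 0)$ has $s_m \equiv 1 \pmod 3$, so neither $(2, 0, 0)$ nor $(3, 0, 0)$ ever appears.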
4. Mad Vet semigroups
We can gain some understanding of a Mad Vet scenario by studying its collection, $W$,
of menagerie equivalence classes simply as a set. But we can learn even more if we
exploit a natural operation which combines menageries. We first remind the reader of
some definitions.
Let $S$ be any set, and let $*$ be a binary operation on $S$. Recall that $(S, *)$ is a
semigroup if $*$ is associative; a semigroup $(S, *)$ is a monoid if it contains an identity
element for $*$; and a monoid is a group if each of its elements has an inverse under $*$.
Three important types of semigroups arise in the context of Mad Vet scenarios.
First, given a scenario, we have its set $S$ of menageries, equipped with the usual
addition of vectors. (Such addition is an acceptable semigroup operation on $S$ since it
is associative and since the sum of two nonzero vectors is again nonzero.) Next, we
have the scenario's Mad Vet semigroup, which we discuss in this section. Finally, we
introduce graph semigroups in Section 7.
To create the Mad Vet semigroup of a Mad Vet scenario, we define addition on the
scenario's set $W$ of equivalence classes of menageries by setting
\[ [x] + [y] = [x + y], \]
where addition on the right-hand side of the equation takes place in $S$. Addition on $W$
can be understood as follows. Suppose a Mad Vet has a collection of animals in her lab
corresponding to menagerie $x$, and is given a new collection of animals corresponding
to menagerie $y$. Then the sum $[x] + [y]$ in $W$ is the equivalence class of the menagerie
corresponding to the union of the animals in the two collections.
Since the elements of $W$ are equivalence classes, we must make sure that our
addition on $W$ is well defined. But this is straightforward to see, by identifying our
menageries with their associated collections of animals: If some sequence of machines
transforms menagerie $x$ into menagerie $x'$, and some sequence transforms menagerie
$y$ into menagerie $y'$, then these machines, used in tandem, transform menagerie
$x + y$ into menagerie $x' + y'$.
Associativity of $+$ on $W$ is inherited from the associativity of $+$ on $S$. Thus,
$(W, +)$ is a semigroup, called the Mad Vet semigroup of its corresponding Mad Vet
scenario. Since addition is clearly commutative on $S$, every Mad Vet semigroup
$(W, +)$ is commutative.
EXAMPLE. We revisit Scenario #1 and examine its Mad Vet semigroup $(W, +)$.
We showed previously that in this case $W$ is the 3-element set
\[ W = \{[(1, 0, 0)], [(2, 0, 0)], [(3, 0, 0)]\}. \]
Using the operation $+$ in $W$, we get, for instance,
\[ [(1, 0, 0)] + [(1, 0, 0)] = [(1 + 1, 0, 0)] = [(2, 0, 0)], \]
as we'd expect. But perhaps it's a bit surprising that
\[ [(1, 0, 0)] + [(3, 0, 0)] = [(4, 0, 0)] = [(1, 0, 0)]. \]
In other words, $[(3, 0, 0)]$ behaves like an identity element with respect to the
element $[(1, 0, 0)]$ in $W$. In fact, $[(i, 0, 0)] + [(3, 0, 0)] = [(i, 0, 0)]$ for any
$1 \le i \le 3$. So for this Mad Vet scenario the Mad Vet semigroup $(W, +)$ is a monoid,
with identity $[(3, 0, 0)]$. Further, since
\[ [(1, 0, 0)] + [(2, 0, 0)] = [(3, 0, 0)] \]
in $W$, every element in $(W, +)$ has an inverse. Therefore, $(W, +)$ is in fact a group;
since its order is 3, it must be isomorphic to the group $\mathbb{Z}_3$.
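The addition table of this semigroup can be tabulated with a few lines of code. The sketch below represents the class $[(i, 0, 0)]$ by the integer $i \in \{1, 2, 3\}$ and relies on the reduction $(a, b, c) \sim (a + b + 2c, 0, 0)$ established above; the function names `class_of` and `add` are ours.

```python
def class_of(m):
    """Representative i in {1, 2, 3} with m ~ (i, 0, 0), via s_m = a + b + 2c."""
    a, b, c = m
    return (a + b + 2 * c - 1) % 3 + 1

def add(i, j):
    """[(i, 0, 0)] + [(j, 0, 0)] = [(i + j, 0, 0)], reduced to {1, 2, 3}."""
    return class_of((i + j, 0, 0))

elements = (1, 2, 3)
table = {(i, j): add(i, j) for i in elements for j in elements}

# An identity e satisfies e + i = i for every i; inverses pair up to e.
identities = [e for e in elements if all(add(e, i) == i for i in elements)]
inverses = {i: [j for j in elements if add(i, j) == 3] for i in elements}
```

The table agrees with addition in $\mathbb{Z}_3$ once the class of $(3, 0, 0)$ is read as the zero element.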
5. Not all Mad Vet semigroups are groups
Perhaps it is not surprising that the Mad Vet semigroup of Scenario #1 is a group, in
light of the explicit description of its elements. In many Mad Vet scenarios, (W, +)
is indeed a group; however, we will later see a Mad Vet semigroup that is not even a
monoid. Notably, given any Mad Vet semigroup W, the obvious choice, [0], for an
identity element of W is not even contained in W, since 0 is not in S.
Scenario #2. Suppose the same Mad Vet has replaced two of her machines with
new machines.
Machine 1 still turns one ant into one beaver;
Machine 2 now turns one beaver into one ant and one cougar;
Machine 3 now turns one cougar into two cougars.
In this situation W is a monoid, but not a group. First, we claim that
$W = \{[(i, 0, 0)] : i \in \mathbb{Z}^+\} \cup \{[(0, 0, 1)]\}$,
where $\mathbb{Z}^+$ denotes the set of positive integers. Indeed, let (a, b, c) be a
menagerie for this scenario. If a = b = 0 (that is, there are only cougars in the
menagerie) then
$c - 1$ applications of Machine 3 yield that $(0, 0, c) \sim (0, 0, 1)$. Else, suppose
that at least one of a or b is nonzero. Since $(a, b, c) \sim (a + b, 0, c)$ (using
Machine 1 in reverse b times), we may assume that the menagerie contains at least one
ant and no beavers. If $c = 0$, then we are done. If $c \ne 0$, then we can apply
Machine 3 in the appropriate direction $|a - c|$ times, obtaining a menagerie that
contains a ants and a cougars; thus, $(a, 0, c) \sim (a, 0, a)$. Then applying Machine 2
in reverse a times yields $(a, 0, a) \sim (0, a, 0)$, which is equivalent to (a, 0, 0)
(using Machine 1).
Hence, W consists of the indicated elements. We may now use arguments similar
to the argument utilized in studying Scenario #1 to show that these elements are all
distinct in W. This establishes our claim.
The same sorts of computations as before show that [(0, 0, 1)] is an identity element
for this Mad Vet semigroup, and hence W in this case is a monoid. But W is not a
group, because, for instance, there is no element [x] in W for which [(1, 0, 0)] + [x] =
[(0, 0, 1)].
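The reduction argument for Scenario #2 can be sketched as a small normal-form routine.
This is an illustration under the assumption that a menagerie is stored as a triple
(ants, beavers, cougars); the helper names `canonical` and `add` are hypothetical.

```python
def canonical(a, b, c):
    """Reduce a nonempty Scenario #2 menagerie to the representative found in the text."""
    if a + b + c == 0:
        raise ValueError("a menagerie must contain at least one animal")
    if a == 0 and b == 0:
        return (0, 0, 1)      # cougars only: Machine 3 collapses them to a single cougar
    return (a + b, 0, 0)      # otherwise beavers become ants and the cougars are absorbed

def add(x, y):
    """Add two classes by pooling the animals and reducing the result."""
    return canonical(*(p + q for p, q in zip(x, y)))

one_cougar = (0, 0, 1)
five_ants = (5, 0, 0)
check_identity = add(one_cougar, five_ants)   # adding one cougar leaves the class unchanged
```

Because `add((1, 0, 0), x)` always has at least one ant in its representative, it can
never equal `(0, 0, 1)`, which is the failure of inverses noted above.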
Given a Mad Vet scenario, we can pose a variety of questions regarding the structure
of its Mad Vet semigroup. For instance, is its semigroup finite or infinite? Is it a
monoid? If it is a monoid, is it a group? Note that if it is a group, then that group
is necessarily abelian (since all Mad Vet semigroups are commutative), but is it
necessarily cyclic?
To give some sense of just how diverse Mad Vet semigroups can be, we provide below
five additional Mad Vet scenarios (Scenarios #3–7) which include, in some order,
a scenario for which (1) W is an infinite group; (2) W is a finite noncyclic group;
(3) W is a finite nonmonoid; (4) W is a finite cyclic group, not isomorphic to
$\mathbb{Z}_3$; and (5) W is an infinite nonmonoid.
In fact, these five different structures even arise in scenarios where the Mad Vet has
just three species in her lab. Our readers are encouraged to try their hands at matching
the structures described above with Scenarios #3–7. Teachers can also find a
sample Mad Vet homework assignment, appropriate for a first-semester abstract algebra
course, at the MAGAZINE website. Descriptions of the semigroups arising in the
following five Mad Vet scenarios are provided at the end of the article, so that readers
can check their work.
Scenario #3.
Machine 1 turns one ant into one beaver and one cougar;
Machine 2 turns one beaver into one ant and one cougar;
Machine 3 turns one cougar into one ant and one beaver.
Scenario #4.
Machine 1 turns one ant into two ants;
Machine 2 turns one beaver into two beavers;
Machine 3 turns one cougar two cougars.
Scenario #5.
Machine 1 turns one ant into one beaver and one cougar;
Machine 2 turns one beaver into one ant and one beaver;
Machine 3 turns one cougar into one ant and one cougar.
Scenario #6.
Machine 1 turns one ant into one beaver;
Machine 2 turns one beaver into one cougar;
Machine 3 turns one cougar into one cougar.
Scenario #7.
Machine 1 turns one ant into one ant, one beaver and one cougar;
Machine 2 turns one beaver into one ant and one cougar;
Machine 3 turns one cougar into one ant and one beaver.
Given the varied properties of Mad Vet semigroups displayed thus far, one may
wonder how one can possibly identify when Mad Vet semigroups are groups. In the
next section, we translate this algebraic question into a comparable graph-theoretical
question, whose solution is used to obtain an answer in the algebraic realm.
6. The Mad Vet Group Test
In this section, we answer the question: Given a Mad Vet scenario, when is its Mad
Vet semigroup W actually a group?
We need a bit more (standard) graph theory terminology. A path in a directed graph
$\Gamma$ is a sequence $P = e_1 e_2 \cdots e_m$ of one or more edges in $\Gamma$ for which
$t(e_j) = i(e_{j+1})$ for each $1 \le j \le m - 1$; we say that P is a path from
$i(e_1)$ to $t(e_m)$. If v and w are vertices in $\Gamma$, we say v connects to w in case
either v = w or there is a path in $\Gamma$ from v to w. More generally, if
$P = e_1 e_2 \cdots e_m$ is any path in $\Gamma$ and v is any vertex in $\Gamma$,
we say v connects to P in case v connects to $i(e_j)$ for some edge $e_j$ of P,
$1 \le j \le m$. For a vertex v in V, a cycle based at v is a path $e_1 e_2 \cdots e_m$
from v to v for which the vertices $i(e_1), i(e_2), \ldots, i(e_m)$ are distinct. A loop
at a vertex is therefore a cycle, with m = 1.
The following graph-theoretic definitions may be less familiar to the reader. A
finite graph $\Gamma$ is cofinal in case every vertex v of $\Gamma$ connects to every cycle
and to every sink in $\Gamma$. Next, if $C = f_1 f_2 \cdots f_m$ is a cycle in $\Gamma$,
then an edge e is called an exit for C if $i(e) = i(f_j)$ for some $1 \le j \le m$, and
$e \ne f_j$. (Intuitively, an exit for C is an edge e, not included in C, which provides
a way to momentarily step away from C.)
EXAMPLE. Consider the following graph.
[Figure: a directed graph on vertices x, y, and z, with edges labeled e, f, g, and h.]
The cycle eg based at y has three different exits: f, h and the loop at y. These same
three edges are also exits for the cycle ge based at z. Similarly, the loop at y has exits
e, f and h. On the other hand, the loop at x has no exit. Also, notice that this graph is
not cofinal, since, for example, vertex x does not connect to the cycle eg.
Now we are ready to answer the main question of this section.
MAD VET GROUP TEST. The Mad Vet semigroup W of a Mad Vet scenario is a
group if and only if the corresponding Mad Vet graph $\Gamma$ has the following two
properties.
(1) $\Gamma$ is cofinal; and
(2) Every cycle in $\Gamma$ has an exit.
The proof of this test is too long for this article; however, in Section 7 we will show
how the result follows from a more general theorem (whose complete proof is provided
in a supplement at the MAGAZINE website). Here, we see how this test applies to some
Mad Vet scenarios.
EXAMPLES. Consider again the Mad Vet graph $\Gamma$ associated with Scenario #1.
[Figure: the Mad Vet graph of Scenario #1, with vertices $A_1$, $A_2$, and $A_3$.]
By inspection we see that $\Gamma$ is cofinal (there are no sinks in $\Gamma$ and every
vertex connects to each of the cycles in $\Gamma$) and that every cycle in $\Gamma$ has an
exit. Thus the Mad Vet Group Test reconfirms that the Mad Vet semigroup for this scenario
is indeed a group, a fact we established directly in Section 4. On the other hand, recall
the Mad Vet graph $\Gamma$ of Scenario #2.
[Figure: the Mad Vet graph of Scenario #2, with vertices $A_1$, $A_2$, and $A_3$.]
We see that $\Gamma$ is not cofinal, since vertex $A_3$ does not connect to the cycle
$A_1 \to A_2 \to A_1$. So the Mad Vet Group Test reconfirms that the Mad Vet semigroup of
Scenario #2 is not a group, as we saw in Section 5.
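The two conditions of the Mad Vet Group Test can be checked mechanically. The sketch
below is one possible way to do so, not the authors' method. It assumes a graph is given
as a square matrix `adj`, where `adj[v][w]` counts the edges from vertex v to vertex w,
and all function names are ours. It relies on two observations: a vertex connects to a
cycle exactly when it reaches every vertex on that cycle, and a cycle has no exit exactly
when each of its vertices emits a single edge.

```python
def reachable(adj, start):
    """Set of vertices w such that `start` connects to w (including start itself)."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w, count in enumerate(adj[v]):
            if count > 0 and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def is_cofinal(adj):
    """Every vertex connects to every cycle and to every sink."""
    n = len(adj)
    reach = [reachable(adj, v) for v in range(n)]
    sinks = {v for v in range(n) if sum(adj[v]) == 0}
    # v lies on a cycle iff some edge v -> w exists with w connecting back to v.
    on_cycle = {v for v in range(n)
                if any(adj[v][w] > 0 and v in reach[w] for w in range(n))}
    return all((sinks | on_cycle) <= reach[v] for v in range(n))

def every_cycle_has_exit(adj):
    """False iff some cycle consists only of vertices that emit exactly one edge."""
    n = len(adj)
    for start in range(n):
        v = start
        for _ in range(n):
            if sum(adj[v]) != 1:   # v emits zero or several edges: stop this walk
                break
            v = adj[v].index(1)    # follow the unique outgoing edge
            if v == start:
                return False       # came back along out-degree-1 vertices: no exit
    return True

def passes_mad_vet_group_test(adj):
    return is_cofinal(adj) and every_cycle_has_exit(adj)

# Edge counts for Scenario #1 and Scenario #2, with vertices ordered ant, beaver, cougar.
scenario_1 = [[0, 1, 0], [1, 1, 1], [1, 1, 0]]
scenario_2 = [[0, 1, 0], [1, 0, 1], [0, 0, 2]]
results = (passes_mad_vet_group_test(scenario_1), passes_mad_vet_group_test(scenario_2))
```

For these two matrices the sketch agrees with the discussion above: Scenario #1 passes,
while Scenario #2 fails because the cougar vertex cannot reach the ant-beaver cycle.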
Scenario #8. Consider the Mad Vet scenario described by Harris [7], in which the
Mad Vet has three machines with the following properties.
Machine 1 turns one cat into two dogs and five mice;
Machine 2 turns one dog into three cats and three mice;
Machine 3 turns one mouse into a cat and a dog.
This scenario has the following Mad Vet graph, where $A_1$ = Cat, $A_2$ = Dog, and
$A_3$ = Mouse. The label (d) on an edge e indicates that there are actually d edges in
the graph from $i(e)$ to $t(e)$.
[Figure: the Mad Vet graph of Scenario #8, with vertices $A_1$, $A_2$, and $A_3$ and edge labels (2), (3), (3), and (5).]
It is straightforward to see that this graph satisfies the two properties enumerated in
the Mad Vet Group Test; thus, the Mad Vet semigroup in this case is a group, which
we identify in Section 8.
You may now want to draw the Mad Vet graphs of Scenarios #3–7, and use the Mad
Vet Group Test to determine (or confirm) which three of those Mad Vet scenarios
produce Mad Vet groups. Here's one additional observation about the Mad Vet graphs of
the remaining two scenarios: One of the graphs is cofinal but contains a cycle without
an exit, and the other is not cofinal, though each of its cycles has an exit.
7. Explanation of the Mad Vet Group Test
With the Mad Vet Group Test in hand, we have achieved the second main goal of our
article: that is, answering an algebraic question using graph theory. But we have not
proven the Mad Vet Group Test. We omit its lengthy proof, but note that the result
follows from a theorem about graph semigroups. In Section 2, we described a natural
connection between Mad Vet scenarios and directed graphs. In fact, a tighter
connection can be forged. Any directed graph $\Gamma$ has an associated commutative graph
monoid, $(M_\Gamma, +)$. (The interested reader can find the specifics of this construction
on p. 163 of Ara et al. [2].) It turns out that if $x, y \in M_\Gamma$ with $x + y = 0$,
then $x = y = 0$. Thus, the set $W_\Gamma = M_\Gamma \setminus \{0\}$ is closed under +,
and so $(W_\Gamma, +)$ is a semigroup, called the graph semigroup of $\Gamma$.
It follows directly from these constructions that given a Mad Vet scenario with Mad
Vet semigroup W and Mad Vet graph $\Gamma$, the semigroups W and $W_\Gamma$ are isomorphic.
Thus, information about graph semigroups may be brought to bear in a Mad Vet context.
In particular, the main question of the previous section can be answered if we can
answer the related question: Given a directed graph $\Gamma$, when is its graph semigroup
$W_\Gamma$ actually a group?
As it turns out, this question about graph semigroups has recently received signif-
icant attention in various mathematical research circles. Some of the related research
ideas are described in Section 9. Though in this article we are interested only in sink-
free graphs, we do not limit ourselves to such graphs in stating the following result.
GRAPH SEMIGROUP GROUP TEST. Let $\Gamma$ be a finite directed graph. Then the
graph semigroup $W_\Gamma$ is a group if and only if $\Gamma$ has the following three
properties.
(1) $\Gamma$ is cofinal;
(2) Every cycle in $\Gamma$ has an exit; and
(3) $\Gamma$ contains no sinks.
Since Mad Vet graphs are sink-free, this test immediately implies the Mad Vet
Group Test. The interested reader can find Enrique Pardo's proof of this result at the
MAGAZINE website. While Pardo's proof is too long to include here, we note that the
Mad Vet Group Test can be proven using only undergraduate-level graph theory and
abstract algebra tools.
8. Classification of Mad Vet groups
Though we have achieved our two main goals, another natural question remains: When
a Mad Vet semigroup is a group, just exactly what group is it? We turn to another
area of mathematics, namely linear algebra, for an algorithmic way of finding the
structure of any Mad Vet group. Note that a Mad Vet semigroup must be a group in
order for this method to apply.
Let $\Gamma$ be the Mad Vet graph of a Mad Vet scenario whose Mad Vet semigroup
is a group. The graph $\Gamma$ has an associated incidence matrix $A_\Gamma$, defined as
follows: Suppose $\Gamma$ has n vertices, $v_1, v_2, \ldots, v_n$. Then $A_\Gamma$ is the
$n \times n$ matrix $(d_{ij})$, where $d_{ij}$ is the number of edges with initial vertex
$v_i$ and terminal vertex $v_j$ (for all $1 \le i, j \le n$).
For example, if $\Gamma$ is the graph of Scenario #1, then
$$A_\Gamma = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$
First, we form the matrix $I_n - A_\Gamma$, where $I_n$ is the $n \times n$ identity
matrix. For instance, using the above matrix $A_\Gamma$, we have
$$I_3 - A_\Gamma = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 0 & -1 \\ -1 & -1 & 1 \end{pmatrix}.$$
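As an illustration of how the incidence matrix and the matrix $I_n - A_\Gamma$ can be
built from a list of machines, here is a small sketch for Scenario #8. It assumes, as the
edge labels in that example suggest, that the entry in row i and column j counts the
animals of species j produced when a machine is fed one animal of species i. The
dictionary layout and the function names are our own choices.

```python
def incidence_matrix(species, machines):
    """machines[s][t] = number of animals of species t produced from one animal of species s."""
    index = {name: k for k, name in enumerate(species)}
    n = len(species)
    A = [[0] * n for _ in range(n)]
    for source, outputs in machines.items():
        for target, count in outputs.items():
            A[index[source]][index[target]] = count
    return A

def identity_minus(A):
    """Return I_n - A for a square integer matrix A given as a list of rows."""
    n = len(A)
    return [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]

species = ["cat", "dog", "mouse"]
machines = {
    "cat": {"dog": 2, "mouse": 5},    # Machine 1
    "dog": {"cat": 3, "mouse": 3},    # Machine 2
    "mouse": {"cat": 1, "dog": 1},    # Machine 3
}
A8 = incidence_matrix(species, machines)
M8 = identity_minus(A8)
```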
Then we put the (square) matrix $I_n - A_\Gamma$ in Smith normal form. The Smith normal
form of an $n \times n$ matrix having integer entries is a diagonal $n \times n$ matrix
whose diagonal entries are nonnegative integers
$\alpha_1, \alpha_2, \ldots, \alpha_q, 0, 0, \ldots, 0$
such that $\alpha_i$ divides $\alpha_{i+1}$ for each $1 \le i \le q - 1$. The Smith normal
form of a matrix A can be obtained by performing on A a combination of these matrix
operations: interchanging rows or columns, or adding an integer multiple of a row
[column] to another row [column]. The resulting Smith normal form of matrix A is thus of
the form PAQ, where P and Q are integer-valued matrices with determinants equal to
$\pm 1$. Many computer algebra systems have a built-in Smith normal form function.* For
more information about the Smith normal form of a matrix, see, for example, Stein
[10] or Chapter 23 in Hogben [8].
Here's a way of answering the "just exactly what group is it?" question.
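MAD VET GROUP IDENTIFICATION THEOREM. Given a Mad Vet scenario whose
Mad Vet semigroup, W, is a group, let $\Gamma$ be its associated Mad Vet graph. Then
$$W \cong \mathbb{Z}_{\alpha_1} \times \mathbb{Z}_{\alpha_2} \times \cdots \times \mathbb{Z}_{\alpha_q} \times \mathbb{Z}^{n-q},$$
where $\alpha_1, \alpha_2, \ldots, \alpha_q$ are the nonzero diagonal entries of the Smith
normal form of the matrix $I_n - A_\Gamma$.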
The justification of this theorem is beyond the scope of this article, but the very
enthusiastic reader can find a similar justification in Section 3 of Abrams et al. [1].

* For instance, to use Maple to compute the Smith normal form of a matrix B, define B in Maple, load the
package LinearAlgebra, and use the command SmithForm(B). A word of caution: the Smith normal form function
in some computer algebra systems will not find the Smith normal form of a matrix of determinant 0, even though
such a Smith normal form always exists in this case. A matrix of that type may arise in some Mad Vet scenarios;
indeed, it arises in one of our eight numbered Mad Vet scenarios.
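EXAMPLE. Letting $\Gamma$ be the Mad Vet graph of Scenario #1, the Smith normal form
of the matrix $I_3 - A_\Gamma$ is the matrix
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$
Because we already know that Scenario #1's semigroup is a group, the Mad Vet Group
Identification Theorem implies that it is isomorphic to
$\mathbb{Z}_1 \times \mathbb{Z}_1 \times \mathbb{Z}_3 \cong \{0\} \times \{0\} \times \mathbb{Z}_3 \cong \mathbb{Z}_3$,
as expected.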
See if you can now use this method to identify the three groups which arise among
Scenarios #3–7. Finally, try applying this method to Scenario #8; you should get that
the Mad Vet group in that case is isomorphic to $\mathbb{Z}_{34}$.
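For readers without a computer algebra system at hand, the nonzero diagonal entries of a
Smith normal form can also be found from determinants alone. The sketch below does not
perform the row and column operations described above; instead it uses the standard fact
that the product of the first k diagonal entries equals the greatest common divisor of all
$k \times k$ minors. It is meant only for small matrices, and the function names are ours.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Determinant of a small square integer matrix, by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def invariant_factors(m):
    """Nonzero diagonal entries of the Smith normal form of a square integer matrix."""
    n = len(m)
    factors, previous = [], 1
    for k in range(1, n + 1):
        g = 0
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                g = gcd(g, abs(det([[m[r][c] for c in cols] for r in rows])))
        if g == 0:             # every k x k minor vanishes: remaining entries are 0
            break
        factors.append(g // previous)
        previous = g
    return factors

scenario_1 = [[1, -1, 0], [-1, 0, -1], [-1, -1, 1]]      # I_3 - A for Scenario #1
scenario_8 = [[1, -2, -5], [-3, 1, -3], [-1, -1, 1]]     # I_3 - A for Scenario #8
factors_1 = invariant_factors(scenario_1)
factors_8 = invariant_factors(scenario_8)
```

If the list returned has q entries, the theorem's group has q finite cyclic factors and
n - q copies of the integers, so a matrix of determinant 0 is handled without any special
treatment.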
9. Beyond the Mad Vet
By this point, you may be wondering: Who really cares about Mad Vet semigroups
anyway? Good question! In case you are not convinced that Mad Vet semigroups are
of interest in their own right, we present the following theorem. Although this result
is rather technical, our point in stating it is to emphasize the fact that Mad Vet semi-
groups do indeed play a central role in current, active lines of mathematical research.
Not only that, but this theorem actually bridges two apparently different branches of
mathematics (algebra and analysis) and the Graph Semigroup Group Test is exactly
the link between them.
PURELY INFINITE SIMPLICITY THEOREM. For a finite directed sink-free graph
$\Gamma$, the following are equivalent:
(1) The Leavitt path algebra $L_{\mathbb{C}}(\Gamma)$ is purely infinite and simple. (This
is a statement about an algebraic structure.)
(2) The graph $C^*$-algebra $C^*(\Gamma)$ is purely infinite and simple. (This is a
statement about an analytic structure.)
(3) $\Gamma$ satisfies the conditions of the Graph Semigroup Group Test.
(4) The graph semigroup $W_\Gamma$ is a group.
In the interest of brevity, we have not stated the most general form of this result.
Pardo's direct proof of the equivalence of (3) and (4), which involves only
undergraduate-level graph- and group-theoretic ideas, is new; the only published proof of
this equivalence of which the authors are aware involves showing that both (3) and (4)
are equivalent to (1). The very energetic reader may wish to consult Aranda Pino et al. [3].
Finally, as promised earlier, here is a description of the Mad Vet semigroups arising
in Scenarios #3–7. In order, these scenarios' semigroups are (up to isomorphism) the
group $\mathbb{Z}_2 \times \mathbb{Z}_2$, a 7-element nonmonoid, the group $\mathbb{Z}$,
the nonmonoid $\mathbb{Z}^+$, and the group $\mathbb{Z}_4$. For details, see our
"Analyses of Mad Vet Scenarios #3–7," available at the MAGAZINE website.
Acknowledgment The authors express their gratitude to Enrique Pardo for allowing them to use and modify
his proof of the Graph Semigroup Group Test for this article; to Amelia Taylor and Brian Hopkins for carefully
reading and offering helpful suggestions about the article; and to Ken Ross for his valuable comments, advice, and
support. The first author was introduced to Mad Veterinarian puzzles at a June 2008 workshop on Math Teachers'
Circles, sponsored by the American Institute of Mathematics, Palo Alto, CA. The author is grateful for AIM's
support.
REFERENCES
1. G. Abrams, P. N. Ánh, A. Louly, and E. Pardo, The classification question for Leavitt path algebras, Journal
of Algebra 320(5) (2008) 1983–2026. doi:10.1016/j.jalgebra.2008.05.020
2. P. Ara, M. A. Moreno, and E. Pardo, Nonstable K-Theory for graph algebras, Algebra Rep. Th. 10 (2007)
157–178. doi:10.1007/s10468-006-9044-z
3. G. Aranda Pino, F. Perera, and M. Siles Molina (eds.), Graph Algebras: Bridging the Gap between Analysis
and Algebra, Universidad de Málaga Press, Málaga, Spain, 2007.
4. Norman L. Biggs, E. Keith Lloyd, and Robin J. Wilson, Graph Theory 1736–1936, Oxford University Press,
New York, 1999.
5. John B. Fraleigh, A First Course in Abstract Algebra, 7th ed., Addison Wesley, Boston, 2002.
6. P. A. Grillet, Commutative Semigroups, Springer, New York, 2001.
7. Robert S. Harris, Bob's Mad Veterinarian Puzzles, http://www.bumblebeagle.org/madvet/index.html.
8. Leslie Hogben, ed., Handbook of Linear Algebra, Chapman & Hall/CRC, Boca Raton, FL, 2006.
9. John M. Howie, Fundamentals of Semigroup Theory, Oxford Science Publications, Oxford, UK, 1996.
10. William Stein, Finitely generated abelian groups, http://modular.fas.harvard.edu/papers/ant/html/node9.html.
11. Douglas B. West, Introduction to Graph Theory, 2nd ed., Prentice-Hall, Upper Saddle River, NJ, 2000.
12. Robin J. Wilson and John J. Watkins, Graphs: An Introductory Approach: A First Course in Discrete
Mathematics, Wiley, New York, 1990.
13. Joshua Zucker, Math Teachers' Circle: An introduction to problem solving,
http://www.mathteacherscircle.org/materials/JZproblemsolvingstrategies.pdf.
Summary In this paper, we explore Mad Veterinarian scenarios. We show how these recreational puzzles nat-
urally give rise to semigroups (which are sometimes groups), and we point out a beautiful, striking connection
between abstract algebra and graph theory. Linear algebra also plays a role in our analysis.
GENE ABRAMS received his Ph.D. in Mathematics from the University of Oregon in 1981 under the direction
of Frank Anderson. He is pleased to have coauthored this article with a (much younger, much wiser) mathematical
sibling! He has been an algebraist at the University of Colorado at Colorado Springs since 1983. He is proud to
have been designated as a University of Colorado systemwide President's Teaching Scholar, as well as the 2002
MAA Rocky Mountain Section Distinguished Teaching Award recipient. When not out riding his bicycle, he
surrenders to his passions for baseball and the New York Times Sunday Crossword.
JESSICA K. SKLAR received her Ph.D. in Mathematics from the University of Oregon in 2001, and is happy
to have collaborated on this paper with a mathematical older brother. She is a Pacific Lutheran University
algebraist and animal enthusiast. She swears she would never transmogrify her cats into goldfish, but wouldn't
mind turning her neighborhood woodpeckers into something less destructive. Like tapirs. Or grizzly bears.
The Ergodic Theory Carnival
JULIA BARNES
Western Carolina University
Cullowhee, NC 28723
LORELEI KOSS
Dickinson College
Carlisle, PA 17013
Ladies and gentlemen, children of all ages. Come one, come all, to see the amazing
sights at our ergodic theory carnival! Step right up, friends, and we will show you some
of the mysteries seen around the carousel and in a taffy pulling booth. We will see a
carnival photographer and find out what kinds of carousel rotations work best for her
photographs. We will meet a magician who knows how to find a jewel in a pile of taffy
without getting his hands sticky!
You've got to see it to believe it, but these situations can be analyzed by an area
of mathematics called ergodic theory. That's right, folks, not only will we look at a
collection of basic piecewise linear functions that model activities at the carnival, but
we will also use ergodic theory to distinguish between these activities. Come right
over and watch how very small differences in local behavior cause big differences in
the long term behavior of functions!
What else is ergodic theory good for, you ask? Well, let me tell you. You can use
it to explain what happens to a system over time. This marvelous mathematics was
first used to study statistical mechanics and investigate the motion or flow of gases
over time [7, 10, 15]. But wait! There's more! For no extra cost you can use ergodic
properties in number theory to calculate how frequently any digit occurs in the
real-number base $\beta > 1$ expansion of a number in [0, 1] [3, 12, 15]. Believe it or
not, you can even use ergodic theory in the field of environmental science to assess the validity
of ecosystem models for pine forests [11]. An ergodic function has the property that if
you look long enough at its iterates on an arbitrary point you can obtain information
that represents the entire system. Starting at any other point gives you exactly the same
information. There is no sleight of hand here, folks; what you see is what you get.
Gather around and watch what we are going to do! Grab some cotton candy, bring
your mathematical intuition, and join us for a great show.
Basic examples
As you enter our carnival, stop first at the carousel with its artistically crafted horses
and distinctive music. Find a place to stand by the side of the carousel and watch the
activity for a while. Notice the photographer taking pictures of children riding on the
carousel. She has set up her tripod at the best vantage point, and she takes a picture
every time the carousel stops.
As a mathematician, you notice that each movement of the carousel can be de-
scribed as a function on a circle, ignoring the up-and-down movement of the horses.
Pick the horse nearest to you on the edge of the carousel and call its initial point zero.
Let the circumference of the carousel be one unit. As the carousel moves, the distance
Math. Mag. 83 (2010) 180–190. doi:10.4169/002557010X494823. © Mathematical Association of America
of the horse from you, measured along the circumference of the carousel in the direc-
tion of motion, increases from 0 to 1. But wait! When it has traveled one unit, it is back
at its initial point. The location of any horse at any instant is described by its distance
along the edge of the circle in the counterclockwise direction, or a number in [0, 1]
where 0 and 1 represent the same location.
Let's practice by describing the motion of the horses while the carousel is stopped
to let children on. If a horse starts at the location x, let I(x) be its location at the end
of this motion. It isn't moving! So, I(x) = x. That was easy!
The operator starts the carousel again and could stop it after it travels any distance.
For now, the horse that starts at zero moves halfway around the circle and stops. It has
been a very short ride for everyone, and because the carousel is a solid structure, every
horse has moved exactly halfway around the circle. Using your mathematical skills,
you think of a function to represent this circular motion. While you might consider a
function taking the circle to itself, here we represent the distance traveled along the
edge of the carousel as a function from [0, 1] to [0, 1]. If a horse starts at x, let C(x)
be its location at the end of this motion. Then C(x) is defined by
$$C(x) = \begin{cases} x + 1/2 & \text{if } 0 \le x < 1/2 \\ x + 1/2 - 1 & \text{if } 1/2 \le x < 1 \end{cases} \;=\; (x + 1/2) \bmod 1.$$
The graph of C(x) is shown in FIGURE 1. This function is often called a rational
rotation of the circle because we are rotating by the rational number 1/2. Defining
C(x) as a piecewise function might seem complicated, but viewing it this way will
simplify the ideas presented later.
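The carousel function is easy to experiment with. The sketch below is one possible
encoding, assuming positions are stored as exact fractions in [0, 1); the optional
parameter `a` for the rotation amount is our own addition.

```python
from fractions import Fraction

def C(x, a=Fraction(1, 2)):
    """Rotate the point x in [0, 1) by the amount a around a carousel of circumference 1."""
    return (x + a) % 1

x = Fraction(1, 8)
once = C(x)        # after one half-turn the horse is halfway around the circle
twice = C(once)    # a second half-turn brings it back to where it started
```

Exact fractions avoid floating-point rounding, so the return to the starting point after
two half-turns can be checked with a plain equality.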
[Figure 1: The carousel function C(x) = (x + 1/2) mod 1.]
Next to the carousel, you see a booth where two clowns are pulling taffy. You watch
as the first clown holds one end of the taffy and the second clown stretches it
to twice its length. Then the second clown folds the taffy over so that the end he was
holding is on top of the end that the first clown is holding. FIGURE 2 illustrates this
taffy folding method. The second clown then picks up the newly created end at the
folded crease, and the process is repeated.
You notice that each step of this process can be described as a function on the
interval [0, 1]. Let the first clown be at location zero, and define the original length of
the taffy to be one unit. When the second clown stretches the taffy to twice its length
and folds it over, his end moves from a place one unit from the first clown to zero units
from the first clown. A point in the middle of the original taffy ends up one unit away
from the first clown and in the second clown's hands. We can use the map
$$T(x) = \begin{cases} 2x & \text{if } 0 \le x < 1/2 \\ -2x + 2 & \text{if } 1/2 \le x < 1 \end{cases}$$
to describe the taffy pull, and the graph of T(x) appears in FIGURE 3. This map is
commonly referred to as a tent map because of the shape of the graph.
[Figure 2: (a) original taffy; (b) stretch the taffy to twice its original length; (c) fold taffy in half; (d) smush taffy.]
[Figure 3: The taffy function T(x).]
As you watch the repetitive motion of the clowns, a magician appears and, with a
sly smile, leans over the taffy and drops a shiny jewel into the sticky mess. It lands
about 3/4 of the distance from the first taffy puller toward the second taffy puller and,
after one quick stretch and fold, you catch a glimpse of it about halfway between them.
The taffy pullers are stretching and folding so quickly that you lose track of the jewel,
and you wonder if the magician will be able to find it again.
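The jewel's first move can be checked against the taffy map. This is a sketch that
assumes the jewel's position is stored as an exact fraction of the taffy's length; the
helper `iterate` is hypothetical and simply records the positions after repeated
stretch-and-fold steps.

```python
from fractions import Fraction

def T(x):
    """One stretch-and-fold step of the taffy pull (the tent map)."""
    return 2 * x if x < Fraction(1, 2) else -2 * x + 2

def iterate(x, steps):
    """Positions of a point after 0, 1, ..., steps applications of T."""
    positions = [x]
    for _ in range(steps):
        positions.append(T(positions[-1]))
    return positions

jewel_start = Fraction(3, 4)
jewel_after_one_fold = T(jewel_start)     # halfway between the two clowns
sample_path = iterate(Fraction(1, 5), 3)  # a different starting point, followed for 3 folds
```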
Invertibility
While you stand there eating your cotton candy and watching the carnival sights, you
contemplate how the attractions you have seen are similar and how they are different.
You can begin by comparing the properties of the carnival functions we have already
defined, I(x), C(x), and T(x).
What would happen if the carousel were rotated in reverse? What if the taffy pullers
were to try to undo their work? It is easy to see that I(x) can be reversed. That is, since
the horses don't move at all, every horse arriving at I(x) comes from one previous
point (in this case, x). Mathematically, this property is called invertibility. A function
f(x) is called invertible if it is one-to-one, so that for any element y in the range of f,
there is exactly one element x in the domain with f(x) = y. Even when the carousel
rotates, it is certainly possible to undo the rotation, sending each horse backwards to
the place where it started. Therefore, the carousel function, C(x), is invertible. (Parents
are lucky that the carousel is invertible; if it weren't, reversing the direction of the
carousel would take a horse and child back to more than one location: the one he or
she originally started from as well as cloned duplicates of the child in other locations.
Children are hard enough to keep up with already!)
Attempting to invert T(x), however, is a little more sticky. Notice that T(1/4) =
T(3/4) = 1/2. That is, applying the taffy function in reverse would take each portion
of the taffy, break it into two pieces, and send these pieces to different locations. It
becomes a gummy mess, which is what is expected if one attempts to unmix taffy. It
also means that the taffy function is not invertible.
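The failure of invertibility can be made concrete by listing preimages. In this sketch
(the function `preimages` is our own helper, not part of the article), solving 2x = y on
the left half and -2x + 2 = y on the right half gives the two candidates y/2 and 1 - y/2.

```python
from fractions import Fraction

def T(x):
    """The taffy (tent) map on [0, 1)."""
    return 2 * x if x < Fraction(1, 2) else -2 * x + 2

def preimages(y):
    """All points x with T(x) = y, for 0 < y <= 1, in increasing order."""
    return sorted({y / 2, 1 - y / 2})

half = Fraction(1, 2)
sources_of_half = preimages(half)      # two different points of taffy land on 1/2
sources_of_one = preimages(Fraction(1))  # only the midpoint is stretched out to 1
```

Every value strictly between 0 and 1 has two preimages, so no single-valued inverse of T
exists.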
Lebesgue measure
Now, we take you on a quick trip away from the midway. Up next, we show you the
strange and mystifying sideshow attraction of measure theory. Those with sensitive
stomachs should look away as we generalize the concept of length to frightening and
grotesque subsets of the real line.
Step right up, ladies and gentlemen, young and old, to see the wonderful and
mysterious Lebesgue measure. If you have previously seen the secrets of the fantastic
integration developed by Henri Lebesgue then you may move immediately to the next
section of our carnival. But no one else should miss this attraction!
The familiar Riemann integration that you learned in calculus originated in the work
of Newton and Leibniz, and it only works on functions that are relatively nice. In
particular, we expect the sets that we use to be no more complicated than countable
unions of disjoint intervals contained in [0, 1]. Using that the length of an interval
[a, b] is $l([a, b]) = b - a$, we can clearly define the length of sets that are countable
unions of disjoint intervals. The length of the set is just the sum of the lengths of the
intervals. This concept of length is critical to the definition of Riemann integration.
Henri Lebesgue worked to extend the concepts of integration to functions that are
much more bizarre. He did this by generalizing the notion of length to what is called
a measure that is defined on more complicated sets. We now offer entire classes (one
may be starting soon in the Chautauqua tent right over there!) on the theory of
measurable sets, measures, and integration, and mathematicians are still conducting
interesting research in these areas. Here, we sketch an outline of the development of
Lebesgue measure, the details of which can be found in Halmos's book on measure theory [6].
To begin, we define Lebesgue outer measure $m^*$, which is a function defined on all
subsets E of [0, 1]. First, we take a countable collection of open intervals whose union
contains the set E and find the sum of the lengths of the intervals in that collection.
Then we take the greatest lower bound of these sums over all such collections of open
intervals covering E. This serves to minimize any overlap and measure E as closely
as possible. The greatest lower bound is called the outer measure of E, or $m^*(E)$.
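Written as a single formula, the construction just described reads as follows. The
symbols $m^*$ for outer measure and $I_k$ for the covering intervals are notational
assumptions made here for illustration; $l$ denotes the length of an interval, as above.

```latex
m^*(E) \;=\; \inf \left\{ \sum_{k=1}^{\infty} l(I_k) \;:\; I_1, I_2, \ldots \text{ are open intervals with } E \subseteq \bigcup_{k=1}^{\infty} I_k \right\}
```

For an interval E = [a, b], covering E by a single open interval only slightly longer
than E shows that the infimum is at most b - a.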
Now, we really want to have the relationship $m^*(E) + m^*([0, 1] \setminus E) = 1$. That
is, E and its complement should surely combine to have the length of [0, 1], and no
more. That's just common sense! When that happens, we say that E is a Lebesgue
measurable set, and we define the Lebesgue measure of E, $m(E)$, to be $m(E) = m^*(E)$.
If [a, b] is an interval, then $m([a, b]) = b - a$, as we expect. So Lebesgue measure is a
generalized length function that can be applied to more complicated subsets of [0, 1].
But not all subsets! Unfortunately, there are some complicated subsets of the inter-
val (sideshow horrors, unsuitable for most visitors) for which Lebesgue outer measure
gives rise to some paradoxes that conflict with properties that we expect any length
function to have, so Lebesgue outer measure is not really a length function. Do you
want to enter the Sideshow of Strange Pathologies? No, no, turn back! The uninitiated
may be shocked by the behavior of sets born from the Axiom of Choice. Skip the next
two paragraphs!
We will now construct for you a set that has no Lebesgue measure. The first step is
to suppose that x and y are two numbers in [0, 1] and define x to be equivalent to y
if and only if $x - y$ is a rational number. That seemingly tame axiom we mentioned
allows us to conjure up a subset of [0, 1] that contains exactly one element from each
equivalence class; call it N. For each rational number r in [0, 1], we define another
subset $N_r$ as follows:
$$N_r = \{x + r : x \in N \cap [0, 1 - r)\} \cup \{x + r - 1 : x \in N \cap [1 - r, 1]\}.$$
This sleight of hand moves N r units to the right, and then moves the part that extends
beyond the point 1 backwards by 1 unit. It takes a little bit of work, but it is not
difficult to show that [0, 1] is the disjoint union of the sets $N_r$.
The length of $N_r$ for each r should be the same because they are just translations of
N, but the sum of the lengths of the $N_r$'s over the countably infinite rational numbers
in [0, 1] must be the length of the entire interval. If each $N_r$ has a positive length
then the sum would be infinite, contradicting your knowledge that the length of [0, 1] is
one. Similarly, if each $N_r$ has length 0, then the sum would be 0. This is a paradox,
and we end up with a strange and alarming set whose length cannot be measured.
Welcome back, and for the sake of your sanity, be glad that you skipped the last two
paragraphs!
Measure-preserving functions
Back in the safety of the midway, we return to comparing the properties of the func-
tions that we have seen. How do the carousel and taffy pull treat our new friend
Lebesgue measure? Do children change in size? Does the amount of taffy shrink?
Mathematically, we are asking whether the functions preserve measure. To introduce
the formal definition, we need to define the preimage of the set A as the set
$f^{-1}(A) = \{x : f(x) \in A\}$.
DEFINITION 1. If $\lambda$ is Lebesgue measure on [0, 1] and $f : [0, 1] \to [0, 1]$ is a
function, then f preserves the measure $\lambda$ if $\lambda(f^{-1}(A)) = \lambda(A)$ for every measurable
set A.
If we consider the identity map, the inverse image of any measurable set A is simply
itself, $I^{-1}(A) = A$, so it easily follows that I preserves measure.
The carousel function, C(x), simply rotates every point x to a location halfway
around the carousel, and $C^{-1}(x)$ rotates the carousel halfway in the other direction. If
we see a certain number of children on the carousel right now, there were the same
number there before the carousel rotated. The children did not multiply or disappear.
The measure of any set of children is not changed by $C^{-1}(x)$, and therefore C(x)
is measure-preserving. Even if we modify C(x) to rotate by an amount other than
1/2, $C_a(x) = (x + a) \bmod 1$ for some real number a, Lebesgue measure is preserved
because $C_a(x)$ is still just a translation. FIGURE 4 shows the graph of one example of
a modified carousel function, $C_{\sqrt{2}/2}(x) = (x + \sqrt{2}/2) \bmod 1$.
Unlike C(x), the taffy function T(x) is not a simple translation, so it may appear
as if this function does not preserve measure. This function is not even invertible! But,
does T(x) preserve measure?
When the taffy is pulled, the original slab of taffy is stretched to twice its length.
Then the taffy is folded over to make a new piece of taffy the same length as the
original piece. If we reverse the process, any set A in the interval [0, 1] of taffy has to
be unfolded into two pieces, and then each piece is shrunk to half of its length (yes, this
would be difficult to do in real life!). FIGURE 5 illustrates this procedure. Since each
piece is reduced by half its length and there are two pieces, the pre-image of A has the
same measure as A.
Figure 4 The modified carousel function $C_{\sqrt{2}/2}(x) = (x + \sqrt{2}/2) \bmod 1$
Figure 5 (a) select part of the taffy; (b) un-smush taffy; (c) unfold taffy; (d) shrink taffy back to original length
Looking at this more formally, imagine having an interval [a, b] in [0, 1]. Then,
$$T^{-1}([a, b]) = \left[\tfrac{a}{2}, \tfrac{b}{2}\right] \cup \left[1 - \tfrac{b}{2},\, 1 - \tfrac{a}{2}\right].$$
It follows that the Lebesgue measure of $T^{-1}([a, b])$ is
$$\left[\tfrac{b}{2} - \tfrac{a}{2}\right] + \left[\left(1 - \tfrac{a}{2}\right) - \left(1 - \tfrac{b}{2}\right)\right] = 2\left(\tfrac{b}{2} - \tfrac{a}{2}\right) = b - a,$$
which is the measure of [a, b]. Since this holds for all intervals and since any measurable set
A has a measure based on all intervals containing A, it follows that T is a measure-preserving
function. In other words, we don't lose any taffy in the process, and it is
spread evenly in each step.
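The interval computation above can be checked mechanically. The sketch below assumes that T is the usual tent map, T(x) = 2x on [0, 1/2) and T(x) = 2 - 2x on [1/2, 1], which is the form consistent with the pre-image formula; the function names are illustrative and are not taken from the article.

```python
from fractions import Fraction

def taffy(x):
    """Assumed form of T: stretch to twice the length, then fold back at 1."""
    return 2 * x if x < Fraction(1, 2) else 2 - 2 * x

def preimage_pieces(a, b):
    """The two intervals that make up T^{-1}([a, b])."""
    return [(a / 2, b / 2), (1 - b / 2, 1 - a / 2)]

def total_length(pieces):
    """Sum of the lengths of the given intervals (assumed non-overlapping)."""
    return sum(hi - lo for lo, hi in pieces)

a, b = Fraction(1, 5), Fraction(7, 10)
pieces = preimage_pieces(a, b)

# Each piece is half as long as [a, b], so together they have length b - a.
print(total_length(pieces), b - a)

# Grid check: the share of sample points k/n that T sends into [a, b].
n = 1000
hits = sum(1 for k in range(n) if a <= taffy(Fraction(k, n)) <= b)
print(hits / n)
```

Exact rational arithmetic is used so that the endpoint comparisons are not blurred by rounding; the grid count is only a rough sanity check on the exact length.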
The fact that our taffy function is measure-preserving is based on the fact that when
we mix the taffy, any newly mixed piece (interval) comes from two pieces which
are each half the length of the new piece. This is directly related to the fact that we
stretched the taffy to twice its length. What if we modify the taffy function to allow
stretching by a different amount? Suppose that a new taffy-pulling clown arrives at the
scene. Instead of stretching the taffy to twice its length, the new clown stretches the
taffy from a length of one to a length of 3/2 and then folds the newly stretched taffy
over, making a crease at the point one unit from 0 like before. This time, part of the
taffy is not covered by the newly stretched part, and the graph is not symmetric, as
seen in FIGURE 6. The new resulting taffy fold function becomes:
$$T_{3/2}(x) = \begin{cases} \tfrac{3}{2}x & \text{if } 0 \le x < \tfrac{2}{3} \\ -\tfrac{3}{2}x + 2 & \text{if } \tfrac{2}{3} \le x < 1. \end{cases}$$
Figure 6 The modified taffy fold function $T_{3/2}$
If we consider a portion of the taffy near 0, say the interval A = [0, 1/3], then the
measure of the pre-image of A, $T_{3/2}^{-1}(A)$, is (1/3)/(3/2) = 2/9. But the measure of A
is 1/3. Since A is a measurable set, $T_{3/2}$ is not a measure-preserving function, even
though no taffy is lost in the process. The main difference is that the taffy is not mixed
evenly in this case.
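The 2/9 figure can be confirmed with a short computation. The sketch below codes the piecewise formula for $T_{3/2}$ directly and estimates the size of the pre-image of A = [0, 1/3] on a grid; the helper names and the grid size are assumptions made for illustration.

```python
from fractions import Fraction

def taffy_three_halves(x):
    """T_{3/2}: stretch by 3/2, then fold back at the point 1."""
    if x < Fraction(2, 3):
        return Fraction(3, 2) * x
    return 2 - Fraction(3, 2) * x

def grid_preimage_measure(lo, hi, n=9000):
    """Share of grid points k/n whose image under T_{3/2} lies in [lo, hi]."""
    hits = sum(1 for k in range(n)
               if lo <= taffy_three_halves(Fraction(k, n)) <= hi)
    return Fraction(hits, n)

A = (Fraction(0), Fraction(1, 3))
estimate = grid_preimage_measure(*A)
exact = Fraction(1, 3) / Fraction(3, 2)   # length of the pre-image [0, 2/9]
print(float(estimate), float(exact), float(A[1] - A[0]))
```

Only the first branch contributes to the pre-image, because the folded part of the taffy lands above 1/2 and never reaches A.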
Folks, both our taffy and carousel functions as originally defined preserve Lebesgue
measure. However, when we modify the functions, all carousel-like functions preserve
Lebesgue measure, but not all taffy-like functions preserve Lebesgue measure.
Ergodicity
Ladies and gentlemen, you are now about to witness the secrets behind ergodicity and
how it relates to our carnival functions and modified versions of these functions. We
will first show you strange sets that are equal to their preimages.
What does this mean, you ask, to have a set A with $f^{-1}(A) = A$? Watch carefully
as our carousel carnies dress two children on opposite sides of our carousel in red
clown wigs. Keep your eyes wide open when the operator runs the carousel backwards.
That's right, friends, $C^{-1}(x)$ is a rotation halfway around the carousel, so no child ends
in the same location as he or she began. But wait! After a half rotation of the carousel
backwards, the red wigs are located in exactly the same positions as before, even if the
children themselves are in different locations! That's right, folks, if A denotes the set
of locations of red clown wigs, we have that $C^{-1}(A) = A$.
Why are sets with $f^{-1}(A) = A$ important? In general, if we have a measurable
subset $A \subset [0, 1]$ and measure-preserving function f such that $f^{-1}(A) = A$, then it
is also true that $f^{-1}([0, 1] \setminus A) = [0, 1] \setminus A$. In this case, we could simplify things;
we could study f by looking at its restriction to A independently from its restriction
to $[0, 1] \setminus A$. However, if $\lambda(A) = 0$ or $\lambda(A) = 1$, then we haven't significantly
simplified our study. Functions that cannot be simplified in this way are called ergodic. In
other words, if the measure of A is strictly between 0 and 1, then for f to be ergodic,
it is necessary that $f^{-1}$ moves at least part of the set A to somewhere else.
DEFINITION 2. If $\lambda$ is Lebesgue measure on [0, 1] and $f : [0, 1] \to [0, 1]$ is a
measure-preserving function, then f is ergodic if the only measurable sets A with
$f^{-1}(A) = A$ satisfy $\lambda(A) = 0$ or 1.
Although the carousel function C(x) is measure-preserving, it is not ergodic, and it
is very easy to construct a measurable set to verify this. Let $A = [0, 1/4) \cup [1/2, 3/4)$,
representing children riding in the first or third quadrants of the circle. Then A is
clearly measurable with $\lambda(A) = 1/2$, and $C^{-1}(A) = A$, so C(x) is not ergodic. We
will see in the next section why this is significant.
What about the other carousel-like functions defined by $C_a(x) = (x + a) \bmod 1$
for a real number a? We know that they are all measure-preserving, but are any of
these functions ergodic? If a = 0 then we have that $C_0(x)$ is the identity function
I(x), and in this case $I^{-1}(A) = A$ for any set A, so I(x) is clearly not ergodic. If
the translation number a is any other rational number, then $C_a$ is also not ergodic. For
when a is rational, a = p/q for some integers p and q with $q \ne 0$, and p and q have no
common factors besides 1. Define
$$A = \left[0, \tfrac{1}{2q}\right] \cup \left[\tfrac{1}{q}, \tfrac{3}{2q}\right] \cup \left[\tfrac{2}{q}, \tfrac{5}{2q}\right] \cup \cdots \cup \left[\tfrac{q-1}{q}, \tfrac{2q-1}{2q}\right].$$
Then $C_a^{-1}(A) = A$, but the measure of A is 1/2.
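A quick check of this invariant set, as a sketch with exact fractions: the choice p/q = 2/5, the grid, and the helper names are assumptions made for illustration. Membership in A is tested through the fractional part of qx, which lies in [0, 1/2] exactly when x sits in one of the q intervals.

```python
from fractions import Fraction

def carousel(x, a):
    """C_a(x) = (x + a) mod 1."""
    return (x + a) % 1

def in_A(x, q):
    """True when x lies in [k/q, k/q + 1/(2q)] for some k = 0, ..., q - 1."""
    return (q * x) % 1 <= Fraction(1, 2)

p, q = 2, 5
a = Fraction(p, q)
grid = [Fraction(k, 1200) for k in range(1200)]

# Rotating by p/q shifts q*x by the integer p, so membership in A is unchanged.
invariant = all(in_A(carousel(x, a), q) == in_A(x, q) for x in grid)
share = sum(in_A(x, q) for x in grid) / len(grid)
print(invariant, share)
```

Since x belongs to A exactly when $C_a(x)$ does, the pre-image of A is A itself, while the share of grid points in A stays near one half.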
If the translation number a of $C_a(x)$ is irrational, then we are in a much different
situation. If a is irrational, then $C_a(x)$ is ergodic. While this is difficult to prove
rigorously from the definition, it is not too challenging to see why the conditions of
ergodicity must hold on intervals. Suppose that the set A contained an interval [c, d]. We
know that no matter how many times we run the carousel, we always end up with a set
of length $d - c$. Since $C_a^{-1}(A) = A$, it follows that $C_a^{-1}([c, d]) \subset A$. Using the same
reasoning, $C_a^{-1}(C_a^{-1}([c, d])) \subset A$, and so on. However, since a is an irrational number,
the points $C_a^{-n}(c) = C_a^{-1} \circ C_a^{-1} \circ \cdots \circ C_a^{-1}(c)$, where we perform n compositions, fill
out the circle. That is, no matter where you decide to stand around the carousel, at some
time the left endpoint c will stop arbitrarily close to you. If kids with red clown wigs
were sitting in the interval [c, d], then no matter where you stand before the carousel
moves, at some time there will be a red wig almost directly in front of you; thus there
was a red wig in front of you before the carousel moved. So all points on the carousel
must belong to A, and $\lambda(A) = 1$. Recall that FIGURE 4 shows the graph of a modified
carousel function with $a = \sqrt{2}/2$, which we now know is ergodic since the translation
number is irrational.
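The claim that the backward orbit of the endpoint c comes close to every spot on the carousel can be watched numerically. This is only a sketch: it uses floating-point arithmetic, so $\sqrt{2}/2$ is approximated, and the starting point, targets, tolerance, and function names are all assumptions chosen for illustration.

```python
import math

def circle_distance(x, y):
    """Distance between two points of [0, 1) measured around the circle."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def first_close_visit(c, target, a, eps, max_steps=5000):
    """Smallest n >= 1 with C_a^{-n}(c) within eps of target, or None."""
    for n in range(1, max_steps + 1):
        point = (c - n * a) % 1.0          # C_a^{-n}(c) = (c - n a) mod 1
        if circle_distance(point, target) < eps:
            return n
    return None

a = math.sqrt(2) / 2                        # the translation number of FIGURE 4
for target in (0.0, 0.123, 0.5, 0.9):
    print(target, first_close_visit(c=0.3, target=target, a=a, eps=0.01))

# With the rational translation 1/2 the orbit only ever visits two points.
print(first_close_visit(c=0.3, target=0.05, a=0.5, eps=0.01))
```

A search like this illustrates density of the orbit but proves nothing about general measurable sets, which is exactly the gap the text goes on to warn about.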
Don't let this sleight of hand fool you into thinking that this is a complete proof
that $C_a(x)$ is ergodic when a is irrational! Remember from our earlier discussion that
Lebesgue measurable sets are more complicated than intervals, or even infinite unions
or intersections of intervals. Still, examining intervals gives us some idea about why
$C_a$ is ergodic when a is an irrational number, and you can find a complete proof in
standard ergodic theory books, like ones by Petersen [10] or Walters [15].
What about the taffy function T(x)? It, too, is ergodic. Again, we can examine
intervals to obtain a glimmer of understanding as to why this is true. Imagine that you
used red food coloring to color a visible section of your taffy that belonged to a set A.
After unfolding and shrinking, there would be a red piece closer to the first taffy puller
and a red piece closer to the second taffy puller, so those regions had to belong to the
original set A as well. Continuing this process, we see that if A contains an interval
and $T^{-1}(A) = A$, then $\lambda(A) = 1$. A rigorous proof can be found in Nicolis's book on
nonlinear science [9].
You might suspect that any measure-preserving, noninvertible function is ergodic,
but that is false. In fact, we can easily modify the taffy function to obtain a
measure-preserving function that isn't ergodic. Let's suppose that our original taffy stretching
clown, who was skilled at stretching the taffy to twice its original length, returns to the
booth. However, after the second clown stretches the taffy to twice its length, instead
of folding the taffy, he cuts it in half. Then each clown performs his own taffy fold at
the midpoint of his own piece. After this is completed, the second clown sticks the two
ends of his piece to the fold of the first clown's piece, so they now have a piece of taffy
that has length one again, and they repeat the process. We can represent this function
with the equation
$$S(x) = \begin{cases} 2x & \text{if } 0 \le x < 1/4 \\ -2x + 1 & \text{if } 1/4 \le x < 1/2 \\ 2x - 1/2 & \text{if } 1/2 \le x < 3/4 \\ -2x + 5/2 & \text{if } 3/4 \le x < 1. \end{cases} \tag{1}$$
See FIGURE 7 for a graph of S(x). Again, if we look at pre-images of any interval,
we end up with exactly two pieces of the same length. In addition, it is not difficult
to see that the graph in FIGURE 7 can be decomposed into the function on [0, 1/2]
and the function on [1/2, 1]. Using arguments similar to those for T(x), the function
S(x) is non-invertible and measure-preserving. However, S is not ergodic because
$S^{-1}([0, 1/2)) = [0, 1/2)$.
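The invariance of the left half under S can be tested point by point. The sketch below codes equation (1) with exact fractions and lists the grid points where "x is in [0, 1/2)" and "S(x) is in [0, 1/2)" disagree; the grid size is an assumption made for illustration.

```python
from fractions import Fraction

def S(x):
    """The cut-and-fold taffy map of equation (1)."""
    if x < Fraction(1, 4):
        return 2 * x
    if x < Fraction(1, 2):
        return -2 * x + 1
    if x < Fraction(3, 4):
        return 2 * x - Fraction(1, 2)
    return -2 * x + Fraction(5, 2)

half = Fraction(1, 2)
grid = [Fraction(k, 1000) for k in range(1000)]

# Grid points where membership of x and of S(x) in [0, 1/2) disagree.
exceptions = [x for x in grid if (x < half) != (S(x) < half)]
print(exceptions)
```

The only disagreement on this grid is the crease point x = 1/4, where S(1/4) = 1/2. A single point has measure zero, so the set equality in the text holds up to a set of measure zero, which is all that ergodicity cares about.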
So the original taffy function is ergodic but the carousel function is not. However,
as we have shown, we can modify the carousel function to obtain one that is ergodic,
and we can modify the taffy function to obtain one that is not!
Figure 7 The modified taffy fold function S(x)
The ergodic theorem
Folks, you may not yet be convinced that ergodic functions are useful or important,
but stick around to see the famous Birkhoff ergodic theorem, proved by George David
Birkhoff in 1931 [2]. The ergodic theorem ensures that what you observe is represen-
tative of the entire system. We will use this theorem to help our photographer, who
would like to take pictures of all children on the carousel. If she only takes photos
when the carousel stops, which carousel functions will allow her to photograph all of
the children? We will also use this theorem to help our magician, who has dropped the
jewel into the taffy.
Stick around, friends, and we will show you a simplified version of Birkhoff's
ergodic theorem that will resolve the conundrums of our photographer and magician. To
do this, we need to define the characteristic function of A,
$$\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$
THEOREM 3. (BIRKHOFF'S ERGODIC THEOREM) If $\lambda$ is Lebesgue measure on
[0, 1] and $f : [0, 1] \to [0, 1]$ is a measure-preserving function, then f is ergodic if
and only if
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_A(f^n(x)) = \lambda(A)$$
for each measurable set A and for almost every $x \in [0, 1]$ (for all $x \in [0, 1]$ except
for at most a set of measure 0).
The left-hand side of the equation in Birkhoff's ergodic theorem represents the limit
of the average number of times f(x), f(f(x)), f(f(f(x))), . . . lands in the set A.
This is commonly known as the time average, and the right-hand side of the equation
is known as the space average. In other words, for almost every possible point x,
the sequence f(x), f(f(x)), f(f(f(x))), . . . will eventually land in every set of positive
measure, and about as often as the measure of the set would indicate. The full statement
and proof of Birkhoff's ergodic theorem are beyond the scope of this paper, but we refer
the interested reader to Birkhoff's paper [2], or ergodic theory books by Petersen [10]
or Walters [15].
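The time average in the theorem is easy to estimate for the carousel maps. The sketch below is a floating-point illustration, not a proof: the set A = [0.2, 0.5), the starting point, and the number of steps are assumptions, and $\sqrt{2}/2$ is only approximated by the machine.

```python
import math

def time_average(a, x0, lo, hi, steps):
    """Share of the first `steps` iterates of C_a from x0 that land in [lo, hi)."""
    visits = 0
    for n in range(1, steps + 1):
        point = (x0 + n * a) % 1.0          # f^n(x0) for f = C_a
        if lo <= point < hi:
            visits += 1
    return visits / steps

lo, hi = 0.2, 0.5                           # the set A, with measure 0.3
irrational = time_average(math.sqrt(2) / 2, x0=0.1, lo=lo, hi=hi, steps=20000)
rational = time_average(0.5, x0=0.1, lo=lo, hi=hi, steps=20000)
print(irrational, rational)
```

For the irrational translation the time average settles near the space average 0.3. For the translation 1/2 the orbit of 0.1 only alternates between points near 0.6 and 0.1, so it never enters A and the time average stays at 0.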
How does the ergodic theorem apply to our photographer, who is taking pictures
every time that the carousel stops? If the carousel moves according to the original
carousel function C(x), the photographer would photograph the same two children
over and over again. This is because C(x) rotates exactly halfway around each time.
If we look at $C_a(x)$ for any rational a, she would still see only a finite number of children
as the day continues. She much prefers the motion described by $C_a(x)$ when a is
irrational. Why? Since we know this system is ergodic, Birkhoff's ergodic theorem
implies that almost every point along the edge of the carousel will eventually move
into the camera's field of view. The photographer does not have to move, yet she can
take photographs of each child if she waits long enough. If she had selected a different
location to set up her camera, she would still photograph every child. Hence, when a
is irrational, we have a happy photographer.
What about our magician? He simply asks a member of the audience to select one
small region on the table to stare at as the taffy pullers work. The magician is convinced
that the jewel will reappear in this one location as long as the group waits long enough.
Since we showed that the taffy function T(x) is ergodic in the previous section, the
ergodic theorem implies that he is correct. However, our magician knows better than to
do his jewel trick with the modified taffy function S(x) shown in FIGURE 7. He can't
guarantee that the audience member will choose a spot where the jewel will reappear
because S(x) is not ergodic.
This brings us back to the question of how ergodic theory is used. In physics, the
ergodic theorem implies that studying the motion of a single particle of gas over the
long term (the time average) gives the same information as looking at all particles at a
particular instant (the space average) [7, 10, 15]. Ergodicity is also useful in biomedical
signal and image processing. For many tests, such as the electrocardiogram (ECG) and
the electroencephalogram (EEG), technicians take only one sample recording from
a patient and calculate a time average. If the process is ergodic, then they can use the
time average to estimate the mean and variance of the signal (the space averages) using
the ergodic theorem [8]. These examples, my friends, are not just a day at the carnival.
A final question After spending a hot, sticky day at the midway, we want to leave
you with one more enticing idea that will compel you to return to our carnival again.
How can we distinguish between the ergodic examples? There are many other prop-
erties that play important roles in ergodic theory; we mention one more. We say a
function f is strong mixing if for all measurable sets A and B
$$\lim_{n \to \infty} \lambda(A \cap f^{-n}B) = \lambda(A)\lambda(B).$$
This means that, in the long run, f distributes B fairly evenly throughout [0, 1]. Strong
mixing implies ergodicity, but not all ergodic functions are strong mixing. One of the
ergodic examples in this paper is strong mixing with respect to Lebesgue measure, and
the other is not. Can you figure out which is which? The answers can be found in the
references [9, 15].
Acknowledgment The referees made insightful comments and suggestions that greatly improved this paper,
and so we novice carnival workers gratefully acknowledge the assistance of our Lot Managers.
REFERENCES
1. J. Barnes and J. Hawkins, Families of ergodic and exact one-dimensional maps, Dyn. Syst. 22(2) (2007) 203-217. doi:10.1080/14689360600914730
2. G. D. Birkhoff, Proof of the ergodic theorem, Proc. Natl. Acad. Sci. 17 (1931) 656-660. doi:10.1073/pnas.17.12.656
3. K. Dajani and C. Kraaikamp, Ergodic Theory of Numbers, Mathematical Association of America, Washington, DC, 2002.
4. W. de Melo and S. van Strien, One Dimensional Dynamics, Springer-Verlag, Berlin, 1993.
5. P. Collet and J. P. Eckmann, Iterated Maps on the Interval as Dynamical Systems, Birkhäuser, Boston, 1980.
6. P. R. Halmos, Measure Theory, Van Nostrand, New York, 1950.
7. U. Krengel, Ergodic Theorems, de Gruyter Studies in Mathematics #6, Walter de Gruyter, Berlin, 1985.
8. K. Najarian and R. Splinter, Biomedical Signal and Image Processing, CRC Press, Boca Raton, FL, 2006.
9. G. Nicolis, Introduction to Nonlinear Science, Cambridge University Press, Cambridge, UK, 1995.
10. K. Petersen, Ergodic Theory, Cambridge Studies in Advanced Mathematics #2, Cambridge University Press, Cambridge, UK, 1983.
11. S. Pietsch and H. Hasenauer, Using ergodic theory to assess the performance of ecosystem models, Tree Physiology 25 (2005) 825-837.
12. A. Rényi, Representations for real numbers and their ergodic properties, Acta Math. Acad. Sci. Hungar. 8 (1957) 477-493. doi:10.1007/BF02020331
13. D. Rudolph, Fundamentals of Measurable Dynamics: Ergodic Theory on Lebesgue Spaces, Clarendon Press, Oxford, UK, 1990.
14. C. E. Silva, An Invitation to Ergodic Theory, American Mathematical Society, Providence, RI, 2008.
15. P. Walters, An Introduction to Ergodic Theory, Springer-Verlag, New York, 1982.
Summary The Birkhoff ergodic theorem, proved by George David Birkhoff in 1931, allows us to investigate
the long-term behavior of certain dynamical systems. In this article, we explain what it means for a function to
be ergodic, and we present Birkhoff's theorem. We construct models of activities typically found at carnivals and
compare and contrast them by analyzing their ergodic theory properties. We use these carnival models to show
how Birkhoff's ergodic theorem can be used to help a photographer set up her equipment to take pictures of all
children on a carousel and to aid a magician in finding a lost jewel in a sticky mess of taffy.
JULIA BARNES received her Ph.D. from UNC-Chapel Hill in 1996, and has been teaching at Western Carolina
University ever since. Her research area is a cross between ergodic theory and complex dynamical systems.
Although she has not visited a carnival, ridden a carousel, or watched a clown pull taffy lately, she does enjoy
looking at fun applications of mathematics.
LORELEI KOSS is an associate professor in the Department of Mathematics and Computer Science at Dickin-
son College in Carlisle, Pennsylvania. She received a Ph.D. in Mathematics from the University of North Carolina
at Chapel Hill (1998). In addition to her interest in teaching undergraduate mathematics, she enjoys research on
complex dynamical systems and ergodic theory. She also loves taffy.
To appear in College Mathematics Journal, September 2010
THE FAIRNESS ISSUE
Articles
An Interview with Steven J. Brams, by Michael A. Jones
A Geometric Approach to Fair Division, by Julius Barbanel
Cutting Cakes Carefully, by Theodore P. Hill and Kent E. Morrison
Taking Turns, by Brian Hopkins
Who Does the Housework? by Angela Vierling-Claassen
Lewis Carroll, Voting, and the Taxicab Metric, by Thomas C. Ratliff
Gerrymandering and Convexity, by Jonathan K. Hodge, Emily Marshall, and
Geoff Patterson
Classroom Capsule
Visualizing Elections using Saari Triangles, by Mariah Birgen
Which Surfaces of Revolution
Core Like a Sphere?
VINCENT COLL
Lehigh University
Bethlehem PA 18015
[email protected]
JEFF DODD
Jacksonville State University
Jacksonville, AL 36265
[email protected]
A spherical ring is the object that remains when a cylindrical drill bit bores through a
solid sphere along an axis, removing from the sphere a capsule consisting of a cylinder
with a spherical cap on each end, as shown in FIGURE 1. Remarkably, the volume of
such a spherical ring depends only on its height, defined as the height of its cylindrical
inner boundary, and not on the radius of the sphere from which it was cut.
Figure 1 Cutting a spherical ring of height h from a sphere.
Figure 2 A spherical ring as a solid of revolution.
Math. Mag. 83 (2010) 191-199. doi:10.4169/002557010X494832. © Mathematical Association of America
One straightforward way to verify this fact is to note that all the objects in FIGURE 1
are solids of revolution. This is depicted in FIGURE 2, where everything shown in the
xy-plane is to be revolved around the x-axis. There a sphere of radius r is represented
by the semicircular graph of $y = \sqrt{r^2 - x^2}$, and a spherical ring of height h cut from
this sphere is represented by the shaded region below the semicircle and above the
horizontal line segment of length h inscribed in the semicircle. We can calculate the
volume of this spherical ring by integrating the areas of its annular cross-sections taken
perpendicular to the x-axis (the washer method):
$$V = \int_{-h/2}^{h/2} \pi\left[\left(\sqrt{r^2 - x^2}\right)^2 - \left(\sqrt{r^2 - (h/2)^2}\right)^2\right] dx = \int_{-h/2}^{h/2} \pi\left[(h/2)^2 - x^2\right] dx = \frac{\pi h^3}{6}.$$
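The washer integral can also be evaluated numerically without simplifying it first, which makes the disappearance of r visible. This is a minimal sketch using a midpoint rule; the function name, the number of slices, and the sample radii are assumptions made for illustration.

```python
import math

def ring_volume(r, h, slices=2000):
    """Midpoint-rule washer integral for a ring of height h cut from a sphere of radius r."""
    inner_sq = r * r - (h / 2) ** 2                 # squared radius of the drilled hole
    dx = h / slices
    total = 0.0
    for i in range(slices):
        x = -h / 2 + (i + 0.5) * dx
        outer_sq = r * r - x * x                    # squared radius of the sphere's cross-section
        total += math.pi * (outer_sq - inner_sq) * dx
    return total

h = 1.0
for r in (0.5, 1.0, 10.0, 1000.0):
    print(r, ring_volume(r, h), math.pi * h ** 3 / 6)
```

The case r = h/2 is the degenerate one in which the drill has radius zero, so the "ring" is the whole sphere of diameter h.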
At the outset it looks as though V should depend on both r and h, but it turns out
to be a function of h only. This is a surprise that challenges many people's intuition.
For example, a spherical ring of height one centimeter cut out of a sphere the size of
the earth has the same volume as a spherical ring of height one centimeter cut out of
a sphere the size of a baseball. How can this be? The reason is that while the inner
radius of the ring cut out of the earth is much larger, the radial thickness of this ring is
much smaller: about $2 \times 10^{-10}$ cm, which is less than the diameter of a hydrogen atom.
For spherical rings of any fixed height h cut out of spheres of increasing radius r, this
tradeoff between increasing inner radius (the quantity $\sqrt{r^2 - (h/2)^2}$ in FIGURE 2) and
decreasing radial thickness (the quantity $r - \sqrt{r^2 - (h/2)^2}$ in FIGURE 2) preserves a
fixed volume.
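The thickness figure quoted for the earth can be reproduced with a short calculation. The sketch below assumes a mean earth radius of about 6371 km and a baseball radius of about 3.7 cm, neither of which is given in the article. It also rewrites $r - \sqrt{r^2 - (h/2)^2}$ as $(h/2)^2 / (r + \sqrt{r^2 - (h/2)^2})$, because subtracting two nearly equal floating-point numbers would lose all the significant digits for a sphere this large.

```python
import math

def radial_thickness(r, h):
    """r - sqrt(r^2 - (h/2)^2), rewritten to avoid subtracting nearly equal numbers."""
    half = h / 2
    return half ** 2 / (r + math.sqrt(r * r - half ** 2))

EARTH_RADIUS_CM = 6.371e8      # assumed mean radius of the earth, about 6371 km
BASEBALL_RADIUS_CM = 3.7       # assumed radius of a baseball

h = 1.0                        # ring height in centimeters
print(radial_thickness(EARTH_RADIUS_CM, h))
print(radial_thickness(BASEBALL_RADIUS_CM, h))
```

Under these assumptions the earth-sized ring is roughly 2e-10 cm thick, while the baseball-sized ring is a few hundredths of a centimeter thick.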
This property of the sphere appears in many calculus textbooks as an exercise in
calculating volumes of solids of revolution. It has also caught the eye of many recre-
ational mathematicians, perhaps getting its most public airing in the newspaper column
of Marilyn vos Savant [11]. But, despite its prominence, it seems to lack a name. Since
the process of cutting a spherical ring out of a sphere is much like coring an apple, we
refer to this property as the coring property of the sphere.
Many surfaces of revolution can be similarly cored by cylindrical drill bits centered
on their axes of revolution. So it is natural to ask to what extent the coring property
characterizes the sphere among surfaces of revolution. Here we pose this question pre-
cisely and answer it completely using only elementary ideas from calculus, informed
at critical junctures by geometric insight.
The coring property The first order of business is to state the coring property in
such a way that it applies to surfaces of revolution other than spheres. The coring
property of the sphere compares spheres of different radii r, but each of these is just
the unit sphere scaled up or down by the linear scale factor r. So we say that a surface
of revolution satisfies the coring property if, when the surface is scaled up or down
by a linear scale factor and then cored by a cylindrical drill bit centered on its axis of
revolution, what remains (exterior to the drill bit) is a ring whose volume depends only
on its height, and not on the scale factor. We define a ring to be a one-piece solid of
revolution having a single cylindrical inner boundary, and the height of such a ring to
be the height of its cylindrical inner boundary.
To flesh out this formulation of the coring property, and to give us a workable setup
for our investigation of it, we need a picture. In general, a surface of revolution S is
generated by revolving a plane curve C, called the profile curve of S, around a line
lying in the same plane as C, which we have already called the axis of revolution
of S. In particular, a sphere is the surface generated by revolving a semicircle around
the line containing its diameter. (In fact, this is how Euclid defined a sphere in his
Elements [6, p. 261]!) Since we are essentially generalizing a property of the sphere,
we begin with a profile curve looking much like a semicircle, as depicted in FIGURE 3.
Figure 3 An even profile function y = f(x) scaled by a linear scale factor r.
The profile curve in FIGURE 3 is the graph of an even profile function y = f(x)
and is to be revolved around the x-axis. We scale the surface S generated by the
graph of f by a linear scale factor r, yielding surfaces S(r) generated by the curves
y/r = f(x/r), or y = r f(x/r). (For example, if S is a sphere of radius $\rho$, then S(r)
is a sphere of radius $r\rho$.) We can cut a ring out of the solid bounded by S(r) by boring
through it with a cylindrical drill bit centered on the x-axis. The resulting ring is
generated by revolving the shaded region around the x-axis in FIGURE 3. We say that the
surface S satisfies the coring property if the volume V(r, h) of a ring of height h cut
out of the solid bounded by S(r) is a function of h alone.
Before striking out in search of surfaces satisfying the coring property, let's examine
the assumptions implicit in FIGURE 3, since these will be the hypotheses for any
conclusions that we reach based on this picture. To begin with, the profile curve in
FIGURE 3 is not self-intersecting and it has exactly two x-intercepts. We accept these
assumptions as geometrically natural, because they ensure that the resulting surface S
is closed: that is, it encloses a single 3-dimensional region.
Two other prominent features of this profile curve are:
1. It is the graph of a function y = f (x).
2. It has a vertical line of symmetry, which conveniently and with no loss of generality
is the y-axis.
These assumptions are not quite as cumbersome as they might seem because, for our
purposes, the first is subsumed by the second. That is, if a curve C generates a surface
that satisfies the coring property and if C is symmetric with respect to the y-axis, then y
must be a function of x on C. This is because for any profile curve C that is symmetric
with respect to the y-axis on which y is not a function of x, there will be values of h
for which two or more rings having the same height h but different volumes can be cut
out of the surface generated by C by cylindrical drill bits of different sizes, so that the
volume of a ring cannot be a function of its height alone. For example, consider the
profile curve C indicated in FIGURE 4. For the value of h indicated there, cylindrical
drill bits of radii $R_1$, $R_2$, and $R_3$ will cut rings out of the surface generated by C having
the same height h but different volumes. A surface generated by a curve C having a
vertical line of symmetry is centrally symmetric. That is, it has a center of symmetry:
a point P (in this case the origin) bisecting every line segment passing through P that
connects two points on the surface.
Figure 4 A symmetric profile curve not defined by a function.
So a closed, centrally symmetric surface of revolution S satisfying the coring property
must be generated by the graph of an even profile function f having exactly two
x-intercepts. In addition, f must be increasing to the left of x = 0 and decreasing to
the right of x = 0, since only then will coring the surface S with a cylindrical drill bit
always result in what we have defined to be a ring, which needs to be in one piece.
Therefore, to determine which closed, centrally symmetric surfaces of revolution
satisfy the coring property, it is safe to use FIGURE 3 as a starting point.
The symmetric case: a calculus argument The volume V(r, h) of the ring formed
in FIGURE 3 is twice the volume of the right half of the ring, which is the volume
enclosed by S(r) on the interval $0 \le x \le h/2$ less the volume of the cylinder drilled
out on that same interval:
$$V(r, h) = 2\left[\int_0^{h/2} \pi\left[r f\left(\tfrac{x}{r}\right)\right]^2 dx - \pi\left[r f\left(\tfrac{h}{2r}\right)\right]^2 \tfrac{h}{2}\right]. \tag{1}$$
We wish to identify the functions f for which V depends only on h and not on r.
Towards this end, the simplest strategy turns out to be the best: we simply set equal to
each other the volumes of two different rings of the same height, and see what we can
say about f based on the resulting equation.
In particular, note that for a ring cut out of the unscaled surface S, whose height
h will satisfy $0 \le h/2 \le a$, another ring of the same height can be cut out of any
scaled-up surface S(r) where r > 1, and the volumes of these two rings should be
the same. That is, for any h such that $0 \le h/2 \le a$ and any $r \ge 1$, we should have
V(1, h) = V(r, h), or from (1):
$$2\left[\int_0^{h/2} \pi[f(x)]^2\,dx - \pi\left[f\left(\tfrac{h}{2}\right)\right]^2 \tfrac{h}{2}\right] = 2\left[\int_0^{h/2} \pi\left[r f\left(\tfrac{x}{r}\right)\right]^2 dx - \pi\left[r f\left(\tfrac{h}{2r}\right)\right]^2 \tfrac{h}{2}\right] \tag{2}$$
which is easily rearranged to yield
$$\int_0^{h/2} \left([f(x)]^2 - [r f(x/r)]^2\right) dx = (h/2)\left([f(h/2)]^2 - r^2[f(h/2r)]^2\right). \tag{3}$$
For fixed $r \ge 1$, let
$$g(x) = [f(x)]^2 - r^2[f(x/r)]^2.$$
Then for $0 \le h/2 \le a$, g satisfies
$$\int_0^{h/2} g(x)\,dx = \frac{h}{2}\,g(h/2). \tag{4}$$
Dividing both sides of (4) by h/2, we see that the average value of g on any subinterval
[0, h/2] of [0, a] is its value at the right endpoint of the subinterval: g(h/2). Does this
mean that g must be constant? If f is continuous on the interval [0, a], then so is g, so
that both sides of (4) are differentiable functions of h. Differentiating yields
$$\tfrac{1}{2}\,g(h/2) = \tfrac{1}{2}\,g(h/2) + \tfrac{h}{4}\,g'(h/2)$$
so that $g'(h/2) = 0$ for $0 < h/2 \le a$. So indeed, g is constant on (0, a] and, by
continuity, on [0, a]. What is the constant? If, as in FIGURE 3, f(0) = b, then
$$g(0) = [f(0)]^2 - r^2[f(0)]^2 = b^2 - r^2 b^2 = (1 - r^2)\,b^2$$
so that
$$[f(x)]^2 - r^2[f(x/r)]^2 = (1 - r^2)\,b^2. \tag{5}$$
If, as in FIGURE 3, f(a) = 0, then setting x = a in (5) yields
$$[f(a/r)]^2 = \left(1 - \frac{1}{r^2}\right) b^2. \tag{6}$$
This is essentially a formula for f. We can put it in a more recognizable form by
making the change of variable u = a/r. Since $1 \le r < \infty$, we have $0 < u \le a$ and
$$[f(u)]^2 = \left(1 - \frac{u^2}{a^2}\right) b^2.$$
That is, on the graph of f:
$$\left(\frac{y}{b}\right)^2 + \left(\frac{x}{a}\right)^2 = 1. \tag{7}$$
So the graph of f must be a semi-ellipse, which when revolved around the x-axis
produces a spheroid: a sphere expanded or contracted in the x-direction. Indeed, direct
calculation shows that the volume of a ring of height h formed by coring the spheroid
of equation (7) is
$$V_{\text{ring}} = \frac{1}{6}\,\pi \left(\frac{b}{a}\right)^2 h^3$$
which depends only on the shape of the spheroid and on h, and not on the scale of the
spheroid. So we have shown:
PROPOSITION 1. A closed, centrally symmetric surface of revolution generated by
a continuous prole curve satises the coring property if and only if it is a spheroid.
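The direct calculation behind the ring-volume formula can be sketched as follows. This is only a worked check, assuming the ring volume has the form on the left side of (2) and taking $f(x) = b\sqrt{1 - x^2/a^2}$ from (7); scaling by r replaces a and b by ra and rb, which leaves the ratio b/a unchanged.

```latex
\begin{align*}
V_{\text{ring}}
  &= 2\pi\left[\int_0^{h/2} b^2\left(1 - \frac{x^2}{a^2}\right)dx
       - b^2\left(1 - \frac{h^2}{4a^2}\right)\frac{h}{2}\right] \\
  &= 2\pi b^2\left[\frac{h}{2} - \frac{h^3}{24a^2} - \frac{h}{2} + \frac{h^3}{8a^2}\right] \\
  &= 2\pi b^2 \cdot \frac{h^3}{12a^2}
   = \frac{1}{6}\,\pi\left(\frac{b}{a}\right)^2 h^3 .
\end{align*}
```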
196 MATHEMATICS MAGAZINE
The non-symmetric case: a geometric insight To expand our search for closed
surfaces of revolution satisfying the coring property, we need to look at surfaces that
are not centrally symmetric. But the profile curve of such a surface need not be the
graph of a profile function. So how do we describe the profile curves among which we
want to search? We must replace FIGURE 3 by the more complicated FIGURE 5.
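Figure 5 A family of non-symmetric profile curves C(r). (Labeled in the figure: the curves x = rG(y/r) and x = rF(y/r), the point (0, rb), the drill radius R, the ring height h, and the x- and y-axes.)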
There a non-symmetric profile curve C generating a non-symmetric surface S is
scaled by a linear scale factor r to produce a family of profile curves C(r) that generate
surfaces S(r). For convenience, we locate the maximum y-value b on the curve C at
the point (0, b). Since by hypothesis the curve C has exactly two x-intercepts, one
portion of C must connect the rightmost of these x-intercepts with (0, b) and another
portion of C must connect the leftmost of these x-intercepts with (0, b). On each of
these portions y need not be a function of x, but x is a function of y. Otherwise, coring
the surface S with a cylindrical drill bit centered on its axis would not always produce
a ring, which by definition has to be in one piece. So the curve C is the union of the
graphs of two functions: x = F(y) on the right and x = G(y) on the left. The domain
of both F and G is $0 \le y \le b$ and F(b) = G(b) = 0.
Fortunately, we can reduce this more complicated situation to the simpler one we
have already analyzed. We merely symmetrize the profile curve C in FIGURE 5 with
respect to the y-axis. That is, for each y we horizontally shift the line segment determined
by the points (G(y), y) and (F(y), y) on C so that its center is on the y-axis.
The left and right endpoints of the shifted line segment then lie the same distance
$(F(y) - G(y))/2$ to the left and the right of the y-axis, respectively. This transforms
C to the symmetric curve $C^*$ in FIGURE 6. The surface $S^*$ generated by $C^*$ is the
symmetrization of the surface S generated by C relative to the plane x = 0. Clearly $S^*$
is centrally symmetric.
Now suppose we scale both the original curve C and the symmetrized curve $C^*$
by the same linear scale factor r. Coring the resulting surfaces of revolution using the
same cylindrical drill bit of radius R centered on the x-axis yields two rings having the
same height $h(r) = r(F(R/r) - G(R/r))$. These rings are generated by revolving the
shaded regions around the x-axes in FIGURE 5 and FIGURE 6. If the volumes of these
rings are calculated using the shell method, the answer is the same in each case:
\[ V = \int_R^{rb} 2\pi y \left( r F(y/r) - r G(y/r) \right) dy. \]
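Figure 6 Symmetrized versions $C^*(r)$ of the profile curves C(r). (Labeled in the figure: the curves $x = \frac{r}{2}(G(y/r) - F(y/r))$ and $x = \frac{r}{2}(F(y/r) - G(y/r))$, the point (0, rb), the drill radius R, the ring height h, and the x- and y-axes.)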
It follows that the surface S generated by the non-symmetric curve C satisfies the
coring property if and only if the centrally symmetric surface $S^*$ generated by the
symmetrized curve $C^*$ does. If the curve C is continuous (that is, if F and G are each
continuous) then by Proposition 1, $S^*$ satisfies the coring property if and only if it is a
spheroid. So we have shown:
PROPOSITION 2. A closed surface of revolution generated by a continuous profile
curve satisfies the coring property if and only if its symmetrization relative to a plane
perpendicular to its axis of revolution is a spheroid.
Examples A variety of surfaces meet the hypotheses of Proposition 2 and therefore
satisfy the coring property. The profile curve of each is the upper half of the graph
of $(x/a)^2 + (y/b)^2 = 1$ desymmetrized by displacing each pair of points sharing
a common y value with a horizontal shift that varies continuously with y. For given
positive a and b, such profile curves can be produced using either of the following
recipes:
1. Choose a continuous horizontal shift function $h : [0, b] \to \mathbb{R}$, where h(b) = 0 to
keep the maximum y-value on the curve at (0, b). Then the profile curve is given
by the upper half of the graph of
\[ \left(\frac{x - h(y)}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1. \]
2. Choose a right-hand portion for the curve: a continuous function x = F(y) where
$F : [0, b] \to \mathbb{R}$ and F(b) = 0, as in FIGURE 5. Then the left-hand portion of the
curve is given by $x = G(y) = F(y) - 2a\sqrt{1 - y^2/b^2}$.
Two profile curves created using the first recipe are shown in FIGURE 7 and FIGURE
8, and two created using the second recipe in FIGURE 9 and FIGURE 10. We have
graphed the reflections of these profile curves through the x-axis as well, yielding side
views of the resulting surfaces of revolution (which we have dubbed the egg, the Star
Trek emblem, the acorn, and the heart respectively). In each case a = b = 1, so these
curves all symmetrize to yield the unit sphere. We have not seen such non-symmetric
examples exhibited elsewhere.
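As a numerical sanity check of these examples, the sketch below evaluates the shell-method integral for the first recipe with a = b = 1 and shift h(y) = (1/5)(1 - y^2), and compares rings of equal height cut at several scales. The function names (`ring_volume`, `drill_radius`, and so on), the midpoint rule, and the sample values are illustrative assumptions, not part of the article.

```python
import math

def ring_volume(F, G, b, r, R, n=100_000):
    """Midpoint-rule approximation of the shell-method integral
    V = integral from R to r*b of 2*pi*y * (r*F(y/r) - r*G(y/r)) dy."""
    lo, hi = R, r * b
    dy = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * dy
        total += 2 * math.pi * y * (r * F(y / r) - r * G(y / r))
    return total * dy

# Recipe 1 with a = b = 1 and shift h(y) = (1/5)(1 - y^2).
a = b = 1.0

def half_width(y):
    # Half the horizontal width of the semi-ellipse at height y.
    return a * math.sqrt(max(0.0, 1.0 - (y / b) ** 2))

def F(y):
    # Right-hand portion of the shifted profile curve.
    return 0.2 * (1.0 - y * y) + half_width(y)

def G(y):
    # Left-hand portion of the shifted profile curve.
    return 0.2 * (1.0 - y * y) - half_width(y)

def drill_radius(r, h):
    # Radius R at which the ring cut from S(r) has height h = r*(F(R/r) - G(R/r)).
    return r * b * math.sqrt(1.0 - (h / (2.0 * r * a)) ** 2)

h = 1.6  # common ring height (hypothetical sample value)
volumes = [ring_volume(F, G, b, r, drill_radius(r, h)) for r in (1.0, 2.0, 5.0)]
predicted = math.pi / 6 * (b / a) ** 2 * h ** 3
print(volumes, predicted)
```

The three computed volumes should agree with each other and with the spheroid formula up to quadrature error, since the shift cancels in F - G.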
Figure 7 $(x - (1/5)(1 - y^2))^2 + y^2 = 1$
Figure 8 $(x - 2(1 - y^2))^2 + y^2 = 1$
Figure 9 $x = (1/2)(1 - y^2)$, $x = (1/2)(1 - y^2) - 2\sqrt{1 - y^2}$ $(-1 \le y \le 1)$
Figure 10 $x = 2(1 - |y|)$, $x = 2(1 - |y|) - 2\sqrt{1 - y^2}$ $(-1 \le y \le 1)$
Conclusions, Reflections, and Questions Does the coring property characterize the
sphere among closed surfaces of revolution? Based on Proposition 2, a fair answer
is: sort of. Perhaps the largest class of surfaces that are at least vaguely sphere-like
are smooth ovaloids: surfaces that are convex, meaning that the line segment connecting
any two points inside the surface is also inside the surface, and smooth, meaning
that near each point, the surface is the graph of a function having continuous partial
derivatives of all orders, so that the surface has no sharp points or edges. Note that
the surface in FIGURE 7 is a smooth ovaloid, but the surfaces generated by the profile
curves in FIGURES 8–10 are, respectively, smooth but not convex, convex but not
smooth, and neither smooth nor convex. The apparent diversity of these surfaces belies
their unifying feature: they all yield spheroids when symmetrized.
Our investigation hardly exhausts the topic at hand. There are a number of lesser-known
variations on the coring property of the sphere to be found in the literature. In
his classic exploration of reasoning by induction and analogy Mathematics and Plausible
Reasoning [8, pp. 190–192 and 201–202], George Polya noted that coring spheres
with conical or parabolic drill bits also produces rings whose volumes are determined
by their heights alone. Alexanderson and Klosinski have expanded on Polya's observations
by presenting an even larger catalog of similar phenomena [1].
This discussion may well bring to mind another interesting property of the sphere
that can be found in the exercises of almost any calculus text: the fact that the surface
area of a zone sliced out of a sphere by two parallel planes depends only on the
distance between the planes and not on the location of the zone. Does this slicing
property characterize the sphere among closed surfaces of revolution? This question
was addressed by B. Richmond and T. Richmond in the Monthly [9], where they
named this property the equal area zones property. (The sphere turns out to be the
only smooth surface of revolution satisfying this property, but some non-smooth surfaces
of revolution satisfying this property can also be constructed.) More recently, a
generalization of this property involving pairs of surfaces of revolution has been formulated
and explored by Cass and Wildenberg [3]. Walter Rudin has formulated and
examined a variation of the equal area zones property in the context of n-dimensional
spheres [10]. And more recently, we have examined higher dimensional analogs of
both the equal area zones property [5] and the coring property [4] in the context of
more general hypersurfaces of revolution.
Finally, here is a historical question. The machinery of calculus is not required to
discover the coring property of the sphere. It can be derived elegantly using Cavalieri's
principle [7, pp. 206–210]. It can even be cobbled together from the volumes of a
sphere, a cylinder, and a spherical cap, all of which were known to Archimedes [2, pp.
180–193]. Similarly, the equal area zones property of the sphere follows easily from
a proposition of Archimedes (see [9]). But we know of no evidence that Archimedes
noticed either of these properties. Moreover, it seems to us that it might have been
difficult for him to have formulated them given the limitations of the language and
notation of his day. Who was the first to articulate these properties, and when?
REFERENCES
1. G. L. Alexanderson and L. F. Klosinski, Some surprising volumes of revolution, The Two-Year College Mathematics Journal 6 (September 1975) 13–15. doi:10.2307/3027164
2. E. J. Dijksterhuis, Archimedes, Ejnar Munksgaard, Copenhagen, 1956.
3. D. Cass and G. Wildenberg, Pairs of equal surface functions, College Math. J. 39 (2008) 51–54.
4. V. Coll and J. Dodd, Invariant volumes of revolution: The coring property of the sphere and conic sections, in preparation.
5. J. Dodd and V. Coll, Generalizing the equal area zones property of the sphere, Journal of Geometry 90 (2008) 47–55. doi:10.1007/s00022-008-2015-2
6. Euclid, Elements, Volume III: Books X–XIII and Appendix, trans. by Sir Thomas Heath, Dover, New York, 1956.
7. Howard Eves, Great Moments in Mathematics Before 1650, Mathematical Association of America, Washington, DC, 1983.
8. George Polya, Mathematics and Plausible Reasoning Volume I: Induction and Analogy in Mathematics, Princeton University Press, Princeton, NJ, 1954.
9. B. Richmond and T. Richmond, The equal area zones property, Amer. Math. Monthly 100 (1993) 475–477. doi:10.2307/2324302
10. W. Rudin, A generalization of a theorem of Archimedes, Amer. Math. Monthly 80 (1973) 794–796. doi:10.2307/2318169
11. Marilyn vos Savant, Ask Marilyn, Parade Magazine, October 20, 1996 and December 15, 1996, PARADE, New York.
Summary If a cylindrical drill bit bores through a solid sphere along an axis, removing a capsule from the
sphere, the object that remains is called a spherical ring. A surprising property of the sphere that is often presented
in calculus courses is that any two spherical rings whose cylindrical inner boundaries have the same height also
have the same volume, regardless of the radii of the spheres from which they were cut. In this article, we pose
and answer the question: to what extent does this property characterize the sphere among surfaces of revolution?
VINCENT E. COLL, Jr. received a B.S. from Loyola University (New Orleans), an M.S. from Texas A&M
University and a Ph.D. in algebraic deformation theory from the University of Pennsylvania in 1990 under the
direction of Murray Gerstenhaber. His current research interests revolve around the content properties of surfaces
of revolution. His outside interests include practicing the martial arts and playing ice hockey (but not at the same
time).
JEFF DODD received a B.S. from the University of Maryland at College Park, an M.A. from the University
of Pennsylvania, and a Ph.D. in partial differential equations from the University of Maryland at College Park
in 1996 under the direction of Robert L. Pego. Since 1996, he has been on the faculty at Jacksonville State
University. As a college student, he decided to become a math major largely so that he could learn what was
really going on in his calculus courses, and he is still working on that goal, this paper being a small step in that
direction!
Coloring and Counting on the
Tower of Hanoi Graphs
DANIELLE ARETT
Fargo, ND
[email protected]
SUZANNE DORÉE
Augsburg College
Minneapolis, MN 55454-1338
[email protected]
The Tower of Hanoi graphs are intricate, highly symmetric, little-known combinatorial
graphs that arise from the multipeg generalization of the well-known Tower of Hanoi
puzzle. In this paper, we tour this family of graphs, exploring what we and others have
shown, and what is open for further investigation. Even a quick glance at FIGURES 1–4
showing the first few examples (which we define more carefully within the paper)
suggests patterns waiting to be discovered. We count the order, size, and degrees of
vertices and show how alternate methods of counting these objects can be used to derive
combinatorial identities. We describe the standard labeling of these graphs, from
which we demonstrate that, although these graphs become more complex as their order
increases, one measure of their complexity (the chromatic number) remains remarkably
simple.
Figure 1 The Hanoi graph $H_3^2$
Figure 2 The Hanoi graph $H_4^2$
Figure 3 The Hanoi graph $H_3^3$
Figure 4 The Hanoi graph $H_4^3$
Math. Mag. 83 (2010) 200–209. doi:10.4169/002557010X494841. © Mathematical Association of America
The Hanoi graphs
The graphs begin with the Tower of Hanoi puzzle. The classic version has three pegs
and several disks with distinct diameters, as in FIGURE 5. At the beginning, all of
the disks are stacked on the first peg in order by size, with the largest at the bottom.
The object is to move the disks so that they are similarly stacked on the second peg.
Only one disk may be moved at a time, from the top of one stack to the top of another
stack (or onto an empty peg), and no disk may ever sit atop a smaller disk. Readers
who have never tried the puzzle might wish to play one of the many available online
versions.
Figure 5 The tower of Hanoi puzzle
Figure 6 Adjacent states in $H_4^5$
The puzzle was invented in 1883 by French number theorist and recreational mathematician
Édouard Lucas (1842–1891). It was quickly generalized. Lucas himself explored
multipeg puzzles as early as 1889. A 4-peg puzzle known as The Reve's Puzzle
appeared in 1908 in The Canterbury Puzzles and Other Curious Problems [3].
The problem of counting the number of steps needed to solve the multipeg puzzle (as
a function of the numbers of pegs and disks) was posed in 1939 in the Monthly [17].
Lucas counted the minimum number of moves needed to solve the 3-peg puzzle, but
the minimum number of moves needed to solve the 4-peg puzzle has yet to be settled.
Of course, if the number of pegs exceeds the number of disks, then the puzzle is trivial,
but with each added peg the corresponding graphs become more complicated. Andreas
Hinz gives a more detailed history of the puzzle [4].
Associated with many puzzles and games is a model called a state graph, or configuration
graph. Its vertices are the legal states, in our case the allowable configurations
of disks on pegs. Two vertices are connected by an edge if a single move takes us from
one state to the other. The state graph of a Tower of Hanoi puzzle with d disks on p
pegs for $p \ge 3$ is called a generalized Tower of Hanoi graph, or just Hanoi graph, and
is denoted $H_p^d$. These graphs are undirected since every move is reversible.
For example, FIGURE 6 shows two states in the puzzle with five disks on four pegs.
We get from the first state to the second by moving the next-to-smallest (light gray)
disk from the first to fourth peg. Thus the vertices corresponding to these two states
are connected by an edge in the graph $H_4^5$.
To see how these graphs are built, note that for the (admittedly silly) one-disk puzzle
on p pegs, the state graph consists of p vertices with an edge connecting each pair of
vertices. That is, $H_p^1 \cong K_p$, the complete graph on p vertices. Another observation
for those just getting to know these graphs is that the corners of the large triangle in
FIGURE 3 correspond to states with all three disks stacked on a single peg.
For two disks, the subgraph of $H_p^2$ whose edges correspond to moves of the smaller
disk is p disjoint copies of $H_p^1 \cong K_p$. (Each copy of $H_p^1$ corresponds to a particular
fixed placement of the larger disk.) To build the full graph $H_p^2$, we connect vertices
from different components when there is a move of the larger disk between their corresponding
states. For example, FIGURE 1 shows the graph $H_3^2$ built from three copies
of the triangle $H_3^1 \cong K_3$, and FIGURE 2 shows the graph $H_4^2$ built from four copies
of the kite $H_4^1 \cong K_4$. Using our imagination, we see $H_5^2$ built from five copies of the
pentagram $H_5^1 \cong K_5$ and so on. We can more easily track this construction using the
vertex labeling we present later.
In general, the d-disk graph $H_p^d$ is built from p copies of $H_p^{d-1}$, each corresponding
to a fixed placement of the largest disk, where we connect remote vertices if there is
a corresponding move of this largest disk. For example, FIGURE 3 shows the graph
$H_3^3$ built from three copies of $H_3^2$ and FIGURE 4 shows the graph $H_4^3$ built from four
copies of $H_4^2$.
This recursive construction suggests that the graphs are connected: that we can get
from any arrangement of disks on pegs to any other in the puzzle. Though connectedness
is not obvious from the puzzle itself, Hinz and Daniele Parisse prove that the
Hanoi graphs are not only connected when $p \ge 3$, but also Hamiltonian: there exists
a cycle visiting each vertex exactly once [7]. They also assert that $H_p^d$ is $(p - 1)$-connected:
that the removal of any $p - 2$ vertices and their corresponding edges does
not disconnect the graph.
The Hanoi graphs for the classic 3-peg puzzle were introduced in 1944 in The Mathematical
Gazette [16]. They bear striking resemblance to Sierpiński's triangles and are
a special case of the Sierpiński graphs discussed by various authors [8, 9, 12, 18]. They
are related to Pascal's triangle, as discussed by David Poole [15] and Hinz [5]. As an
application, Paul Cull and Ingrid Nelson discuss the 3-peg graphs' role in perfect
1-error correcting codes [2]. The Hanoi graphs for the puzzle on more than three pegs
have been studied since the 1980s, for example by Xiaowu Lu [13] and Hinz [4].
Though we are interested in the graphs, it is worth mentioning the connection to
solving the puzzle. A path in a graph is a sequence of distinct vertices, each consecutive
pair connected by an edge. The length of the path is the number of edges. Solving
the puzzle amounts to finding a path from the starting vertex to the ending vertex,
and of particular interest are paths of minimal length. In the 3-peg graphs, a minimal
path follows the side of the triangle. Hinz and others have expressed hope that understanding
the Hanoi graphs might lead to insight on minimal solutions of the puzzle for
p > 3 pegs.
Counting on the Hanoi graphs
A graph can be measured in many ways, often beginning with the number of vertices,
number of edges, and degrees of vertices. In this section, we calculate these quantities
for the Hanoi graphs. Then, we derive some combinatorial identities. These results
appear (or are implicit) in the work of Sandi Klavžar, Uroš Milutinović, and Ciril
Petr [10].
How many vertices does $H_p^d$ have? Each of the d disks can be assigned to any of
the p pegs. Since disks must be piled largest to smallest on each peg, each assignment
produces a unique configuration. Therefore, there are $p^d$ different configurations and,
thus, $p^d$ vertices in the graph.
How many edges does $H_p^d$ have? For a fixed pair of pegs, we can move a disk
from precisely one of those pegs to the other at every state except where both pegs are
empty. Since there are $(p - 2)^d$ states with both pegs empty, there are $p^d - (p - 2)^d$
states where we can move a disk between this pair of pegs. Each move is counted once at
each of the two states it connects, which is to say, counted twice. Accounting for our
choice of pegs as well, we find the total number of edges is
\[ \frac{1}{2} \binom{p}{2} \left[ p^d - (p - 2)^d \right]. \]
For example, the graph $H_3^3$ shown in FIGURE 3 has 27 vertices and 39 edges, and
the graph $H_4^2$ shown in FIGURE 2 has 16 vertices and 36 edges.
Alternatively, for each $1 \le i \le d$, we can move disk i between peg A and peg B
as long as none of the $i - 1$ smaller disks sit on either of these pegs. There are
$\binom{p}{2}$ choices for pegs A and B, $p^{d-i}$ possible placements of the larger disks, and
$(p - 2)^{i-1}$ placements of the smaller disks. Thus there are
\[ \binom{p}{2} p^{d-i} (p - 2)^{i-1} \]
edges that correspond to moving disk i. Summing to get the total number of edges and
equating with our previous count gives the identity
\[ \sum_{i=1}^{d} \binom{p}{2} p^{d-i} (p - 2)^{i-1} = \frac{1}{2} \binom{p}{2} \left[ p^d - (p - 2)^d \right]. \]
We could have derived this by algebraic manipulation (using the factorization of
$x^n - y^n$, where here $x - y = 2$), but it is more amusing when it appears from counting
on Hanoi graphs.
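For comparison, the algebraic route mentioned above can be written out in a few lines. This is a sketch using the standard factorization of a difference of powers with x = p and y = p - 2.

```latex
\begin{align*}
x^d - y^d &= (x - y)\sum_{i=1}^{d} x^{d-i} y^{i-1}
  && \text{(difference of $d$th powers)} \\
p^d - (p-2)^d &= 2\sum_{i=1}^{d} p^{d-i}(p-2)^{i-1}
  && \text{(take } x = p,\ y = p - 2\text{)} \\
\sum_{i=1}^{d} \binom{p}{2} p^{d-i}(p-2)^{i-1}
  &= \frac{1}{2}\binom{p}{2}\left[p^d - (p-2)^d\right]
  && \text{(multiply by } \tfrac12\tbinom{p}{2}\text{)}
\end{align*}
```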
What is the degree of each vertex? At each vertex there is one incident edge for
every pair of pegs, except when both pegs are empty in the corresponding state. Thus,
the degree of a vertex corresponding to a state with k occupied pegs, or equivalently k
top disks, is
\[ \binom{p}{2} - \binom{p - k}{2}, \]
where the second term is understood to equal zero if $k = p - 1$ or $k = p$.
Alternatively, the only disks that move are top disks, which can move to any other
peg unless that peg is occupied by a smaller top disk. Thus, counting from smallest top
disk to largest, we find the degree of a vertex corresponding to a state with k occupied
pegs equals
\[ (p - 1) + (p - 2) + \cdots + (p - k) = kp - \binom{k + 1}{2} = \binom{p}{2} - \binom{p - k}{2}. \]
Notice that the degree depends on the number of occupied pegs in the corresponding
state. How many states have exactly k occupied pegs? For this count we use the Stirling
number of the second kind, S(d, k), which equals the number of ways to partition
d distinguishable objects into k nonempty subsets. A standard recursion to calculate
S(d, k) for $0 \le k \le d$ is
\[ S(0, 0) = 1; \quad S(d, 0) = 0 \text{ for } d \ge 1; \]
and
\[ S(d, k) = S(d - 1, k - 1) + k\,S(d - 1, k), \text{ for } d \ge 1. \]
(To see why, note that the first summand counts the partitions where the dth element
is in a singleton set, and the second counts those where it joins one of the k subsets
in a partition of the other $d - 1$ elements.)
Thus we can sort d disks into exactly k nonempty subsets in S(d, k) ways. We
can assign these subsets to p pegs in $p(p - 1) \cdots (p - (k - 1))$ ways; we denote
this falling factorial by $(p)_k$. Since the subsequent placement of each disk onto its
subset's assigned peg is uniquely determined by size, the number of states with exactly
k occupied pegs is $S(d, k)(p)_k$.
Klavžar et al. use the Hanoi graphs to derive various combinatorial identities [10].
For example, summing over the possible number of occupied pegs and equating our
two counts for the total number of vertices give the well-known Stirling identity
\[ \sum_{k=1}^{p} S(d, k)(p)_k = p^d \]
for any positive integers d and p.
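A short computation can confirm both the count of states with k occupied pegs and the Stirling identity for small d and p. The sketch below uses the recursion for S(d, k) given earlier; the function names `stirling2`, `falling`, and `states_by_occupied` are illustrative choices, not the article's.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def stirling2(d, k):
    """Stirling number of the second kind via
    S(d, k) = S(d - 1, k - 1) + k * S(d - 1, k)."""
    if d == 0 and k == 0:
        return 1
    if d == 0 or k == 0:
        return 0
    return stirling2(d - 1, k - 1) + k * stirling2(d - 1, k)

def falling(p, k):
    """Falling factorial (p)_k = p (p - 1) ... (p - k + 1)."""
    result = 1
    for j in range(k):
        result *= p - j
    return result

def states_by_occupied(p, d):
    """Brute-force count of states of d disks on p pegs, keyed by the
    number of occupied pegs (a state is an assignment of disks to pegs)."""
    counts = {}
    for s in product(range(p), repeat=d):
        k = len(set(s))
        counts[k] = counts.get(k, 0) + 1
    return counts

p, d = 4, 3
print(states_by_occupied(p, d))
print({k: stirling2(d, k) * falling(p, k) for k in range(1, p + 1)})
print(sum(stirling2(d, k) * falling(p, k) for k in range(1, p + 1)), p ** d)
```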
Similarly, we can compare the number of edges. We count $S(d, k)(p)_k$ vertices
corresponding to states with exactly k occupied pegs, each with degree
$\binom{p}{2} - \binom{p - k}{2}$.
Thus the number of edges in the graph is
\[ \frac{1}{2} \sum_{k=1}^{p} S(d, k)(p)_k \left[ \binom{p}{2} - \binom{p - k}{2} \right]. \]
Equating with our previous count and simplifying give
\[ \sum_{k=1}^{p-2} S(d, k)(p)_{k+2} = p(p - 1)(p - 2)^d, \]
which might appear to be novel but, alas, after canceling $p(p - 1)$ reduces to the same
Stirling identity for $p - 2$.
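The simplification can be sketched step by step. The only ingredients are the Stirling identity for p and the observation that $(p)_k \binom{p-k}{2} = \tfrac12 (p)_{k+2}$, which vanishes for $k > p - 2$.

```latex
\begin{align*}
\frac12\sum_{k=1}^{p} S(d,k)(p)_k\left[\binom{p}{2} - \binom{p-k}{2}\right]
  &= \frac12\binom{p}{2}\left[p^d - (p-2)^d\right] \\
\binom{p}{2}p^d - \sum_{k=1}^{p} S(d,k)(p)_k\binom{p-k}{2}
  &= \binom{p}{2}p^d - \binom{p}{2}(p-2)^d
  && \text{(Stirling identity for } p\text{)} \\
\sum_{k=1}^{p-2} S(d,k)\,\frac{(p)_{k+2}}{2}
  &= \frac{p(p-1)}{2}\,(p-2)^d \\
\sum_{k=1}^{p-2} S(d,k)\,(p-2)_k
  &= (p-2)^d
  && \text{(since } (p)_{k+2} = p(p-1)(p-2)_k\text{)}
\end{align*}
```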
There are further enumerative uses of the Hanoi graphs. Klavžar et al. showed connections
to second order Euler numbers, Lah numbers, and Catalan numbers; they
suggest that there may be additional identities available [11]. Hinz et al. connect the
graphs to Stern's diatomic sequence [6].
Labeling and coloring the Hanoi graphs
It is helpful to label each vertex of the Hanoi graph in a way that lets us read off
the state of the puzzle it represents. In this section, we describe the standard labeling,
which leads to a natural definition of the recursive structure introduced informally
earlier and is key to coloring the vertices.
It is customary to number the pegs $0, 1, 2, \ldots, p - 1$ and the disks $1, 2, 3, \ldots, d$
from smallest to largest. We say the ith disk sits on peg $s_i$, for $i = 1, 2, \ldots, d$, and
label the vertex corresponding to this state with the string $s_d \cdots s_2 s_1$ in this (reverse)
order. Note that the labeling denotes where each disk goes; imagine placing the disks
on the pegs, starting with the largest disk and working down by size.
For example, the state shown in FIGURE 7 corresponds to the vertex labeled 173033
in $H_8^6$.
Figure 7 State corresponding to vertex labeled 173033 in $H_8^6$
We list the labels of its twenty-two adjacent vertices in a table.
Disk   to peg 0   to peg 1   to peg 2   to peg 3   to peg 4   to peg 5   to peg 6   to peg 7
1      173030     173031     173032                173034     173035     173036     173037
2
3                 173133     173233                173433     173533     173633     173733
4
5                 113033     123033                143033     153033     163033
6                            273033                473033     573033     673033
As another example, note that FIGURE 6 corresponds to the edge between vertices
labeled 01302 (top) and 01332 (bottom) in $H_4^5$. Conversely, we can determine the state
from its vertex label.
Notice the vertex labeled $s_d \cdots s_2 s_1$ has $k = |\{s_d, \ldots, s_2, s_1\}|$ occupied pegs. For
example, the vertex labeled 173033 in $H_8^6$ has
\[ k = |\{1, 7, 3, 0, 3, 3\}| = 4 \]
occupied pegs and thus degree $\binom{8}{2} - \binom{8 - 4}{2}$, which equals 22, as before. The reader can
check the degrees in the now-labeled graphs $H_3^3$ and $H_4^2$ shown in FIGURES 8 and 9.
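Reading neighbors off a label is mechanical, and can be sketched in a few lines. The helper `neighbors` below is hypothetical and assumes at most ten pegs so that each peg is a single digit in the label; it returns, for each movable disk, the labels reached by moving it.

```python
from math import comb

def neighbors(label, p):
    """Map each movable disk to the labels adjacent to `label` obtained by
    moving that disk. The label lists pegs from the largest disk down to
    disk 1; assumes p <= 10 so every peg is one digit."""
    d = len(label)
    pegs = [int(ch) for ch in label]  # pegs[0] belongs to the largest disk
    blocked = set()                   # pegs that hold a smaller disk
    result = {}
    for disk in range(1, d + 1):      # disk 1 is the smallest
        pos = d - disk                # position of this disk in the label
        here = pegs[pos]
        if here not in blocked:       # this disk is on top of its peg
            result[disk] = [label[:pos] + str(q) + label[pos + 1:]
                            for q in range(p)
                            if q != here and q not in blocked]
        blocked.add(here)
    return result

adjacent = neighbors("173033", 8)
for disk, labels in adjacent.items():
    print(disk, labels)

k = len(set("173033"))               # number of occupied pegs
degree = sum(len(v) for v in adjacent.values())
print(k, degree, comb(8, 2) - comb(8 - k, 2))
```

Disks 2 and 4 do not appear in the result because they are covered by disk 1, matching the empty rows of the table.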
Figure 8 $H_3^3$ with vertex labels
Figure 9 $H_4^2$ with vertex labels
With this labeling we can now formally define the standard recursive construction of
the graphs. We write $v \sim w$ if the vertex labeled v is adjacent to the vertex labeled w.
Any vertex of $H_p^d$ has a label of the form av where a is the peg number for the largest
disk and v is the label from the vertex in $H_p^{d-1}$ corresponding to the arrangement of
the other disks.
When is $av \sim bw$ in $H_p^d$? There are two possibilities. If we do not move the largest
disk, then a = b and, since we must move a smaller disk, $v \sim w$ in $H_p^{d-1}$. If we move
the largest disk while the other disks remain fixed, then $a \ne b$ but v = w. In this case
there cannot be any other disks on either peg a or peg b or else the largest disk could
not move. Thus, in the state corresponding to v, pegs a and b are empty. We abuse the
notation slightly by writing $a, b \notin v$ for short.
As an application, we derive a recursive formula for the number of edges in $H_p^d$ for
fixed p, which we denote $e_{d,p}$. An edge where we do not move the largest disk has
the form $av \sim aw$ for $a \in \{0, 1, \ldots, p - 1\}$ and $v \sim w$ in $H_p^{d-1}$; thus $H_p^d$ has $p\,e_{d-1,p}$
edges of this type. An edge where we move the largest disk has the form $av \sim bv$ for
$a, b \in \{0, 1, \ldots, p - 1\}$ and $v \in H_p^{d-1}$ such that $a, b \notin v$. The vertex labeled v can
correspond to any of the $(p - 2)^{d-1}$ arrangements of the $d - 1$ disks on the pegs other
than a and b. Thus $H_p^d$ has $\binom{p}{2}(p - 2)^{d-1}$ edges of this type. Therefore, $e_{1,p} = \binom{p}{2}$ and
for $d \ge 2$,
\[ e_{d,p} = p\,e_{d-1,p} + \binom{p}{2}(p - 2)^{d-1}. \]
The reader can check that our previous count satisfies this recursion.
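One way to carry out that check is sketched below: substitute the closed form $\tfrac12\binom{p}{2}[p^{d-1} - (p-2)^{d-1}]$ for $e_{d-1,p}$ and simplify. The base case d = 1 gives $\tfrac12\binom{p}{2}[p - (p-2)] = \binom{p}{2}$.

```latex
\begin{align*}
p\,e_{d-1,p} + \binom{p}{2}(p-2)^{d-1}
  &= \frac{p}{2}\binom{p}{2}\left[p^{d-1} - (p-2)^{d-1}\right] + \binom{p}{2}(p-2)^{d-1} \\
  &= \frac12\binom{p}{2}\left[p^{d} - p\,(p-2)^{d-1} + 2\,(p-2)^{d-1}\right] \\
  &= \frac12\binom{p}{2}\left[p^{d} - (p-2)^{d}\right] = e_{d,p}.
\end{align*}
```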
Thus far we have looked at known properties of the Hanoi graphs. We are now ready
to prove a new result. The Hanoi graphs are complicated, but thanks to their symmetry
and our convenient labeling, they can be easily colored.
For a positive integer c, a graph can be c-colored if there is a way to label the
vertices with the colors $0, 1, \ldots, c - 1$ such that adjacent vertices are different colors.
The chromatic number of a graph G is the smallest number of colors needed and is
denoted $\chi(G)$. For example, $\chi(H_p^1) = \chi(K_p) = p$.
At any vertex of the full graph $H_p^d$, the subgraph corresponding to moving only the
smallest disk is a copy of $H_p^1 \cong K_p$. Thus $\chi(H_p^d) \ge p$.
To see that p colors suffice, color the vertex labeled $s_d \cdots s_2 s_1$ by the sum of its peg
numbers modulo p. That is,
\[ \phi(s_d \cdots s_2 s_1) = s_d + \cdots + s_2 + s_1 \pmod{p}. \]
To check that $\phi$ is a p-coloring, observe that the labels of adjacent vertices differ in
exactly one place, corresponding to the sole moved disk between the states. The two sums
therefore differ by a nonzero integer of absolute value less than p, so the colors differ.
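The coloring can also be verified exhaustively on small graphs. In the sketch below, states are tuples of peg numbers (entry i is the peg of disk i + 1), which is an encoding chosen for illustration; the sum of the entries is the same as the sum of the digits of the label, so the color agrees with the one described above.

```python
from itertools import product

def adjacent_pairs(p, d):
    """Yield every ordered pair (s, t) of adjacent states of d disks on p pegs.
    s[i] is the peg of disk i + 1 (disk 1 is the smallest), so each edge is
    yielded twice, once from each endpoint."""
    for s in product(range(p), repeat=d):
        blocked = set()  # pegs that hold a smaller disk
        for i, peg in enumerate(s):
            if peg not in blocked:
                for q in range(p):
                    if q != peg and q not in blocked:
                        yield s, s[:i] + (q,) + s[i + 1:]
            blocked.add(peg)

def color(s, p):
    # Sum of the peg numbers modulo p.
    return sum(s) % p

def is_proper(p, d):
    """True if no edge joins two states of the same color."""
    return all(color(s, p) != color(t, p) for s, t in adjacent_pairs(p, d))

def color_class_sizes(p, d):
    sizes = [0] * p
    for s in product(range(p), repeat=d):
        sizes[color(s, p)] += 1
    return sizes

for p, d in [(3, 3), (4, 3), (5, 2)]:
    print(p, d, is_proper(p, d), color_class_sizes(p, d))
```

Each color class has the same size, p^(d-1), because fixing all but one coordinate leaves exactly one peg that gives a prescribed sum modulo p.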
FIGURE 10 shows this coloring of $H_4^3$ with white (0), light gray (1), dark gray (2),
and black (3).
Alternatively, this coloring can be built recursively. Begin with $H_p^1$ colored by its
vertex labeling. For $d \ge 2$, given p copies of $H_p^{d-1}$ each initially p-colored the same,
place the number a in front of each vertex label in the ath copy and twist the coloring
of each vertex in that copy by adding a modulo p. Formally, write $\phi(v)$ for the color
assigned to the vertex labeled v in $H_p^{d-1}$, so that the twisted coloring on $H_p^d$ is defined
by
\[ \phi(av) = \phi(v) + a \pmod{p}. \]
The reader can now verify that each type of edge in $H_p^d$ connects vertices of different
colors and also that we obtain the same coloring as before.
Notice that, although the number of vertices and number of edges of the Hanoi
graphs each grow exponentially in the number of disks, the chromatic number is independent
of the number of disks.
Figure 10 $H_4^3$ with colored vertices
Another way to measure a graph is by its independence number, which is the maximum
number of pairwise non-adjacent vertices, usually called $\alpha(G)$. In the Hanoi graphs, the
$p^{d-1}$ vertices of a fixed color in a minimal coloring form an independent set and so
$\alpha(H_p^d) \ge p^{d-1}$. Conversely, any independent set may include at most one vertex from
each copy of $K_p$ corresponding to moving only the smallest disk. As there are $p^{d-1}$
copies, $\alpha(H_p^d) = p^{d-1}$.
Further investigation
While we understand much about the Hanoi graphs, there is much we still do not know.
Hinz and Parisse have calculated the chromatic index (edge-coloring number) of the
Hanoi graphs [8]. Any permutation of the peg numbers gives an automorphism of the
graph. Recently, So Eun Park has shown that these are the only automorphisms of
the graph: $\mathrm{Aut}(H_p^d) \cong S_p$ [14]. Most graph theoretic measures of the Hanoi graphs,
including the domination number, covering number, and pebbling numbers, are unknown.
Some of these quantities have been calculated for the Sierpiński graphs but not
the Hanoi graphs for more than three pegs [18].
We are particularly interested in the diameter: the maximum over all pairs of vertices
of the minimal length of a path connecting them. The minimum number of moves
needed to solve the Tower of Hanoi puzzle is bounded above by the diameter of the graph and
equal to the diameter in the classic 3-peg graph. The diameters of the multipeg graphs
are, in general, unknown, and it is known that in some cases the diameter is larger than
the minimum number of moves. Thus it is not clear whether calculating the diameter is
more or less difficult than calculating the minimum number of moves needed to solve
the puzzle. Some results on the diameter of variants of the puzzle are known [1].
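For very small cases both quantities can be found by breadth-first search, which makes the comparison concrete. The sketch below is a brute-force illustration only; it uses a tuple encoding of states (entry i is the peg of disk i + 1) and hypothetical helper names, and it is far too slow for the cases where the question is open.

```python
from collections import deque
from itertools import product

def neighbors(s, p):
    """States one legal move away from s, where s[i] is the peg of
    disk i + 1 (disk 1 is the smallest)."""
    blocked = set()  # pegs that hold a smaller disk
    out = []
    for i, peg in enumerate(s):
        if peg not in blocked:
            for q in range(p):
                if q != peg and q not in blocked:
                    out.append(s[:i] + (q,) + s[i + 1:])
        blocked.add(peg)
    return out

def distances_from(start, p):
    """Breadth-first search distances from start to every reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in neighbors(s, p):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist

def diameter(p, d):
    """Largest distance between any two states."""
    return max(max(distances_from(s, p).values())
               for s in product(range(p), repeat=d))

def solve_length(p, d):
    """Fewest moves taking all disks from peg 0 to peg 1."""
    return distances_from((0,) * d, p)[(1,) * d]

for p, d in [(3, 2), (3, 3), (4, 2), (4, 3)]:
    print(p, d, solve_length(p, d), diameter(p, d))
```

On three pegs the two numbers coincide at the familiar value 2^d - 1, in line with the statement above.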
The 3-peg Hanoi graphs are planar: they can be drawn in the plane without any
edges crossing. Hinz and Parisse [7] prove that the only planar Hanoi graphs on more
than three pegs are $H_4^1$ and $H_4^2$. (We challenge the reader to draw $H_4^2$ without crossing.
If you try and are stuck, consider these possibly cryptic hints: View $K_4$ as if looking
at the top of a tetrahedron and do a little cat's cradle. In case you are still puzzled,
look for a representation of $H_4^2$ as a planar graph in the October 2010 issue of this
MAGAZINE.) For any nonplanar graph, it is natural to ask about the crossing number:
the minimum number of crossings needed to draw it in the plane. (Technically, a crossing
involves only two edges at a time.) Alternatively we might inquire whether there
are other surfaces on which the graph can be drawn without crossings; the genus of a
graph is the smallest genus of such a surface. The genus is no larger than the crossing
number, as one can add a bypass handle at each edge crossing, but efficiencies
often lead to a smaller genus. The genera of the complete graphs are known, but the
crossing numbers are not. Results on the crossing numbers of the related Sierpiński
graphs are given by Klavžar and Bojan Mohar [12]. The genera and crossing numbers
of nonplanar multidisk Hanoi graphs are unknown.
We offer one final direction for further investigation. Poole lists numerous variants
of the puzzle [15]. For example, in "Straightline Hanoi" on three pegs, we may only
move disks to and from the first peg. In "Cyclic Hanoi" the pegs are arranged in a
circle and we may only move disks counterclockwise. In "Rainbow Hanoi" the disks
are colored and various restrictions are placed on moves based on the color of the disks.
In "Multidisk Hanoi" there are multiple copies of each disk (either distinguishable or
not). Hinz claims that Lucas suggested the variation of allowing the disks to be out of
order at the start (larger disks on smaller ones), subject to the usual rules later in the
play. Still other variants allow a larger disk to sit on the next smallest disk, but not on any
smaller disk. To our knowledge, very little is known about the graphs of these variants.
Acknowledgment We thank Paul Cull for introducing Danielle to these graphs at his Research Experience
for Undergraduates at Oregon State University in Summer 1999, Andreas Hinz for expert advice, and Matthew
Richey for help with the graphics and for cheering Suzanne on.
REFERENCES
1. Daniel Berend and Amir Sapir, The diameter of Hanoi graphs, Inform. Process. Lett. 98(2) (2006) 79–85.
doi:10.1016/j.ipl.2005.12.004
2. Paul Cull and Ingrid Nelson, Error-correcting codes on the towers of Hanoi graphs, Discrete Math.
208/209(28) (1999) 157–175. doi:10.1016/S0012-365X(99)00070-9
3. Henry E. Dudeney, The Canterbury puzzles and other curious problems, E. P. Dutton, New York, 1908. [4th
edition, Dover Publications, Mineola, NY, 1958.]
4. Andreas M. Hinz, The tower of Hanoi, Enseign. Math. (2) 35(2) (1989) 289–321.
5. Andreas M. Hinz, Pascal's triangle and the tower of Hanoi, Amer. Math. Monthly 99 (1992) 538–544.
doi:10.2307/2324061
6. Andreas M. Hinz, Sandi Klavžar, Uroš Milutinović, Daniele Parisse, and Ciril Petr, Metric properties of
the tower of Hanoi graphs and Stern's diatomic sequence, European J. Combin. 26(5) (2005) 693–708.
doi:10.1016/j.ejc.2004.04.009
7. Andreas M. Hinz and Daniele Parisse, On the planarity of Hanoi graphs, Expo. Math. 20(3) (2002) 263–268.
8. Andreas M. Hinz and Daniele Parisse, Coloring Hanoi graphs, preprint, 2006.
9. M. Jakovac and Sandi Klavžar, Vertex-, edge-, and total-colorings of Sierpiński-like graphs, Discrete Math.
309(6) (2009) 1548–1556. doi:10.1016/j.disc.2008.02.026
10. Sandi Klavžar, Uroš Milutinović, and Ciril Petr, Combinatorics of topmost discs of multi-peg tower of Hanoi
problem, Ars Combin. 59 (2001) 55–64.
11. Sandi Klavžar, Uroš Milutinović, and Ciril Petr, Hanoi graphs and some classical numbers, Expo. Math. 23(4)
(2005) 371–378.
12. Sandi Klavžar and Bojan Mohar, Crossing number of Sierpiński-like graphs, J. Graph Theory 50(3) (2005)
186–198. doi:10.1002/jgt.20107
13. Xiaowu Lu, Tower of Hanoi graphs, Int. J. Comput. Math. 19 (1986) 23–38.
doi:10.1080/00207168608803502
14. S. Eun Park, The group of symmetries of the Tower of Hanoi graph, Amer. Math. Monthly 117 (2010)
353–360. doi:10.4169/000298910X480829
15. David G. Poole, The towers and triangles of Professor Claus (or, Pascal knows Hanoi), Math. Mag. 67 (1994)
323–344.
16. R. S. Scorer, P. M. Grundy, and C. A. B. Smith, Some binary games, Gaz. Math. 28(280) (1944) 96–103.
doi:10.2307/3606393
17. B. M. Stewart, Advanced problem 3918, Amer. Math. Monthly 46 (1939) 363–364. doi:10.2307/2302907
18. Alberto M. Teguia and Anant P. Godbole, Sierpiński gasket graphs and some of their properties, Australas.
J. Combin. 35 (2006) 181–192.
Summary The Tower of Hanoi graphs make up a beautifully intricate and highly symmetric family of graphs
that show moves in the Tower of Hanoi puzzle played on three or more pegs. Although the size and order of these
graphs grow exponentially large as a function of the number of pegs, $p$, and disks, $d$ (there are $p^d$ vertices and
even more edges), their chromatic number remains remarkably simple. The interplay between the puzzles and the
graphs provides fertile ground for counts, alternative counts, and still more alternative counts.
DANIELLE ARETT graduated with a double major in Mathematics and English from Augsburg College in
2000. She now works for the Hartford Life Insurance Company in Fargo, North Dakota. In her free time, Danielle
enjoys writing prose, composing music, and playing piano and guitar. She first learned of the Tower of Hanoi
graphs in an REU at Oregon State University in the summer of 1999.
SUZANNE DORÉE earned her doctorate in mathematics at the University of Wisconsin. She has taught at
Augsburg College since 1989, where she adores working with students, from directing undergraduate research
projects in combinatorics, to helping mathematics majors develop their reasoning and speaking skills, to engaging
diverse learners in the developmental algebra course she developed. For fun, Suzanne enjoys playing bridge,
solving puzzles, interior design, and getting her hands dirty, literally, in the garden.
NOTES
When Is $n^2$ a Sum of $k$ Squares?
TODD G. WILL
University of Wisconsin–La Crosse
La Crosse, WI 54601
[email protected]
The square 169 can be written as a sum of two squares $5^2 + 12^2$, as a sum of three
squares $3^2 + 4^2 + 12^2$, as a sum of four squares $1^2 + 2^2 + 8^2 + 10^2$, as a sum of five
squares $1^2 + 2^2 + 2^2 + 4^2 + 12^2$, and so on for quite a long while. In fact, Jackson,
et al. [5] note that 169 can be written as a sum of $k$ positive squares for all $k$ from 1
to 155 and first fails as a sum of length 156. The authors go on to ask whether there is
any limit to such a string of sums. Specifically, for every positive integer $b$ is there an
integer $n$ which can be written as a sum of $k$ positive squares for all $k$ from 1 to $b$? We
assemble a collection of results, most of which have been known for quite some time,
to answer this question and, in fact, to specify all possible lengths for sums of squares
equal to a given square.
This investigation began when I read a manuscript in which the author proved that a
certain combinatorially defined integer $c(k)$ could be written as a sum of $k$ positive
integer squares. Although the proof technique was interesting, I wondered if it wouldn't
be more surprising to find that a sufficiently large integer couldn't be written as a sum
of $k$ squares. For that reason, in what follows we address the possible lengths for sums
of squares equal to a given integer which may or may not be a square.
Sums of 5 or more positive squares Dickson [1] credits Dubouis with publishing
the following theorem in 1911. An integer $n \ge 34$ can be written as a sum of $k$ positive
squares for all $k$ satisfying $5 \le k \le n$ except for $k = n - 13$, $n - 10$, $n - 7$,
$n - 5$, $n - 4$, $n - 2$, $n - 1$. Writing 20 years later, Pall [7] laments over having
duplicated Dubouis' work before noticing the report of it but resists presenting his own
proof. Writing over 75 years later still, I suspect that both Dubouis' and Pall's proofs
resembled the following.
First we show that no integer $n$ can be written as a sum of $k$ positive squares for
$k \in \{n - 13, n - 10, n - 7, n - 5, n - 4, n - 2, n - 1\}$. To see this, note that the sum
of $k$ positive squares $n = s_1^2 + \cdots + s_k^2$ can be obtained from the sum of $n$ ones by
repeatedly replacing $s_i^2$ of the ones with the single square $s_i^2$. This replacement reduces
the number of summands by $s_i^2 - 1$. For example, replacing four ones, $1 + 1 + 1 + 1$,
with a single square $2^2$ reduces the number of summands by 3. A replacement of $3^2$
ones reduces the number of summands by 8, and larger squares reduce the number of
summands by at least 15. A quick check shows that the count of $n$ summands in the
sum of all ones cannot be reduced by any of the amounts 1, 2, 4, 5, 7, 10, 13 using
reductions of 3 and 8.
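The quick check can be mechanized. The following Python sketch is an illustration only (the function name `reachable_reductions` is ours, not the note's); it lists which total reductions can be assembled from steps of the form $s^2 - 1$.

```python
from math import isqrt

def reachable_reductions(limit):
    """Totals in 0..limit that are sums of numbers of the form s*s - 1 with s >= 2."""
    steps = [s * s - 1 for s in range(2, isqrt(limit + 1) + 1)]  # 3, 8, 15, ...
    reachable = {0}
    for total in range(1, limit + 1):
        # total is reachable if removing one step leaves a reachable total
        if any(total - step in reachable for step in steps):
            reachable.add(total)
    return reachable

missing = sorted(set(range(41)) - reachable_reductions(40))
print(missing)
```

With a limit of 40 the unreachable amounts are exactly 1, 2, 4, 5, 7, 10, 13, which is why the excluded lengths stop at $n - 13$.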
We now use induction to show that $n$ can be written as sums of the specified lengths,
securing the base case of $n = 34$ with a hand check. For $n > 34$ we add $1^2$ to each of
the sums of squares equal to $n - 1$ given by the induction hypothesis. This gives all of
the required lengths of sums for $n$ except for a length 5 sum.
Math. Mag. 83 (2010) 210–213. doi:10.4169/002557010X494850. © Mathematical Association of America
The proof is completed by showing that all $n > 34$ can be written as a sum of 5
positive squares. A computer check (an additional hand check for Pall and Dubouis)
verifies this for $34 < n \le 169$. For $n > 169$ we use Lagrange's theorem, which states
that every positive integer can be written as a sum of four or fewer positive squares,
to write $n - 169$ as a sum of 1, 2, 3, or 4 positive
squares. Then add the appropriate representation of 169 as the sum of 4, 3, 2, or 1
positive squares to obtain five positive squares summing to $n$.
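The construction for $n > 169$ can be sketched in Python. This is a minimal illustration under our own naming (`few_squares`, `five_positive_squares`, and `REPS_169` are hypothetical helpers); it uses a brute-force search where the proof appeals to Lagrange's theorem.

```python
from math import isqrt

# Representations of 169 as 1, 2, 3, and 4 positive squares (from the text).
REPS_169 = {1: [13], 2: [5, 12], 3: [3, 4, 12], 4: [1, 2, 8, 10]}

def few_squares(m, max_terms=4):
    """Return at most max_terms positive integers whose squares sum to m, or None."""
    def search(rest, terms_left, largest):
        if rest == 0:
            return []
        if terms_left == 0:
            return None
        for s in range(min(largest, isqrt(rest)), 0, -1):
            tail = search(rest - s * s, terms_left - 1, s)
            if tail is not None:
                return [s] + tail
        return None
    return search(m, max_terms, isqrt(m))

def five_positive_squares(n):
    """For n > 169, split n as (n - 169) + 169 and pad to exactly five squares."""
    part = few_squares(n - 169)          # 1 to 4 squares, by Lagrange's theorem
    return part + REPS_169[5 - len(part)]

print(five_positive_squares(1000))
```

Each result has exactly five positive entries whose squares sum to $n$; which five are returned depends on the search order and is not canonical.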
So, except for lengths of 2, 3, and 4, this result specifies all possible lengths for sums
of squares equal to a given square. In addition, the result greatly simplifies the question
in Jackson, et al., since if a square $n$ can be written as a sum of 2, 3, and 4 positive
squares then $n$ can be written as a sum of $k$ positive squares for all $1 \le k \le n - 14$.
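For small $n$ the full set of achievable lengths can be computed directly. The sketch below is our own illustration (the name `square_sum_lengths` is hypothetical); it uses integers as bitsets, where bit $k$ of `masks[m]` records that $m$ is a sum of $k$ positive squares.

```python
def square_sum_lengths(n):
    """Set of k such that n is a sum of exactly k positive squares."""
    masks = [0] * (n + 1)
    masks[0] = 1                      # zero is the empty sum (length 0)
    for m in range(1, n + 1):
        s = 1
        while s * s <= m:
            # appending the square s*s to a sum for m - s*s adds one term
            masks[m] |= masks[m - s * s] << 1
            s += 1
    return {k for k in range(1, n + 1) if (masks[n] >> k) & 1}

lengths = square_sum_lengths(169)
print(min(set(range(1, 170)) - lengths))   # first length that fails for 169
```

For 169 the first failing length is 156, matching the run of 155 reported by Jackson, et al.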
Sums of two positive squares There seems to be some disagreement about when
an integer can be written as a sum of two positive squares. In the 1959 article [3] the
condition is stated that the integer must have the form $4^a n_1 n_2^2$, with integral $a \ge 0$,
$n_1 > 1$, the prime factors of $n_1$ congruent to 1 mod 4 and the prime factors of $n_2$
congruent to 3 mod 4. In the 2006 book [6] the condition is the same except that $4^a$ is
replaced with $2^e$, with $e$ a nonnegative integer. In both sources the claims are said to
follow easily from previous results, but proofs are not given. However, neither of these
conditions includes $18 = 2 \cdot 3^2 = 3^2 + 3^2$, since 18 has no $4k + 1$ prime factor. More
generally the conditions exclude the numbers $n = m^2 + m^2$ where $m$ has no $4k + 1$
prime factor. Perhaps the authors meant to describe conditions under which $n$ could be written
as a sum of two distinct positive squares.
In any case, the correct statement is that a positive integer $n$ can be written as the
sum of two positive squares if and only if either $n$ is twice a square or $n$ has at least
one $4k + 1$ prime factor and all of its $4k + 3$ prime factors appear to even powers.
This fact follows easily from the much deeper theory for computing $r_k(n)$, which
is defined to be the number of ways of writing $n$ as a sum of $k$ integer squares. In
computing $r_k(n)$ the squares of both positive and negative integers as well as $0^2$ are
allowed and permutations of addends are counted as distinct sums. So, for example,
$r_2(9) = 4$ since $9 = 0^2 + (\pm 3)^2 = (\pm 3)^2 + 0^2$ are the four ways to express 9 as the
sum of two integer squares.
Let $n = 2^k \prod p_i^{a_i} \prod q_j^{b_j}$ be the prime factorization of $n$ with the $p_i$ and $q_j$ being the
primes congruent to 1 and 3 mod 4, respectively. Gauss showed that if any of the $b_j$ are
odd then $r_2(n) = 0$ and otherwise $r_2(n) = 4 \prod (1 + a_i)$. So for example, since $n = 9$
has no $4k + 3$ primes to an odd power, and all $4k + 1$ primes occur to the zero power,
$r_2(9) = 4(1 + 0) = 4$ as counted above.
Now assume that $n = a^2 + b^2$ is the sum of two positive squares. Either $n$ is twice
a square or $a \ne b$, in which case $n = (\pm a)^2 + (\pm b)^2 = (\pm b)^2 + (\pm a)^2$ shows that
$r_2(n) \ge 8$. From this it follows that all $4k + 3$ primes appear to even powers and there
is at least one $4k + 1$ prime. Conversely, if $n = 2k^2$, then clearly $n$ is the sum of two
nonzero squares. If, on the other hand, all $4k + 3$ primes appear to even powers and
there is at least one $4k + 1$ prime, then $r_2(n) \ge 8$. Since at most 4 of these sums can
use $0^2$, there must be a sum with two positive squares.
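Both Gauss's formula and the criterion just proved are easy to test numerically. The Python sketch below is an illustration with our own function names; it compares a brute-force count of $r_2(n)$ with the formula, and compares a brute-force search for two positive squares with the condition "twice a square or $r_2(n) \ge 8$" used in the argument.

```python
from math import isqrt

def r2_brute(n):
    """Count ordered integer pairs (a, b), signs included, with a*a + b*b == n."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            count += 1 if b == 0 else 2   # b and -b
    return count

def r2_gauss(n):
    """0 if some 4k+3 prime has odd exponent, otherwise 4 * prod(1 + a_i)."""
    product, p = 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return 0
        if p % 4 == 1:
            product *= exponent + 1
        p += 1
    if n > 1:                             # one prime left over, with exponent 1
        if n % 4 == 3:
            return 0
        if n % 4 == 1:
            product *= 2
    return 4 * product

def has_two_positive_squares(n):
    """Brute force: is n == a*a + b*b with a, b >= 1?"""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(1, isqrt(n - 1) + 1))

def is_twice_a_square(n):
    return n % 2 == 0 and isqrt(n // 2) ** 2 == n // 2

print(r2_brute(9), r2_gauss(9), has_two_positive_squares(18))
```

The example values 9 and 18 are the ones discussed in the text: $r_2(9) = 4$, and 18 is a sum of two positive squares even though it has no $4k + 1$ prime factor.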
Sums of three positive squares When an integer can be written as a sum of three
positive squares has not quite been pinned down. Legendre showed that numbers of
the form $4^h(8k + 7)$ are those which cannot be written as the sum of three or fewer
positive squares. But this left open the set of numbers which cannot be written as a sum
of three positive squares but can be written as a sum of one or two. In 1959 Grosswald,
et al., [3] proved that there exists a finite set of integers $S$ such that $n$ is not the sum
of three positive squares if and only if $n = 4^h q$ where $q \equiv 7 \pmod 8$ or $q$ is an element
of the finite set $S$. They conjectured that $S = \{1, 2, 5, 10, 13, 25, 37, 58, 85, 130\}$ but
their proof showed only that the set $S$ is finite.
Despite this disappointment, it is known which squares are sums of three positive
squares. Hurwitz [4] proved that with the exception of $(2^k)^2$ and $(5 \cdot 2^k)^2$, every positive
square can be written as a sum of three positive squares. Fraser and Gordon later
gave an elementary proof of this fact in [2].
As a digression, note that Hurwitz's result shows that the set $S$ contains no squares
other than 1 and 25. So, in considering whether there might be additional numbers in
$S$, we need only consider nonsquares. If $n$ is not a square, then for $n = a^2 + b^2$ neither
$a$ nor $b$ is zero and so the orderings in the three sums $0^2 + a^2 + b^2$, $a^2 + 0^2 + b^2$,
$a^2 + b^2 + 0^2$ are distinct. If $n$ cannot be written as a sum of three positive squares,
then all sums of three squares equal to $n$ must have one of these three forms. Thus if $n$
is not a square, then $n$ cannot be written as a sum of three positive squares if and only if
$r_3(n) = 3r_2(n)$. In three hours, a laptop search using Mathematica's built-in SquaresR
function verified that the conjectured values for $S$ are correct for $n \le 5 \times 10^6$.
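A much smaller version of this search fits in a few lines of Python. The sketch below is our own illustration, not the Mathematica computation described above: it checks by brute force, rather than through $r_3(n) = 3r_2(n)$, that the conjectured description of the exceptions holds for small $n$. The names `is_sum` and `conjectured_exception` are hypothetical.

```python
from functools import lru_cache
from math import isqrt

S = {1, 2, 5, 10, 13, 25, 37, 58, 85, 130}   # conjectured set from [3]

@lru_cache(maxsize=None)
def is_sum(n, k):
    """True if n is a sum of exactly k positive squares."""
    if k == 0:
        return n == 0
    if n < k:
        return False
    return any(is_sum(n - s * s, k - 1) for s in range(1, isqrt(n) + 1))

def conjectured_exception(n):
    """True if n = 4^h * q with q congruent to 7 mod 8 or q in S."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7 or n in S

bad = [n for n in range(1, 2001) if is_sum(n, 3) == conjectured_exception(n)]
print(bad)   # an empty list means the description is correct up to 2000
```

The bound 2000 is an arbitrary choice that keeps the run time negligible; the text reports agreement up to $5 \times 10^6$.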
Sums of four positive squares In [6], Pall is credited with showing that $n$ can be
written as a sum of four positive squares if and only if $n$ is not one of {1, 3, 5, 9, 11,
17, 29, 41} or of the form $2 \cdot 4^k$, $6 \cdot 4^k$, $14 \cdot 4^k$. In a footnote of the cited work [7],
Pall says that the reader "will have no difficulty in proving [this result] by using the
following classical result, which was first stated by Fermat, and was first proved by
Legendre in 1798. A positive integer is a sum of three [or fewer positive] squares if
and only if it is not of the form $4^h(8k + 7)$." With such a challenge I picked up my
pen and searched for the proof. Minutes ticked away to hours with my ego sinking
all the while. I eventually did hit upon the following proof similar to the one I later
found in [8].
First note that $4^h(8k + 7) \equiv 0, 4, 7 \pmod 8$. If $n \equiv 2, 3, 4, 6, 7 \pmod 8$, then
$n - 13^2 \equiv 1, 2, 3, 5, 6 \pmod 8$ and so $n - 13^2$ is not of the form $4^h(8k + 7)$. Thus for $n > 13^2$,
Legendre's result shows that $n - 13^2$ can be written as a sum of three or fewer positive
squares. Augment this sum with the appropriate choice from among $13^2 = 5^2 + 12^2 =
3^2 + 4^2 + 12^2$ to obtain four positive squares summing to $n$. For $n \le 13^2$, a computer
check finds that {2, 3, 6, 11, 14} are the only integers in these congruence classes
which cannot be written as a sum of four positive squares.
If $n \equiv 1, 5 \pmod 8$, then $n - 26^2 \equiv 5, 1 \pmod 8$. So for $n > 26^2$, $n - 26^2$ can be
written as a sum of three or fewer positive squares. Augment this sum with the appropriate
choice from among $26^2 = 10^2 + 24^2 = 6^2 + 8^2 + 24^2$ to obtain four positive squares
summing to $n$. For $n \le 26^2$, a computer check finds that {1, 5, 9, 17, 29, 41} are the
only integers in these congruence classes which cannot be written as a sum of four
positive squares.
If $n \equiv 0 \pmod 8$, consideration mod 8 shows that $n$ is a sum of four positive squares
if and only if $n/4$ is. Repeated application of this observation allows $n$ to be written
as $4^a \cdot 2j$ where $2j \not\equiv 0 \pmod 8$ and $n$ is a sum of four positive squares if and only if $2j$
is. Previous cases show that $2j \not\equiv 0 \pmod 8$ is not a sum of four positive squares only
for $2j = 2, 6, 14$.
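The computer checks mentioned in the proof can be reproduced with a short brute-force search. The following Python sketch is an illustration under our own naming (`is_sum` and `pall_exception` are hypothetical); it compares the search with Pall's list for $n$ up to 1000, a bound that covers both $13^2$ and $26^2$.

```python
from functools import lru_cache
from math import isqrt

@lru_cache(maxsize=None)
def is_sum(n, k):
    """True if n is a sum of exactly k positive squares."""
    if k == 0:
        return n == 0
    if n < k:
        return False
    return any(is_sum(n - s * s, k - 1) for s in range(1, isqrt(n) + 1))

def pall_exception(n):
    """True if n is in Pall's list of numbers that are not sums of four positive squares."""
    if n in {1, 3, 5, 9, 11, 17, 29, 41}:
        return True
    while n % 4 == 0:                 # strip powers of 4
        n //= 4
    return n in {2, 6, 14}

exceptions = [n for n in range(1, 1001) if not is_sum(n, 4)]
print(exceptions[:12])
```

Every exception found by the search should satisfy `pall_exception`, and no other $n$ in the range should.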
Conclusion What, then, are the possible lengths for sums of squares equal to a given
positive square?
The possible lengths of 5 and higher are specified by Dubouis' result for squares 36
and above. A direct check shows that the same result holds for 16 and 25 and that the
possible sum lengths for 9 are 1, 3, 6, 9. Since a square cannot also be twice a square,
the squares which can be written as a sum of two positive squares are those with a
prime factor congruent to 1 mod 4. We see that among positive squares, $(2^k)^2$ and
$(5 \cdot 2^k)^2$ are the only ones which cannot be written as a sum of three positive squares
and that 1 and 9 are the only ones which cannot be written as a sum of four positive
squares.
Combining these conditions, we learn that with the exception of $(5 \cdot 2^k)^2$, a square
can be written as sums of 2, 3, and 4 positive squares if and only if it has at least one
prime factor congruent to 1 mod 4. Moreover such a square $n$ can be written as a sum
of $k$ positive squares for all $k$ from 1 to $n - 14$.
The first few squares meeting the combined conditions are 169, 225, 289, 625, 676,
841, 900. Going out a little farther we find $n = 1\,000\,002\,000\,001 = (101 \cdot 9901)^2$
with 101 being a prime congruent to 1 mod 4. So this square can be written as a sum
of $k$ positive squares for all $k$ from 1 to 1 000 001 999 987, making 169's run of 155
look not so special after all.
REFERENCES
1. Leonard Eugene Dickson, History of the Theory of Numbers, Vol. II: Diophantine Analysis, Dover, New York,
2005.
2. Owen Fraser and Basil Gordon, On representing a square as the sum of three squares, Amer. Math. Monthly
76 (1969) 922–923. doi:10.2307/2317949
3. E. Grosswald, A. Calloway, and J. Calloway, The representation of integers by three positive squares, Proc.
Amer. Math. Soc. 10 (1959) 451–455. doi:10.2307/2032865
4. Adolf Hurwitz, Mathematische Werke. Bd. II: Zahlentheorie, Algebra und Geometrie, Birkhäuser Verlag,
Basel, 1963.
5. Kelly Jackson, Francis Masat, and Robert Mitchell, Extensions of a sums-of-squares problem, Math. Mag. 66
(1993) 41–43.
6. Carlos J. Moreno and Samuel S. Wagstaff, Jr., Sums of Squares of Integers, Chapman & Hall/CRC, Boca
Raton, FL, 2006.
7. Gordon Pall, On sums of squares, Amer. Math. Monthly 40 (1933) 10–18. doi:10.2307/2301257
8. Don Redmond, Number Theory: An Introduction, Marcel Dekker, New York, 1996.
Summary This note shows that with the exception of $(5 \cdot 2^k)^2$, an integer square can be written as sums of
2, 3, and 4 positive squares if and only if it has at least one prime factor congruent to 1 mod 4. Moreover such
a square $n$ can be written as a sum of $k$ positive squares for all $k$ from 1 to $n - 14$. The question of when a
non-square can be written as a sum of $k$ positive squares is also examined.
How Fast Will We Lose?
RON HIRSHON
College of Staten Island
Staten Island, NY 10314
[email protected]
Two players X and Y play a gambling game. They start with bankrolls of $x$ and $y$
dollars respectively, where $x$ and $y$ are positive integers and $(x, y) \ne (1, 1)$. They
repeatedly flip a coin, which may be a fair or unfair coin. When heads appears, X wins
and receives one dollar from Y; when tails appears, X loses and pays one dollar to Y.
The game continues until one player runs out of money. Let $L$ be the event that X loses
the match; that is, that it is X who ends the game with a zero balance.
Math. Mag. 83 (2010) 213–218. doi:10.4169/002557010X494869. © Mathematical Association of America
We assume that the flips are independent. We write $p$ for the probability that X wins
a given flip, and we always write $q$ for $1 - p$. Then the probability that X loses is
\[ \Pr(L) = q^x \, \frac{p^y - q^y}{p^{x+y} - q^{x+y}} \quad (p \ne q); \qquad \Pr(L) = \frac{y}{x + y} \quad \left(p = q = \tfrac{1}{2}\right). \tag{1} \]
This is a well-known formula. Our gambling game is called gambler's ruin, and
can also be described as a random walk on the integers with two absorbing barriers.
A classical reference is Feller [1], chapters III and XIV; see especially equations (3.4)
and (3.5) in section XIV.3. The theory goes back over 300 years, and early investigators
include Huygens, DeMoivre, Montmort, and two Bernoullis. A good source, both for
history and results, is Takács [4]. Formula (1) is also used in [3], for which this paper
is a sequel.
In this paper, we study the probability of the event $L_n$ that X loses in exactly $n$ flips.
DeMoivre calculated this probability in 1718, but his formula was quite complicated;
see [4], equations (13) and (12). Our goal is to give a simple method for finding these
probabilities. As explained in the last section of [3], this will involve the parallel goal
of counting the number $c_n = c_n(x, y)$ of different sequences of H and T of length
$n$ that lead to losing in exactly $n$ flips. Also, given that X loses, we determine the
expected time it will take to lose.
For X to go broke, X must lose $x$ more coin flips than X wins. Thus, for some
integer $k \ge 0$, the sequence consists of $x + k$ tails and $k$ heads. The probability of
each such sequence is $q^{x+k} p^k$, and the number of such sequences is $c_{x+2k}$. Thus
$\Pr(L_{x+2k}) = c_{x+2k}\, q^{x+k} p^k$. If $n$ is not of the form $x + 2k$, then $\Pr(L_n) = 0$. Therefore
\[ \Pr(L) = \sum_{k=0}^{\infty} c_{x+2k}\, q^{x+k} p^k. \tag{2} \]
As noted in [3], the numbers $c_{x+2k}$ are the coefficients for the power series of a certain
rational function $g = g_{x,y}$. This means that $g$ is a generating function for the sequence
$\{c_{x+2k}\}$, $k = 0, 1, 2, \ldots$.
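Before any generating function is available, the counts $c_{x+2k}$ can be found by following the random walk directly. The Python sketch below is our own illustration (the name `loss_counts_by_paths` is hypothetical): it tracks how many flip sequences leave X with each bankroll, stopping a sequence as soon as either player is broke.

```python
def loss_counts_by_paths(x, y, terms):
    """First `terms` values of c_{x+2k}: sequences after which X is first broke at flip x+2k."""
    total = x + y
    ways = [0] * (total + 1)          # ways[b] = live sequences leaving X with b dollars
    ways[x] = 1
    counts, n = [], 0
    while len(counts) < terms:
        n += 1
        nxt = [0] * (total + 1)
        for bankroll in range(1, total):
            if ways[bankroll]:
                nxt[bankroll - 1] += ways[bankroll]   # tails: X pays a dollar
                nxt[bankroll + 1] += ways[bankroll]   # heads: X wins a dollar
        if n >= x and (n - x) % 2 == 0:
            counts.append(nxt[0])     # sequences on which X goes broke at flip n
        nxt[0] = nxt[total] = 0       # the game has ended for these sequences
        ways = nxt
    return counts

print(loss_counts_by_paths(4, 2, 10))
```

For $(x, y) = (4, 2)$ this reproduces the loss sequence 1, 4, 13, 40, 121, ... that the generating-function method yields.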
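First we rewrite equation (1) using $S_n = p^{n-1} + p^{n-2}q + \cdots + pq^{n-2} + q^{n-1}$,
which is positive for $0 \le p \le 1$. Observe that
\[ S_n = \frac{p^n - q^n}{p - q} \quad (p \ne q) \qquad \text{and} \qquad S_n = \frac{n}{2^{n-1}} \quad \left(p = q = \tfrac{1}{2}\right). \tag{3} \]
It follows that
\[ \Pr(L) = q^x \, \frac{S_y}{S_{x+y}} \quad \text{for } 0 \le p \le 1; \tag{4} \]
to see this, for $p \ne q$ divide the numerator and denominator in (1) by $p - q$, and for
$p = \frac{1}{2}$, note that
\[ q^x \, \frac{S_y}{S_{x+y}} = \left(\frac{1}{2}\right)^{x} \cdot \frac{y}{2^{y-1}} \cdot \frac{2^{x+y-1}}{x + y} = \frac{y}{x + y}. \]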
LEMMA. The expression $S_n$ may be expressed as a polynomial in $u = pq$ with
integer coefficients.
Proof. For $n = 1$ or $n = 2$, (3) reduces to 1 so that $S_1 = S_2 = 1$. Since
\[ p^{n+1} - q^{n+1} = p^n p - q^n q = p^n(1 - q) - q^n(1 - p) = (p^n - q^n) - pq\,(p^{n-1} - q^{n-1}), \]
for $p \ne q$ and $n \ge 2$ we have from (3) that
\[ S_{n+1} = S_n - pq\,S_{n-1} = S_n - u\,S_{n-1}. \tag{5} \]
This identity also holds for $p = \frac{1}{2}$, which can be verified directly or by using a
continuity argument. The lemma follows by induction.
Iterating (5), we obtain the sample calculations summarized in TABLE 1.
TABLE 1
$S_3$        $S_4$         $S_5$               $S_6$
$1 - u$      $1 - 2u$      $1 - 3u + u^2$      $1 - 4u + 3u^2$
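Recurrence (5) is simple to iterate by machine. The following Python sketch is an illustration only (the name `s_poly` and the coefficient-list representation are our choices); it stores each $S_n$ as a list of integer coefficients in $u$, lowest degree first.

```python
from fractions import Fraction

def s_poly(n):
    """Coefficients of S_n as a polynomial in u = pq, lowest degree first."""
    prev, cur = [1], [1]                  # S_1 and S_2
    if n == 1:
        return prev
    for _ in range(n - 2):
        shifted = [0] + prev              # u * S_{m-1}
        padded = cur + [0] * (len(shifted) - len(cur))
        # S_{m+1} = S_m - u * S_{m-1}
        prev, cur = cur, [a - b for a, b in zip(padded, shifted)]
    return cur

def evaluate(coeffs, u):
    """Value of the polynomial at u."""
    return sum(c * u ** i for i, c in enumerate(coeffs))

for n in range(3, 7):
    print(n, s_poly(n))
print(evaluate(s_poly(6), Fraction(1, 4)))   # compare with n / 2**(n-1) from (3)
```

The printed lists are the entries of TABLE 1, and the final value illustrates the second formula in (3) at $p = q = \frac{1}{2}$, where $u = \frac{1}{4}$.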
Set the expressions in (2) and (4) for $\Pr(L)$ equal and cancel $q^x$ from both sides of
the resulting identity. Setting $u = pq$, we obtain the identity
\[ \sum_{k=0}^{\infty} c_{x+2k}\, u^k = \frac{S_y}{S_{x+y}}. \tag{6} \]
We write $g(u) = g_{x,y}(u)$ for the rational function $S_y / S_{x+y}$. From (6) and (4), we have
\[ g(u) = \sum_{k=0}^{\infty} c_{x+2k}\, u^k \qquad \text{and} \qquad \Pr(L) = q^x g(u). \tag{7} \]
We call $g$ the loss function of X for the parameters $x$ and $y$ (in the variable $u$), and
we call the coefficients of the Maclaurin expansion in (7) the loss sequence of X for
these parameters. To repeat, the first term in the loss sequence is always $c_x = 1$.
THEOREM. Given the loss sequence $c_{x+2k}$, we have
\[ \Pr(L_{x+2k}) = c_{x+2k}\, q^{x+k} p^k \quad \text{for integers } k \ge 0. \tag{8} \]
Similarly, there is a win sequence $d_{y+2k}$ for X, based on Y's loss function $g_{y,x}$, so that
X's probability of winning in exactly $y + 2k$ steps is $d_{y+2k}\, p^{y+k} q^{k}$.
Note that, in the beginning, we had equation (2) but we did not know the coefficients.
Equation (7) gets the power series to represent a rational function $g$. Now by
direct means, we can obtain the rational function, then its power series, and then easily
read off as many coefficients as we like. This is valid because of the uniqueness theorem
for power series: If two power series agree on an interval, then their coefficients
are equal.
To illustrate the Theorem, see TABLE 2. For example, from the $(x, y) = (4, 2)$
line, we conclude that $\Pr(L_4) = q^4$, $\Pr(L_6) = 4q^5p$, $\Pr(L_8) = 13q^6p^2$, $\Pr(L_{10}) =
40q^7p^3$, etc. Note also that the number of ways of losing in 22 flips is 29,524.
TABLE 2
x  y  Loss function g(u)           Loss sequence (first ten terms)
1  2  $1/(1 - u)$                  1, 1, 1, 1, 1, 1, 1, 1, 1, 1
1  3  $(1 - u)/(1 - 2u)$           1, 1, 2, $2^2$, $2^3$, $2^4$, $2^5$, $2^6$, $2^7$, $2^8$
1  4  $(1 - 2u)/(1 - 3u + u^2)$    1, 1, 2, 5, 13, 34, 89, 233, 610, 1597
2  4  $(1 - 2u)/(1 - 4u + 3u^2)$   1, 2, 5, 14, 41, 122, 365, 1094, 3281, 9842
5  1  $1/(1 - 4u + 3u^2)$          1, 4, 13, 40, 121, 364, 1093, 3280, 9841, 29524
4  2  $1/(1 - 4u + 3u^2)$          1, 4, 13, 40, 121, 364, 1093, 3280, 9841, 29524
3  3  $1/(1 - 3u)$                 1, 3, $3^2$, $3^3$, $3^4$, $3^5$, $3^6$, $3^7$, $3^8$, $3^9$
The loss functions in TABLE 2 were obtained using equation (6) and the results
in TABLE 1. Most of the loss sequences in TABLE 2 can be verified by rewriting the
loss function using partial fractions and then using the expansion $\frac{1}{1 - w} = \sum_{k=0}^{\infty} w^k$. For
example, for $(x, y) = (1, 3)$, we obtain
\[ g(u) = \frac{1 - u}{1 - 2u} = 1 + \frac{u}{1 - 2u} = 1 + \sum_{k=1}^{\infty} 2^{k-1} u^k, \]
which explains the powers of 2 in the loss sequence. The relationship $g_{2,4}(u) =
u\,g_{4,2}(u) + \frac{1}{1 - u}$ explains why the loss sequences in lines (4, 2) and (2, 4) look similar.
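The rows of TABLE 2 can also be generated mechanically by expanding $S_y / S_{x+y}$ as a power series. The sketch below is an illustration under our own naming (`s_poly` and `loss_sequence` are hypothetical helpers); it builds the polynomials from recurrence (5) and then performs ordinary power-series division, which is exact here because the denominator has constant term 1.

```python
def s_poly(n):
    """Coefficients of S_n in u, lowest degree first, via S_{m+1} = S_m - u*S_{m-1}."""
    prev, cur = [1], [1]                  # S_1 and S_2
    if n == 1:
        return prev
    for _ in range(n - 2):
        shifted = [0] + prev
        padded = cur + [0] * (len(shifted) - len(cur))
        prev, cur = cur, [a - b for a, b in zip(padded, shifted)]
    return cur

def loss_sequence(x, y, terms):
    """First `terms` Maclaurin coefficients of g(u) = S_y / S_{x+y}."""
    num, den = s_poly(y), s_poly(x + y)
    coeffs = []
    for k in range(terms):
        value = num[k] if k < len(num) else 0
        # subtract the contribution of earlier coefficients: den * coeffs = num
        for j in range(1, min(k, len(den) - 1) + 1):
            value -= den[j] * coeffs[k - j]
        coeffs.append(value)              # den[0] == 1, so no division is needed
    return coeffs

for x, y in [(1, 2), (1, 3), (1, 4), (2, 4), (5, 1), (4, 2), (3, 3)]:
    print(x, y, loss_sequence(x, y, 10))
```

Each printed list should agree with the corresponding line of TABLE 2.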
The sequence for $(x, y) = (1, 4)$ in TABLE 2 no doubt looks familiar. In fact, it
is $1, f_1, f_3, f_5, \ldots$ where $f_n$ is the Fibonacci sequence ($f_1 = f_2 = 1$, $f_3 = 2$, $f_4 =
3$, $f_5 = 5, \ldots$). To see this, we note that
\[ f_1 + f_2 z + f_3 z^2 + f_4 z^3 + f_5 z^4 + \cdots = \frac{1}{1 - z - z^2}; \]
see, for example, formulas (6.116) and (6.117) in [2]. Also
\[ f_1 - f_2 z + f_3 z^2 - f_4 z^3 + f_5 z^4 - \cdots = \frac{1}{1 + z - z^2}. \]
Adding and dividing by 2, we obtain
\[ f_1 + f_3 z^2 + f_5 z^4 + \cdots = \frac{1 - z^2}{(1 - z^2)^2 - z^2}. \]
Replacing $z^2$ by $u$, we get, for $u$ within the radius of convergence,
\[ f_1 + f_3 u + f_5 u^2 + \cdots = \frac{1 - u}{1 - 3u + u^2}, \]
and so
\[ 1 + f_1 u + f_3 u^2 + f_5 u^3 + \cdots = 1 + u \cdot \frac{1 - u}{1 - 3u + u^2} = \frac{1 - 2u}{1 - 3u + u^2}. \]
The rational function on the right is the loss function $g(u)$ in TABLE 2 for $(x, y) =
(1, 4)$, and we now see why the corresponding loss sequence consists of Fibonacci
numbers.
We return to the power series in (7). $\Pr(L)$ is defined for all $p$ between 0 and 1
inclusive. Hence, from (7), $\sum_{k=0}^{\infty} c_{x+2k} u^k$ converges for $p = \frac{1}{2}$ or $u = \frac{1}{4}$, so the radius
of convergence $R$ of the Maclaurin series of any loss function, with $(x, y) \ne (1, 1)$,
obeys $\frac{1}{4} \le R \le 1$ (the upper bound holds because the coefficients are positive integers).
If we set $u = \frac{1}{4}$ and $p = q = \frac{1}{2}$ in (7), and if we use the second
equation of (1), we obtain the following useful relation for the loss sequence:
\[ \sum_{k=0}^{\infty} c_{x+2k} \left(\frac{1}{4}\right)^{k} = g\!\left(\frac{1}{4}\right) = 2^x \Pr(L) = \frac{2^x y}{x + y}. \tag{9} \]
This shows that, given the value of $x$ and the loss sequence of X, the value of $y$ is
uniquely determined. As an example, suppose the loss sequence is one whose general
term is $3^k$, $k \ge 0$. If $x = 3$, then by using (9), $y$ is determined by the equations
$\sum_{k=0}^{\infty} \left(\frac{3}{4}\right)^k = 4 = \frac{8y}{3 + y}$, which has unique solution $y = 3$.
Different pairs $(x, y)$ may yield the same loss function. For example, $(x, y) =
(n, 1)$ yields the same loss function as $(x, y) = (n - 1, 2)$. In each case, the common
loss function is $1/S_{n+1}$. However, one can never find three distinct pairs $(x, y)$
that have the same loss function. To see this, note that if for $n > 1$, we arrange the
powers of $u$ in the expansion of $S_n$ in ascending order as in Table 1, the first two terms
in this expansion will be
\[ 1 - (n - 2)u. \tag{10} \]
This is easily proved by induction using the defining relation $S_{n+1} = S_n - uS_{n-1}$.
Now suppose that two pairs $(x, y)$ and $(x', y')$ yield the same loss function, so that
$S_y / S_{x+y} = S_{y'} / S_{x'+y'}$ and
\[ S_y S_{x'+y'} = S_{x+y} S_{y'}. \tag{11} \]
First suppose that both $y$ and $y'$ are greater than 1. Performing the multiplications
of polynomials in (11) and using (10), we see that the start of the calculation gives
\[ [1 - (y - 2)u][1 - (x' + y' - 2)u] = [1 - (x + y - 2)u][1 - (y' - 2)u]. \]
Equating coefficients of $u$, we find that $x = x'$. But then $y = y'$ by the statement
following equation (9).
Now suppose that $y = 1$, so we are investigating the case when $(x, 1)$ and $(x', y')$
yield the same loss function. Then $S_1 / S_{x+1} = S_{y'} / S_{x'+y'}$ and $S_{x'+y'} = S_{x+1} S_{y'}$. The
same analysis as in the last paragraph leads to $x' = x$ if $y' = 1$, and $x' = x - 1$
if $y' > 1$. We are left with the case that $(x, 1)$ and $(x - 1, y')$ give the same loss
function. For $y' = 2$ we already observed, prior to equation (10), that this happens. In
general, there cannot be three such pairs $(x, 1)$, $(x - 1, y')$, $(x - 1, y'')$ because, as
we noted after equation (9), the $y$ value is uniquely determined by the $x$ value and the
loss sequence. Thus only one $y$ value can go with $x - 1$ and $y' = y''$.
Finally, here are two questions that come to mind.
QUESTION 1. Can an infinite number of the loss functions have a common root?
QUESTION 2. Our main ideas are actually "probability free" in their definition.
Can one give, in a manner as simple as ours, a method of determining the loss function
for any $(x, y)$ without referring to the probability result (1)?
Average time to lose As promised, we compute the expected time it will take to lose,
given that we lose. If $T$ represents the number of flips before losing, then we want the
conditional expectation $E(T \mid L)$ and this equals
\[ \frac{1}{\Pr(L)} \sum_{k=0}^{\infty} (x + 2k)\, c_{x+2k}\, q^{x+k} p^k = \frac{x q^x g(u) + 2 q^x u\, g'(u)}{q^x g(u)} = x + 2u\, \frac{g'(u)}{g(u)}. \]
For $(x, y) = (4, 2)$, we have $g'(u)/g(u) = \frac{4 - 6u}{3u^2 - 4u + 1}$, so the expected number of flips
is
\[ 4 + 2pq \cdot \frac{4 - 6pq}{3p^2q^2 - 4pq + 1}. \]
For $p = q = \frac{1}{2}$ and $u = \frac{1}{4}$, the expected time to lose is 32/3.
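The value 32/3 can be confirmed two ways in Python. The sketch below is our own illustration for the single case $(x, y) = (4, 2)$: one function evaluates the closed form exactly with rational arithmetic, and the other truncates the defining series, using the formula $c_{4+2k} = (3^{k+1} - 1)/2$, which is our partial-fraction expansion of $1/(1 - 4u + 3u^2)$ and is not stated in the paper. Both function names are hypothetical.

```python
from fractions import Fraction

def expected_flips_closed_form(p):
    """E(T | L) for (x, y) = (4, 2), from x + 2u g'(u)/g(u)."""
    p = Fraction(p)
    u = p * (1 - p)
    return 4 + 2 * u * (4 - 6 * u) / (3 * u * u - 4 * u + 1)

def expected_flips_truncated(p, terms=300):
    """The same conditional expectation, from the first `terms` terms of the series."""
    p = Fraction(p)
    q = 1 - p
    weighted = total = Fraction(0)
    for k in range(terms):
        c = (3 ** (k + 1) - 1) // 2            # loss sequence 1, 4, 13, 40, ...
        prob = c * q ** (4 + k) * p ** k       # Pr(L_{4+2k})
        weighted += (4 + 2 * k) * prob
        total += prob
    return weighted / total

print(expected_flips_closed_form(Fraction(1, 2)))
print(float(expected_flips_truncated(Fraction(1, 2))))
```

The first line prints the exact fraction; the second is a numerical approximation whose accuracy depends on the number of terms kept.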
Acknowledgment The author would like to express his thanks to Emeric Deutsch for reading several versions
of this paper and for general advice. Special thanks are due to Ken Ross, Associate Editor, for a great deal of
improvement of this paper, mathematically, historically, and stylistically.
REFERENCES
1. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed., John Wiley, 1968.
2. R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science,
Addison-Wesley, 1989.
3. R. Hirshon and R. De Simone, An offer you can't refuse, Mathematics Magazine 81 (2008) 146–152.
4. L. Takács, On the classical ruin problems, J. American Statistical Association 64 (1969) 889–906.
doi:10.2307/2283470
Summary In a version of gambler's ruin, players start with $x$ and $y$ dollars respectively, and flip coins for
one dollar per flip until one player runs out of money. This is a random walk with two absorbing barriers. We
consider the number of ways for the first player to lose on the $n$th flip, for $n = x, x + 2, \ldots$. We use probabilistic
arguments to construct generating functions for these quantities along with explicit methods for computing them.
This paper builds on the paper by Hirshon and De Simone, Mathematics Magazine 81 (2008) 146–152.
More Polynomial Root Squeezing
CHRISTOPHER FRAYER
University of Wisconsin–Platteville
Platteville, WI 53818-3099
[email protected]
Suppose you're looking at the graph of a polynomial $y = p(x)$ in a Java applet, with
blue dots on the $x$-axis indicating the polynomial's roots, and red dots on the $x$-axis
showing the positions of the critical points. Let's assume that all the roots are real and
that you grab the blue dots and move them around on the $x$-axis. As you do this, what
happens to the red dots?
This is a fair question because the roots determine the polynomial up to a constant
multiple, and they determine the critical points exactly. For simplicity (and without
loss of generality) we will only consider monic polynomials (that is, polynomials with
leading coefficient 1).
If you move all the blue points (roots) the same amount, the whole graph just translates,
and all the red dots simply move along for the ride. If you move all the roots in
the same direction but by different amounts, it seems reasonable that the critical points
all move in that same direction. This is in fact true, according to the Polynomial Root
Dragging Theorem (see [1], [3]). But suppose you take two roots and symmetrically
squeeze them closer to each other, something we call polynomial root squeezing. Then
what do the critical points do? In [2], Boelkins, From and Kolins answer this for critical
points that are outside the interval between the two selected roots. In this article
we extend their analysis to cover critical points at or between the two squeezed roots.
Math. Mag. 83 (2010) 218–221. doi:10.4169/002557010X494878. © Mathematical Association of America
Notation and definitions Let $p(x)$ be a monic degree-$n$ polynomial with real roots
$r_1 \le r_2 \le \cdots \le r_n$ and critical points $c_1 \le c_2 \le \cdots \le c_{n-1}$. Rolle's Theorem tells us
that there is a critical point strictly between each pair of adjacent roots. We know that
wherever there are $r$ roots together at a single point, there are also $(r - 1)$ critical
points. So we have
\[ r_1 \le c_1 \le r_2 \le c_2 \le \cdots \le c_{n-1} \le r_n \tag{1} \]
with $r_i < c_i < r_{i+1}$ whenever $r_i < r_{i+1}$. By polynomial root squeezing we mean
selecting two indices $i$ and $j$ with $r_i$ strictly less than $r_j$; we then move the smaller root
from $r_i$ to $r_i + d$ and the larger root from $r_j$ to $r_j - d$, where $d > 0$. We insist that
$d < \frac{r_j - r_i}{2}$, so that the roots don't pass each other.
As an example, consider the polynomial $p(x) = x^2(x + 1)(x - 2)$. It has single roots at $-1$ and $2$, and a double root at $0$. Its critical points are at (approximately) $-.693$, $0$, and $1.443$. After squeezing the roots at $-1$ and $2$ to $-.5$ and $1.5$ respectively, the polynomial becomes $\tilde{p}(x) = x^2(x + .5)(x - 1.5)$. The left critical point moves to the right from $-.693$ to $-.343$, and the right critical point moves to the left from $1.443$ to $1.093$. However the center critical point remains at zero. This example is illustrated in FIGURE 1.
[Figure 1: graphs of the two curves, labeled $p(x)$ and $q(x)$ in the plot, on a common set of axes.]
Figure 1 Two roots of the polynomial $p(x) = x^2(x + 1)(x - 2)$ have been squeezed together to form $\tilde{p}(x)$. In this example, $x = 0$ is a critical point of $p(x)$ and $q(x)$.
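The numbers quoted in this example can be reproduced with a short computation. The following is only a sketch, not the author's method: the function names are hypothetical, and it assumes all roots are real, so that the derivative changes sign exactly once between adjacent distinct roots and a repeated root is itself a critical point.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with these roots."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        nxt = coeffs + [0.0]
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= r * c
        coeffs = nxt
    return coeffs


def evaluate_derivative(coeffs, x):
    """Evaluate p'(x) by Horner's rule, given the coefficients of p."""
    n = len(coeffs) - 1
    acc = 0.0
    for i, c in enumerate(coeffs[:-1]):
        acc = acc * x + c * (n - i)
    return acc


def critical_points(roots, iterations=200):
    """Critical points of the monic polynomial with the given real roots."""
    roots = sorted(roots)
    coeffs = poly_from_roots(roots)
    points = []
    for a, b in zip(roots, roots[1:]):
        if a == b:
            points.append(a)  # a repeated root is also a critical point
            continue
        # Bisect on the sign of p' strictly inside (a, b).
        eps = 1e-9 * (b - a)
        lo, hi = a + eps, b - eps
        sign_lo = evaluate_derivative(coeffs, lo) > 0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if (evaluate_derivative(coeffs, mid) > 0) == sign_lo:
                lo = mid
            else:
                hi = mid
        points.append((lo + hi) / 2)
    return points


before = critical_points([-1, 0, 0, 2])      # p(x) = x^2 (x + 1)(x - 2)
after = critical_points([-0.5, 0, 0, 1.5])   # roots -1 and 2 squeezed by d = 0.5
print(before)
print(after)
```

The outer critical points move toward the midpoint $1/2$ of the two squeezed roots, while the one at the double root $0$ stays where it is.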
Why doesn't the critical point at zero move? It is because $x = 0$ is a repeated root of $\frac{p(x)}{(x+1)(x-2)}$, and as long as this repeated root remains fixed, so must the critical point. More generally, if $c_k$ is a repeated root of $\frac{p(x)}{(x - r_i)(x - r_j)}$, then $c_k$ will remain a critical point when $r_i$ and $r_j$ are squeezed together. For this reason, we say that a critical point is stubborn if it is a repeated root of $\frac{p(x)}{(x - r_i)(x - r_j)}$, and ordinary otherwise.
A stubborn critical point can move if it lies at $r_i$ or $r_j$. If $r_i$ (or $r_j$) lies at a repeated root of multiplicity greater than two, then there is a repeated stubborn critical point there. When $r_i$ is dragged to the right, one of the stubborn critical points will move to the right, while the others will remain fixed. In order to state the theorem as succinctly as possible we exclude the case of stubborn critical points and leave the details as an exercise.
The theorem Boelkins, From and Kolins [2] proved the Polynomial Root Squeezing
Theorem. That theorem explains how squeezing two roots together affects the critical
points that are outside of the interval between the two squeezed roots. Our proof of the
Polynomial Root Squeezing Theorem extends their analysis to the critical points that
lie at or between the two squeezed roots.
THEOREM. If the roots at $r_i$ and $r_j$ move equal distances toward each other, then each ordinary critical point moves toward $(r_i + r_j)/2$. If the roots at $r_i$ and $r_j$ move equal distances away from each other, then each ordinary critical point moves away from $(r_i + r_j)/2$.
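[Figure 2: a number line marking the roots $r_2$ and $r_4$, the critical points $c_1, \ldots, c_5$, and the midpoint $(r_2 + r_4)/2$.]
Figure 2 The Polynomial Root Squeezing Theorem: when we drag $r_2$ and $r_4$ together, the critical points move toward $(r_2 + r_4)/2$.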
Proof. We prove the root squeezing part of the theorem. The root separating part (moving $r_i$ and $r_j$ equal distances away from each other) follows similarly.
Let $p(x)$ be a polynomial of degree $n$ with (possibly repeated) real roots $r_1 \le r_2 \le \cdots \le r_n$, $r_i < r_j$ and $c_k$ any critical point of $p(x)$. Let $\tilde{p}(x)$ be the polynomial that results from squeezing $r_i$ and $r_j$ a fixed distance $d$, with $0 \le d < \frac{1}{2}|r_j - r_i|$. That is
\[ \tilde{p}(x) = (x - r_i - d)(x - r_j + d) \prod_{k \ne i, j} (x - r_k) = (x - r_i - d)(x - r_j + d)\,q(x). \]
Denote the roots of $\tilde{p}(x)$ by $\tilde{r}_1 \le \tilde{r}_2 \le \cdots \le \tilde{r}_n$ and the critical points by $\tilde{c}_1 \le \tilde{c}_2 \le \cdots \le \tilde{c}_{n-1}$.
If $c_k$ lies outside the interval from $r_i$ to $r_j$, then the conclusion follows from [2]. (It also follows from a slight variation of the reasoning below.) If $c_k$ is between $r_i$ and $r_i + d$, or between $r_j - d$ and $r_j$ (that is, if one of the moving roots passes by $c_k$) then the result follows from counting intervals in (1).
We now assume that $c_k$ is not at a repeated root of $p$ and that $r_i + d < c_k < r_j - d$. Our goal is to compare $c_k$ and $\tilde{c}_k$. We do so by investigating $\tilde{p}'(c_k)$. Let
\[ p(x) = (x - r_i)(x - r_j)\,q(x), \]
so that
\[ p'(x) = (x - r_i + x - r_j)\,q(x) + (x - r_i)(x - r_j)\,q'(x), \tag{2} \]
and
\[ \tilde{p}'(x) = (x - r_i + x - r_j)\,q(x) + (x - r_i - d)(x - r_j + d)\,q'(x). \tag{3} \]
Subtracting (2) from (3) and using $p'(c_k) = 0$ yields
\[ \tilde{p}'(c_k) = d\,(r_j - r_i - d)\,q'(c_k). \tag{4} \]
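The step from (2) and (3) to (4) rests on one algebraic identity. It is spelled out below as a sketch in the notation of the proof, writing $\tilde{p}$ for the squeezed polynomial:

```latex
\begin{align*}
(x - r_i - d)(x - r_j + d)
  &= (x - r_i)(x - r_j) + d\,(x - r_i) - d\,(x - r_j) - d^2 \\
  &= (x - r_i)(x - r_j) + d\,(r_j - r_i - d), \\
\tilde{p}(x) &= p(x) + d\,(r_j - r_i - d)\,q(x), \\
\tilde{p}'(c_k) &= p'(c_k) + d\,(r_j - r_i - d)\,q'(c_k)
                 = d\,(r_j - r_i - d)\,q'(c_k).
\end{align*}
```

The two quadratic factors differ only by the constant $d\,(r_j - r_i - d)$, so they have the same derivative $2x - r_i - r_j$; this is why (2) and (3) share their first term.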
Since $d > 0$ and $r_j - r_i - d > 0$, this implies that $\tilde{p}'(c_k)$ and $q'(c_k)$ have the same sign.
Without loss of generality we assume that $p(x) < 0$ on $(r_k, r_{k+1})$ and that $|c_k - r_i| < |c_k - r_j|$. (The cases where $|c_k - r_i| > |c_k - r_j|$ and/or $p(x) > 0$ are similar.) Since $r_i < c_k < r_j$, it follows that $(c_k - r_i)(c_k - r_j) < 0$ so that $q(c_k) > 0$. As $p'(c_k) = 0$,
\[ 0 = p'(c_k) = (c_k - r_i + c_k - r_j)\,q(c_k) + (c_k - r_i)(c_k - r_j)\,q'(c_k). \]
An analysis of the sign of the terms, with the assumption that $|c_k - r_i| < |c_k - r_j|$, implies that $q'(c_k) < 0$. It then follows from (4) that $\tilde{p}'(c_k) < 0$.
Since $p(c_k) < 0$, the equation
\[ p(c_k)(c_k - r_i - d)(c_k - r_j + d) = \tilde{p}(c_k)(c_k - r_i)(c_k - r_j) \]
implies that $\tilde{p}(c_k) < 0$. Since we assume that $r_i + d < c_k < r_j - d$ and $c_k$ is not a repeated root of $p$, it follows that $\tilde{r}_k = r_k$ or $\tilde{r}_k = r_i + d$ while $\tilde{r}_{k+1} = r_{k+1}$ or $\tilde{r}_{k+1} = r_j - d$. In all four cases, $\tilde{r}_k < c_k < \tilde{r}_{k+1}$ with $\tilde{p}(c_k) < 0$, which implies that $\tilde{p}(x) < 0$ on $(\tilde{r}_k, \tilde{r}_{k+1})$. Therefore $\tilde{p}'(x)$ changes sign from negative to positive at $\tilde{c}_k$. As $\tilde{p}'(c_k) < 0$, it follows that $c_k < \tilde{c}_k$ and $c_k$ has moved toward $(r_i + r_j)/2$.
This extended version of the Polynomial Root Squeezing Theorem completely char-
acterizes the behavior of all the critical points when distinct roots are squeezed or sep-
arated a uniform distance. In every case, if a critical point moves at all, it moves in the
same direction as the moving root that is nearest to it.
Unfortunately, this intuition does not help us when two distinct roots are squeezed
together a nonuniform distance. Neither does it tell us what happens when more than
two roots are moved simultaneously. These problems could prompt some interesting
undergraduate research.
Acknowledgment The author wishes to express his gratitude to James Swenson and Tony Thomas for helpful
conversations.
REFERENCES
1. Bruce Anderson, Polynomial root dragging, Amer. Math. Monthly 100 (1993) 864–866. doi:10.2307/2324665
2. Matthew Boelkins, Justin From, and Samuel Kolins, Polynomial root squeezing, Math. Mag. 81 (2008) 39–44.
3. Gideon Peyser, On the roots of the derivative of a polynomial with real roots, Amer. Math. Monthly 74 (1967) 1102–1104. doi:10.2307/2313625
Summary Given a polynomial with all real roots, the Polynomial Root Dragging Theorem states that moving one or more roots of the polynomial to the right will cause every critical point to move to the right, or stay fixed. But what happens to the position of a critical point when roots are dragged in opposite directions? In this note we discuss the Polynomial Root Squeezing Theorem, which states that moving two roots, $r_i$ and $r_j$, an equal distance toward each other without passing other roots, will cause each critical point to move toward $(r_i + r_j)/2$, or remain fixed.
A Counterexample to Integration by Parts
ALEXANDER KHEIFETS
Department of Mathematical Sciences
University of Massachusetts Lowell
Alexander [email protected]
JAMES PROPP
Department of Mathematical Sciences
University of Massachusetts Lowell
James [email protected]
The integration-by-parts formula
\[ \int f'(x)g(x)\,dx = f(x)g(x) - \int f(x)g'(x)\,dx \]
carries with it an implicit quantification over functions $f$, $g$ to which the formula applies. So, what conditions must $f$ and $g$ satisfy in order for us to be able to apply the formula?
A natural guess (which some teachers might even offer to a student who raised the question) would be that this formula applies whenever $f$ and $g$ are differentiable. Clearly this condition is necessary, since otherwise the integrands $f'(x)g(x)$ and $f(x)g'(x)$ are not defined. But is this condition sufficient? We will show that it is not. That is, we will give an example of two differentiable functions $f$, $g$ on $[0, 1]$ for which the definite integrals $\int_0^1 f'(x)g(x)\,dx$ and $\int_0^1 f(x)g'(x)\,dx$ do not exist (the former is $-\infty$ and the latter is $+\infty$); it follows that the functions $f'(x)g(x)$ and $f(x)g'(x)$ do not have antiderivatives on the interval $[0, 1]$, so that the indefinite integrals $\int f'(x)g(x)\,dx$ and $\int f(x)g'(x)\,dx$ do not exist.
A cautious teacher might instead reply that the theorem holds whenever $f$ and $g$ are differentiable and $f'g$ and $fg'$ are integrable. While this version of the theorem is true, it cannot be applied in cases where one does not know ahead of time that the integral one is trying to compute actually exists. One wants an integration-by-parts theorem that includes the integrability of $f'(x)g(x)$ as part of its conclusion, not as part of its hypothesis.
Before we give our counterexample to the naive interpretation of the integration by parts formula, or state what we think the teacher should say, we point out that the formula holds if either $f'$ or $g'$ is continuous. For instance, if $f'$ is continuous, then (since $g$ is continuous) the product $f'g$ is continuous; but then the function $f'g$ must have an antiderivative $h$, and consequently the function $fg'$ must have an antiderivative too, namely $fg - h$. So any counterexample to the naive interpretation of integration by parts must feature differentiable functions $f$, $g$ whose derivatives are not continuous, such as the famous function $x^2 \sin 1/x$ (extended to a function on all of $\mathbb{R}$ by continuity) and its relatives. Moreover, it will not do to let $f$ and $g$ be the same function of this sort, since the function $f f'$ always has an antiderivative, namely $\frac{1}{2} f^2$.
Our counterexample is the pair of functions
\[ f(x) = \begin{cases} x^2 \sin\left(\dfrac{1}{x^4}\right), & x \ne 0 \\ 0, & x = 0 \end{cases} \]
Math. Mag. 83 (2010) 222–225. doi:10.4169/002557010X494896. © Mathematical Association of America
and
\[ g(x) = \begin{cases} x^2 \cos\left(\dfrac{1}{x^4}\right), & x \ne 0 \\ 0, & x = 0 \end{cases} \]
on the interval $[0, 1]$. Both functions are continuous on $[0, 1]$ and differentiable on $[0, 1]$. Indeed, if we consider $f$ and $g$ as defined above to be defined on all of $\mathbb{R}$, both functions are differentiable everywhere; for, away from $0$ we can use the chain rule, while at $0$ we have $|(f(h) - f(0))/(h - 0)| = |f(h)/h| \le |h^2/h| = |h|$ so that $f'(0) = \lim_{h \to 0} (f(h) - f(0))/(h - 0) = 0$, and likewise $g'(0) = 0$. Obviously, the integral
\[ \int_0^1 [f(x)g(x)]'\,dx \]
exists. However, we will show that both integrals
\[ \int_0^1 f'(x)g(x)\,dx \quad\text{and}\quad \int_0^1 f(x)g'(x)\,dx \]
are divergent. It suffices to show that the first integral is divergent. For $x \ne 0$,
\[ f'(x) = 2x \sin\left(\frac{1}{x^4}\right) - 4x^2 \cos\left(\frac{1}{x^4}\right)\frac{1}{x^5}. \]
The first term in this representation of $f'(x)$ is continuous, and $g(x)$ is continuous, so their product is continuous and therefore integrable. So, we focus on the second term times $g(x)$, namely
\[ -4\int_0^1 x^2 \cos\left(\frac{1}{x^4}\right)\frac{1}{x^5}\,g(x)\,dx = -4\int_0^1 x^4 \cos^2\left(\frac{1}{x^4}\right)\frac{1}{x^5}\,dx = \int_0^1 x^4 \cos^2\left(\frac{1}{x^4}\right) d\left(\frac{1}{x^4}\right). \]
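After the substitution
\[ u = \frac{1}{x^4} \]
the integral turns into
\[ -\int_1^\infty \frac{1}{u}\cos^2(u)\,du \]
(with the minus sign coming from the interchange of upper and lower limits of integration). To show that this integral diverges, let $k$ be a positive integer. Then for every $u$ in the interval $[2\pi k - \frac{\pi}{4},\ 2\pi k]$ we have
\[ \cos^2(u) \ge \frac{1}{2} \quad\text{and}\quad \frac{1}{u} \ge \frac{1}{2\pi k}. \]
Therefore,
\[ \int_1^\infty \frac{1}{u}\cos^2(u)\,du \ \ge\ \sum_{k=1}^\infty \int_{2\pi k - \frac{\pi}{4}}^{2\pi k} \frac{1}{u}\cos^2(u)\,du \ \ge\ \sum_{k=1}^\infty \frac{1}{2\pi k}\cdot\frac{1}{2}\cdot\frac{\pi}{4} = \frac{1}{16}\sum_{k=1}^\infty \frac{1}{k} = \infty. \]
This completes the proof.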
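The interval-by-interval bound can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the integral over each interval $[2\pi k - \pi/4,\ 2\pi k]$ is estimated with a plain midpoint rule rather than computed exactly.

```python
import math


def interval_integral(k, steps=2000):
    """Midpoint-rule estimate of the integral of cos(u)**2 / u over [2*pi*k - pi/4, 2*pi*k]."""
    a = 2 * math.pi * k - math.pi / 4
    b = 2 * math.pi * k
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        u = a + (i + 0.5) * h
        total += math.cos(u) ** 2 / u
    return total * h


def lower_bound(k):
    """The bound (1 / (2*pi*k)) * (1/2) * (pi/4) = 1 / (16*k) used in the proof."""
    return 1.0 / (16 * k)


def partial_lower_bound(K):
    """Sum of the lower bounds for k = 1, ..., K: a multiple of a harmonic partial sum."""
    return sum(lower_bound(k) for k in range(1, K + 1))


for k in (1, 10, 100):
    print(k, interval_integral(k), lower_bound(k))
print(partial_lower_bound(100), partial_lower_bound(10000))
```

The partial sums grow like $\frac{1}{16}\ln K$, which is very slow but unbounded; that slow growth is why the truncated graphs alone do not make the divergence obvious.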
Our analysis shows that the (improper) definite integrals $\int_0^1 f'(x)g(x)\,dx$ and $\int_0^1 f(x)g'(x)\,dx$ do not exist. This in turn shows that the functions $f'(x)g(x)$ and $f(x)g'(x)$ do not have antiderivatives on $[0, 1]$. For, if these functions had antiderivatives, the fundamental theorem of calculus would yield finite values for the definite integrals.
We have shown that the functions $f'g$ and $fg'$ are not integrable over $[0, 1]$. It is worth noting that $|f'|$ and $|g'|$ are not integrable over $[0, 1]$ either, as can be shown by a similar method. On the other hand, the function $f'$ is integrable over $[0, 1]$ in the sense that the improper Riemann integral $\int_0^1 f'(x)\,dx$ exists: for all $\epsilon > 0$ the Fundamental Theorem of Calculus implies $\int_\epsilon^1 f'(x)\,dx = f(1) - f(\epsilon)$, which converges to $f(1) - f(0)$ as $\epsilon \to 0$, implying that $\int_0^1 f'(x)\,dx$ exists and equals $f(1) - f(0)$. Likewise $g'$ is integrable over $[0, 1]$.
The following three pictures (created with the help of Mathematica) illustrate what is going on: they depict the (truncated) graphs of $f$, $f'$, and $-f'g$ (we show $-f'g$ rather than $f'g$ so that the function will be non-negative rather than non-positive). The continuous function $f$ is integrable, and the discontinuous function $f'$ is integrable because its oscillations balance out, but the non-negative function $-f'g$ is non-integrable.
[Three plots over the interval from 0 to 1: the (truncated) graphs of $f$, $f'$, and $-f'g$.]
Some might be inclined to say that our example is actually a vindication of an extended integration by parts theorem that asserts, as important special cases, that if $\int_a^b f'(x)g(x)$ is $\infty$ then $\int_a^b f(x)g'(x)$ is $-\infty$ and vice versa (and likewise with the signs reversed), and that if either of these integrals diverges by oscillation (as in the case for the functions $f$, $g$ on $[-1, 1]$ given by $x^2 \sin(1/x^4)$, $x^2 \cos(1/x^4)$ on $[0, 1]$ and $x^2 \sin(1/x^4)$, $x^2 \cos(1/x^4)$ on $[-1, 0]$, respectively) then so does the other. However, to the extent that one might be inclined to treat the integration by parts formula as implicitly asserting that the integrals are well-defined, our example provides a corrective.
Is this corrective needed? We have not found any calculus texts that present a mistaken statement of the integration by parts theorem, but we have found some widely-used web sites that do so (e.g.: "Let $u$ and $v$ be differentiable functions, then $\int u v'\,dx = uv - \int u' v\,dx$"). More common are books and web sites that present the integration by parts formula and give examples without specifying the conditions under which the formula applies. A provocative treatment of other pedagogical aspects of the integration by parts theorem is [2].
So, what should the calculus teacher say?
In an ordinary calculus class, the integration by parts formula should be stated as a theorem that begins "If $f'$ and $g'$ are continuous, then . . . " (although, as we have noted, it suffices that either $f'$ or $g'$ is continuous).
For a more advanced course (an honors calculus class or an introductory real analy-
sis class), our example could be presented in detail and used to motivate the notion of
bounded variation, since the lack of bounded variation of the derivatives of the func-
tions near the origin is the source of the problem. We also mention that, in lieu of
adopting the hypothesis that f (or g) is continuously differentiable, one might require
that f be Riemann-Stieltjes integrable with respect to dg. Then it can be shown that the
integration by parts formula (where the integrals now are Riemann-Stieltjes integrals)
is valid, and it is part of the conclusion that g will be Riemann-Stieltjes integrable with
respect to d f (see [1]).
Finally, we mention that if the functions $f'$ and $g'$ are assumed to be integrable in the sense that $\int_0^1 f'(x)\,dx$ and $\int_0^1 g'(x)\,dx$ exist as strict Riemann integrals (and not just as improper Riemann integrals), then the conclusion of the integration by parts theorem applies. Indeed, we only need to know that at least one of $f'$ and $g'$ is Riemann integrable. For, Lebesgue's Theorem states that a (measurable) function is Riemann integrable if and only if it is bounded and its set of discontinuity has Lebesgue measure zero. If $g$ is continuous and $f'$ is Riemann integrable (i.e., it is bounded and its set of discontinuity has Lebesgue measure zero), then so is $f'g$, and the integration by parts theorem applies. Hence it is an essential feature of our counterexample that the functions $f'$ and $g'$ are not just discontinuous but also non-integrable in the Riemann sense.
Acknowledgment This work was stimulated by conversations with the honors freshman calculus class at UMass Lowell, and also benefited from conversations with Lee Jones of UMass Lowell (who found a different counterexample) and Zbigniew Nitecki of Tufts University.
REFERENCES
1. Tom M. Apostol, Mathematical Analysis, 2nd ed., Addison Wesley, Reading, MA, 1974.
2. Jonathan Lewin, Integration by Parts: Another Example of Voodoo Mathematics, http://science.
kennesaw.edu/~jlewin/fb/integration-by-parts.pdf.
Summary The authors exhibit two differentiable functions $f$ and $g$ for which the functions $f'g$ and $fg'$ are not integrable, so that the integration by parts formula does not apply.
PROBLEMS
BERNARDO M. ÁBREGO, Editor
California State University, Northridge
Assistant Editors: SILVIA FERNÁNDEZ-MERCHANT, California State University, Northridge; JOSÉ A. GÓMEZ, Facultad de Ciencias, UNAM, México; ROGELIO VALDEZ, Facultad de Ciencias, UAEM, México; WILLIAM WATKINS, California State University, Northridge
PROPOSALS
To be considered for publication, solutions should be received by November 1,
2010.
1846. Proposed by Eddie Cheng and Jerrold W. Grossman, Department of Mathemat-
ics and Statistics, Oakland University, Rochester, MI.
For which $n \ge 1$ is it possible to place the numbers $1, 2, \ldots, n$ in some order (a) on a line segment, or (b) on a circle, so that for every $s$ from $1$ to $\frac{1}{2}n(n + 1)$ there is a connected subset of the segment or circle such that the sum of the numbers on that subset is $s$?
1847. Proposed by Panagiote Ligouras, Leonardo da Vinci High School, Noci,
Italy.
Let $ABC$ be a scalene triangle. Let $h_a$, $l_a$, and $m_a$ be the respective lengths of the height, bisector, and median, of $ABC$ with respect to $A$, and let $r_a$ be the exradius of the excircle of $ABC$ opposite to $A$. Similarly, define $h_b$, $l_b$, $m_b$, and $r_b$, with respect to $B$, and $h_c$, $l_c$, $m_c$, and $r_c$ with respect to $C$. Prove that
\[ \frac{l_a^4\,(m_a^2 - h_a^2)}{h_a^3\, r_a\,(l_a^2 - h_a^2)} + \frac{l_b^4\,(m_b^2 - h_b^2)}{h_b^3\, r_b\,(l_b^2 - h_b^2)} + \frac{l_c^4\,(m_c^2 - h_c^2)}{h_c^3\, r_c\,(l_c^2 - h_c^2)} > \frac{16}{3}. \]
1848. Proposed by Herb Bailey, Rose-Hulman Institute of Technology, Terre Haute, IN.
Let $N$ be a base ten positive integer with nonzero last digit. Let $N'$ be the integer formed by moving the last digit of $N$ to the front. For example, if $N = 867053$ then $N' = 386705$. Find all $N$ for which $N$ is divisible by $N'$.
Math. Mag. 83 (2010) 226–233. doi:10.4169/002557010X494904. © Mathematical Association of America
We invite readers to submit problems believed to be new and appealing to students and teachers of advanced
undergraduate mathematics. Proposals must, in general, be accompanied by solutions and by any bibliographical
information that will assist the editors and referees. A problem submitted as a Quickie should have an unexpected,
succinct solution. Submitted problems should not be under consideration for publication elsewhere.
Solutions should be written in a style appropriate for this MAGAZINE.
Solutions and new proposals should be mailed to Bernardo M. Ábrego, Problems Editor, Department of
Mathematics, California State University, Northridge, 18111 Nordhoff St, Northridge, CA 91330-8313, or mailed
electronically (ideally as a LaTeX or pdf file) to [email protected]. All communications, written or
electronic, should include on each page the reader's name, full address, and an e-mail address and/or FAX
number.
1849. Proposed by Ovidiu Furdui, Campia Turzii, Cluj, Romania.
Find the sum
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{(-1)^{n+m}}{\lfloor \sqrt{n + m} \rfloor^{3}}, \]
where $\lfloor a \rfloor$ denotes the greatest integer less than or equal to $a$.
1850. Proposed by Richard Stephens, Department of Mathematics, Columbus State
University, Columbus, GA.
Let $\tau$ be a topology on a finite set $X$. Define a topology on $X$ to be regular if for any nonempty closed $E \subseteq X$ and $x \in X \setminus E$, there exist disjoint open sets $U$ and $V$ in $\tau$ such that $E \subseteq V$ and $x \in U$. Prove or disprove that the topological space $(X, \tau)$ is regular if and only if $\tau$ has a base $\mathcal{B}$ which is a partition of $X$.
Quickies
Answers to the Quickies are on page 232.
Q1001. Proposed by Herman Roelants, Center for Logic, Institute of Philosophy, Uni-
versity of Leuven, Leuven, Belgium.
The recursive sequence $(a_n)$ is defined as follows: $a_1 = 0$ and $a_{n+1} = \sqrt{a_n^2 + 1} + a_n$ for $n \ge 1$. Determine the value of
\[ \lim_{n \to \infty} \frac{2^n}{a_n}. \]
Q1002. Proposed by Michael W. Botsko, Saint Vincent College, Latrobe, PA.
Let $g$ be a positive, continuous, real-valued function on $[0, \infty)$, and let
\[ f(x) = g(x) \int_0^x \frac{1}{(g(t))^2}\,dt. \]
Prove that $f$ is unbounded on $[0, \infty)$.
Solutions
Locating the intersection of the diagonals June 2009
1821. Proposed by Abdullah Al-Sharif and Mowaffaq Hajja, Yarmouk University, Ir-
bid, Jordan.
Let ABCD be a convex quadrilateral, let X and Y be the midpoints of sides BC and DA
respectively, and let O be the point of intersection of diagonals of ABCD. Prove that
O lies inside of quadrilateral ABXY if and only if
Area(AOB) < Area(COD).
I. Solution by Michel Bataille, Rouen, France.
Let $U$ and $V$ be the points of intersection of $XY$ with $AC$ and $BD$, respectively (see figure).
Let positive real numbers $p$, $q$ be defined by
\[ \overrightarrow{OC} = -p\,\overrightarrow{OA}, \qquad \overrightarrow{OD} = -q\,\overrightarrow{OB} \]
so that $C = -pA + (1 + p)O$ and $D = (1 + q)O - qB$.
Then, $2X = B + C = -pA + (1 + p)O + B$ and similarly, $2Y = A + (1 + q)O - qB$. It follows that the equation of the line $XY$, in barycentric coordinates $(x, y, z)$ relative to $(A, O, B)$, is
\[ (pq + 1 + 2q)x + (pq - 1)y + (pq + 1 + 2p)z = 0, \]
and so the coordinates of $U$ and $V$ are $U = (1 - pq,\ pq + 1 + 2q,\ 0)$ and $V = (0,\ pq + 1 + 2p,\ 1 - pq)$, that is,
\[ 2(1 + q)\,\overrightarrow{OU} = (1 - pq)\,\overrightarrow{OA}, \quad\text{and}\quad 2(1 + p)\,\overrightarrow{OV} = (1 - pq)\,\overrightarrow{OB}. \]
Thus $O$ is in the interior of $ABXY$ if and only if $pq > 1$.
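On the other hand,
\[ \mathrm{Area}(COD) = \tfrac{1}{2}\, OC \cdot OD \sin(\angle COD) = \tfrac{1}{2}\, p \cdot OA \cdot q \cdot OB \sin(\angle AOB) = pq\,\mathrm{Area}(AOB). \]
Thus $pq > 1$ if and only if $\mathrm{Area}(COD) > \mathrm{Area}(AOB)$.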
II. Solution by David Getling, Berlin, Germany.
Let $Z$ and $W$ be the midpoints of $CD$ and $AB$, respectively. Varignon's Theorem says that $XWYZ$ is a parallelogram. Indeed, $XW$ and $YZ$ are parallel to $AC$ and also $YW$ and $XZ$ are parallel to $BD$. As a consequence $O$ always lies inside this parallelogram. Also, $O$ lies inside $ABXY$ if and only if $O$ lies inside the triangle $XYW$, that is, $O$ lies inside $ABXY$ if and only if $[WXOY] < [YOXZ]$, where $[WXOY]$ designates the area of $WXOY$.
In the figure, all triangular regions with the same area have been labeled with the same number. The condition $[WXOY] < [YOXZ]$ is equivalent to
[1] + [1] + [3] + [4] < [2] + [2] + [3] + [4], or [1] + [1] + [3] < [2] + [2] + [3].
But $[WBX] = [ABC]/4$, from which [1] + [3] = [5] + [8], and similarly, [2] + [3] = [7] + [8]. Thus the condition is equivalent to
[1] + [5] + [8] < [2] + [7] + [8], or $\frac{1}{2}[AOB]$ = [1] + [5] < [2] + [7] = $\frac{1}{2}[COD]$,
which completes the proof.
Also solved by Robert Calcaterra, Robert L. Doucette, Fisher Problem Solving Group, Dmitry Fleischman,
Michael Goldenberg and Mark Kaplan, Eugen J. Ionascu, Young Ho Kim (Korea), Omran Kouba (Syria), Victor
Y. Kutsenok, Aaron Panchal, Joel Schlosberg, Edward Schmeichel, Marian Tetiva (Romania), and the proposers.
An inequality for $\sqrt[3]{u/v} + \sqrt[3]{v/u}$ June 2009
1822. Proposed by Pham Van Thuan, Hanoi University of Science, Hanoi, Vietnam.
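Let $u$ and $v$ be positive real numbers. Prove that
\[ \frac{1}{8}\left(17 - \frac{2uv}{u^2 + v^2}\right) \le \sqrt[3]{\frac{u}{v}} + \sqrt[3]{\frac{v}{u}} \le \sqrt{(u + v)\left(\frac{1}{u} + \frac{1}{v}\right)}. \]
Find conditions under which equality holds.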
Solution by Omran Kouba, Damascus, Syria.
We first prove the following inequality. If $x$ is real and $x > 2$, then
\[ \frac{1}{8}\left(17 - \frac{2}{x(x^2 - 3)}\right) < x < (x - 1)\sqrt{x + 2}. \]
Note that
\[ 8x - 17 + \frac{2}{x(x^2 - 3)} = (x - 2)\left(8 - \frac{(x + 1)^2}{x(x^2 - 3)}\right). \]
Writing $(x + 1)^2 = (x^2 - 3) + 2x + 4$, and noting that $x^2 - 3 > 1$ for $x > 2$, it follows that
\[ \frac{(x + 1)^2}{x(x^2 - 3)} = \frac{1}{x} + \frac{2x + 4}{x(x^2 - 3)} < \frac{1}{x} + \frac{2x + 4}{x} = 2 + \frac{5}{x} < 2 + \frac{5}{2} = \frac{9}{2}. \]
Hence
\[ 8x - 17 + \frac{2}{x(x^2 - 3)} > (x - 2)\left(8 - \frac{9}{2}\right) > 0, \]
and the first inequality is proved. To prove the second inequality, note that $x > 2$ implies $\sqrt{x + 2} > 2$, and consequently
\[ x < 2(x - 1) < (x - 1)\sqrt{x + 2}. \]
For the problem at hand, let $x = \sqrt[3]{u/v} + \sqrt[3]{v/u}$. The Arithmetic Mean–Geometric Mean Inequality implies that $x \ge 2$, with equality if and only if $u = v$. Thus, if $u \ne v$ then $x > 2$, and by the previously proved inequality,
\[ \frac{1}{8}\left(17 - \frac{2}{x(x^2 - 3)}\right) < x < (x - 1)\sqrt{x + 2}. \]
Because $x^3 = u/v + v/u + 3x$, it follows that
\[ (x - 1)^2(x + 2) = x^3 - 3x + 2 = 2 + \frac{u}{v} + \frac{v}{u} = (u + v)\left(\frac{1}{u} + \frac{1}{v}\right) \]
and
\[ \frac{2}{x(x^2 - 3)} = \frac{2}{x^3 - 3x} = \frac{2}{u/v + v/u} = \frac{2uv}{u^2 + v^2}. \]
Therefore
\[ \frac{1}{8}\left(17 - \frac{2uv}{u^2 + v^2}\right) < \sqrt[3]{\frac{u}{v}} + \sqrt[3]{\frac{v}{u}} < \sqrt{(u + v)\left(\frac{1}{u} + \frac{1}{v}\right)}. \]
Moreover, if $u = v$ then all three expressions in the inequality are equal, so equality holds if and only if $u = v$.
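A quick numerical spot check of the chain of inequalities is sketched below. It is not part of the published solution: the function name and the sample values are hypothetical, and floating-point evaluation on a handful of pairs illustrates the statement without proving it.

```python
def bounds(u, v):
    """Return the three expressions of Problem 1822 for positive u and v."""
    lower = (17 - 2 * u * v / (u * u + v * v)) / 8
    middle = (u / v) ** (1 / 3) + (v / u) ** (1 / 3)
    upper = ((u + v) * (1 / u + 1 / v)) ** 0.5
    return lower, middle, upper


# Sample pairs with u != v, chosen so that u/v is well away from 1.
samples = [(1, 2), (3, 2), (0.5, 0.7), (0.1, 40), (10, 5), (3, 5)]
for u, v in samples:
    lower, middle, upper = bounds(u, v)
    print(u, v, lower, middle, upper)

# The equality case u = v: all three expressions equal 2 (up to rounding).
print(bounds(4, 4))
```

For pairs with $u/v$ close to 1 the three values nearly coincide, so a check of the strict inequalities there would need more care with rounding than this sketch takes.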
Editor's Note. Stan Wagon verified that the constant $\frac{1}{8}$ in the first inequality cannot be improved. Eugene A. Herman proved the stronger inequality $\frac{4}{9}\left(5 - uv/(u^2 + v^2)\right) < \sqrt[3]{u/v} + \sqrt[3]{v/u}$. Furthermore, he proved that this is the sharpest possible inequality of the form $a - b\left(uv/(u^2 + v^2)\right) < \sqrt[3]{u/v} + \sqrt[3]{v/u}$ with $a, b > 0$. Graham Lord generalized in a different vein; he proved that $\frac{1}{8}\left(17 - 2uv/(u^2 + v^2)\right) < \sqrt[4]{u/v} + \sqrt[4]{v/u}$ and verified that the statement no longer holds with fifth roots.
Also solved by Arkady Alt, Michel Bataille (France), Minh Can, Hongwei Chen, John Christopher, Chip Curtis, Robert L. Doucette, John Ferdinands, Leon Gerber, Michael Goldenberg and Mark Kaplan, Eugene A. Herman, Eugen J. Ionascu and Sarah E. Ewing, Parvis Khalili, Elias Lampakis (Greece), Kee-Wai Lau (China), Graham Lord, José H. Nieto (Venezuela), Northwestern University Math Problem Solving Group, Occidental College Problem Solving Group, Paolo Perfetti (Italy), Gabriel T. Prăjitură, Joel Schlosberg, John L. Simmons (Holland), Nicholas C. Singer, Sanghun Song (Korea), Albert Stadler (Switzerland), David Stone and John Hawkins, Marian Tetiva (Romania), Texas State Problem Solvers Group, Michael Vowe (Switzerland), Stan Wagon, and the proposer. There were two incorrect submissions.
Permutations with k initial entries of the same parity June 2009
1823. Proposed by Emeric Deutsch, Polytechnic University, Brooklyn, NY.
Let n and k be positive integers. Find a closed-form expression for the number of
permutations of {1, 2, . . . , n} for which the initial k entries have the same parity, but
the initial k + 1 entries do not. (As an example, for the permutation 5712463, the
number of initial entries of the same parity is 3, the order of the set {5, 7, 1}.)
Solution by José H. Nieto, Universidad del Zulia, Maracaibo, Venezuela.
Let $I_n = \{1, 2, \ldots, n\}$. Denote by $E(n, k)$ and $O(n, k)$ the sets of permutations of $I_n$ with just $k$ initial even entries, respectively with just $k$ initial odd entries. The problem asks to find an expression for $p(n, k) = |E(n, k)| + |O(n, k)|$.
If $n = 2m$ is even, the first $k$ entries of a permutation in $E(n, k)$ can be chosen in $m(m - 1) \cdots (m - k + 1)$ ways, the $(k + 1)$th entry in $m$ ways, and the remaining $n - k - 1$ entries in $(2m - k - 1)!$ ways, hence $|E(2m, k)| = \binom{m}{k}\, k!\, m\, (2m - k - 1)!$. By symmetry $|O(2m, k)| = |E(2m, k)|$ and
\[ p(2m, k) = 2m \binom{m}{k}\, k!\, (2m - k - 1)!. \]
Analogously, if $n = 2m + 1$ then $|E(2m + 1, k)| = \binom{m}{k}\, k!\, (m + 1)(2m - k)!$ and $|O(2m + 1, k)| = \binom{m+1}{k}\, k!\, m\, (2m - k)!$, hence
\[ p(2m + 1, k) = \left((m + 1)\binom{m}{k} + m\binom{m + 1}{k}\right) k!\,(2m - k)!. \]
Both formulas for $n$ even and odd may be summarized as follows:
\[ p(n, k) = \left(\left\lceil \frac{n}{2} \right\rceil \binom{\lfloor n/2 \rfloor}{k} + \left\lfloor \frac{n}{2} \right\rfloor \binom{\lceil n/2 \rceil}{k}\right) k!\,(n - k - 1)!. \]
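The closed form can be compared against a brute-force count for small $n$. The sketch below is an illustration with hypothetical function names; it enumerates all permutations, so it is only practical for small $n$, and it assumes $1 \le k \le n - 1$ so that a $(k+1)$th entry exists.

```python
from itertools import permutations
from math import comb, factorial


def count_by_brute_force(n, k):
    """Count permutations of 1..n whose first k entries share a parity but whose first k+1 do not."""
    total = 0
    for perm in permutations(range(1, n + 1)):
        parities = [x % 2 for x in perm]
        first_k_agree = len(set(parities[:k])) == 1
        first_k1_agree = len(set(parities[:k + 1])) == 1
        if first_k_agree and not first_k1_agree:
            total += 1
    return total


def closed_form(n, k):
    """The summarized formula for p(n, k), valid for 1 <= k <= n - 1."""
    evens, odds = n // 2, (n + 1) // 2  # floor(n/2) even entries, ceil(n/2) odd entries
    return (odds * comb(evens, k) + evens * comb(odds, k)) * factorial(k) * factorial(n - k - 1)


for n in range(2, 8):
    for k in range(1, n):
        assert count_by_brute_force(n, k) == closed_form(n, k)
print(closed_form(7, 3))
```

Here `math.comb` returns 0 when $k$ exceeds the number of available entries of one parity, which matches the convention $\binom{m}{k} = 0$ for $k > m$ that the formula relies on.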
Editor's Note. Graham Lord observed that if the set $I_n$ is partitioned into sets $A$ and $B$ with $|A| = a$ and $|B| = b$, then the number of permutations of $I_n$ where the first $k$ entries are in $A$ and the next $j$ entries are in $B$ is equal to $\binom{a}{k}\, k!\, \binom{b}{j}\, j!\, (n - j - k)!$.
Also solved by Michel Bataille (France), Jany C. Binz (Switzerland), Robert Calcaterra, Chip Curtis,
M. N. Deshpande (India), Dmitry Fleischman, Ralph P. Grimaldi, Eugene A. Herman, Peter M. Joyce and
Richard F. McCoart Jr., Victor Y. Kutsenok, Elias Lampakis (Greece), Graham Lord, Rob Pratt, Joel Schlos-
berg, John Sumner and Aida Kadic-Galeb, Nicholas C. Singer, Texas State Problem Solvers Group, Michael
Woltermann, and the proposer.
An Intermediate Value Theorem conclusion June 2009
1824. Proposed by Cezar Lupu, student, University of Bucharest, Bucharest, Roma-
nia.
Let $f$ be a continuous real-valued function defined on $[0, 1]$ and satisfying
\[ \int_0^1 f(x)\,dx = \int_0^1 x f(x)\,dx. \]
Prove that there exists a real number $c$, $0 < c < 1$, such that
\[ c f(c) = \int_0^c x f(x)\,dx. \]
Solution by Dave Trautman, Department of Mathematics and Computer Science, The
Citadel, Charleston, SC.
Because $f$ is continuous and $\int_0^1 (1 - x) f(x)\,dx = 0$, the Mean Value Theorem for Integrals assures the existence of some $c_1$, $0 < c_1 < 1$, such that $(1 - c_1) f(c_1) = 0$. Clearly this means $f(c_1) = 0$. If $\int_0^{c_1} x f(x)\,dx = 0$, then $c = c_1$ proves the required identity. Replacing $f$ by $-f$ if necessary, it can be assumed that $\int_0^{c_1} x f(x)\,dx > 0$. Because the function $G(x) = x f(x)$ is continuous on $[0, 1]$, there exists $c_2$, $0 \le c_2 < c_1$, such that $G(c_2)$ is the maximum value of $G$ on $[0, c_1]$. For $0 \le x \le c_1$, let
\[ H(x) = \int_0^x t f(t)\,dt. \]
Because $c_2 < 1$, it follows that
\[ H(c_2) = \int_0^{c_2} t f(t)\,dt \le c_2\, G(c_2) < G(c_2). \]
On the other hand,
\[ H(c_1) = \int_0^{c_1} t f(t)\,dt > 0 = G(c_1). \]
Thus the Intermediate Value Theorem says that there exists $c$, $c_2 < c < c_1$, such that $G(c) = H(c)$, that is $c f(c) = \int_0^c x f(x)\,dx$.
Editor's Note. A number of readers pointed out that the same conclusion follows if the hypothesis is replaced by the weaker condition of $f$ being continuous and $f(x_0) = 0$ for some $0 < x_0 < 1$.
Also solved by Michael R. Bacon and Charles K. Cook, Michel Bataille (France), Gerald E. Bilodeau, Michael W. Bosko, Robert Calcaterra, Hongwei Chen, John Christopher, Andrés Fielbaum (Chile), Fisher Problem Solving Group, G.R.A.20 Problem Solving Group (Italy), William Hodge, Eugen J. Ionascu, Parviz Khalili, Elias Lampakis (Greece), Kee-Wai Lau (China), Kim McInturff, Occidental Problem Solving Group, Ángel Plaza and José M. Pacheco (Spain), Edward Schmeichel, Sanghun Song (Korea), Marian Tetiva (Romania), Jeremy Thibodeaux, Thomas P. Turiel, Nicholas J. Willis, and the proposer.
Non-nested subsets of a ring closed under multiplication June 2009
1825. Proposed by Greg Oman and Kevin Schoenecker, The Ohio State University,
Columbus, OH.
Let $R$ be a ring with more than two elements. Prove that there exist subsets $S$ and $T$ of $R$, both closed under multiplication, and such that $S \not\subseteq T$ and $T \not\subseteq S$. (Note: We do not assume that $R$ is commutative nor do we assume that $R$ has a multiplicative identity.)
Solution by Howard E. Bell, Department of Mathematics, Brock University, St. Cather-
ines, Ontario, Canada.
If $R$ contains an element $a$ such that $a^n \ne 0$ for all $n \in \mathbb{Z}^+$, then the sets $S = \{0\}$ and $T = \{a^n : n \in \mathbb{Z}^+\}$ satisfy the required properties. Otherwise, $R$ is a nil ring, that is, for every $x \in R$ there is a positive integer $n$ such that $x^n = 0$. Let the index of $x$ be the smallest positive integer with this property. If $R$ contains two distinct elements $a$ and $b$ of index 2, then let $S = \{0, a\}$ and $T = \{0, b\}$. Clearly $S$ and $T$ satisfy the required conditions. This case occurs if the maximum index in $R$ is 2. It also occurs when there exists $a \in R$ with index $k \ge 4$, for in this case $a^{k-1}$ and $a^{k-2}$ are two elements of index 2. The only remaining case is that $R$ contains an element $a$ of index 3, in which case $a$, $a^2$, and $a + a^2$ are nonzero and $a \ne a^2$, $a \ne a + a^2$, and $a^2 \ne a + a^2$. Thus the sets $S = \{0, a, a^2\}$ and $T = \{0, a + a^2, a^2\}$ satisfy the requirements.
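For the index-3 case, closure under multiplication is asserted without computation. A sketch of the check, using only $a^3 = 0$ and the fact that powers of $a$ commute with one another, is:

```latex
\begin{align*}
S &: \quad a \cdot a = a^2, \qquad a \cdot a^2 = a^2 \cdot a = a^3 = 0, \qquad a^2 \cdot a^2 = a^4 = 0, \\
T &: \quad (a + a^2)^2 = a^2 + 2a^3 + a^4 = a^2, \\
  &\phantom{:} \quad (a + a^2)\,a^2 = a^2\,(a + a^2) = a^3 + a^4 = 0, \qquad a^2 \cdot a^2 = 0.
\end{align*}
```

Neither set contains the other because $a \in S \setminus T$ and $a + a^2 \in T \setminus S$.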
Note. It is possible to insist that $S \cup T$ be commutative, for if $R$ is a noncommutative ring with maximum index 2 and $a$ and $b$ are noncommuting elements of $R$, then $a$, $b$, and $a + b$ all have square zero, so that $ab + ba = 0$ and hence both $ab$ and $ba$ are nonzero. Thus, $S = \{0, ab\}$ and $T = \{0, ba\}$ satisfy the requirements and $S \cup T$ is commutative.
Also solved by Paul Budney, Robert Calcaterra, John Ferdinands, John N. Fitch, Rod Hardy and Alin A. Stancu, Elias Lampakis (Greece), David P. Lang, Missouri State University Problem Solving Group, Justin Neil and Paul Peck, José H. Nieto, Northwestern University Math Problem Solving Group, Éric Pité (France), Gabriel T. Prăjitură, Nicholas C. Singer, John Sumner and Aida Kadic-Galeb, Vadim Ponomarenko, Marian Tetiva (Romania), Texas State University Problem Solvers Group, Gregory P. Wene (Mexico), and the proposers. There was one incorrect submission.
Answers
Solutions to the Quickies from page 227.
A1001. The answer is $\pi$. Note that $1/a_2 = 1 = \tan(\pi/2^2)$. By induction, if $1/a_n = \tan(\pi/2^n)$,
then for positive angles less than $\pi/2$ the Tangent Half-Angle Formula gives
$$\tan\left(\frac{\pi}{2^{n+1}}\right) = \frac{-1 + \sqrt{1 + \tan^2(\pi/2^n)}}{\tan(\pi/2^n)} = \frac{-1 + \sqrt{1 + a_n^{-2}}}{a_n^{-1}} = -a_n + \sqrt{a_n^2 + 1} = \frac{1}{\sqrt{a_n^2 + 1} + a_n} = \frac{1}{a_{n+1}}.$$
VOL. 83, NO. 3, JUNE 2010 233
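Therefore
$$\lim_{n \to \infty} \frac{2^n}{a_n} = \lim_{n \to \infty} \left( 2^n \tan\left(\frac{\pi}{2^n}\right) \right) = \lim_{n \to \infty} \left( \pi \cdot \frac{\tan(\pi/2^n)}{\pi/2^n} \right) = \pi.$$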
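A quick numerical check of this limit is sketched below. It assumes the indexing used in the solution ($a_2 = 1$) and the recurrence $a_{n+1} = a_n + \sqrt{a_n^2 + 1}$ read off from the last step of the induction. The function name `ratio` is hypothetical.

```python
import math

def ratio(n_max: int) -> float:
    """Return 2**n_max / a_{n_max}, starting from a_2 = 1."""
    a = 1.0  # a_2 = 1, as noted in the solution
    for _ in range(2, n_max):
        # a_{n+1} = a_n + sqrt(a_n^2 + 1)
        a = a + math.sqrt(a * a + 1.0)
    return 2.0 ** n_max / a

for n in (2, 5, 10, 20, 30):
    print(n, ratio(n), abs(ratio(n) - math.pi))
```

The printed differences shrink roughly by a factor of four per step, consistent with $2^n \tan(\pi/2^n) = \pi + O(4^{-n})$.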
A1002. Suppose f is bounded on $[0, \infty)$. Let $h(x) = \int_0^x (g(t))^{-2}\,dt$ so that $h'(x) = (g(x))^{-2}$.
Note that $h(x) > 0$ on $(0, \infty)$. Because f is bounded, there exists $B > 0$ such that
$|f(x)| = |g(x)h(x)| \le B$ on $[0, \infty)$. Therefore $g^2(x)h^2(x) \le B^2$ and thus
$h'(x)/h^2(x) \ge 1/B^2$ on $(0, \infty)$. Integrating this inequality yields
$$\frac{1}{h(1)} - \frac{1}{h(x)} = \int_1^x \frac{h'(t)}{h^2(t)}\,dt \ge \int_1^x \frac{1}{B^2}\,dt = \frac{1}{B^2}(x - 1) \quad \text{on } [1, \infty).$$
Therefore
$$\frac{1}{B^2}(x - 1) \le \frac{1}{h(x)} + \frac{1}{B^2}(x - 1) \le \frac{1}{h(1)} \quad \text{on } [1, \infty),$$
which is a contradiction.
Editor's Note. By letting $B(x) = c\sqrt{x}$, the same argument shows that $f(x)/\sqrt{x}$ is also
unbounded. On the other hand, the function $g(x) = \sqrt{x + 2}$ shows that it is possible
for $f(x)/(\sqrt{x + 2}\,\ln(x + 2))$ to be bounded on $[0, \infty)$.
REVIEWS
PAUL J. CAMPBELL, Editor
Beloit College
Assistant Editor: Eric S. Rosenthal, West Orange, NJ. Articles and books are selected for this
section to call attention to interesting mathematical exposition that occurs outside the main-
stream of mathematics literature. Readers are invited to suggest items for review to the editors.
Beardon, Alan F., Creative Mathematics: A Gateway to Research, Cambridge University Press,
2009; x + 110 pp, $27.99(P). ISBN 978-0-521-13059-2.
Problem books abound. Naturally, most focus on solving the problems. But there can be an-
other, larger aim: expanding on the problems and furthering additional mathematical discovery.
This book offers 11 problems, each with a solution and more problems; and then those further
problems are discussed and generalizations urged. The book begins with a succinct eight pages
on how to write mathematics and give a presentation, on the grounds that writing and commu-
nicating a careful solution to a problem itself stimulates further thought and new ideas. Some
of the problems require linear algebra, others modular arithmetic, and a few some probability
(mostly finite); one applies Taylor series with remainder to realize a limit. This is an inspiring
book; I wish the price could be lower.
Alsina, Claudi, and Roger B. Nelsen, When Less Is More: Visualizing Basic Inequalities, MAA,
2009; xix + 190 pp, $59.95; member price $48.95. ISBN 978-0-88385-342-9.
Dip anywhere into this book and you will learn something new to you: Guha's inequality as
a lemma to an easy proof of the inequality of the means, Simpson's paradox in statistics as
an illustration of the mediant inequality, not one but three geometric proofs of the
Cauchy-Schwarz inequality, and the use of majorization to prove inequalities. This book concentrates
on geometric inequalities and indeed aims to present a methodology for producing mathematical
visualization of inequalities. Each of the nine chapters is devoted to a method, such
as representing numbers geometrically, or using incircles, circumcircles, reflections, rotations,
transformations, or graphs of functions. Each chapter ends with challenges to apply its method,
and solutions are given to all challenges.
Hardy, Michael, and Catherine Woodgold, Prime simplicity, Mathematical Intelligencer 31 (4)
(2009) 44–52.
Do you think that Euclid proved the existence of infinitely many prime numbers by contradiction?
You may think so, and I bet that you know how to do it that way, but Euclid didn't
do it that way, despite the fact that lots of your colleagues (including some famous ones) have
written that he did. Euclid in fact gave a constructive proof, that there are more prime numbers
than any proposed multitude of prime numbers, not that there are an infinite number of them,
since the concept of an actual (as opposed to potential) infinity was not part of Greek thought.
This study of the history of Euclid's proof, with 147 references, is remarkably thorough. The
authors conclude, however: "When and how did the error become the prevailing doctrine? We
have no answer." Though the authors find no single infection as the source, this virus surely is a
consequence of the modern curriculum's abandonment of the custom in the nineteenth century
(and earlier) of direct study of Euclid's work itself.
Math. Mag. 83 (2010) 234–235. doi:10.4169/002557010X494922. © Mathematical Association of America
Stein, James D., How Math Can Save Your Life (and Make You Rich, Help You Find the One,
and Avert Catastrophes), Wiley, 2010; xiv + 242 pp, $24.95. ISBN 978-0-47043-775-9.
Well, if there was ever a title to sell a book about math, this should be it! In the table of contents,
each chapter title is paired with three intriguing questions (e.g., "Will refinancing your house
actually save money?"). The writing is informal and brisk, and author Stein is a Berkeley Ph.D.
in mathematics with a long career in university teaching. He considers mathematical aspects
of all kinds of everyday topics: service contracts for appliances, strategy in football, finding a
mate, picking lottery numbers, risky surgery, hybrid cars, financial indexes, teaching children
arithmetic, and damage from disasters. The main tool is expected value, with contributions
from symbolic logic, game theory, and regression to the mean. I was put off a bit by
anti-algebra editorializing in the Introduction: "outside of the people [in the sciences, engineering,
and investments] almost nobody needs algebra or ever uses it." But I was more troubled by bad
arithmetic and the wrong conclusion about auto insurance policies (p. 14), no mention of utility
in regard to expected value, and the otherwise-ingenious "Tulip Indexes" that unfortunately
divide the S&P stock index and mean new-home prices in current dollars by mean household
income in constant (inflation-adjusted) dollars without observing that fact.
Bayley, Melanie, Alice's adventures in algebra: Wonderland solved, New Scientist issue 2739
(16 December 2009). Algebra in Wonderland, New York Times (7 March 2010),
http://www.nytimes.com/2010/03/07/opinion/07bayley.html.
Pycior, Helena M., At the intersection of mathematics and humor: Lewis Carroll's Alices and
symbolical algebra, Victorian Studies 28 (1) (Autumn 1984) 149–170.
Devlin, Keith, The hidden math behind Alice in Wonderland,
http://www.maa.org/devlin/devlin_03_10.html.
Wilson, Robin J., Lewis Carroll in Numberland: His Fantastical Mathematical Logical Life: An
Agony in Eight Fits, W.W. Norton, 2008; xi + 237 pp., $24.95. ISBN 978-0-393-06027-0.
The recent release in March of the new film Alice in Wonderland will no doubt regenerate
interest in Lewis Carroll (Charles Dodgson) and his works. Mathematics instructors may be
able to use that renewed interest as a teachable moment, thanks to author Bayley, a doctoral
candidate in Victorian literature. She claims that in the Alice book Dodgson satirized and
argued against the absurdity of the new mathematics of his day, which she takes to be imaginary
numbers, symbolical algebra, projective geometry, and quaternions. Commentator Devlin
summarizes Bayley's arguments sympathetically, even agreeing with the highly-implausible
assertion that without the mathematical undercurrents the book would never have achieved
stardom! Earlier, author Pycior looked into Dodgson's struggle against symbolical algebra and
his fusion of mathematics with humor. Did Dodgson's mathematical colleagues react to Alice?
Neither Robin Wilson nor Martin Gardner, author of The Annotated Alice, has weighed in yet
on this latest cluster of claims.
Denning, Peter J., and Peter A. Freeman, Computing's paradigm, Communications of the Association
for Computing Machinery 52 (12) (December 2009) 28–30.
A recent curriculum proposal at my college would reclassify academic departments as arts,
humanities, social sciences, or natural and physical sciences. Curiously, both mathematics and
computer science, currently classified as (unnatural) sciences, were left out (what's the message
in that?). A colleague in my department and I asserted that they are, of course,
humanities. There is confusion in the public mind about mathematics and computer science;
the prevailing erroneous view (even among college faculty) is that they both deal primarily
with numbers. Authors Denning and Freeman ask what characterizes computing and present
for it a paradigm, that is, a belief system and its associated practices, defining how a field
sees the world and approaches the solutions of problems. Their main concern is reconciling
the engineering and science views of computing, and they accept that computing is a fourth
great domain of science alongside the physical, life, and social sciences. From the three
subparadigms of mathematics, science, and engineering, they synthesize the computing paradigm
as focusing on information processes, natural or constructed . . . discrete or continuous. There
is no mention of numbers whatever.
NEWS AND LETTERS
50th International Mathematical Olympiad
ZUMING FENG
Phillips Exeter Academy
Exeter, NH 03833-2460
[email protected]
STEVEN R. DUNBAR
MAA American Mathematics Competitions
University of Nebraska-Lincoln
Lincoln, NE 68588-0658
[email protected]
Problems
1. Let n be a positive integer and let $a_1, \ldots, a_k$ ($k \ge 2$) be distinct integers in the set
$\{1, \ldots, n\}$ such that n divides $a_i(a_{i+1} - 1)$ for $i = 1, \ldots, k - 1$. Prove that n does not
divide $a_k(a_1 - 1)$.
2. Let ABC be a triangle with circumcenter O. The points P and Q are interior points of
the sides CA and AB, respectively. Let K, L, and M be the midpoints of the segments
BP, CQ, and PQ respectively and let $\Gamma$ be the circle passing through K, L, and M.
Suppose that the line PQ is tangent to the circle $\Gamma$. Prove that OP = OQ.
3. Suppose that $s_1, s_2, s_3, \ldots$ is a strictly increasing sequence of positive integers such
that the subsequences $s_{s_1}, s_{s_2}, s_{s_3}, \ldots$ and $s_{s_1+1}, s_{s_2+1}, s_{s_3+1}, \ldots$ are both arithmetic
progressions. Prove that $s_1, s_2, s_3, \ldots$ is itself an arithmetic progression.
4. Let ABC be a triangle with AB = AC. The angle bisectors of $\angle CAB$ and $\angle ABC$ meet
the sides BC and CA at D and E, respectively. Let K be the incenter of triangle ADC.
Suppose that $\angle BEK = 45^\circ$. Find all possible values of $\angle CAB$.
5. Determine all functions f from the set of positive integers to the set of positive integers
such that, for all positive integers a and b, there exists a non-degenerate triangle with
sides of lengths
a, f(b), and f(b + f(a) − 1).
(A triangle is non-degenerate if its vertices are not collinear.)
6. Let $a_1, a_2, \ldots, a_n$ be distinct positive integers and let M be a set of $n - 1$ integers not
containing $s = a_1 + a_2 + \cdots + a_n$. A grasshopper is to jump along the real axis, starting
at the point 0 and making n jumps to the right with lengths $a_1, a_2, \ldots, a_n$ in some order.
Prove that the order can be chosen in such a way that the grasshopper never lands on
any points in M.
Solutions Following are the essential ideas for each problem. These solution sketches
are adapted from [1] and details and alternatives are in the forum.
Math. Mag. 83 (2010) 236–239. doi:10.4169/002557010X494931. © Mathematical Association of America
1. Prove inductively that $n \mid a_1(a_i - 1)$ for $i = 2, \ldots, k$. The case i = 2 is a hypothesis,
so assume the claim holds for some $i \ge 2$. Then $n \mid a_1(a_i - 1)$ and $n \mid a_i(a_{i+1} - 1)$, so
$n \mid (a_1 a_i - a_1)a_{i+1} - a_1 a_i + a_1$ and $n \mid a_1 a_i a_{i+1} - a_1 a_i$. Subtracting the first from
the second, we obtain $n \mid a_1 a_{i+1} - a_1$ so the induction is complete. Now $n \mid a_1 a_k - a_1$
and if $n \mid a_k a_1 - a_k$, then $n \mid a_1 - a_k$, which is impossible.
This problem was proposed by Ross Atkins of Australia.
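For small n the statement of Problem 1 can be confirmed by exhaustive search. The sketch below is only a sanity check, not a proof. The function name `counterexamples` and the search bound are illustrative choices.

```python
from itertools import permutations

def counterexamples(n: int):
    """Collect tuples of distinct a_1..a_k in {1..n}, k >= 2, that satisfy
    n | a_i (a_{i+1} - 1) for i < k and also n | a_k (a_1 - 1)."""
    found = []
    for k in range(2, n + 1):
        for seq in permutations(range(1, n + 1), k):
            chain_ok = all(
                (seq[i] * (seq[i + 1] - 1)) % n == 0 for i in range(k - 1)
            )
            if chain_ok and (seq[-1] * (seq[0] - 1)) % n == 0:
                found.append(seq)
    return found

for n in range(1, 8):
    print(n, counterexamples(n))  # the problem asserts each list is empty
```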
2. The circle $\Gamma$ is tangent to line PQ if and only if $\angle MLK = \angle QMK$. Since MK is parallel
to AB, it follows that $\angle AQP = \angle MLK$. Since MK and ML are mid-lines in $\triangle PQB$ and
$\triangle PQC$ respectively, it follows that $\angle PAQ = \angle KML$. Therefore $\triangle APQ \sim \triangle MKL$. Then
$AP/AQ = MK/ML = BQ/PC$ and so $AP \cdot PC = AQ \cdot BQ$. But $AP \cdot PC$ is the power
of P with respect to the circle with center O. Then $AP \cdot PC = R^2 - OP^2$. Similarly
$AQ \cdot BQ = R^2 - OQ^2$ and so OP = OQ.
This problem was proposed by Sergei Berlov of Russia.
3. Suppose $s_{s_i} = A + i\,d_A$ and $s_{s_i+1} = B + i\,d_B$ where $A, B, d_A, d_B > 0$. Since $s_i$
is an increasing sequence, $s_{s_i+1} > s_{s_i}$. Note that $s_{s_i+1} - s_{s_i} = (B - A) + i(d_B - d_A)$
and $s_{s_{i+1}} - s_{s_i+1} = (A + d_A - B) + i(d_A - d_B)$ are arithmetic progressions and their
common differences add to zero. If the first of the common differences $d_B - d_A$ is
strictly positive, then the other common difference $d_A - d_B$ must be strictly negative,
and so eventually $s_{s_{i+1}} - s_{s_i+1}$ must be negative, a contradiction to being increasing.
Likewise, if $d_A - d_B$ is strictly positive, then eventually $s_{s_i+1} - s_{s_i}$ must be negative,
also a contradiction. Hence $d_A = d_B$ and $s_{s_1}, s_{s_2}, \ldots$ and $s_{s_1+1}, s_{s_2+1}, \ldots$ have the same
common difference, say d. Establish that $s_i \ge i$ by induction. Then
$s_{s_{i+1}} - s_{s_i} \ge s_{i+1} - s_i \ge 0$. Since $d = s_{s_{i+1}} - s_{s_i}$, we see $s_{i+1} - s_i$ is bounded. The difference
achieves a maximum $s_{a+1} - s_a = M$ and minimum $s_{b+1} - s_b = m$. Let $s_a = k$. Let $s_b = l$.
Then $s_{s_{s_{a+1}}} - s_{s_{s_a}} = s_{s_{s_a + M}} - s_{s_{s_a}} = s_{s_{k+M}} - s_{s_k} = Md$ since $s_{s_i}$ is an arithmetic
progression with common difference d. Since M is the maximum of $s_{i+1} - s_i$, and
the average value of $s_{i+1} - s_i$ from $s_{s_{s_a}}$ to $s_{s_{s_{a+1}}}$ is M, it follows that $s_{s_{s_a}+1} - s_{s_{s_a}} = M$.
But $s_{s_i+1} - s_{s_i}$ is constant, so it equals M. By a similar argument using that m is the
minimum of $s_{i+1} - s_i$, we have $s_{s_i+1} - s_{s_i} = m$. Hence M = m and the given sequence
is arithmetic.
This problem was proposed by Gabriel Carroll of the USA.
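4. Let $\alpha = \angle DAC$. Then $\angle CAB = 2\alpha$, $\angle BCA = \angle CBA = 90^\circ - \alpha$, and
$\angle EBC = 45^\circ - \alpha/2$. Consider $\triangle EKC$ with $\angle KCE = 45^\circ - \alpha/2$, $\angle CEK = 3\alpha/2$ and
$\angle CKE = 135^\circ - \alpha$. Finally, $\angle CKA = 135^\circ$. From elementary trigonometry
$CB = 2CD = 2AC\sin(\alpha)$. Applying the Law of Sines to $\triangle BEC$ and substituting for CB,
$$EC = 2AC\sin(\alpha)\,\frac{\sin(45^\circ - \alpha/2)}{\sin(45^\circ + 3\alpha/2)}. \qquad (1)$$
Apply the Law of Sines to $\triangle AKC$ and simplify to obtain $KC = \sqrt{2}\,AC\sin(\alpha/2)$.
Finally, apply the Law of Sines to $\triangle EKC$ and rearrange to obtain
$EC = KC\sin(135^\circ - \alpha)/\sin(3\alpha/2)$. Combining,
$$EC = \sqrt{2}\,AC\sin(\alpha/2)\,\frac{\sin(135^\circ - \alpha)}{\sin(3\alpha/2)}. \qquad (2)$$
Then equating (1) and (2) and cancelling AC,
$$2\sin(\alpha)\,\frac{\sin(45^\circ - \alpha/2)}{\sin(45^\circ + 3\alpha/2)} = \sqrt{2}\,\sin(\alpha/2)\,\frac{\sin(135^\circ - \alpha)}{\sin(3\alpha/2)}. \qquad (3)$$
Solving this equation, $\alpha = 30^\circ$ or $\alpha = 45^\circ$, so $\angle CAB = 60^\circ$ or $90^\circ$.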
The problem was suggested by Jan Vonk, Belgium, Peter Vandendriessche, Belgium
and Hojoo Lee, Korea.
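The two stated roots of equation (3) can be checked numerically. The sketch below evaluates the difference of the two sides of (3) at a given angle. The function name `residual` is hypothetical, and the value 40 degrees is included only as an arbitrary comparison point that is not a root.

```python
import math

def residual(alpha_deg: float) -> float:
    """Left side minus right side of equation (3) at alpha (in degrees)."""
    a = math.radians(alpha_deg)
    d45 = math.radians(45.0)
    d135 = math.radians(135.0)
    lhs = 2.0 * math.sin(a) * math.sin(d45 - a / 2) / math.sin(d45 + 3 * a / 2)
    rhs = math.sqrt(2.0) * math.sin(a / 2) * math.sin(d135 - a) / math.sin(3 * a / 2)
    return lhs - rhs

for alpha in (30.0, 45.0, 40.0):
    print(alpha, residual(alpha))
```

The residual vanishes (up to rounding) at 30 and 45 degrees and is visibly nonzero at 40 degrees.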
5. First note that if a triangle has positive integer side lengths 1, a, b, then by the triangle
inequality $a - 1 \le b \le a + 1$. If the triangle is non-degenerate, then a = b. Using a =
1, then f(b) = f(b + f(1) − 1). Now the claim is that f(1) = 1, since otherwise if
f(1) > 1 then f is periodic of period f(1) − 1, and f is bounded above. Then choosing
a larger than twice the upper bound violates the triangle inequality. Using b = 1, then
a, f(1) = 1, f(f(a)) are the side lengths of a triangle, so a = f(f(a)) for all a. Thus
f is injective.
Now take a = 2, so that 2, f(b), and f(b + f(2) − 1) are the side lengths of a triangle.
Hence $f(b) - 1 \le f(b + f(2) - 1) \le f(b) + 1$. Then
check the 3 possibilities for f(b + f(2) − 1):
(a) f(b + f(2) − 1) = f(b), so by injectivity f(2) = 1, which is impossible since f(1) = 1.
(b) f(b + f(2) − 1) = f(b) − 1, so set k = f(2) − 1, so f(b + k) = f(b) − 1.
By induction f(b + nk) = f(b) − n. Choosing n = f(b) − 1 leads to
function value 1, contradicting injectivity.
(c) f(b + f(2) − 1) = f(b) + 1. Set b = 1, f(2) − 1 = k, so f(1 + k) =
f(1) + 1. Inducting, f(1 + nk) = n + 1. Now if k > 1, then $1 \le k - 1 < k + 1 < 1 + nk$.
This means that f(k − 1) = 1 = f(1) or k = 2 implies
f(2) = 3 and f(b + 2) = f(b) + 1 and finally f(2) = 3 and f(5) = 3, which
is impossible. Conclude that k = 1, so f(b + 1) = f(b) + 1 and f(n) = n.
This problem was proposed by Bruno Le Floch of France.
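The conclusion f(n) = n can be tested directly against the condition of Problem 5 on a finite range. In the sketch below the function names and the bound 50 are illustrative choices, and the constant function is included only as an example of a function that fails.

```python
def is_triangle(x: int, y: int, z: int) -> bool:
    """True if x, y, z are side lengths of a non-degenerate triangle."""
    return x + y > z and y + z > x and z + x > y

def satisfies(f, limit: int) -> bool:
    """Check the condition of Problem 5 for all 1 <= a, b <= limit."""
    return all(
        is_triangle(a, f(b), f(b + f(a) - 1))
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
    )

print(satisfies(lambda n: n, 50))  # identity function
print(satisfies(lambda n: 1, 50))  # constant function: sides a, 1, 1 fail for a >= 2
```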
6. Induct on n. The cases n = 1 and n = 2 are easy. For $n \ge 3$, without loss of generality
let $a_n$ be the largest jump size and let $m_1$ be the smallest element of M. Consider 3
cases.
(a) If $m_1 < a_n$ and $a_n \notin M$, then begin with a jump of size $a_n$. That jump avoids
$m_1$, and the induction hypothesis means that the grasshopper can arrange the
remaining $n - 1$ jumps to avoid the remaining $n - 2$ values of M.
(b) If $m_1 < a_n$ but $a_n \in M$, say $a_n = m_j$ for some j, then consider the starting
two-jump sequences $(a_1, a_n), \ldots, (a_{n-1}, a_n)$. There are $n - 1$ of these
sequences, and the landing values are all distinct and different from $m_j$.
Therefore there are not enough forbidden values in M to block all of them.
For some i, the grasshopper can start with two safe jumps of size $a_i$ and $a_n$.
These two jumps take the grasshopper past $m_1$ and $m_j$, and by induction the
grasshopper can arrange the remaining $n - 2$ jumps to avoid the remaining
$n - 3$ values of M.
(c) If $m_1 \ge a_n$ the grasshopper needs a different strategy. Begin with jump $a_n$,
ignore the value $m_1$, and arrange the remaining jumps to avoid the remaining
$n - 2$ values of M other than $m_1$. If this arrangement avoids $m_1$, the proof is
done. Otherwise, suppose that the grasshopper lands on $m_1$ just before making
a jump of size $a_i$. Then modify the jump sequence by exchanging jumps $a_n$
and $a_i$. Then verify that the modified sequence avoids all the values of M.
This solution is by Anton Mellit, IMO observer with the Ukraine delegation, and
Ilya Bogdanov, IMO observer with the Russian delegation with simplifications by Brian
Basham, a mathematics student at MIT.
Immediately following the IMO, Terry Tao hosted a collaborative solution on his
blog site as a mini-polymath project, [3]. The polymath collaborative solution con-
tinued two days [4] until the contributors agreed upon a solution. Terry Tao followed
with an analysis of the polymath process, [5]. Michael Nielsen wrote up 5 variant proofs
from the collaboration [2].
This problem was proposed by Dmitry Khramtsov of Russia.
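For a small instance, the claim of Problem 6 can be confirmed by trying every jump order against every admissible set M. The sketch below uses exhaustive search rather than the inductive strategy above. The jump lengths (1, 2, 4, 7) and the function name `safe_order` are hypothetical choices made for illustration.

```python
from itertools import combinations, permutations

def safe_order(jumps, forbidden):
    """Return an ordering of `jumps` whose partial sums avoid `forbidden`,
    or None if no such ordering exists (exhaustive search)."""
    for order in permutations(jumps):
        position = 0
        for step in order:
            position += step
            if position in forbidden:
                break  # this order lands on a forbidden point
        else:
            return order  # no landing point was forbidden
    return None

jumps = (1, 2, 4, 7)  # hypothetical jump lengths
total = sum(jumps)
# Try every set M of n - 1 points in {1, ..., s - 1}; s itself is excluded.
all_ok = all(
    safe_order(jumps, set(m)) is not None
    for m in combinations(range(1, total), len(jumps) - 1)
)
print(all_ok)
```

With n forbidden points instead of n − 1 the search can fail, for example for jumps (1, 2) and M = {1, 2}, which shows that the bound on the size of M matters.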
2009 International Mathematical Olympiad Results At the IMO 530 young mathematicians
from 104 countries competed on July 15–16, 2009. The USA team ranked 6th
among all 104 participating countries. The USA team has consistently finished in the top
ten at the IMO. As part of the 50th anniversary of the IMO, Terry Tao and 5 other famous
mathematicians who were IMO medalists gave commemorative lectures. The students visited
a mag-lev train demonstration project, the North Sea resort island Wangerooge, and
the historic Bremen city center.
• John Berman, a graduate of John T. Hoggard High School, Wilmington, NC, won a Gold
medal.
• Wenyu Cao, a student at Phillips Academy, Andover, Massachusetts, won a Silver medal.
• Eric Larson, who graduated from South Eugene High School, Eugene, OR, won a Gold
medal.
• Delong Meng, who graduated from Baton Rouge Magnet School, Baton Rouge, LA, won
a Silver medal.
• Evan O'Dorney, who attends the Venture School and is from Danville, CA, won a Silver
medal.
• Qinxuan Pan, who graduated from Wootton High School in Rockville, MD, won a Silver
medal.
REFERENCES
1. IMO Moderators, Questions of the IMO 2009 Germany, 2009 (accessed Mar. 24, 2010).
http://www.artofproblemsolving.com/Forum/index.php?f=580.
2. Michael Nielsen, Imo 2009 Q6, 2009 (accessed Mar. 24, 2010).
http://michaelnielsen.org/polymath1/index.php?title=Imo_2009_q6.
3. Terry Tao, IMO 2009 Q6 as a mini-polymath project, 2009 (accessed Mar. 24, 2010).
http://terrytao.wordpress.com/2009/07/20/imo-2009-q6-as-a-mini-polymath-project/.
4. Terry Tao, IMO 2009 Q6 mini-polymath project cont., 2009 (accessed Mar. 24, 2010).
http://terrytao.wordpress.com/2009/07/21/imo-2009-q6-mini-polymath-project-cont/.
5. Terry Tao, IMO 2009 Q6 mini-polymath project cont., 2009 (accessed Mar. 24, 2010).
http://terrytao.wordpress.com/2009/07/22/imo-2009-q6-mini-polymath-project-impressions-reflections-analysis/.