THE AMERICAN MATHEMATICAL

MONTHLY
VOLUME 118, NO. 6 JUNE–JULY 2011

Roads and Wheels, Roulettes and Pedals
Fred Kuczmarski

The Golden String, Zeckendorf Representations, and the Sum of a Series
Martin Griffiths

Achievement Sets of Sequences
Rafe Jones

Polynomials, Ellipses, and Matrices: Two Questions, One Answer
Pamela Gorkin and Elizabeth Skubak

A Lost Counterexample and a Problem on Illuminated Polytopes
Ronald F. Wotzlaw and Günter M. Ziegler

NOTES

Conway’s Conjecture for Monotone Thrackles
János Pach and Ethan Sterling

Proofs of Power Sum and Binomial Coefficient Congruences via Pascal’s Identity
Kieren MacMillan and Jonathan Sondow

Meromorphic Continuation of Dirichlet Series via Derivations
Caleb Emmons

PROBLEMS AND SOLUTIONS

REVIEWS

Continuous Symmetry: From Euclid to Klein. By William Barker and Roger Howe
Robin Hartshorne

An Official Publication of the Mathematical Association of America


EDITOR
Daniel J. Velleman
Amherst College

ASSOCIATE EDITORS
William Adkins, Louisiana State University
David Aldous, University of California, Berkeley
Roger Alperin, San Jose State University
Anne Brown, Indiana University South Bend
Edward B. Burger, Williams College
Scott Chapman, Sam Houston State University
Ricardo Cortez, Tulane University
Joseph W. Dauben, City University of New York
Beverly Diamond, College of Charleston
Gerald A. Edgar, The Ohio State University
Gerald B. Folland, University of Washington, Seattle
Sidney Graham, Central Michigan University
Doug Hensley, Texas A&M University
Roger A. Horn, University of Utah
Steven Krantz, Washington University, St. Louis
C. Dwight Lahr, Dartmouth College
Bo Li, Purdue University
Jeffrey Nunemacher, Ohio Wesleyan University
Bruce P. Palka, National Science Foundation
Joel W. Robbin, University of Wisconsin, Madison
Rachel Roberts, Washington University, St. Louis
Judith Roitman, University of Kansas, Lawrence
Edward Scheinerman, Johns Hopkins University
Abe Shenitzer, York University
Karen E. Smith, University of Michigan, Ann Arbor
Susan G. Staples, Texas Christian University
John Stillwell, University of San Francisco
Dennis Stowe, Idaho State University, Pocatello
Francis Edward Su, Harvey Mudd College
Serge Tabachnikov, Pennsylvania State University
Daniel Ullman, George Washington University
Gerard Venema, Calvin College
Douglas B. West, University of Illinois, Urbana-Champaign

EDITORIAL ASSISTANT
Nancy R. Board
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and other features, about mathematics
and the profession. Its readers span a broad spectrum of mathematical interests, and
include professional mathematicians as well as students of mathematics at all collegiate
levels. Authors are invited to submit articles and notes that bring interesting
mathematical ideas to a wide audience of MONTHLY readers.

The MONTHLY’s readers expect a high standard of exposition; they expect articles to
inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant
to be read, enjoyed, and discussed, rather than just archived. Articles may be
expositions of old or new results, historical or biographical essays, speculations or
definitive treatments, broad developments, or explorations of a single application.
Novelty and generality are far less important than clarity of exposition and broad
appeal. Appropriate figures, diagrams, and photographs are encouraged.

Notes are short, sharply focused, and possibly informal. They are often gems that provide
a new proof of an old theorem, a novel presentation of a familiar theme, or a lively
discussion of a single issue.

Beginning January 1, 2011, submission of articles and notes is required via the
MONTHLY’s Editorial Manager System. Initial submissions in pdf or LaTeX form can be sent
to the Editor-Elect Scott Chapman at

http://www.editorialmanager.com/monthly

The Editorial Manager System will cue the author for all required information concerning
the paper. Questions concerning submission of papers can be addressed to the Editor-Elect
at [email protected]. Authors who use LaTeX are urged to use article.sty, or a similar
generic style, and its standard environments with no custom formatting. The style of
citations for journal articles and books should match that used on MathSciNet (see
http://www.ams.org/mathscinet). Follow the link to Electronic Publications Information
for authors at http://www.maa.org/pubs/monthly.html for information about figures and
files, as well as general editorial guidelines.

Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions for
making the MONTHLY more lively, entertaining, and informative can be forwarded to the
Editor-Elect at [email protected].

The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and
readers; it may be searched online in a variety of ways for any specified keyword(s). MAA
members whose institutions do not provide JSTOR access may obtain individual access for a
modest annual fee; call 800-331-1622.

See the MONTHLY section of MAA Online for current information such as contents of issues
and descriptive summaries of forthcoming articles:

http://www.maa.org/

Proposed problems or solutions should be sent to:
DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368

In lieu of duplicate hardcopy, authors may submit pdfs to [email protected].

Advertising Correspondence:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (877) 622-2373
E-mail: [email protected]
Further advertising information can be found online at www.maa.org

Change of address, missing issue inquiries, and other subscription correspondence:
MAA Service Center, [email protected]

All at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036

Recent copies of the MONTHLY are available for purchase through the MAA Service Center.
[email protected], 1-800-331-1622

Microfilm Editions: University Microfilms International, Serial Bid coordinator, 300 North
Zeeb Road, Ann Arbor, MI 48106.

The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except bimonthly
June-July and August-September by the Mathematical Association of America at 1529
Eighteenth Street, N.W., Washington, DC 20036 and Lancaster, PA, and copyrighted by the
Mathematical Association of America (Incorporated), 2011, including rights to this journal
issue as a whole and, except where otherwise noted, rights to each individual
contribution. Permission to make copies of individual articles, in paper or electronic
form, including posting on personal and class web pages, for educational and scientific
use is granted without fee provided that copies are not made or distributed for profit or
commercial advantage and that copies bear the following copyright notice: [Copyright the
Mathematical Association of America 2011. All rights reserved.] Abstracting, with credit,
is permitted. To copy otherwise, or to republish, requires specific permission of the
MAA’s Director of Publications and possibly a fee. Periodicals postage paid at Washington,
DC, and additional mailing offices. Postmaster: Send address changes to the American
Mathematical Monthly, Membership/Subscription Department, MAA, 1529 Eighteenth Street,
N.W., Washington, DC, 20036-1385.

Roads and Wheels, Roulettes and Pedals
Fred Kuczmarski

Abstract. We revisit the idea of road-wheel pairs, first introduced 50 years ago by Gerson
Robison and later popularized by Stan Wagon and his square-wheeled tricycle. We show how
to generate such pairs geometrically: the road as a roulette curve and the wheel as a pedal
curve. Along the way we gain geometric insight into two theorems proved by Jakob Steiner
relating the area and arc length of a roulette to those of a corresponding pedal. Finally, we use
our results to generate parabolas, ellipses, and sine curves as roulettes.

1. INTRODUCTION. In an article in Crelle’s Journal in 1838 Jakob Steiner published
two theorems relating the arc length and area of a roulette curve to those of a
corresponding pedal curve. In 1960, G. Robison [6] introduced the idea of rockers and
rollers: pairs of curves, a rocker and a roller, or a road and a wheel, such that the axle
of the wheel moves in a straight line while the wheel rolls on the road without slip-
ping. In this article we give a geometric method of generating road-wheel pairs and
find a surprising connection with Steiner’s theorems. We then use the results in [3] to
generate familiar curves as roulettes in unusual ways. But first, we review some basic
facts, interesting in their own right, about pedals and roulettes.

2. ROULETTE CURVES. As one curve, C, rolls on another without slipping, the
locus of a point P that maintains a fixed position relative to C is called a roulette. We
focus on roulettes generated by curves rolling on straight lines. For example, a cycloid
is the roulette traced by a point P on the circumference of a circle as the circle rolls on
a line. If the point P starts at the origin and moves with a unit circle rolling along the
x-axis, then a parameterization of the cycloid is given by
(x, y) = (θ − sin θ, 1 − cos θ), (1)
where θ is the angle through which the wheel has turned. That y = dx/dθ in this
parameterization is not a coincidence, as the following lemma shows. But before stating
the lemma, we wish to be sure that we can roll C smoothly on a line with a constant an-
gular velocity. To this end, we require C to be a continuously differentiable plane curve
such that the angle of rotation of its tangent lines, as measured relative to some initial
position, is a strictly monotonic function of arc length. We call such curves rollable.
The monotonic condition implies that rollable curves have no inflection points, while
the strictness of the monotonicity precludes rollable curves from containing line seg-
ments. We invite the reader to consider generalizing the results in this paper to include
curves with inflection points as well as piecewise differentiable curves.
We assume throughout the paper that C is rollable. In the following lemma, and
throughout the paper, we let θ denote the angle of rotation, relative to some starting
position, of the curve C rolling on the x-axis. We take θ > 0 to correspond to a clock-
wise rotation.

Roulette Lemma. Let C be a rollable curve and let R be the roulette traced by a
point P moving with C as C rolls on the x-axis. If (x, y) = (f(θ), g(θ)) is a
parameterization of R in terms of the angle of rotation of C, then f and g are continuously
differentiable and $g(\theta) = f'(\theta)$.
doi:10.4169/amer.math.monthly.118.06.479

June–July 2011] ROADS AND WHEELS, ROULETTES AND PEDALS 479


Proof. The assumption that C is rollable ensures that we can parameterize R in terms
of θ, for as C rolls on the x-axis the sense of rotation of C will not change and θ will
increase. Let A be the point of contact between C and the x-axis, as shown in Figure 1.
Since A is the instantaneous center of rotation of C, the differential displacement of P
is perpendicular to $\overrightarrow{AP}$ and has magnitude r dθ, where r = |AP|. So if ρ denotes the
angle that $\overrightarrow{AP}$ makes with the negative x-axis and P has coordinates (x, y),

\[ (dx, dy) = r\,d\theta\,(\sin\rho, \cos\rho). \tag{2} \]

Since C contains no line segments, the instantaneous center of rotation moves continuously
in the direction of the positive x-axis as C rolls on the axis, and the x-coordinate
of point A is a continuous function of θ. Also, since C is continuously differentiable, r
and ρ are continuous functions of θ, and hence f and g are continuously differentiable.
Finally, $f'(\theta) = r \sin\rho = g(\theta)$.

Figure 1. The roulette lemma.

One immediate consequence of the roulette lemma is that the tracing point moves
to the right, that is dx/dθ > 0, precisely when it is above the x-axis. Or put another
way, the horizontal motion of the tracing point changes direction when the tracing
point crosses the x-axis. Thus, the vertical tangents to the roulette curve occur at the
x-intercepts. The case of a trochoid, traced by a point exterior to a circle as the circle
rolls on a line, nicely illustrates these ideas and is shown in Figure 2.

Figure 2. A trochoid.
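The roulette lemma can be spot-checked numerically. The Python sketch below is an illustration only and is not part of the original article. It assumes the usual trochoid parameterization (x, y) = (θ − d sin θ, 1 − d cos θ) for a tracing point at distance d from the center of a unit circle, and the helper names `trochoid`, `dx_dtheta`, and `max_lemma_error` are hypothetical. It compares a central-difference estimate of dx/dθ with y over one revolution.

```python
import math

def trochoid(theta, d=1.5):
    """Tracing point at distance d from the center of a unit circle rolling on the x-axis.

    Assumed parameterization: (x, y) = (theta - d sin(theta), 1 - d cos(theta)).
    """
    return theta - d * math.sin(theta), 1.0 - d * math.cos(theta)

def dx_dtheta(theta, d=1.5, h=1e-6):
    """Central-difference estimate of dx/dtheta."""
    x_plus, _ = trochoid(theta + h, d)
    x_minus, _ = trochoid(theta - h, d)
    return (x_plus - x_minus) / (2 * h)

def max_lemma_error(d=1.5, samples=200):
    """Largest value of |dx/dtheta - y| sampled over one revolution."""
    worst = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        _, y = trochoid(theta, d)
        worst = max(worst, abs(dx_dtheta(theta, d) - y))
    return worst

if __name__ == "__main__":
    print(max_lemma_error())  # should be tiny, limited only by finite-difference error
```

With d = 1.5 the tracing point dips below the x-axis for part of each revolution, and on exactly that part dx/dθ is negative, which is the backward loop of the trochoid.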

We can use the lemma to parameterize roulette curves by expressing the y-coordinate
of P in terms of θ and then integrating to determine the x-coordinate.
We assume the initial position of P, which we denote as $P_0$, is on the y-axis, and we
let $y_0$ denote the y-coordinate of $P_0$. The y-coordinate of P after C has turned through
the angle θ is the signed distance, g(θ), from $P_0$ to the tangent line to C at $A_0$, the
point of C that will come in contact with the x-axis after C has turned through the
angle θ. This is shown in Figure 3, where $A_0'$ is the foot of the perpendicular from $P_0$
to the tangent line. Figure 4 shows the curve after it has turned through the angle θ,
where the points A and $A'$ are the rotated images of $A_0$ and $A_0'$, respectively. To ensure
that the y-coordinate of P has the correct sign, we define the signed distance to be
positive if the angle between the vector $\overrightarrow{A_0' P_0}$ and the positive x-axis is θ + π/2, and
to be negative if the angle is θ − π/2. Then a parameterization of the roulette R is
given by

\[ (x, y) = \left( \int_0^{\theta} g(t)\,dt,\; g(\theta) \right). \tag{3} \]

Figure 3. Parameterizing a roulette, part I.
Figure 4. Parameterizing a roulette, part II.

We should point out that the standard parameterization of a roulette is usually of the
form

(x, y) = (s − r cos ρ, r sin ρ),

where r and ρ are as in Figure 1 and s is the arc length of C between A and the
point of C that was in contact with the origin. But by using (3) we avoid the arc length
computation. As an example, we give a short proof that the roulette traced by the focus
of a parabola rolling on a line is a catenary. In Figure 5 the curve C in its initial position
is the parabola $y = x^2/4$ and $P_0$ is the focus, (0, 1). A simple computation shows that
$A_0'$ is the point of intersection of the tangent line and the x-axis. Thus, g(θ) = sec θ,
and a parameterization of the roulette is given by

(x, y) = (ln(sec θ + tan θ), sec θ).

Indeed, if $e^x = \sec\theta + \tan\theta$, then $e^{-x} = \sec\theta - \tan\theta$, so cosh x = sec θ = y.
It follows that the roulette generated by the focus of the parabola $y = x^2/4$ rolling on
the x-axis is the catenary y = cosh x. See [1] for another proof.

Figure 5. Parameterizing the roulette generated by the focus of a parabola.
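The parameterization (3) lends itself to a direct numerical check of the catenary example. The following Python sketch is illustrative and not taken from the article. It assumes a composite Simpson's rule for the integral, and the helper names `simpson`, `roulette_point`, and `sec` are hypothetical. It evaluates (3) with g(θ) = sec θ and compares the result with y = cosh x.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def roulette_point(g, theta):
    """Point of the roulette given by (3): x is the integral of g from 0 to theta, y = g(theta)."""
    return simpson(g, 0.0, theta), g(theta)

def sec(t):
    """Signed distance g(theta) = sec(theta) for the focus of the parabola y = x^2/4."""
    return 1.0 / math.cos(t)

if __name__ == "__main__":
    for theta in (0.3, 0.8, 1.2):
        x, y = roulette_point(sec, theta)
        # The two printed numbers should agree: the roulette lies on y = cosh x.
        print(theta, y, math.cosh(x))
```

Any other signed-distance function g can be passed to `roulette_point` in the same way, which avoids the arc length computation just as the text describes.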

3. PEDAL CURVES. Let P be a point in the plane of C. The pedal curve of C with
respect to P, which we denote by $C_P$, is the set of the feet of all perpendiculars from
P to the tangent lines of C. The idea of a pedal curve is contained in Figure 3, where
the pedal, $C_{P_0}$, is the set of points $A_0'$ as $A_0$ varies along C. If we place the pole of
a polar coordinate system at $P_0$ and let the polar axis point in the direction of the
negative y-axis in Figure 3, then a polar equation for $C_{P_0}$ is r = g(θ). Thus, in (3), the
y-coordinate of the roulette is the polar radius of $C_{P_0}$.
We use the notation of Figure 3 throughout, where capital letters denote points on
a curve and their primed counterparts denote the corresponding points on the pedal
curve. For now we give four examples.
1. Let C be the circle r = 2 cos θ and let P be the pole of the polar coordinate
system. Figure 6 illustrates that the cardioid r = 1 + cos θ is the pedal of C with
respect to P. To see why, note that since $\triangle PQ'D \sim \triangle OQD$ and OD = sec θ,
it follows that $r = PQ' = 1 + \cos\theta$. Alternatively, we could have concluded
directly from the expression for the y-coordinate of the cycloid in (1) that the
pedal of the circle r = −2 cos θ with respect to the pole is the cardioid r =
1 − cos θ.

Figure 6. A cardioid as a pedal of a circle.

2. The pedal of a parabola with respect to its focus is the tangent line to the parabola
at the vertex (see Figure 5).
3. The pedal of an ellipse with respect to a focus is the circle having the major axis
of the ellipse as a diameter. See [5, p. 13] for a proof.
4. In a degenerate case, where the curve C is a point Q, the family of tangents to C
is the pencil of lines through Q and $C_P$ is the circle with diameter PQ.
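The definition of a pedal curve translates directly into a small computation: project the pedal point onto each tangent line. The Python sketch below is an illustration under stated assumptions, not the article's method. The helper names `pedal_point`, `circle_pedal`, and `ellipse_focal_pedal` are hypothetical, the circle of example 1 is parameterized by the angle φ at its center (1, 0), and the ellipse of example 3 is taken as (a cos t, b sin t) with a > b. In both cases the computed feet can be compared with the curves named in the examples.

```python
import math

def pedal_point(point, tangent, pole):
    """Foot of the perpendicular from `pole` to the line through `point` with direction `tangent`."""
    tx, ty = tangent
    norm = math.hypot(tx, ty)
    tx, ty = tx / norm, ty / norm
    # Signed length of the projection of (pole - point) onto the unit tangent.
    s = (pole[0] - point[0]) * tx + (pole[1] - point[1]) * ty
    return point[0] + s * tx, point[1] + s * ty

def circle_pedal(phi):
    """Pedal point of the circle r = 2 cos(theta) (center (1, 0), radius 1) with respect to the pole."""
    point = (1.0 + math.cos(phi), math.sin(phi))
    tangent = (-math.sin(phi), math.cos(phi))
    return pedal_point(point, tangent, (0.0, 0.0))

def ellipse_focal_pedal(t, a, b):
    """Pedal point of the ellipse (a cos t, b sin t) with respect to the focus (c, 0), assuming a > b."""
    c = math.sqrt(a * a - b * b)
    point = (a * math.cos(t), b * math.sin(t))
    tangent = (-a * math.sin(t), b * math.cos(t))
    return pedal_point(point, tangent, (c, 0.0))

if __name__ == "__main__":
    fx, fy = circle_pedal(1.0)
    print(math.hypot(fx, fy), 1 + math.cos(1.0))  # polar radius vs. cardioid r = 1 + cos(theta)
    ex, ey = ellipse_focal_pedal(0.7, 5.0, 3.0)
    print(math.hypot(ex, ey))  # distance from the center, to compare with a = 5
```

For the circle, the foot lies in the direction φ from the pole, so φ plays the role of the polar angle θ in r = 1 + cos θ.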
A curve geometrically similar to $C_P$, but twice as large, may be obtained by reflecting
P across each of the tangent lines of C. The set of reflected points $P'$ is called
the orthotomic of C with respect to P (see [5, p. 153]). The orthotomic may also be
generated as the roulette traced by $P'$ as it moves with a mirror image of C rolling on
C in such a way that the mirror image is the reflection of C about the tangent line at the
point of contact. This is illustrated in Figure 7, where $C'$ is the reflection of C across
the tangent at A. We mention this to prove what we call the angle property of pedal
curves: the segments PA and $PA'$ from P to a point A on C and the corresponding point
$A'$ on $C_P$ make congruent angles with the tangents to these curves at A and $A'$. That is,
the angles marked ρ at A and $A'$ in Figure 7 are congruent. To see why, note that since
the dilation centered at P with factor 2 maps $A'$ to $P'$ and the pedal to the orthotomic,
the angles marked ρ at $A'$ and $P'$ are congruent. But since A is the instantaneous center
of rotation of $C'$ as it rolls on C, $AP'$ is perpendicular to the orthotomic at $P'$, and the
angles marked ρ at A and $P'$ are also congruent. See [5, p. 14] for another proof.

Figure 7. The angle property of pedal curves.

4. STEINER’S THEOREMS. Given the close connection between pedals and
roulettes as expressed in Figures 3 and 4, it is not surprising that there are some
relationships between pedals and roulettes. Two of these are given by theorems
discovered by Jakob Steiner (see [2] for generalizations). Let A and B be points of C,
and let $A'$ and $B'$ be the corresponding points on the pedal curve $C_P$. Let R be the
portion of the roulette curve traced by P as C rolls on a line m between contact points
A and B.

Steiner Theorem 1. The arc length of R is equal to the arc length of $C_P$ between $A'$
and $B'$.

Steiner Theorem 2. The area between R and m is twice the area bounded by $PA'$,
$PB'$, and $C_P$.

In particular, if C is a closed curve rolling on a line, the area between the line and
roulette after one revolution is twice the area of $C_P$, and the arc length of the roulette
is equal to the arc length of $C_P$. To illustrate, let C be a circle of radius r that rolls
through an angle of θ radians, and let P be the center of the circle. The roulette curve
is a line segment l of length rθ and the area of the rectangular region between l and
m is $r^2\theta$. Since $Q' = Q$ for all points Q of C, the corresponding region of the pedal
curve is the sector of a circle with central angle θ, area $\frac{1}{2} r^2 \theta$, and arc length rθ. For a
second example of Steiner’s theorems, take C to be a circle, let the tracing point P be
a point on C, and let A and B be any two points on C. As illustrated in Figure 8, the
roulette, R, is an arc of a cycloid and the pedal, $C_P$, is an arc of a cardioid. The two arc
lengths are equal and the area between the arc of the cycloid and the x-axis is twice
the area bounded by the cardioid and segments $PA'$ and $PB'$.

Figure 8. An illustration of the Steiner theorems.

The Steiner theorems follow as corollaries to the roulette lemma. To see this, let C
roll along the x-axis with P initially on the y-axis, and let θ denote the angle through
which C has turned from its starting position. Using (3), a parameterization of R is
given by (x, y) = (f(θ), g(θ)), where $f'(\theta) = g(\theta)$. If we place the pole of a polar
coordinate system at P with the polar axis pointing in the direction of the negative
y-axis, a polar equation of $C_P$ is r = g(θ). Then as C turns through the differential
angle dθ, the differential arc length of the roulette is

\[ \sqrt{(dx)^2 + (dy)^2} = \sqrt{(g(\theta))^2 + (g'(\theta))^2}\,d\theta, \tag{4} \]

and the differential arc length of the pedal is

\[ \sqrt{(r\,d\theta)^2 + (dr)^2} = \sqrt{(g(\theta))^2 + (g'(\theta))^2}\,d\theta, \tag{5} \]

proving the first Steiner theorem. The area element of the roulette is

\[ y\,dx = (g(\theta))^2\,d\theta, \]

and the area element of the pedal is

\[ \frac{1}{2} r^2\,d\theta = \frac{1}{2} (g(\theta))^2\,d\theta, \]

proving the second Steiner theorem.
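Both Steiner theorems can be checked numerically on the cycloid and cardioid of Figure 8. The Python sketch below is illustrative only and rests on stated assumptions: the curves are sampled as polylines, the cardioid r = 1 − cos θ is drawn with the polar axis along the positive x-axis (a rotation of the article's convention, which changes neither lengths nor areas), and the helper names `polyline_length`, `cycloid`, `cardioid`, and `steiner_check` are hypothetical.

```python
import math

def polyline_length(points):
    """Total length of the polyline through the given points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def cycloid(theta):
    """Roulette of a point on a unit circle rolling on the x-axis, as in (1)."""
    return (theta - math.sin(theta), 1.0 - math.cos(theta))

def cardioid(theta):
    """Pedal r = 1 - cos(theta), with the pole at the origin."""
    r = 1.0 - math.cos(theta)
    return (r * math.cos(theta), r * math.sin(theta))

def steiner_check(theta_a=0.0, theta_b=2 * math.pi, n=20000):
    """Return (roulette length, pedal length, roulette area, pedal area) for theta in [theta_a, theta_b]."""
    thetas = [theta_a + (theta_b - theta_a) * k / n for k in range(n + 1)]
    roulette = [cycloid(t) for t in thetas]
    pedal = [cardioid(t) for t in thetas]
    # Area between the roulette and the x-axis: trapezoid rule applied to y dx.
    area_roulette = sum(0.5 * (p[1] + q[1]) * (q[0] - p[0])
                        for p, q in zip(roulette, roulette[1:]))
    # Area swept by the segment from the pole to the pedal: sum of thin triangles.
    area_pedal = sum(0.5 * (p[0] * q[1] - q[0] * p[1])
                     for p, q in zip(pedal, pedal[1:]))
    return polyline_length(roulette), polyline_length(pedal), area_roulette, area_pedal

if __name__ == "__main__":
    print(steiner_check())          # one full arch of the cycloid
    print(steiner_check(0.5, 2.0))  # a partial arc between two contact points
```

Over a full revolution the two lengths should both be close to 8, and the areas close to 3π and 3π/2. The partial range exercises the theorems between arbitrary contact points A and B.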
To get a dynamic picture of the Steiner theorems, let’s return for a moment to Figure 8,
where we have shown the rolling circle in an intermediate position (dashed)
along with points $\tilde{P}$ and $\tilde{Q}$, the rotated images of the points P and Q, respectively.
Point $\tilde{Q}'$ is the foot of the perpendicular from $\tilde{P}$ to the x-axis and corresponds to
$Q'$ on the pedal curve. Now let’s put this figure in motion. Imagine the dashed circle
rolling clockwise on the x-axis as $\tilde{P}$ traces out the arc of the cycloid. At the same time
point $Q'$ moves counterclockwise around P and sweeps out the corresponding arc of
the cardioid. Steiner’s first theorem says that $\tilde{P}$ and $Q'$ move with the same speed,
as shown by the equality of the differential arc lengths in (4) and (5). Given this, it’s
easy to see why Steiner’s second theorem is true. As C turns through the angle dθ, the
segment $PQ'$ sweeps out a differential triangle, while the segment $\tilde{P}\tilde{Q}'$ sweeps out a
differential rectangle. But since the segments are congruent and $Q'$ and $\tilde{P}$ move at the
same speed, the area of the rectangle is twice that of the triangle. In Section 6 we give
a visual representation of Steiner’s first theorem. But first we must introduce the idea
of road-wheel pairs.

5. ROADS AND WHEELS. Many readers are probably familiar with a square-
wheeled bicycle. If the wheels roll along a road consisting of portions of appropriately
chosen inverted catenaries, the axles of the wheels move horizontally and the ride is
smooth. Figure 9 shows another example of a road-wheel pair. An elliptical wheel rolls
without slipping on a sinusoidal road, while one focus, A, of the ellipse moves along
the x-axis. Hall and Wagon [3] generate many other examples of road-wheel pairs.
We give some of their examples below, but for now we present a modified version
of their approach which is more suited to our geometric point of view. We place the
road below the x-axis in a rectangular coordinate system and require that the wheel
roll on the road without slipping and that the axle of the wheel, which we initially
place at the origin, move along the x-axis. An equivalent way to describe the rolling
condition is that the point of contact of the road and wheel (Q in Figure 9) is the
instantaneous center of rotation of the wheel. This implies that the velocity of the axle
is perpendicular to the spoke AQ. Adding the requirement that the axle of the wheel
move along the x-axis forces Q to be directly below A at all times.

Figure 9. An ellipse rolling on a sine curve.

In describing a road-wheel pair, we adopt the convention of describing the wheel
in its initial position, with its axle at the origin. We do so by giving a polar equation
of the wheel, r = g(θ) > 0, where the pole coincides with the origin and the polar
axis points in the direction of the negative y-axis. Given the function r = g(θ), we
wish to parameterize the corresponding road in terms of θ. Suppose the wheel rotates
clockwise through an angle of θ while rolling on the road, so that the spoke of
the wheel, $A'Q'$, which originally makes an angle of θ with the negative y-axis, has
moved to AQ and is now pointing downward. If Q(x, y) is the point of contact, then
y = −r = −g(θ). To finish parameterizing the road we must express x as a function
of θ, in a sense unwrapping the spokes of the wheel and placing them perpendicular
to the x-axis to form the road. Since Q is the instantaneous center of rotation of the
wheel, the differential displacement of the axle, A(x, 0), of the wheel is

\[ dx = r\,d\theta, \tag{6} \]

so that x = f(θ) is the solution of the initial value problem

\[ \frac{dx}{d\theta} = g(\theta), \qquad x(0) = 0. \]

Thus, the wheel with polar equation r = g(θ) has as its corresponding road the
curve

\[ (x, y) = \left( \int_0^{\theta} g(t)\,dt,\; -g(\theta) \right). \tag{7} \]

Note that (6) and (7) hold even if g(θ) < 0. In this case, the part of the road cor-
responding to g(θ) < 0 lies above the x-axis and the axle moves to the left while
the wheel is in contact with this part of the road. This is illustrated in Figure 10 (see
example 1 below). Here are some examples, generated using (7).

Figure 10. A limaçon rolling on a trochoid.

1. If the wheel is the limaçon r = 1 − d cos θ, the road is the trochoid (x, y) =
(θ − d sin θ, d cos θ − 1). If d = 1 the wheel is a cardioid and the road is a
cycloid. The case d = 1.5 is shown in Figure 10.
2. If the wheel is the horizontal line r = sec θ (or y = −1), the road is the catenary
(x, y) = (ln(sec θ + tan θ), −sec θ), as shown in Figure 11.
3. If the wheel is the circle r = cos θ, the road is the circle (x, y) = (sin θ, −cos θ).
In this example a circle with radius 1/2, in its initial position with center at
(0, −1/2), rolls on the inside of the unit circle $x^2 + y^2 = 1$. The point on the
rolling circle initially at the origin oscillates on the x-axis between (−1, 0) and
(1, 0).
4. If the wheel is the parabola r = 1/(2 + 2 cos θ), or $y = x^2 - 1/4$, the road is the
parabola (x, y) = (sin θ/(2 + 2 cos θ), −1/(2 + 2 cos θ)), or $y = -x^2 - 1/4$.
More generally, as the mirror image of a fixed parabola, P, rolls on P in such
a way that the rolling parabola is the reflection of P across the tangent line at
the point of contact, the focus of the rolling parabola moves along the directrix
of P.
5. If the wheel is the rose r = cos nθ, the road is the ellipse
$(x, y) = \left(\frac{1}{n} \sin n\theta, -\cos n\theta\right)$.

Figure 11. A line rolling on a catenary.
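Formula (7) can be evaluated numerically for any wheel r = g(θ). The Python sketch below is an illustration rather than the authors' procedure. It assumes Simpson's rule for the integral, and the helper names `road_point`, `rose`, and `parabola_wheel` are hypothetical. It reproduces examples 4 and 5 so the computed road points can be compared with the closed forms in the list.

```python
import math

def road_point(g, theta, n=400):
    """Point of the road given by (7): x is the integral of g from 0 to theta, y = -g(theta).

    The integral uses composite Simpson's rule with n subintervals (n must be even).
    """
    h = theta / n
    total = g(0.0) + g(theta)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3, -g(theta)

def rose(theta, petals=3):
    """Wheel of example 5: the rose r = cos(n theta), here with n = petals."""
    return math.cos(petals * theta)

def parabola_wheel(theta):
    """Wheel of example 4: the parabola r = 1 / (2 + 2 cos(theta))."""
    return 1.0 / (2.0 + 2.0 * math.cos(theta))

if __name__ == "__main__":
    x, y = road_point(rose, 0.7)
    print((3 * x) ** 2 + y ** 2)  # to compare with 1: the road of the 3-petal rose is an ellipse
    x, y = road_point(parabola_wheel, 1.0)
    print(y, -x * x - 0.25)       # to compare: the road is the parabola y = -x^2 - 1/4
```

Because (7) only needs values of g, the same function accepts wheels with g(θ) < 0 on part of the range, which corresponds to the stretches of road above the x-axis noted before the examples.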

6. ROADS, WHEELS, AND STEINER’S THEOREMS. In this section we describe
how to generate road-wheel pairs geometrically and give a new geometric interpretation
of the Steiner theorems. Let (R, W) denote a road-wheel pair, and let capital
letters and their primed counterparts denote corresponding points on the road R and
wheel W, respectively. That is, if A is a point on the road, the point $A'$ is the point
on the wheel that comes into contact with the road at A. Then we have the following
two properties of road-wheel pairs, the first of which follows immediately from our
assumption that the wheel rolls without slipping:

Lemma 1. The length of the road between A and B is equal to the length of the wheel
between $A'$ and $B'$.

Lemma 2. The area between the road and the x-axis from A to B is twice the area
bounded by the wheel and the segments $OA'$ and $OB'$, where O is the axle of the wheel.

To prove the second property, note from (6) that the area element between the road
and the x-axis is

\[ dA = -y\,dx = -y\,r\,d\theta = r^2\,d\theta, \]

which is twice the area element of the wheel. We can show this property geometrically
by using the spokes to unwrap the wheel as shown in Figure 12. The distance between
nearby unwrapped spokes of length r and $R = r + \Delta r$ is taken to be $r\,\Delta\theta$, where $\Delta\theta$ is
the angle between the spokes. The area of each trapezoid is then approximately twice
the area of the corresponding sector. This can be thought of as a generalization of the
grade school proof for the area of a circle. Alternatively, we can think of the latter
proof as constructing a road for a circular wheel.
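The unwrapping argument can be imitated with finite sums. The Python sketch below is a rough illustration under assumptions of our own. Each sector between neighboring spokes is approximated by the thin triangle with sides r and R and included angle Δθ, each unwrapped piece by a trapezoid with parallel sides r and R separated by r Δθ, and the helper name `unwrap_areas` is hypothetical.

```python
import math

def unwrap_areas(g, theta_a, theta_b, n=10000):
    """Return (sum of trapezoid areas, sum of sector areas) for the wheel r = g(theta)."""
    d_theta = (theta_b - theta_a) / n
    trapezoids = 0.0
    sectors = 0.0
    for k in range(n):
        r = g(theta_a + k * d_theta)
        big_r = g(theta_a + (k + 1) * d_theta)
        # Unwrapped piece: parallel spokes of length r and R, a distance r * d_theta apart.
        trapezoids += 0.5 * (r + big_r) * r * d_theta
        # Piece of the wheel: thin triangle with sides r and R and included angle d_theta.
        sectors += 0.5 * r * big_r * math.sin(d_theta)
    return trapezoids, sectors

if __name__ == "__main__":
    limacon = lambda theta: 1.0 - 0.5 * math.cos(theta)  # a wheel with g > 0 everywhere
    road_area, wheel_area = unwrap_areas(limacon, 0.0, 2 * math.pi)
    print(road_area, 2 * wheel_area)  # the two numbers should nearly agree
```

As the number of spokes grows, the ratio of the two sums tends to 2, which is the content of Lemma 2.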

Figure 12. Unwrapping a wheel to form a road.


The similarity of these lemmas with the Steiner theorems, as well as the similarity
between (3) and (7), suggests the following theorem. In the statement of the theorem
we follow our convention of describing the wheel in its initial position with its axle at
the origin.

Main Theorem. Let C be a rollable curve initially tangent to the x-axis at the origin,
O, and let $P_0$ be a point on the y-axis. Let $R'$ be the reflection in the x-axis of the
roulette curve R generated by $P_0$ as C rolls on the x-axis. Then translating the pedal,
$C_{P_0}$, by the vector $\overrightarrow{P_0 O}$ gives a wheel for the road $R'$.

Proof. A comparison of (3) and (7) actually proves the theorem, but we give a more
geometric proof that illustrates why the theorem is true and describes the rolling motion
of the wheel. As shown in Figure 13, let P(x, y) denote the position of $P_0$ after C
has rolled some distance along the x-axis. Let A be the point of tangency between C
and the x-axis and let $A'$ be the corresponding point of $C_P$. Also, let Q be the reflection
of P in the x-axis. We show that translating $C_P$ by the vector $\overrightarrow{PA'} = \overrightarrow{A'Q}$ gives the
wheel in its position when it is in contact with the road at Q; in particular, the initial
position of the wheel is the translation of $C_{P_0}$ by the vector $\overrightarrow{P_0 O}$. Let W denote the
translation of $C_P$ by $\overrightarrow{PA'}$. The translation ensures that W and $R'$ are in contact at Q.
It also sends the pedal point P to $A'$, so that the translated copy of P moves along the
x-axis and serves as the wheel’s axle. It remains to show that W rolls on the road. We
first show that W and $R'$ are tangent at Q. By the angle property of pedals, the angles
at A and $A'$ labeled ρ are congruent. But the angle between R and $PA'$ at P also has
measure ρ. It follows that W and $R'$ are tangent at Q. That W is rolling, and not sliding,
on $R'$ follows immediately from the first Steiner theorem, since the corresponding
arc lengths of $R'$ and $C_{P_0}$ are equal. But since we would like to claim Steiner’s first
theorem as a corollary to this theorem, we give an alternative argument to show that
W is rolling on $R'$. The roulette lemma implies that as C turns through the angle dθ,
the horizontal component of the displacement of P, and hence the displacement of
$A'$, is y dθ. But since W turns through the same angle as C, it follows that Q is the
instantaneous center of rotation of W.

Figure 13. Generating road-wheel pairs geometrically.

We now show how to use our theorem to generate the first three road-wheel pairs
from the previous section. We describe the curve C in its initial position, tangent to the
x-axis at the origin. We also describe the wheel in its initial position, W0 , with its axle
at the origin. As usual, the polar equation of W0 assumes that the polar axis coincides
with the negative y-axis.
1. Take C to be the circle $x^2 + (y - 1)^2 = 1$ and $P_0 = (0, 0)$. Then $C_{P_0}$ is the
cardioid $r = 1 - \cos\theta$ and the roulette R is the cycloid
$(x, y) = (\theta - \sin\theta, 1 - \cos\theta)$. Since the pedal point is initially at the
origin, $W_0$ is the same cardioid. Thus, the wheel $r = 1 - \cos\theta$ rolls on the road
$R'$ given by $(x, y) = (\theta - \sin\theta, \cos\theta - 1)$.
2. Take C to be the parabola $y = x^2/4$ and $P_0 = (0, 1)$. Then $C_{P_0}$ is the x-axis (see
Figure 5) and R is the catenary $y = \cosh x$. The wheel, $W_0$, is the translation of
the x-axis by the vector $\overrightarrow{P_0 O} = \langle 0, -1 \rangle$, or the line $y = -1$. So the
line $y = -1$ is the wheel for the catenary $y = -\cosh x$.
3. In a degenerate case, take C to be the origin and $P_0$ to be the point (0, 1). The
pedal curve, $C_{P_0}$, is the circle with diameter $OP_0$. The roulette curve R (and also
$R'$) is the unit circle centered at the origin. Since $\overrightarrow{P_0 O} = \langle 0, -1 \rangle$,
a wheel for $R'$ is the circle of radius 1/2 centered at (0, −1/2).
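The first two pairs in this list can be checked numerically. The following is a minimal sketch, assuming the usual road-wheel relations (as in Hall and Wagon [3]): for a wheel r = r(θ) whose polar axis is the negative y-axis, the road satisfies dx/dθ = r(θ) and y = −r(θ), with x(0) = 0. The function names and the trapezoid-rule integration are illustrative choices, not part of the article.

```python
import math

def road_from_wheel(r, theta, steps=20000):
    """Return the road point (x, y) reached when the wheel has turned by theta.

    Assumes dx/dtheta = r(theta), y = -r(theta), x(0) = 0; the integral is
    approximated with the composite trapezoid rule.
    """
    h = theta / steps
    total = 0.5 * (r(0.0) + r(theta))
    for i in range(1, steps):
        total += r(i * h)
    return total * h, -r(theta)

def cardioid(t):
    # Wheel of example 1.
    return 1.0 - math.cos(t)

def line(t):
    # The line y = -1 (example 2) in polar form, polar axis along the negative y-axis.
    return 1.0 / math.cos(t)

def cardioid_error(t):
    """Distance between the computed road point and (t - sin t, cos t - 1)."""
    x, y = road_from_wheel(cardioid, t)
    return max(abs(x - (t - math.sin(t))), abs(y - (math.cos(t) - 1.0)))

def line_error(t):
    """How far the computed road point is from the catenary y = -cosh x."""
    x, y = road_from_wheel(line, t)
    return abs(y + math.cosh(x))
```

Both error functions should return values near zero (up to integration error) for angles inside the range where the wheel is defined; for the line this means |θ| < π/2.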

Our theorem suggests that to visualize Steiner’s first theorem we should replace
our image of P tracing out R as C rolls on a line (as in Figure 8) by the image of
C P rolling on R0 . We use this idea to recast our dynamic interpretation of the Steiner
theorems given at the end of Section 4. To illustrate, return to Figure 9, showing an
elliptical wheel rolling on a sinusoidal road, and put the wheel in motion. As the ellipse
rolls on the sine curve, the point Q 0 moves on the fixed ellipse at the left. Recall Q 0
corresponds to the point, Q, on the rolling ellipse that is in contact with the road at
any instant. Expressed another way, the motion of the spoke A0 Q 0 on the fixed ellipse
is just the motion of the spoke AQ on the rolling ellipse as seen by an observer in the
reference frame of the rolling ellipse. We may restate Steiner’s first theorem as saying
Q and Q 0 move at the same speed. Of course this follows from our condition that the
ellipse roll on the road without slipping. But now consider the additional condition that
the axle of the wheel move along the x-axis. This requirement forces the spoke, AQ,
of the rolling ellipse to be perpendicular to the x-axis and thus sweep out a differential
rectangle as the ellipse turns through the angle dθ , while the spoke, A0 Q 0 , of the fixed
ellipse sweeps out a differential triangle. Since the speeds of A0 and A are equal, the
area of the differential rectangle is twice that of the triangle.

7. GENERATING ADDITIONAL ROAD-WHEEL PAIRS. We may use our main
theorem to generate additional road-wheel pairs by arbitrarily choosing a rollable
curve C . For example, let C be the cardioid $r = 1 - \cos(\theta - \pi/2)$ and P0 be the
origin. We leave it to the reader to verify that $g(\theta) = 2\sin^3(\theta/3)$. Using (3) we may
parameterize R as

$$(x, y) = \left(-6\cos(\theta/3) + 2\cos^3(\theta/3) + 4,\; 2\sin^3(\theta/3)\right), \quad 0 \le \theta \le 3\pi.$$

The rolling motion of the cardioid is hinted at in Figure 14. The pedal curve, C P0 , is
Cayley’s sextic with polar radius given by the y-coordinate of R, or $r = 2\sin^3(\theta/3)$.

June–July 2011] ROADS AND WHEELS, ROULETTES AND PEDALS 489


Figure 14. A cardioid rolling on the x-axis.

This is a wheel for the road R0 , as shown in Figure 15, where Cayley’s sextic is shown
in its initial position and some later position. The cardioid (dashed) is also shown to
illustrate our theorem; the position of the wheel is found by translating the pedal of the
cardioid by the vector $\overrightarrow{PA'}$.

Figure 15. Cayley’s sextic rolling on a road.

As a second example, we find the road corresponding to a circular wheel of radius
a and center O whose axle, P, is ae units from O, with 0 < e < 1. Let C be the ellipse
with center O, one focus at P, and eccentricity e. Then by example 3 of Section 3, the
wheel is the pedal of C with respect to P. It follows that the road for the circular wheel
with axle at P is the roulette curve generated by P as C rolls on a line. The roulette,
an elliptic catenary, is shown in Figure 16 and the road-wheel pair in Figure 17. The
road-wheel pair consisting of a line rolling on a catenary, shown in Figure 11, may
be regarded as the limiting case of a circular wheel rolling on an elliptic catenary. If
we set $a = 2e/(1 - e^2)$ and let $e \to 1^-$, in any bounded region the elliptic catenary in
Figure 17 approaches the catenary $y = -\cosh x$ and the circular wheel approaches a
tangent line to the catenary.

Figure 16. An elliptic catenary generated as a roulette, e = 0.8.
Figure 17. A circular wheel rolling on an elliptic catenary, e = 0.8.




8. GENERATING FAMILIAR CURVES AS ROULETTES. Given a road-wheel
pair, (R0 , W ), we can use our main theorem to express the reflection of the road in
the x-axis as a roulette by reversing the process described in Figure 13 to recover the
rolling curve C and the tracing point P. We describe this process when the wheel is in
its initial position, with its axle at the origin. So in Figure 13 we take A0 to coincide
with the origin, O, and we replace P and Q with P0 and Q 0 , respectively, to emphasize
that the wheel is in its initial position. Determining P0 is easy. We translate the axle of
the wheel (at O) by the vector $\overrightarrow{Q_0 O}$. To determine C we first find the negative pedal
of W with respect to O; that is, we find the curve whose pedal with respect to O is
W . We denote this curve by W O . Finally, we recover C by translating W O by the
vector $\overrightarrow{Q_0 O}$. To determine the negative pedal, note that the tangents to W O are the
lines through points S of W perpendicular to the segments OS. If this one-parameter
family of tangents has the parameterization f (x, y, t) = 0, then the negative pedal as
a set of points is a solution to the system

$$f(x, y, t) = f_t(x, y, t) = 0, \qquad (8)$$

where the subscript denotes partial differentiation with respect to the parameter t.
For example, suppose the wheel is the parabola $y = x^2/4 - 1$, with corresponding
road $y = -x^2/4 - 1$. Since the initial point of contact of the wheel and road
is $Q_0(0, -1)$, the initial position of the tracing point is $P_0(0, 1)$. Parameterizing the
wheel as $(x, y) = (2t, t^2 - 1)$ gives a parameterization of the set of tangent lines to
W O as

$$f(x, y, t) = 2tx + (t^2 - 1)y - (t^4 + 2t^2 + 1) = 0.$$

We then translate the solution of the system (8) by the vector
$\overrightarrow{Q_0 O} = \langle 0, 1 \rangle$ to get a parameterization of C in its initial position as

$$(x, y) = (3t - t^3, 3t^2).$$
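The envelope computation can be checked directly. The sketch below assumes that, before the translation by ⟨0, 1⟩, the solution of system (8) for this family of lines is (3t − t³, 3t² − 1); the function names are illustrative only.

```python
def f(x, y, t):
    """Line through S = (2t, t^2 - 1) perpendicular to the segment OS."""
    return 2 * t * x + (t * t - 1) * y - (t ** 4 + 2 * t * t + 1)

def f_t(x, y, t):
    """Partial derivative of f with respect to the parameter t."""
    return 2 * x + 2 * t * y - (4 * t ** 3 + 4 * t)

def negative_pedal_point(t):
    """Candidate envelope point, before translating by <0, 1>."""
    return 3 * t - t ** 3, 3 * t * t - 1

def residuals(t):
    """Absolute values of f and f_t at the candidate point; both should vanish."""
    x, y = negative_pedal_point(t)
    return abs(f(x, y, t)), abs(f_t(x, y, t))
```

Adding 1 to the y-coordinate of `negative_pedal_point(t)` gives the parameterization displayed above.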

Thus, the parabola $y = x^2/4 + 1$ is the roulette traced by the point P0 (0, 1) as the
above curve rolls along the x-axis. In this case the rolling curve, the negative pedal
of a parabola with respect to its focus, is called Tschirnhausen’s cubic. Figure 18
shows Tschirnhausen’s cubic rolling along the x-axis as the point P0 (0, 1) traces out
a parabola. Tschirnhausen’s cubic is shown twice; once in its initial position (dashed)
and again after it has rolled some distance along the x-axis.

Figure 18. A parabola generated as a roulette curve.



For the remaining examples we describe the wheel in its initial position by its polar
equation, r = g(θ). In this case, replacing the parameter t with θ and keeping in
mind that the polar axis coincides with the negative y-axis, the system of equations (8)
becomes

$$x\sin\theta - y\cos\theta - g(\theta) = x\cos\theta + y\sin\theta - g'(\theta) = 0,$$

with solution

$$(x, y) = \left(g(\theta)\sin\theta + g'(\theta)\cos\theta,\; -g(\theta)\cos\theta + g'(\theta)\sin\theta\right). \qquad (9)$$

For example, take the wheel to be the rose $r = \cos(n\theta)$, with corresponding road the
ellipse $(x, y) = (\frac{1}{n}\sin n\theta, -\cos n\theta)$. Since the initial point of contact is $Q_0(0, -1)$,
the initial position of the tracing point is $P_0(0, 1)$. Using (9) and translating by the
vector $\overrightarrow{Q_0 O} = \langle 0, 1 \rangle$ gives a parameterization of the rolling curve in its initial position
as

$$(x, y) = \frac{n-1}{2}\left(-\sin(n+1)\theta,\; \cos(n+1)\theta\right) - \frac{n+1}{2}\left(\sin(n-1)\theta,\; \cos(n-1)\theta\right) + (0, 1). \qquad (10)$$
The curves (10) are all hypocycloids. That is, they are the roulettes generated by a
point on the circumference of a circle of radius r as it rolls with internal contact inside
a fixed circle of radius R. Choosing r = (n − 1)/2 and R = n gives the hypocycloid
in (10). Since a hypocycloid is not differentiable at its cusps, we may apply our main
theorem only to each differentiable piece. However, a hypocycloid has well-defined
tangents at its cusps and the tangents to the hypocycloid vary continuously as the
curve is traversed. Thus, a hypocycloid can roll on a line and as it rolls it maintains its
sense of rotation. We can piece together the results from our main theorem to conclude
that when the curve (10) rolls on the x-axis, the point P0 (0, 1) traces out the ellipse
$(x, y) = (\frac{1}{n}\sin n\theta, \cos n\theta)$. The description of the rolling motion below makes this
transparent.
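As a sanity check on (10), one can apply (9) to the rose numerically and compare. The following is a small sketch under the conventions stated above (polar axis along the negative y-axis, translation by ⟨0, 1⟩); the helper names are hypothetical.

```python
import math

def negative_pedal(g, dg, theta):
    """Equation (9): envelope point for the wheel r = g(theta), dg being g'."""
    return (g(theta) * math.sin(theta) + dg(theta) * math.cos(theta),
            -g(theta) * math.cos(theta) + dg(theta) * math.sin(theta))

def rolling_curve_from_rose(n, theta):
    """Apply (9) to r = cos(n*theta), then translate by <0, 1>."""
    x, y = negative_pedal(lambda t: math.cos(n * t),
                          lambda t: -n * math.sin(n * t), theta)
    return x, y + 1.0

def hypocycloid(n, theta):
    """Right-hand side of (10)."""
    a, b = (n - 1) / 2, (n + 1) / 2
    x = -a * math.sin((n + 1) * theta) - b * math.sin((n - 1) * theta)
    y = a * math.cos((n + 1) * theta) - b * math.cos((n - 1) * theta) + 1.0
    return x, y

def max_gap(n, samples=360):
    """Largest coordinate difference between the two curves over one turn."""
    gap = 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        (x1, y1), (x2, y2) = rolling_curve_from_rose(n, theta), hypocycloid(n, theta)
        gap = max(gap, abs(x1 - x2), abs(y1 - y2))
    return gap
```

For n = 2 (the astroid) and n = 4 the gap should be at the level of floating-point rounding.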
Figure 19 shows the case n = 2, where the rolling curve is an astroid. The dashed
astroid is the curve in its initial position and the solid astroid is the curve in its new

Figure 19. An ellipse as a roulette of an astroid.




Figure 20. An ellipse as a roulette of a hypocycloid.
Figure 21. An ellipse as a roulette of a hypocycloid.

position after it has rotated θ = 0.5 radians. Figures 20 and 21 show the case n = 4,
where the initial and rotated (θ = 0.25) positions are shown separately for clarity.
Since the tangent lines to the hypocycloid at the cusps pass through the tracing point
P0 , the x-intercepts of the ellipse are traced when the cusps are tangent to the x-axis.
Thus, half of the ellipse is traced when one arc of the hypocycloid has completed
rolling along the x-axis. When the point of tangency of the hypocycloid and the x-axis
passes through a cusp, the tracing point crosses the x-axis and the point of tangency
switches between the “top” and “bottom” of the x-axis (more precisely, the astroid is
alternately above and below the x-axis near the point of tangency). As mentioned in
the remarks following the roulette lemma, it is at the moments when the tracing point
crosses the x-axis, in this case when a cusp is the point of tangency, that the direction
of the tracing point changes between a rightward and a leftward motion.
In our last two examples, the curve C also has cusps and we may apply our theorem
only to its differentiable pieces. But, as in the last example, the cusps have well-defined
tangent lines and the tangents to C vary continuously, and we may piece together the
roulettes generated by each differentiable part of C . We now consider the case of an
elliptical wheel rolling on a sinusoidal road, as one focus of the ellipse moves along the
x-axis, as shown in Figure 9. More specifically, suppose that the wheel is the ellipse
$r = ed/(1 + e\cos\theta)$, 0 < e < 1. Using (7), the corresponding road is given by

$$(x, y) = \left(\frac{2ed}{a}\left[\arctan\left(\frac{f(\theta) + e}{a}\right) - \arctan\left(\frac{1 + e}{a}\right)\right],\; -\frac{ed}{1 + e\cos\theta}\right),$$

where $a = \sqrt{1 - e^2}$, $f(\theta) = \tan(\theta/2 + \pi/4)$, and the value of $\arctan((f(\theta) + e)/a)$
is the appropriate angle closest to θ/2 + π/4. A little trigonometry shows that an equation
of the road in rectangular coordinates is given by

$$y = -\frac{ed}{a^2}\left(1 - e\cos(cx)\right), \qquad (11)$$
where c = a/(ed). For an alternative approach see [3]. The negative pedal of an ellipse
with respect to a focus, P, is shown in Figures 22 and 23 for two cases, e > 1/2
and e < 1/2, respectively, where the ellipses are dashed. We follow Lockwood [4]
and call the negative pedal Burleigh’s oval and we call P the eye of the oval. If e >
1/2 the oval has two cusps. Otherwise, it is an egg-shaped curve. It follows that as
Burleigh’s oval rolls on a line, the eye traces out a sine curve. Using (9) and translating



Figure 22. Negative pedal of an ellipse (e = 0.8).
Figure 23. Negative pedal of an ellipse (e = 0.45).

by the vector $\overrightarrow{Q_0 O} = \langle 0, ed/(1 + e) \rangle$ gives a parameterization of the rolling curve in
its initial position as

$$(x, y) = \frac{ed}{(1 + e\cos\theta)^2}\left(\sin\theta + e\sin 2\theta,\; -\cos\theta - e\cos 2\theta\right) + \left(0, \frac{ed}{1 + e}\right).$$

If this oval is rolled on the x-axis, the eye, P0 (0, ed/(1 + e)), traces out the reflection
of the sine curve (11) in the x-axis. When e > 1/2 the point of tangency between the
oval and the x-axis changes between being above and below the x-axis at the cusps.
But in contrast to the previous example, the tracing point continues to move to the right
at all times since no tangent to the oval passes through its eye.
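The sinusoidal road (11) can be confirmed numerically without the arctangent branch bookkeeping. This sketch assumes the road-wheel relations dx/dθ = r(θ), y = −r(θ) (our reading of the article's equation (7), which is not reproduced in this section) and integrates the wheel radius with the trapezoid rule; all names are illustrative.

```python
import math

def ellipse_wheel(e, d):
    """Polar radius r(theta) = ed / (1 + e cos(theta)) of the elliptical wheel."""
    return lambda t: e * d / (1.0 + e * math.cos(t))

def road_point(r, theta, steps=20000):
    """Road point (x, y), assuming dx/dtheta = r(theta), y = -r(theta), x(0) = 0."""
    h = theta / steps
    total = 0.5 * (r(0.0) + r(theta))
    for i in range(1, steps):
        total += r(i * h)
    return total * h, -r(theta)

def sine_road(e, d, x):
    """Right-hand side of (11), with a^2 = 1 - e^2 and c = a / (ed)."""
    a2 = 1.0 - e * e
    c = math.sqrt(a2) / (e * d)
    return -(e * d / a2) * (1.0 - e * math.cos(c * x))

def road_gap(e, d, theta):
    """Vertical distance between the integrated road point and the curve (11)."""
    x, y = road_point(ellipse_wheel(e, d), theta)
    return abs(y - sine_road(e, d, x))
```

The gap should be small for both of the eccentricities used in the figures (e = 0.8 and e = 0.45), for any choice of d > 0.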

Figure 24. Roulette of focus of negative pedal of an ellipse (e = 0.8).
Figure 25. Roulette of focus of negative pedal of an ellipse (e = 0.45).

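This example can be generalized by taking the wheel to be the curve

$$r = \frac{ed}{1 + e\cos(\lambda\theta)}, \qquad (12)$$

where λ is a constant. Then the corresponding road is parameterized as

$$(x, y) = \left(\frac{2ed}{a\lambda}\left[\arctan\left(\frac{f(\lambda\theta) + e}{a}\right) - \arctan\left(\frac{1 + e}{a}\right)\right],\; -\frac{ed}{1 + e\cos(\lambda\theta)}\right),$$

with rectangular equation

$$y = -\frac{ed}{a^2}\left(1 - e\cos(cx)\right), \qquad (13)$$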




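where $a = \sqrt{1 - e^2}$ and $c = a\lambda/(ed)$. Using (9) and translating by the vector
$\overrightarrow{Q_0 O} = \langle 0, ed/(1 + e) \rangle$ gives a parameterization of the rolling curve in its initial position as

$$(x, y) = \frac{ed}{(1 + e\cos\lambda\theta)^2}\left\{(\sin\theta, -\cos\theta) + e(\sin\theta\cos\lambda\theta, -\cos\theta\cos\lambda\theta) + \lambda e(\sin\lambda\theta\cos\theta, \sin\lambda\theta\sin\theta)\right\} + \left(0, \frac{ed}{1 + e}\right). \qquad (14)$$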

If λ is chosen to be a positive integer, the wheel is a simple closed curve with
λ “lobes” and rolls over λ periods of the road as it rotates once. If we choose
$e = 1/\sqrt{1 + \lambda^2}$ and $d = \lambda^2$, then the road has equation $y = -\sqrt{1 + \lambda^2} + \cos x$. The wheel
and the road for λ = 4 are shown in Figure 26. As the curve given by (14) rolls on the
x-axis, the eye, $P_0(0, \sqrt{17} - 1)$, traces out the curve $y = \sqrt{17} - \cos x$, as shown in
Figure 27.
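The effect of this parameter choice on (13) is a short calculation, which can also be confirmed numerically. The sketch below simply evaluates the constants appearing in (13) and the eye height ed/(1 + e); the function name is an illustrative choice.

```python
import math

def road_coefficients(lam):
    """Constants of (13) for e = 1/sqrt(1 + lam^2), d = lam^2.

    Returns (offset, amplitude, c, eye_height), where the road reads
    y = -offset + amplitude * cos(c * x) and the eye starts at (0, eye_height).
    """
    e = 1.0 / math.sqrt(1.0 + lam * lam)
    d = lam * lam
    a = math.sqrt(1.0 - e * e)
    offset = e * d / (a * a)
    amplitude = e * offset
    c = a * lam / (e * d)
    eye_height = e * d / (1.0 + e)
    return offset, amplitude, c, eye_height
```

For λ = 4 the offset should be √17, the amplitude and c should both be 1, and the eye height should be √17 − 1, in agreement with the text.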


Figure 26. A wheel for the road $y = -\sqrt{17} + \cos x$.

Figure 27. The curve $y = \sqrt{17} - \cos x$ generated as a roulette.

While we have used our theorem to generate some familiar curves as roulettes, there
are many other possibilities. We invite the interested reader to explore more examples
of road-wheel pairs in [3] and [6] to generate additional roulettes.

ACKNOWLEDGMENTS. I am deeply grateful to several referees whose many suggestions greatly improved
this article. In particular, I would like to thank one referee for generously providing me with an example of
how to use Mathematica and another for pointing out the angle property of pedals and providing the geometric
proof of the main theorem. I would also like to thank my student, Mark Pembrooke, for suggesting the roulette
lemma, and my teacher, John P. Titterton, for introducing me to the beauty of curves thirty years ago.

REFERENCES

1. A. Agarwal and J. E. Marengo, The locus of the focus of a rolling parabola, College Math. J. 41 (2010)
129–133. doi:10.4169/074683410X480230
2. T. Apostol and M. Mnatsakanian, Area & arc length of trochogonal arches, Math Horizons 11(2) (2003)
24–30.



3. L. Hall and S. Wagon, Roads and wheels, Math. Mag. 65 (1992) 283–301. doi:10.2307/2691240
4. E. H. Lockwood, Negative pedal of the ellipse with respect to a focus, Math. Gaz. 41 (1957) 254–257.
doi:10.2307/3610116
5. E. H. Lockwood, A Book of Curves, Cambridge University Press, Cambridge, 1961.
6. G. B. Robison, Rockers and rollers, Math. Mag. 33 (1960) 139–144. doi:10.2307/3029034

FRED KUCZMARSKI received his B.A. from the University of Pennsylvania in 1984 and his Ph.D. from
the University of Washington in 1995 under the guidance of Paul Goerss. His passion for geometry was later
sparked by James King. He currently teaches at Shoreline Community College in Seattle. His other interests
include baking bread and helping the Forest Service look for fires in Washington’s North Cascades.
Department of Mathematics, Shoreline Community College, Shoreline, WA 98133

Kovalevskaya on Learning Calculus

“When we moved permanently to the country, the whole house had to be redeco-
rated and all the rooms had to be freshly wallpapered. But since there were many
rooms, there wasn’t enough wallpaper for one of the nursery rooms . . . [which]
just stood there for many years with one of its walls covered with ordinary pa-
per. But by happy chance, the paper for this preparatory covering consisted of
the lithographed lectures of Professor Ostrogradsky on differential and integral
calculus, which my father had acquired as a young man.
These sheets, all speckled over with strange, unintelligible formulas, soon
attracted my attention. I remember as a child standing for hours on end in front
of this mysterious wall, trying to figure out at least some isolated sentences . . . .
Many years later, when I was already fifteen I took my first lesson in differ-
ential calculus from the eminent Petersburg professor Alexander Nikolayevich
Strannolyubsky. He was amazed at the speed with which I grasped and assim-
ilated the concepts of limit and of derivatives, ‘exactly as if you knew them in
advance.’ . . . And, as a matter of fact, at the moment when he was explaining
these concepts I suddenly had a vivid memory of all this, written on the memo-
rable sheets of Ostrogradsky; and the concept of limit appeared to me as an old
friend.”
Sofya Kovalevskaya, A Russian Childhood,
trans. B. Stillman, Springer-Verlag, New York, pp. 122–123

—Submitted by Robert Haas, Cleveland Heights, OH




The Golden String, Zeckendorf
Representations, and the Sum of a Series
Martin Griffiths

Abstract. In this article we explore some wonderfully intricate relationships between three
mathematical objects, each of which is associated in some way with the Fibonacci sequence.
One of these objects is a particular finite series comprising n terms, and, via the other two, our
mathematical journey culminates in the derivation of an expression for the sum of this series
in terms of sums and products of certain Fibonacci numbers.

1. INTRODUCTION AND INITIAL DEFINITIONS. Since the golden string and
Zeckendorf representations of integers are both intimately associated with the
Fibonacci numbers, it should be no great surprise that the properties of these two
mathematical artefacts are linked in some way. Furthermore, the fact that the golden ratio φ
is also very much bound up with the Fibonacci numbers might lead us to suspect that
the sequence $\{\lfloor n\phi \rfloor\}$, the golden string, and Zeckendorf representations form a trio of
highly interconnected mathematical objects.
In this article we show that our suspicions are indeed well founded, and go on to
utilize some of these beautifully intricate and intimate interconnections in order to
obtain an exact expression for the sum of the series

$$\sum_{m=1}^{n} \lfloor m\phi \rfloor \qquad (1)$$

in terms of sums and products of Fibonacci numbers indexed by integers associated
with the Zeckendorf representation of n. As we proceed to the main theorem of this
paper, a number of interesting intermediate results are encountered.
The main results presented here came about whilst the author was studying various
properties of the golden string, which comprises a particular sequence of a’s and
b’s. As we shall see, $\lfloor m\phi \rfloor$ gives the position of the mth b in the golden string. The
author wondered whether it was possible to obtain an exact “floorless” formula for
$\frac{1}{n}\sum_{m=1}^{n} \lfloor m\phi \rfloor$, the mean position of the first n b’s.
For the sake of completeness, we give some definitions with which most readers
will no doubt be familiar. First, the Fibonacci sequence $\{F_n\}$ is defined by setting
$F_1 = F_2 = 1$ and then $F_n = F_{n-1} + F_{n-2}$ for n ≥ 3. Next, the golden ratio φ is given
by

$$\phi = \frac{1 + \sqrt{5}}{2}.$$

Finally, $\lfloor x \rfloor$ denotes the floor function, representing the largest integer less than or
equal to x. Since φ > 1, it follows that $\{\lfloor n\phi \rfloor\}$ is a strictly increasing sequence of
positive integers, the first few terms of which are given by

1, 3, 4, 6, 8, 9, 11, 12, 14, 16, 17, 19, 21, 22, 24, 25, 27, 29, 30, 32, . . . . (2)
doi:10.4169/amer.math.monthly.118.06.497

June–July 2011] THE GOLDEN STRING 497


This appears as sequence A000201 in [7]. The sequence we are concerned with here,
$\{\sum_{m=1}^{n} \lfloor m\phi \rfloor : n \ge 1\}$, can also be found in [7], as A054347. In fact, the formula

$$\sum_{m=1}^{n} \lfloor m\phi \rfloor = \left\lfloor \frac{n(n + 1)\phi}{2} - \frac{n}{2} \right\rfloor + f(n)$$

is given there, where f (n) = 0 or f (n) = 1. Unfortunately, no explanation is provided
as to exactly when these alternatives apply. In this paper we provide a formula for the
sum (1) that, in addition to avoiding this problem, is “floorless” and utilizes Zeckendorf
representations.
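To experiment with the sum (1) and the correction term f(n), exact integer arithmetic is convenient. The sketch below is only an illustration: it assumes the floor brackets in the quoted formula enclose n(n + 1)φ/2 − n/2, and it evaluates floors of quantities involving √5 through an integer square root rather than floating point. The function names are hypothetical.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0, with phi = (1 + sqrt(5)) / 2."""
    return (m + isqrt(5 * m * m)) // 2

def partial_sum(n):
    """The sum (1): floor(1*phi) + floor(2*phi) + ... + floor(n*phi)."""
    return sum(floor_phi(m) for m in range(1, n + 1))

def floored_part(n):
    """Exact floor(n(n+1)phi/2 - n/2) = floor((n^2 - n + n(n+1)sqrt(5)) / 4)."""
    return (n * n - n + isqrt(5 * n * n * (n + 1) * (n + 1))) // 4

def f(n):
    """The correction term in the quoted formula."""
    return partial_sum(n) - floored_part(n)
```

For n = 1, ..., 5 the partial sums are 1, 4, 8, 14, 22, and the corresponding values of f are 0, 1, 0, 0, 1.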

2. ZECKENDORF REPRESENTATIONS, AND OTHERS. Any positive integer
can be expressed as a sum of distinct Fibonacci numbers in at least one way; in fact
many can be expressed in several ways, even if we prohibit the use of $F_1$. For example,

$$11 = F_4 + F_6 = F_2 + F_3 + F_6 = F_2 + F_3 + F_4 + F_5.$$

We call these F-representations for 11. In general, we term

$$F_{c_1} + F_{c_2} + \cdots + F_{c_k}$$

an F-representation for n if $n = F_{c_1} + F_{c_2} + \cdots + F_{c_k}$, where $(c_1, c_2, \ldots, c_k)$ is an
increasing sequence such that $c_1 \ge 2$.
Zeckendorf’s theorem provides conditions under which each positive integer may
be represented in a unique way as a sum of distinct Fibonacci numbers. Specifically, it
states that, for any n ∈ N, there is exactly one way in which n can be written as a sum
of distinct Fibonacci numbers such that the sum does not include any two consecutive
Fibonacci numbers. This gives the so-called Zeckendorf representation of n. A more
formal statement of Zeckendorf’s theorem is as follows:

Theorem 2.1. For any n ∈ N there exists a unique increasing sequence of positive
integers, $(c_1, c_2, \ldots, c_k)$ say, such that $c_1 \ge 2$, $c_i \ge c_{i-1} + 2$ for i = 2, 3, . . . , k, and

$$n = \sum_{i=1}^{k} F_{c_i}.$$

Proofs may be found in both [1] and [8]. Note that F1 is excluded from appearing in a
Zeckendorf representation.
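The Zeckendorf representation can be computed with the familiar greedy procedure: repeatedly subtract the largest Fibonacci number that fits. The sketch below is one possible implementation, not something taken from the article; the function names are illustrative.

```python
def fib(k):
    """F_k with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def zeckendorf(n):
    """Indices (c_1, ..., c_k) of the Zeckendorf representation of n >= 1."""
    table = [0, 1, 1]                      # table[i] = F_i
    while table[-1] + table[-2] <= n:
        table.append(table[-1] + table[-2])
    indices = []
    i = len(table) - 1
    while n > 0:
        if table[i] <= n:
            indices.append(i)
            n -= table[i]
            i -= 2                         # the next index must differ by at least 2
        else:
            i -= 1
    return tuple(reversed(indices))
```

For example, `zeckendorf(11)` returns `(4, 6)`, matching 11 = F4 + F6 above.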
It seems remarkable that such a simple result was not discovered until well into the
20th century. Indeed, it is thought that Edouard Zeckendorf (1901–1983) first obtained
a proof in 1939, although he did not publish anything in this regard until 1972; see [9].
For interested readers, an excellent potted biography of Zeckendorf can be found in
[4]. It is worth noting, in particular, that he was a medical doctor by profession.
Throughout the majority of this article, we shall be dealing more generally with F-
representations of integers. In fact, our main result will apply to any F-representation
of n. However, Zeckendorf representations might be regarded as optimal in the sense
that if Fc1 + Fc2 + · · · + Fck is the Zeckendorf representation of n, then any other F-
representation of n will possess at least k terms. This property will be utilized right at
the very end of this paper in order to obtain the most efficient formula.




3. SOME RESULTS ASSOCIATED WITH THE GOLDEN STRING. The
golden string $S_\infty$ = “babbababbabba . . .” is defined in [6] to be the infinite string of
a’s and b’s constructed recursively as follows. Let $S_1$ = “a” and $S_2$ = “b”, and then,
for k ≥ 3, $S_k$ is defined to be the concatenation of the strings $S_{k-1}$ and $S_{k-2}$, which we
denote by $S_{k-1} \oplus S_{k-2}$. Thus

$S_3 = S_2 \oplus S_1$ = “b” ⊕ “a” = “ba”,
$S_4 = S_3 \oplus S_2$ = “ba” ⊕ “b” = “bab”,
$S_5 = S_4 \oplus S_3$ = “bab” ⊕ “ba” = “babba”,

and so on. The golden string is the unique infinite string $S_\infty$ such that for all k ≥ 2, $S_k$
is an initial segment of $S_\infty$. We note here that some authors prefer to use 0’s and 1’s
rather than a’s and b’s as elements of $S_\infty$; see [5], for example. Our first lemma gives
some simple results that will be used in due course.

Lemma 3.1. $S_k$ and $S_\infty$ have the following structural properties:

(i) $S_k$ contains $F_k$ letters, of which $F_{k-1}$ are b’s and $F_{k-2}$ are a’s, where $F_{-1} = 1$
and $F_0 = 0$ by definition.
(ii) For any k ∈ N such that k ≥ 2, $S_k \oplus S_{k-1} \oplus \cdots \oplus S_2$ gives the first
$F_2 + F_3 + \cdots + F_k$ letters of $S_\infty$.
(iii) The leftmost letter of $S_k$ is b for any k ≥ 2.
(iv) The rightmost letter of $S_k$ is a if, and only if, k is odd.
(v) There are no consecutive a’s in $S_\infty$.

Proof. It is a straightforward matter to prove (i) by induction on k. Result (ii) follows
from (i) and the fact that for odd k, $(S_k \oplus S_{k-1} \oplus \cdots \oplus S_2) \oplus S_3 = S_{k+2}$, and for even
k, $(S_k \oplus S_{k-1} \oplus \cdots \oplus S_2) \oplus S_1 \oplus S_2 = S_{k+2}$, both of which may be proven by induction.
Results (iii), (iv), and (v) follow very easily, using induction, from the definitions of
the Fibonacci sequence and $S_\infty$.
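The five properties are easy to test by brute force on the first few strings. The following sketch builds S_1, ..., S_kmax by the recursive definition and checks each property in turn; it is an illustration rather than a proof, and the function names are hypothetical.

```python
def fib(k):
    """F_k with F_1 = F_2 = 1, extended by F_0 = 0 and F_{-1} = 1."""
    if k == -1:
        return 1
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def golden_words(kmax):
    """Dictionary {k: S_k} with S_1 = 'a', S_2 = 'b', S_k = S_{k-1} + S_{k-2}."""
    words = {1: "a", 2: "b"}
    for k in range(3, kmax + 1):
        words[k] = words[k - 1] + words[k - 2]
    return words

def lemma_3_1_holds(kmax=16):
    s = golden_words(kmax)
    longest = s[kmax]                      # a long initial segment of the golden string
    for k in range(1, kmax + 1):           # property (i)
        counts = (len(s[k]), s[k].count("b"), s[k].count("a"))
        if counts != (fib(k), fib(k - 1), fib(k - 2)):
            return False
    for k in range(2, kmax - 1):           # property (ii)
        prefix = "".join(s[i] for i in range(k, 1, -1))
        if len(prefix) != sum(fib(i) for i in range(2, k + 1)):
            return False
        if not longest.startswith(prefix):
            return False
    for k in range(2, kmax + 1):           # properties (iii) and (iv)
        if s[k][0] != "b" or (s[k][-1] == "a") != (k % 2 == 1):
            return False
    return "aa" not in longest             # property (v)
```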

In the following two lemmas we begin to establish the fascinating structural
interplay between the golden string, F-representations, and a sequence related to $\{\lfloor n\phi \rfloor\}$.

Lemma 3.2. Let $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ be an F-representation for some n ∈ N. Then
$S_{c_k} \oplus S_{c_{k-1}} \oplus \cdots \oplus S_{c_1}$ gives the first n letters of $S_\infty$.

Proof. In order to prove this result we proceed by induction on k, the number of terms
in the F-representation. When k = 1, the statement of the lemma is certainly true.
Now assume it is true for some k = m ≥ 1. Consider the F-representation

$$n = F_{c_1} + F_{c_2} + \cdots + F_{c_m} + F_{c_{m+1}}.$$

We know, by way of the inductive hypothesis, that

$$S_{c_m} \oplus S_{c_{m-1}} \oplus \cdots \oplus S_{c_1}$$

gives the first $F_{c_1} + F_{c_2} + \cdots + F_{c_m}$ letters of $S_\infty$, and therefore, by Lemma 3.1(ii), of

$$S_{c_{m+1}-1} \oplus S_{c_{m+1}-2} \oplus \cdots \oplus S_{c_{m+1}-m}.$$



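Thus

$$S_{c_{m+1}} \oplus S_{c_m} \oplus S_{c_{m-1}} \oplus \cdots \oplus S_{c_1}$$

gives the first n letters of

$$S_{c_{m+1}} \oplus S_{c_{m+1}-1} \oplus S_{c_{m+1}-2} \oplus \cdots \oplus S_{c_{m+1}-m},$$

and hence, by Lemma 3.1(ii), of $S_\infty$.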
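Lemma 3.2 can be illustrated with the three F-representations of 11 given in Section 2. In this sketch, an F-representation is passed as an increasing tuple of indices (c_1, ..., c_k), and the strings are concatenated in decreasing order of index; the function names are illustrative.

```python
def golden_word(k):
    """S_k, with S_1 = 'a', S_2 = 'b', and S_k = S_{k-1} followed by S_{k-2}."""
    prev, cur = "a", "b"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur

def concatenation(indices):
    """S_{c_k} S_{c_{k-1}} ... S_{c_1} for an increasing tuple (c_1, ..., c_k)."""
    return "".join(golden_word(c) for c in reversed(indices))

def gives_initial_segment(indices, kmax=20):
    """True if the concatenation is an initial segment of the golden string."""
    return golden_word(kmax).startswith(concatenation(indices))
```

Each of `(4, 6)`, `(2, 3, 6)`, and `(2, 3, 4, 5)` yields the same 11 letters, "babbababbab".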

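Lemma 3.3. Let $N_a(n)$ and $N_b(n)$ be the numbers of a’s and b’s, respectively,
appearing amongst the first n letters of $S_\infty$. Then

$$N_a(n) = n - \left\lfloor \frac{n + 1}{\phi} \right\rfloor \quad \text{and} \quad N_b(n) = \left\lfloor \frac{n + 1}{\phi} \right\rfloor.$$

Proof. Suppose that $n = F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ is an F-representation of n. Then,
from Lemmas 3.1(i) and 3.2, it follows that

$$N_b(n) = F_{c_1 - 1} + F_{c_2 - 1} + \cdots + F_{c_k - 1}. \qquad (3)$$

In order to prove the lemma we may use Binet’s formula; see [1] or [2] for a proof of
this result:

$$F_m = \frac{1}{\sqrt{5}}\left(\phi^m - \left(-\frac{1}{\phi}\right)^m\right).$$

Now,

$$\begin{aligned}
\frac{F_m}{\phi} &= \frac{1}{\sqrt{5}}\left(\phi^{m-1} + \left(-\frac{1}{\phi}\right)^{m+1}\right) \\
&= \frac{1}{\sqrt{5}}\left(\phi^{m-1} - \left(-\frac{1}{\phi}\right)^{m-1} + \left(-\frac{1}{\phi}\right)^{m-1} + \left(-\frac{1}{\phi}\right)^{m+1}\right) \\
&= F_{m-1} + \frac{(-1)^{m-1}}{\sqrt{5}}\left(\frac{1}{\phi^{m-1}} + \frac{1}{\phi^{m+1}}\right).
\end{aligned}$$

From this and (3) it follows that

$$\frac{n + 1}{\phi} > N_b(n) - \frac{1}{\sqrt{5}}\sum_{j=1}^{\infty}\left(\frac{1}{\phi^{2j-1}} + \frac{1}{\phi^{2j+1}}\right) + \frac{1}{\phi}$$

and

$$\frac{n + 1}{\phi} < N_b(n) + \frac{1}{\sqrt{5}}\sum_{j=1}^{\infty}\left(\frac{1}{\phi^{2j}} + \frac{1}{\phi^{2j+2}}\right) + \frac{1}{\phi}.$$

On using the formula for the sum to infinity of a geometric progression, we obtain,
after some simplification,

$$\frac{1}{\sqrt{5}}\sum_{j=1}^{\infty}\left(\frac{1}{\phi^{2j-1}} + \frac{1}{\phi^{2j+1}}\right) = \frac{1}{\phi} \quad \text{and} \quad \frac{1}{\sqrt{5}}\sum_{j=1}^{\infty}\left(\frac{1}{\phi^{2j}} + \frac{1}{\phi^{2j+2}}\right) = \frac{1}{\phi^2}.$$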




The above implies that

$$N_b(n) < \frac{n + 1}{\phi} < N_b(n) + 1,$$

showing that

$$N_b(n) = \left\lfloor \frac{n + 1}{\phi} \right\rfloor.$$

The formula for $N_a(n)$ follows from this.
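Lemma 3.3 lends itself to a direct count. The sketch below compares the number of b's in each initial segment of the golden string with ⌊(n + 1)/φ⌋, computed exactly via the identity 1/φ = φ − 1 and an integer square root; the helper names are illustrative.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0."""
    return (m + isqrt(5 * m * m)) // 2

def golden_string(length):
    """First `length` letters of the golden string."""
    prev, cur = "a", "b"
    while len(cur) < length:
        prev, cur = cur, cur + prev
    return cur[:length]

def n_b_formula(n):
    """floor((n + 1) / phi), using 1/phi = phi - 1."""
    return floor_phi(n + 1) - (n + 1)

def lemma_3_3_holds(limit):
    """Check N_b(n) and N_a(n) against the formulas for 1 <= n <= limit."""
    s = golden_string(limit)
    count_b = 0
    for n in range(1, limit + 1):
        count_b += s[n - 1] == "b"
        if count_b != n_b_formula(n) or n - count_b != n - n_b_formula(n):
            return False
    return True
```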

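Since, as is easily shown,

$$\phi = 1 + \frac{1}{\phi},$$

it follows from Lemma 3.3 that

$$\lfloor n\phi \rfloor = n + \left\lfloor \frac{n}{\phi} \right\rfloor = n + N_b(n - 1).$$

As a consequence of this,

$$\begin{aligned}
\lfloor (n + 1)\phi \rfloor - \lfloor n\phi \rfloor &= \left(n + 1 + \left\lfloor \frac{n + 1}{\phi} \right\rfloor\right) - \left(n + \left\lfloor \frac{n}{\phi} \right\rfloor\right) \\
&= 1 + \left\lfloor \frac{n + 1}{\phi} \right\rfloor - \left\lfloor \frac{n}{\phi} \right\rfloor \\
&= 1 + N_b(n) - N_b(n - 1),
\end{aligned}$$

which in turn implies that

$$\lfloor (n + 1)\phi \rfloor - \lfloor n\phi \rfloor = 2$$

if, and only if, the nth letter of $S_\infty$ is b. Note that it follows from this result that if the
nth letter of $S_\infty$ is a then $\lfloor (n + 1)\phi \rfloor - \lfloor n\phi \rfloor = 1$.
We now show that $\lfloor n\phi \rfloor$ determines the position, when counting from the left, of
the nth b in the golden string. To take an example, it can be seen that the 7th b of $S_\infty$
occurs at position 11. Indeed, a quick check confirms that $\lfloor 7\phi \rfloor = 11$.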

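Lemma 3.4. The nth b of $S_\infty$ occurs at position $\lfloor n\phi \rfloor$.

Proof. Let m be the position of the nth b in $S_\infty$. Then

$$N_b(m - 1) = n - 1 \quad \text{and} \quad N_b(m) = n.$$

By Lemma 3.3 this means that

$$\left\lfloor \frac{m}{\phi} \right\rfloor = n - 1 \quad \text{and} \quad \left\lfloor \frac{m + 1}{\phi} \right\rfloor = n.$$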



Therefore

$$\frac{m}{\phi} < n < \frac{m + 1}{\phi},$$

so

$$m < n\phi < m + 1,$$

from which we see that $\lfloor n\phi \rfloor$ gives the position of the nth b in $S_\infty$.

Incidentally, it is interesting to note that the sequence

$$\left\{ \left\lfloor \frac{\lfloor (n + 1)\phi \rfloor - 1}{\phi} \right\rfloor \right\}$$

is also related to the golden string. The first few terms are given by

1, 1, 3, 4, 4, 6, 6, 8, 9, 9, 11, 12, 12, 14, 14, 16, 17, 17, 19, 19, 21, 22, 22, . . . ,

and $S_\infty$ may be constructed from this by replacing each pair of consecutive terms of
the form m, m for some m ∈ N by b, and each of the remaining terms by a.
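Both Lemma 3.4 and the incidental construction above can be tried out on a finite initial segment. The sketch below uses exact integer arithmetic for the floors (via 1/φ = φ − 1) and reads the sequence as ⌊(⌊(n + 1)φ⌋ − 1)/φ⌋; the function names are illustrative, and the final letter of a rebuilt string is ignored in comparisons because a truncated list of terms may cut a pair m, m in half.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0."""
    return (m + isqrt(5 * m * m)) // 2

def golden_string(length):
    """First `length` letters of the golden string."""
    prev, cur = "a", "b"
    while len(cur) < length:
        prev, cur = cur, cur + prev
    return cur[:length]

def b_positions(text):
    """1-based positions of the letter b in `text`."""
    return [i + 1 for i, ch in enumerate(text) if ch == "b"]

def incidental_terms(count):
    """Terms floor((floor((n+1)*phi) - 1) / phi) for n = 1, ..., count."""
    terms = []
    for n in range(1, count + 1):
        p = floor_phi(n + 1) - 1
        terms.append(floor_phi(p) - p)     # floor(p / phi) = floor(p * phi) - p
    return terms

def rebuild(terms):
    """Replace each repeated pair m, m by 'b' and every other term by 'a'."""
    letters, i = [], 0
    while i < len(terms):
        if i + 1 < len(terms) and terms[i] == terms[i + 1]:
            letters.append("b")
            i += 2
        else:
            letters.append("a")
            i += 1
    return "".join(letters)
```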

4. DOVE-TAIL SEQUENCES. In this section we give a result that will be used in
the proof of Theorem 5.1. The notion of dove-tail sequences was considered in [3].
Suppose that $A = (a_1, a_2, a_3, \ldots)$ and $B = (b_1, b_2, b_3, \ldots)$ are both strictly increasing
sequences of positive integers. Then A and B are termed a pair of dove-tail sequences
if A ∪ B = N and A ∩ B = ∅.
In [3] it is proved that if α and β are two positive irrational numbers such that
their sum equals their product, then $\{\lfloor n\alpha \rfloor\}$ and $\{\lfloor n\beta \rfloor\}$ constitute a pair of dove-tail
sequences. Letting α = φ, and then solving for β in the equation φ + β = φβ, we
obtain β = φ + 1 and hence that $\{\lfloor n\phi \rfloor\}$ and $\{\lfloor n(\phi + 1) \rfloor\} = \{\lfloor n\phi \rfloor + n\}$ are a pair of
dove-tail sequences. The sequence $\{\lfloor n\phi \rfloor + n\}$ starts

2, 5, 7, 10, 13, 15, 18, 20, 23, 26, 28, 31, . . . , (4)

as is easily checked. Notice how sequences (2) and (4) do indeed dove-tail. This result
implies that it is not possible to find m, n ∈ N such that $\lfloor m\phi \rfloor = \lfloor n\phi \rfloor + n$.
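The dove-tailing of sequences (2) and (4) can be confirmed on an initial range of integers. The sketch below is a finite check only; it relies on the fact that both sequences are strictly increasing with nth term at least n, so generating `limit` terms of each captures every value up to `limit`. The function names are illustrative.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0."""
    return (m + isqrt(5 * m * m)) // 2

def sequence_2(count):
    """First `count` terms of floor(n * phi)."""
    return [floor_phi(n) for n in range(1, count + 1)]

def sequence_4(count):
    """First `count` terms of floor(n * phi) + n."""
    return [floor_phi(n) + n for n in range(1, count + 1)]

def dovetail_up_to(limit):
    """True if the two sequences are disjoint and together cover 1..limit."""
    a, b = set(sequence_2(limit)), set(sequence_4(limit))
    universe = set(range(1, limit + 1))
    return not (a & b) and (a | b) & universe == universe
```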
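It is interesting that

$$\left\{ \lfloor n^2\phi \rfloor + n - 1 \right\} \quad \text{and} \quad \left\{ n + \left\lfloor \sqrt{\frac{n + 1}{\phi}} \right\rfloor \right\}$$

are also a pair of dove-tail sequences. In order to show that this is indeed true, note
first that the difference between successive terms in the sequence

$$\left\{ n + \left\lfloor \sqrt{\frac{n + 1}{\phi}} \right\rfloor \right\}$$

is either 1 or 2. Furthermore, from Lemmas 3.3 and 3.4, we know that

$$\left\lfloor \frac{n + 1}{\phi} \right\rfloor$$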




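is definitely equal to a square, $k^2$ say, when $n = \lfloor k^2\phi \rfloor$ and sometimes remains at this
value when $n = \lfloor k^2\phi \rfloor + 1$, but will be equal to $k^2 - 1$ when $n = \lfloor k^2\phi \rfloor - 1$. Then,
using the simple results

$$\lfloor x \rfloor = k^2 \;\Rightarrow\; \lfloor \sqrt{x} \rfloor = k \quad \text{and} \quad \lfloor x \rfloor = k^2 - 1 \;\Rightarrow\; \lfloor \sqrt{x} \rfloor = k - 1,$$

it can be shown that

$$\left( n + \left\lfloor \sqrt{\frac{n + 1}{\phi}} \right\rfloor \right) - \left( (n - 1) + \left\lfloor \sqrt{\frac{(n - 1) + 1}{\phi}} \right\rfloor \right) = 2$$

if, and only if, $n = \lfloor k^2\phi \rfloor$ for some k ∈ N. It is therefore the case that

$$\left\{ n + \left\lfloor \sqrt{\frac{n + 1}{\phi}} \right\rfloor \right\}$$

provides all of the positive integers except those of the form

$$\lfloor k^2\phi \rfloor + \left\lfloor \sqrt{\frac{\lfloor k^2\phi \rfloor + 1}{\phi}} \right\rfloor - 1 = \lfloor k^2\phi \rfloor + k - 1,$$

as required.
This result generalizes, and it is in fact true that, for each k ∈ N,

$$\left\{ \lfloor n^k\phi \rfloor + n - 1 \right\} \quad \text{and} \quad \left\{ n + \left\lfloor \sqrt[k]{\frac{n + 1}{\phi}} \right\rfloor \right\}$$

are a pair of dove-tail sequences.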
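A finite check of this generalization is straightforward. The sketch below reads the two sequences as ⌊n^k φ⌋ + n − 1 and n + ⌊((n + 1)/φ)^(1/k)⌋, and uses the facts that ⌊(n + 1)/φ⌋ = ⌊(n + 1)φ⌋ − (n + 1) and that the floor of a kth root of a real number equals the integer kth root of its floor. The function names are illustrative.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0."""
    return (m + isqrt(5 * m * m)) // 2

def integer_root(x, k):
    """Largest integer r with r**k <= x, for integers x >= 0 and k >= 1."""
    r = 0
    while (r + 1) ** k <= x:
        r += 1
    return r

def power_pair(k, limit):
    """Values up to `limit` taken by the two sequences, as a pair of sets."""
    first, n = set(), 1
    while floor_phi(n ** k) + n - 1 <= limit:
        first.add(floor_phi(n ** k) + n - 1)
        n += 1
    second = set()
    for n in range(1, limit + 1):
        q = floor_phi(n + 1) - (n + 1)     # exact floor((n + 1) / phi)
        value = n + integer_root(q, k)
        if value <= limit:
            second.add(value)
    return first, second

def is_dovetail(k, limit):
    """True if the two sets are disjoint and together equal {1, ..., limit}."""
    first, second = power_pair(k, limit)
    return not (first & second) and (first | second) == set(range(1, limit + 1))
```

For k = 2 the first sequence begins 1, 7, 16, 28, which are exactly the integers skipped by the second.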

5. EVALUATING A FINITE SUM. We now start to piece together the results from
previous sections in order to obtain a formula for the sum (1) in terms of Fibonacci
numbers. As will eventually be seen, this involves F-representations of the upper limit
n of the sum. The next two lemmas will be used in the main theorems.

Lemma 5.1. For any k ∈ N, it is the case that

$$\lfloor \phi F_{2k-1} \rfloor = F_{2k} \quad \text{and} \quad \lfloor \phi F_{2k} \rfloor = F_{2k+1} - 1.$$

Proof. Lemma 3.4 implies that $\lfloor \phi F_{2k-1} \rfloor$ locates the position of the $(F_{2k-1})$th b in $S_\infty$.
From Lemma 3.1(i) and (iv), we know that this b must be situated at position $F_{2k}$.
On the other hand, $\lfloor \phi F_{2k} \rfloor$ gives the position of the $(F_{2k})$th b in $S_\infty$. On using
Lemma 3.1(i), (iv), and (v), it can be seen that this is $F_{2k+1} - 1$.

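Lemma 5.2. Let $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ be an F-representation of some n ∈ N. Then,
for $1 \le r \le F_{c_1}$,

$$\left\lfloor \left(r + F_{c_2} + F_{c_3} + \cdots + F_{c_k}\right)\phi \right\rfloor = \lfloor r\phi \rfloor + F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_k+1}.$$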



Proof. Since $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ is an F-representation,
$F_{c_1+1} + F_{c_2+1} + \cdots + F_{c_k+1}$ is an F-representation of some integer q. We know, by Lemma 3.2, that

$$S_{c_k+1} \oplus S_{c_{k-1}+1} \oplus \cdots \oplus S_{c_1+1}$$

gives the first q letters of $S_\infty$. Note next that

$$S_{c_k+1} \oplus S_{c_{k-1}+1} \oplus \cdots \oplus S_{c_2+1}$$

contains exactly $F_{c_2} + F_{c_3} + \cdots + F_{c_k}$ b’s. From the definition of $S_{c_1+1}$ and Lemma
3.4, it follows that, for $1 \le r \le F_{c_1}$, $\lfloor r\phi \rfloor$ gives the position of the rth b in $S_{c_1+1}$. Thus,
for $1 \le r \le F_{c_1}$,

$$\lfloor r\phi \rfloor + F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_k+1}$$

gives the position of the $(r + F_{c_2} + F_{c_3} + \cdots + F_{c_k})$th b in $S_\infty$, which, by Lemma 3.4,
is also given by

$$\left\lfloor \left(r + F_{c_2} + F_{c_3} + \cdots + F_{c_k}\right)\phi \right\rfloor.$$
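Lemmas 5.1 and 5.2 can be spot-checked with exact integer arithmetic. In the sketch below an F-representation is an increasing tuple of indices (c_1, ..., c_k); the function names are illustrative, and the checks cover only the finitely many cases supplied.

```python
from math import isqrt

def floor_phi(m):
    """Exact floor(m * phi) for an integer m >= 0."""
    return (m + isqrt(5 * m * m)) // 2

def fib(k):
    """F_k with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def lemma_5_1_holds(kmax):
    """Check both identities of Lemma 5.1 for k = 1, ..., kmax."""
    return all(floor_phi(fib(2 * k - 1)) == fib(2 * k)
               and floor_phi(fib(2 * k)) == fib(2 * k + 1) - 1
               for k in range(1, kmax + 1))

def lemma_5_2_holds(indices):
    """Check Lemma 5.2 for every r with 1 <= r <= F_{c_1}."""
    tail = sum(fib(c) for c in indices[1:])          # F_{c_2} + ... + F_{c_k}
    shift = sum(fib(c + 1) for c in indices[1:])     # F_{c_2+1} + ... + F_{c_k+1}
    return all(floor_phi(r + tail) == floor_phi(r) + shift
               for r in range(1, fib(indices[0]) + 1))
```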
  

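It might initially appear that Lemma 5.1 is a specialization of Lemma 5.2. Note,
however, that the result from Lemma 5.2 is not necessarily true when r = 0. The
following theorem evaluates (1) for the case in which the upper limit is a Fibonacci
number.
Theorem 5.1.

$$\sum_{m=1}^{F_k} \lfloor m\phi \rfloor = \frac{1}{2} F_{k-1}\left(F_{k+2} + 1\right) + (-1)^{k+1}.$$

Proof. In Section 4 it was noted that $\{\lfloor n\phi \rfloor\}$ and $\{\lfloor n\phi \rfloor + n\}$ are dove-tail sequences.
Also, from Lemma 5.1, we know that

$$\lfloor \phi F_{2k} \rfloor = F_{2k+1} - 1$$

and

$$F_{2k-1} + \lfloor \phi F_{2k-1} \rfloor = F_{2k-1} + F_{2k} = F_{2k+1}.$$

Therefore

$$\sum_{m=1}^{F_{2k}} \lfloor m\phi \rfloor + \sum_{m=1}^{F_{2k-1}} \left(m + \lfloor m\phi \rfloor\right) = \sum_{m=1}^{F_{2k+1}} m,$$

which rearranges to give

$$\sum_{m=1}^{F_{2k}} \lfloor m\phi \rfloor + \sum_{m=1}^{F_{2k-1}} \lfloor m\phi \rfloor = \sum_{m=1}^{F_{2k+1}} m - \sum_{m=1}^{F_{2k-1}} m. \qquad (5)$$

Similarly, since

$$\lfloor \phi F_{2k+1} \rfloor = F_{2k+2}$$

and

$$F_{2k} + \lfloor \phi F_{2k} \rfloor = F_{2k} + F_{2k+1} - 1 = F_{2k+2} - 1,$$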




we obtain
F2k+1 F2k F2k+2 F2k
X X X X
bmφc + bmφc = m− m. (6)
m=1 m=1 m=1 m=1

It is now shown, by induction, that

$$\sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor + (-1)^j \sum_{m=1}^{F_j} \lfloor m\phi \rfloor = \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m + (-1)^j \left( \sum_{m=1}^{F_{j+1}} m - \sum_{m=1}^{F_j} m \right) \tag{7}$$

for $1 \le j \le 2k - 1$. On using (5) and (6) it may be seen that (7) is certainly true for
$j = 2k - 1$. Now assume that (7) is true for some $j$ such that $2 \le j \le 2k - 1$. On
using (7) with (5) or (6), according to whether $j$ is even or odd respectively, it follows
that

$$\sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor - (-1)^j \sum_{m=1}^{F_{j-1}} \lfloor m\phi \rfloor = \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m + (-1)^j \left( \sum_{m=1}^{F_{j+1}} m - \sum_{m=1}^{F_j} m \right) - (-1)^j \left( \sum_{m=1}^{F_{j+1}} m - \sum_{m=1}^{F_{j-1}} m \right),$$

which may be rearranged to give

$$\sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor + (-1)^{j-1} \sum_{m=1}^{F_{j-1}} \lfloor m\phi \rfloor = \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m + (-1)^{j-1} \left( \sum_{m=1}^{F_j} m - \sum_{m=1}^{F_{j-1}} m \right),$$

as required.
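On setting $j = 1$ in (7) we obtain
$$\begin{aligned}
\sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor &= 1 + \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m \\
&= \frac{1}{2}\left(F_{2k+2}(F_{2k+2} + 1) - F_{2k+1}(F_{2k+1} + 1)\right) + 1 \\
&= \frac{1}{2}\left(F_{2k+2}^2 - F_{2k+1}^2 + F_{2k+2} - F_{2k+1}\right) + 1 \\
&= \frac{1}{2}\left((F_{2k+2} + F_{2k+1})(F_{2k+2} - F_{2k+1}) + F_{2k}\right) + 1 \\
&= \frac{1}{2}\left(F_{2k+3} F_{2k} + F_{2k}\right) + 1 \\
&= \frac{1}{2} F_{2k}(F_{2k+3} + 1) + 1.
\end{aligned}$$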



It may be shown, in a similar manner, that
$$\sum_{m=1}^{F_{2k}} \lfloor m\phi \rfloor = \frac{1}{2} F_{2k-1}(F_{2k+2} + 1) - 1,$$

from which it follows that an overall formula is given by

$$\sum_{m=1}^{F_k} \lfloor m\phi \rfloor = \frac{1}{2} F_{k-1}(F_{k+2} + 1) + (-1)^{k+1}.$$
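The closed form in Theorem 5.1 is easy to test numerically. The following is a minimal sketch, not part of the original argument; it assumes the indexing $F_0 = 0$, $F_1 = F_2 = 1$, and the helper names `floor_m_phi`, `fib`, `sum_brute`, and `sum_closed` are illustrative only. The floor $\lfloor m\phi \rfloor$ is computed exactly with integer square roots to avoid floating-point error.

```python
from math import isqrt


def floor_m_phi(m: int) -> int:
    """Exact floor(m * phi) with phi = (1 + sqrt(5)) / 2, using integers only."""
    return (m + isqrt(5 * m * m)) // 2


def fib(k: int) -> int:
    """Fibonacci number F_k, assuming F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def sum_brute(k: int) -> int:
    """Left-hand side of Theorem 5.1, summed term by term."""
    return sum(floor_m_phi(m) for m in range(1, fib(k) + 1))


def sum_closed(k: int) -> int:
    """Right-hand side of Theorem 5.1.

    F_{k-1} * (F_{k+2} + 1) is always even, so the integer division is exact.
    """
    return fib(k - 1) * (fib(k + 2) + 1) // 2 + (-1) ** (k + 1)


if __name__ == "__main__":
    for k in range(1, 16):
        print(k, fib(k), sum_brute(k), sum_closed(k))
```

For example, with $k = 5$ the upper limit is $F_5 = 5$ and both sides equal $1 + 3 + 4 + 6 + 8 = 22$.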

We are now in a position to give our main result, expressing (1) in terms of Fi-
bonacci numbers indexed by positive integers associated with F-representations of n.

Theorem 5.2. Let $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ be an F-representation of $n$. Then

$$\sum_{m=1}^{n} \lfloor m\phi \rfloor = \frac{1}{2} \sum_{i=1}^{k} \left( F_{c_i-1}(F_{c_i+2} + 1) + 2(-1)^{c_i+1} \right) + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} F_{c_i} F_{c_j+1},$$

where the double sum on the right is defined to be zero if $k = 1$.

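Proof. This may be proved by induction on $k$. By Theorem 5.1, the statement of the
theorem is true when $k = 1$. Now assume that it holds for some $k = m \ge 1$. Consider

$$N = F_{c_1} + F_{c_2} + \cdots + F_{c_{m+1}}$$

and

$$N' = N - F_{c_1} = F_{c_2} + F_{c_3} + \cdots + F_{c_{m+1}}.$$

First, by the inductive hypothesis,

$$\begin{aligned}
\sum_{j=1}^{N} \lfloor j\phi \rfloor &= \sum_{j=1}^{N'} \lfloor j\phi \rfloor + \sum_{j=N'+1}^{N} \lfloor j\phi \rfloor \\
&= \frac{1}{2} \sum_{i=2}^{m+1} \left( F_{c_i-1}(F_{c_i+2} + 1) + 2(-1)^{c_i+1} \right) + \sum_{i=2}^{m} \sum_{j=i+1}^{m+1} F_{c_i} F_{c_j+1} + \sum_{j=N'+1}^{N} \lfloor j\phi \rfloor.
\end{aligned}$$

Now, using Lemma 5.2 and Theorem 5.1, we obtain

$$\begin{aligned}
\sum_{j=N'+1}^{N} \lfloor j\phi \rfloor &= \sum_{r=1}^{F_{c_1}} \lfloor (r + F_{c_2} + F_{c_3} + \cdots + F_{c_{m+1}})\phi \rfloor \\
&= \sum_{r=1}^{F_{c_1}} \left( \lfloor r\phi \rfloor + F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_{m+1}+1} \right) \\
&= \frac{1}{2} F_{c_1-1}(F_{c_1+2} + 1) + (-1)^{c_1+1} + F_{c_1}(F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_{m+1}+1}),
\end{aligned}$$

thereby proving the theorem.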




It now remains to reintroduce Zeckendorf representations into the picture. Theorem
5.2 holds for any F-representation of n. However, for ease of computation we would
like, for any given n ∈ N, the value of k to be as small as possible. As mentioned in
Section 2, Zeckendorf representations are, in this sense at least, optimal. Thus it makes
sense always to specialize Theorem 5.2 to such representations.
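
As an illustration of this specialization, the following sketch evaluates the formula of Theorem 5.2 on a Zeckendorf representation and can be compared with direct summation. It is a hedged example rather than the author's code: the function names are hypothetical, the indexing $F_1 = F_2 = 1$ is assumed, and the Zeckendorf representation is taken to use indices $c_i \ge 2$ found by the usual greedy choice of the largest Fibonacci number not exceeding the remainder.

```python
from math import isqrt


def floor_m_phi(m: int) -> int:
    """Exact floor(m * phi) with phi = (1 + sqrt(5)) / 2."""
    return (m + isqrt(5 * m * m)) // 2


def fib(k: int) -> int:
    """Fibonacci number F_k, assuming F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def zeckendorf_indices(n: int) -> list[int]:
    """Indices c_1 < c_2 < ... < c_k (all >= 2) with n = F_{c_1} + ... + F_{c_k}."""
    indices = []
    while n > 0:
        c = 2
        while fib(c + 1) <= n:  # largest c with F_c <= n
            c += 1
        indices.append(c)
        n -= fib(c)
    return sorted(indices)


def sum_via_theorem(n: int) -> int:
    """Sum of floor(m * phi) for 1 <= m <= n, using the formula of Theorem 5.2."""
    c = zeckendorf_indices(n)
    single = sum(fib(ci - 1) * (fib(ci + 2) + 1) + 2 * (-1) ** (ci + 1) for ci in c)
    cross = sum(
        fib(c[i]) * fib(c[j] + 1)
        for i in range(len(c))
        for j in range(i + 1, len(c))
    )
    return single // 2 + cross  # `single` is always even


def sum_brute(n: int) -> int:
    """The same sum computed term by term."""
    return sum(floor_m_phi(m) for m in range(1, n + 1))


if __name__ == "__main__":
    for n in (4, 7, 12, 100):
        print(n, zeckendorf_indices(n), sum_via_theorem(n), sum_brute(n))
```

For $n = 12 = F_2 + F_4 + F_6$ the formula needs only three single terms and three cross terms, whereas direct summation needs twelve floors; the saving grows with $n$ because $k$ grows only logarithmically.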

ACKNOWLEDGMENTS. I would like to thank the two anonymous referees for their valuable comments
and suggestions.

REFERENCES

1. D. Burton, Elementary Number Theory, McGraw-Hill, Singapore, 1998.
2. P. J. Cameron, Combinatorics: Topics, Techniques, Algorithms, Cambridge University Press, Cambridge,
1994.
3. M. Griffiths, Dove-tail sequences, Math. Gaz. 91 (2007) 300–302.
4. C. Kimberling, Edouard Zeckendorf, Fibonacci Quart. 36 (1998) 416–418.
5. R. Knott, The Golden String of 0s and 1s, available at
http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/fibrab.html.
6. D. E. Knuth, The Art of Computer Programming, vol. 1, Addison-Wesley, Reading, MA, 1968.
7. N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, available at
http://www.research.att.com/~njas/sequences/.
8. Wikipedia contributors, Zeckendorf’s theorem, Wikipedia, The Free Encyclopedia, available at
http://en.wikipedia.org/wiki/Zeckendorf’s_theorem.
9. E. Zeckendorf, Représentation des nombres naturels par une somme de nombres de Fibonacci ou de
nombres de Lucas, Bull. Soc. R. Sci. Liège 41 (1972) 179–182.

MARTIN GRIFFITHS is currently a Lecturer in Mathematics Education at the University of Manchester,
having previously been both a high school Head of Mathematics and a university Lecturer in Mathematics.
He received his Ph.D. in 2007 from the University of Strathclyde, referees for several mathematical journals,
and is also Reviews Editor for the Mathematical Gazette. He enjoys listening to jazz music and walking in the
beautiful English countryside.
School of Education, University of Manchester, M13 9PL, United Kingdom



Achievement Sets of Sequences
Rafe Jones

Abstract. Given a real sequence $(x_n)$, we examine the set of all sums of the form $\sum_{i \in I} x_i$, as $I$
varies over subsets of the positive integers. We call this the achievement set of $(x_n)$, and write it
$AS(x_n)$. For instance, $AS(1/2^n) = [0, 1]$ by the existence of binary expansions, and $AS(2/3^n)$
is the Cantor middle-third set. We explore the properties of these two sequences that account
for their very different achievement sets. We give a sufficient condition for a sequence to have
an achievement set that is an interval, and another sufficient condition for the achievement set
to be a Cantor set. We also examine what sets can occur as achievement sets, and give results
on the topology of achievement sets.

INTRODUCTION. In 1854, Bernhard Riemann proved his well-known rearrangement
theorem, which states that the terms of a conditionally convergent series may
be rearranged so that the series sums to any specified real number (or ±∞). On the
other hand, rearrangements of terms of an absolutely convergent series have no effect
on the sum. In this paper, we consider a variant of the rearrangement problem: what if
we allow omissions of terms (and not rearrangements)? More precisely, we say r ∈ R
is achieved by a real sequence (xn ) if there is a (possibly finite) subsequence of (xn )
whose sum converges to r . We seek to understand all r that are achieved by a given
(xn ), and we call this set the achievement set of (xn ), denoted AS(xn ).
Two examples motivate our explorations. First, consider that the existence of binary
expansions shows that $AS(1/2^n) = [0, 1]$. On the other hand, $AS(2/3^n)$ is the
Cantor middle-third set, since the latter consists of those numbers in the unit interval
representable by a ternary expansion consisting only of the digits 0 and 2. The vast
topological differences between these sets prompt natural questions. What properties
of these sequences make their achievement sets so different? What other sets can occur
as achievement sets? In this paper we resolve the first question, and shed some light
on the second.
In the direction of the first question, we characterize in Section 1 the (xn ) with limit
zero such that AS(xn ) is an interval, and deduce several corollaries. One of them is
an analogue of Riemann’s rearrangement theorem: if the terms of (xn ) form a con-
ditionally convergent series, then AS(xn ) = R. In Section 2 we give a condition that
ensures AS(xn ) is a Cantor set, which requires proving that AS(xn ) is closed provided
(xn ) approaches zero. Towards the second question mentioned in the previous para-
graph, we show in Section 3 that achievement sets come in two distinct flavors: with
empty interior, or with dense interior. Moreover, we give conditions on (xn ) that imply
AS(xn ) is either a finite union of intervals or a finite union of Cantor sets. In Section
4, we give some examples of classes of sets that do occur as achievement sets, and on
the other hand show that many familiar sets do not. Curiously, we find that the set of
nonnegative rational numbers Q+ is in the latter category, while {−1} ∪ Q+ is in the
former.
Various authors have investigated aspects of achievement sets. Hornich [5], Kakeya
[6], and Ribenboim [10, Chapter 2] have results in the direction of those presented in

doi:10.4169/amer.math.monthly.118.06.508




Section 1. Hornich [5] and Morán [7] have work along the lines of that presented in
Section 2. The results of Section 3 and Section 4 appear to be new. In [7, 8], Morán allows
sequences consisting of vectors in $\mathbb{R}^n$, and examines the case where the Lebesgue
measure of the resulting achievement set is zero. He gives precise results on the Haus-
dorff dimensions of such achievement sets, particularly in the case of sequences satis-
fying the conditions of Theorem 2.1. Also related are [2] and [4], in which a sequence
of positive integers is called complete if every positive integer is the sum of some sub-
sequence. In [2], J. L. Brown showed that the Fibonacci sequence is complete, but if
any two terms are removed the resulting sequence is not complete.

1. INTERVALS. Throughout, we deal only with sequences whose terms are all
nonzero. We denote a sequence x1 , x2 , x3 , . . . by (xn ), and we declare that the empty
subsequence sums to 0. While the definition of achievement set applies to both finite
and infinite sequences, our primary concern is with infinite sequences.
For our first results on achievement sets, we give conditions on (xn ) that imply that
AS(xn ) is an interval. In keeping with the terminology introduced so far, we call (xn ) a
high achiever if AS(xn ) is an interval (we refrain from calling (xn ) remedial if AS(xn )
fails to be an interval). The notion of a high achiever is the analogue of a complete
sequence of positive integers, since it requires AS(xn ) to be as large as possible. The
following theorem gives a characterization of high achievers among sequences whose
limit is zero, and represents a minor extension of results appearing in [5], [6], and [10,
Chapter 2].

Theorem 1.1. Let $(x_n) = x_1, x_2, x_3, \ldots$ be a sequence of real numbers with $x_n \to 0$.
Suppose that for each $k \ge 1$,

$$|x_k| \le \sum_{n=k+1}^{\infty} |x_n|. \tag{1}$$

Then $(x_n)$ is a high achiever. Moreover, if $|x_k| \ge |x_{k+1}|$ for each $k \ge 1$ then $(x_n)$ is a
high achiever if and only if (1) holds.

Note that if one drops the requirement $|x_k| \ge |x_{k+1}|$ for each $k \ge 1$ then it is easy to
find high achievers that violate (1): any nontrivial rearrangement of $(1/2^n)$ suffices.
Before getting to the proof of Theorem 1.1, we give a lemma that will be used
repeatedly in the sequel to handle sequences with negative terms.

Lemma 1.2. Let (xn ) be a sequence of real numbers, and suppose that the sum of the
negative terms of (xn ) converges to s N ≤ 0. Then −s N + AS(xn ) = AS(|xn |).

Proof. Partition the positive integers Z+ into the disjoint subsets I P = {i | xi > 0} and
I N = {i | xi < 0}. Since I P and I N partition Z+ (recall our convention that the terms
of all sequences are nonzero), we have AS(xn ) = AS(xi | i ∈ I P ) + AS(xi | i ∈ I N ),
where + denotes the arithmetic sum. Note that in this equation, we use the fact that
our hypothesis on the negative terms ensures that a subsequence with convergent sum
must in fact have absolutely convergent sum. Taking absolute values then yields

AS(|xn |) = AS(xi | i ∈ I P ) − AS(xi | i ∈ I N ). (2)



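Let $r \in AS(x_n)$, so that there is $K \subseteq \mathbb{Z}^+$ such that $r = \sum_{k \in K} x_k$. Then $K = K_P \cup K_N$
for some $K_P \subseteq I_P$ and $K_N \subseteq I_N$. Thus

$$r - s_N = \left( \sum_{i \in K_P} x_i + \sum_{i \in K_N} x_i \right) - \sum_{i \in I_N} x_i = \sum_{i \in K_P} x_i - \sum_{i \in I_N \setminus K_N} x_i$$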

and by (2) this last expression is an element of AS(|xn |). We’ve therefore shown
AS(xn ) − s N ⊆ AS(|xn |).
To show the reverse inclusion, suppose that $r \in AS(|x_n|)$. By (2), there must be
subsets $J_P \subseteq I_P$ and $J_N \subseteq I_N$ such that
$$r = \sum_{i \in J_P} x_i - \sum_{i \in J_N} x_i.$$
Adding and subtracting $\sum_{i \in I_N} x_i$ to the right-hand side gives
$$r = \left( \sum_{i \in J_P} x_i + \sum_{i \in I_N \setminus J_N} x_i \right) - s_N,$$

and thus $r \in AS(x_n) - s_N$.
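
For a finite sequence the identity $-s_N + AS(x_n) = AS(|x_n|)$ of Lemma 1.2 can be checked by enumerating all subsequence sums. The sketch below is only an illustration on one small example; the function name `achievement_set` and the sample sequence are assumptions of this example, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations


def achievement_set(terms):
    """All sums of subsequences of a finite sequence (the empty one sums to 0)."""
    sums = set()
    for size in range(len(terms) + 1):
        for chosen in combinations(terms, size):
            sums.add(sum(chosen, Fraction(0)))
    return sums


# A small sequence with both positive and negative terms (illustrative choice).
x = [Fraction(1), Fraction(-1, 2), Fraction(1, 4), Fraction(-1, 8)]

s_N = sum(t for t in x if t < 0)                  # sum of the negative terms
shifted = {r - s_N for r in achievement_set(x)}   # -s_N + AS(x_n)
absolute = achievement_set([abs(t) for t in x])   # AS(|x_n|)

if __name__ == "__main__":
    print(s_N, shifted == absolute)
```

Here $s_N = -5/8$, and translating $AS(x_n)$ by $5/8$ reproduces the sixteen subset sums of $1, 1/2, 1/4, 1/8$.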

Proof of Theorem 1.1. Let $I_N$ be as in the proof of Lemma 1.2, and assume first that
$\sum_{i \in I_N} x_i$ converges. By Lemma 1.2 it is enough in this case to show that $(|x_n|)$ is a
high achiever. We may thus assume that all terms of $(x_n)$ are positive.
Let s denote the sum of the xn , and allow s to be infinite. Clearly it is enough to
show that r ∈ AS(xn ) for 0 < r < s.
We define indices $i_1, i_2, i_3, \ldots$ using a greedy algorithm. Let $i_1$ be the smallest index
satisfying $x_{i_1} \le r$. Inductively, if $i_1, i_2, \ldots, i_m$ are already chosen, we take $i_{m+1}$ to be
the smallest index such that $i_{m+1} > i_m$ and
$$x_{i_{m+1}} + \sum_{j=1}^{m} x_{i_j} \le r,$$

provided that at least one such index exists.
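
The greedy choice of indices just described can be sketched in a few lines on a finite truncation of a positive sequence. This is an illustration under stated assumptions: the function name `greedy_indices` is hypothetical, exact rationals are used so that the comparison with $r$ is exact, and a single left-to-right scan is used, which selects the same indices as repeatedly taking the smallest admissible index.

```python
from fractions import Fraction


def greedy_indices(terms, r):
    """Scan positive terms in order, keeping each one that still fits under r.

    Returns the 1-based indices chosen and the sum they reach.
    """
    chosen, total = [], Fraction(0)
    for index, term in enumerate(terms, start=1):
        if total + term <= r:
            chosen.append(index)
            total += term
    return chosen, total


if __name__ == "__main__":
    halves = [Fraction(1, 2 ** n) for n in range(1, 31)]
    harmonic = [Fraction(1, n) for n in range(1, 200)]
    print(greedy_indices(halves, Fraction(5, 8)))     # process terminates: 1/2 + 1/8
    print(greedy_indices(harmonic, Fraction(7, 10)))  # process terminates: 1/2 + 1/5
    print(greedy_indices(halves, Fraction(1, 3))[0])  # does not terminate in the limit
```

With $x_n = 1/2^n$ and $r = 1/3$ the scan keeps exactly the even indices, which is the binary expansion $0.010101\ldots$ of $1/3$; on the truncation to 30 terms the sum reached falls short of $r$ by less than $2^{-30}$.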


If this process terminates, then there must be some $m$ such that $i_1, \ldots, i_m$ are defined
but for each $n > i_m$ we have $x_n + \sum_{j=1}^{m} x_{i_j} > r$. By construction $\sum_{j=1}^{m} x_{i_j} \le r$, and
by hypothesis $\lim_{n \to \infty} x_n = 0$. It follows that $\sum_{j=1}^{m} x_{i_j} = r$, whence $r \in AS(x_n)$.
Suppose now that the process of constructing the $i_j$ does not terminate, and suppose
further that the sequence $i_1, i_2, i_3, \ldots$ omits a finite number of positive integers. Since
$r < s$ the sequence must omit at least one positive integer. Let $k$ be the largest such
integer. Consider the sum $t$ of the $x_{i_j}$ with $i_j < k$ (let $t = 0$ if there are no $i_j < k$). We
then have $x_k + t > r$ and $t + \sum_{h=1}^{\infty} x_{k+h} \le r$. It follows that $x_k > \sum_{h=1}^{\infty} x_{k+h}$, which
contradicts (1).
Therefore if the process of constructing the $i_j$ does not terminate, then the sequence
$i_1, i_2, i_3, \ldots$ omits an infinite number of positive integers. Let $\{k_1, k_2, k_3, \ldots\}$ be such a
sequence. This means that for each $k_l$,
$$x_{k_l} + \sum_{i_j < k_l} x_{i_j} > r \ge \sum_{i_j < k_l} x_{i_j}. \tag{3}$$




By hypothesis $\lim_{l \to \infty} x_{k_l} = 0$, and so taking the limit as $l \to \infty$ in (3) gives
$r = \sum_{j=1}^{\infty} x_{i_j}$. Thus $r \in AS(x_n)$. This proves the theorem in the case that $\sum_{i \in I_N} x_i$
converges.
If $\sum_{i \in I_N} x_i$ diverges, then the positive-term sequence $(-x_i \mid i \in I_N)$ satisfies (1)
for each $k$. Thus $AS(-x_i \mid i \in I_N) = [0, \infty)$. Letting $I_P$ be the set of indices of the
positive terms of $(x_n)$, we now have
$$AS(x_n) = \begin{cases} (-\infty, c] & \text{if } \sum_{i \in I_P} x_i \text{ converges to } c, \\ (-\infty, \infty) & \text{if } \sum_{i \in I_P} x_i \text{ diverges.} \end{cases} \tag{4}$$

In either case, $AS(x_n)$ is an interval, so $(x_n)$ is a high achiever.


We now prove the second assertion of the theorem. Suppose that $|x_k| \ge |x_{k+1}|$ for
each $k \ge 1$, and also that (1) does not hold, i.e., there exists an index $k$ with

$$|x_k| > \sum_{n=k+1}^{\infty} |x_n|.$$

This implies that the terms of $(x_n)$ form an absolutely convergent series, so by Lemma
1.2 we may assume without loss of generality that the terms of $(x_n)$ are positive.
Clearly both $b = x_k$ and $a = \sum_{n=k+1}^{\infty} x_n$ are in $AS(x_n)$. We claim that $AS(x_n) \cap (a, b)$
is empty, which shows that $(x_n)$ is not a high achiever. Let $I \subseteq \mathbb{Z}^+$. If $i \in I$ for some
$i \le k$, then since $x_i \ge x_k$, we have $\sum_{i \in I} x_i \ge x_k = b$. On the other hand, if $I$ omits
every $j \le k$, then $\sum_{i \in I} x_i \le \sum_{i=k+1}^{\infty} x_i = a$.

We now reap some of the fruits of Theorem 1.1; see also [10, Chapter 2].

Corollary 1.3. $AS(1/n) = [0, \infty)$.

Corollary 1.3 follows immediately from the fact that the harmonic series diverges,
and implies that every real number in $[0, \infty)$ can be expressed as a (possibly infinite)
Egyptian fraction. Indeed, such an Egyptian fraction can even be taken with all denominators
prime. This follows from the result of Euler that $\sum_{n=1}^{\infty} 1/p_n$ diverges [9, p. 59],
where $p_1, p_2, \ldots$ is an enumeration of the primes, implying that $AS(1/p_n) = [0, \infty)$.
We also have an analogue of Riemann’s rearrangement theorem. Note that in our
setting we allow only omissions of terms, not rearrangements.

Corollary 1.4. Let (xn ) be a sequence whose terms form a conditionally convergent
series. Then AS(xn ) = R.

Proof. Let $I_P$ and $I_N$ be, respectively, the sets of indices of the positive and negative
terms of $(x_n)$. Conditional convergence implies $\sum_{i \in I_P} x_i = \infty$ and $\sum_{i \in I_N} x_i = -\infty$.
Theorem 1.1 then shows that $AS(x_i \mid i \in I_P) = [0, \infty)$ and $AS(x_i \mid i \in I_N) = (-\infty, 0]$.
The corollary follows immediately.

Our final corollary gives us a practical method for showing that many sequences
whose terms form absolutely convergent series are high achievers.

Corollary 1.5. Let $(x_n)$ be a sequence with $\lim_{n \to \infty} x_n = 0$, and suppose $|x_{n+1}| \ge \frac{1}{2}|x_n|$
for all $n$. Then $(x_n)$ is a high achiever.



Proof. By iterating our hypothesis, we have $|x_{k+i}| \ge \frac{1}{2^i}|x_k|$ for every $k$ and $i$. Thus for
each $k$
$$\sum_{i=1}^{\infty} |x_{k+i}| \ge \sum_{i=1}^{\infty} \frac{1}{2^i}|x_k| = |x_k| \sum_{i=1}^{\infty} \frac{1}{2^i} = |x_k|.$$

It follows from Theorem 1.1 that (xn ) is a high achiever.

Corollary 1.5 may be applied to the sequence of Fibonacci reciprocals $(1/F_n)$.
When summed, they yield a series converging to $\beta \approx 3.36$, a number of considerable
mystery whose irrationality was proven only in 1989 [1]. Since $F_{n+1} = F_n + F_{n-1} \le 2F_n$,
Corollary 1.5 shows that $AS(1/F_n) = [0, \beta]$.
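
The two facts used here, the ratio bound $1/F_{n+1} \ge \frac{1}{2}(1/F_n)$ and the approximate value of $\beta$, can be checked on a finite prefix of the sequence. The sketch below is illustrative only; the helper name `fibonacci` and the cutoff of 80 terms are choices made for this example, with $F_1 = F_2 = 1$ assumed.

```python
from fractions import Fraction


def fibonacci(count):
    """First `count` Fibonacci numbers F_1, F_2, ..., with F_1 = F_2 = 1."""
    values, a, b = [], 1, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values


recips = [Fraction(1, f) for f in fibonacci(80)]

# Hypothesis of Corollary 1.5 on the prefix: x_{n+1} >= x_n / 2 for each n.
ratio_ok = all(2 * recips[n + 1] >= recips[n] for n in range(len(recips) - 1))

# Partial sum of the reciprocals; the neglected tail is far below double precision.
beta_partial = float(sum(recips))

if __name__ == "__main__":
    print(ratio_ok, beta_partial)
```

The partial sum agrees with the value $\beta \approx 3.36$ quoted above to many more digits than are shown in the text.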

2. CANTOR SETS. As noted in the introduction, $AS(2/3^n)$ is the Cantor middle-third
set. We wish to understand what kinds of sequences have Cantor sets as their
achievement sets, and in this section we give a sufficient condition that is similar to
the one in Theorem 1.1 (see Theorem 2.1). Recall that a generalized Cantor set, which
we refer to simply as a Cantor set, is a compact, perfect, totally disconnected subset of
the real numbers. Any Cantor set can be constructed in a manner similar to the Cantor
middle-third set: begin with a compact interval and remove an open subinterval in
such a way that two closed intervals of positive length remain. These two intervals are
called the intervals of stage one. Perform a similar operation on each of the intervals
of stage one, leaving four intervals of stage two. Continue in this way, producing $2^k$
disjoint intervals at stage $k$. If we let $C_k$ be the union of the intervals of stage $k$, and
the maximum length of these intervals tends to zero as $k$ tends to infinity, then
$C = \bigcap_{k=0}^{\infty} C_k$ is a Cantor set.
For our purposes, we are interested primarily in central Cantor sets, namely those
that can be formed by following the recipe of the previous paragraph, but at any stage
all removed subintervals are centered and have the same length. In this case the maxi-
mum length of the intervals at stage k automatically tends to zero. The Cantor middle-
third set is an example.

Theorem 2.1. Let $(x_n)$ be a real sequence, and suppose that for each $k \ge 1$,

$$|x_k| > \sum_{i=k+1}^{\infty} |x_i|. \tag{5}$$

Then $AS(x_n)$ is a central Cantor set.


The removed intervals of stage $k$ all have length $|x_k| - \sum_{i=k+1}^{\infty} |x_i|$. It follows that
every central Cantor set with 0 as its left endpoint is the achievement set of some
sequence (see Section 4). Also, under the hypotheses of Theorem 2.1, the measure of
$AS(x_n)$ is

$$\lim_{k \to \infty} 2^k \sum_{i=k+1}^{\infty} |x_i|.$$

For more on the interesting question of how the measure and Hausdorff dimension of
AS(xn ) relate to (xn ), see [7, 8].




In order to prove Theorem 2.1, it is certainly necessary to establish that AS(xn ) is
closed. This result has interest in its own right, and is originally due to Hans Hornich
[5]. We give its proof as a separate theorem, following which we prove Theorem 2.1.

Theorem 2.2 (Hornich). Let $(x_n)$ be a positive-term sequence, and suppose $\sum_{n=1}^{\infty} x_n$
converges. Then $AS(x_n)$ is closed.

Proof. Let $(s_j)$ be a sequence of elements of $AS(x_n)$ whose limit is $s$. Let $I_j \subseteq \mathbb{Z}^+$
satisfy $s_j = \sum_{i \in I_j} x_i$.
Suppose there are no positive integers $n$ that belong to $I_j$ for infinitely many $j$. Then
for any fixed $m$ we have that for all $j$ sufficiently large,

$$s_j \le \sum_{n=m+1}^{\infty} x_n.$$

Letting $m \to \infty$ and using the convergence of $\sum_{n=1}^{\infty} x_n$, we have $s = 0 \in AS(x_n)$.
Suppose now that there is a positive integer belonging to $I_j$ for infinitely many $j$,
and let $n_1$ be the smallest one. Thus there is an infinite set $J_1$ such that for all $j \in J_1$, $n_1$
is the smallest element of $I_j$. Now suppose that $n_1, \ldots, n_k$ have been chosen, and there
is an infinite set $J_k$ such that for all $j \in J_k$, the first $k$ elements of $I_j$ are $n_1, \ldots, n_k$.
If there are no $n > n_k$ that belong to $I_j$ for infinitely many $j \in J_k$, then fixing an
$m > n_k$ we have that for all $j \in J_k$ sufficiently large,

$$s_j - (x_{n_1} + \cdots + x_{n_k}) \le \sum_{i=m+1}^{\infty} x_i. \tag{6}$$

Letting $m \to \infty$, we have $s = x_{n_1} + \cdots + x_{n_k} \in AS(x_n)$. If there is an $n > n_k$ that
belongs to $I_j$ for infinitely many $j$, then let $n_{k+1}$ be the smallest one. Then there is an
infinite set $J_{k+1}$ such that for all $j \in J_{k+1}$, the first $k + 1$ elements of $I_j$ are $n_1, \ldots, n_{k+1}$.
This process either terminates or results in an infinite sequence $n_1, n_2, \ldots$. In the
former case, from (6) we have $s \in AS(x_n)$. In the latter case, we can choose for each
$k$ some $s_{j_k}$ with $j_k$ an increasing sequence and

$$s_{j_k} - (x_{n_1} + \cdots + x_{n_k}) \le \sum_{i=n_k+1}^{\infty} x_i. \tag{7}$$

Since the original sequence $(s_j)$ approaches $s$, the subsequence $(s_{j_k})$ must too. We thus
obtain $s = \sum_{i=1}^{\infty} x_{n_i}$ by taking $k \to \infty$ in (7). Therefore $s \in AS(x_n)$, as desired.

Proof of Theorem 2.1. An immediate consequence of (5) is that the series $\sum_{n=1}^{\infty} x_n$
is absolutely convergent. In particular, the sum of the negative terms of $(x_n)$ must
converge. Thus by Lemma 1.2, we may assume that each $x_n$ is positive, since a translate
of a central Cantor set is again a central Cantor set.

For $k \ge 1$, let $t_k = \sum_{i=k}^{\infty} x_i$. Define $C_0 = [0, t_1]$ and

$$C_k = AS(x_1, \ldots, x_k) + [0, t_{k+1}] \tag{8}$$

for $k \ge 1$. Since $AS(x_1, \ldots, x_k)$ is a finite set, $C_k$ is a finite union of closed intervals.
We also clearly have $C_k \supseteq C_{k+1}$ for each $k \ge 0$.



We claim that $\bigcap_{k=0}^{\infty} C_k = AS(x_n)$. If $s \in AS(x_n)$, then it follows immediately
that $s \in \bigcap_{k=0}^{\infty} C_k$. Suppose that $s \in C_k$. Then $s - q_k \in [0, t_{k+1}]$ for some
$q_k \in AS(x_1, \ldots, x_k) \subseteq AS(x_n)$. In particular, $|s - q_k| \le t_{k+1}$. Because $\sum_{n=1}^{\infty} x_n$ converges,
$t_{k+1} \to 0$ as $k \to \infty$. So if we take $s \in \bigcap_{k=0}^{\infty} C_k$, then there exists an infinite sequence
$q_1, q_2, \ldots$ of elements in $AS(x_n)$ such that $\lim_{k \to \infty} |s - q_k| = 0$. Because $AS(x_n)$ is
closed by Theorem 2.2, we have $s \in AS(x_n)$.
To complete the proof, we need only show that each $C_k$ is central, i.e., that it consists
of $2^k$ disjoint intervals and that, for $k \ge 1$, $C_k$ is formed from $C_{k-1}$ by deleting central
open intervals of equal length from all intervals of $C_{k-1}$. Note that

$$C_1 = [0, t_2] \cup [x_1, x_1 + t_2] = [0, t_2] \cup [x_1, t_1] \subseteq [0, t_1] = C_0.$$

By (5), $x_1 > t_2$, so the intervals of $C_1$ are disjoint. Moreover, $t_2 = t_1 - x_1$, so the
removed subinterval $(t_2, x_1)$ is a central interval of $[0, t_1] = C_0$.
Now suppose inductively that $C_k$ is central, which implies that it is a union of $2^k$
disjoint intervals. By definition, each interval of $C_k$ is a translate of $[0, t_{k+1}]$. Thus $C_{k+1}$
consists of disjoint pairs of intervals that are translates of $[0, t_{k+2}] \cup [x_{k+1}, t_{k+1}]$. By (5)
we have $x_{k+1} > t_{k+2}$, so each pair of intervals is disjoint and the removed subintervals
have the same length. Because $t_{k+2} = t_{k+1} - x_{k+1}$, the removed subinterval $(t_{k+2}, x_{k+1})$
is central. Hence $C_{k+1}$ is central. Thus by induction all the $C_k$ are central.
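
The sets $C_k = AS(x_1, \ldots, x_k) + [0, t_{k+1}]$ from this proof can be computed explicitly for $x_n = 2/3^n$, where the tail is $t_{k+1} = \sum_{i > k} 2/3^i = 1/3^k$. The following sketch is an illustration under those assumptions (the function name `stage_intervals` is hypothetical); it lists the intervals of $C_k$ so that their number, disjointness, and total length can be inspected.

```python
from fractions import Fraction
from itertools import combinations


def stage_intervals(k):
    """Intervals of C_k = AS(x_1, ..., x_k) + [0, t_{k+1}] for x_n = 2/3^n."""
    terms = [Fraction(2, 3 ** n) for n in range(1, k + 1)]
    tail = Fraction(1, 3 ** k)  # t_{k+1} = sum of 2/3^i over i > k
    left_ends = {
        sum(chosen, Fraction(0))
        for size in range(k + 1)
        for chosen in combinations(terms, size)
    }
    return sorted((q, q + tail) for q in left_ends)


if __name__ == "__main__":
    for k in range(4):
        intervals = stage_intervals(k)
        total = sum(right - left for left, right in intervals)
        print(k, len(intervals), total)
```

For this sequence $C_k$ has $2^k$ intervals of total length $(2/3)^k = 2^k t_{k+1}$, so the lengths tend to zero, as expected for the Cantor middle-third set.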

We close this section with a generalization of Theorem 2.2.

Corollary 2.3. Suppose limn→∞ xn = 0. Then AS(xn ) is closed.

Proof. Let $s_N \ge -\infty$ denote the sum of the negative terms of $(x_n)$. If $s_N$ is infinite,
then as in the proof of Theorem 1.1 (see (4)) $AS(x_n)$ is closed. If $s_N$ is finite, then by
Lemma 1.2 we can assume that $(x_n)$ has positive terms, because a translate of a closed
set is again closed. If $\sum_{n=1}^{\infty} x_n$ converges, then $AS(x_n)$ is closed by Theorem 2.2. If
$\sum_{n=1}^{\infty} x_n$ diverges then $AS(x_n) = [0, \infty)$ by Theorem 1.1.

It is not true that all achievement sets are closed. For instance, suppose that xn =
1 + 1/n for all n ≥ 1. Then AS(xn ) does not contain its limit point 1. In this example,
AS(xn ) is countable, so it is natural to ask if all uncountable achievable sets are closed;
the following example of Velleman (personal communication, 2006) shows that the
answer is no.
Consider the two sequences given by

$$x_n = \frac{2}{3^n}, \qquad y_n = 2 - \frac{1}{2 \cdot 3^{n-1}}.$$

Make a new sequence $z_n$ by interleaving these two, so that the first few terms of $z_n$ are
2/3, 3/2, 2/9, 11/6, 2/27, 35/18. Note that $AS(x_n)$ is the usual Cantor middle-third
set, and hence $AS(z_n)$ is uncountable. Moreover, 2 is an accumulation point of $AS(y_n)$,
and thus also of $AS(z_n)$. We now show that $2 \notin AS(z_n)$. Suppose a subsequence of $z_n$
sums to 2, and note that it can contain at most one term of $(y_n)$, since $y_n > 1$ for all
$n$. Moreover, this subsequence must contain at least one term of $(y_n)$, since summing
all the $x_n$ yields 1. Therefore we have a subsequence of $(x_n)$ whose terms sum to $\frac{1}{2 \cdot 3^{k-1}}$
for some $k$. However, $\frac{1}{2 \cdot 3^{k-1}}$ is halfway between $\frac{1}{3^k}$ and $\frac{2}{3^k}$, and so is not contained in
the Cantor middle-third set. This is a contradiction, proving that $2 \notin AS(z_n)$.
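
The key step of this example, that $2 - y_k = \frac{1}{2 \cdot 3^{k-1}}$ never lies in the Cantor middle-third set, can be checked for the first few $k$ with exact rational arithmetic. The sketch below is illustrative; the membership test `in_cantor_set` is a hypothetical helper that repeatedly rescales the left or right third back to $[0, 1]$ and relies on the fact that the orbit of a rational number is eventually periodic.

```python
from fractions import Fraction


def in_cantor_set(q):
    """Decide whether a rational q lies in the Cantor middle-third set."""
    if q < 0 or q > 1:
        return False
    seen = set()
    while q not in seen:          # rationals have finitely many possible states
        seen.add(q)
        if q <= Fraction(1, 3):
            q = 3 * q             # zoom into the left third
        elif q >= Fraction(2, 3):
            q = 3 * q - 2         # zoom into the right third
        else:
            return False          # q lies in a removed open middle third
    return True


x = [Fraction(2, 3 ** n) for n in range(1, 8)]
y = [2 - Fraction(1, 2 * 3 ** (n - 1)) for n in range(1, 8)]
z = [term for pair in zip(x, y) for term in pair]   # interleave x and y

if __name__ == "__main__":
    print(z[:6])
    print([in_cantor_set(2 - yk) for yk in y])
```

Each value $2 - y_k$ reaches $1/2$ after $k - 1$ rescalings, which is why the test rejects it.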




3. THE TWO KINDS OF ACHIEVEMENT SETS. Thus far we have seen that
many achievement sets are intervals, and thus connected (Section 1), and many are
Cantor sets and thus totally disconnected (Section 2). Examples of achievement sets
that are unions of disjoint intervals also abound: for instance, if $x_1 = 2$ and $x_n = 1/2^{n-1}$
for $n \ge 2$, then $AS(x_n) = [0, 1] \cup [2, 3]$. This raises the question of whether
there are achievement sets that contain an interval but are not unions of intervals. In
fact, such achievement sets exist, and we give one defined by Velleman (personal com-
munication, 2006).
The idea is to construct an achievement set that consists of all numbers repre-
sentable by a certain kind of base-a expansion, where a is a suitably chosen real
number less than 1. The allowable multiples of the powers of a come from a set hav-
ing additive properties that ensure no intervals are contained in the extremities of the
achievement set, but an interval is contained in the middle.
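Define $x_1 = \frac{3}{5}$, $x_2 = \frac{2}{5}$, $x_3 = \frac{2}{5}$, $x_4 = \frac{2}{5}$, and for $n > 4$ put $x_n = a \cdot x_{n-4}$, where $a$ is
chosen so that

$$\frac{1}{5} \le \sum_{n=1}^{\infty} a^n < \frac{2}{9}. \tag{9}$$

For instance, $a = 19/109$ will do. Put $b = \sum_{n=1}^{\infty} a^n = \frac{a}{1-a}$, and note that
$AS(x_n) \subseteq \left[0, \frac{9}{5}(1 + b)\right]$.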

By (9) we have $\frac{9}{5} \sum_{n=1}^{\infty} a^n < \frac{2}{5}$, whence $AS(x_n)$ omits the interval $\left(\frac{9}{5}b, \frac{2}{5}\right)$. Similarly,
$AS(x_n)$ omits the intervals $\left(\frac{9}{5}ba^i, \frac{2}{5}a^i\right)$ for all $i \ge 1$. Thus $0 \in AS(x_n)$ but $AS(x_n)$
omits an interval in all neighborhoods of 0. It follows that $AS(x_n)$ is not a finite union
of intervals.
On the other hand, we claim $\left[\frac{2}{5}(1 + b), \frac{7}{5}(1 + b)\right] \subset AS(x_n)$. Consider the sequence
$y_n$ defined by $y_n = \frac{1}{5}a^i$ for $5i + 1 \le n \le 5i + 5$. Thus the first five terms of
$y_n$ are all $1/5$, the next five are all $a \cdot 1/5$, the next five are $a^2 \cdot 1/5$, and so on. By
Theorem 1.1, $AS(y_n)$ must be an interval provided that for all $i$ we have

$$\frac{1}{5}a^i \le \frac{5}{5} \sum_{n=i+1}^{\infty} a^n.$$

This holds by (9), and therefore $AS(y_n) = [0, 1 + b]$.
Now let $c \in \left[\frac{2}{5}(1 + b), \frac{7}{5}(1 + b)\right]$. Then $c$ can be written as $\frac{2}{5}(1 + b)$ plus an
element of $AS(y_n)$. We thus have
$$\begin{aligned}
c &= \frac{2}{5} \sum_{n=0}^{\infty} a^n + \left( \frac{k_0}{5} + \frac{k_1}{5}a + \frac{k_2}{5}a^2 + \cdots \right) && (0 \le k_n \le 5) \\
&= \frac{2 + k_0}{5} + \frac{2 + k_1}{5}a + \frac{2 + k_2}{5}a^2 + \cdots && (0 \le k_n \le 5).
\end{aligned}$$
All fractions of the form $(2 + k)/5$, $0 \le k \le 5$ may be produced by summing
subcollections of $\{3/5, 2/5, 2/5, 2/5\}$. Therefore $c \in AS(x_n)$, proving that
$\left[\frac{2}{5}(1 + b), \frac{7}{5}(1 + b)\right] \subset AS(x_n)$.
We remark that by Theorem 2.1 the sequences $\left(\frac{3}{5}a^n\right)$ and $\left(\frac{2}{5}a^n\right)$ both have achievement
sets that are central Cantor sets. Therefore $AS(x_n)$ is the arithmetic sum of four
central Cantor sets, and we have shown that this arithmetic sum can contain an in-
terval without being a disjoint union of intervals. In general, the question of when an
arithmetic sum of Cantor sets contains an interval is difficult; see, e.g., [3].
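
The arithmetic behind this example is quick to verify with exact fractions: that $a = 19/109$ satisfies condition (9), that the omitted interval next to $2/5$ is nonempty, and that the subcollections of $\{3/5, 2/5, 2/5, 2/5\}$ produce every fraction $(2 + k)/5$ with $0 \le k \le 5$. The sketch below is a minimal check; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

a = Fraction(19, 109)
b = a / (1 - a)          # b = sum of a^n over n >= 1 (geometric series)

# Subset sums of the first block of four terms x_1, ..., x_4.
block = [Fraction(3, 5), Fraction(2, 5), Fraction(2, 5), Fraction(2, 5)]
block_sums = {
    sum(chosen, Fraction(0))
    for size in range(len(block) + 1)
    for chosen in combinations(block, size)
}

# Digits (2 + k)/5, 0 <= k <= 5, needed for the expansion of c.
needed = {Fraction(2 + k, 5) for k in range(6)}

if __name__ == "__main__":
    print(b, Fraction(1, 5) <= b < Fraction(2, 9))
    print(sorted(block_sums))
    print(needed <= block_sums)
```

With this choice $b = 19/90$, which lies between $1/5 = 18/90$ and $2/9 = 20/90$, and $\frac{9}{5}b = 0.38 < 0.4$, so the interval $\left(\frac{9}{5}b, \frac{2}{5}\right)$ is indeed nonempty.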



We can, however, salvage some kind of dichotomy among achievement sets, thanks
to a result mainly due to Velleman (personal communication, 2006). Recall that a set
is nowhere dense if its closure has empty interior, and meager (or of first category) if
it is a countable union of nowhere dense sets. By the Baire category theorem a meager
subset of the reals has empty interior, and hence is totally disconnected.

Theorem 3.1. Let (xn ) be a sequence of real numbers. Then AS(xn ) is either a meager
set, and thus has empty interior, or the interior of AS(xn ) is dense in AS(xn ).

Before embarking on the proof of Theorem 3.1, we give a proposition that effec-
tively reduces the proof to the case where xn → 0. This proposition has some inde-
pendent interest as well, and gives some justification for our emphasis thus far on
sequences whose terms approach 0.

Proposition 3.2. Let (xn ) be any real sequence. Then AS(xn ) is either countable, an
infinite interval, or a countable union of translates of AS(xn k ), where (xn k ) is some
subsequence of (xn ) that converges to 0.

Proof. Let $E$ be the set of accumulation points of $(x_n)$. Suppose first that $E \cap (0, \epsilon) \ne \emptyset$
for every $\epsilon > 0$. Then there is a sequence $e_1, e_2, \ldots$ of elements of $E$ with $e_n \to 0$
and $e_n < 1$ for all $n$. For each $n$, let $k_n$ be a positive integer with $1/k_n > e_n > 1/(k_n + 2)$,
whence there are infinitely many $x_j$ with $1/k_n > x_j > 1/(k_n + 2)$. For
each $n$, choose $k_n + 2$ such terms and form an infinite subsequence by concatenation.
This subsequence approaches 0 but its sum diverges, and thus it has achievement set
$[0, \infty)$ by Theorem 1.1. It follows that $AS(x_n)$ is an infinite interval. In the case where
$E \cap (-\epsilon, 0) \ne \emptyset$ for every $\epsilon > 0$ a similar argument applies.
Now suppose that there is some $\epsilon > 0$ such that $E \cap \{r \in \mathbb{R} \mid 0 < |r| < \epsilon\} = \emptyset$.
Let $(x_{m_k})$ be the subsequence consisting of the terms of $(x_n)$ with absolute value at
least $\frac{\epsilon}{2}$. Let $(x_{n_k})$ be the complementary subsequence. Note that

$$AS(x_n) = AS(x_{m_k}) + AS(x_{n_k}).$$

The first summand on the right-hand side consists only of sums of finitely many terms,
and thus is countable (see also Proposition 4.1). By our assumption about $E$, the only
possible accumulation point of $(x_n)$ in $(-\epsilon, \epsilon)$ is 0, so the sequence $(x_{n_k})$ is either finite
or has a limit of 0. In the first case, $AS(x_n)$ is countable, while in the second it is a
countable union of translates of $AS(x_{n_k})$.

Note that in the case that AS(xn ) is an infinite interval, it is clearly a countable
union of translates of $AS(1/2^n)$. Thus Proposition 3.2 implies that $AS(x_n)$ is either
countable or a countable union of translates of AS(yn ), where (yn ) is some sequence
whose terms approach zero. We are now ready to prove Theorem 3.1.

Proof of Theorem 3.1. We first remark that translates of a meager set are meager, and
a countable union of meager sets is also meager. Moreover, it is easy to see that the
same two statements hold if “meager” is replaced by “has dense interior.” Thus by
Proposition 3.2, it is enough to prove the theorem in the case that xn → 0.
Assume now that the theorem is true when xn → 0 and xn > 0. If xn → 0 but xn
has both positive and negative terms, consider the sum of the negative terms. If this
sum diverges then by Theorem 1.1, AS(xn ) is an interval and thus has dense interior.
If it converges, then by Lemma 1.2 we have that AS(xn ) is a translate of AS(|xn |).




By assumption we have that AS(|xn |) is either meager or has dense interior, so the set
AS(xn ) must also fall into one of these two classes.
Therefore it suffices to prove the theorem under the hypotheses that x_n → 0 and
x_n > 0 for all n. In this case we first show that if 0 is in the closure of the
interior of AS(x_n), then AS(x_n) has dense interior. Let x ∈ AS(x_n), and fix ε > 0.
Since x ∈ AS(x_n), there is a subsequence of (x_n) whose terms sum to x. Therefore
we can find a finite sum x_{n_1} + · · · + x_{n_k} such that
x − ε < x_{n_1} + · · · + x_{n_k} ≤ x. Let δ = min{ε, x_{n_1}, . . . , x_{n_k}}.
Since 0 is in the closure of the interior of AS(x_n), we can find a and b so that
0 < a < b < δ and (a, b) ⊆ AS(x_n). Notice that every element of (a, b) can be written
as the sum of a subsequence of (x_n), but the terms x_{n_1}, . . . , x_{n_k} will not be
used in any of these sums, because they are too large. It follows that
x_{n_1} + · · · + x_{n_k} + (a, b) ⊆ AS(x_n). But
x_{n_1} + · · · + x_{n_k} + (a, b) ⊆ (x − ε, x + ε).
Since ε was arbitrary, this shows that x is in the closure of the interior of AS(x_n).
Now we show that if 0 is not in the closure of the interior of AS(x_n), then AS(x_n)
is meager. In this case there is some ε > 0 such that [0, ε) contains no elements of the
interior of AS(x_n). Therefore AS(x_n) is not a high achiever, and it follows from
Theorem 1.1 that ∑_{n=1}^∞ x_n converges. Hence we can choose some N such that
∑_{n=N}^∞ x_n < ε. Since AS(x_N, x_{N+1}, . . .) ⊆ AS(x_n) ∩ [0, ε), we have that
AS(x_N, x_{N+1}, . . .) has empty interior, and moreover by Theorem 2.2 it is closed and
thus is nowhere dense. Now

\[ \mathrm{AS}(x_n) = \mathrm{AS}(x_1, \ldots, x_{N-1}) + \mathrm{AS}(x_N, x_{N+1}, \ldots), \]

and since AS(x_1, . . . , x_{N−1}) is finite, AS(x_n) is a finite union of nowhere dense sets,
and hence is meager.

If we restrict our attention to certain sequences, we can recover a dichotomy
stronger than that given in Theorem 3.1. For instance, AS(1/c^n) is an interval if
1 < c ≤ 2 (Corollary 1.5), and a central Cantor set for c > 2 (Theorem 2.1). In fact,
we can generalize this remark:

Proposition 3.3. Let (x_n) be a real sequence, and suppose lim_{n→∞} |x_{n+1}/x_n|
exists and equals L. Then AS(x_n) is a finite union of closed intervals if 1/2 < L < 1
and a finite union of central Cantor sets if 0 ≤ L < 1/2.

Proof. Note that since L < 1, lim_{n→∞} x_n = 0. If 1/2 < L < 1, then for some n_0 > 0 we
have |x_{n+1}|/|x_n| > 1/2 for all n ≥ n_0. Hence by Corollary 1.5, we have that
AS(x_{n_0}, x_{n_0+1}, x_{n_0+2}, . . .) is a closed interval. It then follows from the
decomposition

\[ \mathrm{AS}(x_n) = \mathrm{AS}(x_1, \ldots, x_{n_0-1}) + \mathrm{AS}(x_{n_0}, x_{n_0+1}, x_{n_0+2}, \ldots) \]

that AS(x_n) is the union of a finite number of translates of a closed interval.

If 0 ≤ L < 1/2, then for some n_0 > 0, we have |x_{n+1}|/|x_n| < 1/2 for all n ≥ n_0.
Thus for any i ≥ 1 and any n ≥ n_0, we have |x_{n+i}| < (1/2^i)|x_n|. Therefore

\[ \sum_{i=1}^{\infty} |x_{n+i}| < \sum_{i=1}^{\infty} \frac{1}{2^i} |x_n| = |x_n| \sum_{i=1}^{\infty} \frac{1}{2^i} = |x_n|. \]

It now follows from Theorem 2.1 that AS(x_{n_0}, x_{n_0+1}, x_{n_0+2}, . . .) is a central
Cantor set. Therefore AS(x_n) is a finite union of central Cantor sets.



We can extend Proposition 3.3 in a few ways. If L > 1, then no infinite subsequence
of (x_n) can have a convergent sum, and thus AS(x_n) is countable. In addition, if L = 1
and lim_{n→∞} x_n = 0, then AS(x_n) is a finite union of closed intervals by the same
argument used in the case 1/2 < L < 1. However, if L = 1/2 or if L = 1 and
lim_{n→∞} x_n ≠ 0, many behaviors are possible.
To illustrate the variety of behaviors possible when L = 1/2, consider the three
sequences (1/2^n − 1/3^n), (1/2^n + 1/3^n), and (1/2^n + 1/(−3)^n). One can easily verify
that the first satisfies, for each k, condition (1) of Theorem 1.1, namely
|x_k| ≤ ∑_{n=k+1}^∞ |x_n|. Thus its achievement set is a closed interval. Similarly, the
second satisfies, for each k, condition (5) of Theorem 2.1, namely
|x_k| > ∑_{n=k+1}^∞ |x_n|. Thus its achievement set is a Cantor set.
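The inequalities quoted for these three sequences can be checked exactly for small k. The following Python sketch is illustrative and not part of the original article: the helper names `tail_geometric` and `condition` are hypothetical, and the check relies on the closed form of a geometric tail together with the observation that every term of all three sequences is positive (because 1/2^n > 1/3^n), so that |x_n| = x_n.

```python
from fractions import Fraction

def tail_geometric(ratio, k):
    """Exact value of sum_{n=k+1}^infinity ratio**n, valid for |ratio| < 1."""
    return ratio ** (k + 1) / (1 - ratio)

def condition(c, r, k):
    """Classify index k for the sequence x_n = (1/2)**n + c * r**n.

    Returns "(1)" when |x_k| <= sum_{n>k} |x_n| and "(5)" when |x_k| > that sum.
    Assumes all terms are positive, which holds for c = +-1 and r = +-1/3.
    """
    half = Fraction(1, 2)
    x_k = half ** k + c * r ** k
    assert x_k > 0  # positivity lets us drop the absolute values
    tail = tail_geometric(half, k) + c * tail_geometric(r, k)
    return "(1)" if x_k <= tail else "(5)"

third = Fraction(1, 3)
first_sequence = [condition(-1, third, k) for k in range(1, 11)]   # 1/2^n - 1/3^n
second_sequence = [condition(1, third, k) for k in range(1, 11)]   # 1/2^n + 1/3^n
third_sequence = [condition(1, -third, k) for k in range(1, 11)]   # 1/2^n + 1/(-3)^n
```

Under these assumptions the first list contains only "(1)", the second only "(5)", and the third alternates, starting with "(1)" at k = 1.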
However, the third sequence has a mysterious achievement set. The sequence
satisfies (1) for k odd and (5) for k even, and we give an intuitive description of how
this alternation affects the achievement set. Recall the sets C_k as defined in (8). For a
general sequence (x_n), each C_k consists of 2^k not necessarily disjoint intervals, while
C_{k+1} is formed by splitting each interval of C_k into two not necessarily disjoint
intervals, which we refer to here as “new intervals.” If (1) holds for k then each pair of
new intervals is overlapping, while if (5) holds for k then each pair of new intervals is
disjoint. Each pair of new intervals is disjoint when k is even and overlapping when k
is odd. Because C_k is the union of all new intervals at stage k, the gaps that are
introduced when k is even may be covered by the overlap of the previous stage. See Figure
1 for an illustration.

Figure 1. Intervals in three successive stage approximations to (1/2^n + 1/(−3)^n): an
interval of stage 2m, a pair of new intervals of stage 2m + 1, and four new intervals of
stage 2m + 2.

This interaction appears to be complicated. In particular, it is not clear if any
intervals survive all stages unpunctured (though it is easy to see that AS(x_n) is not
itself an interval).
 
Question 1. Let (x_n) = (1/2^n + 1/(−3)^n). Does AS(x_n) have empty interior?

Note that from the proof of Theorem 3.1, one sees that to answer Question 1, it
is enough to determine whether 0 is in the closure of the interior of AS(x_n). In other
words, does there exist ε > 0 such that AS(x_n) ∩ [0, ε] contains no elements of the
interior of AS(x_n)?
In the same vein as Question 1, we pose the following:

Question 2. Does there exist a sequence (xn ) such that limn→∞ |xn+1 /xn | exists and
AS(xn ) contains an interval but is not a union of intervals?

4. ACHIEVABLE SETS. Call a subset of R achievable if it can be obtained as
AS(x_n) for some real sequence (x_n). In this section we give some examples of classes
of achievable sets, and examine the achievability of some well-known sets, such as Q.
As a simple starting example, any closed interval with 0 as an endpoint is
achievable: for r ∈ R, we have AS(r/2^n) = [0, r] (or [r, 0] if r < 0). Now let s ∈ R, and
consider the sequence (x_n) = (s, r/2, r/2^2, . . .). Then AS(x_n) = [0, r] ∪ [s, s + r], with
appropriate alterations made for r < 0. More generally if S is any achievable set and
r ∈ R, then both ⋃_{s∈S} [s, s + r] and rS = {rs : s ∈ S} are achievable.
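As a rough numerical illustration of the example AS(s, r/2, r/2^2, . . .) = [0, r] ∪ [s, s + r], one can enumerate the subset sums of a finite truncation. The sketch below is not from the article: the names `truncated_sequence` and `subset_sums` are hypothetical, the empty sum 0 is assumed to count as an element of the achievement set, and a truncation only yields a finite grid inside the two intervals rather than the intervals themselves.

```python
from fractions import Fraction

def truncated_sequence(s, r, depth):
    """First depth + 1 terms of (s, r/2, r/2^2, ...), as exact fractions."""
    return [Fraction(s)] + [Fraction(r) / 2 ** n for n in range(1, depth + 1)]

def subset_sums(terms):
    """All sums of finite sub-collections of terms (the empty sum 0 included)."""
    sums = {Fraction(0)}
    for t in terms:
        # The new set is built completely before it is merged into sums.
        sums |= {partial + t for partial in sums}
    return sums

s, r, depth = 3, 1, 10
sums = subset_sums(truncated_sequence(s, r, depth))

# Every finite sum falls in [0, r] or in [s, s + r].
in_union = all(0 <= v <= r or s <= v <= s + r for v in sums)
```

With s = 3 and r = 1 the sums that avoid s are exactly the multiples of 1/2^10 in [0, 1), which become dense in [0, 1] as the truncation depth grows.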
Consider now a central Cantor set C whose original interval has its left endpoint at
0. To specify such a set, one needs only the length L of the original interval and the
length a_n of each of the central intervals removed at stage n. If (x_n) satisfies
|x_n| > ∑_{k=n+1}^∞ |x_k| for each n, Theorem 2.1 shows that AS(x_n) is a central Cantor
set and the length of each of the 2^n removed intervals at stage n is
|x_n| − ∑_{k=n+1}^∞ |x_k|. Thus one constructs a sequence (x_n) with AS(x_n) = C by
taking x_1 = (1/2)(L + a_1) and

\[ x_n = \frac{L + 2^{n-1} a_n - \sum_{k=1}^{n-1} 2^{k-1} a_k}{2^n} \]

for each n ≥ 2. It is straightforward to check that x_1 + x_2 + · · · = L and
x_n − x_{n+1} − x_{n+2} − · · · = a_n for each n.
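The construction can be tried on the classical middle-thirds Cantor set, for which L = 1 and a_n = 1/3^n. The Python sketch below is only an illustration under that assumption; the function name `cantor_sequence` is hypothetical. It evaluates the formula for x_n with exact rational arithmetic and compares x_n − (L − (x_1 + · · · + x_n)) with a_n, which is the stated identity x_n − x_{n+1} − x_{n+2} − · · · = a_n once the full series is known to sum to L.

```python
from fractions import Fraction

def cantor_sequence(length, gaps):
    """Terms x_1..x_N built from the interval length L and gap lengths a_1..a_N.

    Uses x_n = (L + 2^(n-1) a_n - sum_{k<n} 2^(k-1) a_k) / 2^n, which for
    n = 1 reduces to x_1 = (L + a_1) / 2.
    """
    xs = []
    weighted = Fraction(0)  # running value of sum_{k<n} 2^(k-1) a_k
    for n, a_n in enumerate(gaps, start=1):
        xs.append((length + 2 ** (n - 1) * a_n - weighted) / 2 ** n)
        weighted += 2 ** (n - 1) * a_n
    return xs

L = Fraction(1)
gaps = [Fraction(1, 3 ** n) for n in range(1, 13)]
xs = cantor_sequence(L, gaps)

# Gap identity: x_n minus the remaining tail L - (x_1 + ... + x_n) equals a_n.
partial = Fraction(0)
recovered_gaps = []
for x in xs:
    partial += x
    recovered_gaps.append(x - (L - partial))
```

For these inputs the computed terms are x_n = 2/3^n, the familiar sequence whose subsums give the middle-thirds Cantor set.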
We now give some properties of achievable sets. Theorem 2.2 shows that any
bounded achievable set must be closed, and Theorem 3.1 shows an achievable set
must be meager or have dense interior. Any bounded achievable set must also be
symmetric about its midpoint. Indeed, such a set S is the achievement set of (x_n),
where ∑_{n=1}^∞ x_n converges absolutely to r. If s is in AS(x_n), then s = ∑_{i∈I} x_i
for some I ⊆ Z+. Letting J = Z+ \ I, we have r − s = ∑_{j∈J} x_j, which is in AS(x_n),
showing that AS(x_n) is symmetric about r/2.
Our next two results furnish additional properties of achievable sets.

Proposition 4.1. Let (xn ) be an infinite real sequence. Then AS(xn ) is uncountable if
and only if (xn ) has a subsequence converging to 0.

Proof. Suppose first that (x_n) contains a subsequence converging to 0. Without loss of
generality we may assume that x_n → 0; we show that AS(x_n) is uncountable.
If there is a k_0 such that whenever k > k_0 we have

\[ |x_k| \le \sum_{n=k+1}^{\infty} |x_n|, \tag{10} \]

then it follows from Theorem 1.1 that AS(x_n) contains an interval, and is thus
uncountable. If there is no k_0 such that (10) is satisfied for k > k_0, then there must be
a sequence k_1, k_2, . . . such that for each k_j,

\[ |x_{k_j}| > \sum_{n=k_j+1}^{\infty} |x_n| \ge \sum_{i=j+1}^{\infty} |x_{k_i}|. \]

By Theorem 2.1, AS(x_{k_j}) is a central Cantor set, and thus uncountable; since
AS(x_{k_j}) ⊆ AS(x_n), the set AS(x_n) is uncountable as well.


Now suppose that (xn ) contains no subsequence converging to 0. Then no infinite
sum of terms can converge, so all elements of AS(xn ) are finite sums of terms. Hence
AS(xn ) is countable.

Proposition 4.2. If AS(xn ) is uncountable, then it is without isolated points.

Proof. Note that by Proposition 4.1 there is a subsequence (x_{n_j}) whose limit is 0.
Let s = ∑_{i∈I} x_i ∈ AS(x_n). If I is finite, let k be its greatest element and let l be
minimal such that n_l > k. Then s + x_{n_l}, s + x_{n_{l+1}}, s + x_{n_{l+2}}, . . . is an
infinite sequence of elements of AS(x_n) converging to s. If I is infinite, the partial
sums of ∑_{i∈I} x_i form an infinite sequence of elements of AS(x_n) converging to s.
Therefore AS(x_n) has no isolated points.

With these properties of achievable sets now established, we can examine the
achievability of certain well-known sets.

Corollary 4.3. If S ⊂ R is a countable set of nonnegative numbers having 0 as an
accumulation point, then S is not achievable. In particular, the set of nonnegative
rational numbers Q+ is not achievable.

Proof. Suppose AS(x_n) = S. Clearly (x_n) can have no negative terms. Thus if x_n > ε
for all n and some ε > 0, then AS(x_n) ∩ (0, ε) is empty, which contradicts the fact that
0 is an accumulation point of S. Hence (x_n) must have a subsequence converging to 0.
By Proposition 4.1, AS(x_n) is then uncountable, and we have a contradiction.

It is interesting to note that if we adjoin a single negative number to Q+ the resulting
set is achievable. For instance, let q_1, q_2, . . . be an enumeration of Q+. Set x_1 = −1
and for n ≥ 2, let x_n = 1 + q_{n−1}. Clearly Q+ ⊆ AS(x_n) and −1 ∈ AS(x_n). But the
terms of any finite subsequence of (x_n) sum to −1 or a nonnegative rational number,
while the terms of any infinite subsequence form a divergent series. Hence AS(x_n) =
{−1} ∪ Q+. Using similar reasoning one can show that if G is any countably infinite
additive subgroup of R and g ∈ G^+, then {−g} ∪ G^+ is achievable.
The full set Q of rationals is also achievable. Let (xn ) be an enumeration of the
rationals with absolute value at least 1. Since (xn ) has no subsequences with limit
0, no infinite sum of terms can converge. Finite sums of terms are rational numbers,
so AS(xn ) ⊆ Q. On the other hand, clearly AS(xn ) contains all rationals of absolute
value at least one. If q ∈ Q ∩ (−1, 1), then we have 2 + (q − 2) ∈ AS(xn ). Thus
AS(xn ) = Q. A similar result can be shown for any countably infinite subgroup of R.
Let us now consider I = {r ∈ R | r irrational} ∪ {0}. The interior of I, being empty,
cannot be dense in I. Hence if I is achievable, then by Theorem 3.1 it must be meager.
To see that this cannot be the case, note that the rationals are meager, and since unions
of meager sets are again meager, we have that R is meager. But complete metric spaces
cannot be meager by the Baire category theorem. A similar argument applies to the
positive irrationals I+ .




ACKNOWLEDGMENTS. I am very grateful to Dan Velleman for his interest in this project, and his contri-
butions of Theorem 3.1 and the examples at the end of Section 2 and the beginning of Section 3. These have
enriched the paper substantially. I would also like to thank the anonymous referees for valuable suggestions.
The author’s research was partially supported by NSF grant DMS-0852826.

REFERENCES

1. R. André-Jeannin, Irrationalité de la somme des inverses de certaines suites récurrentes, C. R. Acad. Sci.
Paris Sér. I Math. 308 (1989) 539–541.
2. J. L. Brown Jr., Note on complete sequences of integers, Amer. Math. Monthly 68 (1961) 557–560. doi:
10.2307/2311150
3. C. Cabrelli, K. Hare, and U. Molter, Sums of Cantor sets yielding an interval, J. Aust. Math. Soc. 73
(2002) 405–418. doi:10.1017/S1446788700009058
4. V. E. Hoggatt and C. King, Problem E 1424, Amer. Math. Monthly 67 (1960) 593; Solution by J. Silver,
68 (1961) 179–180. doi:10.2307/2312499
5. H. Hornich, Über beliebige Teilsummen absolut konvergenter Reihen, Monatsh. Math. Phys. 49 (1941)
316–320. doi:10.1007/BF01707309
6. S. Kakeya, On the partial sums of an infinite series, Sci. Rep. Tôhoku Imperial Univ. 3 (1914) 159–163.
7. M. Morán, Fractal series, Mathematika 36 (1989) 334–348. doi:10.1112/S0025579300013176
8. M. Morán, Dimension functions for fractal sets associated to series, Proc. Amer. Math. Soc. 120 (1994)
749–754.
9. T. Nagell, Introduction to Number Theory, John Wiley, New York, 1951.
10. P. Ribenboim, My Numbers, My Friends, Springer-Verlag, New York, 2000.

RAFE JONES received his B.A. from Amherst College in 1998 and his Ph.D. from Brown University in
2005 under the direction of Joseph Silverman. He currently teaches at the College of the Holy Cross. While
his mathematical interests lie mainly in arithmetic questions related to the iteration of rational functions, he is
never averse to a good side project. When not doing mathematics, he enjoys running, cycling, speaking French,
and the occasional computer-based distraction. These pastimes are not new; he hopes that none of his Amherst
College professors noticed his world ranking (at the time) in Minesweeper.
Department of Mathematics and Computer Science, College of the Holy Cross, Worcester, MA, 01610



Polynomials, Ellipses, and Matrices:
Two Questions, One Answer
Pamela Gorkin and Elizabeth Skubak

Abstract. Consider the following questions for points a_1, a_2 in the unit disc, D. If q(z) =
(z − a_1)(z − a_2), when is q the derivative of a polynomial with all of its zeros on the unit
circle, ∂D? If an ellipse E with foci a_1, a_2 is inscribed in a triangle with vertices on ∂D,
when is E tangent at the midpoints to a triangle with vertices on ∂D? We show that these
problems are essentially the same. In fact, the answer to both is very simple: if and only if
2|a_1 a_2| = |a_1 + a_2|. We also discuss generalizations of these problems and their solutions.

1. INTRODUCTION. Given a polynomial of degree n with zeros in the open unit
disc D or on the unit circle ∂D, where are the zeros of its derivative? This appears to
be a perfect problem for an undergraduate: take a polynomial with zeros in the closed
unit disc and see where the critical points are in relation to the zeros. Experiments
show that the closed disc of radius one centered at any zero always contains a critical
point [30]. Blagovest Sendov noticed this as early as 1959, but thus far no one has been
able to prove or disprove this result; it is now known as the Sendov conjecture [21]. In
2001, Sendov provided an overview of the problem and related questions [22].
While T. Sheil-Small notes [23, p. 206] that “odds of 4-1 in favour of the conjecture
are cautious odds in the circumstances,” this 50-year-old conjecture has been proven
only in some special cases. Thus, what appears to be an interesting problem for an
undergraduate is actually a difficult and challenging research problem.
One case in which the conjecture has been proven [21, Section 2.4] attracted our
attention: What if all the zeros of the polynomial lie on ∂D? This is, in some sense,
an extreme form of the problem. What happens when we turn the problem around? In
other words: Given a polynomial with zeros in D, when is it the derivative of a poly-
nomial with all of its zeros on ∂D? This question not only has a simple and beautiful
answer, it also has surprising connections to two other problems.
Let us turn to a seemingly unrelated problem—one with a geometric flavor. Given
a triangle, there exists an ellipse that can be inscribed in the triangle such that the
points of tangency lie at the midpoints of the sides of the triangle. This result is due to
Steiner [27]; the ellipse is often called the Steiner inellipse (see Figure 1.1). Siebeck
[24] showed that given a degree-3 polynomial with noncollinear zeros z_1, z_2, z_3, the
critical points are the foci of the Steiner inellipse of △z_1 z_2 z_3.

Figure 1.1. A Steiner inellipse.

doi:10.4169/amer.math.monthly.118.06.522




Daepp, Gorkin, and Mortini in [2] discuss another ellipse. Given a_1, a_2 ∈ D and a
function b, called a Blaschke product, defined by

\[ b(z) = z \left(\frac{z - a_1}{1 - \overline{a_1} z}\right) \left(\frac{z - a_2}{1 - \overline{a_2} z}\right), \tag{1.1} \]

there is a unique associated inscribed ellipse: Beginning at an arbitrary point z_1 ∈
∂D, there are two other points z_2, z_3 ∈ ∂D such that b(z_1) = b(z_2) = b(z_3). The main
result in [2] showed that the triangle with vertices z_1, z_2, z_3 circumscribes the ellipse
E defined by |z − a_1| + |z − a_2| = |1 − \overline{a_1} a_2| (see Figure 1.2). Thus, every
Blaschke product of the form described in (1.1) is associated with an ellipse. But most of
these Blaschke ellipses will not be tangent at the midpoints to a triangle with vertices on
∂D; that is, most of these ellipses will not be Steiner inellipses for any triangle with
vertices on ∂D. So, we have our second question: Which Blaschke ellipses are Steiner
inellipses for some triangle inscribed in ∂D?

Figure 1.2. A Blaschke ellipse with foci a_1, a_2.

Evidently, we have two very different questions, but there are obvious similarities:
Both lift an object that is, in some sense, of degree two to an object of degree three.
Both begin with a description depending on values inside D, and end with an object de-
scribed by values on ∂D. We were still surprised to discover that they are (more or less)
the same question. We obtain necessary and sufficient conditions for our questions to
have positive answers by proving the following:

Theorem 1.1. Let a_1, a_2 ∈ D, and let b be the degree-3 Blaschke product

\[ b(z) = z \left(\frac{z - a_1}{1 - \overline{a_1} z}\right) \left(\frac{z - a_2}{1 - \overline{a_2} z}\right). \]

The following are equivalent:

(1) There is a cubic polynomial p with zeros on ∂D and critical points a_1, a_2.
(2) The Blaschke ellipse associated with b is a Steiner inellipse for a triangle
inscribed in ∂D.
(3) The points a_1 and a_2 satisfy 2|a_1 a_2| = |a_1 + a_2|.

We conclude the paper with a discussion of extensions of these results to
general-degree polynomials and curves inscribed in polygons, as well as a discussion of a
connection to a result about matrices.



2. THE ELLIPSES.

2.1. Steiner Inellipses. Our study begins with an ellipse named after the Swiss math-
ematician Jakob Steiner (1796–1863). (See, for example, [18].)

Theorem 2.1 (Steiner). Given a triangle T = △z_1 z_2 z_3, there is a unique ellipse
inscribed in T tangent at the midpoints of its sides. The foci of the ellipse are

\[ \frac{1}{3}(z_1 + z_2 + z_3) \pm \sqrt{\left(\frac{1}{3}(z_1 + z_2 + z_3)\right)^2 - \frac{1}{3}(z_1 z_2 + z_1 z_3 + z_2 z_3)}. \tag{2.1} \]

We call this unique ellipse the Steiner inellipse (see Figure 1.1). Siebeck’s theorem
tells us more about the nature of the foci of the ellipse.

Theorem 2.2 (Siebeck). Given T = △z_1 z_2 z_3, consider p(z) = (z − z_1)(z − z_2)(z − z_3).
Then the roots of p′ are the foci of the Steiner inellipse of △z_1 z_2 z_3.
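A quick numerical sanity check of Theorems 2.1 and 2.2 is possible: if the points given by formula (2.1) really are the foci of an ellipse through the three midpoints, then the sum of the distances from each midpoint to the two foci must be the same. The Python sketch below is illustrative only; the function names `steiner_foci` and `focal_distance_sums` and the sample triangle are assumptions, not taken from the article.

```python
import cmath

def steiner_foci(z1, z2, z3):
    """Foci of the Steiner inellipse of the triangle z1 z2 z3, via formula (2.1)."""
    centroid = (z1 + z2 + z3) / 3
    root = cmath.sqrt(centroid * centroid - (z1 * z2 + z1 * z3 + z2 * z3) / 3)
    return centroid + root, centroid - root

def focal_distance_sums(z1, z2, z3):
    """Sum of distances to the two foci, evaluated at each side's midpoint."""
    f1, f2 = steiner_foci(z1, z2, z3)
    midpoints = [(z1 + z2) / 2, (z2 + z3) / 2, (z1 + z3) / 2]
    return [abs(m - f1) + abs(m - f2) for m in midpoints]

# Hypothetical test triangle with vertices 0, 4, and 3i.
sums = focal_distance_sums(0, 4, 3j)
spread = max(sums) - min(sums)
```

The three sums agree up to rounding error, as expected for points on a common ellipse with those foci. This checks only that the midpoints lie on such an ellipse, not the tangency itself.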

We call a Steiner inellipse a unit Steiner inellipse if it is the Steiner inellipse of
some triangle with vertices on ∂D. Conversely, given a Steiner inellipse E, we call a
triangle that circumscribes E, is tangent at the midpoints, and has vertices on ∂D a unit
Steiner triangle of E. We will show (in Theorem 2.10) that a unit Steiner inellipse has
a unique unit Steiner triangle, unless the inellipse is a circle.

2.2. Blaschke Ellipses. Another family of ellipses was discovered during the investi-
gation of finite Blaschke products, which are rational functions of a special type.

Definition 2.3. A finite Blaschke product of degree n is a function defined by

\[ b(z) = \beta \prod_{j=1}^{n} \frac{z - a_j}{1 - \overline{a_j} z}, \]

where |β| = 1 and a_j ∈ D for j = 1, . . . , n.

Finite Blaschke products map D to itself, ∂D to itself, and map points outside D
back outside D [6, p. 5]. Furthermore, the finite Blaschke product b is an n-to-one map
on ∂D; that is, if λ ∈ ∂D, then b maps exactly n distinct points of ∂D to λ, as shown in
[2]. We will say that b identifies n points if it maps them all to the same value.
A Blaschke 3-ellipse, or, simply, Blaschke ellipse, is a curve associated with a given
degree-3 Blaschke product; it is described by the following theorem [2]:

Theorem 2.4. Consider a Blaschke product b of the form

\[ b(z) = z \left(\frac{z - a_1}{1 - \overline{a_1} z}\right) \left(\frac{z - a_2}{1 - \overline{a_2} z}\right). \tag{2.2} \]

For λ ∈ ∂D, let z_1, z_2, z_3 ∈ ∂D denote the distinct points mapped to λ under b. Write
the partial fraction decomposition

\[ F(z) = \frac{b(z)/z}{b(z) - \lambda} = \frac{m_1}{z - z_1} + \frac{m_2}{z - z_2} + \frac{m_3}{z - z_3}. \]

Then m_1, m_2, m_3 > 0 and m_1 + m_2 + m_3 = 1, and the line joining z_1 and z_2 is
tangent to the ellipse E : |w − a_1| + |w − a_2| = |1 − \overline{a_1} a_2| at the point
ζ_3 = (m_1 z_2 + m_2 z_1)/(m_1 + m_2). Further, the other two sides of △z_1 z_2 z_3 are
tangent to the ellipse at points ζ_1 and ζ_2 that are defined similarly.
Conversely, each point of E is the point of tangency with E of a line that passes
through two distinct points z_1 and z_2 on ∂D for which b(z_1) = b(z_2).

The theorem in [2] assumes 0, a1 , a2 are distinct, but a continuity argument shows
that the result holds in general as long as b(0) = 0. Our Blaschke products will always
be in the form (2.2).
This theorem should be compared with Marden’s theorem [17, p. 9], which deals
with more general F but not the connection to Blaschke products. In addition, the
reader might be interested in a follow-up on Marden’s theorem, “The most marvelous
theorem in mathematics,” by D. Kalman (see [14] and [15]).
For any ellipse associated with a Blaschke product b, we may start with any point
z ∈ ∂D and let λ = b(z) to see from Theorem 2.4 that there exists a triangle circum-
scribing E with one vertex at z and all vertices on ∂D (Figure 1.2). In fact, if any ellipse
is inscribed in one triangle with vertices on the unit circle, it is inscribed in infinitely
many. This beautiful result is often called a porism, and it is due to Jean Poncelet.
More information about Poncelet’s porism can be found in a recent book by Flatto [4].
Our circumscribing triangles are similar to unit Steiner triangles, but are more gen-
eral; the points of tangency of the ellipse need not be the midpoints of the sides. The
equivalence of (1) and (2) in the main theorem will use the Poncelet property via the
following result of Frantz [5, Prop. 3]. We recommend [8], [9], and [31] for related
work and more about connections to Poncelet’s porism.

Lemma 2.5. The ellipses that can be inscribed in triangles with vertices on ∂D are
precisely the Blaschke 3-ellipses.

In honor of Poncelet’s theorem, we name these ellipses Poncelet ellipses.

Figure 2.1. A Blaschke ellipse that is also a unit Steiner inellipse, with foci a_1 = (−1 − 3i)/4 and a_2 = 1/2.

2.3. Proof of the Main Theorem. We’ll first need the following lemma.

Lemma 2.6. Suppose we are given a triangle with vertices z 1 , z 2 , z 3 ∈ ∂D. Denote
the foci of the Steiner inellipse of this triangle by a1 , a2 . Then 2|a1 a2 | = |a1 + a2 |.

Proof. Theorem 2.2 implies that there exists a polynomial p with zeros z_1, z_2, z_3 and
critical points a_1, a_2. That is, [(z − z_1)(z − z_2)(z − z_3)]′ = 3(z − a_1)(z − a_2). Thus
(z_1 + z_2 + z_3)/3 = (a_1 + a_2)/2 and z_1 z_2 + z_1 z_3 + z_2 z_3 = 3 a_1 a_2, by comparing
coefficients. Multiplying the latter by \overline{z_1 z_2 z_3} and recalling that
z_j \overline{z_j} = 1, we obtain
\overline{z_1} + \overline{z_2} + \overline{z_3} = 3 a_1 a_2 \overline{z_1 z_2 z_3}. Thus
\overline{z_1 z_2 z_3} a_1 a_2 = \overline{(a_1 + a_2)}/2, and the desired equation
follows.
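Lemma 2.6 lends itself to a direct numerical check that avoids solving for the critical points: comparing coefficients as in the proof gives a_1 + a_2 = 2(z_1 + z_2 + z_3)/3 and a_1 a_2 = (z_1 z_2 + z_1 z_3 + z_2 z_3)/3. The Python sketch below uses hypothetical helper names and arbitrarily chosen angles, and evaluates the gap between 2|a_1 a_2| and |a_1 + a_2| for triangles inscribed in the unit circle.

```python
import cmath

def critical_point_symmetric_functions(angles):
    """Return (a1 + a2, a1 * a2) for the cubic with zeros exp(i t), t in angles."""
    z1, z2, z3 = (cmath.exp(1j * t) for t in angles)
    s1 = z1 + z2 + z3
    s2 = z1 * z2 + z1 * z3 + z2 * z3
    # p'(z) = 3 z^2 - 2 s1 z + s2 = 3 (z - a1)(z - a2)
    return 2 * s1 / 3, s2 / 3

def lemma_gap(angles):
    """Absolute difference between 2|a1 a2| and |a1 + a2|."""
    total, product = critical_point_symmetric_functions(angles)
    return abs(2 * abs(product) - abs(total))

# Arbitrary sample triangles on the unit circle (angles in radians).
gaps = [lemma_gap(t) for t in [(0.1, 1.7, 4.0), (0.0, 2.0, 2.5), (1.0, 3.0, 5.5)]]
```

Each gap is zero up to floating-point rounding, in line with the lemma.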

Now we are ready for the proof of our main theorem.

Proof of Theorem 1.1. First, suppose that (1) holds. Then Theorem 2.2 implies that
the critical points of p are the foci of the ellipse inscribed in the triangle △z_1 z_2 z_3 at
the midpoints. By Lemma 2.6, 2|a_1 a_2| = |a_1 + a_2|, and (3) holds.
To see that (3) implies (1), assume (3). If a_1 = 0 or a_2 = 0, then a_1 = a_2 = 0 and
z^3 − 1, for example, satisfies (1). If both are nonzero, let
λ = (a_1 + a_2)/(2 \overline{a_1} \overline{a_2}) and note that, by (3), we have |λ| = 1. By
the three-to-one property of the Blaschke product b, there are three distinct points
z_1, z_2, z_3 ∈ ∂D with b(z_j) = λ.
Write Q(z) := b(z) − λ. Then z_1, z_2, z_3 must be the three zeros of Q. Now

\[ Q(z) = z \left(\frac{z - a_1}{1 - \overline{a_1} z}\right) \left(\frac{z - a_2}{1 - \overline{a_2} z}\right) - \lambda \]

is a rational function, but since the poles of Q are outside the closed unit disc, to find
the zeros of Q we need only consider the numerator of Q. Thus

\[ q(z) = z(z - a_1)(z - a_2) - \lambda(1 - \overline{a_1} z)(1 - \overline{a_2} z) \]

is the monic polynomial with zeros at z_1, z_2, z_3.
Expanding and using λ = 1/\overline{λ} = 2 a_1 a_2/(\overline{a_1} + \overline{a_2}), we see that

\[ q(z) = z^3 - (a_1 + a_2 + \lambda \overline{a_1}\,\overline{a_2}) z^2 + \left(a_1 a_2 + \lambda(\overline{a_1} + \overline{a_2})\right) z - \lambda = z^3 - \frac{3}{2}(a_1 + a_2) z^2 + 3 a_1 a_2 z - \lambda. \]

So q′(z) = 3(z^2 − (a_1 + a_2)z + a_1 a_2) = 3(z − a_1)(z − a_2) and q satisfies (1).
Now we show that (1) and (2) are equivalent: Suppose (1) holds. By Theorem 2.2,
a_1 and a_2 are the foci of a unit Steiner, and hence Poncelet, inellipse E_1. Lemma 2.5
implies that E_1 is a Blaschke 3-ellipse. Since the Blaschke product has zeros at 0, a_1,
and a_2, we see that E_1 is the Blaschke ellipse associated with b. Thus, the Blaschke
ellipse associated with b is a unit Steiner inellipse and (2) holds. The converse, (2)
implies (1), follows from Theorem 2.2.
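The step from (3) to (1) can be followed numerically for the foci a_1 = (−1 − 3i)/4 and a_2 = 1/2 of Figure 2.1. The Python sketch below is an illustration rather than the authors' code: it assumes that the Blaschke factors carry complex conjugates in their denominators, so that λ = (a_1 + a_2)/(2 \overline{a_1} \overline{a_2}), and it finds the zeros of q with a hand-written Cardano solver (the function names are hypothetical).

```python
import cmath

def cubic_roots(c2, c1, c0):
    """Roots of z^3 + c2 z^2 + c1 z + c0 by Cardano's formula.

    Assumes the cubic has no triple root, so that u below is nonzero.
    """
    p = c1 - c2 * c2 / 3
    q0 = 2 * c2 ** 3 / 27 - c2 * c1 / 3 + c0
    disc = cmath.sqrt(q0 * q0 / 4 + p ** 3 / 27)
    # Take the branch of larger modulus to keep u away from zero.
    w = max(-q0 / 2 + disc, -q0 / 2 - disc, key=abs)
    u = w ** (1 / 3)
    omega = cmath.exp(2j * cmath.pi / 3)
    roots = []
    for k in range(3):
        u_k = u * omega ** k
        roots.append(u_k - p / (3 * u_k) - c2 / 3)
    return roots

def cubic_with_critical_points(a1, a2):
    """Return lambda and the coefficients (c2, c1, c0) of q from the proof."""
    a1, a2 = complex(a1), complex(a2)
    lam = (a1 + a2) / (2 * a1.conjugate() * a2.conjugate())
    # q(z) = z^3 - (3/2)(a1 + a2) z^2 + 3 a1 a2 z - lambda
    return lam, (-1.5 * (a1 + a2), 3 * a1 * a2, -lam)

lam, coeffs = cubic_with_critical_points((-1 - 3j) / 4, 0.5)
zeros = cubic_roots(*coeffs)
moduli = [abs(z) for z in zeros]
```

For these foci λ = −1, and the three computed zeros have modulus 1 up to rounding, so they are the vertices of a triangle inscribed in ∂D.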

Remark 2.7. Suppose a_1, a_2 satisfy (2) and (3) of Theorem 1.1 and we wish to
explicitly find a polynomial as in (1). In the proof above, q has constant term −λ =
−(a_1 + a_2)/(2 \overline{a_1} \overline{a_2}), so that, if a_1, a_2 ≠ 0, to find such a
polynomial we can simply integrate 3(z − a_1)(z − a_2) with an integration constant of −λ.
If a_1 or a_2 is 0, as mentioned in the proof, we may take z^3 + γ for any γ ∈ ∂D.

2.4. Some Consequences. We now describe explicitly the connection between
Blaschke products, polynomials with zeros on ∂D, and unit Steiner inellipses. (See
also [8, Theorem 2.1].)

Theorem 2.8. An ellipse E is a unit Steiner inellipse if and only if there exists λ ∈ ∂D
and a degree-3 Blaschke product b of the form (2.2) such that E is the Blaschke ellipse
of b, and if z_1, z_2, z_3 are the distinct points mapped to λ by b, then

\[ \frac{b(z)/z}{b(z) - \lambda} = \frac{1/3}{z - z_1} + \frac{1/3}{z - z_2} + \frac{1/3}{z - z_3}. \]




Proof. Suppose E is a unit Steiner inellipse with a unit Steiner triangle with vertices
z_1, z_2, z_3. By Lemma 2.5, there is a degree-3 Blaschke product b as above that defines
E. By Theorem 2.4, there exists λ ∈ ∂D such that b(z_1) = b(z_2) = b(z_3) = λ.
To show that the m_j in Theorem 2.4 are as claimed, note that the points of tangency
are the midpoints of the line segments between the z_j. So (see Theorem 2.4)

\[ \zeta_3 = \frac{m_2 z_1}{m_1 + m_2} + \frac{m_1 z_2}{m_1 + m_2} = \frac{z_1}{2} + \frac{z_2}{2}. \tag{2.3} \]

Now points of the form a z_1 + (1 − a) z_2 for 0 ≤ a ≤ 1 lie on the line segment
from z_1 to z_2, and different values of a lead to different points on the line segment.
Therefore, we must have m_1/(m_1 + m_2) = m_2/(m_1 + m_2) = 1/2. Thus, m_1 = m_2.
By symmetry, m_1 = m_2 = m_3. Since m_1 + m_2 + m_3 = 1, each m_j = 1/3, as desired.
Conversely, assume that b is as defined in the hypothesis with m_1 = m_2 = m_3 =
1/3. By Theorem 2.4, one point of tangency of the Blaschke ellipse E is
ζ_3 = (m_1 z_2 + m_2 z_1)/(m_1 + m_2) = (z_2 + z_1)/2, which is the midpoint of the line
segment z_1 z_2. Similarly ζ_1 and ζ_2 are midpoints of z_2 z_3 and z_1 z_3 respectively.
Thus E is a unit Steiner ellipse, as desired.

If p is a polynomial satisfying (1) in the main theorem, then by (2) the Blaschke
ellipse associated with b is a unit Steiner inellipse with foci a1 and a2 . So, we arrive at
the following corollary, using the logarithmic derivative of p for the final equality.

Corollary 2.9. Suppose that p(z) = (z − z_1)(z − z_2)(z − z_3) for distinct z_1, z_2, z_3 ∈
∂D and that p has critical points a_1, a_2 ∈ D. Then the degree-3 Blaschke product b
with zeros at 0, a_1, a_2 and b(z_j) = λ for j = 1, 2, 3 also satisfies

\[ F(z) = \frac{b(z)/z}{b(z) - \lambda} = \frac{1/3}{z - z_1} + \frac{1/3}{z - z_2} + \frac{1/3}{z - z_3} = \frac{p'(z)}{3p(z)}. \]

Note that this shows us why the critical points of p are the zeros of b(z)/z.

2.5. How Many Unit Steiner Triangles Can an Ellipse Have? Let us now analyze
a few simple situations.

2.5.1. Infinitely Many Unit Steiner Triangles. Suppose the unit Steiner triangle is
equilateral. Then both foci of the unit Steiner inellipse are equal to 0: a circle can
be inscribed in △z_1 z_2 z_3 that is tangent at the midpoints of the sides of the triangle
and has the origin as its center [15]. Since the Steiner inellipse is unique, it must be
C_r := {z : |z| = r} for some r with 0 < r < 1; a calculation shows that r = 1/2. In
this case, any rotation of the triangle is a unit Steiner triangle of the circle.
Now suppose that we have a unit Steiner inellipse with a focus a_1 = 0. From Lemma
2.6, we see that a_2 = 0 as well. Thus, the only unit Steiner inellipse with a focus at 0
is a circle, centered at 0. Also, if a unit Steiner inellipse is a circle, we have a_1 = a_2.
Lemma 2.6 implies that a_1 = a_2 = 0, and the only Steiner “in-circle,” so to speak, is the
circle C_{1/2}.



2.5.2. One Unit Steiner Triangle. In fact, using Theorem 2.4, we show that C_{1/2} is the
only unit Steiner inellipse with more than one unit Steiner triangle. That is, any unit
Steiner inellipse other than C_{1/2} has a unique unit Steiner triangle.

Theorem 2.10. Suppose we have an ellipse E with foci a_1 and a_2 in D. If E has two
distinct unit Steiner triangles, then E = C_{1/2}.

It is easier to think about this result with polynomials: given two (monic, cubic)
polynomials with distinct zeros on ∂D and with the same two critical points, we show
that these polynomials are each z^3 + c_0 for an appropriate choice of c_0 on ∂D.

Proof. By our main theorem we may choose two distinct monic, cubic polynomials,
p and q, with zeros on ∂D corresponding to the vertices of our unit triangles and with
critical points a_1, a_2 ∈ D. Because p and q have the same critical points a_1, a_2, they
are both associated with the same Blaschke product b as defined in Theorem 2.4. By
Corollary 2.9, there exist γ_j ∈ ∂D such that

\[ \frac{p'(z)}{3p(z)} = \frac{b(z)/z}{b(z) - \gamma_1} \quad\text{and}\quad \frac{q'(z)}{3q(z)} = \frac{b(z)/z}{b(z) - \gamma_2}. \]

Solving for b yields

\[ b(z) = \frac{z \gamma_1 p'}{z p' - 3p} \quad\text{and}\quad b(z) = \frac{z \gamma_2 q'}{z q' - 3q}. \]

By our conditions, we know that p′ = q′. So, setting the two expressions for b equal
to each other and substituting, we see that γ_1/(zp′ − 3p) = γ_2/(zp′ − 3q). If γ_1 = γ_2,
we have p = q. So γ_1 ≠ γ_2.
We can rewrite our equation as γ_1(zq′ − 3q) = γ_2(zp′ − 3p). But we know that
q = p + C, so

\[ 3\gamma_1 C = (\gamma_1 - \gamma_2)(z p' - 3p). \tag{2.4} \]

Now p has the form p(z) = z^3 + c_2 z^2 + c_1 z + c_0, and so zp′(z) − 3p(z) =
−c_2 z^2 − 2c_1 z − 3c_0. Using this expression for zp′ − 3p in (2.4) shows that c_2 =
c_1 = 0. Thus p(z) = z^3 + c_0 for some c_0. Then the zeros of p are equally spaced on
∂D. The same argument shows that q takes this form as well, with a different constant.
Therefore, the ellipse is inscribed in (at least) two equilateral triangles, and Section
2.5.1 shows that the Steiner inellipse described by both p and q is C_{1/2}.

2.6. Which Points Can Be the Foci of a Unit Steiner Inellipse? Given a_1 ∈ D,
you may be wondering what points a_2 ∈ D we may choose so that a_1 and a_2 are the
foci of a unit Steiner inellipse. We’ve discussed the case a_1 = 0, so assume a_1 ≠ 0.
Interestingly, in this case, the set of such a_2 is part of a circle (we consider a line to
be a circle of infinite radius). From Theorem 1.1, the condition 2|a_1 a_2| = |a_1 + a_2| is
necessary and sufficient for a_1 and a_2 to be the foci of a unit Steiner inellipse. We wish
to determine all z ∈ D with 2|a_1 z| = |a_1 + z|. Let T(z) = a_1 z/(a_1 + z). We want all z
such that |T(z)| = 1/2. Since T is a Möbius transformation, T maps circles to circles,
so our solution is the set of points in D on the circle T^{−1}{w : |w| = 1/2}.
Figure 2.2 shows some examples of these circles. Using the description of our circle
in terms of the Möbius transformation, and by rotation taking a1 > 0, we note that as
a1 increases from just greater than 0 to 1/2, the circle grows in size from very small to
infinitely large. When a1 increases from just greater than 1/2 to 1, the circle shrinks,
though it never again lies entirely in D. When a1 = 1/2 the solutions are points in D on
the line {z : ℜ(z) = −1/4}. Note that the circles approach ℜ(z) = −1/4 as a1 moves
toward 1/2 from either direction.

Figure 2.2. On the left, as a1 ↗ 1/2 the circle grows infinitely large; on the right, for a1 > 1/2, as a1 ↗ 1
the circle shrinks.
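
The locus in this subsection can be made explicit. Squaring 2|a1 z| = |a1 + z| and completing the square gives, for a1 ≠ 0 and |a1| ≠ 1/2, a circle with center a1/(4|a1|² − 1) and radius 2|a1|²/|4|a1|² − 1|. This closed form is our own computation rather than a formula quoted from the text, and the function names below are illustrative. A minimal Python sketch that checks it numerically:

```python
import cmath
import math


def focus_locus_circle(a1):
    """Return (center, radius) of the circle {z : 2|a1 z| = |a1 + z|}.

    Assumes a1 != 0 and |a1| != 1/2 (for |a1| = 1/2 the locus is a line).
    The closed form is an assumption derived by completing the square.
    """
    k = 4 * abs(a1) ** 2 - 1
    if a1 == 0 or k == 0:
        raise ValueError("need a1 != 0 and |a1| != 1/2")
    return a1 / k, 2 * abs(a1) ** 2 / abs(k)


def focus_defect(a1, z):
    """2|a1 z| - |a1 + z|; zero exactly when the condition of Theorem 1.1 holds."""
    return 2 * abs(a1 * z) - abs(a1 + z)


def max_defect_on_circle(a1, samples=360):
    """Largest |defect| over equally spaced sample points of the computed circle."""
    center, radius = focus_locus_circle(a1)
    return max(
        abs(focus_defect(a1, center + radius * cmath.exp(2j * math.pi * t / samples)))
        for t in range(samples)
    )
```

For a1 = 0.3 the sketch returns center −0.46875 and radius 0.28125, a small circle inside D. As a1 approaches 1/2 the denominator 4|a1|² − 1 tends to 0, the circle blows up, and its real-axis point −a1/(2a1 + 1) tends to −1/4, in line with the limiting line ℜ(z) = −1/4.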

3. A MATRIX INTERPRETATION. It is natural to ask whether there is a generalization
of Steiner’s theorem that locates the critical points of a higher-degree polynomial
as the foci of a curve. In general, however, the curve would no longer be an
ellipse. Furthermore, Blaschke products of higher degree are well studied, so it is natu-
ral to seek a generalization of our main theorem. In order to discuss this generalization,
we need to present a connection between our Blaschke products and matrices.
To see what matrix properties we can use to make this connection, we recall (mod-
ified versions of) the two questions we stated in the introduction: First, when is a
degree-2 polynomial with zeros in D the derivative of a degree-3 polynomial with ze-
ros on ∂D? And, second, when is an ellipse contained in D tangent at the midpoints
to all sides of a triangle with vertices on ∂D? Therefore, noting that in these questions
we have an object associated with two points in D as well as a related object associ-
ated with three points on ∂D, we’ll begin with a 2 × 2 matrix with eigenvalues in D,
and note that unitary matrices have all eigenvalues on ∂D. Then we ask the follow-
ing: When can a 2 × 2 matrix with eigenvalues in D appear in the upper-left corner
of a 3 × 3 unitary matrix? The answer will lead us to the matrices we wish to study.
Then we’ll find that the first two questions we’ve studied give us conditions for find-
ing a 3 × 3 unitary matrix that gives us the eigenvalues of our 2 × 2 matrix through
differentiation of the characteristic polynomial.
To state our result, we begin with a definition. Given a matrix M, we say that N is a
dilation of M if N is a larger matrix containing M in its upper-left corner [13, p. 53].
If N is also unitary, we say that N is a unitary dilation. If M has a unitary dilation U ,
then ‖M‖ ≤ ‖U‖ = 1; that is, M is a contraction. Thus, only contractions can have
unitary dilations. Now, all eigenvalues of a contraction lie in D or on ∂D, and we have
decided to focus our attention on 2 × 2 matrices with eigenvalues in D. Finally, in
order for our matrix to have a 3 × 3 unitary dilation, it turns out that we must have
rank(I − M*M) = 1. (We cannot have rank(I − M*M) = 0, for then M would have
eigenvalues on ∂D, and if rank(I − M*M) = 2, it is known that there is no unitary
dilation of M smaller than 4 × 4 [29].) We are now ready to add one more equivalent
condition to our theorem.
In what follows, we denote the characteristic polynomial of a matrix N by ch(N ).

Theorem 3.1. Let M be a 2 × 2 contraction with eigenvalues a1, a2 ∈ D satisfying
rank(I − M*M) = 1. The following are equivalent:

(1) The Blaschke ellipse constructed from a degree-3 Blaschke product b with zeros
0, a1, and a2 is a unit Steiner inellipse.
(2) 2|a1a2| = |a1 + a2|.
(3) There exists a 3 × 3 unitary dilation U of M with ch(U)′ = 3ch(M).

Matrices that are unitarily equivalent have the same characteristic polynomial. So,
we will choose the most convenient form for M and its unitary dilation from among
all unitarily equivalent ones.
Now, any matrix satisfying the hypotheses of Theorem 3.1 is unitarily equivalent to
one in the form of the matrix A presented in Definition 3.2 below [9, p. 180]. Further-
more, every 3 × 3 unitary dilation of A, and therefore of M, is unitarily equivalent to
a matrix of the form Bλ below [8, p. 364]. Thus, the study of our 2 × 2 contraction M
with eigenvalues in D and rank(I − M*M) = 1 can be completed by studying matrices
of the form A below and unitary dilations of the form Bλ. Such matrices have been
well studied (see, for example, [8], [9], [10], [19], and [3]).

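Definition 3.2. Let a1, a2 ∈ D and λ ∈ ∂D. Let

\[ A = \begin{bmatrix} a_1 & \sqrt{1-|a_1|^2}\,\sqrt{1-|a_2|^2} \\ 0 & a_2 \end{bmatrix} \]

and

\[ B_\lambda = \begin{bmatrix}
a_1 & \sqrt{1-|a_1|^2}\,\sqrt{1-|a_2|^2} & -\overline{a_2}\,\sqrt{1-|a_1|^2} \\
0 & a_2 & \sqrt{1-|a_2|^2} \\
\lambda\sqrt{1-|a_1|^2} & -\lambda\,\overline{a_1}\,\sqrt{1-|a_2|^2} & \lambda\,\overline{a_1}\,\overline{a_2}
\end{bmatrix}. \]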

These particular matrices will allow us to prove easily the equivalence in Theorem
3.1. Note that if b is a degree-3 Blaschke product with zeros 0, a1 , a2 , then the eigen-
values of A are the zeros of b(z)/z. What are the eigenvalues of Bλ ? The answer lies
in the next theorem.

Theorem 3.3 ([8, Lemma 2.4]). Given λ ∈ ∂D, the eigenvalues of Bλ are the values
mapped to λ by b.
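
As a sanity check on Definition 3.2 and Theorem 3.3, the following Python sketch builds A and Bλ and tests three things numerically: Bλ is unitary, A sits in its upper-left corner with det(I − A*A) = 0, and det(zI − Bλ) agrees with z(z − a1)(z − a2) − λ(1 − ā1 z)(1 − ā2 z). Two assumptions are ours: the placement of complex conjugates in the third row and column of Bλ, and the normalization b(z) = z · (z − a1)/(1 − ā1 z) · (z − a2)/(1 − ā2 z), which this section does not restate. The function names are illustrative only.

```python
import math


def build_A(a1, a2):
    """2 x 2 matrix A of Definition 3.2, as nested lists."""
    s1 = math.sqrt(1 - abs(a1) ** 2)
    s2 = math.sqrt(1 - abs(a2) ** 2)
    return [[a1, s1 * s2], [0, a2]]


def build_B(a1, a2, lam):
    """3 x 3 matrix B_lambda; the conjugate placement is an assumption."""
    s1 = math.sqrt(1 - abs(a1) ** 2)
    s2 = math.sqrt(1 - abs(a2) ** 2)
    c1, c2 = complex(a1).conjugate(), complex(a2).conjugate()
    return [
        [a1, s1 * s2, -c2 * s1],
        [0, a2, s2],
        [lam * s1, -lam * c1 * s2, lam * c1 * c2],
    ]


def unitarity_defect(B):
    """Largest entry of |B B* - I| for a square matrix B."""
    n = len(B)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(B[i][k] * complex(B[j][k]).conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst


def defect_det(A):
    """det(I - A*A) for a 2 x 2 matrix A; zero means rank(I - A*A) <= 1."""
    D = [[(1 if i == j else 0)
          - sum(complex(A[k][i]).conjugate() * A[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]
    return D[0][0] * D[1][1] - D[0][1] * D[1][0]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly_B(B, z):
    """det(zI - B) evaluated at the point z."""
    return det3([[(z if r == s else 0) - B[r][s] for s in range(3)] for r in range(3)])


def blaschke_numerator(a1, a2, lam, z):
    """z(z - a1)(z - a2) - lam (1 - conj(a1) z)(1 - conj(a2) z).

    Under the assumed normalization of b this vanishes at exactly the
    points where b(z) = lam.
    """
    c1, c2 = complex(a1).conjugate(), complex(a2).conjugate()
    return z * (z - a1) * (z - a2) - lam * (1 - c1 * z) * (1 - c2 * z)
```

Comparing the two cubic polynomials at four or more sample points is enough to conclude that they coincide, so the check does not need a root finder.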

Now, more surprisingly, A and Bλ also have a strong connection to our Blaschke
ellipses. Given an n × n matrix M, its numerical range (or field of values) is the set

\[ W(M) = \{\langle Mx, x\rangle : x \in \mathbb{C}^n,\ \|x\| = 1\}, \]

where ⟨·, ·⟩ denotes the standard inner product. There are many reasons for studying
the numerical range, one of which is the fact that it helps locate the eigenvalues of M:
they are somewhere inside W(M) [12, Problem 169]. It is usually difficult to describe
the numerical range of a matrix. In the two cases we are concerned with, however,
the descriptions are simple. First, the numerical range of any normal matrix, and so in
particular any unitary matrix, is the convex hull of its set of eigenvalues [12, Problem
171]. Second, if \( N = \begin{bmatrix} a_1 & c \\ 0 & a_2 \end{bmatrix} \), then W(N) is the elliptic disc with foci a1, a2 and minor
axis of length |c|; see [9].

Consequently, the boundary of W(A), denoted ∂W(A), is an ellipse with minor
axis of length √(1 − |a1|²) √(1 − |a2|²). Theorem 2.4 and a calculation show that the
boundary of W (A) is in fact the Blaschke ellipse of the Blaschke product b with zeros
0, a1 , a2 , the eigenvalues of A. Furthermore, by Theorem 3.3, the dilation Bλ of A has
as eigenvalues the three points b maps to λ. Since Bλ is unitary, W (Bλ ) is the convex
hull of these points and, by Theorem 2.4, ∂ W (Bλ ) is a triangle circumscribing the
elliptical disc W (A), and so also the Blaschke ellipse of b.
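
The elliptic-disc description of the numerical range of an upper triangular 2 × 2 matrix can be probed by sampling. The Python sketch below draws random unit vectors x ∈ C², computes ⟨Nx, x⟩, and checks that the sum of the distances to the foci a1 and a2 never exceeds the major-axis length. We take that length to be √(|a1 − a2|² + |c|²), which follows from the stated foci and minor axis by the usual relation between the axes and focal distance of an ellipse; the helper names are illustrative.

```python
import math
import random


def numerical_range_sample(N, rng):
    """Return <Nx, x> for a random unit vector x in C^2 (N is a 2 x 2 nested list)."""
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(t) ** 2 for t in x))
    x = [t / norm for t in x]
    Nx = [N[i][0] * x[0] + N[i][1] * x[1] for i in range(2)]
    # Standard inner product: linear in the first slot, conjugate-linear in the second.
    return sum(Nx[i] * x[i].conjugate() for i in range(2))


def focal_excess(w, a1, a2, c):
    """|w - a1| + |w - a2| minus the major-axis length sqrt(|a1 - a2|^2 + |c|^2).

    Nonpositive exactly when w lies in the closed elliptic disc with
    foci a1, a2 and minor axis of length |c|.
    """
    return abs(w - a1) + abs(w - a2) - math.sqrt(abs(a1 - a2) ** 2 + abs(c) ** 2)
```

Sampling can only confirm containment in the disc, not that the whole disc is filled; it is meant as a quick plausibility check, not a proof.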
So here’s what we’ve learned: As in Theorem 2.4, ∂W(A) is a Blaschke ellipse
and, as λ ranges over ∂D, the ∂W(Bλ) give all the circumscribing triangles. Now
we establish Theorem 3.1, by showing that conditions (1) and (2) are equivalent to
ch(Bλ)′ = 3ch(A). Recall that by the unitary equivalence of Bλ and A with U and M,
respectively, ch(Bλ)′ = 3ch(A) is equivalent to (3).

Proof of Theorem 3.1. We already know that statements (1) and (2) are equivalent. So
suppose the Blaschke ellipse associated with b is a Steiner inellipse. By Theorem 1.1,
there is a polynomial p with zeros on ∂D and critical points a1, a2. From Remark 2.7,
the zeros of p are the points zj for which b(zj) = λ for j = 1, 2, 3, where −λ is the
constant term of p. Thus, from Theorem 3.3, ch(Bλ) = p, and ch(Bλ)′(z) = p′(z) =
3(z − a1)(z − a2) = 3ch(A)(z). Thus (1) implies (3).
Now suppose ch(Bλ)′ = 3ch(A). So ch(Bλ) is a polynomial with zeros on ∂D and
critical points a1, a2. By Theorem 1.1, 2|a1a2| = |a1 + a2|, and (3) implies (2).
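
The equivalence of (2) and (3) can also be seen by comparing coefficients directly. The sketch below assumes the normalization b(z) = z · (z − a1)/(1 − ā1 z) · (z − a2)/(1 − ā2 z), which is not restated in this section; with it, Theorem 3.3 identifies ch(Bλ) with the numerator of b(z) − λ.

```latex
\begin{align*}
\operatorname{ch}(B_\lambda)(z)
  &= z(z-a_1)(z-a_2) - \lambda(1-\overline{a_1}z)(1-\overline{a_2}z) \\
  &= z^3 - \bigl(a_1 + a_2 + \lambda\,\overline{a_1}\,\overline{a_2}\bigr)z^2
     + \bigl(a_1 a_2 + \lambda(\overline{a_1}+\overline{a_2})\bigr)z - \lambda, \\
\operatorname{ch}(B_\lambda)'(z)
  &= 3z^2 - 2\bigl(a_1 + a_2 + \lambda\,\overline{a_1}\,\overline{a_2}\bigr)z
     + a_1 a_2 + \lambda(\overline{a_1}+\overline{a_2}), \\
3\operatorname{ch}(A)(z)
  &= 3z^2 - 3(a_1+a_2)z + 3a_1 a_2 .
\end{align*}
```

So ch(Bλ)′ = 3ch(A) holds exactly when 2λ ā1 ā2 = a1 + a2 and λ(ā1 + ā2) = 2a1a2. Taking absolute values in either equation, with |λ| = 1, gives 2|a1a2| = |a1 + a2|. Conversely, if 2|a1a2| = |a1 + a2| and a1a2 ≠ 0, then λ = (a1 + a2)/(2 ā1 ā2) is unimodular and satisfies both equations; if a1a2 = 0, the condition forces a1 = a2 = 0 and every λ ∈ ∂D works. Note also that the constant term of ch(Bλ) is −λ, as in the proof above.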

When can the eigenvalues of a matrix be located using its unitary dilations? This
sounds nothing like our questions about polynomials and ellipses, yet Theorem 3.1
gives a surprising connection. This connection to Blaschke products will allow us,
in Section 4, to discuss generalizations of Steiner’s theorem, Blaschke ellipses, and
eigenvalues of n × n contractions and their (n + 1)-unitary dilations.
According to Sunder [28], Halmos studied normal operators and unitary dilations
because of his “unwavering belief that the secret about general operators lay in their
relationship to normal operators.” For example, it can be shown that the numerical
range of an n × n contraction is the intersection of the numerical ranges of all of its
unitary dilations, and we have seen a special case of this above: the triangles W (Bλ )
given by the numerical ranges of the unitary dilations intersect to the elliptical disc
W (A), the numerical range of the contraction. Even in infinite dimensions, we may
find the closure of the numerical range of a contraction by taking the intersection of
the closures of the numerical ranges of the unitary dilations involved. The interested
reader should consult two recent papers, [1] and [7], as well as [8], in which the authors
describe a refinement of a classical result due to Lucas on the locations of the critical
points of a polynomial.

4. REMARKS ON THE GENERAL CASE. It is possible to adapt the proofs (and
results) of the degree-3 case to the degree-n case. We assume throughout that b is
a Blaschke product of degree n + 1 with b(0) = 0. As in the degree-3 case, it may
or may not happen that b will be associated with a (monic) polynomial through the
(n + 1)-term partial fraction expansion of F(z) = b(z)/(z(b(z) − λ)), as in Corol-
lary 2.9. It is, however, possible to determine necessary and sufficient conditions for
this to occur. Therefore, it is reasonable to ask if the connection between Steiner and
Blaschke ellipses generalizes and whether or not there are matrices associated with
these Blaschke curves.
Much of this does generalize in a natural, but deep, way. Perhaps the easiest way to
describe the connection is through the matrices A and Bλ. If we begin with an n × n
contraction M with eigenvalues a1, ..., an ∈ D, and we require rank(I − M*M) = 1,
we can define an n × n matrix A, in much the same way as in Definition 3.2, to which
M is unitarily equivalent. The matrix A will be upper triangular, so \( \operatorname{ch}(A) = \prod_{j=1}^{n} (z - a_j) \),
where the aj are the zeros of b(z)/z for a Blaschke product b. There is a family of
(n + 1) × (n + 1) unitary dilations of A, denoted Bλ for λ ∈ ∂D. Again, any unitary
dilation of A (and therefore M) will be unitarily equivalent to such a Bλ. Since the
eigenvalues of a unitary matrix always have modulus one, \( \operatorname{ch}(B_\lambda) = \prod_{j=1}^{n+1} (z - z_j) \)
with the zj ∈ ∂D and b(zj) = λ. In fact, the zj are distinct and computations such
as those above determine conditions under which ch(Bλ)′ = (n + 1)ch(A). Relevant
results may be found in [3], [9], or [26].
All other connections remain: From a function-theoretic viewpoint, we are look-
ing at the relationship between zeros of b(z)/z and the points on ∂D identified by b,
and we are searching for a polynomial associated with b in the manner of Corollary
2.9. Geometrically speaking, instead of being an ellipse, the curve associated with a
Blaschke product now requires knowledge of algebraic curves of higher class. From a
linear algebra point of view, it represents the boundary curve of W (M) = W (A). In
addition, this curve will be inscribed in convex (n + 1)-gons (the boundaries of the
W (Bλ )) and, therefore, is still thought of as a Poncelet curve. This information can
be found in [3], [9], or [31]. Theorem 3.1 generalizes and can be proved in the same
manner as in the degree-3 case, using the elementary symmetric functions of the roots
of monic, degree-n polynomials [26]. It’s also true that the zeros of the Blaschke prod-
uct are the foci of a curve associated with W (M), but with more than two zeros, it’s
difficult to imagine what is meant by “foci.” The reader might consult [16], [20], or
[25] for related results.

5. CONCLUSION. Significant progress has been made on the Sendov conjecture
using complex analysis; our hope is that this paper suggests other useful approaches. In
fact, these ideas can be used to provide a proof of Sendov’s conjecture for polynomials
with roots on ∂D [3], but the proof relies heavily on the fact that the zeros lie on ∂D.
The study of finite Blaschke products has led to results about distances between roots
and critical points of polynomial-like functions [11, Theorem 12].
In the study of the degree-3 case of Sendov’s conjecture using Steiner’s and
Siebeck’s results, the connection to triangles with vertices on ∂D and inscribed el-
lipses is quite natural. Poncelet’s and Marden’s theorems suggest a way to generalize
the geometric result to the degree-n case. Perhaps studying foci of degree-n curves or
the connection to numerical ranges will provide another useful viewpoint.

ACKNOWLEDGMENTS. This work was the second author’s honors thesis at Bucknell University under
the direction of the first author. We are grateful to the Department of Mathematics for its support and to Ueli
Daepp for his very careful reading of this manuscript. We also thank the referee for many helpful suggestions
and for improving the exposition of this paper.

REFERENCES

1. M.-D. Choi and C.-K. Li, Constrained unitary dilations and numerical ranges, J. Operator Theory 46
(2001) 435–447.
2. U. Daepp, P. Gorkin, and R. Mortini, Ellipses and finite Blaschke products, Amer. Math. Monthly 109
(2002) 785–795. doi:10.2307/3072367
3. U. Daepp, P. Gorkin, and K. Voss, Poncelet’s theorem, Sendov’s conjecture, and Blaschke products, J.
Math. Anal. Appl. 365 (2010) 93–102. doi:10.1016/j.jmaa.2009.09.058
4. L. Flatto, Poncelet’s Theorem, American Mathematical Society, Providence, RI, 2009.




5. M. Frantz, How conics govern Möbius transformations, Amer. Math. Monthly 111 (2004) 779–790. doi:
10.2307/4145189
6. J. Garnett, Bounded Analytic Functions, Academic Press, New York, 1981.
7. H.-L. Gau, C.-K. Li, and P. Y. Wu, Higher-rank numerical ranges and dilations, J. Operator Theory 63
(2010) 181–189.
8. H.-L. Gau and P. Y. Wu, Lucas’ theorem refined, Linear Multilinear Algebra 45 (1999) 359–373. doi:
10.1080/03081089908818600
9. H.-L. Gau and P. Y. Wu, Numerical range and Poncelet property, Taiwanese J. Math. 7 (2003) 173–193.
10. H.-L. Gau and P. Y. Wu, Numerical range circumscribed by two polygons, Linear Algebra Appl. 382 (2004) 155–170.
doi:10.1016/j.laa.2003.12.003
11. P. Gorkin and R. C. Rhoades, Boundary interpolation by finite Blaschke products, Constr. Approx. 27
(2008) 75–98. doi:10.1007/s00365-006-0646-3
12. P. Halmos, A Hilbert Space Problem Book, D. Van Nostrand, Princeton, NJ, 1967.
13. R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1991.
14. D. Kalman, An elementary proof of Marden’s theorem, Amer. Math. Monthly 115 (2008) 330–338.
15. D. Kalman, The most marvelous theorem in mathematics, Journal of Online Mathematics and its Applications
8 (2008) Article ID 1663, http://www.maa.org/joma/Volume8/Kalman/index.html.
16. J. Langer and D. Singer, Foci and foliations of algebraic curves, Milan. J. Math. 75 (2007) 225–271.
doi:10.1007/s00032-007-0078-4
17. M. Marden, Geometry Of Polynomials, 2nd ed., Mathematical Surveys and Monographs, No. 3, American
Mathematical Society, Providence, RI, 1966.
18. D. Minda and S. Phelps, Triangles, ellipses, and cubic polynomials, Amer. Math. Monthly 115 (2008)
679–688.
19. B. Mirman, UB-matrices and conditions for Poncelet polygon to be closed, Linear Algebra Appl. 360
(2003) 123–150. doi:10.1016/S0024-3795(02)00447-0
20. J. L. Parish, On the derivative of vertex polynomial, Forum Geom. 6 (2006) 285–288.
21. G. Schmeisser, The conjectures of Sendov and Smale, in Approximation Theory, B. D. Bojanov, ed.,
DARBA, Sofia, Bulgaria, 2002, 353–369.
22. Bl. Sendov, Hausdorff geometry of polynomials (English summary), East J. Approx. 7 (2001) 123–178.
23. T. Sheil-Small, Complex Polynomials, Cambridge Studies in Advanced Mathematics, vol. 1, 75, Cam-
bridge University Press, Cambridge, 2002.
24. J. Siebeck, Über eine neue analytische Behandlungsweise der Brennpunkte, J. Reine Angew. Math. 64
(1864) 175–182.
25. D. Singer, The location of critical points of finite Blaschke products, Conform. Geom. Dyn. 10 (2006)
117–124. doi:10.1090/S1088-4173-06-00145-7
26. E. Skubak, Blaschke ellipses, Steiner inellipses, and the polynomials that brought them together, B.S.
honors thesis, Bucknell University, Lewisburg, PA, 2009.
27. J. Steiner, Gesammelte Werke, vol. 2, Prussian Academy of Sciences, Berlin, 1881–1882.
28. V. Sunder and Paul Halmos, Expositor par excellence, available at http://www.imsc.res.in/~sunder/paul.pdf.
29. R. C. Thompson and C. T. Kuo, Doubly stochastic, unitary, unimodular, and complex orthogonal power
embeddings, Acta Sci. Math. (Szeged) 44 (1982) 345–357.
30. B. Torrence, Sendov’s conjecture—From Wolfram Demonstrations Project, A Wolfram Web Resource,
http://demonstrations.wolfram.com/SendovsConjecture/.
31. P. Y. Wu, Polygons and numerical ranges, Amer. Math. Monthly 107 (2000) 528–540. doi:10.2307/
2589348

PAMELA GORKIN received her B.S., M.S., and Ph.D. from Michigan State University. She also spent a
year at Indiana University where she learned about numerical ranges from Paul Halmos. She has been teaching
at Bucknell University since 1982, with time off for good behavior. Her hobbies are hiking, reading, traveling,
cooking and eating, though not necessarily in that order.
Department of Mathematics, Bucknell University, Lewisburg, PA 17837

ELIZABETH SKUBAK attended Bucknell University for her undergraduate studies and is currently a gradu-
ate student and teaching assistant at the University of Wisconsin–Madison. She likes to spend her few waking,
non-working hours reading, cooking, taking photographs, and being outside.
Department of Mathematics, University of Wisconsin at Madison, Madison, WI 53706-1388



A Lost Counterexample and
a Problem on Illuminated Polytopes
Ronald F. Wotzlaw and Günter M. Ziegler

Abstract. Daniel A. Marcus claimed in a “Note added in proof” to his 1984 paper on positively
k-spanning vector configurations that some minimal positively k-spanning vector configura-
tions in m-dimensional space have more than 2km elements, but the example he found (for
k = 2 and m = 12) seems to be lost.
We produce such an example by applying Gale duality, a linear algebra technique devel-
oped by Micha Perles in the sixties, to a result on illuminated polytopes by Peter Mani from
1974. Our example has exactly the parameters claimed by Marcus.
Conversely, we show how results on positively k-spanning vector configurations, again via
Gale duality, can be used to solve a problem by Mani on nonsimplicial illuminated polytopes.

In a “Note added in proof” to his 1984 paper on positively spanning vector configu-
rations, Daniel A. Marcus (from the California State Polytechnic University, Pomona,
CA) claimed to have a counterexample to his conjecture that a minimal positively
k-spanning vector configuration in Rm has size at most 2km. However, the counterex-
ample was never published, and seems to be lost.
Independently, and ten years earlier, in 1974, Peter Mani (Bern, Switzerland) had
disproved a conjecture by the Swiss mathematician Hugo Hadwiger that every “illuminated”
d-dimensional polytope must have at least 2d vertices.
Here we observe that these two studies are related by “Gale duality,” an elemen-
tary linear algebra technique devised by Micha A. Perles (at the Hebrew University,
Jerusalem) in the sixties. Thus Mani’s study provides a counterexample for Marcus’s
conjecture with exactly the parameters that Marcus had claimed. In the other direction,
with Marcus’s tools we provide an answer to a problem left open by Mani: Could “illu-
minated” d-dimensional polytopes on a minimal number of vertices be nonsimplicial?

1. MARCUS’S LOST COUNTEREXAMPLE AND MANI’S PROBLEM. A
positively spanning vector configuration U in Rm is a finite configuration of vectors
(duplicates are allowed) that positively span Rm, that is, if U = {u1, ..., un} and
v ∈ Rm, there are real nonnegative numbers λ1, ..., λn such that

\[ v = \lambda_1 u_1 + \lambda_2 u_2 + \cdots + \lambda_n u_n. \]

A positively k-spanning vector configuration is a positively spanning vector configuration
that is still positively spanning even if any k − 1 vectors are deleted from the
configuration. Equivalently, a vector configuration is positively k-spanning if and only
if every open half-space of Rm contains at least k vectors of the configuration. See
Figure 1. To prove the last statement it is sufficient to show that a vector configuration
is positively spanning if and only if every open half-space contains at least one vector
of the configuration. (Indeed, the set of nonnegative combinations is a convex cone
with apex at the origin. It is either the entire space, or else it lies on one side of a hy-
perplane through the origin, which may be obtained from the Hahn–Banach theorem
[4, Thm. 2.2].)
doi:10.4169/amer.math.monthly.118.06.534




Figure 1. Two minimal positively 2-spanning configurations in R2 .

In two papers [8, 9], dating from 1981 and 1984, Marcus studied properties of
positively k-spanning vector configurations. In particular, he was interested in upper
bounds on the cardinality of minimal positively k-spanning vector configurations, that
is, of positively k-spanning vector configurations that are minimal with respect to in-
clusion:

Question 1 (Marcus [8, 9]; see also [11]). What is the maximum size of a minimal
positively k-spanning vector configuration in Rm ?

A classical result, known as the Blumenthal–Robinson theorem [1, 3, 13], states
that for k = 1 the exact answer is 2m. Marcus conjectured that the answer for the general
case is 2km. The answer clearly cannot be smaller than 2km: the configuration
illustrated in Figure 2, consisting of k positive multiples of each of the standard basis
vectors and their negatives, that is, of each of ±e1, ..., ±em, is a minimal positively
k-spanning vector configuration in Rm.
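
The half-space criterion stated earlier (a configuration is positively k-spanning if and only if every open half-space contains at least k of its vectors) is easy to test in the plane. The Python sketch below is restricted to R² and integer coordinates, and its function names are our own. It relies on the observation that the count of vectors in an open half-plane can only drop when a vector lands on the boundary line, so it suffices to try normals perpendicular to the input vectors.

```python
def open_halfplane_min_count(vectors):
    """Minimum number of vectors lying in an open half-plane {x : <n, x> > 0} of R^2.

    Only normals n perpendicular to some nonzero input vector are tried;
    the minimum over all normals is attained at one of these.
    """
    normals = []
    for (x, y) in vectors:
        if (x, y) != (0, 0):
            normals += [(-y, x), (y, -x)]
    if not normals:
        return 0
    return min(
        sum(1 for (x, y) in vectors if nx * x + ny * y > 0)
        for (nx, ny) in normals
    )


def is_positively_k_spanning(vectors, k):
    """Half-plane criterion: every open half-plane contains at least k vectors."""
    return open_halfplane_min_count(vectors) >= k


def is_minimal_positively_k_spanning(vectors, k):
    """Positively k-spanning, and no longer so after deleting any single vector."""
    if not is_positively_k_spanning(vectors, k):
        return False
    return all(
        not is_positively_k_spanning(vectors[:i] + vectors[i + 1:], k)
        for i in range(len(vectors))
    )


def axis_configuration(k):
    """k positive multiples of each of +-e1, +-e2 in R^2 (the pattern of Figure 2)."""
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return [(t * x, t * y) for (x, y) in directions for t in range(1, k + 1)]
```

With k = 3 the configuration has 2km = 12 vectors for m = 2, and the sketch confirms that it is positively 3-spanning and minimal. Higher dimensions would need a different search over hyperplanes and are not covered here.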

Figure 2. A minimal k-spanning configuration of 2km vectors in Rm ; here m = 2 and k = 3.

Yet for k ≥ 2 it is not obvious that there is a finite upper bound. However, this was
proved by Marcus for the case k = 2; for all k ≥ 2 it can be derived from the Perles
skeleton theorem for (convex) polytopes [6]; see [14].
A convex d-dimensional polytope is the convex hull of a finite number of points V
in Rd that affinely span Rd . The points in V that are not convex combinations of some
of the other points in V are the vertices of the polytope. A d-dimensional polytope
thus has at least d + 1 vertices. If it has exactly d + 1 vertices, we call it a simplex. We
denote by vert(P) the set of vertices and by f0 = f0(P) = |vert(P)| the number of
vertices. For example, if P is a 3-dimensional octahedron, then vert(P) ⊂ R3 is its set
of six vertices, and f0(P) = 6. A face of a polytope is the convex hull of any subset
of the set of vertices on which some linear function achieves its maximal value. (The
polytope itself, and the empty set, are also defined to be faces.) Every face of a polytope
is itself a polytope in some lower-dimensional space and thus has a dimension. For
example, the vertices are the 0-dimensional faces. The faces of dimension 1, d − 2,
and d − 1 are called edges, ridges, and facets, respectively. Since a facet has dimension
d − 1, it must have at least d vertices. In the special case where every facet of the
polytope has the minimum number d of vertices, we call the polytope simplicial. An
important combinatorial property of polytopes is that every ridge lies in exactly two
facets. For further intuition, information, and terminology related to polytopes we refer
to the textbooks Grünbaum [4], Matoušek [10], and Ziegler [15].
In the following, we focus on the case k = 2 of positively k-spanning vector config-
urations, which is particularly interesting. It was pointed out and used by Marcus [8, 9]
that via Gale diagrams the above question translates into a question about polytopes.
Indeed, for k ≥ 2 every positively k-spanning vector configuration of n vectors in
Rm is a Gale diagram of a d-dimensional polytope P with f0 = n vertices, where
d = n − m − 1. We refer to the appendix to this paper for a quick sketch of Gale
diagrams. The main relation between the vector configuration and the corresponding
polytope (which “live” in different dimensions) is that each vector in the configuration
corresponds to a particular vertex of the polytope, and the minimal subconfigurations
having the property that they positively span their linear spans correspond exactly to
the complements of the vertex sets of facets of the polytope; for details of this con-
struction see, for example, [4, Sect. 5.4], [15, Lect. 6], or [10, Sect. 5.6].
For k = 2, the condition that we look at a minimal positively k-spanning vector
configuration is equivalent, via Gale duality, to the following special property of the
polytope: for every vertex u of P there is a different vertex v such that u and v are
not connected by an edge of P, that is, the edge between u and v is missing. In other
words, no vertex is connected to all the other vertices. We will call a polytope with this
property unneighborly; see Figure 3.

Figure 3. A polytope (pyramid) with two missing edges (dotted), and an unneighborly 3-dimensional
polytope (a triangular prism).

With this translation, the k = 2 case of Question 1 is very closely related to the
following:

Question 2 (Marcus [8, 9]). What is the minimum number of vertices of an unneigh-
borly d-polytope?

Marcus’s conjectured bound of 4m for a positively 2-spanning vector configuration
in Rm would imply that an unneighborly polytope in dimension d has at least
4(d + 1)/3 vertices. This is wrong, as we shall see, and it seems that this was noticed
by Marcus: in [9], a “Note added in proof” says that he had discovered an unneighborly
polytope of dimension d = 36 with f0 = 49 vertices. (The bound of 4(d + 1)/3
would imply that such a polytope would need to have at least 50 vertices.) However,
there is no mention of a construction for this polytope or any kind of reference that
would help to recover this example. McMullen, who reviewed Marcus’s paper for
Mathematical Reviews and Zentralblatt, had not seen the counterexample either (personal
communication). In the following we will refer to this polytope as Marcus’s lost
counterexample.
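
The passage from the bound 4m to the bound 4(d + 1)/3 is a short computation with the Gale-duality relation d = n − m − 1 stated above. We spell it out as a sketch, writing n = f0 for the number of vertices:

```latex
\begin{align*}
n \le 4m \quad\text{and}\quad m = n - d - 1
  &\;\Longrightarrow\; n \le 4(n - d - 1)
   \;\Longrightarrow\; n \ge \tfrac{4(d+1)}{3}, \\
d = 36,\ n = 49
  &\;\Longrightarrow\; m = 49 - 36 - 1 = 12, \qquad 4m = 48 < 49 .
\end{align*}
```

So an unneighborly 36-dimensional polytope with 49 vertices corresponds to the parameters k = 2 and m = 12 quoted in the abstract, and since 4 · 37/3 = 49⅓, the conjectured bound would have required at least 50 vertices.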
As far as we know, Marcus was not aware of work that Peter Mani had done ten
years earlier [7]. Mani had studied a question by Hugo Hadwiger on illuminated poly-
topes. These are polytopes in which every vertex lies on an inner diagonal, that is, a
segment connecting two vertices that passes through the interior of the polytope. The
d-dimensional crosspolytope is an example of such a polytope with 2d vertices: this is
the convex hull of the points ±e1, ±e2, ..., ±ed; it is the d-dimensional version
of the octahedron, also known as “the unit ball in a d-dimensional space with ℓ1-norm.”
(So the 2-dimensional crosspolytope is a square, and the 3-dimensional one is
an octahedron.) Clearly, inner diagonals are missing edges, and thus every illuminated
polytope is also unneighborly. McMullen proposes to call them strongly unneighborly
(personal communication).

Figure 4. An inner diagonal (dotted), and an illuminated 3-dimensional polytope (an octahedron).

Question 3 (Hadwiger [5]). What is the minimum number of vertices of an illuminated
d-polytope? More specifically, is it 2d?

In [7], Mani gave a remarkable complete answer to this question. Indeed, the bound
of 2d conjectured by Hadwiger turned out to be wrong, whereas the correct bound is
roughly d + 2√d. More precisely, Mani showed that every illuminated d-polytope has
at least

\[ M(d) := \min\Bigl\{2d,\ d + p(d) + \Bigl\lceil \frac{d}{p(d)} \Bigr\rceil + 1\Bigr\} \]

vertices, where we set p(d) := ⌈(√(4d + 1) − 1)/2⌉ for d ≥ 1. McMullen (personal
communication 2009) has noted that one can write the function M(d) in the following simple
form:

\[ M(d) = \min\bigl\{2d,\ \lceil (\sqrt{d}+1)^2 \rceil\bigr\} = \min\bigl\{2d,\ d + 1 + \lceil 2\sqrt{d}\,\rceil\bigr\}. \]
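
Mani's bound and McMullen's reformulation are easy to tabulate. The Python sketch below computes both in exact integer arithmetic; it uses the elementary fact (our own restatement, not taken from the text) that ⌈(√(4d + 1) − 1)/2⌉ is the least integer p with p(p + 1) ≥ d. The function names are illustrative.

```python
import math


def p(d):
    """ceil((sqrt(4d + 1) - 1) / 2), computed as the least p with p(p + 1) >= d."""
    q = 0
    while q * (q + 1) < d:
        q += 1
    return q


def mani_bound(d):
    """M(d) = min{2d, d + p(d) + ceil(d / p(d)) + 1} for d >= 1."""
    pd = p(d)
    ceil_d_over_p = -(-d // pd)  # integer ceiling division
    return min(2 * d, d + pd + ceil_d_over_p + 1)


def mani_bound_mcmullen(d):
    """McMullen's form min{2d, d + 1 + ceil(2 sqrt(d))} for d >= 1."""
    ceil_two_sqrt_d = math.isqrt(4 * d - 1) + 1  # ceil(sqrt(4d)) for d >= 1
    return min(2 * d, d + 1 + ceil_two_sqrt_d)
```

For d = 36 this gives p(36) = 6 and M(36) = 36 + 6 + 6 + 1 = 49. Searching d ≥ 2 for the first dimension with M(d) < 4(d + 1)/3 returns d = 36; the case d = 1 is excluded from the search because there 2d = 2 is already below 8/3.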

According to Mani, every illuminated polytope has at least M(d) vertices and examples
with M(d) vertices exist. His construction starts from a cyclic d-polytope, that
is, the convex hull of a finite number of points on a curve in Rd of order d; the upper
bound theorem tells us that such a d-dimensional polytope has the maximal number of
facets for a given number of vertices. Mani’s examples are obtained from a cyclic
d-polytope with d + p(d) vertices by stacking new vertices onto ⌈d/p(d)⌉ + 1 well-chosen
facets. The operation stacking a facet builds a flat pyramid over a given facet F of a
polytope P. The key condition is that the pyramid must be so flat that the line connecting
the tip of the pyramid to any polytope vertex not on F passes through the relative
interior of the facet F. Thus after stacking onto F there is an inner diagonal from the
new vertex to any vertex of P that did not lie on F; see Figure 5.

Figure 5. Stacking onto a facet.

In particular, for large enough d there are illuminated (and thus also unneighborly)
polytopes with far fewer vertices than 4(d + 1)/3. Indeed, the smallest d where this
occurs is d = 36, and Mani’s illuminated polytope in dimension d = 36 has M(36) =
49 vertices, so it has exactly the same parameters as Marcus’s lost counterexample. We
will probably never know whether this is the same polytope that Marcus had in mind:
he never published on the subject again, and when we tried to contact him via the
California State Polytechnic University in Pomona, where he had been on the faculty
for a number of years, we received the information that he had passed away some years
ago.
We will refer to illuminated polytopes with the minimal number f0 = M(d) of
vertices as Mani polytopes. The Mani polytopes constructed by Mani himself are, by
construction, simplicial. In relation to Question 3, Mani thus asked:

Question 4 (Mani [7]; see also Bremner and Klee [2]). Are all illuminated poly-
topes with the minimum number of vertices simplicial? Or are there nonsimplicial
Mani polytopes?

Below we give a complete answer to this question: up to dimension 5, all Mani poly-
topes are simplicial, and there is only one combinatorial type, given by the crosspoly-
tope. For every d ≥ 6 though, we construct a nonsimplicial Mani polytope on the
minimum number of vertices. This corrects a statement by Bremner and Klee [2], who
had claimed that up to dimension 7 all extremal illuminated polytopes were crosspoly-
topes.
Our solution for Mani’s question, which we present in Section 3, produces suitable
Gale diagrams; thus it uses Marcus’s positively k-spanning vector configurations. In
turn, Marcus’s conjecture on the size of minimal positively k-spanning vector config-
urations is refuted by taking Mani’s viewpoint of illuminated polytopes.
To summarize, the precise answers for Questions 1 and 2 remain open, although
Marcus’s conjectured answers are refuted by Mani’s construction as well as by our
construction in this paper: they yield illuminated d-polytopes with roughly d + 2√d
vertices, while Marcus’s work implies that an unneighborly d-polytope has at least
roughly d + √(2d) vertices. Question 3 was solved by Mani [7], and Question 4 is
solved by our construction.
Marcus’s conjecture for Question 1 is proven wrong for all k ≥ 2 in the first author’s
Ph.D. thesis [14] based on Mani’s construction for the case k = 2.




2. UNIQUE MANI POLYTOPES. We show that in dimensions d with 1 ≤ d ≤ 5
Mani polytopes are combinatorially unique. Thus the only combinatorial type that
appears is the d-crosspolytope. Our argument is mainly based on the original results
by Mani [7] and the subsequent simplification by Rosenfeld [12].

Definition (Self illuminated, opposite sets). Let P be an illuminated d-polytope. A
set of vertices U ⊆ vert(P) is said to illuminate itself if for every vertex v ∈ U there
is a vertex u ∈ U such that

[u, v] := {x ∈ R^d : x = λu + (1 − λ)v, λ ∈ [0, 1]}

is an inner diagonal.
A set W ⊆ vert(P) is said to lie opposite the vertex v ∈ vert(P) if for every w ∈ W
the segment [v, w] is an inner diagonal and vert(P) \ (W ∪ {v}) illuminates itself.
Let Γ(P) := max{|W| : W lies opposite some v ∈ vert(P)}.

The following is a slightly stronger statement than the main result from [12] that is
easily extracted from Rosenfeld’s proof.

Lemma 1 (Rosenfeld [12]). Let P be an illuminated d-polytope. If Γ(P) = 1, then
f_0 ≥ 2d and there is a perfect matching on the inner diagonals, that is, there are
pairwise vertex-disjoint inner diagonals that cover all vertices of P.

For the d-dimensional crosspolytope, Lemma 1 points to the matching of “opposite”
vertices. The next lemma is easily derived from results by Mani; just combine the
statements in [7, Lemma 1, Proposition 2, and Proposition 3].

Lemma 2 (Mani [7]). Let d ≥ 3, let P be a Mani d-polytope, and assume that
Γ(P) ≥ 2. Then f_0(P) ≥ d + p(d) + ⌈d/p(d)⌉ + 1.

Corollary 3. Let d ≥ 3, and let P be a Mani d-polytope with Γ(P) ≥ 2. Then d ≥ 6.

Proof. By Lemma 2, the number of vertices is at least d + p(d) + ⌈d/p(d)⌉ + 1. But for
3 ≤ d ≤ 5 we have that d + p(d) + ⌈d/p(d)⌉ + 1 > 2d. Since P is a Mani polytope, it has
at most as many vertices as the illuminated d-crosspolytope, namely 2d, so we
must have d ≥ 6.

Theorem 4. For 1 ≤ d ≤ 5 there is exactly one combinatorial type of Mani d-polytope,
namely the d-dimensional crosspolytope.

Proof. The cases d = 1, 2 are trivial, so assume d ≥ 3. Let P be a Mani d-polytope
with 3 ≤ d ≤ 5. By Corollary 3, we have Γ(P) = 1. Lemma 1 and the existence of the
crosspolytopes then imply that f_0(P) = 2d and that there is a perfect matching on the
inner diagonals. It remains to show that the combinatorial structure of P is determined
by this. First, any facet of P can contain only one vertex of any inner diagonal. Second,
any facet of P must have at least d vertices. Thus, any facet of P consists of exactly
d vertices such that no two share an inner diagonal. Since every ridge of P must lie in
two facets, exchanging a vertex with its “matching vertex” also yields a facet. Thus,
the facets of P are given exactly by the subsets of the vertices that do not contain
two vertices that share an inner diagonal. The combinatorial structure of a polytope is
determined by the vertex-facet incidences. Thus we have shown that the combinatorial
structure of a d-polytope is determined by the properties that f_0(P) = 2d and that
there is a perfect matching of inner diagonals. Since the d-dimensional crosspolytope
satisfies the two properties, our polytope P is combinatorially equivalent to the d-
dimensional crosspolytope.
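
The facet description obtained in this proof can be made concrete. The following Python sketch enumerates the facets of the d-crosspolytope as the vertex sets that pick exactly one endpoint from each matched pair. The vertex labels (i, s), standing for the point s·e_i, and the function name are illustrative assumptions, not notation from the text.

```python
from itertools import product

def crosspolytope_facets(d):
    """Facets of the d-crosspolytope as frozensets of vertex labels.

    A label (i, s) stands for the vertex s * e_i (i in range(d), s = +1 or -1).
    The matched pairs (inner diagonals) are {(i, +1), (i, -1)}; a facet takes
    exactly one vertex from each pair.
    """
    return [frozenset((i, s) for i, s in enumerate(signs))
            for signs in product((+1, -1), repeat=d)]

facets = crosspolytope_facets(3)
print(len(facets))  # the octahedron has 2**3 = 8 facets
```

Exchanging a vertex (i, s) of a facet for its matching vertex (i, -s) again gives a facet, which mirrors the ridge argument used in the proof.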

3. NONSIMPLICIAL MANI POLYTOPES.

Theorem 5. There exists a nonsimplicial Mani d-polytope for every d ≥ 6.

Proof. For every d ≥ 6 we construct a nonsimplicial Mani d-polytope. (Observe that
for d = 6, 7 we have d + p(d) + ⌈d/p(d)⌉ + 1 = 2d.) Let d > p ≥ 1, q := ⌈d/p⌉, and
choose an ℓ with 1 ≤ ℓ ≤ q − 1.
We construct a nonsimplicial polytope Q with d + p vertices that has q + 1 simplex
facets, such that stacking onto these facets produces a nonsimplicial illuminated d-
polytope. (What we describe here is in fact a whole family of such polytopes, indexed
by the parameter ℓ.)
We describe Q in terms of a Gale diagram A. Let

B = {−𝟏, e_1, . . . , e_{p−1}},

where 𝟏 denotes the vector in which all entries are 1. This is a positive basis of R^{p−1}
of cardinality p. The vectors in A are the following:
(1) Take ℓ copies of B, and denote them by B_1, . . . , B_ℓ.
(2) Take q − ℓ copies of −B, and denote them by B̃_1, . . . , B̃_{q−ℓ}.
(3) Furthermore, take the vectors 𝟏, −e_1, . . . , −e_{d+p−pq−1}.
Then the number of vectors in A is d + p.
By the translation between Gale diagram and polytope combinatorics, every B_i,
i = 1, . . . , ℓ, and every B̃_j, j = 1, . . . , q − ℓ, corresponds to a facet complement of
size p in Q, that is, to the complement of a simplex facet. If we augment the set
{𝟏, −e_1, . . . , −e_{d+p−pq−1}} to a positive basis B′ by taking the last pq − d vectors of
B̃_1, we get that

{B_i : i = 1, . . . , ℓ} ∪ {B̃_j : j = 1, . . . , q − ℓ} ∪ {B′}

is a set of subconfigurations of A that correspond to complements of simplex facets of
Q. These complements cover all vertices of Q. Thus, stacking onto the corresponding
facets we obtain an illuminated polytope P with d + p + q + 1 vertices, since every
vertex will be on an inner diagonal to one of the stacking vertices. For p := p(d) we
get that P is a Mani polytope.
Since 1 ≤ ℓ ≤ q − 1, there is a set of two opposite vectors in A, which corresponds
to a facet complement, so Q is nonsimplicial unless p = 2. For d ≥ 7, we
have p(d) ≥ 3 and we indeed get a nonsimplicial polytope. However, for d = 6 we
get p = p(d) = 2 and q = 3. In this case, we choose p := 3 instead of p := p(d).
Then q = ⌈d/p⌉ = 2 and we have f_0(P) = 12 = M(6), that is, P is a Mani polytope.
In both cases, the polytope P is nonsimplicial, because Q is nonsimplicial and we
only stack onto simplex facets.
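
As a sanity check on the counting in this proof, the following Python sketch builds the vectors of the Gale diagram A for given d, p, and ℓ and compares the sizes with the numbers stated above. The function name and the encoding of vectors as tuples are assumptions made for illustration, and the value of p is passed in explicitly rather than computed from d.

```python
def theorem5_gale_vectors(d, p, ell):
    """Return (A, q): the vectors of A as tuples in R^(p-1), and q = ceil(d/p)."""
    q = -(-d // p)  # integer ceiling of d / p
    if not (d > p >= 1 and 1 <= ell <= q - 1):
        raise ValueError("need d > p >= 1 and 1 <= ell <= q - 1")
    dim = p - 1
    ones = tuple([1] * dim)

    def e(i):
        # i-th standard basis vector of R^(p-1), 0-indexed
        return tuple(1 if j == i else 0 for j in range(dim))

    def neg(v):
        return tuple(-x for x in v)

    B = [neg(ones)] + [e(i) for i in range(dim)]  # positive basis with p vectors
    minus_B = [neg(v) for v in B]
    A = []
    for _ in range(ell):                 # (1) ell copies of B
        A.extend(B)
    for _ in range(q - ell):             # (2) q - ell copies of -B
        A.extend(minus_B)
    A.append(ones)                       # (3) the all-ones vector ...
    A.extend(neg(e(i)) for i in range(d + p - p * q - 1))  # ... and -e_1, -e_2, ...
    return A, q

for d, p, ell in [(16, 4, 3), (6, 3, 1)]:
    A, q = theorem5_gale_vectors(d, p, ell)
    # number of vertices of Q, and of P after stacking onto q + 1 facets
    print(d, len(A), d + p + q + 1)
```

For the two parameter choices shown, the sketch reports 20 and 9 vectors (that is, d + p) and 25 and 12 vertices after stacking, in line with the values M(16) = 25 and M(6) = 12 quoted in the text.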

We look at two examples that arise from the above description.

Example 6. For d = 16, p = p(d) = 4, q = 4, and ℓ = 3, the result is given as a
Gale diagram in the left picture of Figure 6. In this case, the polytope Q has f_0 = 20
and five disjoint simplex facet complements of size four that cover all vertices. This
yields a nonsimplicial illuminated 16-polytope with M(16) = 25 vertices.

Figure 6. Gale diagrams of building blocks for nonsimplicial Mani polytopes: stacking onto a suitable set of
facets of a polytope with the left Gale diagram yields a nonsimplicial Mani 16-polytope; for the 2-dimensional
Gale diagram given on the right we stack onto three well-chosen facets to obtain a nonsimplicial Mani 6-
polytope.

Example 7. For the special case d = 6, we get a polytope Q as in the proof of Theorem 5
by constructing the Gale diagram on the right in Figure 6 with p = 3, q = 2,
and ℓ = 1.
The Gale diagram has three disjoint positive bases that cover all vectors: the basis
B_1 = {−𝟏, e_1, e_2}, and the bases B̃_1 = B′ = {𝟏, −e_1, −e_2}. These bases correspond
to complements of simplex facets of Q. Stacking onto these three facets produces a
nonsimplicial illuminated 6-polytope with M(6) = 12 vertices.

APPENDIX: A QUICK SKETCH OF GALE DIAGRAMS. Let P ⊂ R^d be a d-dimensional
polytope. To construct a Gale diagram for P, we start with the vertices
v_1, . . . , v_n ∈ R^d of P. By adding a row of ones we obtain a matrix

\[ V = \begin{pmatrix} v_1 & \cdots & v_n \\ 1 & \cdots & 1 \end{pmatrix} \in \mathbb{R}^{(d+1)\times n} \]

of full row rank d + 1, so rowspace(V) has dimension d + 1. Indeed, column operations
lead to

\[ \begin{pmatrix} v_1 & v_2 - v_1 & \cdots & v_n - v_1 \\ 1 & 0 & \cdots & 0 \end{pmatrix} \in \mathbb{R}^{(d+1)\times n}, \]

a matrix whose columns are easily seen to span R^{d+1}.


An affine dependence between the vertices of P is given by coefficients λ_1, . . . , λ_n
with λ_1 + · · · + λ_n = 0 such that λ_1 v_1 + · · · + λ_n v_n = 0. Thus the affine dependences
are represented by row vectors (λ_1, . . . , λ_n) that are orthogonal to all the rows of V,
that is, by vectors in rowspace(V)^⊥, which has dimension n − d − 1.
Any basis for rowspace(V)^⊥ yields the rows of a matrix G = (g_1, . . . , g_n) ∈
R^{(n−d−1)×n} such that rowspace(G) = rowspace(V)^⊥. The matrix G is unique up to
invertible row operations (i.e., basis change in rowspace(G)). So the sequence of
vectors g_1, . . . , g_n ∈ R^{n−d−1}, a Gale diagram of P, is unique up to linear coordinate
transformations in R^{n−d−1}. It encodes the affine dependences of the vertices of P,
so it encodes P up to affine transformations. The vectors g_1, . . . , g_n ∈ R^{n−d−1} span
R^{n−d−1} since the matrix G has full row rank by construction. Since each row of
G is orthogonal to the all ones vector (1, . . . , 1), the columns g_1, . . . , g_n of G satisfy
g_1 + · · · + g_n = 0, so they are positively spanning.
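
The construction just described can be carried out mechanically. Below is a minimal Python sketch, using exact rational arithmetic from the standard library, that computes one Gale diagram of a polytope from its vertex list by finding a basis of the null space of V. The helper names `null_space` and `gale_diagram` are illustrative assumptions, and the result is only one representative, since a Gale diagram is determined up to a linear change of coordinates.

```python
from fractions import Fraction

def null_space(rows):
    """Basis of {x : M x = 0} for the matrix M given by its rows (exact arithmetic)."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0])
    pivots, r = [], 0
    for c in range(n_cols):
        if r == len(m):
            break
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -m[row_idx][f]
        basis.append(vec)
    return basis

def gale_diagram(vertices):
    """Columns g_1, ..., g_n of a matrix G whose rows span rowspace(V)^perp."""
    d, n = len(vertices[0]), len(vertices)
    V = [[v[i] for v in vertices] for i in range(d)] + [[1] * n]
    G = null_space(V)
    return [tuple(row[j] for row in G) for j in range(n)]

# Octahedron: vertices +e_1, -e_1, +e_2, -e_2, +e_3, -e_3 (n = 6, d = 3)
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
g = gale_diagram(octahedron)
print(len(g), len(g[0]))  # 6 vectors in dimension n - d - 1 = 2
```

For the octahedron, the two vectors belonging to a pair of antipodal vertices coincide, and all six vectors sum to zero.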
Conversely, if g_1, . . . , g_n are any n positively spanning vectors in R^{n−d−1}, then there
must be nonnegative coefficients α_1, . . . , α_n such that α_1 g_1 + · · · + α_n g_n = −(g_1 +
· · · + g_n). Thus we have positive coefficients β_i := α_i + 1 > 0 with β_1 g_1 + · · · +
β_n g_n = 0. Hence G = (g_1, . . . , g_n) ∈ R^{(n−d−1)×n} is a matrix of full row rank n − d − 1
with a positive dependence β_1 g_1 + · · · + β_n g_n = 0 with β_i > 0. This determines a matrix

\[ W = \begin{pmatrix} w_1 & \cdots & w_n \\ \beta_1 & \cdots & \beta_n \end{pmatrix} \in \mathbb{R}^{(d+1)\times n} \]

with rowspace(W) = rowspace(G)^⊥, and thus in particular with full row rank d + 1.
The rows of G give the linear dependences of the columns of W and thus also linear
dependences of the vectors

\[ \begin{pmatrix} (1/\beta_i)\, w_i \\ 1 \end{pmatrix} =: \begin{pmatrix} v_i \\ 1 \end{pmatrix} \]

with the same sign pattern. This allows one to reconstruct the full combinatorics of the
polytope P := conv{v_1, . . . , v_n} ⊂ R^d from the sign patterns of the linear dependences
of the columns of G. However, G determines a polytope with n vertices if and only
if no point v_i is in the convex hull of the other points v_j, that is, if and only if there
is no vector in the row space of G that has exactly one negative component, that is,
if and only if every open half-space in R^{n−d−1} contains at least two of the vectors
g_i. As observed in the introduction above, this is equivalent to the condition that the
vector configuration g_1, . . . , g_n in R^{n−d−1} is positively 2-spanning. Figure 7 gives a
very simple example.

Figure 7. An octahedron (6 vertices, 3-dimensional), and its Gale diagram (6 vectors, 2-dimensional).

The result of this remarkable construction may be summarized as follows:

polytope P                         ←→  Gale diagram G
convex hull of n points in R^d     ←→  n vectors in R^{n−d−1}
each of the points is a vertex     ←→  every open half-space contains ≥ 2 vectors
face of P                          ←→  positive dependence of G
points not on the face             ←→  vectors not involved in the positive dependence

ACKNOWLEDGMENTS. The authors thank Raman Sanyal for bringing Mani’s paper to their attention.
Peter McMullen provided valuable comments during a number of email exchanges in 2008 and 2009. We are
grateful to an anonymous referee for very helpful comments and suggestions on the exposition.
Our work was supported by the Deutsche Forschungsgemeinschaft (DFG) within the Research Training
Group “Methods for Discrete Structures” (GRK 1408).

REFERENCES

1. L. M. Blumenthal, Theory and Applications of Distance Geometry, Oxford University Press, 1953.
2. D. Bremner and V. Klee, Inner diagonals of convex polytopes, J. Combin. Theory, Ser. A 87 (1999)
175–197. doi:10.1006/jcta.1998.2953
3. C. Davis, Theory of positive linear dependence, Amer. J. Math. 76 (1954) 733–746. doi:10.2307/2372648
4. B. Grünbaum, Convex Polytopes, Interscience, London 1967; 2nd ed. (V. Kaibel, V. Klee and G. M.
Ziegler, eds.), Graduate Texts in Mathematics, vol. 221, Springer-Verlag, New York, 2003.
5. H. Hadwiger, Ungelöste Probleme, Nr. 55, Elem. Math. 27 (1972) 57.
6. G. Kalai, Some aspects of the combinatorial theory of convex polytopes, in Polytopes: Abstract, Convex
and Computational—Scarborough, ON, 1993, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 440,
Kluwer Academic, Dordrecht, 1994, 205–229.
7. P. Mani, Inner illumination of convex polytopes, Comment. Math. Helv. 49 (1974) 65–73. doi:10.1007/BF02566719
8. D. A. Marcus, Minimal positive 2-spanning sets of vectors, Proc. Amer. Math. Soc. 82 (1981) 165–172.
9. D. A. Marcus, Gale diagrams of convex polytopes and positive spanning sets of vectors, Discrete Appl. Math.
9 (1984) 47–67. doi:10.1016/0166-218X(84)90090-8
10. J. Matoušek, Lectures on Discrete Geometry, Graduate Texts in Mathematics, vol. 212, Springer-Verlag,
New York, 2002.
11. P. McMullen, Transforms, diagrams and representations, in Contributions to Geometry, Proceedings of
the Geom. Sympos.—Siegen 1978, Birkhäuser, Basel, 1979, 92–130.
12. M. Rosenfeld, Inner illumination of convex polytopes, Elem. Math. 30 (1974) 27–28.
13. G. C. Shephard, Diagrams for positive bases, J. London Math. Soc. 4 (1971) 165–175. doi:10.1112/jlms/s2-4.1.165
14. R. F. Wotzlaw, Incidence Graphs and Unneighborly Polytopes, Ph.D. dissertation, Technische Universität
Berlin, 2009; available at http://opus.kobv.de/tuberlin/volltexte/2009/2221/.
15. G. M. Ziegler, Lectures on Polytopes, Graduate Texts in Mathematics, vol. 152, Springer-Verlag, New
York, 1995.

RONALD F. WOTZLAW received his Ph.D. at TU Berlin in 2009, in the framework of the DFG Research
Training Group “Methods for Discrete Structures.” In 2009, he decided to combine his passions for computer
science, mathematics, and photography and joined Nik Software, where he now works on photography-related
problems in digital imaging.
Nik Software GmbH, Hinter den Kirschkaten 26, 23560 Lübeck, Germany

GÜNTER M. ZIEGLER received his Ph.D. at M.I.T. in 1987, and after four years in Augsburg and a winter
in Stockholm, arrived in Berlin in 1992. He has been a professor at TU Berlin since 1995. He is the speaker of
the DFG Research Training Group “Methods for Discrete Structures,” and a co-chair of the Berlin Mathemat-
ical School. His writing includes Proofs from THE BOOK (1998, with Martin Aigner, in German Das BUCH
der Beweise, translated into 12 other languages) and a 2010 book in German Darf ich Zahlen? Geschichten
aus der Mathematik.
Inst. Mathematics, Freie Universität Berlin, Arnimallee 2, 14195 Berlin, Germany

NOTES
Edited by Ed Scheinerman

Conway’s Conjecture for Monotone Thrackles
János Pach and Ethan Sterling

Abstract. A drawing of a graph in the plane is called a thrackle if every pair of edges meet
precisely once, either at a common vertex or at a proper crossing. According to Conway’s
conjecture, every thrackle has at most as many edges as vertices. We prove this conjecture for
x-monotone thrackles, that is, in the case when every edge meets every vertical line in at most
one point.

1. INTRODUCTION. A drawing of a graph is a representation of the graph in the
plane such that the vertices are represented by distinct points and the edges by (possibly
crossing) simple continuous curves connecting the corresponding point pairs and
not passing through any other point representing a vertex. If it leads to no confusion,
we make no notational distinction between a drawing and the underlying abstract graph
G. In the same vein, V (G) and E(G) will stand for the vertex set and edge set of G as
well as for the sets of points and curves representing them.
A drawing of G is called a thrackle if every pair of edges meet precisely once, either
at a common vertex or at a proper crossing. (A crossing p of two curves is proper if at
p one curve passes from one side of the other curve to its other side. Two edges that
share an endpoint cannot have any other point in common.)
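
For drawings with straight-line edges, the thrackle condition can be checked directly. The following Python sketch tests whether every pair of segments either shares exactly one endpoint or crosses properly, and applies the test to a five-cycle drawn as a star on five points in convex position. The function names and the sample coordinates are illustrative assumptions, and the check presumes that no three of the points are collinear.

```python
from itertools import combinations

def orient(a, b, c):
    """Sign of the signed area of triangle abc: +1 (counterclockwise), -1, or 0."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def properly_cross(s, t):
    """True if segments s and t cross at a point interior to both."""
    (a, b), (c, d) = s, t
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)

def is_straight_line_thrackle(points, edges):
    """Check that every pair of edges meets exactly once.

    Assumes no three points are collinear, so two edges that share exactly
    one endpoint have no other point in common.
    """
    for (i, j), (k, l) in combinations(edges, 2):
        shared = len({i, j} & {k, l})
        if shared == 1:
            continue
        if shared == 0 and properly_cross((points[i], points[j]),
                                          (points[k], points[l])):
            continue
        return False
    return True

pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]  # five points in convex position
star = [(i, (i + 2) % 5) for i in range(5)]           # C5 drawn as a pentagram
print(is_straight_line_thrackle(pentagon, star))       # True
```

By contrast, the boundary cycle of the same pentagon fails the test, because nonadjacent boundary edges are disjoint.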
More than forty years ago, Conway [1, 6, 13] conjectured that every thrackle has
at most as many edges as vertices, and offered a bottle of beer for a solution. (The
prize has since risen to a thousand dollars.) In spite of considerable efforts, Conway’s
thrackle conjecture is still open. If true, Conway’s conjecture would be tight, as any
cycle of length at least five can be drawn as a thrackle (see [15]). Two thrackle drawings
of C5 and C6 are shown in Figure 1. According to legend, Conway named these draw-
ings after a peculiar term he heard from a fisherman during his holiday in Scotland:
the man described his tangled fishing line as “thrackled.”

Figure 1. C5 and C6 drawn as thrackles.

doi:10.4169/amer.math.monthly.118.06.544

The first linear upper bound for the number of edges of a thrackle with n vertices,
2n, was established by Lovász, Pach, and Szegedy [10]. Cairns and Nikolayevsky
improved this bound to (3/2)n [3]. Presently, the best known bound, (167/117)n < 1.428n, is due
to Fulek and Pach [7]. For related results, see [2, 4, 8, 11, 12].
The origins of the thrackle problem go back to the nineteen-thirties, when Hopf and
Pannwitz [9] posed the following problem in the Jahresbericht der Deutschen Mathe-
matischen Vereinigung. The diameter of a finite point set is the maximum distance
between two of its elements. Let P be a set of n points in the plane with diameter one.
Prove that the maximum number of point pairs in P which determine distance one is
n. Several solutions were submitted [14]. Much later, Erdős noticed that the statement
follows from a more general result: Every graph drawn in the plane with straight-line
edges so that each pair of edges either share an endpoint or properly cross has at most
as many edges as vertices. Indeed, it is easy to verify that joining two points of a unit
diameter set in the plane by a segment if they are at distance one results in a draw-
ing that satisfies the above condition. Using Conway’s terminology, we obtain that
Conway’s conjecture is true for straight-line thrackles, that is, for thrackles drawn by
straight-line edges on a set of vertices, no three of which are collinear. (We make the
latter assumption to exclude edges that share a whole segment.)

Theorem 1 (Erdős). Every straight-line thrackle has at most as many edges as ver-
tices.

The most elegant proof of Theorem 1 is due to Perles. Let G be a straight-line
thrackle. We call a vertex v ∈ V(G) pointed if all edges incident to v lie in a half-
plane bounded by a line passing through v. (See Figure 2a.) For each pointed vertex
v, delete from G the “leftmost” edge incident to v, that is, the first element in the
clockwise order of edges around the half-plane at v. We claim that this deletes all edges,
which proves the theorem, since at most one edge is deleted per vertex. Indeed, if we
are left with an edge uv ∈ E(G), then it was
not leftmost at u, nor at v. Thus, originally G contained two edges, uv′ and u′v, with
∠v′uv < π and ∠u′vu < π. These two edges must lie on opposite sides of the line
uv. Hence, they are disjoint, contradicting the thrackle condition. See Figure 2b.

Figure 2. (a) a pointed vertex; (b) Perles’ argument; and (c) why it breaks down for x-monotone thrackles.

The above proof applies verbatim to another special class of thrackles, for which
it makes sense to speak of “leftmost” edges. A thrackle is called outerplanar if its
vertices lie on a circle and its edges are represented by continuous curves contained in
the interior of this circle [5].

Theorem 2. Every outerplanar thrackle has at most as many edges as vertices.

Cairns and Nikolayevsky [5] have recently established the stronger statement that
every outerplanar thrackle which has no vertex of degree at most one is an odd cycle.
Woodall [15] characterized all thrackles, assuming that Conway’s conjecture is true.

The aim of this note is to verify Conway’s conjecture in another special case. We
call a thrackle x-monotone if each curve representing an edge meets every vertical line
in at most one point. In particular, every straight-line thrackle with no vertical edge is
x-monotone.

Theorem 3. Every x-monotone thrackle has at most as many edges as vertices.

Theorem 3 is a generalization of Theorem 1. However, the argument of Perles fails
in this case, because the edges uv′ and u′v may cross (see Figure 2c). Instead, we can
exploit the fact that there is a natural partial order on the edges.

2. PROOF OF THEOREM 3. Let G be an x-monotone thrackle with n vertices and
e edges. In the sequel, assume without loss of generality that G is in “general position,”
that is, no two vertices have the same x-coordinate and no three edges pass through the
same point. The orthogonal projections of the edges to the x-axis are closed segments
with the property that any two of them have at least one point in common. By Helly’s
theorem in dimension 1, we obtain that all of these segments share a point. Hence,
there is a vertical line ℓ that meets all edges of G.
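
The one-dimensional Helly step is easy to make explicit: closed intervals that pairwise intersect all contain the largest left endpoint, because it cannot exceed the smallest right endpoint. The Python sketch below returns such a common point; the function name and the representation of intervals as (left, right) pairs are illustrative assumptions.

```python
def common_point(intervals):
    """Return a point lying in every closed interval, or None if there is none.

    Each interval is a pair (left, right) with left <= right. If the intervals
    pairwise intersect, the interval with the largest left endpoint meets the
    one with the smallest right endpoint, so max(left) <= min(right).
    """
    left = max(a for a, _ in intervals)
    right = min(b for _, b in intervals)
    if left > right:
        return None
    return (left + right) / 2

# x-projections of three edges; any vertical line x = c with 2 <= c <= 3 meets all of them
print(common_point([(0, 3), (2, 5), (1, 4)]))  # 2.5
```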
We distinguish two cases.

Case A. The line ℓ does not pass through any vertex of G.

In this case, G is a bipartite graph: all of its edges connect a point in the left half-
plane bounded by ℓ to a point in the right half-plane. We show that the number of
edges of G is strictly smaller than n. Indeed, otherwise G would contain a cycle, which
must have even length because G is bipartite, contradicting the following lemma.

Lemma. No even cycle can be drawn as an x-monotone thrackle.

Proof. Suppose for a contradiction that an even cycle C =
v_0 v_1 · · · v_{k−1} is drawn as an x-monotone thrackle (k is even and the indices are taken
modulo k).
First of all, notice that we can assume without loss of generality that C meets the
requirements of Case A: none of its vertices lies on ℓ. Indeed, if there existed such a
vertex v_i, the two edges of C meeting at v_i would lie in the same half-plane bounded
by ℓ; otherwise, by the evenness of k, one of them would be disjoint from another edge
of C that lies entirely in the opposite open half-plane. If both v_{i−1}v_i and v_i v_{i+1} lie in
the same half-plane, then slightly translating ℓ we can bring it to a position where no
vertex lies on it and ℓ (strictly) crosses every edge of C.
We say that the edge v_{i−1}v_i lies below the edge v_i v_{i+1} if the intersection point of
v_{i−1}v_i with the line ℓ is below the intersection point of v_i v_{i+1} with ℓ. Notice that, by
the definition of a thrackle, these two points cannot coincide, since the interiors of any
two adjacent edges must be disjoint. If an edge lies below another edge adjacent to it,
then we say that the latter edge lies above the first one.
Observe that if the edge v_{i−1}v_i lies below v_i v_{i+1}, then the next edge along C,
v_{i+1}v_{i+2}, must also lie below v_i v_{i+1}, since otherwise v_{i−1}v_i and v_{i+1}v_{i+2} could not
cross. In this case, we say that v_i v_{i+1} is an upper edge. Otherwise, both v_{i−1}v_i and
v_{i+1}v_{i+2} must lie above v_i v_{i+1}, and v_i v_{i+1} is called a lower edge. Obviously, the edges
of C are alternately upper and lower edges. See Figure 3.
Consider now the cycle C as a closed self-intersecting curve embedded in the plane,
which divides the plane into simply connected regions. Exactly one of these regions is
unbounded. It is well known and easy to prove that one can color these regions with
two colors, white and grey, say, such that no two regions that share a boundary arc
are of the same color. (By our assumption of general position, each point at which C
intersects itself belongs to the boundary of precisely four regions, two of which are
white and the other two grey.)

Figure 3. An even cycle. The upper edges are marked.

Let v_a and v_b be the leftmost and rightmost vertices of C, respectively. Assume, by
symmetry, that 0 < a < b < k. Since v_a and v_b are on different sides of ℓ and every
edge of C crosses ℓ, we have that b − a is odd. Assume, again by symmetry, that
v_a v_{a+1} is an upper edge. Since every upper edge is followed by a lower edge and vice
versa, we obtain that v_{b−1}v_b must also be an upper edge.
Let us trace now the curve P = v_a v_{a+1} · · · v_{b−1} v_b from left to right, and record the
color changes along the right-hand side of P. Every time we arrive at a crossing where
C intersects itself, the color changes. How many crossings are there along P? Each
edge of C (including the edges of P) crosses k − 3 other edges, because each of them
must cross every other edge except itself and the two adjacent edges. Since k is even,
k − 3 is odd. We have seen that b − a is odd, so that P consists of an odd number
of edges, each of which is crossed by an odd number of other edges. Therefore, the
total number of crossings along P must be odd. (Note that every crossing where P
crosses itself is counted twice!) This implies that the color of the region lying on the
right-hand side of the initial portion of P is different from the color on the right-hand
side of its final portion. Since v_a v_{a+1} and v_{b−1}v_b are upper edges, this means that the
regions directly below the initial and final portions of P are of different colors.
On the other hand, the points above v_a and the points above v_b belong to the (unique)
unbounded region, and all of them are colored with the color of that region. Hence, the
points above a small initial portion of P and the points above a small final portion of
P are of the same color. This yields that the regions directly below the initial and final
portions of P must be of the same color. This contradiction completes the proof of the
lemma.

Case B. The line ℓ passes through a vertex v ∈ V(G).

In this case, replace v by two vertices, v′ and v″, very close to each other and to the
original position of v. Let v′ lie in the left half-plane, and let it be the new left endpoint
of all edges that previously had v as their left endpoints. Analogously, let us redirect
all edges that previously had v as their right endpoints to the point v″, which lies in
the right half-plane. We can make sure that after this slight perturbation, every edge
incident to v′ will cross all edges incident to v″, and the resulting drawing G′ remains
an x-monotone thrackle with n′ = n + 1 vertices and e′ = e edges.
Notice that all edges of G′ meet ℓ and none of the vertices of G′ lies on ℓ; that is, the
conditions of Case A are satisfied. However, in Case A, we have shown that the number of edges is
strictly smaller than the number of vertices. Thus, we have e′ < n′, so that e < n + 1,
that is, e ≤ n, and the proof of Theorem 3 is complete.

ACKNOWLEDGMENTS. The present note grew out of the Intel project of the second named author in 2005,
while he was a student at Stuyvesant High School, New York, under the supervision of the first named author.
Some portions of the argument have been rediscovered by Radoslav Fulek, Rom Pinchasi, Konrad Swanepoel,
and Géza Tóth. Research on this paper was partially supported by NSF grant CCF-08-3072, Swiss NSF grant
200021-125287/1, and grants from OTKA and BSF. The second named author was also supported by MIT’s
UROP program, under the direction of Dr. Karl Mahlburg.

REFERENCES

1. P. Brass, W. Moser, and J. Pach, Research Problems in Discrete Geometry, Springer-Verlag, New York,
2005.
2. G. Cairns, M. McIntyre, and Y. Nikolayevsky, The thrackle conjecture for K_5 and K_{3,3}, in Towards a
Theory of Geometric Graphs, J. Pach, ed., Contemporary Mathematics, vol. 342, American Mathematical
Society, Providence, RI, 2004, 35–54.
3. G. Cairns and Y. Nikolayevsky, Bounds for generalized thrackles, Discrete Comput. Geom. 23 (2000)
191–206. doi:10.1007/PL00009495
4. G. Cairns and Y. Nikolayevsky, Generalized thrackle drawings of non-bipartite graphs, Discrete Comput. Geom. 41 (2009) 119–
134. doi:10.1007/s00454-008-9095-5
5. G. Cairns and Y. Nikolayevsky, Outerplanar thrackles, manuscript (2009), available at
http://www.latrobe.edu.au/mathstats/staff/cairns/papers/alter.pdf.
6. J. H. Conway, Unsolved problem, in Combinatorics: Being the Proceedings of the Conference on Combi-
natorial Mathematics–Mathematical Institute, Oxford, D. J. A. Welsh and D. R. Woodall, eds., Institute
of Mathematics and Its Applications, Southend-on-Sea, UK, 1972, 351–363.
7. R. Fulek and J. Pach, A computational approach to Conway’s thrackle conjecture, in Graph Drawing
2010, Lecture Notes in Computer Science, vol. 6502, Springer-Verlag, Berlin, 2011, 226–237.
8. J. E. Green and R. D. Ringeisen, Combinatorial drawings and thrackle surfaces, in Graph Theory, Com-
binatorics, and Algorithms, Vol. 2. Proceedings of the Seventh Quadrennial International Conference on
the Theory and Applications of Graphs–Western Michigan University, Kalamazoo, MI, 1992, Y. Alavi
and A. Schwenk, eds., Wiley-Interscience, New York, 1995, 999–1009.
9. H. Hopf and E. Pannwitz, Aufgabe Nr. 167, Jahresber. Deutsch. Math.-Verein. 43 (1934) 114.
10. L. Lovász, J. Pach, and M. Szegedy, On Conway’s thrackle conjecture, Discrete Comput. Geom. 18
(1998) 369–376. doi:10.1007/PL00009322
11. A. Perlstein and R. Pinchasi, Generalized thrackles and geometric graphs in R^3 with no pair of strongly
avoiding edges, Graphs Combin. 24 (2008) 373–389. doi:10.1007/s00373-008-0796-6
12. B. L. Piazza, R. D. Ringeisen, and S. K. Stueckle, Subthrackleable graphs and four cycles, in Graph
Theory and Applications–Hakone, 1990, M. Kano, H. Okamura, S. Tazawa, K. Ushio, and Y. Yamasaki,
eds., Discrete Math. 127, no. 1-3 (1994) 265–276. doi:10.1016/0012-365X(92)00484-9
13. R. D. Ringeisen, Two old extremal graph drawing conjectures: Progress and perspectives, Congr. Numer.
115 (1996) 91–103.
14. J. W. Sutherland, Lösung der Aufgabe 167, Jahresber. Deutsch. Math.-Verein. 45 (1935) 33–35.
15. D. R. Woodall, Thrackles and deadlock, in Combinatorial Mathematics and Its Applications, D. J. A.
Welsh, ed., Academic Press, New York, 1969, 335–348.

EPFL Lausanne and Rényi Institute, H-1364 Budapest, POB 127, Hungary

MIT, Department of Mathematics, Cambridge, MA 02139, USA





Proofs of Power Sum and Binomial
Coefficient Congruences via Pascal’s Identity
Kieren MacMillan and Jonathan Sondow

Abstract. A well-known and frequently cited congruence for power sums is

\[ 1^n + 2^n + \cdots + p^n \equiv \begin{cases} -1 \pmod{p} & \text{if } (p-1) \mid n, \\ 0 \pmod{p} & \text{if } (p-1) \nmid n, \end{cases} \]

where n ≥ 1 and p is prime. We survey the main ingredients in several known proofs. Then
we give an elementary proof, using an identity for power sums proven by Pascal in 1654. An
application is a simple proof of a congruence for certain sums of binomial coefficients, due to
Hermite and Bachmann.

In the literature on power sums

\[ S_n(a) := \sum_{j=1}^{a} j^n = 1^n + 2^n + \cdots + a^n \qquad (n \ge 0,\ a \ge 1), \tag{1} \]

the following congruence is well known and is frequently cited.

Theorem 1. Let p be a prime. For n ≥ 1, we have

\[ S_n(p) \equiv \begin{cases} -1 \pmod{p} & \text{if } (p-1) \mid n, \\ 0 \pmod{p} & \text{if } (p-1) \nmid n. \end{cases} \]
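
Theorem 1 is easy to test numerically for small cases. The following Python sketch compares S_n(p) modulo p with the predicted residue for a few primes and exponents; the function names are illustrative assumptions, and the check is only a finite experiment, not a proof.

```python
def power_sum(n, a):
    """S_n(a) = 1^n + 2^n + ... + a^n."""
    return sum(j ** n for j in range(1, a + 1))

def theorem1_residue(n, p):
    """Residue of S_n(p) modulo p predicted by Theorem 1 (p prime, n >= 1)."""
    return p - 1 if n % (p - 1) == 0 else 0

for p in (2, 3, 5, 7, 11, 13):
    for n in range(1, 40):
        assert power_sum(n, p) % p == theorem1_residue(n, p)

print(power_sum(4, 5) % 5, power_sum(3, 5) % 5)  # 4 0
```

Here the residue p − 1 represents −1 modulo p.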

For example, it is used to prove theorems on Bernoulli numbers (that of von Staudt-
Clausen in [2] and [7, Theorem 118], of Carlitz-von Staudt in [2, Theorem 4] and
[11], and of Almkvist-Meurman in [3, Theorem 9.5.29]) and to study the Erdős-Moser
Diophantine equation S_n(m − 1) = m^n (in [10], [11], [12], and [15]) as well as other
exponential Diophantine equations and Stirling numbers of the second kind (in [9]).
A component of most proofs of Theorem 1 is Fermat’s little theorem ([7, Theorem
71], [14, p. 36]), which says that if p is prime, then

p ∤ a ⟹ a^{p−1} ≡ 1 (mod p). (2)

To prove the nontrivial case (p − 1) ∤ n, another component is needed. The usual
proof ([7, Theorem 119], [10, Lemma 1]) relies on the theory of primitive roots.
(It gives an integer g such that g^n ≢ 1 (mod p). Then p ∤ g, implying S_n(p) ≡
g^n S_n(p) (mod p), and we infer that p | S_n(p).) Another proof [11, Lemma 2], due
to Zagier, invokes Lagrange’s theorem (see [14, p. 39]) on roots of polynomials over
Z/pZ. (Using it, Zagier deduces the existence of an integer g with g^n ≢ 1 (mod p).)
Still a third proof [2] employs Bernoulli numbers and finite differences.
In this note, we give a very elementary proof of Theorem 1, using a recurrence for
the sequence of power sums S_0(a), S_1(a), . . . proven by Pascal [13] in 1654 (see [6,
p. 82]).
doi:10.4169/amer.math.monthly.118.06.549

Pascal’s Identity. If n ≥ 0 and a ≥ 1, then

\[ \sum_{k=0}^{n} \binom{n+1}{k} S_k(a) = (a+1)^{n+1} - 1. \tag{3} \]

Proof. For j > 0, the binomial theorem gives

\[ (j+1)^{n+1} - j^{n+1} = \sum_{k=0}^{n} \binom{n+1}{k} j^k. \]

Summing from j = 1 to a, the left-hand side telescopes to (a + 1)^{n+1} − 1, and by (1)
we get the desired identity.
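
Pascal’s identity (3) can be checked by direct computation. The Python sketch below evaluates both sides for small n and a; the function names are illustrative assumptions, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def power_sum(k, a):
    """S_k(a) = 1^k + 2^k + ... + a^k (so S_0(a) = a)."""
    return sum(j ** k for j in range(1, a + 1))

def pascal_lhs(n, a):
    """Left-hand side of (3): sum over k = 0..n of C(n+1, k) * S_k(a)."""
    return sum(comb(n + 1, k) * power_sum(k, a) for k in range(n + 1))

for n in range(0, 8):
    for a in range(1, 8):
        assert pascal_lhs(n, a) == (a + 1) ** (n + 1) - 1

print(pascal_lhs(1, 2), (2 + 1) ** 2 - 1)  # 8 8
```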

Proof of Theorem 1. The case (p − 1) | n follows easily from (2).
To prove the second case, suppose on the contrary that (p − 1) ∤ n (so p > 2) but
p ∤ S_n(p). Let n be the smallest such number and write n = d(p − 1) + r, where
d ≥ 0 and 0 < r < p − 1. Now (2) yields S_n(p) ≡ S_r(p) (mod p), and the minimality
of n implies first that n = r and then that p | S_k(p) for k < n (note that S_0(p) = p).
Hence (3) with a = p implies p | (n + 1)S_n(p). But then p prime with p > n + 1
forces p | S_n(p), a contradiction. This completes the proof.

As a bonus, Pascal’s identity allows a simple proof of a congruence for certain sums of binomial coefficients $\binom{m}{k}$ (generalizing the easily-established facts that $\sum_{k=1}^{m-1} \binom{m}{k}$ is even for m > 0, and that if p is prime and p ≤ m ≤ 2(p − 1), then p divides $\binom{m}{p-1}$). The case m odd is due to Hermite [8] in 1876, and the general case to Bachmann [1, p. 46] in 1910; for these and related results, see [4, pp. 270–275].

Corollary 1 (Hermite and Bachmann). If m > 0 and p is prime, then

\[ \sum_{\substack{0 < k < m \\ (p-1) \mid k}} \binom{m}{k} \equiv 0 \pmod{p}, \tag{4} \]

where the sum is over all k ≡ 0 (mod p − 1) with 1 ≤ k ≤ m − 1.

Proof. Set n = m − 1 and a = p in (3), then reduce modulo p. Using Theorem 1, the
result follows.

For example,

\[ \binom{14}{4} + \binom{14}{8} + \binom{14}{12} = 1001 + 3003 + 91 \equiv 0 \pmod{5}. \]
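
The congruence (4) and the numerical example with m = 14 and p = 5 can be reproduced in a few lines. This is a sketch under the assumption that the name `lacunary_sum` is acceptable shorthand for the sum in (4); it is not terminology from the note.

```python
from math import comb


def lacunary_sum(m, p):
    """Sum of C(m, k) over 0 < k < m with (p - 1) dividing k, as in (4)."""
    return sum(comb(m, k) for k in range(p - 1, m, p - 1))


# The example from the text: 1001 + 3003 + 91 = 4095 = 5 * 819.
assert lacunary_sum(14, 5) == 1001 + 3003 + 91

# Corollary 1 for a range of m and a few small primes.
for p in (2, 3, 5, 7, 11):
    for m in range(1, 60):
        assert lacunary_sum(m, p) % p == 0
```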

For (4) and generalizations due to Glaisher and Carlitz, see [3, p. 70, Lemma 9.5.28;
p. 133, Exercise 62; and p. 327, Proposition 11.4.11]. Recently, Dilcher [5] discovered
an analog of (4) for alternating sums.

ACKNOWLEDGMENTS. We thank the three referees for several suggestions and references, and Pieter
Moree for sending us a preprint of [11].


REFERENCES

1. P. Bachmann, Niedere Zahlentheorie, Part 2, Teubner, Leipzig, 1910; Parts 1 and 2 reprinted in one
volume, Chelsea, New York, 1968.
2. L. Carlitz, The Staudt-Clausen theorem, Math. Mag. 34 (1961) 131–146. doi:10.2307/2688488
3. H. Cohen, Number Theory, Volume II: Analytic and Modern Tools, Graduate Texts in Mathematics, vol.
240, Springer-Verlag, New York, 2007.
4. L. E. Dickson, History of the Theory of Numbers, vol. 1, Carnegie Institution of Washington, Washington,
DC, 1919; reprinted by Dover, Mineola, NY, 2005.
5. K. Dilcher, Congruences for a class of alternating lacunary sums of binomial coefficients, J. Integer Seq.
10 (2007) Article 07.10.1.
6. A. W. F. Edwards, Pascal’s Arithmetical Triangle, Charles Griffin, London, 1987.
7. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 6th ed., D. R. Heath-Brown
and J. H. Silverman, eds., Oxford University Press, Oxford, 2008.
8. Ch. Hermite, Extrait d’une lettre à M. Borchardt, J. Reine Angew. Math. 81 (1876) 93–95.
9. B. C. Kellner, The equivalence of Giuga’s and Agoh’s conjectures (2004), available at http://arxiv.org/abs/math/0409259.
10. P. Moree, Moser’s mathemagical work on the equation $1^k + 2^k + \cdots + (m-1)^k = m^k$ (preprint), available at http://arxiv.org/abs/1011.2940.
11. P. Moree, A top hat for Moser’s four mathemagical rabbits, Amer. Math. Monthly 118 (2011) 364–370; also available at http://arxiv.org/abs/1011.2956.
12. L. Moser, On the Diophantine equation $1^n + 2^n + \cdots + (m-1)^n = m^n$, Scripta Math. 19 (1953) 84–88.
13. B. Pascal, Sommation des puissances numériques, in Oeuvres Complètes, vol. III, J. Mesnard, ed., Desclée-Brouwer, Paris, 1964, 341–367; English trans. A. Knoebel, R. Laubenbacher, J. Lodder, and D. Pengelley, Sums of numerical powers, in Mathematical Masterpieces: Further Chronicles by the Explorers, Springer-Verlag, New York, 2007, 32–37.
14. H. E. Rose, A Course in Number Theory, Clarendon Press, Oxford, 1988.
15. J. Sondow and K. MacMillan, Reducing the Erdős-Moser equation $1^n + 2^n + \cdots + k^n = (k+1)^n$ modulo $k$ and $k^2$ (preprint), available at http://arxiv.org/abs/1011.2154.

55 Lessard Avenue, Toronto, Ontario, Canada M6S 1X6


[email protected]
209 West 97th Street, New York, NY 10025
[email protected]

Meromorphic Continuation of Dirichlet Series via Derivations
Caleb Emmons

Abstract. We show how to use derivations on the ring of Dirichlet series to achieve mero-
morphic continuations of completely multiplicative Dirichlet series. In particular we present
infinitely many new representations of the Riemann zeta function as a quotient of series con-
verging on Re(s) > 0.

doi:10.4169/amer.math.monthly.118.06.551

A derivation on a ring is an operator that acts like a derivative, namely, it is linear and satisfies the product rule. Derivations have been used by Shapiro and others to investigate the algebraic independence of Dirichlet series (see [3], [4], and [2]). In this note we aim to use them to achieve meromorphic continuations. Although analytic
REFERENCES

1. P. Bachmann, Niedere Zahlentheorie, Part 2, Teubner, Leipzig, 1910; Parts 1 and 2 reprinted in one
volume, Chelsea, New York, 1968.
2. L. Carlitz, The Staudt-Clausen theorem, Math. Mag. 34 (1961) 131–146. doi:10.2307/2688488
3. H. Cohen, Number Theory, Volume II: Analytic and Modern Tools, Graduate Texts in Mathematics, vol.
240, Springer-Verlag, New York, 2007.
4. L. E. Dickson, History of the Theory of Numbers, vol. 1, Carnegie Institution of Washington, Washington,
DC, 1919; reprinted by Dover, Mineola, NY, 2005.
5. K. Dilcher, Congruences for a class of alternating lacunary sums of binomial coefficients, J. Integer Seq.
10 (2007) Article 07.10.1.
6. A. W. F. Edwards, Pascal’s Arithmetical Triangle, Charles Griffin, London, 1987.
7. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 6th ed., D. R. Heath-Brown
and J. H. Silverman, eds., Oxford University Press, Oxford, 2008.
8. Ch. Hermite, Extrait d’une lettre à M. Borchardt, J. Reine Angew. Math. 81 (1876) 93–95.
9. B. C. Kellner, The equivalence of Giuga’s and Agoh’s conjectures (2004), available at http://arxiv.
org/abs/math/0409259.
10. P. Moree, Moser’s mathemagical work on the equation 1k + 2k + · · · + (m − 1)k = m k (preprint), avail-
able at http://arxiv.org/abs/1011.2940.
11. , A top hat for Moser’s four mathemagical rabbits, Amer. Math. Monthly 118 (2011) 364–370;
also available at http://arxiv.org/abs/1011.2956.
12. L. Moser, On the Diophantine equation 1n + 2n + · · · + (m − 1)n = m n , Scripta Math. 19 (1953) 84–88.
13. B. Pascal, Sommation des puissances numériques, in Oeuvres Complètes, vol. III, J. Mesnard, ed., De-
sclée-Brouwer, Paris, 1964, 341–367; English trans. A. Knoebel, R. Laubenbacher, J. Lodder, and D. Pen-
gelley, Sums of numerical powers, in Mathematical Masterpieces: Further Chronicles by the Explorers,
Springer-Verlag, New York, 2007, 32–37.
14. H. E. Rose, A Course in Number Theory, Clarendon Press, Oxford, 1988.
15. J. Sondow and K. MacMillan, Reducing the Erdős-Moser equation 1n + 2n + · · · + k n = (k + 1)n mod-
ulo k and k 2 (preprint), available at http://arxiv.org/abs/1011.2154.

55 Lessard Avenue, Toronto, Ontario, Canada M6S 1X6


[email protected]
209 West 97th Street, New York, NY 10025
[email protected]

Meromorphic Continuation of Dirichlet


Series via Derivations
Caleb Emmons

Abstract. We show how to use derivations on the ring of Dirichlet series to achieve mero-
morphic continuations of completely multiplicative Dirichlet series. In particular we present
infinitely many new representations of the Riemann zeta function as a quotient of series con-
verging on Re(s) > 0.

A derivation on a ring is an operator that acts like a derivative, namely, it is linear


and satisfies the product rule. Derivations have been used by Shapiro and others to
investigate the algebraic independence of Dirichlet series—see [3], [4], and [2]. In this
note we aim to use them to achieve meromorphic continuations. Although analytic
doi:10.4169/amer.math.monthly.118.06.551

June–July 2011] NOTES 551


continuation of the Riemann zeta function to C \ {1} is well known, the values of ζ (s)
in the critical strip 0 < Re(s) < 1 are certainly far from understood! Here we present
a recipe for an infinite family of representations of ζ (s) converging on the half-plane
Re(s) > 0.

1. CLASSICAL THEORY OF DIRICHLET SERIES. Let $\mathcal{P}$ denote the set of prime numbers, and for x > 0, let $\mathcal{P}[x]$ be the set of primes no greater than x. For a natural number n and a prime p we let $v_p(n)$ denote the exponent of the highest power of p dividing n. Any function $a : \mathbb{N} \to \mathbb{C}$ is called an arithmetic function. We have the corresponding Dirichlet series

\[ F_a(s) = \sum_{n=1}^{\infty} \frac{a(n)}{n^s}, \]

where s is a complex variable. If a(nm) = a(n)a(m) for all n and m, then a is called completely multiplicative. For example, the arithmetic function $\mathbf{1}(n) \equiv 1$ is completely multiplicative and gives rise to the Riemann zeta function, $\zeta(s) = F_{\mathbf{1}}(s)$.
A few facts are in order, whose proofs may be found in [1]. Put $A(x) = \sum_{n \le x} a(n)$. If $A(x) = O(x^{\alpha})$ we have that $F_a(s)$ converges to an analytic function of s for Re(s) > α. Moreover, the series may be differentiated term-by-term in this region to achieve

\[ F_a'(s) = \sum_{n=1}^{\infty} -\log(n)\, \frac{a(n)}{n^s}. \tag{1} \]

We will shortly be replacing the term −log(n) with a similarly behaved log-derivation, $-\ell(n)$. Let us see how the standard approach progresses first.
The set of arithmetic functions is a ring under addition and convolution. For a(n) and b(n), their convolution is $(a * b)(n) = \sum_{j \mid n} a(j)\, b(n/j)$. The connection with Dirichlet series is $F_{a*b}(s) = F_a(s) F_b(s)$.
The von Mangoldt function is defined by putting $\Lambda(n) = \log(p)$ if $n = p^v$ with p prime and v ≥ 1, and $\Lambda(n) = 0$ otherwise. This function plays a critical role in the original proof of the Prime Number Theorem (PNT).

Lemma 1.1. For any completely multiplicative a,

\[ a \cdot \log = (\Lambda \cdot a) * a. \]

Proof. We compute

\[ ((\Lambda \cdot a) * a)(n) = \sum_{j \mid n} \Lambda(j)\, a(j)\, a(n/j) = a(n) \sum_{j \mid n} \Lambda(j) = a(n) \sum_{p \in \mathcal{P}} v_p(n) \log(p) = a(n) \log(n). \]
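
Lemma 1.1 can be checked numerically for small n with a direct implementation of Dirichlet convolution. The sketch below is an illustration only; the function names (`von_mangoldt`, `dirichlet_convolution`, `smallest_prime_factor`) are assumptions made for this example, and the comparison is done in floating point with a small tolerance.

```python
import math


def smallest_prime_factor(n):
    """Smallest prime dividing n, for n >= 2 (trial division)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def von_mangoldt(n):
    """Lambda(n) = log p if n = p^v with v >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = smallest_prime_factor(n)
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def dirichlet_convolution(f, g, n):
    """(f * g)(n) = sum over divisors j of n of f(j) * g(n / j)."""
    return sum(f(j) * g(n // j) for j in range(1, n + 1) if n % j == 0)


# Two completely multiplicative test functions: a(n) = 1 and a(n) = n.
for a in (lambda k: 1.0, lambda k: float(k)):
    for n in range(1, 200):
        lhs = a(n) * math.log(n)
        rhs = dirichlet_convolution(lambda j: von_mangoldt(j) * a(j), a, n)
        assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(lhs))
```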

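On the Dirichlet series side this translates into $-F_a'(s) = F_{\Lambda \cdot a}(s) F_a(s)$. Hence, for example, taking $a = \mathbf{1}$ and solving for $F_a(s)$ gives

\[ \zeta(s) = -\zeta'(s)/F_{\Lambda}(s) = \left( \sum_{n=1}^{\infty} \frac{\log(n)}{n^s} \right) \Bigg/ \left( \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \right). \tag{2} \]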

Where do the series on the right converge? For the series in the denominator, one can define the summation function $\psi(x) = \sum_{n \le x} \Lambda(n)$. The PNT, in one of its forms, asserts that $\psi(x) \sim x$, and hence the Dirichlet series converges to an analytic function for Re(s) > 1. This representation of ζ(s) does not provide a meromorphic continuation.
However, notice that nowhere have we actually used the values of log( p)! In fact,
each log( p) could in a sense be any complex number whatsoever (independently of
the others) and our computations above would still work out if we appropriately adjust
the value of log(n) for nonprimes.

2. LOG-DERIVATIONS. To define a log-derivation $\ell(n)$, simply begin with any sequence of complex numbers indexed by the primes $\{\ell(p)\}_{p \in \mathcal{P}}$, and extend to a completely additive arithmetic function; that is, we require $\ell(nm) = \ell(n) + \ell(m)$. It is clear that $\ell(n) = \sum_{p \in \mathcal{P}} v_p(n)\, \ell(p)$.
Given a log-derivation $\ell$, we can make an operation on the set of Dirichlet series, by defining

\[ F_a^{\ell}(s) := F_{-\ell \cdot a}(s) = \sum_{n=1}^{\infty} -\ell(n)\, \frac{a(n)}{n^s}. \tag{3} \]

We call this the $\ell$-derivation of the Dirichlet series. The superscript $\ell$ notation is intended to be suggestive of a derivative; the next lemma shows this is a good fit.

Lemma 2.1. An $\ell$-derivation of Dirichlet series is a derivation. That is, if a and b are arithmetic functions we have

\[ F_{a+b}^{\ell}(s) = F_a^{\ell}(s) + F_b^{\ell}(s) \tag{4} \]

and

\[ F_{a*b}^{\ell}(s) = F_a^{\ell}(s) F_b(s) + F_a(s) F_b^{\ell}(s) \tag{5} \]

on the half-planes where these series converge.

Proof. These follow immediately from the corresponding identities of arithmetic functions. Let us prove the second one:

\begin{align*}
((a * b)\ell)(n) &= \sum_{j \mid n} a(j)\, b(n/j)\, \ell(n) \\
&= \sum_{j \mid n} a(j)\, b(n/j)\, (\ell(j) + \ell(n/j)) \\
&= \sum_{j \mid n} (a\ell)(j)\, b(n/j) + \sum_{j \mid n} a(j)\, (b\ell)(n/j) \\
&= ((a\ell) * b)(n) + (a * (b\ell))(n).
\end{align*}
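
The product rule in Lemma 2.1 is an identity of arithmetic functions, so it can be tested exactly with integer arithmetic. In the sketch below the prime values $\ell(p) = p$ and the test functions a and b are arbitrary choices made for illustration; the names `log_derivation` and `convolve` are likewise assumptions, not notation from the note.

```python
def log_derivation(n, prime_value):
    """Completely additive extension: sum of v_p(n) * prime_value(p)."""
    total = 0
    p = 2
    while n > 1:
        while n % p == 0:
            total += prime_value(p)
            n //= p
        p += 1
    return total


def convolve(f, g, n):
    """Dirichlet convolution (f * g)(n)."""
    return sum(f(j) * g(n // j) for j in range(1, n + 1) if n % j == 0)


# Hypothetical choice l(p) = p; any values on the primes would do.
ell = lambda n: log_derivation(n, lambda p: p)

# Arbitrary integer-valued arithmetic functions (not multiplicative).
a = lambda n: n + 1
b = lambda n: n * n - 3

for n in range(1, 80):
    lhs = convolve(a, b, n) * ell(n)
    rhs = (convolve(lambda j: a(j) * ell(j), b, n)
           + convolve(a, lambda j: b(j) * ell(j), n))
    assert lhs == rhs
```

The check relies only on complete additivity, $\ell(j) + \ell(n/j) = \ell(n)$ for every divisor j of n, which is the one step used in the proof.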

For every notion one has in the classical theory, there is the corresponding notion with log(n) replaced by $\ell(n)$. Define the $\ell$-modified von Mangoldt function $\Lambda_{\ell}(n) = \ell(p)$ if $n = p^v$ and $\Lambda_{\ell}(n) = 0$ otherwise; then immediately:

Proposition 2.2. If a is completely multiplicative,

\[ F_a^{\ell}(s) = -F_a(s)\, F_{\Lambda_{\ell} \cdot a}(s) \tag{6} \]

on any half-plane where all three series converge.

Moreover, if only two of the series in equation (6) converge to analytic functions for say Re(s) > β, then the third (if it converges anywhere) analytically or meromorphically continues to Re(s) > β via equation (6) itself. And with so much choice for the $\ell(p)$’s we can often finagle this convergence. Let us focus on the Riemann zeta function for an example.

3. MEROMORPHIC CONTINUATION OF ζ(s).

Theorem 3.1. Index the prime numbers, so that $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, and so forth. Put

\[ \ell(2) = \frac{1}{2} \quad \text{and} \quad \ell(p_k) = -\frac{p_k - 1}{2^k} \quad \text{for } k \ge 2. \tag{7} \]

Then the equation

\[ \zeta(s) = -F_{\mathbf{1}}^{\ell}(s)/F_{\Lambda_{\ell}}(s) \tag{8} \]

gives a meromorphic continuation of ζ(s) to Re(s) > 0.

Proof. According to Proposition 2.2, it remains only to show that the two series in the quotient converge to analytic functions in the stated region. For the numerator, we should consider $L(x) = \sum_{n \le x} \ell(n)$. For ease of notation suppose x is an integer. Because of log-type behavior, $L(x) = \ell(x!) = \sum_{p} v_p(x!)\, \ell(p)$. The next lemma is perhaps well known; we sketch the proof because it is short and amusing.

Lemma 3.2. The order of p that divides x! is $v_p(x!) = \frac{1}{p-1}(x - w_p(x))$, where $w_p(x)$ denotes the sum of the digits of x written in base p.

Proof. By induction on x. When x = 1, both sides are zero. Suppose the formula holds for a given x. Let v ≥ 0 be the exact power of p dividing x + 1. One sees $w_p(x+1) = w_p(x) + 1 - (p-1)v$, and so the right-hand side of the formula has a net gain of v as needed.
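
Lemma 3.2 lends itself to a direct check against the factorial. The following is a small sketch; the names `p_adic_valuation`, `digit_sum`, and `valuation_of_factorial` are hypothetical labels introduced for this example.

```python
from math import factorial


def p_adic_valuation(n, p):
    """Exponent of the highest power of p dividing the positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def digit_sum(x, p):
    """w_p(x): the sum of the digits of x written in base p."""
    total = 0
    while x > 0:
        total += x % p
        x //= p
    return total


def valuation_of_factorial(x, p):
    """v_p(x!) computed from Lemma 3.2, without forming x! itself."""
    return (x - digit_sum(x, p)) // (p - 1)


for p in (2, 3, 5, 7, 11):
    for x in range(1, 80):
        assert valuation_of_factorial(x, p) == p_adic_valuation(factorial(x), p)
```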

Note $w_p(x) < (p-1)(\log_p(x) + 1) \le (p-1)(\log_2(x) + 1)$. Hence

\begin{align*}
|L(x)| &= \left| \sum_{p \in \mathcal{P}[x]} \frac{\ell(p)}{p-1} (x - w_p(x)) \right| \\
&\le x \left| \sum_{p \in \mathcal{P}[x]} \frac{\ell(p)}{p-1} \right| + \sum_{p \in \mathcal{P}[x]} \left| \frac{\ell(p)}{p-1} \right| w_p(x) \\
&\le \frac{x}{2^{\pi(x)}} + (\log_2(x) + 1) \sum_{p \in \mathcal{P}[x]} |\ell(p)|.
\end{align*}

Here π(x) denotes the number of primes less than or equal to x. Chebyshev gave an elementary bound $\pi(x) \ge C_1 x/\log(x)$, which suffices to show that the term $x/2^{\pi(x)}$ tends to zero as x → ∞. Chebyshev’s inequality may be transformed into a bound $p_k \le C_2\, k \log(k)$, which then easily implies $K := \sum_{p \in \mathcal{P}} |\ell(p)| < \infty$. We conclude $L(x) = O(\log x)$. Hence $L(x) = O(x^{\varepsilon})$ for every ε > 0, and the series $-F_{\mathbf{1}}^{\ell}(s)$ converges for Re(s) > 0.

As to the denominator, $F_{\Lambda_{\ell}}(s)$, we should consider the summation function $\psi_{\ell}(x) = \sum_{n \le x} \Lambda_{\ell}(n)$. By defining an auxiliary summation function $\theta_{\ell}(x) = \sum_{p \in \mathcal{P}[x]} \ell(p)$, we may write

\begin{align*}
\psi_{\ell}(x) &= \sum_{p \le x} \ell(p) + \sum_{p^2 \le x} \ell(p) + \sum_{p^3 \le x} \ell(p) + \cdots \\
&= \theta_{\ell}(x) + \theta_{\ell}(x^{1/2}) + \theta_{\ell}(x^{1/3}) + \cdots
\end{align*}

where p denotes a prime. Note that $|\theta_{\ell}(y)| \le K$ for any y, and $\theta_{\ell}(x^{1/m}) = 0$ as soon as $m > \log_2(x)$. Hence $|\psi_{\ell}(x)| \le K \log_2(x) = O(\log x)$. We again have convergence of the Dirichlet series on Re(s) > 0.

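To reiterate, we have proven that for Re(s) > 0,

\[ \zeta(s) = \left( \sum_{n=1}^{\infty} \frac{\sum_{p \mid n} v_p(n)\, \ell(p)}{n^s} \right) \Bigg/ \left( \sum_{\substack{n=2 \\ n = p^v}}^{\infty} \frac{\ell(p)}{n^s} \right) \tag{9} \]

where p denotes a prime and the $\ell(p)$’s are defined in (7).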
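
Representation (9) can be probed numerically by truncating both series. The sketch below evaluates the quotient at s = 2, where the answer $\pi^2/6$ is known; the truncation point `limit` and the function names are assumptions made for this illustration, and nothing here addresses how slowly the series converge closer to the line Re(s) = 0.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def zeta_via_quotient(s, limit=20000):
    """Truncated version of (9) with l(p) chosen as in (7)."""
    ell_of = [0.0] * (limit + 1)      # ell_of[n] accumulates l(n)
    denominator = 0.0                 # sum of l(p) / n^s over prime powers n
    for k, p in enumerate(primes_up_to(limit), start=1):
        # l(2) = 1/2 and l(p_k) = -(p_k - 1) / 2^k for k >= 2.
        ell_p = 0.5 if k == 1 else -(p - 1) / (1 << k)
        power = p
        while power <= limit:
            denominator += ell_p / power**s
            # Each power of p dividing a multiple contributes l(p) once.
            for multiple in range(power, limit + 1, power):
                ell_of[multiple] += ell_p
            power *= p
    numerator = sum(ell_of[n] / n**s for n in range(1, limit + 1))
    return numerator / denominator


print(zeta_via_quotient(2), math.pi**2 / 6)
```

Integer division is used for $2^k$ so that very large k simply underflows to zero instead of overflowing a float.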

4. FINAL REMARKS. There was nothing particularly special about our choice for the $\ell(p)$. One can see for example that if for some β ≥ 0 and all ε > 0 the sequence $\{\ell(p)\}_{p \in \mathcal{P}}$ satisfies:

\[ \text{(H1)} \quad \sum_{p \in \mathcal{P}[x]} \frac{\ell(p)}{p-1} = O(x^{\beta - 1 + \varepsilon}), \text{ and} \]

\[ \text{(H2)} \quad \sum_{p \in \mathcal{P}[x]} |\ell(p)| = O(x^{\beta + \varepsilon}), \]

we will achieve a meromorphic continuation of ζ(s) to Re(s) > β. For example, one could take $\ell(2) = \frac{1}{M^2 - M}$ and $\ell(p_k) = -\frac{p_k - 1}{M^k}$ for any $M \in \mathbb{C}$ with |M| > 1 and achieve a continuation to Re(s) > 0. (Our previous sequence arose from M = 2.)
There are other interesting approaches to this method of derivations. Let µ(n) denote the Möbius function, so that $F_{\mu}(s) = 1/\zeta(s)$, i.e., $F_{\mu}(s) F_{\mathbf{1}}(s) = 1$. Applying an $\ell$-derivation yields

\[ F_{\mu}^{\ell}(s) F_{\mathbf{1}}(s) + F_{\mu}(s) F_{\mathbf{1}}^{\ell}(s) = 0. \]

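This can be cleaned up via equation (6), which gives $F_{\mathbf{1}}^{\ell}(s) = -\zeta(s) F_{\Lambda_{\ell}}(s)$, to $\zeta(s) = F_{\Lambda_{\ell}}(s)/F_{\mu}^{\ell}(s)$.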
Fix $s_0 \in \mathbb{C}$ with $\mathrm{Re}(s_0) > \frac{1}{2}$. One might try to find a log-derivation $\ell$ (depending on $s_0$) for which the series in the quotient converge in a half-plane containing $s_0$, and for which $F_{\Lambda_{\ell}}(s_0) \ne 0$. If we could do this for every $s_0$ with $\mathrm{Re}(s_0) > \beta$ for some β satisfying $\frac{1}{2} \le \beta < 1$ this would be an improvement to the known zero-free region of zeta, and of course $\beta = \frac{1}{2}$ would give the Riemann hypothesis. The equation

\[ F_{\Lambda_{\ell}}(s) = \sum_{p \in \mathcal{P}} \frac{\ell(p)}{p^s - 1} \]

may be of some use, although a priori it only holds for Re(s) > 1.


There has recently been some work on so-called number derivatives, or quasi-derivations of the natural numbers (see [6] and [5]). It is worth noting that number derivatives with codomain $\mathbb{C}$ are in one-to-one correspondence with log-derivations via the map $\Delta(n) \mapsto \ell(n) := \frac{\Delta(n)}{n}$. Indeed, it was the authors’ invitation in [6] to investigate $F_{\Delta}(s)$ which inspired this note and I would like to close by thanking them.

REFERENCES

1. G. J. O. Jameson, The Prime Number Theorem, London Mathematical Society Student Texts, vol. 53,
Cambridge University Press, Cambridge, 2003.
2. V. Laohakosol, Dependence of arithmetic functions and Dirichlet series, Proc. Amer. Math. Soc. 115 (1992)
637–645.
3. H. N. Shapiro, On the convolution ring of arithmetic functions, Comm. Pure Appl. Math. 25 (1972) 287–
336. doi:10.1002/cpa.3160250306
4. H. N. Shapiro and G. H. Sparer, On algebraic independence of Dirichlet series, Comm. Pure Appl. Math.
39 (1986) 695–745. doi:10.1002/cpa.3160390602
5. M. Stay, Generalized number derivatives, J. Integer Seq. 8 (2005) Article 05.1.4, available at http://www.cs.uwaterloo.ca/journals/JIS/VOL8/Stay/stay44.html.
6. V. Ufnarovski and B. Åhlander, How to differentiate a number, J. Integer Seq. 6 (2003) Article 03.3.4, available at http://www.cs.uwaterloo.ca/journals/JIS/VOL6/Ufnarovski/ufnarovski.html.

Department of Mathematics and Computer Science, Pacific University, Forest Grove, OR 97116
[email protected]

Deep Thought

A professor who valued deep thought


Used abstraction more oft than he ought.
He was wont to adorn
With a lemma of Zorn
Each mathematical subject he taught.

—Submitted by Rick Norwood, East Tennessee State University


PROBLEMS AND SOLUTIONS
Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West
with the collaboration of Mike Bennett, Itshak Borosh, Paul Bracken, Ezra A. Brown,
Randall Dougherty, Tamás Erdélyi, Zachary Franco, Christian Friesen, Ira M. Ges-
sel, László Lipták, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Bog-
dan Petrenko, Richard Pfiefer, Cecil C. Rousseau, Leonard Smiley, Kenneth Stolarsky,
Richard Stong, Walter Stromquist, Daniel Ullman, Charles Vanden Eynden, Sam Van-
dervelde, and Fuzhen Zhang.

Proposed problems and solutions should be sent in duplicate to the MONTHLY


problems address on the back of the title page. Proposed problems should never
be under submission concurrently to more than one journal. Submitted solutions
should arrive before October 31, 2011. Additional information, such as general-
izations and references, is welcome. The problem number and the solver’s name
and address should appear on each solution. An asterisk (*) after the number of
a problem or a part of a problem indicates that no solution is currently available.

PROBLEMS
11579. Proposed by Hallard Croft, University of Cambridge, Cambridge, U. K., and
Sateesh Mane, Convergent Computing, Shoreham, NY. Let m and n be distinct integers,
with m, n ≥ 3. Let B be a fixed regular n-gon, and let A be the largest regular m-gon
that does not extend beyond B. Let d = gcd(m, n), and assume d > 1. Show that:
(a) A and B are concentric.
(b) If m | n, then A and B have m points of contact, consisting of all the vertices of A.
(c) If $m \nmid n$ and $n \nmid m$, then A and B have 2d points of contact.
(d) A and B share exactly d common axes of symmetry.
11580. Proposed by David Alfaya Sánchez, Universidad Autónoma de Madrid,
Madrid, Spain, and José Luis Dı́az-Barrero, Universidad Politécnica de Cataluña,
Barcelona, Spain. For n ≥ 2, let a1 , . . . , an be positive numbers that sum to 1, let
E = {1, . . . , n}, and let F = {(i, j) ∈ E × E : i < j}. Prove that
\[ \sum_{(i,j) \in F} \frac{(a_i - a_j)^2 + 2 a_i a_j (1 - a_i)(1 - a_j)}{(1 - a_i)^2 (1 - a_j)^2} + \sum_{i \in E} \frac{(n+1) a_i^2 + n a_i}{(1 - a_i)^2} \ge \frac{n^2 (n+2)}{(n-1)^2}. \]

11581. Proposed by Duong Viet Thong, National Economics University, Hanoi, Vietnam. Let f be a continuous, nonconstant function from [0, 1] to $\mathbb{R}$ such that $\int_0^1 f(x)\,dx = 0$. Also, let $m = \min_{0 \le x \le 1} f(x)$ and $M = \max_{0 \le x \le 1} f(x)$. Prove that

\[ \left| \int_0^1 x f(x)\,dx \right| \le \frac{-mM}{2(M - m)}. \]

11582. Proposed by Aleksandar Ilić, University of Niš, Serbia. Let n be a positive integer, and consider the set $S_n$ of all numbers that can be written in the form $\sum_{i=2}^{k} a_{i-1} a_i$ with $a_1, \ldots, a_k$ being positive integers that sum to n. Find $S_n$.
doi:10.4169/amer.math.monthly.118.06.557


11583. Proposed by David Beckwith, Sag Harbor, NY. The instructions for a magic trick are as follows: Pick a positive integer n. Next, list all partitions of n as nondecreasing strings; for instance, with n = 3, the list is {111, 12, 3}. Count 1 point for the string (n). For the string $\lambda_1 \cdots \lambda_k$ with k > 1, count $\prod_{j=1}^{k-1} \binom{\lambda_{j+1}}{\lambda_j}$ points. Add up your points, take the log base 2 of that, and add 1. Voilà! n. Explain.
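
The trick in Problem 11583 is easy to carry out by machine for small n, which makes the pattern visible even though it does not supply the requested explanation. The sketch below is illustrative only; `partitions` and `trick_points` are hypothetical names.

```python
from math import comb, prod


def partitions(n, smallest=1):
    """Yield the partitions of n as nondecreasing tuples with parts >= smallest."""
    if n == 0:
        yield ()
        return
    for first in range(smallest, n + 1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def trick_points(n):
    """Total points: product of C(lambda_{j+1}, lambda_j) over each partition.

    The one-part string (n) has an empty product, which counts as 1 point.
    """
    return sum(
        prod(comb(parts[j + 1], parts[j]) for j in range(len(parts) - 1))
        for parts in partitions(n)
    )


for n in range(1, 11):
    print(n, trick_points(n))
```

For n = 3 the strings 111, 12, and 3 score 1, 2, and 1 points, for a total of 4, and $\log_2 4 + 1 = 3$ as the trick promises.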
11584. Proposed by Raymond Mortini and Jérôme Noël, Université Paul Verlaine, Metz, France. Let $\langle a_j \rangle$ be a sequence of nonzero complex numbers inside the unit circle such that $\prod_{k=1}^{\infty} |a_k|$ converges. Prove that

\[ \left| \sum_{j=1}^{\infty} \frac{1 - |a_j|^2}{a_j} \right| \le \frac{1 - \prod_{j=1}^{\infty} |a_j|^2}{\prod_{j=1}^{\infty} |a_j|}. \]

11585. Proposed by Bruce Burdick, Roger Williams University, Bristol, RI. Show that

\[ \sum_{k=3}^{\infty} \frac{1}{k} \left( \sum_{m=1}^{k-2} \zeta(k-m)\, \zeta(m+1) - k \right) = 3 + \gamma^2 + 2\gamma_1 - \frac{\pi^2}{3}. \]

Here, ζ denotes the Riemann zeta function, γ is the Euler-Mascheroni constant, given by $\gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} 1/k - \log(n) \right)$, and $\gamma_1$ is the first Stieltjes constant, given by $\gamma_1 = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{\log k}{k} - \frac{1}{2} (\log n)^2 \right)$.

SOLUTIONS

Extrema On the Edge


11449 [2009, 647]. Proposed by Michel Bataille, Rouen, France. (corrected) Find the maximum and minimum values of

\[ \frac{(a^3 + b^3 + c^3)^2}{(b^2 + c^2)(c^2 + a^2)(a^2 + b^2)} \]

given that a + b ≥ c > 0, b + c ≥ a > 0, and c + a ≥ b > 0.
Solution by Chip Curtis, Missouri Southern State University, Joplin, MO. Let F be the expression to be maximized. The maximum of F in the feasible region is 2, attained when a = b = 1 and c = 2, as well as at permutations and scalings of this.
Let $H = 2(b^2 + c^2)(c^2 + a^2)(a^2 + b^2) - (a^3 + b^3 + c^3)^2$. Since F ≤ 2 is equivalent to H ≥ 0, we prove the latter. By symmetry, we may assume that a ≤ b ≤ c. By homogeneity, we may take a = 1. Hence, we can write b = 1 + x and c = 1 + x + y with x, y ≥ 0. Since a + b ≥ c, we have y ≤ 1. Expanding H as a polynomial in x with coefficients that are polynomials in y gives the following expansion:

\begin{align*}
H ={}& x^4 [1 + 7(1+y)(1-y)] + 2x^3 [1 + (1-y)(7y^2 + 21y + 13)] \\
&+ x^2 [1 + (1+y)(1-y)(13y^2 + 42y + 39)] \\
&+ 2x(1+y)(1-y)(3y+7)(y^2 + 2y + 2) + (1+y)^2 (1-y)(y^3 + 5y^2 + 7y + 7),
\end{align*}

which is evidently nonnegative. It is 0 if and only if x = 0 and y = 1. This corresponds to (a, b, c) = (1, 1, 2).
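
The expansion of H above is a polynomial identity, so it can be spot-checked with exact rational arithmetic. This is a sketch only; `h_direct` and `h_expanded` are names invented for the check, and agreement on a finite grid is evidence rather than proof.

```python
from fractions import Fraction


def h_direct(a, b, c):
    """H = 2(b^2 + c^2)(c^2 + a^2)(a^2 + b^2) - (a^3 + b^3 + c^3)^2."""
    return (2 * (b * b + c * c) * (c * c + a * a) * (a * a + b * b)
            - (a**3 + b**3 + c**3) ** 2)


def h_expanded(x, y):
    """The expansion of H in powers of x, for a = 1, b = 1 + x, c = 1 + x + y."""
    return (x**4 * (1 + 7 * (1 + y) * (1 - y))
            + 2 * x**3 * (1 + (1 - y) * (7 * y * y + 21 * y + 13))
            + x**2 * (1 + (1 + y) * (1 - y) * (13 * y * y + 42 * y + 39))
            + 2 * x * (1 + y) * (1 - y) * (3 * y + 7) * (y * y + 2 * y + 2)
            + (1 + y) ** 2 * (1 - y) * (y**3 + 5 * y * y + 7 * y + 7))


# Exact comparison on a grid of rational points with x >= 0 and 0 <= y <= 1.
grid_x = [Fraction(i, 3) for i in range(0, 10)]
grid_y = [Fraction(j, 7) for j in range(0, 8)]
for x in grid_x:
    for y in grid_y:
        assert h_direct(1, 1 + x, 1 + x + y) == h_expanded(x, y)
        assert h_expanded(x, y) >= 0
```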


Also solved by R. Agnew, A. Alt, M. Ashbaugh, R. Bagby, D. Beckwith, H. Caerols & R. Pellicer (Chile),
R. Chapman (U. K.), H. Chen, C. Curtis, P. P. Dályay (Hungary), Y. Dumont (France), J. Fabrykowski and T.
Smotzer, S. Falcón and Á. Plaza (Spain), D. Fleischman, J.-P. Grivaux (France), E. A. Herman, F. Holland (Ire-
land), T. Konstantopoulos (U. K.), O. Kouba (Syria), A. Lenskold, J. H. Lindsey II, B. Mulansky (Germany),
P. Perfetti (Italy), C. R. Pranesachar (India), N. C. Singer, R. Stong, T. Tam, R. Tauraso (Italy), M. Tetiva (Ro-
mania), D. Tyler, E. I. Verriest, Z. Vörös (Hungary), S. Wagon, G. D. White, GCHQ Problem Solving Group
(U. K.), Microsoft Research Problems Group, and the proposer.

Editorial comment. Two versions of this problem appeared; the first was not what the
proposer intended. The treatment of the upper bound given in the March issue of this
column (p. 278) fails as a solution to the corrected version. The maximum of F in the
closure of the feasible region is attained not only at a corner, which is off-limits, but
also at the other boundary points noted. The solver list here includes those who had
supplied solutions under a new deadline. The editors regret the confusion.

Hexagon Inscribed in Circle


11470 [2009, 491]. Proposed by Marian Tetiva, National College “Gheorghe Roşca
Codreanu,” Bı̂rlad, Romania. Let ABCDEF be a hexagon inscribed in a circle. Let M,
N , and P be the midpoints of the line segments BC, DE, and FA, respectively, and
similarly let Q, R, and S be the midpoints of AD, BE, and CF. Show that if both MNP
and QRS are equilateral, then the segments AB, CD, and EF have equal lengths.
Solution by Oliver Geupel, Brühl, NRW, Germany. Let the circle be the unit circle in the complex plane, and let a, b, c, . . . be the complex numbers corresponding to A, B, C, . . . . Thus 2m = b + c, 2n = d + e, 2p = f + a, 2q = a + d, 2r = b + e, and 2s = c + f. Write $\epsilon = \exp(2\pi i/3)$. It is well known (for example: T. Andreescu and D. Andrica, Complex Numbers from A to Z, Birkhäuser, Boston, 2006, pp. 70ff., Proposition (3.4)1) that a triangle UVW is equilateral if and only if $u + \epsilon v + \epsilon^2 w = 0$ or $u + \epsilon w + \epsilon^2 v = 0$, depending on the orientation of $\triangle UVW$. Without loss of generality, we may assume that $\triangle MNP$ is oriented so that $m + \epsilon n + \epsilon^2 p = 0$. Hence

\[ (b + c) + \epsilon (d + e) + \epsilon^2 (f + a) = 0. \tag{1} \]

We consider two cases, depending on the orientation of $\triangle QRS$.
Case 1: $\triangle MNP$ and $\triangle QRS$ have opposite orientation. In this case

\[ (a + d) + \epsilon (c + f) + \epsilon^2 (b + e) = 0. \tag{2} \]

Multiplying (1) by $\frac{-1 - \epsilon + \epsilon^2}{2(\epsilon - 1)}$, multiplying (2) by $\frac{-1 + \epsilon + \epsilon^2}{2(\epsilon - 1)}$, and adding, we obtain $a + \epsilon c + \epsilon^2 e = 0$. Multiplying (1) by $\frac{-1 + \epsilon + \epsilon^2}{2(\epsilon - 1)}$, multiplying (2) by $\frac{-1 - \epsilon + \epsilon^2}{2(\epsilon - 1)}$, and adding, we obtain $b + \epsilon d + \epsilon^2 f = 0$. Thus $\triangle ACE$ and $\triangle BDF$ are equilateral, which implies AB = CD = EF.
Case 2: $\triangle QRS$ has the same orientation as $\triangle MNP$. Now

\[ (b + e) + \epsilon (c + f) + \epsilon^2 (a + d) = 0. \tag{3} \]

Multiplying (1) by $\frac{1}{1 - \epsilon}$, multiplying (3) by $-\frac{1}{1 - \epsilon}$, and adding, we obtain $c - e = \epsilon (f - d)$. Therefore CE = DF, so CD = EF. Multiplying (1) by $\frac{1}{1 - \epsilon}$, multiplying (3) by $-\frac{\epsilon^2}{1 - \epsilon}$, and adding, we obtain $e - a = \epsilon (b - f)$. Therefore EA = FB, so EF = AB.
Also solved by R. Chapman (U. K.), P. P. Dályay (Hungary), M. Garner, M. Goldenberg & M. Kaplan, J.-P.
Grivaux (France), S. W. Kim (Korea), O. Kouba (Syria), O. P. Lossers (Netherlands), M. A. Prasad (India), R.
Stong, S. Tonegawa & F. Vafa, and the proposer.


Product of Derivatives
11472 [2009, 941]. Proposed by Mahdi Makhul, Shahrood University of Technology, Shahrood, Iran. Let t be a nonnegative integer, and let f be a (4t + 3)-times continuously differentiable function on $\mathbb{R}$. Show that there is a number a such that at x = a,

\[ \prod_{k=0}^{4t+3} \frac{d^k f(x)}{dx^k} \ge 0. \]

Solution by Robin Chapman, University of Bristol, Bristol, England, U. K. We first claim that if g is a twice-differentiable function on $\mathbb{R}$, then there exists $b \in \mathbb{R}$ such that $g(b) g''(b) \ge 0$. To prove this, suppose that $g(x) g''(x) < 0$ for all $x \in \mathbb{R}$. Now $g(x) \ne 0$ for all $x \in \mathbb{R}$. Since g is continuous, g has constant sign. Hence, $g''$ has the opposite sign. Suppose that g is positive and $g''$ is negative (otherwise consider −g in place of g). Hence $g'$ is decreasing, and there exists $c \in \mathbb{R}$ with $g'(c) \ne 0$. By Taylor’s theorem, for each $x \in \mathbb{R}$,

\[ g(x) = g(c) + (x - c) g'(c) + \frac{(x - c)^2}{2} g''(\xi), \]

where ξ is between c and x. Since $g''$ is negative,

\[ g(x) \le g(c) + (x - c) g'(c). \]

Depending on the sign of $g'(c)$, this implies that g(x) < 0 for all large enough x or for all small enough x. Either way we have a contradiction. Hence there exists $b \in \mathbb{R}$ with $g(b) g''(b) \ge 0$.
Now let f be a (4t + 3)-times continuously differentiable function on $\mathbb{R}$. Let $F(x) = \prod_{j=0}^{4t+3} f^{(j)}(x)$. If F is always negative, then F is always nonzero, so each $f^{(j)}$ with 0 ≤ j ≤ 4t + 3, since it is continuous, has constant sign. From the foregoing, $f^{(j)}$ and $f^{(j+2)}$ must have the same sign for 0 ≤ j ≤ 4t + 1. Therefore $\prod_{j=0}^{2t+1} f^{(2j)}$ and $\prod_{j=0}^{2t+1} f^{(2j+1)}$ are both positive, so F is positive, a contradiction.
Editorial comment. The special case t = 0 of this problem was problem A3 on the
1998 Putnam exam.
Also solved by G. Apostolopoulos (Greece), P. P. Dályay (Hungary), J.-P. Grivaux (France), O. Kouba (Syria),
O. P. Lossers (Netherlands), M. Omarjee (France), J. Simons (U. K.), R. Stong, R. Tauraso (Italy), M. Tetiva
(Romania), X. Wang, GCHQ Problem Solving Group (U. K.), and the proposer.

A Series Equation
11473 [2009, 941]. Proposed by Paolo Perfetti, Mathematics Dept., University “Tor Vergata Roma,” Rome, Italy. Let α and β be real numbers such that −1 < α + β < 1 and such that, for all integers k ≥ 2,

\[ -(2k)\log(2k) \ne \alpha, \qquad (2k+1)\log(2k+1) \ne \alpha, \]
\[ 1 + (2k+1)\log(2k+1) \ne \beta, \qquad -1 - (2k+2)\log(2k+2) \ne \beta. \]

Let

\[ T = \lim_{N \to \infty} \sum_{n=2}^{N} \prod_{k=2}^{n} \frac{\alpha + (-1)^k \cdot k \log(k)}{\beta + (-1)^{k+1} (1 + (k+1)\log(k+1))}, \]

\[ U = \lim_{N \to \infty} \sum_{n=2}^{N} \big( (n+1)\log(n+1) \big) \prod_{k=2}^{n} \frac{\alpha + (-1)^k \cdot k \log(k)}{\beta + (-1)^{k+1} (1 + (k+1)\log(k+1))}. \]


(a) Show that the limits defining T and U exist.
(b) Show that if, moreover, |α| < 1/2 and β = −α, then T = −2U.
Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands.
(a) The series for T and U are eventually alternating in sign, so for convergence it suffices to prove that the absolute value of the term decreases eventually and converges to zero. Since (n + 1) log(n + 1) is an increasing function of n, it suffices to prove this for U only. The negative of the quotient of two consecutive terms is

\[ \frac{(n+1)\log(n+1)}{n \log n} \cdot \frac{(-1)^n \alpha + n \log n}{1 + (-1)^{n+1} \beta + (n+1)\log(n+1)}. \]

With the abbreviation $x_n = n \log n$, this expression can be written as

\[ 1 - \frac{1 - (-1)^n (\alpha + \beta)}{x_n} + \left( \frac{1}{x_{n+1}} - \frac{1}{x_n} \right) (-1)^n (\alpha + \beta) + O(x_n^{-2}). \]

Since $1/x_{n+1} - 1/x_n = O(n^{-1} x_n^{-1})$ and |α + β| < 1, this has the form $1 - c_n$ with $1 > c_n > \frac{1}{2}(1 - |\alpha + \beta|)/x_n$ eventually. Therefore $\prod_{k=1}^{n} |1 - c_k|$ is eventually decreasing. Also, since $\sum x_n^{-1}$ diverges, the product goes to zero. This proves that the limit for U, and hence also for T, exists.
(b) The equation T = −2U is incorrect. Let $p_k = (-1)^k \alpha + x_k$ and $q_k = (-1)^{k+1} \beta + 1 + x_{k+1}$. If α + β = 0, then the partial sums for T + 2U can be written as

\[ \sum_{n=2}^{N} (-1)^{n+1} (q_n + p_{n+1}) \prod_{k=2}^{n} \frac{p_k}{q_k} = \sum_{n=2}^{N} (-1)^{n+1} \left( \frac{\prod_{k=2}^{n} p_k}{\prod_{k=2}^{n-1} q_k} + \frac{\prod_{k=2}^{n+1} p_k}{\prod_{k=2}^{n} q_k} \right). \]

This is a telescoping sum that simplifies to

\[ -p_2 + (-1)^{N+1} p_{N+1} \prod_{k=2}^{N} \frac{p_k}{q_k}. \]

From the convergence of T and U, it follows that the second term goes to zero as N tends to infinity. Thus

\[ T + 2U = -\alpha - 2 \log 2. \]
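
The telescoping step in part (b) holds exactly for every finite N, so it can be tested in floating point even though the limits themselves converge very slowly. The sketch below assumes β = −α as in part (b); the function names are hypothetical.

```python
import math


def x_of(n):
    """The abbreviation x_n = n log n used in the solution."""
    return n * math.log(n)


def partial_sum_T_plus_2U(alpha, N):
    """Partial sum of T + 2U up to N, straight from the definitions, beta = -alpha."""
    beta = -alpha
    total, product = 0.0, 1.0
    for n in range(2, N + 1):
        product *= ((alpha + (-1) ** n * x_of(n))
                    / (beta + (-1) ** (n + 1) * (1 + x_of(n + 1))))
        total += (1 + 2 * x_of(n + 1)) * product
    return total


def telescoped_value(alpha, N):
    """Closed form -p_2 + (-1)^(N+1) p_(N+1) prod_{k=2}^{N} p_k / q_k."""
    beta = -alpha
    ratio = 1.0
    for k in range(2, N + 1):
        p_k = (-1) ** k * alpha + x_of(k)
        q_k = (-1) ** (k + 1) * beta + 1 + x_of(k + 1)
        ratio *= p_k / q_k
    p_2 = alpha + x_of(2)
    p_next = (-1) ** (N + 1) * alpha + x_of(N + 1)
    return -p_2 + (-1) ** (N + 1) * p_next * ratio


for N in (2, 5, 20, 100):
    assert abs(partial_sum_T_plus_2U(0.3, N) - telescoped_value(0.3, N)) < 1e-9
```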

Also solved by O. Kouba (Syria), R. Stong, and the GCHQ Problem Solving Group (U. K.).

An Inequality for Triangles


11476 [2010, 86]. Proposed by Panagiote Ligouras, “Leonardo da Vinci” High School, Noci, Italy. Let a, b, and c be the side-lengths of a triangle, and let r be its inradius. Show

\[ \frac{a^2 bc}{(b+c)(b+c-a)} + \frac{b^2 ca}{(c+a)(c+a-b)} + \frac{c^2 ab}{(a+b)(a+b-c)} \ge 18 r^2. \]

Solution by P. Nüesch, Lausanne, Switzerland. Write s for the semiperimeter of the


triangle. The left side of the inequality is (employing geometry’s cyclic summation
conventions)
X a 2 bc abc X a
= .
(b + c)(b + c − a) 2 (2s − a)(s − a)

June–July 2011] PROBLEMS AND SOLUTIONS 561


The function f defined by
\[ f(x) = \frac{x}{(2s-x)(s-x)} \]
is convex for $0 < x < s$. Setting $x_1 = a$, $x_2 = b$, $x_3 = c$ yields
\[ \sum \frac{a}{(2s-a)(s-a)} = \sum f(x_i) \ge 3 f\left(\frac{\sum x_i}{3}\right) = 3 f\left(\frac{2s}{3}\right) = \frac{9}{2s}. \]
Together with $abc = 4Rrs$ and Euler’s inequality $R \ge 2r$, we obtain
\[ \frac{abc}{2} \sum \frac{a}{(2s-a)(s-a)} \ge \frac{abc}{2} \cdot \frac{9}{2s} = 9Rr \ge 18r^2. \]
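
As a numerical illustration of the inequality (not part of the published solution), the sketch below evaluates both sides for a few triangles. The helper names and the sample side lengths are assumptions; the inradius is computed as area divided by semiperimeter, with the area from Heron's formula.

```python
import math

def lhs(a, b, c):
    """Cyclic sum a^2 bc / ((b+c)(b+c-a)) from the problem statement."""
    return (a * a * b * c / ((b + c) * (b + c - a))
            + b * b * c * a / ((c + a) * (c + a - b))
            + c * c * a * b / ((a + b) * (a + b - c)))

def inradius(a, b, c):
    """r = area / s, with the area given by Heron's formula."""
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c)) / s

def gap(a, b, c):
    """lhs - 18 r^2; the inequality asserts that this is nonnegative."""
    return lhs(a, b, c) - 18 * inradius(a, b, c) ** 2

samples = [(1, 1, 1), (3, 4, 5), (2, 3, 4), (5, 5, 9)]  # arbitrary valid triangles
for sides in samples:
    assert gap(*sides) >= -1e-12  # small tolerance for rounding
```

For the equilateral triangle the gap is zero up to rounding, since both the convexity step and Euler's inequality hold with equality there.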

Also solved by A. Alt, G. Apostolopoulos (Greece), R. Bagby, D. Beckwith, E. Bráune (Austria), R. Chapman
(U. K.), P. P. Dályay (Hungary), J. Fabrykowski & T. Smotzer, H. Y. Far, O. Faynshteyn (Germany), V. V.
Garcia (Spain), O. Kouba (Syria), K.-W. Lau (China), J. H. Lindsey II, Á. Plaza & S. Falcón (Spain), C.
Pohoata (Romania), C. R. Pranesachar (India), R. Stong, E. Suppa (Italy), M. Tetiva (Romania), M. Vowe
(Switzerland), L. Wimmer (Germany), L. Zhou, GCHQ Problem Solving Group (U. K.), and the proposer.

The Winding Density of a Non-Closing Poncelet Trajectory


11479 [2010, 87]. Proposed by Vitaly Stakhovsky, National Center for Biotechnological
Information, Bethesda, MD. Two circles are given. The larger circle C has center
O and radius R. The smaller circle c is contained in the interior of C and has center o
and radius r. Given an initial point P on C, we construct a sequence $\langle P_k \rangle$ (the Poncelet
trajectory for C and c starting at P) of points on C: Put $P_0 = P$, and for $j \ge 1$, let
$P_j$ be the point on C to the right of o as seen from $P_{j-1}$ on a line through $P_{j-1}$ and
tangent to c. For $j \ge 1$, let $\omega_j$ be the radian measure of the angle counterclockwise
along C from $P_{j-1}$ to $P_j$. Let
\[ \Omega(C, c, P) = \lim_{k\to\infty} \frac{1}{2\pi k} \sum_{j=1}^{k} \omega_j. \]

(a) Show that $\Omega(C, c, P)$ exists for all allowed choices of C, c, and P, and that it is
independent of P.
(b) Find a formula for $\Omega(C, c, P)$ in terms of r, R, and the distance d from O to o.
Solution by Richard Stong, Center for Communications Research, San Diego, CA. We
will show
\[ \Omega(C, c, P) = \frac{F\left(\frac{1}{2}\arccos\frac{r-d}{R} \,\middle|\, m\right)}{K(m)}, \quad \text{where } m = \frac{4dR}{(R+d)^2 - r^2}, \]
which is independent of P. We have used the incomplete elliptic integral of the first
kind, defined by
\[ F(\theta \mid m) = \int_0^{\theta} \frac{dt}{\sqrt{1 - m\sin^2 t}} = \int_0^{\sin\theta} \frac{dy}{\sqrt{1-y^2}\,\sqrt{1-my^2}}, \]

and the corresponding complete integral $K(m) = F(\pi/2 \mid m)$.
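
To make the formula concrete, here is a small Python sketch that evaluates it by direct numerical quadrature of the defining integral for F. This is only an illustration under stated assumptions: the function names, the choice of the composite Simpson rule, and the number of subintervals are not from the solution.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def ellip_F(theta, m):
    """F(theta | m) = integral from 0 to theta of dt / sqrt(1 - m sin^2 t)."""
    return simpson(lambda t: 1.0 / math.sqrt(1.0 - m * math.sin(t) ** 2), 0.0, theta)

def winding_density(R, r, d):
    """Value of the stated formula; assumes d >= 0 and d + r < R, so that m < 1."""
    m = 4 * d * R / ((R + d) ** 2 - r ** 2)
    theta = 0.5 * math.acos((r - d) / R)
    return ellip_F(theta, m) / ellip_F(math.pi / 2, m)

# Concentric circles with R = 2r: every trajectory is an inscribed equilateral triangle.
assert math.isclose(winding_density(2.0, 1.0, 0.0), 1 / 3, rel_tol=1e-9)
```

In the concentric case d = 0 the parameter m vanishes, F(θ | 0) = θ, and the formula reduces to arccos(r/R)/π, which is 1/3 when R = 2r.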


Use coordinates with c centered at the origin and C centered on the nonnegative
x-axis. Parameterize c as $T(\theta) = (r\cos\theta, r\sin\theta)$ and C as
$P(\phi) = (d + R\cos\phi, R\sin\phi)$. Then $\|T(\theta)\|^2 = \|T'(\theta)\|^2 = r^2$ and $\langle T'(\theta), T(\theta) \rangle = 0$. The
tangent line to c at $T(\theta)$ is given by $\langle X, T(\theta) \rangle = r^2$ and a point X on the tangent can be
written as
\[ X = T(\theta) \pm \frac{\sqrt{\|X\|^2 - r^2}}{r}\, T'(\theta), \]
using the + sign if X is counterclockwise from $T(\theta)$ and the − sign if X is clockwise
from $T(\theta)$ as viewed from the origin.
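For any two points $P(\phi_1)$ and $P(\phi_2)$ on C we have
\[ P(\phi_1) - P(\phi_2) = 2\sin\left(\frac{\phi_1 - \phi_2}{2}\right) \left( -R\sin\left(\frac{\phi_1 + \phi_2}{2}\right),\; R\cos\left(\frac{\phi_1 + \phi_2}{2}\right) \right), \]
\[ P'(\phi_1) + P'(\phi_2) = 2\cos\left(\frac{\phi_1 - \phi_2}{2}\right) \left( -R\sin\left(\frac{\phi_1 + \phi_2}{2}\right),\; R\cos\left(\frac{\phi_1 + \phi_2}{2}\right) \right). \]
Hence these two vectors are parallel.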
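For a point $T(\theta)$ on the circle c, write $P(\phi_-)$ and $P(\phi_+)$ for the two points where
the tangent to c at $T(\theta)$ meets C, with $P(\phi_+)$ counterclockwise from $T(\theta)$ and
$\phi_- < \phi_+ < \phi_- + 2\pi$. Then $\langle P(\phi_\pm), T(\theta) \rangle = r^2$, so $\langle P(\phi_+) - P(\phi_-), T(\theta) \rangle = 0$ and hence
$\langle P'(\phi_+) + P'(\phi_-), T(\theta) \rangle = 0$. Now suppose we traverse the circle c so that
\[ \frac{d\theta}{dt} = \langle P'(\phi_-), T(\theta) \rangle = -\langle P'(\phi_+), T(\theta) \rangle. \]
This makes $d\theta/dt > 0$, so we traverse c in counterclockwise order. Then from
\[ 0 = \frac{d}{dt} \langle P(\phi_\pm), T(\theta) \rangle = \langle P'(\phi_\pm), T(\theta) \rangle \frac{d\phi_\pm}{dt} + \langle P(\phi_\pm), T'(\theta) \rangle \frac{d\theta}{dt} \]
we see
\[ \frac{d\phi_\pm}{dt} = \pm \langle P(\phi_\pm), T'(\theta) \rangle = r\sqrt{\|P(\phi_\pm)\|^2 - r^2} = r\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi_\pm}. \]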
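Thus the elliptic integral I given by
\[ I = \int_{\phi_-}^{\phi_+} \frac{d\phi}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi}} \]
satisfies
\[ \frac{dI}{dt} = \frac{1}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi_+}} \frac{d\phi_+}{dt} - \frac{1}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi_-}} \frac{d\phi_-}{dt} = r - r = 0 \]
and is a constant. One possible chord is the vertical one through the point (r, 0) with
$\theta = 0$, $\phi_\pm = \pm\arccos((r - d)/R)$, so we obtain
\[ I = 2\int_0^{\arccos((r-d)/R)} \frac{d\phi}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi}}
     = \frac{4}{\sqrt{(R+d)^2 - r^2}}\, F\left( \frac{1}{2}\arccos\frac{r-d}{R} \,\middle|\, \frac{4dR}{(R+d)^2 - r^2} \right). \]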
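
The constancy of I can be checked numerically by computing the chord endpoints for several tangent points T(θ) and integrating over each arc. The sketch below is an illustration only; the sample radii, the Simpson quadrature, and all function names are assumptions. The endpoints come from intersecting the tangent line T(θ) + s(−sin θ, cos θ) with C.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def chord_integral(theta, R, r, d):
    """The integral I over the arc of C cut off by the tangent to c at T(theta)."""
    # Solving |T(theta) + s*(-sin theta, cos theta) - (d, 0)| = R for s gives
    # s = -d sin(theta) +/- sqrt(R^2 - (r - d cos(theta))^2).
    root = math.sqrt(R * R - (r - d * math.cos(theta)) ** 2)
    phis = []
    for sign in (-1.0, 1.0):  # clockwise endpoint (phi_minus) first, then phi_plus
        s = -d * math.sin(theta) + sign * root
        px = r * math.cos(theta) - s * math.sin(theta)
        py = r * math.sin(theta) + s * math.cos(theta)
        phis.append(math.atan2(py, px - d))
    phi_minus, phi_plus = phis
    span = (phi_plus - phi_minus) % (2 * math.pi)  # phi_- < phi_+ < phi_- + 2 pi
    A, B = R * R + d * d - r * r, 2 * d * R
    return simpson(lambda p: 1.0 / math.sqrt(A + B * math.cos(p)),
                   phi_minus, phi_minus + span)

R, r, d = 1.0, 0.4, 0.3  # sample circles (assumption); they satisfy d + r < R
values = [chord_integral(2 * math.pi * i / 12, R, r, d) for i in range(12)]
assert max(values) - min(values) < 1e-8
```

The twelve values agree to within the quadrature error, as the vanishing derivative dI/dt predicts.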



Let
\[ J = \int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi}}
     = \frac{4}{\sqrt{(R+d)^2 - r^2}}\, K\left( \frac{4dR}{(R+d)^2 - r^2} \right). \]

Now suppose $P_0 = (d + R\cos\phi_0, R\sin\phi_0)$ and let $\phi_k = \phi_0 + \sum_{j=1}^{k} \omega_j$. We have
\[ \int_{\phi_0}^{\phi_k} \frac{d\phi}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi}} = kI. \]

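This integral is over an interval of at least $\lfloor (\phi_k - \phi_0)/(2\pi) \rfloor$ complete periods and
at most $\lceil (\phi_k - \phi_0)/(2\pi) \rceil$ complete periods. Hence
\[ \left\lfloor \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rfloor J \le kI \le \left\lceil \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rceil J. \]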

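Thus
\[ \frac{I}{J} - \frac{1}{k} \le \frac{1}{k}\left( \left\lceil \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rceil - 1 \right) \le \frac{\sum_{j=1}^{k} \omega_j}{2\pi k} \le \frac{1}{k}\left( \left\lfloor \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rfloor + 1 \right) \le \frac{I}{J} + \frac{1}{k} \]

and
\[ \lim_{k\to\infty} \frac{\sum_{j=1}^{k} \omega_j}{2\pi k} = \frac{I}{J}, \]
which is the quotient of elliptic integrals claimed.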
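
The result can be compared with a direct simulation of the trajectory. The following Python sketch is an illustration under stated assumptions and not the solver's method: it uses the solution's coordinates (c at the origin, C centered at (d, 0)), takes at each step the tangent whose far endpoint is P(φ+), and evaluates I and J by Simpson quadrature. The sample radii, starting angle, step count, and function names are all assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def next_phi(phi, R, r, d):
    """Angle parameter on C of the next trajectory point."""
    px, py = d + R * math.cos(phi), R * math.sin(phi)
    # Tangent point on c, chosen so that the trajectory runs counterclockwise.
    theta = math.atan2(py, px) + math.acos(r / math.hypot(px, py))
    ux, uy = r * math.cos(theta) - px, r * math.sin(theta) - py
    # Second intersection of the line P + t*u with C (t = 0 is P itself).
    t = -2 * ((px - d) * ux + py * uy) / (ux * ux + uy * uy)
    return math.atan2(py + t * uy, px + t * ux - d)

def simulated_density(R, r, d, phi0, k):
    """(1 / (2 pi k)) * sum of omega_j over k steps."""
    phi, total = phi0, 0.0
    for _ in range(k):
        nxt = next_phi(phi, R, r, d)
        total += (nxt - phi) % (2 * math.pi)  # omega_j, measured counterclockwise
        phi = nxt
    return total / (2 * math.pi * k)

def predicted_density(R, r, d):
    """I / J computed from the defining integrals."""
    A, B = R * R + d * d - r * r, 2 * d * R
    g = lambda p: 1.0 / math.sqrt(A + B * math.cos(p))
    a = math.acos((r - d) / R)
    return simpson(g, -a, a) / simpson(g, 0.0, 2 * math.pi)

R, r, d, k = 1.0, 0.4, 0.3, 2000  # sample data (assumption); d + r < R
for phi0 in (0.0, 1.0, 2.5):
    assert abs(simulated_density(R, r, d, phi0, k) - predicted_density(R, r, d)) <= 1 / k + 1e-6
```

The tolerance 1/k mirrors the two-sided bound in the proof, and repeating the check for several starting angles illustrates the independence from P.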
Editorial comment.
In the classical case, when the trajectory closes (returns to its starting point after
finitely many steps), this “winding density” is rational: the number of times the
closed trajectory goes around the circle divided by the number of intervals in the
trajectory. The use of elliptic integrals to compute it is known, and in many special cases it
can be computed without elliptic integrals: see
http://mathworld.wolfram.com/PonceletsPorism.html.
Also solved by J. A. Grzesik, and the proposer.




REVIEWS
Edited by Jeffrey Nunemacher
Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015

Continuous Symmetry: From Euclid to Klein. By William Barker and Roger Howe. American
Mathematical Society, Providence, RI, 2007, xxi + 546 pp., ISBN 978-0821839003, $69.

Reviewed by Robin Hartshorne

This is a book about plane Euclidean geometry with special emphasis on the group
of isometries. It includes the classification of plane isometries into reflections, trans-
lations, rotations, and glide reflections, and also the classification of frieze groups and
the seventeen wallpaper groups with complete proofs. It offers unusual proofs of some
standard theorems of plane geometry, making systematic use of the group of isome-
tries. The book is intended as a one-semester course, with exercises, though the in-
structor will have to be somewhat selective, since there is more than enough material
for one semester. The authors state in the preface, “We have tried to write a book that
honors the Greek tradition of synthetic geometry and at the same time takes Felix
Klein’s Erlanger Programm seriously.”
To begin with, the authors devote the first chapter to the axiomatic foundations of
plane geometry. Here already, following a popular modern trend, they diverge from
Euclid’s purely synthetic geometry by presupposing the real numbers, and implicitly
using some concepts of analysis. In Euclid’s Elements there are no numbers, no mea-
sure of length of a segment, no measure of angles. Instead there is an undefined notion
of congruence. Even in Hilbert’s rigorous rewriting of the foundations of Euclidean
geometry in [5], numbers are not necessary. It is only after the initial development
that it appears that the Euclidean plane can be represented as a Cartesian plane over a
Euclidean ordered field, which can be the real numbers, but need not be. Presupposing
the real numbers does simplify the technical beginnings of the subject, but one loses
the purity of Euclid’s synthetic approach.
The authors say in a footnote on page 1 that they have closely followed Moise’s
book [8] for the axiomatic foundations. But while Moise takes the distance function,
which to each pair of points assigns a real number, as a basic datum of the theory,
Barker and Howe prefer to take a coordinate system on each line as their data. They
call this the “Ruler Axiom” (p. 17), which states that for each line there exists a one-
to-one correspondence with the real numbers. However, what they need is not the existence,
but the choice for each line of a fixed coordinate system. This leads to a certain
amount of confusion in the first chapter, and to some incorrect claims. These have been
corrected in a list on the web page for the book [2] together with a revised version of
Chapter 1. The axiomatic foundations are summarized on page 120. They postulate a
set of points, a distinguished collection of subsets called lines, on each line a coordi-
nate system (i.e., a 1-1 correspondence with the real numbers), and an angle measure
function, satisfying incidence axioms, a plane separation axiom, angle measure ax-
ioms, the side-angle-side axiom (SAS), and the parallel postulate.

doi:10.4169/amer.math.monthly.118.06.565



It is worth remarking on the inclusion of SAS as an axiom. In Euclid’s Elements,
this theorem appears as Book I, Proposition 4, and is proved by the so-called method
of superposition, placing one triangle on the other. Nothing in Euclid’s definitions or
postulates allows him to do this, and later commentators were justly critical of this
procedure, which in effect allows one to move figures around in the plane without al-
tering their size or shape. Hilbert in his axiomatization of plane geometry [5] therefore
took SAS as an axiom, and that sufficed to develop the rest of the theory. If one prefers
to follow Euclid’s idea, then one needs to postulate the existence of enough rigid mo-
tions, i.e., what we might call free mobility, which says that given any two points in
the plane, a ray emanating from each of those points, and a choice of one side of each
ray, then there exists a rigid motion of the plane (an isometry) that takes one point, ray,
and side to the other point, ray, and side. In fact, one can show that in the presence
of Hilbert’s other axioms, SAS becomes equivalent to the existence of enough rigid
motions; see [4, Corollary 17.5].
It is these rigid motions that are the main subject of the book under review, and
the authors devote Chapter II to defining them, showing that they are isometries, and
showing that these isometries form a group with the usual notions of composition
and inverse. Chapter III contains the classification of isometries into reflections, trans-
lations, rotations, and glide reflections. Chapter IV studies the larger group of simi-
larities, including a study of their fixed points, and the classification into isometries,
dilations, stretch rotations, and stretch reflections. All of this is done in excellent detail
with examples and diagrams for the reader.
In Chapter V we come to the second main motivating source for this book (suppos-
ing the first to be Euclid), namely Klein’s Erlanger Programm. The authors summarize
the influence of Klein’s work in two main principles:

EP1. A geometry is the study of the properties of figures in a set X that are invariant
under a group G of symmetries acting on the set X.
EP2. Starting with an appropriate group G, one can reconstruct the set X, together
with the action of G on X, to obtain a geometry.

Felix Klein’s Erlanger Programm [7] was his inaugural dissertation on his appoint-
ment as professor at the University of Erlangen. The main import of Klein’s paper
is expressed by statement EP1 above. For example, we can say two figures are con-
gruent if there exists an element of the group taking one to the other. To set the im-
portance of Klein’s work in perspective, one must remember that when Klein was
writing, group theory was in its infancy [9]. Groups appeared in Galois’s work in
the early 19th century as finite permutation groups, or groups of substitutions as they
were called then. The theory of finite substitution groups was fairly well developed
by the mid 19th century as we see from the masterful work of Jordan [6]. The no-
tion of an abstract group had been suggested earlier by Cayley, but did not take hold
until much later. So Klein says at the beginning of his work [7, p. 6, footnote 3] that
he will borrow the concepts and notations of a group from substitution theory and
apply it to the groups of transformations acting on a geometry, in particular, to the
groups of isometries and similarities. It was a great conceptual leap to consider an in-
finite group of transformations acting on the continuous space of a geometry, and this
idea had tremendous influence on later studies of geometry (think of Lie groups, for
example).
As for the idea expressed in EP2 above, this came only much later, and is not due
to Klein, though one could argue that it is a natural outgrowth of his work. The theory
of abstract groups came into its own in the 1880s and 1890s and the first monograph
on abstract group theory appeared in 1904. The full realization of EP2, giving axioms
on an abstract group G so that it can be recognized as the transformations of an asso-
ciated geometry X, and the subsequent development of that geometry, appears only in
the book of Bachmann [1], based on earlier work of Reidemeister and Schmidt in the
1930s and 1940s. The idea is that every involution (element of order two) in the group
will correspond either to a point reflection or a line reflection, and these can be dis-
tinguished because the former are squares in the group, while the latter are not. While
these ideas are explored in Chapter V of the present book, it is too bad that they are not
developed further. Surprisingly, neither Klein’s Erlanger Programm nor Bachmann’s
book appears in the short list of references on p. 531.
The structure of the isometry group is strongly emphasized in Chapter VIII, which
gives a classification of frieze groups and wallpaper groups. A wallpaper group is a
discrete subgroup of the group of isometries of the plane that contains two linearly
independent translations. For example, think of the group of isometries that sends a
tiling of the plane by hexagons into itself. Any two such hexagon-tiling groups are
of the same “type,” even though the centers of the rotations, the directions of the
translations, and the scale of the hexagons may be different. In the book, this no-
tion of same type is called “symmetry equivalence,” which means, roughly, conju-
gacy of subgroups of the group of isometries. Unfortunately this notion is never pre-
cisely defined; it seems to have a fluid interpretation depending on which section of
which chapter you are reading. So for example in Theorem VIII.4.16 on the classi-
fication of wallpaper groups, I was unable to extract a precise statement of exactly
what notion of symmetry equivalence is meant in saying that there are just seven-
teen of them. Still, the authors are to be commended on giving complete proofs and
a sufficiently detailed analysis so that students working through the chapter will get
a real understanding of how transformation groups work. Other proofs of this re-
sult that I have seen tend to be slick or clever, so one does not really see what is
going on.
Chapter VI has some nice applications of the group-theoretic point of view to more
advanced results of Euclidean geometry. My only complaint here is that many attri-
butions are given without references (the Euler line, Feuerbach’s theorem, Fagnano’s
problem, the Fermat problem, Napoleon’s theorem, and so forth). As Coxeter and Gre-
itzer say in their delightful book [3], it is about as likely that Napoleon knew enough
geometry to prove “his” theorem as it is that he knew enough English to compose the
palindrome “Able was I ere I saw Elba.” It would be nice if the authors, instead of un-
critically repeating what others have said, at least made some effort to trace the origins
of these results and their attributions.
A final chapter deals with area and volume. Again it is refreshing to see a careful
treatment, since most elementary texts simply talk about area without saying what it
is or showing that an area function exists. Our authors choose the route of analysis,
using Jordan measure in the plane, which has the advantage of being accurate, and is
consistent with their assumption of the real numbers at the outset, even though it is
possible to develop a theory of area, at least of polygonal figures, without analysis.
The reader should watch the terminology carefully, because even though they talk
about polygonal regions, they never show that if P1 , . . . , Pn is a finite set of points in
the plane, then the polygon formed of the segments P1 P2 , P2 P3 , . . . , Pn P1 defines a
polygonal region!
All in all, this is a substantial book with a lot of good material in it, well worth
studying. The authors promise a volume 2, which should contain solid geometry and
non-Euclidean geometry in the context of projective geometry.



REFERENCES

1. F. Bachmann, Aufbau der Geometrie aus dem Spiegelungsbegriff, Springer, Berlin, 1973.
2. W. Barker and R. Howe, Additional material for Continuous Symmetry: From Euclid to Klein (2010),
available at http://www.ams.org/publications/authors/books/postpub/mbk-47.
3. H. S. M. Coxeter and S. L. Greitzer, Geometry Revisited, Mathematical Association of America, Wash-
ington, DC, 1967.
4. R. Hartshorne, Geometry: Euclid and Beyond, Springer, New York, 2000.
5. D. Hilbert, Grundlagen der Geometrie, in Festschrift zur Feier der Enthüllung des Gauss-Weber-Denkmals
in Göttingen, B. G. Teubner, Leipzig, 1899.
6. C. Jordan, Traité des Substitutions et des Equations Algébriques, Gauthier-Villars, Paris, 1870.
7. F. Klein, Vergleichende Betrachtungen über Neuere Geometrische Forschungen, A. Deichert, Erlangen,
1872.
8. E. Moise, Elementary Geometry from an Advanced Standpoint, 3rd ed., Addison Wesley, Reading, MA,
1990.
9. H. Wussing, The Genesis of the Abstract Group Concept, MIT Press, Cambridge, MA, 1984.

Department of Mathematics, University of California, Berkeley, CA, 94720-3840


[email protected]

The Essence of Mathematics . . .

“The essence of mathematics is perpetually to be discarding more special ideas in
favour of more general ideas, and special methods in favour of general methods.”

Alfred North Whitehead, Technical Education and Its Relation to
Science and Literature, in The Aims of Education and Other Essays,
The Free Press, New York, 1929, p. 53.




