AMM - Vol.118 NR 06 PDF
MONTHLY
VOLUME 118, NO. 6 JUNE–JULY 2011
NOTES
REVIEWS
EDITOR
Daniel J. Velleman
Amherst College
ASSOCIATE EDITORS
William Adkins, Louisiana State University
David Aldous, University of California, Berkeley
Roger Alperin, San Jose State University
Anne Brown, Indiana University South Bend
Edward B. Burger, Williams College
Scott Chapman, Sam Houston State University
Ricardo Cortez, Tulane University
Joseph W. Dauben, City University of New York
Beverly Diamond, College of Charleston
Gerald A. Edgar, The Ohio State University
Gerald B. Folland, University of Washington, Seattle
Sidney Graham, Central Michigan University
Doug Hensley, Texas A&M University
Roger A. Horn, University of Utah
Steven Krantz, Washington University, St. Louis
C. Dwight Lahr, Dartmouth College
Bo Li, Purdue University
Jeffrey Nunemacher, Ohio Wesleyan University
Bruce P. Palka, National Science Foundation
Joel W. Robbin, University of Wisconsin, Madison
Rachel Roberts, Washington University, St. Louis
Judith Roitman, University of Kansas, Lawrence
Edward Scheinerman, Johns Hopkins University
Abe Shenitzer, York University
Karen E. Smith, University of Michigan, Ann Arbor
Susan G. Staples, Texas Christian University
John Stillwell, University of San Francisco
Dennis Stowe, Idaho State University, Pocatello
Francis Edward Su, Harvey Mudd College
Serge Tabachnikov, Pennsylvania State University
Daniel Ullman, George Washington University
Gerard Venema, Calvin College
Douglas B. West, University of Illinois, Urbana-Champaign
EDITORIAL ASSISTANT
Nancy R. Board
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and other features, about mathematics
and the profession. Its readers span a broad spectrum of mathematical interests, and
include professional mathematicians as well as students of mathematics at all collegiate
levels. Authors are invited to submit articles and notes that bring interesting
mathematical ideas to a wide audience of MONTHLY readers.
The MONTHLY’s readers expect a high standard of exposition; they expect articles to
inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant
to be read, enjoyed, and discussed, rather than just archived. Articles may be
expositions of old or new results, historical or biographical essays, speculations or
definitive treatments, broad developments, or explorations of a single application.
Novelty and generality are far less important than clarity of exposition and broad
appeal. Appropriate figures, diagrams, and photographs are encouraged.
Notes are short, sharply focused, and possibly informal. They are often gems that
provide a new proof of an old theorem, a novel presentation of a familiar theme, or a
lively discussion of a single issue.
Beginning January 1, 2011, submission of articles and notes is required via the
MONTHLY’s Editorial Manager System. Initial submissions in pdf or LATEX form can be sent
to the Editor-Elect Scott Chapman at

Proposed problems or solutions should be sent to:
DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368
In lieu of duplicate hardcopy, authors may submit pdfs to [email protected].

Advertising Correspondence:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (877) 622-2373
E-mail: [email protected]
Further advertising information can be found online at www.maa.org

Change of address, missing issue inquiries, and other subscription correspondence:
MAA Service Center, [email protected]

All at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036
Abstract. We revisit the idea of road-wheel pairs, first introduced 50 years ago by Gerson
Robison and later popularized by Stan Wagon and his square-wheeled tricycle. We show how
to generate such pairs geometrically: the road as a roulette curve and the wheel as a pedal
curve. Along the way we gain geometric insight into two theorems proved by Jakob Steiner
relating the area and arc length of a roulette to those of a corresponding pedal. Finally, we use
our results to generate parabolas, ellipses, and sine curves as roulettes.
Roulette Lemma. Let C be a rollable curve and let R be the roulette traced by a
point P moving with C as C rolls on the x-axis. If (x, y) = (f(θ), g(θ)) is a
parameterization of R in terms of the angle of rotation of C, then f and g are
continuously differentiable and g(θ) = f′(θ).
doi:10.4169/amer.math.monthly.118.06.479
Since C contains no line segments, the instantaneous center of rotation moves
continuously in the direction of the positive x-axis as C rolls on the axis, and the
x-coordinate of point A is a continuous function of θ. Also, since C is continuously
differentiable, r and ρ are continuous functions of θ, and hence f and g are
continuously differentiable. Finally, f′(θ) = r sin ρ = g(θ).
One immediate consequence of the roulette lemma is that the tracing point moves
to the right, that is, dx/dθ > 0, precisely when it is above the x-axis. Or put another
way, the horizontal motion of the tracing point changes direction when the tracing
point crosses the x-axis. Thus, the vertical tangents to the roulette curve occur at the
x-intercepts. The case of a trochoid, traced by a point exterior to a circle as the circle
rolls on a line, nicely illustrates these ideas and is shown in Figure 2.
Figure 2. A trochoid.
We should point out that the standard parameterization of a roulette is usually of the
form
where r and ρ are as in Figure 1 and s is the arc length of C between A and the
point of C that was in contact with the origin. But by using (3) we avoid the arc length
computation. As an example, we give a short proof that the roulette traced by the focus
of a parabola rolling on a line is a catenary. In Figure 5 the curve C in its initial position
is the parabola $y = x^2/4$ and $P_0$ is the focus, (0, 1). A simple computation shows that
$A''$ is the point of intersection of the tangent line and the x-axis. Thus, g(θ) = sec θ,
It follows that the roulette generated by the focus of the parabola $y = x^2/4$ rolling on
the x-axis is the catenary y = cosh x. See [1] for another proof.
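As a quick numerical check of this example, the sketch below takes g(θ) = sec θ, recovers x = f(θ) from the relation f′(θ) = g(θ) of the roulette lemma by numerical integration, and compares the resulting y-coordinate with cosh x. The helper names (`simpson`, `focus_roulette_point`) and the sample angles are illustrative choices, not part of the article.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def focus_roulette_point(theta):
    """Point (f(theta), g(theta)) with g = sec and f the integral of g from 0 to theta."""
    g = lambda t: 1.0 / math.cos(t)
    return simpson(g, 0.0, theta), g(theta)

# Sample rotation angles (hypothetical values, chosen inside (-pi/2, pi/2)).
for theta in (-1.2, -0.5, 0.3, 1.0):
    x, y = focus_roulette_point(theta)
    print(f"theta={theta:5.2f}  x={x:9.6f}  y={y:.6f}  cosh(x)={math.cosh(x):.6f}")
```

In each printed row the last two columns should agree to within the integration error, which is what the catenary claim predicts.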
3. PEDAL CURVES. Let P be a point in the plane of C. The pedal curve of C with
respect to P, which we denote by $C_P$, is the set of the feet of all perpendiculars from
P to the tangent lines of C. The idea of a pedal curve is contained in Figure 3, where
the pedal, $C_{P_0}$, is the set of points $A''$ as $A'$ varies along C. If we place the pole of
a polar coordinate system at $P_0$ and let the polar axis point in the direction of the
negative y-axis in Figure 3, then a polar equation for $C_{P_0}$ is r = g(θ). Thus, in (3), the
y-coordinate of the roulette is the polar radius of $C_{P_0}$.
We use the notation of Figure 3 throughout, where capital letters denote points on
a curve and their primed counterparts denote the corresponding points on the pedal
curve. For now we give four examples.
1. Let C be the circle r = 2 cos θ and let P be the pole of the polar coordinate
system. Figure 6 illustrates that the cardioid r = 1 + cos θ is the pedal of C with
respect to P. To see why, note that since $\triangle PQ'D \sim \triangle OQD$ and OD = sec θ,
it follows that $r = PQ' = 1 + \cos\theta$. Alternatively, we could have concluded
directly from the expression for the y-coordinate of the cycloid in (1) that the
pedal of the circle r = −2 cos θ with respect to the pole is the cardioid
r = 1 − cos θ.
2. The pedal of a parabola with respect to its focus is the tangent line to the parabola
at the vertex (see Figure 5).
3. The pedal of an ellipse with respect to a focus is the circle having the major axis
of the ellipse as a diameter. See [5, p. 13] for a proof.
4. In a degenerate case, where the curve C is a point Q, the family of tangents to C
is the pencil of lines through Q and $C_P$ is the circle with diameter PQ.
A curve geometrically similar to $C_P$, but twice as large, may be obtained by
reflecting P across each of the tangent lines of C. The set of reflected points $P'$ is
called the orthotomic of C with respect to P (see [5, p. 153]). The orthotomic may also be
Figure 7. The angle property of pedal curves.
Steiner Theorem 1. The arc length of R is equal to the arc length of $C_P$ between $A'$
and $B'$.
Steiner Theorem 2. The area between R and m is twice the area bounded by $PA'$,
$PB'$ and $C_P$.
In particular, if C is a closed curve rolling on a line, the area between the line and
roulette after one revolution is twice the area of $C_P$, and the arc length of the roulette
is equal to the arc length of $C_P$. To illustrate, let C be a circle of radius r that rolls
through an angle of θ radians, and let P be the center of the circle. The roulette curve
is a line segment l of length rθ and the area of the rectangular region between l and
m is $r^2\theta$. Since $Q' = Q$ for all points Q of C, the corresponding region of the pedal
curve is the sector of a circle with central angle θ, area $\tfrac{1}{2} r^2 \theta$, and arc length rθ. For a
second example of Steiner’s theorems, take C to be a circle, let the tracing point P be
The Steiner theorems follow as corollaries to the roulette lemma. To see this, let C
roll along the x-axis with P initially on the y-axis, and let θ denote the angle through
which C has turned from its starting position. Using (3), a parameterization of R is
given by (x, y) = (f(θ), g(θ)), where f′(θ) = g(θ). If we place the pole of a polar
coordinate system at P with the polar axis pointing in the direction of the negative
y-axis, a polar equation of $C_P$ is r = g(θ). Then as C turns through the differential
angle dθ, the differential arc length of the roulette is
$$\sqrt{(dx)^2 + (dy)^2} = \sqrt{(g(\theta))^2 + (g'(\theta))^2}\, d\theta, \tag{4}$$
proving the first Steiner theorem. The area element of the roulette is
$$y\,dx = (g(\theta))^2\, d\theta,$$
which is twice the polar area element $\tfrac{1}{2}(g(\theta))^2\, d\theta$ of $C_P$, proving the second
Steiner theorem.
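Both Steiner theorems can be illustrated numerically. The sketch below assumes C is the unit circle with the tracing point on its circumference, so that the roulette is one arch of the cycloid (x, y) = (θ − sin θ, 1 − cos θ) and the pedal is the cardioid r = 1 − cos θ. The two curves are sampled independently and compared through polygonal approximations; the sample count `N` and the function names are arbitrary illustrative choices.

```python
import math

N = 20000  # number of sample intervals (arbitrary choice)
thetas = [2.0 * math.pi * k / N for k in range(N + 1)]

# Roulette: one arch of the cycloid traced by a point on the rolling unit circle.
cycloid = [(t - math.sin(t), 1.0 - math.cos(t)) for t in thetas]

# Pedal: the cardioid r = 1 - cos(theta), drawn with the pole at the origin.
# Its orientation in the plane does not affect length or area.
cardioid = [((1.0 - math.cos(t)) * math.cos(t),
             (1.0 - math.cos(t)) * math.sin(t)) for t in thetas]

def polyline_length(points):
    """Length of the polygonal path through the sample points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def area_under(points):
    """Trapezoid rule for the integral of y dx along the sampled curve."""
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def enclosed_area(points):
    """Shoelace formula for the area enclosed by a closed sampled curve."""
    return 0.5 * abs(sum(x0 * y1 - x1 * y0
                         for (x0, y0), (x1, y1) in zip(points, points[1:])))

print("arc lengths:", polyline_length(cycloid), polyline_length(cardioid))
print("areas:      ", area_under(cycloid), 2.0 * enclosed_area(cardioid))
```

The first printed pair compares the two arc lengths (first theorem) and the second compares the area under the arch with twice the area of the cardioid (second theorem).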
5. ROADS AND WHEELS. Many readers are probably familiar with a square-
wheeled bicycle. If the wheels roll along a road consisting of portions of appropriately
chosen inverted catenaries, the axles of the wheels move horizontally and the ride is
smooth. Figure 9 shows another example of a road-wheel pair. An elliptical wheel rolls
without slipping on a sinusoidal road, while one focus, A, of the ellipse moves along
the x-axis. Hall and Wagon [3] generate many other examples of road-wheel pairs.
We give some of their examples below, but for now we present a modified version
of their approach which is more suited to our geometric point of view. We place the
road below the x-axis in a rectangular coordinate system and require that the wheel
roll on the road without slipping and that the axle of the wheel, which we initially
place at the origin, move along the x-axis. An equivalent way to describe the rolling
condition is that the point of contact of the road and wheel (Q in Figure 9) is the
instantaneous center of rotation of the wheel. This implies that the velocity of the axle
is perpendicular to the spoke AQ. Adding the requirement that the axle of the wheel
move along the x-axis forces Q to be directly below A at all times.
Figure 9. An ellipse rolling on a sine curve.
$$dx = r\,d\theta, \tag{6}$$
Thus, the wheel with polar equation r = g(θ) has as its corresponding road the
curve
$$(x, y) = \left( \int_0^{\theta} g(t)\,dt,\; -g(\theta) \right). \tag{7}$$
Note that (6) and (7) hold even if g(θ) < 0. In this case, the part of the road cor-
responding to g(θ) < 0 lies above the x-axis and the axle moves to the left while
the wheel is in contact with this part of the road. This is illustrated in Figure 10 (see
example 1 below). Here are some examples, generated using (7).
1. If the wheel is the limaçon r = 1 − d cos θ , the road is the trochoid (x, y) =
(θ − d sin θ, d cos θ − 1). If d = 1 the wheel is a cardioid and the road is a
cycloid. The case d = 1.5 is shown in Figure 10.
2. If the wheel is the horizontal line r = sec θ (or y = −1), the road is the catenary
(x, y) = (ln(sec θ + tan θ), − sec θ), as shown in Figure 11.
3. If the wheel is the circle r = cos θ, the road is the circle (x, y) = (sin θ, − cos θ).
In this example a circle with radius 1/2, in its initial position with center at
(0, −1/2), rolls on the inside of the unit circle $x^2 + y^2 = 1$. The point on the
rolling circle initially at the origin oscillates on the x-axis between (−1, 0) and
(1, 0).
4. If the wheel is the parabola r = 1/(2 + 2 cos θ), or $y = x^2 - 1/4$, the road is the
parabola (x, y) = (sin θ/(2 + 2 cos θ), −1/(2 + 2 cos θ)), or $y = -x^2 - 1/4$.
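The examples above can be reproduced numerically from (7). The following sketch integrates the polar radius of the wheel with Simpson's rule to obtain the x-coordinate of the road and uses −g(θ) for the y-coordinate. The function name `road_point`, the dictionary of wheels, and the sample angle are illustrative assumptions.

```python
import math

def road_point(g, theta, n=2000):
    """Road point for the wheel r = g(theta), following (7):
    x is the integral of g from 0 to theta (Simpson's rule), y = -g(theta)."""
    h = theta / n
    total = g(0.0) + g(theta)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0, -g(theta)

d = 1.5  # limacon parameter, as in the case shown in Figure 10
wheels = {
    "limacon": lambda t: 1.0 - d * math.cos(t),              # example 1
    "line": lambda t: 1.0 / math.cos(t),                     # example 2
    "parabola": lambda t: 1.0 / (2.0 + 2.0 * math.cos(t)),   # example 4
}

theta = 0.9  # hypothetical sample angle
for name, g in wheels.items():
    x, y = road_point(g, theta)
    print(f"{name:9s} x={x:.6f}  y={y:.6f}")
```

The printed points can be compared with the closed forms listed in the examples: the trochoid for the limaçon, y = −cosh x for the line, and $y = -x^2 - 1/4$ for the parabola.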
Lemma 1. The length of the road between A and B is equal to the length of the wheel
between $A'$ and $B'$.
Lemma 2. The area between the road and the x-axis from A to B is twice the area
bounded by the wheel and the segments $OA'$ and $OB'$, where O is the axle of the wheel.
To prove the second property, note from (6) that the area element between the road
and the x-axis is
$$dA = -y\,dx = -y\,r\,d\theta = r^2\,d\theta,$$
which is twice the area element of the wheel. We can show this property geometrically
by using the spokes to unwrap the wheel as shown in Figure 12. The distance between
nearby unwrapped spokes of length r and R = r + Δr is taken to be r Δθ, where Δθ is
the angle between the spokes. The area of each trapezoid is then approximately twice
the area of the corresponding sector. This can be thought of as a generalization of the
grade school proof for the area of a circle. Alternatively, we can think of the latter
proof as constructing a road for a circular wheel.
Main Theorem. Let C be a rollable curve initially tangent to the x-axis at the origin,
O, and let $P_0$ be a point on the y-axis. Let $R'$ be the reflection in the x-axis of the
roulette curve R generated by $P_0$ as C rolls on the x-axis. Then translating the pedal,
$C_{P_0}$, by the vector $\overrightarrow{P_0 O}$ gives a wheel for the road $R'$.
Proof. A comparison of (3) and (7) actually proves the theorem, but we give a more
geometric proof that illustrates why the theorem is true and describes the rolling
motion of the wheel. As shown in Figure 13, let P(x, y) denote the position of $P_0$ after C
has rolled some distance along the x-axis. Let A be the point of tangency between C
and the x-axis and let $A'$ be the corresponding point of $C_P$. Also, let Q be the
reflection of P in the x-axis. We show that translating $C_P$ by the vector
$\overrightarrow{PA'} = \overrightarrow{A'Q}$ gives the wheel in its position when it is in contact with
the road at Q; in particular, the initial position of the wheel is the translation of
$C_{P_0}$ by the vector $\overrightarrow{P_0 O}$. Let W denote the translation of $C_P$ by
$\overrightarrow{PA'}$. The translation ensures that W and $R'$ are in contact at Q.
It also sends the pedal point P to $A'$, so that the translated copy of P moves along the
x-axis and serves as the wheel’s axle. It remains to show that W rolls on the road. We
first show that W and $R'$ are tangent at Q. By the angle property of pedals, the angles
at A and $A'$ labeled ρ are congruent. But the angle between R and $PA'$ at P also has
measure ρ. It follows that W and $R'$ are tangent at Q. That W is rolling, and not
sliding, on $R'$ follows immediately from the first Steiner theorem, since the corresponding
arc lengths of $R'$ and $C_{P_0}$ are equal. But since we would like to claim Steiner’s first
theorem as a corollary to this theorem, we give an alternative argument to show that
W is rolling on $R'$. The roulette lemma implies that as C turns through the angle dθ,
the horizontal component of the displacement of P, and hence the displacement of
We now show how to use our theorem to generate the first three road-wheel pairs
from the previous section. We describe the curve C in its initial position, tangent to the
x-axis at the origin. We also describe the wheel in its initial position, $W_0$, with its axle
at the origin. As usual, the polar equation of $W_0$ assumes that the polar axis coincides
with the negative y-axis.
1. Take C to be the circle $x^2 + (y - 1)^2 = 1$ and $P_0 = (0, 0)$. Then $C_{P_0}$ is the
cardioid r = 1 − cos θ and the roulette R is the cycloid (x, y) = (θ − sin θ, 1 − cos θ).
Since the pedal point is initially at the origin, $W_0$ is the same cardioid.
Thus, the wheel r = 1 − cos θ rolls on the road $R'$ given by
(x, y) = (θ − sin θ, cos θ − 1).
2. Take C to be the parabola $y = x^2/4$ and $P_0 = (0, 1)$. Then $C_{P_0}$ is the x-axis (see
Figure 5) and R is the catenary y = cosh x. The wheel, $W_0$, is the translation of
the x-axis by the vector $\overrightarrow{P_0 O} = \langle 0, -1 \rangle$, or the line y = −1. So the line y = −1
is the wheel for the catenary y = − cosh x.
3. In a degenerate case, take C to be the origin and $P_0$ to be the point (0, 1). The
pedal curve, $C_{P_0}$, is the circle with diameter $OP_0$. The roulette curve R (and also
$R'$) is the unit circle centered at the origin. Since $\overrightarrow{P_0 O} = \langle 0, -1 \rangle$, a wheel for
$R'$ is the circle of radius 1/2 centered at (0, −1/2).
Our theorem suggests that to visualize Steiner’s first theorem we should replace
our image of P tracing out R as C rolls on a line (as in Figure 8) by the image of
$C_P$ rolling on $R'$. We use this idea to recast our dynamic interpretation of the Steiner
theorems given at the end of Section 4. To illustrate, return to Figure 9, showing an
elliptical wheel rolling on a sinusoidal road, and put the wheel in motion. As the ellipse
rolls on the sine curve, the point $Q'$ moves on the fixed ellipse at the left. Recall $Q'$
corresponds to the point, Q, on the rolling ellipse that is in contact with the road at
any instant. Expressed another way, the motion of the spoke $A'Q'$ on the fixed ellipse
is just the motion of the spoke AQ on the rolling ellipse as seen by an observer in the
reference frame of the rolling ellipse. We may restate Steiner’s first theorem as saying
Q and $Q'$ move at the same speed. Of course this follows from our condition that the
ellipse roll on the road without slipping. But now consider the additional condition that
the axle of the wheel move along the x-axis. This requirement forces the spoke, AQ,
of the rolling ellipse to be perpendicular to the x-axis and thus sweep out a differential
rectangle as the ellipse turns through the angle dθ, while the spoke, $A'Q'$, of the fixed
ellipse sweeps out a differential triangle. Since the speeds of $A'$ and A are equal, the
area of the differential rectangle is twice that of the triangle.
The rolling motion of the cardioid is hinted at in Figure 14. The pedal curve, $C_{P_0}$, is
Cayley’s sextic with polar radius given by the y-coordinate of R, or $r = 2\sin^3(\theta/3)$.
This is a wheel for the road $R'$, as shown in Figure 15, where Cayley’s sextic is shown
in its initial position and some later position. The cardioid (dashed) is also shown to
illustrate our theorem; the position of the wheel is found by translating the pedal of the
cardioid by the vector $\overrightarrow{PA'}$.
Figure 16. An elliptic catenary generated as a roulette, e = 0.8.
Figure 17. A circular wheel rolling on an elliptic catenary, e = 0.8.
where the subscript denotes partial differentiation with respect to the parameter t.
For example, suppose the wheel is the parabola $y = x^2/4 - 1$, with corresponding
road $y = -x^2/4 - 1$. Since the initial point of contact of the wheel and road
is $Q_0(0, -1)$, the initial position of the tracing point is $P_0(0, 1)$. Parameterizing the
wheel as $(x, y) = (2t, t^2 - 1)$ gives a parameterization of the set of tangent lines to
W 0 as
$$f(x, y, t) = 2tx + (t^2 - 1)y - (t^4 + 2t^2 + 1) = 0.$$
We then translate the solution of the system (8) by the vector $\overrightarrow{Q_0 O} = \langle 0, 1 \rangle$ to get a
parameterization of C in its initial position as
$$(x, y) = (3t - t^3, 3t^2).$$
Thus, the parabola $y = x^2/4 + 1$ is the roulette traced by the point $P_0(0, 1)$ as the
above curve rolls along the x-axis. In this case the rolling curve, the negative pedal
of a parabola with respect to its focus, is called Tschirnhausen’s cubic. Figure 18
shows Tschirnhausen’s cubic rolling along the x-axis as the point $P_0(0, 1)$ traces out
a parabola. Tschirnhausen’s cubic is shown twice: once in its initial position (dashed)
and again after it has rolled some distance along the x-axis.
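The parabola example can be checked directly. The sketch below assumes that system (8) is the usual envelope condition f = 0 together with ∂f/∂t = 0, which matches the remark about partial differentiation with respect to t. It confirms that the curve $(3t - t^3, 3t^2)$, shifted back down by one unit, satisfies both equations, and that the feet of the perpendiculars from $P_0 = (0, 1)$ to its tangent lines lie on the parabola $(2t, t^2)$. All function names are illustrative.

```python
def f(x, y, t):
    """Line through the wheel point (2t, t^2 - 1), perpendicular to its spoke from O."""
    return 2 * t * x + (t * t - 1) * y - (t ** 4 + 2 * t * t + 1)

def f_t(x, y, t):
    """Partial derivative of f with respect to the parameter t."""
    return 2 * x + 2 * t * y - (4 * t ** 3 + 4 * t)

def curve(t):
    """Rolling curve C in its initial position (Tschirnhausen's cubic)."""
    return 3 * t - t ** 3, 3 * t * t

def pedal_foot(t, pole=(0.0, 1.0)):
    """Foot of the perpendicular from the pole to the tangent line of C at parameter t."""
    x, y = curve(t)
    dx, dy = 3 - 3 * t * t, 6 * t          # tangent vector of C, never zero
    s = ((pole[0] - x) * dx + (pole[1] - y) * dy) / (dx * dx + dy * dy)
    return x + s * dx, y + s * dy

for t in (-2.0, -0.5, 0.0, 0.7, 3.0):      # hypothetical sample parameters
    x, y = curve(t)
    # Undo the translation by <0, 1> before testing the envelope equations.
    print(t, f(x, y - 1, t), f_t(x, y - 1, t), pedal_foot(t), (2 * t, t * t))
```

The second and third printed values should vanish up to rounding, and the last two tuples should coincide, which identifies the pedal of C with respect to $P_0$ as the parabola $y = x^2/4$.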
with solution
For example, take the wheel to be the rose r = cos(nθ), with corresponding road the
ellipse $(x, y) = (\tfrac{1}{n}\sin n\theta, -\cos n\theta)$. Since the initial point of contact is $Q_0(0, -1)$,
the initial position of the tracing point is $P_0(0, 1)$. Using (9) and translating by the
vector $\overrightarrow{Q_0 O} = \langle 0, 1 \rangle$ gives a parameterization of the rolling curve in its initial position
as
$$(x, y) = \frac{n-1}{2}\bigl(-\sin(n+1)\theta, \cos(n+1)\theta\bigr) - \frac{n+1}{2}\bigl(\sin(n-1)\theta, \cos(n-1)\theta\bigr) + (0, 1). \tag{10}$$
The curves (10) are all hypocycloids. That is, they are the roulettes generated by a
point on the circumference of a circle of radius r as it rolls with internal contact inside
a fixed circle of radius R. Choosing r = (n − 1)/2 and R = n gives the hypocycloid
in (10). Since a hypocycloid is not differentiable at its cusps, we may apply our main
theorem only to each differentiable piece. However, a hypocycloid has well-defined
tangents at its cusps and the tangents to the hypocycloid vary continuously as the
curve is traversed. Thus, a hypocycloid can roll on a line and as it rolls it maintains its
sense of rotation. We can piece together the results from our main theorem to conclude
that when the curve (10) rolls on the x-axis, the point $P_0(0, 1)$ traces out the ellipse
$(x, y) = (\tfrac{1}{n}\sin n\theta, \cos n\theta)$. The description of the rolling motion below makes this
transparent.
Figure 19 shows the case n = 2, where the rolling curve is an astroid. The dashed
astroid is the curve in its initial position and the solid astroid is the curve in its new
position after it has rotated θ = 0.5 radians. Figures 20 and 21 show the case n = 4,
where the initial and rotated (θ = 0.25) positions are shown separately for clarity.
Since the tangent lines to the hypocycloid at the cusps pass through the tracing point
$P_0$, the x-intercepts of the ellipse are traced when the cusps are tangent to the x-axis.
Thus, half of the ellipse is traced when one arc of the hypocycloid has completed
rolling along the x-axis. When the point of tangency of the hypocycloid and the x-axis
passes through a cusp, the tracing point crosses the x-axis and the point of tangency
switches between the “top” and “bottom” of the x-axis (more precisely, the astroid is
alternately above and below the x-axis near the point of tangency). As mentioned in
the remarks following the roulette lemma, it is at the moments when the tracing point
crosses the x-axis, in this case when a cusp is the point of tangency, that the direction
of the tracing point changes between a rightward and a leftward motion.
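The claim that the rose r = cos(nθ) is the pedal of the hypocycloid (10) with respect to $P_0(0, 1)$ can be tested numerically. In the sketch below, the tangent line to (10) at parameter θ is assumed to have unit normal (sin θ, −cos θ), that is, the direction at angle θ from the negative y-axis. The code checks this assumption with a finite difference and then compares the signed distance from $P_0$ to the tangent line with cos(nθ). Function names and sample values are illustrative.

```python
import math

def hypocycloid(n, theta):
    """Rolling curve (10) in its initial position."""
    a, b = (n - 1) / 2.0, (n + 1) / 2.0
    x = -a * math.sin((n + 1) * theta) - b * math.sin((n - 1) * theta)
    y = a * math.cos((n + 1) * theta) - b * math.cos((n - 1) * theta) + 1.0
    return x, y

def pedal_radius(n, theta, pole=(0.0, 1.0)):
    """Signed distance from the pole to the tangent line at parameter theta,
    measured along the assumed unit normal (sin(theta), -cos(theta))."""
    x, y = hypocycloid(n, theta)
    return (x - pole[0]) * math.sin(theta) - (y - pole[1]) * math.cos(theta)

def tangent_dot_normal(n, theta, h=1e-6):
    """Central-difference chord of the curve dotted with the assumed normal.
    A value near zero means the tangent is perpendicular to that normal."""
    x0, y0 = hypocycloid(n, theta - h)
    x1, y1 = hypocycloid(n, theta + h)
    return (x1 - x0) * math.sin(theta) - (y1 - y0) * math.cos(theta)

for n in (2, 3, 4):
    for theta in (0.1, 0.7, 1.3, 2.9):
        print(n, theta, pedal_radius(n, theta), math.cos(n * theta),
              tangent_dot_normal(n, theta))
```

For each row the third and fourth values should agree and the last should be negligible, so the polar radius of the pedal, and hence the y-coordinate of the roulette, is cos(nθ).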
In our last two examples, the curve C also has cusps and we may apply our theorem
only to its differentiable pieces. But, as in the last example, the cusps have well-defined
tangent lines and the tangents to C vary continuously, and we may piece together the
roulettes generated by each differentiable part of C . We now consider the case of an
elliptical wheel rolling on a sinusoidal road, as one focus of the ellipse moves along the
x-axis, as shown in Figure 9. More specifically, suppose that the wheel is the ellipse
r = ed/(1 + e cos θ), 0 < e < 1. Using (7), the corresponding road is given by
$$y = -\frac{ed}{a^2}\bigl(1 - e\cos(cx)\bigr), \tag{11}$$
where c = a/(ed). For an alternative approach see [3]. The negative pedal of an ellipse
with respect to a focus, P, is shown in Figures 22 and 23 for two cases, e > 1/2
and e < 1/2, respectively, where the ellipses are dashed. We follow Lockwood [4]
and call the negative pedal Burleigh’s oval and we call P the eye of the oval. If e >
1/2 the oval has two cusps. Otherwise, it is an egg-shaped curve. It follows that as
Burleigh’s oval rolls on a line, the eye traces out a sine curve. Using (9) and translating
by the vector $\overrightarrow{Q_0 O} = \langle 0, ed/(1 + e) \rangle$ gives a parameterization of the rolling curve in
its initial position as
$$(x, y) = \frac{ed}{(1 + e\cos\theta)^2}\bigl(\sin\theta + e\sin 2\theta,\; -\cos\theta - e\cos 2\theta\bigr) + \left(0, \frac{ed}{1+e}\right).$$
Figure 22. Negative pedal of an ellipse (e = 0.8).
Figure 23. Negative pedal of an ellipse (e = 0.45).
If this oval is rolled on the x-axis, the eye, $P_0(0, ed/(1 + e))$, traces out the reflection
of the sine curve (11) in the x-axis. When e > 1/2 the point of tangency between the
oval and the x-axis changes between being above and below the x-axis at the cusps.
But in contrast to the previous example, the tracing point continues to move to the right
at all times since no tangent to the oval passes through its eye.
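The sinusoidal road (11) can be compared with a direct numerical evaluation of (7) for the elliptical wheel r = ed/(1 + e cos θ). The definition of a does not appear in this excerpt; the sketch below assumes $a = \sqrt{1 - e^2}$, together with c = a/(ed) as stated. The values of e, d, and θ are arbitrary test inputs, and the function names are illustrative.

```python
import math

def ellipse_road_point(e, d, theta, n=4000):
    """Road point from (7) for the wheel r = e*d / (1 + e*cos(theta)),
    with the integral evaluated by composite Simpson's rule."""
    g = lambda t: e * d / (1.0 + e * math.cos(t))
    h = theta / n
    total = g(0.0) + g(theta)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0, -g(theta)

def sine_road(e, d, x):
    """Right-hand side of (11), assuming a = sqrt(1 - e^2) and c = a / (e*d)."""
    a = math.sqrt(1.0 - e * e)
    c = a / (e * d)
    return -(e * d / (a * a)) * (1.0 - e * math.cos(c * x))

e, d = 0.8, 1.0   # hypothetical eccentricity and directrix distance
for theta in (0.5, 1.5, 2.5, 4.0):
    x, y = ellipse_road_point(e, d, theta)
    print(f"theta={theta:.1f}  x={x:.6f}  y={y:.6f}  (11) gives {sine_road(e, d, x):.6f}")
```

Agreement of the last two columns supports both the reconstruction of (11) and the assumed value of a.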
Figure 24. Roulette of focus of negative pedal of an ellipse (e = 0.8).
Figure 25. Roulette of focus of negative pedal of an ellipse (e = 0.45).
$$y = -\frac{ed}{a^2}\bigl(1 - e\cos(cx)\bigr), \tag{13}$$
and the road for λ = 4 are shown in Figure 26. As the curve given by (14) rolls on the
x-axis, the eye, $P_0(0, \sqrt{17} - 1)$, traces out the curve $y = \sqrt{17} - \cos x$, as shown in
Figure 27.
Figure 26. A wheel for the road $y = -\sqrt{17} + \cos x$.
Figure 27. The curve $y = \sqrt{17} - \cos x$ generated as a roulette.
While we have used our theorem to generate some familiar curves as roulettes, there
are many other possibilities. We invite the interested reader to explore more examples
of road-wheel pairs in [3] and [6] to generate additional roulettes.
ACKNOWLEDGMENTS. I am deeply grateful to several referees whose many suggestions greatly improved
this article. In particular, I would like to thank one referee for generously providing me with an example of
how to use Mathematica and another for pointing out the angle property of pedals and providing the geometric
proof of the main theorem. I would also like to thank my student, Mark Pembrooke, for suggesting the roulette
lemma, and my teacher, John P. Titterton, for introducing me to the beauty of curves thirty years ago.
REFERENCES
1. A. Agarwal and J. E. Marengo, The locus of the focus of a rolling parabola, College Math. J. 41 (2010)
129–133. doi:10.4169/074683410X480230
2. T. Apostol and M. Mnatsakanian, Area & arc length of trochogonal arches, Math Horizons 11(2) (2003)
24–30.
FRED KUCZMARSKI received his B.A. from the University of Pennsylvania in 1984 and his Ph.D. from
the University of Washington in 1995 under the guidance of Paul Goerss. His passion for geometry was later
sparked by James King. He currently teaches at Shoreline Community College in Seattle. His other interests
include baking bread and helping the Forest Service look for fires in Washington’s North Cascades.
Department of Mathematics, Shoreline Community College, Shoreline, WA 98133
[email protected]
“When we moved permanently to the country, the whole house had to be redeco-
rated and all the rooms had to be freshly wallpapered. But since there were many
rooms, there wasn’t enough wallpaper for one of the nursery rooms . . . [which]
just stood there for many years with one of its walls covered with ordinary pa-
per. But by happy chance, the paper for this preparatory covering consisted of
the lithographed lectures of Professor Ostrogradsky on differential and integral
calculus, which my father had acquired as a young man.
These sheets, all speckled over with strange, unintelligible formulas, soon
attracted my attention. I remember as a child standing for hours on end in front
of this mysterious wall, trying to figure out at least some isolated sentences . . . .
Many years later, when I was already fifteen I took my first lesson in differ-
ential calculus from the eminent Petersburg professor Alexander Nikolayevich
Strannolyubsky. He was amazed at the speed with which I grasped and assim-
ilated the concepts of limit and of derivatives, ‘exactly as if you knew them in
advance.’ . . . And, as a matter of fact, at the moment when he was explaining
these concepts I suddenly had a vivid memory of all this, written on the memo-
rable sheets of Ostrogradsky; and the concept of limit appeared to me as an old
friend.”
Sofya Kovalevskaya, A Russian Childhood,
trans. B. Stillman, Springer-Verlag, New York, pp. 122–123
Abstract. In this article we explore some wonderfully intricate relationships between three
mathematical objects, each of which is associated in some way with the Fibonacci sequence.
One of these objects is a particular finite series comprising n terms, and, via the other two, our
mathematical journey culminates in the derivation of an expression for the sum of this series
in terms of sums and products of certain Fibonacci numbers.
1, 3, 4, 6, 8, 9, 11, 12, 14, 16, 17, 19, 21, 22, 24, 25, 27, 29, 30, 32, . . . . (2)
doi:10.4169/amer.math.monthly.118.06.497
Theorem 2.1. For any n ∈ N there exists a unique increasing sequence of positive
integers, $(c_1, c_2, \ldots, c_k)$ say, such that $c_1 \ge 2$, $c_i \ge c_{i-1} + 2$ for i = 2, 3, . . . , k, and
$$n = \sum_{i=1}^{k} F_{c_i}.$$
Proofs may be found in both [1] and [8]. Note that $F_1$ is excluded from appearing in a
Zeckendorf representation.
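A representation of this kind can be computed with the standard greedy strategy: repeatedly subtract the largest Fibonacci number that does not exceed what remains. The sketch below is an illustration of that strategy, not a method taken from the article. It indexes the Fibonacci numbers so that $F_1 = F_2 = 1$, and the function name `zeckendorf_indices` is hypothetical.

```python
def zeckendorf_indices(n):
    """Return [c_1, ..., c_k], increasing, with n = F_{c_1} + ... + F_{c_k},
    c_1 >= 2 and consecutive indices differing by at least 2."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    fib = [0, 1, 1]                    # fib[i] = F_i, with F_1 = F_2 = 1
    while fib[-1] <= n:
        fib.append(fib[-1] + fib[-2])
    indices = []
    i = len(fib) - 1
    while n > 0:
        if i >= 2 and fib[i] <= n:
            indices.append(i)          # greedy choice: largest F_i that fits
            n -= fib[i]
            i -= 2                     # the remainder is smaller than F_{i-1}
        else:
            i -= 1
    return indices[::-1]

for n in (1, 4, 20, 100):
    print(n, zeckendorf_indices(n))
```

For instance, 100 = 89 + 8 + 3 corresponds to the indices 11, 6, and 4 in this indexing.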
It seems remarkable that such a simple result was not discovered until well into the
20th century. Indeed, it is thought that Edouard Zeckendorf (1901–1983) first obtained
a proof in 1939, although he did not publish anything in this regard until 1972; see [9].
For interested readers, an excellent potted biography of Zeckendorf can be found in
[4]. It is worth noting, in particular, that he was a medical doctor by profession.
Throughout the majority of this article, we shall be dealing more generally with
F-representations of integers. In fact, our main result will apply to any F-representation
of n. However, Zeckendorf representations might be regarded as optimal in the sense
that if $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ is the Zeckendorf representation of n, then any other
F-representation of n will possess at least k terms. This property will be utilized right at
the very end of this paper in order to obtain the most efficient formula.
and so on. The golden string is the unique infinite string $S_\infty$ such that for all k ≥ 2, $S_k$
is an initial segment of $S_\infty$. We note here that some authors prefer to use 0’s and 1’s
rather than a’s and b’s as elements of $S_\infty$; see [5], for example. Our first lemma gives
some simple results that will be used in due course.
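The construction of the finite strings can be sketched in a few lines. The defining recurrence is not visible in this excerpt, so the code below assumes $S_1$ = a, $S_2$ = b, and $S_k = S_{k-1} S_{k-2}$ for k ≥ 3. This choice is inferred from the lemmas that follow (it gives $S_k$ length $F_k$ and places the b's at the positions listed in sequence (2)) and should be read as an assumption rather than the authors' stated definition.

```python
def golden_words(k_max):
    """Return [S_1, ..., S_{k_max}] under the assumed recurrence
    S_1 = 'a', S_2 = 'b', S_k = S_{k-1} + S_{k-2}."""
    words = ["a", "b"]
    while len(words) < k_max:
        words.append(words[-1] + words[-2])
    return words[:k_max]

words = golden_words(12)
for k, word in enumerate(words[:7], start=1):
    print(k, len(word), word)

# Every S_k with k >= 2 is an initial segment of the longest word computed.
longest = words[-1]
print(all(longest.startswith(w) for w in words[1:]))
```

The lengths printed in the second column are the Fibonacci numbers, and the prefix check reflects the property that defines the golden string.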
In the following two lemmas we begin to establish the fascinating structural
interplay between the golden string, F-representations, and a sequence related to $\{\lfloor n\phi \rfloor\}$.
Lemma 3.2. Let $F_{c_1} + F_{c_2} + \cdots + F_{c_k}$ be an F-representation for some n ∈ N. Then
$S_{c_k} S_{c_{k-1}} \cdots S_{c_1}$ gives the first n letters of $S_\infty$.
Proof. In order to prove this result we proceed by induction on k, the number of terms
in the F-representation. When k = 1, the statement of the lemma is certainly true.
Now assume it is true for some k = m ≥ 1. Consider the F-representation
gives the first $F_{c_1} + F_{c_2} + \cdots + F_{c_m}$ letters of $S_\infty$, and therefore, by Lemma 3.1(ii), of
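Lemma 3.3. Let $N_a(n)$ and $N_b(n)$ be the numbers of a’s and b’s, respectively,
appearing amongst the first n letters of $S_\infty$. Then
$$N_a(n) = n - \left\lfloor \frac{n+1}{\phi} \right\rfloor \quad\text{and}\quad N_b(n) = \left\lfloor \frac{n+1}{\phi} \right\rfloor.$$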
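In order to prove the lemma we may use Binet’s formula; see [1] or [2] for a proof of
this result:
$$F_m = \frac{1}{\sqrt{5}}\left( \phi^m - \left(-\frac{1}{\phi}\right)^{m} \right).$$
Now,
$$\begin{aligned}
\frac{F_m}{\phi} &= \frac{1}{\sqrt{5}}\left( \phi^{m-1} + \left(-\frac{1}{\phi}\right)^{m+1} \right) \\
&= \frac{1}{\sqrt{5}}\left( \phi^{m-1} - \left(-\frac{1}{\phi}\right)^{m-1} + \left(-\frac{1}{\phi}\right)^{m-1} + \left(-\frac{1}{\phi}\right)^{m+1} \right) \\
&= F_{m-1} + \frac{(-1)^{m-1}}{\sqrt{5}}\left( \frac{1}{\phi^{m-1}} + \frac{1}{\phi^{m+1}} \right).
\end{aligned}$$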
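From this and (3) it follows that
$$\frac{n+1}{\phi} > N_b(n) - \frac{1}{\sqrt{5}} \sum_{j=1}^{\infty} \left( \frac{1}{\phi^{2j-1}} + \frac{1}{\phi^{2j+1}} \right) + \frac{1}{\phi}$$
and
$$\frac{n+1}{\phi} < N_b(n) + \frac{1}{\sqrt{5}} \sum_{j=1}^{\infty} \left( \frac{1}{\phi^{2j}} + \frac{1}{\phi^{2j+2}} \right) + \frac{1}{\phi}.$$
On using the formula for the sum to infinity of a geometric progression, we obtain,
after some simplification,
$$\frac{1}{\sqrt{5}} \sum_{j=1}^{\infty} \left( \frac{1}{\phi^{2j-1}} + \frac{1}{\phi^{2j+1}} \right) = \frac{1}{\phi} \quad\text{and}\quad \frac{1}{\sqrt{5}} \sum_{j=1}^{\infty} \left( \frac{1}{\phi^{2j}} + \frac{1}{\phi^{2j+2}} \right) = \frac{1}{\phi^2}.$$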
$$N_b(n) < \frac{n+1}{\phi} < N_b(n) + 1,$$
showing that
$$N_b(n) = \left\lfloor \frac{n+1}{\phi} \right\rfloor.$$
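The counting formula can be checked by brute force on a long prefix of the golden string. The sketch below assumes the string is built from $S_1$ = a, $S_2$ = b, $S_k = S_{k-1} S_{k-2}$ (an inference, since the definition is not visible in this excerpt). It evaluates the floor exactly with integer arithmetic, using φ = (1 + √5)/2 and 1/φ = φ − 1. The helper names are illustrative.

```python
from math import isqrt

def floor_mul_phi(m):
    """Exact floor(m * phi) for an integer m >= 0, with phi = (1 + sqrt(5)) / 2."""
    return (m + isqrt(5 * m * m)) // 2

def floor_div_phi(m):
    """Exact floor(m / phi), using the identity 1/phi = phi - 1."""
    return floor_mul_phi(m) - m

def golden_prefix(length):
    """First `length` letters of the golden string (assumed recurrence, see above)."""
    prev, cur = "a", "b"               # assumed S_1 and S_2
    while len(cur) < length:
        prev, cur = cur, cur + prev    # S_{k+1} = S_k followed by S_{k-1}
    return cur[:length]

prefix = golden_prefix(1000)
for n in (1, 2, 3, 10, 100, 1000):
    count_b = prefix[:n].count("b")
    print(n, count_b, floor_div_phi(n + 1), n - count_b)
```

In each row the second and third values should coincide, and the last column is the corresponding count of a's.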
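$$\phi = 1 + \frac{1}{\phi},$$
As a consequence of this,
$$\begin{aligned}
\lfloor (n+1)\phi \rfloor - \lfloor n\phi \rfloor &= \left( n + 1 + \left\lfloor \frac{n+1}{\phi} \right\rfloor \right) - \left( n + \left\lfloor \frac{n}{\phi} \right\rfloor \right) \\
&= 1 + \left\lfloor \frac{n+1}{\phi} \right\rfloor - \left\lfloor \frac{n}{\phi} \right\rfloor \\
&= 1 + N_b(n) - N_b(n-1),
\end{aligned}$$
if, and only if, the nth letter of $S_\infty$ is b. Note that it follows from this result that if the
nth letter of $S_\infty$ is a then $\lfloor (n+1)\phi \rfloor - \lfloor n\phi \rfloor = 1$.
We now show that $\lfloor n\phi \rfloor$ determines the position, when counting from the left, of
the nth b in the golden string. To take an example, it can be seen that the 7th b of $S_\infty$
occurs at position 11. Indeed, a quick check confirms that $\lfloor 7\phi \rfloor = 11$.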
Nb (m − 1) = n − 1 and Nb (m) = n.
m < nφ < m + 1,
from which we see that ⌊nφ⌋ gives the position of the nth b in S∞ .
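The statement that ⌊nφ⌋ locates the nth b can be checked numerically. The following is only a sketch: it assumes the recursion S_1 = a, S_2 = b, S_k = S_{k−1}S_{k−2} for the golden string (an assumption chosen because it reproduces the example of the 7th b at position 11), and the helper names `floor_n_phi` and `golden_prefix` are illustrative, not the author's.

```python
from math import isqrt


def floor_n_phi(n: int) -> int:
    """Exact floor(n * phi) for phi = (1 + sqrt(5)) / 2, using integers only."""
    return (n + isqrt(5 * n * n)) // 2


def golden_prefix(k: int) -> str:
    """S_k under the ASSUMED recursion S_1 = 'a', S_2 = 'b', S_k = S_{k-1} + S_{k-2}."""
    prev, cur = "a", "b"
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur if k >= 2 else prev


s = golden_prefix(15)  # a prefix of the infinite string, of length F_15 = 610
b_positions = [i for i, ch in enumerate(s, start=1) if ch == "b"]
print(b_positions[:7])                               # positions of the first seven b's
print([floor_n_phi(n) for n in range(1, 8)])         # floor(n*phi) for n = 1..7
```

Under that assumption the two printed lists agree, and the same prefix can be used to compare the count of b's among the first n letters with ⌊(n + 1)/φ⌋ = ⌊(n + 1)φ⌋ − (n + 1).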
1, 1, 3, 4, 4, 6, 6, 8, 9, 9, 11, 12, 12, 14, 14, 16, 17, 17, 19, 19, 21, 22, 22, . . . ,
and S∞ may be constructed from this by replacing each pair of consecutive terms of
the form m, m for some m ∈ N by b, and each of the remaining terms by a.
2, 5, 7, 10, 13, 15, 18, 20, 23, 26, 28, 31, . . . , (4)
as is easily checked. Notice how sequences (2) and (4) do indeed dove-tail. This result
implies that it is not possible to find m, n ∈ N such that ⌊mφ⌋ = ⌊nφ⌋ + n.
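The dove-tail property of {⌊nφ⌋} and {⌊nφ⌋ + n} can be tested on an initial segment of the positive integers. This is a minimal sketch rather than a proof; the function names and the cut-off are illustrative choices.

```python
from math import isqrt


def floor_n_phi(n: int) -> int:
    """Exact floor(n * phi) for phi = (1 + sqrt(5)) / 2, using integers only."""
    return (n + isqrt(5 * n * n)) // 2


def dovetail_sets(limit: int):
    """Values of floor(n*phi) and floor(n*phi) + n that do not exceed `limit`."""
    lower = {floor_n_phi(n) for n in range(1, limit + 1)}
    upper = {floor_n_phi(n) + n for n in range(1, limit + 1)}
    return ({v for v in lower if v <= limit}, {v for v in upper if v <= limit})


lower, upper = dovetail_sets(1000)
print(sorted(upper)[:12])            # the opening terms of sequence (4)
print(lower.isdisjoint(upper))       # no integer appears in both sequences
print(lower | upper == set(range(1, 1001)))  # together they cover 1..1000
```

Since ⌊nφ⌋ ≥ n, taking n up to the cut-off is enough to generate every value below it.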
It is interesting that
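\[ \left\{ \lfloor n^2 \phi \rfloor + n - 1 \right\} \quad\text{and}\quad \left\{ n + \left\lfloor \sqrt{\frac{n+1}{\phi}} \right\rfloor \right\} \]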
are also a pair of dove-tail sequences. In order to show that this is indeed true, note
first that the difference between successive terms in the sequence
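\[ \left\{ n + \left\lfloor \sqrt{\frac{n+1}{\phi}} \right\rfloor \right\} \]
if, and only if, n = ⌊k²φ⌋ for some k ∈ N. It is therefore the case that
\[ \left\{ n + \left\lfloor \sqrt{\frac{n+1}{\phi}} \right\rfloor \right\} \]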
as required.
This result generalizes, and it is in fact true that, for each k ∈ N,
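\[ \left\{ \lfloor n^k \phi \rfloor + n - 1 \right\} \quad\text{and}\quad \left\{ n + \left\lfloor \sqrt[k]{\frac{n+1}{\phi}} \right\rfloor \right\} \]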
5. EVALUATING A FINITE SUM. We now start to piece together the results from
previous sections in order to obtain a formula for the sum (1) in terms of Fibonacci
numbers. As will eventually be seen, this involves F-representations of the upper limit
n of the sum. The next two lemmas will be used in the main theorems.
Proof. Lemma 3.4 implies that ⌊φF_{2k−1}⌋ locates the position of the (F_{2k−1})th b in S∞ .
From Lemma 3.1(i) and (iv), we know that this b must be situated at position F_{2k}.
On the other hand, ⌊φF_{2k}⌋ gives the position of the (F_{2k})th b in S∞ . On using
Lemma 3.1(i), (iv), and (v), it can be seen that this is F_{2k+1} − 1.
It might initially appear that Lemma 5.1 is a specialization of Lemma 5.2. Note,
however, that the result from Lemma 5.2 is not necessarily true when r = 0. The
following theorem evaluates (1) for the case in which the upper limit is a Fibonacci
number.
Theorem 5.1.
\[ \sum_{m=1}^{F_k} \lfloor m\phi \rfloor = \frac{1}{2} F_{k-1} (F_{k+2} + 1) + (-1)^{k+1}. \]
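The formula of Theorem 5.1 is easy to test for small k. The sketch below assumes the indexing F_0 = 0, F_1 = F_2 = 1 and uses illustrative helper names; it is a numerical check, not part of the proof.

```python
from math import isqrt


def floor_n_phi(n: int) -> int:
    """Exact floor(n * phi) for phi = (1 + sqrt(5)) / 2, using integers only."""
    return (n + isqrt(5 * n * n)) // 2


def fib(k: int) -> int:
    """Fibonacci numbers with the ASSUMED indexing F_0 = 0, F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def direct_sum(k: int) -> int:
    """Left-hand side: sum of floor(m*phi) for m = 1..F_k."""
    return sum(floor_n_phi(m) for m in range(1, fib(k) + 1))


def closed_form(k: int) -> int:
    """Right-hand side: F_{k-1}(F_{k+2} + 1)/2 + (-1)^(k+1)."""
    return fib(k - 1) * (fib(k + 2) + 1) // 2 + (-1) ** (k + 1)


for k in range(1, 8):
    print(k, direct_sum(k), closed_form(k))
```

For example, k = 5 gives 1 + 3 + 4 + 6 + 8 = 22 on the left and (1/2) · 3 · 14 + 1 = 22 on the right.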
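Proof. In Section 4 it was noted that {⌊nφ⌋} and {⌊nφ⌋ + n} are dove-tail sequences.
Also, from Lemma 5.1, we know that
\[ \lfloor \phi F_{2k} \rfloor = F_{2k+1} - 1 \]
and
\[ F_{2k-1} + \lfloor \phi F_{2k-1} \rfloor = F_{2k-1} + F_{2k} = F_{2k+1}. \]
Therefore
\[ \sum_{m=1}^{F_{2k}} \lfloor m\phi \rfloor + \sum_{m=1}^{F_{2k-1}} (m + \lfloor m\phi \rfloor) = \sum_{m=1}^{F_{2k+1}} m, \]
Similarly, since
\[ \lfloor \phi F_{2k+1} \rfloor = F_{2k+2} \]
and
\[ F_{2k} + \lfloor \phi F_{2k} \rfloor = F_{2k} + F_{2k+1} - 1 = F_{2k+2} - 1, \]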
for 1 ≤ j ≤ 2k − 1. On using (5) and (6) it may be seen that (7) is certainly true for
j = 2k − 1. Now assume that (7) is true for some j such that 2 ≤ j ≤ 2k − 1. On
using (7) with (5) or (6), according to whether j is even or odd respectively, it follows
that
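\[ \sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor - (-1)^{j} \sum_{m=1}^{F_{j-1}} \lfloor m\phi \rfloor
= \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m + (-1)^{j} \left( \sum_{m=1}^{F_{j+1}} m - \sum_{m=1}^{F_{j}} m \right) - (-1)^{j} \left( \sum_{m=1}^{F_{j+1}} m - \sum_{m=1}^{F_{j-1}} m \right), \]
which simplifies to
\[ \sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor + (-1)^{j-1} \sum_{m=1}^{F_{j-1}} \lfloor m\phi \rfloor
= \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m + (-1)^{j-1} \left( \sum_{m=1}^{F_{j}} m - \sum_{m=1}^{F_{j-1}} m \right), \]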
as required.
On setting j = 1 in (7) we obtain
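\begin{align*}
\sum_{m=1}^{F_{2k+1}} \lfloor m\phi \rfloor &= 1 + \sum_{m=1}^{F_{2k+2}} m - \sum_{m=1}^{F_{2k+1}} m \\
&= \frac{1}{2} \left( F_{2k+2}(F_{2k+2} + 1) - F_{2k+1}(F_{2k+1} + 1) \right) + 1 \\
&= \frac{1}{2} \left( F_{2k+2}^2 - F_{2k+1}^2 + F_{2k+2} - F_{2k+1} \right) + 1 \\
&= \frac{1}{2} \left( (F_{2k+2} + F_{2k+1})(F_{2k+2} - F_{2k+1}) + F_{2k} \right) + 1 \\
&= \frac{1}{2} \left( F_{2k+3} F_{2k} + F_{2k} \right) + 1 \\
&= \frac{1}{2} F_{2k} (F_{2k+3} + 1) + 1.
\end{align*}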
We are now in a position to give our main result, expressing (1) in terms of Fi-
bonacci numbers indexed by positive integers associated with F-representations of n.
Proof. This may be proved by induction on k. By Theorem 5.1, the statement of the
theorem is true when k = 1. Now assume that it holds for some k = m ≥ 1. Consider
and
m+1 m m+1 N
1 X X X X
Fci −1 Fci +2 + 1 + 2(−1)ci +1 +
= Fci Fc j +1 + b jφc.
2 i=2 i=2 j=i+1 j=N +1
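\[ = \sum_{r=1}^{F_{c_1}} \left( \lfloor r\phi \rfloor + F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_{m+1}+1} \right)
= \frac{1}{2} F_{c_1-1} (F_{c_1+2} + 1) + (-1)^{c_1+1} + F_{c_1} \left( F_{c_2+1} + F_{c_3+1} + \cdots + F_{c_{m+1}+1} \right), \]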
thereby proving the theorem.
ACKNOWLEDGMENTS. I would like to thank the two anonymous referees for their valuable comments
and suggestions.
REFERENCES
Abstract. Given a real sequence (x_n), we examine the set of all sums of the form Σ_{i∈I} x_i, as I
varies over subsets of the positive integers. We call this the achievement set of (x_n), and write it
AS(x_n). For instance, AS(1/2^n) = [0, 1] by the existence of binary expansions, and AS(2/3^n)
is the Cantor middle-third set. We explore the properties of these two sequences that account
for their very different achievement sets. We give a sufficient condition for a sequence to have
an achievement set that is an interval, and another sufficient condition for the achievement set
to be a Cantor set. We also examine what sets can occur as achievement sets, and give results
on the topology of achievement sets.
doi:10.4169/amer.math.monthly.118.06.508
1. INTERVALS. Throughout, we deal only with sequences whose terms are all
nonzero. We denote a sequence x1 , x2 , x3 , . . . by (xn ), and we declare that the empty
subsequence sums to 0. While the definition of achievement set applies to both finite
and infinite sequences, our primary concern is with infinite sequences.
For our first results on achievement sets, we give conditions on (xn ) that imply that
AS(xn ) is an interval. In keeping with the terminology introduced so far, we call (xn ) a
high achiever if AS(xn ) is an interval (we refrain from calling (xn ) remedial if AS(xn )
fails to be an interval). The notion of a high achiever is the analogue of a complete
sequence of positive integers, since it requires AS(xn ) to be as large as possible. The
following theorem gives a characterization of high achievers among sequences whose
limit is zero, and represents a minor extension of results appearing in [5], [6], and [10,
Chapter 2].
\[ |x_k| \le \sum_{n=k+1}^{\infty} |x_n|. \tag{1} \]
Then (xn ) is a high achiever. Moreover, if |xk | ≥ |xk+1 | for each k ≥ 1 then (xn ) is a
high achiever if and only if (1) holds.
Note that if one drops the requirement |xk | ≥ |xk+1 | for each k ≥ 1 then it is easy to
find high achievers that violate (1): any nontrivial rearrangement of (1/2^n) suffices.
Before getting to the proof of Theorem 1.1, we give a lemma that will be used
repeatedly in the sequel to handle sequences with negative terms.
Lemma 1.2. Let (x_n) be a sequence of real numbers, and suppose that the sum of the
negative terms of (x_n) converges to s_N ≤ 0. Then −s_N + AS(x_n) = AS(|x_n|).
Proof. Partition the positive integers Z+ into the disjoint subsets I_P = {i | x_i > 0} and
I_N = {i | x_i < 0}. Since I_P and I_N partition Z+ (recall our convention that the terms
of all sequences are nonzero), we have AS(x_n) = AS(x_i | i ∈ I_P) + AS(x_i | i ∈ I_N),
where + denotes the arithmetic sum. Note that in this equation, we use the fact that
our hypothesis on the negative terms ensures that a subsequence with convergent sum
must in fact have absolutely convergent sum. Taking absolute values then yields
and by (2) this last expression is an element of AS(|x_n|). We’ve therefore shown
AS(x_n) − s_N ⊆ AS(|x_n|).
To show the reverse inclusion, suppose that r ∈ AS(|x_n|). By (2), there must be
subsets J_P ⊆ I_P and J_N ⊆ I_N such that
\[ r = \sum_{i \in J_P} x_i - \sum_{i \in J_N} x_i. \]
Adding and subtracting Σ_{i∈I_N} x_i to the right-hand side gives
\[ r = \sum_{i \in J_P} x_i + \sum_{i \in I_N \setminus J_N} x_i - s_N, \]
Proof of Theorem 1.1. Let I_N be as in the proof of Lemma 1.2, and assume first that
Σ_{i∈I_N} x_i converges. By Lemma 1.2 it is enough in this case to show that (|x_n|) is a
high achiever. We may thus assume that all terms of (x_n) are positive.
Let s denote the sum of the xn , and allow s to be infinite. Clearly it is enough to
show that r ∈ AS(xn ) for 0 < r < s.
We define indices i_1, i_2, i_3, . . . using a greedy algorithm. Let i_1 be the smallest index
satisfying x_{i_1} ≤ r. Inductively, if i_1, i_2, . . . , i_m are already chosen, we take i_{m+1} to be
the smallest index such that i_{m+1} > i_m and
\[ x_{i_{m+1}} + \sum_{j=1}^{m} x_{i_j} \le r, \]
This implies that the terms of (xn ) form an absolutely convergent series, so by Lemma
1.2 we may assume without loss of generality that the terms of (x_n) are positive.
Clearly both b = x_k and a = Σ_{n=k+1}^∞ x_n are in AS(x_n). We claim that AS(x_n) ∩ (a, b)
is empty, which shows that (x_n) is not a high achiever. Let I ⊆ Z+. If i ∈ I for some
i ≤ k, then since x_i ≥ x_k, we have Σ_{i∈I} x_i ≥ x_k = b. On the other hand, if I omits
every j ≤ k, then Σ_{i∈I} x_i ≤ Σ_{i=k+1}^∞ x_i = a.
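The greedy index selection used in the proof of Theorem 1.1 can be illustrated on a finite truncation of a positive sequence. The sketch below is an illustration only: the function name `greedy_subsum`, the truncation lengths, and the targets are assumptions made for the example, and exact rational arithmetic is used so that the comparison with r is not affected by rounding.

```python
from fractions import Fraction


def greedy_subsum(terms, r):
    """Scan positive terms in order, keeping each term that does not push
    the running sum past the target r (the greedy choice of the indices i_m)."""
    chosen, total = [], Fraction(0)
    for index, x in enumerate(terms, start=1):
        if total + x <= r:
            chosen.append(index)
            total += x
    return chosen, total


# Harmonic sequence, target r = 2: the greedy choice is 1 + 1/2 + 1/3 + 1/6.
harmonic = [Fraction(1, n) for n in range(1, 51)]
indices, total = greedy_subsum(harmonic, Fraction(2))
print(indices, total)  # [1, 2, 3, 6] 2

# Sequence 1/2^n, target r = 1/3: the chosen indices are the even ones.
powers = [Fraction(1, 2 ** n) for n in range(1, 41)]
even_indices, approx = greedy_subsum(powers, Fraction(1, 3))
print(even_indices[:5], float(Fraction(1, 3) - approx))
```

In the second example the truncated greedy sum falls short of 1/3 by less than the last term examined, which is the behaviour the proof exploits when the tail condition (1) holds.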
We now reap some of the fruits of Theorem 1.1; see also [10, Chapter 2].
Corollary 1.3. AS(1/n) = [0, ∞).
Corollary 1.3 follows immediately from the fact that the harmonic series diverges,
and implies that every real number in [0, ∞) can be expressed as a (possibly infinite)
Egyptian fraction. Indeed, such an Egyptian fraction can even be taken with all denominators
prime. This follows from the result of Euler that Σ_{n=1}^∞ 1/p_n diverges [9, p. 59],
where p_1, p_2, . . . is an enumeration of the primes, implying that AS(1/p_n) = [0, ∞).
We also have an analogue of Riemann’s rearrangement theorem. Note that in our
setting we allow only omissions of terms, not rearrangements.
Corollary 1.4. Let (xn ) be a sequence whose terms form a conditionally convergent
series. Then AS(xn ) = R.
Our final corollary gives us a practical method for showing that many sequences
whose terms form absolutely convergent series are high achievers.
Corollary 1.5. Let (x_n) be a sequence with lim_{n→∞} x_n = 0, and suppose |x_{n+1}| ≥ (1/2)|x_n|
for all n. Then (x_n) is a high achiever.
Theorem 2.1. Let (xn ) be a real sequence, and suppose that for each k ≥ 1,
\[ |x_k| > \sum_{i=k+1}^{\infty} |x_i|. \tag{5} \]
For more on the interesting question of how the measure and Hausdorff dimension of
AS(xn ) relate to (xn ), see [7, 8].
satisfy s_j = Σ_{i∈I_j} x_i.
Suppose there are no positive integers n that belong to I_j for infinitely many j. Then
for any fixed m we have that for all j sufficiently large,
\[ s_j \le \sum_{n=m+1}^{\infty} x_n. \]
Proof. Let s_N ≥ −∞ denote the sum of the negative terms of (x_n). If s_N is infinite,
then as in the proof of Theorem 1.1 (see (4)) AS(x_n) is closed. If s_N is finite, then by
Lemma 1.2 we can assume that (x_n) has positive terms, because a translate of a closed
set is again closed. If Σ_{n=1}^∞ x_n converges, then AS(x_n) is closed by Theorem 2.2. If
Σ_{n=1}^∞ x_n diverges then AS(x_n) = [0, ∞) by Theorem 1.1.
It is not true that all achievement sets are closed. For instance, suppose that xn =
1 + 1/n for all n ≥ 1. Then AS(xn ) does not contain its limit point 1. In this example,
AS(xn ) is countable, so it is natural to ask if all uncountable achievable sets are closed;
the following example of Velleman (personal communication, 2006) shows that the
answer is no.
Consider the two sequences given by
\[ x_n = \frac{2}{3^n}, \qquad y_n = 2 - \frac{1}{2 \cdot 3^{n-1}}. \]
Make a new sequence (z_n) by interleaving these two, so that the first few terms of (z_n) are
2/3, 3/2, 2/9, 11/6, 2/27, 35/18. Note that AS(x_n) is the usual Cantor middle-third
set, and hence AS(z_n) is uncountable. Moreover, 2 is an accumulation point of AS(y_n),
and thus also of AS(z_n). We now show that 2 ∉ AS(z_n). Suppose a subsequence of (z_n)
sums to 2, and note that it can contain at most one term of (y_n), since y_n > 1 for all
n. Moreover, this subsequence must contain at least one term of (y_n), since summing
all the x_n yields 1. Therefore we have a subsequence of (x_n) whose terms sum to 1/(2·3^{k−1})
for some k. However, 1/(2·3^{k−1}) is halfway between 1/3^k and 2/3^k, and so is not contained in
the Cantor middle-third set. This is a contradiction, proving that 2 ∉ AS(z_n).
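The arithmetic behind this example can be checked with exact fractions. The sketch below uses illustrative function names and only confirms the listed terms of the interleaved sequence and the midpoint claim; it does not replace the argument that the midpoint lies outside the Cantor set.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """x_n = 2 / 3^n."""
    return Fraction(2, 3 ** n)


def y(n: int) -> Fraction:
    """y_n = 2 - 1 / (2 * 3^(n-1))."""
    return 2 - Fraction(1, 2 * 3 ** (n - 1))


def interleave(count: int):
    """First 2*count terms of the interleaved sequence x_1, y_1, x_2, y_2, ..."""
    terms = []
    for n in range(1, count + 1):
        terms += [x(n), y(n)]
    return terms


print([str(t) for t in interleave(3)])

# If a subsequence sums to 2 and uses y_k, its x-terms must sum to 2 - y_k,
# which is the midpoint of the removed interval (1/3^k, 2/3^k).
for k in range(1, 6):
    gap = 2 - y(k)
    midpoint = (Fraction(1, 3 ** k) + Fraction(2, 3 ** k)) / 2
    print(k, gap, gap == midpoint)
```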
For instance, a = 19/109 will do. Put b = Σ_{n=1}^∞ a^n = a/(1 − a), and note that AS(x_n) ⊆
[0, (9/5)(1 + b)].
By (9) we have (9/5) Σ_{n=1}^∞ a^n < 2/5, whence AS(x_n) omits the interval ((9/5)b, 2/5). Similarly,
AS(x_n) omits the intervals ((9/5)ba^i, (2/5)a^i) for all i ≥ 1. Thus 0 ∈ AS(x_n) but AS(x_n)
omits an interval in all neighborhoods of 0. It follows that AS(x_n) is not a finite union
of intervals.
On the other hand, we claim [(2/5)(1 + b), (7/5)(1 + b)] ⊂ AS(x_n). Consider the se-
Theorem 3.1. Let (xn ) be a sequence of real numbers. Then AS(xn ) is either a meager
set, and thus has empty interior, or the interior of AS(xn ) is dense in AS(xn ).
Before embarking on the proof of Theorem 3.1, we give a proposition that effec-
tively reduces the proof to the case where xn → 0. This proposition has some inde-
pendent interest as well, and gives some justification for our emphasis thus far on
sequences whose terms approach 0.
Proposition 3.2. Let (xn ) be any real sequence. Then AS(xn ) is either countable, an
infinite interval, or a countable union of translates of AS(xn k ), where (xn k ) is some
subsequence of (xn ) that converges to 0.
Proof. Let E be the set of accumulation points of (x_n). Suppose first that E ∩ (0, ε) ≠
∅ for every ε > 0. Then there is a sequence e_1, e_2, . . . of elements of E with e_n →
0 and en < 1 for all n. For each n, let kn be a positive integer with 1/kn > en >
1/(kn + 2), whence there are infinitely many x j with 1/kn > x j > 1/(kn + 2). For
each n, choose kn + 2 such terms and form an infinite subsequence by concatenation.
This subsequence approaches 0 but its sum diverges, and thus it has achievement set
[0, ∞) by Theorem 1.1. It follows that AS(xn ) is an infinite interval. In the case where
E ∩ (−ε, 0) ≠ ∅ for every ε > 0 a similar argument applies.
Now suppose that there is some ε > 0 such that E ∩ {r ∈ R | 0 < |r| < ε} = ∅.
Let (x_{m_k}) be the subsequence consisting of the terms of (x_n) with absolute value at
least ε/2. Let (x_{n_k}) be the complementary subsequence. Note that
The first summand on the right-hand side consists only of sums of finitely many terms,
and thus is countable (see also Proposition 4.1). By our assumption about E, the only
possible accumulation point of (x_n) in (−ε, ε) is 0, so the sequence (x_{n_k}) is either finite
or has a limit of 0. In the first case, AS(xn ) is countable, while in the second it is a
countable union of translates of AS(xn k ).
Note that in the case that AS(xn ) is an infinite interval, it is clearly a countable
union of translates of AS(1/2n ). Thus Proposition 3.2 implies that AS(xn ) is either
countable or a countable union of translates of AS(yn ), where (yn ) is some sequence
whose terms approach zero. We are now ready to prove Theorem 3.1.
Proof of Theorem 3.1. We first remark that translates of a meager set are meager, and
a countable union of meager sets is also meager. Moreover, it is easy to see that the
same two statements hold if “meager” is replaced by “has dense interior.” Thus by
Proposition 3.2, it is enough to prove the theorem in the case that xn → 0.
Assume now that the theorem is true when xn → 0 and xn > 0. If xn → 0 but xn
has both positive and negative terms, consider the sum of the negative terms. If this
sum diverges then by Theorem 1.1, AS(xn ) is an interval and thus has dense interior.
If it converges, then by Lemma 1.2 we have that AS(xn ) is a translate of AS(|xn |).
and since AS(x1 , . . . , x N −1 ) is finite, AS(xn ) is a finite union of nowhere dense sets,
and hence is meager.
Proof. Note that since L < 1, lim_{n→∞} x_n = 0. If 1/2 < L < 1, then for some n_0 > 0 we
have |x_{n+1}|/|x_n| > 1/2 for all n ≥ n_0. Hence by Corollary 1.5, we have that
AS(x_{n_0}, x_{n_0+1}, x_{n_0+2}, . . .) is a closed interval. It then follows from the decomposition
It now follows from Theorem 2.1 that AS(x_{n_0}, x_{n_0+1}, x_{n_0+2}, . . .) is a central Cantor set.
Therefore AS(x_n) is a finite union of central Cantor sets.
Note that from the proof of Theorem 3.1, one sees that to answer Question 1, it
is enough to determine whether 0 is in the closure of the interior of AS(xn ). In other
Question 2. Does there exist a sequence (xn ) such that limn→∞ |xn+1 /xn | exists and
AS(xn ) contains an interval but is not a union of intervals?
Proposition 4.1. Let (xn ) be an infinite real sequence. Then AS(xn ) is uncountable if
and only if (xn ) has a subsequence converging to 0.
Proof. Suppose first that (xn ) contains a subsequence converging to 0. Without loss of
generality we may assume that xn → 0; we show that AS(xn ) is uncountable.
If there is a k0 such that whenever k > k0 we have
∞
X
|xk | ≤ |xn |, (10)
n=k+1
then it follows from Theorem 1.1 that AS(xn ) contains an interval, and is thus
With these properties of achievable sets now established, we can examine the
achievability of certain well-known sets.
Proof. Suppose AS(x_n) = S. Clearly (x_n) can have no negative terms. Thus if x_n > ε
for all n and some ε > 0, then AS(x_n) ∩ (0, ε) is empty, which contradicts the fact that
0 is an accumulation point of S. Hence (xn ) must have a subsequence converging to 0.
By Proposition 4.1, AS(xn ) is then uncountable, and we have a contradiction.
REFERENCES
1. R. André-Jeannin, Irrationalité de la somme des inverses de certaines suites récurrentes, C. R. Acad. Sci.
Paris Sér. I Math. 308 (1989) 539–541.
2. J. L. Brown Jr., Note on complete sequences of integers, Amer. Math. Monthly 68 (1961) 557–560. doi:
10.2307/2311150
3. C. Cabrelli, K. Hare, and U. Molter, Sums of Cantor sets yielding an interval, J. Aust. Math. Soc. 73
(2002) 405–418. doi:10.1017/S1446788700009058
4. V. E. Hoggatt and C. King, Problem E 1424, Amer. Math. Monthly 67 (1960) 593; Solution by J. Silver,
68 (1961) 179–180. doi:10.2307/2312499
5. H. Hornich, Über beliebige Teilsummen absolut konvergenter Reihen, Monatsh. Math. Phys. 49 (1941)
316–320. doi:10.1007/BF01707309
6. S. Kakeya, On the partial sums of an infinite series, Sci. Rep. Tôhoku Imperial Univ. 3 (1914) 159–163.
7. M. Morán, Fractal series, Mathematika 36 (1989) 334–348. doi:10.1112/S0025579300013176
8. M. Morán, Dimension functions for fractal sets associated to series, Proc. Amer. Math. Soc. 120 (1994)
749–754.
9. T. Nagell, Introduction to Number Theory, John Wiley, New York, 1951.
10. P. Ribenboim, My Numbers, My Friends, Springer-Verlag, New York, 2000.
RAFE JONES received his B.A. from Amherst College in 1998 and his Ph.D. from Brown University in
2005 under the direction of Joseph Silverman. He currently teaches at the College of the Holy Cross. While
his mathematical interests lie mainly in arithmetic questions related to the iteration of rational functions, he is
never averse to a good side project. When not doing mathematics, he enjoys running, cycling, speaking French,
and the occasional computer-based distraction. These pastimes are not new; he hopes that none of his Amherst
College professors noticed his world ranking (at the time) in Minesweeper.
Department of Mathematics and Computer Science, College of the Holy Cross, Worcester, MA, 01610
Abstract. Consider the following questions for points a1 , a2 in the unit disc, D. If q(z) =
(z − a1 )(z − a2 ), when is q the derivative of a polynomial with all of its zeros on the unit
circle, ∂D? If an ellipse E with foci a1 , a2 is inscribed in a triangle with vertices on ∂D,
when is E tangent at the midpoints to a triangle with vertices on ∂D? We show that these
problems are essentially the same. In fact, the answer to both is very simple: if and only if
2|a1 a2 | = |a1 + a2 |. We also discuss generalizations of these problems and their solutions.
[Figure: diagram with labeled points z1, z2, z3, a1, a2.]
doi:10.4169/amer.math.monthly.118.06.522
\[ b(z) = z \, \frac{z - a_1}{1 - \overline{a_1} z} \, \frac{z - a_2}{1 - \overline{a_2} z}, \tag{1.1} \]
Evidently, we have two very different questions, but there are obvious similarities:
Both lift an object that is, in some sense, of degree two to an object of degree three.
Both begin with a description depending on values inside D, and end with an object de-
scribed by values on ∂D. We were still surprised to discover that they are (more or less)
the same question. We obtain necessary and sufficient conditions for our questions to
have positive answers by proving the following:
\[ b(z) = z \, \frac{z - a_1}{1 - \overline{a_1} z} \, \frac{z - a_2}{1 - \overline{a_2} z}. \]
2.1. Steiner Inellipses. Our study begins with an ellipse named after the Swiss math-
ematician Jakob Steiner (1796–1863). (See, for example, [18].)
We call this unique ellipse the Steiner inellipse (see Figure 1.1). Siebeck’s theorem
tells us more about the nature of the foci of the ellipse.
2.2. Blaschke Ellipses. Another family of ellipses was discovered during the investi-
gation of finite Blaschke products, which are rational functions of a special type.
Finite Blaschke products map D to itself, ∂D to itself, and map points outside D
back outside D [6, p. 5]. Furthermore, the finite Blaschke product b is an n-to-one map
on ∂D; that is, if λ ∈ ∂D, then b maps exactly n distinct points of ∂D to λ, as shown in
[2]. We will say that b identifies n points if it maps them all to the same value.
A Blaschke 3-ellipse, or, simply, Blaschke ellipse, is a curve associated with a given
degree-3 Blaschke product; it is described by the following theorem [2]:
\[ b(z) = z \, \frac{z - a_1}{1 - \overline{a_1} z} \, \frac{z - a_2}{1 - \overline{a_2} z}. \tag{2.2} \]
For λ ∈ ∂D, let z 1 , z 2 , z 3 ∈ ∂D denote the distinct points mapped to λ under b. Write
the partial fraction decomposition
\[ F(z) = \frac{b(z)/z}{b(z) - \lambda} = \frac{m_1}{z - z_1} + \frac{m_2}{z - z_2} + \frac{m_3}{z - z_3}. \]
The theorem in [2] assumes 0, a1 , a2 are distinct, but a continuity argument shows
that the result holds in general as long as b(0) = 0. Our Blaschke products will always
be in the form (2.2).
This theorem should be compared with Marden’s theorem [17, p. 9], which deals
with more general F but not the connection to Blaschke products. In addition, the
reader might be interested in a follow-up on Marden’s theorem, “The most marvelous
theorem in mathematics,” by D. Kalman (see [14] and [15]).
For any ellipse associated with a Blaschke product b, we may start with any point
z ∈ ∂D and let λ = b(z) to see from Theorem 2.4 that there exists a triangle circum-
scribing E with one vertex at z and all vertices on ∂D (Figure 1.2). In fact, if any ellipse
is inscribed in one triangle with vertices on the unit circle, it is inscribed in infinitely
many. This beautiful result is often called a porism, and it is due to Jean Poncelet.
More information about Poncelet’s porism can be found in a recent book by Flatto [4].
Our circumscribing triangles are similar to unit Steiner triangles, but are more gen-
eral; the points of tangency of the ellipse need not be the midpoints of the sides. The
equivalence of (1) and (2) in the main theorem will use the Poncelet property via the
following result of Frantz [5, Prop. 3]. We recommend [8], [9], and [31] for related
work and more about connections to Poncelet’s porism.
Lemma 2.5. The ellipses that can be inscribed in triangles with vertices on ∂D are
precisely the Blaschke 3-ellipses.
Figure 2.1. A Blaschke ellipse that is also a unit Steiner inellipse, with foci a1 = (−1 − 3i)/4 and a2 = 1/2.
2.3. Proof of the Main Theorem. We’ll first need the following lemma.
Lemma 2.6. Suppose we are given a triangle with vertices z 1 , z 2 , z 3 ∈ ∂D. Denote
the foci of the Steiner inellipse of this triangle by a1 , a2 . Then 2|a1 a2 | = |a1 + a2 |.
Proof. Theorem 2.2 implies that there exists a polynomial p with zeros z_1, z_2, z_3 and
critical points a_1, a_2. That is, [(z − z_1)(z − z_2)(z − z_3)]′ = 3(z − a_1)(z − a_2). Thus
(z_1 + z_2 + z_3)/3 = (a_1 + a_2)/2 and z_1z_2 + z_1z_3 + z_2z_3 = 3a_1a_2, by comparing coefficients.
Multiplying the latter by z̄_1 z̄_2 z̄_3 and recalling that z_j z̄_j = 1, we obtain
Proof of Theorem 1.1. First, suppose that (1) holds. Then Theorem 2.2 implies that
the critical points of p are the foci of the ellipse inscribed in the triangle △z_1z_2z_3 at
the midpoints. By Lemma 2.6, 2|a1 a2 | = |a1 + a2 |, and (3) holds.
To see that (3) implies (1), assume (3). If a_1 = 0 or a_2 = 0, then a_1 = a_2 = 0 and
z³ − 1, for example, satisfies (1). If both are nonzero, let λ = (a_1 + a_2)/(2 ā_1 ā_2) and
note that, by (3), we have |λ| = 1. By the three-to-one property of the Blaschke product
b, there are three distinct points z_1, z_2, z_3 ∈ ∂D with b(z_j) = λ.
Write Q(z) := b(z) − λ. Then z_1, z_2, z_3 must be the three zeros of Q. Now
\[ Q(z) = z \, \frac{z - a_1}{1 - \overline{a_1} z} \, \frac{z - a_2}{1 - \overline{a_2} z} - \lambda \]
is a rational function, but since the poles of Q are outside the closed unit disc, to find
the zeros of Q we need only consider the numerator of Q. Thus
\[ q(z) = z(z - a_1)(z - a_2) - \lambda (1 - \overline{a_1} z)(1 - \overline{a_2} z) \]
is the monic polynomial with zeros at z_1, z_2, z_3.
Expanding and using λ̄ = 1/λ = 2 ā_1 ā_2/(a_1 + a_2), we see that
\begin{align*}
q(z) &= z^3 - (a_1 + a_2 + \lambda \overline{a_1}\,\overline{a_2}) z^2 + \left( a_1 a_2 + \lambda (\overline{a_1} + \overline{a_2}) \right) z - \lambda \\
&= z^3 - \frac{3}{2} (a_1 + a_2) z^2 + 3 a_1 a_2 z - \lambda.
\end{align*}
So q′(z) = 3(z² − (a_1 + a_2)z + a_1a_2) = 3(z − a_1)(z − a_2) and q satisfies (1).
Now we show that (1) and (2) are equivalent: Suppose (1) holds. By Theorem 2.2,
a1 and a2 are the foci of a unit Steiner, and hence Poncelet, inellipse E 1 . Lemma 2.5
implies that E 1 is a Blaschke 3-ellipse. Since the Blaschke product has zeros at 0, a1 ,
and a2 , we see that E 1 is the Blaschke ellipse associated with b. Thus, the Blaschke
ellipse associated with b is a unit Steiner inellipse and (2) holds. The converse, (2)
implies (1), follows from Theorem 2.2.
Remark 2.7. Suppose a_1, a_2 satisfy (2) and (3) of Theorem 1.1 and we wish to explicitly
find a polynomial as in (1). In the proof above, q has constant term −λ =
−(a_1 + a_2)/(2 ā_1 ā_2), so that, if a_1, a_2 ≠ 0, to find such a polynomial we can simply
integrate 3(z − a_1)(z − a_2) with an integration constant of −λ. If a_1 or a_2 is 0, as
mentioned in the proof, we may take z³ + γ for any γ ∈ ∂D.
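The recipe in Remark 2.7 can be tried on the foci a_1 = (−1 − 3i)/4 and a_2 = 1/2 of Figure 2.1. The sketch below is a numerical illustration under stated assumptions: λ is taken to be (a_1 + a_2)/(2 ā_1 ā_2), and the zeros of the cubic are approximated by a plain Durand–Kerner iteration, which is a choice made for this example (simple roots assumed) and not a method from the article. All function names are hypothetical.

```python
def cubic_from_foci(a1, a2):
    """Coefficients (c2, c1, c0) of q(z) = z^3 + c2 z^2 + c1 z + c0 obtained by
    integrating 3(z - a1)(z - a2) with integration constant -lambda."""
    a1, a2 = complex(a1), complex(a2)
    lam = (a1 + a2) / (2 * (a1 * a2).conjugate())  # assumed form of lambda
    return (-1.5 * (a1 + a2), 3 * a1 * a2, -lam), lam


def blaschke(z, a1, a2):
    """Degree-3 Blaschke product with zeros 0, a1, a2."""
    a1, a2 = complex(a1), complex(a2)
    return z * (z - a1) / (1 - a1.conjugate() * z) * (z - a2) / (1 - a2.conjugate() * z)


def durand_kerner(c2, c1, c0, iterations=500):
    """Approximate the three zeros of z^3 + c2 z^2 + c1 z + c0."""
    roots = [complex(0.4, 0.9) ** k for k in range(3)]
    for _ in range(iterations):
        for i in range(3):
            z = roots[i]
            value = ((z + c2) * z + c1) * z + c0
            denom = 1
            for j in range(3):
                if j != i:
                    denom *= z - roots[j]
            roots[i] = z - value / denom
    return roots


a1, a2 = complex(-1, -3) / 4, 0.5
print(abs(2 * abs(a1 * a2) - abs(a1 + a2)))      # condition (3): should be ~0
(c2, c1, c0), lam = cubic_from_foci(a1, a2)
zeros = durand_kerner(c2, c1, c0)
print([abs(z) for z in zeros])                   # moduli of the zeros
print([abs(blaschke(z, a1, a2) - lam) for z in zeros])  # b(z_j) - lambda
```

If the iteration converges, the moduli should all be close to 1 and each zero should be mapped to λ by the Blaschke product, in line with the proof of Theorem 1.1.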
Theorem 2.8. An ellipse E is a unit Steiner inellipse if and only if there exists λ ∈ ∂D
and a degree-3 Blaschke product b of the form (2.2) such that E is the Blaschke ellipse
of b, and if z 1 , z 2 , z 3 are the distinct points mapped to λ by b, then
\[ \frac{b(z)/z}{b(z) - \lambda} = \frac{1/3}{z - z_1} + \frac{1/3}{z - z_2} + \frac{1/3}{z - z_3}. \]
\[ \zeta_3 = \frac{m_2 z_1}{m_1 + m_2} + \frac{m_1 z_2}{m_1 + m_2} = \frac{z_1}{2} + \frac{z_2}{2}. \tag{2.3} \]
Now points of the form az 1 + (1 − a)z 2 for 0 ≤ a ≤ 1 lie on the line segment
from z 1 to z 2 , and different values of a lead to different points on the line segment.
Therefore, we must have m 1 /(m 1 + m 2 ) = m 2 /(m 1 + m 2 ) = 1/2. Thus, m 1 = m 2 .
By symmetry, m 1 = m 2 = m 3 . Since m 1 + m 2 + m 3 = 1, each m j = 1/3, as de-
sired.
Conversely, assume that b is as defined in the hypothesis with m 1 = m 2 = m 3 =
1/3. By Theorem 2.4, one point of tangency of the Blaschke ellipse E is ζ3 = (m 1 z 2 +
m 2 z 1 )/(m 1 + m 2 ) = (z 2 + z 1 )/2, which is the midpoint of the line segment z 1 z 2 . Sim-
ilarly ζ1 and ζ2 are midpoints of z 2 z 3 and z 1 z 3 respectively. Thus E is a unit Steiner
ellipse, as desired.
If p is a polynomial satisfying (1) in the main theorem, then by (2) the Blaschke
ellipse associated with b is a unit Steiner inellipse with foci a1 and a2 . So, we arrive at
the following corollary, using the logarithmic derivative of p for the final equality.
Note that this shows us why the critical points of p are the zeros of b(z)/z.
2.5. How Many Unit Steiner Triangles Can an Ellipse Have? Let us now analyze
a few simple situations.
2.5.1. Infinitely Many Unit Steiner Triangles. Suppose the unit Steiner triangle is
equilateral. Then both foci of the unit Steiner inellipse are equal to 0: a circle can
be inscribed in △z_1z_2z_3 that is tangent at the midpoints of the sides of the triangle
and has the origin as its center [15]. Since the Steiner inellipse is unique, it must be
Cr := {z : |z| = r } for some r with 0 < r < 1; a calculation shows that r = 1/2. In
this case, any rotation of the triangle is a unit Steiner triangle of the circle.
Now suppose that we have a Steiner inellipse with a focus a1 = 0. From Lemma
2.6, we see that a2 = 0 as well. Thus, the only Steiner inellipse with a focus at 0 is a
circle, centered at 0. Also, if a Steiner inellipse is a circle, we have a1 = a2 . Lemma
2.6 implies that a1 = a2 = 0, and the only Steiner “in-circle,” so to speak, is the circle
C1/2 .
Theorem 2.10. Suppose we have an ellipse E with foci a1 and a2 in D. If E has two
distinct unit Steiner triangles, then E = C1/2 .
It is easier to think about this result with polynomials: given two (monic, cubic)
polynomials with distinct zeros on ∂D and with the same two critical points, we show
that these polynomials are each z³ + c_0 for an appropriate choice of c_0 on ∂D.
Proof. By our main theorem we may choose two distinct monic, cubic polynomials,
p and q, with zeros on ∂D corresponding to the vertices of our unit triangles and with
critical points a1 , a2 ∈ D. Because p and q have the same critical points a1 , a2 , they
are both associated with the same Blaschke product b as defined in Theorem 2.4. By
Corollary 2.9, there exist γ j ∈ ∂D such that
\[ b(z) = \frac{z \gamma_1 p'}{z p' - 3p} \quad\text{and}\quad b(z) = \frac{z \gamma_2 q'}{z q' - 3q}. \]
By our conditions, we know that p′ = q′. So, setting the two expressions for b equal
to each other and substituting, we see that γ_1/(zp′ − 3p) = γ_2/(zp′ − 3q). If γ_1 = γ_2,
we have p = q. So γ_1 ≠ γ_2.
We can rewrite our equation as γ_1(zq′ − 3q) = γ_2(zp′ − 3p). But we know that
q = p + C, so
2.6. Which Points Can Be the Foci of a Unit Steiner Inellipse? Given a1 ∈ D,
you may be wondering what points a2 ∈ D we may choose so that a1 and a2 are the
foci of a unit Steiner inellipse. We’ve discussed the case a_1 = 0, so assume a_1 ≠ 0.
Interestingly, in this case, the set of such a2 is part of a circle (we consider a line to
be a circle of infinite radius). From Theorem 1.1, the condition 2|a1 a2 | = |a1 + a2 | is
necessary and sufficient for a1 and a2 to be the foci of a unit Steiner inellipse. We wish
to determine all z ∈ D with 2|a1 z| = |a1 + z|. Let T (z) = a1 z/(a1 + z). We want all z
such that |T (z)| = 1/2. Since T is a Möbius transformation, T maps circles to circles,
so our solution is the set of points in D on the circle T^{−1}{w : |w| = 1/2}.
Figure 2.2 shows some examples of these circles. Using the description of our circle
in terms of the Möbius transformation, and by rotation taking a1 > 0, we note that as
a1 increases from just greater than 0 to 1/2, the circle grows in size from very small to
infinitely large. When a1 increases from just greater than 1/2 to 1, the circle shrinks,
though it never again lies entirely in D. When a_1 = 1/2 the solutions are points in D on
the line {z : Re(z) = −1/4}. Note that the circles approach Re(z) = −1/4 as a_1 moves
toward 1/2 from either direction.
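The description of the solution set as T^{−1}{w : |w| = 1/2} can be sampled numerically. The following sketch inverts T(z) = a_1 z/(a_1 + z) as z = a_1 w/(a_1 − w); the function name, the sample count, and the angular offset (which avoids the pole w = a_1 when a_1 = 1/2) are illustrative assumptions. Only the sampled points that lie in D are admissible foci.

```python
import cmath
import math


def focus_partners(a1, samples=12, offset=0.1):
    """Points z = T^{-1}(w) for w on the circle |w| = 1/2, where
    T(z) = a1 * z / (a1 + z), so that z = a1 * w / (a1 - w)."""
    points = []
    for k in range(samples):
        w = 0.5 * cmath.exp(1j * (2 * math.pi * k / samples + offset))
        points.append(a1 * w / (a1 - w))
    return points


for a1 in (0.3, 0.5, 0.8):
    pts = focus_partners(a1)
    worst = max(abs(2 * abs(a1 * z) - abs(a1 + z)) for z in pts)
    inside = [z for z in pts if abs(z) < 1]
    print(a1, worst, len(inside))

# For a1 = 1/2 the sampled points lie on the line Re(z) = -1/4.
print(max(abs(z.real + 0.25) for z in focus_partners(0.5)))
```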
Matrices that are unitarily equivalent have the same characteristic polynomial. So,
we will choose the most convenient form for M and its unitary dilation from among
all unitarily equivalent ones.
Now, any matrix satisfying the hypotheses of Theorem 3.1 is unitarily equivalent to
one in the form of the matrix A presented in Definition 3.2 below [9, p. 180]. Further-
more, every 3 × 3 unitary dilation of A, and therefore of M, is unitarily equivalent to
a matrix of the form Bλ below [8, p. 364]. Thus, the study of our 2 × 2 contraction M
with eigenvalues in D and rank(I − M ∗ M) = 1 can be completed by studying matri-
ces of the form A below and unitary dilations of the form Bλ . Such matrices have been
well studied (see, for example, [8], [9], [10], [19], and [3]).
and
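\[ B_\lambda = \begin{pmatrix}
a_1 & \sqrt{1 - |a_1|^2}\,\sqrt{1 - |a_2|^2} & -\overline{a_2}\,\sqrt{1 - |a_1|^2} \\
0 & a_2 & \sqrt{1 - |a_2|^2} \\
\lambda \sqrt{1 - |a_1|^2} & -\lambda\,\overline{a_1}\,\sqrt{1 - |a_2|^2} & \lambda\,\overline{a_1}\,\overline{a_2}
\end{pmatrix}. \]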
These particular matrices will allow us to prove easily the equivalence in Theorem
3.1. Note that if b is a degree-3 Blaschke product with zeros 0, a1 , a2 , then the eigen-
values of A are the zeros of b(z)/z. What are the eigenvalues of Bλ ? The answer lies
in the next theorem.
Theorem 3.3 ([8, Lemma 2.4]). Given λ ∈ ∂D, the eigenvalues of Bλ are the values
mapped to λ by b.
Now, more surprisingly, A and Bλ also have a strong connection to our Blaschke
ellipses. Given an n × n matrix M, its numerical range (or field of values) is the set
where ⟨ , ⟩ denotes the standard inner product. There are many reasons for studying
the numerical range, one of which is the fact that it helps locate the eigenvalues of M—
they are somewhere inside W (M) [12, Problem 169]. It is usually difficult to describe
the numerical range of a matrix. In the two cases we are concerned with, however,
the descriptions are simple. First, the numerical range of any normal matrix, and so in
particular any unitary matrix, is the convex hull of its set of eigenvalues [12, Problem
171]. Second, if N = \begin{pmatrix} a_1 & c \\ 0 & a_2 \end{pmatrix}, then W(N) is the elliptic disc with foci a1, a2 and minor
axis of length |c|; see [9].
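The elliptic-disc description is easy to probe by sampling. The sketch below is an illustration with made-up entries a1, a2, c: it draws random unit vectors x, evaluates ⟨Nx, x⟩, and checks that every value lies in the ellipse with foci a1, a2 and minor axis |c|. For such an ellipse the major axis has length sqrt(|a1 − a2|² + |c|²), so membership is tested through the sum of the distances to the two foci.

```python
import math
import random


def rayleigh_value(n, x):
    """<Nx, x> for a 2x2 matrix n and a vector x in C^2 (standard inner product)."""
    nx = [n[i][0] * x[0] + n[i][1] * x[1] for i in range(2)]
    return nx[0] * x[0].conjugate() + nx[1] * x[1].conjugate()


def random_unit_vector(rng):
    """Random vector of norm 1 in C^2 (Gaussian coordinates, then normalized)."""
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)
    return [t / norm for t in x]


a1, a2, c = 0.3 + 0.4j, -0.5 + 0.2j, 0.6 - 0.3j   # illustrative entries only
N = [[a1, c], [0j, a2]]
major_axis = math.sqrt(abs(a1 - a2) ** 2 + abs(c) ** 2)

rng = random.Random(0)
points = [rayleigh_value(N, random_unit_vector(rng)) for _ in range(2000)]
inside = all(abs(w - a1) + abs(w - a2) <= major_axis + 1e-12 for w in points)
print(inside)
```

Sampling can only confirm containment, not that the whole disc is filled; it is meant as a plausibility check, not a proof.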
Proof of Theorem 3.1. We already know that statements (1) and (2) are equivalent. So
suppose the Blaschke ellipse associated with b is a Steiner inellipse. By Theorem 1.1,
there is a polynomial p with zeros on ∂D and critical points a1 , a2 . From Remark 2.7,
the zeros of p are the points z j for which b(z j ) = λ for j = 1, 2, 3, where −λ is the
constant term of p. Thus, from Theorem 3.3, ch(Bλ) = p, and ch(Bλ)′(z) = p′(z) =
3(z − a1)(z − a2) = 3ch(A)(z). Thus (1) implies (3).
Now suppose ch(Bλ)′ = 3ch(A). So ch(Bλ) is a polynomial with zeros on ∂D and
critical points a1 , a2 . By Theorem 1.1, 2|a1 a2 | = |a1 + a2 |, and (3) implies (2).
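The condition 2|a1 a2| = |a1 + a2| can also be read off by comparing coefficients. The following worked computation is a sketch under the assumption that b is normalized as b(z) = z (z − a1)(z − a2) / ((1 − ā1 z)(1 − ā2 z)); the text fixes only the zeros of b, so the normalization is ours.

```latex
\begin{align*}
\operatorname{ch}(B_\lambda)(z)
  &= z(z-a_1)(z-a_2) - \lambda(1-\bar{a}_1 z)(1-\bar{a}_2 z) \\
  &= z^3 - (a_1 + a_2 + \lambda\bar{a}_1\bar{a}_2)\,z^2
     + \bigl(a_1 a_2 + \lambda(\bar{a}_1 + \bar{a}_2)\bigr)\,z - \lambda, \\
\operatorname{ch}(B_\lambda)'(z)
  &= 3z^2 - 2(a_1 + a_2 + \lambda\bar{a}_1\bar{a}_2)\,z
     + a_1 a_2 + \lambda(\bar{a}_1 + \bar{a}_2), \\
3\operatorname{ch}(A)(z)
  &= 3z^2 - 3(a_1 + a_2)\,z + 3a_1 a_2 .
\end{align*}
% Equating the coefficients of z and the constant terms:
\begin{align*}
2\lambda\bar{a}_1\bar{a}_2 = a_1 + a_2,
\qquad
\lambda(\bar{a}_1 + \bar{a}_2) = 2a_1 a_2 .
\end{align*}
```

Taking absolute values in the first relation and using |λ| = 1 gives 2|a1 a2| = |a1 + a2|. Conversely, if a1 a2 ≠ 0 and 2|a1 a2| = |a1 + a2|, then λ = (a1 + a2)/(2 ā1 ā2) is unimodular and the second relation follows from |a1 + a2|² = 4|a1 a2|².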
When can the eigenvalues of a matrix be located using its unitary dilations? This
sounds nothing like our questions about polynomials and ellipses, yet Theorem 3.1
gives a surprising connection. This connection to Blaschke products will allow us,
in Section 4, to discuss generalizations of Steiner’s theorem, Blaschke ellipses, and
eigenvalues of n × n contractions and their (n + 1)-unitary dilations.
According to Sunder [28], Halmos studied normal operators and unitary dilations
because of his “unwavering belief that the secret about general operators lay in their
relationship to normal operators.” For example, it can be shown that the numerical
range of an n × n contraction is the intersection of the numerical ranges of all of its
unitary dilations, and we have seen a special case of this above: the triangles W (Bλ )
given by the numerical ranges of the unitary dilations intersect to the elliptical disc
W (A), the numerical range of the contraction. Even in infinite dimensions, we may
find the closure of the numerical range of a contraction by taking the intersection of
the closures of the numerical ranges of the unitary dilations involved. The interested
reader should consult two recent papers, [1] and [7], as well as [8], in which the authors
describe a refinement of a classical result due to Lucas on the locations of the critical
points of a polynomial.
ACKNOWLEDGMENTS. This work was the second author’s honors thesis at Bucknell University under
the direction of the first author. We are grateful to the Department of Mathematics for its support and to Ueli
Daepp for his very careful reading of this manuscript. We also thank the referee for many helpful suggestions
and for improving the exposition of this paper.
REFERENCES
1. M.-D. Choi and C.-K. Li, Constrained unitary dilations and numerical ranges, J. Operator Theory 46
(2001) 435–447.
2. U. Daepp, P. Gorkin, and R. Mortini, Ellipses and finite Blaschke products, Amer. Math. Monthly 109
(2002) 785–795. doi:10.2307/3072367
3. U. Daepp, P. Gorkin, and K. Voss, Poncelet’s theorem, Sendov’s conjecture, and Blaschke products, J.
Math. Anal. Appl. 365 (2010) 93–102. doi:10.1016/j.jmaa.2009.09.058
4. L. Flatto, Poncelet’s Theorem, American Mathematical Society, Providence, RI, 2009.
PAMELA GORKIN received her B.S., M.S., and Ph.D. from Michigan State University. She also spent a
year at Indiana University where she learned about numerical ranges from Paul Halmos. She has been teaching
at Bucknell University since 1982, with time off for good behavior. Her hobbies are hiking, reading, traveling,
cooking and eating, though not necessarily in that order.
Department of Mathematics, Bucknell University, Lewisburg, PA 17837
ELIZABETH SKUBAK attended Bucknell University for her undergraduate studies and is currently a gradu-
ate student and teaching assistant at the University of Wisconsin–Madison. She likes to spend her few waking,
non-working hours reading, cooking, taking photographs, and being outside.
Department of Mathematics, University of Wisconsin at Madison, Madison, WI 53706-1388
Abstract. Daniel A. Marcus claimed in a “Note added in proof” to his 1984 paper on positively
k-spanning vector configurations that some minimal positively k-spanning vector configura-
tions in m-dimensional space have more than 2km elements, but the example he found (for
k = 2 and m = 12) seems to be lost.
We produce such an example by applying Gale duality, a linear algebra technique devel-
oped by Micha Perles in the sixties, to a result on illuminated polytopes by Peter Mani from
1974. Our example has exactly the parameters claimed by Marcus.
Conversely, we show how results on positively k-spanning vector configurations, again via
Gale duality, can be used to solve a problem by Mani on nonsimplicial illuminated polytopes.
In a “Note added in proof” to his 1984 paper on positively spanning vector configu-
rations, Daniel A. Marcus (from the California State Polytechnic University, Pomona,
CA) claimed to have a counterexample to his conjecture that a minimal positively
k-spanning vector configuration in R^m has size at most 2km. However, the counterexample
was never published, and seems to be lost.
Independently, and ten years earlier, in 1974, Peter Mani (Bern, Switzerland) had
disproved a conjecture by the Swiss mathematician Hugo Hadwiger that every “illuminated”
d-dimensional polytope must have at least 2d vertices.
Here we observe that these two studies are related by “Gale duality,” an elemen-
tary linear algebra technique devised by Micha A. Perles (at the Hebrew University,
Jerusalem) in the sixties. Thus Mani’s study provides a counterexample for Marcus’s
conjecture with exactly the parameters that Marcus had claimed. In the other direction,
with Marcus’s tools we provide an answer to a problem left open by Mani: Could “illu-
minated” d-dimensional polytopes on a minimal number of vertices be nonsimplicial?
\[ v = \lambda_1 u_1 + \lambda_2 u_2 + \cdots + \lambda_n u_n. \]
In two papers [8, 9], dating from 1981 and 1984, Marcus studied properties of
positively k-spanning vector configurations. In particular, he was interested in upper
bounds on the cardinality of minimal positively k-spanning vector configurations, that
is, of positively k-spanning vector configurations that are minimal with respect to in-
clusion:
Question 1 (Marcus [8, 9]; see also [11]). What is the maximum size of a minimal
positively k-spanning vector configuration in R^m?
Yet for k ≥ 2 it is not obvious that there is a finite upper bound. However, this was
proved by Marcus for the case k = 2; for all k ≥ 2 it can be derived from the Perles
skeleton theorem for (convex) polytopes [6]; see [14].
A convex d-dimensional polytope is the convex hull of a finite number of points V
in R^d that affinely span R^d. The points in V that are not convex combinations of some
of the other points in V are the vertices of the polytope. A d-dimensional polytope
thus has at least d + 1 vertices. If it has exactly d + 1 vertices, we call it a simplex. We
denote by vert(P) the set of vertices and by f_0 = f_0(P) = |vert(P)| the number of
vertices. For example, if P is a 3-dimensional octahedron, then vert(P) ⊂ R^3 is its set
of six vertices, and f_0(P) = 6. A face of a polytope is the convex hull of any subset
of the set of vertices on which some linear function achieves its maximal value. (The
polytope itself, and the empty set, are also defined to be faces.) Every face of a polytope
is itself a polytope in some lower-dimensional space and thus has a dimension. For
example, the vertices are the 0-dimensional faces. The faces of dimension 1, d − 2,
Figure 3. A polytope (pyramid) with two missing edges (dotted), and an unneighborly 3-dimensional polytope
(a triangular prism).
With this translation, the k = 2 case of Question 1 is very closely related to the
following:
Question 2 (Marcus [8, 9]). What is the minimum number of vertices of an unneigh-
borly d-polytope?
Figure 4. An inner diagonal (dotted), and an illuminated 3-dimensional polytope (an octahedron).
In [7], Mani gave a remarkable complete answer to this question. Indeed, the bound
of 2d conjectured by Hadwiger turned out to be wrong, whereas the correct bound is
roughly d + 2√d. More precisely, Mani showed that every illuminated d-polytope has
at least
\[ M(d) := \min\{2d,\ d + p(d) + \lceil d/p(d) \rceil + 1\} \]
vertices, where we set p(d) := ⌈(√(4d + 1) − 1)/2⌉ for d ≥ 1. McMullen (personal communication
2009) has noted that one can write the function M(d) in the following simple
form:
\[ M(d) = \min\{2d,\ \lceil(\sqrt{d}+1)^2\rceil\} = \min\{2d,\ d + 1 + \lceil 2\sqrt{d}\,\rceil\}. \]
According to Mani, every illuminated polytope has at least M(d) vertices and examples
with M(d) vertices exist: His construction starts from a cyclic d-polytope, that
is, the convex hull of a finite number of points on a curve in R^d of order d; the upper
bound theorem tells us that such a d-dimensional polytope has the maximal number of
facets for a given number of vertices. Mani’s examples are obtained from a cyclic d-polytope
with d + p(d) vertices by stacking new vertices onto ⌈d/p(d)⌉ + 1 well-chosen
facets. The operation stacking a facet builds a flat pyramid over a given facet F of a
In particular, for large enough d there are illuminated (and thus also unneighborly)
polytopes on far fewer vertices than 4(d + 1)/3. Indeed, the smallest d where this
occurs is d = 36, and Mani’s illuminated polytope in dimension d = 36 has M(36) =
49 vertices, so it has exactly the same parameters as Marcus’s lost counterexample. We
will probably never know whether this is the same polytope that Marcus had in mind:
he never published on the subject again, and when we tried to contact him via the
California State Polytechnic University in Pomona, where he had been on the faculty
for a number of years, we received the information that he had passed away some years
ago.
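The numbers quoted here can be recomputed in exact integer arithmetic. The sketch below assumes that Mani's bound has the form M(d) = min{2d, d + p(d) + ⌈d/p(d)⌉ + 1} (a reconstruction from the stacking construction, since the displayed formula did not survive extraction) and compares it with McMullen's form min{2d, d + 1 + ⌈2√d⌉}. It then searches for the first dimension d ≥ 2 with M(d) < 4(d + 1)/3; the case d = 1 is skipped as degenerate, which is our assumption. The function names are hypothetical.

```python
from math import isqrt


def p(d):
    """p(d) = ceil((sqrt(4d + 1) - 1) / 2), i.e. the least integer p with p * (p + 1) >= d."""
    value = 0
    while value * (value + 1) < d:
        value += 1
    return value


def mani_bound(d):
    """M(d) = min(2d, d + p(d) + ceil(d / p(d)) + 1); this form is a reconstruction."""
    pd = p(d)
    return min(2 * d, d + pd + -(-d // pd) + 1)


def mcmullen_bound(d):
    """M(d) = min(2d, d + 1 + ceil(2 * sqrt(d))), McMullen's form."""
    ceil_two_sqrt = isqrt(4 * d - 1) + 1   # exact ceil(sqrt(4d)) for d >= 1
    return min(2 * d, d + 1 + ceil_two_sqrt)


def first_dimension_below(limit=200):
    """Smallest d >= 2 with M(d) < 4(d + 1)/3, tested in integers as 3 M(d) < 4 (d + 1)."""
    for d in range(2, limit + 1):
        if 3 * mani_bound(d) < 4 * (d + 1):
            return d
    return None


d0 = first_dimension_below()
n0 = mani_bound(d0)
gale_dimension = n0 - d0 - 1   # dimension m of the Gale dual configuration
print(d0, n0, gale_dimension, 2 * 2 * gale_dimension)
```

With n vectors in R^m, where m = n − d − 1, the inequality n > 2km for k = 2 is the same as n < 4(d + 1)/3, which is why this threshold appears in the text.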
We will refer to illuminated polytopes with the minimal number f_0 = M(d) of
vertices as Mani polytopes. The Mani polytopes constructed by Mani himself are, by
construction, simplicial. In relation to Question 3, Mani thus asked:
Question 4 (Mani [7]; see also Bremner and Klee [2]). Are all illuminated poly-
topes with the minimum number of vertices simplicial? Or are there nonsimplicial
Mani polytopes?
Below we give a complete answer to this question: up to dimension 5, all Mani poly-
topes are simplicial, and there is only one combinatorial type, given by the crosspoly-
tope. For every d ≥ 6 though, we construct a nonsimplicial Mani polytope on the
minimum number of vertices. This corrects a statement by Bremner and Klee [2], who
had claimed that up to dimension 7 all extremal illuminated polytopes were crosspoly-
topes.
Our solution for Mani’s question, which we present in Section 3, produces suitable
Gale diagrams; thus it uses Marcus’s positively k-spanning vector configurations. In
turn, Marcus’s conjecture on the size of minimal positively k-spanning vector config-
urations is refuted by taking Mani’s viewpoint of illuminated polytopes.
To summarize, the precise answers for Questions 1 and 2 remain open, although
Marcus’s conjectured answers are refuted by Mani’s construction as well as by our
construction in this paper: they yield illuminated d-polytopes with roughly d + 2√d
vertices, while Marcus’s work implies that an unneighborly d-polytope has at least
roughly d + √(2d) vertices. Question 3 was solved by Mani [7], and Question 4 is
solved by our construction.
Marcus’s conjecture for Question 1 is proven wrong for all k ≥ 2 in the first author’s
Ph.D. thesis [14] based on Mani’s construction for the case k = 2.
is an inner diagonal.
A set W ⊆ vert(P) is said to lie opposite the vertex v ∈ vert(P) if for every w ∈ W
the segment [v, w] is an inner diagonal and vert(P) \ (W ∪ {v}) illuminates itself.
Let Γ(P) := max{|W| : W lies opposite some v ∈ vert(P)}.
The following is a slightly stronger statement than the main result from [12] that is
easily extracted from Rosenfeld’s proof.
Lemma 2 (Mani [7]). Let d ≥ 3, let P be a Mani d-polytope, and assume that
Γ(P) ≥ 2. Then f_0(P) ≥ d + p(d) + d/p(d) + 1.
Proof. By Lemma 2, the number of vertices is at least d + p(d) + d/p(d) + 1. But for
must have d ≥ 6.
choose an ℓ with 1 ≤ ℓ ≤ q − 1.
We construct a nonsimplicial polytope Q with d + p vertices that has q + 1 simplex
facets, such that stacking onto these facets produces a nonsimplicial illuminated d-polytope.
(What we describe here is in fact a whole family of such polytopes, indexed
by the parameter ℓ.)
We describe Q in terms of a Gale diagram A. Let
B = {−1, e_1, . . . , e_{p−1}},
where 1 denotes the vector in which all entries are 1. This is a positive basis of R^(p−1)
of cardinality p. The vectors in A are the following:
(1) Take ℓ copies of B, and denote them by B_1, . . . , B_ℓ.
(2) Take q − ℓ copies of −B, and denote them by B̃_1, . . . , B̃_{q−ℓ}.
(3) Furthermore, take the vectors 1, −e_1, . . . , −e_{d+p−pq−1}.
Then the number of vectors in A is d + p.
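Steps (1) to (3) can be sketched in a few lines. The function below is an illustration, not the authors' code: it takes d, p, q and ℓ as given (the conditions under which p and q are chosen are not reproduced here), builds the vectors of A as integer tuples in R^(p−1), and lets one confirm the count d + p. The parameter check and all names are our assumptions.

```python
def gale_diagram(d, p, q, ell):
    """Return the vectors of A in R^(p-1) as integer tuples, following steps (1)-(3)."""
    dim = p - 1
    last = d + p - p * q - 1            # index of the last vector -e_i in step (3)
    if not (1 <= ell <= q - 1) or not (0 <= last <= dim):
        raise ValueError("parameters outside the range this sketch assumes")

    def negate(v):
        return tuple(-x for x in v)

    ones = (1,) * dim
    unit = [tuple(int(i == j) for i in range(dim)) for j in range(dim)]
    basis = [negate(ones)] + unit                          # B = {-1, e_1, ..., e_(p-1)}

    vectors = []
    vectors += basis * ell                                 # (1) ell copies of B
    vectors += [negate(v) for v in basis] * (q - ell)      # (2) q - ell copies of -B
    vectors += [ones] + [negate(unit[i]) for i in range(last)]   # (3) 1, -e_1, ...
    return vectors


example = gale_diagram(d=6, p=3, q=2, ell=1)   # the parameters used in Example 7
print(len(example), example)
```

The count works out as ℓp + (q − ℓ)p + (d + p − pq) = d + p, independently of ℓ.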
By the translation between Gale diagram and polytope combinatorics, every B_i,
i = 1, . . . , ℓ, and every B̃_j, j = 1, . . . , q − ℓ, corresponds to a facet complement of
size p in Q, that is, to the complement of a simplex facet. If we augment the set
{1, −e_1, . . . , −e_{d+p−pq−1}} to a positive basis B′ by taking the last pq − d vectors of
B̃_1, we get that
{B_i : i = 1, . . . , ℓ} ∪ {B̃_j : j = 1, . . . , q − ℓ} ∪ {B′}
and five disjoint simplex facet complements of size four that cover all vertices. This
yields a nonsimplicial illuminated 16-polytope with M(16) = 25 vertices.
Example 7. For the special case d = 6, we get a polytope Q as in the proof of Theorem 5
by constructing the Gale diagram on the right in Figure 6 with p = 3, q = 2,
and ℓ = 1.
The Gale diagram has three disjoint positive bases that cover all vectors: the basis
B_1 = {−1, e_1, e_2}, and the bases B̃_1 = B′ = {1, −e_1, −e_2}. These bases correspond
to complements of simplex facets of Q. Stacking onto these three facets produces a
nonsimplicial illuminated 6-polytope with M(6) = 12 vertices.
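\[ V = \begin{pmatrix} v_1 & \cdots & v_n \\ 1 & \cdots & 1 \end{pmatrix} \in \mathbb{R}^{(d+1)\times n} \]
\[ \begin{pmatrix} v_1 & v_2 - v_1 & \cdots & v_n - v_1 \\ 1 & 0 & \cdots & 0 \end{pmatrix} \in \mathbb{R}^{(d+1)\times n}, \]
\[ W = \begin{pmatrix} w_1 & \cdots & w_n \\ \beta_1 & \cdots & \beta_n \end{pmatrix} \in \mathbb{R}^{(d+1)\times n} \]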
with rowspace(W ) = rowspace(G)⊥ , and thus in particular with full row rank d + 1.
The rows of G give the linear dependences of the columns of W and thus also linear
dependences of the vectors
\[ \begin{pmatrix} (1/\beta_i)\,w_i \\ 1 \end{pmatrix} =: \begin{pmatrix} v_i \\ 1 \end{pmatrix} \]
with the same sign pattern. This allows one to reconstruct the full combinatorics of the
polytope P := conv{v_1, . . . , v_n} ⊂ R^d from the sign patterns of the linear dependences
of the columns of G. However, G determines a polytope with n vertices if and only
if no point v_i is in the convex hull of the other points v_j, that is, if and only if there
is no vector in the row space of G that has exactly one negative component, that is,
if and only if every open half-space in R^(n−d−1) contains at least two of the vectors
g_i. As observed in the introduction above, this is equivalent to the condition that the
vector configuration g_1, . . . , g_n in R^(n−d−1) is positively 2-spanning. Figure 7 gives a
very simple example.
[Figure 7 labels: octahedron vertices 1 to 6; Gale diagram vectors labeled in pairs 1, 2; 3, 4; 5, 6.]
Figure 7. An octahedron (6 vertices, 3-dimensional), and its Gale diagram (6 vectors, 2-dimensional).
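The half-space criterion is simple to test for a planar Gale diagram. The sketch below is an illustration with coordinates of our own choosing for the octahedron's diagram: three directions, each carrying two coincident vectors, as the paired labels in Figure 7 suggest. For nonzero integer vectors in the plane, the number of vectors in an open half-plane is smallest when the boundary line passes through one of the vectors, so it suffices to test the normals perpendicular to each vector.

```python
def min_open_halfplane_count(vectors):
    """Least number of vectors in an open half-plane {x : <u, x> > 0} over all u != 0.

    Assumes nonzero integer vectors in the plane, so every test is exact.
    """
    if any(v == (0, 0) for v in vectors):
        raise ValueError("zero vectors are not supported by this sketch")
    best = len(vectors)
    for (x, y) in vectors:
        for u in ((-y, x), (y, -x)):          # the two normals perpendicular to (x, y)
            count = sum(1 for (a, b) in vectors if u[0] * a + u[1] * b > 0)
            best = min(best, count)
    return best


# Assumed coordinates for the Gale diagram of the octahedron (pairs 1,2 / 3,4 / 5,6).
octahedron_gale = [(1, 0), (1, 0), (0, 1), (0, 1), (-1, -1), (-1, -1)]

print(min_open_halfplane_count(octahedron_gale))       # at least 2: positively 2-spanning
print(min_open_halfplane_count(octahedron_gale[1:]))   # drops after deleting one vector
```

Deleting any single vector lowers the minimum below two, so this configuration is minimal with respect to inclusion among positively 2-spanning ones.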
REFERENCES
1. L. M. Blumenthal, Theory and Applications of Distance Geometry, Oxford University Press, 1953.
2. D. Bremner and V. Klee, Inner diagonals of convex polytopes, J. Combin. Theory, Ser. A 87 (1999)
175–197. doi:10.1006/jcta.1998.2953
3. C. Davis, Theory of positive linear dependence, Amer. J. Math. 76 (1954) 733–746. doi:10.2307/
2372648
4. B. Grünbaum, Convex Polytopes, Interscience, London 1967; 2nd ed. (V. Kaibel, V. Klee and G. M.
Ziegler, eds.), Graduate Texts in Mathematics, vol. 221, Springer-Verlag, New York, 2003.
5. H. Hadwiger, Ungelöste Probleme, Nr. 55, Elem. Math. 27 (1972) 57.
6. G. Kalai, Some aspects of the combinatorial theory of convex polytopes, in Polytopes: Abstract, Convex
and Computational—Scarborough, ON, 1993, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 440,
Kluwer Academic, Dordrecht, 1994, 205–229.
7. P. Mani, Inner illumination of convex polytopes, Comment. Math. Helv. 49 (1974) 65–73. doi:10.1007/
BF02566719
8. D. A. Marcus, Minimal positive 2-spanning sets of vectors, Proc. Amer. Math. Soc. 82 (1981) 165–172.
9. D. A. Marcus, Gale diagrams of convex polytopes and positive spanning sets of vectors, Discrete Appl. Math.
9 (1984) 47–67. doi:10.1016/0166-218X(84)90090-8
10. J. Matoušek, Lectures on Discrete Geometry, Graduate Texts in Mathematics, vol. 212, Springer-Verlag,
New York, 2002.
11. P. McMullen, Transforms, diagrams and representations, in Contributions to Geometry, Proceedings of
the Geom. Sympos.—Siegen 1978, Birkhäuser, Basel, 1979, 92–130.
12. M. Rosenfeld, Inner illumination of convex polytopes, Elem. Math. 30 (1974) 27–28.
13. G. C. Shephard, Diagrams for positive bases, J. London Math. Soc. 4 (1971) 165–175. doi:10.1112/
jlms/s2-4.1.165
14. R. F. Wotzlaw, Incidence Graphs and Unneighborly Polytopes, Ph.D. dissertation, Technische Universität
Berlin, 2009; available at http://opus.kobv.de/tuberlin/volltexte/2009/2221/.
15. G. M. Ziegler, Lectures on Polytopes, Graduate Texts in Mathematics, vol. 152, Springer-Verlag, New
York, 1995.
RONALD F. WOTZLAW received his Ph.D. at TU Berlin in 2009, in the framework of the DFG Research
Training Group “Methods for Discrete Structures.” In 2009, he decided to combine his passions for computer
science, mathematics, and photography and joined Nik Software, where he now works on photography-related
problems in digital imaging.
Nik Software GmbH, Hinter den Kirschkaten 26, 23560 Lübeck, Germany
GÜNTER M. ZIEGLER received his Ph.D. at M.I.T. in 1987, and after four years in Augsburg and a winter
in Stockholm, arrived in Berlin in 1992. He has been a professor at TU Berlin since 1995. He is the speaker of
the DFG Research Training Group “Methods for Discrete Structures,” and a co-chair of the Berlin Mathemat-
ical School. His writing includes Proofs from THE BOOK (1998, with Martin Aigner, in German Das BUCH
der Beweise, translated into 12 other languages) and a 2010 book in German Darf ich Zahlen? Geschichten
aus der Mathematik.
Inst. Mathematics, Freie Universität Berlin, Arnimallee 2, 14195 Berlin, Germany
Abstract. A drawing of a graph in the plane is called a thrackle if every pair of edges meet
precisely once, either at a common vertex or at a proper crossing. According to Conway’s
conjecture, every thrackle has at most as many edges as vertices. We prove this conjecture for
x-monotone thrackles, that is, in the case when every edge meets every vertical line in at most
one point.
doi:10.4169/amer.math.monthly.118.06.544
Theorem 1 (Erdős). Every straight-line thrackle has at most as many edges as ver-
tices.
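For straight-line drawings the thrackle condition can be tested directly with orientation predicates. The sketch below is an illustration with hypothetical names and made-up integer coordinates: it checks that every pair of segments meets exactly once, either at a shared endpoint or at a proper crossing. The pentagram on five points in convex position passes and has as many edges as vertices, while the boundary of a square fails because opposite sides are disjoint.

```python
from itertools import combinations


def orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counterclockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def meet_exactly_once(points, e, f):
    """True if straight edges e and f share one endpoint only, or cross properly."""
    shared = set(e) & set(f)
    if len(shared) == 2:
        return False                                  # same endpoints: segments overlap
    if len(shared) == 1:
        v = shared.pop()
        a = points[e[0] if e[1] == v else e[1]]
        b = points[f[0] if f[1] == v else f[1]]
        pv = points[v]
        if orient(pv, a, b) != 0:
            return True                               # not collinear: meet only at v
        dot = (a[0] - pv[0]) * (b[0] - pv[0]) + (a[1] - pv[1]) * (b[1] - pv[1])
        return dot < 0                                # collinear: must leave v oppositely
    p1, p2 = points[e[0]], points[e[1]]
    q1, q2 = points[f[0]], points[f[1]]
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def is_straight_line_thrackle(points, edges):
    """Check the thrackle condition for every pair of edges."""
    return all(meet_exactly_once(points, e, f) for e, f in combinations(edges, 2))


pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]       # convex position
star_edges = [(0, 2), (2, 4), (4, 1), (1, 3), (3, 0)]      # the pentagram
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(is_straight_line_thrackle(pentagon, star_edges), len(star_edges) <= len(pentagon))
print(is_straight_line_thrackle(square, square_edges))
```

Integer coordinates keep every orientation test exact, so no tolerance is needed.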
Figure 2. (a) a pointed vertex; (b) Perles’ argument; and (c) why it breaks down for x-monotone thrackles.
The above proof applies verbatim to another special class of thrackles, for which
it makes sense to speak of “leftmost” edges. A thrackle is called outerplanar if its
vertices lie on a circle and its edges are represented by continuous curves contained in
the interior of this circle [5].
Cairns and Nikolayevsky [5] have recently established the stronger statement that
every outerplanar thrackle which has no vertex of degree at most one is an odd cycle.
Woodall [15] characterized all thrackles, assuming that Conway’s conjecture is true.
In this case, G is a bipartite graph: all of its edges connect a point in the left half-plane
bounded by ℓ to a point in the right half-plane. We show that the number of
edges of G is strictly smaller than n. Indeed, otherwise G would contain a cycle of
even length, contradicting the following lemma.
Proof. Suppose for a contradiction that an even cycle C = v_0 v_1 · · · v_{k−1} can be
drawn as a thrackle (k is even and the indices are taken modulo k).
First of all, notice that we can assume without loss of generality that C meets the
requirements of Case A: none of its vertices lies on ℓ. Indeed, if there existed such a
vertex v_i, the two edges of C meeting at v_i would lie in the same half-plane bounded
by ℓ; otherwise, by the evenness of k, one of them would be disjoint from another edge
of C that lies entirely in the opposite open half-plane. If both v_{i−1}v_i and v_i v_{i+1} lie in
the same half-plane, then slightly translating ℓ we can bring it to a position where no
vertex lies on it and ℓ (strictly) crosses every edge of C.
We say that the edge v_{i−1}v_i lies below the edge v_i v_{i+1} if the intersection point of
v_{i−1}v_i with the line ℓ is below the intersection point of v_i v_{i+1} with ℓ. Notice that, by
the definition of a thrackle, these two points cannot coincide, since the interiors of any
two adjacent edges must be disjoint. If an edge lies below another edge adjacent to it,
then we say that the latter edge lies above the first one.
Observe that if the edge v_{i−1}v_i lies below v_i v_{i+1}, then the next edge along C,
v_{i+1}v_{i+2}, must also lie below v_i v_{i+1}, since otherwise v_{i−1}v_i and v_{i+1}v_{i+2} could not
cross. In this case, we say that v_i v_{i+1} is an upper edge. Otherwise, both v_{i−1}v_i and
v_{i+1}v_{i+2} must lie above v_i v_{i+1}, and v_i v_{i+1} is called a lower edge. Obviously, the edges
of C are alternately upper and lower edges. See Figure 3.
Consider now the cycle C as a closed self-intersecting curve embedded in the plane,
which divides the plane into simply connected regions. Exactly one of these regions is
Figure 3. An even cycle. The upper edges are marked.
In this case, replace v by two vertices, v′ and v″, very close to each other and to the
original position of v. Let v′ lie in the left half-plane, and let it be the new left endpoint
ACKNOWLEDGMENTS. The present note grew out of the Intel project of the second named author in 2005,
while he was a student at Stuyvesant High School, New York, under the supervision of the first named author.
Some portions of the argument have been rediscovered by Radoslav Fulek, Rom Pinchasi, Konrad Swanepoel,
and Géza Tóth. Research on this paper was partially supported by NSF grant CCF-08-3072, Swiss NSF grant
200021-125287/1, and grants from OTKA and BSF. The second named author was also supported by MIT’s
UROP program, under the direction of Dr. Karl Mahlburg.
REFERENCES
1. P. Brass, W. Moser, and J. Pach, Research Problems in Discrete Geometry, Springer-Verlag, New York,
2005.
2. G. Cairns, M. McIntyre, and Y. Nikolayevsky, The thrackle conjecture for K_5 and K_{3,3}, in Towards a
Theory of Geometric Graphs, J. Pach, ed., Contemporary Mathematics, vol. 342, American Mathematical
Society, Providence, RI, 2004, 35–54.
3. G. Cairns and Y. Nikolayevsky, Bounds for generalized thrackles, Discrete Comput. Geom. 23 (2000)
191–206. doi:10.1007/PL00009495
4. G. Cairns and Y. Nikolayevsky, Generalized thrackle drawings of non-bipartite graphs, Discrete Comput. Geom. 41 (2009)
119–134. doi:10.1007/s00454-008-9095-5
5. G. Cairns and Y. Nikolayevsky, Outerplanar thrackles, manuscript (2009), available at
http://www.latrobe.edu.au/mathstats/staff/cairns/papers/alter.pdf.
6. J. H. Conway, Unsolved problem, in Combinatorics: Being the Proceedings of the Conference on Combi-
natorial Mathematics–Mathematical Institute, Oxford, D. J. A. Welsh and D. R. Woodall, eds., Institute
of Mathematics and Its Applications, Southend-on-Sea, UK, 1972, 351–363.
7. R. Fulek and J. Pach, A computational approach to Conway’s thrackle conjecture, in Graph Drawing
2010, Lecture Notes in Computer Science, vol. 6502, Springer-Verlag, Berlin, 2011, 226–237.
8. J. E. Green and R. D. Ringeisen, Combinatorial drawings and thrackle surfaces, in Graph Theory, Com-
binatorics, and Algorithms, Vol. 2. Proceedings of the Seventh Quadrennial International Conference on
the Theory and Applications of Graphs–Western Michigan University, Kalamazoo, MI, 1992, Y. Alavi
and A. Schwenk, eds., Wiley-Interscience, New York, 1995, 999–1009.
9. H. Hopf and E. Pannwitz, Aufgabe Nr. 167, Jahresber. Deutsch. Math.-Verein. 43 (1934) 114.
10. L. Lovász, J. Pach, and M. Szegedy, On Conway’s thrackle conjecture, Discrete Comput. Geom. 18
(1998) 369–376. doi:10.1007/PL00009322
11. A. Perlstein and R. Pinchasi, Generalized thrackles and geometric graphs in R3 with no pair of strongly
avoiding edges, Graphs Combin. 24 (2008) 373–389. doi:10.1007/s00373-008-0796-6
12. B. L. Piazza, R. D. Ringeisen, and S. K. Stueckle, Subthrackleable graphs and four cycles, in Graph
Theory and Applications–Hakone, 1990, M. Kano, H. Okamura, S. Tazawa, K. Ushio, and Y. Yamasaki,
eds., Discrete Math. 127, no. 1-3 (1994) 265–276. doi:10.1016/0012-365X(92)00484-9
13. R. D. Ringeisen, Two old extremal graph drawing conjectures: Progress and perspectives, Congr. Numer.
115 (1996) 91–103.
14. J. W. Sutherland, Lösung der Aufgabe 167, Jahresber. Deutsch. Math.-Verein. 45 (1935) 33–35.
15. D. R. Woodall, Thrackles and deadlock, in Combinatorial Mathematics and Its Applications, D. J. A.
Welsh, ed., Academic Press, New York, 1969, 335–348.
EPFL Lausanne and Rényi Institute, H-1364 Budapest, POB 127, Hungary
where n ≥ 1 and p is prime. We survey the main ingredients in several known proofs. Then
we give an elementary proof, using an identity for power sums proven by Pascal in 1654. An
application is a simple proof of a congruence for certain sums of binomial coefficients, due to
Hermite and Bachmann.
For example, it is used to prove theorems on Bernoulli numbers (that of von Staudt-
Clausen in [2] and [7, Theorem 118], of Carlitz-von Staudt in [2, Theorem 4] and
[11], and of Almkvist-Meurman in [3, Theorem 9.5.29]) and to study the Erdős-Moser
Diophantine equation S_n(m − 1) = m^n (in [10], [11], [12], and [15]) as well as other
exponential Diophantine equations and Stirling numbers of the second kind (in [9]).
A component of most proofs of Theorem 1 is Fermat’s little theorem ([7, Theorem
71], [14, p. 36]), which says that if p is prime, then
As a bonus, Pascal’s identity allows a simple proof of a congruence for certain sums
of binomial coefficients $\binom{m}{k}$ (generalizing the easily-established facts that $\sum_{k=1}^{m-1}\binom{m}{k}$ is
even for m > 0, and that if p is prime and p ≤ m ≤ 2(p − 1), then p divides $\binom{m}{p-1}$).
The case m odd is due to Hermite [8] in 1876, and the general case to Bachmann [1,
p. 46] in 1910; for these and related results, see [4, pp. 270–275].
Proof. Set n = m − 1 and a = p in (3), then reduce modulo p. Using Theorem 1, the
result follows.
For example,
\[ \binom{14}{4} + \binom{14}{8} + \binom{14}{12} = 1001 + 3003 + 91 \equiv 0 \pmod 5. \]
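The example can be reproduced, and the pattern behind it tested, with a few lines of integer arithmetic. The sketch below makes three assumptions about statements that are not displayed in this part of the text: the congruence is read as "the sum of C(m, k) over 0 < k < m with (p − 1) | k is divisible by p", the power-sum congruence as "1^n + ... + p^n ≡ −1 (mod p) if (p − 1) | n, and ≡ 0 otherwise", and Pascal's identity as "Σ_{k=0}^{n} C(n + 1, k) S_k(a) = (a + 1)^{n+1} − 1". All function names are hypothetical.

```python
from math import comb


def lacunary_binomial_sum(m, p):
    """Sum of C(m, k) over 0 < k < m with (p - 1) dividing k."""
    return sum(comb(m, k) for k in range(p - 1, m, p - 1))


def power_sum(n, a):
    """S_n(a) = 1^n + 2^n + ... + a^n."""
    return sum(j ** n for j in range(1, a + 1))


def pascal_identity_holds(n, a):
    """Check sum_{k=0}^{n} C(n+1, k) S_k(a) == (a + 1)^(n+1) - 1 (assumed form)."""
    left = sum(comb(n + 1, k) * power_sum(k, a) for k in range(n + 1))
    return left == (a + 1) ** (n + 1) - 1


def power_sum_congruence_holds(n, p):
    """Check S_n(p) = -1 (mod p) if (p - 1) | n, and 0 (mod p) otherwise (assumed form)."""
    expected = p - 1 if n % (p - 1) == 0 else 0
    return power_sum(n, p) % p == expected


total = lacunary_binomial_sum(14, 5)
print(total, total % 5)
```

Reducing the assumed identity modulo p with n = m − 1 and a = p kills every term except those with (p − 1) | k, which is how the binomial congruence falls out of the power-sum congruence.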
For (4) and generalizations due to Glaisher and Carlitz, see [3, p. 70, Lemma 9.5.28;
p. 133, Exercise 62; and p. 327, Proposition 11.4.11]. Recently, Dilcher [5] discovered
an analog of (4) for alternating sums.
ACKNOWLEDGMENTS. We thank the three referees for several suggestions and references, and Pieter
Moree for sending us a preprint of [11].
1. P. Bachmann, Niedere Zahlentheorie, Part 2, Teubner, Leipzig, 1910; Parts 1 and 2 reprinted in one
volume, Chelsea, New York, 1968.
2. L. Carlitz, The Staudt-Clausen theorem, Math. Mag. 34 (1961) 131–146. doi:10.2307/2688488
3. H. Cohen, Number Theory, Volume II: Analytic and Modern Tools, Graduate Texts in Mathematics, vol.
240, Springer-Verlag, New York, 2007.
4. L. E. Dickson, History of the Theory of Numbers, vol. 1, Carnegie Institution of Washington, Washington,
DC, 1919; reprinted by Dover, Mineola, NY, 2005.
5. K. Dilcher, Congruences for a class of alternating lacunary sums of binomial coefficients, J. Integer Seq.
10 (2007) Article 07.10.1.
6. A. W. F. Edwards, Pascal’s Arithmetical Triangle, Charles Griffin, London, 1987.
7. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 6th ed., D. R. Heath-Brown
and J. H. Silverman, eds., Oxford University Press, Oxford, 2008.
8. Ch. Hermite, Extrait d’une lettre à M. Borchardt, J. Reine Angew. Math. 81 (1876) 93–95.
9. B. C. Kellner, The equivalence of Giuga’s and Agoh’s conjectures (2004), available at
http://arxiv.org/abs/math/0409259.
10. P. Moree, Moser’s mathemagical work on the equation 1^k + 2^k + · · · + (m − 1)^k = m^k (preprint), available
at http://arxiv.org/abs/1011.2940.
11. P. Moree, A top hat for Moser’s four mathemagical rabbits, Amer. Math. Monthly 118 (2011) 364–370;
also available at http://arxiv.org/abs/1011.2956.
12. L. Moser, On the Diophantine equation 1^n + 2^n + · · · + (m − 1)^n = m^n, Scripta Math. 19 (1953) 84–88.
13. B. Pascal, Sommation des puissances numériques, in Oeuvres Complètes, vol. III, J. Mesnard, ed., De-
sclée-Brouwer, Paris, 1964, 341–367; English trans. A. Knoebel, R. Laubenbacher, J. Lodder, and D. Pen-
gelley, Sums of numerical powers, in Mathematical Masterpieces: Further Chronicles by the Explorers,
Springer-Verlag, New York, 2007, 32–37.
14. H. E. Rose, A Course in Number Theory, Clarendon Press, Oxford, 1988.
15. J. Sondow and K. MacMillan, Reducing the Erdős-Moser equation 1^n + 2^n + · · · + k^n = (k + 1)^n modulo
k and k^2 (preprint), available at http://arxiv.org/abs/1011.2154.
Abstract. We show how to use derivations on the ring of Dirichlet series to achieve mero-
morphic continuations of completely multiplicative Dirichlet series. In particular we present
infinitely many new representations of the Riemann zeta function as a quotient of series con-
verging on Re(s) > 0.
1. P. Bachmann, Niedere Zahlentheorie, Part 2, Teubner, Leipzig, 1910; Parts 1 and 2 reprinted in one
volume, Chelsea, New York, 1968.
2. L. Carlitz, The Staudt-Clausen theorem, Math. Mag. 34 (1961) 131–146. doi:10.2307/2688488
3. H. Cohen, Number Theory, Volume II: Analytic and Modern Tools, Graduate Texts in Mathematics, vol.
240, Springer-Verlag, New York, 2007.
4. L. E. Dickson, History of the Theory of Numbers, vol. 1, Carnegie Institution of Washington, Washington,
DC, 1919; reprinted by Dover, Mineola, NY, 2005.
5. K. Dilcher, Congruences for a class of alternating lacunary sums of binomial coefficients, J. Integer Seq.
10 (2007) Article 07.10.1.
6. A. W. F. Edwards, Pascal’s Arithmetical Triangle, Charles Griffin, London, 1987.
7. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 6th ed., D. R. Heath-Brown
and J. H. Silverman, eds., Oxford University Press, Oxford, 2008.
8. Ch. Hermite, Extrait d’une lettre à M. Borchardt, J. Reine Angew. Math. 81 (1876) 93–95.
9. B. C. Kellner, The equivalence of Giuga’s and Agoh’s conjectures (2004), available at http://arxiv.
org/abs/math/0409259.
10. P. Moree, Moser’s mathemagical work on the equation 1k + 2k + · · · + (m − 1)k = m k (preprint), avail-
able at http://arxiv.org/abs/1011.2940.
11. , A top hat for Moser’s four mathemagical rabbits, Amer. Math. Monthly 118 (2011) 364–370;
also available at http://arxiv.org/abs/1011.2956.
12. L. Moser, On the Diophantine equation 1n + 2n + · · · + (m − 1)n = m n , Scripta Math. 19 (1953) 84–88.
13. B. Pascal, Sommation des puissances numériques, in Oeuvres Complètes, vol. III, J. Mesnard, ed., De-
sclée-Brouwer, Paris, 1964, 341–367; English trans. A. Knoebel, R. Laubenbacher, J. Lodder, and D. Pen-
gelley, Sums of numerical powers, in Mathematical Masterpieces: Further Chronicles by the Explorers,
Springer-Verlag, New York, 2007, 32–37.
14. H. E. Rose, A Course in Number Theory, Clarendon Press, Oxford, 1988.
15. J. Sondow and K. MacMillan, Reducing the Erdős-Moser equation 1n + 2n + · · · + k n = (k + 1)n mod-
ulo k and k 2 (preprint), available at http://arxiv.org/abs/1011.2154.
Abstract. We show how to use derivations on the ring of Dirichlet series to achieve mero-
morphic continuations of completely multiplicative Dirichlet series. In particular we present
infinitely many new representations of the Riemann zeta function as a quotient of series con-
verging on Re(s) > 0.
where $s$ is a complex variable. If $a(nm) = a(n)a(m)$ for all $n$ and $m$, then $a$ is called completely multiplicative. For example, the arithmetic function $1(n) \equiv 1$ is completely multiplicative and gives rise to the Riemann zeta function, $\zeta(s) = F_1(s)$.
A few facts are in order, whose proofs may be found in [1]. Put $A(x) = \sum_{n \le x} a(n)$. If $A(x) = O(x^{\alpha})$, then $F_a(s)$ converges to an analytic function of $s$ for $\mathrm{Re}(s) > \alpha$. Moreover, the series may be differentiated term-by-term in this region to achieve
\[
F_a'(s) = -\sum_{n=1}^{\infty} \log(n)\,\frac{a(n)}{n^s}. \tag{1}
\]
We will shortly be replacing the term $-\log(n)$ with a similarly behaved log-derivation, $-\ell(n)$. Let us see how the standard approach progresses first.
The set of arithmetic functions is a ring under addition and convolution. For $a(n)$ and $b(n)$, their convolution is $(a * b)(n) = \sum_{j \mid n} a(j)\,b(n/j)$. The connection with Dirichlet series is $F_{a*b}(s) = F_a(s)F_b(s)$.
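As an illustration only, the convolution and its link to Dirichlet series can be sketched in a few lines of Python. The helper names (`divisors`, `convolve`, `truncated_series`) are hypothetical and not part of the article; the series is truncated, so the product identity is checked only approximately.

```python
def divisors(n):
    """Return the positive divisors of n in increasing order."""
    return [j for j in range(1, n + 1) if n % j == 0]

def convolve(a, b):
    """Dirichlet convolution: (a * b)(n) = sum over j | n of a(j) b(n/j)."""
    return lambda n: sum(a(j) * b(n // j) for j in divisors(n))

def one(n):
    """The constant arithmetic function 1(n) = 1."""
    return 1

def identity(n):
    """The arithmetic function n -> n."""
    return n

def truncated_series(a, s, terms):
    """Partial sum of the Dirichlet series of a at real s (a truncation, not the full series)."""
    return sum(a(n) / n ** s for n in range(1, terms + 1))

# 1 * 1 counts the divisors of n; identity * 1 sums them.
num_divisors = convolve(one, one)
sum_divisors = convolve(identity, one)

# F_{1*1}(3) should be close to F_1(3)^2 once enough terms are taken.
gap = abs(truncated_series(num_divisors, 3, 300) - truncated_series(one, 3, 300) ** 2)
```

The two partial sums differ only by terms with $jk > 300$, which is why `gap` is small but not exactly zero.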
The von Mangoldt function is defined by putting $\Lambda(n) = \log(p)$ if $n = p^v$ with $p$ prime and $v \ge 1$, and $\Lambda(n) = 0$ otherwise. This function plays a critical role in the original proof of the Prime Number Theorem (PNT).
\[
a \cdot \log = (\Lambda \cdot a) * a.
\]
On the Dirichlet series side this translates into $-F_a'(s) = F_{\Lambda \cdot a}(s)F_a(s)$. Hence, for example, taking $a = 1$ and solving for $F_a(s)$ gives
\[
\zeta(s) = -\zeta'(s)/F_{\Lambda}(s) = \left(\sum_{n=1}^{\infty} \frac{\log(n)}{n^s}\right) \bigg/ \left(\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s}\right). \tag{2}
\]
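For $a = 1$ the identity above reads $\log(n) = \sum_{j \mid n} \Lambda(j)$. A minimal numerical sketch of that special case follows; the function names are hypothetical and the comparison is in floating point.

```python
import math

def von_mangoldt(n):
    """Return log(p) if n = p**v for a prime p and v >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip every copy of it.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor up to sqrt(n), so n itself is prime

def log_via_convolution(n):
    """(Lambda * 1)(n): the sum of Lambda(j) over the divisors j of n."""
    return sum(von_mangoldt(j) for j in range(1, n + 1) if n % j == 0)

# Largest discrepancy between log(n) and (Lambda * 1)(n) for n up to 200.
worst = max(abs(log_via_convolution(n) - math.log(n)) for n in range(1, 201))
```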
We call this the $\ell$-derivation of the Dirichlet series. The superscript $\ell$ notation is intended to be suggestive of a derivative; the next lemma shows this is a good fit.
Lemma 2.1. An $\ell$-derivation of Dirichlet series is a derivation. That is, if $a$ and $b$ are arithmetic functions we have
\[
F_{a+b}^{\ell}(s) = F_a^{\ell}(s) + F_b^{\ell}(s) \tag{4}
\]
and
\[
F_{a*b}^{\ell}(s) = F_a^{\ell}(s)F_b(s) + F_a(s)F_b^{\ell}(s). \tag{5}
\]
Proof. These follow immediately from the corresponding identities of arithmetic functions. Let us prove the second one:
\[
\begin{aligned}
((a * b)\ell)(n) &= \sum_{j \mid n} a(j)\,b(n/j)\,\ell(n) \\
&= \sum_{j \mid n} a(j)\,b(n/j)\,(\ell(j) + \ell(n/j)) \\
&= \sum_{j \mid n} (a\ell)(j)\,b(n/j) + \sum_{j \mid n} a(j)\,(b\ell)(n/j)
\end{aligned}
\]
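The product rule at the level of arithmetic functions can be checked exactly on small inputs. The sketch below assumes a sample log-derivation with $\ell(p) = 1/p$ on primes, extended additively; this choice, the test functions `a` and `b`, and all helper names are illustrative assumptions and not the article's.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the factorization of n as a list of (prime, exponent) pairs."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def ell(n):
    """Hypothetical log-derivation: ell(p) = 1/p on primes, so ell(mn) = ell(m) + ell(n)."""
    return sum(e * Fraction(1, p) for p, e in prime_factors(n))

def convolve(a, b):
    """Dirichlet convolution of two arithmetic functions."""
    return lambda n: sum(a(j) * b(n // j) for j in range(1, n + 1) if n % j == 0)

def times_ell(a):
    """Pointwise product of an arithmetic function with ell."""
    return lambda n: a(n) * ell(n)

a = lambda n: n * n + 1   # arbitrary test functions
b = lambda n: 3 * n - 2

lhs = times_ell(convolve(a, b))
rhs_first = convolve(times_ell(a), b)
rhs_second = convolve(a, times_ell(b))
leibniz_holds = all(lhs(n) == rhs_first(n) + rhs_second(n) for n in range(1, 101))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point tolerance.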
For every notion one has in the classical theory, there is the corresponding notion with $\log(n)$ replaced by $\ell(n)$. Define the $\ell$-modified von Mangoldt function $\Lambda_{\ell}(n) = \ell(p)$ if $n = p^v$ and $\Lambda_{\ell}(n) = 0$ otherwise; then immediately:
Proof. According to Proposition 2.2, it remains only to show that the two series in the quotient converge to analytic functions in the stated region. For the numerator, we should consider $L(x) = \sum_{n \le x} \ell(n)$. For ease of notation suppose $x$ is an integer. Because of log-type behavior, $L(x) = \ell(x!) = \sum_{p} v_p(x!)\,\ell(p)$. The next lemma is perhaps well known; we sketch the proof because it is short and amusing.
Lemma 3.2. The order of $p$ that divides $x!$ is $v_p(x!) = \frac{1}{p-1}(x - w_p(x))$, where $w_p(x)$ denotes the sum of the digits of $x$ written in base $p$.
Proof. By induction on $x$. When $x = 1$, both sides are zero. Suppose the formula holds for a given $x$. Let $v \ge 0$ be the exact power of $p$ dividing $x + 1$. One sees $w_p(x + 1) = w_p(x) + 1 - (p - 1)v$, and so the right-hand side of the formula has a net gain of $v$ as needed.
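The digit-sum formula is easy to test against a direct count. The following is a small sketch with hypothetical function names, comparing $(x - w_p(x))/(p - 1)$ with the exponent of $p$ in $x!$ obtained by factoring each of $1, \ldots, x$.

```python
def digit_sum(x, p):
    """Sum of the digits of x written in base p."""
    total = 0
    while x > 0:
        total += x % p
        x //= p
    return total

def valuation_of_factorial(x, p):
    """Exponent of p in x!, counted directly from the factors 1, ..., x."""
    count = 0
    for n in range(1, x + 1):
        while n % p == 0:
            n //= p
            count += 1
    return count

def digit_sum_formula(x, p):
    """The closed form (x - w_p(x)) / (p - 1) from the lemma."""
    return (x - digit_sum(x, p)) // (p - 1)

agree = all(
    valuation_of_factorial(x, p) == digit_sum_formula(x, p)
    for p in (2, 3, 5, 7)
    for x in range(1, 201)
)
```

For instance, $10 = 1010_2$ has digit sum 2, so the formula gives $v_2(10!) = 8$.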
Here $\pi(x)$ denotes the number of primes less than or equal to $x$. Chebyshev gave an elementary bound $\pi(x) \ge C_1 x/\log(x)$, which suffices to show that the term $x/2^{\pi(x)}$ tends to zero as $x \to \infty$. Chebyshev’s inequality may be transformed into a bound $p_k \le C_2\,k \log(k)$, which then easily implies $K := \sum_{p \in P} |\ell(p)| < \infty$. We conclude $L(x) = O(\log x)$. Hence $L(x) = O(x^{\varepsilon})$ for every $\varepsilon > 0$, and the series $-F_1^{\ell}(s)$ converges for $\mathrm{Re}(s) > 0$.
where $p$ denotes a prime. Note that $|\theta_{\ell}(y)| \le K$ for any $y$, and $\theta_{\ell}(x^{1/m}) = 0$ as soon as $m > \log_2(x)$. Hence $|\psi_{\ell}(x)| \le K \log_2(x) = O(\log x)$. We again have convergence of the Dirichlet series on $\mathrm{Re}(s) > 0$.
4. FINAL REMARKS. There was nothing particularly special about our choice for the $\ell(p)$. One can see for example that if for some $\beta \ge 0$ and all $\varepsilon > 0$ the sequence $\{\ell(p)\}_{p \in P}$ satisfies:
\[
\text{(H1)} \quad \sum_{p \in P[x]} \frac{\ell(p)}{p - 1} = O(x^{\beta - 1 + \varepsilon}), \quad \text{and}
\]
\[
\text{(H2)} \quad \sum_{p \in P[x]} |\ell(p)| = O(x^{\beta + \varepsilon}),
\]
we will achieve a meromorphic continuation of $\zeta(s)$ to $\mathrm{Re}(s) > \beta$. For example, one could take $\ell(2) = \frac{1}{M^2 - M}$ and $\ell(p_k) = -\frac{p_k - 1}{M^k}$ for any $M \in \mathbb{C}$ with $|M| > 1$ and achieve a continuation to $\mathrm{Re}(s) > 0$. (Our previous sequence arose from $M = 2$.)
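To see why such a family is a natural candidate for (H1), one can compute the partial sums exactly. The sketch below rests on an assumed reading of the example, namely $\ell(2) = 1/(M^2 - M)$ and $\ell(p_k) = -(p_k - 1)/M^k$ for the $k$-th prime with $k \ge 2$, and restricts to integer $M$ so that exact fractions can be used. Under that reading, the sum of $\ell(p)/(p - 1)$ over the first $k$ primes telescopes to $M^{-k}/(M - 1)$. All function names are hypothetical.

```python
from fractions import Fraction

def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
        n += 1
    return primes

def ell_on_primes(M, count):
    """Assumed reading: ell(2) = 1/(M^2 - M), ell(p_k) = -(p_k - 1)/M^k for k >= 2."""
    primes = first_primes(count)
    values = {2: Fraction(1, M * M - M)}
    for k, p in enumerate(primes[1:], start=2):
        values[p] = -Fraction(p - 1, M ** k)
    return values

def h1_partial_sum(M, count):
    """Sum of ell(p) / (p - 1) over the first `count` primes."""
    return sum(v / (p - 1) for p, v in ell_on_primes(M, count).items())
```

With $M = 2$ the partial sum over the first $k$ primes is $2^{-k}$, which decays geometrically in the number of primes used.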
There are other interesting approaches to this method of derivations. Let $\mu(n)$ denote the Möbius function, so that $F_{\mu}(s) = 1/\zeta(s)$, i.e., $F_{\mu}(s)F_1(s) = 1$. Applying an $\ell$-derivation yields
\[
F_{\mu}^{\ell}(s)F_1(s) + F_{\mu}(s)F_1^{\ell}(s) = 0.
\]
This can be cleaned up via equation (6) to $\zeta(s) = -F_{\Lambda_{\ell}}(s)/F_{\mu}^{\ell}(s)$.
Fix $s_0 \in \mathbb{C}$ with $\mathrm{Re}(s_0) > \frac{1}{2}$. One might try to find a log-derivation $\ell$ (depending on $s_0$) for which the series in the quotient converge in a half-plane containing $s_0$, and for which $F_{\Lambda_{\ell}}(s_0) \ne 0$. If we could do this for every $s_0$ with $\mathrm{Re}(s_0) > \beta$ for some $\beta$ satisfying $\frac{1}{2} \le \beta < 1$ this would be an improvement to the known zero-free region of zeta, and of course $\beta = \frac{1}{2}$ would give the Riemann hypothesis. The equation
\[
F_{\Lambda_{\ell}}(s) = \sum_{p \in P} \frac{\ell(p)}{p^s - 1}
\]
may be of some use, although a priori it only holds for $\mathrm{Re}(s) > 1$.
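For the reader's convenience, this closed form can be recovered from a geometric series. The sketch assumes only the definition $\Lambda_{\ell}(p^v) = \ell(p)$ and works in a region where the double series converges absolutely, so that the rearrangement is legitimate:

```latex
\[
F_{\Lambda_{\ell}}(s)
  = \sum_{p \in P} \sum_{v \ge 1} \frac{\ell(p)}{p^{vs}}
  = \sum_{p \in P} \ell(p)\,\frac{p^{-s}}{1 - p^{-s}}
  = \sum_{p \in P} \frac{\ell(p)}{p^{s} - 1}.
\]
```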
REFERENCES
1. G. J. O. Jameson, The Prime Number Theorem, London Mathematical Society Student Texts, vol. 53,
Cambridge University Press, Cambridge, 2003.
2. V. Laohakosol, Dependence of arithmetic functions and Dirichlet series, Proc. Amer. Math. Soc. 115 (1992)
637–645.
3. H. N. Shapiro, On the convolution ring of arithmetic functions, Comm. Pure Appl. Math. 25 (1972) 287–
336. doi:10.1002/cpa.3160250306
4. H. N. Shapiro and G. H. Sparer, On algebraic independence of Dirichlet series, Comm. Pure Appl. Math.
39 (1986) 695–745. doi:10.1002/cpa.3160390602
5. M. Stay, Generalized number derivatives, J. Integer Seq. 8 (2005) Article 05.1.4, available at http://www.cs.uwaterloo.ca/journals/JIS/VOL8/Stay/stay44.html.
6. V. Ufnarovski and B. Åhlander, How to differentiate a number, J. Integer Seq. 6 (2003) Article 03.3.4, available at http://www.cs.uwaterloo.ca/journals/JIS/VOL6/Ufnarovski/ufnarovski.html.
Department of Mathematics and Computer Science, Pacific University, Forest Grove, OR 97116
Deep Thought
PROBLEMS
11579. Proposed by Hallard Croft, University of Cambridge, Cambridge, U. K., and
Sateesh Mane, Convergent Computing, Shoreham, NY. Let m and n be distinct integers,
with m, n ≥ 3. Let B be a fixed regular n-gon, and let A be the largest regular m-gon
that does not extend beyond B. Let d = gcd(m, n), and assume d > 1. Show that:
(a) A and B are concentric.
(b) If m | n, then A and B have m points of contact, consisting of all the vertices of A.
(c) If m ∤ n and n ∤ m, then A and B have 2d points of contact.
(d) A and B share exactly d common axes of symmetry.
11580. Proposed by David Alfaya Sánchez, Universidad Autónoma de Madrid,
Madrid, Spain, and José Luis Dı́az-Barrero, Universidad Politécnica de Cataluña,
Barcelona, Spain. For $n \ge 2$, let $a_1, \ldots, a_n$ be positive numbers that sum to 1, let $E = \{1, \ldots, n\}$, and let $F = \{(i, j) \in E \times E : i < j\}$. Prove that
\[
\sum_{(i,j) \in F} \frac{(a_i - a_j)^2 + 2a_i a_j (1 - a_i)(1 - a_j)}{(1 - a_i)^2 (1 - a_j)^2} + \sum_{i \in E} \frac{(n + 1)a_i^2 + n a_i}{(1 - a_i)^2} \ge \frac{n^2 (n + 2)}{(n - 1)^2}.
\]
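A quick exact computation can serve as a sanity check on the statement. The sketch below reads the left side as a sum over pairs $i < j$ of $\bigl((a_i - a_j)^2 + 2a_ia_j(1 - a_i)(1 - a_j)\bigr)/\bigl((1 - a_i)^2(1 - a_j)^2\bigr)$ plus a sum over $i$ of $\bigl((n + 1)a_i^2 + na_i\bigr)/(1 - a_i)^2$, and the right side as $n^2(n + 2)/(n - 1)^2$. The function names are hypothetical, and a numerical check is of course not a proof.

```python
from fractions import Fraction
from itertools import combinations

def lhs(a):
    """Left side of the inequality for a list of positive Fractions summing to 1."""
    n = len(a)
    pair_sum = sum(
        ((x - y) ** 2 + 2 * x * y * (1 - x) * (1 - y)) / ((1 - x) ** 2 * (1 - y) ** 2)
        for x, y in combinations(a, 2)  # pairs (a_i, a_j) with i < j
    )
    single_sum = sum(((n + 1) * x ** 2 + n * x) / (1 - x) ** 2 for x in a)
    return pair_sum + single_sum

def rhs(n):
    """Right side of the inequality."""
    return Fraction(n ** 2 * (n + 2), (n - 1) ** 2)

# At the uniform point a_i = 1/n the two sides coincide.
uniform_equal = all(lhs([Fraction(1, n)] * n) == rhs(n) for n in range(2, 7))
sample = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
```

Equality at $a_i = 1/n$ suggests that the uniform point is the extremal case.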
11582. Proposed by Aleksandar Ilić, University of Niš, Serbia. Let $n$ be a positive integer, and consider the set $S_n$ of all numbers that can be written in the form $\sum_{i=2}^{k} a_{i-1}a_i$ with $a_1, \ldots, a_k$ being positive integers that sum to $n$. Find $S_n$.
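For small $n$ the set can be explored by brute force. The sketch below enumerates every ordered tuple of positive integers summing to $n$ and collects the sums of products of adjacent entries; it assumes that $k = 1$ is allowed (giving the empty sum 0), which the statement does not settle, and the function names are hypothetical.

```python
def compositions(n):
    """Yield every tuple of positive integers that sums to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def adjacent_product_sums(n):
    """Sorted list of all values a_1 a_2 + a_2 a_3 + ... + a_{k-1} a_k over compositions of n."""
    return sorted({sum(x * y for x, y in zip(c, c[1:])) for c in compositions(n)})
```

There are $2^{n-1}$ compositions of $n$, so this is only practical for small $n$, but it is enough to guess a pattern.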
doi:10.4169/amer.math.monthly.118.06.557
11585. Proposed by Bruce Burdick, Roger Williams University, Bristol, RI. Show that
\[
\sum_{k=3}^{\infty} \frac{1}{k} \left( \sum_{m=1}^{k-2} \zeta(k - m)\,\zeta(m + 1) - k \right) = 3 + \gamma^2 + 2\gamma_1 - \frac{\pi^2}{3}.
\]
SOLUTIONS
Editorial comment. Two versions of this problem appeared; the first was not what the
proposer intended. The treatment of the upper bound given in the March issue of this
column (p. 278) fails as a solution to the corrected version. The maximum of F in the
closure of the feasible region is attained not only at a corner, which is off-limits, but
also at the other boundary points noted. The solver list here includes those who had
supplied solutions under a new deadline. The editors regret the confusion.
AB = CD = EF.
Case 2: △QRS has the same orientation as △MNP. Now
(b + e) + (c + f ) + 2 (a + d) = 0. (3)
Multiplying (1) by 1
1−
1
, multiplying (3) by − 1− , and adding, we obtain c − e = ( f −
2 1
d). Therefore CE = DF, so CD = EF. Multiplying (1) by 1− , multiplying (3) by − 1− ,
and adding, we obtain e − a = ( f − b). Therefore EA = FB, so EF = AB.
Also solved by R. Chapman (U. K.), P. P. Dályay (Hungary), M. Garner, M. Goldenberg & M. Kaplan, J.-P.
Grivaux (France), S. W. Kim (Korea), O. Kouba (Syria), O. P. Lossers (Netherlands), M. A. Prasad (India), R.
Stong, S. Tonegawa & F. Vafa, and the proposer.
A Series Equation
11473 [2009, 941]. Proposed by Paolo Perfetti, Mathematics Dept., University “Tor
Vergata Roma,” Rome, Italy. Let α and β be real numbers such that −1 < α + β < 1
and such that, for all integers k ≥ 2,
From the convergence of T and U , it follows that the second term goes to zero as N
tends to infinity. Thus
T + 2U = −α − 2 log 2.
Also solved by O. Kouba (Syria), R. Stong, and the GCHQ Problem Solving Group (U. K.).
Also solved by A. Alt, G. Apostolopoulos (Greece), R. Bagby, D. Beckwith, E. Bráune (Austria), R. Chapman
(U. K.), P. P. Dályay (Hungary), J. Fabrykowski & T. Smotzer, H. Y. Far, O. Faynshteyn (Germany), V. V.
Garcia (Spain), O. Kouba (Syria), K.-W. Lau (China), J. H. Lindsey II, Á. Plaza & S. Falcón (Spain), C.
Pohoata (Romania), C. R. Pranesachar (India), R. Stong, E. Suppa (Italy), M. Tetiva (Romania), M. Vowe
(Switzerland), L. Wimmer (Germany), L. Zhou, GCHQ Problem Solving Group (U. K.), and the proposer.
(a) Show that (C, c, P) exists for all allowed choices of C, c, and P, and that it is
independent of P.
(b) Find a formula for (C, c, P) in terms of r , R, and the distance d from O to o.
Solution by Richard Stong, Center for Communications Research, San Diego, CA. We will show
\[
(C, c, P) = \frac{F\!\left(\frac{1}{2}\arccos\frac{r - d}{R} \,\middle|\, m\right)}{K(m)}, \quad \text{where } m = \frac{4dR}{(R + d)^2 - r^2},
\]
which is independent of $P$. We have used the incomplete elliptic integral of the first kind, defined by
\[
F(\theta \mid m) = \int_0^{\theta} \frac{dt}{\sqrt{1 - m\sin^2 t}} = \int_0^{\sin\theta} \frac{dy}{\sqrt{1 - y^2}\,\sqrt{1 - my^2}},
\]
satisfies
\[
\frac{dI}{dt} = \frac{1}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi_+}}\,\frac{d\phi_+}{dt} - \frac{1}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi_-}}\,\frac{d\phi_-}{dt} = r - r = 0
\]
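and is a constant. One possible chord is the vertical one through the point $(r, 0)$ with $\theta = 0$, $\phi_{\pm} = \pm\arccos((r - d)/R)$, so we obtain
\[
\begin{aligned}
I &= 2\int_0^{\arccos((r - d)/R)} \frac{d\phi}{\sqrt{R^2 + d^2 - r^2 + 2dR\cos\phi}} \\
&= \frac{4}{\sqrt{(R + d)^2 - r^2}}\, F\!\left(\frac{1}{2}\arccos\frac{r - d}{R} \,\middle|\, \frac{4dR}{(R + d)^2 - r^2}\right).
\end{aligned}
\]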
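The reduction to an elliptic integral comes from the substitution $\phi = 2t$ together with $\cos\phi = 1 - 2\sin^2 t$. It can be checked numerically with a short sketch, assuming sample values $R = 2$, $d = 0.5$, $r = 1$ (chosen only so that $r < R - d$ and the parameter lies in $(0, 1)$) and a hand-rolled Simpson rule in place of a library routine; all names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def incomplete_F(theta, m):
    """Incomplete elliptic integral of the first kind F(theta | m), by quadrature."""
    return simpson(lambda t: 1 / math.sqrt(1 - m * math.sin(t) ** 2), 0.0, theta)

def chord_integral(R, d, r):
    """The integral I written directly in the variable phi."""
    upper = math.acos((r - d) / R)
    integrand = lambda phi: 1 / math.sqrt(R * R + d * d - r * r + 2 * d * R * math.cos(phi))
    return 2 * simpson(integrand, 0.0, upper)

def closed_form(R, d, r):
    """The same quantity expressed through F with parameter m = 4dR / ((R + d)^2 - r^2)."""
    scale = (R + d) ** 2 - r ** 2
    m = 4 * d * R / scale
    return 4 / math.sqrt(scale) * incomplete_F(0.5 * math.acos((r - d) / R), m)

difference = abs(chord_integral(2.0, 0.5, 1.0) - closed_form(2.0, 0.5, 1.0))
```

Both sides are computed by the same quadrature rule, so agreement here tests the change of variables rather than the accuracy of any particular elliptic-integral library.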
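This integral is over an interval of at least $\lfloor (\phi_k - \phi_0)/(2\pi) \rfloor$ complete periods and fewer than $\lceil (\phi_k - \phi_0)/(2\pi) \rceil$ complete periods. Hence
\[
\left\lfloor \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rfloor J \le kI \le \left\lceil \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rceil J.
\]
Thus
\[
\frac{I}{J} - \frac{1}{k} \le \frac{1}{k}\left( \left\lceil \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rceil - 1 \right) \le \frac{\sum_{j=1}^{k} \omega_j}{2\pi k} \le \frac{1}{k}\left( \left\lfloor \frac{\sum_{j=1}^{k} \omega_j}{2\pi} \right\rfloor + 1 \right) \le \frac{I}{J} + \frac{1}{k}
\]
and
\[
\lim_{k \to \infty} \frac{\sum_{j=1}^{k} \omega_j}{2\pi k} = \frac{I}{J},
\]
which is the quotient of elliptic integrals claimed.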
Editorial comment.
In the classical case, when the trajectory closes (returns to its starting point after finitely many steps), this “winding density” is rational: the number of times the closed trajectory goes around the circle divided by the number of intervals in the trajectory. The use of elliptic integrals to compute it is known, and in many special cases it can be computed without elliptic integrals: see http://mathworld.wolfram.com/PonceletsPorism.html.
Also solved by J. A. Grzesik, and the proposer.
Continuous Symmetry: From Euclid to Klein. By William Barker and Roger Howe. American
Mathematical Society, Providence, RI, 2007, xxi + 546 pp., ISBN 978-0821839003, $69.
This is a book about plane Euclidean geometry with special emphasis on the group
of isometries. It includes the classification of plane isometries into reflections, trans-
lations, rotations, and glide reflections, and also the classification of frieze groups and
the seventeen wallpaper groups with complete proofs. It offers unusual proofs of some
standard theorems of plane geometry, making systematic use of the group of isome-
tries. The book is intended as a one-semester course, with exercises, though the in-
structor will have to be somewhat selective, since there is more than enough material
for one semester. The authors state in the preface, “We have tried to write a book that
honors the Greek tradition of synthetic geometry and at the same time takes Felix
Klein’s Erlanger Programm seriously.”
To begin with, the authors devote the first chapter to the axiomatic foundations of
plane geometry. Here already, following a popular modern trend, they diverge from
Euclid’s purely synthetic geometry by presupposing the real numbers, and implicitly
using some concepts of analysis. In Euclid’s Elements there are no numbers, no mea-
sure of length of a segment, no measure of angles. Instead there is an undefined notion
of congruence. Even in Hilbert’s rigorous rewriting of the foundations of Euclidean
geometry in [5], numbers are not necessary. It is only after the initial development
that it appears that the Euclidean plane can be represented as a Cartesian plane over a
Euclidean ordered field, which can be the real numbers, but need not be. Presupposing
the real numbers does simplify the technical beginnings of the subject, but one loses
the purity of Euclid’s synthetic approach.
The authors say in a footnote on page 1 that they have closely followed Moise’s
book [8] for the axiomatic foundations. But while Moise takes the distance function,
which to each pair of points assigns a real number, as a basic datum of the theory,
Barker and Howe prefer to take a coordinate system on each line as their data. They
call this the “Ruler Axiom” (p. 17), which states that for each line there exists a one-to-one correspondence with the real numbers. However, what they need is not the existence, but the choice for each line of a fixed coordinate system. This leads to a certain
amount of confusion in the first chapter, and to some incorrect claims. These have been
corrected in a list on the web page for the book [2] together with a revised version of
Chapter 1. The axiomatic foundations are summarized on page 120. They postulate a
set of points, a distinguished collection of subsets called lines, on each line a coordi-
nate system (i.e., a 1-1 correspondence with the real numbers), and an angle measure
function, satisfying incidence axioms, a plane separation axiom, angle measure ax-
ioms, the side-angle-side axiom (SAS), and the parallel postulate.
doi:10.4169/amer.math.monthly.118.06.565
EP1. A geometry is the study of the properties of figures in a set X that are invariant
under a group G of symmetries acting on the set X.
EP2. Starting with an appropriate group G, one can reconstruct the set X, together
with the action of G on X, to obtain a geometry.
Felix Klein’s Erlanger Programm [7] was his inaugural dissertation on his appoint-
ment as professor at the University of Erlangen. The main import of Klein’s paper
is expressed by statement EP1 above. For example, we can say two figures are con-
gruent if there exists an element of the group taking one to the other. To set the im-
portance of Klein’s work in perspective, one must remember that when Klein was
writing, group theory was in its infancy [9]. Groups appeared in Galois’s work in
the early 19th century as finite permutation groups, or groups of substitutions as they
were called then. The theory of finite substitution groups was fairly well developed
by the mid 19th century as we see from the masterful work of Jordan [6]. The no-
tion of an abstract group had been suggested earlier by Cayley, but did not take hold
until much later. So Klein says at the beginning of his work [7, p. 6, footnote 3] that
he will borrow the concepts and notations of a group from substitution theory and
apply it to the groups of transformations acting on a geometry, in particular, to the
groups of isometries and similarities. It was a great conceptual leap to consider an in-
finite group of transformations acting on the continuous space of a geometry, and this
idea had tremendous influence on later studies of geometry (think of Lie groups, for
example).
As for the idea expressed in EP2 above, this came only much later, and is not due
to Klein, though one could argue that it is a natural outgrowth of his work. The theory
1. F. Bachmann, Aufbau der Geometrie aus dem Spiegelungsbegriff, Springer, Berlin, 1973.
2. W. Barker and R. Howe, Additional material for Continuous Symmetry: From Euclid to Klein (2010), available at http://www.ams.org/publications/authors/books/postpub/mbk-47.
3. H. S. M. Coxeter and S. L. Greitzer, Geometry Revisited, Mathematical Association of America, Wash-
ington, DC, 1967.
4. R. Hartshorne, Geometry: Euclid and Beyond, Springer, New York, 2000.
5. D. Hilbert, Grundlagen der Geometrie, in Festschrift zur Feier der Enthüllung des Gauss-Weber-Denkmals
in Göttingen, B. G. Teubner, Leipzig, 1899.
6. C. Jordan, Traité des Substitutions et des Equations Algébriques, Gauthier-Villars, Paris, 1870.
7. F. Klein, Vergleichende Betrachtungen über Neuere Geometrische Forschungen, A. Deichert, Erlangen,
1872.
8. E. Moise, Elementary Geometry from an Advanced Standpoint, 3rd ed., Addison Wesley, Reading, MA,
1990.
9. H. Wussing, The Genesis of the Abstract Group Concept, MIT Press, Cambridge, MA, 1984.
Rodney Nillsen
The final part of the book introduces and motivates measure theory and
the notion of a measurable set, and describes the relationship of Birkhoff’s
Individual Ergodic Theorem to the preceding ideas. Developments in other
dynamical systems are indicated, in particular Lévy’s result on the frequency
of occurrence of a given digit in the partial fractions expansion of a number.