STUDENT MATHEMATICAL LIBRARY

Volume 76

Winding Around
The Winding Number
in Topology, Geometry,
and Analysis

John Roe

American Mathematical Society

Mathematics Advanced Study Semesters

2010 Mathematics Subject Classification. Primary 55M25;
Secondary 55M05, 47A53, 58A10, 55N15.

For additional information and updates on this book, visit

www.ams.org/bookpages/stml-76

Library of Congress Cataloging-in-Publication Data

Roe, John, 1959–
Winding around : the winding number in topology, geometry, and analysis /
John Roe.
pages cm. — (Student mathematical library ; volume 76)
Includes bibliographical references and index.
ISBN 978-1-4704-2198-4 (alk. paper)
1. Mathematical analysis—Foundations. 2. Associative law (Mathematics)
3. Symmetric functions. 4. Commutative law (Mathematics) I. Title.
QA299.8.R64 2015
515—dc23
2015019246

Copying and reprinting. Individual readers of this publication, and nonprofit
libraries acting for them, are permitted to make fair use of the material, such as to
copy select pages for use in teaching or research. Permission is granted to quote brief
passages from this publication in reviews, provided the customary acknowledgment of
the source is given.
Republication, systematic copying, or multiple reproduction of any material in this
publication is permitted only under license from the American Mathematical Society.
Permissions to reuse portions of AMS publication content are handled by Copyright
Clearance Center’s RightsLink service. For more information, please visit:
http://www.ams.org/rightslink.
Send requests for translation rights and licensed reprints to
reprint-permission@ams.org.
Excluded from these provisions is material for which the author holds copyright.
In such cases, requests for permission to reuse or reprint material should be addressed
directly to the author(s). Copyright ownership is indicated on the copyright page,
or on the lower right-hand corner of the ﬁrst page of each article within proceedings
volumes.

© 2015 by the author. All rights reserved.
Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1 20 19 18 17 16 15

Contents

Foreword: MASS and REU at Penn State University

Preface

Chapter 1. Prelude: Love, Hate, and Exponentials
§1.1. Two sets of travelers
§1.2. Winding around
§1.3. The most important function in mathematics
§1.4. Exercises

Chapter 2. Paths and Homotopies
§2.1. Path connectedness
§2.2. Homotopy
§2.3. Homotopies and simple-connectivity
§2.4. Exercises

Chapter 3. The Winding Number
§3.1. Maps to the punctured plane
§3.2. The winding number
§3.3. Computing winding numbers
§3.4. Smooth paths and loops
§3.5. Counting roots via winding numbers
§3.6. Exercises

Chapter 4. Topology of the Plane
§4.1. Some classic theorems
§4.2. The Jordan curve theorem I
§4.3. The Jordan curve theorem II
§4.4. Inside the Jordan curve
§4.5. Exercises

Chapter 5. Integrals and the Winding Number
§5.1. Differential forms and integration
§5.2. Closed and exact forms
§5.3. The winding number via integration
§5.4. Homology
§5.5. Cauchy’s theorem
§5.6. A glimpse at higher dimensions
§5.7. Exercises

Chapter 6. Vector Fields and the Rotation Number
§6.1. The rotation number
§6.2. Curvature and the rotation number
§6.3. Vector fields and singularities
§6.4. Vector fields and surfaces
§6.5. Exercises

Chapter 7. The Winding Number in Functional Analysis
§7.1. The Fredholm index
§7.2. Atkinson’s theorem
§7.3. Toeplitz operators
§7.4. The Toeplitz index theorem
§7.5. Exercises

Chapter 8. Coverings and the Fundamental Group
§8.1. The fundamental group
§8.2. Covering and lifting
§8.3. Group actions
§8.4. Examples
§8.5. The Nielsen-Schreier theorem
§8.6. An application to nonassociative algebra
§8.7. Exercises

Chapter 9. Coda: The Bott Periodicity Theorem
§9.1. Homotopy groups
§9.2. The topology of the general linear group

Appendix A. Linear Algebra
§A.1. Vector spaces
§A.2. Basis and dimension
§A.3. Linear transformations
§A.4. Duality
§A.5. Norms and inner products
§A.6. Matrices and determinants

Appendix B. Metric Spaces
§B.1. Metric spaces
§B.2. Continuous functions
§B.3. Compact spaces
§B.4. Function spaces

Appendix C. Extension and Approximation Theorems
§C.1. The Stone-Weierstrass theorem
§C.2. The Tietze extension theorem

Appendix D. Measure Zero
§D.1. Measure zero subsets of $\mathbb{R}$ and of $S^1$

Appendix E. Calculus on Normed Spaces
§E.1. Normed vector spaces
§E.2. The derivative
§E.3. Properties of the derivative
§E.4. The inverse function theorem

Appendix F. Hilbert Space
§F.1. Definition and examples
§F.2. Orthogonality
§F.3. Operators

Appendix G. Groups and Graphs
§G.1. Equivalence relations
§G.2. Groups
§G.3. Homomorphisms
§G.4. Graphs

Bibliography

Index

Foreword: MASS and REU at Penn State University

This book is part of a collection published jointly by the American
Mathematical Society and the MASS (Mathematics Advanced
Study Semesters) program as a part of the Student Mathematical
Library series. The books in the collection are based on lecture
notes for advanced undergraduate topics courses taught at the MASS
and/or Penn State summer REU (Research Experiences for Under-
graduates). Each book presents a self-contained exposition of a non-
standard mathematical topic, often related to current research areas,
accessible to undergraduate students familiar with an equivalent of
two years of standard college mathematics and suitable as a text for
an upper division undergraduate course.
Started in 1996, MASS is a semester-long program for advanced
undergraduate students from across the USA. The program’s curricu-
lum amounts to sixteen credit hours. It includes three core courses
from the general areas of algebra/number theory, geometry/topology,
and analysis/dynamical systems, custom designed every year; an in-
terdisciplinary seminar; and a special colloquium. In addition, ev-
ery participant completes three research projects, one for each core
course. The participants are fully immersed into mathematics, and
this, as well as intensive interaction among the students, usually leads
to a dramatic increase in their mathematical enthusiasm and achieve-
ment. The program is unique for its kind in the United States.
The summer mathematical REU program is formally indepen-
dent of MASS, but there is a signiﬁcant interaction between the two:
about half of the REU participants stay for the MASS semester in
the fall. This makes it possible to oﬀer research projects that re-
quire more than seven weeks (the length of the REU program) for
completion. The summer program includes the MASS Fest, a two
to three day conference at the end of the REU at which the partici-
pants present their research and that also serves as a MASS alumni
reunion. A nonstandard feature of the Penn State REU is that, along
with research projects, the participants are taught one or two intense
topics courses.
Detailed information about the MASS and REU programs at
Penn State can be found on the website www.math.psu.edu/mass.

Preface

Mathematics is an endlessly fruitful subject. One reason is its ability
to make lemons into lemonade. In mathematics, the gap between
what we’re hoping to prove and what is actually true can itself become
something that we can measure, something we can quantify — the
basis for a whole new world of mathematical theory.
Let me give an example. In Calculus II, you learn that every (rea-
sonably smooth) function of one variable is the derivative of another
function — the fundamental theorem of calculus says that integration
is the reverse of differentiation. In Calculus III, you find out that in
higher dimensions there is a necessary condition that must be satis-
fied if n given functions are to be the partial derivatives of another
function. For instance, in dimension 2, if functions u and v are to be
the partial derivatives (with respect to x and y) of a function f , then
the integrability condition

\[ \frac{\partial u}{\partial y} = \frac{\partial v}{\partial x} \]

must be satisfied.
Is this necessary condition always sufficient? For functions de-
fined on a disc, the answer is yes (“every irrotational vector field is
the gradient of a potential”). On more general domains, though, the
answer is no, as is shown by the notorious example
\[ u = \frac{y}{x^2 + y^2}, \qquad v = \frac{-x}{x^2 + y^2}, \]
defined on $\mathbb{R}^2 \setminus \{(0, 0)\}$. Most Calculus III courses treat this as a
nuisance, an anomaly. What if we instead treated it as a clue, a signpost,
the start of a trail that might lead to a new kind of mathematics?
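
As a quick numerical illustration (not part of the original text), the Python sketch below takes the pair u = y/(x² + y²), v = −x/(x² + y²), checks the integrability condition by central finite differences at one sample point, and then approximates the integral of u dx + v dy around the unit circle. The function names, the step size `h`, and the sample count `n` are illustrative assumptions. If u and v were the partial derivatives of a single function on the punctured plane, the loop integral would be zero.

```python
import math

def u(x, y):
    return y / (x * x + y * y)

def v(x, y):
    return -x / (x * x + y * y)

def integrability_gap(x, y, h=1e-5):
    """Central-difference estimate of du/dy - dv/dx at the point (x, y)."""
    du_dy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    dv_dx = (v(x + h, y) - v(x - h, y)) / (2 * h)
    return du_dy - dv_dx

def circle_integral(n=10_000):
    """Riemann-sum estimate of the integral of u dx + v dy over the unit circle,
    traversed once counterclockwise."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = k * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += u(x, y) * dx + v(x, y) * dy
    return total

gap = integrability_gap(0.7, -1.3)   # close to 0: the condition holds here
loop = circle_integral()             # close to -2*pi, not 0
```

On the unit circle the integrand reduces to −dt, so the loop integral is −2π even though the integrability condition holds away from the origin.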
In fact, this trailhead leads us up one of the many routes to the
summit of Mount Winding-Number, one of the most beautiful peaks
of the Mathematical Range.¹ This book is a sort of hiker’s guide to
that mountain. Some guides want to get you to the top as quickly as
possible so as to move on to “greater” things. I am not one of those.
Rather, I want us to take our time, to explore different paths, and to
get to know the shape of the mountain from various angles: “winding
around” is a description of the book’s methodology as well as of its
subject matter. Only in the final chapter will we begin to explore
the high ridge that connects our mountain to the “greater ranges” of
algebraic topology.
The book originates from a course taught in Penn State’s MASS
program in the fall of 2013. MASS is a unique semester-long intensive
experience that brings together a “critical mass” of highly motivated
undergraduate students from colleges across the USA and elsewhere.
It is a pleasure to record my thanks for the opportunity to share in
this program once again and for the energetic participation of all the
students in the course. This book is dedicated to you all.
Note to the reader: Our trail in this book will wind through
several different parts of mathematics, parts which are often segre-
gated in their own courses with titles like “abstract algebra” or “anal-
ysis” or “geometry” or “topology”. Probably, you will be more fa-
miliar with some of these than with others. Don’t worry! A series of
appendices reviews necessary background (and gives suggestions for
further reading if you want to follow up in greater depth) in these
various subject areas. As you read through the main text, notes will
direct you to the relevant appendix at the first point that its concepts
are required. Then you can decide whether to read the appendix for
a quick refresher or to continue with the main text and hope for the
best. Whichever you decide, be sure to have fun! This is a beautiful
mathematical journey and I want you to enjoy it. If you have any
comments or suggestions for improving the book, feel free to contact
me at [email protected].
The website for this book is www.ams.org/bookpages/stml-76.

¹ For an extended riff on the idea of mathematics as mountaineering, see [1].

John Roe

Chapter 1

Prelude: Love, Hate, and Exponentials

1.1. Two sets of travelers

A topologist is a mathematician who can’t tell the
difference between a donut and a coffee mug.

Figure 1.1. Transforming a donut into a coffee mug.

This well-known saying expresses the idea that topology studies
those properties of “spaces” (we will have to say what we mean by
that) which are unaffected by “continuous changes” (we will have
to say what we mean by that also). Why might mathematicians be
interested in such a thing? Doesn’t it seem rather, well, imprecise?
Here’s a story (freely adapted from Chapter 1 of [3], where it is
attributed to N. N. Konstantinov) that hints at an answer.

In a certain country there are two cities — call
them Aberystwyth and Betws-y-Coed — and two
roads that join them: the “low road” and the “high
road”.
In A dwell two lovers, Maelon and Dwynwen,
who must travel to B: M by the high road, and D
by the low. So great is the force of their love that
if at any instant they are separated by ten miles
or more, they will surely die.
As well as a pair of lovers, our story contains a
pair of sworn enemies, Llewelyn and John. As our
story begins, L is in A, J is in B, and they must
exchange places, L traveling from A to B via the
high road while J travels from B to A via the low
road. So great is the force of their hatred that if
at any instant they are separated by ten miles or
less, they will surely die.
Prove that tragedy is inevitable. At least two
people will end up dead.
Remark 1.1.1. The point about this story is that we are given no
specific information about the travels of D, M, L, and J: how fast
they go, whether they halt on the journey, whether they speed up or
slow down or even backtrack. Any mathematical tool effective enough
to solve the problem must not care about these kinds of “geometrical”
specifics: must not care, in fact, about the difference between
the donut and the coffee mug. It is wrong to suppose that topology,
because it does not care about such distinctions, is somehow impre-
cise. On the contrary! Only a truly powerful theory can draw precise,
specific conclusions from such unspecific initial data.

There are two components to solving the problem. The first is
to set up a suitable graphical representation, which turns this
picturesque story into a problem in topology. The second is to solve the
resulting topological problem.
In the first step, we parameterize the problem by the unit square
S = [0, 1] × [0, 1]. A point (x, y) ∈ S is thought of as describing
the location of a pair of characters (either M and D, or L and J)
along the high and low roads, respectively. So for instance the point
(0, 0) represents “both characters are at A”, (1, 1) represents “both
characters are at B”, (0.4, 0.7) represents “the first character is 40
percent of the way along the high road from A to B and the second
character is 70 percent of the way along the low road”, and so on.
The travels of a pair of characters along the high and low roads are
now encoded in the movement of the single point (x, y) through S.

Figure 1.2. Parameterizing the lovers and haters problem. (Axis labels: High road, Low road. Curve labels: Lovers’ path, Haters’ path.)

Now the terms of the problem say that the path which describes
the motion of the pair (M, D) must start at (0, 0) and end at (1, 1).
And the path which describes the motion of (L, J) must start at (0, 1)
and end at (1, 0). So (and this is the topological bit that we’ll have to
come back to), “obviously”, the two paths have to cross (Figure 1.2).
Okay, what happens at a crossing point (x0 , y0 )? This represents
a pair of points — one on the high road, one on the low — which
are occupied (at diﬀerent times) both by M and D and by L and J.
If that pair of points is 10 miles or more apart, it spells doom for M
and D; 10 miles or less, curtains for L and J. Either way, tragedy is
inevitable, just as the problem says.
So here is the key topological fact that we have to prove.

Theorem 1.1.2. Two continuous paths in the unit square S, one
joining (0, 0) to (1, 1) and the other joining (0, 1) to (1, 0), must cross
somewhere.
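
To make the statement concrete, here is a small Python sketch (an illustration under stated assumptions, not a proof and not taken from the text) that tests whether two polygonal paths in the unit square share a point. The sample paths `lovers` and `haters` are hypothetical; the first one backtracks in the x-direction, so it is not the graph of a function of x. The segment test is the standard orientation-based check from computational geometry.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a):
    +1 for a left turn, -1 for a right turn, 0 if a, b, c are collinear."""
    d = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (d > 0) - (d < 0)

def on_segment(a, b, c):
    """For collinear a, b, c: does c lie on the closed segment ab?"""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))

def segments_meet(p, q, r, s):
    """Do the closed segments pq and rs share at least one point?"""
    o1, o2 = orient(p, q, r), orient(p, q, s)
    o3, o4 = orient(r, s, p), orient(r, s, q)
    if o1 != o2 and o3 != o4:
        return True
    # Remaining cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and on_segment(p, q, r)) or (o2 == 0 and on_segment(p, q, s))
            or (o3 == 0 and on_segment(r, s, p)) or (o4 == 0 and on_segment(r, s, q)))

def paths_cross(path1, path2):
    """Do two polygonal paths (lists of vertices) share a point?"""
    return any(segments_meet(a, b, c, d)
               for a, b in zip(path1, path1[1:])
               for c, d in zip(path2, path2[1:]))

# Hypothetical sample paths inside the unit square.
lovers = [(0.0, 0.0), (0.6, 0.2), (0.3, 0.5), (0.8, 0.6), (1.0, 1.0)]  # (0,0) to (1,1)
haters = [(0.0, 1.0), (0.2, 0.4), (0.7, 0.9), (0.5, 0.1), (1.0, 0.0)]  # (0,1) to (1,0)
crossing_found = paths_cross(lovers, haters)
```

Theorem 1.1.2 predicts that `crossing_found` is true for any such pair of polygonal paths that stay in the square; the check only confirms individual examples and says nothing about general continuous paths.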

Surprisingly (perhaps) this is not easy to prove. Let’s look at one
attempted proof and critique that.
Attempted Proof #1: Let the two paths be the graphs of contin-
uous functions f (x) and g(x). Thus f (0) = 0, f (1) = 1, g(0) = 1,
g(1) = 0. Therefore if we consider the function
h(x) = f (x) − g(x),
we have h(0) = −1 and h(1) = 1. By the intermediate value theorem,
h(x0 ) = 0 for some x0 . Then f (x0 ) = g(x0 ) = y0 , say, so the two
paths cross at (x0 , y0 ). (?)
The trouble with this argument is that it assumes that our paths
can be represented as the graphs of functions — in other words, that
there is no “backtracking” in the x-direction. But nothing in the
statement of the problem requires this, and there are many continuous
paths which cannot be represented as graphs, either in this way where
y is a function of x or in the reverse way where x is a function of y. In
a sense, the “no backtracking assumption” has allowed us to reduce
the 2-dimensional problem to a 1-dimensional one, which can then
be solved using a 1-dimensional tool, the intermediate value theorem.
Without making this assumption we are confronted with a situation
which requires essentially 2-dimensional tools.
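
Under the no-backtracking assumption, the intermediate value theorem argument can be carried out numerically by bisection. The sketch below is only an illustration: the sample functions `f` and `g` and the tolerance `tol` are assumptions chosen for the example, and the method works only because both paths are graphs of functions of x.

```python
def crossing_by_bisection(f, g, tol=1e-12):
    """Approximate an x in [0, 1] with f(x) = g(x), assuming h = f - g is
    continuous with h(0) <= 0 <= h(1)."""
    def h(x):
        return f(x) - g(x)

    lo, hi = 0.0, 1.0
    if h(lo) > 0 or h(hi) < 0:
        raise ValueError("need h(0) <= 0 <= h(1)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) <= 0:
            lo = mid      # a sign change still lies in [mid, hi]
        else:
            hi = mid      # a sign change still lies in [lo, mid]
    return (lo + hi) / 2

def f(x):
    return x * x          # f(0) = 0, f(1) = 1

def g(x):
    return 1 - x          # g(0) = 1, g(1) = 0

x0 = crossing_by_bisection(f, g)
y0 = f(x0)                # the two graphs cross near (x0, y0)
```

For this choice of f and g the crossing is at x0 = (√5 − 1)/2. Nothing comparable is available once a path is allowed to backtrack, which is the point of the critique above.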
Attempted Proof #2: Consider the loop in the plane formed by
traveling from (0, 0) to (1, 1) along the lovers’ path and then returning
via the circular arc
\[ t \mapsto (\cos t,\ 1 + \sin t), \qquad 0 \le t \le 3\pi/2. \]
The point (1, 0) is clearly outside the loop and (0, 1) is inside it, so
any path — such as the haters’ path — from one to the other must
cross the loop somewhere.
This argument is correct, but the notions “outside” and “inside”
have to be made precise, and this isn’t as easy as it may seem —
especially if we consider paths that may cross themselves or self-
intersect. What we will end up doing is deﬁning whether a point p is
“outside” or “inside” a loop γ by counting how many times γ “winds
around” p. Of course that simply shifts the question to explaining
what we mean by “winds around”, but this is a question to which it
is possible to give a precise answer.

1.2. Winding around

Counting revolutions or “windings” is an important and familiar no-
tion, in everyday life as well as in mathematics and science. We
measure our days by revolutions of the earth, our months by revo-
lutions of the moon around the earth, and our years by revolutions
of the earth around the sun. Computing orbits and their periods is
the beginning of the theory of gravitation. The metaphor of life as
a “wheel of fortune” resonates through cultures ancient and modern.
Jerry Garcia sang in 1972

The wheel is turning and you can’t slow down
You can’t let go and you can’t hold on
You can’t go back and you can’t stand still
If the thunder don’t get you then the lightning will.

How many times does the wheel turn? If we stipulate that at
the end of the story the wheel is in the same position as it was in
the beginning, then the answer is an integer — a whole number of
turns, positive (by convention) for counterclockwise revolutions and
negative for clockwise ones. This integer is the winding number, the
central concept of this book. Notice that to compute it, you have to
know the whole continuous story of the motion of the wheel: it is not
enough to look at snapshots of its beginning and end. In other words,
the question “how many times around” is at root a topological one,
and its answer, the winding number, is a topological notion.
We’ve already seen in the previous section an example of how
any kind of continuous motion can be conceptualized as a path in a
suitable abstract space, that is, a mapping from the unit interval into
that space. Similarly, a continuous motion that returns to its starting
point can be conceptualized as a loop, that is, a mapping of the unit
circle into a space. The winding number provides a way to classify
and distinguish such loops. As we hinted above, it is the key to such
intuitively natural notions as the distinction between the “inside” and
the “outside” of a closed curve in the plane.
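
One informal way to count turns on a computer, ahead of the precise definition given later in the book, is to sample a loop in the plane and add up the small signed angles between consecutive samples as seen from a point p. The Python sketch below does this for polygonal loops; the function names and the sampling density are assumptions made for the illustration, and the method is only reliable when p is off the loop and consecutive samples subtend an angle smaller than π at p.

```python
import cmath
import math

def winding_number(loop, p=0j):
    """Number of counterclockwise turns about p made by the closed polygon
    through the complex points in `loop`.

    Assumes p is not on the polygon and that consecutive samples subtend
    an angle of less than pi at p.
    """
    total = 0.0
    n = len(loop)
    for k in range(n):
        a = loop[k] - p
        b = loop[(k + 1) % n] - p
        total += cmath.phase(b / a)   # signed angle from a to b, in (-pi, pi]
    return round(total / (2 * math.pi))

def circle(turns, n=400, center=0j, radius=1.0):
    """n samples of a circle traversed `turns` times (negative means clockwise)."""
    return [center + radius * cmath.exp(2j * math.pi * turns * k / n)
            for k in range(n)]

once_ccw = winding_number(circle(1))                   # one counterclockwise turn
twice_cw = winding_number(circle(-2))                  # two clockwise turns
outside = winding_number(circle(1, center=3 + 0j))     # loop does not enclose 0
```

The result depends on the whole sampled motion, not just on its endpoints, which mirrors the remark above that snapshots of the beginning and end are not enough.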
Many students will ﬁrst meet the winding number in a course
on complex analysis, rather than topology. This is because of the
beautiful way the winding number enters into Cauchy’s residue theo-
rem, which allows one to compute certain integrals of a function f (z)
in terms of the behavior of f at certain special points, its so-called
poles or singularities, and the winding numbers of loops around these
singularities. That powerful subject is not emphasized here, however
(in particular, one does not need any prior acquaintance with com-
plex analysis in order to read this book). Why? Because important
as complex analysis is (with its applications throughout mathemat-
ics, physics, and engineering), the notion of winding number turns
out to have ramiﬁcations far beyond even that ﬁeld. In fact, it’s
not really too much of a stretch to see the winding number as the
golden cord which guides the student through the labyrinth of clas-
sical mathematics: connecting algebra and analysis, potential theory
and cohomology, complex numbers and just about everything.
In this book, we will look at some of the many ways that winding
numbers show up in mathematics. The settings are quite diverse:
topology, geometry, functional analysis, complex analysis, algebraic
systems, and even Lie groups. However, underneath it all is a simple
idea: winding around.
Let’s get started.

1.3. The most important function in mathematics

We’ll begin by renewing our acquaintance with a familiar object —
the function $e^x$ — from the viewpoint of the complex plane. As Euler
discovered, the exponential and trigonometric functions are closely
related in the complex domain, and in particular the exponential func-
tion can be used to describe the unit circle in C. It should therefore
be no surprise that exponentials are going to be closely involved in
our discussion of the winding number, which is all about continuous
travel on the unit circle.
The exponential function exp(z), or $e^z$, is defined by
\[ \exp(z) = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots. \]

Rudin [34] begins his classic book Real and Complex Analysis with
the statement “This is the most important function in mathematics.”
Before we can start looking at its properties, though, we need to
remind ourselves what kind of thing z is here.
Remember that a complex number is a formal expression of the
sort
z = x + yi
where x and y are real numbers and $i^2 = -1$. (We call x the real
part of z and y the imaginary part, and we use the notation x = Re z,
y = Im z.) We’ll think of x + yi as represented by the point (x, y) of
the plane (sometimes called the complex plane or the Argand diagram
in this context).
There is no problem in adding, subtracting, or multiplying com-
plex numbers by the usual rules. However, the following is a nontrivial
fact.
Theorem 1.3.1. The complex numbers form a field; i.e., every nonzero
complex number has a multiplicative inverse.

Proof. We write an explicit formula for the inverse. If z = x + yi is a
complex number, then its absolute value or modulus |z| is the nonnegative
real number defined by
\[ |z|^2 = x^2 + y^2. \]
The complex conjugate of z is
\[ \bar{z} = x - yi. \]
One computes
\[ z\bar{z} = x^2 + y^2 = |z|^2. \]
Thus, if $z \neq 0$, one has |z| > 0 and
\[ \frac{\bar{z}}{|z|^2} = \frac{x}{x^2 + y^2} - \frac{y}{x^2 + y^2}\, i \]
is the multiplicative inverse of z.
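
The inverse formula can be checked directly with pairs of real numbers, without relying on a built-in complex type. This is a minimal sketch; the helper names `inverse` and `multiply` are made up for the illustration.

```python
def inverse(x, y):
    """Real and imaginary parts of 1/(x + yi), using conj(z) / |z|^2."""
    r2 = x * x + y * y            # |z|^2
    if r2 == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return x / r2, -y / r2

def multiply(a, b):
    """(a0 + a1 i)(b0 + b1 i), expanded using i^2 = -1."""
    return a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0]

z = (3.0, -4.0)                   # z = 3 - 4i, with |z|^2 = 25
w = inverse(*z)                   # (3/25, 4/25)
product = multiply(z, w)          # should be (1, 0) up to rounding
```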
Remark 1.3.2. The field C may be considered as a vector space over
R (see Appendix A to review the theory of vector spaces and linear
algebra). When so considered, C is 2-dimensional: any R-basis has
two elements (the canonical example is of course the basis consisting
of 1 and i). The possible R-bases (z, w) of C fall into two classes,
right-handed like (1, i) and left-handed like (1, −i). Formally we may
say that (z, w) is a right-handed basis if Im(z̄w) > 0 and a left-handed
basis if Im(z̄w) < 0. (If Im(z̄w) = 0, then z and w do not form a
basis.)

The exponential series
\[ \exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!} \]
converges for all values of z and defines a differentiable function on
the whole complex plane (such a function is called an entire function).
We also use the notation $e^z$ for this function.
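
The convergence of the series can be observed numerically by comparing partial sums with the library exponential. This is a minimal sketch; the function name `exp_series` and the default of 60 terms are assumptions chosen for the example, and 60 terms is only adequate for moderately sized |z|.

```python
import cmath

def exp_series(z, terms=60):
    """Sum of the first `terms` terms of the series sum_{n>=0} z^n / n!."""
    total = 0j
    term = 1 + 0j                 # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # turns z^n / n! into z^(n+1) / (n+1)!
    return total

samples = [0, 1, 1j * cmath.pi, 2 - 3j]
errors = [abs(exp_series(z) - cmath.exp(z)) for z in samples]
```

The same function can be used to spot-check the addition law: `exp_series(z + w)` and `exp_series(z) * exp_series(w)` agree up to rounding for the sample values above.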
By term-by-term multiplication and differentiation one verifies
(treat these as exercises)
(a) addition law: $e^{z+w} = e^z e^w$;
(b) differentiation law: the function $z \mapsto e^z$ is its own derivative.
The sine and cosine functions are defined in terms of the exponential
by
\[ \sin z = \frac{e^{iz} - e^{-iz}}{2i} = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}, \]
\[ \cos z = \frac{e^{iz} + e^{-iz}}{2} = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n}}{(2n)!}. \]

The exponential, sine, and cosine functions are real-valued for real
arguments, and we have
\[ e^{iz} = \cos z + i \sin z \]
for all z. Moreover, since the power series for the exponential function
has real coefficients, $e^{\bar{z}} = \overline{e^z}$. It follows that
\[ |e^z|^2 = e^z\, \overline{e^z} = e^{z + \bar{z}} = e^{2 \operatorname{Re} z}, \]
so $|e^z| = e^{\operatorname{Re} z}$ for all complex numbers z. In particular, $|e^{iy}| = 1$ for
all real y.

The addition law for the exponential function yields the corresponding
laws for sine and cosine,
\[ \sin(z + w) = \sin z \cos w + \cos z \sin w, \]
\[ \cos(z + w) = \cos z \cos w - \sin z \sin w. \]
In particular $\sin^2 z + \cos^2 z = 1$ — the special case w = −z of the
second identity. One sees by computation that cos has a positive real
zero; define π by letting π/2 be the smallest positive real zero of cos.
We have cos(π/2) = 0 and sin(π/2) = 1. The identities now give
\[ \sin(z + \pi/2) = \cos z, \qquad \cos(z + \pi/2) = -\sin z. \]
Iterating these we find that cos and sin are 2π-periodic, so the
exponential function is 2πi-periodic. In particular we get the famous
formulae
\[ e^{2\pi i} = 1, \qquad e^{\pi i} = -1. \]
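
The definition of π as twice the smallest positive zero of cos can be followed literally on a computer, using only the power series. The sketch below is one possible way to do it, under two assumptions that are stated here rather than proved: cos is positive on [0, 1] (so the zero found in [1, 2] is the smallest positive one), and 40 terms of the series are enough on that interval. The function names are illustrative.

```python
def cos_series(x, terms=40):
    """Partial sum of the cosine series sum_{n>=0} (-1)^n x^(2n) / (2n)!."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

def half_pi(tol=1e-14):
    """Bisection for the zero of cos in [1, 2], where cos(1) > 0 > cos(2)."""
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pi_estimate = 2 * half_pi()
```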

Remark 1.3.3. It’s often useful to represent a complex number
z = x + iy in polar coordinate form as
\[ z = re^{i\theta} = r\cos\theta + ir\sin\theta. \]
Here r = |z|, and θ is a “residue class modulo 2π” (an equivalence
class of real numbers, two numbers being considered as equivalent if
they differ by a multiple of 2π). One calls r the modulus of z and
θ the argument. In polar coordinates the law for multiplication of
complex numbers takes the simple form
\[ (r_1 e^{i\theta_1})(r_2 e^{i\theta_2}) = r_1 r_2 e^{i(\theta_1 + \theta_2)}; \]
you multiply the moduli and add the arguments.
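
The phrase “residue class modulo 2π” matters in practice. In the Python sketch below (sample values chosen arbitrarily for illustration), the arguments 2.5 and 2.0 add up to 4.5, but the standard library reports the argument of the product in the range (−π, π], so it returns 4.5 − 2π instead. The helper `same_angle` is an assumption of this example, not a library function.

```python
import cmath
import math

def same_angle(a, b, tol=1e-12):
    """True if a and b differ by an integer multiple of 2*pi (up to tol)."""
    d = (a - b) / (2 * math.pi)
    return abs(d - round(d)) < tol

z1 = cmath.rect(2.0, 2.5)             # modulus 2.0, argument 2.5
z2 = cmath.rect(1.5, 2.0)             # modulus 1.5, argument 2.0
r, theta = cmath.polar(z1 * z2)       # theta is reported in (-pi, pi]

moduli_multiply = abs(r - 2.0 * 1.5) < 1e-12
arguments_add = same_angle(theta, 2.5 + 2.0)
```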

The identity $e^z \cdot e^{-z} = 1$ shows that the exponential function never
takes the value 0. However, it can take any other value. Indeed if w =
r(cos θ + i sin θ) is a nonzero complex number written in polar form,
then z = log(r) + iθ has $e^z = w$. There are of course infinitely many
such z, differing by integer multiples of 2πi. The really interesting
story begins when we ask how these infinitely many possibilities for
the preimage of w fit together as w varies continuously in C \ {0}.
Let’s start to get a grip on this by considering only a limited range
of values of w, those that lie on the unit circle (|w| = 1).

Figure 1.3. The exponential map illustrated — from the picture,
you can guess that “winding around” will be involved somehow. This is
the graph of $x + iy = e^{it}$. (Axis labels: x, y.)

Lemma 1.3.4. The complex number w = exp(z) lies on the unit
circle if and only if z is purely imaginary, which is to say z = it for
some t ∈ R.

Proof. This follows from the formula $|\exp(z)| = e^{\operatorname{Re} z}$ which we
observed before: |exp(z)| = 1 if and only if Re z = 0, which is to say
that z is purely imaginary.

For t ∈ R we have Euler’s formula

exp(it) = cos t + i sin t.
As t moves along the real axis, the point w = exp(it) rotates with
unit speed around the circle. (If you like, think of t as time, measured
in suitable units, and w as the position of the tip of the minute hand
of a clock whose center is at the origin.¹) Many different t-values
correspond to the same w, just as many different time-values all have the
minute hand pointing to 6. We can think of the exponential function
as “wrapping up” the imaginary axis into a spiral that projects to the
unit circle, as shown in Figure 1.3.

¹ Unfortunately for this illustration, mathematical convention is that the positive direction of rotation is counterclockwise, so you should think of the clock as running backwards.

It is fundamentally important that, although each point w on the
unit circle corresponds to many diﬀerent t-values, there is no way
to choose those t-values for the whole unit circle in a “continuous”
manner. More precisely,
Lemma 1.3.5. There is no continuous function $\theta : S^1 \to \mathbb{R}$ such
that $w = \exp(i\theta(w))$ for all $w \in S^1$. In fact, there is no continuous
function $s : S^1 \to S^1$ such that $s(w)^2 = w$ for all $w \in S^1$.

Proof. The second statement clearly implies the first since if we could
find a function θ having the required properties, we could then define
\[ s(w) = \exp(i\theta(w)/2) \]
and we would have $s(w)^2 = w$.
Suppose then that a continuous function s exists having $s(w)^2 = w$.
Consider the function
\[ u(t) = s(e^{it})\, s(e^{-it}), \qquad t \in \mathbb{R}. \]
This is a continuous function on $\mathbb{R}$. We have
\[ u(t)^2 = s(e^{it})^2\, s(e^{-it})^2 = e^{it} e^{-it} = 1. \]
Thus u(t) = ±1 for each t. A continuous integer-valued function on
$\mathbb{R}$ is constant, so u is constant. But then
\[ -1 = s(-1)^2 = u(\pi) = u(0) = s(1)^2 = 1, \]
which is an obvious contradiction.²
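
The obstruction in Lemma 1.3.5 can be watched numerically, as an illustration and not a substitute for the proof. The Python sketch below tracks an angle continuously along samples of the unit circle by adding the small signed angle between consecutive samples; the function name `lift_angle` and the sample count are assumptions of this example. After one full turn the tracked angle has increased by 2π, so the point 1 receives two different angle values, and the corresponding square root exp(iθ/2) returns with the opposite sign.

```python
import cmath
import math

def lift_angle(points, theta0):
    """Track an angle theta_k with points[k] = exp(i * theta_k), starting from
    theta0 and adding the signed angle between consecutive points.

    Assumes the points lie on the unit circle and that consecutive points
    are less than half a turn apart.
    """
    thetas = [theta0]
    for a, b in zip(points, points[1:]):
        thetas.append(thetas[-1] + cmath.phase(b / a))
    return thetas

n = 1000
loop = [cmath.exp(2j * math.pi * k / n) for k in range(n + 1)]  # starts and ends at 1
thetas = lift_angle(loop, 0.0)

angle_gain = thetas[-1] - thetas[0]          # close to 2*pi, not 0
root_start = cmath.exp(1j * thetas[0] / 2)   # close to +1
root_end = cmath.exp(1j * thetas[-1] / 2)    # close to -1
```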
Remark 1.3.6. It’s interesting to contrast the situation for square
roots of complex numbers, revealed by Lemma 1.3.5, with the corresponding
situation for real numbers. When we look in the real field,
we find two problems: an existence problem (some numbers, the negative
ones, don’t have any square roots) and a uniqueness problem
(other numbers, the positive ones, have more than one, so the symbol
$\sqrt{x}$ can be ambiguous). In the real case it’s easy to resolve the
uniqueness problem by executive order: we just decree, as is done

² This argument is adapted from Beardon’s book [8].

It follows from Lemma 1.3.5 that there is no “complex logarithm”
function $\ell$ defined and continuous on all of $\mathbb{C} \setminus \{0\}$ and such that
$\exp(\ell(z)) = z$. Functions with this property can, however, be found
on some smaller domains. Here is an important example.

Lemma 1.3.7. Let $S = \mathbb{C} \setminus \mathbb{R}^-$ be the complex plane with the negative
real axis removed (this is sometimes called a “slit plane”). There
exists a continuous function $\ell : S \to \mathbb{C}$ such that $\exp(\ell(z)) = z$ for all
z ∈ S.

Proof. Each z ∈ S has a unique polar coordinate representation
\[ z = re^{i\theta} = r\cos\theta + ir\sin\theta, \qquad -\pi < \theta < \pi, \]
and the polar coordinates r, θ depend continuously on z ∈ S. Put
\[ \ell(z) = \log(r) + i\theta, \]
where log is the usual natural logarithm for positive real numbers.

Remark 1.3.8. A function having the property asserted by the
lemma is called a branch of the logarithm defined on the slit plane
$\mathbb{C} \setminus \mathbb{R}^-$. Notice that such branches are not unique: if $z \mapsto \ell(z)$ has
the property of the lemma, then so does $z \mapsto \ell(z) + 2k\pi i$ for any
integer k. Later, we will see that this integer ambiguity is related to
the winding number in a simple way.
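
Python’s `cmath.log` is a convenient stand-in for the branch constructed in Lemma 1.3.7: it returns log|z| + iθ with θ in [−π, π] and has its branch cut along the negative real axis. The sketch below (the function name `branch` and the sample points are assumptions made for the illustration) checks that every shift by 2kπi is again a logarithm, and shows the jump of about 2π in the imaginary part across the slit.

```python
import cmath
import math

def branch(z, k=0):
    """The value log|z| + i*theta + 2*k*pi*i, with theta the principal argument."""
    return cmath.log(z) + 2j * math.pi * k

samples = [1 + 0j, 2j, -3 + 4j, 0.5 - 0.5j]
worst_error = max(abs(cmath.exp(branch(z, k)) - z)
                  for z in samples for k in (-1, 0, 2))

# Just above and just below the negative real axis, near z = -1.
above = branch(complex(-1.0, 1e-9)).imag     # close to +pi
below = branch(complex(-1.0, -1e-9)).imag    # close to -pi
jump = above - below                         # close to 2*pi
```

The jump is the reason the slit has to be removed: no choice of k on either side makes the two values match along the whole negative real axis.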

1.4. Exercises
Exercise 1.4.1. Calculate the quotient (3 + 2i)/(1 − 2i). Find two
complex roots of the quadratic equation
2z 2 − 3z − 5i = 0.

Exercise 1.4.2. Show that the modulus obeys the triangle inequality
|z ± w| |z| + |w|.
This allows us to make the complex plane into a metric space (see
later, Definition B.1.1) and thus to introduce topological notions such
as open and closed sets, continuity, etc.
√
Exercise 1.4.3. Let a = 1 + i and b = 3 − i. Express each of the
complex numbers
a + b, a − b, ab, a/b
in the form x + yi and in the form reiθ , simplifying your answers as
much as possible.
Exercise 1.4.4. Let z = eiθ where θ = 2π/5. Prove that 1 + z +
z 2 + z 3 + z 4 = 0. By considering the real part of this expression prove
that √
−1 + 5
cos θ = .
4
Exercise 1.4.5. (a) Show that the mapping z → 1/z sends the circle
|z − 1| = 1 (in the complex plane) into a straight line.
(b) Let A, B, C, and D be four points on a circle in the (Euclidean)
plane, and let the symbol d(X, Y ) denote the Euclidean distance
between two points X and Y . Let p = d(A, B)d(C, D), let q =
d(A, C)d(B, D), and let r = d(A, D)d(B, C). Show that one of
p, q, r is equal to the sum of the other two. (This result is due to
Ptolemy of Alexandria, nearly 2,000 years ago. To prove it using
complex numbers, take the circle to be the one in the first part of
the question, and take A to be the origin. Use the transformation
z → 1/z to relate the theorem to the distances between points on
a straight line.)
Exercise 1.4.6. In the 1840s, William Rowan Hamilton spent much
effort trying to find a 3-dimensional field of “hypercomplex” numbers,
i.e., of symbols of the form x + yi + zj, with x, y, z ∈ R, which can
be added, subtracted, multiplied, and divided in the same way that
complex numbers can. Show that his quest was hopeless: no matter
how we define i^2 and j^2, we will not obtain a 3-dimensional system
of the desired sort. (Hint: Use linear algebra. Let V denote the
proposed system. Fix a specific nonreal element α ∈ V and let m_α
denote the operation of multiplication by α, which is an R-linear
transformation from V to V. This transformation must have a real
eigenvalue because V is odd-dimensional. From this, deduce that one
can find two nonzero elements of V whose product is zero, which
contradicts the desired existence of division in V.)

Chapter 2

Paths and Homotopies

2.1. Path connectedness

In this chapter we will explore the notions of “continuous movement”
or “continuous deformation” which (as we saw in Chapter 1) are fun-
damental to understanding the winding number. We’ll represent these
by paths in a metric space. Metric spaces (examples include the stan-
dard Euclidean spaces Rn or subsets thereof) provide an abstract
context in which continuity can be defined. For a review of metric
space theory, see Appendix B.
Let X be a metric space. A path in X is a (continuous) map
γ : [0, 1] → X. The points γ(0) and γ(1) of X are the initial and
final points of the path. (This is Definition B.2.2 in Appendix B.)
One should think of this as saying that a path is “the track of a
continuously moving point” in X.

Definition 2.1.1. Two points p, q in a metric space X are connected
by a path if there exists a path γ : [0, 1] → X with initial point γ(0) =
p and final point γ(1) = q.

Proposition 2.1.2. The relation of “being connected by a path” (on
points in a given metric space) is an equivalence relation.¹

¹ See Section G.1.

Proof. We must check that the relation is reflexive, symmetric, and
transitive.
It is reflexive: for any p ∈ X, the constant path at p (γ(t) = p
for all t ∈ [0, 1]) shows that p is connected to itself.
It is symmetric: if γ is a path with initial point p and final point
q, then the reverse path γ̄ defined by γ̄(t) = γ(1 − t) has initial
point q and final point p.
It is transitive: let γ_1 be a path with initial point p and final
point q and let γ_2 be a path with initial point q and final point r.
Define a new path γ = γ_1 ∗ γ_2 (the concatenation of γ_1 and γ_2) by

γ(t) = \begin{cases} γ_1(2t) & (t ≤ 1/2), \\ γ_2(2t − 1) & (t ≥ 1/2). \end{cases}

The two formulas agree at t = 1/2 because γ_1(1) = q = γ_2(0).
Then γ is continuous² and has initial point p and final point r.
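The three constructions used in this proof (constant path, reverse path, concatenation) translate directly into code. Here is a small sketch in which a path is modeled as a Python function on [0, 1] with complex values; the helper names, including the straight-line `segment` used for illustration, are our own.

```python
from typing import Callable

Path = Callable[[float], complex]


def constant(p: complex) -> Path:
    """The constant path at p (reflexivity)."""
    return lambda t: p


def reverse(gamma: Path) -> Path:
    """The reverse path t -> gamma(1 - t) (symmetry)."""
    return lambda t: gamma(1 - t)


def concatenate(gamma1: Path, gamma2: Path) -> Path:
    """Run gamma1 at double speed on [0, 1/2], then gamma2 on [1/2, 1] (transitivity).

    Assumes gamma1(1) == gamma2(0), so that the two pieces agree at t = 1/2.
    """
    def gamma(t: float) -> complex:
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return gamma


def segment(p: complex, q: complex) -> Path:
    """Illustrative straight-line path from p to q in the plane."""
    return lambda t: (1 - t) * p + t * q
```

For instance, `concatenate(segment(0, 1), segment(1, 1j))` is a path from 0 to i that passes through 1 at time t = 1/2.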

Definition 2.1.3. The equivalence classes for the above equivalence
relation are called the path components of the space X. If X only has
one path component, it is called path connected.

If a space is path connected, it is (in principle) straightforward to

show that: just construct some paths. How do we show that a space
is not path connected, though? Most proofs ultimately rely on the
following fact.

Lemma 2.1.4. Any continuous path in a discrete space X (one in

which every subset is open; see Example B.1.9) must be a constant
path.

Proof. Let γ be a path in X with initial point γ(0) = p. Consider the
function f : X → {0, 1} which sends p to 0 and all points of X \ {p}
to 1. Since X is discrete, this function is continuous; so f ◦ γ is a
continuous function from [0, 1] to {0, 1}. Such a function must be
constant (by the intermediate value theorem; if it were not constant it
would take the value 1/2 somewhere, a contradiction), which shows that
γ(t) = p for all t.

² This follows from the “gluing lemma”, Proposition B.4.2.

Remark 2.1.5. Traditionally, a space X is called connected if it

cannot be written as the union of two disjoint nonempty open subsets.
It is easy to see that if X is path connected, it is connected (use the
argument of Lemma 2.1.4). The converse is false in general. You’ll
ﬁnd a discussion of this in any introductory text, but I’ve chosen to
avoid it here by working with path connectedness exclusively.

Example 2.1.6. We will spend a lot of time working with paths. It’s
important to understand, therefore, that the behavior of paths can be
very far from the “smoothness” that one becomes accustomed to by
drawing pictures in the plane. A famous example of Peano [30] gives
a path in the plane whose image is the whole unit square. In other
words, continuous maps can raise dimension! Peano’s construction is
quite explicit and we’ll review it below.
We consider real numbers between zero and one as represented
by ternary (base-3) expansions; that is, the digits allowed are 0, 1,
and 2, and a sequence
0 · a_1 a_2 a_3 . . .
of digits represents the real number \sum_{j=1}^{\infty} a_j 3^{−j}. Most numbers
between 0 and 1 have a unique expansion of this form, the only
exceptions being triadic rational numbers (those whose denominator is a
power of 3), which have two expansions, one ending in all 0s and one
ending in all 2s.
For a digit a let κ(a) denote the complement of a; that is,
κ(0) = 2, κ(1) = 1, κ(2) = 0.
We let κ^n(a) denote the result of applying κ n times to a: this is κ(a)
if n is odd and a if n is even. Given a number x = 0 · a_1 a_2 a_3 . . ., we
define two new numbers y = 0 · b_1 b_2 b_3 . . . and z = 0 · c_1 c_2 c_3 . . . by the
relations
b_n = κ^{a_2 + a_4 + ··· + a_{2n−2}}(a_{2n−1}),   c_n = κ^{a_1 + a_3 + ··· + a_{2n−1}}(a_{2n}).
One checks that this process is well-defined: two different
representations of the same x (if such exist, that is, when x is a triadic rational)
give rise to different representations of the same y and z. It is also
continuous. To see this, notice that given any x and any n, there is
δ > 0 such that if |x − x′| < δ, then the first n digits of (suitably
chosen) expansions of x and x′ agree. Thus, given x and ε > 0, choose
m such that 3^{−m} < ε; then take n = 2m and δ chosen as above in
terms of n. If |x − x′| < δ and if y′, z′ are defined in terms of x′ as
above, then y′ agrees with y up to m digits (and similarly for z′ and
z), so
|y − y′| ≤ 3^{−m} < ε,   |z − z′| ≤ 3^{−m} < ε.
Therefore x → (y, z) is a continuous map of the unit interval to the
unit square. This map is surjective. For consider expansions of y, z
as above; we can find an x that maps to them by writing
a_{2n−1} = κ^{c_1 + c_2 + ··· + c_{n−1}}(b_n),   a_{2n} = κ^{b_1 + b_2 + ··· + b_n}(c_n).
This completes the construction of a path whose image is the unit
square, usually called the Peano space-filling curve.
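The digit rules of Peano's construction can be tried out on finite ternary expansions. The sketch below is only an approximation of the construction (a genuine point of the curve needs infinitely many digits), and the function names are our own. It implements the rule for (b_n), (c_n) and the inverse rule used in the surjectivity argument, working with exact fractions.

```python
from fractions import Fraction


def kappa_power(k: int, a: int) -> int:
    """Apply the digit complement kappa (a -> 2 - a) k times; only the parity of k matters."""
    return 2 - a if k % 2 else a


def peano_digits(a: list[int]) -> tuple[list[int], list[int]]:
    """Send ternary digits a_1..a_{2m} of x to the digits b_1..b_m of y and c_1..c_m of z."""
    m = len(a) // 2
    b, c = [], []
    for n in range(1, m + 1):
        even_sum = sum(a[2 * j - 1] for j in range(1, n))      # a_2 + a_4 + ... + a_{2n-2}
        odd_sum = sum(a[2 * j - 2] for j in range(1, n + 1))   # a_1 + a_3 + ... + a_{2n-1}
        b.append(kappa_power(even_sum, a[2 * n - 2]))          # b_n is built from a_{2n-1}
        c.append(kappa_power(odd_sum, a[2 * n - 1]))           # c_n is built from a_{2n}
    return b, c


def inverse_digits(b: list[int], c: list[int]) -> list[int]:
    """Recover a_1..a_{2m} from (b, c), following the formula in the surjectivity argument."""
    a = []
    for n in range(1, len(b) + 1):
        a.append(kappa_power(sum(c[: n - 1]), b[n - 1]))  # a_{2n-1}
        a.append(kappa_power(sum(b[:n]), c[n - 1]))       # a_{2n}
    return a


def value(digits: list[int]) -> Fraction:
    """The number 0.d_1 d_2 ... d_k in base 3, as an exact fraction."""
    return sum((Fraction(d, 3 ** j) for j, d in enumerate(digits, start=1)), Fraction(0))
```

For example, `peano_digits([1, 0])` gives `([1], [2])`: every x whose expansion begins 0.10 is sent into the small square of side 1/3 with lower left corner (1/3, 2/3). The inverse rule works because κ preserves the parity of a digit, so the exponents a_2 + ··· + a_{2n−2} and c_1 + ··· + c_{n−1} always have the same parity.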

Remark 2.1.7. The Peano map from the interval to the unit square
is not injective (see Exercise 2.4.3) and therefore is not a homeomor-
phism. Nevertheless, the existence of such a strange example might
lead one to worry about whether some more complicated construction
might produce a homeomorphism from the interval to the unit square.
If that were to happen, it would mean that the notion of “dimension”
— in the intuitive sense of “how many parameters are needed” to
describe something — would not belong to topology. In the early
twentieth century, Brouwer and others showed that dimension is, in
fact, a topological notion: R^n and R^m are not homeomorphic unless
n = m. The general proof is quite delicate. We’ll use the winding
number to address some cases of this question later in the book (see
Exercise 4.5.9 for example).

2.2. Homotopy
Let X and Y be metric spaces, with X compact. We recall (see Deﬁni-
tion B.4.1 and the following discussion) that the collection of all con-
tinuous maps f : X → Y is itself a metric space (denoted Maps(X, Y ))
under the uniform distance
d(f0 , f1 ) = sup{d(f0 (x), f1 (x)) : x ∈ X}.

The next deﬁnition is a key one in topology.

Definition 2.2.1. Let X, Y be metric spaces, with X compact, and
let f_0, f_1 be maps from X to Y. A homotopy from f_0 to f_1 is a
path joining them in Maps(X, Y). In other words, it is a continuous
“one-parameter family of maps” {f_s : s ∈ [0, 1]} with initial point
f_0 and final point f_1. The path components of Maps(X, Y) are called
the homotopy classes of maps from X to Y. Two maps that are
connected by a homotopy are called homotopic.
Remark 2.2.2. The relation “being homotopic” is thus a special
case of the relation “being connected by a path”. Since the latter is
an equivalence relation (Proposition 2.1.2), so is the former.

By definition, a homotopy is a one-parameter family, say f_s, of
maps from X to Y. But we can also consider the same data as defining
a single map³ F : [0, 1] × X → Y by the formula
F(s, x) = f_s(x).
We will use either definition as it is convenient. The fact that the two
definitions are equivalent is a special case of the exponential law for
function spaces, which is proved in Appendix B (Proposition B.4.5).
Example 2.2.3. Let γ : [0, 1] → X be a path in a metric space X.
Intuitively, a path represents the “trajectory” of a moving particle
which takes position γ(t) at time t ∈ [0, 1]. One can envisage the
particle tracing out the same “trajectory” but at a different speed:
this corresponds to a path γ ◦ ϕ, where ϕ : [0, 1] → [0, 1] is a contin-
uous map having ϕ(0) = 0 and ϕ(1) = 1. Such a path is called a
reparameterization of γ. Now we have
Proposition 2.2.4. A reparameterization of a path is homotopic to
the original path. Moreover, the homotopy can be taken to fix the
endpoints of γ (the meaning of this will be explained in the proof ).

Proof. The required homotopy is given by

h(s, t) = γ(st + (1 − s)ϕ(t)).

Here h(0, ·) = γ ◦ ϕ and h(1, ·) = γ, and the argument st + (1 − s)ϕ(t)
stays in [0, 1] because it is a weighted average of t and ϕ(t).
Note that h(s, 0) = γ(0) and h(s, 1) = γ(1) for all s. That is what we
mean by “fixing the endpoints”: all of the curves h(s, ·) making up
the homotopy have the same starting and ending points.

³ In fact, if you consult standard textbooks, you will find that this is the more
common definition of “homotopy”. It has the advantage of working well even when X
is not compact.
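The reparameterization homotopy of Proposition 2.2.4 can be written down once and for all. The sketch below uses an assumed sample path (the unit circle) and an assumed sample reparameterization ϕ(t) = t²; both are illustrations of ours, not choices made in the text.

```python
import cmath
import math
from typing import Callable

Path = Callable[[float], complex]


def reparam_homotopy(gamma: Path, phi: Callable[[float], float]) -> Callable[[float, float], complex]:
    """Return h(s, t) = gamma(s*t + (1 - s)*phi(t)), joining gamma∘phi (s = 0) to gamma (s = 1)."""
    def h(s: float, t: float) -> complex:
        return gamma(s * t + (1 - s) * phi(t))
    return h


def circle(t: float) -> complex:
    """Sample path: once around the unit circle."""
    return cmath.exp(2j * math.pi * t)


def slow_start(t: float) -> float:
    """Sample reparameterization with phi(0) = 0 and phi(1) = 1."""
    return t * t


h = reparam_homotopy(circle, slow_start)
```

Evaluating `h(s, 0)` and `h(s, 1)` for several values of s returns the same two points each time, which is the "endpoints fixed" property.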
Remark 2.2.5. As is suggested by the above discussion of homo-
topies of paths “with endpoints fixed”, we are often interested only
in those maps from a space X to Y which have some special behavior
on a subspace of X (in the example, the subset {0, 1} consisting of
the endpoints of the unit interval [0, 1]). For example, if A is a subset
of X and B a subset of Y , the maps f : X → Y such that f (A) ⊆ B
are called maps of pairs (X, A) → (Y, B). We’ll denote the space
of such maps by Maps((X, A), (Y, B)). A particularly important ex-
ample occurs when each of A and B consists of a single point which
we may call the “basepoint” of X or Y , respectively. In that case
Maps((X, A), (Y, B)) is the space of basepoint-preserving maps from
X to Y , which we may also denote
Maps• (X, Y )
if the choice of basepoint is clear from the context.
Example 2.2.6. Let Y be a metric space, and let y_0 ∈ Y. The path
space of Y based at y_0 is the space
P_{y_0}(Y) = Maps(([0, 1], {0}), (Y, {y_0}));
in other words, it is the space of paths in Y with initial point y_0.
Example 2.2.7. Let Y be a metric space, and let y_0 ∈ Y. A loop
in Y based at y_0 is a path whose initial and final points are y_0. The
space of such paths
Ω_{y_0}(Y) = Maps(([0, 1], {0, 1}), (Y, {y_0}))
is called the based loop space of Y with basepoint y_0.
Example 2.2.8. The free loop space Ω(Y ) is the space of all maps
f : [0, 1] → Y such that f (0) = f (1).

Each of these mapping spaces comes equipped with its own notion
of homotopy, which is a path joining two maps in the relevant mapping
space. For example, if f_0, f_1 belong to Maps((X, A), (Y, B)), we can
consider a path joining them in the space Maps((X, A), (Y, B)); we
could call this a homotopy of maps of pairs. Special cases are a
homotopy of paths (a path in P_{y_0}(Y), Example 2.2.6), a homotopy
of based loops (a path in Ω_{y_0}(Y), Example 2.2.7), and a homotopy of
free loops (a path in Ω(Y), Example 2.2.8). Note that increasing the
size of the mapping space under consideration can make homotopy
“easier”. For instance, it is perfectly possible for two loops to be
nonhomotopic in Ω_{y_0}(Y) but homotopic in P_{y_0}(Y); geometrically, it
is easier to deform a path if you don’t have to keep its final point
fixed at the initial point.

Remark 2.2.9. The fundamental example of a loop is given by our
friend the exponential map η : t ↦ e^{2πit}, which maps [0, 1]
continuously onto the unit circle S^1 in the complex plane, with η(0) = η(1) =
1. In fact, this is the universal example of a loop in the following
precise sense.

Proposition 2.2.10. Let f : [0, 1] → Y be a loop in a metric space
Y. Then there is a unique map g : S^1 → Y such that f = g ◦ η; in
other words, the diagram

\begin{array}{ccc}
[0, 1] & & \\
{\scriptstyle η}\downarrow & \searrow^{f} & \\
S^1 & \xrightarrow{\;g\;} & Y
\end{array}

is commutative.

Proof. The function g is defined as follows: to find g(u), for u ∈ S^1,
choose a t such that η(t) = u, and then define g(u) to equal f(t).
There is no ambiguity in this process unless u = 1, in which case t
could be 0 or 1; but since f(0) = f(1) because f is a loop, it does
not matter which t we pick in this case. It remains to show that
g is continuous, and for this we use the characterization of
continuous functions as those which pull back closed sets to closed sets
(Remark B.2.4). Let C be any closed subset of Y. Then f^{−1}(C) is
a closed subset of [0, 1] because f is continuous, hence compact
because [0, 1] is compact. Therefore g^{−1}(C) = η(f^{−1}(C)) is compact
(Proposition B.3.16), and therefore closed (Proposition B.3.6). Thus
g is continuous.

It follows that the free loop space Ω(Y) can be identified with the
space Maps(S^1, Y) of maps from S^1 to Y. Similarly, the based loop
space Ω_{y_0}(Y) can be identified with Maps•(S^1, Y).

2.3. Homotopies and simple-connectivity

We can now formulate the question that is central to this course.
Key Question Let X be a path-connected space, p a
point of X. Is the space of loops, Ωp (X), path con-
nected? If so, how do we prove this? If not, how do we
describe the path components of Ωp (X)?
Example 2.3.1. Let X = C, the complex plane, and choose the
“base point” p to be the number 1. Let γ be any loop based at p; i.e.,
γ : [0, 1] → X with γ(0) = γ(1) = p. Then the formula
γ_s(t) = (1 − s)γ(t) + sp
defines a homotopy from γ = γ_0 to the constant loop γ_1. Each γ_s is
again a loop based at p, since γ_s(0) = γ_s(1) = p. Thus every
loop in Ω_p X is homotopic to the constant loop, so Ω_p X is path
connected.

Deﬁnition 2.3.2. A space X is called simply connected if both X

and Ωp X are path connected. In other words, X is path connected
and every loop in X, based at p, is homotopic to a constant loop.
Thus the example above shows that the complex plane C is simply
connected. (It can be shown that this deﬁnition does not depend on
the choice of base point p: see Exercise 2.4.6.)
Example 2.3.3. Let X = C \ R−, the complex plane slit⁴ along the
negative real axis. The exact same homotopy as in Example 2.3.1
shows that this X is also simply connected.
Remark 2.3.4. One can abstract the key property that is involved
in the preceding argument. Suppose that X is a subset of C (or more
generally of any Euclidean space). If there is a point p ∈ X which has
the property that, for every x ∈ X, all the points on the line segment
from x to p also belong to X, then X is called star-shaped about p.
The argument of Example 2.3.1 works exactly for star-shaped sets.
⁴ We introduced this terminology in Lemma 1.3.7.

Example 2.3.5. Let X = C \ {0}, the “punctured plane”. This
subset is not star-shaped and the homotopy that we used in the
previous two examples will not work. Consider for instance the loop
γ(t) = e^{2πit} = cos(2πt) + i sin(2πt). This is a loop in X (it never goes
through the origin). But the linear homotopy that we used before,
γ_s(t) = (1 − s)γ(t) + s, is now invalid: it would pass through the origin
at s = t = 1/2, and the origin is not in X. In the next chapter we will
see that this is not an accident: this X is not simply connected, and
this γ is not homotopic to a constant loop.
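The failure of the linear homotopy in the punctured plane is easy to confirm numerically. In this small sketch the function names are ours; it evaluates the straight-line homotopy toward the base point 1 at the parameter values singled out in the example.

```python
import cmath
import math


def gamma(t: float) -> complex:
    """The loop t -> e^{2 pi i t} around the unit circle."""
    return cmath.exp(2j * math.pi * t)


def linear_homotopy(s: float, t: float, p: complex = 1) -> complex:
    """Straight-line homotopy (1 - s) * gamma(t) + s * p toward the base point p."""
    return (1 - s) * gamma(t) + s * p


# At s = t = 1/2 the homotopy sits at the origin, up to rounding error,
# so it leaves the punctured plane.
midpoint = linear_homotopy(0.5, 0.5)
```

Since γ(1/2) = −1, the value at s = 1/2 is the midpoint of the segment from −1 to 1, namely 0.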

Example 2.3.6. Consider the 2-sphere, which is the subset S^2 ⊆ R^3
made up of points at unit distance from the origin, or more generally
the (n − 1)-sphere

S^{n−1} := {(x_1, . . . , x_n) : x_1^2 + · · · + x_n^2 = 1} ⊆ R^n.

We would like to prove that this is simply connected when n ≥ 3.
However, this will take a bit more work than anything we have done
so far.

Let N ∈ S^{n−1} be the point (0, 0, . . . , 0, 1) (the “north pole”). The
stereographic projection ϕ : S^{n−1} \ {N} → R^{n−1} is the map that sends
P = (x_1, . . . , x_n) ∈ S^{n−1} \ {N} to the point where the line through
N and P intersects the plane spanned by the first (n − 1) coordinate
axes (see Figure 2.1). Elementary geometry shows that ϕ is given by
the formula

ϕ(x_1, . . . , x_n) = ( x_1/(1 − x_n), . . . , x_{n−1}/(1 − x_n) ).

Lemma 2.3.7. The stereographic projection ϕ : S^{n−1} \ {N} → R^{n−1}
is a homeomorphism.

Proof. Calculation shows that the inverse of ϕ is given by the map

x = (x_1, . . . , x_{n−1}) ↦ ( 2x_1/(‖x‖^2 + 1), . . . , 2x_{n−1}/(‖x‖^2 + 1), (‖x‖^2 − 1)/(‖x‖^2 + 1) ),

where ‖x‖^2 = x_1^2 + · · · + x_{n−1}^2. From these explicit formulas we see
that both ϕ and ϕ^{−1} are continuous.
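The “calculation” in this proof can be spot-checked by machine. Below is a minimal sketch of stereographic projection from the north pole and of its inverse, in any dimension; the function names are ours and points are represented as tuples of floats.

```python
import math


def stereographic(point: tuple[float, ...]) -> tuple[float, ...]:
    """Project (x_1, ..., x_n) on the unit sphere, away from the north pole, to R^{n-1}."""
    *head, last = point
    if math.isclose(last, 1.0):
        raise ValueError("the north pole has no image under this projection")
    return tuple(x / (1 - last) for x in head)


def inverse_stereographic(x: tuple[float, ...]) -> tuple[float, ...]:
    """Send x in R^{n-1} back to the unit sphere S^{n-1}, missing only the north pole."""
    norm2 = sum(xi * xi for xi in x)                       # squared length of x
    head = tuple(2 * xi / (norm2 + 1) for xi in x)         # first n-1 coordinates
    return head + ((norm2 - 1) / (norm2 + 1),)             # last coordinate


if __name__ == "__main__":
    x = (0.3, -1.2)
    p = inverse_stereographic(x)
    print(p, sum(c * c for c in p), stereographic(p))
```

The printed point has squared length 1 up to rounding, and projecting it again returns the original x, in line with the two maps being mutually inverse.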

Figure 2.1. Stereographic projection (the figure shows the
case n = 2, with N = (0, 1), P = (x, y), and ϕ(P) = x/(1 − y)).

In the same manner we can define the stereographic projection
whose pole is any point P ∈ S^{n−1}: it is a homeomorphism from
S^{n−1} \ {P} to the hyperplane P^⊥ ⊆ R^n (a copy of R^{n−1}) made up of points with
position vector perpendicular to P.
It follows that every loop in S^{n−1} whose image is not the whole of
S^{n−1} is homotopic to a constant loop. For if there is a point P not in
the image of our loop, then under stereographic projection the loop
becomes a map S^1 → S^{n−1} \ {P} ≅ R^{n−1}. It is therefore homotopic
to a constant loop because R^{n−1} is star-shaped (Remark 2.3.4).
Unfortunately, the example of the Peano space-filling curve
(Example 2.1.6) shows that there might be loops that map S^1 onto S^{n−1}
and don’t “omit” any point N that we can use in the above argument.
So what we are going to do is to first make a preliminary homotopy
to put our loop into “general position”; it will then omit many points,
and we can use the argument above.
Definition 2.3.8. Let a, b ∈ S^{n−1}. We say that they are antipodal
if a = −b as vectors in R^n; it is equivalent to say that d(a, b) = 2. In
this case we write b = A(a) (the antipode of a).

If a and b are not antipodal, there is a unique shortest path in
S^{n−1} from a to b (a great circle arc); we’ll call this the straight path
from a to b. The reason for this terminology is that if ϕ denotes
stereographic projection with pole A(a), then ϕ carries the straight
path from a to b to the straight line in R^{n−1} from ϕ(a) to ϕ(b).

Completion of Example 2.3.6. We are going to prove that S^{n−1}
is simply connected for n ≥ 3. Let us say that a path γ in S^{n−1}
is piecewise straight if there are finitely many parameter values t_0 =
0, t_1, . . . , t_n = 1 such that γ(t_k) and γ(t_{k+1}) are not antipodal and
the restriction of γ to the interval [t_k, t_{k+1}] is the straight path from
γ(t_k) to γ(t_{k+1}).
I claim that any path in S^{n−1} is homotopic to a piecewise straight
path with the same starting and ending points. Indeed, let γ be
a path in S^{n−1}. Since γ is continuous, it is uniformly continuous
(Proposition B.3.19) so there is some δ > 0 such that if |s − t| < δ, then
d(γ(s), γ(t)) < 2. Choose t_0 = 0, t_1, . . . , t_n = 1 such that |t_k − t_{k+1}| <
δ for all k; this ensures that for each k the range γ([t_k, t_{k+1}]) does
not contain any pair of antipodal points. Then “straighten out” the
path between t_k and t_{k+1} by first projecting stereographically from
the pole A(γ(t_k)),
then carrying out an endpoints-fixed homotopy to a straight path in
R^{n−1}, and then projecting back again to S^{n−1}. An explicit formula for
such a homotopy can be given as follows: let ϕ_k be the stereographic
projection from S^{n−1} \ {A(γ(t_k))} to R^{n−1}. For t ∈ [t_k, t_{k+1}], put

γ_s(t) = ϕ_k^{−1}( (1 − s) ϕ_k(γ(t)) + s [ ((t_{k+1} − t)/(t_{k+1} − t_k)) ϕ_k(γ(t_k)) + ((t − t_k)/(t_{k+1} − t_k)) ϕ_k(γ(t_{k+1})) ] ).

Then γ_0 = γ, and γ_1 is the piecewise straight path with vertices
γ(t_0), . . . , γ(t_n).
Now the points of a piecewise straight path are contained in the
union of finitely many 2-dimensional planes in R^n. Thus, if n ≥ 3, a
piecewise straight path cannot fill up all of S^{n−1}. By stereographic
projection, as noted above, it follows that a piecewise straight loop in
S^{n−1} is homotopic to a constant loop. Since every loop is homotopic
to a piecewise straight loop, the proof is completed.

2.4. Exercises
Exercise 2.4.1. Find the path components of the rational numbers
Q with their usual metric.

Exercise 2.4.2. Check the details in the construction of the Peano

space-filling curve (Example 2.1.6). Also, check that the Peano map
is not injective (think about the ambiguities in ternary rational ex-
pansions for triadic rationals).
Exercise 2.4.3. Let X be a path-connected metric space. Call a
point p ∈ X a cut point if X \ {p} is not path connected. Show that
the unit interval [0, 1] has some cut points but that the unit square
[0, 1] × [0, 1] does not have any cut points. Explain why this shows
that these two spaces cannot be homeomorphic to one another.
Exercise 2.4.4. Let X be the punctured complex plane C \ {0}. A
loop in X, starting and ending at 1, is defined as follows:

γ(t) = \begin{cases} \cos(4πt) + i \sin(4πt) & (0 ≤ t ≤ 1/2), \\ (32(t − 3/4)^2 − 1) − (t − 1/2)(t − 3/4)(t − 1)i & (1/2 ≤ t ≤ 1). \end{cases}

Show that the loop γ is homotopic to a constant loop.
Exercise 2.4.5. Let S be the spiral in the complex plane defined by
S = {te^{it} : 0 ≤ t < ∞}.
Prove that C \ S is simply connected.
Exercise 2.4.6. Show that if X is path connected and p, q ∈ X,
then Ωp (X) is path connected if and only if Ωq (X) is path connected.
(Suggestion: Consider the map Ωp (X) → Ωq (X) defined by sending
a loop γ based at p to the concatenation θ ∗ γ ∗ θ̄, where θ is a path
from q to p and θ̄ is its reverse.) It follows that our definition of “simply connected” does
not depend on the choice of base point.
Exercise 2.4.7. Let X be a nonempty compact space with a base
point p. Consider the space Maps• (X, X) of maps from X to itself
that send p to p. We say X is contractible if Maps• (X, X) is path
connected. Prove:
(i) Every contractible space is path connected.
(ii) Every contractible space is simply connected.
(iii) X is contractible if and only if the identity map X → X is
homotopic in Maps• (X, X) to a constant map.

Chapter 3

The Winding Number

3.1. Maps to the punctured plane

Let X be a compact metric space. We are going to analyze maps¹
from X to the punctured plane C \ {0}. A key player in the discussion
will be the exponential map exp : C → C \ {0} which we discussed in
Section 1.3.

Definition 3.1.1. Let X be some compact metric space. A
(continuous) map f : X → C \ {0} is an exponential if there is a map
g : X → C such that f = exp ◦ g; in other words, the diagram

\begin{array}{ccc}
 & & C \\
 & \nearrow^{g} & \downarrow{\scriptstyle \exp} \\
X & \xrightarrow{\;f\;} & C \setminus \{0\}
\end{array}

commutes. (Because of the way this picture looks, one also expresses
this by saying that f lifts through the exponential map.)

Lemma 3.1.2. If f : X → C \ {0} is an exponential, then it is
homotopic to a constant map. More generally, if f_1/f_0 is an exponential,
then f_0 is homotopic to f_1.

¹ By our convention, the words “map” and “mapping” refer to continuous
functions; see Definition B.2.1.

Proof. Suppose that f_1(x)/f_0(x) = exp(g(x)). Then a homotopy from
f_0 to f_1, in C \ {0}, is given by f_s(x) = exp(sg(x))f_0(x).

The main result of this section is a converse to Lemma 3.1.2: a

map X → C \ {0} is homotopic to a constant map only if it is an
exponential. First we need

Lemma 3.1.3. Suppose that f : X → C \ {0} never takes negative
real values. Then f is an exponential.

Proof. By assumption, f maps into the slit plane S = C \ R−. Then
f(x) = exp(ℓ(f(x))),
where ℓ is the branch of the logarithm defined in Lemma 1.3.7; that is,
f = exp ◦ g with g = ℓ ◦ f continuous.

Proposition 3.1.4 (Rouché’s theorem). Suppose that f_0 and f_1
are maps from X to C \ {0} such that
|f_0(x) − f_1(x)| < |f_0(x)| + |f_1(x)|
for all x ∈ X. Then f_0/f_1 and f_1/f_0 are exponentials.

Proof. The inequality shows that for each w = f_0(x)/f_1(x),
|w − 1| < |w| + 1.
It is not hard to see that w ∈ C satisfies this inequality iff w ∉ R−.
Thus f_0/f_1 never takes negative real values and the result follows
from the previous lemma. The same holds for f_1/f_0, since 1/w is a
negative real number exactly when w is.
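For readers who want the “not hard to see” step spelled out, here is one possible verification, obtained by squaring both sides. The notation Re w for the real part of w is the only symbol below that does not come from the proof itself.

```latex
% One way to check:  |w - 1| < |w| + 1  holds iff w is not a real number <= 0.
\begin{align*}
  |w - 1|^2   &= |w|^2 - 2\operatorname{Re} w + 1, \\
  (|w| + 1)^2 &= |w|^2 + 2|w| + 1 .
\end{align*}
% Both sides of the original inequality are nonnegative, so it is equivalent to
\[
  -\operatorname{Re} w < |w| .
\]
% The weak inequality  -Re w <= |w|  always holds, with equality exactly when
% w is real and w <= 0.  Hence the strict inequality fails precisely on the
% closed negative real axis.
```

This is the equality case of the triangle inequality applied to the two numbers w and −1.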

Proposition 3.1.5. Let X be a compact metric space and f0 , f1 maps

from X to C \ {0}. The maps f0 and f1 are homotopic if and only if
f1 /f0 is an exponential.

Proof. We already proved “if” (Lemma 3.1.2).

For “only if”, consider a homotopy h(s, x) with h(0, x) = f_0(x)
and h(1, x) = f_1(x). Because [0, 1] × X is a compact space and h
never takes the value 0, there is ε > 0 such that |h(s, x)| > ε for all
s ∈ [0, 1] and x ∈ X. By the uniform continuity (Proposition B.3.19)
of h, there is δ > 0 such that
|h(s, x) − h(s′, x)| < ε whenever s, s′ ∈ [0, 1] with |s − s′| < δ.

Thus when |s − s′| < δ the maps h_s(x) = h(s, x) and h_{s′}(x) = h(s′, x)
satisfy
|h_s(x) − h_{s′}(x)| < ε < |h_s(x)| + |h_{s′}(x)|.
By Rouché’s theorem, h_{s′}/h_s is an exponential. Choose a finite
sequence s_j with s_0 = 0, s_m = 1, and |s_j − s_{j+1}| < δ. Then

f_1/f_0 = \prod_{j=0}^{m−1} h_{s_{j+1}}/h_{s_j}

is a product of exponentials, which is an exponential
(because exp(g_1) exp(g_2) = exp(g_1 + g_2)).

Corollary 3.1.6. With the same notation, a map X → C \ {0} is
itself an exponential if and only if it is homotopic to a constant map.

3.2. The winding number

In the previous section we discussed maps from any compact X to
the punctured plane. We now specialize our attention to the cases
X = [0, 1] (paths) and X = S 1 (loops). Recall that a path in a space
Y is just a (continuous) map γ : [0, 1] → Y ; a loop in Y is a path with
the additional property that its initial point γ(0) is equal to its ﬁnal
point γ(1). We noted in Proposition 2.2.10 that a loop in Y can also
be represented as a map from the unit circle S 1 to Y .
Proposition 3.2.1. Every path γ : [0, 1] → C \ {0} lifts through the
exponential map to a path g : [0, 1] → C. Moreover, if g0 and g1 are
two lifts of γ, then there is an integer n such that g0 (t)−g1 (t) = 2πin,
for all t.

Proof. The homotopy

h(s, t) = γ((1 − s)t)
shows that γ is homotopic to a constant path. By Corollary 3.1.6,
then, γ lifts through the exponential map.
Suppose now that g0 , g1 are two lifts of γ. Then
exp(g0 (t)) = γ(t) = exp(g1 (t)),
and thus exp(g0 (t) − g1 (t)) = 1, which implies that (g0 (t) − g1 (t))/2πi
is an integer. A continuous integer-valued function on [0, 1] is constant
(Lemma 2.1.4), so g0 (t) − g1 (t) = 2πin for some integer n.

Figure 3.1. A path g in C starting at 0 and ending at 2πi
and the loop γ = exp ◦ g in C \ {0} obtained by exponentiating
it. The winding number of γ is (2πi)^{−1}(g(1) − g(0)), which
equals 1 in this case.

Now suppose that γ is a loop in C \ {0}, that is, a path that
has the same initial and final points. By Proposition 3.2.1, γ is the
exponential of a path, that is to say, a map g : [0, 1] → C such that
(3.2.2) γ(t) = exp(g(t)).
But g need not be a loop! All we know about its initial and final points
is that exp(g(1)) = exp(g(0)), and this tells us that g(1) − g(0) = 2πim
for some integer m. Moreover, the second part of Proposition 3.2.1 tells
us that m does not depend on the choice of lifting g for the loop γ.
It is therefore an invariant of the loop γ.
Definition 3.2.3. If γ is a loop in C \ {0} as above, the integer m
(equal to (2πi)^{−1}(g(1) − g(0)) for any g such that exp(g(t)) = γ(t))
is called the winding number of γ about the origin and is denoted
wn(γ, 0) (or sometimes just wn(γ)).

See Figure 3.1 for an illustration of this.

Deﬁnition 3.2.4. If γ is a loop in C \ {p}, the winding number of γ
about p is deﬁned to be the winding number of the loop t → γ(t) − p
(in C \ {0}) about the origin. We denote it wn(γ, p).
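The definition suggests a direct numerical recipe: build a lift g step by step and read off (2πi)^{−1}(g(1) − g(0)). The sketch below does this on a finite sample of the loop. It is an approximation under an assumption we state loudly: the sampling must be fine enough that the ratio of consecutive values never lies on the negative real axis, so that the principal logarithm of each ratio is the correct small increment of the lift (this mirrors the subdivision argument in the proof of Proposition 3.1.5). The function names and the default sample count are ours.

```python
import cmath
import math
from typing import Callable

Loop = Callable[[float], complex]


def lift(gamma: Loop, p: complex = 0j, samples: int = 2000) -> list[complex]:
    """Sampled values g(k/samples) of a lift g with exp(g(t)) = gamma(t) - p.

    Assumes consecutive sample points are close enough that their ratio
    stays away from the negative real axis.
    """
    prev = gamma(0.0) - p
    g = [cmath.log(prev)]
    for k in range(1, samples + 1):
        z = gamma(k / samples) - p
        g.append(g[-1] + cmath.log(z / prev))  # small increment of the lift
        prev = z
    return g


def winding_number(gamma: Loop, p: complex = 0j, samples: int = 2000) -> int:
    """Approximate wn(gamma, p) as (g(1) - g(0)) / (2 pi i) for the sampled lift g."""
    g = lift(gamma, p, samples)
    return round(((g[-1] - g[0]) / (2j * math.pi)).real)


def e(n: int) -> Loop:
    """The loop t -> exp(2 pi i n t)."""
    return lambda t: cmath.exp(2j * math.pi * n * t)
```

With these definitions `winding_number(e(3))` evaluates to 3, and a unit circle centered at 2 has winding number 1 about the point 2 but 0 about the origin.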

Lemma 3.2.5. If a loop γ : [0, 1] → C \ {p} is homotopic through

loops to a constant loop, then wn(γ, p) = 0.

It is important that the homotopy in this lemma take place

through loops — in other words, if we identify the given loop with
a map S 1 → C \ {p} using Proposition 2.2.10, then that map is ho-
motopic to a constant in the free loop space Maps(S 1 , C \ {p}).

Proof. Without loss of generality take p = 0. Consider the diagram

\begin{array}{ccc}
S^1 & \xrightarrow{\;ψ\;} & C \\
{\scriptstyle η}\uparrow & \searrow^{ϕ} & \downarrow{\scriptstyle \exp} \\
[0, 1] & \xrightarrow{\;γ\;} & C \setminus \{0\}.
\end{array}

Here η(t) = e^{2πit}. By Proposition 2.2.10, a map ϕ (the diagonal
arrow) exists that makes the bottom triangle commute. By
hypothesis, ϕ is homotopic to a constant map in the free loop space. Hence by
Proposition 3.1.5 (with X = S^1), there exists a map ψ (the top
arrow) making the top triangle commute. Now g = ψ ◦ η is a
lifting of γ through the exponential map, and g(0) = g(1) because
η(0) = η(1). Therefore
the winding number of γ is zero.

Lemma 3.2.6. Let γ1 and γ2 be loops in C \ {0} and let γ(t) =

γ1 (t)γ2 (t) be their pointwise product. Then
wn(γ) = wn(γ1 ) + wn(γ2 ).

Proof. If g1 and g2 are lifts of γ1 and γ2 , respectively, then g1 + g2

is a lift of γ1 γ2 .

Theorem 3.2.7. A loop in C \ {0} has winding number n if and only

if it is homotopic to the loop
e_n(t) = exp(2πint), t ∈ [0, 1].
In particular, two loops are homotopic if and only if they have the
same winding number, and a loop is homotopic to a constant if and
only if it has winding number 0.

Proof. It is clear from the deﬁnition that en has winding number n

since we can take a lift g for en to be g(t) = 2πint.
Suppose that γ has winding number 0. Then we can write γ(t) =
exp(2πig(t)) with g(0) = g(1). The homotopy

h(s, t) = exp(2πi(1 − s)g(t))

is then a homotopy of loops (not just of paths) and contracts γ to

a constant loop. Conversely, if γ is homotopic (through loops) to a
constant loop, it has winding number zero by Lemma 3.2.5. This
shows that a loop is homotopic to a constant loop if and only if it has
winding number zero. Moreover, all constant loops are homotopic to
one another (since C \ {0} is path connected).
To prove the general case we make use of Lemma 3.2.6 (which
shows that loops γ0 and γ1 have the same winding number if and
only if γ0 /γ1 has winding number zero), together with the observation
that γ0 and γ1 are homotopic through loops if and only if γ0 /γ1 is
homotopic to a constant loop (if γs is a homotopy from γ0 to γ1 , then
γs /γ1 is a homotopy from γ0 /γ1 to the constant loop 1).

Remark 3.2.8. The loops en that appear in Theorem 3.2.7 have

the property that |en (t)| = 1 for all t; in other words, they are maps
whose range is contained in the unit circle S 1 ⊆ C. It follows therefore
from the theorem that every loop in C\{0} is homotopic to one whose
range is contained in the unit circle. It is also easy to see this directly.
In fact, if z ∈ C \ {0}, let υ(z) = z/|z| be the “unitization” of z; then
for any loop γ in C \ {0} the “radial retraction” homotopy

h(s, t) = (1 − s)γ(t) + sυ(γ(t))

shows that γ is homotopic to the loop υ ◦ γ, whose range lies in S 1 .
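The one point to check is that the radial retraction never passes through the origin. A short computation, written with the same symbols as the remark, is the following.

```latex
% Since \gamma(t) = |\gamma(t)|\,\upsilon(\gamma(t)), the homotopy factors as
\[
  h(s,t) = (1-s)\,\gamma(t) + s\,\upsilon(\gamma(t))
         = \bigl((1-s)\,|\gamma(t)| + s\bigr)\,\upsilon(\gamma(t)),
\]
% and because |\upsilon(\gamma(t))| = 1 we get
\[
  |h(s,t)| = (1-s)\,|\gamma(t)| + s \;\ge\; \min\{|\gamma(t)|,\,1\} \;>\; 0 .
\]
% Moreover h(s,0) = h(s,1) for every s, because \gamma(0) = \gamma(1).
```

So each h(s, ·) is a loop in C \ {0}, as the remark requires.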

Remark 3.2.9. The winding number has another additivity
property besides that described in Lemma 3.2.6. Let γ_1 and γ_2 be loops
in C \ {0}, based at the same point, and let γ = γ_1 ∗ γ_2 be their
concatenation. That is,

γ(t) = \begin{cases} γ_1(2t) & (t ≤ 1/2), \\ γ_2(2t − 1) & (t ≥ 1/2) \end{cases}

(see the proof of Proposition 2.1.2). Then, once again, wn(γ) =
wn(γ_1) + wn(γ_2). To see this, note that if g is a lift of γ through the
exponential map, then g = g_1 ∗ g_2, where g_1 and g_2 are lifts of γ_1 and
γ_2, respectively. Thus

wn(γ) = (2πi)^{−1}(g(1) − g(0)) = (2πi)^{−1}(g(1) − g(1/2) + g(1/2) − g(0))
      = (2πi)^{−1}(g_1(1) − g_1(0)) + (2πi)^{−1}(g_2(1) − g_2(0)) = wn(γ_1) + wn(γ_2)
as required.
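Both additivity properties (for pointwise products, Lemma 3.2.6, and for concatenations, this remark) can be observed numerically. The sketch below is self-contained and approximate: the winding number is estimated by summing small changes of argument along a fine sample of the loop, which assumes that consecutive sample points are never on opposite sides of the origin. All names are ours.

```python
import cmath
import math
from typing import Callable

Loop = Callable[[float], complex]


def winding_number(gamma: Loop, samples: int = 4000) -> int:
    """Estimate wn(gamma, 0) by accumulating the argument change between samples."""
    total = 0.0
    prev = gamma(0.0)
    for k in range(1, samples + 1):
        z = gamma(k / samples)
        total += cmath.phase(z / prev)  # small angle between consecutive samples
        prev = z
    return round(total / (2 * math.pi))


def e(n: int) -> Loop:
    """The loop t -> exp(2 pi i n t), based at 1."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


def product(g1: Loop, g2: Loop) -> Loop:
    """Pointwise product of two loops in the punctured plane."""
    return lambda t: g1(t) * g2(t)


def concatenate(g1: Loop, g2: Loop) -> Loop:
    """Concatenation of two loops based at the same point."""
    return lambda t: g1(2 * t) if t <= 0.5 else g2(2 * t - 1)
```

For example, the product of `e(2)` and `e(-5)` has winding number −3, while the concatenation of `e(2)` and `e(1)` has winding number 3.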

3.3. Computing winding numbers

In this section we will learn how to compute winding numbers in
practice.
Let γ be a loop in C \ {0}. The image of γ (the set of points γ(t)
for t ∈ [0, 1]) is a compact subset of C \ {0} and in this context is
usually denoted γ ∗ . Consider the complement of this image, C \ γ ∗ :
that is, the set of all those points p that the loop γ does not pass
through. For each such p, the winding number wn(γ, p) of γ around
p is well-deﬁned.
Proposition 3.3.1. The function p ↦ wn(γ, p) is constant on the
path components of C \ γ*. It is equal to zero on the unbounded
component² of this set.

Proof. Suppose p and q are in the same path component of C \ γ*.
That means there is a path ψ : [0, 1] → C \ γ* that starts at p and
ends at q. For each s ∈ [0, 1], the loop
t ↦ γ(t) − ψ(s)
does not pass through 0. Thus the map
h : (s, t) ↦ γ(t) − ψ(s)
is a homotopy of loops in C \ {0}, which must therefore preserve
the winding number about 0. But wn(h(0, ·), 0) = wn(γ, p) and
wn(h(1, ·), 0) = wn(γ, q).
The compact set γ* is bounded, so it is included in some ball
B(0; R). All the points in the complement of this ball (i.e., those p
with |p| > R) are path connected to each other outside the ball and
are therefore in the same component of C \ γ*. This is called the
unbounded component of C \ γ*. If |p| > R, then the loop t ↦ γ(t) − p
is contained in a half-plane that does not contain 0. A half-plane is
star-shaped (see Remark 2.3.4) so this loop is homotopic (by a linear
homotopy) to a constant and has winding number zero.

² I’ll explain what this is in the course of the proof.
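Proposition 3.3.1 is easy to check numerically. The following Python sketch is an illustration only and is not part of the proof: it replaces a loop by a fine polygon and adds up the signed angle that each edge subtends at p. The function name `winding_number`, the sample loop, and the test points are all choices made for this example.

```python
import cmath
import math

def winding_number(points, p=0j):
    """Winding number about p of the closed polygon with the given vertices.

    Adds the signed angle subtended at p by each edge; this is valid as
    long as no edge passes through p.
    """
    total = 0.0
    n = len(points)
    for k in range(n):
        a = points[k] - p
        b = points[(k + 1) % n] - p
        total += cmath.phase(b / a)  # angle increment, in (-pi, pi]
    return round(total / (2 * math.pi))

# Sample loop (an illustrative choice): the unit circle traversed twice.
loop = [cmath.exp(4j * math.pi * k / 400) for k in range(400)]

# Points in the bounded component, then points in the unbounded component.
inside = [winding_number(loop, p) for p in (0, 0.3 + 0.2j, -0.5j)]
outside = [winding_number(loop, p) for p in (2, -3 + 1j, 5j)]
print(inside, outside)
```

For this loop the complement has two components (the open unit disc and its exterior), and the computed value is the same at every test point of a given component and is zero on the unbounded one.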

The components of C \ γ ∗ are sometimes called the cells of γ.

The above proposition suggests the following approach to calculating
winding numbers: to ﬁnd wn(γ, p), choose a path ψ from p to the
unbounded cell. We expect (with luck!) that this path should pass
through only ﬁnitely many cells. The winding number wn(γ, ψ(s)) is
constant on cells, and we know it is zero on the unbounded one, so
“all” we have to do is to understand how the winding number changes
as we pass from one cell to the next.
Here is the simplest situation in which one can do that.

Lemma 3.3.2. Suppose that a loop γ has a short straight section.

Suppose p0 is just to the right of the short straight section and p1 is
just to the left of it. Then wn(γ, p1 ) − wn(γ, p0 ) = 1.

Of course there are lots of questions here: what is a short straight

section? Which way is to the right? How far is “just”, etc.? We can
make this precise as follows.
We’ll suppose that there are a ball B = B(a; δ), a complex num-
ber b of absolute value δ, and a parameter interval (t0 , t1 ) for γ, such
that γ(t0 ) = a − b, γ(t1 ) = a + b, and as t runs from t0 to t1 the path
γ follows the straight line (a diameter of the ball) from a − b to a + b.
We assume further that γ(t) does not meet B for t ∉ (t₀, t₁). This is
what we will mean by a “short straight section” of the path.
The complement, in B, of the diameter (a − b, a + b) has two path
components. The component consisting of those z for which the R-
basis (z − a, b) for C is right-handed (see Remark 1.3.2) will be called
the “right-hand” component; the other will be called the “left-hand”
component.


Figure 3.2. The bubble argument. (i): the original path γ

including the short straight section from a − b to a + b. (ii):
the additional “bubble” θ which winds around p0 but not p1 .
(iii): the modiﬁed path γ̃ that is homotopic to a concatenation
of γ and θ.

Our claim precisely is that if |p0 − a|, |p1 − a| < δ/2, p0 is

in the right-hand component, p1 in the left-hand component, then
wn(γ, p1 ) − wn(γ, p0 ) = 1.

Proof. Let θ be the D-shaped loop that starts at a − (3/4)b, proceeds
around a semicircle of radius (3/4)δ to a + (3/4)b, and then returns via a
straight line to a − (3/4)b again.
Let γ̃ be the loop obtained by modifying γ by replacing the
straight line segment from a − (3/4)b to a + (3/4)b with the semicircle
described above. (See Figure 3.2 for illustrations of these three loops.)
Observe the following.
(a) γ̃ is homotopic to the concatenation³ of θ and γ.
(b) The winding number of θ about p₀ is 1; about p₁ it is 0.
(c) The points p₀ and p₁ belong to the same component of the
complement of γ̃*.

³ Details: Parameterize both γ and θ with a − (3/4)b as the base point, so that
γ(0) = γ(1) = θ(0) = θ(1) = a − (3/4)b. Then the concatenation θ ∗ γ travels around
the semicircular arc of θ, then back down the straight line segment, out along the
straight line segment again (exactly reversing the previous section), and then around
the rest of γ. The two opposite copies of the straight line segment can be deformed
by a linear homotopy to a constant path at a + (3/4)b. The result of this homotopy is a
parameterization of γ̃.

From Remark 3.2.9 the winding number of a concatenation of
paths is the sum of their winding numbers. Taken together, these
facts give us the proof:
wn(γ, p1 ) = wn(γ̃, p1 ) − wn(θ, p1 ) = wn(γ̃, p1 )
= wn(γ̃, p0 ) = wn(γ, p0 ) + wn(θ, p0 ) = wn(γ, p0 ) + 1.

Remark 3.3.3. We refer to this style of proof as a “bubble argument”
because of the way the point p₀ is enclosed in a “bubble” (the loop
θ) that grows out of the curve γ.

For an example where this allows a complete calculation of winding
numbers, let us consider polygonal loops. Suppose that we give a
list of points v₀, v₁, …, vₙ ∈ C. The polygonal path with these vertices
is obtained by concatenating the straight line segments (“edges”)
between them; i.e., it is the map
t ↦ (nt − k)vₖ₊₁ + (k + 1 − nt)vₖ for k/n ≤ t ≤ (k + 1)/n.
(Here we have parameterized the path so that each edge takes equal
“time”; but any other parameterization would yield a homotopic
path.) It is a polygonal loop if v₀ = vₙ. We’ll denote a polygonal
path (or loop) by ⟨v₀, …, vₙ⟩.
Suppose we have a polygonal loop ⟨v₀, …, vₙ⟩ that does not pass
through a point p. Choose a ray R from p to ∞ that is not parallel to
any of the edges of the given loop and does not pass through any of
its vertices (we’ll call this a transverse ray with respect to the loop).
For each edge eₖ = ⟨vₖ, vₖ₊₁⟩ of the loop we can define an intersection
number

\[
i(e_k, R) :=
\begin{cases}
0 & \text{if } e_k \text{ and } R \text{ do not meet,} \\
+1 & \text{if } R \text{ crosses } e_k \text{ from left to right,} \\
-1 & \text{if } R \text{ crosses } e_k \text{ from right to left.}
\end{cases}
\tag{3.3.4}
\]

Figure 3.3. Computing the winding number of a polygonal loop.

(Since eₖ and R are both straight, they can meet in at most one
point. The notion of “right” and “left” is defined as in our discussion
of Lemma 3.3.2.) Now we have

Proposition 3.3.5. The winding number of a polygonal loop around
p is equal to the sum of its edge-intersection numbers with a transverse
ray:
\[
\operatorname{wn}(\langle v_0, \dots, v_n \rangle, p) = \sum_k i(\langle v_k, v_{k+1} \rangle, R),
\]
where R is a transverse ray from p to infinity. (See Figure 3.3.)

Proof. This follows immediately from Lemma 3.3.2: trace along the
ray R, from the unbounded component inward to p, keeping track
of the changes in the winding number. There is one small issue to
deal with: our deﬁnition of a “polygonal loop” allows for diﬀerent
edges to overlap (for instance a polygonal loop with 6 edges that goes
around the same triangle twice). Lemma 3.3.2 does not apply directly
in such a case. However, it is easy to see that any polygonal loop is
homotopic to a nearby one that does not have overlapping edges.
So, if necessary, we may adjust the loop by a preliminary homotopy
to ensure that its edges do not overlap, and then we may apply the
preceding argument to the adjusted loop.
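The counting rule of Proposition 3.3.5 translates directly into a short program. The Python sketch below is one possible rendering, under the assumption that the caller supplies a transverse ray direction u (the code does not verify transversality); the names `intersection_number` and `polygon_winding_number` and the two sample polygons are illustrative choices.

```python
import cmath
import math

def intersection_number(v, w, p, u):
    """Intersection number (3.3.4) of the edge <v, w> with the ray from p in direction u."""
    # Translate p to the origin and rotate so that the ray is the positive real axis.
    a, b = (v - p) / u, (w - p) / u
    if (a.imag > 0) == (b.imag > 0):
        return 0  # both endpoints lie on the same side of the line through the ray
    # Real coordinate of the point where the edge meets the real axis.
    x = a.real - a.imag * (b.real - a.real) / (b.imag - a.imag)
    if x <= 0:
        return 0  # the edge meets the line behind p, not the ray itself
    # An upward edge has its left side facing p, so the ray crosses left to right.
    return 1 if b.imag > a.imag else -1

def polygon_winding_number(vertices, p, u):
    """Sum of intersection numbers over the edges of <v_0, ..., v_n>, with v_0 = v_n."""
    return sum(intersection_number(vertices[k], vertices[k + 1], p, u)
               for k in range(len(vertices) - 1))

square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j]         # counterclockwise square
star = [cmath.exp(4j * math.pi * k / 5) for k in range(6)]   # pentagram
u = cmath.exp(0.1j)  # ray direction, assumed transverse to both polygons

print(polygon_winding_number(square, 0, u),
      polygon_winding_number(star, 0, u),
      polygon_winding_number(square, 3, u))
```

With these inputs the three values are 1 (the square winds once around its center), 2 (the pentagram winds twice around its center), and 0 (the point 3 lies in the unbounded cell of the square).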
Remark 3.3.6. This gives us, “in principle”, a way to compute winding
numbers for any loop. Let γ be a loop in C \ {0}, say. By compactness,
there is ε > 0 such that |γ(t)| > ε for all parameter values t; and
then by uniform continuity there is δ > 0 such that if |t − t′| < δ, then
|γ(t) − γ(t′)| < ε. It follows that the portion of γ between parameter
values t and t′ having |t − t′| < δ is homotopic, by a linear homotopy
in C \ {0}, to the straight line path from γ(t) to γ(t′). Subdividing
the whole parameter interval into subintervals of length < δ, we find
that γ is homotopic to a polygonal path. We can then use the algorithm
above to compute the winding number of this approximating
polygonal path, which will be the same as the winding number of the
original γ.
This pattern — there is an algorithm to compute winding num-
bers, but only after approximating a general path by a “good” path —
will recur several times with different meanings of “good” (polygonal,
algebraic, differentiable, etc.). The next two sections give additional
examples.
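The approximation scheme of Remark 3.3.6 can be imitated numerically. The Python sketch below is a heuristic rendering, not a rigorous one: it only inspects sampled values, so the stopping test (consecutive samples closer together than the smallest sampled modulus) stands in for the ε and δ of the remark without proving them. The function name `polygonal_winding_number` and the two sample loops are assumptions made for this example.

```python
import cmath
import math

def polygonal_winding_number(gamma, n=16):
    """Winding number about 0 of a polygonal approximation to the loop gamma.

    The number of sample points is doubled until consecutive samples are
    closer together than the smallest sampled value of |gamma|.
    """
    while True:
        pts = [gamma(k / n) for k in range(n)]
        eps = min(abs(z) for z in pts)
        step = max(abs(pts[(k + 1) % n] - pts[k]) for k in range(n))
        if step < eps:
            break
        n *= 2
    # Each edge now stays away from 0, so its angle increment is well defined.
    total = sum(cmath.phase(pts[(k + 1) % n] / pts[k]) for k in range(n))
    return round(total / (2 * math.pi))

def wobbly(t):
    # Winds three times around 0 while its radius varies between 1 and 3.
    return (2 + math.cos(2 * math.pi * t)) * cmath.exp(6j * math.pi * t)

def offset_circle(t):
    # A unit circle centered at 3; it does not enclose 0.
    return 3 + cmath.exp(2j * math.pi * t)

print(polygonal_winding_number(wobbly), polygonal_winding_number(offset_circle))
```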

3.4. Smooth paths and loops

Let Ω be an open subset of C (or of any other finite-dimensional vector
space over R) and let γ : [0, 1] → Ω be a path. We can differentiate
it by the usual formula
\[
\gamma'(t) = \lim_{h \to 0} \frac{\gamma(t + h) - \gamma(t)}{h} \in \mathbb{C}
\]
provided that the limit exists. Then γ′ is also a function [0, 1] → C,
and we can ask whether it is differentiable, and so on. The path is
said to be smooth if derivatives of all orders exist. If γ is a loop, we’ll
also require smoothness at the basepoint, i.e., that γ^(r)(0) = γ^(r)(1)
for r = 1, 2, …. With these conventions, the derivative of a smooth
path is a smooth path, and the derivative of a smooth loop is a smooth
loop. Finally, we’ll say that a smooth path (or loop) γ is regular if
γ′(t) ≠ 0 for all t ∈ [0, 1].
In this section we will show how to compute the winding number
for smooth loops, along the lines of the computation for polygonal
loops in the previous section.

Figure 3.4. Transverse (T) and nontransverse (N) intersections of a curve with a ray.

Remark 3.4.1. It follows immediately from the Stone-Weierstrass
theorem (Theorem C.1.5) that the smooth loops in Ω are dense among
all (continuous) loops; that is to say, for any continuous loop and
any ε > 0 there is a smooth loop within ε of the given continuous
one. Replicating the argument given in Remark 3.3.6 above, we can
therefore see that every continuous loop is homotopic to a smooth
one and therefore that the computation of the winding number for
smooth loops gives us another “in principle” way to compute it for
all loops.
Using the Stone-Weierstrass theorem in a slightly more elaborate
way, one can also see that if two smooth loops are continuously ho-
motopic, then they are smoothly homotopic; that is, the homotopy
h : [0, 1] × [0, 1] → Ω can be taken to be a smooth function of both
variables. We will need this fact at one point later on.
Definition 3.4.2. Let γ be a smooth path in C and let R be a ray in
the direction of a unit vector u ∈ C. Then R and γ are transverse if,
at every point γ(t) where γ* and R intersect, the complex numbers
γ′(t) and u are linearly independent over R (and thus form an R-basis
of C; see Remark 1.3.2).

“Linearly independent over R” amounts to saying that γ′(t) is
nonzero and not parallel to the unit vector u. Thus the ray R “cuts
through” the curve at a point of transversality and does not “graze”
it. See Figure 3.4.
Definition 3.4.3. Let P = γ(t) be a point where a smooth path γ
meets a ray R (in direction u) transversely. The intersection number
of γ and R at P, which we write iₜ(γ, R), is defined to be +1 if u and
γ′(t) form a right-handed basis for C (see Remark 1.3.2) and −1 if
they form a left-handed one.

Notice that this deﬁnition is consistent with the one we gave for
polygonal paths (equation (3.3.4)). We are going to prove a result
expressing winding numbers in terms of smooth intersection numbers,
which will be a counterpart of Proposition 3.3.5 in the polygonal case.
Proposition 3.4.4. Let γ be a smooth loop in C \ {p} and let R
be a ray from p to ∞, transverse to γ. Then there are only finitely
many parameter values t = t₁, …, tₖ where γ(t) intersects R and the
winding number of γ around p is equal to the sum of its intersection
numbers with R, that is
\[
\operatorname{wn}(\gamma, p) = \sum_{j=1}^{k} i_{t_j}(\gamma, R).
\]

We will prove Proposition 3.4.4 by “straightening out” the smooth

loop near each intersection point, without changing the winding num-
ber or any of the intersection numbers. Once this is done we can make
use of Lemma 3.3.2 relating the winding number to intersection num-
bers for curves that have short straight sections. To be precise, we
need to verify the following.
Claim 3.4.5. Let γ be a smooth loop in C \ {p} and let R be a ray
from p to ∞, transverse to γ. Then R meets γ in only ﬁnitely many
points. Moreover, γ is homotopic, in C \ {p}, to a loop which has the
same intersection points with R and has short straight sections near
each intersection point.

Proof. There is no loss of generality in assuming that p = 0 and
that the ray R is the positive x-axis. If we write γ(t) = x(t) + iy(t),
then intersection points t = τ have y(τ) = 0 and the transversality
condition is y′(τ) ≠ 0. By the mean value theorem, there is ε > 0
such that if |t − τ| < ε, then
\[
|y(t) - y(\tau) - y'(\tau)(t - \tau)| < \tfrac12 |y'(\tau)|\,|t - \tau|, \qquad |x(t)| > \tfrac12 |x(\tau)|.
\]
In particular this tells us that there are no other points of intersection
for t in this range; so the parameter values t for which intersections
occur form a discrete subset of the compact set [0, 1], which is therefore
finite. Now let
λ(t) = (x(τ) + x′(τ)(t − τ)) + i(y(τ) + y′(τ)(t − τ)),
that is, the straight line path tangent to γ at the intersection point
τ. Let ϕ be a “bump function” for which ϕ(t) = 1 for |t − τ| < ε/3
and ϕ(t) = 0 for |t − τ| > 2ε/3. The homotopy

h(s, t) = (1 − s)γ(t) + s((1 − ϕ(t))γ(t) + ϕ(t)λ(t))

now deforms γ, in C \ {0}, to a path which has a short straight
section near t = τ and exactly the same intersection with R as γ has
at τ. Apply this construction to each of the finitely many intersection
points to complete the proof.
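Proposition 3.4.4 can also be tried out numerically. In the Python sketch below (an illustration under stated assumptions, not part of the text's argument) the crossings of a smooth loop with a ray are located by bisection on a parameter grid, and each contributes the sign of Im(γ′(τ)/u), which is positive exactly when (u, γ′(τ)) is a right-handed basis. The ray is assumed to be transverse and the grid is assumed fine enough to separate the crossings; the names `intersection_sum`, `gamma`, and `dgamma` are made up for this example.

```python
import cmath
import math

def intersection_sum(gamma, dgamma, p, u, n=2000):
    """Sum of the intersection numbers of the loop gamma with the ray from p in direction u."""
    def y(t):
        # Signed distance (up to scale) from the line through p in direction u.
        return ((gamma(t) - p) / u).imag

    total = 0
    for k in range(n):
        lo, hi = k / n, (k + 1) / n
        if (y(lo) > 0) == (y(hi) > 0):
            continue  # no sign change on this subinterval
        for _ in range(60):  # bisection for the parameter tau with y(tau) = 0
            mid = (lo + hi) / 2
            if (y(lo) > 0) == (y(mid) > 0):
                lo = mid
            else:
                hi = mid
        tau = (lo + hi) / 2
        if ((gamma(tau) - p) / u).real > 0:  # the crossing lies on the ray itself
            total += 1 if (dgamma(tau) / u).imag > 0 else -1
    return total

def gamma(t):
    # Sample smooth loop: three turns around 0 with varying radius.
    return (2 + math.cos(2 * math.pi * t)) * cmath.exp(6j * math.pi * t)

def dgamma(t):
    # Derivative of gamma, computed by hand with the product rule.
    radial = -2 * math.pi * math.sin(2 * math.pi * t)
    angular = 6j * math.pi * (2 + math.cos(2 * math.pi * t))
    return (radial + angular) * cmath.exp(6j * math.pi * t)

print(intersection_sum(gamma, dgamma, 0, cmath.exp(0.7j)),
      intersection_sum(gamma, dgamma, 0, cmath.exp(2.0j)))
```

Both ray directions give the same total, as the proposition predicts, since the answer is the winding number and does not depend on which transverse ray is used.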

Despite their formal similarity, there is an apparently signiﬁcant

difference between Propositions 3.3.5 and 3.4.4. In the polygonal case
it is obvious that what we called transverse rays exist: there are only
finitely many directions that such a ray has to avoid. In the smooth
case, it is not at all obvious that there are any transverse rays in the
sense of Definition 3.4.2.
In fact, however, a ray “chosen at random” will almost surely be
transverse. This is an (easy) consequence of a general result called
Sard’s theorem, which is fundamental to the study of smooth maps
in all dimensions. Sard’s theorem, which is discussed in Appendix D,
says that for any smooth function f, the image under f of the set of
points where f′ vanishes is “small” in the sense of measure theory.
The consequence that we will need looks like this:

Proposition 3.4.6. Let γ and p be fixed as in Proposition 3.4.4.
Then the nontransverse rays form a subset of measure zero⁴ in the
set of all rays through p. In particular, transverse rays always exist.

Proof. Consider the loop γ as a path [0, 1] → C \ {p}. From (3.2.2),
there is a map g : [0, 1] → C such that γ(t) − p = exp(g(t)), and
the smoothness of γ implies that g is smooth too. Let us work out
what the condition is for the ray through γ(t) to meet the path γ
transversely there. It is that the imaginary part of (γ(t) − p) times the
complex conjugate of γ′(t) be nonzero; but if γ(t) − p = exp(g(t)), this
expression is equal (by the chain rule) to
\[
\operatorname{Im}\bigl((\gamma(t) - p)\,\overline{\gamma'(t)}\bigr)
= \operatorname{Im}\bigl(\exp(g(t))\,\overline{\exp(g(t))}\;\overline{g'(t)}\bigr)
= -\exp(2 \operatorname{Re} g(t)) \operatorname{Im} g'(t).
\]
So let f(t) = Im g(t), the imaginary part of g. Then nontransverse
rays occur for parameter values t where f′(t) = 0, and moreover the
angle for such a nontransverse ray is just exp(if(t)) ∈ S¹. Therefore,
the set of angles for nontransverse rays is
F = exp(if(E)),
where E = {t : f′(t) = 0}. By Lemma D.1.7, f(E) is a measure zero
subset of R, which implies that F is a measure zero subset of S¹.

⁴ Notice that the rays through a fixed point are parameterized by a copy of the circle,
S¹. The notion of a measure zero set of rays is defined according to Remark D.1.8.

3.5. Counting roots via winding numbers

The fundamental theorem of algebra is the following statement: let
\[
p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0
\]
be a complex polynomial of degree n; then p has exactly n roots in
the complex plane (counted according to multiplicity).

Remark 3.5.1. Complex numbers first entered mathematics as a
corollary of Cardano’s formula (sixteenth century) for the solution of
a cubic equation. Rafael Bombelli noticed that this formula, when
applied to the equation x³ − 15x − 4 = 0, gave a solution involving
the square root of −121; yet by inspection x = 4 is a root. Bombelli
had discovered that he could compute with “imaginary” quantities
like √−121 and obtain the correct “real” answer.
Leibniz (1702) claimed that the fundamental theorem of algebra
is false and that x⁴ + 1 = 0 is a counterexample. He assumed that the
square root of a complex imaginary quantity must be an imaginary of
a still more complicated kind. In fact, as we know, the square roots
of i are the complex numbers ±2^(−1/2)(1 + i).
It is usually said that the ﬁrst “real proof” of the fundamental
theorem was given by Gauss in his 1799 doctoral thesis. However,
by modern standards that proof contains some signiﬁcant gaps. It
assumes various properties of real algebraic curves, which are actually

quite subtle (Gauss: “no-one has ever doubted it. . . but, if anyone
desires, on another occasion I intend to give a demonstration which
will leave no doubt.” Apparently he never did.) Gauss’s proof also
assumes the topological lemma about paths in the unit square that
already appeared in our discussion of the “lovers and haters” problem.
Nowadays there are proofs of the fundamental theorem based on
analysis, proofs based on algebra, and proofs based on topology. We
will give a topological proof. Moreover, the proof will generalize to
give another, quite diﬀerent, “in principle” calculation of the winding
number for any loop, this one based on approximating by rational
functions and counting zeroes and singularities.

Now we will begin our proof of the fundamental theorem of algebra.
Recall the definition of the multiplicity of a root of a polynomial.
The division algorithm tells us that if p is a polynomial of degree n
and a ∈ C, we can always write
p(z) = (z − a)q(z) + p(a)
for some polynomial q = q₁ of degree n − 1. In particular if a is a
root of p, that is p(a) = 0, we can write p(z) = (z − a)q₁(z). It may
or may not be the case that a is a root of q₁; if it is, we can apply
the division algorithm again to write p(z) = (z − a)²q₂(z), and so on
until we arrive at some qₖ with qₖ(a) ≠ 0.
Definition 3.5.2. The multiplicity of a root a of a complex polynomial
p of degree n is the number k ∈ N such that
p(z) = (z − a)^k qₖ(z)
where qₖ is a polynomial of degree n − k and qₖ(a) ≠ 0.

Note that the multiplicity is zero if p(a) ≠ 0.

Proposition 3.5.3. Let p be a complex polynomial and let a be a
root of p. Let γ be a circular loop around a of radius r, small enough
that B(a; r) contains no other roots of p, and traversed once in the
positive direction. Consider the loop
p ◦ γ : t → p(γ(t)) ∈ C \ {0}.
The winding number of this loop (about 0) is equal to the multiplicity
of the root of p at a.

Proof. Write p(z) = (z − a)^k qₖ(z) = uₖ(z)qₖ(z), with uₖ(z) = (z − a)^k,
as in the definition above. Then, by Lemma 3.2.6, the winding number
of p ◦ γ is the sum of the winding numbers of qₖ ◦ γ and of uₖ ◦ γ.
Write γ(t) = γ_ρ(t) = a + ρe^{2πit}, with ρ = r. Since qₖ never
vanishes in B(a; r), letting ρ vary from r to zero defines a homotopy
of qₖ ◦ γ to the constant path. Thus wn(qₖ ◦ γ) = 0.
We can calculate explicitly that uₖ ◦ γ(t) = r^k e^{2πikt}. This has
winding number wn(uₖ ◦ γ) = k, by Theorem 3.2.7. Putting this
together with the previous calculation completes the proof.
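As a worked instance of this proof (the polynomial is an illustrative choice, not one taken from the text), let p(z) = (z − a)²(z − b) with b ≠ a, and take 0 < r < |a − b|. With γ(t) = a + re^{2πit} the two factors of the proof are explicit:

```latex
\[
p(\gamma(t)) \;=\;
\underbrace{r^{2} e^{4\pi i t}}_{u_2 \circ \gamma}
\;\cdot\;
\underbrace{\bigl((a - b) + r e^{2\pi i t}\bigr)}_{q_2 \circ \gamma}.
\]
```

The second factor stays inside the disc of radius r about a − b, which does not contain 0 because r < |a − b|, so it is homotopic to a constant in C \ {0} and has winding number 0. The first factor has winding number 2. Their sum, 2, is the multiplicity of the root a.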

Theorem 3.5.4. Let p be a complex polynomial. Let r > 0, and
suppose that p has no roots on the circle |z| = r. Then the total
number of roots inside that circle (counted with multiplicity) is the
winding number of t ↦ p(re^{2πit}) about the origin.

Proof. It is another “bubble argument” (compare Remark 3.3.3).
We consider the winding number n(ρ) of t ↦ p(ρe^{2πit}) about the
origin, as ρ increases from 0 to r.
When ρ is small and positive, n(ρ) is equal to the multiplicity of
0 as a zero of p. As ρ increases, the path t ↦ p(ρe^{2πit}) varies by a
homotopy in C \ {0} except for those ρ which are the absolute values
of roots of p. Thus n(ρ) is piecewise constant with “jumps” at the
absolute values of the roots of p.
Let a be a root of p, and suppose for a moment that p has no other
roots of the same absolute value as a. Then by a bubble argument,
the increase in n(ρ) when ρ passes through |a| is just the winding
number of p ◦ γ about 0, where γ is a small circular path around a.
By Proposition 3.5.3, this increment is the multiplicity of the zero
at a. (If there are several zeroes with the same absolute value c, we
consider “bubbles” for each of them separately, and we ﬁnd that the
increment in n(ρ) as ρ passes through c is the total multiplicity of all
the roots having that absolute value.)
Adding up all the increments in n(ρ), we ﬁnd that n(r) is the total
multiplicity of zeroes contained in the disc of radius r, as asserted.
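Theorem 3.5.4 gives a practical way to count roots without finding them. The Python sketch below samples t ↦ p(re^{2πit}) and adds up the angle increments between consecutive samples; it assumes that the sampling is fine enough that no increment reaches π in absolute value. The function name `winding_number` and the test polynomial are illustrative choices.

```python
import cmath
import math

def winding_number(f, r, n=4000):
    """Winding number about 0 of t -> f(r e^{2 pi i t}), estimated from n samples."""
    pts = [f(r * cmath.exp(2j * math.pi * k / n)) for k in range(n)]
    total = sum(cmath.phase(pts[(k + 1) % n] / pts[k]) for k in range(n))
    return round(total / (2 * math.pi))

def p(z):
    # Illustrative polynomial: simple roots at 0.5 and 2i, a double root at -3.
    return (z - 0.5) * (z - 2j) * (z + 3) ** 2

counts = [winding_number(p, r) for r in (1.0, 2.5, 4.0)]
print(counts)
```

The three radii enclose one root, two roots, and all four roots (counted with multiplicity), so the printed list is [1, 2, 4]. The count jumps exactly when the radius passes the absolute value of a root, as in the proof.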

Corollary 3.5.5 (Fundamental theorem of algebra). A polyno-

mial of degree n has exactly n complex roots, counted with multiplicity.

Proof. Let p(z) = z^n + aₙ₋₁z^{n−1} + ⋯ + a₀ be such a polynomial.
(There is no loss of generality in rescaling to make the polynomial
monic, i.e., to make the leading coefficient 1.) Write f(z) = z^n. We
know that f(z) − p(z) is a polynomial of degree n − 1 at most, and it
follows that there is R > 0 such that
|f(z) − p(z)| < |z|^n = |f(z)| whenever |z| ≥ R.
It follows in particular that p(z) has no zeroes for |z| ≥ R. Moreover,
let γ denote the circular path with center 0 and radius R. By Rouché’s
theorem (Proposition 3.1.4), the paths f ◦ γ and p ◦ γ have the same
winding number about 0, which is to say that f and p have the same
number of roots (counted with multiplicity) inside B(0; R). But f
obviously has exactly n roots in B(0; R) (at the origin, with n-fold
multiplicity), so p does also.
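The proof only asserts that a suitable R exists. One explicit choice (an addition for illustration; the text does not single out a particular R) is one plus the sum of the absolute values of the lower coefficients:

```latex
\[
R = 1 + |a_{n-1}| + \cdots + |a_0| ,
\qquad
|f(z) - p(z)| \;\le\; \sum_{k=0}^{n-1} |a_k|\,|z|^{k}
\;\le\; (R - 1)\,|z|^{n-1}
\;<\; |z|^{n}
\quad \text{whenever } |z| \ge R .
\]
```

The middle inequality uses |z| ≥ R ≥ 1, so that |z|^k ≤ |z|^{n−1} for k ≤ n − 1; the last one uses (R − 1)|z|^{n−1} < R|z|^{n−1} ≤ |z|^n.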

We can generalize Theorem 3.5.4 above to rational functions, with
basically the same proof. Remember that a rational function is just
a quotient of two polynomials:
f(z) = p(z)/q(z).
As well as zeroes (where p has a root), a rational function has poles
or singularities (where q has a root). If a is a pole, we have
f(z) = (z − a)^(−k) g(z), g(a) ≠ 0, ∞,
for some rational function g which has neither a zero nor a pole at a.
The number k is called the multiplicity of the pole at a. Comparison
with Definition 3.5.2 makes it clear why we should envisage a pole as
a “zero of negative multiplicity”.
Theorem 3.5.4 generalizes as follows.
Theorem 3.5.6. Let f be a rational function. Let r > 0, and suppose
that f has no zeroes or poles on the circle |z| = r. Then the total
number of zeroes and poles inside that circle (counted with multiplicity,
with zeroes counting positively and poles counting negatively) is
the winding number of t ↦ f(re^{2πit}) about the origin.
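The signed count in Theorem 3.5.6 can be observed numerically in the same way as for polynomials. In the Python sketch below, the rational function is an illustrative choice with a double zero and two simple poles, and `winding_number` is a sampled estimate that assumes the circle is sampled finely enough for each angle increment to stay below π in absolute value.

```python
import cmath
import math

def winding_number(f, r, n=4000):
    """Winding number about 0 of t -> f(r e^{2 pi i t}), estimated from n samples."""
    pts = [f(r * cmath.exp(2j * math.pi * k / n)) for k in range(n)]
    total = sum(cmath.phase(pts[(k + 1) % n] / pts[k]) for k in range(n))
    return round(total / (2 * math.pi))

def f(z):
    # Illustrative rational function: double zero at 0.5, simple poles at -0.25 and 3.
    return (z - 0.5) ** 2 / ((z + 0.25) * (z - 3))

totals = [winding_number(f, r) for r in (0.3, 1.0, 4.0)]
print(totals)
```

The circle of radius 0.3 encloses only the pole at −0.25, the circle of radius 1 encloses that pole and the double zero, and the circle of radius 4 encloses everything, so the printed list is [-1, 1, 0].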
Remark 3.5.7. In fact, every loop in C \ {0} can be approximated by
(and, in particular, is homotopic to) one of the form t ↦ f(e^{it}), where
f is rational; this follows, once again, from the Stone-Weierstrass
theorem (Theorem C.1.5). Thus, Theorem 3.5.6 gives us a third “in
principle” algorithm for calculating the winding number, this time in
terms of algebraic invariants: roots of polynomials.

Let V be a vector space and T : V → V a linear transformation.

An eigenvalue for T is a scalar λ for which the linear equation T v =
λv has a nonzero solution v ∈ V ; the corresponding vectors v are
called eigenvectors for the eigenvalue λ. A well-known corollary of
the fundamental theorem of algebra is

Proposition 3.5.8. Let V be a ﬁnite-dimensional complex vector

space. Then any linear transformation T : V → V has an eigenvalue.

Proof. Since V is finite-dimensional, so is the space End(V) of linear
transformations V → V: if V has dimension n, then End(V) has
dimension n². Consider the n² + 1 linear transformations
\[
I,\; T,\; T^2,\; \dots,\; T^{n^2}
\]
in End(V). They cannot form a linearly independent set, so there is a
linear relation between them. That is to say, there is a polynomial p
(of degree m ≤ n²) such that p(T) = 0. By the fundamental theorem
of algebra we may write
p(λ) = c(λ − λ₁) ⋯ (λ − λₘ)
for some complex numbers λ₁, …, λₘ and c ≠ 0. Then
0 = p(T) = c(T − λ₁I) ⋯ (T − λₘI).
Since the composite of injective maps is injective, at least one of the
linear transformations T − λⱼI must fail to be injective. A nonzero
element of its kernel is then an eigenvector.
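For a small worked instance of this argument (the matrix is an illustrative choice, not one from the text), take V = C² and let T be rotation through a right angle. Here a linear relation already appears among I, T, T²:

```latex
\[
T = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
T^2 + I = 0,
\qquad
p(\lambda) = \lambda^2 + 1 = (\lambda - i)(\lambda + i),
\qquad
0 = p(T) = (T - iI)(T + iI).
\]
\[
(T - iI)\begin{pmatrix} 1 \\ -i \end{pmatrix} = 0,
\qquad\text{that is,}\qquad
T\begin{pmatrix} 1 \\ -i \end{pmatrix} = i \begin{pmatrix} 1 \\ -i \end{pmatrix}.
\]
```

So T − iI fails to be injective and (1, −i) is an eigenvector with eigenvalue i. This T has no real eigenvalue, which shows why the complex scalars are needed in Proposition 3.5.8.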

Traditionally, this result is proved using determinants. But, as

stressed by Axler [7], it is simpler to avoid this discussion, as we have
done above.

3.6. Exercises
Exercise 3.6.1. Let f(z) = z + 1/z. Determine the winding number
of the loop f ◦ γₖ about 0 in the following cases (k = 1, 2, 3):
(i) γ₁ is a circular path with center 0 and radius 1/2;
(ii) γ₂ is a circular path with center 0 and radius 3;
(iii) γ₃ is a circular path with center i and radius 3/2.
(A “circular path” is traversed once, in the positive direction.)
Exercise 3.6.2. Investigate whether it is possible to define a “wind-
ing number” for maps C \ {0} → C \ {0}. What properties does your
definition have?
Exercise 3.6.3. Suppose that γ1 and γ2 are loops starting and ending
at 1. Construct an explicit homotopy of loops from the concatenation
γ1 ∗ γ2 to the pointwise product γ1 γ2 .
Exercise 3.6.4. Let γ : [0, 1] → C \ {0} be a loop.
(a) Let u : [0, 1] → [0, 1] be any map such that u(0) = 0 and u(1) = 1.
Show that the winding number of γ ◦ u is the same as the winding
number of γ. (One says that the winding number is independent
of parameterization. See Example 2.2.3.)
(b) For c ∈ (0, 1) define γ_c by
\[
\gamma_c(t) =
\begin{cases}
\gamma(t + c) & \text{if } t + c \le 1, \\
\gamma(t + c - 1) & \text{if } t + c > 1.
\end{cases}
\]
Show that γ_c is a loop with the same winding number as γ. (One
says that the winding number is independent of basepoint.)
Exercise 3.6.5. Let X denote the unit square {(x, y) : 0 ≤ x, y ≤ 1}.
Let γ1 be a continuous path in X from (0, 0) to (1, 1), and let γ2 be
a continuous path from (0, 1) to (1, 0). Prove that γ1 and γ2 must
meet somewhere.
(This is the lemma we needed for the “lovers and haters” problem.
Assuming no crossing, close off one path into a loop in R2 , and obtain
a contradiction by considering properties of the winding number.)
Exercise 3.6.6. A polygonal path traverses the vertices of a regular
heptagon in the unit circle by joining every third vertex (that is, the
vertices are traversed in the order e^{6kπi/7} for k = 0, 1, 2, …, 6). Find
the winding number of the path around the origin. Generalize to
paths traversing every mth vertex of a regular n-gon.

Exercise 3.6.7. Construct an example of a smooth path in C\{0} for

which there are inﬁnitely many nontransverse rays through 0. How
is this consistent with Sard’s theorem?
Exercise 3.6.8. Fill in the details of Bombelli’s calculation for the
equation x³ − 15x − 4 = 0, as follows. Write x = u + v, with the
auxiliary condition uv = 5. Show that the original equation now
gives u³ + v³ = 4. Deduce that u³ and v³ are the two roots of the
quadratic equation
t² − 4t + 125 = 0
and therefore that u³, v³ are 2 ± 11i.
Check that 2 + i is a complex cube root of 2 + 11i, and similarly
for the minus sign. Thus the real root 4 = (2 + i) + (2 − i) can be
obtained by the use of complex numbers.
Exercise 3.6.9. (a) Show (by consideration of winding numbers)
that there is no sequence of complex polynomials pn (z) such that
pn (z) → z̄ uniformly for |z| = 1.
(b) For those who feel more ambitious, investigate whether the same
conclusion holds when we replace “uniformly” by “pointwise”.
That is, is there a sequence of complex polynomials with pn (z) →
z̄ for each individual z on the unit circle, but without the as-
sumption of uniformity? You will probably need to use a result
from complex analysis about polynomial approximation, such as
Runge’s theorem — see Rudin [34].
Exercise 3.6.10. In the context of Lemma 3.3.2, show that the way
in which the short straight section is parameterized does not matter.
In other words, we get the same winding numbers if we replace γ(t),
for t ∈ [t0 , t1 ], by any expression a + bϕ(t), where ϕ : [t0 , t1 ] → [−1, 1]
is any (continuous) map sending t0 to −1 and t1 to 1.

Chapter 4

Topology of the Plane

4.1. Some classic theorems

Many natural questions about the topology of ﬁgures in the plane can
be answered by using the winding number.

Definition 4.1.1. Let n be a natural number. The n-ball Bⁿ is the
set of points (x₁, …, xₙ) ∈ Rⁿ with x₁² + ⋯ + xₙ² ≤ 1; the (n − 1)-sphere
Sⁿ⁻¹ is the boundary of the n-ball, that is, the set of points
(x₁, …, xₙ) ∈ Rⁿ with x₁² + ⋯ + xₙ² = 1.

A basic result of topology is the no-retraction theorem:

Theorem 4.1.2. There is no continuous map f : B n → S n−1 such

that f (x) = x for all x ∈ S n−1 .

When n = 1, the theorem follows from the fact that B¹ is path
connected (Definition 2.1.1) while S⁰ is not. When n = 2, we will
prove the theorem using the winding number. For n ≥ 3 one needs
higher-dimensional topological methods (one possible approach will
be indicated in the final chapter; see Example 9.1.7).

Proof (for n = 2). Let f : B² → S¹ be a map satisfying the hypothesis
of the theorem. Define a family of maps hₜ : S¹ → S¹ by

hₜ(e^{iθ}) = f(te^{iθ}).

Plainly h is a homotopy, with h₁ being the identity map (by assumption)
and h₀ being constant. But the identity map S¹ → S¹ has
winding number 1, and a constant map has winding number 0, so no
such homotopy can exist.

Figure 4.1. Constructing a retraction to prove the Brouwer
fixed-point theorem.

The no-retraction theorem implies the Brouwer ﬁxed-point the-

orem. This is another result which is valid in all dimensions, but
techniques based on the winding number only give us a proof in di-
mension n = 2.

Corollary 4.1.3 (Brouwer ﬁxed-point theorem). Let g : B n →

B n be a continuous map. Then g has a ﬁxed point; in other words,
there exists x ∈ B n such that g(x) = x.

Many existence questions in mathematics and its applications

(does this diﬀerential equation have a solution? does this economic
model have an equilibrium? does this function have a zero?) can
be reformulated as ﬁxed-point problems, and the Brouwer theorem is
then often the key to a positive solution.

Proof. Suppose that g has no fixed point. Define a map f : Bⁿ →
Sⁿ⁻¹ as follows. For each x ∈ Bⁿ, since g(x) ≠ x, there is a unique
ray starting at g(x) and passing through x. Let f(x) be the (unique)
point where this ray intersects Sⁿ⁻¹. (See Figure 4.1.)

It is clear that the map f is continuous, and if x ∈ S n−1 , then

f (x) = x by construction. Thus f is a retraction, contradicting The-
orem 4.1.2.

In the following discussion we will focus our attention on the

winding numbers of maps S 1 → S 1 (in this context the winding num-
ber is often referred to as the degree). Since the domain and the
codomain are the same, we can compose these maps:
(f ◦ g)(z) = f (g(z)).
What eﬀect does composition have on the winding number?
Lemma 4.1.4. Let f, g ∈ Maps(S 1 , S 1 ), and let h = f ◦ g denote
their composition. Then
wn(h) = wn(f ) · wn(g).

Proof. Notice that if f varies through a homotopy fₛ, then h varies
through the homotopy fₛ ◦ g. Similarly if g varies through a homotopy
gₛ, then h varies through the homotopy f ◦ gₛ.
Let f have winding number m and let g have winding number n.
Then, by Theorem 3.2.7, f is homotopic to the map z ↦ z^m and g
is homotopic to the map z ↦ z^n. By the previous paragraph, h is
homotopic to the composite of these two maps, which is z ↦ z^{mn}.
Thus h has winding number mn, as required.
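The multiplicativity in Lemma 4.1.4 is easy to observe on examples. The Python sketch below estimates the degree of a map S¹ → S¹ from sampled angle increments (assuming the sampling is fine enough that no increment reaches π in absolute value) and compares the degree of a composite with the product of the degrees. The two maps are illustrative choices: each is a power of z or of its conjugate multiplied by a unimodular factor that is homotopic to a constant.

```python
import cmath
import math

def degree(f, n=4000):
    """Degree (winding number about 0) of a map f from the unit circle to itself."""
    pts = [f(cmath.exp(2j * math.pi * k / n)) for k in range(n)]
    total = sum(cmath.phase(pts[(k + 1) % n] / pts[k]) for k in range(n))
    return round(total / (2 * math.pi))

def f(z):
    # Homotopic to z -> z^3; the exponential factor has modulus 1.
    return z ** 3 * cmath.exp(0.5j * z.imag)

def g(z):
    # Homotopic to z -> conj(z)^2, which has degree -2.
    return z.conjugate() ** 2 * cmath.exp(0.3j * z.real)

def h(z):
    return f(g(z))

print(degree(f), degree(g), degree(h))
```

The printed degrees are 3, −2, and −6, in agreement with wn(f ◦ g) = wn(f) · wn(g).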

A map f : S 1 → S 1 is called even if f (z) = f (−z) for all z, and

it is called odd if f (z) = −f (−z) for all z.
Proposition 4.1.5. An even map S 1 → S 1 has even degree; an odd
map S 1 → S 1 has odd degree.

Proof. Suppose that f : S¹ → S¹ is even. Define g : S¹ → S¹ by
g(z) = f(w), where w² = z.
Of course, given z, there are two possible choices for w (differing by
sign); but because f is even, the choice doesn’t matter, and so g is
well-defined.
I claim that g is also continuous. Indeed, let ε > 0. Since f is
(uniformly) continuous there is δ > 0 such that |w₁ − w₂| < δ implies
|f(w₁) − f(w₂)| < ε. Now suppose |z₁ − z₂| < δ², z₁ = w₁², and
z₂ = w₂². Then
|w₁ − w₂||w₁ + w₂| = |w₁² − w₂²| < δ²,
so one of |w₁ − w₂| and |w₁ + w₂| is less than δ. By an appropriate
choice of sign we may arrange that |w₁ − w₂| < δ; so we have proved
that |z₁ − z₂| < δ² ⇒ |g(z₁) − g(z₂)| < ε and g is continuous, as
required.
Now f = g ◦ s, where s(w) = w². Since the map s has degree
2, Lemma 4.1.4 shows that the degree of f is a multiple of 2. This
proves the first part of the proposition.
To prove the second part, suppose that k is odd. Then f(z) =
zk(z) is even, so it has even degree. But deg(f) = 1 + deg(k) by
Lemma 3.2.6. Thus k has odd degree as required.

In the statement of the next theorem, let A : S 2 → S 2 denote

the antipodal map (Deﬁnition 2.3.8) which sends each point to its
opposite.
Theorem 4.1.6 (Borsuk-Ulam theorem). Let h : S 2 → R2 be a
continuous map. Then there exists x ∈ S 2 such that h(x) = h(A(x)),
where A(x) is the antipode of x.

A famous illustration of the Borsuk-Ulam theorem is that there

are always two antipodal points on the earth’s surface having exactly
the same temperature and barometric pressure.

Proof. Suppose not. Then, for each x ∈ S 2 , the vector h(x) −

h(A(x)) is nonzero; let u(x) ∈ S 1 be the unit vector in the direc-
tion of h(x) − h(A(x)). Then u : S 2 → S 1 is a continuous map and
satisﬁes
u(A(x)) = −u(x)
for all x ∈ S 2 . If we restrict u to the equatorial copy of S 1 inside
S 2 , we obtain an odd map v : S 1 → S 1 , which must have odd degree
by Proposition 4.1.5. In particular, the degree of v must be nonzero.
But the map u on the upper hemisphere of S 2 describes a homo-
topy between v and a constant map, which has zero degree. This
contradiction proves the theorem.

Figure 4.2. Planar counterpart of the ham sandwich theo-

rem: two areas can be bisected by a single line.

Theorem 4.1.7 (Ham sandwich theorem). Let A, B, and C be

three solid bodies in R3 . There is always a plane that divides all three
of them exactly in half by volume.

To understand the name, think of A, B, and C as being the two

slices of bread and the ﬁlling of a sandwich, which we want to divide
fairly by a single knife-cut. It seems however (see [9]) that the earliest
formulation involving ham is the following: “Can we place a piece of
ham under a meat cutter in such a way that meat, bone, and fat are
all cut in halves?” Apparently bread—forming a “sandwich”—was
added to the problem later, replacing the bone and fat.

Proof. The mention of volume suggests, correctly, that one would

have to do some measure theory to make a rigorous argument; we
won’t sweat the details of that here.
It’s helpful to think ﬁrst about the case when one or more of A,
B, C are balls. A plane that bisects the volume of a ball is just a
plane through its center. So if all three bodies are balls, the existence
of the desired plane is obvious: just consider the plane through their
centers.
If one body is a ball — say C — then for each vector v ∈ S 2
there is exactly one plane normal to v that bisects C, namely, the
plane through the center of C. Call that plane Pv . Now deﬁne x(v)
to be the volume of the part of body A on the positive v-side of

Pv , and y(v) similarly for body B. Deﬁne a map f : S 2 → R2 by

f (v) = (x(v), y(v)).
By the Borsuk-Ulam theorem, there is a v ∈ S 2 such that f (v) =
f (A(v)). But this means that the volumes of the parts of bodies A
and B on the positive v-side of Pv are the same as the corresponding
volumes on the negative v-side; that is, Pv bisects A and B, as well
as C.
In the general case, we can carry out the same argument provided
we know that for each vector v we can ﬁnd a plane Pv normal to v that
bisects C and that Pv depends continuously¹ on v. Here’s how to do
that. Consider the family of all planes perpendicular to v. These are
parameterized by a real number t, the signed distance from the plane
to the origin, and as t varies from −∞ to ∞ the fraction of C on the
positive v-side of the plane increases continuously from 0 to 1. Thus,
by the intermediate value theorem, there is a value of t for which this
number is 1/2, i.e., a plane perpendicular to v bisecting C. (If there is
more than one such plane, the possible t-values for such planes form
a closed interval; we choose Pv to be the plane corresponding to the
midpoint of that interval.)
This completes the proof of the ham sandwich theorem.
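
The planar counterpart (Figure 4.2) is easy to experiment with. The sketch below is an illustration under stated assumptions, not the argument of the text: two finite point clouds stand in for the two areas, the line normal to (cos θ, sin θ) is placed through the median of the cloud C, and a bisection search in θ exploits the antipodal symmetry g(θ + π) = −g(θ) of the imbalance g. All names (imbalance, bisecting_direction, sample_a, sample_c) are hypothetical.

```python
import math
import random
import statistics


def imbalance(theta, set_a, set_c):
    """Points of A on the positive side minus points on the negative side of
    the line normal to (cos theta, sin theta) through the median of C."""
    vx, vy = math.cos(theta), math.sin(theta)
    t = statistics.median(x * vx + y * vy for x, y in set_c)
    pos = sum(1 for x, y in set_a if x * vx + y * vy > t)
    neg = sum(1 for x, y in set_a if x * vx + y * vy < t)
    return pos - neg


def bisecting_direction(set_a, set_c, iterations=40):
    """Locate a sign change of the imbalance on [0, pi] by bisection."""
    lo, hi = 0.0, math.pi
    g_lo = imbalance(lo, set_a, set_c)
    if g_lo == 0:
        return lo
    for _ in range(iterations):
        mid = (lo + hi) / 2
        g_mid = imbalance(mid, set_a, set_c)
        if g_mid == 0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # sign change lies in [mid, hi]
        else:
            hi = mid                # sign change lies in [lo, mid]
    return lo


# Hypothetical sample data: two clouds of 401 points each.
random.seed(0)
sample_a = [(random.random(), random.random()) for _ in range(401)]
sample_c = [(2 + random.random(), 1 + random.random()) for _ in range(401)]
theta_star = bisecting_direction(sample_a, sample_c)
```

Because the clouds are finite, the imbalance is a step function, so one should only expect the line found to split each cloud evenly up to a point or two. The same symmetry argument is what the Borsuk-Ulam theorem supplies one dimension higher.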

4.2. The Jordan curve theorem I

Deﬁnition 4.2.1. A loop γ in the plane is called a Jordan curve if
it has no self-intersection points or, to put it another way, it is given
by an injective map γ : S 1 → C.

Proposition B.3.17 tells us that such an injective map γ is actually

a homeomorphism onto its image, for which reason one sometimes
refers to the image γ ∗ itself as a Jordan curve. One of the most
notorious theorems in topology is the following:
Theorem 4.2.2 (Jordan curve theorem). Any Jordan curve in C
divides the plane into exactly two regions, of which it is the common
boundary.

¹ This continuity is one of those details that we ought to sweat over. In fact, it
is a nontrivial point: see the discussion in [6]. Exercise 4.5.5 will indicate one way of
filling in the details here.

Figure 4.3. A complicated Jordan curve. Courtesy of Robert

Bosch, Oberlin College.

We give a little clariﬁcation here. By “divides the plane into

exactly two regions”, we mean that C \ γ ∗ has two path components
(or cells as we called them before). The notion of boundary is deﬁned
below:

Deﬁnition 4.2.3. Let U be a subset of a metric space X. A point

x ∈ X is a boundary point of U if for every ε > 0 the ball B(x; ε)
contains both points of U and points of the complement X \ U . The
boundary ∂U of U is the collection of its boundary points.

Lemma 4.2.4. For any compact K ⊆ C, the boundary of each path

component of C \ K is a subset of K.

The Jordan curve theorem supplies the additional information

that if K is a Jordan curve, this subset is all of K.

Proof. Let U be some path component of C \ K and let W be the

union of all the other path components of C \ K. Let p ∈ ∂U . Then
for any ε > 0, the ball B(p; ε) contains a point z0 ∈ U and a point
z1 ∉ U. The point z1 either belongs to K or to W; if it belongs to W,
the line segment from z0 to z1 must contain a point of K (otherwise
z0 and z1 would be in the same path component of C \ K). So, in
either event, B(p; ε) contains a point of K. Since this is true for any
ε > 0, p belongs to the closure of K. But K is closed since it is
compact, so p ∈ K as required.

Remark 4.2.5. Given what we know already, it is not hard to prove

Theorem 4.2.2 for polygonal Jordan curves (Exercise 4.5.6). But
the Peano curve (Example 2.1.6) shows that continuous paths can
have unexpected, counterintuitive topological properties, which are
not shared by “nice” paths like polygonal or diﬀerentiable ones. One
might worry, then, whether the oh-so-intuitive Jordan curve theorem
might also fail for some weird continuous path. In fact, it doesn’t;
but this is not at all straightforward to prove.

Remark 4.2.6. The theorem was first stated formally by Jordan in
his Cours d’analyse of 1887. It is often claimed that his proof was
fallacious. However, in 2007 Thomas Hales published a reanalysis
of Jordan’s original proof [19] in which he defends Jordan’s reasoning.
Jordan’s argument proceeds from the case of polygonal paths to
general paths.

The Jordan curve theorem is hard because it deduces something

about the complement of a subset of the plane using only information
internal to that subset. The ﬁrst result that we need in the proof of
the theorem gives a way to make that crucial step from a subset to
its complement. Notice in the statement below that we give a crite-
rion to determine when a point p lies in the unbounded component
of the complement of a compact subset K using only homotopical
information that is internal to K itself.

Proposition 4.2.7 (Eilenberg’s criterion). Let K be a compact

subset of C. A point p ∈ C\K belongs to the unbounded component of
C\K if and only if the function from K to C\{0} defined by z → z −p
lifts through the exponential map (in the sense of Definition 3.1.1,
that is, if and only if there is a function k(z) defined on K with
z − p = exp(k(z))).

Proof. (“Only if”) If p is suﬃciently far from K, then z − p is con-

tained in a ﬁxed half-plane for all z ∈ K, so it certainly lifts through
the exponential map. As p moves on a path in the unbounded com-
ponent of C \ K, the map z → z − p moves by a homotopy in C \ {0}.
By Proposition 3.1.5, the property of lifting through the exponential
map is preserved by such a homotopy.

Figure 4.4. Illustrating the proof of the “if” direction of
Eilenberg’s criterion.

(“If”) Suppose (aiming at a contradiction) that p belongs to some

bounded component U of C \ K and that z − p = exp f (z) for some
continuous function f : K → C. Let X be the metric space C \ U ,
which consists of K and all the other components (except for U ) of
C \ K; in particular it includes the unbounded component W . Thus
we have the following set-up: a compact subset K of a metric space
X, and a continuous function f : K → C.
In this situation, the Tietze extension theorem (see Proposition
C.2.1 in Appendix C) applies and says that there exists a continuous
g : X → C that extends f; that is, the triangle formed by the
inclusion K ↪ X, the map g : X → C, and the map f : K → C commutes.

In particular, z − p = exp(g(z)) for z ∈ K. Now recall (Lemma
4.2.4) that ∂U ⊆ K and therefore Ū ⊆ U ∪ K. Consider the two
continuous functions z ↦ z − p (on Ū \ {p}) and z ↦ exp(g(z)) (on X =
C \ U). These are defined on closed subsets of C \ {p} and agree on the
overlap ∂U ⊆ K of their domains, so by Proposition B.4.2 they fit
together to define a continuous map u : C \ {p} → C \ {0}.

Consider the winding number of the loop u ◦ γr , where γr is a

circle with center p and radius r. Varying r gives a homotopy, so all
these winding numbers must be the same. But for r small, γr lies
entirely inside U and the winding number is 1; whereas for r large, γr
lies entirely inside the unbounded component W , and therefore inside
X, so that u ◦ γr is an exponential and the winding number is 0. This
contradiction proves the result.
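
For a concrete feel for Eilenberg's criterion, take K to be the unit circle (an assumption made purely for illustration). The sketch below computes the winding number about 0 of z ↦ z − p as z runs once round K, and, for p outside the closed disc, writes down an explicit lift using the principal logarithm. The helper names are hypothetical. The lift works because z − p = (−p)(1 − z/p), and 1 − z/p stays in the right half-plane when |z| ≤ 1 < |p|; this is the "fixed half-plane" of the "only if" direction.

```python
import cmath
import math


def winding_number_about(p, samples=2048):
    """Winding number about 0 of z -> z - p as z runs once round the unit circle."""
    total = 0.0
    prev = complex(1.0, 0.0) - p
    for k in range(1, samples + 1):
        cur = cmath.exp(2j * math.pi * k / samples) - p
        total += cmath.phase(cur / prev)  # increment in (-pi, pi]
        prev = cur
    return round(total / (2 * math.pi))


def lift_outside(z, p):
    """A continuous k with exp(k(z)) = z - p, valid when |z| <= 1 < |p|.

    Uses z - p = (-p) * (1 - z/p); the second factor has positive real part,
    so the principal logarithm is continuous along it.
    """
    return cmath.log(-p) + cmath.log(1 - z / p)


inside_point = 0.3 + 0.2j    # lies in the bounded component of C \ K
outside_point = 2.0 - 1.0j   # lies in the unbounded component of C \ K
```

For the inside point the winding number is 1, which rules out a lift (an exponential has winding number 0); for the outside point it is 0 and lift_outside provides the lift.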

Eilenberg’s criterion immediately allows us to prove that certain

compact sets have connected complements (one says that they do not
separate the plane). Recall (Exercise 2.4.7) that a compact space K
is contractible if the identity map K → K is homotopic to a constant
map.

Proposition 4.2.8. Let K be a compact contractible subset of C.

Then its complement C \ K is connected.

Proof. If K is contractible, then there is a homotopy H : [0, 1]×K →

K between the identity map K → K and a constant map. Now
suppose that f is a map from K to another space X. Then the
composite

[0, 1] × K  ─H→  K  ─f→  X

gives a homotopy between f and a constant map K → X. In other

words, any map from a contractible space to any space is homotopic
to a constant map.
In particular, for p ∈ C \ K, the map ϕ : z → (z − p) is homotopic
in C \ {0} to a constant map. By Proposition 3.1.5, the map ϕ lifts
through the exponential function. By Eilenberg’s criterion, p is in the
unbounded component of C \ K. Since p was arbitrary, every point
of C \ K is in the unbounded component, which is to say that C \ K
is connected.

Remark 4.2.9. An important example is that of an arc. By deﬁni-

tion, an arc in C is a subset that is homeomorphic to the compact
interval [0, 1]. It is easy to see that [0, 1] is contractible, and thus so
is any arc. From Proposition 4.2.8 we see therefore that the comple-

ment of any arc is connected (or, as it is usually expressed, no arc

separates the plane): a fact that will play an important role in the
proof of the Jordan curve theorem in the next section.

4.3. The Jordan curve theorem II

We continue with our proof of the Jordan curve theorem (Theorem
4.2.2).

Proposition 4.3.1. Let γ be a Jordan curve in C, and assume that

C \ γ ∗ is not path connected. Then the boundary of each connected
component of C \ γ ∗ is all of γ ∗ .

Proof. Let z = γ(t) be a point of γ ∗ . Let U be a bounded component

of C \ γ ∗ and let W be the unbounded component. It suffices to show
that, for every ε > 0, the disc B(z; ε) contains points of U and points
of W .
By continuity there is δ > 0 such that the arc B = γ([t − δ, t + δ])
lies entirely in the disc B(z; ε). Let A be the complementary arc
γ(S 1 \ (t − δ, t + δ)). The arc A does not separate the plane (Re-
mark 4.2.9). Therefore, there is a path ψ in C \ A joining a point
p ∈ U to a point q ∈ W . This path must meet γ (since p, q are in
different cells) and it doesn’t meet A, so it must meet B. Let s0 be
the parameter value where it first meets B, i.e.,
s0 = inf{s ∈ [0, 1] : ψ(s) ∈ B}.
Then ψ(s0 ) ∈ B(z; ε). By continuity, ψ(s) ∈ B(z; ε) for s < s0
sufficiently close to s0 . But ψ(s) ∈ U for all s < s0 , so B(z; ε) contains
points of U . Arguing similarly for s1 = sup{s ∈ [0, 1] : ψ(s) ∈ B}, we
see that B(z; ε) also contains points of W .

We will also need the “lovers and haters” lemma (Exercise 3.6.5).

Proposition 4.3.2. Let R be a closed rectangle in the plane, and

let A, B, C, D be four points on the circumference of R, taken in
cyclic order. Let γ1 and γ2 be paths from A to C and from B to D,
respectively, lying entirely (except for their endpoints) in the interior
of R. Then γ1 and γ2 have a point of intersection.

Figure 4.5. Jordan curve theorem, Part 1.

Proof. Join C to A by a polygonal arc γ0 that lies outside R (except

at its endpoints). The concatenation of γ0 and γ1 forms a loop γ. By
Lemma 3.3.2 we can compute that the winding numbers of γ about B
and about D diﬀer by ±1. In particular, B and D cannot be joined
by a path in C \ γ ∗ . Thus γ2 must meet γ somewhere. As it lies
entirely within R, it must in fact meet γ1 , as required.

Now for the proof of the Jordan curve theorem. Let J be a

Jordan curve. Since J is compact, there exist points P, Q ∈ J that
are as far apart as possible (that is, r = d(P, Q) is greater than or
equal to the distance between any other two points of J). Thus J is
completely contained in the lozenge-shaped region B(P ; r) ∩ B(Q; r).
In particular, it is completely contained in some rectangle R with
sides perpendicular and parallel to P Q, and it meets the boundary
of this rectangle only at P and at Q. The Jordan curve J is then the
union of two arcs from P to Q; call these J1 and J2 . (See Figure 4.5.)
Keep this set-up for the next two lemmas, which together consti-
tute a proof of the Jordan curve theorem. This proof is based on an
article by Maehara [27].

Lemma 4.3.3. The complement of J has at least two connected

components.

Proof. We want to find a point which we can prove does not lie in
the unbounded component of the complement of J. Here’s how we
proceed. Pick a point A on the top of the rectangle R and draw a
vertical line AB from the top to the bottom of the rectangle. The
“lovers and haters” lemma assures us that this line will pass through
both arcs J1 and J2; assume, without loss of generality, that the first
such intersection point (as we move downwards from A) is with J1.
Let K be the highest intersection point of AB with J1, and let L be
the lowest such point. (The existence of such “highest” and “lowest”
points, the sup and inf of certain sets, is assured by the compactness
of J1.)

Figure 4.6. Jordan curve theorem, Part 2.

There are points of J2 ∩ AB lower than L. (Proof: If not, the
path comprised of segment AK, the arc α of J1 from K to L, and
segment LB would connect A to B without meeting J2 , contradicting
the lovers and haters lemma.) Let M be the highest such point, and
let N be the lowest such point. For future reference, we note that
the straight line segment N B meets neither J1 nor J2 , except at its
endpoint N .
Finally, let W be the midpoint of LM . By construction, W be-
longs neither to J1 nor to J2 , so it is in C \ J. We will prove that it
is not in the unbounded component of this set. (See Figure 4.6.)
Suppose the contrary. Then there is a path Γ in R \ J that
connects W to some point Z on the boundary of the rectangle R.

There are two possibilities: either Z lies on the boundary arc P AQ

or on the boundary arc P BQ.
Suppose ﬁrst that Z lies in P AQ. Then concatenating Γ to the
segment W B yields a path Γ ∗ W B from Z to B that does not meet
J1 . This contradicts the lovers and haters lemma.
On the other hand, suppose that Z lies in P BQ. Then the con-
catenation of four paths

AK ∗ α ∗ LW ∗ Γ

yields a path from A to Z that does not meet J2 . This is a similar

contradiction.

Lemma 4.3.4. The complement of J has at most two connected

components.

Proof. Keep the notation from the previous lemma. In addition to

the arc α that we already deﬁned, let β be the arc of J2 that connects
M to N . Let Δ be the path from A to B obtained by concatenating

Δ = AK ∗ α ∗ LM ∗ β ∗ N B.

Observe that the points of Δ all lie in either J itself, the unbounded
component of the complement of J, or the bounded component of
the complement that contains W . Let ε > 0 be small enough that
B(P ; ε) and B(Q; ε) don’t meet the arc Δ.
Suppose that U is another component of C \ J, bounded, but
not containing W. By Proposition 4.3.1, ∂U = J. In particular,
there are points P′ and Q′ of U belonging to B(P; ε) and B(Q; ε),
respectively. These points are inside R (everything outside belongs to
the unbounded component). Moreover, there is a path Λ in U joining
P′ to Q′. Notice that, by construction, the path Δ contains no points
of U, so Λ cannot meet Δ. (See Figure 4.7.)
Now we contradict the lovers and haters lemma using the paths
Δ and Λ′ = PP′ ∗ Λ ∗ Q′Q. (The segments PP′ and Q′Q lie in B(P; ε)
and B(Q; ε), so they do not meet Δ either.) This completes the proof.

A classically important corollary is the theorem of invariance of

domain.

Figure 4.7. Jordan curve theorem, Part 3. The path Δ is emphasized.

Theorem 4.3.5 (Invariance of domain). Let U ⊆ C be an open

set and let f : U → C be continuous and injective. Then f is an open
map (it takes open sets to open sets). In particular, f (U ) is open in
C, and f is a homeomorphism from U onto f (U ).

Proof. It suﬃces to show that whenever B(z; ε) ⊆ U , f (B(z; ε)) is

an open subset in C. Let D be the closure of B = B(z; ε) and S
its boundary circle. Then f(S) is a Jordan curve, so its complement
has two components; on the other hand, the complement of f(D) has
only one (unbounded) component by Proposition 4.2.8, since f(D) is
homeomorphic to the closed disc D and is therefore contractible.
Since f(B) is path connected, it must be contained in
one of the components of the complement of f (S), and it must in fact
be the whole of the bounded component (since otherwise the comple-
ment of f (D) = f (S) ∪ f (B) would not be connected). Thus f (B) is
open.

4.4. Inside the Jordan curve

It’s important to be clear about what we have not proved at this
point. Let J be a Jordan curve in C. We have proved that the
complement C \ J consists of exactly two components, one bounded
and one unbounded. The bounded component is called the Jordan

domain ΩJ determined by J. None of the techniques that we have

developed so far answer any of the following natural questions:
(a) Is the open set ΩJ simply connected?
(b) If the answer to (a) is yes, is ΩJ homeomorphic to the open disc
𝕌 := {z ∈ C : |z| < 1}?
(c) If the answer to (b) is yes, can a homeomorphism ΩJ → 𝕌
be found which extends continuously to a homeomorphism from
the closure Ω̄J = ΩJ ∪ J to the closed disc 𝕌̄ = 𝕌 ∪ S¹?
Clearly these are successively stronger statements: (c) implies (b)
which implies (a). Note that in (c) the equality Ω̄J = ΩJ ∪ J is a
consequence of the Jordan curve theorem.
It turns out that all three of these statements are true. The
strongest of them, (c), is known as the Schoenflies theorem. How-
ever, they are all of a higher level of difficulty than the Jordan curve
theorem itself. One way of seeing this is to notice that the natural
higher-dimensional counterpart of the Jordan curve theorem is true
(the Jordan-Brouwer separation theorem) but the higher-dimensional
counterpart of the Schoenflies theorem is false without additional as-
sumptions. We’ll discuss this in a moment.
So how might one go about proving (a)–(c)? Modern direct
proofs [35] use some kind of infinite combinatorial construction to
prove the Schoenflies theorem — imagine a ramifying, treelike struc-
ture reaching out towards that (potentially very wiggly) boundary of
ΩJ . But the first proofs did not work that way. Instead, they made
use of analysis — specifically, the theory of conformal mapping — to
prove this result.
Definition 4.4.1. Let U and V be open subsets of C. A conformal
mapping from U to V is a homeomorphism f : U → V which is holomorphic,
that is, differentiable as a function of a complex variable: the limit
f′(z) := lim_{h→0} (f(z + h) − f(z)) / h ∈ C
must exist for each z ∈ U.

Holomorphic functions will be discussed in more detail in the

next chapter (Deﬁnition 5.3.1), where we will see that the real and

imaginary parts u, v of such a function f must satisfy the partial

differential equations
∂u/∂x = ∂v/∂y,    ∂v/∂x = −∂u/∂y,
known as the Cauchy-Riemann equations. The geometric significance
of these equations turns out to be that f preserves the angles between
infinitesimal vectors, though it may rescale their lengths. That is the
reason for the word “conformal”.
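
The Cauchy-Riemann equations are easy to test numerically. The sketch below approximates the four partial derivatives by central differences and reports how far the two equations are from holding at a point. The function names and the step size h are our own choices, and the test functions are hypothetical examples.

```python
import cmath


def partials(f, z, h=1e-6):
    """Central-difference estimates of u_x, v_x, u_y, v_y for f = u + iv at z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # derivative in the x direction
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative in the y direction
    return fx.real, fx.imag, fy.real, fy.imag


def cauchy_riemann_defect(f, z, h=1e-6):
    """Largest violation of u_x = v_y and v_x = -u_y at the point z."""
    ux, vx, uy, vy = partials(f, z, h)
    return max(abs(ux - vy), abs(vx + uy))


def holomorphic_example(z):
    return z * z * cmath.exp(z)


def conjugation(z):
    return z.conjugate()   # u = x, v = -y: not holomorphic
```

For the holomorphic example the defect is zero up to discretization error, while for complex conjugation one has u_x = 1 and v_y = −1, so the defect is 2.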
Theorem 4.4.2 (Riemann mapping theorem). Let U be a proper,
nonempty connected open subset of C (proper means not equal to C
itself) and suppose that the following holds:
• for every loop γ in U and every p ∉ U, the winding number
wn(γ; p) equals zero.
Then there exists a conformal homeomorphism from U onto the open
unit disc 𝕌. In particular, U is homeomorphic to 𝕌.
Remark 4.4.3. Most courses on complex analysis will contain a
proof of the Riemann mapping theorem. One approach, due to Koebe,
proceeds as follows. First, one shows that the hypothesis about winding
numbers implies that we can always find holomorphic square
roots² of nowhere-zero holomorphic functions on U; in other words,
if f is such a function, then one can find another such function g such
that g(z)² = f(z). Now one considers the class of conformal homeomorphisms
of U into the open unit disc 𝕌 (i.e., the range may be a proper subset of 𝕌).
Using the existence of square roots, one gives an explicit construction
for “improving” such a homeomorphism by making its range “big-
ger”. Iterating this “improvement” process and passing to the limit
gives the Riemann map.

Now it is easy to see that the hypothesis of the Riemann mapping

theorem is satisﬁed for the interior region ΩJ of a Jordan curve J
(Exercise 4.5.8). The theorem therefore implies a “yes” answer to
questions (a) and (b) above.
One can also use these analytical techniques to prove the Schoen-
ﬂies theorem (that is, give a positive answer to question (c) as well).
² We will prove this in the next chapter; see Proposition 5.5.2.

To do so one must investigate the extension of the Riemann mapping

to boundary points. Two examples warn us that this will be a delicate
matter:
Example 4.4.4. If U is an open subset of C which is not a Jordan
domain, a conformal homeomorphism from U onto the open unit disc 𝕌
need not extend to a homeomorphism ∂U → ∂𝕌; see Exercise 4.5.12.
Example 4.4.5. If ΩJ is a Jordan domain and f : ΩJ → 𝕌 is a
general (not conformal) homeomorphism onto the open unit disc 𝕌,
then f need not extend to a homeomorphism J → ∂𝕌; see Exercise 4.5.11.

Nevertheless, in the early twentieth century Carathéodory was

able to prove
Proposition 4.4.6. Let J be a Jordan curve, and let f : ΩJ → 𝕌 be a
conformal homeomorphism onto the open unit disc 𝕌. Then there is a
unique homeomorphism from the closure Ω̄J = ΩJ ∪ J to the closed
disc 𝕌̄ that extends f.

Together with the Riemann mapping theorem, this completes the

proof of the Schoenflies theorem. (Full details of this argument may
be found in [8].)
Remark 4.4.7. It may seem surprising that the first proof of a purely
topological result such as the Schoenflies theorem should depend on
the analysis of partial differential equations. But there is more than
one modern parallel. One of the most sensational mathematical sto-
ries of recent years has been the proof, by Perelman, of the Poincaré
conjecture, a fundamental result of 3-dimensional topology. And while
the details are very different and much more sophisticated, the ba-
sic structure — introducing auxiliary partial differential equations,
approaching an “optimum” solution by an iterative “improvement”
process, and showing that such an “optimum” also solves the original
topological problem — is quite similar to the structure of the proof
of the Schoenflies theorem that is sketched above. See [28].

As mentioned earlier, the Schoenﬂies theorem is special to dimen-

sion 2. The Alexander horned sphere gives an example of a homeomor-
phic image of S 2 in R3 whose interior domain is not homeomorphic
to the interior of a standard ball. To prove the Schoenﬂies theo-
rem in higher dimensions, one needs an extra hypothesis of “local

flatness” which rules out such wild behavior. The classic reference
on the Alexander horned sphere is [10, page 38] (many interesting
drawings and animations can now be found online also), and for the
“locally flat” version of the higher-dimensional Schoenflies theorem,
see [12].
However, a “stabilized” version of the Schoenflies theorem is true
(and easy to prove) in any dimension. The statement is:
Lemma 4.4.8. Let A and B be homeomorphic closed subsets of Rⁿ,
and let h : A → B be a homeomorphism between them. Embed Rⁿ
in R²ⁿ as a hyperplane. Then the homeomorphism h : A → B can
be extended to a homeomorphism h̃ : R²ⁿ → R²ⁿ. In particular,
R²ⁿ \ A is homeomorphic to R²ⁿ \ B.

In other words, if we allow ourselves some extra room, in the

form of n extra dimensions, then the strongest possible Schoenﬂies-
type statement becomes true. The ingenious proof, due to Dold [15],
is outlined in Exercise 4.5.14. The key step is provided by the Tietze
extension theorem. With a little extra input from algebraic topology
it is possible to obtain the Jordan curve theorem, and its higher-
dimensional counterpart the Jordan-Brouwer separation theorem, as
corollaries of this result.
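
The two-step construction behind Lemma 4.4.8 can be tried out in the simplest case n = 1. The sketch below rests on hypothetical data: A = [0, 1], B = [0, 2], and h(x) = 2x, with the continuous extensions f and g written down by hand (in general the Tietze extension theorem supplies them). It builds the two shear maps of Exercise 4.5.14 and their inverses.

```python
def clamp(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(high, value))


# Hypothetical data: A = [0, 1], B = [0, 2], h(x) = 2x on A.
def f(x):
    """A continuous extension of h : A -> B to all of R."""
    return 2.0 * clamp(x, 0.0, 1.0)


def g(y):
    """A continuous extension of the inverse of h to all of R."""
    return clamp(y, 0.0, 2.0) / 2.0


def h1(x, y):
    return (x, y + f(x))      # vertical shear


def h1_inverse(x, y):
    return (x, y - f(x))


def h2(x, y):
    return (x - g(y), y)      # horizontal shear


def h2_inverse(x, y):
    return (x + g(y), y)


def composite(x, y):
    """The homeomorphism h2 o h1 of the plane."""
    return h2(*h1(x, y))
```

For x in A the composite sends (x, 0) to (0, 2x) = (0, h(x)), so after swapping the two coordinates it extends h. Each shear is visibly invertible, which is why no hypothesis beyond continuity of f and g is needed.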

4.5. Exercises
Exercise 4.5.1. Give an example of a compact connected metric
space X and a continuous map f : X → X that has no fixed point.
Also, give such examples where f has a specified number n ∈ N of
fixed points.
Exercise 4.5.2. Let m ∈ N. Suppose that a map f : S¹ → S¹ has
the property that f(z) = f(e^{2πi/m} z) for all z ∈ S¹. What can you
say about the degree of f? (Note that Proposition 4.1.5 covers the
case m = 2.)
Exercise 4.5.3. It is a consequence of Lemma 4.1.4 that, for maps f
and g from S 1 to itself, f ◦ g is always homotopic to g ◦ f . Investigate
whether this is always true for maps from a compact metric space X
to itself.

Exercise 4.5.4. Let A be an n × n square matrix all of whose entries

are strictly positive. Show that A has an eigenvector with strictly
positive eigenvalue and all entries strictly positive. (This is a simple
version of the Perron-Frobenius theorem.)
Hint: Let Δ ⊆ Rⁿ be the simplex
Δ = {(x₁, . . . , xₙ) : xᵢ ≥ 0, Σ xᵢ = 1}.
Show that Δ is homeomorphic to Bⁿ⁻¹. If v ∈ Δ, show that Av has
all entries strictly positive and therefore that ‖Av‖₁, the sum of the
entries of Av, is also strictly positive. Apply the Brouwer fixed-point
theorem to the map
v ↦ ‖Av‖₁⁻¹ Av
from Δ to itself.
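
The map in the hint can also be iterated numerically. Note that this is a different technique from the one the exercise asks for: the Brouwer theorem only asserts that a fixed point exists, whereas the sketch below simply applies the map repeatedly (power iteration) and observes that, for the hypothetical 2 × 2 matrix chosen here, the iterates settle down to a fixed point. All names are our own.

```python
def apply(matrix, v):
    """Matrix-vector product Av."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def normalized_image(matrix, v):
    """The map v -> Av / ||Av||_1 from the simplex to itself."""
    av = apply(matrix, v)
    total = sum(av)  # equals ||Av||_1 because all entries are positive
    return [x / total for x in av]


def iterate_to_fixed_point(matrix, steps=200):
    """Start at the barycenter of the simplex and iterate the map."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(steps):
        v = normalized_image(matrix, v)
    return v


# Hypothetical example matrix with strictly positive entries.
example = [[2.0, 1.0],
           [1.0, 3.0]]
fixed = iterate_to_fixed_point(example)
eigenvalue = sum(apply(example, fixed))  # ||A v||_1 at the fixed point
```

At a fixed point v we have Av = ‖Av‖₁ v, so v is an eigenvector with strictly positive entries and eigenvalue ‖Av‖₁; for the example matrix this eigenvalue is (5 + √5)/2.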

Exercise 4.5.5. This exercise indicates one way to resolve the con-
tinuity question in the proof of the ham sandwich theorem. We will
consider each of the three bodies A, B, and C to be Lebesgue mea-
surable subsets of R3 , having positive Lebesgue measure and each
contained in the ball B(0; R) of center 0 and radius R in R3 . Let
H = [−3R, 3R] × S 2 . To each (t, v) ∈ H assign the half-space
Ω(t, v) = {x ∈ R3 : x · v > t}.
Notice that B(0; 2R) is a subset of Ω(t, v) for all t < −2R and all v and
is disjoint from Ω(t, v) for all t > 2R and all v. Let λ denote Lebesgue
measure.
(i) Let f ∈ L¹(B(0; 2R), λ) be an integrable function and define
Φf(t, v) = ∫_{Ω(t,v)} f dλ.
Prove that Φf is a continuous function on H.

(ii) Fix ε > 0 and let f = χ_C + εχ_{B(0;2R)}. Show that, for fixed
v, the function t ↦ Φf(t, v) is strictly monotone decreasing for
t ∈ [−R, R] and indeed that there is a constant c > 0 such that
|Φf(t, v) − Φf(t′, v)| ≥ c|t − t′|
for t, t′ ∈ [−R, R].

(iii) Deduce that for each v ∈ S² there is a unique t = t_v such that
Φf(t, v) = ½ ∫ f dλ = ½ (λ(C) + ελ(B(0; 2R)))
and that t_v depends continuously on v.

(iv) Using the Borsuk-Ulam theorem show that there exists v ∈ S 2
such that Ω(tv , v) contains exactly half of the volume of A and
of B.
(v) The v and tv constructed in the preceding steps depend on ε.
Now let ε = 1/k, k = 1, 2, 3, . . ., and let (t_k, v_k) be the sequence
of corresponding parameters in H. Show that we can find a
subsequence (t_{k_m}, v_{k_m}) which is convergent, to (t_∞, v_∞) say.
(vi) Show that the plane represented by (t_∞, v_∞) solves the original
ham sandwich problem.

Exercise 4.5.6. Complete the following outline of a proof of the

Jordan curve theorem for polygonal loops.

(a) Let γ be a polygonal Jordan curve and let ε > 0. For each
straight line segment of γ draw two parallel segments, at distance
ε either side of γ, and join these segments at suitable points to
form polygonal curves γ1 and γ2 , which if ε is sufficiently small
will be disjoint from γ.
(b) Now let U be the union of all those cells of γ for which the winding
number wn(γ, p) is even, and let V be the union of all those cells
for which the winding number is odd. Show that γ1 is contained
in one of these sets, say U , and γ2 is contained in the other, say V .
(c) Show that U and V are path connected; this will finish the proof.
Suppose that p, q ∈ U and draw the straight line path from p to
q. If this doesn’t meet γ, we are done. If it does, there is a first
point at which it meets γ. Show that just before the first point
at which it meets γ, it must meet γ1, say at a point p′. By the
same argument, just after the last point at which it meets γ it
must meet γ1, say at a point q′. But now we get a path from p to
q in U by traveling first straight to p′, then along γ1 to q′, then
straight to q.

(This argument suggests that the Jordan curve theorem is relatively

simple once our curve γ has an appropriate “tubular neighborhood”,
here constituted by the “guard rails” γ1 and γ2 .)
Exercise 4.5.7. Let A and B be compact subsets of C, and assume
that A ∩ B is connected.
(i) Suppose that A and B do not separate the plane. Show that
A ∪ B does not separate the plane. (Remember that a compact
subset K separates the plane if C \ K has more than one path
component.)
(ii) More generally, show that if p, q are two points which are in the
same component of C \ A and in the same component of C \ B,
then they are in the same component of C \ (A ∪ B).
(iii) Does the result in part (i) remain true if we only require that A
and B are closed sets?
Exercise 4.5.8. Let J ⊆ C be a Jordan curve and let ΩJ be its
interior (that is, the bounded component of the complement of J).
(i) Prove that if γ is a loop in ΩJ and q ∈ C\ΩJ , then wn(γ, q) = 0.
(Hint: Show that all points of C \ ΩJ belong to the same path
component of C \ γ ∗ .)
(ii) Let p belong to the bounded component of C \ J. Show that the
winding number of J around p is ±1. (Hint: Prove this ﬁrst
when J has a short straight section, using Lemma 3.3.2. Reduce
the general case to this using a “cross-cut”.)
Exercise 4.5.9. Show that R² is not homeomorphic to Rⁿ for n ≥ 3.
Hint: Let h : Rⁿ → R² be the putative homeomorphism. Let
U ⊆ Rⁿ be the set {(x₁, x₂, 0, . . . , 0) : x₁² + x₂² < 1}, which is
homeomorphic to a disc in R². By applying the invariance of domain
theorem to the map R² → R² defined by (x₁, x₂) ↦ h(x₁, x₂, 0, . . . , 0),
show that the image h(U) is open in R². Show that this contradicts
the continuity of h.
Exercise 4.5.10. What is the 1-dimensional counterpart of the in-
variance of domain theorem? Prove it.
Exercise 4.5.11. Construct an example of a homeomorphism U → U
which has no continuous extension to the boundary.

Exercise 4.5.12. (i) Let J be a Jordan curve in C and let U be

the bounded component of C \ J. Let z0 ∈ J. Show that for
each ε > 0 there exists δ > 0 having the following property: if
z1 , z2 ∈ U ∩ B(z0 ; δ), then there exists a curve in U ∩ B(z0 ; ε)
joining z1 to z2 .
(ii) Let U denote the “slit disc” in C; that is,
U = {x + iy : x² + y² < 1, x ≥ 0 ⇒ y ≠ 0}.
Show that U is bounded and simply connected, but its boundary
is not a Jordan curve.
(iii) Give an explicit example to show that the boundary of the region
U in part (ii) above does not have the property proved in part
(i) for Jordan curves.
(iv) (For those who know something about conformal maps) Construct
a conformal homeomorphism from the slit disc U onto the open
unit disc. Show that the homeomorphism you have constructed does
not extend to ∂U. Can you give an explicit description of its
boundary behavior?
Exercise 4.5.13. Let J be a Jordan curve in the plane. Show that
there is a continuous function f : C → R such that f (z) is positive
if z is in the interior of J, negative if z is in the exterior of J, and
zero if z is on J. Challenge: If J is smooth (i.e., can be smoothly
parameterized by a loop γ with γ′ ≠ 0), show that f can be chosen
to be smooth also.
Exercise 4.5.14. Let A ⊆ Rm and B ⊆ Rn be closed sets and let
h : A → B be a homeomorphism.
(i) Using the Tietze extension theorem, extend h to a continuous
map f : Rm → Rn . Show that the map
h1 (x, y) = (x, y + f (x))
is a homeomorphism Rm × Rn → Rm × Rn .
(ii) Similarly, extend h−1 to a continuous map g : Rn → Rm and
show that
h2 (x, y) = (x − g(y), y)
is a homeomorphism Rm × Rn → Rm × Rn .

(iii) Show that h2 ◦ h1 : Rm × Rn → Rm × Rn is a homeomorphism
which sends (x, 0) to (0, h(x)) whenever x ∈ A.
(iv) Putting n = m, deduce the result of Lemma 4.4.8.
Dold’s paper [15] shows how this result plus a little algebraic topology
(the Mayer-Vietoris sequence) can be used to give a very concise proof
of the general Jordan-Brouwer separation theorem. A detailed devel-
opment of the argument, with all the required algebraic topology, can
be found in [26]. The presentation uses de Rham cohomology, which
we will begin to develop in the next chapter.

Chapter 5

Integrals and
the Winding Number

5.1. Diﬀerential forms and integration

In Chapter 3 we gave three procedures for computing the winding
number, one “polygonal”, one “smooth”, and one “algebraic”. In
various ways, these all worked by counting discrete objects, like in-
tersection points of curves or roots of polynomials. In terms of our
archetypal model of a clock face, imagine a bell chiming every time
the clock hand passes 12, so that the winding number is obtained by
counting the chimes.
We could also imagine a quite different approach to calculating
the winding number. Instead of counting discrete “chimes” or inter-
sections, we could try to measure the angular speed at which the clock
hand is turning at each instant and thus find the total angle through
which it turns (which is just 2π times the winding number) by adding
up the increments “angular speed times time interval” over succes-
sive instants. In other words, we could define the winding number by
integration. That is the guiding idea of this chapter.
We’ll need the language of multivariable calculus to do this effec-
tively. Traditionally, multivariable calculus is expressed using arrays
of partial derivatives, but it is more in accordance with the spirit of
modern mathematics to avoid this implicit use of specific coordinate
systems and instead to think of derivatives as linear maps between
normed vector spaces. Appendix E gives an outline of multivariable
calculus from this point of view.

5.1(a). Diﬀerential 1-forms. Let V be a ﬁnite-dimensional vector

space over the real ﬁeld R, and let V ∗ denote its dual space (Deﬁni-
tion A.4.1).

Deﬁnition 5.1.1. Let Ω be an open subset of V . A (diﬀerential )

1-form on Ω is a smooth map Ω → V ∗ .

In other words, a 1-form α is a “smoothly varying family of elements
of the dual space”: it is a function Ω × V → R, (x, v) ↦ α(x)[v],
depending¹ smoothly on x and linearly on v. One can add 1-forms
(pointwise) and multiply them by scalars, so the 1-forms comprise
a vector space. In fact, one can even multiply 1-forms pointwise by
functions. If α is a 1-form and g is a smooth function, then β defined
by
β(x)[v] = g(x)α(x)[v]
is also a 1-form; we use the natural notation β = gα.
If f is a smooth, real-valued function on Ω, its directional deriva-
tives give rise to a 1-form in a natural way:

Definition 5.1.2. Let f : Ω → R be a smooth map. The gradient
1-form df of f is defined by
\[ df(x)[v] = \left.\frac{d}{dt}\, f(x + tv)\right|_{t=0} \]
for vectors v ∈ V.

In the language of Appendix E, df is simply the derivative Df

of f , thought of as a map Ω → V ∗ : df (x)[v] = Df (x)[v]. See
Example E.2.8.

1
To help keep track of the various dependencies, I will use square brackets in
this chapter to indicate a linear dependence, and round brackets to indicate a more
general functional dependence. See Convention E.2.6 for more about this.

Proposition 5.1.3. The gradient operator d satisﬁes the sum and

product rules:

d(f + g) = df + dg, d(f g) = f dg + g df.

Proof. We’ll prove the product rule (the sum rule is easier). By
definition
\begin{align*}
d(fg)(x)[v] &= \left.\frac{d}{dt}\bigl(f(x + tv)\,g(x + tv)\bigr)\right|_{t=0} \\
&= \left.\Bigl(f(x + tv)\,\frac{d}{dt}\,g(x + tv) + g(x + tv)\,\frac{d}{dt}\,f(x + tv)\Bigr)\right|_{t=0} \\
&= f(x)\,dg(x)[v] + g(x)\,df(x)[v],
\end{align*}
using the ordinary form of the product rule.

Remark 5.1.4. Later we shall also want to consider complex-valued
1-forms, where α(x)[v] belongs to C rather than to R. There is no
extra diﬃculty in dealing with these — their real and imaginary parts
are just 1-forms in the ordinary (real-valued) sense. The laws above
continue to apply.

5.1(b). Pullback forms. Let V and W be two vector spaces, and

suppose that ϕ : V → W is a smooth map. Recall that for each x ∈ V ,
the derivative Dϕ(x) of ϕ is a linear map V → W . If β is a 1-form
on W , then we may deﬁne a 1-form α on V by composing with Dϕ:

\[ \alpha(x)\colon V \xrightarrow{\;D\varphi(x)\;} W \xrightarrow{\;\beta(\varphi(x))\;} \mathbb{R}. \]

In symbols, α(x)[v] = β(ϕ(x))[Dϕ(x)[v]].

Deﬁnition 5.1.5. The above-deﬁned 1-form α is called the pullback

of β along the smooth map ϕ, and it is written α = ϕ∗ β.

Remark 5.1.6. Here is an important special case. Suppose that

β is in fact the gradient of a function f : W → R. Then I claim
that α = ϕ∗ β is the gradient of the function f ◦ ϕ : V → R, that is,
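ϕ∗(df) = d(f ◦ ϕ). To see this, remember that df(x)[w] = Df(x)[w]
and therefore
\begin{align*}
\varphi^*(df)(x)[v] &= Df(\varphi(x))\bigl[D\varphi(x)[v]\bigr] \\
&= \bigl(Df(\varphi(x)) \circ D\varphi(x)\bigr)[v] && \text{(definition of composite map)} \\
&= D(f \circ \varphi)(x)[v] && \text{(by the chain rule, Proposition E.3.2)} \\
&= d(f \circ \varphi)(x)[v],
\end{align*}
as required.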

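Proposition 5.1.7. Pullbacks are functorial: given smooth maps
\[ V \xrightarrow{\;\varphi\;} W \xrightarrow{\;\psi\;} X \]
and a 1-form β on X, we have
\[ (\psi \circ \varphi)^*(\beta) = \varphi^*(\psi^*(\beta)). \]

Proof. Let β be a 1-form on X. Then by definition
\begin{align*}
(\psi \circ \varphi)^*(\beta)(x)[v] &= \beta(\psi(\varphi(x)))\bigl[D(\psi \circ \varphi)(x)[v]\bigr], \\
\varphi^*(\psi^*\beta)(x)[v] &= \beta(\psi(\varphi(x)))\bigl[D\psi(\varphi(x))[D\varphi(x)[v]]\bigr].
\end{align*}
But these are equal by the chain rule, Proposition E.3.2.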

Example 5.1.8. Consider the special case where both V and W are
1-dimensional vector spaces. In that case they are both isomorphic
to R; ﬁx isomorphisms x : V → R and y : W → R. The function
y = ϕ(x) is now a real-valued function of a real variable, and its
derivative Dϕ(x) is a 1 × 1 matrix, that is, a scalar Dϕ(x) = ϕ′(x).
The most general 1-form β on W is given by β = f(y) dy for some
smooth function f, and Definition 5.1.5 gives
\[ \varphi^*\bigl(f(y)\,dy\bigr) = f(\varphi(x))\,\varphi'(x)\,dx. \tag{5.1.9} \]

The identity (5.1.9) in the above example should remind you of
the rule for integration by substitution from elementary calculus:
\[ \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy = \int_a^b f(\varphi(x))\,\varphi'(x)\,dx, \]
valid where the substitution ϕ has (strictly) positive derivative on
the interval [a, b] ⊆ R. In fact, this allows us to set up a close link
between 1-forms and integration over intervals.

Definition 5.1.10. Let I = [a, b] ⊆ R be a closed bounded interval
and let α be a 1-form on I. The integral of α over I, written $\int_I \alpha$, is
defined to be the ordinary “Calculus I” integral
\[ \int_a^b f(x)\,dx, \]
where α = f(x) dx is the expression of the 1-form α in terms of the
“standard” 1-form dx.

From (5.1.9) and the rule for integration by substitution, it then
follows that:
Lemma 5.1.11. Let I and J be closed bounded intervals, and let
ϕ : I → J be a smooth bijection with strictly positive derivative
everywhere. Let β be a 1-form on J. Then
\[ \int_I \varphi^*\beta = \int_J \beta. \]

Thus 1-forms and pullbacks automatically keep track of the deriv-

ative terms required when we integrate by substitution. We’ll develop
this relationship further in the next subsection.

5.1(c). Integration along a path. Let us now consider paths in V .

By a smooth path (also known as a smooth curve) in V we just mean
a smooth map [0, 1] → V , as before (see Section 3.4). Sometimes it is
convenient to allow a general parameter interval [a, b] instead of [0, 1];
of course, this makes no essential difference.
Definition 5.1.12. Let γ be a smooth path in V and let α be a
1-form on an open subset Ω ⊆ V that contains γ∗. The integral of α
along γ, written $\int_\gamma \alpha$, is defined by
\[ \int_\gamma \alpha = \int_{[0,1]} \gamma^*\alpha, \]
where γ∗α is the pullback of the 1-form α along the smooth map γ.

Using the definitions of pullback (Definition 5.1.5) and of integration
over [0, 1] (Definition 5.1.10), we obtain the explicit expression
\[ \int_\gamma \alpha = \int_0^1 \alpha(\gamma(t))[\gamma'(t)]\,dt, \tag{5.1.13} \]
where the integral on the right is now an ordinary “Calculus I” integral.
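As an informal numerical illustration of (5.1.13), the following Python sketch approximates the integral of a 1-form along a smooth path in V = R² by the midpoint rule. The helper name `path_integral`, the sample form −y dx + x dy, and the choice of the unit circle as the path are illustrative assumptions, not taken from the text.

```python
import math

def path_integral(alpha, gamma, dgamma, n=2000):
    """Approximate the integral over [0, 1] of alpha(gamma(t))[gamma'(t)] dt
    with the composite midpoint rule on n subintervals."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += alpha(gamma(t), dgamma(t)) * h
    return total

# Sample 1-form alpha = -y dx + x dy; alpha(point, vec) is linear in vec.
def alpha(point, vec):
    x, y = point
    vx, vy = vec
    return -y * vx + x * vy

# Sample path: the unit circle, traversed once counterclockwise.
def gamma(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def dgamma(t):
    return (-2 * math.pi * math.sin(2 * math.pi * t),
            2 * math.pi * math.cos(2 * math.pi * t))

circle_value = path_integral(alpha, gamma, dgamma)
print(circle_value)
```

For this particular form and path the integrand α(γ(t))[γ′(t)] is constantly 2π, so the printed value should be very close to 2π.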
There is a version of the fundamental theorem of calculus in this
context, which tells us what happens when we integrate a gradient
form.
Proposition 5.1.14. Let γ be a smooth path in V and f a smooth
real-valued function defined on an open subset Ω ⊆ V containing γ∗.
Then
\[ \int_\gamma df = f(\gamma(1)) - f(\gamma(0)). \]
So, if γ is a loop, then $\int_\gamma df = 0$.

Proof. By the chain rule (Proposition E.3.2),
\[ \frac{d}{dt}\bigl(f(\gamma(t))\bigr) = df(\gamma(t))[\gamma'(t)]. \]
Therefore,
\[ \int_\gamma df = \int_0^1 df(\gamma(t))[\gamma'(t)]\,dt = \int_0^1 \frac{d}{dt}\bigl(f(\gamma(t))\bigr)\,dt = \Bigl[f(\gamma(t))\Bigr]_{t=0}^{1}, \]
by the ordinary fundamental theorem of calculus.
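The statement of Proposition 5.1.14 can be checked numerically on a small example. The sketch below is only an illustration: the function f(x, y) = x²y + y, the path γ(t) = (t, t²), and the helper `path_integral` are all assumed choices, and the midpoint rule gives an approximation rather than an exact value.

```python
def path_integral(alpha, gamma, dgamma, n=2000):
    """Midpoint-rule approximation to the integral over [0, 1] of
    alpha(gamma(t))[gamma'(t)] dt."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += alpha(gamma(t), dgamma(t)) * h
    return total

# Sample function f(x, y) = x^2 y + y ...
def f(point):
    x, y = point
    return x * x * y + y

# ... and its gradient 1-form df = 2xy dx + (x^2 + 1) dy.
def df(point, vec):
    x, y = point
    vx, vy = vec
    return 2 * x * y * vx + (x * x + 1) * vy

# Sample path gamma(t) = (t, t^2), running from (0, 0) to (1, 1).
def gamma(t):
    return (t, t * t)

def dgamma(t):
    return (1.0, 2 * t)

lhs = path_integral(df, gamma, dgamma)   # numerical integral of df along gamma
rhs = f(gamma(1.0)) - f(gamma(0.0))      # f(gamma(1)) - f(gamma(0))
print(lhs, rhs)
```

The two printed numbers should agree up to the discretization error of the midpoint rule.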

Deﬁnition 5.1.15. Let γ be a smooth path in V and let ϕ : [0, 1] →
[0, 1] be a smooth bijection with strictly positive derivative every-
where. Then we will call the path γ ◦ ϕ a smooth reparameterization
of γ.

See Example 2.2.3 for the language of “reparameterization”. It
is natural to regard a reparameterization of γ as “the same path
traversed with different speed”, and this leads to the idea that geometric
concepts related to smooth paths should not be affected by smooth
reparameterization. Lemma 5.1.11 tells us that this is true for the
integral of a 1-form. Here are the details:
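Proposition 5.1.16. Suppose that γ2 is a smooth reparameterization
of the smooth path γ1 in V and that α is a 1-form on V. Then
\[ \int_{\gamma_1} \alpha = \int_{\gamma_2} \alpha. \]

Proof. Let γ2 = γ1 ◦ ϕ, where ϕ is as above. Put β = γ1∗α. By the
functorial property of pullbacks (Proposition 5.1.7),
\[ \gamma_2^*\alpha = (\gamma_1 \circ \varphi)^*\alpha = \varphi^*\gamma_1^*\alpha = \varphi^*\beta. \]
Therefore,
\[ \int_{\gamma_1}\alpha = \int_{[0,1]} \gamma_1^*\alpha = \int_{[0,1]} \beta = \int_{[0,1]} \varphi^*\beta = \int_{[0,1]} \gamma_2^*\alpha = \int_{\gamma_2}\alpha, \]
using the definition of path integral and Lemma 5.1.11.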

Because of Proposition 5.1.16, when calculating the integral of
a 1-form along a path we may freely choose whichever smooth
parameterization of the path is most convenient. This often simplifies
calculation.
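Reparameterization invariance can also be observed numerically. In the sketch below, the non-exact form y dx, the path γ1(t) = (t, t²), and the reparameterization ϕ(t) = (t + t²)/2 (a smooth bijection of [0, 1] with ϕ′ > 0) are assumed examples chosen for illustration; `path_integral` is a hypothetical helper using the midpoint rule.

```python
def path_integral(alpha, gamma, dgamma, n=2000):
    """Midpoint-rule approximation to the integral over [0, 1] of
    alpha(gamma(t))[gamma'(t)] dt."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += alpha(gamma(t), dgamma(t)) * h
    return total

# Sample 1-form alpha = y dx (closed forms are not needed for this check).
def alpha(point, vec):
    return point[1] * vec[0]

# Original path gamma1(t) = (t, t^2).
def gamma1(t):
    return (t, t * t)

def dgamma1(t):
    return (1.0, 2 * t)

# Reparameterization phi(t) = (t + t^2) / 2, with phi'(t) = (1 + 2t) / 2 > 0.
def phi(t):
    return 0.5 * (t + t * t)

def dphi(t):
    return 0.5 * (1 + 2 * t)

# gamma2 = gamma1 o phi; its velocity comes from the chain rule.
def gamma2(t):
    return gamma1(phi(t))

def dgamma2(t):
    vx, vy = dgamma1(phi(t))
    return (vx * dphi(t), vy * dphi(t))

value1 = path_integral(alpha, gamma1, dgamma1)
value2 = path_integral(alpha, gamma2, dgamma2)
print(value1, value2)
```

Both values approximate the same number (here 1/3), even though the two parameterizations traverse the curve at different speeds.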

5.1(d). Piecewise smooth paths. A path γ : [0, 1] → V (that is, a
continuous map) is piecewise smooth if there exist a0 , . . . , am ∈ [0, 1],
with 0 = a0 < a1 < · · · < am−1 < am = 1, such that γ is smooth on
each of the intervals [aj , aj+1 ] for j = 0, . . . , m − 1. In other words,
a piecewise smooth path is made up by concatenating finitely many
smooth segments. Polygonal paths (see Section 3.3) are examples of
piecewise smooth paths.
We can extend the definition of path integral to piecewise smooth
paths by defining
\[ \int_\gamma \alpha = \sum_{j=0}^{m-1} \int_{\gamma_j} \alpha, \tag{5.1.17} \]
where γj is the smooth path obtained by restricting γ to the interval
[aj, aj+1]. With this definition the fundamental theorem of calculus,
Proposition 5.1.14, continues to hold for piecewise smooth paths (to
prove this, apply the smooth version to each of the segments γj and
add the results).

5.2. Closed and exact forms

Let α be a 1-form defined on an open subset Ω of a vector space V .
Being (in particular) a smooth map Ω → V∗, α can be differentiated:
its derivative Dα(x) is defined at each x ∈ Ω and is a linear map
V → V∗ or (what is the same thing) a bilinear map V × V → R.
That is, the symbol
Dα(x)[v1, v2]
defines a real number for all x ∈ Ω and v1, v2 ∈ V; this real number
depends smoothly on x and linearly on v1 and v2. (Here we are
reproducing, in our context of 1-forms, some of the discussion leading
up to the expression (E.3.3).)

Deﬁnition 5.2.1. We say that α is closed if Dα is symmetric, that

is, Dα(x)[v1 , v2 ] = Dα(x)[v2 , v1 ] for all x, v1 , v2 .

Deﬁnition 5.2.2. We say that α is exact if there is a smooth function

f on Ω with α = df .

The equations defining a closed form are linear, so the closed
forms (on a given domain Ω) make up a vector space Z¹(Ω), a subspace
of the vector space of all 1-forms on Ω. Similarly, the exact forms
make up a vector space B¹(Ω). In fact, B¹(Ω) is a subspace of Z¹(Ω);
in other words,

Proposition 5.2.3. Every exact form is closed.

Proof. Bearing in mind that df(x)[v] = Df(x)[v] (Definition 5.1.2),
we see that if α = df, then Dα = D²f. This is symmetric by
Clairaut’s theorem (Proposition E.3.4).

Proposition 5.2.4. Let Ω be an open subset of a vector space V and
let Ω′ be an open subset of a vector space V′. Let ϕ : Ω′ → Ω be a
smooth map and let α be a 1-form defined on Ω. Then:
(a) If α is closed, then its pullback ϕ∗α is closed.
(b) If α is exact, then its pullback ϕ∗α is exact.
(Briefly, closedness and exactness are preserved under pullback.)

Proof. To prove (a), we calculate using the chain rule and the product
rule that
\[ D(\varphi^*\alpha)(x)[v_1, v_2] = D\alpha(\varphi(x))\bigl[D\varphi(x)[v_1],\, D\varphi(x)[v_2]\bigr] + \alpha(\varphi(x))\bigl[D^2\varphi(x)[v_1, v_2]\bigr]. \]
The second term is always symmetric in v1, v2, because D²ϕ(x) is
symmetric by Clairaut’s theorem (Proposition E.3.4), and clearly the
first term is symmetric in v1, v2 if Dα(ϕ(x)) is symmetric.
To prove (b), we observe from Remark 5.1.6 that if α = df, then
ϕ∗α = d(f ◦ ϕ).

Remark 5.2.5. Since every exact form is closed, it is natural to
ask whether the converse holds: is every closed form exact? One
can measure the difference² between closed and exact forms by the
quotient space
\[ H^1(\Omega) := Z^1(\Omega)/B^1(\Omega), \]
known as the first de Rham cohomology of Ω. This is a topological
invariant of Ω and is not always zero.

For computational purposes it is important to understand how
the definition of “closed” is expressed in coordinates. Choose a basis
for the n-dimensional vector space V, and let x1, . . . , xn : V → R be
the associated coordinate functions. Then³ a general 1-form α can be
written as
\[ \alpha = u_1\,dx_1 + \cdots + u_n\,dx_n, \]
where u1, . . . , un : V → R are smooth functions.

Proposition 5.2.6. A 1-form α, written in coordinates as u1 dx1 +
· · · + un dxn, is closed if and only if
\[ \frac{\partial u_i}{\partial x_j} = \frac{\partial u_j}{\partial x_i} \]
for all i, j with 1 ≤ i, j ≤ n.

Proof. Fix a point y = (y1, . . . , yn)ᵀ in Ω. The derivative Dα(y) is
a bilinear form on V, which may be written in coordinate form (with
respect to the given basis of V) as a square matrix. By the chain rule,
the (i, j)th entry of this matrix is the derivative at t = yi of the
composite function
\[ \mathbb{R} \xrightarrow{\;\sigma_i\;} \Omega \xrightarrow{\;\alpha\;} V^* \xrightarrow{\;\tau_j\;} \mathbb{R}, \]
where σi(t) = (y1, . . . , yi−1, t, yi+1, . . . , yn)ᵀ and τj(ξ1, . . . , ξn) = ξj.
But this composite function is t ↦ uj(y1, . . . , yi−1, t, yi+1, . . . , yn)
and its derivative at t = yi is just
\[ \frac{\partial u_j}{\partial x_i}(y). \]

2
Making this deﬁnition is an example of the process described in the Preface,
whereby mathematicians turn lemons into lemonade. In place of an inconvenience —
closed and exact are not always the same — we now have an interesting and useful
invariant — the de Rham cohomology.
3
See Exercise 5.7.1.

Figure 5.1. Figure for Lemma 5.2.7: the rectangle with vertices (0, 0),
(a, 0), (a, b), and (0, b), with sides labeled γ1 (bottom), γ2 (right),
γ3 (top), and γ4 (left).

The form α is closed if and only if the matrix representing Dα is
symmetric, which gives the required condition
\[ \frac{\partial u_i}{\partial x_j}(y) = \frac{\partial u_j}{\partial x_i}(y), \]
which must be valid at all points y ∈ Ω.
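The coordinate criterion of Proposition 5.2.6 lends itself to a quick numerical spot check. The sketch below uses central differences at a few sample points; the helper names, the step size, and the two sample forms are assumptions made for illustration, and agreement at finitely many points is of course only evidence of closedness, not a proof.

```python
def partial(u, point, index, h=1e-5):
    """Central-difference approximation to the partial derivative of u
    with respect to coordinate number `index` at `point`."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (u(*plus) - u(*minus)) / (2 * h)

def closedness_defect(coefficients, point):
    """Largest value of |d u_i / d x_j - d u_j / d x_i| at `point`,
    where coefficients = [u_1, ..., u_n]."""
    n = len(coefficients)
    return max(
        abs(partial(coefficients[i], point, j) - partial(coefficients[j], point, i))
        for i in range(n)
        for j in range(n)
    )

# 2xy dx + x^2 dy = d(x^2 y): exact, hence closed.
exact_form = [lambda x, y: 2 * x * y, lambda x, y: x * x]
# -y dx + x dy: d u_1 / d y = -1 but d u_2 / d x = +1, so not closed.
rotation_form = [lambda x, y: -y, lambda x, y: x]

sample_points = [(0.3, -1.2), (1.0, 2.0), (-0.7, 0.4)]
exact_defects = [closedness_defect(exact_form, p) for p in sample_points]
rotation_defects = [closedness_defect(rotation_form, p) for p in sample_points]
print(exact_defects, rotation_defects)
```

The defects for the first form should be at the level of rounding error, while those for the second form should be close to 2 at every sample point.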

The next lemma gives the crucial step in understanding the rela-
tionship between closed and exact forms.
Lemma 5.2.7. Let γ be a piecewise smooth path traversing the boundary
of a rectangle R, whose sides are parallel to the coordinate axes.
Let α be a closed 1-form defined on R. Then $\int_\gamma \alpha = 0$.

Proof. There is no loss of generality in assuming that the successive
vertices of the rectangle are at (0, 0), (a, 0), (a, b), and (0, b). (See
Figure 5.1.) Write γ1, γ2, γ3, and γ4 for paths traversing the successive
sides of the rectangle, and write α = g dx + h dy where g and
h are functions of x and y. The assumption that α is closed means
that g2 = h1, where the subscripts 1 (or 2) denote partial derivative
in the first (or second) variable.

We want to evaluate $\int_\gamma \alpha$, which is equal to the sum of the
integrals along the four sides of the rectangle, $\sum_{k=1}^{4} \int_{\gamma_k} \alpha$. Let’s look at
these terms individually.

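Parameterizing the path γ1 in the obvious way (note the implicit use
of Proposition 5.1.16) gives $\int_{\gamma_1} \alpha = \int_{x=0}^{a} g(x, 0)\,dx$. Similarly we may
parameterize γ3 to get $\int_{\gamma_3} \alpha = -\int_{x=0}^{a} g(x, b)\,dx$. Thus
\[ \int_{\gamma_1} \alpha + \int_{\gamma_3} \alpha = -\int_{x=0}^{a} \bigl(g(x, b) - g(x, 0)\bigr)\,dx = -\int_{x=0}^{a} \int_{y=0}^{b} g_2(x, y)\,dy\,dx, \]
using the fundamental theorem of calculus. On the other hand, similar
reasoning with the vertical sides of the rectangle gives
\[ \int_{\gamma_2} \alpha + \int_{\gamma_4} \alpha = \int_{y=0}^{b} \bigl(h(a, y) - h(0, y)\bigr)\,dy = \int_{y=0}^{b} \int_{x=0}^{a} h_1(x, y)\,dx\,dy. \]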

Combining both equations and changing the order of integration, we
get
\[ \int_\gamma \alpha = -\int_{x=0}^{a} \int_{y=0}^{b} \bigl(g_2(x, y) - h_1(x, y)\bigr)\,dy\,dx, \]
and this vanishes because α is closed.

Theorem 5.2.8. Let Ω be an open subset of a vector space V and let
α be a closed 1-form defined on Ω. Let γ be a smooth (or piecewise
smooth) loop in Ω. If γ is homotopic to a constant loop, then
\[ \int_\gamma \alpha = 0. \]
More generally, for an arbitrary loop γ, the integral $\int_\gamma \alpha$ depends only
on the homotopy class of γ in the space Maps(S¹; Ω).

Proof. Let h be a homotopy of smooth loops in Ω between γ0 and

γ1 . Thus, h is a map (which may be assumed to be smooth; see
Remark 3.4.1) from [0, 1]×[0, 1] to Ω, with h(0, t) = γ0 (t) and h(1, t) =
γ1 (t), and moreover h(s, 0) = h(s, 1) for all s (this expresses that h
is a homotopy of loops). Let us apply Lemma 5.2.7 to the form h∗ α,
which is closed because of Proposition 5.2.4 and is deﬁned on the
rectangle [0, 1] × [0, 1] ⊆ R². The lemma tells us that the sum of the
integrals of h∗α along the four sides of the rectangle is zero. But, by
definition, the integrals along the vertical sides of the rectangle are
\[ \int_{\gamma_1} \alpha \quad\text{and}\quad -\int_{\gamma_0} \alpha, \]
and the integrals along the horizontal sides cancel because of the loop
property h(s, 0) = h(s, 1). Thus we find that
\[ \int_{\gamma_1} \alpha = \int_{\gamma_0} \alpha, \]
as required. (If the initial loop γ is only piecewise smooth, we first
round off the corners to get a homotopic smooth loop without changing
the integral by more than some prescribed ε; then we apply
the above argument to the smooth approximation and finally let
ε → 0.)

5.3. The winding number via integration

We now focus our attention on the 2-dimensional case — the complex
plane. Let f be a smooth function from an open subset Ω of C to C
itself.
Definition 5.3.1. The smooth function f : Ω → C is holomorphic if
it is complex differentiable at each point z ∈ Ω; in other words, if the
limit
\[ f'(z) = \lim_{\substack{h \to 0 \\ h \in \mathbb{C}}} \frac{f(z + h) - f(z)}{h} \]
exists (in C) for each z ∈ Ω.

The usual arguments of elementary analysis apply to this definition
and show that the sum, product, quotient, and composite (where
defined) of holomorphic functions are holomorphic.
Remark 5.3.2. Suppose we write f = u + iv, z = x + iy and regard
u, v as functions of the two real variables x, y. Then, in the language
of Appendix E, to say that f is holomorphic is to say that at each
point z of Ω the real derivative
\[ Df = \begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix} \]
is actually a complex linear map C → C: namely, the map given by
multiplication by the complex number f′(z) = a + ib ∈ C. The 2 × 2
matrix of this multiplication map is obtained by considering the effect
of multiplication by a + ib on the basis vectors 1 and i. It is
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \]
Comparison with the matrix of partial derivatives above shows that
a holomorphic map satisfies the Cauchy-Riemann equations
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}, \]
which we have already encountered in Section 4.4.
Most complex analysis courses will offer a proof that, under some
smoothness assumptions, the Cauchy-Riemann equations give a suffi-
cient as well as a necessary condition for a function to be holomorphic.
However, we will not need this fact.
Lemma 5.3.3. If the smooth function f is holomorphic, then the
1-form
f (z) dz
is closed.

Proof. The form f(z) dz is a complex 1-form (see Remark 5.1.4),
which we may express in terms of real and imaginary parts as
\[ f(z)\,dz = (u + iv)(dx + i\,dy) = (u + iv)\,dx + (-v + iu)\,dy, \]
where f = u + iv. The condition for this form to be closed is then
\[ \frac{\partial}{\partial y}(u + iv) = \frac{\partial}{\partial x}(-v + iu), \]
and taking real and imaginary parts gives
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}, \]
which are the Cauchy-Riemann equations.

There is a close relation between holomorphic functions, integra-

tion along paths, and winding numbers. We have developed all the
ingredients that we need to state and prove the fundamental result.

Theorem 5.3.4. Let p be a point of C, and let γ be a piecewise
smooth loop in C \ {p}. Then
\[ \operatorname{wn}(\gamma, p) = \frac{1}{2\pi i} \int_\gamma \frac{1}{z - p}\,dz. \]

Remark 5.3.5. The integral here is of the general shape $\int_\gamma f(z)\,dz$,
where f is a complex-valued function. The “Calculus I” formula for
such an integral (following (5.1.13)) is
\[ \int_\gamma f(z)\,dz = \int_0^1 f(\gamma(t))\,\gamma'(t)\,dt, \]
where, on the right-hand side, the real and imaginary parts of the
complex expression f(γ(t))γ′(t) are integrated independently.

Proof. There is no loss of generality in taking p = 0, and we shall
do that. Our proof will use a three-stage process for identifying the
winding number, which is a “standard operating procedure” that we
will see again in Section 7.4. Considering the right-hand side as a
function $\nu(\gamma) = (2\pi i)^{-1} \int_\gamma z^{-1}\,dz$ of the loop γ, we’ll prove (a) that
ν is multiplicative (that is, ν(γ1γ2) = ν(γ1) + ν(γ2), where γ1γ2 is the
pointwise product of the loops) and (b) that it
is homotopy invariant. In light of Theorem 3.2.7 this shows that ν(γ)
is some multiple of the winding number of γ, and we’ll check which
multiple it is by (c) computing a single example.
For multiplicativity (a), note the following consequence of the
product rule (Proposition 5.1.3):
\[ \frac{1}{fg}\,d(fg) = \frac{1}{f}\,df + \frac{1}{g}\,dg. \tag{5.3.6} \]
We’re going to apply this as follows: let V = C ⊕ C, a 4-dimensional
vector space over R, and let f and g be defined on V by f (z1 , z2 ) = z1
and g(z1 , z2 ) = z2 . Define a path γ in V by γ(t) = (γ1 (t), γ2 (t)). Then
it is easy to check (either by using the definition of path integral
in terms of pullbacks or more prosaically by using the “Calculus I”
formula of Remark 5.3.5) that

\[ \nu(\gamma_1) = \frac{1}{2\pi i} \int_\gamma \frac{1}{f}\,df, \qquad \nu(\gamma_2) = \frac{1}{2\pi i} \int_\gamma \frac{1}{g}\,dg, \]
and
\[ \nu(\gamma_1\gamma_2) = \frac{1}{2\pi i} \int_\gamma \frac{1}{fg}\,d(fg). \]
Thus the desired additivity follows from (5.3.6).
For homotopy (b), consider the integrand (z − p)⁻¹ dz on the region
Ω = C \ {p}. Since p ∉ Ω, the function z ↦ (z − p)⁻¹ is holomorphic
on Ω. Thus, by Lemma 5.3.3, the 1-form (z − p)⁻¹ dz is closed. The
desired homotopy invariance follows from Theorem 5.2.8.
As noted above, properties (a) and (b) show that ν is a multiple
of the winding number. To show that this multiple is 1 and thus that
ν is equal to the winding number, we compute a single example (c).
Consider the standard path γ(t) = e^{2πit} with winding number 1. For
this path
\[ \nu(\gamma) = \frac{1}{2\pi i} \int_0^1 \frac{\gamma'(t)}{\gamma(t)}\,dt = \frac{1}{2\pi i} \int_0^1 2\pi i\,dt = 1, \]
which agrees with the winding number. This completes the proof.
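The integral formula of Theorem 5.3.4, combined with the “Calculus I” formula of Remark 5.3.5, gives a direct way to approximate a winding number numerically. The following Python sketch does this with the midpoint rule; the function name `winding_number_integral`, the sample loop e^{4πit}, and the two base points are assumptions chosen for illustration.

```python
import cmath
import math

def winding_number_integral(gamma, dgamma, p, n=4000):
    """Midpoint-rule approximation to
    (1 / (2 pi i)) * integral over [0, 1] of gamma'(t) / (gamma(t) - p) dt."""
    h = 1.0 / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h
        total += dgamma(t) / (gamma(t) - p) * h
    return total / (2j * math.pi)

# Sample loop: gamma(t) = exp(4 pi i t), the unit circle traversed twice.
def gamma(t):
    return cmath.exp(4j * math.pi * t)

def dgamma(t):
    return 4j * math.pi * cmath.exp(4j * math.pi * t)

inside = winding_number_integral(gamma, dgamma, 0.5 + 0j)   # p inside the circle
outside = winding_number_integral(gamma, dgamma, 2.0 + 0j)  # p outside the circle
print(inside, outside)
```

The computed values are complex numbers that should be very close to the integers 2 and 0 respectively; rounding the real part recovers the winding number, provided the loop stays well away from p.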
Remark 5.3.7. Notice that this theorem provides us with a fundamental
example of a closed 1-form (on the non-simply-connected
region C \ {p}) that is not exact. Indeed, (z − p)⁻¹ dz is closed as
we observed above; but if it were exact, then its integral around any
loop would vanish, by the fundamental theorem of calculus
(Proposition 5.1.14).

5.4. Homology
As we further develop the theory of integration along paths, we are
going to run across situations where we need to employ the following
device. Let γ be a piecewise smooth loop in some region Ω ⊆ C. It
may be possible to express γ as the concatenation of ﬁnitely many
paths, say γ1, . . . , γn. Then, of course, we will have
\[ \int_\gamma \alpha = \sum_{k=1}^{n} \int_{\gamma_k} \alpha \]
for any 1-form α (in fact, this is how we defined the integral along
a piecewise smooth path; see (5.1.17)). Now it is quite possible that
concatenating the paths γk in a different order, such as γσ(1), . . . , γσ(n)
where σ is a permutation of {1, . . . , n}, might yield an entirely different
loop δ. In this case we will have
\[ \int_\delta \alpha = \sum_{k=1}^{n} \int_{\gamma_{\sigma(k)}} \alpha = \sum_{k=1}^{n} \int_{\gamma_k} \alpha = \int_\gamma \alpha, \]

even though the loops γ and δ may be quite diﬀerent from the point
of view of the topological tools we have developed thus far — for
instance, they need not be homotopic (Exercise 5.7.15).
It is convenient to formalize this idea by means of the notions
of chains, cycles, and homology. This machinery is called homology
theory and is one of many things described in this book whose de-
velopments and generalizations are vital tools in higher-dimensional
topology.

Definition 5.4.1. Let Ω be an open subset of a vector space V.
A 1-chain in Ω is a finite formal linear combination, with integer
coefficients, of smooth paths in Ω. In other words, it is a formal
expression
\[ \sum_{k=1}^{n} m_k [\gamma_k], \]
where the γk are smooth paths in Ω and the mk are integers.

Chains will usually be denoted by upper case Greek letters like

Γ. By introducing the integer coefficients mk , we have made the
collection of chains into an abelian group: two chains can be added
componentwise, and this addition operation is commutative. (Those
who are familiar with the terminology will recognize that the group
of chains is in fact the free abelian group generated by the smooth
paths in Ω.) When mk = 1, we usually omit it and just write the
corresponding term as [γk ], and similarly when mk = −1, we write
−[γk ]. A term with a 0 coefficient can be omitted. The coefficient
mk may be referred to as the multiplicity of γk in the given chain.
Finally, we define the image Γ∗ of Γ to be the union of the images γk∗ .
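Important examples arise from piecewise smooth paths: if γ is
such a path, made up by concatenating smooth subarcs γk, then we
may associate to it the chain $\Gamma = \sum_k [\gamma_k]$. The following definition
then generalizes (5.1.17).

Definition 5.4.2. Let $\Gamma = \sum_{k=1}^{n} m_k [\gamma_k]$ be a 1-chain in Ω and let α
be a 1-form on Ω. The integral of α along Γ is defined by
\[ \int_\Gamma \alpha = \sum_{k=1}^{n} m_k \int_{\gamma_k} \alpha. \]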

An equivalence relation ∼ on chains (we just call it “equivalence”)
is generated⁵ by the following three rules.
(a) If γ is the concatenation of γ1 and γ2, then [γ] ∼ [γ1] + [γ2].
(b) If γ is the reverse⁶ of γ′ (that is, γ′ traversed in the backwards
direction), then [γ] ∼ −[γ′].
(c) If γ1 is a reparameterization⁷ of γ2, then [γ1] ∼ [γ2].
Notice that this equivalence relation respects the operation of
integration along chains: if Γ1 is equivalent to Γ2, then
\[ \int_{\Gamma_1} \alpha = \int_{\Gamma_2} \alpha \]
for all α defined in a neighborhood of Γ1∗ ∪ Γ2∗. Notice also that a
constant chain is equivalent to 0 (by combining (a) and (c) above).
Definition 5.4.3. Suppose that γ is a piecewise smooth loop, made
up by concatenating smooth subarcs γk. The associated chain
$\Gamma = \sum_k [\gamma_k]$ is called a simple cycle. More generally, a cycle is any chain
which is equivalent to a finite linear combination of simple cycles.
In other words, a chain is a cycle if its component arcs (counted
according to multiplicity) can be subdivided, rearranged, reoriented,
and concatenated to yield finitely many piecewise smooth loops (again
with associated multiplicities).

Lemma 5.4.4. If Γ is a cycle, then $\int_\Gamma \alpha = 0$ for any exact 1-form α.

Proof. This follows from the fundamental theorem of calculus for
piecewise smooth loops: if γ is such a loop, then
\[ \int_\gamma df = f(\gamma(1)) - f(\gamma(0)) = 0. \]
γ

5
See Remark G.1.4.
6
See Proposition 2.1.2.
7
See Example 2.2.3.

Thus the result holds for simple cycles, and it extends to all cycles by
linearity.

To deﬁne the key notion of homology, we need to introduce a

special class of cycles, called simple boundaries. Let R be a rectangle
in R2 , with vertices a, b, c, d. We can define a piecewise smooth loop
∂R (the boundary of R) in R2 by following the boundary edges ab,
bc, cd, and da, in that order and at constant speed. Now suppose
that σ : R → Ω is a smooth map. By composing σ with ∂R we obtain
a piecewise smooth loop in Ω and hence a cycle which we denote
∂(R, σ) or just ∂R if the map σ is obvious from the context. Such a
cycle is also called a simple boundary in Ω.
By analogy with the previous definition we now say
Definition 5.4.5. A boundary in Ω is any cycle which is equivalent
to a finite linear combination of simple boundaries. Two chains (or
cycles) are said to be homologous in Ω if their difference is a boundary
in Ω. Because of this, a boundary may also be described as null-
homologous.
Example 5.4.6. Two paths that are homotopic (keeping endpoints
fixed) in Ω are homologous in Ω. (Indeed, a homotopy is simply a
map from the rectangle [0, 1] × [0, 1] to Ω.)

Compare the next result with Lemma 5.4.4.

Lemma 5.4.7. Suppose that Γ is a boundary in Ω. Then $\int_\Gamma \alpha = 0$
for any closed 1-form α on Ω.

Proof. It suffices to consider the case of a simple boundary, Γ =
∂(R, σ). Then
\[ \int_\Gamma \alpha = \int_{\partial R} \sigma^*\alpha = \int_{\partial R} \beta, \]
where β is the closed (Proposition 5.2.4) form σ∗α defined on R. The
result then follows from Lemma 5.2.7.

Thus the diﬀerence between boundaries and cycles is closely re-

lated to the diﬀerence between closed forms and exact forms.
This result completes our development of the general language
of homology. In order to apply it to problems in plane topology,

however, we need one more thing: an eﬀective criterion that tells

us when a given cycle is actually a boundary. The proof of such a
criterion occupies the remainder of this section.

Remark 5.4.8. Let Γ be a cycle in C, and let p ∈ C be a point not
belonging to Γ∗. Then we can define the winding number of Γ around
p in the natural way: if $\Gamma = \sum_j m_j \Gamma_j$, where Γj is the simple cycle
associated to the piecewise smooth loop γj, then put
\[ \operatorname{wn}(\Gamma, p) = \sum_j m_j \operatorname{wn}(\gamma_j, p). \]
The integral formula for the winding number, Theorem 5.3.4, extends
by linearity to give
\[ \operatorname{wn}(\Gamma, p) = \frac{1}{2\pi i} \int_\Gamma \frac{1}{z - p}\,dz. \]
Theorem 5.4.9 (Artin’s criterion). Let Ω ⊆ C be an open subset,
and let Γ be a cycle in Ω. Then Γ is a boundary in Ω if and only if
wn(Γ, p) = 0 for all p ∈ C \ Ω. Consequently, two cycles in Ω are
homologous in Ω if and only if they have the same winding numbers
around all p ∈ C \ Ω.

In order to present the proof of Artin’s criterion, let’s introduce

some terminology. A finite collection of horizontal and vertical lines
in C will be called a grid. The grid lines of a given grid G subdivide
one another into line segments or edges (some finite and some not),
and a cycle which is equivalent to a finite linear combination of finite
edges for G will be called a grid cycle for G. Such a cycle is completely
determined, up to equivalence, by the multiplicities⁸ that it assigns
to each of the finite edges of the grid G; these will be called the grid
multiplicities of the given grid cycle.
The complement of the grid G in C is a disjoint union of open
rectangles (some finite and some not), and a given grid cycle will have
a well-defined winding number around each of these complementary
rectangles; let us call the collection of these numbers the grid winding
numbers of the grid cycle in question. We use the notation wn(Γ; R)

8
For deﬁniteness, let us say that each edge is oriented in the direction of increasing
the appropriate coordinate, x for horizontal edges and y for vertical ones.

Figure 5.2. A grid, a grid cycle, and some grid winding numbers
(regions labeled +1 and −1).

where Γ is a grid cycle and R a complementary rectangle, and we notice
that wn(Γ; R) = 0 if the rectangle R is not finite. See Figure 5.2.

Lemma 5.4.10. Every cycle in an open subset Ω ⊆ C is homologous

to a grid cycle (for some grid G).

Proof. It suffices to consider a piecewise smooth loop γ. Since γ*
is a compact subset of Ω, there is ε > 0 such that any disc of radius
ε centered on a point of γ is contained in Ω. Since γ is uniformly
continuous (Proposition B.3.19), we can find δ > 0 such that if
|s − t| < δ, then |γ(s) − γ(t)| < ε. Subdivide the parameter interval
[0, 1] into finitely many subintervals [t_j, t_{j+1}], j = 0, …, m − 1,
such that |t_j − t_{j+1}| < δ for each j, and denote by γ_j the path
given by restricting γ to [t_j, t_{j+1}]. Then [γ] is equivalent to the
cycle [γ_0] + ··· + [γ_{m−1}]. Put γ(t_j) = x_j + iy_j, and let h_j be
the horizontal line segment from x_j + iy_j to x_{j+1} + iy_j, and v_j
the vertical line segment from x_{j+1} + iy_j to x_{j+1} + iy_{j+1}.
Then γ_j is homotopic (with endpoints fixed) to the concatenation of
h_j and v_j, by a linear homotopy lying inside the disc
B(γ(t_j); ε) ⊆ Ω. Consequently, [γ] is homologous in Ω to

\[ \sum_{j=0}^{m-1} \bigl( [h_j] + [v_j] \bigr), \]

which is a grid cycle.
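The staircase construction in this proof can be tried out numerically. The sketch below is an illustration under stated assumptions: the loop is sampled at m equally spaced parameter values, and the helper names `staircase` and `polygon_winding` are hypothetical. It builds the horizontal and vertical segments h_j, v_j and checks that the resulting polygon has the same winding number as the original loop.

```python
import cmath

def staircase(gamma, m):
    """Vertices of the closed polygon h_0, v_0, h_1, v_1, ... from the proof."""
    pts = [gamma(j / m) for j in range(m + 1)]
    verts = []
    for a, b in zip(pts, pts[1:]):
        verts.append(a)                        # start of the horizontal segment h_j
        verts.append(complex(b.real, a.imag))  # corner where h_j meets v_j
    verts.append(pts[-1])                      # end of the last vertical segment
    return verts

def polygon_winding(verts, p):
    """Winding number about p of a closed polygon, by summing turning angles.

    Assumes no single edge subtends an angle of pi or more as seen from p.
    """
    total = 0.0
    for a, b in zip(verts, verts[1:]):
        total += cmath.phase((b - p) / (a - p))
    return round(total / (2 * cmath.pi))

def unit_circle(t):
    return cmath.exp(2j * cmath.pi * t)
```

With m = 64 the staircase stays within a small distance of the unit circle, so its winding numbers about 0 and about 2 agree with those of the circle itself.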

Lemma 5.4.11. A grid cycle all of whose grid winding numbers are
0 is equivalent to 0.

Proof. By Lemma 3.3.2, the multiplicity of an edge e in a grid cycle

is equal to the diﬀerence of the grid winding numbers on the two sides
of e. So, if all grid winding numbers are 0, then all multiplicities are
0 also.

Proof of Artin’s criterion (Theorem 5.4.9). Suppose that Γ is

a boundary. Then Lemma 5.4.7 shows that the winding numbers

\[ \operatorname{wn}(\Gamma, p) = \frac{1}{2\pi i} \int_{\Gamma} \frac{1}{z - p}\, dz \]

must vanish for p ∉ Ω since the integrand is a closed form by
Lemma 5.3.3.
Suppose conversely that these winding numbers all vanish. By
Lemma 5.4.10, there is no loss of generality in assuming that Γ is a
grid cycle based on some grid G.
Define a new grid cycle

\[ \Gamma' = \sum_{R} \operatorname{wn}(\Gamma; R)\,[\partial R], \tag{5.4.12} \]

where the sum is extended over all finite complementary rectangles
R to G. Notice that if R′ is a fixed finite complementary rectangle,
then

\[ \operatorname{wn}([\partial R]; R') = \begin{cases} 1 & (R = R'), \\ 0 & (R \neq R'). \end{cases} \]

Consequently, wn(Γ; R) = wn(Γ′; R) for all R, and therefore, according
to Lemma 5.4.11, Γ − Γ′ is equivalent to 0. In particular, Γ and
Γ′ are homologous.
It remains to show that Γ′ is a boundary in Ω, and this will
follow if we can show that every R appearing with nonzero coefficient
in (5.4.12) has closure contained in Ω. Indeed, suppose the closure of
R contains a point p ∉ Ω. Then wn(Γ; p) = 0 by hypothesis. There is a
ball B(p; ε) that does not meet Γ*, and thus wn(Γ; q) = 0 for all q in
this ball. Some of the points q are in the interior of R, so
wn(Γ; R) = 0.
This completes the proof.
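The bookkeeping behind (5.4.12) and Lemma 5.4.11 can be sketched in a few lines. This is an illustrative model only, assuming that grid cells are indexed by integer pairs (i, j) and that edges are oriented toward increasing x or y as in footnote 8; the function name `boundary_sum` and the edge labels are hypothetical.

```python
from collections import defaultdict

def boundary_sum(w):
    """Edge multiplicities of the grid cycle: sum over R of w[R] * [boundary of R].

    w maps a cell (i, j) to its integer grid winding number. The cell lies
    between vertical grid lines i, i+1 and horizontal grid lines j, j+1.
    Result keys are ('h', i, j) or ('v', i, j): the edge starting at grid
    point (i, j), oriented toward increasing x or y respectively.
    """
    mult = defaultdict(int)
    for (i, j), n in w.items():
        mult[('h', i, j)] += n        # bottom edge, traversed left to right
        mult[('v', i + 1, j)] += n    # right edge, traversed upward
        mult[('h', i, j + 1)] -= n    # top edge, traversed right to left
        mult[('v', i, j)] -= n        # left edge, traversed downward
    return {e: m for e, m in mult.items() if m != 0}
```

In this model the multiplicity of an upward edge is the winding number of the cell on its left minus that of the cell on its right, which is the relation used in the proof of Lemma 5.4.11. In particular, an edge shared by two cells with equal winding numbers drops out of the cycle.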

This proof is adapted from Ahlfors’ book [2] (which is also the
source of the attribution to Artin).

5.5. Cauchy’s theorem

The results of the previous section combine to prove the most general
form of a key result of complex analysis, which substantially extends
the homotopy argument used in part (b) of the proof of Theorem 5.3.4.
Theorem 5.5.1 (Cauchy’s theorem). Let Ω be an open subset of
C, and let Γ be a cycle in Ω which has the property that wn(Γ; p) = 0
for all p ∉ Ω. Then

\[ \int_{\Gamma} f(z)\, dz = 0 \]

for every holomorphic function f : Ω → C.

Proof. By Artin’s criterion (Theorem 5.4.9), Γ is a boundary. By

Lemma 5.3.3, the 1-form f (z) dz is closed. Combining these two facts
and applying Lemma 5.4.7, we obtain the result.
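A small numerical experiment illustrates the role of the hypothesis. The sketch below assumes Ω = C \ {0}, the function f(z) = e^z/z, and the cycle Γ = [circle of radius 2] − [circle of radius 1], all chosen for illustration only; the helper names are hypothetical. Each circle alone has winding number 1 about the missing point 0, but the cycle Γ has winding number 0 there, so the theorem applies to Γ.

```python
import cmath

def loop_integral(f, gamma, n=4000):
    """Approximate the integral of f(z) dz around the closed loop gamma."""
    total = 0j
    for k in range(n):
        z0, z1 = gamma(k / n), gamma((k + 1) / n)
        total += f(0.5 * (z0 + z1)) * (z1 - z0)   # midpoint rule on each chord
    return total

def circle(r):
    return lambda t: r * cmath.exp(2j * cmath.pi * t)

def f(z):
    return cmath.exp(z) / z          # holomorphic on C minus {0}

outer = loop_integral(f, circle(2.0))
inner = loop_integral(f, circle(1.0))
cycle_integral = outer - inner       # integral over the cycle [outer] - [inner]
```

Here `outer` and `inner` are each close to 2πi, while `cycle_integral` is close to 0, as the theorem predicts for a cycle whose winding number about every point outside Ω vanishes.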

This theorem is used throughout complex analysis. We’ll only

give one example here (see Exercise 5.7.14 for another). In Section 4.4,
discussing generalizations of the Jordan curve theorem, we promised
to prove that any nowhere-zero holomorphic function deﬁned on a
Jordan domain has a holomorphic square root there. We can now
redeem that promise (indeed, we can do a little better).
Proposition 5.5.2. Let U be a connected open subset of C having
the property⁹ that for every loop γ in U and every p ∈ C \ U, the
winding number wn(γ; p) equals 0. Then every holomorphic function
f : U → C \ {0} lifts (holomorphically) through the exponential map;
i.e., there exists a holomorphic g : U → C such that e^g = f.

⁹ By Exercise 4.5.8 every Jordan domain has this property.

The existence of holomorphic square roots (which is what was
needed for the Riemann mapping theorem) certainly follows from
this: the function e^{g/2} gives such a square root.

Proof. Let z_0 be some point of U, let ℓ_0 be a complex number such
that exp(ℓ_0) = f(z_0), and define g(z) for z ∈ U by

\[ g(z) = \ell_0 + \int_{z_0}^{z} \frac{f'(z)}{f(z)}\, dz, \]

where the integral is taken along an arbitrary path in U from z_0 to
z. (Cauchy’s theorem (Theorem 5.5.1) is invoked at this point: the
integrand f′(z)/f(z) is holomorphic in U, so if γ_1 and γ_2 are two
paths from z_0 to z,

\[ \int_{\gamma_1} \frac{f'(z)}{f(z)}\, dz - \int_{\gamma_2} \frac{f'(z)}{f(z)}\, dz = 0 \]

because this is the integral of a holomorphic function around a cycle
in U obtained by traversing γ_1 followed by the reverse of γ_2.) By
differentiating the integral we find g′(z) = f′(z)/f(z) and therefore

\[ \frac{d}{dz}\Bigl( e^{-g(z)} f(z) \Bigr) = e^{-g(z)} \bigl( -g'(z) f(z) + f'(z) \bigr) = 0. \]

Thus e^{g(z)} is a constant multiple of f(z) and evaluating at z = z_0 we
see that the constant is 1.
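The construction in this proof can be imitated numerically. The following sketch makes several assumptions for the sake of illustration: U is the open unit disc, f(z) = z² + 4 (whose zeros ±2i lie outside U), the base point is 0, and paths are polygonal. The names `integrate_path` and `g` are hypothetical helpers.

```python
import cmath

def f(z):
    return z * z + 4        # nowhere zero on the unit disc

def df(z):
    return 2 * z            # derivative of f

def integrate_path(points, n=2000):
    """Integral of f'/f along the polygonal path through the given points."""
    total = 0j
    for a, b in zip(points, points[1:]):
        for k in range(n):
            w = a + (b - a) * (k + 0.5) / n      # midpoint of the k-th piece
            total += df(w) / f(w) * (b - a) / n
    return total

z0 = 0j
l0 = cmath.log(f(z0))       # any l0 with exp(l0) = f(z0) would do

def g(z, via=None):
    """g(z) = l0 + integral of f'/f from z0 to z, optionally through `via`."""
    path = [z0, z] if via is None else [z0, via, z]
    return l0 + integrate_path(path)
```

For a point such as z = 0.5 + 0.6i, the value exp(g(z)) should agree with f(z) up to discretization error, and routing the path through another point of the disc (for instance 0.5) should not change g(z), which is the path independence supplied by Cauchy's theorem.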

5.6. A glimpse at higher dimensions

The underlying message of this chapter is that differential 1-forms are
useful in studying the “winding around” aspect of topology: simple-
connectedness, winding numbers, and so on. The reason for this is
that the difference between the properties “closed” and “exact” for
1-forms on an open set U in a vector space V reflects the lack of
simple-connectedness of U (see Exercise 5.7.10).
Now, as observed in Remark 5.2.5, the “difference between closed
and exact” is measured by the quotient vector space
\[ H^1(U) := \frac{Z^1(U)}{B^1(U)} = \frac{\text{Closed 1-forms on } U}{\text{Exact 1-forms on } U}, \]

and one may ask whether this H^1 is the beginning of a sequence of
quotient spaces that express “higher-dimensional topological
structures” of U. It turns out that this is indeed the case. When V is
n-dimensional, there is a sequence of vector spaces and maps d between
them, i.e.,

\[ \Omega^0(U) \xrightarrow{\;d\;} \Omega^1(U) \xrightarrow{\;d\;} \Omega^2(U) \xrightarrow{\;d\;} \cdots \xrightarrow{\;d\;} \Omega^n(U), \]

called the de Rham complex of U. Here Ω^0(U) is the space of smooth
functions on U, Ω^1(U) is the space of 1-forms, and the first d is
the one that we have already introduced. The subsequent Ω^p are
more complicated “multilinear” objects: for instance, Ω^2(U) can be
regarded as the space of functions which assign, to each point of U,
a skew-symmetric bilinear functional on V × V.
In the sequence above it turns out that the composite of any
two d operators is zero, or to put it another way, the kernel of each
d contains the image of the preceding one. This is a generalization
of the fact that exact 1-forms are closed (Proposition 5.2.3) and its
proof, like the proof in the 1-form case, relies on Clairaut’s theorem
(Proposition E.3.4). Thus the definition
\[ H^p(U) := \frac{\text{Kernel of } d\colon \Omega^p(U) \to \Omega^{p+1}(U)}{\text{Image of } d\colon \Omega^{p-1}(U) \to \Omega^p(U)} \]

generalizes our H^1(U) above. These H^p are called the de Rham
cohomology spaces of U and they are a vital tool in higher-dimensional
topology.
One way to bring the above abstractions down to earth is to consider
the case when V = R^3. In this situation we can avail ourselves
of a useful coincidence: the number of independent components of a
skew-symmetric 3 × 3 matrix

\[ \begin{pmatrix} 0 & a & -b \\ -a & 0 & c \\ b & -c & 0 \end{pmatrix}, \]

namely 3, is the same as the number of components of a vector;¹⁰
and therefore, 2-forms can be identified with vector fields in this
case. When we make this identification, the three d operators in
the de Rham complex become the familiar operators grad, curl, and
div of classical vector calculus, and the relations between kernel and
image are also classical: curl grad ϕ = 0 and div curl V = 0.

¹⁰ This coincidence is also responsible for the existence of the
classical vector product, another uniquely 3-dimensional construction.
Remark 5.6.1. From Lemmas 5.4.4 and 5.4.7, it follows that inte-
gration gives a pairing
\[ H^1(U) \times H_1(U) \to \mathbb{R}, \]
where the first homology group H_1(U) is defined to be the quotient of
the cycles by the boundaries. This duality structure also generalizes
to higher dimensions and relates the analytically defined de Rham
spaces to the combinatorics of cycles and boundaries.

For much more about diﬀerential forms and cohomology, I rec-

ommend the book [26].

5.7. Exercises
Exercise 5.7.1. Suppose that V = R^n and that x_1, …, x_n : V → R
are the standard coordinate functions. Show that for any point x,
the elements dx_1(x), …, dx_n(x) ∈ V* form a basis for V*. Deduce
that any 1-form α on V can be written α = f_1 dx_1 + ··· + f_n dx_n
for suitably chosen smooth functions f_i : V → R.
Exercise 5.7.2. Suppose that V = R^n with coordinate functions
x_1, …, x_n : V → R as in the previous exercise. Let f : V → R be a
smooth function. Show that

\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, dx_i. \]

(Hint: Fix an (n − 1)-tuple of coordinates, say (x_2, …, x_n), and
let i_1 : R → V be defined by sending x ∈ R to the column vector
(x, x_2, …, x_n)^T, where the superscript T denotes transpose. Compute
the derivative of f ∘ i_1. Using the chain rule, identify this
derivative with the coefficient of dx_1 in df. Repeat for the other
coordinate subscripts 2, 3, ….)
Exercise 5.7.3. Let V = R^2 with coordinate functions x and y.
Show that the 1-form x dy is not the gradient of any function.

Exercise 5.7.4. Let U1 , U2 be open subsets of ﬁnite-dimensional vec-

tor spaces V1 , V2 , respectively. A diffeomorphism f : U1 → U2 is a
smooth bijection whose inverse is also smooth.
Show that if such a diffeomorphism exists, then dim V1 = dim V2 .
(Hint: Use the chain rule to show that the derivative of a diffeomor-
phism (at any point) must be an invertible linear map.)
This easy proof that diffeomorphisms preserve dimension should
be contrasted with the difficulty of proving that homeomorphisms do
so (compare Exercise 4.5.9), a difficulty which ultimately arises from
nonsmooth examples such as the Peano curve (Example 2.1.6).

Exercise 5.7.5. Show that a smooth map f between open intervals
is a diffeomorphism if and only if it is onto and the derivative f′(t)
is nonzero and of constant sign (either > 0 for all t or < 0 for all
t: the first case is called orientation-preserving and the second is
called orientation-reversing). By considering the map t ↦ t^3 from
(−1, 1) to itself, show that a smooth and bijective map need not be
a diffeomorphism.

Exercise 5.7.6. If the smooth path γ is regular (see the beginning

of Section 3.4 for the deﬁnition), show that any smooth reparameter-
ization of γ is regular too.

Exercise 5.7.7. Write out a detailed proof of Proposition 5.1.16

without using the language of pullbacks and functionality, only using
the “Calculus I” deﬁnition of path integral. (This is a good way to
understand just what is “under the hood” of such language.)

Exercise 5.7.8. Verify the following properties of 1-forms:

(a) (Integration by parts for 1-forms) Let γ be a smooth loop in

a vector space V and let f, g be smooth functions deﬁned in a
neighborhood of γ*. Prove that

\[ \int_{\gamma} f\, dg = -\int_{\gamma} g\, df. \]

(b) (Chain rule for 1-forms) Let f be a smooth function of x1 , . . . , xk ,

where x1 , . . . , xk are themselves smooth functions on a vector

space V . Prove that

\[ df = \sum_{i=1}^{k} \frac{\partial f}{\partial x_i}\, dx_i. \]

Exercise 5.7.9. Let Ω be an open subset of V . A 1-form α on Ω

is said to have some property locally if, for every p ∈ Ω, there is an
open U ⊆ Ω with p ∈ U such that the restriction of α to U has that
property. Show that a locally closed form must be closed but that a
locally exact form need not be exact.

Exercise 5.7.10. Let Ω ⊆ V be a simply connected open set. Prove

that every closed 1-form α deﬁned on Ω is exact. (Hint: Fix a base
point x0 ∈ Ω and deﬁne a smooth function
\[ f(x) = \int_{x_0}^{x} \alpha \]

where the integral is taken along any path in Ω from x0 to x — use

Theorem 5.2.8 to argue that the choice of path does not matter. Show
that α = df .)

Exercise 5.7.11. Show that the first de Rham cohomology is
functorial: a smooth map ϕ : Ω′ → Ω induces a linear map
ϕ* : H^1(Ω) → H^1(Ω′). Also show that homology is functorial: ϕ induces
a homomorphism ϕ_* : H_1(Ω′) → H_1(Ω). (Notice the opposite directions
of the two maps.) Can you relate these functorial maps?

Exercise 5.7.12. An entire function is a function f : C → C that is

represented by an everywhere convergent power series

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \]

for some coefficients a_n ∈ C. Prove that

\[ a_n = \int_{\gamma} \frac{f(z)}{2\pi i\, z^{n+1}}\, dz, \]

where γ is a circular path around the origin (of any given radius).
Deduce that no nonconstant entire function can be bounded (Liouville’s
theorem: note that the example of sin x = x − x^3/3! + ··· shows
that this statement is not true if we restrict attention to real
variables only).

Figure 5.3. The loop γ for Exercise 5.7.15. (The two marked points
are −1 and 1.)

Exercise 5.7.13. We can use Liouville’s theorem to give another
proof of the fundamental theorem of algebra (see Section 3.5). Let
p be a polynomial and suppose, for a contradiction, that p does not
have any roots. Then f (z) = 1/p(z) deﬁnes an entire function. Show
that f is bounded, and therefore (by Liouville’s theorem) f , and thus
p, must be constant.
Exercise 5.7.14. Let f : Ω → C be a holomorphic function, and let
Γ be a cycle in Ω that is nullhomologous (that is, wn(Γ; p) = 0 for all
p ∉ Ω). Show that for any w ∈ C,

\[ \frac{1}{2\pi i} \int_{\Gamma} \frac{f(z)}{z - w}\, dz = \operatorname{wn}(\Gamma; w)\, f(w). \]

(Hint: Show that Γ is homologous in Ω \ {w} to the cycle Γ_ε that
travels wn(Γ; w) times around a circular loop of radius ε around w.
Apply Cauchy’s theorem to the function f(z)/(z − w) and the cycle
[Γ] − [Γ_ε]; then let ε → 0.)

Exercise 5.7.15. Let Ω = C \ {−1, 1}, and let γ be the loop in
Ω that is depicted in Figure 5.3. Show using the homology form of
Cauchy’s theorem that ∫_γ f(z) dz = 0 for any holomorphic function
f on Ω. Nevertheless, γ is not homotopic in Ω to a constant loop (we
will show this rigorously in Chapter 8), and thus a homotopy form of
Cauchy’s theorem is insufficient for a proof.

Chapter 6

Vector Fields and the Rotation Number

6.1. The rotation number

Let γ be a regular smooth loop in the plane; we recall from
Section 3.4 that regular means γ′(t) ≠ 0 for all t ∈ [0, 1]. Then for
each t, γ′(t) ∈ C \ {0} and γ′(0) = γ′(1). In other words, γ′ is itself
a loop in C \ {0}.

Definition 6.1.1. The rotation number of γ, denoted rot(γ), is the
winding number of the loop γ′ about 0.

The rotation number thus measures the number of times that the
tangent vector to γ turns around 0. That is not necessarily the same
as the number of times that γ itself turns around 0 — see Figure 6.1
for an illustrative example.
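The difference between the two numbers is easy to see numerically. The sketch below assumes a specific curve chosen for illustration, the limaçon γ(t) = 1 + w + w² with w = e^{2πit}, which has a small inner loop; it is not necessarily the curve drawn in Figure 6.1. The helper name `winding_about_zero` is hypothetical.

```python
import cmath

def winding_about_zero(loop, n=4000):
    """Winding number about 0 of a closed loop from [0, 1] to C minus {0}.

    Sums the small changes of argument between consecutive sample points.
    """
    total = 0.0
    for k in range(n):
        total += cmath.phase(loop((k + 1) / n) / loop(k / n))
    return round(total / (2 * cmath.pi))

def gamma(t):
    w = cmath.exp(2j * cmath.pi * t)
    return 1 + w + w * w                     # limacon with an inner loop

def dgamma(t):
    w = cmath.exp(2j * cmath.pi * t)
    return 2j * cmath.pi * (w + 2 * w * w)   # exact derivative of gamma

rot = winding_about_zero(dgamma)                          # rotation number
wn_outer = winding_about_zero(lambda t: gamma(t) - 2.0)   # winding about p = 2
wn_inner = winding_about_zero(lambda t: gamma(t) - 0.5)   # winding about p = 0.5
```

For this curve the rotation number is 2, while the winding number is 1 about a point such as 2 that lies inside the outer loop only, and 2 about a point such as 0.5 that lies inside the inner loop as well.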
When studying the rotation number it is often convenient to con-
sider the unit tangent vectors to γ. Recall (Remark 3.2.8) that, for
any nonzero complex number w, the symbol υ(w) denotes w/|w|, the
unit vector in the direction of w. (It is sometimes convenient to think
of υ(w) as “measuring the angle” between w and 1.) The map υ
sends C \ {0} continuously onto the unit circle S 1 , and we noted in
Remark 3.2.8 that for any loop γ in C \ {0}, the winding numbers
(around 0) of the loops γ and υ ◦ γ are the same. In particular we can
(and often will find it convenient to) think of the rotation number of
γ as the winding number of t ↦ υ(γ′(t)).

Figure 6.1. A loop with winding number +1 about 0 and rotation
number +2.

The winding number and rotation number are not the same, as
we have seen; but there are important special cases where they agree.

Definition 6.1.2. The smooth loop γ is monotonic around p ∈ C
if no ray through p is tangent to γ, that is, if the complex numbers
γ(t) − p and γ′(t) are always linearly independent over R.

Proposition 6.1.3. If the smooth loop γ is monotonic about p, then
wn(γ, p) = rot(γ).

Proof. Consider the linear homotopy

\[ h(s, t) = (1 - s)\bigl(\gamma(t) - p\bigr) + s\,\gamma'(t). \]

This joins γ − p and γ′ and never passes through 0 (since h(s, t) = 0
would give an R-linear dependence between γ(t) − p and γ′(t)) so it
shows that γ − p and γ′ have the same winding number about 0.

Recall from Section 4.2 that a Jordan curve is a loop that does
not intersect itself: the points γ(t) and γ(t′), for t′ < t, must always
be distinct (except when t′ = 0 and t = 1). The beautiful proof of
the following result about the rotation number of a Jordan curve is
due to Heinz Hopf.

Proposition 6.1.4 (Theorem of the turning tangent). If γ is a

smooth Jordan curve, then rot(γ) = ±1.

Proof. By compactness, there is some point on the curve that
minimizes the imaginary part of γ(t). Without loss of generality, let us
take it that this minimum occurs at t = 0 and that γ(0) = 0. Thus
Im γ(t) ≥ 0 for all t.
Recall that for a nonzero complex number w, we define υ(w) =
w/|w|. Consider the function (the secant map) defined for
0 ≤ t′ ≤ t ≤ 1 by

\[ \sigma(t', t) = \begin{cases} \upsilon(\gamma(t) - \gamma(t')) & \text{if } t > t',\ (t', t) \neq (0, 1), \\ \upsilon(\gamma'(t)) & \text{if } t = t', \\ -\upsilon(\gamma'(0)) & \text{if } t' = 0,\ t = 1. \end{cases} \]

Expressed in words, this means that σ(t′, t) is the unit vector in the
direction from γ(t′) to γ(t) (well-defined because the curve does not
intersect itself), with appropriate interpretation when t = t′ (as the
unit tangent vector at t) and when t′ = 0, t = 1 (now γ(t′) = γ(t)
as we have gone “all the way around” the loop and the appropriate
interpretation is minus the unit tangent vector at 0). I claim that σ
is a continuous function on the closed triangle 0 ≤ t′ ≤ t ≤ 1. Let us
grant that claim for a moment, finish the proof, then come back to
the claim.
We construct a homotopy of loops in S^1 as follows:

\[ h(s, t) = \begin{cases} \sigma\bigl((1-s)t,\ (1+s)t\bigr) & (0 \le t \le \tfrac12), \\ \sigma\bigl((1-s)(1-t) + (2t-1),\ (1+s)(1-t) + (2t-1)\bigr) & (\tfrac12 \le t \le 1). \end{cases} \]

This homotopy is illustrated in Figure 6.2: as t runs from 0 to 1, it
follows the values of σ along the line segment from (0, 0) when t = 0
to (½(1 − s), ½(1 + s)) when t = ½ and then along the line segment
from there to (1, 1) when t = 1. When s = 0, this is the loop of unit
tangent vectors to γ, so its winding number is precisely rot(γ). When
s = 1, we are traversing the two line segments from (0, 0) to (0, 1)
and then to (1, 1).

Figure 6.2. The homotopy used to prove the theorem of the turning
tangents. (Labeled points: (0, 0), (0, 1), (1, 1), and
(½(1 − s), ½(1 + s)).)

Now use the fact that γ lies in the upper half-plane and γ(0) = 0.
This tells us first of all that γ′(0) is real, so that h(1, 0) = h(1, 1) =
−h(1, ½) = ±1. Second, it tells us that the loop t ↦ h(1, t) must
lie in the upper half-plane for 0 ≤ t ≤ ½ and the lower half-plane
for ½ ≤ t ≤ 1. But now Rouché’s theorem (Proposition 3.1.4) tells
us that this loop is homotopic to the loop t ↦ e^{2πit} (if h(1, 0) = 1)
or t ↦ −e^{−2πit} (if h(1, 0) = −1). Thus it has winding number ±1.
Now the homotopy h shows that this winding number is the same as
the winding number of t ↦ h(0, t), which is the rotation number of
γ, as we remarked. This completes the proof except for verifying the
claimed continuity of the secant map σ.
To check the continuity, define

\[ g(t', t) = \begin{cases} \dfrac{\gamma(t) - \gamma(t')}{t - t'} & (t' < t), \\ \gamma'(t) & (t = t'), \end{cases} \]

so that σ(t′, t) = υ(g(t′, t)) except for the special case (t′, t) = (0, 1)
which we will handle separately. It will be enough to show that g is
a continuous function of (t′, t) and the only points at which there is
any doubt are those on the diagonal t = t′. Consider then a diagonal
point (a, a). Given ε > 0 there is δ > 0 such that |γ′(t) − γ′(a)| < ε
for t ∈ (a − δ, a + δ). Now write

\[ g(t', t) - g(a, a) = \int_0^1 \bigl[ \gamma'\bigl((1-u)t + ut'\bigr) - \gamma'(a) \bigr]\, du \]

using the fundamental theorem of calculus. If t′, t ∈ (a − δ, a + δ),
then the absolute value of the quantity under the integration sign is
less than ε for each u. Thus |g(t′, t) − g(a, a)| < ε and this establishes
continuity.
At the point (0, 1) the proof uses the same idea. See Exercise 6.5.5
for some supporting details.
Remark 6.1.5. One sees from the proof that the sign (±) of the
rotation number depends on whether the curve γ is traversed in the
“positive” or “negative” direction, where by the “positive direction”
we mean that direction for which the interior of γ is to the left of γ′.
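The theorem and the remark about signs can be tested on a non-convex example. The sketch below assumes the curve γ(t) = (1 + 0.5 cos 3θ) e^{iθ} with θ = 2πt, chosen for illustration; it is a Jordan curve because its polar radius is positive and single-valued. The helper names are hypothetical.

```python
import cmath
import math

def winding_about_zero(loop, n=4000):
    """Winding number about 0 of a closed loop from [0, 1] to C minus {0}."""
    total = 0.0
    for k in range(n):
        total += cmath.phase(loop((k + 1) / n) / loop(k / n))
    return round(total / (2 * cmath.pi))

def dgamma(t):
    """Exact derivative of gamma(t) = (1 + 0.5*cos(3*th)) * exp(i*th), th = 2*pi*t."""
    th = 2 * math.pi * t
    r = 1 + 0.5 * math.cos(3 * th)
    dr = -1.5 * math.sin(3 * th)
    return 2 * math.pi * (dr + 1j * r) * cmath.exp(1j * th)

def dgamma_reversed(t):
    return -dgamma(1 - t)     # derivative of the reversed loop t -> gamma(1 - t)

rot_positive = winding_about_zero(dgamma)
rot_negative = winding_about_zero(dgamma_reversed)
```

The counterclockwise parameterization keeps the interior on the left and gives rotation number +1; reversing the direction of travel gives −1.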

6.2. Curvature and the rotation number

The rotation number is closely related to the curvature studied in
differential geometry. Let us recall a few definitions. Let γ be a
smooth, regular path in the plane. For the purposes of this discussion,
we allow the parameter interval to be of arbitrary length — that is,
we allow γ to be a map [0, d] → C, rather than [0, 1] → C.
Definition 6.2.1. The path γ is a unit speed path if |γ′(t)| = 1 for
all t.

It is a simple theorem that every regular smooth path can be

smoothly reparameterized (Deﬁnition 5.1.15) so as to become a unit
speed path. This can be achieved by using the arclength

\[ s(t) = \int_0^t |\gamma'(t)|\, dt \tag{6.2.2} \]

as the new parameter. For this reason a unit speed path is also called
a path parameterized by arc length, and we will usually use the letter
s to denote the parameter for such a path.
Let γ be a smooth unit speed path. Then s ↦ γ′(s) is a smooth
map [0, d] → S^1. According to Proposition 3.2.1 there exists a
function s ↦ θ(s) such that γ′(s) = e^{iθ(s)}; the function θ is a “smooth
choice of tangent angle” at every point.
Definition 6.2.3. The curvature κ of a unit speed path γ at s is
the derivative θ′(s): it measures how fast the tangent angle is turning
with distance.

Example 6.2.4. The curvature of a circular path of radius r is equal

to ±1/r (the sign dependent on the orientation).

From our deﬁnitions we have

Proposition 6.2.5. If γ : [0, d] → C is a smooth loop with a unit
speed parameterization, then

\[ \operatorname{rot}(\gamma) = \frac{1}{2\pi} \int_0^d \kappa(s)\, ds; \]

the rotation number is equal to (2π)^{−1} times the total curvature.

Proof. Since κ(s) = dθ/ds (in the notation above), we have
∫_0^d κ(s) ds = θ(d) − θ(0). But, by Definition 3.2.4, this is just 2π
times the winding number of γ′.

Deﬁnition 6.2.3 has the disadvantage of apparently depending on

a particular choice of parameterization (a unit speed parameteriza-
tion). We can use the technology of 1-forms to give a more canonical
approach.
Definition 6.2.6. Let γ be a regular smooth path in the plane. The
curvature 1-form of γ is the 1-form

\[ \omega_\gamma = \operatorname{Im}\bigl( \gamma''(t) / \gamma'(t) \bigr)\, dt \]

defined on γ. (Here γ′ and γ″ are complex numbers, and Im denotes
the imaginary part.)
Lemma 6.2.7. The curvature 1-form is unchanged by smooth repa-
rameterization (Definition 5.1.15) of the path γ. If the parameteri-
zation is a unit speed one, then ω = κ ds, where κ is the curvature
defined above for unit speed curves.

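Proof. Let t and t̃ be two parameterizations for γ. We have, by the
chain and product rules,

\[ \frac{d\gamma}{dt} = \frac{d\gamma}{d\tilde t}\,\frac{d\tilde t}{dt}, \qquad \frac{d^2\gamma}{dt^2} = \frac{d\gamma}{d\tilde t}\,\frac{d^2\tilde t}{dt^2} + \frac{d^2\gamma}{d\tilde t^2} \Bigl( \frac{d\tilde t}{dt} \Bigr)^{2} \]

and so

\[ \frac{d^2\gamma/dt^2}{d\gamma/dt} = \frac{d^2\tilde t/dt^2}{d\tilde t/dt} + \frac{d^2\gamma/d\tilde t^2}{d\gamma/d\tilde t}\,\frac{d\tilde t}{dt}. \]

The first term in the second display is real, so when we take imaginary
parts it vanishes and we get

\[ \operatorname{Im}\Bigl( \frac{d^2\gamma/dt^2}{d\gamma/dt} \Bigr)\, dt = \operatorname{Im}\Bigl( \frac{d^2\gamma/d\tilde t^2}{d\gamma/d\tilde t} \Bigr)\, d\tilde t, \]

which shows that the definition of ω is independent of the
parameterization. For the second part of the lemma, consider a unit
speed parameterization with γ′(s) = e^{iθ(s)}; then γ″ = ie^{iθ}θ′ and the
definition of ω becomes ω = Im(iθ′) ds = κ ds, as required.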

The invariant version of Proposition 6.2.5 is then

Proposition 6.2.8. Let γ be a smooth, regular loop. Then

\[ \operatorname{rot}(\gamma) = \frac{1}{2\pi} \int_{\gamma} \omega_\gamma; \]

the rotation number is (2π)^{−1} times the total curvature. In
particular, the total curvature of a smooth Jordan curve is ±2π, the sign
depending on whether the curve is traversed in the positive or negative
direction.

Proof. Since the right-hand side does not depend on the choice of
parameterization, by Lemma 6.2.7, we may assume that the loop γ
is parameterized by arc length. Then the right side is (2π)^{−1} ∫ κ ds,
which equals the rotation number by Proposition 6.2.5. For an
alternative proof see Exercise 6.5.10.
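Because the curvature 1-form does not require a unit speed parameterization, the total curvature can be computed directly from any regular parameterization. The sketch below assumes an ellipse with semi-axes a = 2 and b = 1, parameterized by γ(t) = a cos 2πt + ib sin 2πt; the function name `total_curvature` is hypothetical.

```python
import math

def total_curvature(d1, d2, n=4000):
    """Midpoint-rule value of the integral over [0, 1] of Im(gamma''/gamma') dt.

    d1 and d2 return the first and second derivatives of the loop.
    """
    return sum((d2((k + 0.5) / n) / d1((k + 0.5) / n)).imag for k in range(n)) / n

a, b = 2.0, 1.0   # semi-axes of the ellipse (an illustrative choice)

def d1(t):
    th = 2 * math.pi * t
    return 2 * math.pi * complex(-a * math.sin(th), b * math.cos(th))

def d2(t):
    th = 2 * math.pi * t
    return -(2 * math.pi) ** 2 * complex(a * math.cos(th), b * math.sin(th))

rot = total_curvature(d1, d2) / (2 * math.pi)
```

The ellipse is a smooth Jordan curve traversed in the positive direction, so `rot` should be close to 1, that is, the total curvature should be close to 2π even though the curvature itself varies along the curve.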

6.3. Vector ﬁelds and singularities

Imagine that we want to compute the rotation number of a large loop
that is laid out on the surface of the earth. We walk around the loop,
measuring at each point the angle θ that the tangent vector makes
with north (let’s say) and then integrating the curvature κ = dθ/ds
around the loop. To ﬁnd out which direction is north, we carry a
magnetic compass.
What’s wrong with this picture? For some curves our calculation
of the rotation number will give the “correct” answer. For others,
like a circle of latitude encircling the (magnetic) north pole, it
will look as though we have a Jordan curve with rotation number
zero!
What is happening is that we are using a vector field to describe
our “reference” direction — here, the vector field which gives the
direction in which the compass needle points — and that vector field
can have “singularities” (like that of the earth’s magnetic field at the
north pole) which themselves contribute to the rotation number. In
this section we will work out the details of this idea.
Let V be a finite-dimensional vector space (we will usually be
thinking about V = C, considered as a 2-dimensional vector space
over R) and let Ω be an open subset of V .
Definition 6.3.1. A tangent vector to Ω at p is a pair (p, v) consisting
of a point p ∈ Ω and a vector v ∈ V .

We think of a tangent vector at p as a little arrow whose origin

is at p and which is pointing in the direction indicated by v.
Deﬁnition 6.3.2. A vector ﬁeld X on Ω is a continuous (usually
smooth) function which assigns to each point p ∈ Ω a tangent vector
at that point.

To visualize a vector ﬁeld, therefore, imagine attaching a little

arrow to each point p ∈ Ω, varying smoothly with p. You might end
up with something like a weather map indicating wind strengths and
directions. It is common practice to write X(p) for the vector v by
itself (rather than the ordered pair (p, v)) when this will not cause
confusion.
Definition 6.3.3. A singularity of the vector field X is a point z
such that the vector X(z) = 0 (the little arrow has zero length).
The vector field has isolated singularities if the set of singularities is
discrete, and it is nonsingular if the set of singularities is empty.

From now on we will focus on vector fields with isolated
singularities, defined on open subsets of V = C. Let X be such a vector
field, defined on Ω ⊆ C, that has an isolated singularity at a. Then, for
small ε, the function t ↦ X(a + εe^{2πit}), t ∈ [0, 1], is a loop in
C \ {0}. Varying ε changes this loop by a homotopy, so the winding
number of the loop is independent of ε (as long as ε is small enough).

Figure 6.3. Vector fields with various singularities (panels, left to
right: “Index 1”, “Another index 1 example”, “Index −1”). The
black dot denotes the basepoint p of each vector and the line
denotes its magnitude and direction. The singularity is the
point in the center of each picture where the vector field vanishes.

Deﬁnition 6.3.4. The above winding number is called the index (or
degree) of the isolated singularity at a and is denoted ind(X, a).

In favorable cases the index can be recovered from “inﬁnitesimal”

data about the vector ﬁeld near the singularity.

Proposition 6.3.5. Let X := u + iv be a smooth vector field having
an isolated singularity at the point z := x + iy = a (say). Suppose
that the Jacobian

\[ J = \begin{vmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{vmatrix} \]

is nonzero at the singularity z = a. Then the index of the singularity
is ±1, according to the sign of J.

Proof. Without loss of generality assume that a = 0. Let M denote
the 2 × 2 matrix whose determinant is the Jacobian; that is,

\[ M = \begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix} \bigg|_{(x,y)=(0,0)}. \]

By Definition E.2.4 we can write the path whose winding number
gives the index as follows:

\[ \begin{pmatrix} u(\varepsilon\cos 2\pi t,\ \varepsilon\sin 2\pi t) \\ v(\varepsilon\cos 2\pi t,\ \varepsilon\sin 2\pi t) \end{pmatrix} = \varepsilon M \begin{pmatrix} \cos 2\pi t \\ \sin 2\pi t \end{pmatrix} + e(t), \]

where the error term e(t) has the property that ε^{−1}‖e(t)‖ → 0 as
ε → 0. Since M is an invertible matrix, the first term on the
right-hand side has norm at least Cε for all t, ε, where C > 0 is a
constant (actually the reciprocal of the norm of the matrix M^{−1}).
Thus for ε small enough the norm of the first term on the right-hand
side is strictly greater than that of the second one, and so by Rouché’s
theorem (Proposition 3.1.4) the index is equal to the winding number
of the path

\[ t \mapsto M \begin{pmatrix} \cos 2\pi t \\ \sin 2\pi t \end{pmatrix}. \]

But this is just an ellipse, traversed in the positive or negative
direction according to the sign of the determinant of M, and so has
winding number ±1.
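The proposition can be compared with a direct computation of the index. The sketch below is an illustration under stated assumptions: vector fields are written as complex-valued functions of a complex variable, the Jacobian is estimated by central differences, and the helper names `index` and `jacobian` are hypothetical.

```python
import cmath

def index(X, a, eps=1e-3, n=2000):
    """Winding number about 0 of t -> X(a + eps * exp(2*pi*i*t))."""
    a = complex(a)
    total = 0.0
    prev = X(a + eps)
    for k in range(1, n + 1):
        cur = X(a + eps * cmath.exp(2j * cmath.pi * k / n))
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * cmath.pi))

def jacobian(X, a, h=1e-6):
    """Central-difference estimate of J = u_x * v_y - u_y * v_x at a, where X = u + iv."""
    a = complex(a)
    dx = (X(a + h) - X(a - h)) / (2 * h)            # u_x + i v_x
    dy = (X(a + 1j * h) - X(a - 1j * h)) / (2 * h)  # u_y + i v_y
    return dx.real * dy.imag - dy.real * dx.imag

def source(z):
    return z                  # arrows point away from 0

def rotation(z):
    return 1j * z             # arrows circulate around 0

def saddle(z):
    return z.conjugate()      # a saddle point at 0

def degenerate(z):
    return z * z              # Jacobian vanishes at 0
```

For `source` and `rotation` the Jacobian is positive and the index is +1; for `saddle` the Jacobian is negative and the index is −1. The field `degenerate` has zero Jacobian at the origin, so the proposition does not apply, and its index turns out to be 2.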

Remark 6.3.6. In fact one can recover the index from the inﬁnites-
imal data at any isolated singularity, even one where the Jacobian is
singular. The remarkable algebraic formula which allows one to do
this was discovered only in the 1970s. See [17].

Let X be a vector ﬁeld, with isolated singularities, deﬁned on an

open subset Ω ⊆ C. Let γ be a loop in Ω which does not pass through
any singularities of X. Then t → X(γ(t)) is a loop in C \ {0}. If γ is a
small circle surrounding a singularity, then the winding number of this
loop is, by definition, the index of the singularity (Definition 6.3.4).
But there is no need to restrict our attention to that case, and we can
make the more general definition below.

Deﬁnition 6.3.7. In the situation above, we will call the winding

number of X(γ(t)) the rotation number of X around γ, and we will
denote it by rot(X; γ). We may extend this deﬁnition to cycles Γ in
Ω as in Remark 5.4.8.

Remark 6.3.8. The rotation number of a smooth loop γ, as we have

defined it in Definition 6.1.1, appears in terms of the above definition
to be rot(γ ; γ), the “rotation number of the vector field γ around γ”.
This does not make literal sense without a bit of work — since γ is
defined only on γ itself, not on an open set Ω — but it is a suggestive
connection.

Theorem 6.3.9. Let Ω be an open subset of C, and let Γ be a
nullhomologous cycle¹ in Ω. Suppose that X is a nonsingular smooth
vector field defined on Ω. Then rot(X; Γ) = 0.

Recall that, by Artin’s criterion (Theorem 5.4.9), Γ is
nullhomologous if and only if wn(Γ; p) = 0 for all p ∉ Ω.

Proof. First, observe that since X is nonsingular, we may without

loss of generality assume that it is a unit vector field, that is, |X(z)| =
1 for all z ∈ Ω. Indeed, the unit vector field υ(X) is well-defined,
smooth, and nonsingular and it has the same rotation number as X
by Remark 3.2.8.
Now let X = u + iv, u, v real, be a smooth unit vector field.
Define a 1-form ω by

\[ \omega = \frac{d(u + iv)}{i(u + iv)} = \Bigl( -v \frac{\partial u}{\partial x} + u \frac{\partial v}{\partial x} \Bigr)\, dx + \Bigl( -v \frac{\partial u}{\partial y} + u \frac{\partial v}{\partial y} \Bigr)\, dy. \]

To see where this formula comes from, notice that for each p ∈ Ω
there is a disc D = D(p; ε) ⊆ Ω. Since this disc is contractible,
Corollary 3.1.6 tells us that there is a smooth real-valued function θ
on D with

\[ u + iv = e^{i\theta} = \cos\theta + i \sin\theta. \]

Then the form ω above is equal, on D, to dθ. In particular ω is
closed on a neighborhood of p; and, since p was arbitrary, it follows
that ω is closed.²
Moreover, by the integral formula for the winding number
(Theorem 5.3.4),

\[ \operatorname{rot}(X; \Gamma) = \frac{1}{2\pi} \int_{\Gamma} \omega. \]

Since ω is closed and Γ is a boundary in Ω, Lemma 5.4.7 implies that
∫_Γ ω = 0. The result now follows.

Corollary 6.3.10. Suppose that X is a smooth vector field with
isolated singularities on Ω, and let Γ be a nullhomologous cycle in Ω.
Then

\[ \operatorname{rot}(X; \Gamma) = \sum_{k} \operatorname{wn}(\Gamma; p_k)\, \operatorname{ind}(X; p_k), \]

where the sum is taken over all singularities p_k.

¹ See Section 5.4 for the language of cycles and homology.
² But it need not be exact! See Exercise 5.7.9.

Proof. For each singularity p_k, let γ_k denote a small circle
surrounding it. Let Ω′ = Ω \ {p_k} denote Ω with the singularities
removed. Let Γ′ be the cycle in Ω′ defined by

\[ \Gamma' = \sum_{k} \operatorname{wn}(\Gamma; p_k)\, \gamma_k. \]

Then Γ and Γ′ have the same winding numbers around all points not
in Ω′, on which region X is nonsingular. Thus by Theorem 6.3.9,
applied to the cycle [Γ] − [Γ′],

\[ \operatorname{rot}(X; \Gamma) = \operatorname{rot}(X; \Gamma') = \sum_{k} \operatorname{wn}(\Gamma; p_k)\, \operatorname{rot}(X; \gamma_k), \]

and rot(X; γ_k) = ind(X; p_k) by definition.
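The corollary can be checked on a small example. The sketch below assumes the vector field X(z) = z · conj(z − 2), chosen for illustration, which has a singularity of index +1 at 0 and one of index −1 at 2; all helper names are hypothetical.

```python
import cmath

def winding_about_zero(loop, n=4000):
    """Winding number about 0 of a closed loop from [0, 1] to C minus {0}."""
    total = 0.0
    for k in range(n):
        total += cmath.phase(loop((k + 1) / n) / loop(k / n))
    return round(total / (2 * cmath.pi))

def circle(c, r):
    return lambda t: c + r * cmath.exp(2j * cmath.pi * t)

def X(z):
    return z * (z - 2).conjugate()     # singularities at 0 and at 2

def rot(X, gamma):
    """Rotation number of X around gamma: the winding number of X(gamma(t))."""
    return winding_about_zero(lambda t: X(gamma(t)))

def index_sum(X, gamma, singularities, eps=1e-3):
    """Right-hand side of the corollary: sum of wn(gamma; p) * ind(X; p)."""
    total = 0
    for p in singularities:
        wn = winding_about_zero(lambda t: gamma(t) - p)
        ind = rot(X, circle(p, eps))   # index measured on a small circle around p
        total += wn * ind
    return total
```

A circle of radius 1 about the origin encloses only the singularity of index +1, so both sides equal 1. A circle of radius 3 encloses both singularities and both sides equal 0: a Jordan curve around which the field has rotation number zero, in the spirit of the compass example at the start of this section.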

Just for fun, let’s use this to prove the Brouwer ﬁxed-point the-
orem again.

Theorem 6.3.11. Let D denote the closed unit disc {z ∈ C : |z| ≤ 1}.
Any continuous map f : D → D must have a fixed point, that is, a
point z_0 such that f(z_0) = z_0.

Proof. Suppose not. Then define a vector field³ X as follows: X(z)
is the vector f(z) − z, i.e., the vector that points from z to f(z).
This is a vector field without singularities (by assumption) and from
any boundary point z ∈ S^1 it always points inwards. Let t ↦
γ(t) = e^{2πit} parameterize the boundary circle. Then for all t, we
have Im(γ′(t)/X(γ(t))) < 0; this just translates the statement that
X(γ(t)) points inwards. By Rouché’s theorem (Proposition 3.1.4), the
winding number of γ′ (that is, the rotation number of γ) equals the
winding number of X ◦ γ (that is, the rotation number of X). But
the first of these is ±1 (Proposition 6.1.4) and the second is 0
(Theorem 6.3.9), so this is a contradiction.

³ Extend it to the exterior of the disc by setting X(re^{iθ}) = X(e^{iθ})
for r > 1.

Figure 6.4. Some surfaces.

6.4. Vector ﬁelds and surfaces

In this ﬁnal section we are going to use the ideas of this chapter to
sketch the proof of a famous theorem about vector ﬁelds on surfaces.
The kinds of surfaces that we have in mind are called in mathematics
closed, oriented, smooth surfaces of which standard examples are the
sphere, the torus, and the double torus (Figure 6.4).
One way to study the topology of a surface S is to subdivide
it into polygons, or faces, meeting along edges and at vertices. For
example, a standard soccer ball subdivides the surface of a sphere
into 32 faces (12 pentagons and 20 hexagons), with 90 edges and 60
vertices. The quantity

χ(S) = V − E + F,

where V , E, and F denote the numbers of vertices, edges, and faces,

respectively, is called the Euler characteristic of the surface. It does
not depend on the way the surface is subdivided and is equal to 2 for
the sphere, 0 for the torus, −2 for the double torus, and in general to
2 − 2g where g is the genus or “number of holes” of S.
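
The soccer ball count can be checked by a short computation. The sketch below is an illustration under the assumption that each edge is shared by exactly two faces and that exactly three faces meet at each vertex, which is the case for the standard ball; the function names are hypothetical.

```python
def euler_characteristic(vertices, edges, faces):
    """Return V - E + F for a subdivision of a surface."""
    return vertices - edges + faces


def soccer_ball_counts(pentagons=12, hexagons=20):
    """Vertex, edge, and face counts of the standard soccer ball."""
    faces = pentagons + hexagons
    sides = 5 * pentagons + 6 * hexagons
    edges = sides // 2      # each edge is a side of two faces
    vertices = sides // 3   # three faces meet at each vertex
    return vertices, edges, faces


def euler_characteristic_from_genus(genus):
    """The formula 2 - 2g quoted in the text."""
    return 2 - 2 * genus


if __name__ == "__main__":
    v, e, f = soccer_ball_counts()
    print(v, e, f, euler_characteristic(v, e, f))
```

A cube (8 vertices, 12 edges, 6 faces) gives the same value 2, and a torus cut into a 3 × 3 grid of squares (9 vertices, 18 edges, 9 faces) gives 0, in agreement with 2 − 2g.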
A surface may also be considered as the domain of a tangent vector
field (one whose direction is everywhere tangential to the surface).
When you start sketching these you soon find that a tangent vector
field on the sphere seemingly must have a singularity somewhere (as
the “latitudinal” vector field has singularities at the north and south
poles) whereas a tangent vector field on the torus can be nonsingular
(see Figure 6.5). These observations are systematized and generalized
by the following famous theorem of Hopf.

Figure 6.5. Nonsingular vector field on a torus. Creative
Commons Attribution-ShareAlike 3.0.

Theorem 6.4.1 (Hopf index theorem). Let X be a smooth tangent
vector field on a compact oriented surface S, with isolated singularities
{p_1, . . . , p_n}. Then the sum of the indices of the singularities,
\[ \sum_{k=1}^{n} \operatorname{ind}(X; p_k), \]
depends only on S (and not on the vector field); moreover, it is equal
to the Euler characteristic χ(S) of the surface S.

Sketch of the proof. The basic idea is to apply Proposition 6.2.8

to the “boundary curves” of each of the faces in a subdivision and
then add up the results. There are two fundamental obstacles to this
plan.
(a) Proposition 6.2.8 applies to smooth Jordan curves, but the bound-
aries of faces are only piecewise smooth — the tangent vector
“jumps” at the vertices.
(b) Proposition 6.2.8 involves the curvature, that is, the rate of change
of the angle that γ makes with a ﬁxed “reference direction”. But,
on a surface, there is no globally deﬁned choice of “reference di-
rection”.

It is not too hard to see how one should resolve obstacle (a). Suppose
that we consider regular piecewise smooth loops: a loop γ : [0, 1] → C is
regular piecewise smooth if it is continuous and if there exist finitely
many parameter values 0 = a_0 < a_1 < · · · < a_m = 1 such that
(i) the map γ is smooth on each subinterval [a_i, a_{i+1}], and its
derivative γ′ is nonzero there;
(ii) (no cusps) at each breakpoint a_i the tangent vectors
\[ \gamma'(a_i^-) := \lim_{u \to 0,\, u < 0} \gamma'(a_i + u), \qquad \gamma'(a_i^+) := \lim_{u \to 0,\, u > 0} \gamma'(a_i + u) \]
are not in exactly opposite directions.

For such a curve γ the derivative γ′(t) now traces out a series of arcs in
C \ {0} as t runs over the parameter intervals [a_i, a_{i+1}]. The endpoint
γ′(a_i^−) of one arc need not be the same as the starting point γ′(a_i^+) of
the next arc, but because of the no-cusps condition, the straight line
path from γ′(a_i^−) to γ′(a_i^+) lies in C \ {0}. Thus, by joining up the
arcs γ′|[a_{i−1}, a_i] with line segments, we obtain a (continuous) loop in
C \ {0}, and we can define the rotation number of the regular piecewise
smooth loop γ to be the winding number of this loop. Clearly, if γ
is actually smooth, this agrees with Definition 6.1.1. Remembering
that the curvature is supposed to keep track of the angular change
in the tangent vector γ′, it is not hard to see that Proposition 6.2.8
generalizes to the following statement for regular piecewise smooth
curves:
\[ \operatorname{rot}(\gamma) = \frac{1}{2\pi} \Biggl( \sum_{i} \int_{\gamma|_{[a_i, a_{i+1}]}} \omega_\gamma + \sum_{j} \theta_j \Biggr), \tag{6.4.2} \]
where ω_γ is the curvature 1-form as before and θ_j ∈ (−π, π) is the
external angle at the jth vertex, that is, the angle between γ′(a_j^−)
and γ′(a_j^+). In particular, for a regular piecewise smooth Jordan
curve, oriented in the positive direction, the quantity appearing in
(6.4.2) is equal to 1.
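
For a polygon the edges are straight, so the curvature integrals in (6.4.2) vanish and the rotation number is just the sum of the external angles divided by 2π. The following sketch checks this special case only; the helper names are hypothetical, vertices are given as complex numbers, and the polygon is assumed to have no cusps.

```python
import cmath
import math


def external_angles(vertices):
    """External angles, each in (-pi, pi], of the closed polygon `vertices`."""
    pts = [complex(v) for v in vertices]
    n = len(pts)
    edges = [pts[(k + 1) % n] - pts[k] for k in range(n)]
    # At vertex k the direction changes from edges[k - 1] to edges[k].
    return [cmath.phase(edges[k] / edges[k - 1]) for k in range(n)]


def rotation_number(vertices):
    """Sum of the external angles divided by 2*pi, as in (6.4.2) for a polygon."""
    return round(sum(external_angles(vertices)) / (2 * math.pi))


if __name__ == "__main__":
    square = [0, 1, 1 + 1j, 1j]
    l_shape = [0, 2, 2 + 1j, 1 + 1j, 1 + 2j, 2j]
    print(rotation_number(square), rotation_number(l_shape))
```

The L-shaped polygon has one negative external angle at its reflex vertex, but the total is still 2π, so both positively oriented Jordan polygons have rotation number 1.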
Now let’s turn our attention to obstacle (b) above, and here we
convert the obstacle from a bug to a feature by using the vector field
X to provide the missing choice of “reference direction”. In other
words, we define “relative curvature” ω^X and so on by measuring the
rate of change of the angle between γ′(t) and X(γ(t)) (rather than the
rate of change of either of these quantities individually). This makes
sense provided that all the singularities of X are in the interiors of
faces (which can easily be arranged by perturbing the subdivision a
bit); however, it now introduces a further change into (6.4.2). By
integrating the relative curvature, we will obtain, not the absolute
rotation number +1 of a boundary curve γ, but the difference between
this quantity and the rotation number of X around γ as computed in
Corollary 6.3.10. In other words, we will have for each γ that bounds
a face F (considered as a regular piecewise smooth curve),
\[ 1 - \sum_{p_k \in F} \operatorname{ind}(X; p_k) = \frac{1}{2\pi} \Biggl( \sum_{i} \int_{\gamma|_{[a_i, a_{i+1}]}} \omega_\gamma^X + \sum_{j} \theta_j \Biggr), \tag{6.4.3} \]
with the θ_j denoting external angles as before.

It is this last equation that we will sum over all faces F. When
we do that, we will obtain F − Σ_k ind(X; p_k) on the left (here F
also denotes the number of faces). What will
happen on the right? Each edge will appear twice, once as the edge of
its left-hand face and once as the edge of its right-hand face, and the
orientations of these two occurrences will be opposite. So the sum of
all the edge integral contributions, from all faces, will be zero. But
the sum of the vertex contributions will not be zero. To see what we
will get here, let ϕ_j = π − θ_j be the internal angle corresponding to
the external angle θ_j. Then for face F,
\[ \sum_{j} \theta_j = e(F)\pi - \sum_{j} \varphi_j, \]
where e(F) is the number of vertices (equal to the number of edges)
of face F. If we sum this expression over all faces, the first term on
the right sums to 2πE (because each edge appears twice in the sum,
contributing π each time) and the second term on the right sums to
−2πV (because the sum of all the internal angles at any given vertex
is 2π). Consequently, after summing over all faces we get from (6.4.3)
\[ F - \sum_{k} \operatorname{ind}(X; p_k) = E - V, \]
which is Hopf’s theorem, since it rearranges to
Σ_k ind(X; p_k) = V − E + F = χ(S).

6.5. Exercises
Exercise 6.5.1. Give a construction that produces, for each m, n ∈
Z, a regular smooth loop in C \ {0} with winding number m around
0 and rotation number n.
Exercise 6.5.2. Generalize Proposition 6.1.3 by proving that the
difference
wn(γ, p) − rot(γ)
is in general equal to the number of rays through p that are tangent to
γ, counted with appropriate signs. (“In general” refers to a suitable
transversality hypothesis.)
Exercise 6.5.3. Let γ0 and γ1 be regular smooth loops in Ω (an
open subset of C). A homotopy h between them is called a regular
homotopy if, for each s ∈ [0, 1], the loop γs defined by
γs (t) = h(s, t)
is also regular and smooth. Give an example of two loops that are
homotopic but not regularly homotopic.
Exercise 6.5.4. Show that two regular smooth loops in C are reg-
ularly homotopic if and only if they have the same rotation number.
This is the Whitney-Graustein theorem; see [39]. To prove the “if”
part, try to integrate a homotopy on the level of derivatives (γ′) to a
homotopy on the level of the curves themselves (γ).
Exercise 6.5.5. In the proof of Proposition 6.1.4 we left to the reader
the verification that the secant map is continuous at (0, 1). Here is
one way to approach this. Define a new smooth path θ by
\[ \theta(s) = \begin{cases} \gamma(s + \tfrac12) & (0 \le s \le \tfrac12), \\ \gamma(s - \tfrac12) & (\tfrac12 < s \le 1) \end{cases} \]
(smoothness at s = 1/2 follows from the regular loop condition for γ at
t = 0). Then define for t′ > 1/2 and t < 1/2,
\[ h(t', t) = \begin{cases} \dfrac{\theta(t' - \tfrac12) - \theta(t + \tfrac12)}{t' - t - 1} & (t' - t < 1), \\[1ex] \gamma'(0) & (t' = 1,\ t = 0). \end{cases} \]
Show that h is continuous and that υ(h(t′, t)) = −σ(t′, t) for all t, t′.

Exercise 6.5.6. A diﬀerent proof of the theorem of turning tangents

(Proposition 6.1.4) can be given by way of polygonal approximation.
In this exercise we’ll develop that proof, following some lecture notes
of M. Ghomi.
(a) A polygon P in the plane is called whisker-free if no two successive
edges are in exactly opposite directions. (This is the equivalent
for polygons of the no-cusps requirement for regular piecewise
smooth curves.) The external angles of a whisker-free polygon
are then well-defined in (−π, π). Show that the sum of the ex-
ternal angles of a whisker-free Jordan polygon is ±2π, with sign
dependent on orientation. (Hint: Use induction on the number
of vertices. For the induction step, show that if there are more
than 3 vertices, then one can always be removed in such a way
that the truncated polygon is still a whisker-free Jordan one.)
(b) Let γ be a regular smooth Jordan curve, parameterized by arc
length: γ : [0, d] → C. Define the nth approximating polygon P_n
to be the polygon with vertices γ(kd/n), k = 0, . . . , n − 1. Show
that when n is large enough, P_n is a whisker-free Jordan polygon
and that its kth side (the side with vertices γ((k − 1)d/n) and
γ(kd/n)) is parallel to γ′(s_k) for some s_k ∈ [(k − 1)d/n, kd/n].
(c) Deduce using the definition of an integral as the limit of a sum
that the sum of the external angles of the polygon P_n tends to
the total curvature,
\[ \int_0^d \kappa(s)\, ds, \]
as n tends to infinity. Thus obtain another proof of the theorem
of the turning tangent.

Exercise 6.5.7. Let γ be a regular smooth plane curve, possibly

with ﬁnitely many transverse self-intersections. An uncrossing move
modiﬁes γ in the neighborhood of a self-intersection, as shown in
Figure 6.6.
Show that an uncrossing move does not change the rotation num-
ber. Deduce that the rotation number of γ is equal to the number
of anticlockwise loops minus the number of clockwise loops obtained
after we have uncrossed all self-intersections.

Figure 6.6. Uncrossing move.

Exercise 6.5.8. With the notation of the previous problem, give
each self-intersection point p a sign w(p) according to the following
scheme: assume that γ(0) = γ(1) is not a self-intersection and, if
γ(a) = γ(b) with 0 < a < b < 1, give a self-intersection point a +
sign if the pair (γ′(b), γ′(a)) is a right-handed basis (i.e., γ′(a) lies
counterclockwise from γ′(b)) and a − sign if not. Show that
\[ \operatorname{rot}(\gamma) = \sum_{p} w(p) + \operatorname{wn}(\gamma, q') + \operatorname{wn}(\gamma, q''), \]
where the sum runs over the self-intersections and q′, q″ are points in
the two cells of C \ γ∗ that are adjacent to the base point γ(0) = γ(1).
(Use an induction argument on the number of self-intersections.)

Exercise 6.5.9. Show that if a path in the plane has constant cur-
vature 1/r, it is (part of) a circle of radius r.

Exercise 6.5.10. Give an alternate proof of Proposition 6.2.8 by

applying the integral formula for the winding number (Theorem 5.3.4)
to the loop γ′.

Exercise 6.5.11. A smooth curve γ : [0, 1] → C in the plane has the

property that the distance from the origin, |γ(t)|, achieves its max-
imum value R at t = a, where 0 < a < 1. Prove that the absolute
value of its curvature κ at t = a is at least R^{−1}. Is there a
corresponding theorem for the point where |γ(t)| achieves its minimum
value?

Exercise 6.5.12. Show that the formula z ↦ az^n, where a ≠ 0 is a
constant, gives a vector field with an index n singularity at 0 and that
the formula z ↦ a z̄^n gives a vector field with an index −n singularity.
Identify choices of a and n leading to the pictures in Figure 6.3. Draw
similar pictures for index 2 and index −2 singularities.

Exercise 6.5.13. A class is studying the Hopf index theorem (The-

orem 6.4.1).
(a) A misguided student argues as follows: “On the complex plane C
we can consider a constant vector field, which has no singulari-
ties. Identify C with the sphere by stereographic projection; then
we’ll get a vector field on the sphere with no singularities. This
contradicts Hopf’s theorem since the Euler characteristic of the
sphere is 2.” Find the student’s mistake, and draw a picture to
show that, in fact, his example is consistent with Hopf’s theorem.
(b) The Euler characteristic of the torus is 0, so it is consistent with
Hopf’s theorem that there exists a vector field on the torus with-
out singularities. Can you help the students give an example of
such a vector field?
(c) The students are arguing about whether there is a compact ori-
ented surface (without boundary) that has Euler characteristic 0
but does not admit a nonsingular vector field. Some say “yes”;
others say “no”. It turns out that some of them are assuming that
a surface has to have an additional topological property besides
those that were explicitly mentioned, and others are not. What
is this property?
Exercise 6.5.14. Study the paper [31] regarding the topology of
“ridge patterns” (such as those appearing in human fingerprints).
What is the key difference between these “ridge patterns” and the
vector fields that we have investigated in this chapter?
Exercise 6.5.15. Read Jules Verne’s novel Around the World in
Eighty Days [37]. At the end of the book, Phileas Fogg thinks he
has lost his bet, but it turns out that he has not. Explain his mistake
in terms of the results of this chapter.

Chapter 7

The Winding Number

in Functional Analysis

Topological ideas such as the winding number are ubiquitous in mod-

ern mathematics, often showing up in entirely unexpected contexts.
For example, in this chapter the winding number will arise as the
solution to a problem in functional analysis — roughly speaking, the
problem is to count the number of “arbitrary constants” that appear
in the general solution of a certain integral equation. This process —
relating topology to “counting” the solutions of differential or integral
equations — is the central theme in the index theorem of Atiyah and
Singer (see [23]), one of the great unifying mathematical results of
the later twentieth century.
The equations that we look at are going to be defined on an
infinite-dimensional vector space. We will use the theory of Hilbert
spaces throughout this chapter; Hilbert spaces are simple examples
of infinite-dimensional vector spaces, with a geometry that is close
to the familiar Euclidean geometry of finite dimensions. The Hilbert
space theory that we will need is reviewed in Appendix F.

7.1. The Fredholm index

Let V be a vector space and U a subspace of V . Remember (Deﬁni-
tion A.2.6) that the dimension of U is the number of elements in a


basis for U. That is, the dimension of U is the smallest n for which
we can find u_1, . . . , u_n ∈ U such that every u ∈ U can be written
\[ u = \sum_{i=1}^{n} \lambda_i u_i \]
for some scalars λ_i ∈ C.
We will also need the notion of codimension. By definition, the
codimension (Definition A.3.10) of the subspace U of V is the dimen-
sion of an algebraic complement to U , that is, the smallest number n
for which there exist v_1, . . . , v_n ∈ V such that, for every v ∈ V, one
can write
\[ v = \sum_{i=1}^{n} \mu_i v_i + u, \qquad \mu_i \in \mathbb{C},\ u \in U. \]

When everything is ﬁnite-dimensional, the codimension of U in

V is just dim V − dim U . But the codimension may be finite even if
V and U are both infinite-dimensional.
Now let T : V → W be a linear map between vector spaces. Recall
the following definitions:
(a) The kernel and image of T are defined by
Ker T = {v ∈ V : T v = 0},
Im T = {w ∈ W : ∃v ∈ V, T v = w}.
They are vector subspaces of V and W , respectively.
(b) The nullity of T is the dimension of Ker T , and the rank of T is
the dimension of Im T .
(c) The corank of T is the codimension of Im T in W . Similarly, the
conullity of T is the codimension of Ker T in V .
One of the basic results of linear algebra is the “rank-nullity” the-
orem (Theorem A.3.14). One version of its statement is the following.

Theorem 7.1.1. For a linear transformation T : V → W between

ﬁnite-dimensional vector spaces, we have
Nullity T − Corank T = dim V − dim W.

Notice that the usual formulation Nullity T + Rank T = dim V

is equivalent to this one, but our version is more suggestive when we
come to generalize to inﬁnite dimensions. It’s helpful to think of the
rank-nullity theorem in terms of Figure A.2 on page 191, which shows

T as exactly matching up a piece of V (of size Conullity T ) with a

piece of W (of size Rank T ), with “left-over” pieces on each side of
size Nullity T and Corank T , respectively.
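
In finite dimensions the identity of Theorem 7.1.1 can be checked directly on a matrix. The sketch below is only an illustration: it assumes that T is given by a matrix with exact rational entries whose columns are indexed by a basis of V and whose rows are indexed by a basis of W, it computes the nullity as the number of non-pivot columns, and it computes the corank as the nullity of the transpose. All function names are hypothetical.

```python
from fractions import Fraction


def free_columns(matrix):
    """Number of non-pivot columns after Gaussian elimination (the nullity)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0])
    pivots = 0
    for c in range(ncols):
        # Look for a row at or below the current pivot row with a nonzero entry.
        pivot = next((i for i in range(pivots, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for i in range(pivots + 1, len(rows)):
            factor = rows[i][c] / rows[pivots][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivots])]
        pivots += 1
    return ncols - pivots


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def index(matrix):
    """Nullity T - Corank T for the linear map given by `matrix`."""
    nullity = free_columns(matrix)
    corank = free_columns(transpose(matrix))  # codimension of the image in W
    return nullity - corank


if __name__ == "__main__":
    t = [[1, 2, 3], [2, 4, 6]]  # a rank-one map from a 3-dimensional V to a 2-dimensional W
    print(free_columns(t), free_columns(transpose(t)), index(t))
```

Whatever matrix is supplied, the value of `index` is the number of columns minus the number of rows, even though the nullity and corank individually depend on the map.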
An invertible linear map has zero nullity and zero corank. We
will be interested in operators on Hilbert space that are “almost”
invertible, where “almost” is expressed by saying that the nullity and
corank are finite. That is,
Definition 7.1.2. Let V and W be Hilbert spaces and let T : V → W
be a bounded linear operator. We say T is a Fredholm operator if
(a) the kernel of T has finite dimension,
(b) the range of T has finite codimension.

Proposition 7.1.3. The kernel and the range of a Fredholm operator

on Hilbert space are closed subspaces.

Proof. The kernel of any bounded linear operator is closed since it

is the inverse image of a closed set, namely {0}, under a continuous
map (Remark B.2.4). As for the range, let T : V → W be Fredholm.
Let {w_1, . . . , w_n} be a basis for a complement of Im T in W. Define
a bounded linear operator
\[ L : (\operatorname{Ker} T)^{\perp} \oplus \mathbb{C}^n \to W, \qquad (v, \lambda_1, \dots, \lambda_n) \mapsto Tv + \sum_{i=1}^{n} \lambda_i w_i. \]
L is bijective and therefore has a bounded inverse M = L^{−1} by the
closed graph lemma (Lemma F.3.5). Then Im T is the inverse image
M^{−1}((Ker T)^⊥) of the closed subspace (Ker T)^⊥, so it is closed.

Definition 7.1.4. The index Index(T) of a Fredholm operator T is
the difference of dimensions Nullity T − Corank T.

The rank-nullity theorem can be restated as follows: if V, W are

finite-dimensional and T : V → W is a linear map, then Index(T ) =
dim V − dim W . In other words, the index does not depend on T
at all! In particular, an operator from a finite-dimensional vector
space to itself must have index zero. This statement is not true for
maps from an infinite-dimensional space to itself, as the following
important example shows.

Example 7.1.5. Let V = W = ℓ², the Hilbert space of square-summable
sequences (Example F.1.2). Let T : V → W be the linear
operator defined by
\[ T(a_0, a_1, a_2, a_3, \dots) = (0, a_0, a_1, a_2, \dots), \]
called the unilateral shift. Clearly, Nullity T = 0 while Corank T = 1,
so T is a Fredholm operator of index −1. The adjoint operator T*
(the unilateral backward shift) defined by
\[ T^{*}(a_0, a_1, a_2, a_3, \dots) = (a_1, a_2, a_3, a_4, \dots) \]
has index +1.
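
The two shifts are easy to experiment with on finitely supported sequences. The sketch below is an illustration, not a model of all of ℓ²: a sequence is represented by a Python list of its leading terms, with all later terms understood to be zero, and the function names are hypothetical.

```python
def shift(seq):
    """Unilateral shift T: (a0, a1, ...) -> (0, a0, a1, ...)."""
    return [0] + list(seq)


def backward_shift(seq):
    """Backward shift T*: (a0, a1, a2, ...) -> (a1, a2, ...)."""
    return list(seq[1:])


if __name__ == "__main__":
    a = [1, 2, 3]
    print(backward_shift(shift(a)))  # T*T is the identity
    print(shift(backward_shift(a)))  # TT* forgets the entry a0
    print(backward_shift([7]))       # (7, 0, 0, ...) lies in the kernel of T*
```

The output shows the asymmetry behind the index: T*T is the identity, so T has zero nullity, while TT* kills the first coordinate, which accounts for the one-dimensional complement to the image of T and for the one-dimensional kernel of T*.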

The next result gives the key properties of the Fredholm index.

Theorem 7.1.6. Let H be a Hilbert space. Then the space Fred(H)

of Fredholm operators H → H is an open subset of B(H). The index
function
Index : Fred(H) → Z
is constant on the path components of Fred(H), and two Fredholm
operators belong to the same path component of Fred(H) if and only
if they have the same index. If H is inﬁnite-dimensional, all integers
can be obtained as the indices of appropriate Fredholm operators.

These properties should remind you very strongly of the key prop-
erties of the winding number, Theorem 3.2.7. We’ll prove the ﬁrst two
statements (the Fredholm operators form an open set and the index
is constant on path components) in the next section. As for the ﬁnal
statement (all integers can be obtained), we have already seen exam-
ples of Fredholm operators of index ±1 (the unilateral shift and its
adjoint), and by taking powers of these we can obtain Fredholm op-
erators of any integer index. We’re not going to prove in detail that
two Fredholm operators having the same index are in the same path
component, as the argument requires more Hilbert space technology
than I want to develop. But I wanted to state the full result (Theo-
rem 7.1.6) so you can see the closeness of the parallel to the winding
number.

7.2. Atkinson’s theorem

From Appendix F the collection B(H) of all bounded (= continuous)
linear operators on a Hilbert space H is itself a complete normed
vector space. The norm is defined by
\[ \|T\| = \sup\{\|Tx\| : \|x\| \le 1\} = \sup\{|\langle Tx, y\rangle| : \|x\|, \|y\| \le 1\}. \]

Lemma 7.2.1. Let H be a Hilbert space and let S ∈ B(H) with
‖S‖ < 1. Then I − S is an invertible operator.

Proof. Define R by the series
\[ R = I + S + S^2 + \cdots = \sum_{n=0}^{\infty} S^n. \]
If ‖S‖ = s < 1, then ‖S^n‖ ≤ s^n, and so simple estimates show that
the partial sums of the above series form a Cauchy sequence in the
normed vector space B(H). Since B(H) is complete, this Cauchy
sequence converges to a bounded operator R. We have
\[ SR = RS = \sum_{n=1}^{\infty} S^n = R - I, \]
whence R(I − S) = (I − S)R = I as required.
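
The series in this proof can be tried out on a small matrix. The sketch below is a finite-dimensional illustration only: it assumes S is a square matrix of floats with norm less than 1, forms a partial sum of the series, and measures how far R(I − S) is from the identity. The function names and the sample matrix are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neumann_inverse(s, terms=80):
    """Partial sum I + S + ... + S^terms of the series defining R."""
    total = identity(len(s))
    power = identity(len(s))
    for _ in range(terms):
        power = matmul(power, s)
        total = add(total, power)
    return total


def residual(s, terms=80):
    """Largest entry, in absolute value, of R(I - S) - I."""
    n = len(s)
    eye = identity(n)
    i_minus_s = [[eye[i][j] - s[i][j] for j in range(n)] for i in range(n)]
    product = matmul(neumann_inverse(s, terms), i_minus_s)
    return max(abs(product[i][j] - eye[i][j]) for i in range(n) for j in range(n))


if __name__ == "__main__":
    s = [[0.2, 0.3], [0.1, 0.4]]  # sample matrix with norm below 1
    print(residual(s, terms=1), residual(s, terms=80))
```

With a single term the residual is the largest entry of S², and it shrinks geometrically as more terms are added, mirroring the estimate ‖S^n‖ ≤ s^n used in the proof.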

Corollary 7.2.2. The set of invertible operators (on a single Hilbert

space or from one Hilbert space to another ) is open.

Proof. Let T : H_1 → H_2 be invertible, with inverse S. Let ε =
‖S‖^{−1}. If ‖T′ − T‖ < ε, then I − ST′ = S(T − T′) and
I − T′S = (T − T′)S, so ‖I − ST′‖ < 1 and ‖I − T′S‖ < 1,
whence ST′ and T′S are invertible. It follows that T′ is invertible.

Proposition 7.2.3. Let V, W be Hilbert spaces. The set Fred(V, W )

of all Fredholm operators from V to W is an open subset of B(V, W ),
and the index is constant on path components of this open set.

Proof. Let T be Fredholm. We are going to show that there exists
ε > 0 such that if ‖T′ − T‖ < ε, then T′ is Fredholm and has the same
index as T. This will clearly show that the set of Fredholm operators
is open. It also implies that the index is a continuous integer-valued
function on the set of Fredholm operators and hence that it is constant
on path components.
We consider the orthogonal direct sum decompositions (Theo-
rem F.2.4) of V and W given by
V = V0 ⊕ V1 , V0 = Ker(T ), V1 = Ker(T )⊥ ,
W = W0 ⊕ W1 , W0 = Im(T )⊥ , W1 = Im(T ).
Note that V_0 and W_0 are finite-dimensional and Index(T) = dim V_0 −
dim W_0. Every linear transformation from V to W has a 2 × 2 matrix
representation with respect to this decomposition. In particular T
itself has such a representation
\[ T = \begin{pmatrix} 0 & 0 \\ 0 & T_{11} \end{pmatrix}, \]
where T_{11} : V_1 → W_1 is invertible.
Let T′ = T + L, where the perturbation L has norm smaller than
ε = ‖T_{11}^{−1}‖^{−1}, and write
\[ T + L = \begin{pmatrix} L_{00} & L_{10} \\ L_{01} & T_{11} + L_{11} \end{pmatrix}. \]
By our assumption, (T11 + L11 ) is invertible (Corollary 7.2.2). Now
we are going to perform “elementary row and column operations” on
the matrix T + L. This may seem strange because the entries of our
matrix are themselves linear transformations, not numbers as they
are in a ﬁrst course in linear algebra, but in fact everything works
in the same way: an elementary row operation corresponds to multi-
plying on the left by a certain invertible matrix, and an elementary
column operation corresponds to multiplying on the right by a certain
invertible matrix.
The operations we want to carry out are these:
• Add −L_{10}(T_{11} + L_{11})^{−1} times row 2 to row 1; that is, multiply
on the left by
\[ \begin{pmatrix} 1 & -L_{10}(T_{11} + L_{11})^{-1} \\ 0 & 1 \end{pmatrix}. \]
• Add −(T_{11} + L_{11})^{−1}L_{01} times column 2 to column 1; that
is, multiply on the right by
\[ \begin{pmatrix} 1 & 0 \\ -(T_{11} + L_{11})^{-1} L_{01} & 1 \end{pmatrix}. \]

These operations are effected by invertible matrices, so they don’t
change the dimension of the kernel or the codimension of the image
(and in particular they don’t change the Fredholm index). Their
effect is to reduce T + L to the matrix
\[ \begin{pmatrix} L_{00} - L_{10}(T_{11} + L_{11})^{-1} L_{01} & 0 \\ 0 & T_{11} + L_{11} \end{pmatrix}. \]

The index of this diagonal matrix is clearly the sum of the indices
of the diagonal entries. But the ﬁrst diagonal entry is just a linear
transformation between ﬁnite-dimensional vector spaces, so its index
is dim V0 − dim W0 = Index(T ), and the second entry is invertible so
it has index zero. Thus T + L is Fredholm and has the same index as
T , completing the proof.
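
The block row and column operations used in this proof can be verified on small matrices. The sketch below is a finite-dimensional illustration only: every block is taken to be a 2 × 2 matrix of exact fractions, `a` plays the role of T_{11} + L_{11}, and the helper names and sample entries are hypothetical.

```python
from fractions import Fraction


def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    return [[-x for x in row] for row in a]


def inv2(a):
    """Inverse of an invertible 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def block(tl, tr, bl, br):
    """Assemble a 4 x 4 matrix from four 2 x 2 blocks."""
    return [r1 + r2 for r1, r2 in zip(tl, tr)] + [r1 + r2 for r1, r2 in zip(bl, br)]


def reduce_blocks(l00, l10, l01, a):
    """Apply the row operation, then the column operation, to [[l00, l10], [l01, a]]."""
    eye, zero = mat([[1, 0], [0, 1]]), mat([[0, 0], [0, 0]])
    a_inv = inv2(a)
    row_op = block(eye, neg(mul(l10, a_inv)), zero, eye)
    col_op = block(eye, zero, neg(mul(a_inv, l01)), eye)
    return mul(mul(row_op, block(l00, l10, l01, a)), col_op)


def expected_diagonal(l00, l10, l01, a):
    """The block diagonal matrix that the proof says should result."""
    zero = mat([[0, 0], [0, 0]])
    corner = sub(l00, mul(mul(l10, inv2(a)), l01))
    return block(corner, zero, zero, a)


if __name__ == "__main__":
    l00, l10 = mat([[1, 2], [3, 4]]), mat([[0, 1], [1, 0]])
    l01, a = mat([[2, 0], [1, 1]]), mat([[2, 1], [1, 1]])
    print(reduce_blocks(l00, l10, l01, a) == expected_diagonal(l00, l10, l01, a))
```

Because the blocks do not commute, the order of the factors in L_{10}(T_{11} + L_{11})^{−1}L_{01} matters, and the check uses exact arithmetic so that the off-diagonal blocks come out exactly zero.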

Proposition 7.2.4 (Atkinson’s theorem). Let T be a bounded op-

erator on a Hilbert space H. The following conditions are equivalent:

(a) T is Fredholm.
(b) T is invertible modulo ﬁnite-rank operators: there is a bounded
operator S such that I − ST and I − T S are of ﬁnite rank.
(c) T is invertible modulo compact operators: there is a bounded op-
erator S such that I − ST and I − T S are compact operators.

See Definition F.3.3 for the definitions of “finite rank” and “com-
pact” operators. For those who are familiar with the terminology,
Atkinson’s theorem can be expressed as follows: an operator T is
Fredholm if and only if its image in the quotient algebra B(H)/K(H)
is invertible. This quotient (called the Calkin algebra Q(H)) is an
important object in operator algebra theory.

Proof. Suppose that T is Fredholm, (a). Then T maps the orthogonal
complement (Ker(T))^⊥ bijectively onto Im(T). Let Q be the
inverse map from Im(T) to (Ker(T))^⊥; by the closed graph lemma
(Lemma F.3.5), Q is a bounded operator. Let P be the orthogonal
projection from H onto the closed subspace¹ Im(T) and let S = QP.

¹ This projection exists by Theorem F.2.4.

Then by construction, I − T S and I − ST are the orthogonal projec-

tions onto Im(T )⊥ and Ker(T ), respectively. Since these are finite-
dimensional, the associated projections have finite rank. Thus T is
invertible modulo finite-rank operators, (b).
It is obvious that (b) implies (c). Suppose (c), that T is invertible
modulo compacts, and let S be such that I − ST ∈ K and I − TS ∈ K.
There is a finite-rank operator F such that ‖I − ST − F‖ < 1/2. By
Lemma 7.2.1, this implies that ST + F is invertible. Consider now
the identity map
\[ I = (ST + F)^{-1}(ST + F) \]
when restricted to the kernel of T; the restriction of ST + F to Ker(T)
has finite rank, whence the restriction of I to Ker(T) has finite rank,
and thus Ker(T) is finite-dimensional. Similarly there is a finite-rank
operator F′ such that TS + F′ is invertible. The equation
\[ v = (TS + F')(TS + F')^{-1} v = TS(TS + F')^{-1} v + F'(TS + F')^{-1} v \]
shows that Im(T) + Im(F′) = H and, since Im(F′) is finite-dimensional,
this shows that Im(T) has finite codimension. Thus T is Fredholm,
(a), as required.

Corollary 7.2.5. Let T be a Fredholm operator and K a compact

operator. Then T + K is Fredholm and has the same index as T .

Proof. Any inverse for T modulo compacts is also an inverse for

T +K modulo compacts, so T +K is Fredholm by Atkinson’s theorem.
The linear path s ↦ T + sK shows that T and T + K belong to the
same path component of the space of Fredholm operators, so they
have the same index.

Proposition 7.2.6. If T1 , T2 are Fredholm operators on a Hilbert

space H, then so is their composite T1 T2 , and moreover

Index(T1 T2 ) = Index(T1 ) + Index(T2 ).

Proof. It follows from Atkinson’s theorem that the composite of

Fredholm operators is Fredholm. To prove the formula for the index,
choose an operator S2 that is an inverse for T2 modulo compacts.

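Consider the one-parameter family of 2 × 2 matrices (operators on
H ⊕ H)
\[ V_s = \begin{pmatrix} T_2 \cos(\pi s/2) & I \sin(\pi s/2) \\ -I \sin(\pi s/2) & S_2 \cos(\pi s/2) \end{pmatrix}, \qquad s \in [0, 1]. \]
These are all invertible modulo compacts (hence Fredholm) with
\[ V_0 = \begin{pmatrix} T_2 & 0 \\ 0 & S_2 \end{pmatrix}, \qquad V_1 = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \]
Note that Index(V_0) = Index(T_2) + Index(S_2), whereas V_1 is invertible
so has index 0; therefore, Index(T_2) = − Index(S_2). Consider now the
path of operators
\[ W_s = \begin{pmatrix} T_1 & 0 \\ 0 & I \end{pmatrix} V_s. \]
This is also a continuous path of Fredholm operators with
\[ W_0 = \begin{pmatrix} T_1 T_2 & 0 \\ 0 & S_2 \end{pmatrix}, \qquad W_1 = \begin{pmatrix} 0 & T_1 \\ -I & 0 \end{pmatrix}. \]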

The equality Index(W0 ) = Index(W1 ) now gives

Index(T1 T2 ) + Index(S2 ) = Index(T1 ),

which, together with Index(S2 ) = − Index(T2 ), implies the desired

result.

7.3. Toeplitz operators

There is a natural construction that gives rise to Fredholm operators
related to the Hilbert space L2 (S 1 ) of square-integrable functions on
the circle. We recall that this Hilbert space has an orthonormal basis
given by the elementary trigonometric functions

\[ e_n(t) = e^{int}, \qquad t \in [0, 2\pi], \]

for n ∈ Z. The coeﬃcients of a function with respect to this orthonor-

mal basis are called its Fourier coeﬃcients; see Deﬁnition F.1.6.

Suppose that g is a continuous function on the circle (i.e., a continuous
map [0, 2π] → C with g(0) = g(2π)). The multiplication
operator M_g on H is defined by
\[ M_g(f) = gf \qquad \forall f \in H = L^2(S^1). \]

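Proposition 7.3.1. The multiplication operator M_g is bounded, with
norm
\[ \|M_g\| \le \sup\{|g(x)| : x \in [0, 2\pi]\}. \]

Proof. Let m = sup{|g(x)| : x ∈ [0, 2π]}. If f ∈ L²(S¹), we have
\[ \|M_g f\|^2 = \frac{1}{2\pi} \int_0^{2\pi} |g(t)|^2 |f(t)|^2 \, dt \le \frac{m^2}{2\pi} \int_0^{2\pi} |f(t)|^2 \, dt = m^2 \|f\|^2, \]
where the norm is normalized so that the functions e_n are orthonormal.
This shows that ‖M_g f‖ ≤ m‖f‖ as required.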

Remark 7.3.2. In fact, it is not hard to prove that we have equality
in Proposition 7.3.1: ‖M_g‖ = sup{|g(x)| : x ∈ [0, 2π]}. We will not
need this, however.

We now ask: what is the (inﬁnite-by-inﬁnite) matrix of the mul-

tiplication operator Mg with respect to the Fourier basis {en }? This
question is particularly simple to answer when g is a trigonometric
polynomial (that is, a ﬁnite linear combination of the en ). If g(t) is
such a polynomial, of the form

\[ g(t) = \sum_{n=-N}^{N} c_n e^{int}, \]
then we clearly have
\[ M_g e^{ikt} = \sum_{n=-N}^{N} c_n e^{i(k+n)t}. \]
This proves the special case (where g is a trigonometric polynomial)
of the following proposition.

Proposition 7.3.3. Let g be a continuous function on S¹. The matrix
of M_g with respect to the trigonometric basis is
\[
\begin{bmatrix}
\ddots &     &     &        &        &        \\
       & c_1 & c_0 & c_{-1} & c_{-2} & c_{-3} \\
       & c_2 & c_1 & c_0    & c_{-1} & c_{-2} \\
       & c_3 & c_2 & c_1    & c_0    & c_{-1} \\
       &     &     &        &        & \ddots
\end{bmatrix},
\]
where the {c_n} are the Fourier coefficients of g.

Proof. The calculations above prove this when g is a trigonometric

polynomial. The general case follows by the Weierstrass approxima-
tion theorem (Theorem C.1.5): the trigonometric polynomials are
(uniformly) dense among all continuous functions.

Definition 7.3.4. The Hardy space H²(S¹) is the closed subspace
of L²(S¹) comprised of those functions f ∈ L²(S¹) all of whose negative
Fourier coefficients are zero; in other words, those for which
⟨f, e_n⟩ = 0 for n < 0. The Hardy projection P is the orthogonal
projection onto Hardy space; thus Pf has the same Fourier coefficients
as f for n ≥ 0, and zero Fourier coefficients for n < 0.

Remark 7.3.5. If we think of f as an L² function of z = e^{it} defined

on the unit circle in the complex plane, then the functions in the
Hardy space are precisely those that involve only nonnegative powers
of z and that therefore extend to holomorphic functions deﬁned on
the unit disc.

Deﬁnition 7.3.6. Let g be a continuous function on the circle. The

Toeplitz operator with symbol g is the operator on the Hardy space
deﬁned by
Tg = P Mg ;
in other words, to compute Tg f , we ﬁrst multiply f by g and then
project the result back into the Hardy space.

From our discussion above we can see that the matrix of a Toeplitz
operator with symbol g is the truncation of the matrix given in
Proposition 7.3.3 for the corresponding multiplication operator. That is,
\[
\begin{bmatrix}
c_0    & c_{-1} & c_{-2} & \cdots \\
c_1    & c_0    & c_{-1} &        \\
c_2    & c_1    & c_0    &        \\
\vdots &        &        & \ddots
\end{bmatrix}. \tag{7.3.7}
\]
Whereas the diagonals of the “multiplication matrix” of Proposi-
tion 7.3.3 were “two-way infinite”, the diagonals of the corresponding
“Toeplitz matrix” (7.3.7) are only “one-way infinite”. This gives rise
to some edge effects when we multiply Toeplitz matrices. It turns out
that these “edge effects” are represented by compact operators:
Proposition 7.3.8 (Symbolic calculus). Let g_1 and g_2 be continuous
functions on S¹. Then
\[ T_{g_1 g_2} - T_{g_1} T_{g_2} \in K; \]
in other words, the assignment g ↦ T_g is a homomorphism modulo
compact operators.

Proof. Let P denote the Hardy projection. I claim that for any
continuous g the commutator
[P, Mg ] := P Mg − Mg P
is compact. This will be enough since then
Tg1 Tg2 = P Mg1 P Mg2 ∼ P P Mg1 Mg2 = P Mg1 g2 = Tg1 g2 ,
where the notation ∼ denotes “equality modulo compacts”.
To prove the claim, consider the collection C of all continuous
functions g that satisfy it. Clearly C is a vector space, and the identity
[P, AB] = [P, A]B + A[P, B]
shows that it is closed under multiplication. A direct computation
shows that $[P, M_g]$ is a rank-one operator when $g(t) = e^{\pm it}$. It follows
that C contains all trigonometric polynomials. Now for a general g let
$g_n$ be a sequence of trigonometric polynomials converging uniformly
to g (the existence of such a sequence is guaranteed by the Weierstrass
approximation theorem (Theorem C.1.5)). Then the operators $M_{g_n}$
converge to $M_g$ by Proposition 7.3.1 and therefore the commutators
$[P, M_{g_n}]$ converge to $[P, M_g]$. Thus $[P, M_g]$ is a limit of compact
operators, which is compact, as required.
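The "direct computation" mentioned in the proof can be written out as follows. This is a sketch under the assumption that $e_n(t) = e^{int}$, $n \in \mathbb{Z}$, denotes the standard orthonormal basis of $L^2(S^1)$, with the Hardy space spanned by the $e_n$ with $n \ge 0$.

```latex
% Case g(t) = e^{it}: M_g e_n = e_{n+1}.
\[
[P, M_g]\,e_n = P e_{n+1} - M_g P e_n =
\begin{cases}
e_{n+1} - e_{n+1} = 0 & (n \ge 0), \\
e_0 - 0 = e_0 & (n = -1), \\
0 - 0 = 0 & (n \le -2),
\end{cases}
\qquad\text{so}\qquad
[P, M_g] f = \langle f, e_{-1} \rangle\, e_0 .
\]
% Case g(t) = e^{-it}: M_g e_n = e_{n-1}.
\[
[P, M_g]\,e_n = P e_{n-1} - M_g P e_n =
\begin{cases}
e_{n-1} - e_{n-1} = 0 & (n \ge 1), \\
0 - e_{-1} = -e_{-1} & (n = 0), \\
0 - 0 = 0 & (n \le -1),
\end{cases}
\qquad\text{so}\qquad
[P, M_g] f = -\langle f, e_0 \rangle\, e_{-1} .
\]
```

In both cases the commutator has one-dimensional range, which is the rank-one property used above.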

7.4. The Toeplitz index theorem

Now we can put all the ingredients together to relate topology and
analysis. Let g be a continuous complex-valued function on S 1 .
Proposition 7.4.1. If the function g is nowhere-vanishing, then the
Toeplitz operator Tg is a Fredholm operator.

Proof. Since g does not vanish, the function g −1 is deﬁned and con-
tinuous everywhere. By the symbolic calculus of Proposition 7.3.8,
Tg Tg−1 and Tg−1 Tg are equal modulo compacts to T1 = I. Thus Tg is
invertible modulo compacts, so it is Fredholm by Atkinson’s theorem
(Proposition 7.2.4).

Since Tg is a Fredholm operator when g is nowhere-vanishing, we

can ask about its index.
Theorem 7.4.2 (Toeplitz index theorem). Let g : S 1 → C \ {0}
be a nowhere-vanishing function. Then
Index(Tg ) = − wn(g, 0)
where we consider g as a loop that does not pass through 0.
Example 7.4.3. Suppose that $g(t) = e^{it}$. Then for each of the basis
elements e0 , e1 , e2 , . . . of the Hardy space we have
Tg en = en+1 .
Thus Tg is in fact the unilateral shift (Example 7.1.5) and has index
−1. On the other hand, the path g described is just the unit circle
traversed once in the positive direction, so wn(g, 0) = +1.
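The right-hand side of the theorem is easy to evaluate numerically for a given symbol. The sketch below is an illustration only: the function names `winding_number` and `predicted_toeplitz_index` and the sampling scheme are assumptions, and the method (summing small changes of argument along a finely sampled loop) is reliable only when consecutive samples are close together and the loop stays away from 0.

```python
import cmath
import math


def winding_number(g, samples=2000):
    """Approximate wn(g, 0) for a loop g: [0, 2*pi] -> C \\ {0}.

    Adds up the change of argument between consecutive sample points.
    """
    total = 0.0
    prev = g(0.0)
    for k in range(1, samples + 1):
        cur = g(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)   # small signed angle from prev to cur
        prev = cur
    return round(total / (2 * math.pi))


def predicted_toeplitz_index(g):
    """Index of T_g as given by the Toeplitz index theorem."""
    return -winding_number(g)


print(predicted_toeplitz_index(lambda t: cmath.exp(1j * t)))       # unilateral shift
print(predicted_toeplitz_index(lambda t: 2 + cmath.exp(1j * t)))   # loop not encircling 0
```

The first symbol is the one of Example 7.4.3; the second is a loop that does not wind around the origin, so the predicted index is zero.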

Proof. Just as in the proof of Theorem 5.3.4, our argument will

follow the three-stage “standard operating procedure”:
(a) Show that Index Tg depends only on the homotopy class of g
(among maps S 1 → C \ {0}).

(b) Show that the index is multiplicative: Index Tg1 g2 = Index Tg1 +
Index Tg2 .
(c) Deduce that the index is a multiple of the winding number; ﬁx
the multiple by computing one example.
To prove (a), we notice that Tg “depends continuously on g”:
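specifically, if $\sup\{|g_1(x) - g_2(x)|\} < \varepsilon$, then
\[ \|M_{g_1} - M_{g_2}\| = \|M_{g_1 - g_2}\| < \varepsilon \]
and it follows that
\[ \|T_{g_1} - T_{g_2}\| = \|P(M_{g_1} - M_{g_2})\| \le \|P\|\,\|M_{g_1} - M_{g_2}\| = \|M_{g_1 - g_2}\| < \varepsilon \]
since $\|P\| = 1$. It follows that if $s \mapsto g_s$ is a homotopy of maps
$S^1 \to \mathbb{C} \setminus \{0\}$, then $s \mapsto T_{g_s}$ is a continuous path of Fredholm operators, and
therefore that $\operatorname{Index} T_{g_s}$ does not depend on s by Proposition 7.2.3.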
To prove item (b), notice that Tg1 g2 is equal modulo compacts
to Tg1 Tg2 , by the symbolic calculus of Proposition 7.3.8, and these
two operators therefore have the same index by Corollary 7.2.5. By
Proposition 7.2.6, Index(Tg1 Tg2 ) = Index(Tg1 ) + Index(Tg2 ).
Now Theorem 3.2.7 and Lemma 3.2.6 tell us that loops in C \ {0}
are classiﬁed up to homotopy by their winding numbers, with the
pointwise product of loops corresponding to the addition of winding
numbers and each loop being homotopic to some power of the basic
loop z → z. It follows that any integer-valued, multiplicative homo-
topy invariant of loops in C \ {0} is of the form g → k wn(g, 0), where
the constant k is the value of the invariant on the basic loop. In the
case of the Toeplitz index, we have already carried out the calculation
of k in Example 7.4.3 above.
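As a check on the sign, the theorem can be verified directly for the powers of the basic loop. The computation below is a sketch that assumes the convention $\operatorname{Index}(T) = \dim \operatorname{Ker}(T) - \dim \operatorname{Coker}(T)$ (consistent with the unilateral shift having index $-1$) and writes $e_k(t) = e^{ikt}$ for the basis of the Hardy space.

```latex
% Symbol g(t) = e^{int}, so wn(g, 0) = n.
\[
n \ge 0:\quad T_g e_k = e_{k+n}\ (k \ge 0), \qquad
\operatorname{Ker} T_g = 0, \qquad
\operatorname{Coker} T_g \cong \operatorname{span}\{e_0, \dots, e_{n-1}\}, \qquad
\operatorname{Index} T_g = 0 - n = -n .
\]
\[
n < 0:\quad T_g e_k =
\begin{cases} e_{k+n} & (k \ge |n|), \\ 0 & (k < |n|), \end{cases} \qquad
\operatorname{Ker} T_g = \operatorname{span}\{e_0, \dots, e_{|n|-1}\}, \qquad
\operatorname{Coker} T_g = 0, \qquad
\operatorname{Index} T_g = |n| = -n .
\]
```

In both cases $\operatorname{Index}(T_g) = -\operatorname{wn}(g, 0)$, in agreement with Theorem 7.4.2.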

Instead of single (“scalar”) Toeplitz operators, as earlier, we can

consider matrix Toeplitz operators, i.e., n×n matrices whose elements
are Toeplitz operators. The symbol of such a Toeplitz operator is an
n × n matrix of functions on the circle, i.e., a map S 1 → Mn (C), and
the operator is Fredholm if the symbol is invertible, i.e., is a map to
the group GL(n, C) of invertible matrices. To conclude this section,
let’s state the index theorem for matrix Toeplitz operators.

Theorem 7.4.4. Let ϕ : S 1 → GL(n, C) be a continuous, matrix-

valued symbol, and let Tϕ be the corresponding matrix Toeplitz oper-
ator. Then Tϕ is Fredholm, and its index is given by
Index Tϕ = − wn(det ϕ, 0)
where det ϕ is the path in C \ {0} given by the determinant² of ϕ.

Sketch of the proof. Induction on n, the n = 1 case being the

theorem that we have established already.
By what we have already proved, the index depends only on the
homotopy class of the symbol in the space of maps S 1 → GL(n, C).
So to give the inductive step, it suffices to prove that any map $S^1 \to GL(n, \mathbb{C})$
is homotopic, in the space of such maps, to a map of the form
\[ \begin{bmatrix} \varphi' & 0 \\ 0 & 1 \end{bmatrix}, \]
where $\varphi'$ is a map $S^1 \to GL(n-1, \mathbb{C})$. Such a
homotopy will preserve both the left- and the right-hand sides of the
theorem (the index and the winding number of the determinant) and
will reduce the n-dimensional case (for $\varphi$) to the (n − 1)-dimensional
case (for $\varphi'$), which we may assume solved by induction.
The idea is to use row and column operations again, but there is a
significant difficulty. To cancel the last row and column we have to get
an element in the bottom-right corner which is invertible and so can
act as a “pivot”. In ordinary linear algebra this is no problem: some
element is certainly nonzero and we can permute rows and columns to
get it to the bottom right. But here we are doing linear algebra with
matrix entries that are functions on S 1 , and it may well be that even
though the whole matrix is invertible for every parameter value, there
is no individual matrix element that does not vanish somewhere. The
path of rotation matrices $t \mapsto \begin{bmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{bmatrix}$ gives a practical example
of this.
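The rotation example can be checked numerically. The following sketch (with hypothetical helper names `rotation` and `det2`, and an arbitrary choice of 360 sample points) confirms that the determinant stays equal to 1 along the loop while each individual matrix entry comes arbitrarily close to zero at some sample.

```python
import math


def rotation(t):
    """Rotation matrix through the angle t, as a list of rows."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


samples = [2 * math.pi * k / 360 for k in range(360)]

# Determinant along the loop: the matrix is invertible for every t.
dets = [det2(rotation(t)) for t in samples]

# Smallest absolute value taken by each entry along the loop.
smallest = [[min(abs(rotation(t)[i][j]) for t in samples) for j in range(2)]
            for i in range(2)]

print(max(abs(d - 1) for d in dets))
print(smallest)
```

Since the determinant is constant, its winding number is 0, so by Theorem 7.4.4 the corresponding matrix Toeplitz operator has index 0 even though no single entry could serve as a pivot for all t.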
So here’s how we proceed. Take our loop $\varphi = \varphi(t)$ and let $v = v(t) \in \mathbb{C}^n$
be the last column of $\varphi$. This is a nowhere-vanishing vector
(if it vanished somewhere, $\varphi$ would not be invertible at that point)
and by rescaling $\varphi(t)$ by $\|v(t)\|^{-1}$ (which only changes things by a
homotopy) we may assume that v(t) is a unit vector for all t. Thus
v is actually a loop in the sphere $S^{2n-1}$ of unit vectors in $\mathbb{C}^n$.

² See Definition A.6.5.

This sphere is simply connected (Example 2.3.6). That is to

say, there is a homotopy of loops in the sphere from the constant
loop (0, 0, . . . , 0, 1)T to the loop v(t). Now we appeal to a lifting
property, rather like the crucial property of Proposition 3.1.5 of the
exponential map. This is called the fibration property of the map
c : GL(n, C) → Cn \ {0} that sends an invertible matrix to its last
column (Definition 9.1.9). Here, it allows us to “lift” the homotopy
of the final column (given to us by the simple-connectedness of the
sphere) to a homotopy of the whole matrix. This then can be used to
show that a nonzero pivot can be found and therefore that elemen-
tary row and column operations will reduce the matrix of ϕ to the
block form required. More details of this argument can be found in
Lemma 9.2.1.

7.5. Exercises
Exercise 7.5.1. Let H be a Hilbert space and suppose that E and
F are closed subspaces with $E \subseteq F$. Show that $F^\perp \subseteq E^\perp$ and that
the codimension of E in F is the same as the codimension of $F^\perp$ in
$E^\perp$.
Exercise 7.5.2. If T is a Fredholm operator on Hilbert space, show
that its adjoint T ∗ is a Fredholm operator also and Index(T ∗ ) =
− Index(T ).
Exercise 7.5.3. An operator T on a Hilbert space is normal if it
commutes with its adjoint: $T T^* = T^* T$.
(i) If T is a normal Fredholm operator, show that its index is zero.
(ii) An operator T is essentially normal if the diﬀerence T T ∗ − T ∗ T
is compact. Show that the sum of a normal operator and a
compact operator is essentially normal.
(iii) Show that there exist essentially normal operators that cannot
be expressed as the sum of a normal operator and a compact
operator.
Elucidating the phenomenon in (iii) above leads one to the Brown-
Douglas-Fillmore theory [11], an important link between operator
theory and topology.

Exercise 7.5.4. This exercise gives an alternative proof of the result

of Proposition 7.2.6 that if S, T are Fredholm operators on a Hilbert
space H, then so is their composite ST , and that
Index(ST ) = Index(S) + Index(T ).
It does not involve the matrix homotopies used in the text, but it
does need some more ideas from linear algebra (quotient spaces and
the associated isomorphism theorems).
(i) Show that the sequences

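\[ 0 \to \operatorname{Ker}(T) \to \operatorname{Ker}(ST) \xrightarrow{\;T\;} \operatorname{Ker}(S) \cap \operatorname{Im}(T) \to 0 \]
and
\[ 0 \to \frac{H}{\operatorname{Ker}(S) + \operatorname{Im}(T)} \xrightarrow{\;S\;} \frac{H}{\operatorname{Im}(ST)} \to \frac{H}{\operatorname{Im}(S)} \to 0 \]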
are exact. (See Deﬁnition A.3.17 for the terminology.)
(ii) Count dimensions in these exact sequences (Theorem A.3.18)
and use the “second isomorphism theorem”
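\[ \frac{\operatorname{Ker}(S) + \operatorname{Im}(T)}{\operatorname{Im}(T)} \cong \frac{\operatorname{Ker}(S)}{\operatorname{Ker}(S) \cap \operatorname{Im}(T)} \]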
to complete the proof of the desired result.
Exercise 7.5.5. Fredholm was originally interested in solutions to
integral equations of the form $f(x) - \int k(x, y) f(y)\,dy = g(x)$: for our
purposes these can be written (I + K)f = g, where I is the iden-
tity and K a compact operator (on some Hilbert space). In these
circumstances he formulated what came to be called The Fredholm
Alternative³, namely the statement that the following conditions are
equivalent:
• Solutions always exist to the inhomogeneous problem; i.e.,
for every g there exists f such that (I + K)f = g.
• Solutions to the homogeneous problem are unique; i.e., the
only f such that (I + K)f = 0 is f = 0.
Prove the Fredholm alternative as a consequence of our general results
on Fredholm operators.
³ This really should have been the title of a book by Robert Ludlum.

Exercise 7.5.6. Let H denote the Hardy space H 2 (S 1 ). Consider

the Toeplitz algebra T of operators on H: that is, the smallest closed
subset of B(H) which contains all the Toeplitz operators Tf and
satisfies
A, B ∈ T, λ, μ ∈ C =⇒ λA + μB ∈ T, AB ∈ T.
Show that T contains all the compact operators.
Exercise 7.5.7. Let H be a Hilbert space. You are given the fol-
lowing facts: Among the compact operators on H there is a subclass
called the traceable operators. The traceable operators form an ideal
in B(H) which contains every finite-rank operator. If T is a traceable
operator and $\{e_n\}$ is a complete orthonormal set, the “sum of diagonal
matrix entries” $\sum_n \langle T e_n, e_n \rangle$ converges absolutely to a number
Tr(T ), the trace of T , which depends only on T (not on the choice
of orthonormal set). Finally, if A and B are bounded operators such
that AB and BA are both traceable, then Tr(AB) = Tr(BA).
(a) Show that if T is a Fredholm operator, there is a “parametrix” S
such that I − ST and I − T S are traceable.
(b) Show that if S is such a parametrix, then
Index(T ) = Tr(I − ST ) − Tr(I − T S).
(c) (Challenge) Let f be a function on the circle and let P denote
the Hardy projection. It is known that if f is sufficiently smooth,
then the commutator [Mf , P ] is a traceable operator on L2 (S 1 ).
Assuming this, prove
\[ \operatorname{Index}(T_f) = \operatorname{Tr}\bigl(M_{f^{-1}} [P, M_f]\bigr), \]
where [A, B] denotes the commutator AB − BA.
Alain Connes noticed that if you think of the trace as a “noncommu-
tative” version of the integral and the commutator as a “noncommu-
tative” version of the exterior derivative d, the result in (c) looks a lot
like the integral formula for the winding number, $(1/2\pi i) \int f^{-1}\,df$.
This became one of the foundations of his theory of noncommutative
geometry, developed in the book [13].

Chapter 8

Coverings and
the Fundamental Group

8.1. The fundamental group

The winding number project was about understanding and classify-
ing loops in the metric space C \ {0}. We can generalize this to an
arbitrary metric space, as follows.

Deﬁnition 8.1.1. Let X be a metric space, with a basepoint x0 . The

fundamental group π1 (X, x0 ) is the collection of homotopy classes of
loops in X based at x0 , that is, of continuous maps γ : [0, 1] → X
having γ(0) = γ(1) = x0 .

Remark 8.1.2. Note that an element of the fundamental group is

not a single loop, but an entire homotopy class of such loops. For
example, the fundamental group of the punctured plane has one el-
ement for each integer n, and that element is the homotopy class
consisting of all the loops (with the speciﬁed basepoint) that have
winding number n.

The object π1 (X, x0 ) was invented by Poincaré. It is called the

fundamental group because it can be equipped with a “multiplication”
that satisﬁes the laws of an abstract group. We made use of such a
multiplication (the pointwise product of loops) on several occasions


when studying the winding number, but this multiplication involved

the arithmetic of complex numbers — the product on C \ {0} — and
no analog to this is available on a general metric space X. However,
it turns out that the concatenation of loops provides an acceptable
substitute. Recall from Remark 3.2.9 the following deﬁnition:

Deﬁnition 8.1.3. Suppose that γ1 and γ2 are loops in X based at

x0 . Their concatenation is the loop γ1 ∗ γ2 deﬁned by

\[
t \mapsto
\begin{cases}
\gamma_1(2t) & (t \le \tfrac12), \\
\gamma_2(2t - 1) & (t > \tfrac12).
\end{cases}
\]

Because homotopies between loops can be concatenated in the

same way as the loops themselves, concatenation passes to homotopy
classes and deﬁnes a binary operation ∗ on π1 (X, x0 ).
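Concatenation is straightforward to express in code. The sketch below works in the punctured plane, with loops given as functions on [0, 1]; the names `concatenate`, `power_loop`, and `winding_number` are illustrative assumptions, and the winding number is approximated by summing small changes of argument along a sampled loop. It illustrates that concatenating loops adds their winding numbers.

```python
import cmath
import math


def concatenate(gamma1, gamma2):
    """The loop gamma1 * gamma2: first gamma1 at double speed, then gamma2."""
    def loop(t):
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return loop


def power_loop(n):
    """The loop t -> exp(2*pi*i*n*t), based at 1, with winding number n."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


def winding_number(loop, samples=4000):
    """Approximate winding number about 0 of a loop [0, 1] -> C \\ {0}."""
    total = 0.0
    prev = loop(0.0)
    for k in range(1, samples + 1):
        cur = loop(k / samples)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


print(winding_number(concatenate(power_loop(1), power_loop(2))))
```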

Proposition 8.1.4. The operation of concatenation makes π1 (X, x0 )

into a group (Deﬁnition G.2.1). That is:
(a) Concatenation is associative: g1 ∗ (g2 ∗ g3 ) = (g1 ∗ g2 ) ∗ g3 .
(b) The class e of the constant path at x0 acts as an identity element:
e ∗ g = g = g ∗ e.
(c) Inverses exist: for each g there is g −1 such that g∗g −1 = g −1 ∗g =
e.

Proof. The proof uses the notion of reparameterization of a loop. Re-

call (Example 2.2.3) that we deﬁne a reparameterization of a path to
be its composition with any continuous map u : [0, 1] → [0, 1] having
u(0) = 0 and u(1) = 1. If two loops are related by reparameterization,
they are homotopic (via a linear homotopy).
Let γ1 , γ2 , γ3 be loops. The loops γ1 ∗ (γ2 ∗ γ3 ) and (γ1 ∗ γ2 ) ∗ γ3
are related by reparameterization using the piecewise linear map
\[
u(t) =
\begin{cases}
\tfrac12 t & (0 \le t \le \tfrac12), \\
t - \tfrac14 & (\tfrac12 \le t \le \tfrac34), \\
2t - 1 & (\tfrac34 \le t \le 1),
\end{cases}
\]

whose graph is shown in Figure 8.1.

Figure 8.1. Reparameterization used to prove associativity in $\pi_1$.
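The associativity reparameterization can be tested numerically. In the sketch below (function names are illustrative assumptions, and the three test loops are arbitrary loops in the punctured plane based at 1), the loop $\gamma_1 * (\gamma_2 * \gamma_3)$ is compared pointwise with $(\gamma_1 * \gamma_2) * \gamma_3$ precomposed with the piecewise linear map u.

```python
import cmath
import math


def concatenate(gamma1, gamma2):
    """The loop gamma1 * gamma2 on [0, 1]."""
    def loop(t):
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return loop


def u(t):
    """Piecewise linear reparameterization used for associativity."""
    if t <= 0.5:
        return t / 2
    if t <= 0.75:
        return t - 0.25
    return 2 * t - 1


def power_loop(n):
    """The loop t -> exp(2*pi*i*n*t), based at 1."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


g1, g2, g3 = power_loop(1), power_loop(2), power_loop(3)
right = concatenate(g1, concatenate(g2, g3))   # g1 * (g2 * g3)
left = concatenate(concatenate(g1, g2), g3)    # (g1 * g2) * g3

# Largest pointwise discrepancy between right(t) and left(u(t)).
gap = max(abs(right(k / 1000) - left(u(k / 1000))) for k in range(1001))
print(gap)
```

The discrepancy is at the level of floating-point rounding, so the two bracketings trace the same path at different speeds.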

Similarly if γ is a loop and e is the constant loop, the loops γ and

γ ∗ e are related by reparameterization using the piecewise linear map

\[
u(t) =
\begin{cases}
2t & (0 \le t \le \tfrac12), \\
1 & (\tfrac12 \le t \le 1),
\end{cases}
\]
and γ and e ∗ γ are related by a similar reparameterization.
Finally, the reverse loop to a loop $\gamma$ is defined by $\bar\gamma(t) = \gamma(1 - t)$.
The concatenation $\gamma * \bar\gamma$ is homotopic to the constant loop via the
homotopy
\[
h(s, t) =
\begin{cases}
\gamma(2 \min\{t, 1 - s\}) & (t \le \tfrac12), \\
\gamma(2 \min\{1 - t, 1 - s\}) & (t \ge \tfrac12).
\end{cases}
\]
This completes the proof.
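The homotopy used for inverses can be checked at its two ends. The sketch below uses hypothetical helper names and one arbitrary test loop in the punctured plane; it verifies that the homotopy starts at the concatenation of the loop with its reverse, ends at the constant loop, and keeps the basepoint fixed throughout.

```python
import cmath
import math


def concatenate(gamma1, gamma2):
    """The loop gamma1 * gamma2 on [0, 1]."""
    def loop(t):
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return loop


def reverse(gamma):
    """The reverse loop t -> gamma(1 - t)."""
    return lambda t: gamma(1 - t)


def h(gamma, s, t):
    """Homotopy from gamma * reverse(gamma) (s = 0) to the constant loop (s = 1)."""
    if t <= 0.5:
        return gamma(2 * min(t, 1 - s))
    return gamma(2 * min(1 - t, 1 - s))


def gamma(t):
    """Test loop: the unit circle traversed once, based at 1."""
    return cmath.exp(2j * math.pi * t)


grid = [k / 200 for k in range(201)]
start = concatenate(gamma, reverse(gamma))
start_gap = max(abs(h(gamma, 0, t) - start(t)) for t in grid)
end_gap = max(abs(h(gamma, 1, t) - gamma(0)) for t in grid)
base_gap = max(abs(h(gamma, s, t) - gamma(0)) for s in grid for t in (0, 1))
print(start_gap, end_gap, base_gap)
```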

Example 8.1.5. For a simply connected space X, π1 (X, x0 ) is the

trivial group 0 (with just one element, the identity).

Example 8.1.6. For X = C \ {0}, our calculations of the winding

number (in Theorem 3.2.7) show that $\pi_1(X, x_0) \cong \mathbb{Z}$ (we can take
any basepoint x0 , but for deﬁniteness let’s take x0 = 1).

Remark 8.1.7. How does the group π1 (X, x0 ) depend on the choice
of basepoint? We need to assume that X is path connected to answer
this question. Assuming this, let x0 and x1 be two basepoints and
let ϕ : [0, 1] → X be a path connecting them, with ϕ(0) = x0 and
$\varphi(1) = x_1$; let $\bar\varphi$ be the reversed path ($\bar\varphi(t) = \varphi(1 - t)$). If $\gamma$ is a
loop based at $x_0$, then $\bar\varphi * \gamma * \varphi$ is a loop based at $x_1$, and this process

passes to homotopy classes and gives an isomorphism

π1 (X, x0 ) → π1 (X, x1 ).
Thus “up to isomorphism” the fundamental group of a path-connected
space does not depend on the choice of basepoint. However, note that
there may be many different isomorphisms between π1 (X, x0 ) and
π1 (X, x1 ); the above discussion does not single out a particular one
— if we choose a different path from x0 to x1 , we could in principle
obtain a different isomorphism. See Exercise 8.7.2.

Suppose that (X, x0 ) and (Y, y0 ) are spaces with basepoint and
that f : X → Y is a based map — that is, a continuous map with
f (x0 ) = y0 . Then, for any loop γ : [0, 1] → X in X, the composite
f ◦ γ : [0, 1] → Y is a loop in Y . Moreover, this construction passes
to homotopy classes and so it gives rise to a map f∗ : π1 (X, x0 ) →
π1 (Y, y0 ). It is clear that for g1 , g2 ∈ π1 (X, x0 ),
f∗ (g1 ∗ g2 ) = f∗ (g1 ) ∗ f∗ (g2 );
that is, f∗ is a homomorphism of groups (Deﬁnition G.3.1). It is
called the induced homomorphism associated to the map f .
The construction has the following obvious properties.

Proposition 8.1.8. The construction of the induced homomorphism

is functorial. That is, if f : (X, x0 ) → (Y, y0 ) and g : (Y, y0 ) → (Z, z0 )
are maps of spaces with basepoints, then
g∗ ◦ f∗ = (g ◦ f )∗ : π1 (X, x0 ) → π1 (Z, z0 ).
Moreover, the identity map gets transformed into the identity homo-
morphism. Finally, homotopic maps induce the same homomorphism
on fundamental groups.

The “functorial” language comes from category theory [25]: a

functor transforms both “objects” and “morphisms” from one kind
of mathematical theory to another, while preserving the formal prop-
erties expressed by composition laws. For instance, the fundamental
group functor transforms topology to group theory, replacing spaces
by groups and continuous mappings by homomorphisms. By the use
of functors, problems in one theory (say, topology) can be transformed

into problems in another theory (say, algebra). This is the basic idea
of “algebraic topology”.
Example 8.1.9. Let’s use these ideas to prove the Brouwer ﬁxed-
point theorem yet once more. As we observed before, this is equivalent
to the no-retraction theorem (Theorem 4.1.2): there is no retraction
of the closed disc D2 onto its boundary S 1 .
Well, suppose there was. Such a retraction would amount to a
commutative diagram of spaces and maps
\[
\begin{array}{ccc}
S^1 & \longrightarrow & S^1 \\
 & \searrow \qquad \nearrow & \\
 & D^2 &
\end{array}
\]
where the horizontal map is the identity, the downward diagonal is the
inclusion map, and the upward diagonal is the supposed retraction.
Applying the fundamental group functor would give us a commutative
diagram of groups and homomorphisms
\[
\begin{array}{ccc}
\mathbb{Z} & \longrightarrow & \mathbb{Z} \\
 & \searrow \qquad \nearrow & \\
 & 0 &
\end{array}
\]
where the horizontal map is the identity. But clearly no such diagram
exists (the identity map Z → Z cannot factor through a trivial group
or else it would itself be trivial).
Remark 8.1.10. Let f : (X, x0 ) → (Y, y0 ) be a based map. If there
is a map g : (Y, y0 ) → (X, x0 ) such that f ◦ g is homotopic to the iden-
tity map on Y and g ◦ f is homotopic to the identity map on X, then
we say that f is a homotopy equivalence. From the functorial proper-
ties of the induced homomorphism it follows that if f is a homotopy
equivalence, then f∗ : π1 (X, x0 ) → π1 (Y, y0 ) is an isomorphism.

8.2. Covering and lifting

We are going to consider some special kinds of maps between (metric)
spaces. A map f : X → Y is a surjection if it maps X onto Y , i.e.,
for every y ∈ Y there is x ∈ X with f (x) = y. Two surjections onto

the same space, f1 : X1 → Y and f2 : X2 → Y , are called equivalent

if there is a homeomorphism h : X1 → X2 such that f2 ◦ h = f1 , or in
other words such that the diagram

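\[
\begin{array}{ccc}
X_1 & \xrightarrow{\;h\;} & X_2 \\
 & {\scriptstyle f_1}\searrow \qquad \swarrow{\scriptstyle f_2} & \\
 & Y &
\end{array}
\]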
commutes.

Example 8.2.1. An obvious example of a surjection is a product

surjection, where X is just a product Y × F for some space F and
f : X → Y is the “coordinate” map f (x, ξ) = x. The space F is called
the ﬁber of the product surjection.

Example 8.2.2. We can restrict a surjection to an open subset U ⊆

Y : if f : X → Y is a surjection and U ⊆ Y is open, then f −1 (U ) ⊆ X
is open too. Considering f −1 (U ) and U as metric spaces in their own
right, we obtain a surjection
fU : f −1 (U ) → U,
called the restriction of f to U .

Now we give a key deﬁnition. Recall (Example B.1.9) that a

metric space F is discrete if every subset is open. Equivalently, for
each ξ ∈ F , the inﬁmum
\[ \inf\{d(\xi, \xi') : \xi' \ne \xi\} \]
is strictly positive.

Deﬁnition 8.2.3. A surjection p : E → B is a covering map if there

are a discrete space F and an open cover U of B such that, for each
U ∈ U , the restriction pU : p−1 (U ) → U is equivalent to a product
surjection with ﬁber F . Again, we call F the ﬁber of the covering
map. The cardinality of F is the number of sheets of the covering.

A cover U having the property described in this definition will
be called a trivializing cover for p, and its members trivializing sets.
The two uses of the word “cover”, as in “open cover” and “covering
map”, are unrelated, but both are part of the standard terminology
of the subject.

Figure 8.2. Reprising Figure 1.3 to illustrate the covering
map from R to $S^1$. The figure shows the set $U \subseteq S^1$ (dashed)
and three of the infinitely many components of $e^{-1}(U)$ (heavy
lines).

Consider the exponential map e(t) = exp(2πit), from R to S 1 ,
a familiar friend from earlier chapters. It is a local homeomorphism
but not a global homeomorphism. In fact,

Proposition 8.2.4. The exponential map e : R → S 1 is a covering

map.

Proof. Consider the open subset U of S 1 given by its intersection

with the left-hand half-plane; that is, U = {x+iy : x2 +y 2 = 1, x < 0}.
Notice that a point of U is completely determined by its y-coordinate,
which lies in (−1, 1), and x = −(1 − y 2 )1/2 is a continuous function
of y; so U is homeomorphic to (−1, 1). We have
\[ e^{-1}(U) = \bigcup_{n \in \mathbb{Z}} \left(n + \tfrac14,\; n + \tfrac34\right) \]
and the homeomorphism $e^{-1}(U) \to U \times \mathbb{Z}$ that sends $n + t$, $t \in (\tfrac14, \tfrac34)$,
to $((\cos 2\pi t, \sin 2\pi t), n)$ implements an equivalence between
the restriction of e to U and the product surjection $U \times \mathbb{Z} \to U$; see
Figure 8.2.

Similar discussions can be carried out for the intersection of S 1

with the upper, lower, and right-hand half-planes, and the four open
subsets so obtained cover S 1 . This completes the proof.
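The local trivialization over the left-hand half-plane can be made concrete. In the sketch below the function names `trivialize` and `untrivialize` are assumptions made for illustration; the first sends a point of $e^{-1}(U)$ to a pair (point of U, sheet number), and the second inverts it, so that the two maps together realize the equivalence with the product surjection.

```python
import cmath
import math


def e(t):
    """The exponential map e(t) = exp(2*pi*i*t) from R to the unit circle."""
    return cmath.exp(2j * math.pi * t)


def trivialize(s):
    """Send s in e^{-1}(U) to the pair (e(s), n) in U x Z."""
    n = math.floor(s)
    t = s - n
    if not 0.25 < t < 0.75:
        raise ValueError("s does not lie over the left-hand half-plane")
    return e(s), n


def untrivialize(point, n):
    """Inverse map: recover s from a point of U and a sheet number n."""
    t = cmath.phase(point) / (2 * math.pi)   # lies in (-1/2, 1/2]
    if t < 0:
        t += 1.0                             # now in (1/4, 3/4) for points of U
    return n + t


point, sheet = trivialize(3.4)
print(sheet, untrivialize(point, sheet))
```

The first coordinate of `trivialize(s)` is `e(s)` itself, which is exactly the condition that the trivialization commutes with the projection to U.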

One can prove by a similar argument that the exponential map

exp : C → C \ {0} is a covering map.

Example 8.2.5. Fix a nonzero n ∈ N. The map S 1 → S 1 given by

z → z n is a covering map. In this case the ﬁber F has n points.

If p : E → B is a covering map (or any surjection really) and

f : X → B is a map, a lift of f is a map f˜: X → E such that
p ◦ f˜ = f , i.e., such that the diagram
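\[
\begin{array}{ccc}
 & & E \\
 & {\scriptstyle \tilde f}\nearrow & \downarrow{\scriptstyle p} \\
X & \xrightarrow{\;f\;} & B
\end{array}
\]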
commutes.

Theorem 8.2.6 (Homotopy lifting theorem). Let p : E → B be a

covering map. Let X be a compact metric space, and let fs , s ∈ [0, 1],
be a homotopy of maps from X to B. Suppose a lift f˜0 : X → E of f0
is given. Then there is a lift f˜s of the entire homotopy fs , beginning
at f˜0 . Moreover, such a lift is unique.

The theorem can be expressed by the diagram below: the solid

arrows represent the data (homotopy fs and initial lift f˜0 ), and the
dashed arrow represents the lifted homotopy that is to be constructed
so as to make the diagram commute:

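\[
\begin{array}{ccc}
X \times \{0\} & \longrightarrow & E \\
\downarrow & {\scriptstyle \text{(dashed)}}\nearrow & \downarrow{\scriptstyle p} \\
X \times [0, 1] & \longrightarrow & B.
\end{array}
\]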

Corollary 8.2.7. Suppose that p : E → B is a covering space with the

total space E simply connected. Then a loop f : S 1 → B is homotopic
to a constant loop if and only if it lifts to a loop in E.

Proof of the corollary, assuming the theorem. Suppose that f

lifts to F : S 1 → E. By assumption, E is simply connected, so the
loop F is homotopic (in E) to a constant. Composing this homotopy
with the projection p : E → B, we get a homotopy (in B) of f to a
constant loop.
Conversely suppose that f is homotopic to a constant loop. Let
fs be such a homotopy with f0 being constant and f1 = f . The
constant loop f0 can be lifted to a constant loop F0 in E (since p is a
surjection). By Theorem 8.2.6 the homotopy fs lifts to a homotopy
Fs , which starts at F0 and ends at F1 = F , which is the required lift
of f .

Corollary 8.2.7 is the analog for general covering spaces of the

lifting property of the exponential map (Proposition 3.1.5).
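For the exponential covering $e : \mathbb{R} \to S^1$ the lifting can be carried out numerically. The sketch below is an illustration under stated assumptions: the function name `lift_path` is hypothetical, the path is sampled finely enough that consecutive samples are close, and the starting value is assumed to satisfy e(start) = f(0). It lifts a sampled loop step by step and shows that the lift closes up exactly when the loop is null-homotopic.

```python
import cmath
import math


def lift_path(f, start, samples=1000):
    """Lift a path f: [0, 1] -> S^1 through e(t) = exp(2*pi*i*t).

    Returns the sampled lift as a list of real numbers beginning at start.
    """
    lifted = [start]
    prev = f(0.0)
    for k in range(1, samples + 1):
        cur = f(k / samples)
        # Signed fraction of a full turn from prev to cur.
        lifted.append(lifted[-1] + cmath.phase(cur / prev) / (2 * math.pi))
        prev = cur
    return lifted


def once_around(t):
    """The loop e itself: not homotopic to a constant."""
    return cmath.exp(2j * math.pi * t)


def wobble(t):
    """A loop that swings out and back: homotopic to a constant."""
    return cmath.exp(1j * math.sin(2 * math.pi * t))


print(lift_path(once_around, 0.0)[-1])   # lift ends one unit away from its start
print(lift_path(wobble, 0.0)[-1])        # lift returns to its start
```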

Proof of Theorem 8.2.6. Fix a trivializing cover U for B. If Y is

a subset of X, let us call Y modest (as in, “of modest size”) if we
can partition [0, 1] into finitely many closed subintervals [sk , sk+1 ],
$0 = s_0 \le s_1 \le \cdots \le s_m = 1$, such that the restriction of the homotopy
f to Y × [sk , sk+1 ] has image that is a subset of a member of U .
Step (a). We show that for each x ∈ X, there is εx > 0 such that
the closed ball B(x; εx ) is modest. Indeed, let jx : [0, 1] → B be the
path jx (s) = f (s, x). Let δ be a Lebesgue number (Theorem B.3.13)
for the cover jx∗ (U ) of [0, 1]. Choose a partition of [0, 1] into finitely
many closed subintervals [sk , sk+1 ] of length less than δ. By con-
struction, then, for each k there is Uk ∈ U such that the compact set
f ([sk , sk+1 ]×{x}) is contained in Uk . By compactness and by uniform
continuity of f, there is $\varepsilon_k > 0$ such that $f([s_k, s_{k+1}] \times B(x; \varepsilon_k)) \subseteq U_k$.
Take $\varepsilon_x = \tfrac12 \min\{\varepsilon_k\}$; then $B(x; \varepsilon_x)$ is modest.
Step (b). Now we prove the uniqueness statement. It is sufficient
to prove this when X is a point since two liftings which agree when
restricted to each point of X must agree globally. So assume X is
a point (and then omit it from the notation). Now we are in the
situation of a path-lifting problem: we are given a path f : [0, 1] → B,
and we want to know that two liftings $f', f'' : [0, 1] \to E$ of f that
start at the same place ($f'(0) = f''(0)$) must in fact agree everywhere
($f'(s) = f''(s)$ for all s). Let T denote the collection of all those
numbers $t \in [0, 1]$ such that $f'(s) = f''(s)$ for all $s \in [0, t]$. By
hypothesis, T is a nonempty interval with left endpoint 0. Let $t_0 = \sup T$.
There is $\varepsilon > 0$ such that $V = (t_0 - \varepsilon, t_0 + \varepsilon) \cap [0, 1]$ is contained in
$f^{-1}(U)$ for some trivializing set U. Now using the local trivialization,
we may identify $p^{-1}(U) \subseteq E$ with $U \times F$, where F is some discrete
space, and under this identification $f'$ and $f''$ become maps of the
form
\[ f'(s) = (f(s), g'(s)), \quad f''(s) = (f(s), g''(s)), \quad s \in V, \quad g'(s), g''(s) \in F. \]
Since $g'(s) = g''(s)$ for some $s \in V$, this identity must hold for all
s ∈ V (a continuous map from an interval to a discrete space must be
constant). If t0 < 1, then V contains s > t0 , which is a contradiction;
so t0 = 1 and moreover 1 ∈ V , and this is the desired uniqueness
statement.
Step (c). Next we will prove the existence statement under the
assumption that X itself is modest. The proof is quite similar to the
proof of uniqueness. Let now T be the collection of all those numbers
t ∈ [0, 1] for which a continuous lift f˜ of f , starting at the given
f˜0 , exists on X × [0, t]. Clearly T is a nonempty interval with left
endpoint 0. Let t0 = sup T . By the assumption that X is modest,
there is ε > 0 such that if V = (t0 − ε, t0 + ε) ∩ [0, 1], then X × V
is contained in f −1 (U ) for some trivializing set U . Now using the
local trivialization, we may identify p−1 (U ) ⊆ E with U × F , where
F is some discrete space, and under this identiﬁcation the lifting f˜
becomes a map of the form

f˜(x, s) = (f (x, s), g(x)), s ∈ V ∩ [0, t0 ), x ∈ X.

The map g is continuous from X to F ; it does not depend on s ∈

V ∩ [0, t0 ) because a continuous map from an interval to a discrete
space must be constant. However, the displayed equation makes sense
for all s ∈ V and deﬁnes a lifting of f . As before it follows that t0 = 1
and 1 ∈ V , which is the required lifting result.
Step (d). Finally we prove the general case. Since X is compact,
Step (a) shows that it can be covered by ﬁnitely many modest closed
balls. Applying Step (c) to each of these, we get continuous liftings

over each of these modest balls. By Step (b), whenever two modest
balls overlap, the corresponding liftings agree. By the gluing lemma
(Proposition B.4.2), the liftings over modest balls ﬁt together to deﬁne
a continuous lifting on all of X × [0, 1], as required.

Theorem 8.2.8. Let p : E → B be a covering, with E path connected,

and let basepoints e0 ∈ E and b0 ∈ B be chosen with p(e0 ) = b0 . Let
(Y, y0 ) be a path-connected and locally path-connected space and let
f : Y → B be a based map. Then there exists a based lifting g : Y → E
of f if and only if f∗ (π1 (Y, y0 )) ⊆ p∗ (π1 (E, e0 )). Moreover, if it exists,
such a based lifting is unique.

In this theorem, “locally path connected” means that for every

point x ∈ E and every open subset U containing x, there is a path-
connected open subset V with x ∈ V ⊆ U . This condition is satisﬁed
for most “ordinary” spaces, though there are examples for which it
fails. The lifting condition is expressed in the diagram

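\[
\begin{array}{ccc}
 & & (E, e_0) \\
 & {\scriptstyle g}\nearrow & \downarrow{\scriptstyle p} \\
(Y, y_0) & \xrightarrow{\;f\;} & (B, b_0)
\end{array}
\]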

Proof. From the functoriality of the fundamental group, if a lifting

exists, f∗ = p∗ g∗ so Im(f∗ ) ⊆ Im(p∗ ). Thus the condition on the
fundamental groups is necessary.
To see that it is sufficient, suppose that Im(f∗ ) ⊆ Im(p∗ ). Then
we attempt to define a lift g as follows: for y ∈ Y , pick a path γ from
y0 to y, consider the path f ◦ γ in B from b0 to f (y), and lift this
path (using Theorem 8.2.6) to a path in E starting at e0 . We want to
define g(y) to be the endpoint of the lifted path. To do so, we must
check that different choices of γ give the same final result.
Suppose then that $\gamma$ and $\gamma'$ are two such choices. The concatenation
$\gamma * \bar\gamma'$ is then a loop in Y, defining a class in the fundamental
group $\pi_1(Y, y_0)$. By the hypothesis $\operatorname{Im}(f_*) \subseteq \operatorname{Im}(p_*)$, the loop
$f \circ (\gamma * \bar\gamma')$ in B is homotopic (via a homotopy h) to a loop $p \circ \theta$,

where θ is a loop in E. Now apply the homotopy lifting theorem
(Theorem 8.2.6) to the situation

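\[
\begin{array}{ccc}
S^1 \times \{0\} & \xrightarrow{\;\theta\;} & E \\
\downarrow & \nearrow & \downarrow{\scriptstyle p} \\
S^1 \times [0, 1] & \xrightarrow{\;h\;} & B.
\end{array}
\]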

The existence of the lifting (the diagonal map in the diagram) shows
in particular that $f \circ (\gamma * \bar\gamma')$ lifts to a loop in E and therefore that the
endpoints of the lifts of $f \circ \gamma$ and $f \circ \gamma'$ are the same.
The local connectedness hypothesis comes in when we try to
prove that the map g that we have defined is continuous. Suppose
that U is an open neighborhood of g(y) in E. Then p(U ) is open
in B, and, without loss of generality, we may take U so small that
$p^{-1}(p(U)) \cong U \times F$ where F is the (discrete) fiber. By the local
connectedness, there is a path-connected open neighborhood V of y such
that f (V ) ⊆ p(U ). For y ∈ V , a path from y0 to y may be defined
as the concatenation of γ with a path that remains entirely within V
(because V is path connected). A lift of this path may be defined by
concatenating a lift of γ with a path that lies entirely in p(V ) × F
where the fiber component remains constant, i.e., which lies entirely
in U . Thus g(V ) ⊆ U . This proves g is continuous at y.
Finally, the uniqueness claim follows from the uniqueness part of
Theorem 8.2.6.

8.3. Group actions

Covering spaces are closely related to group actions.
Deﬁnition 8.3.1. Let G be a group and let X be a metric space.
An action of G on X is a homomorphism from G to the group of
isometries of X. In other words, to each g ∈ G and x ∈ X there is
associated g · x ∈ X such that:
(a) (g1 g2 ) · x = g1 · (g2 · x).
(b) If e ∈ G denotes the identity, then e · x = x.

(c) d(x1 , x2 ) = d(g · x1 , g · x2 ) (i.e., each g ∈ G acts as an isometry

on X).
One says that the action makes X into a G-space.
Example 8.3.2. Let G be the group Z, and let X = R. Then
addition (g · x = g + x) makes X into a G-space.
Example 8.3.3. Let G = S3 , the symmetric group on 3 letters, and
let T be an equilateral triangle in the plane with vertices A, B, C, say.
For each permutation σ ∈ G there is one and only one isometry Iσ of
T which permutes the vertices A, B, C according to the permutation
σ. The assignment σ · x = Iσ (x) gives an action of G on T .
Deﬁnition 8.3.4. An action of G on X is free and properly discon-
tinuous (FPD for short) if for each x ∈ X there is ε > 0 such that
d(x, gx) > ε for all nonidentity g ∈ G.

The ﬁrst example above is free and properly discontinuous; the

second is not.
Deﬁnition 8.3.5. If G acts on X, the orbit Gx of x ∈ X is the set
{g · x : g ∈ G}. The orbit space G\X is the collection of orbits of G
on X.
Proposition 8.3.6. Suppose that G acts on the metric space X by
an FPD isometric action. Then the formula
\[ d(\Omega, \Omega') = \inf\{d(x, x') : x \in \Omega,\ x' \in \Omega'\} \]
deﬁnes a metric on the orbit space G\X.

Proof. It is clear that the formula for d is positive and symmetric.

We must show that it is strictly positive on distinct orbits and that
it satisﬁes the triangle inequality.
As a preliminary to both parts of the proof notice that if $x \in \Omega$, then $d(\Omega, \Omega') = \inf\{d(x, x') : x' \in \Omega'\}$ (i.e., we only need to take the infimum over one orbit, not both). This is because $d(gx, g'x') = d(x, g^{-1}g'x')$ since $G$ acts by isometries.
Now, suppose $d(\Omega, \Omega') = 0$ with $\Omega = Gx$, $\Omega' = Gx'$, $x \neq x'$. By the above there exists a sequence of distinct elements $g_n \in G$ with $d(x, g_n x') < n^{-1}$. Then $d(x', g_n^{-1} g_m x') = d(g_n x', g_m x') < n^{-1} + m^{-1}$ and this will contradict the FPD condition for $n, m$ sufficiently large.
Thus the distance between distinct orbits is strictly positive.
To prove the triangle inequality, consider three orbits $\Omega, \Omega', \Omega''$ equal to $Gx, Gx', Gx''$, respectively. By definition of the infimum, for any $\varepsilon > 0$ there exist $g', g'' \in G$ such that
$$d(x, g'x') < d(\Omega, \Omega') + \varepsilon, \qquad d(x', g''x'') < d(\Omega', \Omega'') + \varepsilon.$$
Then
$$d(\Omega, \Omega'') \le d(x, g'g''x'') \le d(x, g'x') + d(g'x', g'g''x'') = d(x, g'x') + d(x', g''x'') < d(\Omega, \Omega') + d(\Omega', \Omega'') + 2\varepsilon.$$
Letting $\varepsilon \to 0$ we obtain the triangle inequality.
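As a concrete illustration of the orbit metric, here is a minimal Python sketch for the action of Z on R by translation (Example 8.3.2). The function name `orbit_distance` and the `window` parameter are hypothetical choices made for this sketch. It uses the preliminary remark of the proof: fix one point and minimise over the other orbit only.

```python
import math


def orbit_distance(x: float, y: float, window: int = 3) -> float:
    """Distance between the orbits Z + x and Z + y in the orbit space Z\\R.

    Fix x and minimise over the orbit of y, i.e. over the points y + g
    with g an integer. Only integers g close to x - y can attain the
    minimum, so a small window around round(x - y) is enough.
    """
    centre = round(x - y)
    return min(abs(x - (y + g)) for g in range(centre - window, centre + window + 1))


# The orbits of 0.1 and 0.9 are close even though the points are not.
print(orbit_distance(0.1, 0.9))
# Points in the same orbit are at distance zero.
print(orbit_distance(0.0, 5.0))
```

In this example the orbit space is a circle of circumference 1, and the computed distance is the shorter arc length between the two points.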
Proposition 8.3.7. Suppose that G acts on the metric space X by
an FPD isometric action. Then the natural map p : X → G\X is a
covering map with fiber G (equipped with a discrete metric).

Proof. Put $Y = G\backslash X$, let $y \in Y$, and let $x \in p^{-1}(y)$. According to the FPD condition there is $\varepsilon > 0$ such that the distances $d(x, gx)$, $g \neq e$, are all greater than $\varepsilon$. It easily follows that the balls $B(gx; \tfrac{1}{2}\varepsilon)$ are all disjoint.
Let $U = B(y; \tfrac{1}{2}\varepsilon)$. By definition of the metric in $Y$, if $x' \in p^{-1}(U)$, then there is some $g \in G$ such that $d(x', gx) < \tfrac{1}{2}\varepsilon$. From the FPD condition, this $g$ is uniquely determined, and the restriction of $p$ to a map $B(gx; \tfrac{1}{2}\varepsilon) \to U$ is an isometry. Thus the map $x' \mapsto (p(x'), g)$ is a homeomorphism from $p^{-1}(U)$ to $U \times G$. Thus $p$ is a covering map, as asserted.

The next result generalizes our use of the exponential map to

compute the fundamental group of the circle S 1 .
Theorem 8.3.8. Suppose that G acts on the metric space X by an
FPD isometric action and in addition that X is simply connected.
Then the fundamental group π1 (G\X) is isomorphic to G.

Proof. Let $Y = G\backslash X$. Choose a basepoint $y_0 \in Y$ and a basepoint $x_0 \in X$ mapping to $y_0$ under the covering map $p : X \to Y$ which sends each point to its orbit. Define a map $\alpha : G \to \pi_1(Y, y_0)$ as follows: given $g \in G$, join $x_0$ to $g \cdot x_0$ by a path $\gamma$ in $X$, and let $\alpha(g)$ be the homotopy class of the based loop $p \circ \gamma$ in $Y$. Since $X$ is simply connected, the path $\gamma$ exists and is unique up to (endpoint fixed) homotopy; it follows that $p \circ \gamma$ is well-defined up to homotopy of loops, so that $\alpha(g) \in \pi_1(Y, y_0)$ is well-defined.
We prove that $\alpha$ is a homomorphism. Let $g, h \in G$ and let $\gamma, \theta$ be paths from $x_0$ to $g \cdot x_0$ and $h \cdot x_0$. Let $\gamma^h(t) = h \cdot \gamma(t)$: this is a path from $h \cdot x_0$ to $hg \cdot x_0$. Notice that $p \circ \gamma^h = p \circ \gamma$. However, the concatenation $\theta * \gamma^h$ is a path from $x_0$ to $hg \cdot x_0$; thus,
$$\alpha(hg) = [p \circ (\theta * \gamma^h)] = [(p \circ \theta) * (p \circ \gamma^h)] = \alpha(h)\alpha(g),$$
where the square brackets denote homotopy classes of loops. This
shows that α is a homomorphism.
We prove that α is surjective. Let ϕ : [0, 1] → Y be a based loop
in Y representing a class in the fundamental group. By the lifting
theorem (Theorem 8.2.6), there is a path ϕ̃ in X starting at x0 that
lifts ϕ. The point ϕ̃(1) is in the same orbit as x0 , so there is g ∈ G
such that g · x0 = ϕ̃(1). Then α(g) = [ϕ] ∈ π1 (Y, y0 ).
We prove that α is injective. Suppose that g ∈ ker α. Then
a path from x0 to g · x0 projects to a nullhomotopic loop in Y . By
Corollary 8.2.7, that loop in Y lifts to a loop in X. By the uniqueness
of lifting, that lift must be equal to the original path from x0 to g · x0 .
Thus g · x0 = x0 . Since the action is free, g is the identity.

8.4. Examples
In the previous section we proved that if a group G acts freely and
properly discontinuously on a simply connected space X, then the
quotient space G\X has fundamental group G. We will give several
examples.
Example 8.4.1. Considering the FPD action of Z on R by transla-
tions recovers for us the calculation π1 (S 1 ) = Z which underlies the
notion of winding number.
Example 8.4.2. Letting $\mathbb{Z}^2$ act on $\mathbb{R}^2$ by translations, we get the calculation $\pi_1(T^2) = \mathbb{Z}^2$, where $T^2$ is the torus. This also can be deduced from the case of the circle using Exercise 8.7.1.

Remark 8.4.3. The fundamental groups of closed surfaces of higher

genus can also be approached in this way, but the required isometric
actions are not actions on the Euclidean plane anymore. Instead, X
becomes 2-dimensional hyperbolic space, and the required groups G
are Fuchsian groups (discrete groups of hyperbolic isometries). This
very classical theory was extensively developed in the nineteenth cen-
tury.
Example 8.4.4. Let the group $G = \mathbb{Z}_2$ with two elements act on the sphere $S^2$ by the antipodal map (i.e., the nonzero element of $G$ sends each point of $S^2$ to its antipodal point). Clearly this action is FPD. The orbit space $G\backslash S^2$ is called the real projective plane, $\mathbb{RP}^2$. By Theorem 8.3.8, the fundamental group of $\mathbb{RP}^2$ is $\mathbb{Z}_2$.

Now we will consider another important example, the free group.

Let S be a set, which in this context we call the set of generators. A
word in S is a formal expression (of any length k)
$$s_1^{n_1} s_2^{n_2} \cdots s_k^{n_k}$$
where $s_1, \ldots, s_k \in S$ and $n_1, \ldots, n_k \in \mathbb{Z}$. (The individual $s_i^{n_i}$ are called the terms of the word, with base $s_i$ and exponent $n_i$.) We also
allow consideration of the empty word of length 0, which is denoted 1.
Two words are said to be equivalent if one can be obtained from the
other by a ﬁnite succession of the following operations (“elementary
equivalences”):
(i) replacing two successive terms with the same base $s$ by a single term using the “exponential law”: $s^n s^m = s^{n+m}$,
(ii) replacing a single term by two successive terms with the same base using the exponential law,
(iii) inserting a term $s^0$ in any position for any generator $s$,
(iv) deleting a term $s^0$ from any position for any generator $s$.
For example, if $S = \{a, b\}$, the following words are equivalent:
$$a^2bab\,b^{-1}a^{-1}b^2 \sim a^2ba\,a^{-1}b^2 \sim a^2b\,b^2 \sim a^2b^3 \sim aabbb.$$
In the last version we abbreviate $s^1$ as $s$. It is clear that our relationship of equivalence is, indeed, an equivalence relation generated by our elementary equivalences (Remark G.1.4).

If $w$ and $w'$ are words, their juxtaposition is the word defined by writing the terms of $w$ and following them by the terms of $w'$.
Juxtaposition of words is an associative binary operation, for which
the empty word acts as an identity. It does not, however, admit
inverses.
Proposition 8.4.5. The juxtaposition operation passes to equiva-
lence classes of words. Equipped with this operation, the collection of
equivalence classes of words becomes a group (i.e., it has inverses).
Definition 8.4.6. The group so defined (of equivalence classes of
words in S) is called the free group generated by S and is denoted
by F (S). The free group on a finite number n of generators is also
denoted Fn .

Proof of the proposition. It is clear that if words $w, w'$ are equivalent, so are their juxtapositions with a third word $w''$. Thus juxtaposition does indeed pass to an associative binary operation on $F(S)$, with the equivalence class of the empty word as identity. If
$$w = s_1^{n_1} s_2^{n_2} \cdots s_k^{n_k},$$
then let
$$w^{-1} = s_k^{-n_k} s_{k-1}^{-n_{k-1}} \cdots s_1^{-n_1}.$$
One checks directly that the words $ww^{-1}$ and $w^{-1}w$ are equivalent to the empty word.
Deﬁnition 8.4.7. A word in S is called reduced if no term has ex-
ponent 0 and no two consecutive terms have the same base s ∈ S.

Proposition 8.4.8. Each word is equivalent to one and only one

reduced word.

Outline proof. We describe (via pseudocode) Algorithm 1, which

has the following properties:
(a) The input to the algorithm is a word; the output from the algo-
rithm is a reduced word equivalent to the input word.
(b) If the input to the algorithm is already reduced, the output is the
same as the input.
(c) Equivalent input words produce the same output.

Algorithm 1 Algorithm to reduce a word in the free group

function Reduce-Word(W)    ▷ W is a word in the alphabet S
    Initialize two stacks, L and R. The L stack is initially empty;
        the R stack contains the input word, with the leftmost term on top.
    while R stack is non-empty do
        while L and R stacks are both non-empty and their top terms have same base do
            t ← combination of top terms of L and R according to exponential law
            pop stacks L and R
            if exponent of t is nonzero then push t onto stack L
            end if
        end while
        if R stack is non-empty then
            t ← top term of R
            pop stack R
            if exponent of t is nonzero then push t onto stack L
            end if
        end if
    end while
    Reduced word is now contained on stack L, rightmost on top
    return contents of stack L
end function

The existence of this algorithm proves the proposition.

One sees (and proves formally by induction) that the word on the
left stack is always reduced and that the operation of the algorithm
does not change the equivalence class of the word obtained by con-
catenating the L and R stacks. Moreover, if two words are related
by an elementary equivalence (and therefore by any equivalence), the
algorithm will convert them into the same word.
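A minimal Python sketch of Algorithm 1 follows. The representation of a word as a list of `(base, exponent)` pairs and the names `reduce_word` and `inverse` are assumptions made for this sketch. It also checks explicitly that the R stack is non-empty before popping it, since the inner loop can empty R (for instance on the input $a\,a^{-1}$).

```python
def reduce_word(word):
    """Return the reduced word equivalent to `word`.

    A word is a list of (base, exponent) terms, e.g. [("a", 2), ("b", -1)].
    """
    left = []                     # stack L; its top is the end of the list
    right = list(reversed(word))  # stack R; leftmost term of the word on top
    while right:
        # Combine the top terms of L and R for as long as they share a base.
        while left and right and left[-1][0] == right[-1][0]:
            base, n = left.pop()
            _, m = right.pop()
            if n + m != 0:
                left.append((base, n + m))
        # The inner loop may have emptied R, so check before popping.
        if right:
            base, n = right.pop()
            if n != 0:
                left.append((base, n))
    return left


def inverse(word):
    """Formal inverse: reverse the order of the terms and negate exponents."""
    return [(base, -n) for base, n in reversed(word)]


# The example from the text: a^2 b a b b^-1 a^-1 b^2 reduces to a^2 b^3.
w = [("a", 2), ("b", 1), ("a", 1), ("b", 1), ("b", -1), ("a", -1), ("b", 2)]
print(reduce_word(w))  # [('a', 2), ('b', 3)]
```

Comparing `reduce_word(u)` with `reduce_word(v)` then decides whether two words `u` and `v` represent the same element of the free group.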
Remark 8.4.9. Let G be any group. A subset S ⊆ G is called a
generating set for G if any element of G can be written as a word in
the members of S, where successive terms are multiplied using the
composition law of G. The word problem for G asks for an algorithm
to decide when two diﬀerent words represent the same group element.
The algorithm that we have presented above solves the word problem
for the free group.

It was proved by Novikov that there exist groups for which there
is no solution to the word problem — there is no algorithm to decide
whether or not a given word represents the identity. Combining these
ideas with topological constructions, it can be shown that there is no
algorithm to decide whether or not two explicitly given 4-dimensional
manifolds are homeomorphic to one another.

8.5. The Nielsen-Schreier theorem

Now we are going to construct a simply connected space on which a
free group acts freely and properly discontinuously. To keep things
explicit we’ll consider the free group on two generators, S = {a, b},
but the arguments extend without difficulty to any finite number of
generators (or indeed, with a bit of extra bookkeeping, to an infinite
number of generators).
Our metric space X is the geometric realization of the Cayley
graph of G = F (S). The Cayley graph has one vertex for each element
of $G$, and vertices $g$ and $g'$ are joined by an edge if and only if $g^{-1}g'$ is one of $a, b, a^{-1}, b^{-1}$. Remember that each group element can be represented uniquely by a reduced word, a product of nonzero powers of the two generators $a, b$ alternately. We see that $g, g'$ are joined by an edge in the Cayley graph if and only if the reduced word for $g'$ differs from the reduced word for $g$ by “one letter on the right”. For example, the four neighbors of the reduced word $a^2b^{-1}a^3$ are $a^2b^{-1}a^2$, $a^2b^{-1}a^4$, $a^2b^{-1}a^3b$, and $a^2b^{-1}a^3b^{-1}$.
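The neighbor rule can be sketched in Python as follows. The list-of-pairs representation of a reduced word and the function names `right_multiply` and `neighbors` are hypothetical and serve only as illustration. Multiplying a reduced word on the right by a single letter changes only its last term.

```python
def right_multiply(word, base, sign):
    """Multiply a reduced word on the right by base**sign, sign = +1 or -1.

    A reduced word is a list of (base, exponent) terms with nonzero
    exponents and no two consecutive terms sharing a base.
    """
    word = list(word)
    if word and word[-1][0] == base:
        last_base, n = word.pop()
        if n + sign != 0:  # drop the term entirely if it cancels
            word.append((last_base, n + sign))
    else:
        word.append((base, sign))
    return word


def neighbors(word, generators=("a", "b")):
    """The vertices joined to `word` by an edge of the Cayley graph."""
    return [right_multiply(word, s, e) for s in generators for e in (1, -1)]


# Neighbors of a^2 b^-1 a^3, in the order: times a, a^-1, b, b^-1.
for v in neighbors([("a", 2), ("b", -1), ("a", 3)]):
    print(v)
```

Because the input is reduced, the output is again reduced without a further call to a reduction routine.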
We make X into a metric space by identifying each edge with
a standard interval of length 1 and then measuring distances along
shortest paths (see Remark G.4.5 for more details). For g ∈ G, we
define its absolute value to be the distance from g to the identity in
X. This is the same as the length of the reduced word representing g,
i.e., the sum $|n_1| + |n_2| + \cdots + |n_k|$ in the reduced word representation $g = s_1^{n_1} \cdots s_k^{n_k}$. By examining reduced words, one sees that of the four edges of $X$ that leave a nonidentity vertex $g$, three lead to a vertex of greater absolute value and one to a vertex of lesser absolute value. It follows that from any vertex $g$ of $X$ (and therefore from any point of $X$ at all) there is a unique radial geodesic: a path of length $|g|$ in $X$ leading from $g$ to the identity vertex. For $x \in X$ let us temporarily denote by $x_s$, $s \in [0, 1]$, the point on the radial geodesic from $e$ to $x$ that is distance $s|x|$ from the identity (so that $x_1 = x$ and $x_0 = e$ for all $x$).

Figure 8.3. Part of the Cayley graph of the free group. All edges have length 1.
Proposition 8.5.1. The Cayley graph X of a free group is simply
connected.

Proof. Clearly $X$ is connected. If $\gamma(t)$ is a loop (based at the identity) in $X$, then $\gamma(t)_s$, $s \in [0, 1]$, gives a homotopy of that loop to the constant loop.

Every nonidentity $g \in G$ moves every point of $X$ by distance at least 1, so the action of $G$ on $X$ is FPD. Thus the quotient $G\backslash X$ is a space having fundamental group $G$, the free group on two generators. What is this space? All the vertices of $X$ are in the same $G$-orbit. The edges fall into two $G$-orbits, according to whether their endpoints are related by multiplying by $a$ or by $b$. Thus the quotient space $G\backslash X$ consists of one vertex and two loops on it, as shown in Figure 8.4. This is sometimes called a “bouquet” (or, less poetically, a “wedge”) of two circles. The free group on $n$ generators can similarly be shown to be the fundamental group of a bouquet of $n$ circles.

Figure 8.4. The “figure eight” space, a bouquet of two circles.
Proposition 8.5.2. The fundamental group of any ﬁnite connected
graph is free on n generators, where n = E − V + 1, with E being the
number of edges and V the number of vertices.

This result is also true for inﬁnite graphs, but to keep things
simple we will prove only the ﬁnite case.

Proof. A basic (and easy) theorem of graph theory says that a con-
nected graph has a spanning tree, that is, a subgraph which contains
all the vertices and has no circuits (see Lemma G.4.4). The spanning
tree of our finite graph contains all V vertices and V − 1 edges, so
there are n = E − V + 1 edges of the graph that do not belong to the
spanning tree.
Let (Y, y0 ) be a bouquet of n circles. Define a map f : X → Y
by sending every point of the spanning tree to y0 and sending the
n remaining edges linearly to the n circles of the bouquet. Define
a map g : Y → X by sending each circle of the bouquet linearly to
the loop in X obtained by following the spanning tree outward to the
beginning of the corresponding edge, traversing that edge, and then
following the spanning tree inward again to the basepoint. It is not
hard to see that f ◦ g and g ◦ f are homotopic to their respective
identity maps.
Thus the graph X is homotopy equivalent (Remark 8.1.10) to
a bouquet of n circles, which means that its fundamental group is
isomorphic to the fundamental group of such a bouquet — that is, to
the free group Fn .
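The count $n = E - V + 1$ in the proof can be checked mechanically. The sketch below builds a spanning tree with a union-find structure and counts the edges left outside it. The function name and the input format (vertices numbered from 0, edges as pairs) are assumptions made for this illustration.

```python
def fundamental_group_rank(num_vertices, edges):
    """Rank of the (free) fundamental group of a finite connected graph.

    `edges` is a list of pairs (u, v) of vertex indices; loops (u, u) and
    repeated edges are allowed. Raises ValueError if the graph is not
    connected.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent pointers to the representative, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree_edges = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # this edge joins two components: add it to the tree
            parent[ru] = rv
            tree_edges += 1
    if tree_edges != num_vertices - 1:
        raise ValueError("graph is not connected")
    # Each edge outside the spanning tree contributes one free generator.
    return len(edges) - tree_edges


# A bouquet of two circles: one vertex and two loops.
print(fundamental_group_rank(1, [(0, 0), (0, 0)]))  # 2
# A triangle is homotopy equivalent to a single circle.
print(fundamental_group_rank(3, [(0, 1), (1, 2), (2, 0)]))  # 1
```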

We are going to use this to prove an important group-theoretic re-

sult about free groups, the Nielsen-Schreier theorem. First, a general
lemma about covering spaces.
Lemma 8.5.3. Let $p : (Y, y_0) \to (X, x_0)$ be a covering space, where $Y$ is path connected. Then the induced map $p_* : \pi_1(Y, y_0) \to \pi_1(X, x_0)$ is injective. The index¹ of the subgroup $H = p_*(\pi_1(Y, y_0))$ in $G = \pi_1(X, x_0)$ is equal to the number of sheets of the covering.

Proof. An element of the kernel of $p_*$ would be a loop in $Y$ that maps to a nullhomotopic loop in $X$. But the homotopy lifting theorem (Theorem 8.2.6) allows us to lift such a homotopy to a nullhomotopy of the original loop. Thus $p_*$ has trivial kernel, so it is injective.
We count the number of points in the fiber $F = p^{-1}(x_0)$. For each element $g \in \pi_1(X, x_0)$ we obtain a $y \in F$ as the endpoint $\gamma(1)$, where $\gamma$ is a path in $Y$ starting at $y_0$ and lifting $g$. This gives a mapping of $\pi_1(X, x_0)$ onto $F$. Two group elements $g_1, g_2$ represent the same $y$ if and only if $g_1 g_2^{-1}$ comes from a loop in $Y$ based at $y_0$, in other words, if and only if $g_1$ and $g_2$ are in the same coset of $H$. Thus the number of points in the fiber is equal to the number of cosets, i.e., the index of $H$ in $G$.

Theorem 8.5.4. Let G = Fn be a free group on n generators. Let H

be a subgroup of ﬁnite index k. Then H is also free on 1 + k(n − 1)
generators.

Remark 8.5.5. This is the “ﬁnite” part of the Nielsen-Schreier the-

orem. The infinite part, which can be proved in the same way, says
that an infinite-index subgroup of a free group is also free, on infin-
itely many generators. Notice the apparently paradoxical fact that
the number of generators increases the smaller the subgroup gets.

Proof. Let X be a bouquet of n circles, and let X̃ be the Cayley

graph of G, which is the infinite tree on which G acts FPD with
X = G\X̃. Since H is a subgroup of G, it can also be thought of as
acting on X̃, and this action is also FPD. Let Y = H\X̃. Then Y has
fundamental group H, and Y → X is a covering map whose fiber is
the coset space H\G (in particular, its cardinality is the index k of H
in G). Thus Y is a finite graph so π1 (Y ) = H is free. To compute the
number of generators, let $V_X$, $E_X$ denote the number of vertices and edges for $X$, and let $V_Y$ and $E_Y$ denote the corresponding numbers for $Y$. We have $E_X - V_X = n - 1$, $E_Y = kE_X$, $V_Y = kV_X$. From these we find that the number of generators of $H$ is
$$1 + E_Y - V_Y = 1 + k(E_X - V_X) = 1 + k(n - 1),$$
as asserted.

¹ See Remark G.2.9.
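The edge and vertex count can be tried on a small case. The sketch below assumes the standard description of a connected $k$-sheeted covering of the bouquet by one permutation of the $k$ sheets per generator; this description is not developed in the text, and the function names and input format are hypothetical.

```python
def covering_graph(perms):
    """Build a k-sheeted covering graph of the bouquet of n circles.

    `perms` is a list of n permutations of range(k): perms[j][i] is the
    sheet reached from sheet i by lifting the j-th circle. The
    permutations are assumed to act transitively, so that the covering
    graph is connected.
    """
    k = len(perms[0])
    vertices = list(range(k))
    # Each of the n circles lifts to k edges, one starting on each sheet.
    edges = [(i, perm[i]) for perm in perms for i in range(k)]
    return vertices, edges


def rank_from_counts(vertices, edges):
    """Number of free generators, 1 + E - V, as in Proposition 8.5.2."""
    return 1 + len(edges) - len(vertices)


# Index-2 subgroup of F_2: kernel of the map to Z_2 sending a to 1, b to 0.
# The generator a swaps the two sheets and b fixes them.
vertices, edges = covering_graph([[1, 0], [0, 1]])
print(rank_from_counts(vertices, edges))  # 1 + 2*(2 - 1) = 3
```

Here $V_Y = 2$ and $E_Y = 4$, in agreement with $V_Y = kV_X$ and $E_Y = kE_X$ for $k = 2$, $V_X = 1$, $E_X = 2$.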

8.6. An application to nonassociative algebra

The complex numbers C form a ﬁnite-dimensional vector space A over
R equipped with a bilinear map A × A → A (the multiplication of
complex numbers). In general, a vector space equipped with such a
bilinear map is called a (real) algebra: the words associative, commu-
tative applied to an algebra refer to the corresponding properties of
the multiplication map. A ﬁnite-dimensional algebra is called a divi-
sion algebra if the multiplication map has no zero-divisors: that is to
say, the product of two nonzero elements is again nonzero. Finally, an
algebra is unital if it contains an element 1 that acts as the identity
for multiplication.
It is easy to prove that the only associative, commutative, unital
real division algebras are R and C, and it is not too much harder to
prove that dropping the commutativity requirement yields just one
more example, the quaternion algebra H of Hamilton. But suppose
we keep commutativity and drop associativity instead? This question
was investigated by Hopf in the 1940s and his answer involved topol-
ogy and covering spaces in a very intriguing way. His conclusion was
this:

Theorem 8.6.1. The only ﬁnite-dimensional commutative unital real

division algebras are R and C. (Thus, for these algebras, “the asso-
ciative law is a consequence of the commutative law”.)

In what follows, $A$ will denote a finite-dimensional real commutative division algebra, with multiplication map denoted by $(x, y) \mapsto x \cdot y$. We will not need the unitality hypothesis until the very end. By choosing a basis, we may identify $A$ with $\mathbb{R}^n$ and thus introduce the norm $\|a\|$, the usual (Euclidean) length of the vector $a$.

Lemma 8.6.2. For any (finite-dimensional) division algebra $A$ there is a constant $m > 0$ such that
$$m^{-1}\|a\|\,\|b\| \le \|a \cdot b\| \le m\,\|a\|\,\|b\|$$
for all $a, b \in A$.

Proof. Let $S^{n-1}$ denote the unit sphere in $\mathbb{R}^n$. Consider the set $S = \{\|a \cdot b\| : (a, b) \in S^{n-1} \times S^{n-1}\} \subseteq \mathbb{R}^+$. Since $S^{n-1} \times S^{n-1}$ is compact, $S$ is a compact set of real numbers, all strictly positive because $A$ has no zero-divisors, so it is bounded above and bounded below by a positive constant. Choose $m = \max\{\sup(S), \inf(S)^{-1}\}$ and use bilinearity to complete the proof.
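The bilinearity step can be spelled out as follows. This is a sketch in which the unit vectors $u$ and $v$ are notation introduced here, not taken from the text.

```latex
% For nonzero a, b put u = a/\|a\| and v = b/\|b\|, both in S^{n-1}.
\[
  \|a \cdot b\| = \bigl\| (\|a\|u) \cdot (\|b\|v) \bigr\|
               = \|a\|\,\|b\|\,\|u \cdot v\|,
  \qquad \inf(S) \le \|u \cdot v\| \le \sup(S).
\]
% Since m \ge \sup(S) and m \ge \inf(S)^{-1}, that is, m^{-1} \le \inf(S):
\[
  m^{-1}\|a\|\,\|b\| \le \|a \cdot b\| \le m\,\|a\|\,\|b\|.
\]
% If a = 0 or b = 0, all three terms vanish and the inequality is trivial.
```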

We deﬁne the quadratic map q : A \ {0} → A \ {0} by q(x) = x · x.

The key to Hopf’s proof is to consider the topological properties of
the quadratic map. The image of the quadratic map may or may not
be the whole of A \ {0} (the examples A = C and A = R show these
two possibilities). However each point of Im(q) is the image of exactly
two points in the domain. This is because of the familiar factorization

q(x) − q(a) = x · x − a · a = (x − a) · (x + a)

which shows, since there are no zero-divisors, that if q(x) = q(a), then
x = ±a. Note that this factorization depends on the commutative
and bilinear properties of multiplication only; it does not require the
associative law.

Lemma 8.6.3. The image of q is a closed subset of A \ {0}.

Proof. Suppose $y_n = q(x_n)$ is a sequence in $\mathrm{Im}(q)$ that converges to $y \in A \setminus \{0\}$. By omitting initial terms if necessary, we may assume that there is a constant $r > 0$ such that $r^{-1} \le \|y_n\| \le r$ for all $n$. Now using Lemma 8.6.2 we obtain a constant $m$ such that $(mr)^{-1/2} \le \|x_n\| \le (mr)^{1/2}$ for all $n$. By compactness, there is a subsequence of the $x_n$ that converges to a nonzero $x$. Since $q$ is continuous, $q(x) = y$, as required.

We are now going to consider $q$ as a smooth map, using the higher-dimensional calculus language of Appendix E. Since $q$ is a map $A \to A$, its derivative at a point $x \in A \setminus \{0\}$ is a linear map $A \to A$; in fact the calculation
$$q(x + h) = q(x) + 2x \cdot h + h \cdot h$$
(which uses commutativity) shows that the derivative of $q$ at $x$ is the linear map $h \mapsto 2x \cdot h$. By the division algebra property this is an invertible linear map.
It follows from the inverse function theorem (see Theorem E.4.1)
that q is a local homeomorphism near each x ∈ A \ {0}; in fact, each
such x has a neighborhood U such that q maps U diﬀeomorphically
onto a neighborhood q(U ) of q(x). In particular, the image of q is an
open subset of A \ {0}.

Lemma 8.6.4. If $\dim A \ge 2$, then the quadratic map $q$ is surjective (that is, $\mathrm{Im}(q) = A \setminus \{0\}$).

Proof. The image Im(q), considered as a subset of A \ {0}, is both

open (as we have just seen) and closed (by Lemma 8.6.3). But the
complement of the origin in a vector space of dimension at least 2
is connected (Remark 2.1.5), so its only open-and-closed subsets are
the empty set and the whole of A \ {0}. Since the image of q is not
empty, it must be the whole space.

At this point we have split the proof of Theorem 8.6.1 into two
cases:
(i) $A$ is 1-dimensional: in this case it is easily seen that $A \cong \mathbb{R}$ as an algebra.
(ii) A is higher-dimensional: in this case we now know that q maps
A \ {0} onto A \ {0}. We’ll continue the analysis to show that
in this second case dim A = 2.

Lemma 8.6.5. If the quadratic map $q$ is onto, then it is a (2-sheeted) covering map.

Proof. By Definition 8.2.3, we must show that every $y \in A \setminus \{0\}$ has an open neighborhood $V$ such that $q^{-1}(V)$ is the disjoint union of two open sets, on each of which $q$ restricts to a homeomorphism. Let $q^{-1}\{y\} = \{x, -x\}$. Then (by the inverse function theorem) there is a neighborhood $U$ of $x$ such that $q$ maps $U$ homeomorphically (indeed diffeomorphically) onto $q(U)$. By shrinking $U$ if necessary, we may assume that $U \cap (-U) = \emptyset$. Take $V = q(U)$. Then $q^{-1}(V) = U \sqcup (-U)$ is the disjoint union of two open sets each of which is mapped homeomorphically onto $V$.

A more general approach to this result is indicated in Exer-

cise 8.7.14.

Lemma 8.6.6. If Rn \ {0} has a connected 2-sheeted covering, then

n = 2.

Proof. For $n \ge 3$, $\mathbb{R}^n \setminus \{0\}$ is simply connected. (Any loop in $\mathbb{R}^n \setminus \{0\}$ is homotopic, by a radial retraction, to one in the unit sphere $S^{n-1}$. For $n \ge 3$ this unit sphere is simply connected (Example 2.3.6) so any loop in it is homotopic to a constant.) By Lemma 8.5.3, a simply connected space cannot have a connected covering with more than one sheet.

Completion of the proof of Theorem 8.6.1. From Lemmas 8.6.5

and 8.6.6, together with the case-splitting above, we ﬁnd that either
A is 1-dimensional (and so isomorphic to R) or it is 2-dimensional.
It remains to show that in this second case A is isomorphic to C.
So far our argument didn’t use the assumption of an identity. In fact there do exist other 2-dimensional, commutative, nonassociative division algebras without identity, for example the multiplication on $\mathbb{C}$ defined by
$$a \cdot b = \overline{ab}.$$

However, with the additional assumption of an identity 1 we may argue as follows. Since $q$ is surjective, there exists an element $i$ such that $q(i) = -1$. Necessarily, 1 and $i$ are linearly independent since $q(\lambda 1) = \lambda^2 1 \neq -1$ for all real $\lambda$. Since $A$ is 2-dimensional, it is spanned by 1 and $i$, so every element of $A$ is of the form $x1 + yi$, where $x, y \in \mathbb{R}$ and the multiplication is given by $1 \cdot 1 = 1$, $1 \cdot i = i \cdot 1 = i$, $i \cdot i = -1$. It is now evident that $A$ is isomorphic to $\mathbb{C}$.
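The role of the identity can be seen numerically on the conjugated product $a \cdot b = \overline{ab}$ on $\mathbb{C}$. The following Python sketch (the function name `star` is hypothetical) checks on sample values that this product is commutative and has no zero-divisors, but is neither associative nor unital.

```python
def star(a: complex, b: complex) -> complex:
    """The conjugated product on C: the complex conjugate of a*b."""
    return (a * b).conjugate()


i = 1j

# Commutative, since ordinary complex multiplication is commutative.
print(star(2 + 3j, i) == star(i, 2 + 3j))

# Not associative: (i * 1) * 1 = i, whereas i * (1 * 1) = -i.
print(star(star(i, 1), 1), star(i, star(1, 1)))

# The number 1 is not an identity: 1 * i = -i.
print(star(1, i))

# No zero-divisors: the modulus of the product is the product of the moduli.
print(abs(star(3 + 4j, 1 - 2j)), abs(3 + 4j) * abs(1 - 2j))
```

No other element can serve as an identity either: $e \cdot 1 = 1$ forces $e = 1$.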

8.7. Exercises
Exercise 8.7.1. Let X and Y be metric spaces (compact, if you
like) with basepoints x0 and y0 . Let Z = X × Y with basepoint
$z_0 = (x_0, y_0)$. Show that there is an isomorphism
$$\pi_1(Z, z_0) \cong \pi_1(X, x_0) \times \pi_1(Y, y_0).$$
Exercise 8.7.2. Suppose that $X$ is path connected and $\pi_1(X, x_0)$ is abelian. Show that, in this case, the isomorphisms $\pi_1(X, x_0) \to \pi_1(X, x_1)$ associated to different paths from $x_0$ to $x_1$ are all the same.
Exercise 8.7.3. Prove that homotopy equivalence is an equivalence
relation (on the class of compact metric spaces with basepoint).
Exercise 8.7.4. Show that the real projective plane can be obtained
by gluing a disc to a Möbius band along their boundary circles, and
that in terms of this identification the generator of the fundamental
group of RP2 is the “core” circle of the Möbius band.
Exercise 8.7.5. Show that there is no retraction of the Möbius band
onto its boundary circle.
Exercise 8.7.6. Regard the 3-sphere $S^3$ as the space $\{(z_1, z_2) : z_1, z_2 \in \mathbb{C},\ |z_1|^2 + |z_2|^2 = 1\}$. Let $p, q$ be natural numbers with $p$ prime, $q < p$. Let $T$ be the transformation $S^3 \to S^3$ defined by
$$T(z_1, z_2) = (e^{2\pi i/p} z_1,\ e^{2\pi i q/p} z_2).$$
Describe a metric on $S^3$ for which $T$ is an isometry.
An action of $\mathbb{Z}$ on $S^3$ is defined by $n \cdot x = T^n(x)$. The orbit space of this action is called the lens space $L(p, q)$. Compute the fundamental group of $L(p, q)$.
Exercise 8.7.7. Consider the group $G$ of isometries of $\mathbb{R}^2$ generated by the two transformations $a(x, y) = (x + 1, y)$ (a translation) and $b(x, y) = (-x, y + 1)$ (a glide reflection). Verify that $ba = a^{-1}b$ (equivalently, $aba = b$). Deduce that any $g \in G$ can be written as $a^m b^n$ and the multiplication law is
$$(a^m b^n)(a^{m'} b^{n'}) = a^{m + (-1)^n m'}\, b^{n + n'}.$$
Show that $G$ is a nonabelian group and that its action on the plane is FPD. (The quotient $G\backslash\mathbb{R}^2$ is called the Klein bottle.)

Exercise 8.7.8. (a) Show that every continuous map from the real
projective plane to the torus is nullhomotopic.
(b) Must every continuous map from the torus to the real projective
plane be nullhomotopic?
Exercise 8.7.9. Implement Algorithm 1 in your favorite program-
ming language and check that it really works.
Exercise 8.7.10. Verify the “not hard to see” statement in the proof
of Proposition 8.5.2.
Exercise 8.7.11. Let $G = F_2$ be the free group on two generators $x$ and $y$. Let $C(x)$ denote the set of all $g \in G$ whose reduced word begins with a strictly positive power of $x$, and let $C(x^{-1})$ denote those whose reduced word begins with a strictly negative power of $x$. Similarly define $C(y)$ and $C(y^{-1})$. Show that
$$G = \{e\} \cup C(x) \cup C(x^{-1}) \cup C(y) \cup C(y^{-1})$$
but also
$$G = xC(x^{-1}) \cup C(x), \qquad G = yC(y^{-1}) \cup C(y).$$
Deduce that there is no way to assign “probabilities” to subsets of
G that satisfy the ordinary laws of probability theory (probabilities
are real numbers between 0 and 1, the probability of all of G is 1,
the probability of the empty set is 0, the probability of A ∪ B is the
probability of A plus the probability of B minus the probability of
the intersection), which is also translation invariant (the probability
of A equals the probability of gA, for all g ∈ G).
(This question produces a paradoxical decomposition: G is de-
composed into four pieces which can be reassembled to make two
copies of G. The underlying idea can be used in more complicated
paradoxical decompositions such as the Banach-Tarski paradox, de-
composing a solid ball in R3 into ﬁnitely many pieces which can be
reassembled to make two equal-size balls. See [38].)
Exercise 8.7.12. Show that C \ {−1, 1} is homotopy equivalent to
the bouquet of two circles. Use this and our computations for the free
group to verify that the loop in C\{−1, 1} described in Exercise 5.7.15
is not homotopic to a constant loop.

Exercise 8.7.13. For an arbitrary group $G$, define $G \otimes \mathbb{Z}_2$ to be the quotient of $G$ by the subgroup generated by all squares and all commutators in $G$. Prove that $F_n \otimes \mathbb{Z}_2$ is a finite group with $2^n$ elements, and deduce that $F_n$ and $F_m$ are not isomorphic if $n \neq m$.
Exercise 8.7.14. Give an example of a local homeomorphism which
is not a covering map. Show, however, that a proper local homeo-
morphism is always a covering map (with finite fibers). (A continuous
map $p : X \to Y$ between metric spaces is proper if $p^{-1}(K)$ is compact in $X$ whenever $K$ is compact in $Y$.)

Chapter 9

Coda: The Bott Periodicity Theorem

9.1. Homotopy groups

In this ﬁnal chapter I want to follow a beautiful paper of Atiyah [4]
to indicate one direction in which the theory that we’ve studied so
far can be developed. By no means is this the only such direction!
Indeed, the winding number lies at the foundation of nearly every
part of modern mathematics. This will be a very brisk tour with
almost no proofs.
One way to generalize the fundamental group (and therefore the
notion of winding number) is the following.

Definition 9.1.1. Let $(X, x_0)$ be a pointed space. The nth homotopy group $\pi_n(X, x_0)$ is $[S^n, X]$, the collection of homotopy classes of basepoint-preserving maps from the $n$-sphere to $X$.

To understand the group operation, it is helpful to reformulate the definition and say that $\pi_n(X)$ is the collection of homotopy classes of maps from the unit $n$-cube $I^n$ to $X$ which send the boundary $\partial I^n$ to the basepoint $x_0$. (This corresponds in the case $n = 1$ to our viewing loops as paths which begin and end at the basepoint.) Now we can define a group operation by concatenation on one of the $I$ factors in $I^n$. It turns out not to matter, up to homotopy, which $I$ factor we pick. This observation leads to

Proposition 9.1.2. For $n \ge 2$, the group $\pi_n(X)$ is abelian.

Let’s calculate some of the groups πn (X).

Proposition 9.1.3. $\pi_n(S^1) = 0$ for $n \ge 2$.

Proof. In the diagram

             R
           ↗ |
         /   ↓
    S^n ---> S^1

the map $S^n \to S^1$ lifts to a map $S^n \to \mathbb{R}$ because of the lifting theorem (Theorem 8.2.8), since $S^n$ is simply connected for $n \ge 2$. But $\mathbb{R}$ is contractible (the identity map $\mathbb{R} \to \mathbb{R}$ is homotopic to the constant map), and it follows that any map from any space to $\mathbb{R}$ is homotopic to the constant map. Thus any map $S^n \to S^1$ is homotopic to a constant, and $\pi_n(S^1) = 0$.

The only thing about S 1 that we used in this argument is that it

has a contractible covering space R. In general, a space with a con-
tractible covering space is called an Eilenberg-MacLane space of type
K(π, 1) and the same argument applies to show that all the higher
homotopy groups of such a space vanish. Any compact manifold carrying a metric of nonpositive curvature (such as a surface of genus $\ge 1$ or a compact hyperbolic 3-manifold) is an Eilenberg-MacLane space.
We used the fact that $\pi_1(S^n) = 1$ for $n \ge 2$ in the above proof. You’ll remember that the proof of this is a “general position” argument: a map $S^1 \to S^n$ can be deformed to a piecewise linear map, which certainly misses a point of $S^n$ (say the north pole, $N$), and then the map factors as
$$S^1 \to S^n \setminus \{N\} \to S^n;$$
but since $S^n \setminus \{N\} \cong \mathbb{R}^n$ is contractible, this map is homotopic to a constant. With a little extra care, this sort of general position argument can be applied to maps $S^m \to S^n$ where $m < n$, thus giving
Proposition 9.1.4. The groups $\pi_m(S^n)$ are trivial when $m < n$.

What about when $m = n$? This is the situation with our winding number discussion, $\pi_1(S^1) \cong \mathbb{Z}$. It turns out that $\pi_n(S^n)$ is also $\mathbb{Z}$. There are two parts to this (just as there are with the winding number): first, defining a numerical invariant of maps $S^n \to S^n$ (the degree); second, proving that it is the only such invariant (i.e., two maps with the same degree are homotopic).
Here’s how to define the degree. Let $f : S^n \to S^n$ be a map. By a homotopy, we can make it smooth. The appropriate version of Sard’s theorem (compare Proposition 3.4.6 for the case $n = 1$) says that for almost all $p \in S^n$, the map $f$ is transverse at $p$, meaning that $f^{-1}\{p\}$ is a finite set of points and for each $q \in f^{-1}\{p\}$, the derivative $T_q f$ of $f$, which is a linear map from the $n$-dimensional vector space $T_q S^n$ to the $n$-dimensional vector space $T_p S^n$, is invertible. We count the number of points $q$ (with appropriate sign determined by the determinant of $T_q f$) and that number is the degree of $f$. It is not too hard to see that this is a homotopy invariant of the map $f$.
This process gives a homomorphism πn (S n ) → Z for each n. To
see that it’s an isomorphism needs some further ideas.
Definition 9.1.5. Let X be a space. The suspension of X, denoted
ΣX, is obtained from X × [0, 1] by identifying all of X × {0} to a
single point and all of X × {1} to another single point.

The suspension of S n is S n+1 . Suspension is functorial (that is,

a map X → Y “suspends” to a map ΣX → ΣY ). These two facts
combine to yield a suspension homomorphism
Σ : πk (X) → πk+1 (ΣX)
and in particular when X is a sphere, we have the suspension homo-
morphism πk (S n ) → πk+1 (S n+1 ).
Theorem 9.1.6 (Freudenthal suspension theorem). The sus-
pension homomorphism Σ : πk (X) → πk+1 (ΣX) is an isomorphism if
X is n-connected (which means that $\pi_j(X)$ is trivial for $j \le n$) and
$k \le 2n$; it is an epimorphism if $k = 2n + 1$.

This is proved by transversality-type arguments (or by more elaborate
techniques like spectral sequences, but I can’t go into that here).
Applied to the groups $\pi_n(S^n)$ we find that in the sequence of suspension
maps
\[ \pi_1(S^1) = \mathbb{Z} \to \pi_2(S^2) \to \pi_3(S^3) \to \cdots \]
the first map is an epimorphism and all the subsequent maps are
isomorphisms. But the suspension map is clearly compatible with
the degree, so that degree gives a left inverse to the epimorphism
$\pi_1(S^1) \to \pi_2(S^2)$, which is therefore injective as well. Thus all
suspension maps are isomorphisms in this case, and $\pi_n(S^n) = \mathbb{Z}$.
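It may help to spell out how the hypotheses of the suspension theorem are met here. The following short check is only a sketch of the bookkeeping; it uses nothing beyond Proposition 9.1.4 and the statement of Theorem 9.1.6.

```latex
% S^n is (n-1)-connected by Proposition 9.1.4, so Theorem 9.1.6 applies
% with connectivity n-1 in place of n:
\[
  \Sigma : \pi_k(S^n) \to \pi_{k+1}(S^{n+1})
  \quad\text{is}\quad
  \begin{cases}
    \text{an isomorphism} & \text{if } k \le 2(n-1) = 2n-2,\\
    \text{an epimorphism} & \text{if } k = 2n-1.
  \end{cases}
\]
% Take k = n.  Then n <= 2n-2 exactly when n >= 2 (isomorphism),
% while for n = 1 we have k = 1 = 2n-1 (epimorphism only).
```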

Example 9.1.7. The existence of the degree allows one to prove the
general case of the no-retraction theorem (Theorem 4.1.2) and there-
fore of the Brouwer ﬁxed-point theorem. Indeed, we can mimic the
proof given in Chapter 4 for the 2-dimensional case of Theorem 4.1.2,
using the degree of a map S n−1 → S n−1 in place of the winding
number of a map S 1 → S 1 .

The discussion so far yields the following table of homotopy groups
$\pi_k(S^n)$, where k labels the columns and n the rows:

         k=1   k=2   k=3   k=4
  n=1     Z     0     0     0
  n=2     0     Z     ?     ?
  n=3     0     0     Z     ?
  n=4     0     0     0     Z

This evidence is compatible with the idea that the whole table is full
of zeroes except for Z’s down the diagonal. What makes homotopy
theory fascinating (and diﬃcult) is that this is very far from being
the case. The ﬁrst “?” in the table, π3 (S 2 ), provides an example.
It turns out that this group is not zero, but Z. The generator is the
Hopf map. (See [24] for a more detailed introduction.)

Definition 9.1.8. The Hopf map is the map $\psi : S^3 \to S^2$ that is given
by sending each point of the unit sphere $S^3 \subseteq \mathbb{C}^2$ to the corresponding
point of the complex projective line $\mathbb{CP}^1$ (which can be identified
with the Riemann sphere $S^2 = \mathbb{C} \cup \{\infty\}$ via stereographic projection
(Lemma 2.3.7)). In explicit form,
\[ \psi(z, w) = (2 z \bar{w},\ |z|^2 - |w|^2) \in \mathbb{C} \times \mathbb{R} = \mathbb{R}^3. \]

For each point $p \in S^2$, the set $\psi^{-1}\{p\}$ is a copy of the circle $S^1$
sitting in $S^3$. For two different points $p, q \in S^2$ the circles $\psi^{-1}\{p\}$
and $\psi^{-1}\{q\}$ are linked. (This is how we know that the Hopf map is
homotopically nontrivial.)
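Two features of the explicit formula are easy to test numerically: the image of a point of $S^3$ has length 1 in $\mathbb{R}^3$, and the image does not change when (z, w) is multiplied by a unit complex number, which is why each fiber contains a whole circle. The sketch below is illustrative; the helper names and the angle parametrization of $S^3$ are assumptions made for the example.

```python
import cmath
import math

def hopf(z, w):
    """Hopf map in the explicit form (2 z conj(w), |z|^2 - |w|^2)."""
    return (2 * z * w.conjugate(), abs(z) ** 2 - abs(w) ** 2)

def point_on_s3(a, b, c):
    """A point (z, w) of the unit sphere in C^2, built from three angles."""
    z = math.cos(a) * cmath.exp(1j * b)
    w = math.sin(a) * cmath.exp(1j * c)
    return z, w

def norm_in_r3(image):
    """Euclidean length of a point of C x R = R^3."""
    u, h = image
    return math.sqrt(abs(u) ** 2 + h ** 2)

z, w = point_on_s3(0.7, 1.1, -0.4)
base = hopf(z, w)

# Moving along the circle (e^{i theta} z, e^{i theta} w) keeps the image fixed.
rotated = [hopf(cmath.exp(1j * th) * z, cmath.exp(1j * th) * w)
           for th in (0.5, 2.0, 4.0)]

print(norm_in_r3(base))
```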

The Hopf map is actually a fibration. What is a fibration? It is
a “nondiscrete” generalization of the notion of covering map.

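Definition 9.1.9. A continuous map $p : E \to B$ is a fibration if, for
any space X and any maps $g : X \to E$ and $f : X \times [0, 1] \to B$ with
$p(g(x)) = f(x, 0)$, the diagonal in the commutative diagram
\[
\begin{array}{ccc}
X \times \{0\} & \xrightarrow{\ g\ } & E \\
\big\downarrow & \nearrow & \big\downarrow{\scriptstyle p} \\
X \times [0, 1] & \xrightarrow{\ f\ } & B
\end{array}
\]
can be filled in by a continuous map (the diagonal arrow $X \times [0, 1] \to E$).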

Compare this with Theorem 8.2.6 for covering spaces. The dif-
ference is that in the definition of fibration, we do not assert that the
lift (the diagonal map) is uniquely determined by the given data. For
example, any product projection B × F → B is a fibration, whether
or not F is discrete.
If p : E → B is a fibration and B is path connected, then all of the
fibers p−1 {x}, x ∈ B, are homotopy equivalent. (Exercise: Prove
this!) One usually picks one, e.g., the inverse image of the basepoint,
and calls it “the” fiber F of the fibration.
If G is a Lie group acting transitively by diffeomorphisms on a
manifold X, then the map $G \to X$ defined by $g \mapsto g \cdot x_0$ ($x_0$ being some
fixed point in X) is a fibration whose fiber is the stabilizer subgroup
\[ G_{x_0} = \{ g \in G : g \cdot x_0 = x_0 \}. \]

In particular this covers the case of the Hopf fibration, which is given
by an action of the Lie group S 3 = SU (2) (2 × 2 complex unitary
matrices with determinant 1) on S 2 by rotations.
Another important example is the path space fibration. Let B
be any pointed space and let P B be the path space (Example 2.2.6)
of B, that is, the collection of all maps [0, 1] → B that start at the
basepoint. There is then a natural map p : P B → B sending each
path in P B to its “free” end. A tautological argument shows that
this map is a fibration. Its fiber is the loop space ΩB.
Theorem 9.1.10. Let p : E → B be a fibration with fiber F . There
is a long exact sequence of homotopy groups
· · · → πk+1 (E) → πk+1 (B) → πk (F ) → πk (E) → πk (B) → · · · .

When you see something like this, the most important question
to ask is how the connecting homomorphism is defined, that is, the
one which shifts dimension, from $\pi_{k+1}(B)$ to $\pi_k(F)$. This homomorphism
is just a generalized version of our definition of the winding
number! For instance if k = 0, we start with a loop in B, lift it to a
path in E starting at the basepoint, and ask where in F it ends up
(π0 (F ) = F when F is discrete). The general case is just a version
“with parameters” of the same argument.
Example 9.1.11. For the Hopf fibration we obtain isomorphisms
$\pi_k(S^3) = \pi_k(S^2)$ for $k \ge 3$, and also $\pi_2(S^2) = \pi_1(S^1)$ via the connecting
map (this is an alternative way to compute $\pi_2(S^2)$).
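To see where these isomorphisms come from, one can write out the relevant pieces of the long exact sequence of Theorem 9.1.10 with $E = S^3$, $B = S^2$ and fiber $F = S^1$. This is a sketch of the standard bookkeeping; it uses the vanishing of the higher homotopy groups of $S^1$ (its universal cover is contractible) and Proposition 9.1.4.

```latex
% Long exact sequence of the Hopf fibration S^1 -> S^3 -> S^2:
\[
  \cdots \to \pi_k(S^1) \to \pi_k(S^3) \to \pi_k(S^2)
         \to \pi_{k-1}(S^1) \to \cdots
\]
% For k >= 3 both pi_k(S^1) and pi_{k-1}(S^1) are trivial, so the middle
% map is an isomorphism:
\[
  0 \to \pi_k(S^3) \xrightarrow{\ \cong\ } \pi_k(S^2) \to 0
  \qquad (k \ge 3).
\]
% For k = 2 the neighbours are pi_2(S^3) = 0 and pi_1(S^3) = 0, so the
% connecting map is an isomorphism:
\[
  0 = \pi_2(S^3) \to \pi_2(S^2) \xrightarrow{\ \cong\ } \pi_1(S^1)
    \to \pi_1(S^3) = 0 .
\]
```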
Example 9.1.12. The path space P B is always contractible, as we
discussed a while ago, so the connecting map gives isomorphisms
πk (B) = πk−1 (ΩB) for all k. The k = 1 case of this is our origi-
nal definition of the fundamental group (as path components of the
loop space).

9.2. The topology of the general linear group

We are going to be interested in the topology of the general linear
group GL(m, C) of m × m complex invertible matrices. When m = 1,
this is our old favorite, the punctured plane.

Lemma 9.2.1. When $k < 2m$, the map $\pi_k(GL(m)) \to \pi_k(GL(m + 1))$
(induced by adding an extra 1 at the bottom right of an invertible
m × m matrix to give an invertible (m + 1) × (m + 1) matrix) is an
isomorphism.

Proof. $GL(m + 1)$ acts transitively on $\mathbb{C}^{m+1} \setminus \{0\}$, with stabilizer of
the last basis vector being the group H of matrices
\[ \begin{pmatrix} A & 0 \\ C & 1 \end{pmatrix}, \]
where $A \in GL(m)$ and C is an arbitrary row vector. The group H
is homotopy equivalent to $GL(m)$, via a linear homotopy on the C
components. Moreover, $\mathbb{C}^{m+1} \setminus \{0\}$ is homotopy equivalent to $S^{2m+1}$.
Thus, up to homotopy, there is a fibration
\[ GL(m) \to GL(m + 1) \to S^{2m+1}, \]
and the associated long exact sequence of homotopy groups gives the
result.
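The last step can be made explicit. The following is a sketch of the segment of the long exact sequence (Theorem 9.1.10) that is being used, together with the range of k in which Proposition 9.1.4 kills both outer terms.

```latex
% Segment of the long exact sequence for GL(m) -> GL(m+1) -> S^{2m+1}:
\[
  \pi_{k+1}(S^{2m+1}) \to \pi_k(GL(m)) \to \pi_k(GL(m+1))
                      \to \pi_k(S^{2m+1}).
\]
% By Proposition 9.1.4 both sphere groups are trivial when
% k + 1 < 2m + 1, that is, when k < 2m, and then the middle map is an
% isomorphism.  For k = 2m only the right-hand group vanishes, so the
% sequence gives surjectivity but not injectivity.
```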

Thus the interesting group is going to be $\pi_{2m-1}(GL(m))$, which
we can also write as $\pi_{2m-1}(GL)$ to indicate that there is no “penalty”
for increasing the size of the matrices if we want. The case m = 1 is
of course the winding number again.
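In the case m = 1 the group is $\pi_1(GL(1))$, the fundamental group of the punctured plane, and the integer attached to a loop is its winding number about 0. A minimal numerical sketch follows; the function names and the sample loops are illustrative assumptions, and the method (summing small changes of angle) is just one convenient way to compute the number.

```python
import cmath
import math

def winding_number(loop, samples=4000):
    """Winding number about 0 of a closed loop t -> loop(t), t in [0, 1].

    The loop must avoid 0, and the sampling must be fine enough that the
    angle changes by less than pi between consecutive sample points.
    """
    total = 0.0
    prev = loop(0.0)
    for i in range(1, samples + 1):
        cur = loop(i / samples)
        # The phase of cur/prev is the change of angle along this small step.
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

def circle(t):
    """The standard parametrization of the unit circle."""
    return cmath.exp(2j * math.pi * t)

def f(t):
    # z -> z^2 + 0.1 has both of its zeros inside the unit circle.
    return circle(t) ** 2 + 0.1

def g(t):
    # z -> conj(z)^3 runs three times around 0 in the negative direction.
    return circle(t).conjugate() ** 3

print(winding_number(f), winding_number(g))
```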

Theorem 9.2.2 (Bott periodicity theorem). The “odd order”
homotopy group $\pi_{2m-1}(GL(m))$ is always isomorphic to $\mathbb{Z}$. The isomorphism
is given as follows: given $f : S^{2m-1} \to GL(m)$, take the
first column of f as a map $g : S^{2m-1} \to \mathbb{C}^m \setminus \{0\} \simeq S^{2m-1}$. Then the
Bott integer invariant of f is
\[ \frac{1}{(m - 1)!} \deg(g), \]
where $\deg(g)$ is the usual degree of the map g. In particular, $\deg(g)$
is always divisible by $(m - 1)!$. Moreover, the “even order” groups
$\pi_{2m-2}(GL(m))$ are always trivial.

It is hard to overemphasize how surprising this theorem was when
Raoul Bott found it in 1957. At that time it was assumed that the
homotopy groups of any “reasonable” space, like those of spheres, become
successively harder and harder to compute. Bott’s work contradicted
this expectation and predicted a previously unheard-of general
pattern. But it turned out to be right.
Like many important results, Bott periodicity has several dif-
ferent proofs. It turns out that one of them is based on Fredholm
operators!

Definition 9.2.3. Let Fred(H) denote the space of Fredholm operators
on a Hilbert space H. The subset $\mathrm{Fred}_0(H)$ is defined to be
the connected component of Fred(H) containing the identity, i.e., the
Fredholm operators of index zero.

Proposition 9.2.4. For $k \ge 1$ there is an isomorphism
$\pi_k(\mathrm{Fred}_0) \cong \pi_{k-1}(GL)$.

To prove this proposition, what we must do is to exhibit a fibration
(up to homotopy)
\[ GL \to E \to \mathrm{Fred}_0, \]
where the space E in the middle is contractible (that is, homotopy
equivalent to a point). Let B(H) denote the space of all bounded
operators on H, and let G be the group of invertible elements in
B(H). Similarly, let Q(H) = B(H)/K(H) be the Calkin algebra,
and let F be the group of invertible elements in Q(H). Let F0 be
the component of the identity in F . Let ρ : B(H) → Q(H) be the
quotient map. By Atkinson’s theorem (Theorem 7.2.4), F = ρ(Fred)
and F0 = ρ(Fred0 ).
The map G → F0 is a surjective group-homomorphism. The
kernel K of this homomorphism is the group of invertibles on H
which are of the form identity plus compact. Even though our groups
are infinite-dimensional, it can be shown that G → F0 is a fibration
with fiber K. Now Proposition 9.2.4 will follow from three assertions:

(a) The ﬁber K is homotopy equivalent to GL.

(b) The total space G is contractible (Kuiper’s theorem).
(c) The projection ρ : Fred0 → F0 is a homotopy equivalence.

The most surprising of these three items is probably (b), Kuiper’s
theorem: the group GL(H) of invertible bounded operators on an
infinite-dimensional Hilbert space H is contractible. So I’m going to
sketch the proof. Actually, I’m not going to sketch the full proof of
Kuiper’s theorem, but of a weaker version, which says that the inclusion
$GL(H) \to GL(H \oplus H)$ given by $a \mapsto \begin{pmatrix} a & 0 \\ 0 & I \end{pmatrix}$ is nullhomotopic.
This weaker statement includes the main idea of Kuiper’s proof, and,
actually, it would be enough for our purposes (because the “stabilization”
idea of increasing the size of matrices is built into our discussion
anyhow).

Remark 9.2.5. Don’t confuse GL(H) with the smaller group
$GL = \lim_{n \to \infty} GL(n)$ that we discussed earlier. This latter group is a proper
subgroup of the big group GL(H).

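Proof of the simplified Kuiper theorem. If U is any invertible
bounded operator, there is a standard formula which gives a path $\gamma_U$
of 2 × 2 invertible matrices with
\[ \gamma_U(0) = \begin{pmatrix} U & 0 \\ 0 & U^{-1} \end{pmatrix}, \qquad \gamma_U(1) = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}. \]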

We constructed such a formula in our proof of Proposition 7.2.6. Now
let U be any invertible operator on H. We map to $GL(H \oplus H)$ as discussed
above, but now we use the fact that the second H is isomorphic to
the direct sum of infinitely many copies of itself. Thus, we are looking
at the invertible operator in $GL(H \oplus H \oplus H \oplus \cdots)$ which has U at
the top left and I’s down the rest of the diagonal. For simplicity let’s
just denote this operator as $V = U \oplus I \oplus I \oplus \cdots$, and let’s distinguish
the individual Hilbert space factors as $H_0$, $H_1$, $H_2$, and so on.
Thinking about the subspace $H_1 \oplus H_2$, V restricts to this subspace
and acts as $I \oplus I$. Follow the path $\gamma_{U^{-1}}$ backwards on this subspace
to get to the operator $U^{-1} \oplus U$. What’s more, follow that same path
on all the subspaces $H_{2k-1} \oplus H_{2k}$ (in other words, take the direct
sum of all the paths). At time 1 we have deformed V to
\[ W = U \oplus (U^{-1} \oplus U) \oplus (U^{-1} \oplus U) \oplus \cdots. \]
(The parentheses just indicate the individual 2-space summands on
which we carried out our rotations.) Now reparenthesize W as
\[ W = (U \oplus U^{-1}) \oplus (U \oplus U^{-1}) \oplus \cdots \]
and on each of the summands $H_{2k} \oplus H_{2k+1}$ follow the path $\gamma_U$ to get
to
\[ (I \oplus I) \oplus (I \oplus I) \oplus \cdots, \]
which is of course the identity.
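The path $\gamma_U$ is taken from Proposition 7.2.6, which is not reproduced in this section. As an illustration only, the sketch below uses one common rotation formula with the required endpoints; whether it matches the book's formula is an assumption. Here U is a nonzero complex number standing in for an invertible operator, and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(angle):
    """Rotation of the plane through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def gamma(u, t):
    """Assumed formula: diag(u, 1) * R_t * diag(1, 1/u) * R_t^{-1},
    where R_t is rotation through pi*t/2.  Every factor is invertible,
    so the path stays inside the invertible 2 x 2 matrices."""
    r = rotation(math.pi * t / 2)
    r_inv = rotation(-math.pi * t / 2)
    left = [[u, 0], [0, 1]]
    right = [[1, 0], [0, 1 / u]]
    return matmul(left, matmul(r, matmul(right, r_inv)))

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2 x 2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

u = 2 + 1j
start = gamma(u, 0.0)   # expected: diag(u, 1/u)
end = gamma(u, 1.0)     # expected: the identity
dets = [det(gamma(u, k / 10)) for k in range(11)]
print(start, end)
```

With this formula the determinant equals 1 at every time t, which is one quick way to see that the path never leaves the invertible matrices.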

Remark 9.2.6. The key observation from functional analysis which
makes this work is that if you have an operator on Hilbert space
given by a diagonal matrix (or a block diagonal matrix, i.e., one
which is a direct sum of “blocks”), then the norm of the operator is
the supremum of the norms of the diagonal entries or blocks. Thus, if
all the blocks vary in a continuous way (with some uniformity, which
is certainly true here because the blocks are all basically the same),
then the operator varies continuously as well.

These ideas give us a ﬁbration (up to homotopy)

GL → • → Fred0
which provides isomorphisms πk (Fred) → πk−1 (GL) (morally speak-
ing, a map Ω Fred → GL). What about the other direction to ﬁnish
the proof of Bott periodicity? We need a map ΩGL → Fred.
But we already constructed such a map! An element of ΩGL is
just a map f : S 1 → GL(k) for some k. To this, we may associate a
(matrix) Toeplitz operator Tf as in Theorem 7.4.4, which is Fredholm
since its symbol is invertible. In this way we get a map
ΩGL → Fred
and the index theorem for matrix Toeplitz operators says that this
map induces an isomorphism on the level of π0 , from π1 (GL) =
π0 (ΩGL) to π0 (Fred). We need to understand why this map too
is a homotopy equivalence. The reason is going to be the Toeplitz
index theorem plus some extra algebraic structure.
To get a better idea about this let’s express the crucial homotopy
groups $\pi_{k-1}(GL)$ in a more homogeneous manner. Suppose we have a
map $\varphi : S^{k-1} \to GL(m)$. We extend it to a map $\mathbb{R}^k \to GL(\mathbb{C}^m \oplus \mathbb{C}^m)$
as follows:
\[ \Phi(x) = \begin{pmatrix} 0 & |x|\,\varphi(x/|x|) \\ |x|\,\varphi(x/|x|)^{-1} & 0 \end{pmatrix}. \]
In other words, we have first extended ϕ homogeneously from the
unit sphere to the whole of Euclidean space, and then we have “doubled”
it to act on $\mathbb{C}^m \oplus \mathbb{C}^m$. A vector space which is split into two
“halves” in this way is sometimes called a super vector space, and
linear transformations on it are classified as even or odd according to
whether they preserve or exchange the two factors in the decomposition.
The map Φ is what is called a homogeneous supersymmetry; in other
words, it is odd and satisfies
\[ \Phi(x)^2 = |x|^2 I. \]
The homotopy classes of such homogeneous supersymmetries on $\mathbb{R}^k$
can be organized into a group $K(\mathbb{R}^k)$, which is of course isomorphic
to $\pi_{k-1}(GL)$; all we have done is rewrite things in a different way.
The advantage of this approach though is that it makes evident a
multiplicative structure which we might not have seen before. There
is a product $K(\mathbb{R}^{k_1}) \times K(\mathbb{R}^{k_2}) \to K(\mathbb{R}^{k_1 + k_2})$ defined by
\[ \Phi = \Phi_1 \mathbin{\hat{\otimes}} 1 + 1 \mathbin{\hat{\otimes}} \Phi_2, \]
and this makes the direct sum $K = \bigoplus_k K(\mathbb{R}^k)$ into a graded ring.
What we have to prove is that this graded ring is actually a polynomial
ring on $b \in K(\mathbb{R}^2)$, where b is the Bott generator
\[ b(z) = \begin{pmatrix} 0 & z \\ \bar{z} & 0 \end{pmatrix}, \]
where we have identified $\mathbb{R}^2$ with $\mathbb{C}$.
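As a sanity check in the smallest case (k = 2, m = 1), one can verify numerically that the Bott generator squares to $|z|^2 I$, and that it is exactly the doubled homogeneous extension of the identity map of the unit circle. The code is a sketch; the function names are illustrative and not taken from the text.

```python
def bott(z):
    """Bott generator b(z) = [[0, z], [conj(z), 0]], with z in C = R^2."""
    return [[0, z], [z.conjugate(), 0]]

def doubled_extension(phi, x):
    """Phi(x) for a map phi from the unit circle to GL(1) = C minus {0};
    x is a nonzero complex number regarded as a point of R^2."""
    r = abs(x)
    value = phi(x / r)
    # Off-diagonal entries |x| phi(x/|x|) and |x| phi(x/|x|)^{-1}.
    return [[0, r * value], [r / value, 0]]

def square(m):
    """Square of a 2 x 2 matrix given as nested lists."""
    return [[sum(m[i][k] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

z = 1.5 - 2j                      # |z|^2 = 6.25
b_squared = square(bott(z))
from_identity = doubled_extension(lambda u: u, z)
print(b_squared)
```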
The map α : πk+1 (GL) → πk−1 (GL) deﬁned by Toeplitz index
theory now becomes a map of degree −2, α : K → K.

Lemma 9.2.7. α is a map of Z[b]-modules.

What this means is that once we know that $\alpha(b) = 1$ (which is
the basic Toeplitz index theorem) we will also know that $\alpha(b^m) = b^{m-1}$
and thus that α is a left inverse to multiplication by b. To show
that it is also a right inverse, Atiyah [5] used an ingenious homotopy
to show
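Lemma 9.2.8. Let X be any space. The diagram
\[
\begin{array}{ccc}
K(X \times \mathbb{R}^2) & \xrightarrow{\ b\ } & K(X \times \mathbb{R}^2 \times \mathbb{R}^2) \\
\big\downarrow{\scriptstyle \alpha} & & \big\downarrow{\scriptstyle \alpha} \\
K(X) & \xrightarrow{\ b\ } & K(X \times \mathbb{R}^2)
\end{array}
\]
commutes.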

Because of this lemma, the fact that α is a right inverse of b for
the space X follows from the fact that it is a left inverse of b for the
space $X \times \mathbb{R}^2$.
Our brief sketch of the proof of Bott periodicity is thereby com-
pleted, and with it we conclude this book. I wish you much happy
winding around in the future.

Appendix A

Linear Algebra

Linear algebra is the study of linear transformations. A linear transformation
is, essentially, one that can be computed on the sum of
two objects by applying it to each of the objects separately and then
adding the results (the formal definition is a tad more complicated:
it can be found in Definition A.3.1). Such objects are common in
mathematics, and linear algebra is therefore an important foundation
for many mathematical theories.
A classical reference for the material of this appendix is Hal-
mos [20]. For a more modern treatment, see Axler [7]. Either of
these books will provide much more detail and background about the
key ideas of linear algebra.

A.1. Vector spaces

We fix a number system called the field of scalars. For the purposes
of this book, the scalars will be either the real numbers (R) or the
complex numbers (C). (Other scalar fields are possible, and important
in various parts of mathematics, but we won’t need them here.) To
encompass both possibilities, we’ll use the symbol K.


Definition A.1.1. A vector space with scalar field K is a set V
equipped with two mappings:

(a) addition, which is a mapping $V \times V \to V$ and is denoted
$(u, v) \mapsto u + v$, and
(b) scalar multiplication, which is a mapping $K \times V \to V$ and is
denoted $(\lambda, v) \mapsto \lambda v$.

These mappings are required to have the following properties:

(i) Addition is associative: u + (v + w) = (u + v) + w for all u, v, w ∈ V.
(ii) Addition is commutative: u + v = v + u for all u, v ∈ V .
(iii) There is an identity element for addition: that is, an element
0 ∈ V such that 0 + v = v = v + 0 for all v ∈ V .
(iv) There are additive inverses in V : for each v ∈ V there exists an
element (−v) ∈ V such that v + (−v) = 0.
(v) Scalar multiplication distributes over addition of scalars:
(λ + μ)v = λv + μv for all λ, μ ∈ K and v ∈ V .
(vi) Scalar multiplication distributes over addition of vectors:
λ(u + v) = λu + λv for all λ ∈ K and u, v ∈ V .
(vii) Scalar multiplication is associative: (λμ)v = λ(μv) for all λ, μ ∈
K and v ∈ V .
(viii) The multiplicative unit 1 ∈ K acts as unit for scalar multiplica-
tion: 1v = v for all v ∈ V .

The elements of a vector space are called vectors.

The set consisting of a single vector 0 is a vector space, the zero
space. The field K itself is an example of a vector space (with scalar
multiplication being multiplication in K). More generally, for any
integer n, the collection $K^n$ of n-tuples of elements of K, with
“componentwise” addition and scalar multiplication, is a vector space.
Indeed, the space of functions from any set X to K is a vector space,
with componentwise operations: the space $K^n$ of n-tuples is an example
of this, where X = {1, 2, . . . , n}.

Definition A.1.2. Let S be a subset of a vector space V. An expression
of the form
\[ \sum_{i=1}^{k} \lambda_i s_i, \]
where the $\lambda_i$ are members of K (scalars) and the $s_i$ are members of
S, is called a linear combination of members of S.
Deﬁnition A.1.3. A subset W of a vector space V is called a sub-
space if it has the property that any linear combination of members
of W is itself a member of W . In this case, the addition and scalar
multiplication mappings send W × W into W and K × W into W , so
that W may be considered as a vector space in its own right.
Example A.1.4. Let X be a metric space (for example, the closed
interval [0, 1]) and let V be the vector space of all functions X → K.
Then the continuous functions X → K form a subspace of V .

Let V be a vector space and let $V_1$, $V_2$ be subspaces of V. Their
intersection $V_1 \cap V_2$ is always a subspace of V as well. Their union
$V_1 \cup V_2$ is almost never a subspace of V (think for example of $V = \mathbb{R}^2$
with $V_1$ being the x-axis and $V_2$ being the y-axis). However, the sum
\[ V_1 + V_2 := \{ v_1 + v_2 : v_1 \in V_1,\ v_2 \in V_2 \}, \]
that is, the collection of all vectors $v \in V$ that can be represented as
a sum $v = v_1 + v_2$ with $v_1 \in V_1$ and $v_2 \in V_2$, is a subspace of V and
in fact is the smallest such subspace that contains both $V_1$ and $V_2$.
Deﬁnition A.1.5. Let V be a vector space and V1 , V2 subspaces of
V . If V1 ∩ V2 = {0} and V1 + V2 = V , then we say V is the direct sum
of V1 and V2 and write V = V1 ⊕ V2 .

It is equivalent to say that $V = V_1 \oplus V_2$ if every $v \in V$ can be
represented in a unique way as a sum of $v_1 \in V_1$ and $v_2 \in V_2$. The
uniqueness follows because if
\[ v_1 + v_2 = v = v_1' + v_2' \]
are two such representations, then $v_1 - v_1' = v_2' - v_2$ belongs to the
intersection $V_1 \cap V_2$ and is therefore zero. Thus the maps $P_1 : V \to V_1$
and $P_2 : V \to V_2$, sending v to its components $v_1$ and $v_2$, respectively,
are well-defined.

Figure A.1. A subspace ($V_1$) of the plane and several complementary
subspaces to it ($V_2$, $V_2'$, $V_2''$).

Definition A.1.6. The maps $P_1$, $P_2$ defined above are called the
projections associated to the direct sum decomposition $V = V_1 \oplus V_2$.
(We call $P_1$ the projection onto $V_1$ along $V_2$, and we call $P_2$ the
projection onto $V_2$ along $V_1$.)
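A small computation shows how the projection onto a subspace depends on the choice of complement. The sketch below works in the plane with $V_1$ the x-axis and $V_2$ the line spanned by a chosen direction vector; rational numbers stand in for the scalar field, and the function name `decompose` is an assumption made for the example.

```python
from fractions import Fraction

def decompose(v, direction):
    """Split v = v1 + v2 with v1 on the x-axis (V1) and v2 in the line V2
    spanned by `direction`, which must not lie on the x-axis.

    Returns (v1, v2), i.e. (P1(v), P2(v)) for the decomposition V1 (+) V2.
    """
    x, y = Fraction(v[0]), Fraction(v[1])
    c, d = Fraction(direction[0]), Fraction(direction[1])
    if d == 0:
        raise ValueError("direction must not lie on the x-axis")
    b = y / d          # coefficient of the direction vector
    a = x - b * c      # coefficient of (1, 0)
    return (a, Fraction(0)), (b * c, b * d)

v = (5, 2)
along_diagonal = decompose(v, (1, 1))   # complement spanned by (1, 1)
along_y_axis = decompose(v, (0, 1))     # complement is the y-axis
print(along_diagonal, along_y_axis)
```

The same vector v has different components in $V_1$ for the two complements, even though $V_1$ itself has not changed.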
Deﬁnition A.1.7. Let V be a vector space and V1 a subspace. An-
other subspace V2 is called a complement (or complementary) to V1
if V = V1 ⊕ V2 .

Note that there can be many complements to a given subspace.

See Figure A.1.

A.2. Basis and dimension

Let V be a vector space and let $S = \{s_1, \ldots, s_k\}$ be a finite subset of
V. Associated to S is a map
\[ \Phi_S : K^k \to V, \qquad \Phi_S(\lambda_1, \ldots, \lambda_k) = \sum_{i=1}^{k} \lambda_i s_i, \tag{A.2.1} \]
which sends each k-tuple of scalars $(\lambda_1, \ldots, \lambda_k)$ to the corresponding
linear combination of members of S.
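Definition A.2.2. Let S and V be as above. Then:
(i) If the map $\Phi_S$ is injective, we say that S is a linearly independent
set. (That is to say, S is linearly independent if there is no
linear combination of members of S, with coefficients not all zero,
that is equal to 0.)
(ii) If the map $\Phi_S$ is surjective, we say that S is a spanning set (or that
S spans V).
(iii) If the map $\Phi_S$ is bijective, we say that S is a basis for V.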

Deﬁnition A.2.3. If a vector space V has a ﬁnite spanning set, we

say that it is ﬁnite-dimensional.

Lemma A.2.4. Let V be a ﬁnite-dimensional vector space. Then:

(a) Any (ﬁnite) spanning set has a subset which is a basis.
(b) Any (ﬁnite) linearly independent set is a subset of a basis.

Proof. (a) Let S be a finite spanning set. By removing elements of S
one by one, you will eventually reach a minimal spanning set: a subset
$S'$ of S which spans V, but such that no proper subset of it spans V.
I claim that such a minimal spanning set is linearly independent and
is therefore a basis.
Indeed, if the set $S'$ fails to be linearly independent, there exist
$s_1, \ldots, s_k \in S'$ and scalars $\lambda_1, \ldots, \lambda_k \in K$, not all zero, such that
$\sum_{j=1}^{k} \lambda_j s_j = 0$. Suppose without loss of generality that $\lambda_1 \neq 0$.
Then
\[ s_1 = -\sum_{j=2}^{k} (\lambda_j / \lambda_1) s_j, \]
and using this relation, any linear combination of members of $S'$ can
be rewritten so as not to involve $s_1$. Thus $S' \setminus \{s_1\}$ spans V, contradicting
the hypothesis that $S'$ is a minimal spanning set.
The proof of (b) is similar. Suppose T is a linearly independent set
and S is a (finite) spanning set for V. By adding elements of S one
by one to T, you will eventually reach a maximal linearly independent
set: a set $T'$ with $T \subseteq T' \subseteq T \cup S$ which is linearly independent, but
with the property that any $T'' \subseteq T \cup S$ that properly includes $T'$
is not linearly independent. I claim that $T'$ spans V. Suppose first
that $s \in S \setminus T'$. Then the set $T' \cup \{s\}$ is not linearly independent, so
there is a linear relation among its members. Because $T'$ is linearly
independent, this linear relation must include the vector s with a
nonzero coefficient. Thus (rewriting the relation as we did in the
previous paragraph) s is a linear combination of members of $T'$. This
applies to every $s \in S \setminus T'$, so all members of S are either in $T'$
already or are linear combinations of members of $T'$. Since S spans
V, it follows that $T'$ spans V, as required.
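Both halves of the proof are effectively algorithms, and they can be run on small examples. The sketch below works in $\mathbb{Q}^n$ with exact rational arithmetic; deciding "spans" and "linearly independent" by computing a rank with Gaussian elimination is a convenience of the example rather than part of the proof, and all function names are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors (tuples), by Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def is_independent(vectors):
    return rank(vectors) == len(vectors)

def spans(vectors, dim):
    return rank(vectors) == dim

def shrink_to_basis(spanning, dim):
    """Part (a): discard vectors one at a time while the rest still span."""
    current = list(spanning)
    i = 0
    while i < len(current):
        without = current[:i] + current[i + 1:]
        if spans(without, dim):
            current = without      # this vector was redundant
        else:
            i += 1                 # this vector is needed; keep it
    return current

def extend_to_basis(independent, spanning):
    """Part (b): add vectors of the spanning set while independence survives."""
    current = list(independent)
    for s in spanning:
        if is_independent(current + [s]):
            current.append(s)
    return current

S = [(1, 0, 0), (1, 1, 0), (2, 1, 0), (0, 0, 1), (1, 1, 1)]
basis_a = shrink_to_basis(S, 3)
basis_b = extend_to_basis([(1, 1, 1)], S)
print(basis_a, basis_b)
```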

So far we have deﬁned “ﬁnite-dimensional” but not “dimension”.

The next result allows us to give a definition of dimension.
Proposition A.2.5. In a finite-dimensional vector space, the number
of elements in any linearly independent set is less than or equal to the
number of elements in any spanning set. In particular, any two bases
have the same (finite) number of elements.
Definition A.2.6. The number of elements in a basis of a vector
space V is called its dimension and is denoted dim(V ).

Proof. Let S span V and let T be linearly independent. We will describe
a process (called the replacement algorithm) which will produce
a sequence of spanning sets $S = S_0, S_1, S_2, \ldots$ such that:
(a) All of the sets $S_j$ have the same number (say m) of elements.
(b) $S_j$ consists of j elements of T and m − j elements of S.
(c) $S_{j+1}$ is obtained from $S_j$ by replacing an element of S (already
contained in $S_j$) by an element of T (not already contained in $S_j$).
The process terminates when all the elements of T have been brought in
as replacements for elements of S. At that point we have obtained a
spanning set $S_k$ that has the same number of elements as the original S
and has T as a subset. Clearly this implies that T has no more elements than S.
Here is how the replacement algorithm works. Suppose that S
originally has m elements. At the jth step, we have
Sj = {t1 , . . . , tj , s1 , . . . , sm−j },
where the t’s belong to T and the s’s belong to S, and this is a
spanning set. If the set {t1 , . . . , tj } is all of T , we are done. Otherwise,
there is some element of T that is not one of the t1 , . . . , tj ; choose such
an element and denote it by tj+1 .

Since $S_j$ is a spanning set, there exist scalars $\lambda_i$ and $\mu_i$ such that
\[ t_{j+1} = \sum_{i=1}^{j} \lambda_i t_i + \sum_{i=1}^{m-j} \mu_i s_i. \]
At least one of the μ’s must be nonzero (since T is a linearly independent
set): say (without loss of generality) that it is $\mu_1$. Then we
can rewrite the equation above to express $s_1$ as a linear combination
of the members of
\[ S_{j+1} = \{ t_1, \ldots, t_{j+1}, s_2, \ldots, s_{m-j} \}. \]
Thus every linear combination of members of $S_j$ is also a linear combination
of members of $S_{j+1}$, so $S_{j+1}$ spans V. We have successfully
“replaced” $s_1$ by $t_{j+1}$, as required.

Proposition A.2.7. Every subspace of a ﬁnite-dimensional vector

space is finite-dimensional and has a complement which is also finite-
dimensional. Moreover, if the finite-dimensional space V is the direct
sum of subspaces X and Y (see Definition A.1.5), then

dim(V ) = dim(X) + dim(Y ).

Proof. First we show that any subspace of a finite-dimensional space
is finite-dimensional. Let V be finite-dimensional and X a subspace
of V. Every linearly independent subset of X is also a linearly independent
subset of V: thus, no such subset can have more than
dim V members, by Proposition A.2.5. Therefore there exist maximal
linearly independent (finite) subsets of X: such a subset $S \subseteq X$
is linearly independent, but no subset of X that properly contains
S is linearly independent. The same argument as in the proof of
Lemma A.2.4(b) shows that such a maximal linearly independent subset
is a basis for X. Thus X is finite-dimensional (and its dimension
is less than or equal to the dimension of V).
Now choose a basis $\{x_1, \ldots, x_k\}$ for X. It is linearly independent
(in V) so by Lemma A.2.4(b) there are elements $\{y_1, \ldots, y_\ell\}$ such
that $\{x_1, \ldots, x_k, y_1, \ldots, y_\ell\}$ is a basis for V. The elements $\{y_1, \ldots, y_\ell\}$
form a basis for another subspace Y which is complementary to X.
Conversely, if X and Y are complementary subspaces and we combine
bases $\{x_1, \ldots, x_k\}$ for X and $\{y_1, \ldots, y_\ell\}$ for Y, we obtain a basis
$\{x_1, \ldots, x_k, y_1, \ldots, y_\ell\}$ for V. We now have
\[ \dim V = k + \ell = \dim X + \dim Y, \]
as required.
Corollary A.2.8. The only n-dimensional subspace of an n-dimens-
ional vector space V is V itself.

Proof. A complementary subspace is 0-dimensional, hence trivial.

A.3. Linear transformations

Definition A.3.1. Let V and W be vector spaces (having the same
scalar field K). A map T : V → W is called a linear map or linear
transformation if
T (λ1 v1 + λ2 v2 ) = λ1 T (v1 ) + λ2 T (v2 )
for all λ1 , λ2 ∈ K and v1 , v2 ∈ V .
Example A.3.2. Suppose that V is decomposed as a direct sum
V = V1 ⊕ V2 . Then the projections P1 and P2 associated to the direct
sum decomposition (see Definition A.1.6) are linear maps.
Example A.3.3. If S : U → V and T : V → W are linear maps, so
is their composite T ◦ S (the map U → W defined by (T ◦ S)(u) =
T (S(u))).
Definition A.3.4. Let T : V → W be a linear map. The kernel and
image of T are defined by
Ker T = {v ∈ V : T v = 0},
Im T = {w ∈ W : ∃v ∈ V, T v = w}.
They are vector subspaces of V and W , respectively.
Remark A.3.5. By definition, a linear map T : V → W is surjective
(onto) if and only if Im(T ) = W . Also, T is injective (one-to-one) if
Ker(T ) = {0}: to see this, notice that if T (v1 ) = T (v2 ), then
T (v1 − v2 ) = T (v1 ) − T (v2 ) = 0,

and therefore v1 − v2 ∈ Ker(T ). If T is bijective (that is, both surjec-

tive and injective), it is called an isomorphism.

Deﬁnition A.3.6. Vector spaces V and W are isomorphic if there

exists an isomorphism (a bijective linear map) from V to W .

Example A.3.7. Let S = {s1 , . . . , sk } be a ﬁnite subset of V . Then

the associated map ΦS : Kk → V (see (A.2.1)) is linear. If S is a basis
for V , then ΦS is linear and bijective, hence an isomorphism.

Proposition A.3.8. Isomorphic vector spaces have the same dimen-

sion.

Proof. Let T : V → W be an isomorphism and let S be a subset of

V . Deﬁne T (S) to be {T s : s ∈ S} ⊆ W . Because T is surjective, if S
spans V , then T (S) spans W . Because T is injective, if S is linearly
independent in V , then T (S) is linearly independent in W . Thus T
carries bases of V to bases of W , and the result follows.

Proposition A.3.9. Let V be a vector space and let W be a subspace

of V . Then any two complementary subspaces to W are isomorphic.

Proof. Let X and Y be two complementary subspaces, so that
$V = W \oplus X = W \oplus Y$. Let P denote the projection onto X along W, and
let Q denote the projection onto Y along W. I claim that if $x \in X$,
then $P(Q(x)) = x$. Indeed, write
\[ x = y + w, \qquad y \in Y,\ w \in W. \]
Then by definition $Q(x) = y$. But then the decomposition
\[ y = x + (-w), \qquad x \in X,\ (-w) \in W, \]
shows that, by definition, $P(y) = x$. The symmetric argument shows that
$Q(P(y)) = y$ for every $y \in Y$. It follows that $Q : X \to Y$ and
$P : Y \to X$ are inverses of one another, so they are isomorphisms.

The previous two propositions imply that the following deﬁnition

is a good one.

Deﬁnition A.3.10. The codimension of a subspace W of a vector

space V is the dimension of any complementary subspace of W .

If everything is ﬁnite-dimensional, Proposition A.2.7 shows that

the codimension of W in V is just dim(V ) − dim(W ). The idea
of codimension gains importance in the infinite-dimensional context.
Even if V and W are not finite-dimensional, the codimension of W
in V may be finite, as the following example shows.

Example A.3.11. Let V be the vector space of all continuous func-

tions [0, 1] → K, and let W be the subspace consisting of all functions
that vanish at 0 (that is, W = {f ∈ V : f (0) = 0}). Both V and
W are inﬁnite-dimensional. However, W has codimension 1 in V . To
show this, we just need to construct a 1-dimensional complementary
subspace; I claim that the space of constant functions does the job.
The space X of constant functions is 1-dimensional (a basis is pro-
vided by the single function 1, the constant function with value 1).
Clearly $W \cap X = \{0\}$, and on the other hand the identity
\[ f = \underbrace{(f - f(0)1)}_{\in W} + \underbrace{(f(0)1)}_{\in X} \]
shows that $W + X = V$.

Lemma A.3.12. Let V be a ﬁnite-dimensional vector space and let

T : V → W be a linear transformation. Let Y be a subspace of V
complementary to Ker(T ). Then the restriction of T to a map Y →
Im(T ) is an isomorphism.

Proof. The existence of the complementary subspace Y follows from
Proposition A.2.7. Let $T'$ denote the restriction of T to a map
$Y \to \operatorname{Im}(T)$. Then $\operatorname{Ker} T' = \operatorname{Ker} T \cap Y = \{0\}$, so $T'$ is injective. On the
other hand, if $w \in \operatorname{Im}(T)$, then $w = Tv$ for some v, and we may write
$v = x + y$ with $x \in \operatorname{Ker}(T)$ and $y \in Y$. We have
\[ w = T(v) = T(x) + T(y) = 0 + T'(y) \in \operatorname{Im}(T'). \]
Thus $T'$ is surjective. Being linear, injective, and surjective, $T'$ is an
isomorphism.

It’s helpful to represent this lemma as in Figure A.2: T maps

“part” of V (the kernel) to zero, and a “complementary part” (the
space Y ) isomorphically to a “part” of W (the image).

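[Figure A.2. Schematic representation of a linear operator. V is drawn as
Ker T together with a complement to Ker T, and W as Im T together with a
complement to Im T; T maps the complement to Ker T isomorphically onto Im T.]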

Deﬁnition A.3.13. Let T : V → W be a linear map. The nullity of

T is the dimension of Ker T , and the rank of T is the dimension of
Im T .
Theorem A.3.14 (Rank-nullity theorem). For a linear trans-
formation T : V → W between ﬁnite-dimensional vector spaces, we
have
Nullity T + Rank T = dim V.

Proof. Let Y be a complement to Ker(T) in V. According to Proposition A.2.7,
\[ \dim(Y) + \dim \operatorname{Ker} T = \dim V. \]
But by Lemma A.3.12, Y is isomorphic to Im T. The result follows
from Proposition A.3.8.
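The theorem is easy to test on a concrete matrix. The sketch below computes the rank and an explicit basis of the kernel by row reduction over the rationals; the matrix T and all function names are assumptions chosen for the illustration, and row reduction is simply one convenient way to obtain the numbers.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form over Q; returns (rows, pivot_columns)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def kernel_basis(matrix):
    """One kernel vector for each non-pivot (free) column."""
    rows, pivots = rref(matrix)
    ncols = len(matrix[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vec[p] = -rows[row_index][free]
        basis.append(vec)
    return basis

def apply(matrix, vec):
    """Matrix times column vector."""
    return [sum(Fraction(a) * b for a, b in zip(row, vec)) for row in matrix]

# T : Q^4 -> Q^3; the third row is the sum of the first two.
T = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]

rank_T = len(rref(T)[1])
nullity_T = len(kernel_basis(T))
dim_V, dim_W = len(T[0]), len(T)
print(rank_T, nullity_T)
```

Here `dim_W - rank_T` is the codimension of Im T in W in the finite-dimensional sense, so the same numbers can be used to compare the nullity with that codimension.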

A reformulation of the rank-nullity theorem will become impor-

tant in the discussion of the theory of Fredholm operators in Chap-
ter 7.
Deﬁnition A.3.15. Let T : V → W be a linear map. The corank
of T , Corank T , is the codimension of Im(T ) in W . Similarly, the
conullity of T , Conullity T , is the codimension of Ker T in V .
Proposition A.3.16. For a linear transformation T : V → W be-
tween ﬁnite-dimensional vector spaces, we have
Nullity(T ) − Corank(T ) = dim V − dim W.

Proof. This follows immediately from Theorem A.3.14 and the def-
inition of codimension.

We can generalize this result by using the idea of exact sequence.

Deﬁnition A.3.17. Let V0 , V1 , . . . , Vn be a sequence of vector spaces

and Tj : Vj → Vj+1 linear maps between them:
\[
  V_0 \xrightarrow{\;T_0\;} V_1 \xrightarrow{\;T_1\;} \cdots \xrightarrow{\;T_{n-1}\;} V_n .
\]

The sequence is exact if Im(Tj ) = Ker(Tj+1 ) for each j = 0, . . . , n−2.

An exact sequence is terminating if V0 = Vn = {0}.

Theorem A.3.18. In a terminating exact sequence of finite-dimensional
vector spaces, the alternating sum of the dimensions of the
spaces, $\sum_j (-1)^j \dim(V_j)$, is equal to zero.

Proof. For each j let Xj be the subspace of Vj deﬁned by

Xj = Ker(Tj ) = Im(Tj−1 ),
and let Yj be a complementary subspace. According to Lemma A.3.12,
Tj gives rise to an isomorphism between Yj and Xj+1 , so these spaces
are of the same dimension. Thus

\[
\begin{aligned}
  \sum_j (-1)^j \dim(V_j) &= \sum_j (-1)^j \bigl(\dim(X_j) + \dim(Y_j)\bigr) \\
                          &= \sum_j (-1)^j \bigl(\dim(X_j) + \dim(X_{j+1})\bigr).
\end{aligned}
\]

The expression on the right is a “telescoping” sum which gives 0, as

required.
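
As a consistency check, the rank-nullity theorem is the special case of Theorem A.3.18 for a five-term sequence. The worked instance below is a sketch; the inclusion map ι and the numbering of the spaces are notation introduced only for this example, and V is assumed finite-dimensional.

```latex
% Ker T sits inside V via the inclusion \iota, and T maps V onto Im T.
\[
  \{0\} \longrightarrow \operatorname{Ker} T
        \xrightarrow{\;\iota\;} V
        \xrightarrow{\;T\;} \operatorname{Im} T
        \longrightarrow \{0\}
\]
% Exactness: \iota is injective, Im(\iota) = Ker(T), and T is onto Im T.
% Number the spaces V_0 = \{0\}, V_1 = Ker T, V_2 = V, V_3 = Im T, V_4 = \{0\}:
\[
  \sum_{j=0}^{4} (-1)^j \dim(V_j)
    = 0 - \dim \operatorname{Ker} T + \dim V - \dim \operatorname{Im} T + 0 = 0,
\]
% which rearranges to Nullity T + Rank T = dim V.
```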

A.4. Duality
Let V be a vector space with scalar ﬁeld K. A linear map ϕ : V → K
is called a linear functional on V .
Suppose that ϕ1 , ϕ2 are linear functionals. We can deﬁne their
sum to be the linear functional ϕ given by
ϕ(v) = ϕ1 (v) + ϕ2 (v).

We write ϕ = ϕ1 +ϕ2 in this case. In the same way, if λ is a scalar, we

can deﬁne the scalar product of λ and ϕ1 to be the linear functional
ψ given by
ψ(v) = λϕ1 (v),
and we may write ψ = λϕ1 . Thus we have equipped the collection of
all linear functionals with operations of addition and scalar multipli-
cation, and it is easy to verify that (equipped with these operations)
the space of linear functionals is itself a vector space.

Deﬁnition A.4.1. The dual space V ∗ of V is the space of linear

functionals V → K.

Proposition A.4.2. If V is ﬁnite-dimensional, then V ∗ is ﬁnite-

dimensional also, and dim(V ) = dim(V ∗ ).

Proof. Choose a basis S = {v1 , . . . , vn } for V . Then, for each k ∈

{1, . . . , n}, deﬁne a linear functional ϕk : V → K by
\[
  \varphi_k\Bigl(\sum_{j=1}^{n} \lambda_j v_j\Bigr) = \lambda_k ;
\]

in other words, ϕk (v) is obtained by writing v as a linear combination

of the basis vectors v1 , . . . , vn and then taking the coeﬃcient of vk .
I claim that the set P = {ϕ1 , . . . , ϕn } forms a basis for V ∗ . To
prove this we must show two things: P spans V ∗ and P is linearly
independent. For both arguments the key observation is that

\[
  \varphi_k(v_j) =
  \begin{cases}
    1 & \text{if } j = k, \\
    0 & \text{if } j \neq k,
  \end{cases}
\]
which follows from the deﬁnition of ϕk .
To prove that P spans V ∗ , let ϕ ∈ V ∗ and consider the linear
functional
\[
  \psi = \sum_{k=1}^{n} \varphi(v_k)\,\varphi_k .
\]
From the “key observation” above, ϕ(vj ) = ψ(vj ) for each j =
1, . . . , n. Since every v ∈ V is a linear combination of v1 , . . . , vn ,
it follows that ϕ(v) = ψ(v) for every v, that is, ϕ = ψ. But ψ is a

linear combination of ϕ1 , . . . , ϕn , by construction. It follows that P

spans V ∗ .
To prove that P is linearly independent, suppose there is a linear
relation of the form
\[
  \sum_{k=1}^{n} \mu_k \varphi_k = 0 .
\]
Applying this linear relation to vj and using the “key observation”,
we ﬁnd that μj = 0. Since j was arbitrary, all the μ’s are zero. So no
nontrivial linear relation exists, and we have proved that P is linearly
independent.
Since P and S have the same number of elements, dim(V ) =
dim(V ∗ ), as asserted.
Remark A.4.3. The basis P for V ∗ that appears above is called the
dual basis to the originally given basis of V .
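
The construction of the dual basis can be carried out numerically. The sketch below is an illustration under stated assumptions: it takes V = Q^3, identifies a linear functional with the row of its values on the standard basis, and uses a hypothetical basis `basis` and functional `phi` chosen for the example. With these identifications the dual basis consists of the rows of the inverse of the matrix whose columns are the basis vectors, since row k of that inverse times column j gives 1 when j = k and 0 otherwise.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination over Q.
    Raises StopIteration if the matrix is singular."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def apply(functional, v):
    """Evaluate a functional (a row of coefficients) on a vector."""
    return sum(a * b for a, b in zip(functional, v))

# Hypothetical basis v_1, v_2, v_3 of Q^3.
basis = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
columns = [[v[i] for v in basis] for i in range(3)]  # matrix with v_j as column j
dual = inverse(columns)                              # row k represents phi_k

# The "key observation": phi_k(v_j) is 1 if j = k and 0 otherwise.
table = [[apply(phi_k, v_j) for v_j in basis] for phi_k in dual]

# Spanning: an arbitrary functional phi equals sum_k phi(v_k) * phi_k.
phi = [3, -1, 4]
coeffs = [apply(phi, v_k) for v_k in basis]
psi = [sum(coeffs[k] * dual[k][i] for k in range(3)) for i in range(3)]
```

Here `table` is the identity matrix and `psi` reproduces `phi`, mirroring the two halves of the proof.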

The operation of forming duals (“dualization”) can be applied to

linear maps between vector spaces, as well as to the vector spaces
themselves. Specifically, suppose that T : V → W is a linear transfor-
mation and that ϕ ∈ W ∗ . We can define a linear functional ψ ∈ V ∗
by
ψ(v) = ϕ(T (v)), or ψ = ϕ ◦ T.
We denote ψ by T ∗ ϕ. Then T ∗ is a linear transformation W ∗ → V ∗
(in the “backwards” direction). It is called the dual transformation to
T . It is an immediate consequence of the definition that if S : U → V
and T : V → W are linear transformations, then
(T ◦ S)∗ = S ∗ ◦ T ∗ ;
this is expressed by saying that the process of dualization is functorial.

A.5. Norms and inner products

In the previous section we have seen that if V is a ﬁnite-dimensional
vector space, then V and V ∗ have the same dimension and are there-
fore isomorphic. However, there is no unique choice of an isomor-
phism between V and V ∗ . One way in which one can make such a
choice is by ﬁxing an inner product on V .

Deﬁnition A.5.1. Let V be a vector space. An inner product on

V is a map V × V → K, denoted by (v, w) ↦ ⟨v, w⟩, which has the
following properties:
(i) (Linearity) For scalars λ1 , λ2 ∈ K and vectors v1 , v2 , w ∈ V , one
has
⟨λ1 v1 + λ2 v2 , w⟩ = λ1 ⟨v1 , w⟩ + λ2 ⟨v2 , w⟩.
(ii) (Positive definiteness) The inner product of a vector with itself,
⟨v, v⟩, is a nonnegative real number and is equal to 0 if and only
if v is the zero vector.
(iii) (Symmetry) This has two versions, depending on whether K = R
or C. If K = R, we require simply that ⟨v, w⟩ = ⟨w, v⟩ for all
v and w. If K = C, we introduce complex conjugation into this
identity and require that $\langle v, w\rangle = \overline{\langle w, v\rangle}$ for all v and w.
A vector space equipped with an inner product is called an inner
product space.

The “dot product” of multivariable calculus is an example of an

inner product. Some people (following this example) also use the
term “scalar product” for an inner product, but we will avoid this
terminology, as it leads to confusion with the scalar multiplication
operation K × V → V .

Lemma A.5.2. Let V be an inner product space. Then for all vectors
u, v ∈ V we have the Cauchy-Schwarz inequality
|⟨u, v⟩| ≤ ‖u‖ ‖v‖,
where ‖u‖ denotes $\langle u, u\rangle^{1/2}$, with equality if and only if u and v are linearly dependent.

Proof. Suppose, without loss of generality, that u ≠ 0 (otherwise the
inequality is trivial). Put
w = ⟨v, u⟩u − ⟨u, u⟩v.
Then, expanding and using the properties of the inner product,
⟨w, w⟩ = |⟨u, v⟩|² ⟨u, u⟩ − 2|⟨u, v⟩|² ⟨u, u⟩ + ⟨u, u⟩² ⟨v, v⟩.
This quantity is nonnegative, so |⟨u, v⟩|² ⟨u, u⟩ ≤ ⟨u, u⟩² ⟨v, v⟩, and
dividing through by the (strictly positive) quantity ⟨u, u⟩ and taking
square roots gives the Cauchy-Schwarz inequality. If equality occurs, w = 0, and this gives
the asserted linear relation between u and v.
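
A quick numerical check of the inequality and of its equality case may help. The following Python sketch assumes the standard inner product on C^3 (linear in the first slot, conjugate-linear in the second); the vectors `u`, `v` and the scalar `c` are arbitrary choices made for illustration.

```python
import math

def inner(u, v):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    """Norm induced by the inner product; <u, u> is real and nonnegative."""
    return math.sqrt(inner(u, u).real)

u = [1 + 2j, 3 - 1j, 0.5j]
v = [2 - 1j, 1j, 4 + 0j]

# Generic (linearly independent) vectors: the inequality should be strict.
lhs = abs(inner(u, v))
rhs = norm(u) * norm(v)

# Linearly dependent vectors: equality, up to floating-point rounding.
c = 2 - 3j
cu = [c * a for a in u]
lhs_dependent = abs(inner(u, cu))
rhs_dependent = norm(u) * norm(cu)
```

For the dependent pair both sides equal |c| times ⟨u, u⟩, which is why they agree up to rounding.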

Corollary A.5.3. In an inner product space V the expression
$\|v\| = \langle v, v\rangle^{1/2}$ defines a norm on V ; that is to say,
‖λv‖ = |λ| ‖v‖,
‖u + v‖ ≤ ‖u‖ + ‖v‖,
for all λ ∈ K and u, v ∈ V .

Proof. The ﬁrst equality follows immediately from properties of the

inner product. For the second displayed item (the triangle inequality),
use Cauchy-Schwarz to write

‖u + v‖² = ‖u‖² + ‖v‖² + 2 Re⟨u, v⟩ ≤ ‖u‖² + ‖v‖² + 2‖u‖ ‖v‖.
The right side is (‖u‖ + ‖v‖)², and taking square roots gives the
result.

Let V be an inner product space. For each w ∈ V the map

ϕw : V → K deﬁned by
ϕw (v) = ⟨v, w⟩
belongs to V ∗ . A linear functional ϕ that arises in this way (i.e., is
equal to ϕw for some w) is said to be represented by the inner product.

Proposition A.5.4 (Representation theorem). If V is a ﬁnite-

dimensional inner product space, then every linear functional on V is
represented by the inner product.

Proof. Suppose first that K = R. Then the assignment w ↦ ϕw
defines a linear transformation Φ from V to V ∗ . I claim that this linear
transformation is injective: indeed, if w ∈ Ker Φ, then ϕw (w) =
⟨w, w⟩ = 0, and it follows from the definiteness of the inner product
that w = 0. From the rank-nullity theorem, then, dim(Im Φ) =
dim V = dim V ∗ . Since Im Φ is a subspace of V ∗ , it must be equal to
V ∗ (Corollary A.2.8). This is the representation theorem.
If K = C, this argument does not quite work as stated, be-
cause the map Φ is antilinear (Φ(λ1 v1 + λ2 v2 ) = λ̄1 Φ(v1 ) + λ̄2 Φ(v2 ))
rather than linear. It is necessary to verify that the conclusion of
the rank-nullity theorem still holds for antilinear maps. This is a

straightforward generalization of the earlier discussion and is left to

the reader.

If V is not ﬁnite-dimensional, the representation theorem does

not hold in the form that we have stated. Later we shall single out an
important class of inﬁnite-dimensional spaces (Hilbert spaces) where
a version of the representation theorem does remain true.

Remark A.5.5. In passing, we deﬁned above the notion of norm on

a vector space V . Recall that a norm is a map V → R+ , written v ↦ ‖v‖, such that
‖λv‖ = |λ| ‖v‖,
‖u + v‖ ≤ ‖u‖ + ‖v‖,
for all λ ∈ K and u, v ∈ V . An inner product defines a norm, but there
are examples of norms that do not arise in this way: for instance, the
expression
‖(x, y)‖ = max{|x|, |y|}
deﬁnes a norm on R2 that does not arise from any inner product. The
notion of norm will be important in our discussions of multivariable
calculus (Appendix E) and, once again, of Hilbert space (Appen-
dix F).
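
One way to see that the maximum norm does not come from an inner product uses the parallelogram law ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖², which holds for every norm induced by an inner product. This law is a standard fact that is not stated in the text above, so the Python sketch below should be read as a supplementary check; the helper names and the test vectors are illustrative choices.

```python
import math

def max_norm(p):
    """The norm ||(x, y)|| = max{|x|, |y|} from the remark."""
    return max(abs(c) for c in p)

def euclid_norm(p):
    """The norm induced by the dot product."""
    return math.sqrt(sum(c * c for c in p))

def parallelogram_defect(norm, u, v):
    """||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2; zero for inner-product norms."""
    s = tuple(a + b for a, b in zip(u, v))
    d = tuple(a - b for a, b in zip(u, v))
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

u, v = (1.0, 0.0), (0.0, 1.0)
defect_max = parallelogram_defect(max_norm, u, v)        # 1 + 1 - 2 - 2
defect_euclid = parallelogram_defect(euclid_norm, u, v)  # zero up to rounding
```

The nonzero defect for `max_norm` on a single pair of vectors is already enough to rule out any inner product.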

A.6. Matrices and determinants

Let V be a finite-dimensional vector space, and suppose that a basis
B = {v1 , . . . , vk } has been chosen for V . Then, as we remarked in
Example A.3.7, the linear map Kk → V defined by sending a k-tuple

(λ1 , . . . , λk ) to the vector $v = \sum_j \lambda_j v_j$ (see (A.2.1)) is an isomorphism
from Kk to V . It is conventional to write the list of scalars {λj } in a
vertical format, like
\[
  \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_k \end{pmatrix},
\]
and call it the column that represents the vector v relative to the
given basis B. We use the notation [v]B for this column.
Now suppose that T : V → W is a linear transformation between
finite-dimensional vector spaces and that bases B = {v1 , . . . , vn } and

C = {w1 , . . . , wm } are chosen in V and W , respectively. Then there

are unique scalars tij ∈ K, i = 1, . . . , m, j = 1, . . . , n, such that

\[
  T(v_j) = \sum_{i=1}^{m} t_{ij} w_i .
\]

The array of scalars

\[
  [T]^B_C =
  \begin{pmatrix}
    t_{11} & \cdots & t_{1n} \\
    \vdots & \ddots & \vdots \\
    t_{m1} & \cdots & t_{mn}
  \end{pmatrix}
\]
is called the matrix that represents the linear transformation T . Using
linearity, one checks easily that the composite map $\Phi_C \circ T \circ \Phi_B^{-1}$,
that is, the map that takes the column [v]B (say of scalars λj ) that
represents v ∈ V to the column [T v]C (say of scalars μi ) that represents
T v ∈ W , is given by the usual rule of matrix multiplication
$[Tv]_C = [T]^B_C\,[v]_B$; that is,
\[
  \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_m \end{pmatrix}
  =
  \begin{pmatrix}
    t_{11} & \cdots & t_{1n} \\
    \vdots & \ddots & \vdots \\
    t_{m1} & \cdots & t_{mn}
  \end{pmatrix}
  \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}.
\]
Example A.6.1. We can apply this idea to the dual space V ∗ of V
(Deﬁnition A.4.1). An element ϕ ∈ V ∗ is just a linear map V → K.
Taking the obvious basis {1} of the 1-dimensional vector space K, we
see that ϕ is represented by a row
(α1 , . . . , αn )
in such a way that the result of applying ϕ to v corresponds to the
matrix product
\[
  (\alpha_1, \dots, \alpha_n)
  \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}
  = \sum_{j=1}^{n} \alpha_j \lambda_j .
\]
The {αj } are therefore the coeﬃcients of the expression of ϕ in terms
of the dual basis (Remark A.4.3) to the original basis B of V .

Lemma A.6.2. Let S : U → V and T : V → W be linear transfor-

mations between ﬁnite-dimensional vector spaces, and let A, B, C be

bases in U, V, W , respectively. Then the matrices of T , S, and T ◦ S

are related by
\[
  [T \circ S]^A_C = [T]^B_C\,[S]^A_B ,
\]

where the product on the right is matrix multiplication.

Proof. Let u ∈ U . By deﬁnition of T ◦ S, we have (T ◦ S)(u) =

T (S(u)). This translates into matrix language as

\[
  [T \circ S]^A_C\,[u]_A = [T]^B_C\,[S]^A_B\,[u]_A .
\]

Since matrix multiplication is associative and since it is distributive

over subtraction, we may rewrite this as

\[
  \bigl([T \circ S]^A_C - [T]^B_C\,[S]^A_B\bigr)[u]_A = 0 .
\]

But [u]A in this equation can be an arbitrary column vector, so the

matrix in large parentheses is zero, as required.
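
The lemma can be illustrated with differentiation of polynomials. In the Python sketch below, which is an example built for this purpose and not taken from the text, U, V, W are the spaces of polynomials of degree at most 3, 2, 1, with the monomial bases 1, x, x², ...; a polynomial is stored as its list of coefficients, and both S and T are the derivative map. The helper names `derivative`, `matrix_of`, and `matmul` are hypothetical.

```python
def derivative(coeffs):
    """Differentiate a polynomial given by coefficients [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def matrix_of(T, n, m):
    """Matrix of a linear map T : K^n -> K^m relative to the standard bases.
    Column j holds the coordinates of T applied to the j-th basis vector."""
    cols = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        image = T(e_j)
        cols.append(image + [0] * (m - len(image)))
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def matmul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

S = matrix_of(derivative, 4, 3)   # d/dx from degree <= 3 to degree <= 2
T = matrix_of(derivative, 3, 2)   # d/dx from degree <= 2 to degree <= 1
TS = matrix_of(lambda p: derivative(derivative(p)), 4, 2)  # second derivative
product = matmul(T, S)
```

The matrix of the second derivative, computed directly from its action on the basis, coincides with the product of the two first-derivative matrices.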

Deﬁnition A.6.3. Two square matrices M and N are similar if

there is an invertible matrix U such that $M = U^{-1} N U$.

Corollary A.6.4. Let T : V → V be a linear transformation from a

ﬁnite-dimensional vector space V to itself, and let B and C be two
bases of V . Then the matrices $[T]^B_B$ and $[T]^C_C$ are similar.

Proof. We have
\[
  [T]^B_B = [\mathrm{id}]^C_B\,[T]^C_C\,[\mathrm{id}]^B_C ,
\]
where “id” stands for the identity map. But $[\mathrm{id}]^C_B\,[\mathrm{id}]^B_C = [\mathrm{id}]^B_B = I$,
the identity matrix, and similarly $[\mathrm{id}]^B_C\,[\mathrm{id}]^C_B = I$. Thus $U = [\mathrm{id}]^B_C$ is
invertible with inverse $U^{-1} = [\mathrm{id}]^C_B$, proving the result.

We will conclude with a brief discussion of determinants. Let

M = (mij ) be a square matrix, with n rows and n columns. Recall
(see Appendix G for a fuller discussion) that Sn denotes the collection
of all bijective maps {1, . . . , n} → {1, . . . , n} — the symmetric group
— and that if σ ∈ Sn , then sign(σ) is ±1 according to whether σ

can be represented as the composite of an even or an odd number of

transpositions (exchanges of two elements). With the above notation
we have

Deﬁnition A.6.5. The determinant of the n × n matrix M = (mij )

is the scalar

\[
  \det(M) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} .
\]

Example A.6.6. Suppose n = 2. The symmetric group S2 has two

elements, the identity map (sign +1) and the transposition exchang-
ing 1 and 2 (sign −1). The formula above gives

\[
  \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
  = a_{11} a_{22} - a_{12} a_{21} .
\]

Example A.6.7. The determinant of the identity matrix is 1.

Lemma A.6.8. The determinant, considered as a map from the vec-

tor space Mn (K) to K, has the following two properties:

(a) It is column-multilinear: that is, det(M ) depends linearly on each

column of M if we keep the other columns ﬁxed.
(b) It is alternating: that is, det(M ) changes sign if we interchange
any two columns of M .

Moreover, any other map Mn (K) → K having these two properties is

a multiple of the determinant.

Proof. Consider the formula for det(M ) given in Deﬁnition A.6.5,

\[
  \det(M) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} .
\]

The formula is a sum of n! terms, each of which is a product of n

matrix entries. Fixing a particular column, say the kth, each one of
the product terms contains exactly one matrix entry from the kth
column, namely the entry $m_{\sigma^{-1}(k),k}$. This proves (a). As for (b),
suppose that N is obtained from M by transposing two columns by

a transposition τ . Then

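\[
\begin{aligned}
  \det(N) &= \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, m_{1,\tau\sigma(1)} m_{2,\tau\sigma(2)} \cdots m_{n,\tau\sigma(n)} \\
          &= \sum_{\sigma \in S_n} \bigl(-\operatorname{sign}(\tau\sigma)\bigr)\, m_{1,\tau\sigma(1)} m_{2,\tau\sigma(2)} \cdots m_{n,\tau\sigma(n)} = -\det(M)
\end{aligned}
\]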

since τ σ runs over Sn as σ does.

Suppose now that Λ : Mn (K) → K is another map which is
column-multilinear and alternating. For any map μ : {1, . . . , n} →
{1, . . . , n}, let Eμ denote the matrix with a 1 in each position (μ(k), k)
and zeroes elsewhere — in other words, Eμ is a matrix with exactly
one “1” in each column, the positions of the “1”s being governed by
the map μ. Because Λ is column-multilinear,

\[
  \Lambda(M) = \sum_{\mu} \Lambda(E_\mu)\, m_{\mu(1),1} m_{\mu(2),2} \cdots m_{\mu(n),n} .
\]

Because Λ is alternating, it must vanish on any matrix M with

two columns identical (interchanging those columns gives Λ(M ) =
−Λ(M )). The only matrices Eμ that do not have two columns iden-
tical are those where μ is a permutation σ; so the sum over all μ in
the display above can be replaced by a sum over permutations only.
For a permutation σ, the alternating property gives
Λ(Eσ ) = sign(σ)Λ(I),
where I is the identity matrix. Finally, then,

\[
  \Lambda(M) = \Lambda(I) \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, m_{\sigma(1),1} m_{\sigma(2),2} \cdots m_{\sigma(n),n} = \Lambda(I) \det(M),
\]

as required.
Proposition A.6.9. For square matrices M and N (of the same
size) we have
det(M N ) = det(M ) det(N ).

Proof. Fix M . By deﬁnition of matrix multiplication, each column

of M N is obtained by multiplying M by the corresponding column of
N . Thus, the map Mn (K) → K deﬁned by N → det(M N ) is column-
multilinear and alternating, so it is a multiple of the determinant.
Taking N = I shows that this multiple is det(M ).
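
The formula of Definition A.6.5 and the properties proved so far can be checked directly on small matrices. The Python sketch below is illustrative: it computes the sign of a permutation by counting inversions (a standard equivalent of the parity of the number of transpositions, not the description used in the text), and the matrices `M`, `N` and the helper names are hypothetical choices.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant as the sum over all permutations (n! terms)."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def matmul(A, B):
    """Product of square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def swap_columns(M, a, b):
    """Copy of M with columns a and b interchanged."""
    swapped = [row[:] for row in M]
    for row in swapped:
        row[a], row[b] = row[b], row[a]
    return swapped

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M = [[2, 0, 1], [1, 3, 2], [0, 1, 1]]
N = [[1, 2, 0], [0, 1, 4], [5, 0, 1]]

det_2x2 = det([[1, 2], [3, 4]])            # a11*a22 - a12*a21
det_swapped = det(swap_columns(M, 0, 2))   # alternating property
det_product = det(matmul(M, N))            # multiplicativity
```

Since the sum has n! terms, this direct evaluation is only practical for small n; it is meant as a check of the definitions, not as an efficient algorithm.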

Corollary A.6.10. Similar matrices have the same determinant.

Proof. Suppose that M and N are similar, that is, $M = U^{-1} N U$ for
some invertible U . Then
\[
  \det(M) = \det(U^{-1}) \det(N) \det(U) = \det(U) \det(U^{-1}) \det(N) = \det(U U^{-1} N) = \det N,
\]
as required.

Now suppose that V is a ﬁnite-dimensional vector space and

T : V → V is a linear transformation. We define the determinant
of T as follows: choose a basis B for V , and define det(T ) to be the
determinant of the square matrix $[T]^B_B$. This is well-defined because
another choice of basis C would lead to a matrix $[T]^C_C$ which is similar
to $[T]^B_B$ and thus has the same determinant.
Theorem A.6.11 (Invertibility criterion). Suppose the vector
space V is finite-dimensional. A linear transformation T : V → V
is bijective (invertible) if and only if det(T ) ≠ 0.

Proof. If T is invertible, then
$\det(T) \det(T^{-1}) = \det(\mathrm{id}) = 1$,
so det(T ) ≠ 0. Conversely, suppose that T is not invertible. Then it
fails either to be injective or to be surjective, but in this case these
are equivalent by the rank-nullity theorem. Thus T is not injective,
so there is a nonzero vector v1 such that T v1 = 0. The set {v1 } is
linearly independent, so there is a basis B containing v1 as its first
member. The matrix of T with respect to this basis has its first column
consisting entirely of zeroes. Therefore, by column-multilinearity
(Lemma A.6.8), its determinant is zero.

Appendix B

Metric Spaces

A metric space is a set of points for which we have a notion of dis-

tance — a measure of “how diﬀerent” or “how far apart” two points
are. This notion of distance can be applied in several ways, but one
of the most important and natural is to say what we mean by con-
tinuous motion and therefore to serve as the foundation of topology,
the study of continuous maps and the properties that they preserve.
That is what we’ll review in this appendix. For a reference see Suther-
land [36], or, for a more leisurely treatment, O’Searcoid [29].

B.1. Metric spaces

Deﬁnition B.1.1. A metric space is a set X equipped with a function
d : X × X → R (called a metric or distance function) such that:

(i) d(x, x′) ≥ 0 for all x, x′; moreover, d(x, x′) = 0 if and only if
x = x′.
(ii) d(x, x′) = d(x′, x) for all x, x′ (symmetry).
(iii) d(x, x″) ≤ d(x, x′) + d(x′, x″) for all x, x′, x″ (triangle inequality).

Here are some examples of metric spaces.


Example B.1.2. The vector space Rn can be made into a metric

space by deﬁning the norm of a vector x = (x1 , . . . , xn ) to be
\[
  \|x\| = \bigl(|x_1|^2 + \cdots + |x_n|^2\bigr)^{1/2}
\]
and then defining the distance between two vectors x and x′ by
d(x, x′) = ‖x − x′‖. The same formula also makes Cn into a metric
space. (These definitions come from the standard inner products
on Rn and Cn , respectively, so that the triangle inequality is a con-
sequence of Corollary A.5.3.) These examples are called Euclidean
spaces.

Example B.1.3. Let X be any set. A metric on X can be deﬁned

by

\[
  d(x, y) =
  \begin{cases}
    0 & (x = y), \\
    1 & (x \neq y).
  \end{cases}
\]

Example B.1.4. Let A be a ﬁnite set (the alphabet) and consider

the set An of n-tuples of elements of A, which we think of as n-letter
words in the alphabet A. Deﬁne a distance on An by

d(x, y) = #{i : 1 ≤ i ≤ n, xi ≠ yi },

in other words, the number of positions in which the two words diﬀer.
(The reader should check that the triangle inequality holds, so this
formulation does deﬁne a metric.) This so-called Hamming metric
was introduced in 1950 to give a technical foundation to the theory
of error-correcting codes [21].
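
The Hamming metric is easy to compute, and the triangle inequality can be verified exhaustively for a small alphabet. The Python sketch below is illustrative; the function name `hamming`, the two-letter alphabet, and the sample words are choices made for this example.

```python
from itertools import product

def hamming(x, y):
    """Number of positions in which two words of equal length differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

# All 3-letter words over the alphabet {a, b}.
words = ["".join(w) for w in product("ab", repeat=3)]

# Brute-force check of the triangle inequality on every triple of words.
triangle_ok = all(hamming(x, z) <= hamming(x, y) + hamming(y, z)
                  for x in words for y in words for z in words)

example_distance = hamming("karolin", "kathrin")  # differs in positions 3, 4, 5
```

The exhaustive check covers only this one small case; the general triangle inequality follows because a position where x and z differ must be a position where x and y differ or where y and z differ.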

Example B.1.5. Let X be any metric space and let Y be a subset of

X. Then the distance function on X, restricted to Y , makes Y into
a metric space; this is called the subspace metric on Y .

Deﬁnition B.1.6. A subset U of a metric space X is an open subset

of X if for every x ∈ U there is ε > 0 such that the entire open ball

B(x; ε) := {x′ ∈ X : d(x, x′) < ε}

is contained in U .

Lemma B.1.7. Every open ball in a metric space is an open set.

Proof. Let X be a metric space and let U = B(p; r) = {x ∈ X :

d(p, x) < r} be an open ball. For x ∈ U , deﬁne
ε = r − d(p, x) > 0.
Suppose that y ∈ B(x; ε). Then by the triangle inequality, d(p, y) ≤
d(p, x) + d(x, y) < (r − ε) + ε = r, and therefore y ∈ U . It follows
that B(x; ε) ⊆ U . Since x ∈ U was arbitrary, this shows that U is
open.

This result shows that there are plenty of open subsets.

Deﬁnition B.1.8. Let X be a metric space. A subset F ⊆ X whose
complement X \ F is open is called closed.

Note carefully that “closed” does not mean the same as “not
open”. Many subsets of X are neither open nor closed, and some
may be both.
Example B.1.9. In a space with the metric of Example B.1.3, every
subset is open (and, therefore, every subset is also closed). A space
that has this property is called a discrete space. It is easy to see that a
metric space X is discrete if and only if, for every x ∈ X, the inﬁmum
inf{d(x, y) : y ∈ X, y ≠ x}
is strictly positive.
Lemma B.1.10. In any metric space, the union of any collection of
open subsets is an open subset. The intersection of a ﬁnite collection
of open subsets is open. The empty set ∅ and the entire metric space
X are open subsets of X.

Proof. Let F be a collection of open subsets of a metric space X
and let U = ⋃F be the union of the family. If x ∈ U , then there is
some V ∈ F such that x ∈ V . Since V is open, there is ε > 0 such
that B(x; ε) ⊆ V . Since V ⊆ U , we also have B(x; ε) ⊆ U . Thus for
any x ∈ U there exists ε > 0 such that B(x; ε) ⊆ U , which is to say
that U is open.
Now let F = {U1 , . . . , Un } be a finite collection of open sets and
let U = ⋂F = U1 ∩ · · · ∩ Un . If x ∈ U , then for each i = 1, . . . , n
there is εi > 0 such that B(x; εi ) ⊆ Ui . Let ε = min{ε1 , . . . , εn } > 0.

Then B(x; ε) ⊆ Ui for all i, and thus B(x; ε) ⊆ U . Thus for any
x ∈ U there exists ε > 0 such that B(x; ε) ⊆ U , which is to say that
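U is open. Finally, ∅ is open because the defining condition is vacuous,
and X is open because every open ball is by definition a subset of X.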
Remark B.1.11. Mathematicians frequently say “U is open” as a
shorthand for “U is an open subset of whatever metric space X is
natural in the context”, but it is important to realize that openness
depends on X as well as on U . For example, the interval (0, 1) is an
open subset of R, but it is not an open subset of R2 (where we think
of R as the x-axis in R2 ).

Remark B.1.12. Dually, the intersection of any collection of closed

sets is closed. In particular, given any subset A of X, the intersec-
tion of all the closed subsets of X that contain A is itself a closed
set. Clearly, this intersection is the smallest closed subset of X that
contains A: it is called the closure of A and is denoted $\overline{A}$.
Deﬁnition B.1.13. We say that a subset of a metric space is bounded
if it is contained in some open ball.

B.2. Continuous functions

The deﬁnition of continuity is translated in the natural way to the
metric space context.
Deﬁnition B.2.1. A function f : X → Y between metric spaces is
continuous at x ∈ X if for every ε > 0 there is δ > 0 such that
d(f (x), f (x′)) < ε whenever d(x, x′) < δ. It is continuous if it is
continuous at every x ∈ X.

In the context of metric spaces a continuous function will also be

called a map or mapping. Important examples are paths in a metric
space.
Deﬁnition B.2.2. Let X be a metric space. A path in X is a contin-
uous function from [0, 1] to X. (Recall that [0, 1] denotes the closed
unit interval {x ∈ R : 0 ≤ x ≤ 1}.)

Although we deﬁne continuity in terms of epsilons and deltas,

it also has a very important alternative characterization in terms of
open sets.

Theorem B.2.3. Let X and Y be metric spaces. Then f : X → Y

is continuous if and only if, for every open U ⊆ Y , the inverse image
f −1 (U ) := {x ∈ X : f (x) ∈ U }
is open in X.

Proof. Suppose that f is continuous and let U ⊆ Y be open. Let

x ∈ f −1 (U ); then f (x) ∈ U so by definition of “open” there is ε > 0
such that B(f (x); ε) ⊆ U . By definition of “continuous” there is
δ > 0 such that if x′ ∈ B(x; δ), then f (x′) ∈ B(f (x); ε) ⊆ U . But
this means that B(x; δ) ⊆ f −1 (U ). Thus the set f −1 (U ) is open.
Conversely, let x ∈ X, ε > 0 and suppose that f satisfies the con-
dition in the theorem. In particular, we may consider U = B(f (x); ε),
an open set such that x ∈ f −1 (U ). Our hypothesis tells us that
f −1 (U ) is open, which means that there is a δ > 0 such that B(x; δ) ⊆
f −1 (U ). Thus, whenever x′ ∈ B(x; δ), f (x′) ∈ B(f (x); ε). This gives
us continuity.
Remark B.2.4. Because closed sets are the complements of open
ones and f −1 (Y \ A) = X \ f −1 (A), it is also true that f is continuous
iff the inverse image of every closed set is closed.
Definition B.2.5. A map f : X → Y between metric spaces is called
a homeomorphism if it is a bijection and both f and f −1 are contin-
uous. (Because of the preceding theorem, it is the same thing to say
that f is a bijection and U is open in X if and only if f (U ) is open
in Y , or that f is a bijection and F is closed in X if and only if f (F )
is closed in Y .) If there is a homeomorphism between X and Y , then
we say that these spaces are homeomorphic.

When we are studying topology, we think of homeomorphic spaces

as “essentially the same”: the coﬀee cup is homeomorphic to the
donut (compare Figure 1.1). Homeomorphism therefore plays the
same role in topology as isomorphism does in linear algebra or in
group theory.
Remark B.2.6. For f to be a homeomorphism it is not enough that
it simply be continuous and bijective. For example, the map from
the half-open interval [0, 2π) to the unit circle, given by $t \mapsto e^{it}$, is
continuous and bijective, but its inverse is not continuous.
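
The failure of continuity in Remark B.2.6 can be seen numerically: two points of the circle on either side of 1 are close together, but their preimages lie at opposite ends of [0, 2π). The Python sketch below is an illustration; the function names `f` and `f_inverse` and the offset 0.001 are choices made for this example.

```python
import cmath
import math

def f(t):
    """The map t -> e^{it} from [0, 2*pi) to the unit circle."""
    return cmath.exp(1j * t)

def f_inverse(z):
    """Inverse of f: the angle of z, normalized to lie in [0, 2*pi)."""
    return cmath.phase(z) % (2 * math.pi)

z_above = f(0.001)                # just past the point 1, going counterclockwise
z_below = f(2 * math.pi - 0.001)  # just before returning to the point 1

gap_on_circle = abs(z_above - z_below)                           # about 0.002
gap_in_interval = abs(f_inverse(z_above) - f_inverse(z_below))   # about 2*pi
```

Shrinking the offset makes `gap_on_circle` as small as desired while `gap_in_interval` stays close to 2π, so no δ works for ε = 1 at the point 1 of the circle.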

A more restrictive notion than homeomorphism is that of isometry.
Deﬁnition B.2.7. Let (X, dX ) and (Y, dY ) be metric spaces. An
isometry from X to Y is a bijection f : X → Y such that
dY (f (x1 ), f (x2 )) = dX (x1 , x2 )
for all x1 , x2 ∈ X.

Every isometry is a homeomorphism, but the reverse is far from

the case: isometric spaces are “the same” metrically, and not just
topologically. For instance, all ellipses (in the plane) are homeomor-
phic, but they are not all isometric. (The interested reader can ﬁnd
necessary and suﬃcient conditions for two ellipses to be isometric.)

B.3. Compact spaces

The notions of sequence of points in a metric space and subsequence
of a given sequence are defined in the same way as they are for real
numbers. (Formally speaking, a sequence in the space X is a function
N → X, and a subsequence of the given sequence is the result of
composing it with a strictly monotonic function N → N.)
Definition B.3.1. We say that a sequence {xn } in the metric space
X is convergent to x ∈ X if
∀ε > 0 ∃N ∈ N ∀n ≥ N d(xn , x) < ε.
Equivalently, every open set that contains x also contains all xn for
n sufficiently large; the sequence “enters and remains within every
neighborhood of x”.

Convergent sequences characterize closed sets:

Proposition B.3.2. A subset A of a metric space X is closed if and
only if, whenever {xn } is a sequence of points of A that converges to
x ∈ X, the limit x also belongs to A.

Proof. Both directions are proofs by contradiction.

(Only if) Suppose A is closed and x ∉ A. Since X \ A is open,
there is ε > 0 such that B(x; ε) ⊆ X \ A. Thus the distance from x
to any point of A is at least ε, and therefore no sequence of points of A can
converge to x.
(If) Suppose that A is not closed, so that X \ A is not open.
Then there exists x ∈ X \ A for which no ball B(x; ε) is a subset of
X \ A — that is, every such ball has nonempty intersection with A.
Take ε = 1/n and let xn be a point of B(x; 1/n) ∩ A. Now {xn } is a
sequence of points of A that converges to x ∉ A.

Remark B.3.3. A subset A ⊆ X is called dense if every x ∈ X is the

limit of a sequence in A. In view of Proposition B.3.2, it is equivalent
to say that the only closed subset of X that contains A is the set X
itself: that is, the closure of A is X.

Deﬁnition B.3.4. A metric space X is compact iﬀ every sequence

of points of X has a subsequence that converges in X.

Remark B.3.5. This notion is often called sequential compactness to

distinguish it from another formulation, covering compactness, which
is applicable in more general contexts. For metric spaces, however, the
two kinds of compactness are equivalent (as we’ll see in a moment).
Thus there is no risk of ambiguity if we use the word “compact”
without qualiﬁcation, and we will usually do that.

The notion of compactness is often applied to subsets of a metric

space, which can of course be thought of as metric spaces in their
own right (compare Example B.1.5). For example, R is not compact,
but any closed, bounded subset of R is compact — this is the classical
Bolzano-Weierstrass theorem.

Proposition B.3.6. If A is a subset of any metric space X and A

is compact (in its own right), then A is closed and bounded in X.

Proof. We apply the criterion of Proposition B.3.2. Suppose A is

compact. Let {xn } be a sequence in A that converges to x ∈ X. By
compactness of A, {xn } has a subsequence that converges in A — in
particular, its limit must be a point of A. But any subsequence of
{xn } converges to x, so x belongs to A. Thus A is closed.
Fix a ∈ A and suppose (aiming for a contradiction) that A is not
bounded. Then for each n ∈ N there exists xn ∈ A with d(xn , a) >

n. No subsequence of the {d(xn , a)} can be bounded, whence no

subsequence of the {xn } can be convergent. Thus A is not compact.

Proposition B.3.7. Let A be a closed subset of a compact metric
space X. Then A is compact (in its own right).

Proof. Let {xn } be a sequence in A. Since X is compact, this se-

quence has a subsequence convergent to x ∈ X. Since A is closed, the
limit x in fact belongs to A. Thus {xn } has a subsequence convergent
in A. That is, A is compact.
Deﬁnition B.3.8. Let X and Y be metric spaces. The product space
X ×Y is the set of pairs (x, y), with x ∈ X and y ∈ Y and with metric
\[
  d\bigl((x, y), (x', y')\bigr) = \bigl(d_X(x, x')^2 + d_Y(y, y')^2\bigr)^{1/2} .
\]

Thus Rn is the product space R × · · · × R (n factors).

Proposition B.3.9. If X and Y are both compact, so is X × Y .

Proof. Let (xn , yn ) be a sequence in X × Y . Since X is compact,
there is a subsequence $x_{n_k}$ of xn that converges in X. Now consider
the sequence $y_{n_k}$ in Y . Since Y is compact, there is a subsequence
$y_{n_{k_j}}$ that converges in Y . The corresponding subsequence $x_{n_{k_j}}$ also
converges in X since it is a subsequence of the convergent sequence
$x_{n_k}$. Thus
\[
  \bigl(x_{n_{k_j}}, y_{n_{k_j}}\bigr)
\]
is a subsequence of the original sequence that converges in X × Y .
Proposition B.3.10. Every closed, bounded¹ subset of Rn or Cn is
compact.

Proof. We take it as known that a closed, bounded interval in R

is compact (the Bolzano-Weierstrass theorem). Let A be a closed
bounded subset of Rn . Then A is contained in some cube, which is
a product of closed bounded intervals. Proposition B.3.9 shows that
the cube is compact. Therefore A is compact, as a closed subset of a
compact space (Proposition B.3.7).
¹ See Definition B.1.13.

Remark B.3.11. A sequence (xn ) in a metric space X is called a

Cauchy sequence if d(xn , xm ) → 0 as n, m → ∞, in other words if
its points “get arbitrarily close to one another”. A metric space X
is said to be complete if every Cauchy sequence in X is convergent.
It is easy to see that every compact metric space is complete. There
are, however, many other examples (e.g., X = R).

An alternate formulation of compactness makes use of the notion

of open cover for a metric space.
Definition B.3.12. Let X be a metric space. An open cover U for
X is a collection (finite or infinite) of open sets whose union is all of
X. A Lebesgue number for U is a number δ > 0 such that every open
ball of radius δ is a subset of some member of U .
Theorem B.3.13. Every open cover of a compact metric space has
a Lebesgue number.

Proof. Suppose that U does not have a Lebesgue number. Then

for every n there is xn ∈ X such that B(xn ; 1/n) is contained in no
member of U . If X is compact, the sequence (xn ) has a subsequence
that converges, say to x. Now x belongs to some member U of U
since U is a cover. Thus there is ε > 0 such that B(x; ε) ⊆ U . There
is n > 2/ε such that d(xn , x) < ε/2. But then
B(xn ; 1/n) ⊆ B(xn ; ε/2) ⊆ B(x; ε) ⊆ U,
which is a contradiction.
Deﬁnition B.3.14. A metric space X is said to have the Heine-Borel
property if, for every open cover U of X, there exists a ﬁnite subset
of U that still covers X.

A subset of a cover U of X which is itself a cover of X is called

a subcover of U . Thus, a space X has the Heine-Borel property if
every open cover of X has a ﬁnite subcover.
Proposition B.3.15. A metric space has the Heine-Borel property
if and only if it is compact.

As hinted in Remark B.3.5, a space with the Heine-Borel prop-

erty is often called covering compact, and the above result is then

expressed by saying “a metric space is covering compact if and only

if it is sequentially compact”.

Proof. Suppose that X is compact. I claim that, for each δ > 0, X

has a finite cover by δ-balls (this is sometimes called “total bounded-
ness” of X). Indeed, try to construct a sequence x1 , x2 , . . . in X in
the following way: pick any point x1 , pick a point x2 at distance ≥ δ
from x1 , then a point x3 at distance ≥ δ from both x1 and x2 , and so
on. If this process continues for ever, it produces a sequence without
any convergent subsequence, which is impossible. But the only way
it can stop is if, for some n, the balls B(x1 ; δ), . . . , B(xn ; δ) cover X.
Now let U be an open cover of X. Let δ > 0 be a Lebesgue
number for U (which exists because of Theorem B.3.13). By the
previous paragraph, X has a finite cover by δ-balls. But each such
ball is a subset of a member of U ; so U has a finite subcover.
Now suppose that X has the Heine-Borel property. Suppose for a
contradiction that (xn ) is a sequence in X without convergent subsequence.
Then for each x ∈ X there is some εx such that xn ∉ B(x; εx )
for all but finitely many n. The B(x; εx ) form a cover of X. Picking
a finite subcover we obtain the contradiction that xn ∉ X for all but
finitely many n.
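
The greedy construction in the first paragraph of the proof can be tried out on a finite sample of a compact space. The following Python sketch is only an illustration: the name `greedy_delta_net`, the random sample of the unit square, and the value of δ are assumptions made for the example and do not come from the text.

```python
import math
import random

def greedy_delta_net(points, delta):
    """Greedily pick centers that are pairwise at distance >= delta.

    Mirrors the construction in the proof: keep choosing a point at
    distance >= delta from all points chosen so far. When no such point
    is left, the delta-balls around the chosen centers cover everything.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= delta for c in centers):
            centers.append(p)
    return centers

# Hypothetical sample: 2000 random points of the unit square [0, 1]^2.
random.seed(0)
sample = [(random.random(), random.random()) for _ in range(2000)]
delta = 0.2
centers = greedy_delta_net(sample, delta)

# Every sample point lies in the open delta-ball around some center.
covered = all(any(math.dist(p, c) < delta for c in centers) for p in sample)
print(len(centers), covered)
```

In this finite setting the process stops simply because the sample runs out; in a compact space it must stop because an infinite sequence of points pairwise at distance ≥ δ would have no convergent subsequence.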

The following is a typical application of the covering formulation

of compactness (the reader will also be able to give an easy proof
using the sequential formulation).

Proposition B.3.16. Let X and Y be metric spaces, and suppose

that there exists a map f : X → Y that is continuous and surjective.
If X is compact, then Y is compact also.

Proof. Let U be an open cover of Y . For each open set U ∈ U , the

set f −1 (U ) is an open subset of X, by Theorem B.2.3. The collection
W of open sets f −1 (U ), U ∈ U , covers X. As X is compact, W has
a ﬁnite subcover, say {f −1 (U1 ), . . . , f −1 (Un )}. The corresponding
ﬁnite collection of open subsets of Y , {U1 , . . . , Un }, is a subset of U
and (because f is surjective) covers Y . Hence Y is compact also.

This gives us a useful result (compare Remark B.2.6).

Proposition B.3.17. Let f : X → Y be a continuous bijection of

metric spaces, with X compact. Then f is a homeomorphism.

Proof. To prove that f −1 is continuous it suﬃces to prove that f

takes closed sets to closed sets (see Remark B.2.4). Let A ⊆ X be
closed. Then A is compact (Proposition B.3.7). So f (A) is compact
(Proposition B.3.16). Thus f (A) is closed (Proposition B.3.6).

There is also an important relationship between compactness and

uniform continuity.

Deﬁnition B.3.18. A map f : X → Y between metric spaces is

uniformly continuous if for each ε > 0 there is δ > 0 such that, for
all x, x′ ∈ X, d(f (x), f (x′ )) < ε whenever d(x, x′ ) < δ.

The extra information beyond ordinary “pointwise” continuity is

that δ does not depend on x.

Proposition B.3.19. Every continuous map from a compact metric

space X to a metric space Y is uniformly continuous.

Proof. Let f : X → Y be continuous. Let ε > 0. Let U be the

open cover of Y by all balls of radius ε/2, and let V = f ∗ (U ) be the
pullback of this cover under f ; i.e., a set V belongs to V if and only
if it is of the form
V = f −1 (BY (y; ε/2)) for some y ∈ Y .
(Notice that we’re using Theorem B.2.3 to argue that V is open.) Let
δ be a Lebesgue number for the cover V of X. Then, if d(x, x′ ) <
δ, we have x′ ∈ BX (x; δ), and so both x, x′ belong to some V ∈
V . That means that both f (x), f (x′ ) belong to some B(y; ε/2), so
dY (f (x), f (x′ )) ≤ dY (f (x), y) + dY (y, f (x′ )) < ε and we are done.

B.4. Function spaces

Let X be a compact metric space and let f0 , f1 : X → Y be two maps
from X to another metric space Y . The function X → R deﬁned by
x → dY (f0 (x), f1 (x))

is then continuous, so its range is compact (Proposition B.3.16) and

therefore bounded. Thus the supremum in the deﬁnition below exists
and is ﬁnite:

Deﬁnition B.4.1. With notation as above, the uniform distance (or

just distance) between the maps f0 and f1 is
d(f0 , f1 ) = sup{d(f0 (x), f1 (x)) : x ∈ X}.

The collection of all (continuous) maps from the compact space

X to Y is denoted Maps(X, Y ). It is easily verified that the uniform
distance satisfies the triangle inequality, so Maps(X, Y ) becomes a
metric space in its own right. One way to study the topology of a
space Y is to look at the connectivity properties of Maps(X, Y ) for
various standard spaces X.
We will use two important properties of these mapping spaces.
The first is the gluing property:

Proposition B.4.2. Let X be a metric space and let A, B be closed

subsets whose union is X. Let f : X → Y be a function from X to
another metric space Y . If the restrictions f|A and f|B of f to A and
B are continuous (as maps A → Y and B → Y ), then f itself is
continuous (as a map X → Y ).

Remark B.4.3. A diagrammatic way to say the same thing is this:

in the diagram of restriction maps
Maps(X, Y ) ──→ Maps(A, Y )
     │                │
     ↓                ↓
Maps(B, Y ) ──→ Maps(A ∩ B, Y )

an element of the top left space Maps(X, Y ) is the same as a pair of

elements of the top right and bottom left spaces whose restrictions to
the bottom right space agree. One expresses this property by saying
that the diagram is a pullback square.

Proof. We begin by establishing the following lemma: if A is a closed

subset of a metric space X and F is a closed subset of A (when A
is considered as a metric space in its own right), then F is a closed

subset of X. (Compare Remark B.1.11.) To see this, let U be the

complement of F in X. Let x ∈ U . There are two cases:
(a) Case x ∈ X \ A. Then there is ε > 0 such that B(x; ε) ⊆ X \ A
since A is closed in X. A fortiori, B(x; ε) ⊆ U = X \ F .
(b) Case x ∈ A \ F . Then there is ε > 0 such that the ball in
A around x, of radius ε, BA (x; ε), is contained in A \ F . But
BA (x; ε) = B(x; ε) ∩ A, so again we have

B(x; ε) ⊆ BA (x; ε) ∪ (X \ A) ⊆ U = X \ F.
In either case B(x; ε) ⊆ U . Thus U is open in X, so F is closed in
X, as required.
Now for the proof of the proposition. Let G be an arbitrary
closed subset of Y . Since f|A is continuous, (f|A )−1 (G) = f −1 (G) ∩ A is
a closed subset of A, and therefore a closed subset of X by the lemma.
Similarly, f −1 (G) ∩ B is a closed subset of X. But then

f −1 (G) = (f −1 (G) ∩ A) ∪ (f −1 (G) ∩ B)

is closed in X, so f is continuous by Remark B.2.4.

Remark B.4.4. The same result holds (with a diﬀerent proof) if A

and B are both open rather than closed. Some condition on A and
B is necessary, however, as easy examples show.

The second important property of mapping spaces is the expo-

nential law. Suppose that X and Y are compact metric spaces and
Z is any metric space. Let F be a continuous map from X × Y to Z.
Then, for each ﬁxed x ∈ X, the map

fx (y) = F (x, y) : Y → Z

is continuous. Therefore, x → fx deﬁnes a function g from X to

Maps(Y, Z). Moreover, this function g is itself continuous. To see
why, notice that by Proposition B.3.19, the map F is uniformly con-
tinuous from X × Y to Z. In particular this implies that, given
ε > 0, there exists δ > 0 such that if x, x′ ∈ X with d(x, x′ ) < δ,
then d(F (x, y), F (x′ , y)) < ½ε for all y ∈ Y . But this implies
d(g(x), g(x′ )) < ε, so g is continuous.

From this discussion we conclude that the process of passing from

F to g deﬁnes a function
Φ : Maps(X × Y, Z) → Maps(X, Maps(Y, Z)).
The exponential law for function spaces is then the following state-
ment.
Proposition B.4.5. For metric spaces X, Y , and Z, with X and
Y compact, the function Φ deﬁned above is a homeomorphism from
Maps(X × Y, Z) to Maps(X, Maps(Y, Z)).

Proof. It is immediate from the definitions that if d(F, F′ ) < ε,
then d(Φ(F ), Φ(F′ )) ≤ ε, which shows that Φ is continuous. Let us
construct an inverse to Φ. Suppose that g ∈ Maps(X, Maps(Y, Z)).
Then for each x ∈ X, g(x) is a (continuous) map Y → Z. Let us now
define Ψ(g) = F where
F (x, y) = [g(x)](y).
We need to show two things: first, that this F is a continuous map
from X × Y to Z and, second, that Ψ is a continuous map from
Maps(X, Maps(Y, Z)) to Maps(X × Y, Z). Considering F first, let
ε > 0 be given. There exists α > 0 such that
(d(x, x′ ) < α) ⇒ (d(g(x), g(x′ )) < ½ε),
and there exists β > 0 such that
(d(y, y′ ) < β) ⇒ (d(g(x)(y), g(x)(y′ )) < ½ε).
Let δ = min{α, β}. Then if d((x, y), (x′ , y′ )) < δ, we have

d(F (x, y), F (x′ , y′ )) = d(g(x)(y), g(x′ )(y′ ))
≤ d(g(x)(y), g(x)(y′ )) + d(g(x)(y′ ), g(x′ )(y′ ))
≤ d(g(x)(y), g(x)(y′ )) + d(g(x), g(x′ )) < ½ε + ½ε = ε,
which shows that F is continuous. The proof that Ψ is continuous
proceeds in a similar way.
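
Readers who program may recognize Φ and Ψ as "currying" and "uncurrying". The Python sketch below is only an illustration under stated assumptions: the names `phi`, `psi`, and `uniform_distance`, the grid standing in for X × Y, and the sample function F are all hypothetical, and the supremum is approximated by a maximum over finitely many points.

```python
def phi(F):
    """Φ: send F(x, y) to the map x -> (y -> F(x, y))."""
    return lambda x: (lambda y: F(x, y))

def psi(g):
    """Ψ: send g to the map F(x, y) = [g(x)](y)."""
    return lambda x, y: g(x)(y)

def uniform_distance(f0, f1, sample):
    """Approximate sup |f0 - f1| by a maximum over a finite sample."""
    return max(abs(f0(*p) - f1(*p)) for p in sample)

# Hypothetical example: X = Y = [0, 1] sampled on a grid, Z = R.
grid = [i / 20 for i in range(21)]
pairs = [(x, y) for x in grid for y in grid]
F = lambda x, y: x * x + 3 * x * y

round_trip = psi(phi(F))      # Ψ(Φ(F)) should be F again
print(uniform_distance(F, round_trip, pairs))
```

The code only shows that Ψ undoes Φ at the level of functions; the continuity statements in the proposition are what the proof above supplies.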

Appendix C

Extension and
Approximation
Theorems

In this brief section we will prove two signiﬁcant results about real-
valued functions on compact metric spaces, the Stone-Weierstrass the-
orem and the Tietze extension theorem. These results are important
in many parts of analysis: in this book, they will be involved in our
proof of the Jordan curve theorem (Chapter 4) and in our discussion
of Fredholm operators (Chapter 7).

C.1. The Stone-Weierstrass theorem

The classical Weierstrass approximation theorem states that every
continuous function on a closed bounded interval can be uniformly
approximated by polynomials. The Stone-Weierstrass theorem gen-
eralizes that result to functions on other compact metric spaces.

Deﬁnition C.1.1. A collection L of real-valued functions on a set X

is a lattice if, whenever it contains functions f and g, it also contains
their pointwise maximum and minimum — usually written f ∨ g and
f ∧ g in this context.

Proposition C.1.2. Let L be a lattice of continuous real-valued functions
on a compact metric space X. If, for all x, x′ ∈ X and any
a, a′ ∈ R, there is a function f ∈ L having f (x) = a and f (x′ ) = a′ ,
then L is dense in Maps(X; R).

The property appearing in the statement is called the two point

interpolation property.

Proof. Let the continuous function h : X → R be given and let ε > 0.

We are going to approximate h within ε by elements of L. By hypothesis,
for each x, x′ ∈ X there exists fxx′ ∈ L such that fxx′ (x) = h(x)
and fxx′ (x′ ) = h(x′ ). Fixing x for a moment, let

Vx′ = {y ∈ X : h(y) − ε < fxx′ (y)}.

These sets are open, and x′ ∈ Vx′ , so they cover X as x′ ranges over X. Take a finite
subcover and let gx be the (pointwise) maximum of the corresponding
functions fxx′ ∈ L. Because L is a lattice, gx ∈ L and by construction,

h(y) − ε < gx (y) ∀y, h(x) = gx (x).

So we have approximated h from one side by members of L.

Now we play the same trick again from the other direction: let

Wx = {y : gx (y) < h(y) + ε}.

Again these form an open cover of X; take a ﬁnite subcover and let
g be the (pointwise) minimum of the corresponding gx . Then g ∈ L
and by construction h − ε < g < h + ε, as required.

There is a more classical formulation which makes use of algebraic

rather than order-theoretic operations.

Lemma C.1.3. There is a sequence of polynomials on [−1, 1] con-

verging uniformly to |x|.

Proof (Dydak-Feldman [16]). It clearly suffices to produce a sequence
pn (x) of polynomials on [−1, 1] that converge uniformly to the
function f (x) = ½(x + |x|) that equals x for x ≥ 0 and 0 for x ≤ 0.
Define p1 (x) = x² and, inductively,

pn+1 (x) = pn (x) · (1 + ½(x − pn (x))).

A simple calculation shows that x − pn+1 (x) = (x − pn (x))(1 − ½pn (x)).

Using these facts, one easily establishes by induction the following:
(a) We have 0 ≤ pn (x) ≤ |x| for all x ∈ [−1, 1].
(b) The sequence {pn (x)} is monotone decreasing for x ≤ 0 and
monotone increasing for x ≥ 0.
Fix ε > 0. We divide the interval [−1, 1] into three parts: A =
[−1, −ε], B = [−ε, ε], and C = [ε, 1]. If x ∈ A, then

0 ≤ pn+1 (x) = pn (x) · (1 + ½(x − pn (x))) ≤ pn (x) · (1 − ½ε).

By induction we find 0 ≤ pn+1 (x) ≤ (1 − ½ε)^n , which is smaller than
ε for n sufficiently large. If x ∈ B, then 0 ≤ pn (x) ≤ ε for all n, by
(a) above. If x ∈ C, then

0 ≤ x − pn+1 (x) = (x − pn (x))(1 − ½pn (x)) ≤ (x − pn (x))(1 − ½ε²).

Again by induction, 0 ≤ x − pn+1 (x) ≤ (1 − ½ε²)^n , which is smaller
than ε for n sufficiently large. We conclude that for n sufficiently
large, |pn (x) − f (x)| ≤ ε for all x ∈ [−1, 1], as required.
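
The recursion in this proof is easy to experiment with. The following Python sketch evaluates it numerically, assuming p1(x) = x² and pn+1(x) = pn(x)(1 + ½(x − pn(x))); the helper names `p_values` and `f` and the choice of sample grid are assumptions made for the illustration, and a maximum over a grid only approximates the uniform distance.

```python
def p_values(x, n):
    """Return [p_1(x), ..., p_n(x)] for the recursion in the proof."""
    values = [x * x]                              # p_1(x) = x^2
    for _ in range(n - 1):
        p = values[-1]
        values.append(p * (1 + 0.5 * (x - p)))    # p_{k+1} = p_k (1 + (x - p_k)/2)
    return values

def f(x):
    """The target function (x + |x|)/2."""
    return 0.5 * (x + abs(x))

grid = [k / 500 - 1 for k in range(1001)]         # sample points of [-1, 1]
for n in (10, 100, 400):
    err = max(abs(p_values(x, n)[-1] - f(x)) for x in grid)
    print(n, err)
```

The printed errors shrink slowly, which matches the proof: near x = 0 on the positive side the contraction factor 1 − ½ε² is close to 1.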

One says that a subset of Maps(X; R) is a subalgebra if it is

closed under pointwise addition, subtraction, multiplication of func-
tions, and multiplication by scalars.

Lemma C.1.4. A closed subalgebra of Maps(X; R) that contains the

constant functions is a lattice.

Proof. Let A be the given subalgebra and f, g ∈ A; let h = f − g.

There is no loss of generality in assuming (by rescaling) that |h| ≤ 1
everywhere. Any polynomial in h belongs to A and hence so does |h|
by closure and Lemma C.1.3. But now
f ∧ g = (f + g − |h|)/2, f ∨ g = (f + g + |h|)/2
belong to A as well.

Theorem C.1.5 (Stone-Weierstrass theorem). Let X be a com-

pact metric space and let M be one of the two spaces Maps(X, R) or

Maps(X, C). Let E be a subset of M having the following properties:

(i) E contains the constant functions.
(ii) E is a subring of M ; that is, it is closed under the operations
of addition, subtraction, and multiplication.
(iii) E separates points of X; that is, for each pair of distinct points
x0 , x1 ∈ X there is a function f ∈ E with f (x0 ) ≠ f (x1 ).
(iv) In the complex case it is also required that E is closed under
complex conjugation; that is, if the function f ∈ E, then f¯ ∈ E
also.
Then E is dense in M .

Proof. Consider the case of Maps(X, R) ﬁrst. Let Ē denote the

closure of E. It is a closed subalgebra of Maps(X; R) and is therefore a
lattice, by Lemma C.1.4. Moreover, it has the two point interpolation
property (by simple algebra using the fact that E separates points).
Thus Ē = Maps(X; R) by Proposition C.1.2, which is to say that E
is dense.
In the case of Maps(X, C), since E is now also supposed to be
closed under complex conjugation, we see from the previous result
that the collection F of real parts (or of imaginary parts) of elements
of E is dense in Maps(X, R). It follows that E is dense in Maps(X; C).

Notice that the algebra of (real) polynomial functions on a com-

pact interval in R satisﬁes the conditions of the theorem. Thus, such
functions are dense among the continuous functions. This is the clas-
sical Weierstrass approximation theorem. (Of course, a special case
of this, in the form of Lemma C.1.3, had to be proved by an explicit
construction in order for us to get the general Stone-Weierstrass re-
sult.)

C.2. The Tietze extension theorem

Let X and Y be metric spaces, and let A be a subset of X. Then
every continuous function from X to Y restricts to a continuous func-
tion from A to Y . This operation deﬁnes a restriction map between

function spaces
Maps(X, Y ) → Maps(A, Y )
which is in a natural sense “dual” to the inclusion map A → X.
Must the restriction map Maps(X, Y ) → Maps(A, Y ) be surjective?
In other words, can every continuous map A → Y be extended to a
continuous map X → Y ? In general the answer is “no”, but it is
“yes” when the range space Y is R, the space of real numbers.
Proposition C.2.1 (Tietze extension theorem). Let A be a com-
pact subspace of a metric space X. Any continuous function A → R
can be extended to a continuous function X → R.

That is, given any continuous f : A → R, one can ﬁll in the

diagram
X
↑  ↘ g
A ──→ R
   f
with a continuous g to make it commutative.

Proof. Consider the metric space M = Maps(A; R) and the subset E

consisting of maps f : A → R that can be extended to maps X → R
(let’s call these the extendible maps). Then E contains the constant
functions, and it is a subring of M because an extension of the sum
(or difference or product) of two extendible functions is the sum (or
difference or product) of their extensions. Moreover, E separates
points of A: given distinct x0 , x1 ∈ A the function x → d(x, x0 )
separates x0 from x1 and is extendible. By Stone-Weierstrass, then,
E is dense in M . To complete the proof of the theorem (that is, to
show that all f ∈ M are extendible), it will therefore suffice to show
that E is also closed.
Begin with a simple observation: if a function f ∈ M has an
extension g at all, then it has such an extension with ‖g‖ = ‖f‖.
Indeed, if g is any extension of f , then so is the function h defined by

h(x) = +‖f‖ if g(x) > ‖f‖,
h(x) = g(x) if −‖f‖ ≤ g(x) ≤ ‖f‖,
h(x) = −‖f‖ if g(x) < −‖f‖,

and this extension has ‖h‖ = ‖f‖. Now, suppose that {fj } is a
sequence in E converging uniformly to f ∈ M . By passing to a
subsequence we may assume that ‖fj+1 − fj‖ < 2^{−j} .
Let g0 be an extension of f1 and, for j ≥ 1, let gj be an extension of
fj+1 − fj having ‖gj‖ < 2^{−j} . Then the series

∑_{j=0}^∞ gj

converges uniformly on X to an extension of f . Thus f belongs to E
also, so E is closed.
Remark C.2.2. The Tietze extension theorem applies also when Y
is any Euclidean space Rn (just consider each component separately).

Appendix D

Measure Zero

The notion of set of measure zero (in the real line) is important in the
proof of Sard’s theorem (the 1-dimensional version, Proposition 3.4.6)
and the “general position” results that follow from it. In this section
we’ll review the basic definitions.

D.1. Measure zero subsets of R and of S 1

Let I ⊆ R be an interval (open, closed, or half-open). We deﬁne the
length of I by the natural formula

ℓ(I) = sup I − inf I.

Deﬁnition D.1.1. Let S be a subset of R. We say that S has

measure zero (or is a null set) if, for every ε > 0, there exists a
sequence I1 , I2 , I3 , . . . of intervals that cover S and whose total length

∑_{n=1}^∞ ℓ(In )

is less than ε.

Example D.1.2. Clearly, any subset of a set of measure zero also has
measure zero. The union of two sets of measure zero also has measure
zero. A translate of a set of measure zero also has measure zero.


It is usual, in this deﬁnition, to require the intervals to be open,

but this in fact does not make any difference: if we have a sequence
of arbitrary intervals In satisfying the definition, we may find open
intervals Jn ⊇ In with ℓ(Jn ) < ℓ(In ) + ε2^{−n} . The open intervals Jn
then cover S and ∑_{n=1}^∞ ℓ(Jn ) < 2ε. In particular, taking the original
intervals In to be points, we find

Lemma D.1.3. Any ﬁnite or countable set has measure zero.
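
The covering argument behind this lemma can be made concrete. The Python sketch below covers the first few terms of an enumeration of the rationals in [0, 1] by open intervals, the n-th of length ε2^{−n}, and checks that the total length stays below ε. The enumeration order, the function names, and the choice ε = 1/100 are assumptions made for the example; exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """List the first `count` distinct rationals p/q in [0, 1], ordered by denominator."""
    seen, result = set(), []
    q = 1
    while len(result) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                result.append(r)
                if len(result) == count:
                    break
        q += 1
    return result

def cover(points, eps):
    """Put an open interval of length eps * 2^-n around the n-th point (n = 1, 2, ...)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)         # half-length, so the full length is eps / 2^n
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
print(float(total), total < eps)
```

However many points are taken, the total length is a partial sum of the geometric series ε(½ + ¼ + ⋯) and so never reaches ε.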

The following shows that the notion of “measure zero” is not a

vacuous one.

Proposition D.1.4. An interval of positive length does not have

measure zero.

Proof. Any interval of positive length contains a closed interval of

positive length; it suffices therefore to prove the result for a closed
interval. I claim that in any covering of a closed interval [a, b] by open
intervals, the total length of the open intervals must be at least b − a;
it cannot be arbitrarily small.
Any covering of [a, b] by open intervals has a finite subcovering,
so it suffices to show that if a closed interval [a, b] is covered by n open
intervals, the total length of those intervals must be at least b − a.
This we shall do by induction on n.
The base case (n = 1) is evident: if a single open interval I1
covers [a, b], then I1 contains both a and b so its length is at least
b − a.
Suppose now that the statement has been proved for coverings
by (n − 1) open intervals, and let I1 , . . . , In be n open intervals whose
union contains [a, b]. One of the intervals, say I1 = (α, β), must
contain a. Then α < a < β. The closed interval [β, b] is covered by
I2 , . . . , In so by the induction hypothesis,

∑_{k=2}^n ℓ(Ik ) ≥ b − β.

But ℓ(I1 ) = β − α > β − a, so

∑_{k=1}^n ℓ(Ik ) ≥ (β − a) + (b − β) = b − a,

completing the induction as required.

Note that, as a corollary of Proposition D.1.4 and Lemma D.1.3,

we deduce that an interval of positive length is uncountable (i.e., not
countable). This fact was originally proved by Cantor using decimal
expansions and a diagonal argument.
Example D.1.5. The converse of Lemma D.1.3 is false: there exist
uncountable sets of measure zero. The standard example of such
an object is the Cantor ternary set C, which is the collection of all
x ∈ (0, 1) that can be written in a ternary (base-3) expansion that
does not contain the digit 1. (See Example 2.1.6 for more about
this idea. In cases where the expansion is ambiguous, it suffices that
one of the possible expansions does not contain the digit 1.) This
set is uncountable — in fact, sending a binary expansion made up
of 0’s and 1’s to the ternary expansion made up of 0’s and 2’s in
the corresponding places defines an injection from the whole interval
[0, 1] (the standard example of an uncountable set) to C. But C has
measure zero. In fact, write Cn for the set of numbers that have a
ternary representation that does not contain 1 among its first n digits.
Then C = ⋂n Cn ; but each Cn is a union of (closed) intervals
whose total length is (2/3)^n , and this tends to zero as n → ∞.
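
The lengths of the approximating sets can be checked by direct computation. The following Python sketch assumes the usual description of the n-th stage as what remains of [0, 1] after n rounds of removing open middle thirds; the function name `cantor_stage` is a choice made for this illustration.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals left after removing open middle thirds n times from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_stage = []
        for a, b in intervals:
            third = (b - a) / 3
            next_stage.append((a, a + third))     # next ternary digit is 0
            next_stage.append((b - third, b))     # next ternary digit is 2
        intervals = next_stage
    return intervals

for n in range(6):
    stage = cantor_stage(n)
    total = sum(b - a for a, b in stage)
    print(n, len(stage), total)
```

Each round doubles the number of intervals and multiplies the total length by 2/3, so the n-th stage consists of 2^n intervals of total length (2/3)^n.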
Lemma D.1.6. Any bounded open subset S of R can be written as
the union of at most countably many disjoint open intervals: S =
⋃j (aj , bj ). If S is contained in a closed interval [a, b], then

∑j (bj − aj ) ≤ b − a;

that is, the total length of the open intervals that comprise S is less
than or equal to the length of [a, b].

Proof. Deﬁne an equivalence relation ∼ on S by saying that x ∼ y

if [x, y] ⊆ S. The equivalence classes for this relation are easily seen
to be open intervals contained in S; so S is a union of disjoint open

intervals. Each open interval in R contains a rational number, and

the set of rationals is countable, so there can be at most countably
many open intervals in any disjoint family.
For the statement about length, it suﬃces to show that if ﬁnitely
many disjoint intervals (aj , bj ), j = 1, . . . , n, are contained in [a, b],
then the sum of their lengths is less than or equal to b − a. Again,
we use induction on n, the statement being apparent for n = 1.
For the inductive step, assume without loss of generality that a1 =
min{a1 , . . . , an }. Because the intervals are disjoint we must have
aj ≥ b1 for j = 2, . . . , n. Thus the disjoint intervals (aj , bj ) for j =
2, . . . , n are contained in the closed interval [b1 , b]. By the induction
hypothesis,
∑_{j=2}^n (bj − aj ) ≤ b − b1 ,
and therefore
∑_{j=1}^n (bj − aj ) ≤ (b1 − a1 ) + (b − b1 ) = b − a1 ≤ b − a.
This completes the proof.

The key to the proof of Proposition 3.4.6 is

Lemma D.1.7. Let f : (0, 1) → R be differentiable with continuous
derivative. Let E ⊆ (0, 1) be the set of those points x for which
f′ (x) = 0. Then f (E) has measure zero.

Note that E itself might not have measure zero — for instance,
f might be a constant function!

Proof. Let Em = {x ∈ (0, 1) : |f′ (x)| < 1/m}. Em is an open subset
of (0, 1), by the continuity of f′ , and therefore it is an open subset of
R. Thus by Lemma D.1.6 it is a countable union of disjoint intervals
(cn , dn ), with total length at most 1. By the mean-value theorem, for
x, x′ ∈ (cn , dn ) we must have
|f (x) − f (x′ )| ≤ (1/m)|x − x′ | < (1/m)(dn − cn )
and thus f ((cn , dn )) is a subset of some open interval (an , bn ) of length
at most (2/m)(dn − cn ). It follows that f (Em ) is a subset of the union
⋃n (an , bn ), which is a union of intervals with total length at most

Appendix E

Calculus on
Normed Spaces

E.1. Normed vector spaces

This appendix sketches the modern approach to multivariable calcu-
lus in the context of normed vector spaces. This approach has two
advantages:

(a) It does away with the complicated notation of partial derivatives.

(b) It puts into prominence the central notion that the derivative of a
map is the best linear approximation to that map at a particular
point.

For greater detail about these ideas, the best reference is still Dieudonné
[14, Chapter 8].
Let V be a (real or complex) vector space. Recall from Re-
mark A.5.5 the following deﬁnition.

Definition E.1.1. A norm on V is a map V → R+ , denoted by
v ↦ ‖v‖, such that:

(i) For all v, ‖v‖ ≥ 0, with equality if and only if v = 0.
(ii) ‖λv‖ = |λ| ‖v‖ for all scalars λ and vectors v.
(iii) ‖v + w‖ ≤ ‖v‖ + ‖w‖.

The triangle inequality (iii) above implies that a norm gives rise
to a metric via the expression d(u, v) = ‖u − v‖.
Example E.1.2. Let V be a finite-dimensional vector space and
choose a basis {v1 , . . . , vn } for V . Each v ∈ V can then be written
uniquely as a sum ∑_{j=1}^n λj vj . The expression

‖v‖ = (∑_{j=1}^n |λj |²)^{1/2}

then defines a norm on V (in fact, this is the norm associated to an
inner product for which the chosen basis is orthonormal; see Corollary A.5.3).
Definition E.1.3. Two norms ‖·‖1 and ‖·‖2 on the same vector
space V are equivalent if there is a constant m > 0 such that
‖v‖1 ≤ m‖v‖2 , ‖v‖2 ≤ m‖v‖1
for all v ∈ V .
Proposition E.1.4. Any two norms on a ﬁnite-dimensional vector
space are equivalent.

Proof. Let V be a finite-dimensional (real) normed space, with norm
‖·‖. Choose a basis {v1 , . . . , vn } and define a function from the sphere
S n−1 = {(x1 , . . . , xn ) ∈ Rn : x1² + · · · + xn² = 1} to R+ by

(x1 , . . . , xn ) ↦ ‖∑_{j=1}^n xj vj‖.

The properties of the norm show that this is a continuous, nowhere-vanishing
function on the compact space S n−1 , hence bounded between
m−1 and m for some m > 0. This shows that ‖·‖ is equivalent
to the norm associated to choosing {v1 , . . . , vn } as an orthonormal
basis. Since ‖·‖ was arbitrary, all norms are equivalent.
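
The constant m in this proof can be estimated numerically in a small case. The Python sketch below takes V = R² with the standard basis and, as a hypothetical choice of "arbitrary" norm, the sum norm |x| + |y|; it samples the function from the proof on the unit circle and reads off approximate bounds. The sample size and the names are assumptions made for the illustration.

```python
import math

def norm_1(x, y):
    """The norm |x| + |y| on R^2 (a hypothetical choice of 'arbitrary' norm)."""
    return abs(x) + abs(y)

# Sample the unit circle S^1 of the Euclidean norm.
N = 3600
values = [norm_1(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
          for k in range(N)]
lo, hi = min(values), max(values)
m = max(hi, 1 / lo)     # then 1/m <= norm_1 <= m on the sampled points
print(lo, hi, m)
```

For this particular norm the exact bounds on the circle are 1 and √2, so m = √2 works; the sampled values agree with this up to rounding.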

It follows that all norms on a ﬁnite-dimensional vector space give

rise to the same topology and that they are all complete (that is,
Cauchy sequences converge). All the theory that we are going to
develop in this appendix works for complete normed vector spaces

(even inﬁnite-dimensional ones), but to keep things down to earth we

will state it only in the finite-dimensional case.
Remark E.1.5. If V and W are finite-dimensional vector spaces, so
is the space L(V, W ) of linear transformations from V to W . If V
and W have specified norms, it is natural to equip L(V, W ) with the
norm

‖T ‖ = sup{‖T v‖ : v ∈ V, ‖v‖ ≤ 1},
which is called the operator norm of T .

E.2. The derivative

For a map f : R → R, the standard definition of the derivative is
f′ (x) = lim_{h→0} (f (x + h) − f (x))/h.
This definition does not generalize directly to maps between vector
spaces because you cannot divide one vector by another. We must
first reformulate it, as follows.
Proposition E.2.1. Let f : R → R be a function, and let x ∈ R.
The derivative f′ (x) (if it exists) is the unique constant t ∈ R with
the property that
f (x + h) = f (x) + th + o(h).
Here the notation o(h) refers to some function of h, say g(h), with
the property that |g(h)|/|h| tends to 0 as h tends to 0.

Proof. Once we have decrypted the o notation, we see that the equation
in the proposition tells us that

(f (x + h) − f (x))/h − t → 0 as h → 0;
that is, by definition, f′ (x) exists and equals t.

Remark E.2.2. The o notation makes sense in ﬁnite-dimensional

vector spaces too: if V, W are such spaces, the symbol o(h), for a
vector variable h ∈ V , will refer to some function g(h) ∈ W such that
‖g(h)‖/‖h‖ tends to 0 as h tends to 0 in V . Because any two norms

in a ﬁnite-dimensional space are equivalent (Proposition E.1.4), this

deﬁnition does not depend on the choice of norms.

Lemma E.2.3. Let T : V → W be a linear map between ﬁnite-

dimensional vector spaces. If T (h) = o(h) for h ∈ V , then T = 0.

Proof. If T is nonzero, there is some v ∈ V with ‖v‖ = 1 and
‖T (v)‖ = c > 0. But then, putting h = λv,
‖T (h)‖/‖h‖ = c for all λ ≠ 0,
contradicting the hypothesis T (h) = o(h).

Deﬁnition E.2.4. Let V and W be ﬁnite-dimensional vector spaces

over R, let Ω be an open subset of V , and let f : Ω → W be a
continuous map. We say that f is diﬀerentiable at x ∈ Ω if there is a
linear map T : V → W such that
(E.2.5) f (x + h) = f (x) + T [h] + o(h)
as h → 0 in V . The map T is called the derivative of f at x, and we
denote it by Df (x).

The underlying idea here is that the derivative Df (x) gives the
best linear approximation to f near x. Lemma E.2.3 shows that the
derivative (if it exists) is uniquely determined by (E.2.5).

Convention E.2.6. Notice that Df (x), the derivative of f at x,

is itself a linear map from V to W . Given h ∈ V , then, Df (x)[h]
is a vector in W , depending both on x and on h. Calculations with
derivatives tend to involve a plethora of parentheses like this. To help
keep them straight, I will use a (nonstandard) convention. When the
value of a function depends linearly on its vector argument, I will
use square brackets, like [h]; when the dependence is not necessarily
linear I will use parentheses, like (x).

Example E.2.7. Suppose that V = R. A map f from R to W is

just a path in W . We defined the derivative f′ (x) ∈ W of such a path
at the beginning of Section 3.4, by the usual formula
f′ (x) = lim_{h→0} (f (x + h) − f (x))/h.

To reconcile this with Definition E.2.4 we need only rewrite the displayed
equation as

f (x + h) = f (x) + f′ (x)h + o(h).

Comparing with Definition E.2.4 we see that Df (x) is the linear map
R → W that sends h to f′ (x)h. In fact, every linear map R → W is
of the form h → ch, where c is a constant vector, so the two versions
of the deﬁnition contain exactly the same information.

Example E.2.8. Suppose that W = R, so that we are looking

at a real-valued function f defined on an open subset of a higher-
dimensional vector space V . The derivative Df is now a linear map
V → R, that is, an element of the dual space V ∗ of V (Defini-
tion A.4.1). One should think of this as a version of the directional
derivative from traditional multivariable calculus, in that Df (x)[v]
measures the rate of change of f at the point x in the direction of the
vector v. In fact, holding x and v fixed and letting t be an auxiliary
real variable, we have by definition and using linearity

f (x + tv) = f (x) + Df (x)[tv] + o(t) = f (x) + tDf (x)[v] + o(t).

Thus

Df (x)[v] = (d/dt) f (x + tv)|_{t=0} ,

which is the usual deﬁnition of the directional derivative of f in the

direction v.
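
A quick numerical check can make this identification tangible. In the Python sketch below, the function f, the point, and the direction are hypothetical choices; Df is written out by hand from the partial derivatives, and the t-derivative is estimated by a central difference, so agreement is only up to discretization and rounding error.

```python
import math

def f(x, y):
    """A hypothetical smooth function on R^2."""
    return x * x * y + math.sin(y)

def Df(x, y):
    """Df(x, y) as a linear map: returns the function v -> Df(x, y)[v]."""
    fx, fy = 2 * x * y, x * x + math.cos(y)     # partial derivatives, by hand
    return lambda v: fx * v[0] + fy * v[1]

def directional(func, point, v, t=1e-6):
    """Central-difference estimate of d/dt func(point + t v) at t = 0."""
    (x, y), (a, b) = point, v
    return (func(x + t * a, y + t * b) - func(x - t * a, y - t * b)) / (2 * t)

point, v = (1.0, 2.0), (3.0, -1.0)
print(Df(*point)(v), directional(f, point, v))
```

The two printed numbers should agree to several decimal places, and replacing v by 2v doubles both, reflecting the linear dependence on the bracketed argument.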

E.3. Properties of the derivative

In this section we will brieﬂy review some key properties of this notion
of derivative. The ﬁrst result is a version of the mean value theorem
(expressed as an inequality rather than an equality):

Proposition E.3.1 (Increment formula). Let V, W be finite-dimensional
(normed ) vector spaces, and let f be a differentiable map from
a convex open subset Ω of V to W . Suppose that there is a constant
C > 0 such that the operator norm ‖Df (x)‖ ≤ C for all x ∈ Ω. Then,
for all x, y ∈ Ω,

‖f (x) − f (y)‖ ≤ C‖x − y‖.

Proof. Suppose that the result is false. Then there exist x, y ∈ Ω
and δ > 0 such that ‖f (x) − f (y)‖ ≥ (C + δ)‖x − y‖. Let us call a
pair of points x, y for which this happens a δ-bad pair.
Let z = (x + y)/2 be the midpoint of [x, y]. Then one of the
pairs (x, z) and (z, y) must also be δ-bad. Let (x1 , y1 ) be such a one.
Repeating this argument we obtain a sequence of δ-bad pairs (xn , yn ),
all lying on the line segment [x, y] and each one obtained by bisecting
the previous one. Clearly there is some z on the line segment such
that xn → z, yn → z. But then by construction there are points z + h
arbitrarily close to z with
‖f (z + h) − f (z)‖ ≥ (C + δ)‖h‖.
This contradicts the definition of the derivative since
‖f (z + h) − f (z)‖ ≤ ‖Df (z)[h]‖ + o(h) ≤ C‖h‖ + o(h)
using the definition of the operator norm.

The most important fact about derivatives is the chain rule. In

our language, the chain rule simply states that the best linear ap-
proximation to a composite map f ◦ g is the composite of the best
linear approximation to f and the best linear approximation to g.
(Contrast the simplicity and generality of this statement with the nu-
merous complicated special cases that appear in the usual Calculus
III course!) To put it more precisely,
Proposition E.3.2 (Chain rule). Let V , V′ , and W be three finite-dimensional
vector spaces over R. Let Ω be an open subset of V and
Ω′ an open subset of V′ . Let g : V → V′ be continuous with g(Ω) ⊆ Ω′ ,
let f : Ω′ → W be continuous, let g be differentiable at p ∈ Ω, and let
f be differentiable at q = g(p) ∈ Ω′ . Then f ◦ g is differentiable at p,
and
D(f ◦ g)(p) = Df (q) ◦ Dg(p) = Df (g(p)) ◦ Dg(p) ∈ L(V ; W )
where Dg(p) ∈ L(V, V′ ) and Df (q) ∈ L(V′ , W ), so that their composite
belongs to L(V, W ), as does D(f ◦ g)(p).

Proof. There is no loss of generality in assuming that $p = q = 0$.

From the definition of the derivative, we may write
\[ g(x) = g(x) - g(0) = Dg(0)[x] + o_1(x), \]
\[ f(y) - f(0) = Df(0)[y] + o_2(y), \]
where $o_1$, $o_2$ denote “error terms” which satisfy the following: for each fixed $\varepsilon \in (0, 1)$ there is $\delta > 0$ such that
\[ \|o_1(x)\| \le \varepsilon\|x\| \text{ provided } \|x\| < \delta, \]
\[ \|o_2(y)\| \le \varepsilon\|y\| \text{ provided } \|y\| < \delta. \]
Let $a = \max\{\|Dg(0)\|, \|Df(0)\|\}$ (operator norm). Then provided $\|x\| < \delta$ we have $\|g(x)\| \le (1 + a)\|x\|$. Now take $y = g(x)$ and substitute the first equation above into the second. This gives
\[ f(g(x)) - f(g(0)) = Df(0) \circ Dg(0)[x] + Df(0)[o_1(x)] + o_2(g(x)). \]
If we take $\|x\| < \delta/(1 + a)$, then the last two terms on the right-hand side are bounded by
\[ a\varepsilon\|x\| + \varepsilon(1 + a)\|x\| = (2a + 1)\varepsilon\|x\|, \]
and since $\varepsilon$ was arbitrary this shows that the sum of these terms is $o(x)$, as required.
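The chain rule can be checked numerically. The following is a minimal sketch, not part of the original text: the maps `f` and `g`, the base point `p`, and the helper names are all hypothetical choices made for illustration. It compares the matrix product $Df(g(p)) \cdot Dg(p)$ with a central-difference estimate of the derivative of $f \circ g$.

```python
import math

# Hypothetical example maps g : R^2 -> R^2 and f : R^2 -> R^2.
def g(p):
    x, y = p
    return [x * y, math.sin(x) + y]

def Dg(p):
    # Jacobian matrix of g at p (rows = outputs, columns = inputs).
    x, y = p
    return [[y, x], [math.cos(x), 1.0]]

def f(q):
    u, v = q
    return [u * u + v, math.exp(u) * v]

def Df(q):
    # Jacobian matrix of f at q.
    u, v = q
    return [[2 * u, 1.0], [math.exp(u) * v, math.exp(u)]]

def matmul(A, B):
    # Matrix product, i.e. composition of the linear maps.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numerical_jacobian(h, p, t=1e-6):
    # Central-difference estimate of the Jacobian of h at p.
    columns = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += t
        minus[j] -= t
        hp, hm = h(plus), h(minus)
        columns.append([(a - b) / (2 * t) for a, b in zip(hp, hm)])
    # columns[j][i] holds d h_i / d x_j; transpose into row-major form.
    return [[columns[j][i] for j in range(len(p))]
            for i in range(len(columns[0]))]

p = [0.3, -0.7]
chain = matmul(Df(g(p)), Dg(p))                      # Df(g(p)) o Dg(p)
numeric = numerical_jacobian(lambda x: f(g(x)), p)   # D(f o g)(p), estimated
max_error = max(abs(chain[i][j] - numeric[i][j])
                for i in range(2) for j in range(2))
```

Under these assumptions `max_error` should be tiny, since the only discrepancy comes from the finite-difference approximation.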

What about higher derivatives? The space L(V ; W ) of linear

maps from V to W is itself a finite-dimensional vector space. Thus, if
f is differentiable in Ω, then Df is a map from Ω to L(V ; W ) and one
can ask about its continuity, differentiability, and so on. The second
derivative of $f$, $D^2 f$, is the derivative of $Df$. By definition, $D^2 f(x)$ is a linear map from $V$ to $L(V; W)$. But this is exactly the same thing as a bilinear map from $V \times V$ to $W$: that is to say, the expression
(E.3.3) \[ D^2 f(x)[v_1, v_2] \]
gives an element of $W$ for all pairs of vectors $v_1, v_2 \in V$ and, moreover, depends linearly both on $v_1$ and on $v_2$. (In a similar way we can define third and higher derivatives. We say a function is smooth if it has derivatives of all orders.)

Proposition E.3.4 (Clairaut’s theorem). Suppose that $f$ is a function from an open subset $\Omega$ of $V$ to $W$, twice differentiable at a point $p \in \Omega$. Then $D^2 f(p)$ is a symmetric bilinear map $V \times V \to W$. That is to say,
\[ D^2 f(p)[x, y] = D^2 f(p)[y, x] \]
for all $x, y \in V$.

This result is familiar in Calculus III under the form “the mixed partial derivatives are symmetric”; that is, $\partial^2 f/\partial x\,\partial y = \partial^2 f/\partial y\,\partial x$.

Proof. There is no loss of generality in assuming that p = 0 and

f (0) = 0; moreover, by adding a linear term to f , we may assume that
Df (0) = 0 also. The proof will show that under these assumptions,
$D^2 f(0)[x, y]$ is a good approximation to the expression $f(x + y) - f(x) - f(y)$. Since this expression is clearly symmetric in $x$ and $y$, the result will follow.
Here are the details. We start by applying the increment formula (Proposition E.3.1) to the map
\[ \varphi : x \mapsto f(x + y) - f(x) - Df(y)[x] \quad (y \text{ fixed}). \]
The derivative of $\varphi$ is
\[ D\varphi(x) = Df(x + y) - Df(x) - Df(y) = D^2 f(0)[(x + y) - x - y] + o(\|x\| + \|y\|) = o(\|x\| + \|y\|). \]
Therefore by the increment formula
\[ \|\varphi(x) - \varphi(0)\| = \|f(x + y) - f(x) - f(y) - Df(y)[x]\| \le \|x\|\, o(\|x\| + \|y\|) \le o((\|x\| + \|y\|)^2). \]
Moreover, by definition of the derivative, $\|Df(y) - D^2 f(0)[y]\| = o(\|y\|)$ and thus
\[ \|D^2 f(0)[y, x] - Df(y)[x]\| \le o(\|y\|\,\|x\|) \le o((\|x\| + \|y\|)^2). \]
Putting these together,
\[ \|f(x + y) - f(x) - f(y) - D^2 f(0)[y, x]\| \le o((\|x\| + \|y\|)^2) \]
and therefore, using the symmetry,
\[ \|D^2 f(0)[x, y] - D^2 f(0)[y, x]\| \le o((\|x\| + \|y\|)^2). \]
A rescaling argument (as in the proof of Lemma E.2.3) now completes the proof.
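The idea of the proof, that a manifestly symmetric second difference approximates $D^2 f(p)[x, y]$, can be illustrated numerically. The sketch below is an illustration only: the function `f`, the point `p`, and the vectors `a`, `b` are hypothetical choices, and the step size `s` is an assumption that trades truncation error against rounding error.

```python
import math

def f(p):
    # Hypothetical smooth function R^2 -> R.
    x, y = p
    return math.exp(x) * math.sin(y) + x * x * y

def hessian_form(p, a, b):
    # Exact D^2 f(p)[a, b], built from the second partial derivatives of f.
    x, y = p
    fxx = math.exp(x) * math.sin(y) + 2 * y
    fxy = math.exp(x) * math.cos(y) + 2 * x
    fyy = -math.exp(x) * math.sin(y)
    return (fxx * a[0] * b[0]
            + fxy * (a[0] * b[1] + a[1] * b[0])
            + fyy * a[1] * b[1])

def second_difference(func, p, a, b, s=1e-5):
    # (f(p + sa + sb) - f(p + sa) - f(p + sb) + f(p)) / s^2,
    # an expression that is symmetric in a and b by construction.
    def shifted(u, w):
        return [pi + s * ui + s * wi for pi, ui, wi in zip(p, u, w)]
    zero = [0.0] * len(p)
    return (func(shifted(a, b)) - func(shifted(a, zero))
            - func(shifted(zero, b)) + func(p)) / (s * s)

p = [0.2, 0.5]
a = [1.0, -2.0]
b = [0.5, 3.0]
estimate_ab = second_difference(f, p, a, b)
estimate_ba = second_difference(f, p, b, a)
exact = hessian_form(p, a, b)
```

Both `estimate_ab` and `estimate_ba` should agree with `exact` up to an error of order `s`, which mirrors the estimate obtained in the proof.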

Remark E.3.5. Our deﬁnitions of derivative and of smooth have

been given for functions defined on open subsets Ω of a vector space.
The alert reader will have noticed, however, that we used the word
“smooth” in Section 3.4 in the context of paths, that is, of maps
defined on [0, 1], which is definitely not an open subset of R. Later,
we shall also need to talk about smooth maps defined on the unit
square [0, 1] × [0, 1] in R2 and a few similar examples of closed subsets
C ⊆ Rn . The definitions of this appendix still work in this context:
the key observation is that for every x ∈ C, even when x is a boundary
point, the vectors h ∈ Rn such that x+h ∈ C are sufficiently abundant
to span Rn . This ensures that Definition E.2.4 defines the derivative
uniquely.

E.4. The inverse function theorem

At one point in Chapter 8 we will also need the inverse function
theorem. We state it as follows:
Theorem E.4.1. Let $f$ be a smooth map from an open subset $\Omega$ of $V$ to $W$, and suppose that for some $x_0 \in \Omega$ the linear map $Df(x_0) : V \to W$ is invertible. Then there is an open set $\Omega'$ contained in $\Omega$ and containing $x_0$ such that the restriction of $f$ to $\Omega'$ is a bijection $\Omega' \to f(\Omega')$. Moreover, if $g : f(\Omega') \to \Omega'$ denotes the inverse map, then $g$ is smooth, and $Dg(f(x_0)) = (Df(x_0))^{-1}$ as a linear map $W \to V$.

Since this result is only used in one example in this book, we

won’t give the proof here. The argument, which may be found in [14,
10.2.5], uses an iteration which produces the inverse map by successive
approximations. Because of the limiting process involved here, if we
are working in the general normed space context, it is necessary to
assume that the spaces involved are complete. This is automatic in
the ﬁnite-dimensional case.

Appendix F

Hilbert Space

Functional analysis is the subject which arose at the beginning of

the twentieth century when mathematicians began to systematize the
insight that many of the standard processes of analysis — differen-
tiation and integration are the most obvious examples — are linear
operations and should therefore be considered in the context of linear
algebra (where the underlying vector spaces are infinite-dimensional).
The development of quantum mechanics in the 1920s and beyond un-
derlined the importance of infinite-dimensional vector spaces, espe-
cially those equipped with an inner product. These are the Hilbert
spaces whose theory we will review in this appendix. In Chapter 7 we
extract an integer invariant, closely related to the winding number,
from this infinite-dimensional linear algebra.

F.1. Deﬁnition and examples

Recall from Definition A.5.1 that an inner product on a complex vector space $V$ is a complex-valued function on $V \times V$, written $\langle \cdot, \cdot \rangle$, which has the properties that $\langle u, \lambda_1 v_1 + \lambda_2 v_2 \rangle = \lambda_1 \langle u, v_1 \rangle + \lambda_2 \langle u, v_2 \rangle$, that $\langle u, v \rangle = \overline{\langle v, u \rangle}$, and that $\langle u, u \rangle$ is a nonnegative real number, zero only when $u = 0$. A vector space $V$ equipped with an inner product is an inner product space.


The norm on an inner product space $V$ is defined by $\|v\| = \sqrt{\langle v, v \rangle}$. The norm on $V$ defines a metric by $d(u, v) = \|u - v\|$.

Deﬁnition F.1.1. An inner product space is called a Hilbert space

if it is complete as a metric space; that is, every Cauchy sequence
converges. (Remark B.3.11.)

Example F.1.2. Let $\ell^2(\mathbb{N})$ denote the space of square-summable sequences of complex numbers; that is, an element $a \in \ell^2(\mathbb{N})$ is a sequence $(a_n)_{n \in \mathbb{N}}$ such that $\sum_{n=1}^{\infty} |a_n|^2 < \infty$, and the inner product of two such sequences is defined as
\[ \langle a, b \rangle = \sum_{n=1}^{\infty} \bar{a}_n b_n \]
(the sum converges absolutely). It is a standard exercise to check that this is a Hilbert space. The space $\ell^2(\mathbb{Z})$ of square-summable two-way sequences $(b_n)_{n \in \mathbb{Z}}$ of complex numbers can be defined in a completely analogous way (as indeed can the space $\ell^2(S)$ for any set $S$).

One can think of the inner product in Example F.1.2 as being the
most direct “inﬁnite-dimensional” generalization of the dot product

a · b = a1 b1 + a2 b2 + a3 b3

of vectors in R3 — obtained by allowing the subscripts n to vary over

the inﬁnite range N or Z rather than just the ﬁnite set {1, 2, 3}. Hav-
ing taken that step, though, one might go further and ask whether the
discrete variable n can be replaced by a continuous one. For example,
suppose that we consider continuous complex-valued functions on the
unit circle, equivalently, continuous 2π-periodic functions on R. The
formula
(F.1.3) \[ \langle u, v \rangle = \frac{1}{2\pi} \int_0^{2\pi} \bar{u}(t)\, v(t)\, dt \]

deﬁnes an inner product on the space of such functions. Do we obtain

a Hilbert space in this way? In other words, is this inner product space
complete? The answer is no:
Example F.1.4. Let $\{u_n\}$ be the sequence of continuous functions on $[0, 2\pi]$ defined by
\[ u_n(t) = \begin{cases} nt & (0 \le t \le 1/n), \\ 1 & (1/n \le t \le 1), \\ 1 - n(t - 1) & (1 \le t \le 1 + 1/n), \\ 0 & (1 + 1/n \le t \le 2\pi). \end{cases} \]
Then it is easy to see that $(u_n)$ is a Cauchy sequence (in the norm arising from the inner product (F.1.3)), but that it does not converge in this norm to any continuous function.
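The Cauchy property in this example can be explored numerically. The following sketch is illustrative only: it approximates the norm coming from (F.1.3) by a midpoint rule, and the function names and sample count are assumptions. Since $u_n$ and $u_m$ differ only on two intervals of length at most $1/\min(n, m)$, where the difference is at most 1, one expects $\|u_n - u_m\|^2 \le \frac{1}{2\pi} \cdot \frac{2}{\min(n, m)}$.

```python
import math

def u(n, t):
    # The piecewise linear function u_n of Example F.1.4 on [0, 2*pi].
    if t <= 1 / n:
        return n * t
    if t <= 1:
        return 1.0
    if t <= 1 + 1 / n:
        return 1 - n * (t - 1)
    return 0.0

def l2_distance(n, m, samples=20000):
    # Midpoint-rule approximation of the L^2 distance between u_n and u_m,
    # using the normalized inner product (1/2pi) * integral over [0, 2pi].
    h = 2 * math.pi / samples
    total = sum((u(n, (k + 0.5) * h) - u(m, (k + 0.5) * h)) ** 2
                for k in range(samples))
    return math.sqrt(total * h / (2 * math.pi))

d_small = l2_distance(10, 20)
d_large = l2_distance(100, 200)
bound_small = math.sqrt(2 / (2 * math.pi * 10))
```

The distances shrink as the indices grow, consistent with the sequence being Cauchy, even though the pointwise limit has jumps.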

The situation described in the above example is analogous to one

familiar from introductory analysis. Suppose for instance that we
define a sequence of rational numbers inductively by $q_1 = 1$, $q_2 = \frac{3}{2}$, $q_3 = \frac{17}{12}$, and so on, with $q_{n+1} = \frac{1}{2}(q_n + 2/q_n)$. Then $\{q_n\}$ is a Cauchy sequence of rational numbers, but it does not converge to a rational limit: it is “trying” to approach the irrational number $\sqrt{2}$. Similarly, the sequence $\{u_n\}$ above is “trying” to approach the discontinuous function that equals 1 for $0 < t \le 1$ and 0 for other values of $t$.
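The rational sequence $q_{n+1} = \frac{1}{2}(q_n + 2/q_n)$ can be computed exactly with rational arithmetic. This is a small sketch using Python's standard `fractions` module; the function name `babylonian` is a hypothetical label.

```python
from fractions import Fraction

def babylonian(steps):
    # Exact rational iterates q_1 = 1, q_{n+1} = (q_n + 2/q_n) / 2.
    q = Fraction(1)
    terms = [q]
    for _ in range(steps):
        q = (q + 2 / q) / 2
        terms.append(q)
    return terms

terms = babylonian(5)
# How far each q_n^2 is from 2; never exactly zero for a rational q_n.
gaps = [abs(q * q - 2) for q in terms]
```

Every entry of `gaps` is a nonzero rational number, yet the gaps decrease very rapidly: the sequence is Cauchy in $\mathbb{Q}$ but has no rational limit.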
In the case of the rational numbers, we know that by enlarging
the rational number system to the real number system, we can arrive
at a complete space — in other words, we can ensure that every
Cauchy sequence converges. We might therefore ask whether we can
similarly enlarge the space of continuous functions on the circle to a
larger space (including some discontinuous functions) which will be
complete (in other words, a Hilbert space) with respect to the $L^2$ inner product (F.1.3).
The Lebesgue theory of integration, developed at the turn of the twentieth century, provides a positive answer. It can be shown that if $\{u_n\}$ is a Cauchy sequence of continuous functions with respect to the $L^2$ norm, then there is a subsequence $\{u_{n_k}\}$ that converges “almost everywhere”: that is to say, the points $x$ for which the sequence of numbers $\{u_{n_k}(x)\}$ does not converge form a set of measure zero (see footnote 1). The functions $u(x)$ obtained as “almost everywhere” limits of Cauchy sequences of continuous functions in this way are called square-integrable or $L^2$ functions. Lebesgue’s theory allows one to interpret the expression (F.1.3) for the inner product even when $u$ and $v$ are merely square-integrable (and not assumed to be continuous). Moreover, the space $L^2(S^1)$ of square-integrable functions on the circle, equipped with this inner product, is a Hilbert space: it is complete.
Remark F.1.5. There is one nuance which is important here. Remember that the convergence appearing above was only “almost everywhere”, and as a result we should think of an $L^2$ function as being defined only “almost everywhere”. The points of $L^2(S^1)$ are therefore strictly speaking equivalence classes of functions, two functions being regarded as equivalent if they differ only on a set of measure zero.

The two basic examples that we have described above ($\ell^2$ and $L^2$) are even more closely related than they may at first appear. In fact, let $H = L^2(S^1)$ be the Hilbert space of square-integrable functions on the circle. The functions $e_n(t) = e^{int}$, $n \in \mathbb{Z}$, form an orthonormal set in $H$: that is, $\langle e_n, e_m \rangle$ equals 1 if $n = m$ and equals 0 if $n \ne m$.
Definition F.1.6. Let $f \in L^2(S^1)$ be a square-integrable function. The Fourier coefficients of $f$ are the inner products
\[ c_n = \langle e_n, f \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(t)\, e^{-int}\, dt. \]
(Some textbooks may have definitions that differ by a factor of $\sqrt{2\pi}$ or $2\pi$.) Now we have
Proposition F.1.7. The map that sends a function $f$ to the sequence $\{c_n := \langle e_n, f \rangle\}$ of its Fourier coefficients is an isometric isomorphism from $L^2(S^1)$ to $\ell^2(\mathbb{Z})$. The inverse map is defined as follows: for a sequence $\{c_n\}$ in $\ell^2(\mathbb{Z})$, the series $\sum c_n e_n$ converges in $L^2(S^1)$ to a function $f$ whose Fourier coefficients are the given sequence.

Outline of proof. A ﬁnite linear combination of the functions en is

called a trigonometric polynomial. If f is a trigonometric polynomial,
Footnote 1: This result can be proved by elementary means, i.e., using only the definition of “measure zero” in Appendix D. The reader may enjoy thinking about this.

This result may also be expressed by saying that the orthonormal set $\{e_n\}$ is complete. The complete orthonormal set $\{e_n\}$ is called the Fourier basis of $H = L^2(S^1)$.
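For a trigonometric polynomial, the isometry between $L^2(S^1)$ and $\ell^2(\mathbb{Z})$ can be verified by direct computation. The sketch below is an illustration under stated assumptions: the test function `f` is a hypothetical example, and the integrals are replaced by Riemann sums over 64 equally spaced points, which are exact (up to rounding) for trigonometric polynomials of low degree.

```python
import cmath
import math

def f(t):
    # Hypothetical trigonometric polynomial: 2 e_1 - i e_{-3} + 0.5 e_0.
    return 2 * cmath.exp(1j * t) - 1j * cmath.exp(-3j * t) + 0.5

def fourier_coefficient(func, n, samples=64):
    # c_n = (1/2pi) * integral of func(t) e^{-int} dt, as a Riemann sum.
    h = 2 * math.pi / samples
    return sum(func(k * h) * cmath.exp(-1j * n * k * h)
               for k in range(samples)) / samples

def norm_squared(func, samples=64):
    # ||func||^2 = (1/2pi) * integral of |func(t)|^2 dt, as a Riemann sum.
    h = 2 * math.pi / samples
    return sum(abs(func(k * h)) ** 2 for k in range(samples)) / samples

coeffs = {n: fourier_coefficient(f, n) for n in range(-5, 6)}
# Difference between the L^2 norm of f and the l^2 norm of its coefficients.
parseval_gap = abs(norm_squared(f) - sum(abs(c) ** 2 for c in coeffs.values()))
```

The computed coefficients recover the ones used to build `f`, and `parseval_gap` is zero up to rounding, as the isometry predicts.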
Remark F.1.8. The space $L^2(S^1)$ is only one example of a wider class of Hilbert spaces that arise from measure theory. Indeed, if $(X, \mu)$ is a measure space, we may define $L^2(X, \mu)$ to consist of equivalence classes (modulo equality $\mu$-almost everywhere) of functions $f : X \to \mathbb{C}$ such that
\[ \|f\|^2 = \int_X |f(x)|^2 \, d\mu(x) < \infty. \]
However, the space $L^2(S^1)$ is the only example of this general construction that we will need to use.

F.2. Orthogonality
Two vectors $u$ and $v$ in an inner product space are called orthogonal if $\langle u, v \rangle = 0$.
Lemma F.2.1 (Pythagoras’s theorem). If $u, v \in H$ are orthogonal, then
\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2. \]

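Proof. Write
\[ \|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + \langle u, v \rangle + \langle v, u \rangle + \|v\|^2. \]
Since $\langle u, v \rangle = 0$ and $\langle v, u \rangle = \overline{\langle u, v \rangle} = 0$, this gives the result.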

A similar expansion for any two vectors $u, v$ gives the parallelogram law
(F.2.2) \[ \|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2 \]
(the “cross terms” cancel out). Geometrically, this states that the sum of the squares on the two diagonals of a parallelogram is equal to the sum of the squares on all four edges.

[Figure F.1. Proof of the projection theorem.]

Let $u$ be a vector in some Hilbert space $H$. The collection $u^\perp$ of all vectors orthogonal to $u$ is a closed subspace of $H$ because it is the kernel of the continuous linear map
\[ v \mapsto \langle v, u \rangle \]
from $H$ to $\mathbb{C}$. It follows that for any $S \subseteq H$, the collection
\[ S^\perp := \bigcap_{u \in S} u^\perp \]
of vectors orthogonal to every $u \in S$ is also a closed subspace. It is called the orthogonal of $S$.
Remark F.2.3. Notice that not every subspace of a Hilbert space is closed. We have already seen counterexamples: the space of continuous functions inside $L^2(S^1)$, or the space of finitely nonzero sequences inside $\ell^2(\mathbb{Z})$, is not closed.
Theorem F.2.4 (Projection theorem). Let H be a Hilbert space
and E a closed subspace of H. Then the orthogonal E ⊥ is a comple-
mentary subspace (Deﬁnition A.1.7) to E.

Outline of the proof. It is clear that E ∩ E ⊥ = {0} since the only

vector that is orthogonal to itself is the zero vector. Thus, we have to
prove that every v ∈ H can be written as a sum x + y, where x ∈ E

and y ∈ E ⊥ . It follows from Pythagoras’s theorem that if this can be

done, then x is the unique point of E that is closest to v. Thus, to
construct x, we try to show that there is a point of E at which the
minimum distance to v is attained. In finite dimensions this would
be an easy compactness argument. In infinite-dimensional spaces we
must proceed more carefully, as follows.
Let $c = \inf\{\|x - v\| : x \in E\}$ be the infimal distance from $v$ to $E$. We want to show that $c$ is attained by some $x \in E$. Suppose that $x', x'' \in E$ and let $w = x' + x'' - v$, so that the points $v, x', w, x''$ form a parallelogram (see Figure F.1). Notice that the midpoint of the parallelogram, $\frac{1}{2}(w + v) = \frac{1}{2}(x' + x'')$, belongs to $E$, so that $\|\frac{1}{2}(w - v)\|^2 \ge c^2$. By the parallelogram law
\[ \|x' - x''\|^2 = 2(\|x' - v\|^2 + \|x'' - v\|^2) - \|w - v\|^2 \le 2(\|x' - v\|^2 + \|x'' - v\|^2) - 4c^2. \]
Now let $\{x_n\}$ be a sequence of points of $E$ such that $\|x_n - v\| \to c$. Applying the above inequality with $x' = x_{n'}$ and $x'' = x_{n''}$ we see that $\|x_{n'} - x_{n''}\| \to 0$ as $n', n'' \to \infty$. That is, $\{x_n\}$ is a Cauchy sequence.
By completeness, it converges to a point x ∈ H, and since E is closed,
we in fact have x ∈ E. This finishes the proof.
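In finite dimensions the decomposition $v = x + y$ with $x \in E$ and $y \in E^\perp$ can be computed explicitly. The sketch below is an illustration, not the method of the proof above: it works in $\mathbb{R}^4$ with the ordinary dot product, uses Gram-Schmidt orthogonalization instead of the minimizing-sequence argument, and the spanning vectors and helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def combine(u, c, v):
    # Return the vector u + c * v.
    return [a + c * b for a, b in zip(u, v)]

def orthonormal_basis(vectors, tol=1e-12):
    # Gram-Schmidt: orthonormal basis for the span of the given vectors.
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            w = combine(w, -dot(b, w), b)
        length = math.sqrt(dot(w, w))
        if length > tol:
            basis.append([a / length for a in w])
    return basis

def project(v, basis):
    # Orthogonal projection of v onto the span of an orthonormal basis.
    x = [0.0] * len(v)
    for b in basis:
        x = combine(x, dot(b, v), b)
    return x

spanning = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]  # E = span of these
v = [1.0, 2.0, 3.0, 4.0]
x = project(v, orthonormal_basis(spanning))  # component in E
y = combine(v, -1.0, x)                      # component in the orthogonal of E

# Another point of E, which should be farther from v than x is.
other = combine(x, 0.1, spanning[0])
dist_x = math.sqrt(dot(y, y))
diff_other = combine(v, -1.0, other)
dist_other = math.sqrt(dot(diff_other, diff_other))
```

Here `y` is orthogonal to both spanning vectors, Pythagoras's theorem holds for `x` and `y`, and `x` is closer to `v` than the perturbed point `other`.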

As a consequence of this, we ﬁnd that a version of the represen-

tation theorem (Theorem A.5.4) is true for Hilbert space.
Proposition F.2.5. Every continuous linear functional on a Hilbert space $H$ is represented by the inner product. That is, if $\varphi : H \to \mathbb{C}$ is a continuous linear map, then there exists $x \in H$ such that
\[ \varphi(v) = \langle v, x \rangle \]
for all $v \in H$.

Proof. We may assume that ϕ is not the zero functional (otherwise

the result is trivial). Let K = ker(ϕ), which is a closed subspace of
codimension one. By the projection theorem, $K^\perp$ is a complementary subspace, and it has dimension one by Proposition A.3.9. Let $w$ be a unit vector in $K^\perp$ and let $x = \varphi(w) w$. Then $\langle w, x \rangle = \varphi(w)\|w\|^2 = \varphi(w)$. It follows that the equality
\[ \varphi(v) = \langle v, x \rangle \]
holds for all $v \in K^\perp$, and it also holds for all $v \in K$ since both sides are zero, so it holds for all $v \in K \oplus K^\perp = H$.

F.3. Operators
Let H and K be Hilbert spaces and let T : H → K be a linear map.
One says that T is bounded if the quantity
(F.3.1) \[ \|T\| := \sup\{\|Tx\| : \|x\| \le 1\} \]
is finite. In that case, $\|T\|$ is called the norm of $T$. The bounded
linear maps are exactly those which are continuous when we consider
H and K as metric spaces. In functional analysis we usually restrict
our attention to such maps. The collection of all bounded linear
maps from H to K, denoted B(H; K) (or just B(H) if H = K) then
becomes a normed vector space, and the completeness of K easily
implies that B(H; K) is also complete. A bounded linear map is also
called a linear operator.
Linear maps can be composed (multiplied) as well as added; that
is, B(H) is not only a normed vector space but also a ring. Notice
the inequality relating the norm to the composition of linear maps
\[ \|ST\| \le \|S\|\,\|T\|, \]
which follows easily from the definition of the norm of an operator in (F.3.1) above.
Suppose that T : H → K is a bounded linear map. Then, for
each fixed y ∈ K, the map from H to C defined by
\[ x \mapsto \langle Tx, y \rangle \]
is bounded and linear. According to the representation theorem (Proposition F.2.5), then, it is represented by the inner product with an element of $H$. Let us call this element (which of course depends on $y$) $T^* y$. That is, we have by definition
\[ \langle Tx, y \rangle = \langle x, T^* y \rangle \]
for all $x, y$.
Proposition F.3.2. The map $T^* : K \to H$ so defined is a bounded linear operator, with norm $\|T^*\| = \|T\|$.

Proof. The expression for the norm follows from the equation
\[ \|T\| = \sup\{|\langle Tx, y \rangle| : \|x\|, \|y\| \le 1\}, \]
which in turn is a consequence of the Cauchy-Schwarz inequality.

The operator T ∗ is called the adjoint of T .
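In the finite-dimensional case $H = \mathbb{C}^3$, $K = \mathbb{C}^2$, the adjoint of a matrix is its conjugate transpose. The sketch below checks the defining identity $\langle Tx, y \rangle = \langle x, T^* y \rangle$ for one hypothetical matrix and pair of vectors, using the inner product $\langle a, b \rangle = \sum \bar{a}_n b_n$ of Example F.1.2; all names and sample data are illustrative assumptions.

```python
def inner(a, b):
    # Inner product <a, b> = sum of conj(a_n) * b_n.
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def apply(M, x):
    # Matrix-vector product M x.
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    # Conjugate transpose of the matrix M.
    return [[complex(M[i][j]).conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

T = [[1 + 2j, 0, 3j],
     [2, -1j, 1 - 1j]]          # a linear map C^3 -> C^2
x = [1j, 2, -1 + 1j]            # vector in C^3
y = [3 - 1j, 1j]                # vector in C^2

lhs = inner(apply(T, x), y)             # <Tx, y>
rhs = inner(x, apply(adjoint(T), y))    # <x, T*y>
```

The two numbers `lhs` and `rhs` agree up to rounding.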

Definition F.3.3. An operator T ∈ B(H) has finite rank if Im(T ) is
finite-dimensional. An operator T is compact if there exists a sequence
of finite rank operators Tn that converges to T in norm. The space
of compact operators on H is denoted K(H).

It is easy to see that the sum of two ﬁnite rank operators is

finite rank and that the composite (either way around) of a finite
rank operator and a bounded operator is finite rank. For compact
operators, this implies:
Proposition F.3.4. K(H) forms a closed, two-sided ideal in B(H);
that is to say, the sum of two compact operators is compact, the prod-
uct of a compact operator and a bounded operator (in either order ) is
compact, and the limit of a convergent sequence of compact operators
is compact.

We will need the following standard lemma at one point. Its

tricky proof, which may be found in standard texts such as Rudin [33,
Corollary 2.12], ultimately depends on the Baire category theorem.
Lemma F.3.5 (Closed graph lemma). An algebraically invert-
ible operator on Hilbert space is topologically invertible. That is, if
T : V → W is a bijective bounded linear map between Hilbert spaces,
then the linear map $T^{-1} : W \to V$ is also bounded.

Appendix G

Groups and Graphs

The idea of “symmetry” is a fundamental one in mathematics. A ﬁg-

ure such as an equilateral triangle is highly symmetrical because there
are many geometric operations (two rotations and three reﬂections)
that map the triangle back to itself. The subject of group theory orig-
inates when we focus our attention on the symmetries themselves and
the eﬀects of composing them, more than we focus on the “symmet-
rical” object. One of the earliest triumphs of this point of view was
Galois’ understanding of the symmetries of polynomial equations and
the relationship of these symmetries to the solvability of these equations by a “radical formula” (such as the famous quadratic formula
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
for the solution of the equation $ax^2 + bx + c = 0$).

We will use a few ideas from group theory in Chapter 8, and
we’ll review them brieﬂy in this appendix. All this material is usu-
ally covered in an undergraduate “introductory abstract algebra”
course. Standard textbooks for such a course include Gallian [18],
Herstein [22], or Rotman [32]; all of these provide more detail on the
topics listed below, and of course they develop algebra much further
as well.


G.1. Equivalence relations

Let X be a set. A relation R on X is a subset of X × X. We think
of R as expressing a “relationship” between elements of X that may
be true or false, and in keeping with this point of view we write xRy
(for x, y ∈ X) in place of (x, y) ∈ R. For example, “=” (equals),
“<” (is less than), and “|” (divides) are all relations on the natural
numbers N.
Definition G.1.1. A relation ∼ on X is an equivalence relation if it
satisfies the following three properties for all x, y, z ∈ X:
(a) x ∼ x (reflexive law),
(b) if x ∼ y, then y ∼ x (symmetric law),
(c) if x ∼ y and y ∼ z, then x ∼ z (transitive law).

Equality is the archetypal equivalence relation but there are many

other examples also. For example, let n be a natural number. The
relation deﬁned on Z by “congruence modulo n”, where p ∼ q if and
only if p − q is a multiple of n, is easily seen to be an equivalence
relation. If two elements are related by an equivalence relation, we
may say that they are equivalent.
Deﬁnition G.1.2. Given an equivalence relation ∼ on a set X, the
equivalence class of x ∈ X is
[x] = {y ∈ X : y ∼ x},
that is, the subset of X consisting of all y that are equivalent to x.
Proposition G.1.3. Let X be a set with equivalence relation ∼.
Then X can be written as a disjoint union of equivalence classes, so
that each member of X belongs to one and only one such equivalence
class.

Proof. By the reflexive law, $x \in [x]$, so every member of $X$ belongs to at least one equivalence class. Thus it suffices to show that if $x, x' \in X$, then their equivalence classes $[x]$ and $[x']$ are either identical or disjoint. Suppose that they are not disjoint and let $y \in [x] \cap [x']$. Then $y \sim x$ and $y \sim x'$; it follows, using the symmetric and transitive laws, that $x \sim x'$. Hence, by transitivity again, if $z \sim x$, then $z \sim x'$, so $[x] \subseteq [x']$. Reversing the roles of $x$ and $x'$ we get the opposite inclusion, so $[x] = [x']$ as required.
Remark G.1.4. Suppose X is a set and R a relation which is sym-
metric, but not necessarily transitive. A new relation ∼ on X may
be defined by saying that $x \sim y$ if and only if there is a finite list $x = x_0, x_1, \ldots, x_N = y$ of points of $X$ such that $x_{k-1} R x_k$ for $k = 1, \ldots, N$. (We allow $N = 0$, so our relation has the reflexive
property.) Then ∼ is an equivalence relation, called the equivalence
relation generated by R. In a similar way we can talk of the equiv-
alence relation generated by a finite list R1 , . . . , Rm of symmetric
relations (now each pair of successive points of the chain is allowed
to be related by any one of the R’s).
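On a finite set, the classes of the equivalence relation generated by a symmetric relation can be computed by merging related points. The sketch below uses a union-find structure, which is one standard way to do this and is an implementation choice not taken from the text; the function name and the sample relation ($k \mathrel{R} k + 3$ on $\{0, \ldots, 9\}$) are hypothetical.

```python
def generated_classes(points, related_pairs):
    # Equivalence classes of the relation generated by the given pairs.
    parent = {p: p for p in points}

    def find(p):
        # Follow parent links to the representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in related_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb      # merge the two classes

    classes = {}
    for p in points:
        classes.setdefault(find(p), set()).add(p)
    return sorted(sorted(c) for c in classes.values())

points = range(10)
pairs = [(k, k + 3) for k in range(7)]   # k R (k + 3)
classes = generated_classes(points, pairs)
```

In this example the generated equivalence relation is congruence modulo 3 restricted to $\{0, \ldots, 9\}$, so `classes` consists of the three residue classes.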

G.2. Groups
Let S be a set. A binary operation on S is a mapping S × S → S,
in other words, a well-defined process which takes two members of
S as “input” and produces another as “output”. We are familiar
with many examples. The usual arithmetical operations are binary
operations on appropriate sets S: for instance, addition is a binary
operation on the natural numbers N, subtraction is a binary oper-
ation on the integers Z, multiplication is a binary operation on the
rational numbers Q, division is a binary operation on the nonzero
real numbers R \ {0} (division by zero is not defined!). Addition of
vectors in a vector space is a binary operation, composition of linear
transformations is a binary operation, concatenation of paths is a bi-
nary operation. A group is a set with a binary operation that obeys
various “symmetry” properties.
Definition G.2.1. A group is a set G equipped with a binary oper-
ation (here denoted by ∗) which has the following properties:
(a) (Associative law) For all $g_1, g_2, g_3 \in G$ we have
$(g_1 * g_2) * g_3 = g_1 * (g_2 * g_3)$.
(b) (Existence of identity) There is an element $e \in G$, called the identity, which has the property that for all $g \in G$,
$e * g = g = g * e$.
(c) (Existence of inverses) For each $g \in G$ there is another element $g^{-1} \in G$, called the inverse of $g$, such that
$g * g^{-1} = e = g^{-1} * g$.
Remark G.2.2. There can only be one identity element: if $e$ and $e'$ were two identities, then $e = e * e'$ because $e'$ is an identity, and $e * e' = e'$ because $e$ is an identity, so $e = e'$. Similar reasoning shows that a given $g \in G$ can only have one inverse.

The notation ∗ for the binary operation may be replaced by some-

thing else, according to what is most helpful. In what follows we’ll
usually denote the binary operation by simple juxtaposition; i.e., we
will write gh for g ∗ h. Notice that while the binary operation must
satisfy the associative law, it is not required to satisfy the commu-
tative law g1 g2 = g2 g1 . A group in which the commutative law is
satisﬁed is called abelian.
Example G.2.3. The integers, or the rational, real, or complex num-
bers, with the usual addition operation, form a group.
Example G.2.4. Let n be a natural number. A residue class modulo
n is a subset of Z consisting of all the integers that leave a given
remainder when divided by n; there are n such residue classes. (For
example, the even integers and the odd integers form the two residue
classes modulo 2.) If A and B are two residue classes modulo n, their
sum
A + B := {a + b : a ∈ A, b ∈ B}
is also a residue class modulo n. (For example, odd plus odd equals
even.) This operation makes the collection of such residue classes into
a group with $n$ elements, called the cyclic group of order $n$, $\mathbb{Z}_n$.
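The claim that the sum of two residue classes is again a residue class can be tested on a finite window of integers. This is only an illustrative sketch: a true residue class is infinite, so the code truncates each class to the hypothetical window $-20, \ldots, 20$ and inspects the remainders that occur in the set of sums.

```python
def residue_class(r, n, window=range(-20, 21)):
    # The integers in the window that leave remainder r modulo n.
    return {a for a in window if a % n == r % n}

def sum_of_classes(A, B):
    # A + B := {a + b : a in A, b in B}.
    return {a + b for a in A for b in B}

n = 5
A = residue_class(2, n)
B = residue_class(4, n)
remainders = {c % n for c in sum_of_classes(A, B)}

# Addition table of Z_n, with each class labelled by its remainder.
table = [[(i + j) % n for j in range(n)] for i in range(n)]
```

All sums of an element of `A` and an element of `B` leave the same remainder, so `remainders` is a single-element set; each row of `table` is a rearrangement of $0, \ldots, n - 1$.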
Example G.2.5. Suppose that $X$ is a set with $n$ elements (for example, the set $\{1, 2, \ldots, n\}$). The collection of all bijections (one-to-one correspondences) $X \to X$ forms a group under the operation “composition of maps”. This group, which has $n!$ elements, is called the symmetric group $S_n$, and its elements are called permutations of $X$. In contrast to the previous two examples, this group is not abelian (as soon as $n \ge 3$).

Figure G.1. A regular tetrahedron. The symmetry (isometry) group of the tetrahedron is isomorphic to $S_4$, acting by permutations of the four vertices $A, B, C, D$.

Example G.2.6. The previous example (the symmetric group) is

only one of many examples of symmetry groups in mathematics. In
general, if X is some kind of “mathematical structure”, a “structure-
preserving bijection” from X to itself is called a symmetry of X,
and such symmetries will in general form a group. Examples might
include the group of isometries from a metric space to itself (such as
the tetrahedron shown in Figure G.1), or the group of invertible linear
transformations from a vector space to itself. The latter example is
also called the general linear group of the vector space.

Example G.2.7. If G and H are groups, their Cartesian product

G × H can be made into a group by “componentwise operations”:

(g1 , h1 ) ∗ (g2 , h2 ) = (g1 g2 , h1 h2 ).

An example that will be important in Chapter 8 is the group $\mathbb{Z}^2 = \mathbb{Z} \times \mathbb{Z}$ of pairs of integers, with componentwise addition as the group operation.

Deﬁnition G.2.8. Let G be a group. A subgroup of G is a subset

H ⊆ G which has the following properties:

(a) The identity e belongs to H.

(b) If h ∈ H, then h−1 ∈ H.
(c) If h1 , h2 ∈ H, then h1 h2 ∈ H.

When these properties are satisﬁed, H becomes a group in its own

right under the operation that it inherits from G (this explains the
“subgroup” terminology).
Remark G.2.9. Suppose that H is a subgroup of G. The relation
∼ on G defined by
\[ g_1 \sim g_2 \iff g_1 g_2^{-1} \in H \]
is then an equivalence relation (the three properties (a)–(c) of a sub-
group, in the definition above, correspond exactly to the reflexive,
symmetric, and transitive properties of an equivalence relation). The
equivalence classes are called cosets of H in G, and the number of
such cosets is the index of H in G. It can be denoted [G : H].
All cosets contain the same number of elements: the map $h \mapsto hg$ is a bijection from $H$ to the coset containing the element $g$. Consequently we have Lagrange’s theorem: if $G$ is a finite group and $H$ a subgroup, the order of $H$ divides the order of $G$.
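Cosets and Lagrange's theorem can be checked by brute force in a small group. The sketch below takes $G = S_3$, represented as tuples giving the images of $0, 1, 2$, and the two-element subgroup $H$ generated by one transposition; this representation and the helper names are assumptions made for illustration.

```python
from itertools import permutations

def compose(s, t):
    # The permutation "s after t": i -> s(t(i)).
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))     # the symmetric group S_3
H = [(0, 1, 2), (1, 0, 2)]           # identity and the swap of 0 and 1

def coset_of(g):
    # The coset Hg = {hg : h in H} containing g.
    return frozenset(compose(h, g) for h in H)

cosets = {coset_of(g) for g in G}
index = len(cosets)                  # the index [G : H]

# g1 ~ g2 iff g1 g2^{-1} lies in H; this should match "same coset".
relation_matches = all(
    (compose(g1, inverse(g2)) in H) == (coset_of(g1) == coset_of(g2))
    for g1 in G for g2 in G
)
```

Here there are three cosets of two elements each, so the order of $G$ equals the order of $H$ times the index, in line with Lagrange's theorem.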

G.3. Homomorphisms
Let G and H be groups.
Deﬁnition G.3.1. A homomorphism from G to H is a map ϕ : G →
H such that
ϕ(g1 g2 ) = ϕ(g1 )ϕ(g2 )
for all g1 , g2 ∈ G.

Notice that the “multiplication” on the left side of this equation

takes place in G, and on the right side in H.
Lemma G.3.2. If ϕ : G → H is a homomorphism, then ϕ “respects
identity and inverses”; in other words, it sends the identity of G to
the identity of H, and, for all g ∈ G, ϕ(g −1 ) is the inverse (in H) of
ϕ(g).

Proof. The identity of a group $G$ is the only element that satisfies the equation $g^2 = g$ (proof: multiply by $g^{-1}$). Thus $\varphi(e_G)^2 = \varphi(e_G^2) = \varphi(e_G)$ implies that $\varphi(e_G) = e_H$.
Now $\varphi(g)\varphi(g^{-1}) = \varphi(g g^{-1}) = \varphi(e_G) = e_H$. This implies that $\varphi(g)$ and $\varphi(g^{-1})$ are inverses in $H$, as asserted.

Deﬁnition G.3.3. Let ϕ : G → H be a homomorphism of groups.

The kernel of ϕ is deﬁned by
Ker(ϕ) = {g ∈ G : ϕ(g) = eH }.
The image of ϕ is deﬁned by
Im(ϕ) = {h ∈ H : ∃g ∈ G, ϕ(g) = h}.

These are easily seen to be subgroups of G and H, respectively.

Remark G.3.4. By definition, a homomorphism $\varphi : G \to H$ is surjective (onto) if and only if $\mathrm{Im}(\varphi) = H$. Also, $\varphi$ is injective (one-to-one) if and only if $\mathrm{Ker}(\varphi) = \{e_G\}$. To see this, notice that if $\varphi(g_1) = \varphi(g_2)$, then
\[ \varphi(g_1 g_2^{-1}) = \varphi(g_1)\varphi(g_2)^{-1} = e_H, \]
and therefore $g_1 g_2^{-1} \in \mathrm{Ker}(\varphi)$; when the kernel is trivial, this forces $g_1 = g_2$. If $\varphi$ is bijective (that is, both surjective and injective), it is called an isomorphism. Compare all this with Remark A.3.5.

Example G.3.5. Fix $n$. The map which assigns to each integer its residue class modulo $n$ is a homomorphism $\mathbb{Z} \to \mathbb{Z}_n$.

Example G.3.6. Let G be the group of all nonzero real numbers

under multiplication. Let H be the group consisting of the numbers
{±1}, also under multiplication. (Although H is a subgroup of G,
that is not really relevant to the following discussion.) Deﬁne a map
ϕ : G → H by sending all positive numbers to +1 and all negative
numbers to −1. This is a homomorphism. The homomorphism law,
in this case, summarizes the familiar sign rules (“minus times minus
equals plus”, etc.) which were ﬁrst written down in Bombelli’s Algebra
of 1572.

Notice that H = {±1} is isomorphic to the cyclic group with

two elements. Another important example of a homomorphism to
this group is provided by the sign of a permutation. Recall that a
permutation of X = {1, . . . , n} is a bijective map X → X. Let σ be
a ﬁxed permutation of X. Deﬁne an equivalence relation ∼ on X by
\[ x \sim y \iff \exists k \in \mathbb{Z} : \sigma^k(x) = y. \]

Deﬁnition G.3.7. The equivalence classes associated to this equiv-

alence relation are called the orbits, or cycles, of σ. The sign of σ is
the number ±1 deﬁned by
\[ \operatorname{sign}(\sigma) = (-1)^{n - m}, \]
where m is the number of orbits for σ.

Proposition G.3.8 (Parity theorem). For each n, the sign gives a

homomorphism from Sn to {±1} (the cyclic group with two elements).

This is not at all an obvious fact. To prove it, we will make use
of the concept of transposition: a transposition is a permutation that
interchanges just two elements of X, leaving the others ﬁxed.

Lemma G.3.9. Let $\sigma$ be a permutation with $m$ orbits and let $\tau$ be a transposition. Then $\tau\sigma$ has either $m + 1$ orbits or $m - 1$ orbits.

Proof. This is just a calculation. The first case occurs when the two
elements that τ interchanges belong to the same orbit of σ, and the
second case occurs when the two elements that τ interchanges belong
to different orbits of σ. In either case the other orbits (those that do
not involve the elements interchanged by τ ) are unaffected.

Lemma G.3.10. Every permutation can be written as a product of

transpositions.

Notice that there is no uniqueness assertion here — a permutation

can be written as a product of transpositions in many diﬀerent ways.

Proof. We prove by induction on n that every permutation in Sn can

be written in this way, starting with n = 2. The group S2 contains the
identity element (the product of 0 transpositions, or 2 copies of the
same transposition if you prefer) and the permutation interchanging 1
and 2 (which is itself a transposition), so the base case is veriﬁed. Now
let σ ∈ Sn+1 and suppose that σ(n + 1) = k. Let τ be a transposition
interchanging k and n + 1 (or the identity if k = n + 1). Then τ σ
maps n + 1 to itself, so it may be considered as permuting {1, . . . , n}.
By the induction hypothesis, τ σ is a product of transpositions. But
then σ = τ (τ σ) is such a product too.
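
The induction in this proof is effectively an algorithm. The following sketch unrolls it into a loop, again assuming permutations are tuples on the points 0, ..., n−1. The helper names are hypothetical. At each step the largest point not yet fixed plays the role of n + 1 in the proof.

```python
from itertools import permutations


def transposition_factors(sigma):
    """Write sigma as a product t_1 t_2 ... t_j of transpositions.

    Each factor is a pair (k, last) of interchanged points. The product
    is read as composition of maps, so t_j is applied first.
    """
    current = list(sigma)
    factors = []
    for last in range(len(current) - 1, 0, -1):
        k = current[last]
        if k != last:
            factors.append((k, last))
            # Replace current by tau*current, where tau swaps k and last;
            # afterwards current fixes the point `last`.
            swap = {k: last, last: k}
            current = [swap.get(v, v) for v in current]
    return factors


def product(factors, n):
    """Multiply out t_1 t_2 ... t_j as a permutation of 0, ..., n-1."""
    result = tuple(range(n))
    for a, b in reversed(factors):
        swap = {a: b, b: a}
        result = tuple(swap.get(v, v) for v in result)
    return result


sigma = (1, 2, 0)
print(transposition_factors(sigma))
```

The factorization found this way uses at most n − 1 transpositions. As the remark before the proof says, it is only one of many possible factorizations.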

Proof of Proposition G.3.8. The identity permutation e has n cycles,
so sign(e) = +1. By Lemma G.3.9, if τ is a transposition,
sign(τσ) = − sign(σ). Therefore, for a product of k transpositions
τ_1 · · · τ_k,
sign(τ_1 · · · τ_k) = − sign(τ_2 · · · τ_k) = · · · = (−1)^k sign(e) = (−1)^k.
By Lemma G.3.10, every permutation can be written as a product of
transpositions. Thus we have established an alternative definition of
sign(σ): it is equal to +1 if σ can be written as the product of an even
number of transpositions, and −1 if σ can be written as the product
of an odd number of transpositions. (The two cases cannot both occur,
since sign(σ) is already determined by the number of orbits of σ.)
The homomorphism property
sign(σ_1 σ_2) = sign(σ_1) sign(σ_2)
follows immediately from this: if σ_1 is a product of k transpositions
and σ_2 is a product of l transpositions, then σ_1 σ_2 is a product of
k + l transpositions.
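
As a sanity check on the parity theorem, the homomorphism property can be verified exhaustively for small n. This is a sketch under the assumption that permutations are tuples on 0, ..., n−1 and that the product ab means "apply b first, then a". The function names are illustrative.

```python
from itertools import permutations


def sign(perm):
    """Sign via the orbit count: (-1)**(n - m), m = number of orbits."""
    n = len(perm)
    seen = [False] * n
    m = 0
    for start in range(n):
        if not seen[start]:
            m += 1
            x = start
            while not seen[x]:
                seen[x] = True
                x = perm[x]
    return (-1) ** (n - m)


def compose(a, b):
    """Return the product ab: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))


def is_homomorphism(n):
    """Check sign(ab) == sign(a) * sign(b) for all a, b on n points."""
    group = list(permutations(range(n)))
    return all(
        sign(compose(a, b)) == sign(a) * sign(b)
        for a in group
        for b in group
    )


def kernel_size(n):
    """Number of permutations of n points with sign +1."""
    return sum(1 for p in permutations(range(n)) if sign(p) == 1)


print(is_homomorphism(4), kernel_size(4))
```

For n = 4 the kernel has 12 of the 24 elements, exactly half of the group.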

Example G.3.11. The full symmetry group of the regular tetrahedron
is isomorphic to S4 (Figure G.1). The transpositions are implemented
by reflections in the planes bisecting edges of the tetrahedron.
The subgroup of rotations of the tetrahedron is composed exactly of
those symmetries that are a product of an even number of reflections,
that is, the kernel of the parity homomorphism. This kernel is called
the alternating group, A4 in this case.

We used the sign of a permutation in deﬁning the determinant

of a linear transformation (Deﬁnition A.6.5). Recall that one of the
basic properties of the determinant is the product rule
det(AB) = det(A) det(B).
In the language of this section, we recognize this as saying that the
determinant is a homomorphism from the group GL(n, K) of invert-
ible n × n matrices to the group K \ {0} of nonzero elements of the
ﬁeld K under multiplication.

G.4. Graphs
Definition G.4.1. A graph is a set V together with a relation E
on V which is nonreflexive and symmetric; that is, for all x, y ∈ V,
(x, x) ∉ E, and (x, y) ∈ E if and only if (y, x) ∈ E.

Figure G.2. A graph. Vertices {A, B, C, D, E} and edges
{AB, BC, CD, DE, CE, BD}.

We think of the elements of V as vertices or points, and the
elements of E as edges joining pairs of points. For an edge e = (x, y),
the vertices x and y are the ends of e. This allows us to represent a
graph by a picture like Figure G.2. The symmetry condition means
that the edges are unoriented: we make no distinction between the
“head” and the “tail” of an edge.
A graph may be finite or infinite. For example, we could make a
graph whose vertices are the integers Z and whose edges join n and
n + 1 for each n ∈ Z.
Definition G.4.2. A graph G = (V, E) is connected if the equiva-
lence relation generated by E (Remark G.1.4) has only one equiva-
lence class, V .
Definition G.4.3. A tree is a minimal connected graph; that is, it is
connected but the removal of any edge will make it disconnected. If
G = (V, E) is any graph, a spanning tree for G is a tree with vertices
exactly V and edges belonging to E (see Figure G.3).

Lemma G.4.4. Every connected graph contains a spanning tree.

Proof. Start with the given graph and, so long as this is possible,
keep removing edges while preserving connectedness. When this is
no longer possible — that is, when removal of any additional edge
will disconnect the graph — then you have, by deﬁnition, arrived at
a minimal connected graph, that is, a tree.

Figure G.3. A graph and a spanning tree. Heavy lines denote edges
of the spanning tree.

Of course, if our original graph was infinite, this argument needs
to be augmented by some standard trickery (like induction or Zorn’s
lemma) before it can count as a proof. However, we will be interested
only in the finite case so we will not sweat the details of this.
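
For a finite graph, the proof of Lemma G.4.4 translates directly into a procedure. The sketch below is an illustration under assumptions of its own: a graph is given as a collection of vertices and a list of unordered edges written as pairs, and the function names are hypothetical. It tries each edge once and removes it whenever the graph stays connected. One pass is enough, because an edge whose removal disconnects the graph at some stage still does so after further edges have been removed.

```python
def is_connected(vertices, edges):
    """Return True if every vertex can be reached from any other."""
    vertices = set(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adjacency[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == vertices


def spanning_tree(vertices, edges):
    """Remove edges one at a time for as long as connectedness is preserved."""
    tree = list(edges)
    for edge in list(edges):
        trial = [e for e in tree if e != edge]
        if is_connected(vertices, trial):
            tree = trial
    return tree


# The graph of Figure G.2.
V = ["A", "B", "C", "D", "E"]
E = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("C", "E"), ("B", "D")]

print(spanning_tree(V, E))
```

Which spanning tree comes out depends on the order in which the edges are tried. For the graph of Figure G.2 every spanning tree has four edges.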
Let G = (V, E) be a (finite or infinite) connected graph. A space
X = |G| called the geometric realization of G may be defined by the
following process. Consider the Hilbert space ℓ²(V) (see Example
F.1.2) of square-summable sequences on V. Identify each v ∈ V with
the corresponding basis vector e_v ∈ ℓ²(V) (the sequence which has
value 1 at v and 0 elsewhere), and, if (u, v) ∈ E is an edge, join e_u
and e_v by a straight line segment in ℓ²(V). The resulting geometrical
space (the union of all the basis vectors corresponding to vertices
and straight line segments corresponding to edges) is the geometric
realization X = |G|.

Remark G.4.5. The realization |G| is equipped with a natural metric
called the path metric. To define it, notice first that the definition
of length of a path (see (6.2.2))
Length(γ) = ∫_0^1 ‖γ′(t)‖ dt
makes perfect sense for smooth or piecewise smooth paths in a Hilbert
space; we can differentiate such paths using the definition of
Example E.2.7. Now because G is connected, any two points of |G| can be
joined by a piecewise smooth path that lies entirely in |G|. Define the
path metric on |G| by
d(x, y) = inf { 2^{−1/2} Length(γ) : γ : [0, 1] → |G| piecewise smooth, γ(0) = x, γ(1) = y }.
In other words, the distance between the points x and y in the path
metric is the length of the shortest path in |G| joining x and y. The
normalization factor 2^{−1/2} is put in so that the edges all have length 1.
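
The effect of the normalization can be checked directly. The computation below is a worked sketch, taking the straight-line parameterization of the edge from e_u to e_v (a choice made here for illustration) and using only that distinct basis vectors of ℓ²(V) are orthonormal.

```latex
\begin{align*}
\gamma(t) &= (1-t)\,e_u + t\,e_v, \qquad \gamma'(t) = e_v - e_u, \\
\|\gamma'(t)\|^2 &= \|e_v\|^2 - 2\,\mathrm{Re}\,\langle e_v, e_u\rangle + \|e_u\|^2 = 1 - 0 + 1 = 2, \\
2^{-1/2}\,\mathrm{Length}(\gamma) &= 2^{-1/2}\int_0^1 \sqrt{2}\;dt = 1 .
\end{align*}
```

So each straight edge has length √2 in ℓ²(V) and length 1 after rescaling.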

1. Colin Adams, Into Thin Air, Mathematical Intelligencer 22 (2000),

no. 1, 21–22.
2. Lars Ahlfors, Complex Analysis, 3rd ed., McGraw-Hill Science/
Engineering/Math, New York, January 1979.
3. Vladimir Arnold, Ordinary Diﬀerential Equations, Springer-Verlag,
Berlin, 1992.
4. Michael F. Atiyah, Algebraic Topology and Elliptic Operators, Commu-
nications on Pure and Applied Mathematics 20 (1967), no. 2, 237–249.
5. , Bott Periodicity and the Index of Elliptic Operators, Oxford
Quarterly Journal of Mathematics 19 (1968), 113–140.
6. Various authors, Proof of the Ham-Sandwich Theorem, Math
StackExchange discussion, http://math.stackexchange.com/questions/
381142/proof-of-the-ham-sandwich-theorem.
7. Sheldon Axler, Linear Algebra Done Right, 3rd ed., Springer, New
York, November 2014.
8. Alan F. Beardon, Complex Analysis: The Argument Principle in Anal-
ysis and Topology, Wiley, 1979.
9. William Beyer and Andrew Zardecki, The Early History of the Ham
Sandwich Theorem, The American Mathematical Monthly 111 (2004),
no. 1, 58.
10. R. H. Bing, The Geometric Topology of 3-Manifolds, American Math-
ematical Society, 1983.
11. Larry G. Brown, Ronald G. Douglas, and Peter A. Fillmore, Extensions
of C*-Algebras and K-Homology, Annals of Mathematics 105 (1977),
no. 2, 265–324.

261

12. Morton Brown, A Proof of the Generalized Schoenﬂies Theorem, Bull.

Amer. Math. Soc. 66 (1960), no. 2, 74–76.
13. Alain Connes, Non-Commutative Geometry, Academic Press, Boston,
1995.
14. Jean Dieudonné, Foundations of Modern Analysis, 3rd ed., Academic
Press, New York, 1969.
15. Albrecht Dold, A Simple Proof of the Jordan-Alexander Complement
Theorem, The American Mathematical Monthly 100 (1993), no. 9,
856–857.
16. Jerzy Dydak and Nathan Feldman, Major Theorems on Compactness:
A Uniﬁed Exposition, The American Mathematical Monthly 99 (1992),
no. 3, 220–227.
17. David Eisenbud, An Algebraic Approach to the Topological Degree of a
Smooth Map, Bull. Amer. Math. Soc. 84 (1978), no. 5, 751–764.
18. Joseph Gallian, Contemporary Abstract Algebra, 8th ed., Cengage
Learning, Boston, MA, July 2012.
19. Thomas Hales, Jordan’s Proof of the Jordan Curve Theorem, Studies
in Logic, Grammar and Rhetoric 10 (2007), no. 23, 45–60.
20. Paul R. Halmos, Finite Dimensional Vector Spaces, Literary Licensing,
LLC, September 2013.
21. Richard Hamming, Error-Detecting and Error-Correcting Codes, Bell
System Technical Journal 29 (1950), 147–160.
22. Israel Herstein, Abstract Algebra, 3rd ed., Wiley, New York, January
1996.
23. Nigel Higson and John Roe, The Atiyah-Singer Index Theorem, in The
Princeton Companion to Mathematics (Tim Gowers, ed.), 2008.
24. David W. Lyons, An Elementary Introduction to the Hopf Fibration,
Mathematics Magazine 76 (2003), no. 2, 87–98.
25. Saunders MacLane, Categories for the Working Mathematician, 2nd
ed., Springer, New York, September 1998.
26. Ib H. Madsen and Jorgen Tornehave, From Calculus to Cohomology:
De Rham Cohomology and Characteristic Classes, Cambridge Univer-
sity Press, Cambridge, New York, March 1997.
27. Ryuji Maehara, The Jordan Curve Theorem Via the Brouwer Fixed
Point Theorem, The American Mathematical Monthly 91 (1984),
no. 10, 641–643.
28. John W. Morgan and Gang Tian, Ricci Flow and the Poincaré Con-
jecture, American Mathematical Society and the Clay Mathematics
Institute, American Mathematical Society, Providence, RI, 2007.

29. Mı́cheál O’Searcoid, Metric Spaces, Springer, London, August 2006.

30. Guiseppe Peano, Sur une courbe, qui remplit toute une aire plane,
Math. Ann. 36 (1890), no. 1, 157–160.
31. Roger Penrose, The Topology of Ridge Systems, Annals of Human Ge-
netics 42 (1979), no. 4, 435–444.
32. Joseph J. Rotman, A First Course in Abstract Algebra, 3rd ed., Pear-
son, Upper Saddle River, NJ, October 2005.
33. Walter Rudin, Functional Analysis, McGraw-Hill, New York, 1973.
34. , Real and Complex Analysis, McGraw-Hill, New York, 1987.
35. Laurent Siebenmann, The Osgood-Schoenﬂies Theorem Revisited, Rus-
sian Mathematical Surveys 60 (2005), no. 4, 645–672.
36. Wilson A. Sutherland, Introduction to Metric and Topological Spaces,
2nd ed., Oxford University Press, Oxford, 2009.
37. Jules Verne, Around the World in 80 Days, World’s Classics ed., Oxford
University Press, 2008.
38. Stan Wagon, The Banach-Tarski Paradox, Cambridge University Press,
Cambridge, New York, September 1993.
39. Hassler Whitney, On Regular Closed Curves in the Plane, Compositio
Mathematica 4 (1937), 276–284.


