Bressoud - A Radical Approach To Lebesgue's Theory of Integration (2008)
MAA TEXTBOOKS
DAVID M. BRESSOUD
Macalester College
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi
A catalog record for this publication is available from the British Library.
I look at the burning question of the foundations of infinitesimal analysis without sorrow,
anger, or irritation. What Weierstrass — Cantor — did was very good. That's the way it had
to be done. But whether this corresponds to what is in the depths of our consciousness is a
very different question. I cannot but see a stark contradiction between the intuitively clear
fundamental formulas of the integral calculus and the incomparably artificial and complex
work of the "justification" and their "proofs." One must be quite stupid not to see this at
once, and quite careless if, after having seen this, one can get used to this artificial, logical
atmosphere, and can later on forget this stark contradiction.
Nikolai Luzin reminds us of a truth too often forgotten in the teaching of analysis;
the ideas, methods, definitions, and theorems of this study are neither natural nor
intuitive. It is all too common for students to emerge from this study with little
sense of how the concepts and results that constitute modern analysis hang together.
Here more than anywhere else in the advanced undergraduate/beginning graduate
curriculum, the historical context is critical to developing an understanding of the
mathematics.
This historical context is both interesting and pedagogically informative. From
transfinite numbers to the Heine—Borel theorem to Lebesgue measure, these ideas
arose from practical problems but were greeted with a skepticism that betrayed
confusion. Understanding what they mean and how they can be used was an uncer-
tain process. We should expect our students to encounter difficulties at precisely
those points at which the contemporaries of Weierstrass, Cantor, and Lebesgue had
balked.
Throughout this text I have tried to emphasize that no one set out to invent
measure theory or functional analysis. I find it both surprising and immensely sat-
isfying that the search for understanding of Fourier series continued to be one of the
principal driving forces behind the development of analysis well into the twentieth
century. The tools that these mathematicians had at hand were not adequate to the
task. In particular, the Riemann integral was poorly adapted to their needs.
It took several decades of wrestling with frustrating difficulties before mathe-
maticians were willing to abandon the Riemann integral. The route to its eventual
replacement, the Lebesgue integral, led through a sequence of remarkable insights
into the complexities of the real number line. By the end of the 1890s, it was recognized
that analysis and the study of sets were inextricably linked. From this rich interplay,
measure theory would emerge. With it came what today we call Lebesgue's domi-
nated convergence theorem, the holy grail of nineteenth-century analysis. What so
many had struggled so hard to discover now appeared as a gift that was almost free.
This text is an introduction to measure theory and Lebesgue integration, though
anyone using it to support such a course must be forewarned that I have intentionally
avoided stating results in their greatest possible generality. Almost all results are
given only for the real number line. Theorems that are true over any compact set
are often stated only for closed, bounded intervals. I want students to get a feel for
these results, what they say, and why they are important. Close examination of the
most general conditions under which conclusions will hold is something that can
come later, if and when it is needed.
The title of this book was chosen to communicate two important points. First,
this is a sequel to A Radical Approach to Real Analysis (ARATRA). That book ended
with Riemann's definition of the integral. That is where this text begins. All of the
topics that one might expect to find in an undergraduate analysis book that were
not in ARATRA are contained here, including the topology of the real number line,
fundamentals of set theory, transfinite cardinals, the Bolzano—Weierstrass theorem,
and the Heine—Borel theorem. I did not include them in the first volume because
I felt I could not do them justice there and because, historically, they are quite
sophisticated insights that did not arise until the second half of the nineteenth
century.
Second, this book owes a tremendous debt to Thomas Hawkins' Lebesgue's
Theory of Integration: Its Origins and Development. Like ARATRA, this book is
not intended to be read as a history of the development of analysis. Rather, this
is a textbook informed by history, attempting to communicate the motivations,
uncertainties, and difficulties surrounding the key concepts. This task would have
been far more difficult without Hawkins as a guide. Those who are intrigued by the
historical details encountered in this book are encouraged to turn to Hawkins and
other historians of this period for fuller explanation.
Even more than ARATRA, this is the story of many contributions by many mem-
bers of a large community of mathematicians working on different pieces of the
puzzle. I hope that I have succeeded in opening a small window into the workings
of this community. One of the most intriguing of these mathematicians is Axel
Harnack, who keeps reappearing in our story because he kept making mistakes, but
they were good mistakes. Harnack's errors condensed and made explicit many of
the misconceptions of his time, and so helped others to find the correct path. For
ARATRA, it was easy to select the four mathematicians who should grace the cover:
Fourier, Cauchy, Abel, and Dirichlet stand out as those who shaped the origins
of modern analysis. For this book, the choice is far less clear. Certainly I need to
include Riemann and Lebesgue, for they initiate and bring to conclusion the princi-
pal elements of this story. Weierstrass? He trained and inspired the generation that
would grapple with Riemann's work, but his contributions are less direct. Heine,
du Bois-Reymond, Jordan, Hankel, Darboux, or Dini? They all made substantial
progress toward the ultimate solution, but none of them stands out sufficiently.
Cantor? Certainly yes. It was his recognition that set theory lies at the heart of
analysis that would enable the progress of the next generation. Who should we
select from that next generation: Peano, Volterra, Borel, Baire? Maybe Riesz or
one of the others who built on Lebesgue's insights, bringing them to fruition? Now
the choice is even less clear. I have settled on Borel for his impact as a young
mathematician and to honor him as the true source of the Heine—Borel theorem,
a result that I have been very tempted to refer to as he did: the first fundamental
theorem of measure theory.
I have drawn freely on the scholarship of others. I must pay special tribute to Soo
Bong Chae's Lebesgue Integration. When I first saw this book, my reaction was
that I did not need to write my own on Lebesgue integration. Here was someone
who had already put the subject into historical context, writing in an elegant yet
accessible style. However, as I have used his book over the years, I have found that
there is much that he leaves unsaid, and I disagree with his choice to use Riesz's
approach to the Lebesgue integral, building it via an analysis of step functions.
Riesz found an elegant route to Lebesgue integration, but in defining the integral
first and using it to define Lebesgue measure, the motivation for developing these
concepts is lost. Despite such fundamental divergences, the attentive reader will
discover many close parallels between Chae's treatment and mine.
I am indebted to many people who read and commented on early drafts of
this book. I especially thank Dave Renfro who gave generously of his time to
correct many of my historical and mathematical errors. Steve Greenfield had the
temerity to be the very first reader of my very first draft, and I appreciate his many
helpful suggestions on the organization and presentation of this book. I also want to
single out my students who, during the spring semester of 2007, struggled through
a preliminary draft of this book and helped me in many ways to correct errors
and improve the presentation of this material. They are Jacob Bond, Kyle Braam,
Pawan Dhir, Elizabeth Gillaspy, Dan Gusset, Sam Handler, Kassa Haileyesus, Xi
Luo, Jake Norton, Stella Stamenova, and Linh To.
David M. Bressoud
June 19, 2007
1
Introduction
This book is devoted to explaining the answers to these five questions — answers
that are very much intertwined. Before we tackle what happened after 1850, we
need to understand what was known or believed in that year.
\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} F(x)\sin(kx)\,dx \quad (k \ge 1). \qquad (1.2) \]
The heuristic argument for the validity of this procedure is that if F really can
be expanded in a series of the form given in Equation (1.3), then
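\[
\begin{aligned}
\int_{-\pi}^{\pi} F(x)\cos(nx)\,dx
&= \int_{-\pi}^{\pi} \Bigl[ \frac{a_0}{2} + \sum_{k=1}^{\infty} \bigl( a_k\cos(kx) + b_k\sin(kx) \bigr) \Bigr] \cos(nx)\,dx \\
&= \frac{a_0}{2} \int_{-\pi}^{\pi} \cos(nx)\,dx + \sum_{k=1}^{\infty} a_k \int_{-\pi}^{\pi} \cos(kx)\cos(nx)\,dx + \sum_{k=1}^{\infty} b_k \int_{-\pi}^{\pi} \sin(kx)\cos(nx)\,dx \\
&= \pi a_n.
\end{aligned}
\]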
Similarly,
\[ \int_{-\pi}^{\pi} F(x)\sin(nx)\,dx = \pi b_n. \qquad (1.6) \]
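The heuristic rests on the orthogonality of the sines and cosines over [−π, π]: each cross term integrates to zero, and only the matching cosine (or sine) term survives with the value π. The following Python sketch checks this numerically for a few small indices. It is an illustration rather than a proof; the helper names (integrate, cos_cos, sin_cos) and the choice of the midpoint rule are assumptions made for this example, not part of the text.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def cos_cos(k, n):
    """Approximate the integral of cos(kx) cos(nx) over [-pi, pi]."""
    return integrate(lambda x: math.cos(k * x) * math.cos(n * x), -math.pi, math.pi)

def sin_cos(k, n):
    """Approximate the integral of sin(kx) cos(nx) over [-pi, pi]."""
    return integrate(lambda x: math.sin(k * x) * math.cos(n * x), -math.pi, math.pi)

# Tabulate the integrals for small indices: the cos-cos entries should be
# close to pi when k == n and close to 0 otherwise; sin-cos is always near 0.
for k in range(1, 4):
    for n in range(1, 4):
        print(k, n, round(cos_cos(k, n), 6), round(sin_cos(k, n), 6))
```

What the sketch cannot address is the step the heuristic glosses over: moving the integral inside the infinite sum.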
This is a convincing heuristic, but it ignores the problem of interchanging inte-
gration and summation, and it sidesteps two crucial questions:
This example demonstrates how very strange functions can be if we take seriously
the definition of a function as a well-defined rule that assigns a value to each number
in the domain. Dirichlet's example represents an important step in the evolution
of the concept of function. To the early explorers of calculus, a function was an
algebraic rule such as sin x or x^2 − 3, an expression that could be computed to
whatever accuracy one might desire.
When Augustin-Louis Cauchy showed that any piecewise continuous function
is integrable, he cemented the realization that functions could also be purely geo-
metric, representable only as curves. Even in a situation in which a function has no
explicit algebraic formulation, it is possible to make sense of its integral, provided
the function is continuous.
Dirichlet stretched the concept of function to that of a rule that can be individually
defined for each value of the domain. Once this conception of function is accepted,
the gates are opened to very strange functions. At the very least, integrability can
no longer be assumed.
The next problem is to show that our trigonometric series converges. In his
1829 paper, Dirichlet accomplished this, but he needed the hypothesis that the
original function F is piecewise monotonic; that is, the domain can be partitioned
into a finite number of subintervals so that F is either monotonically increasing or
monotonically decreasing on each subinterval.
The final question is whether the function to which it converges is the function
F with which we started. Under the same assumptions, Dirichlet was able to show
that this is the case, provided that at any points of discontinuity of F, the value
taken by the function is the average of the limit from the left and the limit from the
right.
Dirichlet's result implies that the functions one is likely to encounter in physical
situations present no problems for conversion into Fourier series. Riemann recog-
nized that it was important to be able to extend this technique to more complicated
functions now arising in questions in number theory and geometry. The first step
was to get a better handle on what we mean by integration.
Integration
It is ironic that integration took so long to get right because it is so much older
than any other piece of calculus. Its roots lie in methods of calculating areas,
volumes, and moments that were undertaken by such scientists as Archimedes
(287–212 BC), Liu Hui (late third century AD), ibn al-Haytham (965–1039), and
Johannes Kepler (1571–1630). The basic idea was always the same. To evaluate
an area, one divided it into rectangles or triangles or other shapes of known area
that together approximated the desired region. As more and smaller figures were
used, the region would be matched more precisely. Some sort of limiting argument
would then be invoked, some means of finding the actual area based on an analysis
of the areas of the approximating regions.
Into the eighteenth century, integration was identified with the problem of
"quadrature," literally the process of finding a square equal in area to a given
area and thus, in practice, the problem of computing areas. In section 1 of Book
I of his Mathematical Principles of Natural Philosophy, Newton explains how to
calculate areas under curves. He gives a procedure that looks very much like the
definition of the Riemann integral, and he justifies it by an argument that would be
appropriate for any modern textbook.
Specifically, Newton begins by approximating the area under a decreasing curve
by subdividing the domain into equal subintervals (see Figure 1.1). Above each
subinterval, he constructs two rectangles: one whose height is the maximum value
of the function on that interval (the circumscribed rectangle) and the other whose
height is the minimum value of the function (the inscribed rectangle). The true area
lies between the sum of the areas of the circumscribed rectangles and the sum of
the areas of the inscribed rectangles.
The difference between these areas is the sum of the areas of the rectangles
aKbl, bLcm, cMdn, dDEo. If we slide all of these rectangles to line up under
aKbl, we see that the sum of their areas is just the change in height of the function
multiplied by the length of any one subinterval. As we take narrower subintervals,
the difference in the areas approaches zero. As Newton asserts: "The ultimate ratios
which the inscribed figure, the circumscribed figure, and the curvilinear figure have
to one another are ratios of equality," which is his way of saying that the ratio of
any two of these areas approaches 1. Therefore, the areas are all approaching the
same value as the length of the subinterval approaches 0.
Figure 1.1. Newton's illustration from Mathematical Principles of Natural Philosophy. (Newton, 1999, p. 433)
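Newton's observation can be tried out numerically. The sketch below assumes a particular decreasing function, f(x) = 1/(1 + x) on [0, 3], chosen only for illustration; the function name newton_sums is likewise hypothetical. For a decreasing function the maximum on each subinterval sits at the left endpoint and the minimum at the right endpoint, so the difference between the two sums should equal the total drop in height times the common width.

```python
def newton_sums(f, a, b, n):
    """Circumscribed and inscribed rectangle sums for a decreasing f on [a, b]."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    # For decreasing f, the maximum on [xs[i], xs[i+1]] is at the left endpoint
    # and the minimum is at the right endpoint.
    circumscribed = sum(f(xs[i]) * dx for i in range(n))
    inscribed = sum(f(xs[i + 1]) * dx for i in range(n))
    return circumscribed, inscribed

def f(x):
    # Hypothetical decreasing function, chosen only for illustration.
    return 1.0 / (1.0 + x)

a, b = 0.0, 3.0
for n in (4, 40, 400):
    C, I = newton_sums(f, a, b, n)
    dx = (b - a) / n
    # Columns: n, C - I, (drop in height) * dx, and the ratio C / I.
    print(n, C - I, (f(a) - f(b)) * dx, C / I)
```

The second and third columns agree up to rounding, and the ratio in the last column moves toward 1 as the subintervals narrow.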
In Lemma 3 of his book, Newton considers the case where the subintervals are
not of equal length (using the dotted line fF in Figure 1.1 in place of lB). He
observes that the sum of the differences of the areas is still less than the change in
height multiplied by the length of the longest subinterval. We therefore get the same
limit for the ratio so long as the length of the longest subinterval is approaching
zero.
This method of finding areas is paradigmatic for an entire class of problems in
which one is multiplying two quantities such as
where the value of the first quantity can vary as the second quantity increases. For
example, knowing that "distance = speed × time," we can find the distance traveled
by a particle whose speed is a function of time, say v(t) = 8t + 5, 0 ≤ t ≤ 4. If we
split the time into four intervals and use the velocity at the start of each interval,
we get an approximation to the total distance:
\[ 5\cdot 1 + 13\cdot 1 + 21\cdot 1 + 29\cdot 1 = 68. \]
If we use eight intervals of length 1/2 and again take the speed at the start of each
interval, we get
\[ 5\cdot\frac{1}{2} + 9\cdot\frac{1}{2} + 13\cdot\frac{1}{2} + \cdots + 33\cdot\frac{1}{2} = 76. \]
If we use 1,024 intervals of length 1/256 and take the speed at the start of each
interval, we get
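\[ 5\cdot\frac{1}{256} + \frac{161}{32}\cdot\frac{1}{256} + \frac{81}{16}\cdot\frac{1}{256} + \cdots + \frac{1{,}183}{32}\cdot\frac{1}{256} = 83.9375. \]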
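These three approximations can be reproduced exactly with rational arithmetic. The sketch below is one way to do so; the function name left_sum is an assumption made for this example. It uses the speed at the start of each interval, as in the text, and Python's fractions module so that no rounding enters.

```python
from fractions import Fraction

def left_sum(v, a, b, n):
    """Sum of v(start of interval) * (interval length) over n equal intervals of [a, b]."""
    dt = Fraction(b - a, n)
    return sum(v(a + i * dt) * dt for i in range(n))

def v(t):
    # Speed function from the example: v(t) = 8t + 5 on 0 <= t <= 4.
    return 8 * t + 5

for n in (4, 8, 1024):
    total = left_sum(v, 0, 4, n)
    print(n, total, float(total))
```

For n = 4, 8, and 1,024 the sums are 68, 76, and 1343/16 = 83.9375. For this particular v, the sum with n equal intervals works out to 84 − 64/n, so the approximations increase toward 84.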
\[ \int f(x)\,dx. \]
The product is f(x) dx, the value of the first quantity times the infinitesimal
increment. The elongated S, ∫, represents the summation.
This is all precalculus. The insight at the heart of calculus is that if f(x) represents
the slope of the tangent to the graph of a function F at x, then this provides an
easy method for computing limits of sums of products: If x ranges over the interval
[a, b], then the value of this integral is F(b) − F(a). Thus, to find the area under
the curve v = 8t + 5 from t = 0 to t = 4, we can observe that f(t) = 8t + 5 is the
derivative of F(t) = 4t^2 + 5t. The desired area is equal to
\[ (4\cdot 4^2 + 5\cdot 4) - (4\cdot 0^2 + 5\cdot 0) = 84. \]
The calculating power of calculus comes from this dual nature of the integral. It can
be viewed as a limit of sums of products or as the inverse process of differentiation.
It is hard to find a precise definition of the integral from the eighteenth century.
The scientists of this century understood and exploited the dual nature of the
integral, but most were reluctant to define it as the sum of products of f(x) times
the infinitesimal dx, for that inevitably led to the problem of what exactly is meant
by an "infinitesimal." It is a useful concept, but one that is hard to pin down.
George Berkeley aptly described infinitesimals as "ghosts of departed quantities."
He would object, "Now to conceive a quantity infinitely small, that is, infinitely
less than any sensible or imaginable quantity or than any the least finite magnitude
is, I confess, above my capacity."
George Berkeley, The Analyst, as quoted in Struik (1986, pp. 335, 338).
The result was that when a definition of ∫ f(x) dx was needed, the integral was
simply defined as the operator that returns you to the function (or, in modern use, the
class of functions) whose derivative is f. One of the early calculus textbooks written
for an undergraduate audience was S. F. Lacroix's Traité élémentaire de Calcul
Différentiel et de Calcul Intégral of 1802 (Elementary Treatise of Differential
Calculus and Integral Calculus). Translated into many languages, it would serve
as the standard text of the first half of the nineteenth century. It provides no explicit
definition of the integral, but does state that
Integral calculus is the inverse of differential calculus. Its goal is to restore the functions from
their differential coefficients.
\[ \sum_{k=1}^{n} f(x_{k-1})\,(x_k - x_{k-1}). \]
the value of the definite integral, and the function f is said to be integrable over
[a,b].
Equipped with this definition, Cauchy succeeded in proving that any continuous
or piecewise continuous function is integrable. The class of functions to which
Fourier's analysis could be applied was suddenly greatly expanded.
When Riemann turned to the study of trigonometric series, he wanted to know
the limits of Cauchy's approach to integration. Was there an easy test that could
be used to determine whether or not a function could be integrated? Cauchy had
chosen to evaluate the function at the left-hand endpoint of the interval simply for
convenience. As Riemann thought about how far this definition could be pushed,
he realized that his analysis would be simpler if the definition were stated in a
slightly more complicated but essentially equivalent manner. Given a partition of
[a, b]: (a = x_0 < x_1 < ⋯ < x_n = b), we assign a tag to each interval, a number x_k^*
contained in that interval, and consider all sums of the form
\[ \sum_{k=1}^{n} f(x_k^*)\,(x_k - x_{k-1}). \]
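A tagged sum of this kind is easy to compute directly. The following sketch is only an illustration; the names tagged_sum and uniform_partition, the test function x^2 on [0, 1], and the three tagging schemes are all assumptions chosen for the example. Cauchy's choice corresponds to the left-endpoint tags.

```python
import random

def tagged_sum(f, partition, tags):
    """Sum of f(tag_k) * (x_k - x_{k-1}) over a tagged partition."""
    total = 0.0
    for k in range(1, len(partition)):
        left, right = partition[k - 1], partition[k]
        tag = tags[k - 1]
        if not left <= tag <= right:
            raise ValueError("each tag must lie in its own subinterval")
        total += f(tag) * (right - left)
    return total

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def f(x):
    # Hypothetical test function; its integral over [0, 1] is 1/3.
    return x * x

P = uniform_partition(0.0, 1.0, 1000)
left_tags = P[:-1]                      # Cauchy's choice: left-hand endpoints
right_tags = P[1:]
rng = random.Random(0)                  # fixed seed so the run is repeatable
random_tags = [rng.uniform(P[k], P[k + 1]) for k in range(len(P) - 1)]

for name, tags in [("left", left_tags), ("right", right_tags), ("random", random_tags)]:
    print(name, tagged_sum(f, P, tags))
```

For a continuous function such as this one, all three choices of tags give sums close to the same value once the partition is fine.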
defined as a limiting process. They then clarify the precise relationship between
integration and differentiation. The actual statements that we shall use are given by
the following theorems.
\[ \int_a^b f(t)\,dt = F(b) - F(a). \qquad (1.7) \]
Theorem 1.2 (FTC, antiderivative). If f is integrable on the interval [a, b], then
under suitable hypotheses we have that
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x). \qquad (1.8) \]
The first of these theorems tells us how we can use any antiderivative to obtain a
simple evaluation of a definite integral. The second shows that the definite integral
can be used to create an antiderivative: the definite integral of f from a to x is a
function of x whose derivative is f. Both of these statements would be meaningless
if we had defined the integral as the antiderivative. Their meaning and importance
come from the assumption that ∫_a^b f(t) dt is defined as a limit of summations.
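Both statements can be checked numerically when the integral is computed as a sum and not looked up as an antiderivative. The sketch below assumes f = cos with known antiderivative F = sin on [0, 2]; these choices, the midpoint rule, and the helper name integral are assumptions for illustration only.

```python
import math

def integral(f, a, x, n=2000):
    """Midpoint-rule approximation of the integral of f from a to x."""
    h = (x - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.cos   # integrand (hypothetical choice)
F = math.sin   # a known antiderivative of f
a, b = 0.0, 2.0

# Evaluation part: the sum-based integral is compared with F(b) - F(a).
print(integral(f, a, b), F(b) - F(a))

def G(x):
    """G(x) = integral of f from a to x, computed as a sum."""
    return integral(f, a, x)

# Antiderivative part: a central difference quotient of G is compared with f(x).
x, h = 1.2, 1e-4
print((G(x + h) - G(x - h)) / (2 * h), f(x))
```

In each printed pair the two numbers should agree to several decimal places. A numerical agreement of this kind says nothing about the hypotheses under which the theorems hold.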
In both cases, I have not specified the hypotheses under which these theorems
hold. There are two reasons for this. One is that much of the interesting story that
is to be told about the creation of analysis in the late nineteenth century revolves
around finding necessary and sufficient conditions under which the conclusions
hold. When working with Riemann's definition of the integral, the answer is complicated.
The second reason is that the hypotheses that are needed depend on the
way we choose to define the integral. For Lebesgue's definition, the hypotheses are
quite different.
2
With thanks to Larry D'Antonio and Ivor Grattan-Guinness for uncovering many of these references.
Siméon Denis Poisson (1781—1840) studied and then taught at the École Poly-
technique. He succeeded to Fourier's professorship in mathematics when Fourier
departed for Grenoble to become prefect of the department of Isère. It was Poisson
who wrote up the rejection of Fourier's Theory of the Propagation of Heat in
Solid Bodies in 1808. When, in 1815, Poisson published his own article on the flow
of heat, Fourier pointed out its many flaws and the extent to which Poisson had
rediscovered Fourier's own work.
Poisson, as a colleague of Cauchy at the École Polytechnique, almost certainly
was aware of Cauchy's definition of the definite integral even though Cauchy had
not yet published it. But the relationship between Poisson and Cauchy was far from
amicable, and it would have been surprising had Poisson chosen to embrace his
colleague's approach. Poisson defines the definite integral as the difference of the
values of the antiderivative. It would seem there is nothing to prove. What Poisson
does prove is that if F has a Taylor series expansion and F' = f, then
\[ F(b) - F(a) = \lim_{n\to\infty} \sum_{j=1}^{n} t\, f\bigl(a + (j-1)t\bigr), \]
where t = (b − a)/n.
Poisson begins with the observation that for 1 ≤ j ≤ n and t = (b − a)/n, there
is a k > 1 and a collection of functions R_j such that
— 1)t)]
The statement and proof of Theorem 1.2 can be found in Cauchy's Résumé des
Leçons Données à L'École Polytechnique of 1823, the same place where he first
defines the definite integral. It is not stated as a fundamental theorem. In fact, it is
not identified as a theorem or proposition, simply a result mentioned in the text en
route to the real problem which is to define the indefinite integral, the general class
of functions that have f as their derivative.
The term "Fundamental Principles of the Integral Calculus" appears in Lardner's
An Elementary Treatise on the Differential and Integral Calculus of 1825, and these
include the statement of the evaluation part of the fundamental theorem of calculus.
But this statement is one of nine principles that include the fact that the integral is
a linear operator as well as many rules for integrating specific functions.
The term "fundamental theorem for integrals" was used to refer to the evaluation
part of the fundamental theorem of calculus in Charles de Freycinet's De L'Analyse
Infinitésimale. Étude sur la Métaphysique du haut Calcul of 1860. De Freycinet
(1828—1923) was trained as a mining engineer, was elected to the French senate in
1876, and served four times as prime minister of France. It would be interesting
to know if there have been any other heads of state that have written calculus
textbooks.
The full modern statement of both parts of the fundamental theorem of calculus
with the definite integral defined as a limit in Cauchy's sense, referred to as the
"fundamental theorem of integral calculus," can be found in an appendix to an
article on trigonometric series published by Paul du Bois-Reymond in 1876. In
1880, he published an extended discussion and proof of this theorem in the widely
read journal Mathematische Annalen.
The fundamental theorem of integral calculus was popularized in English in the
early twentieth century by the publication of Hobson's The Theory of Functions
of a Real Variable and the Theory of Fourier's Series of 1907. This is a thorough
treatment of analysis that was very influential. Hobson gives statements of the
fundamental theorem for both the Riemann and Lebesgue integrals. Some evidence
that this may be the source of this phrase in English is given by the classic English-
language calculus textbook of the first half of the twentieth century, Granville's
Elements of the Differential and Integral Calculus. Granville does not mention
a "fundamental theorem" in his first edition of 1904, but in the second edition
of 1911, we do find it. Since Granville defines integration to be the reversal of
differentiation, his fundamental theorem is that the definite integral is equal to the
limit of the approximating summations.
It seems that G. H. Hardy may be responsible for dropping the adjective
"integral." In the first edition (1908) of G. H. Hardy's A Course of Pure Mathemat-
ics, there is no mention of the phrase "fundamental theorem of calculus." It does
appear, without the adjective "integral," in the second edition, published in 1914.
Although Bernhard Bolzano had shown how to construct a function that is everywhere continuous and nowhere
differentiable, his example only existed in a privately circulated manuscript and was not published until 1930.
Du Bois-Reymond's example was found by Weierstrass, who had described it in his lectures but never
published it.
Term-by-term Integration
Returning to Fourier series, we saw that the heuristic justification relied on inter-
changing summation and integration, integrating an infinite series of functions by
integrating each summand. This works for finite summations. It is not hard to find
infinite series for which term-by-term integration leads to a divergent series or,
even worse, a series that converges to the wrong value.
Weierstrass had shown that if the series converges uniformly, then term-by-term
integration is valid. The problem with this result is that the most interesting series,
especially Fourier series, often do not converge uniformly and yet term-by-term
integration is valid. Uniform convergence is sufficient, but it is very far from
necessary. As we shall see, finding useful conditions under which term-by-term
integration is valid is very difficult so long as we cling to the Riemann integral.
As Lebesgue would show in the opening years of the twentieth century, his
definition of the integral yields a simple, elegant solution, the Lebesgue dominated
convergence theorem.
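A standard illustration of how the interchange can fail, stated here for a sequence of functions instead of a series (a sequence can always be read as the partial sums of a series), is f_n(x) = n^2 x (1 − x)^n on [0, 1]. This example is an assumption chosen for the sketch and is not taken from the text. At every fixed x in [0, 1] the values f_n(x) tend to 0, so the integral of the limit is 0, yet the integral of f_n equals n^2/((n + 1)(n + 2)), which tends to 1.

```python
def f(n, x):
    """Hypothetical example: f_n(x) = n^2 * x * (1 - x)^n on [0, 1]."""
    return n * n * x * (1.0 - x) ** n

def integral01(g, m=100000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / m
    return h * sum(g((i + 0.5) * h) for i in range(m))

for n in (1, 10, 100, 1000):
    value_at_point = f(n, 0.3)                      # tends to 0 as n grows
    numeric = integral01(lambda x: f(n, x))         # numerical integral of f_n
    exact = n * n / ((n + 1) * (n + 2))             # closed form, tends to 1
    print(n, value_at_point, numeric, exact)
```

The convergence here is not uniform: the peak of f_n grows without bound while it slides toward 0, which is how the integrals can stay near 1 although each pointwise limit is 0.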
Exercises
1.1.1. Find the Fourier expansions for f_1(x) = x and f_2(x) = x^2 over [−π, π].
1.1.2. For the functions f_1 and f_2 defined in Exercise 1.1.1, differentiate each
summand in the Fourier series for f_2. Do you get the summands in the Fourier
series for 2f_1? Differentiate each summand in the Fourier series for f_1. Do you get
the summand in the Fourier series for
1.1.3. Using the Fourier series expansion for x^2 (Exercise 1.1.1) evaluated at
x = π, show that
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
\[ \int_{-\pi}^{\pi} \cos(kx)\,dx = \int_{-\pi}^{\pi} \sin(kx)\,dx = \int_{-\pi}^{\pi} \sin(kx)\cos(nx)\,dx = 0. \]
1.1.5. Using the definition of continuity, justify the assertion that the characteristic
function of the rationals, Example 1.1, is not continuous at any real number.
1.1.6. Let C be the circumscribed area, I the inscribed area in Newton's illustration,
using intervals of length Δx. Newton claims that he has demonstrated that
\[ \lim_{\Delta x \to 0} \frac{C}{I} = 1, \]
but what he actually proves is that
\[ \lim_{\Delta x \to 0} (C - I) = 0. \]
lim t1+k = 0.
n —+00
j=1
1.1.15. Define
\[ g(x) = \begin{cases} x, & x \text{ is rational}, \\ 0, & x \text{ is not rational}. \end{cases} \]
For what values of x is g continuous? For what values of x is g differentiable?
1.1.16. Define
\[ h(x) = \begin{cases} x^2, & x \text{ is rational}, \\ 0, & x \text{ is not rational}. \end{cases} \]
For what values of x is h continuous? For what values of x is h differentiable?
1.1.17. Prove that if a function is not continuous at x = a then it cannot be differ-
entiable at x = a.
1.1.18. Show that
\[ \int_0^1 \Bigl( \lim_{n\to\infty} n x e^{-n x^2} \Bigr)\,dx \ne \lim_{n\to\infty} \int_0^1 \bigl( n x e^{-n x^2} \bigr)\,dx. \]
1.2 Presumptions
In this book, we presume that the reader is familiar with certain notations, defini-
tions, and theorems. The most important of these are summarized here.
Notation
{x ∈ [a, b] | f(x) > 0}: set notation; to the left of | is the description of the general
set in which this particular set sits, to the right is the condition or conditions
satisfied by elements of this set. Braces are also used to list the elements of the
set; thus, {1, 2, ..., 10} is the set of positive integers from 1 to 10.
(1/n)_{n=1}^{∞}: sequence notation in which the order is important; this sequence could
also be written as (1, 1/2, 1/3, ...). When it is clear that we are working with a
sequence, this may be written without specifying the limits on n: (a_1, a_2, ...) =
(a_n).
Definitions
continuity: The function f is continuous at c if for every ε > 0 there is a response
δ > 0 such that |x − c| < δ implies that |f(x) − f(c)| < ε.
uniform continuity: The function f is uniformly continuous over the set S if for
every ε > 0 there is a response δ > 0 such that for every c ∈ S, |x − c| < δ
implies that |f(x) − f(c)| < ε.
least upper bound or sup S: the least value that is greater than or equal to every
element of S; greatest lower bound or inf S: the greatest value that is less than
or equal to every element of S. We also write
For a function f,
Cauchy sequence: The sequence (a_n) is Cauchy if for each ε > 0 there is a response
N such that for every m, n ≥ N we have that |a_m − a_n| < ε.
nested interval principle: Given any nested sequence of closed intervals in ℝ,
[a_1, b_1] ⊇ [a_2, b_2] ⊇ ⋯, their intersection is nonempty: ⋂_{n=1}^{∞} [a_n, b_n] ≠ ∅.
vector space: A vector space is a set that is closed under addition, closed under
multiplication by scalars from a field such as ℝ, and that satisfies the following
conditions where X, Y, Z, 0 denote vectors and a, b, 1 denote scalars:
1. commutativity: X + Y = Y + X,
2. associativity of vectors: (X + Y) + Z = X + (Y + Z),
3. additive identity: 0+ X = X + 0 = X,
4. additive inverse: X + (—X) = 0,
5. associativity of scalars: a(bX) = (ab)X,
6. distributivity of scalars: (a + b)X = aX + bX,
7. distributivity of vectors: a(X + Y) = aX + aY,
8. scalar identity: 1X = X.
Theorems
The designation ARATRA 3.1 means that this is theorem (or proposition, lemma, or
corollary) 3.1 in A Radical Approach to Real Analysis.
Theorem 1.3 (DeMorgan's Laws). Let {Sk} be any finite or infinite collection of
sets, then
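\[ \Bigl( \bigcup_k S_k \Bigr)^C = \bigcap_k S_k^C, \qquad \Bigl( \bigcap_k S_k \Bigr)^C = \bigcup_k S_k^C. \]
\[ S \cap (T \cup U) = (S \cap T) \cup (S \cap U), \qquad S \cup (T \cap U) = (S \cup T) \cap (S \cup U). \]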
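These set identities can be spot-checked on finite sets with Python's built-in set type. The sketch below is a sanity check on one small example, not a proof; the universe {0, ..., 19} and the three sets are hypothetical choices made only for illustration.

```python
universe = set(range(20))
# Three hypothetical subsets of the universe.
S = [set(range(0, 20, 2)), set(range(0, 20, 3)), {1, 2, 3, 5, 8, 13}]

def complement(A):
    """Complement of A relative to the chosen universe."""
    return universe - A

union_all = set().union(*S)
intersection_all = set(universe).intersection(*S)

# DeMorgan's laws: complement of a union / intersection.
assert complement(union_all) == set(universe).intersection(*(complement(A) for A in S))
assert complement(intersection_all) == set().union(*(complement(A) for A in S))

# Distributive laws for three sets.
A, T, U = S
assert A & (T | U) == (A & T) | (A & U)
assert A | (T & U) == (A | T) & (A | U)

print("all four identities hold on this example")
```

The theorem itself covers infinite collections of sets, which no finite check can reach.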
Theorem 1.5 (Mean Value Theorem, ARATRA 3.1). Given a function f that is
differentiable at all points strictly between a and x and continuous at all points on
the closed interval from a to x, there exists a real number c strictly between a and
x such that
\[ \frac{f(x) - f(a)}{x - a} = f'(c). \qquad (1.9) \]
Theorem 1.7 (Darboux's Theorem, ARATRA 3.14). If f is differentiable on [a, b],
then f' has the intermediate value property on [a, b].
Theorem 1.8 (The Cauchy Criterion, ARATRA 4.2). A sequence of real numbers
converges if and only if it is a Cauchy sequence.
Exercises
1.2.1. Give an example of a function and an interval for which the function is
continuous but not uniformly continuous on the interval.
1.2.2. Give an example of a sequence that converges but is not monotonic.
1.2.3. Prove or find a counterexample to the statement: Every infinite sequence
contains an infinite monotonic subsequence.
1.2.4. Give an example of a sequence of functions and an interval for which the
sequence converges pointwise but not uniformly on the interval.
1.2.5. Prove that 2 by showing how to find a response N for each
E >0.
1.2.6. The lim sup, lim sup_{n→∞} a_n, can also be defined as the value A such that given
any ε > 0, there is a response N such that n ≥ N implies that a_n < A + ε, and for
every M ∈ ℕ, there is an m ≥ M such that A − ε < a_m. Show that this definition
is equivalent to the definition
\[ A = \inf_{n \ge 1} \Bigl( \sup_{k \ge n} a_k \Bigr). \]
(ysk)C x g
(ysk)C
S1 = F1 fl and S2 = fl
then
F1 = (F2 U fl
1.2.17. Use the mean value theorem, Theorem 1.5, to prove the following weaker
form of Darboux's theorem: If f' is the derivative of f on an open interval
containing c and if lim_{x→c^-} f'(x) and lim_{x→c^+} f'(x) exist, then these one-sided limits
must be equal.
1.2.18. Give an example of a series that converges but does not converge abso-
lutely.
interval [a, b] such that the series does not converge uniformly to f over [a, b] but
b °° °° b
f
1.2.23. Prove that if f is integrable over [a, b] then there exists c e [a, b] for which
1
Ja a
2
The Riemann Integral
Bernhard Riemann received his doctorate in 1851 and his Habilitation in 1854. The
Habilitation confers recognition of the ability to make a substantial contribution
to research beyond the doctoral thesis, and it is a necessary prerequisite for
appointment as a professor in a German university. Riemann chose as his Habilitation
thesis the problem of Fourier series. It was titled Über die Darstellbarkeit einer
Function durch eine trigonometrische Reihe (On the representability of a function
by a trigonometric series), and, strictly speaking, it answered the broader
question: When can a function over (−π, π) be represented as a series of the form
$a_0/2 + \sum_{n=1}^{\infty} (a_n \cos(nx) + b_n \sin(nx))$? This is where we find the Riemann
integral, introduced in a short section before the main body of the thesis, part of
the groundwork that he needed to lay before he could tackle the real problem of
representability by a trigonometric series.
Riemann had studied with Dirichlet in Berlin before going to Göttingen to
complete his doctorate under the direction of Gauss. In the fall of 1852, Dirichlet visited
Göttingen. Shortly afterward, Riemann wrote to his friend Richard Dedekind,
The other morning Dirichlet stayed with me for about two hours; he gave me the notes necessary
for my Habilitation so completely that my work has become much easier; otherwise, for some
things I would have searched for a long time in the library.
Riemann was almost certainly referring to the extensive introduction to his thesis
in which he describes the progress that had been made in understanding Fourier
series until that time. But it is also clear that Dirichlet had continued to think about
this problem, and he may have had some useful advice.
Riemann's thesis on trigonometric series was not published until 1868, two years
after his death at the age of 39. Dedekind was responsible for this publication.
Richard Dedekind (1831–1916) and Bernhard Riemann both studied with Gauss at
Göttingen and then worked with Dirichlet, who succeeded to Gauss's chair. They
developed a strong friendship. In 1862, Dedekind took a position at the Brunswick
Polytechnikum where he would remain for the rest of his career. Today he is
best known for his work in number theory and modern algebra, especially for
establishing the theory of the ring of integers of an algebraic number field.
In 1870, three significant papers appeared that built on Riemann's accomplishments:
Hermann Hankel's Untersuchungen über die unendlich oft oszillierenden
und unstetigen Funktionen (Investigations on infinitely often oscillating and
discontinuous functions), Eduard Heine's Über trigonometrische Reihen (On
trigonometric series), and Georg Cantor's Über einen die trigonometrischen Reihen
betreffenden Lehrsatz (On a theorem concerning trigonometric series). These papers
accomplished two important tasks. The first was to clarify the concept of uniform
convergence and the related issue of when term-by-term integration is legitimate.
The second was to turn the question of integrability of a function to the study of
the set of points at which the function is discontinuous, thus opening the way to
the development of set theory and a deeper understanding of the structure of the
real numbers.
A fourth seminal paper directly inspired by Riemann's thesis was Gaston
Darboux's Mémoire sur les fonctions discontinues (Memoir on discontinuous
functions) of 1875. In 1873, Darboux had published a translation of Riemann's thesis
into French. It is clear that he studied it very carefully. His 1875 paper greatly
simplified the treatment of the Riemann integral. In discussing the Riemann integral,
we shall rely on Darboux's definitions and insights.
Gaston Darboux (1842–1917) studied at the École Normale Supérieure and taught
there from 1872 to 1878. He then went to the Sorbonne where, in 1880, he succeeded
Michel Chasles as chair of higher geometry. Darboux is best known for his work
in differential geometry, but among his many contributions to mathematics, he also
edited Fourier's Collected Works.
2.1 Existence
Riemann devotes three brief pages to the definition of the definite integral, the
definition of an improper integral, and the statement and proof of the necessary
and sufficient condition for integrability. He then spends one page describing a
function that is discontinuous at every rational number with an even denominator
but which is integrable, thus showing that while continuity is a sufficient condition
for integrability, it is far from necessary. As Darboux demonstrated, there is a lot
to mine from these four pages.
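Given a function f defined on [a, b], we can find a Riemann sum approximation
to the definite integral $\int_a^b f(x)\,dx$ by choosing a partition of the interval,
$a = x_0 < x_1 < \cdots < x_n = b$, and a tag $x_j^*$ in each subinterval $[x_{j-1}, x_j]$,
and then forming the sum
$$\sum_{j=1}^{n} f(x_j^*)(x_j - x_{j-1}).$$
The function f is Riemann integrable over [a, b] with value V if, given any ε > 0,
there is a response δ > 0 so that every Riemann sum over a partition with intervals
of length less than δ satisfies
$$\Bigl| \sum_{j=1}^{n} f(x_j^*)(x_j - x_{j-1}) - V \Bigr| < \epsilon.$$
Using the Cauchy criterion for convergence, the value V will exist if given any
ε > 0, there is a response δ > 0 so that any two Riemann sums with intervals of
length less than δ will differ by less than ε. The value of the integral is denoted
by
$$V = \int_a^b f(x)\,dx.$$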
The greatest difficulty with this definition is handling the variability in the tagged
values of f, since the tag $x_j^*$ can be any value in the interval $[x_{j-1}, x_j]$. Darboux saw
that the way to do this is to work with the least upper bound² (or supremum) and the greatest
lower bound (or infimum) of the set $\{f(x) \mid x_{j-1} \le x \le x_j\}$, which we denote by $M_j$
and $m_j$, respectively.
Every Riemann sum for this partition lies between the upper and lower Darboux
sums (see top of next page). While it may not be possible to find a Riemann sum
that actually equals the upper or the lower Darboux sum, we can find Riemann
sums for this partition that come arbitrarily close to the Darboux sums.
The function f is Riemann integrable if and only if we can force all Riemann
sums to be within ε of our specified value $V = \int_a^b f(x)\,dx$ simply by restricting our
partitions to those with interval length less than an appropriately chosen response
δ. This will happen if and only if the upper and lower Darboux sums for these
partitions are within ε of the specified value V. It follows that f is Riemann
integrable if and only if we can make the difference between the upper and lower
Darboux sums,
$$\sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}),$$
as small as we wish by controlling the length of the intervals in the partition.
In order to guarantee that this sum is less than ε, we need some control on the
size of $M_j - m_j$, the difference between the least upper bound and the greatest lower
bound of f on $[x_{j-1}, x_j]$, which is called the oscillation of the function over that interval.
² Actually, Darboux at this time did not make a clear distinction between the supremum and maximum of a set.
This implies that every continuous function is integrable (see Exercise 2.1.9).
What about a discontinuous function? If f is discontinuous, then there will be
intervals that include the points of discontinuity where the oscillation cannot be
made as small as we wish. If our function is integrable, our partition is fine enough
that the upper and lower Darboux sums differ by less than ε, and the partition includes
intervals where the oscillation is greater than or equal to σ, then the sum of the
lengths of these intervals must be less than ε/σ. If $\sum_1$ denotes the sum over the
intervals on which the oscillation is at least σ, then
$$\sigma \sum\nolimits_1 (x_j - x_{j-1}) \le \sum\nolimits_1 (M_j - m_j)(x_j - x_{j-1}) \le \sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}) < \epsilon.$$
If we choose a smaller bound for the difference between the upper and lower
Darboux sums, then we get an even smaller bound on the sum of the lengths of the
intervals on which the oscillation was at least σ. Since we can force the difference
between the upper and lower Darboux sums to be as small as we wish, we can
also force the sum of the lengths of the intervals on which the oscillation exceeds
σ to be as small as we wish, just by controlling the lengths of the intervals in the
partition.
Riemann realized that this also works the other way. If for every σ > 0, we can
force the sum of the lengths of the intervals on which the oscillation exceeds σ to
be as small as we wish by restricting the lengths of the intervals in the partition,
then we can force the upper and lower Darboux sums to be within any specified
ε of each other. We define D to be the difference between the least upper bound
and the greatest lower bound of $\{f(x) \mid a \le x \le b\}$, so that $M_j - m_j \le D$ for all
j. We let σ = ε/(2(b − a)) and choose a limit on the partition intervals so that those
on which the oscillation exceeds σ have total length less than ε/(2D). We split the
difference in Darboux sums into $\sum_1$ over those intervals where the oscillation is
at least σ and $\sum_2$ over the intervals where the oscillation is strictly less than σ:
$$\sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}) = \sum\nolimits_1 (M_j - m_j)(x_j - x_{j-1}) + \sum\nolimits_2 (M_j - m_j)(x_j - x_{j-1}) < D \cdot \frac{\epsilon}{2D} + \sigma (b - a) = \epsilon. \qquad (2.3)$$
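The upper Darboux integral is defined as the infimum of the upper Darboux sums over
all partitions P of [a, b],
$$\overline{\int_a^b} f(x)\,dx = \inf_P \overline{S}(P; f).$$
Similarly, the lower Darboux integral is defined by
$$\underline{\int_a^b} f(x)\,dx = \sup_P \underline{S}(P; f).$$
Proof. We take the easy direction first. We leave it as Exercise 2.1.11 to prove that
$$\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx.$$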
Figure 2.1. Solid vertical bars mark the points of partition P. Dotted vertical bars mark the points of partition
P3. The partition Q consists of all vertical bars, solid or dotted.
If f is Riemann integrable, then we can find a partition for which $\overline{S}(P; f) - \underline{S}(P; f)$
is less than any specified positive value. It follows that the absolute value of the
difference between the upper and lower Darboux integrals is also less than any
specified positive value, which can only be true if the difference is 0.
In the other direction, if the Darboux integrals are equal, then this common value
is our candidate for V, the value of the Riemann integral. Given any ε > 0, we can
find an upper Darboux sum $\overline{S}(P_1; f)$ and a lower Darboux sum $\underline{S}(P_2; f)$ that are
each less than ε/2 away from V. If we let $P_3$ denote the common refinement of $P_1$
and $P_2$, $P_3 = P_1 \cup P_2$, then
$$V - \epsilon/2 < \underline{S}(P_2; f) \le \underline{S}(P_3; f) \le \overline{S}(P_3; f) \le \overline{S}(P_1; f) < V + \epsilon/2.$$
2mD 2
Since Q is a refinement of P3, we get an upper bound on the upper Darboux sum
for P,
By a similar argument,
$$\underline{S}(P; f) > \underline{S}(Q; f) - \epsilon/2.$$
Every Riemann sum for P is within ε of V.
Corollary 2.3 (One Partition Suffices). Let f be a bounded function on [a, b].
This function is Riemann integrable over this interval if and only if for each ε > 0
there is a partition P for which $\overline{S}(P; f) - \underline{S}(P; f) < \epsilon$.
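The one-partition criterion can be watched numerically. The following Python sketch is only an illustration under stated assumptions: the helper name darboux_sums, the uniform partition, and the test function f(x) = x² on [0, 1] are choices made here, not taken from the text, and the function is assumed to be increasing so that the supremum and infimum on each subinterval sit at the endpoints.

```python
def darboux_sums(f, a, b, n):
    """Upper and lower Darboux sums of an INCREASING function f over the
    uniform partition of [a, b] into n subintervals (illustrative helper)."""
    width = (b - a) / n
    points = [a + j * width for j in range(n + 1)]
    # For an increasing function the infimum on [x_{j-1}, x_j] is f(x_{j-1})
    # and the supremum is f(x_j).
    lower = sum(f(points[j]) * width for j in range(n))
    upper = sum(f(points[j + 1]) * width for j in range(n))
    return upper, lower


def square(x):
    return x * x


for n in (10, 100, 1000):
    upper, lower = darboux_sums(square, 0.0, 1.0, n)
    # The difference equals (f(1) - f(0)) * (1 - 0) / n = 1/n for this example.
    print(n, upper - lower)
```

For this increasing example the difference between the two sums is (f(b) − f(a))(b − a)/n, so a single sufficiently fine partition brings it below any prescribed ε.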
Improper Integrals
One of the drawbacks of Riemann's definition of the integral is that it only applies
to bounded functions on finite intervals, an issue that clearly was of concern to
Riemann, for immediately after giving his definition, he explains how to deal with
integrals of unbounded functions. Today we refer to these as improper integrals.
Strictly speaking, the Riemann integral does not exist in this case. However,
there may be a value that can be assigned to such an integral by taking a limit of
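integrals that do exist as Riemann integrals. For integrals of unbounded functions such as
$$\int_{-1}^{1} \frac{dx}{|x|^{1/2}},$$
we evaluate the integral on intervals for which the function is bounded and then
take the limit of these values as the endpoints approach the point at which we have
a vertical asymptote:
$$\begin{aligned}
\int_{-1}^{1} \frac{dx}{|x|^{1/2}} &= \lim_{\epsilon_1 \to 0^-} \int_{-1}^{\epsilon_1} \frac{dx}{|x|^{1/2}} + \lim_{\epsilon_2 \to 0^+} \int_{\epsilon_2}^{1} \frac{dx}{|x|^{1/2}} \\
&= \lim_{\epsilon_1 \to 0^-} \Bigl( -2|x|^{1/2} \Bigr) \Big|_{-1}^{\epsilon_1} + \lim_{\epsilon_2 \to 0^+} \Bigl( 2|x|^{1/2} \Bigr) \Big|_{\epsilon_2}^{1} \\
&= \lim_{\epsilon_1 \to 0^-} \bigl( -2|\epsilon_1|^{1/2} + 2 \bigr) + \lim_{\epsilon_2 \to 0^+} \bigl( 2 - 2|\epsilon_2|^{1/2} \bigr) \\
&= 4.
\end{aligned}$$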
As Riemann went to great pains to point out, the existence of an antiderivative is
no guarantee that the improper integral exists. When there is more than one limit,
they must be taken independently. For example, the antiderivative of 1/x is ln |x|
and ln |1| − ln |−1| = 0 − 0 = 0, but
$$\int_{-1}^{1} \frac{dx}{x} = \lim_{\epsilon_1 \to 0^-} \int_{-1}^{\epsilon_1} \frac{dx}{x} + \lim_{\epsilon_2 \to 0^+} \int_{\epsilon_2}^{1} \frac{dx}{x} = \lim_{\epsilon_1 \to 0^-} \ln |\epsilon_1| - \lim_{\epsilon_2 \to 0^+} \ln |\epsilon_2|.$$
Since neither limit is finite, this function is not integrable over [−1, 1].
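The need to take the two limits independently can be seen in a small Python sketch. It is an illustration only: the function names are hypothetical, the two cut-offs are passed as positive distances from 0 (so eps1 here plays the role of |ε₁|), and the values come from the antiderivatives quoted above rather than from numerical quadrature.

```python
import math


def integral_inv_sqrt_abs(eps1, eps2):
    """Integral of |x|**(-1/2) over [-1, -eps1] plus [eps2, 1], using the
    antiderivatives -2*sqrt(-x) for x < 0 and 2*sqrt(x) for x > 0."""
    left = 2.0 - 2.0 * math.sqrt(eps1)
    right = 2.0 - 2.0 * math.sqrt(eps2)
    return left + right


def integral_inv_x(eps1, eps2):
    """Integral of 1/x over [-1, -eps1] plus [eps2, 1], using ln|x|."""
    left = math.log(eps1)      # ln|-eps1| - ln|-1|
    right = -math.log(eps2)    # ln|1| - ln|eps2|
    return left + right


for k in (2, 4, 8):
    eps = 10.0 ** (-k)
    print(
        integral_inv_sqrt_abs(eps, eps ** 2),  # tends to 4 however the cut-offs are tied
        integral_inv_x(eps, eps),              # symmetric cut-offs cancel exactly
        integral_inv_x(eps, eps ** 2),         # tied differently, the value grows without bound
    )
```

The first column settles near 4 regardless of how the two cut-offs are related, while the value for 1/x depends entirely on that relation, which is why no value can be assigned to the second integral.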
Exercises
2.1.1. Explain why if P and Q are partitions of the same interval and Q is a
refinement of P, Q ⊇ P, and if f is any bounded function on this interval, then
$$\underline{S}(P; f) \le \underline{S}(Q; f) \le \overline{S}(Q; f) \le \overline{S}(P; f).$$
2.1.2. Consider the function f defined by
$$f(x) = \begin{cases} 1, & x = 0, \\ x, & 0 < x < 1, \\ 0, & x = 1. \end{cases}$$
Let P be the partition (0, 1/4, 1/2, 3/4, 1). Find the upper and lower Darboux
sums, $\overline{S}(P; f)$ and $\underline{S}(P; f)$.
2.1.3. Using the function f defined in Exercise 2.1.2 and given ε = 1/2, find a
response δ so that for any partition P into intervals of length less than δ, the
difference between $\overline{S}(P; f)$ and $\underline{S}(P; f)$ will be less than 1/2.
2.1.4. Using the function f defined in Exercise 2.1.2 over the interval 1/2 ≤ x ≤ 1,
explain why no Riemann sum can equal the upper Darboux sum no matter what
partition we choose.
2.1.5. Consider the function
1
O<x<1,
2
where ⌊a⌋ denotes the greatest integer less than or equal to a. Show that this series
converges for all x ∈ [0, 1], that it is monotonically increasing, and that g(0) = 0,
g(1) = 1. Find all points at which g is discontinuous and at these points find the
difference between the limit from the left and the limit from the right.
2.1.6. Using the function g defined in Exercise 2.1.5, show that it is Riemann
integrable over [0, 1].
2.1.7. Using the function g defined in Exercise 2.1.5, find the value of $\int_0^1 g(x)\,dx$.
Show the work that leads to your conclusion.
2.1.8. Prove that f is continuous at c if and only if given any ε > 0 there is a
response δ > 0 for which the oscillation of f over (c − δ, c + δ) is less than ε.
2.1.9. Using the fact that a continuous function on a closed and bounded interval
is uniformly continuous on that interval, prove that if f is continuous on [a, b],
then f is Riemann integrable over [a, b].
2.1.10. Find the upper and lower Darboux integrals of the characteristic function
of the rationals (Example 1.1 on page 3) over the interval [0, 1].
2.1.11. Prove that if f is bounded on [a, b], then
$$\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx.$$
2.1.12. Consider the function
$$h(x) = \begin{cases} x, & x \in [0, 1] \cap \mathbb{Q}, \\ 0, & x \in [0, 1] - \mathbb{Q}. \end{cases}$$
Find the upper and lower Darboux integrals of h over [0, 1].
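2.1.14. Consider the function
$$m(x) = \begin{cases} 1, & x = 0, \\ 1/q, & x = p/q \in \mathbb{Q},\ \gcd(p, q) = 1,\ q \ge 1, \\ 0, & x \notin \mathbb{Q}. \end{cases}$$
Show that m is integrable over [0, 1].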
$$\int_0^1 \left( \left\lfloor \frac{a}{x} \right\rfloor - a \left\lfloor \frac{1}{x} \right\rfloor \right) dx$$
exists and has value a ln a.
2.1.19. Prove that if a function is bounded and Cauchy integrable over [a, b], then
it is also Riemann integrable over that interval.
Figure 2.2. Graph of y = ((x)).
Since $|((nx))| < 1/2$, this series converges for all x. It has a discontinuity whenever
nx is half of an odd integer, and that will happen for every x that is a rational
number with an even denominator (see Figure 2.3).
Specifically, if x = a/2b, where a is odd and a and b are relatively prime, and
if n is an odd multiple of b, then
$$\lim_{y \to x^+} ((ny)) = -1/2 \quad \text{and} \quad \lim_{y \to x^-} ((ny)) = 1/2.$$
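Summing over the odd multiples n = kb of b, the one-sided limits of
$f(x) = \sum_{n=1}^{\infty} ((nx))/n^2$ at x = a/2b are
$$\lim_{y \to x^+} f(y) = \sum_{n=1}^{\infty} \lim_{y \to x^+} \frac{((ny))}{n^2} = f(x) - \frac{1}{2} \sum_{k\ \mathrm{odd}} \frac{1}{(kb)^2} = f(x) - \frac{\pi^2}{16 b^2}, \qquad (2.7)$$
$$\lim_{y \to x^-} f(y) = \sum_{n=1}^{\infty} \lim_{y \to x^-} \frac{((ny))}{n^2} = f(x) + \frac{1}{2} \sum_{k\ \mathrm{odd}} \frac{1}{(kb)^2} = f(x) + \frac{\pi^2}{16 b^2}. \qquad (2.8)$$
The first equality in each line assumes that we can interchange limits, that is,
$$\lim_{y \to x^{\pm}} \sum_{n=1}^{\infty} \frac{((ny))}{n^2} = \sum_{n=1}^{\infty} \lim_{y \to x^{\pm}} \frac{((ny))}{n^2}. \qquad (2.9)$$
The justification of this interchange rests on the uniform convergence of our series
over the set of all x and is left as Exercise 2.2.1.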
Our function f has a discontinuity at every rational number with an even
denominator, but it is integrable. Given any σ > 0, there are only finitely many rational
numbers between 0 and 1 at which the variation is larger than σ. If the variation is
larger than σ at x = a/2b, then b must satisfy
$$\frac{\pi^2}{8 b^2} > \sigma.$$
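The size of the jump can be checked numerically. The sketch below assumes that the series under discussion is Riemann's $f(x) = \sum_{n \ge 1} ((nx))/n^2$, with ((y)) equal to y minus the nearest integer and equal to 0 when y is half of an odd integer; the function names, the truncation at 2000 terms, and the step h are illustrative choices. At x = 1/2 (a = 1, b = 1) the difference between the values just to the left and just to the right should be close to π²/8.

```python
import math


def sawtooth(y):
    """((y)): y minus the nearest integer, taken to be 0 at half-odd integers."""
    frac = y - math.floor(y)
    if frac == 0.5:
        return 0.0
    return frac if frac < 0.5 else frac - 1.0


def riemann_partial(x, terms):
    """Partial sum of ((n x)) / n**2 for n = 1, ..., terms."""
    return sum(sawtooth(n * x) / n ** 2 for n in range(1, terms + 1))


def jump_at(x, h, terms):
    """Value just left of x minus value just right of x, for the partial sum.
    h must be small enough that terms * h is well below 1/2."""
    return riemann_partial(x - h, terms) - riemann_partial(x + h, terms)


print(jump_at(0.5, 1e-7, 2000), math.pi ** 2 / 8)
```

Because only a partial sum is used, the printed jump falls slightly short of π²/8; the shortfall is the tail of the sum of 1/n² over odd n beyond the truncation point.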
Darboux's Observation
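Darboux observed that if f is integrable over an open interval containing a, if we
define a new function F by
$$F(x) = \int_a^x f(t)\,dt,$$
and if $\lim_{x \to a^+} f(x)$ exists, then
$$\lim_{h \to 0^+} \frac{F(a + h) - F(a)}{h} = \lim_{x \to a^+} f(x),$$
with the corresponding statement for limits from the left.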
This follows immediately from the mean value theorem of integral calculus:
$$F(a + h) - F(a) = \int_a^{a+h} f(t)\,dt = h \cdot f(c) \qquad (2.10)$$
for some c strictly between a and a + h, valid for any h ≠ 0.
Therefore, if $\lim_{x \to a^-} f(x) \ne \lim_{x \to a^+} f(x)$, then F cannot be differentiable at
a. On the other hand, F(a + h) − F(a) can be made arbitrarily small simply by
limiting the size of h, and therefore F is continuous at every point. The antiderivative
of Riemann's function is continuous at every point but not differentiable at
rational values with even denominators.
This directly contradicts assertions made by Ampère and by Duhamel that
continuity guarantees differentiability, at least at all but a sparse set of values. Our
question #4, "What is the relationship between continuity and differentiability?"
was now wide open.
Darboux went beyond this to find a continuous function that is not differentiable
at any value of x.
Quite a bit more work. Darboux's original justification, published in 1875, had several flaws. He published an
addendum in 1879 in which he corrected the justification of his original example and gave a simpler example,
$\sum_{n=1}^{\infty} \cos(n!\,x)/n!$ (see Exercises 2.2.4–2.2.9).
Example 2.3. Consider the function defined by the uniformly convergent series
$$f(x) = \sum_{n=0}^{\infty} b^n \cos(a^n \pi x),$$
where 0 < b < 1 and a is an odd integer for which ab > 1 + 3π/2 (for example,
b = 2/3, a = 9 can be used).⁴
Weierstrass had publicly presented this example to the Berlin Academy in 1872,
but it had not appeared in print.
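A numerical experiment suggests why such a function fails to be differentiable. The sketch below assumes the series has the classical Weierstrass form $\sum_{n \ge 0} b^n \cos(a^n \pi x)$ with b = 2/3 and a = 9; the function names and the truncation at 12 terms are illustrative choices. It evaluates difference quotients at x = 0 with h = 9^(−m). For n ≥ m the argument $a^n \pi h$ is an odd multiple of π, so every such term contributes −2bⁿ to the numerator and the quotients grow in size roughly like 6^m.

```python
import math


def weierstrass_partial(x, a=9, b=2.0 / 3.0, terms=12):
    """Partial sum of b**n * cos(a**n * pi * x) for n = 0, ..., terms - 1."""
    return sum(b ** n * math.cos(a ** n * math.pi * x) for n in range(terms))


def difference_quotient_at_zero(m, a=9, b=2.0 / 3.0, terms=12):
    """(W(h) - W(0)) / h for the partial sum W, with h = a**(-m)."""
    h = float(a) ** (-m)
    rise = weierstrass_partial(h, a, b, terms) - weierstrass_partial(0.0, a, b, terms)
    return rise / h


for m in range(1, 5):
    print(m, difference_quotient_at_zero(m))
```

A partial sum is itself differentiable, so this only illustrates the trend: as m increases the quotients become more and more negative instead of settling toward a limit.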
At the same time, Weierstrass produced an example, valid for any bounded,
countably infinite set S, of an increasing, continuous function that is not
differentiable at any point of S. The set of rational numbers in [0, 1] is an example of a
countable set. The set of all algebraic numbers in [0, 1], all roots of polynomials
with rational coefficients, is also countable, as we shall see in the next chapter.
Example 2.4. Given our favorite bounded, countably infinite sequence,
$(a_1, a_2, a_3, \ldots)$, we define the function
x I ln(x2)\
h(0)=0.
2 )'
We choose any k strictly between 0 and 1. The Weierstrass function is given by
w(x) = —ar).
For an explanation, see A Radical Approach to Real Analysis, 2nd ed., pp. 259—262.
Summary
To summarize the situation with regard to question #4 as it stood in 1875:
• If f is differentiable at a, then it is also continuous at a. Any function that is
differentiable at every point in an interval is also continuous over that interval.
• There are functions that are continuous at every point in an interval but differen-
tiable at none of the points in that interval.
• For any countable set of points, we can find an increasing, continuous function
that is not differentiable at any point in the set.
What was not known was whether or not it is possible to construct an increasing,
continuous function that is not differentiable at any point in the interval. It would
take 30 years to find the answer to this question.
With regard to question #3, the fundamental theorem of calculus, we have seen
that we can find an integrable function for which
$$\frac{d}{dx} \int_a^x f(t)\,dt$$
does not exist for values of x that are rational numbers with even denominators.
Questions that remained open included
• Could $\frac{d}{dx} \int_a^x f(t)\,dt$ exist but not equal f(x)?
• Could f be integrable but $\frac{d}{dx} \int_a^x f(t)\,dt$ fail to exist at every point?
Exercises
2.2.1. Prove that $f(x) = \sum_{n=1}^{\infty} ((nx))/n^2$ converges uniformly. Prove that the
interchange of limits in Equation (2.9) is allowed.
2.2.2. Use the mean value theorem to prove that if f is continuous on [a, b]
and differentiable with a nonnegative derivative at all points of (a, b) except for
c ∈ (a, b) where it is not differentiable, then f is a monotonically increasing
function over [a, b].
2.2.3. For the function h defined in Example 2.4, find the supremum and infimum
of $\{h'(x) \mid 0 < x < 1\}$. Show that
cos(n! x)
Jr(x) =
does not exist at any x, and therefore Jr is nowhere differentiable. We fix an E > 0
and, for each N e N, define h = E/N!. The variable h depends on N.
and therefore
N-i N-i
h E n! E
(2.13)
cos(n!(x+h)) — °° cos(n!x)
n! — n!
n—N+i n—N+i
00
—
cos(n!(x+h))—cos(n!x)
n=N+i
n!h
(N + 1)(N +2)
+
1
+...
— (N + 1)(N + 2)(N +3)
4
(2.14)
— EN
2.2.8. Using Equations (2.12)—(2.14), we see that
'N—i
Jr(x + h) — Jr(x) — cos(N! (x + h)) — cos(N! x)
h — —
sin(n! x) + N!h
n=1 /
+ E(E, N)
cos(N!x +E))—cos(N!x)
=— +
E
is independent of E
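2.2.9. Show that
$$\frac{\cos(N!\,x + 2\epsilon) - \cos(N!\,x)}{2\epsilon} - \frac{\cos(N!\,x + \epsilon) - \cos(N!\,x)}{\epsilon} = \frac{\cos(\epsilon) - 1}{\epsilon} \cos(N!\,x + \epsilon). \qquad (2.17)$$
$$\lim_{N \to \infty} \cos(N!\,x + \epsilon) = 0$$
regardless of the value of ε > 0, and this is not true for any x.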
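Before proving the identity in Exercise 2.2.9, it can be reassuring to test it numerically. The Python sketch below assumes the identity has the form "difference quotient with step 2ε minus difference quotient with step ε equals ((cos ε − 1)/ε) cos(N! x + ε)"; the function names and sample values are illustrative, and agreement at a few points is of course no substitute for the proof.

```python
import math


def left_side(N, x, eps):
    """Difference quotient with step 2*eps minus the one with step eps."""
    A = math.factorial(N) * x
    first = (math.cos(A + 2 * eps) - math.cos(A)) / (2 * eps)
    second = (math.cos(A + eps) - math.cos(A)) / eps
    return first - second


def right_side(N, x, eps):
    """((cos(eps) - 1) / eps) * cos(N! x + eps)."""
    A = math.factorial(N) * x
    return (math.cos(eps) - 1) / eps * math.cos(A + eps)


for N in range(1, 9):
    print(N, left_side(N, 0.37, 0.5), right_side(N, 0.37, 0.5))
```

The agreement reflects the sum-to-product formula cos(A + 2ε) + cos(A) = 2 cos(A + ε) cos(ε), which is one possible route to the proof.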
Riemann turned this around by starting with arbitrary trigonometric series in the
form of (2.18) and asking what properties such a function must possess. Does a
trigonometric series have to be integrable? If it is integrable, then we can calculate
its Fourier coefficients. Is this Fourier series always identical to the series with
which we started? One outcome of this line of reasoning was the question whether
two distinct trigonometric series could converge to the same function. If this were
possible, then the difference between these series would be a trigonometric series
with some nonzero coefficients that converges to 0. It would have been very sur-
prising if someone had exhibited such a series, but no one could prove that it does
not exist.
Uniqueness is easy to prove if the trigonometric series converges uniformly. As
Weierstrass had shown in his Berlin lectures of the 1860s, term-by-term integration
is valid for uniformly convergent series. Since we begin with a trigonometric series,
Fourier's heuristic argument given in Equation (1.4) on page 2 actually proves that
the series is unique. The problem is that if a series of continuous functions converges
uniformly, then it converges to a continuous function. The most interesting Fourier
series of the time converged to discontinuous functions and thus could not be
uniformly convergent.
Dirichlet had been able to show that if we start with a continuous function
and form the trigonometric series given in Equation (2.18), then that series is
uniformly convergent. It follows from his analysis that if we work with a piecewise
continuous function, a function that is continuous at all but finitely many points,
then its Fourier series is uniformly convergent on any closed interval that does
not contain a point of discontinuity. This led Heine to describe a condition that
is almost as good as uniform convergence, uniform convergence in general. A
series with finitely many exceptional points that is uniformly convergent on any
closed interval that does not contain one of these points is uniformly convergent in
general.
Heine succeeded in proving that if a trigonometric series is uniformly convergent
in general and converges to the function identically equal to zero, then all of the
coefficients must be zero. That is to say, among the set of trigonometric series
that are uniformly convergent in general, no two distinct series converge to the
same function. It was Heine who convinced Cantor to take up the question of what
happens when the convergence is not uniform in general.
In his 1870 paper, Cantor drew on Riemann's methods to get around the need
for uniform convergence on any interval. He proved that if a trigonometric series
converges to 0 at all x, then all coefficients of the series must be 0. By 1871,
Cantor realized that his proof would work if it is known that the trigonometric
series converges to 0 at all but at most finitely many points. He began working on
the problem of an infinite number of exceptional points. For what infinite sets S
can we conclude that if a trigonometric series converges to 0 at all points not in
S, then all of the coefficients must be 0? Cantor was on his way to inventing set
theory.
Hankel's Innovations
Early in 1871, Cantor reviewed Hankel's 1870 paper Untersuchungen über die
unendlich oft oszillierenden und unstetigen Funktionen (Investigations on infinitely
often oscillating and discontinuous functions). It spurred his thinking about infinite
sets of discontinuities.
Hermann Hankel (1839–1873) took classes with Riemann at Göttingen and
Weierstrass in Berlin before earning his doctorate in 1862 at the University of
Leipzig where he then taught. His 1870 paper came shortly after his move to
Tübingen. In it, he attempted to clarify Riemann's necessary and sufficient
conditions for integrability.
We have considered the oscillation of a function over an interval. It is defined as
the difference between the least upper bound of the values of the function and the
greatest lower bound of those values. Hankel focused this onto a single point. We
consider all open intervals that contain that point and look at the oscillation over
each of these intervals. As the intervals become smaller, the oscillation can only
decrease. The oscillation at a point x is the greatest lower bound over all open
intervals that contain x of the oscillation over the interval.⁵
Definition: Oscillation
Given a function f and an interval I, the oscillation of f over I is
$$\omega(f; I) = \sup\{f(x) \mid x \in I\} - \inf\{f(x) \mid x \in I\}.$$
The oscillation of f at the point c is
$$\omega(f; c) = \inf_{I \in \mathcal{I}} \omega(f; I),$$
where $\mathcal{I}$ is the set of open intervals containing c. If
$\liminf_{x \to c} f(x) \le f(c) \le \limsup_{x \to c} f(x)$, then this is equivalent to
$$\omega(f; c) = \limsup_{x \to c} f(x) - \liminf_{x \to c} f(x).$$
The following proposition follows immediately from the second definition of
oscillation at a point. The equivalence of these definitions is left as Exercise 2.3.14.
With this notion, Riemann's criterion for integrability can now be stated in terms
of $S_\sigma$, the set of points with oscillation at least σ. A function f is integrable over
the interval [a, b] if and only if for each σ > 0, we can put the points of $S_\sigma \cap [a, b]$
inside a finite union of intervals, intervals that can be chosen so that the sum of
their lengths is less than any predetermined positive amount.
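The oscillation at a point can be explored numerically by sampling a function on shrinking intervals around that point. The Python sketch below is a rough illustration: the helper names and the step function are assumptions introduced here, and sampling can only underestimate the true supremum minus infimum, so it suggests rather than establishes the value of the oscillation.

```python
def sampled_oscillation(f, left, right, samples=1001):
    """Approximate sup f - inf f over [left, right] from equally spaced samples."""
    step_size = (right - left) / (samples - 1)
    values = [f(left + i * step_size) for i in range(samples)]
    return max(values) - min(values)


def oscillation_at(f, c, radii=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Sampled oscillation over the shrinking intervals (c - r, c + r)."""
    return [sampled_oscillation(f, c - r, c + r) for r in radii]


def step(x):
    """A function with a single jump of size 1 at x = 0."""
    return 0.0 if x < 0 else 1.0


print(oscillation_at(step, 0.0))  # stays at the size of the jump
print(oscillation_at(step, 0.5))  # vanishes at a point of continuity
```

For this step function the set of points with oscillation at least σ is the single point 0 whenever 0 < σ ≤ 1, and it is empty for σ > 1.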
It would take many years before the terminology was fixed, but we see here the
beginning of the idea of the outer content of a set of points (see definition at top
of next page), denoted $c_e$, e for "exterior."
Any finite set of points has outer content zero. This is because given any ε > 0,
we can put a small interval around each point so that the sum of the lengths of the
intervals is less than ε.
If we consider the set {1, 1/2, 1/3, 1/4, ...}, it also has outer content zero. Given
any ε > 0, we put an interval of length ε/2 around 0. That contains all but finitely
many points from this set. The remaining points, because they are finite in number, can be put
inside a union of intervals whose lengths add to less than ε/2. On the other hand,
the set of rational numbers between 0 and 1, Q ∩ [0, 1], has outer content 1.
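The covering argument for {1, 1/2, 1/3, ...} can be carried out explicitly. The following Python sketch builds one finite cover along the lines just described; the function names and the particular way the length ε/4 is shared among the leftover points are choices made here for illustration, and ε is assumed to be at most 4.

```python
import math


def cover_reciprocals(eps):
    """Finite list of intervals covering {1, 1/2, 1/3, ...} with total length
    below eps (illustrative construction, assuming 0 < eps <= 4)."""
    # One interval of length eps/2 around 0 contains every 1/n smaller than eps/4.
    cover = [(-eps / 4, eps / 4)]
    # Only the points 1/n with n <= 4/eps can lie outside that interval.
    last = math.ceil(4 / eps)
    # Give each of those points its own interval; together they have length eps/4.
    radius = eps / (8 * last)
    cover.extend((1 / n - radius, 1 / n + radius) for n in range(1, last + 1))
    return cover


def total_length(cover):
    return sum(right - left for left, right in cover)


for eps in (0.5, 0.1, 0.01):
    cover = cover_reciprocals(eps)
    print(eps, len(cover), total_length(cover))
```

No such construction is available for Q ∩ [0, 1]: because the rationals are dense, any finite collection of intervals that covers them must cover all of [0, 1] except possibly finitely many points, so its total length is at least 1.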
⁵ Hankel actually defined a different but related concept he called the "jump" of f at c, the largest σ such that
inside any interval containing c there is a point x for which |f(x) − f(c)| > σ.
Although Hankel did not have the terminology of outer content, he did grasp the
idea and turned it into a characterization of when a function is Riemann integrable.
Recast into the language of outer content, Hankel's insight is summarized in the
following theorem.
The first to explicitly use this measure of the size of a set was Otto Stolz
(1842–1905) in 1881. The term "content" (Inhalt) is due to Cantor in 1884. The distinction
between inner and outer content would be made by Giuseppe Peano in 1887 (see
Section 5.1). The concept would be popularized in Jordan's Cours d'analyse of
1893–1896, because of which it is sometimes referred to as Jordan content or Jordan
measure when the inner and outer content of a set are the same.
The outer content is the same whether we use open or closed intervals in our
finite cover (see Exercise 2.3.15). Because the distinction was not yet recognized
as important, mathematicians of this period usually referred simply to intervals
without distinguishing whether they were open or closed.
Theorem 2.5 is a profound and very useful result. Hankel did not have the
terminology to state this result as we have here, but he did understand it fully.
Unfortunately, Hankel used it to reach a faulty conclusion about when a discontin-
uous function is Riemann integrable. He was led astray because of the paucity of
examples of highly discontinuous functions.
Definition: Dense
A set S is dense in the interval I if every open subinterval of I contains at least
one point of S.
Hankel's Error
Hankel's argument for his assertion that every pointwise discontinuous function
is Riemann integrable is not unreasonable. We choose an arbitrary σ > 0. If a
function is continuous at one value, then we can find an interval around this value
on which the oscillation is less than σ. If the set of points of continuity is dense,
then we have succeeded in putting each element of this dense set inside an interval
that contains no points of $S_\sigma$.
Those points with oscillation larger than σ constitute a very thin set. Between
any two points of this set there must be an entire open interval of points not in
the set. Hankel believed that such a set must have outer content 0 and therefore
that the function must be Riemann integrable. This belief was reinforced by the fact that all of the
examples of pointwise discontinuous functions that Hankel knew, examples such
as Riemann's function, were integrable.
Hankel's fallacy, and he was not the only prominent mathematician to fall into
it, was to assume that such a thin set cannot have positive outer content. Thomas
Hawkins has presented evidence that between 1870 and 1875 this was the case for
Hankel, for Axel Harnack, and for Paul du Bois-Reymond. But in 1878, when Ulisse
Dini published his book on the theory of functions of real variables, Fondamenti per
la teorica delle funzioni di variabili reali, Dini expressed doubt in the validity of
Hankel's claim. As we shall see in Chapter 4, finding the flaw in Hankel's reasoning
would greatly advance our understanding of the structure of the real numbers, as it
also revealed problems with Riemann's definition of the integral.
unbounded sets, we count the number of times that we need to take the derived set
in order to get to the empty set. A set with no accumulation points is considered to
be type 0. The set {1, 1/2, 1/3, 1/4, .. .} is type 1 because its derived set is {0} and
the derived set of {0} is the empty set.
If a derived set is infinite, then we can consider the derived set of its derived set.
For example, starting with the set
$$\Bigl\{ \frac{1}{m} + \frac{1}{n} \Bigm| m, n \in \mathbb{N} \Bigr\},$$
its derived set contains {1, 1/2, 1/3, 1/4, ...}, and with a little work (see
Exercise 2.3.11) you can show that the derived set equals {1, 1/2, 1/3, 1/4, ...} ∪ {0}.
show in the next chapter, any infinite set in a bounded interval has a limit point, so
once the limit points have been covered, there can only be finitely many points left.
Any bounded type 2 set has outer content zero. The set of limit points is a
bounded type 1 set and so can be covered by intervals of total length less than ε/2.
The points from the original set that are not covered by these intervals are finite in
number.
We now see that, by induction, any bounded first species set has outer content
zero. If we have proven that a bounded set of type n must have outer content zero,
then so must any bounded set of type n + 1 because its derived set has type n.
Combining this with Hankel's insights, we see that if $S_\sigma$, the set of points at which
the oscillation is greater than or equal to σ, is of first species for all σ > 0, then
the function is Riemann integrable. This appears to give even greater credence to
Hankel's claim that any pointwise discontinuous function is Riemann integrable.
Surely if there is an entire interval of continuity around every point of continuity
and the points of continuity are dense, then what is left over must be first species.
It is to Cantor's credit that he realized that what seemed obvious was not so
clear. He recognized that before any further progress could be made, he needed to
understand the structure of the real number line. Cantor now embarked on a quest
that would profitably engage the remainder of his career and mark him as one of
the great mathematicians.
Exercises
2.3.1. Find the derived set of each of the following sets. Which of these sets are
first species?
1. Q ∩ [0, 1]
2. [0, 1] − Q
3.
4.
5. k=1 2or3j
6. (0, 1) ∪ (3, 4)
2.3.2. Find the outer content of each of the sets in Exercise 2.3.1. Justify each
answer.
2.3.3. Show that if S has outer content zero and T is any bounded set, then
$$c_e(S \cup T) = c_e(T).$$
1. Q
2. [0, 1] − Q
3. The set of rational numbers with denominators that are a power of 2
4. The set of real numbers that have no 2 in their decimal expansion
5. The set of real numbers that have a 2 somewhere in their decimal expansion
6. The set of rational numbers with denominators less than or equal to 1,000
7. The set of rational numbers with denominators that are prime
8. The set of rational numbers with numerators that are prime
2.3.7. Show that the function rn defined in Exercise 2.1.14 is pointwise discontin-
uous.
2.3.8. Consider the function h defined in Exercise 2.1.12. Is it possible to find an
ε > 0 so that h is totally discontinuous in the open interval (−ε, ε)? Explain why
or why not.
2.3.9. Consider the function g in Exercise 2.1.5. Find all values of x ∈ [0, 1] at
which the function is not continuous, and find the oscillation of r at each of
these points. Determine whether this function is totally discontinuous or pointwise
discontinuous and justify your answer.
2.3.10. Prove that for any set S,
S'' ⊆ S',
the derived set of the derived set of S is contained in the derived set of S.
2.3.11. Prove that the derived set of T = {1/n + 1/m | n, m ∈ ℕ} is the set U = {1/n | n ∈
ℕ} ∪ {0}. First show that U ⊆ T'. To prove that T has no other limit points, show
that there are only finitely many points of T in (1/N, 1/(N − 1)) that are not of
the form 1/N + 1/n.
2.3.12. Prove that
11 1 1
is type k. It is clear that its derived set includes T_{k−1}. The key is to show that for all
k the derived set of T_k does not contain any points other than 0 that are not in T_{k−1}.
2.3.13. Consider the set of zeros of the function f_1 defined by f_1(x) = sin(1/x).
What is the type of this set? What is the type of the set of zeros of f_2(x) =
sin(1/f_1(x))? What is the type of the set of zeros of f_3(x) = sin(1/f_2(x))?
Ce
(U = 1.
2.3.17. Prove that if the function f is Riemann integrable on [0, 1], then it is either
continuous or pointwise discontinuous on [0, 1].
3
Explorations of R
3.1 Geometry of R
The real number line is, above all else, a line. While true lines may not exist in the
world of our senses, we do see them at the intersection of flat or apparently flat
surfaces such as the line of the horizon when looking across a sea or prairie. To
imagine the line as infinite is easy, for that is simply imagining the absence of an end.
For the line as a geometric construct, the natural operation is demarcation of
distance. There are two critical properties of distances that become central to the
nature of real numbers:
1. However small a distance we measure, it is always possible to imagine a smaller
distance.
2. Any two distances are commensurate. However small one distance might be
and large the other, one can always use the smaller to mark out the larger.
The second property is known as the Archimedean principle,¹ that given any two
distances, one can always find a finite multiple of the smaller that exceeds the
larger.
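The Archimedean principle can be made concrete with a few lines of arithmetic. The following is only a sketch: the function name archimedean_multiple is an illustrative choice, not notation from the text, and the two distances are assumed to be positive rational numbers so that the computation is exact.

```python
from fractions import Fraction
import math

def archimedean_multiple(small, large):
    """Return the least positive integer n with n * small > large.

    Both arguments are assumed to be positive (ints or Fractions).
    """
    if small <= 0 or large <= 0:
        raise ValueError("both distances must be positive")
    # n * small > large exactly when n > large / small.
    return math.floor(Fraction(large) / Fraction(small)) + 1

# A distance of 1/1000 needs 7001 copies to exceed a distance of 7.
n = archimedean_multiple(Fraction(1, 1000), 7)
```

However small the first distance, the function returns a finite n. An infinitesimal, by contrast, would be a distance for which no such n exists.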
Let us now take our line and mark a point on it, the origin. We conduct a mental
experiment. We stretch the line, doubling distances from the origin. What does the
line now look like? It cannot have gotten any thinner. It did not have any width to
begin with. A point that was a certain distance from the origin is now twice as far,
but the first property tells us that the line itself should look the same. No gaps or
previously unseen structures are going to appear as we stretch it. No matter how
many times we double the length, what we see does not change.
An Infinite Extension
We now kick our mental experiment up a level and imagine stretching the line by
an infinite amount. What happens to our line? There are two reasonable answers,
and a choice must be made. One reasonable answer is that it still looks like a line.
This answer builds on the human expectation that whatever has never changed,
never will change. Every time we have doubled the length, the line has remained
unchanged. Is stretching by an infinite factor so different?
It is different. Go back to our original line and identify one of the points other
than the origin. What happens to that point as we magnify by an infinite factor? No
matter how large our field of vision, that point has moved outside of it. All points
other than the origin have moved outside the field of vision. All that is left is the
point at the origin. Infinite magnification has turned our line into a single point.
I said that we have a choice. We could hold onto our instinctive answer that the
infinitely magnified line is still a line. That choice contradicts the Archimedean
principle, and so it is not the generally accepted route to construction of the real
numbers, but I do want to go down that road a little way.
¹ Also known as the Archimedean axiom or the continuity axiom. It predates Archimedes, appearing as definition
4 of book 5 of Euclid's Elements.
Consider one of the points other than the origin on the infinitely magnified
line. Where was it before the magnification? It certainly was not any measurable
distance away from the origin, otherwise it would have sailed off to infinity when
we magnified the line. It was not on top of the origin because then it would have
stayed put. It must have been off the origin, but less than any measurable distance
away from the origin. Its distance from the origin must have been so small that it is
incommensurate with any measurable distance. No matter how many of these tiny
distances we take, we cannot fill any measurable distance, no matter how small.
The tiny distances are known as infinitesimals. Leibniz used them to explain his
development of the calculus. It is possible to develop analysis using the real number
line with infinitesimals, though the full complexity of infinitesimals is much greater
than what is suggested by this simple thought experiment. This approach to calculus
through infinitesimals is called nonstandard analysis and was developed in the
early 1960s, beginning with the work of Abraham Robinson at Princeton. Initially,
the logical underpinnings required to work with infinitesimals were daunting. Since
then, these foundations have been greatly simplified, and nonstandard analysis has
vocal proponents.
If the real number line includes infinitesimals, then every point on the real line
must be surrounded by a cloud of points that are an infinitesimal distance away. The
infinitely magnified line, if it really is a replica of the original line, must also contain
infinitesimals, and they must be stretched from points that were infinitesimally small
with respect to the original infinitesimals. Thus every infinitesimal is surrounded
by a cloud of points whose distance is infinitesimally infinitesimal, and these by
third-order infinitesimals, and so on.
"And so on" is a wonderful human phrase. We actually can imagine this unimag-
inable construction, or at least imagine enough that we are prepared to accept it
and work with it. There is a real choice to be made. That choice was made in the
early nineteenth century. Looking back, the decision to hold to the Archimedean
principle appears inevitable. As I said earlier, the real number line is, above all else,
a line. We are working with distances, and distances in our everyday experience
are commensurate. Those seeking foundations for analysis preferred to keep to the
simpler, surer, and more intuitive ground of commensurate distances.
Topology of R
The only geometry on a line is the measurement of distance. There is an entire
branch of mathematics that is built on the concept of distance and its generalizations:
topology. Topology gets much more interesting in higher dimensions, but there is
a lot going on in just one dimension.
We begin with some basic definitions. The basic building block of topology is the
ε-neighborhood of a point a. Objects such as neighborhoods are best visualized
in two- or three-dimensional space, but you also need to visualize them as they live
on the real number line.
Definition: ε-Neighborhood
Given ε > 0, the ε-neighborhood of a point a, N_ε(a), is the set of all points whose
distance from a is strictly less than ε:
N_ε(a) = {x | |x − a| < ε}.
On the real number line, this is the open interval (a − ε, a + ε). In the plane ℝ²,
it is a disc centered at a (without the bounding edge), and in ℝ³ it is the solid ball
centered at a (without the bounding surface). It is also often useful to work with
a deleted or punctured neighborhood of a point a, consisting of a neighborhood
with the point a removed, {x | 0 < |x − a| < ε}.
Any open interval is open. Any union of open intervals is open. The empty set is
open. (Since there is no x ∈ S, this statement is true about every x that is in S.) The
entire real line ℝ is open. In two dimensions, the inside of any polygon, without
the boundary, is open.
Consider the function defined by f(x) = x². The inverse image of (1, 4) consists
of all points of ℝ whose square lies strictly between 1 and 4. This is
(−2, −1) ∪ (1, 2). The inverse image of (−1, 4) is (−2, 2). Think about why.
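The two inverse images can be checked numerically on a grid of sample points. This is a sketch rather than a proof: the helper names and the grid are illustrative assumptions, and a finite sample can only suggest the shape of the inverse image.

```python
def square(x):
    """The function f(x) = x**2 from the example."""
    return x * x

def sample_inverse_image(f, lo, hi, xs):
    """Return the sample points x in xs with lo < f(x) < hi."""
    return [x for x in xs if lo < f(x) < hi]

# Sample points -3, -2.75, ..., 2.75, 3 (all exactly representable as floats).
grid = [k / 4 for k in range(-12, 13)]

in_1_4 = sample_inverse_image(square, 1, 4, grid)         # lands in (-2, -1) and (1, 2)
in_minus1_4 = sample_inverse_image(square, -1, 4, grid)   # lands in (-2, 2)
```

No square is negative, so lowering the left endpoint from 1 to −1 admits every x with x² < 4. The gap around the origin closes up.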
Proof We begin with the ε-δ definition of continuity. A function f is continuous
at a if and only if given any ε > 0, there exists a response δ such that |x − a| < δ
implies that |f(x) − f(a)| < ε.
We first translate this statement into the language of ε-neighborhoods. A function
f is continuous at a if and only if given any ε-neighborhood of f(a), there is a
δ-neighborhood of a such that
f(N_δ(a)) ⊆ N_ε(f(a)).
More Definitions
A closed interval is closed. Any finite set of points is closed. Any set of points
that forms a convergent sequence, taken together with its limit, is a closed set.
The empty set and the entire real line ℝ are closed. These are the only two sets
that are both open and closed (see Exercise 3.1.8). Many sets such as (0, 1] and
{1, 1/2, 1/3, 1/4, ...} (without the limiting value 0) are neither open nor closed.
points. If S contains all of its accumulation points, then any point in S^c sits inside
some ε-neighborhood entirely inside S^c (see Exercise 3.1.12). Therefore, S^c is
open and so S is closed. □
It follows that any closed set contains its derived set. If S is closed, then its
elements might or might not be accumulation points. Thus, the set
is closed, but 0 is the only accumulation point of S. On the other hand, if T = [0, 1],
then T is closed and every point in T is an accumulation point of T. This closed
set is equal to its derived set. Any derived set is closed (see Exercise 3.1.21).
It took a long time for the mathematical community to recognize the importance
of open and closed sets. Closed sets, which contain their own derived set, are the
older notion and can be traced back to Cantor in 1884. Mathematicians working
in analysis in the late nineteenth and even early twentieth centuries would fail to
make it clear whether the intervals that they described were to include endpoints
or not. Often it made no difference. Sometimes it was critically important. One of
the most infamous examples was Borel's 1905 Leçons sur les fonctions de variables
réelles (Lectures on real variable functions).² As late as the early twentieth century,
some mathematicians including the Youngs used the term "open set" to mean a
set that is not closed. The current meaning can be traced to Baire in 1899. It was
Lebesgue who popularized the current definition.
The interior of an open set is itself. The interior of the closed interval [a, b]
is the open interval (a, b). The interior of a finite set of points is the empty set.
The closure of a closed set is itself, while the closure of the open interval (a, b)
is [a, b]. The half-open, half-closed interval [−1, 1) has closure [−1, 1], interior
(−1, 1), and a boundary that consists of two points, {−1, 1}. The closure of
{1/n | n ∈ ℕ} is {1/n | n ∈ ℕ} ∪ {0}. Its interior is the empty set. The boundary is
{1/n | n ∈ ℕ} ∪ {0}.
The interior, closure, and boundary can also be described in terms of E-
neighborhoods.
² Renfro speculates that this might have been the fault of Maurice Fréchet, then a young graduate student, who
transcribed these lectures.
Note that a set S is dense (recall the definition on p. 45) in T if, for all x ∈ T,
every ε-neighborhood of x contains infinitely many points of S. The set of rational
numbers is dense in ℝ. We can take a much smaller set and still be dense in ℝ. The
set of rational numbers whose denominators are even (recall Riemann's function,
Example 2.1) is dense. Even if we restrict ourselves to the set of rational numbers
whose denominators are powers of 2, this also is a dense subset of R. If a set S is
dense in R, then every real number is an accumulation point of S. In this case, the
closure of S is the entire real number line.
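The density of the rationals whose denominators are powers of 2 can be seen constructively. The sketch below assumes the target x and the tolerance eps are given as exact rationals. The function name dyadic_within is an illustrative choice, not notation from the text.

```python
from fractions import Fraction
import math

def dyadic_within(x, eps):
    """Return k / 2**n with 0 <= x - k / 2**n < eps, for a rational x and eps > 0."""
    n = 0
    while Fraction(1, 2 ** n) >= eps:   # find n with 1 / 2**n < eps
        n += 1
    k = math.floor(Fraction(x) * 2 ** n)
    return Fraction(k, 2 ** n)

# A dyadic rational within 1/100 of 1/3.
d = dyadic_within(Fraction(1, 3), Fraction(1, 100))
```

Shrinking eps produces dyadic rationals ever closer to x, which is why every ε-neighborhood of x contains infinitely many of them.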
Exercises
3.1.1. Prove that every constant function is continuous.
3.1.2. Give an example of a continuous function f and an open set S in the domain
of f such that f(S), the image of S, is not open.
3.1.3. Define the function f over the domain [−1/π, 1/π] by
f(x) = sin(1/x), x ≠ 0, f(0) = 0.
Describe f⁻¹(N_{1/2}(0)), the inverse image of N_{1/2}(0). Is this inverse image open,
closed, or neither?
3.1.4. Define the function g over the domain [−1/π, 1/π] by
3.1.5. For each of the following sets in ℝ² state whether the set is open, closed, or
neither. Then find the closure, the interior, and the boundary of the set.
1.
2.
3.
4.
5.
6.
7.
3.1.6. For each of the following sets in R, state whether the set is open, closed, or
neither. Then find the closure, the interior, and the boundary of the set.
1. ℚ
2. ℝ − ℚ
3. ⋃_{n=1}^∞ (1/2n, 1/2n − 1)
4. ⋃_{n=1}^∞ [1/2n, 1/2n − 1]
5. the set of all rational numbers with denominators that are less than 1,000
6. the set of all rational numbers with denominators that are powers of 2
7. the set of all rational numbers with numerators that are powers of 2
3.1.7. Prove that every neighborhood of x contains at least one point of S other
than x itself if and only if every neighborhood of x contains infinitely many points
of S.
3.1.8. Prove that if a set is both open and closed, then it is either ℝ or the empty
set.
3.1.9. Give an example of a set with an accumulation point that is not a boundary
point.
3.1.10. Give an example of a set with a boundary point that is not an accumulation
point.
them. It was in these lectures that Weierstrass first explained what today we call the
Bolzano—Weierstrass theorem. His proof rested on the nested interval principle.
The shared attribution with Bernhard Bolzano arises from Weierstrass's acknowl-
edgment of his indebtedness to Bolzano's 1817 proof that the convergence of every
Cauchy sequence implies that every bounded, increasing sequence has a limit.
The nested interval principle implies what we can call the Bolzano—Weierstrass
principle, that every bounded infinite set has a limit point, but the Bolzano—
Weierstrass principle also implies the nested interval principle. Given a nested
sequence of intervals, we create an infinite set S by choosing one point from
each interval so that no point duplicates any of the previously chosen points. By
Bolzano—Weierstrass, the set S has a limit point, and since every neighborhood
of this limit point contains infinitely many elements of S, this limit point must be
inside every interval.
How do we define and assign values to those points on the real number line that
are predicted by Bolzano—Weierstrass but are neither algebraic nor roots of common
functions? Is there a way we can start with arithmetic and build to all of the values
represented by the real number line? Several notable mathematicians wrestled with
this question. Beginning in the late 1850s, Richard Dedekind, Karl Weierstrass,
Georg Cantor, and H. Charles Méray each found his own solution. Dedekind and
Cantor published their solutions in 1872. The details of these solutions are less
important than what they have in common, all drawing on a fundamental observation
about the structure of ℝ: The set of rationals is dense on the real number line.
Every open interval, no matter how far we may have zoomed in, contains at
least one and therefore infinitely many rational numbers (see Exercise 3.2.1). Any
point on the real number line is uniquely determined by reference to these rational
numbers. To Dedekind, each irrational number is described by considering all
rationals less than the point in question and all rationals that are greater. Given
two such sets with the property that every element of the first is strictly less than
every element of the second and their union consists of all rational numbers, such
a "Dedekind cut" defines a unique point in ℝ. Weierstrass used series. Cantor and
Méray used sequences of rational values that could be forced as close as desired to
the point in question by taking sufficiently many terms or going out sufficiently far
in the sequence. Cantor identified each point in ℝ with the collection of rational
Cauchy sequences that converge to this point. Each point on the real number line
is identified by an appropriate collection of Cauchy sequences.
We are using a modified form of Cantor's definition when we identify the
elements of ℝ with all possible decimals to infinitely many places. Such a decimal
expansion represents a choice of a particular Cauchy sequence. For example,
we identify π with a Cauchy sequence that begins (3, 3.1, 3.14, 3.141, 3.1415,
3.14159, ...). When we write "π = 3.14159...," there are two observations that
we need to make. The first is that, actually, we have not specified the location of
π. The information supplied by 3.14159... tells us nothing about the digit that
follows 9. There are an infinitude of points on the real number line that begin with
this particular decimal expansion. Giving a thousand or a million or a billion digits
gets us no further in the sense that there are still infinitely many different points
whose expansions start with those digits.
The second observation is that the statement "π = 3.14159..." nevertheless
does tell us something very important. It implies that there is a sequence of rational
approximations to π that begins (3, 3.1, 3.14, 3.141, 3.1415, ...) and that eventually
will enter and stay within any open interval containing π, no matter how small
that interval might be. We may not know how to find an arbitrary term of this sequence,
but we are asserting its existence. Such a statement tells us that π is a point
on the real number line and that it is located within the interval [3.14159, 3.14160].
There are, of course, many explicit Cauchy sequences that represent π. One of
them is given by
(4, 4 − 4/3, 4 − 4/3 + 4/5, 4 − 4/3 + 4/5 − 4/7, 4 − 4/3 + 4/5 − 4/7 + 4/9, ...).
The point of Cantor's construction is not that we can find a Cauchy sequence for
each real number, but that it exists.
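For readers who want to watch such a sequence settle down, here is a small sketch that computes the partial sums of 4 − 4/3 + 4/5 − 4/7 + ... as exact rationals. The function name is an illustrative choice, not the book's. The bound used in the final comment is the standard alternating-series estimate, which is assumed here rather than proved.

```python
from fractions import Fraction

def leibniz_partial_sums(count):
    """Return the first `count` partial sums of 4 - 4/3 + 4/5 - 4/7 + ... as Fractions."""
    total = Fraction(0)
    sums = []
    for k in range(count):
        total += Fraction(4 * (-1) ** k, 2 * k + 1)
        sums.append(total)
    return sums

sums = leibniz_partial_sums(50)
# Because the terms alternate in sign and shrink, every later partial sum stays
# within 4/21 (the size of the first omitted term) of the 10th partial sum.
tail_spread = max(abs(s - sums[9]) for s in sums[10:])
```

Every term of the sequence is rational, yet the terms crowd together in the way the Cauchy condition demands. The convergence is slow, which underlines that the existence of such a sequence matters more than its efficiency.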
Completeness
Dedekind based his construction on the assumption that every nonempty bounded
set should have a least upper bound. Cantor based his construction on the
assumption that every Cauchy sequence should converge. The nested interval principle
is another equivalent assumption. These are different but equivalent ways of
making precise what we mean when we describe the real number line as a continuum.
This property of ℝ would eventually come to be known as completeness.
Definition: Completeness
A set of numbers is called complete if it has any of the four equivalent properties:
1. Every nonempty bounded set has a least upper bound.
2. Every Cauchy sequence converges.
3. Every sequence of closed, nested intervals has a nonempty intersection (the nested interval principle).
4. Every bounded infinite set has a limit point (the Bolzano–Weierstrass principle).
In particular, the set of all real numbers is complete. The set of all rational
numbers is not complete. Today, rather than attempting to define the set of real
numbers so as to justify their completeness, it is common to simply assert as
axiomatic that ℝ contains all rational numbers and is complete.
I have already explained Weierstrass's proof that the nested interval principle and
the Bolzano—Weierstrass principle are equivalent. The equivalence of the remaining
statements is left for you in Exercises 3.2.8—3.2.10.
The nineteenth century witnessed an increasing sense of paradox from the in-
terplay of algebra and geometry on the real number line. It reached a peak in the
1880s. One of the curious phenomena that was discovered and debated in that
decade began with the observation that the rational numbers are denumerable or
countable.
Informally, a set is countable if it is possible to list its elements in order: first,
second, third, .... The rational numbers in [0, 1] can be listed by ordering them by
the size of the denominator (when reduced) and among those of equal denominator
by the numerator:
(0/1, 1/1, 1/2, 1/3, 2/3, 1/4, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 1/7, ...). (3.2)
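The listing rule just described is mechanical enough to write as a short generator. This is a sketch in which the function names are illustrative choices. It orders the reduced fractions in [0, 1] by denominator and then by numerator, exactly as the text describes.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals_in_unit_interval():
    """Yield each rational in [0, 1] exactly once, by denominator, then numerator."""
    yield Fraction(0, 1)
    yield Fraction(1, 1)
    for q in count(2):                 # denominators 2, 3, 4, ...
        for p in range(1, q):          # numerators 1, ..., q - 1
            if gcd(p, q) == 1:         # keep only fractions in lowest terms
                yield Fraction(p, q)

def first_terms(n):
    """Return the first n terms of the listing."""
    return list(islice(rationals_in_unit_interval(), n))

listing = first_terms(14)   # 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, 1/5, ...
```

Every rational p/q in [0, 1] is reached after finitely many steps, since only finitely many fractions have denominator at most q. That is what it means for the listing to put the set in one-to-one correspondence with the positive integers.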
The fact that the set of rational numbers is countable leads to an important characterization
of open sets in ℝ.
Definition: Countable
A set is countable or denumerable if it is finite or if it is in one-to-one correspondence
with ℕ, the set of positive integers. A set that is countable and not finite is
called countably infinite.
Proof Let U be an open set, and choose any t ∈ U. Let a = inf{x | (x, t] ⊆ U}. The
point a cannot be in U, for otherwise some neighborhood of a would lie in U
and we could find an x < a for which (x, t] ⊆ U. Similarly, let b = sup{x | [t, x) ⊆
U}. The point b cannot be in U, but (a, b) ⊆ U. (Note that we might have a = −∞
and/or b = ∞.) We call this interval I(t) = (a, b).
If s and t are two points in U, then either I(s) = I(t) or I(s) ∩ I(t) = ∅, so U
is a union of disjoint open intervals. To see that there are at most countably many
open intervals, we observe that we can find a distinct rational number inside each
interval.
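When an open set is given as a finite union of open intervals, the disjoint intervals I(t) of the proof can be computed directly. The sketch below assumes such a finite list of pairs (a, b) as input. The function name is an illustrative choice, and the general theorem, which handles arbitrary open sets, is not captured by this finite computation.

```python
def disjoint_components(intervals):
    """Merge open intervals (a, b) into the disjoint open intervals with the same union."""
    merged = []
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the current component, so extend that component.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            # No overlap: (a, b) starts a new component. Note that (0, 2) and
            # (2, 3) stay separate because neither open interval contains 2.
            merged.append((a, b))
    return merged

components = disjoint_components([(0, 1), (0.5, 2), (2, 3), (5, 6), (5.5, 5.75)])
```

The strict inequality in the overlap test reflects the observation in the proof that the endpoints a and b of I(t) are not themselves in U.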
Harnack's Mistake
We take the ordering of the rational points in [0, 1] given in (3.2) and call them (a_1 =
0, a_2 = 1, a_3 = 1/2, a_4 = 1/3, ...). We choose any positive ε and let I_k be the open
interval of length ε/2^k that is centered at a_k: I_k = (a_k − ε/2^{k+1}, a_k + ε/2^{k+1}). Does
the union of these intervals contain all points in [0, 1]? In other words, can we put
the closed interval [0, 1] inside a countable union of open intervals whose lengths
add up to ε? This was a problem first posed by Axel Harnack in 1885. He convinced
himself that the answer is "yes."
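The claim that the lengths add up to ε is a geometric series, and the partial sums can be checked exactly. The sketch below assumes ε is given as a rational number. The function names are illustrative choices.

```python
from fractions import Fraction

def interval_lengths(epsilon, n):
    """Lengths epsilon/2, epsilon/4, ..., epsilon/2**n of the first n intervals."""
    return [Fraction(epsilon) / 2 ** k for k in range(1, n + 1)]

def total_length(epsilon, n):
    """Exact sum of the first n lengths, which equals epsilon * (1 - 2**-n)."""
    return sum(interval_lengths(epsilon, n), Fraction(0))

eps = Fraction(1, 10)
covered = total_length(eps, 20)   # just short of 1/10
```

The partial sums increase toward ε without ever reaching it, so the whole countable family has total length ε. Whether these intervals cover [0, 1] is a separate question that the arithmetic alone does not settle.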
Axel Harnack (1851–1888) was the younger twin brother of the German theologian
Adolf von Harnack. Axel earned his doctorate at Erlangen-Nürnberg University
in 1875, working under the direction of Felix Klein. He is best known for his
work in harmonic analysis and the theory of algebraic curves.
In essence, what Harnack did was to ask himself, "What is the complement of a
countable union of intervals?" He believed that it must also be a countable union of
intervals. Think about this. The intervals might be open or closed or half-open/half-
closed, and a closed interval might be a single point. It is certainly true that the
complement of any finite union of intervals is a finite union of intervals. It is not
obvious that the same would not be true for countable unions. But if Harnack was
right, then the complement of ⋃ I_k is a countable union of intervals. The intervals
in the complement must be single points, otherwise they would contain rational
numbers between 0 and 1. We now put each of these countably many points inside
intervals whose lengths add up to ε, and we now have all of [0, 1] contained within
a union of countably many intervals whose lengths add up to 2ε.
In fact, Harnack's basic premise, that the complement of a countable union of
intervals is a countable union of intervals, is wrong. The complement can be an
uncountable union. Georg Cantor and others understood this. The flaw in Harnack's
reasoning underscores some of the complexity of the real number line as a set of
numbers, and we shall treat it in full detail in the next section. But that does not
prove that his answer was wrong. This was accomplished by Émile Borel in 1895.
Borel's Series
A year after earning his doctorate, Borel published Sur quelques points de la théorie
des fonctions (On some points in the theory of functions), a paper that dealt with
a question from complex analysis. Specifically, he studied analytic continuation
across a boundary on which we have a countable dense set of poles (points for
which the function is unbounded in every neighborhood). We shall focus on a very
special case of the type of function he studied,³ the series
∑_{n=1}^∞ A_n/|x − a_n|, where ∑_{n=1}^∞ A_n^{1/2} = A < ∞,
and the points {a_n} are dense in [0, 1]. For example, we could take {a_n} to be the
set of rational numbers in [0, 1]. For any x in
{x ∈ [0, 1] | |x − a_n| ≥ c A_n^{1/2} for all n ≥ 1}
our series converges. But are there any points in this set?
Borel considered the complement of this set. It is a union of open intervals, |x − a_n| <
c A_n^{1/2}, the nth of which has length 2c A_n^{1/2}. The sum of the lengths of these intervals
is 2cA. If we choose c < 1/(2A), then these intervals have total combined length
less than 1. To prove his result about analytic continuation, Borel needed to know
that there is at least one point in this set.
Borel proved that if the sum of the lengths of the open intervals is strictly less
than 1, then there must be points (in fact, uncountably many points) that are
not in any of these intervals. For c < 1/(2A), the set of points outside all of these
intervals contains uncountably many
points in [0, 1]. In a note appended to this paper, he remarked that his proof actually
demonstrated a stronger statement, a theorem that today we call the Heine–Borel
theorem. In stating and proving this theorem, Borel was interested only in the case
of a countable collection of open intervals. We give it in its most general form.
³ The interested reader can find a fuller description in Hawkins's Lebesgue's Theory of Integration,
pp. 97–106.
This theorem will become one of our most useful tools for proving results about
measure theory and Lebesgue integration. It implies that if we have a collection
of open intervals for which the sum of the lengths is strictly less than 1, then the
union of those intervals cannot contain [0, 1]. If it did, then by Heine—Borel there
would be a finite subcollection that also contained [0, 1], and this is impossible. In
Exercise 3.2.3, you are asked to prove that a finite union of intervals of total length
less than 1 cannot cover [0, 1].
Henri Lebesgue in his 1904 book Leçons sur l'Intégration et la Recherche des
Fonctions Primitives (Lectures on integration and the search for antiderivatives)
gave the following proof, which is valid for any collection of open intervals,
including an uncountable collection.
Proof Consider the set S of points x ∈ [a, b] for which [a, x] is contained in
a finite union of intervals from {U_k}. Since a ∈ S, we know that this set is not
empty. Since the set has an upper bound, it must have a least upper bound, call it
β = sup S. The point β is contained in one of these open sets, say U_1 = (α_1, β_1).
Since β = sup S, some x ∈ S satisfies α_1 < x ≤ β. If we take the finite union of
intervals that contains [a, x] and add to it the open set U_1, we have a finite union
of open sets that contains [a, β_1), or all of [a, b] if β_1 > b. This
contradicts the assumption that β = sup S unless β = b.
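The idea of pushing the covered stretch [a, x] as far to the right as possible can be imitated by a greedy search. The sketch below is only an illustration under a strong simplifying assumption: the collection of open intervals is already a finite list, so it shows the mechanics of the argument and not the theorem itself, whose force lies in infinite collections. The function name finite_subcover is an illustrative choice.

```python
def finite_subcover(intervals, a, b):
    """Greedily choose open intervals from `intervals` whose union contains [a, b].

    Returns the chosen intervals in order, or None if some point of [a, b]
    lies in no interval of the list.
    """
    chosen = []
    reach = a   # every point of [a, reach) lies in a chosen interval
    while True:
        # Among the intervals containing `reach`, take the one extending farthest right.
        candidates = [(lo, hi) for lo, hi in intervals if lo < reach < hi]
        if not candidates:
            return None
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        reach = best[1]
        if reach > b:
            return chosen

cover = finite_subcover([(-1, 2), (1, 4), (3, 6), (0, 3), (5, 8)], 0, 7)
```

Each pass strictly increases reach, so no interval is chosen twice and the loop terminates. A right endpoint is not covered by its own open interval, which is why the search continues from reach itself rather than stopping there.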
Compactness
This property of closed, bounded intervals is so important that it has a name,
compactness. The Heine—Borel theorem tells us that any closed, bounded interval,
[a, b], is compact. We are interested in all sets that share this property.
Proof We leave it as Exercises 3.2.4 and 3.2.5 to show that if a set is not closed or
not bounded, then it is not compact.
Proof We first prove that f(S) is bounded. Let (I_n) be a sequence of open intervals
of length 1 for which f(S) ⊆ ⋃ I_n. Since f is continuous, f⁻¹(I_n) is open and
(f⁻¹(I_n)) is an open cover of S. Since S is compact, we can find a finite subcollection
(f⁻¹(I_{n_k})) that contains all of S. It follows that ⋃ I_{n_k} contains f(S), and
this is a finite union of intervals of length 1.
We now prove that f(S) is closed. Let y be any accumulation point of f(S)
and choose a sequence (y_j) ⊆ f(S) that converges to y. Since (f⁻¹(y_j)) is
contained in the bounded set S, the Bolzano–Weierstrass theorem promises us an
accumulation point, x_0, of this sequence. It follows that there is a subsequence that
converges to this accumulation point, f⁻¹(y_{j_k}) → x_0. Since S is closed, x_0 must be
in S. By continuity, y_{j_k} converges to f(x_0), and therefore y = f(x_0) ∈ f(S). □
Two Corollaries
There are two immediate corollaries of the Heine—Borel theorem that are histori-
cally intertwined. They predate Borel's theorem of 1895. The first corollary is the
Bolzano—Weierstrass theorem, Theorem 3.4. Since S is bounded, it is contained in
a closed interval [a, b]. If there are no limit points, then for each point of [a, b] we
can find a neighborhood that contains only finitely many points of S. The collection
of these neighborhoods is an open cover of [a, b], and so there is a finite subcover
of [a, b] ⊇ S. But this finite subcover contains only finitely many points of S.
Since the Heine—Borel theorem follows from the assumption that every bounded
set has a least upper bound, and it in turn implies the Bolzano—Weierstrass theorem,
we see that the Heine—Borel theorem is yet another equivalent statement of what it
means to say that R is complete.
The second corollary is Theorem 1.13: A continuous function on a closed and
bounded interval is uniformly continuous on that interval. In 1904, Lebesgue
observed that Heine—Borel implies "a pretty demonstration of the uniformity of
continuity." In his words,
Let f be a continuous function at all points of [a, b]. Each point of [a, b] is, by definition,
inside an interval on which the oscillation of f is less than ε. One can cover [a, b] with a
finite number of them. Let l be the length of the shortest interval that is used. In each interval
of length l, the oscillation of f is at most 2ε since such an interval overlaps at most two of the
intervals. The continuity is uniform.⁴
⁴ Lebesgue (1904, p. 113, fn). The notation for closed intervals was not standard at that time. Instead of using
[a, b], Lebesgue refers to it as "(a, b) including a and b."
⁵ See Dugac (1989, pp.
⁶ See A Radical Approach to Real Analysis, 2nd ed., pp. 241–243.
Clearly a ∈ S_a, so the set is not empty. Since it is bounded, it has a least upper
bound. Set c_1 = sup S_a. Dirichlet observes that if c_1 ≠ b, then |f(a) − f(c_1)| = ε
(see Exercise 3.2.6). We now define c_2 = sup S_{c_1}.
We continue in this way, obtaining a sequence of values a < c_1 < c_2 < c_3 < ··· ≤
b with the property that |f(c_{k+1}) − f(c_k)| = ε and if c_k < x < c_{k+1}, then
|f(x) − f(c_k)| < ε. If there are only a finite number of values of c_k before we get
to b, then we have uniform continuity (see Exercise 3.2.7). We only need to rule
out the possibility that (a < c_1 < c_2 < ···) is an infinite sequence, converging to
some c ≤ b.
If there is such a limit c, we know that f is continuous at c, and therefore we
can find a response δ so that c − δ < x < c + δ implies that |f(x) − f(c)| < ε/2.
Since the sequence (c_k) converges to c, we can find two consecutive elements of
the sequence that lie inside this δ-neighborhood, say c_j and c_{j+1}. But then
|f(c_{j+1}) − f(c_j)| ≤ |f(c_{j+1}) − f(c)| + |f(c) − f(c_j)| < ε/2 + ε/2 = ε,
contradicting |f(c_{j+1}) − f(c_j)| = ε.
Montel and Giuseppe Vitali questioned whether Schönflies had any special claim
to this theorem. They referred to it as the Borel–Lebesgue theorem, acknowledging
Lebesgue's priority in proving the general case. Borel himself would come to call
it the "first fundamental theorem of measure theory," a name with a great deal of
merit that, unfortunately, has not stuck.
In 1904, Oswald Veblen pointed out that Theorem 3.4, the Bolzano–Weierstrass
theorem, follows from the Heine–Borel theorem and, in fact, is equivalent to it.
The key to proving Heine–Borel is that every bounded set must have a least upper
bound, but this is precisely the property of completeness and thus equivalent to the
Bolzano–Weierstrass theorem.
Exercises
3.2.1. Show that if every open interva! contains at !east one point of 5, then every
open interva! contains infinite!y many points of S. Show that no matter how many
points of S we have found, if it is a finite number, then we can always find one
more.
3.2.5. Show that if a set S is not c!osed, then we can find an infinite co!!ection of
open sets whose union contains S and such that no finite subcollection wi!! contain
S. Use the fact that S is not c!osed if and only if there is a !imit point of S that is
not in S.
3.2.6. Prove that if f is continuous on [a, b] and $s = \sup\{t < b \mid a < x < t \implies |f(x) - f(a)| < \epsilon\}$
and $s < b$, then $|f(a) - f(s)| = \epsilon$.
3.2.7. Clean up Dirichlet's proof of Theorem 1.13. Assume that for every $\epsilon > 0$,
there is a finite sequence of values, say $a = c_0 < c_1 < \cdots < c_n = b$, such
that if $c_k < x < c_{k+1}$ then $|f(x) - f(c_k)| < \epsilon$. Explain how to use this to find a $\delta$
response to any challenge of an $\epsilon > 0$.
3.2.8. Prove that the Bolzano–Weierstrass principle, that every infinite bounded
set has a limit point, implies that every Cauchy sequence converges.
3.2.9. Prove that if every Cauchy sequence converges, then every bounded set has
a least upper bound.
3.2.10. Prove that if every bounded set has least upper bound, then every sequence
of closed, nested intervals has a nonempty intersection.
3.2.11. In 1880, Weierstrass published a proof that if a series converges
uniformly in some neighborhood of each x in the closed and bounded interval
[a, b], then the series converges uniformly over the entire interval. An outline
of the proof — translated into modern terminology — is given below. Justify each
of these statements. Identify where and how Weierstrass used the completeness
of R.
n=1 m=1
3.3 Set Theory 71
U — + = UU — 4m' +
n=1 m=1 k=1
3.2.14. Consider the bounded infinite set of numbers of the form
3m
MeN.
m=1
Show that 1/3 is not an accumulation point. Find three accumulation points of
this set.
Cardinality
To lay the foundations for an explanation of Cantor's work, it will be helpful to
borrow concepts and language that would not come into being until much later,
of values. We can order the union of all of these sequences, ordering by the sum of
the subscripts and by the first subscript when the sums are equal (see Figure 3.1).
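The ordering just described can be sketched in a few lines of Python. This is only an illustration under two assumptions that are not spelled out in the surviving text: that the entry with subscripts (i, j) is the fraction i/j, and that a value already listed is skipped when it reappears (for example 2/2 = 1). The function name `positive_rationals` is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield each positive rational once, ordered by i + j and then by i,
    where the entry with subscripts (i, j) is assumed to be i/j."""
    seen = set()
    for total in count(2):              # total = i + j, the sum of the subscripts
        for i in range(1, total):       # ties are broken by the first subscript
            q = Fraction(i, total - i)
            if q not in seen:           # skip repeats such as 2/2 = 1
                seen.add(q)
                yield q

# The first few terms of the resulting sequence a1, a2, a3, ...
first_terms = list(islice(positive_rationals(), 6))
```

Under these assumptions the sequence begins 1, 1/2, 2, 1/3, which agrees with the first four values listed in the text.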
Once we have ordered the positive rational numbers,
a1 = 1, a2 = 1/2, a3 = 2, a4 = 1/3
rational numbers are algebraic, and so the cardinality of the algebraic numbers can
be neither more nor less than $\aleph_0$. The set of algebraic numbers has cardinality $\aleph_0$.
The proof presented here is a recasting into modern language of Cantor's original
1874 proof, interesting because of its use of the nested interval principle. The more
commonly known proof is based on an argument made by Cantor in 1891. You will
see it later in this section. The 1874 proof is important not just because it was first.
It also shows that if a set satisfies the nested interval principle and between any
two distinct elements there always lies a third, then the set cannot be countable.
Proof. We need to prove that we cannot order the real numbers as $(r_1, r_2, r_3, \ldots)$, so
we assume that we can and look for an absurdity that arises from this assumption.
We pick any closed interval on the real line, say [0, 1], and find the first two real
numbers in our list that are inside this interval. Since all real numbers are in our
list, there are at least two inside [0, 1]. Call them $a_0 < b_0$. All of the numbers in
the open interval $(a_0, b_0)$ are also inside [0, 1], so we have not yet encountered any
of them. We continue down the list until we find the first two real numbers in our
list that are inside $(a_0, b_0)$, call them $a_1 < b_1$. We have not yet encountered any
numbers from the open interval $(a_1, b_1)$, so we continue until we find the first two
in this interval.
We are generating a nested sequence of closed intervals,
$$[0, 1] \supset [a_0, b_0] \supset [a_1, b_1] \supset [a_2, b_2] \supset \cdots,$$
with $0 < a_0 < a_1 < a_2 < \cdots$ and $1 > b_0 > b_1 > b_2 > \cdots$. By the nested interval
principle, there is at least one real number contained in all of these intervals, call it
$r_m$. By the strict inclusion of these intervals, $r_m$ is not equal to any of the endpoints.
Now we have a problem because $a_n$ and $b_n$ are preceded by at least 2n elements
from our sequence $(r_1, r_2, r_3, \ldots)$. This means that we can find an n for which
$a_n$ and $b_n$ come after $r_m$ in our sequence. This contradicts the fact that $a_n$ and $b_n$ are the
first two real numbers in the list that lie within the open interval $(a_{n-1}, b_{n-1})$. □
Theorem 3.9 says that the continuum, the set of points in R, has a cardinality
strictly larger⁸ than $\aleph_0$, a cardinality that is denoted by the letter c. Are there any
subsets of [0, 1] whose cardinality is larger than $\aleph_0$ and less than c? Cantor believed
that it was not possible, a belief known as the continuum hypothesis.
⁸ It is not clear that cardinalities can always be ordered. A discussion of what it means for one cardinal number
to be larger than another can be found in Section 5.4.
Until 2002, the Fields Medal was the highest honor any mathematician could win. Awarded only every four
years but to up to four people, it is restricted to mathematicians under the age of 40. In 2002, the Abel Prize was
created by the Norwegian government to honor Niels Henrik Abel. Similar to the Nobel Prize, it is awarded
each year.
quibbled that this means making an uncountable number of selections, and it is not
clear how that can be done.
The ability to choose one element from each equivalence class and so define
this subset is a consequence of what came to be known as the axiom of choice.
It appears in many proofs, but it also has surprising and disturbing consequences
such as the Banach—Tarski paradox. As it turns out, the truth of the existence of
this set is also a matter of choice, a fact also proven by Gödel and Cohen. This
axiom will come to play an important role in Section 5.4, where we shall explore
it in greater detail and say something about the Banach—Tarski paradox.
The problem arises from thinking of R as a set of values that in some sense are
equivalent to algebraic values. The status of the continuum hypothesis shows just
how strange it really is to impose the concept of sets of numbers onto the points
of a continuous line. This does not mean that we should not think of R as a set of
numbers. That is an extremely useful construction that will lie at the heart of our
eventual solution of all of the problems regarding integration and the fundamental
theorem of calculus. But it is a reminder that we must tread very carefully. We are
now in a realm where intuition can no longer be trusted.
Power Sets
If A and B are sets, we use $A^B$ to denote the set of all mappings from B to A. The
reason for this notation is best explained through an example. Let
A = {a, b, c}, B = {1, 2, 3, 4, 5}.
A mapping from B to A assigns one of the letters from A to each of the numbers in
B. There are three possible images of 1 (we can have 1 → a or 1 → b or 1 → c),
three possible images of 2 (no reason that we cannot use the same image more than
once), and so on for a total of $3^5$ possible mappings. We see that for finite sets, the
cardinality of $A^B$ is the cardinality of A raised to the cardinality of B.
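As a quick check of this count, the following Python sketch lists every mapping from B to A explicitly. Representing each mapping as a dictionary is a choice made only for this illustration.

```python
from itertools import product

A = ["a", "b", "c"]
B = [1, 2, 3, 4, 5]

# Each mapping from B to A picks one image in A for every element of B,
# so the mappings correspond to the 5-tuples of letters from A.
mappings = [dict(zip(B, images)) for images in product(A, repeat=len(B))]

number_of_mappings = len(mappings)
```

The list has $3^5 = 243$ entries, one for each way of choosing the five images independently.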
We now extend this idea to infinite cardinalities. In general $A^B$ means the
cardinality of the set of mappings from a set with cardinality B to a set with
cardinality A.
For example $\aleph_0^2$ denotes the cardinality of the set of mappings from {1, 2} to N.
Each mapping is uniquely determined by a pair of positive integers, $\{(i, j) \mid i, j \in \mathbf{N}\}$.
The first coordinate is the image of 1; the second coordinate is the image of
2. We have seen (see Figure 3.1) that the cardinality of such pairs is again $\aleph_0$, and
therefore
$$\aleph_0^2 = \aleph_0.$$
It is easy to extend this to any finite positive integer n: $\aleph_0^n = \aleph_0$.
What about $2^{\aleph_0}$? This is the cardinality of the set of mappings from N to {0, 1}. We
are looking at infinite sequences of 0s and 1s, such as 1010010001000010...
There is a natural correspondence between such sequences and the set of real
numbers between 0 and 1. We just put a decimal point in front of the sequence and
read this sequence in base 2:
$$0.1010010001\ldots_2 = \frac{1}{2} + \frac{0}{2^2} + \frac{1}{2^3} + \frac{0}{2^4} + \frac{0}{2^5} + \frac{1}{2^6} + \frac{0}{2^7} + \frac{0}{2^8} + \frac{0}{2^9} + \frac{1}{2^{10}} + \cdots$$
We do have times when two different sequences represent the same real number,
for example,
$$0.00100111\ldots_2 = 0.00101000\ldots_2,$$
but there are a countable number of these duplications. Exercises 3.3.5–3.3.7
establish that $2^{\aleph_0}$ is the cardinality of R,
$$2^{\aleph_0} = c. \qquad (3.3)$$
by $2^S$.
It was in 1891, in an address to the first congress of the German Mathematical
Association, that Cantor stated and proved what is now known as Cantor's theorem,
that the cardinality of S can never equal the cardinality of its power set.
Theorem 3.10 (Cantor's Theorem). For any set S, the cardinality of S is not the
same as the cardinality of the power set of S.
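For a small finite set the theorem can be checked exhaustively. The Python sketch below is an illustration, not part of the proof: it runs through every mapping psi from S = {0, 1, 2} to its power set, forms the subset T of elements a that are not in psi(a), and confirms that T is never one of the images. The names `power_set` and `diagonal_set` are hypothetical helpers introduced here.

```python
from itertools import combinations, product

def power_set(S):
    """All subsets of S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(S, psi):
    """The subset T of S with a in T exactly when a is not in psi[a]."""
    return frozenset(a for a in S if a not in psi[a])

S = {0, 1, 2}
subsets = power_set(S)

# Try every one of the 8**3 mappings psi from S into the power set of S.
diagonal_always_missed = True
for images in product(subsets, repeat=len(S)):
    psi = dict(zip(sorted(S), images))
    if diagonal_set(S, psi) in psi.values():
        diagonal_always_missed = False
```

No mapping ever reaches its own diagonal set, so none of these mappings is onto. The finite check proves nothing new, since 3 < 8 already, but it makes the construction of T concrete.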
Proof. We assume that S and $2^S$ have the same cardinality and look for a contradiction.
If the cardinalities were the same, then we would have a one-to-one and
onto mapping from S to the collection of subsets of S, $\psi : S \to 2^S$. We construct a
subset $T \subseteq S$ according to the following rule: $a \in T$ if and only if $a \notin \psi(a)$. Since
T is a subset of S, it is an element of $2^S$. Since $\psi$ is a one-to-one and onto mapping,
we can find an element $b \in S$ for which $\psi(b) = T$. Is b in T? If b is in $T = \psi(b)$,
then $b \in \psi(b)$, so b is not an element of T. That is a contradiction, so b cannot be
in T. But if b is not in $T = \psi(b)$, then $b \notin \psi(b)$, so b is in T. Having assumed a
Note that, since R has the cardinality of the power set of N, this theorem implies
that R is not countable. In fact, a variation on this argument has become the
common proof that R is not countable. Assume that the set of real numbers in
[0, 1] is countable, and write down their decimal expansions in order
$0.a_1a_2a_3a_4a_5\ldots$
$0.b_1b_2b_3b_4b_5\ldots$
$0.c_1c_2c_3c_4c_5\ldots$
$0.d_1d_2d_3d_4d_5\ldots$
Now choose any decimal whose digits do not include 0 or 9 (Exercise 3.3.9 asks
you to explain why we avoid those digits) and whose first digit is not $a_1$, whose
second digit is not $b_2$, whose third digit is not $c_3$, whose fourth digit is not $d_4$,
and so on. This number does not appear in the list, and so we were wrong when
we claimed that we could list them in order. We see now that c is not the largest
possible cardinality; $2^c$ is bigger; $2^{2^c}$ is even bigger than that.
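The diagonal choice of digits can be illustrated on a finite list. In the Python sketch below, the function name `diagonal_decimal` and the rule "use 5, unless the diagonal digit is 5, in which case use 6" are assumptions made for the example; any rule that avoids 0, 9, and the diagonal digit would serve.

```python
def diagonal_decimal(expansions):
    """Given digit strings (the digits after the decimal point), return a digit
    string whose k-th digit differs from the k-th digit of the k-th string."""
    digits = []
    for k, row in enumerate(expansions):
        d = int(row[k])
        # Avoid 0 and 9, and avoid the diagonal digit d.
        digits.append("5" if d != 5 else "6")
    return "".join(digits)

listed = ["1415", "5555", "0000", "9995"]
new_digits = diagonal_decimal(listed)
```

The resulting string differs from the k-th listed expansion in its k-th digit, so it cannot equal any entry of the list.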
We can go even further, taking the union of countably many sets of which
the first set has cardinality c and the nth set is the power set of the (n−1)st set:
$|S_1| = c$, $S_n = 2^{S_{n-1}}$, $T_1 = \bigcup_{n=1}^{\infty} S_n$. This union has cardinality strictly larger than
any of the cardinalities in the sequence. We can then take the power set of this union,
$T_2 = 2^{T_1}$. We can now restart with this power set, continue through countably many
power sets, and take the union of these sets: $U_1 = \bigcup_{n=1}^{\infty} T_n$. This is only the second
iteration of this process. We can do it countably many times: $V_1 = \bigcup_{n=1}^{\infty} U_n$, $W_1 = \bigcup_{n=1}^{\infty} V_n$, ....
We have still only described countably many cardinalities. There are
sets of cardinalities that are themselves uncountable, even sets of cardinalities that
have cardinality c. In fact, for every cardinality b, there is a set of cardinalities
that itself has cardinality b.
Exercises
3.3.1. Show that if $(b_1, b_2, \ldots)$ is any sequence, if $x < b_n$ for every n, and if $y \geq b_n$
for some n, then $x < y$.
3.3.2. Explain how to establish a one-to-one correspondence between N and the
set of all rational numbers.
3.3.3. Explain how to establish a one-to-one correspondence between R and the
set of all irrational numbers.
3.3.4. Describe a one-to-one and onto mapping between each of the following pairs of
sets:
3.3.5. Prove that the set of real numbers in [0, 1] that have more than one
representation in base 2 is a countable set.
3.3.6. Prove that $c + \aleph_0 = c$. That is to say, find a one-to-one mapping from R to
R for which the image omits countably many elements of R. Hint: If x =
b odd, then x x/2. Otherwise, x x.
3.3.16. Consider the set of pairs (x, y) that are roots of polynomials in x and y
with rational coefficients; for example, $x^2 + xy + y^2$. What is the cardinality of
the set of all such pairs? Justify your answer.
3.3.17. What is the cardinality of the set of all rational polynomials in $\pi$ (all
expressions of the form $a_n\pi^n + a_{n-1}\pi^{n-1} + \cdots + a_1\pi + a_0$, where $a_0, a_1, \ldots,$
$\bigcap [a_n, b_n]$.
4
Nowhere Dense Sets and the Problem with the
Fundamental Theorem of Calculus
This chapter will focus on the types of sets that confused Hankel and Harnack
and many other mathematicians of the late nineteenth century. Consider again
what is left over when we order the rational numbers between 0 and 1, say $a_1 = 0$,
$a_2 = 1$, $a_3 = 1/2, \ldots$, and remove all numbers within 1/8 of $a_1$, within 1/16 of
$a_2$, ..., within $1/2^{n+2}$ of $a_n$, .... As Borel showed, there is something left over. In fact,
if we let S denote the set of points that are not eliminated, then $c_e(S) \geq 1/2$ (see
Exercise 4.1.1).
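The bound of 1/2 comes from adding the lengths of the removed intervals. The Python sketch below is a hedged illustration of that arithmetic only: it assumes the pattern of radii continues as 1/8, 1/16, 1/32, ..., so that the interval around the nth rational has length at most $2/2^{n+2}$, and it ignores overlaps, which can only make the removed set smaller. The function name is hypothetical.

```python
from fractions import Fraction

def removed_length_bound(n_terms):
    """Upper bound on the total length removed around the first n_terms rationals,
    assuming the n-th interval has radius 1/2**(n+2) and hence length 2/2**(n+2)."""
    return sum(Fraction(2, 2 ** (n + 2)) for n in range(1, n_terms + 1))

bound_after_10 = removed_length_bound(10)
```

Every partial sum stays strictly below 1/2, so at most half of [0, 1] is removed, leaving a set of outer content at least 1/2.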
This set also gives a counterexample to Hankel's contention that every pointwise
discontinuous function is Riemann integrable. If we define $\chi_S(x) = 1$ if $x \in S$, $= 0$
if $x \notin S$, then $\chi_S$ is continuous at every rational number and so is pointwise
discontinuous. But no matter how we partition the interval [0, 1], the subintervals
that contain points of S (and thus have oscillation 1) must have total length at
least 1/2. This function is not Riemann integrable over [0, 1].
Although S seems like a very sparse set, it does not fall into the category of
any of our characterizations of sparse sets. As we shall see in this chapter, it is not
countable. It does not have outer content 0. Since the outer content is not zero, it
cannot be first species. We need a new term to describe the way in which this set
is sparse.
A set T is dense in (a, b) if every open interval in (a, b) contains at least one
point of T. The set S is nowhere dense in (a, b) if and only if every open interval in
(a, b) contains an open subinterval with no points of S (see Exercise 4.1.3). Finite
sets are nowhere dense and so are discrete sets, sets such as N for which each
element is contained in a neighborhood that has no other elements of that set. But
there are also nowhere dense sets that are not discrete.
The confusion exhibited by Hankel and many of his contemporaries arises from
the attempt to connect the intuitive idea of a "sparse" set with any of the precise
definitions that were beginning to emerge. In some sense, to be countable is to
be sparse. Such a set, though infinite, is of a smaller order of infinity than the
continuum, R. But as we know with the rationals, a countable set can still be dense.
Mathematicians of the 1870s and into the 1880s would conflate the concepts of
discrete, nowhere dense, first species, and outer content zero, and often throw in the
assumption that such a set must be countable. It would take a while to straighten
these out and clarify how they are related.
The problem arises from the use of three incomparable ways of measuring the
size of a set: cardinality, density, and measure (which, for the moment, means
outer content). We shall straighten these out in Section 4.1. In Section 4.2 we
shall explain the disturbing implications for the fundamental theorem of calculus
that arise from the existence of nowhere dense sets with positive outer content.
Volterra's example of a bounded, pointwise discontinuous function that is not
Riemann integrable is built using such a set. In Section 4.3, we shall rely on our
improved understanding of nowhere dense sets to explore Osgood's justification
of term-by-term integration for any bounded series of continuous functions that
converge to a continuous function. Nowhere dense sets lie at the heart of these first
three sections, but as Osgood's proof makes very clear, they are an inconvenient
tool with which to explore analysis. I have included Osgood's proof for a specific
pedagogical purpose. In struggling with it, the reader is prepared to appreciate — as
did analysts of the early twentieth century — the incredible simplicity and clarity
of Lebesgue's approach. In Section 4.4 we shall explore Baire's insights into the
gulf that separates nowhere dense sets from intervals, insights that will restrict
the possibilities for the set of discontinuities of a function. His work marks the
culmination of our understanding of nowhere dense sets, preparing the ground for
and inspiring Lebesgue's development of measure theory.
Definition: Perfect
A set is perfect if it is equal to its derived set. In other words, S is perfect if and
only if every point of S is an accumulation point of S, and all accumulation points
of S are in S.
There are two big questions that we shall answer in this section:
1. Can a nonempty perfect set be nowhere dense? If so, then we would have a set
that is second species and nowhere dense.
2. Can a nowhere dense set have positive outer content? If so, then Hankel's proof
that pointwise discontinuous functions are Riemann integrable collapses. It
should be possible to find a function for which has positive outer content
even though it is nowhere dense.
As we shall see, bounded, perfect, nowhere dense sets can be constructed with any
outer content we wish, provided only that the outer content is strictly less than the
length of the interval that contains our set.
The first construction of a perfect, nowhere dense set was by the British math-
ematician Henry J. S. Smith (1826—1883) in 1875. Smith, who taught at Balliol
College in Oxford and was appointed Savilian professor of geometry in 1860, is
known primarily for his work in number theory. Not many mathematicians were
aware of Smith's construction, a fate that was shared by some of his other ground-
breaking work. Most of the exciting mathematics was happening in Germany and
France, and that is where attention was focused. In 1881, Vito Volterra showed how
to construct such a set, but Volterra was still a graduate student, and he published
in an Italian journal that was not widely read. Again, little notice was paid. Finally,
in 1883, Cantor rediscovered this construction for himself, and suddenly everyone
knew about it. Cantor's example is known as the Cantor ternary set. We shall use
this term to refer to Cantor's specific example, but the family of examples of
perfect, nowhere dense sets exemplified by the work of Smith, Volterra, and Cantor
will be referred to as the Smith–Volterra–Cantor sets, or SVC sets.
Figure 4.1. Construction of the Cantor ternary set by removal of middle thirds.
indefinitely (see Figure 4.1). We shall call the set of values that remain the Cantor
ternary set. What is left?
We clearly still have many of the rational numbers between 0 and 1 whose
denominators are powers of 3, the endpoints of the intervals we kept. It may seem
that that is all that we have, but there is more.
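The easiest way to see what is left is to consider the base 3 expansion of the real
numbers between 0 and 1. This uses the digits 0, 1, and 2, for example,
$$0.210201_3 = \frac{2}{3} + \frac{1}{3^2} + \frac{0}{3^3} + \frac{2}{3^4} + \frac{0}{3^5} + \frac{1}{3^6} = \frac{586}{729}.$$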
When we eliminate all values between 1/3 and 2/3, we are eliminating those
numbers between $0.1_3$ and $0.2_3$. In other words, we take out all values with a 1
in the thirds place and a nonzero digit after that. When we remove the intervals
(1/9, 2/9) and (7/9, 8/9), we are removing the values between $0.01_3$ and $0.02_3$ and
the values between $0.21_3$ and $0.22_3$. In other words, we remove those values with
a 1 in the ninths place and a nonzero digit somewhere after that. As we continue
our removal, what we are eliminating are the numbers with a 1 anywhere in the
base 3 expansion, provided the 1 is eventually followed by a nonzero digit.
We can simplify the description of the Cantor ternary set. Base 3 representations
that terminate can also be written with repeating 2s. Thus, we have
$$0.1_3 = 0.022222\ldots_3,$$
$$0.021_3 = 0.02022222\ldots_3.$$
We can define the elements of the Cantor ternary set as those numbers that can be
written in base 3 without using the digit 1.
It is now easy to find elements of the Cantor ternary set that are not rational
numbers with denominators that are powers of 3. For example,
$$0.020202\ldots_3 = \frac{2}{3^2} + \frac{2}{3^4} + \frac{2}{3^6} + \cdots = \frac{2}{9} \cdot \frac{1}{1 - 1/9} = \frac{2}{9} \cdot \frac{9}{8} = \frac{1}{4}.$$
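The base 3 description suggests a simple membership test for rational numbers, sketched below in Python. This is an illustrative approach rather than anything taken from the text: a point in [0, 1/3] is rescaled by x → 3x, a point in [2/3, 1] by x → 3x − 2, and a point strictly between 1/3 and 2/3 is rejected. For a rational x the rescaled values eventually repeat, at which point x has an expansion with no digit 1. The name `in_cantor_set` is hypothetical.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether the rational x lies in the Cantor ternary set."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:                # rationals cycle after finitely many steps
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x                   # the next base 3 digit can be taken as 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2               # the next base 3 digit can be taken as 2
        else:
            return False                # x lies in a removed open middle third
    return True

# The closed form of the geometric series for 0.020202..._3.
quarter = Fraction(2, 9) / (1 - Fraction(1, 9))
```

With this test, 1/4 and the endpoint 1/3 are accepted while 1/2 is rejected, in agreement with the discussion above.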
Proposition 4.1 (Properties of the Cantor Ternary Set). The Cantor ternary set,
C, is perfect, nowhere dense, and has outer content zero.
4.1 The Smith—Volterra—Cantor Sets 85
The SVC sets consist of those sets that are constructed by starting with a closed
interval and removing an open subinterval. One then removes an open subinterval
from each of the remaining subintervals and continues through an infinite sequence
of such removals, choosing the subintervals that are removed so that every open
subinterval of the original set overlaps with at least one of the subintervals that are
removed. The SVC set is the intersection of this countably infinite collection of the
sets that remain after each iteration. Every SVC set is closed and nowhere dense.
A particular family of SVC sets consists of those formed by, at the kth iteration,
removing an open interval of length $1/n^k$ from the center of each of the remaining
closed intervals. We shall denote the resulting set SVC(n), $n \geq 3$. The Cantor
ternary set will, from now on, be referred to as SVC(3).
This function maps SVC(3) onto all of [0, 1]. Since SVC(3) is also a subset of
[0, 1], this onto mapping implies that SVC(3) must have the same cardinality as
[0, 1], the cardinality c.
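The definition of DS does not survive in the text above, so the following Python sketch relies on an assumption: that DS is the usual devil's staircase, which reads the base 3 digits of x up to and including the first 1, replaces each 2 by 1, and interprets the result in base 2. The function name and the `digits` cutoff are choices made for this illustration.

```python
from fractions import Fraction

def devils_staircase(x, digits=40):
    """Approximate DS(x) for rational x in [0, 1] from its first base 3 digits,
    assuming DS is the standard Cantor function."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    y = Fraction(0)
    scale = Fraction(1, 2)
    for _ in range(digits):
        x *= 3
        d = x.numerator // x.denominator    # next base 3 digit: 0, 1, or 2
        x -= d
        if d == 1:
            return y + scale                # stop at the first digit 1
        y += scale * (d // 2)               # digit 2 becomes binary digit 1
        scale /= 2
    return y
```

Under this assumption the two endpoints of a removed interval, such as 1/3 and 2/3, or 7/9 and 8/9, receive the same value, and the point 1/4 of SVC(3) is sent to approximately 1/3.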
Note that if a and b are elements of SVC(3), $a < b$, then $DS(a) \leq DS(b)$.
Equality occurs if and only if a and b are the endpoints of one of the open intervals
that was removed to create SVC(3). Assume that a and b are the endpoints of one
of these intervals, and assume that they agree in the first n digits, $d_1$ through $d_n$. We can
represent these values by
at which we evaluate DS' to those points where it exists. But then the integral
of the derivative of DS is a constant function, not DS.
This suggests that for the evaluation part of the fundamental theorem of calculus,
$$F'(x) = f(x) \implies \int_a^b f(x)\,dx = F(b) - F(a),$$
we want to insist that the derivative of F must exist at all points in [a, b]. But this
is not the end of our troubles. As we shall see in the next section, we can use the
SVC sets to create examples of functions that are differentiable at every point in
[a, b], F'(x) = f(x), and yet
$$\int_a^b f(x)\,dx \neq F(b) - F(a).$$
Exercises
4.1.1. Prove that if $(a_1, a_2, a_3, \ldots)$ is any order of the rational numbers in [0, 1], then the
set of points that are not within $(1/2)^{n+2}$ of $a_n$ has outer content at least 1/2.
4.1.2. Give an example of a closed set that is not perfect.
4.1.3. By definition, if S is nowhere dense in (a, b), then it is not dense in any
interval contained in (a, b). Show that this holds if and only if every open interval
contained in (a, b) has an open subinterval that has no points of S.
4.1.4. Explain the connection between the fact that the set of rational numbers that
can be written with denominators that are powers of 10 is dense in R and the fact
that every real number has a decimal expansion.
4.1.5. Prove that every real number between 0 and 1 can be represented in a base
3 expansion using the digits 0, 1, and 2.
4.1.6. Prove that a set that is first species must be nowhere dense.
4.1.7. Prove that between any two numbers, there will always be a number with a
1 somewhere in its base 3 representation. Consider the set of numbers whose base
3 expansion requires using the digit 1. Show that this set is dense in R.
4.1.8. Prove that the Cantor ternary set has cardinality c.
4.1.9. Prove that no finite set can be perfect unless it is the empty set.
4.1.10. Prove that no countably infinite set can be perfect.
4.1.11. Prove that a set is nowhere dense if and only if its derived set is nowhere
dense.
4.1.12. Prove that the devil's staircase is continuous by explaining how to find
a response $\delta > 0$ to the challenge $\epsilon > 0$ so that if $|x - y| < \delta$, then
$|DS(x) - DS(y)| < \epsilon$.
4.1.13. Using the definition of the derivative, justify the assertion that the derivative
of the devil's staircase, DS, is not defined at any point of the Cantor ternary set,
SVC(3).
4.1.14. Let F denote the set of values in [0, 1] that can be written in base 5 without
the use of the digits 1 or 3. Thus, $1/5 = 0.1_5 = 0.0444\ldots_5$ is in F but $7/25 = 0.12_5$
is not. Describe the open intervals that are removed from [0, 1] to create F. Find
the outer content of F.
4.1.15. For the set F defined in Exercise 4.1.14, define a function, $DS_F$, that
takes each point in F, $x = 0.d_1d_2\ldots$, to $y = 0.e_1e_2\ldots$, where $e_i = d_i/2$. Thus,
can extend $DS_F$ to all of [0, 1] by defining it to equal $DS_F(a) = DS_F(b)$ on each
of the removed intervals (a, b). Sketch the graph of $DS_F$.
0.
So f is a function that is differentiable over the interval [-1, 1], but f' is not a
bounded function on this interval. The Riemann integral exists only for bounded
functions. Notice that if we treat $\int_0^1 f'(x)\,dx$ as an improper integral, then it does
satisfy the fundamental theorem of calculus:
$$\lim_{\epsilon \to 0^+} \int_\epsilon^1 \left( 2x \sin(x^{-2}) - 2x^{-1} \cos(x^{-2}) \right) dx = \lim_{\epsilon \to 0^+} x^2 \sin(x^{-2}) \Big|_\epsilon^1 = \sin(1). \qquad (4.1)$$
Figure 4.3. Graph of the derivative defined by $f'(x) = 2x \sin(x^{-2}) - 2x^{-1} \cos(x^{-2})$.
Similarly,
SVC(4)
The set SVC(3) was created by removing an interval of length 1/3, then two
intervals of length $1/3^2$, then four of length $1/3^3$, and so on. To create the set
SVC(4), we remove an open interval of length 1/4 from the middle of [0, 1],
4.2 Volterra's Function 91
then an open interval of length $1/4^2$ from each of the two remaining pieces, then
intervals of length $1/4^3$ from each of the four intervals that remain, and so on.
We do not have a nice characterization of SVC(4) comparable to the base 3
description of SVC(3), but we still wind up with a perfect, nowhere dense set (see
Exercises 4.2.4 and 4.2.5). Any finite collection of open sets that covers SVC(4)
must have lengths that add up to at least
$$1 - \frac{1}{4} - \frac{2}{4^2} - \frac{2^2}{4^3} - \frac{2^3}{4^4} - \cdots = 1 - \frac{1}{4}\left(1 + \frac{1}{2} + \frac{1}{2^2} + \cdots\right) = \frac{1}{2}.$$
We can find finite open covers of SVC(4) for which the sum of the lengths of the
intervals comes as close as we wish to 1/2, and therefore the outer content of this
set is 1/2 (see Figure 4.4).
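The construction can be followed step by step with exact arithmetic. The Python sketch below is an illustration; the function name `svc_stage` is hypothetical, and it assumes, as in the description above, that the kth step removes an open interval of length $1/n^k$ from the center of every remaining closed interval.

```python
from fractions import Fraction

def svc_stage(n, steps):
    """Closed intervals (lo, hi) that remain after `steps` removals for SVC(n)."""
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, steps + 1):
        gap = Fraction(1, n ** k)           # length removed from each piece
        next_intervals = []
        for lo, hi in intervals:
            mid = (lo + hi) / 2
            next_intervals.append((lo, mid - gap / 2))
            next_intervals.append((mid + gap / 2, hi))
        intervals = next_intervals
    return intervals

stage2 = svc_stage(4, 2)
remaining_length = sum(hi - lo for lo, hi in stage2)
```

After two steps for SVC(4) the remaining intervals are [0, 5/32], [7/32, 3/8], [5/8, 25/32], and [27/32, 1], with total length 5/8. After k steps the total length is $1/2 + 1/2^{k+1}$, which decreases to 1/2.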
Our next function has a derivative that exists and is bounded but is not continuous
at x = 0.
$$g(x) = x^2 \sin(x^{-1}), \qquad g(0) = 0.$$
This is very much like our previous function, but the derivative is now bounded,
Recall that a function is Riemann integrable if and only if for every $\sigma > 0$, the
set of points for which the oscillation exceeds $\sigma$ has outer content zero. For the
function g', the oscillation at x = 0 is 2. We have only a single point at which the
oscillation is positive, so this function is Riemann integrable. But what if we could
construct a function for which the oscillation at every point of SVC(4) is 2? That
would imply that the set of points at which the oscillation is greater than 1 does
not have zero content, and so the function cannot be Riemann integrable. Our basic
idea is to take copies of g and paste them into each of the intervals that have been
removed. The behavior of our new function at each of the endpoints of the removed
intervals will look just like the behavior of g at x = 0.
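The oscillation of g' at 0 can be seen numerically. The Python sketch below is an illustration that assumes the formula $g'(x) = 2x \sin(x^{-1}) - \cos(x^{-1})$ for $x \neq 0$, obtained by differentiating $g(x) = x^2 \sin(x^{-1})$, together with $g'(0) = 0$. It samples g' at the points $x_k = 1/(k\pi)$, which approach 0.

```python
import math

def g(x):
    """g(x) = x^2 sin(1/x) for x != 0, with g(0) = 0."""
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def g_prime(x):
    """Derivative of g: 2x sin(1/x) - cos(1/x) for x != 0, and 0 at x = 0."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# At x_k = 1/(k*pi) the sine term vanishes and cos(k*pi) = (-1)**k.
samples = [g_prime(1 / (k * math.pi)) for k in range(1, 51)]
```

The sampled values alternate between numbers very close to 1 and very close to -1, so every neighborhood of 0 contains points where g' is near 1 and points where it is near -1, while |g'(x)| never exceeds 2|x| + 1.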
Example 4.4. (Volterra's Function). We craft our function with some care. To
find the piece of the function that will go into the interval of length 1/4, we start
with our function g and find the largest value less than 1/8 at which g' is zero, call
it a1. We now define the function h1 (x) by (see Figure 4.5)
$$h_1(x) = \begin{cases} 0, & x < 3/8, \\ g(x - 3/8), & 3/8 \leq x \leq 3/8 + a_1, \\ g(a_1), & 3/8 + a_1 < x < 5/8 - a_1, \\ g(5/8 - x), & 5/8 - a_1 \leq x \leq 5/8, \\ 0, & x > 5/8. \end{cases} \qquad (4.3)$$
We have constructed this function so that it is differentiable at every point in [0, 1],
and the oscillation of $h_1'$ is 2 at both 3/8 and 5/8.
We now define $h_2$, which will be nonzero in the two intervals of length 1/16.
We first find $a_2$, the largest value less than 1/32 at which g' is zero. We then have
(see Figure 4.6)
$$h_2(x) = \begin{cases} 0, & x < 5/32, \\ g(x - 5/32), & 5/32 \leq x \leq 5/32 + a_2, \\ g(a_2), & 5/32 + a_2 < x < 7/32 - a_2, \\ g(7/32 - x), & 7/32 - a_2 \leq x \leq 7/32, \\ 0, & 7/32 < x < 25/32, \\ g(x - 25/32), & 25/32 \leq x \leq 25/32 + a_2, \\ g(a_2), & 25/32 + a_2 < x < 27/32 - a_2, \\ g(27/32 - x), & 27/32 - a_2 \leq x \leq 27/32, \\ 0, & x > 27/32. \end{cases} \qquad (4.4)$$
The derivative of h1 + h2 has oscillation 2 at 5/32, 7/32, 3/8, 5/8, 25/32, and 27/32
(see Figure 4.7).
We continue in this way. For each n we find $a_n$, the largest value less than
$1/2^{2n+1}$ at which g' is zero. We construct a function $h_n$ that is nonzero only inside
the intervals of length $4^{-n}$, and in each of those intervals it is two mirrored copies
of g over the interval $[0, a_n]$ connected by the constant function equal to $g(a_n)$.
Volterra's function is
$$V(x) = \sum_{n=1}^{\infty} h_n(x). \qquad (4.5)$$
$$V'(x) = \sum_{n=1}^{\infty} h_n'(x). \qquad (4.6)$$
If $x \in$ SVC(4), then V(x) = 0 and $|V(x) - V(y)| \leq (y - x)^2$. From the definition
of the derivative,
$$\lim_{y \to x} \frac{V(x) - V(y)}{x - y} = 0. \qquad (4.7)$$
This is also equal to the value of $\sum_{n=1}^{\infty} h_n'(x)$ at any x in SVC(4).
The oscillation of V' at any endpoint of one of the intervals is 2. Since every point
of SVC(4) is an accumulation point of the set of endpoints, every neighborhood
of a point in SVC(4) contains points where V' is 1 and points where V' is —1,
and thus we get oscillation 2 at every point of SVC(4). The function V' cannot be
integrated.
Notice that V' is pointwise discontinuous. Every open interval contains a point
(in fact, an entire open interval of points) at which V' is continuous. This is the
counterexample to Hankel's claim that every pointwise discontinuous function is
Riemann integrable.
Proof. Since S contains its accumulation points, it is closed and points a and b are in
S. By Theorem 3.5, the complement of S in [a, b], $S^c \cap [a, b]$, consists of a countable
union of disjoint open intervals. Since every point of S is an accumulation point
of S, a cannot be a left endpoint of one of these open intervals, b cannot be a right
endpoint of one of these open intervals, and no right endpoint of any of these intervals
is a left endpoint of another interval. If $S^c \cap [a, b]$ consisted of only finitely
many intervals, then S would contain a closed interval of positive length. Since S is
nowhere dense, the number of intervals in $S^c \cap [a, b]$ must be countably infinite.
We now show that S is the derived set of the set of endpoints of the disjoint open
intervals whose union is $S^c \cap [a, b]$. Let $(I_1, I_2, I_3, \ldots)$ be an ordering of these
disjoint open intervals, and consider the left endpoint and the right endpoint of each
interval $I_n$. The set of endpoints is contained in S and, since S is perfect, its
derived set is also in S. Since S is nowhere dense, if s is any element of S, every
neighborhood of s has a nonempty intersection with at least one of the intervals $I_n$,
and therefore an endpoint of this interval is in this neighborhood of s. It follows
that s is an accumulation point for the set of endpoints, and, therefore, S is the
derived set of the set of endpoints.
To prove that S is not countable, it is enough to find a one-to-one mapping from
$2^{\mathbf{N}}$ into S. Given an infinite sequence of 0s and 1s, $(m_1, m_2, m_3, \ldots)$, where $m_i = 0$
or 1, we create a sequence of nested intervals starting with $[a_0, b_0] = [a, b]$. Given
$[a_{i-1}, b_{i-1}]$, we find the first open interval in our ordered sequence, contained
in $[a_{i-1}, b_{i-1}]$. We define
[ai_i, if = 0,
if = 1.
We see that a is always a left endpoint. It
follows that and is strictly contained within $(a_i, b_i)$. The right-hand endpoints
of our nested intervals, $b_1 > b_2 > \cdots$, form a bounded, decreasing sequence
that converges to some $\beta \in S$. This $\beta$ is the image to which our sequence is mapped,
Theorem 4.2 has implications for the continuum hypothesis. If a perfect set is
dense in an interval (a, b), then it must contain every point in [a, b]. Therefore,
every nonempty perfect set has cardinality c. When this is combined with Cantor's
result (see p. 82) that every derived set is the union of a countable set and a perfect
set, we see that every derived set, and therefore every closed set, has cardinality that
is finite, equal to $\aleph_0$, or equal to c. Thus, if there is a subset of R with cardinality
strictly between $\aleph_0$ and c, then it is not closed.
In 1903, Young proved that a subset of R with cardinality strictly between $\aleph_0$
and c cannot be the intersection of a countable collection of open sets. In 1914,
Hausdorff extended this to exclude sets that are the union of a countable collection
of sets that are the intersection of a countable collection of open sets. Finally,
in 1916, Hausdorff and Alexandrov showed that a set with such an intermediate
cardinality cannot be a Borel set (see definition on p. 127).
SVC(n)
While there are many ways of constructing the countable disjoint open intervals
that constitute the complement of our perfect, nowhere dense set, the method used
for SVC(3) and SVC(4) will work for finding perfect, nowhere dense sets in [0, 1]
whose outer content comes as close to 1 as we wish. We define SVC(n) as the
set that remains after removing an interval of length $1/n$ centered at 1/2, then an
interval of length $1/n^2$ from the center of each of the two remaining intervals, then
intervals of length $1/n^3$ from the centers of each of the remaining four intervals,
and so on, leaving a set with outer content
$$1 - \frac{1}{n} - \frac{2}{n^2} - \frac{2^2}{n^3} - \cdots = 1 - \frac{1}{n}\cdot\frac{1}{1 - 2/n} = \frac{n-3}{n-2}.$$
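The geometric series behind the outer content of SVC(n) can be checked with exact arithmetic. The following is a minimal sketch, not part of the original text; the function names `svc_removed_length` and `svc_outer_content` are illustrative assumptions.

```python
from fractions import Fraction

def svc_removed_length(n, stages):
    """Total length removed from [0, 1] after `stages` steps of the SVC(n) construction."""
    # Step k removes 2**(k-1) intervals, each of length 1/n**k.
    return sum(Fraction(2 ** (k - 1), n ** k) for k in range(1, stages + 1))

def svc_outer_content(n):
    """Closed form 1 - (1/n) / (1 - 2/n) = (n - 3)/(n - 2), for n >= 3."""
    return Fraction(n - 3, n - 2)

for n in (3, 4, 10):
    remaining = 1 - svc_removed_length(n, 40)
    # The partial result after 40 stages is compared with the closed form.
    print(n, float(remaining), svc_outer_content(n))
```

The partial sums decrease toward the closed form, which is 0 for SVC(3) and 1/2 for SVC(4), in agreement with the values used elsewhere in this chapter.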
Exercises
4.2.1. For each of the following combinations, either give an example of a bounded,
nonempty set with these properties or explain why such a set cannot exist.
1. nowhere dense, first species, and outer content 0
2. nowhere dense, first species, and positive outer content
3. nowhere dense, second species, and outer content 0
4. nowhere dense, second species, and positive outer content
5. dense in some interval, first species, and outer content 0
6. dense in some interval, first species, and positive outer content
7. dense in some interval, second species, and outer content 0
8. dense in some interval, second species, and positive outer content
4.2.2. Find a perfect, nowhere dense subset of [0, 1] with outer content 9/10.
4.2.3. Show that if f is a nonconstant function that has a bounded derivative over
[a, b], and if f'is zero on a dense subset of [a, b], then f' cannot be integrable on
[a,b].
4.2.4. Show that SVC(4) is closed and that every point is an accumulation point of
this set.
4.2.5. Prove that every open interval contained in [0, 1] contains a subinterval with
no points of SVC(4), and therefore SVC(4) is nowhere dense.
4.2.6. For the function g in Example 4.3, prove that the oscillation of g' at x = 0
is 2.
4.2.7. Find the values of a1 and a2 to 10-digit accuracy, where a1 is the largest
number less than 1/8 for which g' is zero and a2 is the largest number less than
1/32 for which g' is zero.
4.2.8. Show that even though V'(x) = this series does not converge
uniformly.
4.2.9. Show that if S = SVC(3) and the intervals are ordered so that longer intervals
precede shorter, then the mapping described in the proof of Theorem 4.2 takes
$(0, 1, 0, 1, 0, 1, \ldots)$ to
$$\frac{2}{9} + \frac{2}{81} + \frac{2}{729} + \cdots = \frac{2}{9}\cdot\frac{1}{1 - 1/9} = \frac{1}{4}.$$
Explain why this mapping is independent of how we choose to order intervals of
the same length.
4.2.10. Show that the derived set of the set described in Exercise 3.2.14 (p. 71) is
perfect and nowhere dense. Describe the countable union of open sets for which
this derived set is the complement in $[-1/2, 1/2]$.
4.2.11. Of the types of sets listed in Exercise 4.2.1 that do exist, which can be
countable?
4.2.12. Give an example of a bounded, countable, nowhere dense set that has
positive outer content.
4.2.13. Let S be a bounded set with exactly one accumulation point, a. Define $S_n$
to be the set of points in S that are at least 1/n away from a. Use the fact that
$S - \{a\} = \bigcup_n S_n$ to prove that S is countable.
4.2.14. Using induction on the type, prove that any first species set is countable. It
is possible to mimic the proof from Exercise 4.2.13 and define to be the set of
points in S that are at least 1/n from any of the accumulation points of 5, but the
statement
S — 5' =
4.2.15. One assumption that is sufficient to make the evaluation part of the
fundamental theorem of calculus correct for Riemann integrals is to assume that
$F' = f$ is continuous: If $F' = f$ where $f$ is continuous, then
$\int_a^b f(t)\,dt = F(b) - F(a)$.
Explain why this assumption eliminates Volterra's counter-example.
4.2.16. Show that while the assumption $F' = f$ is continuous may be sufficient
to imply the evaluation part of the fundamental theorem of calculus (see
Exercise 4.2.15), it is not a necessary condition.
4.2.17. Following Weierstrass, we can modify the definition of the Riemann
integral by taking our Riemann sums, $\sum_j f(x_j^*)(x_j - x_{j-1})$, with $x_j^*$ restricted to
be a point of continuity of $f$ in the interval $[x_{j-1}, x_j]$. Show that even with this
modified definition, the derivative of Volterra's function is not Riemann integrable.
Heine popularized this result and clarified the distinction between pointwise and
uniform convergence in Die Elemente der Functionenlehre (The Elements of Func-
tion Theory) published in 1872. Uniform convergence is sufficient for term-by-term
integration, but it was clear that it was not necessary. Many series that do not con-
verge uniformly still allow for term-by-term integration. As we saw in Section 2.3,
Heine tried to work around the nonuniform convergence by isolating a small set
4.3 Term-by-Term Integration 99
that was problematic and focusing on its complement where convergence would
be uniform.
As would be realized eventually, this problematic set has to be closed and
nowhere dense. In the early 1870s, it was hopefully believed that this meant it
was a small set. But as the work of Smith, Volterra, and Cantor showed, a closed,
nowhere dense set can be very large and can in fact have outer content as close as
desired to the length of the entire interval in which it lies.
In the 1880s, Paul du Bois-Reymond tackled the problem of term-by-term in-
tegration, proving in 1883 that any Fourier series of an integrable function can
be integrated term by term. In 1886, he published results on the general problem
and focused attention on those values of x for which convergence is uniform in-
side some neighborhood of x. His approach was picked up in 1896 by William F.
Osgood whose work we will study in detail.
Paul du Bois-Reymond (1831—1889) was German. His father had moved to
Germany from Neuchâtel in francophone Switzerland. He received his doctorate
under the direction of Ernst Kummer at the University of Berlin in 1853 and held
positions at a succession of universities including Heidelberg, Freiburg, Tübingen,
and finally at the Technische Hochschule Charlottenburg (Charlottenburg Institute
of Technology) in Berlin. Otto Hölder, whom we will meet later, was one of his
doctoral students in Tübingen.
William F. Osgood (1864—1943) is one of the few Americans to feature in this
story. He went to Germany for his graduate work, studying with Max Noether
at Erlangen and earning his doctorate in 1890. He returned to spend his career
teaching at Harvard.
To clarify the problems with which du Bois-Reymond and Osgood had to deal, it
is useful to consider some examples. Rather than working with series, it is simpler
if we work with sequences of integrable functions, and ask whether
$$\int_a^b \left(\lim_{n\to\infty} S_n(x)\right) dx = \lim_{n\to\infty} \left(\int_a^b S_n(x)\,dx\right).$$
This is equivalent to working with series for we can always define
(4.8)
— otherwise.
Each is discontinuous on a finite set of points, so each is integrable. The
function these approach is Dirichlet's function (Example 1.1), which is not Riemann
integrable.
For our purposes, we shall assume that the functions in the sequence as well
as the limiting function are all continuous. In this case, we lose no generality if
we restrict ourselves to sequences that converge to 0. Finally, we assume that the
interval over which we integrate is [0, 1].
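$$\int_0^1 \left(\lim_{n\to\infty} A_n(x)\right) dx = 0.$$
[Figure: graphs of the functions $A_n$; the curve for $n = 3$ is labeled.]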
and therefore
$$\lim_{n\to\infty} \left(\int_0^1 A_n(x)\,dx\right) = \lim_{n\to\infty} \left(\frac{1}{2} - \frac{1}{2e^{n}}\right) = \frac{1}{2}.$$
The integral of the limit is not equal to the limit of the integral. If we convert this
into a series,
$$A_n(x) = \sum_{k=1}^{n} \left(kxe^{-kx^2} - (k-1)xe^{-(k-1)x^2}\right),$$
The sequence $(A_n)$ does not converge uniformly, so the ability to interchange
limit and integral is not guaranteed, but there are examples of sequences for which
the convergence is not uniform, and yet the integral of the limit is equal to the limit
of the integral.
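The failure of the interchange can be seen numerically. The sketch below assumes that the sequence in this example is $A_n(x) = nxe^{-nx^2}$, which is what the terms of the telescoping series suggest; the helper names are illustrative and not taken from the text.

```python
import math

def A(n, x):
    """Assumed form of the example sequence: A_n(x) = n x exp(-n x^2)."""
    return n * x * math.exp(-n * x * x)

def integral_A(n):
    """Exact integral of A_n over [0, 1]: (1 - exp(-n)) / 2."""
    return (1.0 - math.exp(-n)) / 2.0

def midpoint_integral(f, a, b, steps=100_000):
    """Plain midpoint rule, used only as an independent check of the exact value."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 100):
    numeric = midpoint_integral(lambda x: A(n, x), 0.0, 1.0)
    # Value at a fixed point shrinks to 0 while the integral approaches 1/2.
    print(n, A(n, 0.5), numeric, integral_A(n))
```

At any fixed $x > 0$ the values shrink toward 0, while the integrals over [0, 1] approach 1/2.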
$$B_n(x) = \frac{n^2 x}{1 + n^3 x^2}.$$
Again we have $\lim_{n\to\infty} B_n(x) = 0$ at every $x$, and therefore,
$$\int_0^1 \left(\lim_{n\to\infty} B_n(x)\right) dx = 0.$$
[Figures: graphs of the sequences; a curve for $n = 3$ is labeled.]
$$\int_0^1 \frac{n^2 x}{1 + n^3 x^2}\,dx = \frac{\ln(1 + n^3)}{2n},$$
and therefore
$$\lim_{n\to\infty} \left(\int_0^1 B_n(x)\,dx\right) = \lim_{n\to\infty} \frac{\ln(1 + n^3)}{2n} = 0. \tag{4.9}$$
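$$\int_0^1 \left(\lim_{n\to\infty} C_n(x)\right) dx = 0.$$
$$\int_0^1 \frac{nx}{1 + n^2 x^2}\,dx = \frac{\ln(1 + n^2)}{2n},$$
and, therefore,
$$\lim_{n\to\infty} \left(\int_0^1 C_n(x)\,dx\right) = \lim_{n\to\infty} \frac{\ln(1 + n^2)}{2n} = 0.$$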
Interchanging limits and integrals works in these last two cases despite the fact
that convergence is not uniform. What is different about them?
We shall not be able to explain why the interchange works for $(B_n)$ until we have
the tools of Lebesgue integration in hand, but we can tackle $(C_n)$ now. What is
most noticeable about the sequence $(C_n)$ is that these functions stay bounded. In
the case of the $A_n$, we were able to get convergence to 0 and still have the area
under the graph of $y = A_n(x)$ increase toward 1/2 because the maximal values
of the functions were increasing. This did not interfere with the convergence to
0, because the location of that maximum kept moving left, approaching x = 0.
Whatever positive value of x we might choose, eventually the maximum will occur
to the left of it, and from then on the sequence approaches 0. But if our functions
are all bounded, then we cannot use that trick.
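The contrast between growing and bounded maxima can be tabulated directly. This is a rough sketch under the assumption that the three example sequences are $A_n(x) = nxe^{-nx^2}$, $B_n(x) = n^2x/(1 + n^3x^2)$, and $C_n(x) = nx/(1 + n^2x^2)$, forms inferred from the integrals computed above; `grid_max` is a hypothetical helper that searches an even grid rather than solving for the maximum exactly.

```python
import math

# Assumed forms of the three example sequences (reconstructed, not quoted).
def A(n, x):
    return n * x * math.exp(-n * x * x)

def B(n, x):
    return n ** 2 * x / (1 + n ** 3 * x * x)

def C(n, x):
    return n * x / (1 + n ** 2 * x * x)

def grid_max(f, n, points=20_001):
    """Largest value of f(n, x) over an even grid on [0, 1], and the grid point where it occurs."""
    xs = (i / (points - 1) for i in range(points))
    best_x = max(xs, key=lambda x: f(n, x))
    return f(n, best_x), best_x

for n in (4, 25, 100):
    # Each entry is (maximum value, location of the maximum).
    print(n, grid_max(A, n), grid_max(B, n), grid_max(C, n))
```

Under these assumptions the maxima of $A_n$ and $B_n$ grow while sliding toward $x = 0$, whereas the maximum of $C_n$ stays at 1/2.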
Definition: Γ-Points
Given a sequence of functions, $(f_1, f_2, \ldots)$, that converges pointwise to 0, and
any $\alpha > 0$, we define $\Gamma_\alpha$ to be the set of $x$ such that given any integer $m$ and any
neighborhood of $x$, $N_\delta(x)$, there is an integer $n \ge m$ and a point $y \in N_\delta(x)$ for
which $|f_n(y)| > \alpha$. We call $x$ a Γ-point if it is an element of $\Gamma_\alpha$ for some $\alpha > 0$.
of $x_0$. Since there is an element $x \in \Gamma_\alpha$ that lies in the open set $N_\delta(x_0)$, we may
choose a neighborhood of $x$, $N_{\delta'}(x)$, which is entirely contained within $N_\delta(x_0)$.
For any integer $m$, there is an $n \ge m$ and a point $y \in N_{\delta'}(x) \subset N_\delta(x_0)$ for which
$|f_n(y)| > \alpha$. Therefore $x_0$ is also in $\Gamma_\alpha$.
To prove that $\Gamma_\alpha$ is nowhere dense, we assume that it is dense in some open
interval $(a_1, b_1)$ and look for a contradiction. We can find an integer $n_1$ and a
point $y_1 \in (a_1, b_1)$ for which $|f_{n_1}(y_1)| > \alpha$. Since $f_{n_1}$ is continuous, there must
be an open interval $(a_2, b_2)$ containing $y_1$ and contained in $(a_1, b_1)$ over which
the absolute value is larger than $\alpha/2$. Since $\Gamma_\alpha$ is dense in $(a_2, b_2)$, there is an
integer $n_2 > n_1$ and a point $y_2 \in (a_2, b_2)$ for which $|f_{n_2}(y_2)| > \alpha$. We can find a
neighborhood of $y_2$, $(a_3, b_3)$, $a_2 < a_3 < b_3 < b_2$, over which $|f_{n_2}| > \alpha/2$.
Continuing in this way, we generate an increasing sequence of integers,
$(n_1 < n_2 < n_3 < \cdots)$, and a sequence of nested intervals, $(a_1, b_1) \supset (a_2, b_2) \supset
(a_3, b_3) \supset \cdots$, for which $y \in (a_{k+1}, b_{k+1})$ implies that $|f_{n_k}(y)| > \alpha/2$. Since we have
strict containment, $a_{k-1} < a_k < b_k < b_{k-1}$, we can consider the closed intervals,
$$(a_1, b_1) \supset [a_2, b_2] \supset [a_3, b_3] \supset \cdots,$$
where $y \in [a_{k+1}, b_{k+1}]$ implies that $|f_{n_k}(y)| \ge \alpha/2$.
By the nested interval principle, there is a point $c$ contained in all of these
intervals. Since $f_n(c)$ converges to 0, there is an $N$ such that $n \ge N$ implies that
$|f_n(c)| < \alpha/2$. Choose any $n_k \ge N$. This gives our contradiction because $c$ is in
$[a_{k+1}, b_{k+1}]$ and, therefore, $|f_{n_k}(c)| \ge \alpha/2$. □
Is Boundedness Sufficient?
As du Bois-Reymond knew by the time he was working on the problem of term-
by-term integration, closed nowhere dense sets can be quite large. As we saw in
the last section, there are closed and nowhere dense subsets of [0, 1] with outer
content as close to 1 as we wish. Because of the difficulty of working with such
sets, du Bois-Reymond was never certain whether or not uniform boundedness,
combined with continuity, would be enough to allow term-by-term integration.
He died before Osgood answered this question. Unbeknownst to either du Bois-Reymond
or Osgood, a mathematician at the University of Bologna, Cesare Arzelà
(1847–1912), had proven in 1885 that, for any convergent sequence of integrable
and uniformly bounded functions, the limit of the integrals is equal to the integral of
the limit. Arzelà, who had studied with Ulisse Dini in Pisa, anticipated many of the
results in analysis that others would discover, but his work was not widely known.
We shall follow Osgood's proof that relies on continuity because it is simpler and
ties directly to our study of perfect, nowhere dense sets. In fact, the only place that
Osgood used continuity was in the proof of Proposition 4.3.
Osgood's proof is more difficult to read than it needs to be because he did not
have access to the Heine—Borel theorem. Borel had published that result a year
earlier, but it would be another five years before Heine—Borel would be recognized
as the powerful tool that it is. Because of this, Osgood relied on the nested interval
principle that, as we have seen, is equivalent to the Heine—Borel theorem, but is
less well suited for Osgood's needs. To simplify matters, we shall use Heine—Borel
at the critical points in this proof.
The first time we use Heine—Borel is to prove Osgood's lemma. In the statement
of this lemma he assumed that the set G is closed, bounded, and nowhere dense.
In fact, he did not need the assumption that G is nowhere dense, so we shall prove
a more general form of his lemma. Recall that $c_e$ is the outer content (p. 44).
Lemma 4.4 (Osgood's Lemma). Let $G$ be a closed, bounded set and let
$G_1, G_2, \ldots$ be subsets of $G$ such that
$$G_1 \subseteq G_2 \subseteq \cdots \quad\text{and}\quad \bigcup_k G_k = G.$$
It follows that
$$\lim_{k\to\infty} c_e(G_k) = c_e(G).$$
As Osgood points out, we really need $G$ to be closed and bounded. For example,
if $G = \mathbb{Q} \cap [0, 1]$, we can let $G_k$ be the set of rational numbers between 0 and 1
with denominators less than or equal to $k$. In this case, $c_e(G_k) = 0$ for all $k$, but
$c_e(G) = 1$.
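The reason each $G_k$ in this counterexample has outer content 0 is that it is a finite set, so it can be covered by finitely many intervals of arbitrarily small total length. A brief sketch, with the helper names `G`, `small_cover`, and `total_length` chosen here purely for illustration:

```python
from fractions import Fraction

def G(k):
    """Rationals in [0, 1] whose denominator in lowest terms is at most k."""
    return sorted({Fraction(p, q) for q in range(1, k + 1) for p in range(q + 1)})

def small_cover(points, eps):
    """One open interval of length eps/len(points) centered on each point."""
    half = Fraction(eps) / (2 * len(points))
    return [(x - half, x + half) for x in points]

def total_length(intervals):
    return sum(b - a for a, b in intervals)

for k in (1, 2, 5, 10):
    cover = small_cover(G(k), Fraction(1, 1000))
    # The set grows with k, yet the cover never needs more than total length eps.
    print(k, len(G(k)), total_length(cover))
```

No finite cover of this kind exists for all of $\mathbb{Q} \cap [0, 1]$, since a finite union of intervals containing a dense subset of [0, 1] must have total length at least 1.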
Proof. Given an arbitrary $\epsilon > 0$, we must show that there is a response $N$ so that
$n \ge N$ implies that $c_e(G_n) \le c_e(G) < c_e(G_n) + \epsilon$. The first inequality follows from
the fact that $G_n \subseteq G$. Since $c_e(G_n) \ge c_e(G_N)$, the second inequality will follow if
we can show that $c_e(G) < c_e(G_N) + \epsilon$.
Let $U_k$ be a finite union of disjoint, open intervals that contains $G_k$ and such
that the sum of the lengths of the intervals is strictly less than $c_e(G_k) + \epsilon/2^k$. The
collection $\{U_k\}$ is an open cover of $G$. By the Heine–Borel theorem, it has a
finite subcover. Let $N$ be the largest subscript in this finite subcover.
If $U_k$ is in the finite subcover of $G$, $k < N$, then we divide it into two disjoint
sets: $U_k \cap U_N$ and $U_k - U_N$, the part of $U_k$ in $U_N$ and the part not
contained in $U_N$. Since both $U_k$ and $U_N$ are finite unions of open intervals, $U_k - U_N$ is
a finite union of intervals. (They might be open, closed, or half open.) Since both
$U_N$ and $U_k$ contain $G_k$, the sum of the lengths of the intervals in $U_k \cap U_N$ is at least
$c_e(G_k)$, and therefore the sum of the lengths of the intervals in $U_k - U_N$ is strictly less
than $\epsilon/2^k$. Because $U_k - U_N$ is a finite union of intervals, if any of the intervals are not
open, we can replace them by slightly larger open intervals and still keep the sum
of the lengths strictly less than $\epsilon/2^k$. We denote this finite union of open intervals
containing $U_k - U_N$ by $V_k$.
Let $U$ denote the union of $U_N$ with all of the $V_k$, $k < N$, for which $U_k$ is
in the finite subcover of $G$. The set $U$ is still a finite union of open intervals that
contains $G$, and we have the desired bounds,
$$c_e(G) < \left(c_e(G_N) + \frac{\epsilon}{2^N}\right) + \sum_{k<N} \frac{\epsilon}{2^k} < c_e(G_N) + \epsilon.$$
Proof. We need to show that for any $\alpha > 0$, we can find an $N$ such that $n \ge N$
implies that
$$\left|\int_0^1 f_n(x)\,dx\right| < \alpha.$$
To do this, we shall separate the points in $\Gamma_{\alpha/2}$ from those that are not. The
proof presented here is a modified version of Osgood's proof, recast so as to take
maximum advantage of the Heine–Borel theorem.
From the definition of outer content, we can find a finite union of open intervals
that contains $\Gamma_{\alpha/2}$ and for which the sum of the lengths of the intervals is as close
as we wish to the outer content of $\Gamma_{\alpha/2}$. We shall call the union of these open intervals
$U$. The complement of $U$ is a finite union of closed intervals, some of which might
be single points. We shall use $C$ to denote the intersection of this complement with
[0, 1], still a finite union of closed intervals. The theorem now breaks into two
parts, limiting the size of the integral over $U$ and limiting the size of the integral
over $C$. First, we need to specify our choice of $U$. For each $x \in \Gamma_{\alpha/2}$, we know that
$\lim_{n\to\infty} f_n(x) = 0$, so we can find an $N$ so that $n \ge N$ implies that $|f_n(x)| < \alpha/2$.
Define $G_i$ to be the set of $x \in \Gamma_{\alpha/2}$ for which $n \ge i$ implies that $|f_n(x)| < \alpha/2$. We
see that
$$G_1 \subseteq G_2 \subseteq \cdots \quad\text{and}\quad \bigcup_i G_i = \Gamma_{\alpha/2}.$$
(x 1) U N8 (x2) U . U N8 (xv) D C.
We now consider the integral over $U$ of any $|f_n|$ with $n \ge K$. We know that the
sum of the lengths of the intervals in $U$ is less than $c_e(G_K) + \alpha/2B$. If we take
any partition $P$ of $U$, the sum of the lengths of the intervals that contain points in
$G_K$ must be at least $c_e(G_K)$. The infimum of the values of $|f_n|$ on these intervals
is strictly less than $\alpha/2$. On all other intervals in this partition, the value of $|f_n|$ is
still bounded by $B$. This implies that we have the following bound on the lower
Darboux sum for this partition:
Since this inequality holds for all lower Darboux sums, and we know that the
function is Riemann integrable, this also provides an upper limit for the Riemann
integral of I I
over U:
<f (4.11)
Combining equations (4.10) and (4.11) and using the fact that
$c_e(C) + c_e(G_K) \le c_e([0, 1]) = 1$, we see that if $n > \max\{A, K\}$, then
(4.12)
Exercises
4.3.1. Prove that for any k 1,
urn =0.
fl
4.3.2. Evaluate (k —
dx, k> 1, and then show that
0O 1
dx) =
(f
—
— (k
4.3.3. In the proof of Proposition 4.3, where does the assumption that $\Gamma_\alpha$ is dense
in an open interval, that is, the negation of the conclusion, actually get used?
F1DF2DF3D..., flFk=F,
lim Ce(Fk) Ce (n Fk
\k
4.3.7. Show that if $F_1 \supseteq F_2 \supseteq F_3 \supseteq \cdots$, where $F_1$ is bounded, then
$$\lim_{k\to\infty} c_e(F_k) \ge c_e\left(\bigcap_k F_k\right).$$
4.3.8. Show that $x$ is not a Γ-point if and only if for each $\alpha > 0$, there is a
neighborhood of $x$, $N_\delta(x)$, and an integer $N$ such that for all $y \in N_\delta(x)$
and all $n \ge N$, $|f_n(y)| < \alpha$.
4.3.9. Consider the sequence of functions defined on [—1, 1] by
0, if x =Oor lxi >2/n,
1
= — sin(7r/x), 0 < xl < 1/n,
n
sin(7r/x), 1/n lxi 2/n.
Show that this sequence converges to 0 for all $x \in (-1, 1)$. Show that this
convergence is not uniform in any ε-neighborhood of 0. For what values of $\alpha > 0$ is 0 in
$\Gamma_\alpha$?
4.3.10. Consider the sequence of functions, defined on (—1, 1) by =
0 and for x 0,
gn(x)=
1 ri 1\
I(Ixl——Im(m—1)I
112 1 1
lfl?2.
m—1L\ mJ J m rn—i
Show that this sequence converges to 0 for all $x \in [-1, 1]$. Show that this
convergence is not uniform in any ε-neighborhood of 0. Show that 0 is an accumulation
point of Γ-points, but it is not a Γ-point. This demonstrates that while each set $\Gamma_\alpha$
is closed, the union of these sets need not be closed.
where
E(n,a)= {x e ?aj,
then
pb pb
/ f(x)dx = lim / (4.14)
Ja
and a finite union of open intervals, U, that contains E(n, a) and whose outer
content is less than E/2B. We then have that
b b b
f(x) dx
f f f(x) dx
=f
[a,b]—U
The first explicit example of such a sequence was given by René Baire in his
doctoral dissertation of 1899, (Example 4.5 on p. 100).
René-Louis Baire (1874—1932) entered the École Normale Supérieure as an
undergraduate in 1892 and earned his doctorate there in 1899. He went on to teach
at the University of Montpellier in 1902 and Dijon in 1905. Baire suffered from both
physical and psychological disorders that became progressively debilitating. By
1914, they completely prevented him from teaching or continuing his mathematics.
He spent his last years in bitter solitude.
Baire's dissertation had a profound effect on Henri Lebesgue and the further
development of integration. Baire's thesis, Sur les fonctions de variables réelles
(On functions of real variables), clarified the intimate connection between the
structure of the real numbers and properties of functions. In the process, he made it
very clear that outer content is a fundamentally flawed way of measuring the size
of a set. We begin with the central result of his thesis.
We have seen that nowhere dense sets can be quite large in the sense that their
outer content can be as close as desired to the length of the interval in which they
lie. Baire realized that not even a countable union of them could fill that interval.
This is called the Baire category theorem because of Baire's definition.
Definition: Category
A set is of first category if it is a countable union of nowhere dense sets. A set
that is not of first category is said to be of second category.
Thus, the Baire category theorem is more succinctly put as: "Every open interval
is of second category."
Proof. We lose no generality if we assume that our interval is (0, 1). Let $(S_n)$ be
a sequence of nowhere dense subsets of (0, 1). We must show that there is at least
one $x \in (0, 1)$ that is not in the union of the $S_n$.
Since $S_1$ is nowhere dense, we can find an open subinterval of (0, 1) that contains
no points of $S_1$. If necessary, we come in slightly from each endpoint to find a closed
interval $[a_1, b_1] \subset (0, 1)$, $a_1 < b_1$, that contains no points of $S_1$. Since $S_2$ is nowhere
dense, we can find a subinterval $[a_2, b_2] \subset (a_1, b_1)$, $a_2 < b_2$, that contains no points
of $S_2$. In general, once we have defined $[a_{n-1}, b_{n-1}]$, $a_{n-1} < b_{n-1}$, we choose a
subinterval $[a_n, b_n] \subset (a_{n-1}, b_{n-1})$, $a_n < b_n$, that contains no points of $S_n$. By the
nested interval principle, the intersection $\bigcap_n [a_n, b_n]$ contains at least one point,
and this point is not in any of the $S_n$. □
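The nested-interval construction in this proof can be carried out explicitly in the simplest case, where each nowhere dense set is a single point. The sketch below makes that simplifying assumption, taking $S_n = \{r_n\}$ for a finite list of rationals; the rule of choosing between the second and fourth fifths of the current interval is an illustrative device, not the author's.

```python
from fractions import Fraction

def shrink_avoiding(a, b, point):
    """Pick a closed interval inside the open interval (a, b) that misses `point`."""
    w = (b - a) / 5
    left = (a + w, a + 2 * w)         # second fifth of (a, b)
    right = (a + 3 * w, a + 4 * w)    # fourth fifth of (a, b)
    # The two candidates are disjoint, so `point` lies in at most one of them.
    return right if left[0] <= point <= left[1] else left

def nested_intervals(points):
    """Successive closed intervals, each strictly inside the previous one."""
    a, b = Fraction(0), Fraction(1)
    history = []
    for r in points:
        a, b = shrink_avoiding(a, b, r)
        history.append((a, b))
    return history

# Illustrative choice of nowhere dense sets: the singletons {r_n}, with r_n
# running over the rationals in (0, 1) with denominator at most 12.
rationals = sorted({Fraction(p, q) for q in range(2, 13) for p in range(1, q)})
steps = nested_intervals(rationals)
a, b = steps[-1]
print(float(a), float(b))
```

Every point of the final interval avoids all of the listed rationals. Carried on forever over an enumeration of all rationals, the same process leaves a point that must be irrational.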
This is not a hard proof. Baire's genius lay in recognizing how important this
simple observation can be.
Proof. Let $S$ be of first category. Any subset of a first-category set is again of first
category (Exercise 4.4.7), so the intersection of $S$ with any open interval is of first
category. It follows that every open interval contains a point of $S^c$, and thus $S^c$ is
dense in $\mathbb{R}$. It is left for Exercise 4.4.8 to show that the intersection of $S^c$ with any
open interval cannot be countable.
Now we get to the heart of what interested Baire, the characterization of discon-
tinuous functions. Recall Hankel's distinction (p. 45) between totally discontinuous
functions, such as Dirichlet's function, Example 1.1, which is discontinuous at every
point, and pointwise discontinuous functions, such as Riemann's function,
4.4 The Baire Category Theorem 113
Example 2.1, which is discontinuous at every rational point with even denominator
but is still continuous at all other points.
It will be convenient to follow Baire and consider the continuous functions as a
subset of the pointwise discontinuous functions. A continuous function is simply
a pointwise discontinuous function for which the set of points of discontinuity has
shrunk all the way down to the empty set.
Proof One direction is easy and follows from Corollary 4.8. If the set of discon-
tinuities is of first category, then its complement, the set of points at which the
function is continuous, is dense.
For the other direction, we assume that f is pointwise discontinuous and show
this implies that the set of discontinuities is of first category. We begin by recalling
(Proposition 2.4) that a function f is continuous at c if and only if the oscillation
of $f$ at $c$, $\omega(f; c)$, is zero. Let
$$P_k = \{x \mid \omega(f; x) \ge 1/k\}.$$
Ix - cI <6 f(x) -
<4(k +
Now consider the interval (c — 6/2, c + 6/2). If x is in this interval and Ix —
6/2, then Iy — cI <6 and
This implies that the oscillation of f at x must be less than or equal to 1/(k + 1) <
1/k. We have shown that Pk is nowhere dense, and therefore the set of points of
discontinuity is of first category.
able collection of pointwise discontinuous functions on (a, b). There are uncount-
ably many points in (a, b) at which all of these functions are continuous.
Proof By Corollary 4.9, the set of points of discontinuity for each function is
a set of first category. A countable union of sets of first category is a countable
union of countable unions of nowhere dense sets, so it is also of first category. By
Corollary 4.8, there are uncountably many points of (a, b) not in this union. E
Definition: Class
Continuous functions constitute class 0. Pointwise discontinuous functions that
are not continuous constitute class 1. Inductively, if f is the limit of functions in
class $n$ but it is not in any class $k \le n$, then we say that $f$ is in class $n + 1$.
0, otherwise.
This function is not continuous, but it is the limit of continuous functions, so it is
in class 1. We now take the limit
f(x) fk(x)
= =I
This is Dirichlet's function which we know is not in class 1. It must be in class 2.
Are there functions in class 3, 4, .. . up to any finite number? Are there functions so
discontinuous that they are not in any finite class? In 1905, Henri Lebesgue would
show that the answer to both questions is "yes."
= U E f(xi) —
Discontinuities of Derivatives
We conclude this section with a corollary of Theorem 4.11. It was Darboux who
first observed that if a derivative is discontinuous, then its discontinuities must
be like those in the derivative of Volterra's function. Even though $\lim_{x\to c} f'(x)$
does not exist, $f'$ must still satisfy the intermediate value property. Baire showed
that Volterra's example illustrates the worst possible case in terms of the size and
density of the set of discontinuities of f'.
Proof. Let $f$ be differentiable on $(a, b)$, and let $f'$ be its derivative. Since $f$ is
differentiable, it is continuous on $(a, b)$. For each $k \ge 1$, the function defined by
$$f_k(x) = \frac{f(x + 1/k) - f(x)}{1/k}$$
is also continuous on $(a, b - 1/k)$. Choose a positive integer $K$. Since $f$ is
differentiable, $\lim_{k\to\infty} f_k(x)$, $k \ge K$, exists and equals $f'(x)$ for all $x \in (a, b - 1/K)$.
By Theorem 4.11, $f'$ is pointwise discontinuous on $(a, b - 1/K)$. Its points of
discontinuity form a set of first category. Since this is true for every $K \ge 1$, the set
of points of discontinuity of $f'$ on
$$(a, b) = \bigcup_{K} (a, b - 1/K)$$
is also of first category. Therefore, $f'$ is pointwise discontinuous, which implies
that it is of class either 0 or 1. □
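The difference quotients $f_k$ used in this proof can be watched converging to a discontinuous derivative. The sketch below uses the standard function $f(x) = x^2\sin(1/x)$ with $f(0) = 0$, chosen here as an illustration; its derivative exists everywhere but is discontinuous at 0.

```python
import math

def f(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def f_prime(x):
    """Derivative of f: 2x sin(1/x) - cos(1/x) away from 0, and 0 at x = 0."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x) if x != 0 else 0.0

def f_k(k, x):
    """Difference quotient with step 1/k, as in the proof."""
    return (f(x + 1 / k) - f(x)) / (1 / k)

for k in (10, 1000, 100000):
    # First column: f_k(0), which tends to f'(0) = 0.
    # Second column: the error f_k(0.3) - f'(0.3).
    print(k, f_k(k, 0.0), f_k(k, 0.3) - f_prime(0.3))
```

Each $f_k$ is continuous, yet the limit $f'$ oscillates between values near $-1$ and $1$ in every neighborhood of 0.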
Exercises
4.4.1. Consider the sequence of functions defined in Example 4.5 on p. 100.
Describe the sets
E(n,a)= {x E [0,
Find the value of ce (E(n, a)).
4.4.2. Give an example of a sequence of integrable functions on [0, 1] that converge
to an integrable function and such that
$$\lim_{n\to\infty} c_e(E(n, \alpha)) = 0,$$
4.4.3. Prove that any finite union of nowhere dense sets is nowhere dense.
4.4.4. Let C1 denote the Cantor ternary set, C1 = SVC(3). Let C2 be the subset of
[0, 1] formed by putting a copy of C1 inside every open interval in [0, 1] — C1. Let
C3 be the subset of [0, 1] formed by putting a copy of C1 inside every open interval
in $[0, 1] - (C_1 \cup C_2)$. In general, let $C_n$ be the subset of [0, 1] formed by putting
a copy of $C_1$ inside every open interval in $[0, 1] - (C_1 \cup C_2 \cup \cdots \cup C_{n-1})$. Show
that $C = \bigcup_{n=1}^{\infty} C_n$ is first category.
4.4.5. For the set C defined in Exercise 4.4.4, find a description of the elements of
C in terms of their representation in base 3.
4.4.10. We have seen (Corollary 4.12) that any derivative is continuous on a dense
set of points. Can a derivative also be discontinuous on a dense set of points? To
see that the answer to this question is "yes," let $(r_1, r_2, \ldots)$ be an ordering of the
rational numbers in [0, 1]. Let
f(x)=x2sin(1/x), f(0)=0,
and define
Show that f is differentiable at every point in [0, 1] and that its derivative is
discontinuous at each
4.4.11. Prove that any countable union of first-category sets is first category.
4.4.12. Define
$$f(x) = \begin{cases} q, & \text{if } x = p/q \in \mathbb{Q} \text{ where } q \ge 1 \text{ and } \gcd(p, q) = 1,\\ 0, & \text{otherwise.} \end{cases}$$
Prove that f is of class 2.
4.4.13. Let $(r_n)$ be a sequence of real numbers chosen so that no two differ by
a rational number, $r_i - r_j \notin \mathbb{Q}$ if $i \ne j$. Define
Q=UQn
Define the characteristic function of Q, X Q(X) = 1 if x E Q, = 0 if x Q. Prove
that this function is at most of class 3.
5
The Development of Measure Theory
Through the 1880s and 1890s, the Riemann integral piled up a list of inconve-
niences, including the following:
1. It is defined only for bounded functions. While improper integrals had been
introduced to deal with unbounded functions, this fix appears ad hoc. Further-
more, recourse to improper integrals can work only if the set of points with
unbounded oscillation has outer content zero.
2. It is possible to have an integrable function with positive oscillation on a
dense set of points, and therefore the integral is not differentiable at any of the
points in this dense set (Riemann's function, Example 2.1). This violates the
antidifferentiation part of the fundamental theorem of calculus on this dense
set.
3. It is possible to have a bounded derivative that cannot be integrated (Volterra's
function, Example 4.4). This violates the evaluation part of the fundamental
theorem of calculus.
4. The limit of a bounded sequence of integrable functions is not necessarily
integrable (Baire's sequence, Example 4.5).
5. The question of finding necessary and sufficient conditions under which term-
by-term integration is valid was turning out to be extremely difficult.
The only question is "what do we mean by the area of a set of points in the plane?"
We can extend the idea of outer content to the plane. The area of any rectangle
is its length times its width. In exact analogy with the definition of outer content
on the real number line, given any set S, we let C denote the set of all coverings
C of S by a finite number of rectangles, let area(C) be the sum of the areas of the
rectangles in C, and define
The Weierstrass integral has the advantage that every bounded function is inte-
grable. It yields the desired value for the integral of the derivative of Volterra's
function.
It does, however, have a noticeable drawback. Consider the characteristic func-
tion of a set.
If S and T are disjoint sets that are both dense in [0, 1] (e.g., S could be the
rationals and T the irrationals), then
Definition: Content
Let be the set of all finite coverings of the set S ç R'1 using n-dimensional
rectangular boxes, and let be the set of all pairwise disjoint finite collections
of open n-dimensional rectangular boxes for which the union is contained in S.
The volume of a rectangular box is the product of the lengths of the sides, and the
volume of C e or e denoted vol(C), is the sum of the volumes of the boxes.
The inner and outer content of a bounded set S are defined, respectively, as
equal to the outer content, in which case we can denote this area as simply the
content of the set.
If S is not bounded, let Nk(O) be the neighborhood of the origin with radius k
and define
Content corresponds to the usual concept of length in $\mathbb{R}$, area in $\mathbb{R}^2$, and volume
in $\mathbb{R}^3$. Inner and outer contents differ only for sparse sets. For example, $\mathbb{Q} \cap [0, 1]$
has inner content 0 and outer content 1. The set SVC(4) has inner content 0,
because it contains no intervals, and outer content 1/2. Its complement in [0, 1],
$[0, 1] - \mathrm{SVC}(4)$, has inner content 1/2 and outer content 1.
Peano recognized the relationship between inner and outer content given in
the following proposition. The concept was both popularized and made rigorous
by Camille Jordan in the first volume of Cours d'analyse, published in 1893.
Recall that the boundary of $S$, denoted $\partial S$, consists of all points for which every
neighborhood contains at least one point of S and at least one point not in S.
Proposition 5.1 (Inner versus Outer Content). Let $S$ be a bounded set in $\mathbb{R}^n$. We
have that
$$c_e(S) = c_i(S) + c_e(\partial S).$$
As a consequence, the set $S$ has a well-defined area, called the content of the set,
if and only if $c_e(\partial S) = 0$.
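Proof. We shall prove this theorem for two-dimensional sets. The same idea works
in any number of dimensions. We subdivide $\mathbb{R}^2$ into squares, $2^{-m}$ by $2^{-m}$, and
restrict our attention to those squares that have nonempty intersection with $S$. Let
$S_{e,m}$ be the union of the squares that contain at least one point of $S$. This is a cover
of $S$. As $m$ increases, the area of this cover decreases and approaches the outer
content of $S$,
$$\lim_{m\to\infty} \operatorname{area}(S_{e,m}) = c_e(S).$$
The squares in $S_{e,m}$ are of two types: those that contain boundary points of $S$
and those that do not. Let $S_{\partial,m}$ be the union of the squares that contain boundary
points, and $S_{i,m}$ the union of the squares that do not and, therefore, are completely
contained within $S$,
$$S_{e,m} = S_{i,m} \cup S_{\partial,m}.$$
As $m$ increases, the area of $S_{i,m}$ also increases and approaches the inner content of $S$,
$$\lim_{m\to\infty} \operatorname{area}(S_{i,m}) = c_i(S).$$
As $m$ increases, the area of $S_{\partial,m}$ decreases and approaches the outer content of the
boundary of $S$,
$$\lim_{m\to\infty} \operatorname{area}(S_{\partial,m}) = c_e(\partial S).$$
Therefore,
$$c_e(S) = \lim_{m\to\infty} \operatorname{area}(S_{e,m}) = \lim_{m\to\infty} \operatorname{area}(S_{i,m}) + \lim_{m\to\infty} \operatorname{area}(S_{\partial,m}) = c_i(S) + c_e(\partial S). \quad\Box$$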
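The grid argument in this proof can be tried on a concrete set. The following sketch takes $S$ to be the closed unit disk, an illustrative choice not made in the text, and counts the grid squares of side $2^{-m}$ of each type using integer arithmetic; the function name `grid_areas` is hypothetical.

```python
import math

def grid_areas(m):
    """Areas of the inner, boundary, and outer unions of grid squares for the closed unit disk."""
    n = 2 ** m                      # number of grid squares per unit length
    inner = boundary = outer = 0

    def near(i):
        # Distance, in grid units, from 0 to the interval [i, i + 1].
        return 0 if i <= 0 <= i + 1 else min(abs(i), abs(i + 1))

    def far(i):
        # Largest distance, in grid units, from 0 to a point of [i, i + 1].
        return max(abs(i), abs(i + 1))

    for i in range(-n - 1, n + 1):
        for j in range(-n - 1, n + 1):
            nearest = near(i) ** 2 + near(j) ** 2      # squared distance to closest point
            farthest = far(i) ** 2 + far(j) ** 2       # squared distance to farthest corner
            if nearest <= n * n:
                outer += 1          # the square contains a point of the disk
            if farthest < n * n:
                inner += 1          # the square lies inside the open disk
            elif nearest <= n * n:
                boundary += 1       # the square contains a point of the circle
    cell = 1 / (n * n)
    return inner * cell, boundary * cell, outer * cell

for m in (2, 4, 6):
    print(m, grid_areas(m), math.pi)
```

For each $m$ the outer area is the sum of the inner and boundary areas, the inner and outer areas bracket $\pi$, and the boundary area shrinks as $m$ grows, which is the content of the proposition for this particular set.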
It follows that f is Riemann integrable if and only if has area in the sense that
its inner and outer contents are equal.
Jordan Measure
Camille Jordan (1838—1922) earned his doctorate in 1861 but worked as an en-
gineer until 1876 when he took a position as professor of analysis at the École
Polytechnique in Paris. His interests ranged widely, and he is known today for
his contributions to group theory, topology, and number theory as well as analy-
sis. His three-volume analysis textbook, Cours d'analyse de l'Ecole Polytechnique
(Course in analysis for the Polytechnical Institute), published 1893—1896, estab-
lished Peano's content as the basis for calculus.
The problem that forced Jordan to focus on content was the issue of multidi-
mensional integrals. In particular, he had to explain how to integrate real-valued
functions in two real variables for which the domain might be a very irregular
region. As long as the inner and outer contents of the domain were equal, one could
make sense of the integral. A critical piece of this is the fact that if we have a finite
collection of pairwise disjoint sets, then the content of the union is the sum of the
contents.
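Proposition 5.2 (Finite Additivity of Content). Let $S_1, S_2, \ldots, S_n$ be a finite set
of pairwise disjoint Jordan measurable sets. The content of their union is equal to
the sum of their contents,
$$c\Bigl(\bigcup_{k=1}^{n} S_k\Bigr) = \sum_{k=1}^{n} c(S_k).$$
Proof From the definition of inner and outer content, we have that
$$\sum_{k=1}^{n} c_i(S_k) \le c_i\Bigl(\bigcup_{k=1}^{n} S_k\Bigr) \le c_e\Bigl(\bigcup_{k=1}^{n} S_k\Bigr) \le \sum_{k=1}^{n} c_e(S_k).$$
Since each $S_k$ is Jordan measurable, $c_i(S_k) = c_e(S_k) = c(S_k)$, so the two ends
agree and every inequality is an equality.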
Jordan's Cours d'analyse was very influential. Henri Lebesgue studied it while
an undergraduate at the École Normale Supérieure. Lebesgue later recounted how
it had prepared the way for his own approach to integration. But using content
to define area had one major flaw; for too many important sets, inner and outer
content are not equal. Volterra's example of a nonintegrable derivative relies on
the fact that the inner and outer content of SVC(4) are different. This suggested to
several people that a more all-encompassing definition of area might get around
the difficulties of Volterra's function. As we saw with Weierstrass's integral, outer
content alone would not do it. Every bounded set has a well-defined outer content,
but the Weierstrass integral is not additive because outer content is not additive. It
is possible to have two disjoint sets, $S \cap T = \emptyset$, for which
$$c_e(S \cup T) < c_e(S) + c_e(T).$$
For example, SVC(4) and its complement in [0, 1] have outer contents 1/2 and 1,
but their union has outer content 1.
Borel Measure
Émile Borel (1871–1956) was only 22 when he became chair of mathematics at
the University of Lille. He returned to Paris in 1896 to teach at the École Normale
Supérieure where Lebesgue was then an undergraduate. In 1909 he became a
professor at the Sorbonne. We have already encountered some of his work in the
Heine-Borel theorem. We now turn to his study of area published in 1898 in Leçons
sur la théorie des fonctions (Lectures on the theory of functions).
In Section 3.3, we discussed Borel's paper of 1895 and how his study of the
convergence of certain infinite series led to the discovery of the Heine—Borel
theorem. It did more than that. Borel was interested in the size of his set of points
on which the series must converge,
an infinite sequence of pairwise disjoint sets, $(S_1, S_2, S_3, \ldots)$, $S_i \cap S_j = \emptyset$ for
$i \ne j$, then we want the measure of the union to equal the sum of the measures,
$$m\Bigl(\bigcup_{k=1}^{\infty} S_k\Bigr) = \sum_{k=1}^{\infty} m(S_k).$$
As an example, such a countably additive measure would imply that the set of
rational numbers in [0, 1], a countable union of single points, must have measure 0.
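One way to see why measure 0 is the only reasonable value is to cover the rationals by intervals of small total length. The sketch below is an illustration, not Borel's own argument: it lists the rationals in [0, 1] with bounded denominator and gives the k-th one an interval of length $\epsilon/2^k$. The names `rationals_in_unit_interval` and `cover_length` and the particular choice of lengths are our assumptions.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval(max_den: int):
    """Rationals p/q in [0, 1] in lowest terms with denominator q <= max_den."""
    points = [Fraction(0), Fraction(1)]
    for q in range(2, max_den + 1):
        for p in range(1, q):
            if gcd(p, q) == 1:
                points.append(Fraction(p, q))
    return points


def cover_length(points, eps: Fraction) -> Fraction:
    """Total length when the k-th point (k = 1, 2, ...) is covered by an
    interval of length eps / 2**k."""
    return sum((eps / 2 ** k for k in range(1, len(points) + 1)), Fraction(0))


eps = Fraction(1, 100)
for max_den in (5, 20, 50):
    pts = rationals_in_unit_interval(max_den)
    # However many rationals are listed, the total length stays below eps.
    print(max_den, len(pts), float(cover_length(pts, eps)))
```

Because the bound does not depend on how many rationals have been listed, the same intervals cover every rational in [0, 1] with total length at most $\epsilon$, for any $\epsilon > 0$.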
Borel begins with three assumptions that uniquely define Borel measure:
1. The measure of a bounded interval is the length of that interval (whether open,
closed, or half open).
2. The measure of a countable union of pairwise disjoint measurable sets is the
sum of their measures.
3. If R and S are measurable sets, $R \subset S$, then so is $S - R$. Furthermore,
$m(S - R) = m(S) - m(R)$.
Borel Sets
Borel came very close to our modern concept of measure, but in all of his discus-
sions of the application of his measure, he restricted himself to sets that could be
constructed from intervals using countable unions and complements. Today, we
call the sets that can be built in this way Borel sets.
It is left for you to show (see Exercises 5.1.13—5.1.15) that under this definition,
all open intervals and all half-open intervals are Borel sets, that any countable
intersection of Borel sets is also a Borel set, and if A, B are Borel sets, $A \supset B$,
then $A - B$ is a Borel set.
Definition: σ-algebra
A σ-algebra, $\mathcal{A}$, is a collection of sets with the property that
1. $\emptyset \in \mathcal{A}$,
2. if $\{A_1, A_2, \ldots\}$ is any countable (finite or infinite) collection of sets in $\mathcal{A}$, then
their union is also in $\mathcal{A}$ (note that we do not need them to be pairwise disjoint),
3. if $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$.
In the next section, we shall see that any Borel set is measurable in Borel's sense.
This will take some work. If we have a countable union of pairwise disjoint sets for
which the Borel measure is defined, then the Borel measure of the union is the sum
of the Borel measures. But we need to define the Borel measure of any countable
union of sets with well-defined Borel measure.
Borel measure actually applies to a much smaller collection of sets than Jordan
measure. This may seem a strange comment in view of our examples of sets that
are measurable in Borel's sense, but not when we try to use Jordan's content.
But, as we shall see, the cardinality of B, the collection of all Borel sets, is only
c, the cardinality of [0, 1]. On the other hand, any subset of SVC(3) (the Cantor
ternary set with content 0) will also have Jordan measure zero. As we saw in
Section 4.1, SVC(3) has cardinality c. The collection of its subsets has cardinality
$2^c$. By Cantor's theorem (Theorem 3.9), this is a larger cardinality than c.
In view of the fact that there are so many more Jordan measurable sets than Borel
sets, one might expect that it is fairly easy to give an explicit example of a set that
is Jordan measurable and not Borel. In fact, finding such explicit sets is difficult
(but see Exercises 5.4.8—5.4.11 for an example).
Because it would take us far afield, a discussion of the proof that the cardinality
of B is c has been put in Appendix A.1, but there is a simple heuristic argument
why this might be the case. The cardinality of the set of all intervals cannot
exceed the cardinality of the set of pairs of real numbers, and that is c. Taking
complements only doubles the number of elements in the set, so the cardinality
is still c. Taking differences, unions of countable collections, and intersections of
countable collections of the Borel sets we have already constructed still does not
get us beyond cardinality = c. We proceed by induction.
The problem with this approach is that the induction needs to go beyond all
finite positive integers. One has to be much more careful than this when arguing
by induction with transfinite numbers. But it gives an indication of why we might
believe that the cardinality is only c.
Borel knew that it would make sense to assign measure 0 to the subsets of the
Cantor ternary set. In general, if $A \subset S \subset B$, where A and B are Borel sets with
the same measure, then Borel recognized that we should assign that value as the
measure of S. But he never worked out the implications of this insight. And he
never recognized that this would provide the key to the problems of integration.
That revelation would come to the young graduate student, Henri Lebesgue.
Exercises
5.1.1. Prove that if f is Riemann integrable, then every open interval contains at
least one point at which f is continuous.
5.1.2. Give an example of a pointwise discontinuous function that is not Riemann
integrable.
5.1.3. Show that a bounded set S is Jordan measurable if and only if, given $\epsilon > 0$,
we can find two finite unions of intervals, E1 and E2, such that
m+ m,nEN
I n+1
Jordan measurable? Justify your answer.
5.1.8. Let $(r_k)_{k=1}^{\infty}$ be the sequence of rationals in [0, 1]. Let $I_k$ be the open interval
of length $1/2^{k+1}$ centered at $r_k$. Show that $\bigcup_{k=1}^{\infty} I_k$ is not Jordan measurable.
5.1.9. Give an example of a bounded open set that is not Jordan measurable.
5.1.10. Give an example of a bounded closed set that is not Jordan measurable.
5.1.11. Give an example of a Borel measurable set that is not Jordan measurable.
5.1.12. Find the smallest σ-algebra that contains all closed intervals with rational
endpoints.
5.1.13. Show that under the definition of Borel sets, every open interval and every
half-open interval is a Borel set.
5.1.14. Show that under the definition of Borel sets, every countable intersection
(finite or infinite) of Borel sets is again a Borel set.
5.1.15. Show that under the definition of Borel sets, if A and B are Borel sets,
$A \supset B$, then $A - B$ is a Borel set.
5.1.16. Find an example of a Borel set that is neither the countable union of
intervals (open, closed, half open, or even a single point), nor is it the complement
of such a union.
This function is not bounded and so not Riemann integrable on [0, 1], but its
improper integral does exist. Find the value of the improper integral $\int_0^1 f(x)\,dx$.
Consider the function g defined on [0, 1] by g(x) = 0 for $x \in \mathrm{SVC}(4)$, and, if (a, b)
is one of the disjoint open intervals whose union equals [0, 1] — SVC(4), then g
on (a, b) is given by
(x)—
a <x
— (a+b)/2<x<b.
Show that even the improper Riemann integral of g over [0, 1] does not exist. Find
the integral of g over each of the open intervals of length $4^{-n}$ on which g is nonzero.
Sum these values (recalling that there are $2^{n-1}$ intervals of length $4^{-n}$) to find a
value that would be reasonable to assign to this improper integral.
5.1.18. Consider the set C defined in Exercise 4.4.4. Find the Borel measure of C.
Justify your answer.
5.1.19. Consider the set V defined in Exercise 4.4.6. Find the Borel measure of V.
Justify your answer.
5.1.20. Let f be any real-valued function defined on R. Show that the set of points
of continuity of f is a Borel set.
5.1.21. Let $(f_n)_{n=1}^{\infty}$ be a sequence of continuous functions defined on $\mathbb{R}$. Show that
the set of points at which this sequence converges is a Borel set.
5.1.22. A real number x is simply normal to base 10 if each digit appears with
the same asymptotic frequency. Specifically, let N(x, d, n) be the number of
occurrences of the digit d among the first n digits in the decimal expansion of x,
then $\lim_{n\to\infty} N(x, d, n)/n = 1/10$. A real number x is normal to base 10 if each
block of k digits appears with the same asymptotic frequency. Let N(x, B, n) be
the number of occurrences of the block B (including overlapping occurrences)
among the first n digits of the decimal expansion of x, then $\lim_{n\to\infty} N(x, B, n)/n = 10^{-k}$.
$$\sum_{i=0}^{n-1} l_i\, \chi_{S_i}(x) \le f(x) < \sum_{i=0}^{n-1} l_{i+1}\, \chi_{S_i}(x)$$
for all $x \in [a, b]$.
We are working with finite summations, so the integral of each of these sums should
be the sums of the integrals, and integration should preserve the inequalities:
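$$\sum_{i=0}^{n-1} l_i \int_a^b \chi_{S_i}(x)\,dx \le \int_a^b f(x)\,dx \le \sum_{i=0}^{n-1} l_{i+1} \int_a^b \chi_{S_i}(x)\,dx.$$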
The integral of the characteristic function of a set should be the measure of that
set. Our sets $S_i$ are, by the way they have been defined, pairwise disjoint. If they are
always Jordan measurable, then the sum of the measures of the $S_i$ is the measure of
their union, which is $b - a$. If the lengths of the intervals on the y-axis, $l_{i+1} - l_i$,
are all less than $\epsilon$, then the upper and lower limits on the value for our integral
differ by at most
$$\sum_{i=0}^{n-1} (l_{i+1} - l_i)\, m(S_i) < \epsilon \sum_{i=0}^{n-1} m(S_i) = \epsilon\,(b - a).$$
We can always force the upper and lower bounds as close as we wish by taking the
partition of the y-axis sufficiently fine.
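To see the horizontal slicing at work, here is a small numerical sketch. The example function $f(x) = x^2$ on [0, 1], the equal spacing of the levels, and the name `lebesgue_sums` are our choices for illustration. For this f, the set of x with $l_i \le f(x) < l_{i+1}$ is the interval $[\sqrt{l_i}, \sqrt{l_{i+1}})$, so its measure is just a length.

```python
import math


def lebesgue_sums(n: int):
    """Lower and upper sums for f(x) = x**2 on [0, 1], slicing the y-axis
    into n equal pieces 0 = l_0 < l_1 < ... < l_n = 1."""
    levels = [i / n for i in range(n + 1)]
    lower = upper = 0.0
    for lo, hi in zip(levels, levels[1:]):
        # S_i = {x in [0, 1] : lo <= x**2 < hi} = [sqrt(lo), sqrt(hi)).
        measure = math.sqrt(hi) - math.sqrt(lo)
        lower += lo * measure
        upper += hi * measure
    return lower, upper


for n in (10, 100, 1000):
    lower, upper = lebesgue_sums(n)
    # The gap upper - lower equals (1/n) * (b - a) = 1/n here.
    print(n, lower, upper, upper - lower)
```

The two sums trap the value 1/3 of the integral, and their difference is the spacing of the levels times the length of [0, 1], in agreement with the bound $\epsilon(b - a)$.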
If we restrict ourselves to Jordan measure, then we are right back at the Riemann
integral. But Lebesgue saw that he could use Borel's idea of measure.
Consider V', the derivative of Volterra's function. Our inability to integrate this
function comes from the fact that in any neighborhood of a point in SVC(4), this
derivative takes on both the values + 1 and —1. If we slice our function horizontally
and look at where the function lies between, say, 0.7 and 0.8, this is a fairly nice
set. It is a countable union of disjoint intervals (see Figure 5.2). The set of values of
x for which V'(x) lies between 0.7 and 0.8 is not measurable in Jordan's sense, but
it is a Borel set. If we use Borel measure, then the derivative of Volterra's function
is integrable. The fundamental theorem of calculus (evaluation part) appears to be
salvageable.
Improving on Borel
Lebesgue realized that he could not simply substitute Borel measure for Jordan
measure. As we have seen, that severely reduces the number of sets that are measur-
able. What Lebesgue needed was a concept of measure that would encompass all
Jordan measurable sets and all Borel measurable sets. He laid out three conditions
that his measure would have to possess:
Unlike Borel, Lebesgue sought to find the most general possible collection of sets
for which such a measure could be defined. Lebesgue measure is built on the
concept of the countable cover.
Any countable union of intervals can be expressed as a countable union of pair-
wise disjoint intervals. If C is a countable union of pairwise disjoint intervals, then
Lebesgue's three conditions imply that the Lebesgue measure of C, denoted m(C),
must equal the sum of the lengths of the intervals of C. We use this as our starting
point for the general definition of Lebesgue measure.
Lebesgue outer measure satisfies Lebesgue's three conditions, but it still misses
one critical property that is present in Jordan and Borel measure: if S and T are
Jordan or, respectively, Borel measurable sets, then S — T is also respectively
Jordan or Borel measurable, and the respective measure of S — T is the measure
of S minus the measure of $S \cap T$. In the case of Borel measure, this is built into
$$c_i(S) = (b - a) - c_e([a, b] - S).$$
The condition needed for Jordan measurability, $c_i(S) = c_e(S)$, is precisely the
condition we need in order to guarantee that $c_e([a, b] - S) = (b - a) - c_e(S)$.
Lebesgue outer measure is more complicated because the complement of a
countable union of disjoint intervals is no longer necessarily a countable union of
disjoint intervals — witness the SVC sets. It might seem that the natural definition of
Lebesgue inner measure would be to take the supremum over all countable unions
of disjoint intervals contained in S of the sum of the lengths of these intervals.
That turns out to be a useless notion because it just recreates the inner content
(see Exercise 5.2.11). The right definition of Lebesgue inner measure parallels the
alternate definition of inner content, the one that shows how to compute the content
of a complement,
$$m_i(S) = (b - a) - m_e([a, b] - S).$$
It may seem that we have stopped short of the full complementarity that we need:
If S and T are measurable, then so is $S - T$, and $m(S - T) = m(S) - m(S \cap T)$.
As we shall see in the next section, this more general statement of complementarity
is a consequence of the statement that for $S \subset [a, b]$,
$$m([a, b] - S) = (b - a) - m(S). \qquad (5.5)$$
From now on, the terms outer measure, inner measure, and measure will
refer to Lebesgue measures. The fact that this definition satisfies the first and third
conditions for our measure is easy to check and is left for the exercises. The second
condition, countable additivity, will be proven in the next section.
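As Lebesgue observed, outer measure is always subadditive,
$$m_e\Bigl(\bigcup_{n=1}^{\infty} S_n\Bigr) \le \sum_{n=1}^{\infty} m_e(S_n).$$
Proof Choose any $\epsilon > 0$ and for each $S_n$ choose a countable open cover
$(I_{n,1}, I_{n,2}, \ldots)$ for which
$$\sum_{k=1}^{\infty} m(I_{n,k}) < m_e(S_n) + \frac{\epsilon}{2^n}.$$
We create a countable open cover of $\bigcup S_n$ by taking all of the open intervals in all
of the chosen open covers. This is a countable collection of countable collections,
so it is still countable. The outer measure of $\bigcup S_n$ is bounded by the sum of the
lengths of all of these intervals. This sum is bounded by
$$\sum_{n=1}^{\infty} \Bigl( m_e(S_n) + \frac{\epsilon}{2^n} \Bigr) = \sum_{n=1}^{\infty} m_e(S_n) + \epsilon.$$
Since this holds for every $\epsilon > 0$, the inequality follows.
It follows that, for any $S \subset [a, b]$,
$$m_e(S) + m_e([a, b] - S) \ge b - a, \qquad (5.7)$$
with exact equality if and only if S is measurable. Note that $S \subset [a, b]$ is measurable
if and only if $[a, b] - S$ is measurable (see Exercise 5.2.5).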
As the next theorem shows, subadditivity allows us to collect many examples of
measurable sets.
Theorem 5.5 (Examples of Measurable Sets). If the set S is bounded, then any
of the following conditions implies that S is measurable:
1. The outer measure of S is zero,
2. S is countable, or
3. S is an interval (open, closed, or half open).
Proof.
1. Because of inequality (5.7), we only need to prove that
$$m_e(S) + m_e([a, b] - S) \le b - a.$$
The outer measure of [a, b] — S is less than or equal to b — a, and therefore
Exercises
5.2.1. Prove that if A and B are measurable sets and $A \subset B$, then $m(A) \le m(B)$.
5.2.2. Let x be a real number, $x + S = \{x + s \mid s \in S\}$. Prove that $m_e(x + S) = m_e(S)$.
Show that this implies that if S is measurable, then $m(x + S) = m(S)$.
5.2.3. Prove that (0, 1) is measurable and its measure is equal to 1.
5.2.4. Show that the definition of the inner measure does not depend on the choice
of the interval [a, b]. Let $\alpha = \inf S$ and $\beta = \sup S$. Show that
$$m_e(S^c \cap [a, b]) = (b - a) - (\beta - \alpha) + m_e(S^c \cap [\alpha, \beta]).$$
5.2.5. Prove that if $S \subseteq [a, b]$, then S is measurable if and only if $S^c \cap [a, b]$ is
measurable.
5.2.6. Prove that if $m_e(S) = 0$, then $m_e(S \cup T) = m_e(T)$.
5.2.7. Prove that for any bounded set S and any $\epsilon > 0$, we can always find an open
set $U \supset S$ such that
$$m_e(U) < m_e(S) + \epsilon.$$
5.2.8. Prove that for any bounded set S, we can always find a set T that is a
countable intersection of open sets (and, thus, a Borel set) for which $S \subset T$ and
$m_e(S) = m_e(T)$.
5.2.9. Prove that for any bounded set S and any $\epsilon > 0$, we can always find a closed
set $K \subset S$ such that
5.2.10. Prove that for any bounded set S, we can always find a set L that is a
countable union of closed sets (and, thus, a Borel set) for which S D L and
me(S) = me(L).
5.2.11. Given a set S, let $\mathcal{K}_S$ be the collection of all countable unions of pairwise
disjoint intervals contained in S. Prove that
$$m(S \cup T) + m(S \cap T) = m(S) + m(T).$$
5.2.13. Show that if S and T are bounded sets, then
$$m_e(S \cup T) + m_e(S \cap T) \le m_e(S) + m_e(T).$$
5.2.16. Assuming that all open sets and all closed sets are measurable, show that if
$S \subseteq [a, b]$, then the infimum over all open sets U that contain $[a, b] - S$ of m(U)
is equal to $b - a$ minus the supremum over all closed sets $F \subseteq S$ of m(F).
5.2.17. Prove that if $S \subset (a, b)$ has measure 0, then $(a, b) - S$ is dense in (a, b)
and is uncountable.
5.2.18. Using Exercise 5.2.17 and the fact that any uncountable closed subset of R
has cardinality c (see p. 96), prove that if $S \subset (a, b)$ has measure 0, then $(a, b) - S$
has cardinality c.
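Lemma 5.7 (Local Additivity). Let S be any bounded set and $(I_1, I_2, \ldots)$ any
countable collection of pairwise disjoint intervals, then
$$m_e\Bigl(S \cap \bigcup_{j} I_j\Bigr) = \sum_{j} m_e(S \cap I_j).$$
Proof Given any $\epsilon > 0$, choose a countable open cover, $(J_1, J_2, \ldots)$, of $S \cap \bigcup_j I_j$
such that $\sum_i m(J_i) < m_e\bigl(S \cap \bigcup_j I_j\bigr) + \epsilon$. Because the intervals $I_1, I_2, \ldots$ are
pairwise disjoint, we have that
$$\sum_{j} m(J_i \cap I_j) \le m(J_i).$$
By the subadditivity of the outer measure, Theorem 5.4, and the fact that
$S \cap I_j \subset \bigcup_i (J_i \cap I_j)$,
we see that
$$m_e\Bigl(S \cap \bigcup_j I_j\Bigr) \le \sum_j m_e(S \cap I_j) \le \sum_j \sum_i m(J_i \cap I_j) \le \sum_i m(J_i) < m_e\Bigl(S \cap \bigcup_j I_j\Bigr) + \epsilon. \qquad (5.12)$$
Since this is true for all $\epsilon > 0$, the first inequality must be an equality.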
Proof (Theorem 5.6) We assume that S satisfies equation (5.10). Our first step is
to show that S satisfies equation (5.9) when X is a bounded interval.
$$\sum_j m(I_j) < m_e(X) + \epsilon.$$
We use subadditivity and the first part of our proof:
$$m_e(X) \le m_e(S \cap X) + m_e(S^c \cap X) \le \sum_j m_e(S \cap I_j) + \sum_j m_e(S^c \cap I_j) = \sum_j m(I_j) < m_e(X) + \epsilon. \qquad (5.17)$$
Again, since this is true for all $\epsilon > 0$, the first inequality must be equality. $\Box$
We now take the first step toward proving that any countable union of measurable
sets is measurable. This theorem, in addition to moving us toward that result, is
very important in its own right.
Theorem 5.8 (Countable Additivity). If (S1, S2, ...) are pairwise disjoint mea-
surable sets whose union has finite outer measure, then
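$$m_e\Bigl(\bigcup_{i=1}^{\infty} S_i\Bigr) = \sum_{i=1}^{\infty} m(S_i). \qquad (5.18)$$
Proof We start with two disjoint measurable sets, $S_1$ and $S_2$, and invoke the
Carathéodory condition with $X = S_1 \cup S_2$. Since $S_1^c \cap X = S_2$, the Carathéodory
condition gives us exactly what we need,
$$m_e(S_1 \cup S_2) = m_e(S_1) + m_e(S_2).$$
By induction, the same holds for any finite collection, so for every n,
$$\sum_{i=1}^{n} m(S_i) = m_e\Bigl(\bigcup_{i=1}^{n} S_i\Bigr) \le m_e\Bigl(\bigcup_{i=1}^{\infty} S_i\Bigr).$$
Since this upper bound holds for all n, the summation converges as n approaches
infinity and
$$m_e\Bigl(\bigcup_{i=1}^{\infty} S_i\Bigr) \ge \sum_{i=1}^{\infty} m(S_i).$$
The inequality in the other direction follows from subadditivity.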
We would like to be able to say that these unions are also measurable. In fact,
we would like to be able to say that any countable union of measurable sets is
measurable. The first step is to consider finite unions and intersections.
Theorem 5.9 (Finite Unions and Intersections). Any finite union or intersection
of measurable sets is measurable.
Proof It is enough to show that the union of two measurable sets is measurable.
The intersection of two sets is the complement of the union of their complements,
$$S_1 \cap S_2 = \bigl(S_1^c \cup S_2^c\bigr)^c,$$
and by induction we can then conclude that any finite union or intersection of
measurable sets is measurable.
Let X be the arbitrary set to be cut by S1 U S2. We divide X into four disjoint
subsets (see Figure 5.3):
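$$X_1 = X \cap (S_1 \cap S_2^c), \qquad X_2 = X \cap (S_1 \cap S_2),$$
$$X_3 = X \cap (S_1^c \cap S_2), \qquad X_4 = X \cap (S_1^c \cap S_2^c). \qquad (5.19)$$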
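Proof Again, it is enough to prove this theorem for countable unions. Let $T_n =
S_1 \cup S_2 \cup \cdots \cup S_n$, $T = \bigcup_{i=1}^{\infty} S_i$. We shall also use the sets $U_n$ where $U_1 = T_1$
and $U_n = T_n - T_{n-1}$ for n > 1. In other words, $U_n$ consists of all elements of $S_n$
that are not in $S_1, S_2, \ldots,$ or $S_{n-1}$. By their construction, the sets $T_n$ and $U_n$ are
measurable, and the $U_n$ are pairwise disjoint. The union of $U_1$ through $U_n$ is $T_n$.
Since $T_n$ is measurable, we know that for any set X with finite outer measure,
$$m_e(X) = m_e(X \cap T_n) + m_e(X \cap T_n^c) \ge m_e(X \cap T_n) + m_e(X \cap T^c).$$
Since $U_n$ is measurable,
$$m_e(X \cap T_n) = m_e(X \cap T_n \cap U_n) + m_e(X \cap T_n \cap U_n^c) = m_e(X \cap U_n) + m_e(X \cap T_{n-1}).$$
By induction,
$$m_e(X \cap T_n) = \sum_{k=1}^{n} m_e(X \cap U_k).$$
We use the same trick we used in the proof of Theorem 5.8. Our summation has an
upper bound independent of n, so the infinite summation must converge and
$$m_e(X) \ge \sum_{k=1}^{\infty} m_e(X \cap U_k) + m_e(X \cap T^c).$$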
By Theorem 5.8,
00 00 00
Therefore,
An important consequence of this result is the next theorem that shows that
Lebesgue measure is, in a real sense, not very far removed from Jordan content.
$$S \,\Delta\, T = (S - T) \cup (T - S) = (S \cap T^c) \cup (T \cap S^c).$$
For example, the symmetric difference of the overlapping intervals [0, 2] and
[1, 3] is
$$[0, 2] \,\Delta\, [1, 3] = [0, 1) \cup (2, 3].$$
Therefore,
$$m(S - W^c) < \epsilon/2.$$
Since $W \supset S^c \cap [a, b]$, S contains $W^c \cap [a, b]$, which is closed. We have now
sandwiched our set S between a closed set and an open set,
$$W^c \cap [a, b] \subset S \subset V.$$
2
Borel, Leçons sur la Théorie des Fonctions, 4th ed. 1950. The first fundamental theorem is the Heine-Borel
theorem.
Figure 5.4. $V \supset S \supset W^c \cap [a, b]$. The shaded region is W. U is the region inside the dotted hexagon.
$$W^c \cap [a, b] \subset U \subset V.$$
By subadditivity,
We finish with a corollary that stands in stark contrast to Lemma 4.4 on page 105
which was both much more complicated and much more restricted in its assump-
tions.
Corollary 5.12 (Limit of Measure). If $S_1, S_2, \ldots$ are measurable sets such that
$$S_1 \subset S_2 \subset S_3 \subset \cdots, \qquad \bigcup_{i=1}^{\infty} S_i = S,$$
= T,
Proof The sets $S_2 - S_1, S_3 - S_2, \ldots$ are pairwise disjoint. If we define $S_0 = \emptyset$,
then we can write
$$\lim_{n\to\infty} m(S_n) = \lim_{n\to\infty} \sum_{j=1}^{n} m(S_j - S_{j-1}) = m\Bigl(\bigcup_{j=1}^{\infty} (S_j - S_{j-1})\Bigr) = m(S).$$
All intervals are measurable. With Theorem 5.10, we see that all Borel sets are
measurable. I leave it as an exercise (Exercise 5.3.4) to verify that for any bounded
set S,
$$c_i(S) \le m_i(S) \le m_e(S) \le c_e(S),$$
and therefore any Jordan measurable set is measurable in Lebesgue's sense. Are
there any sets that are not measurable? That is an important question with a very
surprising answer that will be revealed in the next section.
Exercises
5.3.1. Show that both $\mathbb{R}$ and $\emptyset$ satisfy Carathéodory's condition.
5.3.2. Let S be an unbounded set with bounded outer measure such that for every
$k \in \mathbb{N}$, $S \cap [-k, k]$ satisfies Lebesgue's condition for measurability,
$$m_e([-k, k] - S) = 2k - m_e(S \cap [-k, k]).$$
Show that S satisfies the Carathéodory condition and that
$$m(S) = \lim_{k\to\infty} m(S \cap [-k, k]).$$
5.3.3. Show directly that the intersection of two measurable sets is measurable by
proving that
WnS
given in (5.29).
me (S U T) = me(S) + me(T).
5.3.8. Show that if S and T are measurable, then
5.3.9. Show that if $m_e(S) < \infty$ and there is a measurable subset $T \subset S$ such that
$m(T) = m_e(S)$, then S is measurable.
5.3.10. Let S be any subset of $\mathbb{R}$. Show that for any $\epsilon > 0$ there is an open set
$U \supset S$ such that $m(U) < m_e(S) + \epsilon$. Show that there is a countable intersection of
open sets, G, such that $G \supset S$ and $m_e(S) = m(G)$.
5.3.11. Show that for any bounded set S, the following statements are equivalent:
1. S is measurable.
2. Given any $\epsilon > 0$, there is an open set $U \supset S$ such that $m_e(U - S) < \epsilon$.
3. There is a countable intersection of open sets $G \supset S$ such that $m_e(G - S) = 0$.
4. Given any $\epsilon > 0$, there is a closed set $C \subset S$ such that $m_e(S - C) < \epsilon$.
5. There is a countable union of closed sets $F \subset S$ such that $m_e(S - F) = 0$.
5.3.12. Show that the statements of Exercise 5.3.11 are also equivalent when S is
any subset of $\mathbb{R}$.
5.3.13. Let S and T be sets with finite outer measure. Show that
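5.3.15. For a sequence of sets $(S_n)_{n=1}^{\infty}$ in $\mathbb{R}$, we define the limit supremum and the
limit infimum as
$$\limsup_{n\to\infty} S_n = \bigcap_{k=1}^{\infty} \bigcup_{n=k}^{\infty} S_n \quad \text{and} \quad \liminf_{n\to\infty} S_n = \bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} S_n.$$
1. Show that if each $S_n$ is measurable, then
$$m\Bigl(\liminf_{n\to\infty} S_n\Bigr) \le \liminf_{n\to\infty} m(S_n).$$
2. Show that if, in addition, $m(S_n \cup S_{n+1} \cup \cdots) < \infty$ for at least one $n \ge 1$, then
$$m\Bigl(\limsup_{n\to\infty} S_n\Bigr) \ge \limsup_{n\to\infty} m(S_n).$$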
5.3.17. Let S be the set of points in [0, 1] that do not require the use of the digit 7
in their decimal expansion. Show that S is measurable and m(S) = 0.
5.3.18. Find the Lebesgue measure of the set of points in [0, 1] for which there is
a decimal expansion that uses all of the digits 1 through 9.
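Theorem 5.13 (Existence of Nonmeasurable Set). The set $\mathcal{N}$ is not measurable.
Proof Let q be any rational number and define the translation $\mathcal{N} + q$ to be
$\{a + q \mid a \in \mathcal{N}\}$. We have added a rational number to each element of $\mathcal{N}$, so $\mathcal{N} + q$
also consists of exactly one element from each equivalence class. If $q_1$ and $q_2$ are
distinct rational numbers, then $\mathcal{N} + q_1$ is disjoint from $\mathcal{N} + q_2$.
Every real number in (0, 1) is contained in $\mathcal{N} + q$ for exactly one rational value
of q, and this value of q lies strictly between −1 and 1. To see this, take any
real number $\alpha \in (0, 1)$ and find the equivalent number $\beta \in \mathcal{N}$. By definition of
the equivalence, $\beta - \alpha \in \mathbb{Q}$ and $-1 < \beta - \alpha < 1$. We can bound the union of the
pairwise disjoint sets $\mathcal{N} + q$ for rational q between −1 and 1:
$$(0, 1) \subset \bigcup_{q \in \mathbb{Q} \cap (-1, 1)} (\mathcal{N} + q) \subset (-1, 2).$$
If $\mathcal{N}$ were measurable, then each translate $\mathcal{N} + q$ would be measurable with the
same measure, and countable additivity would give
$$m\Bigl(\bigcup_{q \in \mathbb{Q} \cap (-1, 1)} (\mathcal{N} + q)\Bigr) = \sum_{q \in \mathbb{Q} \cap (-1, 1)} m(\mathcal{N} + q) = \sum_{q \in \mathbb{Q} \cap (-1, 1)} m(\mathcal{N}).$$
If $m(\mathcal{N}) = 0$, this sum is 0, which is impossible because the union contains (0, 1).
If $m(\mathcal{N}) > 0$, then
$$m\Bigl(\bigcup_{q \in \mathbb{Q} \cap (-1, 1)} (\mathcal{N} + q)\Bigr) = \sum_{q \in \mathbb{Q} \cap (-1, 1)} m(\mathcal{N}) = \infty,$$
which is impossible because the union is contained in (−1, 2).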
Difficulties
This would seem to settle the matter. There are nonmeasurable sets. But Vitali's
paper landed in the very center of a raging controversy among mathematicians.
The construction of $\mathcal{N}$ requires selecting one number from each equivalence class.
We have uncountably many equivalence classes.
Is it possible to have a set whose definition requires uncountably many choices?
$a \prec b$, $a = b$, or $b \prec a$.
Zermelo (1871–1953) earned his doctorate in 1894 at the University of Berlin,
working on the calculus of variations. After moving to Göttingen in 1897, and at
the urging of David Hilbert, he turned his attention to the problems of set theory.
He taught for several years at the University of Zurich before retiring to the Black
Forest of Germany because of poor health. He was awarded an honorary chair at
the University of Freiburg in 1926, a position he resigned in 1935 in protest against
Hitler's government.
To illustrate what is meant by a well-ordered set, we begin with the rational
numbers between 0 and 1. The rational numbers are not well ordered if we rely on
the usual order according to position on the real number line. With this order, the
set of rational numbers between 0 and 1 does not have a smallest element. If we
use an order that puts these rational numbers into one-to-one correspondence with
the natural numbers:
$$\frac{1}{2} \prec \frac{1}{3} \prec \frac{2}{3} \prec \frac{1}{4} \prec \frac{3}{4} \prec \frac{1}{5} \prec \frac{2}{5} \prec \cdots$$
this is a well ordering. Every subset has a first element when we use this order.
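A well ordering of this kind is easy to generate. The sketch below assumes one particular enumeration, by increasing denominator and then increasing numerator, keeping only fractions in lowest terms; the names `rationals_in_order` and `first_element` and the `search_limit` cutoff are ours. Any one-to-one listing of these rationals against the natural numbers would serve equally well.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def rationals_in_order():
    """Yield the rationals strictly between 0 and 1, each exactly once,
    by increasing denominator and then increasing numerator."""
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)
        q += 1


def first_element(subset, search_limit: int = 10_000):
    """Return the element of `subset` that comes first in the enumeration.

    `search_limit` only bounds the search in this illustration.
    """
    for r in islice(rationals_in_order(), search_limit):
        if r in subset:
            return r
    raise ValueError("no element found within the search limit")


print(list(islice(rationals_in_order(), 7)))
print(first_element({Fraction(3, 5), Fraction(2, 7), Fraction(9, 10)}))
```

Under this order, the first element of a nonempty subset is simply the member that the enumeration reaches first, which is why every subset has one.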
Is it possible to well order the real numbers between 0 and 1? Cantor thought
that it should be possible, but he could not find such an order. The problem
hung unanswered for 17 years, occasionally prodded by those few individuals truly
dedicated to set theory, but ignored by most mathematicians. Then in 1900, David
Hilbert, probably the most influential mathematician of the age, delivered an address
at the International Congress of Mathematicians in which he described the 23 most
pressing and important unsolved problems in mathematics. Problem number 1 on
his list was to settle the continuum hypothesis (see p. 75). The specific question he
asked is whether or not there exists an infinite subset of [0, 1] whose cardinality is
neither $\aleph_0$ nor c.
In his explanation of problem 1, Hilbert raised the question whether it is possible
to well order the real numbers. Suddenly, this became an important problem.
Opinion was divided. In 1904, Julius König announced a proof that such a well
ordering could not exist. Flaws in his proof were quickly discovered. The same
year, Ernst Zermelo published his proof that it could be done, not just for the real
numbers but for any set.
What Zermelo actually accomplished was to show that every set can be well
ordered if and only if the axiom of choice always holds.
We have encountered three decision points in our construction of the real number
line. The first was whether or not to include infinitesimals. The judgment to reject
them was made by Cauchy and his contemporaries in the early nineteenth century.
As we have seen, calculus could have been placed on a firm foundation with their
acceptance, but this was not fully realized until the work of Abraham Robinson in
the 1960s. It requires paying the price of greatly complicating the structure of the
real numbers and violating the intuitive principle of commensurability. Robinson's
nonstandard analysis has many supporters who believe that it should be the standard
approach to analysis, but they constitute a minority of mathematicians.
The second decision point involves the continuum hypothesis. We are free to
decide that there either are or are not infinite subsets of $\mathbb{R}$ with cardinality other
than $\aleph_0$ or c. Given this choice, most mathematicians would probably opt for no
other cardinalities. It keeps life simpler. Beyond those who work directly in set
theory, no one worries much about this. This preference does not impact other
branches of mathematics.
The axiom of choice is far more problematic because it does affect many other
branches of mathematics. There are results whose proofs are greatly simplified by
appeal to the axiom of choice, others that are possible only because of this axiom.
In 1918, Wacław Sierpiński published a list of such results. In 1929, Krull used
the axiom of choice to prove that in a commutative ring, every proper ideal can
be extended to a maximal prime ideal. In 1932, Hausdorff used the well-ordering
principle to prove that every vector space has a basis. In 1936, Teichmüller extended
this proof to show that every Hilbert space has an orthonormal basis. It is not
necessary to know what these statements mean to recognize that much modern
mathematics presupposes the axiom of choice. We certainly could live without it
or with a weaker form that creates fewer apparent paradoxes, but that would create
complications that most mathematicians would prefer to live without.
Sierpiński was unhappy calling this the "axiom of choice" since nothing is
chosen. Rather, this axiom asserts the existence of something that we can never
explicitly construct. In a letter to Emile Borel written in 1905, Jacques Hadamard
described this debate as centering on the distinction "between what is determined
and what can be described." Nonmeasurable sets can be determined in the sense
that they can be prescribed; they cannot be described. Hadamard goes on to com-
pare this debate to "the one which arose between Riemann and his predecessors
over the notion of function. The rule that Lebesgue demands appears to me to
resemble closely the analytic expression on which Riemann's adversaries insisted
so strongly." Here Hadamard adds a footnote:
I believe it necessary to reiterate this point, which, if I were to express myself fully, apppears to
form the essence of the debate. From the invention of the infinitesimal calculus to the present,
it seems to me, the essential progress in mathematics has resulted from successively annexing
notions which, for the Greeks or the Renaissance geometers or the predecessors of Riemann,
were "outside mathematics" because it was impossible to describe them.5
Exercises
5.4.1. Prove that if are rational numbers, then
5.4.9. Show that if $\psi$ is a continuous, strictly increasing function, then any set $S$
in the domain of $\psi$ is a Borel set if and only if $\psi(S)$ is a Borel set.
5.4.10. Define $\psi(x) = x + DS(x)$, where $DS$ is the Devil's staircase, Example 4.1
on p. 86. Show that $\psi$ is a continuous and strictly increasing function from
$[0, 1]$ onto $[0, 2]$. Let $C = SVC(3)$ be the Cantor ternary set on $[0, 1]$. Show
that $m(\psi(C)) = 1$, and thus $\psi(C)$ contains a nonmeasurable set, $M$.
5.4.11. Let $M$ be a nonmeasurable set contained in $\psi(C)$. Show that $\psi^{-1}(M)$ is
Jordan measurable, but it cannot be a Borel set.
6
The Lebesgue Integral
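In Section 5.2, we saw that the idea behind the Lebesgue integral of $f$ is to
partition the $y$-axis, $l = l_0 < l_1 < l_2 < \cdots < l_n = L$, define
$S_i = \{x \mid l_i \le f(x) < l_{i+1}\}$, and then bound the integral by the summations
$$\sum_{i=0}^{n-1} l_i\, m(S_i) \le \int_a^b f(x)\,dx \le \sum_{i=0}^{n-1} l_{i+1}\, m(S_i).$$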
As the partition of the $y$-axis gets finer, these sums will approach each other and
so approach a value for the integral. The only catch is that these sets, the $S_i$, must
be measurable.
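The bounding sums described above can be tried numerically. The following is only a sketch under stated assumptions: the test function $f(x) = x^2$ on $[0, 1]$ and the helper names (`lebesgue_sums`, `level_set_measure_of_square`, `uniform_levels`) are hypothetical choices made for illustration and do not come from the text. For this monotone $f$, each level set $\{x : l_i \le f(x) < l_{i+1}\}$ is an interval, so its measure is just a length.

```python
import math

def lebesgue_sums(levels, measure_of_level_set):
    """Lower and upper sums for a partition l_0 < l_1 < ... < l_n of the y-axis.

    measure_of_level_set(lo, hi) must return m({x : lo <= f(x) < hi}).
    """
    lower = upper = 0.0
    for lo, hi in zip(levels, levels[1:]):
        m = measure_of_level_set(lo, hi)
        lower += lo * m   # f >= lo on this level set
        upper += hi * m   # f <  hi on this level set
    return lower, upper

def level_set_measure_of_square(lo, hi):
    """m({x in [0, 1] : lo <= x^2 < hi}), the length of [sqrt(lo), sqrt(hi))."""
    lo, hi = max(lo, 0.0), min(hi, 1.0)
    return max(math.sqrt(hi) - math.sqrt(lo), 0.0)

def uniform_levels(n, bottom=0.0, top=1.0):
    """n + 1 equally spaced levels from bottom to top."""
    return [bottom + (top - bottom) * i / n for i in range(n + 1)]

for n in (10, 100, 1000):
    lower, upper = lebesgue_sums(uniform_levels(n), level_set_measure_of_square)
    print(n, lower, upper)   # the two sums bracket 1/3 and close in on it
```

The single point $x = 1$, where $f$ reaches the top level, falls outside every level set here; it has measure zero, so the sums are unaffected. The gap between the two sums equals the mesh of the partition times the total measure of the domain, which is why refining the partition of the $y$-axis forces them together.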
In view of the difficulty involved in finding a nonmeasurable set, we should
expect that for reasonable functions, the $S_i$ are measurable. But there is something
to prove here. We shall call such functions measurable functions. The most
important result of the first section is that every Riemann integrable function is
measurable. We will lose nothing (and gain a great deal) by switching from the
Riemann integral to the Lebesgue integral.
Our greatest gain will be Lebesgue's dominated convergence theorem, stated
and proven in Section 6.3. Here at last we shall see a broadly applicable sufficient
condition allowing for term-by-term convergence. In Section 6.4, we shall explore
the connection between measurability and uniform convergence. This will lead into
a discussion of some of the varied ways in which sequences can converge, a theme
that will be picked up and developed much further in Chapter 8.
Proof Since complements and countable intersections of measurable sets are mea-
surable and
(6.1)
(6.2)
Corollary 6.2 (Lebesgue Sets $S_i$ Are Measurable). If $f$ is measurable on $[a, b]$,
then $\{x \in [a, b] \mid c \le f(x) < d\}$ is measurable.
If $k < 0$, then the second set is $\{x \in [a, b] \mid f(x) < c/k\}$. If $c < 0$, then
$\{x \in [a, b] \mid f^2(x) > c\} = [a, b]$. If $c \ge 0$, then
Since
fg=
$fg$ is measurable. If $c < 0$, then $\{x \in [a, b] \mid |f(x)| > c\} = [a, b]$. If $c \ge 0$, then
$\{x \in [a, b] \mid |f(x)| > c\} = \{x \in [a, b] \mid f(x) < -c\} \cup \{x \in [a, b] \mid f(x) > c\}$.
Proof Recall that any set of outer measure zero is measurable (Theorem 5.5), and
thus any subset of a set of measure zero also has measure zero. There is a set S of
measure 0 such that
$\lim_{n \to \infty} f_n(x) = f(x)$
for all $x \in [a, b] - S$, where $m(S) = 0$. If we choose some $c \in \mathbb{R}$, then the sets
and
are not necessarily identical, but any element that is in one but not in the other must
be in S.
Let
$S_1 = F_1 - F_2$ and $S_2 = F_2 - F_1$.
From Proposition 6.4, we know that $F_2$ is measurable. Since $S_1$ and $S_2$ are subsets
of $S$, they are also measurable. I leave it for you (Exercise 1.2.14) to show that
$F_1 = (F_2 \cup S_1) \cap S_2^C$.
Therefore, $F_1$ is measurable.
We now focus on the kind of function we want to use in our approximation to
the Lebesgue integral.
A function is simple if and only if it can be written as a finite linear combination
of characteristic functions of measurable sets. Simple functions admit many
different representations, but each has a unique representation of the form
$\phi(x) = \sum_{i=1}^{n} l_i \chi_{S_i}(x)$,
where the $l_i \in \mathbb{R}$ are distinct, $\chi_{S_i}$ is the characteristic function of $S_i$, and the sets $S_i$ are
measurable, pairwise disjoint, and their union is the domain of $\phi$. In dealing with
a generic simple function, we shall assume that the representation we are using is
this unique representation.
Proof. I leave it for Exercise 6.1.5 to prove that simple functions are measurable.
It follows from Theorem 6.5 that the limit almost everywhere of simple functions,
if it exists, is a measurable function.
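In the other direction, let $f$ be measurable. For each positive integer $n$ and for
$-n2^n \le k < n2^n$, define
$E_{n,k} = \{x \in [a, b] \mid k/2^n \le f(x) < (k+1)/2^n\}$.
Define
$\phi_n(x) = \sum_{k=-n2^n}^{n2^n - 1} \frac{k}{2^n}\, \chi_{E_{n,k}}(x)$.
For $n > |f(x)|$, we know that $|f(x) - \phi_n(x)| < 2^{-n}$, and therefore
$f$ is the limit of these simple functions. If $f$ is bounded below, $f(x) \ge A$ for
all $x \in [a, b]$, then for all $n > |A|$ this sequence is monotonically increasing.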
Note that f does not need to be a bounded function. For the Riemann integral,
we had to twist ourselves in knots to handle unbounded functions. For the Lebesgue
integral, unbounded functions present no special problems.
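A short sketch may make the approximation by simple functions concrete. It assumes the standard dyadic construction, in which $\phi_n(x)$ rounds $f(x)$ down to the nearest multiple of $2^{-n}$ whenever $-n \le f(x) < n$ and is $0$ otherwise; the function name `phi` and the unbounded test function $1/\sqrt{x}$ are hypothetical choices for illustration only.

```python
import math

def phi(f, n):
    """Return the n-th dyadic simple approximation to f (a sketch).

    phi_n(x) = k / 2^n when k / 2^n <= f(x) < (k + 1) / 2^n and -n <= f(x) < n,
    and phi_n(x) = 0 when f(x) lies outside [-n, n).
    """
    scale = 2 ** n

    def phi_n(x):
        y = f(x)
        if -n <= y < n:
            return math.floor(y * scale) / scale
        return 0.0

    return phi_n

def unbounded(x):
    """An unbounded, nonnegative test function on (0, 1]."""
    return 1 / math.sqrt(x)

x = 0.04                      # unbounded(x) is 5.0
for n in (3, 6, 9, 12):
    print(n, phi(unbounded, n)(x))   # 0.0 until n exceeds f(x), then within 2**-n of f(x)
```

Each `phi(f, n)` takes only finitely many values, so it is a simple function whenever $f$ is measurable. Because the truncation at height $n$ eventually passes every finite value $f(x)$, the construction does not require $f$ to be bounded.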
Lebesgue integration completely subsumes all integrals that can be defined us-
ing Riemann's definition and, fortunately, when they both exist, the values of the
Riemann and Lebesgue integrals are the same.
Proof Let f be a Riemann integrable function on [a, b]. For each positive integer
n, we define as the partition of [a, b] into equal intervals of length (b —
Let 'n,k be the kth interval of this partition,
r
Ink=[a+(a—b)
k—i k\
=
and let = infXEjflk f(x). We define the simple function
= XIflk•
lim
fl—+ 00
= <f(x).
All that remains is to prove that /(x) = f(x) almost everywhere. Choose any
x e [a, b]. For each n, choose k so that x E 'n,k• The oscillation of f over 'n,k is
content, the measure of $S_{1/k}$ is zero. The set of points at which the oscillation is
positive is the union
$\bigcup_{k=1}^{\infty} S_{1/k}$.
This is a countable union of sets of measure zero, so it also has measure zero.
We have shown that = f(x) almost everywhere. Since each is
measurable, Theorem 6.5 implies that f is measurable.
Proof. We confirmed one direction in the previous proof. All that remains is to
show that if f is bounded and continuous almost everywhere, then it is Riemann
integrable. Again let $S_\alpha$ be the set of points at which the oscillation is greater than
or equal to $\alpha > 0$. By Theorem 2.5, we need to show that this set has outer content
zero. Since it is a subset of the set of points at which $f$ is discontinuous, we know
that $S_\alpha$ has measure zero. The problem is that $m(S_\alpha) \le c_e(S_\alpha)$, and we need to
show that $c_e(S_\alpha) = 0$. We shall need to be clever.
The fact that $m(S_\alpha) = 0$ means that for any $\epsilon > 0$, we can find a countable open
cover of $S_\alpha$ for which the sum of the lengths of the intervals is less than $\epsilon$. If we
can show that $S_\alpha$ is closed, then we can use the Heine–Borel theorem to conclude
that there is a finite subcover of $S_\alpha$ for which the sum of the lengths of the intervals
is less than $\epsilon$. This is exactly what we need to conclude that $c_e(S_\alpha) = 0$.
Our proof has come down to showing that $S_\alpha$ is closed. We will show that its
complement is open. If $c \in S_\alpha^C$, then the oscillation at $c$ is strictly less than $\alpha$,
$\omega(f; c) = \overline{\lim}_{x \to c} f(x) - \underline{\lim}_{x \to c} f(x) < \alpha$.
Let $\epsilon = (\alpha - \omega(f; c))/3$. By the definition of $\overline{\lim}$ and $\underline{\lim}$, we can find an open
neighborhood of $c$ in which
There is a delicious irony here. Riemann introduced his definition of the integral
for the purpose of understanding how discontinuous a function could be and still be
integrable. It appeared that it could be very discontinuous, having discontinuities
at all rational numbers. In fact, there are Riemann integrable functions with dis-
continuities at the points in a set with cardinality c. Now that we are finally putting
the Riemann integral behind us, we get the answer that Riemann was seeking. A
Riemann integrable function is always a very continuous function. It is continuous
almost everywhere. A function that is discontinuous only at the rational numbers
is not very discontinuous.
The Lebesgue integral enables us to handle truly discontinuous functions. Dirich-
let's function, the characteristic function of the rationals, was created to show that
a function could be so discontinuous that it would make no sense to talk about
its integral. This was considered a function beyond the pale. Yet, as we shall see,
the Lebesgue integral of this function has a simple and natural meaning. Over any
interval, the integral of this function is the measure of the set of rationals in that
interval, which is zero.
Exercises
6.1.1. Using the definition of a measurable function, show that any constant func-
tion is measurable.
6.1.2. Prove equation (6.5).
6.1.3. Show that
$\chi_{S \cap T} = \chi_S \cdot \chi_T$,
$\chi_{S \cup T} = \chi_S + \chi_T - \chi_S \cdot \chi_T$,
$\chi_{S^C} = 1 - \chi_S$.
6.1.4. Let be a sequence of sets. Prove the equality of the characteristic
function of the infimum of these sets (defined in Exercise 5.3.15) and the lim inf
of the characteristic functions of these sets,
X urn = n—+
iiJJ2
00
for which the sets S, are both measurable and pairwise disjoint.
6.1.7. Prove that any sum or product of finitely many simple functions is a simple
function.
6.1.8. If $|f|$ is measurable, does it necessarily follow that $f$ is measurable?
6.1.9. Let f be a real-valued function defined on R. Show that the condition
6.2 Integration
We begin with the definition of the Lebesgue integral of a simple function (see
below) and prove a few of the properties we would expect of an integral.
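As a concrete handle on the integral of a simple function, here is a minimal sketch. It assumes the standard definition $\int_E \phi(x)\,dx = \sum_i l_i\, m(S_i \cap E)$ and, to keep measures computable, restricts every set to a finite union of disjoint intervals; the names `overlap_length` and `integrate_simple` and the sample function are hypothetical.

```python
def overlap_length(interval, e):
    """Length of the intersection of two intervals given as (left, right) pairs."""
    (a, b), (c, d) = interval, e
    return max(0.0, min(b, d) - max(a, c))

def integrate_simple(terms, e):
    """Integral over the interval e of a simple function.

    terms is a list of (value, intervals) pairs: the function takes `value`
    on the union of `intervals`, and all the intervals are pairwise disjoint.
    """
    return sum(
        value * sum(overlap_length(piece, e) for piece in intervals)
        for value, intervals in terms
    )

# A simple function equal to 2 on [0, 1) and 5 on [1, 3).
phi_terms = [(2.0, [(0.0, 1.0)]), (5.0, [(1.0, 3.0)])]

whole = integrate_simple(phi_terms, (0.5, 2.0))
left = integrate_simple(phi_terms, (0.5, 1.0))
right = integrate_simple(phi_terms, (1.0, 2.0))
print(whole, left + right)   # additivity over a split of the domain
```

Splitting the domain at $1$ and adding the two pieces reproduces the integral over the whole interval, which is the additivity over disjoint domains that one expects of any integral.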
c
dx
3. if 4(x) < all x e E, then IE 4(x)dx s IE and
= IE, + IE,
The last of these statements may look a little unusual. It is simply a generalization
of the identity
$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$.
Proof We begin by setting
m n
i=1 j=1
where the are distinct, the are distinct, the S, are pairwise disjoint, the T1 are
pairwise disjoint, and E = = T1.
m m
= fl E) = fl E) =
fE
= 1=1
where the I, are distinct, the S, are pairwise disjoint, and is the domain of
S1
dx fl E).
fE =
It follows that
+ dx = + fl Tj fl E)
fE
m n
= k, m fl Tj fl E)
i=1 j=1
n m
flT1flE)
j=1 i=1
m n
= k, fl E) + m(T1 fl E)
= fE
dx
+ f
3. The assumption 4 i/i for all x e E implies that we can rewrite our functions
as
= i/f =
f dx = fl Tj fl E) < fl Tj fl E)
= f dx.
= fl fl
m m
= f dx
+
dx.
We have one more result to prove about integrals of simple functions. Like many
of our results, it applies to monotonic sequences.
Proof Since every bounded increasing sequence converges, the conclusion of this
proposition is equivalent to the statement that U, the set of x e E for which
is unbounded, is a set of measure zero. Choose any E > 0, and define
={x e AlE).
Since is nonnegative,
E
< I dx < JEI dx <A.
f= — where (6.10)
We saw in Theorem 6.6 that any nonnegative function is the limit of a mono-
tonically increasing sequence of simple functions. We define the integral of f in
terms of the integrals of simple functions.
f f f f(x)dx. (6.14)
Note that an integral might have the value $\infty$ or $-\infty$ even though the function
is not integrable. This is analogous to the situation of a sequence that diverges to
infinity. Such a sequence does not converge, but to say that this sequence approaches
$\infty$ or that it approaches $-\infty$ still says something meaningful.
The following proposition follows immediately from the definition of the
Lebesgue integral.
I f(x)dx = 0.
JE
(6.15)
I f(x)dx=0.
JE
0= [f(x)dx= I
JE
and therefore = 0. Since
S S S S 7 fi
S S 7 f2
S S S S S 7
• S 7 fr—i
Q5n,2 Q5n,3 S s 7
Figure 6.1. Monotonically increasing sequences of simple functions.
= max
= max max =
1<j,k<n 1<j,k<n—1
supf
f g(x)dx = supf Ø(x)dx
E E E
Ø(x)dx
= f Eh(x)dx.
(6.19)
From Proposition 6.12, if g(x) h(x) except possibly forx e U, where m(U) = 0,
then
dx f(x)
the third part of the proof, we begin with any simple function less than
or equal to f, 0 = f, where the S, are pairwise disjoint and
U 5, = E. Choose an a between 0 and 1, and define the sets
A1cA2c..., UAn=E.
n=1
By Corollary 5.12,
Choose an E > 0. For each set we can find an so that n implies that
m(51 fl Ar)> (1 — fl E). Let N = max{Ni, N2, ..., Nk}.
JE
[
I
k
=a fl
(i — flE)
= fl E)— fl E)
=a [Ø(x)dx—Em(E).
JE
Since this is true for every E > 0 and for every a between 0 and 1, it follows that
I
JE
I
JE
We have shown that every simple function 0 f has an integral that is dominated
by fE dx for all n sufficiently large. We can conclude that
E
It took some work to prove this theorem, but we are rewarded with four important
corollaries.
Proof We know that I fi, cf, and f + g are measurable. Since the integrals of
f f f
To show that cf and f + g are integrable, we need to establish that the integrals
of (cf), (f + g) are finite. We see that
(cf) +
(f + (f + +
Since these functions are nonnegative and we know that the integrals of the larger
functions are all finite, so are the integrals of the smaller functions.
We first establish equations (6.22) and (6.23) for $f$, $g$ nonnegative, $c \ge 0$. If
$c = 0$, then equation (6.22) is trivially true. Let $(\phi_n)$ be a monotonically increasing
sequence of simple functions that converges to $f$. It follows that $(c\phi_n)$ is a
monotonically increasing sequence of simple functions that converges to $cf$. By
Theorem 6.14,
$\int_E c f(x)\,dx = \lim_{n \to \infty} \int_E c\,\phi_n(x)\,dx = \lim_{n \to \infty} c \int_E \phi_n(x)\,dx = c \int_E f(x)\,dx$.
= lim I dx + lim I dx
= [f(x)dx+ [g(x)dx.
JE JE
f cf(x)dx=f c.f+(x)dx_f
E
= c(f f f(x)dx)
=c [f(x)dx.
JE
To conclude the proof of equation (6.23), we begin with the observation that
(f + g= — + — g,
and therefore
(f + g = (f + + +
We integrate each side of this equality. All of the summands are nonnegative
measurable functions, so we can write each integral as a sum of integrals:
f f f(x)dx f f g(x)dx,
I (f+g)(x)dx= JE[f(x)dx+ [g(x)dx.
JE JE
In the next corollary, we see that term-by-term integration is correct for series of
nonnegative measurable functions provided that the sum of the integrals converges.
The proof is left as Exercise 6.2.7.
o
<;,
—
f(x)dx_f
Let 1 = — 1/IN, which is also a simple function. We have that
b b
f(x) -
f dx
= b
f —
f dx
Our final corollary will be a very important result that shows that for any inte-
grable function, even an unbounded function, we can force the integral to be as
small as we wish by taking a domain with sufficiently small measure.
I f(x)dx
Js
<E. (6.26)
Proof. We know from Corollary 6.17 that for any $\epsilon > 0$ we can approximate $f$ by
a simple function $\phi$ so that
$\int_a^b |f(x) - \phi(x)|\,dx < \frac{\epsilon}{2}$.
Since $\phi$ is simple, it is bounded, say $|\phi(x)| \le B$ for all $x \in [a, b]$. We can choose
$\delta = \epsilon/2B$. Given any set $S \subseteq [a, b]$ of measure less than $\epsilon/2B$,
Js 2B 2
It follows that
f
f dx
f(x) dx dx
<f a S
+= E.
Exercises
6.2.1. Find the Lebesgue integral over $[0, 1]$ of the function $f$ defined by
$f(x) = \begin{cases} x^2, & x \in [0, 1] - \mathbb{Q}, \\ 1, & x \in [0, 1] \cap \mathbb{Q}. \end{cases}$
Is this function Riemann integrable over $[0, 1]$?
6.2.2. Using the Cantor ternary set, SVC(3), define the function g on [0, 1J by
xeSVC(3),
gx
(
— x is in a removed interval of length
Find the value of the integral g(x) dx. Is this function Riemann integrable over
[0, 1]?
6.2.3. Using the Cantor ternary set, SVC(3), define the function h by
Find the value of the integral h(x) dx. Is this function Riemann integrable over
[0, 1]?
$\phi(x) = \sum_{i=1}^{n} k_i \chi_{T_i}(x)$,
where $(k_1, \ldots, k_n)$ are any $n$ real numbers, not necessarily distinct, and $(T_1, \ldots, T_n)$
are measurable sets, not necessarily pairwise disjoint and whose union is not
necessarily the domain of $\phi$, but for which $\phi(x) = 0$ for all $x \in \left(\bigcup_{i=1}^{n} T_i\right)^C$, then
it is still true that for any measurable set $E$,
$\int_E \phi(x)\,dx = \sum_{i=1}^{n} k_i\, m(T_i \cap E)$.
6.2.5. Explain why it is that if $f_n$ and $\phi$ are measurable functions, then
$A_n = \{x \in E \mid f_n(x) > \alpha\phi(x)\}$ is a measurable set.
6.2.6. Compare the hypotheses of the monotone convergence theorem, Theorem 6.14,
with those of the Arzelà–Osgood theorem, Theorem 4.5 on p. 106.
Give an example of a sequence of functions that satisfies the hypotheses of the
Arzelà–Osgood theorem but not those of the monotone convergence theorem. Give
an example of a sequence of functions that satisfies the hypotheses of the monotone
convergence theorem but not those of the Arzelà–Osgood theorem.
6.2.7. Prove Corollary 6.16.
6.2.8. Show that the conclusion of Corollary 6.16 can be false if we do not
require that > 0 for all n, even if we strengthen the bounding condition to
I
f(x))dxl <A.
6.2.9. Let 1
be a series of integrable functions for which
°° b
a
n=1
converges.
6.2.11. In the proof of Proposition 6.11, explain why En C En+i. Then show that
if is unbounded for some x, then x must be contained in at least one of
the Ek.
6.2.12. Show that if $f$ is Lebesgue integrable on $E$ and if
$S_n = \{x \in E \mid |f(x)| \ge n\}$,
then $\lim_{n \to \infty} n \cdot m(S_n) = 0$.
6.2.13. Show that if $f$ and $g$ are integrable over $E$ and $f(x) \le g(x)$ for all $x \in E$,
then
$\int_E f(x)\,dx \le \int_E g(x)\,dx$.
6.2.14. Show that we can still conclude that $\int_E f(x)\,dx \le \int_E g(x)\,dx$ with the
weaker hypothesis that $f(x) \le g(x)$ almost everywhere on $E$.
f f
f f 0 almost everywhere on
Ek={x e
6.2.20. Let $f$ be a nonnegative, measurable function on the set $F$, $m(F) < \infty$.
Prove that f is Lebesgue integrable if and only if m (Fk) converges, where
Fk=
6.2.21. Let $f$ be a nonnegative, measurable function on the set $G$, $m(G) < \infty$. For
$\epsilon > 0$, define
Prove that
lim S(E)
E-±O
= I f(x) dx.
JG
6.3 Lebesgue's Dominated Convergence Theorem
I I
g almost everywhere in E,
I f(x)dx =
JE
lim I (6.27)
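In terms of infinite series, this says that if the $f_n$ are integrable functions, if the
series converges almost everywhere, and if the partial sums are bounded by an
integrable function, $g$, then
$\int_E \left(\sum_{n=1}^{\infty} f_n(x)\right) dx = \sum_{n=1}^{\infty} \left(\int_E f_n(x)\,dx\right)$. (6.28)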
Uniform Convergence
Weierstrass had shown that if a sequence of Riemann integrable functions converges
uniformly, then the integral of the limit is the limit of the integrals. We shall show
that this is a special case of Theorem 6.19, the dominated convergence theorem.
To say that $(f_n)$ converges uniformly to $f$ over $[a, b]$ means that for any $\epsilon > 0$
we can find a response $N$ so that $n \ge N$ implies that
then g will also be measurable and bounded, and therefore, integrable on [a, b].
Since
Bounded Convergence
Arzelà's generalization of Osgood's theorem says that if a sequence of integrable
functions converges to an integrable function and there is a finite bound A such
that
then the integral of the limit is the limit of the integrals. Again, this is a special
case of Theorem 6.19, the dominated convergence theorem. In this case, we define
g(x) = + 2A.
— f(x) - + g(x).
$f_n(x) = \frac{n^2 x}{1 + n^3 x^2}$.
These functions converge to zero on $[0, 1]$, but the convergence is not bounded. The
maximum value of this function on $[0, 1]$ occurs at $x = n^{-3/2}$ and is equal to $\sqrt{n}/2$,
a value that does not stay bounded. Nevertheless, as we saw in equation (4.9) on
p. 102, the integral of the limit is equal to the limit of the integrals. This example
can be explained by the dominated convergence theorem.
(see Exercise 6.3.3). This is an unbounded function, but it is integrable over [0, 1]
in the Lebesgue sense.
We can use an improper Riemann integral to verify that g is Lebesgue integrable:
$\lim_{a \to 0^+} \int_a^1 \frac{2^{2/3}}{3}\, x^{-1/3}\,dx = \lim_{a \to 0^+} \left(2^{-1/3} - 2^{-1/3} a^{2/3}\right) = 2^{-1/3}$.
The unproven theorem that we are using is that any strictly nonnegative function
for which the improper Riemann integral exists will be Lebesgue integrable (see
Exercise 6.3.4).
For a rigorous verification that g is Lebesgue integrable, we show how to express
g as a limit of simple functions. For 1 < i <rn, let be the interval [(i —
1)/rn, i/rn), which is closed on the left, open on the right, and let S = [(rn —
We see that
1 m 22/3 . —1/3 m 22/3 . —1/3
1
—
m
22/3
—13
1/.
If we divide by $m^{2/3}$ and then take the limit as $m$ approaches $\infty$, we get
$\lim_{m \to \infty} \frac{2^{2/3}}{3 m^{2/3}} \sum_{i=1}^{m} i^{-1/3} = \frac{2^{2/3}}{3} \cdot \frac{3}{2} = 2^{-1/3}$.
The integrals of the step functions are bounded, and therefore g is integrable.
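The domination in this example can be checked numerically. The sketch below rests on an assumption about symbols that are only partly legible here: it takes the sequence to be $f_n(x) = n^2 x/(1 + n^3 x^2)$ and the dominating function to be $g(x) = 2^{2/3}/(3x^{1/3})$, which is what the surviving denominator and the value $2^{-1/3}$ of the integral suggest. The function names are hypothetical.

```python
import math

def f(n, x):
    """Assumed form of the sequence: f_n(x) = n^2 x / (1 + n^3 x^2)."""
    return n**2 * x / (1 + n**3 * x**2)

def g(x):
    """Assumed dominating function: the supremum of t^2 x / (1 + t^3 x^2) over real t > 0."""
    return 2 ** (2 / 3) / (3 * x ** (1 / 3))

def integral_of_f(n):
    """Closed form of the integral of f_n over [0, 1]: ln(1 + n^3) / (2n)."""
    return math.log(1 + n**3) / (2 * n)

def step_integral_of_g(m):
    """Integral of the step function equal to g(i/m) on the i-th of m equal subintervals."""
    return sum(g(i / m) for i in range(1, m + 1)) / m

for n in (1, 10, 100, 1000):
    peak = f(n, n ** -1.5)              # the maximum of f_n, equal to sqrt(n) / 2
    print(n, peak, integral_of_f(n))    # peaks grow while integrals shrink

print(step_integral_of_g(10**5), 2 ** (-1 / 3))   # step integrals approach 2^(-1/3)
```

The peaks of $f_n$ are unbounded, so no constant bound works, yet every $f_n$ sits under the single integrable function $g$. That is the situation the dominated convergence theorem covers, and the integrals $\ln(1 + n^3)/(2n)$ do tend to $0$, the integral of the limit.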
As we saw there,
Each $f_n$ is nonzero on a distinct interval, and therefore the integral of the
supremum over $[0, 2]$ is the sum of the integrals of the $f_n$:
2 °°
1
[
JO
sup
n>1 n=1
lim = 0,
fl 00
and
1
urn / = urn — =0= / urn
n—*ooJo n—*oon Jo
This is a sequence that is not dominated by an integrable function, and yet the
limit of the integrals does equal the integral of the limit. Lebesgue's condition is
sufficient but not necessary.
Nevertheless, the dominated convergence theorem is extremely useful. It gives us
a very generous condition under which term-by-term integration is always allowed.
Fatou's Lemma
Pierre Fatou (1878—1929) studied as an undergraduate at the École Normale
Supérieure, attending from 1898 to 1901. The result that carries his name was
part of his doctoral thesis of 1906. He worked as an astronomer at the Paris ob-
servatory. Much of his mathematics involved proving the existence of solutions to
systems of orbital differential equations. He also studied iterative processes and
was the first to investigate what today we call the Mandelbrot set.
If we think back to our examples where the limit of the integrals of a sequence of
functions is not equal to the integral of the limit (such as Example 4.6 on p. 100),
we see that the integral of the limit was always less than the limit of the integrals.
Fatou's lemma says what should be intuitively apparent, that if the functions are
nonnegative then we can never get the inequality to go in the other direction.
$\int_E \underline{\lim}_{n \to \infty} f_n(x)\,dx \le \underline{\lim}_{n \to \infty} \int_E f_n(x)\,dx$. (6.29)
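A small exact computation shows that the inequality in Fatou's lemma can be strict. The sequence used here, $f_n = n$ on the open interval $(0, 1/n)$ and $0$ elsewhere on $[0, 1]$, is a standard illustration chosen for this sketch and is not taken from the text; the function names are hypothetical. Exact rational arithmetic avoids any rounding in the comparison.

```python
from fractions import Fraction

def f(n, x):
    """f_n equals n on the open interval (0, 1/n) and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_of_f(n):
    """Integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
tail_values = [f(n, x) for n in range(11, 60)]
print(set(tail_values))                               # only 0 appears once 1/n < x
print({integral_of_f(n) for n in range(1, 60)})       # every integral equals 1
```

At each fixed $x$ the values $f_n(x)$ are eventually $0$, so the integral of the lower limit is $0$, while every $\int f_n$ equals $1$. The integral of the limit falls below the limit of the integrals, which is the only direction Fatou's lemma permits for nonnegative functions.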
Proof. Define the sequence $(g_m)$ by $g_m(x) = \inf_{n \ge m} f_n(x)$. It follows that for all
$n \ge m$, we have
JE
I gm(x)dx JE
and therefore
JE
f gm(x)dx lim f
n—÷ooJE
f lim
E n—* m -* 00 f gm(x)dx
E
lim fE
n—* 00
$|f| = f^+ + f^- \le g$ almost everywhere.
Since $f^+$ and $f^-$ are nonnegative, they are each bounded above almost everywhere
by $g$. By Proposition 6.12, if we change the value of a function on a set of measure
zero, it does not change the value of the integral. Therefore, the integrals of $f^+$
and of $f^-$ are bounded above by the integral of $g$, which is finite. We have proven
that $f$ is integrable.
Since $g + f_n$ is nonnegative, we can apply Fatou's lemma:
fg(x)dx
+ f f(x)dx = f(g + f)(x)dx
<lim [(g+fn)(x)dx
fl—*00 JE
= lim
fl—*00 E E
[g(x) dx +
= JE dx.
n-+ooJE
Therefore,
f g(x)dx —
f f(x)dx = f(g — f)(x)dx
f(g_fn)(x)dx
n—*oo JE
= I g(x) dx — lim
JE
I dx.
Therefore,
lim I < I
JE
f(x)dx. (6.32)
At this point, it is worth going back and comparing this to Osgood's proof of a
much weaker result in Section 4.3. What should be most striking are the knots we
had to tie ourselves into to deal with those sets on which the convergence was not
nice. It is the ability to neatly excise troublesome sets of measure zero that makes
all the difference.
Exercises
6.3.1. Show that if m n are positive integers, then
ri 11 ri
1 11
I—,-—+——InI--'--+—I=Ø.
1
f f dx.
6.3.8. Show that there is no sequence of functions on $[0, 2\pi]$ of the type
$f_n(x) = a_n \sin(nx) + b_n \cos(nx)$,
which converges to the function 1 almost everywhere on $[-\pi, \pi]$, and where
$|a_n| + |b_n| \le 10$.
f (>fn(x))dx=>2(f fn(x)dx).
6.3.11. Let (fe,) be a sequence of integrable functions and let f be an integrable
function such that
pb
lim I (x) — f(x) dx = 0.
Ja
lim I dx = I f(x)dx.
JE
6.4 Egorov's Theorem 191
fn(x)dx)
f' dx
(f
and
Show that
f fn(x)) dx = fn(x)dx).
lim /
fl—*O0
f(x) cos(nx) dx = 0.
6.3.17. Show that if f is integrable on (—oo, oo) and g is bounded and measurable,
then
By "almost," we mean that it is true except for a set of measure less than $\epsilon$, where $\epsilon$
can be any positive number, no matter how small. This means that we can first try
proving our theorem in the greatly simplified case where our sets are finite unions
of open intervals, our functions are continuous, and our convergent sequences
converge uniformly. We then use these "almost" statements to expand the range of
situations in which our theorem holds.
The actual theorem summarized in the first principle is Theorem 5.11, that for
any measurable set and any $\epsilon > 0$ we can find a finite union of open intervals
so that the symmetric difference between the original set and this finite union
has measure less than $\epsilon$. The theorems that correspond to the second and third
principles will be proven in this section. The second principle corresponds to
Luzin's theorem, Theorem 6.26. The third principle is made explicit in Egorov's
theorem, Theorem 6.21.
In Section 4.3 we saw how Osgood approached the justification of term-by-term
integration by looking for a large subset of our interval on which convergence is
uniform. Osgood sought to isolate the F-points, the points that are most problematic
for uniform convergence.
As explained in the last section, we do not need to address uniform convergence
directly in order to prove the dominated convergence theorem, but there is an
implicit use of uniform convergence. In 1911 Dimitri Egorov would make explicit
the connection between Lebesgue measure and uniform convergence. Egorov's
student, Nikolai N. Luzin, then used this result to prove that every measurable
function is almost continuous. By this, we mean that given any $\epsilon > 0$ and any
measurable function $f$, we can remove a set of measure $< \epsilon$ from the domain of $f$,
and $f$ will be continuous on what remains. Luzin was not the first to observe this.
Lebesgue had stated this theorem, though without providing a proof, in 1903.¹
Vitali published a proof of this result in 1905.
Dimitri Fedorovich Egorov (1869—1931) began teaching at Moscow University
in 1894 and earned his doctorate there in 1901. In addition to his work in real
analysis, he is noted for his contributions to differential geometry. In 1923, he
was appointed director of the Institute for Mechanics and Mathematics at Moscow
State University. Egorov protested against the arrests and execution of clergy in the
1920s and also against the attempt to impose Marxist methodology in science. He
was dismissed in 1929, arrested in 1930, and died in exile a year later.
Examples 4.6—4.8 from Section 4.3 are all nonuniformly convergent sequences.
But if we remove any neighborhood of 0, no matter how small, each of these
sequences converges uniformly on the interval that remains. These are all almost
uniformly convergent.
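The phenomenon of almost uniform convergence is easy to observe numerically. The sketch below uses a hypothetical sequence of the same flavor, $f_n(x) = n^2 x/(1 + n^3 x^2)$ on $[0, 1]$; it is an assumption made for illustration, not a claim about which sequences Examples 4.6 to 4.8 contain. The helper `sup_on` only samples a grid, so it approximates the supremum from below.

```python
def f(n, x):
    """A hypothetical nonuniformly convergent sequence: n^2 x / (1 + n^3 x^2)."""
    return n**2 * x / (1 + n**3 * x**2)

def sup_on(n, left, right=1.0, samples=10_000):
    """Approximate the supremum of f_n over [left, right] on an evenly spaced grid."""
    step = (right - left) / samples
    return max(f(n, left + k * step) for k in range(samples + 1))

delta = 0.01
for n in (10**3, 10**4, 10**5):
    peak_near_zero = f(n, n ** -1.5)     # the maximum over [0, 1], equal to sqrt(n) / 2
    print(n, peak_near_zero, sup_on(n, delta))
```

On all of $[0, 1]$ the suprema grow like $\sqrt{n}/2$, so the convergence to $0$ is not uniform. Once the neighborhood $[0, \delta)$ is removed, $f_n(x) < 1/(nx) \le 1/(n\delta)$ on what remains, so the suprema there tend to $0$ and the convergence is uniform on $[\delta, 1]$.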
In a footnote, Lebesgue corrects a statement from a letter written to Borel in which he had claimed that one
could remove a set of measure zero and have the function be continuous on the set that remained.
The sequence $(g_n)$ converges uniformly to 0 on a given set if and only if $(f_n)$
converges uniformly to 0 on that set (see Exercise 6.4.3). The advantage of working
with $(g_n)$ is that it is a sequence of monotonically decreasing functions. Let $A$ be
the subset of $[a, b]$ on which $g_n(x) \not\to 0$. By our assumption, $m(A) = 0$.
Define
By Corollary 5.12,
$\lim_{n \to \infty} m(S_{k,n}) = m\left(\bigcup_{n=1}^{\infty} S_{k,n}\right) \ge m([a, b] - A) = b - a$.
Given any $\epsilon > 0$, for each $k$, we choose an $n$ that may depend on $k$, written $n(k)$,
so that
m(Sk,fl(k)) > b — a —
We set
00 00
Proof. Let $S_k$ be a set of measure $< 1/k$ for which $(f_n)$ converges uniformly to
$f$ on $[a, b] - S_k$. In particular, it converges to $f$. The set on which it does
not converge is contained in Sk, which, by Corollary 5.12, has measure
LI
Convergence in Measure
We have seen that convergence almost everywhere is equivalent to almost uniform
convergence. Both of these are weaker than pointwise convergence, which requires
convergence at every point. In the route we shall take to prove Luzin's theorem that
measurable functions are continuous once we remove an arbitrarily small set, we
make use of an even weaker type of convergence, convergence in measure.
Notice how similar this is to Kronecker's convergence (Theorem 4.6 on p. 110)
in which the outer content (rather than the measure) of this set must converge to 0.
Since the outer content is always greater than or equal to the measure, convergence
in measure is also considerably weaker than Kronecker's convergence.
fl,2,
converges to 0 in measure. However, for each $x \in [0, 1]$, there are infinitely many
functions in this sequence that take the value 1 at $x$. This sequence does not converge
at any $x$ in $[0, 1]$.
Our first result shows what might be expected, that uniform convergence implies
convergence in measure. The second result is perhaps more surprising. If a sequence
converges in measure, it might not converge almost everywhere (as in our example),
but there will always be a subsequence that converges almost everywhere.
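Both behaviors can be seen in a sliding-interval sequence. The sketch below assumes the indexing $f_{n,k} = \chi_{[(k-1)/n,\, k/n]}$ for $1 \le k \le n$, listed row by row as $f_{1,1}, f_{2,1}, f_{2,2}, f_{3,1}, \ldots$; that indexing and the function names are assumptions made for illustration and may differ from the example in the text. Exact fractions are used so that interval endpoints are compared without rounding.

```python
from fractions import Fraction

def index_pair(j):
    """Map j = 0, 1, 2, ... to (n, k) along the rows f_{1,1}, f_{2,1}, f_{2,2}, f_{3,1}, ..."""
    n = 1
    while j >= n:
        j -= n
        n += 1
    return n, j + 1

def f(n, k, x):
    """Characteristic function of the closed interval [(k - 1)/n, k/n]."""
    return 1 if Fraction(k - 1, n) <= x <= Fraction(k, n) else 0

def measure_where_large(n):
    """m({x in [0, 1] : f_{n,k}(x) >= a}) for any 0 < a <= 1: the interval length 1/n."""
    return Fraction(1, n)

x = Fraction(1, 3)
row_values = [[f(n, k, x) for k in range(1, n + 1)] for n in range(3, 8)]
print(row_values)                                      # every row contains both 0 and 1
print([f(n, 1, x) for n in range(1, 8)])               # the subsequence f_{n,1} settles at 0
print([measure_where_large(n) for n in range(1, 8)])   # measures shrink to 0
```

The exceptional sets have measure $1/n$, which tends to $0$, so the sequence converges to $0$ in measure. Every row still hits each $x$ with the value 1, so there is no pointwise limit anywhere, yet the subsequence $f_{n,1}$ converges to $0$ at every $x > 0$, that is, almost everywhere.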
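Proof. Choose any $\delta > 0$ and find a set $S$ of measure less than $\delta$ so that $(f_n)$
converges uniformly on $E - S$. Given $\alpha > 0$, there is a response $N$ so that for any
$n \ge N$ and any $x \in E - S$, $|f_n(x) - f(x)| < \alpha$. It follows that for $n \ge N$, the set
on which $|f_n(x) - f(x)| \ge \alpha$ is contained in $S$, and so
$\lim_{n \to \infty} m(\{x \in E \mid |f_n(x) - f(x)| \ge \alpha\}) < \delta$.
Since this holds for every $\delta > 0$,
$\lim_{n \to \infty} m(\{x \in E \mid |f_n(x) - f(x)| \ge \alpha\}) = 0$.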
Sk {x eE — f(x)I >
=
2 2 2
Let
S=flSk.
The measure of $S_k$ is bounded above by
$m(S_k) \le 2^{-k} + 2^{-k-1} + 2^{-k-2} + \cdots = 2^{1-k}$,
and therefore the measure of $S$ is
$m(S) = \lim_{k \to \infty} m(S_k) = 0$.
We need to show that if $x \in E - S$, then $f_{n_k}(x) \to f(x)$. Choose any $\epsilon > 0$ and
find K so that <E and x g SK. Then for all k K, x Sk, 50
fnk(x) — <E
of the intervals, and therefore every step function is continuous almost everywhere.
One of the consequences of Riesz's result is the following theorem.
Proof One direction is easy. Step functions are measurable functions, so Theo-
rem 6.5 implies that f must be measurable.
In the other direction, we know from Theorem 6.6 that f can be written as the
limit almost everywhere of a sequence of simple functions. Let be such a
sequence of simple functions, and let T be the set of measure zero on which this
sequence does not converge to f. We write as
where the Sk,n are measurable sets. By Theorem 5.11, we know that for each
we can find a finite union of open intervals,
N(k,n)
Uk,n 1k,n,i,
=
such that the measure of the symmetric difference between and is as small
a positive value as we might wish. In particular, we can have
Since each is a finite union of intervals, the function is a step function. Note
that the intervals in {U1 might overlap, but each intersection is
a finite union of intervals, and there are only finitely many such intersections, so
we can always rewrite as a simple function for which the measurable sets are
nonoverlapping intervals.
Let = Uk,fl). For every x e [a, b], x lies in exactly one of the
1 <k <ma. Therefore, if x g then = We also know that
rn, rn,
We have that
{x e >E} c {x e >E}UT.
Therefore, for n > N, we have
m ({x e [a,b] — f(x)f > E})
e f(x)I ?E})+0.
Since converges almost everywhere to f, it also converges in measure to f,
and therefore
urn m({x e >E}) =0.
n —k 00
Luzin's Theorem
Nikolai Nikolaevich Luzin (1883—1950) studied engineering at Moscow University
from 1901 to 1905. Here Egorov spotted his talent and encouraged him to pursue
mathematics. After graduation, Luzin began the study of medicine but returned
to Moscow University in 1909 to study mathematics with Egorov. They began
joint publications on function theory in 1910. Luzin's theorem comes from a
paper published in 1912. In 1915, Luzin received his doctorate. He was appointed
professor at Moscow University in 1917. In 1935 he became head of the Department
of the Theory of Functions of Real Variables at the Steklov Institute. In 1936
he was denounced for publishing his mathematical results outside of the Soviet
Union, activity that was viewed as anti-Soviet.2 He came close to dismissal and
possible imprisonment, but managed to survive. In addition to his work in function
theory, Luzin is noted for his contributions to descriptive set theory (measure-
theoretic and topological aspects of Borel sets and other σ-algebras) and to complex
analysis.
We are going to use the fact that a convergent sequence of measurable functions
is almost uniformly convergent to prove that for every measurable function, we
can remove a set of arbitrarily small measure and the function will be continuous
relative to what remains (see definition of continuity relative to a set on p. 114). We
know that if f is measurable over E, then we can find a sequence of step functions
that converge to f. We know that step functions are continuous almost everywhere.
2
For more on the "Luzin affair," see
Exercises
6.4.1. Define the functions by
0, otherwise.
6.4.2. Without using Egorov's theorem, show that for any $\epsilon > 0$, we can find a set
$S$, $m(S) < \epsilon$, such that the sequence of functions defined in Exercise 6.4.1 converges
uniformly on $[-2, 2] - S$.
6.4.3. Prove that if gn(x) = supm >n over [a, b], then (ga) converges uni-
formly over [a, b] if and only if converges uniformly over [a, b].
6.4.4. For the sequence given in Example 6.2, fi, i, fl,2, fl,3, ..., find a sub-
sequence that converges almost everywhere. Then apply the proof of Theorem 6.24
to this sequence and find the subsequence predicted by that proof. In other words,
find the first pair (k1, n i) such that
lim Ce({X e
fl —*00
=0.
Show that any sequence that converges in Kronecker's sense must converge in
measure.
6.4.6. Find an example of a sequence of functions that converges in measure but
does not converge in Kronecker's sense.
6.4.7. Give an example of a measurable function for which we cannot remove a
set of measure 0 and have a function that is continuous relative to what remains.
Justify your answer.
6.4.8. Consider $\chi_{SVC(4)}$, the characteristic function of SVC(4). Given any $\epsilon > 0$,
describe how to construct a set S (the construction may depend on $\epsilon$) with $m(S) < \epsilon$
and with the property that $\chi_{SVC(4)}$ is continuous relative to $[0, 1] - S$. Justify your
answer.
6.4.9. Let Al be the nonmeasurable set described in Theorem 5.13 on p. 151. Define
the function f by f(x) = q, where q is the unique rational number chosen so that
x e Al + q. Prove that f is discontinuous at every value of x.
6.4.10. Let $(f_n)$ be a sequence of measurable functions on E, $m(E) < \infty$. Show
that
\[ \lim_{n \to \infty} \int_E \frac{|f_n(x)|}{1 + |f_n(x)|}\,dx = 0 \]
if and only if $(f_n)$ converges to 0 in measure. Show that this result is false if we
omit the assumption $m(E) < \infty$.
6.4.11. In equation (6.33) in the proof of Egorov's theorem, Theorem 6.21, we
showed that $m(S) < \epsilon$. We chose $\epsilon$ to be an arbitrary positive integer. Where is the
flaw in our reasoning if we conclude from this statement that $m(S) = 0$?
6.4.12. Egorov's theorem does not claim that there exists a subset $E \subset [a, b]$ with
$m(E) = 0$ such that $(f_n)$ converges uniformly to f on $[a, b] - E$. However, show
that it does imply that there exists a sequence of measurable sets, (En), in [a, b]
such that
6.4.14. Prove that if for each $\epsilon > 0$ there is a measurable set E, $m(E) < \epsilon$, such
that f is continuous on $[a, b] - E$, then f is measurable on [a, b].
6.4.15. Let $(f_n)$ be a sequence of measurable functions. Show that the set E of
points at which this sequence converges must be a measurable set.
6.4.21. Let $(f_n)$ be a sequence that converges in measure to f on [a, b]. Show that
\[ \lim_{n \to \infty} \int_a^b \sin(f_n(x))\,dx = \int_a^b \sin(f(x))\,dx. \]
7
The Fundamental Theorem of Calculus
D f(c) = lim ,
x — C x — C
7.1 The Dini Derivatives 205
As examples, for f(x) = fx the Dini derivatives at 0 are
All of the Dini derivatives exist, provided the function is defined in a neighbor-
hood of c. A function is differentiable at c if and only if the four Dini derivatives
at c are finite and equal. It is possible to have a continuous and strictly increasing
function for which all four Dini derivatives at c are different (Exercise 7.1.1). There
is no necessary relationship among these derivatives at a single point except the
fact that the lim sup is always greater than or equal to the lim inf,
\[ D^+f(c) \ge D_+f(c), \qquad D^-f(c) \ge D_-f(c). \]
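As a rough numerical illustration, the sketch below estimates the four Dini derivatives at 0 of a hypothetical example function, f(x) = |x|(2 + sin(1/x)) with f(0) = 0. This function and the helper names are assumptions introduced here, not taken from the text. Taking the maximum and minimum of the difference quotient over a small window of offsets only approximates a lim sup and a lim inf, so the output should be read as suggestive rather than as a proof.

```python
import math

def f(x):
    """Hypothetical example: f(0) = 0 and f(x) = |x| * (2 + sin(1/x)) otherwise."""
    return 0.0 if x == 0 else abs(x) * (2.0 + math.sin(1.0 / x))

def dini_estimates(func, c, side, samples=7000, base=1000.0, step=0.001):
    """Approximate (lim sup, lim inf) of the difference quotient of func at c.

    side=+1 samples points to the right of c, side=-1 points to the left.
    The offsets are h = side / t with t in [base, base + samples*step], so
    1/h sweeps more than one full period of sin when samples*step > 2*pi.
    """
    quotients = []
    for k in range(samples):
        h = side / (base + k * step)
        quotients.append((func(c + h) - func(c)) / h)
    return max(quotients), min(quotients)

# Estimates of the upper and lower derivatives from the right and from the left.
upper_right, lower_right = dini_estimates(f, 0.0, +1)
upper_left, lower_left = dini_estimates(f, 0.0, -1)
print(upper_right, lower_right, upper_left, lower_left)
```

For this particular f the four estimates come out close to 3, 1, -1, and -3, so all four values differ, while on each side the upper value is at least the lower one.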
Theorem 7.1 (Dini's Theorem). Let f be Riemann integrable on [a, b]. For
$x \in [a, b]$ define
\[ F(x) = \int_a^x f(t)\,dt. \]
Proof We begin by noting that if g(x) = —f(x), h(x) = f(—x), and k(x) =
—f(—x), then (Exercise 7.1.2)
It follows that if this theorem holds for $D^+$, then it holds for each of the Dini
derivatives.
Since f is Riemann integrable on [a, b], it is bounded on this interval. For each
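$x \in [a, b]$, define
\[ l(x) = \liminf_{t \to x} f(t), \qquad L(x) = \limsup_{t \to x} f(t). \]
The functions l and L are bounded. For x > c, we have that
\[ \Bigl(\inf_{t \in [c,x]} f(t)\Bigr)(x - c) \le \int_c^x f(t)\,dt \le \Bigl(\sup_{t \in [c,x]} f(t)\Bigr)(x - c), \]
and, therefore,
\[ D^+F(c) = \limsup_{x \to c^+} \frac{1}{x - c}\int_c^x f(t)\,dt \le \lim_{x \to c^+}\Bigl(\sup_{t \in [c,x]} f(t)\Bigr) \le L(c), \]
\[ D_+F(c) = \liminf_{x \to c^+} \frac{1}{x - c}\int_c^x f(t)\,dt \ge \lim_{x \to c^+}\Bigl(\inf_{t \in [c,x]} f(t)\Bigr) \ge l(c). \]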
Bounded Variation
One of Dini's observations in his 1878 book was that if a function f has Dini
derivatives that are either bounded above or bounded below, then f can be written
as a difference of two monotonically increasing functions (see Exercise 7.1.3).
This is significant because Dirichlet's theorem that prescribed sufficient conditions
under which a function can be represented by its Fourier series included piecewise
monotonicity as one of the conditions. Three years later, in his 1881 paper Sur la
série de Fourier, Camille Jordan found a simple characterization that is equivalent
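Given a partition $P = (a = x_0 < x_1 < \cdots < x_n = b)$ of [a, b], the variation of f
with respect to P is
\[ V(P, f) = \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})|. \]
The total variation of f over [a, b] is the supremum of the variation over all
partitions,
\[ V_a^b(f) = \sup_P V(P, f). \]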
We say that f has bounded variation on this interval if the total variation is finite.
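To make the definition concrete, here is a minimal sketch that computes the variation of a function with respect to a partition, taken as the sum of the absolute increments over consecutive partition points. The function names and the sample function x^2 on [-1, 1] are assumptions chosen for illustration. Exact rational arithmetic is used so that the two values can be compared without rounding error.

```python
from fractions import Fraction

def variation(f, partition):
    """V(P, f): sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals, using exact rationals."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * Fraction(j, n) for j in range(n + 1)]

def square(x):
    return x * x

coarse = uniform_partition(-1, 1, 3)   # does not contain the turning point 0
fine = uniform_partition(-1, 1, 6)     # a refinement of `coarse` that contains 0
v_coarse = variation(square, coarse)
v_fine = variation(square, fine)
print(v_coarse, v_fine)
```

The coarse partition gives 16/9 and its refinement gives 2, which illustrates that refining a partition can only increase the variation. Since x^2 is monotonic on each side of 0, no partition does better than 2, so 2 is the total variation in this example.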
If we take the partition $(0, 1/(N\pi), 1/((N-1)\pi), \ldots, 1/(2\pi), 1/\pi)$, then the variation
is
1 1 1 1 1
We can make this as large as we want by taking N sufficiently large, so this function
does not have bounded variation.
Given any partition of $[0, 1/\pi]$, we find the smallest N so that $1/(N\pi)$ is less than
$x_1$, the first point of the partition that lies to the right of 0. The variation that
+
+
This function has bounded variation, even though it oscillates infinitely often in
any neighborhood of 0.
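Proof. We begin with the assumption that f = g - h, where g and h are monotonically
increasing. It follows that
\[ \begin{aligned}
V(P, f) &= \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})| \\
&= \sum_{j=1}^{n} \bigl|\bigl(g(x_j) - g(x_{j-1})\bigr) - \bigl(h(x_j) - h(x_{j-1})\bigr)\bigr| \\
&\le \sum_{j=1}^{n} \bigl(g(x_j) - g(x_{j-1})\bigr) + \sum_{j=1}^{n} \bigl(h(x_j) - h(x_{j-1})\bigr) \\
&= (g(b) - g(a)) + (h(b) - h(a)).
\end{aligned} \]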
In the other direction, we assume that f has bounded variation. We define the
function T by
\[ T(x) = V_a^x(f), \]
the total variation of f over the interval from a to x. The function T is monotonically
increasing. Since $f = T - (T - f)$, we only need to show that $T - f$ is also
monotonically increasing. We leave it as Exercise 7.1.8 to verify that for a < b < c,
we have
f(y) -
Corollary 7.3 (Continuity of Variation). The function f is continuous and of
bounded variation on [a, b] if and only if it is equal to the difference of two
continuous, monotonically increasing functions.
+
E
< lim T(x)-f---
2
This means that every sufficiently fine partition of [a, c] has a variation strictly less
than T(c) — E/2, but this contradicts the definition of T(c) as the supremum of the
variations over all partitions.
The other direction is left as Exercise 7.1.10.
Exercises
7.1.1. Find a strictly increasing, continuous function for which
$D^+f(0)$, $D_+f(0)$, $D^-f(0)$, and $D_-f(0)$ are all different.
7.1.2. Show that if g(x) = —f(x), h(x) = f(—x) and k(x) = —f(—x), then
D+f(c) = Wf(c) = D_f(c) =
7.1.3. Show that if all four Dini derivatives are bounded below by A and if c is any
constant larger than $|A|$, then $f(x) + cx$ is a monotonically increasing function of
x. It follows that f is the difference of two monotonically increasing functions.
7.1.4. Show that Dirichlet's function, the characteristic function of the rationals,
does not have bounded variation.
7.1.5. Prove that if a function has bounded variation on [a, b], then it is bounded
on [a, b].
7.1.6. Give two examples of continuous functions on [0, 1] that do not have
bounded variation and whose difference does not have bounded variation.
7.1.7. Show that the set of points of discontinuity of a monotonic function is
countable. Using this result, prove that any function of bounded variation has at
most countably many points of discontinuity.
7.1.8. Show that for a < b < c, we have
\[ V_a^c(f) = V_a^b(f) + V_b^c(f). \]
Exercises 7.1.14–7.1.20 will lead you through Dini's proof that if f is continuous
on [a, b] and $f(x) + Ax$ is piecewise monotonic for all $A \in \mathbb{R}$, then f is
x—y
for some sequence (li, 12,...)
7.1.19. Show that the sequence (la) is increasing and for all ii ? 1,
1 1 1 1
7.1.20. Let xo be any element of Show that f, the function with which
we began in Exercise 7.1.14, is differentiable at xO and that f'(xo) = lo.
7.1.21. Show that the devil's staircase, DS(x) (Example 4.1 on p. 86), is not
differentiable at 1/4.
Wf(x).
We start with inequality (7.3). Riesz observed that if $D_-f(x) < D^+f(x)$ for a
given value of x, then we can find two rational numbers, r and R, such that
\[ D_-f(x) < r < R < D^+f(x). \]
For each pair of rational numbers $0 < r < R < \infty$, we define the set
\[ E_r^R = \{x \in (a, b) \mid D_-f(x) < r < R < D^+f(x)\}. \]
There are a countable number of such pairs. If we can show that each set $E_r^R$ has
measure zero, then inequality (7.3) holds almost everywhere.
The set $E_r^R$ is the intersection of $E^R = \{x \in (a, b) \mid D^+f(x) > R\}$ and
$E_r = \{x \in (a, b) \mid D_-f(x) < r\}$. We need to limit the size of these sets. Notice that the
set of x for which $D^+f(x) = \infty$ is equal to $\bigcap_R E^R$. Using the flipping operation,
$h(x) = f(-x)$, we see that $D^+h(-x) = -D_-f(x)$, so whatever we can say about
$E^R$ can be translated into a comparable result for $E_r$. The key is to be able to limit
the size of $E^R$.
If $x \in E^R$, then we can find a z > x such that
\[ \frac{f(z) - f(x)}{z - x} > R, \quad \text{or, equivalently,} \quad f(z) - Rz > f(x) - Rx. \]
If we define $g(x) = f(x) - Rx$, then the set $E^R$ is contained in the set of shadow
points of g. The shadow points correspond to points in the valleys (see Figure 7.1).
A point x is a shadow point if the point (x, g(x)) on the graph of g lies in the
shadow of the rising sun. The next lemma provides the key to bounding the size of
$E^R$.
Lemma 7.5 (Rising Sun Lemma). Let g be continuous on [a, b]. The set of shadow
points of g that lie in (a, b) is a countable union of pairwise disjoint open intervals
$(a_k, b_k)$ for which
\[ g(a_k) \le g(b_k). \tag{7.5} \]
Proof. For each shadow point x, we can use the continuity of g to find a small
neighborhood of x that is left of z and over which the value of the function stays less
than g(z). This tells us that the set of shadow points is an open set. By Theorem 3.5,
the set of shadow points is a countable union of pairwise disjoint open intervals
$(a_k, b_k)$. What may seem obvious from looking at Figure 7.1 but actually takes
some work is that $g(a_k) \le g(b_k)$ (inequality (7.5)). This is a critical part of the
rising sun lemma.
We shall prove that $g(x) \le g(b_k)$ for every $x \in (a_k, b_k)$. The inequality for $x = a_k$
then follows from the continuity of g (see Exercise 7.2.4). For $a_k < x < b_k$, let
\[ A_x = \{y \in [x, b_k] \mid g(x) \le g(y)\}. \]
Since $x \in A_x$, $A_x$ is nonempty. It is bounded by $b_k$, so it has a least upper bound,
$t = \sup A_x$, with $g(t) \ge g(x)$ (see Exercise 7.2.5). If $t < b_k$, then $g(b_k) < g(x)$. Also,
if $t < b_k$, then $t \in (a_k, b_k)$, so t is a shadow point. We can find a z > t so that
g(z) > g(t). We have the inequalities
\[ g(z) > g(t) \ge g(x) > g(b_k). \tag{7.6} \]
Since t is the least upper bound of $y \in [x, b_k]$ for which $g(y) \ge g(x)$, z must be
larger than $b_k$, and, as we have seen, $g(z) > g(b_k)$. This means that $b_k$ is a shadow
point, a contradiction. Therefore $b_k = \sup A_x$ and $g(b_k) \ge g(x)$.
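The lemma can be explored on a grid. The sketch below is a discretised analogue, not the lemma itself: a grid point counts as a shadow point when some later grid point has a strictly larger value, and maximal runs of consecutive shadow points stand in for the intervals. The sample function, the grid size, and the helper names are all assumptions made for illustration.

```python
import math

def shadow_intervals(g, a, b, n=2000):
    """Grid approximation of the shadow points of g on [a, b].

    A grid point x_i counts as a shadow point when some later grid point
    x_j (j > i) has g(x_j) > g(x_i).  Returns the grid, the values, and the
    index pairs (first, last) of each maximal run of shadow points.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [g(x) for x in xs]
    # later_max[i] is the maximum of ys[i+1:], filled in from right to left.
    later_max = [-math.inf] * (n + 1)
    for i in range(n - 1, -1, -1):
        later_max[i] = max(later_max[i + 1], ys[i + 1])
    shadow = [ys[i] < later_max[i] for i in range(n + 1)]
    runs, start = [], None
    for i, flag in enumerate(shadow):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    return xs, ys, runs

def g(x):
    # Hypothetical sample function with several hills and valleys.
    return math.sin(3 * x) - 0.3 * x

xs, ys, runs = shadow_intervals(g, 0.0, 6.0)
for first, last in runs:
    # The grid point just to the right of a run plays the role of b_k.
    print(f"({xs[first]:.3f}, {xs[last + 1]:.3f})  g at right end = {ys[last + 1]:.4f}")
```

On the grid, every value inside a run is strictly below the value at the grid point immediately to its right, which mirrors the inequality $g(x) \le g(b_k)$ established in the proof. The last grid point is never a shadow point, so each run has a right-hand neighbor.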
Notice that the shadow points of $g(x) = f(x) - Rx$ include much more than
just the points in $E^R$. The reason for using shadow points is so that we can work
with a countable union of intervals, as we shall see in the next result that tells us
about the size of $E^R$.
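Proof. We apply the rising sun lemma to the function $g(x) = f(x) - Rx$. As we
have seen, $E^R$ is contained in
\[ \bigcup_k (\alpha_k, \beta_k), \]
where the open intervals are pairwise disjoint and $g(\alpha_k) \le g(\beta_k)$. This tells us that
\[ \beta_k - \alpha_k \le \frac{1}{R}\bigl(f(\beta_k) - f(\alpha_k)\bigr). \tag{7.8} \]
The set of x for which $D^+f(x) = \infty$ is the intersection of $E^R$ taken over all
$R \in \mathbb{N}$. We see that $E^1 \supseteq E^2 \supseteq \cdots$. By Corollary 5.12, we can conclude that
\[ m(\{x \in [a, b] \mid D^+f(x) = \infty\}) = \lim_{R \to \infty} m(E^R) \le \lim_{R \to \infty} \frac{f(b) - f(a)}{R} = 0. \tag{7.9} \]
To find the size of $E_r^R$ and finish our proof, we need a slightly different result for
$E_r$.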
fC8k)—f(ak)
Proof We follow the proof of Lemma 7.6 with f replaced by h(x) = f(—x) and
R replaced by —r. Notice that we did not need the fact that R is positive until we
divided by R to get inequality (7.8). We observe that
—r D_f(x) <r,
7.2 Monotonicity Implies Differentiability Almost Everywhere 217
so
Er = {x E (a, Df(x) <r) = {x E (—a, —a) —r j.
Following the proof of Lemma 7.6 up to inequality (7.7), we see that
This is equivalent to
The next lemma completes the proof of Theorem 7.4. It uses a very ingenious
trick.
r
(7.12)
— — (7.13)
(
I
Figure 7.2. A strictly increasing function with discontinuities. Its inverse.
Lemma 7.11 ($D^+$ Finite AE). Let g be a strictly increasing function on [a, b].
The Dini derivative $D^+g$ is finite almost everywhere.
Proof. We can restrict our attention to those x for which $D^+g(x) = D^-g(x)$
because the set of x on which they differ has measure zero. Let
$E^\infty = \{x \in (a, b) \mid D^+g(x) = D^-g(x) = +\infty\}$. If $x \in E^\infty$, then for any positive N we can
find s and t, s < x < t, such that
g(t) —g(x)> N(t — x),
g(x) — g(s)> N(x — s).
Therefore, $g(t) - g(s) > N(t - s)$. We define $S_N$ to be the set of all $x \in (a, b)$ for
which we can find s and t, a < s < x < t < b, such that $g(t) - g(s) > N(t - s)$.
By what we have just shown about $E^\infty$, we know that it is a subset of $S_N$. For each
$x \in S_N$ we select $s_x$ and $t_x$ so that $s_x < x < t_x$,
\[ g(t_x) - g(s_x) > N(t_x - s_x). \tag{7.16} \]
The intervals $(s_x, t_x)$, taken over all $x \in S_N$, provide an open cover of $S_N$.
The set $S_N$ is open (Exercise 7.2.7). By Theorem 3.5, $S_N$ is the union of a countable
collection of pairwise disjoint open intervals, $S_N = \bigcup_k (a_k, b_k)$. We choose a
closed interval inside each $(a_k, b_k)$ whose length is exactly half $b_k - a_k$,
\[ [\alpha_k, \beta_k] \subset (a_k, b_k), \qquad \beta_k - \alpha_k = \frac{1}{2}(b_k - a_k). \tag{7.17} \]
Each closed interval $[\alpha_k, \beta_k]$ is contained in the open cover $\bigcup_{x \in S_N} (s_x, t_x)$. By the
Heine-Borel theorem, Theorem 3.6, we can find a finite collection of these open
intervals that covers $[\alpha_k, \beta_k]$,
This finite open cover can be ordered so that $s_{x(k,1)} < s_{x(k,2)} < \cdots < s_{x(k,n_k)}$. We
can assume that $t_{x(k,1)} < t_{x(k,2)} < \cdots < t_{x(k,n_k)}$ because otherwise there is an open
interval that can be eliminated from the cover. Furthermore, if the right endpoint
of one interval is strictly greater than the left endpoint of the second interval to its
right, $s_{x(k,j+2)} < t_{x(k,j)}$, then the interval in the middle is contained in their union
(see Exercise 7.2.8), so we can eliminate $(s_{x(k,j+1)}, t_{x(k,j+1)})$ from the cover. Thus,
after removing these superfluous intervals, the intervals in odd position are pairwise
disjoint, as are the intervals in even position.
We can use the fact that g is strictly increasing, together with equation 7.16, to
put a bound on the length of [ak,
— g(sx(k,2j_1)))
l<j<flk/2
+ — g(sx(k,2j)))
l<j<flk/2
It follows that E°° SN (ak, bk) where these intervals are pairwise dis-
joint. The measure of E°° is less than or equal to
00 00
400
> (gC8k) — g(ak))
Since this is true for all N > 0, no matter how large, we can conclude that
m(E°°) =0.
Exercises
7.2.1. Let f be the function defined by f(0) = 0 and $f(x) = x \sin(1/x)$ for $x \ne 0$.
Find $D^+f(0)$, $D_+f(0)$, $D^-f(0)$, and $D_-f(0)$.
7.2.2. Show that if a function f assumes its maximum at c, then $D^+f(c) \le 0$ and
$D_-f(c) \ge 0$.
7.2.3. Show that if f is continuous on [a, b] and any one of its Dini derivatives
(say $D^+f$) is everywhere nonnegative on [a, b], then $f(b) \ge f(a)$.
7.2.4. Prove that if f is continuous on $[a_k, b_k]$ and if $f(x) \le f(b_k)$ for all
$x \in (a_k, b_k)$, then $f(a_k) \le f(b_k)$.
7.2.5. Prove that if f is continuous on $[a_k, b_k]$ and if
\[ t = \sup \{y \in [a_k, b_k] \mid f(x) \le f(y)\}, \]
then $f(t) \ge f(x)$.
7.2.6. Show that if g is strictly increasing and
7.2.8. Given three intervals $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$ that cover $(a_1, b_3)$ with
$a_1 < a_2 < a_3$ and $b_1 < b_2 < b_3$, show that if $a_3 < b_1$, then
\[ (a_2, b_2) \subset (a_1, b_1) \cup (a_3, b_3). \]
7.2.9. Let f be monotonically increasing on [a, b], and c an arbitrary value in
(a, b). Show that
\[ \sup_{a < t < c} f(t) = \lim_{x \to c^-} f(x) \le f(c) \le \lim_{x \to c^+} f(x) = \inf_{c < t < b} f(t). \]
7.2.10. Given an arbitrary sequence $(x_n) \subset [a, b]$ and a sequence of positive numbers
$(c_n)$ such that $\sum c_n < \infty$, define the function f by
\[ f(x) = \sum_{x_n < x} c_n. \]
Show that
1. f is monotonically increasing on [a, b],
2. f is discontinuous at each $x_n$, and
3. f is continuous at each $x \in [a, b] - \{x_n\}$.
7.3 Absolute Continuity 223
7.2.11. Verify that in the rising sun lemma (Lemma 7.5), we have f(ak) = f(bk)
except possibly when ak = a.
1. Antiderivative part:
(a) When is a function integrable?
(b) If the integral exists, when can that integral be differentiated?
(c) When does differentiating the integral take us back to the original
function?
2. Evaluation part:
(a) When is a function differentiable?
(b) If the derivative exists, when can that derivative be integrated?
(c) When does integrating the derivative take us back to the original function?
Theorem 7.12 (Properties of Integral). If f is integrable on [a, b], then
$F(x) = \int_a^x f(t)\,dt$ is uniformly continuous and of bounded variation on [a, b].
By Theorem 7.9, it follows that F is differentiable almost everywhere.
implies that
=
F(x) —
f
f is integrable, by Corollary 6.17 we can find a simple function such that
fb
Since every simple function takes on only finitely many values, it is bounded, say
$|\phi(x)| < B$ for all $x \in [a, b]$. Choose $\delta = \epsilon/(2B)$; then $|x - y| < \delta$ implies that
f
f(t)
f dt
f
To see that F has bounded variation, we observe that
fX fX
and
\[ F(x) = \int_a^x f^+(t)\,dt - \int_a^x f^-(t)\,dt \]
is a difference of monotonically increasing functions. By the Jordan decomposition
theorem (Theorem 7.2), it has bounded variation.
<&
we have that
— <E.
To see why not, we consider the devil's staircase, DS(x), (Example 4.1 on p. 86).
The total variation of this function is 1, so it has bounded variation. It is a continuous
function that is constant on the open intervals that form the complement of SVC(3).
Since SVC(3), the Cantor ternary set, has measure 0, we have that
The integral of any function that is 0 almost everywhere is a constant function, and
DS(x) is not constant. If we start with DS, differentiate, and then define
\[ F(x) = \int_0^x DS'(t)\,dt, \]
then F(x) = 0 for all $x \in [0, 1]$.
We need something stronger than continuity to characterize functions that are
integrals. We need absolute continuity.
To see that the devil's staircase is not absolutely continuous, let us take $\epsilon = 1/2$.
Our function increases by 1/2 from x = 0 to x = 1/3, so our response $\delta$ must be
less than 1/3. But the increase of 1/2 also occurs in an increase of 1/4 over [0, 1/9]
and an increase of 1/4 over [2/9, 1/3]. Our $\delta$ must be less than 2/9. But these
increases actually occur over four intervals, each of length 1/27. The response is
less than 4/27. Continuing in this way, we see that for each positive integer n,
\[ 0 < \delta < \frac{2^n}{3^{n+1}}. \]
There is no response.
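The pattern 1/3, 2/9, 4/27 can be checked exactly. The following sketch, offered only as an illustration, evaluates the devil's staircase at the endpoints of the intervals that remain inside [0, 1/3] after each stage of the Cantor construction, and compares the total length of those intervals with the total rise of the function across them. The function names are assumptions, and the evaluator only handles rationals whose ternary expansion terminates, which is all that is needed for these endpoints.

```python
from fractions import Fraction

def devils_staircase(x):
    """Cantor function at a rational x in [0, 1] with a terminating ternary expansion."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(200):                        # guard against non-terminating input
        if x == 0:
            break
        x *= 3
        digit = x.numerator // x.denominator    # next ternary digit
        x -= digit
        if digit == 1:
            return value + weight               # the first digit 1 ends the expansion
        value += weight * (digit // 2)          # ternary digit 2 becomes binary digit 1
        weight /= 2
    return value

def cantor_intervals(n):
    """The 2**n closed intervals of length 3**-n left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

def stage_summary(n):
    """Count, total length, and total rise over the stage n+1 intervals inside [0, 1/3]."""
    pieces = [(a, b) for a, b in cantor_intervals(n + 1) if b <= Fraction(1, 3)]
    total_length = sum(b - a for a, b in pieces)
    total_rise = sum(devils_staircase(b) - devils_staircase(a) for a, b in pieces)
    return len(pieces), total_length, total_rise

for n in range(6):
    print(n, *stage_summary(n))
```

At every stage the total rise stays at 1/2 while the total length of the intervals is $2^n/3^{n+1}$, which shrinks toward 0. That is exactly why no positive response can work for $\epsilon = 1/2$.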
A Little History
This property of definite integrals, absolute continuity, was first observed by Axel
Harnack in 1884. The name was coined by Vitali in 1905, but several mathemati-
cians were aware of it and using it by the 1890s, including Charles de la Vallée
Poussin, Camille Jordan, Otto Stolz, and E. H. Moore. As we shall prove later in this
section, if a function F can be defined as a definite integral, $F(x) = \int_a^x f(t)\,dt$ (using
either the Lebesgue or Riemann definition of the integral), then F is absolutely
continuous.
What about the other direction? If a function is absolutely continuous, does that
imply that it is an integral? The Riemann integral is intractable, but we can do
this for the Lebesgue integral. Because we are not limited to bounded functions, it
will take more work to verify that any function defined as a definite integral must
be absolutely continuous. But it will be possible to show that the implication also
runs in the opposite direction; every absolutely continuous function is a definite
integral. This result was observed by Lebesgue in 1904, but he gave no proof. The
first proof was published by Vitali in 1905, the same paper in which this property
received its name. The next two propositions will move us toward the theorem that
F can be written as
\[ F(x) = \int_a^x f(t)\,dt \]
for some function f if and only if F is absolutely continuous.
Since $\phi$ is simple, it is bounded, say $|\phi(x)| < B$ for all $x \in [a, b]$. Choose the
response $\delta = \epsilon/(2B)$. Let
S = bk),
f(t)dt
k=1 J k=1 ak S
— dt
+ f dt
+ f dt
:
What about the other direction? If F is absolutely continuous, can we find an
integrable function f for which
\[ F(x) = \int_a^x f(t)\,dt? \]
The natural candidate for f is F', but does F' exist? The next proposition guarantees
that it does, almost everywhere.
f(bk) — <1.
Let $N = \lceil (b - a)/\delta \rceil$, the smallest integer greater than or equal to $(b - a)/\delta$. Let
$P = (a = x_0, x_1, \ldots, x_m = b)$ be any partition of [a, b] into intervals of length
less than $\delta$, m > N. For $1 \le j < N$, choose l(j) to be the largest integer such that
forallx,y [a,b].
$x_{l(j)} < a + j\delta$. The interval from $x_{l(j)}$ to $x_{l(j)+1}$ is one of the intervals in P, so it
has length less than $\delta$. Since
\[ a + j\delta \le x_{l(j)+1} \le x_{l(j+1)} < a + (j + 1)\delta, \]
the interval from $x_{l(j)+1}$ to $x_{l(j+1)}$ also has length less than $\delta$. On each of these
intervals, the variation of F with respect to P is less than 1. Counting the initial
and final intervals, $[a, x_{l(1)}]$ and $[x_{l(N-1)+1}, b]$, there are at most 2N such intervals,
so the variation of F with respect to P is strictly less than 2N,
\[ V(P, F) < 2N = 2\left\lceil \frac{b - a}{\delta} \right\rceil. \]
Since every partition has a refinement with intervals of length less than $\delta$ and since
refining a partition can only increase the variation, we see that the total variation is
bounded by $2\lceil (b - a)/\delta \rceil$.
For the evaluation part of the fundamental theorem of calculus, if we start with
F, differentiate it, and then integrate, we wind up with an absolutely continuous
function. If we want any hope that we end with the same function with which we
started, then we need to have started with an absolutely continuous function. This
condition is necessary. As we shall see in the final section of this chapter, it is also
sufficient.
A Hierarchy of Functions
Absolute continuity implies bounded variation, but as the devil's staircase illus-
trates, bounded variation does not imply absolute continuity. With one more defi-
nition in place, we can describe a nice hierarchy of functions defined on a closed
bounded interval. In Section 8.1 we shall see how Lipschitz's condition arose and
how he used it.
A function is said to be C1 or continuously differentiable on [a, b] if it is
differentiable and its derivative is continuous on this interval. All of the following
statements hold for functions on a closed and bounded interval:
1. If a function is C1, then it has a bounded derivative.
2. If a function is differentiable with a bounded derivative, then it satisfies a
Lipschitz condition of order 1.
3. If a function satisfies a Lipschitz condition of order 1, then it is absolutely
continuous.
4. If a function is absolutely continuous, then it has bounded variation.
5. If a function has bounded variation, then it is differentiable almost everywhere.
The proofs of the first three statements are left as exercises. All of these impli-
cations go only one way, a fact that is also left for the exercises.
mk
T(bk) — = = sup —
j=1
where the supremum is taken over all partitions, $a_k = x_{k,0} < x_{k,1} < \cdots < x_{k,m_k} = b_k$,
of $[a_k, b_k]$. It follows that
n n mk
Since the set of intervals $(x_{k,j-1}, x_{k,j})$, $1 \le j \le m_k$, $1 \le k \le n$, is a finite collection
of pairwise disjoint intervals of total length less than $\delta$, each double sum on the
right side of equation (7.18) is less than $\epsilon/2$, and so the supremum of these sums
is strictly less than $\epsilon$.
Exercises
7.3.1. Find an example of a simple function $\phi$ such that
<0.1.
7.3.2. Let f be defined by f(0) = 0, $f(x) = x^2 \sin(1/x^2)$ for $x \ne 0$. Show that f
does not have bounded variation in any neighborhood of 0, but it is differentiable
at 0.
>(bk—ak)< &
we have that
— <E.
\[ F(x) = \int_a^x f(t)\,dt \]
is differentiable almost everywhere, and F'(x) = f(x) almost everywhere.
Proof We saw in Propositions 7.13 and 7.14 that F has bounded variation and
thus is differentiable almost everywhere. We shall show that
\[ F'(x) \le f(x), \quad \text{almost everywhere}. \]
This will complete the proof because if f is integrable, then so is -f. The definite
integral of -f is -F, and the derivative of -F is -F'. Thus once we have proven
that $F' \le f$ almost everywhere for every integrable f, it also follows that
\[ -F'(x) \le -f(x), \quad \text{almost everywhere}. \]
We get the second inequality for free. But we pay dearly for the first inequality.
The proof is very reminiscent of the proof of Theorem 7.4. Let S be the subset
of [a, b] on which F is differentiable and begin by considering the set
\[ E_p^q = \{x \in S \mid f(x) < p < q < F'(x)\}, \quad p, q \in \mathbb{Q}. \]
The set of x for which f(x) < F'(x) is the union over all pairs
p < q of $E_p^q$. This is a countable union. If we can show that $m(E_p^q) = 0$, then it
follows that $F'(x) \le f(x)$ almost everywhere.
For $x \in E_p^q$, f(x) < p, and therefore
f(t)dt (7.19)
f(t)dt
Given any $\epsilon > 0$, Corollary 6.18 guarantees a response $\delta$ so that for any measurable
set $A \subset [a, b]$ for which $m(A) < \delta$ we have that
m (U —
U= bk).
Let
Uk = E (ak, bk)
k, then the intervals (ak,J, t3k,J) are pairwise disjoint. Since (ak,J, ,8k,j) c
(ak, bk) and the intervals (ak, bk) are pairwise disjoint, all of the intervals over all
pairs k, j are pairwise disjoint. We also have that
We are now ready to put the pieces back together, using equations (7.19)—(7.21):
00 00
q q —
k=1 j=1
f(t)dt
k=1 j=1
= I f(t)dt
JT
= I f(t)dt + I f(t)dt
Note that without absolute continuity, f could be the devil's staircase, in which
case the conclusion to this lemma would be false.
If we take any finite collection of these, bk), the sum of these lengths is
less than 6, SO f maps
U (f(ak), f(bk)),
a set of measure less than E. Since the intervals (f(ak), f(bk)) are pairwise disjoint,
\[ \epsilon z - f(z) > \epsilon x - f(x). \]
The set E is contained in the set of shadow points of the function defined by
$\epsilon x - f(x)$. Let $((a_k, b_k))$ be a collection of pairwise disjoint open intervals
whose union contains E and for which
00 00
This is true for all $\epsilon > 0$; therefore, $m(f(E)) = 0$ and f(a) = f(b). Since f is
monotonically increasing, it is constant on [a, b].
We are now prepared to state and prove the second half of the fundamental theo-
rem of calculus. As we have seen, absolute continuity is not just sufficient; it is also
necessary.
Theorem 7.18 (FTC, Evaluation). If f is absolutely continuous on [a, b], then it
is differentiable almost everywhere, f' is integrable on [a, b], and
\[ \int_a^b f'(t)\,dt = f(b) - f(a). \tag{7.22} \]
Proof. We have already shown that f has bounded variation on [a, b] and therefore
is differentiable almost everywhere. We can extend f' however we wish so that it
is defined on all of [a, b]. In particular, we can use any of the Dini derivatives in
place of f'. Changing the value of the integrand on a set of measure zero does not
affect the value of the Lebesgue integral. We need to prove that f' is integrable and
to establish equation (7.22).
As shown in Proposition 7.15, if f is absolutely continuous, then it is the
difference of two absolutely continuous and monotonically increasing functions. It
is enough to prove our theorem with the added assumption that f is monotonically
increasing.
To prove that f' is integrable, we define a sequence of functions by
\[ f_n(x) = n\bigl(f(x + 1/n) - f(x)\bigr), \]
where we extend f to the right of x = b by defining f(x) = f(b) for x > b. Each
$f_n$ is nonnegative, and $(f_n)$ converges to f' almost everywhere. By Fatou's lemma
7.4 Lebesgue's FTC 237
(Theorem 6.20), if the integral of $f_n$ over [a, b] has a bound independent of n, then
f' is integrable. The bound on the integral of $f_n$ follows from the monotonicity
of f,
\[ \begin{aligned}
\int_a^b f_n(x)\,dx &= n\int_a^b f(x + 1/n)\,dx - n\int_a^b f(x)\,dx \\
&= n\int_{a+1/n}^{b+1/n} f(x)\,dx - n\int_a^b f(x)\,dx \\
&= n\int_b^{b+1/n} f(x)\,dx - n\int_a^{a+1/n} f(x)\,dx \\
&\le f(b) - f(a).
\end{aligned} \]
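The bound can be checked numerically on an example. This is a sketch under stated assumptions: the increasing function x^3 on [0, 1], the midpoint rule, and the helper names are all choices made here for illustration, and a numerical integral only approximates the Lebesgue integral discussed in the proof.

```python
def difference_quotient(f, b, n):
    """Return f_n(x) = n * (f(x + 1/n) - f(x)), with f extended by f(b) to the right of b."""
    def extended(x):
        return f(b) if x > b else f(x)
    return lambda x: n * (extended(x + 1.0 / n) - extended(x))

def midpoint_integral(func, a, b, cells=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / cells
    return width * sum(func(a + (j + 0.5) * width) for j in range(cells))

def cube(x):
    # Hypothetical increasing, absolutely continuous example function.
    return x ** 3

a, b = 0.0, 1.0
for n in (1, 2, 5, 10):
    integral = midpoint_integral(difference_quotient(cube, b, n), a, b)
    # Compare the integral of f_n with the bound f(b) - f(a).
    print(n, integral, cube(b) - cube(a))
```

For this example the integral of the n-th difference quotient works out by hand to $1 - 1/(4n^3)$, which stays below f(b) - f(a) = 1 and approaches it as n grows, in line with the bound used to apply Fatou's lemma.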
We now define g by
\[ g(x) = f(x) - \int_a^x f'(t)\,dt. \]
We want to apply Lemma 7.17 to the function g. From its definition, g(a) = f(a) -
0 = f(a). If we can show that g is constant on [a, b], then that constant is f(a) and
equation (7.22) is proven. We only need to show that g is absolutely continuous
and monotonically increasing, and that its derivative is 0 almost everywhere.
Since g is a difference of two absolutely continuous functions, it is absolutely
continuous. By equation (7.23), x < y implies that
\[ g(y) - g(x) = f(y) - f(x) - \int_x^y f'(t)\,dt \ge 0, \]
so g is monotonically increasing. By the antiderivative part of the fundamental
theorem of calculus, Theorem 7.16,
\[ \frac{d}{dx}\int_a^x f'(t)\,dt = f'(x), \quad \text{almost everywhere}, \]
and therefore g' = 0 almost everywhere. By Lemma 7.17, g is the constant function
equal to f(a). For all $x \in [a, b]$,
\[ f(x) - \int_a^x f'(t)\,dt = f(a), \qquad \int_a^x f'(t)\,dt = f(x) - f(a). \]
We have now answered four of our five original questions. We have found the
right way to define integration. In Lebesgue's dominated convergence theorem we
have found a condition which, though not necessary, is a strong and useful sufficient
condition that allows term-by-term integration of a series. We have learned that the
connection between continuity and differentiability is stronger than we might have
expected. And in this section, we have explained the exact relationship between
integration and differentiation.
That still leaves one question, our very first question, the question that started us
asking all of these other questions,
When does a function have a Fourier series expansion that converges to that function?
We now have the tools to make serious progress. One of the most surprising
insights of the early twentieth century was that this is not quite the right way to
pose the problem. As we shall see in the next chapter, there is a better, more useful
question that will have a very elegant answer.
Exercises
7.4.1. Give an example of a function, f, integrable on [0, 1], such that for
$F(x) = \int_0^x f(t)\,dt$, there is a $c \in (0, 1)$ such that F is differentiable at c but $F'(c) \ne f(c)$.
7.4.2. Show that if
lim
y—x
>q
for all $x \in [a, b]$, then we can find a $\delta > 0$ so that $x, y \in [a, b]$ and $0 < y - x < \delta$
implies that
J
f a bounded derivative on [a, bi. Show that for
allx e [a,bI,
f'(t)dt = f(x) -
f
7.4.4. Let f be integrable on [a, b] with
\[ \int_a^x f(t)\,dt = 0 \]
for all $x \in [a, b]$. Using Proposition 6.13 but not using Theorem 7.16, show that
f = 0 almost everywhere.
7.4.5. Use the result of Exercise 7.4.4 and the evaluation part of the fundamental
theorem of calculus, Theorem 7.18, to prove the antiderivative part of the funda-
mental theorem of calculus, Theorem 7.16.
7.4.6. Let f and g be absolutely continuous on [a, b] with f' = g' almost everywhere.
Show that f = g + c for some constant c.
7.4.7. Show that if f is absolutely continuous on [a, b], then
\[ V_a^b(f) = \int_a^b |f'(x)|\,dx, \]
where $V_a^b(f)$ is the total variation of f on [a, b]. Show that this is not necessarily
true if f is not absolutely continuous.
7.4.8. A monotonic function f defined on [a, b] is said to be singular if f' = 0
almost everywhere. Show that any monotonically increasing function is the sum of
an absolutely continuous function and a singular function.
7.4.9. Let g be a strictly increasing, absolutely continuous function on [a, b] with
g(a) = c, g(b) = d.
1. Show that for any measurable set $S \subset [a, b]$,
\[ m(g(S)) = \int_S g'(x)\,dx. \]
2. Show that if $A = \{x \in [a, b] \mid g'(x) \ne 0\}$, and B is any subset of [c, d] of
measure zero, then
\[ m(A \cap g^{-1}(B)) = 0. \]
3. Show that if A is the set defined in part 2 and C is any measurable subset of
[c, d], then
p
m(C)=J g'(x)dx=J
Aflg'(C) a
7.4.10. [Change of Variable] Prove the change of variable formula for Lebesgue
integrals: If g is strictly increasing and absolutely continuous on [a, b] with g(a) =
c and g(b) = d and if f is integrable on [c, d], then
\[ \int_c^d f(t)\,dt = \int_a^b f(g(x))\,g'(x)\,dx. \]
Prove that
\[ \int_a^b G(t)f(t)\,dt + \int_a^b g(t)F(t)\,dt = F(b)G(b) - F(a)G(a). \tag{7.25} \]
7.4.12. [Integration by Parts] Prove the formula for integration by parts for
Lebesgue integrals and absolutely continuous functions: If f and g are absolutely
continuous on [a, b], then
b b
7.4.13. Let f be integrable on [a, b]. We say that $c \in (a, b)$ is a Lebesgue point
if $f(c) \ne \pm\infty$ and
\[ \lim_{h \to 0} \frac{1}{h}\int_c^{c+h} |f(t) - f(c)|\,dt = 0. \]
Show that if c is a Lebesgue point for f, then $F(x) = \int_a^x f(t)\,dt$ is differentiable
at c and F'(c) = f(c).
7.4.14. Show that if f is integrable on [a, b], then each point of continuity of f is
a Lebesgue point for f.
7.4.15. Show that if f is integrable on [a, b], then almost every point of [a, b] (all
but a set of measure zero) is a Lebesgue point for f.
7.4.16. Let f, not necessarily a measurable function, be defined on [a, b]. For
each $x_0 \in [a, b]$ and $h, \epsilon > 0$, let $S(x_0, h, \epsilon)$ be the set of points
$x \in [x_0 - h, x_0 + h] \cap [a, b]$ for which $|f(x) - f(x_0)| \ge \epsilon$. We say that $x_0 \in [a, b]$ is a point of
approximate continuity of f if for each $\epsilon > 0$,
\[ \lim_{h \to 0} \frac{m_e(S(x_0, h, \epsilon))}{2h} = 0. \]
Show that any point of continuity is also a point of approximate continuity. Give
an example of a function for which there is a point of approximate continuity that
is not a point of continuity. Justify your example.
7.4.17. Prove that if f is measurable on [a, b], then almost all points of [a, b] (all
but a set of measure zero) are points of approximate continuity.
8
Fourier Series
The development of measure theory and Lebesgue integration did not come about
because mathematicians decided they needed a new definition of the integral. It
happened because they were trying to develop and use tools of analysis to solve real
and practical problems. These included solutions to partial differential equations,
extensions of calculus to higher dimensions and to complex-valued functions, and
generalizations of the concepts of area and volume. Fourier series were not unique
in motivating work in analysis, but they constitute a very useful lens through which
to view the development of analysis because these series often were the principal
source of the questions that would prove most troublesome and insightful. As
progress was made in our understanding of analysis, these insights often translated
directly into answers about Fourier series.
This is true especially of Lebesgue's work on the integral. In 1905, armed with
the power of his new integral, he gave a definitive answer to the question of when
the Fourier series of a function converges pointwise to that function. We shall see
his answer in this first section.
The story does not stop there. Once we are using the Lebesgue integral, we can
change the values of the function on any set of measure zero without changing the
value of the integral. Therefore, two functions that are equal almost everywhere
will have the same Fourier coefficients, and so the same Fourier series. The best
we can hope for from a theorem with the weak assumption that f is integrable is
that the Fourier series of f converges to f almost everywhere. In fact, this is not
quite true, though we can come close to it by either strengthening the assumption
just a little (Theorem 8.9) or slightly weakening the conclusion (Theorem 8.2). If
we want the Fourier series to converge to f at every point, we shall need to be quite
restrictive about the kind of function with which we start.
If we are content with convergence almost everywhere, then we really need
to think of equivalence classes of functions where $f \sim g$ if $f = g$ almost
everywhere, or, equivalently, $\int_{-\pi}^{\pi} |f(x) - g(x)|\,dx = 0$. This integral defines a
are functions that satisfy conditions 1, 2, and 4 but for which the Fourier series
does not even converge at all values of $x \in (-\pi, \pi)$. See Exercises 8.1.10–8.1.15
for Fejér's example of a continuous function whose Fourier series fails to converge
at any rational multiple of $\pi$.
Riemann would show that the function f does not need to be bounded. It is
enough that f is absolutely integrable.
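Assumption 2 is essential. To say that the Fourier series for f converges to f(x)
at a given value of x is to say that
$$\lim_{n \to \infty} \left( f(x) - \frac{a_0}{2} - \sum_{k=1}^{n} \left[ a_k \cos(kx) + b_k \sin(kx) \right] \right) = 0,$$
where
$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(kx)\,dx, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(kx)\,dx.$$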
We substitute these integrals, interchange the finite summation and the integration,
use the sum of angles formula, and employ the trigonometric identity¹
$$\frac{1}{2} + \cos u + \cos 2u + \cdots + \cos nu = \frac{\sin[(2n + 1)u/2]}{2 \sin[u/2]} \tag{8.1}$$
to rewrite the limit as
$$\lim_{n \to \infty} \left( f(x) - \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{\sin[(2n + 1)(t - x)/2]}{2 \sin[(t - x)/2]}\, f(t)\,dt \right) = 0.$$
With a change of variable and the continuation of f outside $(-\pi, \pi)$ by assuming
$f(x + 2\pi) = f(x)$, we can rewrite this as
$$\lim_{n \to \infty} \left( f(x) - \frac{1}{\pi} \int_0^{\pi/2} \frac{\sin[(2n + 1)u]}{\sin u}\, [f(x - 2u) + f(x + 2u)]\,du \right) = 0.$$
This process leading to the derivation of equation (8.3) is done at a more leisurely pace in Section 6.1
of A Radical Approach to Real Analysis. It includes a proof of the assertion that equation (8.3) implies
equation (8.4).
$$|f(t) - f(u)| < A\,|t - u|^{\alpha}. \tag{8.5}$$
Any function that satisfies inequality (8.5) is said to satisfy a Lipschitz condition
of order $\alpha$. It implies continuity at u = 0. Notice that it is not strong enough to
imply differentiability at u = 0 unless $\alpha > 1$.
8.1 Pointwise Convergence 245
The next advances were made by Ulisse Dini. In 1872, he showed that the bound
$A|t - u|^{\alpha}$ on the right side of inequality (8.5) could be replaced by $A/|\log|t - u||$.
In 1880, he found a single condition that implies pointwise convergence of the
Fourier series of f to f.
Dini's Condition. The following condition implies that the Fourier series of f
converges pointwise to f on $(-\pi, \pi)$:
Cesàro Convergence
In 1890, the Italian Ernesto Cesàro (1859—1906) broadened the definition of con-
vergence.
For example, the series $1 - 1 + 1 - 1 + 1 - \cdots$ corresponds to the sequence of
partial sums (1, 0, 1, 0, 1, 0, ...). This sequence does not converge. But if we take
the sum of the first n terms of this sequence and divide it by n, we get
$$\text{for } n \text{ odd: } \frac{(n+1)/2}{n} = \frac{1}{2} + \frac{1}{2n}, \qquad \text{for } n \text{ even: } \frac{n/2}{n} = \frac{1}{2}.$$
The limit of this average value does exist. It equals 1/2. We say that the Cesàro
limit of $1 - 1 + 1 - 1 + 1 - \cdots$ is 1/2.
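The averaging just described can be tried out directly. The following is a minimal sketch, not part of the original text: the helper name `cesaro_means` is hypothetical, and exact rational arithmetic is used so that the averages can be compared with the formulas for odd and even n.

```python
from fractions import Fraction

def cesaro_means(terms):
    """Return the running averages of the partial sums of `terms`.

    The n-th entry is (s_1 + ... + s_n) / n, where s_k is the k-th partial sum.
    (Hypothetical helper written for illustration.)
    """
    means = []
    partial = Fraction(0)   # current partial sum s_n
    running = Fraction(0)   # s_1 + s_2 + ... + s_n
    for n, term in enumerate(terms, start=1):
        partial += term
        running += partial
        means.append(running / n)
    return means

# Terms of the series 1 - 1 + 1 - 1 + ...
grandi_terms = [Fraction((-1) ** k) for k in range(12)]
means = cesaro_means(grandi_terms)
for n, mean in enumerate(means, start=1):
    print(n, mean)
```

The printed averages equal 1/2 for even n and 1/2 + 1/(2n) for odd n, so they approach 1/2 even though the partial sums themselves keep oscillating.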
Cesàro limits are particularly useful for Fourier series. Consider the Fourier
cosine series expansion of the constant function $\pi/4$, valid for $-\pi/2 < x < \pi/2$:
$$f(x) = \cos(x) - \frac{1}{3}\cos(3x) + \frac{1}{5}\cos(5x) - \frac{1}{7}\cos(7x) + \cdots$$
=A.
•
lim
n
The derivative of f is 0 for all $x \in (-\pi/2, \pi/2)$, but if we try to differentiate term
by term, we get a series that does not converge except at x = 0:
$$-\sin(x) + \sin(3x) - \sin(5x) + \sin(7x) - \cdots$$
Now consider the Cesàro sum of this series. We first need to find the kth partial
sum (see Exercise 8.1.5)
$$-\sin(x) + \sin(3x) - \sin(5x) + \cdots + (-1)^k \sin((2k - 1)x) = \frac{(-1)^k \sin(2kx)}{2\cos(x)}. \tag{8.6}$$
We now compute the average of the first n partial sums (see Exercise 8.1.6)
$$\frac{1}{n}\left( \frac{-\sin(2x)}{2\cos x} + \frac{\sin(4x)}{2\cos x} - \frac{\sin(6x)}{2\cos x} + \cdots + (-1)^n \frac{\sin(2nx)}{2\cos x} \right) = \frac{(\tan x)\left(-1 + (-1)^n \cos(2nx)\right) + (-1)^n \sin(2nx)}{4n\cos x}. \tag{8.7}$$
We fix $x \in (-\pi/2, \pi/2)$ and take the limit $n \to \infty$. Since the numerator stays
bounded as the denominator approaches $\infty$, the Cesàro limit of the series obtained
by term-by-term differentiation is 0, regardless of the value of x.
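The closed forms for the partial sums and for their averages can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the closed form for the average is the one obtained by summing the geometric series with ratio $-e^{2ix}$, which carries a factor $(-1)^n$ on both the cosine and the sine term.

```python
import math

def partial_sum(k, x):
    """k-th partial sum of -sin(x) + sin(3x) - sin(5x) + ..."""
    return sum((-1) ** j * math.sin((2 * j - 1) * x) for j in range(1, k + 1))

def closed_form_partial(k, x):
    """Right side of the partial-sum identity: (-1)^k sin(2kx) / (2 cos x)."""
    return (-1) ** k * math.sin(2 * k * x) / (2 * math.cos(x))

def cesaro_mean(n, x):
    """Average of the first n partial sums, computed directly."""
    return sum(partial_sum(k, x) for k in range(1, n + 1)) / n

def closed_form_mean(n, x):
    """Closed form of the average (assumed form, derived as described above)."""
    sign = (-1) ** n
    numerator = (math.tan(x) * (-1 + sign * math.cos(2 * n * x))
                 + sign * math.sin(2 * n * x))
    return numerator / (4 * n * math.cos(x))

x = 1.0
for n in (1, 5, 50, 400):
    print(n, cesaro_mean(n, x), closed_form_mean(n, x))
```

For a fixed x in $(-\pi/2, \pi/2)$ the two columns agree to rounding error and shrink like 1/n, while the partial sums themselves keep oscillating.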
What if the limit of a sequence exists? Is the Cesàro limit the same? Fortunately,
the answer is "yes."
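Proof Given any $\epsilon > 0$, we can find an N such that $n \ge N$ implies that $|a_n - A| < \epsilon$. For n > N,
$$\left| \frac{a_1 + \cdots + a_n}{n} - A \right| \le \frac{|(a_1 + \cdots + a_N) - NA|}{n} + \frac{|a_{N+1} - A| + \cdots + |a_n - A|}{n}$$
$$< \frac{|(a_1 + \cdots + a_N) - NA|}{n} + \frac{(n - N)\epsilon}{n} \le \frac{|(a_1 + \cdots + a_N) - NA|}{n} + \epsilon.$$
For n sufficiently large,
$$\frac{|(a_1 + \cdots + a_N) - NA|}{n} < \epsilon,$$
and therefore
$$\left| \frac{a_1 + \cdots + a_n}{n} - A \right| < 2\epsilon$$
for all n sufficiently large. Since this is true for every $\epsilon > 0$, the Cesàro limit
is A. □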
Theorem 8.2 (Lebesgue on Fourier). If f is integrable (in the Lebesgue sense) on
the interval $[-\pi, \pi]$, then the Fourier series of f converges to f almost everywhere,
at least in the Cesàro sense of convergence.
In some sense, we could not possibly ask for a better result. The only assumption
we need to make about f is that it is integrable, an assumption needed before we
can even define the coefficients of the Fourier series. On the other hand, the
conclusion is weaker than we might have wished: almost everywhere instead of for
all $x \in (-\pi, \pi)$, convergence in the Cesàro sense rather than strict convergence.
Yet, as mathematicians were beginning to realize, asking for certain properties to
hold for all x introduces unnecessary complications. For many purposes, it makes
sense to consider two functions to be equivalent if they agree almost everywhere.
If we work with equivalence classes and f is integrable, then the Cesàro limit of
its Fourier series exists and is equivalent to f. If the Fourier series of f converges
in the usual sense, then it converges to a function that is equivalent to f.
Allowing for Cesàro convergence does introduce its own complications. The
series
Cesàro converges to the constant function 0. We have lost the uniqueness of the
representation by a trigonometric series. That is a high price to pay. When it is
worth paying depends on how we want to use the trigonometric representation.
Sometimes existence is more important than uniqueness; sometimes it is not.
Exercises
8.1.1. Show that if
$$\lim_{n \to \infty} \int_0^{\pi/2} \frac{\sin[(2n + 1)u]}{\sin u}\, [f(x - 2u) + f(x + 2u) - 2f(x)]\,du = 0,$$
+ — + ... +
Set y = ix and use the fact that the imaginary part of $e^{ix}$ is $\sin x$ to prove
equation (8.6),
$$-\sin(x) + \sin(3x) - \sin(5x) + \cdots + (-1)^k \sin((2k - 1)x) = \frac{(-1)^k \sin(2kx)}{2\cos(x)}.$$
= + + + ar).
If the sequence (ar does not converge but does converge, then we
say that the original sequence has (C, k)-convergence. Find examples of sequences
with (C, k) convergence for each k, 2 <k <4.
to mean that x0 is the Cesàro limit of (xv). We say that a function f is Cesàro
continuous at x = x0 if
x0 implies f(xo).
Note that we have weakened the conclusion, but we have also weakened the
hypothesis. Is every continuous function also Cesàro continuous? Is every Cesàro
continuous function also continuous? Is $f(x) = x^2$ Cesàro continuous? Is it Cesàro
continuous for any values of x?
8.1.10. Show that the Fourier sine series for the constant function 7r/4 on (0, 27r)
is given by
00.1
sin 1
0 <x (8.8)
4= 2k —
2A1 + 2A2 + + x)
f(x) = 0, x)+ (8.10)
Show that this series converges absolutely and uniformly, regardless of the choice
of sequence (A,). Therefore, f is continuous on R.
8.1.13. Using the uniqueness of the Fourier series expansion of a continuous function,
show that the Fourier series for f on $[-\pi, \pi]$ is given by
f(x) = cos(nx),
where
=
m and k, the unique positive integers that satisfy
as m approaches oc. Show that if Am = mm2, then the Fourier series does not
converge at x = 0. Explain the difference between the series in equation (8.10) that
is used to define f and the Fourier series for f.
2A1 + 2A2 + .. + n! x)
n=1
then g is continuous and the Fourier series for g does not converge at any point of
the form $k\pi/n$, $k \in \mathbb{Z}$, $n \in \mathbb{N}$.
8.2 Metric Spaces 251
[Figure: diagram relating the types of convergence: uniform convergence, pointwise convergence, $L^2$ convergence, almost uniform convergence, convergence almost everywhere, $L^1$ convergence, Cesàro convergence almost everywhere, and convergence in measure.]
+ + f2(x)dx
<f
for all N 1. It follows that if we let be the partial sum of the Fourier series,
then
(Sn(X)Sm(X))2dX= m<n,
k=m+1
lim f (g(x) — dx = 0.
—7T
Initially, Harnack claimed that f = g "in general," that is to say at all but an isolated
set of points (a set of points with empty derived set). Later that year, he realized
that he was wrong.
That same year, George Halphen found a function f for which the integral of
$(S_n(x) - S_m(x))^2$ converges to 0, where $S_n$ is the partial sum of the trigonometric
series formed from the Fourier coefficients of f, but the sequence $S_n$ fails to
converge at any but a single point. In other words, if we define a distance between
two functions as
École Normale Supérieure. In addition to his work in analysis, he is noted for his
contributions to probability and statistics.
Real progress toward our modern understanding of convergence came in 1906,
with the publication of Fréchet's doctoral dissertation, Sur quelques points du
calcul fonctionnel (On several aspects of functional calculus). It made use of and
demonstrated the power of thinking of the set of continuous functions on a closed
and bounded interval as points. The distance between two continuous functions
was defined as the maximum of the absolute value of their difference.
Frigyes Riesz read Fréchet's thesis with great interest, and in that same year of
1906, showed how Fréchet's use of distance between functions could be used to
prove a result of Erhard Schmidt on orthogonal systems of functions, a concept
that David Hilbert had devised for solving integral equations, about which there
will be more to say in Section 8.4. For now, suffice it to say that Riesz, who
was familiar with Harnack's attempts in 1882, recognized that with the Lebesgue
integral he could attain Harnack's goal and develop a very powerful tool for analysis
in the process. For functions f and g whose squares are Lebesgue integrable
over the interval [a, b], he defined the distance between these functions to be
$\left(\int_a^b (f(x) - g(x))^2\,dx\right)^{1/2}$. Riesz's proof of the convergence result for Fourier series
was published in 1907. Ernst Fischer (1874—1954) discovered the same result in
the same year.
In 1910, Riesz published his groundbreaking generalization, Untersuchungen
über Systeme integrierbarer Funktionen (Analysis of a system of integrable functions),
extending his analysis to the general space of functions f for which $|f|^p$
is integrable over [a, b], the $L^p$ spaces, $p \ge 1$. The term metric space would not
come until 1914 when Felix Hausdorff laid the foundations for topology in the
seminal work Grundzüge der Mengenlehre (A basic course in set theory).
$L^p$ Spaces
The set of vectors in $\mathbb{R}^n$ has a lot of structure. It is closed under addition and under
multiplication by any scalar:
$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n), \qquad c\,(a_1, a_2, \ldots, a_n) = (c a_1, c a_2, \ldots, c a_n).$$
The set of all functions defined on [a, b] also is closed under addition and scalar
multiplication. Both sets have zero elements and additive inverses and satisfy the
basic properties of addition and scalar multiplication.
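Given any vector $\vec a$ in $\mathbb{R}^n$, we can define its length or norm by
$$\|\vec a\| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}.$$
This is simply the square root of the dot product of the vector with itself. If $\theta$ is the
angle between vectors $\vec a$ and $\vec b$, then
$$\|\vec a - \vec b\|^2 = \vec a \cdot \vec a + \vec b \cdot \vec b - 2\,\vec a \cdot \vec b = \|\vec a\|^2 + \|\vec b\|^2 - 2\,\|\vec a\|\,\|\vec b\| \cos\theta. \tag{8.11}$$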
The dot product is basic to working with vectors in $\mathbb{R}^n$. What makes it so useful is
that it maps a pair of vectors to a real number so that two vectors that are orthogonal
map to 0 and two identical unit vectors map to 1. If we define the natural basis unit
vectors by
•ek (8.12)
=
Combining this with a distributive law and the ability to factor out scalars, equa-
tion (8.12) uniquely defines the dot product of any two vectors,
= + + ... + + + +
= . + + ..• + .
+ + ••• +
. .
(0, = n Tj)
= f O(x)*(x)dx.
If f and g are measurable, then Theorem 6.6 guarantees sequences of simple
functions that converge to f and g, f, g. We define
(f, g) = n—+oo
lim = lim I dx.
I (lim
J n-÷oo
dx = I f(x)g(x)dx.
J
If f and g are each integrable, then so is their product.
Now that we have an inner product, we can define the norm of a function and the
distance between two functions. If $f^2$ is integrable on [a, b], we define the norm
of f over [a, b] by
$$\|f\| = \left( \int_a^b f(x)^2\,dx \right)^{1/2}.$$
We define the distance between two functions f and g, both integrable over [a, b],
by
$$d(f, g) = \|f - g\| = \left( \int_a^b (f(x) - g(x))^2\,dx \right)^{1/2}.$$
If f = g almost everywhere, then the distance between f and g is
zero. If the distance between f and g is zero, then (f — g)2 = 0 almost everywhere,
so f = g almost everywhere. Because it is common to insist that two objects are
the same if the distance between them is zero, we shall assume that f and g are the
same if they are equal almost everywhere. To be specific, we work with equivalence
classes of integrable functions over [a, b] where two functions are equivalent if
and only if they are equal almost everywhere.
Definition: $L^p$
The space $L^p$, $p \ge 1$, consists of all functions f for which $|f|^p$ is integrable over
[a, b], together with a norm defined by
$$\|f\|_p = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}$$
and a distance defined by
$$d_p(f, g) = \|f - g\|_p.$$
Definition: $L^\infty$
The space $L^\infty$ consists of all functions f that are bounded almost everywhere over
[a, b], together with a norm defined by
$$\|f\|_\infty = \inf\{\alpha \mid |f(x)| \le \alpha \text{ almost everywhere}\},$$
and a distance defined by
$$d_\infty(f, g) = \|f - g\|_\infty.$$
Functions f and g are considered identical if they are equal almost everywhere.
The set of functions f for which $f^2$ is integrable over [a, b], equipped with the
distance function we have just defined, is denoted by $L^2$, or $L^2[a, b]$ if we need to
specify the interval. More generally, Riesz defined $L^p$ as shown above.
We need to check that these really are vector spaces. The properties of vector
spaces are given on p. 18. Most of these properties are easily seen to be satisfied,
and we leave these for Exercise 8.2.18. We shall prove closure under addition.
Proof Since f and g are measurable, so is f + g. We only need to verify that the
integral of $|f + g|^p$ is finite. This follows because
$$|f + g|^p \le \left(2\max\{|f|, |g|\}\right)^p \le 2^p\left(|f|^p + |g|^p\right),$$
which is integrable.
Definition: Norm
Given a vector space V and a mapping N from V to R, we say that N is a norm
if it satisfies the following properties:
In this case, it is easy to see that this space is closed under addition and scalar
multiplication (see Exercise 8.2.19).
We have constructed examples of norms on sets of functions and then used these
norms to define distance. Our definition of an $L^p$ space can be used with 0 < p < 1
to define a vector space. The problem with these values of p is that the resulting
norm does not satisfy the fourth of the required properties of a norm, the triangle
inequality (see Exercises 8.2.26–8.2.28).
The last of these conditions, the triangle inequality, is the only property of the
norms that does not follow immediately from the definition. Later in this section
we shall see how to prove it.
Convergence
Convergence of $(f_k)$ to f in $L^p$ means that given any $\epsilon > 0$, we can find a response
K so that for any $k \ge K$, we have that $\|f_k - f\|_p < \epsilon$.
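Lemma 8.5 (Hölder–Riesz Inequality). For p, q > 1 such that 1/p + 1/q = 1,
we take any $f \in L^p$, $g \in L^q$. We shall always have that $fg \in L^1$ and
$$\int_a^b |f(x) g(x)|\,dx \le \left( \int_a^b |f(x)|^p\,dx \right)^{1/p} \left( \int_a^b |g(x)|^q\,dx \right)^{1/q}. \tag{8.14}$$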
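As a numerical sanity check of inequality (8.14), the following sketch compares both sides for concrete functions. Everything here is an assumption made for illustration: the helper names `integrate` and `holder_sides` are hypothetical, the integrals are approximated by a plain midpoint rule, and the test functions on [0, 1] are arbitrary choices.

```python
def integrate(h, a, b, n=20000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

def holder_sides(f, g, p, a, b):
    """Return (left side, right side) of the Hölder-Riesz inequality."""
    q = p / (p - 1)  # conjugate exponent, so that 1/p + 1/q = 1
    lhs = integrate(lambda x: abs(f(x) * g(x)), a, b)
    rhs = (integrate(lambda x: abs(f(x)) ** p, a, b) ** (1 / p)
           * integrate(lambda x: abs(g(x)) ** q, a, b) ** (1 / q))
    return lhs, rhs

# An arbitrary pair of functions: the inequality is strict here.
lhs, rhs = holder_sides(lambda x: x, lambda x: 1 - x * x, 3, 0.0, 1.0)
print(lhs, rhs)

# Choosing g = f^(p-1) makes |g|^q proportional to |f|^p, the equality case.
lhs_eq, rhs_eq = holder_sides(lambda x: x, lambda x: x * x, 3, 0.0, 1.0)
print(lhs_eq, rhs_eq)
```

The second pair illustrates why the bound cannot be improved: when $|g|^q$ is a multiple of $|f|^p$, both sides coincide up to quadrature error.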
Proof Since f and g are measurable, so is fg. Inequality (8.14) implies that fg
is integrable. We only need to prove inequality (8.14).
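In Exercise 8.2.21, you are asked to verify that the equation 1/p + 1/q = 1 is
equivalent to p - 1 = 1/(q - 1), to (q - 1)p = q, and to (p - 1)q = p. Using
this equivalence, we see that for positive x, the function $f(x) = xy - x^p/p$ has its maximum at
$x = y^{1/(p-1)} = y^{q-1}$ (see Exercise 8.2.22). Therefore,
$$xy - \frac{x^p}{p} \le y^q \left(1 - \frac{1}{p}\right) = \frac{y^q}{q},$$
that is, for all nonnegative $\alpha$ and $\beta$,
$$\alpha\beta \le \frac{\alpha^p}{p} + \frac{\beta^q}{q}. \tag{8.15}$$
We now set
$$A = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}, \qquad B = \left( \int_a^b |g(x)|^q\,dx \right)^{1/q}.$$
We use inequality (8.15) with $\alpha = |f(x)|/A$, $\beta = |g(x)|/B$:
$$\frac{1}{AB} \int_a^b |f(x) g(x)|\,dx = \int_a^b \frac{|f(x)|}{A}\,\frac{|g(x)|}{B}\,dx \le \int_a^b \frac{|f(x)|^p}{p A^p}\,dx + \int_a^b \frac{|g(x)|^q}{q B^q}\,dx = \frac{1}{p} + \frac{1}{q} = 1.$$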
f f(x) f f(x) + 1
Ig(x)I dx.
We apply the Hölder—Riesz inequality to each integral on the right side of this in-
equality,
fb
f(x)+ dx
1/q b i/P
f(x) + dx) dx)
<(L (f
1/p
dx) dx)
+ (f (f
=M
(f
=M(f
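Proof. Assume that $f \in L^q$. This implies that f is measurable. It is in $L^p$ if and only
if $\int_a^b |f(x)|^p\,dx$ is finite. Let r = q/p > 1 and define s so that 1/r + 1/s = 1; that
is, s = r/(r - 1) = q/(q - p). We use the Hölder–Riesz inequality, Lemma 8.5,
with $|f|^p$ and the constant function 1,
$$\int_a^b |f(x)|^p\,dx = \int_a^b |f(x)|^p \cdot 1\,dx \le \left( \int_a^b |f(x)|^{pr}\,dx \right)^{1/r} \left( \int_a^b 1^s\,dx \right)^{1/s} = \left( \int_a^b |f(x)|^q\,dx \right)^{p/q} (b - a)^{(q-p)/q},$$
which is finite.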
Exercises
8.2.1. For $1 \le p < q < \infty$, find an example of a function in $L^p$ that is not in $L^q$.
= inf sup
xE[a,b]
where is the characteristic function of the open interval ((k — 1)/n, k/n)),
fk,n =
Show that if $1 \le p < \infty$, then this sequence converges in the $L^p$ norm. Show that
it does not converge in the $L^\infty$ norm.
8.2.13. Show that the sequence given in Exercise 8.2.12 converges in the Cesàro
sense almost everywhere.
8.2.14. Define = . Show that $f_n$ converges pointwise to the constant
function 0, but this sequence does not converge in any $L^p$ norm, $1 \le p \le \infty$.
Compare to Example 4.6 on p. 100. Explain why this sequence does not converge
to the constant function 1/2 in the $L^1$ norm.
8.2.15. Find a sequence of functions that converges in the $L^\infty$ norm but does not
converge pointwise.
8.2.16. Find a sequence of functions that Cesàro converges almost everywhere but
does not converge in measure.
8.2.17. Find a sequence of functions that converges in measure but does not Cesàro
converge almost everywhere.
8.2.18. Verify that $L^p[a, b]$ satisfies the definition of a vector space as given on
p. 18.
8.2.19. Show that the set of functions in $L^\infty[a, b]$ is closed under addition and
scalar multiplication.
8.2.20. Show that if x and y are nonnegative, then
$$\max\{x, y\} = \lim_{p \to \infty} \left( x^p + y^p \right)^{1/p}$$
and, in general, for nonnegative x1, x2, . . . ,
f f If(x)Idx (8.17)
Exercises 8.2.26–8.2.28 establish the fact that the triangle inequality does not
hold for $L^p$ spaces when 0 < p < 1.
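8.2.26. Let $f \in L^p[a, b]$, where 0 < p < 1, and $g \in L^q[a, b]$, q = p/(p - 1) < 0,
where $f \ge 0$ and g > 0 on [a, b]. Show that
$$\int_a^b f(x) g(x)\,dx \ge \left( \int_a^b |f(x)|^p\,dx \right)^{1/p} \left( \int_a^b |g(x)|^q\,dx \right)^{1/q}.$$
8.2.27. Let $f, g \in L^p[a, b]$, where 0 < p < 1 and $f, g \ge 0$ on [a, b]. Show that
$$\|f + g\|_p \ge \|f\|_p + \|g\|_p.$$
8.2.28. For 0 < p < 1, show that there exist functions $f, g \in L^p[a, b]$ such that
$$\|f + g\|_p > \|f\|_p + \|g\|_p.$$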
8.2.29. For n E N, define
= (n(n + X(±,1)•
Show that for each pair of distinct integers, m, n, the distance between fm and
is 2, Ifm — II,, = 2, and therefore this is a bounded sequence in that does
not have a limit point in
The name used to designate complete metric spaces was chosen to honor one
of the founders of functional analysis, Stefan Banach (1892—1945). He was born
in Krakow in what was then Austria-Hungary, today Poland. In 1920, he began
teaching at Lvov Technical University in what was then Poland and is now Ukraine.
Until the Nazi occupation of Lvov in 1941, he was a prolific and important mathe-
matician. Imprisoned briefly by the Nazis, he spent the remainder of the war feeding
lice in Rudolf Stefan Weigl's Typhus Institute.2 Banach died of lung cancer shortly
after the war ended.
The proof that $L^p$ is complete for every p, $1 \le p \le \infty$, is known as the Riesz–Fischer
theorem. Riesz and Fischer each proved it independently for p = 2 in 1907.
The remaining cases were established by Riesz in 1910.
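Proof We shall first prove the case $p = \infty$. To say that a sequence is Cauchy
in the $L^\infty$ norm means that given any $\epsilon > 0$, we can find a response N so that
$m, n \ge N$ implies that $\|f_m - f_n\|_\infty < \epsilon$. This means that $|f_m(x) - f_n(x)| < \epsilon$
almost everywhere. With the sets $A_k$ and $B_{m,n}$ of Exercise 8.3.2, let
$$F = \left( \bigcup_{k=1}^{\infty} A_k \right) \cup \left( \bigcup_{1 \le m < n < \infty} B_{m,n} \right), \qquad m(F) = 0.$$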
For each x in what remains, [a, b] — F, we have that m, n > N implies that
2 For more information on this unpleasant occupation and how it was used to save the lives of a number of
Polish and Ukrainian intellectuals during World War II, see www.lwow.home.pl/Weigl.html.
8.3 Banach Spaces 265
,8 IIfNIL,o}.
lfk(x)l /3 + 1.
<
-
In general, once we have found 1, we choose > 1
so that n implies
that
- fnkllp < 2k
It follows that
+ - lip +1 <oc.
= + —
fb p
dx
+ —
j=1
+ (8.18)
fb p
dx dx <oc.
(8.19)
By the monotone convergence theorem, Theorem 6.14 on p. 174, the sequence of
functions
/ k
j=1
= + fnjl
+ — =
and define f any way we want, say f(x) = 0, on the set of measure zero where
the subsequence does not converge. From equation (8.19),
fb = fb p
dx — dx
+
b 00
<f +
$f \in L^p$. It only remains to prove that $(f_n)$ converges to f in the sense
of the $L^p$ norm.
We observe that
f(x) — fnk(x) -
= f—k
If — fnkIIp —
=
j=k
Given any E > 0, choose k so that E > For all n > we see that
1 1 3
LI
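these three vectors, and we can use the dot product to find this decomposition. The
component of $\vec v$ in the direction of $\vec v_k$ is
$$\frac{\vec v \cdot \vec v_k}{\vec v_k \cdot \vec v_k}\, \vec v_k.$$
For instance, the coefficient of $\vec v_1 = (1, -1, 2)$ in the decomposition of (1, 2, 3) is
$$\frac{1 \cdot 1 + 2 \cdot (-1) + 3 \cdot 2}{1 \cdot 1 + (-1) \cdot (-1) + 2 \cdot 2} = \frac{5}{6}.$$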
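The quotient above can be extended to a full decomposition. The sketch below is an illustration under assumptions: it takes the vector being decomposed to be (1, 2, 3), as the numerator suggests, and uses the three pairwise orthogonal vectors listed in Exercise 8.3.1; the helper names `dot` and `coefficients` are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def coefficients(v, basis):
    """Coefficient of each orthogonal basis vector b: (v . b) / (b . b)."""
    return [Fraction(dot(v, b), dot(b, b)) for b in basis]

# Orthogonal vectors from Exercise 8.3.1; the target (1, 2, 3) is inferred.
basis = [(1, -1, 2), (2, 0, -1), (1, 5, 2)]
v = (1, 2, 3)

coeffs = coefficients(v, basis)
# Rebuild v from its components along the basis vectors.
rebuilt = tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(3))
print(coeffs)
print(rebuilt)
```

Because the basis vectors are mutually orthogonal, each coefficient is found independently of the others, which is exactly the feature that carries over to sines and cosines under the $L^2$ inner product.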
Now think about the Fourier series representation of a function in $L^2[-\pi, \pi]$,
say f(x) = x:
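These functions are orthogonal using the $L^2$ inner product! For $n \ge 1$, we note
that
$$\int_{-\pi}^{\pi} 1 \cdot \cos(nx)\,dx = \frac{1}{n} \sin(nx) \Big|_{-\pi}^{\pi} = 0,$$
$$\int_{-\pi}^{\pi} 1 \cdot \sin(nx)\,dx = \frac{-1}{n} \cos(nx) \Big|_{-\pi}^{\pi} = 0,$$
$$\int_{-\pi}^{\pi} \sin(nx) \cos(nx)\,dx = \frac{-\cos^2(nx)}{2n} \Big|_{-\pi}^{\pi} = 0.$$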
The Fourier series expansion of our function f defined by f(x) = x, $-\pi < x < \pi$,
is simply the representation of f in terms of this orthogonal basis of sines and
cosines.
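The same projection recipe used for vectors can be tried on f(x) = x. The sketch below is illustrative only: `inner` approximates the $L^2$ inner product on $[-\pi, \pi]$ with a midpoint rule, and `sine_coefficient` is a hypothetical helper that divides (f, sin(kx)) by (sin(kx), sin(kx)). The comparison value $2(-1)^{k+1}/k$ is the standard sine coefficient of x, stated here as a known fact rather than taken from the text.

```python
import math

def inner(f, g, n=20000):
    """Midpoint-rule approximation of the L^2 inner product on [-pi, pi]."""
    width = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * width
        total += f(x) * g(x)
    return width * total

def sine_coefficient(f, k):
    """Component of f along sin(kx): (f, sin kx) / (sin kx, sin kx)."""
    s = lambda x: math.sin(k * x)
    return inner(f, s) / inner(s, s)

identity = lambda x: x

# Orthogonality of a few basis functions.
print(inner(lambda x: 1.0, lambda x: math.cos(3 * x)))
print(inner(lambda x: math.sin(2 * x), lambda x: math.cos(2 * x)))

# Sine coefficients of f(x) = x next to 2(-1)^(k+1)/k.
for k in range(1, 6):
    print(k, sine_coefficient(identity, k), 2 * (-1) ** (k + 1) / k)
```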
The amazing result discovered by Fischer and Riesz is that every function in
L2 has such a representation and every suitably convergent trigonometric series
corresponds to a function in L2.
Theorem 8.8 (Riesz–Fischer Theorem). Let $f \in L^2[-\pi, \pi]$; then f has a unique
Fourier series representation
where
(f(x), cos(nx)) (f(x), sin(nx))
n>O n>1.
(cos(nx), cos(nx)) — (sin(nx), sin(nx)) —
The convergence of this series is convergence in the sense of the L2 norm. Further-
more,
2
<cx. (8.21)
The implication also goes the other way. If $(a_0, a_1, b_1, a_2, b_2, \ldots)$ is any sequence
of real numbers for which $\sum_{n=1}^{\infty} (a_n^2 + b_n^2)$ converges, then
$$a_0 + \sum_{n=1}^{\infty} \left( a_n \cos(nx) + b_n \sin(nx) \right)$$
is a function in $L^2$.
Comparing this to results on Fourier series in Section 8.1, we see that we have
strengthened Lebesgue's assumption. Instead of simply being integrable, we insist
that the square of the function must be integrable. In exchange, we get a much
stronger conclusion. We no longer need the Cesàro limit. Nevertheless, we get
convergence only in the $L^2$ norm. But in 1966, the Swedish mathematician Lennart
Carleson (b. 1928) showed that convergence is not just in the $L^2$ norm, it is pointwise
convergence almost everywhere. In 2006, Carleson received the Abel Prize "for his
profound and seminal contributions to harmonic analysis and the theory of smooth
dynamical systems." In 1970, Richard A. Hunt of Purdue University showed that
there is nothing special about $L^2$. The same is true for functions in any $L^p$ space,
provided only that p is strictly greater than 1. All of the problematic functions that
require a Cesàro limit live in $L^1$ but not in $L^p$, p > 1.
The proof of the Carleson—Hunt theorem is beyond the scope of this book. The
remainder of this chapter will be devoted to proving the Riesz—Fischer theorem.
Because it requires little additional effort to prove a far more general result, I shall
present this proof in the context of Hilbert spaces.
Exercises
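8.3.1. Verify that each pair of the vectors
$$\vec v_1 = (1, -1, 2), \quad \vec v_2 = (2, 0, -1), \quad \vec v_3 = (1, 5, 2)$$
is orthogonal.
8.3.2. Show that for functions in $L^\infty[a, b]$, each of the following sets has measure
zero:
$$A_k = \{x \in [a, b] \mid |f_k(x)| > \|f_k\|_\infty\},$$
$$B_{m,n} = \{x \in [a, b] \mid |f_m(x) - f_n(x)| > \|f_m - f_n\|_\infty\}.$$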
8.3.3. Let C = C[0, 1] be the set of all continuous functions on [0, 1]. For $f \in C$,
define the max norm by $\|f\|_{\max} = \max_{x \in [0,1]} |f(x)|$. Show that C equipped with the max
norm is a Banach space.
8.3.4. Let C = C[0, 1] be the set of all continuous functions on [0, 1]. Show that
C equipped with the $L^2$ norm, $\|f\|_2 = \left( \int_0^1 f^2(x)\,dx \right)^{1/2}$, is not a Banach space.
8.3.5. Show that for f e L bI, 1 <p <oc and any E > 0, there exists a contin-
suchthat If — <E and If — <E.
8.4 Hilbert Spaces
iib = cos9,
V 1i1,ii.
This result was proved for R°° by Cauchy in 1821 and for L2 (though they did
not call it that) independently by Victor Bunyakovsky in 1859 and Hermann A.
Schwarz in 1885.
$$0 \le (x - \lambda y, x - \lambda y) = (x, x) + \lambda^2 (y, y) - 2\lambda (x, y)$$
The parallelogram law says that the sum of the squares of the diagonals of a
parallelogram equals the sum of the squares of the sides,
$$\|\vec a + \vec b\|^2 + \|\vec a - \vec b\|^2 = 2\left( \|\vec a\|^2 + \|\vec b\|^2 \right).$$
This is also true of any Hilbert space, and the proof is the same.
Proposition 8.11 (Parallelogram Law). The inner product of any Hilbert space
satisfies the equality
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{8.22}$$
This proof is left as Exercise 8.4.2. Equipped with this result, we can quickly
verify that $L^2$ is the only $L^p$ space whose norm arises from an inner product.
Proposition 8.12 ($L^2$ Alone). The only $L^p$ space that is also a Hilbert space is
$L^2$.
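One way to see why p = 2 is singled out is to test the parallelogram law on a pair of characteristic functions. The sketch below is an illustration in the spirit of the proof, not a transcription of it: it assumes f and g are the characteristic functions of two disjoint sets of equal measure m, for which the $L^p$ norms have the closed forms noted in the comments. The function name `parallelogram_gap` is hypothetical.

```python
def parallelogram_gap(p, m=1.0):
    """Return ||f+g||^2 + ||f-g||^2 - 2(||f||^2 + ||g||^2) in the L^p norm.

    Assumes f and g are characteristic functions of disjoint sets of
    measure m, so |f + g| = |f - g| = 1 on a set of measure 2m.
    """
    norm_f = m ** (1 / p)            # ||f||_p = ||g||_p = m^(1/p)
    norm_sum = (2 * m) ** (1 / p)    # ||f + g||_p
    norm_diff = (2 * m) ** (1 / p)   # ||f - g||_p
    return norm_sum ** 2 + norm_diff ** 2 - 2 * (norm_f ** 2 + norm_f ** 2)

for p in (1, 1.5, 2, 3, 4):
    print(p, parallelogram_gap(p))
```

The gap simplifies to $2 m^{2/p} (2^{2/p} - 2)$, which vanishes only when p = 2: it is positive for p < 2 and negative for p > 2, so no other $L^p$ norm can come from an inner product.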
Proof In $L^p[a, b]$, take two subsets, $S, T \subseteq [a, b]$, such that $S \cap T = \emptyset$ and
$m(S) = m(T) \ne 0$. Define $\lambda = m(S)^{1/p}$. In Exercise 8.4.3, you are asked to show
that
L2. Is anything missing? Is there any function (other than the constant function 0)
that is orthogonal to all of these in L2? In other words, do we have a complete
orthogonal set?
Definition: Orthogonality
If x and y are elements of a Hilbert space such that (x, y) = 0, then we say that
x and y are orthogonal. A set $\Omega$ of elements of a Hilbert space is called an
orthogonal set if for each pair $x \ne y \in \Omega$, we have that (x, y) = 0.
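Our proof will be spread over Lemmas 8.14–8.16 and will proceed by contradiction.
We assume that we have a nonzero function $f \in L^2[-\pi, \pi]$ for which
$$\int_{-\pi}^{\pi} f(x) \cos(nx)\,dx = \int_{-\pi}^{\pi} f(x) \sin(nx)\,dx = 0 \quad \text{for all } n \ge 0.$$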
The proof breaks up into three pieces. We first establish the existence of finite
trigonometric series with certain properties that help us isolate the value of f near
specific points. Next, we use the existence of such a finite series with the fact that
f is continuous to find a contradiction. Finally, we pull this all together to find a
contradiction when all we assume about f is that it is integrable. Notice that we
need to prove our result only for f e L2. What we shall actually prove is stronger
than we need.
Lemma 8.14 (Special Trigonometric Polynomial). Given any $\delta > 0$ and any
$\epsilon > 0$, we can find a finite trigonometric series, T(x),
for which
1. $T(x) \ge 0$ for all $x \in [-\pi, \pi]$,
2. $\int_{-\pi}^{\pi} T(x)\,dx = 1$, and
3. $T(x) < \epsilon$ for all $\delta \le |x| \le \pi$.
These functions are all nonnegative, and the area underneath the graph is always
1, but that area is concentrated as close to zero as we wish (see Figure 8.2). The
effect of integrating f T is to pick out the values of f closest to 0. If f is nonzero
at any point, say $f(z) \ne 0$, then integrating f(x + z)T(x) picks out those values
of f in an arbitrarily small neighborhood of z.
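Proof. Define
$$T_n(x) = (1 + \cos x)^n \left( \int_{-\pi}^{\pi} (1 + \cos x)^n\,dx \right)^{-1}.$$
The first two properties are clearly satisfied by this function. For the third property,
we observe that for $\delta \le |x| \le \pi$ we have that
$$T_n(x) \le \frac{(1 + \cos\delta)^n}{\int_{-\delta/2}^{\delta/2} (1 + \cos x)^n\,dx} \le \frac{(1 + \cos\delta)^n}{\delta\,(1 + \cos(\delta/2))^n} = \frac{1}{\delta} \left( \frac{\cos(\delta/2)}{\cos(\delta/4)} \right)^{2n}.$$
Since
$$\frac{\cos(\delta/2)}{\cos(\delta/4)} < 1,$$
we can find an n for which $T_n(x) < \epsilon$ for all $\delta \le |x| \le \pi$.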
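The three properties of the lemma can be observed numerically for the normalized powers of 1 + cos x. This is a sketch under stated assumptions: the normalizing integral is approximated by a midpoint rule, and the names `integrate`, `make_T`, and `sup_outside` are hypothetical helpers introduced only for this illustration.

```python
import math

def integrate(h, a, b, n=20000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

def make_T(n):
    """Return x -> (1 + cos x)^n divided by its integral over [-pi, pi]."""
    norm = integrate(lambda x: (1 + math.cos(x)) ** n, -math.pi, math.pi)
    return lambda x: (1 + math.cos(x)) ** n / norm

def sup_outside(T, delta, samples=2000):
    """Largest sampled value of T on delta <= x <= pi (T is even in x)."""
    step = (math.pi - delta) / samples
    return max(T(delta + i * step) for i in range(samples + 1))

delta = 0.5
for n in (10, 50, 200):
    T = make_T(n)
    print(n, integrate(T, -math.pi, math.pi), sup_outside(T, delta))
```

As n grows, the total area stays at 1 while the largest value away from the origin decays geometrically, which is the concentration that the lemma exploits.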
Lemma 8.15 (Continuous f). If f is continuous on $[-\pi, \pi]$ and f is orthogonal
to cos(nx), $n \ge 0$, and to sin(nx), $n \ge 1$, then f is the constant function 0 on this
interval.
0=] f(x+z)T(x)dx
—7t
p7r
C
/ T(x)dx+ / f(x+z)T(x)dx+] f(x+z)T(x)dx
8
Jr —8
= (f(x + z)
f T(x)dx +f — c/2) T(x)dx
(f(x+z)-c/2)T(x)dx
+ f
— (M
+ f T(x)dx - (M
+ f T(x)dx
>
2 \ 21
where we can get any positive value we wish for E by suitable choice of T. This
tells us that
fir
0=] f(x+z)T(x)dx> c >0,
—JT
Lemma 8.16 (Integrable f). Let f be integrable on $[-\pi, \pi]$. If f is orthogonal
to cos(nx), $n \ge 0$, and to sin(nx), $n \ge 1$, then f is the constant function 0
on this interval.
is continuous and that we can relate the Fourier coefficients of f and F by means
of integration by parts (see Exercise 7.4.11 on p. 239). For $n \ge 1$, we have that
$$\int_{-\pi}^{\pi} F(x) \cos(nx)\,dx = F(x)\,\frac{\sin(nx)}{n} \Big|_{-\pi}^{\pi} - \frac{1}{n} \int_{-\pi}^{\pi} f(x) \sin(nx)\,dx = 0,$$
$$\int_{-\pi}^{\pi} F(x) \sin(nx)\,dx = -F(x)\,\frac{\cos(nx)}{n} \Big|_{-\pi}^{\pi} + \frac{1}{n} \int_{-\pi}^{\pi} f(x) \cos(nx)\,dx = 0.$$
The only nonzero Fourier coefficient of F is the constant term, say $A_0$. All
Fourier coefficients of $F - A_0$ are zero, so F is constant. This implies that f =
F' = 0.
is a complete orthonormal set in L2[—7r, 7t]. The advantage of working with this
orthonormal set is that the Fourier coefficient of, say, is simply the
inner product of f with
All of the remaining results needed to establish the Riesz–Fischer theorem will
be done in the context of an arbitrary Hilbert space H for which there exists a
countable, complete, orthonormal set $\{\phi_1, \phi_2, \ldots\}$. For $f \in H$, the real number
$(f, \phi_k)$ is called the generalized Fourier coefficient.
For $f \in H$, we need to show that
= (f, f) — f) -
+
j=1 k=1
= If 112 +
- = If 112
— (8.25)
If 112. (8.26)
(8.27)
and therefore if f E H, then the sum of the squares of the generalized Fourier
coefficients must converge. This proves inequality (8.21) in the Riesz—Fischer
theorem (Theorem 8.8 on p. 269).
f= (8.28)
= (8.29)
Finally, if $(c_k)$ is any sequence of real numbers for which $\sum_{k=1}^{\infty} c_k^2$ converges, then
$$\sum_{k=1}^{\infty} c_k \phi_k$$
converges to an element of H.
Proof Let
k=in+1
and, therefore,
k=m+1
sufficiently large. That means that our sequence is Cauchy, and therefore it
must converge. Let g be its limit,
lim
fl 00
=g =
k=1
=
j=1
=
It follows that
(g, (/)k) = jim = (f'
fl —+00
and, therefore,
hf 112
— =
= lim =0.
k=1
Exercises
8.4.1. Show that if x is orthogonal to ..., then x is orthogonal to any linear
combination
8.4.2. Using the definition of the norm in terms of the inner product, prove that
8.4.3. Finish the proof of Proposition 8.12 by proving the four identities in equa-
tion (8.23).
8.4.4. Use integration by parts (Exercise 7.4.12) to show that if F is a differentiable
function over f
and if F' = has the Fourier series representation
8.4.8. Let B be a Banach space whose norm satisfies the parallelogram law, (equa-
tion (8.22)). Show that if we define the inner product by
8.4.9. Let C[0, 1] be the set of continuous functions with the max norm (see
Exercise 8.3.3). Does this space satisfy the parallelogram law? Justify your answer.
8.4.10. Let x be orthogonal to each of the elements and let y =
Show that x is orthogonal to y.
9
Epilogue
Does anyone believe that the difference between the Lebesgue and Riemann integrals can
have physical significance, and that whether say, an airplane would or would not fly could
depend on this difference? If such were claimed, I should not care to fly in that plane.
— Richard W. Hamming'
Hamming's comment, though cast in a more prosaic style, echoes that of Luzin
with which we began the preface. When all is said and done, the Lebesgue integral
has moved us so far from the intuitive, practical notion of integration that one can
begin to question whether the journey was worth the price.
Before undertaking this study of the development of analysis in the late nine-
teenth, early twentieth centuries, I had been under the misapprehension that what
convinced mathematicians to adopt the Lebesgue integral was the newfound ability
to integrate the characteristic function of the rationals. In fact, the evidence is that
they were quite content to leave that function unintegrable. The ability to integrate
the derivative of Volterra's function was important, but less for the fact that the
Lebesgue integral expanded the realm of integrable functions than that, in so doing,
this integral simplified the fundamental theorem of calculus. This begins to get to
the heart of what made the Lebesgue integral so attractive: It simplifies analysis.
I have included Osgood's proof to show how difficult it can be to make progress
when chained to the Riemann integral and how easily such a powerful result as the
dominated convergence theorem flows from the machinery of Lebesgue measure
and integration.
The real significance of the Lebesgue integral was the reappraisal of the notion
of function that enabled and, in turn, was promoted by its creation. Following
Dirichlet and Riemann, mathematicians had begun to grasp how very significant
it would be to take seriously the notion of a function as an arbitrary rule mapping
elements of one set to another. Through the second half of the nineteenth century,
they came to realize that the study of real-valued functions of a real variable is the
study of the structure of R. Set theory and the geometry of R took on an importance
that was totally new.
This insight was solidified in Jordan's Cours d'analyse, the textbook of the mid-
1890s that would shape the mathematical thinking of Borel, Lebesgue, and their
contemporaries. Jordan established the principle that the integral is fundamentally
a geometric object whose definition rests on the concept of measure. Jordan got
the principle correct, but the details — his choice of a measure based on finite
covers — were discovered to be flawed. This was the direct inspiration for the work
of Borel and Lebesgue. The great simplification that came out of their work was
the recognition that what happens on a set of measure zero can be ignored.
Chapter 8 hints at the fundamental shift that occurred in the early twentieth
century when the theory of functions as points in a vector space — the basis for
functional analysis — emerged. To appreciate this new field, we must view it in
the context of all that was happening in mathematics. This book has followed a
single strand from the historical development of mathematics and so ignored much
else that was happening, influencing and being influenced by the development
of the theory of integration. We saw a hint of this in Borel's work in complex
analysis that led him to the Heine—Borel theorem and in occasional references to
multidimensional integrals. But we have ignored the entire development of complex
analysis, the insights into the calculus of variations, the study of partial differential
equations, and the nascent work in probability theory. Most of the nineteenth-
century mathematicians working in what today we call real analysis were working
broadly on analytical questions, and especially on the practical questions of finding
and describing solutions to situations modeled by partial differential equations.
The manuscript that Joseph Fourier deposited at the Institut de France on December
21, 1807, the event that I identified in A Radical Approach to Real Analysis as
the beginning of real analysis, showed how to solve Laplace's equation, a partial
differential equation,
\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial w^2} = 0. \]
The same trick that was used there, the assumption that
z(x, w) =
would be shown to work on many other partial differential equations. The difficulty
would come in expressing the distribution of values along the boundary in terms
of the basis functions,
f(x) = z(x, 0) =
2
A set is separable if it contains a subset that is both countable and dense.
Appendix A
Other Directions
To avoid confusion, from here on we shall use "collection" when we are speaking of a set of sets.
there is at least one Borel set in $B_n - B_{n-1}$ that is contained within the interval
$(n, n + 1)$ (see Exercise A.1.1). Call it $E_n$. The union $E = \bigcup_{n=1}^{\infty} E_n$ is a Borel
set, but it is not in any of the $B_n$, and so it is not in the collection $\bigcup_{n=1}^{\infty} B_n$. We
need to define a larger collection of Borel sets, $B_\infty$, the collection of all countable
unions, countable intersections, and differences of sets in $\bigcup_{n=1}^{\infty} B_n$. We note that
the cardinality of $B_\infty$ is $c$.
We still are not done. We can find a countable collection of sets in $B_\infty$ whose
union is not in $B_\infty$. We need another collection that consists of all countable unions
of sets from $B_\infty$, all countable intersections of sets from $B_\infty$, and all differences of
sets in $B_\infty$. We shall call this collection of sets $B_{\infty+1}$, and now we begin to see the
problem. We are going to get many transfinite subscripts.
In set theory, this first infinite subscript is usually denoted by $\omega$ rather than
$\infty$ to avoid confusion with the transfinite cardinal numbers for which adding
1 results in no change ($\aleph_0 + 1 = \aleph_0$). The subscript $\omega$ or $\infty$ is referred to as the
first transfinite ordinal number, distinguishing these from the transfinite cardinal
numbers. Because we do not really need a new symbol for infinity (we shall use $\infty$
consistently in this appendix), we shall stick with $\infty$. We are in good company. Our
notation is what Cantor used in his early explorations of transfinite induction when
he attempted to classify second species sets by continuing the notion of derived sets
into transfinite iterations: If $S^{(n)}$ is the $n$th derived set of $S$, then
$S^{(\infty)} = \bigcap_{n=1}^{\infty} S^{(n)}$
and $S^{(\infty+1)}$ is the derived set of $S^{(\infty)}$.
We continue, aware that we have moved into the realm of transfinite numbers
where great care must be taken. By taking unions, intersections, and differences of
sets in $B_{\infty+1}$, we get $B_{\infty+2}$, and so on through $B_{\infty+n}$ for all $n \in \mathbb{N}$. Again, Lebesgue
showed that each $B_{\infty+n}$ is a proper subcollection of $B_{\infty+n+1}$, and so $\bigcup_{n=1}^{\infty} B_{\infty+n}$
does not contain all of the Borel sets. The collection of all unions, intersections, and
differences of sets in $\bigcup_{n=1}^{\infty} B_{\infty+n}$ can be denoted by $B_{\infty+\infty}$, or, more succinctly,
as $B_{\infty \cdot 2}$. This collection also has cardinality $c$. Of course, from here we get $B_{\infty \cdot 2+1}$
and so on through $B_{\infty \cdot 2+n}$, $n \in \mathbb{N}$. It should by now be clear how we can build $B_{\infty \cdot n}$
for any $n \in \mathbb{N}$. What about the set built from all unions, intersections, and
A.1 The Cardinality of the Collection of Borel Sets
2
See, for example, Devlin (1993).
Connection to Baire
It was this 1905 paper (Lebesgue 1905c) in which Lebesgue solved the question
of the existence of functions in class 3, 4, ... (see definition of class, p. 115). Not
only was he able to prove the existence of functions in each finite class, he was
able to prove the existence of functions in class $\infty$, functions that are not in class $n$
for any finite $n$ but are limits of functions taken from the union of all finite classes.
Just as with Borel sets, there are functions that are not in class $\infty$ but are limits of
functions in class $\infty$, that is to say, functions in class $\infty + 1$. The entire process
repeats. For every countable ordinal, there are functions in that class.
Exercises
A.1.1. Using an appropriate linear function composed with the arctangent function,
construct a one-to-one, onto, and continuous function, $f$, from $\mathbb{R}$ to $(n, n + 1)$.
Show that for each $n > 0$, $S$ is a set in $B_n$ if and only if $f(S)$ is a set in $B_n$. Use this
result to show that if $E$ is in $B_n - B_{n-1}$, then $f(E)$ is also in $B_n - B_{n-1}$.
Dave Renfro has pointed out that there is another totally different approach to proving that = c. See Renfro
(2007).
A.2 The Generalized Riemann Integral
f'(x)dx = lim
f 0 E
<6.
—
1 1
+ —
For every $N > K$, the open intervals $\{(a_k, b_k)\}_{k=K}^{N}$ are pairwise disjoint and their
union is contained in $(0, \delta)$, but
f(bk) -
k=K = (k +1/2k k _1/2) k=K
k
The sum of the oscillations is unbounded. This tells us that f cannot be expressed
as a Lebesgue integral of any function.
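The unbounded growth of these sums can be checked numerically. The sketch below assumes that the function under discussion is $f(x) = x^2 \sin(\pi/x^2)$ with $f(0) = 0$, and that the interval endpoints are $a_k = (k + 1/2)^{-1/2}$ and $b_k = k^{-1/2}$. Both choices are a reading of the damaged display above rather than a quotation, and the function names are illustrative only.

```python
import math


def f(x):
    # Assumed example: x^2 sin(pi / x^2), extended by f(0) = 0.
    return 0.0 if x == 0 else x * x * math.sin(math.pi / (x * x))


def oscillation_sum(K, N):
    # Sum of |f(b_k) - f(a_k)| over the disjoint intervals (a_k, b_k), K <= k <= N.
    total = 0.0
    for k in range(K, N + 1):
        a_k = (k + 0.5) ** -0.5
        b_k = k ** -0.5
        total += abs(f(b_k) - f(a_k))
    return total


def total_length(K, N):
    # Combined length of the same intervals; it stays below K ** -0.5.
    return sum(k ** -0.5 - (k + 0.5) ** -0.5 for k in range(K, N + 1))


if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        print(N, round(oscillation_sum(100, N), 4), round(total_length(100, N), 4))
```

Under these assumptions each term equals $1/(k + 1/2)$ up to rounding, so the sum grows like $\ln N$ while the intervals all lie inside $(0, K^{-1/2})$. That is exactly the failure of absolute continuity described in the text.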
Maybe absolute continuity is too strict. Is it possible to define an integral so that
if a function is differentiable on the interval [a, b], then its derivative is always
integrable and the evaluation part of the fundamental theorem of calculus always
holds?
In fact, this is possible. Several mathematicians, beginning with Arnaud Denjoy
in 1912, found ways of extending the Lebesgue integral. Denjoy's original definition
was greatly simplified by Luzin. In 1914, Oskar Perron came up with a different
formulation. Jaroslav Kurzweil, in 1957, discovered an extension of the Riemann
integral that, in the 1960s, Ralph Henstock would rediscover, realizing that it
could do the trick. The integrals of Denjoy, Perron, and Kurzweil–Henstock are
equivalent, though this would not be established until the 1980s. The description we
shall use is that of Kurzweil and Henstock. Because of the many people to whom
this integral could be attributed, we shall refer to it simply as the generalized
Riemann integral.
Recall that a tagged partition of $[a, b]$ is a partition, $(x_0 = a < x_1 < \cdots < x_n = b)$,
of $[a, b]$ together with a set of tags, $\{x_1^*, \ldots, x_n^*\}$, one value taken from each
,
—xii)- <E.
Dirichlet's Function
To see that this works for Dirichlet's function, the characteristic function of the
rationals over [0, 1], we let be an ordering of the set of rational numbers in
[0, 1]. We define
— x= e Q,
6(x)_il
The sum of the lengths of the intervals for which the tag is rational is strictly less
than $\epsilon$, and therefore
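We choose our gauge so that for each $x \in [a, b]$, $0 < |z - x| < \delta(x)$ implies that
\[ \left| \frac{f(z) - f(x)}{z - x} - f'(x) \right| < \frac{\epsilon}{b - a}. \]
Differentiability at $x$ guarantees that we can do this. This is equivalent to
\[ \left| f(z) - f(x) - f'(x)(z - x) \right| < \frac{\epsilon}{b - a}\, |z - x|. \]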
-xii) - (f(b) -
= — xii) - -
j=1 j=1
- + - xii))
=
- — + -
- — —
Proof Given $\epsilon > 0$, we must show how to construct a gauge function $\delta$ so that
given any Riemann sum f(xJ)(xj — for which — we
have that
f <E.
The fact that $G_k$ is open guarantees that we can choose $\delta(x) > 0$. Given a Riemann
sum that satisfies this gauge, f(xJ)(xj — let Ek3 be the set that contains
xJ. Since — < — xl, we know that Gk3 xII.
We now take the difference between the Riemann sum and the Lebesgue integral
and bound it by pieces that we can bound,
j=1
fa f(x) dx = j=1 f — f(x)) dx
j=1 f [xJ_I,x3]flEk1
f(x)
j=1 f f(x) dx
+ j=1 [xf_I,xJ]—Ek3
n 00
—x1_1)+ 1)fidx+
k=—oo fU(k)
We want to choose $\beta$ and $\nu$ so that each of these pieces is strictly less than $\epsilon/3$.
We pick $\beta = \epsilon/(4(b - a))$. Since $f$ is Lebesgue integrable, so is $|f|$. Corollary 6.18
tells us there is an > 0, such that m( U(Gk — Ek)) < implies that
f U(Gk—Ek)
Since m( U(Gk — Ek)) < — Ek) < v(k), we need to choose v so that
Final Thoughts
Over the past few decades, several mathematicians have made the argument that the
generalized Riemann integral should be introduced in undergraduate real analysis
and that at the undergraduate level it should take priority over the Lebesgue integral.
Their argument is based on three premises: that these students are already familiar
with the Riemann integral, that the Lebesgue integral, resting as it does on the notion
of measure theory, is too complicated to introduce at the level of undergraduate
analysis, and that the generalized Riemann integral is much more satisfying because
it can handle a strictly larger class of functions and provides a clean and simple
proof of the most general statement of the evaluation part of the fundamental
theorem of calculus.
I disagree. While many calculus texts introduce the Riemann integral, the fact
is that Riemann's definition is both subtle and sophisticated. Its only real purpose
is to explore how discontinuous a function can be while remaining integrable. For
the functions encountered in the first year of calculus, which should be limited to
piecewise analytic functions, there is no reason to work with tagged partitions in
Exercises
A.2.1. Prove that if f is Riemann integrable over [a, b], then the generalized
Riemann integral of f over [a, b] also exists.
A.2.2. Prove that for any gauge over any closed interval, there is at least one tagged
partition that corresponds to that gauge.
A.2.3. Explain why we can choose a gauge that satisfies the bounds given in
inequality (A.3).
A.2.4. Find an example of a function that is differentiable on [0, 1] but not absolutely
continuous on this interval. Find an example of a function that is absolutely
continuous on [0, 1] but not differentiable at every point of this interval.
A.2.5. Modify the proof of Theorem A.1 to weaken the assumption on f so that it
is differentiable at all but countably many points in [a, b].
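For experimenting with gauges, as in Exercise A.2.2, a repeated-bisection search can be sketched in code. This is not a proof and not the text's own argument. It assumes that a tagged partition "corresponds to" a gauge $\delta$ when every piece $[x_{j-1}, x_j]$ with tag $t$ satisfies $x_j - x_{j-1} < \delta(t)$, and it only tries three candidate tags per piece (the two endpoints and the midpoint), so it can fail for gauges whose workable tags lie elsewhere. The names `delta_fine_partition` and `example_gauge` are hypothetical.

```python
def delta_fine_partition(gauge, a, b, max_depth=60):
    """Search for a tagged partition of [a, b] that is fine for `gauge`.

    Returns a list of (left, tag, right) triples, ordered left to right.
    Assumed fineness condition: right - left < gauge(tag).
    """
    def search(left, right, depth):
        # Try a small, fixed set of candidate tags inside the piece.
        for tag in (left, (left + right) / 2, right):
            if right - left < gauge(tag):
                return [(left, tag, right)]
        if depth == max_depth:
            raise RuntimeError("no fine partition found with these candidate tags")
        # No candidate works, so bisect and treat each half separately.
        mid = (left + right) / 2
        return search(left, mid, depth + 1) + search(mid, right, depth + 1)

    return search(a, b, 0)


def example_gauge(x):
    # A gauge that shrinks toward 0, which forces 0 itself to be used as a tag.
    return 0.1 if x == 0 else x / 2


if __name__ == "__main__":
    for left, tag, right in delta_fine_partition(example_gauge, 0.0, 1.0):
        print(f"[{left}, {right}] tagged at {tag}")
```

With `example_gauge`, no positive tag can serve for a piece whose left endpoint is 0, so the search keeps halving until the piece is shorter than $\delta(0)$ and then tags it at 0.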
Appendix B
Hints to Selected Exercises
Exercises that can also be found in Kaczor and Nowak are listed at the start of each
section. The significance of 3.1.2 = II:2.1.1 is that
Exercise 3.1.2 in this book can be found in Kaczor and Nowak, volume II, problem
2.1.1.
1.2.3 Try to find a sequence of nested intervals $[a_1, b_1] \supset [a_2, b_2] \supset [a_3, b_3] \supset \cdots$
so that the endpoints are elements of the sequence and $a_k$ equals or precedes
$a_{k+1}$, and $b_k$ equals or precedes $b_{k+1}$ in the sequence.
1.2.6 Let $S_n = \sup_{k \ge n} a_k$. Show that $S_1 \ge S_2 \ge \cdots$. If $A$ is the infimum of this
sequence, then it is also the limit. Show that $A$ must satisfy the $\epsilon$-definition of
the lim sup. Show that if $A$ satisfies the $\epsilon$-definition of the lim sup, then it is the
greatest lower bound of the sequence of $S_n$.
1.2.17 Since $f$ is differentiable at $c$, $\lim_{x \to c} (f(x) - f(c))/(x - c)$ exists and
equals $f'(c)$. Use the fact that for $x > c$, $(f(x) - f(c))/(x - c) = f'(y)$ for
some $y$ between $c$ and $x$. Show that $\lim_{y \to c^+} f'(y) = f'(c)$.
1.2.21 Show that $\sum x^n/n^2$ converges uniformly over $[-1, 1]$.
2.1.18 Integrate from $1/n$ to 1 and then take the limit as $n \to \infty$. Rewrite as
[1 fl/n[aj
fl/n \ x Lx]! Jcx/n x
11
Wdx- Jcx/n
fl/n Lx] X
Show that the first two integrals cancel and the third integral approaches a ln a
as n approaches infinity.
2.1.19 Using the fact that the oscillation over $[a, b]$ is bounded, show that for any
partition $P$ and any $\epsilon > 0$, we can find a refinement of $P$, $Q \supset P$, so that the
Riemann sum for $Q$ with tags at the left-hand endpoints is strictly larger than
$S(P; f) - \epsilon$.
2.2.5 Use the Taylor polynomial expansion with Lagrange remainder term.
2.2.6 One approach is to prove this by induction on N.
2.2.7 Uniform convergence will imply the first inequality.
2.3.3 = III:1.7.7
2.3.6 8. Fix a prime p. Consider the set of rational numbers in [0, 1] with numerator
equal to p. What is the greatest distance between any real number in this interval
and the nearest rational number with numerator equal to p?
2.3.7 Show that m is continuous at every irrational number and discontinuous at
every rational.
2.3.8 Show that h is continuous at x = 0 and nowhere else.
2.3.9 Prove that g is continuous at c if 2'1c Z for any integer n. Given E > 0,
let M be the smallest integer such that < E. Find 6 > 0 so that is not
1)
an integer for any y e (c — 6, c + 6). Show that + 1)/2] is
constant on (x — 6, x + 6). Show that g(x) — <E for x — <6.
2.3.12 Proceed by induction on $k$ and assume that the derived set of $T_{k-1}$ is $T_{k-2} \cup
\{0\}$. Let $x \ne 0$ be an element of $T_k - T_{k-1}$. Explain why $x$ is not in the derived set
of $T_{k-1}$. Explain why there must be an $\epsilon > 0$ so that $T_{k-1} \cap (x - \epsilon, x + \epsilon) = \emptyset$.
Show that $(x - \epsilon/2, x + \epsilon/2)$ contains at most finitely many elements of $T_k$.
2.3.17 Let $[a, b]$ be any closed subinterval of $[0, 1]$, $a < b$. Show that there
must be a closed subinterval $[a_1, b_1] \subset [a, b]$, $a_1 < b_1$, on which the oscillation
stays less than 1. Show that this has a closed subinterval $[a_2, b_2] \subset
[a_1, b_1]$, $a_2 < b_2$, on which the oscillation stays less than 1/2. In general, once
you have found $[a_{k-1}, b_{k-1}]$, show that it must contain a closed subinterval
$[a_k, b_k] \subset [a_{k-1}, b_{k-1}]$, $a_k < b_k$, on which the oscillation stays less than $1/k$.
Let $\alpha$ be contained in the intersection of all of these intervals. Show that $f$ must
be continuous at $\alpha$. Thus, every closed interval contains a point of continuity.
3.1.6 7. For each x e R, consider the set of fractions strictly less than x and of the
form +2'7b, b odd. Does this set have a largest element?
3.2.12 = I:3.2.27
3.2.3 Order the intervals so that $I_1 = (a_1, b_1)$, $I_2 = (a_2, b_2)$, ..., $I_n = (a_n, b_n)$,
where $b_1 \le b_2 \le \cdots \le b_n$. Why can we assume that no two of the $b_i$ are equal?
Why can we assume that $a_1 < a_2 < \cdots < a_n$? Why can we assume that $a_1 <
0 < b_1$? Now finish the proof.
3.2.6 Show that $|f(a) - f(s)| < \epsilon$ implies that there is an element larger than $s$
that is in this set. Show that $|f(a) - f(s)| > \epsilon$ implies that there is an element
smaller than $s$ that is greater than or equal to every element in this set.
3.2.8 Show that every Cauchy sequence is bounded and that the limit point of this
set of points is equal to its limit as a sequence.
3.2.9 Let $S$ be a bounded set and consider the sequence: $a_1$ chosen from $S$, $a_2$
chosen from among the upper bounds of $S$, $a_3 = (a_1 + a_2)/2$, ..., $a_n$ is the average
of the largest element of $\{a_1, a_2, \ldots, a_{n-1}\}$ that is less than or equal to some
element of $S$ and the smallest element of $\{a_1, a_2, \ldots, a_{n-1}\}$ that is an upper
bound for $S$. Show that this sequence is Cauchy and therefore converges. Show
that the element to which it converges must be the least upper bound of $S$.
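The sequence in this hint is a bisection search, and a short sketch may help in seeing why it is Cauchy. The code below is an illustration under stated assumptions, not part of the exercise: the set $S$ is represented only through a hypothetical predicate `is_upper_bound`, and the sample set $\{x : x^2 < 2\}$ is chosen for the demonstration.

```python
def least_upper_bound(is_upper_bound, a1, a2, steps=60):
    """Approximate sup S by the averaging scheme of the hint.

    a1: a number that is <= some element of S (for example, an element of S).
    a2: an upper bound for S.
    is_upper_bound(x): True when x is an upper bound for S.
    """
    low, high = a1, a2  # largest "not above S" value and smallest upper bound so far
    for _ in range(steps):
        mid = (low + high) / 2  # the next term of the sequence
        if is_upper_bound(mid):
            high = mid
        else:
            low = mid
    return high


def bounds_squares_below_two(x):
    # Upper-bound test for the sample set S = {x : x * x < 2}.
    return x > 0 and x * x >= 2


if __name__ == "__main__":
    print(least_upper_bound(bounds_squares_below_two, 1.0, 2.0))
```

Each step halves the distance between the two tracked values, which is the estimate needed to show that the sequence is Cauchy.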
3.2.10 Consider the set of left-hand endpoints of the nested intervals and prove that
the least upper bound of this set must lie in all of the intervals.
3.2.12 To prove that converges if such a sequence, (ba), exists, first show
that = < +
3.2.14 Show that 1/3 is the only point of this set that is in the open interval
(5/18, 7/18).
3.3.2 Figure 3.1 shows how to get the correspondence between N and the positive
rational numbers.
3.3.3 Pick a countable subset of R that is disjoint from the rationals, say Q + =
a + a E Qj. Define the correspondence so that if xQ U (Q + yr), then x
gets mapped to itself, and the union Q U (Q + gets mapped to just Q +
3.3.4 2. First find a one-to-one correspondence between R and (0, 1). Find a way
to use the arctangent function. Now find a one-to-one correspondence between
the rational numbers in (0, 1) and the rational numbers in [0, 1]. 5. You need to
be able to combine every pair of real numbers into a single real number in a way
that you can recover the original pair. Think of using the decimal expansions.
3.3.6 See hint to 3.3.3.
3.3.7 First define a one-to-one mapping 4 N x Z —k Z and then map (a, b) —k
[b])+b— [b].
3.3.8 Explain the natural bijection between real numbers in [0, 1] and the set of
mappings from $\mathbb{N}$ to {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} (which has cardinality $10^{\aleph_0}$). Note
that each rational number with a denominator that is a power of 10, other than
0 and 1, corresponds to two different mappings. We assume that the set of
mappings is countable, and assign one such mapping (equivalently, we assign
the decimal expansion of one real number) to each positive integer. We let $\psi$
be a one-to-one and onto mapping from $\mathbb{N}$ to $10^{\mathbb{N}}$. We now define a mapping
$T$ (equivalently, a real number) with the property that $T(n)$ (the $n$th digit of
$T$) is not the image of $n$ in $\psi(n)$. (Equivalently, it is not equal to the $n$th digit
of $\psi(n)$.)
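The diagonal construction in this hint can be tried on a finite table of digits. The sketch below is only an illustration: the function name is hypothetical, and the choice of 5 and 6 as replacement digits is one convenient way, assumed here, to stay clear of the expansions ending in repeated 0s or 9s that represent the same real number twice.

```python
def diagonal_digits(rows):
    """Given rows[n] = digits of the n-th listed expansion, build digits T
    such that T[n] differs from rows[n][n] for every n."""
    # Use only the digits 5 and 6, so T never ends in all 0s or all 9s.
    return [6 if row[n] == 5 else 5 for n, row in enumerate(rows)]


if __name__ == "__main__":
    listed = [
        [1, 4, 1, 5],
        [5, 5, 5, 5],
        [0, 0, 0, 0],
        [9, 9, 9, 5],
    ]
    print(diagonal_digits(listed))
```

Whatever list is supplied, the result disagrees with the $n$th row in its $n$th digit, so it cannot appear anywhere in the list.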
3.3.10 Let $\psi$ be the mapping from $A$ onto $B$. Select $A' \subset A$ so that $\psi : A' \to B$
is one to one. Define $\psi^1 = \psi$ and $\psi^n = \psi \circ \psi^{n-1}$. We define the one-to-one
correspondence between $A$ and $B$ as follows: For each $a \in A'$, if we can find an
$n$ so that $\psi^n(a) \in B - A'$, then $a$ is mapped to $\psi(a)$. If $a$ is not in $A'$ or there
is no such $n$ (if each time we apply $\psi$ we get back an element of $A'$), then $a$ is
mapped to itself. Show that under this mapping, each element of $B$ is the image
of exactly one element of $A$.
3.3.13 These are countably infinite sequences of natural numbers.
3.3.15 There is always one map from the empty set to any other set, the trivial map
that takes nothing to nothing.
4.1.3 To prove that nowhere dense implies the existence of subintervals with no
points of $S$, you may find it easier to prove the contrapositive: If every subinterval
of $(a, b)$ contains at least one point of $S$, then every point in $(a, b)$ is an
accumulation point of $S$.
4.1.10 One approach is to modify Cantor's first proof that $\mathbb{R}$ is not countable.
Assume $S = (s_1, s_2, s_3, \ldots)$ is perfect. Start with $s_1$, $s_2$, and $s_3$. For simplicity,
first assume that $s_1 < s_2 < s_3$. Since $s_2$ is an accumulation point of $S$, there
are infinitely many points of $S$ between $s_1$ and $s_3$. Pick the one with smallest
subscript larger than 3. If the new point is larger than $s_2$, discard $s_3$. If it is
smaller, discard $s_1$. Continue doing this to create a sequence of nested intervals.
4.1.12 Show that if Ix — yI < 1/3, then DS(x) — 1/2. What if Ix —
1/9?
4.2.3 First show that for any partition of [a, b], there is a Riemann sum approximation
to the integral of f' whose value is zero.
4.2.4 Note that every neighborhood of any point in SVC(4) contains at least part
of one of the removed intervals, and since it is not entirely contained in this
interval, it must contain an endpoint of one of the removed intervals.
4.2.5 Show that for every E > 0, there is some n E N so that after removing all
intervals of length 1 the intervals that remain all have length strictly less
than E.
4.2.10 Find the open interval centered at 0. Find the open intervals on either side of
+1/3. For each M e N, find the open intervals on either side of
4.2.14 Show that if x e S — S', then we cannot have points of S' that are arbitrarily
close to x.
4.3.3 If we have any point in then we can always find an increasing sequence
of integers (ni < n2 < n3 < •) and a sequence of intervals, [a1, b1 I, [a2, b21,
[a3, b31..., for which link is greater than or equal to E at all points in [ak, bk].
Think about our three examples.
4.3.5 Since $F_1$ has finite outer content, it must be bounded, say $F_1 \subseteq [a, b]$. Define
$G_k = [a, b] - F_k$.
4.3.10 Show that for every n, the oscillation of at x = +1/rn is 1/rn.
4.4.2 Consider the examples you know for which the limit of the integrals does not
equal the integral of the limit.
4.4.5 The Cantor set consists of numbers that can be represented in base 3 without
the use of the digit 1. The set C does include numbers that have a 1 in their base
3 representation. How many 1s?
4.4.10 Show that the series of the derivatives of n2f(x — converges uniformly.
4.4.13 Let RN +Q = {rk <k <N, q e Qj. Show that the characteristic
function of this set is in class 2.
5.1.3 = III:1.7.1, 5.1.4 = III:1.7.2, 5.1.5 = III:1.7.3, 5.1.6 = III:1.7.8, 5.1.7 =
III:1.7.13, 5.1.8 = III:1.7.14, 5.1.9 = III:1.7.15, 5.1.10 = III:1.7.15
5.1.1 We know that for any $\sigma > 0$, the set of points with oscillation $\ge \sigma$ must have
outer content 0. Thus for each $k \ge 1$, we can find a closed interval (not a single
point) on which the oscillation is less than $1/k$. Show how to find a sequence
of nested closed intervals so that all points in the $k$th interval have oscillation
less than $1/k$. Prove that the point in all of these intervals must be a point of
continuity.
5.1.12 Are there any Borel sets that are not in this $\sigma$-algebra?
non
k=1 m=1 n=m
m(K)=supm(UIj
J>1 1. — \J1
5.2.12 Use Exercise 5.1.5 and the fact that rn(S) = m ia).
5.2.13 Given any $\epsilon > 0$, choose covers $U$ of $S$ and $V$ of $T$ so that $m_e(S) > m(U) - \epsilon$
and $m_e(T) > m(V) - \epsilon$. Now use the result proven in Exercise 5.2.12.
5.2.14 Choose $\delta > 0$ that is strictly less than $\inf \{ |x - y| : x \in S, y \in T \}$. Define
U = UXEs and V = UyET We can restrict the collection of covers
of S U T to those that are contained in U U V.
5.3.2S= 1)).
5.3.4 We already know the middle inequality. The third inequality is fairly easy
to establish. The toughest part of this problem is to prove that if $S \subset [a, b]$,
then
$c_i(S) = b - a - c_e(S^C \cap [a, b])$.
5.3.7 Use the fact that U is measurable and thus satisfies Carathéodory's condition.
5.3.9 Show that rn(T) <me(S).
5.3.11 Show that 1 implies 2 implies 3 implies 1. Show that 1 implies 4 implies 5
implies 1.
5.3.12 Every set is a countable union of bounded sets.
5.3.13 One direction, use the Carathéodory condition for both Si and Ti. Other
direction, show that there are measurable sets S and
and and me(T) = m(Ti). Show that
m fl Ti) = 0.
5.3.14 Let S1, Ti be the sets whose existence is established in Exercise 5.3.13. Let
C C S U T be a measurable set for which m(C) > m1(S U T) — E. Show that
<m(C)
=m(CflSi)+m(Cfl Ti)
<me(S) + m1(T).
5.3.15 Show that
I
\k=1 n=k I \n=k
sup inf = lim inf
n>k
f f f f
6.2.21 First show that if $E = \bigcup_k E_k$, where the $E_k$ are pairwise disjoint measurable
sets, and $\phi$ is any simple function, then $\int_E \phi(x)\,dx = \sum_k \int_{E_k} \phi(x)\,dx$.
Then extend this result to any integrable function, $\int_E f(x)\,dx = \sum_k \int_{E_k}
f(x)\,dx$.
6.3.12 = III:2.3.14, 6.3.13 = III:2.3.15, 6.3.14 = III:2.3.18, 6.3.15 = III:2.3.19
6.3.9 Use Corollary 6.17 to show that it is enough to prove this theorem when f
is a simple function. Then use the definition of a simple function to show that
it is enough to prove this when f is the characteristic function of a measurable
set. Then use Theorem 5.11 to show that it is enough to prove this when f is the
characteristic function of an interval.
6.3.11 Use Fatou's lemma.
6.3.12 Apply Fatou's lemma to both fE dx and
fEc
6.3.16 See hint to Exercise 6.3.9.
6.4.6 Show that Baire's sequence, Example 4.5 on p. 100, converges in measure but
does not converge in Kronecker's sense.
6.4.7 Consider the characteristic function of a suitable set.
6.4.9 Choose any two numbers a, b e Al. Show that for any E > 0 and any x e R
there exist x1, x2 e N6(x) for which f(x1) — > lb — a I. Show that this
implies that f is discontinuous at x.
6.4.10 Let E E
f
I dx + rn(E — Er).
1+E 1+E
To show that the measure of E must be finite, consider the sequence defined by
= 1/(nx).
6.4.11 Consider Exercises 6.4.1 and 6.4.2.
6.4.14 If $f$ is continuous relative to $[a, b] - E$, then $\{x \in [a, b] - E \mid f(x) > c\}$
is an open set, and therefore measurable. Any subset of a set of measure zero is
measurable.
6.4.15 Show that
0000 00
=flU fl jx <1/kj.
k=1 n=1 m=1
f
6.4.20 By Exercise 6.1.16, g o f,, and g o are measurable. Use the fact that g is
uniformly continuous on [—C, Cl to show that g o converges in measure to
gof.
7.1.1 Start with the function x sin(1/x). Create a piecewise-defined function that
is continuous at 0 but with four different Dini derivatives at x = 0. Add an
appropriate linear function to make this function strictly increasing.
7.1.3 Assume there exists b > a such that f(b) + cb < f(a) + Ca. For each x e
[a, fri show that there is a neighborhood so that y e y x, implies
that
f(y)+cy—(f(x)+cx) c— Al
>
y—x 2
These neighborhoods provide a cover of [a, b]. Use the Heine—Borel theorem
to find a contradiction.
7.1.7 If $c$ is a point of discontinuity of a monotonically increasing function, $f$, then
$\lim_{x \to c^-} f(x)$ and $\lim_{x \to c^+} f(x)$ exist, and $\lim_{x \to c^-} f(x) < \lim_{x \to c^+} f(x)$.
7.1.11 Show that for each partition $P$ of $[a, b]$ and each $\epsilon > 0$, there is a response $N$
so that for all $n > N$, $|V(P, f) - V(P, f_n)| < \epsilon$. It follows that for each $P$ and
for $n$ sufficiently large (how large depends on $P$), $V(P, f) < V(P, f_n) + \epsilon$.
7.1.12 Explain why it is that if we can show that $f$ has bounded variation on $[0, a]$
for some $0 < a < 1$, then $f$ has bounded variation on $[0, 1]$. Find $V(P_N, f)$,
where PN is the partition of [0, with cut points at — 1
7.2.3 First show this result for a function g for which ? E > 0. Apply this to
g(x) = f(x) + EX.
7.3.4 For the second question, from Corollary 7.3 we know that T(x) = is
continuous.
7.3.5 Since the summands are positive, if every finite sum is $< \epsilon$, then the infinite
sum is $\le \epsilon$.
f
7.4.5 Show that if $F(x) = \int_a^x f(t)\,dt$, then Theorem 7.18 implies that
$\int_a^x (F'(t) - f(t))\,dt = 0$ for all $x \in [a, b]$.
7.4.6 Use Theorem 7.18.
7.4.7 First show that if P is any partition of [a, b], then V(P, f)
f dx. For the other inequality, define = {x e [a, b] f'(x) > 0
S = {x [a, bflf'(x) <01, so that
pb p p
f'(x)dx_J f'(x)dx.
Ja S
Using the fact that every measurable set is almost a finite union of pairwise
disjoint open intervals, show that for any E > 0 there is a partition P of [a, b]
for which
<
1. First prove this equality for the case where g(S) is open. Use this to prove
it for the case where g(S) is closed. Then use the fact that given any E > 0
we can find an open set G g(S) and a closed set F g(S) for which
m(g(S)) — E <m(F) <m(G) <m(g(S)) + E.
2. To avoid the possibility that g'(B) might not be measurable,
define a sequence of open sets [c, d] G1 B for
which = 0. We have B G = and g'(G) =
is measurable. Use part 1 to show that m (A fl g'(G)) = 0.
3. Again define a sequence of open sets [c, d] H1 H2 C for
which = m(C). Then C = B U where m(B) =0.
7.4.10 Use Exercise 7.4.9. First prove it when f is a simple function, then when f
is nonnegative, and finally for an arbitrary integrable function.
7.4.11 Let A = supxE[ab} and B = supxE[ab} Use the inequality
Let Eq be the set of x e [a, b] for which equation (B.4) fails to hold and define
E=
\qEQ /
Using the fact that we can find a rational value as close as we wish to f(c), show
that every point in [a, bI — E is a Lebesgue point.
7.4.17 Use Luzin's theorem, Theorem 6.26.
8.1.10—8.1.15 = 111:2.5.36
sinu — sin6
8.1.8 Work backward starting with the fact that $(0, 1, 0, 1, 0, 1, \ldots)$ is (C, 1). If this
is the sequence $(a_1^{(1)}, a_2^{(1)}, a_3^{(1)}, a_4^{(1)}, \ldots)$, what is $(a_1, a_2, a_3, a_4, \ldots)$?
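"Working backward" here means undoing an averaging step. The sketch below assumes that $a_n^{(1)}$ denotes the average of the first $n$ terms of $(a_k)$, which is the usual meaning of a first-order Cesàro mean; the two function names are hypothetical. It is demonstrated on a sequence other than the one in the exercise.

```python
from fractions import Fraction


def cesaro_means(a):
    # means[n-1] = (a_1 + ... + a_n) / n, computed exactly.
    means, total = [], Fraction(0)
    for n, term in enumerate(a, start=1):
        total += term
        means.append(total / n)
    return means


def from_cesaro_means(means):
    # Invert the averaging: a_n = n * mean_n - (n - 1) * mean_{n-1}.
    a, previous_total = [], Fraction(0)
    for n, mean in enumerate(means, start=1):
        total = n * Fraction(mean)
        a.append(total - previous_total)
        previous_total = total
    return a


if __name__ == "__main__":
    print(cesaro_means([1, 0, 1, 0]))
    print(from_cesaro_means(cesaro_means([3, -1, 4, 1, -5])))
```

Applying `from_cesaro_means` to a prescribed sequence of averages recovers the unique sequence that produces them.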
8.1.10 Define $f$ on $(-\pi, \pi)$ by $f(0) = 0$, $f(x) = -\pi/4$ for $x < 0$, and $f(x) = \pi/4$
for $x > 0$. The Fourier series for $f$ is a pure sine series. Restrict it to $(0, \pi)$ and
then do a change of variables, replacing $x$ by $x/2$.
8.1.11 $\cos(a - b) - \cos(a + b) = 2 \sin a \sin b$. Use the convergence proven in
Exercise 8.1.10 to justify the bound.
8.2.1 Use the fact that f is integrable over [0, 1] if and only if a > —1.
8.2.9 Use Proposition 8.6.
8.2.11 Use Egorov's theorem, Theorem 6.21.
8.2.13 Show that it converges in the Cesàro sense to 0 at every irrational value of x in
[0, 1].
8.2.17 Find a suitable sequence (ar) so that for fk,n = Xe-i the sequence
$f_{1,1}, f_{2,2}, \ldots$ does not converge in the Cesàro sense at any irrational
value of $x$ in [0, 1].
8.2.20 If x <y, rewrite the term inside the limit as y (1 + (x/y)P)
8.2.23 First show that for positive a and and i/p + 1/q = 1, c8 = aP/p + 18"/q
if and only if a =
8.2.26 Show that for 0 < p < 1 and positive x, f(x) = xy — xP/p has its minimum
atx = yl/(P').
8.2.28 Let A and B be disjoint subsets of [a, b] and set f = aX A' g =
Compute the norms and find values of a and for which these functions satisfy
the inequality.
cise 8.3.6 to show that II — flip -± 0. Finally use the Hölder—Riesz inequality
again to show that fg in the L' norm.
The condition p > 1 is used in the proof that (fr) is equi-integrable.
8.3.9 Using Lebesgue's dominated convergence theorem, show that gn f converges
to gf in the norm. To show that f,, converges to gf in the L" norm, use
the fact that
- + -
8.4.8 Linearity is the only property of inner products that does not follow imme-
diately. To prove that (x + y, z) = (x, z) + (y, z), use the parallelogram law to
show that
$\|x + y + z\|^2 = 2\|x + y\|^2 + 2\|z\|^2 - \|x + y - z\|^2.$
There is a similar set of identities for $\|x + y - z\|^2$. To show that (ax, y) =
a (x, y), first prove this for integer a, then rational a, and then use the continuity
of the inner product (exercise 8.4.7) to finish the proof. But be careful, we do
not yet know that this is an inner product. What properties were needed to prove
continuity?
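A numerical sanity check can make the target of this hint concrete. The sketch below assumes the inner product in Exercise 8.4.8 is defined by the standard real polarization identity $(x, y) = (\|x + y\|^2 - \|x - y\|^2)/4$; the defining formula is missing from the exercise as reproduced here, so this is an assumption. It uses the Euclidean norm on $\mathbb{R}^3$ as a test case, and the helper names are illustrative.

```python
import math


def norm(v):
    # Euclidean norm, which satisfies the parallelogram law.
    return math.sqrt(sum(c * c for c in v))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def polarization(x, y):
    # Candidate inner product built from the norm alone.
    return (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2) / 4


if __name__ == "__main__":
    x, y, z = (1.0, 2.0, 3.0), (4.0, -5.0, 6.0), (-2.0, 0.5, 1.0)
    print(polarization(x, y))  # compare with the dot product of x and y
    print(polarization(add(x, y), z), polarization(x, z) + polarization(y, z))
```

For this norm the candidate agrees with the dot product and is additive in its first argument up to rounding. The exercise asks for a proof of additivity that uses only the parallelogram law.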
Bibliography
Fichera, G. 1994. Vito Volterra and the birth of functional analysis. In Development
of Mathematics, 1900—1950. Edited by J. P. Pier. Basel: Birkhäuser, pp. 171—
184.
Gauss, C. F. 1876. Werke, vol. 3. Göttingen: Königlichen Gesellschaft der
Wissenschaften.
Gordon, R. A. 1994. The Integrals of Lebesgue, Denjoy, Perron, and Henstock.
Graduate Studies in Mathematics, vol. 4. Providence, RI: American Mathemat-
ical Society.
Grabiner, J. V. 1981. The Origins of Cauchy's Rigorous Calculus. Cambridge, MA:
MIT Press.
Grattan-Guinness, I. 1970. The Development of the Foundations of Mathematical
Analysis from Euler to Riemann. Cambridge, MA: MIT Press.
1972. Joseph Fourier, 1768–1830. Cambridge, MA: MIT Press.
1990. Convolutions in French Mathematics, 1800–1840, vols. I–III. Basel:
Birkhäuser Verlag.
Hamming, R. W. 1998. Mathematics on a distant planet. Am. Math. Mon. 105:
640—650.
Hardy, G. H. 1991. Divergent Series, 2nd ed. New York: Chelsea.
Hartman, S. and J. Mikusiński. 1961. The Theory of Lebesgue Measure and Integration.
Translated by L. F. Boron. New York: Pergamon Press.
Hawkins, T. 1975. Lebesgue's Theory of Integration: Its Origins and Development,
2nd ed. New York: Chelsea.
Heine, E. 1870. Ueber trigonometrische Reihen. J. Reine Angew. Math. 71: 353—
365.
1872. Die Elemente der Functionenlehre. J. Reine Angew. Math. 74: 172–188.
Hermite, C. and T. J. Stieltjes. 1903—1905. Correspondance d'Hermite et de Stielt-
jes. Edited by B. Baillaud and H. Bourget. Paris: Gauthier-Villars.
Hobson, E. W. 1950. The Theory of Functions of a Real Variable and the Theory
of Fourier's Series, 3rd ed. Washington, DC: Harren Press.
Hochkirchen, T. 2003. Theory of Measure and Integration from Riemann to
Lebesgue. In A History of Analysis. Edited by H. N. Jahnke. Providence, RI:
American Mathematical Society, pp. 197–212.
Jordan, C. 1881. Sur la série de Fourier. C. R. Acad. Sci. Paris. 92: 228—230.
1892. Remarques sur les intégrales définies. J. Math. Pures Appl. 4: 69–99.
1893—1896. Cours d'analyse de l'Ecole Polytechnique, 3 vols. Paris:
Gauthier-Villars.
Kaczor, W. J. and M. T. Nowak. 2000–2003. Problems in Mathematical Analysis,
vols. I–III. Student Mathematical Library vols. 4, 12, 21. Providence, RI:
American Mathematical Society.
3rd ed. New York: Chelsea. Reprinted by American Mathematical Society. Prov-
idence, RI. (Original work published 1904.)
Lützen, J. 2003. The foundations of analysis in the 19th century. In A History
of Analysis. Edited by H. N. Jahnke. Providence, RI: American Mathematical
Society, pp. 155—196.
Luzin, N. 2002a. Function. In Mathematical Evolutions. Translated by A. Shenitzer
and edited by A. Shenitzer and J. Stillwell. Washington, DC: Mathematical
Association of America, pp. 17—34.
2002b. Two letters by N. N. Luzin to M. Ya. Vygodskii. In Mathematical
Evolutions. Translated by A. Shenitzer and edited by A. Shenitzer and J. Stillwell.
Washington, DC: Mathematical Association of America, pp. 35—54.
Marek, V. and J. Mycielski. 2002. Foundations of mathematics in the twentieth
century. In Mathematical Evolutions. Edited by A. Shenitzer and J.
Stillwell. Washington, DC: Mathematical Association of America, pp. 225–246.
Medvedev, Fyodor A. 1991. Scenes from the History of Real Functions, transl.
Roger Cooke. Basel: Birkhäuser Verlag.
Moore, Gregory H. 1982. Zermelo's Axiom of Choice: Its Origins, Development,
and Influence. New York: Springer-Verlag.
Mykytiuk, S. and A. Shenitzer. 2002. Four significant axiomatic systems and
some of the issues associated with them. In Mathematical Evolutions. Edited
by A. Shenitzer and J. Stillwell. Washington, DC: Mathematical Association of
America, pp. 219–224.
Newton, I. 1999. The Principia: Mathematical Principles of Natural Philosophy.
Translated by I. B. Cohen and A. Whitman. Berkeley, CA: University
of California Press. (Originally published 1687.)
Osgood, W. F. 1897. Non-uniform convergence and integration of series term by
term. Am. J. Math. 19: 155–190.
Pier, J.-P., ed. 1994. Integration et mesure 1900–1950. In Development of Mathematics
1900–1950. Basel: Birkhäuser Verlag, pp. 517–564.
Poisson, S.-D. 1820. Suite du mémoire sur les intégrales définies. J. de l'Ecole Roy.
Poly. cahier. 11: 295–335.
Renfro, D. L. 2007. Message from discussion Borel set. Google Groups.
groups.google.com/group/sci.math/msg/66168cf580929605. Accessed August
21, 2007.
Riemann, B. 1990. Gesammelte Mathematische Werke. Reprinted with comments
by R. Narasimhan. New York: Springer-Verlag.
Rudin, W. 1976. Principles of Mathematical Analysis, 3rd ed. New York: McGraw-Hill.
Saxe, K. 2002. Beginning Functional Analysis. New York: Springer-Verlag.
Schappacher, N. and R. Schoof. 1995. Beppo Levi and the Arithmetic of Elliptic
Curves. http://hal.archives-ouvertes.fr/hal-00129719/fr. Accessed 21 August,
2007.
Serret, J.-A. 1894. Calcul Différentiel et Integral, 4th ed. Paris: Gauthier-Villars.
Shenitzer, A. and J. Stepräns. 2002. The evolution of integration. In Mathematical
Evolutions. Edited by A. Shenitzer and J. Stillwell. Washington, DC: Mathematical
Association of America, pp. 63–70.
Siegmund-Schultze, R. 2003. The origins of functional analysis. In A History
of Analysis. Edited by H. N. Jahnke. Providence, RI: American Mathematical
Society, pp. 385–408.
Struik, D. J. 1986. A Source Book in Mathematics 1200–1800. Princeton: Princeton
University Press.
Vitali, G. 1905. Una proprietà delle funzioni misurabili. Reale Istituto Lombardo
di Scienze e Lettere. Rendiconti (2). 38: 600–603.
Wapner, L. M. 2005. The Pea and the Sun: A Mathematical Paradox. Wellesley,
MA: A. K. Peters.
Weierstrass, K. T. W. 1894–1927. Mathematische Werke von Karl Weierstrass,
7 vols. Berlin: Mayer & Müller.
Whittaker, E. T. and G. N. Watson. 1978. A Course of Modern Analysis, 4th ed.
Cambridge: Cambridge University Press.
Index
Cauchy, Augustin Louis, 1, 3, 7, 8, 11, 67, 132, 156, 272
Cauchy criterion, 19, 263
Cauchy integral, 7
Cauchy sequence, 18, 62
Cauchy–Schwarz–Bunyakovski Inequality, 272
ceiling, 16
Cesàro, Ernesto, 245
Cesàro limit, 245, 246
Chae, Soo Bong, 213
change of variables
  in Lebesgue integral, 239
characteristic function, 122
  of the rationals, 3, 71
Chasles, Michel, 24
Church–Kleene ordinal, 289
class, 115, 290
closed set, 55
  compact, 65
closure, 56
cluster point, 47
Cohen, Paul, 75, 155
compact set, 66
  closed and bounded, 65
  continuous image of, 66
complement, 16
complete orthogonal set, 273
completeness, 62
  of U, 264
content, 44, 44, 123
continuity, 16, 54
  absolute, 225, 226, 227
    of total variation, 229
  and compactness, 66
  and differentiability, 12, 36–38, 203, 212, 213
  and integrability, 20, 44–46
  and oscillation at a point, 43
  approximate, 240
  Cesàro, 249
  of infinite series, 19
  of integral, 20
  piecewise, 42
  relative to a set, 114
  uniform, 16, 19, 67
continuously differentiable function, 228
continuum, 74
continuum hypothesis, 75, 95, 153, 156
convergence, 17
  absolute, 19
  almost everywhere, 162, 171
  almost uniform, 192, 195
  bounded, 184
  Cesàro, 245
    (C, 1), 248
    (C, k), 249
  in measure, 194
  Kronecker's, 110, 194
  pointwise, 17
  table of, 251
  uniform, 17, 19, 35, 98, 183
  uniform in general, 42
countable additivity, 143
countable set, 63
countably infinite, 63
Courant, Richard, 12
cover
  countable, 134
  finite, 44
  open, 66
  subcover, 66

D'Antonio, Larry, 9n
Darboux, Gaston, 12, 24, 24, 25, 33, 36, 39, 116
Darboux integral
  lower, 28
  upper, 28
Darboux sum, 26
  lower, 26
  upper, 26
Darboux's functions, 36, 39
Darboux's theorem, 19, 21
de Freycinet, Charles, 11
de la Vallée Poussin, Charles Jean Gustave Nicolas, 226
decreasing function, 17
decreasing sequence, 17
Dedekind, Julius Wilhelm Richard, 23, 24, 51, 59–61
Dedekind cut, 61
deleted neighborhood, 54
DeMorgan's laws, 18
Denjoy, Arnaud, 292
dense, 45, 57
denumerable set, 63
derivative
  Dini, 204, 219
  of infinite series, 19
derived set, 47, 56, 82
devil's staircase, 85–87, 297
Dini, Ulisse, 46, 90, 104, 150, 204, 206, 210, 212, 245
Dini derivative, 204, 219
Dini's theorem, 205
Dirichlet, Peter Gustav Lejeune, 1, 3, 4, 23, 24, 41, 67, 68, 71, 110, 242, 244, 283
Dirichlet's function, 3, 45, 71, 100, 115, 167, 199, 242, 293
discontinuity
  pointwise, 45, 112–114
  total, 45, 112
discrete set, 81
distance between functions, 255
dominated convergence theorem, 183, 188
DS. See devil's staircase
du Bois-Reymond, Paul David Gustav, 11, 12, 37, 46, 99, 103, 104, 120, 121, 242
Dugac, Pierre, 67, 67n
Duhamel, Jean Marie Constant, 36

Egorov, Dimitri Fedorovich, 192, 198
Egorov's theorem, 192, 193
Einstein, Albert, 75
empty set, 16
equi-integrable sequence, 271
equivalence class, 150
Erdős, Paul, 247
Euclid, 52n

Faber, Georg, 212
Faber–Chisholm–Young theorem, 218
Fatou, Pierre Joseph Louis, 187
Fatou's lemma, 187
Fejér, Lipót, 243, 247, 249
Fields Medal, 75n
finite cover, 44
first category, 111, 112
first fundamental theorem of measure theory, 69, 146n
first species, 47, 81
Fischer, Ernst Sigismund, 253, 264, 269
floor, 16
Fourier, Jean Baptiste Joseph, 7, 10, 24, 41, 283
Fourier series, 2–4, 41–42, 114
  Carleson–Hunt theorem, 270
  convergence, 279
  Dini's condition, 245
  Dirichlet's conditions, 242
  Fejér's condition, 247
  Jordan's conditions, 245
  Lebesgue's condition, 247
  Lipschitz's conditions, 244
  Riesz–Fischer condition, 269
Fraenkel, Adolf Abraham Halevi, 154
Fréchet, Maurice René, 56n, 253
fundamental theorem of calculus, 8–12
  antidifferentiation, 9, 38, 203, 223, 232, 294
  evaluation, 9, 223, 236, 293

Ta, 103
T-point, 103
gauge, 292
Gauss, Carl Friedrich, 1, 23
generalized Fourier coefficient, 277
generalized Riemann integral, 292, 295
Gödel, Kurt, 75, 76, 155
Granville, William Anthony, 11
Grattan-Guinness, Ivor, 9n
greatest lower bound, 17

Hadamard, Jacques Salomon, 154, 156, 252
Halphen, George Henri, 252
Hamming, Richard Wesley, 282
Hankel, Hermann, 24, 40, 42, 44–46, 48, 83, 94, 112, 120, 204
Hardy, Godfrey Harold, 11
Harnack, Axel, 46, 63, 226, 251, 253
Hartogs, Friedrich Moritz, 152
Hausdorff, Felix, 96, 140, 154, 156, 253
Hawkins, Thomas, 46, 64n
Heine, Heinrich Eduard, 24, 40, 42, 67, 68, 98, 110, 204
Heine–Borel theorem, 65, 67–69, 105
Henstock, Ralph, 292
Hilbert, David, 153, 154, 253, 272
Hilbert space, 271
Hobson, Ernest William, 11