Real and Functional Analysis - Serge Lang

Graduate Texts in Mathematics 142
Editorial Board
S. Axler F.W. Gehring P.R. Halmos
Springer-Verlag Berlin Heidelberg GmbH

BOOKS OF RELATED INTEREST BY SERGE LANG
Fundamentals of Diophantine Geometry

A systematic account of fundamentals, including the basic theory of heights,
Roth and Siegel's theorems, the Neron-Tate quadratic form, the Mordell-Weil
theorem, Weil and Neron functions, and the canonical form on a curve as it
relates to the Jacobian via the theta function.
Introduction to Complex Hyperbolic Spaces
Since its introduction by Kobayashi, the theory of complex hyperbolic spaces
has progressed considerably. This book gives an account of some of the most
important results, such as Brody's theorem, hyperbolic imbeddings, curvature
properties, and some Nevanlinna theory. It also includes Cartan's proof for the
Second Main Theorem, which was elegant and short.
Elliptic Curves: Diophantine Analysis
This systematic account of the basic diophantine theory on elliptic curves starts
with the classical Weierstrass parametrization, complemented by the basic theory
of Neron functions, and goes on to the formal group, heights and the Mordell-
Weil theorem, and bounds for integral points. A second part gives an extensive
account of Baker's method in diophantine approximation and diophantine
inequalities which were applied to get the bounds for the integral points in the
first part.
Cyclotomic Fields I and II
This volume provides an up-to-date introduction to the theory of a concrete and
classically very interesting example of number fields. It is of special interest to
number theorists, algebraic geometers, topologists, and algebraists who work in
K-theory. This book is a combined edition of Cyclotomic Fields (GTM 59) and
Cyclotomic Fields II (GTM 69) which are out of print. In addition to some minor
corrections, this edition contains an appendix by Karl Rubin proving the Mazur-Wiles
theorem (the "main conjecture") in a self-contained way.
OTHER BOOKS BY LANG PUBLISHED BY SPRINGER-VERLAG
Introduction to Arakelov Theory • Riemann-Roch Algebra (with William Fulton) •
Complex Multiplication • Introduction to Modular Forms • Modular Units (with Daniel
Kubert) • Introduction to Algebraic and Abelian Functions • Cyclotomic Fields I and II
• Elliptic Functions • Number Theory • Algebraic Number Theory • SL2(R) • Abelian
Varieties • Differential Manifolds • Complex Analysis • Real Analysis • Undergraduate
Analysis • Undergraduate Algebra • Linear Algebra • Introduction to Linear Algebra •
Calculus of Several Variables • First Course in Calculus • Basic Mathematics •
Geometry: (with Gene Murrow) • Math! Encounters with High School Students
• The Beauty of Doing Mathematics • THE FILE
Serge Lang
Real and
Functional Analysis
Third Edition
With 37 Illustrations
Springer
Serge Lang
Department of Mathematics
Yale University
New Haven, CT 06520
USA
Editorial Board
S. Axler, Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
F.W. Gehring, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
P.R. Halmos, Department of Mathematics, Santa Clara University, Santa Clara, CA 95053, USA
Mathematics Subject Classification (1991): 26-01, 28-01, 46-01

Library of Congress Cataloging-in-Publication Data
Lang, Serge, 1927-
Real and functional analysis / Serge Lang. - 3rd ed.
p. cm. - (Graduate texts in mathematics ; 142)
Includes bibliographical references and index.
ISBN 978-1-4612-6938-0 ISBN 978-1-4612-0897-6 (eBook)
DOI 10.1007/978-1-4612-0897-6
1. Mathematical analysis. I. Title. II. Series.
QA300.L274 1993
515-dc20 92-21208
CIP
The previous edition was published as Real Analysis. Copyright 1983 by Addison-Wesley.
Printed on acid-free paper.
© 1993 Springer-Verlag Berlin Heidelberg

Originally published by Springer-Verlag Berlin Heidelberg New York in 1993
Softcover reprint of the hardcover 3rd edition 1993
All rights reserved. This work may not be translated or copied in whole or in part without
the written permission of the publisher Springer-Verlag Berlin Heidelberg GmbH, except for
brief excerpts in connection with reviews or scholarly analysis. Use in connection with any
form of information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used
freely by anyone.
Production coordinated by Brian Howe and managed by Terry Komak; manufacturing supervised by
Vincent Scelta.
Typeset by Asco Trade Typesetting Ltd., North Point, Hong Kong.
9 8 7 6 5 4 3 2
SPIN 10545036
Foreword
This book is meant as a text for a first year graduate course in analysis.
Any standard course in undergraduate analysis will constitute sufficient
preparation for its understanding, for instance, my Undergraduate Anal-
ysis. I assume that the reader is acquainted with notions of uniform con-
vergence and the like.
In this third edition, I have reorganized the book by covering inte-
gration before functional analysis. Such a rearrangement fits the way
courses are taught in all the places I know of. I have added a number of
examples and exercises, as well as some material about integration on the
real line (e.g. on Dirac sequence approximation and on Fourier analysis),
and some material on functional analysis (e.g. the theory of the Gelfand
transform in Chapter XVI). These upgrade previous exercises to sections
in the text.
In a sense, the subject matter covers the same topics as elementary
calculus, viz. linear algebra, differentiation and integration. This time,
however, these subjects are treated in a manner suitable for the training
of professionals, i.e. people who will use the tools in further investiga-
tions, be it in mathematics, or physics, or what have you.
In the first part, we begin with point set topology, essential for all
analysis, and we cover the most important results.
I am selective here, since this part is regarded as a tool, especially
Chapters I and II. Many results are easy, and are less essential than
those in the text. They have been given in exercises, which are designed
to acquire facility in routine techniques and to give flexibility for those
who want to cover some of them at greater length. The point set topol-
ogy simply deals with the basic notions of continuity, open and closed
sets, connectedness, compactness, and continuous functions. The chapter
concerning continuous functions on compact sets properly emphasizes
results which already mix analysis and uniform convergence with the
language of point set topology.
In the second part, Chapters IV and V, we describe briefly the two
basic linear spaces of analysis, namely Banach spaces and Hilbert spaces.
The next part deals extensively with integration.
We begin with the development of the integral. The fashion has been
to emphasize positivity and ordering properties (increasing and decreas-
ing sequences). I find this excessive. The treatment given here attempts
to give a proper balance between $L^1$-convergence and positivity. For
more detailed comments, see the introduction to Part Three and Chapter
VI.
The chapters on applications of integration and distributions provide
concrete examples and choices for leading the course in other directions,
at the taste of the lecturer. The general theory of integration in mea-
sured spaces (with respect to a given positive measure) alternates with
chapters giving specific results of integration on euclidean spaces or the
real line. Neither is slighted at the expense of the other. In this third
edition, I have added some material on functions of bounded variation,
and I have emphasized convolutions and the approximation by Dirac
sequences or families even more than in the previous editions, for in-
stance, in Chapter VIII, §2.
For want of a better place, the calculus (with values in a Banach
space) now occurs as a separate part after dealing with integration, and
before the functional analysis.
The differential calculus is done because at best, most people will
be acquainted with it only in euclidean space, and incompletely at that.
More importantly, the calculus in Banach spaces has acquired consider-
able importance in the last two decades, because of many applications
like Morse theory, the calculus of variations, and the Nash-Moser im-
plicit mapping theorem, which lies even further in this direction since one
has to deal with more general spaces than Banach spaces. These results
pertain to the geometry of function spaces. Cf. the exercises of Chapter
XIV for simpler applications.
The next part deals with functional analysis. The purpose here is
twofold. We place the linear algebra in an infinite dimensional setting
where continuity assumptions are made on the linear maps, and we show
how one can "linearize" a problem by taking derivatives, again in a
setting where the theory can be applied to function spaces. This part
includes several major spectral theorems of analysis, showing how we can
extend to the infinite dimensional case certain results of finite dimen-
sional linear algebra. The compact and Fredholm operators have appli-
cations to integral operators and partial differential elliptic operators (e.g.
in papers of Atiyah-Singer and Atiyah-Bott).
Chapters XIX and XX, on unbounded hermitian operators, combine
both the linear algebra and integration theory in the study of such
operators. One may view the treatment of spectral measures as providing
an example of general integration theory on locally compact spaces,
whereby a measure is obtained from a functional on the space of contin-
uous functions with compact support.
I find it appropriate to introduce students to differentiable manifolds
during this first year graduate analysis course, not only because these
objects are of interest to differential geometers or differential topologists,
but because global analysis on manifolds has come into its own, both in
its integral and differential aspects. It is therefore desirable to integrate
manifolds in analysis courses, and I have done this in the last part, which
may also be viewed as providing a good application of integration theory.
A number of examples are given in the text but many interesting
examples are also given in the exercises (for instance, explicit formulas for
approximations whose existence one knows abstractly by the Weierstrass-
Stone theorem; integral operators of various kinds; etc). The exercises
should be viewed as an integral part of the book. Note that Chapters
XIX and XX, giving the spectral measure, can be viewed as providing
an example for many notions which have been discussed previously:
operators in Hilbert space, measures, and convolutions. At the same
time, these results lead directly into the real analysis of the working
mathematician.
As usual, I have avoided as far as possible building long chains of
logical interdependence, and have made chapters as logically independent
as possible, so that courses which run rapidly through certain chapters,
omitting some material, can cover later chapters without being logically
inconvenienced.
The present book can be used for a two-semester course, omitting
some material. I hope I have given a suitable overview of the basic tools
of analysis. There might be some reason to include other topics, such as
the basic theorems concerning elliptic operators. I have omitted this
topic and some others, partly because the appendices to my $SL_2(\mathbf{R})$
constitute a sub-book which contains these topics, and partly because
there is no time to cover them in the basic one year course addressed to
graduate students.
The present book can also be used as a reference for basic analysis,
since it offers the reader the opportunity to select various topics without
reading the entire book. The subject matter is organized so that it makes
the topics available to as wide an audience as possible.
There are many very good books in intermediate analysis, and inter-
esting research papers, which can be read immediately after the present
course. A partial list is given in the Bibliography. In fact, the determina-
tion of the material included in this Real and Functional Analysis has
been greatly motivated by the existence of these papers and books, and
by the need to provide the necessary background for them.
Finally, I thank all those people who have made valuable comments
and corrections, especially Keith Conrad, Martin Mohlenkamp, Takesi
Yamanaka, and Stephen Chiappari, who reviewed the book for Springer-
Verlag.
New Haven 1993/1996 SERGE LANG

Contents
PART ONE
General Topology
CHAPTER I
Sets
§1. Some Basic Terminology
§2. Denumerable Sets
§3. Zorn's Lemma
CHAPTER II
Topological Spaces
§1. Open and Closed Sets
§2. Connected Sets
§3. Compact Spaces
§4. Separation by Continuous Functions
§5. Exercises
CHAPTER III
Continuous Functions on Compact Sets
§1. The Stone-Weierstrass Theorem
§2. Ideals of Continuous Functions
§3. Ascoli's Theorem
§4. Exercises
PART TWO
Banach and Hilbert Spaces
CHAPTER IV
Banach Spaces
§1. Definitions, the Dual Space, and the Hahn-Banach Theorem
§2. Banach Algebras
§3. The Linear Extension Theorem
§4. Completion of a Normed Vector Space
§5. Spaces with Operators
Appendix: Convex Sets
1. The Krein-Milman Theorem
2. Mazur's Theorem
§6. Exercises
CHAPTER V
Hilbert Space
§1. Hermitian Forms
§2. Functionals and Operators
§3. Exercises
PART THREE
Integration
CHAPTER VI
The General Integral
§1. Measured Spaces, Measurable Maps, and Positive Measures
§2. The Integral of Step Maps
§3. The $L^1$-Completion
§4. Properties of the Integral: First Part
§5. Properties of the Integral: Second Part
§6. Approximations
§7. Extension of Positive Measures from Algebras to $\sigma$-Algebras
§8. Product Measures and Integration on a Product Space
§9. The Lebesgue Integral in $\mathbf{R}^p$
§10. Exercises
CHAPTER VII
Duality and Representation Theorems
§1. The Hilbert Space $L^2(\mu)$
§2. Duality Between $L^1(\mu)$ and $L^\infty(\mu)$
§3. Complex and Vectorial Measures
§4. Complex or Vectorial Measures and Duality
§5. The $L^p$ Spaces, $1 < p < \infty$
§6. The Law of Large Numbers
§7. Exercises
CHAPTER VIII
Some Applications of Integration
§1. Convolution
§2. Continuity and Differentiation Under the Integral Sign
§3. Dirac Sequences
§4. The Schwartz Space and Fourier Transform
§5. The Fourier Inversion Formula
§6. The Poisson Summation Formula
§7. An Example of Fourier Transform Not in the Schwartz Space
§8. Exercises
CHAPTER IX
Integration and Measures on Locally Compact Spaces
§1. Positive and Bounded Functionals on $C_c(X)$
§2. Positive Functionals as Integrals
§3. Regular Positive Measures
§4. Bounded Functionals as Integrals
§5. Localization of a Measure and of the Integral
§6. Product Measures on Locally Compact Spaces
§7. Exercises
CHAPTER X
Riemann-Stieltjes Integral and Measure
§1. Functions of Bounded Variation and the Stieltjes Integral
§2. Applications to Fourier Analysis
§3. Exercises
CHAPTER XI
Distributions
§1. Definition and Examples
§2. Support and Localization
§3. Derivation of Distributions
§4. Distributions with Discrete Support
CHAPTER XII
Integration on Locally Compact Groups
§1. Topological Groups
§2. The Haar Integral, Uniqueness
§3. Existence of the Haar Integral
§4. Measures on Factor Groups and Homogeneous Spaces
§5. Exercises
PART FOUR
Calculus
CHAPTER XIII
Differential Calculus
§1. Integration in One Variable
§2. The Derivative as a Linear Map
§3. Properties of the Derivative
§4. Mean Value Theorem
§5. The Second Derivative
§6. Higher Derivatives and Taylor's Formula
§7. Partial Derivatives
§8. Differentiating Under the Integral Sign
§9. Differentiation of Sequences
§10. Exercises
CHAPTER XIV
Inverse Mappings and Differential Equations
§1. The Inverse Mapping Theorem
§2. The Implicit Mapping Theorem
§3. Existence Theorem for Differential Equations
§4. Local Dependence on Initial Conditions
§5. Global Smoothness of the Flow
§6. Exercises
PART FIVE
Functional Analysis
CHAPTER XV
The Open Mapping Theorem, Factor Spaces, and Duality
§1. The Open Mapping Theorem
§2. Orthogonality
§3. Applications of the Open Mapping Theorem
CHAPTER XVI
The Spectrum
§1. The Gelfand-Mazur Theorem
§2. The Gelfand Transform
§3. C*-Algebras
§4. Exercises
CHAPTER XVII
Compact and Fredholm Operators
§1. Compact Operators
§2. Fredholm Operators and the Index
§3. Spectral Theorem for Compact Operators
§4. Application to Integral Equations
§5. Exercises
CHAPTER XVIII
Spectral Theorem for Bounded Hermitian Operators
§1. Hermitian and Unitary Operators
§2. Positive Hermitian Operators
§3. The Spectral Theorem for Compact Hermitian Operators
§4. The Spectral Theorem for Hermitian Operators
§5. Orthogonal Projections
§6. Schur's Lemma
§7. Polar Decomposition of Endomorphisms
§8. The Morse-Palais Lemma
§9. Exercises
CHAPTER XIX
Further Spectral Theorems
§1. Projection Functions of Operators
§2. Self-Adjoint Operators
§3. Example: The Laplace Operator in the Plane
CHAPTER XX
Spectral Measures
§1. Definition of the Spectral Measure
§2. Uniqueness of the Spectral Measure: the Titchmarsh-Kodaira Formula
§3. Unbounded Functions of Operators
§4. Spectral Families of Projections
§5. The Spectral Integral as Stieltjes Integral
§6. Exercises
PART SIX
Global Analysis
CHAPTER XXI
Local Integration of Differential Forms
§1. Sets of Measure 0
§2. Change of Variables Formula
§3. Differential Forms
§4. Inverse Image of a Form
§5. Appendix
CHAPTER XXII
Manifolds
§1. Atlases, Charts, Morphisms
§2. Submanifolds
§3. Tangent Spaces
§4. Partitions of Unity
§5. Manifolds with Boundary
§6. Vector Fields and Global Differential Equations
CHAPTER XXIII
Integration and Measures on Manifolds
§1. Differential Forms on Manifolds
§2. Orientation
§3. The Measure Associated with a Differential Form
§4. Stokes' Theorem for a Rectangular Simplex
§5. Stokes' Theorem on a Manifold
§6. Stokes' Theorem with Singularities
Bibliography
Table of Notation
Index
PART ONE
General Topology
CHAPTER I
Sets
I, §1. SOME BASIC TERMINOLOGY
We assume that the reader understands the meaning of the word "set",
and in this chapter we summarize briefly the basic properties of sets and
operations between sets. We denote the empty set by $\emptyset$. A subset $S'$ of
$S$ is said to be proper if $S' \neq S$. We write $S' \subset S$ or $S \supset S'$ to denote the
fact that $S'$ is a subset of $S$.
Let $S$, $T$ be sets. A mapping or map $f : T \to S$ is an association which
to each element $x \in T$ associates an element of $S$, denoted by $f(x)$, and
called the value of $f$ at $x$, or the image of $x$ under $f$. If $T'$ is a subset of
$T$, we denote by $f(T')$ the subset of $S$ consisting of all elements $f(x)$ for
$x \in T'$. The association of $f(x)$ to $x$ is denoted by the special arrow
$$x \mapsto f(x).$$
We usually reserve the word function for a mapping whose values are in
the real or complex numbers. The characteristic function of a subset $S'$ of
$S$ is the function $\chi$ such that $\chi(x) = 1$ if $x \in S'$ and $\chi(x) = 0$ if $x \notin S'$. We
often write $\chi_{S'}$ for this function.
Let $X$, $Y$ be sets. A map $f : X \to Y$ is said to be injective if for all $x$,
$x' \in X$ with $x \neq x'$ we have $f(x) \neq f(x')$. We say that $f$ is surjective if
$f(X) = Y$, i.e. if the image of $f$ is all of $Y$. We say that $f$ is bijective if it
is both injective and surjective. As usual, one should index a map $f$ by
its set of arrival and set of departure to have absolutely correct notation,
but this is too clumsy, and the context is supposed to make it clear what
these sets are. For instance, let $\mathbf{R}$ denote the real numbers, and $\mathbf{R}'$ the
real numbers $\geq 0$. The map
$$f : \mathbf{R} \to \mathbf{R}$$
given by $x \mapsto x^2$ is not surjective, but the map
$$f : \mathbf{R} \to \mathbf{R}'$$
given by the same formula is surjective.
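The dependence of surjectivity on the set of arrival can be checked by brute force on finite sets. The following Python sketch is only an illustration: the helper names `is_injective` and `is_surjective`, and the small sets of integers standing in for $\mathbf{R}$ and $\mathbf{R}'$, are assumptions of the example and do not come from the text.

```python
def is_injective(f, domain):
    """f is injective if distinct elements of the domain have distinct values."""
    values = [f(x) for x in domain]
    return len(values) == len(set(values))


def is_surjective(f, domain, arrival):
    """f is surjective if its image is all of the set of arrival."""
    return {f(x) for x in domain} == set(arrival)


def square(x):
    return x * x


# Finite stand-ins (an assumption of this sketch, not the sets of the text):
# a symmetric set of integers plays the role of R, and the set of their
# squares plays the role of R'.
domain = range(-3, 4)          # {-3, ..., 3}
arrival_all = range(-9, 10)    # contains negative numbers, like R
arrival_nonneg = {0, 1, 4, 9}  # only the squares, like R'

print(is_surjective(square, domain, arrival_all))     # False
print(is_surjective(square, domain, arrival_nonneg))  # True
print(is_injective(square, domain))                   # False
```

The same formula gives two different answers because the set of arrival is part of the data of the map.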

If $f : X \to Y$ is a map and $S$ a subset of $X$, we denote by
$$f|S$$
the restriction of $f$ to $S$, namely the map $f$ viewed as a map defined only
on $S$. For instance, if $f : \mathbf{R} \to \mathbf{R}'$ is the map $x \mapsto x^2$, then $f$ is not injective,
but $f|\mathbf{R}'$ is injective. We often let $f_S = f\chi_S$ be the function equal to
$f$ on $S$ and $0$ outside $S$.
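A restriction and the product of a function with a characteristic function can both be modelled on a finite set. This is a sketch under stated assumptions: the names `restrict` and `characteristic` are hypothetical helpers, and a finite set of integers replaces $\mathbf{R}$.

```python
def restrict(f, subset):
    """Model f|S as a dict that records f only on the elements of S."""
    return {x: f(x) for x in subset}


def characteristic(subset):
    """Return the characteristic function of subset: 1 on it, 0 elsewhere."""
    def chi(x):
        return 1 if x in subset else 0
    return chi


def square(x):
    return x * x


X = set(range(-3, 4))          # finite stand-in for R (assumption)
S = {x for x in X if x >= 0}   # finite stand-in for R' (assumption)

# The restriction of the squaring map to S takes distinct values.
f_restricted = restrict(square, S)
restriction_is_injective = len(set(f_restricted.values())) == len(f_restricted)
print(restriction_is_injective)  # True

# f_S = f * chi_S equals f on S and 0 outside S.
chi_S = characteristic(S)


def f_S(x):
    return square(x) * chi_S(x)


print([f_S(x) for x in sorted(X)])  # [0, 0, 0, 0, 1, 4, 9]
```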
A composite of injective maps is injective, and a composite of surjec-
tive maps is surjective. Hence a composite of bijective maps is bijective.
We denote by $\mathbf{Q}$, $\mathbf{Z}$ the sets of rational numbers and integers respectively.
We denote by $\mathbf{Z}^+$ the set of positive integers (integers $> 0$), and
similarly by $\mathbf{R}^+$ the set of positive reals. We denote by $\mathbf{N}$ the set of
natural numbers (integers $\geq 0$), and by $\mathbf{C}$ the complex numbers. A mapping
into $\mathbf{R}$ or $\mathbf{C}$ will be called a function.
Let $S$ and $I$ be sets. By a family of elements of $S$, indexed by $I$, one
means simply a map $f : I \to S$. However, when we speak of a family, we
write $f(i)$ as $f_i$, and also use the notation $\{f_i\}_{i \in I}$ to denote the family.
Example 1. Let $S$ be the set consisting of the single element 3. Let
$I = \{1, \ldots, n\}$ be the set of integers from 1 to $n$. A family of elements of
$S$, indexed by $I$, can then be written $\{a_i\}_{i=1,\ldots,n}$ with each $a_i = 3$. Note
that a family is different from a subset. The same element of $S$ may
receive distinct indices.
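The difference between a family and a subset can be made concrete by modelling a family as a dictionary from indices to values. The sketch below takes $n = 5$, which is an arbitrary choice made for the illustration.

```python
# A family of elements of S indexed by I is a map I -> S; a dict models it.
S = {3}
n = 5                       # arbitrary choice for this sketch
I = range(1, n + 1)
family = {i: 3 for i in I}  # a_i = 3 for every index i

# The family has one member per index, but its set of values is just {3}.
print(len(family))           # 5
print(set(family.values()))  # {3}
```

The family keeps track of the indices, so it has $n$ members, while the subset of $S$ formed by its values has a single element.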
A family of elements of a set S indexed by positive integers, or non-
negative integers, is also called a sequence.
Example 2. A sequence of real numbers is written frequently in the
form
$$\{x_1, x_2, \ldots\} \quad \text{or} \quad \{x_n\}_{n \geq 1}$$
and stands for the map $f : \mathbf{Z}^+ \to \mathbf{R}$ such that $f(i) = x_i$. As before, note
that a sequence can have all its elements equal to each other, that is
$$\{1, 1, 1, \ldots\}$$
is a sequence of integers, with $x_i = 1$ for each $i \in \mathbf{Z}^+$.

We define a family of sets indexed by a set $I$ in the same manner, that
is, a family of sets indexed by $I$ is an assignment
$$i \mapsto S_i$$
which to each $i \in I$ associates a set $S_i$. The sets $S_i$ may or may not have
elements in common, and it is conceivable that they may all be equal.
As before, we write the family $\{S_i\}_{i \in I}$.
We can define the intersection and union of families of sets, just as for
the intersection and union of a finite number of sets. Thus, if $\{S_i\}_{i \in I}$ is a
family of sets, we define the intersection of this family to be the set
$$\bigcap_{i \in I} S_i$$
consisting of all elements $x$ which lie in all $S_i$. We define the union
$$\bigcup_{i \in I} S_i$$
to be the set consisting of all $x$ such that $x$ lies in some $S_i$.
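For a finite index set, the union and intersection of a family can be computed directly from these definitions. In the sketch below the function names and the sample family are hypothetical, and the intersection is only defined for a non-empty index set, since otherwise an ambient set would have to be specified.

```python
# A family of sets indexed by I = {1, 2, 3}, modelled as a dict i -> S_i.
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}


def union_of_family(fam):
    """All x lying in some member of the family."""
    result = set()
    for member in fam.values():
        result |= member
    return result


def intersection_of_family(fam):
    """All x lying in every member of the family (non-empty index set only)."""
    members = list(fam.values())
    if not members:
        raise ValueError("intersection over an empty index set needs an ambient set")
    result = set(members[0])
    for member in members[1:]:
        result &= member
    return result


print(sorted(union_of_family(family)))         # [1, 2, 3, 4, 5]
print(sorted(intersection_of_family(family)))  # [3]
```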

If $S$, $S'$ are sets, we define $S \times S'$ to be the set of all pairs $(x, y)$ with
$x \in S$ and $y \in S'$. We can define finite products in a similar way. If $S_1$,
$S_2$, $\ldots$ is a sequence of sets, we define the product
$$\prod_{i=1}^{\infty} S_i$$
to be the set of all sequences $(x_1, x_2, \ldots)$ with $x_i \in S_i$. Similarly, if $I$ is an
indexing set, and $\{S_i\}_{i \in I}$ a family of sets, we define the product
$$\prod_{i \in I} S_i$$
to be the set of all families $\{x_i\}_{i \in I}$ with $x_i \in S_i$.

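Let $X$, $Y$, $Z$ be sets. We have the formula
$$(X \cup Y) \times Z = (X \times Z) \cup (Y \times Z).$$
To prove this, let $(w, z) \in (X \cup Y) \times Z$ with $w \in X \cup Y$ and $z \in Z$. Then
$w \in X$ or $w \in Y$. Say $w \in X$; the case $w \in Y$ is handled in the same way.
Then $(w, z) \in X \times Z$. Thus
$$(X \cup Y) \times Z \subset (X \times Z) \cup (Y \times Z).$$
Conversely, $X \times Z$ is contained in $(X \cup Y) \times Z$ and so is $Y \times Z$. Hence
their union is contained in $(X \cup Y) \times Z$, thereby proving our assertion.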
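The formula just proved can be confirmed on a small example, which is a sanity check and not a substitute for the proof. The helper `cartesian` and the sample sets are assumptions of this sketch.

```python
from itertools import product


def cartesian(A, B):
    """The set of all pairs (a, b) with a in A and b in B."""
    return set(product(A, B))


# Sample sets chosen for the illustration only.
X = {1, 2}
Y = {2, 3}
Z = {"a", "b"}

left_side = cartesian(X | Y, Z)
right_side = cartesian(X, Z) | cartesian(Y, Z)
print(left_side == right_side)  # True
print(len(left_side))           # 6
```

Here $X \cup Y$ has three elements and $Z$ has two, so both sides consist of the same six pairs.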
We say that two sets $X$, $Y$ are disjoint if their intersection is empty.
We say that a union $X \cup Y$ is disjoint if $X$ and $Y$ are disjoint. Note that
if $X$, $Y$ are disjoint, then $(X \times Z)$ and $(Y \times Z)$ are disjoint.
We can take products with arbitrary families. For instance, if $\{X_i\}_{i \in I}$
is a family of sets, then
$$\Bigl(\bigcup_{i \in I} X_i\Bigr) \times Z = \bigcup_{i \in I} (X_i \times Z).$$
If the family $\{X_i\}_{i \in I}$ is disjoint (that is $X_i \cap X_j$ is empty if $i \neq j$ for $i$,
$j \in I$), then the sets $X_i \times Z$ are also disjoint.
We have similar formulas for intersections. For instance,
$$(X \cap Y) \times Z = (X \times Z) \cap (Y \times Z).$$
We leave the proof to the reader.

Let $X$ be a set and $Y$ a subset. The complement of $Y$ in $X$, denoted
by $\mathscr{C}_X Y$, or $X - Y$, is the set of all elements $x \in X$ such that $x \notin Y$. If $Y$,
$Z$ are subsets of $X$, then we have the following formulas:
$$\mathscr{C}_X(Y \cup Z) = \mathscr{C}_X Y \cap \mathscr{C}_X Z,$$
$$\mathscr{C}_X(Y \cap Z) = \mathscr{C}_X Y \cup \mathscr{C}_X Z.$$
These are essentially reformulations of definitions. For instance, suppose
$x \in X$ and $x \notin (Y \cup Z)$. Then $x \notin Y$ and $x \notin Z$. Hence $x \in \mathscr{C}_X Y \cap \mathscr{C}_X Z$.
Conversely, if $x \in \mathscr{C}_X Y \cap \mathscr{C}_X Z$, then $x$ lies neither in $Y$ nor in $Z$, and
hence $x \in \mathscr{C}_X(Y \cup Z)$. This proves the first formula. We leave the second
to the reader. Exercise: Formulate these formulas for the complement of
the union of a family of sets, and the complement of the intersection of a
family of sets.
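Both complement formulas, and one possible reading of their analogues for a finite family, can be tested numerically. The function `complement`, the ambient set and the sample subsets below are assumptions of this sketch; passing the checks on one example proves nothing in general.

```python
def complement(Y, X):
    """Complement of the subset Y in the ambient set X."""
    return X - Y


X = set(range(10))  # ambient set chosen for the illustration
Y = {0, 1, 2, 3}
Z = {2, 3, 4, 5}

first_formula = complement(Y | Z, X) == (complement(Y, X) & complement(Z, X))
second_formula = complement(Y & Z, X) == (complement(Y, X) | complement(Z, X))
print(first_formula, second_formula)  # True True

# The same pattern for a finite family of subsets of X.
family = [{0, 1}, {1, 2}, {1, 7, 8}]
complements = [complement(member, X) for member in family]

family_union_formula = (
    complement(set().union(*family), X) == set.intersection(*complements)
)
family_intersection_formula = (
    complement(set.intersection(*family), X) == set().union(*complements)
)
print(family_union_formula, family_intersection_formula)  # True True
```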
Let $A$, $B$ be sets and $f : A \to B$ a mapping. If $Y$ is a subset of $B$, we
define $f^{-1}(Y)$ to be the set of all $x \in A$ such that $f(x) \in Y$. It may be that
$f^{-1}(Y)$ is empty, of course. We call $f^{-1}(Y)$ the inverse image of $Y$ (under
$f$). If $f$ is injective, and $Y$ consists of one element $y$, then $f^{-1}(\{y\})$ is
either empty or has precisely one element.
The following statements are easily proved:
If $f : A \to B$ is a map, and $Y$, $Z$ are subsets of $B$, then
$$f^{-1}(Y \cup Z) = f^{-1}(Y) \cup f^{-1}(Z),$$
$$f^{-1}(Y \cap Z) = f^{-1}(Y) \cap f^{-1}(Z).$$
More generally, if $\{Y_i\}_{i \in I}$ is a family of subsets of $B$, then
$$f^{-1}\Bigl(\bigcup_{i \in I} Y_i\Bigr) = \bigcup_{i \in I} f^{-1}(Y_i),$$

[I, §2] DENUMERABLE SETS 7
and similarly for the intersection. Furthermore, if we denote by Y - Z

the set of all elements Y E Y and y i Z, then
In particular,
Thus the operation 1-1 commutes with all set theoretic operations.
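As a small sanity check, the inverse image can be computed directly for a map between finite sets. This is a sketch under stated assumptions: the helper `preimage`, the squaring map, and the sets `A`, `B`, `Y`, `Z` are all chosen here for illustration.

```python
def preimage(f, domain, subset):
    """Return the set of x in domain with f(x) in subset."""
    return {x for x in domain if f(x) in subset}

A = set(range(-3, 4))      # domain {-3, ..., 3}
B = set(range(0, 10))      # codomain {0, ..., 9}, contains every square of A

def f(x):
    return x * x

Y, Z = {0, 1, 4}, {4, 9}

union_ok = preimage(f, A, Y | Z) == preimage(f, A, Y) | preimage(f, A, Z)
inter_ok = preimage(f, A, Y & Z) == preimage(f, A, Y) & preimage(f, A, Z)
diff_ok = preimage(f, A, Y - Z) == preimage(f, A, Y) - preimage(f, A, Z)
# Complement in B on the left, complement in A on the right.
compl_ok = preimage(f, A, B - Z) == A - preimage(f, A, Z)
```

All four flags are true here. Note that the map is not injective, so the identities do not depend on injectivity.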
I, §2. DENUMERABLE SETS
Let $n$ be a positive integer. Let $J_n$ be the set consisting of all integers $k$,
$1 \le k \le n$. If $S$ is a set, we say that $S$ has $n$ elements if there is a
bijection between $S$ and $J_n$. Such a bijection associates with each integer
$k$ as above an element of $S$, say $k \mapsto a_k$. Thus we may use $J_n$ to "count"
$S$. Part of what we assume about the basic facts concerning positive
integers is that if $S$ has $n$ elements, then the integer $n$ is uniquely
determined by $S$.
One also agrees to say that a set has 0 elements if the set is empty.
We shall say that a set $S$ is denumerable if there exists a bijection of
$S$ with the set of positive integers $\mathbf{Z}^+$. Such a bijection is then said to
enumerate the set $S$. It is a mapping
$$n \mapsto x_n$$
which to each positive integer $n$ associates an element of $S$, the mapping
being injective and surjective.
If $D$ is a denumerable set, and $f: S \to D$ is a bijection of some set $S$
with $D$, then $S$ is also denumerable. Indeed, there is a bijection $g: D \to \mathbf{Z}^+$,
and hence $g \circ f$ is a bijection of $S$ with $\mathbf{Z}^+$.
Let $T$ be a set. A sequence of elements of $T$ is simply a mapping of
$\mathbf{Z}^+$ into $T$. If the map is given by the association $n \mapsto x_n$, we also write
the sequence as $\{x_n\}_{n \ge 1}$, or also $\{x_1, x_2, \ldots\}$. For simplicity, we also
write $\{x_n\}$ for the sequence. Thus we think of the sequence as prescribing
a first, second, ..., $n$-th element of $T$. We use the same braces for
sequences as for sets, but the context will always make our meaning
clear.
Examples. The even positive integers may be viewed as a sequence
$\{x_n\}$ if we put $x_n = 2n$ for $n = 1, 2, \ldots$. The odd positive integers may
also be viewed as a sequence $\{y_n\}$ if we put $y_n = 2n - 1$ for $n = 1, 2, \ldots$.
In each case, the sequence gives an enumeration of the given set.
We also use the word sequence for mappings of the natural numbers
into a set, thus allowing our sequences to start from 0 instead of 1. If we
need to specify whether a sequence starts with the 0-th term or the first
term, we write
$$\{x_n\}_{n \ge 0} \quad \text{or} \quad \{x_n\}_{n \ge 1}$$
according to the desired case. Unless otherwise specified, however, we
always assume that a sequence will start with the first term. Note
that from a sequence $\{x_n\}_{n \ge 0}$ we can define a new sequence by letting
$y_n = x_{n-1}$ for $n \ge 1$. Then $y_1 = x_0$, $y_2 = x_1$, .... Thus there is no
essential difference between the two kinds of sequences.
Given a sequence $\{x_n\}$, we call $x_n$ the $n$-th term of the sequence. A
sequence may very well be such that all its terms are equal. For
instance, if we let $x_n = 1$ for all $n \ge 1$, we obtain the sequence $\{1, 1, 1, \ldots\}$.
Thus there is a difference between a sequence of elements in a set $T$, and
a subset of $T$. In the example just given, the set of all terms of the
sequence consists of one element, namely the single number 1.
Let $\{x_1, x_2, \ldots\}$ be a sequence in a set $S$. By a subsequence we shall
mean a sequence $\{x_{n_1}, x_{n_2}, \ldots\}$ such that $n_1 < n_2 < \cdots$. For instance, if
$\{x_n\}$ is the sequence of positive integers, $x_n = n$, the sequence of even
positive integers $\{x_{2n}\}$ is a subsequence.
An enumeration of a set S is of course a sequence in S.
A set is finite if the set is empty, or if the set has n elements for some
positive integer n. If a set is not finite, it is called infinite.
Occasionally, a map of $J_n$ into a set $T$ will be called a finite sequence
in $T$. A finite sequence is written as usual,
$$\{x_1, \ldots, x_n\} \quad \text{or} \quad (x_i)_{i = 1, \ldots, n}.$$
When we need to specify the distinction between finite sequences and
maps of $\mathbf{Z}^+$ into $T$, we call the latter infinite sequences. Unless otherwise
specified, we shall use the word "sequence" to mean infinite sequence.
Proposition 2.1. Let $D$ be an infinite subset of $\mathbf{Z}^+$. Then $D$ is
denumerable, and in fact there is a unique enumeration of $D$, namely
$\{k_1, k_2, \ldots\}$ such that
$$k_1 < k_2 < \cdots < k_n < k_{n+1} < \cdots.$$
Proof. We let $k_1$ be the smallest element of $D$. Suppose inductively
that we have defined $k_1 < \cdots < k_n$ in such a way that any element $k$ in $D$
which is not equal to $k_1, \ldots, k_n$ is $> k_n$. We define $k_{n+1}$ to be the
smallest element of $D$ which is $> k_n$. Then the map $n \mapsto k_n$ is the desired
enumeration of $D$.
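The inductive construction in this proof translates into a simple search: scan the positive integers in order and keep those lying in $D$. The sketch below assumes that $D$ is given by a membership test; the names `enumerate_subset` and `is_prime`, and the choice of the primes as $D$, are illustrative only. If $D$ were finite, the generator would search forever after its last element.

```python
from itertools import count, islice

def enumerate_subset(in_D):
    """Yield k_1 < k_2 < ... for D = {k >= 1 : in_D(k)}, D assumed infinite."""
    for k in count(1):          # 1, 2, 3, ...
        if in_D(k):             # the next smallest element of D not yet listed
            yield k

def is_prime(k):
    """Membership test for the sample set D of prime numbers."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

first_six = list(islice(enumerate_subset(is_prime), 6))
```

Since the scan is increasing, each element produced is the smallest element of $D$ larger than the previous one, which is exactly the defining property of $k_{n+1}$.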
Corollary 2.2. Let S be a denumerable set and D an infinite subset of S.

Then D is denumerable.
Proof. Given an enumeration of S, the subset D corresponds to a

subset of Z+ in this enumeration. Using Proposition 2.1 we conclude
that we can enumerate D.
Proposition 2.3. Every infinite set contains a denumerable subset.
Proof. Let $S$ be an infinite set. For every non-empty subset $T$ of $S$, we
select a definite element $a_T$ in $T$. We then proceed by induction. We let
$x_1$ be the chosen element $a_S$. Suppose that we have chosen $x_1, \ldots, x_n$
having the property that for each $k = 2, \ldots, n$ the element $x_k$ is the
selected element in the subset which is the complement of $\{x_1, \ldots, x_{k-1}\}$.
We let $x_{n+1}$ be the selected element in the complement of the set
$\{x_1, \ldots, x_n\}$. By induction, we thus obtain an association $n \mapsto x_n$ for all
positive integers $n$, and since $x_n \neq x_k$ for all $k < n$ it follows that our
association is injective, i.e. gives an enumeration of a subset of $S$.
Proposition 2.4. Let $D$ be a denumerable set, and $f: D \to S$ a surjective
mapping. Then $S$ is denumerable or finite.
Proof. For each $y \in S$, there exists an element $x_y \in D$ such that $f(x_y) = y$
because $f$ is surjective. The association $y \mapsto x_y$ is an injective mapping
of $S$ into $D$, because if $y, z \in S$ and $x_y = x_z$, then
$$y = f(x_y) = f(x_z) = z.$$
Let $g(y) = x_y$. The image of $g$ is a subset of $D$, and $D$ is denumerable.
Since $g$ is a bijection between $S$ and its image, it follows that $S$ is
denumerable or finite.
Proposition 2.5. Let $D$ be a denumerable set. Then $D \times D$ (the set of
all pairs $(x, y)$ with $x, y \in D$) is denumerable.
Proof. There is a bijection between $D \times D$ and $\mathbf{Z}^+ \times \mathbf{Z}^+$, so it will
suffice to prove that $\mathbf{Z}^+ \times \mathbf{Z}^+$ is denumerable. Consider the mapping of
$\mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ given by
$$(n, m) \mapsto 2^n 3^m.$$
In view of Proposition 2.1, it will suffice to prove that this mapping is
injective. Suppose $2^n 3^m = 2^r 3^s$ for positive integers $n, m, r, s$. Say $r < n$.
Dividing both sides by $2^r$, we obtain
$$2^k 3^m = 3^s$$
with $k = n - r \ge 1$. Then the left-hand side is even, but the right-hand
side is odd, so the assumption $r < n$ is impossible. Similarly, we cannot
have $n < r$. Hence $r = n$. Then we obtain $3^m = 3^s$. If $m > s$, then $3^{m-s} = 1$
which is impossible. Similarly, we cannot have $s > m$, whence $m = s$.
Hence our map is injective, as was to be proved.
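The injectivity argument amounts to saying that the pair $(n, m)$ can be read back from the integer $2^n 3^m$ by counting factors of 2 and of 3. A minimal sketch, in which the function names `encode` and `decode` are introduced here for illustration:

```python
def encode(n, m):
    """The map (n, m) -> 2^n * 3^m on pairs of positive integers."""
    return 2 ** n * 3 ** m

def decode(k):
    """Recover (n, m) from k = 2^n * 3^m; fail if k is not in the image."""
    n = m = 0
    while k % 2 == 0:           # strip factors of 2
        k //= 2
        n += 1
    while k % 3 == 0:           # strip factors of 3
        k //= 3
        m += 1
    if k != 1 or n < 1 or m < 1:
        raise ValueError("not of the form 2^n 3^m with n, m >= 1")
    return n, m

pairs = [(n, m) for n in range(1, 9) for m in range(1, 9)]
codes = {encode(n, m) for n, m in pairs}
```

The 64 sample pairs give 64 distinct codes, and `decode` inverts `encode` on each of them. The map is far from surjective (for instance 5 is not in its image), which is why the proof appeals to Proposition 2.1 for the infinite image.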
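Proposition 2.6. Let $\{D_1, D_2, \ldots\}$ be a sequence of denumerable sets.
Let $S$ be the union of all sets $D_i$ ($i = 1, 2, \ldots$). Then $S$ is denumerable.
Proof. For each $i = 1, 2, \ldots$ we enumerate the elements of $D_i$, as
indicated in the following notation:
$$D_1: \{x_{11}, x_{12}, x_{13}, \ldots\}$$
$$D_2: \{x_{21}, x_{22}, x_{23}, \ldots\}$$
The map $f: \mathbf{Z}^+ \times \mathbf{Z}^+ \to S$ given by
$$f(i, j) = x_{ij}$$
is then a surjective map of $\mathbf{Z}^+ \times \mathbf{Z}^+$ onto $S$. Since $\mathbf{Z}^+ \times \mathbf{Z}^+$ is
denumerable by Proposition 2.5, Proposition 2.4 shows that $S$ is denumerable
or finite, and $S$ is infinite because it contains $D_1$. Hence $S$ is denumerable.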
Corollary 2.7. Let $F$ be a non-empty finite set and $D$ a denumerable set.
Then $F \times D$ is denumerable. If $S_1, S_2, \ldots$ is a sequence of sets,
each of which is finite or denumerable, then the union $S_1 \cup S_2 \cup \cdots$ is
denumerable or finite.
Proof. There is an injection of F into Z+ and a bijection of D with

Z+. Hence there is an injection of F x D into Z+ x Z+ and we can
apply Corollary 2.2 and Proposition 2.6 to prove the first statement.
One could also define a surjective map of Z+ x Z+ onto F x D. As for
the second statement, each finite set is contained in some denumerable
set, so that the second statement follows from Propositions 2.1 and 2.6.
For convenience, we shall say that a set is countable if it is either finite

or denumerable.
I, §3. ZORN'S LEMMA

In order to deal efficiently with infinitely many sets simultaneously, one
needs a special property. To state it, we need some more terminology.
Let $S$ be a set. An ordering (also called partial ordering) of (or on) $S$
is a relation, written $x \le y$, among some pairs of elements of $S$, having
the following properties.
ORD 1. We have $x \le x$.
ORD 2. If $x \le y$ and $y \le z$ then $x \le z$.
ORD 3. If $x \le y$ and $y \le x$ then $x = y$.
We sometimes write $y \ge x$ for $x \le y$. Note that we don't require that the
relation $x \le y$ or $y \le x$ hold for every pair of elements $(x, y)$ of $S$. Some
pairs may not be comparable. If the ordering satisfies this additional
property, then we say that it is a total ordering.
Example 1. Let $G$ be a group. Let $S$ be the set of subgroups. If $H$,
$H'$ are subgroups of $G$, we define
$$H \le H'$$
if $H$ is a subgroup of $H'$. One verifies immediately that this relation
defines an ordering on $S$. Given two subgroups, $H$, $H'$ of $G$, we do not
necessarily have $H \le H'$ or $H' \le H$.
Example 2. Let $R$ be a ring, and let $S$ be the set of left ideals of $R$.
We define an ordering in $S$ in a way similar to the above, namely if $L$, $L'$
are left ideals of $R$, we define
$$L \le L'$$
if $L \subset L'$.
Example 3. Let $X$ be a set, and $S$ the set of subsets of $X$. If $Y$, $Z$ are
subsets of $X$, we define $Y \le Z$ if $Y$ is a subset of $Z$. This defines an
ordering on $S$.
In all these examples, the relation of ordering is said to be that of

inclusion.
In an ordered set, if $x \le y$ and $x \neq y$ we then write $x < y$.
Let $A$ be an ordered set, and $B$ a subset. Then we can define an
ordering on $B$ by defining $x \le y$ for $x, y \in B$ to hold if and only if $x \le y$
in $A$. We shall say that it is the ordering on $B$ induced by the ordering
on $A$, or is the restriction to $B$ of the partial ordering of $A$.
Let $S$ be an ordered set. By a least element of $S$ (or a smallest
element) one means an element $a \in S$ such that $a \le x$ for all $x \in S$.
Similarly, by a greatest element one means an element $b$ such that $x \le b$ for
all $x \in S$.
By a maximal element $m$ of $S$ one means an element such that if $x \in S$
and $x \ge m$, then $x = m$. Note that a maximal element need not be a
greatest element. There may be many maximal elements in $S$, whereas if
a greatest element exists, then it is unique (proof?).
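The difference between maximal and greatest elements is easy to see on a small ordered set. The sketch below uses divisibility as the ordering on a sample set of integers; the set `S` and the name `leq` are chosen here for illustration.

```python
def leq(x, y):
    """The ordering x <= y, taken here to mean: x divides y."""
    return y % x == 0

S = {1, 2, 3, 4, 6, 9}

# m is maximal if every x in S with m <= x is equal to m.
maximal = sorted(m for m in S if all(x == m for x in S if leq(m, x)))

# b is greatest if x <= b for all x in S; a is least if a <= x for all x in S.
greatest = [b for b in S if all(leq(x, b) for x in S)]
least = [a for a in S if all(leq(a, x) for x in S)]
```

Here `maximal` is `[4, 6, 9]`, `greatest` is empty, and `least` is `[1]`: there are three maximal elements, none of which is a greatest element, while the least element is unique.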
Let $S$ be an ordered set. We shall say that $S$ is totally ordered if given
$x, y \in S$ we have necessarily $x \le y$ or $y \le x$.
Example 4. The integers $\mathbf{Z}$ are totally ordered by the usual ordering.
So are the real numbers.
Let $S$ be an ordered set, and $T$ a subset. An upper bound of $T$ (in $S$)
is an element $b \in S$ such that $x \le b$ for all $x \in T$. A least upper bound of
$T$ in $S$ is an upper bound $b$ such that if $c$ is another upper bound, then
$b \le c$. We shall say that $S$ is inductively ordered if every non-empty
totally ordered subset has an upper bound.
We shall say that S is strictly inductively ordered if every non-empty
totally ordered subset has a least upper bound.
In Examples 1, 2, 3, in each case, the set is strictly inductively ordered.
To prove this, let us take Example 1. Let $T$ be a non-empty totally
ordered subset of the set of subgroups of $G$. This means that if $H, H' \in T$,
then $H \subset H'$ or $H' \subset H$. Let $U$ be the union of all sets in $T$. Then:
(1) $U$ is a subgroup. Proof: If $x, y \in U$, there exist subgroups $H$,
$H' \in T$ such that $x \in H$ and $y \in H'$. If, say, $H \subset H'$, then both
$x, y \in H'$ and hence $xy \in H'$. Hence $xy \in U$. Also, $x^{-1} \in H'$, so
$x^{-1} \in U$. Hence $U$ is a subgroup.
(2) $U$ is an upper bound for each element of $T$. Proof: Every $H \in T$
is contained in $U$, so $H \le U$ for all $H \in T$.
(3) $U$ is a least upper bound for $T$. Proof: Any subgroup of $G$ which
contains all the subgroups $H \in T$ must then contain their union
$U$.
The proof that the sets in Examples 2, 3 are strictly inductively
ordered is entirely similar.
We can now state the property mentioned at the beginning of the
section.
Zorn's Lemma. Let S be a non-empty inductively ordered set. Then

there exists a maximal element in S.
Zorn's lemma could be just taken as an axiom of set theory. However,
it is not psychologically completely satisfactory as an axiom, because
its statement is too involved, and one does not visualize easily the
existence of the maximal element asserted in that statement. We show
how one can prove Zorn's lemma from other properties of sets which
everyone would immediately grant as acceptable psychologically.
From now on to the end of the proof of Theorem 3.1, we let $A$ be a
non-empty partially ordered and strictly inductively ordered set. We
recall that strictly inductively ordered means that every non-empty totally
ordered subset has a least upper bound. We assume given a map
$f: A \to A$ such that for all $x \in A$ we have $x \le f(x)$. We could call such
a map an increasing map.
Let $a \in A$. Let $B$ be a subset of $A$. We shall say that $B$ is admissible
if:
(1) $B$ contains $a$.
(2) We have $f(B) \subset B$.
(3) Whenever $T$ is a totally ordered subset of $B$, the least upper
bound of $T$ in $A$ lies in $B$.
Then $B$ is also strictly inductively ordered, by the induced ordering of $A$.
We shall prove:
Theorem 3.1 (Bourbaki). Let $A$ be a non-empty partially ordered and
strictly inductively ordered set. Let $f: A \to A$ be an increasing mapping.
Then there exists an element $x_0 \in A$ such that $f(x_0) = x_0$.
Proof. Suppose that $A$ were totally ordered. By assumption, it would
have a least upper bound $b \in A$, and then
$$b \le f(b) \le b,$$
so that in this case, our theorem is clear. The whole problem is to
reduce the theorem to that case. In other words, what we need to find is
a totally ordered admissible subset of $A$.
If we throw out of $A$ all elements $x \in A$ such that $x$ is not $\ge a$, then
what remains is obviously an admissible subset. Thus without loss of
generality, we may assume that $A$ has a least element $a$, that is $a \le x$ for
all $x \in A$.
Let $M$ be the intersection of all admissible subsets of $A$. Note that
$A$ itself is an admissible subset, and that all admissible subsets of $A$
contain $a$, so that $M$ is not empty. Furthermore, $M$ is itself an admissible
subset of $A$. To see this, let $x \in M$. Then $x$ is in every admissible
subset, so $f(x)$ is also in every admissible subset, and hence $f(x) \in M$.
Hence $f(M) \subset M$. If $T$ is a totally ordered non-empty subset of $M$, and
$b$ is the least upper bound of $T$ in $A$, then $b$ lies in every admissible
subset of $A$, and hence lies in $M$. It follows that $M$ is the smallest
admissible subset of $A$, and that any admissible subset of $A$ contained in
$M$ is equal to $M$.
We shall prove that M is totally ordered, and thereby prove Theorem
3.1.
[First we make some remarks which don't belong to the proof, but
will help in the understanding of the subsequent lemmas. Since $a \in M$, we
see that $f(a) \in M$, $f \circ f(a) \in M$, and in general $f^n(a) \in M$. Furthermore,
$$a \le f(a) \le f^2(a) \le \cdots \le f^n(a) \le \cdots.$$
If we had an equality somewhere, we would be finished, so we may
assume that the inequalities hold. Let $D_0$ be the totally ordered set
$\{f^n(a)\}_{n \ge 0}$. Then $D_0$ looks like this:
$$a < f(a) < f^2(a) < \cdots < f^n(a) < \cdots.$$
Let $a_1$ be the least upper bound of $D_0$. Then we can form
$$a_1 < f(a_1) < f^2(a_1) < \cdots < f^n(a_1) < \cdots$$
in the same way to obtain $D_1$, and we can continue this process, to
obtain
$$D_1, D_2, \ldots.$$
It is clear that $D_1$, $D_2$, ... are contained in $M$. If we had a precise way
of expressing the fact that we can establish a never-ending string of such
denumerable sets, then we would obtain what we want. The point is that
we are now trying to prove Zorn's lemma, which is the natural tool for
guaranteeing the existence of such a string. However, given such a string,
we observe that its elements have two properties: If $c$ is an element of
such a string and $x < c$, then $f(x) \le c$. Furthermore, there is no element
between $c$ and $f(c)$, that is if $x$ is an element of the string, then $x \le c$ or
$f(c) \le x$. We shall now prove two lemmas which show that elements of
$M$ have these properties.]
Let $c \in M$. We shall say that $c$ is an extreme point of $M$ if whenever
$x \in M$ and $x < c$, then $f(x) \le c$. For each extreme point $c \in M$ we let
$$M_c = \text{set of } x \in M \text{ such that } x \le c \text{ or } f(c) \le x.$$
Note that $M_c$ is not empty because $a$ is in it.
Lemma 3.2. We have $M_c = M$ for every extreme point $c$ of $M$.
Proof. It will suffice to prove that $M_c$ is an admissible subset. Let
$x \in M_c$. If $x < c$ then $f(x) \le c$ so $f(x) \in M_c$. If $x = c$ then $f(x) = f(c)$ is
again in $M_c$. If $f(c) \le x$, then $f(c) \le x \le f(x)$, so once more $f(x) \in M_c$.
Thus we have proved that $f(M_c) \subset M_c$.
Let $T$ be a totally ordered subset of $M_c$ and let $b$ be the least upper
bound of $T$ in $A$. Since $M$ is admissible, we have $b \in M$. If all
elements $x \in T$ are $\le c$, then $b \le c$ and $b \in M_c$. If some $x \in T$ is such that
$f(c) \le x$, then
$$f(c) \le x \le b,$$
and so $b$ is in $M_c$. This proves our lemma.
Lemma 3.3. Every element of $M$ is an extreme point.
Proof. Let $E$ be the set of extreme points of $M$. Then $E$ is not empty
because $a \in E$. It will suffice to prove that $E$ is an admissible subset. We
first prove that $f$ maps $E$ into itself. Let $c \in E$. Let $x \in M$ and suppose
$x < f(c)$. We must prove that
$$f(x) \le f(c).$$
By Lemma 3.2, $M = M_c$, and hence we have $x < c$, or $x = c$, or $f(c) \le x$.
This last possibility cannot occur because $x < f(c)$. If $x < c$ then
$$f(x) \le c \le f(c).$$
If $x = c$ then $f(x) = f(c)$, and hence $f(E) \subset E$.
Next let $T$ be a totally ordered subset of $E$. Let $b$ be the least upper
bound of $T$ in $A$. We must prove that $b \in E$. Let $x \in M$ and $x < b$.
We must show that $f(x) \le b$. If for all $c \in T$ we have $f(c) \le x$, then
$c \le f(c) \le x$ for all $c \in T$, whence $x$ is an upper bound for $T$, whence
$b \le x$, which contradicts $x < b$. Otherwise, since $M_c = M$ for all $c \in E$, we
must therefore have $x \le c$ for some $c \in T$. If $x < c$, then $f(x) \le c \le b$.
If $x = c$, then $c < b$, and since $b \in M = M_c$ we have $b \le c$ or $f(c) \le b$;
the first is excluded, so $f(x) = f(c) \le b$. This proves that
$b \in E$, that $E$ is admissible, and thus proves Lemma 3.3.
We now see trivially that $M$ is totally ordered. For let $x, y \in M$.
Then $x$ is an extreme point of $M$ by Lemma 3.3, and $y \in M_x$ so $y \le x$ or
$$x \le f(x) \le y,$$
thereby proving that $M$ is totally ordered. As remarked previously, this
concludes the proof of Theorem 3.1.
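In a finite ordered set the chain $a \le f(a) \le f^2(a) \le \cdots$ must stabilize, and the point where it stabilizes is a fixed point. The sketch below illustrates only this finite situation, which needs none of the machinery of the proof; the ordered set (subsets of $\{0, \ldots, 9\}$ under inclusion), the map `f`, and the helper `fixed_point` are assumptions made for the example.

```python
def f(X):
    """An increasing map on subsets of {0,...,9}: X is always contained in f(X)."""
    return X | frozenset((x + 4) % 10 for x in X)

def fixed_point(f, a):
    """Follow a <= f(a) <= f(f(a)) <= ... until the chain stops growing."""
    x = a
    while f(x) != x:
        x = f(x)
    return x

a = frozenset({0})
x0 = fixed_point(f, a)
```

Starting from $\{0\}$ the chain is $\{0\} \subset \{0,4\} \subset \{0,4,8\} \subset \{0,2,4,8\} \subset \{0,2,4,6,8\}$, and the last set satisfies $f(x_0) = x_0$. In the general theorem the chain need not stop after finitely many steps, which is why least upper bounds of totally ordered subsets are required.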
We shall obtain Zorn's lemma essentially as a corollary of Theorem
3.1. We first obtain Zorn's lemma in a slightly weaker form.
Corollary 3.4. Let $A$ be a non-empty strictly inductively ordered set.
Then $A$ has a maximal element.
Proof. Suppose that $A$ does not have a maximal element. Then for
each $x \in A$ there exists an element $y_x \in A$ such that $x < y_x$. Let $f: A \to A$
be the map such that $f(x) = y_x$ for all $x \in A$. Then $A$, $f$ satisfy the
hypotheses of Theorem 3.1 and applying Theorem 3.1 yields a contradiction.
The only difference between Corollary 3.4 and Zorn's lemma is that in
Corollary 3.4, we assume that a non-empty totally ordered subset has a
least upper bound, rather than an upper bound. It is, however, a simple
matter to reduce Zorn's lemma to the seemingly weaker form of
Corollary 3.4. We do this in the second corollary.
Corollary 3.5 (Zorn's Lemma). Let S be a non-empty inductively

ordered set. Then S has a maximal element.
Proof. Let $A$ be the set of non-empty totally ordered subsets of $S$.
Then $A$ is not empty since any subset of $S$ with one element belongs to
$A$. If $X, Y \in A$, we define $X \le Y$ to mean $X \subset Y$. Then $A$ is partially
ordered, and is in fact strictly inductively ordered. For let $T = \{X_i\}_{i \in I}$ be
a totally ordered subset of $A$. Let
$$Z = \bigcup_{i \in I} X_i.$$
Then $Z$ is totally ordered. To see this, let $x, y \in Z$. Then $x \in X_i$ and
$y \in X_j$ for some $i, j \in I$. Since $T$ is totally ordered, say $X_i \subset X_j$. Then $x$,
$y \in X_j$ and since $X_j$ is totally ordered, $x \le y$ or $y \le x$. Thus $Z$ is totally
ordered, and is obviously a least upper bound for $T$ in $A$. By Corollary
3.4, we conclude that $A$ has a maximal element $X_0$. This means that $X_0$
is a maximal totally ordered subset of $S$ (non-empty). Let $m$ be an upper
bound for $X_0$ in $S$. Then $m$ is the desired maximal element of $S$. For if
$x \in S$ and $m \le x$, then $X_0 \cup \{x\}$ is totally ordered, whence equal to $X_0$ by
the maximality of $X_0$. Thus $x \in X_0$ and $x \le m$. Hence $x = m$, as was to
be shown.
CHAPTER II
Topological Spaces
This chapter develops the standard properties of topological spaces. Most

of these properties do not go beyond the level of a convenient language.
In the text proper, we have given precisely those results which are used
very frequently in all analysis. In the exercises, we give additional results,
of which some just give routine practice and others give more special
results. To incorporate all this material in the text proper would be
extremely oppressive and would obscure the principal lines of thought
inherent in the basic aspects of the subject. The reader can always be
referred to Bourbaki [Bo] or Kelley [Ke] for encyclopaedic treatments.
II, §1. OPEN AND CLOSED SETS
Let $X$ be a set. By a topology on $X$ we mean a collection $\mathscr{T}$ of subsets
called the open sets of the topology, satisfying the following conditions:
TOP 1. The empty set and X itself are open.

TOP 2. A finite intersection of open sets is open.
TOP 3. An arbitrary union of open sets is open.
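On a finite set the three axioms can be tested mechanically. The following sketch assumes a finite collection of subsets, so that closure under pairwise unions and intersections already gives closure under arbitrary unions and finite intersections; the function name `is_topology` and the sample collections are chosen here for illustration.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check TOP 1, 2, 3 for a finite collection of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    if not all(U <= X for U in opens):
        return False
    # TOP 1: the empty set and X itself are open.
    if frozenset() not in opens or X not in opens:
        return False
    # TOP 2 and TOP 3: for a finite collection, pairwise closure suffices.
    return all((U & V) in opens and (U | V) in opens
               for U, V in combinations(opens, 2))

X = {1, 2, 3}
trivial_ok = is_topology(X, [set(), X])
chain_ok = is_topology(X, [set(), {1}, {1, 2}, X])
broken_ok = is_topology(X, [set(), {1}, {2}, X])   # {1} u {2} is missing
```

The first two collections pass, while the third fails because the union $\{1\} \cup \{2\}$ is not in the collection.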
Example 1. Let X be any set. If we define an open set to be the

empty set or X itself, we have a topology on X, which is definitely not
interesting.
Example 2. Let X be a set, and define every subset to be open. In

particular, each element of X constitutes an open set. Again we have a
topology, which is called the discrete topology on X. A space with the

discrete topology is called a discrete space. It does not look as if this
topology were any more interesting than that of Example 1, but in fact it
does occur in practice.
Example 3. Let X = R be the set of real numbers. Define a subset U

of R to be open if for each point x in U there exists an open interval J
containing x and contained in U. The three axioms of a topology are
easily verified. This topology is called the ordinary topology.
Example 4. Generalization of Example 3, and used very frequently in

analysis. We recall that a normed vector space (over the real numbers) is
a vector space $E$ together with a function on $E$ denoted by $x \mapsto |x|$ (real
valued) such that:
NVS 1. We have $|x| \ge 0$ and $= 0$ if and only if $x = 0$.
NVS 2. If $c \in \mathbf{R}$ and $x \in E$, then $|cx| = |c||x|$.
NVS 3. If $x, y \in E$, then $|x + y| \le |x| + |y|$.
Similarly, one defines the notion of normed vector space over the
complex numbers. The axioms are the same, except that we then take
the number $c$ to be complex in NVS 2.
By an open ball $B$ in $E$ centered at a point $v$, and of radius $r > 0$, we
mean the set of all $x \in E$ such that $|x - v| < r$. We denote such a ball by
$B_r(v)$. We define a set $U$ to be open in $E$ if for each point $x \in U$ there
exists an open ball $B$ centered at $x$ and contained in $U$. Again it is easy
to verify that this defines a topology, also called the ordinary topology of
the normed vector space. It is but an exercise to verify that an open ball
is indeed an open set of this topology.
Let $\{x_n\}$ be a sequence in a normed vector space $E$. This sequence is
said to be Cauchy if given $\epsilon$ (always assumed $> 0$) there exists $N$ such
that for all $m, n \ge N$ we have
$$|x_n - x_m| < \epsilon.$$
This sequence is said to converge to an element $x$ if given $\epsilon$, there exists
$N$ such that for all $n > N$ we have
$$|x - x_n| < \epsilon.$$
Examples of Normed Vector Spaces
The sup norm. Let $S$ be a set. A map $f: S \to F$ of $S$ into a normed
vector space $F$ is said to be bounded if there exists a number $C > 0$ such
that $|f(x)| \le C$ for all $x \in S$. If $f$ is bounded, define
$$\|f\|_S = \|f\| = \sup_{x \in S} |f(x)|,$$
sup meaning least upper bound. It can be easily shown that the set of
bounded maps $B(S, F)$ of $S$ into $F$ is a vector space, and that $\|\ \|$ is a
norm on this space, called the sup norm.
The $L^1$-norm. Let $E$ be the space of continuous functions on $[0, 1]$.
For $f \in E$ define
$$\|f\|_1 = \int_0^1 |f(x)|\,dx.$$
Then $\|\ \|_1$ is a norm on $E$, called the $L^1$-norm. This norm will be a
major object of study when we do integration later, in a general context.
Much of this book is devoted to studying the convergence of
sequences for one or the other of the above two norms. For instance,
consider the sup norm. A sequence of maps $\{f_n\}$ is said to be uniformly
Cauchy on $S$ if given $\epsilon$ there exists $N$ such that for all $m, n > N$ we have
$$\|f_n - f_m\|_S < \epsilon.$$
It is said to be uniformly convergent to a map $f$ if given $\epsilon$ there exists $N$
such that for all $n \ge N$ we have
$$\|f_n - f\|_S < \epsilon.$$
In the second example, we would use the expressions $L^1$-Cauchy and
$L^1$-convergent instead of uniformly Cauchy and uniformly convergent, if
we replace the sup norm by the $L^1$-norm in these definitions.
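The two norms can behave quite differently on the same sequence. As a numerical sketch, take $f_n(x) = x^n$ on $[0, 1]$ (an example chosen here, not taken from the text): its sup norm is 1 for every $n$, while its $L^1$-norm is $1/(n+1)$. The helpers `sup_norm` and `l1_norm` below only approximate the norms on a grid, so the printed values are approximations.

```python
def sup_norm(f, samples=1001):
    """Approximate sup |f| on [0, 1] by sampling a uniform grid with endpoints."""
    return max(abs(f(i / (samples - 1))) for i in range(samples))

def l1_norm(f, cells=1000):
    """Approximate the integral of |f| over [0, 1] by the midpoint rule."""
    h = 1.0 / cells
    return sum(abs(f((i + 0.5) * h)) for i in range(cells)) * h

def power(n):
    """The sample function f_n(x) = x^n."""
    return lambda x: x ** n

for n in (1, 10, 50):
    print(n, sup_norm(power(n)), l1_norm(power(n)))
```

The second column stays at 1 while the third decreases towards 0, so this sequence is $L^1$-convergent to the zero function without being uniformly convergent to it.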
Up to a point, one can generalize the notion of subset of a normed
vector space as follows. Let $X$ be a set. A distance function (also called
a metric) on $X$ is a map $(x, y) \mapsto d(x, y)$ from $X \times X$ into $\mathbf{R}$ satisfying the
following conditions:
DIS 1. We have $d(x, y) \ge 0$ for all $x, y \in X$, and $= 0$ if and only if
$x = y$.
DIS 2. For all $x, y$, we have $d(x, y) = d(y, x)$.
DIS 3. For all $x, y, z$, we have
$$d(x, z) \le d(x, y) + d(y, z).$$

A set with a metric is called a metric space. We can then define open
balls just as we did in the case of normed vector spaces, and also define
a topology in a metric space just as we did for a normed vector space.
Every open set is then a union of open balls. This topology is said to be
determined by the metric.
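The conditions DIS 1, 2, 3 can be tested on a finite sample of points. The sketch below is only such a spot check (passing it proves nothing about the whole space, while failing it gives a genuine counterexample); the helper `is_metric`, the tolerance `tol`, and the sample points are assumptions made for the illustration.

```python
import math
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check DIS 1, 2, 3 for the function d on a finite list of points."""
    for x, y in product(points, repeat=2):
        dxy = d(x, y)
        # DIS 1: non-negative, and zero exactly when x == y. DIS 2: symmetry.
        if dxy < 0 or (dxy == 0) != (x == y) or abs(dxy - d(y, x)) > tol:
            return False
    # DIS 3: the triangle inequality, up to rounding error.
    return all(d(x, z) <= d(x, y) + d(y, z) + tol
               for x, y, z in product(points, repeat=3))

plane = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
euclid_ok = is_metric(plane, math.dist)

# The squared difference on the real line violates DIS 3.
squared_ok = is_metric([0.0, 1.0, 2.0], lambda x, y: (x - y) ** 2)
```

The Euclidean distance passes on the sample, while the squared difference fails because $d(0, 2) = 4 > d(0, 1) + d(1, 2) = 2$.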
In a normed vector space, we can define the distance between elements
$x$, $y$ to be $d(x, y) = |x - y|$. It is immediately verified that this is a metric
on the space. Conversely, the reader will see in Exercise 5 how a metric
space can be embedded naturally in a normed vector space, in a manner
preserving the metric, so that the "generality" of metric spaces is illusory.
For convenience, we also make here the following definition: If $A$, $B$ are
subsets of a normed vector space, we define their distance to be
$$d(A, B) = \inf |x - y|, \qquad x \in A,\ y \in B.$$
Basic theorems concerning subsets of normed vector spaces hold just as
well for metric spaces. However, almost all metric spaces which arise
naturally (and certainly all of those in this course) occur in a normed
vector space with a natural linear structure. There is enough of a change
of notation from $|x - y|$ to $d(x, y)$ to warrant carrying out proofs with
the norm notation rather than the other.
Let $\mathscr{T}$ and $\mathscr{T}'$ be topologies on a set $X$. One verifies at once that
they are equal if and only if the following condition is satisfied: For each
$x \in X$ and each set $V$ open in $\mathscr{T}$ containing $x$, there exists a set $V'$
open in $\mathscr{T}'$ such that $x \in V' \subset V$, and conversely, given $V'$ open in $\mathscr{T}'$
containing $x$, there exists $V$ open in $\mathscr{T}$ such that $x \in V \subset V'$.
Example. The reader will verify easily that two norms $|\ |_1$ and $|\ |_2$ on
a vector space $E$ give rise to the same topology if and only if they satisfy
the following condition: There exist $C_1, C_2 > 0$ such that for all $x \in E$ we
have
$$C_1 |x|_1 \le |x|_2 \le C_2 |x|_1.$$
If this is the case, the norms are called equivalent.

Just to fix terminology, we define the closed ball centered at $v$ and of
radius $r \ge 0$ to be the set of all $x \in E$ such that
$$|x - v| \le r.$$
We define the sphere centered at $v$, of radius $r$, to be the set of points $x$
such that
$$|x - v| = r.$$
Warning. In some books, what we call a ball is called a sphere. This
is not good terminology, and the terminology used here is now
essentially universally adopted.
Examples of normed vector spaces are given in the exercises. The

standard properties of subsets of normed vector spaces having to do with
limits are also valid in metric spaces (cf. Exercise 5). We can define balls
and spheres in metric spaces just as in normed vector spaces. We can
also define the notion of Cauchy sequence in a metric space X as usual
(again cf. Exercise 5), and X is said to be complete if every Cauchy
sequence converges, i.e. has a limit in X.
Example 5. Let $G$ be a group. We define a subset $U$ of $G$ to be open
if for each element $x \in U$ there exists a subgroup $H$ of $G$, of finite index,
such that $xH$ is contained in $U$. It is a simple exercise in algebra to
show that this defines a topology, which is called the profinite topology.
Example 6. Let $R$ be a commutative ring (which according to
standard conventions has a unit element). We define a subset $U$ of $R$ to be
open if for each $x \in U$ there exists an ideal $J$ in $R$ such that $x + J$ is
contained in $U$. It is a simple exercise in algebra to show that this
defines a topology, which is called the ideal topology.
Note. The topologies of Examples 5 and 6 will not occur in any

significant way in this course, and may thus be disregarded by anyone
uninterested in this type of algebra.
A set together with a topology is called a topological space. In this

chapter we develop a large number of basic trivialities about topological
spaces, and except for the numbered theorems, it is recommended that
readers work out the proofs for all other assertions by themselves, even
though we have given most of them.
The duality between intersections and unions with respect to taking
the complement of a subset allows us to define a topology by means of
the complements of open sets, called closed sets. In any topological
space, the closed sets satisfy the following conditions:
CL 1. The empty set and the whole space are closed.

CL 2. The finite union of closed sets is closed.
CL 3. The arbitrary intersection of closed sets is closed.
The first condition is clear, and the other two come from the fact that
the complement of the union of subsets is equal to the intersection of
their complements, and that the complement of the intersection of subsets
is equal to the union of their complements.
Conversely, given a collection $\mathscr{F}$ of subsets of a set $X$ (not yet a
topological space), we say that it defines a topology on $X$ by means of
closed sets if its elements satisfy the three conditions CL 1, 2, 3. We can
then define an open set to be the complement of a set in $\mathscr{F}$.
Example 7. Let $X = \mathbf{R}^n$. Let $f(x_1, \ldots, x_n)$ be a polynomial in $n$
variables. A point $a = (a_1, \ldots, a_n)$ in $\mathbf{R}^n$ is called a zero of $f$ if $f(a) = 0$. We
define a subset $S$ of $\mathbf{R}^n$ to be closed if there exists a family $\{f_i\}_{i \in I}$ of
polynomials in $n$ variables (with real coefficients) such that $S$ consists
precisely of the common zeros of all $f_i$ in the family (in other words, all
points $a \in \mathbf{R}^n$ such that $f_i(a) = 0$ for all $i$). The reader may assume here
the result that, for any such closed set $S$, there exists a finite number of
polynomials $f_1, \ldots, f_r$ such that $S$ is already the set of zeros of the set
$\{f_1, \ldots, f_r\}$. It is easy to prove that we have defined a topology by means
of closed sets, and this topology is called the Zariski topology on $\mathbf{R}^n$. It
is a topology which is adjusted to the study of algebraic sets, that is sets
which are zeros of polynomials. It will not reappear in this course, and
again an uninterested reader may omit it. It does become important in
subsequent courses, however. In 2-space, a closed set consists of a finite
number of points and algebraic curves. In 3-space, a closed set consists
of a finite number of points, algebraic curves, and algebraic surfaces.
Let X be a topological space, and S a subset. A point x E X is said to

be adherent to S if given an open set U containing x, there is some point
of S lying in U. In particular, every element of S is adherent to S. A
point of X is called a boundary point of S if every open set containing
this point also contains a point of S and a point not in S. Thus an
adherent point of S which does not lie in S is a boundary point of S. An
interior point of S is a point of S which does not lie in the boundary of
S. The set Int(S) of interior points of S is open.
A subset S of X is closed if and only if it contains all its boundary

points. This follows at once from the definitions.
By the closure of a subset $S$ of $X$ we mean the union of $S$ and all its
boundary points. The closure of $S$, denoted by $\bar{S}$, is therefore the set of
adherent points of $S$. It is also immediately verified that $\bar{S}$ is closed, and
is equal to the intersection of all closed sets containing $S$. In particular,
we have
$$\bar{\bar{S}} = \bar{S}.$$
As an exercise, the reader should prove that for subsets $S$, $T$ of $X$ we
have:
$$\overline{S \cup T} = \bar{S} \cup \bar{T} \quad \text{and} \quad \overline{S \cap T} \subset \bar{S} \cap \bar{T}.$$
Equality does not necessarily hold in the formula on the right.
(Example?)
A subset $S$ of a space $X$ is said to be dense (in $X$) if $\bar{S} = X$. For
instance, the rationals are dense in the reals.
Let X be a topological space and S a subset. We define a topology
on S by prescribing a subset V of S to be open in S if there exists an
open set U in X such that $V = U \cap S$. The conditions for a topology
on S are immediately verified, and this topology is called the induced
topology. With this topology, S is called a subspace.
Note. A subset of S which is open in S may not be open in X. For

instance, the real line is open in itself, but definitely not open in R2.
Similarly for closed sets. On the other hand, if U is an open subset of X,
then a subset of U is open in U in the induced topology if and only if it
is open in X. Similarly, if S is a closed subset of X, a subset of S is
closed in S if and only if it is closed in X.
If P is a certain property of certain topological spaces (e.g. connected,

or compact as we shall define later), then we say that a subset has
property P if it has this property as a subspace.
A topology on a set is often defined by means of a base for the open
sets. By a base for the open sets we mean a collection $\mathcal{B}$ of open sets
such that any open set U is a union (possibly infinite) of elements of $\mathcal{B}$.
There is an easy criterion for a collection of subsets to be a base for a
topology. Let X be a set and $\mathcal{B}$ a collection of subsets satisfying:
B 1. Every element of X lies in some set in $\mathcal{B}$.
B 2. If B, B' are in $\mathcal{B}$ and $x \in B \cap B'$ then there exists some B'' in $\mathcal{B}$
such that $x \in B''$ and $B'' \subset B \cap B'$.
If $\mathcal{B}$ satisfies these two conditions, then there exists a unique topology
whose open sets are the unions of sets in $\mathcal{B}$. Indeed, such a topology is
uniquely determined, and it exists because we can define a set to be open
if it is a union of sets in $\mathcal{B}$. The axioms for open sets are trivially
verified.
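On a finite set the two conditions B 1 and B 2 can be checked mechanically, and the open sets can be listed as unions. The sketch below is only an illustration under that finiteness assumption; the function names `is_base` and `generated_topology` are hypothetical and do not come from the text.

```python
from itertools import combinations


def is_base(X, B):
    """Check conditions B 1 and B 2 for a collection B of subsets of a finite set X."""
    X = frozenset(X)
    B = [frozenset(b) for b in B]
    # B 1: every element of X lies in some set of B.
    b1 = all(any(x in b for b in B) for x in X)
    # B 2: every point of an intersection b & bp lies in some c of B
    # with c contained in b & bp.
    b2 = all(
        any(x in c and c <= (b & bp) for c in B)
        for b in B for bp in B for x in (b & bp)
    )
    return b1 and b2


def generated_topology(B):
    """All unions of subfamilies of B (the empty subfamily gives the empty set)."""
    B = [frozenset(b) for b in B]
    opens = set()
    for k in range(len(B) + 1):
        for family in combinations(B, k):
            opens.add(frozenset().union(*family))
    return opens
```

For instance, on X = {1, 2, 3} the collection {1}, {2}, {1, 2, 3} passes both checks, while {1, 2}, {2, 3} fails B 2 at the point 2. Enumerating all subfamilies is exponential in the size of the collection, so this is meant for small examples only.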
Example. The open balls in a normed vector space form a base for
the ordinary topology of that space.
Example. Let X be a set and let $\mathcal{U}$, $\mathcal{V}$ be topologies on X, that is
collections of open sets satisfying the axioms for a topology. We say that
$\mathcal{V}$ is a refinement of $\mathcal{U}$, or that $\mathcal{U}$ is coarser than $\mathcal{V}$, if every set open in
$\mathcal{U}$ is also open in $\mathcal{V}$. Thus $\mathcal{U}$ has fewer open sets than $\mathcal{V}$ ("fewer" in the
weak sense since $\mathcal{U}$ may be equal to $\mathcal{V}$).
Let Y be a topological space and let $\mathcal{F}$ be a family of mappings
$f: X \to Y$ of X into Y. Let $\mathcal{B}$ be the family of all subsets of X consisting
of finite intersections of sets $f^{-1}(W)$, where W is open in Y and f ranges
over $\mathcal{F}$. Then we leave to the reader the verification of the following facts:
1. $\mathcal{B}$ is a base for a topology on X, i.e. satisfies conditions B 1, B 2.
2. This topology is the coarsest topology (the one with the fewest
open sets) such that every map $f \in \mathcal{F}$ is continuous.
We call this topology the weak topology on X determined by $\mathcal{F}$.

For an application of the weak topology, see Chapter IV, §1 and also
the appendix of Chapter IV.
There is a generalization of the weak topology as follows. Instead of
considering one space Y, we consider a family of spaces $\{Y_i\}$, for i
ranging in some index set. We let $\mathcal{F}$ be a family of mappings $f_i: X \to Y_i$.
We let $\mathcal{B}$ be the family of all subsets of X consisting of finite intersections
of sets $f_i^{-1}(U_i)$ where $U_i$ is open in $Y_i$. Then again it is easily
verified that $\mathcal{B}$ is a base for a topology, called the weak topology determined
by the family $\mathcal{F}$. The product topology defined below will provide
an example of this more general case, when the family $\mathcal{F}$ is the family of
projections on the factors of a product.
A topological space is said to be separable if it has a countable base.

(By countable we mean finite or denumerable.) Exercises on separable
spaces designed to acquaint the reader with them, and essentially all
trivial, are given at the end of the chapter. It is easy to see that the real
numbers have a countable base. Indeed, we can take for basis elements
the open intervals of rational radius, centered at rational points. Simi-
larly, Rn has a countable base.
Note. In most cases, the property defining separability is equivalent

with the property that there exists a countable dense subset (cf. Exercise
15), and this second property is sometimes used to define separability.
We find our definition to be more useful but the reader is warned on the
discrepancy with some other texts.
An open set containing a point x is called an open neighborhood of
this point. By a neighborhood of x we mean any set containing an open
set containing x. In a normed vector space, one speaks of an $\epsilon$-neighborhood
of a point x as being a ball of radius $\epsilon$ centered at x.
Let X, Y be topological spaces. A map $f: X \to Y$ is said to be continuous
if the inverse image of an open set (in Y) is open in X. In other
words, if V is open in Y then $f^{-1}(V)$ is open in X. Equivalently, we see
that a map f is continuous if and only if the inverse image of a closed
set is closed.
Proposition 1.1. Let E, F be normed vector spaces and let $f: E \to F$ be
a map. This map is continuous if and only if the usual $(\epsilon, \delta)$ definition is
satisfied at every point of E.
We prove one of the two implications. Assume that f is continuous
and let $x \in E$. Given $\epsilon$, let V be the open ball of radius $\epsilon$ centered at
f(x). The open set $U = f^{-1}(V)$ contains an open ball B of radius $\delta$
centered at x for some $\delta$. In particular, if $y \in E$ and $|x - y| < \delta$, then
$f(y) \in V$ and $|f(y) - f(x)| < \epsilon$. This proves the $(\epsilon, \delta)$ property. The
converse is equally clear and is left to the reader.
Actually, this $(\epsilon, \delta)$ property can be formulated analogously in arbitrary
topological spaces, as follows: The map $f: X \to Y$ is said to be
continuous at a point $x \in X$ if given a neighborhood V of f(x) there exists
a neighborhood U of x such that $f(U) \subset V$. It is then verified at once
that f is continuous if and only if it is continuous at every point.
Proposition 1.2. Let X be a metric space (or a subset of a normed
vector space) and let $f: X \to E$ be a map into a normed vector space.
Then f is continuous if and only if the following condition is satisfied.
Let $\{x_n\}$ be a sequence in X converging to a point x. Then $\{f(x_n)\}$
converges to f(x).
The proof will be left as an exercise to the reader.
A composite of continuous maps is continuous.
Indeed, if $f: X \to Y$ and $g: Y \to Z$ are continuous maps and V is open
in Z, then
$$(g \circ f)^{-1}(V) = f^{-1}(g^{-1}(V))$$
is seen to be open.
As usual, we observe that a continuous image of an open set is not
necessarily open.
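A standard illustration of this remark, supplied here as an example and not taken from the text, is the squaring map on the real line. It is continuous, yet it sends an open interval onto an interval which contains its left endpoint:

```latex
f: \mathbf{R} \to \mathbf{R}, \qquad f(x) = x^2, \qquad
f\bigl((-1, 1)\bigr) = [0, 1).
```

The image is not open in $\mathbf{R}$ because no open interval around 0 is contained in it.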
A continuous map $f: X \to Y$ which admits a continuous inverse map
$g: Y \to X$ is called a homeomorphism, or topological isomorphism. It is
clear that a composite of homeomorphisms is also a homeomorphism.
As usual, we observe that a continuous bijective map need not be a
homeomorphism. In fact, later in this course, we meet many examples
of vector spaces with two different norms on them such that the identity
map is continuous but not bicontinuous.
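A more elementary example than the normed spaces just mentioned, given here as a supplementary illustration and not as one of the examples the text refers to, wraps a half-open interval around the unit circle $S^1$ in the plane:

```latex
f: [0, 2\pi) \to S^1, \qquad f(t) = (\cos t, \sin t).
% f is continuous and bijective. Its inverse is not continuous at (1, 0):
f\bigl(2\pi - \tfrac{1}{n}\bigr) \longrightarrow (1, 0) = f(0)
\quad \text{as } n \to \infty,
\qquad \text{but} \qquad
2\pi - \tfrac{1}{n} \not\longrightarrow 0 \ \text{ in } [0, 2\pi).
```

Points of the circle approaching (1, 0) from below have preimages near $2\pi$, far from the preimage 0 of the limit, so the inverse map fails the sequence criterion of Proposition 1.2.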
Let $\{X_i\}_{i \in I}$ be a family of topological spaces and let
$$X = \prod_{i \in I} X_i$$
be their product. We define a topology on X, called the product topology,
by characterizing a subset V of X to be open if for each $x \in V$ there
exists a finite number of indices $i_1, \ldots, i_n$ and open sets $U_{i_1}, \ldots, U_{i_n}$ in the
spaces $X_{i_1}, \ldots, X_{i_n}$ respectively such that
$$x \in U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \neq i_k} X_i \subset V.$$
The product for $i \neq i_k$ is taken for all indices i unequal to $i_1, \ldots, i_n$. In
other words, we can say that the product topology is the one having as a
base all sets of the form
$$U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \neq i_k} X_i.$$
Such sets have arbitrary open sets at a finite number of components, and
the full space at all other components.
The product topology is the unique topology with the fewest open sets
in X which makes each projection map
$$\pi_j: X \to X_j$$
continuous. Indeed, for each open set $U_j$ in $X_j$, the set
$$\pi_j^{-1}(U_j) = U_j \times \prod_{i \neq j} X_i$$
must be open if $\pi_j$ is continuous, and our previous assertion follows. In
other words, it is the weak topology determined by the family of all
projections on the factors.
More generally, given a set and a family of mappings of this set into
topological spaces, one can define a unique topology on the set making
all these mappings continuous, and having the fewest open sets doing
this, namely the weak topology. If S is a set, and
$$\{f_i: S \to Y_i\}_{i \in I}$$
is a family of maps into topological spaces $Y_i$, then the map
$$f: S \to \prod_{i \in I} Y_i$$
such that $f(x) = \{f_i(x)\}$ is continuous for this topology.
Example 8. We can give $\mathbf{R}^n$ the product topology, which is called the
ordinary topology. We define the sup norm on $\mathbf{R}^n$ by
$$\|x\| = \max_i |x_i|$$
if $x = (x_1, \ldots, x_n)$ is given in terms of its coordinates. Then the topology
determined by this norm is clearly the same as the product topology.
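The reason behind the last assertion can be written out in one line. The following worked identity is a sketch added for illustration, using $B(x, r)$ as a hypothetical notation for the open ball of radius r centered at x in the sup norm:

```latex
B(x, r) = \{\, y \in \mathbf{R}^n : \max_i |y_i - x_i| < r \,\}
        = (x_1 - r,\, x_1 + r) \times \cdots \times (x_n - r,\, x_n + r).
```

Thus every open ball for the sup norm is a product of open intervals, hence a basic open set of the product topology. Conversely, a product of open intervals containing x contains such a ball, with r taken as the smallest distance from a coordinate $x_i$ to the endpoints of the i-th interval.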
Remark. A map $f: X \to Y$ which maps open sets onto open sets is
said to be open. A map which maps closed sets onto closed sets is said
to be closed. A continuous map need not be either. For instance, the
graph of the tangent is closed in the plane, but the projection map on
the x-axis maps it on an open interval:
Figure 2.1
The map which folds the plane over the real axis maps the open plane
on the closed half plane. If $f: X \to Y$ is continuous and bijective, then a
necessary and sufficient condition that f be a homeomorphism is that f
be open. This is simply a rephrasing of the continuity of the inverse
mapping $f^{-1}$.
II, §2. CONNECTED SETS
A topological space X is said to be connected if it is not possible to

express X as a union of two disjoint non-empty open sets. Of course, we
can formulate the definition in terms of closed sets instead of open sets.
The reader's intuition of connectedness probably comes from the pos-
sibility of connecting two points of a set by a path. We shall discuss the
relation between this notion and the general notion later, after developing
first some basic properties of connected sets.
Proposition 2.1. Let $f: X \to Y$ be a continuous map. If X is connected
then the image of X is connected.
Proof. Without loss of generality we may assume that Y is the image
of f. Suppose that Y is not connected, so that we can write $Y = U \cup V$
where U, V are open, non-empty, and disjoint. Then
$$X = f^{-1}(U) \cup f^{-1}(V)$$
expresses X as a union of two disjoint non-empty open sets,
which is impossible. This proves our assertion.
Proposition 2.2. A topological space X is connected if and only if every

continuous map of X into a discrete space having at least two elements
is constant.
Proof. Assume that X is connected, and that f is a continuous map of

X into a discrete space with at least two elements. If f is not constant,
we can write the image of f as a union of two disjoint non-empty sets,
open by definition, and this contradicts our previous result. Conversely,
suppose that we can write $X = U \cup V$ as a disjoint union of non-empty
open sets. Let p, q be two distinct objects and let the set {p, q} have the
discrete topology. If we define
$$f: X \to \{p, q\}$$
to be the map such that
f(U) = {p} and f(V) = {q},
then f is continuous and not constant, as was to be shown.

Observe that our proof shows that instead of taking a discrete space
having at least two points, we can take a space with exactly two points
in characterizing a connected set, as we have just done.
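For a finite space the criterion of Proposition 2.2 can be tested directly by listing every map into a two-point discrete space. The sketch below is an illustration only; the function names are hypothetical, and it assumes that `open_sets` is a genuine topology on X (in particular that it contains the empty set and X).

```python
from itertools import product


def continuous_maps_to_two_points(X, open_sets):
    """All continuous maps from the finite space X into the discrete space {p, q}."""
    points = sorted(X)
    opens = {frozenset(U) for U in open_sets}
    maps = []
    for values in product("pq", repeat=len(points)):
        f = dict(zip(points, values))
        pre_p = frozenset(x for x in points if f[x] == "p")
        pre_q = frozenset(x for x in points if f[x] == "q")
        # f is continuous exactly when the inverse images of {p} and {q} are open.
        if pre_p in opens and pre_q in opens:
            maps.append(f)
    return maps


def is_connected(X, open_sets):
    """Proposition 2.2: X is connected iff every such continuous map is constant."""
    return all(
        len(set(f.values())) == 1
        for f in continuous_maps_to_two_points(X, open_sets)
    )
```

On X = {1, 2, 3}, the topology consisting of the empty set, {1}, {1, 2}, and X admits only the two constant maps, so the space is connected. Replacing {1, 2} by {2, 3} produces the non-constant continuous map sending 1 to p and the other two points to q.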
Proposition 2.3. Let X be a topological space and let $\{S_i\}_{i \in I}$ be a
family of subspaces which are connected. If they have a point in common
then their union is connected.
Proof. Let a lie in the intersection of all $S_i$. If we can write
$$\bigcup_{i \in I} S_i = U \cup V$$
where U, V are disjoint and open in this union, then $S_i \cap U$ and $S_i \cap V$
are open in $S_i$ for each i and hence $S_i \subset U$ or $S_i \subset V$. If for some i
we have $S_i \subset U$, then $a \in U$ and consequently we must have $S_i \subset U$
for all i, so that V is empty, thus proving our assertion.
As a consequence of the preceding statement, we define the connected

component of a point a in X to be the union of all connected subspaces
of X containing a. This component is actually not empty, because the
set consisting of a alone is connected.
Proposition 2.4. Let X be a topological space and S a connected subset.
Then the closure of S is connected. In fact, if $S \subset T \subset \bar{S}$, then T is
connected.
Proof. Left to the reader.
Corollary 2.5. The connected component of a point is closed.
Proof. Clear.
As promised, we now discuss the relation between the naive notion of
connectedness and the general notion. Let X be a topological space. We
say that X is arcwise connected if given two points x, y in X there exists
a piecewise continuous path from x to y. By a piecewise continuous path,
we mean a sequence of continuous maps $\{\alpha_1, \ldots, \alpha_r\}$, where each
$$\alpha_i: [a_i, b_i] \to X$$
is a continuous map defined on a closed interval $[a_i, b_i]$ such that
$$\alpha_i(b_i) = \alpha_{i+1}(a_{i+1}).$$
We say that this path goes from x to y if
$$\alpha_1(a_1) = x \quad\text{and}\quad \alpha_r(b_r) = y.$$
Of course, if such a path exists, then it is easy to define just one continuous
map
$$\alpha: [a, b] \to X$$
from some interval [a, b] into X such that $\alpha(a) = x$ and $\alpha(b) = y$. One
can even take the interval [a, b] to be [0, 1].
Proposition 2.6. Any interval of real numbers is connected.
Proof. We give the proof for a closed interval J = [a, b] and leave the
other cases (open, half-open, infinite intervals) as exercises. Suppose that
we can write $J = A \cup B$ where A, B are closed, disjoint, and non-empty.
Say that $a \in A$. Let c be the greatest lower bound of B. Then c lies in
the closure of B and since B is closed, $c \in B$, so $c \neq a$. For any $x \in J$
with $a < x < c$, we must have $x \in A$ since c is a lower bound for B.
Since A is closed, and since c lies in the closure of the interval $a < x < c$,
it follows that c lies in A, a contradiction which proves our assertion.
Proposition 2.7. If a topological space is arcwise connected, then it is

connected.
Proof. Let X be arcwise connected and suppose that we can write X
as a disjoint union of non-empty open sets U, V. Let $x \in U$ and $y \in V$.
There exists a continuous map $\alpha: J \to X$ from a closed interval into
X starting at x and ending at y. Then $\alpha^{-1}(U)$ and $\alpha^{-1}(V)$ express J
as a disjoint union of non-empty sets which are open in J, a
contradiction to Proposition 2.6.
The converse of the preceding result is false. For instance the subset of
the plane consisting of the y-axis and the graph of the curve y = sin(1/x)
is connected but not arcwise connected. In practice, however, most ordinary
sets which are connected are also arcwise connected, and the sort of
pathology which arises from sin(1/x) is just that: pathology. In Exercise
12, you will prove that an open subset of a normed vector space is
connected if and only if it is arcwise connected.
Theorem 2.8. Let $\{X_i\}_{i \in I}$ be a family of connected topological spaces.
Then the product
$$X = \prod_{i \in I} X_i$$
is connected.
Proof. Let $f: X \to \{p, q\}$ be a continuous map of X into a discrete
space consisting of two points. We must show that f is constant. Let
$a \in X$ and say that f(a) = p. Then $f^{-1}(p)$ contains an open neighborhood
of a of the form
$$V = U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \neq i_k} X_i.$$
Let b be any other point of X and write a, b in terms of their
coordinates:
$$a = (a_i)_{i \in I}, \qquad b = (b_i)_{i \in I}.$$
Let
$$z = \bigl(a_{i_1}, \ldots, a_{i_n}, (b_i)_{i \neq i_1, \ldots, i_n}\bigr)$$
so that the coordinates of z are the same as those of a for $i_1, \ldots, i_n$ and
the same as those of b for the other indices. Then $z \in V$ and f(z) = p.
Consider the composite of maps
$$X_{i_1} \xrightarrow{\ g\ } X \xrightarrow{\ f\ } \{p, q\},$$
where g is the injective mapping such that
$$g(x_{i_1}) = \bigl(x_{i_1}, a_{i_2}, \ldots, a_{i_n}, (b_i)_{i \neq i_1, \ldots, i_n}\bigr).$$
Then g is continuous, so is $f \circ g$, and since the continuous image of a
connected set is connected, it follows that $f \circ g$ is constant on $X_{i_1}$. In
particular, $f \circ g(a_{i_1}) = f(z) = p$, and also
$$f\bigl(b_{i_1}, a_{i_2}, \ldots, a_{i_n}, (b_i)_{i \neq i_1, \ldots, i_n}\bigr) = p.$$
We now perform the same trick, replacing $a_{i_2}$ by $b_{i_2}$, ..., and $a_{i_n}$ by $b_{i_n}$.
We then see that f(b) = p, thus proving that f is constant, which proves
the theorem.
Corollary 2.9. Euclidean n-space Rn is connected, and so is the product

of any number of intervals.
II, §3. COMPACT SPACES
Let X be a set and $\{S_\alpha\}_{\alpha \in A}$ a family of subsets. We say that this family
is a covering of X if its union is equal to X. If X is a topological space,
and $\{U_\alpha\}_{\alpha \in A}$ is a covering, we say it is an open covering if each $U_\alpha$ is
open. If $\{S_\alpha\}_{\alpha \in A}$ is a covering of X, we define a subcovering to be a
covering $\{S_\beta\}_{\beta \in B}$ where B is a subset of A. In particular, a finite
subcovering of $\{S_\alpha\}$ is a covering $\{S_{\alpha_1}, \ldots, S_{\alpha_n}\}$.
Let X be a topological space. We shall say that X is compact if any
open covering of X has a finite subcovering. As usual, we can express a
dual condition relative to closed sets. Let $\{F_\alpha\}_{\alpha \in A}$ be a family of subsets
of X. We say that this family has the finite intersection property if any
finite intersection
$$F_{\alpha_1} \cap \cdots \cap F_{\alpha_n}$$
is not empty.
Proposition 3.1. A topological space X is compact if and only if, for
any family $\{F_\alpha\}_{\alpha \in A}$ of closed sets having the finite intersection property,
the intersection
$$\bigcap_{\alpha \in A} F_\alpha$$
is not empty.
Proof. Assume that X is compact and let $\{F_\alpha\}$ be a family of closed
sets having the finite intersection property. Suppose that the intersection
of this family is empty. Then the complements $\mathcal{C}F_\alpha$ form an open covering
of X, and there is a finite subcovering by open sets $\{\mathcal{C}F_{\alpha_1}, \ldots, \mathcal{C}F_{\alpha_n}\}$.
Taking the complement, we conclude that the intersection
$$F_{\alpha_1} \cap \cdots \cap F_{\alpha_n}$$
is empty, which contradicts the finite intersection property. Hence the
intersection of the family is not empty. The converse is equally clear.
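To see the criterion at work in a space which is not compact, one may consider closed half-lines in the real line. This example is added for illustration and is not taken from the text:

```latex
F_n = [n, \infty) \subset \mathbf{R} \quad (n = 1, 2, \ldots), \qquad
F_{n_1} \cap \cdots \cap F_{n_k} = [\max(n_1, \ldots, n_k), \infty) \neq \emptyset,
\qquad
\bigcap_{n \geq 1} F_n = \emptyset.
```

The family $\{F_n\}$ consists of closed sets with the finite intersection property, but its total intersection is empty, so by Proposition 3.1 the real line is not compact. Equivalently, the complements $(-\infty, n)$ form an open covering with no finite subcovering.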
Proposition 3.2. A continuous image of a compact set is compact.
Proof. Let X be compact, and let $f: X \to Y$ be a continuous map,
which is surjective. Let $\{V_i\}$ be an open covering of Y. Then $\{f^{-1}(V_i)\}$ is
an open covering of X, and there is a finite subcovering
$$\{f^{-1}(V_{i_1}), \ldots, f^{-1}(V_{i_n})\}.$$
It follows that $\{V_{i_1}, \ldots, V_{i_n}\}$ is a covering of Y, as was to be shown.
Proposition 3.3. A closed subspace of a compact space is compact.
Proof. Let X be a compact space and S a closed subspace. Let $\{U_\alpha\}$
be a covering of S by open sets in X. Let U be the complement of S in
X. Then $\{U_\alpha\}$ together with U form an open covering of X, having a
finite subcovering
$$\{U_{\alpha_1}, \ldots, U_{\alpha_n}, U\}.$$
Since U is disjoint from S, it follows that already $U_{\alpha_1}, \ldots, U_{\alpha_n}$ cover S,
thus proving our assertion.
The converse of the preceding assertion is almost true but not quite.
A topological space X is said to be Hausdorff if given points $x, y \in X$
and $x \neq y$ there exist disjoint open sets U, V such that $x \in U$ and $y \in V$.
If X is Hausdorff, then each point of X is obviously closed.
Proposition 3.4. A compact subspace of a Hausdorff space is closed.
Proof. Let S be a compact subset of the Hausdorff space X. We
prove that its complement is open. Let x be in the complement. For
each $y \in S$ there exist disjoint open sets $U_y$, $V_y$ such that $x \in U_y$ and
$y \in V_y$. The family $\{V_y\}_{y \in S}$ covers S and there is a finite subcovering
$$\{V_{y_1}, \ldots, V_{y_n}\}.$$
Then the intersection $U_{y_1} \cap \cdots \cap U_{y_n}$ is open, contains x, and is contained
in the complement of S, thus proving what we want.
A topological space X is said to be normal if it is Hausdorff, and if
given two disjoint closed sets A, B in X there exist disjoint open sets U,
V such that $A \subset U$ and $B \subset V$.
Proposition 3.5. A compact Hausdorff space is normal. In fact, if A, B
are compact subsets of a Hausdorff space, and are disjoint, there exist
disjoint open sets U, V such that $A \subset U$ and $B \subset V$.
Proof. The proof is similar to the previous one, and involves merely
one further application of the same principle. Using the same trick as in
this previous proof, we know that for each $x \in A$ there exist disjoint open
sets $U_x$, $W_x$ such that $x \in U_x$ and $B \subset W_x$. (One would take the finite
union of the open sets $V_{y_1}, \ldots, V_{y_n}$ to obtain $W_x$ in the analogous
situation.) The family of open sets $\{U_x\}_{x \in A}$ covers A, and there exists a finite
subcovering
$$\{U_{x_1}, \ldots, U_{x_m}\}.$$
The open sets $U_{x_1} \cup \cdots \cup U_{x_m}$ and $W_{x_1} \cap \cdots \cap W_{x_m}$ solve our problem.
In the case of Hausdorff spaces, or normal spaces, we say also that

points (or closed sets) can be separated by open sets. The properties of
being Hausdorff or normal are thus called separation properties.
It is clear that a subspace of a Hausdorff space is Hausdorff. The
analogous statement for normal spaces is not necessarily true (cf. Kelley
[Ke], Exercise F, p. 132).
The general notion of a compact space is, in many practical cases,
equivalent with another notion with which the reader is probably already
familiar. We call a space X sequentially compact if it has the Weierstrass-
Bolzano property, namely every sequence $\{x_n\}$ in X has a point of accumulation
(a point c such that given an open neighborhood U of c, there
exist infinitely many n such that $x_n \in U$). As usual, an equivalent condition
is that an infinite subset of X has a point of accumulation. It is an
exercise to prove:
Proposition 3.6. If a topological space has a countable base, then it is
compact if and only if it is sequentially compact.
(Cf. Exercise 19.)
The preceding criterion will not be used in this book.
Proposition 3.7. Compactness implies sequential compactness.
Proof. Let X be compact. It will suffice to prove that an infinite
subset of X has a point of accumulation. Suppose that this is not the
case, and let S be an infinite subset without a point of accumulation.
Given $x \in X$, there exists an open
set $U_x$ containing x but containing only a finite number of the elements
of S. The family $\{U_x\}_{x \in X}$ covers X. Let $\{U_{x_1}, \ldots, U_{x_n}\}$ be a finite
subcovering. We conclude that there is only a finite number of elements of S
lying in the finite union
$$U_{x_1} \cup \cdots \cup U_{x_n} = X.$$
This is a contradiction, which proves our assertion.

The converse is true under important and rather general conditions, as
shown in the next theorem.
Theorem 3.8. Let S be a subset of a metric space, or of a normed

vector space.
(i) S is compact if and only if S is sequentially compact.
(ii) S is compact if and only if S is complete, and given r > 0 there
exists a finite number of open balls of radius r which cover S.
Proof. We have already proved that compactness implies sequential
compactness. Conversely, assume that S is sequentially compact. Then
certainly S is complete, and we shall prove that the other condition
stated in (ii) is satisfied. Suppose it is not. Then there is some r > 0
such that S cannot be covered by a finite number of open balls of radius r.
Let $x_1 \in S$ and let
$B_1$ be the open ball of radius r centered at $x_1$. Then $B_1$ does not contain
S, and there is some $x_2 \in S$, $x_2 \notin B_1$. Proceeding inductively, suppose
that we have found open balls $B_1, \ldots, B_n$ of radius r, and points
$x_1, \ldots, x_n$ with $B_i$ centered at $x_i$ such that $x_{k+1}$ does not lie in
$B_1 \cup \cdots \cup B_k$. We can
then find $x_{n+1}$ which does not lie in $B_1 \cup \cdots \cup B_n$, and we let $B_{n+1}$ be the
open ball of radius r centered at $x_{n+1}$. Let v be a point of accumulation
of the sequence $\{x_n\}$. By definition, there exist positive integers m, k with
k > m such that
$$|x_k - v| < r/2 \quad\text{and}\quad |x_m - v| < r/2.$$
Then $|x_k - x_m| < r$ and this contradicts the property of our sequence $\{x_n\}$
because $x_k$ lies in the ball $B_m$. This proves that S satisfies the condition
of (ii).
Now assume this condition. Let $\{U_i\}_{i \in I}$ be an open covering of S, and
suppose that there is no finite subcovering. We construct a sequence
$\{x_n\}$ in S inductively as follows. We know that S is covered by a finite
number of closed balls of radius 1/2. Hence there exists at least one closed
ball $C_1$ of radius 1/2 such that $C_1 \cap S$ is not covered by a finite number of
$U_i$. We let $x_1$ be a point of $C_1 \cap S$. Suppose that we have obtained a
sequence of closed balls $C_1, \ldots, C_n$
such that $C_n$ has radius $1/2^n$, with a point $x_n \in C_n \cap S$, and such that
$C_n \cap S$ is not covered by a finite number of $U_i$. Since S itself can be
covered by a finite number of closed balls of radius $1/2^{n+1}$, it follows
that $C_n \cap S$ can also be so covered, and hence there exists a closed ball
$C_{n+1}$ of radius $1/2^{n+1}$ and such that $C_{n+1} \cap S$ cannot be covered by a
finite number of $U_i$. We let $x_{n+1}$ be a point of $C_{n+1} \cap S$. This constructs
our sequence as desired. We see that $\{x_n\}$ is a Cauchy sequence in S,
which converges to a point x in S. But x lies in some $U_{i_0}$ which contains
$C_n$ for all sufficiently large n, a contradiction which proves our theorem.
A subset S of a metric space, or a normed vector space, which can be

covered by a finite number of open balls of given radius r > 0 is said to
be totally bounded. We can phrase (ii) by saying that S is compact if and
only if it is complete and totally bounded. A subset of a topological
space is said to be relatively compact if its closure is compact. From (ii)
we get a convenient criterion for relative compactness.
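The inductive choice of centers in the proof of Theorem 3.8 (each new point lying outside all balls chosen so far) can be imitated on a finite sample of points. The sketch below is an illustration under assumptions not made in the text: the set is a finite grid in the unit square with the Euclidean distance, and the names `greedy_centers`, `covered`, `POINTS`, and `CENTERS` are hypothetical.

```python
import math


def greedy_centers(points, r):
    """Pick centers one at a time, each outside the open r-balls around earlier centers."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= r for c in centers):
            centers.append(p)
    return centers


def covered(points, centers, r):
    """True if every point lies in some open ball of radius r around a center."""
    return all(any(math.dist(p, c) < r for c in centers) for p in points)


# Sample of the unit square: an 11 x 11 grid of points.
POINTS = [(i / 10, j / 10) for i in range(11) for j in range(11)]
CENTERS = greedy_centers(POINTS, 0.35)
```

By construction the chosen centers are at mutual distance at least r, and every sample point lies within distance r of some center, so the balls around `CENTERS` cover `POINTS`. For a totally bounded set the analogous process must stop after finitely many steps, which is the content of the contradiction argument in the proof.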
Corollary 3.9. Let S be a subset of a complete normed vector space.

Assume that given r > 0 there exists a finite covering of S by balls of
radius r. Then S is relatively compact.
Proof. The closure $\bar{S}$ of S has the same property, because if S is
covered by a finite number of balls of radius r/2, then the closure of S is
covered by a finite number of balls of radius r (centered at the same
points). Also $\bar{S}$ is complete, being a closed subset of a complete space.
Hence we conclude that the closure of S is
compact.
As an application of Theorem 3.8, we recall that a closed (bounded)
interval in R has the Weierstrass-Bolzano property. Hence it is compact,
and therefore so is any closed bounded subset of R (being a closed subset
of a compact set). The converse is also true, since a compact set is
closed, and must be bounded, otherwise one can find an infinite sequence
tending to infinity, and not having a point of accumulation.
One can also prove the compactness of a closed interval directly from
the least upper bound axiom, as follows. Let a < b, and let $\{U_i\}_{i \in I}$ be an
open covering of [a, b]. Let S be the set of all $x \in [a, b]$ such that [a, x]
admits a finite subcovering. Then S is not empty (because $a \in S$) and is
bounded from above by b. Let c be its least upper bound. Then $c \in U_{i_0}$
for some index $i_0$. If a < c, select a number t with a < t < c such that
the interval [t, c] is contained in $U_{i_0}$. If a = c, let t = a. Then [a, t] can
be covered by a finite number of sets $U_i$, say $U_{i_1}, \ldots, U_{i_n}$. If $c \neq b$, then
$U_{i_0}, U_{i_1}, \ldots, U_{i_n}$ cover an interval [a, c'] with c' > c, a contradiction, proving
that c = b and that [a, b] is compact.
One can generalize to arbitrary compact sets some standard theorems

on closed intervals, e.g.:
Proposition 3.10. Let A be a compact set, and $f: A \to \mathbf{R}$ a continuous
function on A. Then f has a maximum (a point $c \in A$ such that
$f(c) \geq f(x)$ for all $x \in A$).
Proof. The image f(A) is compact, so closed and bounded. The least
upper bound of f(A) lies in f(A), thus proving our assertion.
If A is a subset of a normed vector space, and if $f: A \to F$ is a
continuous map into some normed vector space F, then we say that f is
uniformly continuous on A if given $\epsilon$ there exists $\delta$ such that whenever
$x, y \in A$ and $|x - y| < \delta$, then $|f(x) - f(y)| < \epsilon$. We recall the theorem
from elementary analysis that:
Proposition 3.11. Let A be a compact subset of a normed vector space.
If $f: A \to F$ is a continuous map into a normed vector space, then f is
uniformly continuous. In fact, if A is contained in a subset S of a
normed vector space, if f is defined on S and continuous on A, then
given $\epsilon$ there exists $\delta$ such that if $x \in A$ and $y \in S$ and $|x - y| < \delta$, then
$|f(x) - f(y)| < \epsilon$.
We recall the proof briefly. Given $\epsilon$, for each $x \in A$ we let r(x) > 0 be
such that if $|y - x| < r(x)$, then $|f(y) - f(x)| < \epsilon$. We can cover A by
open balls $B_i$ of radius
$$\delta_i = r(x_i)/2,$$
centered at $x_i$ (i = 1, ..., n). We let $\delta = \min \delta_i$. If $x \in A$, then for some i
we have $|x - x_i| < r(x_i)/2$. If $|y - x| < \delta$, then $|y - x_i| < r(x_i)$ so that
$$|f(y) - f(x)| \leq |f(y) - f(x_i)| + |f(x_i) - f(x)| < 2\epsilon,$$
as was to be shown.
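That compactness cannot simply be dropped from Proposition 3.11 is shown by a familiar example, added here for illustration and not taken from the text. On the non-compact interval (0, 1] the function f(x) = 1/x is continuous, but for the points below the distance tends to 0 while the difference of the values stays equal to 1:

```latex
f(x) = \frac{1}{x} \ \text{ on } (0, 1], \qquad
x_n = \frac{1}{n}, \quad y_n = \frac{1}{n + 1}: \qquad
|x_n - y_n| = \frac{1}{n(n + 1)} \longrightarrow 0,
\qquad
|f(x_n) - f(y_n)| = |n - (n + 1)| = 1.
```

Hence for $\epsilon = 1$ no $\delta$ works, and f is not uniformly continuous on (0, 1].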
The preceding definition of uniform continuity, and the result just
proved, are of course valid for metric spaces, with the usual notation
d(x, y) replacing $|x - y|$. The property which we proved, and which is
slightly stronger than uniform continuity on A, will be called relative
uniform continuity (relative to S, that is).
The only non-trivial theorem of this section is the theorem that a

product of compact spaces is compact. In situations when one can use
sequences, and one takes a finite product of spaces, however, the proof is
immediate. For instance, let E, F be normed vector spaces, and let S, T
be compact subsets of E, F, respectively. Let $\{z_n\}$ be a sequence in
$S \times T$, and write $z_n = (x_n, y_n)$ with $x_n \in S$ and $y_n \in T$. We can find a
subsequence $\{x_{n_i}\}$ converging to a point a in S. We can then find a
subsequence $\{y_{n_{i_k}}\}$ of $\{y_{n_i}\}$ converging to a point b in T. Then the sequence
$\{z_{n_{i_k}}\}$ converges to (a, b) so that $S \times T$ is sequentially compact.
The idea for this proof is to project on the coordinates, and from
coordinatewise convergence, get the convergence in the product space.
However, if we do it for an infinite product, the above proof seems to fail
because we may exhaust all the indices before being through with the
proof. One can still formulate the basic idea so that it essentially carries
over to the most general case. Part of the difficulty in doing this is that
the points of accumulation in the various coordinate spaces are not
uniquely determined. Thus one must find a set theoretic device which
chooses simultaneously a point of accumulation in all coordinate spaces.
The proof below is due to Bourbaki.
Theorem 3.12 (Tychonoff's Theorem). Let $\{X_\alpha\}_{\alpha \in A}$ be a family of compact
spaces. Then the product
$$X = \prod_{\alpha \in A} X_\alpha$$
is compact.
Proof. Let $\mathcal{F} = \{F_i\}_{i \in I}$ be a family of closed subsets of the product,
having the finite intersection property. The set of families of subsets of X (not
necessarily closed) containing our given family $\mathcal{F}$ and having the finite
intersection property is ordered by ascending inclusion. One verifies immediately
by taking the usual union that it is inductively ordered. By Zorn's lemma,
$\mathcal{F}$ is therefore contained in a maximal family $\mathcal{F}^*$ having the finite intersection
property. Let
$$\pi_\alpha: X \to X_\alpha$$
be the projection on the $\alpha$-th factor. For each $\alpha$, the family of closed sets
$$\{\overline{\pi_\alpha(F)}\}, \quad F \in \mathcal{F}^*,$$
has the finite intersection property, and consequently, $X_\alpha$ being compact,
there exists an
element $x_\alpha$ in each set $\overline{\pi_\alpha(F)}$ for all $F \in \mathcal{F}^*$. Let $x = (x_\alpha)$. We contend
that x belongs to all sets $F \in \mathcal{F}$. This will prove our theorem.
To prove our contention, we observe that the intersection of a finite
number of sets in $\mathcal{F}^*$ also lies in $\mathcal{F}^*$ because of the maximality of $\mathcal{F}^*$.
Let U be an open set of X containing x, of the form
$$U = U_{\alpha_1} \times \cdots \times U_{\alpha_n} \times \prod_{\alpha \neq \alpha_i} X_\alpha$$
with each $U_{\alpha_i}$ open in $X_{\alpha_i}$. Then $U_{\alpha_i}$ contains $x_{\alpha_i}$ for all i, and therefore
$U_{\alpha_i}$ contains a point of $\pi_{\alpha_i}(F)$ for all $F \in \mathcal{F}^*$. Hence
$$\pi_{\alpha_i}^{-1}(U_{\alpha_i})$$
contains a point of F for each $F \in \mathcal{F}^*$. Because of the maximality of $\mathcal{F}^*$
with respect to the finite intersection property, it follows that
$$\pi_{\alpha_i}^{-1}(U_{\alpha_i})$$
belongs to $\mathcal{F}^*$, and hence the finite intersection of these sets for
i = 1, ..., n
also belongs to $\mathcal{F}^*$. But this finite intersection is nothing else but our
set U, and hence U intersects each F in $\mathcal{F}^*$, so a fortiori each $F \in \mathcal{F}$.
Hence x lies in the closure of each $F \in \mathcal{F}$, whence $x \in F$ for all $F \in \mathcal{F}$
since each such F is closed, as was to be shown.
Corollary 3.13. A subset of Rn is compact if and only if it is closed and

bounded.
Proof. Let S be a subset of $\mathbf{R}^n$ and assume first that S is closed and
bounded. Then S is contained in the product of a finite number of
closed intervals, and is therefore a closed subset of a compact space. It is
thus compact. Conversely, if it is compact, it is closed, and it must be
bounded; otherwise, one can find a sequence of elements in S going out
to infinity, and not having a point of accumulation.
Corollary 3.14. All norms on Rn are equivalent.
Proof. Let $\| \ \|$ be the sup norm, and $| \ |$ any other norm. It will
suffice to prove that these two norms are equivalent. If $e_1, \ldots, e_n$ are the
usual unit vectors of $\mathbf{R}^n$, then for $x = x_1 e_1 + \cdots + x_n e_n$ we get
$$|x| \leq \sum_{i=1}^{n} |x_i| \, |e_i| \leq C \|x\|$$
with $C = n \cdot \max |e_i|$. This proves one of the desired inequalities, and also
shows that the other norm is continuous, because
$$\bigl| |x| - |y| \bigr| \leq |x - y| \leq C \|x - y\|.$$
Let $S_1$ be the unit sphere centered at the origin for the sup norm. Then
$S_1$ is closed and bounded, so compact, and the other norm has a minimum
on $S_1$, say at v. We have $|v| > 0$ since $v \neq 0$. Thus for any $x \in \mathbf{R}^n$, $x \neq 0$, we get
$$\left| \frac{x}{\|x\|} \right| \geq |v|$$
and hence $|v| \, \|x\| \leq |x|$.
This yields the other inequality, and proves our corollary.
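As a check on the two constants produced by this proof, one can work them out when the other norm is the Euclidean norm. This worked case is added for illustration; the notation $|x|_2$ for the Euclidean norm is an assumption not used in the text.

```latex
|x|_2 = \Bigl( \sum_{i=1}^{n} x_i^2 \Bigr)^{1/2}, \qquad
|e_i|_2 = 1 \ \Rightarrow\ C = n \cdot \max_i |e_i|_2 = n, \qquad
\min_{\|v\| = 1} |v|_2 = 1 \ \ (\text{attained at } v = e_1).
% Hence the proof yields
\|x\| \leq |x|_2 \leq n \, \|x\| \qquad \text{for all } x \in \mathbf{R}^n.
```

The minimum equals 1 because $|v|_2 \geq \max_i |v_i| = 1$ on the sup norm unit sphere. The upper constant n given by the proof is not the best possible; the sharp inequality is $|x|_2 \leq \sqrt{n} \, \|x\|$, but any constant suffices for equivalence of norms.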
Using coordinates, we see that Corollary 3.14 also applies to a finite

dimensional vector space. A closed subset of a complete metric space is
complete, and a complete subset of a metric space is closed. We con-
clude that a finite dimensional subspace of a normed vector space is
complete, and therefore closed.
A space X is said to be locally compact if every point has a compact
neighborhood. For instance, Rn is locally compact, and so is any finite
dimensional vector space. It is clear that a normed vector space is locally
compact if and only if the closed unit ball is compact. (If the space is
locally compact, then some closed ball of radius r > 0 is compact, and
hence the unit ball is compact by multiplication with a positive number.)
Corollary 3.15 (F. Riesz). A normed vector space is locally compact if
and only if it is finite dimensional.
Proof. Let E be a locally compact normed vector space, and let B be
the closed ball of radius 1 centered at 0. We can find a finite number of
points $x_1, \ldots, x_n \in B$ such that B is covered by the open balls of radius $\tfrac12$
centered at these points. We contend that $x_1, \ldots, x_n$ generate E. Let F be
the subspace generated by $x_1, \ldots, x_n$. Then F is finite dimensional, hence
closed in E as a trivial consequence of Corollary 3.14. Suppose that $x \in E$
and $x \notin F$. Let
$$d(x, F) = \inf_{y \in F} |x - y|.$$
Drawing a closed ball around x intersecting F, and using the fact that
the intersection of F and this ball is compact, we conclude that there is
some $z \in F$ such that $d(x, F) = |x - z|$, and we have $x - z \ne 0$ since
$x \notin F$. Then there is some $x_i$ such that
$$\left| \frac{x - z}{|x - z|} - x_i \right| < \frac12,$$
and consequently that
$$\bigl| x - z - |x - z|\, x_i \bigr| < \frac{|x - z|}{2}.$$
However, $z + |x - z|\, x_i$ lies in F, and by definition of z such that
$d(x, F) = |x - z|$
we conclude that the left-hand side is $\ge |x - z|$. This is a contradiction
which proves our corollary.
Let X be a locally compact Hausdorff space. One can construct a
compact space by adjoining to X a point "at infinity" as follows. Let p
be some point not in X and let X' be the union of X and {p}. We
define a base of open sets in X' by throwing into this base all subsets of
X which are open in X, and the complements in X' of compact sets in
X. That this defines a base is clear, and one also verifies at once that X'
is then compact. It is called the one point compactification of X.
It is easy to see that the one point compactification of $\mathbf{R}$ is
homeomorphic to a circle. The one point compactification of the plane $\mathbf{R}^2$ is
homeomorphic to the sphere. In general, the one point compactification
of $\mathbf{R}^n$ is homeomorphic to the n-sphere (i.e. the set of all $x \in \mathbf{R}^{n+1}$ such
that $|x| = 1$, where $|\ |$ is the euclidean norm).
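For the real line, one way to make the homeomorphism with the circle concrete is the inverse stereographic projection. The sketch below is an illustration only: the choice of this particular map, the use of `math.inf` to stand for the point p at infinity, and the function names are assumptions, not part of the text.

```python
import math

def to_circle(t):
    """Send a real number t to the unit circle; math.inf stands for the point p."""
    if math.isinf(t):
        return (0.0, 1.0)          # the point at infinity goes to the north pole
    d = 1.0 + t * t
    # (2t)^2 + (t^2 - 1)^2 = (t^2 + 1)^2, so the image lies on the unit circle
    return (2.0 * t / d, (t * t - 1.0) / d)

def from_circle(x, y):
    """Inverse map: the north pole (0, 1) goes back to the point at infinity."""
    if x == 0.0 and y == 1.0:
        return math.inf
    return x / (1.0 - y)
```

As t tends to plus or minus infinity, `to_circle(t)` tends to (0, 1). This matches the topology of X': the neighborhoods of p are the complements of compact sets, which for the real line contain all t of sufficiently large absolute value.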
II, §4. SEPARATION BY CONTINUOUS FUNCTIONS
We are concerned throughout this section with a normal space X and
the manner by which one can separate two disjoint closed sets by means
of a continuous function.
Lemma 4.1. Let X be a normal space. If A is closed in X and $A \subset U$
is contained in an open set U, then there exists an open set $U_1$ such that
$$A \subset U_1 \subset \overline{U}_1 \subset U.$$
Proof. Let B be the complement of U. By the definition of normality,
there exist disjoint open sets $U_1$, $V_1$ such that $A \subset U_1$ and $B \subset V_1$. It is
clear that $U_1$ satisfies our requirements, because $\overline{U}_1$ is contained in
the complement of $V_1$, which is contained in U.
Theorem 4.2 (Urysohn's Lemma). Let X be a normal space and let A,
B be disjoint closed subsets. Then there exists a continuous function f
on X with values in the interval [0, 1] such that f(A) = 0 and f(B) = 1.
Proof. In a metric space, which is the most important in practice, one
can give a trivial proof. Cf. Exercise 7. We now give the proof in
general. Let $U_1$ be the complement of B so that $A \subset U_1$. We find $U_{1/2}$
such that
$$A \subset U_{1/2} \subset \overline{U}_{1/2} \subset U_1.$$
We then find $U_{1/4}$ and $U_{3/4}$ such that
$$A \subset U_{1/4} \subset \overline{U}_{1/4} \subset U_{1/2} \subset \overline{U}_{1/2} \subset U_{3/4} \subset \overline{U}_{3/4} \subset U_1.$$
Inductively, for each integer k with $0 \le k \le 2^n$, we find $U_{k/2^n}$ such that if
$r < s$, then $U_r \subset \overline{U}_r \subset U_s$. We then define the function f by
f(x) = 1 if $x \in B$,
f(x) = inf of all r such that $x \in U_r$ if $x \notin B$.
It is then essentially clear that f is continuous. We carry out the details.
It will suffice to prove that for numbers a, b such that $0 < a \le 1$ and
$0 \le b < 1$ the inverse images of the half-open intervals
$$f^{-1}[0, a) \quad\text{and}\quad f^{-1}(b, 1]$$
are open. In fact, we have
$$f^{-1}[0, a) = \bigcup_{r<a} U_r$$
because $f(x) < a$ if and only if x lies in some $U_r$ with $r < a$. Similarly, we
have $f(x) > b$ if and only if $x \notin \overline{U}_r$ for some $r > b$, so that
$$f^{-1}(b, 1] = \bigcup_{r>b} \mathscr{C}\,\overline{U}_r,$$
where $\mathscr{C}$ denotes the complement in X.
This proves our theorem.
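In the metric case mentioned at the start of the proof, the separating function can be written down directly as d(x, A)/(d(x, A) + d(x, B)), as in Exercise 7. The sketch below evaluates it for finite sets of points in the plane. The restriction to finite, non-empty, disjoint sets and the function names are assumptions made for the example; `math.dist` requires Python 3.8 or later.

```python
import math

def dist_to_set(x, S):
    """d(x, S) = inf of d(x, s) over s in S, for a finite non-empty set S of points."""
    return min(math.dist(x, s) for s in S)

def urysohn(x, A, B):
    """Value at x of d(x, A) / (d(x, A) + d(x, B)).

    A and B are assumed finite, non-empty and disjoint, so the
    denominator never vanishes. The function is 0 on A and 1 on B.
    """
    dA = dist_to_set(x, A)
    dB = dist_to_set(x, B)
    return dA / (dA + dB)
```

With `A = [(0.0, 0.0), (1.0, 0.0)]` and `B = [(0.0, 3.0), (4.0, 4.0)]`, the function vanishes at the points of A, equals 1 at the points of B, and takes values in [0, 1] elsewhere. Continuity comes from the continuity of x ↦ d(x, A), which is Exercise 6.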
Since a compact Hausdorff space is normal, Urysohn's lemma applies
in this case. One needs it frequently in the locally compact case in the
following form.
Corollary 4.3. Let X be a locally compact Hausdorff space, and K a
compact subset. There exists a continuous function g on X which is 1
on K and which is equal to 0 outside a compact set.
Proof. Each $x \in K$ has an open neighborhood $V_x$ with compact closure.
A finite number of such neighborhoods $V_{x_1}, \ldots, V_{x_n}$ covers K. Let
$$V = V_{x_1} \cup \cdots \cup V_{x_n}.$$
Then the closure of V is compact. There exists a continuous function
$g \ge 0$ on $\overline{V}$ (compact Hausdorff, hence normal) which is 1 on K and 0
outside V, i.e. 0 on $\overline{V} \cap \mathscr{C}V$. We define g to be 0 on the complement of $\overline{V}$
in X. Then g is continuous on the closed set $\overline{V}$, and is identically 0 on
the closed set $\mathscr{C}V$. Since these two closed sets cover X, it follows that g
is continuous on X. This proves our corollary.
Theorem 4.4 (Tietze Extension Theorem). Let A be a closed subset of a
normal space X and let f be a continuous (real valued) function on A.
Then there exists a continuous function f* on X whose restriction to A
is equal to f. If f has values in [0, 1], then we can choose f* to have
values in [0, 1] also.
Proof. Assume first that f has values in [0, 1]. If A, B are disjoint
closed subsets of X, we denote by $g_{A,B}$ a function with values in [0, 1]
such that g(A) = 0 and g(B) = 1. Such a function exists by Theorem 4.2.
We shall now define functions $f_n$ on A and $g_n$ on X.
We let $f_0 = f$ and define sets $A_0$, $B_0$ by the conditions:
$$A_0 = \{x \in A \text{ such that } f(x) \le \tfrac13\},$$
$$B_0 = \{x \in A \text{ such that } f(x) \ge \tfrac23\}.$$
We let $g_0 = \tfrac13\, g_{A_0,B_0}$ and define $f_1 = f_0 - g_0$. Inductively, suppose that we
have defined $f_n$; we let
$$A_n = \{x \in A \text{ such that } f_n(x) \le \tfrac13 (\tfrac23)^n\},$$
$$B_n = \{x \in A \text{ such that } f_n(x) \ge \tfrac23 (\tfrac23)^n\}.$$
We then define
$$g_n = \tfrac13 (\tfrac23)^n\, g_{A_n,B_n}$$
and let $f_{n+1} = f_n - g_n$. (Here of course, we understand by $g_n$ its
restriction to A.) Then in particular:
$$f_{n+1} = f - (g_0 + \cdots + g_n).$$
We have
$$(*)\qquad 0 \le g_n \le \tfrac13 (\tfrac23)^n$$
and
$$(*)\qquad 0 \le f_n \le (\tfrac23)^n.$$
The first inequality is clear. The second is proved by induction. It is
clear for n = 0. Let n > 0. One distinguishes the three cases in which for
a given $x \in A$ we have $x \in A_n$, or $x \notin A_n$ but $x \notin B_n$, or $x \in B_n$. The
desired inequality for $f_{n+1}$ is then obvious in each case, using the inductive
hypothesis.
From our inequalities (*), we then conclude that the series
$$\sum_{n=0}^{\infty} g_n$$
converges pointwise, and furthermore converges to f on A. The uniform
bounds imply at once that the limit function is continuous, thus proving
Theorem 4.4, when f has values in [0, 1].
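The iteration in this proof is explicit enough to run. The sketch below is an illustration under simplifying assumptions that are not in the text: X is the real line, A is a finite set of points (hence closed), f is given as a dictionary with values in [0, 1], and the Urysohn functions $g_{A_n,B_n}$ are taken to be the metric ones of Exercise 7, with the conventions that the function is 0 when $B_n$ is empty and 1 when only $A_n$ is empty. All names are hypothetical.

```python
def dist(x, pts):
    """Distance from the real number x to a finite set pts (infinite if pts is empty)."""
    return min((abs(x - p) for p in pts), default=float("inf"))

def urysohn(x, A, B):
    """A function with values in [0, 1] which is 0 on A and 1 on B (A, B finite, disjoint)."""
    if not B:
        return 0.0
    if not A:
        return 1.0
    dA, dB = dist(x, A), dist(x, B)
    return dA / (dA + dB)

def tietze_extend(f, steps=40):
    """Partial sum g_0 + ... + g_{steps-1} of the series in the proof.

    f maps finitely many points of the real line to values in [0, 1].
    Returns a function F on the real line; on the domain of f it differs
    from f by at most (2/3)**steps, up to rounding.
    """
    fn = dict(f)                      # current f_n, defined on A only
    layers = []                       # (weight, A_n, B_n) for each g_n
    for n in range(steps):
        c = (2.0 / 3.0) ** n
        A_n = [a for a, v in fn.items() if v <= c / 3.0]
        B_n = [a for a, v in fn.items() if v >= 2.0 * c / 3.0]
        layers.append((c / 3.0, A_n, B_n))
        for a in fn:                  # f_{n+1} = f_n - g_n on A
            fn[a] -= (c / 3.0) * urysohn(a, A_n, B_n)

    def F(x):
        return sum(w * urysohn(x, A_n, B_n) for w, A_n, B_n in layers)

    return F
```

For example, `F = tietze_extend({0.0: 0.2, 1.0: 0.9, 2.5: 0.5, 4.0: 0.0, 5.0: 1.0})` gives a function on the whole line which agrees with the prescribed values up to the truncation error. The weights $\tfrac13 (\tfrac23)^n$ sum to 1, which is why the extension stays in [0, 1].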
Remark 1. The restriction to the interval [0, 1] is of course unnecessary,
and the theorem extends at once to any other closed bounded
interval, for instance by mapping such an interval linearly on [0, 1].
Now suppose that f is unbounded. Using the arctangent map we
reduce the theorem to the case when f takes values in the open interval
(-1, 1) and we must then know that the extension can be so chosen that
its values also lie in the open interval (-1, 1). Let B be the closed
set where the extension f* (which we have constructed with values in
[-1, 1]) takes on the values 1 or -1. Then A and B are disjoint, so
that by Urysohn's lemma there exists a continuous function h on X with
values in [0, 1] such that h is 1 on A and 0 on B. Then hf* has values
in the open interval (-1, 1), as desired. This concludes the proof of
Theorem 4.4.
Remark 2. The theorem also holds in the complex case, dealing separately
with the real and imaginary parts. The extra condition on the
restriction of the values can then be formulated analogously by requiring
that
$$\|f^*\| \le \|f\|.$$
Indeed, suppose that we have extended f to a bounded continuous complex
valued function g. Let $b = \|f\|$. Let h be the function such that
$h(z) = z$ if $|z| \le b$, and $h(z) = bz/|z|$ if $|z| > b$. Then h is continuous,
$\|h\| \le b$, and $h \circ g$ fulfills our requirement.
II, §5. EXERCISES
1. (a) Let X, Y be compact metric spaces. Prove that a mapping $f: X \to Y$ is
continuous if and only if its graph is closed in $X \times Y$.
(b) Let Y be a complete metric space, and let X be a metric space. Let A be
a subset of X. Let $f: A \to Y$ be a mapping that is uniformly continuous.
Let $\overline{A}$ be the closure of A in X. Show that there exists a unique extension
of f to a continuous map $\overline{f}: \overline{A} \to Y$, and that $\overline{f}$ is uniformly continuous.
You may assume that X, Y are subsets of a Banach space if you wish,
in order to write the distance function in terms of the absolute value sign.
2. Seminorms. Let E be a vector space. A function $u: E \to \mathbf{R}$ is called a
seminorm if it satisfies the same conditions as a norm except that we allow
u(x) = 0 without necessarily having x = 0. In other words, u satisfies the
following conditions:
SN 1. We have $u(x) \ge 0$ for all $x \in E$.
SN 2. If $x \in E$ and a is a number, then $u(ax) = |a|\,u(x)$.
SN 3. We have $u(x + y) \le u(x) + u(y)$ for all $x, y \in E$.
We also denote a seminorm by the symbols $|\ |$.
(a) If $|\ |$ is a seminorm on E, show that the set $E_0$ of elements $x \in E$ with
$|x| = 0$ is a subspace.
(b) Define open balls with respect to a seminorm as with a norm. Show that
the topology whose base is the family of open balls is Hausdorff if and
only if the seminorm is a norm.
(c) Let $\{u_n\}$ be a sequence of seminorms on E such that the values $u_n(x)$ are
bounded. Let $\{a_n\}$ be a sequence of positive numbers such that $\sum a_n$
converges. Show that $\sum a_n u_n$ is a seminorm.
(d) Let $\{u_i\}_{i \in I}$ be a family of seminorms on a vector space E. Let $x_0 \in E$ and
let $i_1, \ldots, i_n$ be a finite number of indices. Let $r > 0$. We call the set of
all $x \in E$ such that
$$u_{i_k}(x - x_0) < r, \qquad k = 1, \ldots, n,$$
a basic open set. Show that the family of basic open sets is a base
for a topology on E, which is said to be determined by the family of
seminorms.
3. (a) Let $l^1$ be the set of all sequences $\alpha = \{a_n\}$ of numbers (say, real) such that
$\sum |a_n|$ converges. Define
$$|\alpha| = \sum_{n=1}^{\infty} |a_n|.$$
Show that this is a norm on $l^1$, and that $l^1$ is complete under this norm.
(b) Let $\beta = \{b_n\}$ be a fixed sequence in $l^1$. Show that the set of all $\alpha \in l^1$
such that $|a_n| \le |b_n|$ for all n is compact. Show that the unit sphere in $l^1$ is not
compact.
4. Let $\alpha$ be a real number, $0 < \alpha \le 1$. A real valued function f on [0, 1] is said
to satisfy a Hölder condition of order $\alpha$ if there is a constant C such that for
all x, y we have
$$|f(x) - f(y)| \le C|x - y|^{\alpha}.$$
For such a function, define
$$\|f\|_{\alpha} = \sup_{x} |f(x)| + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^{\alpha}}.$$
(a) Show that the set of functions satisfying such a Hölder condition is a
vector space, and that $\|\ \|_{\alpha}$ is a norm on this space.
(b) Show that the set of functions f with $\|f\|_{\alpha} \le 1$ is a compact subset of
C([0, 1]).
5. Metric spaces. (a) Let X be a metric space with distance function d. Define
$d'(x, y) = \min\{1, d(x, y)\}$. Show that d' is a distance function, and that the
notion of convergence and limit with respect to d' is the same as with
respect to d.
(b) As in normed vector spaces, one can define Cauchy sequences, i.e.
sequences $\{x_n\}$ such that given $\varepsilon$, there exists N such that for all $m, n \ge N$
we have $d(x_n, x_m) < \varepsilon$. A metric space is called complete if every Cauchy
sequence converges. Show that if a metric space X as in part (a) is
complete with respect to d, then it is complete with respect to d'.
(c) For each $x \in X$ define the function $f_x$ on X by
$$f_x(y) = d(x, y).$$
Let $\|\ \|$ be the sup norm. Show that
$$d(x, y) = \|f_x - f_y\|.$$
Let a be a fixed element of X and let $g_x = f_x - f_a$. Show that the map
$x \mapsto g_x$ is a distance-preserving embedding of X into the normed vector
space of bounded functions on X. (If the metric is bounded, you can use
$f_x$ instead of $g_x$.) Thus one need not fuss too much with abstract metric
spaces. Besides, almost all metric spaces which occur naturally are in fact
given as subsets of normed vector spaces.
A topological space is said to be metrizable if there exists a metric
such that the open balls form a basis for the topology. Such a metric is
said to be compatible with the topology.
6. Let A be a subset of a metric space X. For each $x \in X$, let
$$d(x, A) = \inf d(x, y)$$
for all $y \in A$. Show that the map
$$x \mapsto d(x, A)$$
is a continuous function on X, and that d(x, A) = 0 if and only if x lies in the
closure of A. We call d(x, A) the distance from x to A.
7. (a) Show that a metrizable space is normal. [Hint: Let A, B be disjoint
closed subsets. Let U be the set of x such that d(x, A) < d(x, B) and let V
be the set of x such that d(x, B) < d(x, A).]
(b) If A, B are disjoint closed subsets of a metric space, show that the
function
$$x \mapsto d(x, A)/\bigl(d(x, A) + d(x, B)\bigr)$$
can be used to prove Urysohn's lemma.
8. Let X be a topological space and E a normed vector space. Let M(X, E) be

the set of all maps of X into E and C(X, E) the space of all continuous maps
of X into E. Let B(X, E) be the space of all bounded maps, and BC(X, E)
the space of bounded continuous maps.
(a) Show that BC(X, E) is closed in B(X, E).
(b) Suppose that E is complete, i.e. a Banach space. Show that B(X, E) is
complete, with the sup norm.
(c) If X is compact, show that C(X, E) = BC(X, E).
9. Uniform convergence on compact sets. Let X be a Hausdorff space. Let
M(X, E) be the space of maps of X into a Banach space E. A sequence $\{f_n\}$
in this space is said to be uniformly Cauchy on compact subsets if given a
compact set K and $\varepsilon > 0$, there exists N such that for $m, n \ge N$, we have
$$\|f_n - f_m\|_K < \varepsilon,$$
where $\|\ \|_K$ is the sup norm on K. In other words, the sequence restricted to
K is uniformly Cauchy. The sequence is said to be uniformly convergent on
compact sets if there is some map f having the following property. Given a
compact set K and $\varepsilon$, there exists N such that for $n \ge N$, we have
$$\|f_n - f\|_K < \varepsilon.$$
In other words, the sequence restricted to K is uniformly convergent. We
shall now make M(X, E) into a metric space for which the above convergence
is the same as convergence with respect to this metric, in certain cases.
A sequence $\{K_i\}$ of compact subsets of X is said to be exhaustive if their
union is equal to X, and if every compact subset of X is contained in some
$K_i$. We assume that there exists such a sequence $\{K_i\}$.
(a) Define
$$d(f) = \sum_{i=1}^{\infty} 2^{-i} \min(1, \|f\|_{K_i}).$$
If f is unbounded on K, then we set $\|f\|_K = \infty$ and $\min(1, \|f\|_K) = 1$.
Show that d(f) satisfies two of the properties of a norm, namely:
d(f) = 0 if and only if f = 0;
$d(f + g) \le d(f) + d(g)$.
(b) Define d(f, g) by d(f - g). Show that d(f, g) is a metric on M(X, E).
(c) Show that
and
(d) Show that a sequence $\{f_n\}$ converges uniformly on compact sets if and
only if it converges in the above metric.
(e) Let K be a compact set and $\varepsilon > 0$. Given f, let $V(f, K, \varepsilon)$ be the set of
all maps g such that $\|f - g\|_K < \varepsilon$. Show that $V(f, K, \varepsilon)$ is open in the
topology defined by the metric. Show that the family of all such open
sets for all choices of f, K, $\varepsilon$ is a base for the topology. This proves that
the topology does not depend on the choice of exhaustive sequence $\{K_i\}$.
(f) If E is complete, i.e. a Banach space, show that M(X, E) is complete in
the metric defined above.
(g) If X is locally compact, show that the space of continuous maps C(X, E)
is closed in M(X, E) for the metric.
10. Let U be the open unit disc in the plane. Show that there is an exhaustive
sequence of compact subsets of U.
11. Let U be a connected open set in the plane (or in Euclidean space $\mathbf{R}^k$). Show
that there is an exhaustive sequence of compact subsets of U.
12. Let U be an open subset of a normed vector space. Show that U is con-
nected if and only if U is arcwise connected.
13. The diagonal $\Delta$ in a product $X \times X$ is the set of all points (x, x).
(a) Show that a space X is Hausdorff if and only if the diagonal is closed in
X x X.
(b) Show that a product of Hausdorff spaces is Hausdorff.
14. If A is a subspace of a space X, we define the boundary of A (denoted by $\partial A$)
to be the set of all x such that any open neighborhood U of x contains a
point of A and a point not in A. In other words, $\partial A = \overline{A} \cap \overline{\mathscr{C}A}$,
where $\mathscr{C}A$ is the complement of A.
(a) Show that $\partial(A \cup B) \subset \partial A \cup \partial B$.
(b) Show that $\partial(A \cap B) \subset \partial A \cup \partial B$.
(c) Let X, Y be topological spaces, and let A be a subset of X, B a subset of
Y. Show that
$$\partial(A \times B) = (\partial A \times \overline{B}) \cup (\overline{A} \times \partial B).$$
(d) Let A be a subset of a complete normed vector space E. Let $x \in A$ and
let y be in the complement of A. Show that there exists a point on the
line segment between x and y which lies on the boundary of A. (The line
segment consists of all points $x + t(y - x)$ with $0 \le t \le 1$.)
Separable Spaces
15. A topological space having a countable base for its open sets is called
separable. Show that a separable space has a countable dense subset.
16. (a) If X is a metric space and has a countable dense subset, then X is
separable.
(b) A compact metric space is separable.
17. (a) Every open covering of a separable space has a countable subcovering.
(b) A disjoint collection of open sets in a separable space is countable.
(c) A base for the open sets of a separable space contains a countable base.
18. A denumerable product of separable (resp. metric) spaces is separable (resp.

metric).
19. Let X be separable. Show that the following conditions are equivalent:
(a) X is compact.
(b) Every sequence {xn} in X has at least one point of accumulation, that is
X is sequentially compact.
(c) Every decreasing sequence {An} of non-empty closed sets has a nonempty
intersection.
20. Prove that a normal separable space X is metrizable (Urysohn metrization
theorem). [Hint: Let $\{U_n\}$ be a countable base for the topology. Let $(U_{n_i}, U_{m_i})$
be an enumeration of all pairs of elements in this base such that $\overline{U}_{n_i} \subset U_{m_i}$.
For each i let $f_i$ be a continuous function satisfying $0 \le f_i \le 1$ and such that
$f_i$ is 0 on $\overline{U}_{n_i}$ and 1 on the complement of $U_{m_i}$. Let
$$d(x, y) = \sum_{i=1}^{\infty} \frac{1}{2^i}\, |f_i(x) - f_i(y)|.]$$
Show that d is a metric and that the identity mapping is continuous with
respect to the given topology on X and the topology obtained from the
metric. You will use the fact that given $x \in X$ and some open set $U_m$ in the
base containing x, there exists another set $U_n$ in the base such that
$$x \in U_n \subset \overline{U}_n \subset U_m.$$
21. Regular spaces. A topological space X is called regular if it is Hausdorff, and
if given a point x and a closed set A not containing x, there exist disjoint
open sets U, V such that $x \in U$ and $A \subset V$.
(a) A subspace of a regular space is regular.
(b) Let X be a topological space. If every point has a closed neighborhood
which is regular, then X is regular.
(c) Every locally compact Hausdorff space is regular.
(d) If X is separable regular, show that every point x has a sequence of open
neighborhoods $U_n$ such that:
(i) $\overline{U}_{n+1} \subset U_n$;
(ii) $\{x\} = \bigcap U_n$.
The following exercises are of somewhat less general interest than the preced-
ing ones (but some are more amusing).
22. Proper maps. Let X, Y be topological spaces and $f: X \to Y$ a map. We say
that f is closed if f maps closed sets into closed sets. We say that f is proper
if f is continuous and if for every topological space Z the map
$$f \times 1_Z = f_Z : X \times Z \to Y \times Z$$
given by $f_Z(x, z) = (f(x), z)$ is closed.
(a) Show that a proper map is closed.
(b) For each $i = 1, \ldots, n$ let $f_i: X_i \to Y_i$ be a continuous map. Assume that $X_i$
is not empty for each i. Let $f: \prod X_i \to \prod Y_i$ be the product map. Show
that f is proper if and only if all $f_i$ are proper.
(c) If $f: X \to Y$ is proper and A is closed in X, show that $f|A$ is proper.
23. Let $f: X \to X'$ and $g: X' \to X''$ be continuous maps. Prove:
(a) If f and g are proper, so is $g \circ f$.
(b) If $g \circ f$ is proper and f is surjective, then g is proper.
(c) If $g \circ f$ is proper and g is injective, then f is proper.
(d) If $g \circ f$ is proper and X' is Hausdorff, then f is proper.
24. Let X be a topological space, {p} a set consisting of one element p. The map
$f: X \to \{p\}$ is proper if and only if X is compact. [Hint: Assume that f is
proper. To show that X is compact, let $\{S_\alpha\}$ be a family of non-empty closed
sets having the finite intersection property. Let $Y = X \cup \{p\}$, where p is
disjoint from X. Define a base for a topology of Y by letting a set be in this
base if it is of type $S_\alpha \cup \{p\}$, or if it is an arbitrary subset of X. Show this is
a base. The projection $\pi: X \times Y \to Y$ is a closed map. Let D be the subset of
$X \times Y$ consisting of all pairs (x, x) with $x \in X$. Then $\pi(\overline{D})$ is closed and
therefore contains p. Hence there exists $x \in X$ such that $(x, p) \in \overline{D}$, whence
given an open U in X containing x, and any $S_\alpha$, the set $U \times (S_\alpha \cup \{p\})$
intersects D, whence U intersects $S_\alpha$, and x lies in $\bigcap S_\alpha$.]
25. Let $f: X \to Y$ be a continuous map. Show that the following properties are
equivalent:
(a) f is proper.
(b) f is closed and for each $y \in Y$ the set $f^{-1}(y)$ is compact.
26. Let $f: X \to Y$ be proper. If B is a compact subset of Y, then $f^{-1}(B)$ is
compact.
27. (The marriage problem so baptized by Hermann Weyl.) Let B be a set of boys,
and assume that each boy b knows a finite set of girls $G_b$. The problem is to
marry each boy to a girl of his acquaintance, injectively. A necessary
condition is that each set of n boys know collectively at least n girls. Prove that
this condition is sufficient. [Hint: First assume that B is finite, and use
induction. Let n > 1. If for all $1 \le k < n$ each set of k boys knows > k girls,
marry off one boy and refer the others to the induction hypothesis. If for
some k with $1 \le k < n$ there exists a subset of k boys knowing exactly k girls,
marry them by induction. The remaining n - k boys satisfy the induction
hypothesis with respect to the remaining girls (obvious!) and thus the case of
finite B is settled. For the infinite case, which is really the relevant problem
here, take the Cartesian product $\prod G_b$ over all $b \in B$, each $G_b$ being finite,
discrete, and use Tychonoff's theorem. For this elegant proof, cf. Halmos
and Vaughan, Amer. J. Math. January 1950, pp. 214-215.]
28. The Cantor set. Let K be the subset of [0, 1] consisting of all numbers
having a trecimal expansion
$$\sum_{n=1}^{\infty} \frac{a_n}{3^n},$$
where $a_n = 0$ or $a_n = 2$. This set is called the Cantor set. Show that K is
compact. Show that the complement of K consists of a denumerable union
of intervals, and that the sum of the lengths of these intervals is 1. Show that
the connected component of each point in K is the point itself. (One says
that K is totally disconnected.)
[It can be shown that a compact metric space is always a continuous

image of a Cantor set, and also that a totally disconnected compact metric
space is homeomorphic to a Cantor set. Cf. books on general topology.
The Cantor set has measure 0, is not countable, and is a rich source for
counterexamples.]
29. Peano curve. Let K be the Cantor set of the preceding exercise. Let
$S = [0, 1] \times [0, 1]$ be the unit square. Let $f: K \to S$ be the map which to each
element $\sum a_n/3^n$ of the Cantor set assigns the pair of numbers
$$\left( \sum \frac{b_{2n+1}}{2^n},\ \sum \frac{b_{2n}}{2^n} \right),$$
where $b_m = a_m/2$. Show that f is well defined. Show that f is surjective and
continuous. One can then extend f to a continuous map of the interval onto
the square. This is called a Peano curve. Note that the interval has
dimension 1 whereas its image under the continuous map f has dimension 2. This
caused quite a sensation at the end of the nineteenth century when it was
discovered by Peano.
30. The semi parallelogram law (Bruhat-Tits). Let X be a complete metric space.
We say that X satisfies the semi parallelogram law, or is seminegative, if
given two points $x_1, x_2 \in X$ there is a point z such that for all $x \in X$ we have
$$d(x_1, x_2)^2 + 4 d(x, z)^2 \le 2 d(x, x_1)^2 + 2 d(x, x_2)^2.$$
Prove that under this law, $d(z, x_1) = d(x_1, x_2)/2$, and z is uniquely
determined. We call z the midpoint of $x_1$, $x_2$.
31. (Serre, after Bruhat-Tits) Let X be a seminegative complete metric space. Let
S be a bounded subset of X. Show that there exists a unique closed ball
$B_r(x_1)$ of minimal radius containing S. [Use the semiparallelogram law both
for uniqueness and existence. For existence, show that if $\{B_{r_n}(x_n)\}$ is a
sequence of closed balls containing S with $\lim r_n = r$ (the inf of all radii of
closed balls containing S), then $\{x_n\}$ is Cauchy.] The center of that closed
ball is called the circumcenter of S.
32. (Bruhat-Tits fixed point theorem) Let X be a complete seminegative metric
space. Let G be a group of isometries of X, i.e. bijective maps $f: X \to X$
which preserve distance. Denote the action of G by $(g, x) \mapsto g \cdot x$. Suppose G
has a bounded orbit (i.e. there is a point x such that the set S of all elements
$g \cdot x$, $g \in G$, is bounded). Then G has a fixed point, namely the circumcenter
of the orbit.
For the above exercises, cf. Bruhat-Tits, Groupes Réductifs sur un Corps
Local I, Pub. IHES 41 (1972) pp. 5-251; and K. Brown, Buildings, Springer
Verlag, 1989, Chapter VI, Theorem 2 of §5.
CHAPTER III
Continuous Functions
on Compact Sets
III, §1. THE STONE-WEIERSTRASS THEOREM
Let E be a normed vector space (over the real or the complex numbers).
We can define the notion of Cauchy sequence in E as we did for real
sequences, and also the notion of convergent sequence (having a limit). If
every Cauchy sequence converges, then E is said to be complete, and is
also called a Banach space. A closed subspace of a Banach space is
complete, hence it is also a Banach space.
Examples. Let S be a non-empty set, and let F be a normed vector
space. We denote by B(S, F) the space of bounded maps from S into F.
It is a normed vector space under the sup norm, and if F is a Banach
space, then B(S, F) is complete, and thus is also a Banach space. The
proof that B(S, F) is complete if F is complete should be carried out as
an exercise. (The reader should have had a similar proof as part of a
course in advanced calculus but, at any rate, has had it for functions
which are real valued; the proof applies as well to Banach spaces.) If S
is a subset of a normed vector space (or a metric space) we denote by
C(S, F) the space of continuous maps of S into F, and by BC(S, F)
the subspace of bounded continuous maps. Then BC(S, F) is closed in
B(S, F), this being nothing else but a special case of the assertion that
a uniform limit of continuous maps is continuous. Again, the reader
should have seen a proof in the case of functions, and that same proof (a
$3\varepsilon$-proof) applies to the case of maps into Banach spaces. (Do Exercise 0
if you have never done it before, or look up Undergraduate Analysis.)
Let X be a set. By an algebra A of functions on X (say, real valued)
we mean a subset of the ring of all functions having the properties that if
$f, g \in A$, then $f + g$ and $fg$ are in A, and if $c \in \mathbf{R}$, then $cf \in A$. Most of
the algebras we deal with also contain the constant functions (identified
with R itself). We make a similar definition of an algebra over C.
For example, the polynomials in one variable form an algebra, and so
do polynomials in several variables. If $\varphi$ is a function on some set S,
then the set of all functions which can be written in the form
$$a_n \varphi^n + a_{n-1} \varphi^{n-1} + \cdots + a_0$$
with $a_i \in \mathbf{R}$ forms an algebra, said to be generated by $\varphi$. Similarly, we
have the notion of an algebra generated by a finite number of functions
$\varphi_1, \ldots, \varphi_r$, or by a family of functions. It is the algebra of polynomials
in $\varphi_1, \ldots, \varphi_r$. If X is a topological space, the set of all continuous
functions is an algebra, denoted by C(X). If we wish to specify the range
of values (real or complex), we write C(X, C) or C(X, R). Recall that a
function is a mapping with values in R or C.
Let S be a compact set. Let A be an algebra of continuous functions
on S. Every function in A is bounded because S is compact, and
consequently we have the sup norm on A, namely for $f \in A$,
$$\|f\| = \sup_{x \in S} |f(x)|.$$
Thus A is contained in the normed vector space of all bounded functions
on S. We are interested in determining the closure of A. Since C(S) is
closed, the closure of A will be contained in C(S). We shall find
conditions under which it is equal to C(S). In other words, we shall find
conditions under which every continuous function on S can be uniformly
approximated by elements of A.
We shall say that A separates points of S if given points $x, y \in S$,
and $x \ne y$, there exists a function $f \in A$ such that $f(x) \ne f(y)$. The
ordinary algebra of polynomial functions obviously separates points, since the
function f(x) = x already does so.
Theorem 1.1 (Stone-Weierstrass Theorem). Let S be a compact set,

and let A be an algebra of real valued continuous functions on S.
Assume that A separates points and contains the constant functions.
Then the uniform closure of A is equal to the algebra of all real
continuous functions on S.
We shall first prove the theorem under an extra assumption. We shall

get rid of the extra assumption afterwards.
Lemma 1.2. In addition to the hypotheses of the theorem, assume also
that if $f, g \in A$ then $\max(f, g) \in A$, and $\min(f, g) \in A$. Then the
conclusion of the theorem holds.
Proof. We give the proof in three steps. First, we prove that given $x_1$,
$x_2 \in S$ and $x_1 \ne x_2$, and given real numbers $\alpha$, $\beta$, there exists $h \in A$ such
that $h(x_1) = \alpha$ and $h(x_2) = \beta$. By hypothesis, there exists $\varphi \in A$ such that
$\varphi(x_1) \ne \varphi(x_2)$. Let
$$h(x) = \alpha + (\beta - \alpha)\, \frac{\varphi(x) - \varphi(x_1)}{\varphi(x_2) - \varphi(x_1)}.$$
Then h lies in A because A contains the constants, and h satisfies our
requirements.
Next we are given a continuous function f on S and also given $\varepsilon$. We
wish to find a function $g \in A$ such that
$$f(y) - \varepsilon < g(y) < f(y) + \varepsilon$$
for all $y \in S$. This will prove what we want. We shall satisfy these
inequalities one after the other. For each pair of points $x, y \in S$ there
exists a function $h_{x,y} \in A$ such that
$$h_{x,y}(x) = f(x) \quad\text{and}\quad h_{x,y}(y) = f(y).$$
If x = y, this is trivial. If $x \ne y$, this is what we proved in the first step.
We now fix x for the moment. For each $y \in S$ there exists an open ball
$U_y$ centered at y such that for all $z \in U_y$ we have
$$h_{x,y}(z) < f(z) + \varepsilon.$$
This is simply the continuity of $f - h_{x,y}$ at y. The open sets $U_y$ cover S,
and since S is compact, there exists a finite number of points $y_1, \ldots, y_n$
such that $U_{y_1}, \ldots, U_{y_n}$ already cover S. Let
$$h_x = \min(h_{x,y_1}, \ldots, h_{x,y_n}).$$
Then $h_x$ lies in A according to the additional hypothesis of the lemma
(and induction). Furthermore, we have for all $z \in S$:
$$h_x(z) < f(z) + \varepsilon,$$
and $h_x(x) = f(x)$, that is $(h_x - f)(x) = 0$.
Now for each $x \in S$ we find an open ball $V_x$ centered at x such that,
by continuity, for all $z \in V_x$ we have $(h_x - f)(z) > -\varepsilon$, or in other words,
$$f(z) - \varepsilon < h_x(z).$$
By compactness, we can find a finite number of points $x_1, \ldots, x_m$ such
that $V_{x_1}, \ldots, V_{x_m}$ cover S. Finally, let
$$g = \max(h_{x_1}, \ldots, h_{x_m}).$$
Then g lies in A, and we have for all $z \in S$
$$f(z) - \varepsilon < g(z) < f(z) + \varepsilon,$$
thereby proving the lemma.

The theorem is an easy consequence of the lemma, and will follow if
we can prove that whenever $f, g \in A$ then $\max(f, g)$ and $\min(f, g)$ lie in
the closure of A. To prove this, we note first that we can write
$$\max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2},$$
$$\min(f, g) = \frac{f + g}{2} - \frac{|f - g|}{2}.$$
Consequently it will suffice to prove that if $f \in A$ then $|f|$ lies in the
closure of A.
Since f is bounded, there exists a number c > 0 such that
$$-c \le f(x) \le c$$
for all $x \in S$. The absolute value function can be uniformly approximated
by ordinary polynomials on the interval [-c, c] by Exercises 6, 7, or 8,
which are very simple ad hoc proofs. Given $\varepsilon$, let p be a polynomial such
that
$$|p(t) - |t|| < \varepsilon$$
for $-c \le t \le c$. Then
$$\bigl|p(f(x)) - |f(x)|\bigr| < \varepsilon,$$
and hence $|f|$ can be approximated by $p \circ f$. Explicitly, if
$$p(t) = a_n t^n + \cdots + a_0,$$
then
$$p \circ f = a_n f^n + \cdots + a_0,$$
i.e.
$$p(f(x)) = a_n f(x)^n + \cdots + a_0.$$
This concludes the proof of the Stone-Weierstrass theorem.
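The approximation of the absolute value by polynomials can be made concrete. The sketch below uses one classical scheme, which is an assumption here and not necessarily the method of the exercises cited in the proof: set $p_0 = 0$ and $p_{k+1}(t) = p_k(t) + (t^2 - p_k(t)^2)/2$. Each $p_k$ is a polynomial in t, and on [-1, 1] one has $0 \le |t| - p_k(t) < 2/(k+1)$. The function names are hypothetical.

```python
def abs_approx(t, k):
    """Value at t of the polynomial p_k, where p_0 = 0 and
    p_{j+1}(t) = p_j(t) + (t*t - p_j(t)**2) / 2.  Intended for -1 <= t <= 1."""
    p = 0.0
    for _ in range(k):
        p += (t * t - p * p) / 2.0
    return p

def abs_approx_on(t, c, k):
    """Rescaled version for the interval [-c, c], c > 0: the polynomial c * p_k(t / c)."""
    return c * abs_approx(t / c, k)

def sup_error(k, c=1.0, samples=2001):
    """Largest error | c * p_k(t / c) - |t| | over an evenly spaced grid on [-c, c]."""
    grid = (-c + 2.0 * c * i / (samples - 1) for i in range(samples))
    return max(abs(abs_approx_on(t, c, k) - abs(t)) for t in grid)
```

Composing with a function f bounded by c, as in the proof, `abs_approx_on(f(x), c, k)` is a polynomial in f(x) which approximates |f(x)| uniformly in x, with error below 2c/(k+1). The grid in `sup_error` only samples the interval, so it estimates the sup norm of the error from below.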
Corollary 1.3. Let S be a compact set in $\mathbf{R}^k$. Any real continuous
function on S can be uniformly approximated by polynomial functions in
k variables.
Proof. The set of polynomials contains the constants, and obviously
separates points of $\mathbf{R}^k$ since the coordinate functions $x_1, \ldots, x_k$ already do
this. So the theorem applies.
There is a complex version of the Weierstrass-Stone theorem. Let A
be an algebra of complex valued functions on the set S. If $f \in A$, we
have its complex conjugate $\overline{f}$ defined by
$$\overline{f}(x) = \overline{f(x)}.$$
For instance, if $f(x) = e^{ix}$ then $\overline{f}(x) = e^{-ix}$. If A is an algebra over C of
complex valued functions, we say that A is self conjugate if whenever
$f \in A$ the conjugate function $\overline{f}$ is also in A.
Theorem 1.4 (Complex S-W Theorem). Let S be a compact set and
A an algebra (over C) of complex valued continuous functions on S.
Assume that A separates points, contains the constants, and is self con-
jugate. Then the uniform closure of A is equal to the algebra of all
complex valued continuous functions on S.
Proof. Let $A_{\mathbf{R}}$ be the set of all functions in $A$ which are real valued.
We contend that $A_{\mathbf{R}}$ is an algebra over $\mathbf{R}$ which satisfies the hypotheses
of the preceding theorem. It is obviously an algebra over $\mathbf{R}$. If $x_1 \ne x_2$
are points of $S$, there exists $f \in A$ such that $f(x_1) = 0$ and $f(x_2) = 1$.
(The proof of the first step of Lemma 1.2 shows this.) Let $g = f + \bar{f}$.
Then $g(x_1) = 0$ and $g(x_2) = 2$, and $g$ is real valued, so $A_{\mathbf{R}}$ separates
points. It obviously contains the real constants, and so the real S-W
theorem applies to it. Given a complex continuous function $\varphi$ on $S$, we
write $\varphi = u + iv$, where $u$, $v$ are real valued. Then $u$, $v$ are continuous,
and $u$, $v$ can be approximated uniformly by elements of $A_{\mathbf{R}}$, say $f, g \in A_{\mathbf{R}}$
such that $\|u - f\| < \epsilon$ and $\|v - g\| < \epsilon$. Then $f + ig$ approximates
$u + iv = \varphi$, thereby concluding the proof.
Remark. The Stone-Weierstrass theorem has a useful application to

locally compact spaces. For such corollaries, we refer the reader to
Chapter IX, §6, and Chapter XVI, §3. For explicit approximations in
concrete cases, see the Exercises and also Chapter VIII, §1.
III, §2. IDEALS OF CONTINUOUS FUNCTIONS
The second theorem of this chapter deals with ideals of continuous
functions. Let $S$ be a topological space, and $R$ a ring of continuous functions
(real valued) on $S$. An ideal $J$ of $R$ is a subset of $R$ satisfying the
following properties: The zero function $0$ is in $J$. If $f, g \in J$, then $f + g$
and $-f$ are in $J$, and if $h \in R$, then $hf \in J$. The reader should really have
met the definition of an ideal in an algebra course, but we don't assume
this here, although some motivation from algebra is useful.
Let $f$ be continuous on $S$. A zero of $f$ is a point $x \in S$ such that
$f(x) = 0$. The set of zeros of $f$ is a closed set denoted by $Z_f$. Let $J$ be
an ideal. Then the set
$$Z(J) = \bigcap_{f \in J} Z_f,$$
equal to the intersection of the sets of zeros of all $f \in J$, is closed, and is
called the set of zeros of $J$. If $J$, $J'$ are two ideals, and $J \subset J'$, then
$Z(J) \supset Z(J')$. We ask to what extent the set of zeros of an ideal
determines this ideal, and answer this question in an important case.
Theorem 2.1. Let $X$ be a compact space, and let $R$ be the ring of
continuous functions on $X$, with the sup norm. Let $J$ be a closed ideal
(i.e. an ideal, closed under the sup norm). If $f \in R$ is such that $f(x) = 0$
for all zeros $x$ of $J$ (i.e. if $f$ vanishes on the set of zeros of $J$), then $f$
lies in $J$.
Proof. Given $\epsilon$, let $U$ be the subset of $X$ consisting of all $x \in X$ such
that $|f(x)| < \epsilon$. Then $U$ is open, and the complement $S$ of $U$ is closed,
and hence compact. Note that $U$ contains $Z_f$, and hence $Z(J)$, so no point
of $S$ is a zero of $J$. For each $y \in S$, we can
find a function $g_y$ in $J$ such that $g_y(x) \ne 0$ for $x$ in some open neighborhood
$V_y$ of $y$ (by continuity). There is some finite covering $\{V_{y_1}, \dots, V_{y_m}\}$ of $S$
corresponding to functions $g_{y_1}, \dots, g_{y_m}$. Let
$$g = g_{y_1}^2 + \cdots + g_{y_m}^2.$$
Then $g$ is in $J$, is continuous, is nowhere $0$ on $S$, and is $\ge 0$. Since $g$ has a
minimum on $S$, there is a number $a > 0$ such that $g(x) \ge a$ for all $x \in S$.
The function
$$\frac{ng}{1 + ng}$$
lies in $J$, because $1 + ng$ is nowhere $0$ on $X$, its inverse is continuous
on $X$, so in $R$, and hence $(1 + ng)^{-1} ng \in J$. As $n$ becomes large, the function
$ng/(1 + ng)$ tends uniformly to $1$ on $S$, and hence the function
$$f \, \frac{ng}{1 + ng}$$
lies in $J$, and approximates $f$ within $\epsilon$ on $S$. Since $0 \le ng/(1 + ng) \le 1$ it
follows that on $U$ we have the estimate
$$0 \le |f ng/(1 + ng)| < \epsilon,$$
and so $f ng/(1 + ng)$ lies within $2\epsilon$ of $f$. Thus we have shown that $f$ lies
in the closure of $J$. Since $J$ is assumed closed, we conclude that $f$ lies in
$J$, thereby proving our theorem.
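To see the approximating functions of this proof at work, one may try a concrete case. The sketch below assumes, purely for illustration, that $X = [0, 1]$, that $J$ is the ideal of continuous functions vanishing at $0$, that $g(x) = x^2$, and that $f(x) = x$; none of these choices come from the text. It estimates on a grid the sup norm of $f - f\,ng/(1 + ng)$.

```python
def f(x):
    """A continuous function vanishing on Z(J) = {0} (illustrative choice)."""
    return x


def g(x):
    """An element of J which is >= 0 and vanishes only at 0 (illustrative choice)."""
    return x * x


def approximant(x, n):
    """The function f * n g / (1 + n g), which lies in J."""
    return f(x) * n * g(x) / (1.0 + n * g(x))


def sup_error(n, samples=10001):
    """Grid estimate of the sup norm of f - f n g / (1 + n g) on [0, 1]."""
    points = (k / (samples - 1) for k in range(samples))
    return max(abs(f(x) - approximant(x, n)) for x in points)


if __name__ == "__main__":
    for n in (100, 10000):
        print(n, sup_error(n))
```

For this particular choice the difference is $x/(1 + nx^2)$, whose maximum on $[0, 1]$ is $1/(2\sqrt{n})$ for $n \ge 1$, so the error tends to $0$ as $n$ grows.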
Remark 1. Situations analogous to that of Theorem 2.1 arise frequently
in mathematics. For instance, let $R$ be the ring of polynomials in
$n$ variables over the complex numbers, $R = \mathbf{C}[t_1, \dots, t_n]$. Let $J$ be an
ideal of $R$, and define zeros of $J$ to be $n$-tuples of complex numbers
$x$ such that $f(x) = 0$ for all $f \in J$. It is shown in algebraic geometry
courses that if $f$ is a polynomial in $R$ which vanishes on $Z(J)$, then
$f^m \in J$ for some positive integer $m$. This is called Hilbert's Nullstellensatz.
Remark 2. Theorem 2.1 is but an example of a type of theorem which

describes the topology of a space and describes properties of a space in
terms of the ring of continuous functions on that space. (Cf. also Exercise
5.) This is one way in which one can algebraicize the study of certain
topological spaces.
III, §3. ASCOLI'S THEOREM
In the examples of Chapter XVIII, §4, we shall deal with compact subsets
of function spaces, and we need a criterion for compactness, which is
provided by Ascoli's theorem. It is also used in other places in analysis,
for instance in a proof of the Riemann mapping theorem in complex
analysis. Therefore, we give a proof here in the general discussion of
compact spaces.
Let $X$ be a subset of a metric space, and let $F$ be a Banach space. Let
$\Phi$ be a subset of the space of continuous maps $C(X, F)$. We shall say
that $\Phi$ is (or its elements are) equicontinuous at a point $x_0 \in X$ if given $\epsilon$,
there exists $\delta$ such that whenever $x \in X$ and $d(x, x_0) < \delta$, then
$$|f(x) - f(x_0)| < \epsilon$$
for all $f \in \Phi$. We say that $\Phi$ is equicontinuous on $X$ if it is
equicontinuous at every point of $X$.
Theorem 3.1 (Ascoli's Theorem). Let $X$ be a compact subset of a
metric space, and let $F$ be a Banach space. Let $\Phi$ be a subset of the
space of continuous maps $C(X, F)$ with sup norm. Then $\Phi$ is relatively
compact in $C(X, F)$ if and only if the following two conditions are
satisfied:
ASC 1. $\Phi$ is equicontinuous.
ASC 2. For each $x \in X$, the set $\Phi(x)$ consisting of all values $f(x)$ for
$f \in \Phi$ is relatively compact.
Proof. Assume that $\Phi$ satisfies the two conditions. We shall prove
that $\Phi$ is relatively compact. For this it is sufficient to show that $\Phi$ can
be covered by a finite number of balls of prescribed radius (Corollary 3.9
of Chapter II). Let $r > 0$. By equicontinuity, for each $x \in X$ we select an
open neighborhood $V(x)$ such that if $y \in V(x)$, then $|f(y) - f(x)| < r$ for
all $f \in \Phi$. Then a finite number $V(x_1), \dots, V(x_n)$ cover $X$. Each set
$$\Phi(x_1), \dots, \Phi(x_n)$$
is relatively compact, and hence so is their union
$$Y = \Phi(x_1) \cup \cdots \cup \Phi(x_n).$$
Let $B(a_1), \dots, B(a_m)$ be open balls of radius $r$ centered at points $a_1, \dots, a_m$
which cover $Y$. If $f \in \Phi$, then $f(x_1), \dots, f(x_n)$ lie in these balls. In fact, for each
$i = 1, \dots, n$ we have
$$f(x_i) \in B(a_{\sigma(i)}),$$
where $\sigma: \{1, \dots, n\} \to \{1, \dots, m\}$ is some mapping. For each such map $\sigma$
let $\Phi_\sigma$ be the set of $f \in \Phi$ such that for all $i$, we have
$$|f(x_i) - a_{\sigma(i)}| < r.$$
Then the finite number of $\Phi_\sigma$ cover $\Phi$. It suffices now to prove that each
$\Phi_\sigma$ has diameter $< 4r$. But if $f, g \in \Phi_\sigma$ and $x \in X$, then $x$ lies in some
$V(x_i)$, and then:
$$|f(x) - g(x)| \le |f(x) - f(x_i)| + |f(x_i) - a_{\sigma(i)}| + |a_{\sigma(i)} - g(x_i)| + |g(x_i) - g(x)| < 4r.$$
This proves our implication, and the part of Ascoli's theorem which
is used in the applications. The converse is trivial and left to the reader.
Ascoli's theorem is used mostly when $F$ is the real or complex numbers,
and in that case, we reformulate it as a corollary.
Corollary 3.2. Let $X$ be a compact subset of a metric space, and let $\Phi$
be a subset of the space of continuous functions on $X$ with sup norm.
Then $\Phi$ is relatively compact if and only if $\Phi$ is equicontinuous and
bounded (for the sup norm, of course).
Proof. For each $x \in X$, our hypothesis that $\Phi(x)$ is bounded implies
that $\Phi(x)$ is relatively compact, since a closed bounded subset of a finite
dimensional space is compact. So we can apply the theorem.
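Boundedness alone is not enough in the corollary, and a small numerical experiment makes the role of equicontinuity visible. The two families below are our own illustrative choices, not examples taken from the text: the powers $f_n(x) = x^n$ on $[0, 1]$, which are bounded by 1 but not equicontinuous at the point $1$, and the shifts $x \mapsto \sin(x + c)$ with $0 \le c \le 1$, which are bounded and equicontinuous because they all have Lipschitz constant 1. The function names are hypothetical.

```python
import math


def oscillation_powers(delta, n_max):
    """sup over 1 <= n <= n_max of |f_n(1) - f_n(1 - delta)| for f_n(x) = x**n."""
    return max(1.0 - (1.0 - delta) ** n for n in range(1, n_max + 1))


def oscillation_shifts(delta, shifts):
    """sup over c in shifts of |sin(1 + c) - sin(1 - delta + c)|."""
    return max(abs(math.sin(1.0 + c) - math.sin(1.0 - delta + c)) for c in shifts)


if __name__ == "__main__":
    shifts = [k / 100.0 for k in range(101)]
    for delta in (0.1, 0.01, 0.001):
        print(delta, oscillation_powers(delta, 10000), oscillation_shifts(delta, shifts))
```

For the powers the oscillation near $1$ stays close to 1 however small $\delta$ is taken, once $n$ is allowed to be large, whereas for the shifts it is at most $\delta$.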
Remark. Since $\Phi$ has a metric defined by the sup norm, as a relatively
compact set it has the property that any sequence has a convergent
subsequence, converging in its closure. Sometimes one deals with a locally
compact set $X$ which is a denumerable union of compact sets. In
that case, one obtains the following version of Ascoli's theorem.
Corollary 3.3. Let $X$ be a metric space whose topology has a countable
base $\{V_i\}$ such that the closure $\bar{V}_i$ of each $V_i$ is compact. Let $\{f_n\}$ be a
sequence of continuous maps of $X$ into a Banach space. Assume that
$\{f_n\}$ is equicontinuous (as a family of maps), and is such that for each
$x \in X$, the closure of the set $\{f_n(x)\}$ ($n = 1, 2, \dots$) is compact. Then
there exists a subsequence which converges pointwise to a continuous
function $f$, and such that the convergence is uniform on every compact
subset.
Proof. We can find a sequence $\{U_i\}$ of open sets such that $\bar{U}_i \subset U_{i+1}$,
such that $\bar{U}_i$ is compact, and such that the union of the $U_i$ is $X$. For
each $i$, by the previous version of Ascoli's theorem, there exists a
subsequence which converges uniformly on $\bar{U}_i$. The diagonal sequence with
respect to all $i$ converges uniformly on every compact set. This proves
the corollary.
Remark. In light of Urysohn's metrization lemma, the hypotheses on

X in the corollary could be given as X separable locally compact.
III, §4. EXERCISES
0. Let $S$ be a subset of a normed vector space (or a metric space), and let $\{f_n\}$
be a sequence of continuous maps of $S$ into a Banach space $F$. Assume that
$\{f_n\}$ is a Cauchy sequence (for the sup norm). Show that $\{f_n\}$ converges to a
continuous function $f$ (for the sup norm). Show that $BC(S, F)$ is closed in
$B(S, F)$.
1. Let $X$ be a compact set and let $R$ be the ring of continuous (real valued)
functions on $X$. Let $J$, $J'$ be closed ideals of $R$. Show that $J \subset J'$ if and only
if $Z(J) \supset Z(J')$.
2. Let $S$ be a closed subset of $X$. Let $J$ be the set of all $f \in R$ such that $f$
vanishes on $S$. Show that $J$ is a closed ideal. Assume that $X$ is Hausdorff.
Establish a ring-isomorphism between the factor ring $R/J$ and the ring of
continuous functions on $S$. (We assume that you have had the notion of a
factor ring in an algebra course.)
3. Let X be a compact space and let J be an ideal of C(X). If the set of zeros
of J is empty, show that J = C(X). (This result is valid in both the real and
the complex case.)
4. Let $X$ be a compact Hausdorff space. Show that a maximal ideal of $C(X)$
has only one zero, and is closed. (Recall that an ideal $M$ is said to be
maximal if $M \ne C(X)$, and if there is no ideal $J$ such that $M \subset J \subset C(X)$
other than $M$ and $C(X)$ itself.) Thus if $M$ is maximal, then there exists $p \in X$
such that $M$ consists of all continuous functions $f$ vanishing at $p$.
5. Let $X$ be a normal space, and let $R$ be the ring of continuous functions on
$X$. Show that the topology on $X$ is the one having the fewest open
sets making every function in $R$ continuous.
6. Give a Taylor formula type proof that the absolute value can be
approximated uniformly by polynomials. First, reduce it to the interval $[-1, 1]$ by
multiplying the variable by $c$ or $c^{-1}$ as the case may be. Then write
$|t| = \sqrt{t^2}$. Select $\delta$ small, $0 < \delta < 1$. If we can approximate $(t^2 + \delta)^{1/2}$, then we
can approximate $\sqrt{t^2}$. Now to get $(t^2 + \delta)^{1/2}$ either use the Taylor series
approximation for the square root function, or if you don't like the binomial
expansion, first approximate
$$\log (t^2 + \delta)^{1/2} = \tfrac{1}{2} \log(t^2 + \delta)$$
by a polynomial $P$. Then take a sufficiently large number of terms from the
Taylor formula for the exponential function, say a polynomial $Q$, and use
$Q \circ P$ to solve your problems.
7. Give another proof for the preceding fact, by using the sequence of
polynomials $\{P_n\}$, starting with $P_0(t) = 0$ and letting
$$P_{n+1}(t) = P_n(t) + \tfrac{1}{2}\bigl(t - P_n(t)^2\bigr).$$
Show that $\{P_n\}$ tends to $\sqrt{t}$ uniformly on $[0, 1]$, showing by induction that
$$0 \le \sqrt{t} - P_n(t) \le \frac{2\sqrt{t}}{2 + n\sqrt{t}},$$
whence $0 \le \sqrt{t} - P_n(t) \le 2/n$.
8. Look at Example 1 of Chapter VIII, §3 to see another explicit way of
proving Weierstrass' approximation theorem for a continuous function on a
finite closed interval. Do Exercise 1 of that chapter.
9. Let $X$ be a compact set in a normed vector space, and let $\{f_n\}$ be a sequence
of continuous functions converging pointwise to a continuous function $f$, and
such that $\{f_n\}$ is a monotone increasing sequence. Show that the convergence
is uniform (Dini's theorem; cf. Chapter IX, §1).
10. Let $X$ be a compact metric space (whence separable). Show that the Banach
space $C(X, \mathbf{R})$ or $C(X, \mathbf{C})$ of continuous functions on $X$ is separable.
[Hint: Let $\{x_n\}$ be a countable dense set in $X$ and let $g_n$ be the function on
$X$ given by
$$g_n(x) = d(x, x_n),$$
where $d$ is the distance function. Use the Stone-Weierstrass theorem applied
to the algebra generated by all functions $g_n$ to conclude that $C(X, \mathbf{R})$ is
separable.] Note: Since a compact Hausdorff space is normal, and since a
normal separable space is metrizable, one can adjust the statement of the
theorem proved in the exercise as follows:
Let $X$ be a compact Hausdorff separable space. Then $C(X, \mathbf{R})$ is separable.
11. Let $X$, $Y$ be compact Hausdorff spaces. If $f$, $g$ are continuous functions on
$X$ and $Y$ respectively, we denote by $f \otimes g$ the function such that
$$(f \otimes g)(x, y) = f(x)g(y).$$
Show that every continuous function on $X \times Y$ can be uniformly
approximated by sums $\sum_{i=1}^{n} f_i \otimes g_i$ where $f_i$ is continuous on $X$ and $g_i$ is continuous
on $Y$.
12. Let $X$ be compact Hausdorff. By an algebra automorphism of $C(X)$ we mean
a map $\sigma: C(X) \to C(X)$ such that $\sigma$ leaves the constants fixed, and satisfies
$$\sigma(f + g) = \sigma(f) + \sigma(g), \qquad \sigma(fg) = \sigma(f)\sigma(g).$$
Show that an algebra automorphism is norm preserving, i.e. $\|\sigma f\| = \|f\|$.

13. Let $X$ be a compact Hausdorff space and let $A$ be a subalgebra of $C(X, \mathbf{R})$.
Show that there exists a continuous map $\varphi: X \to Y$ of $X$ onto a compact
space $Y$ such that every element of $A$ can be written in the form $g \circ \varphi$, where
$g$ is a continuous function on $Y$.
14. Let X, Y be compact Hausdorff spaces. Show that X is homeomorphic to Y

if and only if C(X, C) is algebra-isomorphic to C(Y, C).
15. Let $X$ be a compact Hausdorff space. Let $\mathcal{M}$ be the set of all maximal ideals
in $C(X, \mathbf{C})$. Define a closed set in $\mathcal{M}$ to consist of all maximal ideals
containing a given ideal. Show that this defines a topology on $\mathcal{M}$. For each
$x \in X$, let $M_x$ be the ideal of functions in $C(X, \mathbf{C})$ which vanish at $x$. Show
that the map
$$x \mapsto M_x$$
is a homeomorphism between $X$ and $\mathcal{M}$.

16. For $a \in \mathbf{R}$ let $f_a(x) = e^{iax} e^{-x^2}$. Prove that any function $\varphi$ which is $C^\infty$ and has
compact support on $\mathbf{R}$ can be uniformly approximated by elements of the
space generated by the functions $f_a$ over $\mathbf{C}$. [Hint: If $\psi$ is a function
vanishing outside a compact set, and $N$ is a large integer, let $\psi_N$ be the extension
of $\psi$ on $[-N, N]$ to $\mathbf{R}$ by periodicity. Use the partial sums of a Fourier
series to approximate such an extension of $\varphi(x)e^{x^2}$, and then multiply by
$e^{-x^2}$.] Remark. Instead of $e^{-x^2}$ you could use any function $h(x) > 0$ which is
$C^\infty$, and tends to $0$ at infinity. This would not be the case in Exercises 19
and 20 below.
The next four exercises form a connected set.
17. Let $X$ be compact Hausdorff and let $p$ be a point of $X$. Let $A$ be a
subalgebra of $C(X, \mathbf{R})$ consisting of functions $g$ such that $g(p) = 0$. Assume
that there is no point $q \ne p$ such that $g(q) = 0$ for all $g \in A$, and that $A$
separates the points of $X - \{p\}$. Then the uniform closure of $A$ is equal to the
ideal of all functions vanishing at $p$.
18. Let $X$ be locally compact Hausdorff, but not compact. Let $C_\infty(X, \mathbf{R})$ be the
algebra of continuous functions $f$ on $X$ such that $f$ vanishes at infinity
(meaning, given $\epsilon$ there exists a compact $K$ such that $|f(x)| < \epsilon$ if $x \notin K$). Let
$A$ be a subalgebra of $C_\infty(X, \mathbf{R})$ which separates points of $X$. Assume that
there is no common zero to all functions in $A$. Show that $A$ is dense in
$C_\infty(X, \mathbf{R})$.
19. Let $f$ be a real valued continuous function on $\mathbf{R}_{\ge 0}$ (reals $\ge 0$). Assume that $f$
vanishes at infinity. Show that $f$ can be uniformly approximated by functions
of the form $e^{-x}p(x)$, where $p$ is a polynomial. [Hint: First show that you can
approximate $e^{-2x}$ by $e^{-x}q(x)$ for some polynomial $q(x)$, by using Taylor's
formula with remainder. If $p$ is a polynomial, approximate $e^{-nx}p(x)$ by $e^{-x}q(x)$
for some polynomial $q$.]
20. Let $f$ be a continuous function on $\mathbf{R}$, vanishing at infinity. Show that $f$ can
be uniformly approximated by functions of the form $e^{-x^2}p(x)$, where $p$ is a
polynomial.
Remark. By changing variables, one can use $e^{-cx}$ and $e^{-cx^2}$ with a fixed
$c > 0$ instead of $e^{-x}$ and $e^{-x^2}$ in Exercises 19 and 20.
21. Let $X$ be a metric space and $E$ a normed vector space. Let $BC(X, E)$ be the
space of bounded continuous maps of $X$ into $E$. Let $\Phi$ be a bounded subset
of $BC(X, E)$. For $x \in X$, let $\mathrm{ev}_x: \Phi \to E$ be the map such that $\mathrm{ev}_x(\varphi) = \varphi(x)$.
Show that $\mathrm{ev}_x$ is a continuous bounded map. Show that $\Phi$ is equicontinuous
at a point $a \in X$ if and only if the map $x \mapsto \mathrm{ev}_x$ of $X$ into $BC(\Phi, E)$ is
continuous at $a$.
22. Let $X$ be a compact subset of a normed vector space, and $E$ a normed vector
space. Show that any equicontinuous subset $\Phi$ of $C(X, E)$ is uniformly
equicontinuous. [This means: Given $\epsilon$, there exists $\delta$ such that $|x - y| < \delta$
implies $|f(x) - f(y)| < \epsilon$ for all $f \in \Phi$.]
23. Let $X$ be a subset of a normed vector space and $\Phi$ an equicontinuous subset
of $BC(X, \mathbf{R})$. Let $Y$ be the set of points $x \in X$ such that $\Phi(x)$ is bounded.
Prove that $Y$ is open and closed in $X$. If $X$ is compact and connected, and if
for some point $a \in X$ the set $\Phi(a)$ is bounded, show that $\Phi$ is relatively
compact in $C(X, \mathbf{R})$.
PART TWO
Banach and Hilbert Spaces
The two chapters of this part are absolutely basic for everything else that
follows, and introduce the most useful of all the spaces encountered in
analysis, namely Banach and Hilbert spaces. The reader who wishes to
study integration theory as soon as possible may continue these chapters
with Chapter VI, which will make essential use of the basic properties of
these spaces, especially the completion of a normed vector space and the
linear extension theorem. Indeed, the integral of the absolute value of a
function defines a seminorm on a suitable space of functions, whose com-
pletion will be the main object of study of the chapters on integration.
On the other hand, readers may look directly at the functional anal-
ysis, as a continuation of the linear theory of Banach and Hilbert spaces.
At some point, of course, these come together when we study the spectral
theorems and the existence of spectral measures.
As in the algebraic theory of vector spaces, we shall consider continuous
linear maps $L: E \to F$ of a normed vector space into another. The
kernel and image of $L$ are defined as in the algebraic theory, namely the
kernel is the set of elements $x \in E$ such that $L(x) = 0$. The image is
simply $L(E)$. Both $\operatorname{Ker} L$ and $\operatorname{Im} L$ are subspaces, of $E$ and $F$
respectively. However, now that we have the norm, we note that the kernel is
a closed subspace (being the inverse image of the closed set $\{0\}$). Warning:
the image is not necessarily closed. For conditions under which the
image is closed, see Chapter XV.
For the integration theory, we do not need such considerations of
subspace and factor space. However, we shall consider the dual space in
the context of integration, showing that various spaces of functions are
dual to each other. Thus we deal at somewhat greater length with the
dual space in this chapter. An application of the duality theory in the
context of Banach algebras will be given in Chapter XVI.
CHAPTER IV
Banach Spaces
IV, §1. DEFINITIONS, THE DUAL SPACE, AND THE HAHN-BANACH THEOREM
Let $E$ be a Banach space, i.e. a complete normed vector space. One can
deal with series $\sum x_n$ in Banach spaces just as with series of numbers, or
of functions, and the most frequent test for convergence (in fact absolute
convergence) is the standard one:
Let $\{a_n\}$ be a sequence of numbers $\ge 0$ such that $\sum a_n$ converges. If
$|x_n| \le a_n$ for all $n$, then $\sum x_n$ converges.
The proof is standard and trivial.
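For the reader who wants the one-line estimate behind this test, here is a sketch of the usual argument, written with partial sums $s_n = x_1 + \cdots + x_n$ (a notation introduced only for this remark). For $m < n$ the triangle inequality gives

```latex
\[
  |s_n - s_m|
  \;=\; \Bigl| \sum_{k=m+1}^{n} x_k \Bigr|
  \;\le\; \sum_{k=m+1}^{n} |x_k|
  \;\le\; \sum_{k=m+1}^{n} a_k .
\]
```

Since $\sum a_k$ converges, the right-hand side tends to $0$ as $m, n \to \infty$, so $\{s_n\}$ is a Cauchy sequence, and it converges because $E$ is complete.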
Let $E$, $F$ be normed vector spaces. We denote by $L(E, F)$ the space
of continuous linear maps of $E$ into $F$. It is easily verified that a linear
map $\lambda: E \to F$ is continuous if and only if there exists $C > 0$ such that
$|\lambda(x)| \le C|x|$ for all $x \in E$. Indeed, if the $C$ exists, continuity is obvious
(even uniform continuity). Conversely, if $\lambda$ is continuous at $0$, then there
exists $\delta$ such that if $|x| \le \delta$, then $|\lambda(x)| < 1$. Hence for any non-zero
$x \in E$, we get
$$\left| \lambda\!\left( \frac{\delta x}{2|x|} \right) \right| < 1,$$
whence we can take $C = 2/\delta$.

Such a number $C$ is called a bound for $\lambda$, and $\lambda$ is also said to be
bounded. Let $S_1$ be the unit sphere in $E$ (centered at the origin), that is
the set of all $x \in E$ such that $|x| = 1$. Then a bound for $\lambda$ is immediately
seen to be the same thing as a bound for the values of $\lambda$ on $S_1$. The
least upper bound of all values $|\lambda(x)|$, for $x \in S_1$, is called the norm of $\lambda$,
and the map
$$\lambda \mapsto |\lambda|$$
is a norm on $L(E, F)$. It is immediately seen that $|\lambda|$ is the greatest lower
bound of all numbers $C > 0$ such that
$$|\lambda(x)| \le C|x|, \quad \text{all } x \in E.$$
Let $E$, $F$, $G$ be normed vector spaces, let $u \in L(E, F)$, and let $v \in L(F, G)$.
Then $v \circ u$ is in $L(E, G)$ and we have
$$|v \circ u| \le |v||u|.$$
Proof. A composite of continuous maps is continuous, and a composite
of linear maps is linear, so our first assertion is clear. As to the
second, we have
$$|v \circ u(x)| = |v(u(x))| \le |v||u(x)| \le |v||u||x|,$$
so the desired inequality follows by definition.
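In finite dimensions the norm of a linear map and the inequality just proved can be explored numerically. The sketch below assumes $E = F = G = \mathbf{R}^2$ with the Euclidean norm and represents linear maps by $2 \times 2$ matrices; the matrices `u`, `v` and the helper names are arbitrary illustrative choices. It estimates the least upper bound of $|\lambda(x)|$ over the unit sphere by sampling the unit circle, so the values are approximations from below.

```python
import math


def apply(m, x):
    """Apply the 2x2 matrix m (a tuple of rows) to the vector x."""
    return (m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1])


def compose(v, u):
    """Matrix of the composite map v o u."""
    return tuple(tuple(sum(v[i][k] * u[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def op_norm(m, samples=20000):
    """Estimate sup |m x| over the Euclidean unit circle by sampling angles."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        y = apply(m, (math.cos(theta), math.sin(theta)))
        best = max(best, math.hypot(y[0], y[1]))
    return best


u = ((3.0, 0.0), (4.0, 5.0))
v = ((0.0, 1.0), (-2.0, 1.0))

if __name__ == "__main__":
    print(op_norm(u), op_norm(v), op_norm(compose(v, u)))
```

For the matrix `u` the exact value is $\sqrt{45}$, the square root of the largest eigenvalue of the transpose of `u` times `u`, which gives a check on the sampling.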
If $F$ is complete, then $L(E, F)$ is complete.
This is but an exercise. If $\{\lambda_n\}$ is a Cauchy sequence of elements in
$L(E, F)$, then for each $x \in E$ one verifies that $\{\lambda_n(x)\}$ is a Cauchy
sequence in $F$, and hence converges to an element which we define to be
$\lambda(x)$. One then verifies that $\lambda$ is linear, and that if $C = \lim |\lambda_n|$, then $C$
is a bound for $\lambda$, so that $\lambda$ is continuous. Finally one verifies that $\{\lambda_n\}$
converges to $\lambda$ in $L(E, F)$. (Fill in the details as Exercise 1, or look them
up in Undergraduate Analysis.)
We give some terminology concerning the space L(E, F) which is used

constantly in this book, and in analysis.
A continuous (bounded) linear map of a Banach space into itself is
called an endomorphism, or an operator.
In the case of two spaces $E$, $F$, an element $u \in L(E, F)$ is said to be
invertible if there exists $v \in L(F, E)$ such that
$$v \circ u = I$$
and
$$u \circ v = I$$
(where $I$ is the identity mapping). In mathematics, the word isomorphism
refers to invertibility in various contexts, for instance a map having a
continuous inverse, a linear inverse, a differentiable inverse, etc. ad lib.
Thus in each case, one should add an adjective to the word isomorphism
to make precise the kind of invertibility which is meant. In our present
case, we shall call invertible elements of $L(E, F)$ toplinear isomorphisms,
the adjective toplinear referring to the topology and the linearity. The
set of toplinear isomorphisms of $E$ onto $F$ is denoted by $\mathrm{Lis}(E, F)$. If
$E = F$, then we call toplinear isomorphisms of $E$ with itself toplinear
automorphisms of $E$; the set of such automorphisms is denoted by
$\mathrm{Laut}(E)$. (For euphony, the reader may prefer the adjective topolinear
instead of toplinear.)
A toplinear isomorphism $u$ between Banach spaces $E$, $F$ which also
preserves the norm (that is $|u(x)| = |x|$ for all $x \in E$) will be called a
Banach isomorphism, or an isometry.
We shall also be dealing with bilinear maps. Let $E$, $F$, $G$ be normed
vector spaces. A map
$$\varphi: E \times F \to G$$
is said to be bilinear if for each $x \in E$ the map $y \mapsto \varphi(x, y)$ is linear, and if
for each $y \in F$ the map $x \mapsto \varphi(x, y)$ is linear. Such bilinear maps form a
vector space. It is easily verified (in a manner similar to the case of
linear maps) that $\varphi$ is continuous if and only if there exists $C$ such that
$$|\varphi(x, y)| \le C|x||y|$$
for all $x \in E$, $y \in F$. The greatest lower bound of such $C$ then defines a
norm on the space of continuous bilinear maps, denoted by $L(E, F; G)$,
and this space is a Banach space if $G$ is complete. (Cf. Exercise 3.)
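In the differential calculus, and other applications, we need an
isomorphism between $L(E, L(F, G))$ and $L(E, F; G)$ as follows. Let
$\lambda \in L(E, L(F, G))$ and define $\varphi_\lambda$ by
$$\varphi_\lambda(x, y) = \lambda(x)(y).$$
Then $\varphi_\lambda$ is obviously bilinear, and we have
$$|\varphi_\lambda(x, y)| \le |\lambda(x)||y| \le |\lambda||x||y|$$
so that
$$|\varphi_\lambda| \le |\lambda|.$$
On the other hand, given $\varphi \in L(E, F; G)$, we can define $\lambda_\varphi$ by
$$\lambda_\varphi(x)(y) = \varphi(x, y).$$
Then
$$|\lambda_\varphi(x)(y)| \le |\varphi||x||y|$$
so that by definition,
$$|\lambda_\varphi(x)| \le |\varphi||x|.$$
Hence
$$|\lambda_\varphi| \le |\varphi|.$$
Thus we get a Banach isomorphism
$$L(E, L(F, G)) \leftrightarrow L(E, F; G).$$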
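The correspondence between $\lambda$ and $\varphi_\lambda$ is what programmers call currying. The following sketch shows only the algebraic bijection, not the norm estimates; it assumes $E = F = \mathbf{R}^2$ and $G = \mathbf{R}$, and the bilinear map `phi` as well as the names `curry` and `uncurry` are hypothetical choices made for the illustration.

```python
def phi(x, y):
    """A bilinear map R^2 x R^2 -> R, here (x, y) -> x^T M y for a fixed matrix M."""
    m = ((1.0, 2.0), (0.0, -1.0))
    return sum(x[i] * m[i][j] * y[j] for i in range(2) for j in range(2))


def curry(bilinear):
    """Send phi to lambda_phi, where lambda_phi(x) is the linear map y -> phi(x, y)."""
    return lambda x: (lambda y: bilinear(x, y))


def uncurry(lam):
    """Send lambda to phi_lambda, where phi_lambda(x, y) = lambda(x)(y)."""
    return lambda x, y: lam(x)(y)


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, 4.0)
    print(phi(x, y), curry(phi)(x)(y), uncurry(curry(phi))(x, y))
```

The two constructions are inverse to each other, which is the content of the displayed isomorphism once the norm inequalities above are taken into account.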
As one example of a bilinear map, we have
$$L(E, F) \times E \to F$$
such that $(\lambda, x) \mapsto \lambda(x)$. This bilinear map has norm 1.
Similarly, we can treat multilinear maps. If $E_1, \dots, E_n$, $F$ are normed
vector spaces, a multilinear map
$$\varphi: E_1 \times \cdots \times E_n \to F$$
is a map which is linear in each variable. Such a map is continuous if
and only if there exists $C$ such that for all $x_i \in E_i$ we have
$$|\varphi(x_1, \dots, x_n)| \le C|x_1| \cdots |x_n|.$$
We have a norm-preserving isomorphism
$$L\bigl(E_1, L(E_2, \dots, L(E_n, F) \dots)\bigr) \to L(E_1, \dots, E_n; F)$$
from the space of repeated continuous linear maps to the space of
continuous multilinear maps exactly as in the bilinear case. If $F$ is complete,
then all these spaces are also complete.
We now consider a specially important space of linear maps.
The normed vector space $L(E, \mathbf{R})$ [or $L(E, \mathbf{C})$ in the complex case] is
called the dual space of $E$, and is denoted by $E'$. Elements of $E'$ are
called functionals on $E$. Functionals can be used as substitutes for
coordinates. Indeed, suppose that $E = \mathbf{R}^k$, and let $\lambda_i$ be the $i$-th coordinate
function, that is
$$\lambda_i(x_1, \dots, x_k) = x_i.$$
Then it is easily verified that $\{\lambda_1, \dots, \lambda_k\}$ is a basis for the dual space of
$\mathbf{R}^k$. Furthermore, the values of $\lambda_1, \dots, \lambda_k$ on an element $x \in \mathbf{R}^k$
characterize this element. Although we do not have such convenient bases in the
infinite dimensional case, we still have such a characterization of elements
of $E$ in terms of the values of functionals. This is based on the following
theorem.
Theorem 1.1. Let $E$ be a real normed vector space, and let $F$ be a
subspace. Let $\lambda: F \to \mathbf{R}$ be a functional, bounded by a number $C > 0$.
Then there exists an extension of $\lambda$ to a functional of $E$, having the
same bound.
Proof. Changing the norm on $E$ (multiplying it by a number) we see
that it suffices to prove our theorem when $C = 1$. We first prove that if
$v \in E$ and $v \notin F$, then we can extend $\lambda$ to $F + \mathbf{R}v$, and preserve the bound
1. Every element of $F + \mathbf{R}v$ has a unique expression as $x + tv$ with $x \in F$
and $t \in \mathbf{R}$. Let $a \in \mathbf{R}$. The map $\lambda^*$ on $F + \mathbf{R}v$ such that
$$\lambda^*(x + tv) = \lambda(x) + ta$$
is certainly linear. We must show that we can select $a$ such that $\lambda^*$ is
bounded by 1, that is $|\lambda(x) + ta| \le |x + tv|$ for all $x \in F$ and $t \in \mathbf{R}$.
Dividing both sides by $t$ (if $t \ne 0$), we see that it suffices
to find a number $a$ such that
$$|\lambda(y) + a| \le |y + v|$$
for all $y \in F$, or equivalently that for all $y \in F$,
$$\lambda(y) + a \le |y + v| \quad \text{and} \quad -\lambda(y) - a \le |y + v|.$$
This determines inequalities for $a$, namely
$$-\lambda(y) - |y + v| \le a \le -\lambda(y) + |y + v|,$$
and it suffices to show that the set of real $a$ satisfying such inequalities is
not empty. But for all $y, z \in F$ we have
$$|\lambda(y) - \lambda(z)| = |\lambda(y - z)| \le |y - z| \le |y + v| + |z + v|$$
so that
$$-\lambda(z) - |z + v| \le -\lambda(y) + |y + v|.$$
From this we conclude that there is a non-empty interval of values of $a$
which satisfy our requirements.
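The interval of admissible values of $a$ can be computed in a small example. The sketch below makes the following assumptions, none of which come from the text: $E = \mathbf{R}^2$ with the norm $|(x, t)| = |x| + |t|$, $F = \mathbf{R} \times \{0\}$, $\lambda(y, 0) = y$ (which has bound 1), and $v = (0, 1)$. The suprema and infima are only estimated over a finite grid, and the function names are hypothetical.

```python
def norm1(x, t):
    """The norm |x| + |t| on R^2, used here as the norm on E."""
    return abs(x) + abs(t)


def lam(y):
    """The functional on F = R x {0} sending (y, 0) to y; it is bounded by 1."""
    return y


def admissible_interval(grid):
    """Grid estimates of sup_z (-lam(z) - |z + v|) and inf_y (-lam(y) + |y + v|)."""
    lower = max(-lam(z) - norm1(z, 1.0) for z in grid)
    upper = min(-lam(y) + norm1(y, 1.0) for y in grid)
    return lower, upper


def is_bounded_by_one(a, grid):
    """Check |lam(x) + t a| <= |x + t v| at all grid points (x, t)."""
    return all(abs(lam(x) + t * a) <= norm1(x, t) + 1e-12
               for x in grid for t in grid)


# Dyadic grid points, so that the floating point arithmetic below is exact.
grid = [k / 8.0 for k in range(-160, 161)]
lower, upper = admissible_interval(grid)

if __name__ == "__main__":
    print(lower, upper, is_bounded_by_one(0.5, grid), is_bounded_by_one(1.5, grid))
```

With these choices the admissible values are $-1 \le a \le 1$, so the extension is far from unique here; with other norms on $\mathbf{R}^2$ the interval may reduce to a single point.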
We now use Zorn's lemma. We consider the set of pairs $(G, \lambda^*)$ where
$G$ is a subspace of $E$ containing $F$, and $\lambda^*$ is a functional on $G$ having
the same bound as $\lambda$, and extending $\lambda$. We order such pairs
$$(G_1, \lambda_1) \le (G_2, \lambda_2)$$
if $G_1$ is a subspace of $G_2$ and $\lambda_2$ is an extension of $\lambda_1$. This is an
ordering, and our set of pairs is inductively ordered. The proof of this is
the usual proof: Given a totally ordered set of pairs as above, say
$\{(G_i, \lambda_i)\}$, we let $G$ be the union of all $G_i$. We can define a functional $\lambda^*$
on $G$ extending all $\lambda_i$: Any $x \in G$ is in some $G_i$, and we define
$\lambda^*(x) = \lambda_i(x)$. This is independent of the choice of $i$ such that $x \in G_i$, and the pair
$(G, \lambda^*)$ is an upper bound for our family. By Zorn's lemma, let $(G, \lambda^*)$ be
a maximal element. Then $G = E$, for otherwise, there is some $v \in E$,
$v \notin G$, and we can use the first part of the proof to get a bigger pair.
Corollary 1.2. Let $E$ be a normed vector space, and $v \in E$, $v \ne 0$. Then
there exists a functional $\lambda$ on $E$ such that $\lambda(v) \ne 0$.
Proof. Let $F$ be the one-dimensional space generated by $v$. We define
$\lambda$ on $F$ taking any non-zero value on $v$, and extend $\lambda$ to $E$ using
Theorem 1.1.
Theorem 1.1, or its Corollary, is referred to as the Hahn-Banach
theorem. We have formulated it over the reals, but it is also valid for
complex Banach spaces, and the complex case is easily reduced to the
real case. Indeed, given a complex functional $\lambda$ on a complex subspace
$F$, let $\varphi$ be its real part. Let $\varphi'$ be a real extension of $\varphi$ to $E$, and define
$$\lambda'(v) = \varphi'(v) - i\varphi'(iv) \quad \text{for } v \in E.$$
You can verify as Exercise 2 that $\lambda'$ is a desired complex extension of $\lambda$.
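The formula $\lambda'(v) = \varphi'(v) - i\varphi'(iv)$ rests on the fact that a complex linear functional is determined by its real part. The sketch below checks this identity numerically on $\mathbf{C}^2$; the functional `lam`, its coefficients, and the vector `v` are arbitrary illustrative choices, and `complexify` is a hypothetical helper name. It does not address the bound of the extension, which is the point of Exercise 2.

```python
def lam(z):
    """A complex linear functional on C^2 (coefficients chosen arbitrarily)."""
    return (2 + 3j) * z[0] + (-1 + 0.5j) * z[1]


def phi(z):
    """The real part of lam, which is a real linear functional."""
    return lam(z).real


def complexify(real_functional):
    """Build the map v -> phi(v) - i phi(i v) from a real linear functional phi."""
    def extended(z):
        iz = tuple(1j * c for c in z)
        return real_functional(z) - 1j * real_functional(iz)
    return extended


v = (1 - 2j, 0.5 + 4j)

if __name__ == "__main__":
    print(lam(v), complexify(phi)(v))
```

Both printed values should agree, since $\varphi(iv)$ is the real part of $i\lambda(v)$, that is, minus the imaginary part of $\lambda(v)$.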

The dual space $E'$ is a special case of the space of linear maps $L(E, F)$
when $F$ is the space of scalars. As such, we have seen that it is a Banach
space with its natural norm. Furthermore, we can form the double dual
$E''$ in a similar fashion, and $E''$ is also a Banach space. Note that each
element $x \in E$ gives rise to a functional $f_x \in E''$, given by
$$f_x: E' \to \text{scalars } \mathbf{R} \text{ or } \mathbf{C} \quad \text{such that} \quad f_x(\lambda) = \lambda(x),$$
continuous for the topology defined by the norm on $E'$.
Proposition 1.3. The map $x \mapsto f_x$ is an injective linear map of $E$ into $E''$,
which is norm preserving, i.e. $|x| = |f_x|$.
Proof. Suppose $x, y \in E$ and $x \ne y$. Then $x - y \ne 0$. By the Hahn-Banach
theorem, there exists $\lambda \in E'$ such that $\lambda(x - y) \ne 0$, so $\lambda(x) \ne \lambda(y)$.
This proves that $f_x \ne f_y$, whence the map $x \mapsto f_x$ is injective. The inequality
$$|\lambda(x)| \le |\lambda||x|$$
shows that $|f_x| \le |x|$. We leave to the reader the opposite inequality
$|x| \le |f_x|$, which concludes the proof that we have an isometric
embedding of $E$ in $E''$.
In Chapter II, §1 we defined the weak topology on a space, determined
by a set of mappings into a topological space. We now apply this
notion to the dual space. We let $\mathcal{F}$ be the family of functions on $E'$
given by
$$\lambda \mapsto f_x(\lambda) = \lambda(x), \quad x \in E,$$
as above. The weak topology on $E'$ determined by this family $\mathcal{F}$ is
called simply the weak topology on $E'$. The next theorem gives one of its
most important properties.
Theorem 1.4 (Alaoglu's Theorem). Let $E$ be a Banach space, and let $E'_1$
be the unit ball in the dual space $E'$. Then $E'_1$ is compact for the weak
topology.
Proof. For each $x \in E$, let $K_x$ be the closed disc of radius 1 in $\mathbf{C}$. Let
$$K = \prod_{\substack{x \in E \\ |x| \le 1}} K_x$$
be the Cartesian product of all closed discs of radius 1, taken over all
$x \in E$ satisfying $|x| \le 1$. We give $K$ the product topology, so that by
Tychonoff's theorem, $K$ is compact. We map $E'_1$ into $K$ by the map
$$f: E'_1 \to K \quad \text{such that} \quad \lambda \mapsto \prod_{|x| \le 1} \lambda(x) = \prod_{|x| \le 1} f_x(\lambda).$$
Immediately from the definition, one sees that the map $f$ is injective, and
thus gives an embedding of $E'_1$ into the product space. Furthermore, also
from the definition of the weak topology defined in Chapter II, §1, we
observe that the weak topology determined by the family $\mathcal{F}$ is the same
as the weak topology determined by the family $\mathcal{F}_1$ of functionals $f_x$ with
$x \in E_1$ (the closed unit ball in $E$), because any $x \in E$, $x \ne 0$ is a scalar
multiple of a unit vector. More precisely, we also have an imbedding
$$f: E' \hookrightarrow \prod_{x \in E} \mathbf{C}_x \quad \text{given by} \quad \lambda \mapsto \prod_{x \in E} \lambda(x),$$
and the following diagram is commutative:
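$$\begin{array}{ccc}
E' & \longrightarrow & \prod_{x \in E} \mathbf{C}_x \\
\big| & & \big| \\
E'_1 & \longrightarrow & \prod_{x \in E_1} K_x
\end{array}$$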
The product topology induced on $\prod K_x$ is the same as the topology
induced by viewing this product as a subspace of $\prod \mathbf{C}_x$. Therefore, it
follows that the weak topology on $E'_1$ is the topology induced by viewing
$E'_1$ as a subspace of $K$ via the embedding $f$, or also as a subspace of
$\prod \mathbf{C}_x$ ($x \in E$), via the embedding of $E'$ in $\prod \mathbf{C}_x$. To show that $E'_1$ is
compact, it suffices therefore to show that $f(E'_1)$ is closed in $K$.
To do this, we first prove that $E'$ is closed in $\prod \mathbf{C}_x$ ($x \in E$). Let
$\prod \gamma(x)$ ($x \in E$) be an element of the product which lies in the closure of
$f(E')$. Given elements $x, y \in E$, we have to show that $x \mapsto \gamma(x)$ is a
bounded functional. By definition of the weak topology, given $\epsilon$ there
exists $\lambda \in E'$ such that
$$|\lambda(x) - \gamma(x)| < \epsilon,$$
$$|\lambda(y) - \gamma(y)| < \epsilon,$$
$$|\lambda(x + y) - \gamma(x + y)| < \epsilon.$$
But $\lambda(x + y) = \lambda(x) + \lambda(y)$, whence $|\gamma(x + y) - \gamma(x) - \gamma(y)| < 3\epsilon$, so
$$\gamma(x + y) = \gamma(x) + \gamma(y).$$
Similarly, one sees that $\gamma(cx) = c\gamma(x)$ for $c \in \mathbf{C}$, whence $\gamma$ is linear. Also
similarly, one sees that $\gamma$ is bounded. Furthermore, if $\prod \gamma(x)$ lies in
the closure of $E'_1$, then the above $\lambda$ can be chosen such that $|\lambda| \le 1$,
that is $|\lambda(x)| \le |x|$. Then by a similar epsilon argument, one sees that
$|\gamma(x)| \le |x|$, which proves that $f(E'_1)$ is closed, whence compact, thus
concluding the proof of Theorem 1.4.
Remark. In the case of Hilbert space, to be defined in the next

chapter, the Banach space E is self dual, and so in this case, one may
state that the unit ball in Hilbert space is compact in the weak topology.
IV, §2. BANACH ALGEBRAS
An algebra (say over R) is a vector space A, together with a mapping

$A \times A \to A$ (called a multiplication) which is bilinear. This means that
for all $u, v, w \in A$ and $c \in \mathbf{R}$ we have
\[ u(v + w) = uv + uw, \qquad (u + v)w = uw + vw, \]

\[ c(uv) = (cu)v = u(cv). \]
If in addition we have uv = vu, we say that the algebra is commutative.

If u(vw) = (uv)w, we say that the algebra is associative. If there exists an
element e E A such that eu = ue = u, we say that the algebra has a unit
element e, which is then uniquely determined, because if e' is another unit

element, then
e = ee' = e'.
A normed algebra is an associative algebra whose vector space is

normed, and whose norm satisfies the condition $|uv| \leq |u||v|$. A normed
algebra which is complete is called a Banach algebra.
For convenience when there is a unit element, we shall also assume
that $|e| = 1$. See Exercise 5 which shows that this condition can always
be achieved by a simple redefinition of the norm.
Example 1. Let A be the vector space of bounded functions on a set,

multiplication being ordinary multiplication of functions. Then A is a
Banach algebra. So is the set of bounded continuous functions.
Example 2. Let $A = \mathbf{R}^3$ and let the product be the cross product.

Then A is neither commutative nor associative, but it satisfies
the other axioms of a normed algebra. Since non-associative algebras
occur so rarely in what we do, we have taken associativity into the
definition of a normed algebra, so that the present example is not that of
a normed algebra in our sense.
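The failure of commutativity and associativity in Example 2 can be checked on the standard basis vectors. The following is a minimal sketch; the helper name `cross` and the choice of test vectors are illustrative assumptions, not part of the text.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

e1, e2 = (1, 0, 0), (0, 1, 0)

# Commutativity fails: e1 x e2 and e2 x e1 differ by a sign.
left = cross(e1, e2)
right = cross(e2, e1)

# Associativity fails: e1 x (e1 x e2) is non-zero, while (e1 x e1) x e2 = 0.
assoc_a = cross(e1, cross(e1, e2))
assoc_b = cross(cross(e1, e1), e2)
```

Bilinearity and the inequality $|u \times v| \leq |u||v|$ for the euclidean norm still hold, which is why only associativity excludes this example from the definition adopted above.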
Example 3. Let E be a normed vector space. Then L(E, E) is an

algebra, if we define the multiplication to be composition of mappings.
In other words, if $u, v \in L(E, E)$, then the product $u \circ v$ is again a
continuous linear map of E into itself, and we have associativity and bilinearity,
which follow at once from the definition of the sum of two linear maps.
Furthermore, L(E, E) has a unit element I which is the identity mapping.
We often write uv instead of $u \circ v$. Elements of L(E, E) are also called
endomorphisms of E, or operators on E, and we abbreviate L(E, E) by
End(E). If E is complete, i.e. a Banach space, then from remarks made in
§1, we conclude that End(E) is a Banach algebra. Of course, End(E) is
not necessarily commutative. It is the most important algebra studied in
this book. If E is finite dimensional, this algebra is essentially the alge-
bra of n x n matrices, where n = dim E.
Example 4. Let E be the vector space of continuous functions on R,

periodic of period $2\pi$, with the sup norm. Then E is a Banach space. If
$f, g \in E$, we define a product called the convolution product by
\[ f * g(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, g(x - t)\, dt. \]
It follows easily from elementary integrations that E is then a

commutative, associative Banach algebra. Note that E does not have a unit
element. In this direction, see Chapter III, §1.
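As a rough finite analogue of Example 4, one can replace periodic functions by sequences indexed by the integers mod n and the integral by an average over one period. The sketch below is an assumption made for illustration (the name `cyclic_convolve` and the sample data are hypothetical); it checks commutativity and associativity exactly, using rational arithmetic.

```python
from fractions import Fraction

def cyclic_convolve(f, g):
    """Discrete analogue of (1/2pi) * integral of f(t) g(x - t) dt over one period."""
    n = len(f)
    return [Fraction(1, n) * sum(f[t] * g[(x - t) % n] for t in range(n))
            for x in range(n)]

f = [1, 2, 0, -1]
g = [3, 0, 1, 1]
h = [0, 2, 2, 5]

fg = cyclic_convolve(f, g)
gf = cyclic_convolve(g, f)
fg_h = cyclic_convolve(fg, h)
f_gh = cyclic_convolve(f, cyclic_convolve(g, h))
```

The analogy has a limit: the finite algebra does possess a unit element (n times the indicator of 0), whereas the algebra of continuous periodic functions does not, so the sketch illustrates only the two algebraic identities.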
We observe that an algebra with a unit element contains a replica of

the scalars, under the map
\[ c \mapsto ce, \]
which is injective, and preserves addition and multiplication. In the case

of L(E, E), an element cI (I = Identity) is simply "multiplication by c."
Let A be an associative algebra with unit element e. An element u of
A is said to be invertible if there exists v E A such that uv = vu = e. The
element v is uniquely determined by u, because if uw = wu = e, then
multiplying on the left by v shows that w = vuw = v. We call this
element the inverse of u and denote it by $u^{-1}$. An invertible element is also
called a unit. If u, v are invertible, then so is uv, because
\[ (uv)^{-1} = v^{-1} u^{-1}. \]
Theorem 2.1. Let A be a Banach algebra with unit element e. Then the
set of invertible elements is open in A. If $v \in A$ and $|v| < 1$, then e + v
is invertible.
Proof. Let $|v| < 1$. Then the series $e + v + v^2 + \cdots$ converges
(absolutely) and since
\[ (e - v)(e + v + \cdots + v^n) = e - v^{n+1}, \]
and $|v^{n+1}| \leq |v|^{n+1} \to 0$, it follows that e - v is invertible, and that its inverse is the limit of

$e + v + \cdots + v^n$ as $n \to \infty$. That we have -v instead of v makes no
difference, since $|-v| = |v|$. Suppose now that u is invertible, and let
\[ |w - u| < \frac{1}{|u^{-1}|}. \]
Then
\[ |wu^{-1} - e| = |(w - u)u^{-1}| \leq |w - u|\,|u^{-1}| < 1. \]
Hence $wu^{-1}$ is invertible, whence w is invertible, thus proving our

theorem.
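The geometric series in the proof can be tried out in the Banach algebra of 2 x 2 real matrices. This is only a numerical sketch under stated assumptions: the norm is taken to be the maximum absolute row sum, and the helper names (`mat_mul`, `neumann_inverse`) and the sample matrix are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(v, terms=200):
    """Partial sum e + v + ... + v^terms, approximating the inverse of e - v."""
    n = len(v)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, v)      # power is now the next power of v
        total = mat_add(total, power)
    return total

v = [[0.2, 0.1], [0.0, 0.3]]           # maximum row sum 0.3 < 1
e_minus_v = [[0.8, -0.1], [0.0, 0.7]]
approx_inverse = neumann_inverse(v)
product = mat_mul(e_minus_v, approx_inverse)   # should be close to the identity
```

The number of terms is chosen generously; since the remainder is bounded by a power of 0.3, far fewer terms would already give the same accuracy in floating point.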
We observe that the map $u \mapsto u^{-1}$ is continuous (as a map defined on

the set of invertible elements). The usual proof is valid.
Corollary 2.2. Let E, F be Banach spaces. Then the set of toplinear

isomorphisms of E onto F is open in L(E, F).
Proof. Suppose that this set is not empty, and let $u\colon E \to F$ be a

toplinear isomorphism. Then for $v \in L(E, F)$ we have
\[ u^{-1} v \in L(E, E). \]
If v is close to u, then $u^{-1}v$ is close to I, and is invertible by Theorem

2.1, so there exists $w_1$ such that
\[ w_1 u^{-1} v = I_E. \]
Similarly, there exists a toplinear automorphism $w_2$ of F such that
\[ v u^{-1} w_2 = I_F. \]
Thus v has a left inverse and a right inverse, say $v_1$, $v_2$, such that
\[ v_1 v = I_E \quad \text{and} \quad v v_2 = I_F. \]
Considering $v_1 v v_2$ and using associativity shows that $v_1 = v_2$, whence v is

invertible.
IV, §3. THE LINEAR EXTENSION THEOREM
Theorem 3.1. Let E be a normed vector space, F a subspace, and G a

complete normed vector space. Let
\[ \lambda\colon F \to G \]
be a continuous linear map, with norm C. Then the closure $\bar{F}$ of F in E

is a subspace of E. There exists a unique extension of $\lambda$ to a
continuous linear map $\bar{\lambda}\colon \bar{F} \to G$, and $\bar{\lambda}$ has the same norm as $\lambda$.
Proof. Elements in $\bar{F}$ are limits of sequences in F. Thus if
\[ x = \lim x_n \quad \text{and} \quad y = \lim y_n, \]

then
\[ x + y = \lim (x_n + y_n) \]
and for $c \in \mathbf{R}$,
\[ cx = \lim c x_n. \]
Hence $\bar{F}$ is a subspace of E.
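The uniqueness of $\bar{\lambda}$ is clear from continuity. We show its existence.
Let $x \in \bar{F}$, and let $x = \lim x_n$ with $x_n \in F$. Then
\[ |\lambda x_n - \lambda x_m| = |\lambda(x_n - x_m)| \leq C |x_n - x_m|. \]
Hence $\{\lambda x_n\}$ is a Cauchy sequence in G, and since G is assumed to be

complete, $\{\lambda x_n\}$ has a limit in G which we denote by $\bar{\lambda} x$. This value is
independent of the sequence $x_n \to x$, for if $x = \lim x'_n$ with $x'_n \in F$, then
$|\lambda x_n - \lambda x'_n| \leq C|x_n - x'_n| \to 0$, so $\lim \lambda x_n = \lim \lambda x'_n$. If
\[ x = \lim x_n \quad \text{and} \quad y = \lim y_n \]
with $x_n, y_n \in F$, then for $c \in \mathbf{R}$,
\[ x + y = \lim (x_n + y_n) \quad \text{and} \quad cx = \lim c x_n. \]

Hence
\[ \bar{\lambda}(x + y) = \lim \lambda(x_n + y_n) = \lim (\lambda x_n + \lambda y_n) = \lim \lambda x_n + \lim \lambda y_n = \bar{\lambda} x + \bar{\lambda} y. \]
Similarly, $\bar{\lambda}(cx) = c\bar{\lambda}(x)$. Hence $\bar{\lambda}$ is linear, and since for $x \in F$ we have
x = lim x, it follows that $\bar{\lambda} x = \lambda x$ if $x \in F$. Thus $\bar{\lambda}$ is an extension of $\lambda$.
Finally, we have
\[ |\bar{\lambda} x| = |\lim \lambda x_n| = \lim |\lambda x_n| \]
because the norm is a continuous function. Since
\[ |\lambda x_n| \leq C |x_n|, \]
it follows that
\[ |\bar{\lambda} x| \leq C \lim |x_n| = C|x| \]
because limits preserve inequalities. This proves that a bound for $\lambda$ is

also a bound for $\bar{\lambda}$ and hence that $|\lambda| = |\bar{\lambda}|$. This also concludes the
proof of Theorem 3.1.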
We shall see examples of Theorem 3.1 very frequently in the sequel,

notably in the existence proof for the completion of a normed vector
space, in integration, Chapter VI, §3 and Chapter XIII, §1; and in the
spectral theorem of Chapter XVIII.
IV, §4. COMPLETION OF A NORMED VECTOR SPACE
Let E be a normed vector space. We wish to associate with E a

complete normed vector space in a manner analogous to that which
associates the real numbers to the rational numbers. We shall follow
the method of Cauchy sequences. For another method, cf. Exercise 25.
We define a completion of E to be a pair $(\bar{E}, \varphi)$ consisting of a Banach
space $\bar{E}$ and a continuous linear map
\[ \varphi\colon E \to \bar{E} \]
which is injective, such that $\varphi(E)$ is dense in $\bar{E}$, and such that $\varphi$ preserves
the norm, i.e. $|\varphi x| = |x|$ for all $x \in E$. We shall now prove that such a pair
is essentially uniquely determined. In fact, if $(F, \psi)$ is another completion,
then there exists a unique invertible element $\lambda \in L(\bar{E}, F)$ such that the
following diagram is commutative, in other words $\psi = \lambda \circ \varphi$.
\[
\begin{array}{ccc}
\bar{E} & \xrightarrow{\;\lambda\;} & F \\
{\scriptstyle \varphi}\nwarrow & & \nearrow{\scriptstyle \psi} \\
 & E &
\end{array}
\]
The proof is in fact very easy. The map
\[ \psi \circ \varphi^{-1}\colon \varphi(E) \to F \]
is continuous and linear (it even preserves the norm) and consequently,
by the linear extension theorem, it has a unique continuous linear
extension of $\bar{E}$ into F, which we denote by $\lambda$. Similarly, the continuous linear
map
\[ \varphi \circ \psi^{-1}\colon \psi(E) \to \bar{E} \]
has a continuous linear extension of F into $\bar{E}$, which we denote by $\mu$.

Then $\mu \circ \lambda\colon \bar{E} \to \bar{E}$ gives the identity when restricted to $\varphi(E)$, and hence is
equal to the identity on $\bar{E}$ itself by continuity (or by the uniqueness part
of the linear extension theorem). Similarly, $\lambda \circ \mu\colon F \to F$ is the identity.
This proves the uniqueness of the completion.
We observe that our toplinear isomorphism $\lambda$ preserves norms, that is
\[ |\lambda x| = |x| \]
for all $x \in \bar{E}$. This again follows by continuity.
We shall now give two proofs of the existence of a completion. So let
E be a normed vector space and let E' be its dual. As we saw in
Proposition 1.3, we have a natural norm-preserving injection E -+ E".
But E" is complete because E" = L(E', F) with complete F (F = scalars).
So the completion of E is simply the closure $\bar{E}$ of E in E''. (Do Exercise 15.)
Next we give another proof, based on the same construction as the
real numbers from the rational numbers. This construction will be used
in the integration theory. See the examples after the construction.
The Cauchy sequences of elements of E form a vector space, which we
denote by S. As usual, we have the notion of null sequences, that is
sequences $\{x_n\}$ in E such that given $\epsilon$, there exists N such that for all
n > N we have $|x_n| < \epsilon$. The null sequences form a subspace. We define
two Cauchy sequences $\xi = \{x_n\}$ and $\eta = \{y_n\}$ to be equivalent if there
exists a null sequence $\alpha = \{a_n\}$ such that $\xi = \eta + \alpha$ (in other words
$x_n = y_n + a_n$ for all n). This is an equivalence relation, and we denote
the equivalence class of $\xi$ by $\bar{\xi}$. Then the equivalence classes of Cauchy
sequences form a vector space in a natural way, and we have (for $c \in \mathbf{R}$):
\[ \bar{\xi} + \bar{\eta} = \overline{\xi + \eta} \quad \text{and} \quad c\bar{\xi} = \overline{c\xi}. \]
We denote the vector space of equivalence classes of Cauchy sequences

by $\bar{E}$. (It is nothing but the factor space of Cauchy sequences modulo
the subspace of null sequences.)
If $\xi = \{x_n\}$ is a Cauchy sequence and $\eta = \{y_n\}$ is equivalent to $\xi$, then
\[ \lim_{n \to \infty} |x_n| = \lim_{n \to \infty} |y_n|. \]
Then we define
\[ |\bar{\xi}| = \lim_{n \to \infty} |x_n|. \]
It is verified at once that this is a norm on $\bar{E}$, which is thus a normed

vector space.
We let
\[ \varphi\colon E \to \bar{E} \]
be the map such that $\varphi(x)$ is the class of the Cauchy sequence {x, x, ...}.
Then it is clear that $\varphi$ is linear, and preserves norms. Furthermore, one
sees at once that if $\bar{\xi}$ is the class of a Cauchy sequence $\xi$,
and $\xi = \{x_n\}$,
then
\[ \lim_{n \to \infty} \varphi(x_n) = \bar{\xi}. \]
Hence $\varphi(E)$ is dense in $\bar{E}$.

All that remains to prove is that $\bar{E}$ is complete. To do this, let $\{\bar{\xi}_n\}$
be a Cauchy sequence in $\bar{E}$. For each n there exists an element $x_n \in E$
such that
\[ |\bar{\xi}_n - \varphi x_n| < \frac{1}{n} \]
because $\varphi(E)$ is dense in $\bar{E}$. The sequence $\{x_n\}$ is then Cauchy (in E).
Indeed, we have
\[ |x_n - x_m| = |\varphi x_n - \varphi x_m| \leq |\varphi x_n - \bar{\xi}_n| + |\bar{\xi}_n - \bar{\xi}_m| + |\bar{\xi}_m - \varphi x_m|, \]

which gives a $3\epsilon$-proof of the fact that $\{x_n\}$ is a Cauchy sequence. Let
$\xi = \{x_n\}$. Then $\{\bar{\xi}_n\}$ converges to $\bar{\xi}$, because given $\epsilon$,
\[ |\bar{\xi} - \bar{\xi}_n| \leq |\bar{\xi} - \varphi x_n| + |\varphi x_n - \bar{\xi}_n| < 2\epsilon \]
for n sufficiently large. This proves that $\bar{E}$ is complete, and concludes the
proof for the existence of a completion of E.
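The model for this construction is the passage from the rational numbers to the real numbers, and that model case is easy to experiment with. The sketch below is an illustration under assumptions not made in the text: it uses Newton's iteration for $x^2 = 2$ (a choice made here) to produce a Cauchy sequence of rationals whose limit is not rational, and the name `newton_sqrt2` is hypothetical.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return the rational Newton iterates x_0, ..., x_steps for x^2 = 2."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2    # exact rational arithmetic, so x stays in Q
        seq.append(x)
    return seq

seq = newton_sqrt2(5)

# Distances between consecutive terms shrink rapidly, as for a Cauchy sequence.
gaps = [abs(seq[k + 1] - seq[k]) for k in range(len(seq) - 1)]

# No term squares exactly to 2, so no term is itself the limit.
squares_hit_two = [x * x == 2 for x in seq]
```

In the language of this section, the class of such a sequence modulo null sequences is the element of the completion that plays the role of the missing limit.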
Example 1. In integration theory, covered later in this book, one

starts with the vector space of continuous functions, say on [0, 1], with
the $L^1$-norm
\[ \|f\|_1 = \int_0^1 |f(t)|\, dt. \]
One can also take the vector space of continuous functions on R,
vanishing outside some bounded interval, and define the $L^1$-norm similarly.
Then this space is not complete, and its completion is called $L^1$. It then
becomes a problem to identify elements of $L^1$ with certain functions, and
this is what we shall do.
Example 1 points to the need of a slight generalization of our normed

vector spaces. Indeed, even in elementary integration theory, one deals
with step functions, or piecewise continuous functions, which are such
that if $\|f\|_1 = 0$, then f may not be the zero function. For instance, if f
is 0 except at a finite number of points, then we do have $\|f\|_1 = 0$. In
view of this, one defines a seminorm on a vector space E to be a function
satisfying all properties of a norm, except that we require
\[ |x| \geq 0 \]
for all $x \in E$, but we allow $|x| = 0$ without having necessarily x = 0.

Then it is clear that the set of all $x \in E$ such that $|x| = 0$ is a subspace
$E_0$. The terminology of open and closed sets applies in the present
context, and the topology defined by a seminorm is simply not
Hausdorff when $E_0 \neq \{0\}$. In fact, the closure of 0 is obviously the space $E_0$ itself.
In defining the completion, we can just as well define the

completion of a space with a seminorm. We form Cauchy sequences and null
sequences, and we still get a map
\[ j\colon E \to \bar{E}, \]
the only difference being that j has a kernel, which the reader will verify
to be precisely $E_0$. In fact, we have a norm on the factor space $E/E_0$ if
we define the norm $|x + E_0|$ of a coset to be $|x|$ (independent of the coset
representative x since we have
\[ |x + y| = |x| \]
for all $y \in E_0$). Thus we can say that if E has a seminorm, the
completion $\bar{E}$ is simply the completion of $E/E_0$ as discussed in this section.
A vector space E with a seminorm I I can be called a seminormed
space. We can define Cauchy sequences using the same definition as in
the normed case. We shall say that E is complete if every Cauchy
sequence in E converges; in other words, if given a Cauchy sequence
$\{x_n\}$ in E, there exists $x \in E$ such that given $\epsilon$, there exists N such that
for all $n \geq N$ we have
\[ |x_n - x| < \epsilon. \]
Of course, the element x to which our sequence $\{x_n\}$ converges is not
uniquely determined, only up to an element of $E_0$. However, examples of
this situation arise in practice, in integration theory. One must then
distinguish between a complete seminormed space, and the completion of
E/ Eo mentioned above.
Example 2. Let E be the vector space of $C^\infty$ functions (say, real
valued) on R, vanishing outside a compact set (i.e. infinitely differentiable
functions f such that f(t) = 0 if t is outside some bounded interval). We
define the $H^0$-norm on E by
\[ \|f\|_{H^0} = \langle f, f \rangle^{1/2}, \]
where
\[ \langle f, f \rangle = \int_{-\infty}^{\infty} f(t)^2\, dt. \]
We define the $H^p$-norm by
where D is the derivative. The completion of E under the $H^p$-norm is

called an $H^p$ space. This kind of space is used very frequently in analysis.
For p = 0, the norm is also called the $L^2$-norm.
Example 3. On the interval [0, 1], we let $C^p$ be the space of functions

having p continuous derivatives. For $f \in C^p$ we define
\[ \|f\|_{C^p} = \sup_{k \leq p} \|D^k f\|. \]
Then this is a norm. It is an exercise to show that $C^p$ is already

complete under this norm.
IV, §5. SPACES WITH OPERATORS
Except for enumerating basic properties, it is rather rare in analysis that

one meets merely a normed vector space, or a Banach space, just by
itself. It is usually accompanied by a set of operators, and thus we make
here some general comments on this situation.
Let E be a normed vector space. Elements of L(E, E) are also called
operators on E. Let S be a set of operators on E. By an S-invariant
subspace F we mean a subspace such that for every A E S we have
$AF \subset F$, i.e. if $x \in F$ and $A \in S$, then $Ax \in F$. It is clear that if F is an
S-invariant subspace, then its closure is also S-invariant because if $x_n \in F$
and $x_n \to x$, then $Ax_n \to Ax$, so Ax lies in the closure of F.
An operator B is said to commute with S if AB = BA for all A E S. If

B commutes with S, then both the kernel of B and its image are S-invariant
subspaces.
Proof. If $x \in E$ and Bx = 0, then ABx = BAx = 0 for all $A \in S$, so the
kernel of B is S-invariant. Similarly, also from the relation ABx = BAx,
we see that the image of B is S-invariant.
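In finite dimensions the statement can be verified by hand on a small example. The sketch below takes S to consist of a single 2 x 2 matrix A and lets B = A - 2I, which commutes with A; the matrices, vectors, and helper names are illustrative assumptions.

```python
def mat_mul(a, b):
    """Product of two square matrices given as tuples of rows."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def apply_op(m, x):
    """Apply the matrix m to the vector x."""
    return tuple(sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m)))

A = ((2, 1), (0, 2))
B = ((0, 1), (0, 0))          # B = A - 2I

commute = mat_mul(A, B) == mat_mul(B, A)

# Kernel: x = (1, 0) satisfies Bx = 0, and then B(Ax) = A(Bx) = 0 as well.
x = (1, 0)
B_of_Ax = apply_op(B, apply_op(A, x))

# Image: y = Bz lies in the image of B, and Ay = B(Az) lies there too.
z = (3, 5)
A_of_Bz = apply_op(A, apply_op(B, z))
B_of_Az = apply_op(B, apply_op(A, z))
```

The equality of the last two vectors is exactly the relation ABz = BAz used in the proof.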
If A is an operator on E, and $c_0, \ldots, c_n$ are numbers, we may form the

operator
\[ p(A) = c_n A^n + \cdots + c_0 I, \]
where
\[ p(t) = c_n t^n + \cdots + c_0 \]
is the polynomial having the numbers as coefficients. If p, q are

polynomials and pq denotes the ordinary product of polynomials, then we have
\[ (p + q)(A) = p(A) + q(A) \quad \text{and} \quad (pq)(A) = p(A)q(A). \]
Indeed, if $q(t) = b_m t^m + \cdots + b_0$, then
\[ (pq)(t) = d_{m+n} t^{m+n} + \cdots + d_0, \]

where
\[ d_k = \sum_{r+s=k} c_r b_s. \]
But
\[ p(A)q(A) = d_{m+n} A^{m+n} + \cdots + d_0 I \]
since associativity, commutativity, and distributivity hold in multiplying

powers of A. The statement concerning the sum p + q is even more
trivial to see. Also, if c is a number, then
\[ (cp)(A) = cp(A). \]
All these rules are useful when considering the evaluation of polynomials
on operators. In algebraic terminology, they express the fact that the
map
P 1--+ p(A)
is a ring-homomorphism from the ring of polynomials into the ring of

operators.
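The multiplicativity of $p \mapsto p(A)$ can be tested on integer matrices, where the arithmetic is exact. In this sketch a polynomial is stored as its list of coefficients $[c_0, c_1, \ldots, c_n]$; that representation, the helper names, and the sample data are assumptions made for illustration.

```python
def poly_mul(p, q):
    """Coefficients d_k = sum over r + s = k of c_r * b_s."""
    d = [0] * (len(p) + len(q) - 1)
    for r, c in enumerate(p):
        for s, b in enumerate(q):
            d[r + s] += c * b
    return d

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(p, a):
    """Evaluate p(A) = c_n A^n + ... + c_0 I by Horner's scheme."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(p):
        result = mat_mul(result, a)    # multiply the running value by A
        for i in range(n):
            result[i][i] += c          # then add c times the identity
    return result

A = [[1, 2], [3, 4]]
p = [1, 0, 2]      # p(t) = 2t^2 + 1
q = [-1, 3]        # q(t) = 3t - 1

lhs = poly_at(poly_mul(p, q), A)             # (pq)(A)
rhs = mat_mul(poly_at(p, A), poly_at(q, A))  # p(A) q(A)
```

The check works because all matrices involved are polynomials in the single operator A and therefore commute with each other.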
If F is an A-invariant subspace, then it is clear that F is also p(A)-
invariant for all polynomials p. Thus if F is in fact a subspace of E
which is invariant for an operator A, then it is also invariant for the set
of all polynomials in A, called also the ring of operators generated by A.
The same holds for any set of operators S, letting the ring of operators
generated by S be the set of all operators expressed as finite sums
where $A_1, \ldots, A_n$ are elements of S, and the coefficients are numbers.
Indeed, if F is A- and B-invariant, then it is also (A + B)-invariant and
AB-invariant.
If an operator B commutes with all elements of S, then it is clear that
B also commutes with all elements in the ring of operators generated
by S, because if B commutes with $A_1$ and $A_2$, then B commutes with
$A_1 + A_2$ and also with $A_1 A_2$. Furthermore, if F is a closed subspace
and is S-invariant, then it is also $\bar{S}$-invariant, where $\bar{S}$ is the closure of
S. Indeed, if $\{B_n\}$ is a sequence of operators in S converging to some
operator B, and if x E F, then the sequence {Bnx} is Cauchy, and hence
converges to Bx which lies in F.
In Chapters XVII and XVIII we study a pair (E, A) consisting of a
space E and an operator A, and analyze this pair, describing its structure
completely in important cases. The idea is to apply in the present con-
text an all-pervasive point of view in mathematics, which is to decompose
an object into a direct sum of simpler objects. In the present context, let
us make some general definitions.
Let E be a Banach space, and F, G closed subspaces. We know that

the product $F \times G$ consisting of all pairs (y, z) with $y \in F$ and $z \in G$ is
also a Banach space, say under the sup norm. If the map
\[ F \times G \to E \]
given by
\[ (y, z) \mapsto y + z \]
is a toplinear isomorphism, then we say that E is the direct sum of the
subspaces F and G. Observe that our requirements involve both an
algebraic and a topological condition. It follows from our conditions
that
\[ E = F + G \quad \text{and} \quad F \cap G = \{0\}. \]
It will be proved later that, in fact, these two conditions are sufficient; in
other words, if they are satisfied, then the map
\[ F \times G \to E \]
not only has an algebraic inverse, but this inverse is continuous
(corollary of the open mapping theorem). When E is a direct sum of F and G,
we write
\[ E = F \oplus G. \]
If A is an operator on E, then we are interested in expressing E as a

direct sum of A-invariant subspaces. Subsequent chapters give examples
of this situation.
APPENDIX: CONVEX SETS
APP., §1. THE KREIN-MILMAN THEOREM
Although we shall not use the theorem of this section later in the book
(except for some exercises), it is worthwhile giving it since it is used
at the beginning of more advanced and specialized courses, in a wide
variety of contexts. The exposition follows that of Artin (cf. Collected
Works).
Throughout this section, we let E be a vector space over the reals (not
normed). We let E* be a vector space of linear maps of E into R (not
necessarily the space of all such linear maps), and assume that E* separates
E, that is given $x \in E$, $x \neq 0$, there exists $\lambda \in E^*$ such that $\lambda(x) \neq 0$. We
give E the topology having the fewest open sets making all
$\lambda \in E^*$ continuous. A base for this topology is therefore given by the
following sets: We take $x \in E$, and $\lambda_1, \ldots, \lambda_n \in E^*$, and $\epsilon > 0$. We let B
be the set of all $y \in E$ such that
\[ |\lambda_i(y) - \lambda_i(x)| < \epsilon \qquad (i = 1, \ldots, n). \]
The set of all such B is a base for the E*-topology.
A subset S of E is said to be convex if given $x, y \in S$, the line segment
\[ (1 - t)x + ty, \qquad 0 \leq t \leq 1, \]
joining x to y is contained in S.
We observe that an arbitrary intersection of convex sets is convex.
Lemma 1.1. Let $x_1, \ldots, x_n \in S$. Any convex set containing $x_1, \ldots, x_n$
also contains all linear combinations
\[ t_1 x_1 + \cdots + t_n x_n \]
with $0 \leq t_i \leq 1$ for all i, and $t_1 + \cdots + t_n = 1$. Conversely, the set of all
such linear combinations is convex.
Proof. If $t_n \neq 1$, then the above linear combination is equal to
\[ (1 - t_n)\left( \frac{t_1}{1 - t_n}\, x_1 + \cdots + \frac{t_{n-1}}{1 - t_n}\, x_{n-1} \right) + t_n x_n. \]
The first assertion follows at once by induction. The converse is also an

immediate consequence of the definitions.
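The induction in the proof is effectively an algorithm: a convex combination of n points is reached from a convex combination of the first n - 1 points by moving along one line segment. The following sketch mirrors that reduction for points of the plane with rational weights; the function names and the sample data are hypothetical.

```python
from fractions import Fraction

def segment_point(x, y, t):
    """Point (1 - t) x + t y on the segment from x to y."""
    return tuple((1 - t) * a + t * b for a, b in zip(x, y))

def convex_combination(points, weights):
    """Evaluate t_1 x_1 + ... + t_n x_n using only two-point segments."""
    *head_pts, last_pt = points
    *head_w, last_w = weights
    if not head_pts or last_w == 1:
        return tuple(Fraction(c) for c in last_pt)
    # Rescale the first n - 1 weights so that they again sum to 1.
    rescaled = [w / (1 - last_w) for w in head_w]
    inner = convex_combination(head_pts, rescaled)
    return segment_point(inner, last_pt, last_w)

points = [(0, 0), (4, 0), (0, 4)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

via_segments = convex_combination(points, weights)
direct = tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(2))
```

Since every intermediate point is obtained by a single segment between two points already known to lie in the convex set, the final point lies in it as well, which is the content of the first assertion.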
The following properties of convex sets also follow at once from the
definitions.
Let $A\colon E \to F$ be a linear map. If S is convex in E, then A(S) is convex

in F. If T is convex in F, then $A^{-1}(T)$ is convex in E. In other words,
the image and inverse image of a convex set under a linear map are
convex.
Let $\lambda \in E^*$, $\lambda \neq 0$, and let $H_0$ be the kernel of $\lambda$ (i.e. the set of all $x \in E$
such that $\lambda(x) = 0$). Then $H_0$ is a closed subspace, and if $v \in E$ is such
that $\lambda(v) \neq 0$, then
\[ E = H_0 + \mathbf{R}v. \]
If $\lambda_1, \lambda_2$ are non-zero functionals with the same kernel H, then there
exists $c \in \mathbf{R}$, $c \neq 0$, such that $\lambda_1 = c\lambda_2$. Indeed, one sees at once that
$c = \lambda_1(v)/\lambda_2(v)$, where v is any element not in H.
Let $\lambda \neq 0$ be an element of E*, and let $c \in \mathbf{R}$. By the hyperplane $H_c$ we
mean the set of all $x \in E$ such that $\lambda(x) = c$. In other words, $H_c = \lambda^{-1}(c)$.
If $H_0$ is the kernel of $\lambda$, then $H_c$ consists of all elements $y + y_0$ with
$y \in H_0$ and $y_0$ any fixed element of E such that $\lambda(y_0) = c$.
The set of $x \in E$ such that $\lambda(x) \geq c$ will be called a closed half space
determined by the hyperplane, and so will the set of all x such that
$\lambda(x) \leq c$. Similarly, we have the open half spaces, determined by the
inequalities $\lambda(x) > c$ and $\lambda(x) < c$ respectively.
If S is a closed subset of E and $x_0$ a point, we say that a hyperplane
H separates S and $x_0$ if S is contained in one of the closed half spaces
determined by H, and $x_0$ is not contained in this half space.
Theorem 1.2. Let S be a closed convex set in E, and let $x_0 \notin S$. Then

there exists a separating hyperplane H for S and $x_0$, such that S is
contained in a closed half space determined by H.
Proof. We begin by proving our statement in the finite dimensional

case.
Let T be a closed convex subset of $\mathbf{R}^n$, and let P be a point of $\mathbf{R}^n$
such that $P \notin T$. The function f(X) = |X - P| (euclidean norm) has a
minimum on T, say at $Q \in T$. Let N = Q - P. Since $P \notin T$, we have
$N \neq 0$. We contend that the hyperplane passing through Q,
perpendicular to N, will satisfy our requirements. The equation of this hyperplane
is $X \cdot N = Q \cdot N$. Let Q' be any point of T, and $Q' \neq Q$. For every t with
$0 < t \leq 1$, we have
\[ |Q - P| \leq |Q + t(Q' - Q) - P| = |(Q - P) + t(Q' - Q)|. \]

Squaring gives
\[ (Q - P)^2 \leq (Q - P)^2 + 2t(Q - P) \cdot (Q' - Q) + t^2 (Q' - Q)^2. \]
Canceling and dividing by t, we obtain
\[ 0 \leq 2(Q - P) \cdot (Q' - Q) + t(Q' - Q)^2. \]
Letting t tend to 0 yields $0 \leq N \cdot (Q' - Q)$, that is
\[ Q' \cdot N \geq Q \cdot N = P \cdot N + N \cdot N. \]

This proves that T is contained in the closed half space defined by
\[ X \cdot N \geq c, \]
where $c = P \cdot N + N \cdot N$. Since $N \cdot N > 0$, we have $P \cdot N < c$, thus proving our contention, and the fact that

our hyperplane separates T and P.
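The finite dimensional argument can be followed on a concrete closed convex set. In the sketch below T is taken to be the unit square in the plane, an assumption chosen because the nearest point Q is then obtained by clamping each coordinate; the helper names and the point P are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nearest_point_in_box(p, lo, hi):
    """Euclidean nearest point to p in the box [lo_1, hi_1] x ... x [lo_n, hi_n]."""
    return tuple(min(max(c, l), h) for c, l, h in zip(p, lo, hi))

P = (3, 2)
lo, hi = (0, 0), (1, 1)

Q = nearest_point_in_box(P, lo, hi)        # minimizes |X - P| over the square
N = tuple(q - p for q, p in zip(Q, P))     # N = Q - P
c = dot(P, N) + dot(N, N)                  # the constant c of the proof

# X . N is linear in X, so its minimum over the square is attained at a corner;
# it therefore suffices to test the four corners.
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
square_in_half_space = all(dot(X, N) >= c for X in corners)
P_outside = dot(P, N) < c
```

The same recipe applies to any closed convex set for which the nearest point can be computed; the square is used here only because that computation is elementary.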
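We return to the general case of the space E. There exists a
neighborhood of $x_0$ which does not intersect S. In other words, there exist $\epsilon$ and
$\lambda_1, \ldots, \lambda_n \in E^*$ such that all $y \in E$ satisfying
\[ |\lambda_i(y) - \lambda_i(x_0)| < \epsilon \qquad (i = 1, \ldots, n) \]
do not lie in S. Consider the linear map
\[ \varphi\colon E \to \mathbf{R}^n \]
given by
\[ \varphi(x) = (\lambda_1(x), \ldots, \lambda_n(x)). \]
The image of S is a convex set $\varphi(S)$ in $\mathbf{R}^n$, which does not intersect the
neighborhood of $\varphi(x_0)$ determined by the inequality
\[ \|Q - \varphi(x_0)\| < \epsilon \qquad \text{(sup norm)}. \]
Its closure does not contain $\varphi(x_0)$. By our result in the finite
dimensional case, there exists a non-zero vector
\[ N = (c_1, \ldots, c_n) \]
such that $\varphi(S)$ lies in the closed half space determined by N and a
suitable constant c. We let
\[ \lambda = c_1 \lambda_1 + \cdots + c_n \lambda_n. \]
Then $\lambda \in E^*$ and S is contained in a closed half space $\lambda \geq c$, which does

not contain $x_0$, thus proving Theorem 1.2.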
Remark. All that we need in the sequel is that, the assumptions being
as in the theorem, there exists a functional $\lambda \in E^*$ such that $\lambda(x_0)$ is not
contained in $\lambda(S)$.
We define an extreme point of a convex set S to be a point $x \in S$

having the following property: Whenever $y_1, y_2$ are points of S such
that we can write
\[ x = t y_1 + (1 - t) y_2 \]
with 0 < t < 1, then $y_1 = y_2$.

Theorem 1.3. Let S be a non-empty, convex, compact subset of E. Then
there exists an extreme point of S.
Proof. Let $\mathcal{S}$ be the family of non-empty, convex, compact subsets of

E contained in S, and having the following additional property:
If $K \in \mathcal{S}$ and $x \in K$, and if $y_1, y_2 \in S$ are such that
\[ x = t y_1 + (1 - t) y_2 \]
with 0 < t < 1, then $y_1, y_2 \in K$.
Then the set S itself is in $\mathcal{S}$. We can order elements of $\mathcal{S}$ by

descending inclusion, and if $\{K_i\}_{i \in I}$ is a totally ordered subfamily, then
the intersection
\[ \bigcap_{i \in I} K_i \]
is not empty (by compactness), and clearly is again in $\mathcal{S}$. Hence by Zorn's lemma, there
exists a minimal element $S_0$ in $\mathcal{S}$. We contend that $S_0$ consists of one
point. (This will prove our theorem.) Since elements of E* separate
points, it will suffice to prove that for each $\lambda \in E^*$, the set $\lambda(S_0)$ consists
of one point. But $\lambda(S_0)$ is convex and compact, whence a closed bounded
interval. Let c be a right end point of this interval. Then the set
$\lambda^{-1}(c) \cap S_0$ is non-empty, convex, compact. We contend that it lies in $\mathcal{S}$.
Let x be an element in $\lambda^{-1}(c) \cap S_0$, and suppose that we can write
\[ x = t y_1 + (1 - t) y_2 \]
with $y_1, y_2 \in S$ and 0 < t < 1. Since $S_0 \in \mathcal{S}$, we get $y_1, y_2 \in S_0$. Applying
$\lambda$, we find that
\[ c = \lambda(x) = t\lambda(y_1) + (1 - t)\lambda(y_2). \]
Since c is an end point of the interval $\lambda(S_0)$, it follows that
\[ \lambda(y_1) = \lambda(y_2) = c. \]
Hence $y_1, y_2$ also lie in $\lambda^{-1}(c)$, and this shows that $\lambda^{-1}(c) \cap S_0$ is in $\mathcal{S}$.
Since we took $S_0$ minimal, we conclude that $S_0$ is contained in $\lambda^{-1}(c)$,
thereby proving our theorem.
Corollary 1.4. Let S be as in Theorem 1.3, and let $\lambda \in E^*$. Let c be an

end point of the interval $\lambda(S)$. Then $\lambda^{-1}(c) \cap S$ contains an extreme
point of S.
Proof. The intersection of the hyperplane $\lambda^{-1}(c)$ with S is non-empty,

convex, compact, and thus has an extreme point x, with respect to
$\lambda^{-1}(c) \cap S$. However, if $y_1, y_2 \in S$ and
\[ x = t y_1 + (1 - t) y_2 \]
with 0 < t < 1, then $\lambda(x) = c = t\lambda(y_1) + (1 - t)\lambda(y_2)$, and hence
$\lambda(y_1) = \lambda(y_2) = c$, so that $y_1, y_2 \in \lambda^{-1}(c) \cap S$. From this we conclude that
$y_1 = y_2$, and hence that x is also an extreme point of S itself.
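For the convex hull of finitely many points of the plane, the extreme points are the vertices of the hull, and Corollary 1.4 says that a linear functional attains its maximum over the hull at one of them. The sketch below illustrates this with a standard monotone chain hull computation; that algorithm, the function names, and the sample points are choices made here, not taken from the text.

```python
def turn(o, a, b):
    """Positive if o, a, b make a counterclockwise turn, zero if collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Vertices of the convex hull (monotone chain), collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and turn(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and turn(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def functional(coeffs, x):
    """Linear functional x -> coeffs . x on the plane."""
    return coeffs[0] * x[0] + coeffs[1] * x[1]

points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
vertices = hull_vertices(points)

coeffs = (1, 3)
c = max(functional(coeffs, x) for x in points)      # right end point of the image
maximizers = [x for x in vertices if functional(coeffs, x) == c]
```

Because a linear functional on the hull takes its largest value at one of the generating points, taking the maximum over `points` gives the end point c of the interval, and `maximizers` lists the extreme points lying on the hyperplane where the functional equals c.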
Theorem 1.5 (Krein-Milman Theorem). Let K be a convex, compact

subset of E. Let S be the set of extreme points of K. Then K is the
smallest closed convex set containing all elements of S (i.e. the
intersection of all closed convex sets containing S).
Proof. Let S' be the intersection of all closed convex sets containing S.

Then $S' \subset K$, and since K is compact, it follows that S' is compact.
Suppose that there exists $x_0 \in K$ but $x_0 \notin S'$. By Theorem 1.2, there exists
$\lambda \in E^*$ such that $\lambda(x_0)$ is not contained in the interval $\lambda(S')$, say
\[ \lambda(x_0) > \lambda(S'). \]
Let c be the right end point of the interval $\lambda(K)$. By Corollary 1.4, the
set $\lambda^{-1}(c) \cap K$ contains an extreme point of K, contradicting the fact that
$\lambda(S) < c$, and proving our theorem.
APP., §2. MAZUR'S THEOREM
In the applications of Theorem 1.2, one starts frequently with a convex

set in a Banach space, closed in the norm topology (i.e. the topology
defined by the norm). In Theorem 1.2, we needed a convex set closed for
the weak topology defined by a family of functionals. An example of
such a family is simply the totality of all functionals, continuous for the
norm topology. Of course, if a set S is compact for the norm topology,
it is also compact for the weak topology. One can then raise the ques-
tion whether a closed convex set for the norm topology is also closed for
the weak topology. The answer is yes:
Theorem 2.1 (Mazur's Theorem). Let E be a Banach space and let A

be a convex subset, closed for the norm topology. Then A is also closed
for the weak topology (that topology having the fewest open
sets making all functionals continuous). In fact, A is the intersection of
all closed half spaces containing A.
The proof is self contained, and is based on the following lemma.

Lemma 2.2. Let U be an open non-empty convex set in E which does

not contain the origin. Then there exists a functional $\lambda$ on E whose
kernel does not intersect U.
Proof. Let $a \in U$. Then $-a \notin U$, otherwise $0 \in U$ because U is convex,

and this is impossible. By a cone we shall mean a subset C of E such
that if $x \in C$, then $tx \in C$ for all real $t \geq 0$. Let $\Gamma$ be the set of all convex
cones containing U but not -a. Then $\Gamma$ is not empty because the set
of all points tx with $t \geq 0$ and $x \in U$ is verified to be a convex cone
directly from the definitions, and belongs to $\Gamma$. It is clear that $\Gamma$ is
inductively ordered by ascending inclusion. Let C be a maximal element
of $\Gamma$. We contend that $C \cap (-C)$ is a closed hyperplane H which does
not intersect U.
[Picture not reproduced.]
First we prove that the maximal cone C is closed. Suppose C is not

closed. Then we must have $-a \in \bar{C}$, for otherwise we have $C \subset \bar{C} \in \Gamma$
and $\bar{C} \neq C$, contradicting the maximality of C. On the other hand, we
have $a \in U \subset C$. Since U is open, there is a ball $B \subset U$ centered at a
and of radius r > 0. But $\bar{C}$ is convex. Therefore $\bar{C}$ contains the set A of
elements (-a + x)/2 with $x \in B$. It is easy to see that A contains the ball
centered at the origin and of radius r/2. This and the fact that $\bar{C}$ is a
cone imply that $\bar{C} = E$, a contradiction. It follows that $H = C \cap (-C)$ is
closed, is a cone, is convex, and H = -H. Therefore H is immediately
seen to be a closed subspace. We have $H \neq E$ because $-a \notin C$, so
$-a \notin H$.
We have $E = C \cup (-C)$. To see this, let $x \in E$ and suppose $x \notin C$,
$x \notin -C$. Since C is maximal, the cone consisting of all elements c + tx
with $c \in C$, $t \geq 0$ contains -a, and so does the cone of all elements
c + t(-x), $c \in C$, $t \geq 0$. Hence we can write
\[ -a = c_1 + t_1 x = c_2 - t_2 x \]
with $c_1, c_2 \in C$ and $t_1, t_2 \geq 0$. Consequently
\[ c_1 + (t_1 + t_2)x = c_2 \in C. \]
However, $c_1 + t_1 x = -a$ is on the line segment between $c_1$ and
$c_1 + (t_1 + t_2)x$, and thus lies in C, a contradiction which proves that
$E = C \cup (-C)$.
Now suppose that $x \in C$. Then the line segment between x and -a
contains a point of H. For instance, on the segment x + t(-a - x) with
$0 \leq t \leq 1$, let $\tau$ be the sup of all t such that x + t(-a - x) lies in C.
Then $x + \tau(-a - x)$ lies in H, and $\tau \neq 1$, otherwise $-a \in H$, which is
impossible. We therefore have
\[ (1 - \tau)x - \tau a = h \in H, \]
whence
\[ x = \frac{\tau}{1 - \tau}\, a + \frac{1}{1 - \tau}\, h. \]
Working also with -x instead of x, we conclude that E is generated by

H and a, so that the factor space E/H has dimension 1 and hence H is a
closed hyperplane.
Finally, H does not intersect U, for otherwise let $h \in H \cap U$. Since U
is open, for small s > 0 we have $h - sa \in U$ so $h - sa \in C$. But $-h \in C$,
whence $-sa \in C$ and $-a \in C$, which is impossible. This proves our
lemma.
We now prove: Let b be a point of a Banach space E, which does not

belong to the norm-closed non-empty convex set A. Then there exists a
functional $\lambda$ and a number $\alpha$ such that $\lambda(x) > \alpha$ for all $x \in A$ and
$\lambda(b) < \alpha$.
Proof. Let B be an open ball centered at b and not intersecting A.

Then the set U = A - B, consisting of all points x - y with $x \in A$ and
$y \in B$, is open, convex, non-empty, and does not contain the origin. (U
is open because it is a union of open sets a - B with $a \in A$, and it is
immediately verified to be convex because the sum of two convex sets is
convex.) We apply our lemma to U and find a functional $\lambda$ as in the
lemma, so that (replacing $\lambda$ by $-\lambda$ if necessary) $\lambda z \geq 0$ for all $z \in A - B$, and therefore $\lambda x \geq \lambda y$ for all
$x \in A$, $y \in B$. Let $\beta = \inf \lambda x$ for $x \in A$. The map $\lambda$ is an open
mapping, for instance because $\lambda$ gives an isomorphism of a one-dimensional
subspace of E onto R. Therefore $\lambda y < \beta$ for all $y \in B$, so that in
particular, $\lambda b < \beta$. We let $\alpha = \tfrac{1}{2}(\lambda b + \beta)$ to conclude the proof of our
assertion.
Mazur's theorem follows at once, since we have proved that a non-

empty closed convex set is the intersection of all closed half spaces
containing it.
IV, §6. EXERCISES
1. Fill in the details that if F is complete, then L(E, F) is complete.

2. Show that the Hahn-Banach theorem for the complex case follows easily
from the real case of this theorem. In other words, finish the details of the
argument given after Corollary 1.2.
3. Let E, F, G be normed vector spaces. A bilinear map $\lambda\colon E \times F \to G$ is a map
which is linear in each variable, i.e. for each $x \in E$ the map $y \mapsto \lambda(x, y)$ is
linear, and for each $y \in F$, the map $x \mapsto \lambda(x, y)$ is linear. Show that a bilinear
map $\lambda$ is continuous if and only if there exists C > 0 such that
\[ |\lambda(x, y)| \leq C|x||y| \]
for all $x \in E$, $y \in F$. Let L(E, F; G) be the set of continuous bilinear maps of

$E \times F$ into G. Show that L(E, F; G) is a normed vector space, if the norm of
$\lambda$ is defined to be the inf of all numbers C as above. Show that if G is
complete, then L(E, F; G) is complete.
4. Let E be a Banach space and F a closed subspace. For each coset x + F of
F, define $|x + F| = \inf |x + y|$ for $y \in F$. Show that this defines a norm on the
factor space E/F, and that the natural map $E \to E/F$ is continuous linear. (Cf.
Chapter XV, §1.)
5. Let $A$ be a Banach algebra. Suppose that there is a unit element $e \neq 0$, but
that we do not necessarily have $|e| = 1$.
(a) Show that $|e| \geq 1$.
(b) Define a new norm $\| \ \|$ on $A$ by putting
\[ \|x\| = \sup_{y \neq 0} \frac{|xy|}{|y|}. \]
Show that $\| \ \|$ is in fact a norm and that $\|e\| = 1$.
(c) Show that $A$ is a Banach algebra under this new norm.
6. (a) Show that a finite dimensional subspace of a normed vector space is
closed.
(b) Let E be a Banach space and F a finite dimensional subspace. Show that
there exists a closed subspace $G$ such that $F + G = E$ and
$F \cap G = \{0\}$.
You will have to use the Hahn-Banach theorem.

7. Let $F$ be a closed subspace of a normed vector space $E$, and let $v \in E$,
$v \notin F$. Show that $F + \mathbf{R}v$ is closed. If $E = F + \mathbf{R}v$, show that $E$ is the direct
sum of $F$ and $\mathbf{R}v$. (You can give a simple ad hoc proof for this. A more
general result will be proved later as a consequence of the open mapping
theorem.)
8. Let $E$, $F$, $G$ be normed vector spaces and assume that $G$ is complete. Let
$\lambda: E \times F \to G$ be a continuous bilinear map. Show that $\lambda$ can be extended to
a continuous bilinear map of the completions $\bar{E} \times \bar{F} \to G$, which has the same
norm as $\lambda$. (Identify $E$, $F$ as subspaces of their completions.)
9. Let A be a Banach algebra, commutative, and with unit element. Let J be an
ideal. Show that the closure of J is also an ideal. (The definition of an ideal
is the same as in the case of rings of continuous functions. If the algebra is
not commutative, then the same result is valid if we replace ideals by left
ideals.)
10. Let A be a commutative Banach algebra and let M be a maximal ideal.
Show that M is closed.
11. Give the proof of the inequality left to the reader in Proposition 1.3.
12. Let $E$ be an infinite dimensional Banach space, and let $\{x_n\}$ be a sequence of
linearly independent elements of norm 1. Show that there exists an element
in the closure of the space generated by all $x_n$ which does not lie in any
subspace generated by a finite number of $x_n$. [Hint: Construct this element
as an absolutely convergent sum $\sum c_n x_n$.]
13. Let $\{E_n\}$ be a sequence of Banach spaces. Let $E$ be the set of all sequences
$\xi = \{x_n\}$ with $x_n \in E_n$ such that $\sum |x_n|$ converges. Show that $E$ is a vector
space, and that if we define
\[ |\xi| = \sum |x_n| \]
then this is a norm, and $E$ is complete.
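14. Let $E$ be a Banach space, and $P$, $Q$ two operators on $E$ such that $P + Q = I$,
and $PQ = QP = 0$. Show that
\[ E = \operatorname{Ker} P + \operatorname{Ker} Q, \]
and that $\operatorname{Ker} P = \operatorname{Im} Q$. Show that $\operatorname{Ker} P \cap \operatorname{Ker} Q = \{0\}$, and that $\operatorname{Ker} P$
and $\operatorname{Im} P$ are closed subspaces.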
15. Let $E$ be a Banach space and let $F$ be a vector subspace. Let $\bar{F}$ be the
closure of $F$. Prove that $\bar{F}$ is a subspace, and is complete.
16. Let $A$ be a subset of a Banach space. By $c(A)$ we denote the convex closure
of $A$, i.e. the intersection of all convex sets containing $A$. We let $\bar{c}(A)$ denote
the closure of $c(A)$. Then $\bar{c}(A)$ is convex. Prove: If $K$ is compact, then $\bar{c}(K)$
is also compact. [Hint: Show $c(K)$ is totally bounded as follows. First find a
finite number of points $x_1, \ldots, x_n$ such that $K$ is contained in the union of
the balls of radius $\epsilon$ around these points. Let $C$ be the convex closure of
the set $\{x_1, \ldots, x_n\}$. Show that $C$ is compact, expressing $C$ as a continuous
image of a compact set. Let $y_1, \ldots, y_m$ be points of $C$ such that $C$ is contained
in the union of balls of radius $\epsilon$ around these points. Then get the
desired result.]
17. Let $F$ be the complete normed vector space of continuous periodic functions
on $[-\pi, \pi]$ of period $2\pi$, with the sup norm. Let $E$ be the vector space of all
real sequences $\alpha = \{a_n\}$ such that $\sum |a_n|$ converges. Define
\[ |\alpha| = \sum_{n=1}^{\infty} |a_n|. \]
Show that this is a norm on $E$. Let
\[ L\alpha(x) = \sum a_n \cos nx, \]
so that $L: E \to F$ is a linear map. Show that $L$ has norm 1. Let $B$ be the closed
unit ball of radius 1 centered at the origin in $E$. Show that $L(B)$ is closed in
$F$. [Hint: Let $\{f_k\}$ ($k = 1, 2, \ldots$) be a sequence of elements in $L(B)$ which
converges uniformly to a function $f$ in $F$. Let $b_n$ be the Fourier coefficient
of $f$ with respect to $\cos nx$. Let $\beta = \{b_n\}$. Show that $\beta$ is in $E$ and that
$L(\beta) = f$.]
18. Let $K$ be a continuous function of two variables defined for $(x, y)$ in the
square $[a, b] \times [a, b]$. Assume that $\|K\| \leq C$ for some constant $C > 0$, where
$\| \ \|$ is the sup norm. Let $E$ be the Banach space of continuous functions on
$[a, b]$, and let $T: E \to E$ be the linear map such that
\[ Tg(x) = \int_a^b K(t, x) g(t)\, dt. \]
Show that $T$ is bounded and $\|T\| \leq C(b - a)$. For more on $T$, see Chapter
XIV, Exercise 5.
19. Let $A$ be a commutative Banach algebra with unit element $e$, over the reals,
and define the exponential and logarithm maps by
\[ \exp u = 1 + u + \frac{u^2}{2!} + \cdots \]
and
\[ \log u = (u - e) - \frac{(u - e)^2}{2} + \frac{(u - e)^3}{3} - \cdots \]
Show that exp converges absolutely for all $u \in A$, and that log converges
absolutely for all $u$ with $|u - e| < 1$. Show that exp and log give inverse
continuous mappings from a neighborhood of 0 onto a neighborhood of $e$ in
$A$. Show that they satisfy the usual functional equations
\[ \exp(u + v) = (\exp u)(\exp v), \qquad \log(uv) = \log u + \log v, \]
in these domains of definition. Show that every element of $A$ sufficiently close
to $e$ is an $n$-th power for every positive integer $n$.
20. Let $X$ be a compact Hausdorff space and let $C(X)$ be the Banach space of
real continuous functions on $X$. If $\lambda$ is a functional on $C(X)$ (sup norm) such
that $\lambda(1) = |\lambda|$, show that $\lambda$ is positive, in the sense that if $f \in C(X)$, $f \geq 0$,
then $\lambda(f) \geq 0$.
CHAPTER V
Hilbert Space
V, §1. HERMITIAN FORMS
Essentially all of this chapter goes through over the real or the complex
numbers with no change. Since the theory over the complex does intro-
duce the extra conjugation, we use the complex language, and point out
explicitly in one or two instances those results which are valid only over
the complex.
Let $E$, $F$ be vector spaces over $\mathbf{C}$ and let $L: E \to F$ be a map. We say
that $L$ is antilinear, or semi-linear, if $L$ is $\mathbf{R}$-linear, and $L(\alpha x) = \bar{\alpha} L(x)$ for
all $x \in E$ and $\alpha \in \mathbf{C}$.
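Let $E$ be a vector space over the complex numbers. A sesquilinear
form or scalar product on $E$ is a map
\[ E \times E \to \mathbf{C} \]
denoted by
\[ (x, y) \mapsto \langle x, y \rangle \]
which is linear in its first variable, and semi-linear or antilinear in its
second variable, meaning that for $x, y, y_1, y_2 \in E$, $\alpha \in \mathbf{C}$, we have
\[ \langle x, y_1 + y_2 \rangle = \langle x, y_1 \rangle + \langle x, y_2 \rangle \quad \text{and} \quad \langle x, \alpha y \rangle = \bar{\alpha} \langle x, y \rangle. \]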

If in addition we have for all $x, y \in E$
\[ \langle x, y \rangle = \overline{\langle y, x \rangle}, \]
we say that the form is hermitian. If furthermore we have $\langle x, x \rangle \geq 0$ for

all $x \in E$, we say that the form is positive. We say the form is positive
definite if it is positive, and $\langle x, x \rangle > 0$ if $x \neq 0$. We shall assume throughout
that our form $\langle\ ,\ \rangle$ is positive, but not necessarily definite. We observe
that a sesquilinear form is always $\mathbf{R}$-bilinear.
We define $v$ to be perpendicular or orthogonal to $w$ if $\langle v, w \rangle = 0$. Let
$S$ be a subset of $E$. The set of elements $v \in E$ such that $\langle v, w \rangle = 0$ for
all $w \in S$ is a subspace of $E$. This is easily seen and will be left as an
exercise. We denote this set by $S^\perp$. Let $E_0$ consist of all elements $v \in E$
such that $v \in E^\perp$, that is $\langle v, w \rangle = 0$ for all $w \in E$. Then $E_0$ is a subspace,
which will be called the null space of the hermitian product.
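Theorem 1.1. If $w \in E$ is such that $\langle w, w \rangle = 0$, then $w \in E_0$, that is
$\langle w, v \rangle = 0$ for all $v \in E$.
Proof. Let $t$ be real, and consider
\[ 0 \leq \langle v + tw, v + tw \rangle = \langle v, v \rangle + 2t \operatorname{Re}\langle v, w \rangle + t^2 \langle w, w \rangle = \langle v, v \rangle + 2t \operatorname{Re}\langle v, w \rangle. \]
If $\operatorname{Re}\langle v, w \rangle \neq 0$ then we take $t$ very large of opposite sign to $\operatorname{Re}\langle v, w \rangle$.
Then $\langle v, v \rangle + 2t \operatorname{Re}\langle v, w \rangle$ is negative, a contradiction. Hence
\[ \operatorname{Re}\langle v, w \rangle = 0. \]
This is true for all $v \in E$. Hence $\operatorname{Re}\langle iv, w \rangle = 0$ for all $v \in E$, whence
$\operatorname{Im}\langle v, w \rangle = 0$. Hence $\langle v, w \rangle = 0$, as was to be shown.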
We define $|v| = \sqrt{\langle v, v \rangle}$, and call it the length or norm of $v$. By
definition and Theorem 1.1, we have $|v| = 0$ if and only if $v \in E_0$.
Theorem 1.2 (Schwarz Inequality). For all $v, w \in E$ we have
\[ |\langle v, w \rangle| \leq |v||w|. \]
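Proof. Let $\alpha = \langle w, w \rangle$ and $\beta = -\langle v, w \rangle$. We have
\[ 0 \leq \langle \alpha v + \beta w, \alpha v + \beta w \rangle \]
\[ = \langle \alpha v, \alpha v \rangle + \langle \beta w, \alpha v \rangle + \langle \alpha v, \beta w \rangle + \langle \beta w, \beta w \rangle \]
\[ = \alpha\bar{\alpha}\langle v, v \rangle + \beta\bar{\alpha}\langle w, v \rangle + \alpha\bar{\beta}\langle v, w \rangle + \beta\bar{\beta}\langle w, w \rangle. \]
Note that $\alpha = |w|^2$. Substituting the values for $\alpha$, $\beta$, we obtain
\[ 0 \leq |w|^4 |v|^2 - 2|w|^2 \langle v, w \rangle \overline{\langle v, w \rangle} + |w|^2 \langle v, w \rangle \overline{\langle v, w \rangle}. \]
But
\[ \langle v, w \rangle \overline{\langle v, w \rangle} = |\langle v, w \rangle|^2. \]
Hence
\[ |w|^2 |\langle v, w \rangle|^2 \leq |w|^4 |v|^2. \]
If $|w| = 0$, then $w \in E_0$ by Theorem 1.1 and the Schwarz inequality is
obvious. If $|w| \neq 0$, then we can divide this last relation by $|w|^2$, and
taking the square roots yields the proof of the theorem.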
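As a sanity check of the inequality and of the computation in the proof, here is a small worked instance. The space $\mathbf{C}^2$ with the product $\langle x, y \rangle = x_1\bar{y}_1 + x_2\bar{y}_2$ and the two vectors below are assumptions chosen for illustration; they are not taken from the text.

```latex
\begin{align*}
v &= (1, i), \qquad w = (1, 1), \\
\langle v, w \rangle &= 1\cdot\bar{1} + i\cdot\bar{1} = 1 + i,
   \qquad |\langle v, w \rangle| = \sqrt{2}, \\
|v| &= \sqrt{|1|^2 + |i|^2} = \sqrt{2}, \qquad |w| = \sqrt{2},
   \qquad |\langle v, w \rangle| = \sqrt{2} \leq 2 = |v|\,|w|. \\
\intertext{With $\alpha = \langle w, w \rangle = 2$ and $\beta = -\langle v, w \rangle = -(1+i)$:}
\alpha v + \beta w &= (2 - (1+i),\; 2i - (1+i)) = (1 - i,\; -1 + i), \\
|\alpha v + \beta w|^2 &= 2 + 2 = 4
   = |w|^4|v|^2 - |w|^2|\langle v, w \rangle|^2 = 8 - 4.
\end{align*}
```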
Theorem 1.3. The function $v \mapsto |v|$ is a seminorm on $E$, that is:
We have $|v| \geq 0$, and $|v| = 0$ if and only if $v \in E_0$.
For every complex $\alpha$, we have $|\alpha v| = |\alpha||v|$.
For $v, w \in E$ we have $|v + w| \leq |v| + |w|$.
Proof. The first assertion follows from Theorem 1.1. The second is
left to the reader. The third is proved with the Schwarz inequality. It
suffices to prove that
\[ |v + w|^2 \leq (|v| + |w|)^2. \]
To do this, we have
\[ |v + w|^2 = \langle v + w, v + w \rangle = \langle v, v \rangle + \langle w, v \rangle + \langle v, w \rangle + \langle w, w \rangle. \]
But $\langle w, v \rangle + \langle v, w \rangle = 2 \operatorname{Re}\langle v, w \rangle \leq 2|\langle v, w \rangle|$. Hence by Schwarz,
\[ |v + w|^2 \leq |v|^2 + 2|\langle v, w \rangle| + |w|^2 \leq |v|^2 + 2|v||w| + |w|^2 = (|v| + |w|)^2. \]
Taking the square root of each side yields what we want.
We call $| \ |$ the $L^2$-norm (or we should really say the $L^2$-seminorm).
An element $v$ of $E$ is said to be a unit vector if $|v| = 1$. If $|v| \neq 0$, then
$v/|v|$ is a unit vector.
Let $w \in E$ be an element such that $|w| \neq 0$, and let $v \in E$. There exists
a unique number $c$ such that $v - cw$ is perpendicular to $w$. Indeed, for
$v - cw$ to be perpendicular to $w$ we must have
\[ \langle v - cw, w \rangle = 0, \]
whence $\langle v, w \rangle - \langle cw, w \rangle = 0$ and $\langle v, w \rangle = c\langle w, w \rangle$. Thus
\[ c = \frac{\langle v, w \rangle}{\langle w, w \rangle}. \]
Conversely, letting $c$ have this value shows that $v - cw$ is perpendicular
to $w$. We call $c$ the Fourier coefficient of $v$ with respect to $w$.
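To make the definition concrete, the following sketch computes one Fourier coefficient by hand. The space $\mathbf{C}^2$ with $\langle x, y \rangle = x_1\bar{y}_1 + x_2\bar{y}_2$ and the particular vectors are assumptions made only for this example.

```latex
\begin{align*}
v &= (1, i), \qquad w = (1, 1), \\
c &= \frac{\langle v, w \rangle}{\langle w, w \rangle} = \frac{1 + i}{2}, \\
v - cw &= \Bigl(1 - \tfrac{1+i}{2},\; i - \tfrac{1+i}{2}\Bigr)
        = \Bigl(\tfrac{1-i}{2},\; \tfrac{-1+i}{2}\Bigr), \\
\langle v - cw, w \rangle &= \tfrac{1-i}{2} + \tfrac{-1+i}{2} = 0.
\end{align*}
```

So $v - cw$ is indeed perpendicular to $w$, as the general argument predicts.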
Let $v_1, \ldots, v_n$ be elements of $E$ which are not in $E_0$, and which are
mutually perpendicular, that is $\langle v_i, v_j \rangle = 0$ if $i \neq j$. Let $c_i$ be the Fourier
coefficient of $v$ with respect to $v_i$. Then
\[ v - c_1 v_1 - \cdots - c_n v_n \]
is perpendicular to $v_1, \ldots, v_n$. Indeed, all we have to do is to take the
product with $v_j$. All the terms involving $\langle v_i, v_j \rangle$ with $i \neq j$ will give 0, and we
shall have two terms
\[ \langle v, v_j \rangle - c_j \langle v_j, v_j \rangle \]
which cancel. Thus subtracting linear combinations as above orthogonalizes
$v$ with respect to $v_1, \ldots, v_n$.
We have two useful identities, namely:
The Pythagoras Theorem. If $u, w \in E$ are perpendicular, then
\[ |u + w|^2 = |u|^2 + |w|^2. \]
The Parallelogram Law. For $u, w \in E$, we have
\[ |u + w|^2 + |u - w|^2 = 2|u|^2 + 2|w|^2. \]
The proofs come immediately from expanding out the norm according to
the definitions.
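The expansion alluded to can be written out as follows. This is only a sketch of the routine computation, using nothing beyond sesquilinearity and the hermitian property of the form.

```latex
\begin{align*}
|u \pm w|^2 &= \langle u \pm w, u \pm w \rangle
   = \langle u, u \rangle \pm \langle u, w \rangle \pm \langle w, u \rangle + \langle w, w \rangle \\
  &= |u|^2 \pm 2\operatorname{Re}\langle u, w \rangle + |w|^2 .
\end{align*}
% Pythagoras: if <u, w> = 0, the middle term vanishes in |u + w|^2.
% Parallelogram law: adding the "+" and "-" cases cancels the middle terms,
% leaving |u + w|^2 + |u - w|^2 = 2|u|^2 + 2|w|^2.
```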
Let $\{v_i\}_{i \in I}$ be a family of elements of $E$ such that $|v_i| \neq 0$ for all $i$.
For each finite subfamily, we can take the space generated by this subfamily,
i.e. linear combinations
\[ c_1 v_{i_1} + \cdots + c_n v_{i_n} \]
with complex coefficients $c_i$. The union of all such spaces is called the
space generated by the family $\{v_i\}_{i \in I}$. Let us denote this space by $F$. We
say that the family $\{v_i\}$ is total in $E$ if the closure of $F$ is equal to all
of $E$.
As a matter of notation, we shall omit the double indices and write
$v_1, \ldots, v_n$ instead of $v_{i_1}, \ldots, v_{i_n}$.
We say that the family $\{v_i\}$ is an orthogonal family if its elements are
mutually perpendicular, that is $\langle v_i, v_j \rangle = 0$ if $i \neq j$, and if in addition
$|v_i| \neq 0$ for all $i$. We say that it is an orthonormal family if it is orthogonal
and if $|v_i| = 1$ for all $i$. One can always obtain an orthonormal
family from an orthogonal family by dividing each vector by its norm.
A total orthonormal family is called a Hilbert basis, or also an orthonormal
basis. (Warning: It is not necessarily a "basis" in the sense of
abstract algebra, i.e. not every element of the space is a linear combination
of a finite number of elements in a Hilbert basis.)
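Theorem 1.4. Let $\{v_i\}$ be an orthogonal family in $E$. Let $x \in E$ and let
$c_i$ be the Fourier coefficient of $x$ with respect to $v_i$. Let $\{a_i\}$ be a
family of numbers. Then
\[ \Bigl| x - \sum_{k=1}^{n} c_k v_k \Bigr| \leq \Bigl| x - \sum_{k=1}^{n} a_k v_k \Bigr|. \]
Proof. We know that
\[ x - \sum_{k=1}^{n} c_k v_k \]
is orthogonal to each $v_i$, $i = 1, \ldots, n$. Hence we get from Pythagoras:
\[ \Bigl| x - \sum_{k=1}^{n} a_k v_k \Bigr|^2 = \Bigl| x - \sum_{k=1}^{n} c_k v_k + \sum_{k=1}^{n} (c_k - a_k) v_k \Bigr|^2 = \Bigl| x - \sum_{k=1}^{n} c_k v_k \Bigr|^2 + \Bigl| \sum_{k=1}^{n} (c_k - a_k) v_k \Bigr|^2. \]
This proves the desired inequality.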
A pre-Hilbert space is a vector space with a positive definite hermitian
form. If we start with a space with a form which is only positive (not
definite), we can obtain a pre-Hilbert space by taking the factor space
$E/E_0$ (i.e. equivalence classes of elements of $E$ modulo $E_0$). Similarly, we
can form the completion of $E$. Viewing $E$ as a space over the reals, we
can extend the $\mathbf{R}$-bilinear form $\langle\ ,\ \rangle$ to the completion. If $E$ is a pre-Hilbert
space, then the extended form is hermitian positive definite.
(That it is hermitian positive follows by continuity. For the definiteness,
if $\{x_n\}$ is a sequence converging to $x$, and $x \neq 0$, we may assume that
$x_n \neq 0$, and then that $\{x_n/|x_n|\}$ converges to $x/|x|$. Thus we may deal
with unit vectors, whence the definiteness follows immediately.)
A Hilbert space is a vector space with a positive definite hermitian
form, which is complete under the corresponding $L^2$-norm. Thus we see
that the completion of a pre-Hilbert space is a Hilbert space.
Lemma 1.5. Let $E$ be a Hilbert space, and $F$ a closed subspace. Let
$x \in E$ and let
\[ a = \inf_{y \in F} |x - y|. \]
Then there exists an element $y_0 \in F$ such that
\[ a = |x - y_0|. \]
Proof. Let $\{y_n\}$ be a sequence in $F$ such that $|y_n - x|$ approaches $a$.
We show that $\{y_n\}$ is Cauchy. By the parallelogram law, we have
\[ |y_n - y_m|^2 = 2|y_n - x|^2 + 2|y_m - x|^2 - 4\bigl|\tfrac{1}{2}(y_n + y_m) - x\bigr|^2 \leq 2|y_n - x|^2 + 2|y_m - x|^2 - 4a^2 \]
because of the definition of $a$. This shows that $\{y_n\}$ is Cauchy, and thus
converges to some vector $y_0$. The lemma follows by continuity.
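The step from the parallelogram law to the Cauchy property can be spelled out as below. The auxiliary names $u$, $w$, $\epsilon$ and $N$ are introduced here only for the sketch.

```latex
% Apply the parallelogram law to u = y_n - x and w = y_m - x:
\begin{align*}
u - w &= y_n - y_m, \qquad u + w = 2\bigl(\tfrac{1}{2}(y_n + y_m) - x\bigr), \\
|y_n - y_m|^2 &= 2|y_n - x|^2 + 2|y_m - x|^2 - 4\bigl|\tfrac{1}{2}(y_n + y_m) - x\bigr|^2 .
\end{align*}
% Since F is a subspace, (1/2)(y_n + y_m) lies in F, so its distance to x is >= a.
% Given eps > 0, choose N with |y_n - x|^2 <= a^2 + eps for all n >= N. Then
\[
|y_n - y_m|^2 \leq 2(a^2 + \epsilon) + 2(a^2 + \epsilon) - 4a^2 = 4\epsilon
\qquad (n, m \geq N).
\]
```

The limit $y_0$ lies in $F$ because $F$ is closed, and $|x - y_0| = \lim |x - y_n| = a$ by continuity of the norm.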
Theorem 1.6. Let $F$ be a closed subspace of the Hilbert space $E$, and
assume that $F \neq E$. Then there exists an element $z \in E$, $z \neq 0$, such that
$z$ is perpendicular to $F$.
Proof. Let $x \in E$ and $x \notin F$. Let $y_0 \in F$ be at minimal distance from $x$
(by the lemma), and let $a$ be this distance. Let $z = x - y_0$. Then $z \neq 0$
since $F$ is closed. For all $y \in F$, $y \neq 0$ and complex $\alpha$, we have
\[ |z|^2 \leq |z + \alpha y|^2, \]
whence, expanding out, we obtain
\[ 0 \leq \alpha\langle y, z \rangle + \bar{\alpha}\langle z, y \rangle + \alpha\bar{\alpha}\langle y, y \rangle. \]
[Figure 5.1]
We let $\alpha = t\langle z, y \rangle$, with $t$ real $\neq 0$. We can then cancel $t$ and get a
contradiction for small $t$, if $\langle y, z \rangle \neq 0$. This proves the theorem.
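The cancellation at the end of the proof can be made explicit. The following is a sketch of that last step under the substitution $\alpha = t\langle z, y \rangle$; the choice of a negative $t$ is one convenient way to reach the contradiction.

```latex
\begin{align*}
\alpha\langle y, z \rangle &= t\,\langle z, y \rangle\langle y, z \rangle = t\,|\langle z, y \rangle|^2, \qquad
\bar{\alpha}\langle z, y \rangle = t\,\overline{\langle z, y \rangle}\langle z, y \rangle = t\,|\langle z, y \rangle|^2, \\
0 &\leq 2t\,|\langle z, y \rangle|^2 + t^2\,|\langle z, y \rangle|^2\,|y|^2 .
\end{align*}
% For t < 0, dividing by t reverses the inequality:
\[
0 \geq 2\,|\langle z, y \rangle|^2 + t\,|\langle z, y \rangle|^2\,|y|^2 ,
\]
% and letting t tend to 0 from below gives 0 >= 2 |<z, y>|^2, hence <z, y> = 0.
```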
Corollary 1.7. Let $E$ be a Hilbert space, $E \neq \{0\}$. Then there exists a
total orthogonal basis for $E$.
Proof. Let $S$ be the set of non-empty orthogonal families. If $\mathcal{B}_1$, $\mathcal{B}_2$
are orthogonal families, we define $\mathcal{B}_1 \leq \mathcal{B}_2$ if $\mathcal{B}_1 \subset \mathcal{B}_2$. This gives an
inductive ordering. Let $\mathcal{B}$ be a maximal element, and let $F$ be the
subspace generated by $\mathcal{B}$. We contend that $F$ is dense in $E$. Otherwise,
$\bar{F} \neq E$, and by the theorem there exists $z \in E$, $z \neq 0$ and $z$ perpendicular
to $\bar{F}$. We can then obtain a bigger orthogonal family than $\mathcal{B}$, a contradiction
which proves our corollary.
Corollary 1.8. Let $E$ be a Hilbert space, and $F$ a closed subspace. Then
\[ E = F + F^\perp. \]
Proof. If $y_n \in F$ and $z_n \in F^\perp$, then the sequence $\{y_n + z_n\}$ is Cauchy if
and only if $\{y_n\}$ is Cauchy and $\{z_n\}$ is Cauchy (by the Pythagoras
theorem). Hence $F + F^\perp$ is closed. If $F + F^\perp \neq E$, then there exists
$w \in E$, $w \neq 0$, which is perpendicular to $F + F^\perp$, whence perpendicular to
$F$, so that $w \in F^\perp$, a contradiction which proves the corollary.
We observe that if $F$ is a closed subspace, then $F^{\perp\perp} = F$. For any
$x \in E$, we can write uniquely
\[ x = y + z \]
with $y \in F$ and $z \in F^\perp$. The map $P: E \to E$ such that
\[ Px = y \]
is called the orthogonal projection on $F$. It is obviously a continuous
linear map, and we study such maps in greater detail in Chapter XVIII,
§5.
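Corollary 1.9. Let $E$ be a Hilbert space. Let $\{F_i\}$ ($i = 1, 2, \ldots$) be a
sequence of closed subspaces which are mutually perpendicular, that is
$F_i \perp F_j$ if $i \neq j$. Let $\bar{F}$ be the closure of the space $F$ generated by all $F_i$.
(In other words, $\bar{F}$ is the closure of the space $F$ consisting of all sums
\[ x_1 + \cdots + x_n, \quad x_i \in F_i.) \]
Then every element $x$ of $\bar{F}$ has a unique expression as a convergent
series
\[ x = \sum_{i=1}^{\infty} x_i, \quad x_i \in F_i. \]
Let $P_i$ be the orthogonal projection on $F_i$. Then $x_i = P_i x$, and for any
choice of elements $y_i \in F_i$ we have
\[ \Bigl| x - \sum_{i=1}^{n} P_i x \Bigr| \leq \Bigl| x - \sum_{i=1}^{n} y_i \Bigr|. \]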
Proof. Since
\[ x - \sum_{i=1}^{n} P_i x \]
is orthogonal to $F_1, \ldots, F_n$ we can use exactly the same argument as in
Theorem 1.4, and the Pythagoras theorem to show the last inequality,
writing
\[ \Bigl| x - \sum_{i=1}^{n} y_i \Bigr|^2 = \Bigl| x - \sum_{i=1}^{n} P_i x \Bigr|^2 + \Bigl| \sum_{i=1}^{n} (P_i x - y_i) \Bigr|^2. \]
There exists a sequence from $F$ which approaches $x$. It therefore follows
that the partial sums
\[ \sum_{i=1}^{n} P_i x \]
must approach $x$ also. If
\[ x = \sum_{i=1}^{\infty} x_i \]
with $x_i \in F_i$, then we apply the projection $P_n$ (which is continuous!) to
conclude that $P_n x = x_n$, thus proving the uniqueness.
It is convenient to call the family $\{F_i\}$ an orthogonal decomposition of
$\bar{F}$ in the preceding theorem. If $\bar{F} = E$, then we call it an orthogonal
decomposition of $E$, of course.
Suppose that the Hilbert space $E$ has a denumerable total family $\{v_n\}$,
which we assume to be orthonormal. Then every element $x$ can be written
as a convergent series
\[ x = \sum_{n=1}^{\infty} a_n v_n \]
where $a_n$ is the Fourier coefficient of $x$ with respect to $v_n$, and the
convergence is of course with respect to the $L^2$-norm. Namely, we take
the spaces $F_n$ in the previous discussion to be the 1-dimensional spaces
generated by $v_n$. In particular, we see that $\sum |a_n|^2$ converges, and that
\[ |x|^2 = \sum_{n=1}^{\infty} |a_n|^2. \]
If $\{v_n\}$ is merely an orthonormal system, not necessarily a Hilbert basis,
then of course we don't get the equality, merely the inequality
\[ \sum_{n=1}^{\infty} |a_n|^2 \leq |x|^2. \]
This is called the Bessel inequality, and it is essentially obvious from
previous discussions. For instance, for each $n$ we can write
\[ x = x - \sum_{k=1}^{n} a_k v_k + \sum_{k=1}^{n} a_k v_k \]
and apply Pythagoras' theorem.
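The application of Pythagoras mentioned here can be sketched as follows, for an orthonormal system $\{v_k\}$ and Fourier coefficients $a_k = \langle x, v_k \rangle$ (the normalization $|v_k| = 1$ is what makes the sum of squares appear).

```latex
% x - sum a_k v_k is perpendicular to each v_k (k <= n), hence to sum a_k v_k.
\begin{align*}
|x|^2 &= \Bigl| x - \sum_{k=1}^{n} a_k v_k \Bigr|^2 + \Bigl| \sum_{k=1}^{n} a_k v_k \Bigr|^2
       = \Bigl| x - \sum_{k=1}^{n} a_k v_k \Bigr|^2 + \sum_{k=1}^{n} |a_k|^2 \\
      &\geq \sum_{k=1}^{n} |a_k|^2 .
\end{align*}
% The bound holds for every n, so the series of |a_k|^2 converges with sum <= |x|^2.
```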
Conversely, we can define directly a set $l^2$ consisting of all sequences
$\{a_n\}$ such that $\sum |a_n|^2$ converges. If $\alpha = \{a_n\}$ and $\beta = \{b_n\}$ are two sequences
in this space, then using the Schwarz inequality, on finite partial
sums, one sees that
\[ \sum |a_n||b_n| \]
converges, whence we can define a product
\[ \langle \alpha, \beta \rangle = \sum a_n \bar{b}_n. \]
Again from the above convergence, we conclude that $l^2$ is in fact a vector
space, because
\[ \sum |a_n + b_n|^2 \leq \sum |a_n|^2 + 2\sum |a_n||b_n| + \sum |b_n|^2. \]
Furthermore, this product is a hermitian product on it. Finally, it is but
an exercise to verify that $l^2$ is complete. Indeed, the family $\{v_n\}$ is total,
orthonormal in the completion of $F$, and in this completion any element
can be expressed as a convergent series, described above. Thus the elements
of the completion are precisely those of $l^2$.
The space $l^2$ can also be interpreted as the completion of a space of
functions, those periodic of period $2\pi$, say, a total orthogonal family then
being constituted by the functions
\[ e^{inx} \]
where $n$ ranges over all integers (positive, negative, or zero).

It is clear that any two Hilbert spaces having denumerable orthonormal
total families are isomorphic under the map which sends one
family on the other. Indeed, if $G$ is another Hilbert space with total
orthonormal family $\{e_n\}$, then the map
\[ \sum a_n v_n \mapsto \sum a_n e_n \]
is linear and preserves the norm. In this way, we get a map from our
space of periodic functions into $l^2$, which is injective and preserves the
norm. It extends therefore uniquely to the completion.
In general, if two Hilbert spaces have total orthonormal families with
the same cardinality, then any bijection between these families extends to
a unique norm-preserving linear map of one space to the other.
V, §2. FUNCTIONALS AND OPERATORS
Theorem 2.1. For every $y$ in the Hilbert space $E$, the map $\lambda_y$ such that
$\lambda_y(x) = \langle x, y \rangle$ is a functional. The association
\[ y \mapsto \lambda_y \]
is a norm-preserving antilinear isomorphism between $E$ and its dual space
$E'$.
Proof. The Schwarz inequality shows that $|\lambda_y| \leq |y|$, and evaluating $\lambda_y$
at $y$ shows that $|\lambda_y| = |y|$, so we get a norm-preserving semi-linear map
of $E$ into $E'$, semi-linear because of the hermitian nature of the scalar
product, namely for complex $\alpha$,
\[ \lambda_{\alpha y} = \bar{\alpha}\lambda_y. \]
There remains to show that every functional comes from some $y \in E$. Let
$\lambda$ be a functional, and let $F$ be its kernel (the closed subspace of all $x$
such that $\lambda(x) = 0$). If $F \neq E$, there exists $z \in E$, $z \neq 0$ such that $z$ is
perpendicular to $F$ (by Theorem 1.6). We contend that some scalar
multiple of $z$ achieves our purpose, say $\alpha z$. A necessary condition on $\alpha$ is
that
\[ \langle z, \alpha z \rangle = \lambda(z) \]
or in other words, $\bar{\alpha} = \lambda(z)/\langle z, z \rangle$. This is also sufficient. Indeed, for any
$x \in E$, we can write
\[ x = x - \frac{\lambda(x)}{\lambda(z)} z + \frac{\lambda(x)}{\lambda(z)} z \]
and
\[ x - \frac{\lambda(x)}{\lambda(z)} z \]
lies in $F$. Taking the product with $\alpha z$, we obtain
\[ \langle x, \alpha z \rangle = \lambda(x) \]
thus proving our theorem.
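For orientation, here is how the theorem reads in a finite dimensional case. The space $\mathbf{C}^n$ with $\langle x, y \rangle = \sum x_k\bar{y}_k$ and the coefficients $c_k$ are assumptions chosen for this illustration only.

```latex
% A functional on C^n has the form lambda(x) = c_1 x_1 + ... + c_n x_n.
\begin{align*}
\lambda(x) &= \sum_{k=1}^{n} c_k x_k , \qquad y = (\bar{c}_1, \ldots, \bar{c}_n), \\
\langle x, y \rangle &= \sum_{k=1}^{n} x_k\,\overline{\bar{c}_k} = \sum_{k=1}^{n} c_k x_k = \lambda(x), \\
|\lambda| &= |y| = \Bigl( \sum_{k=1}^{n} |c_k|^2 \Bigr)^{1/2}.
\end{align*}
% The conjugates on the entries of y show why y -> lambda_y is antilinear.
```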
By an operator we shall mean a continuous linear map of E into itself.

As we know, the space of operators End(E) is a Banach space.
By Herm(E) we denote the set of all continuous hermitian forms on E.

By Sesqu(E) we denote the set of all continuous sesquilinear forms on E.
It is immediately verified that both these sets are in fact Banach spaces,
and that Herm(E) is a closed subspace of Sesqu(E). We shall now relate
continuous sesquilinear forms on E and operators.
Let $A: E \to E$ be an operator. We define $\varphi_A$ by
\[ \varphi_A(x, y) = \langle Ax, y \rangle. \]
Then $\varphi_A$ is obviously a continuous sesquilinear form on $E$. Conversely,
let $\varphi$ be such a form. For each $y \in E$ the map
\[ x \mapsto \varphi(x, y) \]
is a functional, and consequently there exists a unique $y^* \in E$ such that
for all $x \in E$ we have
\[ \varphi(x, y) = \langle x, y^* \rangle. \]
The map $y \mapsto y^*$ is immediately verified to be linear, using the uniqueness
of the element $y^*$ representing $\varphi$. Furthermore, from the Schwarz inequality,
we find that
\[ |y^*| \leq |\varphi||y|. \]
If we define $A^*: E \to E$ to be the map such that $A^* y = y^*$, then we
conclude that $A^*$ is a continuous linear map of $E$ into itself, i.e. an
operator.
On the other hand, if we define $\psi(y, x) = \overline{\varphi(x, y)}$, then $\psi$ is sesquilinear
continuous, and by what we have just seen, there exists a unique operator
$A$ such that $\psi(y, x) = \langle y, Ax \rangle$, or in other words
\[ \varphi(x, y) = \langle Ax, y \rangle. \]
Thus $\varphi = \varphi_A$ for some $A$.
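Theorem 2.2. The association
\[ A \mapsto \varphi_A \]
is a norm-preserving isomorphism between End(E) and the space of
continuous sesquilinear forms on $E$.
Proof. All that remains to be proved is that $|A| = |\varphi_A|$. But
\[ |\varphi_A(x, y)| = |\langle Ax, y \rangle| \leq |A||x||y| \]
so that $|\varphi_A| \leq |A|$. Conversely, we know that $|Ax| = |\lambda_{Ax}|$ and
\[ |\lambda_{Ax}(y)| = |\langle y, Ax \rangle| = |\varphi_A(x, y)| \leq |\varphi_A||x||y|. \]
Hence $|Ax| \leq |\varphi_A||x|$. This proves that $|A| \leq |\varphi_A|$, whence our theorem
follows.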
We have also shown that to each operator $A$ we can associate a
unique operator $A^*$ satisfying the relations
\[ \langle Ax, y \rangle = \langle x, A^* y \rangle \]
for all $x, y \in E$. We call $A^*$ the adjoint of $A$ (transpose of $A$ if our
Hilbert space is over the reals).
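Theorem 2.3. The map $A \mapsto A^*$ satisfies the following properties:
\[ (A + B)^* = A^* + B^*, \qquad A^{**} = A, \]
\[ (\alpha A)^* = \bar{\alpha} A^*, \qquad (AB)^* = B^* A^*, \]
and for the norm,
\[ |A^*| = |A|, \qquad |A^* A| = |A|^2. \]
Proof. The first four properties are immediate from the definitions.
For instance,
\[ \langle \alpha Ax, y \rangle = \langle Ax, \bar{\alpha} y \rangle = \langle x, A^* \bar{\alpha} y \rangle = \langle x, \bar{\alpha} A^* y \rangle. \]
From the uniqueness we conclude that $(\alpha A)^* = \bar{\alpha} A^*$. The others are
equally easy, and are left to the reader. As for the norm properties, we
have
\[ |\langle A^* x, y \rangle| = |\langle x, Ay \rangle| \leq |A||x||y| \]
so that
\[ |A^*| \leq |A|. \]
Since $A^{**} = A$, it follows that $|A| \leq |A^*|$ so $|A| = |A^*|$. Finally,
\[ |A^* A| \leq |A^*||A| = |A|^2, \]
and conversely,
\[ |Ax|^2 = \langle Ax, Ax \rangle = \langle A^* Ax, x \rangle \leq |A^* A||x|^2 \]
so that $|A| \leq |A^* A|^{1/2}$. This proves our theorem.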
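In finite dimensions the adjoint is the conjugate transpose of the matrix, and the defining relation can be checked by hand. The $2 \times 2$ matrix below, acting on $\mathbf{C}^2$ with $\langle x, y \rangle = x_1\bar{y}_1 + x_2\bar{y}_2$, is an assumption chosen for this sketch.

```latex
\[
A = \begin{pmatrix} 1 & i \\ 0 & 2 \end{pmatrix}, \qquad
A^* = \begin{pmatrix} 1 & 0 \\ -i & 2 \end{pmatrix}.
\]
\begin{align*}
Ax &= (x_1 + i x_2,\; 2 x_2), &
\langle Ax, y \rangle &= (x_1 + i x_2)\bar{y}_1 + 2 x_2 \bar{y}_2, \\
A^* y &= (y_1,\; -i y_1 + 2 y_2), &
\langle x, A^* y \rangle &= x_1 \bar{y}_1 + x_2\,\overline{(-i y_1 + 2 y_2)}
   = x_1 \bar{y}_1 + i x_2 \bar{y}_1 + 2 x_2 \bar{y}_2 .
\end{align*}
% The two expressions agree for all x, y, as <Ax, y> = <x, A* y> requires.
```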

If $\varphi$ is a continuous sesquilinear form on $E$, we define the function
\[ q(x) = \varphi(x, x) \]
to be its associated quadratic form. In the complex case, we can recover
the sesquilinear form from the quadratic form. We phrase this in terms
of operators.
Theorem 2.4. For a complex Hilbert space, if $A$ is an operator and
$\langle Ax, x \rangle = 0$ for all $x$, then $A = 0$.
Proof. This follows from what is called the polarization identity,
\[ \langle A(x + y), x + y \rangle - \langle A(x - y), x - y \rangle = 2[\langle Ax, y \rangle + \langle Ay, x \rangle]. \]
Under the assumption of Theorem 2.4, the left-hand side is equal to 0.
Replacing $x$ by $ix$, we get
\[ \langle Ax, y \rangle + \langle Ay, x \rangle = 0, \]
\[ i\langle Ax, y \rangle - i\langle Ay, x \rangle = 0. \]
From this it follows that $\langle Ax, y \rangle = 0$ and hence that $A = 0$.

Theorem 2.4 is of course false in the real case, since a rotation is not
necessarily 0, but may map every vector on a vector perpendicular to it.
However, in Chapter XVIII we shall deal with the case when $A = A^*$, in
which case the result remains true, obviously.
Operators $A$ such that $A = A^*$ are called hermitian, or self adjoint.
We shall study these especially in Chapter XVIII.
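The rotation counterexample in the real case can be written down explicitly. The sketch below uses rotation by a right angle on $\mathbf{R}^2$ with the usual dot product; this particular choice is an assumption made for illustration.

```latex
\[
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
Ax = (-x_2,\; x_1) \quad \text{for } x = (x_1, x_2) \in \mathbf{R}^2 ,
\]
\[
\langle Ax, x \rangle = (-x_2)\,x_1 + x_1\,x_2 = 0 \quad \text{for all } x,
\qquad \text{yet } A \neq 0 .
\]
% Here A^* is the transpose, which equals -A, so A is not self adjoint.
```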
V, §3. EXERCISES
For the first two exercises, recall that a sequence $\{x_n\}$ in a Hilbert space $H$
converges weakly to 0 if for all $y \in H$ we have $\lim \langle x_n, y \rangle = 0$.
1. Let $\{v_n\}$ ($n = 1, 2, \ldots$) be a denumerable Hilbert basis for the Hilbert space $H$.
Show that the sequence $\{v_n\}$ converges weakly to 0, and hence that the unit
sphere is not closed in the unit ball for the weak topology.
2. Suppose the Hilbert space $H$ has a countable basis. Let $x \in H$ be such that
$|x| \leq 1$. Show that there exists a sequence $\{u_n\}$ in $H$ with $|u_n| = 1$ for all $n$
such that $\{u_n\}$ converges weakly to $x$.
3. Let X be a closed convex subset of a Hilbert space. Show that there exists a
point in X which is at smallest distance from the origin.
4. Let $E$ be a Hilbert space, and let $\{x_n\}$ be an orthonormal basis. Let $\{c_n\}$ be a
sequence of positive numbers such that $\sum c_n^2$ converges. Let $C$ be the subset of
$E$ consisting of all sums $\sum a_n x_n$ where $|a_n| \leq c_n$. Show that $C$ is compact.
5. Show that a Hilbert space is separable (has a countable base for the topology)
if and only if it has a countable orthonormal basis.
6. Let $A$ be an operator on a Hilbert space. Show that
\[ \operatorname{Ker} A = (\operatorname{Im} A^*)^\perp. \]

7. Let $E$ be the vector space of real valued continuous functions on an interval
$[a, b]$. Let $K = K(x, y)$ be a continuous function of two variables, defined on
the square $a \leq x \leq b$ and $a \leq y \leq b$. An element $f$ of $E$ is said to be an
eigenfunction for $K$, with respect to a real number $r$, if
\[ f(y) = r \int_a^b K(x, y) f(x)\, dx. \]
We take $E$ with the $L^2$-norm of the hermitian product given by
\[ \langle f, g \rangle = \int_a^b fg. \]
Prove that if $f_1, \ldots, f_n$ are in $E$, mutually orthogonal, and of $L^2$-norm equal to
1, and if they are eigenfunctions with respect to the same number $r$, then $n$ is
bounded by a number depending only on $K$ and $r$. [Hint: Apply Bessel's
inequality.]
8. Let $E$ be a pre-Hilbert space.
(a) If $E$ is complex, then $\operatorname{Im}\langle x, y \rangle = \operatorname{Re}\langle x, iy \rangle$.
(b) Let $x, y \in E$. If $E$ is real, then
If $E$ is complex, then
(c) Let $F$ be a normed vector space such that the parallelogram law holds for
its norm. Define $\langle x, y \rangle$ by the formula in (b). Show that this is a positive
definite scalar product.
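The displayed formulas of Exercise 8(b) do not appear in this copy. As a hedged reconstruction, the standard polarization formulas consistent with part (a) and with the convention used in this chapter (linear in the first variable, antilinear in the second) can be derived as follows; the exact form printed in the original may differ.

```latex
% From |x + y|^2 = |x|^2 + 2 Re<x, y> + |y|^2 and the same with -y:
\[
|x + y|^2 - |x - y|^2 = 4\operatorname{Re}\langle x, y \rangle .
\]
% Real case:
\[
\langle x, y \rangle = \tfrac{1}{4}\bigl( |x + y|^2 - |x - y|^2 \bigr).
\]
% Complex case, using Im<x, y> = Re<x, iy> from part (a):
\[
\langle x, y \rangle = \tfrac{1}{4}\bigl( |x + y|^2 - |x - y|^2 \bigr)
   + \tfrac{i}{4}\bigl( |x + iy|^2 - |x - iy|^2 \bigr).
\]
```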
PART THREE
Integration
This part deals with integration in multiple contexts. We start with the
integral on arbitrary measured spaces, setting the basic framework in a
context which makes its structure particularly clear. The main idea is
that one starts the theory of the integral by defining the integral on a
natural space of simple functions where one sees immediately what the
integral means. The space of step functions is the one which covers all
cases, from the most general to the most special. As we shall also see, if
one wants integration on the reals, or in euclidean space, then the space
generated by characteristic functions of intervals or cubes, or the $C^\infty$
functions with compact support, also form a natural starting space for
integration.
It turns out that for the basic framework of integration, all one needs
for the space of values is linearity and completeness, so a Banach space.
I think it obscures matters to assume (as is often done) that values are
first taken in the real numbers, and to make abusive use of the ordering
properties of the reals and of positivity in setting up the integral. Fur-
ther comments on this will be made in Chapter VI, especially the intro-
ductory comments.
However, doing general Banach valued integration on measured spaces
does not mean that one eventually slights special properties of complex
valued integration over the real numbers. This entire part will mix gen-
eral considerations with particular situations and examples, especially on
euclidean space and the real line. Readers can see how having the gen-
eral machinery of integration on measured spaces, or locally compact
spaces, is used to make easier the formulation of more concrete results.
For instance, in Chapter VIII, we give specific results on approximations
on $\mathbf{R}$ or $\mathbf{R}^n$ with Dirac sequences and families. In Chapter IX, two
sections on functions of bounded variations and the Stieltjes integral
illustrate the general relationships between measures and functionals on
$C^\infty$ functions with compact support. They also emphasize what is peculiar
to the real numbers, as distinguished from what holds when the
values are taken in an arbitrary Banach space.
Thus, throughout this part, we see general integration theory on mea-
sured spaces alternate with special features on euclidean spaces or on the
real line.
CHAPTER VI
The General Integral
In this chapter we develop integration theory. We want two things from

an integral which are not provided by the standard Riemann integral of
bounded functions:
(1) We want to integrate unbounded functions.
(2) We want to be able to take limits under the integral sign, of a
fairly general nature, more general than uniform limits.
To achieve this, we proceed in a manner entirely similar to the manner
used when extending the integral to the completion of a space of step
functions, except that instead of the sup norm we use the $L^1$-norm.
Simple and basic lemmas then allow us to identify elements of the completion
with actual functions, and all properties of the integral then
become just as easy to prove as in the earlier versions of integration.
The lemmas are designed to show that if in addition to $L^1$-convergence
we require pointwise convergence almost everywhere, then we still recover
essentially the $L^1$-completion, up to functions which vanish almost
everywhere.
The treatment here is a conglomerate of various treatments in the
literature. Unlike most treatments, however, I have based the existence
and definition of the integral on a very simple lemma, which I call the
fundamental lemma of integration (Lemma 3.1). It can be proved ab ovo
with a very short proof, and shows immediately how an $L^1$-Cauchy
sequence of functions converges (almost uniformly!). From this convergence,
one can immediately see how to extend the integral "by continuity"
from step maps to the most general class of mappings which is
desired. In the basic lemma, positivity plays no role whatsoever. A
posteriori, one notices that the monotone convergence theorem and the
"Fatou lemma" of other treatments become immediate corollaries of the

basic approximation lemmas derived from Lemma 3.1. Thus it turns out
that it is easier to work immediately with complex valued functions than
to go through the sequence of many other treatments, via positive func-
tions, real functions, and only then complex functions decomposed into
real and imaginary parts. The proofs become shorter, more direct, and
to me much more natural. One also observes that with this approach
nothing but linearity and completeness in the space of values is used.
Thus one obtains at once integration with Banach valued functions. But
readers may well omit considering this case if it makes them more com-
fortable to deal with C-valued functions only. Note, however, that vector
space valued functions are useful in giving an especially simple proof for
the Fubini theorem, which again I find more transparent than the proof
used in many treatments, based on positivity. Historically, Bochner was
the first to consider integration of Banach valued functions. From the
point of view taken here, there is no difference between Banach or com-
plex valued functions.
Actually, it is a reasonable question why one should want to identify
elements of the completion with functions : why not just work formally
with Cauchy sequences? One of the basic reasons is that certain proper-
ties of the formal completion which one wishes to use are obvious if
elements of this completion are identifiable with functions. For example,
consider the space $L$ of continuous functions on $[0, 1]$. Let $T: L \to L$
be the linear map given by $Tf(x) = xf(x)$. Then $T$ is continuous for the
$L^1$-norm on this space, whence $T$ extends uniquely to a continuous linear
map $\bar{T}$ on the completion $\bar{L}$. Now it is clear that $T$ is injective on $L$, and
one can ask if $\bar{T}: \bar{L} \to \bar{L}$ is also injective. If we can identify an element of
the completion with a function $f$ so that $\bar{T}$ is again given as multiplication
by $x$, then one sees at once that $\bar{T}$ is injective. Otherwise, one has
to prove some lemma about $L^1$-Cauchy sequences which amounts to a
special case of those proved to establish the representation of elements of
the completion by functions, and which serve in a wide variety of contexts.
I would also like to draw the reader's attention to the approximation
Theorem 6.3, which gives a key result in line with our general approach :
to prove something in integration theory, first prove it for a subspace of
functions for which the result is obvious, then extend by linearity and
continuity to the largest possible space.
VI, §1. MEASURED SPACES, MEASURABLE MAPS, AND POSITIVE MEASURES
Let $X$ be a set (non-empty). By a $\sigma$-algebra in $X$ we mean a collection
of subsets $\mathcal{M}$ having the following properties:

$\sigma$-ALG 1. The empty set is in $\mathcal{M}$.

$\sigma$-ALG 2. The collection $\mathcal{M}$ is closed under taking complements (in $X$)
and denumerable unions. In other words, if $A \in \mathcal{M}$ then
$\mathcal{C}_X A \in \mathcal{M}$, and if $\{A_n\}$ is a sequence of elements of $\mathcal{M}$, then
$$\bigcup_{n=1}^{\infty} A_n$$
is also an element of $\mathcal{M}$.
We conclude at once from these conditions that the whole set $X$ is in
$\mathcal{M}$, and that a denumerable intersection of elements of $\mathcal{M}$ is also in $\mathcal{M}$.
Also, using empty sets, we see that finite unions or intersections of elements
of $\mathcal{M}$ are also in $\mathcal{M}$, and we could just as well have assumed this
by saying "countable" instead of "denumerable" in our second axiom.
A set $X$ together with a $\sigma$-algebra $\mathcal{M}$ is called a measurable space, and
the elements of $\mathcal{M}$ are called its measurable sets. We note that if $A$, $B$
are measurable, and if we denote by $A - B$ the set
$$A - B = A \cap \mathcal{C}_X B$$
consisting of all elements of $A$ not in $B$, then $A - B$ is measurable.

To prove that a collection of subsets is a $\sigma$-algebra, we shall often use
the following characterization:
A collection $\mathcal{M}$ of subsets of $X$ is a $\sigma$-algebra if and only if it contains
the empty set, is closed under taking complements and finite intersections, and
is such that, if $\{A_n\}$ is a sequence of disjoint elements of $\mathcal{M}$, then the union
$\bigcup A_n$ is in $\mathcal{M}$.
Proof. This is clear since we can write
$$\bigcup_{n=1}^{\infty} A_n = A_1 \cup (A_2 - A_1) \cup (A_3 - (A_2 \cup A_1)) \cup \cdots.$$
We could also define the notion of an algebra of subsets of $X$. It is a
collection $\mathcal{A}$ of subsets satisfying the following conditions:

ALG 1. The empty set is in $\mathcal{A}$.

ALG 2. If $A, B \in \mathcal{A}$, then $A \cap B$, $A \cup B$, and $A - B$ are in $\mathcal{A}$.

Thus we can say that a $\sigma$-algebra is an algebra which is closed under
taking countable unions, and which contains the set $X$ itself.
Terminology. In some texts, what we call an algebra is called a ring

(of subsets). However, in the theory of algebraic structures (groups, rings,
fields, vector spaces, etc.) it has become more or less standard practice
to assume that a ring has a unit element for multiplication, while an
"algebra" is merely an additive group with a bilinear law of composition.
Our definitions have therefore been made to fit these conventions, in the
analogous situation of algebras of subsets. Here, of course, the "unit
element" is the whole space.
Let $\mathcal{S}$ be a collection of subsets of $X$. Then there exists a smallest
$\sigma$-algebra $\mathcal{M}$ in $X$ which contains $\mathcal{S}$.
Proof. We can take for $\mathcal{M}$ the intersection of all $\sigma$-algebras containing
$\mathcal{S}$. The collection of all subsets of $X$ is such a $\sigma$-algebra, and does
contain $\mathcal{S}$, so that we are not faced with the empty set. It is immediate
that the intersection $\mathcal{M}$ above is itself a $\sigma$-algebra, so we are done.
In the preceding result, the $\sigma$-algebra $\mathcal{M}$ is said to be generated by $\mathcal{S}$.
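On a finite set the generated sigma-algebra can be computed explicitly, which may help in getting a feeling for the definition. The following is only a minimal sketch: the function name and the sample data are made up for illustration, and it builds the family from below, by closing under complements and unions until nothing new appears, rather than by intersecting all sigma-algebras as in the proof. On a finite set countable unions reduce to finite ones, so the two descriptions give the same family.

```python
from itertools import combinations

def generated_sigma_algebra(X, S):
    """Smallest family of subsets of the finite set X that contains S,
    the empty set and X, and is closed under complement and union."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(s) for s in S}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # close under complements in X
        for A in current:
            if X - A not in family:
                family.add(X - A)
                changed = True
        # close under unions of two members (enough on a finite set)
        for A, B in combinations(current, 2):
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family

# Hypothetical sample data: X has four points, S has two sets.
X = {1, 2, 3, 4}
S = [{1}, {1, 2}]
M = generated_sigma_algebra(X, S)
```

In this example the points 3 and 4 are never separated by a set of the generating collection, so no set of the resulting family separates them either.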
Example 1.
Let $X$ be a topological space, and let $\mathcal{S}$ be the collection of all open
sets. The $\sigma$-algebra generated by these open sets is called the algebra of
Borel sets. An element of this algebra is called Borel measurable. In
particular, every denumerable intersection of open sets and every denumerable
union of closed sets is Borel measurable.
Example 2.
Let $(X, \mathcal{M})$ be a measurable space. Let $f\colon X \to Y$ be a mapping of $X$
into some set $Y$. Let $\mathcal{N}$ be the collection of subsets $S$ of $Y$ such that
$f^{-1}(S)$ is measurable in $X$. Then $\mathcal{N}$ is a $\sigma$-algebra. The proof for this is
immediate from basic properties of inverse images of sets. We call $\mathcal{N}$ the
direct image of $\mathcal{M}$ under $f$, and could denote it by $f_*(\mathcal{M})$. (Cf. Exercise
1.)
Example 3.
Let $X$ be a measurable space, and let $Y$ be a subset. If $\mathcal{M}$ is the
collection of measurable sets of $X$, we let $\mathcal{M}_Y$ consist of all subsets $A \cap Y$,
where $A \in \mathcal{M}$. Then it is clear that $\mathcal{M}_Y$ is a $\sigma$-algebra, which is said to be
induced by $\mathcal{M}$ on $Y$. Then $(Y, \mathcal{M}_Y)$ is a measurable space.
Measurable Maps
If $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ are measurable spaces, and $f\colon X \to Y$ is a map, we
define $f$ to be measurable if for every $B \in \mathcal{N}$ the set $f^{-1}(B)$ is in $\mathcal{M}$. By
condition M2 below, one sees at once that if $Y$ is a topological space,
and $\mathcal{N}$ is the $\sigma$-algebra of Borel sets, then $f$ is measurable in this general
sense if and only if it satisfies the seemingly weaker condition stated in
M2, namely that the inverse image of an open set is measurable. In
practice, we deal only with maps into topological spaces, and in fact into
normed vector spaces.
M1. If $f\colon X \to Y$ is measurable, and $g\colon Y \to Z$ is measurable, then the
composite $g \circ f$ is measurable. This is clear.

M2. Let $f\colon X \to Y$ be a map into a topological space, with the $\sigma$-algebra
of Borel sets. Suppose that for every open $V$ in $Y$, the
inverse image $f^{-1}(V)$ is measurable. Then $f$ is measurable.

Proof. Let $\mathcal{N}$ be the collection of subsets $S$ of $Y$ such that $f^{-1}(S)$ is
measurable in $X$. Then $\mathcal{N}$ is a $\sigma$-algebra and contains the open sets.
Hence it contains all Borel sets in $Y$, thus proving the desired result.
From now on, our maps will have values in a topological space, with
the Borel sets as measurable sets.
We note at once that taking complements, we could have defined
measurability by the condition that the inverse image of a closed set is
measurable. Furthermore, we see that the inverse image of a countable
union of closed sets, and the inverse image of a countable intersection of
open sets is measurable because if $\{U_n\}$ is a sequence of open sets, then
$$f^{-1}\Bigl(\bigcap_{n} U_n\Bigr) = \bigcap_{n} f^{-1}(U_n),$$
and similarly for closed sets. Example. Let $J$ be a half-open interval
$(a, b]$ and let $f\colon X \to \mathbf{R}$ be measurable. Then
$$f^{-1}(J)$$
is measurable because we can write $(a, b]$ as the union of closed intervals
$$[a + 1/n,\ b]$$
for $n = 1, 2, \ldots$.
We shall now give a large number of criteria for mappings and sets to
be measurable, and we shall see that limit operations preserve measurability,
and algebraic operations likewise, under extremely mild hypotheses
on the image space $Y$. These hypotheses will always be satisfied in
practice, and trivially so in the case when we deal with maps into the
real or complex numbers, or into Euclidean $n$-space.
M3. Let $f\colon X \to Y \times Z$ be a map of a measurable space $X$ into a
product of topological spaces $Y$, $Z$. Write $f$ in terms of its coordinate
maps, $f = (g, h)$ where $g\colon X \to Y$ and $h\colon X \to Z$. If $f$ is measurable,
then so are $g$ and $h$. Conversely, if $g$, $h$ are measurable,
and every open set in $Y \times Z$ is a countable union of open sets
$V \times W$, where $V$ is open in $Y$ and $W$ is open in $Z$, then $f$ is
measurable.

Proof. If $f$ is measurable, then composing $f$ with the projections of
$Y \times Z$ on $Y$ or $Z$ shows that both $g$ and $h$ are measurable. Conversely,
if $g$, $h$ are measurable, then for any open sets $V$, $W$ in $Y$, $Z$ respectively,
we have
$$f^{-1}(V \times W) = g^{-1}(V) \cap h^{-1}(W).$$
Hence $f^{-1}(V \times W)$ is measurable. The measurability of $f^{-1}(U)$ for any
open set $U$ now follows from the assumption made on the topology of
$Y \times Z$.
M4. In particular, we conclude that a complex function f on X is

measurable if and only if its real part and imaginary part are
measurable.
Note that the condition expressed on the product space Y x Z in our

criterion is satisfied if Y, Z are metric spaces and have denumerable
everywhere dense sets. Thus they are satisfied if Y, Z are separable
Banach spaces, and in particular for euclidean n-space. Actually, in most
applications we integrate complex valued functions, so that there is no
problem with this extra condition.
M5. If $f$ is a measurable map of $X$ into a normed vector space, then
the absolute value $|f|$ is measurable, being composed of $f$ and the
continuous function $y \mapsto |y|$.

We would like the sum of two measurable maps $f$, $g$ into a normed
vector space $E$ to be measurable. Since the sum can be viewed as the
composite of the map $x \mapsto (f(x), g(x))$ and the sum map $E \times E \to E$,
which is continuous, what we want follows from our criterion concerning
maps into a product space, provided the extra condition is satisfied. In
particular, we obtain the following.
M6. Measurable complex valued functions on X form a vector space,

and similarly if the values are in a finite dimensional space, or if
we restrict ourselves to maps whose image is separable (i.e. contains
a countable dense set). Similarly, if f, g are measurable complex
functions on X, then the product fg is measurable.
For this last assertion, we note that the product is composed of the
map (f, g) and the product C x C -+ C, which is continuous.
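M7. Let $f\colon X \to Y$ be a mapping of $X$ into a metric space. Let $\{f_n\}$ be
a sequence of measurable mappings of $X$ into $Y$ which converges
pointwise to $f$. Then $f$ is measurable.

Proof. Let $U$ be open in $Y$. If $x \in f^{-1}(U)$, then for all $k$ sufficiently
large, we must have $x \in f_k^{-1}(U)$ because $f_k(x)$ converges to $f(x)$. Hence
for each $m$,
$$f^{-1}(U) \subset \bigcup_{k=m}^{\infty} f_k^{-1}(U)$$
and consequently
$$f^{-1}(U) \subset \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(U).$$
On the other hand, let $A$ be a closed set. Suppose that $x$ lies in every
union
$$\bigcup_{k=m}^{\infty} f_k^{-1}(A)$$
for all positive integers $m$. Then for arbitrarily large $k$, we see that $f_k(x)$
lies in $A$, and hence by assumption the limit $f(x)$ lies in $A$ because $A$ is
closed. Hence we obtain the reverse inclusion
$$\bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(A) \subset f^{-1}(A).$$
Let $V$ be a fixed open set. For each positive integer $n$ let $A_n$ be the
closed set of all $y \in Y$ such that $d(y, \mathcal{C}V) \geq 1/n$, and let $V_n$ be the open
set of all $y \in Y$ such that $d(y, \mathcal{C}V) > 1/n$. Then
$$V_n \subset A_n \subset V$$
and
$$V = \bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} V_n.$$
Thus we have the inclusions
$$f^{-1}(V) = \bigcup_{n} f^{-1}(A_n) \supset \bigcup_{n} \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(A_n)$$
and
$$f^{-1}(V) = \bigcup_{n} f^{-1}(V_n) \subset \bigcup_{n} \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(V_n).$$
Since $V_n \subset A_n$, the right-hand side of the second inclusion is contained in that of the first.
This proves that the equality holds, and shows that $f^{-1}(V)$ is measurable.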
This last result is really the main thing we were after. We need it
immediately in the next section to know that if $f$ is a limit of measurable
real valued functions, then for every real $a$, the set
$$f^{-1}(J)$$
is measurable when $J$ is equal to the interval of all $t > a$ or the interval
of all $t \geq a$.
In the definition and development of the first properties of the integral
in the subsequent sections, the limit property we have just proved, com-
bined with our definition, is the one which will be most useful. It turns
out that there is a condition which is necessary and sufficient for a map
to be measurable in all applications, but which we preferred to postpone
and state as a criterion rather than take as definition. We now discuss
this condition. It will be the useful one in dealing with further properties.
A map $f\colon X \to Z$ into any set $Z$ is said to be a simple map if it takes
on only a finite number of values, and if, for each $v \in Z$ the inverse image
$f^{-1}(v)$ is measurable. Thus $X$ can be written as a finite disjoint union,
$$X = \bigcup_{i=1}^{m} X_i$$
where each $X_i$ is measurable, and $f$ is constant on $X_i$.

It is clear that simple maps of $X$ into a Banach space $E$ themselves
form a vector space.
If $\{\varphi_n\}$ is a sequence of simple maps of $X$ into a Banach space $E$, and
$\{\varphi_n\}$ converges pointwise, then the limit is measurable, according to the
criterion M7. The converse is almost true, and is indeed true when $E$ is
finite dimensional (so in particular when $E$ represents the real or complex
numbers). We have:

M8. A map $f\colon X \to E$ of $X$ into a finite dimensional space is measurable
if and only if it is a pointwise limit of simple maps.
Proof. The result reduces immediately to the case when $E = \mathbf{R}$. We
leave the reduction to the reader. Thus assume that $f$ is measurable real
valued. For each integer $n \geq 1$ cut up the interval $[-n, n]$ into intervals
of equal length $1/n$ and denote these intervals by $J_1, \ldots, J_N$. We take
each $J_k$ to be closed on the left and open on the right. We let $J_{N+1}$
consist of all $t$ such that $t \geq n$ or $t < -n$. Let
$$A_k = f^{-1}(J_k) \quad \text{for } k = 1, \ldots, N + 1$$
so that each $A_k$ is measurable, the sets $A_k$ ($k = 1, \ldots, N + 1$) are disjoint,
and their union is $X$. On each $A_k$ we define a constant map $\psi_n$ by
$$\psi_n(A_k) = \inf J_k \quad \text{if } k = 1, \ldots, N.$$
We can write $A_{N+1} = B \cup B'$ where $B$ consists of those elements $x$ such
that $f(x) \geq n$ and $B'$ consists of those $x$ such that $f(x) < -n$. We define
$$\psi_n(B) = n \quad \text{and} \quad \psi_n(B') = -n.$$
Then the sequence $\{\psi_n\}$ converges pointwise to $f$, and each $\psi_n$ is a simple
function. This proves that measurability implies the other condition. The
converse is already known from M7, and thus our characterization of
measurable maps is proved.
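The approximating simple maps of this proof depend on a point $x$ only through the value $t = f(x)$, so they can be tried out numerically on values alone. The sketch below assumes the choice of the left end point of each interval as the constant value, and the names `psi` and `phi` are illustrative, not notation from the text. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction
import math

def psi(n, t):
    """Value of the n-th simple approximation at a point x with f(x) = t."""
    t = Fraction(t)
    if t >= n:        # x lies in B
        return Fraction(n)
    if t < -n:        # x lies in B'
        return Fraction(-n)
    # left end point of the interval [j/n, (j+1)/n) containing t
    return Fraction(math.floor(t * n), n)

def phi(n, t):
    """Maximum of psi(1, t), ..., psi(n, t); increasing in n."""
    return max(psi(m, t) for m in range(1, n + 1))

t = Fraction(7, 3)
approximations = [psi(n, t) for n in range(1, 8)]
increasing = [phi(n, t) for n in range(1, 8)]
```

As soon as $n$ exceeds $|t|$ the error $t - \psi_n(t)$ lies in $[0, 1/n)$, which is the pointwise convergence used in the proof. For $t \geq 0$ the running maximum `phi` is one natural way to obtain an increasing sequence with the same limit.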
The construction of the case we just discussed yields a useful additional
property in the positive case:

M9. Let $f\colon X \to \mathbf{R}_{\geq 0}$ be a positive real valued measurable map. Then
$f$ is a pointwise limit of an increasing sequence of simple maps.

Proof. The functions $\psi_n$ defined above are all $\leq f$, and we let
$$\varphi_n = \max(\psi_1, \ldots, \psi_n).$$
Then $\{\varphi_n\}$ is increasing to $f$, as desired.
After discussing positive measures, we shall discuss a variant of condition
M8, related to a given measure.
Positive Measures
We shall now define positive measures. To do this, it is convenient to
introduce the symbol $\infty$ in the context of positivity (after all, we want
some sets to have infinite measure).
We let $\infty$ be a symbol unequal to any real number. By $[0, \infty]$ (which
we call also an interval) we mean all $t$ which are real $\geq 0$ or $\infty$. We
introduce the obvious ordering in $[0, \infty]$, with $a < \infty$ for every real $a$.
We define addition and multiplication in $[0, \infty]$ by the convention that
$$\infty \cdot a = a \cdot \infty = 0 \quad \text{if } a = 0,$$
$$\infty \cdot a = a \cdot \infty = \infty \quad \text{if } 0 < a \leq \infty,$$
$$\infty + a = a + \infty = \infty \quad \text{if } 0 \leq a \leq \infty.$$
Then associativity, distributivity, and commutativity hold in $[0, \infty]$. The
sum of a sequence of elements in $[0, \infty]$ then can be viewed to converge
to a number $\geq 0$ or to $\infty$.
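The convention that infinity times 0 equals 0 differs from what floating-point arithmetic does, so a computation that models the interval from 0 to infinity has to impose it by hand. The following is a small sketch; the helper name `ext_mul` is made up for illustration.

```python
import math

INF = math.inf  # plays the role of the symbol "infinity"

def ext_mul(a, b):
    """Product in [0, inf] with the convention inf * 0 = 0 * inf = 0."""
    if a < 0 or b < 0:
        raise ValueError("only defined for elements of [0, inf]")
    if a == 0 or b == 0:
        return 0.0
    return a * b

product_with_zero = ext_mul(INF, 0)      # 0 by the measure-theoretic convention
product_positive = ext_mul(INF, 2.5)     # infinity
float_product = INF * 0                  # IEEE arithmetic gives NaN instead
sum_with_inf = INF + 3                   # addition already behaves as required
```

The convention is the one that makes a set of measure 0 contribute nothing to a sum even when a function takes an infinite value on it.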
Let $X$ be a measurable space and let $\mathcal{M}$ be the collection of its
measurable sets. A positive measure on $\mathcal{M}$ (or on $X$, by abuse of language)
is a map
$$\mu\colon \mathcal{M} \to [0, \infty]$$
which is countably additive. In other words $\mu(\emptyset) = 0$, and if $\{A_n\}$ is
a sequence of measurable sets which are mutually disjoint ($A_n \cap A_m$ is
empty if $n \neq m$), then
$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n).$$
If $A$ is measurable, we call $\mu(A)$ its measure, or $\mu$-measure if the reference
to $\mu$ is necessary to avoid confusion.
Examples. Let $X$ be a set and $x_0$ an element of $X$. If $A$ is a subset of
$X$ containing $x_0$, we define $\mu(A) = 1$. If $A$ does not contain $x_0$ we define
$\mu(A) = 0$. It is immediately verified that this defines a measure, called the
Dirac measure at $x_0$.
As another example, if a subset is finite, we define its measure to be its
number of elements, and if a subset is infinite, we define its measure to
be 00. Again it is immediately verified that this defines a measure, called
the counting measure.
We shall identify measures with integrals later.
A measurable space together with a measure is called a measured
space. When we want to specify all data in the notation, we write the
full triple $(X, \mathcal{M}, \mu)$ for a measured space.
We derive some trivial consequences from the definition of a positive
measure.
First we note that the additivity of $\mu$ holds for finite sequences since
we can take all but a finite number of the $A_n$ to be empty.
Next, a measure satisfies properties of monotonicity, namely:
If $A$, $B$ are measurable, $A \subset B$, then $\mu(A) \leq \mu(B)$.
This is obvious because we can write $B = A \cup (B - A)$.
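Proposition 1.1. If $\{A_n\}$ is a sequence of measurable sets and $A_n \subset A_{n+1}$
for all $n$, and if
$$A = \bigcup_{n=1}^{\infty} A_n$$
then
$$\mu(A) = \lim_{n \to \infty} \mu(A_n).$$
(This is understood in the obvious sense if $\mu(A) = \infty$.) To prove this, we
let $A_0$ be the empty set, write
$$A = \bigcup_{n=0}^{\infty} (A_{n+1} - A_n)$$
and use the countable additivity. We get
$$\mu(A) = \lim_{N \to \infty} \sum_{n=0}^{N} \mu(A_{n+1} - A_n) = \lim_{N \to \infty} \mu(A_N),$$
as was to be shown.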
It will occasionally be useful to have the following characterization of
measures:

Proposition 1.2. A map $\mu\colon \mathcal{M} \to [0, \infty]$ is a measure if and only if
$\mu(\emptyset) = 0$, $\mu$ is finitely additive, and if $\{A_n\}$ is an increasing sequence of
measurable sets whose union is $A$, then
$$\mu(A) = \lim_{n \to \infty} \mu(A_n).$$
Our assertion is obvious, taking into account our preceding arguments.
Proposition 1.3. If $\{A_n\}$ is a decreasing sequence of measurable sets, i.e.
$A_{n+1} \subset A_n$ for all $n$, if some $A_n$ has finite measure, and if
$$A = \bigcap_{n=1}^{\infty} A_n,$$
then
$$\lim_{n \to \infty} \mu(A_n) = \mu(A).$$
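To prove this, say $\mu(A_1) \neq \infty$. We write
$$A_1 = (A_1 - A_n) \cup A_n \quad \text{(disjoint union)}.$$
The sets $A_1 - A_n$ form an ascending sequence, whose union is $A_1 - A$.
By our previous result, we conclude that
$$\mu(A_1) - \mu(A) = \mu(A_1 - A) = \lim_{n \to \infty} \mu(A_1 - A_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n).$$
Our assertion follows.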

Note that if we do not assume that some $A_n$ has finite measure, then
the conclusion may be false. Indeed if all $A_n$ have infinite measure, their
intersection may be empty. Think of the real numbers $\geq n$.
If $\{A_n\}$ is an arbitrary sequence of measurable sets, then in general we
have only
$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \leq \sum_{n=1}^{\infty} \mu(A_n).$$
This is again obvious.

Having the notion of (positive) measure on $\mathcal{M}$ we emphasize the role
played by sets of measure 0, and we shall use the following terminology.
A property of elements of $X$ is said to hold almost everywhere, or for
almost all $x$, if there exists a set $S$ of measure 0 such that the property
holds for all $x \notin S$. For instance, if $f\colon X \to \mathbf{R}$ is a map of $X$ into the
reals, we say that $f \geq 0$ almost everywhere if $f(x) \geq 0$ for almost all $x$,
i.e. for all $x$ outside a set of measure 0. Of course, we should really put
the $\mu$ into the notation, and say $\mu$-almost everywhere or $\mu$-almost all, but
since we deal with a fixed measure, we omit the prefix $\mu$- for simplicity.
In developing the theory of the integral, we follow the oldest idea,
which is first to integrate step maps and then take limits. We shall now
discuss the measure theoretic aspect of this procedure.
Let $A$ be a set of finite measure. By a partition of $A$ we mean a finite
sequence $\{A_i\}$ ($i = 1, \ldots, r$) of measurable sets which are disjoint and such
that
$$A = \bigcup_{i=1}^{r} A_i.$$
Let $E$ be a Banach space. A map $f\colon X \to E$ is called a step map with
respect to such a partition if $f$ is equal to 0 outside $A$ (that is $f(x) = 0$ if
$x \notin A$), and $f(A_i)$ has one element for each $i$ (i.e. $f$ is constant on $A_i$). A
map $f\colon X \to E$ is said to be a step map if it is step with respect to some
partition of some set of finite measure. We denote the set of all step
maps by $\mathrm{St}(\mu, E)$ or more briefly by $\mathrm{St}(\mu)$.
If $Y$ is a measurable subset of $X$, then the restriction to $Y$ of a step
map on $X$ is a step map on $Y$. Conversely, a step map on $Y$ can be
extended to a step map on $X$ by giving it value 0 outside $Y$. If $f$ is a
map on $X$, we denote by $f_Y$ the map such that $f_Y(x) = 0$ if $x \notin Y$ and
$f_Y(x) = f(x)$ if $x \in Y$.
The set of step maps $\mathrm{St}(\mu, E)$ is a vector space. If $f$ is a step map, then
so is $|f|$. If $f\colon X \to E$ is a step map and $g\colon X \to \mathbf{C}$ is a step function,
then $gf$ (also written $fg$) is a step map.
Proof. This is proved trivially using a refinement of two partitions.
Indeed, if $\{A_i\}$ and $\{B_j\}$ are two partitions of $A$, then
$$\{A_i \cap B_j\}$$
is also a partition. Also, if $f$ is 0 outside $A$, and $g$ is 0 outside $B$, and $A$,
$B$ are measurable of finite measure, then $A \cup B$ has finite measure, and
we can find a partition of $A \cup B$ with respect to which both $f$ and $g$ are
step maps. From this our assertions are obvious.
We shall not use the rest of this section until the corollaries of the
dominated convergence theorem in §5.
We shall define the integral on certain maps which are limits of step
maps. The present discussion is devoted to such limits. We define a map
to be $\mu$-measurable if it is a pointwise limit of a sequence of step maps
almost everywhere. In other words, if there exists a set $Z$ of measure 0
and a sequence of step maps $\{\varphi_n\}$ such that $\{\varphi_n(x)\}$ converges to $f(x)$ for
all $x \notin Z$. Let $f\colon X \to Y$ be $\mu$-measurable, and let $A \subset X$ and $B \subset Y$ be
measurable subsets with $f(A) \subset B$. Then the induced map $f\colon A \to B$ is
$\mu$-measurable. Instead of M1, we have: if $f\colon X \to E$ is $\mu$-measurable, and
$g\colon E \to F$ is continuous, then $g \circ f$ is $\mu$-measurable.
M10. The $\mu$-measurable maps of $X$ into $E$ form a vector space. If $f$, $g$
are $\mu$-measurable functions (complex), so is their product. In fact,
if $f\colon X \to E$ and $g\colon X \to F$ are $\mu$-measurable maps into Banach
spaces, and $E \times F \to G$ is a continuous bilinear map, then the
product $fg$ (with respect to this map) is $\mu$-measurable. The absolute
value $|f|$ is $\mu$-measurable. If $f$ is a $\mu$-measurable function
such that $f(x) \neq 0$ for all $x$, then $1/f$ is $\mu$-measurable.

Proof. All statements are clear, except possibly the last, for which we
give the argument: If $\{\varphi_n\}$ is a sequence of step functions converging
pointwise to $f$, then we let $\psi_n(x) = 1/\varphi_n(x)$ if $\varphi_n(x) \neq 0$ and $\psi_n(x) = 0$ if
$\varphi_n(x) = 0$. Then $\psi_n$ is step, and the sequence $\{\psi_n\}$ converges pointwise to
$1/f$.
The property of $\mu$-measurability builds in some very strong finiteness
properties on both the set of departure and the set of arrival of the map.
To begin with, it is clear that a $\mu$-measurable map vanishes outside a
countable union of sets of finite measure. Such sets are important. We
give a name to them, and say that a measurable subset $Y$ of $X$ is $\sigma$-finite
if it is a countable union of sets of finite measure. More accurately, we
should really say that $\mu$ is $\sigma$-finite on $Y$, and we should say that $\mu$ is
$\sigma$-finite if it is $\sigma$-finite on $X$. However, we allow ourselves the other
terminology when $\mu$ is fixed throughout a discussion.
Secondly, there exists a set $Z$ of measure 0 such that the image
$f(X - Z)$ of the complement of $Z$ contains a countable dense set (i.e. is
separable). This is clear since outside such $Z$ the map $f$ is a pointwise
limit of step maps, and thus the image of $X - Z$ lies in the closure of a
set which is a countable union of finite sets. Thus we now have two
necessary conditions for a measurable map to be $\mu$-measurable, namely
countability conditions on its domain and range. It turns out that these
are sufficient.
M11. Let $f\colon X \to E$ be a map of $X$ into a Banach space. The following
two conditions are equivalent:
(i) There exists a set $Z$ of measure 0 such that the restriction of
$f$ to the complement of $Z$ is measurable, $f$ vanishes outside
a $\sigma$-finite subset of $X$, and the image $f(X - Z)$ contains a
countable dense set.
(ii) The map $f$ is a pointwise limit almost everywhere of a sequence
of step maps (that is, $f$ is $\mu$-measurable).
In particular, if $\mu$ is $\sigma$-finite and if $f$ is a function (complex
valued), then $f$ is $\mu$-measurable if and only if there exists a subset
$Z$ of measure 0 such that $f$ is measurable on the complement of
$Z$.
Proof. We have already proved that (ii) implies (i), using our preceding
remarks, and M7. Conversely, assume (i). We may assume that $X$ is
a disjoint union of subsets $X_k$ ($k = 1, 2, \ldots$) of finite measure. If we can
prove that the restriction $f|X_k$ of $f$ to each $X_k$ is $\mu$-measurable, then for
each $k$ there is a sequence $\{\varphi_j^{(k)}\}$ ($j = 1, 2, \ldots$) of step maps on $X_k$ which
converges almost everywhere to $f|X_k$. We define $\varphi_n$ by the following
values:
$$\varphi_n(x) = \varphi_n^{(k)}(x) \quad \text{if } x \in X_k \text{ for } k = 1, \ldots, n,$$
$$\varphi_n(x) = 0 \quad \text{if } x \notin X_1 \cup \cdots \cup X_n.$$
Then each $\varphi_n$ is a step map, and the sequence $\{\varphi_n\}$ converges almost
everywhere to $f$. This reduces the proof that $f$ is $\mu$-measurable to the
case when $X$ has finite measure.
Suppose therefore that $X$ has finite measure. We may also assume
that the image of $f$ contains a countable dense set $\{v_k\}$ ($k = 1, 2, \ldots$).
For each positive integer $n$, let $B_{1/n}(v_k)$ be the open ball of radius $1/n$
centered at $v_k$. The union of these balls for all $k = 1, 2, \ldots$ covers the
image of $f$, whence the union of the inverse images under $f$ covers $X$
itself. If we take $k$ large, it follows that the finite union of inverse images
differs from $X$ by a set $Y_n$ such that $\mu(Y_n) < 1/2^n$. We let
$$Z_n = Y_n \cup Y_{n+1} \cup \cdots$$
so that $\mu(Z_n) \leq 1/2^{n-1}$. Then $Z_n \supset Z_{n+1} \supset \cdots$ is a decreasing sequence.
On $X - Y_n$ we can obviously find a step map $\varphi_n$ such that
$$|f(x) - \varphi_n(x)| < 1/n \quad \text{for } x \notin Y_n.$$
We simply define the map $\varphi_n$ inductively to have the value $v_1$ on the
inverse image of $B_{1/n}(v_1)$, the value $v_2$ on the inverse image of
$$B_{1/n}(v_2) - B_{1/n}(v_1),$$
and so forth. We let $\psi_n$ be equal to $\varphi_n$ on $X - Z_n$ and give $\psi_n$ the value
0 on $Z_n$. Then $\psi_n$ is a step map, and the sequence $\{\psi_n\}$ converges
pointwise to $f$, except possibly on the set $Z$ equal to the intersection of
all $Z_n$, which has measure 0. This proves what we wanted.
Remark 1. The proof is substantially the same as that of M8, granting
the necessary adjustment to the more general situation.

Remark 2. We get some uniformity of convergence from the proof,
outside a set of arbitrarily small measure.

Remark 3. We took values of $f$ in a Banach space, but for purposes
of M11, values in any complete metric space would have done just as
well. The additive structure plays no role. However, in all subsequent
applications, we deal with maps in vector spaces where the additive
structure does play a role.

Remark 4. Let $\mathcal{M}$ be the $\sigma$-algebra of all subsets of the set $X$. Let
$f\colon X \to E$ be an arbitrary map into a Banach space. Then $f$ is measurable,
and $\mu$-measurable if $\mu$ is such that $\mu(Y) = 0$ for all subsets $Y$ of $X$.
This shows that it is reasonable to exclude the behavior on a set of
measure 0 in our definition of $\mu$-measurability.
M12. Let $\{f_n\}$ be a sequence of $\mu$-measurable maps, converging almost
everywhere to a map $f$. Then $f$ is $\mu$-measurable.

Proof. This is clear by using (i) of M11, and the following facts: A
denumerable union of sets of measure 0 has measure 0. A denumerable
union of sets having countable dense subsets has a countable dense
subset. [If $\{D_k\}$ is a sequence of denumerable sets in a metric space, then
$$\overline{\bigcup_{k=1}^{\infty} D_k} \supset \overline{D_n} \quad \text{for all } n, \quad \text{whence} \quad \overline{\bigcup_{k=1}^{\infty} D_k} \supset \bigcup_{n=1}^{\infty} \overline{D_n},$$
so that
$$\overline{\bigcup_{k=1}^{\infty} D_k} = \overline{\bigcup_{n=1}^{\infty} \overline{D_n}},$$
and our second statement is clear also.]

Property M12 concludes the list of properties which show that $\mu$-measurability
is preserved under the standard operations of analysis, with
the sole exception of composition of maps, contrary to M1.
For the rest of this chapter, we let $(X, \mathcal{M}, \mu)$ be a measured space, i.e.
$\mathcal{M}$ is a $\sigma$-algebra in $X$, and $\mu$ is a positive measure on $\mathcal{M}$. We let $E$ be a
Banach space. At first reading, the reader may assume that all maps $f$ are
complex or real valued, that is $E = \mathbf{C}$ or $\mathbf{R}$. No proof or notation would
be made shorter by this assumption.
VI, §2. THE INTEGRAL OF STEP MAPS
If $A$ is a measurable set of finite measure, and $f$ is a step map with
respect to a partition $\{A_i\}$ ($i = 1, \ldots, r$) of $A$, then we define its integral to
be
$$\int_X f\,d\mu = \sum_{i=1}^{r} \mu(A_i) f(A_i).$$
If $\{B_j\}$ ($j = 1, \ldots, s$) is another partition of $A$, then $f$ is step with respect
to the partition $\{A_i \cap B_j\}$ and we have
$$\sum_{j=1}^{s} \mu(A_i \cap B_j) f(A_i) = \mu(A_i) f(A_i).$$
Summing over $i$ shows that our integral does not depend on the partition
of $A$. If $f$ is step with respect to a partition of a set $A$ and a set $B$, then
it is also step with respect to a partition of $A \cup B$, and we see that our
integral is therefore well defined.
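The independence of the partition can be checked by hand on a toy example. The sketch below assumes a finite set carrying point masses as the measured space; the weights, the partitions `P` and `Q`, and all function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical measure on X = {0, ..., 5}: each point carries a mass.
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6),
           3: Fraction(2), 4: Fraction(1), 5: Fraction(3)}

def mu(A):
    """Measure of a subset A: the sum of the masses of its points."""
    return sum((weights[x] for x in A), Fraction(0))

def integral(partition, f):
    """Sum of mu(A_i) * f(A_i); f must be constant on each block A_i."""
    total = Fraction(0)
    for block in partition:
        values = {f(x) for x in block}
        assert len(values) == 1, "f is not step with respect to this partition"
        total += mu(block) * values.pop()
    return total

def refine(P, Q):
    """Common refinement {A_i intersected with B_j}, empty pieces dropped."""
    return [a & b for a in P for b in Q if a & b]

def f(x):
    """A step map: 2 on {0,1,2}, -1 on {3,4}, and 0 outside A = {0,...,4}."""
    if x in {0, 1, 2}:
        return Fraction(2)
    if x in {3, 4}:
        return Fraction(-1)
    return Fraction(0)

P = [frozenset({0, 1, 2}), frozenset({3, 4})]   # partition adapted to f
Q = [frozenset({0, 3}), frozenset({1, 2, 4})]   # another partition of A
coarse = integral(P, f)
fine = integral(refine(P, Q), f)
```

Both computations give the same value, as the summation over $j$ in the text predicts. Note that `f` is not step with respect to `Q` alone, only with respect to the refinement.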
If $A$ is an arbitrary measurable subset and $f$ is a step map on $X$,
recall that $f_A$ is the map such that $f_A(x) = f(x)$ if $x \in A$ and $f_A(x) = 0$ if
$x \notin A$. Then $f_A$ is a step map both on $A$ and on $X$, and we define
$$\int_A f\,d\mu = \int_X f_A\,d\mu.$$
If $\mu$ remains fixed throughout a discussion, we write
$$\int_X f \quad \text{instead of} \quad \int_X f\,d\mu,$$
and even omit the $X$ if the total space $X$ is fixed, so that we also write
$$\int f \quad \text{instead of} \quad \int_X f.$$
If we integrate over a subset of $X$, then we shall always specify this
subset, however. We now have trivial properties of the integral.
First, the integral is obviously a linear map
$$\int\colon \mathrm{St}(\mu, E) \to E$$
which satisfies the following properties.
If $A$, $B$ are disjoint, then
$$(1) \qquad \int_{A \cup B} f = \int_A f + \int_B f.$$
This is clear from the linearity, and the fact that $f_{A \cup B} = f_A + f_B$.
Over the reals, the integral is an increasing function of its variables.
This means: If $E = \mathbf{R}$ and $f \leq g$, then
$$(2) \qquad \int_X f \leq \int_X g.$$
Furthermore, if $f \geq 0$ and $A \subset B$, then
$$(3) \qquad \int_A f \leq \int_B f.$$
Property 2 can be obtained from its positive alternate, namely
(2P) If $f \geq 0$, then $\int_X f \geq 0$.
Indeed, we just use linearity on $g - f$.

Finally, the integral satisfies the inequalities
$$(4) \qquad \Bigl|\int_A f\,d\mu\Bigr| \leq \int_A |f|\,d\mu \leq \|f\|\,\mu(A),$$
where $\|\ \|$ is the sup norm. This is an obvious estimate on a finite sum
expressing the integral.
We can define a seminorm on the space of step maps, by letting
$$\|f\|_1 = \int_X |f|\,d\mu.$$
That this is a seminorm is immediately verified. For instance, to show
that
$$\|f + g\|_1 \leq \|f\|_1 + \|g\|_1,$$
we take a partition of a set of finite measure such that both $f$ and $g$ are
step maps with respect to this partition, and then we estimate using the
triangle inequality. This seminorm will be called the $L^1$-seminorm.
Note. The results of this section are at the level of a first course in
calculus. We don't take limits, and our results depend only on the
presence of an algebra (not necessarily a $\sigma$-algebra) and a map $\mu$ of this
algebra into the reals $\geq 0$ which is additive, i.e.
$$\mu(A \cup B) = \mu(A) + \mu(B)$$
for $A$, $B$ disjoint in the algebra.
VI, §3. THE $L^1$-COMPLETION
We wish to investigate the completion of our space of step maps with
respect to the $L^1$-seminorm. We recall that the completion is defined to
be the space of equivalence classes of $L^1$-Cauchy sequences, and that two
Cauchy sequences are said to be equivalent if their difference is an $L^1$-null
sequence. We denote the completion by $L^1(\mu)$. We recall that the
$L^1$-seminorm extends by continuity to a norm on this completion. We
have a linear map
$$\mathrm{St}(\mu) \to L^1(\mu)$$
whose kernel is the subspace of step maps whose $L^1$-norm is 0. We shall
describe this kernel in a more general situation later.
We want to determine a certain space of functions corresponding as
closely as possible to the elements of $L^1(\mu)$. If every $L^1$-Cauchy sequence
were also pointwise convergent, there would be no problem. This is
however not the case, but the situation is close enough to this so that we
can almost think in these terms.
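A standard illustration of this failure, not taken from the text, is the so-called typewriter sequence of indicator functions of dyadic intervals sweeping repeatedly across $[0, 1)$. The sketch below assumes length (Lebesgue measure) on $[0, 1)$, which has not been constructed at this point of the chapter, and all names are illustrative. Writing $n = 2^k + j$ with $0 \leq j < 2^k$, the $n$-th function is 1 on $[j/2^k, (j+1)/2^k)$ and 0 elsewhere.

```python
from fractions import Fraction

def level_and_offset(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def f(n, x):
    """Indicator of the dyadic interval [j/2**k, (j+1)/2**k) at the point x."""
    k, j = level_and_offset(n)
    return 1 if Fraction(j, 1 << k) <= x < Fraction(j + 1, 1 << k) else 0

def l1_norm(n):
    """Integral of |f_n| over [0, 1): the length of the interval."""
    k, _ = level_and_offset(n)
    return Fraction(1, 1 << k)

x = Fraction(1, 3)
values = [f(n, x) for n in range(1, 64)]             # levels k = 0, ..., 5
hits = sum(values)                                   # one hit per level
subsequence = [f(2 ** k, x) for k in range(2, 10)]   # intervals [0, 2**-k)
```

The norms tend to 0, so the sequence is $L^1$-Cauchy, yet at every point the values return to 1 once in each level and therefore do not converge. The subsequence indexed by powers of 2 does converge to 0 at every $x > 0$, which matches the role played by subsequences in the lemma that follows.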
We define $\mathcal{L}^1(\mu)$ to be the set of mappings $f$ such that there exists an
$L^1$-Cauchy sequence of step mappings converging almost everywhere to
$f$. If $\{f_n\}$ and $\{g_n\}$ are $L^1$-Cauchy sequences of step mappings converging
almost everywhere to $f$ and $g$ respectively, then $\{f_n + g_n\}$ and $\{\alpha f_n\}$
(for any number $\alpha$) are $L^1$-Cauchy and converge almost everywhere to
$f + g$ and $\alpha f$ respectively. Consequently $\mathcal{L}^1(\mu)$ is a vector space.
In this section and the next, we speak of Cauchy sequences instead of
$L^1$-Cauchy sequences since this is the only seminorm which will enter
into considerations. Since we have several notions of convergence, however,
we still specify by an adjective the type of convergence meant in
each case. Actually, it will be useful to say that a sequence $\{f_n\}$ approximates
an element $f$ of $\mathcal{L}^1$ if $\{f_n\}$ is $L^1$-Cauchy and converges to $f$
almost everywhere.
We shall extend the integral to $\mathcal{L}^1$, and we need two lemmas, which
show that our approximation technique is not far removed from uniform
approximation. The first is the fundamental lemma of integration.

Lemma 3.1. Let $\{f_n\}$ be a Cauchy sequence of step mappings. Then
there exists a subsequence which converges pointwise almost everywhere,
and satisfies the additional property: given $\varepsilon$ there exists a set $Z$ of
measure $< \varepsilon$ such that this subsequence converges absolutely and uniformly
outside $Z$.
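Proof. For each integer $k$ there exists $N_k$ such that if $m, n \geq N_k$, then
$$\|f_m - f_n\|_1 < \frac{1}{2^{2k}}.$$
We let our subsequence be $g_k = f_{N_k}$, taking the $N_k$ inductively to be
strictly increasing. Then we have for all $m$, $n$:
$$\|g_m - g_n\|_1 < \frac{1}{2^{2n}} \quad \text{if } m \geq n.$$
We shall show that the series
$$g_1(x) + \sum_{k=1}^{\infty} \bigl(g_{k+1}(x) - g_k(x)\bigr)$$
converges absolutely for almost all $x$ to an element of $E$, and in fact we
shall prove that this convergence is uniform except on a set of arbitrarily
small measure.
Let $Y_n$ be the set of $x \in X$ such that
$$|g_{n+1}(x) - g_n(x)| \geq \frac{1}{2^n}.$$
Since $g_n$ and $g_{n+1}$ are step mappings, it follows that $Y_n$ has finite measure.
On $Y_n$, we have the inequality
$$\frac{1}{2^n} \leq |g_{n+1} - g_n|$$
whence
$$\frac{1}{2^n}\,\mu(Y_n) \leq \int_{Y_n} |g_{n+1} - g_n| \leq \int_X |g_{n+1} - g_n| \leq \frac{1}{2^{2n}}.$$
Hence
$$\mu(Y_n) \leq \frac{1}{2^n}.$$
Let
$$Z_n = Y_n \cup Y_{n+1} \cup \cdots.$$
Then
$$\mu(Z_n) \leq \frac{1}{2^{n-1}}.$$
If $x \notin Z_n$, then for $k \geq n$ we have
$$|g_{k+1}(x) - g_k(x)| < \frac{1}{2^k},$$
and from this we conclude that our series
$$\sum_{k=n}^{\infty} \bigl(g_{k+1}(x) - g_k(x)\bigr)$$
is absolutely and uniformly convergent, for $x \notin Z_n$. This proves the statement
concerning the uniform convergence. If we let $Z$ be the intersection
of all $Z_n$, then $Z$ has measure 0, and if $x \notin Z$, then $x \notin Z_n$ for some $n$,
whence our series converges for this $x$. This proves the lemma.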
Lemma 3.2. Let $\{g_n\}$ and $\{h_n\}$ be Cauchy sequences of step mappings
of $X$ into $E$, converging almost everywhere to the same map. Then the
following limits exist and are equal:
$$\lim \int_X g_n = \lim \int_X h_n.$$
Furthermore, the Cauchy sequences $\{g_n\}$ and $\{h_n\}$ are equivalent, i.e.
$\{g_n - h_n\}$ is an $L^1$-null sequence.
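Proof. The existence of the limit of each integral is of course a triviality.
To see the argument once more, we have
$$\left|\int_X g_n - \int_X g_m\right| \leq \int_X |g_n - g_m| = \|g_n - g_m\|_1,$$
so that $\{\int_X g_n\}$ is a Cauchy sequence, whence converges. Let $f_n = g_n - h_n$.
Then $\{f_n\}$ is Cauchy, converges almost everywhere to 0, and we must
prove that the integrals
$$\int_X f_n \quad \text{and} \quad \int_X |f_n|$$
converge to 0.
Given $\epsilon$, there exists $N$ such that if $m, n \geq N$ we have
$$\|f_m - f_n\|_1 < \epsilon.$$
Let $A$ be a set of finite measure outside of which $f_N$ vanishes. Then for
all $n \geq N$ we have
$$\int_{\mathcal{C}A} |f_n| = \int_{\mathcal{C}A} |f_n - f_N| \leq \|f_n - f_N\|_1 < \epsilon.$$
By Lemma 3.1, there exists a subset $Z$ of $A$ such that
$$\mu(Z) < \frac{\epsilon}{1 + \|f_N\|}$$
(where $\|f_N\|$ is the sup norm) and a subsequence of $n$ such that $\{f_n\}$ tends to 0
uniformly on $A - Z$.
Then for $n$ large in this subsequence, we conclude that
$$\int_{A - Z} |f_n| < \epsilon.$$
Finally for $n$ large in this subsequence we have
$$\int_Z |f_n| \leq \int_Z |f_n - f_N| + \int_Z |f_N| \leq \|f_n - f_N\|_1 + \mu(Z)\|f_N\| < 2\epsilon.$$
Taking the sum of our integrals over $\mathcal{C}A$, $A - Z$, and $Z$ we find the
desired bound,
$$\|f_n\|_1 < 4\epsilon.$$
This proves the lemma.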
In view of Lemma 3.2, for every $f$ in $\mathcal{L}^1$ we can define the integral
$$\int_X f\,d\mu = \lim_{n \to \infty} \int_X f_n\,d\mu$$
using any approximating sequence of step maps $\{f_n\}$ to $f$. Elements of
$\mathcal{L}^1$ will therefore be called integrable maps. It is clear that the integral is
a linear map of $\mathcal{L}^1$ into $E$.
We want to extend the seminorm $\|\ \|_1$ to $\mathcal{L}^1$. We need a lemma for
this.
Lemma 3.3. If $f$ is integrable and $\{f_n\}$ is an approximating sequence
of step maps, then $|f|$ is integrable, and $\{|f_n|\}$ approximates $|f|$. In
particular,
$$\int_X |f| = \lim \int_X |f_n| = \lim \|f_n\|_1.$$
Proof. It is clear that $|f_n|$ converges to $|f|$ almost everywhere, so that
$|f|$ is integrable once we know that $\{|f_n|\}$ is a Cauchy sequence. To see this, we note that
$$\bigl|\,|f_n| - |f_m|\,\bigr| \leq |f_n - f_m|,$$
whence
$$\bigl\|\,|f_n| - |f_m|\,\bigr\|_1 = \int_X \bigl|\,|f_n| - |f_m|\,\bigr| \leq \int_X |f_n - f_m| = \|f_n - f_m\|_1.$$
Lemma 3.3 implies in particular that
$$\lim \|f_n\|_1$$
is independent of the choice of approximating sequence $\{f_n\}$ to $f$, and
thus allows us to define
$$\|f\|_1 = \int_X |f| = \lim \|f_n\|_1.$$
By continuity, this is trivially verified to be a seminorm on $\mathcal{L}^1$.
Let us summarize what we have done. Our purpose was to construct
a completion (essentially) of the space of step mappings, under the
$L^1$-seminorm. In any case, we have constructed a space $\mathcal{L}^1$ on which we
have extended the integral and the seminorm by continuity. We must
still show that this space is complete. We could now either relate our
$\mathcal{L}^1$ with the space of equivalence classes of Cauchy sequences, and use
the result of Chapter IV, §4, that this latter space is complete, or reproduce
independently the proof of that result in the present instance. For
convenience, we do the latter.
Theorem 3.4. The space $\mathcal{L}^1$ is complete, under the seminorm $\|\ \|_1$.
Proof. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$. For each $n$ there exists
an element $g_n \in \mathrm{St}(\mu)$ such that
$$\|f_n - g_n\|_1 < \frac{1}{n}.$$
The sequence $\{g_n\}$ is then Cauchy. Indeed, we have
$$\|g_n - g_m\|_1 \leq \|g_n - f_n\|_1 + \|f_n - f_m\|_1 + \|f_m - g_m\|_1,$$
which gives a $3\epsilon$-proof of the fact that $\{g_n\}$ is a Cauchy sequence. For a
subsequence of $n$, we know by Lemma 3.1 that $\{g_n\}$ converges almost
everywhere to a function $f$ in $\mathcal{L}^1$. For this subsequence, we then have
$$\|f - f_n\|_1 \leq \|f - g_n\|_1 + \|g_n - f_n\|_1,$$
and this is $< 2\epsilon$ for $n$ sufficiently large in the subsequence. Hence the
subsequence is $L^1$-convergent to $f$. It follows that the sequence $\{f_n\}$ itself
is $L^1$-convergent to $f$, which concludes the proof.
Note that the statement of Theorem 3.4 is to be interpreted in the
sense that given a Cauchy sequence $\{f_n\}$ of elements in $\mathcal{L}^1$, there exists
some $f$ in $\mathcal{L}^1$ such that given $\epsilon$, we have $\|f_n - f\|_1 < \epsilon$ for $n$ sufficiently
large. We still have the possibility that the seminorm $\|\ \|_1$ is not a
norm, so that strictly speaking, "the" completion in the sense of Chapter
IV, §4, would be the factor space of $\mathcal{L}^1$ by the subspace of all elements $f$
such that $\|f\|_1 = 0$.
Let us now take for granted the existence of a completion as the
space of equivalence classes of Cauchy sequences of step maps, modulo
null sequences. Denote this by $L^1(\mu)$. Then we can define a map
$$\gamma : \mathcal{L}^1 \to L^1(\mu)$$
which to each integrable $f \in \mathcal{L}^1$ associates the equivalence class of a
Cauchy sequence $\{f_n\}$ approximating $f$. Lemma 3.2 shows that this map
is well defined, and it is obviously linear. The definition of the seminorm
on $\mathcal{L}^1$ means that in this notation, we have
$$\|f\|_1 = \|\gamma(f)\|_1.$$
Similarly, the integral, which is a continuous linear map
$$\int_X d\mu : \mathrm{St}(\mu, E) \to E$$
for the $L^1$-seminorm of $\mathrm{St}(\mu)$, extends in a natural way to $L^1(\mu)$. What
we have shown in Lemma 3.2 is that there is a way of lifting it to $\mathcal{L}^1$ in
such a way that for $f \in \mathcal{L}^1$ we have
$$\int_X f = \int_X \gamma(f).$$
The continuity of the integral with respect to our $L^1$-seminorm is implied
by the relation
$$\left|\int_X f\,d\mu\right| \leq \|f\|_1.$$
This relation is true for step maps $f$, and consequently holds for the
extension of our continuous linear map to the completion. Therefore, it
holds also for elements of $\mathcal{L}^1$ by Lemma 3.3 and the definition of the
seminorm $\|\ \|_1$ on $\mathcal{L}^1$. The preceding relation also shows that the
integral has norm $\leq 1$, as a linear map.
VI, §4. PROPERTIES OF THE INTEGRAL: FIRST PART
We note that if $f \in \mathcal{L}^1$ and $g$ differs from $f$ only on a set of measure 0,
then $g$ lies in $\mathcal{L}^1$, and the integrals of $f$, $g$ coincide, as well as their
$L^1$-seminorms.
We also note that if $f \in \mathcal{L}^1$, we can always redefine $f$ on a set of
measure 0, say by giving it constant value on such a set, so that our new
map is measurable. Indeed, if $\{\varphi_n\}$ is a sequence of step maps converging
to $f$ except on some set $Z$ of measure 0, we let $\psi_n$ be the same map as $\varphi_n$
outside $Z$, and define $\psi_n(x) = 0$, say, for $x \in Z$. Then $\psi_n$ is measurable,
and the sequence $\{\psi_n\}$ converges everywhere to a map $g$ which is equal
to $f$ except on $Z$. Furthermore, $g$ is measurable, by M7.
The properties of the integral which we obtained for step maps now
extend to the integral of elements of $\mathcal{L}^1$. We shall go through these
properties systematically once more. We start by repeating that
$$\int_X d\mu : \mathcal{L}^1(\mu, E) \to E$$
is linear.
We observe that if $f, g$ are in $\mathcal{L}^1(\mu)$ then $|f|, |g|$ are in $\mathcal{L}^1(\mu, \mathbf{R})$, and
consequently if $E = \mathbf{R}$, then
$$\sup(f, g) = \tfrac{1}{2}\bigl(f + g + |f - g|\bigr)$$
is in $\mathcal{L}^1$, and so is $\inf(f, g)$ for a similar reason, namely
$$\inf(f, g) = \tfrac{1}{2}\bigl(f + g - |f - g|\bigr).$$
The expression for the sup also shows that if $\{f_n\}$, $\{g_n\}$ are sequences
in $\mathcal{L}^1(\mu, \mathbf{R})$ which are $L^1$-convergent to functions $f$, $g$ respectively, then
$\sup(f_n, g_n)$ is $L^1$-convergent to $\sup(f, g)$.
If $f$ is a real function, then we can write
$$f = f^+ - f^-,$$
where $f^+ = \sup(f, 0)$ and $f^- = -\inf(f, 0)$. It follows that $f$ is in $\mathcal{L}^1$ if
and only if $f^+$ and $f^-$ are in $\mathcal{L}^1$. Such a decomposition is occasionally
useful in dealing with real valued maps.
For any measurable set $A$ and any $f \in \mathcal{L}^1(\mu)$ the map $f_A$ is also in $\mathcal{L}^1$.
(Recall that $f_A$ is the same as $f$ on $A$, and zero outside $A$.) Proof: If $\{\varphi_n\}$
is a sequence of step maps approximating $f$, then $\{\varphi_{nA}\}$ converges almost
everywhere to $f_A$, and is Cauchy because
$$\|\varphi_{nA} - \varphi_{mA}\|_1 \leq \|\varphi_n - \varphi_m\|_1.$$
Hence $\{\varphi_{nA}\}$ approximates $f_A$. From the linearity of the integral, we thus
obtain:
If $A$, $B$ are disjoint measurable sets, then
$$\int_{A \cup B} f = \int_A f + \int_B f. \tag{1}$$
This follows from the fact that $f_{A \cup B} = f_A + f_B$.
Over the reals, the integral is an increasing function of its variables.
This means: if $E = \mathbf{R}$ and $f \leq g$, then
$$\int_X f \leq \int_X g. \tag{2}$$
Furthermore, if $f \geq 0$ and $A \subset B$ are measurable, then
$$\int_A f \leq \int_B f. \tag{3}$$
Property 2 can be obtained from its positive alternate, namely
(2P) If $f \geq 0$, then $\int_X f \geq 0$.
This is clear since an approximating sequence of step functions $\{\varphi_n\}$ can
always be taken such that $\varphi_n \geq 0$, replacing $\varphi_n$ by $\sup(\varphi_n, 0)$ if necessary.
Property 2 follows by linearity, and Property 3 is then obvious.
Finally, the integral on $\mathcal{L}^1(\mu)$ satisfies the inequalities
$$\left|\int_A f\,d\mu\right| \leq \int_A |f|\,d\mu \leq \|f\|\,\mu(A), \tag{4}$$
where $\|\ \|$ is the sup norm. (We recall that $0 \cdot \infty = 0$.) The first
inequality is immediate, taking an approximating sequence $\{\varphi_n\}$ of step maps to $f$
and using continuity. For the second, when $\|f\|$ or $\mu(A)$ is
infinite, the inequality is clear, and when both are finite, we use (2).
The next properties are general properties, immediate from the conti-
nuity of the integral. We make the Banach space explicit here.
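Theorem 4.1. Let $\lambda : E \to F$ be a continuous linear map of Banach
spaces. Then $\lambda$ induces a continuous linear map
$$\mathcal{L}^1(\mu, E) \to \mathcal{L}^1(\mu, F)$$
by
$$f \mapsto \lambda \circ f,$$
and we have
$$\lambda\left(\int_X f\,d\mu\right) = \int_X \lambda \circ f\,d\mu.$$
This is obvious for step maps, and follows by continuity for $\mathcal{L}^1$.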

Theorem 4.2. Let $E$, $F$ be Banach spaces. Then we have a toplinear
isomorphism
$$\mathcal{L}^1(\mu, E \times F) \to \mathcal{L}^1(\mu, E) \times \mathcal{L}^1(\mu, F).$$
If $f : X \to E \times F$ is a map, with coordinate maps $f = (g, h)$ in $E$ and $F$
respectively, then $f \in \mathcal{L}^1$ if and only if $g$, $h$ are in $\mathcal{L}^1$, and then
$$\int_X f\,d\mu = \left(\int_X g\,d\mu,\ \int_X h\,d\mu\right).$$
The proof is a simple exercise which we leave to the reader. (The
projection is a continuous linear map on each factor!) It applies in
particular in $\mathbf{R}^n$, or in $\mathbf{C}$, and we see that a complex map is in $\mathcal{L}^1$ if and
only if its real and imaginary parts are in $\mathcal{L}^1$. Actually, this particular
case can be seen even more easily, for if we write a complex function
$$f = g + ih$$
where $g$, $h$ are real, we note that a sequence of complex step functions
approximates $f$ if and only if its real part approximates $g$ and its imaginary
part approximates $h$ (with our definition of approximation, that is
$L^1$-Cauchy, and convergence almost everywhere). Thus
$$\int_X f\,d\mu = \int_X g\,d\mu + i\int_X h\,d\mu$$
whenever $f$ is in $\mathcal{L}^1(\mu, \mathbf{C})$.

All the properties mentioned up to now are essentially routine, and
are listed for the sake of completeness. It is natural to make such a list
involving properties like linearity, monotonicity, sup, inf, behavior under
linear maps, and product mappings, which are the standard finite opera-
tions on maps and spaces.
We now turn to the limiting operations, and list the properties of the
integral under these operations, giving a large number of criteria for limit
mappings to be in $\mathcal{L}^1$.
VI, §5. PROPERTIES OF THE INTEGRAL: SECOND PART
We first generalize the basic and crucial Lemma 3.1 to arbitrary maps in
$\mathcal{L}^1$. This will be formulated as Theorem 5.2. We need a minor lemma
to use in the proof, which was automatically satisfied when we dealt with
step maps. We define a measurable set to be $\sigma$-finite if it is a countable
union of sets of finite measure.
Lemma 5.1. Let $f \in \mathcal{L}^1(\mu)$ be measurable. Let $c > 0$. Let $S_c$ be the set
of all $x \in X$ such that $|f(x)| \geq c$. Then $S_c$ has finite measure.
Furthermore, $f$ vanishes outside a $\sigma$-finite set.
Proof. Let $\{\varphi_n\}$ be an approximating sequence of step functions to $f$.
Taking a subsequence if necessary and using Lemma 3.1, we can assume
that there exists a set $Z$ of measure $< \epsilon$ such that the convergence of
$\{\varphi_n\}$ is uniform on the complement of $Z$. Hence for all sufficiently large
$n$, we have
$$|\varphi_n(x)| \geq \frac{c}{2} \quad \text{if } x \in S_c - Z.$$
Since $\varphi_n$ vanishes outside a set of finite measure, $S_c - Z$ has finite measure,
and this proves that $S_c$ has finite measure. Taking the values $c = 1/k$ for
$k = 1, 2, \ldots$ shows that $f$ vanishes outside a $\sigma$-finite set. Actually we can
see this even more easily, since each $\varphi_n$ vanishes outside a set of finite
measure, and $f$ is the limit almost everywhere of $\{\varphi_n\}$, whence $f$ vanishes
outside a countable union of sets of finite measure.
We see that Lemma 5.1 applies in particular to the characteristic
function of a measurable set: if it is in $\mathcal{L}^1$, then the measure of this set is
finite.
Theorem 5.2. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$ which is
$L^1$-convergent to an element $f$ in $\mathcal{L}^1$. Then there exists a subsequence
which converges to $f$ almost everywhere, and also such that given $\epsilon$,
there exists a set $Z$ of measure $< \epsilon$ such that the convergence is uniform
on the complement of $Z$.
Proof. Considering $f_n - f$ instead of $f_n$, we are reduced to proving
our theorem in the case $f = 0$. Selecting a subsequence, we may assume
without loss of generality that we have
$$\|f_n\|_1 < \frac{1}{2^{2n}}.$$
Also, changing the $f_n$ on a set of measure 0, we can assume that all $f_n$
are measurable. We proceed as in Lemma 3.1. Let $Y_n$ be the set of $x$
such that $|f_n(x)| \geq 1/2^n$. Then
$$\frac{1}{2^n}\,\mu(Y_n) \leq \int_{Y_n} |f_n| \leq \|f_n\|_1 \leq \frac{1}{2^{2n}},$$
whence
$$\mu(Y_n) \leq \frac{1}{2^n}.$$
Let $Z_n = Y_n \cup Y_{n+1} \cup \cdots$. Then $\mu(Z_n) \leq 1/2^{n-1}$. If $x \notin Z_n$, then for $k \geq n$
we have
$$|f_k(x)| < \frac{1}{2^k},$$
whence $\{f_k\}$ converges uniformly to 0 on the complement of $Z_n$. We let
$Z$ be the intersection of all $Z_n$. Then $Z$ has measure 0, and it is clear
that $\{f_n\}$ converges pointwise to 0 on the complement of $Z$. This proves our theorem.
Corollary 5.3. An element $f \in \mathcal{L}^1$ has seminorm $\|f\|_1 = 0$ if and only if
$f$ is equal to 0 almost everywhere.
Proof. Assume that $\|f\|_1 = 0$. Then the sequence $\{0, 0, \ldots\}$ converges
in $\mathcal{L}^1$ to $f$, and by Theorem 5.2, it converges pointwise almost everywhere
to $f$, so that $f$ is 0 almost everywhere. The converse is obvious.
Corollary 5.3 is a major result in our theory. We define two maps of
$X$ into $E$ to be equivalent if they differ only on a set of measure 0. We
see that the actual completion of the space of step maps under the
$L^1$-seminorm is the space of equivalence classes of functions in $\mathcal{L}^1$, under
the equivalence defined by the property of being equal almost everywhere.
In other words, the kernel of the map
$$\mathcal{L}^1 \to L^1(\mu)$$
is the space of maps $f$ which are 0 almost everywhere.
Corollary 5.4. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$ which converges
almost everywhere to a mapping $f$. Then $f$ is in $\mathcal{L}^1$, and is the $L^1$-limit
of $\{f_n\}$.
Proof. The sequence $\{f_n\}$ is $L^1$-convergent to some $g \in \mathcal{L}^1$, and by
Theorem 5.2, some subsequence converges almost everywhere to $g$. Since
this subsequence converges almost everywhere also to $f$, it follows that
$f = g$ almost everywhere. This proves our corollary.
Theorem 5.5 (Monotone Convergence Theorem). Let $\{f_n\}$ be an
increasing (resp. decreasing) sequence of real valued functions in $\mathcal{L}^1$ such
that the integrals
$$\int_X f_n\,d\mu$$
are bounded. Then $\{f_n\}$ is Cauchy, and is both $L^1$ and almost everywhere
convergent to some function $f \in \mathcal{L}^1$.
Proof. Suppose that we deal with the increasing case. Let
$$\alpha = \sup_k \int_X f_k.$$
Then for $n \geq m$ we have
$$\|f_n - f_m\|_1 = \int_X (f_n - f_m) = \int_X f_n - \int_X f_m \leq \alpha - \int_X f_m,$$
whence we see that the sequence of functions is Cauchy. By Theorem 5.2
a subsequence converges almost everywhere, and since the sequence $\{f_n\}$
is increasing, it follows that $\{f_n\}$ itself converges almost everywhere. That
convergence is in $L^1$-seminorm by Corollary 5.4. This proves our assertion
in the increasing case, and the decreasing case is similar, or follows
by considering the sequence $\{-f_n\}$.
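As an illustration of the Cauchy estimate used in this proof, here is a small exact computation on a hypothetical example which is not taken from the text. We assume disjoint sets $I_k$ with $\mu(I_k) = 2^{-k}$ (for instance $I_k = (2^{-k}, 2^{-k+1}]$ on the real line with length as measure) and the increasing sequence of step functions $f_n = \sum_{k=1}^{n} (3/2)^k \chi_{I_k}$. The pointwise limit is unbounded, but the integrals are bounded by $\alpha = 3$, and the sketch checks the inequality $\|f_n - f_m\|_1 \leq \alpha - \int f_m$ for $n \geq m$.

```python
from fractions import Fraction

def integral(n):
    """Integral of f_n = sum_{k=1}^n (3/2)^k * chi_{I_k}, where mu(I_k) = 2^-k."""
    return sum(Fraction(3, 2) ** k * Fraction(1, 2 ** k) for k in range(1, n + 1))

def l1_distance(n, m):
    """||f_n - f_m||_1 for n >= m: the difference lives on I_{m+1}, ..., I_n."""
    return sum(Fraction(3, 4) ** k for k in range(m + 1, n + 1))

# Supremum of the integrals: the geometric series sum_{k>=1} (3/4)^k = 3.
alpha = Fraction(3)

# Check the estimate from the proof for a range of indices n >= m.
estimate_holds = all(
    l1_distance(n, m) == integral(n) - integral(m) <= alpha - integral(m)
    for m in range(1, 20)
    for n in range(m, 30)
)
```

Since $\alpha - \int f_m = 3 \cdot (3/4)^m$ tends to 0, the estimate makes the Cauchy property explicit in this example.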
Corollary 5.6. If $\{f_n\}$ is a sequence of real valued functions in $\mathcal{L}^1$,
and if there exists a real-valued function $g \in \mathcal{L}^1$ such that $g \geq 0$ and
$|f_n| \leq g$ for all $n$, then $\sup f_n$ and $\inf f_n$ are in $\mathcal{L}^1$, and
$$\sup \int_X f_n \leq \int_X \sup f_n \quad \text{and} \quad \int_X \inf f_n \leq \inf \int_X f_n.$$
Proof. The functions
$$\sup(f_1, \ldots, f_n)$$
are in $\mathcal{L}^1$, and form an increasing sequence bounded by $g$. Hence they
converge almost everywhere and we can apply the theorem to conclude
the proof for the sup. The inf is dealt with similarly.
For the next corollary, we recall a definition. Let $\{f_n\}$ be a sequence
of real valued functions $\geq 0$. If
$$\lim_{k \to \infty} \inf_{n \geq k} f_n$$
exists, we call it the lim inf of the sequence $\{f_n\}$. It is clear that if $\{f_n\}$
converges pointwise, then its lim inf exists and is equal to the limit.
Actually, in the next corollary,
$$\lim_{k \to \infty} \inf_{n \geq k} f_n(x)$$
will exist for almost all $x$, and the resulting function, which we may
define arbitrarily on a set of measure zero, will be in $\mathcal{L}^1$. By abuse of
language, we still denote it by $\liminf f_n$.
Corollary 5.7 (Fatou's Lemma). Let $\{f_n\}$ be a sequence of real valued
functions $\geq 0$ in $\mathcal{L}^1$. Assume that
$$\liminf \|f_n\|_1$$
exists (so is a real number $\geq 0$). Then $\liminf f_n(x)$ exists for almost all
$x$, the function $\liminf f_n$ is in $\mathcal{L}^1$, and we have
$$\int_X \liminf f_n\,d\mu \leq \liminf \int_X f_n\,d\mu = \liminf \|f_n\|_1.$$
Proof. We apply the monotone convergence theorem twice, first to
the decreasing sequence $\{g_m\}$ given by
$$g_m = \inf(f_k, f_{k+1}, \ldots, f_{k+m}).$$
Since $\{g_m\}$ is a decreasing sequence, converging to $\inf_{n \geq k} f_n$, and since
$$\int_X g_m \leq \int_X f_{k+j} \quad \text{for } j = 1, \ldots, m,$$
we conclude from the monotone convergence theorem that
$$\int_X \inf_{n \geq k} f_n \leq \inf_{n \geq k} \int_X f_n \leq \lim_{k \to \infty} \inf_{n \geq k} \int_X f_n.$$
Let $h_k = \inf_{n \geq k} f_n$. Then $\{h_k\}$ is an increasing sequence for $k = 1, 2, \ldots$, and
we can apply the monotone convergence theorem to $h_k$. The limit $\lim_{k \to \infty} h_k$
is precisely $\liminf f_n$, and Fatou's lemma drops out as desired.
Note. Fatou's lemma is used most often in the simple case when $\{f_n\}$
is pointwise convergent almost everywhere, and when the $L^1$-seminorms
$\|f_n\|_1$ are bounded, thus ensuring that the pointwise $\lim f_n$ is in $\mathcal{L}^1$.
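The inequality in Fatou's lemma can be strict. The following sketch uses a hypothetical example which is not taken from the text: on $[0, 1)$ with Lebesgue measure, let $f_n$ be the characteristic function of $[0, 1/2)$ for even $n$ and of $[1/2, 1)$ for odd $n$. Each function is constant on the two halves, so it is represented below by its pair of values, and all names are illustrative.

```python
from fractions import Fraction

# Measures of the two atoms [0, 1/2) and [1/2, 1).
weights = (Fraction(1, 2), Fraction(1, 2))

def f(n):
    """Values of f_n on the two atoms: left half for even n, right half for odd n."""
    return (1, 0) if n % 2 == 0 else (0, 1)

def integral(values):
    """Integral of a function that is constant on each of the two atoms."""
    return sum(v * w for v, w in zip(values, weights))

def inf_tail(k, horizon=50):
    """Pointwise inf of f_n over k <= n < k + horizon.

    The sequence is 2-periodic in n, so this equals the inf over all n >= k.
    """
    tail = [f(n) for n in range(k, k + horizon)]
    return tuple(min(column) for column in zip(*tail))

# h_k = inf_{n >= k} f_n is identically 0, hence so is lim inf f_n.
h = [inf_tail(k) for k in range(1, 6)]
integral_of_liminf = integral(h[-1])

# Every f_n has integral 1/2, so the lim inf of the integrals is 1/2.
liminf_of_integrals = min(integral(f(n)) for n in range(1, 50))
```

Here the integral of $\liminf f_n$ is 0 while $\liminf \int f_n$ is $1/2$, so no equality can be expected in general.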
Theorem 5.8 (Dominated Convergence Theorem). Let $\{f_n\}$ be a
sequence of mappings in $\mathcal{L}^1(\mu)$. Assume that there exists some function
$g \in \mathcal{L}^1(\mu, \mathbf{R})$ such that $g \geq 0$ and $|f_n| \leq g$ for all $n$. Assume that $\{f_n\}$
converges almost everywhere to some map $f$. Then $f$ is in $\mathcal{L}^1$ and $\{f_n\}$
is $L^1$-convergent to $f$.
Proof. For each positive integer $k$, let
$$g_k = \sup_{m, n \geq k} |f_n - f_m|.$$
Then $\{g_k\}$ is a decreasing sequence of real valued functions, and since
$|f_n - f_m| \leq 2g$, it follows from Corollary 5.6 that each $g_k$ is in $\mathcal{L}^1$. By the
hypothesis, the sequence $\{g_k\}$ converges almost everywhere to 0, and by the
monotone convergence theorem it is then $L^1$-convergent to 0. Since
$\|f_n - f_m\|_1 \leq \|g_k\|_1$ for $m, n \geq k$, it follows that $\{f_n\}$ is actually a Cauchy
sequence, and we can apply Corollary 5.4 to conclude the proof.
We now refer for the first time since the definition of $\mathcal{L}^1$ to the
notion of $\mu$-measurability. The point is that we want to give criteria for
the limit of a sequence of maps to be in $\mathcal{L}^1$, and $\mu$-measurability is the
natural hypothesis here. We refer the reader to M11 and emphasize the
countability implications arising from a map being in $\mathcal{L}^1$, and hence
$\mu$-measurable (by definition).
Corollary 5.9. Let $f$ be $\mu$-measurable. Then $f$ is in $\mathcal{L}^1(\mu)$ if and only
if its absolute value $|f|$ is in $\mathcal{L}^1(\mu, \mathbf{R})$. More generally, assume that
there exists an element $g \in \mathcal{L}^1(\mu, \mathbf{R})$ such that $g \geq 0$ and such that
$|f| \leq g$. Then $f$ is in $\mathcal{L}^1(\mu)$.
Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging pointwise to
$f$. Without loss of generality we can assume that $g$ is measurable. (We
may have to change all $\varphi_n$, $f$, and the given $g$ on a set of measure 0.)
Define a map $h_n$ by
$$h_n(x) = \varphi_n(x) \quad \text{if } |\varphi_n(x)| \leq 2g(x),$$
$$h_n(x) = 0 \quad \text{if } |\varphi_n(x)| > 2g(x).$$
The set $S_n$ of all $x$ such that $2g(x) - |\varphi_n(x)| \geq 0$ is measurable, and it
follows that $h_n$ is in $\mathcal{L}^1(\mu)$ for each $n$. Furthermore $\{h_n\}$ converges
pointwise to $f$, and $|h_n| \leq 2g$. We can therefore apply the dominated
convergence theorem to conclude the proof.
Note. Corollary 5.9 explains the role of positivity in integration theory.
Corollary 5.10. Let $\{f_n\}$ be a sequence of maps in $\mathcal{L}^1(\mu)$ which
converges pointwise almost everywhere to $f$. If there exists $C \geq 0$ such that
$\|f_n\|_1 \leq C$ for all $n$, then $f$ is in $\mathcal{L}^1$ and $\|f\|_1 \leq C$.
Proof. All $f_n$ are $\mu$-measurable, and hence $f$ is $\mu$-measurable, by M12
of §1. By Corollary 5.9, it suffices to prove that $|f|$ is in $\mathcal{L}^1(\mu, \mathbf{R})$. But
$|f| = \lim |f_n|$, and Fatou's lemma applies to conclude the proof.
Remark. In Corollary 5.10, we don't assert of course that $\{f_n\}$ is
$L^1$-convergent to $f$. This is in general not true since for instance we can
find a sequence $\{f_n\}$ converging everywhere to 0 such that each $f_n$ has
$\|f_n\|_1 = 1$. (Take very thin tall vertical strips moving towards the $y$-axis.)
To get $L^1$-convergence, we must of course cut down such $f_n$ in a manner
similar to that used in Corollary 5.9.
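One concrete way to realize the thin tall strips of this remark is sketched below, under the assumption (ours, not the text's) that $f_n$ has height $n$ on the open interval $(0, 1/n)$ and is 0 elsewhere. The helper names are hypothetical. Every $f_n$ has $L^1$-seminorm 1, yet at each fixed point the values are 0 from some index on.

```python
import math
from fractions import Fraction

def f(n, x):
    """Strip of height n over the open interval (0, 1/n), zero elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def l1_norm(n):
    """L^1-seminorm of f_n: height times width of the strip."""
    return n * Fraction(1, n)

def last_nonzero_index(x):
    """Largest n with f(n, x) != 0, or 0 if there is none (x a Fraction)."""
    if x <= 0:
        return 0
    # f(n, x) != 0  <=>  x < 1/n  <=>  n < 1/x
    return max(math.ceil(1 / x) - 1, 0)

x = Fraction(1, 3)
n0 = last_nonzero_index(x)  # after this index, f_n(x) = 0
```

Since $\|f_n - 0\|_1 = 1$ for all $n$, the sequence converges to 0 everywhere but not in $L^1$; no integrable function $g$ can dominate all the $f_n$, which is why the dominated convergence theorem does not apply.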
Corollary 5.11. Let $f \in \mathcal{L}^1(\mu)$. Let $g$ be a bounded measurable function
on $X$ (so real or complex). Then $gf$ is in $\mathcal{L}^1(\mu)$.
Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging both $L^1$ and
almost everywhere to $f$. Using M8 of §1, let $\{\psi_n\}$ be a sequence of
simple functions converging pointwise to $g$. Then $\{\varphi_n \psi_n\}$ is a sequence of
step maps, and as $n \to \infty$, this sequence converges almost everywhere to
$fg$. Changing $f$ and $g$ on a set of measure 0 (e.g. giving them the value
0), we can assume that this convergence is pointwise everywhere. If $C$ is
a bound for $g$, i.e. $|g(x)| \leq C$ for all $x$, then $|fg| \leq C|f|$. We can now
apply Corollary 5.9 to conclude the proof that $fg$ is in $\mathcal{L}^1$. We can also
reproduce the proof of Corollary 5.9, i.e. after suitable adjustment we
may suppose that
$$|\varphi_n| \leq 2|f|$$
for all $n$, whence $|\varphi_n \psi_n| \leq 2C|f|$ for all $n$, and then apply the dominated
convergence theorem directly.
Corollary 5.12. Let $E \times F \to G$ be a continuous bilinear map of Banach
spaces into another. Let $f \in \mathcal{L}^1(\mu, E)$ and let $g$ be a bounded
$\mu$-measurable map of $X$ into $F$. Then $fg \in \mathcal{L}^1(\mu, G)$.
Proof. There is nothing to change in the preceding proof.
Corollary 5.13. Let $\{f_n\}$ be a sequence of maps in $\mathcal{L}^1$ such that
$$\sum_{n=1}^{\infty} \int_X |f_n|\,d\mu$$
converges. Then the series
$$f(x) = \sum_{n=1}^{\infty} f_n(x)$$
converges almost everywhere, the map $f$ is in $\mathcal{L}^1$, and
$$\int_X f\,d\mu = \sum_{n=1}^{\infty} \int_X f_n\,d\mu.$$
Proof. Immediate from the dominated convergence theorem, considering
the partial sums, and using the function
$$g(x) = \lim_{n} \sum_{k=1}^{n} |f_k(x)|.$$
Example. It is often useful to consider sums as in Corollary 5.13 in
the following context. Let $\{A_n\}$ be a sequence of disjoint measurable sets
whose union is equal to $X$. For each $n$ let $f_n$ be integrable over $A_n$, and
define $f_n$ to be 0 outside $A_n$, so that $f_n$ is then defined over all of $X$. Let
$$f = \sum_{n=1}^{\infty} f_n.$$
(Conversely, if $f$ is given on all of $X$, we could let $f_n = f_{A_n} = f\chi_{A_n}$ where
$\chi_{A_n}$ is the characteristic function of $A_n$.) If
$$\sum_{n=1}^{\infty} \int_{A_n} |f_n|\,d\mu$$
converges, then it follows that $f$ is in $\mathcal{L}^1$ over all of $X$.
Remark. In our discussion of measurability, we have already pointed
out that a pointwise limit of step maps takes its values in a separable set,
i.e. one having a countable dense subset. Actually, taking the space generated
by the values of the step maps in a sequence converging to $f$, we see that
this space, and its closure, have a countable dense subset. This applies
when $f$ is in $\mathcal{L}^1$ since we can change $f$ on a set of measure 0, say giving
$f$ the value 0 on such a set, so that $f$ is a pointwise limit of step maps
on the complement of this set. Furthermore, we also recall that a limit
of step maps vanishes outside a countable union of sets of finite measure,
and this also applies to an element of $\mathcal{L}^1$.
Corollary 5.14. Let $f$ be in $\mathcal{L}^1$. Given $\epsilon$, there exists a set of finite
measure $A$ such that
$$\|f - f_A\|_1 < \epsilon.$$
Proof. As we have remarked, we can change $f$ on a set of measure 0
such that $f$ vanishes outside a countable union of sets of finite measure,
say $\{A_n\}$. Let
$$B_n = A_1 \cup \cdots \cup A_n.$$
The sets $B_n$ are increasing, and without loss of generality we may assume
that $X = \bigcup B_n$. Then the functions $|f - f_{B_n}|$ decrease pointwise to 0.
We let $A = B_n$ for $n$ large. Our corollary follows from the monotone
convergence theorem.
The next theorem has a probabilistic interpretation as follows. If $A$ is
a set of finite measure $\neq 0$, we may view
$$\frac{1}{\mu(A)} \int_A f\,d\mu$$
as the average of $f$ over $A$. The theorem will assert that if the average of
$f$ over all such $A$ lies in some closed set $S$, then in fact the values $f(x)$
must lie in $S$ for almost all $x$. We call this the averaging theorem.
Theorem 5.15. Let $f \in \mathcal{L}^1(\mu, E)$. Let $S$ be a closed subset of $E$ and
assume that for all measurable sets $A$ of finite measure $\neq 0$ we have
$$\frac{1}{\mu(A)} \int_A f\,d\mu \in S.$$
Suppose $0 \in S$ or $X$ is $\sigma$-finite. Then $f(x) \in S$ for almost all $x$.
Proof. Changing $f$ on a set of measure 0, we may assume without
loss of generality that $f$ vanishes outside a set which is a countable union
of sets of finite measure, and that $E$ has a countable dense subset. It is
then clear that it will suffice to prove our theorem under the additional
assumption that $\mu(X) < \infty$, which we now make. Let $v \in E$ and $v \notin S$.
Let $B_r(v)$ be an open ball of radius $r$ centered at $v$ and not intersecting $S$.
Let $A$ be the set of all $x \in X$ such that $f(x) \in B_r(v)$. We prove that $A$ has
measure 0. Indeed, if $\mu(A) > 0$ we have
$$\left|\frac{1}{\mu(A)} \int_A f\,d\mu - v\right| = \left|\frac{1}{\mu(A)} \int_A (f - v)\,d\mu\right| \leq \frac{1}{\mu(A)} \int_A |f - v|\,d\mu < r,$$
which is a contradiction. Hence $\mu(A) = 0$. The theorem follows using the
countability assumption on $E$, and using a countable dense set in the
complement of $S$, together with open balls of rational radii around the
elements of this set, which form a base for the topology.
Corollary 5.16. Let $f \in \mathcal{L}^1(\mu)$ and assume that
$$\int_A f\,d\mu = 0$$
for every measurable set $A$ of finite measure. Then $f$ is equal to 0
almost everywhere.
Proof. We take $S$ to consist of 0 alone, and apply the theorem.
Corollary 5.17. Let $f \in \mathcal{L}^1(\mu)$. For each step function $g$ the map $fg$ is
in $\mathcal{L}^1(\mu)$, and if
$$\int_X fg\,d\mu = 0$$
for all step functions $g$, then $f(x) = 0$ for almost all $x$.
Proof. Apply Corollary 5.16 to characteristic functions $\chi_A$.
Corollary 5.18. Let $f \in \mathcal{L}^1(\mu)$. Let $b \geq 0$. If
$$\left|\int_A f\,d\mu\right| \leq b\,\mu(A)$$
for all sets $A$ of finite measure, then $|f(x)| \leq b$ for almost all $x$.
Proof. Let $S_n$ be the subset of $E$ consisting of those elements $v$ such
that $|v| \leq b + 1/n$ and apply the theorem. Then take the union of the
exceptional sets of measure 0 for $n = 1, 2, \ldots$.
The next corollary is included for later applications. The reader inter-
ested only in the case of complex or real functions may omit it.
Corollary 5.19. Let $E$ be a Hilbert space and $f \in \mathcal{L}^1(\mu, E)$. If
$$\int_X \langle f, g \rangle\,d\mu = 0$$
for all step maps $g$, then $f(x) = 0$ for almost all $x$.
Proof. The proof is really just like that of Corollary 5.17. First we
may assume that the image of $f$ is contained in a separable Hilbert
subspace. Let $e$ be a unit vector. For any measurable set $A$ of finite
measure, the step map $e\chi_A$ having value $e$ in $A$ and 0 outside $A$ is
bounded measurable. Let us denote by $f_e$ the Fourier coefficient of $f$
along $e$ so that $f_e$ is a function. We have
$$\int_A f_e\,d\mu = \int_X \langle f, e\chi_A \rangle\,d\mu = 0.$$
This being true for all $A$ it follows that $f_e$ is equal to 0 almost everywhere.
Since there is a countable Hilbert basis in our Hilbert space, it
follows that $f$ is 0 almost everywhere.
Corollary 5.20. Let $E$ be a Hilbert space and $f \in \mathcal{L}^1(\mu, E)$. For each
unit vector $e \in E$, let $f_e$ be the component of $f$ along $e$. Let $b \geq 0$.
Assume that for each unit vector $e$ and each set of finite measure $A$ we
have
$$\left|\int_A f_e\,d\mu\right| \leq b\,\mu(A).$$
Then $|f(x)| \leq b$ for almost all $x$.
Proof. We may assume that $E$ is separable as in Theorem 5.15, and
that $\mu(X) < \infty$. Let $v \in E$ and $|v| > b$. Let $B_r(v)$ be an open ball of
radius $r$ centered at $v$ not intersecting $B_b(0)$. If $A$ is the set of all $x \in X$
such that $f(x) \in B_r(v)$, we take $e$ to be a unit vector in the direction of $v$.
Let $c = |v|$. If $x \in A$, then $|f_e(x) - c| < r$ so that $f_e(x) \in B_r(c)$. By
Corollary 5.18 it follows that $A$ has measure 0. Our Corollary 5.20 follows at
once.
VI, §6. APPROXIMATIONS
We shall analyze Theorem 5.2 more closely, so as to fit certain situations
which arise in practice. Let us look at a special case, the real line. The
most natural definition of any integral on $\mathbf{R}$ is to start with step functions
defined on bounded intervals (open, closed, or half open or closed),
and define the integral for these. However, the sets which are finite
unions of bounded intervals do not form a $\sigma$-algebra, only an algebra.
Thus we are faced with two problems: first, extend the measure (length)
function on bounded intervals to a measure on the $\sigma$-algebra generated by
the finite intervals, and second, show that the step functions taken with
respect to finite intervals are still $L^1$-dense in the $\mathcal{L}^1$-completion. The
problem of extending the measure to a $\sigma$-algebra is dealt with in §7.
Here, we settle the other question, and a countability condition arises
naturally.
Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$, and assume that $\mathcal{A}$ consists of sets of
finite measure. We shall say that $X$ is $\sigma$-finite with respect to $\mathcal{A}$ if $X$ is a
countable union of elements of $\mathcal{A}$. Taking the usual inductive
complementation, we see that if $X$ is $\sigma$-finite with respect to $\mathcal{A}$, then in fact,
there is a sequence $\{A_n\}$ of disjoint elements of $\mathcal{A}$ such that
$$X = \bigcup_{n=1}^{\infty} A_n.$$
We recall that a step map $f$ with respect to $\mathcal{A}$ is a map which is equal
to 0 outside some element $A$ of $\mathcal{A}$, and such that there is a partition
$\{A_1, \ldots, A_r\}$ of $A$ consisting of elements of $\mathcal{A}$, such that $f$ is step with
respect to this partition. We shall denote the space of step maps with
respect to $\mathcal{A}$ by $\mathrm{St}(\mathcal{A})$. We are interested in giving conditions under
which the closure of $\mathrm{St}(\mathcal{A})$ in $\mathcal{L}^1(\mu)$ is equal to $\mathcal{L}^1$. The next two
lemmas lead to the theorem giving such a criterion. We first consider
those measurable subsets contained in some element $A$ of $\mathcal{A}$. Thus we
denote by $\mathcal{A}_A$ the algebra induced by $\mathcal{A}$ on $A$, i.e. the algebra of all
elements of $\mathcal{A}$ contained in $A$. We let $\mathrm{St}(\mathcal{A}_A)$ be the vector space of step
maps with respect to $\mathcal{A}_A$.
Remark. Let $Y$ be a measurable subset of an element $A$ of $\mathcal{A}$ and $\chi_Y$
its characteristic function. Let $\varphi$ be a step function such that
$$\|\chi_Y - \varphi\|_1 < \epsilon.$$
If we let $\varphi_1 = \inf(\varphi, 1)$, then $|\chi_Y - \varphi_1| \leq |\chi_Y - \varphi|$, and hence
$$\|\chi_Y - \varphi_1\|_1 < \epsilon.$$
We have a similar situation taking $\sup(\varphi, 0)$. We are interested in those
$Y$ such that $\chi_Y$ lies in the closure of $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$. Our remark shows that
in determining those $Y$, we may restrict our attention to those step functions
$\varphi$ such that
$$0 \leq \varphi \leq 1.$$
For what follows, we also observe that $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$ is closed under the
operations of sup and inf.
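Lemma 6.1. Let $A$ be an element of $\mathcal{A}$. Let $\mathcal{N}_A$ be the collection
of measurable subsets $Y$ of $A$ whose characteristic function $\chi_Y$ lies in
the $L^1$-closure of $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$, i.e. such that given $\epsilon$, there exists a step
function $\varphi \in \mathrm{St}(\mathcal{A}_A, \mathbf{R})$ satisfying
$$\|\chi_Y - \varphi\|_1 < \epsilon.$$
Then $\mathcal{N}_A$ is a $\sigma$-algebra in $A$.
Proof. First we show that $\mathcal{N}_A$ is an algebra. If $Y, Z \in \mathcal{N}_A$, then
$$\sup(\chi_Y, \chi_Z) = \chi_{Y \cup Z} \quad \text{and} \quad \inf(\chi_Y, \chi_Z) = \chi_{Y \cap Z}$$
lie in the $L^1$-closure of $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$, so $Y \cup Z$ and $Y \cap Z$ are in $\mathcal{N}_A$.
Also, $\chi_A - \chi_Y = \chi_{A - Y}$ lies in this closure, so $A - Y$ is in $\mathcal{N}_A$. Hence $\mathcal{N}_A$ is an algebra. To
show that it is a $\sigma$-algebra, it suffices to show that if $\{Y_n\}$ is a sequence
in $\mathcal{N}_A$ of disjoint elements, then $\bigcup Y_n$ is in $\mathcal{N}_A$. (If we have an arbitrary
sequence in $\mathcal{N}_A$, we can always adjust it by taking relative complementations
to yield another sequence of disjoint elements in $\mathcal{N}_A$, having the
same union.) Thus let $\{Y_n\}$ be a disjoint sequence in $\mathcal{N}_A$, and let $\{\varphi_n\}$ be
step functions in $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$ such that
$$\|\chi_{Y_n} - \varphi_n\|_1 < \frac{\epsilon}{2^n}.$$
Let
$$Y = \bigcup_{n=1}^{\infty} Y_n.$$
Then
$$\Bigl\|\chi_Y - \sum_{k=1}^{n} \varphi_k\Bigr\|_1 \leq \Bigl\|\chi_Y - \sum_{k=1}^{n} \chi_{Y_k}\Bigr\|_1 + \Bigl\|\sum_{k=1}^{n} \chi_{Y_k} - \sum_{k=1}^{n} \varphi_k\Bigr\|_1.$$
We take $n$ so large that the first term on the right is $< \epsilon$. The second
term on the right is estimated by
$$\sum_{k=1}^{n} \|\chi_{Y_k} - \varphi_k\|_1 < \epsilon.$$
This proves that $\mathcal{N}_A$ is a $\sigma$-algebra in $A$.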
The next lemma pertains to a completely general situation.
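Lemma 6.2. Let $\{A_i\}_{i \in I}$ be a family of sets whose union is equal to $X$.
For each $i$, let $\mathcal{M}_i$ be a $\sigma$-algebra of subsets of $A_i$. Let $\mathcal{N}$ be the
collection of subsets $Y$ of $X$ such that $Y \cap A_i \in \mathcal{M}_i$ for all $i$. Then $\mathcal{N}$ is
a $\sigma$-algebra in $X$.
Proof. Let $Y \in \mathcal{N}$. Then $\mathcal{C}Y \cap A_i = A_i - Y = A_i - (Y \cap A_i)$. Hence $\mathcal{C}Y \in \mathcal{N}$. Let $Y$,
$Z \in \mathcal{N}$. Then
$$(Y \cap Z) \cap A_i = (Y \cap A_i) \cap (Z \cap A_i),$$
whence $Y \cap Z$ is in $\mathcal{N}$. Let $\{Y_n\}$ be a sequence of subsets of $X$ in $\mathcal{N}$.
Then
$$\Bigl(\bigcup Y_n\Bigr) \cap A_i = \bigcup (Y_n \cap A_i),$$
whence $\bigcup Y_n$ is in $\mathcal{N}$. This proves our lemma.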
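The construction in this lemma can be checked by brute force on a small finite set. The sketch below uses hypothetical data, not taken from the text: $X = \{0, 1, 2, 3, 4\}$ is covered by two overlapping sets $A_1$, $A_2$, each carrying a $\sigma$-algebra $\mathcal{M}_i$ of subsets, and $\mathcal{N}$ is computed as the collection of all $Y \subset X$ with $Y \cap A_i \in \mathcal{M}_i$ for both $i$. On a finite set, closure under finite unions is the same as closure under countable unions.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

X = frozenset(range(5))

# Hypothetical covering family A_i, with a sigma-algebra M_i of subsets of each A_i.
A = {1: frozenset({0, 1, 2}), 2: frozenset({2, 3, 4})}
M = {
    1: {frozenset(), frozenset({0}), frozenset({1, 2}), A[1]},
    2: {frozenset(), frozenset({2, 3}), frozenset({4}), A[2]},
}

# N = all subsets Y of X whose trace on every A_i lies in M_i.
N = {Y for Y in powerset(X) if all((Y & A[i]) in M[i] for i in A)}

closed_under_complement = all((X - Y) in N for Y in N)
closed_under_union = all((Y | Z) in N for Y in N for Z in N)
closed_under_intersection = all((Y & Z) in N for Y in N for Z in N)
```

In this example $\mathcal{N}$ turns out to be generated by the three atoms $\{0\}$, $\{4\}$ and $\{1, 2, 3\}$: the overlap point 2 forces the blocks $\{1, 2\}$ and $\{2, 3\}$ to be glued together.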
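Theorem 6.3. Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$, consisting of sets of finite
measure, generating $\mathcal{M}$. Assume that $X$ is $\sigma$-finite with respect to $\mathcal{A}$.
Then the space $\mathrm{St}(\mathcal{A})$ of step mappings with respect to $\mathcal{A}$ is dense in
$\mathcal{L}^1(\mu, E)$. Furthermore, if $\{A_n\}$ is a sequence in $\mathcal{A}$ whose union is $X$,
then for all $Y \in \mathcal{M}$, $\chi_{Y \cap A_n}$ lies in the $L^1$-closure of $\mathrm{St}(\mathcal{A}_{A_n}, \mathbf{R})$, for all $n$.
Proof. We prove the second assertion first. By Lemma 6.1, we have a
$\sigma$-algebra $\mathcal{N}_{A_n}$ in each $A_n$, and we apply Lemma 6.2 to obtain a $\sigma$-algebra
$\mathcal{N}$ in $X$. Every element $A$ of $\mathcal{A}$ is such
that $A \cap A_n \in \mathcal{N}_{A_n}$, and since $\mathcal{A}$ generates $\mathcal{M}$, we conclude that $\mathcal{N} = \mathcal{M}$.
Next, we prove a special case of our first statement:
If $Y$ is a measurable set of finite measure, given $\epsilon$ there exists a step
function $\varphi$ with respect to $\mathcal{A}$ such that
$$\|\chi_Y - \varphi\|_1 < \epsilon.$$
Taking relative complements, we may assume the $A_n$ are disjoint. By
Lemmas 6.1 and 6.2, for each $n$ there exists a step function $\varphi_n$ with
respect to $\mathcal{A}$ such that
$$\|\chi_{Y \cap A_n} - \varphi_n\|_1 < \frac{\epsilon}{2^n}.$$
Since $Y$ is the union of all sets $Y \cap A_n$, we can find some $n$ such that
$$\mu(Y) - \sum_{k=1}^{n} \mu(Y \cap A_k) < \epsilon,$$
or in other words such that
$$\Bigl\|\chi_Y - \sum_{k=1}^{n} \chi_{Y \cap A_k}\Bigr\|_1 < \epsilon.$$
It follows that
$$\Bigl\|\chi_Y - \sum_{k=1}^{n} \varphi_k\Bigr\|_1 \leq \Bigl\|\chi_Y - \sum_{k=1}^{n} \chi_{Y \cap A_k}\Bigr\|_1 + \Bigl\|\sum_{k=1}^{n} \chi_{Y \cap A_k} - \sum_{k=1}^{n} \varphi_k\Bigr\|_1 < 2\epsilon.$$
This proves our special case.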
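The general case is now obvious: a step map $f \neq 0$ with respect to all
sets of finite measure is a finite linear combination
$$f = \sum_{j=1}^{m} v_j \chi_{Y_j}$$
with $v_j \in E$, $v_j \neq 0$ for all $j$, and such that the sets $Y_j$ have finite measure.
By definition, the space of these maps is $L^1$-dense in $\mathcal{L}^1(\mu, E)$. For each
$\chi_{Y_j}$ we can find a step function $\varphi_j$ with respect to $\mathcal{A}$ such that
$$\|\chi_{Y_j} - \varphi_j\|_1 < \frac{\epsilon}{m|v_j|}.$$
It follows immediately that
$$\Bigl\|f - \sum_{j=1}^{m} v_j \varphi_j\Bigr\|_1 < \epsilon.$$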
We can now strengthen the corollaries of Theorem 5.15.
Corollary 6.4. Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$, consisting of sets of finite
measure, generating $\mathcal{M}$. Assume that $X$ is $\sigma$-finite with respect to $\mathcal{A}$.
Let $f \in \mathcal{L}^1(\mu)$. If
$$\int_A f\,d\mu = 0$$
for all $A \in \mathcal{A}$, then $f$ is equal to 0 almost everywhere.
Proof. Our assumption implies by linearity that
$$\int_X f\varphi\,d\mu = 0$$
for all real step functions $\varphi$ with respect to $\mathcal{A}$. Let $Y$ be a set of finite
measure. By Theorem 6.3 and Lemma 3.1, we can find a sequence of
step functions $\{\varphi_n\}$ with respect to $\mathcal{A}$ which converges almost everywhere
to $\chi_Y$ and is also $L^1$-convergent to $\chi_Y$. Taking $\inf(\varphi_n, 1)$ and $\sup(\varphi_n, 0)$
if necessary, we may assume without loss of generality that $0 \leq \varphi_n \leq 1$.
Then
$$|f\varphi_n| \leq |f|,$$
and $\{f\varphi_n\}$ converges almost everywhere to $f\chi_Y$. By the dominated
convergence theorem it follows that
$$0 = \int_X f\varphi_n \quad \text{converges to} \quad \int_X f\chi_Y = \int_Y f.$$
This proves that
$$\int_Y f = 0$$
for all sets of finite measure $Y$. By the $\sigma$-finiteness, every measurable set
is a countable union of sets of finite measure. Since $f\chi_Y = 0$ almost
everywhere by Theorem 5.15, we conclude that $f = 0$ almost everywhere,
thus proving our corollary.
Example. Take $E = \mathbf{R}$ and let $X = \mathbf{R}$ also. Let $\mathcal{A}$ be the algebra
consisting of sets which are finite unions of bounded intervals (obviously
an algebra). We shall show in §9 that there is a unique measure on the
$\sigma$-algebra generated by $\mathcal{A}$ such that the measure of an interval is its
length. Thus we can develop integration theory on the reals, and we can
apply the corollary to Theorem 6.3. Furthermore, the infinitely differentiable
functions which vanish outside a compact set are dense in $\mathcal{L}^1$. In
fact, given a characteristic function $\chi_Y$ of a finite interval, we can find a
$C^\infty$ function $\varphi$ which is equal to $\chi_Y$ except in a given $\epsilon$-neighborhood of
the two end points of the interval, and $0 \le \varphi \le 1$. Thus as an application
of our corollary, we see that if
$$\int_{\mathbf{R}} f \varphi \, d\mu = 0$$
for all $C^\infty$ functions $\varphi$ vanishing outside some compact set, then $f$ is equal
to 0 almost everywhere. We shall state this result formally later in $\mathbf{R}^n$.
Remark. We observe that the domain of validity of Theorem 6.3 is
actually greater than it seems, i.e. the hypothesis of $\sigma$-finiteness is to
some extent superfluous. Indeed, every map in $\mathcal{L}^1(\mu)$, being a limit almost
everywhere of step maps, must vanish outside some set which is a
denumerable union of sets of finite measure. In determining a dense
subset of $\mathcal{L}^1$, we are merely attempting to approximate each individual
map $f$. Thus the hypothesis under which Theorem 6.3 holds can actually
be weakened to the following:
Every set of finite measure is contained in a countable union of sets of
$\mathcal{A}$.
All the applications I know of actually occur in the $\sigma$-finite case as we
defined $\sigma$-finite, but one should keep in mind that in case of need, one
could take the preceding property as the definition of $\sigma$-finiteness with
respect to $\mathcal{A}$, and still end up with the corresponding result. This remark
is the analogue with respect to the domain set of the remark preceding
Corollary 5.14, with respect to the image space. We see that the $\mathcal{L}^1$
theory has a built-in countability property for each one of its elements.
VI, §7. EXTENSION OF POSITIVE MEASURES FROM ALGEBRAS TO $\sigma$-ALGEBRAS
In the previous sections, we started with a positive measure on a
$\sigma$-algebra, and then defined the integral for certain limits of step maps. We
now want to show how we can obtain such measures starting with fewer
data.
We recall that an algebra $\mathcal{A}$ of subsets of $X$ is a collection of subsets
containing the empty set, such that $\mathcal{A}$ is closed under finite unions and
intersections, and such that if $A, B \in \mathcal{A}$, then $A - B \in \mathcal{A}$.
By a positive measure on an algebra $\mathcal{A}$, we mean a map
$$\mu : \mathcal{A} \to [0, \infty]$$
such that $\mu(\emptyset) = 0$, and such that $\mu$ is countably additive on $\mathcal{A}$. This
means that if $\{A_n\}$ is a sequence of disjoint elements of $\mathcal{A}$, and if their
union $\bigcup A_n$ is also in $\mathcal{A}$, then
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} \mu(A_n).$$
Under a suitable countability assumption, we shall prove that a measure
on an algebra can be extended uniquely to a measure on the
$\sigma$-algebra generated by $\mathcal{A}$. Observe that countable additivity on $\mathcal{A}$ is
necessary for this to be possible, i.e. we could not merely assume that $\mu$
is finitely additive on $\mathcal{A}$. For instance, consider a denumerable set $X =
\{x_n\}$, and let $\mathcal{A}$ be the algebra of all subsets. Let $\{x_n\}$ have measure $1/2^n$,
and let a finite set have measure equal to the sum of the measures of its
elements. Let an infinite set have infinite measure. Then we have defined
a finitely additive function which is not a measure.
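The failure of countable additivity in this example can be checked in one line. The following is only a sketch of that check, assuming the points are indexed from $n = 1$ and writing $\mu$ for the set function just described:

```latex
% X is the disjoint union of the singletons {x_n}, n = 1, 2, ...
\[
  \sum_{n=1}^{\infty} \mu(\{x_n\}) \;=\; \sum_{n=1}^{\infty} \frac{1}{2^n} \;=\; 1,
  \qquad \text{whereas} \qquad
  \mu\Bigl( \bigcup_{n=1}^{\infty} \{x_n\} \Bigr) \;=\; \mu(X) \;=\; \infty .
\]
% Finite additivity does hold. For disjoint A and B:
%   if both are finite,      \mu(A \cup B) = \sum_{x \in A \cup B} \mu(\{x\}) = \mu(A) + \mu(B);
%   if one of them is infinite, both sides are equal to \infty.
```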
Theorem 7.1 (Hahn). Let $\mu$ be a positive measure on an algebra $\mathcal{A}$ in
$X$, and assume that $X$ can be expressed as a denumerable union of sets
of $\mathcal{A}$. Then $\mu$ can be extended to a positive measure on the $\sigma$-algebra
$\mathcal{M}$ generated by $\mathcal{A}$, so that for $Y \in \mathcal{M}$,
$$\mu(Y) = \inf \sum_{n=1}^{\infty} \mu(A_n),$$
the inf being taken over all sequences $\{A_n\}$ in $\mathcal{A}$ whose union contains
$Y$. If $X$ can be expressed as a countable union of sets of finite measure
in $\mathcal{A}$, then there exists a unique extension of $\mu$ to a positive measure on
$\mathcal{M}$.
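Proof. The proof will proceed in two steps and needs the notion of an
outer measure.
Let $\mathcal{N}$ be a $\sigma$-algebra in a set $X$. An outer measure $\mu$ on $\mathcal{N}$ is a
function $\mu : \mathcal{N} \to [0, \infty]$ satisfying the conditions:
OM 1. We have $\mu(\emptyset) = 0$.
OM 2. If $A, B \in \mathcal{N}$ and $A \subset B$, then $\mu(A) \le \mu(B)$.
OM 3. If $\{A_n\}$ is a sequence of elements of $\mathcal{N}$, then
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \mu(A_n).$$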
Lemma 7.2. Let $\mu$ be a positive measure on an algebra $\mathcal{A}$ in $X$, and
assume that $X$ can be expressed as a denumerable union of sets of $\mathcal{A}$.
On the $\sigma$-algebra of all subsets of $X$, define
$$\mu^*(Y) = \inf \sum_{n=1}^{\infty} \mu(A_n),$$
the inf being taken over all sequences $\{A_n\}$ of elements of $\mathcal{A}$ whose
union contains $Y$. Then $\mu^*$ is an outer measure which extends $\mu$.
Proof. We first show that if $A \in \mathcal{A}$, then $\mu^*(A) = \mu(A)$, in other words
$\mu^*$ extends $\mu$. Since
$$A = A \cup \emptyset \cup \emptyset \cup \cdots$$
we see that $\mu^*(A) \le \mu(A)$. Conversely, given $\epsilon$, let $\{A_n\}$ be a sequence of
elements of $\mathcal{A}$ whose union covers $A$, and such that
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu^*(A) + \epsilon.$$
Since $A = \bigcup (A_n \cap A)$ it follows that
$$\mu(A) \le \sum_{n=1}^{\infty} \mu(A_n \cap A) \le \sum_{n=1}^{\infty} \mu(A_n) \le \mu^*(A) + \epsilon.$$
This proves that $\mu(A) \le \mu^*(A)$, whence $\mu(A) = \mu^*(A)$, as desired.

From now on we omit the $*$ on $\mu$ since $\mu^*$ and $\mu$ take the same values
on $\mathcal{A}$. We show that our extended $\mu$ is an outer measure. The first two
properties OM 1 and OM 2 are obvious. As for OM 3, it is clearly an
$\epsilon/2^n$ proof: let $\{Y_j\}$ be a sequence of subsets of $X$ which we may assume
have finite measure. Given $\epsilon$, for each $j$, let $\{A_n^{(j)}\}$ ($n = 1, 2, \ldots$) be a
sequence in $\mathcal{A}$ whose union covers $Y_j$ and such that
$$\sum_{n=1}^{\infty} \mu(A_n^{(j)}) \le \mu(Y_j) + \frac{\epsilon}{2^j}.$$
Then the denumerable family $\{A_n^{(j)}\}$ (for $j$, $n$ positive integers) covers
$\bigcup Y_j$, and we have
$$\mu\Big( \bigcup_{j} Y_j \Big) \le \sum_{n, j} \mu(A_n^{(j)}) \le \sum_{j=1}^{\infty} \mu(Y_j) + \epsilon.$$
This proves our lemma.
Let $\mu$ be an outer measure on the set of all subsets of $X$. We say that
a subset $A$ of $X$ is $\mu$-measurable if for all subsets $Z$ of $X$ we have
$$\mu(Z) = \mu(Z \cap A) + \mu(Z \cap \mathcal{C}A).$$
Lemma 7.3. Let $\mu$ be an outer measure on the subsets of $X$. Let $\mathcal{S}$ be
the collection of all subsets of $X$ which are $\mu$-measurable. Then $\mathcal{S}$ is a
$\sigma$-algebra, and $\mu$ is a positive measure on $\mathcal{S}$.
Proof. Since we deal only with $\mu$, we omit the prefix $\mu$-. We first
prove that $\mathcal{S}$ is an algebra. It obviously contains the empty set, and if
$A$ is measurable, it is clear that $\mathcal{C}A$ is measurable (the definition of
measurable is symmetric in $A$ and $\mathcal{C}A$). Let $A$, $B$ be measurable. We
show that $A \cap B$ is measurable. Let $Z$ be any subset of $X$. Since $B$ is
measurable, we get
$$\mu(Z \cap A \cap B) + \mu(Z \cap A \cap \mathcal{C}B) = \mu(Z \cap A).$$
Add $\mu(Z \cap \mathcal{C}A)$ to both sides. On the right we obtain $\mu(Z)$ because $A$ is
measurable. To prove that $A \cap B$ is measurable, it will suffice to prove
that
$$\mu(Z \cap \mathcal{C}(A \cap B)) = \mu(Z \cap A \cap \mathcal{C}B) + \mu(Z \cap \mathcal{C}A).$$
But this is seen by using the fact that $A$ is measurable, and writing
$$Z \cap A \cap \mathcal{C}B = Z \cap \mathcal{C}(A \cap B) \cap A \quad \text{and} \quad Z \cap \mathcal{C}A = Z \cap \mathcal{C}(A \cap B) \cap \mathcal{C}A.$$
Thus $A \cap B$ is measurable.
Next we observe that if $A_1, \ldots, A_n$ are disjoint measurable sets, and $Z$
is arbitrary, then
$$\mu(Z \cap (A_1 \cup \cdots \cup A_n)) = \sum_{k=1}^{n} \mu(Z \cap A_k).$$
This follows for $n = 2$, replacing $Z$ by $Z \cap (A_1 \cup A_2)$ in the definition of
measurability, and then by induction. Let now $\{A_n\}$ be a sequence of
disjoint measurable sets, and let $A$ be their union. Using the fact that $\mu$
is an outer measure, we get for any subset $Z$:
$$\mu(Z) = \mu(Z \cap (A_1 \cup \cdots \cup A_n)) + \mu(Z \cap \mathcal{C}(A_1 \cup \cdots \cup A_n)) \ge \sum_{k=1}^{n} \mu(Z \cap A_k) + \mu(Z \cap \mathcal{C}A)$$
for all $n$, whence
$$\mu(Z) \ge \sum_{k=1}^{\infty} \mu(Z \cap A_k) + \mu(Z \cap \mathcal{C}A) \ge \mu(Z \cap A) + \mu(Z \cap \mathcal{C}A)$$
because $\mu$ is an outer measure. The converse inequality
$$\mu(Z) \le \mu(Z \cap A) + \mu(Z \cap \mathcal{C}A)$$
is true again because $\mu$ is an outer measure. Thus we have equality.

This proves both that $A$ is measurable, so the measurable sets form a
$\sigma$-algebra, and that $\mu$ is countably additive on $\mathcal{S}$ (take $Z = A$), thus concluding the
proof of the lemma.
To prove the existence part of the theorem, all we need to show now
is that the sets of our original algebra $\mathcal{A}$ are measurable. Let $A \in \mathcal{A}$ and
let $Z$ be any subset of $X$. The inequality
$$\mu(Z) \le \mu(Z \cap A) + \mu(Z \cap \mathcal{C}A)$$
is true because $\mu$ is an outer measure. Conversely, given $\epsilon$ let $\{A_n\}$ be a
sequence in $\mathcal{A}$ whose union covers $Z$ and such that
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(Z) + \epsilon.$$
Then $Z \cap A$ is contained in the union of the sets $A_n \cap A$, and $Z \cap \mathcal{C}A$ is
contained in the union of the sets $A_n \cap \mathcal{C}A = A_n - A$. Consequently
$$\mu(Z \cap A) + \mu(Z \cap \mathcal{C}A) \le \sum \mu(A_n \cap A) + \sum \mu(A_n \cap \mathcal{C}A) = \sum \mu(A_n) \le \mu(Z) + \epsilon.$$
This proves the reverse inequality, and proves the existence of an extension
of $\mu$ to a measure on $\mathcal{S}$, whence on the $\sigma$-algebra generated by $\mathcal{A}$.
Now for the uniqueness, we let $\mu$ be as we have just constructed it,
and let $\nu$ be any positive measure on the $\sigma$-algebra $\mathcal{M}$ generated by
$\mathcal{A}$, extending $\mu$ on $\mathcal{A}$. Let $\{A_n\}$ be a sequence in $\mathcal{A}$ of sets of finite
$\mu$-measure, whose union is $X$. For any given $Y$ it suffices to prove that
$$\nu(Y \cap A_n) = \mu(Y \cap A_n).$$
Thus it suffices to prove: if $A \in \mathcal{A}$ has finite measure and $Y$ is in $\mathcal{M}$ and
contained in $A$, then $\nu(Y) = \mu(Y)$. We have
$$\mu(Y) = \inf \sum \mu(B_n) = \inf \sum \nu(B_n),$$
the inf taken over all sequences $\{B_n\}$ in $\mathcal{A}$ whose union contains $Y$. This
shows that $\nu(Y) \le \mu(Y)$. But then also,
$$\nu(A - Y) \le \mu(A - Y).$$
However
$$\mu(A) = \nu(A) = \nu(A - Y) + \nu(Y) \le \mu(A - Y) + \mu(Y) = \mu(A).$$
This proves that we must have $\nu(Y) = \mu(Y)$ and concludes the proof of
the theorem. (For another proof of uniqueness, cf. Exercise 10(b).)
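As a sanity check on the two steps of the proof, here is a minimal Python sketch on a four-point set. The set `X`, the algebra `ALGEBRA`, and its measure are hypothetical choices made only for illustration; they do not come from the text. The outer measure is computed by brute force from the infimum formula, and measurability is tested with the splitting condition used in Lemma 7.3.

```python
from itertools import combinations

X = frozenset(range(4))

# Hypothetical algebra on X generated by the partition {0,1}, {2,3};
# each block is given measure 1 (an illustrative choice).
ALGEBRA = {
    frozenset(): 0,
    frozenset({0, 1}): 1,
    frozenset({2, 3}): 1,
    X: 2,
}

def subsets(s):
    """All subsets of the finite set s."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(y):
    """Outer measure of y: the cheapest cover of y by sets of the algebra.
    Repeating a set only adds cost, so families of distinct sets suffice."""
    costs = []
    for r in range(1, len(ALGEBRA) + 1):
        for family in combinations(ALGEBRA, r):
            if y <= frozenset().union(*family):
                costs.append(sum(ALGEBRA[a] for a in family))
    return min(costs)

def is_measurable(a):
    """Splitting condition: mu(Z) = mu(Z & A) + mu(Z - A) for every Z."""
    return all(outer(z) == outer(z & a) + outer(z - a) for z in subsets(X))

measurable = {a for a in subsets(X) if is_measurable(a)}
extends = all(outer(a) == m for a, m in ALGEBRA.items())
```

In this toy case the measurable sets are exactly the four sets of the algebra, and the outer measure agrees with the original measure on them, as the lemmas predict.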
Corollary 7.4. Let $(X, \mathcal{M}, \mu)$ be a measured space, and let $\mathcal{A}$ be a
subalgebra of $\mathcal{M}$ consisting of sets of finite measure, generating $\mathcal{M}$, and
such that $X$ is $\sigma$-finite with respect to $\mathcal{A}$. A subset $Z$ of $X$ has
$\mu^*$-measure 0 if and only if given $\epsilon$, there exists a sequence $\{A_n\}$ in $\mathcal{A}$
whose union covers $Z$ and such that
$$\sum_{n=1}^{\infty} \mu(A_n) < \epsilon.$$
Similarly for a set $Z \in \mathcal{M}$ of measure 0.

Proof. It is clear that a set satisfying the stated condition has measure
0. Conversely, we know from the theorem that the measure on $\mathcal{A}$ has a
unique extension to $\mathcal{M}$, given as the outer measure. From this our
assertion is obvious.
Remark. In euclidean space, with respect to Lebesgue measure (discussed
later), the algebra $\mathcal{A}$ is that formed of finite disjoint unions of
cubes. Thus a set has measure 0 in $\mathbf{R}^n$ if and only if given $\epsilon$ it can be
covered by a sequence of cubes, the sum of whose volumes is $< \epsilon$. In
many applications, one deals exclusively with sets of measure 0, and one
does not need any fancy measure theory or integration theory. Thus the
reader should keep this in mind so as to be more comfortable when
meeting such applications.
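The covering criterion is easy to try out on a countable subset of the line. The sketch below is an illustration only: the function names and the choice of points (some rationals in $[0, 1]$) are assumptions, not part of the text. The $n$-th point receives an interval of length $\epsilon/2^{n+1}$, so the total length stays below $\epsilon$ however many points are listed.

```python
from fractions import Fraction

def cover(points, eps):
    """Return open intervals (left, right); the n-th one is centred at the
    n-th point and has length eps / 2**(n + 1), for n = 1, 2, ..."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 2)
        intervals.append((x - half, x + half))
    return intervals

def total_length(intervals):
    """Sum of the lengths of the intervals."""
    return sum(right - left for left, right in intervals)

# Rationals in [0, 1] with denominator at most 7 (an arbitrary finite sample
# of a countable set).
rationals = sorted({Fraction(p, q) for q in range(1, 8) for p in range(q + 1)})
eps = Fraction(1, 100)
boxes = cover(rationals, eps)
```

Because the lengths form a geometric series with sum $\epsilon/2$, the same recipe covers the whole countable set, which therefore has measure 0.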
VI, §8. PRODUCT MEASURES AND INTEGRATION ON A PRODUCT SPACE
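Let $X$, $Y$ be sets and $\mathcal{A}$, $\mathcal{B}$ algebras of subsets in $X$, $Y$ respectively. By a
rectangle with respect to $\mathcal{A}$, $\mathcal{B}$ we mean a product $A \times B$ with $A \in \mathcal{A}$
and $B \in \mathcal{B}$. We let $\mathcal{A} \times \mathcal{B}$ denote the collection of all finite disjoint
unions of rectangles with respect to $\mathcal{A}$, $\mathcal{B}$. (Unless needed for clarity, we
omit the reference to $\mathcal{A}$, $\mathcal{B}$ in what follows.) We contend that $\mathcal{A} \times \mathcal{B}$ is
an algebra in $X \times Y$. This is easily proved as follows.
The empty set is in $\mathcal{A} \times \mathcal{B}$. We have the identities:
$$(A \times B) \cap (A' \times B') = (A \cap A') \times (B \cap B')$$
and
$$(A \times B) - (A' \times B') = [(A - A') \times B] \cup [(A \cap A') \times (B - B')].$$
If $P, Q \in \mathcal{A} \times \mathcal{B}$ these show that both $P \cap Q$ and $P - Q \in \mathcal{A} \times \mathcal{B}$. Since
$$P \cup Q = (P - Q) \cup Q$$
and $(P - Q) \cap Q$ is empty, it follows that $P \cup Q \in \mathcal{A} \times \mathcal{B}$. This proves
that $\mathcal{A} \times \mathcal{B}$ is an algebra.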
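The two set identities behind this argument, namely that the intersection of two rectangles is $(A \cap A') \times (B \cap B')$ and that their difference is the disjoint union of $(A - A') \times B$ and $(A \cap A') \times (B - B')$, can be spot-checked on small finite sets. The following Python sketch does this on random subsets; the helper names are hypothetical, and a random test is of course no substitute for the one-line proof.

```python
import random

def rect(a, b):
    """The rectangle A x B as a set of pairs."""
    return {(x, y) for x in a for y in b}

def random_subset(universe, rng):
    """Keep each element of the universe with probability 1/2."""
    return {u for u in universe if rng.random() < 0.5}

def identities_hold(a, b, a2, b2):
    """Check the intersection and difference identities for A x B, A' x B'."""
    r, r2 = rect(a, b), rect(a2, b2)
    piece1 = rect(a - a2, b)
    piece2 = rect(a & a2, b - b2)
    return (
        (r & r2) == rect(a & a2, b & b2)
        and (r - r2) == (piece1 | piece2)
        and not (piece1 & piece2)      # the two pieces are disjoint
    )

rng = random.Random(0)
all_ok = all(
    identities_hold(*(random_subset(range(5), rng) for _ in range(4)))
    for _ in range(500)
)
```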
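We denote by $\mathcal{A} \otimes \mathcal{B}$ the $\sigma$-algebra generated by $\mathcal{A} \times \mathcal{B}$. Also, we
denote by $\mathcal{A}^\sigma$ the $\sigma$-algebra generated by $\mathcal{A}$ in $X$. We have
$$\mathcal{A}^\sigma \otimes \mathcal{B}^\sigma = (\mathcal{A} \times \mathcal{B})^\sigma.$$
Proof. Since $(\mathcal{A} \times \mathcal{B}) \subset (\mathcal{A}^\sigma \times \mathcal{B}^\sigma) \subset (\mathcal{A}^\sigma \otimes \mathcal{B}^\sigma)$ it follows that
$$(\mathcal{A} \times \mathcal{B})^\sigma \subset \mathcal{A}^\sigma \otimes \mathcal{B}^\sigma,$$
and we must prove the reverse inclusion. For each $B \in \mathcal{B}$ consider the
$\sigma$-algebra in $X \times B$ generated by all sets $A \times B$ with $A \in \mathcal{A}$. It is contained
in $(\mathcal{A} \times \mathcal{B})^\sigma$, which therefore contains $\mathcal{A}^\sigma \times \{B\}$ for all $B \in \mathcal{B}$.
Now for any $A \in \mathcal{A}^\sigma$, it follows that $\{A\} \times \mathcal{B}^\sigma$ is contained in $(\mathcal{A} \times \mathcal{B})^\sigma$.
Thus finally,
$$\mathcal{A}^\sigma \times \mathcal{B}^\sigma \subset (\mathcal{A} \times \mathcal{B})^\sigma,$$
whence the reverse inclusion
$$\mathcal{A}^\sigma \otimes \mathcal{B}^\sigma \subset (\mathcal{A} \times \mathcal{B})^\sigma,$$
which proves what we wanted.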

Lemma 8.1. Let $\mathcal{M}$ be a $\sigma$-algebra in $X$ and $\mathcal{N}$ a $\sigma$-algebra in $Y$.

(i) Let $Q \in \mathcal{M} \otimes \mathcal{N}$ and for each $x \in X$ let $Q_x$ be the set of $y$ such
that $(x, y) \in Q$. Then $Q_x \in \mathcal{N}$.
(ii) Let $f : X \times Y \to Z$ be an $\mathcal{M} \otimes \mathcal{N}$ measurable map into a topological
space $Z$. For each $x \in X$, the map
$$f_x : Y \to Z$$
given by $f_x(y) = f(x, y)$ is measurable.
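Proof. Let $\mathcal{S}$ be the collection of subsets $Q \in \mathcal{M} \otimes \mathcal{N}$ such that
$Q_x \in \mathcal{N}$ for all $x$. Then $\mathcal{S}$ contains all rectangles $A \times B$ with $A \in \mathcal{M}$
and $B \in \mathcal{N}$. It will suffice to prove that $\mathcal{S}$ is a $\sigma$-algebra. The point is
that the operation $Q \mapsto Q_x$ commutes with all the operations of set
theory. Indeed, $X \times Y \in \mathcal{S}$. If $Q \in \mathcal{S}$, then $\mathcal{C}Q \in \mathcal{S}$ because $(\mathcal{C}Q)_x =
\mathcal{C}(Q_x)$. If $Q$, $P$ are in $\mathcal{S}$, then
$$(Q \cap P)_x = Q_x \cap P_x.$$
If $\{Q_n\}$ is a sequence in $\mathcal{S}$, then $(\bigcup Q_n)_x = \bigcup (Q_n)_x$. Thus we see that
$\mathcal{S}$ is a $\sigma$-algebra. This proves (i). As for (ii), if $V$ is open in $Z$, then
$$f_x^{-1}(V) = (f^{-1}(V))_x,$$
so $f_x$ is measurable. This proves the lemma.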
For the rest of this section we let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be $\sigma$-finite
measured spaces. Let $\mathcal{A}$ and $\mathcal{B}$ be the algebras of sets of finite measure
in $\mathcal{M}$ and $\mathcal{N}$ respectively.
If $f$ is a step map with respect to $\mathcal{A} \times \mathcal{B}$, then we can define a
repeated integral of $f$. Indeed, for each $x \in X$ the map $f_x$ is a step
map on $Y$ with respect to $\mathcal{B}$. In fact, if
$$f = v\chi_{A \times B}$$
for some $v \in E$ and $A \in \mathcal{A}$, $B \in \mathcal{B}$, then
$$f_x = v\chi_B \quad \text{if } x \in A,$$
and
$$f_x = 0 \quad \text{if } x \notin A.$$
Our assertion follows by linearity. Thus for each $x \in X$, we can form
a first integral,
$$\int_Y f_x \, d\nu.$$
If $f = v\chi_{A \times B}$ as above, we see that
$$\int_Y f_x \, d\nu = v\chi_A(x)\nu(B).$$
If $f$ is a step map with respect to $\mathcal{A} \times \mathcal{B}$, we conclude that the map
$$x \mapsto \int_Y f_x \, d\nu$$
is a step map with respect to $\mathcal{A}$.

We may therefore integrate this map over $X$, with respect to $\mu$, and
the repeated integral will be denoted by any one of the following
notations:
$$\int_X d\mu(x) \int_Y f_x \, d\nu, \qquad \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x), \qquad \int_X \int_Y f \, d\nu \, d\mu.$$
We use similar notation if we reverse the order of integration, and it is
clear that on step maps, the repeated integrals are equal to each other,
no matter what order of integration is chosen. In fact, we see at once
that for $A \in \mathcal{A}$ and $B \in \mathcal{B}$ we have
$$\int_X \int_Y \chi_{A \times B} \, d\nu \, d\mu = \mu(A)\nu(B) = \int_Y \int_X \chi_{A \times B} \, d\mu \, d\nu.$$
The repeated integral is linear on the space of step maps.
Proof. Obvious, because each one of the single integrals is linear.

In particular, there is a unique finitely additive positive function $\mu \times \nu$
on $\mathcal{A} \times \mathcal{B}$ such that for $A \in \mathcal{A}$ and $B \in \mathcal{B}$ we have
$$(\mu \times \nu)(A \times B) = \mu(A)\nu(B).$$
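The equality of the two repeated integrals on step maps can be seen concretely. The following Python sketch takes both measures to be length on the line and represents a real step map as a list of terms $v\chi_{A \times B}$ with half-open intervals $A$, $B$; the data in `TERMS` and all function names are hypothetical and serve only as an illustration of the definitions above.

```python
from fractions import Fraction as F

# A step map f = sum of v * chi_{A x B}, with A = [a0, a1) and B = [b0, b1).
# The rectangles may overlap; the values are illustrative.
TERMS = [
    (F(2), (F(0), F(1)), (F(0), F(3))),
    (F(-1), (F(1, 2), F(2)), (F(1), F(2))),
    (F(5), (F(0), F(1, 4)), (F(2), F(4))),
]

def length(interval):
    return interval[1] - interval[0]

def inner_dy(x):
    """First integral over Y at the point x: sum of v * chi_A(x) * nu(B)."""
    return sum(v * length(b) for v, a, b in TERMS if a[0] <= x < a[1])

def inner_dx(y):
    """First integral over X at the point y: sum of v * mu(A) * chi_B(y)."""
    return sum(v * length(a) for v, a, b in TERMS if b[0] <= y < b[1])

def integrate_step(g, breakpoints):
    """Integrate a function that is constant on each [p, q) between
    consecutive breakpoints and zero outside them."""
    pts = sorted(set(breakpoints))
    return sum(g(p) * (q - p) for p, q in zip(pts, pts[1:]))

dy_then_dx = integrate_step(inner_dy, [e for _, a, _ in TERMS for e in a])
dx_then_dy = integrate_step(inner_dx, [e for _, _, b in TERMS for e in b])
direct = sum(v * length(a) * length(b) for v, a, b in TERMS)
```

All three numbers agree, which is the statement that either order of integration gives the sum of the values $v\,\mu(A)\nu(B)$.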

Theorem 8.2. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be $\sigma$-finite measured spaces.
There exists a unique positive measure $\mu \otimes \nu$ on $\mathcal{M} \otimes \mathcal{N}$ such that for
all sets $A$, $B$ of finite measure in $\mathcal{M}$ and $\mathcal{N}$ respectively we have
$$(\mu \otimes \nu)(A \times B) = \mu(A)\nu(B).$$
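Proof. By Hahn's theorem, it suffices to prove that $\mu \times \nu$ is countably
additive on $\mathcal{A} \times \mathcal{B}$, i.e. is a measure on $\mathcal{A} \times \mathcal{B}$, where $\mathcal{A}$, $\mathcal{B}$ are the
algebras of sets of finite measure in $\mathcal{M}$ and $\mathcal{N}$ respectively. Let $\{Q_n\}$
be an increasing sequence in $\mathcal{A} \times \mathcal{B}$ whose union is an element $Q$ of
$\mathcal{A} \times \mathcal{B}$. Let $f_n$ be the characteristic function of $Q_n$. Then $\{f_n\}$ is increasing
to the characteristic function $f$ of $Q$. Furthermore, for each $x \in X$,
the function $(f_n)_x$ is increasing to $f_x$. By the monotone convergence
theorem with respect to $\nu$, we see that for each $x$,
$$\int_Y (f_n)_x \, d\nu \quad \text{is increasing to} \quad \int_Y f_x \, d\nu.$$
Now we apply the monotone convergence theorem with respect to $\mu$, to
conclude that
$$\int_X \int_Y f_n \, d\nu \, d\mu \quad \text{converges to} \quad \int_X \int_Y f \, d\nu \, d\mu.$$
In other words, $(\mu \times \nu)(Q_n)$ converges to $(\mu \times \nu)(Q)$, which is the countable additivity.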

Lemma 8.3. Let $Z$ be a set of $(\mu \otimes \nu)$-measure 0 in $X \times Y$. Then for
almost all $x \in X$ we have $\nu(Z_x) = 0$.
Proof. For each positive integer $n$, let $S_n$ be the set of all $x$ such that
$\nu(Z_x) \ge 1/n$. Let $S = \bigcup S_n$. It will suffice to prove that $S$ is contained in
a set of measure 0. Given $\epsilon$, let $\{R_k\}$ be a sequence of rectangles whose
union contains $Z$ and such that
$$\sum_{k=1}^{\infty} (\mu \times \nu)(R_k) < \frac{\epsilon}{n 2^n}.$$
Such $\{R_k\}$ exists by the corollary of Hahn's theorem. Then
$$Z_x \subset \bigcup_{k=1}^{\infty} R_{k,x}.$$
Let $T_n$ be the set of all $x$ such that
$$\frac{1}{n} \le \sum_{k=1}^{\infty} \nu(R_{k,x}).$$
Then $T_n$ is measurable, and $S_n \subset T_n$. Furthermore, the expression on the
right is integrable with respect to $x$, and we find that
$$\frac{1}{n}\mu(T_n) \le \sum_{k=1}^{\infty} \int_X \nu(R_{k,x}) \, d\mu = \sum_{k=1}^{\infty} (\mu \times \nu)(R_k) < \frac{\epsilon}{n 2^n}.$$
This shows that $\mu(T_n) < \epsilon/2^n$, whence $S$ is contained in a set of measure 0,
thereby proving our lemma. The converse will follow from Corollary 8.5.
Suppose that $f$ is in $\mathcal{L}^1(\mu \otimes \nu, E)$ and let $g$ differ from $f$ only on a set
of $(\mu \otimes \nu)$-measure 0, say $Z$. Then for each $x \in X$, the maps $f_x$ and $g_x$
differ only at those points $y \in Y$ such that $(x, y) \in Z$, i.e. those $y$ such that
$y \in Z_x$. By the lemma, there exists a set $S$ of measure 0 in $X$ such that
for all $x \notin S$ we have $\nu(Z_x) = 0$. From this we conclude that for such
$x \notin S$, the maps $f_x$ and $g_x$ differ only on a set of measure 0. Thus $f_x$ is in
$\mathcal{L}^1(\nu, E)$ if and only if $g_x$ is in $\mathcal{L}^1(\nu, E)$, and if this is the case, the
integrals with respect to $\nu$ will be equal. This is the situation which we
meet in the next theorem.
Theorem 8.4 (Fubini's Theorem, Part 1). Let $f \in \mathcal{L}^1(\mu \otimes \nu)$. Then for
almost all $x$, the map $f_x$ is in $\mathcal{L}^1(\nu)$, the map given by
$$x \mapsto \int_Y f_x \, d\nu$$
for almost all $x$ (and defined arbitrarily for other $x$) is in $\mathcal{L}^1(\mu)$; and we
have
$$\int_{X \times Y} f \, d(\mu \otimes \nu) = \int_X \int_Y f_x \, d\nu \, d\mu(x).$$
There is a natural Banach space norm preserving isomorphism
Proof. By Theorem 6.3, we can find a sequence $\{\varphi_n\}$ of step mappings
with respect to $\mathcal{A} \times \mathcal{B}$ which converges to $f$ both in $L^1$-seminorm and
almost everywhere on $X \times Y$. As before, $\mathcal{A}$, $\mathcal{B}$ are the algebras of sets of
finite measure in $\mathcal{M}$ and $\mathcal{N}$ respectively. We let $Z$ be a set of
$(\mu \otimes \nu)$-measure 0 in $X \times Y$ such that $\{\varphi_n\}$ converges pointwise to $f$ outside $Z$.
We let $S$ be a set of $\mu$-measure 0 in $X$ such that for $x \notin S$ we have
$$\nu(Z_x) = 0.$$
If $x \notin S$, it follows that $\{\varphi_{n,x}\}$ converges pointwise to $f_x$ on the complement
of $Z_x$.
Now we observe that for each $n$, the map
$$\Phi_n : x \mapsto \varphi_{n,x}$$
is a map of $X$ into $\mathrm{St}(\mathcal{B})$. Indeed, $\varphi_{n,x}$ is a step map with respect to $\mathcal{B}$,
and for $v \in E$, the formula
$$(v\chi_{A \times B})_x = \chi_A(x) v \chi_B$$
shows that $\Phi_n$ is step with respect to $\mathcal{A}$. We view the space $\mathrm{St}(\mathcal{B})$ as
having the $L^1$-seminorm. We contend that $\{\Phi_n\}$ is a Cauchy sequence.
This is easily seen, because
$$\|\Phi_n - \Phi_m\|_1 = \int_X |\Phi_n - \Phi_m| \, d\mu = \int_X \int_Y |\varphi_n(x, y) - \varphi_m(x, y)| \, d\nu(y) \, d\mu(x) = \|\varphi_n - \varphi_m\|_1.$$
(Of course, the $L^1$-seminorms taken on the right and left of the preceding
equation refer to different spaces.)
By the fundamental Lemma 3.1, we may assume without loss of generality
(using a subsequence if necessary) that there exists a set $T$ of measure
0 in $X$ such that for $x \notin T$ the sequence $\{\Phi_n(x)\}$ is Cauchy. [Lemma
3.1 and its proof are valid for values in a Banach space. For our
purposes, we note that the proof of this lemma applies as well in the
seminormed case to yield a pointwise Cauchy sequence for almost all $x$.
Alternatively, we may also take the natural map of $\mathrm{St}(\mathcal{B})$ into $L^1(\nu)$ and
apply the lemma with respect to the Banach space $L^1(\nu)$.] This means
that for each $x \notin T$, the sequence
$$\{\varphi_{n,x}\}$$
is Cauchy (that is, $L^1$-Cauchy with respect to $\nu$). If $x \notin S \cup T$, we know
that $\{\varphi_{n,x}(y)\}$ converges to $f_x(y)$ for almost all $y \in Y$. Hence by Corollary
5.10, we conclude that $f_x \in \mathcal{L}^1(\nu)$ and that $\{\varphi_{n,x}\}$ is $L^1$-convergent to
$f_x$, so that
$$\int_Y \varphi_{n,x} \, d\nu \quad \text{converges to} \quad \int_Y f_x \, d\nu$$
for all $x \notin S \cup T$.
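Finally, we note that the map
$$\Psi_n : x \mapsto \int_Y \varphi_{n,x} \, d\nu$$
is a step map with respect to $\mathcal{A}$. [It is in fact the composite map of $\Phi_n$
and the integral $\int_Y d\nu$.] Furthermore, the sequence $\{\Psi_n\}$ is Cauchy ($L^1$
with respect to $\mu$), as one sees by repeating the argument given above to
show that $\{\Phi_n\}$ is Cauchy. Also for all $x \notin S \cup T$ we know that $\Psi_n(x)$
converges to $\Psi(x)$, where $\Psi$ is the map given by
$$\Psi(x) = \int_Y f_x \, d\nu.$$
Consequently $\{\Psi_n\}$ is $L^1$-convergent to $\Psi$, and as $n \to \infty$,
$$\int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu(x) \quad \text{converges to} \quad \int_X \int_Y f_x \, d\nu \, d\mu(x).$$
Since $\varphi_n$ is a step map and
$$\int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu(x) = \int_{X \times Y} \varphi_n \, d(\mu \otimes \nu),$$
we see that Fubini's theorem is proved.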
Corollary 8.5. Let $Q$ be a measurable subset of finite measure in
$X \times Y$. Then
$$(\mu \otimes \nu)(Q) = \int_X \nu(Q_x) \, d\mu(x).$$
Proof. If $Q$ has finite measure, then $\chi_Q$ is in $\mathcal{L}^1(\mu \otimes \nu)$ and the
theorem applies.
Remark. Our version of Fubini's theorem as it applies to the situation
in the corollary does not yield the fact that the map
$$x \mapsto \nu(Q_x)$$
is measurable (only that it is $\mu$-measurable). It happens to be true that
the map is in fact measurable. Cf. Exercise 11.
In Fubini's theorem, we start with a map $f \in \mathcal{L}^1(\mu \otimes \nu)$ and conclude
that the various partial mappings arising from this $f$ are in the corresponding
$\mathcal{L}^1$ spaces. One can ask for the converse, which is true, properly
formulated.
Lemma 8.6. Let $f : X \times Y \to E$ be a $(\mu \otimes \nu)$-measurable map. Then for
almost all $x$, the map $f_x$ is $\nu$-measurable.
Proof. Let $Z$ be a set of measure 0 in $X \times Y$ such that the restriction
of $f$ to the complement of $Z$ is measurable, and the image of the complement
of $Z$ in $X \times Y$ is separable. By Lemma 8.3, for almost all $x$ the set
$Z_x$ has measure 0, and by Lemma 8.1 the restriction of $f_x$ to the complement
of $Z_x$ in $Y$ is measurable, whence $\nu$-measurable by M11. This
proves our lemma.
Theorem 8.7 (Fubini's Theorem, Part 2). Let $f : X \times Y \to E$ be a
$(\mu \otimes \nu)$-measurable map. Assume that for almost all $x \in X$ the map $f_x$ is
in $\mathcal{L}^1(\nu)$, and that the map given by
$$x \mapsto \int_Y |f_x| \, d\nu$$
(for almost all $x$, and arbitrary otherwise) is in $\mathcal{L}^1(\mu, \mathbf{R})$. Then
$$f \in \mathcal{L}^1(\mu \otimes \nu)$$
and Part 1 of Fubini's theorem applies.
Proof. By Corollary 5.9 of the dominated convergence theorem, it
suffices to prove that $|f|$ is in $\mathcal{L}^1(\mu \otimes \nu, \mathbf{R})$, and thus we may assume
without loss of generality that $f$ is a semipositive real function which
is $(\mu \otimes \nu)$-measurable, satisfying the other hypotheses of the theorem. By
condition M9 of §1, we can find a sequence of positive simple functions
$\{\varphi_n\}$ which is increasing to $f$ pointwise everywhere (changing $f$ if necessary
on a set of measure 0). Using the $\sigma$-finiteness of $X \times Y$, we may
assume further without loss of generality that each $\varphi_n$ vanishes outside
a set of finite measure, i.e. is step. For each $x$ the sequence $\{\varphi_{n,x}\}$ is
increasing to $f_x$. Whenever $x$ is such that $f_x$ is in $\mathcal{L}^1$, and $\varphi_{n,x}$ is
$\nu$-measurable, it follows that as $n \to \infty$,
$$\int_Y \varphi_{n,x} \, d\nu \quad \text{is increasing and convergent to} \quad \int_Y f_x \, d\nu.$$
We can apply the corollary of Fubini's theorem (by linearity), and the
monotone convergence theorem once more to conclude that the sequence
given by
$$\int_{X \times Y} \varphi_n \, d(\mu \otimes \nu) = \int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu$$
is increasing and convergent to
$$\int_X \int_Y f_x \, d\nu \, d\mu.$$
A final application of the monotone convergence theorem shows that $f$ is
in $\mathcal{L}^1$, thus proving our theorem.
VI, §9. THE LEBESGUE INTEGRAL IN $\mathbf{R}^p$
We start with R, and the algebra of subsets consisting of finite disjoint

unions of intervals. The length function is easily seen to extend to a
finitely additive function on this algebra. To get our theory going, we
must show that it is a measure, i.e. countably additive. It is in fact just
as convenient to prove a slightly more general statement.
Theorem 9.1. Let $\{f_n\}$ be a sequence of functions $\ge 0$ on a closed
bounded interval $I$, decreasing monotonically to 0. Assume that each $f_n$
is a step function with respect to intervals. Then the sequence of (plain
and ordinary) integrals
$$\int_I f_n(x) \, dx$$
decreases to 0.
Proof. For each $n$, the intervals on which $f_n$ is constant have a finite
number of end points. The union of such end points for all such intervals
and all $n = 1, 2, \ldots$ is countable, and can therefore be covered by a
sequence of open intervals $J_k$ such that
$$\sum_{k=1}^{\infty} l(J_k) < \epsilon,$$
where $l$ is the length. Let $U = \bigcup J_k$. If $x \in I$ and $x \notin U$, there exists
some $n_x$ and an open interval $V_x$ containing $x$ such that $f_{n_x}(t) < \epsilon$ for all
$t \in V_x$. Since the sequence $\{f_n\}$ is decreasing, it follows that $f_m(t) < \epsilon$ for
all $m \ge n_x$ and all $t \in V_x$. The family of open sets
$$J_k, \ V_x, \qquad k = 1, 2, \ldots; \quad x \in I, \ x \notin U,$$
covers $I$, and hence there exists a finite subcovering
$$\{J_{k_1}, \ldots, J_{k_r}, V_{x_1}, \ldots, V_{x_s}\}.$$
Let $N = \max(n_{x_1}, \ldots, n_{x_s})$. If $n \ge N$, then
$$f_n(t) < \epsilon \quad \text{if } t \in V_{x_1} \cup \cdots \cup V_{x_s}.$$
The integral $\int_I f_n(x) \, dx$ is bounded by the sum of the integrals of $f_n$ over
the intervals $J_{k_1}, \ldots, J_{k_r}$, and over the union $V_{x_1} \cup \cdots \cup V_{x_s}$. If $C$ is a
bound for $f_1$ (and hence all $f_n$) we conclude that
$$\int_I f_n(x) \, dx \le C\epsilon + l(I)\epsilon,$$
which proves our theorem.
Corollary 9.2. The length function of intervals extends uniquely to a
measure on the algebra consisting of finite disjoint unions of bounded
intervals.
Proof. If $\{A_n\}$ is a disjoint sequence in the algebra, whose union is an
element $A$ of the algebra, then the functions
$$\chi_A - \chi_{B_n} \quad \text{with } B_n = A_1 \cup \cdots \cup A_n$$
form a decreasing sequence of step functions, converging pointwise to 0,
to which we can apply the theorem.
Having our measure on the algebra of finite disjoint unions of bounded
intervals, we can first obtain a $\sigma$-algebra and a measure on it by Hahn's
theorem. Then §3 gives us the integral on the reals. We can apply the
theory of integration on product spaces to get the integral on $\mathbf{R}^p$, because
$\mathbf{R}$ is obviously $\sigma$-finite with respect to bounded intervals. Thus we now
have integration on $\mathbf{R}^p$. The $\sigma$-algebra of measurable sets in $\mathbf{R}^p$ obtained
by the preceding procedure is that generated by the rectangles (i.e. $p$-fold
Cartesian products of intervals), and is thus the $\sigma$-algebra of Borel sets in
$\mathbf{R}^p$. The measure on this algebra obtained as above is called Lebesgue
measure. One can also extend it to the completion of the $\sigma$-algebra
generated by rectangles (cf. Exercise 7). This makes no difference concerning
integration, and is in fact frequently very convenient.
We observe that for Lebesgue measure, rectangles have the expected
measure, namely the product of the lengths of the sides.
For the rest of this section, we let $\mu$ denote Lebesgue measure. One
customarily writes $\mathcal{L}^1(\mathbf{R}^p)$ instead of $\mathcal{L}^1(\mu)$ in this case.
It is clear that $\mathbf{R}^p$ is $\sigma$-finite, being a union of bounded rectangles, i.e.
$p$-dimensional rectangles. Thus we can apply the density statement concerning
step mappings with respect to finite unions of rectangles. We
shall give an application of Corollary 6.4.
If $\varphi$ is a function on $\mathbf{R}^p$, we say that $\varphi$ has compact support if
$\varphi(x) = 0$ for $x$ outside some compact set. We let $C_c^\infty(\mathbf{R}^p, \mathbf{C})$ be the space
of $C^\infty$ (infinitely differentiable) functions (complex) with compact support.
It is clearly a vector space.
Theorem 9.3. Let $f \in \mathcal{L}^1(\mu)$. If
$$\int_{\mathbf{R}^p} f\varphi \, d\mu = 0$$
for all $\varphi \in C_c^\infty(\mathbf{R}^p, \mathbf{C})$, then $f$ is equal to 0 almost everywhere.
Proof. According to Corollary 6.4 it suffices to prove that
$$\int_A f \, d\mu = 0$$
for all bounded rectangles $A$. We shall recall below how to approximate
a characteristic function $\chi_A$ of a rectangle by a $C^\infty$ function with compact
support, both almost everywhere and for the $L^1$-seminorm. In other
words, we can find a sequence $\{\varphi_n\}$ of $C^\infty$ functions with compact support
which tends almost everywhere to $\chi_A$ and is bounded, say by a
constant $C$. Then $\{\varphi_n f\}$ tends almost everywhere to $f_A = \chi_A f$, and each
$\varphi_n f$ is in $\mathcal{L}^1$ by Corollary 5.11 of the dominated convergence theorem.
Applying the dominated convergence theorem, we conclude that $\{\varphi_n f\}$ is
$L^1$-convergent to $f_A$, whence
$$\int_A f \, d\mu = \lim_{n \to \infty} \int_{\mathbf{R}^p} \varphi_n f \, d\mu = 0.$$
This proves what we wanted.
We now recall the construction mentioned in our proof. It is basically
a one-dimensional construction. Let $a$, $b$ be real and $a < b$. The function
$$h(t) = e^{-1/(t-a)(b-t)} \quad \text{if } a < t < b,$$
$$h(t) = 0 \quad \text{if } t \le a \text{ or } t \ge b,$$
is a bell-shaped $C^\infty$ function which looks as follows:
[Figure: graph of the bell-shaped function $h$, positive between $a$ and $b$ and zero elsewhere.]
The function
$$g : x \mapsto \int_{-\infty}^{x} h(t) \, dt = g(x)$$
then starts from 0 and climbs between $a$ and $b$ to a constant value,
looking like this:
[Figure: graph of $g$, equal to 0 up to $a$, climbing between $a$ and $b$, and constant after $b$.]
Multiplying by a positive constant, we can assume that the top value is
equal to any given number $> 0$.
If we make a translation on $g$ we can assume for instance that $a = 0$.
Considering the function $g(cx)$ instead of $g(x)$ where $c$ is a large constant,
we can make the climb arbitrarily steep. Combining translations and
such steep climbs, we can then find a function which is $C^\infty$ and looks
like this:
[Figure: graph of a plateau function, rising steeply just after $a$, constant in between, and falling steeply just before $b$.]
In other words, this function approximates the characteristic function of
$[a, b]$ from below. We can do the same thing from above. Taking
suitable products to do the same thing in $p$-space, we end up with the
following result.
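The one-dimensional construction can be sketched numerically. The Python code below is an illustration under stated assumptions, not the book's construction verbatim: it normalises to $a = 0$, $b = 1$, replaces the exact integral by a midpoint rule with an arbitrary number of steps, and the names `h`, `g`, and `plateau` are chosen here. The plateau is obtained, as in the text, by rescaling and translating the climb: one factor climbs on $[a, a + \delta]$ and the other descends on $[b - \delta, b]$.

```python
import math

def h(t):
    """Bell-shaped function on (0, 1): exp(-1 / (t (1 - t))), zero elsewhere."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return math.exp(-1.0 / (t * (1.0 - t)))

def _integral(upper, steps=2000):
    """Midpoint-rule approximation of the integral of h over [0, upper],
    with upper clamped to [0, 1] (h vanishes outside that interval)."""
    upper = max(0.0, min(upper, 1.0))
    width = upper / steps
    return width * sum(h((i + 0.5) * width) for i in range(steps))

_TOTAL = _integral(1.0)

def g(x):
    """Normalised climb: 0 for x <= 0, increasing on (0, 1), 1 for x >= 1."""
    return _integral(x) / _TOTAL

def plateau(x, a, b, delta):
    """Values in [0, 1]; 0 outside [a, b] and 1 on [a + delta, b - delta].
    Assumes 0 < delta <= (b - a) / 2."""
    return g((x - a) / delta) * g((b - x) / delta)
```

For instance, `plateau(x, 0.0, 1.0, 0.1)` vanishes for `x` outside $[0, 1]$ and is equal to 1 on $[0.1, 0.9]$, so it approximates the characteristic function of $[0, 1]$ from below. Approximating from above works the same way with the rectangle enlarged by $\delta$.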
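Lemma 9.4. Let $A$ be a bounded rectangle in $\mathbf{R}^p$. Given $\epsilon$, there exist
$C^\infty$ functions $\varphi$, $\psi$ having the following properties:
(i) We have
$$0 \le \varphi \le \chi_A \le \psi \le 1.$$
(ii) We have
$$\int_{\mathbf{R}^p} (\psi - \varphi) \, d\mu < \epsilon.$$
In fact, if
$$A = [a_1, b_1] \times \cdots \times [a_p, b_p],$$
the function $\psi$ is 0 outside the rectangle
$$[a_1 - \epsilon, b_1 + \epsilon] \times \cdots \times [a_p - \epsilon, b_p + \epsilon]$$
and the function $\varphi$ is 1 on the rectangle
$$[a_1 + \epsilon, b_1 - \epsilon] \times \cdots \times [a_p + \epsilon, b_p - \epsilon].$$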

Observe that in deriving this lemma, we are dealing with the simplest
case of Riemann integration. The lemma is at the level of elementary
calculus.
The result of Theorem 9.3 really concerns the values of the map $f$ on
bounded sets of $\mathbf{R}^p$. In many applications, it is not convenient to restrict
oneself to elements of $\mathcal{L}^1$, and one needs a formulation which allows us
to deal with maps locally. Thus we say that a map $f : \mathbf{R}^p \to E$ is locally
integrable if for each compact set $K$ in $\mathbf{R}^p$ the map $f_K$ (equal to $f$ on $K$
and 0 outside $K$) is in $\mathcal{L}^1(\mu)$.
Corollary 9.5. Let $f$ be a locally integrable map on $\mathbf{R}^p$ such that for all
$\varphi \in C_c^\infty(\mathbf{R}^p, \mathbf{C})$ we have
$$\int_{\mathbf{R}^p} f\varphi \, d\mu = 0.$$
Then $f$ is equal to 0 almost everywhere.
Proof. This is really what Theorem 9.3 proved, since all we have to
consider is $f_A$ for every bounded rectangle $A$.
Theorem 9.6. The space $C_c^\infty(\mathbf{R}^p)$ is dense in $\mathcal{L}^1(\mu, \mathbf{C})$.
Proof. We may restrict ourselves to the real functions. We know that
the step functions with respect to rectangles are dense in $\mathcal{L}^1$. On the
other hand, the characteristic function of a rectangle can be approximated
by $C^\infty$ functions with compact support, as we saw above for the
proof of Theorem 9.3. The assertion of our theorem follows at once.
Let $f : \mathbf{R}^p \to E$ be a map, and let $a \in \mathbf{R}^p$. We define the translation $\tau_a f$,
also written $f_a$, to be the map given by
$$(\tau_a f)(x) = f(x - a).$$
If $Y$ is a subset of $\mathbf{R}^p$, we define
$$\tau_a Y$$
to be the set of all points $x + a$ with $x \in Y$. Our definitions are adjusted
in such a way that
$$\tau_a \chi_Y = \chi_{\tau_a Y}.$$
Theorem 9.7. The Lebesgue integral is translation invariant. This
means: If $f \in \mathcal{L}^1(\mu)$, then for each $a \in \mathbf{R}^p$ the map $\tau_a f$ is in $\mathcal{L}^1(\mu)$, and
we have
$$\int_{\mathbf{R}^p} \tau_a f \, d\mu = \int_{\mathbf{R}^p} f \, d\mu.$$
Proof. By Theorem 6.3 we can find a sequence $\{\varphi_n\}$ of step maps with
respect to finite unions of rectangles which converges both $L^1$ and almost
everywhere to $f$. If $\varphi$ is a step map as above, it is clear that its integral
is the same as the integral of a translation $\tau_a \varphi$, because if $R$ is a rectangle,
then the translated rectangle $\tau_a R = \{x + a : x \in R\}$ satisfies
$$\mu(\tau_a R) = \mu(R).$$
But $\{\tau_a \varphi_n\}$ converges almost everywhere to $\tau_a f$, and by the preceding
remark, $\{\tau_a \varphi_n\}$ is $L^1$-Cauchy, whence is also $L^1$-convergent to $\tau_a f$. Our
theorem follows at once.
Theorem 9.8. If $Y$ is a measurable set in $\mathbf{R}^p$, then we have:
$$\mu(Y) = \inf \mu(U) \quad \text{for } U \text{ open, } U \supset Y,$$
$$\mu(Y) = \sup \mu(K) \quad \text{for } K \text{ compact, } K \subset Y.$$
Thus if $Y$ has finite measure, given $\epsilon$ there exists an open set $U$ and a
compact set $K$ such that
$$K \subset Y \subset U \quad \text{and} \quad \mu(U - K) < \epsilon.$$
Proof. The statement concerning open sets is clear by applying the
definition of our measure as an application of the Hahn theorem, giving
$\mu$ as the outer measure with respect to bounded rectangles. We can
always take the rectangles to be open to cover $Y$, since a closed rectangle
is contained in an open one whose measure is at most $\epsilon/2^n$ bigger.
Concerning the statement about compact sets, suppose first that $Y$ is
bounded, say contained in a closed bounded rectangle $R$. We find an
open set $U$ containing $R - Y$ such that
$$\mu(U) < \mu(R - Y) + \epsilon.$$
Let $K = R \cap \mathcal{C}U = R - U$. Then $K$ is compact and contained in $Y$. We
have trivially:
$$\mu(K) \le \mu(Y) = \mu(R) - \mu(R - Y) \le \mu(R) - \mu(U) + \epsilon \le \mu(K) + \epsilon.$$
This proves our assertion when $Y$ is bounded. The general case follows
at once by considering the intersections of $Y$ with a sequence $\{R_n\}$ of
rectangles such that $R_n \subset R_{n+1}$ for all $n$, and such that the union of the
$R_n$ is the entire euclidean space.
VI, §10. EXERCISES
Unless otherwise specified, $(X, \mathcal{M}, \mu)$ is a measured space.

1. (a) Let $\mathcal{M}$ be a $\sigma$-algebra in a set $X$, and let
$$f : X \to Y \quad \text{and} \quad g : Y \to Z$$
be mappings. Show that
$$(g \circ f)_*(\mathcal{M}) = g_*(f_*(\mathcal{M})).$$
In other words, $(g \circ f)_* = g_* \circ f_*$.

(b) Let $\mathcal{M}$ be a $\sigma$-algebra in a set $X$, and let $\mu$ be a positive measure. Let
$f : X \to Y$ be a mapping. Define the direct image $f_*\mu$ on $f_*\mathcal{M}$ by the
condition
$$(f_*\mu)(B) = \mu(f^{-1}(B))$$
for all $B$ in $f_*\mathcal{M}$. Show that $f_*\mu$ is a positive measure.

2. Egoroff's theorem. Assume that $\mu$ is $\sigma$-finite. Let $f : X \to E$ be a map and
assume that $f$ is the pointwise limit of a sequence of simple maps $\{\varphi_n\}$.
Given $\epsilon$, show that there exists a set $Z$ with $\mu(Z) < \epsilon$ such that the convergence
of $\{\varphi_n\}$ is uniform on the complement of $Z$. [Hint: Assume first that
$\mu(X)$ is finite. Let $A_k$ be the set where $|f| \ge k$. The intersection of all $A_k$ is
empty so their measures tend to 0. Excluding a set of small measure, you
can assume that $f$ is bounded, in which case $f$ is in $\mathcal{L}^1(\mu)$ and you can use
the fundamental lemma of integration, or Theorem 5.2.]
3. Let $\{f_n\}$ be a sequence of measurable functions. Show that the set of those $x$
such that $\{f_n(x)\}$ converges is a measurable set.
4. Let $\{a_n\}$ be a sequence in $[-\infty, \infty]$. View $[-\infty, \infty]$ as a topological space,
neighborhoods of $-\infty$ being given by sets $[-\infty, a)$ for $a$ real, and similarly
for neighborhoods of $\infty$. By
$$\limsup a_n$$
we mean the least upper bound of all points of accumulation of the sequence
$\{a_n\}$. We allow $-\infty$ and $+\infty$ as points of accumulation, taking the obvious
ordering in $[-\infty, \infty]$ where
$$-\infty < a < \infty$$
for all real $a$.
(a) Let $b = \limsup a_n$. Suppose that $b$ is a number. Show that given $\epsilon$, there
exist only a finite number of $n$ such that $a_n > b + \epsilon$, and there exist
infinitely many $n$ such that $a_n > b - \epsilon$. Prove that this property
characterizes the lim sup. Give a similar characterization when $b = \infty$.
(b) Characterize lim inf similarly, and show that
$$\liminf a_n = -\limsup(-a_n).$$

(c) A sequence $\{a_n\}$ in $[-\infty, \infty]$ converges if and only if
$$\limsup a_n = \liminf a_n.$$
(d) If $\{f_n\}$ is a sequence of measurable maps of $X$ into $[-\infty, \infty]$, then its
upper limit and lower limit are measurable. (By the way, the lim sup and
lim inf of the sequence $\{f_n\}$ are defined pointwise.)
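To make the definitions of Exercise 4 concrete, here is a small worked example. The sequence $a_n = (-1)^n(1 + 1/n)$ is our own illustrative choice and does not come from the text; the computation only sketches how parts (a), (b), (c) look in one case.

```latex
% Illustrative sequence (an assumption, not from the text): a_n = (-1)^n (1 + 1/n).
\begin{align*}
a_{2k} &= 1 + \tfrac{1}{2k} \longrightarrow 1, &
a_{2k+1} &= -\bigl(1 + \tfrac{1}{2k+1}\bigr) \longrightarrow -1 .
\end{align*}
% The points of accumulation are exactly 1 and -1, so
\[
\limsup a_n = 1, \qquad \liminf a_n = -1 .
\]
% Check of (a) with b = 1: a_n > 1 + \epsilon forces n even and 1/n > \epsilon,
% which happens for only finitely many n; a_n > 1 - \epsilon holds for every even n.
% Check of (b): -a_n = (-1)^{n+1}(1 + 1/n) has \limsup(-a_n) = 1, hence
\[
-\limsup(-a_n) = -1 = \liminf a_n .
\]
% By (c), the sequence does not converge, since \limsup a_n \neq \liminf a_n.
```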
5. Positive measurable maps. A map $f: X \to [0, \infty]$ will be called positive.
(a) If $f, g: X \to [0, \infty]$ are measurable, show that $f + g$, $fg$ are measurable. If
$\{f_n\}$ is a sequence of positive measurable maps, show that $\sup f_n$ and
$\inf f_n$ are also measurable.
(b) If $\mu$ is $\sigma$-finite, show that $f$ is measurable if and only if $f$ is the limit of
an increasing sequence of real valued step functions (0 outside a set of
finite measure).
(c) For a positive measurable map $f: X \to [0, \infty]$ let $\{f_n\}$ be a sequence of
positive simple functions (real valued) which is increasing to $f$. If the
integrals
$$\int_X f_n\,d\mu$$
exist and are bounded (so in particular each $f_n$ is 0 outside a set of finite
measure), define the integral of $f$ to be their least upper bound, and if
unbounded, define the integral of $f$ to be $\infty$. Show that this is well
defined, i.e. independent of the sequence $\{f_n\}$ increasing to $f$. Formulate
and prove the monotone convergence theorem in this context. Note: Instead
of redoing integration theory, you can quote results from the text to
shorten the procedure.
(d) For each measurable $A$ and positive measurable map $f: X \to [0, \infty]$
define
$$\mu_f(A) = \int_A f\,d\mu.$$
Show that $\mu_f$ is a positive measure on $X$. If $g: X \to [0, \infty]$ is measurable,
show that
$$\int_X g\,d\mu_f = \int_X gf\,d\mu.$$
6. Let $\{f_n\}$ be a sequence of continuous functions on $[0, 1]$ such that $0 \leq f_n \leq 1$
and such that $\{f_n(x)\}$ converges to 0 for every $x$ in $[0, 1]$. Show that
$$\lim_{n\to\infty} \int_0^1 f_n\,d\mu = 0,$$
where $\mu$ is Lebesgue measure.

7. Completion of a measure.
(a) Let $\bar{\mathscr{M}}$ consist of all subsets $Y$ of $X$ which differ from an element of $\mathscr{M}$
by a set contained in a set of measure 0. In other words, there exists a
set $A$ in $\mathscr{M}$ such that $(Y - A) \cup (A - Y)$ is contained in a set of measure
0. Show that $\bar{\mathscr{M}}$ is a $\sigma$-algebra. If we define $\bar{\mu}(Y) = \mu(A)$ for $Y$, $A$ as
above, show that this is well defined on $\bar{\mathscr{M}}$, and that $\bar{\mu}$ is a measure
on $\bar{\mathscr{M}}$. We call $(X, \bar{\mathscr{M}}, \bar{\mu})$ the complete measure space determined by
$(X, \mathscr{M}, \mu)$, and we call $\bar{\mu}$ the completion of $\mu$.
(b) Let $(X_i, \mathscr{M}_i, \mu_i)$ $(i = 1, 2, 3)$ be measured spaces. Show that
$$(\mathscr{M}_1 \otimes \mathscr{M}_2) \otimes \mathscr{M}_3 = \mathscr{M}_1 \otimes (\mathscr{M}_2 \otimes \mathscr{M}_3),$$
$$(\mu_1 \otimes \mu_2) \otimes \mu_3 = \mu_1 \otimes (\mu_2 \otimes \mu_3).$$
If $(X, \mathscr{M}, \mu)$ and $(Y, \mathscr{N}, \nu)$ are measured spaces, show that
and il® v= /1® v.
8. (a) Direct image of a measure. Let $(X, \mathscr{M})$ and $(Y, \mathscr{N})$ be measurable spaces.
Let $f: X \to Y$ be a map such that for each $B \in \mathscr{N}$ we have $f^{-1}(B) \in \mathscr{M}$.
Let $\mu$ be a positive measure on $\mathscr{M}$, and let $\mu_* = f_*\mu$ be defined on $\mathscr{N}$
by $\mu_*(B) = \mu(f^{-1}(B))$. Show that $\mu_*$ is a measure. Show that if $g$ is in
$\mathscr{L}^1(\mu_*)$, then $g \circ f$ is in $\mathscr{L}^1(\mu)$, and that
$$\int_Y g\,d\mu_* = \int_X (g \circ f)\,d\mu.$$
(b) Let $X$, $Y$ be topological spaces and $f: X \to Y$ a homeomorphism. Show
that $f$ induces a bijective map
$$f^*: \mathscr{B}(Y) \to \mathscr{B}(X)$$
where $\mathscr{B}$ denotes the Borel algebra.
9. Let $E$ be a Hilbert space with countable base. A map $f: X \to E$ is called
weakly measurable if for every functional $\lambda$ on $E$ the composite $\lambda \circ f$ is
measurable. Let $f, g: X \to E$ be weakly measurable. Show that the map
$$x \mapsto \langle f(x), g(x) \rangle$$
is measurable. [Hint: Write the maps in terms of their component functions
with respect to a Hilbert basis, so the scalar product becomes a limit of
measurable functions.]
10. Monotone families.
(a) A collection $\mathscr{S}$ of subsets of $X$ is said to be monotone if, whenever $\{A_n\}$ is
an increasing (resp. decreasing) sequence of subsets in $\mathscr{S}$, then
$$\bigcup A_n \quad (\text{resp. } \bigcap A_n)$$
also lies in $\mathscr{S}$. Let $\mathscr{A}$ be an algebra of subsets of $X$. Show that there
exists a smallest monotone collection of subsets of $X$ containing $\mathscr{A}$. Denote
it by $\mathscr{N}$. If $X \in \mathscr{N}$, show that $\mathscr{N}$ is a $\sigma$-algebra, and is thus the
smallest $\sigma$-algebra containing $\mathscr{A}$. [Hint: For each $A \in \mathscr{N}$, let $\mathscr{N}(A)$ be
the collection of all sets $B \in \mathscr{N}$ such that $B \cup A$, $B - A$ and $A - B$ lie in
$\mathscr{N}$. Then $\mathscr{N}(A)$ is monotone.]
(b) Assume that $X \in \mathscr{N}$. Let $\mu$ be a positive measure on $\mathscr{A}$. Show that an
extension of $\mu$ to a positive measure on $\mathscr{N}$ is uniquely determined, by
proving: If $\mu_1$, $\mu_2$ are extensions of $\mu$ to $\mathscr{N}$, then the collection of subsets
$Y$ such that $\mu_1(Y) = \mu_2(Y)$ is monotone.
11. Let $(X, \mathscr{M}, \mu)$ and $(Y, \mathscr{N}, \nu)$ be measured spaces. If $Q \in \mathscr{M} \otimes \mathscr{N}$, show that
the map
$$x \mapsto \nu(Q_x)$$
is measurable (with respect to $\mathscr{M}$). [Hint: Show that the set of $Q$ in $\mathscr{M} \otimes \mathscr{N}$
having the above property is a monotone family containing the rectangles.]
12. Show that if $c_n$ is the (Lebesgue) measure of the closed $n$-ball in $\mathbf{R}^n$ of radius
1, centered at the origin, then
$$c_n = c_{n-1} \int_{-\pi/2}^{\pi/2} \cos^n t\,dt,$$
and therefore
$$c_n = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}.$$
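As a sanity check on Exercise 12, the first few cases can be computed by hand. The sketch below reads the two formulas as $c_n = c_{n-1}\int_{-\pi/2}^{\pi/2}\cos^n t\,dt$ and $c_n = \pi^{n/2}/\Gamma(n/2+1)$ (the standard statement, and our reading of the display), and takes $c_0 = 1$ as a convention of ours to start the recursion.

```latex
% Elementary integrals over [-\pi/2, \pi/2]:
%   \int \cos t \, dt = 2, \qquad \int \cos^2 t \, dt = \pi/2, \qquad \int \cos^3 t \, dt = 4/3.
% Left column: the recursion.  Right column: the closed formula.
\begin{align*}
c_1 &= c_0 \cdot 2 = 2, &
  \frac{\pi^{1/2}}{\Gamma(3/2)} &= \frac{\sqrt{\pi}}{\sqrt{\pi}/2} = 2, \\
c_2 &= c_1 \cdot \frac{\pi}{2} = \pi, &
  \frac{\pi}{\Gamma(2)} &= \pi, \\
c_3 &= c_2 \cdot \frac{4}{3} = \frac{4\pi}{3}, &
  \frac{\pi^{3/2}}{\Gamma(5/2)} &= \frac{\pi^{3/2}}{3\sqrt{\pi}/4} = \frac{4\pi}{3}.
\end{align*}
```

These agree with the familiar length of $[-1, 1]$, area of the unit disc, and volume of the unit ball in $\mathbf{R}^3$.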
13. Let $T$ be a metric space and let $f$ be a map on $X \times T$ such that for each
$t \in T$ the partial map
$$f_t: x \mapsto f(x, t)$$
is in $\mathscr{L}^1$. Assume that for each $x$ the map $t \mapsto f(x, t)$ is continuous. Finally
assume that there is some $g \in \mathscr{L}^1(\mu, \mathbf{R})$ such that $|f(x, t)| \leq |g(x)|$ for all $x$, $t$.
Show that the function $\Phi$ given by
$$\Phi(t) = \int_X f(x, t)\,d\mu(x)$$
is continuous.
14. Differentiating under the integral sign. Let $T$ be open in some euclidean
space. Let $f$ be a map on $X \times T$ satisfying:
(a) For each $t$ the map $x \mapsto f(x, t)$ is in $\mathscr{L}^1$.
(b) For each $x$, the map $f_x: t \mapsto f(x, t)$ is differentiable, and its derivative is
continuous in $t$.
(c) The second partial $D_2 f(x, t)$ is in $\mathscr{L}^1$ for each $t$, and there exists an
element $g \in \mathscr{L}^1(\mu, \mathbf{R})$ and $g \geq 0$ such that
$$|D_2 f(x, t)| \leq g(x)$$
for all $x$, $t$.
Then the map $\Phi$ as in the preceding exercise is differentiable, and its derivative
is given by
$$D\Phi(t) = \int_X D_2 f(x, t)\,d\mu(x).$$
(If you prefer, take $T$ to be an open interval.)
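A classical instance of Exercise 14 may help fix ideas. The choices below are ours, for illustration only: $X = \mathbf{R}$ with Lebesgue measure, $T = \mathbf{R}$, the integrand $f(x,t) = e^{-x^2}\cos(tx)$, and the dominating function $g(x) = |x|e^{-x^2}$.

```latex
% Assumed example: f(x,t) = e^{-x^2}\cos(tx) on \mathbf{R} \times \mathbf{R}.
\[
\Phi(t) = \int_{\mathbf{R}} e^{-x^2}\cos(tx)\,dx, \qquad
|D_2 f(x,t)| = |x\,e^{-x^2}\sin(tx)| \le |x|\,e^{-x^2} = g(x), \quad g \in \mathscr{L}^1 .
\]
% Differentiate under the integral sign, then integrate by parts in x
% (the boundary term vanishes because e^{-x^2} \to 0 at \pm\infty):
\begin{align*}
D\Phi(t) &= -\int_{\mathbf{R}} x\,e^{-x^2}\sin(tx)\,dx \\
         &= \Bigl[\tfrac{1}{2}e^{-x^2}\sin(tx)\Bigr]_{-\infty}^{\infty}
            - \frac{t}{2}\int_{\mathbf{R}} e^{-x^2}\cos(tx)\,dx
          = -\frac{t}{2}\,\Phi(t).
\end{align*}
% Since \Phi(0) = \sqrt{\pi}, solving this differential equation gives
\[
\Phi(t) = \sqrt{\pi}\,e^{-t^2/4}.
\]
```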

15. Let $\mu$ be Lebesgue measure on $\mathbf{R}^p$. If $f, g \in \mathscr{L}^1(\mu)$, define $f * g$ by
$$f * g(x) = \int_{\mathbf{R}^p} f(t) g(x - t)\,d\mu(t).$$
(a) Show that $f * g \in \mathscr{L}^1(\mu)$ and that $\|f * g\|_1 \leq \|f\|_1 \|g\|_1$. We call $f * g$ the
convolution of $f$ and $g$.
(b) Show that convolution is commutative, associative, bilinear, and that
$\mathscr{L}^1(\mu)$ is therefore a Banach algebra. Does there exist a unit element in
this algebra?
16. Let $M$ be the set of all finite positive Borel measures on $\mathbf{R}^p$. For each $\mu \in M$
define $|\mu| = \mu(\mathbf{R}^p)$. For $\mu, \nu \in M$, and any Borel subset $A$ of $\mathbf{R}^p$ define
$$(\mu * \nu)(A) = (\mu \otimes \nu)(\sigma^{-1}(A)),$$
where $\sigma: \mathbf{R}^p \times \mathbf{R}^p \to \mathbf{R}^p$ is the sum, that is $\sigma(x, y) = x + y$.
(a) Show that $\sigma^{-1}(A)$ is a Borel set in $\mathbf{R}^p \times \mathbf{R}^p$.
(b) Show that $\mu * \nu \in M$ and that $|\mu * \nu| \leq |\mu||\nu|$.
(c) Show that $\mu * \nu$ is the unique positive Borel measure $\tau$ such that
$$\int f\,d\tau = \iint f(x + y)\,d\nu(y)\,d\mu(x)$$
for every step function $f$ with respect to rectangles.
(d) The operation $(\mu, \nu) \mapsto \mu * \nu$ is called convolution. Show that it is
commutative, associative, and bilinear.
(e) Show that there exists a unit element in $M$, i.e. an element $\delta$ such that
$\delta * \mu = \mu * \delta = \mu$ for all $\mu \in M$.
(f) Let $\mu$ be Lebesgue measure, and let $f, g \in \mathscr{L}^1(\mu)$. Show that
$$\mu_f * \mu_g = \mu_{f * g}.$$
(g) After you have read about complex measures in the next chapter, show
that all the previous properties apply as well to such measures, and that
these measures therefore form a Banach algebra under convolution.
17. Let $X = [-\pi, \pi]$, and let $\mu$ be Lebesgue measure. Let $f \in \mathscr{L}^1(\mu, \mathbf{C})$. Show
that one can define the Fourier coefficients of $f$ in the usual way, by
$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx.$$
If $c_n = 0$ for all integers $n$, show that $f$ is equal to 0 almost everywhere.

18. Riemann-Lebesgue lemma. Let $f \in \mathscr{L}^1(\mathbf{R})$. Prove that
$$\lim_{t \to \infty} \int_{\mathbf{R}} f(x) e^{-itx}\,dx = 0.$$
[Hint: Approximate $f$ by a $C^\infty$ function with compact support, in which case
integrate by parts.]
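The simplest case of the lemma can be checked directly. The choice $f = \chi_{[a,b]}$, the characteristic function of a bounded interval, is ours; it is not the $C^\infty$ approximation suggested in the hint, only an illustration of the decay that the lemma asserts.

```latex
% Assumed example: f = \chi_{[a,b]} with a < b, and t \neq 0.
\[
\int_{\mathbf{R}} \chi_{[a,b]}(x)\,e^{-itx}\,dx
  = \int_a^b e^{-itx}\,dx
  = \frac{e^{-ita} - e^{-itb}}{it},
\]
% Both exponentials have absolute value 1, so
\[
\Bigl|\int_{\mathbf{R}} \chi_{[a,b]}(x)\,e^{-itx}\,dx\Bigr| \le \frac{2}{|t|}
  \longrightarrow 0 \qquad (t \to \infty).
\]
```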
19. (a) Let $\mathbf{R}^*$ be the multiplicative group of non-zero real numbers. Show that
the map
$$\psi \mapsto \int_{\mathbf{R}} \psi(t)\,\frac{dt}{|t|},$$
for $\psi$ a step function with respect to intervals not containing 0, defines a
positive Borel measure on $\mathbf{R}^*$. We denote this measure by $\mu^*$. Show that
a function $f$ is in $\mathscr{L}^1(\mu^*)$ if and only if $f(x)/|x|$ is in $\mathscr{L}^1(\mu)$, where $\mu$ is
Lebesgue measure, and that in this case,
$$\int_{\mathbf{R}^*} f\,d\mu^* = \int_{\mathbf{R} - \{0\}} f(x)|x|^{-1}\,dx.$$
(b) Show $\mu^*$ is invariant under multiplicative translations, and so is the
integral on $\mathbf{R}^*$ with respect to $\mu^*$. (Multiplicative translations are of type
$x \mapsto ax$ for $a \neq 0$.)
20. Not all sets are measurable. Consider the reals modulo the rational numbers,
and in each coset $x + \mathbf{Q}$, $x$ real, select an element $y$ such that $0 \leq y < 1$.
Show that the set consisting of all such elements cannot be Lebesgue measurable.
[Hint: Use the countable additivity to show that this set cannot be
measurable.]
21. Let $X$ be a measured space with finite measure $\mu(X)$. Let $f \in \mathscr{L}^1(\mu)$. Compute
the limit
$$\lim_{n \to \infty} \int_X |f(x)|^{1/n}\,d\mu(x).$$

22. Arbitrary products. Let $(X_n, \mathscr{M}_n, \mu_n)$ be a family of measured spaces such that
$\mu_n(X_n) = 1$ for almost all $n$ (meaning all but a finite number of $n$). Let
$$X = \prod X_n$$
be the product space. Let $\mathscr{M}$ be the $\sigma$-algebra generated by all sets of the
form
$$A = \prod A_n,$$
where $A_n$ is measurable in $X_n$, and $A_n = X_n$ for almost all $n$. Then $(X, \mathscr{M})$ is
a measurable space. A set $A$ as above is called decomposable.
(a) Show that there exists a unique measure $\mu$ on $(X, \mathscr{M})$ such that for every
decomposable set as above, we have
$$\mu(A) = \prod \mu_n(A_n).$$
(b) Let $f_n \in \mathscr{L}^1(\mu_n)$ and assume that $f_n$ is the characteristic function of $X_n$ for
almost all $n$. Show that the product function $f = \bigotimes f_n$ is in $\mathscr{L}^1(\mu)$, and
that
$$\int_X f\,d\mu = \prod \int_{X_n} f_n\,d\mu_n.$$
Note: Do this exercise first for finite products.

23. Dini's theorem. The dominated convergence theorem is useful in almost all
instances, but sometimes one wants a more delicate criterion for the interchange
of a limit and an integral. This may be provided by Dini's theorem,
as follows.
Dini's Theorem. Let $\{F_n\}$ be a sequence of functions on $[a, \infty)$ such that for
all $n$ we have $F_n \in \mathscr{L}^1([a, B])$ for all $B \geq a$. Let
$$f_n(x) = \int_a^x F_n.$$
Assume:
(a) The sequence $\{F_n\}$ converges pointwise to a function $F$, uniformly on each
finite interval $[a, B]$.
(b) The sequence $\{f_n\}$ converges uniformly on $[a, \infty)$.
(c) The improper integrals
$$\int_a^\infty F_n = \lim_{B \to \infty} \int_a^B F_n \quad\text{and}\quad \int_a^\infty F = \lim_{B \to \infty} \int_a^B F$$
exist.
Then
$$\lim_{n \to \infty} \int_a^\infty F_n = \int_a^\infty F.$$
Proof. Given $\epsilon$, we consider
$$(1)\qquad \int_a^\infty F_n - \int_a^\infty F = \int_a^B (F_n - F_m) + \int_a^B (F_m - F) + \left(\int_B^\infty F_n - \int_B^\infty F\right).$$
By assumption (b) there exists $n_0$ such that for all $m, n \geq n_0$ we have
$$\left|\int_a^B (F_n - F_m)\right| < \epsilon$$
for all $B$.
Let $n \geq n_0$. By assumption (c), there is $B(n)$ and $B(\infty)$ such that
$$\left|\int_B^\infty F_n\right| < \epsilon \ \text{ for } B \geq B(n) \quad\text{and}\quad \left|\int_B^\infty F\right| < \epsilon \ \text{ for } B \geq B(\infty).$$
We now pick $B = \max(B(n), B(\infty))$. Then the third integral on the right of
(1) is bounded by $2\epsilon$. Using (a) we select $m$ sufficiently large so that the
second integral in (1) is bounded by $\epsilon$. This concludes the proof.
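24. Caratheodory's criterion. Assume that $\mathscr{M}$ is the $\sigma$-algebra of all $\mu$-measurable
subsets of $X$ in the sense of §7. Let $f: X \to Y$ be a map of $X$ into a metric
space $Y$. Prove that $f$ is measurable if and only if for every subset $Z$ of $X$
and every two subsets $B$, $C$ of $Y$ satisfying
$$\operatorname{dist}(B, C) > 0,$$
we have
$$(*)\qquad \mu(Z \cap f^{-1}(B \cup C)) = \mu(Z \cap f^{-1}(B)) + \mu(Z \cap f^{-1}(C)).$$
[Hint: One direction is obvious. Conversely, assume (*). Let $A$ be closed in
$Y$. Let:
$$B_m = \{y \in Y : \tfrac{1}{m+1} \leq \operatorname{dist}(y, A) \leq \tfrac{1}{m}\},$$
$$C_m = \{y \in Y : \operatorname{dist}(y, A) > \tfrac{1}{m}\},$$
$$B'_m = \{y \in Y : 0 < \operatorname{dist}(y, A) \leq \tfrac{1}{m}\}.$$
Then $C_m \cup B'_m = \mathscr{C}A$. Prove that for any subset $Z$ of $X$ we have
$$\mu(Z \cap f^{-1}(A)) + \mu(Z \cap f^{-1}(C_m)) \leq \mu(Z).$$
Then show that
$$\lim_{m \to \infty} \mu(Z \cap f^{-1}(B'_m)) = 0$$
by considering the sums $\sum \mu(Z \cap f^{-1}(B_k))$ for $k$ even and $k$ odd, and applying
the hypothesis (*).]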
25. Let $X$ be a metric space and $\mathscr{F}$ a family of subsets of $X$ whose union covers
$X$. Let
$$\varphi: \mathscr{F} \to \mathbf{R} \cup \{\infty\}$$
be a non-negative function. For every $c > 0$ and $A \subset X$, let
$$\mu_c(A) = \inf_{\mathscr{G}} \sum_{F \in \mathscr{G}} \varphi(F),$$
where $\mathscr{G}$ is a family of elements of $\mathscr{F}$ such that:
(i) The union of the elements of $\mathscr{G}$ covers $A$.
(ii) If $F \in \mathscr{G}$, then $\operatorname{diam} F \leq c$.
(a) Prove that $\mu_c$ is an outer measure on $X$. Define $\mu(A) = \lim_{c \to 0} \mu_c(A)$.
(b) Prove that $\mu$ is an outer measure, called the Caratheodory measure
associated with $(\varphi, \mathscr{F})$.
26. Let $\mu$ be the Caratheodory measure associated with $(\varphi, \mathscr{F})$.
(a) Prove that the open sets of $X$ are $\mu$-measurable. (Use Caratheodory's
criterion applied to the identity map.)
(b) If all elements of $\mathscr{F}$ are Borel sets, prove that for any subset $A$ of $X$ we
have
$$\mu(A) = \inf \mu(B),$$
the inf being taken for all Borel sets $B$ containing $A$.

Examples. For Fe Rift let
where VIft is the volume of the m-dimensional ball in Rift, and let ~ be the
family of all open sets of Rift. The associated Caratheodory measure is called
the m-dimensional Hausdorff measure .?fm .
CHAPTER VII
Duality and Representation Theorems
Throughout this chapter $(X, \mathscr{M}, \mu)$ is a measured space.
VII, §1. THE HILBERT SPACE $L^2(\mu)$
Consider first complex valued functions. We let $\mathscr{L}^2(\mu)$ be the set of all
functions $f$ on $X$ that are limits almost everywhere of a sequence of step
functions (i.e. $\mu$-measurable), and such that $|f|^2$ lies in $\mathscr{L}^1$. Thus
$$\int_X |f|^2\,d\mu < \infty.$$
If we wish to consider a Hilbert space $E$ instead of $\mathbf{C}$, we let $\mathscr{L}^2(\mu, E)$
be the set of all maps $f: X \to E$ that are limits almost everywhere of a
sequence of step maps, and such that $|f|^2$ lies in $\mathscr{L}^1$. There is no change
from the preceding definition. In this case,
$$|f|^2 = \langle f, f \rangle,$$
the value of $\langle f, g \rangle$ at $x$ being given by the scalar product $\langle f(x), g(x) \rangle$
in $E$.
The reader interested only in the complex numbers can take $E = \mathbf{C}$
and the product to be $\langle f, g \rangle = f\bar{g}$, where the bar denotes complex
conjugation. Not a single proof, however, will be made shorter or simpler.
Theorem 1.1. The set $\mathscr{L}^2(\mu)$ is a vector space. If $f, g \in \mathscr{L}^2(\mu)$, then
$\langle f, g \rangle$ is in $\mathscr{L}^1(\mu)$, and the map
$$(f, g) \mapsto \int_X \langle f, g \rangle\,d\mu$$
is a positive hermitian product on $\mathscr{L}^2(\mu)$ (not necessarily positive
definite).
Proof. The map $\langle f, g \rangle$ is obviously a limit almost everywhere of step
maps, and we have
$$|\langle f, g \rangle| \leq |f||g| \leq \tfrac{1}{2}(|f|^2 + |g|^2).$$
Thus the absolute value is bounded by a function in $\mathscr{L}^1$, whence by
Corollary 5.9 of the dominated convergence theorem, Chapter VI, it
follows that $\langle f, g \rangle$ is in $\mathscr{L}^1$. As for the fact that $\mathscr{L}^2$ is a vector space,
let $f, g \in \mathscr{L}^2$. We have
$$|f + g|^2 \leq (|f| + |g|)^2 \leq 2(|f|^2 + |g|^2),$$
whence the same reference shows that $f + g \in \mathscr{L}^2$. It is clear that if $a$ is
a number, then $af$ is in $\mathscr{L}^2$, so $\mathscr{L}^2$ is a vector space. The last assertion
is now obvious.
We denote our hermitian product by
$$\langle f, g \rangle_\mu = \int_X \langle f, g \rangle\,d\mu.$$
We have the usual properties, like the Schwarz inequality. The $L^2$-seminorm
is defined by
$$\|f\|_2 = \langle f, f \rangle_\mu^{1/2} = \left(\int_X |f|^2\,d\mu\right)^{1/2}.$$
Corollary 1.2. We have $\|f\|_2 = 0$ if and only if $f$ is equal to 0 almost
everywhere.
Proof. This is really a statement about $|f|^2$, which is in $\mathscr{L}^1$, and we
know this result already.
Corollary 1.3. If $X$ has finite measure and $f \in \mathscr{L}^2(\mu)$, then actually $f$ is
in $\mathscr{L}^1(\mu)$ and $\|f\|_1 \leq \|f\|_2 \|1_X\|_2$.
Proof. We apply the theorem and the Schwarz inequality to the pair
$|f|$ and $1_X$ (the constant 1 on $X$).
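The one-line proof of Corollary 1.3 can be written out as a chain of inequalities. The counterexample that follows, showing that finiteness of $\mu(X)$ cannot be dropped, is our own illustration ($X = [1, \infty)$ with Lebesgue measure and $f(x) = 1/x$) and is not taken from the text.

```latex
% Schwarz inequality applied to |f| and 1_X (the constant function 1 on X):
\[
\|f\|_1 = \int_X |f| \cdot 1_X \, d\mu
  \le \Bigl(\int_X |f|^2\,d\mu\Bigr)^{1/2} \Bigl(\int_X 1_X^2\,d\mu\Bigr)^{1/2}
  = \|f\|_2 \,\mu(X)^{1/2}.
\]
% Illustrative counterexample when \mu(X) = \infty (an assumption of ours):
% X = [1,\infty) with Lebesgue measure and f(x) = 1/x.
\[
\int_1^\infty \frac{dx}{x^2} = 1 < \infty,
\qquad
\int_1^\infty \frac{dx}{x} = \infty,
\]
% so this f lies in \mathscr{L}^2 but not in \mathscr{L}^1.
```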
We can form the space $L^2(\mu)$ of equivalence classes of maps in $\mathscr{L}^2$,
differing only on a set of measure 0. We see that the hermitian product is
positive definite on $L^2(\mu)$.
Theorem 1.4. Let $\{f_n\}$ be an $L^2$-Cauchy sequence in $\mathscr{L}^2$. Then there
exists some $f$ in $\mathscr{L}^2$ having the following properties:
(i) The sequence $\{f_n\}$ is $L^2$-convergent to $f$, so that $\mathscr{L}^2$ is complete,
and $L^2(\mu)$ is a Hilbert space.
There exists a subsequence of $\{f_n\}$ having the following properties:
(ii) This subsequence converges almost everywhere to $f$.
(iii) Given $\epsilon$, there exists a set $Z$ with $\mu(Z) < \epsilon$ such that the
convergence of this subsequence is uniform on the complement of $Z$.
Proof. As before, we really prove these statements in reverse order.
We may assume all $f_n$ measurable. Taking a subsequence if necessary we
may assume that for $m \geq n$ we have
$$\|f_n - f_m\|_2^2 < \frac{1}{2^{2n}}.$$
We let $Y_n$ be the set of $x \in X$ such that
$$|f_{n+1}(x) - f_n(x)|^2 \geq \frac{1}{2^n}.$$
Then $Y_n$ has finite measure, and the proof of Lemma 3.1 in the preceding
chapter goes through as before. We have $\mu(Y_n) \leq 1/2^n$, and we let
$$Z_n = Y_n \cup Y_{n+1} \cup \cdots.$$
If $x \notin Z_n$, then for $k \geq n$ we have
$$|f_{k+1}(x) - f_k(x)|^2 < \frac{1}{2^k}$$
so that the series
$$\sum_{k=1}^\infty (f_{k+1}(x) - f_k(x))$$
converges uniformly and absolutely on the complement of $Z_n$ for each $n$,
whence pointwise and absolutely on the complement of $Z$ (intersection of
all $Z_n$). This already proves (ii) and (iii).
Let $f(x)$ be the limit of $f_n(x)$ as $n \to \infty$ if $x \notin Z$, and let $f(x) = 0$
if $x \in Z$. There remains to prove that $f$ is in $\mathscr{L}^2$ and that $\{f_n\}$ is
$L^2$-convergent to $f$. The expression
$$\|f_n - f_m\|_2^2 = \int_X |f_n - f_m|^2\,d\mu$$
is the $L^1$-seminorm of $|f_n - f_m|^2$. We fix $m$ and take the limit as $n \to \infty$.
We can apply Fatou's lemma, and conclude that $|f - f_m|^2$ is in $\mathscr{L}^1$,
whence $f - f_m$ is in $\mathscr{L}^2$. Since $\mathscr{L}^2$ is a vector space, and $f_m \in \mathscr{L}^2$, we
conclude that $f \in \mathscr{L}^2$. Fatou's lemma also shows that
$$\|f - f_m\|_2^2 \leq \liminf_{n \to \infty} \|f_n - f_m\|_2^2,$$
so that for large $m$ we see that $\|f - f_m\|_2$ is small, i.e. the sequence $\{f_m\}$
is $L^2$-convergent to $f$. This proves our theorem.
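The bound $\mu(Y_n) \leq 1/2^n$ used in the proof can be spelled out as follows. This is a sketch under two assumptions about the displays in the proof, namely that $Y_n$ is the set where $|f_{n+1} - f_n|^2 \geq 1/2^n$ and that the subsequence satisfies $\|f_n - f_m\|_2^2 < 1/2^{2n}$ for $m \geq n$; it is the usual Chebyshev-type estimate rather than a quotation of Lemma 3.1.

```latex
% On Y_n the integrand |f_{n+1} - f_n|^2 is at least 2^{-n}, so
\[
\frac{1}{2^n}\,\mu(Y_n)
  \le \int_{Y_n} |f_{n+1} - f_n|^2\,d\mu
  \le \|f_{n+1} - f_n\|_2^2
  < \frac{1}{2^{2n}},
\qquad\text{hence}\qquad
\mu(Y_n) < \frac{1}{2^n}.
\]
% Consequently the tail sets Z_n = Y_n \cup Y_{n+1} \cup \cdots satisfy
\[
\mu(Z_n) \le \sum_{k \ge n} \frac{1}{2^k} = \frac{1}{2^{n-1}},
\]
% which tends to 0 and supplies the small exceptional set needed for (iii).
```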
Corollary 1.5. If $\{f_n\}$ is an $L^2$-Cauchy sequence in $\mathscr{L}^2$ and if $\{f_n\}$
converges almost everywhere to a map $f$, then $f$ is in $\mathscr{L}^2$ and $\{f_n\}$ is
also $L^2$-convergent to $f$.
Proof. Obvious.
Theorem 1.6 (Dominated Convergence Theorem for $L^2$). Let $\{f_n\}$ be
a sequence in $\mathscr{L}^2$ which converges pointwise almost everywhere to $f$.
Assume that there exists $g \in \mathscr{L}^2(\mu, \mathbf{R})$ such that $g \geq 0$ and such that
$|f_n| \leq g$. Then $f$ is in $\mathscr{L}^2$ and $\{f_n\}$ is $L^2$-convergent to $f$.
Proof. The proof is essentially the same as in the $\mathscr{L}^1$ case. For each
positive integer $k$ let
$$g_k = \sup_{m, n \geq k} |f_n - f_m|.$$
Then $\{g_k\}$ is a decreasing sequence of real valued functions, and for $m$,
$n \geq k$ we have $|f_n - f_m| \leq 2g$. Therefore by Corollary 5.9 of the monotone
convergence theorem (Chapter VI) it follows that
$$g_k^2 = \sup_{m, n \geq k} |f_n - f_m|^2$$
is in $\mathscr{L}^1$. By the monotone convergence theorem and the hypothesis, the
sequence $\{g_k\}$ converges almost everywhere to 0. Hence $\{f_n\}$ is actually
an $L^2$-Cauchy sequence, and we can apply Corollary 1.5 to conclude the
proof.
Corollary 1.7. The step maps are dense in $\mathscr{L}^2$.
Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging pointwise to
an element $f$ of $\mathscr{L}^2$. Then $f$ is measurable. Define
$$\psi_n(x) = \varphi_n(x) \quad\text{if } |\varphi_n(x)| \leq 2|f(x)|,$$
$$\psi_n(x) = 0 \quad\text{if } |\varphi_n(x)| > 2|f(x)|.$$
Then $\psi_n$ is a step map for each $n$, and the sequence $\{\psi_n\}$ converges
pointwise to $f$. Furthermore, $|\psi_n| \leq 2|f|$ for all $n$. The theorem shows
that $\{\psi_n\}$ is $L^2$-convergent to $f$. Any element of $\mathscr{L}^2$ is equivalent to one
for which you can find a sequence $\{\varphi_n\}$ as above. Hence our corollary is
proved.
VII, §2. DUALITY BETWEEN $L^1(\mu)$ AND $L^\infty(\mu)$
As Corollary 5.11 of the dominated convergence theorem, in Chapter VI
we found that if $f \in \mathscr{L}^1$ and $g$ is a bounded $\mu$-measurable function, then
$fg$ is in $\mathscr{L}^1$. We now investigate this property more closely. Half of
what we do in this section will be valid in Hilbert space without changing
the proofs at all, but again the reader who wishes to understand everything
in terms of complex or real valued functions is welcome to do so
throughout.
We could put the sup norm on the space of step maps, but it is
convenient to adjust this norm in terms of the given measure $\mu$, and thus
define what is called the essential sup, as well as the completion of the
space of step maps under this seminorm. We define $\mathscr{L}^\infty(\mu)$ to be the
vector space of maps $f$ such that there exists a bounded $\mu$-measurable $g$
equal to $f$ almost everywhere. Properties relating to the integral with
respect to $\mu$ hold for equivalence classes of such maps. Therefore, if
$f \in \mathscr{L}^\infty(\mu)$ it is natural to define its essential sup to be
$$\operatorname{ess\,sup}(f) = \|f\|_\infty = \inf_g \|g\|,$$
where $\|\ \|$ is the sup norm, and the inf is taken over all bounded
$\mu$-measurable maps $g$ equal to $f$ almost everywhere. Alternatively, for
each $c \geq 0$ let $S_c$ be the set of all $x$ such that $|f(x)| \geq c$. We could have
defined $\|f\|_\infty$ by the condition:
$$\|f\|_\infty = \inf \text{ of the set of all numbers } c \text{ such that } \mu(S_c) = 0.$$
The equivalence between the two conditions is immediately verified. (For
instance if $c > b$ and $\mu(S_c) > 0$, then $|f(x)| \geq c$ for all $x$ in a set of
measure $> 0$, so that for all $g$ equivalent to $f$ we must have $\|g\| \geq c$
also. This proves that $c \leq b$. The reverse inequality is equally clear.) We
also see at once that $\|\ \|_\infty$ is a seminorm on $\mathscr{L}^\infty(\mu)$. By definition, the
set of $x$ such that $|f(x)| > \|f\|_\infty$ has measure 0.
If $f \in \mathscr{L}^\infty(\mu)$, it is clear that we have $\|f\|_\infty = 0$ if and only if $f$ is equal
to 0 almost everywhere. Consequently, we can form the space $L^\infty(\mu)$ of
equivalence classes of elements of $\mathscr{L}^\infty(\mu)$, and we shall see in a moment
that $L^\infty(\mu)$ is a Banach space.
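A small example shows how the essential sup can differ from the ordinary sup. The function below is our own choice for illustration, on $X = [0, 1]$ with Lebesgue measure; it reads $S_c$ as the set where $|f(x)| \geq c$.

```latex
% Assumed example: X = [0,1] with Lebesgue measure \mu,
%   f(0) = 5 and f(x) = x for 0 < x \le 1.
\[
\sup_{x} |f(x)| = 5, \qquad
S_c = \{0\} \cup [c,1] \ \text{ for } 0 \le c < 1, \qquad
S_c \subset \{0,1\} \ \text{ for } 1 \le c \le 5 .
\]
% Hence \mu(S_c) = 1 - c > 0 for c < 1, while \mu(S_c) = 0 for every c \ge 1, so
\[
\|f\|_\infty = \inf\{\, c : \mu(S_c) = 0 \,\} = 1 .
\]
% The same value comes from the first definition: g(x) = x on all of [0,1] is bounded,
% measurable, equal to f almost everywhere, and has sup norm 1.
```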
Theorem 2.1.
(i) The space $\mathscr{L}^\infty(\mu)$ is complete. If $\{f_n\}$ is an $L^\infty$-Cauchy sequence in
$\mathscr{L}^\infty(\mu)$, then there exists a set $Z$ of measure 0 such that the
convergence of $\{f_n\}$ is uniform on the complement of $Z$.
(ii) If $E$ is finite dimensional, then the simple maps are dense in
$L^\infty(\mu, E)$.
(iii) If $\mu(X)$ is finite, then given $\epsilon$ and $f \in \mathscr{L}^\infty(\mu)$, there exists a step
map $\varphi$ and a set $Z$ with $\mu(Z) < \epsilon$ such that
$$|f - \varphi| < \epsilon \quad\text{on the complement of } Z.$$
Proof. To prove the first statement, let $\{f_n\}$ be an $L^\infty$-Cauchy sequence
in $\mathscr{L}^\infty(\mu)$. Let $Z$ be the set of all $x$ such that we have
$$|f_n(x)| > \|f_n\|_\infty$$
or
$$|f_n(x) - f_m(x)| > \|f_n - f_m\|_\infty$$
for some $n$, or some pair $m$, $n$. Then $Z$ has measure 0, and the convergence
of the sequence is uniform on the complement of $Z$. We let $f$ have
value 0 in $Z$ and be the uniform limit of the sequence $\{f_n\}$ on the
complement of $Z$. Then $f \in \mathscr{L}^\infty(\mu)$, and clearly is the $L^\infty$ limit of $\{f_n\}$.
Now assume that $E$ is finite dimensional, or say equal to the complex
numbers for concreteness. Let $f \in \mathscr{L}^\infty(\mu, \mathbf{C})$. After replacing $f$ by an
equivalent function, we may assume that $f$ is measurable and bounded.
Say the values of $f$ are contained in a square. We cut up the square into
small $\epsilon$-squares which are disjoint, and take their inverse images in $X$.
These give a partition of $X$ and we can define a simple function with
respect to this partition by giving the function any one of its values in a
given square. To get our small squares, let, say, $e_1$, $e_2$ be the standard
unit vectors in $\mathbf{C} = \mathbf{R}^2$, and let $S$ be the square
$$S = \{t e_1 + u e_2\}$$
with $0 \leq t < \epsilon$ and $0 \leq u < \epsilon$. The translates
$$S_{m,n} = S + m\epsilon e_1 + n\epsilon e_2$$
with integers $m$, $n$, are disjoint. If $N$ is large and we take
$$-N \leq n \leq N \quad\text{and}\quad -N \leq m \leq N,$$
then our small squares $S_{m,n}$ cover the image of $f$ as desired. The argument
also works in any finite dimensional space, taking unit vectors
$e_1, \ldots, e_p$ in $\mathbf{R}^p$.
Finally, for the third part of the theorem, if $\mu(X)$ is finite, then any
element of $\mathscr{L}^\infty(\mu)$ is in $\mathscr{L}^1(\mu)$, and our assertion follows from the fact
that elements of $\mathscr{L}^1$ are $L^1$-limits of step maps, together with the
fundamental lemma of integration, or Theorem 5.2 of Chapter VI.
Remark. We phrased the density of (ii) in terms of simple maps.
Recall that step maps are assumed to be equal to 0 outside a set of finite
measure. Thus the step maps cannot possibly be dense in $L^\infty(\mu)$ if $\mu(X)$ is
infinite, since the constant function 1 cannot be uniformly approximated
by step functions in that case. If we restrict our attention to the case
when $\mu(X)$ is finite, then step maps and simple maps coincide. In
applications this suffices, since one deals mostly with $\sigma$-finite measures, and
certain problems can be reduced to the case of finite measures. The
density statement of (iii) is also useful in the infinite dimensional case.
Consider now the case of functions (complex valued, say). We have a
bilinear map
$$\mathscr{L}^1(\mu) \times \mathscr{L}^\infty(\mu) \to \mathbf{C}$$
given by
$$(f, g) \mapsto \int_X fg\,d\mu = [f, g]_\mu.$$
This arises from Corollary 5.11 of the dominated convergence theorem
(Chapter VI). It is clear that the value of this map on $(f, g)$ depends only
on the equivalence class of $f$ and $g$, respectively, and thus defines a
bilinear map
$$L^1(\mu) \times L^\infty(\mu) \to \mathbf{C}.$$
Without changing anything above except the notation slightly, if we
write $\langle f, g \rangle$ instead of $fg$, and take the values of $f$, $g$ in a Hilbert space
$E$, then what we said holds, except that as usual, the map is not bilinear
but sesquilinear (i.e. linear in its first variable, but anti-linear in its
second variable, that is a complex conjugation occurs when we multiply
$g$ by a constant). For the convenience of the reader, we shall state our
results first for functions, and then for the Hilbert case. There will be
absolutely no difference in the proofs except for this change between $fg$
and $\langle f, g \rangle$.
Quite generally, let
$$\tau: F \times G \to H$$
be a bilinear map of vector spaces into another vector space. If $v \in F$
and $w \in G$ we write $v \perp w$ and say that $v$ is orthogonal to $w$ if $\tau(v, w) = 0$.
We define the kernel on the left to consist of all $v \in F$ such that $v \perp G$,
i.e. $v$ is orthogonal to all elements of $G$, and similarly we define the
kernel on the right. These kernels are clearly subspaces of $F$ and $G$,
respectively. We say that the bilinear map is non-degenerate if the kernels
on the left and right are equal to 0. Suppose that $F$, $G$ are normed
vector spaces (or semi-normed) and that the bilinear map is continuous.
In applications, the condition
$$|\tau(v, w)| \leq |v||w|$$
is even satisfied. Then we obtain corresponding mappings of $F$ and $G$
into each other's dual spaces, namely each $v \in F$ gives rise to the
functional $\lambda_v \in G'$ given by
$$\lambda_v(w) = \tau(v, w).$$
Similarly, each $w \in G$ gives rise to the functional $\lambda_w$ in $F'$ given by
$$\lambda_w(v) = \tau(v, w).$$
We investigate this situation when we deal with the spaces $L^1(\mu)$ and
$L^\infty(\mu)$.
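Before turning to $L^1(\mu)$ and $L^\infty(\mu)$, the general notions above can be tested on the most familiar pairing. The example is ours, not the text's: $F = G = \mathbf{R}^n$ with the euclidean norm, $H = \mathbf{R}$, and $\tau$ the dot product.

```latex
% Assumed example: F = G = \mathbf{R}^n, H = \mathbf{R}, \tau(v,w) = v \cdot w.
\[
|\tau(v,w)| = |v \cdot w| \le |v|\,|w| \quad\text{(Schwarz inequality)},
\qquad
\lambda_v(w) = v \cdot w .
\]
% Non-degeneracy: if v \perp G, then in particular \tau(v,v) = |v|^2 = 0, so v = 0;
% the kernel on the right is 0 by symmetry.
% The map v \mapsto \lambda_v is norm-preserving, since
\[
|\lambda_v| = \sup_{|w| \le 1} |v \cdot w| = |v|,
\]
% the sup being attained at w = v/|v| when v \neq 0.
```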
Theorem 2.2. Let $\mu$ be $\sigma$-finite. The kernels on the right and left of the
bilinear map
$$L^1(\mu) \times L^\infty(\mu) \to \mathbf{C}$$
are 0. This map satisfies the product inequality
$$|[f, g]_\mu| \leq \|f\|_1 \|g\|_\infty.$$
The maps $g \mapsto \lambda_g$ and $f \mapsto \lambda_f$ for $g \in \mathscr{L}^\infty(\mu)$ and $f \in \mathscr{L}^1(\mu)$ induce
norm-preserving linear maps of $L^\infty(\mu)$ and $L^1(\mu)$, respectively, into the other's
dual space. In the case of $L^\infty(\mu)$, the map $g \mapsto \lambda_g$ is a norm-preserving
isomorphism between $L^\infty(\mu)$ and the dual space of $L^1(\mu)$, i.e. the map is
surjective.
Proof. Let $f \in \mathscr{L}^1(\mu)$ be orthogonal to $\mathscr{L}^\infty(\mu)$. Then $f$ is 0 almost
everywhere by Corollary 5.19 of Chapter VI (the averaging theorem).
The other side works similarly as follows. If $g$ is bounded $\mu$-measurable,
then for every measurable subset $A$ of finite measure, the map $g_A = g\chi_A$ is
in $\mathscr{L}^1$. We can therefore apply the same argument, and see that $g_A$ is 0
almost everywhere, whence $g$ is 0 almost everywhere since $\mu$ is assumed
$\sigma$-finite.
Let $C$ be a bound for $g$. Then
$$\|fg\|_1 = \int_X |fg|\,d\mu \leq C \int_X |f|\,d\mu = C\|f\|_1.$$
This implies our inequality
$$|[f, g]_\mu| \leq \|f\|_1 \|g\|_\infty$$
and shows that $|\lambda_g| \leq \|g\|_\infty$. For the reverse, let $b = |\lambda_g|$. For each
subset $A$ of finite measure, we have
$$\left|\int_A g\,d\mu\right| = |\lambda_g(\chi_A)| \leq b\,\mu(A).$$
By Corollary 5.18 of the averaging theorem of Chapter VI, we conclude
that $|g(x)| \leq b$ for almost all $x$, whence $\|g\|_\infty \leq b$. Therefore $|\lambda_g| = \|g\|_\infty$.
Now on the other side, let $f \in \mathscr{L}^1(\mu)$, and define $1/|f|$ to be the map
having value $1/|f(x)|$ if $f(x) \neq 0$ and 0 if $f(x) = 0$. Then $1/|f|$ is
$\mu$-measurable, and $\bar{f}/|f|$ is $\mu$-measurable and bounded. Let $g = \bar{f}/|f|$. Then
from $\|g\|_\infty = 1$ or 0, we get
$$\|f\|_1 = \int_X fg\,d\mu = \lambda_f(g) \leq |\lambda_f|.$$
This proves the reverse inequality, whence $\|f\|_1 = |\lambda_f|$.
This proves all our statements except the last, that $L^\infty(\mu)$ provides
us with all functionals on $L^1(\mu)$. To see this, we give the argument of
von Neumann (originally applied to the Radon-Nikodym theorem, see
below). Assume first that $X$ has finite measure. Let $\lambda: L^1(\mu) \to \mathbf{C}$ be a
functional, and let $b$ be its norm so that we have
$$|\lambda(f)| \leq b\|f\|_1$$
for all $f \in \mathscr{L}^1(\mu)$. The functional $\lambda$ can actually be viewed as defined on
$\mathscr{L}^2(\mu)$ because any map $g$ in $\mathscr{L}^2$ on a set of finite measure $X$ is in $\mathscr{L}^1$
(use the Schwarz inequality on the pair $|g|$ and the function $1_X$). Thus
we obtain
$$|\lambda(f)| \leq b\|f\|_1 \leq b\|f\|_2 \|1_X\|_2.$$
This shows that $\lambda$ is continuous with respect to the $L^2$-seminorm. Since
$L^2(\mu)$ is a Hilbert space, there exists $g \in \mathscr{L}^2(\mu)$ such that we have
$$\lambda(f) = \int_X fg\,d\mu$$
for all step maps $f$. For any measurable set $A$ of finite measure, we then
obtain
$$\left|\int_A g\,d\mu\right| = |\lambda(\chi_A)| \leq b\,\mu(A).$$
By Corollary 5.18 of the averaging theorem (Theorem 5.15, Chapter VI),
it follows that $|g(x)| \leq b$ for almost all $x$, whence $g$ is in fact in $\mathscr{L}^\infty(\mu)$
and $\|g\|_\infty \leq b$. Since
$$\lambda(f) = [f, g]_\mu$$
for all step maps $f$, this same relation must hold true for all $f \in \mathscr{L}^1(\mu)$
because the step maps are dense in $\mathscr{L}^1$. This proves our last assertion
when $X$ has finite measure.
The general case when $\mu$ is $\sigma$-finite follows easily. We write $X$ as a
disjoint union of sets of finite measure $X_k$ $(k = 1, 2, \ldots)$. Let $f \in \mathscr{L}^1(\mu)$,
and let $f_k = f\chi_{X_k}$ be the same as $f$ on $X_k$ and 0 outside $X_k$. Then the
series
$$f = \sum_{k=1}^\infty f_k$$
is $L^1$-convergent to $f$, say by the dominated convergence theorem, and
therefore by the continuity of $\lambda$ we have
$$\lambda(f) = \sum_{k=1}^\infty \lambda(f_k).$$
For each $k$ there exists a $\mu$-measurable map $g_k$ on $X_k$, bounded by $b$, and
0 outside $X_k$ such that
$$\lambda(f_k) = \int_{X_k} f_k g_k\,d\mu.$$
We let
$$g = \sum_{k=1}^\infty g_k$$
(pointwise). Then $g$ is bounded by $b$, $\mu$-measurable, and it is clear that
$\lambda = \lambda_g$, thus concluding the proof of our theorem.
We now repeat the statement of Theorem 2.2 for the Hilbert case.
Theorem 2.3 (Hilbert Case). Assume that $\mu$ is $\sigma$-finite. Let $E$ be a
Hilbert space. We have a sesquilinear map
$$L^1(\mu, E) \times L^\infty(\mu, E) \to \mathbf{C}$$
defined for $f \in \mathscr{L}^1(\mu, E)$ and $g \in \mathscr{L}^\infty(\mu, E)$ by
$$(f, g) \mapsto \langle f, g \rangle_\mu = \int_X \langle f, g \rangle\,d\mu.$$
The kernels on both sides are 0. The map $g \mapsto \lambda_g$ induces a norm-preserving
linear map of $L^\infty(\mu, E)$ onto the dual of $L^1(\mu, E)$ (so the map
is surjective), and the map $f \mapsto \lambda_f$ induces a norm-preserving antilinear
map of $L^1(\mu, E)$ into the dual of $L^\infty(\mu, E)$ (not necessarily surjective).
Proof. Exactly the same, except that where, for instance, the proof of
Theorem 2.2 pairs $\chi_A$ with $g$, we now have to write $\langle e\chi_A, g \rangle$ for some
unit vector $e \in E$, and apply Corollary 5.20 instead of Corollary 5.18 of
the averaging theorem.
We wish to characterize those elements of the dual of $L^\infty(\mu)$ which can
be represented by some element in $\mathscr{L}^1(\mu)$. Over the complex numbers,
the classical Radon-Nikodym theorem achieves this purpose, and can be
viewed as stating that if a functional on $L^\infty(\mu)$ can be represented by a
finite measure, then it already can be represented by a function. We first
make some comments in this case.
Let $\nu$ be a positive measure on $\mathscr{M}$. We say that $\nu$ is absolutely
continuous with respect to $\mu$ or $\mu$-absolutely continuous, if we have $\nu(A) = 0$
whenever $\mu(A) = 0$. (Cf. Exercise 1.) We say that a functional $\lambda$ on
$L^\infty(\mu)$ can be represented by a positive measure $\nu$ if the functional has the
form
$$\lambda: g \mapsto \int_X g\,d\nu.$$
We then write $\lambda = d\nu$. If the functional can be so represented, then $\nu$ is
necessarily absolutely continuous with respect to $\mu$, because the functional
vanishes on characteristic functions of measurable sets $A$ such that
$\mu(A) = 0$. The Radon-Nikodym theorem in its classical form states:
If $\nu$ is a finite positive measure on $\mathscr{M}$ which is $\mu$-absolutely continuous,
then there exists some $f \in \mathscr{L}^1(\mu)$ such that for all $A \in \mathscr{M}$ we have
$$\nu(A) = \int_A f\,d\mu.$$
This measure is conveniently denoted by $\mu_f$.
A functional $d\nu$ can be viewed as a functional on various spaces (e.g.
spaces of step maps, $L^\infty(\mu)$, etc.). We shall always make it explicit on
which space we intend this functional to be. We observe that a functional
on $L^\infty(\mu)$ represented by a map in $\mathscr{L}^1$ is determined by its values
on step maps. Thus actually we can limit our attention to step maps.
But it is reasonable also to ask for functionals on simple maps, without
any reference to the measure $\mu$, and with continuity with respect to the
sup norm. For this purpose, we need another definition.
A positive measure $\nu$ on $\mathscr{M}$ is said to be concentrated or carried in a
measurable set $A$ if $\nu(Y) = 0$ for all $Y$ in the complement of $A$. If $\nu_1$, $\nu_2$
are two positive measures on $\mathscr{M}$ we say that they are orthogonal or
singular to each other and write $\nu_1 \perp \nu_2$ if there exists a decomposition
$$X = A \cup B$$
of $X$ into a disjoint union of measurable sets such that $\nu_1$ is concentrated
in $A$ and $\nu_2$ is concentrated in $B$.
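Let $\mathcal{L}^\infty(\mathcal{M}, \mathbf{C})$ denote the space of bounded
measurable functions on $X$. We make no reference to any measure here at
all, and we take the sup norm on this space. Let $\nu$ be a finite positive
measure on $\mathcal{M}$. This means that $\nu(X) < \infty$. Then $\nu$ gives rise
to a functional on $\mathcal{L}^\infty(\mathcal{M}, \mathbf{C})$ by the map
$$g \mapsto \int_X g \, d\nu,$$
satisfying the bound
$$\left| \int_X g \, d\nu \right| \le \|g\| \, \nu(X).$$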
Theorem 2.4 (Radon-Nikodym and Lebesgue). Assume that $\mu$ is
$\sigma$-finite, and let $\nu$ be a finite positive measure on $\mathcal{M}$. Then
there exists a unique decomposition
$$\nu = \nu_a + \nu_s$$
as a sum of positive measures, such that $\nu_a$ is absolutely continuous
with respect to $\mu$, and $\nu_s$ is singular with respect to $\mu$. We have
$\nu_a \perp \nu_s$. Finally, if $\nu$ is absolutely continuous with respect to
$\mu$, then there exists an element $f \in \mathcal{L}^1(\mu)$ such that
$\nu = \mu_f$, and $f$ is uniquely determined up to equivalence. Furthermore,
the functional on $L^\infty(\mu)$ represented by $\nu$ is then also represented
by $f$, i.e. we have $d\nu = f \, d\mu$ on $L^\infty(\mu)$.
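Proof (von Neumann). The uniqueness is essentially obvious. If we
can write
$$\nu = \mu_{f_1} + \nu_s = \mu_{f_2} + \nu_s'$$
with $f_1, f_2 \in \mathcal{L}^1(\mu)$, and $\nu_s, \nu_s'$ singular with respect to
$\mu$, then
$$\mu_{f_1 - f_2} = \nu_s' - \nu_s,$$
whence $f_1 - f_2$ is 0 almost everywhere, and $\nu_s' = \nu_s$.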

Now for existence, we assume first that $\mu(X)$ is finite. Then $\mu + \nu$
is a finite positive measure, and we consider the integral with respect to
$\mu + \nu$ on $\mathcal{L}^\infty(\mathcal{M})$, i.e. on the bounded measurable
functions. Since all sets have finite measure, we don't need to specify
that we deal with step functions vanishing outside a set of finite measure.
Using the Schwarz inequality with respect to $L^2(\mu + \nu)$, we have for any
step function $\varphi$:
$$\left| \int_X \varphi \, d\nu \right| \le \int_X |\varphi| \, d\nu
\le \int_X |\varphi| \, d(\mu + \nu) \le \|\varphi\|_2 \, \|1_X\|_2,$$
where $1_X$ is the function equal to 1 on $X$. Hence the map
$$\varphi \mapsto \int_X \varphi \, d\nu$$
is $L^2(\mu + \nu)$-continuous on step functions, whence it extends uniquely
to a functional on $L^2(\mu + \nu)$. By the $L^2$ duality, there exists a
function $h \in \mathcal{L}^2(\mu + \nu)$ (uniquely determined up to equivalence)
such that for all step functions $\varphi$ we have
$$\int_X \varphi \, d\nu = \int_X \varphi h \, d(\mu + \nu).$$
Letting $\varphi$ be the characteristic function of a measurable set $A$, we
find
$$\int_A h \, d(\mu + \nu) = \nu(A) \le (\mu + \nu)(A).$$

By the averaging theorem (Theorem 5.15 of Chapter VI) we may assume
without loss of generality that $0 \le h \le 1$, and setting $h$ equal to 0 on
a set of $(\mu + \nu)$-measure 0, we may also assume that $h$ is measurable.
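For step functions $\varphi$ we have
$$\int_X \varphi \, d(\mu + \nu) = \int_X \varphi \, d\mu + \int_X \varphi \, d\nu,$$
whence the same holds if $\varphi$ is any bounded measurable function, by
the dominated convergence theorem. Consequently for
$g \in \mathcal{L}^\infty(\mathcal{M})$ we have
$$(1) \qquad \int_X g \, d\nu = \int_X g h \, d\mu + \int_X g h \, d\nu.$$
Let $Y$ be the set of all $x \in X$ such that $0 \le h(x) < 1$, and let $Z$ be
the set of all $x \in X$ such that $h(x) = 1$. First let $g$ be the
characteristic function of $Z$. From (1) we see that $\mu(Z) = 0$. Let $g$ be
arbitrary (bounded measurable) and iterate (1). By induction we obtain
$$(2) \qquad \int_X g \, d\nu = \int_X g (h + h^2 + \cdots + h^n) \, d\mu
+ \int_X g h^n \, d\nu.$$
Take the limit as $n \to \infty$. The dominated convergence theorem shows
that
$$\int_X g h^n \, d\nu \to \int_Z g \, d\nu \quad \text{as } n \to \infty.$$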
Let
$$f = \frac{h}{1 - h}$$
on $Y$ and 0 outside $Y$. Since $\mu(Z) = 0$, the first integral on the right
is really carried by $Y$, and taking the limit yields
$$\int_X g \, d\nu = \int_Y g f \, d\mu + \int_Z g \, d\nu.$$
We define $\nu_s$ to be the measure obtained from $\nu$ by
$$\nu_s(A) = \nu(A \cap Z).$$
We could also write $\nu_s = \nu_Z$. We let $\nu_a$ be the measure represented
by $\mu_f$ on $Y$ and 0 outside $Y$. We see that our theorem is proved in the
finite case.
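On a finite set the construction in the proof can be carried out by hand,
which may help in following the roles of $h$, $Y$, $Z$ and $f$. The sketch
below is only an illustration under that assumption: measures are given by
point masses, the function name `lebesgue_decomposition` and the sample
data are hypothetical, and exact rational arithmetic is used so that the
test $h = 1$ is meaningful.

```python
from fractions import Fraction

def lebesgue_decomposition(mu, nu):
    """Split nu = nu_a + nu_s relative to mu on a finite set.

    mu, nu: dicts mapping each point to a nonnegative mass.
    Returns (f, nu_s): the density f of nu_a with respect to mu,
    and the point masses of the singular part nu_s.
    """
    f, nu_s = {}, {}
    for x in sorted(set(mu) | set(nu)):
        m = Fraction(mu.get(x, 0))
        n = Fraction(nu.get(x, 0))
        total = m + n
        # h is the density of nu with respect to mu + nu, so 0 <= h <= 1.
        h = n / total if total > 0 else Fraction(0)
        if h == 1:
            # x lies in Z = {h = 1}: here mu({x}) = 0 and nu_s carries nu.
            f[x], nu_s[x] = Fraction(0), n
        else:
            # x lies in Y = {h < 1}: here f = h / (1 - h).
            f[x], nu_s[x] = h / (1 - h), Fraction(0)
    return f, nu_s

mu = {"a": 1, "b": 2, "c": 0}
nu = {"a": 3, "b": 0, "c": 5}
f, nu_s = lebesgue_decomposition(mu, nu)
for x in sorted(mu):
    # Check nu({x}) = f(x) * mu({x}) + nu_s({x}) at every point.
    print(x, f[x], nu_s[x], f[x] * mu[x] + nu_s[x] == nu[x])
```

In this toy case the singular part sits on the single point where $\mu$
has no mass, and on the remaining points $f$ is just the ratio of the two
masses.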
The extension to the $\sigma$-finite case follows easily as in Theorem 2.2.
We express $X$ as a disjoint union of measurable sets $\{X_k\}$ of finite
measure, apply the finite result to each piece, and see that we get the
expected convergence.
The Lebesgue part is the decomposition into absolutely continuous
and singular measures. The representation of $\nu$ by $f$ is the
Radon-Nikodym part of the theorem. We look further into this. It is
reasonable to expect it to hold in Hilbert space, in the sense that if a
functional on $L^\infty(\mu, E)$ can be represented by a "measure", then it can
be represented by some $f \in \mathcal{L}^1(\mu, E)$. When I mentioned this to
Palais, he pointed out to me that if one takes the right definition of
measure, then the result follows at once from the positive case, and I am
indebted to him for the following corollary.
If $\nu$ is a positive measure on $\mathcal{M}$, absolutely continuous with
respect to $\mu$, and $h \in \mathcal{L}^1(\nu, E)$ where $E$ is a Hilbert space,
then we get a functional $\lambda$ on $L^\infty(\mu, E)$ by
$$g \mapsto \int_X \langle g, h \rangle \, d\nu = \lambda(g).$$
It will be convenient to denote this functional by $h \, d\nu$. In this case,
we also say that $\lambda$ can be represented by a finite ($E$-valued) measure.
This terminology will be justified in the next section.
Corollary 2.5. Assume that $\mu$ is $\sigma$-finite. Let $E$ be a Hilbert space
and let $\nu$ be a positive measure on $\mathcal{M}$, absolutely continuous with
respect to $\mu$. Let $h \in \mathcal{L}^1(\nu, E)$. Then there exists
$f \in \mathcal{L}^1(\mu, E)$, uniquely determined up to equivalence, such that
$h \, d\nu = f \, d\mu$. In other words, if a functional on
$\mathcal{L}^\infty(\mu, E)$ can be represented by a finite $E$-valued measure,
then it can be represented by a map $f$ in $\mathcal{L}^1(\mu, E)$.
Proof. Let $1/|h|$ denote the function equal to 0 at a point $x$ such that
$h(x) = 0$, and equal to $1/|h(x)|$ if $h(x) \ne 0$. Then $h/|h|$ is
$\mu$-measurable and bounded, and $|h|$ is in $\mathcal{L}^1(\nu, \mathbf{R})$. Then
$|h| \, d\nu$ is a positive measure on $\mathcal{M}$, which is absolutely
continuous with respect to $\mu$. By the positive Radon-Nikodym theorem, we
conclude that $|h| \, d\nu = k \, d\mu$, where $k$ is positive and in
$\mathcal{L}^1(\mu)$. Then
$$f = \frac{h}{|h|} \, k$$
is in $\mathcal{L}^1(\mu, E)$, being the product of a bounded $\mu$-measurable map
and an element in $\mathcal{L}^1(\mu)$. It is then clear that this $f$ satisfies
our requirements.
We shall see in the next section that, in fact, we can start from the
"measure" point of view to arrive at our functionals.
VII, §3. COMPLEX AND VECTORIAL MEASURES
Let $\mathcal{M}$ be a $\sigma$-algebra in $X$ and $E$ a Banach space.
Instead of considering real positive valued measures, we wish to investi-

gate complex valued measures satisfying the same countable additivity
property. It is then clearer to start with Banach valued measures, so that
we see clearly where the property of finite dimensionality is used for
certain results peculiar to the complex numbers. Again, no proof would
be made shorter if we were to assume from the start that E = C. In any
case, finite or infinite dimensional spaces are useful in a number of
applications.
By a decomposition of a measurable set $A$, we mean a sequence $\{A_n\}$
of disjoint measurable sets whose union is $A$. (We don't use the word
partition, which was used for a finite decomposition of a set of finite
measure with respect to a positive measure.) A map
$$\nu: \mathcal{M} \to E$$
is called countably additive if $\nu(\emptyset) = 0$, and if for every
$A \in \mathcal{M}$ and every decomposition $\{A_n\}$ of $A$ we have
$$\nu(A) = \sum_{n=1}^{\infty} \nu(A_n).$$
This infinite sum is to be interpreted as convergent to the same value,
independent of the ordering of the terms. Its value is in $E$. We now
consider properties of such a countably additive map.
The limiting properties of a positive measure are again satisfied in the
present case, namely:
Let $\{Y_n\}$ be an increasing sequence of measurable sets such that
$\bigcup Y_n = Y$. Then
$$\lim_{n \to \infty} \nu(Y_n) = \nu(Y).$$
Similarly, if $\{Y_n\}$ is a decreasing sequence of measurable sets, and
$Y = \bigcap Y_n$, then
$$\lim_{n \to \infty} \nu(Y_n) = \nu(Y).$$
The proof is obvious, as for the positive measures. We define a
function
$$|\nu|: \mathcal{M} \to [0, \infty]$$
by letting
$$|\nu|(A) = \sup \sum_{n=1}^{\infty} |\nu(A_n)|,$$
the sup being taken over all decompositions $\{A_n\}$ of $A$. We shall prove
that $|\nu|$ is a positive measure, and that if $E = \mathbf{C}$ (or is finite
dimensional), then $|\nu|$ is in fact real valued, i.e. finite.
We observe that if $A \subset B$ are measurable, then
$$|\nu|(A) \le |\nu|(B).$$
This is obvious, because if $\{A_n\}$ is a decomposition of $A$, then
$\{A_n, B - A\}$ is a decomposition of $B$. In particular, if $|\nu|(B)$ is
finite, so is $|\nu|(A)$.
Theorem 3.1. Let $\nu: \mathcal{M} \to E$ be countably additive. Then $|\nu|$ is
a positive measure.
Proof. Let $\{A_n\}$ be a decomposition of $A \in \mathcal{M}$. Let $b_n$ be a
real number $\ge 0$ such that $b_n \le |\nu|(A_n)$. Given $\epsilon > 0$, let
$\{A_{nj}\}$ be a decomposition of $A_n$ such that
$$b_n - \frac{\epsilon}{2^n} \le \sum_j |\nu(A_{nj})|.$$
Then we may view $\{A_{nj}\}$ ($n, j = 1, 2, \ldots$) as a decomposition of
$A$, and therefore summing over $n$ yields
$$\sum_n b_n - \epsilon \le \sum_n \sum_j |\nu(A_{nj})| \le |\nu|(A).$$
Taking the sup over all $\{b_n\}$ and letting $\epsilon \to 0$, we get
$$\sum_n |\nu|(A_n) \le |\nu|(A).$$
Conversely, let $\{B_j\}$ be any decomposition of $A$. By the countable
additivity of $\nu$ applied to the decomposition $\{A_n \cap B_j\}$
($n = 1, 2, \ldots$) of $B_j$, we get
$$\sum_j |\nu(B_j)| = \sum_j \Big| \sum_n \nu(A_n \cap B_j) \Big|
\le \sum_n \sum_j |\nu(A_n \cap B_j)| \le \sum_n |\nu|(A_n).$$
This is true for all decompositions $\{B_j\}$ of $A$, whence we get the
reverse inequality
$$|\nu|(A) \le \sum_n |\nu|(A_n).$$
The measure $|\nu|$ is sometimes called the total variation of $\nu$.
Theorem 3.2. If $E$ is finite dimensional, and $\nu: \mathcal{M} \to E$ is
countably additive, then $|\nu|$ is real valued, i.e. finite.
Proof. The general case reduces at once to the real case
(componentwise). We deal with the real case as in Saks [Sa]. Suppose that
$|\nu|(X) = \infty$. We first observe that there exist measurable subsets of
$X$ whose measures have arbitrarily large absolute values. This is seen as
follows. We take a decomposition $\{X_n\}$ of $X$ such that
$$\sum_n |\nu(X_n)|$$
is large. We combine all those terms with indices $n$ such that $\nu(X_n)$
have the same sign. For either $+$ or $-$, the corresponding sum will be
large. We take a finite number of such $n$, but sufficiently many so that
the union of the corresponding $X_n$ is a subset $B$ with $|\nu(B)|$ large.
All we need here is the finite additivity of $\nu$.
Now we construct a decreasing sequence of subsets of $X$ having
measures whose absolute values tend to infinity. Let $X = A_1$. By what
we have just seen, there exists a subset $B \subset A_1$ such that
$$|\nu(B)| \ge |\nu(A_1)| + 2.$$
If $|\nu|(B) = \infty$, we let $A_2 = B$. If $|\nu|(B)$ is finite, then
$$|\nu|(A_1 - B) = \infty$$
and we let $A_2 = A_1 - B$. Then
$$|\nu(A_2)| \ge |\nu(B)| - |\nu(A_1)| \ge 2.$$
It is clear that we could have replaced 2 by any number. Repeating the
procedure inductively, we get a decreasing sequence
$$A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots$$
such that $|\nu(A_n)| \ge n$. Let $A = \bigcap A_n$. The countable additivity
of $\nu$ now yields a contradiction, because
$$\nu(A) = \lim \nu(A_n).$$
Example. The following is an example in which the conclusion of
Theorem 3.2 fails. It is already in a paper of Birkhoff (Trans. Amer.
Math. Soc., 38 (1935) pp. 357-378). Let $E = \ell^2$ be the space of
sequences $\{a_n\}$ of (say) real numbers such that $\sum a_n^2$ converges,
with the standard scalar product. Then $E$ is a Hilbert space. Let $\mu$ be
Lebesgue measure on the line. For each positive integer $n$ and measurable
set $A$ let
$$\nu_n(A) = \frac{1}{n} \, \mu(A \cap [n - 1, n]).$$
Let $\nu(A)$ be the sequence whose $n$-th term is $\nu_n(A)$. It is clear that
the total variation of $\nu$ is infinite and that $\nu$ is countably additive
on the positive line $X$ consisting of all real numbers $\ge 0$.
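The two claims in this example can be made plausible numerically. The
sketch below is an illustration only, with hypothetical helper names: it
computes the $\ell^2$-norm of $\nu([0, N])$, whose $n$-th coordinate is $1/n$
for $n \le N$, and the sum of $|\nu([n-1, n))| = 1/n$ over the unit
intervals, which bounds the total variation from below.

```python
import math

def nu_of_initial_segment(N):
    """Coordinates of nu([0, N]) in l^2: the n-th term is 1/n for n <= N."""
    return [1.0 / n for n in range(1, N + 1)]

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

def variation_lower_bound(N):
    """Sum of |nu([n-1, n))| = 1/n over the decomposition of [0, N)
    into unit intervals; this is a lower bound for |nu|([0, N))."""
    return sum(1.0 / n for n in range(1, N + 1))

for N in (10, 1000, 100000):
    # The norm stays bounded while the variation bound keeps growing.
    print(N, l2_norm(nu_of_initial_segment(N)), variation_lower_bound(N))
```

The norms stay below $\pi/\sqrt{6}$, since $\sum 1/n^2 = \pi^2/6$, while the
lower bound for the variation is a partial sum of the harmonic series and
grows without bound.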
By a vectorial measure on $\mathcal{M}$ we shall mean a countably additive map
$$\nu: \mathcal{M} \to E$$
such that $|\nu|(X)$ is finite, i.e. such that $|\nu|$ is a real valued
positive measure. [Recall that if $A \subset B$, then $|\nu|(A) \le |\nu|(B)$.]
For simplicity, we also call a vectorial measure a measure, and when we
have to make a distinction with the objects discussed in Chapter VI or the
preceding sections, we emphasize this and say positive measure for the
former object. Another way of making the distinction is to say (even more
correctly) an $E$-valued measure for our map $\nu: \mathcal{M} \to E$.
It is clear that $E$-valued measures form a vector space denoted by
$M^1(\mathcal{M}, E)$, or simply $M^1$. For such a measure, we define
$$\|\nu\| = |\nu|(X).$$
Then it is verified at once that $\| \ \|$ is a norm (not merely a seminorm)
on $M^1$. In fact, $M^1$ is complete, i.e. a Banach space. The proof is a
routine $\epsilon/2^n$ proof which we leave to the reader. Theorem 3.2 shows
that the complex measures on $\mathcal{M}$ are precisely the complex valued,
countably additive functions on $\mathcal{M}$.
Note. Our terminology is adjusted to the applications we are going to
make. It would be more proper to define an $E$-valued measure to be
simply a countably additive map $\nu: \mathcal{M} \to E$ such that
$\nu(\emptyset) = 0$, and then define a bounded measure to be such a map with
$|\nu|(X) < \infty$. In the sequel, we are concerned only with bounded
measures or with complex measures (which are automatically bounded), so
that we have taken the convention as described above.
Example. Let $\mu$ be a positive measure on $\mathcal{M}$ and let
$f \in \mathcal{L}^1(\mu)$. Define $\mu_f$ by
$$\mu_f(A) = \int_A f \, d\mu.$$
Then it is immediately verified that $\mu_f$ is a measure, and that
$$\|\mu_f\| \le \|f\|_1.$$
We shall prove the reverse inequality after a remark, which it is
convenient to formulate in a slightly more general context than we need for
the next theorem. Let $\mu$ be a positive measure, and let
$\nu: \mathcal{M} \to E$ be an $E$-valued measure. We say that $\nu$ is
absolutely continuous with respect to $\mu$, or to put it shortly is
$\mu$-continuous, if either one of the following two conditions is satisfied.
AC 1. If $A \in \mathcal{M}$ and $\mu(A) = 0$, then $\nu(A) = 0$.

AC 2. Given $\epsilon$, there exists $\delta$ such that $\mu(A) < \delta$ implies
that $|\nu(A)| < \epsilon$.
We shall prove that these two conditions are equivalent. It is clear that
AC 2 implies AC 1. Conversely, assume AC 1. If AC 2 is false, then for
some $\epsilon$ and each positive integer $n$ there exists a set $Y_n$ such that
$\mu(Y_n) < 1/2^n$, but $|\nu(Y_n)| > \epsilon$. Then $|\nu|(Y_n) > \epsilon$. Let
$$Z_n = Y_n \cup Y_{n+1} \cup \cdots$$
and let $Z = \bigcap Z_n$. Then $\mu(Z) = 0$, but
$$|\nu|(Z) = \lim |\nu|(Z_n) \ge \epsilon$$
because $Y_n \subset Z_n$. Hence there is some measurable subset $Z'$ of $Z$
such that $\nu(Z') \ne 0$, contradicting AC 1 because $\mu(Z') = 0$. This
proves the equivalence between our two conditions.
Remark 1. If $f \in \mathcal{L}^1(\mu)$, then the measure $\mu_f$ is obviously
$\mu$-continuous.
Remark 2. The measure $\nu$ is $\mu$-continuous if and only if $|\nu|$ is
$\mu$-continuous.
Theorem 3.3. Let $\mu$ be a positive measure on $\mathcal{M}$ and let
$f \in \mathcal{L}^1(\mu)$. Then
$$\|\mu_f\| = \|f\|_1.$$
The map $f \mapsto \mu_f$ is a norm-preserving embedding $L^1(\mu) \to M^1$.
Proof. It suffices to prove the inequality $\|f\|_1 \le \|\mu_f\|$. We may
assume $\|f\|_1 > 0$. Given $\epsilon$, there exists a set $A$ of finite measure
such that
$$\int_X |f| \, d\mu - \epsilon \le \int_A |f| \, d\mu.$$
By $\mu$-continuity, there exists $\delta$ such that if $Z$ is a set with
$\mu(Z) < \delta$, then
$$\int_Z |f| \, d\mu < \epsilon.$$
By the fundamental lemma of integration (Lemma 3.1 of Chapter VI),
there exists a step map $\varphi$ and a set $Z$ of measure $< \delta$ such that
$Z$ is contained in $A$ and such that we have for all $x \in A - Z$:
$$|f(x) - \varphi(x)| < \frac{\epsilon}{\mu(A)}.$$
Write
$$\varphi = \sum_{i=1}^{n} v_i \chi_{A_i},$$
where $\{A_i\}$ is a partition of $A - Z$, and let $\varphi$ be 0 outside $A$.
We have:
$$\int_A |f| \, d\mu - \epsilon \le \sum_i \int_{A_i} |f| \, d\mu
\le \sum_i \int_{A_i} \Big( |\varphi| + \frac{\epsilon}{\mu(A)} \Big) d\mu$$
$$= \sum_i \Big| \int_{A_i} \varphi \, d\mu \Big|
+ \sum_i \frac{\epsilon \, \mu(A_i)}{\mu(A)}$$
$$\le \sum_i \Big| \int_{A_i} f \, d\mu \Big| + 2\epsilon$$
$$\le |\mu_f|(A) + 2\epsilon \le |\mu_f|(X) + 2\epsilon.$$
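Corollary 3.4. For any step map $g$ we have
$$\int_X g \, d|\mu_f| = \int_X g \, |f| \, d\mu.$$
Or symbolically, on step maps,
$$d|\mu_f| = |f| \, d\mu.$$
Proof. For each measurable set $A$, we can apply Theorem 3.3 with
respect to $A$ and get
$$|\mu_f|(A) = \int_A |f| \, d\mu.$$
The result for step maps follows by linearity.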
We shall interpret measures as functionals. Let $E$ be a Hilbert space
or the complex numbers and let $\nu$ be an $E$-valued measure on
$\mathcal{M}$. We first view $\nu$ as inducing a linear map on step mappings
with respect to $|\nu|$. Let $\varphi \in \mathrm{St}(|\nu|)$ and let us write
$$\varphi = \sum_{i=1}^{n} v_i \chi_{A_i},$$
where $\{A_1, \ldots, A_n\}$ is a partition of a set $A$ having finite
$|\nu|$-measure. We define $d\nu$ by
$$\int_X \varphi \, d\nu = \langle \varphi, d\nu \rangle
= \sum_{i=1}^{n} v_i \nu(A_i).$$
This obviously satisfies properties similar to those of the integral of
Chapter VI, §2. Note that we wrote $v_i \nu(A_i)$ instead of
$\nu(A_i) v_i$ to fit the notation of functions better. In particular, since
$$|\nu(A_i)| \le |\nu|(A_i),$$
we have the inequality
$$|\langle \varphi, d\nu \rangle| \le \|\varphi\|_1 \le \|\varphi\|_2 \, \|1_X\|_2,$$
where the $L^2$-seminorm is taken with respect to $|\nu|$. (There is no other
positive measure floating around at the moment.) Consequently $d\nu$ is
$L^2$-continuous on $\mathrm{St}(|\nu|)$ and can thus be extended to a unique
functional on $L^2(|\nu|)$ since the step maps are dense. By the
$L^2$-duality, we know that there exists a unique (up to a set of
$|\nu|$-measure 0) map $h \in \mathcal{L}^2(|\nu|)$ such that on all step maps,
$$d\nu = h \, d|\nu|.$$
In other words, for all step maps $\varphi$ we have
$$\langle \varphi, d\nu \rangle = \int_X \varphi h \, d|\nu|.$$
We shall say that $d\nu$ is represented by $h$. Since $|\nu|$ is finite, we
know that $h$ is in $\mathcal{L}^1(|\nu|)$ (Schwarz inequality on $|h|$ and
$1_X$).
We state the next theorem first for the complex numbers, for the
convenience of the reader interested only in the complex case.
Theorem 3.5. Let $\nu$ be a complex measure on $\mathcal{M}$. There exists a
measurable function $h$ on $X$ such that $|h| = 1$ and such that for all
$\varphi \in \mathrm{St}(|\nu|, \mathbf{C})$ we have
$$\langle \varphi, d\nu \rangle = \int_X \varphi h \, d|\nu|.$$
This function $h$ is uniquely determined up to $|\nu|$-equivalence.
Proof. We have already found such an $h$ in $\mathcal{L}^1$ and we must show
that $|h| = 1$. We may assume that $h$ is measurable. For $r > 0$ let $S_r$
be the set of all $x \in X$ such that $|h(x)| < r$. Let $\{A_n\}$ be a
decomposition of $S_r$. Then
$$\sum_n |\nu(A_n)| = \sum_n \Big| \int_X \chi_{A_n} h \, d|\nu| \Big|
\le \sum_n r |\nu|(A_n) = r |\nu|(S_r).$$
This shows that $|\nu|(S_r) \le r |\nu|(S_r)$. If $r < 1$, we must have
$|\nu|(S_r) = 0$. Hence $|h(x)| \ge 1$ for almost all $x$. Changing $h$ on a
set of measure 0, we may assume $|h(x)| \ge 1$ for all $x$.
For the reverse inequality, let $A$ be a measurable set. Then from the
definition of $h$ we have
$$\Big| \int_X \chi_A h \, d|\nu| \Big| = |\langle \chi_A, d\nu \rangle|
= |\nu(A)| \le |\nu|(A).$$
The averaging theorem (Theorem 5.15 of Chapter VI and its corollaries)
shows that $|h| \le 1$ almost everywhere. This proves our theorem.
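For a complex measure carried by finitely many points, the function $h$ of
Theorem 3.5 can be written down directly: it is the phase of each point
mass. The following sketch is an illustration under that assumption; the
function names and the sample weights are hypothetical, and the value of
$h$ on points of mass 0 is an arbitrary choice, since $h$ is only determined
up to $|\nu|$-equivalence.

```python
def polar_decomposition(weights):
    """weights: dict mapping each point x to the complex mass nu({x}).

    Returns (h, variation) with variation[x] = |nu|({x}) and |h(x)| = 1.
    """
    variation = {x: abs(w) for x, w in weights.items()}
    # Where nu({x}) = 0 the value of h is irrelevant; we pick 1.
    h = {x: (w / abs(w) if w != 0 else 1.0 + 0j) for x, w in weights.items()}
    return h, variation

def integrate_against_nu(phi, weights):
    """<phi, d nu> = sum of phi(x) * nu({x})."""
    return sum(phi[x] * w for x, w in weights.items())

def integrate_h_dvar(phi, h, variation):
    """Integral of phi * h with respect to |nu|."""
    return sum(phi[x] * h[x] * variation[x] for x in phi)

weights = {"a": 3 + 4j, "b": -2j, "c": 0j}
phi = {"a": 1 + 1j, "b": 2, "c": 5}
h, variation = polar_decomposition(weights)
print(integrate_against_nu(phi, weights), integrate_h_dvar(phi, h, variation))
```

Both integrals agree up to rounding, which is the identity
$d\nu = h \, d|\nu|$ on this finite set.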
Corollary 3.6 (Hahn-Jordan Decomposition of a Measure). Let $\nu$ be a
real valued measure on $\mathcal{M}$ and define
$$\nu^+ = \tfrac{1}{2}(|\nu| + \nu)$$
and
$$\nu^- = \tfrac{1}{2}(|\nu| - \nu).$$
Then the expression $\nu = \nu^+ - \nu^-$ gives a decomposition of $\nu$ into a
difference of two mutually singular positive measures, and any such
decomposition is uniquely determined. If $X = A \cup B$ is a decomposition
into two disjoint measurable sets such that $\nu^+$ is carried by $A$ and
$\nu^-$ is carried by $B$, then
$$\nu^+(Y) = \sup \nu(Z) \quad \text{for } Z \in \mathcal{M} \text{ and } Z \subset Y;$$
$$-\nu^-(Y) = \inf \nu(Z) \quad \text{for } Z \in \mathcal{M} \text{ and } Z \subset Y.$$
Proof. We sketch the proof. By Theorem 3.5, there exists a real
valued function $h$ such that $|h| = 1$ and
$$d\nu = h \, d|\nu|.$$
Then $h$ takes on only the values 1 and $-1$. Let $A$ be the set of points
where $h$ takes the value 1, and let $B$ be the set where $h$ takes the value
$-1$. Let $\nu_A^+$ and $\nu_B^-$ now be defined by the formulas
$$\nu_A^+(Y) = \nu(Y \cap A)$$
and
$$\nu_B^-(Y) = -\nu(Y \cap B).$$
It is then clear that $\nu_A^+$ and $\nu_B^-$ are mutually singular, and it is
immediately verified that $\nu_A^+ = \nu^+$, $\nu_B^- = \nu^-$. We leave the
uniqueness and the proof of the last properties as an exercise.
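The sup and inf formulas of Corollary 3.6 can be checked by brute force
for a real measure on a finite set. The sketch below is only an
illustration: it assumes the usual definitions
$\nu^{\pm} = \frac{1}{2}(|\nu| \pm \nu)$, uses integer point masses, and all
function names and sample weights are hypothetical.

```python
from itertools import combinations

def nu(weights, S):
    return sum(weights[x] for x in S)

def total_variation(weights, S):
    # On a finite set the finest decomposition (into points) attains the sup.
    return sum(abs(weights[x]) for x in S)

def nu_plus(weights, S):
    return (total_variation(weights, S) + nu(weights, S)) / 2

def nu_minus(weights, S):
    return (total_variation(weights, S) - nu(weights, S)) / 2

def extreme_values(weights, S):
    """Brute-force sup and inf of nu(Z) over all subsets Z of S."""
    S = list(S)
    values = [nu(weights, Z)
              for r in range(len(S) + 1)
              for Z in combinations(S, r)]
    return max(values), min(values)

weights = {"a": 3, "b": -2, "c": 1, "d": -4}
X = set(weights)
print(nu_plus(weights, X), nu_minus(weights, X), extreme_values(weights, X))
```

Here $A$ consists of the points of positive mass and $B$ of the points of
negative mass, so the sup is attained at $Z = A$ and the inf at $Z = B$.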
VII, §4. COMPLEX OR VECTORIAL MEASURES AND DUALITY
In this section we discuss the duality arising from complex or Hilbert
space valued measures. We let $E$ be a Hilbert space, which the reader
may assume to be $\mathbf{C}$ in first reading, although as usual, no changes
would be needed.
Theorem 4.1 (Hilbert Case). Let $E$ be a Hilbert space and let $\nu$ be an
$E$-valued measure on $\mathcal{M}$. There exists a measurable map
$h: X \to E$ such that $|h| = 1$ and such that for all
$\varphi \in \mathrm{St}(|\nu|, E)$ we have
$$\langle \varphi, d\nu \rangle = \int_X \langle \varphi, h \rangle \, d|\nu|.$$
This map $h$ is uniquely determined up to $|\nu|$-equivalence.
Proof. Identical with that of Theorem 3.5, except that we must insert
unit vectors $e$ and write $e\chi_{A_n}$ or $e\chi_A$ in the appropriate place.
Corollary 4.2 (Radon-Nikodym, Hilbert Case). Let $E$ be a Hilbert
space. Let $\mu$ be a $\sigma$-finite positive measure on $\mathcal{M}$, and let
$\nu$ be an $E$-valued measure on $\mathcal{M}$ such that $\nu$ is
$\mu$-continuous. Then there exists $f \in \mathcal{L}^1(\mu, E)$ such that
$\nu = \mu_f$, uniquely determined up to $\mu$-equivalence.
Proof. We can write (by the real form of Radon-Nikodym)
$$d|\nu| = k \, d\mu$$
with some positive $k$ in $\mathcal{L}^1(\mu, \mathbf{R})$, whence by the theorem,
on step maps we get (cf. Exercise 15)
$$d\nu = h \, d|\nu| = hk \, d\mu,$$
as was to be shown.
If $\mu$ is a positive measure on $\mathcal{M}$, we can now associate with each
$\mu$-continuous $E$-valued measure $\nu$ on $\mathcal{M}$ a functional, again
denoted by $d\nu$, on $L^\infty(\mu, E)$. Indeed, if we write
$$\nu = \mu_f$$
with $f \in \mathcal{L}^1(\mu, E)$, then we define $d\nu$ by
$$\langle g, d\nu \rangle = \int_X \langle g, f \rangle \, d\mu.$$
Let us denote by $M^1(\mu, E)$ the vector space of $\mu$-continuous
$E$-valued measures on $\mathcal{M}$.
Corollary 4.3. Let $\mu$ be $\sigma$-finite, and $E$ a Hilbert space. We have
arrows:
$$L^1(\mu, E) \to M^1(\mu, E) \to L^\infty(\mu, E)'.$$
The first arrow, given by $f \mapsto \mu_f$, is a norm-preserving
isomorphism between $L^1(\mu, E)$ and $M^1(\mu, E)$. The second, given by
$\nu \mapsto d\nu$, is a norm-preserving anti-linear map of $M^1(\mu, E)$ into
the dual of $L^\infty(\mu, E)$. If $\nu = \mu_f$ with
$f \in \mathcal{L}^1(\mu, E)$, then
$$|d\nu| = \|\nu\| = \|f\|_1.$$
Proof. The norm statements are obtained by combining Theorem 3.3
and Theorem 2.3. All other statements summarize what has already been
proved.
We now determine a necessary and sufficient condition for a functional
on $L^\infty(\mu, E)$ to be expressible in the form $f \, d\mu$, with some $f$
in $\mathcal{L}^1(\mu, E)$. In other words, we characterize the image of the map
$$L^1(\mu, E) \to L^\infty(\mu, E)'.$$
We shall say that a functional
$$\lambda: \mathrm{St}(\mu, E) \to \mathbf{C}$$
is $\mu$-continuous if there exists a positive real valued function $r$ on
$\mathcal{M}$ such that
$$\lim_{\mu(A) \to 0} r(A) = 0,$$
and such that for every $g \in \mathrm{St}(\mu, E)$ we have
$$|\lambda(g_A)| \le r(A) \, \|g\|_\infty.$$
Similarly we define $\mu$-continuity on the bounded measurable maps, taking
$g$ to be such a map. We recall that $g_A = \chi_A g$.
Corollary 4.4. Assume that $\mu(X)$ is finite. Every $\mu$-continuous
functional on the step maps $\mathrm{St}(\mu, E)$ has a unique extension to a
$\mu$-continuous functional on $L^\infty(\mu, E)$. A functional $\lambda$ on
$L^\infty(\mu, E)$ can be written in the form $f \, d\mu$ with some
$f \in \mathcal{L}^1(\mu, E)$ if and only if it is $\mu$-continuous.
Proof. If $f \in \mathcal{L}^1(\mu, E)$, then for any bounded measurable $g$ we
have
$$\Big| \int_X \langle g_A, f \rangle \, d\mu \Big|
\le \|g\|_\infty \int_A |f| \, d\mu,$$
so that our condition of $\mu$-continuity is satisfied. Conversely, let
$\lambda$ be a $\mu$-continuous functional on the step maps
$\mathrm{St}(\mu, E)$. To see that an extension to $L^\infty(\mu, E)$ is unique we
note that if $g \in \mathcal{L}^\infty(\mu, E)$, then given $\epsilon$ there exists
a set $Z$ with $\mu(Z) < \epsilon$ and a sequence of step maps $\{\varphi_n\}$
which converges uniformly to $g$ outside $Z$. This is true because on a set
of finite measure, every bounded measurable map is in $\mathcal{L}^1$, and the
fundamental lemma of integration (Lemma 3.1 of Chapter VI) gives us such
an approximation. It follows that any $\mu$-continuous extension of $\lambda$
to all of $\mathcal{L}^\infty(\mu, E)$ is uniquely determined.
We prove existence by representing $\lambda$ on the step maps as $d\nu$ for
some measure $\nu$. For each fixed measurable $A$ we consider the map
$$v \mapsto \lambda(v\chi_A).$$
This map is obviously a functional on $E$, and hence by the self duality of
Hilbert space there exists a unique vector $\nu(A)$ such that for all
$v \in E$ we have
$$\lambda(v\chi_A) = \langle v, \nu(A) \rangle.$$
The finite additivity of $\nu$ follows from the additivity of $\lambda$.
Furthermore, we have the estimate
$$|\langle v, \nu(A) \rangle| = |\lambda(v\chi_A)| \le |v| \, r(A).$$
This yields
$$|\nu(A)| \le r(A).$$
Let $\{A_n\}$ be a decomposition of $A$, and let
$$B_n = A_1 \cup \cdots \cup A_n.$$
Then $\nu(B_n) = \nu(A_1) + \cdots + \nu(A_n)$, and
$$|\nu(A) - \nu(B_n)| = |\nu(A - B_n)| \le r(A - B_n).$$
The right-hand side tends to 0 as $n \to \infty$, so that $\nu$ is countably
additive.
As for its total variation, let $e_n$ be the unit vector in the direction of
$\nu(A_n)$, and consider the series
$$g = \sum_{k=1}^{\infty} e_k \chi_{A_k}.$$
It is a measurable, bounded map. If $n \ge m$, we have
$$\Big| \lambda\Big( \sum_{k=m}^{n} e_k \chi_{A_k} \Big) \Big| \le r(A_{mn}),$$
where $A_{mn} = A_m \cup \cdots \cup A_n$. Applying the Cauchy criterion, we
have:
$$\lambda(g) = \sum_{k=1}^{\infty} \lambda(e_k \chi_{A_k})
= \sum_{k=1}^{\infty} |\nu(A_k)| = |\lambda(g)|,$$
and also by the hypothesis on $\lambda$,
$$|\lambda(g)| \le r(A).$$
Taking $A = X$ shows that the total variation is finite, whence $\nu$ is a
measure.
Finally, it is clear from the definition of $\nu$ that $\lambda = d\nu$ on the
step maps. By Corollaries 4.2 and 4.3, if we write $d\nu = f \, d\mu$ for some
$f$ in $\mathcal{L}^1(\mu, E)$ then we can extend $d\nu$ to a $\mu$-continuous
functional on $\mathcal{L}^\infty(\mu, E)$. This proves Corollary 4.4.
Example. Let $X = [0, 1]$ with Lebesgue measure $\mu$. Let $F$ be the
space of continuous functions on $X$, so that $F$ is a subspace of
$\mathcal{L}^\infty(\mu, \mathbf{C})$. It is easy to verify that if $f, g \in F$ are
equivalent (i.e. equal almost everywhere), then they are equal, so that
$F$ is a subspace of $L^\infty(\mu, \mathbf{C})$. Let $\nu$ be the measure which
gives the set $\{0\}$ measure 1, and gives a subset of $[0, 1]$ measure 0 if
this subset does not contain 0. For any $f \in F$ we have
$$\int_X f \, d\nu = f(0).$$
This measure $\nu$ is obviously not $\mu$-continuous, but $d\nu$ is continuous
for the $L^\infty(\mu)$-seminorm (actually a norm on $F$). We can extend the
functional $d\nu$ on $F$ to all of $L^\infty(\mu, \mathbf{C})$ by the Hahn-Banach
theorem to give examples of functionals on $L^\infty(\mu, \mathbf{C})$ which
cannot be represented by $\mu$-continuous measures.
Remark. The part of the proof showing that $\nu$ is a measure does not
depend in an essential way on the assumption that $E$ is a Hilbert space,
and goes through with very minor modifications in the arbitrary Banach
case. The definition of $\mu$-continuity of a functional $\lambda$ applies in
this case, and one can characterize such functionals as measures in the
following manner:
Assume that $\mu(X)$ is finite. Let $E$ be a Banach space and $E'$ its dual
space. There exists a unique norm-preserving linear map
$$M^1(\mu, E') \to L^\infty(\mu, E)'$$
from the space of $\mu$-continuous $E'$-valued measures into the dual space
of $L^\infty(\mu, E)$, denoted by $\nu \mapsto d\nu$, whose image is the space of
$\mu$-continuous functionals on $L^\infty(\mu, E)$, and such that on step maps
$v\chi_A$ ($v \in E$ and $A$ measurable) we have
$$\langle v\chi_A, d\nu \rangle = \langle v, \nu(A) \rangle.$$
The crucial part of the proof of the preceding statement, namely that a
$\mu$-continuous functional can be written as $d\nu$, follows closely the
Hilbert case proof of Corollary 4.4. See Exercises 16 and 17.
There remains to determine when a given measure $\nu$ can be written in
the form $\mu_f$ for some $f \in \mathcal{L}^1(\mu, E)$, where $E$ is an arbitrary
Banach space. A complete answer is given in Rieffel's paper [Ri], as
follows:
Rieffel's Theorem. Let $\mu$ be $\sigma$-finite and let $E$ be a Banach space.
Let $m$ be an $E$-valued measure, which is $\mu$-continuous. Either one of the
following conditions is necessary and sufficient that $m$ can be written in
the form $\mu_f$ for some $f \in \mathcal{L}^1(\mu, E)$:
R. Given $A$ measurable and $0 < \mu(A) < \infty$, there exists $B \subset A$
with $\mu(B) > 0$ such that the average set
$$\mathrm{Av}_B(m) = \text{set of all } m(Y)/\mu(Y), \quad Y \subset B, \ \mu(Y) > 0$$
is relatively compact.
R'. Given $A$ measurable with $0 < \mu(A) < \infty$, there is some
$B \subset A$ and a compact subset $K$ of $E$ not containing 0 such that
$\mu(B) > 0$ and $m(Y)$ is contained in the cone generated by $K$ for all
$Y \subset B$.
Note. The cone generated by $K$ is the set of all positive finite linear
combinations of elements of $K$. Condition R' may be expressed by saying
that $m$ has compact direction locally somewhere. Condition R' is
obviously satisfied in the finite dimensional case. A discussion of the
literature and applications will also be found in Rieffel's paper. For an
example when the measure $m$ cannot be written as $\mu_f$, even though it is
$\mu$-continuous, cf. Exercise 21.
For more on the Radon-Nikodym derivative in euclidean space, and

the relation between differentiation and integration, I recommend Smith's
book [Smi].
VII, §5. THE LP SPACES, 1 < p < 00
We let $(X, \mathcal{M}, \mu)$ be a measured space.
In this section we give results analogous to those concerning $L^2$,
replacing 2 by a real number $p$ with $1 < p$. We need some inequalities to
replace the Schwarz inequality. Throughout we let $q$ be the positive
number (necessarily $> 1$) such that
$$\frac{1}{p} + \frac{1}{q} = 1,$$
and call $q$ the dual exponent of $p$.
We have the basic inequalities for real $a, b > 0$:
$$(*) \qquad a^{1/p} b^{1/q} \le \frac{a}{p} + \frac{b}{q},$$
$$(**) \qquad \Big( \frac{a + b}{2} \Big)^p \le \frac{1}{2}(a^p + b^p).$$
There are several easy proofs for this. Either take the log of both sides
and use the convexity of the log, or proceed as follows. If $t \ge 1$, then
$$t^{1/p} \le \frac{t}{p} + \frac{1}{q},$$
as one sees by differentiating both sides, evaluating at $t = 1$, and seeing
that the derivative on the right is bigger than the derivative on the left.
Suppose now that $a/b \ge 1$; putting $t = a/b$ and multiplying by $b$, the
inequality (*) drops out at once. The other is proved similarly.
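We let $\mathcal{L}^p(\mu)$ be the set of maps $f$ on $X$ which are
$\mu$-measurable, and such that $|f|^p$ lies in $\mathcal{L}^1$.
Theorem 5.1. Let $1 < p < \infty$. Then $\mathcal{L}^p(\mu)$ is a vector space.
If we define
$$\|f\|_p = \Big( \int_X |f|^p \, d\mu \Big)^{1/p},$$
then $\| \ \|_p$ is a seminorm on $\mathcal{L}^p$. If $f \in \mathcal{L}^p(\mu)$ and
$g \in \mathcal{L}^q(\mu)$, then $|f||g|$ is in $\mathcal{L}^1$ and Hölder's
inequality holds, namely
$$\int_X |f| |g| \, d\mu \le \|f\|_p \, \|g\|_q.$$
Proof. We see that $\mathcal{L}^p$ is a vector space directly by applying the
inequality (**). If $\|f\|_p = 0$ or $\|g\|_q = 0$, then $f$ or $g$ is 0 almost
everywhere and the Hölder inequality is obvious. Suppose that
$\|f\|_p \ne 0$ and $\|g\|_q \ne 0$. Let
$$a = \frac{|f|^p}{\|f\|_p^p} \quad \text{and} \quad
b = \frac{|g|^q}{\|g\|_q^q}.$$
Using inequality (*), we find that
$$\frac{|f| |g|}{\|f\|_p \, \|g\|_q}
\le \frac{1}{p} \frac{|f|^p}{\|f\|_p^p} + \frac{1}{q} \frac{|g|^q}{\|g\|_q^q}.$$
First this shows that $|f||g|$ is in $\mathcal{L}^1$ (corollary of the dominated
convergence theorem), and second it yields the last inequality stated in
the theorem, after we integrate over $X$. To show that $\| \ \|_p$ is a
seminorm, write
$$|f + g|^p \le |f + g|^{p-1} |f| + |f + g|^{p-1} |g|.$$
Integrating and using Hölder's inequality yields the fact that $\| \ \|_p$ is
a seminorm, and concludes the proof of the theorem.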
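Hölder's inequality is easy to test on a finite measure space, where the
integrals are weighted sums. The sketch below is an illustration only,
with hypothetical function names and data; the second call uses
$g = |f|^{p-1}$, for which $|g|^q = |f|^p$ and the two sides should agree up
to rounding.

```python
def lp_norm(f, weights, p):
    """(sum_x |f(x)|^p mu({x}))^(1/p) for a finite weighted point set."""
    return sum(w * abs(v) ** p for v, w in zip(f, weights)) ** (1.0 / p)

def holder_sides(f, g, weights, p):
    """Return (integral of |f||g|, ||f||_p * ||g||_q) for the dual exponent q."""
    q = p / (p - 1.0)  # dual exponent: 1/p + 1/q = 1
    lhs = sum(w * abs(a) * abs(b) for a, b, w in zip(f, g, weights))
    rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)
    return lhs, rhs

weights = [1.0, 0.5, 2.0]
print(holder_sides([1.0, -2.0, 3.0], [0.5, 4.0, -1.0], weights, 3.0))
# Equality case: |g|^q proportional to |f|^p, e.g. g = |f|^(p-1).
print(holder_sides([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], [1.0, 1.0, 1.0], 3.0))
```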
We are now in a position to prove that many of the results of §1 hold if
one replaces $L^2$ by $L^p$ with $1 < p < \infty$.
Theorem 5.2. Let $\{f_n\}$ be an $L^p$-Cauchy sequence in $\mathcal{L}^p$. Then
there exists some $f \in \mathcal{L}^p$ having the following properties:
(i) The sequence $\{f_n\}$ is $L^p$-convergent to $f$, so that $\mathcal{L}^p$
is complete.
There exists a subsequence having the following properties:
(ii) This subsequence of $\{f_n\}$ converges almost everywhere to $f$.
(iii) Given $\epsilon$, there exists a set $Z$ with $\mu(Z) < \epsilon$ such that
the convergence of this subsequence is uniform on the complement of $Z$.
Proof. Identical with that of Theorem 1.4.
Theorem 5.3 (Dominated Convergence Theorem for $L^p$). Let $\{f_n\}$ be
a sequence in $\mathcal{L}^p$ which converges pointwise almost everywhere to $f$.
Assume that there exists $g \in \mathcal{L}^p(\mu, \mathbf{R})$ such that $g \ge 0$
and such that $|f_n| \le g$. Then $f$ is in $\mathcal{L}^p$ and $\{f_n\}$ is
$L^p$-convergent to $f$.
Proof. As before.
Corollary 5.4. The step maps are dense in $\mathcal{L}^p$, and $L^p(\mu)$ is the
completion of the step maps in the $L^p$-seminorm.
Proof. As before.
Finally, the duality statement holds. Over $\mathbf{C}$, we let
$\langle f, g \rangle = f\bar{g}$. Theorem 5.5 and its proof are true as usual
for a Hilbert space $E$ and $E$-valued maps $f, g$, with
$\langle f, g \rangle$ denoting the scalar product of the values of $f$ and $g$.
Theorem 5.5. Assume that $\mu$ is $\sigma$-finite. For $f \in \mathcal{L}^p(\mu)$
and $g \in \mathcal{L}^q(\mu)$, we let
$$\langle f, g \rangle_\mu = \int_X \langle f, g \rangle \, d\mu,$$
and define $\lambda_g$ by $\lambda_g(f) = \langle f, g \rangle_\mu$. Then the map
$g \mapsto \lambda_g$ is a norm-preserving isomorphism of $L^q(\mu)$ onto the
dual space of $L^p(\mu)$.
Proof. We consider first as usual the case when $\mu(X)$ is finite. Our
map $g \mapsto \lambda_g$ is certainly an injective linear map, and we have
$$|\lambda_g| \le \|g\|_q$$
by Hölder's inequality. Let us prove that it is surjective. Let $\lambda$ be
a functional on $L^p(\mu)$. Then $\lambda$ can be viewed as a $\mu$-continuous,
$\mu$-bounded functional on $L^\infty(\mu)$ because if $g$ is a bounded
measurable map, then $g$ is in $\mathcal{L}^p(\mu)$ and if $C = |\lambda|$, then
$$|\lambda(g)| \le C \|g\|_p \le C \|g\|_\infty \, \mu(X)^{1/p}.$$
If we replace $X$ by $A$ for any measurable $A$, and $g$ by $g_A$, we get
the same estimate with $\mu(X)$ replaced by $\mu(A)$. We can therefore apply
Corollary 4.4 of the Radon-Nikodym theorem (vectorial case). There
exists a map $f \in \mathcal{L}^1(\mu)$ such that $d\nu = f \, d\mu$ as a functional
on $L^\infty(\mu)$. We shall prove that in fact, $f$ lies in $\mathcal{L}^q(\mu)$.
Let $Y_n$ be the set of $x$ such that $|f(x)| \le n$. We first get a bound for
the integral of $|f|^q$ over $Y_n$. Let
$$g = |f|^{q-1} \frac{f}{|f|}$$
and let $g_n$ be equal to $g$ on $Y_n$ and 0 outside $Y_n$. (That is
$g_n = \chi_{Y_n} g$.) As usual, dividing by $|f|$ is to be understood as
$1/|f(x)|$ if $f(x) \ne 0$ and 0 if $f(x) = 0$. Then $g_n$ is bounded,
$|g|^p = |f|^q$, and
$$\int_{Y_n} |f|^q \, d\mu = \lambda(g_n) \le |\lambda| \, \|g_n\|_p
= |\lambda| \Big( \int_{Y_n} |f|^q \, d\mu \Big)^{1/p}.$$
From this we conclude that
$$\Big( \int_{Y_n} |f|^q \, d\mu \Big)^{1/q} \le |\lambda|.$$
By the monotone convergence theorem, it follows that $|f|^q$ lies in
$\mathcal{L}^1$, whence $|f|$ lies in $\mathcal{L}^q$ and $\|f\|_q \le |\lambda|$.
The functionals $\lambda$ and $f \, d\mu$ have the same effect on step maps,
which are dense in $\mathcal{L}^p$. Therefore they are equal on
$\mathcal{L}^p(\mu)$. This proves our theorem when $\mu(X)$ is finite.
As for the $\sigma$-finite case, we consider a decomposition $X = \bigcup X_k$
(disjoint union of sets of finite measure). For each $X_k$ we can find a
function $f_k$ that lies in $\mathcal{L}^q(\mu)$ and is 0 outside $X_k$, and such
that $f_k \, d\mu$ represents $\lambda$ over $X_k$. Let $h \in \mathcal{L}^p(\mu)$ be
arbitrary and let $h_k$ be the same map as $h$ on $X_k$ and 0 outside $X_k$.
Then the series
$$\sum_{k=1}^{\infty} h_k$$
is $L^p$-convergent to $h$, say by the dominated convergence theorem, and
therefore by the continuity of $\lambda$ we have
$$\lambda h = \sum_{k=1}^{\infty} \lambda(h_k).$$
For each $k$ we have $\lambda h_k = \langle h_k, f_k \rangle_\mu$. If we let
$f = \sum f_k$, it follows that $\lambda = f \, d\mu$ on $\mathcal{L}^p(\mu)$. This
concludes the proof of the $L^p$-duality theorem.
Remark. The proof follows the classical pattern (see Rudin [Ru 1] or
Loomis [Lo]), granted the $L^2$ and $(L^1, L^\infty)$-duality theorem. For the
general case when $E$ is a Banach space, and one wants $L^q(\mu, E')$ to be
dual to $L^p(\mu, E)$ for $1 \le p < \infty$, cf. Dinculeanu [Din], §13,
Corollary 1 of Theorem 8, where this is proved under some countability
assumption.
The next theorem gives an example of an integral operator in the

fairly general setting of LP-spaces.
Theorem 5.6. Let $1 \le p \le \infty$ and $C > 0$. Let $K$ be a measurable function
on $X \times X$ such that
$$\int_X |K(x, y)|\,d\mu(y) \le C \quad \text{for all } x \in X$$
and
$$\int_X |K(x, y)|\,d\mu(x) \le C \quad \text{for all } y \in X.$$
Let $f \in L^p(\mu)$. Then the function $S_K f$ defined by
$$S_K f(x) = \int_X K(x, y) f(y)\,d\mu(y)$$
is defined for almost all $x$, and is in $L^p(\mu)$. Furthermore,
$$\|S_K f\|_p \le C\,\|f\|_p.$$
Proof. We leave the proof as an exercise. The $L^2$-case is especially
interesting. Cf. Exercises 9-13 of Chapter XVII.
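The estimate of Theorem 5.6 can be tried out numerically in the simplest possible setting. The following is only a sketch under stated assumptions: $X$ is taken to be a finite set with counting measure, so that $K$ becomes a matrix and the two hypotheses say that all absolute row sums and column sums are at most $C$. The names `schur_bound`, `apply_kernel`, `lp_norm` and the sample matrix are hypothetical, chosen for illustration.

```python
def schur_bound(K):
    """Smallest C bounding every absolute row sum and column sum of K."""
    rows = max(sum(abs(v) for v in row) for row in K)
    cols = max(sum(abs(row[j]) for row in K) for j in range(len(K[0])))
    return max(rows, cols)

def apply_kernel(K, f):
    """(S_K f)(x) = sum over y of K[x][y] * f[y] (counting measure, assumed)."""
    return [sum(k * v for k, v in zip(row, f)) for row in K]

def lp_norm(f, p):
    """l^p norm for 1 <= p < infinity, sup norm for p = float('inf')."""
    if p == float("inf"):
        return max(abs(v) for v in f)
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

# Hypothetical sample data.
K = [[0.5, -0.25, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.25, -0.5]]
f = [1.0, -2.0, 3.0]
C = schur_bound(K)
checks = [lp_norm(apply_kernel(K, f), p) <= C * lp_norm(f, p) + 1e-12
          for p in (1, 2, 3, float("inf"))]
```

Every entry of `checks` should be true. This illustrates the inequality on one example and is of course not a proof.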
VII, §6. THE LAW OF LARGE NUMBERS
I cannot resist giving an application of integration theory to a probabilistic
setting which shows integration theory at work. This consists of
the "law of large numbers" in a suitable formulation. I follow the exposition
of [La-T]. This section can be read immediately after §1, as an
application of the definitions and convergence theorems in §1 concerning
$L^2$.
We assume that the reader has done the exercise of extending the
notion of product measures to denumerable products. Specifically, we
use the following theorem.
Let $(X_n, \mathcal{M}_n, \mu_n)$ be a sequence of measured spaces such that $\mu_n(X_n) = 1$
for almost all $n$ (meaning for all but a finite number of $n$). Let $\mathcal{M}$ be
the $\sigma$-algebra in the product space
$$X = \prod_{n=1}^{\infty} X_n$$
generated by all sets
$$A = \prod_{n=1}^{\infty} A_n,$$
where $A_n \in \mathcal{M}_n$, and $A_n = X_n$ for almost all $n$. Then there exists a
unique measure $\mu$ on $(X, \mathcal{M})$ such that for all such sets $A$ we have
$$\mu(A) = \prod_{n=1}^{\infty} \mu_n(A_n).$$
We call $\mu = \bigotimes \mu_n$ the product measure.
In the sequel we assume that in fact $\mu_n(X_n) = 1$ for all $n$. We view $X$
as our probability space.
Theorem 6.1. Suppose given a measurable subset $S_n$ of $X_n$ for each $n$.
Assume that the limit exists,
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu_n(S_n) = L.$$
Then for almost all elements (sequences) $x = \{x_n\}$ in $X$, the density of $n$
such that $x_n \in S_n$ exists and is equal to $L$. This means:
$$\lim_{N \to \infty} \frac{1}{N}\,\#\{n \le N,\ x_n \in S_n\} = L.$$
The above theorem has a simple intuitive content, but some applications
require a stronger version, as follows.
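To see the content of Theorem 6.1 concretely, one can sample a single sequence $x = \{x_n\}$ and compare the observed density with $L$. The sketch below makes several assumptions that are not in the text: the events $x_n \in S_n$ are simulated by independent pseudo-random draws, the values $\mu_n(S_n)$ alternate between 0.2 and 0.6 (so that $L = 0.4$), and the seed is fixed only for reproducibility. The function name `density_of_hits` is hypothetical.

```python
import random

def density_of_hits(probabilities, rng):
    """Draw the event 'x_n in S_n' with probability mu_n(S_n); return #hits / N."""
    hits = sum(1 for p in probabilities if rng.random() < p)
    return hits / len(probabilities)

rng = random.Random(0)   # fixed seed, an arbitrary choice
N = 20000
# Assumed values of mu_n(S_n): alternating 0.2 and 0.6, so the averages tend to 0.4.
probs = [0.2 if n % 2 == 0 else 0.6 for n in range(N)]
L = sum(probs) / N
observed = density_of_hits(probs, rng)
```

For a typical sample, `observed` should lie close to `L`. A single simulated sequence only illustrates the statement "for almost all $x$".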
Theorem 6.2. Suppose given a measurable subset $S_n$ of $X_n$ for each $n$.
Let $\{b_n\}$ be a sequence of positive real numbers tending monotonically to
infinity. Assume that
Then for almost all sequences $x$ we have
$$\#\{n \le N,\ x_n \in S_n\} = \sum_{n=1}^{N} \mu_n(S_n) + o(b_N).$$
The first theorem is obtained from the second by putting $b_n = n$. We
shall now prove the theorem.
The first lemma, due to Kolmogoroff and formulated by him in probabilistic
terms, will be a refinement of the fundamental lemma of integration
theory, which asserts that given an $L^1$ (or $L^p$) Cauchy sequence,
there exists a subsequence that converges absolutely almost everywhere.
Here we give up on absolute convergence, but have conditions which
make the full sequence converge pointwise almost everywhere.
Lemma 6.3. For each $n$ let $h_n$ be a function on $X_n$, also viewed as a
function on $X$ by projection on the $n$-th factor. Assume that
$$\int_{X_n} h_n\,d\mu_n = 0.$$
Let
$$H_n(x) = \sum_{k=1}^{n} h_k(x)$$
be the partial sum. Assume that $\sum \|h_k\|_2^2$ converges. Then the limit
$$\lim_{n \to \infty} H_n(x)$$
exists for almost all $x \in X$.
Proof. We first note that the functions $h_n$ are mutually orthogonal
on $X$. The heart of the proof lies in the next statement.
Kolmogoroff's Inequality. Given $\epsilon$, let
$$Z = \{x \in X,\ \max_{1 \le k \le n} H_k(x)^2 \ge \epsilon\}.$$
Then
$$\epsilon\,\mu(Z) \le \sum_{k=1}^{n} \|h_k\|_2^2.$$
Proof. Let
$$Y_k = \{x \in X \text{ such that } H_k(x)^2 \ge \epsilon \text{ and } H_i(x)^2 < \epsilon \text{ for all } i < k\}.$$
In other words, $Y_k$ is the set of points $x$ such that $H_k(x)^2$ is the first
squared partial sum at least equal to $\epsilon$. Then the sets $Y_k$ are disjoint, and we get
the inequality
$$\epsilon\,\mu(Z) = \epsilon \sum_{k=1}^{n} \mu(Y_k) \le \sum_{k=1}^{n} \int_{Y_k} H_k^2\,d\mu.$$
Write
$$H_k^2 = H_n^2 - 2H_k(H_n - H_k) - (H_n - H_k)^2.$$
The last term is negative, and we shall leave it out when we integrate.
On the other hand, the middle term gives
$$\int_{Y_k} H_k(H_n - H_k)\,d\mu = 0.$$
This holds because $H_k$ is effectively a function of only the first $k$ variables,
whereas $H_n - H_k$ is effectively a function of only the last $n - k$
variables. The integral splits into a product of integrals over the distinct
variables, and is immediately seen to yield 0, as desired. Therefore we
can replace $H_k^2$ by $H_n^2$ and then integrate over all of $X$, thereby giving
$$\int_X H_n^2\,d\mu = \sum_{k=1}^{n} \|h_k\|_2^2$$
as bound (by orthogonality, the sum of the squares of the $L^2$-norms of the $h_k$),
which proves the asserted inequality.
We have assumed that $\sum h_k$ is in $L^2$, that is
$$\sum_{k=1}^{\infty} \|h_k\|_2^2 < \infty.$$
This means that for $m_0$ sufficiently large, and $n \ge m \ge m_0$, we get
Define
Then $Z_i$ has measure $\le 1/2^i$ if we pick $m_0(i)$ sufficiently large. Let
$$W_n = Z_n \cup Z_{n+1} \cup \cdots$$
for large $n$, so that $W_n$ has measure $\le 1/2^{n-1}$. Then the partial sums
$\sum h_k(x)$ converge for $x$ not in $W_n$. Hence if we let $W$ be the intersection
$$W = \bigcap W_n,$$
then these partial sums converge for $x$ not in $W$, and $W$ has measure
zero, thereby proving the lemma.
The next theorem is also due to Kolmogoroff, in that generality.
Theorem 6.4. For each $n$ let $f_n$ be a function on $X_n$, and assume that
$$\int f_n\,d\mu_n = 0.$$
Let $\{b_n\}$ be a sequence of positive real numbers monotonically increasing
to infinity. If
$$\sum \frac{1}{b_n^2}\,\|f_n\|_2^2 < \infty,$$
then for almost all $x$ the partial sums
$$F_n(x) = \sum_{k=1}^{n} f_k(x)$$
satisfy the estimate
$$F_n(x) = o(b_n).$$
Proof. Let $h_n = f_n/b_n$ and apply the lemma to the partial sums
$$H_n(x) = \sum_{k=1}^{n} h_k(x).$$
The lemma says that these partial sums converge for almost all $x$. It is a
trivial fact (proved by summation by parts) that if $\sum a_k$ is a convergent
series, then
$$\sum_{k=1}^{n} a_k b_k = o(b_n).$$
Applying this fact when $a_k = h_k(x)$ proves the theorem.
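The summation-by-parts fact used at the end of the proof is easy to observe numerically. As a sketch, we assume the sample choice $a_k = 1/k^2$ (a convergent series) and $b_k = k$, so that $\sum_{k \le n} a_k b_k$ is the harmonic sum, which is $o(n)$. The function name `kronecker_ratio` is hypothetical.

```python
def kronecker_ratio(a, b, n):
    """Return (sum_{k=1}^n a(k) * b(k)) / b(n)."""
    return sum(a(k) * b(k) for k in range(1, n + 1)) / b(n)

def a(k):
    return 1.0 / (k * k)   # sum of a_k converges

def b(k):
    return float(k)        # b_k increases monotonically to infinity

ratios = [kronecker_ratio(a, b, n) for n in (10, 100, 1000)]
```

The ratios decrease toward 0 as $n$ grows, in agreement with $\sum_{k \le n} a_k b_k = o(b_n)$.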

We have stated Theorem 6.4 under the normalization that the integral
of the functions $f_n$ is 0. This is of course not satisfied in general, but a
translation reduces the general case to this special case. Indeed, suppose
that $\psi_n$ are functions such that
$$\int_{X_n} \psi_n\,d\mu_n = c_n$$
is a constant $c_n$. Define
$$f_n = \psi_n - c_n.$$
Then the integral of $f_n$ is 0. In particular, suppose that $\psi_n$ is the characteristic
function of some subset $S_n$ of $X_n$. Then
Applying Theorem 6.4 in this situation yields Theorem 6.2, as desired.
VII, §7. EXERCISES
Unless otherwise specified, $(X, \mathcal{M}, \mu)$ is a measured space.

1. Let $\mathcal{A}$ be an algebra in $X$ and $\mu$ a positive measure on $\mathcal{A}$. Assume that all
elements of $\mathcal{A}$ have finite measure. For $A, B \in \mathcal{A}$ define
$$d(A, B) = \mu(A - B) + \mu(B - A).$$
Show that $d$ is a semimetric [in the obvious sense, that is $d(A, B) \ge 0$,
$d(A, B) = d(B, A)$,
and the triangle inequality is satisfied]. The only difference from a metric is
that we may have $d(A, B) = 0$ and yet $A \ne B$. In this way, $\mathcal{A}$ becomes a
topological space, and $\mu$-continuity corresponds to the topological notion.
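A small finite example may help in getting a feeling for Exercise 1. The sketch below assumes a hypothetical measure on the subsets of $\{1, 2, 3\}$ given by point weights, with the point 3 of weight 0, so that two distinct sets can be at distance 0. It checks the three semimetric properties by brute force over a few sets.

```python
from itertools import product

# Hypothetical point weights; the point 3 has measure 0.
WEIGHTS = {1: 1, 2: 2, 3: 0}

def mu(A):
    """Measure of a finite set as the sum of its point weights."""
    return sum(WEIGHTS[x] for x in A)

def d(A, B):
    """d(A, B) = mu(A - B) + mu(B - A)."""
    return mu(A - B) + mu(B - A)

sets = [frozenset(s) for s in ([], [1], [2], [1, 2], [2, 3], [1, 2, 3])]
semimetric_ok = all(
    d(A, B) >= 0 and d(A, B) == d(B, A) and d(A, C) <= d(A, B) + d(B, C)
    for A, B, C in product(sets, repeat=3)
)
distinct_but_distance_zero = d(frozenset([2]), frozenset([2, 3])) == 0
```

Both flags should be true: the axioms hold on the sample, and $\{2\}$, $\{2, 3\}$ are distinct sets at distance 0.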
2. Radon-Nikodym derivative. Let $\mu$, $\nu$ be positive measures, and let $m$ be a
complex measure. Suppose that $dm = f\,d\mu$, where $f \in \mathcal{L}^1(\mu)$ and $d\mu = g\,d\nu$
where $g \in \mathcal{L}^1(\nu, \mathbf{R})$. Prove that $fg \in \mathcal{L}^1(\nu)$, and that $dm = fg\,d\nu$. If we use the
notation $dm/d\mu = f$ and $d\mu/d\nu = g$, then we have the old formalism
$$\frac{dm}{d\mu}\,\frac{d\mu}{d\nu} = \frac{dm}{d\nu}.$$
One sometimes calls $f$ the Radon-Nikodym derivative of $m$ with respect to $\mu$.
[By the way, you may view $dm$ or $d\mu$, $d\nu$ as linear maps on step functions,
which amounts to considering measures $m = \mu_f$ or $\mu = \nu_g$.]
3. Let $X$ consist of two points $x$ and $y$. Define $\mu(\{x\}) = 1$ and
$$\mu(\{y\}) = \mu(X) = \infty.$$
Determine whether $L^\infty(\mu, \mathbf{R})$ is the dual of $L^1(\mu, \mathbf{R})$.

4. Let $F$ be a subspace of $L^2(\mu, \mathbf{C})$ and assume that there is some number $c > 0$
such that for all $f$ in $F$ we have
Assume that $\mu(X) < \infty$. Show that $F$ is finite dimensional and that
dim F ~ CIl(X).
[Hint (Moser): Let $f_1, \ldots, f_n$ be orthonormal elements in $F$. Let
Let $x_0$ be a point such that
and also
Consider the function $f = \sum \alpha_k f_k$ with $\alpha_k = f_k(x_0)/b$.]

5. Let $X$ be the set of positive integers, and let $\mu$ be the counting measure on
$X$ which gives each point measure 1. Let $l^1$ and $l^\infty$ denote $L^1(\mu, \mathbf{C})$ and
$L^\infty(\mu, \mathbf{C})$.
(a) Show that $l^1$ consists of all complex sequences $\alpha = \{a_n\}$ with norm
$$\|\alpha\|_1 = \sum |a_n| < \infty.$$
Show that $l^\infty$ consists of all complex sequences $\alpha = \{a_n\}$ with norm
$$\|\alpha\|_\infty = \sup |a_n| < \infty.$$
(b) Show that $l^1$ is the dual of the subspace $c_0$ of $l^\infty$ consisting of all
sequences $\alpha = \{a_n\}$ such that $a_n \to 0$ as $n \to \infty$. Show that the dual of $l^1$
is $l^\infty$, quoting any theorem from the text.
(c) Show that $c_0$ and $l^1$ are separable, but that $l^\infty$ is not separable.
The space $H_s$. (For applications to PDE, cf. $SL_2(\mathbf{R})$, Appendix 4.)
6. Let $s$ be an integer. On the integers $\mathbf{Z}$ define
Then $\mu_s$ is a measure on $\mathbf{Z}$.
(a) Define the space $H_s$ to be the space of functions on $\mathbf{Z}$, written in the form
of sequences $\{a_n\}$, such that the sum
converges. If $f = \{a_n\}$ and $g = \{b_n\}$, define the scalar product in $H_s$ to be
Show that $H_s = L^2(\mathbf{Z}, \mu_s)$, and in particular is complete for the norm
associated with this scalar product.
(b) Show that the finite sequences $f = \{a_n\}$ such that $a_n = 0$ for all but a
finite number of $n$ form a dense subspace of $H_s$.
7. For each function $f \in C^\infty(T)$, where $T = \mathbf{R}/\mathbf{Z}$ is the circle, or if you wish, for
each $C^\infty$ function on $\mathbf{R}$, periodic of period 1, associate the Fourier series
$$\sum_{n \in \mathbf{Z}} a_n e^{2\pi i n x}, \quad \text{where } a_n = \int_0^1 f(x) e^{-2\pi i n x}\,dx.$$
(a) Integrating by parts, show that the coefficients satisfy the inequality
$$|a_n| \ll \frac{1}{|n|^k}$$
for each positive integer $k$. The symbol $\ll$ means that the left-hand side
is less than some constant times the right-hand side for $|n| \to \infty$.
(b) Prove that $C^\infty(T) \subset L^2(\mathbf{Z}, \mu_s)$ for all $s \in \mathbf{Z}$, and that $C^\infty(T)$ is dense in this
space $L^2$. [Look at the finite Fourier series
$$\sum_{|n| \le N} a_n e^{2\pi i n x}.]$$
8. Let $r < s$. Prove that the unit ball in $H_s$ is relatively compact in $H_r$, in other
words that this unit ball is totally bounded in $H_r$.
9. Let $\{f_n\}$ be a sequence in $\mathcal{L}^2(X, \mu)$ such that $\|f_n\|_2 \to 0$ as $n \to \infty$. Prove that
$$\lim_{n \to \infty} \int_X |f_n(x)| \log(1 + |f_n(x)|)\,d\mu(x) = 0.$$
10. Let $1 \le p < \infty$ and let $f \in \mathcal{L}^p(\mathbf{R})$ (for Lebesgue measure). For $a \in \mathbf{R}$ let $T_a$ be
the translation by $a$, that is $T_a f(x) = f(x - a)$. Prove that $T_a f$ converges to $f$
in $L^p$ as $a \to 0$. Is the conclusion still true if $p = \infty$?
11. Let $\mu$ be a $\sigma$-finite positive measure on the Borel sets in $\mathbf{R}$, and suppose
$L^1(\mathbf{R}, \mu) \subset L^\infty(\mathbf{R}, \mu)$. Show that there exists $c > 0$ such that if $A$ is a Borel set
with $\mu(A) > 0$ then $\mu(A) \ge c$.
12. Prove Theorem 5.6. [Hint: Use Hölder's inequality and Fubini's theorem.]
13. Let $T: L^2(X, \mu) \to L^2(X, \mu)$ be a continuous linear map, and assume that $X$ is
$\sigma$-finite. Assume that $T$ commutes with all operators $M_g$ such that $M_g(f) = gf$,
for $g \in \mathcal{L}^\infty$ and $f \in \mathcal{L}^2$. Prove that $T = M_g$ for some $g$. [Hint: Write $X$
as a disjoint union of sets of finite measure $X_n$ and let $\varphi$ be the function
which is the constant $1/(n^2 \mu(X_n)^{1/2})$ on $X_n$. For $f \in \mathcal{L}^\infty \cap \mathcal{L}^2$, we have
$$T(\varphi f) = \varphi T(f) = f T(\varphi).$$
Let $g = T\varphi/\varphi$. Then $Tf = gf$. Prove that $g$ is bounded as follows. If it is
not, given $N$ there is a subset of finite positive measure $Y$ such that $|g| \ge N$
on $Y$. Consider $T((\bar{g}/g)\chi_Y)$ to contradict the boundedness of $T$.]
For an application, see $SL_2(\mathbf{R})$, Lemma 4 of Theorem 4, Chapter XI, §3.
14. Let $E$ be a Banach space and let $\nu: \mathcal{M} \to E$ be an $E$-valued measure. Show
that one can define (in a manner similar to that in the text) a linear map
$$\mathrm{St}(|\nu|, \mathbf{C}) \to E,$$
and that this map is $L^1(|\nu|)$-continuous. This linear map can therefore be
extended linearly by continuity to $\mathcal{L}^1(|\nu|, \mathbf{C})$, thus allowing you to define
$$\int f\,d\nu, \quad \text{for } f \in \mathcal{L}^1(|\nu|, \mathbf{C}).$$
15. Let $E$ be a Banach space and $E'$ its dual. In the bilinear map
given by
show that $|\lambda_f| = \|f\|_1$ and $|\lambda_g| = \|g\|_\infty$, just as in the Hilbert case. [Hint: Use
step maps, and for a constant map, use the Hahn-Banach theorem to see
that given $v \in E$, there exists $v' \in E'$ such that $|v'| = |v|$ and $\langle v, v' \rangle = |v|$.]
16. Assume that $\mu(X)$ is finite. Let $E$ be a Banach space and $E'$ its dual space.
The definition of a $\mu$-continuous functional $\lambda$ on $L^\infty(\mu, E)$ is as in the text.
Show that such a functional can be written in the form $\lambda = d\nu$ for some
$E'$-valued measure $\nu$, in the sense that on a map $v\chi_A$ ($v \in E$ and $A$ measurable)
we have
$$\lambda(v\chi_A) = \langle v, \nu(A) \rangle.$$
17. Prove the statement included in the remark following Corollary 4.4 of Theorem 4.1,
concerning that part of the dual of $L^\infty(\mu, E)$ represented by a measure
in $E'$.
18. Let $f$ be a $\mu$-measurable map of $X$ into a Banach space $E$. Given a measurable
set $A$ with $\mu(A)$ finite, and $\epsilon$, show that there exists $Z \subset A$ such that
$\mu(Z) < \epsilon$ and $f(A - Z)$ is relatively compact (or equivalently, totally bounded).
We may say that $f$ is locally almost compact valued.
19. The essential image. Let $E$ be a Banach space. Let $f$ be a measurable map
and let $A$ be a measurable set. The essential image of $f$ on $A$ is defined to be
the set of all $v \in E$ such that for every $r > 0$ the measure of the set
$$A \cap f^{-1}(B_r(v))$$
is strictly positive. We denote it by $\mathrm{ei}_A(f)$.
(i) The essential image is closed.
(ii) If $\mu(A) > 0$, then $\mathrm{ei}_A(f)$ intersects the image $f(A)$.
(iii) The set $Z$ of elements $x \in A$ such that $f(x)$ does not lie in $\mathrm{ei}_A(f)$ has
measure 0.
(iv) Let $A = \bigcup A_n$ be a denumerable union of measurable sets. Show that
$$\mathrm{ei}_A(f) = \text{closure of } \bigcup_{n=1}^{\infty} \mathrm{ei}_{A_n}(f).$$
20. Let $E$ be a real Banach space, and $f: X \to E$ any map. Let $g: X \to \mathbf{R}$ be a
real positive function on $X$ which is in $\mathcal{L}^1(\mu, \mathbf{R})$ and such that
$$\int_X g\,d\mu = 1.$$
Assume that $gf$ is in $\mathcal{L}^1(\mu, E)$. If $\lambda$ is a functional on $E$, and $c \in \mathbf{R}$, we define
a half space $H^+(\lambda, c)$ to consist of all $v \in E$ such that $\lambda v \ge c$. Let $H$ be such a
half space containing $f(X)$. Show that
$$\int_X g f\,d\mu$$
belongs to $H$.
In view of the result on convex sets in §2 of the Appendix to Chapter 4, it
follows that the above "average" in fact lies in the closure of the convex set
generated by the image $f(X)$, i.e. the smallest closed convex set containing
$f(X)$.
21. Let $E = L^1(\mu, \mathbf{C})$ where $X = [0, 1]$ and $\mu$ is Lebesgue measure on the algebra
of Borel sets. For each Borel set $A$ let
$$m(A) = \text{class of } \chi_A \text{ in } L^1(\mu, \mathbf{C}).$$
(a) Show that $|m|$ is Lebesgue measure itself. (b) Show that $m$ is an $E$-valued
measure which cannot be written as $\mu_f$. [Hint: View $dm$ as a functional
on step functions, say real valued, so that for any step function $\varphi$ and
measurable set $A$ we have
$$\int_A \varphi\,dm = \int_A \varphi f\,d\mu.]$$
22. Let $E$ be a Banach space. Let $P$ denote the set of all partitions, i.e. collections
$\pi$ consisting of a finite number of disjoint measurable sets of finite
measure. We let $\pi_1 \ge \pi$ if every element of $\pi$ is, up to a set of measure 0, the
union of elements of $\pi_1$. For each $\pi \in P$ and $f \in \mathcal{L}^1(\mu, E)$ we define
$$f_\pi = T_\pi f = \sum_{A \in \pi} [\mu_f(A)/\mu(A)]\,\chi_A,$$
where $\mu_f(A) = \int_A f\,d\mu$.
(a) Show that $T_\pi: L^1(\mu, E) \to L^1(\mu, E)$ is a continuous linear map of norm 1,
and that $T_\pi f$ is $L^1$-convergent to $f$ in the following sense: Given $\epsilon$ there
exists $\pi_0$ such that for all $\pi \ge \pi_0$ we have
$$\|T_\pi f - f\|_1 < \epsilon.$$
(b) Prove the same thing replacing 1 by $p$ for $1 < p < \infty$.
CHAPTER VIII
Some Applications of
Integration
After the abstract theory on arbitrary measured spaces, it is a relief to
get into some classical situations on $\mathbf{R}^n$ where we see the integral at
work. None of this chapter will be used later, except for the approximation
by Dirac families in the uniqueness proof for the spectral measure of
Chapter 20.
In this chapter we deal with the Fourier transform in a context of
absolute convergence. In Chapter 10, §2 we shall deal with a more delicate
context, involving oscillatory convergence.
VIII, §1. CONVOLUTION
Suppose first we deal with functions $f$, $g$ on the real line. We shall study
their convolution, defined by the integral
$$f * g(y) = \int_{-\infty}^{\infty} f(x) g(y - x)\,dx.$$
Of course, the integral must be convergent, or even absolutely convergent.
Theorems 1.1 and 1.2 will give conditions for such convergence.
Furthermore, we don't need to work only on $\mathbf{R}$, and we shall express the
results on $\mathbf{R}^n$, abbreviating
$$\int_{\mathbf{R}^n} f(x) g(y - x)\,dx = \int f(x) g(y - x)\,dx.$$
In most applications, one of the two functions $f$ or $g$ is continuous or
even $C^\infty$, and the resulting convolution is also continuous or $C^\infty$. To see
this one must be able to take a limit or differentiate under the integral
sign, and the next section gives basic conditions under which this is
legitimate. We shall see several examples after the main approximation
theorem is proved in Theorem 3.1.
We now come to the basic tests for absolute convergence of the
convolution integral.
Theorem 1.1. Let $f, g \in \mathcal{L}^1(\mathbf{R}^n)$. Then for almost all $y \in \mathbf{R}^n$ the function
$$x \mapsto f(x) g(y - x)$$
is in $\mathcal{L}^1(\mathbf{R}^n)$. The convolution $f * g$ given for almost all $y$ by
$$f * g(y) = \int f(x) g(y - x)\,dx$$
is also in $\mathcal{L}^1$. The association $(f, g) \mapsto f * g$ is an associative, commutative
bilinear map, satisfying
$$\|f * g\|_1 \le \|f\|_1 \|g\|_1.$$
Thus $L^1(\mathbf{R}^n)$ is a Banach algebra under the convolution product.
Proof. We integrate $|f(x)||g(y - x)|$ first with respect to $y$, and then
with respect to $x$. We apply part 2 of Fubini's theorem, Theorem 8.7 of
Chapter VI. We then conclude that $f * g$ is in $\mathcal{L}^1$. The last inequality in
the statement of the theorem follows at once. The bilinearity is obvious,
and so is commutativity. The associativity is proved using Fubini's theorem,
and is left to the reader.
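The algebraic content of Theorem 1.1 can be checked by hand in a discrete analogue. The sketch below is only an illustration under an assumption not made in the text: we replace $\mathbf{R}^n$ with Lebesgue measure by $\mathbf{Z}$ with counting measure, and represent finitely supported functions as dictionaries. The names `convolve`, `norm1` and the sample data are hypothetical.

```python
def convolve(f, g):
    """(f * g)(n) = sum over i + j = n of f(i) g(j), for finitely supported f, g on Z."""
    h = {}
    for i, fi in f.items():
        for j, gj in g.items():
            h[i + j] = h.get(i + j, 0) + fi * gj
    return h

def norm1(f):
    """l^1 norm with respect to counting measure."""
    return sum(abs(v) for v in f.values())

# Hypothetical sample functions, stored as {index: value}.
f = {0: 1, 1: -2, 3: 4}
g = {-1: 3, 2: -1}
h = convolve(f, g)
commutative = (h == convolve(g, f))
bounded = norm1(h) <= norm1(f) * norm1(g)
```

Here `norm1(h)` equals 26 while `norm1(f) * norm1(g)` equals 28, so the inequality is strict in this example because of cancellation at the index 2.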
Theorem 1.2. Let $f \in \mathcal{L}^1(\mathbf{R}^n)$ and $g \in \mathcal{L}^p(\mathbf{R}^n)$ with $1 \le p \le \infty$. Then
$f * g(y)$ is defined by the integral for almost all $y$ and is in $\mathcal{L}^p$. We
have
$$\|f * g\|_p \le \|f\|_1 \|g\|_p.$$
Proof. The case $p = 1$ is treated in Theorem 1.1. Suppose that $p = \infty$.
If $f \in \mathcal{L}^1(\mathbf{R}^n)$ and $g \in \mathcal{L}^\infty(\mathbf{R}^n)$, so we may assume $g$ is a bounded measurable
function, then we may also form the convolution $f * g$ given by the
same formula
$$f * g(y) = \int f(x) g(y - x)\,dx = \int f(y - x) g(x)\,dx.$$
The integrals converge absolutely, and we have the trivial estimate from
the first integral, replacing $g$ by its bound $\|g\|_\infty$, namely
$$\|f * g\|_\infty \le \|f\|_1 \|g\|_\infty$$
as desired.
Finally suppose that $1 < p$. Let $q$ be as usual such that $1/p + 1/q = 1$.
Then we have the inequality
$$\int |f(x)|^{1/p} |g(y - x)|\,|f(x)|^{1/q}\,dx \le \left[\int |f(x)||g(y - x)|^p\,dx\right]^{1/p} \left[\int |f(x)|\,dx\right]^{1/q},$$
from which we see that $f * g$ is defined for almost all elements of $\mathbf{R}^n$, and
also that
$$|(f * g)(y)|^p \le \left[\int |f(x)||g(y - x)|^p\,dx\right] \|f\|_1^{p/q}.$$
We integrate and use Fubini's theorem, obtaining
$$\|f * g\|_p^p \le \|f\|_1^{p/q} \int |f(x)|\,\|g_x\|_p^p\,dx.$$
But the $L^p$-seminorm is invariant under translations, i.e. $\|g_x\|_p = \|g\|_p$.
Since $1 + p/q = p$, we take the $p$-th root to obtain
$$\|f * g\|_p \le \|f\|_1 \|g\|_p.$$
VIII, §2. CONTINUITY AND DIFFERENTIATION UNDER THE INTEGRAL SIGN
Lemma 2.1. Let $X$ be a measured space with positive measure $\mu$. Let
$U$ be an open subset of $\mathbf{R}^n$. Let $f$ be a function on $X \times U$. Assume:
(i) For each $y \in U$ the function $x \mapsto f(x, y)$ is in $\mathcal{L}^1(\mu)$.
(ii) For each $x \in X$ and $y_0 \in U$, we have
$$\lim_{y \to y_0} f(x, y) = f(x, y_0).$$
(iii) There exists a function $f_1 \in \mathcal{L}^1(\mu)$ such that for all $y \in U$,
$$|f(x, y)| \le |f_1(x)|.$$
Then the function
$$y \mapsto \int_X f(x, y)\,d\mu(x)$$
is continuous.
Proof. It suffices to prove that for any sequence $\{y_k\}$ converging to $y$,
$$\int_X f(x, y_k)\,d\mu(x) \quad \text{converges to} \quad \int_X f(x, y)\,d\mu(x).$$
Let $f_k(x) = f(x, y_k)$. Then $\{f_k\}$ converges pointwise to the function
$x \mapsto f(x, y)$,
and by (iii), we can apply the dominated convergence theorem to conclude
the proof.
Lemma 2.2. Let $X$ be a measured space with positive measure $\mu$. Let
$U$ be an open subset of $\mathbf{R}^n$. Let $f$ be a function on $X \times U$. Assume:
(i) For each $y \in U$ the function $x \mapsto f(x, y)$ is in $\mathcal{L}^1(\mu)$.
(ii) For each $y \in U$, each partial derivative $D_j f(x, y)$ (taken with respect
to the $j$-th $y$-variable) is in $\mathcal{L}^1(\mu)$.
(iii) There exists a function $f_1 \in \mathcal{L}^1(\mu)$ such that for all $y \in U$,
$$|D_j f(x, y)| \le |f_1(x)|.$$
Let
$$\Phi(y) = \int_X f(x, y)\,d\mu(x).$$
Then $D_j \Phi$ exists and we have
$$D_j \Phi(y) = \int_X D_j f(x, y)\,d\mu(x).$$
Proof. Let $e_j$ be the usual $j$-th unit vector in $\mathbf{R}^n$. We have
$$\frac{\Phi(y + h e_j) - \Phi(y)}{h} = \int_X \frac{1}{h}\,[f(x, y + h e_j) - f(x, y)]\,d\mu(x).$$
Using the mean value theorem and (iii), together with the dominated
convergence theorem, we conclude that the right-hand side has a limit,
equal to
$$\int_X D_j f(x, y)\,d\mu(x).$$
[As in the previous proof, we have to use the device of taking a sequence
$\{h_k\}$ to apply the dominated convergence theorem in its standard form.]
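As a numerical sanity check of Lemma 2.2, consider the sample function $f(x, y) = \sin(xy)$ on $X = [0, 1]$ with Lebesgue measure; this choice, the midpoint quadrature rule, and the function names are assumptions made for illustration. Here $|D f(x, y)| = |x \cos(xy)| \le 1$, so (iii) holds with $f_1 = 1$, and $\Phi(y) = (1 - \cos y)/y$ can be differentiated by hand for comparison.

```python
import math

def integrate(func, a, b, n=2000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def phi(y):
    """Phi(y) = integral over [0, 1] of sin(x * y) dx."""
    return integrate(lambda x: math.sin(x * y), 0.0, 1.0)

def phi_prime_under_integral(y):
    """Integral over [0, 1] of the y-derivative x * cos(x * y)."""
    return integrate(lambda x: x * math.cos(x * y), 0.0, 1.0)

y, step = 1.3, 1e-5
difference_quotient = (phi(y + step) - phi(y - step)) / (2 * step)
# Derivative of the closed form Phi(y) = (1 - cos y) / y.
exact = (y * math.sin(y) - (1.0 - math.cos(y))) / (y * y)
```

The three quantities `difference_quotient`, `phi_prime_under_integral(y)` and `exact` should agree up to the quadrature and finite-difference errors.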
Theorem 2.3. Let $f \in \mathcal{L}^1(\mathbf{R}^n)$ and let $\varphi$ be a $C^\infty$ function with compact
support. Then $f * \varphi$ is $C^\infty$ and in fact
$$D^p(f * \varphi) = f * D^p \varphi.$$
Proof. We can form the convolutions by using Theorem 1.1 and we
have
$$f * \varphi(y) = \int f(x) \varphi(y - x)\,dx.$$
Lemmas 2.1 and 2.2 show that $f * \varphi$ is $C^\infty$, and allow us to differentiate
repeatedly under the integral sign.
VIII, §3. DIRAC SEQUENCES
By a Dirac sequence on $\mathbf{R}^n$ we shall mean a sequence of functions $\{\varphi_k\}$
on $\mathbf{R}^n$, real valued, continuous, satisfying the following properties:
DIR 1. We have $\varphi_k \ge 0$ for all $k$.
DIR 2. For all $k$ we have $\int \varphi_k(x)\,dx = 1$.
DIR 3. Given $\epsilon, \delta > 0$ there exists $k_0$ such that
$$\int_{|x| \ge \delta} \varphi_k(x)\,dx < \epsilon$$
for all $k \ge k_0$.
The third condition shows that for large $k$, the volume under $\varphi_k$ is concentrated
near the origin. Thus in one variable, the sequence looks like this:
We have drawn the picture so that the sequence satisfies a property
somewhat stronger than what is expressed in DIR 3, namely the support
of $\varphi_k$ tends to 0 as $k \to \infty$. We state this condition formally. By a Dirac
sequence with shrinking support, we mean a sequence satisfying DIR 1,
DIR 2, and the third condition:
DIR 3s. Each $\varphi_k$ has compact support, and given $\delta$, the support of $\varphi_k$ is
contained in the ball of radius $\delta$, centered at the origin, for all
$k$ sufficiently large.
To construct such a sequence we can start with a positive function $\varphi$,
continuous or even infinitely differentiable, having support in the ball of
radius 1, centered at the origin, and such that
$$\int \varphi(x)\,dx = 1.$$
We then let $\varphi_k(x) = k^n \varphi(kx)$. A sequence constructed in this manner will
be called a regularizing sequence. It has additional properties besides
those three of the Dirac sequence, namely: the support of $\varphi_k$ is determined
explicitly in terms of the support of $\varphi$, and is contained in the ball
of radius $1/k$; in fact, it is contained in $(1/k)\,\mathrm{supp}\,\varphi$. In addition, the
partial derivatives of $\varphi_k$ can be easily estimated in terms of those of $\varphi$
and $k$. This is frequently useful in applications when one has to make
careful estimates on such derivatives.
We now show how a Dirac sequence can be used to approximate a
function. We shall prove first the main approximation theorem for $\mathcal{L}^\infty$,
with condition DIR 3, and give some applications. Then we prove an
analogous theorem with condition DIR 3s, and see how it implies approximation
results for functions in $\mathcal{L}^p$ with $p < \infty$.
Theorem 3.1. Let $f$ be a bounded measurable function on $\mathbf{R}^n$. Let $A$
be a compact set on which $f$ is continuous. Let $\{\varphi_k\}$ be a sequence
satisfying DIR 1, DIR 2, DIR 3. Then $\varphi_k * f$ converges to $f$ uniformly
on $A$.
Proof. For $x \in A$ we have:
$$\varphi_k * f(x) = \int \varphi_k(y) f(x - y)\,dy$$
and by DIR 2,
$$f(x) = f(x) \int \varphi_k(y)\,dy = \int f(x) \varphi_k(y)\,dy.$$
Hence
$$\varphi_k * f(x) - f(x) = \int \varphi_k(y)\,[f(x - y) - f(x)]\,dy.$$
By the relative uniform continuity of $f$ on $A$, given $\epsilon$, there exists $\delta$ such
that if $|y| < \delta$ then for all $x \in A$ we have
$$|f(x - y) - f(x)| < \epsilon.$$
We then write
$$\varphi_k * f(x) - f(x) = \left(\int_{|y| < \delta} + \int_{|y| \ge \delta}\right) \varphi_k(y)\,[f(x - y) - f(x)]\,dy.$$
The integral over $|y| < \delta$ is then bounded by $\epsilon$. For the other integral
with $|y| \ge \delta$, we use DIR 3 to conclude that this integral is bounded by
$2\|f\|_\infty \epsilon$ for $k$ sufficiently large. This concludes the proof.
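The uniform convergence in Theorem 3.1 can be watched numerically in one variable. The sketch below rests on assumptions chosen for illustration: the kernel is the hat function $\varphi(x) = \max(0, 1 - |x|)$ rescaled to $\varphi_k(x) = k\varphi(kx)$, the function is $f = \cos$, the compact set is $A = [-1, 1]$ sampled on a grid, and the convolution integral is approximated by the midpoint rule. The names `hat`, `phi_k`, `smooth` are hypothetical.

```python
import math

def hat(x):
    """phi(x) = 1 - |x| on [-1, 1], 0 elsewhere; nonnegative, continuous, integral 1."""
    return max(0.0, 1.0 - abs(x))

def phi_k(k, x):
    """Rescaled kernel k * phi(k x), supported in [-1/k, 1/k]."""
    return k * hat(k * x)

def smooth(f, k, x, n=2000):
    """Midpoint-rule approximation of (phi_k * f)(x) = int phi_k(y) f(x - y) dy."""
    h = (2.0 / k) / n
    total = 0.0
    for i in range(n):
        y = -1.0 / k + (i + 0.5) * h
        total += phi_k(k, y) * f(x - y)
    return h * total

grid = [i / 20.0 for i in range(-20, 21)]   # sample points of A = [-1, 1]
errors = [max(abs(smooth(math.cos, k, x) - math.cos(x)) for x in grid)
          for k in (1, 2, 4, 8)]
```

The list `errors` approximates the sup norm of $\varphi_k * f - f$ on $A$ and should decrease as $k$ increases. For this particular kernel and $f = \cos$ one can compute by hand that $\varphi_k * \cos = 2k^2(1 - \cos(1/k)) \cos$, which gives an independent check.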
We shall now give classical examples of Theorem 3.1.
Example 1 (The Landau Sequence and Weierstrass' Approximation
Theorem). By means of a suitable Dirac sequence one can give an
explicit proof for Weierstrass' theorem that a continuous function can be
uniformly approximated by a polynomial on a closed interval. Suppose
$f$ is continuous on $[a, b]$. Making a translation and dilation of the
variable if necessary we may assume that $[a, b] = [0, 1]$. Let $y = L(x)$ be
the equation of the straight line passing through the end points of the
graph of the function. Then $L$ is a polynomial (of degree $\le 1$), and
$f - L$ has the additional property that $f - L$ vanishes at 0 and 1. Thus
without loss of generality, to prove Weierstrass' theorem, we may assume
that $f(0) = f(1) = 0$. We then extend $f$ to $\mathbf{R}$ by defining $f(x) = 0$ for
$x \notin [0, 1]$.
We now define the Landau functions
$$\varphi_k(x) = \frac{1}{c_k}(1 - x^2)^k \quad \text{for } |x| \le 1,$$
where the constant $c_k$ is taken to be
$$c_k = \int_{-1}^{1} (1 - x^2)^k\,dx$$
so that
$$\int_{-1}^{1} \varphi_k(x)\,dx = 1.$$
We define $\varphi_k(x) = 0$ if $x$ is outside the interval $[-1, 1]$. It is an easy
exercise to show that $\{\varphi_k\}$ is a Dirac sequence. We have
$$\varphi_k * f(x) = \int_{-\infty}^{\infty} \varphi_k(x - t) f(t)\,dt = \int_0^1 \varphi_k(x - t) f(t)\,dt.$$
Furthermore, for $x, t \in [0, 1]$ the function $\varphi_k(x - t)$ is a polynomial,
namely
$$\varphi_k(x - t) = \frac{1}{c_k}(1 - (x - t)^2)^k = \sum_{j=0}^{2k} u_j(x)\,t^j$$
where each $u_j$ is a polynomial. Hence
$$\varphi_k * f(x) = \sum_{j=0}^{2k} \alpha_j u_j(x) \quad \text{where } \alpha_j = \int_0^1 t^j f(t)\,dt,$$
and therefore $\varphi_k * f$ is a polynomial. By Theorem 3.1 the sequence $\{\varphi_k * f\}$
converges to $f$ uniformly on $[0, 1]$, thus proving Weierstrass' theorem.
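To get a feeling for how the Landau approximation behaves, one can evaluate $\varphi_k * f$ numerically for a sample $f$. The sketch below assumes $f(x) = x(1 - x)$ on $[0, 1]$ (which vanishes at 0 and 1 and is extended by 0), the normalization $\varphi_k(x) = (1 - x^2)^k / c_k$ with $c_k = \int_{-1}^{1} (1 - t^2)^k\,dt$, and midpoint quadrature. All function names are hypothetical.

```python
def integrate(func, a, b, n=2000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def landau(k):
    """Return phi_k(x) = (1 - x^2)^k / c_k on [-1, 1], extended by 0 outside."""
    c_k = integrate(lambda t: (1.0 - t * t) ** k, -1.0, 1.0)
    return lambda x: (1.0 - x * x) ** k / c_k if abs(x) <= 1.0 else 0.0

def f(x):
    """Sample function with f(0) = f(1) = 0, extended by 0 outside [0, 1]."""
    return x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

def landau_approx(k, x):
    """(phi_k * f)(x) = int_0^1 phi_k(x - t) f(t) dt; a polynomial in x on [0, 1]."""
    phi = landau(k)
    return integrate(lambda t: phi(x - t) * f(t), 0.0, 1.0)

grid = [i / 10.0 for i in range(11)]
errors = [max(abs(landau_approx(k, x) - f(x)) for x in grid) for k in (10, 40, 160)]
```

The errors decrease with $k$, though rather slowly near the end points 0 and 1, where the extended function has corners.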
In some applications, we deal with periodic functions of period $2\pi$,
and in this case a Dirac sequence is defined in an analogous way, taking
integrals over an interval of length $2\pi$. The next example is of this type.
Example 2 (Cesaro Summation). Let $f$ be a periodic continuous function
of period $2\pi$. Let $S_{n,f}$ be the $n$-th partial sum of the Fourier series
for $f$, that is
$$S_{n,f}(x) = \sum_{k=-n}^{n} c_k e^{ikx} \quad \text{where } c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt.$$
Let $A_n$ be the average of these partial sums, that is
$$A_n = \frac{1}{n}(S_{0,f} + \cdots + S_{n-1,f}).$$
Then a theorem of Fejer-Cesaro asserts that $\{A_n\}$ converges uniformly to
$f$. This result is a special case of Theorem 3.1 as follows. Let
$$K_n(x) = \frac{1}{2\pi n} \sum_{m=0}^{n-1} \sum_{k=-m}^{m} e^{ikx} = \frac{1}{n}(D_0 + \cdots + D_{n-1}).$$
Then simple manipulations will prove the identity
$$K_n(x) = \frac{1}{2\pi n}\,\frac{\sin^2(nx/2)}{\sin^2(x/2)}.$$
It is then easy to verify that $\{K_n\}$ is a Dirac sequence, that $D_n * f$ is the
$n$-th partial sum of the Fourier series of $f$, and that $K_n * f$ is the average
of the partial sums. Therefore by Theorem 3.1, the averages of the
partial sums of the Fourier series converge uniformly to $f$. Do Exercise 2
to carry out the details of this proof.
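Before doing the "simple manipulations", one may wish to confirm the closed formula for $K_n$ numerically. The sketch below compares the double exponential sum with the quotient of squared sines at a few sample points; the function names are hypothetical, and the comparison of course does not replace the proof asked for in Exercise 2.

```python
import cmath
import math

def fejer_sum(n, x):
    """K_n(x) computed from (1 / (2 pi n)) * sum_{m<n} sum_{|k|<=m} e^{ikx}."""
    total = sum(cmath.exp(1j * k * x)
                for m in range(n) for k in range(-m, m + 1))
    return total.real / (2 * math.pi * n)

def fejer_closed(n, x):
    """Closed form sin^2(n x / 2) / (2 pi n sin^2(x / 2)), for x not in 2 pi Z."""
    return math.sin(n * x / 2) ** 2 / (2 * math.pi * n * math.sin(x / 2) ** 2)

max_gap = max(abs(fejer_sum(n, x) - fejer_closed(n, x))
              for n in (1, 3, 8) for x in (0.3, 1.1, -2.5))
```

The value `max_gap` should be of the order of rounding error.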
In some cases, instead of considering a sequence of functions, one
considers a family of functions indexed by some real numbers, as in the
next example.
Example 3 (Harmonic Functions and the Poisson Family). For $0 \le r < 1$,
define the Poisson family to be
$$P_r(\theta) = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}.$$
Then $P_r(\theta)$ satisfies the three conditions DIR 1, DIR 2, DIR 3 where $k$ is
replaced by $r$ and $r \to 1$ instead of $k \to \infty$. In other words:
DIR 1. We have $P_r(\theta) \ge 0$ for all $r$ and all $\theta$.
DIR 2. For all $r$ we have
$$\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = 1.$$
DIR 3. Given $\epsilon$ and $\delta$, there exists $r_0$, $0 < r_0 < 1$, such that if $r_0 < r < 1$,
then
$$\int_{-\pi}^{-\delta} P_r + \int_{\delta}^{\pi} P_r < \epsilon.$$
For DIR 3 you will prove and use the formula
$$P_r(\theta) = \frac{1}{2\pi}\,\frac{1 - r^2}{1 - 2r\cos\theta + r^2}.$$
Theorem 3.1 concerning Dirac sequences applies to the family $\{P_r\}$, again
letting $r \to 1$ instead of $k \to \infty$. In other words, let $f$ be a bounded
measurable function on $\mathbf{R}$ which is periodic. Let $S$ be a compact set on
which $f$ is continuous. Let
$$f_r = P_r * f.$$
Then $\{f_r\}$ converges to $f$ uniformly on $S$ as $r \to 1$.
The use of the Poisson family comes from the desire to solve a boundary
value problem as follows. We are given a function $f$, viewed as a
function on the circle, that is $f(\theta)$ is periodic, as usual. We want to find
a function on the disc, that is a function $u(r, \theta)$ with $0 \le r \le 1$, satisfying
the Laplace equation $\Delta u = 0$, where $\Delta$ is the Laplace operator, given in
polar coordinates by
$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}$$
and such that $u$ has period $2\pi$ in its second variable, that is
$$u(r, \theta) = u(r, \theta + 2\pi).$$
We want $u$ to be continuous, and we want $u(1, \theta)$ to be as much like $f(\theta)$
as possible. If $f$ is continuous on the circle, then we want $u(1, \theta) = f(\theta)$.
The convolution
$$u(r, \theta) = (P_r * f)(\theta)$$
solves the problem, because $\Delta P = 0$, so by differentiating under the integral
sign we find
$$\Delta(P * f) = (\Delta P) * f = 0.$$
Carry out the details as Exercise 3.
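The closed formula for $P_r$ and the condition DIR 3 can be examined numerically. In the sketch below we assume the series definition $P_r(\theta) = (1/2\pi)\sum r^{|n|} e^{in\theta}$, truncated after 200 terms, and we measure the mass of $P_r$ outside $|\theta| < \delta$ with $\delta = 0.5$ by the midpoint rule. The function names and the sample parameters are hypothetical.

```python
import math

def poisson_series(r, theta, terms=200):
    """Truncation of (1 / (2 pi)) * sum_n r^|n| e^{i n theta}, written with cosines."""
    s = 1.0 + 2.0 * sum(r ** n * math.cos(n * theta) for n in range(1, terms + 1))
    return s / (2 * math.pi)

def poisson_closed(r, theta):
    """(1 / (2 pi)) * (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1 - r * r) / (2 * math.pi * (1 - 2 * r * math.cos(theta) + r * r))

def tail_mass(r, delta, n=4000):
    """Integral of P_r over delta <= |theta| <= pi (midpoint rule, using evenness)."""
    h = (math.pi - delta) / n
    return 2 * h * sum(poisson_closed(r, delta + (i + 0.5) * h) for i in range(n))

gap = max(abs(poisson_series(r, t) - poisson_closed(r, t))
          for r in (0.5, 0.9) for t in (0.0, 1.0, 3.0))
tails = [tail_mass(r, 0.5) for r in (0.5, 0.9, 0.99)]
```

The value `gap` should be negligible, and `tails` should decrease toward 0 as $r \to 1$, which is the content of DIR 3 for this value of $\delta$.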
Example 4 (The Heat Equation for the Laplace Operator). For $t > 0$,
and a real variable $x$, define
$$K_t(x) = K(x, t) = \frac{1}{(4\pi t)^{1/2}}\,e^{-x^2/4t}.$$
Then $\{K_t\}$ is a Dirac family, replacing $k$ by $t$ in the definition of Dirac
sequences, and letting $t \to 0$ instead of $k \to \infty$. We define the heat operator
to be
$$H = \left(\frac{\partial}{\partial x}\right)^2 - \frac{\partial}{\partial t}$$
on functions of two variables $(x, t)$. You can easily verify that $HK = 0$.
Thus $K$ satisfies the heat equation. On $\mathbf{R}^n$, we can define $K_t$ and $H$
similarly, by
$$K_t(x) = K(x, t) = \frac{1}{(4\pi t)^{n/2}}\,e^{-x^2/4t}.$$
Here $x \in \mathbf{R}^n$ is an $n$-tuple, and $x^2$ is the dot product of $x$ with itself. The
heat operator is then written
$$H = \Delta - \frac{\partial}{\partial t},$$
where $\Delta$ is the Laplace operator, $\Delta = \sum (\partial/\partial x_i)^2$. Again $HK = 0$.
It is Exercise 4 to verify that $\{K_t\}$ is a Dirac family. By arguments
similar to those of Example 3, one verifies that for any bounded continuous
function $f$, the function $(x, t) \mapsto (K_t * f)(x)$ is a solution of the heat
equation on $\mathbf{R}^n \times \mathbf{R}^+$. The Dirac family $\{K_t\}$ is called the fundamental
heat family.
For an example of a solution of the heat equation for periodic functions,
see Exercise 7.
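The two basic properties of the heat kernel in one variable, total mass 1 and $HK = 0$, can be checked numerically. The sketch below assumes the one-variable kernel $K(x, t) = (4\pi t)^{-1/2} e^{-x^2/4t}$ and the operator $H = (\partial/\partial x)^2 - \partial/\partial t$, and replaces the derivatives by central finite differences. The function names, the step size, and the sample point are hypothetical.

```python
import math

def K(x, t):
    """One-variable heat kernel (4 pi t)^(-1/2) * exp(-x^2 / (4 t)), for t > 0."""
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def heat_residual(x, t, h=1e-3):
    """Central finite-difference approximation of (d^2/dx^2 - d/dt) K at (x, t)."""
    k_xx = (K(x + h, t) - 2 * K(x, t) + K(x - h, t)) / (h * h)
    k_t = (K(x, t + h) - K(x, t - h)) / (2 * h)
    return k_xx - k_t

def total_mass(t, bound=20.0, n=4000):
    """Midpoint-rule approximation of the integral of K_t over [-bound, bound]."""
    h = 2 * bound / n
    return h * sum(K(-bound + (i + 0.5) * h, t) for i in range(n))

residual = heat_residual(0.7, 0.5)
mass = total_mass(0.5)
```

One expects `residual` to be close to 0 and `mass` to be close to 1, up to discretization error.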
Example 5 (The Heat Equation for the Schroedinger Operator). Recall
first the elementary definitions of hyperbolic trigonometric functions:
$$\cosh t = \frac{e^t + e^{-t}}{2}, \qquad \sinh t = \frac{e^t - e^{-t}}{2},$$
$$\coth t = \frac{\cosh t}{\sinh t}, \qquad \operatorname{csch} t = \frac{1}{\sinh t}.$$
Let $A_t$ be the $2 \times 2$ matrix given by
$$A_t = \begin{pmatrix} \coth t & -\operatorname{csch} t \\ -\operatorname{csch} t & \coth t \end{pmatrix}.$$
Define the Mehler family $\{F_t\}$ on $\mathbf{R}^2$ by the formula
Thus $F_t$ is positive, real valued, and we can write
$$F_t(X) = F(X, t) = F(x, y, t) = F_{t,x}(y)$$
if $X$ is the transpose of the vector $(x, y)$ in $\mathbf{R}^2$. Then:
(a) For each fixed $x$, the family $\{F_{t,x}\}$ is a Dirac family for $t \to 0$.
(b) Let
Thus $L$ is the heat operator for the Schroedinger operator
Then $LF = 0$, where the partial differentiation is with respect to the
variables $(x, t)$.
(c) For any reasonable function $f$ of a variable $y$, we have the integral
operation
$$F * f(x, t) = \int F(x, y, t) f(y)\,dy.$$
Suppose $f$ is bounded and continuous. Then $L(F * f) = 0$, i.e. $F * f$
satisfies the heat equation for the Schroedinger operator.
The proofs are left as Exercise 5. Readers who want to see a general
context for this example are referred to Howe's article [How], Section 5.
See also [HowT], Chapter III, §2, and Exercise 5 of that chapter. For a
somewhat different context, see [BGV], 4.2, p. 154.
For an example when most of the formalism of a Dirac family works
but the family is not positive, see Exercise 7.
Next we consider the use of Dirac sequences for approximation in $\mathcal{L}^p$
with $p < \infty$. We shall use DIR 3s, i.e. functions with shrinking support.
Theorem 3.2. Let $f$ be in $\mathcal{L}^1(\mathbf{R}^n)$, and let $A$ be a compact set on which
$f$ is continuous. Let $\{\varphi_k\}$ be a Dirac sequence with shrinking support.
Then $\varphi_k * f$ converges to $f$ uniformly on $A$.
Proof. The proof is identical to the proof of Theorem 3.1 except for
the following final modification. At the very last estimate, for $k$ large, the
support of $\varphi_k$ is contained in the ball of radius $\delta$, whence the integral
expressing $\varphi_k * f(y) - f(y)$ is concentrated on that ball, and is obviously
estimated by $\epsilon$, thus proving our theorem.
Corollary 3.3. The support of $\varphi_k * f$ is contained in
$$\mathrm{supp}\,f + \mathrm{supp}\,\varphi_k.$$
If $f$ is continuous with compact support, then $\{\varphi_k * f\}$ converges uniformly
to $f$ on $\mathbf{R}^n$.
Proof. We have
$$\varphi_k * f(x) = \int \varphi_k(t) f(x - t)\,dt,$$
and this integral is concentrated on the support of $\varphi_k$. If
$$\varphi_k * f(x) \ne 0,$$
then we must have $x - t \in \mathrm{supp}\,f$, for some $t \in \mathrm{supp}\,\varphi_k$. Hence
$$x \in \mathrm{supp}\,f + \mathrm{supp}\,\varphi_k,$$
thus proving our first assertion. The second assertion follows at once
from the first, and from the theorem (both $f$ and $\varphi_k$ being equal to 0
outside some fixed compact set).
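Corollary 3.4. Let $f \in \mathcal{L}^p(\mathbf{R}^n)$ for $1 \le p < \infty$. Then $\{\varphi_k * f\}$ is $L^p$-convergent
to $f$.
Proof. We know that $C_c(\mathbf{R}^n)$ is $L^p$-dense in $\mathcal{L}^p$. Let $g \in C_c(\mathbf{R}^n)$ be
such that
$$\|f - g\|_p < \epsilon.$$
Then we estimate
$$\|\varphi_k * f - f\|_p \le \|\varphi_k * f - \varphi_k * g\|_p + \|\varphi_k * g - g\|_p + \|g - f\|_p.$$
Since
$$\int \varphi_k(x)\,dx = 1,$$
we have $\|\varphi_k\|_1 = 1$. Using Theorem 1.2, we find
$$\|\varphi_k * f - \varphi_k * g\|_p \le \|\varphi_k\|_1 \|f - g\|_p < \epsilon.$$
By Theorem 3.2 and Corollary 3.3, $\{\varphi_k * g\}$ converges uniformly to $g$, and
has a support which lies close to that of $g$. This implies that
$$\|\varphi_k * g - g\|_p < \epsilon$$
for $k$ large. The last of the three terms in our estimate above is $< \epsilon$,
thus concluding the proof.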
VIII, §4. THE SCHWARTZ SPACE AND FOURIER TRANSFORM
Let $f$ be a function on $\mathbf{R}^n$. We shall say that $f$ tends to 0 rapidly at
infinity if for each positive integer $m$ the function
$$(1 + |x|)^m f(x), \quad x \in \mathbf{R}^n,$$
is bounded for $|x|$ sufficiently large. Here as in the rest of this chapter,
$|x|$ is the euclidean norm of $x$. Equivalently, the preceding condition can
be formulated by saying that for every polynomial $P$ (in $n$ variables) the
function $Pf$ is bounded, or that the function
$$x \mapsto |x|^m f(x)$$
is bounded, for $x$ sufficiently large (i.e. $|x|$ sufficiently large).
We define the Schwartz space to be the set of functions on $\mathbf{R}^n$ which
are infinitely differentiable (i.e. partial derivatives of all orders exist and
are continuous), and which tend to 0 rapidly at infinity, as well as their
partial derivatives of all orders.
Example of such functions. In one variable, $e^{-x^2}$ is one, and similarly
in $n$ variables if we interpret $x^2$ as the dot product $x \cdot x$, which we also
write $x^2$. As a matter of notation, we shall write $xy$ instead of $x \cdot y$ if $x$, $y$
are elements of $\mathbf{R}^n$.
If $f$ is in the Schwartz space and $P$ is a polynomial, then the product
$Pf$ is in the Schwartz space.
If $f$ is a $C^\infty$ function of one variable which is 0 outside some bounded
interval, then $f$ is in the Schwartz space. As an example, one can take
the function
$$f(x) = \begin{cases} e^{-1/(x - a)(b - x)} & \text{if } a < x < b, \\ 0 & \text{otherwise.} \end{cases}$$
An analogous function in $n$ variables can be obtained by taking the
product
$$f(x_1) f(x_2) \cdots f(x_n).$$
It is clear that the Schwartz space is a vector space, which we denote
by $\mathrm{Sch}(\mathbf{R}^n)$ or simply by $S$. We take all our functions to be complex
valued, so $S$ is a space over $\mathbf{C}$.
We let $D_j$ be the partial derivative with respect to the $j$-th variable.
For each $n$-tuple of integers $\ge 0$, $p = (p_1, \ldots, p_n)$, we write
$$D^p = D_1^{p_1} \cdots D_n^{p_n},$$
so that $D^p$ is a partial differential operator, which maps $S$ into itself. As
a matter of notation, we write
$$|p| = p_1 + \cdots + p_n.$$
It is also convenient to use the notation $M_j f$ for the function such that
$$(M_j f)(x) = x_j f(x).$$
Thus, $M_j$ is multiplication by the $j$-th variable. Also
$$M^p = M_1^{p_1} \cdots M_n^{p_n},$$
so that
$$(M^p f)(x) = x_1^{p_1} \cdots x_n^{p_n} f(x).$$
In what follows, we shall take the integral of certain functions over $\mathbf{R}^n$,
and we use the following notation:
$$\int f(x)\,dx = \int_{\mathbf{R}^n} f(x)\,dx = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$$
Since our functions will be taken from $S$, there is no convergence problem,
because for $x$ sufficiently large, we have for some constant $C$:
$$|f(x)| \le \frac{C}{(1 + x_1^2) \cdots (1 + x_n^2)},$$
and we can view the integral as a repeated integral, the order of
integration being arbitrary. The justification is at the level of elementary
calculus. Furthermore, we differentiate under the integral sign, using the
formula
$$\frac{\partial}{\partial y_j} \int K(x, y)\,dx = \int \frac{\partial}{\partial y_j} K(x, y)\,dx$$
for suitable functions $K$ in situations where this is obviously permissible
(justification loco cit.), namely when the partial derivatives of $K$ exist, are
continuous, and are bounded by an absolutely integrable function of $x$,
as in Lemma 2.2.
We shall also change variables in an integral, but nothing worse than
the following cases:
$$\int f(x - y)\,dx = \int f(x)\,dx, \qquad \int f(-x)\,dx = \int f(x)\,dx.$$
If $c > 0$, then
$$\int f(cx)\,dx = \frac{1}{c^n} \int f(x)\,dx.$$
The general change of variables formula, of which these are but elementary
cases, will be proved in detail in Chapter XXI, §2.
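Finally, for normalization purposes, we shall write formally
$$d_1 x = (2\pi)^{-n/2}\,dx.$$
This makes some formulas come out more symmetrically at the end.
We now define the Fourier transform of a function $f \in S$ by
$$\hat f(y) = \int f(x) e^{-ixy}\,d_1 x.$$
Remember that $xy = x \cdot y$.
Since
$$\frac{\partial}{\partial y_j} \bigl( f(x) e^{-ixy} \bigr) = -i x_j f(x) e^{-ixy},$$
we see that we can differentiate under the integral sign, and that
$$D_j \hat f = (-i M_j f)^\wedge.$$
By induction, we get
$$D^p \hat f = (-i)^{|p|} (M^p f)^\wedge.$$
The analogous formula reversing the roles of $D^p$ and $M^p$ is also true,
namely:
$$M^p \hat f = (-i)^{|p|} (D^p f)^\wedge.$$
To see this, we consider
$$y_j \hat f(y) = \int f(x)\, y_j e^{-ixy}\,d_1 x$$
and integrate by parts with respect to the $j$-th variable first. We let
$u = f(x)$ and
$$dv = y_j e^{-ixy}\,dx_j.$$
Then $v = i e^{-ixy}$ and the term $uv$ between $-\infty$ and $+\infty$ gives zero contribution
because $f$ tends to 0 at infinity. Hence
$$M_j \hat f = (-i D_j f)^\wedge.$$
Induction now yields our formula.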
Theorem 4.1. The Fourier transform $f \mapsto \hat f$ is a linear map of the
Schwartz space into itself.
Proof. If $f \in S$, then it is clear that $\hat f$ is bounded, in fact by
$$|\hat f(y)| \le \int |f(x)|\,d_1 x.$$
The expression for $M^p \hat f$ in terms of the Fourier transform of $D^p f$, which
is in $S$, shows that $M^p \hat f$ is bounded, so that $\hat f$ tends rapidly to zero
at infinity. Similarly, one sees that $M^p D^q \hat f$ is bounded, because we let
$g = D^q f$, $g \in S$, and
is bounded. This proves our theorem.
For $f, g \in S$ we define the convolution
$$f * g(x) = \int f(t) g(x - t)\,d_1 t.$$
This integral is obviously absolutely convergent, and the reader will verify
at once that the map
$$(f, g) \mapsto f * g$$
is bilinear. Furthermore changing variables shows that
$$f * g = g * f.$$
Theorem 4.2. If $f, g \in S$, then $f * g$ is also in $S$, and
$$D^p(f * g) = D^p f * g = f * D^p g.$$
Furthermore,
$$(f * g)^\wedge = \hat f \hat g.$$
Proof. We can differentiate under the integral sign with respect to $x$,
and thus obtain the formula for the partial derivatives $D^p(f * g)$, which
we see exist, are bounded, and are continuous. Now we write
$$|x|^m \le (|x - t| + |t|)^m = \sum c_{jk} |x - t|^j |t|^k,$$
where $c_{jk}$ is a fixed integer depending only on $m$. Then
$$|x|^m |f * g(x)| \le \sum c_{jk} \int |t|^k |f(t)|\, |x - t|^j |g(x - t)|\,d_1 t$$
is bounded, and we conclude that $f * g$ tends rapidly to zero at infinity.
We can apply the same argument to $D^p f * g$ to conclude that $f * g$ lies in
$S$. Finally, we have
$$(f * g)^\wedge(y) = \int (f * g)(x) e^{-ixy}\,d_1 x = \iint f(t) g(x - t) e^{-ixy}\,d_1 t\,d_1 x,$$
and we can interchange the order of integration to get
$$= \iint f(t) g(x - t) e^{-ixy}\,d_1 x\,d_1 t.$$
We change variables, letting $u = x - t$, $d_1 u = d_1 x$, and see that our last
integral is equal to
$$\iint f(t) g(u) e^{-i(u + t)y}\,d_1 u\,d_1 t = \hat f(y) \hat g(y).$$
Example 1. We recall the value
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi},$$
which is obtained first in one variable using polar coordinates. Let
$$h(x) = e^{-x^2/2}.$$
Then we contend that $\hat h = h$. To see this, we differentiate under the integral
sign to find, say in one variable, that
$$D\hat h(y) = -y \hat h(y).$$
Thus differentiating the quotient $\hat h(y)/e^{-y^2/2}$ yields 0, whence
$$\hat h = C h$$
for some number $C$. This number is equal to 1, using the evaluation of
the definite integral recalled above, and our present normalization of the
Fourier transform, with $d_1 x$ instead of $dx$.
Example 2. Let $a$ be real, $a > 0$, and for any function $h \in S$ let
$$g(x) = h(ax).$$
Then
$$\hat g(y) = \frac{1}{a^n} \hat h(y/a).$$
This is proved trivially, changing the variable in the integral defining the
Fourier transform.
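Examples 1 and 2 can be sanity-checked numerically in one variable. The sketch below approximates the Fourier transform with the normalization $d_1 x = (2\pi)^{-1/2}\,dx$ by the trapezoid rule. The function name `fourier_transform`, the truncation window, and the number of steps are illustrative assumptions and not part of the text; the check is evidence, not a proof.

```python
import math

def fourier_transform(f, y, half_width=12.0, steps=4000):
    """Approximate (2*pi)^(-1/2) * integral of f(x) * exp(-i*x*y) dx.

    The integral is truncated to [-half_width, half_width] and evaluated
    with the trapezoid rule, which is adequate for rapidly decreasing f.
    """
    dx = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * complex(math.cos(x * y), -math.sin(x * y))
    return total * dx / math.sqrt(2.0 * math.pi)

def h(x):
    # The Gaussian of Example 1.
    return math.exp(-x * x / 2.0)

def g(x, a=2.0):
    # The rescaled function of Example 2 with a = 2 and n = 1.
    return h(a * x)

for y in (0.0, 0.5, 1.0, 2.0):
    # Example 1: the transform of h should agree with h itself.
    # Example 2: the transform of g should agree with (1/a) * h(y/a).
    print(y, fourier_transform(h, y).real, h(y),
          fourier_transform(g, y).real, 0.5 * h(y / 2.0))
```

The imaginary parts vanish up to rounding because both test functions are even.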
VIII, §5. THE FOURIER INVERSION FORMULA
If $f$ is a function, we denote by $f^-$ the function such that $f^-(x) = f(-x)$.
The reader will immediately verify that the minus operation commutes
with all the other operations we have introduced so far. For instance:
$$(\hat f)^- = (f^-)^\wedge.$$
Theorem 5.1 (Fourier Inversion). For every function $f \in S$ we have
$$\hat{\hat f} = f^-.$$
Proof. Let $g$ be some function in $S$. After interchanging integrals, we
find
$$\int \hat f(x) e^{-ixy} g(x)\,d_1 x = \iint f(t) e^{-itx} e^{-ixy} g(x)\,d_1 t\,d_1 x = \int f(t) \hat g(t + y)\,d_1 t.$$
Let $h \in S$ and let $g(u) = h(au)$ for $a > 0$. Then
$$\hat g(u) = \frac{1}{a^n} \hat h(u/a),$$
and hence
$$\int \hat f(x) e^{-ixy} h(ax)\,d_1 x = \int f(t) \frac{1}{a^n} \hat h\!\left(\frac{t + y}{a}\right) d_1 t = \int f(au - y) \hat h(u)\,d_1 u$$
after a change of variables,
$$u = \frac{t + y}{a}, \qquad d_1 u = \frac{d_1 t}{a^n}.$$
Both integrals depend on a parameter $a$, and are continuous in $a$. We let
$a \to 0$ and find
$$h(0) \hat{\hat f}(y) = f(-y) \int \hat h(u)\,d_1 u = f(-y) \hat{\hat h}(0).$$
Let $h$ be the function of Example 1. Then Theorem 5.1 follows.
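Theorem 5.2. For every $f \in S$ there exists a function $\varphi \in S$ such that
$f = \hat\varphi$. If $f, g \in S$, then
$$(fg)^\wedge = \hat f * \hat g.$$
Proof. First, it is clear that applying the roof operation four times to
a function $f$ gives back $f$ itself. Thus $f = \hat\varphi$, where $\varphi = f^{\wedge\wedge\wedge}$. Now to
prove the formula, write $f = \hat\varphi$ and $g = \hat\psi$. Then $\hat f = \varphi^-$ and $\hat g = \psi^-$ by
Theorem 5.1. Furthermore, using Theorem 4.2, we find
$$(fg)^\wedge = (\hat\varphi \hat\psi)^\wedge = (\varphi * \psi)^{\wedge\wedge} = (\varphi * \psi)^- = \varphi^- * \psi^- = \hat f * \hat g,$$
as was to be shown.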
We introduce the violently convergent hermitian product
$$\langle f, g \rangle = \int f(x) \overline{g(x)}\,dx.$$
We observe that the first step of the proof in Theorem 5.1 yields
$$\int \hat f(x) g(x)\,dx = \int f(x) \hat g(x)\,dx$$
by letting $y = 0$ on both sides. Furthermore, we have directly from the
definitions
$$\overline{\hat f} = (\hat{\bar f})^-,$$
where the bar means complex conjugate.
Theorem 5.3 (Parseval Formula). For $f, g \in S$ we have
$$\langle f, g \rangle = \langle \hat f, \hat g \rangle$$
and hence
$$\|f\|_2 = \|\hat f\|_2.$$
Proof We have
This proves what we wanted.
Theorem 5.3 shows that the map $f \mapsto \hat f$ is an automorphism of $S$,
preserving the hermitian product and thus the $L^2$-norm, and thus extends
to an isometry on $L^2$, since the Schwartz space is dense in $L^2$.
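The Parseval formula can be illustrated numerically for one concrete Schwartz function. The sketch below is only a check under stated assumptions: the test function $f(x) = (1 + x)e^{-x^2/2}$, the grids, and the helper names `trapezoid`, `grid`, and `f_hat` are choices made for this illustration. The transform uses $d_1 x = (2\pi)^{-1/2}\,dx$, while both norms are taken with respect to $dx$, as in the hermitian product above.

```python
import cmath
import math

def trapezoid(values, step):
    """Trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def grid(half_width, steps):
    """Equally spaced points on [-half_width, half_width] and their spacing."""
    step = 2.0 * half_width / steps
    return [-half_width + k * step for k in range(steps + 1)], step

def f(x):
    # Hypothetical test function in the Schwartz space.
    return (1.0 + x) * math.exp(-x * x / 2.0)

xs, dx = grid(10.0, 800)
ys, dy = grid(10.0, 400)
fx = [f(x) for x in xs]

def f_hat(y):
    # Fourier transform with the normalization d_1 x = dx / sqrt(2*pi).
    samples = [v * cmath.exp(-1j * x * y) for v, x in zip(fx, xs)]
    return trapezoid(samples, dx) / math.sqrt(2.0 * math.pi)

norm_f = trapezoid([abs(v) ** 2 for v in fx], dx)
norm_f_hat = trapezoid([abs(f_hat(y)) ** 2 for y in ys], dy)
print(norm_f, norm_f_hat)
```

For this particular $f$ both integrals equal $\tfrac{3}{2}\sqrt{\pi}$, which gives an independent check on the two printed values.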
VIII, §6. THE POISSON SUMMATION FORMULA
A function $g$ on $\mathbf{R}^n$ will be called periodic if $g(x + k) = g(x)$ for all $k \in \mathbf{Z}^n$.
We let $\mathbf{T}^n = \mathbf{R}^n/\mathbf{Z}^n$ be the $n$-torus. Let $g$ be a periodic $C^\infty$ function. We
define its $k$-th Fourier coefficient for $k \in \mathbf{Z}^n$ by
$$c_k = \int_{\mathbf{T}^n} g(x) e^{-2\pi i kx}\,dx.$$
The integral on $\mathbf{T}^n$ is by definition the $n$-fold integral with the variables
$(x_1, \ldots, x_n)$ ranging from 0 to 1. Integrating by parts $d$ times for any
integer $d > 0$, and using the fact that the partial derivatives of $g$ are
bounded, we conclude at once that there is some number $C = C(d, g)$
such that for all $k \in \mathbf{Z}^n$ we have $|c_k| \le C/\|k\|^d$, where $\|k\|$ is the sup norm.
Hence the Fourier series
$$g(x) = \sum_{k \in \mathbf{Z}^n} c_k e^{2\pi i kx}$$
converges to $g$ uniformly.
If $f$ is in the Schwartz space, we normalize its Fourier transform in
this section by
$$\hat f(y) = \int_{\mathbf{R}^n} f(x) e^{-2\pi i xy}\,dx.$$
Poisson Summation Formula. Let $f$ be in the Schwartz space. Then
$$\sum_{m \in \mathbf{Z}^n} f(m) = \sum_{m \in \mathbf{Z}^n} \hat f(m).$$
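Proof. Let
$$g(x) = \sum_{k \in \mathbf{Z}^n} f(x + k).$$
Then $g$ is periodic and $C^\infty$. If $c_m$ is its $m$-th Fourier coefficient, then
$$\sum_{m \in \mathbf{Z}^n} c_m = g(0) = \sum_{k \in \mathbf{Z}^n} f(k).$$
On the other hand, interchanging a sum and integral, we get
$$c_m = \int_{\mathbf{T}^n} g(x) e^{-2\pi i mx}\,dx = \sum_{k \in \mathbf{Z}^n} \int_{\mathbf{T}^n} f(x + k) e^{-2\pi i mx}\,dx$$
$$= \sum_{k \in \mathbf{Z}^n} \int_{\mathbf{T}^n} f(x + k) e^{-2\pi i m(x + k)}\,dx$$
$$= \int_{\mathbf{R}^n} f(x) e^{-2\pi i mx}\,dx = \hat f(m).$$
This proves the Poisson summation formula.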
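The formula can be checked numerically in one variable for the Gaussians $f(x) = e^{-\pi a x^2}$ with $a > 0$, whose Fourier transform in the normalization of this section is $\hat f(y) = a^{-1/2} e^{-\pi y^2/a}$. The sketch below is an illustration only; the parameter values and the truncation `cutoff` are assumptions chosen so that the neglected tails are far below rounding error.

```python
import math

def f(x, a):
    # Gaussian test function f(x) = exp(-pi * a * x^2).
    return math.exp(-math.pi * a * x * x)

def f_hat(y, a):
    # Closed form of the integral of f(x) * exp(-2*pi*i*x*y) dx.
    return math.exp(-math.pi * y * y / a) / math.sqrt(a)

def lattice_sum(func, a, cutoff=60):
    """Sum func(m, a) over the integers m with |m| <= cutoff."""
    return sum(func(m, a) for m in range(-cutoff, cutoff + 1))

for a in (0.3, 1.0, 2.5):
    left = lattice_sum(f, a)       # sum of f(m)
    right = lattice_sum(f_hat, a)  # sum of f_hat(m)
    print(a, left, right)
```

For $a = 1$ the function is its own transform, so the identity is trivial; the other values of $a$ give a genuine test.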
VIII, §7. AN EXAMPLE OF FOURIER TRANSFORM NOT IN THE SCHWARTZ SPACE
The Fourier transform of a function is often a complicated object, but to

deal with applications, all that is frequently needed are estimates on its
growth behavior. Functions in the Schwartz space provide the simplest
class of functions for which the Fourier transform behaves in a particularly
simple manner. We give here an example which is more complicated.
Let $\varphi$ be the characteristic function of the unit disc in the plane,
that is
$$\varphi(x) = \begin{cases} 1 & \text{if } |x| \le 1, \\ 0 & \text{if } |x| > 1. \end{cases}$$
Then $\varphi$ has compact support, and is certainly in $\mathscr{L}^1$. Its Fourier transform
is therefore given by the integral
$$\hat\varphi(y) = \int_{|x| \le 1} e^{-2\pi i x \cdot y}\,dx = \int_{|x| \le 1} e^{2\pi i x \cdot y}\,dx.$$
This Fourier transform depends only on the distance $s = |y|$, and if we
use polar coordinates, then we can rewrite the integral in the form
$$\hat\varphi(y) = \int_0^1 \left[ \int_0^{2\pi} e^{2\pi i rs \cos\theta}\,d\theta \right] r\,dr.$$
But the inner integral is a classical Bessel function, namely by definition,
for any integer $n$ one lets
$$J_n(z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ni\theta + iz \sin\theta}\,d\theta.$$
Thus
$$\hat\varphi(y) = 2\pi \int_0^1 J_0(2\pi rs) r\,dr.$$
As an example of concrete analysis over the reals, we shall estimate the

Bessel function for z real tending to infinity.
Proposition 7.1. We have
$$J_n(t) \ll t^{-1/2} \qquad \text{for } t \to \infty.$$
(The sign $\ll$ means that the left-hand side is bounded in absolute value
by a constant times the right-hand side, namely $O(t^{-1/2})$.)
Proof. For concreteness, we deal with the case $n = 0$, and we shall
just consider a typical integral contributing to $J_0(t)$, namely
$$\int_0^{\pi} e^{it \cos\theta}\,d\theta = \int_{-1}^{1} \frac{e^{itu}}{\sqrt{1 - u^2}}\,du.$$
Again typically, we show that
$$\int_0^1 \frac{e^{itu}}{\sqrt{1 - u^2}}\,du = O(t^{-1/2}).$$
We may rewrite the integral in the form
$$\int_0^1 \frac{e^{itu}}{\sqrt{1 - u}}\, g(u)\,du,$$
where $g(u) = 1/\sqrt{1 + u}$ is $C^\infty$ over the interval. Integrating by parts (cf.
also Lemma 7.3), we see that the desired integral satisfies the bound:
Thus we are reduced to the following lemma.
Lemma 7.2. Let $0 \le a \le b \le 1$. Then uniformly in $a$, $b$ we have
$$\int_a^b \frac{e^{itu}}{\sqrt{1 - u}}\,du = O(t^{-1/2}).$$
Proof. Let $v = 1 - u$, and then $tv = r$. Then the integral is estimated
by the absolute value of
$$t^{-1/2} \int_A^B e^{ir} \frac{1}{\sqrt{r}}\,dr,$$
where $0 \le A \le B$. But writing $e^{ir} = \cos r + i \sin r$, and noting that $1/\sqrt{r}$
is monotone decreasing, we see that the integral on the right-hand side is
uniformly bounded independently of $A$, $B$. This proves the lemma, and
also concludes the proof of the proposition for $n = 0$.
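The decay $J_0(t) = O(t^{-1/2})$ can be observed numerically from the integral definition of $J_n$ with $n = 0$. The following sketch is an illustration, not part of the proof: the function names, the sample points, and the number of quadrature nodes are assumptions. It uses the equally spaced rule on a full period, which is very accurate for smooth periodic integrands.

```python
import math

def bessel_j0(t, steps=4000):
    """J_0(t) = (1/(2*pi)) * integral over [-pi, pi] of exp(i*t*sin(theta)).

    Only the real part cos(t*sin(theta)) is summed; the imaginary part
    is odd in theta and integrates to zero.
    """
    dtheta = 2.0 * math.pi / steps
    total = sum(math.cos(t * math.sin(-math.pi + k * dtheta))
                for k in range(steps))
    return total * dtheta / (2.0 * math.pi)

def scaled_values(ts):
    """Return sqrt(t) * |J_0(t)| for each t, which should stay bounded."""
    return [math.sqrt(t) * abs(bessel_j0(t)) for t in ts]

ts = [1.0 + k for k in range(200)]  # t = 1, 2, ..., 200
print(max(scaled_values(ts)))
```

The printed maximum stays below 1 on this range, consistent with the proposition.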
The integration by parts shows that the asymptotic behavior of the

Fourier transform depends only on the singularity. The case just treated
is typical, and we let the reader handle the proof in general by using the
next lemma, which shows how the singularity affects the estimate.
Lemma 7.3. Let $[a, b)$ be a half-open interval. Let $f$ be a continuous
function on this interval, which is also in $\mathscr{L}^1$. Let $g$ be $C^1$ on the closed
interval $[a, b]$. Then the Fourier transform satisfies the estimate
$$\int_a^b g(u) e^{itu} f(u)\,du \ll (\|g\| + \|g'\|) \|F_t\|,$$
where
$$F_t(x) = \int_a^x e^{itu} f(u)\,du$$
and $\|F_t\|$ is the sup norm for $x \in [a, b]$.
Proof. This is an immediate consequence of integration by parts.
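Theorem 7.4. Let $\varphi$ be the characteristic function of the unit disc in the
plane. Then
$$\hat\varphi(y) \ll |y|^{-3/2} \qquad \text{for } |y| \to \infty.$$
Proof. As before, let $s = |y|$. By definition, we have
$$\frac{\pi}{2} \int_0^1 J_0(rs) r\,dr = \int_0^1 \int_0^1 \cos(urs)(1 - u^2)^{-1/2}\,du\, r\,dr$$
[setting $ur = t$, $r\,du = dt$]
$$= \int_0^1 \int_0^r \cos(ts)\bigl(1 - (t/r)^2\bigr)^{-1/2}\,dt\,dr = \int_0^1 \cos(ts) \int_t^1 \bigl(1 - (t/r)^2\bigr)^{-1/2}\,dr\,dt$$
[by direct integration]
$$= \int_0^1 \cos(ts)(1 - t^2)^{1/2}\,dt$$
[integration by parts]
$$= \frac{1}{s} \int_0^1 \sin(ts) \frac{t}{\sqrt{1 - t^2}}\,dt.$$
Estimating this last integral as in Lemmas 7.2 and 7.3 concludes the
proof.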
VIII, §8. EXERCISES
1. Show that the Landau functions form a Dirac sequence.
2. In the case of Fourier series as in Example 2, show that $D_n * f$ is the partial
sum of the Fourier series of $f$. Prove that $\{K_n\}$ as defined in Example 2 is a
Dirac sequence. (To see this worked out, cf. my Undergraduate Analysis,
Chapter 12, §3.)
3. Prove all the facts stated in Example 3, namely:
(a) The Poisson family is a Dirac family.
(b) The Poisson family satisfies the Laplace equation, so is harmonic on the
disc.
(c) For a continuous periodic function $f$ of $\theta$, the function
$$F(r, \theta) = P_r * f(\theta)$$
satisfies Laplace's equation, that is $\Delta F = 0$. You will need to differentiate
under the integral sign, and whatnot.
4. Prove the analogous statements for the heat equation of Example 4, replacing
the words "Poisson" and "Laplace" by "heat" in the preceding exercise, and
using a function $f$ of a variable $x$ on $\mathbf{R}$ (or $\mathbf{R}^n$) instead of a periodic function
of $\theta$.
5. Prove the statements about the heat family for the Schroedinger operator in
Example 5.
The formula for F(x, y, t) can be arrived at naturally as follows. Suppose
F has the form
$$F(x, y, t) = \exp\bigl(a(t)(x^2 + y^2) + 2b(t)xy + c(t)\bigr).$$
Applying the heat operator, one sees that for F to satisfy the heat equation it
suffices that the unknown coefficients a, b, c satisfy ordinary differential equa-
tions in the variable t, with a solution which is given by the elementary
hyperbolic exponential functions as stated.
6. For $y > 0$ define
$$\varphi_y(x) = \frac{1}{\pi} \frac{y}{x^2 + y^2}.$$
(a) Prove that $\{\varphi_y\}$ is a Dirac family, for $y \to 0$ (instead of $k \to \infty$).
(b) Show that as a function of two variables $\varphi(x, y) = \varphi_y(x)$ satisfies Laplace's
equation, i.e. $\varphi$ is harmonic on the upper half plane. With this construction,
we get harmonic functions on the upper half plane having given
bounded boundary value on the real line.
The Dirac family of this example will be used in Chapter XX, §2 in
connection with functional analysis.
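As orientation for part (a) of Exercise 6 (not a solution), the two key properties can be looked at numerically: the total mass of $\varphi_y$ is 1, and the mass inside any fixed interval $[-\delta, \delta]$ tends to 1 as $y \to 0$. The sketch assumes the antiderivative $\frac{1}{\pi}\arctan(x/y)$ of $\varphi_y$; the function names and sample values are illustrative.

```python
import math

def phi(x, y):
    # Kernel of Exercise 6: (1/pi) * y / (x^2 + y^2).
    return y / (math.pi * (x * x + y * y))

def mass_inside(delta, y):
    """Integral of phi(., y) over [-delta, delta], via (1/pi)*arctan(x/y)."""
    return 2.0 * math.atan(delta / y) / math.pi

def mass_inside_numeric(delta, y, steps=20000):
    """Midpoint-rule approximation of the same integral, as a cross-check."""
    dx = 2.0 * delta / steps
    return dx * sum(phi(-delta + (k + 0.5) * dx, y) for k in range(steps))

# Cross-check the closed form against direct numerical integration.
print(mass_inside(1.0, 0.05), mass_inside_numeric(1.0, 0.05))

# Concentration near the origin: mass in [-0.1, 0.1] as y decreases.
for y in (1.0, 0.1, 0.01, 0.001, 0.0001):
    print(y, mass_inside(0.1, y))
```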
7. Although Dirac families cover a lot of territory, they are not universally
applicable. At the most basic level, for Fourier series, one still wants to know
conditions under which the ordinary Fourier series Dn * f converges to the
function f. We give here an example of an object which satisfies the condi-
tions of a Dirac family except for the positivity, on the circle, so for periodic
functions.
For $t > 0$ and $x \in \mathbf{R}$, let
$$\theta_t(x) = \theta(x, t) = \sum_{n = -\infty}^{\infty} e^{-\pi n^2 t} e^{2\pi i nx}.$$
(a) Show that $\theta$ satisfies the conditions of a Dirac family, except for the
positivity condition. Note that $\theta_t$ is periodic in the variable $x$.
(b) For $f$ continuous periodic, show that $\theta_t * f$ converges to $f$ uniformly as
$t \to 0$.
(c) Show that $\theta$ satisfies the heat equation, and so does $\theta_t * f(x)$, as a function
of $(x, t)$. The heat equation is normalized here in the form
8. Let $f \in \mathscr{L}^1(\mathbf{R}) \cap \mathscr{L}^2(\mathbf{R})$ and suppose the function $x \hat f(x)$ is in $\mathscr{L}^1(\mathbf{R})$. Prove
that there is a $C^1$ function $g$ such that $g = f$ almost everywhere, and give a
formula for $g$.
9. Let $A > 0$ and let $f \in \mathscr{L}^1(\mathbf{R})$. Define
$$f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} \hat f(t) e^{itx}\,dt.$$
Prove that
$$f_A(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(y) \frac{\sin A(x - y)}{x - y}\,dy.$$
10. Prove that there is a function $h$ in the Schwartz space such that $\hat h$ has
compact support and $h(0) \neq 0$. Show that for such a function, $h_A = h$ for all
$A$ sufficiently large (notation as in Exercise 9).
11. Let $f \in \mathscr{L}^1(\mathbf{R})$. Prove that $\hat f$ is uniformly continuous on $\mathbf{R}$.
12. The lattice point problem. Let $N(R)$ be the number of lattice points (that is,
elements of $\mathbf{Z}^2$) in the closed disc of radius $R$ in the plane. A famous
conjecture asserts that
$$N(R) = \pi R^2 + O(R^{1/2 + \epsilon})$$
for every $\epsilon > 0$. It is known that the error term cannot be $O(R^{1/2}(\log R)^k)$ for
any positive integer $k$ (result of Hardy and Landau). Prove the following
best-known result of Sierpinski-Van der Corput-Vinogradov-Hua:
$$N(R) = \pi R^2 + O(R^{2/3}).$$
Sketch of Proof. Let $\varphi$ be the characteristic function of the unit disc, and put
$$\varphi_R(x) = \varphi(x/R).$$
Let $\psi$ be a $C^\infty$ function with compact support, positive, and such that
$$\int_{\mathbf{R}^2} \psi(x)\,dx = 1.$$
Let
$$\psi_\epsilon(x) = \epsilon^{-2} \psi(x/\epsilon).$$
Then $\{\psi_\epsilon\}$ is a Dirac family for $\epsilon \to 0$, and we can apply the Poisson summation
formula to the convolution $\varphi_R * \psi_\epsilon$ to get
$$\sum_{m \in \mathbf{Z}^2} \varphi_R * \psi_\epsilon(m) = \sum_{m \in \mathbf{Z}^2} \hat\varphi_R(m) \hat\psi_\epsilon(m) = \pi R^2 + \sum_{m \neq 0} R^2 \hat\varphi(Rm) \hat\psi(\epsilon m).$$
We shall choose $\epsilon$ depending on $R$ to make the error term best possible.
Note that $\varphi_R * \psi_\epsilon(x) = \varphi_R(x)$ if $\operatorname{dist}(x, S_R) > \epsilon$, where $S_R$ is the circle of
radius $R$. Therefore we get an estimate
$$|\text{left-hand side} - N(R)| \ll \epsilon R.$$
Splitting off the term with $m = 0$ on the right-hand side, we find (using
Theorem 7.4):
$$\sum_{m \neq 0} R^2 \hat\varphi(Rm) \hat\psi(\epsilon m) \ll R^{2 - 3/2} \sum_{m \neq 0} |m|^{-3/2} \hat\psi(\epsilon m).$$
But we can compare this last sum with the integral
Therefore we find
We choose $\epsilon = R^{-1/3}$ to make the error term $O(R^{2/3})$, as desired.
For relations of the lattice point problem to the eigenvalue problem see
Guillemin [Gui].
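The size of the error term in Exercise 12 can be explored experimentally. The sketch below counts lattice points exactly for integer radii and prints the error $N(R) - \pi R^2$ next to $R^{2/3}$. It is an illustration only: the function name and the sample radii are assumptions, and a finite table proves nothing about the asymptotic bound.

```python
import math

def lattice_points_in_disc(radius):
    """Count pairs (m, n) of integers with m^2 + n^2 <= radius^2.

    The radius is assumed to be a non-negative integer, so that all
    arithmetic is exact.
    """
    r2 = radius * radius
    count = 0
    for m in range(-radius, radius + 1):
        # For fixed m, the admissible n satisfy |n| <= floor(sqrt(r2 - m^2)).
        half = math.isqrt(r2 - m * m)
        count += 2 * half + 1
    return count

for radius in (1, 2, 10, 100, 1000):
    count = lattice_points_in_disc(radius)
    error = count - math.pi * radius * radius
    print(radius, count, error, radius ** (2.0 / 3.0))
```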
CHAPTER IX
Integration and Measures on Locally Compact Spaces
On a locally compact space, it is as natural to deal with continuous

functions having compact support as it is natural to deal with step
functions. Thus we must establish the relations which exist between
functionals on the former or the latter. As we shall see, they essentially
amount to the same thing.
Thus the main point of this chapter is to see how one can associate a
measure to a functional on $C_c(X)$. Applications will be given in Chapter
X and in the spectral theorem of Chapter XX. The measure derived from
that situation is called a spectral measure.
Specializing to euclidean spaces, we relate integration with differentia-
tion, using the infinitely differentiable functions and partial derivatives
to define distributions, generalizing the notion of measure. There is no
question here of going deeply into this theory, but only of showing
readers how it arises naturally, and of making it easier for them to read
standard treatises devoted to the subject.
If the locally compact space is a locally compact group, then one can
ask for the existence of an integral and a positive measure which are
invariant under left translations. This is dealt with in Chapter XII.
Both in this chapter and in Chapter XI, we prove the existence of
partitions of unity (in the locally compact and locally euclidean cases,
respectively). Strictly speaking, this is a tool belonging to general topology,
but we postponed dealing with it until it was needed. Such partitions
are used to glue together certain maps into a vector space, given
locally. They are thus used to reduce certain types of global questions to
local ones.
Throughout this chapter we let X be a locally compact Hausdorff space.
IX, §1. POSITIVE AND BOUNDED FUNCTIONALS ON $C_c(X)$
We denote by $C_c(X)$ the vector space of continuous functions on $X$ with
compact support (i.e. vanishing outside a compact set). We write $C_c(X, \mathbf{R})$
or $C_c(X, \mathbf{C})$ if we wish to distinguish between the real or complex valued
functions.
We do not give formally a topology to $C_c(X)$, but observe that there
are two natural ones. Of course, we always have the sup norm, defined
on $C_c(X)$ since every function is bounded, vanishing outside a compact
set.
The other topology would come from considering the subspaces $C(K)$
for each compact subset $K$ of $X$, and observing that $C_c(X)$ is the union
of all $C(K)$ for all $K$. One can then give $C_c(X)$ a topology called the
inductive limit of the topologies coming from the sup norms on each
subspace $C(K)$. We do not go into this here, but we make additional
remarks at the end of §4.
We denote by $C_K(X)$ the subspace of $C_c(X)$ consisting of those functions
which vanish outside $K$. (Same notation $C_S(X)$ for those functions
which are 0 outside any subset $S$ of $X$. Most of the time, the useful
subsets in this context are the compact subsets $K$.)
A linear map $\lambda$ of $C_c(X)$ into the complex numbers (or into a normed
vector space, for that matter) is said to be bounded if there exists some
$C \ge 0$ such that we have
$$|\lambda f| \le C \|f\|$$
for all $f \in C_c(X)$. Thus $\lambda$ is bounded if and only if $\lambda$ is continuous for the
norm topology.
A linear map $\lambda$ of $C_c(X)$ into the complex numbers is said to be
positive if we have $\lambda f \ge 0$ whenever $f$ is real and $\ge 0$.
Lemma 1.1. Let $\lambda\colon C_c(X) \to \mathbf{C}$ be a positive linear map. Then $\lambda$ is
bounded on $C_K(X)$ for any compact $K$.
Proof. By the corollary of Urysohn's lemma, there exists a continuous
real function $g \ge 0$ on $X$ which is 1 on $K$ and has compact support. If
$f \in C_K(X)$, let $b = \|f\|$. Say $f$ is real. Then $bg \pm f \ge 0$, whence
$$\lambda(bg) \pm \lambda f \ge 0$$
and $|\lambda f| \le b \lambda(g)$. Thus $\lambda g$ is our desired bound.
A complex valued linear map on Cc(X) which is bounded on each

subspace CK(X) for every compact K will be called a Cc-functional on
Cc(X). In accordance with a previous definition, a functional on Cc(X)
which is also continuous for the sup norm will be called a bounded
functional. It is clear that a bounded functional is also a Cc-functional.
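Theorem 1.2. Let $\lambda$ be a bounded real functional on $C_c(X, \mathbf{R})$. Then $\lambda$
is expressible as the difference of two positive bounded functionals.
Proof. If $f \ge 0$ is in $C_c(X)$, define
$$\lambda^+ f = \sup \lambda g \qquad \text{for } 0 \le g \le f \text{ and } g \in C_c(X, \mathbf{R}).$$
Then $\lambda^+ f \ge 0$ and $\lambda^+ f \le |\lambda| \|f\|$. Let $c \in \mathbf{R}$, $c > 0$. Then
$$\lambda^+(cf) = \sup \lambda(cg) \qquad \text{for } 0 \le g \le f,$$
whence $\lambda^+(cf) = c\lambda^+(f)$. Let $f_1$, $f_2$ be functions $\ge 0$ in $C_c(X, \mathbf{R})$. Then
taking $0 \le g_1 \le f_1$ and $0 \le g_2 \le f_2$, we have
$$\lambda^+ f_1 + \lambda^+ f_2 = \sup \lambda g_1 + \sup \lambda g_2 = \sup(\lambda g_1 + \lambda g_2) = \sup \lambda(g_1 + g_2) \le \lambda^+(f_1 + f_2).$$
Conversely, let $0 \le g \le f_1 + f_2$. Then $0 \le \inf(f_1, g) \le f_1$ and
$$0 \le g - \inf(f_1, g) \le f_2.$$
Hence
$$\lambda g = \lambda(\inf(f_1, g)) + \lambda(g - \inf(f_1, g)) \le \lambda^+ f_1 + \lambda^+ f_2.$$
Taking the sup on the left implies that $\lambda^+(f_1 + f_2) \le \lambda^+ f_1 + \lambda^+ f_2$, thus
proving that $\lambda^+$ is additive.
We extend the definition of $\lambda^+$ to all elements of $C_c(X, \mathbf{R})$ by expressing
an arbitrary $f$ as a difference
$$f = f_1 - f_2,$$
where $f_1, f_2 \ge 0$, and letting
$$\lambda^+ f = \lambda^+ f_1 - \lambda^+ f_2.$$
The additivity of $\lambda^+$ on functions $\ge 0$ implies at once that this is well
defined, i.e. independent of the expression of $f$ as a difference of positive
functions. One then sees at once that this extension of $\lambda^+$ is linear. If
$f \ge 0$, then $\lambda^+ f \ge 0$ and also $\lambda^+ f \ge \lambda f$. We now define $\lambda^-$ by
$$\lambda^- = \lambda^+ - \lambda.$$
Then $\lambda^-$ is linear, and
$$\lambda = \lambda^+ - \lambda^-.$$
Furthermore, both $\lambda^+$ and $\lambda^-$ are positive. Finally it is verified at once
that $\lambda^+$ and $\lambda^-$ are bounded, thus proving our theorem.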
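The construction of $\lambda^+$ becomes concrete when $X$ is a finite discrete space, so that a function is a vector and a real functional is a weighted sum. The sketch below, a toy model under that assumption, computes the sup defining $\lambda^+ f$ by brute force over the vertices of the box $0 \le g \le f$ (a linear function on a box attains its sup at a vertex) and compares it with the functional whose weights are the positive parts of the original weights. All names and numbers are illustrative.

```python
from itertools import product

def apply(weights, f):
    """Value of the functional with the given weights on the function f."""
    return sum(w * v for w, v in zip(weights, f))

def lambda_plus_bruteforce(weights, f):
    """sup of apply(weights, g) over 0 <= g <= f, for f >= 0.

    It suffices to test the vertices of the box, where each g_i is
    either 0 or f_i.
    """
    vertices = product(*[(0.0, v) for v in f])
    return max(apply(weights, g) for g in vertices)

weights = [2.0, -1.5, 0.5, -3.0]            # a real functional on a 4-point space
w_plus = [max(w, 0.0) for w in weights]     # candidate weights for lambda^+
w_minus = [max(-w, 0.0) for w in weights]   # candidate weights for lambda^-
f = [1.0, 4.0, 2.0, 0.5]                    # a function >= 0

print(lambda_plus_bruteforce(weights, f), apply(w_plus, f))
print(apply(weights, f), apply(w_plus, f) - apply(w_minus, f))
```

In this model $\lambda^-$ has the weights `w_minus`, and the second printed line illustrates $\lambda = \lambda^+ - \lambda^-$.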
Note on Terminology. When dealing exclusively with Banach spaces,

as was the case until this section, we used the word functional to apply
to linear maps into the scalars, continuous with respect to the given
norm. In dealing with Cc(X), we shall usually say functional instead of
Cc-functional as defined above, and use an adjective (positive, bounded)
to describe any additional properties that such a linear map may have.
A positive functional satisfies a strong continuity property with respect
to increasing or decreasing sequences of continuous functions.
Theorem 1.3 (Dini). Let $f \in C_c(X, \mathbf{R})$ be $\ge 0$, and let $\{f_n\}$ be a sequence
of positive functions in $C_c(X)$ which is increasing to $f$. Then
$\{f_n\}$ converges to $f$ uniformly. More generally, let $\Phi$ be a family of
positive functions in $C_c(X, \mathbf{R})$ which are $\le f$, and such that
$$\sup_{\varphi \in \Phi} \varphi = f.$$
Assume that if $\varphi, \psi \in \Phi$, then $\sup(\varphi, \psi) \in \Phi$. Given $\epsilon$, there exists $\varphi \in \Phi$
such that $\|f - \varphi\| < \epsilon$. If $\lambda$ is a positive functional on $C_c(X)$, then
$$\lambda f = \sup_{\varphi \in \Phi} \lambda\varphi.$$
Proof. The assertion concerning the sequence is a special case of the
assertion concerning the family. We prove the latter. Let $f$ vanish outside
the compact set $K$. For each $x \in K$, we can find a function $\varphi_x \in \Phi$
such that
$$f(x) - \varphi_x(x) < \epsilon.$$
Then there is some open neighborhood $V_x$ of $x$ such that
$$f(y) - \varphi_x(y) < \epsilon \qquad \text{for all } y \in V_x.$$
If we cover $K$ by a finite number of such neighborhoods $V_{x_i}$ $(i = 1, \ldots, n)$,
and let
$$\varphi = \sup(\varphi_{x_1}, \ldots, \varphi_{x_n}),$$
then $\|f - \varphi\| < \epsilon$. This proves the first part of the theorem. The last
assertion follows by the continuity of $\lambda$ (Lemma 1.1).
IX, §2. POSITIVE FUNCTIONALS AS INTEGRALS
The main result of the chapter is to interpret a functional on $C_c(X)$ as an
integral. Let $\mathscr{M}$ be the algebra of Borel sets in $X$. If $\mu$ is a positive
measure on $\mathscr{M}$ which is finite on compact sets, then $\mu$ gives rise to a
positive functional, denoted by $d\mu$, and given by
$$\langle f, d\mu \rangle = \int_X f\,d\mu.$$
We shall prove the converse (Riesz' theorem), and first obtain a positive
measure from a positive functional.
If $f$ is a function on $X$, we define its support to be the closure of the
set of all $x$ such that $f(x) \neq 0$. Thus the support is a closed set. We
denote it by $\operatorname{supp}(f)$.
We use the following notation as in Rudin [Ru 1], which we more or
less follow for the proof of Theorem 2.3. If $V$ is open, we write
$$f \prec V$$
to mean that $f$ is real, $f \in C_c(X)$, $0 \le f \le 1$, and $\operatorname{supp}(f) \subset V$. Similarly,
if $K$ is compact we write
$$K \prec f$$
to mean $\chi_K \le f \le 1$, and of course $f \in C_c(X)$.
Lemma 2.1. Given $K$ compact, $K \subset V$ open, there exists some $f$ such
that
$$K \prec f \prec V.$$
Proof. This is an immediate consequence of Urysohn's lemma. All we
have to do is choose some open set $W$ with compact closure $\overline{W}$ such
that $K \subset W \subset \overline{W} \subset V$, and use the normality of $\overline{W}$ to find a function $f$
with $0 \le f \le 1$ which is 1 on $K$ and 0 on the boundary of $W$. We then
extend this function to be 0 outside $W$. This extension is continuous on
all of $X$.
Lemma 2.2. If $V$ is open, then we have
$$\chi_V = \sup f \qquad \text{for } f \prec V.$$
Proof. Given $x \in V$, there exists an open neighborhood $W$ of $x$ such
that $x \in W \subset \overline{W} \subset V$, and such that $\overline{W}$ is compact. We can find a function
$f$ with $0 \le f \le 1$ such that $f(x) = 1$ and $f$ is 0 on the complement
of $W$, by the corollary of Urysohn's lemma. This proves our assertion.
Theorem 2.3 (Riesz Theorem, Part 1). Let $\lambda$ be a positive functional on
$C_c(X)$. There exists a unique positive Borel measure $\mu$ satisfying conditions
(i) and (ii) below, and this measure also satisfies (iii) and (iv).
(i) If $V$ is open, then
$$\mu(V) = \sup \lambda g \qquad \text{for } g \prec V.$$
(ii) If $A$ is a Borel set, then
$$\mu(A) = \inf \mu(V) \qquad \text{for } V \text{ open} \supset A.$$
(iii) If $K$ is compact, then $\mu(K)$ is finite.
(iv) If $A$ is a Borel set and $A$ is $\sigma$-finite, or $A$ is open, then
$$\mu(A) = \sup \mu(K) \qquad \text{for } K \text{ compact} \subset A.$$
Remark 1. From (ii) and the remarks before Theorem 2.3, we see at
once that for any compact $K$ we have
$$\mu(K) = \inf \lambda f \qquad \text{for } K \prec f.$$
Remark 2. The uniqueness of $\mu$ satisfying (i) and (ii) is obvious, because
(i) determines $\mu$ on open sets, and (ii) determines $\mu$ on all Borel
sets.
Remark 3. It is convenient to introduce a word to summarize the
main properties listed in Theorem 2.3. We shall say that a positive
measure $\mu$ on a locally compact space, defined on a $\sigma$-algebra $\mathscr{M}$ containing
the Borel sets, is $\sigma$-regular if it satisfies properties (ii), (iii), and (iv)
of Theorem 2.3 for the sets of $\mathscr{M}$. Even though $X$ itself need not be
$\sigma$-finite, in applications only $\sigma$-finite sets arise because for instance any
function in $\mathscr{L}^1$ is equivalent to a function vanishing outside a $\sigma$-finite set.
Let $\mathscr{M}$ be a $\sigma$-algebra containing the Borel sets. A positive measure $\mu$
on $\mathscr{M}$ is said to be regular if $\mu(K)$ is finite for all compact $K$, and if in
addition, for all $A \in \mathscr{M}$ we have
$$\mu(A) = \inf \mu(V) \qquad \text{for } V \text{ open} \supset A,$$
$$\mu(A) = \sup \mu(K) \qquad \text{for } K \text{ compact} \subset A.$$
Cases when a measure is not regular are to be regarded as pathological.
Our definitions are adjusted so that the following statement is merely a
rephrasing of parts of our definitions:
If $X$ is $\sigma$-finite and $\mu$ is $\sigma$-regular, then $\mu$ is regular.
Note that the property that $\mu(V) = \sup \mu(K)$ for compact $K \subset V$ is satisfied
by open $V$. This is convenient because even in pathological situations,
we are able to define the measure of Theorem 2.3 on Borel sets
rather than on a more restricted algebra (e.g. that generated by the
compact sets, as is sometimes done in the literature). Observe that if we
know that property (iv) is satisfied by all sets $A$ of finite measure, then it
follows at once for any $\sigma$-finite $A$. Indeed, let $\{A_n\}$ be a disjoint sequence
of sets of finite measure, and let $K_n$ be a compact subset of $A_n$ such that
$\mu(A_n - K_n) < \epsilon/2^n$. Then $K_1 \cup \cdots \cup K_n$ is compact, and $\mu(K_1 \cup \cdots \cup K_n)$
tends to the measure of $\bigcup A_n$ as $n \to \infty$, whether this measure is finite or
not.
For an example of pathology, let $I$ be a non-countable set of indices,
and let $\{X_i\}$ $(i \in I)$ be disjoint copies of the interval $[0, 1]$. Let $x_i$ be a
point in $X_i$, and let $S$ be the union of all $x_i$. Then $S$ is discrete, and has
infinite measure, but all compact subsets of $S$ are finite and have measure
0.
We now come to the proof of Theorem 2.3.
We shall actually define $\mu$ on a larger algebra than that of the Borel
sets, more or less the largest algebra such that the measure still satisfies
our four conditions. For instance, it is clear that the complete measure
obtained from a measure satisfying our properties still satisfies these
properties.
Until the end of the proof of Theorem 2.6 we let $f$, $g$, $h$ denote
elements of $C_c(X)$ which are real $\ge 0$.
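Lemma 2.4. Let $\lambda$ be a positive functional on $C_c(X)$. For each open set
$V$, define
$$\mu(V) = \sup \lambda g \qquad \text{for } g \prec V.$$
For any subset $Y$ of $X$ define
$$\mu(Y) = \inf \mu(V) \qquad \text{for } V \text{ open}, \ Y \subset V.$$
Then $\mu$ is an outer measure on the algebra of all subsets of $X$.
Proof. It is clear that $\mu(\emptyset) = 0$ and that if $A \subset B$, then $\mu(A) \le \mu(B)$.
For convenience of notation, we write
$$\sup(f, g) = f \cup g \qquad \text{and} \qquad \inf(f, g) = f \cap g.$$
We prove that if $V_1$, $V_2$ are open, then
$$(1) \qquad \mu(V_1 \cup V_2) \le \mu(V_1) + \mu(V_2).$$
Let $h \prec V_1 \cup V_2$. Let $\Phi$ be the family of all functions $\sup(g_1, g_2)$ with
$g_i \prec V_i$ for $i = 1, 2$. Then $\Phi$ is closed under the sup operation (on a finite
number of elements), and we have
$$\sup_{g \in \Phi} g = \chi_{V_1 \cup V_2}.$$
Let $\Phi_h$ be the family of all functions
$$h \cap (g_1 \cup g_2), \qquad g_i \prec V_i, \quad i = 1, 2.$$
Then $h$ is the sup of all functions in $\Phi_h$, whence by Theorem 1.3
$$\lambda h = \sup \lambda\bigl(h \cap (g_1 \cup g_2)\bigr) \le \sup(\lambda g_1 + \lambda g_2) \le \mu(V_1) + \mu(V_2).$$
Taking the sup over all $h$ on the left yields our inequality (1).
Now let $\{A_n\}$ be a sequence of subsets of $X$, and $A = \bigcup A_n$. Let $V_n$
be open, $A_n \subset V_n$, and
$$\mu(V_n) \le \mu(A_n) + \frac{\epsilon}{2^n}.$$
Let $V = \bigcup V_n$. Then $\bigcup A_n \subset \bigcup V_n = V$. Let $g \prec V$. Since $g$ has compact
support, there is some $n$ such that
$$g \prec V_1 \cup \cdots \cup V_n,$$
whence by (1) and induction,
$$\lambda g \le \mu(V_1) + \cdots + \mu(V_n).$$
Taking the sup over all $g$ on the left yields
$$\mu(A) \le \mu(V) \le \sum_{n=1}^{\infty} \mu(A_n) + \epsilon.$$
This proves our lemma.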
Remark. The special role played by compact and open sets in constructing
the algebra of measurable sets and the measure on it will stem
from the following property:
If $K \prec f \prec V$, then $\mu(K) \le \lambda f \le \mu(V)$.
The right inequality is obvious. As for the left one, let $W$ be the set of $x$
such that $f(x) > 1 - \epsilon$. Then $K \subset W$. Let $g \prec W$ be such that
$$\mu(W) \le \lambda g + \epsilon.$$
We have
$$(1 - \epsilon) g \le f \qquad \text{whence} \qquad (1 - \epsilon)\lambda g \le \lambda f.$$
Hence
$$\mu(K) \le \mu(W) \le \lambda g + \epsilon \le \frac{\lambda f}{1 - \epsilon} + \epsilon.$$
This implies that $\mu(K) \le \lambda f$, as contended.
Our remark shows the main idea of what follows. We recover characteristic
functions of certain sets by squeezing them between compact and
open sets, and comparing them with functions $f \in C_c(X)$ on which the
given functional is defined. The $K$ and $V$ allow us to use the old
technique of lower and upper sums respectively. We first have to recover
the measure itself, however, and we proceed to do this. For convenience
of notation, the outer measure $\mu$ described in Lemma 2.4 will be called
the outer measure determined by $\lambda$.
Lemma 2.5. Let $\mathscr{A}$ be the collection of all subsets $A$ of $X$ such that
$\mu(A) < \infty$, and
$$\mu(A) = \sup \mu(K) \qquad \text{for } K \text{ compact} \subset A.$$
Then $\mathscr{A}$ is an algebra containing all compact sets and all open sets of
finite measure. Furthermore, $\mu$ is a positive measure on $\mathscr{A}$. In fact, if
$\{A_n\}$ is a disjoint sequence of elements of $\mathscr{A}$, and $A = \bigcup A_n$, then
$$\mu(A) = \sum_{n=1}^{\infty} \mu(A_n).$$
If in addition $\mu(A) < \infty$, then $A \in \mathscr{A}$.
Proof. If $K$ is compact, then there exists an open $V$ containing $K$
such that $\overline{V}$ is compact. Let $\overline{V} \prec g$. For any $f \prec V$ we have $f \le g$,
whence $\lambda f \le \lambda g$ and $\mu(V) \le \lambda g$, so that $\mu(K) \le \lambda g$ is finite. It is then
clear that $K \in \mathscr{A}$.
Let $V$ be open. We shall prove that
$$\mu(V) = \sup \mu(K) \qquad \text{for } K \text{ compact}, \ K \subset V.$$
We may assume that $\mu(V) > 0$. To cover the case when $\mu(V) = \infty$, we let
$r$ be a real number such that $0 < r < \mu(V)$. There exists $f \prec V$ such that
$$r < \lambda f \le \mu(V).$$
Let $K$ be the support of $f$. If $W$ is an open set containing $K$, then
$f \prec W$, whence
$$r < \lambda f \le \mu(W),$$
and therefore $r \le \mu(K)$. This proves that $\mu(V) = \sup \mu(K)$ for $K$ compact
$\subset V$. In particular, if $\mu(V)$ is finite, then $V \in \mathscr{A}$.
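Before proving that $\mathscr{A}$ is an algebra, we find it convenient to have the
finite additivity. Actually, it is no more troublesome to prove the countable
additivity. First we prove that if $K_1$, $K_2$ are disjoint and compact,
then
$$\mu(K_1 \cup K_2) = \mu(K_1) + \mu(K_2).$$
Let $V_1$, $V_2$ be disjoint open sets containing $K_1$, $K_2$ respectively. Let $W$
be open such that
$$K_1 \cup K_2 \subset W \qquad \text{and} \qquad \mu(W) \le \mu(K_1 \cup K_2) + \epsilon.$$
Let $g_i \prec W \cap V_i$ be such that for $i = 1, 2$ we have
$$\mu(W \cap V_i) \le \lambda g_i + \epsilon.$$
Then
$$\mu(K_1) + \mu(K_2) \le \mu(W \cap V_1) + \mu(W \cap V_2) \le \lambda g_1 + \lambda g_2 + 2\epsilon$$
$$= \lambda(g_1 + g_2) + 2\epsilon \le \mu(W) + 2\epsilon \le \mu(K_1 \cup K_2) + 3\epsilon.$$
The reverse inequality is true because $\mu$ is an outer measure, so we get
the desired equality.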
Given a sequence of disjoint sets $\{A_n\}$ in $\mathcal{A}$, let $K_n \subset A_n$ be compact,
such that
$$\mu(A_n) < \mu(K_n) + \frac{\epsilon}{2^n}.$$
Let $A = \bigcup A_n$. Then for all $n$,
$$\sum_{i=1}^{n} \mu(A_i) \leq \sum_{i=1}^{n} \mu(K_i) + \epsilon = \mu(K_1 \cup \cdots \cup K_n) + \epsilon \leq \mu(A) + \epsilon.$$
Letting $n$ tend to infinity, and then $\epsilon$ tend to 0, together with the fact
that $\mu$ is an outer measure, shows that
$$\sum_{n=1}^{\infty} \mu(A_n) = \mu(A).$$
This gives the countable additivity and also proves that $A \in \mathcal{A}$ if $\mu(A)$ is
finite.
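We can now prove that $\mathcal{A}$ is an algebra. Clearly the empty set is in
$\mathcal{A}$. If $A_1, A_2 \in \mathcal{A}$, we can find compact sets $K_1$, $K_2$ and open sets $V_1$,
$V_2$ such that
$$K_i \subset A_i \subset V_i \quad (i = 1, 2)$$
and such that for $i = 1, 2$ we have
$$\mu(K_i) \leq \mu(A_i) \leq \mu(V_i) < \mu(K_i) + \epsilon.$$
In particular, by the finite additivity of $\mu$, we have
$$\mu(V_i - K_i) < \epsilon.$$
Since
$$K_1 \cup K_2 \subset A_1 \cup A_2 \subset V_1 \cup V_2 \quad \text{and} \quad (V_1 \cup V_2) - (K_1 \cup K_2) \subset (V_1 - K_1) \cup (V_2 - K_2),$$
we get
$$\mu(V_1 \cup V_2) \leq \mu(K_1 \cup K_2) + 2\epsilon.$$
It follows that $A_1 \cup A_2$ lies in $\mathcal{A}$. Next we note that $K_1 - V_2$ is
compact, $V_1 - K_2$ is open, and
$$K_1 - V_2 \subset A_1 - A_2 \subset V_1 - K_2.$$
The difference of the two extreme sets satisfies
$$(V_1 - K_2) - (K_1 - V_2) \subset (V_1 - K_1) \cup (V_2 - K_2),$$
so that $A_1 - A_2$ lies in $\mathcal{A}$. Since we can write
$$A_1 \cap A_2 = A_1 - (A_1 - A_2),$$
it follows that $A_1 \cap A_2$ lies in $\mathcal{A}$, thus showing that $\mathcal{A}$ is an algebra, and
proving our lemma.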
Theorem 2.6. Let $\lambda$ be a positive functional on $C_c(X)$. Let $\mu$ be the
outer measure determined by $\lambda$, and let $\mathcal{A}$ be the algebra of all sets $A$
of finite measure such that
$$\mu(A) = \sup \mu(K) \quad \text{for } K \text{ compact} \subset A.$$
Let $\mathcal{M}$ be the collection of all subsets $Y$ of $X$ such that $Y \cap K$ lies in
$\mathcal{A}$ for all compact $K$. Then $\mathcal{M}$ is a $\sigma$-algebra containing the Borel sets,
and $\mu$ is a positive measure on $\mathcal{M}$. Furthermore, $\mathcal{A}$ consists of the sets
of finite measure in $\mathcal{M}$.
Proof. It is clear that $\mathcal{A} \subset \mathcal{M}$. Let $\mathcal{M}_K$ as usual denote the collection
of all sets $Y \cap K$ with $Y \in \mathcal{M}$. Then $\mathcal{M}_K = \mathcal{A}_K$, and is therefore a
$\sigma$-algebra in $K$ for each compact $K$, by Lemma 2.5. It follows immediately
that $\mathcal{M}$ itself is a $\sigma$-algebra, because the operations of countable union,
intersection, and complementation in $X$ commute with the operation of
intersecting with $K$. (Cf. Lemma 6.2 of Chapter VI where we met a
similar situation.)
That $\mathcal{M}$ contains all closed sets is obvious because if $Y$ is closed and
$K$ compact, then $Y \cap K$ is compact and so lies in $\mathcal{A}$. Therefore $\mathcal{M}$
contains the Borel sets.
Let $A$ be of finite measure in $\mathcal{M}$. Let $V$ be open containing $A$, and of
finite measure. Let $K$ be compact $\subset V$ such that
$$\mu(V) < \mu(K) + \epsilon.$$
Since $A \cap K$ lies in $\mathcal{A}$, there is some compact $K' \subset A \cap K$ such that
$$\mu(A \cap K) < \mu(K') + \epsilon.$$
But $A \subset (A \cap K) \cup (V - K)$, so that
$$\mu(A) \leq \mu(A \cap K) + \mu(V - K) \leq \mu(K') + 2\epsilon.$$
This proves that $A$ lies in $\mathcal{A}$, and therefore that $\mathcal{A}$ is precisely the
algebra of sets of finite measure in $\mathcal{M}$.
Finally, let $\{A_n\}$ be a disjoint sequence in $\mathcal{M}$. If some $A_n$ has infinite
measure, the countable additivity of $\mu$ on $\bigcup A_n$ is clear. If all $A_n$ have
finite measure, then Lemma 2.5 applies. This proves our theorem.
The measure of Theorem 2.3 (or Theorem 2.6) will be called the
associated measure of $\lambda$, or the measure determined by $\lambda$. In applications,
one needs it mainly on the Borel sets (or the completion of the Borel
sets).
We now wish to prove that the functional $\lambda$ is given by the integral.
First we note that if $\mu$ is a $\sigma$-regular measure and $f \in C_c(X)$, then $f$ is in
$\mathcal{L}^1(\mu)$. Indeed, $f$ being continuous implies that $f$ is measurable. Also $f$
vanishes outside a compact set (so of finite measure), and is bounded on
that set, and hence $f$ is in $\mathcal{L}^1(\mu)$, say by Corollary 5.9 of the dominated
convergence theorem (Chapter VI). Next we need a lemma.
Partitions of Unity. Let $K$ be compact and let $\{U_1, \ldots, U_n\}$ be an open
covering of $K$. There exist functions $f_i$ ($i = 1, \ldots, n$) such that $f_i \prec U_i$
and such that
$$\sum_{i=1}^{n} f_i(x) = 1, \quad \text{all } x \in K.$$
Proof. For each $x \in K$ let $W_x$ be an open neighborhood of $x$ such that
$\overline{W_x} \subset U_{i(x)}$ for some index $i(x)$. We can cover $K$ by a finite number of
open sets $W_{x_1}, \ldots, W_{x_m}$. Let $V_i$ be the union of all open sets $W_{x_j}$ such
that $\overline{W_{x_j}} \subset U_i$. Then $\{V_1, \ldots, V_n\}$ is an open covering of $K$. Furthermore
$\overline{V_i} \subset U_i$. Let $g_i$ be a function such that
$$K \cap \overline{V_i} \prec g_i \prec U_i.$$
Let
$$f_1 = g_1, \quad f_2 = g_2(1 - g_1), \quad \ldots, \quad f_n = g_n(1 - g_1) \cdots (1 - g_{n-1}).$$
Then $f_i \prec U_i$, and by induction one sees at once that
$$f_1 + \cdots + f_n = 1 - (1 - g_1) \cdots (1 - g_n).$$
From this our condition $\sum f_i(x) = 1$ for $x \in K$ follows at once, because
every $x \in K$ lies in some $V_i$, where $g_i(x) = 1$.
The functions $\{f_i\}$ are said to form a partition of unity over $K$,
subordinate to the covering $\{U_1, \ldots, U_n\}$.
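The inductive formula in the proof can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $K = [0, 2]$ on the real line, the covering $U_1 = (-1, 1.5)$, $U_2 = (0.5, 3)$, and piecewise linear functions `g1`, `g2` built by a hypothetical helper `bump`; none of these choices come from the text.

```python
def bump(center, inner, outer):
    """Continuous function equal to 1 where |x - center| <= inner,
    equal to 0 where |x - center| >= outer, and linear in between."""
    def g(x):
        d = abs(x - center)
        if d <= inner:
            return 1.0
        if d >= outer:
            return 0.0
        return (outer - d) / (outer - inner)
    return g


def partition_of_unity(gs):
    """Return f_1, ..., f_n with f_i = g_i * (1 - g_1) * ... * (1 - g_{i-1})."""
    def make(i):
        def f(x):
            value = gs[i](x)
            for j in range(i):
                value *= 1.0 - gs[j](x)
            return value
        return f
    return [make(i) for i in range(len(gs))]


# Assumed example data: K = [0, 2], U_1 = (-1, 1.5), U_2 = (0.5, 3).
g1 = bump(0.25, 0.9, 1.2)  # equals 1 on [-0.65, 1.15], vanishes outside (-0.95, 1.45)
g2 = bump(1.75, 0.9, 1.2)  # equals 1 on [0.85, 2.65], vanishes outside (0.55, 2.95)
fs = partition_of_unity([g1, g2])

# Sample K and measure how far f_1 + f_2 is from 1 on K.
samples = [2.0 * k / 200 for k in range(201)]
max_error = max(abs(sum(f(x) for f in fs) - 1.0) for x in samples)
```

Every point of $K$ lies where `g1` or `g2` equals 1, so the product $(1 - g_1)(1 - g_2)$ vanishes on $K$ and the sum of the $f_i$ is 1 there, up to rounding.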
Theorem 2.7 (Riesz Theorem, Part 2). Let $\lambda$ be a positive functional
on $C_c(X)$, and let $\mu$ be the Borel measure determined by $\lambda$. For all
$f \in C_c(X)$ we have
$$\lambda f = \int_X f \, d\mu.$$
Proof. It suffices to prove our statement when $f$ is real, $\|f\| \neq 0$. It
will also suffice to prove the inequality
$$\lambda f \leq \int_X f \, d\mu$$
(the reverse inequality following by considering $-f$ instead of $f$). Let $K$
be the support of $f$. Given $\epsilon$, which we may assume $\leq \|f\|$, we can find
a partition $\{A_1, \ldots, A_n\}$ of $K$ by measurable sets, a step function
$$\varphi = \sum_{i=1}^{n} c_i \chi_{A_i},$$
and open sets $V_i \supset A_i$ such that
$$f \leq \varphi \leq f + \epsilon \text{ on } K, \qquad (*) \quad \mu(V_i) \leq \mu(A_i) + \frac{\epsilon}{n\|f\|},$$
and also that $f \leq c_i$ on $V_i$. [For instance, cut an interval containing the
image of $f$ into $\epsilon/2$-subintervals, say half closed to make them disjoint,
and let $c_i'$ be the right end point of each subinterval. Let $A_i$ be the
inverse image in $K$ of the $i$-th subinterval. Let $c_i = c_i' + \epsilon/2$. For each $i$
let $W_i$ be open $\supset A_i$ such that $f \leq c_i$ on $W_i$, and shrink $W_i$ to an open
$V_i \supset A_i$ satisfying (*).]
Let $\{h_1, \ldots, h_n\}$ be a partition of unity over $K$ subordinate to
$\{V_1, \ldots, V_n\}$. Then $f h_i$ has support in $V_i$, and $f h_i \leq c_i h_i$. Furthermore,
$K \prec \inf(1, \sum h_i)$, so that
$$\mu(K) \leq \sum_{i=1}^{n} \lambda h_i.$$
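Let $c = \max |c_i|$. Then $c \leq \|f\| + \epsilon$. We have
$$\begin{aligned}
\lambda f = \sum_{i=1}^{n} \lambda(f h_i) &\leq \sum_{i=1}^{n} \lambda(c_i h_i) \\
&= \sum_{i=1}^{n} c_i \lambda h_i = \sum_{i=1}^{n} (c_i + c)\lambda h_i - c \sum_{i=1}^{n} \lambda h_i \\
&\leq \sum_{i=1}^{n} (c_i + c)\mu(V_i) - c\mu(K) \\
&\leq \sum_{i=1}^{n} (c_i + c)\left[\mu(A_i) + \frac{\epsilon}{n\|f\|}\right] - c\mu(K) \\
&\leq \int_K \varphi \, d\mu + 4\epsilon + c\mu(K) - c\mu(K) \\
&\leq \int_K f \, d\mu + \epsilon\mu(K) + 4\epsilon.
\end{aligned}$$
This proves our inequality since the integral of $f$ over $K$ is the same as
the integral of $f$ over $X$, and concludes the proof of our theorem.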
Corollary 2.8. Let Mo be the set of $\sigma$-regular positive Borel measures
on $X$. The map
$$\mu \mapsto d\mu$$
is an additive bijection between Mo and the set of positive functionals on
$C_c(X)$.
Proof. Theorems 2.3 and 2.7 show that the map $\mu \mapsto d\mu$ is surjective.
Let $\mu_1$, $\mu_2$ be positive measures satisfying conditions (ii), (iii), and (iv) of
Theorem 2.3 and assume that $d\mu_1 = d\mu_2$. To show that $\mu_1 = \mu_2$, it
suffices to prove that the two measures coincide on compact sets, because
then (iv) shows that they coincide on open sets, and (ii) shows that they
are equal on Borel sets. Let $K$ be compact, and let $V$ be an open set
containing $K$ such that
$$\mu_2(V) < \mu_2(K) + \epsilon.$$
Let $K \prec f \prec V$. Then $\chi_K \leq f \leq \chi_V$, whence
$$\mu_1(K) \leq \int_X f \, d\mu_1 = \int_X f \, d\mu_2 \leq \mu_2(V) < \mu_2(K) + \epsilon.$$
This proves one inequality, and the other follows by symmetry. Thus we
get a bijection between Mo and the set of positive functionals on $C_c(X)$.
This bijection is obviously additive. This proves our corollary.
IX, §3. REGULAR POSITIVE MEASURES
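Theorem 3.1. Let $\mu$ be a positive $\sigma$-regular Borel measure on $X$. Then
$C_c(X)$ is dense in $\mathcal{L}^p(\mu)$ for $1 \leq p < \infty$.
Proof. The step functions are dense in $\mathcal{L}^p(\mu)$, and it thus suffices to
prove that for any set $A$ of finite measure and any given $\epsilon$ we can find some
$f \in C_c(X)$ such that
$$\|\chi_A - f\|_p^p < \epsilon.$$
We take $K$ compact, $V$ open such that $K \subset A \subset V$, and $\mu(V) < \mu(K) + \epsilon$.
Let $K \prec f \prec V$. Then
$$|\chi_A - f| \leq \chi_{V - K}$$
and
$$\|\chi_A - f\|_p^p \leq \mu(V - K) < \epsilon.$$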
Corollary 3.2. If $f \in \mathcal{L}^1(\mu)$ and $\int f\varphi \, d\mu = 0$ for all $\varphi \in C_c(X)$, then
$f = 0$ almost everywhere.
Proof. Let $A$ be a set of finite measure. Then $\chi_A$ is the $\mathcal{L}^1$-limit of a
sequence $\{\varphi_n\}$ in $C_c(X)$ with $0 \leq \varphi_n \leq 1$. Taking a subsequence if
necessary, we may assume, by Theorem 5.2 of Chapter VI, that $\{\varphi_n\}$
converges to $\chi_A$ almost everywhere, and thus $\{f\varphi_n\}$ converges to $f\chi_A$ almost
everywhere. By the dominated convergence theorem, we conclude that
$\int f\chi_A \, d\mu = 0$, and Corollary 5.16 of Chapter VI finishes the proof.
The next theorem shows that a measurable function is almost
continuous, on a set of finite measure.
Theorem 3.3 (Lusin's Theorem). Let $\mu$ be a positive $\sigma$-regular Borel
measure on $X$. Let $f$ be a complex measurable function on $X$, and
assume that there exists a set $A$ of finite measure such that $f$ is equal
to 0 outside $A$. Given $\epsilon$, there exists $g \in C_c(X)$ and a measurable set $Z$
with $\mu(Z) < \epsilon$ such that $f(x) = g(x)$ for $x \in X - Z$. Furthermore, we can
select $g$ such that $\|g\| \leq \|f\|$ (sup norm).
Proof. Let $A_n$ be the set where $|f| \geq n$. Since the intersection of all $A_n$
is empty, it follows that the measures $\mu(A_n)$ approach 0. Excluding a set
of small measure, we suppose that $f$ is bounded.
In this case, $f$ is in $\mathcal{L}^1(\mu, \mathbf{C})$. By Theorem 3.1 there exists a sequence
$\{g_n\}$ in $C_c(X)$ which is $\mathcal{L}^1$-convergent to $f$. Taking a subsequence if
necessary, and using Theorem 5.2 of Chapter VI, we may assume that
there is a set $Z$ with $\mu(Z) < \epsilon$ such that the convergence is uniform
outside $Z$. By regularity, we can find a compact set $K$ contained in
$A - Z$ such that $\mu(A - K) < 2\epsilon$. The convergence of $\{g_n\}$ is uniform on
$K$, and hence the restriction of $f$ to $K$ is a continuous function $g$ on $K$.
Let $V$ be open $\supset K$ such that $\overline{V}$ is compact. By Theorem 4.4 of Chapter
II (Tietze extension theorem) we can find a continuous function $g^*$ which
is equal to $g$ on $K$ and 0 on the boundary of $V$. We extend $g^*$ to all of
$X$ by giving $g^*$ the value 0 outside $V$. Then $g^*$ is equal to $f$ on $K$ and is
in $C_c(X)$.
This leaves only the last statement, that we can manage $\|g\| \leq \|f\|$.
Let $b = \|f\|$. Let $h$ be the function such that $h(z) = z$ if $|z| \leq b$ and
$h(z) = bz/|z|$ if $|z| > b$. Then $h$ is continuous, $\|h\| \leq b$, and $h \circ g^*$ fulfills
our requirements, thus proving Lusin's theorem.
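The truncation map $h$ used in the last step is easy to write down explicitly. The following is a minimal sketch, with the hypothetical name `radial_clip` standing for $h$ and Python complex numbers standing for points of $\mathbf{C}$; it only illustrates that $h$ fixes the closed disc of radius $b$ and pushes every other point radially onto its boundary.

```python
def radial_clip(z, b):
    """The map h: h(z) = z if |z| <= b, and h(z) = b z / |z| if |z| > b."""
    r = abs(z)
    if r <= b:
        return z
    return b * z / r


inside = radial_clip(0.3 + 0.4j, 1.0)   # |z| = 0.5 <= 1, so z is unchanged
outside = radial_clip(3.0 + 4.0j, 1.0)  # |z| = 5 > 1, so z is rescaled to modulus 1
```

Since $|h(z)| \leq b$ for every $z$ and $h(z) = z$ whenever $|z| \leq b$, composing with $h$ does not change $g^*$ at the points where $g^* = f$, and bounds the sup norm by $b$.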
IX, §4. BOUNDED FUNCTIONALS AS INTEGRALS
Let $m$ be a complex valued measure on the Borel sets of $X$. We shall say
that $m$ is regular if $|m|$ is regular. See Exercise 7 for examples. Recall
that for complex measures, $|m|$ is always bounded, by Theorem 3.3 of
Chapter VII, and that we can define the norm $\|m\| = |m|(X)$.
Theorem 4.1. The complex regular Borel measures on $X$ form a Banach
space.
Proof. We leave most of the proof as an exercise. We shall just prove
that if $m_1$, $m_2$ are regular, then $m = m_1 + m_2$ is regular. Indeed, we have
$$|m_1 + m_2| \leq |m_1| + |m_2|.$$
For any Borel set $A$ we select $K$ compact in $A$ such that
$$|m_1|(A - K) < \epsilon$$
and
$$|m_2|(A - K) < \epsilon.$$
Then $|m|(A - K) < 2\epsilon$. Similarly for open sets, whence $m_1 + m_2$ is
regular.
We wish to interpret regular Borel measures as bounded functionals
on $C_c(X)$. The easiest way at this point is to use the Radon-Nikodym
theorem, and write
$$dm = h \, d|m|$$
for some $h \in \mathcal{L}^1(|m|, \mathbf{C})$, with $|h| = 1$ (Theorem 3.5 of Chapter VII). Thus
by definition, for $f \in C_c(X)$, we define
$$\langle f, dm \rangle = \int_X f h \, d|m|.$$
Let us denote by Mo$(X, \mathbf{C})$ = Mo the Banach space of complex regular
Borel measures on $X$. The map
$$m \mapsto dm$$
is then a linear map of Mo into the dual space of $C_c(X)$ (sup norm),
because we have the inequality
$$|dm| \leq \|m\|,$$
or written out explicitly,
$$\left| \int_X f h \, d|m| \right| \leq \|f\| \, \|m\|.$$
In fact:
Theorem 4.2 (Riesz Theorem, Part 3). The map $m \mapsto dm$ is a
norm-preserving isomorphism between the space of regular complex Borel
measures on $X$ and the dual space of $C_c(X)$ (with sup norm topology).
Proof. Our map is obviously linear. To show that it is surjective, we
view any bounded functional $\lambda$ as a functional on $C_c(X, \mathbf{R})$ and then
decompose $\lambda$ into its real and imaginary parts, say $\lambda = \sigma + i\tau$, where $\sigma$, $\tau$
are then bounded functionals. We express each real bounded functional
as a difference of positive functionals using Theorem 1.2, and apply part
1 of the Riesz theorem to these positive functionals to represent them by
positive measures. If $n$ is a positive bounded functional and $\mu$ is the
measure which represents $n$ by Theorem 2.3, then $\mu(X) < \infty$. To see this,
note that by condition (iv) of this theorem, we have
$$\mu(X) = \sup \mu(K) \quad \text{for } K \text{ compact}.$$
If $K \prec f$, we must have
$$\mu(K) \leq \int_X f \, d\mu = nf \leq C\|f\| = C$$
where $C = |n|$, so that in fact $\mu(X) \leq |n|$. By definition and the other
conditions of Theorem 2.3, we conclude that $\mu$ is regular. If $\mu_i$ with
$i = 1, \ldots, 4$, are the bounded regular positive measures representing $\sigma^+$, $\sigma^-$,
$\tau^+$, $\tau^-$ respectively, then the complex measure
$$m = \mu_1 - \mu_2 + i(\mu_3 - \mu_4)$$
is regular and represents $\lambda$, i.e. we have $\lambda = dm$, thus proving that our
map is surjective.
To show that the map is injective, we have to prove that its kernel is
0. Suppose that $dm = 0$. Let $\mu = |m|$ and $dm = h \, d\mu$ with $|h| = 1$. Then
$\langle f, h \rangle_\mu = 0$ for all $f \in C_c(X)$. But $C_c(X)$ is $\mathcal{L}^1$-dense in $\mathcal{L}^1(\mu, \mathbf{C})$ by
Theorem 3.1. We have the inequality
$$|\langle f, h \rangle_\mu| \leq \|h\| \, \|f\|_1$$
for all $f \in \mathcal{L}^1(\mu, \mathbf{C})$. It follows that $\langle \varphi, h \rangle_\mu = 0$ for all step functions $\varphi$,
whence $h$ is equal to 0 almost everywhere. Since $|h| = 1$ we must have
$\mu(X) = 0$, thus proving $m = 0$.
Finally, write again $dm = h \, d\mu$ with $\mu = |m|$ and $|h| = 1$. Let $\lambda = dm$.
We have to show that $\mu(X) \leq |\lambda|$. By Lusin's theorem, §3, we can find a
function $g \in C_c(X)$ such that $g = \bar{h}$ except on a set $Z$ of measure $< \epsilon$, and
such that $|g| \leq 1$ on $Z$. Consequently
$$|\lambda| \geq |\lambda g| = \left| \int_X g h \, d\mu \right| \geq \mu(X - Z) - \mu(Z) \geq \mu(X) - 2\epsilon.$$
This proves the desired inequality, and concludes the proof of the
theorem.
Remark. Let $\lambda$ be a $C_c$-functional on $C_c(X)$. For each compact subset
$K$, the restriction of the functional to $C_K(X)$ is a bounded functional,
which has a corresponding measure $\mu_K$ by Theorem 4.2. If $K_1 \subset K_2$ are
two compact sets, then it is easily verified that the restriction of $\mu_{K_2}$ to
$K_1$ is $\mu_{K_1}$. If $A$ is Borel-measurable and $A \subset K$, then we define
$$\mu_\lambda(A) = \mu_K(A),$$
which does not depend on the choice of $K$. This function $\mu_\lambda$, defined on
Borel-measurable subsets $A$ of compact sets, will also be called a
measure, and more specifically the measure associated with $\lambda$. For instance,
suppose $\mu$ is a positive Borel measure on $X$, and $f$ is a measurable
function, bounded on each compact set. Then $\lambda = f \, d\mu$ defines such a
functional, which has such an associated measure. The measure $\mu_\lambda$ could
also be called the direct limit of the measures $\mu_K$, taken over all compact
sets $K$.
IX, §5. LOCALIZATION OF A MEASURE AND OF THE INTEGRAL
The introduction of partitions of unity in §2 is not as accidental as it
seems. We can use them to localize a measure, or functional.
Theorem 5.1. Let $\{W_\alpha\}$ be an open covering of $X$. For each index $\alpha$,
let $\lambda_\alpha$ be a functional on $C_c(W_\alpha)$. Assume that for each pair of indices $\alpha$,
$\beta$ the functionals $\lambda_\alpha$ and $\lambda_\beta$ are equal on $C_c(W_\alpha \cap W_\beta)$. Then there exists
a unique functional $\lambda$ on $X$ whose restriction to each $C_c(W_\alpha)$ is equal to
$\lambda_\alpha$. If each $\lambda_\alpha$ is positive, then so is $\lambda$.
Proof. Let $f \in C_c(X)$ and let $K$ be the support of $f$. Let $\{h_i\}$ be a
partition of unity over $K$ subordinated to a covering of $K$ by a finite
number of the open sets $W_\alpha$. Then each $h_i f$ has support in some $W_{\alpha(i)}$
and we define
$$\lambda f = \sum_i \lambda_{\alpha(i)}(h_i f).$$
We contend that this sum is independent of the choice of $\alpha(i)$, and also
of the choice of partition of unity. Once this is proved, it is then obvious
(see Exercise 10) that $\lambda$ is a functional which satisfies our requirements.
We now prove this independence. First note that if $W_{\alpha'(i)}$ is another one
of the open sets $W_\alpha$ in which the support of $h_i f$ is contained, then $h_i f$ has
support in the intersection $W_{\alpha(i)} \cap W_{\alpha'(i)}$, and our assumption concerning
our functionals $\lambda_\alpha$ shows that the corresponding term in the sum does
not depend on the choice of index $\alpha(i)$. Next, let $\{g_k\}$ be another
partition of unity over $K$ subordinated to some covering of $K$ by a finite
number of the open sets $W_\alpha$. Then for each $i$,
$$h_i f = \sum_k g_k h_i f,$$
whence
$$\sum_i \lambda_{\alpha(i)}(h_i f) = \sum_i \sum_k \lambda_{\alpha(i)}(g_k h_i f).$$
If the support of $g_k h_i f$ is in some $W_\alpha$, then the value $\lambda_\alpha(g_k h_i f)$ is
independent of the choice of index $\alpha$. The expression on the right is then
symmetric with respect to our two partitions of unity, whence our
theorem follows.
Corollary 5.2. Let $\{W_\alpha\}$ be an open covering of $X$. For each index $\alpha$,
let $\mu_\alpha$ be a positive $\sigma$-regular measure on $W_\alpha$. Assume that for each
pair of indices $\alpha$, $\beta$ the measures $\mu_\alpha$ and $\mu_\beta$ induce equal measures on
$W_\alpha \cap W_\beta$. Then there exists a unique $\sigma$-regular positive measure $\mu$ on
$X$ whose restriction to each $W_\alpha$ is equal to $\mu_\alpha$.
Proof. This is merely a rewording of the theorem, in view of the
correspondence between $\sigma$-regular measures and positive functionals.
Theorem 5.1 will be used only in the proof of Stokes' theorem in
Chapter XXIII.
In §2 and in Theorem 5.1, we dealt with partitions of unity over a
compact subset of $X$. We shall now discuss partitions of unity over all
of $X$.
Let $\mathcal{U}$ be a covering of $X$, say by open sets. We say that $\mathcal{U}$ is locally
finite if every point of $X$ has a neighborhood which intersects only finitely
many elements of the covering. A refinement $\{U_j\}$ of a covering $\{V_i\}$ of
$X$ is a covering such that each $U_j$ is contained in some $V_i$. We also say
that the covering $\{U_j\}$ is subordinated to the covering $\{V_i\}$.
A (continuous) partition of unity on $X$ consists of an open covering
$\{W_i\}$ of $X$ and a family of real continuous functions
$$\psi_i : X \to \mathbf{R}$$
satisfying the following conditions.
PU 1. For all $x \in X$, we have $\psi_i(x) \geq 0$.
PU 2. The support of $\psi_i$ is contained in $W_i$.
PU 3. The covering $\{W_i\}$ is locally finite.
PU 4. For each point $x \in X$, we have
$$\sum_i \psi_i(x) = 1.$$
(The sum is taken over all $i$, but is in fact finite for any given $x$ in view
of PU 3.) As a matter of notation, we often write that $\{(W_i, \psi_i)\}$ or
simply $\{\psi_i\}$ is a partition of unity if it satisfies the previous four conditions.
In the proof of the next theorem, we use the facts (trivially proved)
that if a space X has a countable base, then any open covering has a
countable subcovering, and any base contains a countable base.
Theorem 5.3. Let $X$ be locally compact Hausdorff, and assume that the
topology of $X$ has a countable base. Then $X$ admits continuous
partitions of unity, subordinated to a given open covering $\mathcal{U}$.
Proof. Let $V_1, V_2, \ldots$ be a base for the open sets, such that each
$\overline{V_i}$ is compact. We construct first inductively a sequence $A_1, A_2, \ldots$ of
compact sets whose union is $X$ and such that $A_i$ is contained in the
interior of $A_{i+1}$. We let $A_1 = \overline{V_1}$. If we have constructed $A_i$ inductively,
then we let $j$ be the smallest integer such that $A_i$ is contained in
$$V_1 \cup \cdots \cup V_j,$$
and we let $A_{i+1}$ be the compact set
$$\overline{V_1} \cup \cdots \cup \overline{V_j}.$$
Let Int abbreviate interior. For each point $x$ of $A_{i+1} - \mathrm{Int}(A_i)$ we can
find a pair $(W_x, V_x)$ of open sets containing $x$ such that $W_x \subset \overline{W_x} \subset V_x$,
such that $V_x$ is contained in $\mathrm{Int}(A_{i+2}) - A_{i-1}$, and such that $V_x$ is
contained in one of the open sets of the given covering $\mathcal{U}$. There is a finite
number of pairs such that already the open sets $W_x$ cover the compact
set $A_{i+1} - \mathrm{Int}(A_i)$. Taking all such finite collections of pairs for $i = 1, 2,
\ldots$, we obtain a countable collection of pairs $\{(W_k, V_k)\}$ such that the
$\{V_k\}$ form a locally finite covering of $X$, the $\{W_k\}$ is also an open
covering, and $\overline{W_k} \subset V_k$. Let $h_k$ be such that $\overline{W_k} \prec h_k \prec V_k$ (see the beginning of
§2 for the notation $\prec$). Let
$$h = \sum_{k=1}^{\infty} h_k.$$
Let
$$\psi_k = h_k / h.$$
Then $\{\psi_k\}$ is the desired partition of unity.
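The normalization step $\psi_k = h_k / h$ can be illustrated on the real line. The sketch below is a simplified illustration and not the construction of the proof: it assumes $X = \mathbf{R}$, uses the hypothetical bumps `h(k, x)` supported in $[k - 1, k + 1]$ for integers $k$, and exploits the fact that only finitely many bumps are nonzero near any given point (local finiteness).

```python
import math


def h(k, x):
    """Assumed bump h_k: continuous, values in [0, 1], support in [k - 1, k + 1]."""
    return max(0.0, 1.0 - abs(x - k)) ** 2


def psi(k, x):
    """psi_k(x) = h_k(x) / h(x), where h is the (locally finite) sum of all h_j."""
    base = math.floor(x)
    # Only j = floor(x) and j = floor(x) + 1 can give a nonzero h_j(x);
    # the range is taken slightly larger for safety.
    total = sum(h(j, x) for j in range(base - 1, base + 3))
    return h(k, x) / total


# Sum of psi_k over a window of indices large enough for the sample points.
points = [-4.9 + 0.1 * i for i in range(99)]
sums = [sum(psi(k, x) for k in range(-10, 11)) for x in points]
```

The bumps alone do not sum to 1 (at $x = 0.5$ their sum is $0.5$), so dividing by $h$ is what produces condition PU 4.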
Theorem 5.4. Let $\{h_i\}$ ($i = 1, 2, \ldots$) be a countable partition of unity on
$X$. Let $\mu$ be a regular positive Borel measure on $X$, and let
$$f \in \mathcal{L}^1(\mu).$$
Then for each $i$, $h_i f$ is in $\mathcal{L}^1(\mu)$, and
$$\sum_{i=1}^{\infty} \int_X h_i f \, d\mu = \int_X f \, d\mu$$
in the sense that the sum is absolutely convergent, and is equal to the
integral on the right.
Proof. Let
$$f_n = \sum_{i=1}^{n} h_i f.$$
Then $|f_n| \leq |f|$, and the sequence $\{f_n\}$ is pointwise convergent to $f$. We
can therefore apply the dominated convergence theorem to conclude the
proof.
IX, §6. PRODUCT MEASURES ON LOCALLY COMPACT SPACES
Let $X$, $Y$ be locally compact Hausdorff spaces, and let $\mu$, $\nu$ be positive
$\sigma$-regular Borel measures on $X$ and $Y$, respectively. We let $\mathcal{B}(X)$ and
$\mathcal{B}(Y)$ denote the $\sigma$-algebras of Borel sets in $X$ and $Y$, respectively. If $X$,
$Y$ are $\sigma$-finite with respect to these measures, then Fubini's theorem
applies. However, we warn the reader that in general, one does not have
$$\mathcal{B}(X \times Y) = \mathcal{B}(X) \otimes \mathcal{B}(Y).$$
(Even if $X$ is compact, $Y = X$. Examples are obtained by taking $X$ with
abnormally many open sets.) However, we can still integrate functions in
$C_c(X \times Y)$, as shown by the following results, which are nothing but
corollaries of the Stone-Weierstrass theorem, expressed as lemmas.
Lemma 6.1. Let $X$, $Y$ be locally compact Hausdorff spaces. Every
function in $C_c(X \times Y)$ can be uniformly approximated by functions
which are finite sums of type
$$\sum_i \varphi_i(x)\psi_i(y), \quad \varphi_i \in C_c(X), \ \psi_i \in C_c(Y).$$
Proof. We may restrict ourselves to the real case. We note that
functions of the above type form an algebra $A$ which separates points,
and this algebra is such that if $K$ is compact in $X \times Y$, then there exists
some $g \in A$ such that $g$ is equal to 1 on $K$. (For instance, if $C$, $D$ are the
projections of $K$ on $X$ and $Y$, respectively, then $K \subset C \times D$, and we can
write
$$g(x, y) = \varphi(x)\psi(y)$$
where $\varphi$ is 1 on $C$ and $\psi$ is 1 on $D$.) We are therefore reduced to
proving a second lemma.
Lemma 6.2. Let $X$ be locally compact Hausdorff, and let $A$ be an
algebra of real valued functions in $C_c(X)$, which separates points, and is
such that if $K$ is compact in $X$, then there exists $\alpha \in A$ which is 1 on $K$.
Then $A$ is dense in $C_c(X)$ for the sup norm.
Proof. Let $f \in C_c(X)$ and let $K$ be the compact support of $f$. Let
$\alpha \in A$ be 1 on $K$. Let $U$ be an open set containing the support of $\alpha$,
and having compact closure $\overline{U}$. The restrictions to $\overline{U}$ of elements of $A$
form an algebra, which clearly satisfies the hypotheses of the
Stone-Weierstrass theorem. Therefore the restriction $f|\overline{U}$ can be uniformly
approximated by elements of $A|\overline{U}$. Denote by $\| \ \|_{\overline{U}}$ the sup norm over
$\overline{U}$. If we can approximate $f$ by an element $p \in A$ over $\overline{U}$, say
$$\|f - p\|_{\overline{U}} < \epsilon,$$
then
$$\|\alpha f - \alpha p\|_{\overline{U}} < \epsilon\|\alpha\|,$$
and thus we have a uniform approximation of $\alpha f$ by $\alpha p$ over $\overline{U}$. But
$\alpha f = f$, and $\alpha p$ is equal to 0 outside $U$. Thus the uniform approximation
holds over all of $X$, as was to be shown.
As a matter of notation, if $\varphi$ is a function on $X$ and $\psi$ is a function
on $Y$, then we denote by $\varphi \otimes \psi$ the function
$$(x, y) \mapsto \varphi(x)\psi(y),$$
and call it the product function. The set of finite sums of product
functions is an algebra, which we shall call the algebra generated by the
product functions.
Theorem 6.3. Let $X$, $Y$ be locally compact Hausdorff spaces and let $\mu$,
$\nu$ be positive $\sigma$-regular Borel measures on $X$ and $Y$, respectively. Assume
that $X$, $Y$ are $\sigma$-finite with respect to these measures. Then all functions
in $C_c(X \times Y)$ are in $\mathcal{L}^1(\mu \otimes \nu)$, and there exists a unique $\sigma$-regular
Borel measure on $X \times Y$ which restricts to $\mu \otimes \nu$ on $\mathcal{B}(X) \otimes \mathcal{B}(Y)$.
Proof. Lemma 6.1 shows that functions in $C_c(X \times Y)$ are $(\mu \otimes \nu)$-measurable,
and combined with Fubini's theorem shows that these functions
are in $\mathcal{L}^1(\mu \otimes \nu)$. The map
$$f \mapsto \int_{X \times Y} f \, d(\mu \otimes \nu)$$
then is obviously a positive functional on $C_c(X \times Y)$, and we can
therefore apply Theorem 2.3 to get a $\sigma$-regular Borel measure having the
desired properties. Corollary 2.8 gives the uniqueness, thus proving
our theorem.
IX, §7. EXERCISES
We assume throughout that X is locally compact Hausdorff.

1. Let $X$ be compact, and let $C(X)$ be the algebra of real continuous functions
on $X$. If $\lambda$ is a functional on $C(X)$, such that $\lambda(1) = |\lambda|$, show that $\lambda$ is
positive.
2. Assume that $X$ is separable. Show that every open set is $\sigma$-compact.
3. Show that the complex regular Borel measures form a Banach space.
4. Assume that $X$, $Y$ are locally compact Hausdorff and $\sigma$-compact. If $\mu$, $\nu$ are
regular Borel measures on $X$, $Y$ respectively, show that $\mu \otimes \nu$ is regular.
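5. Let $\mu$, $\nu$ be regular Borel measures on $\mathbf{R}^n$. Define the convolution $\mu * \nu$ by
$$(\mu * \nu)(A) = (\mu \otimes \nu)(\sigma^{-1}(A)),$$
where $\sigma : \mathbf{R}^n \times \mathbf{R}^n \to \mathbf{R}^n$ is the sum. Show that $\mu * \nu$ is regular.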
6. Assume that $X$ is $\sigma$-compact. Let $\mu$ be a regular Borel measure on $X$. If $A$ is
measurable, show that there exists a closed set $B \subset A$ and an open set $V \supset A$
such that $\mu(V - B) < \epsilon$.
7. Assume that every open set in $X$ is $\sigma$-compact. If $\nu$ is a positive Borel
measure which is finite on compact sets, show that $\nu$ is regular. [Hint: Show
that $\nu = \mu$ if $\mu$ is the regular measure associated with $d\nu$ as in the text. Do it
first for open sets.]
8. (a) Let $M$ denote the Banach space of complex regular Borel measures on $\mathbf{R}^n$.
If $m$, $m'$ are in $M$, show that for $f \in C_c(\mathbf{R}^n)$ the integral
$$\iint f(x + y) \, dm(x) \, dm'(y)$$
exists, and defines a bounded functional on $C_c(\mathbf{R}^n)$, whose measure is
denoted by $m * m'$, and is called the convolution of $m$ and $m'$. Prove that
convolution of elements of $M$ is associative, bilinear, commutative, and
has a unit element. Thus $M$ is a Banach algebra.
(b) Let $\mu$ be Lebesgue measure and let $f \in \mathcal{L}^1(\mu, \mathbf{C})$. Show that for any
$m \in M$ we have
$$m * \mu_f = \mu_g$$
for some $g \in \mathcal{L}^1(\mu, \mathbf{C})$. In algebraic terminology, this means that the
absolutely continuous elements of $M$ (with respect to Lebesgue measure)
form an ideal in $M$.
9. (a) Let $\lambda$ be a bounded functional on $C_c(X)$, and let $m$ be the regular complex
Borel measure such that $dm = \lambda$. Show that $\lambda$ extends uniquely to a
functional on $\mathcal{L}^1(|m|)$ by continuity, and that this follows at once from
the remarks preceding Theorem 4.2.
(b) Let $\{h_i\}$ ($i = 1, 2, \ldots$) be a countable partition of unity on $X$. Let
$f \in \mathcal{L}^1(|m|)$. Show that
$$\sum_i \lambda(h_i f) = \lambda(f).$$
[Note: This obvious extension of the text, and of Theorem 5.4 in particular,
is useful when dealing with manifolds. Cf. for instance Chapter XXIII, §3, §5,
and §6.]
10. Verify in detail the "obvious" fact in the proof of Theorem 5.1 that $\lambda$ is a
functional, in particular that for each compact set $K$ there is a number $A_K$
such that for any $f \in C_c(X)$ with support in $K$ we have $|\lambda f| \leq A_K \|f\|$.
11. Let $\mu$ be a regular positive measure on $\mathbf{R}$. (a) Show that the functions of type
$e^{-x} g(x)$ (where $g$ is a polynomial) are dense in $\mathcal{L}^1(\mathbf{R}^+, \mu)$. (b) Show that the
functions of type $e^{-x^2} g(x)$ (where $g$ is a polynomial) are dense in $\mathcal{L}^1(\mathbf{R}, \mu)$.
(c) Same thing for $\mathcal{L}^p$ with $1 < p < \infty$. [Hint: Cf. Exercises 19 and 20 of
Chapter III.]
12. Let $\mu$ be a regular positive Borel measure on $\mathbf{R}$. If $f \in \mathcal{L}^1(\mu)$ and
$$\int_{\mathbf{R}} f(x) e^{itx} \, d\mu(x) = 0$$
for all real $t$, show that $f(x) = 0$ for $\mu$-almost all $x$. [Hint: By a Fourier
series argument, show that
$$\int_{\mathbf{R}} f(x) g(x) \, d\mu(x) = 0$$
if $g$ is $C^\infty$ of period $2N$ with large $N$, and then, also if $g \in C_c(\mathbf{R})$.]
13. Let $\mu$ be a regular positive Borel measure on $\mathbf{R}$.
(a) Assume that there exists $c > 0$ such that the function $x \mapsto e^{c|x|}$ is in $\mathcal{L}^1(\mu)$.
Let f E £,P(J-l), 1 0 such that
for all n ~ 0, then f(x) = 0 for J-l-almost all x . Note : Actually, (b) implies (a).
[Hint for (a): Show that the integral in Exercise 12 is analytic in $t$ for $t$ at a
distance $\leq c/q$ from the real line, and 0 near the origin. You can also use
the exercises at the end of Chapter III.]
Examples. Take $d\mu(x) = e^{-x^2} dx$. We get the completeness of the
Hermite polynomials. For the Laguerre polynomials, one takes $d\mu(x) = h(x) \, dx$,
where $h(x) = 0$ if $x < 0$ and $h(x) = e^{-x}$ if $x \geq 0$. And similarly for the other
classical polynomials, which are obtained by applying the orthogonalization
process to $\{x^n\}$.
14. Let $X$ be compact and let $\mu$ be a regular measure on $X$. Let $A$ be a
subset of $X$ whose boundary has measure 0. Let $\{x_n\}$ be a sequence of points
of $X$ having the following property. For every continuous function $f$ on $X$
we have
$$(*) \qquad \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x_i) = \int_X f \, d\mu.$$
Let $N(A, n)$ be the number of indices $i \leq n$ such that $x_i \in A$. Prove that
$$\lim_{n \to \infty} \frac{N(A, n)}{n} = \mu(A).$$
This is the equidistribution of the sequence $\{x_n\}$ in $X$. In some applications,
instead of using condition (*) for all continuous functions $f$ one uses it only
for a vector space of more special functions, dense in $C(X)$ (e.g. the space
generated by characters on a compact group).
15. (Karamata's theorem.) Let $\mu$ be a regular positive Borel measure on $\mathbf{R}^+$ such
that the integral
converges for t > 0, and such that
for some positive constants rand C. If f is a continuous function on [0, 1],

then
[Hint: By Weierstrass' approximation, it suffices to prove the theorem
when $f$ is a polynomial, and hence when $f(x) = x^k$. This is done by direct
computation.]
For an application of Karamata's theorem, see [BGV, p. 95].
16. Let $\mu$ be a finite positive regular Borel measure on $\mathbf{R}$. Assume that
for all $n \in \mathbf{Z}$.
Prove that $\mu = 0$.
CHAPTER X
Riemann-Stieltjes Integral and Measure
This chapter gives an example of a measure which arises from a
functional, defined essentially by generalizations of Riemann sums. We get
here into special aspects of the real line, as distinguished from the general
theory of integration on general spaces.
X, §1. FUNCTIONS OF BOUNDED VARIATION AND THE STIELTJES INTEGRAL
Let us start with a finite interval $[a, b]$ on the real line. To each
partition
$$P = [a = x_0, x_1, \ldots, x_n = b]$$
we associate its size,
$$\sigma(P) = \text{size } P = \max_k (x_{k+1} - x_k).$$
Let
$$f : [a, b] \to E$$
be a mapping of the interval into a Banach space. Let $P$ be a partition
of $[a, b]$ as above. We define the variation $V_P(f)$ to be
$$V_P(f) = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|.$$
We define the variation
$$V(f) = \sup_P V_P(f),$$
where the sup (least upper bound if it exists, otherwise $\infty$) is taken over
all partitions. If $V(f)$ is finite, then $f$ is called of bounded variation, and
$f$ is bounded.
Examples. If $f$ is real valued, increasing, and bounded on $[a, b]$, then
$f$ is obviously of bounded variation, in fact bounded by $f(b) - f(a)$.
If $f$ is differentiable on $[a, b]$ and $f'$ is bounded, then $f$ is of bounded
variation (mean value theorem). This is so in particular if $f$ is of class
$C^1$.
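The sums $V_P(f)$ are straightforward to compute. The sketch below is only an illustration: the helper names `variation_over_partition` and `uniform_partition` are hypothetical, and the sample function $\sin$ on $[0, 2\pi]$ is an assumed example whose total variation is 4.

```python
import math


def variation_over_partition(f, points):
    """V_P(f) = sum over k of |f(x_{k+1}) - f(x_k)| for the partition `points`."""
    return sum(abs(f(points[k + 1]) - f(points[k])) for k in range(len(points) - 1))


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# A coarse partition misses the extrema of sin and gives a smaller sum.
coarse = variation_over_partition(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
# A partition containing pi/2 and 3*pi/2 already attains the total variation.
fine = variation_over_partition(math.sin, uniform_partition(0.0, 2 * math.pi, 4))


def cube(x):
    return x ** 3


# For an increasing function every partition gives f(b) - f(a).
increasing = variation_over_partition(cube, uniform_partition(0.0, 2.0, 7))
```

Because `abs` also works on complex numbers, the same function applies to complex valued $f$; for a general Banach space one would replace `abs` by the norm.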
The mappings of bounded variation form a vector space. In fact, if $f$
and $g$ are of bounded variation, then for $\alpha \in \mathbf{C}$,
$$V(f + g) \leq V(f) + V(g) \quad \text{and} \quad V(\alpha f) = |\alpha| V(f).$$
In other words, denoting by $BV([a, b])$ the space of mappings of bounded
variation, we see that the variation $V$ is a seminorm on $BV([a, b])$.
Let $f$, $g$ have values in Banach spaces $E$, $F$ respectively, and suppose
given a product (bilinear map) $E \times F \to G$ into a Banach space, denoted
$(u, v) \mapsto uv$, and satisfying $|uv| \leq |u||v|$. If $f$, $g$ are of bounded variation,
so is $fg$, as one verifies easily. A special case of a product occurs of
course when $E = \mathbf{C}$ itself, and multiplication is just multiplication by
scalars. One then obtains a bound for the variation of a product, namely
$$V(fg) \leq \|f\| V(g) + \|g\| V(f).$$
This estimate is an immediate consequence of estimating the sums for the
variation and using the triangle inequality.
The notation for the variation really should include the interval, and
we should write
$$V(f, a, b).$$
Define
$$V_f(x) = V(f, a, x),$$
so $V_f$ is now a function of $x$, called the variation function of $f$.
Proposition 1.1. Let $f \in BV([a, b])$.
(i) The function $V_f$ is increasing.
(ii) If $a \leq x \leq y \leq b$, then
$$V(f, a, y) = V(f, a, x) + V(f, x, y).$$
(iii) If $f$ is continuous, then $V_f$ is continuous.
Proof. For (i), we note that if $x < y$, then we can always refine a
partition of $[a, y]$ to include the number $x$. Furthermore, if $P'$ is a
partition refining $P$, then
$$V_P(f) \leq V_{P'}(f).$$
Then (i) follows at once. For (ii), we again use the fact that a partition of
$[a, y]$ can be refined to contain $x$. Finally, suppose that $f$ is continuous.
By (ii), the continuity from the right of $V_f$ amounts to proving that
$$\lim_{y \to x} V(f, x, y) = 0.$$
Suppose that the limit is not 0. Then there exists a number $\delta > 0$ such
that
$$V(f, x, t) > \delta \quad \text{for all } x < t \leq y.$$
Let $x = x_0 < x_1 < \cdots < x_n = y$ be a partition of $[x, y]$ such that
$$\sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| > \delta.$$
By the continuity of $f$ at $x$, we can select $y_1$ very close to $x$ and in
particular $x < y_1 < x_1$ such that the inequality remains valid if we
replace the term
$$|f(x_1) - f(x)| \quad \text{by} \quad |f(x_1) - f(y_1)|.$$
Thus we have proved:
There exists $y_1$ with $x < y_1 < y$ such that
$$V(f, y_1, y) > \delta.$$
Now we repeat this procedure with $y$ replaced by $y_1$, and find $y_2$ with
$x < y_2 < y_1$ such that
$$V(f, y_2, y_1) > \delta.$$
After $N$ steps, we get
$$V(f, y_N, y) > N\delta.$$
Since $V(f, y_N, y) \leq V(f, x, y)$, this gives a contradiction, concluding the
proof.
Theorem 1.2. Let $f$ be a real valued function on $[a, b]$ of bounded
variation. Then there exist increasing functions $g, h$ on $[a, b]$ such that
$g(a) = h(a) = 0$, and
\[ f(x) - f(a) = g(x) - h(x), \]
\[ V_f(x) = g(x) + h(x). \]
If $f$ is continuous, so are $g$ and $h$.
Proof. Define $g, h$ by the formulas
\[ 2g = V_f + f - f(a) \quad \text{and} \quad 2h = V_f - f + f(a). \]
If $f$ is continuous, so are $g$ and $h$ by Proposition 1.1(iii). In any case,
$g(a) = h(a) = 0$ and the two formulas of the theorem are valid. There
remains only to prove that $g, h$ are increasing. Let $a \le x \le y \le b$. Then
by the additivity of Proposition 1.1(ii),
\[ 2g(y) - 2g(x) = V(f, x, y) + f(y) - f(x) \ge 0, \]
so $g$ is increasing, and similarly $h$ is increasing, thus concluding the proof.
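The construction in the proof can be tried numerically. The following Python sketch is an illustration only and not part of the original text. It assumes that the variation function may be replaced by the variation along a fixed grid (which in general only bounds $V_f$ from below), and the names `jordan_decomposition` and `samples` are hypothetical.

```python
import math

def jordan_decomposition(samples):
    """Given samples f(x_0), ..., f(x_n) on an increasing grid, return (v, g, h).

    v[k] is the variation of f along the grid up to x_k (a stand-in for V_f),
    and g, h follow the formulas 2g = V_f + f - f(a), 2h = V_f - f + f(a).
    """
    v = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        v.append(v[-1] + abs(cur - prev))
    f_a = samples[0]
    g = [(vk + fk - f_a) / 2 for vk, fk in zip(v, samples)]
    h = [(vk - fk + f_a) / 2 for vk, fk in zip(v, samples)]
    return v, g, h

# Example: f(x) = sin(2*pi*x) sampled on a uniform grid of [0, 1].
samples = [math.sin(2 * math.pi * k / 400) for k in range(401)]
v, g, h = jordan_decomposition(samples)
```

On this grid both `g` and `h` are increasing up to rounding, and `g[k] - h[k]` reproduces `samples[k] - samples[0]`, in line with the two formulas of the theorem.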
We now generalize the notion of Riemann integral to that of Stieltjes.

Let $f, g$ be bounded maps of $[a, b]$ into Banach spaces $E, F$ respectively,
and assume given a product $E \times F \to G$ denoted by $(u, v) \mapsto uv$ such that
$|uv| \le |u||v|$. Given a partition $P = \{a = x_0, x_1, \ldots, x_n = b\}$ and numbers
$c_k$ with $x_k \le c_k \le x_{k+1}$, we define the Riemann-Stieltjes sum (abbreviated
RS-sum) as before, namely
\[ S(P, c, f, g) = S(P, c) = \sum_{k=0}^{n-1} f(c_k)[g(x_{k+1}) - g(x_k)]. \]
Denote by $\sigma(P)$ the size of the partition $P$. We say that the limit
\[ \lim_{\sigma(P) \to 0} S(P, c) \]
exists if there exists $L \in G$ such that given $\epsilon$ there exists $\delta$ such that
whenever $\sigma(P) < \delta$ then $|S(P, c) - L| < \epsilon$. If
\[ \lim_{\sigma(P) \to 0} S(P, c, f, g) \]
exists, we say that $f$ is RS($g$)-integrable, and we denote the limit by the
integral called the Riemann-Stieltjes integral
\[ \int_a^b f\,dg. \]
When $g(x) = x$, then the integral is just the Riemann integral, and we call
the function Riemann integrable.
It is immediate that the set of RS($g$)-integrable maps is a vector space,
and that the RS-integral is a functional on this space. The space itself
will also be denoted by RS($g$), or RS($g$, $[a, b]$) if we want to specify
$[a, b]$.
Let $a \le b \le c$. It is immediately verified that if $f$ is in RS($g$, $[a, c]$),
then $f$ is in RS($g$, $[a, b]$) and RS($g$, $[b, c]$) and we have the usual
property
\[ \int_a^c f\,dg = \int_a^b f\,dg + \int_b^c f\,dg. \]
(Warning: The converse may not hold!)

Observe also that the RS-integral is linear in $g$. In other words if
$f \in RS(g_1)$ and $f \in RS(g_2)$, then $f \in RS(g_1 + g_2)$; if $\alpha \in \mathbf{C}$ then $f \in RS(\alpha g)$;
and we have linearity of the integral keeping $f$ fixed, viewing the integral
as a function of $g$, that is
\[ \int_a^b f\,d(\alpha g) = \alpha \int_a^b f\,dg. \]
These are immediate, using the triangle inequality.

We have the usual estimate with the sup norm.
Proposition 1.3. Assume $f \in RS(g)$, and $g$ of bounded variation. Then
\[ \left| \int_a^b f\,dg \right| \le \|f\|\, V(g), \]
where $\|f\|$ is the sup norm of $f$ on $[a, b]$.
Proof. Immediate by estimating an approximating RS-sum and using
the triangle inequality.
Proposition 1.4. We have $f \in RS(g)$ if and only if $g \in RS(f)$, and in that
case, the formula for integration by parts holds, namely
\[ \int_a^b f\,dg + \int_a^b g\,df = f(b)g(b) - f(a)g(a). \]
Proof. Suppose that $\int_a^b g\,df$ exists. Consider a RS-sum which we
unwind using summation by parts:
\begin{align*}
S(P, c, f, g) &= \sum_{k=0}^{n-1} f(c_k)[g(x_{k+1}) - g(x_k)] \\
&= -\sum_{k=1}^{n-1} [f(c_k) - f(c_{k-1})]g(x_k) + f(c_{n-1})g(b) - f(c_0)g(a) \\
&= f(b)g(b) - f(a)g(a) - S(Q, g, f)
\end{align*}
where
\[ S(Q, g, f) = g(a)[f(c_0) - f(a)] + \sum_{k=1}^{n-1} g(x_k)[f(c_k) - f(c_{k-1})] + g(b)[f(b) - f(c_{n-1})]. \]
Here we have written the product in reverse direction.

Then $S(Q, g, f)$ is a RS-sum with respect to the partition
\[ Q: [a \le c_0 \le c_1 \le \cdots \le c_{n-1} \le b] \]
with the intermediate points $a, x_1, \ldots, x_{n-1}, b$. When the size of $P$
approaches 0, so does the size of $Q$, and by hypothesis, the sum $S(Q, g, f)$
approaches the integral $\int_a^b g\,df$, thereby completing the proof of the
proposition.
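The summation by parts used in this proof is an exact algebraic identity between finite sums, so it can be checked directly on a computer. The Python sketch below is an illustration under stated assumptions and is not part of the original text: the helper names `rs_sum` and `parts_gap` are hypothetical, and the test functions (sine and exponential on $[0, 1]$) are chosen only for convenience.

```python
import math

def rs_sum(f, g, points, tags):
    """RS-sum  sum_k f(tags[k]) * (g(points[k+1]) - g(points[k]))."""
    return sum(f(c) * (g(x1) - g(x0))
               for c, x0, x1 in zip(tags, points, points[1:]))

def parts_gap(f, g, xs, cs):
    """Return S(P, c, f, g) - [f(b)g(b) - f(a)g(a) - S(Q, g, f)]."""
    a, b = xs[0], xs[-1]
    q_points = [a] + list(cs) + [b]        # partition Q: a, c_0, ..., c_{n-1}, b
    q_tags = [a] + list(xs[1:-1]) + [b]    # intermediate points a, x_1, ..., x_{n-1}, b
    lhs = rs_sum(f, g, xs, cs)
    rhs = f(b) * g(b) - f(a) * g(a) - rs_sum(g, f, q_points, q_tags)
    return lhs - rhs

n = 1000
xs = [k / n for k in range(n + 1)]            # uniform partition of [0, 1]
cs = [(k + 0.5) / n for k in range(n)]        # midpoints as intermediate points
gap = parts_gap(math.sin, math.exp, xs, cs)   # zero up to rounding
approx = rs_sum(math.sin, math.exp, xs, cs)   # approximates the RS-integral of sin with respect to exp
```

Here `gap` vanishes up to floating point rounding for any choice of partition and intermediate points, while `approx` is only an approximation whose quality depends on the size of the partition.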
Finally we give a criterion for RS-integrability.
Proposition 1.5. Assume that $f$ is continuous and $g$ of bounded variation
on $[a, b]$. Then $f \in RS(g)$.
Proof. Given $\epsilon$ let $\delta$ be such that if $|x - y| < \delta$ then $|f(x) - f(y)| < \epsilon$.
Let $P, P'$ be partitions of size $< \delta$. To estimate $|S(P, c) - S(P', c')|$, we
may assume without loss of generality that $P'$ is a refinement of $P$. Thus
it suffices to prove two estimates: if $P' = P$ but we change the choice of
intermediate points $c$ to $c'$, then the difference of the sums is small; and
if $P'$ is obtained from $P$ by inserting one more point in the partition,
then again the difference of the sums is small. As to the first step, letting
$P = P'$, we have
\[ |S(P, c) - S(P, c')| \le \sum_i |f(c_i) - f(c_i')|\,|g(x_{i+1}) - g(x_i)| \le \epsilon V(g), \]
which gives us the desired estimate. Secondly, suppose that $P'$ is
obtained from $P$ by inserting one point, say $x_j'$ with $x_j \le x_j' \le x_{j+1}$. Then
the size of $P'$ is still $< \delta$. By the first step, to get the desired estimate for
$|S(P, c) - S(P', c')|$ we may assume without loss of generality that for
$i \ne j$ we have $c_i' = c_i$, that $c_j = x_j'$, and $x_j'$ is also selected as the
intermediate point for the two intervals $[x_j, x_j']$ and $[x_j', x_{j+1}]$ of the partition
$P'$. Then
\[ S(P, c) - S(P', c') = 0, \]
and the proposition is proved.
Often the RS-integral can be computed as a Riemann integral, as in

the next proposition.
Proposition 1.6. Let $f$ be continuous and suppose that $g$ is real
differentiable on $[a, b]$ with Riemann-integrable $g'$. Then $f \in RS(g)$, and
\[ \int_a^b f\,dg = \int_a^b f(x)g'(x)\,dx, \]
where the integral on the left is the RS-integral, and the integral on the
right is the usual Riemann integral.
Proof. This is immediate by using the mean value theorem on $g$ and
estimating the RS-sum by the triangle inequality and the hypothesis that
$g'$ is Riemann integrable.
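As a numerical illustration of this proposition (a sketch only, not part of the original text), the following Python code compares an RS-sum for $\int f\,dg$ with a Riemann sum for $\int f g'\,dx$. The choices $f = \cos$, $g(x) = x^2$ on $[0, 1]$, and the helper names `rs_sum` and `riemann_sum`, are assumptions made for the example.

```python
import math

def rs_sum(f, g, points, tags):
    """RS-sum  sum_k f(tags[k]) * (g(points[k+1]) - g(points[k]))."""
    return sum(f(c) * (g(x1) - g(x0))
               for c, x0, x1 in zip(tags, points, points[1:]))

def riemann_sum(func, points, tags):
    """Riemann sum  sum_k func(tags[k]) * (points[k+1] - points[k])."""
    return sum(func(c) * (x1 - x0)
               for c, x0, x1 in zip(tags, points, points[1:]))

f = math.cos
g = lambda x: x * x            # differentiable, with g'(x) = 2x
g_prime = lambda x: 2.0 * x

n = 2000
points = [k / n for k in range(n + 1)]   # uniform partition of [0, 1]
tags = points[:-1]                       # left end points as intermediate points

stieltjes = rs_sum(f, g, points, tags)
riemann = riemann_sum(lambda x: f(x) * g_prime(x), points, tags)
# Closed form of the Riemann integral of 2x cos(x) over [0, 1].
exact = 2.0 * (math.sin(1.0) + math.cos(1.0) - 1.0)
```

Both sums approach the same value as the partition is refined; with left end points the error is only of first order in the size of the partition.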
Finally we consider the special case when $f, g$ are complex valued or
even real valued. Suppose $g$ is of bounded variation. Then on $C([a, b])$
we obtain a bounded functional
\[ f \mapsto \int_a^b f\,dg. \]
By Proposition 1.1, and the linearity of the integral in $f$ and $g$, we may
decompose the integral into a sum of four terms, each term involving
only real valued functions. Furthermore, if $f \in C_c(\mathbf{R})$, then the support of
$f$ lies in some bounded interval $[a, b]$ and we may define
\[ \int_{\mathbf{R}} f\,dg = \int_a^b f\,dg, \]
the right side being independent of the choice of $[a, b]$. Suppose that
there exists a number $B > 0$ such that
\[ V(f, a, b) \le B \quad \text{for all } [a, b], \]
so the variations are uniformly bounded on finite intervals. In this latter
case, the space of such functions is denoted by $BV(\mathbf{R})$, and called the
space of functions of bounded variation on $\mathbf{R}$. On this space, we define
\[ V_{\mathbf{R}}(f) = \sup_{[a,b]} V(f, a, b). \]
Then $V_{\mathbf{R}}(f)$ is a seminorm on $BV(\mathbf{R})$, and results of this section extend to
$BV(\mathbf{R})$. In particular, the estimate of Proposition 1.3 holds over $\mathbf{R}$, that
is
\[ \left| \int_{\mathbf{R}} f\,dg \right| \le \|f\|\, V_{\mathbf{R}}(g) \]
for $f \in C_c(\mathbf{R})$ and $g \in BV(\mathbf{R})$. By the general representation theorem, there
exists a unique complex valued measure $\mu_g$ such that
\[ \int_{\mathbf{R}} f\,dg = \int_{\mathbf{R}} f\,d\mu_g \]
for all $f \in C_c(\mathbf{R})$.
The variation $V_{\mathbf{R}}(g)$ simply provides an explicit concrete example of
the general notion of the norm $\|\mu_g\|$ of the measure. We call $\mu_g$ the
Riemann-Stieltjes measure associated with $g$.
If $h$ is a bounded increasing function, then $\mu_h$ is a finite positive
measure on $\mathbf{R}$. It is immediate from the proof of Theorem 1.2 that the
decomposition of a real valued function $g$ of bounded variation on a
finite closed interval is also valid on $\mathbf{R}$, by the same formula using the
total variation. Then the decomposition of Theorem 1.2 gives an
example when a real valued measure is expressed as a difference of two
positive measures.
In some applications of the next section, we shall mix conditions when
a function is in $L^1(\mathbf{R})$ and also conditions of bounded variation. The
following observation is sometimes useful.
Lemma 1.7. Suppose $f \in L^1(\mathbf{R}) \cap BV(\mathbf{R})$. Then $f$ vanishes at infinity, in
the sense that
\[ \lim_{x \to \infty} f(x) = 0 \quad \text{and} \quad \lim_{x \to -\infty} f(x) = 0. \]
Indeed, suppose $f(x) \ge c > 0$ for arbitrarily large $x$. Then $f(x) \ge c/2$ (say)
for all $x$ sufficiently large, since otherwise $f$ would oscillate by at least $c/2$
infinitely often and would not be in $BV(\mathbf{R})$; but then $f$ would not be in $L^1(\mathbf{R})$.
Under these circumstances, in a situation when integration by parts is
valid for finite intervals as in Proposition 1.4, the extra terms
\[ f(x)g(x)\Big|_{-\infty}^{\infty} = \lim_{a \to -\infty,\ b \to \infty} [f(b)g(b) - f(a)g(a)] \]
vanish.
Proposition 1.8.
(a) Let $f$ be of bounded variation. Then the set of points of
discontinuity of $f$ is countable.
(b) Assume $f$ continuous at $a$ and $b$ with $a \le b$. Then
\[ f(b) - f(a) = \int_a^b d\mu_f(x). \]
In particular, the set consisting of the single point $a$ has $\mu_f$-measure 0.

Proof. We leave part (a) as an easy exercise (see Exercise 4), done by
a routine estimate. As for part (b), we note that the constant function 1
is continuous on $[a, b]$, and all Riemann-Stieltjes sums give the same
value $f(b) - f(a)$, as desired.
I have given above enough results on functions of bounded variations

to indicate the flavor of the theory, and to suffice for some applications:
to the spectral theorem of Chapter XX, §5, and to Fourier analysis in the
next section. For more on functions of bounded variations, see Natanson
[Nat].
We end this section with a useful mean value theorem, which did not
fit anywhere else, and illustrates once more the technique of summation
by parts.
Proposition 1.9 (Integral Mean Value Theorem (Bonnet, 1849)). Let $f$
and $g$ be Riemann integrable on $[a, b]$. Assume that $f$ is positive
increasing. Then there exists $c \in [a, b]$ such that
\[ \int_a^b f(x)g(x)\,dx = f(b) \int_c^b g(x)\,dx. \]
Proof. Let $P = [a = x_0, x_1, \ldots, x_n = b]$ be a partition of $[a, b]$ such that
the Riemann sum
\[ S = \sum_{i=1}^{n} f(x_i)g(x_i)(x_i - x_{i-1}) \]
approximates $\int_a^b f(x)g(x)\,dx$.
Let $a_i = f(x_i)$ and $b_i = g(x_i)(x_i - x_{i-1})$. Summing by parts, and letting
\[ B_i = \sum_{k=i}^{n} g(x_k)(x_k - x_{k-1}), \]
we find (putting $B_{n+1} = 0$) that
\[ S = f(x_0)B_1 + \sum_{i=1}^{n} B_i\bigl(f(x_i) - f(x_{i-1})\bigr). \]
Since $f(x_n) = f(b)$, we find
\[ f(b) \min_k B_k \le S \le f(b) \max_k B_k. \]
But $B_k$ is a Riemann sum itself for $g$ over the interval $[x_{k-1}, x_n] = [x_{k-1}, b]$.
Varying $x_{k-1}$, we see that given $\epsilon$, we obtain
\[ \inf_{c} f(b) \int_c^b g(x)\,dx - 2\epsilon \le \int_a^b f(x)g(x)\,dx \le \sup_{c} f(b) \int_c^b g(x)\,dx + 2\epsilon, \]
whence the same inequality without the $\epsilon$. But
\[ x \mapsto \int_x^b g(t)\,dt \]
is continuous, so by the intermediate value theorem for continuous
functions, the proposition follows.
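The existence statement can be explored numerically. The Python sketch below is an illustration only and not part of the original text. It assumes the example $f(x) = 1 + x$ (positive increasing) and $g(x) = \cos 3x$ on $[0, 1]$, uses a midpoint rule as a stand-in for the Riemann integral, and locates a point $c$ by scanning for a sign change and bisecting. The names `integrate` and `bonnet_point` are hypothetical, and the scan could miss a point where the two sides merely touch.

```python
import math

def integrate(func, lo, hi, n=20000):
    """Composite midpoint rule, a simple stand-in for the Riemann integral."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

def bonnet_point(f, g, a, b, grid=400, iters=50):
    """Search for c in [a, b] with  int_a^b f g = f(b) * int_c^b g."""
    target = integrate(lambda x: f(x) * g(x), a, b)

    def gap(c):
        return f(b) * integrate(g, c, b, n=2000) - target

    pts = [a + (b - a) * k / grid for k in range(grid + 1)]
    vals = [gap(p) for p in pts]
    for lo, hi, vlo, vhi in zip(pts, pts[1:], vals, vals[1:]):
        if vlo == 0.0:
            return lo
        if vlo * vhi < 0:              # sign change: bisect on [lo, hi]
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                vmid = gap(mid)
                if vlo * vmid <= 0:
                    hi = mid
                else:
                    lo, vlo = mid, vmid
            return 0.5 * (lo + hi)
    return b if vals[-1] == 0.0 else None   # None: no sign change seen on this grid

f = lambda x: 1.0 + x
g = lambda x: math.cos(3.0 * x)
c = bonnet_point(f, g, 0.0, 1.0)
left = integrate(lambda x: f(x) * g(x), 0.0, 1.0)
right = f(1.0) * integrate(g, c, 1.0)
```

The point $c$ is in general not unique, and the search returns the first one it detects on the grid.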
X, §2. APPLICATIONS TO FOURIER ANALYSIS
In this section we give applications of integration to a more refined

Fourier analysis, especially using the results on functions of bounded
variation from §l. We start however with the simplest situation contain-
ing no delicate estimates. The next result is a routine version of the
Riemann- Lebesgue Lemma.
Proposition 2.1. Let $f \in C_c^\infty(\mathbf{R})$. Given a positive integer $k$, there exists
a constant $C = C(f, k)$ such that for $A \ge 1$ we have
\[ \left| \int_{-\infty}^{\infty} f(x)e^{iAx}\,dx \right| \le \frac{C}{A^k}. \]
Proof. We integrate by parts $k$ times, so that a power $A^k$ comes into
the denominator. The constant $C$ can be taken to be the $L^1$-norm of the $k$-th derivative
of $f$, which is bounded by its sup norm times the length of the support. Since $f$ is assumed to have compact support, the term $uv$ in the
integration by parts disappears since $f(x) = 0$ for $x$ large positive and
large negative. So the proposition is clear.
In particular, the proposition gives a quantitative estimate for the limit
\[ \lim_{A \to \infty} \int_{-\infty}^{\infty} f(x)e^{iAx}\,dx = 0. \]
We may consider next the variation when we must take an end point
into consideration.
We now introduce the condition of bounded variation. According
to Zygmund [Zy], Dirichlet was the first who proved Fourier series
convergence theorems under the conditions of normalization and of being

increasing, which amounts to bounded variation. See Zygmund's com-
ments, Chapter II, 2.6.
We recall that for a function $f \in L^1(\mathbf{R})$, the Fourier transform $f^\wedge$ is
defined by
\[ f^\wedge(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)e^{-ixy}\,dx. \]
The function $f^\wedge$ is obviously bounded, by $\|f\|_1$.

Proposition 2.2. Let $f \in BV(\mathbf{R}) \cap L^1(\mathbf{R})$. Then there is a constant $C =
C(f)$ such that for all $t \ne 0$ we have
\[ |f^\wedge(t)| \le \frac{C}{|t|}. \]
Proof. We integrate by parts, obtaining for the integral the value
\[ -f(x)\frac{e^{-itx}}{it}\Big|_{-\infty}^{\infty} + \frac{1}{it} \int_{-\infty}^{\infty} e^{-itx}\,df(x). \]
Since $f \in BV(\mathbf{R}) \cap L^1(\mathbf{R})$, we remarked in Lemma 1.7 that $f(x) \to 0$ as
$x \to \pm\infty$, so the first term vanishes. The integral of the second term is
finite since $f$ has bounded variation, so the estimate $\le C/|t|$ is evident.
Proposition 2.3. Let $f \in L^1(\mathbf{R})$. Then
\[ \lim_{A \to \infty} \int_{-\infty}^{\infty} f(x)e^{iAx}\,dx = 0. \]
Proof. Let $g \in C_c^\infty(\mathbf{R})$ be such that $\|f - g\|_1 < \epsilon$. Then the integral can
be written and estimated in the form
\[ \int_{-\infty}^{\infty} \bigl(f(x) - g(x)\bigr)e^{iAx}\,dx + \int_{-\infty}^{\infty} g(x)e^{iAx}\,dx. \]
The first integral is bounded in absolute value by $\epsilon$. The second integral
is $< \epsilon$ for large $A$ by Proposition 2.1. This proves the proposition.
Next we look at Fourier inversion under more delicate conditions
than those of the Schwartz space.
Let $A > 0$, and let $f \in L^1(\mathbf{R})$. We define
\[ f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} f^\wedge(t)e^{itx}\,dt. \]
We are interested in conditions under which we have $f^{\wedge\wedge} = f^-$, in the
sense that
\[ f^{\wedge\wedge}(-x) = \lim_{A \to \infty} f_A(x). \]
We begin by a lemma which gives another expression for $f_A$.
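Lemma 2.4. Let $f \in L^1(\mathbf{R})$ and let $A > 0$. Then
\[ f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} f^\wedge(t)e^{itx}\,dt = \frac{1}{\pi} \int_{-\infty}^{\infty} f(y)\frac{\sin A(x - y)}{x - y}\,dy. \]
If $f$ is also in $BV(\mathbf{R})$, then $f_A$ is bounded independently of $A$.
Proof. We have
\begin{align*}
\sqrt{2\pi} \int_{-A}^{A} f^\wedge(t)e^{itx}\,dt &= \int_{-A}^{A} \int_{-\infty}^{\infty} f(y)e^{-ity+itx}\,dy\,dt \\
&= \int_{-\infty}^{\infty} f(y) \int_{-A}^{A} e^{it(x-y)}\,dt\,dy \quad \text{[by Fubini's theorem]} \\
&= \int_{-\infty}^{\infty} 2f(y)\frac{\sin A(x - y)}{x - y}\,dy,
\end{align*}
which proves the formula. The boundedness of $f_A$ if $f \in BV(\mathbf{R})$ is left as
Exercise 1 (Stieltjes integration by parts).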
The next theorem gives conditions for $f^{\wedge\wedge} = f^-$. It involves an
application of the Bonnet mean value theorem. The theorem is in Titchmarsh
[Ti], who attributes the result to Prasad, Pringsheim, and Hobson.
Theorem 2.5. Let $f : \mathbf{R} \to \mathbf{R}$ be a function satisfying:

(a) $f$ is of bounded variation on every finite closed interval.
(b) $f$ is in $L^1(\mathbf{R})$.
(c) $f$ is normalized so that for all $x \in \mathbf{R}$,
\[ f(x) = \tfrac{1}{2}[f(x+) + f(x-)]. \]

Then
\[ f(x) = \lim_{A \to \infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(y)\frac{\sin A(x - y)}{x - y}\,dy = \lim_{A \to \infty} f_A(x). \]
Proof. After a translation we may assume that $x = 0$. Since $(\sin Ay)/y$
is an even function of $y$, we may assume that $f$ is even, and we are
reduced to proving
\[ \frac{\pi}{2} f(0) = \lim_{A \to \infty} \int_0^{\infty} f(y)\frac{\sin Ay}{y}\,dy, \]
under the additional assumption that $f$ is continuous at 0. We shall now
see that the result depends only on the local behavior of $f$ near 0. Pick
some $b > 0$. Then
\[ \lim_{A \to \infty} \int_b^{\infty} f(y)\frac{\sin Ay}{y}\,dy = 0 \]
by Proposition 2.3. Therefore we are reduced to consider the integral in
the finite interval $[0, b]$, where $f$ is bounded. By additivity, we may also
assume without loss of generality that $f$ is increasing, since a function of
bounded variation is the difference of two increasing functions.
We observe that the theorem is valid when $f$ is constant on $[0, b]$.
Indeed, in that case, we note from the change of variables $u = Ay$,
$du = A\,dy$, that
\[ \int_0^b \frac{\sin Ay}{y}\,dy = \int_0^{Ab} \frac{\sin u}{u}\,du \to \frac{\pi}{2} \quad \text{as } A \to \infty. \]
Since the integral is linear in $f$, we may assume without loss of generality
that $f(0) = 0$. Given $\epsilon$ there exists $\delta$ such that for $0 \le y \le \delta$ we have
\[ |f(y) - f(0)| = |f(y)| < \epsilon. \]
Then
\[ \int_0^b f(y)\frac{\sin Ay}{y}\,dy = \int_0^{\delta} f(y)\frac{\sin Ay}{y}\,dy + \int_{\delta}^{b} f(y)\frac{\sin Ay}{y}\,dy. \]
The second integral on the right approaches 0 as $A \to \infty$ by Proposition
2.3 because $1/y$ is bounded for $y \ge \delta$, so that $f(y)/y$ is integrable on $[\delta, b]$. So what matters is the
integral over $[0, \delta]$. By the integral mean value theorem, we now find
\[ \int_0^{\delta} f(y)\frac{\sin Ay}{y}\,dy = f(\delta) \int_c^{\delta} \frac{\sin Ay}{y}\,dy = f(\delta) \int_{Ac}^{A\delta} \frac{\sin u}{u}\,du. \]
Since $|f(\delta)| < \epsilon$, and since the integral over $[Ac, A\delta]$ is bounded, we have
concluded the proof of the theorem.
In the next theorem, we deal still with another situation, when we do
not assume an $L^1(\mathbf{R})$-condition, but deal with oscillatory integrals to
guarantee convergence in a quantitative form of the Riemann-Lebesgue
Lemma. The result concerns a situation intermediate between
Proposition 2.2 and Proposition 2.3, and in the precise given form is due to
Barner [Ba 90].
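Theorem 2.6. Assume:

(a) $f \in BV(\mathbf{R})$.
(b) $f(x) = O(|x|^{\epsilon})$ for some $\epsilon > 0$ and $x \to 0$.
Then the improper integral that follows exists for $A > 0$, and satisfies
the bound
\[ \int_0^{\infty} f(y)e^{iAy}\,\frac{dy}{y} = O\!\left(\frac{1}{A^{\epsilon/(1+\epsilon)}}\right) \]
for $A \to \infty$.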
Proof. We can split e iAy = cos(Ay) + i sin(Ay) and consider separately

the cosine and sine. The cosine is harder if anything because one has to
rely on assumption (b) on f to make the integrand integrable near O.
Hence we deal with the cosine for concreteness to fix ideas. We shall
first prove that the integral exists (the problem is at infinity), and then we
shall prove that it satisfies the stated bound as A --+ 00.
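Existence. There will be some integrations by parts, so we need the
two auxiliary functions defined respectively for $x > 0$ and all $x \in \mathbf{R}$:
\[ \operatorname{ci}(x) = -\int_x^{\infty} \frac{\cos t}{t}\,dt \quad \text{and} \quad \operatorname{si}(x) = -\int_x^{\infty} \frac{\sin t}{t}\,dt. \]
Lemma 2.7. We have the estimates for $x > 0$:
\[ |\operatorname{ci}(x)| \le \frac{2}{x} \quad \text{and} \quad |\operatorname{si}(x)| \le \frac{2}{x}. \]
Proof. This is immediate after integrating by parts.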
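The bound on the sine integral can be checked numerically. The Python sketch below is an illustration only and not part of the original text. It relies on the classical value $\int_0^\infty (\sin t)/t\,dt = \pi/2$ to write $-\int_x^\infty (\sin t)/t\,dt$ as $\int_0^x (\sin t)/t\,dt - \pi/2$, and it approximates the finite integral by Simpson's rule. The names `sinc`, `sine_integral`, and `si` are hypothetical helpers.

```python
import math

def sinc(t):
    """sin(t)/t, extended by its limit 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def sine_integral(x, steps_per_unit=200):
    """Composite Simpson approximation of int_0^x sin(t)/t dt for x > 0."""
    n = 2 * max(1, math.ceil(x * steps_per_unit / 2))   # even number of panels
    h = x / n
    total = sinc(0.0) + sinc(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * sinc(k * h)
    return total * h / 3

def si(x):
    """si(x) = -int_x^infinity sin(t)/t dt, computed as Si(x) - pi/2."""
    return sine_integral(x) - math.pi / 2

values = {x: si(x) for x in (1.0, 10.0, 100.0)}
```

For each sampled `x`, the absolute value of `values[x]` stays below `2 / x`, in agreement with the estimate for the sine term of the lemma.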
To prove the existence of the improper integral in the theorem, we
define
\[ g(y) = \frac{\cos Ay}{y} \quad \text{and} \quad G(y) = -\int_y^{\infty} \frac{\cos At}{t}\,dt = \operatorname{ci}(Ay) \]
for $y > 0$. Let $\delta > 0$. By Proposition 1.4 justifying integration by parts,
for $\delta < a < \infty$, we get:
\[ \int_{\delta}^{a} f(y)\,dG(y) = f(a)G(a) - f(\delta)G(\delta) - \int_{\delta}^{a} G(y)\,df(y). \]
We now estimate each one of the three terms on the right side.
Since $f$ is bounded by hypothesis, and $G(a) \to 0$ as $a \to \infty$, we see that
$f(a)G(a) \to 0$ as $a \to \infty$, so the first term approaches 0 as $a \to \infty$.
We claim that $f(\delta)G(\delta) \to 0$ as $\delta \to 0$. To see this, we split the
integral:
\begin{align*}
\operatorname{ci}(x) &= -\int_x^{1} \frac{\cos t}{t}\,dt - \int_1^{\infty} \frac{\cos t}{t}\,dt \\
&= \int_1^{x} \frac{\cos t - 1}{t}\,dt + \int_1^{x} \frac{dt}{t} + \operatorname{ci}(1) \\
&= \text{convergent integral}(x) + \log x + \text{constant} \quad \text{as } x \to 0.
\end{align*}
Let $x = Ay$ so $\log x = \log A + \log y$. Consider $f(x)\operatorname{ci}(Ax) = f(x)G(x)$. The
hypothesis that $f(x) = O(|x|^{\epsilon})$ immediately implies that $f(x)\operatorname{ci}(Ax) \to 0$ as
$x \to 0$, thus proving the claim that $f(\delta)G(\delta) \to 0$ as $\delta \to 0$.
Now to the third term. The limit
\[ \lim_{\delta \to 0} \int_{\delta}^{a} f(y)\,dG(y) \]
exists by the hypothesis on $f$, so the integral
\[ \int_0^{a} G(y)\,df(y) \]
exists. There remains only to prove that the tail end
\[ \int_a^b G(y)\,df(y) = \int_a^b \operatorname{ci}(Ay)\,df(y) \]
approaches 0 as $a \to \infty$ and $a < b$. But the routine estimate of
Proposition 1.3 gives for the absolute value of the integral a bound $V(f) \cdot 2/(Aa)$,
using also Lemma 2.7. Hence
\[ \lim_{a \to \infty} \int_0^{a} G(y)\,df(y) \]
exists.
This concludes the proof of existence for the improper integral of
Theorem 2.6.
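Estimate of the integral. We shall now prove the stated estimate. We
let $\lambda = 1/(1 + \epsilon)$. We decompose the integral:
\begin{align*}
\int_0^{\infty} f(y)\frac{\cos Ay}{y}\,dy &= \int_0^{1/A^{\lambda}} + \int_{1/A^{\lambda}}^{\infty} f(y)\frac{\cos Ay}{y}\,dy \\
&= \int_0^{1/A^{\lambda}} + \bigl[f(y)\operatorname{ci}(Ay)\bigr]_{1/A^{\lambda}}^{\infty} - \int_{1/A^{\lambda}}^{\infty} \operatorname{ci}(Ay)\,df(y) \\
&= \int_0^{1/A^{\lambda}} f(y)\frac{\cos(Ay)}{y}\,dy - f(1/A^{\lambda})\operatorname{ci}(A^{1-\lambda}) - \int_{1/A^{\lambda}}^{\infty} \operatorname{ci}(Ay)\,df(y).
\end{align*}
We shall estimate each one of the three terms on the right side of the
equality.
For the first term, the integral is taken near zero where
\[ |f(y)/y| = O(y^{\epsilon - 1}) \]
and $\cos Ay$ is bounded. Hence the first integral in absolute value is
\[ \ll \frac{1}{\epsilon}\,\frac{1}{A^{\epsilon\lambda}} \quad \text{for } A \to \infty, \]
which is of the desired type.

For the second expression, using Lemma 2.7, we have the bound
\[ f(1/A^{\lambda})\operatorname{ci}(A^{1-\lambda}) \ll \frac{1}{A^{\lambda\epsilon}}\,\frac{2}{A^{1-\lambda}} = \frac{2}{A^{2\epsilon/(1+\epsilon)}}, \]
which is also of the desired type.

For the third expression, we have the usual sup bound for the integral,
\[ \left| \int_{1/A^{\lambda}}^{b} \operatorname{ci}(Ay)\,df(y) \right| \le \frac{2}{A^{1-\lambda}}\,V_{\mathbf{R}}(f) \]
for every $b$, and hence for the integral to $\infty$. This concludes the proof of
the theorem.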
Remark. The previous theorem is similar to many of the same kind,
giving various conditions under which the Fourier transform tends to 0,
which is a form of the Riemann-Lebesgue Lemma. The method of proof
also follows a standard pattern, whereby one splits the integral over
various intervals, near 0 and near $\infty$, with a cut-off point chosen
judiciously. In determining the cut-off point, one may think of $f(x)$ as
actually being equal to $x^{\epsilon}$, and one first determines the cut-off as $1/A^{\lambda}$, or a
small perturbation thereof. I don't know of a single theorem which covers
all examples of that kind. Barner's conditions have proved to be useful in
the applications he had in mind, and were motivated by those applications.
For a Parseval formula in a context similar to that of the present
section, see [JoL].
X, §3. EXERCISES
1. Let $f \in BV(\mathbf{R})$ and also $f \in L^1(\mathbf{R})$. Define
\[ S(x) = \int_0^x \frac{\sin t}{t}\,dt. \]
Show that for $A > 0$:
\[ \left| \int_{-\infty}^{\infty} f(y)\frac{\sin A(x - y)}{x - y}\,dy \right| \le 2\|S\|\,V_{\mathbf{R}}(f), \]
or better,
2. Let $\{f_n\}$ be a sequence of continuous functions on $[a, b]$, converging uniformly
to $f$. Let $g$ be of bounded variation on $[a, b]$. Prove that
\[ \lim_{n \to \infty} \int_a^b f_n\,dg = \int_a^b f\,dg. \]
3. Let $f$ be continuous on $[a, b]$ and let $\{g_n\}$ be a sequence of functions of
bounded variation, such that the variations are uniformly bounded, that is
there exists a constant $B$ such that
\[ V(g_n, [a, b]) \le B. \]
Assume that $g_n(x)$ converges to $g(x)$ for some bounded function $g$ and all
$x \in [a, b]$. Show that
\[ \lim_{n \to \infty} \int_a^b f\,dg_n = \int_a^b f\,dg. \]
4. Let $f$ be of bounded variation on $[a, b]$. For each $x \in [a, b]$ define the jump
\[ J_f(x) = \lim_{r \to 0} \sup |f(y) - f(x)| \]
where the sup is taken for $y \in [x - r, x + r]$. Prove:
(a) Given $\delta$, there is only a finite number of points $x \in [a, b]$ such that
$J_f(x) \ge \delta$.
(b) The set of points where $f$ is not continuous is countable. [Show that $f$ is
continuous at $x$ if and only if $J_f(x) = 0$.]
5. Let $f$ be a real valued function on $[a, b]$. If the set of discontinuities of $f$ has
measure 0, show that $f$ is Riemann-integrable. The converse is also true, and
you can prove it as an exercise if you are sufficiently interested.
CHAPTER XI
Distributions
In Chapter IX, we saw how certain functionals on $C_c(X)$ gave rise to a
measure. Here we consider the case when $X = \mathbf{R}^n$ and the functionals
satisfy additional continuity conditions with respect to differentiation.
XI, §1. DEFINITION AND EXAMPLES
We let $D_i = \partial/\partial x_i$ be the $i$-th partial derivative applied to functions on $\mathbf{R}^n$.

For an $n$-tuple $(p_1, \ldots, p_n) = p$ of integers $\ge 0$, we let
\[ D^p = D_1^{p_1} \cdots D_n^{p_n} \]
and $|p| = p_1 + \cdots + p_n$. Then each differential operator $D^p$ operates on
functions in $C^\infty(\mathbf{R}^n)$. Actually, we shall deal with the subspace of
functions $C_c^\infty(\mathbf{R}^n)$ which have compact support. If $f$ is a function, we let $M_f$
be the operator which consists in multiplying by $f$, so that we have
$M_f(g) = fg$, for any function $g$. We have the formula
\[ D_i \circ M_f = M_f \circ D_i + M_{D_i f} \]
for any $f \in C^\infty(\mathbf{R}^n)$. We shall consider general differential operators
\[ D = \sum_{|p| \le m} \alpha_p D^p \]
with coefficients $\alpha_p \in C^\infty(\mathbf{R}^n)$. Because of the preceding formula, we see
that such differential operators, viewed as linear maps on $C_c^\infty(\mathbf{R}^n)$, form
an algebra under composition. The differential operator being written as
above, we say that it has order $\le m$. It is easy to verify that its
expression as above determines the coefficients $\alpha_p$ uniquely. Indeed, suppose
that $D = 0$. To prove that $\alpha_p = 0$ it suffices to prove that for any $a \in \mathbf{R}^n$
we have $\alpha_p(a) = 0$. For a given $p$ we consider a function given locally
near $a$ by
\[ f(x) = (x_1 - a_1)^{p_1} \cdots (x_n - a_n)^{p_n}. \]
Then
\[ D^q f(a) = \begin{cases} 0 & \text{if } p \ne q, \\ p! & \text{if } p = q, \end{cases} \]
where $p! = p_1! \cdots p_n!$. Hence $(Df)(a) = p!\,\alpha_p(a) = 0$, whence
\[ \alpha_p(a) = 0. \]
We define seminorms on $C_c^\infty(\mathbf{R}^n)$ as follows. For each differential
operator $D$ and $f \in C_c^\infty(\mathbf{R}^n)$ we let
\[ \pi_D(f) = \|Df\|, \]
where $\|\ \|$ is the sup norm, and for each integer $m \ge 0$ we let
\[ \pi_m(f) = \sup_{|p| \le m} \|D^p f\|. \]
It is clear that $\pi_D$ and $\pi_m$ are seminorms on $C_c^\infty(\mathbf{R}^n)$.

For any subset $K$ of $\mathbf{R}^n$ we denote by $C_c^\infty(K)$ the space of those
functions in $C_c^\infty(\mathbf{R}^n)$ whose support lies in $K$. We define a distribution on
an open set $U$ of $\mathbf{R}^n$ to be a linear map
\[ T: C_c^\infty(U) \to \mathbf{C} \]
such that, for every compact set $K$ contained in $U$, there exists a
constant $A_K$ and an integer $m$ for which
\[ |T\varphi| \le A_K \pi_m(\varphi), \quad \text{all } \varphi \in C_c^\infty(K). \]
Just as it is useful to have a criterion for continuity of a linear map in

terms of sequences when dealing with normed vector spaces, we have a
similar criterion under the present circumstances, namely:
Theorem 1.1. A linear map $T: C_c^\infty(U) \to \mathbf{C}$ is a distribution if and only
if it satisfies the following property:
Let $\{\varphi_j\}$ be a sequence in $C_c^\infty(U)$, such that all $\varphi_j$ have support in a
compact set $K$, and such that for every $p$, $\{D^p\varphi_j\}$ converges to 0
uniformly on $K$. Then $T\varphi_j \to 0$.
Proof. It is clear that if $T$ is a distribution, then it satisfies the stated
property. Conversely, assume that it satisfies this property, and let $K$ be
a compact subset of $U$. For each integer $m \ge 0$ let
\[ a_m = \sup |Tf|, \]
the sup being taken for those $f \in C_c^\infty(K)$ such that $\pi_m(f) \le 1$. It will
suffice to show that for some $m$, we have $a_m \ne \infty$. Suppose that $a_m = \infty$
for all $m$. Choose $f_m \in C_c^\infty(K)$ such that $\pi_m(f_m) \le 1$, but $|Tf_m| \ge m$. Let
$g_m = f_m/m$. Then
\[ |Tg_m| \ge 1, \]
and if $k \le m$, then
\[ \pi_k(g_m) \le \pi_m(g_m), \]
so
\[ \pi_k(g_m) \le 1/m. \]
But $1/m$ tends to 0 as $m \to \infty$, and $g_m$ has support in $K$. Thus $D^p g_m$ tends to
0 uniformly on $K$ for every $p$, and $Tg_m \to 0$ by hypothesis, a contradiction which
proves the theorem.
We say that a distribution $T$ on an open set $U$ is of order $\le m$ if for
each compact set $K \subset U$ there exists a number $A_K$ such that
\[ |T\varphi| \le A_K \pi_m(\varphi) \]
for all $\varphi \in C_c^\infty(K)$. A distribution of order 0 is therefore a $C_c$-functional
as in Chapter IX, §1 and the Remark at the end of §4.
We shall now give examples of distributions.
Functions. Let $f$ be a locally integrable function on an open set $U$ of
$\mathbf{R}^n$. (We recall that this means: $f$ is $\mu$-measurable for Lebesgue measure
$\mu$, and $f$ is integrable on every compact subset $K$ of $U$.) We associate
with $f$ the map $T_f$ whose value on $\varphi \in C_c^\infty(U)$ is given by
\[ T_f(\varphi) = \int_U f(x)\varphi(x)\,dx = \int_U f\varphi\,d\mu. \]
Then it is clear that $T_f$ is a distribution of order 0 on each compact
subset $K$ of $U$. In fact, if we use the obvious notation
\[ \|f\|_{1,K} = \int_K |f|\,d\mu, \]
then
\[ |T_f(\varphi)| \le \|f\|_{1,K}\,\|\varphi\| \quad \text{if } \varphi \in C_c^\infty(K). \]
Furthermore, the map $f \mapsto T_f$ induces an injective linear map of $L^1(U)$
into the space of distributions on $U$, because we know from Corollary
9.5 of Chapter VI that if $T_f = T_g$ for two locally integrable functions $f, g$,
then $f$ is equal to $g$ almost everywhere. Thus from now on, we can
interpret locally integrable functions as distributions.
Measures. Similarly, let $\mu$ be a positive $\sigma$-regular Borel measure on
the open set $U$ of $\mathbf{R}^n$. We know that $d\mu$ is a functional on $C_c(U)$, and
since $C_c^\infty(U)$ is a subspace of $C_c(U)$, we can view $d\mu$ as a linear map on
$C_c^\infty(U)$. Thus if $\varphi$ has support in $K$, we have
\[ \left| \int \varphi\,d\mu \right| \le \mu(K)\,\|\varphi\|, \]
and we see again that $d\mu$ is a distribution of order 0 on every compact
subset of $U$. We could use the notation $T_\mu$ instead of $d\mu$ for the
preceding distribution. As with functions, we have an injective map $\mu \mapsto d\mu$
from the set of positive $\sigma$-regular Borel measures on $U$ into the set of
distributions.
The Dirac distribution $\delta$ given by
\[ \delta(\varphi) = \varphi(0) \]
is a special case of a distribution obtained from a measure, namely the
Dirac measure. For each $a \in \mathbf{R}^n$, we also have the translate $\delta_a$ of $\delta$, given
by
\[ \delta_a(\varphi) = \varphi(a). \]
Multiplication by a $C^\infty$ Function. Let $T$ be a distribution on $U$ and let
$\alpha \in C^\infty(U)$. We define $\alpha T$ to be $T \circ M_\alpha$, so that
\[ (\alpha T)(\varphi) = T(\alpha\varphi). \]
It is immediately verified that $\alpha T$ is a distribution.
Composition with Differential Operators. Let $T$ be a distribution and
$D$ a differential operator on $U$. Then $T \circ D$ (also written $TD$) is a
distribution on $U$. Its value at $\varphi$ is
\[ (TD)(\varphi) = T(D\varphi). \]
In particular, if $T$ can be represented by a locally integrable function $f$,
then
\[ T_f D(\varphi) = \int_U f(x)D\varphi(x)\,dx. \]
The verification that $TD$ is a distribution is immediate from the
definitions.
XI, §2. SUPPORT AND LOCALIZATION
Let $D$ be a differential operator on an open set $U$. For every open
subset $V$ of $U$, we can view $D$ as a differential operator on $V$; namely
considering the restriction of $D$ to those functions having support in $V$.
We say that $D$ is equal to zero on $V$ if $D\varphi = 0$ for every $\varphi \in C^\infty(V)$. We
say that $D$ is locally zero at a point $a \in U$ if $D$ is equal to zero on some
open neighborhood of $a$, i.e. if there exists an open subset $V$ of $U$,
containing $a$, such that $D\varphi = 0$ for every $\varphi \in C^\infty(V)$. We define the
support of $D$ by describing its complement, namely:
$a \notin \operatorname{supp}(D)$ if and only if $D$ is locally zero at $a$.
Note that if $D$ is locally zero at $a$, then $D$ is locally zero at every point
close to $a$, so that the support of $D$ is closed in $U$.
Let $T$ be a distribution on an open set $U$. We say that $T$ is zero on
$U$ if $T\varphi = 0$ for each $\varphi \in C_c^\infty(U)$. If $T$ is a distribution on $U$, and $V$ is an
open subset of $U$, then we can restrict $T$ from $C_c^\infty(U)$ to $C_c^\infty(V)$, and this
restriction, denoted by $T|V$, is a distribution on $V$. We shall say that a
distribution $T$ on $U$ is locally zero at a point $a \in U$ if there exists an
open neighborhood $V$ of $a$ in $U$ such that the restriction of $T$ to $V$ is
zero on $V$. We can thus define the support of $T$ by the condition:
$a \notin \operatorname{supp}(T)$ if and only if $T$ is locally 0 at $a$.
As before, we see that the support of $T$ is closed in $U$.

We can localize distributions just as we localized measures, by means
of partitions of unity, which must now be taken to be of class $C^\infty$, not
merely continuous. We restate this as a separate result.
$C^\infty$ Partitions of Unity. Let $K$ be a compact set in $\mathbf{R}^n$ and let $\{U_j\}$
($j = 1, \ldots, m$) be an open covering of $K$. Then there exist functions $\varphi_j$
in $C_c^\infty(U_j)$ such that $\varphi_j \ge 0$, and
\[ \sum_{j=1}^{m} \varphi_j = 1 \quad \text{on } K. \]
Proof. For each $x \in K$ we can find an open ball centered at $x$, of
radius $r(x)$, such that the ball of twice this radius centered at $x$ is
contained in some $U_j$. We cover $K$ by a finite number of such balls, say
$B_1, \ldots, B_s$. For each $k = 1, \ldots, s$ we find a function $\psi_k$ which is $C^\infty$,
which is equal to 1 on $B_k$, $0 \le \psi_k \le 1$, and such that $\psi_k$ vanishes outside
a ball $B_k'$ centered at the same point as $B_k$ and having a slightly bigger
radius. This is done by routine calculus technique, cf. Chapter VI, §9.
[We recall briefly below how to do this.] Inductively, one then sees that
if we let
\[ \alpha_1 = \psi_1, \quad \alpha_2 = \psi_2(1 - \psi_1), \quad \ldots, \quad \alpha_s = \psi_s(1 - \psi_1) \cdots (1 - \psi_{s-1}), \]
then on $B_1 \cup \cdots \cup B_s$ we have
\[ \alpha_1 + \cdots + \alpha_s = 1. \]
This yields what we want, except for the fact that the indices may not be
$j = 1, \ldots, m$. But it is trivial to adjust this as desired. All we have to do
is to find for each $k$ an index $j(k)$ such that $B_k$ is contained in $U_{j(k)}$, and
then for each $j = 1, \ldots, m$ take the sum of those $\alpha_k$ such that $j(k) = j$, to
obtain $\varphi_j$.
To get the function $\psi_k$ as in the preceding proof, we combine a
$C^\infty$ function of one real variable, equal to 1 on an interval around the origin
and to 0 outside a slightly bigger interval, with the square of the euclidean
norm to get a $C^\infty$ function which is 1 on a ball, and 0 outside another
ball of slightly bigger radius.
If the ball is not centered at the origin, we combine this with a
translation.
Theorem 2.1. Let $T$ be a distribution on an open set $U$ in $\mathbf{R}^n$. If $T$ is
locally zero at every point, then $T = 0$ on $U$.
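Proof. Let $\varphi \in C_c^\infty(U)$, and let $K$ be the support of $\varphi$. For each $a \in K$
we can find an open set $U_a$ such that $T$ is zero on $U_a$. We can cover $K$
with a finite number of such open sets, say $U_1, \ldots, U_m$. Let $\{\varphi_j\}$ be a $C^\infty$
partition of unity over $K$ with $j = 1, \ldots, m$, such that $\operatorname{supp} \varphi_j$ is contained
in $U_j$. Then
\[ \varphi = \sum_{j=1}^{m} \varphi_j\varphi \quad \text{and} \quad T\varphi = \sum_{j=1}^{m} T(\varphi_j\varphi) = 0, \]
which proves the theorem.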
Corollary 2.2. Two distributions which are locally equal everywhere are
equal.
Corollary 2.3. Let $T$ be a distribution on the open set $U$, and let
$\varphi \in C_c^\infty(U)$. If $\operatorname{supp}(T) \cap \operatorname{supp}(\varphi)$ is empty, then $T\varphi = 0$.
Proof. Let $K = \operatorname{supp} \varphi$, and $Q = \operatorname{supp} T$. There exists an open
neighborhood $V$ of $K$ which does not intersect $Q$ and is contained in $U$. Let
$\alpha \in C_c^\infty(V)$ be such that $\alpha = 1$ on $K$ and the support of $\alpha$ is contained in
$V$. Then $\varphi = \alpha\varphi$ and
\[ T(\varphi) = T(\alpha\varphi). \]
It is immediately verified that $\alpha T$ is locally zero everywhere, and hence
that $\alpha T = 0$, so that $T\varphi = 0$, as was to be proved.
Corollary 2.4. Let $T$ be a distribution on an open set $U$, and assume
that $T$ has compact support $K$. Let $\varphi, \psi \in C_c^\infty(U)$ and assume that
$\varphi = \psi$ on an open neighborhood of $K$. Then $T\varphi = T\psi$.
Proof. Since $\varphi - \psi$ is equal to 0 on an open neighborhood of $K$, it
follows that the supports of $\varphi - \psi$ and $T$ are disjoint, whence we can
apply Corollary 2.3 to conclude the proof.
As an application of Corollary 2.4, we can extend the domain of
definition of a distribution $T$ with compact support $K$ to the whole space
$C^\infty(U)$. Indeed, if $\alpha$ is a function in $C_c^\infty(U)$ which is equal to 1 on an
open neighborhood of $K$, and if $f \in C^\infty(U)$, then we define
\[ Tf = T(\alpha f). \]
The preceding corollary shows that this value is independent of the
choice of $\alpha$ subject to the condition that $\alpha = 1$ on an open neighborhood
of $K$, and that it is an extension of $T$, namely $T(\alpha\varphi) = T(\varphi)$ if $\varphi \in C_c^\infty(U)$.
This extension is useful in a context like the following. Consider the
function $f$ such that $f(x) = x$ (say in one variable $x$). If $T$ has compact
support, then we can speak of the value $T(x) = Tf(x)$ using the definition
we just made.
Using partitions of unity over a whole open set, one can prove the
following result, left as an exercise.
Let $\{U_i\}$ be an open covering of an open set $U$ in $\mathbf{R}^n$. For each $i$, let $T_i$
be a distribution on $U_i$, and assume that for each pair $i, j$ the
restrictions of $T_i$ and $T_j$ to $U_i \cap U_j$ are equal. Then there exists a unique
distribution $T$ on $U$ which is equal to $T_i$ on each $U_i$.
We give one more example of the localization principle, using partitions
of unity over a compact set. We say that a distribution $T$ is locally
of order $\le m$ at a point $a$ if there exists a compact neighborhood $K$ of $a$
such that $T$ is of order $\le m$ on $K$.
Theorem 2.5. If a distribution $T$ on $V$ is locally of order $\le m$ at every
point of $V$, then $T$ is of order $\le m$ on every compact subset of $V$.
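Proof. Let $K$ be a compact subset of $V$. Let $\{\alpha_j\}$ ($j = 1, \dots, k$) be a
$C^\infty$ partition of unity over $K$, such that each $\alpha_j$ has support in an open
set $V_j$ whose closure $\bar V_j$ is compact, contained in $V$, and such that $T$ is of
order $\le m$ on this closure. For any $\varphi \in C_c^\infty(K)$ we have
\[ T(\varphi) = \sum_{j=1}^{k} T(\alpha_j\varphi). \]
We note that $\operatorname{supp}(\alpha_j\varphi) \subset V_j \subset \bar V_j$. Let $A_j$ be a number such that
\[ |T(f)| \le A_j\,\pi_m(f), \qquad \text{all } f \in C_c^\infty(\bar V_j). \]
Then
\[ |T(\varphi)| \le \sum_{j=1}^{k} A_j\,\pi_m(\alpha_j\varphi). \]
But
\[ D^p(\alpha_j\varphi) = \sum_{|q| \le |p|} \psi_{jq}\,D^q\varphi \]
with suitable functions $\psi_{jq}$ determined by $\alpha_j$ and $p$. Thus each derivative
$D^p(\alpha_j\varphi)$ with $|p| \le m$ is bounded by a sum of terms, each a bound for
$\psi_{jq}$ times a bound for $D^q\varphi$ with $|q| \le m$.
Hence there is a constant $B_j$ such that
\[ \pi_m(\alpha_j\varphi) \le B_j\,\pi_m(\varphi), \qquad \text{all } \varphi \in C_c^\infty(K), \]
whence
\[ |T(\varphi)| \le \sum_{j=1}^{k} A_j B_j\,\pi_m(\varphi), \qquad \text{all } \varphi \in C_c^\infty(K). \]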
XI, §3. DERIVATION OF DISTRIBUTIONS
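Theorem 3.1. Let $D$ be a differential operator on the open set $U$ of $\mathbf{R}^n$.
Then there exists a unique differential operator $D^*$ such that for any
functions $f \in C^\infty(U)$ and $\varphi \in C_c^\infty(U)$ we have
\[ \int f\,D\varphi\,d\mu = \int (D^*f)\,\varphi\,d\mu. \]
Proof. For the existence, we may restrict ourselves to the case when
$D = \alpha D^p$ for some $\alpha \in C^\infty(U)$. Then we have
\[ \int f\,\alpha D^p\varphi\,d\mu = (-1)^{|p|}\int D^p(\alpha f)\,\varphi\,d\mu \]
because integration by parts shows that
\[ \int g\,D_i\varphi\,d\mu = -\int (D_i g)\,\varphi\,d\mu \]
for $g \in C^\infty(U)$, the boundary terms vanishing since $\varphi$ has compact support.
This proves the existence. As for uniqueness, suppose that $D^*$ and $D'$ are
differential operators such that
\[ \int (D'f)\,\varphi = \int (D^*f)\,\varphi \]
for all $f \in C^\infty(U)$ and $\varphi \in C_c^\infty(U)$. Then
\[ \int \bigl((D^* - D')f\bigr)\varphi = 0 \]
for all $\varphi$, so that $(D^* - D')f = 0$, whence $D^* = D'$.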
We shall call $D^*$ the adjoint of $D$. The map $D \mapsto D^*$ is an
anti-automorphism of the ring of differential operators; anti because
\[ (D_1 D_2)^* = D_2^* D_1^*. \]
Let $T$ be a distribution on $U$ and $D$ a differential operator. We define
\[ DT = T \circ D^* = TD^* \]
on $C_c^\infty(U)$. In particular, if $D = D_i$ is the $i$-th partial derivative, then
$D_i^* = -D_i$ and
\[ (D_iT)(\varphi) = -T(D_i\varphi). \]
The reason for our definition is that if $f$ is a $C^1$ function, then
recalling that $T_f(\varphi) = \int f\varphi\,d\mu$, we have the formula
\[ D_iT_f = T_{D_if}, \]
as one sees from Theorem 3.1.
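To make the adjoint concrete, here is a hedged worked instance in one variable, assuming $D = \alpha D_1$ with $\alpha \in C^\infty(\mathbf{R})$ (the specific operator is chosen for illustration only). One integration by parts, with no boundary term because $\varphi$ has compact support, gives:

```latex
\[
\int f\,\alpha\,\varphi'\,d\mu = -\int (\alpha f)'\,\varphi\,d\mu ,
\qquad\text{so}\qquad
D^*f = -(\alpha f)' = -\alpha' f - \alpha f' .
\]
```

In other words $(\alpha D_1)^* = -D_1 \circ \alpha$: the two factors appear in the reverse order, which is the reason the map $D \mapsto D^*$ reverses products.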
Example. Let $f$ be the locally integrable function on $\mathbf{R}$ such that
$f(x) = 1$ if $x \ge 0$ and $f(x) = 0$ if $x < 0$ (this is sometimes called the
Heaviside function). A trivial integration in one variable shows that the
derivative of $T_f$ is simply the Dirac distribution, i.e. we have
\[ DT_f = \delta, \]
where $D = D_1$ is the derivative in one variable.
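The "trivial integration" is $(DT_f)(\varphi) = -T_f(\varphi') = -\int_0^\infty \varphi'(x)\,dx = \varphi(0)$. The following is a minimal numerical sketch of that identity, assuming one particular test function (the bump `phi` below, supported in $(-1, 1)$) and a plain midpoint rule; the names `phi`, `dphi` and `midpoint` are illustrative and not from the text.

```python
import math

def phi(x):
    """A smooth test function supported in (-1, 1) (an illustrative choice)."""
    if abs(x) < 1.0:
        return math.exp(-1.0 / (1.0 - x * x))
    return 0.0

def dphi(x):
    """Derivative of phi, computed by the chain rule."""
    if abs(x) < 1.0:
        return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2
    return 0.0

def midpoint(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# (D T_f)(phi) = -T_f(phi') = -(integral of phi' over [0, infinity)).
# Since phi vanishes for x >= 1, it suffices to integrate over [0, 1].
lhs = -midpoint(dphi, 0.0, 1.0)
rhs = phi(0.0)  # the value delta(phi)
print(lhs, rhs)
```

The two printed numbers agree up to the discretization error of the midpoint rule, which is what $DT_f = \delta$ predicts for this test function.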
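Example. Consider the Laplace operator
\[ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}. \]
Let
\[ g(x, y) = \frac{1}{2\pi}\log r \]
where $r = (x^2 + y^2)^{1/2}$ as usual. Then $T_g$ is a distribution on the plane,
and
\[ \Delta T_g = \delta \]
is the Dirac distribution at the origin. See formula L 3 of Chapter XIX,
§3.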
XI, §4. DISTRIBUTIONS WITH DISCRETE SUPPORT
To investigate the structure of distributions with discrete support, it suffices
to describe distributions whose support is one point, and then by
translation, distributions whose support is at the origin. We can then
give a complete description of such distributions.
Theorem 4.1. Let $T$ be a distribution whose support is $\{0\}$. Then there
exists an integer $m \ge 0$ and constants $c_p$ such that
\[ T = \sum_{|p| \le m} c_p D^p\delta. \]
In fact, $c_p = (-1)^{|p|}\,T(x^p)/p!$.
Proof. First we recall from differential calculus that if $U$ is an open
set containing $0$, and if $f \in C^\infty(U)$ and $D^pf(0) = 0$ for $|p| \le k$, with $k \ge 1$,
then there exist $C^\infty$ functions $f_p$ such that
\[ f(x) = \sum_{|p| = k+1} x^p f_p(x). \]
This is proved by starting with the formula
\[ f(x) = f(x) - f(0) = \int_0^1 f'(tx)\,x\,dt = \int_0^1 f'(tx)\,dt \cdot x, \]
and continuing to integrate similarly the successive derivatives of f. We

then write the Taylor expansion of f, namely
Since $T$ has compact support, it has order $\le m$ for some $m$. We shall
use the definition of $T$ on functions as in the discussion following Corollary 2.4.
We consider the Taylor expansion of $f$ up to the terms of order $m$,
and consider the function
\[ g(x) = f(x) - \sum_{|p| \le m} \frac{D^pf(0)}{p!}\,x^p. \tag{1} \]
Then $(D^qg)(0) = 0$ if $|q| \le m$, and our preceding remark allows us to write
$g$ as a sum of terms each of which is of type
\[ x^k h(x) \]
with $|k| \ge m + 1$. We shall prove below that $T(x^kh) = 0$ if $h$ is $C^\infty$ in
some neighborhood of $0$, $T$ has order $\le m$, and $|k| \ge m + 1$. Once we
have this, we conclude that $Tg = 0$, whence from (1), we obtain
\[ Tf = \sum_{|p| \le m} \frac{D^pf(0)}{p!}\,T(x^p), \]
from which our theorem follows at once.
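The last step can be spelled out as follows. This is a sketch that uses only the identity $(D^p\delta)(f) = (-1)^{|p|}D^pf(0)$, which comes from $D_i^* = -D_i$ applied $|p|$ times, together with the value $Tf = \sum_{|p| \le m} T(x^p)\,D^pf(0)/p!$ obtained from $Tg = 0$:

```latex
\[
Tf \;=\; \sum_{|p|\le m} \frac{T(x^p)}{p!}\,D^pf(0)
   \;=\; \sum_{|p|\le m} \frac{(-1)^{|p|}\,T(x^p)}{p!}\,(D^p\delta)(f)
   \;=\; \sum_{|p|\le m} c_p\,(D^p\delta)(f),
\qquad c_p = \frac{(-1)^{|p|}\,T(x^p)}{p!}.
\]
```

Since this holds for every $f$, it is the asserted identity $T = \sum c_p D^p\delta$ with the stated coefficients.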

We now prove that $T(x^kh) = 0$ under the stated conditions. Let $\alpha$ be
a $C^\infty$ function with support in the unit disc, and equal to 1 on some
neighborhood of $0$. Let $\alpha_r(x) = \alpha(rx)$, so that the support of $\alpha_r$ shrinks to
the origin as $r \to \infty$, and in fact lies in the disc of radius $1/r$. Fix $q$ such
that $|q| \ge m + 1$. We have for $r > 0$:
\[ T(x^qh) = T(x^q\alpha_rh), \]
and it will suffice to prove that this value tends to 0 as $r \to \infty$. Since $m$
is the order of $T$, there exists a constant $A$ such that
\[ |T(x^q\alpha_rh)| \le A\,\pi_m(x^q\alpha_rh). \]
Thus we have to estimate
\[ D^p(x^q\alpha_rh), \qquad |p| \le m. \]
The support of $x^q\alpha_rh$ lies in the disc of radius $1/r$. The usual formula for
the derivative of a product yields
\[ D^p(x^q\alpha_rh) = \sum c_{jkl}\,D^jx^q\,D^k\alpha_r\,D^lh \]
with $j + k + l = p$. (Addition is componentwise, and the coefficients $c_{jkl}$
are variations of binomial coefficients, determined universally by $p$ and $q$.)
The derivatives $D^lh$ are uniformly bounded on a given neighborhood of
the origin. We have
\[ D^k\alpha_r(x) = r^{|k|}\,(D^k\alpha)(rx), \]
and hence $D^k\alpha_r$ is bounded by $r^{|k|}$ times a bound for the derivatives of $\alpha$
itself, up to order $m$. In the circle of radius $1/r$ we have
\[ |D^jx^q| \le C\,(1/r)^{|q| - |j|} \]
with a constant $C$ depending only on $j$ and $q$.
But $|k| \le m - |j| < |q| - |j|$. This proves that $D^p(x^q\alpha_rh)$ tends to 0 as
$r \to \infty$, and concludes the proof of our theorem.
More generally, by a similar technique, one can prove that a distribution
with compact support is equal to $DT_f$, where $f$ is a continuous
function, $D$ is some differential operator, and $T_f$ is the distribution given
by $T_f(\varphi) = \int f\varphi\,d\mu$. Our intent here is not to give an exhaustive theory,
but merely to give the reader a brief acquaintance and feeling for functionals
depending on a more involved topology than that of a norm,
taking into account partial derivatives. For a concise and very useful
summary of other facts, cf. the first two chapters of Hörmander [Ho], and
also Palais [Pa 1].
We conclude with a remark which is technically useful. One can
define distributions on a torus ($\mathbf{R}^n$ modulo $\mathbf{Z}^n$) using $C^\infty$ functions on $\mathbf{R}^n$
which are periodic. The advantage of doing this lies in the fact that in
this case, every distribution has compact support, and estimates become
easier to make. One can use certain open subsets of the torus, as local
domains replacing open sets in euclidean space, and thus one avoids
certain quasipathological types of distributions arising from open subsets
in $\mathbf{R}^n$, due to exceptional growth along the boundary. In many cases, it
is worth paying the price of periodicity to achieve this. For an exposition
along these lines, cf. [Pa 1].
CHAPTER XII
Integration on Locally
Compact Groups
This chapter is independent of the others, but is interesting for its own
sake. It gives examples of integration in a different setting from eu-
clidean space, for instance integration on a group of matrices. For an
application of integration and some functional analysis to compact
groups, see Exercise 11.
XII, §1. TOPOLOGICAL GROUPS
A topological group $G$ is a topological space $G$ together with a group
law, i.e. maps
\[ G \times G \to G \quad\text{and}\quad G \to G \]
which define a group law and the inverse mapping in the group, such
that these maps are continuous. After this section, i.e. from §2 to the end
of the chapter, we assume always in addition that $G$ is Hausdorff.
Examples. (1) Euclidean space $\mathbf{R}^p$ is a topological group under
addition.
(2) The multiplicative group $\mathbf{R}^*$ of non-zero real numbers under multiplication.
Similarly, the multiplicative group of non-zero complex numbers
$\mathbf{C}^*$ under multiplication.
(3) The group of non-singular $n \times n$ matrices $\mathrm{Mat}_n(\mathbf{R})$ or $\mathrm{Mat}_n(\mathbf{C})$ under
multiplication.
(4) The group $SL_n(\mathbf{R})$ or $SL_n(\mathbf{C})$ of matrices having determinant 1,
under multiplication.
(5) The Galois group of the algebraic numbers over the rational numbers,
with the Krull topology.
(6) The additive group of $p$-adic numbers.
(If you don't know these last two examples, don't panic; forget about
them. They won't be used in this book.)
In the non-commutative case, we write the group multiplicatively as
usual. In the commutative case, we write it either multiplicatively or
additively, depending on situations.
If $G$ is a topological group, and $a \in G$, then we get a map
\[ \tau_a : G \to G \]
called translation by $a$, and defined by $\tau_ax = ax$. More accurately we call
this left translation by $a$. Multiplication being continuous, it is clear that
$\tau_a$ is continuous, and is in fact a homeomorphism since it has an inverse,
namely translation by $a^{-1}$.
The map $x \mapsto x^{-1}$ is also a homeomorphism of $G$ onto itself.
Let $e$ be the unit element of $G$. Let $V$ be an open set containing $e$. If
$a \in G$, then $aV$ is an open set containing $a$. If $V$ is an open set containing
$a$, then $a^{-1}V$ is an open set containing $e$. Thus neighborhoods of $e$
and neighborhoods of any point in $G$ differ only by translation.
The technique of $(\epsilon, \delta)$ in metric spaces can be used in topological
groups by using translations, to give a uniform way of describing closeness.
For instance, if $a, b \in G$ and $V$ is an open neighborhood of $e$, we
can say that $a, b$ are $V$-close if $a \in bV$. This relation is symmetric if we
can select $V$ to be symmetric, i.e. $V = V^{-1}$ (where for any set $S$ in $G$, the
set $S^{-1}$ is the set of all elements $x^{-1}$ with $x \in S$). This can always be
done: if $V$ is an open neighborhood of $e$ in $G$, then $V \cap V^{-1}$ is a
symmetric open neighborhood.
The $\epsilon/2$ technique can also be used: given an open neighborhood $U$ of
$e$, there exists an open neighborhood $V$ of $e$ such that $VV = V^2 \subset U$.
Indeed, the map $G \times G \to G$ being continuous, the inverse image of $U$ is
open in $G \times G$ and contains an open set $W \times W'$ containing $(e, e)$. We
let $V = W \cap W'$. Similarly, we can find an open neighborhood $V$ of $e$
such that $VV^{-1} \subset U$.
We have the usual uniformity statements for continuous maps on
compact sets. Let $S$ be a subset of $G$ and $f: S \to E$ be a map into a
normed vector space. We say that $f$ is (left) uniformly continuous on $S$ if
given $\epsilon$, there exists a neighborhood $V$ of $e$ such that for all $x, y \in S$ with
$y^{-1}x \in V$ we have
\[ |f(x) - f(y)| < \epsilon. \]
Proposition 1.1. If $S$ is compact and $f: S \to E$ is continuous, then $f$ is
uniformly continuous.
Proof. Just the same as when $S$ is in a metric space. For each $x \in S$
we can find an open neighborhood $U_x$ of $e$ such that if $y \in xU_x$, then
\[ |f(y) - f(x)| < \epsilon. \]
Let $V_x$ be an open neighborhood of $e$ such that $V_x^2 \subset U_x$. There exists a
finite covering $\{x_1V_{x_1}, \dots, x_nV_{x_n}\}$ of $S$ with $x_1, \dots, x_n \in S$. Let
\[ V = V_{x_1} \cap \cdots \cap V_{x_n}. \]
Let $x, y \in S$ and suppose that $x \in yV$. We have $y \in x_iV_{x_i}$ for some $i$, and
hence
\[ x \in x_iV_{x_i}V \subset x_iU_{x_i} \quad\text{and}\quad y \in x_iV_{x_i} \subset x_iU_{x_i}, \]
so that
\[ |f(x) - f(y)| \le |f(x) - f(x_i)| + |f(x_i) - f(y)| < 2\epsilon, \]
which proves the proposition.
Proposition 1.2. If $A, B$ are compact sets, then the product $AB$ is
compact.
Proof. $AB$ is the image of $A \times B$ under the continuous composition
law of $G$.
Proposition 1.3. Let $A$ be a subset of $G$. Then the closure of $A$
satisfies
\[ \bar A = \bigcap_V AV, \]
the intersection being taken over all open neighborhoods $V$ of $e$.
Proof. Let $x \in \bar A$. For any open neighborhood $V$ of $e$ the open set
$xV^{-1}$ contains $x$ and hence intersects $A$, i.e. there is some $y \in A$ such that
$y = xv^{-1}$ with some $v \in V$. Then $x = yv$ and $x \in AV$, so we get one
inclusion. Conversely, if $x$ is in all $AV$, then $xV^{-1}$ intersects $A$ for all $V$,
whence $x$ lies in the closure of $A$. This proves our assertion.
Proposition 1.4. Let $H$ be a subgroup of $G$. The closure $\bar H$ is a
subgroup.
Proof. This is proved purely formally using the fact that translations
and the inverse map are homeomorphisms. Namely, if $h \in H$, then
\[ hH = H \subset \bar H \]
and $H \subset h^{-1}\bar H$. Since $h^{-1}\bar H$ is closed it follows that $\bar H \subset h^{-1}\bar H$, whence
$h\bar H \subset \bar H$. Hence $H\bar H \subset \bar H$. If $x \in \bar H$, then $Hx \subset \bar H$ and $H \subset \bar Hx^{-1}$, which
is closed. Hence
\[ \bar H \subset \bar Hx^{-1} \]
and therefore $\bar Hx \subset \bar H$ so that $\bar H\bar H \subset \bar H$. Similarly, $H^{-1} = H \subset \bar H$, so that
$H \subset \bar H^{-1}$, and since $\bar H^{-1}$ is closed, we get $\bar H \subset \bar H^{-1}$, whence $\bar H^{-1} \subset \bar H$.
Thus $\bar H$ is a subgroup.
Example. If we take $H = \{e\}$, then $\bar H$ is the smallest closed subgroup
of $G$. By what we saw concerning the closure of a set, it is equal to the
intersection of all open neighborhoods of $e$.
We now consider coset spaces and factor groups. Let $H$ be a subgroup
of $G$. We have the set of left cosets $\{xH\}$, $x \in G$, which we denote
by $G/H$, and a natural map
\[ \pi : G \to G/H, \]
which to each $x \in G$ associates the left coset $xH$. We give $G/H$ the
topology having the maximum amount of open sets making $\pi$ continuous.
Thus a subset $W$ in $G/H$ is defined to be open if and only if $\pi^{-1}(W)$ is
open in $G$. We have the following characterization of open sets in $G/H$:
Proposition 1.5. A subset of $G/H$ is open if and only if it is of the form
$\pi(V)$ for some open set $V$ in $G$.
Proof. If $W$ is open in $G/H$, then $W = \pi(\pi^{-1}(W))$ and $\pi^{-1}(W)$ is open.
Conversely, if $V$ is open in $G$, then
\[ \pi^{-1}(\pi(V)) = VH = \bigcup_{h \in H} Vh \]
is open, so $\pi(V)$ is open.
In particular, we see that the map $\pi: G \to G/H$ is an open mapping, i.e.
maps open sets onto open sets. All open subsets of $G/H$ may be written
in the form $VH/H$ with $V$ open in $G$. In particular, $G/H$ is locally
compact.
Proposition 1.6. If $K'$ is a compact subset of $G/H$, then there exists a
compact subset $K$ of $G$ such that $K' = \pi(K)$.
Proof. Let $A$ be a compact neighborhood of $e$ in $G$. Then $\pi(A)$ is a
compact neighborhood of the unit coset in $G/H$ (it is compact and we
know that $\pi$ is an open mapping). Let $x_1, \dots, x_n$ be elements of $G$ such
that the sets $\pi(x_iA)$ cover $K'$. Let
\[ K = \pi^{-1}(K') \cap (x_1A \cup \cdots \cup x_nA). \]
Then $K$ is compact and satisfies our requirements.
The preceding property will be useful when we consider continuous
functions on $G/H$.
If we let $H = \bar e$ be the closure of the identity subgroup (interesting, if
at all, only when the set consisting of $e$ alone is not closed), then $H$ is
closed and it is easily seen that every point in $G/H$ is closed. It then
follows that $G/H$ is Hausdorff, because of a general property of topological
groups, namely:
If each point of a topological group $G$ is a closed set, then $G$ is
Hausdorff.
Proof. Let $x, y \in G$ and $x \ne y$. Let $U$ be the complement of $xy^{-1}$ and
let $V$ be a symmetric open neighborhood of $e$ such that $VV \subset U$. Then
$V$ does not intersect $Vxy^{-1}$, and hence $Vx$, $Vy$ are disjoint open sets
containing $x$ and $y$ respectively.
As an exercise, the reader can show that if H is a subgroup of G, then

G/H is Hausdorff if and only if H is closed. From now on, we deal only
with Hausdorff groups, and take coset spaces or factor groups (with
normal H) only when H is closed.
Example. A subgroup $H$ of $G$ is said to be discrete if the induced
topology on $H$ is the discrete topology, i.e. every point is open. Let
$G = \mathbf{R}^p$ and let $v_1, \dots, v_m$ ($m \le p$) be vectors linearly independent over the
reals. Let $\Gamma$ be the additive group of all linear combinations
\[ k_1v_1 + \cdots + k_mv_m \]
with integers $k_i$. Then $\Gamma$ is a subgroup, which is immediately verified to
be discrete.
The group $\mathbf{Z}$ of integers is a discrete subgroup of $\mathbf{R}$, and the factor
group $\mathbf{R}/\mathbf{Z}$ is isomorphic to the circle group (group of complex numbers
of absolute value 1, under multiplication), under the map
\[ x \mapsto e^{2\pi ix}. \]
The group $GL(n, \mathbf{Z})$ of non-singular $n \times n$ matrices with integer components
is a discrete subgroup of $GL(n, \mathbf{R})$.
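The identification of $\mathbf{R}/\mathbf{Z}$ with the circle group can be checked numerically on a few sample points. The sketch below assumes the standard map $x \mapsto e^{2\pi ix}$ (here called `chi`, a name introduced only for this illustration) and verifies, up to floating-point tolerance, that it is a homomorphism from addition to multiplication, that its values have absolute value 1, and that integers lie in its kernel.

```python
import cmath
import math

def chi(x):
    """The map x -> exp(2*pi*i*x) from the additive reals to the circle group."""
    return cmath.exp(2j * math.pi * x)

samples = [0.0, 0.25, -1.75, 3.1, 12.5]
for x in samples:
    for y in samples:
        # homomorphism property: chi(x + y) = chi(x) * chi(y)
        assert abs(chi(x + y) - chi(x) * chi(y)) < 1e-12
    # every value lies on the unit circle
    assert abs(abs(chi(x)) - 1.0) < 1e-12

# integers are sent to the unit element 1, so chi factors through R/Z
for k in range(-3, 4):
    assert abs(chi(k) - 1.0) < 1e-12
```

This checks finitely many points only; it illustrates the statement and is of course not a proof.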
XII, §2. THE HAAR INTEGRAL, UNIQUENESS
By a (left) Haar measure on a locally compact group $G$ (assumed Hausdorff
from now on) we mean a positive measure $\mu$ on the Borel sets
which is $\sigma$-regular, non-zero on any non-empty open set (or equivalently
on any Borel set containing a non-empty open set), and which is left
invariant, meaning that
\[ \mu(xA) = \mu(A) \]
for all measurable sets $A$, and all $x \in G$.

We shall get hold of a Haar measure by going through positive functionals.
Thus by a Haar functional we shall mean a positive non-zero
functional $\lambda$ on $C_c(G)$ which is left invariant, i.e.
\[ \lambda(\tau_af) = \lambda(f) \]
for all $f \in C_c(G)$. Here as usual, $\tau_af = f_a$ is the $a$-translate of $f$, defined
by
\[ \tau_af(x) = f(a^{-1}x). \]
Remark. Observe the presence of the inverse $a^{-1}$. This is deliberate.
We want the formula
\[ \tau_{ab} = \tau_a\tau_b \]
for $a, b \in G$, and this formula is true with the present definition. Functions
form a contravariant system, i.e. if $T: X \to Y$ is a map of sets, then
it induces a map in the reverse direction
\[ T^*: \text{functions on } Y \to \text{functions on } X. \]
We can apply this when $T = \tau_a$ is the translation, thus forcing the inverse
$a^{-1}$ when applying translation to functions.
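The role of the inverse can be seen on a small non-commutative group. The sketch below is an illustration under assumptions of our own: it takes $G = S_3$ (permutations of three letters, encoded as tuples), represents a function on $G$ as a dictionary, and compares the translate $(\tau_af)(x) = f(a^{-1}x)$ with the naive variant $x \mapsto f(ax)$. The helper names `mul`, `inv`, `tau` and `naive` are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))          # the symmetric group S_3

def mul(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def tau(a, f):
    """Translate with the inverse: (tau_a f)(x) = f(a^{-1} x)."""
    return {x: f[mul(inv(a), x)] for x in G}

def naive(a, f):
    """Translate without the inverse: x -> f(a x)."""
    return {x: f[mul(a, x)] for x in G}

f = {x: i for i, x in enumerate(G)}       # a function taking distinct values

# tau_a tau_b = tau_{ab} holds for all a, b; the naive variant reverses the order.
with_inverse_ok = all(tau(a, tau(b, f)) == tau(mul(a, b), f) for a in G for b in G)
naive_ok = all(naive(a, naive(b, f)) == naive(mul(a, b), f) for a in G for b in G)
print(with_inverse_ok, naive_ok)
```

With the inverse the composition rule holds for every pair; without it one gets the rule for $ba$ instead of $ab$, which fails as soon as $a$ and $b$ do not commute.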
The original proof for the existence of Haar measure due to Haar
provides the standard model for all known proofs. We shall prove the
existence of the functional in §3, following Weil's exposition [W]. Here,
we discuss the relation between the measure and the functional; we prove
uniqueness and give examples.
First we prove a lemma which shows that a locally compact group
has a certain $\sigma$-finiteness built into it. We recall that a set is called
$\sigma$-compact if it is a countable union of compact sets.
Lemma 2.1. Let $G$ be a locally compact group. Then there exists an
open and closed subgroup $H$ which is $\sigma$-compact.
Proof. Let $K$ be a symmetric compact neighborhood of $e$. Then the
sets $K^n = KK \cdots K$ ($n$ times) are compact neighborhoods of $e$, and form
an increasing sequence since $e \in K^n$ for all $n$. Let $H = K^\infty$ be the union
of all $K^n$ for all positive integers $n$. Then $H$ is $\sigma$-compact. Furthermore,
$H$ is a subgroup (obvious). Next, $H$ is open, because if $x \in H$, then
$x \in K^n$ for some $n$, whence
\[ xK \subset K^{n+1} \subset H, \]
and $H$ is open. All cosets of $H$ are open, and we can write $G$ as a
disjoint union of cosets of $H$. Then $H$ itself is the complement of an
open set (union of all cosets $\ne H$), whence $H$ is closed. This proves our
lemma.
In view of the lemma, we can write
\[ G = \bigcup_{i \in I} x_iH \]
for $i$ in some indexing set $I$, and $H$ is open, closed, and $\sigma$-compact. Let
$\mu$ be a Haar measure. By the remarks following Theorem 2.3, Chapter
IX, it follows that the measure on each coset $x_iH$ is regular. If $A$ is an
arbitrary measurable set, then we can write
\[ A = \bigcup A_i \quad\text{with}\quad A_i \subset x_iH \]
and the $A_i$ are disjoint. If we have proved the uniqueness of Haar
measure on $H$ (i.e. the fact that two Haar measures differ by a constant
$> 0$), then the reader will verify easily that the uniqueness follows for $G$
itself.
Theorem 2.2. If $\mu$ is a Haar measure, then for any $f \in \mathscr{L}^1(\mu)$ and any
$a \in G$ we have
\[ \int_G f(ax)\,d\mu(x) = \int_G f(x)\,d\mu(x). \]
In particular, the functional $d\mu$ on $C_c(G)$ is left invariant, and therefore a
Haar functional.
Proof. If $\varphi$ is any step function then by linearity we see that
\[ \int_G \tau_a\varphi\,d\mu = \int_G \varphi\,d\mu. \]
Let $\{\varphi_n\}$ be a sequence of step functions converging both $L^1$ and almost
everywhere to some $f$ in $\mathscr{L}^1(\mu)$. Then $\{\tau_a\varphi_n\}$ converges almost everywhere
to $\tau_af$. On the other hand, $\{\tau_a\varphi_n\}$ is immediately seen to be
$L^1$-Cauchy because we have remarked that the integral of $\tau_a\varphi$ is the
same as the integral of $\varphi$ for any step function $\varphi$. The first assertion of
our theorem follows at once. (Note: This is the same proof we gave for
the invariance of Lebesgue measure on euclidean space.)
Theorem 2.3. Let $\mu$ and $\nu$ be Haar measures on $G$. Let $d\mu$ and $d\nu$ be
the functionals on $C_c(G)$ associated with $\mu$ and $\nu$. Then there exists a
number $c > 0$ such that $d\nu = c \cdot d\mu$.
Proof. By a previous remark, we may assume that $G$ is $\sigma$-compact,
and hence $\sigma$-finite with respect to our Haar measures. We shall apply
Fubini's theorem and refer the reader to Chapter IX, §6.
We shall first give a simpler proof when the Haar measure is also
right invariant (which applies for instance when $G$ is commutative). Let
$h \in C_c(G)$ be a positive function such that
\[ \int_G h\,d\mu = 1. \]
We can find such $h$ by first selecting a non-empty open set $V$ with
compact closure, a function $h_0$ such that $V \prec h_0$, and then multiplying $h_0$ by
a suitable constant. (The symbol $\prec$ was defined in §2 of Chapter IX.)
For an arbitrary $f \in C_c(G)$ we have:
\[
\begin{aligned}
\int_G f\,d\nu = \int_G h\,d\mu \int_G f\,d\nu
 &= \int_G\int_G h(y)f(x)\,d\nu(x)\,d\mu(y) \\
 &= \int_G\int_G h(y)f(xy)\,d\nu(x)\,d\mu(y) \\
 &= \int_G\int_G h(y)f(xy)\,d\mu(y)\,d\nu(x) \\
 &= \int_G\int_G h(x^{-1}y)f(y)\,d\mu(y)\,d\nu(x) \\
 &= \int_G\int_G h(x^{-1}y)f(y)\,d\nu(x)\,d\mu(y) \\
 &= c\int_G f\,d\mu
\end{aligned}
\]
where
\[ c = c(\nu) = \int_G h(x^{-1})\,d\nu(x). \]
This proves our theorem in the present case. It also gives us an explicit
determination of the constant $c$ involved in the statement of the theorem.
The proof of uniqueness when the Haar measure is not also right
invariant is slightly more involved, and runs as follows. For each non-zero
positive function $f \in C_c(G)$, we consider the ratio of the integrals
(taken over $G$):
\[ r(f) = \frac{\int f\,d\mu}{\int f\,d\nu}. \]
It will suffice to show that this ratio is independent of $f$. We select a
positive function $h = h_K$ with support equal to a compact neighborhood
$K$ of the origin, such that $h$ is symmetric [i.e. $h(x) = h(x^{-1})$ for all $x$],
and also
\[ \int h\,d\nu = 1. \]
We can obviously satisfy these conditions with $K$ arbitrarily small, i.e.
contained in a given open neighborhood of the origin. (To get the
symmetry, use a product $h(x) = \psi(x)\psi(x^{-1})$ where $\psi$ has small support,
and to normalize the integral, multiply by a suitable positive constant.)
Now for any $f$ as above, we consider the difference
\[ \int h\,d\mu\int f\,d\nu - \int h\,d\nu\int f\,d\mu = \int\!\!\int [h(x)f(y) - h(y)f(x)]\,d\mu(x)\,d\nu(y). \]
We change $x$ to $yx$ in the second term. We change $x$ to $y^{-1}x$ in the first
term, and then replace $y^{-1}x$ by $x^{-1}y$ using the symmetry of $h$ to get
$h(x^{-1}y)f(y)$ in the first term. We reverse the order of integration for the
first term, change $y$ to $xy$ and change the order of integration once more.
We then see that our difference is equal to
\[ \int\!\!\int h(y)[f(xy) - f(yx)]\,d\mu(x)\,d\nu(y). \]
We now estimate this integral. If $K$ is small enough, then
\[ |f(xy) - f(yx)| < \epsilon \]
for all $x \in G$ and all $y \in K$. Furthermore if $y \in K$, the function
\[ x \mapsto f(xy) - f(yx) \]
has support in the set $(\operatorname{supp} f)K^{-1} \cup K^{-1}(\operatorname{supp} f)$, which is compact, and
whose $\mu$-measure is bounded by a fixed number $C_f$ depending only on $f$,
as $K$ shrinks to the origin. Since $h$ is positive, we get the estimate (using
the fact that $\int h\,d\nu = 1$):
\[ \Bigl|\int h_K\,d\mu\int f\,d\nu - \int f\,d\mu\Bigr| \le \epsilon\,C_f. \]
Dividing by $\int f\,d\nu$, we obtain
\[ \lim_{K \to \{e\}} \int h_K\,d\mu = r(f), \]
with an obvious notation concerning the use of the limit symbol. The
left-hand side is independent of $f$. This proves what we wanted, and
concludes the proof of the uniqueness of Haar measure in general.
Corollary 2.4. The map $\mu \mapsto d\mu$ is a bijection between the set of Haar
measures on $G$ and the set of Haar functionals. If $\mu, \nu$ are Haar
measures, then there exists $c > 0$ such that $\nu = c\mu$.
Proof. Let $\lambda$ be a Haar functional. Let $\mu$ be the measure associated
with $\lambda$ by Theorem 2.3 of Chapter IX. For any open set $V$ we have
\[ \mu(V) = \sup \lambda f \quad\text{for } f \prec V, \]
and $f \prec V$ if and only if $f_a \prec aV$. Thus $\mu(aV) = \mu(V)$. For any Borel set
$A$ we have
\[ \mu(A) = \inf \mu(V) \quad\text{for } V \text{ open} \supset A, \]
and we conclude similarly that $\mu(aA) = \mu(A)$, so that $\mu$ is left invariant.
Let $V$ be a non-empty open set. Suppose that $\mu(V) = 0$. Any compact
set $K$ can be covered by a finite number of translates of $V$, and
consequently $\mu(K) = 0$ for all compact $K$. If $f \in C_c(G)$, $f \ne 0$, then
$f/\|f\| \prec W$ for some open $W$ with compact closure. It follows that
$0 \le \lambda f \le \mu(W)\|f\| = 0$, contradicting the non-triviality of $\lambda$. This proves
that $\mu(V) > 0$, and hence that $\mu$ is a Haar measure. Thus the map
$\mu \mapsto d\mu$ from Haar measures to Haar functionals is surjective. The map
is injective by the Corollary of Theorem 2.7, Chapter IX. The last statement
is now clear.
We are in a position to give examples of Haar measure and integrals.
As is often the case, for any concretely given group, one can exhibit a
specific Haar functional. The uniqueness theorem can then be used to
guarantee that it is the only one possible.
Examples. (1) Let $G = \mathbf{R}^p$ be the additive group of euclidean space.
Then Lebesgue measure is the Haar measure.
(2) Let $G = \mathbf{R}^*$ be the multiplicative group of non-zero real numbers.
On $C_c(\mathbf{R}^*)$ we define a functional
\[ f \mapsto \int_{-\infty}^{\infty} f(x)\,\frac{dx}{|x|}. \]
Thus we let $\mu^*$ be the measure such that $d\mu^*(x) = dx/|x|$. This is easily
seen to be invariant under multiplicative translations. Namely, suppose
that $a < 0$. We compute
\[ \int_{-\infty}^{\infty} f(ax)\,\frac{dx}{|x|}. \]
We change variables, with $u = ax$, $du = a\,dx$. Then $|x| = |u|/|a|$, and
\[ \frac{dx}{|x|} = \frac{|a|}{a}\,\frac{du}{|u|} = -\frac{du}{|u|}. \]
But the limits of integration $\infty$ and $-\infty$ get reversed, and we conclude
at once that our integral is equal to
\[ \int_{-\infty}^{\infty} f(u)\,\frac{du}{|u|}, \]
as desired. The case when $a > 0$ is even easier.
(3) Let $T$ be the circle group, i.e. the group of complex numbers of
absolute value 1. The map $\lambda$ given by
\[ \lambda(f) = \frac{1}{2\pi}\int_0^{2\pi} f(e^{i\theta})\,d\theta \]
is immediately seen to be a positive functional on $C_c(T)$, and also left
invariant, so that it is a Haar functional.
(4) Let $G$ be a discrete group. The measure giving value 1 to a subset
of $G$ consisting of one element is a Haar measure. What is its corresponding
functional?
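The invariance claimed in Example (2) can be checked numerically. The following is a minimal sketch, assuming an illustrative test function `bump` supported in $(1, 3)$ and a midpoint rule for $\int f(x)\,dx/|x|$; all function names are introduced here and are not from the text. It compares $\int f(ax)\,dx/|x|$ with $\int f(x)\,dx/|x|$ for a few values of $a$, including a negative one.

```python
import math

def bump(x):
    """A smooth function supported in (1, 3) (an illustrative choice)."""
    if 1.0 < x < 3.0:
        t = x - 2.0
        return math.exp(-1.0 / (1.0 - t * t))
    return 0.0

def integrate_dx_over_abs_x(f, lo, hi, n=20_000):
    """Midpoint rule for the integral of f(x) dx/|x| over [lo, hi] (0 not in [lo, hi])."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) / abs(x)
    return total * h

def translated_integral(f, a):
    """Integral of f(a x) dx/|x|; the function x -> f(a x) is supported between 1/a and 3/a."""
    lo, hi = sorted((1.0 / a, 3.0 / a))
    return integrate_dx_over_abs_x(lambda x: f(a * x), lo, hi)

base = integrate_dx_over_abs_x(bump, 1.0, 3.0)
for a in (-2.5, 0.3, 7.0):
    print(a, translated_integral(bump, a), base)
```

For each $a$ the two values agree up to rounding, while the same experiment with plain $dx$ in place of $dx/|x|$ would scale the integral by $1/|a|$.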
As a matter of notation, one usually writes $dx$ instead of $d\mu(x)$ for a
Haar measure or its corresponding functional. For instance, in Example
2, we would say that $dx/|x|$ is a Haar measure on $\mathbf{R}^*$ if $dx$ is a Haar
measure on $\mathbf{R}$.
Other examples will be given in the exercises.
As Weil pointed out in [W], most of classical Fourier analysis can be

done on locally compact commutative groups. For this and other appli-
cations, besides [W], the reader can consult Loomis [Lo], Rudin [Ru 2]
for commutative groups, and the collected works of Harish-Chandra for
non-commutative groups.
For purposes of this book, the proof of existence given in the next
section can be omitted.
XII, §3. EXISTENCE OF THE HAAR INTEGRAL
In this section we prove that a Haar functional exists.
We let $L^+$ denote the set of all functions in $C_c(G, \mathbf{R}_{\ge 0})$. Then $L^+$
is closed under addition and multiplication by real numbers $\ge 0$. If
$f, g \in L^+$ and if we assume that $g$ is not identically 0, then there exist
numbers $c_i > 0$ and elements $s_i \in G$ ($i = 1, \dots, n$) such that for all $x$ we
have
\[ f(x) \le \sum_{i=1}^{n} c_i\,g(s_ix). \]
For instance, we let $V$ be an open set and $m > 0$ such that $g \ge m > 0$ on
$V$. We can take all $c_i = \sup f/m$ and cover the support $K$ of $f$ by
translates $s_1^{-1}V, \dots, s_n^{-1}V$. We define
\[ (f : g) \]
to be the inf of all sums $\sum c_i$ for all choices of $\{c_i\}$, $\{s_i\}$ satisfying the
above inequality. If $g = 0$, we define $(f : g)$ to be $\infty$. The symbol $(f : g)$
satisfies the following properties. The first expresses an invariance under
translation, where $f_a(x) = f(a^{-1}x)$.
(1) $(f_a : g) = (f : g)$,
(2) $(f_1 + f_2 : g) \le (f_1 : g) + (f_2 : g)$,
(3) $(cf : g) = c(f : g)$ if $c \ge 0$,
(4) If $f_1 \le f_2$, then $(f_1 : g) \le (f_2 : g)$,
(5) $(f : g) \le (f : h)(h : g)$,
(6) $(f : g) \ge \sup f/\sup g$.
The first four properties are obvious. For (5), we note that if
\[ f(x) \le \sum_i c_i\,h(s_ix) \quad\text{and}\quad h(x) \le \sum_j d_j\,g(t_jx), \]
then
\[ f(x) \le \sum_{i,j} c_id_j\,g(t_js_ix). \]
For property (6), let $x$ be such that $f(x) = \sup f$. Then
\[ \sup f = f(x) \le \sum_i c_i\,g(s_ix) \le \Bigl(\sum_i c_i\Bigr)\sup g, \]
whence (6) follows.
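For orientation, the symbol $(f : g)$ can be computed by hand in one simple case. The following is a sketch under an assumption not made in the text, namely that $G$ is discrete and $g$ is the characteristic function of $\{e\}$:

```latex
\[
g(s_i x) \neq 0 \iff x = s_i^{-1},
\qquad\text{so}\qquad
f(x) \le \sum_i c_i\,g(s_i x) \ \text{ for all } x
\iff
f(x) \le \sum_{i \,:\, s_i = x^{-1}} c_i \ \text{ for all } x .
\]
\[
\text{Taking one } s_i = x^{-1} \text{ with } c_i = f(x) \text{ for each } x \in \operatorname{supp} f
\text{ gives}\quad (f : g) = \sum_{x \in G} f(x).
\]
```

In this case $(f : g)$ is exactly the integral of $f$ for the measure giving value 1 to each point, which matches the discrete example in §2 and suggests why ratios of such symbols approximate a Haar functional when the support of $g$ is small.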

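Let $h_0$ be a fixed non-zero function in $L^+$. We define
\[ \lambda_g(f) = \frac{(f : g)}{(h_0 : g)}. \]
Then we have
\[ \frac{1}{(h_0 : f)} \le \lambda_g(f) \le (f : h_0). \tag{7} \]
For each fixed $g$, the map $\lambda_g$ will give an approximation of the Haar
functional, which will be obtained below as a limit in a suitable sense.
We note that $\lambda_g$ is left invariant, and satisfies
\[ \lambda_g(f_1 + f_2) \le \lambda_g(f_1) + \lambda_g(f_2), \qquad \lambda_g(cf) = c\lambda_g(f), \quad c \ge 0. \]
Furthermore, we shall now prove that if the support of $g$ is small, then
$\lambda_g$ is almost additive.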
Lemma 3.1. Given $f_1, f_2 \in L^+$ and $\epsilon$, there exists a neighborhood $V$ of
$e$ such that
\[ \lambda_g(f_1) + \lambda_g(f_2) \le \lambda_g(f_1 + f_2) + \epsilon \]
for all $g \in L^+$ having support in $V$.
Proof. Let $h$ be in $L^+$ and have the value 1 on the support of $f_1 + f_2$.
Let
\[ f = f_1 + f_2 + \delta h \]
with a number $\delta > 0$. Let
\[ h_1 = f_1/f \quad\text{and}\quad h_2 = f_2/f. \]
We use the usual convention that $h_1$ and $h_2$ are 0 wherever $f$ is equal to
0. Then $h_1, h_2$ are in $L^+$, and are therefore uniformly continuous. Let $V$
be such that
\[ |h_1(x) - h_1(y)| < \delta \quad\text{and}\quad |h_2(x) - h_2(y)| < \delta \]
whenever $y \in xV$. Let $g$ have support in $V$, and assume that $g \ne 0$. Let
$c_1, \dots, c_n$ be positive numbers and $s_1, \dots, s_n \in G$ be such that
\[ f(x) \le \sum c_i\,g(s_ix) \]
for all $x$. If $g(s_ix) \ne 0$, then $s_ix \in V$. We obtain
\[ f_1(x) = f(x)h_1(x) \le \sum c_i\,g(s_ix)h_1(x) \le \sum c_i\,g(s_ix)\,[h_1(s_i^{-1}) + \delta], \]
and consequently
\[ (f_1 : g) \le \sum c_i\,[h_1(s_i^{-1}) + \delta]. \]
We have a similar inequality for $(f_2 : g)$. Since $h_1 + h_2 \le 1$, we obtain
\[ (f_1 : g) + (f_2 : g) \le \sum c_i\,(1 + 2\delta). \]
Taking the inf over the families $\{c_i\}$, we find
\[ (f_1 : g) + (f_2 : g) \le (f : g)(1 + 2\delta) \le [(f_1 + f_2 : g) + \delta(h : g)](1 + 2\delta). \]
Divide by $(h_0 : g)$. We obtain
\[ \lambda_g(f_1) + \lambda_g(f_2) \le [\lambda_g(f_1 + f_2) + \delta\lambda_g(h)](1 + 2\delta). \]
But by (7), we know that $\lambda_g(f_1 + f_2)$ and $\lambda_g(h)$ are bounded from above
by numbers depending only on $f_1, f_2$ (and $h$, which itself depends only
on $f_1, f_2$). Hence for small $\delta$ we conclude the proof of the lemma.
For each non-zero $f \in L^+$ we let $I_f$ be the closed interval
\[ I_f = \Bigl[\frac{1}{(h_0 : f)},\ (f : h_0)\Bigr]. \]
Let $I$ be the Cartesian product of all intervals $I_f$. Then $I$ is compact by
Tychonoff's theorem. Each $g \in L^+$, $g \ne 0$ gives rise to a map $\lambda_g$, which is
determined by its values $\lambda_g(f)$ for $f \in L^+$, $f \ne 0$. We may therefore view
each $\lambda_g$ as a point of $I$. For each open neighborhood $V$ of $e$ in $G$ let $S_V$
be the closure of the set of all $\lambda_g$ with $g$ having support in $V$. Then $S_V$ is
compact, and the collection of compact sets $S_V$ has the finite intersection
property because
\[ S_{V_1} \cap \cdots \cap S_{V_n} \supset S_{V_1 \cap \cdots \cap V_n}. \]
By compactness, there exists an element $\lambda$ in the intersection of all sets
$S_V$. We contend that $\lambda$ is additive. This is immediate, because given $\epsilon$
and any neighborhood $V$ of $e$, and $f_1, f_2, f_3 = f_1 + f_2$ we can find
$g \in L^+$ having support in $V$ such that
\[ |\lambda(f_i) - \lambda_g(f_i)| < \epsilon \quad\text{for } i = 1, 2, 3. \]
Using Lemma 3.1, we conclude that $\lambda$ is additive.
If $f \in L$, we can write $f = f_1 - f_2$ with $f_1, f_2 \in L^+$. We define
\[ \lambda(f) = \lambda(f_1) - \lambda(f_2). \]
The additivity of $\lambda$ shows that this is well defined, i.e. independent of the
choice of $f_1, f_2$, and it is immediately verified that $\lambda$ is then linear on $L$.
Furthermore, from the properties of $\lambda_g$, we also conclude that $\lambda$ is left
invariant, and that for any $f \in L^+$ we have
\[ \frac{1}{(h_0 : f)} \le \lambda(f) \le (f : h_0). \]
In particular, $\lambda$ is non-trivial. This concludes the proof of the existence
of the Haar functional.
XII, §4. MEASURES ON FACTOR GROUPS AND HOMOGENEOUS SPACES
Let $H$ be a closed subgroup of $G$ and let $dy$ be a Haar measure on $H$.
Let $f \in C_c(G)$. Then the function
\[ u \mapsto \int_H f(uy)\,dy \]
is continuous on $G$, as one verifies at once from the uniform continuity
of $f$. Furthermore, it is constant on left cosets of $H$, because of the left
invariance of the Haar measure on $H$, and has compact support on $G/H$.
Thus if we write $\bar u$ for elements of $G/H$, there exists a unique function
$f^H \in C_c(G/H)$ such that
\[ f^H(\bar u) = \int_H f(uy)\,dy. \]
Theorem 4.1. Let $H$ be a closed subgroup of $G$. The map
\[ f \mapsto f^H \]
is a linear map of $C_c(G)$ onto $C_c(G/H)$. If $H$ is normal, and $d\bar u$ is a
Haar measure on $G/H$, then the functional
\[ f \mapsto \int_{G/H}\int_H f(uy)\,dy\,d\bar u \]
is a Haar functional on $G$.
Proof. That the repeated integral is a positive functional and is left
invariant is obvious. We must show that it is non-trivial. This will come
from the first statement, valid even when $H$ is not normal, and easily
proved as follows. Let $\pi: G \to G/H$ be the natural map. Let $f' \in C_c(G/H)$
and let $K'$ be the support of $f'$. We know from §1 that there exists a
compact $K$ in $G$ such that $\pi(K) = K'$. Let $g \in C_c(G)$ be a positive function
which is $> 0$ on $K$. Then $g^H$ will be $> 0$ on $K'$. Let
\[ h(x) = \frac{f'(\pi(x))}{g^H(\pi(x))} \quad\text{if } g^H(\pi(x)) \ne 0, \qquad h(x) = 0 \quad\text{otherwise.} \]
Then $h$ is continuous on $G$, and is constant on cosets of $H$. Let $f = gh$.
Then it is clear that $f^H = f'$, thus proving our first assertion, and the
theorem.
When $H$ is not normal, we can still say something, and we cast it in a
slightly more general context. Let $S$ be a locally compact Hausdorff
space, and $G$ our group. We say that $G$ operates on $S$ if we are given a
continuous map
\[ G \times S \to S \]
satisfying the conditions
\[ x(yu) = (xy)u \quad\text{and}\quad eu = u \]
for all $x, y \in G$, $u \in S$ and $e$ the unit element of $G$. Then for each $x \in G$
we have a homeomorphism $\tau_x : S \to S$ given by $\tau_xu = xu$.
The coset space of a closed subgroup $H$ is an example of the above,
because for any coset $yH$ we can define $\tau_x(yH) = xyH$. The operation is
obviously continuous. A homogeneous space is a space on which $G$
operates transitively. Such a space is then isomorphic to $G/H$ with some
closed subgroup $H$.
Let $\lambda$ be a positive functional on $C_c(S)$. We shall say that $\lambda$ is relatively
invariant (for the given operation of $G$) if for each $a \in G$ there exists
a number $\psi(a)$ such that for all $f \in C_c(S)$ we have
\[ \lambda(\tau_af) = \psi(a)\lambda(f). \]
As before, we define $\tau_af(s) = f(a^{-1}s)$ for $a \in G$, $s \in S$. It follows immediately
that $\psi : G \to \mathbf{R}^+$ is a continuous homomorphism into the multiplicative
group of positive reals, called the character of the functional. We
define a relatively invariant measure similarly.
As with Haar measures, we have
Theorem 4.2. The map J.l.1-+ dJ.l. is a bijection between the set of (1-
regular positive relatively invariant measures on S and the set of positive
relatively invariant functionals on Cc(S).
In the case of the coset space, we then have the analog of Theorem
4.1.
Theorem 4.3. Let G be a locally compact group and H a closed sub-

group. Let du be a relatively invariant non-zero positive functional on
C.(GIH) with character "', and let dy be a Haar measure on H . Then
ff
the map
fl-+ f(uy)",-1(uy) dy du
GI H H
is a Haar functional on G.
Proof. Our map is obviously linear, positive, and non-trivial because
of the first statement of Theorem 4.1. There remains to show that it is
left invariant. But
$$\int_{G/H} \int_H f(auy)\psi^{-1}(uy)\,dy\,du = \psi(a) \int_{G/H} \int_H f(auy)\psi^{-1}(auy)\,dy\,du.$$
The outer integral on the right is the integral of $(f\psi^{-1})^H$ translated by
$a^{-1}$ and hence, by the relative invariance of $du$ and the definition of the
character, is the same as the integral of $(f\psi^{-1})^H$ times $\psi(a)^{-1}$. This yields
what we wanted.
Example 1. In Exercise 4 you will show that the multiplicative group
$\mathbf{C}^*$ of non-zero complex numbers is isomorphic to $\mathbf{R}^+ \times T$, where $T$ is the circle,
and that Haar measure on $\mathbf{C}^*$ is given by $r^{-1}\,dr\,d\theta$ (in terms of the usual
polar coordinates).
Example 2. Let $G = GL(n, \mathbf{R})$ be the group of invertible real $n \times n$
matrices. In Exercise 5 you will show that Haar measure on $G$ is given
by $dx/|\det x|^n$ for $x \in G$.
Example 3. Let $G$ be the subgroup of $GL(2, \mathbf{R})$ consisting of all
matrices
$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}.$$
In Exercise 9 you will show that right Haar measure on $G$ is the product
measure on $\mathbf{R}^* \times \mathbf{R}$, and is not equal to left Haar measure.
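Example 4. SU(2). This is the special unitary group, defined here to
be the group of $2 \times 2$ complex matrices of the form
$$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \quad \text{with } \alpha\bar\alpha + \beta\bar\beta = 1.$$
Write $\alpha = a_1 + ia_2$ and $\beta = b_1 + ib_2$ with $a_i$, $b_i$ real. Then there is a
bijection of $\mathbf{R}^4$ with the real vector space of matrices
$$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix},$$
and under this bijection, SU(2) corresponds to the sphere $S^3$ defined by
the equation
$$a_1^2 + a_2^2 + b_1^2 + b_2^2 = 1.$$
The reader can prove at once that euclidean measure on $\mathbf{R}^4$ given by
$$da_1\,da_2\,db_1\,db_2$$
is invariant by the action of SU(2) on $\mathbf{R}^4$.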

Define
and
$\theta$ = angle of $\beta$.
Then we leave as an exercise to show that
and $(u_1, u_2, \theta)$ define a $C^\infty$ isomorphism of $S^3$ on SU(2). Finally, let
$f \in C_c(SU(2))$, and let $F$ be the corresponding function in $C_c(S^3)$, where
$F = F(u_1, u_2, \theta)$ is a function of the above coordinates on $S^3$. Then the
Haar measure $\mu$ on SU(2) can be expressed in terms of the coordinates
by the formula
$$\int_{SU(2)} f\,d\mu = \frac{1}{2\pi^2} \int_{-1}^{1} \int_{-\sqrt{1 - u_1^2}}^{\sqrt{1 - u_1^2}} \int_0^{2\pi} F(u_1, u_2, \theta)\,d\theta\,du_2\,du_1.$$
We leave the verification as Exercise 12. This normalization gives the

compact group SU(2) measure 1.
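The normalization claim can be checked numerically. The following is a minimal sketch, assuming the formula is read as $\frac{1}{2\pi^2}\int_{-1}^{1}\int_{-\sqrt{1-u_1^2}}^{\sqrt{1-u_1^2}}\int_0^{2\pi} F\,d\theta\,du_2\,du_1$ and taking $F = 1$; the function name `su2_total_mass` and the midpoint rule are illustrative choices, not part of the text.

```python
import math

def su2_total_mass(n=2000):
    """Midpoint-rule value of (1/(2 pi^2)) times the triple integral of F = 1."""
    total = 0.0
    du = 2.0 / n
    for i in range(n):
        u1 = -1.0 + (i + 0.5) * du
        # For F = 1 the two inner integrals are just interval lengths:
        # 2*sqrt(1 - u1^2) for u2 and 2*pi for theta.
        total += 2.0 * math.sqrt(1.0 - u1 * u1) * (2.0 * math.pi) * du
    return total / (2.0 * math.pi ** 2)

mass = su2_total_mass()
```

The outer integral computes the area $\pi$ of the unit disc, so the total is $2\pi \cdot \pi / (2\pi^2) = 1$, and `mass` should be close to 1 up to quadrature error.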
XII, §5. EXERCISES
We let $G$ be a locally compact Hausdorff group and $\mu$ a Haar measure.
1. Identify $\mathbf{C}$ with $\mathbf{R}^2$. Let $\mu$ be Lebesgue (Haar) measure on $\mathbf{C}$. Let $a \in \mathbf{C}$, and
let $z$ denote the general element of $\mathbf{C}$. Show that $d\mu(az) = |a|^2\,d\mu(z)$, as
functionals on $C_c(\mathbf{C}) = C_c(\mathbf{R}^2)$.
2. Let $T$ be the circle group and let $H$ be a discrete abelian group which is not
countable. Let $t$ be a fixed element of $T$ and let $\{x_i\}$ ($i \in I$) be a non-countable
subset of $H$. Let $S$ be the set of all pairs $(t, x_i)$, $i \in I$. Show that $S$
is discrete in $T \times H$, and that if $\mu$ is Haar measure on $T \times H$, then $S$ has
infinite measure. Show that all compact subsets of $S$ have measure 0. Thus
Haar measure is not regular.
3. Show that $G$ is compact if and only if $\mu(G)$ is finite. Show that $G$ is discrete
if and only if the set consisting of $e$ alone has measure $> 0$.
4. Let $\mathbf{C}^*$ be the multiplicative group of complex numbers $\neq 0$. Let $\mathbf{R}^+$ denote
the multiplicative group of real numbers $> 0$. Show that $\mathbf{C}^*$ is isomorphic to
$\mathbf{R}^+ \times T$ (where $T$ is the circle group) under the map $(r, u) \mapsto ru$. We write
$u = e^{2\pi i\theta}$. Show that the Haar integral on $\mathbf{C}^*$ is given by
$$f \mapsto \int_0^1 \int_0^\infty f(re^{2\pi i\theta})\,\frac{1}{r}\,dr\,d\theta.$$
5. Let $G = GL(n, \mathbf{R})$ be the group of invertible real $n \times n$ matrices. Show that Haar
measure on $G$ is given by $dx/|\det x|^n$, if $dx$ is a Haar measure on the
$n^2$-dimensional space of all $n \times n$ matrices. (Use the change of variables
formula.)
6. Let $f \in C_c(G)$ and let $a \in G$. We denote by $r_a f$ or $f^a$ the right translate of $f$,
defined by
For each fixed $a$, show that there exists a number $\Delta(a)$ such that for all
$f \in C_c(G)$ we have
$$\int_G f(xa^{-1})\,dx = \Delta(a) \int_G f(x)\,dx.$$
Show that $\Delta(ab) = \Delta(a)\Delta(b)$, and that $\Delta$ is continuous, as a map of $G$ into
$\mathbf{R}^*$. We call $\Delta$ the modular function on $G$.
7. If $\Delta$ is the modular function on $G$, show that
$$\int_G f(x^{-1})\Delta(x^{-1})\,dx = \int_G f(x)\,dx,$$
where $dx$ is Haar measure, and that $\Delta(x^{-1})\,dx$ is right Haar measure.
8. If $G$ is compact and $\psi: G \to \mathbf{R}^+$ is a continuous homomorphism into $\mathbf{R}^+$,
show that $\psi$ is trivial, i.e. $\psi(G) = 1$. In this case, Haar measure is also right
invariant.
9. Compute the modular function for the group $G$ of all affine maps $x \mapsto ax + b$
with $a \in \mathbf{R}^*$ and $b \in \mathbf{R}$. In fact, show that $\Delta(a, b) = a$. In this case the right
Haar measure is not equal to the left Haar measure. Show that the right
Haar measure is the Cartesian product measure on $\mathbf{R}^* \times \mathbf{R}$.
10. Let $G$ be a locally compact group with Haar measure, let $\Delta$ be the modular
function as in Exercise 7. Let $M^1$ be the set of regular complex measures on
$G$.
(a) Just as in Exercise 8 of Chapter IX, prove that $M^1$ is a Banach algebra, if
we define the convolution $m * m'$ to be the measure associated with the
functional
$$f \mapsto \int\!\!\int f(xy)\,dm(x)\,dm'(y),$$
for $f \in C_c(G)$.
(b) For $m \in M^1$, define $m^\vee$ to be the direct image of $m$ under the mapping
$x \mapsto x^{-1}$ of $G$. Show that $\|m^\vee\| = \|m\|$.
(c) If $\mu$ is a right Haar measure, prove that $\mu^\vee = \Delta\mu$.
11. Let $G$ be a compact abelian group. By a character of $G$ we mean a continuous
homomorphism $\psi: G \to \mathbf{C}^*$ into the multiplicative group of non-zero
complex numbers.
(a) Show that the values of $\psi$ lie on the unit circle.
(b) If $G = \mathbf{R}^n/\mathbf{Z}^n$ is an $n$-torus, show that the characters separate points.
Assume this for the general case.
(c) Let $\sigma: G \to G$ be a topological and algebraic automorphism of $G$, or an
automorphism for short. Show that $\sigma$ preserves Haar measure, and induces
a norm-preserving linear map $T: L^2(\mu) \to L^2(\mu)$ by $f \mapsto f \circ \sigma$.
(d) If $\psi$ is a non-trivial character on $G$, show that $\int \psi\,d\mu = 0$. If $\psi$ is trivial,
that is $\psi(G) = 1$, then $\int \psi\,d\mu = 1$, assuming that $\mu(G) = 1$, which we do.
Prove that the characters generate an algebra which is dense for the sup
norm in the algebra of continuous functions. Prove that the characters
form a Hilbert basis for $L^2$.
(e) Let $f$ be an eigenvector for $T$ in $L^2$, with eigenvalue $\alpha$, that is $Tf = \alpha f$.
Show that $\alpha$ is a root of unity. [Hint: First, $|\alpha| = 1$. Then write the
Fourier expansion $f = \sum c_\psi \psi$ in $L^2$, and observe that $c_{T\psi} = \alpha^{-1} c_\psi$. Use
the fact that $\sum |c_\psi|^2 < \infty$ whence if $c_\psi \neq 0$, then for some $n$, $T^n\psi = \psi$ and
$\alpha^n = 1$.]
12. Prove the statements made in Example 4, concerning Haar measure on SU(2).
You may use the following trick. Let hE Cc(R+) be such that
$$\int_0^\infty h(r)r^3\,dr = 1.$$
Show that the functional
defines a functional on C(S3) which is invariant under the action of SU(2).

PART FOUR
Calculus
The differential calculus is an essential tool, and some of it will be used

in some of the later applications. Readers can bypass it until they come
to a place where it is used. The differential calculus in Banach spaces
was developed quite a while ago, in the 1920s, by Fréchet, Graves,
Hildebrandt, and Michal. Its recent return to fashion, after a period
during which it was somewhat forgotten, is due to increasingly fruitful
applications to function spaces in various contexts of analysis and
geometry.
For instance, at the end of Chapter XIV, in the exercises, we show
how the calculus in Banach spaces can be used by describing the start of
the calculus of variations.
In functional analysis, one considers at the very least functions from
scalars into Banach or Hilbert spaces, giving rise to curves (real or
complex) in such spaces.
We shall also use the elementary integral of Banach-valued continuous
functions in the spectral theory of Chapter XVI, §1, in connection with
Banach-valued analytic functions.
The Morse-Palais lemma of Chapter XVIII gives a nice illustration of
the second derivative of a map in Hilbert space.
In the part on global analysis, readers will see the calculus used
in dealing with finite dimensional manifolds, especially for integration
theory, as in Stokes' theorem. For applications to infinite dimensional
manifolds, see [La 2].
As a matter of exposition, we prefer to develop ab ovo the integral of
step maps from an interval into a Banach space, and its extension to the
uniform closure of the space of step maps, which can be done much
more easily than the construction of the general $L^1$ integral.
CHAPTER XIII
Differential Calculus
Throughout this chapter and the next, we let E, F, G denote

Banach spaces.
XIII, §1. INTEGRATION IN ONE VARIABLE
Let $[a, b]$ be a closed interval, and $E$ a Banach space. By a step map
$f: [a, b] \to E$ we mean a map for which there exists a partition
$$P = \{a = a_0 \leq a_1 \leq \cdots \leq a_n = b\}$$
and elements $v_1, \ldots, v_n \in E$ such that if $a_{i-1} < t < a_i$, then $f(t) = v_i$. We
then say that $f$ is step with respect to $P$. The notion of a refinement of a
partition is the usual one, and if $f$, $g$ are two step maps of $[a, b]$ into $E$,
then there exists a partition $P$ such that both $f$, $g$ are step with respect
to $P$. From this we see that the step maps form a subspace of the space
of all bounded maps, and we deal with the sup norm on this space.
We define the integral of a step map $f$ with respect to a partition $P$ by
$$J_P(f) = \sum_{i=1}^{n} (a_i - a_{i-1}) v_i,$$
the notation being as above. This is in fact independent of $P$, and we
write simply $J(f)$ or $J_a^b(f)$ to specify the interval $[a, b]$. It is then easily
seen that $J$ is linear, and that $|J(f)| \leq (b - a)\|f\|$, so $J$ is continuous,
with bound $b - a$. We can therefore extend $J$ to the closure of the space
of step maps by the linear extension theorem. If $f$ lies in this closure, we
denote $J(f)$ from now on by
$$\int_a^b f = \int_a^b f(t)\,dt$$
and call it the integral. If $a \leq c \leq b$, then one verifies without difficulty
that
$$(1) \qquad \int_a^b f = \int_a^c f + \int_c^b f.$$
If $a \leq c < d \leq b$, we define
$$\int_d^c f = -\int_c^d f.$$
Then formula (1) actually holds for any three points $a$, $b$, $c$ in any order,
lying in an interval on which $f$ is in the closure of the space of step
maps.
Since a continuous map is uniformly continuous on a compact set,
one concludes that the continuous maps of [a, b] into E lie in the closure
of the space of step maps, so that the integral is defined over continuous
maps.
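If $E = E_1 \times \cdots \times E_n$ is a product of Banach spaces, and
$$f: [a, b] \to E$$
is represented by coordinate maps $f_j: [a, b] \to E_j$, then it is trivially
verified that
$$\int_a^b f = \Big(\int_a^b f_1, \ldots, \int_a^b f_n\Big).$$
If $E = \mathbf{R}$, and $f \geq 0$, then
$$\int_a^b f \geq 0,$$
as one sees first for step maps, and then by continuity for uniform limits
of step maps.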
For convenience, the closure of the space of step maps will be called
the space of regulated maps. Thus a map is called regulated if it is a
uniform limit of step maps.
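The construction can be imitated numerically. The sketch below is only an illustration under assumptions of our own: $E = \mathbf{R}^2$ represented by tuples, a continuous curve approximated by the step map that takes its midpoint value on each subinterval, and hypothetical helper names `step_integral` and `integrate_by_steps`.

```python
import math

def step_integral(partition, values):
    """Integral of a step map: the sum of (a_i - a_{i-1}) * v_i, with v_i tuples in R^d."""
    dim = len(values[0])
    total = [0.0] * dim
    for i, v in enumerate(values):
        length = partition[i + 1] - partition[i]
        for k in range(dim):
            total[k] += length * v[k]
    return tuple(total)

def integrate_by_steps(f, a, b, n):
    """Integrate the step map equal to f(midpoint) on each of n equal subintervals."""
    partition = [a + (b - a) * i / n for i in range(n + 1)]
    values = [f(0.5 * (partition[i] + partition[i + 1])) for i in range(n)]
    return step_integral(partition, values)

# A continuous curve in R^2; its integral over [0, pi] is (0, 2).
curve = lambda t: (math.cos(t), math.sin(t))
approx = integrate_by_steps(curve, 0.0, math.pi, 1000)
```

As the partition is refined, the step maps converge uniformly to the curve, and the bound $|J(f)| \leq (b - a)\|f\|$ forces their integrals to converge as well.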
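Proposition 1.1. Let $\lambda: E \to F$ be a continuous linear map. If $f: [a, b] \to E$
is regulated, then $\lambda \circ f$ is regulated and
$$\int_a^b \lambda \circ f = \lambda\Big(\int_a^b f\Big).$$
This follows immediately from the definitions. Indeed, if $f$ is the uniform
limit of a sequence of step maps $\{f_n\}$, then each $\lambda \circ f_n$ is a step map
of $[a, b]$ into $F$, and the sequence $\{\lambda \circ f_n\}$ clearly converges uniformly to $\lambda \circ f$.
For a step map $f_n$ we have directly from the definition that
$$\int_a^b \lambda \circ f_n = \lambda\Big(\int_a^b f_n\Big).$$
Taking the limit proves our formula.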
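For step maps the identity is pure linearity, which a small computation makes visible. This is a sketch under assumptions of our own: $E = \mathbf{R}^2$, $F = \mathbf{R}$ (written as 1-tuples), a hypothetical linear map `lam`, and midpoint step maps standing in for a regulated map.

```python
import math

def step_integral(f, a, b, n):
    """Integral of the step map taking the value f(midpoint) on n equal subintervals."""
    width = (b - a) / n
    values = [f(a + (i + 0.5) * width) for i in range(n)]
    dim = len(values[0])
    return tuple(width * sum(v[k] for v in values) for k in range(dim))

def lam(v):
    """A continuous linear map R^2 -> R, returned as a 1-tuple so both sides share a type."""
    return (2.0 * v[0] - 3.0 * v[1],)

curve = lambda t: (t * t, math.cos(t))

# Left side: integrate lam o f.  Right side: apply lam to the integral of f.
left = step_integral(lambda t: lam(curve(t)), 0.0, 1.0, 500)
right = lam(step_integral(curve, 0.0, 1.0, 500))
```

The two values agree up to floating-point rounding, for every choice of the number of subintervals.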
XIII, §2. THE DERIVATIVE AS A LINEAR MAP
Let $U$ be open in $E$, and let $x \in U$. Let $f: U \to F$ be a map. We shall
say that $f$ is differentiable at $x$ if there exists a continuous linear map
$\lambda: E \to F$ and a map $\psi$ defined for all sufficiently small $h$ in $E$, with
values in $F$, such that
$$\lim_{h \to 0} \psi(h) = 0,$$
and such that
$$(*) \qquad f(x + h) = f(x) + \lambda(h) + |h|\psi(h).$$
Setting $h = 0$ shows that we may assume that $\psi$ is defined at 0 and that
$\psi(0) = 0$. The preceding formula still holds.
Equivalently, we could replace the term $|h|\psi(h)$ by a term $\varphi(h)$ where
$\varphi$ is a map such that
$$\lim_{h \to 0} \frac{\varphi(h)}{|h|} = 0.$$
The limit is taken of course for $h \neq 0$, otherwise the quotient does not
make sense.
A mapping $\varphi$ having the preceding limiting property is said to be $o(h)$
for $h \to 0$.
We view the definition of the derivative as stating that near $x$, the
values of $f$ can be approximated by a linear map $\lambda$, except for the
additive term $f(x)$, of course, with an error term described by the limiting
properties of $\psi$ or $\varphi$ described above.
It is clear that if $f$ is differentiable at $x$, then it is continuous at $x$.
We contend that if the continuous linear map $\lambda$ exists satisfying (*),
then it is uniquely determined by $f$ and $x$. To prove this, let $\lambda_1$, $\lambda_2$ be
continuous linear maps having property (*). Let $v \in E$. Let $t$ have real
values $> 0$ and so small that $x + tv$ lies in $U$. Let $h = tv$. We have
$$f(x + h) - f(x) = \lambda_1(h) + |h|\psi_1(h) = \lambda_2(h) + |h|\psi_2(h)$$
with
$$\lim_{h \to 0} \psi_j(h) = 0$$
for $j = 1, 2$. Let $\lambda = \lambda_1 - \lambda_2$. Subtracting the two expressions for
$$f(x + tv) - f(x),$$
we find
$$\lambda_1(h) - \lambda_2(h) = |h|(\psi_2(h) - \psi_1(h)),$$
and setting $h = tv$, using the linearity of $\lambda$,
$$t(\lambda_1(v) - \lambda_2(v)) = t|v|(\psi_2(tv) - \psi_1(tv)).$$
We divide by $t$ and find
$$\lambda_1(v) - \lambda_2(v) = |v|(\psi_2(tv) - \psi_1(tv)).$$
Take the limit as $t \to 0$. The limit of the right side is equal to 0. Hence
$\lambda_1(v) - \lambda_2(v) = 0$ and $\lambda_1(v) = \lambda_2(v)$. This is true for every $v \in E$, whence
$\lambda_1 = \lambda_2$, as was to be shown.
In view of the uniqueness of the continuous linear map $\lambda$, we call it
the derivative of $f$ at $x$ and denote it by $f'(x)$ or $Df(x)$. Thus $f'(x)$ is a
continuous linear map, and we can write
$$f(x + h) - f(x) = f'(x)h + |h|\psi(h)$$
with
$$\lim_{h \to 0} \psi(h) = 0.$$
We have written $f'(x)h$ instead of $f'(x)(h)$ for simplicity, omitting a set of
parentheses. In general we shall often write
$$\lambda h$$
instead of $\lambda(h)$ when $\lambda$ is a linear map.
If $f$ is differentiable at every point $x$ of $U$, then we say that $f$ is
differentiable on $U$. In that case, the derivative $f'$ is a map
$$Df = f' : U \to L(E, F)$$
from $U$ into the space of continuous linear maps $L(E, F)$, and thus to
each $x \in U$, we have associated the linear map $f'(x) \in L(E, F)$. If $f'$ is
continuous, we say that $f$ is of class $C^1$. Since $f'$ maps $U$ into the
Banach space $L(E, F)$, we can define inductively $f$ to be of class $C^p$ if all
derivatives $D^k f$ exist and are continuous for $1 \leq k \leq p$.
If $f : [a, b] \to F$ is a map of a real variable, then its derivative
$$f'(t): \mathbf{R} \to F$$
is a linear map into the vector space $F$. However, if $\lambda: \mathbf{R} \to F$ is any
linear map, then for all $t \in \mathbf{R}$ we have
$$\lambda(t) = \lambda(t \cdot 1) = t\lambda(1).$$
Hence $\lambda$ is multiplication (on the right) by the vector $\lambda(1)$ in $F$, and we
usually may identify $\lambda$ with this vector.
XIII, §3. PROPERTIES OF THE DERIVATIVE
Sum. Let $E$, $F$ be complete normed vector spaces, and let $U$ be open in
$E$. Let $f, g: U \to F$ be maps which are differentiable at $x \in U$. Then
$f + g$ is differentiable at $x$ and
$$(f + g)'(x) = f'(x) + g'(x).$$
If $c$ is a number, then
$$(cf)'(x) = cf'(x).$$
Proof. Let $\lambda_1 = f'(x)$ and $\lambda_2 = g'(x)$ so that
$$f(x + h) - f(x) = \lambda_1 h + |h|\psi_1(h),$$
$$g(x + h) - g(x) = \lambda_2 h + |h|\psi_2(h),$$
where $\lim_{h \to 0} \psi_i(h) = 0$. Then
$$\begin{aligned}
(f + g)(x + h) - (f + g)(x) &= f(x + h) + g(x + h) - f(x) - g(x) \\
&= \lambda_1 h + \lambda_2 h + |h|(\psi_1(h) + \psi_2(h)) \\
&= (\lambda_1 + \lambda_2)(h) + |h|(\psi_1(h) + \psi_2(h)).
\end{aligned}$$
Since $\lim_{h \to 0} (\psi_1(h) + \psi_2(h)) = 0$, it follows by definition that
$$\lambda_1 + \lambda_2 = (f + g)'(x),$$
as was to be shown. The statement with the constant is equally clear.
Product. Let $F_1$, $F_2$, $G$ be complete normed vector spaces, and let
$F_1 \times F_2 \to G$ be a continuous bilinear map. Let $U$ be open in $E$ and let
$f: U \to F_1$ and $g: U \to F_2$ be maps differentiable at $x \in U$. Then the
product map $fg$ is differentiable at $x$ and
$$(fg)'(x) = f'(x)g(x) + f(x)g'(x).$$
Before giving the proof, we make some comments on the meaning of
the product formula. The linear map represented by the right-hand side
is supposed to mean the map
$$v \mapsto (f'(x)v)g(x) + f(x)(g'(x)v).$$
Note that $f'(x): E \to F_1$ is a linear map of $E$ into $F_1$, and when applied to
$v \in E$ yields an element of $F_1$. Furthermore, $g(x)$ lies in $F_2$, and so we
can take the product
$$(f'(x)v)g(x) \in G.$$
Similarly for $f(x)(g'(x)v)$. In practice we omit the extra set of parentheses,
and write simply
$$f'(x)vg(x).$$
Proof. Changing the norm on $G$ if necessary, we may assume that
$$|vw| \leq |v||w|$$
for $v \in F_1$, $w \in F_2$. We have:
$$\begin{aligned}
f(x + h)g(x + h) - f(x)g(x) &= f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x) \\
&= f(x + h)(g(x + h) - g(x)) + (f(x + h) - f(x))g(x) \\
&= f(x + h)(g'(x)h + |h|\psi_2(h)) + (f'(x)h + |h|\psi_1(h))g(x) \\
&= f(x + h)g'(x)h + |h|f(x + h)\psi_2(h) + f'(x)hg(x) + |h|\psi_1(h)g(x) \\
&= f(x)g'(x)h + f'(x)hg(x) + (f(x + h) - f(x))g'(x)h \\
&\qquad + |h|f(x + h)\psi_2(h) + |h|\psi_1(h)g(x).
\end{aligned}$$
The map
$$h \mapsto f(x)g'(x)h + f'(x)hg(x)$$
is the linear map of $E$ into $G$, which is supposed to be the desired
derivative. It remains to be shown that each of the other three terms
appearing on the right is of the desired type, namely $o(h)$. This is
immediate. For instance,
$$|(f(x + h) - f(x))g'(x)h| \leq |f(x + h) - f(x)||g'(x)||h|$$
and
$$\lim_{h \to 0} |f(x + h) - f(x)||g'(x)| = 0$$
because $f$ is continuous, being differentiable. The others are equally
obvious, and our property is proved.
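Quotient. Assume that $A$ is a Banach algebra with unit $e$, and let $U$ be
the open set of invertible elements. Then the map $u \mapsto u^{-1}$ is differentiable
on $U$, and its derivative at a point $u_0$ is given by
$$v \mapsto -u_0^{-1} v u_0^{-1}.$$
Proof. We have
$$\begin{aligned}
(u_0 + h)^{-1} - u_0^{-1} &= (u_0(e + u_0^{-1}h))^{-1} - u_0^{-1} \\
&= (e + u_0^{-1}h)^{-1}u_0^{-1} - u_0^{-1} \\
&= (e - u_0^{-1}h + o(h))u_0^{-1} - u_0^{-1} \\
&= -u_0^{-1}hu_0^{-1} + o(h).
\end{aligned}$$
The third equality uses the geometric series $(e + w)^{-1} = e - w + w^2 - \cdots$,
valid for $|w| < 1$, applied to $w = u_0^{-1}h$.
This proves that the derivative is what we said it is.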
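The formula can be tested in the Banach algebra of real $2 \times 2$ matrices. The sketch below assumes matrices stored as nested lists and uses small hand-written helpers (`mat_mul`, `mat_inv`, `mat_add`, all hypothetical names) so that it needs no external library; it compares $(u_0 + tv)^{-1} - u_0^{-1}$ with $-t\,u_0^{-1} v u_0^{-1}$.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def mat_add(a, b, scale=1.0):
    """Return a + scale * b."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def max_entry(a):
    return max(abs(a[i][j]) for i in range(2) for j in range(2))

u0 = [[2.0, 1.0], [0.5, 3.0]]
v = [[0.3, -0.2], [0.1, 0.4]]
u0_inv = mat_inv(u0)
# Predicted derivative applied to v: -u0^{-1} v u0^{-1}.
predicted = [[-x for x in row] for row in mat_mul(mat_mul(u0_inv, v), u0_inv)]

def remainder(t):
    """Size of ((u0 + t v)^{-1} - u0^{-1} - t * predicted) divided by t."""
    h = [[t * x for x in row] for row in v]
    actual = mat_add(mat_inv(mat_add(u0, h)), u0_inv, scale=-1.0)
    return max_entry(mat_add(actual, predicted, scale=-t)) / t

remainders = [remainder(t) for t in (1e-1, 1e-2, 1e-3)]
```

The quotient decreases with $t$, consistent with the remainder being $o(h)$.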
Chain Rule. Let $U$ be open in $E$ and let $V$ be open in $F$. Let $f: U \to V$
and $g: V \to G$ be maps. Let $x \in U$. Assume that $f$ is differentiable at $x$
and $g$ is differentiable at $f(x)$. Then $g \circ f$ is differentiable at $x$ and
$$(g \circ f)'(x) = g'(f(x)) \circ f'(x).$$
Before giving the proof, we make explicit the meaning of the usual
formula. Note that $f'(x): E \to F$ is a linear map, and $g'(f(x)): F \to G$ is a
linear map, and so these linear maps can be composed, and the composite
is a linear map, which is continuous because both $g'(f(x))$ and $f'(x)$
are continuous. The composed linear map goes from $E$ into $G$, as it
should.
Proof. Let $k(h) = f(x + h) - f(x)$. Then
$$g(f(x + h)) - g(f(x)) = g'(f(x))k(h) + |k(h)|\psi_1(k(h))$$
with $\lim_{k \to 0} \psi_1(k) = 0$. But
$$k(h) = f(x + h) - f(x) = f'(x)h + |h|\psi_2(h),$$
with $\lim_{h \to 0} \psi_2(h) = 0$. Hence
$$g(f(x + h)) - g(f(x)) = g'(f(x))f'(x)h + |h|g'(f(x))\psi_2(h) + |k(h)|\psi_1(k(h)).$$
The first term has the desired shape, and all we need to show is that
each of the next two terms on the right is $o(h)$. This is obvious. For
instance, we have the estimate
$$|k(h)| \leq |f'(x)||h| + |h||\psi_2(h)|$$
and
$$\lim_{h \to 0} \psi_1(k(h)) = 0$$
from which we see that $|k(h)|\psi_1(k(h)) = o(h)$. We argue similarly for the
other term.
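A finite-dimensional check of the chain rule is sketched below. The maps `f` and `g`, their hand-computed derivatives `df` and `dg`, and the central difference quotient are all illustrative assumptions; the point is only that $g'(f(x))(f'(x)v)$ matches the directional derivative of $g \circ f$.

```python
import math

def f(p):
    x, y = p
    return (x * y, x + math.sin(y))

def df(p, v):
    """f'(p) applied to v, from the hand-computed Jacobian of f."""
    x, y = p
    return (y * v[0] + x * v[1], v[0] + math.cos(y) * v[1])

def g(q):
    a, b = q
    return a * a + math.exp(b)

def dg(q, w):
    """g'(q) applied to w."""
    a, b = q
    return 2 * a * w[0] + math.exp(b) * w[1]

def chain_rule_value(p, v):
    """(g o f)'(p) v computed as g'(f(p)) applied to f'(p) v."""
    return dg(f(p), df(p, v))

def difference_quotient(p, v, t=1e-6):
    """Central difference approximation of the derivative of g o f at p in direction v."""
    plus = g(f((p[0] + t * v[0], p[1] + t * v[1])))
    minus = g(f((p[0] - t * v[0], p[1] - t * v[1])))
    return (plus - minus) / (2 * t)

p, v = (0.7, -0.3), (1.0, 2.0)
exact = chain_rule_value(p, v)
approx = difference_quotient(p, v)
```

The two numbers agree up to the accuracy of the difference quotient.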
Map with Coordinates. Let $U$ be open in $E$, let
$$f: U \to F_1 \times \cdots \times F_m,$$
and let $f = (f_1, \ldots, f_m)$ be its expression in terms of coordinate maps.
Then $f$ is differentiable at $x$ if and only if each $f_i$ is differentiable at $x$,
and if this is the case, then
$$f'(x) = (f_1'(x), \ldots, f_m'(x)).$$
Proof. This follows as usual by considering the coordinate expression
$$f(x + h) - f(x) = (f_1(x + h) - f_1(x), \ldots, f_m(x + h) - f_m(x)).$$
Assume that $f_i'(x)$ exists, so that
$$f_i(x + h) - f_i(x) = f_i'(x)h + \varphi_i(h)$$
where $\varphi_i(h) = o(h)$. Then
$$f(x + h) - f(x) = (f_1'(x)h, \ldots, f_m'(x)h) + (\varphi_1(h), \ldots, \varphi_m(h))$$
and it is clear that this last term in $F_1 \times \cdots \times F_m$ is $o(h)$. (As always, we
use the sup norm in $F_1 \times \cdots \times F_m$.) This proves that $f'(x)$ is what we said
it is. The converse is equally easy and is left to the reader.
Theorem 3.1. Let $\lambda: E \to F$ be a continuous linear map. Then $\lambda$ is
differentiable at every point of $E$ and $\lambda'(x) = \lambda$ for every $x \in E$.
Proof. This is obvious, because
$$\lambda(x + h) - \lambda(x) = \lambda(h) + 0.$$
Note therefore that the derivative of $\lambda$ is constant on $E$.
Corollary 3.2. Let $f: U \to F$ be a differentiable map, and let $\lambda: F \to G$
be a continuous linear map. Then for each $x \in U$,
$$(\lambda \circ f)'(x) = \lambda(f'(x)),$$
so that for every $v \in E$ we have
$$(\lambda \circ f)'(x)v = \lambda(f'(x)v).$$
Proof. This follows from Theorem 3.1 and the chain rule. Of course,
one can also give a direct proof, considering
$$\begin{aligned}
\lambda(f(x + h)) - \lambda(f(x)) &= \lambda(f(x + h) - f(x)) \\
&= \lambda(f'(x)h + |h|\psi(h)) \\
&= \lambda(f'(x)h) + |h|\lambda(\psi(h)),
\end{aligned}$$
and noting that $\lim_{h \to 0} \lambda(\psi(h)) = 0$.
Lemma 3.3. If $f$ is a differentiable map on an interval $[a, b]$ whose
derivative is 0, then $f$ is constant.
Proof. Suppose that $f(t) \neq f(a)$ for some $t \in [a, b]$. By the Hahn-Banach
theorem, let $\lambda$ be a functional such that $\lambda(f(t)) \neq \lambda(f(a))$. The
map $\lambda \circ f$ is differentiable, and its derivative is equal to 0. Hence $\lambda \circ f$ is
constant on $[a, b]$ by the corresponding result for scalar-valued functions, contradiction.
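Fundamental Theorem of Calculus. Let $f$ be regulated on $[a, b]$, and
assume that $f$ is continuous at a point $c$ of $[a, b]$. Then the map
$$x \mapsto \varphi(x) = \int_a^x f$$
is differentiable at $c$ and its derivative is $f(c)$.
Proof. The standard proof works, namely
$$\varphi(c + h) - \varphi(c) = \int_c^{c+h} f$$
and
$$\varphi(c + h) - \varphi(c) - hf(c) = \int_c^{c+h} (f - f(c)).$$
The right-hand side is estimated by
$$|h| \sup |f(t) - f(c)|$$
for $t$ between $c$ and $c + h$, thus proving that the derivative is $f(c)$.
In particular, if $f$ is differentiable on $[a, b]$ and $f'$ is continuous, then
$$f(b) - f(a) = \int_a^b f'(t)\,dt,$$
because $x \mapsto \int_a^x f'$ and $f$ have the same derivative, and so differ by a
constant by Lemma 3.3.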
XIII, §4. MEAN VALUE THEOREM
The mean value theorem essentially relates the values of a map at two
different points by means of the intermediate values of the map on the
line segment between these two points. In vector spaces, we give an integral
form for it.
We shall be integrating curves in the space of continuous linear maps
$L(E, F)$.
We shall also deal with the association
$$L(E, F) \times E \to F$$
given by
$$(\lambda, y) \mapsto \lambda(y)$$
for $\lambda \in L(E, F)$ and $y \in E$. It is a continuous bilinear map.
Let $\alpha: J \to L(E, F)$ be a continuous map from a closed interval $J = [a, b]$
into $L(E, F)$. For each $t \in J$, we see that $\alpha(t) \in L(E, F)$ is a linear
map. We can apply it to an element $y \in E$ and $\alpha(t)y \in F$. On the other
hand, we can integrate the curve $\alpha$, and
$$\int_a^b \alpha(t)\,dt$$
is an element of $L(E, F)$. If $\alpha$ is differentiable, then $d\alpha(t)/dt$ is also an
element of $L(E, F)$.
Lemma 4.1. Let $\alpha: J \to L(E, F)$ be a continuous map from a closed
interval $J = [a, b]$ into $L(E, F)$. Let $y \in E$. Then
$$\int_a^b \alpha(t)y\,dt = \int_a^b \alpha(t)\,dt \cdot y$$
where the dot on the right means the application of the linear map
$$\int_a^b \alpha(t)\,dt$$
to the vector $y$.
Proof. Here $y$ is fixed, and the map
$$\lambda \mapsto \lambda(y) = \lambda y$$
is a continuous linear map of $L(E, F)$ into $F$. Hence our lemma follows
from the last property of the integral proved in §1.
Theorem 4.2. Let $U$ be open in $E$ and let $x \in U$. Let $y \in E$. Let
$f: U \to F$ be a $C^1$ map. Assume that the line segment $x + ty$ with
$0 \leq t \leq 1$ is contained in $U$. Then
$$f(x + y) - f(x) = \int_0^1 f'(x + ty)y\,dt = \int_0^1 f'(x + ty)\,dt \cdot y.$$
Proof. Let $g(t) = f(x + ty)$. Then $g'(t) = f'(x + ty)y$. By the fundamental
theorem of calculus we find that
$$g(1) - g(0) = \int_0^1 g'(t)\,dt.$$
But $g(1) = f(x + y)$ and $g(0) = f(x)$. Our theorem is proved, taking into
account the lemma which allows us to pull the $y$ out of the integral.
Corollary 4.3. Let $U$ be open in $E$ and let $x, z \in U$ be such that
the line segment between $x$ and $z$ is contained in $U$ (that is the segment
$x + t(z - x)$ with $0 \leq t \leq 1$). Let $f: U \to F$ be of class $C^1$. Then
$$|f(z) - f(x)| \leq |z - x| \sup |f'(v)|,$$
the sup being taken for all $v$ in the segment.
Proof. We estimate the integral, letting $x + y = z$. We find
$$\Big|\int_0^1 f'(x + ty)y\,dt\Big| \leq (1 - 0) \sup |f'(x + ty)||y|,$$
the sup being taken for $0 \leq t \leq 1$. Our corollary follows.
(Note. The sup of the norms of the derivative exists because the segment
is compact and the map $t \mapsto |f'(x + ty)|$ is continuous.)
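The estimate of Corollary 4.3 can be illustrated in $\mathbf{R}^2$ with the sup norm, where the norm of a matrix as a linear map is its largest absolute row sum. The sketch below makes several assumptions of its own: a sample map `f`, its hand-computed Jacobian, and a sup over the segment approximated by sampling (which can only underestimate the true sup, so the example is chosen with ample slack).

```python
def f(p):
    x, y = p
    return (x * x - y, x * y)

def jacobian(p):
    x, y = p
    return [[2 * x, -1.0], [y, x]]

def sup_norm(v):
    return max(abs(c) for c in v)

def operator_norm(m):
    """Norm of a matrix acting between sup-normed spaces: largest absolute row sum."""
    return max(sum(abs(c) for c in row) for row in m)

def mean_value_bound(x, z, samples=1001):
    """|z - x| times the sampled sup of |f'(v)| over the segment from x to z."""
    y = (z[0] - x[0], z[1] - x[1])
    sup_df = max(
        operator_norm(jacobian((x[0] + t * y[0], x[1] + t * y[1])))
        for t in (i / (samples - 1) for i in range(samples))
    )
    return sup_norm(y) * sup_df

x, z = (0.0, 0.0), (1.0, 2.0)
lhs = sup_norm(tuple(a - b for a, b in zip(f(z), f(x))))
rhs = mean_value_bound(x, z)
```

Here `lhs` is $|f(z) - f(x)| = 2$ and `rhs` is $2 \cdot 3 = 6$, so the inequality holds with room to spare.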
Corollary 4.4. Let $U$ be open in $E$ and let $x, z, x_0 \in U$. Assume that
the segment between $x$ and $z$ lies in $U$, and let $f: U \to F$ be of class $C^1$. Then
$$|f(z) - f(x) - f'(x_0)(z - x)| \leq |z - x| \sup |f'(v) - f'(x_0)|,$$
the sup being taken for all $v$ on the segment between $x$ and $z$.
Proof. We can either apply Corollary 4.3 to the map $g$ such that
$g(x) = f(x) - f'(x_0)x$, or argue directly with the integral:
$$f(z) - f(x) = \int_0^1 f'(x + t(z - x))(z - x)\,dt.$$
We write
$$f'(x + t(z - x)) = f'(x + t(z - x)) - f'(x_0) + f'(x_0),$$
and find
$$f(z) - f(x) = f'(x_0)(z - x) + \int_0^1 [f'(x + t(z - x)) - f'(x_0)](z - x)\,dt.$$
We then estimate the integral on the right as usual.
We shall call Theorem 4.2 or either one of its two corollaries the
mean value theorem in vector spaces. In practice, the integral form of the
remainder is always preferable and should be used as a conditioned
reflex. One big advantage it has over the others is that the integral, as a
function of $y$, is just as smooth as $f'$, and this is important in some
applications. In others, one needs only an intermediate value estimate,
and then Corollary 4.3, or especially Corollary 4.4, may suffice.
XIII, §5. THE SECOND DERIVATIVE
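Let $U$ be open in $E$ and let $f: U \to F$ be differentiable. Then
$$Df = f' : U \to L(E, F)$$
and we know that $L(E, F)$ is again a complete normed vector space.
Thus we are in a position to define the second derivative
$$D^2 f = D(Df): U \to L(E, L(E, F)).$$
We have seen in Chapter IV, §1 that we can identify $L(E, L(E, F))$ with
$L(E, E; F)$, which we denote by $L^2(E, F)$, i.e. the space of continuous
bilinear maps of $E \times E$ into $F$.
Theorem 5.1. Let $\omega: E_1 \times E_2 \to F$ be a continuous bilinear map. Then
$\omega$ is differentiable, and for each $(x_1, x_2) \in E_1 \times E_2$ and every
$$(v_1, v_2) \in E_1 \times E_2$$
we have
$$D\omega(x_1, x_2)(v_1, v_2) = \omega(x_1, v_2) + \omega(v_1, x_2),$$
so that $D\omega: E_1 \times E_2 \to L(E_1 \times E_2, F)$ is linear. Hence $D^2\omega$ is constant,
and $D^3\omega = 0$.
Proof. We have by definition
$$\omega(x_1 + h_1, x_2 + h_2) - \omega(x_1, x_2) = \omega(x_1, h_2) + \omega(h_1, x_2) + \omega(h_1, h_2).$$
The last term satisfies $|\omega(h_1, h_2)| \leq |\omega||h_1||h_2|$ and is therefore $o(h)$.
This proves the first assertion, and also the second, since each of the first two terms on
the right is linear in both $(x_1, x_2) = x$ and $h = (h_1, h_2)$. We know that
the derivative of a linear map is constant, and the derivative of a constant
map is 0, so the rest is obvious.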
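For a concrete bilinear map the expansion behind Theorem 5.1 is exact, which a few lines of arithmetic confirm. The sketch takes the dot product on $\mathbf{R}^2$ as a stand-in for the bilinear map (an assumption made only for illustration) and checks that the increment minus the proposed derivative equals the bilinear map evaluated at the increments.

```python
def omega(u, v):
    """A bilinear map R^2 x R^2 -> R (the dot product), standing in for the general case."""
    return u[0] * v[0] + u[1] * v[1]

def d_omega(x1, x2, h1, h2):
    """Proposed derivative at (x1, x2) applied to (h1, h2)."""
    return omega(x1, h2) + omega(h1, x2)

x1, x2 = (1.0, 2.0), (3.0, -1.0)
h1, h2 = (0.5, 0.25), (-2.0, 8.0)

shifted = omega((x1[0] + h1[0], x1[1] + h1[1]), (x2[0] + h2[0], x2[1] + h2[1]))
increment = shifted - omega(x1, x2)
# What is left after removing the linear part is exactly omega(h1, h2).
remainder = increment - d_omega(x1, x2, h1, h2)
```

The leftover term is quadratic in $(h_1, h_2)$, which is why it is $o(h)$.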
We consider especially a bilinear map
$$\lambda: E \times E \to F$$
and say that $\lambda$ is symmetric if we have
$$\lambda(v, w) = \lambda(w, v)$$
for all $v, w \in E$. In general, a multilinear map
$$\lambda: E \times \cdots \times E \to F$$
is said to be symmetric if
$$\lambda(v_1, \ldots, v_n) = \lambda(v_{\sigma(1)}, \ldots, v_{\sigma(n)})$$
for any permutation $\sigma$ of the indices $1, \ldots, n$. In this section we look at
the symmetric bilinear case in connection with the second derivative.
We see that we may view a second derivative $D^2 f(x)$ as a continuous
bilinear map. Our next theorem will be that this map is symmetric. We
need a lemma.
Lemma 5.2. Let $\lambda: E \times E \to F$ be a bilinear map, and assume that there
exists a map $\psi$ defined for all sufficiently small pairs $(v, w) \in E \times E$ with
values in $F$ such that
$$\lim_{(v,w) \to (0,0)} \psi(v, w) = 0,$$
and that
$$|\lambda(v, w)| \leq |\psi(v, w)||v||w|.$$
Then $\lambda = 0$.
Proof. This is like the argument which gave us the uniqueness of the
derivative. Take $v, w \in E$ arbitrary, and let $s$ be a positive real number
sufficiently small so that $\psi(sv, sw)$ is defined. Then
$$|\lambda(sv, sw)| \leq |\psi(sv, sw)||sv||sw|,$$
whence
$$s^2|\lambda(v, w)| \leq s^2|\psi(sv, sw)||v||w|.$$
Divide by $s^2$ and let $s \to 0$. We conclude that $\lambda(v, w) = 0$, as desired.
Theorem 5.3. Let $U$ be open in $E$ and let $f: U \to F$ be twice differentiable,
and such that $D^2 f$ is continuous. Then for each $x \in U$, the bilinear
map $D^2 f(x)$ is symmetric, that is
$$D^2 f(x)(v, w) = D^2 f(x)(w, v)$$
for all $v, w \in E$.
Proof. Let $x \in U$ and suppose that the open ball of radius $r$ in $E$
centered at $x$ is contained in $U$. Let $v, w \in E$ have lengths $< r/2$. Let
$$g(x) = f(x + v) - f(x).$$
Then
$$\begin{aligned}
f(x + v + w) - f(x + w) - f(x + v) + f(x) &= g(x + w) - g(x) = \int_0^1 g'(x + tw)w\,dt \\
&= \int_0^1 [Df(x + v + tw) - Df(x + tw)]w\,dt \\
&= \int_0^1 \int_0^1 D^2 f(x + sv + tw)v\,ds \cdot w\,dt.
\end{aligned}$$
Let
$$\psi(sv, tw) = D^2 f(x + sv + tw) - D^2 f(x).$$
Then
$$\begin{aligned}
g(x + w) - g(x) &= \int_0^1 \int_0^1 D^2 f(x)(v, w)\,ds\,dt + \int_0^1 \int_0^1 \psi(sv, tw)v \cdot w\,ds\,dt \\
&= D^2 f(x)(v, w) + \varphi(v, w)
\end{aligned}$$
where $\varphi(v, w)$ is the second integral on the right, and satisfies the
estimate
$$|\varphi(v, w)| \leq \sup_{s,t} |\psi(sv, tw)||v||w|.$$
The sup is taken for $0 \leq s \leq 1$ and $0 \leq t \leq 1$. If we had started with
$$g_1(x) = f(x + w) - f(x)$$
and considered $g_1(x + v) - g_1(x)$, we would have found another expression
for
$$f(x + v + w) - f(x + w) - f(x + v) + f(x),$$
namely
$$D^2 f(x)(w, v) + \varphi_1(v, w)$$
where
$$|\varphi_1(v, w)| \leq \sup_{s,t} |\psi_1(sv, tw)||v||w|.$$
But then
$$D^2 f(x)(w, v) - D^2 f(x)(v, w) = \varphi(v, w) - \varphi_1(v, w).$$
By the lemma, and the continuity of $D^2 f$ which guarantees that
$$\sup_{s,t} |\psi(sv, tw)| \quad \text{and} \quad \sup_{s,t} |\psi_1(sv, tw)|$$
satisfy the limit condition of the lemma, we now conclude that
$$D^2 f(x)(w, v) = D^2 f(x)(v, w),$$
as was to be shown.
For an application of the second derivative, cf. the Morse-Palais
lemma in Chapter XVIII. It describes the behavior of a function in a
neighborhood of a critical point in a manner used for instance in the
calculus of variations.
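The symmetry can be watched numerically. The following is a sketch under assumptions of our own: a sample function `f` on $\mathbf{R}^2$, its first derivative `df` computed by hand, and $D^2 f(p)(v, w)$ approximated by differencing $Df(\cdot)w$ in the direction $v$. It also evaluates the second difference $f(x + v + w) - f(x + w) - f(x + v) + f(x)$ used in the proof, scaled by $s^2$.

```python
import math

def f(p):
    x, y = p
    return x * x * y + x * math.exp(y)

def df(p, v):
    """Df(p) applied to v, from hand-computed first partials."""
    x, y = p
    return (2 * x * y + math.exp(y)) * v[0] + (x * x + x * math.exp(y)) * v[1]

def second_derivative(p, v, w, t=1e-5):
    """Approximate D^2 f(p)(v, w) by a central difference of Df(.)w in direction v."""
    plus = df((p[0] + t * v[0], p[1] + t * v[1]), w)
    minus = df((p[0] - t * v[0], p[1] - t * v[1]), w)
    return (plus - minus) / (2 * t)

def second_difference(p, v, w):
    """f(p + v + w) - f(p + w) - f(p + v) + f(p), the expression used in the proof."""
    pv = (p[0] + v[0], p[1] + v[1])
    pw = (p[0] + w[0], p[1] + w[1])
    pvw = (pv[0] + w[0], pv[1] + w[1])
    return f(pvw) - f(pw) - f(pv) + f(p)

p, v, w = (0.5, -0.2), (1.0, 0.3), (-0.4, 2.0)
d2_vw = second_derivative(p, v, w)
d2_wv = second_derivative(p, w, v)
s = 1e-4
scaled = second_difference(p, (s * v[0], s * v[1]), (s * w[0], s * w[1])) / (s * s)
```

Both orders of differentiation give the same value up to discretization error, and the scaled second difference approaches that common value as $s \to 0$.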
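XIII, §6. HIGHER DERIVATIVES AND TAYLOR'S FORMULA
We may now consider higher derivatives. We define
$$D^p f(x) = D(D^{p-1} f)(x).$$
Thus $D^p f(x)$ is an element of $L(E, L(E, \ldots, L(E, F) \ldots))$ which we denote
by $L^p(E, F)$. We say that $f$ is of class $C^p$ on $U$ or is a $C^p$ map if $D^k f(x)$
exists for each $x \in U$, and if
$$D^k f: U \to L^k(E, F)$$
is continuous for each $k = 0, \ldots, p$.
We have trivially $D^q D^r f(x) = D^p f(x)$ if $q + r = p$ and if $D^p f(x)$ exists.
Also the $p$-th derivative $D^p$ is linear in the sense that
$$D^p(f + g) = D^p f + D^p g$$
and
$$D^p(cf) = cD^p f.$$
If $\lambda \in L^p(E, F)$ we write
$$\lambda(v_1, \ldots, v_p) = \lambda(v_1)(v_2) \cdots (v_p).$$
If $q + r = p$, we can evaluate $\lambda(v_1, \ldots, v_p)$ in two steps, namely
$$\lambda(v_1, \ldots, v_q) \cdot (v_{q+1}, \ldots, v_p).$$
We regard $\lambda(v_1, \ldots, v_q)$ as the element of $L^{p-q}(E, F)$ given by
$$\lambda(v_1, \ldots, v_q) \cdot (v_{q+1}, \ldots, v_p) = \lambda(v_1, \ldots, v_p).$$
Lemma 6.1. Let $v_2, \ldots, v_p$ be fixed elements of $E$. Assume that $f$ is $p$
times differentiable on $U$. Let
$$g(x) = D^{p-1} f(x)(v_2, \ldots, v_p).$$
Then $g$ is differentiable on $U$ and
$$Dg(x)(v) = D^p f(x)(v, v_2, \ldots, v_p).$$
Proof. The map $g: U \to F$ is a composite of the maps
$$D^{p-1} f: U \to L^{p-1}(E, F)$$
and
$$\lambda: L^{p-1}(E, F) \to F,$$
where $\lambda$ is given by the evaluation at $(v_2, \ldots, v_p)$. Thus $\lambda$ is continuous
and linear. It is an old theorem that
$$D(\lambda \circ D^{p-1} f) = \lambda \circ D D^{p-1} f,$$
namely the corollary of Theorem 3.1. Thus
$$Dg(x)v = (D^p f(x)v)(v_2, \ldots, v_p),$$
which is precisely what we wanted to prove.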
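Theorem 6.2. Let $f$ be of class $C^p$ on $U$. Then for each $x \in U$ the map
$D^p f(x)$ is multilinear symmetric.
Proof. By induction on $p \geq 2$. For $p = 2$ this is Theorem 5.3. In
particular, if we let $g = D^{p-2} f$ we know that for $v_1, v_2 \in E$,
$$D^2 g(x)(v_1, v_2) = D^2 g(x)(v_2, v_1),$$
and since $D^p f = D^2 D^{p-2} f$ we conclude that
$$\begin{aligned}
(*) \qquad D^p f(x)(v_1, \ldots, v_p) &= (D^2 D^{p-2} f(x))(v_1, v_2) \cdot (v_3, \ldots, v_p) \\
&= (D^2 D^{p-2} f(x))(v_2, v_1) \cdot (v_3, \ldots, v_p) \\
&= D^p f(x)(v_2, v_1, v_3, \ldots, v_p).
\end{aligned}$$
Let $\sigma$ be a permutation of $(2, \ldots, p)$. By induction,
$$D^{p-1} f(x)(v_{\sigma(2)}, \ldots, v_{\sigma(p)}) = D^{p-1} f(x)(v_2, \ldots, v_p).$$
By the lemma, we conclude that
$$(**) \qquad D^p f(x)(v_1, v_{\sigma(2)}, \ldots, v_{\sigma(p)}) = D^p f(x)(v_1, v_2, \ldots, v_p).$$
From (*) and (**) we conclude that $D^p f(x)$ is symmetric because any
permutation of $(1, \ldots, p)$ can be expressed as a composition of the permutations
considered in (*) or (**). This proves the theorem.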
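For the higher derivatives, we have similar statements to those obtained
with the first derivative in relation to linear maps. Observe that if
$\omega \in L^p(E, F)$ is a multilinear map, and $\lambda \in L(F, G)$ is linear, we may compose
these
$$E \times \cdots \times E \xrightarrow{\ \omega\ } F \xrightarrow{\ \lambda\ } G$$
to get $\lambda \circ \omega$, which is a multilinear map of $E \times \cdots \times E \to G$. Furthermore,
$\omega$ and $\lambda$ being continuous, it is clear that $\lambda \circ \omega$ is also continuous.
Finally, the map
$$\lambda_*: L^p(E, F) \to L^p(E, G)$$
given by "composition with $\lambda$", namely
$$\omega \mapsto \lambda \circ \omega,$$
is immediately verified to be a continuous linear map, that is for $\omega_1$,
$\omega_2 \in L^p(E, F)$ and $c \in \mathbf{R}$ we have
$$\lambda_*(\omega_1 + \omega_2) = \lambda_*(\omega_1) + \lambda_*(\omega_2)$$
and
$$\lambda_*(c\omega_1) = c\lambda_*(\omega_1),$$
and for the continuity,
$$|\lambda \circ \omega| \leq |\lambda||\omega|,$$
so
$$|\lambda_*| \leq |\lambda|.$$
Theorem 6.3. Let $f: U \to F$ be $p$ times differentiable and let $\lambda: F \to G$ be
a continuous linear map. Then for every $x \in U$ we have
$$D^p(\lambda \circ f)(x) = \lambda \circ D^p f(x).$$
Proof. Consider the map $x \mapsto D^{p-1}(\lambda \circ f)(x)$. By induction,
$$D^{p-1}(\lambda \circ f)(x) = \lambda \circ D^{p-1} f(x).$$
By Corollary 3.2 concerning the derivative
$$D(\lambda_* \circ D^{p-1} f) = \lambda_* \circ D^p f,$$
namely the derivative of the composite map
$$U \xrightarrow{\ D^{p-1} f\ } L^{p-1}(E, F) \xrightarrow{\ \lambda_*\ } L^{p-1}(E, G),$$
we get the assertion of our theorem.
If one wishes to omit the $x$ from the notation in Theorem 6.3, then
one must write
$$D^p(\lambda \circ f) = \lambda_* \circ D^p f.$$
Occasionally, one omits the lower $*$ and writes simply $D^p(\lambda \circ f) = \lambda \circ D^p f$.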
Taylor's Formula. Let $U$ be open in $E$ and let $f: U \to F$ be of class $C^p$.
Let $x \in U$ and let $y \in E$ be such that the segment $x + ty$, $0 \leq t \leq 1$, is
contained in $U$. Denote by $y^{(k)}$ the $k$-tuple $(y, y, \ldots, y)$. Then
$$f(x + y) = f(x) + \frac{Df(x)y}{1!} + \cdots + \frac{D^{p-1} f(x)y^{(p-1)}}{(p - 1)!} + R_p$$
where
$$R_p = \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!} D^p f(x + ty)y^{(p)}\,dt.$$
Proof. We can give two proofs, the first by integration by parts as
usual, starting with the mean value theorem,
$$f(x + y) = f(x) + \int_0^1 Df(x + ty)y\,dt.$$
We consider the map $t \mapsto Df(x + ty)y$ of the interval into $F$, and the
usual product
$$\mathbf{R} \times F \to F,$$
which consists in multiplying vectors of $F$ by numbers. We let
$$u = Df(x + ty)y, \quad v = -(1 - t), \quad \text{and} \quad dv = dt.$$
This gives the next term, and then we proceed by induction, letting
$$u = D^p f(x + ty)y^{(p)} \quad \text{and} \quad dv = \frac{(1 - t)^{p-1}}{(p - 1)!}\,dt$$
at the $p$-th stage. Integration by parts yields the next term of Taylor's
formula, plus the next remainder term.
The other proof can be given by using the Hahn-Banach theorem and
applying a continuous linear function to the formula. This reduces the
proof to the ordinary case of functions of one variable, that is with
values in $\mathbf{R}$. Of course, in that case, we also proceed by induction, so
there is really not much to choose from between the two proofs.
The remainder term $R_p$ can also be written in the form
$$R_p = \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!} D^p f(x + ty)\,dt \cdot y^{(p)}.$$
The mapping
$$y \mapsto \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!} D^p f(x + ty)\,dt$$
is continuous. If $f$ is infinitely differentiable, then this mapping is
infinitely differentiable since we shall see later that one can differentiate
under the integral sign.
Estimate of the Remainder. With notation as in Taylor's formula, we
can also write
$$f(x + y) = f(x) + \frac{Df(x)y}{1!} + \cdots + \frac{D^p f(x)y^{(p)}}{p!} + \theta(y)$$
where
$$|\theta(y)| \le \sup_{0 \le t \le 1} \frac{|D^p f(x + ty) - D^p f(x)|}{p!}\, |y|^p$$
and
$$\lim_{y \to 0} \frac{\theta(y)}{|y|^p} = 0.$$
Proof. We write
$$D^p f(x + ty) - D^p f(x) = \psi(ty).$$
Since $D^p f$ is continuous, it is bounded in some ball containing $x$, and
$$\lim_{y \to 0} \psi(ty) = 0$$
uniformly in $t$. On the other hand, the remainder $R_p$ given above can be
written as
$$\int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!}\, D^p f(x)y^{(p)}\, dt + \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!}\, \psi(ty)y^{(p)}\, dt.$$
We integrate the first integral to obtain the desired $p$-th term, and estimate
the second integral by
$$\sup_{0 \le t \le 1} |\psi(ty)|\,|y|^p \int_0^1 \frac{(1 - t)^{p-1}}{(p - 1)!}\, dt,$$
where we can again perform the integration to get the estimate for the
error term $\theta(y)$.
Theorem 6.4. Let $U$ be open in $E$ and let $f: U \to F_1 \times \cdots \times F_m$ be a
map with coordinate maps $(f_1, \ldots, f_m)$. Then $f$ is of class $C^p$ if and only
if each $f_i$ is of class $C^p$, and if that is the case, then
$$D^p f = (D^p f_1, \ldots, D^p f_m).$$
Proof. We proved this for $p = 1$ in §3, and the general case follows by
induction.
Theorem 6.5. Let $U$ be open in $E$ and $V$ open in $F$. Let $f: U \to V$ and
$g: V \to G$ be $C^p$ maps. Then $g \circ f$ is of class $C^p$.
Proof. We have
$$D(g \circ f)(x) = Dg(f(x)) \circ Df(x).$$
Thus $D(g \circ f)$ is obtained by composing a lot of maps, namely as
represented in the following diagram:
$$U \xrightarrow{\;(Dg \circ f,\ Df)\;} L(F, G) \times L(E, F) \longrightarrow L(E, G),$$
where the first component is the composite $U \xrightarrow{f} V \xrightarrow{Dg} L(F, G)$ and the second component is $U \xrightarrow{Df} L(E, F)$.
If $p = 1$, then all mappings occurring on the right are continuous and so
$D(g \circ f)$ is continuous. By induction, $Dg$ and $Df$ are of class $C^{p-1}$, and
all the maps used to obtain $D(g \circ f)$ are of class $C^{p-1}$ (the last one on
the right is a composition of linear maps, and is continuous bilinear, so
infinitely differentiable by Theorem 5.1). Hence $D(g \circ f)$ is of class $C^{p-1}$,
whence $g \circ f$ is of class $C^p$, as was to be shown.
XIII, §7. PARTIAL DERIVATIVES
Consider a product $E = E_1 \times \cdots \times E_n$ of complete normed vector spaces.
Let $U_i$ be open in $E_i$ and let
$$f: U_1 \times \cdots \times U_n \to F$$
be a map. We write an element $x \in U_1 \times \cdots \times U_n$ in terms of its "coordinates",
namely $x = (x_1, \ldots, x_n)$ with $x_i \in U_i$.
We can form partial derivatives just as in the simple case when $E = \mathbf{R}^n$.
Indeed, for $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed, we consider the partial
map
$$x_i \mapsto f(x_1, \ldots, x_i, \ldots, x_n)$$
of $U_i$ into $F$. If this map is differentiable, we call its derivative the partial
derivative of $f$ and denote it by $D_i f(x)$ at the point $x$. Thus, if it exists,
$$D_i f(x) = \lambda: E_i \to F$$
is the unique continuous linear map $\lambda \in L(E_i, F)$ such that
$$f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n) = \lambda(h) + o(h)$$
for $h \in E_i$ and small enough that the left-hand side is defined.
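Theorem 7.1. Let $U_i$ be open in $E_i$ $(i = 1, \ldots, n)$ and let
$$f: U_1 \times \cdots \times U_n \to F$$
be a map. This map is of class $C^p$ if and only if each partial derivative
$$D_i f: U_1 \times \cdots \times U_n \to L(E_i, F)$$
is of class $C^{p-1}$. If this is the case, and
$$v = (v_1, \ldots, v_n) \in E_1 \times \cdots \times E_n,$$
then
$$Df(x)v = \sum_{i=1}^{n} D_i f(x)v_i.$$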
Proof. We shall give the proof just for $n = 2$, to save space. We
assume that the partial derivatives are continuous, and want to prove
that the derivative of $f$ exists and is given by the formula of the theorem.
We let $(x, y)$ be the point at which we compute the derivative, and let
$h = (h_1, h_2)$. We have
$$\begin{aligned}
f(x + h_1, y + h_2) - f(x, y) &= f(x + h_1, y + h_2) - f(x + h_1, y) + f(x + h_1, y) - f(x, y) \\
&= \int_0^1 D_2 f(x + h_1, y + th_2)h_2\, dt + \int_0^1 D_1 f(x + th_1, y)h_1\, dt.
\end{aligned}$$
Since $D_2 f$ is continuous, the map $\varphi$ given by
$$\varphi(h_1, th_2) = D_2 f(x + h_1, y + th_2) - D_2 f(x, y)$$
satisfies
$$\lim_{h \to 0} \varphi(h_1, th_2) = 0.$$
Thus we can write the first integral as
$$\begin{aligned}
\int_0^1 D_2 f(x + h_1, y + th_2)h_2\, dt &= \int_0^1 D_2 f(x, y)h_2\, dt + \int_0^1 \varphi(h_1, th_2)h_2\, dt \\
&= D_2 f(x, y)h_2 + \int_0^1 \varphi(h_1, th_2)h_2\, dt.
\end{aligned}$$
Estimating the error term given by this last integral, we find
$$\left| \int_0^1 \varphi(h_1, th_2)h_2\, dt \right| \le \sup_{0 \le t \le 1} |\varphi(h_1, th_2)|\,|h_2| \le |h| \sup |\varphi(h_1, th_2)| = o(h).$$
Similarly, the second integral yields
$$D_1 f(x, y)h_1 + o(h).$$
Adding these terms, we find that $Df(x, y)$ exists and is given by the
formula, which also shows that the map $Df = f'$ is continuous, so $f$ is of
class $C^1$. If each partial is of class $C^p$, then it is clear that $f$ is $C^p$. We
leave the converse to the reader.
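The formula $Df(x)v = \sum D_i f(x)v_i$ can be checked numerically in the simplest case $E_1 = E_2 = F = \mathbf{R}$. The sketch below uses a sample function of our own choosing and compares the sum of partials with a central difference quotient in the direction $v$; the function and all helper names are illustrative assumptions.

```python
import math

def f(x, y):
    """Sample C^1 map R x R -> R (chosen only for illustration)."""
    return x * x * y + math.sin(y)

def d1f(x, y):
    """Partial derivative with respect to the first variable."""
    return 2 * x * y

def d2f(x, y):
    """Partial derivative with respect to the second variable."""
    return x * x + math.cos(y)

def directional_difference(x, y, v1, v2, eps=1e-6):
    """Central difference quotient approximating Df(x, y)(v1, v2)."""
    forward = f(x + eps * v1, y + eps * v2)
    backward = f(x - eps * v1, y - eps * v2)
    return (forward - backward) / (2 * eps)

x, y, v1, v2 = 1.0, 2.0, 0.3, -0.2
by_formula = d1f(x, y) * v1 + d2f(x, y) * v2
by_difference = directional_difference(x, y, v1, v2)
print(by_formula, by_difference)  # should agree up to discretization error
```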
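It will be useful to have a notation for linear maps of products into
products. We treat the special case of two factors. We wish to describe
linear maps
$$\lambda: E_1 \times E_2 \to F_1 \times F_2.$$
We contend that such a linear map can be represented by a matrix
$$\begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix}$$
where each $\lambda_{ij}: E_j \to F_i$ is itself a linear map. We thus take matrices
whose components are not numbers any more but are themselves linear
maps. This is done as follows.
Suppose we are given four linear maps $\lambda_{ij}$ as above. An element of
$E_1 \times E_2$ may be viewed as a pair of elements $(v_1, v_2)$ with $v_1 \in E_1$ and
$v_2 \in E_2$. We now write such a pair as a column vector
$$\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$$
and define $\lambda(v_1, v_2)$ to be
$$\begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} \lambda_{11}v_1 + \lambda_{12}v_2 \\ \lambda_{21}v_1 + \lambda_{22}v_2 \end{pmatrix}$$
so that we multiply just as we would with numbers. Then it is clear that
$\lambda$ is a linear map of $E_1 \times E_2$ into $F_1 \times F_2$.
Conversely, let $\lambda: E_1 \times E_2 \to F_1 \times F_2$ be a linear map. We write an
element $(v_1, v_2) \in E_1 \times E_2$ in the form
$$(v_1, v_2) = (v_1, 0) + (0, v_2).$$
We also write $\lambda$ in terms of its coordinate maps $\lambda = (\lambda_1, \lambda_2)$ where
$\lambda_1: E_1 \times E_2 \to F_1$ and $\lambda_2: E_1 \times E_2 \to F_2$ are linear. Then
$$\begin{aligned}
\lambda(v_1, v_2) &= (\lambda_1(v_1, v_2), \lambda_2(v_1, v_2)) \\
&= (\lambda_1(v_1, 0) + \lambda_1(0, v_2),\ \lambda_2(v_1, 0) + \lambda_2(0, v_2)).
\end{aligned}$$
The map
$$v_1 \mapsto \lambda_1(v_1, 0)$$
is a linear map of $E_1$ into $F_1$ which we call $\lambda_{11}$. Similarly, we let
$$\lambda_{11}(v_1) = \lambda_1(v_1, 0), \qquad \lambda_{12}(v_2) = \lambda_1(0, v_2),$$
$$\lambda_{21}(v_1) = \lambda_2(v_1, 0), \qquad \lambda_{22}(v_2) = \lambda_2(0, v_2).$$
Then we can represent $\lambda$ as the matrix
$$\begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix}$$
as explained in the preceding discussion, and we see that $\lambda(v_1, v_2)$ is
given by the multiplication of the above matrix with the vertical vector
formed with $v_1$ and $v_2$.
Finally, we observe that if all $\lambda_{ij}$ are continuous, then the map $\lambda$ is
also continuous, and conversely.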
We can apply this to the case of partial derivatives, and we formulate
the result as a corollary.
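Corollary 7.2. Let $U$ be open in $E_1 \times E_2$ and let $f: U \to F_1 \times F_2$ be a
$C^p$ map. Let $f = (f_1, f_2)$ be represented by its coordinate maps
$$f_1: U \to F_1 \quad \text{and} \quad f_2: U \to F_2.$$
Then for any $x \in U$, the linear map $Df(x)$ is represented by the matrix
$$\begin{pmatrix} D_1 f_1(x) & D_2 f_1(x) \\ D_1 f_2(x) & D_2 f_2(x) \end{pmatrix}.$$
Proof. This follows by applying Theorem 7.1 to each one of the maps
$f_1$ and $f_2$, and using the definitions of the preceding discussion.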
Observe that except for the fact that we deal with linear maps, all that
precedes is treated just like the standard way for functions on open sets
of n-space, where the derivatives follow exactly the same formalism with
respect to the partial derivatives.
Theorem 7.3. Let $U$ be open in $E_1 \times E_2$ and let $f: U \to F$ be a map
such that $D_1 f$, $D_2 f$, $D_1 D_2 f$, and $D_2 D_1 f$ exist and are continuous. Then
$$D_1 D_2 f = D_2 D_1 f.$$
Proof. The proof is entirely analogous to the standard proof of the
similar result for functions of two variables, and will be left to the reader.
Actually, if we assume that $f$ is of class $C^2$, then the second derivative
$D^2 f(x)$ is represented by the matrix $(D_i D_j f(x))$, with $i, j = 1, 2$. By Theorem 5.3,
we know that $D^2 f(x)$ is symmetric, whence we conclude that
$D_1 D_2 f(x) = D_2 D_1 f(x)$.
XIII, §8. DIFFERENTIATING UNDER THE INTEGRAL SIGN
Theorem 8.1. Let $U$ be open in $E$ and let $J = [a, b]$ be an interval. Let
$f: J \times U \to F$ be a continuous map such that $D_2 f$ exists and is continuous.
Let
$$g(x) = \int_a^b f(t, x)\, dt.$$
Then $g$ is differentiable on $U$ and
$$Dg(x) = \int_a^b D_2 f(t, x)\, dt.$$
Proof. Differentiability is a property relating to a point, so let $x \in U$.
Selecting a sufficiently small open neighborhood $V$ of $x$, we can assume
that $D_2 f$ is bounded on $J \times V$. Let $\lambda$ be the linear map
$$\lambda = \int_a^b D_2 f(t, x)\, dt.$$
We investigate
$$\begin{aligned}
g(x + h) - g(x) - \lambda h &= \int_a^b [f(t, x + h) - f(t, x) - D_2 f(t, x)h]\, dt \\
&= \int_a^b \left[ \int_0^1 D_2 f(t, x + uh)h\, du - D_2 f(t, x)h \right] dt \\
&= \int_a^b \left\{ \int_0^1 [D_2 f(t, x + uh) - D_2 f(t, x)]h\, du \right\} dt.
\end{aligned}$$
We estimate:
$$|g(x + h) - g(x) - \lambda h| \le (b - a) \max |D_2 f(t, x + uh) - D_2 f(t, x)|\,|h|,$$
the maximum being taken for $0 \le u \le 1$ and $a \le t \le b$. By the relative
uniform continuity of $D_2 f$ with respect to the compact set $J \times \{x\}$, we
conclude that given $\varepsilon$ there exists $\delta$ such that whenever $|h| < \delta$ then this
maximum is $< \varepsilon$. This proves that $\lambda$ is the derivative $g'(x)$, as desired.
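To see Theorem 8.1 at work in the scalar case $E = F = \mathbf{R}$, the sketch below takes $f(t, x) = \sin(tx)$ on $J = [0, 2]$ (an illustrative choice), computes $\int D_2 f(t, x)\, dt$ by Simpson's rule, and compares it with a difference quotient of $g$. The quadrature rule, step sizes, and names are assumptions made for the example.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

A, B = 0.0, 2.0  # the interval J = [a, b]

def g(x):
    """g(x) = integral over J of f(t, x) dt with f(t, x) = sin(t x)."""
    return simpson(lambda t: math.sin(t * x), A, B)

def dg_under_integral(x):
    """Integral over J of D_2 f(t, x) dt, where D_2 f(t, x) = t cos(t x)."""
    return simpson(lambda t: t * math.cos(t * x), A, B)

def dg_difference(x, eps=1e-5):
    """Central difference quotient of g at x."""
    return (g(x + eps) - g(x - eps)) / (2 * eps)

x0 = 1.3
print(dg_under_integral(x0), dg_difference(x0))  # should agree closely
```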
XIII, §9. DIFFERENTIATION OF SEQUENCES
Theorem 9.1. Let $U$ be an open subset of a Banach space $E$, and let
$\{f_n\}$ be a sequence of $C^1$ maps of $U$ into a Banach space $F$. Assume
that $\{f_n\}$ converges pointwise to a map $f$, and also that the sequence of
derivatives $\{f_n'\}$ converges uniformly, to a mapping
$$g: U \to L(E, F).$$
Then $f$ is differentiable, and $f' = g$.
Proof. Let $x_0 \in U$. Differentiability being a local property, we can
assume without loss of generality that $U$ is an open ball centered at $x_0$.
For $x \in U$, we have by the mean value theorem applied to $f_n - f_m$:
$$|f_n(x) - f_m(x) - (f_n(x_0) - f_m(x_0))| \le |x - x_0| \sup_{y \in U} |f_n'(y) - f_m'(y)|.$$
Given $\varepsilon$ there exists $N$ such that if $m, n > N$, then
$$\|f_n' - f_m'\| < \varepsilon \quad \text{and} \quad \|f_n' - g\| < \varepsilon.$$
Letting $m$ tend to infinity, we conclude that for $n > N$ we have
$$(1) \qquad |f_n(x) - f(x) - (f_n(x_0) - f(x_0))| \le |x - x_0|\,\varepsilon.$$
Fix $n > N$. Again by the mean value theorem, there exists $\delta$ such that if
$|x - x_0| < \delta$ we have
$$(2) \qquad |f_n(x) - f_n(x_0) - f_n'(x_0)(x - x_0)| \le |x - x_0|\,\varepsilon.$$
Finally, use the fact that $\|f_n' - g\| < \varepsilon$. We conclude from (1) and (2) that
$$|f(x) - f(x_0) - g(x_0)(x - x_0)| \le 3|x - x_0|\,\varepsilon.$$
XIII, §10. EXERCISES
1. Let $U$ be open in $E$ and $V$ open in $F$. Let
$$f: U \to V \quad \text{and} \quad g: V \to G$$
be of class $C^p$. Let $x_0 \in U$. Assume that $D^k f(x_0) = 0$ for all $k = 0, \ldots, p$.
Show that $D^k(g \circ f)(x_0) = 0$ for $0 \le k \le p$. [Hint: Induction.] Also prove
that if $D^k g(f(x_0)) = 0$ for $0 \le k \le p$, then $D^k(g \circ f)(x_0) = 0$ for $0 \le k \le p$.
2. Let $f(t) = \sum c_n t^n$ be a power series with real coefficients, converging in a
circle of radius $r$. Let $A$ be a Banach algebra. Show that the map
$$x \mapsto \sum c_n x^n$$
is a $C^1$ (or even $C^\infty$) map on the disc of radius $r$ centered at the origin in $A$.
3. Let $E$, $F$ be Banach spaces, and $\mathrm{Lis}(E, F)$ the set of toplinear isomorphisms
between $E$ and $F$. Show that the map $u \mapsto u^{-1}$ from $\mathrm{Lis}(E, F)$ to $\mathrm{Lis}(F, E)$ is
differentiable, and find its derivative (as in the case of Banach algebras).
4. Let $A$ be a Banach algebra with unit $e$. Show that one can define a square
root function in a neighborhood of $e$, in such a manner that it is of class $C^1$
(or even $C^\infty$).
5. Let $Z$ be a compact topological space, $E$ a Banach space, and $F = C^0(Z, E)$
the Banach space of continuous maps of $Z$ into $E$, with the sup norm. Let $U$
be open in $E$, and let $V$ be the subset of $F$ consisting of all maps $f: Z \to E$
which map $Z$ into $U$, so $V = C^0(Z, U)$. Let $g: U \to G$ be a map of $U$ into a
Banach space $G$.
(a) If $g$ is continuous, show that the map
$$f \mapsto g \circ f$$
of $V$ into $C^0(Z, G)$ is continuous.
(b) If $g$ is of class $C^1$, show that the above map is of class $C^1$, and find a
formula for its derivative.
(c) If $g$ is of class $C^p$, show that the above map is of class $C^p$.
6. Let $J = [a, b]$ be a closed interval, and let $U$ be open in a Banach space $E$.
Let $g: U \to G$ be a $C^1$ map. Let $C^0(J, U)$ be the set of continuous maps of $J$
into $U$. Show that $C^0(J, U)$ is open in $C^0(J, E)$, and that the map
$$S_g: C^0(J, U) \to C^0(J, G)$$
is of class $C^1$. The notation means that
$$S_g(\alpha)(t) = \int_a^t g(\alpha(u))\, du.$$
Find an expression for the derivative of $S_g$.

7. Let $f$ be a map of class $C^1$ on a Banach space $E$ such that $f(tx) = tf(x)$ for
all real $t$ and all $x \in E$. Show that $f$ is linear, and in fact that $f(x) = f'(0)x$.
8. Let $f$ be a map of class $C^2$ on a Banach space $E$ such that $f(tx) = t^2 f(x)$ for
all real $t$ and all $x \in E$. Show that $f$ is quadratic, and that in fact
$$f(x) = \tfrac{1}{2} D^2 f(0)(x, x).$$

Generalize Exercises 7 and 8.
9. Let $E$ be a Banach space, and $J = [a, b]$ a closed interval. For each $C^1$
curve $\alpha: J \to E$ let the $C^1$ norm of $\alpha$ be defined by
$$\|\alpha\|_1 = \|\alpha\| + \|\alpha'\|$$
where $\alpha'$ is the derivative of $\alpha$. Show that this is a norm, and that the space
$C^1(J, E)$ of $C^1$ curves is complete under this norm.
10. Let $U$ be open in a Banach space and let $BC^p(U, F)$ be the space of maps
$f: U \to F$ into a Banach space $F$ which are of class $C^p$, and such that all
derivatives $D^k f$ are bounded, for $k = 0, \ldots, p$. Show that $BC^p(U, F)$ is a
Banach space, under the norm
$$\|f\|_{C^p} = \sup \|D^k f\|,$$
the sup being taken for $0 \le k \le p$.
11. This exercise is a starting point for the calculus of variations. Let $E$ be a
Banach space and $U$ an open subset of $\mathbf{R} \times E \times E$. Let
$$H: U \to \mathbf{R}$$
be a $C^p$ function. Let $J$ be a closed interval $[a, b]$. Let $V$ be the subset of
$C^1(J, E)$ consisting of all curves $\alpha \in C^1(J, E)$ such that the curve
$$t \mapsto (t, \alpha(t), \alpha'(t)), \quad t \in J,$$
lies in $U$.
(a) Show that $V$ is open in $C^1(J, E)$.
(b) Show that the map
$$\alpha \mapsto \int_a^b H(t, \alpha(t), \alpha'(t))\, dt = f(\alpha)$$
is a $C^p$ function on $V$, and determine its derivative.
(c) Let $g: J \to L(E, \mathbf{R})$ be a continuous function such that
$$\int_a^b g(t)\sigma(t)\, dt = 0$$
for every $C^1$ curve $\sigma: J \to E$ having the property that
$$\sigma(a) = \sigma(b) = 0.$$
Show that $g = 0$.
(d) Let $C^1(J, E, 0)$ be the subset of curves $\alpha$ in $C^1(J, E)$ such that
$$\alpha(a) = \alpha(b) = 0.$$
Show that $C^1(J, E, 0)$ is a closed subspace. Restrict the function $f$ to the
open set
$$V_0 = V \cap C^1(J, E, 0)$$
of this closed subspace. Show that if an element $\alpha \in V_0$ is a local minimum
of $f$ in $V_0$, then
for all $t \in J$. [Hint: Show that if a function has a local minimum at a
point, then its derivative is 0 at that point, and use (c).]
CHAPTER XIV
Inverse Mappings and

Differential Equations
XIV, §1. THE INVERSE MAPPING THEOREM
Both the inverse mapping theorem and the existence theorem for differen-
tial equations will be based on a basic and simple lemma in complete
metric spaces.
Lemma 1.1 (Shrinking Lemma). Let $M$ be a complete metric space, and
let $T: M \to M$ be a mapping. Assume that there exists a number $K$ with
$0 < K < 1$ such that for all $x, y \in M$ we have
$$d(Tx, Ty) \le K d(x, y),$$
where $d$ is the distance function in $M$. Then $T$ has a unique fixed point
$z$, that is a point such that $Tz = z$. If $x \in M$, then
$$z = \lim_{n \to \infty} T^n x.$$
Proof. For simplicity of notation, we assume that $M$ is a closed subset
of a Banach space. We first observe that a fixed point $z$, if it exists, is
unique because if $z_1$ is also fixed, then
$$|z - z_1| = |Tz - Tz_1| \le K|z - z_1|,$$
so $z - z_1 = 0$. Now for existence, let $m$, $n$ be positive integers and say
$n \ge m$, $n = m + r$. Then for any $x$ we have
$$|T^n x - T^m x| = |T^m T^r x - T^m x| \le K^m |T^r x - x|$$
and
$$\begin{aligned}
|x - T^r x| &\le |x - Tx| + |Tx - T^2 x| + \cdots + |T^{r-1}x - T^r x| \\
&\le (1 + K + \cdots + K^{r-1})|x - Tx|.
\end{aligned}$$
This shows that the sequence $\{T^n x\}$ is Cauchy, converging to some
element $z \in M$. This element $z$ is a fixed point because
$$|Tz - TT^n x| \le K|z - T^n x|,$$
and for $n$ sufficiently large, $TT^n x$ approaches $z$ and also $Tz$. This proves
the shrinking lemma.
We shall call $K$ in the lemma a shrinking constant for $T$.
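The iteration $x, Tx, T^2 x, \ldots$ from the lemma is easy to run on a computer. As a minimal sketch, take $M = [0, 1]$ and $T = \cos$ (our example, not the text's): $\cos$ maps $[0, 1]$ into itself and, by the mean value theorem, has shrinking constant $K = \sin 1 < 1$ there. The function name and stopping tolerance below are illustrative assumptions.

```python
import math

def iterate_to_fixed_point(T, x, tol=1e-13, max_iter=10_000):
    """Iterate x, Tx, T^2 x, ... until successive terms differ by < tol.

    Returns the last iterate and the number of applications of T.
    """
    for n in range(max_iter):
        tx = T(x)
        if abs(tx - x) < tol:
            return tx, n + 1
        x = tx
    raise RuntimeError("no convergence; is T really a shrinking map?")

K = math.sin(1.0)            # shrinking constant of cos on [0, 1]
start = 1.0
z, steps = iterate_to_fixed_point(math.cos, start)
print(z, steps)

# Bound obtained from the proof by letting r tend to infinity:
# |x - z| <= (1 + K + K^2 + ...) |x - Tx| = |x - Tx| / (1 - K).
bound = abs(start - math.cos(start)) / (1 - K)
print(abs(start - z) <= bound)
```

Because $K$ is fairly close to 1 here, convergence takes several dozen steps; a smaller shrinking constant gives faster convergence.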

Let $U$ be open in a Banach space $E$, and let $f: U \to F$ be a $C^p$ map
($p \ge 1$). We shall say that $f$ is a $C^p$-isomorphism or is $C^p$-invertible on $U$
if the image $f(U)$ is an open set $V$ in $F$, and if there exists a $C^p$ map
$$g: V \to U$$
such that $g \circ f$ and $f \circ g$ are the identity maps on $U$ and $V$ respectively.
We say that $f$ is a local $C^p$-isomorphism at a point $x$ in $U$, or is locally
$C^p$-invertible at $x$, if there exists an open set $U_1$ contained in $U$ and
containing $x$ such that the restriction of $f$ to $U_1$ is $C^p$-invertible on $U_1$.
It is clear that the composite of two $C^p$-isomorphisms is again a
$C^p$-isomorphism, and that the composite of two locally $C^p$-invertible
maps is also locally $C^p$-invertible. In other words, if $f$ is locally
$C^p$-invertible at $x$, if $f(x)$ is contained in some open set $V$, and if $g: V \to G$ is
locally $C^p$-invertible at $f(x)$, then $g \circ f$ is locally $C^p$-invertible at $x$.
The inverse mapping theorem provides a criterion for a map to be
locally $C^p$-invertible, in terms of its derivative.
Theorem 1.2 (Inverse Mapping Theorem). Let $U$ be open in a Banach
space $E$, and let $f: U \to F$ be a $C^p$ map. Let $x_0 \in U$ and assume that
$f'(x_0): E \to F$ is a toplinear isomorphism (i.e. invertible as a continuous
linear map). Then $f$ is a local $C^p$-isomorphism at $x_0$.
Proof. Let $\lambda = f'(x_0)$. It suffices to prove that $\lambda^{-1} \circ f$ is locally
invertible at $x_0$ because we may consider $\lambda^{-1} \circ f$ instead of $f$ itself. Thus
we have reduced our theorem to the case where $E = F$ and $f'(x_0)$ is the
identity mapping. Next, making translations, it suffices to prove our
theorem when $x_0 = 0$ and $f(x_0) = 0$ also. From now on, we make these
additional assumptions.
Let $g(x) = x - f(x)$. Then $g'(0) = 0$ and by continuity there exists
$r > 0$ such that if $|x| \le 2r$, then
$$|g'(x)| \le \tfrac{1}{2}.$$
From the mean value theorem we see that $|g(x)| \le \tfrac{1}{2}|x|$, and hence that $g$
maps the closed ball $\bar{B}_r(0)$ into $\bar{B}_{r/2}(0)$. We contend that given $y \in \bar{B}_{r/2}(0)$,
there exists a unique element $x \in \bar{B}_r(0)$ such that $f(x) = y$. We prove this
by considering the map
$$g_y(x) = y + x - f(x).$$
If $|y| \le r/2$ and $|x| \le r$, then $|g_y(x)| \le r$ and hence $g_y$ may be viewed as a
mapping of the complete metric space $\bar{B}_r(0)$ into itself. The bound of $\tfrac{1}{2}$
on the derivative together with the mean value theorem shows that $g_y$ is
a shrinking map, i.e. that
$$|g_y(x_1) - g_y(x_2)| = |g(x_1) - g(x_2)| \le \tfrac{1}{2}|x_1 - x_2|$$
for $x_1, x_2 \in \bar{B}_r(0)$. By the shrinking lemma, it follows that $g_y$ has a
unique fixed point, which is precisely the solution of the equation
$f(x) = y$. This proves our contention.
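The fixed point argument above is constructive: iterating $g_y$ converges to the solution of $f(x) = y$. A minimal sketch in one dimension, with the illustrative choice $f(x) = x + x^2/2$ (so $f(0) = 0$, $f'(0) = 1$, and $g'(x) = -x$, which allows $r = 1/4$), follows. The function and variable names are assumptions made for the example.

```python
import math

def f(x):
    """Sample map with f(0) = 0 and f'(0) = 1 (illustrative choice)."""
    return x + 0.5 * x * x

def solve_by_shrinking(y, iterations=60):
    """Solve f(x) = y by iterating g_y(x) = y + x - f(x) from x = 0."""
    x = 0.0
    for _ in range(iterations):
        x = y + x - f(x)
    return x

r = 0.25          # |g'(x)| = |x| <= 1/2 whenever |x| <= 2r
y = 0.1           # any y with |y| <= r/2 is covered by the argument
x = solve_by_shrinking(y)
print(x, f(x))    # f(x) should reproduce y

# For this particular f the inverse is known in closed form.
print(math.sqrt(1 + 2 * y) - 1)
```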
We obtain a local inverse for $f$, which we denote by $f^{-1}$. This inverse
is continuous, because writing $x = x - f(x) + f(x)$ we see that
$$\begin{aligned}
|x_1 - x_2| &\le |f(x_1) - f(x_2)| + |g(x_1) - g(x_2)| \\
&\le |f(x_1) - f(x_2)| + \tfrac{1}{2}|x_1 - x_2|,
\end{aligned}$$
whence
$$|x_1 - x_2| \le 2|f(x_1) - f(x_2)|.$$
We shall now see that this inverse is differentiable on the open ball
$B_{r/2}(0)$. Indeed, fix $y_1 \in B_{r/2}(0)$ and let $y_1 = f(x_1)$ with $x_1 \in B_r(0)$. Let
$y \in B_{r/2}(0)$, and let $y = f(x)$ with $x \in B_r(0)$. Then:
$$(*) \qquad |f^{-1}(y) - f^{-1}(y_1) - f'(x_1)^{-1}(y - y_1)| = |x - x_1 - f'(x_1)^{-1}(f(x) - f(x_1))|.$$
From the differentiability of $f$, we can write
$$f(x) = f(x_1) + f'(x_1)(x - x_1) + o(x - x_1).$$
If we substitute this in $(*)$, we find the expression
$$|f'(x_1)^{-1} o(x - x_1)|.$$
Of course, $f'(x_1)^{-1}$ is bounded by a fixed constant and by what we have
already seen, we have
$$|x - x_1| \le 2|y - y_1|.$$
From the definition of differentiability, we conclude that $f^{-1}$ is differentiable
at $y_1$ and that its derivative is given by
$$(f^{-1})'(y_1) = f'(x_1)^{-1}.$$
Since the mappings $f^{-1}$, $f'$, "inverse" are continuous, it follows that
$D(f^{-1})$ is continuous and thus that $f^{-1}$ is of class $C^1$. Since taking
inverses is $C^\infty$, it follows inductively that $f^{-1}$ is $C^p$, as was to be shown.
We shall generalize part of the inverse mapping theorem in Chapter
XV, §3.
In some applications it is necessary to know that if the derivative of a
map is close to the identity, then the image of a ball contains a ball of
only slightly smaller radius. The precise statement follows. In this book,
it will be used only in the proof of the change of variables formula, and
therefore may be omitted until the reader needs it.
Lemma 1.3. Let $U$ be open in $E$, and let $f: U \to E$ be of class $C^1$.
Assume that $f(0) = 0$, $f'(0) = I$. Let $r > 0$ and assume that $\bar{B}_r(0) \subset U$.
Let $0 < s < 1$, and assume that
$$|f'(z) - f'(x)| \le s$$
for all $x, z \in \bar{B}_r(0)$. If $y \in E$ and $|y| \le (1 - s)r$, then there exists a
unique $x \in \bar{B}_r(0)$ such that $f(x) = y$.
Proof. The map $g_y$ given by $g_y(x) = x - f(x) + y$ is defined for $|x| \le r$
and $|y| \le (1 - s)r$, and maps $\bar{B}_r(0)$ into itself because, from the estimate
$$|f(x) - x| = |f(x) - f(0) - f'(0)x| \le |x| \sup |f'(z) - f'(0)| \le sr,$$
we obtain
$$|g_y(x)| \le sr + (1 - s)r = r.$$
Furthermore, $g_y$ is a shrinking map because, from the mean value theorem,
we get
$$\begin{aligned}
|g_y(x_1) - g_y(x_2)| &= |x_1 - x_2 - (f(x_1) - f(x_2))| \\
&= |x_1 - x_2 - f'(0)(x_1 - x_2) + \delta(x_1, x_2)| \\
&= |\delta(x_1, x_2)|
\end{aligned}$$
where
$$|\delta(x_1, x_2)| \le |x_1 - x_2| \sup |f'(z) - f'(0)| \le s|x_1 - x_2|.$$
Hence $g_y$ has a unique fixed point $x \in \bar{B}_r(0)$ which is such that $f(x) = y$.
XIV, §2. THE IMPLICIT MAPPING THEOREM
Its statement is as follows.
Theorem 2.1. Let $U$, $V$ be open sets in Banach spaces $E$, $F$ respectively,
and let
$$f: U \times V \to G$$
be a $C^p$ mapping. Let $(a, b) \in U \times V$, and assume that
$$D_2 f(a, b): F \to G$$
is a toplinear isomorphism. Let $f(a, b) = 0$. Then there exists a continuous
map $g: U_0 \to V$ defined on an open neighborhood $U_0$ of $a$ such that
$g(a) = b$ and such that
$$f(x, g(x)) = 0$$
for all $x \in U_0$. If $U_0$ is taken to be a sufficiently small ball, then $g$ is
uniquely determined, and is also of class $C^p$.
Proof. Let $\lambda = D_2 f(a, b)$. Replacing $f$ by $\lambda^{-1} \circ f$ we may assume without
loss of generality that $D_2 f(a, b)$ is the identity. Consider the map
$$\varphi: U \times V \to E \times G$$
given by
$$\varphi(x, y) = (x, f(x, y)).$$
Then the derivative of $\varphi$ at $(a, b)$ is immediately computed to be represented
by the matrix
$$D\varphi(a, b) = \begin{pmatrix} I_E & 0 \\ D_1 f(a, b) & D_2 f(a, b) \end{pmatrix} = \begin{pmatrix} I_E & 0 \\ D_1 f(a, b) & I_G \end{pmatrix}$$
whence $\varphi$ is locally invertible at $(a, b)$ since the inverse of $D\varphi(a, b)$ exists
and is the matrix
$$\begin{pmatrix} I_E & 0 \\ -D_1 f(a, b) & I_G \end{pmatrix}.$$
We denote the local inverse of $\varphi$ by $\psi$. We can write
$$\psi(x, z) = (x, h(x, z))$$
where $h$ is some mapping of class $C^p$. We define
$$g(x) = h(x, 0).$$
Then certainly $g$ is of class $C^p$ and
$$(x, f(x, g(x))) = \varphi(x, g(x)) = \varphi(x, h(x, 0)) = \varphi(\psi(x, 0)) = (x, 0).$$
This proves the existence of a $C^p$ map $g$ satisfying our requirements.
Now for the uniqueness, suppose that $g_0$ is a continuous map defined
near $a$ such that $g_0(a) = b$ and $f(x, g_0(x)) = 0$ for all $x$ near $a$. Then
$g_0(x)$ is near $b$ for such $x$, and hence
$$\varphi(x, g_0(x)) = (x, 0).$$
Since $\varphi$ is invertible near $(a, b)$ it follows that there is a unique point
$(x, y)$ near $(a, b)$ such that $\varphi(x, y) = (x, 0)$. Let $U_0$ be a small ball on
which $g$ is defined. If $g_0$ is also defined on $U_0$, then the above argument
shows that $g$ and $g_0$ coincide on some smaller neighborhood of $a$. Let
$x \in U_0$ and let $v = x - a$. Consider the set of those numbers $t$ with
$0 \le t \le 1$ such that $g(a + tv) = g_0(a + tv)$. This set is not empty. Let $s$
be its least upper bound. By continuity, we have $g(a + sv) = g_0(a + sv)$.
If $s < 1$, we can apply the existence and that part of the uniqueness just
proved to show that $g$ and $g_0$ are in fact equal in a neighborhood of
$a + sv$. Hence $s = 1$, and our uniqueness statement is proved, as well as
the theorem.
Note. The particular value $f(a, b) = 0$ in the preceding theorem is
irrelevant. If $f(a, b) = c$ for some $c \ne 0$, then the above proof goes
through replacing 0 by $c$ everywhere.
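For a concrete feel for Theorem 2.1, take $E = F = G = \mathbf{R}$, $f(x, y) = x^2 + y^2 - 1$ and $(a, b) = (0, 1)$, so that $D_2 f(a, b) = 2$ is invertible. The sketch below computes $g(x)$ by iterating $y \mapsto y - \lambda^{-1} f(x, y)$ with $\lambda = D_2 f(a, b)$, in the spirit of the reduction to the inverse mapping theorem. The example function, the iteration count, and the restriction to small $|x|$ are assumptions made for illustration.

```python
import math

def f(x, y):
    """Sample map with f(0, 1) = 0 and D_2 f(0, 1) = 2 (illustrative)."""
    return x * x + y * y - 1.0

A, B = 0.0, 1.0
LAM = 2.0 * B                 # lambda = D_2 f(a, b)

def g(x, iterations=100):
    """Solve f(x, y) = 0 for y near b by a shrinking-map iteration.

    Intended only for x close to a (say |x| <= 0.5), where the map
    y -> y - f(x, y) / lambda is a contraction near y = b.
    """
    y = B
    for _ in range(iterations):
        y = y - f(x, y) / LAM
    return y

for x in (0.0, 0.1, 0.3, 0.5):
    print(x, g(x), math.sqrt(1.0 - x * x))   # last two columns should agree
```

In this example the implicit function is known explicitly, $g(x) = \sqrt{1 - x^2}$, which makes the output easy to compare.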
XIV, §3. EXISTENCE THEOREM FOR DIFFERENTIAL EQUATIONS
Let $E$ be a Banach space and $U$ an open set in $E$. By a vector field on
$U$ we simply mean a mapping $f: U \to E$, which we interpret as assigning
a vector to each point of $U$. We shall assume our vector field is $C^p$
with $p \ge 1$. Let $x_0$ be a point of $U$. An integral curve for $f$ with initial
condition $x_0$ is a mapping of class $C^r$ ($r \ge 1$)
$$\alpha: J \to U$$
defined on an open interval $J$ containing 0, such that $\alpha(0) = x_0$ and such
that
$$\alpha'(t) = f(\alpha(t)).$$
We visualize this as saying that the velocity (tangent) vector of the curve
$\alpha$ at a point is equal to the vector associated to that point by the vector
field. We observe that an integral curve can also be viewed as a solution
of the integral equation
$$\alpha(t) = x_0 + \int_0^t f(\alpha(s))\, ds.$$
Namely, any solution of this integral equation is obviously an integral
curve of $f$ with the specified initial condition, and conversely, such
an integral curve satisfies the integral equation. Furthermore, we observe
that an integral curve for $f$ is then necessarily of class $C^{p+1}$, by
induction.
We shall say that $f$ satisfies a Lipschitz condition on $U$ if there exists a
number $K > 0$ such that
$$|f(x) - f(y)| \le K|x - y|$$
for all $x, y \in U$. We then call $K$ a Lipschitz constant for $f$. If $f$ is of
class $C^1$, it follows at once from the mean value theorem that $f$ is locally
Lipschitz, that is Lipschitz in the neighborhood of every point, and that
it is bounded on such a neighborhood.
Let $f: U \to E$ be a vector field and $x_0 \in U$. By a local flow at $x_0$ we
mean a mapping
$$\alpha: J \times U_0 \to U$$
where $J$ is an open interval containing 0, and $U_0$ is an open subset of $U$
containing $x_0$, such that for each $x$ in $U_0$ the map
$$t \mapsto \alpha_x(t) = \alpha(t, x)$$
is an integral curve for $f$ with initial condition $x$ (namely $\alpha(0, x) = x$).
We define a local flow with the eventual intent to analyze its dependence
on $x$. However, for this section, the occurrence of $x$ is still incidental,
and is introduced only to get some uniformity results. We shall prove
that a local flow always exists if $f$ satisfies a Lipschitz condition.
Theorem 3.1. Let $f: U \to E$ be a vector field satisfying a Lipschitz
condition with constant $K > 0$. Let $x_0 \in U$. Let $0 < a < 1$, assume that
the closed ball $\bar{B}_{2a}(x_0)$ is contained in $U$, and that $f$ is bounded by a
constant $L > 0$ on this ball. If $b$ is a number $> 0$ such that $b < a/L$
and $b < 1/K$, then there exists a unique local flow
$$\alpha: J_b \times B_a(x_0) \to U,$$
where $J_b$ is the open interval $-b < t < b$, and $B_a(x_0)$ is the open ball of
radius $a$ centered at $x_0$.
Proof. Let $I_b$ be the closed interval $-b \le t \le b$ and let $x$ be a point
in $B_a(x_0)$. Let $M$ be the set of continuous maps
$$\alpha: I_b \to \bar{B}_{2a}(x_0)$$
of the closed interval into the closed ball of center $x_0$ and radius $2a$, such
that $\alpha(0) = x$. We view $M$ as a subset of the space of continuous maps
of $I_b$ into $E$, with the sup norm. Then $M$ is complete. For each $\alpha$ in $M$
we define the curve $S\alpha$ by
$$S\alpha(t) = x + \int_0^t f(\alpha(u))\, du.$$
Then $S\alpha$ is certainly continuous, and $S\alpha(0) = x$. The distance of any
point of $S\alpha$ from $x$ is bounded by the norm of the integral, and we have
the estimate
$$|S\alpha(t) - x| \le bL < a.$$
Since $|x - x_0| < a$, it follows that $S\alpha$ lies in $M$, so $S$ maps $M$ into itself. Furthermore, $S$ is a
shrinking map, because for $\alpha$, $\beta$ in $M$ we have
$$\|S\alpha - S\beta\| \le b \sup |f(\alpha(u)) - f(\beta(u))| \le bK\|\alpha - \beta\|.$$
We can now apply the shrinking lemma to conclude the proof of our
theorem.
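The map $S$ in this proof can be iterated numerically (Picard iteration). The sketch below discretizes curves on a uniform grid of $I_b$ and approximates the integral by the trapezoidal rule; both of these, along with the sample field $f(x) = x/2$ on $\mathbf{R}$ and all names, are assumptions made for illustration. With $x_0 = 1$ and $a = 1/2$ we have $L = 1$ on $\bar{B}_{2a}(x_0) = [0, 2]$ and $K = 1/2$, so $b = 0.4$ satisfies $b < a/L$ and $b < 1/K$.

```python
import math

def picard_step(f, x, ts, alpha):
    """Apply S on the grid ts: (S alpha)(t) = x + integral_0^t f(alpha(u)) du.

    The integral is approximated by the cumulative trapezoidal rule,
    working outward from the grid point t = 0 in the middle of ts.
    """
    mid = len(ts) // 2
    out = [0.0] * len(ts)
    out[mid] = x
    for i in range(mid + 1, len(ts)):            # grid points with t > 0
        h = ts[i] - ts[i - 1]
        out[i] = out[i - 1] + 0.5 * h * (f(alpha[i - 1]) + f(alpha[i]))
    for i in range(mid - 1, -1, -1):             # grid points with t < 0
        h = ts[i] - ts[i + 1]                    # negative step
        out[i] = out[i + 1] + 0.5 * h * (f(alpha[i + 1]) + f(alpha[i]))
    return out

def picard_flow(f, x, b, m=200, iterations=30):
    """Iterate S starting from the constant curve with value x."""
    ts = [b * (i - m) / m for i in range(2 * m + 1)]   # ts[m] == 0 exactly
    alpha = [x] * len(ts)
    for _ in range(iterations):
        alpha = picard_step(f, x, ts, alpha)
    return ts, alpha

def field(x):
    """Sample vector field on R with Lipschitz constant K = 1/2."""
    return 0.5 * x

ts, alpha = picard_flow(field, x=1.0, b=0.4)
err = max(abs(a - math.exp(0.5 * t)) for t, a in zip(ts, alpha))
print(err)   # distance to the exact integral curve exp(t/2)
```

Since $bK = 0.2$ here, each application of $S$ shrinks the distance to the fixed point by a factor of about five, so a few dozen iterations are far more than needed; the remaining error comes from the grid.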
If we fix the initial condition $x$, then each integral curve $\alpha_x$ is of
course differentiable. However, we shall be interested in the dependence
on $x$, and it is already easy to show continuity.
Corollary 3.2. The local flow $\alpha$ in Theorem 3.1 is continuous. Furthermore,
the map $x \mapsto \alpha_x$ of $B_a(x_0)$ into the space of curves is continuous,
and in fact satisfies a Lipschitz condition.
Proof. The second statement obviously implies the first. So fix $x$ in
$B_a(x_0)$ and take $y$ close to $x$ in $B_a(x_0)$. We let $S_x$ be the shrinking map of
the theorem, corresponding to the initial condition $x$. Then
$$\|\alpha_x - S_y \alpha_x\| = \|S_x \alpha_x - S_y \alpha_x\| \le |x - y|.$$
Let $C = bK$ so $0 < C < 1$. Then
$$\begin{aligned}
\|\alpha_x - S_y^n \alpha_x\| &\le \|\alpha_x - S_y \alpha_x\| + \|S_y \alpha_x - S_y^2 \alpha_x\| + \cdots + \|S_y^{n-1} \alpha_x - S_y^n \alpha_x\| \\
&\le (1 + C + \cdots + C^{n-1})|x - y|.
\end{aligned}$$
Since the limit of $S_y^n \alpha_x$ is equal to $\alpha_y$ as $n$ goes to infinity, the continuity
of the map $x \mapsto \alpha_x$ follows at once. In fact, the map satisfies a Lipschitz
condition as stated.
It is easy to formulate a uniqueness theorem for integral curves over

their whole domain of definition.
Theorem 3.3. Let $U$ be open in $E$ and let $f: U \to E$ be a vector field of
class $C^p$, $p \ge 1$. Let
$$\alpha_1: J_1 \to U \quad \text{and} \quad \alpha_2: J_2 \to U$$
be two integral curves for $f$ with the same initial condition $x_0$. Then $\alpha_1$
and $\alpha_2$ are equal on $J_1 \cap J_2$.
Proof. Let $Q$ be the set of numbers $b$ such that $\alpha_1(t) = \alpha_2(t)$ for
$0 \le t < b$. Then $Q$ contains some number $b > 0$ by the local uniqueness
theorem. If $Q$ is not bounded from above, the equality of $\alpha_1(t)$ and $\alpha_2(t)$
for all $t > 0$ follows at once. If $Q$ is bounded from above, let $b$ be its
least upper bound. We must show that $b$ is the right end point of
$J_1 \cap J_2$. Suppose that this is not the case. Define curves $\beta_1$ and $\beta_2$ near
0 by
$$\beta_1(t) = \alpha_1(b + t) \quad \text{and} \quad \beta_2(t) = \alpha_2(b + t).$$
Then $\beta_1$ and $\beta_2$ are integral curves of $f$ with the initial conditions $\alpha_1(b)$
and $\alpha_2(b)$ respectively. The values $\beta_1(t)$ and $\beta_2(t)$ are equal for small
negative $t$ because $b$ is the least upper bound of $Q$. By continuity it
follows that $\alpha_1(b) = \alpha_2(b)$, and finally we see from the local uniqueness
theorem that
$$\beta_1(t) = \beta_2(t)$$
for all $t$ in some neighborhood of 0, whence $\alpha_1$ and $\alpha_2$ are equal in a
neighborhood of $b$, contradicting the fact that $b$ is a least upper bound
of $Q$. We can argue the same way towards the left end points, and thus
prove our theorem.
For each $x \in U$, let $J(x)$ be the union of all open intervals containing 0
on which integral curves for $f$ are defined, with initial condition equal to
$x$. Then Theorem 3.3 allows us to define the integral curve uniquely on
all of $J(x)$.
Remark. The choice of 0 as the initial time value is made for convenience.
From Theorem 3.3 one obtains at once (making a time translation)
the analogous statement for an integral curve defined on any open
interval; in other words, if $J_1$, $J_2$ do not necessarily contain 0, and $t_0$ is a
point in $J_1 \cap J_2$ such that $\alpha_1(t_0) = \alpha_2(t_0)$, and also we have the differential
equations
$$\alpha_1'(t) = f(\alpha_1(t)) \quad \text{and} \quad \alpha_2'(t) = f(\alpha_2(t)),$$
then $\alpha_1$ and $\alpha_2$ are equal on $J_1 \cap J_2$. One can also repeat the proof of
Theorem 3.3 in this case.
In practice, one meets vector fields which may be time dependent, and
also depend on parameters. We discuss these to show that their study
reduces to the study of the standard case.
Time-Dependent Vector Fields
Let $J$ be an open interval, $U$ open in a Banach space $E$, and
$$f: J \times U \to E$$
a $C^p$ map, which we view as depending on time $t \in J$. Thus for each $t$,
the map $x \mapsto f(t, x)$ is a vector field on $U$. Define
$$\bar{f}: J \times U \to \mathbf{R} \times E$$
by
$$\bar{f}(t, x) = (1, f(t, x))$$
and view $\bar{f}$ as a time-independent vector field on $J \times U$. Let $\bar{\alpha}$ be its
flow, so that
$$D_1 \bar{\alpha}(t, s, x) = \bar{f}(\bar{\alpha}(t, s, x)), \qquad \bar{\alpha}(0, s, x) = (s, x).$$
We note that $\bar{\alpha}$ has its values in $J \times U$ and thus can be expressed in
terms of two components. In fact, it follows at once that we can write $\bar{\alpha}$
in the form
$$\bar{\alpha}(t, s, x) = (t + s, \bar{\alpha}_2(t, s, x)).$$
Then $\bar{\alpha}_2$ satisfies the differential equation
$$D_1 \bar{\alpha}_2(t, s, x) = f(t + s, \bar{\alpha}_2(t, s, x))$$
as we see from the definition of $\bar{f}$. Let
$$\beta(t, x) = \bar{\alpha}_2(t, 0, x).$$
Then $\beta$ is a flow for $f$, i.e. satisfies the differential equation
$$D_1 \beta(t, x) = f(t, \beta(t, x)), \qquad \beta(0, x) = x.$$
Given $x \in U$, any value of $t$ such that $\alpha$ is defined at $(t, x)$ is also such
that $\bar{\alpha}$ is defined at $(t, 0, x)$ because $\alpha_x$ and $\beta_x$ are integral curves of the
same vector field, with the same initial condition, hence are equal. Thus
the study of time-dependent vector fields is reduced to the study of
time-independent ones.
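The reduction can be tried out numerically. In the sketch below the time-dependent field $f(t, x) = tx$ on $\mathbf{R}$ (an illustrative choice, with exact solution $x e^{st + t^2/2}$ when started at time $s$) is turned into the time-independent field $\bar{f}(s, x) = (1, f(s, x))$, whose flow is then approximated by the classical Runge-Kutta method. The integrator, step count, and names are assumptions for the example; any ODE solver for time-independent fields would serve.

```python
import math

def f(t, x):
    """Sample time-dependent vector field on R (illustrative choice)."""
    return t * x

def f_bar(z):
    """Time-independent field on J x U: f_bar(s, x) = (1, f(s, x))."""
    s, x = z
    return (1.0, f(s, x))

def rk4_step(F, z, h):
    """One classical Runge-Kutta step for z' = F(z), z a tuple of floats."""
    k1 = F(z)
    k2 = F(tuple(zi + 0.5 * h * ki for zi, ki in zip(z, k1)))
    k3 = F(tuple(zi + 0.5 * h * ki for zi, ki in zip(z, k2)))
    k4 = F(tuple(zi + h * ki for zi, ki in zip(z, k3)))
    return tuple(zi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))

def flow(F, z0, t, steps=1000):
    """Approximate the flow of F at time t starting from z0."""
    h = t / steps
    z = z0
    for _ in range(steps):
        z = rk4_step(F, z, h)
    return z

# Start at (s, x) = (0, 1): first component should be t + s = 1,
# second component should be beta(1, 1) = exp(1/2).
time_part, space_part = flow(f_bar, (0.0, 1.0), 1.0)
print(time_part, space_part, math.exp(0.5))
```

The first component of the computed flow simply tracks $t + s$, as in the formula $\bar{\alpha}(t, s, x) = (t + s, \bar{\alpha}_2(t, s, x))$.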
Dependence on Parameters
Let $V$ be open in some space $F$ and let
$$g: J \times V \times U \to E$$
be a map which we view as a time-dependent vector field on $U$, also
depending on parameters in $V$. We define
$$G: J \times V \times U \to F \times E$$
by
$$G(t, z, y) = (0, g(t, z, y))$$
for $t \in J$, $z \in V$, and $y \in U$. This is now a time-dependent vector field on
$V \times U$. A local flow for $G$ depends on three variables, say $\beta(t, z, y)$, with
initial condition $\beta(0, z, y) = (z, y)$. The map $\beta$ has two components, and it
is immediately clear that we can write
$$\beta(t, z, y) = (z, \alpha(t, z, y))$$
for some map $\alpha$ depending on three variables. Consequently $\alpha$ satisfies
the differential equation
$$D_1 \alpha(t, z, y) = g(t, z, \alpha(t, z, y)), \qquad \alpha(0, z, y) = y,$$
which gives the flow of our original vector field $g$ depending on the
parameters $z \in V$. This procedure reduces the study of differential
equations depending on parameters to those which are independent of
parameters.
XIV, §4. LOCAL DEPENDENCE ON INITIAL CONDITIONS
We shall now see that the map $x \mapsto \alpha_x$ in fact depends differentiably on $x$.
The proof, which depends on a very simple application of the implicit
mapping theorem in Banach spaces, was found independently by Pugh
and Robbin.
Let $U$ be open in $E$ and let $f: U \to E$ be a $C^p$ map (which we call a
vector field). Let $b > 0$ and let $I_b$ be the closed interval of radius $b$
centered at 0. Let
$$F = C^0(I_b, E)$$
be the Banach space of continuous maps of $I_b$ into $E$. We let $V$ be the
subset of $F$ consisting of all continuous curves
$$\alpha: I_b \to U$$
mapping $I_b$ into our open set $U$. Then it is clear that $V$ is open in $F$
because for each curve $\alpha$ the image $\alpha(I_b)$ is compact, hence at a finite
distance from the complement of $U$, so that any curve close to it is also
contained in $U$.
We define a map
$$T: U \times V \to F$$
by
$$T(x, \alpha) = x + \int f \circ \alpha - \alpha.$$
Here we omit the dummy variable of integration, and $x$ stands for the
constant curve with value $x$. If we evaluate the curve $T(x, \alpha)$ at $t$, then
by definition we have
$$T(x, \alpha)(t) = x + \int_0^t f(\alpha(u))\, du - \alpha(t).$$
Lemma 4.1. The map T is of class CP, and its second partial derivative
is given by the formula
D2 T(x, a) = L Df 0 a - I
where I is the identity. In terms of t, this reads:
D2 T(x, a)h(t) = I Df(a(u))h(u) du - h(t).
Proof. It is clear that the first partial derivative $D_1 T$ exists and is
continuous, in fact $C^\infty$, being linear in $x$ up to a translation. To determine
the second partial, we apply the definition of the derivative. The
derivative of the map $\alpha \mapsto \alpha$ is of course the identity. We have to get the
derivative with respect to $\alpha$ of the integral expression. We have for small
$h$:
$$\left\| \int f \circ (\alpha + h) - \int f \circ \alpha - \int (Df \circ \alpha)h \right\| \leq \int |f \circ (\alpha + h) - f \circ \alpha - (Df \circ \alpha)h|.$$
We estimate the expression inside the integral at each point $u$, with $u$
between $0$ and the upper variable of integration. From the mean value
theorem, we get
$$|f(\alpha(u) + h(u)) - f(\alpha(u)) - Df(\alpha(u))h(u)| \leq \|h\| \sup |Df(z_u) - Df(\alpha(u))|$$
where the sup is taken over all points $z_u$ on the segment between $\alpha(u)$
and $\alpha(u) + h(u)$. Since $Df$ is continuous, and using the fact that the
image of the curve $\alpha(I_b)$ is compact, we conclude (as in the case of
uniform continuity) that as $\|h\| \to 0$, the expression
$$\sup |Df(z_u) - Df(\alpha(u))|$$
also goes to $0$. (Put the $\epsilon$ and $\delta$ in yourself.) By definition, this gives us
the derivative of the integral expression in $\alpha$. The derivative of the final
term is obviously the identity, so this proves that $D_2 T$ is given by the
formula which we wrote down.
This derivative does not depend on $x$. It is continuous in $\alpha$. Namely,
we have
$$D_2 T(x, \tau) - D_2 T(x, \alpha) = \int [Df \circ \tau - Df \circ \alpha].$$
If $\alpha$ is fixed and $\tau$ is close to $\alpha$, then $Df \circ \tau - Df \circ \alpha$ is small, as one
proves easily from the compactness of $\alpha(I_b)$, as in the proof of uniform
continuity. Thus $D_2 T$ is continuous. By Theorem 7.1 of Chapter XIII
we now conclude that $T$ is of class $C^1$.
The derivative of $D_2 T$ with respect to $\alpha$ can again be computed as
before if $Df$ is itself of class $C^1$, and thus by induction, if $f$ is of class $C^p$
we conclude that $D_2 T$ is of class $C^{p-1}$ so that by Theorem 7.1 of Chapter XIII,
we conclude that $T$ itself is of class $C^p$. This proves our lemma.
We observe that a solution of the equation
$$T(x, \alpha) = 0$$
is precisely an integral curve for the vector field, with initial condition
equal to x. Thus we are in a situation where we want to apply the
implicit mapping theorem.
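The equation $T(x, \alpha) = 0$ can be checked numerically. The following is a minimal Python sketch, not part of the argument: it assumes the scalar vector field $f(y) = y$ on $E = \mathbf{R}$, samples curves on a uniform grid of $[0, b]$ only (the right half of $I_b$), and replaces the integral by the trapezoid rule. The helper names `cumulative_trapezoid`, `T` and `sup_norm` are hypothetical.

```python
import math

def cumulative_trapezoid(values, h):
    """Running trapezoid-rule integral of sampled values, starting at 0."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[k - 1] + values[k]))
    return out

def T(f, x, alpha, h):
    """Discrete T(x, alpha)(t) = x + int_0^t f(alpha(u)) du - alpha(t) on [0, b]."""
    integral = cumulative_trapezoid([f(a) for a in alpha], h)
    return [x + integral[k] - alpha[k] for k in range(len(alpha))]

def sup_norm(curve):
    return max(abs(v) for v in curve)

# Assumed example: f(y) = y, so the integral curve through x is x * exp(t).
f = lambda y: y
x, b, n = 1.0, 0.5, 2000
h = b / n
grid = [k * h for k in range(n + 1)]

exact = [x * math.exp(t) for t in grid]
residual_exact = sup_norm(T(f, x, exact, h))

# Picard iteration alpha <- alpha + T(x, alpha) = x + int f(alpha),
# started from the constant curve with value x.
alpha = [x] * (n + 1)
for _ in range(30):
    step = T(f, x, alpha, h)
    alpha = [a + s for a, s in zip(alpha, step)]

residual_picard = sup_norm(T(f, x, alpha, h))
error_picard = sup_norm([a - e for a, e in zip(alpha, exact)])
```

Under these assumptions the exact curve has a residual of the size of the quadrature error, while the Picard iterates reduce the discrete residual to rounding level and stay close to $x e^t$.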
Lemma 4.2. Let $x_0 \in U$. Let $a > 0$ be such that $Df$ is bounded, say by
a number $C_1 > 0$, on the ball $B_a(x_0)$ (we can always find such $a$ since $Df$
is continuous at $x_0$). Let $b < 1/C_1$. Then $D_2 T(x, \alpha)$ is invertible for all
$(x, \alpha)$ in $B_a(x_0) \times V$.
Proof. We have an estimate
$$\left| \int_0^t Df(\alpha(u))h(u)\,du \right| \leq bC_1 \|h\|.$$
This means that
$$|D_2 T(x, \alpha) + I| < 1,$$
and hence that $D_2 T(x, \alpha)$ is invertible, as a continuous linear map, thus
proving Lemma 4.2.
Theorem 4.3. Let $p$ be a positive integer, and let $f : U \to E$ be a $C^p$
vector field. Let $x_0 \in U$. Then there exist numbers $a, b > 0$ such that
the local flow
$$\alpha : J_b \times B_a(x_0) \to U$$
is of class $C^p$.
Proof. We take $a$ so small and then $b$ so small that the local flow
exists and is uniquely determined by Theorem 3.1. We then take $b$
smaller and $a$ smaller so as to satisfy the hypotheses of Lemma 4.2. We
can then apply the implicit mapping theorem to conclude that the map
$x \mapsto \alpha_x$ is of class $C^p$. Of course, we have to consider the flow $\alpha$ and still
must show that $\alpha$ itself is of class $C^p$. It will suffice to prove that $D_1 \alpha$
and $D_2 \alpha$ are of class $C^{p-1}$, by Theorem 7.1 of Chapter XIII. We first
consider the case $p = 1$.
We could derive the continuity of $\alpha$ from Corollary 3.2 but we can
also get it as an immediate consequence of the continuity of the map
$x \mapsto \alpha_x$. Indeed, fixing $(s, y)$ we have
$$|\alpha(t, x) - \alpha(s, y)| \leq |\alpha(t, x) - \alpha(t, y)| + |\alpha(t, y) - \alpha(s, y)|$$
$$\leq \|\alpha_x - \alpha_y\| + |\alpha_y(t) - \alpha_y(s)|.$$
Since $\alpha_y$ is continuous (being differentiable), we get the continuity of $\alpha$.
Since
$$D_1 \alpha(t, x) = f(\alpha(t, x)),$$
we conclude that $D_1 \alpha$ is a composite of continuous maps, whence
continuous.
Let $\varphi$ be the derivative of the map $x \mapsto \alpha_x$, so that
$$\varphi : B_a(x_0) \to L(E, F)$$
is of class $C^{p-1}$. Then
$$\alpha_{x+w} - \alpha_x = \varphi(x)w + |w|\psi(w)$$
where $\psi(w) \to 0$ as $w \to 0$. Evaluating at $t$, we find
$$\alpha(t, x + w) - \alpha(t, x) = (\varphi(x)w)(t) + |w|\psi(w)(t),$$
and from this we see that
$$D_2 \alpha(t, x)w = (\varphi(x)w)(t).$$
Then
$$|D_2 \alpha(t, x)w - D_2 \alpha(s, y)w| \leq |(\varphi(x)w)(t) - (\varphi(y)w)(t)| + |(\varphi(y)w)(t) - (\varphi(y)w)(s)|.$$
The first term on the right is bounded by
$$|\varphi(x) - \varphi(y)|\,|w|$$
so that
$$|D_2 \alpha(t, x) - D_2 \alpha(t, y)| \leq |\varphi(x) - \varphi(y)|.$$
We shall prove below that
$$|(\varphi(y)w)(t) - (\varphi(y)w)(s)|$$
is uniformly small with respect to $w$ when $s$ is close to $t$. This proves the
continuity of $D_2 \alpha$, and concludes the proof that $\alpha$ is of class $C^1$.
The following proof that $|(\varphi(y)w)(t) - (\varphi(y)w)(s)|$ is uniformly small
was shown to me by Professor Yamanaka. We have
(1) $$\alpha(t, x) = x + \int_0^t f(\alpha(u, x))\,du.$$
Replacing $x$ with $x + \lambda w$ ($w \in E$, $\lambda \neq 0$), we obtain
(2) $$\alpha(t, x + \lambda w) = x + \lambda w + \int_0^t f(\alpha(u, x + \lambda w))\,du.$$
Therefore
(3) $$\frac{\alpha(t, x + \lambda w) - \alpha(t, x)}{\lambda} = w + \int_0^t \frac{1}{\lambda}\bigl[f(\alpha(u, x + \lambda w)) - f(\alpha(u, x))\bigr]\,du.$$
On the other hand, we have already seen in the proof of Theorem 4.3
that
(4) $$\alpha(t, x + \lambda w) - \alpha(t, x) = \lambda(\varphi(x)w)(t) + |\lambda|\,|w|\,\psi(\lambda w)(t).$$
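Substituting (4) in (3), we obtain:
$$(\varphi(x)w)(t) + \frac{|\lambda|}{\lambda}|w|\,\psi(\lambda w)(t) = w + \int_0^t \frac{1}{\lambda}\bigl[f(\alpha(u, x + \lambda w)) - f(\alpha(u, x))\bigr]\,du$$
$$= w + \int_0^t \int_0^1 G(u, \lambda, v)\,dv\,du,$$
where
$$G(u, \lambda, v) = Df\bigl(\alpha(u, x) + v\epsilon_1(\lambda)\bigr)\,\frac{1}{\lambda}\epsilon_1(\lambda)$$
with
$$\epsilon_1(\lambda) = \lambda(\varphi(x)w)(u) + |\lambda|\,|w|\,\psi(\lambda w)(u).$$
Letting $\lambda \to 0$, we have
(5) $$(\varphi(x)w)(t) = w + \int_0^t Df(\alpha(u, x))(\varphi(x)w)(u)\,du.$$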
By (5) we have
$$|(\varphi(x)w)(t) - (\varphi(x)w)(s)| \leq \left| \int_s^t Df(\alpha(u, x))(\varphi(x)w)(u)\,du \right| \leq bC_1 |\varphi(x)| \cdot |w| \cdot |t - s|,$$
from which we immediately obtain the desired uniformity.
We have
$$\alpha(t, x) = x + \int_0^t f(\alpha(u, x))\,du.$$
We can differentiate under the integral sign with respect to the parameter
$x$ and thus obtain
$$D_2 \alpha(t, x) = I + \int_0^t Df(\alpha(u, x)) D_2 \alpha(u, x)\,du,$$
where $I$ is a constant linear map (the identity). Differentiating with respect
to $t$ yields the linear differential equation satisfied by $D_2 \alpha$, namely
$$D_1 D_2 \alpha(t, x) = Df(\alpha(t, x)) D_2 \alpha(t, x),$$
and this differential equation depends on time and parameters. We have
seen in §3 how such equations can be reduced to the ordinary case. We
now conclude that locally, by induction, $D_2 \alpha$ is of class $C^{p-1}$ since $Df$ is
of class $C^{p-1}$. Since
$$D_1 \alpha(t, x) = f(\alpha(t, x)),$$
we conclude by induction that $D_1 \alpha$ is $C^{p-1}$. Hence $\alpha$ is of class $C^p$ by
Theorem 7.1 of Chapter XIII. Note that each time we use induction, the
domain of the flow may shrink. In the next section, we shall prove a
more global result. In any case, we have proved Theorem 4.3.
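The linear differential equation for $D_2 \alpha$ can be tested on an explicit example. The sketch below assumes the scalar field $f(y) = y^2$ (an illustrative choice, not taken from the text), whose flow is $\alpha(t, x) = x/(1 - tx)$, so that $D_2 \alpha(t, x) = 1/(1 - tx)^2$. It integrates the pair $(\alpha, D_2 \alpha)$ with a classical Runge-Kutta step and compares the result with the closed form and with a central difference quotient in $x$. All function names are hypothetical.

```python
def rk4_step(rhs, state, h):
    """One classical fourth-order Runge-Kutta step for state' = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def f(y):
    return y * y        # assumed vector field

def df(y):
    return 2.0 * y      # its derivative Df

def rhs(state):
    # alpha' = f(alpha),  (D_2 alpha)' = Df(alpha) * D_2 alpha
    alpha, d2 = state
    return [f(alpha), df(alpha) * d2]

def flow_and_derivative(x, t, steps=1000):
    """Integrate from time 0 to t with alpha(0) = x and D_2 alpha(0) = 1."""
    h = t / steps
    state = [x, 1.0]
    for _ in range(steps):
        state = rk4_step(rhs, state, h)
    return state

x, t = 0.5, 1.0                      # t lies to the left of the blow-up time 1/x = 2
alpha_num, d2_num = flow_and_derivative(x, t)
alpha_exact = x / (1.0 - t * x)
d2_exact = 1.0 / (1.0 - t * x) ** 2

eps = 1e-6                           # central difference in the initial condition
d2_fd = (flow_and_derivative(x + eps, t)[0]
         - flow_and_derivative(x - eps, t)[0]) / (2.0 * eps)
```

The three values `d2_num`, `d2_exact` and `d2_fd` should agree up to discretization error, which is what the variational equation predicts in this one-dimensional case.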
XIV, §5. GLOBAL SMOOTHNESS OF THE FLOW
Let $U$ be open in a Banach space $E$, and let $f : U \to E$ be a $C^p$ vector
field. We let $J(x)$ be the domain of the integral curve with initial condition
equal to $x$.
Let $\mathfrak{D}(f)$ be the set of all points $(t, x)$ in $\mathbf{R} \times U$ such that $t$ lies in
$J(x)$. Then we have a map
$$\alpha : \mathfrak{D}(f) \to U$$
defined on all of $\mathfrak{D}(f)$, letting $\alpha(t, x) = \alpha_x(t)$ be the integral curve on $J(x)$
having $x$ as initial condition. We call this the flow determined by $f$, and
we call $\mathfrak{D}(f)$ its domain of definition.
Theorem 5.1. Let $f : U \to E$ be a $C^p$ vector field on the open set $U$ of
$E$, and let $\alpha$ be its flow. Abbreviate $\alpha(t, x)$ by $tx$ if $(t, x)$ is in the
domain of definition of the flow. Let $x \in U$. If $t_0$ lies in $J(x)$, then
$$J(t_0 x) = J(x) - t_0$$
(translation of $J(x)$ by $-t_0$), and we have for all $t$ in $J(x) - t_0$:
$$t(t_0 x) = (t + t_0)x.$$
Proof. The two curves defined by
$$t \mapsto \alpha(t, \alpha(t_0, x))$$
and
$$t \mapsto \alpha(t + t_0, x)$$
are integral curves of the same vector field, with the same initial condition
$t_0 x$ at $t = 0$. Hence they have the same domain of definition $J(t_0 x)$.
Hence $t_1$ lies in $J(t_0 x)$ if and only if $t_1 + t_0$ lies in $J(x)$. This proves
the first assertion. The second assertion comes from the uniqueness of
the integral curve having given initial condition, whence the theorem
follows.
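Both assertions of Theorem 5.1 can be seen on a flow with a closed form. The sketch below assumes the scalar field $f(y) = y^2$ with $x > 0$ (an illustrative choice, not from the text), for which $\alpha(t, x) = x/(1 - tx)$ and $J(x) = (-\infty, 1/x)$. The names `flow` and `right_endpoint` are hypothetical.

```python
def flow(t, x):
    """Closed-form flow of the assumed field f(y) = y**2: x / (1 - t*x)."""
    return x / (1.0 - t * x)

def right_endpoint(x):
    """Right end point of J(x) for x > 0: the solution blows up at t = 1/x."""
    return 1.0 / x

x, t0, t = 0.5, 0.8, 0.6            # t0 lies in J(x), and t lies in J(x) - t0

# Second assertion: t(t0 x) = (t + t0) x.
lhs = flow(t, flow(t0, x))
rhs = flow(t + t0, x)

# First assertion, read off at the right end point: J(t0 x) = J(x) - t0.
shifted_end = right_endpoint(flow(t0, x))
expected_end = right_endpoint(x) - t0
```

Here `lhs` and `rhs` agree, and the blow-up time of the curve restarted at $t_0 x$ is the original blow-up time shifted by $-t_0$.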
Theorem 5.2. If $f$ is of class $C^p$ (with $p \leq \infty$), then its flow is of class
$C^p$ on its domain of definition.
Proof. First let $p$ be an integer $\geq 1$. We know that the flow is locally
of class $C^p$ at each point $(0, x)$, by Theorem 4.3. Let $x_0 \in U$ and let $J(x_0)$
be the maximal interval of definition of the integral curve having $x_0$ as
initial condition. Let $\mathfrak{D}(f)$ be the domain of definition of the flow, and
let $\alpha$ be the flow. Let $Q$ be the set of numbers $b > 0$ such that for each $t$
with $0 \leq t < b$ there exists an open interval $J$ containing $t$ and an open
set $V$ containing $x_0$ such that $J \times V$ is contained in $\mathfrak{D}(f)$ and such that
$\alpha$ is of class $C^p$ on $J \times V$. Then $Q$ is not empty by Theorem 4.3. If $Q$ is
not bounded from above, then we are done looking toward the right end
point of $J(x_0)$. If $Q$ is bounded from above, we let $b$ be its least upper
bound. We must prove that $b$ is the right end point of $J(x_0)$. Suppose
that this is not the case. Then $\alpha(b, x_0)$ is defined. Let $x_1 = \alpha(b, x_0)$. By
the local Theorem 4.3, we have a unique local flow at $x_1$, which we
denote by $\beta$:
$$\beta : J_a \times B_a(x_1) \to U, \qquad \beta(0, x) = x,$$
defined for some open interval $J_a = (-a, a)$ and open ball $B_a(x_1)$ of radius
$a$ centered at $x_1$. Let $\delta$ be so small that whenever $b - \delta < t < b$ we
have
$$\alpha(t, x_0) \in B_{a/4}(x_1).$$
We can find such $\delta$ because
$$\lim_{t \to b} \alpha(t, x_0) = x_1$$
by continuity. Select a point $t_1$ such that $b - \delta < t_1 < b$. By the hypothesis
on $b$, we can select an open interval $J_1$ containing $t_1$ and an open
set $V_1$ containing $x_0$ so that
$$\alpha : J_1 \times V_1 \to B_{a/2}(x_1)$$
maps $J_1 \times V_1$ into $B_{a/2}(x_1)$. We can do this because $\alpha$ is continuous at
$(t_1, x_0)$, being in fact $C^p$ at this point. If $|t - t_1| < a$ and $x \in V_1$, we
define
$$\varphi(t, x) = \beta(t - t_1, \alpha(t_1, x)).$$
Then
$$\varphi(t_1, x) = \beta(0, \alpha(t_1, x)) = \alpha(t_1, x)$$
and
$$D_1 \varphi(t, x) = D_1 \beta(t - t_1, \alpha(t_1, x)) = f(\beta(t - t_1, \alpha(t_1, x))) = f(\varphi(t, x)).$$
Hence both $\varphi_x$ and $\alpha_x$ are integral curves for $f$ with the same value at $t_1$.
They coincide on any interval on which they are defined by Theorem 3.3.
If we take $\delta$ very small compared to $a$, say $\delta < a/4$, we see that $\varphi$ is an
extension of $\alpha$ to an open set containing $(t_1, x_0)$, and also containing
$(b, x_0)$. Furthermore, $\varphi$ is of class $C^p$, thus contradicting the fact that $b$
is strictly smaller than the end point of $J(x_0)$. Similarly, one proves the
analogous statement on the other side, and we therefore see that $\mathfrak{D}(f)$ is
open in $\mathbf{R} \times U$ and that $\alpha$ is of class $C^p$ on $\mathfrak{D}(f)$, as was to be shown.
The idea of the above proof is very simple geometrically. We go as
far to the right as possible in such a way that the given flow $\alpha$ is of class
$C^p$ locally at $(t, x_0)$. At the point $\alpha(b, x_0)$ we then use the flow $\beta$ to
extend differentiably the flow $\alpha$ in case $b$ is not the right-hand point of
$J(x_0)$. The flow $\beta$ at $\alpha(b, x_0)$ has a fixed local domain of definition, and
we simply take $t$ close enough to $b$ so that $\beta$ gives an extension of $\alpha$, as
described in the above proof.
Of course, if $f$ is of class $C^\infty$, then we have shown that $\alpha$ is of class $C^p$
for each positive integer $p$, and therefore the flow is also of class $C^\infty$.
XIV, §6. EXERCISES
1. (Tate) Let $E$, $F$ be complete normed vector spaces. Let $f : E \to F$ be a map
having the following property. There exists a number $C > 0$ such that for all
$x, y \in E$ we have
$$|f(x + y) - f(x) - f(y)| \leq C.$$
Show that there exists a unique linear map $g : E \to F$ such that $g - f$ is
bounded for the sup norm. [Hint: Show that the limit
$$g(x) = \lim_{n \to \infty} \frac{f(2^n x)}{2^n}$$
exists.]
2. Generalize Exercise 1 to the bilinear case. In other words, let $f : E \times F \to G$
be a map and assume that there is a constant $C$ such that
$$|f(x_1 + x_2, y) - f(x_1, y) - f(x_2, y)| \leq C,$$
$$|f(x, y_1 + y_2) - f(x, y_1) - f(x, y_2)| \leq C$$
for all $x, x_1, x_2 \in E$ and $y, y_1, y_2 \in F$. Show that there exists a unique
bilinear map $g : E \times F \to G$ such that $f - g$ is bounded for the sup norm.
3. Prove the following statement. Let $B_r$ be the closed ball of radius $r$ centered
at $0$ in $E$. Let $f : B_r \to E$ be a map such that:
(a) $|f(x) - f(y)| \leq b|x - y|$ with $0 < b < 1$. [...] $c > 0$ be such that $|g(x) - f(x)| \leq c$ for all $x$. Assume that $g$ has a fixed point
$x_2$, and let $x_1$ be the fixed point of $f$. Show that $|x_2 - x_1| \leq c/(1 - b)$.
5. Let $K$ be a continuous function of two variables, defined for $(x, y)$ in the
square $a \leq x \leq b$ and $a \leq y \leq b$. Assume that $\|K\| \leq C$ for some constant
$C > 0$. Let $f$ be a continuous function on $[a, b]$ and let $r$ be a real number
satisfying the inequality
$$|r| < \frac{1}{C(b - a)}.$$
Show that there is one and only one function $g$ continuous on $[a, b]$ such
that
$$f(x) = g(x) + r \int_a^b K(t, x)g(t)\,dt.$$
6. Newton's method. This method serves the same purpose as the shrinking
lemma but sometimes is more efficient and converges more rapidly. It is used
to find zeros of mappings.
Let $B_r$ be a ball of radius $r$ centered at a point $x_0 \in E$. Let $f : B_r \to E$ be a
$C^2$ mapping, and assume that $f''$ is bounded by some number $C \geq 1$ on $B_r$.
Assume that $f'(x)$ is invertible for all $x \in B_r$ and that $|f'(x)^{-1}| \leq C$ for all
$x \in B_r$. Show that there exists a number $\delta$ depending only on $C$ such that if
$|f(x_0)| \leq \delta$ then the sequence defined by
$$x_{n+1} = x_n - f'(x_n)^{-1} f(x_n)$$
lies in $B_r$ and converges to an element $x$ such that $f(x) = 0$. Hint: Show
inductively that
$$|x_{n+1} - x_n| \leq C|f(x_n)|,$$
$$|f(x_{n+1})| \leq |x_{n+1} - x_n|^2 C,$$
and hence that
$$|f(x_n)| \leq C^{3(1 + 2 + \cdots + 2^{n-1})} \delta^{2^n}.$$
7. Apply Newton's method to prove the following statement. Assume that
$f : U \to E$ is of class $C^2$ and that for some point $x_0 \in U$ we have $f(x_0) = 0$
and $f'(x_0)$ is invertible. Show that given $y$ sufficiently close to $0$, there exists
$x$ close to $x_0$ such that $f(x) = y$. [Hint: Consider the map $g(x) = f(x) - y$.]
Note. The point of the Newton method is that it often gives a procedure
which converges much faster than the procedure of the shrinking lemma.
Indeed, the shrinking lemma converges more or less like a geometric series.
The Newton method converges with an exponent of $2^n$. For an interesting
application of the Newton method, see the Nash-Moser implicit mapping
theorem [Nas], [Mo 2], [Ha]. See also the partial axiomatization which I
gave in [La 4]. These show that the calculus in Banach spaces is insufficient
and leads to calculus in Fréchet spaces, where the inverse mapping theorem
and existence theorem for differential equations are much more subtle.
8. The following is a reformulation due to Tate of a theorem of Michael Shub.
(a) Let $n$ be a positive integer, and let $f : \mathbf{R} \to \mathbf{R}$ be a differentiable function
such that $f'(x) \geq r > 0$ for all $x$. Assume that $f(x + 1) = f(x) + n$. Show
that there exists a strictly increasing continuous map $\alpha : \mathbf{R} \to \mathbf{R}$ satisfying
$$\alpha(x + 1) = \alpha(x) + 1$$
such that
$$f(\alpha(x)) = \alpha(nx).$$
[Hint: Follow Tate's proof. Show that $f$ is continuous, strictly increasing,
and let $g$ be its inverse function. You want to solve $\alpha(x) = g(\alpha(nx))$. Let
$M$ be the set of all continuous functions which are increasing (not necessarily
strictly) and satisfying $\alpha(x + 1) = \alpha(x) + 1$. On $M$, define the norm
$$\|\alpha\| = \sup_{0 \leq x \leq 1} |\alpha(x)|.$$
Let $T : M \to M$ be the map such that
$$(T\alpha)(x) = g(\alpha(nx)).$$
Show that $T$ maps $M$ into $M$ and is a shrinking map. Show that $M$ is
complete, and that a fixed point for $T$ solves the problem.] Since one can
write
$$nx = \alpha^{-1}(f(\alpha(x)))$$
one says that the map $x \mapsto nx$ is conjugate to $f$. Interpreting this on the
circle, one gets the statement originally due to Shub that a differentiable
function on the circle, with positive derivative, is conjugate to the $n$-th
power for some $n$.
(b) Show that the differentiability condition can be replaced by the weaker
condition: There exist numbers $r_1$, $r_2$ with $1 < r_1 < r_2$ such that for all
$x \geq 0$ we have [...]
9. Let $M$ be a complete metric space (or a closed subset of a complete normed
vector space if you wish), and let $S$ be a topological space. Let $T : S \times M \to M$
be a continuous map, such that for each $u \in S$ the map $T_u : M \to M$ given
by $T_u(x) = T(u, x)$ is a shrinking map with constant $K_u$, $0 < K_u < 1$. Assume
that there is some $K$ with $0 < K < 1$ such that $K_u \leq K$ for all $u \in S$. Let
$\varphi : S \to M$ be the map such that $\varphi(u)$ is the fixed point of $T_u$. Show that
$\varphi$ is continuous.
10. Exercises 10 and 11 develop a special case of a theorem of Anosov, by a
proof due to Moser.
First we make some definitions. Let $A : \mathbf{R}^2 \to \mathbf{R}^2$ be a linear map. We say
that $A$ is hyperbolic if there exist numbers $b > 1$, $c < 1$, and two linearly
independent vectors $v$, $w$ in $\mathbf{R}^2$ such that $Av = bv$ and $Aw = cw$. As an
example, show that the matrix (linear map)
A=G ~)
has this property.
Next we introduce the $C^1$ norm. If $f$ is a $C^1$ map, such that both $f$ and
$f'$ are bounded, we define the $C^1$ norm to be
$$\|f\|_1 = \max(\|f\|, \|f'\|),$$
where $\|\ \|$ is the usual sup norm. In this case, we also say that $f$ is
$C^1$-bounded.
The theorem we are after runs as follows:

Theorem. Let $A : \mathbf{R}^2 \to \mathbf{R}^2$ be a hyperbolic linear map. There exists $\delta$ having
the following property. If $f : \mathbf{R}^2 \to \mathbf{R}^2$ is a $C^1$ map such that
$$\|f - A\|_1 < \delta,$$
then there exists a continuous bounded map $h : \mathbf{R}^2 \to \mathbf{R}^2$ satisfying the
equation
$$f \circ h = h \circ A.$$
First prove a lemma.
Lemma. Let $M$ be the vector space of continuous bounded maps of $\mathbf{R}^2$ into
$\mathbf{R}^2$. Let $T : M \to M$ be the map defined by $Tp = p - A^{-1} \circ p \circ A$. Then $T$ is
a continuous linear map, and is invertible.
To prove the lemma, write
$$p(x) = p^+(x)v + p^-(x)w$$
where $p^+$ and $p^-$ are functions, and note that symbolically,
$$Tp^+ = p^+ - b^{-1} p^+ \circ A,$$
that is $Tp^+ = (I - S)p^+$ where $\|S\| < 1$. So find an inverse for $T$ on $p^+$.
Analogously, show that $Tp^- = (I - S_0^{-1})p^-$ where $\|S_0\| < 1$, so that $S_0 T =
S_0 - I$ is invertible on $p^-$. Hence $T$ can be inverted componentwise, as it
were.
To prove the theorem, write $f = A + g$ where $g$ is $C^1$-small. We want to
solve for $h = I + p$ with $p \in M$, satisfying $f \circ h = h \circ A$. Show that this is
equivalent to solving
$$Tp = -A^{-1} \circ g \circ h,$$
or equivalently,
$$p = -T^{-1}(A^{-1} \circ g \circ (I + p)).$$
This is then a fixed point condition for the map $R : M \to M$ given by
$$R(p) = -T^{-1}(A^{-1} \circ g \circ (I + p)).$$
Show that $R$ is a shrinking map to conclude the proof.

11. One can formulate a variant of the preceding exercise (actually the very case
dealt with by Anosov-Moser). Assume that the matrix $A$ with respect to the
standard basis of $\mathbf{R}^2$ has integer coefficients. A vector $z \in \mathbf{R}^2$ is called an
integral vector if its coordinates are integers. A map $p : \mathbf{R}^2 \to \mathbf{R}^2$ is said to be
periodic if $p(x + z) = p(x)$ for all $x \in \mathbf{R}^2$ and all integral vectors $z$. Prove:
Theorem. Let $A$ be hyperbolic, with integer coefficients. There exists $\delta$
having the following property. If $g$ is a $C^1$, periodic map, and $\|g\|_1 < \delta$, and
if $f = A + g$, then there exists a periodic continuous map $h$ satisfying the
equation
$$f \circ h = h \circ A.$$
Note. With only a bounded amount of extra work, one can show that the
map $h$ itself is $C^0$-invertible, and so $f = h \circ A \circ h^{-1}$.
12. (a) Let $f$ be a $C^1$ vector field on an open set $U$ in $E$. If $f(x_0) = 0$ for some
$x_0 \in U$, if $\alpha : J \to U$ is an integral curve for $f$, and there exists some $t_0 \in J$
such that $\alpha(t_0) = x_0$, show that $\alpha(t) = x_0$ for all $t \in J$. (A point $x_0$ such
that $f(x_0) = 0$ is called a critical point of the vector field.)
(b) Let $f$ be a $C^1$ vector field on an open set $U$ of $E$. Let $\alpha : J \to U$ be an
integral curve for $f$. Assume that all numbers $t > 0$ are contained in $J$,
and that there is a point $P$ in $U$ such that
$$\lim_{t \to \infty} \alpha(t) = P.$$
Prove that $f(P) = 0$. (Exercises 12(a) and 12(b) have many applications,
notably when $f = \operatorname{grad} g$ for some function $g$. In this case we see that $P$
is a critical point of the function $g$.)
13. Let $U$ be open in the (real) Hilbert space $E$ and let $g : U \to \mathbf{R}$ be a $C^2$
function. Then $g' : U \to L(E, \mathbf{R})$ is a $C^1$ map into the dual space, and we
know that $E$ is self dual. Thus there is a $C^1$ map $f : U \to E$ such that
$$g'(x)v = \langle v, f(x) \rangle$$
for all $x \in U$ and $v \in E$. We call $f$ the gradient of $g$.

Let $g : U \to \mathbf{R}$ be a function of class $C^2$. Let $x_0 \in U$ and assume that $x_0$ is
a critical point of $g$ (that is $g'(x_0) = 0$). Assume also that $D^2 g(x_0)$ is negative
definite. By definition, take this to mean that there exists a number $c > 0$
such that for all vectors $v$ we have
$$D^2 g(x_0)(v, v) \leq -c|v|^2.$$
Prove that if $x_1$ is a point in the ball $B_r(x_0)$ of radius $r$, centered at $x_0$, and if
$r$ is sufficiently small, then the integral curve $\alpha$ of $\operatorname{grad} g$ having $x_1$ as initial
condition is defined for all $t \geq 0$ and
$$\lim_{t \to \infty} \alpha(t) = x_0.$$
Hint: Let $\psi(t) = (\alpha(t) - x_0) \cdot (\alpha(t) - x_0)$ be the square of the distance from
$\alpha(t)$ to $x_0$. Show that $\psi$ is strictly decreasing, and in fact satisfies
$$\psi'(t) \leq -c\psi(t).$$
Divide by $\psi(t)$ and integrate to see that
$$\log \psi(t) - \log \psi(0) \leq -ct.$$
14. Let $U$ be open in $E$ and let $f : U \to E$ be a $C^1$ vector field on $U$. Let $x_0 \in U$
and assume that $f(x_0) = v \neq 0$. Let $\alpha$ be a local flow for $f$ at $x_0$. Let $F$ be a
subspace of $E$ which is complementary to the one-dimensional space generated
by $v$, that is the map
$$\mathbf{R} \times F \to E$$
given by $(t, y) \mapsto tv + y$ is an invertible continuous linear map.
(a) If $E = \mathbf{R}^n$ show that such a subspace exists. (The general case can be
proved by the Hahn-Banach theorem.)
(b) Show that the map $\beta : (t, y) \mapsto \alpha(t, x_0 + y)$ is a local $C^1$ isomorphism at
$(0, 0)$. Compute $D\beta$ in terms of $D_1 \alpha$ and $D_2 \alpha$.
(c) The map $\sigma : (t, y) \mapsto x_0 + y + tv$ is obviously a $C^1$ isomorphism, because it
is composed of a translation and an invertible linear map. Define locally
at $x_0$ the map $\varphi$ by $\varphi = \beta \circ \sigma^{-1}$, so that by definition,
$$\varphi(x_0 + y + tv) = \alpha(t, x_0 + y).$$
Using the chain rule, show that for all $x$ near $x_0$ we have
$$D\varphi(x)v = f(\varphi(x)).$$
In the language of charts (Chapter XXI) this expresses the fact that if a
vector field is not zero at a point, then after a change of charts, this
vector field can be made to be constant in a neighborhood of that point.
15. Let $J$ be an open interval $(a, b)$ and let $U$ be open in $E$. Let $f : J \times U \to E$
be a continuous map which is Lipschitz on $U$ uniformly for every compact
subinterval of $J$. Let $\alpha$ be an integral curve of $f$, defined on a maximal open
subinterval $(a_0, b_0)$ of $J$. Assume:
(a) There exists $\epsilon > 0$ such that the closure of $\alpha((b_0 - \epsilon, b_0))$ is contained in $U$.
(b) There exists $C > 0$ such that $|f(t, \alpha(t))| \leq C$ for all $t$ in $(b_0 - \epsilon, b_0)$.
Then $b_0 = b$.
16. Linear differential equations. Let $J$ be an open interval containing $0$, and let
$V$ be open in a Banach space $E$. Let $L$ be a Banach space. Let $A : J \times V \to L$
be a continuous map, and let $L \times E \to E$ be a continuous bilinear map.
Let $w_0 \in E$. Then there exists a unique map $\lambda : J \times V \to E$, which for each
$x \in V$ is a solution of the differential equation
$$D_1 \lambda(t, x) = A(t, x)\lambda(t, x), \qquad \lambda(0, x) = w_0.$$
This map $\lambda$ is continuous. [Hint: Use Exercise 15. We see that in the linear
case, the integral curve is defined over the whole interval $J$.]
17. Let $U$ be open in a Banach space $E$ and let $f : U \to E$ be a $C^1$ vector field.
Assume that $f$ is bounded. Let $\alpha$ be an integral curve for $f$, and let $J$ be its
maximal interval of definition. Suppose that $J$ does not contain all positive
real numbers, and let $b$ be its right end point. Show that
$$\lim_{t \to b} \alpha(t)$$
exists, and that it is a boundary point of $U$. Cf. [La 1] and [La 2] to see
Exercises 12-17 worked out.
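The contrast drawn in the Note to Exercise 7 between Newton's method and the shrinking lemma can be observed on a small example. The following Python sketch is only an illustration under stated assumptions: it takes the scalar map $f(x) = x^2 - 2$ on $E = \mathbf{R}$, and compares the Newton iteration with the iteration of the map $g(x) = x - f(x)/4$, which is shrinking on $[1, 2]$ with constant $1/2$. The function names and the stopping tolerance are hypothetical choices.

```python
import math

def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{n+1} = x_n - f(x_n)/f'(x_n); returns (x, steps)."""
    x = x0
    for n in range(max_iter):
        if abs(f(x)) <= tol:
            return x, n
        x = x - f(x) / df(x)
    return x, max_iter

def fixed_point(g, f, x0, tol=1e-12, max_iter=500):
    """Iterate the shrinking map g until |f(x)| <= tol; returns (x, steps)."""
    x = x0
    for n in range(max_iter):
        if abs(f(x)) <= tol:
            return x, n
        x = g(x)
    return x, max_iter

f = lambda x: x * x - 2.0            # assumed example map, zero at sqrt(2)
df = lambda x: 2.0 * x
g = lambda x: x - f(x) / 4.0         # |g'(x)| = |1 - x/2| <= 1/2 on [1, 2]

x_newton, steps_newton = newton(f, df, 1.5)
x_fixed, steps_fixed = fixed_point(g, f, 1.5)
```

Both iterations approach $\sqrt{2}$, but the step counts differ markedly: the shrinking map gains a roughly constant factor per step, while the Newton error is roughly squared at each step.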
PART FIVE
Functional Analysis
In this part, we present some basic and substantial results of functional

analysis, which are extremely widely used. The part splits into essentially
independent considerations dealing with Banach spaces in general, and
then with the special case of hermitian operators in Hilbert space.
First we have a chapter on some general theorems which extend to
Banach spaces some basic algebraic results on finite dimensional spaces,
taking into account the linear topology. The algebraic theorems concern-
ing the existence of complementary subspaces, or criteria for a linear map
to be surjective, need a more systematic study in light of the additional
structure provided by the norms and the continuity of linear maps.
The rest of the part handles systematically the spectrum of an opera-
tor in various contexts. First we deal with the spectrum in a fairly
general context of Banach algebras. Then we study spectral decomposi-
tions for specific types of operators, starting with compact operators
which are closest to those in finite dimensional spaces. Then we go into
the systematic study of hermitian operators in Hilbert space. Note that
the proof of the spectral theorem in this context is self-contained, inde-
pendent of the chapter on Banach spaces. Knowing just the self duality
of Hilbert space and the existence of orthogonal complements for closed
subspaces constitutes a sufficient tool for the applications to the rest of
the book. The spectral theorems are included so that readers can push
forward in these particular directions if they are so inclined by taste
rather than by formal requirements for the present basic course.
The functional analysis is principally concerned with the study of a
space with an operator, giving as simple a description as possible for the
way in which this operator operates. The two spectral theorems give
examples of the standard manner in which such a description can be
made, i.e. either by describing a basis for the space on which the effect
of the operator is obvious, or by giving a structure theorem for the
algebra generated by the operator. These two ways permeate functional
analysis.
CHAPTER XV
The Open Mapping Theorem,

Factor Spaces, and Duality
XV, §1. THE OPEN MAPPING THEOREM
We begin with a general theorem on metric spaces.
Theorem 1.1 (Baire's Theorem). Let $X$ be a complete metric space, and
assume that $X$ is the union of a sequence of closed subsets $S_n$. Then
some $S_n$ contains a non-empty open ball.
Proof. Suppose that this is not the case. We find $x_1$ in the complement
of $S_1$ (which cannot be the whole space) and some closed ball
$B_{r_1}(x_1)$ centered at $x_1$ of radius $r_1 > 0$, contained in this complement. By
assumption, there is some $x_2$ in $B_{r_1}(x_1)$ contained in the complement of
$S_2$ and some closed ball $B_{r_2}(x_2)$ contained in $B_{r_1}(x_1)$, and which lies in
the complement of $S_2$. We continue inductively using a sequence $r_1, r_2,
\ldots$ such that $r_n > 0$ and $r_n \to 0$. We thus obtain a sequence of closed
balls
$$B_{r_1}(x_1) \supset B_{r_2}(x_2) \supset \cdots \supset B_{r_n}(x_n)$$
such that $B_{r_n}(x_n)$ is disjoint from $S_1 \cup \cdots \cup S_n$. We then select $x_{n+1}$ and
$B_{r_{n+1}}(x_{n+1}) \subset B_{r_n}(x_n)$ disjoint from $S_{n+1}$. Then the sequence $\{x_n\}$ is a
Cauchy sequence, converging to a point $x$, and $x$ lies in $B_{r_n}(x_n)$ for
all $n$. Hence $x$ does not lie in $S_n$ for any $n$, contradicting the hypothesis
that the union of all $S_n$ is equal to $X$. This proves Baire's theorem.
Corollary 1.2. Let $X$ be a complete metric space, and $\{U_n\}$ a sequence
of open dense sets. Then the intersection $\bigcap U_n$ is not empty.
Proof. Take the complement of the sets in Baire's theorem.

Theorem 1.3 (Open Mapping Theorem). Let $E$, $F$ be Banach spaces,
and let $\varphi : E \to F$ be a continuous linear map, which is surjective. Then
$\varphi$ is an open mapping.
Proof. For $s > 0$ we denote by $B_s$ the open ball of radius $s$ in $E$,
centered at the origin, and by $C_s$ the open ball of radius $s$ in $F$ centered at the
origin. Let $S_n = \overline{\varphi(B_n)}$. Then $S_n$ is closed, and the union of all sets $S_n$ is
equal to $F$. By Baire's theorem, some $\varphi(B_n)$ contains a set which is dense
in some non-empty open ball $V$ in $F$, centered at a point $y$. If $y = \varphi(x)$,
for some $x \in E$, then translating by $y$, we conclude that there is some
$k \geq 1$ and $r > 0$ such that $\varphi(B_{kr})$ contains a set which is dense in $C_r$. By
homogeneity (i.e. the fact that $B_{ts} = tB_s$ for $s, t > 0$), it follows that this
last statement holds if we replace $r$ by any number $s > 0$. We shall prove
that in fact, $\varphi(B_{kr})$ contains $C_r$. Select $0 < \delta < 1$, and let $y \in F$, $|y| < r$.
There exists $x_1 \in E$ with $|x_1| < kr$ such that
$$|y - \varphi(x_1)| < \delta r.$$
Inductively, there exist $x_1, \ldots, x_n \in E$ such that $|x_n| < k\delta^{n-1} r$ and
$$|y - \varphi(x_1) - \cdots - \varphi(x_n)| < \delta^n r.$$
Then the sum $x_1 + \cdots + x_n$ converges to an element $x$ such that $y = \varphi(x)$,
and furthermore,
$$|x| < kr/(1 - \delta).$$
Hence $\varphi(B_{kr})$ contains the ball $C_{r(1-\delta)}$ of radius $r(1 - \delta)$. This is true
for every $\delta > 0$ whence our assertion follows that $\varphi(B_{kr})$ contains $C_r$.
Now to conclude the proof of Theorem 1.3, let $U$ be an open set in $E$,
and let $x \in U$. Let $B$ be an open ball centered at the origin in $E$
such that $x + B \subset U$. Then $\varphi(x) + \varphi(B)$ is contained in $\varphi(U)$. But $\varphi(B)$
contains an open ball centered at the origin in $F$. This proves that $\varphi(U)$
is open, and concludes the proof of the open mapping theorem.
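The successive approximation used in the proof can be imitated in a toy situation. The sketch below is not the general argument: it assumes the map $\varphi(x) = 2x$ on $\mathbf{R}$ and a deliberately inexact solver `approx_preimage` (a hypothetical helper) which, given $y$, returns $x$ with $|x| \leq k|y|$ and $|y - \varphi(x)| = \delta|y|$, where $k = 1/2$. Correcting the residual repeatedly produces a series converging to an exact preimage, with the bound $k|y|/(1 - \delta)$ on its norm.

```python
def phi(x):
    """Assumed surjective linear map phi: R -> R."""
    return 2.0 * x

def approx_preimage(y, delta):
    """Inexact solver: |y - phi(x)| = delta*|y| and |x| <= k*|y| with k = 1/2."""
    return (1.0 - delta) * y / 2.0

def solve_by_successive_approximation(y, delta, steps=80):
    """Sum the corrections x_1 + x_2 + ... as in the proof of Theorem 1.3."""
    total = 0.0
    residual = y
    for _ in range(steps):
        x_n = approx_preimage(residual, delta)
        total += x_n
        residual -= phi(x_n)         # the residual shrinks by the factor delta
    return total, residual

k, delta, y = 0.5, 0.5, 3.0
x, residual = solve_by_successive_approximation(y, delta)
bound = k * abs(y) / (1.0 - delta)
```

After the loop, `phi(x)` agrees with `y` up to rounding and `abs(x)` does not exceed `bound`, mirroring the estimate $|x| < kr/(1 - \delta)$.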
Corollary 1.4. Let $\varphi : E \to F$ be a continuous linear map, which is
bijective. Then $\varphi$ is a toplinear isomorphism.
Proof. Since $\varphi$ is open by Theorem 1.3, the inverse of $\varphi$ is also continuous, so we are done.
Corollary 1.5. Let $F$, $G$ be closed subspaces of $E$ such that $F + G = E$
and $F \cap G = \{0\}$. Then the map
$$F \times G \to E$$
such that $(x, y) \mapsto x + y$ is a toplinear isomorphism.

Proof. It is continuous and bijective, so that Corollary 1.4 applies.
In Corollary 1.5, we shall also say that $E$ is the direct sum of $F$, $G$
and we write
$$E = F \oplus G.$$
We say that F, G are complementary subspaces.

Let $E$ be a Banach space and $F$ a closed subspace. Then we can
define a norm on the factor space $E/F$ by
$$|x + F| = \inf_{y \in F} |x + y|.$$
Then $E/F$ is complete under this norm, i.e. is also a Banach space. To see
this, let
$$\varphi : E \to E/F$$
be the canonical map which to each $x \in E$ associates the coset $\varphi(x) =
x + F$. Then $\varphi$ is a continuous linear map, and
$$|\varphi(x)| \leq |x|.$$
Let $\{\xi_n\}$ be a Cauchy sequence in $E/F$. Taking a subsequence if necessary,
we may assume without loss of generality that
$$|\xi_{n+1} - \xi_n| < \frac{1}{2^{n+1}}.$$
We find inductively elements $x_n \in E$ such that $\varphi(x_n) = \xi_n$ and such that
$$|x_{n+1} - x_n| < \frac{1}{2^{n+1}}.$$
Indeed, suppose that we have found $x_1, \ldots, x_n$ satisfying these conditions.
Since $|\xi_{n+1} - \xi_n| < 1/2^{n+1}$, we can find $y$ such that
$$\varphi(y) = \xi_{n+1} - \xi_n \quad \text{and} \quad |y| < \frac{1}{2^{n+1}}$$
by the definition of the norm on $E/F$. We let $x_{n+1} = y + x_n$ to achieve
what we want. Then the sequence $\{x_n\}$ is a Cauchy sequence in $E$, and
converges to some element $x$. It follows that $\varphi(x_n) = \xi_n$ converges to
$\varphi(x)$ since $\varphi$ is continuous, as was to be shown.
Let $\varphi : E \to G$ be a continuous linear map where $E$, $G$ are Banach
spaces. The image $\varphi(E)$ is a subspace, which is not necessarily closed.
Let $F$ be the kernel of $\varphi$. Then we have the usual linear map
$$E/F \to G$$
induced by $\varphi$, namely the map such that $x + F \mapsto \varphi(x) = \varphi(x + F)$. This
map is in fact continuous, because there exists $C > 0$ such that for all
$x \in E$ we have
$$|\varphi(x)| \leq C|x|.$$
Since $\varphi(x) = \varphi(x + y)$ for all $y \in F$ it follows that
$$|\varphi(x)| \leq C|x + F|$$
whence the continuity of $E/F \to G$. Consequently, by Corollary 1.4 of
the open mapping theorem, if $\varphi$ is surjective, it follows that the map
$E/F \to G$ is a toplinear isomorphism.
Let $E$ be a vector space and $F$ a subspace. If $E/F$ has finite dimension,
then we say that $F$ has finite codimension, and we call $\dim E/F$ its
codimension.
Corollary 1.6. Let E be a Banach space and F a closed subspace of

finite dimension or finite codimension. Then F has a complementary
closed subspace.
Proof. Assume that $F$ is finite dimensional. The proof is then independent
of the open mapping theorem, namely we let $\{\varphi_1, \ldots, \varphi_n\}$ be a
basis of the dual space of $F$. By Hahn-Banach, we extend each $\varphi_i$ to a
functional on $E$, denoted by the same letter, and we map
$$x \mapsto (\varphi_1(x), \ldots, \varphi_n(x))$$
for $x \in E$. Let $G$ be the kernel of this map. Then $G$ is closed, and it is
immediately verified that $G$ is a complement of $F$.
Next, assume that $F$ has finite codimension, and let $\{y_1, \ldots, y_n\}$ be a
basis of $E/F$. Let $x_1, \ldots, x_n$ be elements of $E$ mapping onto $y_1, \ldots, y_n$
respectively in the natural map
$$E \to E/F.$$
Let $G$ be the space generated by $x_1, \ldots, x_n$. Then $G$ is finite dimensional,
hence closed, and $F \cap G = \{0\}$ while $F + G = E$. We can apply Corollary
1.5 to conclude the proof of this case.
Later in discussing Fredholm operators, we shall also need the following
completely elementary fact:
Proposition 1.7. If $F$ is closed in $E$, and $E/F$ is finite dimensional, and
if $G$ is a subspace of $E$ such that $F \subset G \subset E$, then $G$ is closed.
Proof. The image of $G$ in the factor space $E/F$ is in a finite dimensional
space, hence closed. Since $G$ is the inverse image in $E$ of its image
in $E/F$, it follows that $G$ is closed.
Corollary 1.8. Let $E$, $G$ be Banach spaces. Let $\varphi : E \to G$ be a continuous
linear map such that the image $\varphi(E)$ is finite codimensional. Then
$\varphi(E)$ is closed.
Proof. We can find in the usual way (as in Corollary 1.6) a finite
dimensional subspace $F$ of $G$ such that $G = \varphi(E) + F$. Of course, so far,
this is an algebraic direct sum, not yet topological. Factoring out the
kernel of $\varphi$, we may assume without loss of generality that $\varphi$ is injective.
We compose $\varphi$ with the natural map $\psi : G \to G/F$. Then the composite
$$\psi \circ \varphi : E \to G \to G/F$$
is a bijective continuous linear map of $E$ on $G/F$, hence is a toplinear
isomorphism by Corollary 1.4. Hence the inverse map $(\psi \circ \varphi)^{-1}$ is continuous,
and therefore so is the map
$$\varphi \circ (\psi \circ \varphi)^{-1} : G/F \to \varphi(E)$$
which maps $G/F$ on $\varphi(E)$. Hence $\varphi(E)$ is toplinearly isomorphic with
$G/F$. Since $G/F$ is complete, it follows that $\varphi(E)$ is complete, and consequently
$\varphi(E)$ is closed in $G$, as was to be shown.
XV, §2. ORTHOGONALITY
We could now deal with either the real or complex case. We deal with
the latter, since it is useful to get used to the complex conjugation which
occurs, and introduces only a change of notation.
Let $E$ be a Banach space over the complex numbers. We let $E^*$ be
the space of anti-linear maps $\varphi : E \to \mathbf{C}$, i.e. continuous maps which are
$\mathbf{R}$-linear and satisfy
$$\varphi(\alpha x) = \bar{\alpha}\varphi(x), \qquad \alpha \in \mathbf{C}.$$
Elements of this space will be called anti-functionals or semi-functionals.
This space is obtained from the dual space $E'$ very simply, namely if $\varphi$ is
a functional in $E'$, then the map $\bar{\varphi}$ defined by
$$\bar{\varphi}(x) = \overline{\varphi(x)}$$
is an anti-functional, i.e. an element of E*, and conversely. We shall

apply to elements of E* certain results proved for elements of E', e.g. the
Hahn-Banach theorem. Let E, F be Banach spaces, and let
$$u : E \to F$$
be a continuous linear map. Then $u$ induces a map
$$u^* : F^* \to E^*$$
such that
$$\varphi \mapsto \varphi \circ u,$$
and it is clear that $u^*$ is linear and continuous. It is convenient here to
use a notation as in Hilbert space. We define a map
  $E \times E^* \to \mathbf{C}$
by
  $(x, \varphi) \mapsto \langle x, \varphi \rangle = \varphi(x)$.
This map is continuous sesquilinear, and we shall see that it behaves very
much like the scalar product of Hilbert space for the basic formalism of
duality.
First the remark that the map
  $u \mapsto u^*$
is anti-linear from $L(E, F)$ to $L(F^*, E^*)$. By definition, we have
  $\langle ux, \varphi \rangle = \langle x, u^*\varphi \rangle$
for all $x \in E$, $\varphi \in F^*$. Thus we call $u^*$ the adjoint of $u$. We note that $u^*$
is the unique element of $L(F^*, E^*)$ which satisfies this formula. We have
(1)  $|u| = |u^*|$.
To prove this, observe that for any $\varphi \in F^*$ we have
  $|(u^*\varphi)(x)| = |\varphi(ux)| \le |\varphi||u||x|$
so that $|u^*| \le |u|$. Conversely, for each $x \in E$, by the Hahn-Banach theorem,
there exists $\varphi \in F^*$ such that $|\varphi(ux)| = |ux|$, and $|\varphi| \le 1$. Then for
this $\varphi$, we get
  $|ux| = |\varphi(ux)| = |(u^*\varphi)(x)| \le |u^*||\varphi||x| \le |u^*||x|$.
Hence $|u| \le |u^*|$, thus proving our assertion.
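As an illustration of the adjoint and of its anti-linearity, here is a finite dimensional sketch. The choices $E = \mathbf{C}^n$, $F = \mathbf{C}^m$, the matrix $U$, and the notation $\varphi_a$, $U^{\mathsf{H}}$ are assumptions made for this example only; every anti-functional on $\mathbf{C}^m$ has the form $\varphi_a$ with $a_j = \varphi(e_j)$.

```latex
% Assumption: E = C^n, F = C^m, u given by an m x n matrix U = (U_{jk}).
% Anti-functionals on C^m are written \varphi_a(y) = \sum_j \overline{y_j} a_j with a in C^m.
% U^{\mathsf{H}} denotes the conjugate transpose of U.
\begin{align*}
(u^*\varphi_a)(x) &= \varphi_a(Ux)
   = \sum_{j=1}^{m} \overline{(Ux)_j}\, a_j
   = \sum_{k=1}^{n} \overline{x_k} \Bigl( \sum_{j=1}^{m} \overline{U_{jk}}\, a_j \Bigr)
   = \varphi_b(x), \qquad b = U^{\mathsf{H}} a, \\
\bigl((\alpha u)^*\varphi_a\bigr)(x) &= \varphi_a(\alpha U x)
   = \bar{\alpha}\, \varphi_a(Ux)
   = \bar{\alpha}\,(u^*\varphi_a)(x) \qquad (\alpha \in \mathbf{C}).
\end{align*}
```

In this model $u^*$ is the conjugate transpose, which is the familiar Hilbert space adjoint.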
We have the following duality between spaces, subspaces, and factor
spaces. Let $F$ be a closed subspace of $E$. We denote by $F^\perp$ the set of all
elements $\varphi \in E^*$ such that $\varphi(F) = 0$. (This is similar to the situation in
Hilbert space, but here we have the natural map
  $E \times E^* \to \mathbf{C}$
instead of the hermitian product of Hilbert space.) Then $F^\perp$ is clearly a
closed subspace. We have a natural continuous linear map
  $E^* \to F^*$
by restriction, i.e. each $\varphi \in E^*$ induces by restriction an element of $F^*$.
The kernel is precisely $F^\perp$. Furthermore, our map is surjective, because an
anti-functional $\psi$ on $F$ can be extended to an anti-functional $\varphi$ on $E$ by
the Hahn-Banach theorem. Hence we have a natural toplinear isomorphism
(2)  $E^*/F^\perp \approx F^*$.
We observe that the notion of perpendicularity can be defined on the
other side as well, i.e. given a subset $S$ of $E^*$, we let $S^\perp$ be the set of
$x \in E$ such that $\langle x, \varphi \rangle = 0$ for all $\varphi \in S$. Then $S^\perp$ is a closed subspace of $E$.
We have a natural toplinear isomorphism
(3)  $F^\perp \approx (E/F)^*$.
Indeed, each $\varphi \in F^\perp$ defines an element of $(E/F)^*$ since $\langle F, \varphi \rangle = 0$. It is
clear that this map induces our stated isomorphism.
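Let $F$ be a subspace of $E$. Then
(4)  $F^{\perp\perp} = \overline{F}$.
Indeed, it is clear that $F \subset F^{\perp\perp}$, and $F^{\perp\perp}$ is closed so that $\overline{F} \subset F^{\perp\perp}$.
Conversely, suppose that $x \notin \overline{F}$. Then there is some anti-functional $\varphi$ on
$E/\overline{F}$ such that $\varphi(x) \ne 0$ by the Hahn-Banach theorem. Hence $x \notin F^{\perp\perp}$.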
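A small example may help to see why the closure appears in the double perpendicular. The spaces $\ell^2$ and $c_{00}$ below are assumptions chosen for illustration and are not taken from the text.

```latex
% Assumption: E = \ell^2 and F = c_{00}, the finitely supported sequences,
% which form a dense subspace of E that is not closed.
\begin{align*}
F^\perp &= \{\varphi \in E^* : \varphi(F) = 0\} = \{0\}
   && \text{(a continuous map vanishing on a dense subspace vanishes)}, \\
F^{\perp\perp} &= \{x \in E : \langle x, \varphi \rangle = 0 \text{ for all } \varphi \in F^\perp\}
   = E = \overline{F} \ne F .
\end{align*}
```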
We also have the duality associated with a continuous linear map
  $u\colon E \to G$
namely
(5)  $\operatorname{Ker} u^* = (\operatorname{Im} u)^\perp$,
and if the image of $u$ is closed, then so is the image of $u^*$ and
(6)  $\operatorname{Im} u^* = (\operatorname{Ker} u)^\perp$.
We leave (5) as an exercise, and prove (6). We have for $x \in E$:
  $\langle x, u^*G^* \rangle = 0$ if and only if $\langle ux, G^* \rangle = 0$.
Hence $\operatorname{Im} u^* \subset (\operatorname{Ker} u)^\perp$. Conversely, let $\varphi \in E^*$ and $\varphi \perp \operatorname{Ker} u$. We have
a toplinear isomorphism
  $\sigma\colon E/\operatorname{Ker} u \to u(E)$
by Corollary 1.4 ($\sigma$ is continuous bijective). We view $\varphi$ as an
anti-functional on $E/\operatorname{Ker} u$, and then $\varphi \circ \sigma^{-1}$ is an anti-functional on $u(E)$,
which is a closed subspace of $G$. We can extend $\varphi \circ \sigma^{-1}$ to an
anti-functional $\psi$ of $G$ by the Hahn-Banach theorem. Then it is clear that
$u^*\psi = \varphi$, whence $\varphi \in \operatorname{Im} u^*$. This proves that $\operatorname{Im} u^* = (\operatorname{Ker} u)^\perp$, and in
particular proves that $\operatorname{Im} u^*$ is closed, thus proving (6).
In particular, if again the image of $u$ is closed, then we have toplinear
isomorphisms
(7)  $\operatorname{Ker} u^* \approx (G/uE)^*$,
(8)  $(\operatorname{Ker} u)^* \approx E^*/u^*G^*$,
in a natural way.
The reader acquainted with the language of exact sequences will see
that our results can be expressed as follows. If
  $0 \to F \to E \to G \to 0$
is an exact sequence of Banach spaces, then the adjoint sequence
  $0 \leftarrow F^* \leftarrow E^* \leftarrow G^* \leftarrow 0$
is also exact.
XV, §3. APPLICATIONS OF THE OPEN MAPPING THEOREM
The results of this section will not be used at all throughout the rest of
this book, and are included only for the sake of completeness. The first
two give criteria for a linear map to be continuous.
As usual, if $\varphi\colon E \to F$ is a map, we define the graph of $\varphi$ to be the set
of all points $(x, \varphi(x))$ in $E \times F$. If $\varphi$ is linear, then the graph of $\varphi$ is
obviously a subspace of $E \times F$.
Theorem 3.1 (Closed Graph Theorem). Let $\varphi\colon E \to F$ be a linear map
from one Banach space into another, and assume that the graph is
closed. Then $\varphi$ is continuous.
Proof. Let $G$ be the graph of $\varphi$, so that by assumption $G$ is a closed
subspace of $E \times F$. The projection
  $G \to E$ given by $(x, \varphi(x)) \mapsto x$
is obviously continuous and bijective. By Corollary 1.4 of the open mapping
theorem, it follows that this projection is a toplinear isomorphism,
and thus has a continuous linear inverse. If we compose this continuous
linear inverse with the projection on $F$, then we obtain $\varphi$, thus proving
that $\varphi$ is continuous, as desired.
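The completeness assumption in the closed graph theorem cannot simply be dropped. The following standard example is a sketch only; the spaces, the operator $D$, and the functions $f_n$ are assumptions chosen for illustration and do not come from the text.

```latex
% Assumption: E = C^1[0,1] with the sup norm (a normed space that is not complete),
% F = C[0,1] with the sup norm, and D f = f'.
\begin{align*}
& f_n \to f \text{ and } f_n' \to g \text{ uniformly, with } f \in E,\ g \in F
   \;\Longrightarrow\; f' = g
   && \text{(the graph of $D$ is closed in $E \times F$)}, \\
& f_n(t) = t^n: \quad \|f_n\|_\infty = 1, \quad \|D f_n\|_\infty = n \to \infty
   && \text{($D$ is not continuous)}.
\end{align*}
```

There is no contradiction with the theorem, because $E$ with the sup norm is not a Banach space.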
Theorem 3.2 (Principle of Uniform Boundedness). Let $E$, $F$ be Banach
spaces, and let $\{T_i\}_{i \in I}$ be a family of continuous linear maps from $E$
into $F$. Assume that for each $x \in E$ the set $\{T_i x\}_{i \in I}$ is bounded. Let $B$
be a bounded subset of $E$. Then the set
  $\bigcup_{i \in I} T_i(B)$
is bounded.
Proof. For each positive integer $n$ let $C_n$ be the set of all $x \in E$ such
that $|T_i x| \le n$ for all $i \in I$. Since each $T_i$ is continuous, it follows that $C_n$
is closed. By assumption, we have
  $E = \bigcup_{n=1}^{\infty} C_n$.
By Baire's theorem (Theorem 1.1) it follows that some $C_m$ contains an
open ball $B_r(x_0)$ with $r > 0$. This means that if $|x| < r$, then for all $i \in I$
we have
  $|T_i(x_0 + x)| \le m$,
whence
  $|T_i(x)| \le |T_i(x + x_0)| + |T_i(x_0)| \le 2m$.
Our theorem follows by homogeneity.
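Baire's theorem is used in an essential way, and the statement can fail on a space that is not complete. The following sketch uses the space $c_{00}$ and the maps $T_n$, both of which are assumptions chosen for illustration.

```latex
% Assumption: E = c_{00} (finitely supported sequences with the sup norm; not complete),
% F = \mathbf{C}, and T_n x = n x_n.
\begin{align*}
\sup_n |T_n x| &= \max_{n \le N} n|x_n| < \infty
   && \text{if } x_n = 0 \text{ for } n > N \quad \text{(pointwise bounded)}, \\
|T_n| &= \sup_{|x| \le 1} n|x_n| = n \to \infty
   && \text{(not bounded on the unit ball)}.
\end{align*}
```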
Corollary 3.3. Let $\{T_n\}$ be a sequence of continuous linear maps of $E$
into $F$. Assume that for each $x \in E$ the limit
  $Tx = \lim_{n \to \infty} T_n x$
exists. Then $T$ is a continuous linear map of $E$ into $F$, and
  $\lim_{x \to 0} T_n x = 0$
uniformly in $n$.
Proof. It is clear that $T$ is linear. For each $x \in E$, the sequence $\{T_n x\}$
converges and hence is bounded, so that we can apply the theorem, and
our corollary follows at once since we see that $T$ is continuous at 0.
The next two theorems provide one type of generalization of the

inverse mapping theorem, to surjective mapping theorems.
Theorem 3.4. Let $E$, $F$ be Banach spaces. The subset of $L(E, F)$
consisting of surjective maps is open in $L(E, F)$.
Proof. Let $\lambda\colon E \to F$ be a continuous linear map and assume that $\lambda$ is
surjective. By the open mapping theorem, and homogeneity, there exists
$C > 0$ having the following property. Given $y \in F$ with $|y| \le 1$, there
exists $x \in E$ such that $\lambda x = y$ and $|x| \le C|y|$. Changing the norm on $F$
to an equivalent norm, we may assume without loss of generality that
$C = 1$. Let $0 < r < 1$. Let $\varphi \in L(E, F)$ be such that $|\lambda - \varphi| \le r$. We shall
prove that $\varphi$ is surjective, and it will suffice to prove that $\varphi$ maps the
ball of radius $\frac{1}{1-r}$ in $E$ onto the ball of radius 1 in $F$. Let $y = y_1 \in F$
and $|y_1| \le 1$. By what we have just remarked, there exists $x_1 \in E$ such
that $\lambda x_1 = y_1$ and $|x_1| \le 1$. Let
  $y_2 = y_1 - \varphi x_1 = (\lambda - \varphi)x_1$.
Then
  $|y_2| \le |\lambda - \varphi||x_1| \le r$.
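There exists $x_2 \in E$ such that $\lambda x_2 = y_2$ and $|x_2| \le r$. Let
  $y_3 = y_2 - \varphi x_2 = (\lambda - \varphi)x_2$.
Then
  $|y_3| \le r^2$.
There exists $x_3 \in E$ such that $\lambda x_3 = y_3$ and $|x_3| \le r^2$. Continuing
inductively, we find $x_n$ such that $\lambda x_n = y_n$ and
  $|x_n| \le r^{n-1}$.
Then
  $y_1 = \varphi x_1 + \cdots + \varphi x_n + y_{n+1}$.
If we let
  $x = \sum_{n=1}^{\infty} x_n$,
then $|x| \le \frac{1}{1-r}$, and $\varphi x = y_1$, thus proving our theorem.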
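The closing estimate of the proof can be written out as follows. This is a sketch which assumes the bounds $|x_n| \le r^{n-1}$ and $|y_{n+1}| \le r^n$ that the induction is meant to produce, together with the recursion $y_{n+1} = y_n - \varphi x_n$.

```latex
% Assumptions from the induction: |x_n| \le r^{n-1}, |y_{n+1}| \le r^n, 0 < r < 1,
% and y_{n+1} = y_n - \varphi x_n, so that y_1 = \varphi x_1 + ... + \varphi x_n + y_{n+1}.
\begin{align*}
|x| &\le \sum_{n=1}^{\infty} |x_n| \le \sum_{n=1}^{\infty} r^{n-1} = \frac{1}{1-r}, \\
\varphi x &= \lim_{n \to \infty} (\varphi x_1 + \cdots + \varphi x_n)
   = \lim_{n \to \infty} (y_1 - y_{n+1}) = y_1 .
\end{align*}
```

The series for $x$ converges because $E$ is complete, and the second line uses the continuity of $\varphi$.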
Theorem 3.5 (Surjective Mapping Theorem (Graves)). Let $U$ be open
in a Banach space $E$. Let $f\colon U \to F$ be a $C^1$ map into a Banach space
$F$. Let $x_0 \in U$. If $f'(x_0)$ is surjective, then $f$ is locally open in a
neighborhood of $x_0$. More precisely, there exists an open neighborhood
$V$ of $x_0$ contained in $U$ having the following property. For each
$x \in V$ and open ball $B_x$ centered at $x$, contained in $V$, the image $f(B_x)$
contains an open neighborhood of $f(x)$.
Proof. After a translation, we may assume that $x_0 = 0$ and $f(x_0) = 0$.
Also by the preceding theorem, it will suffice to prove that if $B$ is an
open ball centered at 0 in $E$, then $f(B)$ contains an open neighborhood of
0 in $F$. Let $\lambda = f'(0)$. By the open mapping theorem and homogeneity,
there exists $C > 0$ having the following property. Given $y \in F$ with $|y| \le 1$,
there exists $x \in E$ such that $\lambda x = y$ and $|x| \le C|y|$. Changing the norm
on $F$ to an equivalent norm, we may assume without loss of generality
that $C = 1$. Let $0 < r < 1$. Taking $B$ with sufficiently small radius, by
the mean value theorem we have for $x, z \in \frac{1}{1-r}B$:
(*)  $|f(x) - f(z) - \lambda(x - z)| \le r|x - z|$.
It will suffice to prove that
  $f\bigl(\frac{1}{1-r}B\bigr) \supset \lambda B$,
and we now prove this. Let $x_1 \in B$ and $\lambda x_1 = y_1$. Let $y_2 = \lambda x_1 - f(x_1)$.
By (*) we find
  $|y_2| \le r|x_1|$.
There exists $x_2$ with $|x_2| \le |y_2| \le r|x_1|$, such that $\lambda x_2 = y_2$. Then
  $x_1 + x_2 \in (1 + r)B$,
and by (*) we find
  $|\lambda x_1 - f(x_1 + x_2)| = |f(x_1) - f(x_1 + x_2) + \lambda x_2| \le r|x_2| \le r^2|x_1|$.
Let $y_3 = \lambda x_1 - f(x_1 + x_2)$. There exists $x_3$ with $|x_3| \le |y_3| \le r^2|x_1|$, such
that $\lambda x_3 = y_3$. Then
  $x_1 + x_2 + x_3 \in (1 + r + r^2)B$.
We have
  $|y_1 - f(x_1 + x_2 + x_3)|$
    $= |\lambda x_1 - f(x_1 + x_2) + f(x_1 + x_2) - f(x_1 + x_2 + x_3)|$
    $= |\lambda x_3 + f(x_1 + x_2) - f(x_1 + x_2 + x_3)|$,
so that we get
  $|y_1 - f(x_1 + x_2 + x_3)| \le r|x_3| \le r^3|x_1|$.
Inductively, we find $x_n$ such that $\lambda x_n = y_n$, $|x_n| \le r^{n-1}|x_1|$, and
  $|y_1 - f(x_1 + \cdots + x_n)| \le r^n|x_1|$.
We can then find $x_{n+1}$ such that $|x_{n+1}| \le r^n|x_1|$ and
  $\lambda x_{n+1} = y_1 - f(x_1 + \cdots + x_n)$.
Then
  $|y_1 - f(x_1 + \cdots + x_{n+1})|$
    $= |y_1 - f(x_1 + \cdots + x_n) + f(x_1 + \cdots + x_n) - f(x_1 + \cdots + x_{n+1})|$
    $\le r|x_{n+1}| \le r^{n+1}|x_1|$.
We let
  $x = \sum_{n=1}^{\infty} x_n$,
and we see that $f(x) = y_1$. Furthermore, $x \in \frac{1}{1-r}B$, thus proving our
theorem.
CHAPTER XVI
The Spectrum
In this chapter we give basic facts about the spectrum of an element in a

Banach algebra. Under certain circumstances, we represent such an alge-
bra as the algebra of continuous functions on its spectrum, which is
defined as the space of its maximal ideals (or the space of characters), to
be given the weak topology. In the next chapters, we shall deal with
spectral theorems corresponding to more specific examples of Banach
algebras. The proofs in the later chapters are independent of those in the
present chapter, except for Theorem 1.2 and its corollaries. Thus the rest
of this chapter may be bypassed. On the other hand, the general repre-
sentation of a Banach algebra as an algebra of continuous functions on
the spectrum gives a nice application of the Stone-Weierstrass theorem,
and is useful in other contexts besides the spectral theorems which form
the remainder of the book, so I have included the basic results to pro-
vide a suitable background for applications not included in this book.
XVI, §1. THE GELFAND-MAZUR THEOREM
Let A be a Banach algebra over the complex numbers. We assume that

$A$ has a unit element $e$. Let $v \in A$. The set of complex numbers $z$ such
that $v - ze$ is not invertible is called the spectrum of $v$ and is denoted by
$\sigma(v)$. We shall investigate special cases in Chapters XVII and XVIII.
Here we shall prove that the spectrum is not empty. Before doing that,
we make a simple remark concerning the spectrum.
Theorem 1.1. The spectrum of an element $v \in A$ is a closed and bounded
set in $\mathbf{C}$. In fact, if $z$ is in the spectrum, then $|z| \le |v|$.
Proof. We show that the complement is open. Let $z_0$ be a complex
number such that $v - z_0 e$ is invertible. If $z$ is sufficiently close to $z_0$,
then $v - ze$ is invertible because the set of invertible elements is open, so
the spectrum is closed. Furthermore, if $|z| > |v|$, then $|v/z| < 1$, and hence
$e - v/z$ is invertible by Theorem 2.1 of Chapter IV, §2, so that $v - ze$ is
also invertible, as contended.
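The inverse used in the last step can be written explicitly as a geometric series. The following is a sketch under the assumptions $|z| > |v|$ and $|e| = 1$, using only the submultiplicativity of the norm and the completeness of $A$.

```latex
% Assumptions: |z| > |v| (so |v/z| < 1), |e| = 1, A complete.
\begin{align*}
(e - v/z)^{-1} &= \sum_{n=0}^{\infty} (v/z)^n, \\
(v - ze)^{-1} &= -z^{-1}(e - v/z)^{-1} = -\sum_{n=0}^{\infty} z^{-n-1} v^n, \\
|(v - ze)^{-1}| &\le \frac{1}{|z|} \sum_{n=0}^{\infty} \Bigl(\frac{|v|}{|z|}\Bigr)^n
   = \frac{1}{|z| - |v|} .
\end{align*}
```

In particular $(v - ze)^{-1} \to 0$ as $z \to \infty$.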
Theorem 1.2. Let $A$ be a commutative normed algebra over the real
numbers, with unit element $e$. Assume that there exists an element $j \in A$
such that $j^2 = -e$. Let $C = \mathbf{R} + \mathbf{R}j$. Given $v \in A$, $v \ne 0$, there exists an
element $c \in C$ such that $v - ce$ is not invertible in $A$.
Proof (Tornheim). Assume that $v - ze$ is invertible for all $z \in C$. Consider
the mapping $f\colon C \to A$ defined by
  $f(z) = (v - ze)^{-1}$.
Then $f$ is continuous, and for $z \ne 0$ we have
  $f(z) = z^{-1}(vz^{-1} - e)^{-1}$.
From this we see that $f(z)$ approaches 0 when $z$ goes to infinity in $C$.
Hence the map $z \mapsto |f(z)|$ is a continuous map of $C$ into the real numbers
$\ge 0$, is bounded, and is small outside some large circle. Hence it has a
maximum, say $M$. Let $D$ be the set of elements $z \in C$ such that $|f(z)| = M$.
Then $D$ is not empty, $D$ is bounded, and is closed. We shall prove
that $D$ is open, whence a contradiction.
Let $c_0$ be a point of $D$, which, after a translation, we may assume to
be the origin. We shall see that if $r$ is real $> 0$, then all points on the
circle of radius $r$ lie in $D$. Indeed, consider the sum
  $S(n) = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{v - \alpha^k r}$,
where $\alpha$ is a primitive $n$-th root of unity, say $\alpha = e^{2\pi i/n}$. Let $t$ be a
variable. Taking formally the logarithmic derivative of
  $t^n - r^n = \prod_{k=1}^{n} (t - \alpha^k r)$
shows that
  $\frac{n t^{n-1}}{t^n - r^n} = \sum_{k=1}^{n} \frac{1}{t - \alpha^k r}$,
and hence, dividing by $n$ and by $t^{n-1}$, and substituting $v$ for $t$, we obtain
  $S(n) = \frac{1}{v - r(r/v)^{n-1}}$.
If $r$ is small (say $|r/v| < 1$), then we see that
  $\lim_{n \to \infty} |S(n)| = |v^{-1}| = M$.
Suppose that there exists a complex number $\xi$ of absolute value 1 such
that
  $\left| \frac{1}{v - \xi r} \right| < M$.
Then there exists an interval on the unit circle near $\xi$, and there exists
$\epsilon > 0$ such that for all roots of unity $\zeta$ lying in this interval, we have
  $\left| \frac{1}{v - \zeta r} \right| < M - \epsilon$.
(This is true by continuity.) Let us take $n$ very large. Let $b_n$ be the
number of $n$-th roots of unity lying in our interval. Then $b_n/n$ is approximately
equal to the length of the interval (over $2\pi$). We can express $S(n)$
as a sum
  $S(n) = \frac{1}{n} \left[ \sum_{\mathrm{I}} \frac{1}{v - \alpha^k r} + \sum_{\mathrm{II}} \frac{1}{v - \alpha^k r} \right]$,
the first sum $\sum_{\mathrm{I}}$ being taken over those roots of unity $\alpha^k$ lying in our
interval, and the second sum being taken over the others. Each term in
the second sum has norm $\le M$ because $M$ is a maximum. Hence we
obtain the estimate
  $|S(n)| \le \frac{1}{n} \left[ \left|\sum_{\mathrm{I}}\right| + \left|\sum_{\mathrm{II}}\right| \right] \le \frac{1}{n} \bigl( b_n(M - \epsilon) + (n - b_n)M \bigr) = M - \frac{b_n}{n}\epsilon$.
This contradicts the fact that the limit of $|S(n)|$ is equal to $M$, and proves
our theorem.
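The closed formula for $S(n)$ used in the proof can be checked by hand in the smallest case. The sketch below takes $n = 2$, so that $\alpha = -1$; this choice is made here only for illustration.

```latex
% Check of the logarithmic-derivative identity for n = 2 (\alpha = -1), t a formal variable.
\begin{align*}
\frac{1}{t + r} + \frac{1}{t - r} &= \frac{2t}{t^2 - r^2}, \\
S(2) = \frac{1}{2} \Bigl( \frac{1}{v + r} + \frac{1}{v - r} \Bigr)
   &= \frac{v}{v^2 - r^2} = \frac{1}{v - r\,(r/v)} .
\end{align*}
```

This agrees with $S(n) = 1/(v - r(r/v)^{n-1})$ for $n = 2$. The fractions make sense because $A$ is commutative and $v$, $v \pm r$ are invertible under the standing assumption of the proof.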
Corollary 1.3 (Gelfand-Mazur Theorem). Let K be a normed field over

the reals. Then K = R or K = C.
Proof. Assume first that $K$ contains $\mathbf{C}$. Then the theorem implies that
$K = \mathbf{C}$. If $K$ does not contain $\mathbf{C}$, in other words does not contain a
square root of $-1$, we let $E = K(j)$ where $j^2 = -1$. (One can give a
formal definition of the field $E$ as one defines the complex numbers from
ordered pairs of real numbers. Thus we let $E$ consist of pairs $(x, y)$ with
$x, y \in K$, and define multiplication in $E$ as if $(x, y) = x + yj$. This makes
$E$ into a field.) We can define a norm on $E$ by putting
  $|x + yj| = |x| + |y|$
for $x, y \in K$. Then $E$ is a normed $\mathbf{R}$-space. Furthermore, if $z = x + yj$
and $z' = x' + y'j$ are in $E$, then
  $|zz'| = |xx' - yy'| + |xy' + x'y|$
    $\le |xx'| + |yy'| + |xy'| + |x'y|$
    $\le |x||x'| + |y||y'| + |x||y'| + |x'||y|$
    $\le (|x| + |y|)(|x'| + |y'|)$
    $\le |z||z'|$.
Hence we have defined a norm on $E$. We can apply Theorem 1.2 to
conclude the proof.
Corollary 1.4. The spectrum of an element in any complex Banach
algebra (commutative or not) with unit element is not empty.
Proof. If $A$ is a Banach algebra with unit, and if $v \in A$, then the
closure of the algebra generated by $e$ and $v$ is a commutative Banach
algebra. Hence we can apply the theorem to it.
We shall see later in this chapter that under fairly general conditions,
a Banach algebra is isomorphic to the algebra of continuous functions on
a compact set. This set is obtained in a natural way, namely it is the
maximal ideal space of A.
Finally, we remark that the fact that the spectrum is not empty can
also be proved by quoting an elementary theorem about analytic func-
tions of a complex variable, namely that a bounded analytic function is
constant. The proof runs as follows. Suppose that we have an element $v$
in our algebra such that $(v - ze)^{-1}$ exists for all complex $z$. Then
certainly the map
  $z \mapsto (v - ze)^{-1}$
is not constant, and hence there exists a functional $\lambda$ on the algebra such
that the map
  $z \mapsto \lambda[(v - ze)^{-1}] = f(z)$
is not constant. However, it is immediately verified that this map is
differentiable (i.e. complex differentiable), and since $(v - ze)^{-1} \to 0$ as
$z \to \infty$, it follows that $f$ is bounded, a contradiction which proves what
we wanted.
We will need to consider Banach-valued analytic functions somewhat
further in the context of the remark above. Let $E$ be a Banach space,
and let $U$ be open in $\mathbf{C}$. Let $f\colon U \to E$ be a mapping. We define $f$ to be
analytic if given $z_0 \in U$ there exist elements $a_n \in E$ such that $f(z)$ is equal
to a convergent power series
  $f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$
for $z$ in some disc of positive radius centered at $z_0$. As usual in complex
analysis, convergent here means absolutely convergent. Observe that for
every functional $\lambda$ on $E$, we have
  $\lambda \circ f(z) = \sum_{n=0}^{\infty} \lambda(a_n)(z - z_0)^n$.
Thus $\lambda \circ f$ is an analytic function in the usual sense of complex analysis.
By the Hahn-Banach theorem, and the usual uniqueness of a power
series expansion for complex valued analytic functions, we also get the
uniqueness for Banach-valued analytic functions, and we get the uniqueness
of analytic continuation. Furthermore, a number of theorems
from complex analysis are valid in the Banach-valued case, and their
proofs can be carried out by using the Hahn-Banach theorem. For
instance for an analytic function on a closed disc of radius $R$, we have
Cauchy's integral formula
  $f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta$,
where $C$ is the circle of radius $R$ centered at the origin.

Indeed, the integral on the right is the simple-minded integral as
defined in calculus, Chapter XIII, §1. To prove that both sides of the
formula are equal, by the Hahn-Banach theorem it suffices to prove that
applying every functional to one side is equal to the functional applied to
the other side. We can use Proposition 1.1 of Chapter XIII to see this.
The Cauchy integral representation allows us to prove in the Banach-valued
case that if $f$ is analytic on a closed disc of radius $R$, then the
radius of convergence of its power series centered at 0 is at least equal to
$R$. Indeed, the usual proof works by using the geometric series, writing
  $\frac{1}{\zeta - z} = \frac{1}{\zeta} \cdot \frac{1}{1 - z/\zeta} = \frac{1}{\zeta} \sum_{n=0}^{\infty} (z/\zeta)^n$.
Thus we get the same integral formula for the coefficients of the power
series for $f$ at 0 as in the complex valued cases.
We leave it to the reader to verify that if $A$ is a Banach algebra with
unit $e$, and $x \in A$, then the function
  $f(z) = (x - ze)^{-1}$
is analytic on the open set of complex numbers $z$ such that $x - ze$ is
invertible.
The same arguments as in ordinary complex analysis show that the
radius of convergence of a power series $\sum a_n z^n$ is $1/\limsup |a_n|^{1/n}$. We
shall be concerned with a special power series as follows.
Let $A$ be a Banach algebra with unit element $e$. We denote by $\sigma(x)$
the spectrum of an element $x \in A$. We are interested in bounds for the
spectrum. The essential structure will be derived from the power series
  $\sum_{n=0}^{\infty} x^n z^n$
in the variable $z$, with coefficients in $A$. The next lemmas describe its
radius of convergence more precisely.
Lemma 1.5. Let $A$ be a normed algebra. Let $x \in A$. Then
  $\lim_{n \to \infty} |x^n|^{1/n}$
exists, and is equal to $\inf |x^n|^{1/n}$.
Proof. Without loss of generality, we may assume that $x^n \ne 0$ for all
positive integers $n$. Let $c_n = |x^n| > 0$. We obviously have
  $c_{m+n} \le c_m c_n$
for all positive integers $m$, $n$. We put $c_0 = 1$. Fix $m$. Then any integer $n$
can be written in the form $n = q(n)m + r(n)$ with $0 \le r(n) < m$. Then
  $c_n^{1/n} \le c_m^{q(n)/n} c_{r(n)}^{1/n}$.
But $r(n)$ is bounded by $m$, and $\lim q(n)/n = 1/m$. Hence
  $\limsup_{n \to \infty} c_n^{1/n} \le c_m^{1/m}$.
This inequality holds for all positive integers $m$, and hence
  $\limsup_{n \to \infty} c_n^{1/n} \le \inf_m c_m^{1/m} \le \liminf_{n \to \infty} c_n^{1/n}$.
In light of the lemma, we define the spectral radius
  $\rho(x) = \lim_{n \to \infty} |x^n|^{1/n}$.
The reason for the name will become clear from Theorem 1.7 below.
We are now ready to consider the power series
  $h_x(z) = \sum_{n=0}^{\infty} x^n z^n$
with $x^n z^n \in A$ for all complex numbers $z$. We are interested in the domain
of convergence of the series.
Lemma 1.6. Suppose $A$ is a Banach algebra with unit. Then the radius
of convergence of the series $h_x$ is $1/\rho(x)$.
Proof. This is immediate from the usual arguments in the theory of a
complex variable, which show in the present context that the radius of
convergence is $1/\limsup |x^n|^{1/n}$. We can then apply Lemma 1.5.
In Theorem 1.1 we saw that the spectrum $\sigma(x)$ is compact and contained
in the disc $|z| \le |x|$. We now prove a more precise inequality.
Theorem 1.7. Let $A$ be a Banach algebra with unit $e \ne 0$. Then
  $\rho(x) = \sup_{z \in \sigma(x)} |z|$.
Proof. Suppose $|z| > \rho(x)$. Let
  $\rho(x) < r < |z|$.
Then $|(z^{-1}x)^n| \le (r/|z|)^n$ for large $n$, so the series $\sum (z^{-1}x)^n$ converges and
shows that $z \notin \sigma(x)$. Conversely, let $s = \sup |z|$ for $z \in \sigma(x)$, so $s$ is the
smallest radius for a closed disc centered at the origin and containing the
spectrum $\sigma(x)$. Suppose $w \in \mathbf{C}$ and $|w| > s$. Then $w \notin \sigma(x)$, so $x - we$ is
invertible. Thus the function
  $h(z) = (e - zx)^{-1}$
is analytic on the disc $|z| < s^{-1}$. As we remarked earlier in this section,
this implies that the radius of convergence of the power series
  $h(z) = \sum_{n=0}^{\infty} x^n z^n$
is $\ge s^{-1}$, and therefore
  $1/\rho(x) \ge 1/s$,
whence $\rho(x) \le s$.
This concludes the proof of Theorem 1.7.
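Two small matrices show how the spectral radius can differ from the norm. This is a sketch in which the algebra $M_2(\mathbf{C})$ with the operator norm, and the elements $N$ and $x$, are assumptions chosen for illustration.

```latex
% Assumption: A = M_2(\mathbf{C}) with the operator norm for the Euclidean norm on \mathbf{C}^2.
\begin{align*}
N &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}: &
   |N| &= 1, \quad N^2 = 0, \quad \rho(N) = \lim |N^n|^{1/n} = 0, \quad \sigma(N) = \{0\}; \\
x &= \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}: &
   x^n &= \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}, \quad
   n \le |x^n| \le n + 2, \quad \rho(x) = 1, \quad \sigma(x) = \{1\}.
\end{align*}
```

In both cases $\rho$ equals the largest absolute value in the spectrum, as Theorem 1.7 asserts, while the norm is strictly larger. The bounds on $|x^n|$ come from applying $x^n$ to the second basis vector and from comparison with the Hilbert-Schmidt norm $\sqrt{n^2 + 2}$.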
XVI, §2. THE GELFAND TRANSFORM
Throughout, we let $A$ be a commutative Banach algebra with unit element
$e \ne 0$, over the complex numbers.
We wish to represent elements of $A$ as continuous functions on a certain
space, whose points are going to be the maximal ideals of $A$. We do this
in a sequence of lemmas.
Lemma 2.1. Let $M$ be a maximal ideal of $A$. Then $M$ is closed.
Proof. By definition, $M \ne A$ and so $e \notin M$. By Theorem 2.1 of Chapter
IV, there is an open neighborhood $U$ of $e$ consisting of invertible
elements, and $U$ is contained in the complement of $M$, so $U$ is contained
in the complement of the closure $\overline{M}$, thus proving the lemma.
Lemma 2.2. An element $u \in A$ is invertible if and only if $u$ is not
contained in any maximal ideal.
Proof. It is clear that an invertible element is not contained in a
maximal ideal (which, by definition, is $\ne A$). Conversely, suppose $u$ is
not contained in any maximal ideal. Let $J = Au$ be the ideal generated
by $u$. If $J = A$, then $u$ is invertible. If $J \ne A$, then by Zorn's lemma, $J$ is
contained in a maximal ideal $M$. Indeed, let $S$ be the set of ideals $I$
containing $J$ and $\ne A$. Then $S$ is ordered by inclusion, and $S$ is inductively
ordered, for let $\{I_k\}$ be a totally ordered subset. Let $I = \bigcup I_k$.
Then it is immediately verified that $I$ is an ideal, and $I \ne A$, for otherwise
$I$ contains the unit element, and this unit element must therefore lie
in some ideal $I_k$, contrary to the assumption that $I_k \ne A$ for all $k$. This
shows that if $J \ne A$ then $u$ is contained in some maximal ideal, a contradiction
which proves the lemma.
Let $\lambda\colon A \to \mathbf{C}$ be a functional, i.e. a continuous linear map into the
scalars $\mathbf{C}$. If in addition $\lambda$ satisfies the multiplicative conditions
  $\lambda(xy) = \lambda(x)\lambda(y)$ and $\lambda(e) = 1$,
then we call $\lambda$ a character of $A$. Thus a character is a Banach-algebra
homomorphism (rather than a Banach space homomorphism) into the
scalars $\mathbf{C}$. The kernel of such a character is a maximal ideal, because
its image is the field $\mathbf{C}$. Conversely, by the theorem of Gelfand-Mazur
(Corollary 1.3) if $M$ is a maximal ideal, then $A/M$ is naturally isomorphic
to $\mathbf{C}$, and therefore we obtain a character
  $\lambda_M\colon A \to A/M = \mathbf{C}$
which to each element $x \in A$ associates its residue class $x + M$ in $A/M = \mathbf{C}$.
We then obtain:
Proposition 2.3. The association $M \mapsto \lambda_M$ is a bijection between the set
of maximal ideals and the set of characters of $A$. For each character $\lambda$,
we have $|\lambda| \le 1$.
Proof. The first statement has been proved. As to the second, let us
first prove that if $|x| \le 1$ then $|\lambda(x)| \le 1$. If $|\lambda(x)| > 1$, then $|\lambda(x^n)| \to \infty$
as $n \to \infty$, but $|x^n| \le 1$, which contradicts the continuity of $\lambda$. Thus
$|\lambda(x)| \le 1$. Then for any $x \ne 0$, let $y = x/|x|$ so $|y| = 1$. Then $|\lambda(y)| \le 1$,
so $|\lambda| \le 1$, thus proving the proposition.
In light of the fact that $\lambda = \lambda_M$ for some $M$, we see that we may
rephrase Proposition 2.3 as follows. If $x \in A$ and $c \in \mathbf{C}$ is such that
$x \equiv c \bmod M$, and if $|x| \le 1$, then $|c| \le 1$.
The Banach algebra $A$ being a Banach space has its dual $A'$. But it
also has the set of characters, which we denote by $\hat{A}$, and $\hat{A} \subset A'_1$ where
$A'_1$ is the unit ball in $A'$. In Chapter IV, §1 we defined the weak
topology on $A'_1$ by embedding $A'_1$ in a product space
  $f\colon A'_1 \hookrightarrow \prod_{|x| \le 1} K_x \subset \prod_{x \in A} \mathbf{C}_x$.
In particular, we obtain the weak topology on $\hat{A}$ because as we have just
seen, we have an inclusion of $\hat{A}$ in the unit ball of $A'$:
  $\hat{A} \subset A'_1$.
Let $\mathcal{M}$ be the set of maximal ideals of $A$. In light of the bijection of $\mathcal{M}$
with $\hat{A}$, we have a natural embedding (the restriction of $f$)
  $g\colon \hat{A}$ or $\mathcal{M} \hookrightarrow \prod_{|x| \le 1} K_x$ by $\lambda \mapsto (g_x(\lambda))$,
where for each $x \in A$ and $\lambda \in \hat{A}$ we have $g_x(\lambda) = \lambda(x)$.
Theorem 2.4. The image $g(\hat{A})$ is closed and therefore $\hat{A}$ is compact in
the weak topology.
Proof. The argument is basically the same as the argument used to
prove Theorem 1.4 of Chapter IV. If $\gamma$ is in the closure of $g(\hat{A})$, then one
proves as before by a continuity argument that $\gamma$ satisfies the condition
  $\gamma(xy) = \gamma(x)\gamma(y)$,
just as we proved $\gamma(x + y) = \gamma(x) + \gamma(y)$. We leave the routine to the
reader.
Since each function $g_x$ ($x \in A$) is continuous on $\mathcal{M} \approx \hat{A}$ by definition of
the weak topology, the map
  $x \mapsto g_x$
is a mapping from $A$ to $C(\mathcal{M}, \mathbf{C}) \approx C(\hat{A}, \mathbf{C})$. It is immediately verified
that this map is a ring homomorphism, i.e.
  $g_{x+y} = g_x + g_y$ and $g_{xy} = g_x g_y$.
Also $g_{cx} = c g_x$ for $c \in \mathbf{C}$, so we say that $x \mapsto g_x$ is an algebra homomorphism.
Thus we have represented the elements of $A$ as continuous functions
on the maximal ideal space, or on the character space depending
on which notion one wishes to emphasize. This representation is called
the Gelfand transform. The kernel of the homomorphism
  $A \to C(\mathcal{M}, \mathbf{C}) \approx C(\hat{A}, \mathbf{C})$
is obviously the intersection of all maximal ideals, which is called the
radical of $A$. For general Banach algebras, one cannot say anything
further, but in the next section, we shall give a criterion when this kernel
is 0. In such a case, $A$ "is" the algebra of continuous functions on a
compact space, obtained from $A$ in a natural way.
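The kernel of the Gelfand transform can indeed be non-zero. The following sketch uses a two dimensional commutative algebra; the algebra, the matrix $N$, and the notation $\hat{A}$ for the set of characters are assumptions made for this illustration.

```latex
% Assumption: A = \{ aI + bN : a, b \in \mathbf{C} \} \subset M_2(\mathbf{C}) with the operator norm,
% where N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, so N^2 = 0 and A is commutative with unit I.
\begin{align*}
\lambda(N)^2 &= \lambda(N^2) = \lambda(0) = 0 \;\Longrightarrow\; \lambda(N) = 0
   && \text{for every character } \lambda, \\
\lambda(aI + bN) &= a, \qquad \hat{A} = \{\lambda\}, \qquad
   \operatorname{Ker}(x \mapsto g_x) = \mathbf{C} N \ne 0 .
\end{align*}
```

Here the maximal ideal space is a single point, and the radical is the line spanned by the nilpotent element $N$.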
XVI, §3. C*-ALGEBRAS
Let $A$ be a complex Banach algebra. By an involution on $A$ we mean a
map $x \mapsto x^*$ of $A$ into itself satisfying:
  $(x + y)^* = x^* + y^*$,  $(\alpha x)^* = \bar{\alpha} x^*$,
  $(xy)^* = y^* x^*$,  $x^{**} = x$.
If $A$ has a unit element $e$, then the reader will verify at once that $e^* = e$.
If further $x$ is invertible, then $(x^{-1})^* = (x^*)^{-1}$.
By a C*-algebra, we mean a Banach algebra with an involution as
above, satisfying the additional condition
  $|x^* x| = |x|^2$
for all $x \in A$.
We suppose $A$ is a C*-algebra for the rest of this section. Observe that
from the defining condition, we get
  $|x|^2 = |x^* x| \le |x^*||x|$,
whence $|x| \le |x^*|$.
Since $x = x^{**}$, it follows that
  $|x| = |x^*|$.
Examples. The standard example of a commutative C*-algebra is that
of the algebra of continuous functions on a compact Hausdorff space,
with the sup norm. The involution is given by the complex conjugate,
$f^* = \bar{f}$. Theorem 3.3 below shows that under certain conditions, there is
no other.
Let H be a Hilbert space, and let A = End(H) be the algebra of
bounded linear maps of H into itself (operators). Then A is a C*-algebra,
the star operation being the adjoint. In Chapter XVIII, we shall consider
the commutative subalgebra generated by one hermitian operator and
reprove the basic theorem independently of the result in this section.
For another example of an involution (which does not satisfy the
condition of a C*-algebra), see Exercise 9.
We have assumed the existence of a unit element in the algebra A for

simplicity. When there is no unit element a priori, one may embed A in
another algebra with unit, i.e. adjoin a unit element, and thereby reduce
the study of an involution to the case when a unit element is present.
We leave this construction as Exercises 5 through 8.
Proposition 3.1. Let $A$ be a C*-algebra with unit. Let $x \in A$. If $x = x^*$
then the spectrum $\sigma(x)$ is real and $\rho(x) = |x|$.
Proof. Let $z \in \sigma(x)$. For all real $t$ it follows that $z + it \in \sigma(x + ite)$. By
Theorem 1.7 we obtain:
  $|z + it|^2 \le |x + ite|^2 \le |(x + ite)(x - ite)|$
    $\le |x^2 + t^2 e|$
    $\le |x|^2 + t^2$
using the condition of a C*-algebra, and the convention that $|e| = 1$.
Write $z = a + ib$ with $a$, $b$ real. Then for all real $t$ we find
  $a^2 + (b + t)^2 \le |x|^2 + t^2$, that is, $a^2 + b^2 + 2bt \le |x|^2$,
which is possible only if $b = 0$, so $z$ is real, and the spectrum is real.
As to the statement about the spectral radius, from the condition
defining a C*-algebra, we have
  $|x^2| = |x^* x| = |x|^2$, and by induction $|x^{2^n}| = |x|^{2^n}$,
and we simply use the definition of the spectral radius to conclude the
proof.
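A concrete instance of the proposition, together with an element showing that the hypothesis $x = x^*$ matters, can be sketched as follows. The algebra $M_2(\mathbf{C})$ with the operator norm and the conjugate transpose as involution, and the matrices $J$ and $N$, are assumptions chosen for illustration.

```latex
% Assumption: A = M_2(\mathbf{C}), operator norm, x^* = conjugate transpose of x.
\begin{align*}
J &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = J^*: &
   J^2 &= I, \quad \sigma(J) = \{1, -1\} \subset \mathbf{R}, \quad \rho(J) = 1 = |J|; \\
N &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \ne N^*: &
   N^2 &= 0, \quad \rho(N) = 0 < 1 = |N| .
\end{align*}
```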
Proposition 3.2. Let $A$ be a commutative C*-algebra with unit. Let $\lambda$ be
a character. Then $\lambda(x^*) = \overline{\lambda(x)}$ for all $x \in A$.
Proof. If $x = x^*$ then the assertion follows from Proposition 3.1 because
$\lambda(x)$ is in the spectrum of $x$. For $x$ arbitrary, we decompose $x$ into
a sum
  $x = u + iv$ with $u = \frac{1}{2}(x + x^*)$ and $v = \frac{1}{2i}(x - x^*)$.
Then $u$, $v$ satisfy $u = u^*$ and $v = v^*$. Furthermore
  $\lambda(x^*) = \lambda(u - iv) = \lambda(u) - i\lambda(v) = \overline{\lambda(u) + i\lambda(v)} = \overline{\lambda(x)}$,
which proves the proposition.
We now come to the main structure theorem.
Theorem 3.3 (Gelfand-Naimark). Let $A$ be a commutative C*-algebra
with unit. Then the intersection of all maximal ideals of $A$ is 0, and the
map $x \mapsto g_x$ gives a norm-preserving isomorphism of $A$ on $C(\hat{A}, \mathbf{C})$.
Proof. It follows as an immediate consequence of the Stone-Weierstrass
theorem that the image $g(A)$ in $C(\hat{A})$ is dense. Since $A$ is assumed
complete, it will suffice to prove that the map $g$ is norm-preserving. Let
$x \in A$ and write $g(x)$ instead of $g_x$. Then by Proposition 3.2,
  $g(x^* x) = g(x^*)g(x) = \overline{g(x)}g(x) = |g(x)|^2$.
But the element $y = x^* x$ satisfies $y = y^*$. By Theorem 1.7 and Proposition
3.1, we find
  $|g(y)| = \rho(y) = |y|$,
so $|g(x)| = |x|$ by the definition of a C*-algebra, thus concluding the
proof.
XVI, §4. EXERCISES
1. Let $A$ be a complex Banach algebra with unit element, and let $u \in A$. Let
$\sigma(u)$ be the spectrum of $u$. Let $p$ be a polynomial with complex coefficients.
Show that the spectrum of $p(u)$ is equal to $p(\sigma(u))$, i.e. to the set of all
numbers $p(\alpha)$, where $\alpha$ lies in the spectrum of $u$. [Hint: For one inclusion
write
  $p(t) - p(\alpha) = (t - \alpha)q(t)$
for some polynomial $q$, and for the other, write
  $p(t) - \alpha = (t - \alpha_1) \cdots (t - \alpha_n)$
if $\alpha$ is in the spectrum of $p(u)$.] Of course, the result applies especially if $u$ is
an operator on a Banach space $E$.
2. Let $E$ be a Banach space and let $F$, $G$ be two closed subspaces such that
$E = F \oplus G$ is their direct sum. Let $A$ be an operator on $E$ and assume that
$F$, $G$ are $A$-invariant. Let $A_F$ and $A_G$ denote the restrictions of $A$ to $F$ and $G$
respectively. Let $\sigma(A)$ denote the spectrum of $A$.
(a) Let $\alpha$ be a complex number. Show that $A - \alpha I$ is invertible if and only if
$A_F - \alpha I_F$ and $A_G - \alpha I_G$ are invertible.
(b) Show that
  $\sigma(A) = \sigma(A_F) \cup \sigma(A_G)$.
3. Let $A$ be a complex Banach algebra with unit element $e$ and let $u \in A$. Show
that the map
  $z \mapsto (u - ze)^{-1}$
is (complex) differentiable and analytic on the complement of the spectrum of
$u$. One calls $R(u, z) = (u - ze)^{-1}$ the resolvent of $u$.
Run through systematically some of the basic theorems of complex analysis
and prove them in the Banach-valued case if they are true. (Liouville's
theorem, Cauchy's theorem, etc.)
4. Let $\psi$ be the function defined for real $t$ by
  $\psi(t) = e^{it}$
so that $\psi$ can be viewed as a function on the unit circle. Let $A$ be the set of
all functions which can be written as infinite sums
  $f = \sum_{n=-\infty}^{\infty} c_n \psi^n$
with $c_n$ complex, satisfying
  $\sum_{n=-\infty}^{\infty} |c_n| < \infty$.
For the norm defined by
  $\|f\| = \sum |c_n|$,
show that $A$ is a Banach algebra under ordinary addition and multiplication
of functions. Prove that if $f(t) \ne 0$ for all $t$, then $f$ is invertible in $A$. [Hint:
It suffices to prove that for any character $\lambda$ of $A$ we have $\lambda(f) \ne 0$. Show
that $|\lambda(\psi)| = 1$ so that $\lambda(\psi) = \psi(t_0)$ for some $t_0$.]
5. Adjunction of a unit element. Let $A$ be a normed algebra. We embed $A$ in a
normed algebra with unit as follows. Let $\tilde{A} = \mathbf{C} \times A$, with addition defined
componentwise and multiplication defined by $(z, x)(w, y) = (zw, zy + wx + xy)$. Define
  $|(z, x)| = |z| + |x|$ for $z \in \mathbf{C}$ and $x \in A$.
Prove that $\tilde{A}$ is a normed algebra, with unit $(1, 0)$, and containing an isomorphic
image of $A$ as the subset of elements $(0, x)$ with $x \in A$. Show that $A$ is
an ideal in $\tilde{A}$. Warning: If $A$ happened to have a unit element, then this unit
is not the same one as the unit in $\tilde{A}$.
6. Suppose $A$ is a C*-algebra but without our assuming that $A$ has a unit
element. Prove that there exists on $\tilde{A}$ a norm extending the norm on $A$
which makes $\tilde{A}$ into a C*-algebra with unit. (Warning: It is not the norm of
Exercise 5.) [Hint: Observe that for $x \in A$, we have
  $|x| = \sup_{y \in A,\ y \ne 0} \frac{|xy|}{|y|}$
and so for $x \in \tilde{A}$ define $|x|$ by the same formula. Since $A$ is an ideal in $\tilde{A}$, it
follows that $xy \in A$ for $x \in \tilde{A}$ and $y \in A$. Prove that this definition gives a
norm on $\tilde{A}$, and that this norm satisfies the condition for a C*-algebra.]
7. Let $A$ be a Banach algebra with an involution, and let $B$ be a C*-algebra.
Let $h\colon A \to B$ be an algebraic homomorphism (no condition is put with respect
to the norm) such that $h(x^*) = h(x)^*$. Prove that
  $|h(x)| \le |x|$ for all $x \in A$.
8. Using Exercise 7, prove that the norm on $\tilde{A}$ defined in Exercise 6 is the
unique norm extending the norm on $A$, and making $\tilde{A}$ into a C*-algebra.
9. With the same notation as Exercise 10 of Chapter XII, if $m \in M^1(G)$, define
$$m^* = \bar{m}^{\vee} = \overline{m^{\vee}}.$$
Show that this star operation is an involution, and that $\|m\| = \|m^*\|$.
10. Let A be a C*-algebra with unit, and let $x \in A$ be such that $x^* = x^{-1}$. Show
that the spectrum of x lies on the unit circle.
11. Let A be a Banach algebra with unit e. Let $v \in A$. Show that there exists a
unique $C^\infty$ mapping $K\colon \mathbf{R} \to A$ such that:
H 1. We have $\frac{d}{dt} K(t) = K(t)v$.
H 2. $K(0) = e$.
Show that the image of K lies in the multiplicative group of invertible elements
of A.
CHAPTER XVII
Compact and Fredholm Operators
The operators in infinite dimensional spaces closest to operators in finite
dimensional spaces are the compact operators, which will now be studied
systematically. A large number of examples of compact operators are
given in the exercises.
XVII, §1. COMPACT OPERATORS
We recall that a subset of a topological space is said to be relatively

compact if its closure is compact. We had proved a convenient criterion
for this (Corollary 3.9 of Chapter II), namely:
Let X be a subset of a complete normed vector space. Assume that

given r > 0 there exists a finite covering of X by balls of radius r.
Then X is relatively compact.
This criterion will be used frequently in this chapter.
Let E, F be normed vector spaces (not necessarily complete) and let
$$u\colon E \to F$$
be a linear map. We say that u is compact if u maps bounded sets in E
into relatively compact sets in F. Equivalently, we can say that u maps
the unit ball in E into a relatively compact set in F. It is then clear that
u must be continuous, because if B is the unit ball in E, then u(B) has
compact closure, whence is bounded. It is also clear that our definition
is equivalent to saying that if $\{x_n\}$ is a bounded sequence in E, then
$\{ux_n\}$ has a convergent subsequence.
Examples. If E or F is finite dimensional, then u is compact. Consequently,
if just the image of u is finite dimensional, then u is compact.
Since a locally compact Banach space is finite dimensional, by Corollary
3.15 of Chapter II, it follows that the identity map of an infinite
dimensional Banach space is not compact.
In a later section, we shall prove that the following type of operator is
compact. Let K(x, y) be a continuous function on the rectangle $a \le x \le b$
and $c \le y \le d$. If f is continuous on [a, b], we define
$$Sf(y) = \int_a^b K(x, y) f(x)\, dx.$$
It will be shown later that S is compact. Thus our theory applies to the
study of this type of integral equation.
We denote by K(E, F) the set of compact linear maps of E into F.
Theorem 1.1. The compact linear mappings from E to F form a vector
space. If F is complete, then K(E, F) is a closed subspace of L(E, F).
Proof. If X, Y are compact in F then X + Y is compact, being the
continuous image of the compact set $X \times Y$ under the map $(x, y) \mapsto x + y$.
If B is the unit ball in E, then it follows that for $u, v \in K(E, F)$ the set
$\overline{u(B)} + \overline{v(B)}$ is compact. But then
$$(u + v)(B) \subset u(B) + v(B) \subset \overline{u(B)} + \overline{v(B)},$$
so u + v is compact.
Since u(cB) = cu(B) for any scalar c, it follows that K(E, F) is a vector
space. To show it is closed in L(E, F) when F is complete, let u be in its
closure. It will suffice to prove that u(B) is covered by a finite number of
open balls of given radius r. Let $v \in K(E, F)$ be such that $|u - v| < r/2$.
Since v is compact, we know that v(B) is covered by a finite number of
open balls of radius r/2, centered say at points $y_1, \ldots, y_n$. For each $x \in B$
we then have
$$|u(x) - v(x)| < r/2 \quad \text{and} \quad |v(x) - y_i| < r/2$$
for some i. This implies that $|u(x) - y_i| < r$, and hence that u(B) is
covered by a finite number of balls of radius r, as was to be shown.
Remark 1. Let F be a Banach space. It follows from Theorem 1.1 that
if $\{u_n\}$ is a sequence of elements of L(E, F) such that the image of $u_n$ is
finite dimensional for all n, and if $\{u_n\}$ converges to an element u of
L(E, F), then u is compact. A compact operator cannot always be expressed
as the limit of such a sequence: Enflo constructed a Banach space lacking
the approximation property. It does hold for compact operators in Hilbert space.
Remark 2. We gave the definition of compact mappings on spaces
which are not necessarily complete. Note that if $u\colon E \to F$ is compact,
and if $\bar{E}$, $\bar{F}$ denote the completions of E and F respectively, then the
linear continuous extension
$$\bar{u}\colon \bar{E} \to \bar{F}$$
is also compact. This is immediate. Furthermore, if $E_1$ is any subspace
of E and $F_1 \supset u(E_1)$, then the restriction
$$u|E_1\colon E_1 \to F_1$$
is also compact.
Theorem 1.2. Let E, F, G, H be normed vector spaces and let
$$f\colon E \to F, \qquad u\colon F \to G, \qquad g\colon G \to H$$
be continuous linear maps. If u is compact then $u \circ f$ and $g \circ u$ are
compact. In particular, K(E, E) is a two-sided ideal of L(E, E).
Proof. The first assertion is obvious, since f maps bounded sets into
bounded sets. The second follows from the fact that a continuous image
of a compact set is compact. The third comes from the definitions.
Theorem 1.3. Let E, F be Banach spaces and $u\colon E \to F$ a compact linear
map. Then $u'\colon F' \to E'$ is compact.
Proof. One can give a direct simple proof, but the reader will note
that our assertion is an immediate consequence of the Ascoli theorem.
We shall make no use of Theorem 1.3 in this book, and thus we leave
the details to the reader.
XVII, §2. FREDHOLM OPERATORS AND THE INDEX
Let E, F be normed vector spaces. A continuous linear map
$$T\colon E \to F$$
is said to be Fredholm if:
(i) Ker T is finite dimensional.
(ii) Im T is closed and finite codimensional.
Example (The Shift Operator). Let E be the Hilbert space of all
sequences $\alpha = \{a_n\}$ such that $\sum |a_n|^2$ converges. (This is essentially the
space of Fourier series.) We define
$$T\colon E \to E$$
by
$$T\alpha = (a_2, a_3, \ldots)$$
if $\alpha = (a_1, a_2, \ldots)$. The kernel of T is 1-dimensional, and T is surjective.
There are variants of this operator, for instance the operator such that
$$T\alpha = (0, a_1, a_2, \ldots),$$
which has 0 kernel, and whose image has codimension 1.
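The two shift operators can be tried out on finitely supported sequences. The sketch below is only an illustration: it models a sequence by a Python list of its first few terms (all later terms being 0), and the names `left_shift` and `right_shift` are chosen here, not taken from the text.

```python
def left_shift(a):
    """T(a_1, a_2, a_3, ...) = (a_2, a_3, ...): drop the first coordinate."""
    return a[1:]


def right_shift(a):
    """T(a_1, a_2, ...) = (0, a_1, a_2, ...): insert a zero in front."""
    return [0] + a


a = [3, 1, 4, 1, 5]

# The left shift kills exactly the sequences supported on the first
# coordinate, so its kernel is 1-dimensional.
kernel_vector = [1, 0, 0]
print(left_shift(kernel_vector))          # [0, 0]

# Left shift after right shift is the identity, so the left shift is
# surjective and the right shift is injective.
print(left_shift(right_shift(a)) == a)    # True

# Right shift after left shift forgets a_1: the image of the right shift
# is the codimension-1 subspace of sequences whose first term is 0.
print(right_shift(left_shift(a)))         # [0, 1, 4, 1, 5]
```

Finite lists only model sequences with finitely many non-zero terms, so this is a check of the algebra and not of anything about convergence in E.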

We shall show in Theorem 2.1 that if u is compact, then I - u is
Fredholm. In a later section, we shall give other examples of Fredholm
operators, as integral or differential operators. The reader may look at
these now to see the concrete applications of our algebra to analysis.
We shall use constantly the corollaries of Theorem 1.3, Chapter XV,
which the reader is advised to review carefully. The results expressed
in these corollaries, most of which depend on the open mapping theo-
rem, will be quoted without further specific reference. We note in par-
ticular that as a consequence of these corollaries, when E, F are Banach
spaces, then the hypothesis for Fredholm maps T that Im T is closed
follows from the finite codimensionality, and could thus be omitted from
the definition of a Fredholm map in this case.
We shall also use the fact that a finite dimensional subspace of a
Banach space admits a closed complement. This was an exercise using
the Hahn-Banach theorem.
Theorem 2.1. Let E be a Banach space, and $u\colon E \to E$ a compact operator.
Then I - u is Fredholm.
Proof. The identity I restricted to the kernel of I - u is equal to u,
and is consequently compact. Hence this kernel is finite dimensional,
because a locally compact normed vector space is finite dimensional.
Now we show that the image of I - u is closed. Let T = I - u. Let G
be a closed complement for Ker T, so that
$$E = \operatorname{Ker}(I - u) \oplus G.$$
We obtain continuous linear maps
$$T|G\colon G \to E \quad \text{and} \quad u|G\colon G \to E,$$
the restrictions of T and u to G. Furthermore, the kernel of T|G is {0}.
It will suffice to prove that TG = TE is closed, and for this it will suffice
to prove that the inverse map
$$(T|G)^{-1}\colon TG \to G$$
is continuous. (Indeed, in that case, TG is complete, so closed.) It even
suffices to prove that $(T|G)^{-1}$ is continuous at 0, by linearity. Suppose
that this is not the case. Then we can find a sequence $\{x_n\}$ in G such
that $Tx_n \to 0$, but $\{x_n\}$ does not converge to 0. Selecting a suitable
subsequence, we can assume without loss of generality that $|x_n| \ge r > 0$
for all n. Then $1/|x_n| \le 1/r$ for all n, and consequently $T(x_n/|x_n|)$ also
converges to 0. Furthermore, $x_n/|x_n|$ has norm 1, and hence some subsequence of
$$\{u(x_n/|x_n|)\}$$
converges. Since
$$x_n/|x_n| = T(x_n/|x_n|) + u(x_n/|x_n|),$$
it follows that a subsequence of $\{x_n/|x_n|\}$ converges to some element z in G,
also having norm 1. But then 0 = z - u(z), and Tz = 0. This contradicts
the fact that $G \cap \operatorname{Ker} T = \{0\}$, and thus proves that TE = TG is closed.
Finally we have to show that TE has finite codimension. We shall
need the following lemma, which will also be used later in the spectral
theorem.
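Lemma 2.2. Given $\varepsilon > 0$. Let F be a closed subspace of a normed vector
space H, and assume that $F \neq H$. Then there exists $x \in H$ with |x| = 1
such that
$$d(x, F) = \inf_{y \in F} |x - y| \ge 1 - \varepsilon.$$
Proof. Let $z \in H$ and $z \notin F$. Select $y_0 \in F$ such that
$$|z - y_0| \le \Bigl(\inf_{y \in F} |z - y|\Bigr)(1 + \varepsilon).$$
We let
$$x = \frac{z - y_0}{|z - y_0|}.$$
Then for $y \in F$ we have
$$|x - y| = \left| \frac{z - y_0}{|z - y_0|} - y \right| = \frac{\bigl| z - y_0 - |z - y_0|\, y \bigr|}{|z - y_0|} \ge \frac{|z - y_0|}{|z - y_0|(1 + \varepsilon)} = \frac{1}{1 + \varepsilon} \ge 1 - \varepsilon,$$
which proves our lemma.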
To apply the lemma, suppose that TE does not have finite codimension.
We can find a sequence of closed subspaces
$$TE = H_0 \subset H_1 \subset \cdots \subset H_n \subset \cdots$$
such that each $H_n$ is closed and of codimension 1 in $H_{n+1}$ just by adding
one-dimensional spaces to TE inductively. By the lemma, we can find in
each $H_n$ an element $x_n$ such that $|x_n| = 1$ and $|x_n - y| \ge 1 - \varepsilon$ for all
$y \in H_{n-1}$. Then for all k < n:
$$|ux_n - ux_k| = |x_n - Tx_n - x_k + Tx_k| \ge 1 - \varepsilon$$
because $-Tx_n - x_k + Tx_k$ lies in $H_{n-1}$. This shows that the sequence
$\{ux_n\}$ cannot have a convergent subsequence, and contradicts the compactness
of u, thus proving Theorem 2.1.
We denote by Fred(E, F) the set of Fredholm operators from E into
F. If $T \in \operatorname{Fred}(E, F)$, then we define the index of T to be
$$\operatorname{ind} T = \dim \operatorname{Ker} T - \dim F/TE.$$
In the language of linear algebra, the factor space F/TE is also called the
cokernel of T, and thus
$$\operatorname{ind} T = \dim \operatorname{Ker} T - \dim \operatorname{coker} T.$$
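In finite dimensions every linear map is Fredholm, and the index depends only on the dimensions of the spaces: for a linear map from $\mathbf{R}^n$ to $\mathbf{R}^m$ of rank k, the kernel has dimension n - k and the cokernel has dimension m - k, so the index is n - m. The following sketch checks this on small matrices. The helper names `rank` and `index` are illustrative, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the column below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def index(matrix):
    """ind T = dim Ker T - dim coker T for T: R^n -> R^m given by an m x n matrix."""
    m, n = len(matrix), len(matrix[0])
    rk = rank(matrix)
    dim_kernel = n - rk      # rank-nullity theorem
    dim_cokernel = m - rk    # codimension of the image in R^m
    return dim_kernel - dim_cokernel


zero_map = [[0, 0, 0], [0, 0, 0]]        # R^3 -> R^2, rank 0
projection = [[1, 0, 0], [0, 1, 0]]      # R^3 -> R^2, rank 2
print(index(zero_map), index(projection))  # 1 1
```

Both maps have index 3 - 2 = 1 although their kernels and cokernels differ, which is the finite-dimensional shadow of the invariance of the index under perturbation.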
Theorem 2.3. Let E, F be Banach spaces. Then Fred(E, F) is open in
L(E, F), and the function $T \mapsto \operatorname{ind} T$ is continuous on Fred(E, F), hence
constant on connected components.
Proof. Let $S\colon E \to F$ be a Fredholm operator. We wish to prove that
if $T \in L(E, F)$ is close to S, then T itself is Fredholm. Let N be the
kernel of S, and let G be a closed complement for N, that is $E = N \oplus G$.
Then S induces a toplinear isomorphism of G on its image SG (by the
open mapping theorem), and we can write $F = SG \oplus H$ for some finite
dimensional subspace H. The map
$$G \times H \to SG \oplus H = F$$
given by
$$(x, y) \mapsto Sx + y$$
is a toplinear isomorphism. We know that the set of toplinear isomorphisms
of one Banach space into another is open in the space of all
continuous linear maps. If T is close to S, then the map
$$G \times H \to TG \oplus H = F$$
given by
$$(x, y) \mapsto Tx + y$$
is therefore also a toplinear isomorphism. Hence the kernel of T is finite
dimensional, since $G \cap \operatorname{Ker} T = \{0\}$, and in fact dim Ker T is at most
equal to the codimension of G in E. The image of T has finite codimension
(at most equal to the dimension of H), and is consequently closed,
by Corollary 1.8 of Chapter XV. This proves that T is Fredholm, and
proves our first assertion.
Now concerning the index, we observe that $G \oplus \operatorname{Ker} T$ is a direct sum
of two closed subspaces, and there is some finite dimensional subspace M
such that
$$E = G \oplus \operatorname{Ker} T \oplus M.$$
Then T induces a toplinear isomorphism
$$G \oplus M \to T(G \oplus M) = TG \oplus TM,$$
and
$$\dim M = \dim TM.$$
Hence we get
$$\begin{aligned}
\operatorname{ind} T &= \dim \operatorname{Ker} T - (\dim H - \dim TM) \\
&= \dim \operatorname{Ker} T + \dim M - \dim H \\
&= \dim \operatorname{Ker} S - \dim H \\
&= \operatorname{ind} S.
\end{aligned}$$
Corollary 2.4. Let E be a Banach space and u a compact operator on
E. If I - u is injective (i.e. Ker(I - u) = {0}), then I - u is a toplinear
automorphism.
Proof. For each real t, the operator tu is compact. The map $t \mapsto tu$ is
continuous, and so is the map
$$t \mapsto \operatorname{ind}(I - tu).$$
Hence this map is constant. Letting t = 0 and t = 1 shows that
$$\operatorname{ind}(I - u) = 0.$$
Hence I - u is surjective, whence a toplinear isomorphism by the open
mapping theorem.
Note. Examples of compact operators u furnish immediately examples
of Fredholm operators I - u. For other examples, cf. for instance
Smale's paper [Sm 3]. One obtains Fredholm linear maps by taking the
derivative of certain "Fredholm" non-linear maps, which are of interest in
differential equations. Thus one sees the linearization provided by the
derivative as a first step in analyzing non-linear problems.
Let T, S be continuous linear maps $E \to F$. We say that T is congruent
to S modulo compact operators if T - S is compact, and we write
$$T \equiv S \mod K(E, F).$$
This congruence is an equivalence relation, and if $T \equiv S$, $T_1 \equiv S_1$, then
$TT_1 \equiv SS_1$. This is immediately verified as a consequence of Theorem
1.2. Of course, the composition $TT_1$ (or $SS_1$) must make sense. It means
that we compose $T_1\colon E_1 \to E$ with T as above, and similarly with $SS_1$.
Similar congruence statements hold for sums.
We say that $T\colon E \to F$ is invertible modulo compact operators if there
exists a continuous linear map $T_1\colon F \to E$ such that
$$TT_1 \equiv I_F \mod K(F, F) \quad \text{and} \quad T_1 T \equiv I_E \mod K(E, E).$$
Thus we call $T_1$ an inverse of T modulo compact operators.
Theorem 2.5. Let E, F be Banach spaces and let $T\colon E \to F$ be a continuous
linear map. Then T is Fredholm if and only if T is invertible
modulo compact operators K(E, F). We can select an inverse of T
modulo compact operators, having finite codimensional image.
Proof. Let T be Fredholm, and write direct sum decompositions
$$E = \operatorname{Ker} T \oplus G, \qquad F = \operatorname{Im} T \oplus H$$
with closed subspaces G, H. We let S be the composite
$$F = \operatorname{Im} T \oplus H \xrightarrow{\ \mathrm{pr}\ } \operatorname{Im} T \xrightarrow{\ (T|G)^{-1}\ } G \xrightarrow{\ \mathrm{inc.}\ } E,$$
where pr is the projection, and inc. is the inclusion. Then $I_F - TS$ is the
projection on H, and $I_E - ST$ is the projection on Ker T. This proves
that T has an inverse modulo compact operators. Conversely, suppose
that S is such an inverse; then we have
$$\operatorname{Ker} T \subset \operatorname{Ker} ST,$$
and ST is Fredholm by Theorem 2.1, so T has finite dimensional kernel. Also
$$\operatorname{Im} T \supset \operatorname{Im} TS,$$
and TS has a closed image of finite codimension by Theorem 2.1. Hence
Im T has finite codimension and is therefore closed, so that T is Fredholm.
Note. As an exercise, prove the usual uniqueness of an inverse, that is,
suppose that there exist continuous linear maps $T_1$, $T_2$ such that
$$TT_1 \equiv I_F \mod K(F, F)$$
and
$$T_2 T \equiv I_E \mod K(E, E).$$
Show that $T_1 \equiv T_2 \mod K(F, E)$, and that $T_1$ or $T_2$ is thus an inverse for
T modulo compact operators.
Corollary 2.6. The composite of Fredholm maps is Fredholm. If T is
Fredholm and u is compact, then T + u is Fredholm.
Proof. Clear.
Corollary 2.7. If T is Fredholm and u is compact, then
$$\operatorname{ind}(T + u) = \operatorname{ind} T.$$
Proof. The same proof works as for the corollary of Theorem 2.3,
namely we connect T + u with T by the segment
$$T + tu, \qquad 0 \le t \le 1.$$
The next theorem will not be used later in a significant way and thus
its proof can be omitted if the reader is allergic to formal algebra.
Theorem 2.8. Let E, F, G be Banach spaces, and let
$$S\colon E \to F \quad \text{and} \quad T\colon F \to G$$
be Fredholm. Then
$$\operatorname{ind} TS = \operatorname{ind} T + \operatorname{ind} S.$$
Proof. To do this proof properly, we need an algebraic lemma. Let V
be a vector space and W a subspace. Let
$$f\colon V \to f(V)$$
be a linear map, with image f(V), which we also write fV for simplicity
of notation. If the factor space V/W is finite dimensional, we denote by
(V : W) the dimension of the factor space V/W. We denote by $V_f$ the
kernel of f in V, and by $W_f$ the kernel of f in W, that is $W \cap V_f$.
Lemma 2.9. Let V be a vector space and W a subspace. Let $f\colon V \to fV$
be a linear map. Then
$$(V : W) = (fV : fW) + (V_f : W_f)$$
in the sense that if two of these indices are finite, then so is the third,
and the stated relation holds.
Proof. Consider the composite of linear maps
$$V \to fV \to fV/fW.$$
The kernel certainly contains $V_f + W$. If $x \in V$ lies in the kernel, this
means that there exists some $y \in W$ such that f(x) = f(y), and then
f(x - y) = 0, so x - y lies in $V_f$. Hence the kernel is precisely equal to
$V_f + W$. Hence we obtain an isomorphism
$$V/(V_f + W) \approx fV/fW. \tag{1}$$
We have inclusions of subspaces
$$W \subset V_f + W \subset V. \tag{2}$$
We consider the linear map
$$V_f \to (V_f + W)/W$$
given by
$$x \mapsto \text{class of } x \text{ modulo } W.$$
An element of the kernel is such that it also lies in W, so that we obtain
an isomorphism
$$V_f/W_f \approx (V_f + W)/W. \tag{3}$$
From this we see at once that if two of our indices are finite, so is
the third. Indeed, suppose that (V : W) and (fV : fW) are finite. Then
$(V : V_f + W)$ is finite by (1) and hence $(V_f : W_f)$ is finite by (3). The
others are proved similarly. As for the relation concerning the dimensions,
we see that whenever our indices are finite, then
$$(V : W) = (V : V_f + W) + (V_f + W : W). \tag{4}$$
If we now use (1) and (3), we get the relation stated in the lemma, as was
to be shown.
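A small finite-dimensional instance may help to see how the three indices fit together. The spaces and the map below are chosen only for illustration and do not come from the text.

```latex
Let $V = \mathbf{R}^2$, let $W = \mathbf{R}e_1$, and let $f(x_1, x_2) = x_2$, so that $fV = \mathbf{R}$.
\begin{align*}
(V : W) &= 2 - 1 = 1, \\
fW &= \{0\}, & (fV : fW) &= 1, \\
V_f &= \mathbf{R}e_1, \quad W_f = W \cap V_f = \mathbf{R}e_1, & (V_f : W_f) &= 0,
\end{align*}
and indeed $(V : W) = 1 = 1 + 0 = (fV : fW) + (V_f : W_f)$.
```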
We return to the proof of Theorem 2.8.
For simplicity of notation, we use our notation $E_S$ for the kernel of S
in E. We write down the definitions of the index for T, S and TS:
$$\begin{aligned}
\operatorname{ind} S &= (E_S : 0) - (F : SE), \\
\operatorname{ind} T &= (F_T : 0) - (G : TF), \\
\operatorname{ind} TS &= (E_{TS} : 0) - (G : TSE).
\end{aligned}$$
We have the following inclusions:
$$\{0\} \subset \operatorname{Ker} S \subset \operatorname{Ker} TS$$
because Sx = 0 implies TSx = 0, and also
$$TSE \subset TF \subset G.$$
Hence
$$(E_{TS} : 0) = (E_{TS} : E_S) + (E_S : 0) \tag{5}$$
and
$$(G : TSE) = (G : TF) + (TF : TSE). \tag{6}$$
We apply our lemma to the spaces $SE \subset F$, and to the map T. We then
get
$$(F : SE) = (TF : TSE) + (F_T : SE \cap F_T). \tag{7}$$
From the inclusions
$$\{0\} \subset SE \cap F_T \subset F_T$$
we obtain
$$(F_T : 0) = (F_T : SE \cap F_T) + (SE \cap F_T : 0). \tag{8}$$
If we now substitute the values of (5), (6), (7), (8) into the expression for
$$\operatorname{ind} TS - \operatorname{ind} T - \operatorname{ind} S,$$
we obtain
$$(E_{TS} : E_S) - (SE \cap F_T : 0),$$
and we have to show that this is equal to 0. Let us write $E_{TS}$ as a direct
sum
$$E_{TS} = E_S \oplus W$$
for some finite dimensional W. Then we can write E as a direct sum
$$E = E_{TS} \oplus U = E_S \oplus W \oplus U.$$
Then
$$SE = S(W \oplus U) = SW \oplus SU.$$
We contend that $SE \cap F_T = SW$. Indeed, it is clear that $SW \subset F_T$, and
conversely, if $y \in SE$, y = Sx and TSx = 0, then $x = x_1 + x_2$ with $x_1 \in E_S$
and $x_2 \in W$, and $Sx = Sx_2 \in SW$, whence $SE \cap F_T = SW$. But then
$$(E_{TS} : E_S) = (W : 0) = (SW : 0)$$
because S is an isomorphism on W. This concludes the proof of our
theorem.
XVII, §3. SPECTRAL THEOREM FOR COMPACT OPERATORS
Throughout this section, we let E be a Banach space, and let $u\colon E \to E$ be a
compact operator.
We are interested in the spectrum of u. We recall that a number $\alpha$ is
called an eigenvalue for u if there exists a non-zero vector $x \in E$ such that
$$ux = \alpha x.$$
In that case we call x an eigenvector for u, belonging to $\alpha$.
If $\alpha$ is a number $\neq 0$, then $\alpha u$ is compact and so is $\alpha^{-1}u$. Hence
$u - \alpha I$ and $\alpha I - u$ are Fredholm, by Theorem 2.1. Furthermore, for
every positive integer n, the operator $(I - u)^n$ can be written as
$$(I - u)^n = I - u_1$$
for some compact $u_1$, because we expand with the binomial expansion,
and use Theorem 1.2. Hence $(u - \alpha I)^n$ is also Fredholm for $\alpha \neq 0$.
As in the proof of Corollary 2.4, we know that for $\alpha \neq 0$,
$$\operatorname{ind}(u - \alpha I)^n = 0,$$
in other words,
$$\dim \operatorname{Ker}(u - \alpha I)^n = \dim \operatorname{coker}(u - \alpha I)^n.$$
Theorem 3.1. Let $\alpha$ be a number $\neq 0$. Either $u - \alpha I$ is invertible, or $\alpha$
is an eigenvalue of u. In other words, every element of the spectrum of
u is an eigenvalue, except possibly 0. If E is infinite dimensional, then 0
is in the spectrum.
Proof. By Corollary 2.4 we know that if $u - \alpha I$ has kernel {0}, then
$u - \alpha I$ is invertible. Thus our first statement is essentially merely a reformulation
of this corollary. If 0 is not in the spectrum, then u is
invertible, and then the image of the closed unit ball by u is homeomorphic
to this unit ball and is compact, so that E is locally compact, and
hence finite dimensional, thereby proving Theorem 3.1.
In the theory of a finite dimensional vector space V, with an endomorphism
$u\colon V \to V$, one knows that we can decompose V into a direct sum
$$V = N_1 \oplus \cdots \oplus N_s$$
such that each $N_i$ corresponds to an eigenvalue $\alpha_i$ of u, and such that for
each i, there exists an integer $r_i$ having the property that
$$(u - \alpha_i I)^{r_i} N_i = 0.$$
As one says, $u - \alpha_i I$ is nilpotent on $N_i$. When that is the case, a theorem
like the Jordan normal form theorem gives a canonical matrix representing
u with respect to a suitable basis, namely blocks consisting of triangular
matrices of type
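$$\begin{pmatrix}
\alpha_i & 1 & 0 & \cdots & 0 \\
0 & \alpha_i & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & \alpha_i & 1 \\
0 & 0 & \cdots & 0 & \alpha_i
\end{pmatrix}$$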
Thus the decomposition of V into subspaces as above yields complete
information concerning u. We shall now do this for compact operators.
Of course, we get infinitely many subspaces in the decomposition, corresponding
to possibly infinitely many eigenvalues.
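Lemma 3.2. Let $\alpha$ be a non-zero eigenvalue for u. Then there exists an
integer r > 0 such that
$$\operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^n$$
for all $n \ge r$.
Proof. Replacing u by $\alpha^{-1}u$, it suffices to prove that $\operatorname{Ker}(I - u)^r = \operatorname{Ker}(I - u)^n$, for all
$n \ge$ some r. Suppose that this is not the case. Then we have a strictly
ascending chain of subspaces
$$\operatorname{Ker}(I - u) \subsetneq \operatorname{Ker}(I - u)^2 \subsetneq \cdots \subsetneq \operatorname{Ker}(I - u)^n \subsetneq \cdots.$$
By Lemma 2.2, we can find an element
$$x_n \in \operatorname{Ker}(I - u)^n$$
such that $|x_n| = 1$ and $x_n$ is at distance $\ge 1 - \varepsilon$ from $\operatorname{Ker}(I - u)^{n-1}$. Let
T = I - u. Then just as in this lemma, we find for k < n:
$$|ux_n - ux_k| = |x_n - Tx_n - x_k + Tx_k| \ge 1 - \varepsilon$$
because $Tx_n + x_k - Tx_k$ lies in $\operatorname{Ker}(I - u)^{n-1}$. This contradicts the compactness of u
and proves the lemma.
It is clear that if $\operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^{r+1}$ for some r, then
$$\operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^n$$
for all $n \ge r$. We call the smallest integer r for which this is true the
exponent of $\alpha$.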
Theorem 3.3. Let $\alpha$ be a non-zero eigenvalue of u, and let r be its
exponent. Then we have a direct sum decomposition
$$E = \operatorname{Ker}(u - \alpha I)^r \oplus \operatorname{Im}(u - \alpha I)^r,$$
and each of the spaces occurring in this direct sum is a closed invariant
subspace of u. If $\beta \neq \alpha$ is another non-zero eigenvalue of u, and s is its
exponent, then
$$\operatorname{Ker}(u - \beta I)^s \subset \operatorname{Im}(u - \alpha I)^r.$$
Proof. Let $T = (u - \alpha I)^r$. Then T is Fredholm. Both Ker T and Im T
are u-invariant closed subspaces. Furthermore, we have
$$\operatorname{Ker} T \cap \operatorname{Im} T = \{0\}.$$
Indeed, suppose that $x \in \operatorname{Ker} T \cap \operatorname{Im} T$. We can write x = Ty for some
$y \in E$. Since Tx = 0 we get $T^2 y = (u - \alpha I)^{2r} y = 0$. Since
$$\operatorname{Ker} T = \operatorname{Ker} T^2$$
by the lemma, we conclude that $y \in \operatorname{Ker} T$, and therefore x = 0. Finally,
since the index of T is equal to 0, it follows that codim Im T = dim Ker T.
Since $\operatorname{Ker} T \oplus \operatorname{Im} T$ is already a direct sum decomposition of
some subspace of E, it follows that this subspace must be all of E. This
proves our first assertion. Let now $\beta \neq \alpha$ be another eigenvalue for u,
and let $S = (u - \beta I)^s$. Then ST = TS, so Ker T and Im T are S-invariant
subspaces. Let $x \in \operatorname{Ker} S$. We can write
$$x = y + z$$
uniquely with $y \in \operatorname{Ker} T$ and $z \in \operatorname{Im} T$. Then
$$0 = Sx = Sy + Sz,$$
and since $Sy \in \operatorname{Ker} T$, $Sz \in \operatorname{Im} T$, it follows from the uniqueness of the
decomposition that Sy = 0. But S, T are obtained as relatively prime
polynomials in u, and hence there exist polynomials in u, namely P and
Q, such that
$$PS + QT = I.$$
(We recall the proof below.) Applying this to y shows that Iy = 0 so that
y = 0 and hence $x = z \in \operatorname{Im} T$, thus proving our theorem.
Now to recall the proof of the existence of P, Q, let $A = u - \alpha I$ and
$B = u - \beta I$. There exist constants a, b such that
$$aA + bB = I.$$
We take n sufficiently large, and raise both sides to the n-th power. We
obtain
$$\sum_{j=0}^{n} c_j (aA)^j (bB)^{n-j} = I,$$
where the $c_j$ are the binomial coefficients.
If we take n sufficiently large, then $j \ge r$ or $n - j \ge s$, and thus the
existence of P, Q follows as desired.
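The constants a and b can be written down explicitly, since A and B differ by a scalar multiple of the identity. The short computation below is a worked check added for illustration and is not part of the original argument.

```latex
\begin{align*}
A - B &= (u - \alpha I) - (u - \beta I) = (\beta - \alpha) I, \\
\text{so}\quad aA + bB &= I \quad\text{with}\quad a = \frac{1}{\beta - \alpha}, \qquad b = -\frac{1}{\beta - \alpha}.
\end{align*}
```

Only the hypothesis $\beta \neq \alpha$ is used here.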
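Theorem 3.4. Assume that there are infinitely many eigenvalues. Then
the eigenvalues $\neq 0$ of u form a denumerable set, and if we order them
as $\alpha_1, \alpha_2, \ldots$ such that
$$|\alpha_1| \ge |\alpha_2| \ge \cdots,$$
then
$$\lim_{i \to \infty} \alpha_i = 0.$$
Proof. Given c > 0 we first show that there is only a finite number of
eigenvalues $\alpha$ such that $|\alpha| \ge c$. If this is not true, then we can find a
sequence of eigenvectors $\{w_n\}$ belonging to distinct eigenvalues $\{\alpha_n\}$ such
that $|w_n| = 1$ and $|\alpha_n| \ge c > 0$ for all n. The vectors $w_1, \ldots, w_n$ are linearly
independent, for otherwise, if $n \ge 2$ and
$$c_1 w_1 + \cdots + c_n w_n = 0,$$
then we apply u to this relation, and get
$$c_1 \alpha_1 w_1 + \cdots + c_n \alpha_n w_n = 0.$$
We divide by $\alpha_1$ and subtract, obtaining
$$c_2\Bigl(\frac{\alpha_2}{\alpha_1} - 1\Bigr) w_2 + \cdots + c_n\Bigl(\frac{\alpha_n}{\alpha_1} - 1\Bigr) w_n = 0.$$
By induction, we could assume $w_2, \ldots, w_n$ linearly independent, and hence
$c_2 = \cdots = c_n = 0$, whence $c_1 = 0$, as was to be shown. We let $H_n$ be the
space generated by $w_1, \ldots, w_n$. By Lemma 2.2 we can find $x_n \in H_n$ such
that $|x_n| = 1$ and
$$|x_n - y| \ge 1 - \varepsilon$$
for all $y \in H_{n-1}$. Then for k < n we get for some $y \in H_{n-1}$:
$$\left| u\Bigl(\frac{x_n}{\alpha_n}\Bigr) - u\Bigl(\frac{x_k}{\alpha_k}\Bigr) \right| = |x_n - y| \ge 1 - \varepsilon.$$
Since the sequence $\{x_n/\alpha_n\}$ is bounded by 1/c, this contradicts the
compactness of u, and proves that the number of
eigenvalues $\alpha$ such that $|\alpha| \ge c$ is finite.
Thus we can order the eigenvalues in a sequence $\{\alpha_1, \alpha_2, \ldots\}$ such that
$$|\alpha_1| \ge |\alpha_2| \ge \cdots,$$
and we get $\lim_{i \to \infty} \alpha_i = 0$. This proves our theorem.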
Let $\{\alpha_i\}$ be the sequence of eigenvalues of u, ordered such that
$|\alpha_i| \ge |\alpha_{i+1}|$. Let $r_i$ be the exponent of $\alpha_i$. We can form the subspaces
$$F_n = \sum_{i=1}^{n} \operatorname{Ker}(u - \alpha_i I)^{r_i},$$
and the sum is direct by Theorem 3.3; each $\operatorname{Ker}(u - \alpha_i I)^{r_i}$ is finite dimensional. Then
we get an ascending sequence of subspaces
$$F_1 \subset F_2 \subset \cdots \subset F_n \subset \cdots.$$
Similarly, each one of these subspaces has a complementary closed subspace
$H_n$ which can be described in various ways. For instance, we can
proceed by induction, assuming that we have already found such a closed
u-invariant subspace $H_n$. Then $u|H_n$ is a compact operator, whose eigenvalues
are precisely $\alpha_j$ for j > n, and we can decompose $H_n$ as in Theorem
3.3. Assuming inductively that
$$H_n = \operatorname{Im} \prod_{i=1}^{n} (u - \alpha_i I)^{r_i},$$
we conclude that the same relation holds when n is replaced by n + 1.
We get a decreasing sequence
$$H_1 \supset H_2 \supset \cdots \supset H_n \supset \cdots$$
and a direct sum decomposition
$$E = F_n \oplus H_n.$$
This decomposition of E as a direct sum is what we call the spectral
theorem for compact operators. Our method of proof is due to Riesz. Cf.
[Di] and [R - N].
We conclude this section with some remarks in case we are given a
compact operator on a normed vector space which is not complete. This
is often the case, when we are given an operator say on $C^\infty$ functions,
and we extend it to the completion of the space of $C^\infty$ functions with
respect to some norm.
Theorem 3.5. Let E be a normed vector space and $u\colon E \to E$ a compact
operator. Let $\bar{E}$ be the completion of E, and let
$$\bar{u}\colon \bar{E} \to \bar{E}$$
be the continuous linear extension of u. Then $\bar{u}$ maps $\bar{E}$ into E itself. If
$\alpha \neq 0$ is an eigenvalue of $\bar{u}$ of exponent r, and $N_\alpha = \operatorname{Ker}(\bar{u} - \alpha I)^r$, then
$N_\alpha$ is contained in E.
Proof. Let $x \in \bar{E}$ and let $\{x_n\}$ be a sequence in E approaching x.
Then $\{ux_n\}$ has a subsequence which converges in E, and hence $\bar{u}x$ lies
in E, thus proving our first assertion. As to the second, let x be an
eigenvector of $\bar{u}$ in $\bar{E}$ belonging to the eigenvalue $\alpha \neq 0$. Then
$$\bar{u}x = \alpha x,$$
whence $x = \alpha^{-1}\bar{u}x$ lies in E. Inductively, suppose that the kernel of $(\bar{u} - \alpha I)^k$
is contained in E, and suppose that $(\bar{u} - \alpha I)^{k+1}x = 0$ for some $x \in \bar{E}$.
Then $(\bar{u} - \alpha I)x = y$ lies in the kernel of $(\bar{u} - \alpha I)^k$ and hence lies in E.
Therefore
$$x = \alpha^{-1}(\bar{u}x - y)$$
lies in E, thus proving our theorem.
XVII, §4. APPLICATION TO INTEGRAL EQUATIONS
We consider a continuous function K(x, y) of two variables ranging over
a square $[a, b] \times [a, b]$. Then we obtain an operator $S_K$ such that
$$S_K f(x) = \int_a^b K(x, y) f(y)\, dy. \tag{1}$$
We shall consider this operator with respect to two norms on the space
E of continuous functions on [a, b].
Case 1. We take E with the sup norm, so that E is a Banach space.
Then $S_K$ is compact.
Indeed, this follows trivially from Ascoli's theorem, because K is uniformly
continuous, and if $\Phi$ is a subset of E, bounded by C > 0, then
estimating the integral in the usual way shows that $S_K(\Phi)$ is bounded,
and for $f \in \Phi$ we have
$$|S_K f(x) - S_K f(x_0)| \le \int_a^b |K(x, y) - K(x_0, y)|\,|f(y)|\, dy < C(b - a)\varepsilon \tag{2}$$
whenever $|x - x_0| < \delta$. Hence $S_K(\Phi)$ is equicontinuous, and our assertion
is proved.
Thus we can apply the spectral theorem for compact operators.
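For a concrete feel of the operator $S_K$, the integral defining it can be approximated by a Riemann sum. The sketch below uses the midpoint rule. The function name `apply_kernel`, the kernel K(x, y) = xy, the test function f(y) = y and the choice of 1000 subintervals are all assumptions made for illustration only.

```python
def apply_kernel(K, f, a, b, x, n=1000):
    """Approximate S_K f(x) = integral from a to b of K(x, y) f(y) dy (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        y = a + (j + 0.5) * h   # midpoint of the j-th subinterval
        total += K(x, y) * f(y)
    return total * h


def K(x, y):
    return x * y   # a continuous kernel on [0, 1] x [0, 1]


def f(y):
    return y


# Exact value: S_K f(x) = x * (integral of y^2 over [0, 1]) = x / 3.
for x in (0.0, 0.3, 0.6, 1.0):
    print(x, apply_kernel(K, f, 0.0, 1.0, x), x / 3)
```

This particular kernel is a product g(x)h(y), so the image of $S_K$ is one-dimensional and compactness is immediate; the interest of the theorem lies in kernels that are not of this form.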
Case 2. We take E with the $L^2$-norm, arising from the hermitian
product
$$\langle f, g \rangle = \int_a^b f(t)\overline{g(t)}\, dt.$$
Then again $S_K$ is compact, even as a linear map of E into itself (even
though E is not yet complete with respect to the $L^2$-norm!).
Proof. Similar to the proof in the preceding case, except that we
estimate the integral by the Schwarz inequality, namely
$$|\langle f, g \rangle| \le \|f\|_2 \|g\|_2.$$
If M is a bound for K on the square, and $K_x$ is the function such that
$K_x(y) = K(x, y)$, then $\|K_x\|_2 \le M(b - a)^{1/2}$, and hence by Schwarz, from
(1) we even get a bound for the sup norm:
$$\|S_K f\| \le M(b - a)^{1/2} \|f\|_2,$$
so that if f lies in an $L^2$-bounded set, then $S_K f$ lies in a $C^0$-bounded set.
Similarly, estimating (2) with the Schwarz inequality shows that if $\Phi$ is an
$L^2$-bounded set, then $S_K(\Phi)$ is equicontinuous, so that we can apply
Ascoli's theorem to $S_K(\Phi)$. Note that $S_K(\Phi)$ in this case is relatively
compact with respect to the sup norm, let alone the $L^2$-norm, even
though we started with a set which was only $L^2$-bounded.
The spectral theorem applies therefore in the present case, and so does
Theorem 3.5, which showed that the finite dimensional spaces corresponding
to the eigenvalues $\neq 0$ actually had bases with elements in E
rather than in the $L^2$-completion of E.
If we take K to be hermitian, for instance real valued and such that
K(x, y) = K(y, x), then we have a Fourier expansion of any function in E
as an $L^2$-convergent series. One can then start playing the same game as
in the ordinary theory of Fourier series, and ask for uniform or pointwise
convergence. We leave this for readers who have a more
direct interest in integral equations to look up.
Several more examples of compact integral operators will be given in
the exercises.
XVII, §5. EXERCISES
1. Let E be the space of $C^\infty$ functions of one variable, periodic of period $2\pi$.
Let D be the derivative. Denote by $E_0$ the space E together with the norm
arising from the hermitian product
$$\langle f, g \rangle_0 = \int_0^{2\pi} f \bar{g},$$
and for each positive integer p denote by $E_p$ the same space but with the
product
(a) If $f \in E$ and $c_k$ is the Fourier coefficient of f with respect to the function
$e^{ikx}$ (k integer), show that $c_k$ goes to 0 like $1/k^2$, for $k \to \infty$. Use integration
by parts.
(b) Show that $T = I - D^2$ is a toplinear isomorphism
by constructing an inverse S, using term by term integration of the

Fourier series.
(c) Show that this inverse is compact.
2. (a) Let E be the normed vector space of continuous functions on [0, 1] with
the sup norm. Let $S\colon E \to E$ be the linear map such that
$$Sf(x) = \int_0^x f(t)\, dt.$$
Show that S is continuous, and that $|S^n|^{1/n} \to 0$ as $n \to \infty$. [Hint: Show
that $|(S^n f)(x)| \le \|f\| x^n / n!$ by induction. You will need some inequality
like $n! \ge n^n e^{-n}$.]
(b) Show that 0 is the only element in the spectrum of S, and that S is
compact.
(c) For each $\alpha \neq 0$, given a continuous function $g \in E$, show that there exists
a continuous function $f \in E$ such that
$$Sf - \alpha f = g.$$
Express f explicitly as an integral involving g and the exponential
function.
3. Let J = [0, 1] and let E be the vector space of all $C^1$ paths $\alpha\colon J \to \mathbf{R}^n$. Let
$\|\ \|$ be the sup norm and $|\ |$ the euclidean norm on $\mathbf{R}^n$. Given two paths $\alpha$, $\beta$
define
$$\langle \alpha, \beta \rangle_0 = \int_0^1 \langle \alpha(t), \beta(t) \rangle\, dt,$$
the product $\langle \alpha(t), \beta(t) \rangle$ being the dot product. Its associated norm is called
the $H^0$-norm on E, and will be denoted by $\|\ \|_0$. Define
(a) Show that this is a positive definite scalar product, and that its norm,
which we call the $H^1$-norm and denote by $\|\ \|_1$, is equivalent to the norm
arising from the scalar product
(b) Show that for $\alpha \in E$ and $s, t \in J$, we have

and
(c) Let $H^0(J, \mathbf{R}^n)$ be the completion of E with respect to the $H^0$-norm, and let
$H^1(J, \mathbf{R}^n)$ be its completion with respect to the $H^1$-norm. Show that the
identity mapping on E induces injective continuous linear maps
and
Show that both these maps are compact.

4. If you have not already done them, do Exercise 18 of Chapter IV and
Exercise 7 of Chapter V.
5. Let U be a bounded open set in a Banach space E and let F be a finite
dimensional space. Let $p \ge 1$. Let $BC^p(U, F)$ be the space of $C^p$ maps
$f\colon U \to F$ such that $D^k f$ is bounded for k = 1, ..., p. Show that the identity
map of $BC^{p+1}(U, F)$ into $BC^p(U, F)$ is a compact operator. [Hint: Use
Ascoli's theorem and the mean value theorem.]
6. Let H, E be Hilbert spaces, and let $A\colon H \to E$ be an operator. Show that A is
compact if and only if A maps weakly convergent sequences into strongly
convergent sequences. (A sequence is said to converge weakly if the sequence
obtained by applying any functional converges. It converges strongly if it
converges in the usual norm of the Hilbert space.) Show also that if A is
compact and $v_n \to 0$ weakly in H, then $Av_n \to 0$ strongly in E. [Hint: You
may use the principle of uniform boundedness, and also the fact that the unit
ball is closed in the weak topology, see Exercise 10 of Chapter IV. Cf.
$SL_2(\mathbf{R})$, Appendix 4, for details of proof.]
7. Let $H$, $E$ be Hilbert spaces and let $A: H \to E$ be a compact linear map. Let
$\{e_i\}$ $(i = 1, 2, \ldots)$ be an orthonormal basis in $H$. Let $H(N)$ be the closed
subspace generated by the $e_i$ with $i \ge N$. Show that given $\varepsilon$, there exists $N$
such that for all $v \in H(N)$, we have
$$|Av| \le \varepsilon |v|.$$
[Hint: Use the preceding exercise. If the conclusion is false, pick a sequence
$v_n \in H(n)$ of unit vectors such that $|Av_n| > \varepsilon$.]
8. Let $H_s$ be the Hilbert space defined in Exercise 6 of Chapter VII. Following
Exercise 8 of that chapter, if $r < s$, prove that the inclusion $H_s \hookrightarrow H_r$ is a
compact linear map.
The next three exercises give examples of compact operators which are called
Hilbert-Schmidt. For a systematic treatment of such operators in general Hilbert
spaces, see Exercises 17 and 18 of Chapter XVIII.
9. Hilbert-Schmidt operators in $L^2$. Let $(X, \mathcal{M}, dx)$ and $(Y, \mathcal{N}, dy)$ be measured
spaces. Assume that $L^2(X)$ and $L^2(Y)$ have countable Hilbert bases.
(a) Show that if $\{\varphi_i\}$ and $\{\psi_j\}$ are Hilbert bases for $L^2(X)$ and $L^2(Y)$, respectively,
then $\{\varphi_i \otimes \psi_j\}$ is a Hilbert basis for $L^2(dx \otimes dy)$.
(b) Let $K \in L^2(dx \otimes dy)$. Prove that the operator $S_K$ from $L^2(Y)$ into $L^2(X)$
given by
$$S_K f(x) = \int_Y K(x, y) f(y)\,dy$$
is compact. [Hint: Prove first that it is bounded, with bound $\|K\|_2$.
Using partial sums for the Fourier expansion of $K$, show that $S_K$ can be
approximated by operators with finite dimensional images. Cf. Theorem 2
of $SL_2(\mathbf{R})$, Chapter I, §3.]
Let $H$ be a Hilbert space with countable Hilbert basis. An operator $A$
on $H$ is said to be Hilbert-Schmidt if there exists a Hilbert basis $\{\varphi_i\}$
such that
$$\sum_i |A\varphi_i|^2 < \infty.$$
(c) If $Y = X$, show that $S_K$ is a Hilbert-Schmidt operator.

(d) Conversely, let $T: L^2(X) \to L^2(X)$ be a Hilbert-Schmidt operator. Show
that there exists $K \in L^2(X \times X)$ such that $T = S_K$.
[Hint: Let $T\varphi_i = \sum_j t_{ij}\varphi_j$. Show that $\sum |t_{ij}|^2 < \infty$, and let
$$K = \sum_{i,j} t_{ij}(\varphi_i \otimes \overline{\varphi}_j).$$
For a more subtle result along these lines, cf. $SL_2(\mathbf{R})$, Theorem 6 of
Chapter XII, §3.]
10. Assume that $(X, \mathcal{M}, dx) = (Y, \mathcal{N}, dy)$ in the preceding exercise, and that $X$ has
finite measure. Let
$$K = \sum_{m,n} c_{m,n}\,\varphi_m \otimes \overline{\varphi}_n$$
be the Fourier series for $K$. Let $P_{m,n}$ be the integral operator defined by the
function $\varphi_m \otimes \overline{\varphi}_n$.
(a) Show that $P_{m,n}\varphi_n = \varphi_m$ and $P_{m,n}\varphi_j = 0$ if $j \ne n$.
(b) Assume that the coefficients $c_{m,n}$ tend to 0 sufficiently rapidly. Show that
$K$ is in $L^1$ and that
$$\int_X K(x, x)\,dx = \sum_n c_{n,n}.$$
(c) Again assume that the coefficients $c_{m,n}$ tend to 0 sufficiently rapidly. Show
that
$$\sum_n \langle S_K\varphi_n, \varphi_n\rangle = \sum_n c_{n,n}.$$
Under suitable convergence conditions, this gives an integral expression
for the "trace" of $S_K$.
(d) If the Fourier coefficients of $K$ tend to 0 sufficiently rapidly, show that
$S_K$ is the product of two Hilbert-Schmidt operators. By definition, this
means $S_K$ is of "trace class". See Exercise 23 of Chapter XVIII. [Hint:
Look at the technique of the next exercise.]
11. Let $T$ again be the circle, viewing functions on $T$ as functions on the reals
which are periodic of period 1. Let $K$ be a $C^\infty$ function on $T \times T$.
(a) Show that the integral operator $S_K$ given by
$$S_K f(x) = \int_T f(y)K(x, y)\,dy$$
is the product of two Hilbert-Schmidt operators.
(b) Show that with the notation of Exercise 10, we have
$$\int_0^1 K(x, x)\,dx = \sum_n c_{n,n}.$$
[Hint: Let $\{\varphi_n\}$ be the Hilbert basis given by $\varphi_n(x) = e^{2\pi i n x}$. Let
$$B = \sum_{m,n} c_{m,n}(1 + n^2)P_{m,n} \quad\text{and}\quad C = \sum_n \frac{1}{1 + n^2}P_{n,n}.$$
Show that $BC = S_K$.]

12. Let $T$ be the circle as before, and for $f \in L^2(T)$ define
$$Cf(x) = \int_T e^{2\pi i(x - t)} f(t)\,dt.$$
(a) Prove that $C$ is a compact operator on $L^2(T)$.

(b) Prove that C* = C (so C is self-adjoint).
(c) Describe the spectrum of C.
13. Prove the following theorem (worked out in SL2(R), Chapter XII, §3).
Theorem. Let $X$ be a locally compact space with a finite positive measure $\mu$.
Let $H$ be a closed subspace of $L^2(X, \mu) = L^2(X)$, and let $T$ be a linear map of
$H$ into the vector space $BC(X)$ of bounded continuous functions on $X$. Assume
that there exists $C > 0$ such that
$$\|Tf\| \le C\|f\|_2 \quad\text{for all } f \in H,$$
where $\|\ \|$ is the sup norm. Then
$$T: H \to L^2(X)$$
is a compact operator, which can be represented by a kernel in $L^2(X \times X)$, as
in Exercise 9.
CHAPTER XVIII
Spectral Theorem for

Bounded Hermitian Operators
This chapter may be viewed as a direct continuation of the linear algebra

in the context of Hilbert space first discussed in Chapter V.
XVIII, §1. HERMITIAN AND UNITARY OPERATORS
Let $E$ be a Hilbert space. We recall that an operator $A: E \to E$ is a
continuous (or bounded) linear map. We defined the adjoint $A^*$ in Chapter V, §2.
An operator $A$ such that $A = A^*$ is called hermitian, or self-adjoint.
If $E$ is a real Hilbert space, then instead of hermitian we also say
that $A$ is symmetric. If $A$ is invertible, then one sees at once that
$$(A^{-1})^* = (A^*)^{-1}.$$
The case when $A = A^*$ is the main one studied in this chapter. For a
complex Hilbert space, the following properties are equivalent, concerning
an operator $A$:
We have $A = A^*$.
The form $\varphi_A: (x, y) \mapsto \langle Ax, y\rangle$ is hermitian.
The numbers $\langle Ax, x\rangle$ are real for all $x \in E$.
The equivalence between the first two is left to the reader. As to the
third, suppose that $A = A^*$. Then
$$\langle Ax, x\rangle = \langle x, A^*x\rangle = \langle x, Ax\rangle = \overline{\langle Ax, x\rangle}$$

so $\langle Ax, x\rangle$ is real. Conversely, assume that this is the case. Then
$$\langle Ax, x\rangle = \langle x, Ax\rangle = \langle A^*x, x\rangle$$
for all $x$, whence $\langle (A - A^*)x, x\rangle = 0$ for all $x$, and $A = A^*$ by polarization
(Theorem 2.4 of Chapter V).
Let E be a Hilbert space, and A an invertible operator. Then the
following conditions are equivalent:
UN 1. $A^* = A^{-1}$.
UN 2. $|Ax| = |x|$ for all $x \in E$.
UN 3. $\langle Ax, Ay\rangle = \langle x, y\rangle$ for all $x, y \in E$.
UN 4. $|Ax| = 1$ for every unit vector $x \in E$.
An invertible operator satisfying these conditions is said to be hilbertian,
or unitary. The equivalence between the four conditions is very
simple to establish and will be left to the reader (Exercise 3). The set of
unitary (or hilbertian) operators is a group, denoted by Hilb(E).
In §7 you will see how to decompose an arbitrary operator into
a product of hermitian and unitary operators. Thus hermitian and uni-
tary operators are the basic ones. We shall now study especially the
hermitian ones.
XVIII, §2. POSITIVE HERMITIAN OPERATORS
We wish to see how much information on the norm of $A$ can be derived
from knowing the values of the quadratic form $\langle Ax, x\rangle$.
Lemma 2.1. Let $A$ be an operator, and $c$ a number such that
$$|\langle Ax, x\rangle| \le c|x|^2$$
for all $x \in E$. Then for all $x$, $y$ we have
$$|\langle Ax, y\rangle| + |\langle x, Ay\rangle| \le 2c|x||y|.$$
Proof. By the polarization identity,
$$2|\langle Ax, y\rangle + \langle Ay, x\rangle| \le c|x + y|^2 + c|x - y|^2 = 2c(|x|^2 + |y|^2).$$
Hence
$$|\langle Ax, y\rangle + \langle Ay, x\rangle| \le c(|x|^2 + |y|^2).$$
Figure 18.1
We multiply $y$ by $e^{i\theta}$ and thus get on the left-hand side
$$|e^{-i\theta}\langle Ax, y\rangle + e^{i\theta}\langle Ay, x\rangle|.$$
The right-hand side remains unchanged, and for suitable $\theta$, the left-hand
side becomes
$$|\langle Ax, y\rangle| + |\langle Ay, x\rangle|.$$
(In other words, we are lining up two complex numbers by rotating one
by $\theta$ and the other by $-\theta$.) Next we replace $x$ by $tx$ and $y$ by $y/t$ for $t$
real and $t > 0$. Then the left-hand side remains unchanged, while the
right-hand side becomes
The point at which $g'(t) = 0$ is the unique minimum, and at this point $t_0$
we find that
$$g(t_0) = |x||y|.$$
Theorem 2.2. Let $A$ be a hermitian operator. Then $|A|$ is the greatest
lower bound of all values $c$ such that
$$|\langle Ax, x\rangle| \le c|x|^2$$
for all $x$, or equivalently, the sup of all values $|\langle Ax, x\rangle|$ taken for $x$ on
the unit sphere in $E$.
Proof. When $A$ is hermitian we obtain
$$|\langle Ax, y\rangle| \le c|x||y|$$
for all $x, y \in E$, so that we get $|A| \le c$ in the lemma. On the other hand,
$c = |A|$ is certainly a possible value for $c$ by the Schwarz inequality. This
proves our theorem.
Theorem 2.2 allows us to define an ordering in the space of hermitian
operators. If $A$ is hermitian, we define $A \ge 0$ and say that $A$ is positive
if $\langle Ax, x\rangle \ge 0$ for all $x \in E$. If $A$, $B$ are hermitian we define $A \ge B$ if
$A - B \ge 0$. This is indeed an ordering, and the usual rules hold: if
$A_1 \ge B_1$ and $A_2 \ge B_2$, then
$$A_1 + A_2 \ge B_1 + B_2.$$
If $c$ is a real number $\ge 0$ and $A \ge 0$, then $cA \ge 0$. So far, however, we
have said nothing about a product of positive hermitian operators $AB$,
even if $AB = BA$. We shall deal with this question later.
Let $c$ be a bound for $A$. Then $|\langle Ax, x\rangle| \le c|x|^2$ and consequently
$$-cI \le A \le cI.$$
For simplicity, if $\alpha$ is real, we sometimes write $\alpha \le A$ instead of $\alpha I \le A$,
and similarly we write $A \le \beta$ instead of $A \le \beta I$. If we let
$$\alpha = \inf_{|x|=1} \langle Ax, x\rangle \quad\text{and}\quad \beta = \sup_{|x|=1} \langle Ax, x\rangle,$$
then we have
$$\alpha \le A \le \beta,$$
and from Theorem 2.2,
$$|A| = \max(|\alpha|, |\beta|).$$
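The relation $|A| = \max(|\alpha|, |\beta|)$ can be checked numerically in the simplest finite dimensional case. The following is only a sketch under stated assumptions: the matrix `A` is a hypothetical real symmetric $2 \times 2$ example, not taken from the text, and the inf and sup over the unit sphere are approximated by sampling the unit circle.

```python
import math

# Hypothetical real symmetric 2x2 matrix (an assumption for illustration).
A = [[2.0, 1.0],
     [1.0, -3.0]]

def apply(A, x):
    """Return the vector Ax for a 2x2 matrix A."""
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])

def unit_vectors(samples=20000):
    """Yield evenly spaced unit vectors on the circle |x| = 1."""
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        yield (math.cos(theta), math.sin(theta))

forms = []    # sampled values of the quadratic form <Ax, x>
lengths = []  # sampled values of |Ax|
for x in unit_vectors():
    ax = apply(A, x)
    forms.append(ax[0] * x[0] + ax[1] * x[1])
    lengths.append(math.hypot(ax[0], ax[1]))

alpha = min(forms)     # approximates inf <Ax, x> over |x| = 1
beta = max(forms)      # approximates sup <Ax, x> over |x| = 1
norm_A = max(lengths)  # approximates |A| = sup |Ax| over |x| = 1

print(alpha, beta, norm_A, max(abs(alpha), abs(beta)))
```

For this particular matrix the negative end of the form dominates, so the norm is attained at $|\alpha|$ rather than at $\beta$.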
The next two sections are devoted to generalizing to Hilbert space the
spectral theorem in the finite dimensional case. These two sections are
logically independent of each other. In the finite dimensional case, the
spectral theorem for hermitian operators asserts that there exists a basis
consisting of eigenvectors. We recall that an eigenvector for an operator
$A$ is a vector $w \ne 0$ such that there exists a number $c$ for which $Aw = cw$.
We then call c an eigenvalue, and say that w, c belong to each other. In
the next section, we describe a special type of hermitian operator for
which the generalization to Hilbert space has the same statement as in
the finite dimensional case. Afterwards, we give a theorem which holds
in the general case, and can be used as a substitute for the "basis"
statement in many applications. Some of these applications are described
in subsequent sections.
XVIII, §3. THE SPECTRAL THEOREM FOR COMPACT

HERMITIAN OPERATORS
An operator $A: E \to E$ is said to be compact if given a bounded sequence
$\{x_n\}$ in $E$, the sequence $\{Ax_n\}$ has a convergent subsequence. It is precisely
this condition which will allow us to get an orthogonal basis for a
hermitian operator. It is clear that if $E$ is finite dimensional, every
operator is compact.
Throughout this section, we let $E$ be a complex Hilbert space, and
$A: E \to E$ a compact hermitian operator.
A subspace $V$ of $E$ is called $A$-invariant if $AV \subset V$, i.e. if $x \in V$, then
$Ax \in V$. If $V$ is $A$-invariant, then its closure is also $A$-invariant. Furthermore
$V^\perp$ is $A$-invariant because if $x \in V^\perp$, then for all $y \in V$ we get
$$\langle y, Ax\rangle = \langle Ay, x\rangle = 0.$$
We recall that the spectrum of $A$ is the set of numbers $c$ such that
$A - cI$ is not invertible. We note that an eigenvalue $c$ of a hermitian
operator is real, because if $w$ is an eigenvector belonging to $c$, then
$$c\langle w, w\rangle = \langle Aw, w\rangle = \langle w, Aw\rangle = \bar{c}\langle w, w\rangle,$$
so $c = \bar{c}$. Since the 1-dimensional space generated by an eigenvector is
$A$-invariant, it follows that the orthogonal complement of this space is
$A$-invariant.
For each eigenvalue $c$ let $E_c$ be the space generated by all eigenvectors
having this eigenvalue, and call it the $c$-eigenspace. Then for every $x \in E_c$
we have $Ax = cx$. We note that $E_c$ is a closed subspace, and $E_0$ is the
kernel of $A$.
Let $w_1$ and $w_2$ be eigenvectors belonging to eigenvalues $c_1$, $c_2$ respectively,
such that $c_1 \ne c_2$. Then $w_1 \perp w_2$, because
$$c_1\langle w_1, w_2\rangle = \langle Aw_1, w_2\rangle = \langle w_1, Aw_2\rangle = c_2\langle w_1, w_2\rangle.$$
Consequently $E_{c_1}$ is orthogonal to $E_{c_2}$.
Suppose that V is a non-zero finite dimensional A-invariant subspace of

E. Then A restricted to V induces a self-adjoint operator on V, and
there exists an orthogonal basis of V consisting of eigenvectors for A.
This is a trivial fact of linear algebra, which we reprove here. Let w

be a non-zero eigenvector for A in V, and let W be the orthogonal
complement of $w$ in $V$. Then $W$ is $A$-invariant, and we can complete the

proof by induction.
Theorem 3.1 (Spectral Theorem). Let $A$ be a compact hermitian operator
on the Hilbert space $E$. Then the family of eigenspaces $\{E_c\}$, where
$c$ ranges over all eigenvalues (including 0), is an orthogonal decomposition
of $E$.
Proof. Let $F$ be the closure of the subspace generated by all $E_c$ (as in
Corollary 1.9 of Chapter V), and let $H$ be the orthogonal complement of
$F$. Then $H$ is $A$-invariant, and $A$ induces a compact hermitian operator
on $H$, which has no eigenvalue. We must show that $H = \{0\}$. This will
follow from the next lemma.
Lemma 3.2. Let $A$ be a compact hermitian operator on the Hilbert
space $H \ne \{0\}$. Let $c = |A|$. Then $c$ or $-c$ is an eigenvalue for $A$.
Proof. There exists a sequence $\{x_n\}$ in $H$ such that $|x_n| = 1$ and
$$|\langle Ax_n, x_n\rangle| \to |A|.$$
Selecting a subsequence if necessary, we may assume that
$$\langle Ax_n, x_n\rangle \to \alpha$$
for some number $\alpha$, and $\alpha = \pm|A|$. Then
$$0 \le |Ax_n - \alpha x_n|^2 = \langle Ax_n - \alpha x_n, Ax_n - \alpha x_n\rangle = |Ax_n|^2 - 2\alpha\langle Ax_n, x_n\rangle + \alpha^2|x_n|^2 \le \alpha^2 - 2\alpha\langle Ax_n, x_n\rangle + \alpha^2.$$
The right-hand side approaches 0 as $n$ tends to infinity. Since $A$ is
compact, after selecting a subsequence, we may assume that $\{Ax_n\}$ converges
to some vector $y$, and then $\{\alpha x_n\}$ must converge to $y$ also. If
$\alpha = 0$, then $|A| = 0$ and $A = 0$, so we are done. If $\alpha \ne 0$, then $\{x_n\}$ itself
must converge to some vector $x$, and then $Ax = \alpha x$ so that $\alpha$ is the
desired eigenvalue for $A$, thus proving our lemma, and the theorem.
We observe that each $E_c$ has a Hilbert basis consisting of eigenvectors,
namely any Hilbert basis of $E_c$, because all non-zero elements of $E_c$ are
eigenvectors. Hence $E$ itself has a Hilbert basis consisting of eigenvectors.
Thus we recover precisely the analog of the theorem in the finite
dimensional case. Furthermore, we have some additional information,
which follows trivially:
If $c \ne 0$, each $E_c$ is finite dimensional; otherwise a denumerable subset
from a Hilbert basis would provide a sequence contradicting the compactness
of $A$. For a similar reason, given $r > 0$, there is only a finite
number of eigenvalues $c$ such that $|c| \ge r$. Thus 0 is a limit of the sequence
of eigenvalues if $E$ is infinite dimensional.
XVIII, §4. THE SPECTRAL THEOREM FOR

HERMITIAN OPERATORS
Let $p$ be a polynomial with real coefficients, and let $A$ be a hermitian
operator. Write
$$p(t) = a_n t^n + \cdots + a_0.$$
As in Chapter IV, §5, we define
$$p(A) = a_n A^n + \cdots + a_0 I.$$
We let $\mathbf{R}[A]$ be the algebra generated over $\mathbf{R}$ by $A$, that is the algebra of
all operators $p(A)$, where $p(t) \in \mathbf{R}[t]$. We wish to investigate the closure
of $\mathbf{R}[A]$ in the (real) Banach space of all operators. We shall show how
to represent this closure as a ring of continuous functions on some
compact subset of the reals. First, we observe that the hermitian operators
form a closed subspace of End(E), and that the closure of $\mathbf{R}[A]$ is a closed
subspace of the space of hermitian operators.
As observed at the end of §2, we can find real numbers $\alpha$, $\beta$ such that
$$\alpha I \le A \le \beta I.$$
We shall prove that if $p$ is a real polynomial which takes on positive
values on the interval $[\alpha, \beta]$, then $p(A)$ is a positive operator. For this
we need a purely algebraic lemma.
Lemma 4.1. Let $p$ be a real polynomial such that $p(t) \ge 0$ for all
$t \in [\alpha, \beta]$. Then we can express $p$ in the form
$$p(t) = c\Big[\sum_i Q_i(t)^2 + \sum_j (t - \alpha)Q_j(t)^2 + \sum_k (\beta - t)Q_k(t)^2\Big],$$
where $Q_i$, $Q_j$, $Q_k$ are real polynomials, and $c \ge 0$.
Proof. We first factor $p$ into linear and irreducible quadratic factors
over the real numbers. If $p$ has a root $\gamma$ such that $\alpha < \gamma < \beta$, then the
multiplicity of $\gamma$ is even (otherwise $p$ changes sign near $\gamma$, which is impossible),
and then $(t - \gamma)$ occurs in an even power. If a root $\gamma$ is $\le \alpha$ we
have a linear factor $t - \gamma$ which we write
$$t - \gamma = (t - \alpha) + (\alpha - \gamma)$$
and note that $\alpha - \gamma$ is a real square. If $\gamma$ is a root $\ge \beta$, then we write
the linear factor as
$$\gamma - t = (\gamma - \beta) + (\beta - t)$$
and note that $\gamma - \beta$ is a real square. In a factorization of $p$ we can take
the factors to be of type $(t - \gamma)^{2m(\gamma)}$ if $\gamma$ is a root such that $\alpha < \gamma < \beta$, and
otherwise to be of type $t - \gamma$ or $\gamma - t$ according as $\gamma \le \alpha$ or $\gamma \ge \beta$. The
quadratic factors are of type $(t - a)^2 + b^2$. The constant $c$ (which can be
taken as a constant factor) is then $\ge 0$ since $p$ is positive on the interval.
Multiplying out all these factors, and noting that a sum of squares times
a sum of squares is a sum of squares, we conclude that $p$ has an expression
as stated in the lemma, except that there still appear terms of type
$$(t - \alpha)(\beta - t)Q(t)^2$$
where $Q$ is a real polynomial. However, such terms can be reduced to
terms of the other types by using the identity
$$(t - \alpha)(\beta - t) = \frac{(t - \alpha)^2(\beta - t) + (t - \alpha)(\beta - t)^2}{\beta - \alpha}.$$
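The identity used at the end of the proof follows by factoring $(t - \alpha)(\beta - t)$ out of the numerator, which leaves $(t - \alpha) + (\beta - t) = \beta - \alpha$. As a sanity check, the sketch below verifies it in exact rational arithmetic. The endpoints `a`, `b` and the sample points are hypothetical values chosen for illustration, standing in for $\alpha < \beta$.

```python
from fractions import Fraction

def product_term(t, a, b):
    """Left-hand side: (t - a)(b - t)."""
    return (t - a) * (b - t)

def reduced_terms(t, a, b):
    """Right-hand side: [(t - a)^2 (b - t) + (t - a)(b - t)^2] / (b - a)."""
    return ((t - a) ** 2 * (b - t) + (t - a) * (b - t) ** 2) / (b - a)

# Hypothetical endpoints standing in for alpha < beta.
a, b = Fraction(-1), Fraction(2)

# Sample points inside and outside [a, b]; the identity is polynomial in t,
# so it is expected to hold at every point, not only on the interval.
points = [Fraction(k, 7) for k in range(-21, 22)]
identity_holds = all(product_term(t, a, b) == reduced_terms(t, a, b)
                     for t in points)
print(identity_holds)
```

Using `Fraction` avoids rounding, so the comparison is an exact equality rather than a tolerance check.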
Now to study $\mathbf{R}[A]$, we observe that the map
$$p \mapsto p(A)$$
is a ring-homomorphism of $\mathbf{R}[t]$ onto the ring $\mathbf{R}[A]$. Furthermore, if $B$,
$C$ are hermitian operators such that $BC = CB$ and $B \ge 0$, then trivially,
$BC^2$ is positive because
$$\langle BC^2x, x\rangle = \langle CBCx, x\rangle = \langle BCx, Cx\rangle \ge 0.$$
The sum of two positive hermitian operators is positive. Hence from the
expression of $p$ in the lemma, we obtain
Lemma 4.2. If $p$ is positive on $[\alpha, \beta]$, then $p(A)$ is a positive operator.
If $p$, $q$ are polynomials such that $p \le q$ on $[\alpha, \beta]$, then $p(A) \le q(A)$.
Finally,
$$|p(A)| \le \|p\|,$$
the sup norm being taken on $[\alpha, \beta]$.
Proof. The first assertion comes from the remarks preceding our
lemma. The second follows at once by considering $q - p$. Finally, if we
let
$$q(t) = \|p\| \pm p(t)$$
then $q \ge 0$ on $[\alpha, \beta]$ and hence $q(A) \ge 0$, whence the last assertion follows
from Theorem 2.2.
We conclude that the map
$$p \mapsto p(A)$$
is a continuous linear map from the space of polynomial functions on
$[\alpha, \beta]$ into $\mathbf{R}[A]$. By the linear extension theorem, we can extend this
map to the Banach space of continuous functions on $[\alpha, \beta]$ by continuity,
and thus we can define $f(A)$ for any continuous function $f$ on $[\alpha, \beta]$,
by the Stone-Weierstrass theorem. If $\{p_n\}$ is a sequence of polynomials
converging uniformly to $f$ on $[\alpha, \beta]$, then by definition,
$$f(A) = \lim p_n(A).$$
Furthermore, again by continuity, we have
$$|f(A)| \le \|f\|,$$
the sup norm being taken on $[\alpha, \beta]$. If $p_n \to f$ and $q_n \to g$, then $p_n q_n \to fg$.
Hence we obtain $(fg)(A) = f(A)g(A)$ for any continuous functions $f$, $g$.
In other words, our map is also a ring-homomorphism.
Theorem 4.3. If $A \ge 0$, then there exists $B \in \mathbf{R}[A]$ such that $B^2 = A$.
The product of two commuting positive hermitian operators is again
positive.
Proof. The continuous function $t^{1/2}$ maps on a square root of $A$ in
$\mathbf{R}[A]$, and it is clear that any element of $\mathbf{R}[A]$ commutes with $A$. If $A$,
$C$ commute and we write $A = B^2$ with $B$ in $\mathbf{R}[A]$, then $B$ and $C$ also
commute because $C$ commutes with $p(A)$ for all real polynomials $p$, and
hence $C$ commutes with all elements of $\mathbf{R}[A]$. But as we have seen, if
$C \ge 0$, then $B^2C \ge 0$. This proves our theorem.
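In dimension 2 the square root of Theorem 4.3 can be written down explicitly, which gives a quick way to see both assertions at work. The sketch below is an illustration under assumptions, not the construction used in the proof: for a nonzero positive symmetric $2 \times 2$ matrix it uses the closed form $B = (A + sI)/\tau$ with $s = \sqrt{\det A}$ and $\tau = \sqrt{\operatorname{tr} A + 2s}$, which is a polynomial in $A$ and satisfies $B^2 = A$ by the Cayley-Hamilton relation. The matrices `A` and `C` are hypothetical examples.

```python
import math

def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_2x2(A):
    """Positive square root of a nonzero positive symmetric 2x2 matrix,
    computed as the degree-one polynomial (A + s I) / tau in A."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    trace = A[0][0] + A[1][1]
    s = math.sqrt(max(det, 0.0))
    tau = math.sqrt(trace + 2.0 * s)
    return [[(A[0][0] + s) / tau, A[0][1] / tau],
            [A[1][0] / tau, (A[1][1] + s) / tau]]

def is_positive(M, tol=1e-12):
    """A symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (abs(M[0][1] - M[1][0]) <= tol
            and M[0][0] + M[1][1] >= -tol and det >= -tol)

# Hypothetical commuting positive symmetric matrices.
A = [[2.0, 1.0], [1.0, 2.0]]
C = [[3.0, -1.0], [-1.0, 3.0]]

B = sqrt_2x2(A)
B_squared = mat_mul(B, B)   # should reproduce A up to rounding
AC = mat_mul(A, C)          # product of commuting positive operators

print(B_squared, is_positive(AC))
```

In higher dimensions no such closed form is available in general, which is where the approximation of $t^{1/2}$ by polynomials becomes essential.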
The kernel of our map $f \mapsto f(A)$ is a closed ideal in the ring of
continuous functions on $[\alpha, \beta]$. We forget for a moment our definition
of the spectrum given in Chapter XVI, §1, and here define the spectrum
$\sigma(A)$ to be the closed set of zeros of this ideal. We use Theorem 2.1 of
Chapter III.
If $f$ is any continuous function on $\sigma(A)$, we extend $f$ to a continuous
function on $[\alpha, \beta]$ having the same sup norm, say $f_1$, and define
$$f(A) = f_1(A).$$
If $g$ is another extension of $f$ to $[\alpha, \beta]$, then $g - f_1$ vanishes on $\sigma(A)$,
and hence $g(A) = f_1(A)$. Hence $f(A)$ is well defined, independently of the
particular extension of $f$ to $[\alpha, \beta]$. We denote by $\|\ \|_A$ the sup norm
with respect to $\sigma(A)$; thus
$$\|f\|_A = \sup_{t \in \sigma(A)} |f(t)|.$$
We then obtain a ring-homomorphism from the ring of continuous functions
on $\sigma(A)$ into $\mathbf{R}[A]$, and we have
$$|f(A)| \le \|f\|_A.$$
We now state the spectral theorem.
Theorem 4.4. The map $f \mapsto f(A)$ is a Banach-isomorphism from the
algebra of continuous functions on $\sigma(A)$ onto the Banach algebra $\mathbf{R}[A]$.
A continuous function $f$ is $\ge 0$ on $\sigma(A)$ if and only if $f(A) \ge 0$.
Proof. We had derived the norm inequality previously from the positivity
statement. We do this again in the opposite direction. Thus we
assume first that $f(A) \ge 0$ and prove that $f$ is $\ge 0$ on the spectrum of $A$.
Assume that this is not the case. Then $f$ is negative at some point $c$ of
the spectrum. Let $g$ be a continuous function whose graph is as follows:
Figure 18.2
Thus $g$ is $\ge 0$, and has a positive peak at $c$. Then $fg$ is $\le 0$ and $fg$ is
negative at the point $c$ of the spectrum. Hence $-fg \ge 0$, and hence
$-f(A)g(A) \ge 0$. But $f(A) \ge 0$ and $g(A) \ge 0$, so that by Theorem 4.3 we
also have $f(A)g(A) \ge 0$. This implies that $f(A)g(A) = 0$, which is impossible
since $fg$ does not vanish on the spectrum. We conclude that $f \ge 0$
on $\sigma(A)$, and in view of our previous result this proves the positivity
statement of the theorem.
Now for the norm, let $b = |f(A)|$. Then $bI \pm f(A) \ge 0$, whence
$$b \pm f(t) \ge 0$$
on the spectrum. This proves that
$$\|f\|_A \le |f(A)|,$$
and hence a sequence $\{f_n(A)\}$ converges if and only if the sequence of
continuous functions $\{f_n\}$ converges uniformly on the spectrum. This
concludes the proof of the spectral theorem.
There remains to identify the spectrum as we have defined it in this
section, and the spectrum of Chapter XVI, §1, which we shall call the
general spectrum.
Corollary 4.5. If $A$ is hermitian, then the spectrum $\sigma(A)$ is equal to the
set of complex numbers $z$ such that $A - zI$ is not invertible.
Proof. Let $z$ be complex and such that $A - zI$ is not invertible. Then
$z$ is real, for otherwise let
$$g(t) = (t - z)(t - \bar{z}).$$
Then $g(t) \ne 0$ on $\sigma(A)$, and hence $h(t) = 1/g(t)$ is its inverse. Then
$h(A)(A - \bar{z}I)$ would be an inverse for $A - zI$, a contradiction. This
proves that $z$ is real.
Let $\xi$ be real and not in the spectrum $\sigma(A)$. Then $t - \xi$ is invertible
on $\sigma(A)$, and hence so is $A - \xi I$.
Suppose that $\xi$ is in the spectrum $\sigma(A)$. Let $g$ be the continuous
function whose graph is as follows.
Figure 18.3
That is,
$$g(t) = \begin{cases} 1/|t - \xi| & \text{if } |t - \xi| \ge 1/N, \\ N & \text{if } |t - \xi| \le 1/N. \end{cases}$$
If $A - \xi I$ is invertible, let $B$ be an inverse,
$$B(A - \xi I) = (A - \xi I)B = I.$$
Since $|(t - \xi)g(t)| \le 1$ we get $|(A - \xi I)g(A)| \le 1$, whence
$$|g(A)| = |B(A - \xi I)g(A)| \le |B|.$$
But $g(t)$ has a large sup on the spectrum if we take $N$ large, and hence
$|g(A)|$ is equally large, a contradiction. Theorem 4.4 is proved.
The main idea to use the positivity to get the spectral theorem is due
to F. Riesz. However, most treatments go from the positivity statement
to an integral representation of A which we give in Chapter XX. Von
Neumann always emphasized that it is much more efficient to prove at
once the statement of Theorem 4.4, which suffices for many applications,
and can be obtained quite simply from the positivity statement. In fact,
the arguments used to derive Theorem 4.4 from the positivity statement
are taken from a seminar of Von Neumann around 1950.
Example. Let $A$ be hermitian as above. Given a real number $s$,
consider the continuous function
$$f_s(t) = e^{-st}.$$
Then the operator $f_s(A) = e^{-sA}$ is defined. It is an easy matter to show
that the association $s \mapsto e^{-sA}$ is a $C^\infty$ map from $\mathbf{R}$ into Laut(E), that it is
also a homomorphism, and satisfies the conditions:
H 1. $\dfrac{d}{ds} e^{-sA} = -Ae^{-sA}$.
H 2. For every $v \in E$, we have
$$\lim_{s \to 0} e^{-sA}v = v.$$
One calls $e^{-sA}$ the Heat operator associated with $A$.
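For a symmetric $2 \times 2$ matrix with two distinct eigenvalues, $f(A)$ depends only on the values of $f$ at those eigenvalues, so $e^{-sA}$ equals $p(A)$ for the linear polynomial $p$ interpolating $e^{-st}$ on the spectrum. The sketch below uses this to check the homomorphism property and conditions H 1 and H 2 numerically. The matrix `A`, the parameters `s`, `t`, and the step `h` are hypothetical choices for illustration.

```python
import math

def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def heat(A, s):
    """e^{-sA} for a real symmetric 2x2 matrix A with distinct eigenvalues,
    computed as p(A) where p is linear and agrees with e^{-st} on the spectrum."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    lam1, lam2 = (trace - disc) / 2.0, (trace + disc) / 2.0
    e1, e2 = math.exp(-s * lam1), math.exp(-s * lam2)
    c1 = (e2 - e1) / (lam2 - lam1)   # slope of the interpolating polynomial
    c0 = e1 - c1 * lam1              # constant term
    return [[c0 + c1 * A[0][0], c1 * A[0][1]],
            [c1 * A[1][0], c0 + c1 * A[1][1]]]

def max_gap(X, Y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical example, eigenvalues 1 and 3
s, t, h = 0.3, 0.5, 1e-5

# Homomorphism property: e^{-(s+t)A} = e^{-sA} e^{-tA}.
hom_gap = max_gap(heat(A, s + t), mat_mul(heat(A, s), heat(A, t)))

# H 1, via a central difference approximation of d/ds e^{-sA}.
plus, minus = heat(A, s + h), heat(A, s - h)
deriv = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
minus_A_heat = [[-x for x in row] for row in mat_mul(A, heat(A, s))]
h1_gap = max_gap(deriv, minus_A_heat)

# H 2 at s = 0: e^{0} should be the identity.
h2_gap = max_gap(heat(A, 0.0), [[1.0, 0.0], [0.0, 1.0]])

print(hom_gap, h1_gap, h2_gap)
```

The interpolation trick is specific to operators with finite spectrum; it is meant only to make the statement of the example concrete.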
For examples of uses of Theorem 4.3, see §7.
XVIII, §5. ORTHOGONAL PROJECTIONS
Corollary 1.8 of Chapter V shows that we have orthogonal decompositions
in Hilbert space similar to those in euclidean spaces. A standard criterion
for such decompositions in algebra generalizes to Hilbert spaces, namely:
Theorem 5.1. Let $P$, $Q$ be hermitian operators such that
$$P^2 = P, \quad PQ = QP = 0, \quad P + Q = I.$$
Then $Q^2 = Q$, and we have
$$\operatorname{Ker} P = \operatorname{Im} Q = (\operatorname{Ker} Q)^\perp.$$
In particular, we have the orthogonal decomposition
$$E = \operatorname{Ker} P + \operatorname{Im} P.$$
Proof. This proof is independent of the spectral theorem, and uses
only basic definitions, together with Corollary 1.8 of Chapter V. Let
$F = \operatorname{Ker} P$. If $x \in F$, we have
$$x = Ix = Px + Qx = Qx$$
so that $x$ is in the image of $Q$. Since $PQ = QP = 0$, it follows that the
image of $Q$ is in the kernel of $P$, whence $\operatorname{Ker} P = \operatorname{Im} Q$. We obviously
have
$$Q^2 = (I - P)^2 = I - P = Q$$
so that our relations between $P$ and $Q$ are symmetric. We still must
show that $F^\perp = \operatorname{Ker} Q$. Suppose that $\langle F, x\rangle = 0$. Then from
$$\langle E, Qx\rangle = \langle QE, x\rangle = \langle F, x\rangle$$
we conclude that $Qx = 0$ so $F^\perp \subset \operatorname{Ker} Q$. The converse inclusion follows
from these same equalities, and our theorem is proved.
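The hypotheses and conclusions of Theorem 5.1 can be seen concretely for the projection on a line in the plane. The sketch below is an illustration only: the unit vector `u` is a hypothetical choice, `P` is the matrix $u u^{T}$ of the orthogonal projection on the line spanned by `u`, and exact rational arithmetic is used so that the relations hold as equalities.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, x):
    """Return the vector Mx for a 2x2 matrix M."""
    return (M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1])

# Hypothetical unit vector u and a unit vector v orthogonal to it.
u = (Fraction(3, 5), Fraction(4, 5))
v = (Fraction(-4, 5), Fraction(3, 5))

identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
zero = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]

# P projects on the line spanned by u; Q = I - P projects on its complement.
P = [[u[i] * u[j] for j in range(2)] for i in range(2)]
Q = [[identity[i][j] - P[i][j] for j in range(2)] for i in range(2)]

p_idempotent = mat_mul(P, P) == P
products_vanish = mat_mul(P, Q) == zero and mat_mul(Q, P) == zero
q_idempotent = mat_mul(Q, Q) == Q

# v lies in Ker P and is fixed by Q, illustrating Ker P = Im Q.
v_in_kernel = apply(P, v) == (0, 0)
v_fixed_by_q = apply(Q, v) == v

print(p_idempotent, products_vanish, q_idempotent, v_in_kernel, v_fixed_by_q)
```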
Let $A$ be an operator and let $F$ be a subspace of $E$. We say that $F$ is
invariant for $A$ if $AF \subset F$ (that is $Ax \in F$ for all $x \in F$). If this is the case,
then it is clear that the closure $\overline{F}$ is also an invariant subspace.
Let $A$, $B$ be operators such that $AB = BA$. Then $\operatorname{Ker} B$ and $\operatorname{Im} B$ are
invariant subspaces for $A$. Indeed, if $Bx = 0$, then $BAx = ABx = 0$, so
$\operatorname{Ker} B$ is invariant. If $y = Bx$, then $Ay = ABx = BAx$, so $\operatorname{Im} B$ is invariant.
An operator $A$ which is hermitian is said to be positive definite if
$A \ge cI > 0$ for some $c > 0$. If $F$ is a closed invariant subspace for $A$, we
say that $A$ is positive definite on $F$ if the restriction of $A$ to $F$ is positive
definite. (This restriction is clearly hermitian.) We say that $A$ is negative
definite if $-A$ is positive definite.
Corollary 5.2. Let $A$ be an invertible hermitian operator. Then there
exists an orthogonal decomposition $E = F + F^\perp$ such that $F$, $F^\perp$ are
$A$-invariant closed subspaces, and such that $A$ is positive definite on $F$
and negative definite on $F^\perp$.
Proof. We use the spectral theorem. Let $g$ be the function such that
$g(t) = 1$ if $t \ge 0$ and $g(t) = 0$ if $t < 0$. Since $A$ is invertible, it follows that
0 is not in the spectrum of $A$. Hence $g$ is continuous on the spectrum,
and $g^2 = g$ on the spectrum. Hence $g(A) = P$ satisfies $P^2 = P$. Let
$F = \operatorname{Im} P$. Then $P$ is an orthogonal projection on $F$ by Theorem 5.1. Since
$A$ commutes with $g(A)$, and since $tg(t) \ge 0$ on the spectrum of $A$, it
follows that $AP = PA$ is a positive operator. Furthermore, $A$ maps $F$
into itself, and since $A^{-1}$ exists on $E$, and also maps $F$ into itself, let $A_+$
be the restriction of $A$ to $F$. Then $A_+$ is positive, invertible on $F$, whence
positive definite (because the spectrum is closed, and 0 is not in the
spectrum). Similarly, let $h(t) = 1 - g(t)$ and $Q = h(A)$. Then $th(t) \le 0$ on
the spectrum of $A$, and by similar arguments, letting $A_-$ be the restriction
of $-A$ to $F^\perp$, we conclude that $A_-$ is positive definite on $F^\perp$. This
proves what we wanted.
Corollary 5.3. Let $A$ be an invertible hermitian operator. Then there
exist an orthogonal decomposition $E = F + F^\perp$ and positive definite operators
$A_+$ on $F$, $A_-$ on $F^\perp$ such that if we write $x = y + z$ with $y \in F$
and $z \in F^\perp$, then
$$\langle Ax, x\rangle = \langle A_+ y, y\rangle - \langle A_- z, z\rangle.$$
Proof. This is a rephrasing of the preceding result.
Finally, for a positive operator, we can go one step further in our
normalization. Namely, if $A \ge 0$, then we can write $A = B^2$ for some $B$
in $\mathbf{R}[A]$, and hence if $A \ge 0$, then the quadratic form $x \mapsto \langle Ax, x\rangle$ can
be written
$$\langle Ax, x\rangle = \langle A^{1/2}x, A^{1/2}x\rangle.$$
If $A$ is invertible, so is $A^{1/2}$. This corresponds to the diagonalization of
quadratic (or symmetric bilinear) forms in the finite dimensional case.
Indeed, in that case, a positive form can be written as
and a negative form can be written as
with respect to a suitable orthonormal basis of the given positive definite
hermitian product $\langle\ ,\ \rangle$ on euclidean space.
XVIII, §6. SCHUR'S LEMMA
Let $S$ be a set of operators on the Hilbert space $E$, and let $F$ be a
subspace of $E$. We say that $F$ is invariant under $S$ if for every $A \in S$ and
$x \in F$ we have $Ax \in F$. In other words, $AF \subset F$ for every $A \in S$. If $F$ is
invariant under $S$, we observe that its closure $\overline{F}$ is also invariant under $S$.
Theorem 6.1. Let $S$ be a set of operators on the Hilbert space $E$,
leaving no closed subspace invariant except $\{0\}$ and $E$ itself. Let $A$ be a
hermitian operator such that $AB = BA$ for all $B \in S$. Then $A = cI$ for
some real number $c$.
Proof. It will suffice to prove that there is only one element in the
spectrum of $A$. Suppose that there are two, $c_1 \ne c_2$. There exist continuous
functions $f$, $g$ on the spectrum such that neither is 0 on the spectrum,
but $fg$ is 0 on the spectrum. For instance, we can take for $f$, $g$ the
functions whose graphs are indicated on the next figure.
Figure 18.4
We have $f(A)B = Bf(A)$ for all $B \in S$ (because $B$ commutes with real
polynomials in $A$, hence with their limits). Hence $f(A)E$ is invariant
under $S$ because
$$Bf(A)E = f(A)BE \subset f(A)E.$$
Let $F$ be the closure of $f(A)E$. Then $F \ne \{0\}$ because $f(A) \ne 0$. Furthermore,
$F \ne E$ because $g(A)f(A)E = \{0\}$ and hence $g(A)F = \{0\}$. Since $F$ is
invariant under $S$, we have a contradiction, thus proving our theorem.
Corollary 6.2. Let $S$ be a set of operators of the Hilbert space $E$,
leaving no closed subspace invariant except $\{0\}$ and $E$ itself. Let $A$ be
an operator such that $AA^* = A^*A$, $AT = TA$, and $A^*T = TA^*$ for all
$T \in S$. Then $A = cI$ for some complex number $c$.
Proof. Write $A = B + iC$ where $B$, $C$ are hermitian and commute (e.g.
$B = (A + A^*)/2$ and $C = (A - A^*)/2i$). Apply the theorem to each one of
$B$ and $C$ to prove the corollary.
Remark. Schur's lemma is used, among other places, in the representation
theory of groups. Let $G$ be a group, and suppose that we have a
homomorphism (called a representation)
$$\rho: G \to \operatorname{Laut}(E)$$
of $G$ into the toplinear automorphisms of a Hilbert space $E$. Assume
that $G$ is commutative, and that the image $\rho(G)$ satisfies the hypotheses
of the set $S$ in the corollary, and also is such that if $A \in \rho(G)$, then
$A^* \in \rho(G)$. Then we conclude that for each $a \in G$, the image $\rho(a)$ is equal
to $c_a I$. Thus $a \mapsto c_a$ is a homomorphism of $G$ into the multiplicative
group of complex numbers. In the terminology of representations, one
says that an irreducible unitary representation of $G$ is one dimensional,
because $E$ must then be of dimension 1.
XVIII, §7. POLAR DECOMPOSITION OF ENDOMORPHISMS
Among other things, this section shows some of the ways in which Theorem 4.3 is
used. Namely, for any operator $T$ on a Hilbert space $E$, the operator
$T^*T$ is hermitian positive, and so has a square root by Theorem 4.3. We
start with a special case of the polar decomposition which is of interest
for its own sake.
Theorem 7.1. Let $T: E \to E$ be an operator on the Hilbert space $E$.
Assume that $\operatorname{Ker} T = 0$, and that $TE$ is dense in $E$. Then $\operatorname{Im}(T^*T)^{1/2}$ is
dense, and there exists a continuous linear map $U$ defined on this image
such that
$$U(T^*T)^{1/2}x = Tx.$$
This operator $U$ is norm preserving, and its kernel is $\{0\}$.
Proof. To show that $U$ is well defined, it suffices to prove that if
$$(T^*T)^{1/2}x = (T^*T)^{1/2}y,$$
then $Tx = Ty$. But applying $(T^*T)^{1/2}$ yields $T^*Tx = T^*Ty$. On the
other hand, the kernel of $T^*$ is $\{0\}$, because if $T^*u = 0$ then for any $v \in E$
we have
$$0 = \langle T^*u, v\rangle = \langle u, Tv\rangle,$$
and since the image of $T$ is dense, this implies that $u$ is orthogonal to all
of $E$, whence $u = 0$. Hence we get $Tx = Ty$, thus defining $U$ uniquely by
our given formula. Then we find
$$\langle (T^*T)^{1/2}x, (T^*T)^{1/2}y\rangle = \langle x, T^*Ty\rangle = \langle Tx, Ty\rangle,$$
whence it follows that $U$ preserves lengths on the domain of definition.
The fact that $\operatorname{Ker} U = \{0\}$ is a consequence of the norm-preserving
property.
The image of $(T^*T)^{1/2}$ is dense, for suppose that $y$ is orthogonal to
this image. Let $x = (T^*T)^{1/2}y$. Then $(T^*T)^{1/2}x = T^*Ty$, and we get
$$0 = \langle T^*Ty, y\rangle = \langle Ty, Ty\rangle = |Ty|^2.$$
Since $\operatorname{Ker} T = \{0\}$, we get $y = 0$ whence $\operatorname{Im}(T^*T)^{1/2}$ is dense. This concludes
the proof of the theorem.
The norm-preserving linear map $U$ in the theorem can then be extended
by continuity to all of $E$ (which is the closure of its domain of
definition), and this extension is a unitary automorphism of $E$.
This result is useful in the theory of group representations. Indeed,
suppose that we are given a group $G$ and two homomorphisms
$$R: G \to \operatorname{Laut}(E) \quad\text{and}\quad S: G \to \operatorname{Laut}(E)$$
of $G$ into the group of invertible operators on $E$. Assume that $T$ and $T^*$
commute with $(R, S)$ in the sense that for all $a \in G$ we have
$$TR(a) = S(a)T \quad\text{and}\quad T^*S(a) = R(a)T^*.$$
Verify that $U$ satisfies similar relationships. This shows that our two
representations are "isomorphic".
Next we deal with the general polar decomposition. Let $H$ be a
Hilbert space. Let $U: H \to E$ be a bounded linear map into some other
Hilbert space $E$. We say that $U$ is a partial isometry if there exists an
orthogonal decomposition
$$H = H_1 \oplus H_2$$
such that the restriction of $U$ to $H_1$ is an isometry (norm-preserving
linear map) onto the image of $H_1$, and the restriction of $U$ to $H_2$ is equal
to 0.
The purpose of this section is to prove:
Theorem 7.2. Let $A: H \to H$ be an arbitrary operator. Then there
exists a unique decomposition
$$A = UP,$$
where $U$ is a partial isometry and $P$ is hermitian positive.
Proof. We shall give the essential steps of the proof, and leave certain
routine details as Exercise 15.
First note that $A^*A$ is symmetric positive and has a unique symmetric
positive square root, denoted by
$$P = P_A = (A^*A)^{1/2}.$$
(a) We can define a linear map $U = U_A$ on Im $P_A$ by the formula
$$U(A^*A)^{1/2}v = Av,$$
namely this is well defined. (If $(A^*A)^{1/2}v = 0$ then $Av = 0$.)
(b) The map $U$: Im $P_A \to$ Im $A$ is a unitary map, which can therefore
be extended by continuity to the closure of Im $P_A$. Define $U$ to be
0 on the orthogonal complement of Im $P_A$. Then $U$ is a partial
isometry.
(c) The decomposition $A = UP$ into a partial isometry $U$ (relative
to Im $P$) and a positive operator $P$ is unique, i.e. if $A = WQ$, then
$P = Q$, $U = W$.
(d) We have $A^* = U_{A^*}P_{A^*}$, where $U_{A^*} = U^*$ and $P_{A^*} = UP_AU^*$.
The decomposition $A = UP$ is called the polar decomposition of $A$, and
(d) gives the polar decomposition of $A^*$ in terms of the polar decomposition
of $A$.
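As a finite dimensional sanity check (an illustration only, not part of the proof), the polar decomposition of an invertible real $2 \times 2$ matrix can be computed directly. The sketch below assumes that the matrix is invertible, so that $U = AP^{-1}$ is defined everywhere, and it obtains the square root from a closed formula valid for $2 \times 2$ positive definite matrices (a consequence of the Cayley-Hamilton theorem) rather than from the spectral theorem. The helper names `polar_2x2`, `sqrt_spd_2x2`, `matmul`, `transpose`, `inverse_2x2` are ours.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def sqrt_spd_2x2(M):
    """Positive square root of a symmetric positive definite 2x2 matrix.

    By Cayley-Hamilton, S = (M + s I) / t with s = sqrt(det M) and
    t = sqrt(tr M + 2 s) satisfies S S = M.
    """
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def polar_2x2(A):
    """Return (U, P) with A = U P and P = (A^T A)^(1/2); A must be invertible."""
    P = sqrt_spd_2x2(matmul(transpose(A), A))
    U = matmul(A, inverse_2x2(P))
    return U, P

A = [[1.0, 2.0], [3.0, 4.0]]
U, P = polar_2x2(A)
UtU = matmul(transpose(U), U)   # should be close to the identity
UP = matmul(U, P)               # should be close to A
```

In this invertible case the partial isometry $U$ is an honest isometry, which is the situation refined in Exercise 16.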
XVIII, §8. THE MORSE-PALAIS LEMMA
Let $U$ be an open set in some (real) Hilbert space $E$, and let $f$ be a $C^{p+2}$
function on $U$, with $p \ge 1$. We say that $x_0$ is a critical point for $f$ if
$Df(x_0) = 0$. We wish to investigate the behavior of $f$ at a critical point.
After translations, we can assume that $x_0 = 0$ and that $f(x_0) = 0$. We
observe that the second derivative $D^2f(0)$ is a continuous bilinear form
on $E$. Let $A = D^2f(0)$, and for each $x \in E$ let $A_x$ be the functional
$y \mapsto A(x, y)$. If the map $x \mapsto A_x$ is a toplinear isomorphism of $E$ with its
dual space $E'$, then we say that $A$ is non-singular, and we say that the
critical point is non-degenerate.
We recall that a local $C^p$-isomorphism $\varphi$ at 0 is a $C^p$-invertible map
defined on an open set containing 0.
Theorem 8.1. Let $f$ be a $C^{p+2}$ function defined on an open neighborhood
of 0 in the Hilbert space $E$, with $p \ge 1$. Assume that $f(0) = 0$, and
that 0 is a non-degenerate critical point of $f$. Then there exists a local
$C^p$-isomorphism at 0, say $\varphi$, and an invertible symmetric operator $A$ such
that
$$f(x) = \langle A\varphi(x), \varphi(x)\rangle.$$
Proof. We may assume that $f$ is defined on a ball $V$ around 0. We have
$$f(x) = f(x) - f(0) = \int_0^1 Df(tx)x \, dt,$$
and applying the same formula to $Df$ instead of $f$, we get
$$f(x) = \int_0^1\!\!\int_0^1 D^2f(stx)\,tx \cdot x \, ds\, dt = g(x)(x, x),$$
where
$$g(x) = \int_0^1\!\!\int_0^1 D^2f(stx)\,t \, ds\, dt.$$
Then $g$ is a $C^p$ map into the Banach space of continuous bilinear maps
on $E$, and even the space of symmetric such maps by Theorem 5.3 of
Chapter XIII. We know that this Banach space is toplinearly isomorphic
to the space of symmetric operators on $E$, and thus we can write
$$f(x) = \langle A(x)x, x\rangle$$
where $A: V \to \mathrm{Sym}(E)$ is a $C^p$ map of $V$ into the space of symmetric
operators on $E$. A straightforward computation shows that
$$\tfrac{1}{2}D^2f(0)(v, w) = \langle A(0)v, w\rangle.$$
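The computation can be sketched as follows. It only uses the formula for $g$ above, evaluated at $x = 0$, where the integrand no longer depends on $s$:

```latex
g(0) = \int_0^1\!\!\int_0^1 D^2 f(0)\, t \, ds\, dt
     = D^2 f(0) \int_0^1 t \, dt
     = \tfrac{1}{2}\, D^2 f(0),
\qquad\text{so}\qquad
\langle A(0)v, w\rangle = g(0)(v, w) = \tfrac{1}{2}\, D^2 f(0)(v, w).
```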
Since we assumed that $D^2f(0)$ is non-singular, this means that $A(0)$ is
invertible, and hence $A(x)$ is invertible for all $x$ sufficiently near 0.
We want to define $\varphi(x)$ to be $C(x)x$ where $C$ is a suitable $C^p$ map
from a neighborhood of 0 into the open set of invertible operators, and
in such a way that we have
$$\langle A(x)x, x\rangle = \langle A(0)\varphi(x), \varphi(x)\rangle = \langle A(0)C(x)x, C(x)x\rangle.$$
This means that we must seek a map $C$ such that
$$C(x)^*A(0)C(x) = A(x).$$
If we let $B(x) = A(0)^{-1}A(x)$, then $B(x)$ is close to the identity $I$ for small
$x$. The square root function has a power series expansion near 1, which
is a uniform limit of polynomials, and is $C^\infty$ on a neighborhood of $I$ (cf.
Exercise 2 of Chapter XIII), and we can therefore take the square root of
$B(x)$, so that we let
$$C(x) = B(x)^{1/2}.$$
We contend that this $C(x)$ does what we want. Indeed, since both $A(0)$
and $A(x)$ (or $A(x)^{-1}$) are self-adjoint, we find that
$$B(x)^* = A(x)A(0)^{-1},$$
whence
$$B(x)^*A(0) = A(0)B(x).$$
But $C(x)$ is a power series in $I - B(x)$, and $C(x)^*$ is the same power
series in $I - B(x)^*$. The preceding relation holds if we replace $B(x)$ by
any power of $B(x)$ (by induction), hence it holds if we replace $B(x)$ by
any polynomial in $I - B(x)$, and hence finally, it holds if we replace $B(x)$
by $C(x)$, and thus
$$C(x)^*A(0)C(x) = A(0)C(x)C(x) = A(0)B(x) = A(x),$$
which is the desired relation.
All that remains to be shown is that $\varphi$ is a local $C^p$-isomorphism at 0.
But one verifies that in fact, $D\varphi(0) = C(0)$, so that what we need follows
from the inverse mapping theorem. This concludes the proof of Theorem
8.1.
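The key algebraic step, $C(x)^*A(0)C(x) = A(x)$ with $C(x) = B(x)^{1/2}$, can be checked numerically in dimension 2. The sketch below is an illustration under stated assumptions: the two symmetric matrices `A0` and `Ax` are arbitrary sample values standing for $A(0)$ and $A(x)$, and the square root is computed by a closed formula for $2 \times 2$ matrices (valid here because $B$ is close to $I$) instead of the power series used in the proof. All function names are ours.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def sqrt_2x2(B):
    """Square root of a 2x2 matrix close to I, as a polynomial in B.

    With s = sqrt(det B) and t = sqrt(tr B + 2 s), the matrix
    (B + s I) / t squares to B by the Cayley-Hamilton theorem.
    """
    s = math.sqrt(B[0][0] * B[1][1] - B[0][1] * B[1][0])
    t = math.sqrt(B[0][0] + B[1][1] + 2.0 * s)
    return [[(B[0][0] + s) / t, B[0][1] / t],
            [B[1][0] / t, (B[1][1] + s) / t]]

A0 = [[2.0, 0.0], [0.0, -1.0]]     # sample A(0): symmetric, invertible, indefinite
Ax = [[2.1, 0.2], [0.2, -0.9]]     # sample A(x): a nearby symmetric matrix

B = matmul(inverse_2x2(A0), Ax)    # B = A(0)^{-1} A(x), close to I
C = sqrt_2x2(B)                    # C = B^{1/2}
lhs = matmul(matmul(transpose(C), A0), C)   # C^* A(0) C, should be close to A(x)
```

Note that $B$ itself is not symmetric in general; what makes the identity work is the relation $B^*A(0) = A(0)B$, which passes to any polynomial in $B$.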
Corollary 8.2. Let $f$ be a $C^{p+2}$ function near 0 on the Hilbert space $E$,
such that 0 is a non-degenerate critical point. Then there exists a local
$C^p$-isomorphism $\psi$ at 0, and an orthogonal decomposition $E = F + F^\perp$,
such that if we write $x = y + z$ with $y \in F$ and $z \in F^\perp$, then
$$f(\psi(x)) = \langle y, y\rangle - \langle z, z\rangle.$$
Proof. The theorem reduces the problem to the case discussed in
Corollaries 5.2 and 5.3. In that case, on a space where $A$ is positive
definite, we can always make the toplinear isomorphism $x \mapsto A^{1/2}x$ to get
the quadratic form to become the given hermitian product $\langle\ ,\ \rangle$, and
similarly on the space where $A$ is negative definite.
Note. The Morse-Palais lemma was proved originally by Morse in
the finite dimensional case, using the Gram-Schmidt orthogonalization
process. The elegant generalization and its proof in the Hilbert space
case are due to Palais. Cf. [Pa 2]. It shows (in the language of coordinate
systems) that a function near a critical point can be expressed as a
quadratic form after a suitable change of coordinate system (satisfying
requirements of differentiability). It comes up naturally in the calculus of
variations, [Pa 1] and [Sm 1]. For instance, one considers a space of
paths (of various smoothness) $\sigma: [a, b] \to E$ where $E$ is a Hilbert space.
One then defines a function on these paths, essentially related to the
length
$$f(\sigma) = \int_a^b \langle \sigma'(t), \sigma'(t)\rangle \, dt$$
and one investigates the critical points of this function, especially its
minimum values. These turn out to be the solutions of the variational
problem, by definition of what one means by a variational problem.
Even if $E$ is finite dimensional, so a euclidean space, the space of paths
is infinite dimensional, so that we need an infinite dimensional theory to
deal with this question.
XVIII, §9. EXERCISES
1. Let $E$ be a Hilbert space.
(a) Let $P$ be a hermitian operator such that $P^2 = P$. Show that $P$ is an
orthogonal projection on a closed subspace.
(b) Conversely, let $E = F + F^\perp$ be an orthogonal decomposition, where $F$ is
a closed subspace. Let $P$ be the orthogonal projection on $F$, and assume
that $F \ne \{0\}$. (i) Show that $|P| = 1$, and that $P^2 = P$. (ii) Show that $P$ is
hermitian.
2. Let $A$ be hermitian and positive. Show that for all $x$, $y$ we have
$$|\langle Ax, y\rangle|^2 \le |\langle Ax, x\rangle|\,|\langle Ay, y\rangle|.$$
3. Prove that the four conditions UN 1 through UN 4 defining a unitary
operator are equivalent.
4. If $A$ is hermitian, show that $I + iA$ is invertible.
5. Let $A$ be an operator and let $\sigma(A)$ be its spectrum (we assume that our
Hilbert space is complex). Show that
$$\sigma(A^*) = \overline{\sigma(A)}.$$
6. Prove the statement made in the text that if $A$ is compact, hermitian, then
$\langle Ax, x\rangle$ takes on a maximum or minimum on the unit sphere.
7. Let $E$ be the space of real valued continuous functions on $[0, 1]$ and let
$M: E \to E$ be the linear map given by
$$(Mf)(x) = xf(x).$$
We take the $L^2$-norm on $E$, arising from the scalar product
$$\langle f, g\rangle = \int_0^1 fg,$$
and we let $E_2$ denote the completion of $E$ with respect to this norm. Show
that $M$ is self-adjoint, and that for any real $a$ with $0 \le a \le 1$, the operator
$M - aI$ is not invertible on $E_2$, for otherwise, it would be invertible on $E$.
Show that $a$ is not an eigenvalue of $M$. Note: $M$ is obviously injective on $E$,
but you will have to prove that, for instance, it is injective on $E_2$, so deal
with $L^2$-Cauchy sequences in $E$.
8. Let $E$ be a complex Hilbert space with a denumerable orthonormal basis
$\{x_n\}$ ($n = 1, 2, \ldots$). Let $S$ be a compact infinite subset of the complex
numbers. Show that there exists a denumerable dense subset $\{a_n\}$ of $S$. Show
that there exists a unique operator $A$ on $E$ such that $Ax_n = a_nx_n$ for all
$n$, and that the spectrum of $A$ is equal to $S$. Show that the eigenvalues of
$A$ are precisely equal to the numbers $a_n$. Show that if $a$ is in $S$ and not
equal to any $a_n$, then the image of $A - aI$ is dense in $E$ but not equal
to $E$.
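9. Let $l^2$ be the Hilbert space of sequences $\alpha = \{a_n\}$, $n \ge 1$, such that $\sum |a_n|^2$
converges, with the hermitian product
$$\langle \alpha, \beta\rangle = \sum a_n\bar b_n$$
if $\beta = \{b_n\}$. Let $T$ be the shift operator, that is
$$T(a_1, a_2, \ldots) = (0, a_1, a_2, \ldots).$$
Show that the spectrum of $T$ is the unit disc and that $T$ has no eigenvalue.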
10. Irreducible representations of compact groups. Let $G$ be a compact group, and
$E$ a complex Hilbert space. A (unitary) representation $R$ of $G$ in $E$ is a
continuous homomorphism,
$$R: G \to \mathrm{Aut}(E)$$
of $G$ into the group of (unitary) automorphisms of $E$. We let $dx$ be a Haar
measure on $G$ such that $G$ has measure 1. We say that a representation $R$ is
irreducible if there is no closed subspace of $E$ invariant under $R(G)$ other
than 0 and $E$ itself. The basic result is:
Theorem. If $R: G \to \mathrm{Aut}(E)$ is a unitary irreducible representation of a
compact group, then $E$ is finite dimensional.
Prove this by the following steps. Let $\{v_i\}$ be an orthonormal basis of $E$.
Let $P$ be the projection on the one-dimensional space generated by $v_1$.
(a) Using Schur's lemma, prove that there exists a number $c$ such that
$$\int_G R(x)PR(x)^{-1}\,dx = cI.$$
In fact, show that the operator on the left is a positive operator, commuting
with all $R(a)$, $a \in G$.
(b) Considering $\langle cv_1, v_1\rangle$, show that $c > 0$.
(c) For any $x \in G$, $\{R(x)^{-1}v_i\}$ is an orthonormal basis $\{w_i\}$. Prove that for
any $n$,
$$\sum_{i=1}^{n} \langle Pw_i, w_i\rangle \le 1.$$
(d) Conclude that $nc \le 1$, whence $n$ is bounded.
11. Let $A$ be a hermitian operator on the Hilbert space $E$, and assume that the
spectrum of $A$ is the union of two disjoint closed sets $S$, $T$. Show that $E$
admits a direct sum decomposition into two closed subspaces $E_S$ and $E_T$
which are $A$-invariant, and such that, if we let $A_S$ and $A_T$ be the restriction of
$A$ to $E_S$ and $E_T$ respectively, then the spectrum of $A_S$ is $S$ and the spectrum
of $A_T$ is $T$. (Cf. Exercise 2 of Chapter XVI.)
12. Let A be a hermitian operator on a Hilbert space. If c is an isolated point of

the spectrum, show that c is an eigenvalue.
13. Show that an operator A on a Hilbert space is hermitian positive if and only
if there exists an operator B such that A = B* B.
14. Let $S$ be a non-zero Banach subalgebra of operators on a Hilbert space $E$.
Assume that $S$ is *-closed (i.e. if $A \in S$ then $A^* \in S$), and that $S$
consists of compact operators. Prove that there exists an $S$-irreducible
subspace (i.e. a subspace $\ne 0$ which has no $S$-invariant subspace other than 0
and itself), and that $E$ is the orthogonal sum of $S$-irreducible subspaces.
[Hint: Writing $A = B + iC$, where $B$, $C$ are hermitian, you can find a hermitian
element $A$ in $S$ such that $A \ne 0$. Let $\lambda$ be an eigenvalue for $A$, and
among all $S$-invariant subspaces, let $M \ne 0$ be such that the eigenspace $M_\lambda$
for $\lambda$ has minimal dimension. Let $v \in M$, $v \ne 0$. Prove that the closure of $Sv$
is irreducible.]
15. Give the details of the proofs in statements (a), (b), (c), (d) for Theorem 7.2.
The next exercise gives a complement and refinement of Theorem 7.2 when we
deal with an automorphism of the Hilbert space.
Polar Decomposition of an Automorphism
16. Let $A: E \to E$ be an operator on a Hilbert space.
(a) If $A$ is unitary, then show that $A^{-1}$ is unitary. Also, $A$ is hilbertian if and
only if $A^*A = I$. If $A$, $B$ are hilbertian, so is $AB$. In the language of
algebra, hilbertian operators form a group, denoted by Hilb($E$).
(b) An operator $A$ is said to be skew-symmetric if $A^* = -A$. Since we work
in the real case, we shall say that an operator is symmetric instead of
hermitian, and let Sym($E$) denote the space of symmetric operators on $E$.
We let Sk($E$) denote the space of skew-symmetric operators. Show that
End($E$) is a direct sum
$$L(E, E) = \mathrm{Sym}(E) \oplus \mathrm{Sk}(E).$$

(c) For all operators $A$, show that the series
$$\exp(A) = I + A + \frac{A^2}{2!} + \cdots$$
converges, and if $AB = BA$, then
$$\exp(A + B) = \exp(A)\exp(B).$$
For all operators $A$ sufficiently close to the identity $I$, the series
$$\log A = (A - I) - \frac{(A - I)^2}{2} + \cdots$$
converges, and if $AB = BA$, then
$$\log AB = \log A + \log B.$$

(d) If A is symmetric (resp. skew-symmetric), then exp A is symmetric positive
definite (resp. hilbertian). If A is a toplinear automorphism sufficiently
close to I and is positive definite symmetric (resp. hilbertian), then log A is
symmetric (resp. skew symmetric).
(e) Show that the exponential map gives a homeomorphism from the space
Sym($E$) of symmetric operators of $E$ to the space Pos($E$) of symmetric
positive definite automorphisms of $E$. Define its inverse.
(f) Show that the space of toplinear automorphisms of the Hilbert space $E$ is
homeomorphic to the product
$$\mathrm{Hilb}(E) \times \mathrm{Pos}(E)$$
under the map given by
$$(H, P) \mapsto HP.$$
[Hint: To construct the inverse, given an invertible operator $A$, we must
express it in a unique way as a product $A = HP$ where $H$ is hilbertian, $P$
is symmetric positive definite, and both $H$, $P$ depend continuously on $A$.
Show that
$$P = (A^*A)^{1/2} \quad\text{and}\quad H = AP^{-1}$$
exist and satisfy our requirements.]
Hilbert-Schmidt Operators
17. Assume that $H$ has a countable Hilbert basis. An operator $A$ is called
Hilbert-Schmidt if there exists some Hilbert basis $\{u_i\}$ such that
$$\sum_i |Au_i|^2 < \infty.$$
(a) Prove that the same convergence holds for any other Hilbert basis $\{v_j\}$.
For Hilbert-Schmidt operators $A$ and $B$, define their scalar product
$$\langle A, B\rangle = \sum_i \langle Au_i, Bu_i\rangle$$
with some Hilbert basis $\{u_i\}$.

(b) Show that the sum is absolutely convergent, and independent of the
choice of Hilbert basis. Show that $B^*A$ is Hilbert-Schmidt.
(c) Show that the Hilbert-Schmidt operators form a vector space, which
therefore has the scalar product defined above. Denote the corresponding
norm by $\|A\|_2$, so that
$$\|A\|_2^2 = \langle A, A\rangle = \sum_i |Au_i|^2.$$
Prove the additional properties, where $A$, $B$ denote Hilbert-Schmidt
operators, and $X$ denotes an arbitrary operator.
HS 1. $\|A^*\|_2 = \|A\|_2$.
HS 2. $XA$ and $AX$ are Hilbert-Schmidt, and
$$\|XA\|_2 \le |X|\,\|A\|_2 \quad\text{and}\quad \|AX\|_2 \le |X|\,\|A\|_2.$$
HS 3. A Hilbert-Schmidt operator is compact.
[For HS 3, use the projection on the finite dimensional spaces generated by
$u_1, \ldots, u_N$ for a finite number of $u_i$.]
HS 4. $\|A + B\|_2^2 - \|A\|_2^2 - \|B\|_2^2 = 2 \cdot \mathrm{Re}\langle A, B\rangle$.
HS 5. $\mathrm{Re} \sum \langle Au_i, Bu_i\rangle = \mathrm{Re} \sum \langle A^*u_i, B^*u_i\rangle$.
HS 6. $\langle A^*, B^*\rangle = \overline{\langle A, B\rangle}$.
HS 7. $\langle XA, B\rangle = \langle A, X^*B\rangle$ and $\langle AX, B\rangle = \langle A, BX^*\rangle$.
Trace Class Operators
18. An operator is said to be of trace class if it is the product of two
Hilbert-Schmidt operators, say $A = B^*C$ where $B$, $C$ are Hilbert-Schmidt. For such
operators $A$, define the trace of $A$:
$$\operatorname{tr}(A) = \sum_i \langle Au_i, u_i\rangle = \sum_i \langle Cu_i, Bu_i\rangle = \langle C, B\rangle.$$
The first sum shows that the trace is independent of the choice of $B$, $C$.
Show:
TR 1.
TR 2. If A is of trace class, so are AX and X A, and we have
tr(AX) = tr(XA).
TR 3. $A$ is of trace class if and only if $P_A$ is of trace class, and
TR 4. Let $P$ be a symmetric positive operator. Then $P$ is of trace class if and
only if $\sum \langle Pu_i, u_i\rangle$ converges.
[Hint: Use $P^{1/2}$.] If $A$ is an operator of trace class, define
$$\|A\|_1 = \operatorname{tr}(P_A).$$
TR 5. The operators of trace class form a vector space; the function
$A \mapsto \|A\|_1$ is a norm, satisfying $\|A\|_1 = \|A^*\|_1$.
[Hint: Write $P_{A+B} = U^*(A + B)$ where $U$ is a partial isometry.]
TR 6. If $A$ is of trace class, so are $XA$, $AX$, and we have
$$\|XA\|_1 \le |X|\,\|A\|_1 \quad\text{and}\quad \|AX\|_1 \le |X|\,\|A\|_1.$$
TR 7. If $A$ is of trace class, then $|\operatorname{tr} A| \le \|A\|_1$.
TR 8. Let $T_n$ be a sequence of operators on $H$ converging weakly to an operator
$T$. In other words, for each $v, w \in H$, suppose that $\langle T_nv, w\rangle \to \langle Tv, w\rangle$.
Let $A$ be of trace class. Then
$$\operatorname{tr}(TA) = \lim \operatorname{tr}(T_nA).$$
[Hint: Assume first that $A = P$ is positive symmetric. Since $A$ is compact
(because $A$ is Hilbert-Schmidt and HS 3), there is a Hilbert basis of $H$
consisting of eigenvectors $\{u_i\}$, with $Au_i = c_iu_i$. You will now need the
uniform boundedness theorem to conclude that the norms $|T_n|$ are bounded. Then
use the absolute convergence
to prove the assertion in this case. In general write the polar decomposition
$A = UP$, so $TA = (TU)P$ and you can apply the first part of the proof.]
Note: For complete proofs, cf. [La 3], Appendix of Chapter VII, and [Sh].
CHAPTER XIX
Further Spectral Theorems
In this chapter, we use the spectral theorem of Chapter XVIII to give a
finer theory, making sense of the expression $f(A)$ when $f$ is not continuous.
Ultimately, one wants to use very general functions $f$ in the context
of measure theory, namely bounded measurable functions, as a corollary
of what was done in Chapter XVIII. For our purposes here, we deal
with an intermediate category of functions, essentially characteristic func-
tions of intervals. These give rise to projection operators, whose formal-
ism is important for its own sake. We also want to deal with unbounded
operators as an application.
This chapter may be omitted, and is included only for those who want
to go into spectral theory a little more deeply, as in the next chapter.
The development of spectral measures gives a good example of how
measure theory and the general functional analysis of this chapter can be
put together.
XIX, §1. PROJECTION FUNCTIONS OF OPERATORS
We need to extend the notion $f(A)$ to functions $f$ which are not continuous,
to include at least characteristic functions of intervals. We follow
Riesz-Nagy more or less. We let $H$ be a Hilbert space.
Lemma 1.1. Let $\alpha$ be real, and let $\{A_n\}$ be a sequence of hermitian
operators such that $A_n \ge \alpha I$ for all $n$, and such that $A_n \ge A_{n+1}$. Given
$v \in H$, the sequence $\{A_nv\}$ converges to an element of $H$. If we denote
this element by $Av$, then $v \mapsto Av$ is a bounded hermitian operator.
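Proof. From the inequality
$$\langle A_nv, v\rangle \ge \langle A_{n+1}v, v\rangle \ge \alpha\langle v, v\rangle$$
we conclude that $\langle A_nv, v\rangle$ converges, for each $v \in H$. Since
$$\langle A_n(v + w), v + w\rangle - \langle A_n(v - w), v - w\rangle = 2[\langle A_nv, w\rangle + \langle A_nw, v\rangle] = 4\,\mathrm{Re}\,\langle A_nv, w\rangle$$
(and similarly with $iw$ in place of $w$ for the imaginary part),
it follows that $\langle A_nv, w\rangle$ converges for each pair of elements $v, w \in H$.
Define
$$\lambda_v(w) = \lim_{n\to\infty} \langle A_nv, w\rangle.$$
Then $\lambda_v$ is antilinear, and $|\langle A_nv, w\rangle| \le C|v||w|$ for some $C$ and all $v$,
$w \in H$. Hence there exists an operator $A$ such that
$$\langle Av, w\rangle = \lim_{n\to\infty} \langle A_nv, w\rangle.$$
Since $\langle A_nv, w\rangle = \langle v, A_nw\rangle$, it follows that $A$ is hermitian.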
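Lemma 1.2. Let $f$ be a function on the spectrum of $A$, bounded from
below, and which can be expressed as a pointwise convergent limit of a
decreasing sequence of continuous functions, say $\{h_n\}$. Then
$$\lim_{n\to\infty} h_n(A)$$
is independent of the sequence $\{h_n\}$.
Proof. Say $g_n(t)$ decreases also to $f(t)$. Given $k$ and $\varepsilon > 0$, for large $n$ we have
$$\max(g_n - h_k, 0) \le \varepsilon$$
by Dini's theorem, so for all $t$ we have $g_n(t) \le h_k(t) + \varepsilon$, and hence
$$g_n(A) \le h_k(A) + \varepsilon I.$$
This shows that
$$\lim_{n\to\infty} g_n(A) \le h_k(A) + \varepsilon I$$
and therefore that
$$\lim_{n\to\infty} g_n(A) \le \lim_{k\to\infty} h_k(A) + \varepsilon I.$$
This is true for all $\varepsilon$. Letting $\varepsilon \to 0$ and using symmetry, we have proved
our lemma.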
From Lemma 1.2, we see that the association
$$f \mapsto f(A)$$
can be extended to the linear space generated by functions which can be
obtained as limits from above of decreasing sequences of continuous
functions, and are bounded from below. The map is additive, order
preserving, and clearly multiplicative, i.e.
$$(fg)(A) = f(A)g(A)$$
for $f$, $g$ in this vector space.
The most important functions to which we apply this extension are
characteristic functions like the function $\psi_c(t)$ whose graph is drawn in
Figure 19.1. It is a limit of the functions $h_n(t)$ drawn in Figure 19.2.
[Figure 19.1: graph of $\psi_c$]
[Figure 19.2: graph of $h_n$, with the points $c$ and $c + 1/n$ marked on the axis]
Lemma 1.3. Let $\psi_c(A) = P_c$. If $\alpha I \le A \le \beta I$, then:
(i) $P_c = 0$ if $c < \alpha$, and $P_c = I$ if $c \ge \beta$.
(ii) If $c \le c'$, then $P_c \le P_{c'}$.
Proof. Clear from Lemma 1.2.
Observe that we also have $P_c^2 = P_c$, i.e. that $P_c$ is a projection. We call
$\{P_c\}$ the spectral family associated with $A$.
We keep the same notation, and we shall make use of the two functions
$f_c$, $g_c$ whose graph is drawn in Figure 19.3. Thus $f_c(t) + g_c(t) = |t - c|$.
We have
$$(t - c)(1 - \psi_c(t)) = f_c(t).$$
[Figure 19.3: graphs of $f_c$ and $g_c$, each with the point $c$ marked on the axis]
Hence
(1) $(A - cI)(I - P_c) = f_c(A)$,
(2) $A - cI = f_c(A) - g_c(A)$,
(3) $(A - cI)P_c = -g_c(A)P_c = -g_c(A)$.
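For a diagonal operator the three relations reduce to scalar identities on the eigenvalues, which makes them easy to test. The sketch below is an illustration under assumptions: it takes $A$ to be a diagonal matrix with the listed sample eigenvalues, so that $h(A)$ is diagonal with entries $h(t)$, and it uses the explicit formulas $f_c(t) = \max(t - c, 0)$ and $g_c(t) = \max(c - t, 0)$, which are inferred from the two identities stated for $f_c$ and $g_c$ rather than read off the figure. The function names are ours.

```python
def psi(c, t):
    """Characteristic function of the half-line t <= c."""
    return 1.0 if t <= c else 0.0

def f(c, t):
    return max(t - c, 0.0)

def g(c, t):
    return max(c - t, 0.0)

def identities_hold(eigenvalues, c, tol=1e-12):
    """Check (1), (2), (3) for A = diag(eigenvalues), entry by entry."""
    for t in eigenvalues:
        p = psi(c, t)                                    # diagonal entry of P_c
        ok1 = abs((t - c) * (1.0 - p) - f(c, t)) < tol   # relation (1)
        ok2 = abs((t - c) - (f(c, t) - g(c, t))) < tol   # relation (2)
        ok3 = (abs((t - c) * p + g(c, t) * p) < tol      # relation (3), first equality
               and abs(g(c, t) * p - g(c, t)) < tol)     # relation (3), second equality
        if not (ok1 and ok2 and ok3):
            return False
    return True

eigenvalues = [-1.0, 0.5, 0.5, 2.0]
results = [identities_hold(eigenvalues, c) for c in (-2.0, 0.0, 0.5, 1.0, 3.0)]
```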
Theorem 1.4. Let $\{P_c\}$ be the spectral family associated with $A$. If $b \le c$,
then we have
$$bI \le A \le cI$$
on the image of $P_c - P_b$.
Proof. From (1) above, we have $A - bI = f_b(A)$ on the orthogonal
complement of the image of $P_b$, whence the inequality $bI \le A$ follows on this
complement since $f_b \ge 0$. From (3) above, we have
$$A - cI = -g_c(A)$$
on the image of $P_c$, and since $-g_c$ is $\le 0$, we get $A \le cI$ on this image.
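Theorem 1.5. The family $\{P_c\}$ is strongly continuous from the right.
Proof. Let $v \in H$. Our assertion means that $P_{c+\varepsilon}v \to P_cv$ as $\varepsilon \to 0$. It
suffices to prove that
$$\langle P_{c+\varepsilon}v, v\rangle \to \langle P_cv, v\rangle$$
because
$$|(P_{c+\varepsilon} - P_c)v|^2 = \langle (P_{c+\varepsilon} - P_c)v, v\rangle.$$
Let $h_\varepsilon(t)$ be the function whose graph is shown in Figure 19.4. We have
$$\psi_c \le \psi_{c+\varepsilon} \le h_\varepsilon.$$
[Figure 19.4: graph of $h_\varepsilon$, with the points $c$, $c + \varepsilon$, $c + \delta + \varepsilon$ marked on the axis]
In other words, we have
$$\langle P_cv, v\rangle \le \langle P_{c+\varepsilon}v, v\rangle \le \langle h_\varepsilon(A)v, v\rangle.$$
We let $\varepsilon \to 0$. Then $h_\varepsilon(t)$ decreases to $\psi_c(t)$ and $\langle h_\varepsilon(A)v, v\rangle$ decreases to
$\langle P_cv, v\rangle$, which completes the proof of the theorem.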
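Theorem 1.6 (Lorch). From the left,
$$\lim_{\varepsilon\to 0} (P_c - P_{c-\varepsilon}) = Q_c$$
is the projection on the $c$-eigenspace of $A$.
Proof. Using Theorem 1.4, we have
$$(c - \varepsilon)(P_c - P_{c-\varepsilon}) \le A(P_c - P_{c-\varepsilon}) \le c(P_c - P_{c-\varepsilon})$$
whence
$$|(A - cI)(P_c - P_{c-\varepsilon})| \le \varepsilon.$$
But for each $v$, $\lim_{\varepsilon\to 0} (P_c - P_{c-\varepsilon})v$ exists, say $= w$. It follows that
$Aw = cw$, i.e. $Q_c$ maps $H$ into the $c$-eigenspace.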
Conversely, if $\varphi$ is a continuous function, then for any $A$-invariant
closed subspace $F$, we have
$$\varphi(A|F) = \varphi(A)|F.$$
We want to show that $Q_c$ is the identity on the $c$-eigenspace, and without
loss of generality we may therefore assume that $H = H_c$ is the eigenspace.
Then $P_c = I$ because $\psi_c = 1$ on the spectrum of $A$. If $b < c$, then
$$f_b(A) = A - bI = (c - b)I$$
is invertible, and hence $P_b = 0$ by relation (1). This proves Lorch's theorem.
XIX, §2. SELF-ADJOINT OPERATORS
Let $H$ be a Hilbert space and $A$ a linear map,
$$A: D_A \to H,$$
defined on a dense subspace $D_A$. Consider the set of vectors $v \in H$ such that
there exists $w \in H$ such that
$$\langle u, w\rangle = \langle Au, v\rangle \quad\text{for all } u \in D_A,$$
or in other words, $\langle u, w\rangle - \langle Au, v\rangle = 0$. The set of such $v$ is the
projection on the first factor of the intersection of the kernels of
$$(v, w) \mapsto \langle u, w\rangle - \langle Au, v\rangle, \qquad u \in D_A.$$
It is a vector space. To each $v$ in this vector space there is exactly one
$w$ having the above property, because
$$u \mapsto \langle Au, v\rangle$$
is a functional on a dense subspace. Hence we can define an operator
$A^*$ by the formula
$$A^*v = w,$$
on the space $D_{A^*}$ of such vectors $v$. We call the pair $(A^*, D_{A^*})$ the adjoint
of $A$.
Let $J: H \times H \to H \times H$ be the operator such that $J(x, y) = (-y, x)$.
Then $J^2 = -I$. We note that the graph $G_{A^*}$ of $A^*$ is given by the
formula
$$G_{A^*} = (JG_A)^\perp,$$
where $\perp$ denotes orthogonal complement, and hence the graph of $A^*$ is
closed.
We say that $A$ is closed if its graph $G_A$ is closed.
If $A$ is closed, then $D_{A^*}$ is dense in $H$.
Proof. Let $h \in D_{A^*}^\perp$, so
$$(h, 0) \in G_{A^*}^\perp = (JG_A)^{\perp\perp} = JG_A$$
because we assumed that $A$ is closed. We conclude that $(0, h) \in G_A$, and
hence $h = 0$, proving our assertion.
If $A$ is closed, then $A^{**} = A$.
If $D_A$ and $D_{A^*}$ are dense, then $G_{A^{**}}$ = closure of $G_A$.
If $A$ is defined on $D_A$ and $B$ is defined on $D_B$, if $D_A \subset D_B$, and if the
restriction of $B$ to $D_A$ is $A$, then one usually says that $A$ is contained in $B$,
and one writes $A \subset B$. The above assertion shows that $A \subset A^{**}$.
We say that $A$ is symmetric if $\langle Au, v\rangle = \langle u, Av\rangle$ for all $u, v \in D_A$. We
say that $A$ is self-adjoint, $A = A^*$, if in addition $D_A = D_{A^*}$.
If $A$ is symmetric, then $A \subset A^*$.
This is clear. Recall that we assumed $D_A$ dense in $H$.
If $A$, $B$ are self-adjoint and $A \subset B$, then $A = B$.
This is also clear, because in general $B^* \subset A^*$, so in the self-adjoint
case, $B \subset A$, whence $A = B$.
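Let $A$ be symmetric, defined on $D_A$ dense as above. Let $\lambda \in \mathbf{C}$ not be
real. Then $A - \lambda I$ is injective on $D_A$, because from
$$Au = \lambda u \quad\text{and}\quad \langle Au, u\rangle = \langle u, Au\rangle$$
we conclude
$$\lambda\langle u, u\rangle = \langle Au, u\rangle = \langle u, Au\rangle = \bar\lambda\langle u, u\rangle,$$
so $u = 0$. Hence we can define an operator
$$U = (A + \bar\lambda I)(A + \lambda I)^{-1}$$
on the image $(A + \lambda I)D_A$. We contend that $U$ is unitary. This amounts to
verifying that for $u, v \in D_A$ we have
$$\langle Au + \bar\lambda u, Av + \bar\lambda v\rangle = \langle Au + \lambda u, Av + \lambda v\rangle,$$
which is obvious.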
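The norm identity behind the unitarity of $U$ can be tested numerically in a bounded, finite dimensional stand-in. The sketch below is an illustration under assumptions: $A$ is replaced by a sample real symmetric $2 \times 2$ matrix (hence hermitian), $\lambda$ and $u$ are arbitrary sample values, and we only compare $|(A + \lambda I)u|$ with $|(A + \bar\lambda I)u|$. The function names are ours.

```python
import math

def apply(A, u):
    """Matrix-vector product for nested-list matrices."""
    return [sum(A[i][j] * u[j] for j in range(len(u))) for i in range(len(A))]

def shifted(A, lam, u):
    """Return (A + lam I) u."""
    Au = apply(A, u)
    return [Au[i] + lam * u[i] for i in range(len(u))]

def norm(u):
    return math.sqrt(sum(abs(x) ** 2 for x in u))

A = [[1.0, 2.0], [2.0, -3.0]]      # real symmetric, hence hermitian
lam = 0.5 + 2.0j                    # any non-real number
u = [1.0 + 2.0j, -0.5j]

before = norm(shifted(A, lam, u))              # |(A + lam I) u|
after = norm(shifted(A, lam.conjugate(), u))   # |(A + conj(lam) I) u|
```

The two norms agree because the cross terms in both expansions equal $2\,\mathrm{Re}(\lambda)\langle Au, u\rangle$, and $\langle Au, u\rangle$ is real when $A$ is symmetric.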
Lemma 2.1. If $A$ is symmetric, closed, and $\lambda \in \mathbf{C}$ is not real, then
$(A + \lambda I)D_A$ is closed.
Proof. Let $\{u_n\}$ be a sequence in $D_A$ such that $\{(A + \lambda I)u_n\}$ is Cauchy.
Since $U$ is unitary, it follows that
$$\{(A + \bar\lambda I)u_n\}$$
is also Cauchy, hence $\{(\lambda - \bar\lambda)u_n\}$ is Cauchy, and $\{u_n\}$ is Cauchy, say
converging to $u$. But
$$\{(A + \lambda I)u_n\}$$
is Cauchy, whence also $\{Au_n\}$ is Cauchy. Since the graph of $A$ is
assumed closed, we conclude that $\{(u_n, Au_n)\}$ converges to an element
$(u, Au)$ in the graph, and the sequence
$$\{(A + \lambda I)u_n\}$$
converges to $(A + \lambda I)u$. This proves that $(A + \lambda I)D_A$ is closed.
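Theorem 2.2. Let $A$ be symmetric, closed with dense domain. Let $\lambda \in \mathbf{C}$
be not real, and such that $(A + \lambda I)D_A$ and $(A + \bar\lambda I)D_A$ are dense (whence
equal to $H$ by the lemma). Then $A$ is self-adjoint.
Proof. Let $v \in D_{A^*}$. It suffices to show that $v \in D_A$. We have by definition
$$\langle Au, v\rangle = \langle u, A^*v\rangle, \qquad u \in D_A.$$
Since $(A + \lambda I)D_A = H$, there exists $u_1 \in D_A$ such that
$$A^*v + \lambda v = Au_1 + \lambda u_1.$$
Then
$$\langle Au, v\rangle = \langle u, Au_1 + \lambda u_1 - \lambda v\rangle,$$
whence
$$\langle (A + \bar\lambda I)u, v\rangle = \langle (A + \bar\lambda I)u, u_1\rangle.$$
This proves that $v = u_1$, as was to be shown.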
Remark. In the literature, you will find that the dimension of the
cokernel of $(A + \lambda I)D_A$ is called a defect index. We are concerned here
with a situation in which the defect indices are 0.
Corollary 2.3. Let $A$ be symmetric with dense domain. Let $\lambda \in \mathbf{C}$ be not
real, and such that $(A + \lambda I)D_A$ and $(A + \bar\lambda I)D_A$ are dense. Then the
closure of $G_A$ is the graph of an operator which is self-adjoint.
Proof. Since $A$ is symmetric, the domain of $A^*$ is also dense, and we
have shown above that $G_{A^{**}}$ is the closure of $G_A$, so $A$ has a closure.
It is immediate that this closure is also symmetric, and the theorem
applies.
An operator $A$ defined on $D_A$ is called essentially self-adjoint if the
closure of its graph is the graph of a self-adjoint operator. The corollary
gives a sufficient condition for an operator to be essentially self-adjoint.
Theorem 2.4. Let $A$ be a self-adjoint operator. Let $z \in \mathbf{C}$ and $z$ not
real. Then $A - zI$ has kernel 0. There is a unique bounded operator
$$R(z) = (A - zI)^{-1}$$
which establishes a bijection between $H$ and $D_A$, and is the inverse of
$A - zI$. We have
$$R(z)^* = R(\bar z).$$
If Im $z$, Im $w \ne 0$, then we have the resolvent equation
$$(z - w)R(z)R(w) = R(z) - R(w) = (z - w)R(w)R(z),$$
so in particular, $R(z)$, $R(w)$ commute. We have $|R(z)| \le 1/|\mathrm{Im}\, z|$.
Proof. Let $z = x + iy$. If $u$ is in the domain of $A$, then
$$|(A - zI)u|^2 = |(A - xI)u|^2 + y^2|u|^2 \ge y^2|u|^2$$
because $A$ is symmetric, so the cross terms disappear. This proves that
the kernel of $A - zI$ is 0, and that the inverse of $A - zI$ is continuous,
when viewed as defined on the image of $A - zI$. If $v$ is orthogonal to
this image, i.e.
$$\langle Au - zu, v\rangle = 0$$
for all $u \in D_A$, then $\langle Au, v\rangle = \langle u, \bar zv\rangle$, and since $A$ is self-adjoint, it
follows that $v$ lies in the domain of $A$ and that $Av = \bar zv$. Since the kernel
of $A - \bar zI$ is 0, we conclude that $v = 0$. Hence the image of $A - zI$ is
dense, so that by Lemma 2.1 this image is all of $H$ and $R(z)$ is everywhere
defined, equal to the inverse of $A - zI$. We then have
$$[(A - wI) - (A - zI)]R(w) = (z - w)R(w).$$
Multiplying this on the left by $R(z)$ yields the resolvent formula of the
theorem, whose proof is concluded.
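In the one-dimensional case, where $A$ is multiplication by a real number $a$, the resolvent is the scalar $1/(a - z)$ and every statement of the theorem becomes an identity between complex numbers. The sketch below checks them for sample values; it is an illustration under that assumption, and the name `resolvent` is ours.

```python
def resolvent(a, z):
    """R(z) for the operator 'multiplication by the real number a' on C."""
    return 1.0 / (a - z)

a = 1.5
z = 2.0 + 0.5j
w = -1.0 - 3.0j

# Resolvent equation: (z - w) R(z) R(w) = R(z) - R(w).
lhs = (z - w) * resolvent(a, z) * resolvent(a, w)
rhs = resolvent(a, z) - resolvent(a, w)

# Adjoint relation R(z)^* = R(conj z); the adjoint of a scalar is its conjugate.
adjoint_ok = abs(resolvent(a, z).conjugate() - resolvent(a, z.conjugate())) < 1e-12

# Norm bound |R(z)| <= 1 / |Im z|.
bound_ok = abs(resolvent(a, z)) <= 1.0 / abs(z.imag) + 1e-12
```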
We write
$$R(i) = (A - iI)^{-1} = C + iB$$
where $B$, $C$ are bounded hermitian. From the resolvent equation between
$R(i)$ and $R(-i)$ we conclude that $B$, $C$ commute. We may call $B$ the
imaginary part of $(A - iI)^{-1}$, symbolically
$$B = \mathrm{Im}(A - iI)^{-1}.$$
Lemma 2.5. With the above notation, we have $C = AB$ and $BA \subset AB$.
The kernel of $B$ is 0, and $0 \le B \le I$.
Proof. We have from $R(i)^* = R(-i)$ that
$$(A - iI)^{-1} - (A + iI)^{-1} = 2iB.$$
We multiply this on the left with $A$, noting that
$$A(A - iI)^{-1} = i(A - iI)^{-1} + I$$
and
$$A(A + iI)^{-1} = -i(A + iI)^{-1} + I.$$
We then obtain $C = AB$. For $BA$ we multiply the first relation on the
right by $A$, so that we use
$$(A - iI)^{-1}A \subset i(A - iI)^{-1} + I,$$
and similarly for $A + iI$. The relation $BA \subset AB$ follows. The kernel of $B$
is 0, for any vector in the kernel is also in the kernel of $C = AB$, whence
in the kernel of $(A - iI)^{-1}$, and therefore equal to 0. We leave the
relation $B \ge 0$ to the reader. That $B \le I$ follows from $|R(i)| \le 1$, a special
case of the last inequality in Theorem 2.4.
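In the simplest case, where $H = \mathbf{C}$ and $A$ is multiplication by a real number $a$, these relations can be read off directly. The following computation is only an illustration of the lemma:

```latex
(a - i)^{-1} = \frac{a + i}{a^2 + 1} = C + iB,
\qquad C = \frac{a}{a^2 + 1}, \qquad B = \frac{1}{a^2 + 1},
\qquad\text{so}\qquad C = aB, \quad 0 < B \le 1 .
```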
We now give an example of a self-adjoint operator. It will be shown
later that any self-adjoint operator is of this nature.
Theorem 2.6. Let $\{H_n\}$ be a sequence of Hilbert spaces. Let $A_n$ be a
bounded self-adjoint operator on $H_n$. Let $H$ be the orthogonal direct
sum of the $H_n$, so that $H$ consists of all series $\sum u_n$ with $u_n \in H_n$ and
$$\sum |u_n|^2 < \infty.$$
There exists a unique self-adjoint operator $A$ on $H$ such that each $H_n$
is contained in the domain $D_A$ and such that the restriction of $A$ to $H_n$ is
$A_n$. Its domain is the vector space of series $u = \sum u_n$ such that
$$\sum |A_nu_n|^2 < \infty.$$
Proof. The uniqueness is clear from the property that if $A$, $B$ are
self-adjoint and $A \subset B$, then $A = B$. It suffices now to prove that if we
let $D_A$ be the domain described above, and define $Au$ by $\sum A_nu_n$, then $A$
is self-adjoint. It is clear that $A$ is symmetric. Let $v \in D_{A^*}$. Then
$$\langle u, A^*v\rangle = \langle Au, v\rangle, \qquad u \in D_A.$$
Say $u = \sum u_n$ and $v = \sum v_n$. If $u_n \in H_n$, then
$$\langle u_n, A^*v\rangle = \langle Au_n, v\rangle,$$
$$\langle u_n, (A^*v)_n\rangle = \langle A_nu_n, v_n\rangle,$$
so that $(A^*v)_n = A_nv_n$ and $\sum |A_nv_n|^2 < \infty$,
whence $v \in D_A$, so $D_{A^*} \subset D_A$ and $A$ is therefore self-adjoint. This proves
the theorem.
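For instance (an illustration, not taken from the text), take each $H_n = \mathbf{C}$ and let $A_n$ be multiplication by $n$. Then $H = l^2$ and the theorem yields the unbounded diagonal operator

```latex
H_n = \mathbf{C}, \quad A_n u_n = n\,u_n, \qquad
D_A = \Bigl\{ u = (u_n) \in l^2 : \sum_{n} n^2 |u_n|^2 < \infty \Bigr\}, \qquad
A u = (n\,u_n)_{n \ge 1}.
```

The sequence $u_n = 1/n$ lies in $l^2$ but not in $D_A$, so $A$ is not everywhere defined, although each $A_n$ is bounded.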
In the situation of Theorem 2.6, we use the notation
We deal with the converse of Theorem 2.6. Let $A$ be an arbitrary
self-adjoint operator on the Hilbert space $H$, and let
$$(A - iI)^{-1} = C + iB$$
as above.
We are in a position to decompose our Hilbert space by means of $B$.
Let $\theta_n$ be the function whose graph is given in Figure 19.5, and which
gives rise to a projection operator.
$$\theta_n(t) = \begin{cases} 1 & \text{if } \frac{1}{n+1} < t \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$
[Figure 19.5: graph of $\theta_n$, with the points $1/(n+1)$ and $1/n$ marked on the axis]
Let $\{P_c\}$ be the spectral family for $B$, and let
$$Q_n = \theta_n(B) = P_{1/n} - P_{1/(n+1)}.$$
Then $Q_n$ is a projection operator, and we let
$$H_n = Q_nH.$$
Then
$$H = \bigoplus H_n$$
is an orthogonal direct sum. In fact, let $\theta$ and $\eta$ be the functions whose
graphs are shown in Figure 19.6(a) and (b) respectively.
[Figure 19.6: (a) graph of $\theta(t)$; (b) graph of $\eta(t)$]
Then $1 - \theta = \eta$ and $\eta(B) = 0$ because the spectral family for $B$ is continuous
at 0, in view of Lemma 2.5 (kernel $B = 0$) and Lorch's theorem
(Theorem 1.6).
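Let $s_n(t)$ be the function whose graph is shown in Figure 19.7. Then
$$Bs_n(B) = \theta_n(B) = Q_n.$$
$$s_n(t) = \begin{cases} 1/t & \text{if } \frac{1}{n+1} < t \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$
[Figure 19.7: graph of $s_n$, with the points $1/(n+1)$ and $1/n$ marked on the axis]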
Theorem 2.7. Let $A$ be a self-adjoint operator and let $B = \mathrm{Im}(A - iI)^{-1}$.
Let $Q_n = \theta_n(B)$ be the projection operator defined by the function $\theta_n$
above. Then $A$ is defined on Im $Q_n$, and
$$Q_nA \subset AQ_n = s_n(B)C.$$
Let $H_n = Q_nH$. Then $H$ is the orthogonal direct sum of the spaces $H_n$,
the restriction of $A$ to $H_n$ is a bounded operator $A_n$, and $A$ is the
self-adjoint operator determined by the $A_n$ as in Theorem 2.6.
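Proof. Since $ts_n(t) = \theta_n(t)$, we get $Bs_n(B) = \theta_n(B) = Q_n$. Then by Lemma
2.5
$$AQ_n = ABs_n(B) = Cs_n(B).$$
In particular, $AQ_n$ is everywhere defined. On the other hand,
$$Q_nA = s_n(B)BA \subset s_n(B)AB = s_n(B)C.$$
This proves that $Q_nA \subset AQ_n$. It means that given $v \in D_A$, if
$$v = \sum v_n$$
is the decomposition of $v$ according to the spaces $H_n$, and if
$$Av = \sum w_n,$$
then
$$w_n = Q_nAv = AQ_nv = Av_n.$$
So $Av = \sum Av_n$, and the theorem is proved.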
XIX, §3. EXAMPLE: THE LAPLACE OPERATOR IN THE PLANE
We shall give a typical example of an unbounded symmetric operator.
We shall assume that the reader is acquainted with some notions of
advanced calculus, and in particular with Stokes' theorem in the plane.
These notions are treated later in this book, but to give examples, one
has to use something concrete, taken possibly from other courses. We let
$(x, y)$ be the variables of $\mathbf{R}^2$ and we let
$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
be the Laplace operator. We let
$$L = -\Delta$$
in order that $L$ turn out to be a positive operator.
If $U$ is a region in the plane with piecewise smooth boundary, and
$F = (f, g)$ is a smooth vector field on $U$, we have the Stokes-Green
theorem in dimension 2,
$$\iint_U \operatorname{div} F \, dx\, dy = \int_{\mathrm{Bd}\,U} F \cdot n \, ds,$$
where Bd $U$ is the boundary of $U$ with the appropriate orientation.
Letting
$$F = g \cdot \operatorname{grad} f = (gf_x, gf_y) \quad\text{or}\quad f \cdot \operatorname{grad} g = (fg_x, fg_y),$$
we obtain the formula
$$\iint_U (g\Delta f - f\Delta g)\, dx\, dy = \int_{\mathrm{Bd}\,U} \left(g\frac{\partial f}{\partial n} - f\frac{\partial g}{\partial n}\right) ds.$$
If $f$, $g$ have compact support, and $U$ is the whole plane, then there is no
boundary, and the term on the right in this last formula is equal to 0.
We shall use the scalar product for $f, g \in C_c^\infty(\mathbf{R}^2)$ given by
$$\langle f, g\rangle = \iint_{\mathbf{R}^2} f(x, y)g(x, y)\, dx\, dy.$$

Consider now $L$ or $\Delta$ to be operators on the space $C_c^\infty(\mathbf{R}^2)$ of $C^\infty$
functions with compact support. Then $L$ is of course not bounded, it is
just a linear map. To avoid putting complex conjugates, assume that the
functions $f$, $g$ are real valued in this space. Then we find:
L 1. The operator $L$ or $\Delta$ is symmetric, that is
$$\langle Lf, g\rangle = \langle f, Lg\rangle.$$
This comes from the Stokes-Green formula with no boundary, as we just
observed. In addition, we have the property:
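L 2. The operator $L$ is positive, and in fact we have
$$\langle Lf, f\rangle = \iint_{\mathbf{R}^2} \left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right] dx\, dy.$$
To prove this second formula, consider the differential form
$$\omega(x, y) = \frac{\partial f}{\partial x}f\, dy - \frac{\partial f}{\partial y}f\, dx.$$
Then
$$d\omega = \left[\frac{\partial^2 f}{\partial x^2}f + \left(\frac{\partial f}{\partial x}\right)^2\right] dx \wedge dy - \left[\frac{\partial^2 f}{\partial y^2}f + \left(\frac{\partial f}{\partial y}\right)^2\right] dy \wedge dx$$
whence the second formula follows by the standard form of Stokes'
theorem, namely
$$\iint_U d\omega = \int_{\mathrm{Bd}\,U} \omega.$$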
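Properties L 1 and L 2 have a discrete analogue that is easy to test. The sketch below is not the operator of the text but a finite difference stand-in, under stated assumptions: functions are replaced by values on a small square grid and are taken to be 0 outside it (the analogue of compact support), $L$ is replaced by the standard five-point approximation of $-\Delta$ with mesh size `h`, and the integral by a Riemann sum. The grid values and the names `neg_laplacian`, `inner` are ours.

```python
def neg_laplacian(f, h=1.0):
    """Five-point approximation of L = -Delta; f is treated as 0 outside the grid."""
    n, m = len(f), len(f[0])

    def at(i, j):
        return f[i][j] if 0 <= i < n and 0 <= j < m else 0.0

    return [[(4.0 * at(i, j) - at(i + 1, j) - at(i - 1, j)
              - at(i, j + 1) - at(i, j - 1)) / h ** 2
             for j in range(m)] for i in range(n)]

def inner(f, g, h=1.0):
    """Riemann-sum version of the scalar product <f, g>."""
    return h * h * sum(f[i][j] * g[i][j]
                       for i in range(len(f)) for j in range(len(f[0])))

f = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0], [2.0, 1.0, 1.0]]
g = [[0.5, 0.0, 1.0], [1.5, 2.0, -2.0], [0.0, 1.0, 4.0]]

symmetry_gap = inner(neg_laplacian(f), g) - inner(f, neg_laplacian(g))  # analogue of L 1
energy = inner(neg_laplacian(f), f)                                     # analogue of <Lf, f>
```

In the discrete setting the role of Stokes' theorem is played by summation by parts, and `energy` is a sum of squared differences of neighboring values.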
Considering L as an operator on functions on the unit disc, say, or on

the plane with appropriate behavior at infinity to insure convergence, one
can then show that L is a self-adjoint operator. For the analogous
theorem on the upper half plane, cf. $SL_2(\mathbf{R})$.
It is a technical and not trivial matter to give an explicit representa-
tion for the resolvent in terms of classical integrals. We don't go into
this here. However, we do mention one other object associated with a
situation like the above, namely a fundamental solution. Let z = (x, y),
and define
\[ G(z, z') = \frac{1}{2\pi} \log |z - z'|. \]
Then G is symmetric in z, z' and is $C^\infty$ except on the diagonal z = z'.

Furthermore, the (improper) integral of $\log |z|$ exists on any compact
subset of the plane, because the function $\log r$ is locally integrable near
r = 0, and
\[ dx\, dy = r\, dr\, d\theta, \]
where $(r, \theta)$ are polar coordinates. (No fancy integration is needed here,
but in the language of integration, we could say that $\log r$ is locally $L^1$.)
Now we may view G as defining an integral operator $S_G$
by the formula
\[ S_G f(z') = \int_{\mathbf{R}^2} G(z, z')\, f(z)\, dz, \]
where dz = dx dy for simplicity. This integral can also be written
\[ \frac{1}{2\pi} \int_{\mathbf{R}^2} (\log |w|)\, f(w + z')\, dw. \]
Standard theorems from advanced calculus justify the fact that you can
differentiate under the integral sign because f is assumed to be smooth
with compact support, thus showing that $S_G f$ is $C^\infty$. (Cf. Lemma 2.2 of
Chapter VIII.) If D is the open disc of radius 1, then it is immediate that
$S_G$ maps $C_c^\infty(D)$ into $BC^\infty(D)$, namely that $S_G f$ is bounded.
We now have the fundamental formula:
L 3.
\[ S_G \Delta = I, \]
where I is the identity. Suppose we fix z', and let
\[ g(z) = G(z, z') \]
viewed as a function of z only. Then the formula asserts that
\[ \iint_{\mathbf{R}^2} g\, \Delta f\, dx\, dy = f(z'). \]
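For the proof, let $r = |z - z'|$, and view f in terms of the polar coordinates
$(r, \theta)$. Apply the Stokes-Green formula to the region $U(\varepsilon)$ outside a
circle of radius $\varepsilon$, so that the boundary is the circle $S(\varepsilon)$ of radius $\varepsilon$ with
reversed orientation. We have $ds = d\theta$ if we parametrize the circle of
radius $\varepsilon$ by
\[ 0 \leq \theta \leq 2\pi\varepsilon. \]
Also, $\Delta g = 0$. The right-hand side of Green's formula gives
\[ \int_{S(\varepsilon)} \left( f\,\frac{\partial g}{\partial n} - g\,\frac{\partial f}{\partial n} \right) ds = \int_0^{2\pi\varepsilon} f(\varepsilon, \theta)\, \frac{1}{2\pi\varepsilon}\, d\theta - \int_0^{2\pi\varepsilon} \frac{1}{2\pi}\, (\log \varepsilon)\, \frac{\partial f}{\partial r}\, d\theta. \]
As $\varepsilon \to 0$, the first term goes to f(0, 0) and the second term goes to 0.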
Hence the desired formula follows.
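As a numerical sanity check of L 3 (a sketch only, not part of the argument), one can take z' = 0 and the radial function $f(r) = e^{-r^2}$. This f is an assumption made for convenience: it is not compactly supported, but it decays fast enough that truncating the integral at a hypothetical radius `r_max` changes nothing visible. The helper name `radial_pairing` is ours.

```python
import math

def radial_pairing(r_max=10.0, n=200_000):
    """Midpoint-rule value of the integral of g * (Laplacian of f) over the
    plane, with g = (1/2pi) log r and the radial test function f = exp(-r^2)."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        # Laplacian of a radial function: f'' + f'/r = (4 r^2 - 4) exp(-r^2).
        laplace_f = (4.0 * r * r - 4.0) * math.exp(-r * r)
        g = math.log(r) / (2.0 * math.pi)
        # dx dy = r dr dtheta; the angular integral contributes a factor 2*pi.
        total += g * laplace_f * r * 2.0 * math.pi * h
    return total

value = radial_pairing()
print(value)  # expected to be close to f(0) = 1
```

The logarithmic singularity causes no trouble for the quadrature because the integrand behaves like $r \log r$ near r = 0, which is the local integrability used in the text.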
We see that we have inverted the Laplace operator in some fashion,

but only by a function with a logarithmic singularity, and definitely
unbounded. Such a fundamental solution is, however, useful for con-
structing other solutions, or for constructing the resolvent. We don't go
into this here. Cf. for instance, Folland [Fo]. The resolvent can be
represented by a kernel in terms of Bessel functions.
CHAPTER XX
Spectral Measures
In the spectral theorems of Chapters XVIII and XIX, we defined func-

tions of an operator f(A) with continuous functions first, and then essen-
tially characteristic functions of an interval by a limiting process. If v is
a given vector, then the association
\[ f \mapsto \langle f(A)v, v\rangle \]
defines a functional on $C_c(\mathbf{R})$. But from Chapter IX we know that such a

functional determines a unique measure, which is thus associated with the
operator A and the vector v. This measure is called a spectral measure.
The point of this chapter is to reformulate the results of previous chap-
ters in terms of measures, and to extend the meaning of f(A) to cases
when f is Borel measurable. In this way, we put together a lot of
previous material, which is thus put to work: the spectral theorem of
Chapter XIX; measures on locally compact spaces in Chapter IX; convo-
lutions and Dirac sequences (families) in Chapter VIII.
XX, §1. DEFINITION OF THE SPECTRAL MEASURE
We first state formally as a theorem the measure associated with the

functional mentioned in the introduction.
Theorem 1.1. Let A be a bounded hermitian operator on a Hilbert
space H. Let $v \in H$. There exists a unique positive measure $\mu_v = \mu_{A,v}$
on $\mathbf{R}$ such that for every $\varphi \in C_c(\mathbf{R})$ we have
\[ \langle \varphi(A)v, v\rangle = \int_{\mathbf{R}} \varphi\, d\mu_v. \]
This measure is finite, and especially
\[ \mu_v(\mathbf{R}) \leq |v|^2. \]
Proof. Let $\lambda_v$ be the functional on $C_c(\mathbf{R})$ defined by
\[ \lambda_v(\varphi) = \langle \varphi(A)v, v\rangle. \]
If $\varphi \geq 0$, then $\varphi(A) \geq 0$ by Theorem 4.4 of Chapter XIX. Hence $\lambda_v$ is a
positive functional. The existence and uniqueness of the measure $\mu_v$ is
then a special case of Theorem 2.7 of Chapter IX. Furthermore, by
Theorem 4.4 of Chapter XIX we have
\[ |\lambda_v(\varphi)| \leq \|\varphi\|_\infty\, |v|^2, \]
thus giving the desired bound for the measure $\mu_v$, and concluding the
proof.
The measure $\mu_v$ is called the spectral measure associated with A and v.
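In finite dimension the spectral measure is a finite sum of point masses, which gives a quick way to see Theorem 1.1 at work. The sketch below assumes the symmetric matrix A = [[2, 1], [1, 2]] (eigenvalues 1 and 3) and the vector v = (1, 2); both are illustrative choices of ours. It uses $\varphi(t) = t^2$, which is not compactly supported but agrees on the spectrum {1, 3} with a function in $C_c(\mathbf{R})$.

```python
import math

# Symmetric matrix with eigenvalues 1 and 3 (an illustrative choice).
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues = [1.0, 3.0]
s = 1.0 / math.sqrt(2.0)
eigenvectors = [(s, -s), (s, s)]  # orthonormal eigenvectors for 1 and 3

def dot(u, w):
    return u[0] * w[0] + u[1] * w[1]

def apply(M, u):
    return (M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1])

def spectral_weights(v):
    """Masses of mu_v at the eigenvalues: |<v, e_i>|^2."""
    return [dot(v, e) ** 2 for e in eigenvectors]

def integrate(phi, v):
    """Integral of phi against the discrete measure mu_v."""
    return sum(phi(lam) * w for lam, w in zip(eigenvalues, spectral_weights(v)))

v = (1.0, 2.0)
lhs = dot(apply(A, apply(A, v)), v)     # <A^2 v, v>
rhs = integrate(lambda t: t * t, v)     # integral of t^2 with respect to mu_v
total_mass = sum(spectral_weights(v))   # mu_v(R), to compare with |v|^2
print(lhs, rhs, total_mass)
```

Here the total mass equals $|v|^2$, so the bound on $\mu_v(\mathbf{R})$ is attained.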
By polarization, for $v, w \in H$ we see the existence and uniqueness of a
complex measure $\mu_{v,w}$ such that
\[ \langle \varphi(A)v, w\rangle = \int_{\mathbf{R}} \varphi\, d\mu_{v,w}. \]
It is clear that $\mu_{v,w}$ is C-linear in v and anti-linear in w. Furthermore, we
again have a bound
\[ |\langle \varphi(A)v, w\rangle| \leq \|\varphi\|_\infty\, |v|\,|w|, \]
whence by Theorem 4.2 of Chapter IX we also obtain the bound
\[ \|\mu_{v,w}\| \leq |v|\,|w|. \]
The measure $\mu_{v,w}$ is also called the spectral measure associated with A, v,
w. Applying the defining formula for $\mu_{v,w}$ to a real valued function $\varphi$, we
see immediately that
\[ \mu_{v,w} = \overline{\mu_{w,v}}. \]

Let $BM(\mathbf{R})$ be the Banach space of bounded (Borel) measurable functions
on $\mathbf{R}$. For each $f \in BM(\mathbf{R})$ the association
\[ (v, w) \mapsto \int f\, d\mu_{v,w} \]
is linear in v and anti-linear in w. Furthermore
\[ \left| \int f\, d\mu_{v,w} \right| \leq \|f\|_\infty\, |v|\,|w|, \]
as one sees by applying the dominated convergence theorem to a sequence
$\{\varphi_n\}$ in $C_c(\mathbf{R})$ approaching f pointwise almost everywhere with
respect to the measure $|\mu_{v,w}|$, and such that $|\varphi_n| \leq \|f\|_\infty$. Thus our association
is continuous, and there exists a unique bounded operator, which
we denote by f(A), such that
\[ \langle f(A)v, w\rangle = \int f\, d\mu_{v,w}. \]
The following properties are then satisfied for $f, g \in BM(\mathbf{R})$.
SPEC 1. (fg)(A) = f(A)g(A).
SPEC 2. $f(A)^* = \bar{f}(A)$.
SPEC 3. If $f_1$ is the function $f_1(t) = 1$, then $f_1(A) = I$.
SPEC 4. If the functions f(t) and g(t) = tf(t) are bounded measurable,
then g(A) = Af(A).
SPEC 5. We have $|f(A)| \leq \|f\|_\infty$. Furthermore, if $\{f_n\}$ is a bounded
sequence in $BM(\mathbf{R})$ converging pointwise to f, then $\{f_n(A)\}$
converges strongly to f(A).
Properties SPEC 1 through SPEC 4 are special cases of the spectral
theorem, as formulated in Theorem 4.4 of Chapter XVIII, in case
$f, g \in C_c(\mathbf{R})$, and so is the bound
\[ |f(A)| \leq \|f\|_\infty \]
in that case. The properties for $f \in BM(\mathbf{R})$ then follow by applying
the dominated convergence theorem and taking limits. For instance, to
prove SPEC 1, fix $\psi \in C_c(\mathbf{R})$ and let $\{\varphi_n\}$ be a bounded sequence in $C_c(\mathbf{R})$
converging to f pointwise almost everywhere with respect to the positive
measures
\[ |\mu_{\psi(A)v,w}| \quad\text{and}\quad |\mu_{v,w}|. \]
We obtain
\[ \langle \varphi_n(A)\psi(A)v, w\rangle = \int \varphi_n \psi\, d\mu_{v,w} = \int \varphi_n\, d\mu_{\psi(A)v,w}, \]
which converges to
\[ \int f\psi\, d\mu_{v,w} = \langle (f\psi)(A)v, w\rangle \]
by the dominated convergence theorem if we use the first expression on
the right, and also converges to
\[ \int f\, d\mu_{\psi(A)v,w} = \langle f(A)\psi(A)v, w\rangle \]
if we use the second expression on the right. This takes care of one
factor. We take care of the other by using a sequence $\{\psi_n\}$ converging to
g in the same manner as above. This proves SPEC 1, and also proves
the equivalent formula
SPEC 6. $\int fg\, d\mu_{v,w} = \int f\, d\mu_{g(A)v,w}$.
We wish to extend the above results to unbounded operators.
Theorem 1.2. Let A be a self-adjoint operator. There exists a unique
association $f \mapsto f(A)$ from $BM(\mathbf{R})$ into the bounded operators on H
satisfying SPEC 1 through SPEC 5.
Proof. By Theorem 2.7 of Chapter XIX there exists a direct sum
decomposition
\[ H = \bigoplus_n H_n \quad\text{and}\quad A = \bigoplus_n A_n, \]
where $A_n = A|H_n$ is the restriction of A to $H_n$ and is bounded self-adjoint.
Let $f \in BM(\mathbf{R})$ and $v \in H$,
\[ v = \sum_n v_n \quad\text{with } v_n \in H_n. \]
Since $|f(A_n)v_n| \leq \|f\|_\infty |v_n|$, there is a unique bounded operator f(A) such
that
\[ f(A)v = \sum_n f(A_n)v_n \]
for all $v \in H$.
To each $H_n$ and $v_n \in H_n$ we can associate the measure $\mu_{v_n}$ as above. Since
the series
\[ \sum_n \langle f(A)v_n, v_n\rangle \]
converges absolutely, and defines a positive functional on $C_c(\mathbf{R})$ (even on
$BM(\mathbf{R})$). Therefore:
Proposition 1.3. Given a direct sum decomposition as above, there exists
a unique positive measure $\mu_v$ such that for all $f \in C_c(\mathbf{R})$ we have the
formula
The formalism of the five SPEC properties extends at once to the case of
an unbounded operator A. For example, in the case of SPEC 4, note
that
where g(t) = tf(t).
It follows that
\[ \sum f(A)v_n \in D_A, \]
where $D_A$ is the domain of A, as in Chapter XIX, Theorems 2.6 and 2.7.
Hence SPEC 4 is valid.
Uniqueness will be proved in the next section, as an application of
Dirac families.
We defined the measure $\mu_v$ non-canonically, in a way seemingly dependent
on the decomposition of the Hilbert space into a direct sum such
that A restricts to a bounded operator on each summand. Of course, the
measure can be characterized intrinsically as follows.
Theorem 1.4. Let A be a self-adjoint operator on H. Let $v \in H$. The
measure $\mu_v$ is the unique positive measure $\mu$ such that for all $\varphi \in C_c(\mathbf{R})$
we have
\[ \langle \varphi(A)v, v\rangle = \int_{\mathbf{R}} \varphi\, d\mu. \]
Proof. Assuming the uniqueness in Theorem 1.2 to be proved in the
next section, the present theorem is merely a special case of Theorem 2.7,
Chapter IX (associating measures to functionals).
Example. This example is the unbounded analogue of the example

given in Chapter XVIII, §4. Let A be self-adjoint, and assume that there
exists a positive number c such that
for all v E H.
Then there exists a unique differentiable mapping $K: \mathbf{R}^+ \to \operatorname{End}(H)$ satisfying
the following conditions:
H 1. The image of K is contained in the domain of A, and
\[ \frac{d}{ds} K(s) = -AK(s). \]
H 2. For each $v \in H$ we have
\[ \lim_{s \to 0} K(s)v = v. \]
The proof is an exercise, which can be carried out by using the case for
bounded operators, and the direct sum decomposition of Theorem 2.6
for unbounded operators. I took the statement from Faltings [Fa 1],
Lemma 3.4. Readers can find another idea for the existence proof in this
reference, which also proves the uniqueness.
Observe that for each positive s, the function $\varphi_s(t) = e^{-st}$ is bounded
on the spectra of the bounded operators $A_n$ on the components $H_n$, and
sufficiently uniformly so that there is no difficulty in handling the existence
part of the proof by considering its effect on infinite sums $\sum v_n$. As
in Proposition 1.3 the existence proof by this method is not invariant,
but the uniqueness shows that the end result $\varphi_s(A)$ is independent of the
direct sum decomposition of H. One may write
\[ K(s) = e^{-sA}, \]
and K(s) is called the heat operator associated with A.
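For a positive symmetric matrix, conditions H 1 and H 2 can be checked by hand or numerically. The following sketch assumes A = [[2, 1], [1, 2]] and v = (1, 2), illustrative choices of ours, and builds K(s) from the eigen-decomposition; the function names are hypothetical. It compares a central difference quotient of K(s)v with −AK(s)v, and evaluates K(s)v for a very small s.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]          # positive: eigenvalues 1 and 3
eigenvalues = [1.0, 3.0]
s2 = 1.0 / math.sqrt(2.0)
eigenvectors = [(s2, -s2), (s2, s2)]  # orthonormal eigenvectors

def heat(s, v):
    """K(s)v = sum_i exp(-s * lambda_i) <v, e_i> e_i."""
    x, y = 0.0, 0.0
    for lam, e in zip(eigenvalues, eigenvectors):
        c = math.exp(-s * lam) * (v[0] * e[0] + v[1] * e[1])
        x += c * e[0]
        y += c * e[1]
    return (x, y)

def apply_A(u):
    return (A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1])

v = (1.0, 2.0)
s, step = 0.7, 1e-6

# H 1: central difference for dK/ds compared with -A K(s).
plus, minus = heat(s + step, v), heat(s - step, v)
derivative = tuple((p - m) / (2.0 * step) for p, m in zip(plus, minus))
minus_AK = tuple(-c for c in apply_A(heat(s, v)))
h1_error = max(abs(d - t) for d, t in zip(derivative, minus_AK))

# H 2: K(s)v should approach v as s -> 0.
h2_error = max(abs(k - c) for k, c in zip(heat(1e-9, v), v))
print(h1_error, h2_error)
```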
XX, §2. UNIQUENESS OF THE SPECTRAL MEASURE:

THE TITCHMARSH-KODAIRA FORMULA
The uniqueness proof of this section provides a substantial example of

the use of Dirac families, with weaker conditions than have been men-
tioned previously. For our purposes here, we define a Dirac family to
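be a family $\{\varphi_\varepsilon\}$ ($\varepsilon > 0$) of $L^1$-functions on $\mathbf{R}$ satisfying the following
properties:
DIR 1. We have $\varphi_\varepsilon \geq 0$ for all $\varepsilon$.
DIR 2. For all $\varepsilon$, we have
\[ \int_{\mathbf{R}} \varphi_\varepsilon(x)\, dx = 1. \]
DIR 3. Given $\delta > 0$ and $\delta' > 0$, we have
\[ \int_{-\infty}^{-\delta} \varphi_\varepsilon + \int_{\delta}^{\infty} \varphi_\varepsilon < \delta' \]
for all $\varepsilon$ sufficiently close to 0.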
Theorem 2.1. Let $\{\varphi_\varepsilon\}$ be a Dirac family satisfying DIR 1, DIR 2,
DIR 3. Let h be bounded measurable on $\mathbf{R}$. Then $\varphi_\varepsilon * h$ converges
uniformly to h as $\varepsilon \to 0$, on every compact set where h is continuous.
Proof. Same as in Chapter VIII, Theorem 3.1.
Suppose given an association $f \mapsto f(A)$ satisfying the five spectral properties.
For each $v, w \in H$ there is a unique measure $\mu_{v,w}$ such that
\[ \langle f(A)v, w\rangle = \int f\, d\mu_{v,w}. \]
Let z be complex and not real. The function f(t) such that
\[ f(t) = \frac{1}{t - z} \]
is bounded measurable, and tf(t) is bounded. Also (t - z)f(t) = 1. Hence
\[ (A - zI)f(A) = I. \]
This means that the resolvent has the integral expression
\[ \langle (A - zI)^{-1} v, w\rangle = \int_{\mathbf{R}} \frac{1}{t - z}\, d\mu_{v,w}(t). \]
We write $\mu_v$ instead of $\mu_{v,v}$. Note that $\mu_v$ is a positive measure.
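Theorem 2.2. Let A be a self-adjoint operator on H and let $v \in H$. Let
$R(z) = (A - zI)^{-1}$ for z not real. For any $\psi \in C_c(\mathbf{R})$ we have
\[ \int_{\mathbf{R}} \psi(\lambda)\, d\mu_v(\lambda) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{\mathbf{R}} \langle [R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)]v, v\rangle\, \psi(\lambda)\, d\lambda. \]
If $\lambda_1 < \lambda_2$ are real numbers which have $\mu_v$-measure 0, then
\[ \int_{\lambda_1}^{\lambda_2} d\mu_v(\lambda) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{\lambda_1}^{\lambda_2} \langle [R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)]v, v\rangle\, d\lambda. \]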
The proof is based on the following lemma.
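Lemma 2.3. Let $\mu$ be a positive regular measure on $\mathbf{R}$ such that $\mu(\mathbf{R})$ is
finite. Then for $\psi \in C_c(\mathbf{R})$ we have
\[ \lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\varepsilon}{(t - \lambda)^2 + \varepsilon^2}\, \psi(\lambda)\, d\mu(t)\, d\lambda = \int_{-\infty}^{\infty} \psi(\lambda)\, d\mu(\lambda). \]
Furthermore, if $\lambda_1 < \lambda_2$ are real and such that the set $\{\lambda_1, \lambda_2\}$ has $\mu$-measure
0, then
\[ \lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \int_{-\infty}^{\infty} \frac{\varepsilon}{(t - \lambda)^2 + \varepsilon^2}\, d\mu(t)\, d\lambda = \int_{\lambda_1}^{\lambda_2} d\mu(\lambda). \]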
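Proof. First observe that the family of functions
\[ \varphi_\varepsilon(t) = \frac{1}{\pi}\, \frac{\varepsilon}{t^2 + \varepsilon^2} \]
is a Dirac family on $\mathbf{R}$ for $\varepsilon \to 0$. The left-hand side integrals in our
lemma can be written
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \varphi_\varepsilon(t - \lambda)\, h(\lambda)\, d\mu(t)\, d\lambda, \]
where h is either $\psi$ or the characteristic function of the interval $[\lambda_1, \lambda_2]$.
We apply Fubini's theorem to see that this expression is equal to
\[ \int_{-\infty}^{\infty} \varphi_\varepsilon * h(t)\, d\mu(t). \]
Note that $\varphi_\varepsilon * h$ is bounded, and converges pointwise to h if $h = \psi$, and
pointwise to h except at the end points $\lambda_1, \lambda_2$ in the other case. Since
we picked our interval so that the end points have $\mu$-measure 0, we can
apply the dominated convergence theorem to conclude the proof.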
The lemma obviously proves Theorem 2.2, because
\[ \frac{1}{t - \lambda - i\varepsilon} - \frac{1}{t - \lambda + i\varepsilon} = \frac{2i\varepsilon}{(t - \lambda)^2 + \varepsilon^2}. \]
Furthermore, Theorem 2.2 provides the desired uniqueness left hanging
in the last section, because it gives the value of the measure entirely in
terms of the resolvent and Lebesgue measure, as on the right-hand side
of the first formula on elements of $C_c(\mathbf{R})$.
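The second formula of Theorem 2.2 can be tried out on a matrix, where the resolvent is an explicit 2 × 2 inverse. The sketch below assumes A = [[2, 1], [1, 2]] and v = (1, 2), our illustrative choices, so that $\mu_v$ has mass 1/2 at the eigenvalue 1 and mass 9/2 at the eigenvalue 3. It integrates the resolvent jump over [0, 2] for a small fixed $\varepsilon$ instead of taking the limit, so agreement is only approximate.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # eigenvalues 1 and 3
v = (1.0, 2.0)

def resolvent_form(z):
    """<(A - zI)^{-1} v, v> for the 2x2 matrix A, using the explicit inverse."""
    a, b, c, d = A[0][0] - z, A[0][1], A[1][0], A[1][1] - z
    det = a * d - b * c
    # (A - zI)^{-1} = (1/det) * [[d, -b], [-c, a]]
    x = (d * v[0] - b * v[1]) / det
    y = (-c * v[0] + a * v[1]) / det
    return x * v[0] + y * v[1]   # v is real, so no conjugation is needed

def inversion_integral(lam1, lam2, eps, n=100_000):
    """Midpoint rule for (1/2 pi i) * integral of <[R(l+i eps) - R(l-i eps)]v, v>."""
    h = (lam2 - lam1) / n
    total = 0.0
    for k in range(n):
        lam = lam1 + (k + 0.5) * h
        jump = resolvent_form(complex(lam, eps)) - resolvent_form(complex(lam, -eps))
        total += (jump / (2j * math.pi)).real * h
    return total

mass = inversion_integral(0.0, 2.0, eps=1e-3)
print(mass)  # expected to be near mu_v((0, 2)) = 0.5
```

The end points 0 and 2 carry no mass, as the theorem requires; the residual discrepancy is of order $\varepsilon$.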
It is possible to develop the spectral theory by starting with a dir-
ect proof of Theorem 2.2, showing that the limit on the right-hand
side exists. One then defines the spectral measure as that associated
with the corresponding functional, and one proves the other properties
from there. Cf. Akhiezer-Glazman, Theory of Linear Operators in Hil-
bert Space, Translated from the Russian, New York, Frederick Ungar,
1963, pp. 8 and 31.
XX, §3. UNBOUNDED FUNCTIONS OF OPERATORS
In the first two sections, we studied bounded functions of an operator,

and this operator could be bounded or unbounded. But the values f(A)
were bounded. We shall now extend this definition to arbitrary Borel
measurable functions f, and in this way recover A itself as an integral. If
A is unbounded, then we shall see that A = f(A) where f(t) = t; and of
course, t is not a bounded function of t.
Theorem 3.1. Let A be a self-adjoint operator. Let f be a real valued
Borel measurable function on $\mathbf{R}$. Then there exists a unique self-adjoint
operator f(A) such that:
(i) The domain of f(A) consists of those $v \in H$ for which
\[ \int f^2\, d\mu_v < \infty. \]
(ii) For all v in the domain of f(A), we have
\[ \langle f(A)v, v\rangle = \int f\, d\mu_v. \]
(iii) $|f(A)v|^2 = \int f^2\, d\mu_v$.
(iv) If f is bounded, then f(A) has the previous meaning, and if f(t) = t,
then f(A) = A.
Proof. Observe first that the integral in (ii) exists by the Schwarz
inequality. To prove the theorem, let
\[ f_n(t) = \begin{cases} f(t) & \text{if } n < f(t) \leq n + 1, \\ 0 & \text{otherwise.} \end{cases} \]
Let $Y_n = f^{-1}((n, n + 1])$. Thus $f_n$ is equal to f on $Y_n$ and 0 outside $Y_n$.
Let $\chi_n$ be the characteristic function of $Y_n$ and let $E_n = \chi_n(A)$. Then $E_n$ is
a projection operator. Let
\[ H_n = E_n H. \]
Then the $H_n$ are clearly mutually orthogonal, and we contend that we
have a direct sum decomposition
\[ H = \bigoplus_n H_n \]
as in Theorem 2.6 of Chapter XIX.
To see this, note that
\[ \sum_{n=-N}^{N} \chi_n \to 1 \]
as $N \to \infty$, and hence
\[ \sum_{n=-N}^{N} E_n \to I \]
strongly.
Let $B_n = f_n(A)$, so that $B_n$ is bounded, and operates on $H_n$ through the
projection on $H_n$ because $f_n \chi_n = \chi_n f_n = f_n$, whence
\[ B_n E_n = E_n B_n = B_n. \]
Let f(A) = B be the self-adjoint operator whose domain is the usual one,
consisting of $v = \sum v_n$ with $v_n \in H_n$ and $\sum |B_n v_n|^2 < \infty$. Then
\[ \sum_n \langle f_n(A)v_n, f_n(A)v_n\rangle = \sum_n \int f_n^2\, d\mu_v = \int f^2\, d\mu_v \]
by the monotone convergence theorem. It follows that $f \in \mathcal{L}^2(\mu_v)$. The
converse is similarly clear. This proves (iii). Also we get
\[ \langle f(A)v, v\rangle = \sum_n \langle f_n(A)v_n, v_n\rangle = \sum_n \int f_n\, d\mu_v = \int f\, d\mu_v. \]
This proves everything except the final assertion that if f(t) = t, then
f(A) = A. But this follows from the fact that $f_n(A)$ is equal to A restricted
to $H_n = E_n H$, since $f_n(A)$ and $\chi_n(A)$ have the usual meaning, as in
§1. This concludes the proof of the theorem.
XX, §4. SPECTRAL FAMILIES OF PROJECTIONS
By a spectral family in a Hilbert space H we mean a family of orthogonal
projections $\{P_t\}$, $t \in \mathbf{R}$, satisfying the following conditions:
SF 1. If $a \leq b$, then $P_a \leq P_b$.
SF 2. $\lim_{t \to -\infty} P_t = 0$ and $\lim_{t \to \infty} P_t = I$ strongly.
The first condition means that if $H_a$ and $H_b$ are the subspaces on which
$P_a$ and $P_b$ project H, then $H_a \subset H_b$ and $P_a$ projects $H_b$ on $H_a$. The second
means that for each vector $v \in H$ we have
\[ \lim_{t \to -\infty} P_t v = 0 \quad\text{and}\quad \lim_{t \to \infty} P_t v = v. \]
In Chapter XIX we defined such a family for a bounded hermitian

operator. In this case, we note that $P_a = 0$ for a large negative, and
$P_b = I$ for b large positive. A spectral family satisfying this additional
condition is called limited.
Observe also that the spectral family associated with a bounded opera-
tor is continuous on the right by Theorem 1.5 of Chapter XIX. How-
ever, we do not assume right-continuity in our general definition of a
spectral family. The spectral family associated with a bounded operator
A was defined as follows. For each $b \in \mathbf{R}$ we let $\psi_b$ be the function as
shown in Figure 20.1.
\[ \psi_b(t) = 1 \text{ if } t \leq b, \qquad \psi_b(t) = 0 \text{ if } t > b. \]
Figure 20.1
Then $P_t = \psi_t(A)$ defines the spectral family associated with A. But we

have seen in the first section of this chapter how to make sense of f(A)
when f is bounded measurable and A is a self-adjoint operator, not
necessarily bounded. This allows us to get:
Theorem 4.1. Let A be a self-adjoint operator on H. Let $P_t = \psi_t(A)$.
Then $\{P_t\}$ is a spectral family, strongly continuous on the right.
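Proof. As before, we shall obtain an expression for $P_t$ in terms of a
direct sum decomposition as in Theorem 2.6 and 2.7 of Chapter XIX.
Suppose that
\[ H = \bigoplus_n H_n \quad\text{and}\quad A = \bigoplus_n A_n, \]
where each $A_n = A|H_n$ is bounded hermitian. Let $\{P_t^{(n)}\}$ be the spectral
family of $A_n$ on $H_n$, so $P_t^{(n)} = \psi_t(A_n)$. Then by §1, we get
\[ P_t v = \sum_n P_t^{(n)} v_n \quad\text{for } v = \sum_n v_n. \]
The first condition SF 1 is obviously satisfied. For the second, fix
$v = \sum v_n$. Then
\[ |P_t v|^2 = \sum_n |P_t^{(n)} v_n|^2. \]
Select N so large that
\[ \sum_{n > N} |v_n|^2 < \varepsilon. \]
Then let $t \to -\infty$ to get the first limit. For $t \to \infty$ consider $v - P_t v$.
Finally, we want to prove continuity from the right, i.e. for $v \in H$ we
want to show
\[ \lim_{\delta \to 0} (P_{t+\delta} - P_t)v = 0. \]
We look at
\[ |(P_{t+\delta} - P_t)v|^2 = \sum_n \left| [P_{t+\delta}^{(n)} - P_t^{(n)}] v_n \right|^2. \]
Again take N so large that $\sum_{n > N} |v_n|^2 < \varepsilon$. We can then find $\delta$ so small that
\[ \sum_{n=1}^{N} \left| [P_{t+\delta}^{(n)} - P_t^{(n)}] v_n \right|^2 < \varepsilon, \]
thus getting our continuity and proving the theorem.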
XX, §5. THE SPECTRAL INTEGRAL AS

STIELTJES INTEGRAL
In Chapter IX, §7 we defined the Stieltjes integral with respect to an

increasing real valued function. Such a function arises naturally from a
spectral family, as follows. If h is an increasing function, we again let dh
be the associated Stieltjes functional on $C_c(\mathbf{R})$.
Let $\{P_t\}$ be a spectral family, not necessarily associated with an operator.
Let $v \in H$ and let $h = h_{P,v}$ be the function
\[ h(t) = \langle P_t v, v\rangle. \]
Then h is positive, increasing, bounded by 1.
Theorem 5.1. Let A be a self-adjoint operator, and let $\{P_t\}$ be the
associated spectral family. Let $h(t) = \langle P_t v, v\rangle$ as above. Then for any
function $\varphi$ in $C_c(\mathbf{R})$, we have
\[ \int \varphi\, dh = \int \varphi\, d\mu_v, \]
where $\mu_v$ is the spectral measure associated with A and v.
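Proof. We know from §4 that
\[ P_t = \psi_t(A). \]
For a partition T of sufficiently small size, the integral
\[ \int \varphi\, dh \]
is approximated by a sum
\[ \sum \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, v\rangle = \sum \varphi(c_k) \langle (\psi_{t_{k+1}}(A) - \psi_{t_k}(A))v, v\rangle = \int \sum \varphi(c_k)(\psi_{t_{k+1}} - \psi_{t_k})\, d\mu_v. \]
But
\[ \sum \varphi(c_k)(\psi_{t_{k+1}} - \psi_{t_k}) \]
is an ordinary Riemann sum for $\varphi$, uniformly close to $\varphi$ if the partition
has sufficiently small size. By the dominated convergence theorem with
respect to the measure $\mu_v$ this last expression is therefore uniformly close
to
\[ \int \varphi\, d\mu_v, \]
thus proving the theorem.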
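When $\mu_v$ is a finite sum of point masses, h is a step function and the Stieltjes sums of Theorem 5.1 can be computed directly. The sketch below assumes the discrete measure with mass 0.5 at t = 1 and mass 4.5 at t = 3 (the spectral measure of the matrix [[2, 1], [1, 2]] and the vector (1, 2), an example of ours), and uses $\varphi = \cos$ on the hypothetical window [0, 4]; only the values of $\varphi$ near the two atoms matter.

```python
import math

eigenvalues = [1.0, 3.0]
weights = [0.5, 4.5]   # masses of mu_v at the eigenvalues (assumed example)

def h(t):
    """h(t) = <P_t v, v>: the mu_v-mass of the half line (-infinity, t]."""
    return sum(w for lam, w in zip(eigenvalues, weights) if lam <= t)

def stieltjes_sum(phi, a, b, n):
    """Riemann-Stieltjes sum of phi against dh on a uniform partition of [a, b]."""
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        t0 = a + k * step
        t1 = a + (k + 1) * step
        c = 0.5 * (t0 + t1)                # intermediate point c_k
        total += phi(c) * (h(t1) - h(t0))  # phi(c_k) * <(P_{t_{k+1}} - P_{t_k})v, v>
    return total

phi = math.cos
approx = stieltjes_sum(phi, 0.0, 4.0, 4000)
exact = sum(phi(lam) * w for lam, w in zip(eigenvalues, weights))
print(approx, exact)
```

Each atom is picked up by exactly one subinterval, so the sum differs from the integral only through the distance between $c_k$ and the atom.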
XX, §6. EXERCISES
Instead of starting with a self-adjoint operator as in the text, one may start with
a spectral family, develop the functional calculus, and get back (unbounded)
operators as follows.
1. Let $\{P_t\}$ be a spectral family, and let $v \in H$. Let
\[ h(t) = \langle P_t v, v\rangle. \]
Show that there is a positive functional $\lambda_v$, bounded by 1, such that
\[ \lambda_v(\varphi) = \lim \sum \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, v\rangle = \lim_{T,c} S(T, c, \varphi), \]
the limit being taken in the same sense as in the text, for the size of the
partition tending to 0. Deduce the existence and uniqueness of a measure $\mu_v$
such that
\[ \lambda_v(\varphi) = \int \varphi\, d\mu_v. \]
In a similar way, obtain the complex measure $\mu_{v,w}$ such that
\[ \lambda_{v,w}(\varphi) = \lim_{T,c} \sum \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, w\rangle. \]
Show that
2. Conclude that there exists a unique bounded operator f(P) for each $f \in BM(\mathbf{R})$
such that
\[ \int_{\mathbf{R}} f\, d\mu_{v,w} = \langle f(P)v, w\rangle. \]
3. Show that the map $f \mapsto f(P)$ is a linear map from $BM(\mathbf{R})$ into the space of
operators, satisfying the five properties SPEC 1 through SPEC 5, except that
A is replaced by P.
4. Theorem. Let $\{P_t\}$ be a spectral family. Let f be a real Borel measurable
function on $\mathbf{R}$. Then there exists a unique self-adjoint operator f(P) such that:
(i) The domain of f(P) consists of those $v \in H$ such that
\[ \int f^2\, d\mu_v < \infty. \]
(ii) $\langle f(P)v, v\rangle = \int f\, d\mu_v$ for all $v \in$ Domain of f(P).
(iii) $|f(P)v|^2 = \int f^2\, d\mu_v$.
[Hint: Follow step by step the proof given for the analogous theorem in the
text, concerning f(A) when we start with a self-adjoint operator A.]
Right continuity played no role in the above results. It is important only
for uniqueness purposes, as shown in the next result.
5. Theorem. Let $\{P_t\}$, $\{Q_t\}$ be spectral families, which are strongly continuous from
the right, and such that they induce the same functional on $C_c(\mathbf{R})$. Then $P_t = Q_t$
for all t. If b is a real number and $\psi_b$ is the function whose graph is drawn
below, then
\[ P_b = \psi_b(P). \]
[Figure: graph of $\psi_b$, where $\psi_b(t) = 1$ if $t \leq b$ and $\psi_b(t) = 0$ otherwise.]
Proof. From the assumption it follows that if 00, we get 11 = I/Ib(P), thereby proving the theorem.

PART SIX
Global Analysis
One of the most attractive things that can be done with analysis is to
mix it up with the global topology of geometric structures. For instance,
whereas the local existence theorem for differential equations yields inte-
gral curves in an open set of say euclidean space, one may wish to see
what happens if a differential equation is given on the sphere. In this
case, the integral curves wind around the sphere and one investigates their
behavior as time goes to infinity. Similarly, one can work on toruses, or
arbitrarily complicated similar structures, which have one thing in com-
mon: locally, they look like euclidean space, but globally they turn and
twist. The relations between the analytic properties, and the algebraic-
topological invariants associated with the topological structure, constitute
one of the central parts of mathematics. Our task here is but to lay
down the most basic definitions to prepare readers for further readings,
and to give them the flavor of global results, as distinguished from local
ones in open sets of euclidean space.
We should add, however, that even on open sets of euclidean space,
i.e. locally, we may be interested in certain objects and properties which
are invariant under $C^p$ changes of coordinate systems, i.e. under $C^p$ isomorphisms.
The language of manifolds provides the natural language for
such properties. Thus we begin with the change of variables formula,
which gives an example of how the integral changes under $C^1$ isomorphisms.
The change is of such a nature that we can associate with it an
integral on manifolds. This is done in the last chapter, which includes
the basic theorem of Stokes.
We don't do too much with differential equations besides defining the
basic notions on manifolds. Readers can refer to [La 2] for further
foundations. Smale's survey [Sm 2] is an excellent starting point for the
global analysis of ordinary differential equations. As for partial differential
equations, Nirenberg's exposition of certain basic results in [Pr] gives
an exceptionally attractive introduction for this part of global analysis.
In fact, the whole proceedings [Pr] are highly recommended.
CHAPTER XXI
Local Integration of
Differential Forms
Throughout this chapter, $\mu$ is Lebesgue measure on $\mathbf{R}^n$.
If A is a subset of $\mathbf{R}^n$, we write $\mathcal{L}^1(A)$ instead of $\mathcal{L}^1(A, \mu, \mathbf{C})$.
XXI, §1. SETS OF MEASURE 0
We recall that a set has measure 0 in $\mathbf{R}^n$ if and only if, given $\varepsilon$, there
exists a covering of the set by a sequence of rectangles $\{R_j\}$ such that
$\sum \mu(R_j) < \varepsilon$. We denote by $R_j$ the closed rectangles, and we may always
assume that the interiors $R_j^0 = \operatorname{Int}(R_j)$ cover the set, at the cost of increasing
the lengths of the sides of our rectangles very slightly (an $\varepsilon/2^n$
argument). We shall prove here some criteria for a set to have measure
0. We leave it to the reader to verify that instead of rectangles, we could
have used cubes in our characterization of a set of measure 0 (a cube
being a rectangle all of whose sides have the same length).
We recall that a map f satisfies a Lipschitz condition on a set A if
there exists a number C such that
\[ |f(x) - f(y)| \leq C|x - y| \]
for all $x, y \in A$. Any $C^1$ map f satisfies locally at each point a Lipschitz
condition, because its derivative is bounded in a neighborhood of each
point, and we can then use the mean value estimate,
\[ |f(x) - f(y)| \leq |x - y| \sup |f'(z)|, \]
the sup being taken for z on the segment between x and y. We can take
the neighborhood of the point to be a ball, say, so that the segment
between any two points is contained in the neighborhood.
Lemma 1.1. Let A have measure 0 in $\mathbf{R}^n$ and let $f: A \to \mathbf{R}^n$ satisfy a
Lipschitz condition. Then f(A) has measure 0.
Proof. Let C be a Lipschitz constant for f. Let $\{R_j\}$ be a sequence of
cubes covering A such that $\sum \mu(R_j) < \varepsilon$. Let $r_j$ be the length of the side
of $R_j$. Then for each j we see that $f(A \cap R_j)$ is contained in a cube $R_j'$
whose sides have length $\leq 2Cr_j$. Hence
\[ \mu(R_j') \leq 2^n C^n r_j^n = 2^n C^n \mu(R_j). \]
Our lemma follows.
Lemma 1.2. Let U be open in $\mathbf{R}^n$ and let $f: U \to \mathbf{R}^n$ be a $C^1$ map. Let
Z be a set of measure 0 in U. Then f(Z) has measure 0.
Proof. For each $x \in U$ there exists a rectangle $R_x$ contained in U such
that the family $\{R_x^0\}$ of interiors covers Z. Since U is separable, there
exists a denumerable subfamily covering Z, say $\{R_j\}$. It suffices to prove
that $f(Z \cap R_j)$ has measure 0 for each j. But f satisfies a Lipschitz
condition on $R_j$ since $R_j$ is compact and f' is bounded on $R_j$, being
continuous. Our lemma follows from Lemma 1.1.
Lemma 1.3. Let A be a subset of $\mathbf{R}^m$. Assume that m < n. Let
$f: A \to \mathbf{R}^n$ satisfy a Lipschitz condition. Then f(A) has measure 0.
Proof. We view $\mathbf{R}^m$ as embedded in $\mathbf{R}^n$ on the space of the first m coordinates.
Then $\mathbf{R}^m$ has measure 0 in $\mathbf{R}^n$, so that A has also n-dimensional
measure 0. Lemma 1.3 is therefore a consequence of Lemma 1.1.
Note. All three lemmas may be viewed as stating that certain parame-
trized sets have measure O. Lemma 1.3 shows that parametrizing a set by
strictly lower dimensional spaces always yields an image having measure
O. The other two lemmas deal with a map from one space into another
of the same dimension. Observe that Lemma 1.3 would be false if f is
only assumed to be continuous (Peano curves).
XXI, §2. CHANGE OF VARIABLES FORMULA
We first deal with the simplest of cases. We consider vectors $v_1, \ldots, v_n$ in
$\mathbf{R}^n$ and we define the block B spanned by these vectors to be the set of
points
\[ t_1 v_1 + \cdots + t_n v_n \]
with $0 \leq t_i \leq 1$. We say that the block is degenerate (in $\mathbf{R}^n$) if the vectors
$v_1, \ldots, v_n$ are linearly dependent. Otherwise, we say that the block is
non-degenerate, or is a proper block in $\mathbf{R}^n$.
We see that a block in $\mathbf{R}^2$ is nothing but a parallelogram, and a block in
$\mathbf{R}^3$ is nothing but a parallelepiped (when not degenerate).
We shall sometimes use the word volume instead of measure when
applied to blocks or their images under maps, for the sake of geometry.
We denote by $\operatorname{Vol}(v_1, \ldots, v_n)$ the volume of the block B spanned by
$v_1, \ldots, v_n$. We define the oriented volume
\[ \operatorname{Vol}^0(v_1, \ldots, v_n) = \pm \operatorname{Vol}(v_1, \ldots, v_n), \]
taking the + if $\operatorname{Det}(v_1, \ldots, v_n) > 0$ and the - if $\operatorname{Det}(v_1, \ldots, v_n) < 0$. The
determinant is viewed as the determinant of the matrix whose column
vectors are $v_1, \ldots, v_n$, in that order.
We recall the following characterization of determinants. Suppose that
we have a product
\[ (v_1, \ldots, v_n) \mapsto v_1 \wedge \cdots \wedge v_n \]
which to each n-tuple of vectors associates a number such that the product
is multilinear, alternating, and such that
\[ e_1 \wedge \cdots \wedge e_n = 1 \]
if $e_1, \ldots, e_n$ are the unit vectors. Then this product is necessarily the
determinant, i.e. it is uniquely determined. "Alternating" means that if
$v_i = v_j$ for some $i \neq j$ then
\[ v_1 \wedge \cdots \wedge v_n = 0. \]
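The uniqueness is easily proved, and we recall this short proof. We can
write
\[ v_i = a_{i1} e_1 + \cdots + a_{in} e_n \]
for suitable numbers $a_{ij}$, and then
\[ v_1 \wedge \cdots \wedge v_n = (a_{11} e_1 + \cdots + a_{1n} e_n) \wedge \cdots \wedge (a_{n1} e_1 + \cdots + a_{nn} e_n) \]
\[ = \sum_\sigma a_{1,\sigma(1)} e_{\sigma(1)} \wedge \cdots \wedge a_{n,\sigma(n)} e_{\sigma(n)} \]
\[ = \sum_\sigma a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}\, e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(n)}. \]
The sum is taken over all maps $\sigma: \{1, \ldots, n\} \to \{1, \ldots, n\}$, but because of
the alternating property, whenever $\sigma$ is not a permutation the term corresponding
to $\sigma$ is equal to 0. Hence the sum may be taken only over all
permutations. Since
\[ e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(n)} = \epsilon(\sigma)\, e_1 \wedge \cdots \wedge e_n, \]
where $\epsilon(\sigma) = 1$ or -1 is a sign depending only on $\sigma$, it follows that the
alternating product is completely determined by its value $e_1 \wedge \cdots \wedge e_n$,
and in particular is the determinant if this value is equal to 1.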
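The expansion just described can be carried out literally: sum over all maps $\sigma$, discard those that are not permutations, and attach the sign $\epsilon(\sigma)$. A small sketch follows, with a 3 × 3 integer matrix chosen by us as an example and indices running from 0 rather than 1; the sign is computed by counting inversions, which is one standard way among several.

```python
from itertools import product

def sign(sigma):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_by_expansion(a):
    """Sum over all maps sigma of sign(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)];
    maps that are not permutations contribute 0 and are skipped."""
    n = len(a)
    total = 0
    for sigma in product(range(n), repeat=n):   # all maps {0..n-1} -> {0..n-1}
        if len(set(sigma)) < n:
            continue  # a repeated basis vector: the alternating product vanishes
        term = sign(sigma)
        for i in range(n):
            term *= a[i][sigma[i]]
        total += term
    return total

a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
print(det_by_expansion(a))  # 18
```

Of the 27 maps only the 6 permutations survive, which is the content of the alternating property.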
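Theorem 2.1. We have
\[ \operatorname{Vol}^0(v_1, \ldots, v_n) = \operatorname{Det}(v_1, \ldots, v_n) \]
and
\[ \operatorname{Vol}(v_1, \ldots, v_n) = |\operatorname{Det}(v_1, \ldots, v_n)|. \]
Proof. If $v_1, \ldots, v_n$ are linearly dependent, then the determinant is
equal to 0, and the volume is also equal to 0, for instance by Lemma 1.3.
So our formula holds in this case. It is clear that
\[ \operatorname{Vol}^0(e_1, \ldots, e_n) = 1. \]
To show that $\operatorname{Vol}^0$ satisfies the characteristic properties of the determinant,
all we have to do now is to show that it is linear in each variable,
say the first. In other words, we must prove
\[ (*) \quad \operatorname{Vol}^0(cv, v_2, \ldots, v_n) = c \operatorname{Vol}^0(v, v_2, \ldots, v_n) \quad\text{for } c \in \mathbf{R}, \]
\[ (**) \quad \operatorname{Vol}^0(v + w, v_2, \ldots, v_n) = \operatorname{Vol}^0(v, v_2, \ldots, v_n) + \operatorname{Vol}^0(w, v_2, \ldots, v_n). \]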
As to the first assertion, suppose first that c is some positive integer k.
Let B be the block spanned by $v, v_2, \ldots, v_n$. We may assume without
loss of generality that $v, v_2, \ldots, v_n$ are linearly independent (otherwise, the
relation is obviously true, both sides being equal to 0). We verify at once
from the definition that if $B(v, v_2, \ldots, v_n)$ denotes the block spanned by
$v, v_2, \ldots, v_n$ then $B(kv, v_2, \ldots, v_n)$ is the union of the two sets
\[ B((k - 1)v, v_2, \ldots, v_n) \quad\text{and}\quad B(v, v_2, \ldots, v_n) + (k - 1)v, \]
which have only a set of measure 0 in common, as one verifies at once
from the definitions.
Therefore, we find that
\[ \operatorname{Vol}(kv, v_2, \ldots, v_n) = \operatorname{Vol}((k - 1)v, v_2, \ldots, v_n) + \operatorname{Vol}(v, v_2, \ldots, v_n) \]
\[ = (k - 1)\operatorname{Vol}(v, v_2, \ldots, v_n) + \operatorname{Vol}(v, v_2, \ldots, v_n) \]
\[ = k \operatorname{Vol}(v, v_2, \ldots, v_n), \]
as was to be shown.
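Now let
\[ v = v_1/k \]
for a positive integer k. Then applying what we have just proved shows
that
\[ \operatorname{Vol}\left(\tfrac{1}{k} v_1, v_2, \ldots, v_n\right) = \tfrac{1}{k} \operatorname{Vol}(v_1, \ldots, v_n). \]
Writing a positive rational number in the form $m/k = m \cdot 1/k$, we conclude
that the first relation holds when c is a positive rational number. If r is
a positive real number, we find positive rational numbers c, c' such that
$c \leq r \leq c'$. Since
\[ B(cv, v_2, \ldots, v_n) \subset B(rv, v_2, \ldots, v_n) \subset B(c'v, v_2, \ldots, v_n), \]
we conclude that
\[ c \operatorname{Vol}(v, v_2, \ldots, v_n) \leq \operatorname{Vol}(rv, v_2, \ldots, v_n) \leq c' \operatorname{Vol}(v, v_2, \ldots, v_n). \]
Letting c, c' approach r as a limit, we conclude that for any real number
$r \geq 0$ we have
\[ \operatorname{Vol}(rv, v_2, \ldots, v_n) = r \operatorname{Vol}(v, v_2, \ldots, v_n). \]
Finally, we note that $B(-v, v_2, \ldots, v_n)$ is the translation of
\[ B(v, v_2, \ldots, v_n) \]
by -v so that these two blocks have the same volume. This proves the
first assertion.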
As for the second, we look at the geometry of the situation, which is
made clear by the following picture in case $v = v_1$, $w = v_2$.
The block spanned by $v_1, v_2, \ldots$ consists of two "triangles" T, T' having
only a set of measure zero in common. The block spanned by $v_1 + v_2$
and $v_2$ consists of T' and the translation $T + v_2$. It follows that these
two blocks have the same volume. We conclude that for any number c,
\[ \operatorname{Vol}^0(v_1 + cv_2, v_2, \ldots, v_n) = \operatorname{Vol}^0(v_1, v_2, \ldots, v_n). \]
Indeed, if c = 0 this is obvious, and if $c \neq 0$ then
\[ c \operatorname{Vol}^0(v_1 + cv_2, v_2, \ldots, v_n) = \operatorname{Vol}^0(v_1 + cv_2, cv_2, \ldots, v_n) = \operatorname{Vol}^0(v_1, cv_2, \ldots, v_n) = c \operatorname{Vol}^0(v_1, v_2, \ldots, v_n). \]
We can then cancel c to get our conclusion.
To prove the linearity of $\operatorname{Vol}^0$ with respect to its first variable, we may
assume that $v_2, \ldots, v_n$ are linearly independent, otherwise both sides of
(**) are equal to 0. Let $v_1$ be so chosen that $\{v_1, \ldots, v_n\}$ is a basis of $\mathbf{R}^n$.
Then by induction, and what has been proved above,
\[ \operatorname{Vol}^0(c_1 v_1 + \cdots + c_n v_n, v_2, \ldots, v_n) = \operatorname{Vol}^0(c_1 v_1 + \cdots + c_{n-1} v_{n-1}, v_2, \ldots, v_n) \]
\[ = \operatorname{Vol}^0(c_1 v_1, v_2, \ldots, v_n) \]
\[ = c_1 \operatorname{Vol}^0(v_1, \ldots, v_n). \]
From this the linearity follows at once, and the theorem is proved.
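Corollary 2.2. Let S be the unit cube spanned by the unit vectors in $\mathbf{R}^n$.
Let $\lambda: \mathbf{R}^n \to \mathbf{R}^n$ be a linear map. Then
\[ \operatorname{Vol} \lambda(S) = |\operatorname{Det}(\lambda)|. \]
Proof. If $v_1, \ldots, v_n$ are the images of $e_1, \ldots, e_n$ under $\lambda$, then $\lambda(S)$ is the
block spanned by $v_1, \ldots, v_n$. If we represent $\lambda$ by the matrix $A = (a_{ij})$,
then
\[ v_i = a_{1i} e_1 + \cdots + a_{ni} e_n, \]
and hence $\operatorname{Det}(v_1, \ldots, v_n) = \operatorname{Det}(A) = \operatorname{Det}(\lambda)$. This proves the corollary.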
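Corollary 2.3. If $R$ is any rectangle in $\mathbf{R}^n$ and $\lambda: \mathbf{R}^n \to \mathbf{R}^n$ is a linear
map, then
$$\mathrm{Vol}\,\lambda(R) = |\mathrm{Det}(\lambda)|\,\mathrm{Vol}(R).$$
Proof. After a translation, we can assume that the rectangle is a
block. If $R = \lambda_1(S)$ where $S$ is the unit cube, then
$$\lambda(R) = \lambda \circ \lambda_1(S),$$
whence by Corollary 2.2,
$$\mathrm{Vol}\,\lambda(R) = |\mathrm{Det}(\lambda \circ \lambda_1)| = |\mathrm{Det}(\lambda)\,\mathrm{Det}(\lambda_1)| = |\mathrm{Det}(\lambda)|\,\mathrm{Vol}(R).$$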
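As a numerical illustration of Corollaries 2.2 and 2.3 in the plane, the following minimal Python sketch compares the area of the image of a rectangle under a linear map with the absolute value of the determinant times the area of the rectangle. The matrix `A`, the rectangle, and the helper names `det2`, `apply_map`, `shoelace_area` and `image_area` are illustrative choices, not taken from the text; the image area is computed independently by the shoelace formula for polygons.

```python
def det2(a):
    """Determinant of a 2x2 matrix ((a11, a12), (a21, a22))."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def apply_map(a, p):
    """Image of the point p = (x, y) under the linear map with matrix a."""
    return (a[0][0] * p[0] + a[0][1] * p[1],
            a[1][0] * p[0] + a[1][1] * p[1])


def shoelace_area(poly):
    """Area of a simple polygon given by its vertices in cyclic order."""
    s = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def image_area(a, rect):
    """Area of the image of the rectangle [x0, x1] x [y0, y1] under a."""
    x0, y0, x1, y1 = rect
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return shoelace_area([apply_map(a, p) for p in corners])


A = ((2.0, 1.0), (0.5, 3.0))      # hypothetical linear map
unit_square = (0.0, 0.0, 1.0, 1.0)
rect = (1.0, -2.0, 4.0, 0.5)      # sides 3 and 2.5, so Vol(R) = 7.5

print(image_area(A, unit_square), abs(det2(A)))
print(image_area(A, rect), abs(det2(A)) * 7.5)
```

The two numbers printed on each line should agree up to rounding, since the image of a rectangle under a linear map is a parallelogram.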
The next theorem extends Corollary 2.3 to the more general case
where the linear map $\lambda$ is replaced by an arbitrary $C^1$-invertible map.
The proof then consists of replacing the $C^1$ map by its derivative and
estimating the error thus introduced. For this purpose, we define the
Jacobian determinant
$$\Delta_f(x) = \mathrm{Det}\, J_f(x) = \mathrm{Det}\, f'(x),$$
where $J_f(x)$ is the Jacobian matrix, and $f'(x)$ is the derivative of the map
$f: U \to \mathbf{R}^n$.
Theorem 2.4. Let $R$ be a rectangle in $\mathbf{R}^n$, contained in some open set $U$.
Let $f: U \to \mathbf{R}^n$ be a $C^1$ map, which is $C^1$-invertible on $U$. Then
$$\mathrm{Vol}\, f(R) = \int_R |\Delta_f|\, d\mu.$$
Proof. When $f$ is linear, this is nothing but Corollary 2.3 of the
preceding theorem. We shall prove the general case by approximating $f$
by its derivative. Let us first assume that $R$ is a cube for simplicity.
Given $\epsilon$, let $P$ be a partition of $R$, obtained by dividing each side of $R$
into $N$ equal segments for large $N$. Then $R$ is partitioned into $N^n$
subcubes which we denote by $S_j$ $(j = 1, \ldots, N^n)$. We let $a_j$ be the center
of $S_j$.
We have
$$\mathrm{Vol}\, f(R) = \sum_j \mathrm{Vol}\, f(S_j)$$
because the images $f(S_j)$ have only sets of measure 0 in common. We
investigate $f(S_j)$ for each $j$. The derivative $f'$ is uniformly continuous on
$R$. Given $\epsilon$, we assume that $N$ has been taken so large that for $x \in S_j$ we
have
$$f(x) = f(a_j) + \lambda_j(x - a_j) + \varphi(x - a_j),$$
where $\lambda_j = f'(a_j)$ and $|\varphi(x - a_j)| \le |x - a_j|\epsilon$.
To determine $\mathrm{Vol}\, f(S_j)$ we must therefore investigate $f(S)$ where $S$ is a
cube centered at the origin, and $f$ has the form
$$f(x) = \lambda x + \varphi(x), \qquad |\varphi(x)| \le |x|\epsilon$$
on the cube $S$. (We have made suitable translations which don't affect
volumes.) We have
$$\lambda^{-1} \circ f(x) = x + \lambda^{-1} \circ \varphi(x),$$
so that $\lambda^{-1} \circ f$ is nearly the identity map. For some constant $C$, we have
for $x \in S$:
$$|\lambda^{-1} \circ \varphi(x)| \le C\epsilon.$$
From the lemma after the proof of the inverse mapping theorem, we
conclude that $\lambda^{-1} \circ f(S)$ contains a cube of radius
$$(1 - C\epsilon)(\text{radius } S)$$
and trivial estimates show that $\lambda^{-1} \circ f(S)$ is contained in a cube of radius
$$(1 + C\epsilon)(\text{radius } S).$$
We apply $\lambda$ to these cubes, and determine their volumes. Putting indices
$j$ on everything, we find that
$$|\mathrm{Det}\, f'(a_j)|\,\mathrm{Vol}(S_j) - \epsilon C_1 \mathrm{Vol}(S_j) \le \mathrm{Vol}\, f(S_j) \le |\mathrm{Det}\, f'(a_j)|\,\mathrm{Vol}(S_j) + \epsilon C_1 \mathrm{Vol}(S_j)$$
with some fixed constant $C_1$. Summing over $j$ and estimating $|\Delta_f|$, we
see that our theorem follows at once, in case $R$ is a cube.
Remark. We assumed for simplicity that R was a cube. Actually, by

changing the norm on each side, multiplying by a suitable constant, and
taking the sup of the adjusted norms, we see that this involves no loss of
generality. Alternatively, we can approximate a given rectangle by cubes.
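To see Theorem 2.4 at work on a nonlinear map, the sketch below takes the polar coordinate map $f(r, \theta) = (r\cos\theta, r\sin\theta)$ on the rectangle $[1, 2] \times [0, \pi/2]$, whose image is a quarter annulus of area $3\pi/4$. The map, the rectangle, the midpoint rule and the function names are assumptions made for illustration only; the code approximates the integral of $|\Delta_f|$ over the rectangle and compares it with the known area.

```python
import math


def polar_jacobian_det(r, theta):
    """Determinant of the Jacobian matrix of f(r, theta) = (r cos theta, r sin theta)."""
    a11, a12 = math.cos(theta), -r * math.sin(theta)
    a21, a22 = math.sin(theta), r * math.cos(theta)
    return a11 * a22 - a12 * a21


def integral_abs_jacobian(r0, r1, t0, t1, n=200):
    """Midpoint-rule approximation of the integral of |Det f'| over [r0, r1] x [t0, t1]."""
    hr = (r1 - r0) / n
    ht = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        r = r0 + (i + 0.5) * hr
        for j in range(n):
            t = t0 + (j + 0.5) * ht
            total += abs(polar_jacobian_det(r, t))
    return total * hr * ht


approx = integral_abs_jacobian(1.0, 2.0, 0.0, math.pi / 2)
quarter_annulus_area = math.pi * (2.0 ** 2 - 1.0 ** 2) / 4
print(approx, quarter_annulus_area)
```

Here the Jacobian determinant equals $r$, which is linear in $r$, so the midpoint rule is exact up to rounding.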
Corollary 2.5. If $g$ is a continuous function on $f(R)$, then
$$\int_{f(R)} g\, d\mu = \int_R (g \circ f)|\Delta_f|\, d\mu.$$
Proof. The functions $g$ and $(g \circ f)|\Delta_f|$ are uniformly continuous on
$f(R)$ and $R$ respectively. Let us take a partition of $R$ and let $\{S_j\}$ be the
subrectangles of this partition. If $\delta$ is the maximum length of the sides of
the subrectangles of the partition, then $f(S_j)$ is contained in a rectangle
whose sides have length $\le C\delta$ for some constant $C$. We have
$$\int_{f(R)} g\, d\mu = \sum_j \int_{f(S_j)} g\, d\mu.$$
We may assume $g$ real. The sup and inf of $g$ on $f(S_j)$ differ only by $\epsilon$ if $\delta$
is taken sufficiently small. Using the theorem, applied to each $S_j$, and
replacing $g$ by its minimum $m_j$ and maximum $M_j$ on $S_j$, we see that the
corollary follows at once.
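Theorem 2.6 (Change of Variables Formula). Let $U$ be open in $\mathbf{R}^n$
and let $f: U \to \mathbf{R}^n$ be a $C^1$ map, which is $C^1$ invertible on $U$. Let
$g \in \mathcal{L}^1(f(U))$. Then $(g \circ f)|\Delta_f|$ is in $\mathcal{L}^1(U)$ and we have
$$\int_{f(U)} g\, d\mu = \int_U (g \circ f)|\Delta_f|\, d\mu.$$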
Proof. Let $R$ be a closed rectangle contained in $U$. We shall first
prove that the restriction of $(g \circ f)|\Delta_f|$ to $R$ is in $\mathcal{L}^1(R)$, and that the
formula holds when $U$ is replaced by $R$. We know that $C_c(f(U))$ is
$L^1$-dense in $\mathcal{L}^1(f(U))$ by Theorem 3.1 of Chapter IX. Hence there exists
a sequence $\{g_k\}$ in $C_c(f(U))$ which is $L^1$-convergent to $g$. Using Theorem
5.2 of Chapter VI, we may assume that $\{g_k\}$ converges pointwise to $g$
except on a set $Z$ of measure 0 in $f(U)$. By Lemma 1.2, we know that
$f^{-1}(Z)$ has measure 0.
Let $g_k^* = (g_k \circ f)|\Delta_f|$. Each function $g_k^*$ is continuous on $R$. The
sequence $\{g_k^*\}$ converges almost everywhere to $(g \circ f)|\Delta_f|$ restricted to $R$.
It is in fact an $L^1$-Cauchy sequence in $\mathcal{L}^1(R)$. To see this, we have by
the result for rectangles and continuous functions (corollary of the
preceding theorem):
$$\int_R |g_k^* - g_m^*|\, d\mu = \int_{f(R)} |g_k - g_m|\, d\mu,$$
so the Cauchy nature of the sequence $\{g_k^*\}$ is clear from that of $\{g_k\}$. It
follows that the restriction of $(g \circ f)|\Delta_f|$ to $R$ is the $L^1$-limit of $\{g_k^*\}$, and
is in $\mathcal{L}^1(R)$. It also follows that the formula of the theorem holds for $R$,
that is
$$\int_{f(A)} g\, d\mu = \int_A (g \circ f)|\Delta_f|\, d\mu$$
when $A = R$.
The theorem is now seen to hold for any measurable subset $A$ of $R$,
since $f(A)$ is measurable, and since a function $g$ in $\mathcal{L}^1(f(A))$ can be
extended to a function in $\mathcal{L}^1(f(R))$ by giving it the value 0 outside $f(A)$.
From this it follows that the theorem holds if $A$ is a finite union of
rectangles contained in $U$. We can find a sequence of rectangles $\{R_m\}$
contained in $U$ whose union is equal to $U$, because $U$ is separable.
Taking the usual stepwise complementation, we can find a disjoint
sequence of measurable sets
$$A_m = R_m - (R_1 \cup \cdots \cup R_{m-1})$$
whose union is $U$, and such that our theorem holds if $A = A_m$. Let
$$h_m = g\,\chi_{f(A_m)}$$
and
$$h_m^* = (h_m \circ f)|\Delta_f|.$$
Then $\sum h_m$ converges to $g$ and $\sum h_m^*$ converges to $(g \circ f)|\Delta_f|$. Our
theorem follows from Corollary 5.13 of the dominated convergence theorem,
Chapter VI.
Note. In dealing with polar coordinates or the like, one sometimes

meets a map f which is invertible except on a set of measure 0. It is
now trivial to recover a result covering this type of situation.
Corollary 2.7. Let $U$ be open in $\mathbf{R}^n$ and let $f: U \to \mathbf{R}^n$ be a $C^1$ map.
Let $A$ be a measurable subset of $U$ such that the boundary of $A$ has
measure 0, and such that $f$ is $C^1$ invertible on the interior of $A$. Let $g$
be in $\mathcal{L}^1(f(A))$. Then $(g \circ f)|\Delta_f|$ is in $\mathcal{L}^1(A)$ and
$$\int_{f(A)} g\, d\mu = \int_A (g \circ f)|\Delta_f|\, d\mu.$$
Proof. Let $U_0$ be the interior of $A$. The sets $f(A)$ and $f(U_0)$ differ
only by a set of measure 0, namely $f(\partial A)$. Also the sets $A$, $U_0$ differ only
by a set of measure 0. Consequently we can replace the domains of
integration $f(A)$ and $A$ by $f(U_0)$ and $U_0$, respectively. The theorem
applies to conclude the proof of the corollary.
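The polar coordinate situation mentioned in the Note can be checked numerically. The sketch below is only an illustration under assumptions of our own choosing: the integrand $g(x, y) = e^{-(x^2 + y^2)}$, the unit disc as $f(A)$, the rectangle $A = [0, 1] \times [0, 2\pi]$ in the $(r, \theta)$-plane, and a midpoint rule with hypothetical helper `midpoint_2d`. Both sides of the formula of Corollary 2.7 are approximated and compared with the exact value $\pi(1 - e^{-1})$.

```python
import math


def midpoint_2d(F, a, b, c, d, n):
    """Midpoint-rule approximation of the integral of F over [a, b] x [c, d]."""
    hx = (b - a) / n
    hy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += F(x, y)
    return total * hx * hy


def g(x, y):
    return math.exp(-(x * x + y * y))


# Left-hand side: integral of g over the unit disc f(A), in Cartesian coordinates.
lhs = midpoint_2d(lambda x, y: g(x, y) if x * x + y * y <= 1.0 else 0.0,
                  -1.0, 1.0, -1.0, 1.0, 400)

# Right-hand side: integral of (g o f)|Det f'| over A, where |Det f'(r, theta)| = r.
rhs = midpoint_2d(lambda r, t: g(r * math.cos(t), r * math.sin(t)) * r,
                  0.0, 1.0, 0.0, 2 * math.pi, 200)

exact = math.pi * (1.0 - math.exp(-1.0))
print(lhs, rhs, exact)
```

The Cartesian sum is less accurate than the polar one because the grid cells cut across the boundary circle, so only rough agreement should be expected for it.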
Note. Since step maps are dense in $\mathcal{L}^1(X, E)$ for a Banach space $E$,
the preceding proof generalizes at once to the case of Banach valued
maps.
The change of variables formula depends on a $C^1$ isomorphism
$f: U \to V$ between open sets of n-space. It suggests that one should
define some object which changes by multiplication by the Jacobian (or
its absolute value) under such an isomorphism, and this is what we shall
do in the next section, by defining differential forms. After that, we
introduce a language, that of manifolds, which allows us to speak invariantly
about these objects.
XXI, §3. DIFFERENTIAL FORMS
We recall first two simple results from linear (or rather multilinear)
algebra. We use the notation $E^{(r)} = E \times E \times \cdots \times E$, $r$ times.
Theorem A. Let $E$ be a finite dimensional vector space over the reals of
dimension $n$. For each positive integer $r$ with $1 \le r \le n$ there exists a
vector space $\bigwedge^r E$ and a multilinear alternating map
$$E^{(r)} \to \bigwedge^r E$$
denoted by $(u_1, \ldots, u_r) \mapsto u_1 \wedge \cdots \wedge u_r$, having the following property. If
$\{v_1, \ldots, v_n\}$ is a basis of $E$, then the elements
$$\{v_{i_1} \wedge \cdots \wedge v_{i_r}\}, \qquad i_1 < i_2 < \cdots < i_r,$$
form a basis of $\bigwedge^r E$.
We recall that alternating means that $u_1 \wedge \cdots \wedge u_r = 0$ if $u_i = u_j$ for
some $i \ne j$. We call $\bigwedge^r E$ the $r$-th alternating product (or exterior
product) of $E$. If $r = 0$, we define $\bigwedge^0 E = \mathbf{R}$. Elements of $\bigwedge^r E$ which can be
written in the form $u_1 \wedge \cdots \wedge u_r$ are called decomposable. Such elements
generate $\bigwedge^r E$. If $r > \dim E$, we define $\bigwedge^r E = \{0\}$.
Theorem B. For each pair of positive integers $(r, s)$, there exists a
unique product (bilinear map)
$$\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E$$
such that if $u_1, \ldots, u_r, w_1, \ldots, w_s \in E$ then
$$(u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.$$
This product is associative.
The proofs for these two statements will be briefly summarized in the
appendix to this chapter.
Let $E^*$ be the dual space, $E^* = L(E, \mathbf{R})$. (We prefer here to use $E^*$
rather than $E'$, first because we shall use the prime for the derivative, and
second because we want a certain notational consistency as in §4.) If
$E = \mathbf{R}^n$ and $\lambda_1, \ldots, \lambda_n$ are the coordinate functions, then each $\lambda_i$ is an
element of the dual space, and in fact $\{\lambda_1, \ldots, \lambda_n\}$ is a basis of this dual
space.
Let $U$ be an open set in $\mathbf{R}^n$. By a differential form of degree $r$ on $U$
(or an $r$-form) we mean a map
$$\omega: U \to \bigwedge^r E^*$$
from $U$ into the $r$-th alternating product of $E^*$. We say that the form is
of class $C^p$ if the map is of class $C^p$. (We view $\bigwedge^r E^*$ as a normed vector
space, using any norm. It does not matter which, since all norms on a
finite dimensional vector space are equivalent.)
Since $\{\lambda_1, \ldots, \lambda_n\}$ is a basis of $E^*$, we can express each differential
form in terms of its coordinate functions with respect to the basis
$$\{\lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}\} \qquad (i_1 < \cdots < i_r),$$
namely for each $x \in U$ we have
$$\omega(x) = \sum_{(i)} f_{i_1 \cdots i_r}(x)\, \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}$$
where $f_{(i)} = f_{i_1 \cdots i_r}$ is a function on $U$. Each such function has the same
order of differentiability as $\omega$. We call the preceding expression the
standard form of $\omega$. We say that a form is decomposable if it can be
written as just one term $f(x)\lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}$. Every differential form is a
sum of decomposable ones.
We agree to the convention that functions are differential forms of
degree 0.
It is clear that the differential forms of given degree $r$ form a vector
space, denoted by $\Omega^r(U)$.
Let $E = \mathbf{R}^n$. Let $f$ be a function on $U$. For each $x \in U$ the derivative
$$f'(x): \mathbf{R}^n \to \mathbf{R}$$
is a linear map, and thus an element of the dual space. Thus
$$f': U \to E^*$$
is a differential form of degree 1, which is usually denoted by $df$. If $f$ is
of class $C^p$, then $df$ is of class $C^{p-1}$.
Let $\lambda_i$ be the $i$-th coordinate function. Then we know that
$$d\lambda_i(x) = \lambda_i'(x) = \lambda_i$$
for each $x \in U$ because $\lambda'(x) = \lambda$ for any continuous linear map $\lambda$.
Whenever $\{x_1, \ldots, x_n\}$ are used systematically for the coordinates of a
point in $\mathbf{R}^n$, it is customary in the literature to use the notation
$$d\lambda_i(x) = dx_i.$$
This is slightly incorrect, but is useful in formal computations. We
shall also use it in this book on occasions. Similarly, we also write
(incorrectly)
$$\omega = \sum_{(i)} f_{(i)}\, dx_{i_1} \wedge \cdots \wedge dx_{i_r}$$
instead of the correct
$$\omega(x) = \sum_{(i)} f_{(i)}(x)\, \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}.$$
In terms of coordinates, the map $df$ (or $f'$) is given by
$$df(x) = f'(x) = D_1 f(x)\lambda_1 + \cdots + D_n f(x)\lambda_n$$
where $D_i f(x) = \partial f/\partial x_i$ is the $i$-th partial derivative. This is simply a
restatement of the fact that if $h = (h_1, \ldots, h_n)$ is a vector, then
$$f'(x)h = \frac{\partial f}{\partial x_1} h_1 + \cdots + \frac{\partial f}{\partial x_n} h_n.$$
Thus in old notation, we have
$$df(x) = \frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n.$$
Let $\omega$ and $\psi$ be forms of degrees $r$ and $s$ respectively, on the open set
$U$. For each $x \in U$ we can then take the alternating product $\omega(x) \wedge \psi(x)$
and we define the alternating product $\omega \wedge \psi$ by
$$(\omega \wedge \psi)(x) = \omega(x) \wedge \psi(x).$$
If $f$ is a differential form of degree 0, that is a function, then we define
$$f \wedge \omega = f\omega$$
where $(f\omega)(x) = f(x)\omega(x)$. By definition, we then have
$$\omega \wedge f\psi = f\omega \wedge \psi.$$
We shall now define the exterior derivative $d\omega$ for any differential form
$\omega$. We have already done it for functions. We shall do it in general first
in terms of coordinates, and then show that there is a characterization
independent of these coordinates. If
$$\omega = \sum_{(i)} f_{(i)}\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}$$
we define
$$d\omega = \sum_{(i)} df_{(i)} \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}.$$
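Example. Suppose $n = 2$ and $\omega$ is a 1-form, given in terms of the two
coordinates $(x, y)$ by
$$\omega(x, y) = f(x, y)\, dx + g(x, y)\, dy.$$
Then
$$\begin{aligned}
d\omega(x, y) &= df(x, y) \wedge dx + dg(x, y) \wedge dy \\
&= \left(\frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy\right) \wedge dx + \left(\frac{\partial g}{\partial x}\, dx + \frac{\partial g}{\partial y}\, dy\right) \wedge dy \\
&= \frac{\partial f}{\partial y}\, dy \wedge dx + \frac{\partial g}{\partial x}\, dx \wedge dy \\
&= \left(\frac{\partial f}{\partial y} - \frac{\partial g}{\partial x}\right) dy \wedge dx
\end{aligned}$$
because the terms involving $dx \wedge dx$ and $dy \wedge dy$ are equal to 0.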
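For a concrete instance of this example, one may take $f(x, y) = -y$ and $g(x, y) = x$ (an illustrative choice of ours, not from the text). The computation then reads:

```latex
\begin{aligned}
\omega  &= -y\,dx + x\,dy, \\
d\omega &= \left(\frac{\partial f}{\partial y} - \frac{\partial g}{\partial x}\right) dy \wedge dx
         = (-1 - 1)\, dy \wedge dx
         = 2\, dx \wedge dy .
\end{aligned}
```

The last step uses only the alternating rule $dy \wedge dx = -dx \wedge dy$.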
Theorem 3.1. The map $d$ is linear, and satisfies
$$d(\omega \wedge \psi) = d\omega \wedge \psi + (-1)^r \omega \wedge d\psi$$
if $r = \deg \omega$. The map $d$ is uniquely determined by these properties, and
by the fact that for a function $f$, we have $df = f'$.
Proof. The linearity of $d$ is obvious. Hence it suffices to prove the
formula for decomposable forms. We note that for any function $f$ we
have
$$d(f\omega) = df \wedge \omega + f\, d\omega.$$
Indeed, if $\omega$ is a function $g$, then from the derivative of a product we get
$d(fg) = f\, dg + g\, df$. If
$$\omega = g\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r},$$
where $g$ is a function, then
$$\begin{aligned}
d(f\omega) = d(fg\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}) &= d(fg) \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \\
&= (f\, dg + g\, df) \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \\
&= f\, d\omega + df \wedge \omega,
\end{aligned}$$
as desired. Now suppose that
$$\omega = f\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} = f\tilde{\omega} \quad\text{and}\quad \psi = g\, d\lambda_{j_1} \wedge \cdots \wedge d\lambda_{j_s} = g\tilde{\psi}$$
with $i_1 < \cdots < i_r$ and $j_1 < \cdots < j_s$ as usual. If some $i_\nu = j_\mu$, then from
the definitions we see that the expressions on both sides of the equality
in the theorem are equal to 0. Hence we may assume that the sets
of indices $i_1, \ldots, i_r$ and $j_1, \ldots, j_s$ have no element in common. Then
$d(\tilde{\omega} \wedge \tilde{\psi}) = 0$ by definition, and
$$\begin{aligned}
d(\omega \wedge \psi) = d(fg\, \tilde{\omega} \wedge \tilde{\psi}) &= d(fg) \wedge \tilde{\omega} \wedge \tilde{\psi} \\
&= (g\, df + f\, dg) \wedge \tilde{\omega} \wedge \tilde{\psi} \\
&= d\omega \wedge \psi + f\, dg \wedge \tilde{\omega} \wedge \tilde{\psi} \\
&= d\omega \wedge \psi + (-1)^r f\, \tilde{\omega} \wedge dg \wedge \tilde{\psi} \\
&= d\omega \wedge \psi + (-1)^r \omega \wedge d\psi,
\end{aligned}$$
thus proving the desired formula, in the present case. (We used the fact
that $dg \wedge \tilde{\omega} = (-1)^r \tilde{\omega} \wedge dg$, whose proof is left to the reader.) The
formula in the general case follows because any differential form can be
expressed as a sum of forms of the type just considered, and one can
then use the bilinearity of the product. Finally, $d$ is uniquely determined
by the formula, and its effect on functions, because any differential form
is a sum of forms of type $f\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}$ and the formula gives an
expression of $d$ in terms of its effect on forms of lower degree. By
induction, if the value of $d$ on functions is known, its value can then be
determined on forms of degree $\ge 1$. This proves the theorem.
Corollary 3.2. Let $\omega$ be a form of class $C^2$. Then $dd\omega = 0$.
Proof. If $f$ is a function, then
$$df(x) = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\, dx_j$$
and
$$ddf(x) = \sum_{j=1}^n \sum_{k=1}^n \frac{\partial^2 f}{\partial x_k\, \partial x_j}\, dx_k \wedge dx_j.$$
Using the fact that the partials commute, and the fact that for any two
positive integers $r, s$ we have $dx_r \wedge dx_s = -dx_s \wedge dx_r$, we see that the
preceding double sum is equal to 0. A similar argument shows that the
theorem is true for 1-forms of type $g(x)\, dx_i$ where $g$ is a function, and
thus for all 1-forms by linearity. We proceed by induction. It suffices to
prove the formula in general for decomposable forms. Let $\omega$ be
decomposable of degree $r$, and write
$$\omega = \eta \wedge \psi,$$
where $\deg \psi = 1$. Using the formula of Theorem 3.1 twice, and the fact
that $dd\psi = 0$ and $dd\eta = 0$ by induction, we see at once that $dd\omega = 0$, as
was to be shown.
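The coordinate definition of $d$ and the relation $dd\omega = 0$ can be checked mechanically on forms with polynomial coefficients. The following is a minimal sketch under conventions of our own: variables are indexed from 0, a polynomial is a dictionary from exponent tuples to coefficients, and a form is a dictionary from increasing index tuples to polynomials. The names `partial`, `add_into` and `d` are hypothetical helpers, not notation from the text.

```python
def partial(poly, j):
    """Partial derivative in variable j of a polynomial {exponents: coefficient}."""
    out = {}
    for exps, c in poly.items():
        if exps[j] > 0:
            new = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[new] = out.get(new, 0) + c * exps[j]
    return out


def add_into(target, poly, sign):
    """Add sign * poly to target in place, dropping zero coefficients."""
    for exps, c in poly.items():
        v = target.get(exps, 0) + sign * c
        if v == 0:
            target.pop(exps, None)
        else:
            target[exps] = v


def d(form, n):
    """Exterior derivative: d(f dx_I) = sum over j of D_j f dx_j ^ dx_I."""
    out = {}
    for idx, poly in form.items():
        for j in range(n):
            if j in idx:
                continue  # dx_j ^ dx_j = 0
            dj = partial(poly, j)
            if not dj:
                continue
            # Sorting dx_j into place costs one sign change per smaller index.
            swaps = sum(1 for i in idx if i < j)
            sign = -1 if swaps % 2 else 1
            new_idx = tuple(sorted(idx + (j,)))
            add_into(out.setdefault(new_idx, {}), dj, sign)
    return {k: v for k, v in out.items() if v}


# The 1-form -y dx + x dy in two variables (x = variable 0, y = variable 1).
omega = {(0,): {(0, 1): -1}, (1,): {(1, 0): 1}}
print(d(omega, 2))            # coefficient polynomial of dx ^ dy

# A function of three variables: x0^2 x1 + x1 x2^3, as a 0-form.
f = {(): {(2, 1, 0): 1, (0, 1, 3): 1}}
print(d(d(f, 3), 3))          # the zero form
```

The cancellation in `d(d(f, 3), 3)` is exactly the commutation of partials combined with the alternating rule, as in the proof above.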
XXI, §4. INVERSE IMAGE OF A FORM
We start with some algebra once more. Let $E$, $F$ be finite dimensional
vector spaces over $\mathbf{R}$ and let $\lambda: E \to F$ be a linear map. If $\mu: F \to \mathbf{R}$ is an
element of $F^*$, then we may form the composite linear map
$$\mu \circ \lambda: E \to \mathbf{R}$$
which we visualize as
$$E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } \mathbf{R}.$$
We denote this composite $\mu \circ \lambda$ by $\lambda^*(\mu)$. It is an element of $E^*$. We
have a similar definition on the higher alternating products, and in the
appendix, we shall prove:
Theorem C. Let $\lambda: E \to F$ be a linear map. For each $r$ there exists a
unique linear map
$$\lambda^*: \bigwedge^r F^* \to \bigwedge^r E^*$$
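having the following properties:
(i) $\lambda^*(\omega \wedge \psi) = \lambda^*(\omega) \wedge \lambda^*(\psi)$ for $\omega \in \bigwedge^r F^*$, $\psi \in \bigwedge^s F^*$.
(ii) If $\mu \in F^*$ then $\lambda^*(\mu) = \mu \circ \lambda$, and $\lambda^*$ is the identity on $\bigwedge^0 F^* = \mathbf{R}$.
Remark. If $\mu_{j_1}, \ldots, \mu_{j_r}$ are in $F^*$, then from the two properties of
Theorem C, we conclude that
$$\lambda^*(\mu_{j_1} \wedge \cdots \wedge \mu_{j_r}) = (\mu_{j_1} \circ \lambda) \wedge \cdots \wedge (\mu_{j_r} \circ \lambda).$$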
Now we can apply this to differential forms. Let $U$ be open in $E = \mathbf{R}^n$
and let $V$ be open in $F = \mathbf{R}^m$. Let $f: U \to V$ be a $C^p$ map, $p \ge 1$. For
each $x \in U$ we obtain the linear map
$$f'(x): E \to F$$
to which we can apply the preceding discussion. Consequently, we can
reformulate Theorem C for differential forms as follows:
Theorem 4.1. Let $f: U \to V$ be a $C^p$ map, $p \ge 1$. Then for each $r$ there
exists a unique linear map
$$f^*: \Omega^r(V) \to \Omega^r(U)$$
having the following properties:
(i) For any differential forms $\omega$, $\psi$ on $V$ we have
$$f^*(\omega \wedge \psi) = f^*(\omega) \wedge f^*(\psi).$$
(ii) If $g$ is a function on $V$ then $f^*(g) = g \circ f$, and if $\omega$ is a 1-form then
$$(f^*\omega)(x) = \omega(f(x)) \circ df(x).$$
We apply Theorem C to get Theorem 4.1 simply by letting $\lambda = f'(x)$
at a given point $x$, and we define
$$(f^*\omega)(x) = f'(x)^*\omega(f(x)).$$
Then Theorem 4.1 is nothing but Theorem C applied at each point $x$.
Example 1. Let $y_1, \ldots, y_m$ be the coordinates on $V$, and let $\mu_j$ be the
$j$-th coordinate function, $j = 1, \ldots, m$, so that $y_j = \mu_j(y_1, \ldots, y_m)$. Let
$$f: U \to V$$
be the map with coordinate functions
$$y_j = f_j(x) = \mu_j \circ f(x).$$
If
$$\omega(y) = g(y)\, dy_{j_1} \wedge \cdots \wedge dy_{j_s}$$
is a differential form on $V$, then
$$f^*\omega = (g \circ f)\, df_{j_1} \wedge \cdots \wedge df_{j_s}.$$
Indeed, we have for $x \in U$:
$$(f^*\omega)(x) = g(f(x))\,(\mu_{j_1} \circ f'(x)) \wedge \cdots \wedge (\mu_{j_s} \circ f'(x))$$
and
$$f_j'(x) = (\mu_j \circ f)'(x) = \mu_j \circ f'(x) = df_j(x).$$
Example 2. Let $f: [a, b] \to \mathbf{R}^2$ be a map from an interval into the
plane, and let $x$, $y$ be the coordinates of the plane. Let $t$ be the
coordinate in $[a, b]$. A differential form in the plane can be written in the form
$$\omega(x, y) = g(x, y)\, dx + h(x, y)\, dy$$
where $g$, $h$ are functions. Then by definition,
$$f^*\omega(t) = g(x(t), y(t))\, \frac{dx}{dt}\, dt + h(x(t), y(t))\, \frac{dy}{dt}\, dt$$
if we write $f(t) = (x(t), y(t))$. Let $G = (g, h)$ be the vector field whose
components are $g$ and $h$. Then we can write
$$f^*\omega(t) = G(f(t)) \cdot f'(t)\, dt,$$
which is essentially the expression which is integrated when defining the
integral of a vector field along a curve.
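The formula of Example 2 can be tried out numerically. In the sketch below the form $\omega = -y\,dx + x\,dy$ and the curve $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$ are our own illustrative choices, and the function names are hypothetical. The coefficient of $dt$ in $f^*\omega$ is $G(f(t)) \cdot f'(t)$, and its integral over the interval is approximated by a midpoint rule; for this choice the exact value is $2\pi$.

```python
import math


def pullback_coefficient(g, h, x, y, dx, dy, t):
    """Coefficient of dt in f*w, for w = g dx + h dy and f(t) = (x(t), y(t))."""
    return g(x(t), y(t)) * dx(t) + h(x(t), y(t)) * dy(t)


def integrate(F, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F over [a, b]."""
    step = (b - a) / n
    return sum(F(a + (k + 0.5) * step) for k in range(n)) * step


# w = -y dx + x dy, pulled back along the unit circle.
g = lambda x, y: -y
h = lambda x, y: x
x, dx = math.cos, lambda t: -math.sin(t)
y, dy = math.sin, math.cos

coefficient = lambda t: pullback_coefficient(g, h, x, y, dx, dy, t)
line_integral = integrate(coefficient, 0.0, 2 * math.pi)
print(line_integral, 2 * math.pi)
```

Here the pulled-back coefficient is $\sin^2 t + \cos^2 t = 1$, so the integral is just the length of the parameter interval.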
Example 3. Let $U$, $V$ be both open sets in n-space, and let $f: U \to V$
be a $C^p$ map. If
$$\omega(y) = g(y)\, dy_1 \wedge \cdots \wedge dy_n,$$
where $y_j = f_j(x)$ is the $j$-th coordinate of $y$, then
$$dy_j = D_1 f_j(x)\, dx_1 + \cdots + D_n f_j(x)\, dx_n = \frac{\partial y_j}{\partial x_1}\, dx_1 + \cdots + \frac{\partial y_j}{\partial x_n}\, dx_n$$
and consequently, expanding out the alternating product according to the
usual multilinear and alternating rules, we find that
$$f^*\omega(x) = g(f(x))\, \Delta_f(x)\, dx_1 \wedge \cdots \wedge dx_n.$$
As in §2, $\Delta_f$ is the determinant of the Jacobian matrix of $f$.
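As a worked instance of Example 3, consider polar coordinates in the plane, with $f(r, \theta) = (r\cos\theta, r\sin\theta)$ and $\omega = g(x, y)\, dx \wedge dy$. The choice of map is ours, made for illustration; the expansion uses only the multilinear and alternating rules.

```latex
\begin{aligned}
dx &= \cos\theta\, dr - r\sin\theta\, d\theta, \qquad
dy  = \sin\theta\, dr + r\cos\theta\, d\theta, \\
dx \wedge dy
   &= r\cos^2\theta\, dr \wedge d\theta - r\sin^2\theta\, d\theta \wedge dr
    = r\, dr \wedge d\theta, \\
f^*\omega(r, \theta)
   &= g(r\cos\theta, r\sin\theta)\, r\, dr \wedge d\theta .
\end{aligned}
```

The factor $r$ is the Jacobian determinant $\Delta_f(r, \theta)$, in agreement with the general formula.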
Theorem 4.2. Let $f: U \to V$ and $g: V \to W$ be $C^p$ maps of open sets. If
$\omega$ is a differential form on $W$, then
$$(g \circ f)^*(\omega) = f^*(g^*(\omega)).$$
Proof. This is an immediate consequence of the definitions.
Theorem 4.3. Let $f: U \to V$ be a $C^2$ map and let $\omega$ be a differential
form of class $C^1$ on $V$. Then
$$f^*(d\omega) = d f^*\omega.$$
In particular, if $g$ is a function on $V$, then
$$f^*(dg) = d(g \circ f).$$
Proof. We first prove this last relation. From the definitions, we have
$dg(y) = g'(y)$, whence by the chain rule,
$$(f^*(dg))(x) = g'(f(x)) \circ f'(x) = (g \circ f)'(x)$$
and this last term is nothing else but $d(g \circ f)(x)$, whence the last relation
follows. For a form of degree 1, say
$$\omega(y) = g(y)\, dy_1,$$
with $y_1 = f_1(x)$, we find
$$(f^* d\omega)(x) = (g'(f(x)) \circ f'(x)) \wedge df_1(x).$$
Using the fact that $ddf_1 = 0$, together with Theorem 3.1, we get
$$(d f^*\omega)(x) = (d(g \circ f))(x) \wedge df_1(x),$$
which is equal to the preceding expression. Any 1-form can be expressed
as a linear combination of forms $g_i\, dy_i$, so that our assertion is proved
for forms of degree 1.
The general formula can now be proved by induction. Using the
linearity of $f^*$, we may assume that $\omega$ is expressed as $\omega = \psi \wedge \eta$ where
$\psi$, $\eta$ have lower degree. We apply Theorem 3.1, and (i) of Theorem 4.1
to
$$f^* d\omega = f^*(d\psi \wedge \eta) + (-1)^r f^*(\psi \wedge d\eta)$$
and we see at once that this is equal to $d f^*\omega$, because by induction,
$f^* d\psi = d f^*\psi$ and $f^* d\eta = d f^*\eta$. This proves the theorem.
Let $U$ be open in n-space, and let $\omega$ be a continuous differential form
on $U$ of degree $n$. We can associate a positive measure with $\omega$ as
follows. Let us write
$$\omega(x) = h(x)\, dx_1 \wedge \cdots \wedge dx_n.$$
If $g \in C_c(U)$, we define
$$\langle g, |\omega| \rangle = \int_U g(x)|h(x)|\, dx_1 \cdots dx_n = \int_U g|h|\, d\mu.$$
Then $g \mapsto \langle g, |\omega| \rangle$ is a positive functional on $C_c(U)$. We know that there
exists a unique regular measure associated with this functional, and we
shall call this measure the measure on $U$ associated with $|\omega|$. We may
denote it by $\mu_{|\omega|}$. It is characterized by the relation
$$\int_U g\, d\mu_{|\omega|} = \langle g, |\omega| \rangle = \int_U g|h|\, d\mu \qquad \text{for all } g \in C_c(U).$$
We shall analyze this measure more closely in Chapter XXIII.
XXI, §5. APPENDIX
We shall give brief reviews of the proofs of the algebraic theorems which
have been quoted in this chapter.
We first discuss "formal linear combinations". Let $S$ be a set. We
wish to define what we mean by expressions
$$c_1 s_1 + \cdots + c_n s_n$$
where $\{c_i\}$ are numbers, and $\{s_i\}$ are distinct elements of $S$. What do we
wish such a "sum" to be like? Well, we wish it to be entirely determined
by the "coefficients" $c_i$, and each "coefficient" $c_i$ should be associated
with the element $s_i$ of the set $S$. But an association is nothing but a
function. This suggests to us how to define "sums" as above.
For each $s \in S$ and each number $c$ we define the symbol
$$cs$$
to be the function which associates $c$ to $s$ and 0 to $z$ for any element
$z \in S$, $z \ne s$. If $b$, $c$ are numbers, then clearly
$$b(cs) = (bc)s \quad\text{and}\quad (b + c)s = bs + cs.$$
We let $T$ be the set of all functions defined on $S$ which can be written in
the form
$$c_1 s_1 + \cdots + c_n s_n$$
where $c_i$ are numbers, and $s_i$ are distinct elements of $S$. Note that
we have no problem now about addition, since we know how to add
functions.
We contend that if $s_1, \ldots, s_n$ are distinct elements of $S$, then
$$1s_1, \ldots, 1s_n$$
are linearly independent. To prove this, suppose $c_1, \ldots, c_n$ are numbers
such that
$$c_1 s_1 + \cdots + c_n s_n = 0 \quad \text{(the zero function)}.$$
Then by definition, the left-hand side takes on the value $c_i$ at $s_i$ and
hence $c_i = 0$. This proves the desired linear independence.
In practice, it is convenient to abbreviate the notation, and to write
simply $s_i$ instead of $1s_i$. The elements of $T$, which are called formal linear
combinations of elements of $S$, can be expressed in the form
$$c_1 s_1 + \cdots + c_n s_n$$
and any given element has a unique such expression, because of the linear
independence of $s_1, \ldots, s_n$. This justifies our terminology.
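The construction of formal linear combinations as functions on $S$ translates directly into a data structure. The following sketch is one possible encoding, chosen by us for illustration: a function on $S$ with finitely many nonzero values is stored as a dictionary from elements of $S$ to their nonzero coefficients, and `term`, `add` and `scale` are hypothetical names.

```python
def term(c, s):
    """The function 'cs': value c at s and 0 elsewhere (zeros are not stored)."""
    return {s: c} if c != 0 else {}


def add(f, g):
    """Pointwise sum of two functions on S."""
    out = dict(f)
    for s, c in g.items():
        v = out.get(s, 0) + c
        if v == 0:
            out.pop(s, None)
        else:
            out[s] = v
    return out


def scale(b, f):
    """Pointwise multiple b * f."""
    return {s: b * c for s, c in f.items() if b * c != 0}


# b(cs) = (bc)s and (b + c)s = bs + cs, for a sample element 'p' of S.
print(scale(2, term(3, "p")) == term(6, "p"))
print(add(term(2, "p"), term(5, "p")) == term(7, "p"))

# The combination c1 p + c2 q takes the value c1 at p and c2 at q,
# so it is the zero function only when c1 = c2 = 0.
combo = add(term(4, "p"), term(-1, "q"))
print(combo.get("p", 0), combo.get("q", 0))
```

Since coefficients are read off as values of the function, uniqueness of the expression of an element of $T$ is built into the representation.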
We now come to the statements concerning multilinear alternating
products. Let $E$, $F$ be vector spaces over $\mathbf{R}$. As before, let
$$E^{(r)} = E \times \cdots \times E,$$
taken $r$ times. Let
$$f: E^{(r)} \to F$$
be an $r$-multilinear alternating map. Let $v_1, \ldots, v_n$ be linearly
independent elements of $E$. Let $A = (a_{ij})$ be an $r \times n$ matrix and let
$$u_i = a_{i1}v_1 + \cdots + a_{in}v_n \qquad (i = 1, \ldots, r).$$
Then
$$\begin{aligned}
f(u_1, \ldots, u_r) &= f(a_{11}v_1 + \cdots + a_{1n}v_n, \ldots, a_{r1}v_1 + \cdots + a_{rn}v_n) \\
&= \sum_\sigma f(a_{1,\sigma(1)}v_{\sigma(1)}, \ldots, a_{r,\sigma(r)}v_{\sigma(r)}) \\
&= \sum_\sigma a_{1,\sigma(1)} \cdots a_{r,\sigma(r)}\, f(v_{\sigma(1)}, \ldots, v_{\sigma(r)})
\end{aligned}$$
where the sum is taken over all maps $\sigma: \{1, \ldots, r\} \to \{1, \ldots, n\}$. In this
sum, all terms will be 0 whenever $\sigma$ is not an injective mapping, that is
whenever there is some pair $i, j$ with $i \ne j$ such that $\sigma(i) = \sigma(j)$, because
of the alternating property of $f$. From now on, we consider only
injective maps $\sigma$. Then $(\sigma(1), \ldots, \sigma(r))$ is simply a permutation of some $r$-tuple
$(i_1, \ldots, i_r)$ with $i_1 < \cdots < i_r$.
We wish to rewrite this sum in terms of a determinant.
For each subset $S$ of $\{1, \ldots, n\}$ consisting of precisely $r$ elements, we
can take the $r \times r$ submatrix of $A$ consisting of those elements $a_{ij}$ such
that $j \in S$. We denote by
$$\mathrm{Det}_S(A)$$
the determinant of this submatrix. We also call it the subdeterminant of
$A$ corresponding to the set $S$. We denote by $P(S)$ the set of maps
$$\sigma: \{1, \ldots, r\} \to \{1, \ldots, n\}$$
whose image is precisely the set $S$. Then
$$\mathrm{Det}_S(A) = \sum_{\sigma \in P(S)} \epsilon_S(\sigma)\, a_{1,\sigma(1)} \cdots a_{r,\sigma(r)},$$
where $\epsilon_S(\sigma)$ is the sign of $\sigma$, depending only on $\sigma$. In terms of this
notation, we can write our expression for $f(u_1, \ldots, u_r)$ in the form
$$(1) \qquad f(u_1, \ldots, u_r) = \sum_S \mathrm{Det}_S(A)\, f(v_S)$$
where $v_S$ denotes $(v_{i_1}, \ldots, v_{i_r})$ if $i_1 < \cdots < i_r$ are the elements of the set $S$.
The sum over $S$ is taken over all subsets of $\{1, \ldots, n\}$ having precisely
$r$ elements.
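The subdeterminants $\mathrm{Det}_S(A)$ appearing in formula (1) are easy to compute for small matrices. The sketch below, with columns indexed from 0 and with hypothetical names `sign`, `det` and `subdeterminants`, lists them for an $r \times n$ matrix and checks the alternating behaviour on a sample matrix of our own choosing. The determinant is expanded over permutations, which is adequate only for small $r$.

```python
from itertools import combinations, permutations


def sign(perm):
    """Sign of a permutation of 0..r-1, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(m):
    """Determinant of a square matrix by expansion over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        prod = sign(p)
        for i in range(n):
            prod *= m[i][p[i]]
        total += prod
    return total


def subdeterminants(A):
    """Map each r-element set S of column indices (as a sorted tuple) to Det_S(A)."""
    r, n = len(A), len(A[0])
    return {S: det([[row[j] for j in S] for row in A])
            for S in combinations(range(n), r)}


A = [[1, 2, 0],
     [0, 1, 3]]                      # rows are the coordinates of u_1, u_2
print(subdeterminants(A))

swapped = subdeterminants([A[1], A[0]])   # interchanging u_1 and u_2 changes every sign
repeated = subdeterminants([A[0], A[0]])  # u_1 = u_2 gives 0 for every S
print(swapped, repeated)
```

By formula (1), these numbers are exactly the coefficients of $f(v_S)$ in $f(u_1, u_2)$ for any bilinear alternating $f$.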
Theorem A. Let $E$ be a vector space over $\mathbf{R}$, of dimension $n$. Let $r$ be
an integer, $1 \le r \le n$. There exists a finite dimensional space $\bigwedge^r E$ and
an $r$-multilinear alternating map $E^{(r)} \to \bigwedge^r E$ denoted by
$$(u_1, \ldots, u_r) \mapsto u_1 \wedge \cdots \wedge u_r$$
satisfying the following properties:
AP 1. If $F$ is a vector space over $\mathbf{R}$ and $g: E^{(r)} \to F$ is an $r$-multilinear
alternating map, then there exists a unique linear map
$$g_*: \bigwedge^r E \to F$$
such that for all $u_1, \ldots, u_r \in E$ we have
$$g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r).$$
AP 2. If $\{v_1, \ldots, v_n\}$ is a basis of $E$, then the set of elements
$$v_{i_1} \wedge \cdots \wedge v_{i_r}, \qquad 1 \le i_1 < \cdots < i_r \le n,$$
is a basis of $\bigwedge^r E$.
Proof. For each subset $S$ of $\{1, \ldots, n\}$ consisting of precisely $r$
elements, we select a letter $t_S$. As explained at the beginning of the section,
these letters $t_S$ form a basis of a vector space whose dimension is equal
to the binomial coefficient $\binom{n}{r}$. It is the space of formal linear
combinations of these letters. Instead of $t_S$, we could also write $t_{(i)} = t_{i_1 \cdots i_r}$ with
$i_1 < \cdots < i_r$. Let $\{v_1, \ldots, v_n\}$ be a basis of $E$ and let $u_1, \ldots, u_r$ be elements
of $E$. Let $A = (a_{ij})$ be the matrix of numbers such that
$$u_i = a_{i1}v_1 + \cdots + a_{in}v_n \qquad (i = 1, \ldots, r).$$
Define
$$u_1 \wedge \cdots \wedge u_r = \sum_S \mathrm{Det}_S(A)\, t_S.$$
We contend that this product has the required properties.
The fact that it is multilinear and alternating simply follows from the
corresponding property of the determinant.
We note that if $S = \{i_1, \ldots, i_r\}$ with $i_1 < \cdots < i_r$, then
$$t_S = v_{i_1} \wedge \cdots \wedge v_{i_r}.$$
A standard theorem on linear maps asserts that there always exists a
unique linear map having prescribed values on basis elements. In
particular, if $g: E^{(r)} \to F$ is a multilinear alternating map, then there exists a
unique linear map
$$g_*: \bigwedge^r E \to F$$
such that for each set $S$, we have
$$g_*(t_S) = g(v_S) = g(v_{i_1}, \ldots, v_{i_r})$$
if $i_1, \ldots, i_r$ are as above. By formula (1), it follows that
$$g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r)$$
for all elements $u_1, \ldots, u_r$ of $E$. This proves AP 1.
As for AP 2, let $\{w_1, \ldots, w_n\}$ be a basis of $E$. From the expansion of
(1), it follows that the elements $\{w_S\}$, i.e. the elements $\{w_{i_1} \wedge \cdots \wedge w_{i_r}\}$
with all possible choices of $r$-tuples $(i_1, \ldots, i_r)$ satisfying $i_1 < \cdots < i_r$, are
generators of $\bigwedge^r E$. The number of such elements is precisely $\binom{n}{r}$.
Hence they must be linearly independent, and form a basis of $\bigwedge^r E$, as
was to be shown.
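Theorem B. For each pair of positive integers $(r, s)$ there exists a
unique bilinear map
$$\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E$$
such that if $u_1, \ldots, u_r, w_1, \ldots, w_s \in E$ then
$$(u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.$$
This product is associative.
Proof. For each $r$-tuple $(u_1, \ldots, u_r)$ consider the map of $E^{(s)}$ into
$\bigwedge^{r+s} E$ given by
$$(w_1, \ldots, w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.$$
This map is obviously $s$-multilinear and alternating. Consequently, by
AP 1 of Theorem A, there exists a unique linear map
$$g_{(u)} = g_{u_1, \ldots, u_r}: \bigwedge^s E \to \bigwedge^{r+s} E$$
such that for any elements $w_1, \ldots, w_s \in E$ we have
$$g_{(u)}(w_1 \wedge \cdots \wedge w_s) = u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.$$
Now the association $(u) \mapsto g_{(u)}$ is clearly an $r$-multilinear alternating map
of $E^{(r)}$ into $L(\bigwedge^s E, \bigwedge^{r+s} E)$, and again by AP 1 of Theorem A, there
exists a unique linear map
$$g_*: \bigwedge^r E \to L(\bigwedge^s E, \bigwedge^{r+s} E)$$
such that for all elements $u_1, \ldots, u_r \in E$ we have
$$g_{u_1, \ldots, u_r} = g_*(u_1 \wedge \cdots \wedge u_r).$$
To obtain the desired product $\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E$, we simply take the
association
$$(\omega, \psi) \mapsto g_*(\omega)(\psi).$$
It is bilinear, and is uniquely determined since elements of the form
$u_1 \wedge \cdots \wedge u_r$ generate $\bigwedge^r E$, and elements of the form $w_1 \wedge \cdots \wedge w_s$
generate $\bigwedge^s E$. This product is associative, as one sees at once on
decomposable elements, and then on all elements by linearity. This proves
Theorem B.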
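Let $E$, $F$ be vector spaces, finite dimensional over $\mathbf{R}$, and let $\lambda: E \to F$
be a linear map. If $\mu: F \to \mathbf{R}$ is an element of the dual space $F^*$, i.e. a
linear map of $F$ into $\mathbf{R}$, then we may form the composite linear map
$$\mu \circ \lambda: E \to \mathbf{R}$$
which we visualize as
$$E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } \mathbf{R}.$$
We denote this composite $\mu \circ \lambda$ by $\lambda^*(\mu)$. It is an element of $E^*$.
Theorem C. Let $\lambda: E \to F$ be a linear map. For each $r$ there exists a
unique linear map
$$\lambda^*: \bigwedge^r F^* \to \bigwedge^r E^*$$
having the following properties:
(i) $\lambda^*(\omega \wedge \psi) = \lambda^*(\omega) \wedge \lambda^*(\psi)$, for $\omega \in \bigwedge^r F^*$, $\psi \in \bigwedge^s F^*$.
(ii) If $\mu \in F^*$ then $\lambda^*(\mu) = \mu \circ \lambda$, and $\lambda^*$ is the identity on $\bigwedge^0 F^* = \mathbf{R}$.
Proof. The composition of mappings
$$F^* \times \cdots \times F^* = F^{*(r)} \to E^* \times \cdots \times E^* = E^{*(r)} \to \bigwedge^r E^*$$
given by
$$(\mu_1, \ldots, \mu_r) \mapsto (\mu_1 \circ \lambda, \ldots, \mu_r \circ \lambda) \mapsto (\mu_1 \circ \lambda) \wedge \cdots \wedge (\mu_r \circ \lambda)$$
is obviously multilinear and alternating. Hence there exists a unique
linear map $\bigwedge^r F^* \to \bigwedge^r E^*$ such that
$$\mu_1 \wedge \cdots \wedge \mu_r \mapsto (\mu_1 \circ \lambda) \wedge \cdots \wedge (\mu_r \circ \lambda).$$
Property (i) now follows by linearity and the fact that decomposable
elements $\mu_1 \wedge \cdots \wedge \mu_r$ generate $\bigwedge^r F^*$. Property (ii) comes from the
definition. This proves Theorem C.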
CHAPTER XXII
Manifolds
XXII, §1. ATLASES, CHARTS, MORPHISMS
Let $X$ be a set. An atlas of class $C^p$ $(p \ge 0)$ on $X$ is a family of pairs
$\{(U_i, \varphi_i)\}$ $(i \in I)$ satisfying the following conditions:
AT 1. Each $U_i$ is a subset of $X$ and the $U_i$ cover $X$.
AT 2. Each $\varphi_i$ is a bijection of $U_i$ onto an open subset of a euclidean
space $E$, and for every pair $i, j$ of indices, the set $\varphi_i(U_i \cap U_j)$ is
open in $E$.
AT 3. The map
$$\varphi_j \circ \varphi_i^{-1}: \varphi_i(U_i \cap U_j) \to \varphi_j(U_i \cap U_j)$$
is a $C^p$-isomorphism for each pair of indices $i, j$.
The space $E$ is assumed to be the same for all $i$. If its dimension is $n$,
we say that the atlas is $n$-dimensional. All that is done in this chapter
would go over to the Banach case, but the principal applications we have
in mind in the next chapter are strictly finite dimensional, and so for a
first introduction to manifolds here we make the finite dimensionality
assumption at once. Readers are referred to [La 2] for the general
development, in a systematic way. They will note that there is essentially
no change from the partial development given here.
Each pair $(U_i, \varphi_i)$ will be called a chart of the atlas. We see that the
inverse map
$$\varphi_i^{-1}: \varphi_i(U_i) \to U_i$$
may be interpreted as a parametrization of a portion of $X$ by an open
set in euclidean space. Thus in particular, $X$ is a set which can be
covered by subsets, each of which is so parametrized. The extra
condition AT 3 is one which will allow us to speak of differentiability relative
to $X$ itself.
Since $\varphi: U \to \mathbf{R}^n$ is a map into n-space, we can represent $\varphi$ by
coordinate functions, and we can write for $x \in U$,
$$\varphi(x) = (x_1, \ldots, x_n).$$
We call $(x_1, \ldots, x_n)$ the local coordinates of $x$ in the chart $(U, \varphi)$. The
notation here is already somewhat concise, but useful. If readers feel the
need for it, they may extend this notation as follows. Denote a point
of $X$ by $P$. Then in a chart $\varphi: U \to \mathbf{R}^n$ at $P$, we have coordinates
$(x_1(P), \ldots, x_n(P))$ for the point $\varphi(P)$, $P \in U$, and we abbreviate this $n$-tuple
by $x(P)$. In most cases, it is a useful abbreviation to do away with the
extra letter.
Let $U$ be a subset of $X$ and let $\varphi: U \to \varphi U$ be a bijection of $U$ onto
an open subset of $E$. We say that the pair $(U, \varphi)$ is compatible with the
atlas $\{(U_i, \varphi_i)\}$ if each map $\varphi_i \varphi^{-1}$ (defined on a suitable intersection as in
AT 3) is a $C^p$-isomorphism. Two atlases are said to be compatible if each
chart of one is compatible with the other atlas. The relation of
compatibility between atlases is immediately verified to be an equivalence
relation. An equivalence class of $C^p$-atlases on $X$ is said to define a structure
of $C^p$-manifold on $X$. The number $n$ being fixed, we say that $X$ is then
an $n$-dimensional manifold.
If $(U, \varphi)$ and $(V, \psi)$ are two charts of a manifold, then we shall call the
map $\varphi \circ \psi^{-1}$ (defined on $\psi(U \cap V)$, whenever $U \cap V$ is not empty) a
transition map.
So far we have not assumed that $X$ has a topology. In many cases, a
topology is first given, and then to make the atlases topologically
compatible with this topology, one can require the additional condition that
the maps $\varphi$ of the charts be homeomorphisms. However, it is also useful
not to do this and deal with the more general situation when $X$ is
merely a set, and we shall in fact have an important application later
when we deal with the tangent bundle.
We shall now see how to define a topology on $X$ by means of the
atlases. Let $\{(U_i, \varphi_i)\}$ be an atlas. A subset $U$ of $X$ is defined to be open
if and only if the intersection $U \cap U_i$ with each chart domain $U_i$ of the atlas is
such that $\varphi_i(U \cap U_i)$ is open, in $E$ of course. It is a trivial exercise to
verify that this defines a topology. Furthermore, if $\{(V_j, \psi_j)\}$ is an
equivalent atlas, then the two topologies coincide. We leave the formal
verification to the reader. We note merely that the basic reason is that if a
point $x$ lies in charts $U_i$ and $V_j$, then there is a subset $W$ containing $x$
such that $\varphi_i W$ and $\psi_j W$ are open. Since a topology is really determined
locally (i.e. an open set is a union of open neighborhoods of its points)
one sees at once that a set is open relative to one atlas if and only if it is
open relative to the other.
Let $X$ be a manifold, and $U$ an open subset of $X$. Then it is possible,
in the obvious way, to induce a manifold structure on $U$, by taking as
charts the restrictions to the intersections
$$(U_i \cap U,\ \varphi_i | (U_i \cap U)).$$
Example 1. Any open set in euclidean space is a manifold, the charts
being the obvious ones: $C^p$-isomorphisms of open subsets onto other
open sets in euclidean space.
Example 2. We speak of $\mathbf{R}/\mathbf{Z}$ as the circle group. Then
$\mathbf{R}/\mathbf{Z}$ is a compact manifold, for which we can find an atlas
consisting of two charts. The open interval $(0, 1)$ maps bijectively onto
an open subset of $\mathbf{R}/\mathbf{Z}$ (by assigning to each real number
its equivalence class modulo $\mathbf{Z}$), and the open interval
$(-\tfrac12, \tfrac12)$ also maps bijectively onto an open subset of
$\mathbf{R}/\mathbf{Z}$. Readers will verify at once that these two maps are
the charts of a $C^\infty$ atlas.
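To make the verification concrete, here is a sketch of the one transition map between the two charts. The names $\varphi_1$, $\varphi_2$, $U_1$, $U_2$ are our own notation, not the text's: $\varphi_1$ and $\varphi_2$ denote the inverses of the two bijections from $(0,1)$ and $(-\tfrac12,\tfrac12)$ onto their images $U_1$ and $U_2$ in $\mathbf{R}/\mathbf{Z}$.

```latex
% U_1 is R/Z minus the class of 0; U_2 is R/Z minus the class of 1/2.
\[
  \varphi_1(U_1 \cap U_2) = (0, \tfrac12) \cup (\tfrac12, 1),
  \qquad
  \varphi_2 \circ \varphi_1^{-1}(t) =
  \begin{cases}
    t,     & 0 < t < \tfrac12, \\
    t - 1, & \tfrac12 < t < 1.
  \end{cases}
\]
```

On each of the two intervals the transition map is a translation, hence of class $C^\infty$, and the same holds for its inverse.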
Example 3. Instead of $\mathbf{R}/\mathbf{Z}$ we can take
$\mathbf{R}^n/\mathbf{Z}^n$, the $n$-dimensional torus, and define charts
similarly.
Example 4. Let $S^n$ be the $n$-sphere in $\mathbf{R}^{n+1}$, i.e. the set of
all points $(x_1, \dots, x_{n+1})$ such that
\[ x_1^2 + \cdots + x_{n+1}^2 = 1. \]
Then $S^n$ is a manifold, if we define charts as follows. Let
\[ f(x) = x_1^2 + \cdots + x_{n+1}^2. \]
The sphere is the set of points $x$ such that $f(x) = 1$. For any point
$a \in S^n$, $a = (a_1, \dots, a_{n+1})$, some coordinate is not equal to 0,
say $a_1$. Then
\[ D_1 f(a) \neq 0, \]
and we can apply the implicit function theorem, so that there is a
$C^\infty$ map $\varphi_1$ defined on an open neighborhood $U$ of
$(a_2, \dots, a_{n+1})$ such that
\[ f(\varphi_1(x_2, \dots, x_{n+1}), x_2, \dots, x_{n+1}) = 1 \]
and $\varphi_1(a_2, \dots, a_{n+1}) = a_1$. Furthermore, if we take $U$ small
enough, then $\varphi_1$ is uniquely determined. Let
\[ \varphi(x_2, \dots, x_{n+1}) = (\varphi_1(x_2, \dots, x_{n+1}), x_2, \dots, x_{n+1}). \]
It is an exercise to verify that the collection of all similar pairs
$(\varphi U, \varphi^{-1})$ is a $C^\infty$ atlas for $S^n$. Actually, we shall
obtain some theorems below which will prove this, and give general
criteria showing that certain subsets of euclidean space are manifolds.
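As an illustration only, the implicit function can be written down explicitly in this example. Assume $a_1 > 0$ (an assumption made just to fix the sign; for $a_1 < 0$ one takes the negative root). Then on the open unit ball of $\mathbf{R}^n$ one may solve for the first coordinate:

```latex
\[
  \varphi_1(x_2, \dots, x_{n+1}) = \sqrt{1 - x_2^2 - \cdots - x_{n+1}^2},
  \qquad x_2^2 + \cdots + x_{n+1}^2 < 1,
\]
\[
  \bigl(\varphi_1(x_2, \dots, x_{n+1})\bigr)^2 + x_2^2 + \cdots + x_{n+1}^2 = 1 .
\]
```

The square root is of class $C^\infty$ wherever its argument is positive, which is the smoothness that the implicit function theorem guarantees abstractly.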
In our definition of a manifold, it was convenient to take the charts as
maps from the set X into the vector space. In our examples, we actually
defined their inverses. We may visualize a manifold as a set which is
parametrized locally by open subsets of some euclidean space. The pa-
rametrizing maps are the inverse maps of the charts. The whole point
of condition AT 3 is to ensure that the parametrizations are compatible
with a certain order of differentiability.
Example 5. Let $X = \mathbf{R}$ and let $\varphi: X \to \mathbf{R}$ be the map
$\varphi(x) = x^3$. Then $(X, \varphi)$ is a chart defining an atlas. We
therefore get a differentiable structure on $\mathbf{R}$, but the identity
map is not $C^1$ compatible with this atlas, because the map
$x \mapsto x^{1/3}$ is not differentiable at 0.
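A short computation, given as a sketch, shows exactly where compatibility fails. Write $\mathrm{id}$ for the standard chart on $\mathbf{R}$ (our notation). The two transition maps between $\mathrm{id}$ and $\varphi$ are:

```latex
\[
  \varphi \circ \mathrm{id}^{-1}(x) = x^3,
  \qquad
  \mathrm{id} \circ \varphi^{-1}(y) = y^{1/3},
\]
\[
  \frac{d}{dy}\, y^{1/3} = \tfrac13\, y^{-2/3} \quad (y \neq 0),
  \qquad
  \lim_{y \to 0} \frac{y^{1/3} - 0}{y} = \lim_{y \to 0} y^{-2/3} = +\infty .
\]
```

The first map is $C^\infty$, but the second has no derivative at 0, so the two atlases are not $C^1$ compatible. Note that $\varphi$ itself is nevertheless an isomorphism from $\mathbf{R}$ with the new structure onto $\mathbf{R}$ with the standard one, since in these charts it is represented by the identity.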
Let $X$, $Y$ be manifolds. Then the product $X \times Y$ is a manifold in an
obvious way. If $\{(U_i, \varphi_i)\}$ and $\{(V_j, \psi_j)\}$ are atlases for
$X$, $Y$, respectively, then
\[ \{(U_i \times V_j,\ \varphi_i \times \psi_j)\} \]
is an atlas for the product, and the product of compatible atlases gives
rise to compatible atlases, so that we do get a well-defined product
manifold.
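The compatibility of the product charts reduces to that of the factors. As a sketch, take charts $(U_i, \varphi_i)$, $(U_k, \varphi_k)$ of $X$ and $(V_j, \psi_j)$, $(V_l, \psi_l)$ of $Y$ (the indices $k$, $l$ are our notation). The transition map of the product charts is then:

```latex
\[
  (\varphi_i \times \psi_j) \circ (\varphi_k \times \psi_l)^{-1}
  = (\varphi_i \circ \varphi_k^{-1}) \times (\psi_j \circ \psi_l^{-1})
\]
\[
  \text{on } \ \varphi_k(U_i \cap U_k) \times \psi_l(V_j \cap V_l).
\]
```

Each factor on the right is of class $C^p$, hence so is the product map.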
We know what it means for a map from an open set in euclidean space
into another euclidean space to be differentiable, or of class $C^p$.
Since our definition of a manifold is based locally on open sets in
euclidean space, we can now define the notion of a $C^p$ map from one
manifold into another. Let $X$, $Y$ be $C^p$-manifolds and $f: X \to Y$ a
map. We say that $f$ is a $C^p$ map if given $x \in X$ there exists a chart
$(U, \varphi)$ at $x$ and a chart $(V, \psi)$ at $f(x)$ such that
$f(U) \subset V$ and such that the map
\[ f_{U,V} = \psi \circ f \circ \varphi^{-1}: \varphi U \to \psi V \]
is a $C^p$ map. If this holds, then this same condition holds for any choice
of charts $(U, \varphi)$ at $x$ and $(V, \psi)$ at $f(x)$ such that
$f(U) \subset V$.
It is clear that the composite of two $C^p$ maps is itself a $C^p$ map
(because it is true for open subsets of euclidean space).
It should be noted that $C^p$ manifolds and maps are useful with $p$ finite
because Banach space techniques can be applied to sets of mappings.
Indeed, the $C^p$-bounded maps of one open set of euclidean space into
another form a Banach space. Manifold theory goes through if instead
of euclidean space we take a Banach space in the definition of a
manifold, and one can then give a manifold structure to the set of $C^p$
maps of one manifold into another. We don't go into this aspect of
manifold theory in this book, but readers should keep this possibility in
mind for future applications.
We shall deal with a fixed $p$ throughout a discussion. Thus it is
convenient to call a $C^p$ map $f: X \to Y$ by a neutral name, and we call
such maps morphisms. (If the order of differentiability needs to be
specified, we can always add the $C^p$ prefix.) By a $C^p$-isomorphism, or
simply isomorphism $f: X \to Y$ we mean a morphism for which there exists
an inverse morphism, i.e. a morphism $g: Y \to X$ such that $g \circ f$ and
$f \circ g$ are the identity mappings of $X$ and $Y$, respectively. This is
the same terminology which we used with respect to open sets in euclidean
spaces. Similarly, we have the notion of a local $C^p$-isomorphism at a
point $x \in X$, meaning that $f$ induces an isomorphism of an open
neighborhood of $x$ in $X$ onto an open neighborhood of $f(x)$ in $Y$.
XXII, §2. SUBMANIFOLDS
A manifold may arise like the torus, not embedded in any particular
euclidean space, or it may be given as a subset of some euclidean space
like the sphere. We now study this second possibility.
Let $X$ be a topological space and $Y$ a subspace. We say that $Y$ is
locally closed in $X$ if every point $y \in Y$ has an open neighborhood $U$
in $X$ such that $Y \cap U$ is closed in $U$. We leave it to the reader to
verify that a locally closed subset of $X$ is the intersection of an open
set and a closed set in $X$. For instance any open subset of $X$ is locally
closed, and any open interval is locally closed in the plane.
Let $X$ be a manifold and $Y$ a subset. We shall say that $Y$ is a
submanifold if, roughly speaking, at each point $y \in Y$ there exists a
chart such that in this chart, the points of $Y$ correspond to a factor in
a product space. We now make this condition precise as follows. For each
$y \in Y$ there exists a chart $(V, \varphi)$ at $y$ such that $\varphi$ gives
an isomorphism
\[ \varphi: V \to V_1 \times V_2 \]
where $V_1$ is open in some space $E_1$, and $V_2$ is open in some space
$E_2$, and such that
\[ \varphi(Y \cap V) = V_1 \times \{a_2\} \]
for some point $a_2 \in V_2$. If we make a translation on $V_2$, it is clear
that we can always adjust $V_2$ such that $a_2 = 0$.
If we let $E = E_1 \times E_2$, then the coordinates split up naturally,
namely we can write $\mathbf{R}^n = \mathbf{R}^m \times \mathbf{R}^q$, and
\[ (x_1, \dots, x_n) = (x_1, \dots, x_m, x_{m+1}, \dots, x_{m+q}). \]
If $a_2 = 0$ in our preceding definition, then we see that the points of
$Y$ in the given chart $(V, \varphi)$ are precisely the points having
coordinates
\[ (x_1, \dots, x_m, 0, \dots, 0). \]
All of this explains what we said about $Y$ being locally at each point a
factor in a product.
We observe that if $Y$ is a submanifold of $X$, then $Y$ is locally closed
in $X$. We must also justify our terminology by showing that $Y$ is a
manifold in its own right. Indeed, if $(V, \varphi)$ is a chart at $y$ as in
our definition, then $\varphi$ induces a bijection
\[ \varphi_1: Y \cap V \to V_1. \]
The collection of pairs $\{(Y \cap V, \varphi_1)\}$ obtained in the above
manner constitutes an atlas for $Y$, of class $C^p$.
The proof of this statement is essentially a triviality, and consists
merely in keeping the definitions straight. We give it in full. Let
\[ \psi: W \to W_1 \times W_2 \]
be another chart at $y$ such that
\[ \psi(Y \cap W) = W_1 \times \{b_2\} \]
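for some point $b_2 \in W_2$. Then we get the bijection
$\psi_1: Y \cap W \to W_1$. Furthermore,
$\varphi(Y \cap V \cap W)$ is open in $V_1 \times \{a_2\}$ and thus equal to
$V_1' \times \{a_2\}$ for some open $V_1'$ in $V_1$. Similarly,
$\psi(Y \cap V \cap W)$ is open in $W_1 \times \{b_2\}$ and thus equal to
$W_1' \times \{b_2\}$ for some open $W_1'$ in $W_1$. We have isomorphisms
\[ \varphi': V \cap W \to V' \quad \text{and} \quad \psi': V \cap W \to W' \]
induced by $\varphi$ and $\psi$, where $V' = \varphi(V \cap W)$ and
$W' = \psi(V \cap W)$. The map
\[ \varphi' \circ \psi'^{-1}: W' \to V' \]
is an isomorphism. If we look at the effect of this isomorphism on the
part of $W'$ corresponding to $Y$, we see that it simply induces by
restriction a map
\[ W_1' \times \{b_2\} \to V_1' \times \{a_2\}, \]
whence a map $W_1' \to V_1'$ which is of class $C^p$, and has a $C^p$
inverse, induced by $\psi' \circ \varphi'^{-1}: V' \to W'$. This proves what
we wanted, i.e. that the family of all pairs $\{(Y \cap V, \varphi_1)\}$ is an
atlas for $Y$.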
The proof is based on the following obvious fact, which it is useful to
keep in mind when dealing with submanifolds.
Let $V_1$, $V_2$, $W_1$, $W_2$ be open subsets of euclidean spaces, and let
\[ g: V_1 \times V_2 \to W_1 \times W_2 \]
be a $C^p$ map. Let $a_2 \in V_2$, $b_2 \in W_2$, and assume that $g$ maps
$V_1 \times \{a_2\}$ into $W_1 \times \{b_2\}$. Then the induced map
\[ g_1: V_1 \to W_1 \]
is also a $C^p$ map.
Indeed, it is obtained as a composite map
\[ V_1 \to V_1 \times V_2 \to W_1 \times W_2 \to W_1, \]
the first map being an injection of $V_1$ as a factor, and the third map a
projection on the first factor.
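Spelled out, and only as a sketch: write $i$ for the injection, $\mathrm{pr}_1$ for the projection, and $g_1$ for the induced map (these three names are our notation). Then:

```latex
\[
  i : V_1 \to V_1 \times V_2, \quad i(x) = (x, a_2),
  \qquad
  \mathrm{pr}_1 : W_1 \times W_2 \to W_1, \quad \mathrm{pr}_1(y_1, y_2) = y_1,
\]
\[
  g_1 = \mathrm{pr}_1 \circ g \circ i,
  \qquad
  g(x, a_2) = (g_1(x), b_2) \quad \text{for } x \in V_1 .
\]
```

Both $i$ (affine) and $\mathrm{pr}_1$ (linear) are of class $C^\infty$, so $g_1$ is of class $C^p$ as a composite of $C^p$ maps.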
The following statement has a proof based on the same principle.
Let $Y$ be a submanifold of $X$ and let $f: Z \to X$ be a map from a
manifold $Z$ into $X$ such that $f(Z)$ is contained in $Y$. Let
$f_Y: Z \to Y$ be the induced map. Then $f$ is a morphism if and only if
$f_Y$ is a morphism.
We leave the proof to the reader.
We observe that if $Y$ is a submanifold of $X$, then the inclusion map of
$Y$ into $X$ is a morphism. If $Y$ is also a closed subset of $X$, then we
say that $Y$ is a closed submanifold.
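Theorem 2.1. Let $U$ be open in $\mathbf{R}^n$ and let $f: U \to \mathbf{R}$ be
a $C^p$ function, $p \geq 1$. Let
\[ a = (a_1, \dots, a_n) \in U \]
and assume that $D_n f(a) \neq 0$. Then the map
\[ \varphi: (x_1, \dots, x_n) \mapsto (x_1, \dots, x_{n-1}, f(x)) \]
is a local $C^p$ isomorphism at $a$, i.e. is a chart at $a$. If $f(a) = c$,
then locally at $a$, the inverse image $f^{-1}(c)$ is a submanifold of
$\mathbf{R}^n$.
Proof. The Jacobian matrix of $\varphi$ at $a$ is the matrix
\[
\begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
D_1 f(a) & D_2 f(a) & \cdots & D_{n-1} f(a) & D_n f(a)
\end{pmatrix}
\]
and its determinant is $D_n f(a) \neq 0$. The inverse mapping theorem shows
that $\varphi$ is a local $C^p$ isomorphism at $a$, thus proving Theorem 2.1.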
Corollary 2.2. Let $Y$ be the subset of $U$ consisting of all points $x$
such that $f(x) = c$. Then there exists an open neighborhood $V$ of $a$
such that $Y \cap V$ is a submanifold of $\mathbf{R}^n$.
Proof. We take $V$ such that $\varphi$ is a $C^p$ isomorphism on $V$. Those
points of $Y$ correspond to the points such that
\[ \varphi(x) = (x_1, \dots, x_{n-1}, c). \]
If $g$ is the inverse mapping of $\varphi$, then the map
\[ (x_1, \dots, x_{n-1}) \mapsto g(x_1, \dots, x_{n-1}, c) \]
is the inverse parametrization of a chart, and $\varphi$ restricted to
$V \cap Y$ maps $V \cap Y$ into a factor of a product, as desired.
Example. The map $f(x) = x_1^2 + \cdots + x_n^2$ and $c = 1$ give the sphere
in the preceding corollary, so that the sphere is a submanifold of
$\mathbf{R}^n$. For any point of the sphere, some coordinate is not equal to
0, and the partial derivative of $f$ with respect to that coordinate is not
0 at that point, so that the corollary applies (after renumbering the
coordinates if necessary).
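As a sketch of how the chart of Theorem 2.1 looks in this example, take a point $a$ of the sphere with $a_n > 0$ (an assumption made only to fix the sign of the square root). Writing $g$ for the local inverse of $\varphi$, one finds:

```latex
\[
  D_n f(x) = 2 x_n, \qquad
  \varphi(x_1, \dots, x_n) = (x_1, \dots, x_{n-1},\; x_1^2 + \cdots + x_n^2),
\]
\[
  g(y_1, \dots, y_n) = \Bigl(y_1, \dots, y_{n-1},\;
      \sqrt{\,y_n - y_1^2 - \cdots - y_{n-1}^2\,}\Bigr),
\]
\[
  g(y_1, \dots, y_{n-1}, 1) = \Bigl(y_1, \dots, y_{n-1},\;
      \sqrt{\,1 - y_1^2 - \cdots - y_{n-1}^2\,}\Bigr).
\]
```

One checks directly that $\varphi(g(y)) = y$ near $\varphi(a)$. The last line is the resulting parametrization of the sphere near $a$ by the first $n - 1$ coordinates.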
The argument proving Theorem 2.1 can easily be generalized to cover
other cases in which we can prove that a certain subset is a submanifold.
We shall formulate these criteria, which involve the derivative as a
linear map. We shall not use them later in this book, and the reader may
omit them without harm. For further applications and terminology
concerning this, cf. books on differentiable manifolds, e.g. [La 2].
Let $U$ be open in $E$ and let $f: U \to F$ be a $C^p$ morphism with
$p \geq 1$. Let $x_0 \in U$. Assume that $f'(x_0)$ is a linear isomorphism of
$E$ onto a subspace $F_1$ of $F$. Let $F = F_1 \oplus F_2$. Then there exists
a local $C^p$ isomorphism $g: F \to F_1 \times F_2$ at $f(x_0)$ and an open
subset $U_1$ of $U$ containing $x_0$ such that the composite map
$g \circ f$ induces a $C^p$ isomorphism of $U_1$ onto an open subset of
$F_1$.
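Proof. Consider the map
\[ \varphi: U \times F_2 \to F_1 \times F_2 \]
given by
\[ \varphi(x, y_2) = (f(x), 0) + (0, y_2). \]
Then
\[ \varphi'(x_0, 0) = \begin{pmatrix} f'(x_0) & 0 \\ 0 & \mathrm{id}_2 \end{pmatrix} \]
where we use the matrix representation of a linear map of $E \times F_2$
into $F_1 \times F_2$. For $v_1 \in E$ and $v_2 \in F_2$ we have
\[ \varphi'(x_0, 0)(v_1, v_2) = (f'(x_0) v_1, 0) + (0, v_2). \]
In this representation, we view $f'(x_0)$ as a linear map of $E$ onto
$F_1$. We see that $\varphi'(x_0, 0)$ is a linear isomorphism between
$E \times F_2$ and $F_1 \times F_2$. By the inverse mapping theorem, we
conclude that $\varphi$ is a local $C^p$ isomorphism at $(x_0, 0)$. We let
$\psi$ be its local $C^p$ inverse. Then it is obvious that $\psi$ induces a
map $g$ to satisfy our requirements.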
In the preceding result, we may view $f$ as parametrizing a subset of
$F$, say locally at $x_0$ by $U_1 \to f(U_1)$. The lemma shows that there is
a chart at $f(x_0)$ in $f(U_1)$ which maps $f(U_1)$ into a factor in the
product $F_1 \times F_2$. Note that if we write $E = \mathbf{R}^q$ and
$F = \mathbf{R}^n$, then the subspace $F_1$ of $F$ is not necessarily equal
to $\mathbf{R}^q$ in its usual embedding in $\mathbf{R}^n$ as the space of the
first $q$ coordinates. The subspace $F_1$ can be quite arbitrary. However,
we can find a complementary subspace $F_2$, and then a basis of $F_1$ and
of $F_2$ in such a way that if we take coordinates with respect to this
basis, then the coordinates of $g(f(U_1))$ are precisely the coordinates
$(x_1, \dots, x_q, 0, \dots, 0)$. In our geometric terminology, we can say
that $f(U_1)$ is a submanifold of $F$.
The next result deals with the dual situation, where instead of an
injection we deal with a projection. If we have a map
\[ V_1 \times V_2 \to F \]
then we shall say that this map is a projection (on the first factor) if this
map can be expressed as a composite
\[ V_1 \times V_2 \to V_1 \to F \]
of the actual projection on the first factor, followed by a map of $V_1$
into $F$.
Let $U$ be open in $E$ and let $a$ be a point of $U$. Let $f: U \to F$ be a
$C^p$ map, $p \geq 1$. Assume that the derivative $f'(a): E \to F$ is
surjective. Let $E_2$ be a subspace of $E$ such that $f'(a)$ induces a
linear isomorphism of $E_2$ with $F$, and let $E_1$ be a complementary
subspace to $E_2$ in $E$, that is $E = E_1 \times E_2$. Then the map
\[ (x_1, x_2) \mapsto (x_1, f(x_1, x_2)) \]
is a local $C^p$ isomorphism at $a$.
Proof. The derivative of the map at $a$ is represented by the matrix
\[ \begin{pmatrix} I_1 & 0 \\ D_1 f(a) & D_2 f(a) \end{pmatrix} \]
and is therefore invertible at $a = (a_1, a_2)$ because $D_2 f(a)$ by
definition is a linear isomorphism of $E_2$ with $F$. The inverse mapping
theorem shows that our map is locally $C^p$ invertible at $a$, as was to be
shown.
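The invertibility claimed in the proof can be checked by hand. As a sketch, write $A = D_1 f(a)$, $B = D_2 f(a)$, and $I_1$, $I_F$ for the identity maps of $E_1$ and $F$ (all four names are our notation). Since $B$ is invertible:

```latex
\[
  \begin{pmatrix} I_1 & 0 \\ A & B \end{pmatrix}
  \begin{pmatrix} I_1 & 0 \\ -B^{-1}A & B^{-1} \end{pmatrix}
  =
  \begin{pmatrix} I_1 & 0 \\ A - B B^{-1} A & B B^{-1} \end{pmatrix}
  =
  \begin{pmatrix} I_1 & 0 \\ 0 & I_F \end{pmatrix}.
\]
```

The product in the other order is computed in the same way and gives the identity of $E_1 \times E_2$.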
In particular, let $c \in F$, and consider those points $x \in U$ such that
$f(x) = c$, i.e. such that $f$ has constant value $c$. If $v_2 \in E_2$ is
such that $f(v_2) = c$ (and $v_2$ is close to $a_2$), then the inverse image
$f^{-1}(c)$ corresponds to a factor $V_1 \times \{v_2\}$ in $E_1 \times E_2$
locally near $a$. One of the
most important examples is that of a function, which we treated in
Theorem 2.1.
XXII, §3. TANGENT SPACES
Let $X$ be a $C^p$ manifold ($p \geq 1$). Let $x$ be a point of $X$. We then
have a representation of $x$ in every chart at $x$, which maps an open
neighborhood of $x$ into a euclidean space $E$. We consider triples
$(U, \varphi, v)$ where $(U, \varphi)$ is a chart at $x$, and $v$ is a vector
in the vector space in which $\varphi U$ lies. We say that two such triples
$(U, \varphi, v)$ and $(V, \psi, w)$ are equivalent if the derivative of
$\psi \circ \varphi^{-1}$ at $\varphi x$ maps $v$ on $w$. The formula reads
\[ (\psi \circ \varphi^{-1})'(\varphi x) v = w. \]
This is obviously an equivalence relation by the chain rule. An
equivalence class of such triples is called a tangent vector of $X$ at $x$.
Thus we represent a tangent vector much the same way that we represent a
point of $X$, by its representation relative to charts. The set of such
tangent vectors is called the tangent space of $X$ at $x$ and is denoted by
$T_x(X)$. Each chart $(U, \varphi)$ determines a bijection of $T_x(X)$ on a
euclidean space, namely the equivalence class of $(U, \varphi, v)$
corresponds to the vector $v$.
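The claim that equivalence of triples is an equivalence relation rests on the chain rule. As a sketch of the transitivity check, let $(W, \chi, u)$ be a third triple (our notation) and suppose that $(U, \varphi, v)$ is equivalent to $(V, \psi, w)$ and $(V, \psi, w)$ to $(W, \chi, u)$. Then:

```latex
\[
  \chi \circ \varphi^{-1}
  = (\chi \circ \psi^{-1}) \circ (\psi \circ \varphi^{-1})
  \quad \text{near } \varphi x,
\]
\[
  (\chi \circ \varphi^{-1})'(\varphi x)\, v
  = (\chi \circ \psi^{-1})'(\psi x)\,(\psi \circ \varphi^{-1})'(\varphi x)\, v
  = (\chi \circ \psi^{-1})'(\psi x)\, w
  = u .
\]
```

Reflexivity holds because the derivative of the identity is the identity, and symmetry because $(\varphi \circ \psi^{-1})'(\psi x)$ is the inverse of $(\psi \circ \varphi^{-1})'(\varphi x)$.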
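Suppose that $X$ is a manifold. Then each derivative
\[ (\psi \circ \varphi^{-1})'(\varphi x): E \to E \]
is an invertible linear map. Let $v_1$, $v_2$ be vectors representing
tangent vectors $\bar v_1$, $\bar v_2$ in the chart $(U, \varphi)$, and let
$w_1$, $w_2$ represent the same tangent vectors in $(V, \psi)$. Then by
definition
\[ (\psi \circ \varphi^{-1})'(\varphi x) v_i = w_i \quad \text{for } i = 1, 2. \]
From this we see that $v_1 + v_2$ and $w_1 + w_2$ represent the same tangent
vector, and that if $c \in \mathbf{R}$, then $c v_1$ and $c w_1$ represent the
same tangent vector. Thus we can define addition and multiplication by
numbers in $T_x(X)$ in such a way that
\[ \bar v_1 + \bar v_2 \ \text{ is represented by } \ v_1 + v_2 \]
and
\[ c \bar v_1 \ \text{ is represented by } \ c v_1. \]
Then $T_x(X)$ is a vector space, and the map
\[ v \mapsto \bar v \]
is a linear isomorphism of $E$ onto $T_x(X)$.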
The derivative of a map defined on open sets of euclidean spaces can
now be interpreted on manifolds. Let $f: X \to Y$ be a $C^p$ map and let
$x \in X$. We define the tangent map at $x$,
\[ T_x f: T_x(X) \to T_{f(x)}(Y), \]
as the unique linear map having the following property: If $(U, \varphi)$
is a chart at $x$ and $(V, \psi)$ is a chart at $f(x)$ such that
$f(U) \subset V$, and $\bar v$ is a tangent vector at $x$ represented by $v$
in the chart $(U, \varphi)$, then
\[ T_x f(\bar v) \]
is the tangent vector at $f(x)$ represented by $Df_{U,V}(\varphi x) v$. It is
immediately verified that there does exist such a unique linear map. The
tangent linear map is also occasionally denoted by $df_x$, and is also
called the differential of $f$ at $x$. The representation of $T_x f$ on the
spaces of charts can be given in the form of a diagram.
\[
\begin{array}{ccc}
T_x(X) & \longrightarrow & E \\
\downarrow {\scriptstyle T_x f} & & \downarrow {\scriptstyle f'_{U,V}(\varphi x)} \\
T_{f(x)}(Y) & \longrightarrow & F
\end{array}
\]
Here of course, $F$ is the space in which $\psi(V)$ lies.
If $f: X \to Y$ and $g: Y \to Z$ are two $C^p$ maps, then the chain rule can
be expressed by the formula
\[ T_x(g \circ f) = T_{f(x)}(g) \circ T_x(f). \]
In particular, suppose that $Y$ is a submanifold of $X$, and let $x \in Y$
be a point of $Y$. Then we have the inclusion map
\[ j: Y \to X \]
which induces an injective linear map
\[ T_x j: T_x(Y) \to T_x(X) \]
whose image is a subspace of $T_x(X)$. This is the situation which is
usually depicted by the following picture:
Here $X = E$ is the whole vector space. Suppose that $Y$ is a submanifold
of $E$ and let $x \in Y$. Let
\[ \psi_1: V_1 \to Y \]
be a local isomorphism of some open $V_1$ in a space $F$ with $Y$, at a
point $y_1 \in V_1$ such that $\psi_1(y_1) = x$. Let us view $\psi_1$ as a map
of $V_1$ into $E$. Then
\[ \psi_1'(y_1): F \to E \]
is an injective linear map, whose image is a subspace $E_0$ of $E$. One can
verify directly, or from the abstract fact that $T_y \psi$ is defined, that
if
\[ \psi_2: V_2 \to Y \]
is a local isomorphism of some open $V_2$ in $F$ with $Y$, at a point
$y_2 \in V_2$ such that $\psi_2(y_2) = x$ also, then the image of
$\psi_2'(y_2)$ is in fact equal to $E_0$. This subspace $E_0$ is the
translation of the "tangent space" drawn on the picture. In fact, the
tangent space drawn on the picture consists of all pairs $(x, v)$ with
$v \in E_0$. We view each such pair $(x, v)$ as a located vector, starting
at $x$ and ending at $x + v$.
The collection of tangent spaces, namely the union of all $T_x(X)$ for all
$x \in X$, will be called the tangent bundle of $X$, and will be denoted by
$T(X)$. We can in fact make $T(X)$ into a $C^{p-1}$ manifold by giving
natural charts for it as follows.
We have a natural map
\[ \pi: T(X) \to X \]
which maps each tangent space $T_x(X)$ on the point $x$ of $X$. We call
$\pi$ the natural projection. Let $(U, \varphi)$ be a chart of $X$, with
$\varphi U$ open in $E$. We then obtain a map
\[ \tau_\varphi: \pi^{-1}(U) \to \varphi U \times E \]
defined by
\[ \tau_\varphi(\bar v) = (\varphi x, v) \]
if $\pi(\bar v) = x$ and $\bar v$ is a tangent vector at $x$, represented by
$v$ in $E$, with respect to the chart. In fact, it is clear that
$\tau_\varphi$ is a bijection.
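Let $(U, \varphi)$ and $(V, \psi)$ be two charts. We have
\[ \pi^{-1}(U) \cap \pi^{-1}(V) = \pi^{-1}(U \cap V). \]
We obtain a transition mapping
\[ \tau_\psi \circ \tau_\varphi^{-1}: \varphi(U \cap V) \times E \to \psi(U \cap V) \times E \]
by
\[ (\varphi x, v) \mapsto \bigl(\psi x,\ D(\psi \circ \varphi^{-1})(\varphi x) v\bigr) \]
for $x \in U \cap V$ and $v \in E$. Since the derivative
$D(\psi \circ \varphi^{-1})$ is of class $C^{p-1}$ and is a linear isomorphism
at each point, we conclude that our family of maps $\{\tau_\varphi\}$, for
$(U, \varphi)$ ranging over all charts of $X$, is an atlas for $T(X)$, and
therefore that $T(X)$ is a $C^{p-1}$ manifold, as we predicted it would be.
We call each chart $(\pi^{-1}U, \tau_\varphi)$ a trivializing chart of $T(X)$,
over the open set $U$. Locally, we see that each such trivializing chart
for $T(X)$ gives an isomorphism of the tangent bundle over $U$ with a
product $\varphi U \times E$.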
Let $f: X \to Y$ be a $C^p$ morphism, $p \geq 1$. We can then define a
tangent map
\[ Tf: T(X) \to T(Y) \]
to be simply the map equal to
\[ T_x f \]
on the tangent space at $x$. It is immediately clear from the way in which
we defined the charts for the tangent bundle that $Tf$ is a $C^{p-1}$
morphism. Over an open set $U$ of $X$ with chart $(U, \varphi)$, suppose
that $f$ maps $U$ into an open set $V$ of $Y$, with chart $(V, \psi)$. We
can represent $Tf$ as the derivative as on the following diagram:
\[
\begin{array}{ccc}
\pi^{-1}(U) & \xrightarrow{\ \tau_\varphi\ } & \varphi U \times E \\
\downarrow {\scriptstyle Tf} & & \downarrow \\
\pi^{-1}(V) & \xrightarrow{\ \tau_\psi\ } & \psi V \times F.
\end{array}
\]
The map on the right can be viewed as the pair $(f_{U,V}, f'_{U,V})$.
XXII, §4. PARTITIONS OF UNITY
Let $X$ be a $C^p$ manifold, $p \geq 0$. By a $C^p$ function on $X$ we shall
always mean a morphism of $X$ into $\mathbf{R}$ of class $C^p$ (unless
otherwise specified, when we take complex valued functions). The
functions form a ring $C^p(X)$. As usual, the support of a function $f$ is
the closure of the set of points $x$ such that $f(x) \neq 0$.
Let $X$ be a topological space. A covering of $X$ is called locally finite
if every point of $X$ has a neighborhood which intersects only finitely
many elements of the covering. A refinement $\{V_j\}$ of a covering
$\{U_i\}$ of $X$ is a covering such that each $V_j$ is contained in some
$U_i$. We also say that the covering $\{V_j\}$ is subordinated to the
covering $\{U_i\}$.
A partition of unity (of class $C^p$) on a manifold $X$ consists of an open
covering $\{U_i\}$ of $X$ and a family of $C^p$ functions
\[ \psi_i: X \to \mathbf{R} \]
satisfying the following conditions:
PU 1. For all $x \in X$, we have $\psi_i(x) \geq 0$.
PU 2. The support of $\psi_i$ is contained in $U_i$.
PU 3. The covering is locally finite.
PU 4. For each point $x \in X$ we have $\sum \psi_i(x) = 1$.
(The sum is taken over all $i$, but is in fact finite for any given point
$x$ in view of PU 3.)
As a matter of notation, we often write that $\{(U_i, \psi_i)\}$ is a
partition of unity if it satisfies the previous four conditions.
Theorem 4.1. Let $X$ be a manifold which is Hausdorff and whose
topology has a countable base. Given an open covering $\mathfrak{U}$ of $X$,
there exists an atlas $\{(V_k, \varphi_k)\}$ such that the covering
$\{V_k\}$ is locally finite and subordinated to the given covering
$\mathfrak{U}$, such that $\varphi_k V_k$ is the open ball $B_3(0)$ of radius 3
centered at 0, and such that the open sets $W_k = \varphi_k^{-1}(B_1)$ cover
$X$ (where $B_1$ is the open ball of radius 1 centered at 0).
Proof. Let $U_1, U_2, \dots$ be a basis for the open sets of $X$ such that
each closure $\bar U_i$ is compact. We can find such a basis since $X$ is
locally compact. We construct inductively a sequence $A_1, A_2, \dots$ of
compact sets whose union is $X$, such that $A_i$ is contained in the
interior of $A_{i+1}$. We start with $A_1 = \bar U_1$. Suppose that we have
constructed $A_1, \dots, A_i$. Let $j$ be the smallest integer such that
$A_i$ is contained in $U_1 \cup \cdots \cup U_j$. We let $A_{i+1}$ be the
closed and compact set
\[ \bar U_1 \cup \cdots \cup \bar U_j \cup \bar U_{i+1}. \]
This gives our desired sequence of compact sets.

For each point $x \in X$ we can find an arbitrarily small chart
$(V_x, \varphi_x)$ at $x$ such that $\varphi_x V_x$ is the open ball of radius
3 centered at 0. We can therefore assume that $V_x$ is contained in some
open set of the covering $\mathfrak{U}$. As for the statement concerning the
ball of radius 3, we can always shrink our open set $V_x$ so that its image
is a ball, and then adjust the image by a translation and multiplication
by a positive number to make the image exactly equal to $B_3(0)$. For each
$i$ and each $x$ in the open set
\[ \mathrm{Int}(A_{i+1}) - A_{i-2} \]
(with the convention that $A_j$ is empty for $j \leq 0$) we select $V_x$ to
be contained in this open set. We let $W_x = \varphi_x^{-1}(B_1)$ be the
inverse image of the ball of radius 1. We can cover the compact set
(annulus)
\[ A_i - \mathrm{Int}(A_{i-1}) \]
by a finite number of sets $W_{x_1}, \dots, W_{x_m}$. Let $\mathfrak{B}_i$
denote the family $\{V_{x_1}, \dots, V_{x_m}\}$, and let $\mathfrak{B}$ be the
union of all $\mathfrak{B}_i$ for all $i = 1, 2, \dots$. Then $\mathfrak{B}$ is
an open covering of $X$, is locally finite, and is subordinated to our
given covering $\mathfrak{U}$. It also satisfies the other requirements of
the theorem.
Corollary 4.2. Let $X$ be a $C^p$ manifold which is Hausdorff, and whose
topology has a countable base. Then $X$ has $C^p$ partitions of unity
subordinated to a given covering $\mathfrak{U}$.
Proof. Let $\{(V_k, \varphi_k)\}$ be as in the theorem, and let
$W_k = \varphi_k^{-1}(B_1)$ be as in the theorem. We can find a function
$\psi_k$ of class $C^p$ such that $0 \leq \psi_k \leq 1$, such that
$\psi_k(x) = 1$ for $x \in W_k$, and $\psi_k(x) = 0$ for $x \notin V_k$. (The
proof is recalled below.) We now let
\[ \psi = \sum \psi_k. \]
The sum is finite at each point, and $\psi > 0$ everywhere because the
sets $W_k$ cover $X$. We let $\gamma_k = \psi_k / \psi$. Then
$\{(V_k, \gamma_k)\}$ is the desired partition of unity.
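The four conditions can be checked directly. As a sketch, write $\psi$ for the sum of the functions $\psi_k$ and $\gamma_k = \psi_k/\psi$ for the normalized functions, as in the proof. Every $x$ lies in some $W_k$, where $\psi_k = 1$, and all terms are non-negative, so:

```latex
\[
  \psi(x) = \sum_k \psi_k(x) \ \geq\ 1 \quad \text{for all } x \in X,
\]
\[
  \gamma_k = \frac{\psi_k}{\psi} \geq 0, \qquad
  \operatorname{supp} \gamma_k = \operatorname{supp} \psi_k, \qquad
  \sum_k \gamma_k(x) = \frac{1}{\psi(x)} \sum_k \psi_k(x) = 1 .
\]
```

The support condition then follows once $\psi_k$ is chosen to vanish outside a compact subset of $V_k$, which is what the construction recalled below provides. Local finiteness is inherited from the covering $\{V_k\}$.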
We now recall the argument giving the function $\psi_k$. If
$0 \leq a < b$, then the function defined by
\[ \exp\Bigl(\frac{-1}{(t - a)(b - t)}\Bigr) \]
in the open interval $a < t < b$ and 0 outside the interval determines a
bell-shaped $C^\infty$ function from $\mathbf{R}$ to $\mathbf{R}$. Its integral
from $-\infty$ to $t$ divided by the area under the bell yields a function
which lies strictly between 0 and 1 in the interval $a < t < b$, is equal
to 0 for $t \leq a$ and is equal to 1 for $t \geq b$.
We can therefore find a real valued function of a real variable, say
$\eta(t)$, such that $\eta(t) = 1$ for $|t| < 1$ and $\eta(t) = 0$ for
$|t| \geq 1 + \delta$ with small $\delta$, and such that
$0 \leq \eta \leq 1$. Then $\eta(|x|^2) = \psi(x)$ gives us a function which
is equal to 1 on the ball of radius 1 and 0 outside the ball of radius
$1 + \delta$. (We denote by $|\ |$ the euclidean norm.) This function can
then be transported to the manifold by any given chart whose image is
the ball of radius 3.
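One possible explicit choice, given only as a sketch: take $a = 1$ and $b = 1 + \delta$ in the bell-shaped function above, call it $\beta$ (the name is ours), and define $\eta$ by subtracting the normalized integral from 1.

```latex
\[
  \beta(t) =
  \begin{cases}
    \exp\!\Bigl(\dfrac{-1}{(t-1)(1+\delta-t)}\Bigr), & 1 < t < 1+\delta, \\[1ex]
    0, & \text{otherwise},
  \end{cases}
  \qquad
  \eta(t) = 1 - \frac{\int_{-\infty}^{|t|} \beta(s)\, ds}
                     {\int_{-\infty}^{\infty} \beta(s)\, ds}.
\]
```

Then $\eta(t) = 1$ for $|t| \leq 1$, $\eta(t) = 0$ for $|t| \geq 1 + \delta$, and $0 \leq \eta \leq 1$. The absolute value causes no loss of smoothness, because $\eta$ is constant in a neighborhood of $t = 0$.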
Corollary 4.3. Let $A$, $B$ be disjoint closed subsets of $\mathbf{R}^n$, or
of a manifold $X$ admitting $C^p$ partitions of unity subordinate to any
given open covering. Then there exists a $C^p$ function $f$ such that
\[ 0 \leq f \leq 1, \qquad f = 1 \text{ on } A, \qquad f = 0 \text{ on } B. \]
Proof. For each $x \in A$ let $U_x$ be an open neighborhood of $x$ not
intersecting $B$. Let $\{\alpha_j\}$ be a partition of unity subordinate to
the covering consisting of $\mathcal{C}A$ (the complement of $A$) and
$\{U_x\}_{x \in A}$. Let $J$ be the set of those indices $j$ such that
$\operatorname{supp} \alpha_j \subset U_{x(j)}$ for some $x(j) \in A$. Let
\[ f = \sum_{j \in J} \alpha_j. \]
For any $x \in X$ there is only a finite number of functions $\alpha_j$ such
that $\alpha_j(x) \neq 0$, so our sum expressing $f$ is actually a finite
sum. If $x \in A$ and $\alpha_i$ has support in $\mathcal{C}A$, then
$\alpha_i(x) = 0$. Hence for each $x \in A$ we have $f(x) = 1$. If
$x \in B$, then $\alpha_j(x) = 0$ for each $j \in J$ so $f(x) = 0$. For any
$x \in X$ we have $0 \leq f(x) \leq 1$ because of the definition of a
partition of unity and the fact that we take our sum for $f$ only over a
subset of the indices $\{j\}$. This proves our corollary.
Remark. In some cases, one wants a function $f$ as in the corollary
with certain bounds on its derivative. One can achieve such bounds by
being more careful in selecting the $\alpha_j$. For an example of the kind
of technique used, cf. the end of Chapter XXIII, §6.
XXII, §5. MANIFOLDS WITH BOUNDARY
In our applications, we need manifolds with boundary. Let
$\lambda: E \to \mathbf{R}$ be a functional on $E$. For instance, if
$E = \mathbf{R}^n$ we may consider $\lambda = \lambda_n$ to be the projection
on the $n$-th coordinate. We denote by $E_\lambda^0$ the kernel of
$\lambda$, and by $E_\lambda^+$ (resp. $E_\lambda^-$) the set of points
$x \in E$ such that $\lambda x \geq 0$ (resp. $\lambda x \leq 0$). We call
$E_\lambda^0$ the hyperplane determined by $\lambda$, and we call
$E_\lambda^+$ or $E_\lambda^-$ a half space. The terminology is justified by
the natural pictures.
If $\mu$ is another functional, and $E_\lambda^+ = E_\mu^+$, then there
exists a number $c > 0$ such that $\lambda = c\mu$.
This is easily proved. Indeed, we see at once that the kernels of
$\lambda$ and $\mu$ must be equal. Suppose that $\lambda \neq 0$. Let $x_0$
be such that $\lambda(x_0) > 0$. Then $\mu(x_0) > 0$ also. The functional
\[ \lambda - c\mu \]
where $c = \lambda(x_0)/\mu(x_0)$ vanishes on the kernel of $\lambda$ (or
$\mu$), and also on $x_0$. Therefore it is the 0 functional and $c$
satisfies our requirement.
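The last step uses the fact that the kernel together with $x_0$ spans the whole space. As a sketch, writing $\lambda$ and $\mu$ for the two functionals and $x_0$ for a point with $\lambda(x_0) > 0$, every $x \in E$ decomposes as follows:

```latex
\[
  x = \Bigl(x - \frac{\lambda(x)}{\lambda(x_0)}\, x_0\Bigr)
      + \frac{\lambda(x)}{\lambda(x_0)}\, x_0,
  \qquad
  \lambda\Bigl(x - \frac{\lambda(x)}{\lambda(x_0)}\, x_0\Bigr) = 0 ,
\]
\[
  \text{so } \ E = \operatorname{Ker} \lambda + \mathbf{R}\, x_0
  \quad \text{and} \quad (\lambda - c\mu)(x) = 0 \ \text{ for all } x \in E .
\]
```

A linear map that vanishes on both summands vanishes on their sum, which is the whole space.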
In practice, we shall use mostly coordinate functions as functionals,
and especially the $n$-th coordinate function. However, it is reasonable to
use a slightly more invariant language which exhibits more directly the
geometric nature of the forthcoming constructions. We shall be
interested in figures like the shell of a cylinder:
If we exclude the two end circles, then what is left is just an ordinary
manifold. However, if we include the two end circles, then we have an
object which, at each point of one of the end circles, does not look like
some open set in 2-space, but rather looks like a point at the boundary
of a half plane. In other words, we have a parametrization as indicated
by the following picture:
We shall formulate the definitions and lemmas which allow us to give a
formal development for such parametrizations.
Let $E$, $F$ be euclidean spaces, and let $E_\lambda^+$ and $F_\mu^+$ be two
half spaces in $E$ and $F$, respectively. Let $U$, $V$ be open subsets of
these half spaces respectively. We shall say that a mapping
\[ f: U \to V \]
is of class $C^p$ if the following condition is satisfied. Given $x \in U$,
there exists an open neighborhood $U_1$ of $x$ in $E$, and an open
neighborhood $V_1$ of $f(x)$ in $F$, and a $C^p$ map $f_1: U_1 \to V_1$ such
that the restriction of $f_1$ to $U_1 \cap U$ is equal to $f$. As usual, we
take $p \geq 1$.
We can now define a manifold with boundary in a manner entirely
similar to the one used to define manifolds, namely by conditions AT 1,
AT 2, AT 3, except that we take the images $\varphi_i U_i$ of the charts of
an atlas to be open subsets of half spaces. The notion of
$C^p$-isomorphism is defined as usual by the condition of having a
$C^p$-inverse.
We must make some remarks concerning the boundary, and we need
some lemmas, e.g. to show that the boundary is a "differentiable
invariant".
Lemma 5.1. Let $U$ be open in $E$, and let $f: U \to F$ and $g: U \to F$ be
two $C^p$ maps ($p \geq 1$). Assume that $f$ and $g$ have the same
restriction to $U \cap E_\lambda^+$ for some half space $E_\lambda^+$, and let
$x \in U \cap E_\lambda^+$. Then $f'(x) = g'(x)$.
Proof. After considering the difference of $f$ and $g$, we may assume
without loss of generality that the restriction of $f$ to
$U \cap E_\lambda^+$ is 0. It then follows that $f'(x) = 0$ because the
directions of the half space span the whole space.
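The phrase about directions spanning the space can be made explicit. As a sketch, suppose $f$ vanishes on the intersection of $U$ with the half space defined by $\lambda \geq 0$, and let $x$ be a point of that intersection. For $v$ with $\lambda(v) \geq 0$ the points $x + tv$ stay in the half space for $t \geq 0$, and in $U$ for small $t$, so:

```latex
\[
  f'(x)\, v = \lim_{t \to 0^+} \frac{f(x + t v) - f(x)}{t} = 0
  \qquad \text{whenever } \lambda(v) \geq 0 ,
\]
\[
  w = v - (v - w) \ \text{ with } \ \lambda(v) \geq 0,\ \lambda(v - w) \geq 0
  \quad \Longrightarrow \quad f'(x)\, w = 0 \ \text{ for all } w \in E .
\]
```

Such a decomposition of an arbitrary $w$ always exists: if $\lambda = 0$ take $v = w$, and otherwise take $v = w + s v_0$ with $\lambda(v_0) > 0$ and $s$ large (the names $v_0$ and $s$ are ours).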
Lemma 5.2. Let $U$ be open in $E$. Let $\mu$ be a non-zero functional on
$F$, and let $f: U \to F_\mu^+$ be a $C^p$ map with $p \geq 1$. Let $x$ be a
point of $U$ such that $f(x)$ lies in $F_\mu^0$. Then $f'(x)$ maps $E$ into
$F_\mu^0$.
Proof. Without loss of generality, after translations, we may assume
that $x = 0$ and $f(x) = 0$. Let $W$ be a given bounded neighborhood of 0
in $F$. Suppose that we can find an element $v \in E$ such that
$\mu f'(0) v \neq 0$. We can write (for small $t > 0$):
\[ f(tv) = t f'(0) v + o(t) w_t \]
with some element $w_t \in W$. By assumption, $f(tv)$ lies in $F_\mu^+$.
Applying $\mu$, we get
\[ t \mu f'(0) v + o(t) \mu(w_t) \geq 0. \]
Dividing by $t$, this yields
\[ \mu f'(0) v \geq -\frac{o(t)}{t} \mu(w_t). \]
Replacing $t$ by $-t$ we get a similar inequality on the other side.
Letting $t$ tend to 0 shows that $\mu f'(0) v = 0$, a contradiction.
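Written out, the two one-sided estimates look as follows. This is a sketch with $t > 0$ small, where $w_t$ and $w_{-t}$ denote the elements of the bounded set $W$ that appear in the expansions of $f(tv)$ and $f(-tv)$ (the subscript notation is ours):

```latex
\[
  \mu f'(0) v \ \geq\ -\frac{o(t)}{t}\, \mu(w_t)
  \qquad \text{(from } f(tv) \in F_\mu^+ \text{)},
\]
\[
  -\mu f'(0) v \ \geq\ -\frac{o(t)}{t}\, \mu(w_{-t})
  \qquad \text{(from } f(-tv) \in F_\mu^+ \text{)},
\]
\[
  \frac{o(t)}{t} \to 0 \ \text{ and } \ |\mu(w_{\pm t})| \text{ bounded}
  \quad \Longrightarrow \quad
  \mu f'(0) v \geq 0 \ \text{ and } \ \mu f'(0) v \leq 0 .
\]
```

Boundedness of $W$ is what keeps the right-hand sides under control as $t$ tends to 0.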
Let $U$ be open in some half plane $E_\lambda^+$. We define the boundary of
$U$ (written $\partial U$) to be the intersection of $U$ with $E_\lambda^0$.
We define the interior of $U$, written $\mathrm{Int}(U)$, to be the
complement of $\partial U$ in $U$. Then $\mathrm{Int}(U)$ is open in $E$.
Example. Let $E_\lambda^+$ be a half space, with $\lambda \neq 0$. Then from
our definition, we see that this half space is $C^\infty$ isomorphic to a
product
\[ E_\lambda^+ \approx E_\lambda^0 \times \mathbf{R}^+ \]
where $\mathbf{R}^+$ is the set of real numbers $\geq 0$. The boundary in
this case is $E_\lambda^0 \times \{0\}$.
Lemma 5.3. Let $\lambda$ be a functional on $E$ and $\mu$ a functional on
$F$. Let $U$ be open in $E_\lambda^+$ and $V$ open in $F_\mu^+$. Assume that
$U \cap E_\lambda^0$ and $V \cap F_\mu^0$ are not empty. Let $f: U \to V$ be
a $C^p$-isomorphism ($p \geq 1$). Then $\lambda \neq 0$ if and only if
$\mu \neq 0$. If $\lambda \neq 0$, then $f$ induces a $C^p$-isomorphism of
$\mathrm{Int}(U)$ on $\mathrm{Int}(V)$ and of $\partial U$ on $\partial V$.
Proof. For each $x \in U$, we conclude from the chain rule that $f'(x)$ is
invertible. Our first assertion then follows from Lemma 5.2. We also see
that no interior point of $U$ maps on a boundary point of $V$ and
conversely. Thus $f$ induces a bijection of $\partial U$ and $\partial V$,
and a bijection of $\mathrm{Int}(U)$ on $\mathrm{Int}(V)$. Since these
interiors are open in their respective spaces, it follows that $f$ induces
an isomorphism between them. As for the boundary, it is a submanifold of
the full space, and locally, our definition of the derivative, together
with the product structure, shows that the restriction of $f$ to
$\partial U$ must be an isomorphism on $\partial V$.
We see that Lemma 5.3 gives us the invariance of the boundary under
$C^p$ isomorphisms ($p \geq 1$), first for open subsets of half spaces, but
then also immediately for the boundary of a manifold since the property
reduces at once to such subsets, under charts.
We can then describe local coordinates at a point in a manifold with
boundary as follows. If the point is not a boundary point, then a
neighborhood of this point is described by coordinates
$(x_1, \dots, x_n)$ in some open set of $\mathbf{R}^n$, which we may even
take to contain 0, and such that 0 corresponds to the given point.
If the point is a boundary point, then an open neighborhood can be
described by coordinates $(x_1, \dots, x_n)$ satisfying
\[ x_n \geq a \]
and $x_1, \dots, x_{n-1}$ lying in some open set of $\mathbf{R}^{n-1}$.
After a translation, we can even achieve an inequality $x_n \geq 0$ instead
of $x_n \geq a$. The points with coordinates $x_n = 0$ are precisely those
on the boundary. This comes from the fact that after a suitable choice of
basis of $\mathbf{R}^n$, we can always achieve the result that a non-zero
functional is simply the projection on one coordinate (here the $n$-th)
with respect to that basis.
Similarly, we can define an embedded $k$-dimensional submanifold with
boundary in $\mathbf{R}^n$, in terms of coordinates. Namely, we say that a
subset $X$ of $\mathbf{R}^n$ is such a submanifold if for each $x \in X$
there exists an open set $U$ in $\mathbf{R}^n$ containing $x$, an open set
$V$ in $\mathbf{R}^n$ and a $C^p$ isomorphism $\varphi: U \to V$ such that
\[ \varphi(U \cap X) = V \cap (H^k \times \{c\}), \]
where $H^k$ is, say, the half plane in $\mathbf{R}^k$ defined by
$x_k \geq a$, and $c$ is a point $(c_{k+1}, \dots, c_n)$ in
$\mathbf{R}^{n-k}$.
Example. Consider the cylinder, conveniently placed vertically as follows:
We can define a chart for part of the cylinder in terms of the three
coordinates $(r, \theta, z)$ satisfying the inequalities:
$r = c$,
The map $\psi$ such that
\[ \psi(r, \theta, z) = (r \cos\theta,\ r \sin\theta,\ z) \]
is the inverse of the map $\varphi$ in the previous definition.
The Tangent Spaces. These can be defined much as for a manifold

without boundary, since the charts and the spaces in which they lie
can be used as before. For the equivalence classes between vectors, we
needed the derivative of maps defining the changes of charts, but these
are well defined independently of the manner in which one extends such
changes of charts from half spaces to a full open set in the vector space.
Partitions of Unity. The theorem proved in §4 goes over without

essential change to manifolds with boundary. Of course, the open balls in
Theorem 4.1 have to be replaced with their intersections with half spaces.
XXII, §6. VECTOR FIELDS AND GLOBAL DIFFERENTIAL EQUATIONS
Let X be a manifold without boundary, of class $C^p$ with $p \geq 2$. We
assume that X is Hausdorff. Let $\pi : T(X) \to X$ be the natural map of its
tangent bundle onto X. We know that T(X) is a manifold of class $C^{p-1}$,
and that $\pi$ is of class $C^{p-1}$.
By a vector field on X we mean a morphism (of class $C^{p-1}$)
$$\xi : X \to T(X)$$
such that $\xi(x)$ lies in the tangent space $T_x(X)$ for each $x \in X$, or in other
words, such that $\pi \circ \xi = \mathrm{id}_X$. Thus a vector field assigns a tangent vector
to each point.
When we identify the tangent bundle of an open set U in E with the
product $U \times E$ relative to a chart $(U, \varphi)$, then we see that a vector field
corresponds to a map
$$U \to U \times E$$
such that
$$\xi(x) = (x, f(x)),$$
where $f : U \to E$ is a $C^{p-1}$ map. Thus a vector field is completely determined
by the map f, which has been studied in Chapter XIV. We call f
the local representation of the vector field $\xi$ in the chart $(U, \varphi)$.
Let J be an open interval of R. The tangent bundle of J is then
naturally identifiable with J x R, since the identity map of J is a global
chart for J . In particular, we can view the number 1 as a tangent vector
at each point, and we have a constant vector field over J which takes
this value 1 at all points.
Let $\alpha : J \to X$ be a curve, i.e. a map from an open interval J into X.
Assume that $\alpha$ is of class $C^1$. We want to take the derivative of $\alpha$.
Locally at each point of J, we can shrink the domain of definition of $\alpha$
to a subinterval $J_0$ such that $\alpha(J_0)$ is contained in the domain of definition
U of a chart $(U, \varphi)$. Then the composite $\varphi \circ \alpha$ is a curve into E,
$$J_0 \xrightarrow{\ \alpha\ } U \xrightarrow{\ \varphi\ } \varphi U \subset E.$$
We can then take the derivative $(\varphi \circ \alpha)'(t)$ for $t \in J_0$ as a vector in E.
This vector represents a tangent vector in $T_{\alpha(t)}(X)$, and it is immediately
clear that if we change the chart to another $(V, \psi)$, then $(\psi \circ \alpha)'(t)$ represents
the same tangent vector. In this way we obtain a curve which we
shall denote by $\alpha'$, into the tangent bundle, namely
$$\alpha' : J \to T(X),$$
which is such that $\alpha'(t)$ lies in $T_{\alpha(t)}(X)$. We shall also write $d\alpha/dt$ instead
of $\alpha'(t)$, following standard notation, consistent with previous notation
when we studied vector fields on open sets of vector spaces.
We say that $\alpha$ is an integral curve for the vector field $\xi$ if we have
$$\alpha'(t) = \xi(\alpha(t))$$
for all $t \in J$. If J contains 0 and $\alpha(0) = x_0$, we say that $x_0$ is the initial
condition of $\alpha$. The theorems on differential equations proved in Chapter
XIV, §3, §4, §5 can now be formulated on manifolds.
Let $\alpha_1 : J_1 \to X$ and $\alpha_2 : J_2 \to X$ be two integral curves of the vector field
$\xi$ on X, with the same initial condition $x_0$. Then $\alpha_1$ and $\alpha_2$ are equal on
$J_1 \cap J_2$.
Proof. The proof is identical with that of Theorem 3.3 of Chapter XIV.
The preceding result allows us to define an integral curve with given
initial condition x on a maximal interval J(x). Of course, the local
existence theorem proved in Chapter XIV shows that such an integral
curve exists. As before, we let $\mathfrak{D}(\xi)$ be the subset of $\mathbf{R} \times X$ consisting of
all points (t, x) such that $t \in J(x)$. We define a global flow for $\xi$ to be the
map
$$\alpha : \mathfrak{D}(\xi) \to X$$
such that for each $x \in X$ the map $\alpha_x : J(x) \to X$ given by
$$\alpha_x(t) = \alpha(t, x)$$
is an integral curve for $\xi$ with initial condition x. When we select a chart
at a point x of X, then we see that this definition of flow coincides with
the definition we gave for open sets in euclidean spaces for the local
representation of our vector field. As in Chapter XIV, we abbreviate
$\alpha(t, x)$ by $tx$.
Theorem 6.1. Let $\xi$ be a vector field on X and $\alpha$ its flow. Let $x \in X$.
If $t_0$ lies in J(x), then
$$J(t_0 x) = J(x) - t_0$$
and we have for all t in $J(x) - t_0$:
$$t(t_0 x) = (t + t_0)x.$$
Proof. Just like the proof of Theorem 5.1 of Chapter XIV.
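Theorem 6.1 can be checked by hand on a simple example. The following is a minimal sketch, assuming the vector field $\xi(x) = x^2$ on $X = \mathbf{R}$ (our choice of example, not taken from the text), whose flow has the closed form $\alpha(t, x) = x/(1 - tx)$. The names `interval` and `flow` are hypothetical helpers introduced only for this illustration.

```python
import math

def interval(x):
    """Maximal interval J(x) = (lo, hi) for the integral curve of xi(x) = x**2 through x."""
    if x > 0:
        return (-math.inf, 1.0 / x)
    if x < 0:
        return (1.0 / x, math.inf)
    return (-math.inf, math.inf)

def flow(t, x):
    """Flow alpha(t, x) = x / (1 - t*x), defined only for t in J(x)."""
    lo, hi = interval(x)
    if not (lo < t < hi):
        raise ValueError("t is outside J(x)")
    return x / (1.0 - t * x)

x, t0, t = 2.0, 0.25, 0.1
y = flow(t0, x)                                # the point t0 x
shifted = tuple(b - t0 for b in interval(x))   # the interval J(x) - t0

print(interval(y), shifted)                    # J(t0 x) and J(x) - t0
print(flow(t, y), flow(t + t0, x))             # t(t0 x) and (t + t0) x
```

In this example the interval J(x) genuinely depends on x, which is why the domain of the flow is a proper open subset of $\mathbf{R} \times X$ rather than the whole product.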
Theorem 6.2. Let $\xi$ be a vector field of class $C^{p-1}$ on the $C^p$ manifold
X ($2 \leq p \leq \infty$). Then the domain $\mathfrak{D}(\xi)$ is open in $\mathbf{R} \times X$ and the flow $\alpha$
for $\xi$ is a $C^{p-1}$ morphism.
Proof. Identical with the proof of Theorem 5.2, Chapter XIV.
Corollary 6.3. For each $t \in \mathbf{R}$, the set of $x \in X$ such that (t, x) is
contained in the domain $\mathfrak{D}(\xi)$ is open in X.
Corollary 6.4. Let $\mathfrak{D}_t(\xi)$ be the set of points x of X such that (t, x) lies
in $\mathfrak{D}(\xi)$. Then $\mathfrak{D}_t(\xi)$ is open for each $t \in \mathbf{R}$, and $\alpha_t$ is a $C^p$-isomorphism
of $\mathfrak{D}_t(\xi)$ onto an open subset of X. In fact, $\alpha_t(\mathfrak{D}_t) = \mathfrak{D}_{-t}$ and $\alpha_t^{-1} = \alpha_{-t}$.
Proof. Immediate from Theorems 4.1 and 6.1.
Corollary 6.5. If $x_0$ is a point of X and t is in $J(x_0)$, then there exists
an open neighborhood U of $x_0$ such that t lies in J(x) for all $x \in U$, and
the map
$$x \mapsto tx$$
is an isomorphism of U onto an open neighborhood of $tx_0$.
In the present section, we have given the terminology which allows us
to discuss differential equations on manifolds.
CHAPTER XXIII
Integration and Measures on Manifolds
Throughout this chapter, unless otherwise specified, we use the word
manifold to denote manifolds possibly having boundaries. From §3 to the
end, we let X be a manifold of class $C^p$ with $p \geq 1$, which is Hausdorff
and has a countable base. These last two assumptions are to ensure
that X admits $C^p$ partitions of unity, subordinated to any given open
covering.
XXIII, §1. DIFFERENTIAL FORMS ON MANIFOLDS
Let X be a $C^p$ manifold, with p always $\geq 1$. To each tangent space
$T_x(X) = T_x$ we can associate the dual space $T_x^*$, and the alternating product
$\bigwedge^r T_x^*$. We form the union
$$\bigcup_{x \in X} \bigwedge\nolimits^r T_x^*,$$
denoted by $\bigwedge^r T^*(X)$.
By a differential form on X (of degree r) we shall mean a map
$$\omega : X \to \bigwedge\nolimits^r T^*(X)$$
such that for each x the value $\omega(x)$ lies in $\bigwedge^r T_x^*$. (We shall add differentiability
conditions in a moment.) The set of differential forms is a
vector space denoted by $\Omega^r(X)$.
If $f : X \to Y$ is a $C^p$ map of manifolds, then we obtain an induced map
$$f^* : \Omega^r(Y) \to \Omega^r(X),$$
just as in the case of subsets of euclidean spaces, and arising from the
induced linear map at each point,
$$T_x f : T_x(X) \to T_{f(x)}(Y).$$
(Cf. Theorem C of the Appendix, Chapter XXI, and Theorem 4.1 of
Chapter XXI.)
Essentially as we did with tangent vectors, we can find local representations
of differential forms in the corresponding euclidean space $\mathbf{R}^n = E$
of the manifold X. Indeed, let $x \in X$. Let
$$\varphi : U \to \varphi U \subset \mathbf{R}^n$$
be a chart at x. We have an isomorphism
$$\sigma_\varphi : \mathbf{R}^n \to T_x(X)$$
which to each vector $v \in \mathbf{R}^n$ associates the class of $(U, \varphi, v)$. If $\lambda$ is a
functional on $T_x$, then $\lambda \circ \sigma_\varphi$ is a functional on $\mathbf{R}^n$. Let $\omega$ be a 1-form on
U, and $\omega_x$ the value of $\omega$ at x. Then $\omega_x$ can be pulled back to $\mathbf{R}^n$, to
obtain the form
$$\omega_x \circ \sigma_\varphi$$
on $\mathbf{R}^n$. If $\omega$ is an r-form, and $x_1, \ldots, x_n$ are the coordinates of x in $\mathbf{R}^n$,
then there exist functions $g_{(i)}$ on $\varphi U$ such that
$$\omega_x = \sum_{(i)} g_{(i)}(x_1, \ldots, x_n)\, dx_{i_1} \wedge \cdots \wedge dx_{i_r},$$
and we say that the expression on the right is the local expression of $\omega$
determined by the chart, or corresponding to the chart $\varphi$. We shall also
commit the abuse of notation, writing $g_{(i)}(x)$ instead of $g_{(i)}(x_1, \ldots, x_n)$.
We abbreviate
for simplicity.
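If $(V, \psi)$ is another chart such that $U \cap V$ is not empty (so that our
two charts $(U, \varphi)$ and $(V, \psi)$ may be viewed as charts at a common
point), then we obtain a representation of $\omega$ determined by $\psi$. If $\omega$ is a
1-form, then $\omega_x$ is a functional on $T_x$, and the pull backs of $\omega_x$ to $\mathbf{R}^n$ by
$\sigma_\varphi$ and $\sigma_\psi$ respectively can be visualized in the following diagram:
[Diagram: two copies of $\mathbf{R}^n$, joined by a vertical map on the left, each mapping to $T_x$ by $\sigma_\psi$ and $\sigma_\varphi$ respectively.]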
The vertical map on the left is simply the derivative $(\varphi \circ \psi^{-1})'(\psi x)$, i.e.
the derivative at $\psi x$ of the transition map $\varphi \circ \psi^{-1}$ giving the change of
charts. In terms of local coordinates, the change in the local representation
of $\omega_x$ is given in terms of partial derivatives, which are of class $C^{p-1}$.
Similarly, for any r-form the change in the local representation is given
by certain subdeterminants of the Jacobian matrix of $(\varphi \circ \psi^{-1})'(\psi x)$, and
is again of class $C^{p-1}$. The most important case is that of an n-form, and
we can then write $\omega$ locally as
$$\omega(x) = g(x)\, dx_1 \wedge \cdots \wedge dx_n.$$
If $(V, \psi)$ is the other chart, then we have explicitly
$$g(x)\, dx_1 \wedge \cdots \wedge dx_n = g(f(y))\, \Delta_f(y)\, dy_1 \wedge \cdots \wedge dy_n$$
if $x = f(y)$ and $f = \varphi \circ \psi^{-1}$. As usual, $\Delta_f$ is the Jacobian determinant.
We say that $\omega$ is of class $C^{p-1}$ at a point, if in some local representation
relative to a chart at that point, the functions $g_{(i)}$ as above
are of class $C^{p-1}$. The remark in the preceding paragraph then shows
that this will then be true for any local representation relative to any
chart at that point. We say that $\omega$ is $C^{p-1}$ if it is of class $C^{p-1}$ at every
point.
Theorems 3.1, 4.1, 4.2, 4.3 of Chapter XXI, concerning the operation
$\omega \mapsto d\omega$, and the inverse image of a form now extend immediately to
manifolds. In fact, the theorems of Chapter XXI give the expression for
these operations on the local representation of forms. We define the
wedge product $\omega \wedge \eta$ of two forms just as in the local case, according to
the general algebraic result of Theorem B in the Appendix to Chapter
XXI. We shall now repeat the statements of the theorems loc. cit. on
manifolds.
There exists a unique family of linear maps
$$d : \Omega^r(X) \to \Omega^{r+1}(X) \qquad (r = 0, 1, 2, \ldots)$$
defined on the space of r-forms (of class $C^q$, into the space of forms of
class $C^{q-1}$), satisfying the properties that if $\deg \omega = r$, then
$$d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^r \omega \wedge d\eta,$$
and $df = Tf$ if f is a function, i.e. a form of degree 0.
If $f : X \to Y$ is a $C^p$ map, $p \geq 1$, then for each r there exists a unique
linear map
$$f^* : \Omega^r(Y) \to \Omega^r(X)$$
having the following properties:
(i) For any differential forms $\omega$, $\eta$ on Y we have
$$f^*(\omega \wedge \eta) = f^*(\omega) \wedge f^*(\eta).$$
(ii) If g is a function on Y, then $f^*(g) = g \circ f$, and if $\omega$ is a 1-form then
$$(f^*\omega)(x) = \omega(f(x)) \circ T_x f.$$
If $g : Y \to Z$ is a $C^p$ map, then
$$(g \circ f)^* = f^* \circ g^*.$$
Let $f : X \to Y$ be a $C^2$ map and $\omega$ a differential form of class $C^1$ on
Y. Then
$$f^*(d\omega) = d f^*\omega.$$
In particular, if g is a $C^1$ function on Y, then
$$f^*(dg) = d(g \circ f).$$
Observe that the operation d loses one order of differentiability, and
that if f is of class $C^p$, then $Tf$ is of class $C^{p-1}$, so that $f^*\omega$ has the
order of differentiability equal to the minimum of that of $Tf$ and $\omega$.
Our definition of the local representation of a differential form in
terms of local coordinates is compatible with this operation of inverse
image taken with respect to the map of a chart. In fact, if $(U, \varphi)$ is a
chart, so that
$$\varphi : U \to \varphi U$$
is a $C^p$-isomorphism of U onto an open set in a half space, and if $\omega$ is a
differential form on U, then we can take
$$(\varphi^{-1})^*(\omega),$$
which is a differential form on $\varphi U$. The expression in local coordinates
of $\omega$ is nothing but the expression of $(\varphi^{-1})^*(\omega)$ taken with respect to the
identity chart of $\varphi U$ as a subset of $\mathbf{R}^n$. In the case of isomorphisms like
charts, it is useful to use the notation $\varphi_*\omega$ instead of the inverse image
we have just written.
We can define the support of a differential form as we define the
support of a function. It is the closure of the set of all $x \in X$ such that
$\omega(x) \neq 0$. If $\omega$ is a form of class $C^q$ and $\alpha$ is a $C^q$ function on X, then we
can form the product $\alpha\omega$, which is the form whose value at x is $\alpha(x)\omega(x)$.
If $\alpha$ has compact support, then $\alpha\omega$ has compact support. Later, we shall
study the integration of forms, and reduce this to a local problem by
means of partitions of unity, in which we multiply a form by functions.
If X is a manifold and Y a submanifold, then any differential form on
X induces a form on Y. We can view this as a very special case of the
inverse image of a form, under the embedding (injection) map
$$\mathrm{id} : Y \to X.$$
In particular, if Y has dimension $n - 1$, and if $(x_1, \ldots, x_n)$ is a system of
coordinates for X at some point of Y such that the points of Y correspond
to those coordinates satisfying $x_j = c$ for some fixed number c, and
index j, and if the form on X is given in terms of these coordinates by
$$\omega(x) = f(x_1, \ldots, x_n)\, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n,$$
then the restriction of $\omega$ to Y (or the form induced on Y) has the
representation
$$f(x_1, \ldots, c, \ldots, x_n)\, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n,$$
where the roof over $dx_j$ means $dx_j$ is omitted. We should denote this
induced form by $\omega_Y$, although occasionally we omit the subscript Y. We
shall use such an induced form especially when Y is the boundary of a
manifold X.
XXIII, §2. ORIENTATION
Let U, V be open sets in half spaces of $\mathbf{R}^n$ and let $\varphi : U \to V$ be a $C^1$
isomorphism. We shall say that $\varphi$ is orientation preserving if the Jacobian
determinant $\Delta_\varphi(x)$ is $> 0$, all $x \in U$. If the Jacobian determinant is
negative, then we say that $\varphi$ is orientation reversing.
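The sign condition on the Jacobian determinant is easy to probe numerically. The following is a minimal sketch, assuming two maps of the plane chosen only for illustration (a shear and the coordinate swap); `jacobian_det`, `shear` and `swap` are hypothetical names, and the determinant is approximated by central differences at a few sample points, which suggests the sign but does not prove it for all x.

```python
def jacobian_det(phi, x, y, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of phi at (x, y)."""
    # Partial derivatives of both components with respect to x, then y.
    dx = [(u - v) / (2 * h) for u, v in zip(phi(x + h, y), phi(x - h, y))]
    dy = [(u - v) / (2 * h) for u, v in zip(phi(x, y + h), phi(x, y - h))]
    return dx[0] * dy[1] - dy[0] * dx[1]

def shear(x, y):
    """A C^1 isomorphism of the plane with Jacobian determinant 1 everywhere."""
    return (x + y ** 2, y)

def swap(x, y):
    """Interchange of the two coordinates, with Jacobian determinant -1."""
    return (y, x)

samples = [(-1.0, 0.5), (0.0, 0.0), (2.0, -3.0)]
shear_dets = [jacobian_det(shear, x, y) for x, y in samples]
swap_dets = [jacobian_det(swap, x, y) for x, y in samples]

print(all(d > 0 for d in shear_dets))  # consistent with orientation preserving
print(all(d < 0 for d in swap_dets))   # consistent with orientation reversing
```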
Let X be a $C^p$ manifold, $p \geq 1$, and let $\{(U_i, \varphi_i)\}$ be an atlas. We say
that this atlas is oriented if all transition maps $\varphi_j \circ \varphi_i^{-1}$ are orientation
preserving. Two atlases $\{(U_i, \varphi_i)\}$ and $\{(V_\alpha, \psi_\alpha)\}$ are said to define the
same orientation, or to be orientation equivalent, if their union is oriented.
We can also define locally a chart $(V, \psi)$ to be orientation compatible with
the oriented atlas $\{(U_i, \varphi_i)\}$ if all transition maps $\varphi_i \circ \psi^{-1}$ (defined whenever
$U_i \cap V$ is not empty) are orientation preserving. An orientation
equivalence class of oriented atlases is said to define an oriented manifold,
or to be an orientation of the manifold. It is a simple exercise to
verify that if a connected manifold has an orientation, then it has two
distinct orientations.
The standard examples of the Moebius strip or projective plane show
that not all manifolds admit orientations. We shall now see that the
boundary of an oriented manifold with boundary can be given a natural
orientation.
Let $\varphi : U \to \mathbf{R}^n$ be an oriented chart at a boundary point of X, such
that:
(1) if $(x_1, \ldots, x_n)$ are the local coordinates of the chart, then the boundary
points correspond to those points in $\mathbf{R}^n$ satisfying $x_1 = 0$; and
(2) the points of U not in the boundary have coordinates satisfying
$x_1 < 0$.
Then $(x_2, \ldots, x_n)$ are the local coordinates for a chart of the boundary,
namely the restriction of $\varphi$ to $\partial X \cap U$, and the picture is as follows.
[Figure: the chart along the $x_1$-axis, with the manifold on the side $x_1 < 0$ of its boundary.]
We may say that we have considered a chart $\varphi$ such that the manifold
lies to the left of its boundary. If readers think of a domain in $\mathbf{R}^2$,
having a smooth curve for its boundary, as on the following picture, they
will see that our choice of chart corresponds to what is usually visualized
as "counterclockwise" orientation.
[Figure: a plane domain bounded by a smooth curve.]
The collection of all pairs $(U \cap \partial X, \varphi|(U \cap \partial X))$, chosen according to
the criteria described above, is obviously an atlas for the boundary $\partial X$,
and we contend that it is an oriented atlas.
We prove this easily as follows. If
$$(x_1, \ldots, x_n) = x$$
and
$$(y_1, \ldots, y_n) = y$$
are coordinate systems at a boundary point corresponding to choices
of charts made according to our specifications, then we can write $y =$
$f(x)$ where $f = (f_1, \ldots, f_n)$ is the transition mapping. Since we deal with
oriented charts for X, we know that $\Delta_f(x) > 0$ for all x. Since f maps
boundary into boundary, we have
$$f_1(0, x_2, \ldots, x_n) = 0$$
for all $x_2, \ldots, x_n$. Consequently the Jacobian matrix of f at a point
$(0, x_2, \ldots, x_n)$ is equal to
$$\begin{pmatrix} D_1 f_1(0, x_2, \ldots, x_n) & 0 \cdots 0 \\ * & \Delta_g^{(n-1)} \end{pmatrix},$$
where $\Delta_g^{(n-1)}$ is the Jacobian matrix of the transition map g induced by f
on the boundary, and given by
$$y_2 = f_2(0, x_2, \ldots, x_n), \quad \ldots, \quad y_n = f_n(0, x_2, \ldots, x_n).$$
However, we have
$$D_1 f_1(0, x_2, \ldots, x_n) = \lim_{h \to 0} \frac{f_1(h, x_2, \ldots, x_n)}{h},$$
taking the limit with $h < 0$ since by prescription, points of X have coordinates
with $x_1 < 0$. Furthermore, for the same reason we have
$$f_1(h, x_2, \ldots, x_n) < 0.$$
Consequently
$$D_1 f_1(0, x_2, \ldots, x_n) > 0.$$
From this it follows that $\Delta_g^{(n-1)}(x_2, \ldots, x_n) > 0$, thus proving our assertion
that the atlas we have defined for $\partial X$ is oriented.
From now on, when we deal with an oriented manifold, it is understood

that its boundary is taken with orientation described above, and called the
induced orientation.
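The block structure of the Jacobian matrix used in the argument for the boundary orientation can be seen on a concrete transition map. The following is a minimal sketch, assuming the hypothetical map $f(x_1, x_2) = (x_1(1 + x_2^2),\, x_2 + x_1)$ of the plane, which sends the half plane $x_1 \leq 0$ into itself and the line $x_1 = 0$ into itself. All names are our own, and derivatives are approximated by central differences.

```python
def f(x1, x2):
    """Hypothetical transition map preserving the half plane x1 <= 0 and its boundary."""
    return (x1 * (1.0 + x2 ** 2), x2 + x1)

def partial(component, i, point, h=1e-6):
    """Central-difference partial derivative D_i f_component at the given point."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (f(*up)[component] - f(*down)[component]) / (2 * h)

def boundary_map_derivative(x2, h=1e-6):
    """Derivative of the induced boundary map g(x2) = f_2(0, x2)."""
    return (f(0.0, x2 + h)[1] - f(0.0, x2 - h)[1]) / (2 * h)

checks = []
for x2 in (-1.0, 0.0, 2.5):
    p = (0.0, x2)                      # a boundary point
    d1f1 = partial(0, 0, p)            # top-left entry of the Jacobian matrix
    d2f1 = partial(0, 1, p)            # vanishes, since f_1(0, x2) = 0 for all x2
    det_f = d1f1 * partial(1, 1, p) - d2f1 * partial(1, 0, p)
    det_g = boundary_map_derivative(x2)
    checks.append((d1f1, d2f1, det_f, det_g))

for d1f1, d2f1, det_f, det_g in checks:
    print(d1f1 > 0, det_f > 0, det_g > 0)
```

At each boundary point the determinant of f factors as $D_1 f_1$ times the determinant of the boundary map, so positivity of the first two forces positivity of the third.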
XXIII, §3. THE MEASURE ASSOCIATED WITH A DIFFERENTIAL FORM
Let X be a manifold of class $C^p$ with $p \geq 1$. We assume from now on
that X is Hausdorff and has a countable base. Then we know that X
admits $C^p$ partitions of unity, subordinated to any given open covering.
(Actually, instead of the conditions we assumed, we could just as well
have assumed the existence of $C^p$ partitions of unity, which is the precise
condition to be used in the sequel.)
Theorem 3.1. Let $\dim X = n$ and let $\omega$ be an n-form on X of class $C^0$,
i.e. continuous. Then there exists a unique positive functional $\Lambda$ on
$C_c(X)$ having the following property. If $(U, \varphi)$ is a chart and
$$\omega(x) = f(x)\, dx_1 \wedge \cdots \wedge dx_n$$
is the local representation of $\omega$ in this chart, then for any $g \in C_c(X)$ with
support in U, we have
$$(1) \qquad \Lambda g = \int_{\varphi U} g_\varphi(x)\, |f(x)|\, dx,$$
where $g_\varphi$ represents g in the chart [i.e. $g_\varphi(x) = g(\varphi^{-1}(x))$], and dx is
Lebesgue measure.
Proof. The integral in (1) defines a positive functional on $C_c(U)$. The
change of variables formula shows that if $(U, \varphi)$ and $(V, \psi)$ are two
charts, and if g has support in $U \cap V$, then the value of the functional is
independent of the choice of charts. Thus we get a positive functional by
the general localization theorem for measures or functionals (Theorem 5.1
of Chapter IX, §5), using partitions of unity.
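The independence of charts asserted in the proof can be observed numerically in dimension one. The following is a minimal sketch, assuming $X = (0, \infty)$ with the two charts $x$ and $y = \log x$, the form $\omega = -dx/x$, and a tent function g supported in $[1, 3]$. All of these are our own choices for illustration, and the integrals are approximated by the midpoint rule.

```python
import math

def midpoint(fn, a, b, n=4000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    step = (b - a) / n
    return step * sum(fn(a + (k + 0.5) * step) for k in range(n))

def g(p):
    """Continuous test function on X = (0, infinity), supported in [1, 3]."""
    return max(0.0, 1.0 - abs(p - 2.0))

def f_x(x):
    """Coefficient of omega in the chart x: omega = f_x(x) dx (negative on purpose)."""
    return -1.0 / x

def f_y(y):
    """Coefficient of the same omega in the chart y = log x, namely f_x(e^y) * e^y."""
    return f_x(math.exp(y)) * math.exp(y)

# The functional of formula (1), computed in each of the two charts.
value_in_x = midpoint(lambda x: g(x) * abs(f_x(x)), 1.0, 3.0)
value_in_y = midpoint(lambda y: g(math.exp(y)) * abs(f_y(y)), 0.0, math.log(3.0))

print(value_in_x, value_in_y)
```

The factor $e^y$ in `f_y` is the Jacobian determinant of the transition map $x = e^y$, and the absolute value in formula (1) is what makes the result insensitive to the sign of the coefficient.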
The positive measure corresponding to the functional in Theorem 3.1
will be called the measure associated with $|\omega|$, and can be denoted by
$\mu_{|\omega|}$.
Theorem 3.1 does not need any orientability assumption. With such
an assumption, we have a similar theorem, obtained without taking the
absolute value.
Theorem 3.2. Let $\dim X = n$ and assume that X is oriented. Let $\omega$ be
an n-form on X of class $C^0$. Then there exists a unique functional $\Lambda$ on
$C_c(X)$ having the following property. If $(U, \varphi)$ is an oriented chart and
$$\omega(x) = f(x)\, dx_1 \wedge \cdots \wedge dx_n$$
is the local representation of $\omega$ in this chart, then for any $g \in C_c(X)$ with
support in U, we have
$$\Lambda g = \int_{\varphi U} g_\varphi(x) f(x)\, dx,$$
where $g_\varphi$ represents g in the chart, and dx is Lebesgue measure.
Proof. Since the Jacobian determinant of transition maps belonging
to oriented charts is positive, we see that Theorem 3.2 follows like Theorem
3.1 from the change of variables formula (in which the absolute
value sign now becomes unnecessary) and the existence of partitions of
unity.
If $\Lambda$ is the functional of Theorem 3.2, we shall call it the functional
associated with $\omega$. For any function $g \in C_c(X)$, we define
$$\int_X g\omega = \Lambda g.$$
If $\mu_{|\omega|}(X)$ is finite, then we know by general theory that we can extend $\Lambda$
by continuity to $\mathcal{L}^1(|m|)$, where m is the regular complex Borel measure
associated with $\Lambda$ (cf. Theorem 4.2, Chapter IX and also Exercise 9 of
that chapter). If in particular $\omega$ has compact support, we can also proceed
directly as follows. Let $\{\alpha_i\}$ be a partition of unity over X such
that each $\alpha_i$ has compact support. We define
$$\int_X \omega = \sum_i \int_X \alpha_i \omega,$$
all but a finite number of terms in this sum being equal to 0. As usual,
it is immediately verified that this sum is in fact independent of the
choice of partition of unity, and in fact, we could just as well use only a
partition of unity over the support of $\omega$. Alternatively, if $\alpha$ is a function
in $C_c(X)$ which is equal to 1 on the support of $\omega$, then we could also
define
$$\int_X \omega = \int_X \alpha\omega.$$
It is clear that these two possible definitions are equivalent.

For an interesting theorem at the level of this chapter, see J. Moser's
paper "On the volume element on a manifold," Trans. Amer. Math. Soc.
120 (December 1965) pp. 286-294.
XXIII, §4. STOKES' THEOREM FOR A RECTANGULAR SIMPLEX
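Let
$$R = [a_1, b_1] \times \cdots \times [a_n, b_n]$$
be a rectangle in n-space, i.e. a product of n closed intervals. The
set-theoretic boundary of R consists of the union over all $i = 1, \ldots, n$ of the
pieces
$$R_i^0 = [a_1, b_1] \times \cdots \times \{a_i\} \times \cdots \times [a_n, b_n],$$
$$R_i^1 = [a_1, b_1] \times \cdots \times \{b_i\} \times \cdots \times [a_n, b_n].$$
If
$$\omega(x_1, \ldots, x_n) = f(x_1, \ldots, x_n)\, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n$$
is an $(n - 1)$-form, and the roof over anything means that this thing is to
be omitted, then we define
$$\int_{R_i^0} \omega = \int_{a_1}^{b_1} \cdots \widehat{\int_{a_i}^{b_i}} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, a_i, \ldots, x_n)\, dx_1 \cdots \widehat{dx_j} \cdots dx_n$$
if $i = j$, and 0 otherwise. And similarly for the integral over $R_i^1$. We
define the integral over the oriented boundary to be
$$\int_{\partial^0 R} = \sum_{i=1}^{n} (-1)^i \left[ \int_{R_i^0} - \int_{R_i^1} \right].$$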
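Stokes' Theorem for Rectangles. Let R be a rectangle in an open set U
in n-space. Let $\omega$ be an $(n - 1)$-form on U. Then
$$\int_R d\omega = \int_{\partial^0 R} \omega.$$
Proof. In two dimensions, the picture looks like this:
[Figure: a rectangle in the plane with sides $[a_1, b_1]$ and $[a_2, b_2]$, its four edges labelled as the signed boundary pieces $R_i^0$, $R_i^1$.]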
It suffices to prove the assertion when $\omega$ is a decomposable form, say
$$\omega(x) = f(x_1, \ldots, x_n)\, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n.$$
We then evaluate the integral over the boundary of R. If $i \neq j$, then it is
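clear that
$$\int_{R_i^0} \omega = 0 = \int_{R_i^1} \omega,$$
so that
$$\int_{\partial^0 R} \omega = (-1)^j \left[ \int_{R_j^0} \omega - \int_{R_j^1} \omega \right].$$
On the other hand, from the definitions we find that
$$d\omega(x) = \left( \frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n \right) \wedge dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n = (-1)^{j-1} \frac{\partial f}{\partial x_j}\, dx_1 \wedge \cdots \wedge dx_n.$$
(The $(-1)^{j-1}$ comes from interchanging $dx_j$ with $dx_1, \ldots, dx_{j-1}$. All other
terms disappear by the alternation rule.)
Integrating $d\omega$ over R, we may use repeated integration and integrate
$\partial f / \partial x_j$ with respect to $x_j$ first. Then the fundamental theorem of calculus
for one variable yields
$$\int_{a_j}^{b_j} \frac{\partial f}{\partial x_j}(x_1, \ldots, x_n)\, dx_j = f(x_1, \ldots, b_j, \ldots, x_n) - f(x_1, \ldots, a_j, \ldots, x_n).$$
We then integrate with respect to the other variables, and multiply by
$(-1)^{j-1}$. This yields precisely the value found for the integral of $\omega$ over
the oriented boundary $\partial^0 R$, and proves the theorem.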
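The sign conventions in the proof for rectangles are easy to get wrong, so a numerical check in two dimensions may help. The following is a minimal sketch, assuming the rectangle $[0, 1] \times [0, 2]$ and the coefficient $f(x_1, x_2) = x_1^2 x_2$ (both our own choices). For $j = 1$ the form is $f\, dx_2$ and for $j = 2$ it is $f\, dx_1$. All function names are hypothetical, and integrals are approximated by the midpoint rule.

```python
def midpoint_1d(fn, a, b, n=200):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    step = (b - a) / n
    return step * sum(fn(a + (k + 0.5) * step) for k in range(n))

def integral_d_omega(dfdxj, j, rect, n=200):
    """Integral over R of d(omega) = (-1)**(j-1) * (df/dx_j) dx_1 ^ dx_2."""
    (a1, b1), (a2, b2) = rect
    inner = lambda x1: midpoint_1d(lambda x2: dfdxj(x1, x2), a2, b2, n)
    return (-1) ** (j - 1) * midpoint_1d(inner, a1, b1, n)

def integral_boundary(f, j, rect, n=200):
    """Integral of omega over the oriented boundary: only the faces with i = j contribute."""
    (a1, b1), (a2, b2) = rect
    if j == 1:
        face0 = midpoint_1d(lambda x2: f(a1, x2), a2, b2, n)  # x_1 = a_1
        face1 = midpoint_1d(lambda x2: f(b1, x2), a2, b2, n)  # x_1 = b_1
    else:
        face0 = midpoint_1d(lambda x1: f(x1, a2), a1, b1, n)  # x_2 = a_2
        face1 = midpoint_1d(lambda x1: f(x1, b2), a1, b1, n)  # x_2 = b_2
    return (-1) ** j * (face0 - face1)

def f(x1, x2):
    return x1 ** 2 * x2

# Exact partial derivatives of f with respect to x_1 and x_2.
partials = {1: lambda x1, x2: 2 * x1 * x2, 2: lambda x1, x2: x1 ** 2}
rect = ((0.0, 1.0), (0.0, 2.0))

results = {}
for j in (1, 2):
    results[j] = (integral_d_omega(partials[j], j, rect), integral_boundary(f, j, rect))
    print(j, results[j])
```

For each j the two printed numbers should agree up to quadrature error, which is the content of the theorem for a decomposable form.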
Remark. Stokes' theorem for a rectangle extends at once to a version
in which we parametrize a subset of some space by a rectangle. Indeed,
if $\sigma : R \to V$ is a $C^1$ map of a rectangle of dimension n into an open set V
in $\mathbf{R}^N$, and if $\omega$ is an $(n - 1)$-form in V, we may define
$$\int_\sigma d\omega = \int_R \sigma^* d\omega.$$
One can define
$$\int_{\partial\sigma} \omega = \int_{\partial^0 R} \sigma^*\omega,$$
and then we have a formula
$$\int_\sigma d\omega = \int_{\partial\sigma} \omega.$$
In the next section, we prove a somewhat less formal result.
XXIII, §5. STOKES' THEOREM ON A MANIFOLD
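Theorem 5.1. Let X be an oriented manifold of class $C^2$, dimension n,
and let $\omega$ be an $(n - 1)$-form on X, of class $C^1$. Assume that $\omega$ has
compact support. Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
Proof. Let $\{\alpha_i\}_{i \in I}$ be a partition of unity, of class $C^2$. Then
$$\omega = \sum_{i \in I} \alpha_i \omega,$$
and this sum has only a finite number of non-zero terms since the
support of $\omega$ is compact. Using the additivity of the operation d, and
that of the integral, we find:
$$\int_X d\omega = \sum_{i \in I} \int_X d(\alpha_i \omega).$$
Suppose that $\alpha_i$ has compact support in some open set $V_i$ of X and that
we can prove
$$\int_{V_i} d(\alpha_i \omega) = \int_{V_i \cap \partial X} \alpha_i \omega,$$
in other words we can prove Stokes' theorem locally in $V_i$. We can write
$$\int_{V_i \cap \partial X} \alpha_i \omega = \int_{\partial X} \alpha_i \omega,$$
and similarly
$$\int_{V_i} d(\alpha_i \omega) = \int_X d(\alpha_i \omega).$$
Using the additivity of the integral once more, we get
$$\int_X d\omega = \sum_{i \in I} \int_X d(\alpha_i \omega) = \sum_{i \in I} \int_{\partial X} \alpha_i \omega = \int_{\partial X} \omega,$$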
which yields Stokes' theorem on the whole manifold. Thus our argument
with partitions of unity reduces Stokes' theorem to the local case, namely
it suffices to prove that for each point of X there exists an open neighborhood
V such that if $\omega$ has compact support in V, then Stokes' theorem
holds with X replaced by V. We now do this.
If the point is not a boundary point, we take an oriented chart $(U, \varphi)$
at the point, containing an open neighborhood V of the point, satisfying
the following conditions: $\varphi U$ is an open ball, and $\varphi V$ is the interior of a
rectangle, whose closure is contained in $\varphi U$. If $\omega$ has compact support
in V, then its local representation in $\varphi U$ has compact support in $\varphi V$.
Applying Stokes' theorem for rectangles as proved in the preceding section,
we find that the two integrals occurring in Stokes' formula are
equal to 0 in this case (the integral over an empty boundary being equal
to 0 by convention).
Now suppose that we deal with a boundary point. We take an
oriented chart $(U, \varphi)$ at the point, having the following properties. First,
$\varphi U$ is described by the following inequalities in terms of local coordinates
$(x_1, \ldots, x_n)$:
$$-2 < x_1 \leq 1 \quad \text{and} \quad -2 < x_j < 2 \text{ for } j = 2, \ldots, n.$$
Next, the given point has coordinates $(1, 0, \ldots, 0)$, and that part of U on
the boundary of X, namely $U \cap \partial X$, is given in terms of these coordinates
by the equation $x_1 = 1$. We then let V consist of those points
whose local coordinates satisfy
$$-1 < x_1 \leq 1 \quad \text{and} \quad -1 < x_j < 1 \text{ for } j = 2, \ldots, n.$$
If $\omega$ has compact support in V, then $\omega$ is equal to 0 on the boundary of the
rectangle R equal to the closure of $\varphi V$, except on the face given by $x_1 = 1$,
which defines that part of the rectangle corresponding to $\partial X \cap V$. Thus
the support of $\omega$ looks like the shaded portion of the following picture.
[Figure: the rectangle R, with the support of $\omega$ shaded and touching only the face $x_1 = 1$.]
In the sum giving the integral over the boundary of a rectangle as in the
previous section, only one term will give a non-zero contribution, corresponding
to $i = 1$, which is
$$(-1) \left[ \int_{R_1^0} \omega - \int_{R_1^1} \omega \right].$$
Furthermore, the integral over $R_1^0$ will also be 0, and in the contribution
of the integral over $R_1^1$, the two minus signs will cancel, and yield the
integral of $\omega$ over the part of the boundary lying in V, because our
charts are so chosen that $(x_2, \ldots, x_n)$ is an oriented system of coordinates
for the boundary. Thus we find
$$\int_V d\omega = \int_{V \cap \partial X} \omega,$$
which proves Stokes' theorem locally in this case, and concludes the
proof of Theorem 5.1.
For any number of reasons, some of which we consider in the next
section, it is useful to formulate conditions under which Stokes' theorem
holds even when the form $\omega$ does not have compact support. We shall
say that $\omega$ has almost compact support if there exists a decreasing sequence
of open sets $\{U_k\}$ in X such that the intersection
$$\bigcap_{k=1}^{\infty} U_k$$
is empty, and a sequence of $C^1$ functions $\{g_k\}$, having the following
properties:
AC 1. We have $0 \leq g_k \leq 1$, $g_k = 1$ outside $U_k$, and $g_k \omega$ has compact
support.
AC 2. If $\mu_k$ is the measure associated with $|dg_k \wedge \omega|$ on X, then
$$\lim_{k \to \infty} \mu_k(U_k) = 0.$$
We then have the following application of Stokes' theorem.
Corollary 5.2. Let X be a $C^2$ oriented manifold, of dimension n, and let
$\omega$ be an $(n - 1)$-form on X, of class $C^1$. Assume that $\omega$ has almost
compact support, and that the measures associated with $|d\omega|$ on X and
$|\omega|$ on $\partial X$ are finite. Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
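Proof. By our standard form of Stokes' theorem we have
$$\int_{\partial X} g_k \omega = \int_X d(g_k \omega) = \int_X dg_k \wedge \omega + \int_X g_k\, d\omega.$$
We estimate the left-hand side by
$$\left| \int_{\partial X} \omega - \int_{\partial X} g_k \omega \right| = \left| \int_{\partial X} (1 - g_k)\omega \right| \leq \mu_{|\omega|}(U_k \cap \partial X).$$
Since the intersection of the sets $U_k$ is empty, it follows for a purely
measure-theoretic reason that
$$\lim_{k \to \infty} \int_{\partial X} g_k \omega = \int_{\partial X} \omega.$$
Similarly,
$$\lim_{k \to \infty} \int_X g_k\, d\omega = \int_X d\omega.$$
The integral of $dg_k \wedge \omega$ over X approaches 0 as $k \to \infty$ by assumption,
and the fact that $dg_k \wedge \omega$ is equal to 0 on the complement of $U_k$ since $g_k$
is constant on this complement. This proves our corollary.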
The above proof shows that the second condition AC 2 is a very

natural one to reduce the integral of an arbitrary form to that of a form
with compact support. In the next section, we relate this condition to a
question of singularities when the manifold is embedded in some bigger
space.
XXIII, §6. STOKES' THEOREM WITH SINGULARITIES
If X is a compact manifold, then of course every differential form on X
has compact support. However, the version of Stokes' theorem which we
have given is useful in contexts when we start with an object which is
not a manifold, say as a subset of $\mathbf{R}^n$, but is such that when we remove a
portion of it, what remains is a manifold. For instance, consider a cone
(say the solid cone) as illustrated in the next picture:
[Figure: a solid cone.]
The vertex and the circle surrounding the base disc prevent the cone
from being a submanifold of $\mathbf{R}^3$. However, if we delete the vertex and
this circle, what remains is a submanifold with boundary embedded in
$\mathbf{R}^3$. The boundary consists of the conical shell, and of the base disc
(without its surrounding circle). Another example is given by polyhedra,
as on the following figure.
[Figure: a polyhedron.]
The idea is to approximate a given form by a form with compact
support, to which we can apply Theorem 5.1, and then take the limit.
We shall indicate one possible technique to do this.
The word "boundary" has been used in two senses: the sense of point
set topology, and the sense of boundary of a manifold. Up to now, they
were used in different contexts so no confusion could arise. We must
now make a distinction, and therefore use the word boundary only in its
manifold sense. If X is a subset of $\mathbf{R}^N$, we denote its closure by $\bar{X}$ as
usual. We call the set theoretic difference $\bar{X} - X$ the frontier of X in $\mathbf{R}^N$,
and denote it by fr(X).
Let X be a submanifold without boundary of $\mathbf{R}^N$, of dimension n. We
know that this means that at each point of X there exists a chart for an
open neighborhood of this point in $\mathbf{R}^N$ such that the points of X in this
chart correspond to a factor in a product, just as in Chapter XXII, §2. A
point P of $\bar{X} - X$ will be called a regular frontier point of X if there
exists a chart at P in $\mathbf{R}^N$ with local coordinates $(x_1, \ldots, x_N)$ such that P
has coordinates $(0, \ldots, 0)$; the points of X are those with coordinates
$$x_{n+1} = \cdots = x_N = 0 \quad \text{and} \quad x_n < 0;$$
and the points of the frontier of X which lie in the chart are those with
coordinates satisfying
$$x_n = x_{n+1} = \cdots = x_N = 0.$$
The set of all regular frontier points of X will be denoted by $\partial X$, and
will be called the boundary of X. We may say that $X \cup \partial X$ is a submanifold
of $\mathbf{R}^N$, possibly with boundary.
A point of the frontier of X which is not regular will be called
singular. It is clear that the set of singular points is closed in $\mathbf{R}^N$. We
now formulate a version of Theorem 5.1 when $\omega$ does not necessarily
have compact support in $X \cup \partial X$. Let S be a subset of $\mathbf{R}^N$. By a
fundamental sequence of open neighborhoods of S we shall mean a sequence
$\{U_k\}$ of open sets containing S such that, if W is an open set
containing S, then $U_k \subset W$ for all sufficiently large k.
Let S be the set of singular frontier points of X and let $\omega$ be a form
defined on an open neighborhood of $\bar{X}$, and having compact support.
The intersection of supp $\omega$ with $(X \cup \partial X)$ need not be compact, so that
we cannot apply Theorem 5.1 as it stands. The idea is to find a fundamental
sequence of neighborhoods $\{U_k\}$ of S, and a function $g_k$ which is
0 on a neighborhood of S and 1 outside $U_k$ so that $g_k \omega$ differs from $\omega$
only inside $U_k$. We can then apply Theorem 5.1 to $g_k \omega$ and we hope
that taking the limit yields Stokes' theorem for $\omega$ itself. However, we
have
$$\int_X d(g_k \omega) = \int_X dg_k \wedge \omega + \int_X g_k\, d\omega.$$
Thus we have an extra term on the right, which should go to 0 as $k \to \infty$
if we wish to apply this method. In view of this, we make the following
definition.
Let S be a closed subset of $\mathbf{R}^N$. We shall say that S is negligible for X
if there exists an open neighborhood U of S in $\mathbf{R}^N$, a fundamental
sequence of open neighborhoods $\{U_k\}$ of S in U, with $U_k \subset U$, and a
sequence of $C^1$ functions $\{g_k\}$, having the following properties.
NEG 1. We have $0 \leq g_k \leq 1$. Also, $g_k(x) = 0$ for x in some open neighborhood
of S, and $g_k(x) = 1$ for $x \notin U_k$.
NEG 2. If $\omega$ is an $(n - 1)$-form of class $C^1$ on U, and $\mu_k$ is the measure
associated with $|dg_k \wedge \omega|$ on $U \cap X$, then $\mu_k$ is finite for large
k, and
$$\lim_{k \to \infty} \mu_k(U \cap X) = 0.$$
From our first condition, we see that $g_k \omega$ vanishes on an open neighborhood
of S. Since $g_k = 1$ on the complement of $U_k$, we have $dg_k = 0$ on
this complement, and therefore our second condition implies that the
measures induced on X near the singular frontier by $|dg_k \wedge \omega|$ (for $k =
1, 2, \ldots$), are concentrated on shrinking neighborhoods and tend to 0 as
$k \to \infty$.
Theorem 6.1 (Stokes' Theorem with Singularities). Let X be an oriented,
$C^2$ submanifold without boundary of $\mathbf{R}^N$. Let $\dim X = n$. Let $\omega$
be an $(n - 1)$-form of class $C^1$ on an open neighborhood of $\bar{X}$ in $\mathbf{R}^N$,
and with compact support. Assume that:
(i) If S is the set of singular points in the frontier of X, then
$S \cap \operatorname{supp} \omega$ is negligible for X.
(ii) The measures associated with $|d\omega|$ on X, and $|\omega|$ on $\partial X$, are finite.
Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
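Proof. Let U, $\{U_k\}$, and $\{g_k\}$ satisfy conditions NEG 1 and NEG 2.
Then $g_k \omega$ is 0 on an open neighborhood of S, and since $\omega$ is assumed to
have compact support, one verifies immediately that
$$(\operatorname{supp} g_k \omega) \cap (X \cup \partial X)$$
is compact. Thus Theorem 5.1 is applicable, and we get
$$\int_{\partial X} g_k \omega = \int_X d(g_k \omega) = \int_X dg_k \wedge \omega + \int_X g_k\, d\omega.$$
We have
$$\left| \int_{\partial X} \omega - \int_{\partial X} g_k \omega \right| \leq \int_{\partial X} |1 - g_k|\, d\mu_{|\omega|} \leq \mu_{|\omega|}(U_k \cap \partial X).$$
Since the intersection of all sets $U_k \cap \partial X$ is empty, it follows for purely
measure theoretic reasons that the limit of the right-hand side is 0 as
$k \to \infty$. Thus
$$\lim_{k \to \infty} \int_{\partial X} g_k \omega = \int_{\partial X} \omega.$$
For similar reasons, we have
$$\lim_{k \to \infty} \int_X g_k\, d\omega = \int_X d\omega.$$
Our second assumption NEG 2 guarantees that the integral of $dg_k \wedge \omega$
over X approaches 0. This proves our theorem.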
We shall now give criteria for a set to be negligible.
Criterion 1. Let S, T be compact negligible sets for a submanifold X of
$\mathbf{R}^N$ (assuming X without boundary). Then the union $S \cup T$ is negligible
for X.
Proof. Let U, $\{U_k\}$, $\{g_k\}$ and V, $\{V_k\}$, $\{h_k\}$ be triples associated with S
and T, respectively, as in conditions NEG 1 and NEG 2 (with V replacing
U and h replacing g when T replaces S). Let
$$W = U \cup V, \qquad W_k = U_k \cup V_k, \qquad \text{and} \qquad f_k = g_k h_k.$$
Then the open sets $\{W_k\}$ form a fundamental sequence of open neighborhoods
of $S \cup T$ in W, and NEG 1 is trivially satisfied. As for NEG 2, we
have
$$d(g_k h_k) \wedge \omega = h_k\, dg_k \wedge \omega + g_k\, dh_k \wedge \omega,$$
so that NEG 2 is also trivially satisfied, thus proving our criterion.
Criterion 2. Let X be an open set, and let S be a compact subset in $\mathbf{R}^n$.
Assume that there exists a closed rectangle R of dimension $m \le n - 2$
and a $C^1$ map $\sigma: R \to \mathbf{R}^n$ such that $S = \sigma(R)$. Then S is negligible for X.
Before giving the proof, we make a couple of simple remarks. First,
we could always take $m = n - 2$, since any parametrization by a rectangle
of dimension $< n - 2$ can be extended to a parametrization by a
rectangle of dimension $n - 2$ simply by projecting away extra coordinates.
Second, by our first criterion, we see that a finite union of sets as
described above, i.e. parametrized smoothly by rectangles of codimension
$\ge 2$, is negligible. Third, our Criterion 2, combined with the first
criterion, shows that negligibility in this case is local, i.e. we can subdivide a
rectangle into small pieces.
We now prove Criterion 2. Composing $\sigma$ with a suitable linear map,
we may assume that R is a unit cube. We cut up each side of the cube
into k equal segments and thus get $k^m$ small cubes. Since the derivative
of $\sigma$ is bounded on a compact set, the image of each small cube is
contained in an n-cube in $\mathbf{R}^n$ of radius $\le C/k$ (by the mean value
theorem), whose n-dimensional volume is $\le (2C)^n/k^n$. Thus we can cover the
image by small cubes such that the sum of their n-dimensional volumes
is $\le (2C)^n/k^{n-m} \le (2C)^n/k^2$.
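The counting argument can be checked numerically in a small case. The sketch below is an illustration only: the curve `sigma`, the constant `C`, and the helper names are our own hypothetical choices (a $C^1$ curve in $\mathbf{R}^3$, so m = 1 and n = 3), not anything taken from the text. It covers the image by $k^m$ sup-norm cubes of radius C/k and confirms that the total volume equals $(2C)^n/k^{n-m}$.

```python
import math

N_DIM, M_DIM = 3, 1          # hypothetical case: a curve (m = 1) in R^3 (n = 3)
C = 2 * math.pi              # bound for the sup norm of sigma' on [0, 1]


def sigma(t):
    """Hypothetical C^1 parametrization of S by the unit interval."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t), t)


def covering_volume(k):
    """Total volume of k^m sup-norm cubes of radius C/k (side 2C/k)."""
    radius = C / k
    return (k ** M_DIM) * (2 * radius) ** N_DIM


def covers(k, samples_per_piece=20):
    """Check on sample points that each piece of the image lies in its cube."""
    radius = C / k
    for i in range(k):
        center = sigma((i + 0.5) / k)
        for j in range(samples_per_piece + 1):
            p = sigma((i + j / samples_per_piece) / k)
            if max(abs(a - b) for a, b in zip(p, center)) > radius:
                return False
    return True


if __name__ == "__main__":
    for k in (10, 20, 40):
        print(k, covers(k), covering_volume(k))
```

With n - m = 2 the printed volumes drop by a factor of 4 each time k doubles, which is the $1/k^2$ decay used in the proof.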
Lemma 6.2. Let S be a compact subset of $\mathbf{R}^n$. Let $U_k$ be the open set
of points x such that $d(x, S) < 2/k$. There exists a $C^\infty$ function $g_k$ on $\mathbf{R}^n$
which is equal to 0 in some open neighborhood of S, equal to 1 outside
$U_k$, $0 \le g_k \le 1$, and such that all partial derivatives of $g_k$ are bounded
by $C_1 k$, where $C_1$ is a constant depending only on n.
Proof. Let $\varphi$ be a $C^\infty$ function such that $0 \le \varphi \le 1$, and
$$\varphi(x) = 0 \ \text{ if } 0 \le \|x\| \le 1/2, \qquad \varphi(x) = 1 \ \text{ if } 1 \le \|x\|.$$
We use $\| \ \|$ for the sup norm in $\mathbf{R}^n$.
[Figure: graph of $\varphi$, with ticks at -1, -1/2, 1/2, 1 on the horizontal axis.]
For each positive integer k, let $\varphi_k(x) = \varphi(kx)$. Then each partial
derivative $D_i\varphi_k$ satisfies the bound
$$\|D_i\varphi_k\| \le k \|D_i\varphi\|,$$
which is thus bounded by a constant times k. Let L denote the lattice of
integral points in $\mathbf{R}^n$. For each $l \in L$, we consider the function
$$x \mapsto \varphi_k\!\left(x - \frac{l}{2k}\right).$$
This function has the same shape as $\varphi_k$ but is translated to the point
$l/2k$. Consider the product
$$g_k(x) = \prod \varphi_k\!\left(x - \frac{l}{2k}\right)$$
taken over all $l \in L$ such that $d(l/2k, S) \le 1/k$. If x is a point of $\mathbf{R}^n$ such
that $d(x, S) < 1/4k$, then we pick an l such that
$$d(x, l/2k) \le 1/2k.$$
For this l we have $d(l/2k, S) < 1/k$, so that this l occurs in the product,
and
$$\varphi_k(x - l/2k) = 0.$$
Therefore $g_k$ is equal to 0 in an open neighborhood of S. If on the other
hand we have $d(x, S) > 2/k$ and if l occurs in the product, that is
$$d(l/2k, S) \le 1/k,$$
then
$$d(x, l/2k) > 1/k$$
and hence $g_k(x) = 1$. The partial derivatives of $g_k$ are bounded in the
desired manner. This is easily seen, for if $x_0$ is a point where $g_k$ is not
identically 1 in a neighborhood of $x_0$, then $\|x_0 - l_0/2k\| \le 1/k$ for some
$l_0$. All other factors $\varphi_k(x - l/2k)$ will be identically 1 near $x_0$ unless
$\|x_0 - l/2k\| \le 1/k$. But then $\|l - l_0\| \le 4$, whence the number of such l is
bounded as a function of n (in fact by $9^n$). Thus when we take the
derivative, we get a sum of at most $9^n$ terms, each one having a derivative
bounded by $C_1 k$ for some constant $C_1$. This proves our lemma.
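The construction in this proof can be tried out in dimension n = 1. The following is a minimal sketch under stated assumptions: the particular smooth step used to build `phi`, the finite set `S`, and all function names are hypothetical choices made for illustration, and only the product over lattice points l with $d(l/2k, S) \le 1/k$ follows the text.

```python
import math


def smooth_step(t):
    """C-infinity step: equals 0 for t <= 0 and 1 for t >= 1."""
    def h(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return h(t) / (h(t) + h(1.0 - t))


def phi(x):
    """One possible phi: 0 for |x| <= 1/2, 1 for |x| >= 1, values in [0, 1]."""
    return smooth_step(2.0 * abs(x) - 1.0)


def dist(x, S):
    """Distance from the point x to the finite set S (n = 1)."""
    return min(abs(x - s) for s in S)


def make_g(k, S):
    """Return g_k as the product of phi(k*x - l/2) over the admissible l."""
    lo = math.floor(2 * k * (min(S) - 1.0 / k))
    hi = math.ceil(2 * k * (max(S) + 1.0 / k))
    lattice = [l for l in range(lo, hi + 1)
               if dist(l / (2 * k), S) <= 1.0 / k]

    def g(x):
        value = 1.0
        for l in lattice:
            value *= phi(k * x - l / 2.0)
        return value

    return g


if __name__ == "__main__":
    g = make_g(10, [0.0, 0.3])
    for x in (0.0, 0.02, 0.15, 0.32, 0.6):
        print(x, g(x))
```

Points within 1/4k of S give the value 0 and points farther than 2/k from S give the value 1, matching the two cases treated in the proof; in between, the value depends on the chosen `phi`.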
We return to the proof of Criterion 2. We observe that when an
$(n-1)$-form $\omega$ is expressed in terms of its coordinates,
$$\omega(x) = \sum_j f_j(x) \, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n,$$
then the coefficients $f_j$ are bounded on a compact neighborhood of S.
We take $U_k$ as in the lemma. Then for k large, each function
$$x \mapsto f_j(x) \, D_j g_k(x)$$
is bounded on $U_k$ by a bound $C_2 k$, where $C_2$ depends on a bound for
$\omega$, and on the constant of the lemma. The Lebesgue measure of $U_k$ is
bounded by $C_3/k^2$, as we saw previously. Hence the measure of $U_k$
associated with $|dg_k \wedge \omega|$ is bounded by $C_4/k$ for a suitable constant $C_4$,
and tends to 0 as $k \to \infty$.
This proves our criterion.
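The final estimate can be written out explicitly. The display below is a sketch; the explicit constant $nC_2C_3$ is our own bookkeeping, coming from the n terms of the sum, and is not a constant named in the text. Since $dg_k = \sum_j D_j g_k \, dx_j$, we have

$$dg_k \wedge \omega = \Big( \sum_{j=1}^n (-1)^{j-1} f_j \, D_j g_k \Big) dx_1 \wedge \cdots \wedge dx_n,$$

and $dg_k = 0$ outside $U_k$, so the measure $\mu_k$ associated with $|dg_k \wedge \omega|$ satisfies

$$\mu_k(U_k) \le \int_{U_k} \sum_{j=1}^n |f_j \, D_j g_k| \, dx \le n \, C_2 k \cdot \frac{C_3}{k^2} = \frac{n C_2 C_3}{k}.$$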
As an example, we now state a simpler version of Stokes' theorem,
applying our criteria.
Theorem 6.3. Let X be an open subset of $\mathbf{R}^n$. Let S be the set of
singular points in the closure of X, and assume that S is the finite union
of $C^1$ images of m-rectangles with $m \le n - 2$. Let $\omega$ be an $(n-1)$-form
defined on an open neighborhood of $\overline{X}$. Assume that $\omega$ has compact
support, and that the measures associated with $|\omega|$ on $\partial X$ and with $|d\omega|$
on X are finite. Then
$$\int_X d\omega = \int_{\partial X} \omega.$$
Proof. Immediate from our two criteria and Theorem 6.1.
We can apply Theorem 6.3 when, for instance, X is the interior of a
polyhedron, whose interior is open in $\mathbf{R}^n$. When we deal with a
submanifold X of dimension n, embedded in a higher dimensional space $\mathbf{R}^N$,
then one can reduce the analysis of the singular set to Criterion 2
provided that there exists a finite number of charts for X near this
singular set on which the given form $\omega$ is bounded. This would for
instance be the case with the surface of our cone mentioned at the
beginning of the section. Criterion 2 is also the natural one when dealing
with manifolds defined by algebraic inequalities. By using the resolution
of singularities due to Hironaka one can parametrize a compact set of
algebraic singularities as in Criterion 2.
Finally, we note that the condition that $\omega$ have compact support in an
open neighborhood of $\overline{X}$ is a very mild condition. If for instance X is a
bounded open subset of $\mathbf{R}^n$, then $\overline{X}$ is compact. If $\omega$ is any form on
some open set containing $\overline{X}$, then we can find another form $\eta$ which is
equal to $\omega$ on some open neighborhood of $\overline{X}$ and which has compact
support. The integrals of $\eta$ entering into Stokes' formula will be the
same as those of $\omega$. To find $\eta$, we simply multiply $\omega$ with a suitable $C^\infty$
function which is 1 in a neighborhood of $\overline{X}$ and vanishes a little further
away. Thus Theorem 6.3 provides a reasonably useful version of Stokes'
theorem which can be applied easily to all the cases likely to arise
naturally.
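As a sanity check of Theorem 6.3 in the simplest polyhedral case, one can take X to be the open unit square in $\mathbf{R}^2$, whose four corners form the singular set (images of 0-dimensional rectangles, so m = 0 = n - 2). The sketch below is an illustration only: the form $\omega = P\,dx + Q\,dy$ with $P = -y^3$ and $Q = x^2 y$ is a hypothetical example, and the midpoint rule is just one convenient way to approximate both sides.

```python
def P(x, y):
    """Coefficient of dx in the hypothetical 1-form omega."""
    return -y ** 3


def Q(x, y):
    """Coefficient of dy in the hypothetical 1-form omega."""
    return x ** 2 * y


def d_omega(x, y):
    """Coefficient of dx ^ dy in d(omega): dQ/dx - dP/dy."""
    return 2 * x * y + 3 * y ** 2


def interior_integral(n=200):
    """Midpoint-rule approximation of the integral of d(omega) over the square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += d_omega((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h


def boundary_integral(n=200):
    """Midpoint-rule approximation of the integral of omega over the
    counterclockwise boundary, one edge at a time."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += P(t, 0.0) * h   # bottom edge, traversed left to right
        total += Q(1.0, t) * h   # right edge, traversed upward
        total -= P(t, 1.0) * h   # top edge, traversed right to left
        total -= Q(0.0, t) * h   # left edge, traversed downward
    return total


if __name__ == "__main__":
    print(interior_integral(), boundary_integral())
```

Both integrals equal 3/2 exactly for this choice of P and Q, and the two approximations agree to within the quadrature error, even though the boundary is not smooth at the corners.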
Bibliography
[A-B] M. ATIYAH and R. BOTT, "The Lefschetz fixed point theorem for
elliptic complexes," Annals of Math., 86 (1967) pp. 374-407.
[A-Si] M. ATIYAH and I. SINGER, "The index of elliptic operators," Annals of
Math., 87 (1968) pp. 484-530, 546-604.
[A-Se] M. ATIYAH and G. SEGAL, "The index of elliptic operators," Annals of
Math., 87 (1968) pp. 531-545.
[Ab-R] R. ABRAHAM and JOEL ROBBIN, Transversal Mappings and Flows,
Benjamin, New York, 1967.
[Ba] K. BARNER, Einfuhrung in die Analytische Zahlentheorie, 1990.
[BGV] N. BERLINE, E. GETZLER, and M. VERGNE, Heat Kernels and Dirac
Operators, Grundlehren der Math. Wiss. 298, Springer-Verlag, New
York, 1992. (This book contains an extensive useful bibliography.)
[Bo] N. BOURBAKI, General Topology, Addison-Wesley, Reading, Mass.,
1968.
[Di] J. DIEUDONNE, Foundations of Modern Analysis, Academic Press, New
York, 1960.
[Din] N. DINCULEANU, Vector Measures, Veb Deutscher Verlag, Berlin, 1966.
[Dix] J. DIXMIER, Les C*-Algebres et Leurs Representations, Gauthier Villars,
Paris, 1964.
[Du] N. DUNFORD and J. SCHWARTZ, Linear Operators, Interscience, New
York, 1958.
[Fa] L. FADDEEV, Expansion in Eigenfunctions of the Laplace Operator on
the Fundamental Domain of a Discrete Group on the Lobacevskii
Plane, AMS Transl. Trudy (1967) pp. 357-386.
[Fal] G. FALTINGS, Lectures on the Arithmetic Riemann-Roch Theorem,
Annals of Math. Studies, 127, Princeton University Press, Princeton,
NJ, 1992.
[Fo 1] G.B. FOLLAND, Real Analysis, Wiley-Interscience, New York, 1984.
[Fo 2] G.B. FOLLAND, Introduction to Partial Differential Equations,
Mathematical Notes, Princeton University Press, Princeton, NJ, 1976.
[Ge-R] I. GELFAND, D. RAIKOV, and G. SHILOV, Commutative Normed Rings,
Chelsea, New York, 1964.
[Gui] V. GUILLEMIN, Some Classical Theorems in Spectral Theory Revisited,
In seminar on singularities of solutions of linear partial differential
equations (edited by Lars Hormander), Annals of Math. Studies, 91,
Princeton University Press, Princeton, NJ, 1979.
[Ha] R.S. HAMILTON, "The inverse function theorem of Nash and Moser,"
Bull. AMS, 7 (1982) pp. 65-222.
[Hi] W. HILDENBRAND, Core and Equilibria of a Large Economy, Princeton
University Press, Princeton, NJ, 1974.
[Hor] L. HORMANDER, Linear Partial Differential Operators, Springer-Verlag,
Berlin, 1964.
[How] R. HOWE, "The Oscillator Semigroup," Proc. Symp. Pure Math. Vol. 48
(1988), Amer. Math. Soc., Providence, RI.
[HowT] R. HOWE and E.C. TAN, Non-Abelian Harmonic Analysis, Universitext,
Springer-Verlag, New York, 1992.
[JoL] J. JORGENSON and S. LANG, "Analytic properties of regularized prod-
ucts: Part II, Fourier theoretic properties," to appear.
[Ke] J.L. KELLEY, General Topology, Van Nostrand, New York, 1955.
[Ku] T. KUBOTA, Introduction to Eisenstein Series, Halsted Press, New York,
1973.
[Ku] R. KUNZE and I. SEGAL, Integrals and Operators, McGraw-Hill, New
York, 1968. Second edition, Springer-Verlag, New York, 1978.
[La 1] S. LANG, Undergraduate Analysis, Springer-Verlag, New York, 1983.
[La 2] S. LANG, Differential Manifolds, Addison-Wesley, Reading, Mass., 1972.
Reprinted by Springer-Verlag, New York, 1983.
[La 3] S. LANG, $SL_2(\mathbf{R})$, Addison-Wesley, Reading, Mass., 1975. Reprinted by
Springer-Verlag, New York, 1983.
[La 4] S. LANG, "Fonctions implicites et plongements Riemanniens," Seminaire
Bourbaki No. 257, 1961-1962.
[La-T] S. LANG and H. TROTTER, Frobenius Distributions in $GL_2$-Extensions
(Springer Lecture Notes 504), Springer-Verlag, New York, 1976.
[Lo] L. LOOMIS, Abstract Harmonic Analysis, Van Nostrand, New York,
1953.
[Lo-S] L. LOOMIS and S. STERNBERG, Advanced Calculus, Addison-Wesley,
Reading, Mass., 1968.
[Mi] J. MILNOR, "Morse theory," Annals of Math. Studies, 51, Princeton
University Press, Princeton, NJ, 1963.
[Mo 1] J. MOSER, "On a theorem of V. Anosov," J. Differential Equations, 5
(1969) pp. 411-440.
[Mo 2] J. MOSER, "A new technique for the construction of solutions of non-
linear differential equations," Proc. NAS 47 (1961) pp. 1824-1831.
[Nas] J. NASH, "The embedding problem for Riemannian manifolds," Annals
of Math., 63 (1956) pp. 20-63.
[Nat] I. NATANSON, Theory of Functions of a Real Variable, Ungar, New
York (1955), Chapter VIII.
[Pa 1] R. PALAIS, "Seminar on the Atiyah-Singer index theorem," Annals of
Math. Studies, 57, Princeton University Press, Princeton, NJ, 1965.
[Pa 2] R. PALAIS, "Morse theory on Hilbert manifolds," Topology (1963) pp.
299-340.
[Pr] "Principio di minimo e sue applicazioni alle equazioni funzionali,"
C.I.M.E. Pisa, 1958.
[Pro] "Proceedings of the conference on global analysis," Berkeley, 1968;
Amer. Math. Soc., Providence RI, 1969.
[Ri] M. RIEFFEL, "The Radon-Nikodym theorem for the Bochner integral,"
Transactions AMS, 131, No.2 (1968) pp. 466-487.
[R-N] F. RIESZ and B. NAGY, Functional Analysis, Ungar, New York, 1955.
[Ro] H. ROYDEN, Real Analysis, Macmillan, New York, 1988.
[Ru 1] W. RUDIN, Real and Complex Analysis, McGraw-Hill, New York, 1966.
[Ru 2] W. RUDIN, Fourier Analysis on Groups, Interscience, New York, 1962.
[Sa] S. SAKS, Theory of the Integral, 1947.
[Sch 1] L. SCHWARTZ, Theory of Distributions, Hermann, Paris, 1957.
[Sch 2] L. SCHWARTZ, Mathematics for Physical Sciences, Addison-Wesley,
Reading, Mass., 1966.
[Sh] M.A. SHUBIN, Pseudodifferential Operators and Spectral Theory,
Springer-Verlag, New York, 1987.
[Sm 1] S. SMALE, "Morse theory and a non-linear generalization of the
Dirichlet problem," Annals of Math. (1964) pp. 382-396.
[Sm 2] S. SMALE, "Differentiable dynamical systems," Bulletin AMS (1967) pp.
747-817.
[Sm 3] S. SMALE, "An infinite dimensional version of Sard's theorem," Amer. J.
Math. (1965) pp. 861-866.
[Smi] T.K. SMITH, A primer of Mathematical Analysis, Springer-Verlag, New
York, 1983.
[Sp] M. SPIVAK, Calculus on Manifolds, Benjamin, New York, 1965.
[Ti] E.C. TITCHMARSH, Theory of the Fourier Integral, Cambridge
University Press, Cambridge, UK, 1937.
[W] A. WEIL, L'Integration dans les Groupes Topologiques et ses
Applications, Hermann, Paris, 1938.
[Zy] A. ZYGMUND, Trigonometrical Series, First Edition 1935; Second Edi-
tion, Chelsea, New York, 1952.
Table of Notation
$A^*$: adjoint, p. 106
$B(X, E)$: bounded mappings of X into E, p. 19
$BV(\mathbf{R})$: space of functions of bounded variation on $\mathbf{R}$, p. 284
$C^p$: p-times continuously differentiable mappings, p. 346
$C_c(X)$: space of continuous functions with compact support on X,
p. 252
$C_c^\infty(X)$: infinitely differentiable functions with compact support on
X, pp. 167, 296
$D_A$: domain of an unbounded operator A, p. 469
End(E): continuous linear maps of E into itself, p. 73
$f^\wedge$ or $\hat{f}$: Fourier transform of f, pp. 238, 288
$f_A$: the function or mapping which is equal to f on A and 0
outside A, p. 4
The same notation is also used in the theory of the double
Fourier transform, for a truncation of the double Fourier
transform, p. 288
$f * g$: convolution, p. 223
Hilb(E): norm-preserving automorphisms of a Hilbert space E, p. 439
Int: interior, pp. 22, 497
L(E, F): space of continuous linear maps of E into F, p. 65
Laut(E): continuous linear automorphisms of E, p. 67
Lis(E, F): continuous linear isomorphisms of E with F, p. 67
$\mathcal{L}^1$: space of absolutely integrable functions or mappings, p. 128
$L^1$: equivalence classes of the above, p. 128
$L^p$: similar to the above, p. 209
M1(A, E): E-valued measures, p. 199
M1($\mu$, E): $\mu$-continuous E-valued measures, p. 205
$\|m\|$: norm of a measure m, pp. 195, 199
$\mu_v$: spectral measure, p. 480
$\mu_{v,w}$: spectral measure, p. 481
$\mu * \nu$: convolution of measures, p. 275
RS(g): space of Riemann-Stieltjes integrable functions with respect
to g, p. 282
St($\mu$, E): space of step mappings from X into E, p. 122
$\sigma(x)$: spectrum of x, p. 400
$T_f$: distribution associated with a function f, p. 297
$\chi_A$: characteristic function of A, p. 3
$(X, \mathcal{M}, \mu)$: measured space, p. 120
V(f): variation of f, pp. 279, 284
<: p. 255
Index
A space 51
Barner's theorem 291
Absolutely continuous 191, 199 Base 23
Adherent 22 Bessel function 245
Adjoint 106, 392, 438, 469 inequality 102
of differential operator 303 Bijective 3
Alaoglu's theorem 71 Bilinear 67, 91
Algebra 51, 72, 113 Block 498
of functions 51 Bonnet mean value 286
of subsets 113 Borel set or measurable 114
Algebra automorphism 61 Bound of linear map 65
Almost all 122 Boundary
Almost compact support 560 of manifold 541
Almost everywhere 122 point 22
Alternating 507 Bounded 18
product 507, 509 functional 253
Anosov theorem 381 linear map 65, 252
Antifunctional 391 measure 199
Antilinear 95, 391 variation 279, 284
Approximate 129 Bourbaki's theorem 13
Approximation Bruhat-Tits 50
Dirac 228
LI 147
Stone-Weierstrass 52, 61, 62
Arcwise connected 29
c
Ascoli's theorem 57 Cc-functional 253
Atlas 523 CP-invertible 361
Automorphism 67 CP-isomorphism 361
Averaging theorem 145, 221 C·-algebra 410
Calculus of variation 358
B Cantor set 49
Carathi:odory
Baire's theorem 387 criterion 179
Banach algebra 73 measure 179
isomorphism 67 Carried (measure) 179, 192
Cesaro summation 230 Differentiating under integral sign

Change of variables formula 505 175, 225, 355
Character 327, 407 Differentiation of sequence 356
Characteristic function 3 Dini's theorems 60, 178, 254
Chart 523 Dirac
Circumcenter 50 distribution 298
Closed ball 20 family 228, 485
graph theorem 395 measure 120
map 48 sequence 227
operator 469 Direct image
set 21 of a measurable space 114
Closure 22 of a measure 174
Codimension 390 Direct sum 389
Commute 81 Discrete
Compact 31 subgroup 312
groups 459 support 305
hermitian operator 442 topology 18
operator 415 Distance 19, 45
support 167, 252 Distribution 296
Complement 6 Dominated convergence theorem
Complementary subspace 389 141, 184, 210
Complete 21 , 51 Dual
metric space 21 exponent 209
Completion 77 space 68
of a measure 173 Duality
Concentrated (measure) 192 LI 185, 190, 220
Connected U 211
component 29
set 27 E
Continuous 24
Converges Egoroff's theorem 172
strongly 435 Eigenfunction 108
weakly 107, 435 Eigenspace 442
Convex 84 Eigenvalue 426
Convolution 73, 176, 223, 234, 239 Eigenvector 426
of measures 275, 327 Endomorphism 66, 73
Countable 10 Enumerate 7
Countably additive 120, 196 Equicontinuous 57
Counting measure 120 Equivalent
Covering 21 maps (for a measure) 139
norms 20,38
D Essential
image 221
Decomposable form 507, 508 sup 185, 218
Decomposition 195 Essentially self-adjoint 472
Dense 22 Exterior derivative 410
Denumerable 7 Extreme point 86
Derivation of a distribution 446
Derivative 334 F
Differentiable 333
Differential 534 Family 4
equation 135 Fatou's lemma 141
form 508, 547 Finite intersection property 31
form on a manifold 503 Flow 366, 377
Fourier I
coefficients 98, 176
inversion 241, 289 Ideal 55
series 230 Ideal topology 21
transform 238, 288 Image 3
Fredholm operator 417 Implicit mapping theorem 364, 532
Frontier 562 Index 420
Fubini's theorem 162 Induced topology 23
Function 4 Inductively ordered 12
Functionals 68, 104 Initial condition 366, 371
Fundamental lemma of integration Injective 3
111,129 Integrable 132
Integral
curve 544
G equation 432
Gelfand-Mazur theorem 402 general theory 129
Gelfand-Naimark theorem 411 in one variable 331
Gelfand transform 407, 409 Lebesgue 166
Generated (-algebra) 114 mean value theorem 286
Global flow 377, 545 of step maps 126
Gradient 383 Integral curve 365, 544
Integral operators 213,432, 478
Integration by parts 282
H Interior 22, 497
HP spaces 80 Invariant subspace 81,442, 450
Hs spaces 219 Inverse image 6
Haar functional and measure 313, 324 of differential form 513
Hahn-Banach theorem 70 Inverse mapping theorem 361
Hahn decomposition 203 Invertible 66, 74
Hahn's theorem 153 Irreducible representation 459
Half space 85, 539 Isometry 67
Harmonic functions 231 Isomorphism 66, 361, 527
Hausdorff
measure 180 J
space 32
Heat Jacobian 503
equation 232, 234, 248
operator 232, 235, 248, 449, 485 K
Hermite polynomials 276
Hermitian Karamata theorem 277
form 95 Kernel 87
operator 107, 438 Kolmogoroff inequality 215
Hilbert Krein-Milman theorem 88
basis 98
Nullstellensatz 57 L
space 99
Hilbertian operator 439 Ll 129
Hilbert-Schmidt operators 435, 461 Ll seminorm 19, 128
Holder condition 44 L2 181
Holder inequality 210 L2 bound 108, 218, 220, 437
Homeomorphism 25 L2 norm 97, 182
Homogeneous space 323 U 209
Hyperbolic 381 Laguerre polynomials 276
Hyperplane 539 Landau approximation 229
Laplace operator 476 Modular function 327

Lattice points 249 Monotone
Least upper bound 12 convergence theorem 139
Lebesgue integral 166 families 174
measure 167 Morphism 527
theorem 75 Morse- Palais lemma 455
Lim inf 140 Multilinear 68
Linear differential equations 384
extension theorem 75
N
Lipschitz condition 366, 497
Local Negative definite 450
coordinates 524 Negligible 563
flow 366 Neighborhood 24
isomorphism 361, 527 Newton's method 380
order 302 Non-degenerate 188
representation of vector field 544 critical point 455
Localization of measure 270 Non-measurable set 177
Locally Non-singular 455
closed 527 Norm 18
compact 39 Normal 33
compact groups 313 Normed
finite 271 , 536 algebra 73
integrable 170 vector space 18
invertible 361 Null space 96
zero (distribution) 299
Lorch's theorem 468 o
Lusin's theorem 266
One point compactification 40
M Open
ball 18
Jl-continuous 199 covering 31
Jl-measurable 123, 155 mapping theorem 388
Manifold 524 set 17
with boundary 540 Operator 66, 73, 105
Mapping 3 Order of distribution 297
Marriage problem 49 Ordering 10
Maximal element 11 Ordinary topology 18
Mazur's theorem 88 Orientation 551
Mean value theorem 341 Oriented volume 499
Measurable Orthogonal 96, 187
map 114 basis 98
set 113, 155 decomposition 102
space 113 family 98
Measure 120 measures 192
associated with a differential form projection 101, 450
554 Orthonormal 98
associated with a functional 263 Outer measure 154, 259
Measure 0 497
Measured space 120 P
Mehler family 233
Metric space 19,45 Parallelogram 98, 100
Metrizable 45 Parameters 370
Midpoint 50 Parseval formula 243
Partial Rieffel's theorem 208

derivatives 352 Riemann-Lebesgue lemma 176,287,
isometry 454 291
Partition 122 Riemann-Stieltjes
Partition of unity 263, 270, 271, 536 integral 281
Peano curve 50 measure 285
Perpendicular 96 Riesz theorem 256, 264, 268
Piecewise continuous 29
Poisson s
family 231
kernel 231 a-algebra 112
summation formula 244 a-finite 123, 137, 148
Polar decomposition 454, 460 a-regular measure 256
Polarization 107 Schroedinger operator 233
Positive Schur's lemma 452
definite 96, 450 Schwartz space 236
functional 252, 265 Schwarz inequality 96
measurable maps 173 Second derivative 344
measure 119,446 Self adjoint 107, 470
operator 441, 446 Semilinear 95
Pre-Hilbert space 99 Seminorm 44
Product measure 160, 177, 214 Semi parallelogram law 50
on locally compact spaces 272 Separable 24, 47
Product topology 26 Separate
Projection 101,450, 166 closed sets 33
Proper points 33, 52
map 48 Separation by continuous functions
subset 3 40
Pythagoras' theorem 98 Sequence 4, 7
Sequentially compact 33
Sesquilinear 95
Q
Shrinking lemma 360
Quadratic form 107 Shub's theorem 380
Simple map 118
R Singular
measure 192
Radon- Nikodym point 562
derivative 218 Size of partition 278
theorem 192, 204 Skew symmetric 460
Rectangle 158, 167 Spectral
Refinement family 466, 490, 493
of covering 536 integral 492
of topology 23 measure 481
Regular measure 256, 265, 267 radius 406
Regular point 562 Spectral theorems
Regular space 48 bounded hermitian operators 447
Regularizing sequence 228 compact operators 426,431,443
Regulated map 332 self-adjoint operators 470
Relatively compact 35, 415 Spectrum 400, 442, 446
Relatively invariant 323 Sphere 20
Representation by a measure 191, Square root of operator 446
195 Step map 122, 148, 184, 331
Resolvant 412 Stieltjes integral 281, 491
Stokes' theorem 555, 558 Transition map 525

with singularities 563 Translation 170, 309
Stone- Weierstrass theorem 52,61,62, Transpose 106
273,446 Trivializing chart 536
Strictly inductively ordered 12 Tychonoff's theorem 37
Strong convergence 435
Subcovering 31
Submanifold 528 u
Subordinated 271, 536 Uniform 19
Subsequence 8 bounded ness 395
Subspace 22 convergence 19
Summation by parts 282 convergence on compact sets 46
Sup norm 18 Uniformly continuous 36, 309
Support 255, 299, 550 Unit 71,413
Surjective 3 Unit vector 97
Surjective mapping theorem 397 Unitary 439
Symmetric 438, 470 group 325, 328
Upper bound 11
T Urysohn's
lemma 40,45
Tangent metrization theorem 48
bundle 535
map 534
space 533 V
vector 533 Value 3
Taylor's formula 349 Variation 278
Theta function 248 function 279
Titchmarsh-Kodaira formula 487 Vector field 365, 544
Tietze extension theorem 42 Vectorial measure 199
Time-dependent vector field 369
Toplinear isomorphism 67
Topological W
group 308
isomorphism 25 Weak
space 21 convergence 107
Topology 17 topology 24, 71
Tornheim's proof 401 Weakly measurable 174
Total Weierstrass approximation 229
family 98 Weierstrass-Bolzano property 33
ordering 11
variation 197
Totally
z
bounded 35 Zariski topology 22
ordered 11 Zero 21
Trace 436, 462 Zero of ideal 56
class 462 Zorn's lemma 12, 16
