Advanced Calculus MV
Kenneth Kuttler
1 Introduction 9
4 Sequences 71
4.1 Vector Valued Sequences And Their Limits . . . . . . . . . . . . . . . . 71
4.2 Sequential Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.3 Closed And Open Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.4 Cauchy Sequences And Completeness . . . . . . . . . . . . . . . . . . . 78
4.5 Shrinking Diameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4 CONTENTS
5 Continuous Functions 85
5.1 Continuity And The Limit Of A Sequence . . . . . . . . . . . . . . . . . 88
5.2 The Extreme Values Theorem . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3 Connected Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4 Uniform Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.5 Sequences And Series Of Functions . . . . . . . . . . . . . . . . . . . . . 94
5.6 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.7 Sequences Of Polynomials, Weierstrass Approximation . . . . . . . . . . 99
5.7.1 The Tietze Extension Theorem . . . . . . . . . . . . . . . . . . . 102
5.8 The Operator Norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.9 Ascoli Arzela Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
15.2.4 A Formula For α∗ (n) . . . . . . . . . . . . . . . . . . . . . . . 426
15.3 Hausdorff Measure And Linear Transformations . . . . . . . . . . . . . . 428
15.4 The Area Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
15.4.1 Preliminary Results . . . . . . . . . . . . . . . . . . . . . . . . . 430
15.5 The Area Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
15.6 Area Formula For Mappings Which Are Not One To One . . . . . . . . 440
15.7 The Coarea Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
15.8 A Nonlinear Fubini’s Theorem . . . . . . . . . . . . . . . . . . . . . . . 452
Copyright © 2007,
Introduction
This book is directed to people who have a good understanding of the concepts of one
variable calculus including the notions of limit of a sequence and completeness of R. It
develops multivariable advanced calculus.
In order to do multivariable calculus correctly, you must first understand some linear
algebra. Therefore, a condensed course in linear algebra is presented first, emphasizing
those topics in linear algebra which are useful in analysis, not those topics which are
primarily dependent on row operations.
Many topics could be presented in greater generality than I have chosen to do. I have
also attempted to feature calculus, not topology although there are many interesting
topics from topology. This means I introduce the topology as it is needed rather than
using the possibly more efficient practice of placing it right at the beginning in more
generality than will be needed. I think it might make the topological concepts more
memorable by linking them in this way to other concepts.
After the chapter on the n dimensional Lebesgue integral, you can make a choice
between a very general treatment of integration of differential forms based on degree
theory in chapters 10 and 11 or you can follow an independent path through a proof
of a general version of Green’s theorem in the plane leading to a very good version of
Stokes’ theorem for a two dimensional surface by following Chapters 12 and 13. This
approach also leads naturally to contour integrals and complex analysis. I got this idea
from reading Apostol’s advanced calculus book. Finally, there is an introduction to
Hausdorff measures and the area formula in the last chapter.
I have avoided many advanced topics like the Radon Nikodym theorem, represen-
tation theorems, function spaces, and differentiation theory. It seems to me these are
topics for a more advanced course in real analysis. I chose to feature the Lebesgue
integral because I have gone through the theory of the Riemann integral for a function
of n variables and ended up thinking it was too fussy and that the extra abstraction of
the Lebesgue integral was worthwhile in order to avoid this fussiness. Also, it seemed
to me that this book should be in some sense “more advanced” than my calculus book
which does contain in an appendix all this fussy theory.
Some Fundamental Concepts
1. Two sets are equal if and only if they have the same elements.
2. To every set, A, and to every condition S (x) there corresponds a set, B, whose
elements are exactly those elements x of A for which S (x) holds.
3. For every collection of sets there exists a set that contains all the elements that
belong to at least one set of the given collection.
4. The Cartesian product of a nonempty family of nonempty sets is nonempty.

5. If A is a set there exists a set, P (A), such that P (A) is the set of all subsets of A.
This is called the power set.
These axioms are referred to as the axiom of extension, axiom of specification, axiom
of unions, axiom of choice, and axiom of powers respectively.
It seems fairly clear you should want to believe in the axiom of extension. It is
merely saying, for example, that {1, 2, 3} = {2, 3, 1} since these two sets have the same
elements in them. Similarly, it would seem you should be able to specify a new set from
a given set using some “condition” which can be used as a test to determine whether
the element in question is in the set. For example, the set of all integers which are
multiples of 2. This set could be specified as follows.
{x ∈ Z : x = 2y for some y ∈ Z} .
In this notation, the colon is read as “such that” and in this case the condition is being
a multiple of 2.
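The axiom of specification has a direct analogue in any language with set comprehensions. The following Python sketch (illustrative only, restricted to a finite slice of Z) builds the same set of multiples of 2:

```python
# The axiom of specification in miniature: from a finite slice of Z,
# select the elements x satisfying the condition "x = 2y for some y in Z".
A = set(range(-10, 11))
evens = {x for x in A if x % 2 == 0}
print(sorted(evens))  # the multiples of 2 between -10 and 10
```

The comprehension `{x for x in A if ...}` mirrors the notation {x ∈ A : S (x)} exactly: a new set is carved out of an existing one by a condition.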
Another example, of political interest, could be the set of all judges who are not
judicial activists. I think you can see this last is not a very precise condition, since
there is no precise test for deciding whether a given judge satisfies it.
So what is a condition?
We will leave these sorts of considerations and assume our conditions make sense.
The axiom of unions states that for any collection of sets, there is a set consisting of all
the elements in each of the sets in the collection. Of course this is also open to further
consideration. What is a collection? Maybe it would be better to say “set of sets” or,
given a set whose elements are sets there exists a set whose elements consist of exactly
those things which are elements of at least one of these sets. If S is such a set whose
elements are sets,
∪ {A : A ∈ S} or ∪ S
signify this union.
Something is in the Cartesian product of a set or “family” of sets if it consists of
a single thing taken from each set in the family. Thus (1, 2, 3) ∈ {1, 4, .2} × {1, 2, 7} ×
{4, 3, 7, 9} because it consists of exactly one element from each of the sets which are
separated by ×. Also, this is the notation for the Cartesian product of finitely many
sets. If S is a set whose elements are sets,
∏ {A : A ∈ S} or ∏_{A∈S} A
signifies the Cartesian product of the sets which are the elements of S.
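For finitely many factors, the Cartesian product can be checked concretely; here is a minimal Python sketch using the sets from the example above (reading the garbled “.2” in the text as 0.2, an assumption on my part):

```python
from itertools import product

# Cartesian product of finitely many sets: each element of the product
# takes exactly one entry from each factor.
S1, S2, S3 = {1, 4, 0.2}, {1, 2, 7}, {4, 3, 7, 9}
prod = set(product(S1, S2, S3))
print((1, 2, 3) in prod)                          # True: 1 ∈ S1, 2 ∈ S2, 3 ∈ S3
print(len(prod) == len(S1) * len(S2) * len(S3))   # True: 3 * 3 * 4 tuples
```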
Statements involving quantifiers can turn out to be illogical even though such usage may be grammatically correct. Quantifiers
are used often enough that there are symbols for them. The symbol ∀ is read as “for
all” or “for every” and the symbol ∃ is read as “there exists”. Thus ∀∀∃∃ could mean
for every upside down A there exists a backwards E.
DeMorgan’s laws are very useful in mathematics. Let S be a set of sets each of
which is contained in some universal set, U . Then
∪ {A^C : A ∈ S} = (∩ {A : A ∈ S})^C
and
∩ {A^C : A ∈ S} = (∪ {A : A ∈ S})^C .
These laws follow directly from the definitions. Also following directly from the defini-
tions are:
Let S be a set of sets then
B ∪ ∪ {A : A ∈ S} = ∪ {B ∪ A : A ∈ S} .
B ∩ ∪ {A : A ∈ S} = ∪ {B ∩ A : A ∈ S} .
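Both De Morgan’s laws and these distributive laws can be spot-checked on small finite sets; a minimal Python sketch (the names `U`, `S`, `B`, `C` are illustrative choices, with `U` playing the role of a local universal set for complements):

```python
# De Morgan's laws and the distributive laws on small finite sets.
U = set(range(10))
S = [{1, 2, 3}, {2, 4, 6}, {3, 6, 9}]
B = {0, 2, 3, 5}

def C(A):
    """Complement relative to U."""
    return U - A

union_all = set().union(*S)
inter_all = U.intersection(*S)

# union of complements = complement of intersection, and dually
print(set().union(*[C(A) for A in S]) == C(inter_all))     # True
print(U.intersection(*[C(A) for A in S]) == C(union_all))  # True

# B distributes over the union of the family
print((B | union_all) == set().union(*[B | A for A in S]))  # True
print((B & union_all) == set().union(*[B & A for A in S]))  # True
```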
Unfortunately, there is no single universal set which can be used for all sets. Here is
why: Suppose there were. Call it S. Then you could consider A the set of all elements
of S which are not elements of themselves, this from the axiom of specification. If A
is an element of itself, then it fails to qualify for inclusion in A. Therefore, it must not
be an element of itself. However, if this is so, it qualifies for inclusion in A so it is an
element of itself and so this can’t be true either. Thus the most basic of conditions you
could imagine, that of being an element of, is meaningless and so allowing such a set
causes the whole theory to be meaningless. The solution is to not allow a universal set.
As mentioned by Halmos in Naive set theory, “Nothing contains everything”. Always
beware of statements involving quantifiers wherever they occur, even this one. This little
observation described above is due to Bertrand Russell and is called Russell’s paradox.
X × Y ≡ {(x, y) : x ∈ X and y ∈ Y }
D (f ) ≡ {x : (x, y) ∈ f } ,
written as f : D (f ) → Y .
It is probably safe to say that most people do not think of functions as a type of
relation which is a subset of the Cartesian product of two sets. A function is like a
machine which takes inputs, x and makes them into a unique output, f (x). Of course,
that is what the above definition says with more precision. An ordered pair, (x, y)
which is an element of the function or mapping has an input, x and a unique output,
y, denoted as f (x), while the name of the function is f . The word “mapping” is often
a noun meaning function. However, it also is a verb as in “f is mapping A to B ”. That which
a function is thought of as doing is also referred to using the word “maps” as in: f maps
X to Y . However, a set of functions may be called a set of maps so this word might
also be used as the plural of a noun. There is no help for it. You just have to suffer
with this nonsense.
The following theorem which is interesting for its own sake will be used to prove the
Schroder Bernstein theorem.
Theorem 2.1.2 Let f : X → Y and g : Y → X be two functions. Then there
exist sets A, B, C, D, such that
A ∪ B = X, C ∪ D = Y, A ∩ B = ∅, C ∩ D = ∅,
f (A) = C, g (D) = B.
The following picture illustrates the conclusion of this theorem.
X Y
A —f→ C = f (A)
B = g(D) ←g— D
Definition 2.1.4 Let I be a set and let Xi be a set for each i ∈ I. f is a choice
function, written as
f ∈ ∏_{i∈I} X_i ,
if f (i) ∈ X_i for each i ∈ I.
The axiom of choice says that if X_i ≠ ∅ for each i ∈ I, for I a set, then
∏_{i∈I} X_i ≠ ∅.
Sometimes the two functions, f and g are onto but not one to one. It turns out that
with the axiom of choice, a similar conclusion to the above may be obtained.
Similarly g_0^{-1} is one to one. Therefore, by the Schroder Bernstein theorem, there exists
h : X → Y which is one to one and onto.
Thus the first element of X × Y is (x1 , y1 ), the second element of X × Y is (x1 , y2 ), the
third element of X × Y is (x2 , y1 ) etc. This assigns a number from N to each element
of X × Y. Thus X × Y is at most countable.
It remains to show the last claim. Suppose without loss of generality that X is
countable. Then there exists α : N → X which is one to one and onto. Let β : X × Y → N
be defined by β ((x, y)) ≡ α^{-1} (x). Thus β is onto N. By the first part there exists a
function from N onto X × Y . Therefore, by Corollary 2.1.5, there exists a one to one
and onto mapping from X × Y to N. This proves the theorem. ■
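The enumeration of X × Y described above can be made concrete: list the pairs (i, j) of positive integers diagonal by diagonal, in increasing order of i + j. A short Python sketch of this path:

```python
from itertools import count, islice

# Diagonal enumeration of N x N: list pairs (i, j) by increasing i + j,
# matching the order (x1,y1), (x1,y2), (x2,y1), (x1,y3), ...
def diagonal_pairs():
    for s in count(2):            # s = i + j, starting from 1 + 1
        for i in range(1, s):
            yield (i, s - i)

first = list(islice(diagonal_pairs(), 6))
print(first)  # [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```

Every pair (i, j) appears after finitely many steps, which is exactly the assignment of a natural number to each element of X × Y used in the proof.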
X = {x1 , x2 , x3 , · · ·}
and
Y = {y1 , y2 , y3 , · · ·} .
Consider the following array consisting of X ∪ Y and a path through it.
x1 → x2   x3 → · · ·
      ↙   ↗
y1 → y2
Thus the first element of X ∪ Y is x1 , the second is x2 the third is y1 the fourth is y2
etc.
Consider the second claim. By the first part, there is a map from N onto X ∪ Y .
Suppose without loss of generality that X is countable and α : N → X is one to one and
onto. Then define β (y) ≡ 1, for all y ∈ Y , and β (x) ≡ α^{-1} (x). Thus, β maps X ∪ Y
onto N and this shows there exist two onto maps, one mapping X ∪ Y onto N and the
other mapping N onto X ∪ Y . Then Corollary 2.1.5 yields the conclusion. This proves
the theorem. ■
1. x ∼ x. (Reflexive)
2. If x ∼ y then y ∼ x. (Symmetric)
3. If x ∼ y and y ∼ z, then x ∼ z. (Transitive)
Definition 2.1.10 [x] denotes the set of all elements of S which are equivalent
to x and [x] is called the equivalence class determined by x or just the equivalence class
of x.
With the above definition one can prove the following simple theorem.
Proof: Let sup ({An : n ∈ N}) = r. In the first case, suppose r < ∞. Then letting
ε > 0 be given, there exists n such that An ∈ (r − ε, r]. Since {An } is increasing, it
follows if m > n, then r − ε < An ≤ Am ≤ r and so limn→∞ An = r as claimed. In
the case where r = ∞, then if a is a real number, there exists n such that An > a.
Since {Ak } is increasing, it follows that if m > n, Am > a. But this is what is meant
by limn→∞ An = ∞. The other case is that r = −∞. But in this case, An = −∞ for all
n and so limn→∞ An = −∞. The case where An is decreasing is entirely similar. This
proves the lemma. ■
Sometimes the limit of a sequence does not exist. For example, if a_n = (−1)^n , then
lim_{n→∞} a_n does not exist. This is because the terms of the sequence are a distance
of 1 apart. Therefore there can’t exist a single number such that all the terms of the
sequence are ultimately within 1/4 of that number. The nice thing about lim sup and
lim inf is that they always exist. First here is a simple lemma and definition.
Definition 2.2.3 Denote by [−∞, ∞] the real line along with symbols ∞ and
−∞. It is understood that ∞ is larger than every real number and −∞ is smaller
than every real number. Then if {An } is an increasing sequence of points of [−∞, ∞] ,
limn→∞ An equals ∞ if the only upper bound of the set {An } is ∞. If {An } is bounded
above by a real number, then limn→∞ An is defined in the usual way and equals the
least upper bound of {An }. If {An } is a decreasing sequence of points of [−∞, ∞] ,
limn→∞ An equals −∞ if the only lower bound of the sequence {An } is −∞. If {An } is
bounded below by a real number, then limn→∞ An is defined in the usual way and equals
the greatest lower bound of {An }. More simply, if {An } is increasing,
lim_{n→∞} A_n = sup {A_n }
Lemma 2.2.4 Let {an } be a sequence of real numbers and let Un ≡ sup {ak : k ≥ n} .
Then {Un } is a decreasing sequence. Also if Ln ≡ inf {ak : k ≥ n} , then {Ln } is an
increasing sequence. Therefore, limn→∞ Ln and limn→∞ Un both exist.
Proof: Let Wn be an upper bound for {ak : k ≥ n} . Then since these sets are
getting smaller, it follows that for m < n, Wm is an upper bound for {ak : k ≥ n} . In
particular if Wm = Um , then Um is an upper bound for {ak : k ≥ n} and so Um is at
least as large as Un , the least upper bound for {ak : k ≥ n} . The claim that {Ln } is
increasing is similar. This proves the lemma. ■
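The tail suprema U_n and infima L_n of the lemma can be computed directly for a concrete sequence; the following Python sketch uses a_n = (−1)^n (1 + 1/n), with a long finite tail standing in for the full supremum (an approximation, adequate for this bounded example):

```python
# Tail suprema U_n = sup{a_k : k >= n} and infima L_n = inf{a_k : k >= n}
# for a_n = (-1)^n (1 + 1/n).
N = 2000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]

U = [max(a[k:]) for k in range(50)]  # decreasing, tending to lim sup = 1
L = [min(a[k:]) for k in range(50)]  # increasing, tending to lim inf = -1

print(all(U[k] >= U[k + 1] for k in range(49)))  # True
print(all(L[k] <= L[k + 1] for k in range(49)))  # True
```

The two monotone sequences converge even though {a_n} itself oscillates, which is precisely why lim sup and lim inf always exist.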
From the lemma, the following definition makes sense.
Definition 2.2.5 Let {an } be any sequence of points of [−∞, ∞] . Then
lim sup_{n→∞} a_n ≡ lim_{n→∞} sup {a_k : k ≥ n}
and
lim inf_{n→∞} a_n ≡ lim_{n→∞} inf {a_k : k ≥ n} .
Suppose lim sup_{n→∞} a_n and lim inf_{n→∞} a_n are both real numbers. Then lim_{n→∞} a_n exists if and only if
lim inf_{n→∞} a_n = lim sup_{n→∞} a_n
Suppose first that limn→∞ an exists and is a real number. Then by Theorem 4.4.3 {an }
is a Cauchy sequence. Therefore, if ε > 0 is given, there exists N such that if m, n ≥ N,
then
|an − am | < ε/3.
From the definition of sup {a_k : k ≥ N } , there exists n₁ ≥ N such that
sup {a_k : k ≥ N } ≤ a_{n₁} + ε/3,
and from the definition of inf {a_k : k ≥ N } , there exists n₂ ≥ N such that
inf {a_k : k ≥ N } ≥ a_{n₂} − ε/3.
It follows that
sup {a_k : k ≥ N } − inf {a_k : k ≥ N } ≤ |a_{n₁} − a_{n₂}| + 2ε/3 < ε.
Since the sequence {sup {a_k : k ≥ N }}_{N=1}^{∞} is decreasing and {inf {a_k : k ≥ N }}_{N=1}^{∞} is
increasing, it follows from Theorem 4.1.7 that both lim sup_{n→∞} a_n and lim inf_{n→∞} a_n exist
and, by the above inequality, that they are equal.
Since sup {ak : k ≥ N } ≥ inf {ak : k ≥ N } it follows that for every ε > 0, there exists
N such that
sup {ak : k ≥ N } − inf {ak : k ≥ N } < ε
Thus if m, n > N, then
|am − an | < ε
which means {an } is a Cauchy sequence. Since R is complete, it follows that lim_{n→∞} a_n ≡
a exists. By the squeezing theorem, it follows that
a = lim sup_{n→∞} a_n = lim inf_{n→∞} a_n .
The significance of lim sup and lim inf, in addition to what was just discussed, is
contained in the following theorem which follows quickly from the definition.
Proof: This follows from the definition. Let λ_n = sup {a_k b_k : k ≥ n} . For all n
large enough, a_n > a − ε where ε is small enough that a − ε > 0. Therefore,
λ_n ≥ sup {b_k : k ≥ n} (a − ε)
for all n large enough. Then
lim sup_{n→∞} a_n b_n = lim_{n→∞} λ_n
≥ lim_{n→∞} (sup {b_k : k ≥ n} (a − ε))
= (a − ε) lim sup_{n→∞} b_n
In other words, first sum on j yielding something which depends on k and then sum
these. The major consideration for these double series is the question of when
∑_{k=m}^{∞} ∑_{j=m}^{∞} a_{jk} = ∑_{j=m}^{∞} ∑_{k=m}^{∞} a_{jk} .
2.3. DOUBLE SERIES 21
In other words, when does it make no difference which subscript is summed over first?
In the case of finite sums there is no issue here. You can always write
∑_{k=m}^{M} ∑_{j=m}^{N} a_{jk} = ∑_{j=m}^{N} ∑_{k=m}^{M} a_{jk}
because addition is commutative. However, there are limits involved with infinite sums
and the interchange in order of summation involves taking limits in a different order.
Therefore, it is not always true that it is permissible to interchange the two sums. A
general rule of thumb is this: If something involves changing the order in which two
limits are taken, you may not do it without agonizing over the question. In general,
limits foul up algebra and also introduce things which are counter intuitive. Here is an
example. This example is a little technical. It is placed here just to prove conclusively
there is a question which needs to be considered.
Example 2.3.1 Consider the following picture which depicts some of the ordered pairs
(m, n) where m, n are positive integers.
0 0 0 0 0 c 0 -c
0 0 0 0 c 0 -c 0
0 0 0 c 0 -c 0 0
0 0 c 0 -c 0 0 0
0 c 0 -c 0 0 0 0
b 0 -c 0 0 0 0 0
0 a 0 0 0 0 0 0
The numbers next to the point are the values of amn . You see ann = 0 for all n,
a21 = a, a12 = b, amn = c for (m, n) on the line y = 1 + x whenever m > 1, and
amn = −c for all (m, n) on the line y = x − 1 whenever m > 2.
Then ∑_{m=1}^{∞} a_{mn} = a if n = 1, ∑_{m=1}^{∞} a_{mn} = b − c if n = 2, and ∑_{m=1}^{∞} a_{mn} = 0 if n > 2. Therefore,
∑_{n=1}^{∞} ∑_{m=1}^{∞} a_{mn} = a + b − c.
Next observe that ∑_{n=1}^{∞} a_{mn} = b if m = 1, ∑_{n=1}^{∞} a_{mn} = a + c if m = 2, and ∑_{n=1}^{∞} a_{mn} = 0 if m > 2. Therefore,
∑_{m=1}^{∞} ∑_{n=1}^{∞} a_{mn} = b + a + c
and so the two sums are different. Moreover, you can see that by assigning different
values of a, b, and c, you can get an example for any two different numbers desired.
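The two iterated sums of this example can be computed numerically. In the sketch below (the helper names `entry`, `col_sum`, `row_sum` are mine, not the book's), each inner sum is taken over a range large enough to capture every nonzero term, so the inner sums are exact, and the two orders of summation visibly disagree:

```python
# Example 2.3.1: a_mn = a at (2,1), b at (1,2), c on the line n = m + 1
# for m > 1, and -c on the line n = m - 1 for m > 2; zero elsewhere.
def entry(m, n, a, b, c):
    if (m, n) == (2, 1):
        return a
    if (m, n) == (1, 2):
        return b
    if n == m + 1 and m > 1:
        return c
    if n == m - 1 and m > 2:
        return -c
    return 0.0

def col_sum(n, a, b, c, M=1000):
    # inner sum over m; M is large enough to capture all nonzero terms
    return sum(entry(m, n, a, b, c) for m in range(1, M + 1))

def row_sum(m, a, b, c, M=1000):
    # inner sum over n
    return sum(entry(m, n, a, b, c) for n in range(1, M + 1))

a, b, c = 1.0, 2.0, 5.0
s1 = sum(col_sum(n, a, b, c) for n in range(1, 101))  # sum over m first
s2 = sum(row_sum(m, a, b, c) for m in range(1, 101))  # sum over n first
print(s1, s2)  # a + b - c = -2.0  versus  a + b + c = 8.0
```

Note that truncating both sums to the same finite square would give equal results; the discrepancy appears only because each inner sum is carried out to (effective) infinity before the outer sum is taken.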
It turns out that if aij ≥ 0 for all i, j, then you can always interchange the order
of summation. This is shown next and is based on the following lemma. First, some
notation should be discussed.
Definition 2.3.2 Let f (a, b) ∈ [−∞, ∞] for a ∈ A and b ∈ B where A, B
are sets which means that f (a, b) is either a number, ∞, or −∞. The symbol, +∞
is interpreted as a point out at the end of the number line which is larger than every
real number. Of course there is no such number. That is why it is called ∞. The
symbol, −∞ is interpreted similarly. Then supa∈A f (a, b) means sup (Sb ) where Sb ≡
{f (a, b) : a ∈ A} .
Unlike limits, you can take the sup in different orders.
Lemma 2.3.3 Let f (a, b) ∈ [−∞, ∞] for a ∈ A and b ∈ B where A, B are sets.
Then
sup_{a∈A} sup_{b∈B} f (a, b) = sup_{b∈B} sup_{a∈A} f (a, b) .
Proof: Note that for all a, b, f (a, b) ≤ sup_{b∈B} sup_{a∈A} f (a, b) and therefore, for all
a, sup_{b∈B} f (a, b) ≤ sup_{b∈B} sup_{a∈A} f (a, b). Therefore,
sup_{a∈A} sup_{b∈B} f (a, b) ≤ sup_{b∈B} sup_{a∈A} f (a, b) .
Repeat the same argument interchanging a and b, to get the conclusion of the lemma.
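For a finite table of values the lemma reduces to the fact that iterated maxima can be taken in either order; a quick Python sketch (the random table is an arbitrary illustrative choice):

```python
import random

# Lemma 2.3.3 in miniature: iterated suprema (here maxima over a finite
# table f(a, b)) can be taken in either order.
random.seed(0)
A, B = range(5), range(7)
f = {(x, y): random.uniform(-10, 10) for x in A for y in B}

lhs = max(max(f[x, y] for y in B) for x in A)
rhs = max(max(f[x, y] for x in A) for y in B)
print(lhs == rhs)  # True: both equal the overall maximum of the table
```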
Theorem 2.3.4 Let aij ≥ 0. Then
∑_{i=1}^{∞} ∑_{j=1}^{∞} a_{ij} = ∑_{j=1}^{∞} ∑_{i=1}^{∞} a_{ij} .
Proof: First note there is no trouble in defining these sums because the aij are all
nonnegative. If a sum diverges, it only diverges to ∞ and so ∞ is the value of the sum.
Next note that
∑_{j=r}^{∞} ∑_{i=r}^{∞} a_{ij} ≥ sup_n ∑_{j=r}^{∞} ∑_{i=r}^{n} a_{ij}
because for all j,
∑_{i=r}^{∞} a_{ij} ≥ ∑_{i=r}^{n} a_{ij} .
Therefore,
∑_{j=r}^{∞} ∑_{i=r}^{∞} a_{ij} ≥ sup_n ∑_{j=r}^{∞} ∑_{i=r}^{n} a_{ij} = sup_n lim_{m→∞} ∑_{j=r}^{m} ∑_{i=r}^{n} a_{ij}
= sup_n lim_{m→∞} ∑_{i=r}^{n} ∑_{j=r}^{m} a_{ij} = sup_n ∑_{i=r}^{n} lim_{m→∞} ∑_{j=r}^{m} a_{ij}
= sup_n ∑_{i=r}^{n} ∑_{j=r}^{∞} a_{ij} = lim_{n→∞} ∑_{i=r}^{n} ∑_{j=r}^{∞} a_{ij} = ∑_{i=r}^{∞} ∑_{j=r}^{∞} a_{ij} .
Interchanging the roles of i and j in this argument gives the reverse inequality, which
proves the theorem. ■
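Theorem 2.3.4 can be illustrated numerically with the nonnegative terms a_ij = 2^{−(i+j)}, whose double sum is (∑_{i=1}^{∞} 2^{−i})² = 1. The truncation point below is my choice; it leaves a negligible tail:

```python
# For nonnegative terms the two iterated sums agree. Here
# a_ij = 2^-(i+j), truncated far enough that the tail is negligible.
N = 60
s_ij = sum(sum(2.0 ** -(i + j) for j in range(1, N)) for i in range(1, N))
s_ji = sum(sum(2.0 ** -(i + j) for i in range(1, N)) for j in range(1, N))
print(s_ij, s_ji)  # both very close to 1
```

Unlike Example 2.3.1, no cancellation is possible here, so the order of summation cannot matter.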
All the topics for calculus of one variable generalize to calculus of any number of variables,
in which the functions can have values in m dimensional space.
The notation, Cn refers to the collection of ordered lists of n complex numbers. Since
every real number is also a complex number, this simply generalizes the usual notion
of Rn , the collection of all ordered lists of n real numbers. In order to avoid worrying
about whether it is real or complex numbers which are being referred to, the symbol F
will be used. If it is not clear, always pick C.
Fn ≡ {(x1 , · · ·, xn ) : xj ∈ F for j = 1, · · ·, n} .
For (x1 , · · ·, xn ) ∈ Fn ,
it is conventional to denote (x1 , · · ·, xn ) by the single bold face letter, x. The numbers,
xj are called the coordinates. The set
{(0, · · ·, 0, t, 0, · · ·, 0) : t ∈ F}
for t in the ith slot is called the ith coordinate axis. The point 0 ≡ (0, · · ·, 0) is called
the origin.
Thus (1, 2, 4i) ∈ F3 and (2, 1, 4i) ∈ F3 but (1, 2, 4i) ≠ (2, 1, 4i) because, even though
the same numbers are involved, they don’t match up. In particular, the first entries are
not equal.
The geometric significance of Rn for n ≤ 3 has been encountered already in calculus
or in precalculus. Here is a short review. First consider the case when n = 1. Then
from the definition, R1 = R. Recall that R is identified with the points of a line. Look
at the number line again. Observe that this amounts to identifying a point on this line
with a real number. In other words a real number determines where you are on this line.
Now suppose n = 2 and consider two lines which intersect each other at right angles as
shown in the following picture.
24 BASIC LINEAR ALGEBRA
[Figure: coordinate axes in the plane, with the points (2, 6) and (−8, 3) plotted.]
Notice how you can identify a point shown in the plane with the ordered pair, (2, 6) .
You go to the right a distance of 2 and then up a distance of 6. Similarly, you can identify
another point in the plane with the ordered pair (−8, 3) . Go to the left a distance of 8
and then up a distance of 3. The reason you go to the left is that there is a − sign on the
eight. From this reasoning, every ordered pair determines a unique point in the plane.
Conversely, taking a point in the plane, you could draw two lines through the point,
one vertical and the other horizontal and determine unique points, x1 on the horizontal
line in the above picture and x2 on the vertical line in the above picture, such that
the point of interest is identified with the ordered pair, (x1 , x2 ) . In short, points in the
plane can be identified with ordered pairs similar to the way that points on the real
line are identified with real numbers. Now suppose n = 3. As just explained, the first
two coordinates determine a point in a plane. Letting the third component determine
how far up or down you go, depending on whether this number is positive or negative,
this determines a point in space. Thus, (1, 4, −5) would mean to determine the point
in the plane that goes with (1, 4) and then to go below this plane a distance of 5 to
obtain a unique point in space. You see that the ordered triples correspond to points in
space just as the ordered pairs correspond to points in a plane and single real numbers
correspond to points on a line.
You can’t stop here and say that you are only interested in n ≤ 3. What if you were
interested in the motion of two objects? You would need three coordinates to describe
where the first object is and you would need another three coordinates to describe
where the other object is located. Therefore, you would need to be considering R6 . If
the two objects moved around, you would need a time coordinate as well. As another
example, consider a hot object which is cooling and suppose you want the temperature
of this object. How many coordinates would be needed? You would need one for the
temperature, three for the position of the point in the object and one more for the
time. Thus you would need to be considering R5 . Many other examples can be given.
Sometimes n is very large. This is often the case in applications to business when they
are trying to maximize profit subject to constraints. It also occurs in numerical analysis
when people try to solve hard problems on a computer.
There are other ways to identify points in space with three numbers but the one
presented is the most basic. In this case, the coordinates are known as Cartesian
coordinates after Descartes1 who invented this idea in the first half of the seventeenth
century. I will often not bother to draw a distinction between the point in n dimensional
space and its Cartesian coordinates.
The geometric significance of Cn for n > 1 is not available because each copy of C
corresponds to the plane or R2 .
1 René Descartes 1596-1650 is often credited with inventing analytic geometry although it seems
the ideas were actually known much earlier. He was interested in many different subjects, physiology,
chemistry, and physics being some of them. He also wrote a large book in which he tried to explain
the book of Genesis scientifically. Descartes ended up dying in Sweden.
3.1. ALGEBRA IN FN , VECTOR SPACES 25
x + y = (x1 , · · ·, xn ) + (y1 , · · ·, yn )
≡ (x1 + y1 , · · ·, xn + yn ) (3.2)
With this definition, the algebraic properties satisfy the conclusions of the following
theorem. These conclusions are called the vector space axioms. Any time you have a
set and a field of scalars satisfying the axioms of the following theorem, it is called a
vector space.
v + w = w + v, (3.3)
(v + w) + z = v + (w + z) , (3.4)
v + 0 = v, (3.5)
v + (−v) = 0, (3.6)
α (v + w) = αv + αw, (3.7)
(α + β) v = αv + βv, (3.8)
α (βv) = (αβ) v, (3.9)
1v = v. (3.10)
In the above 0 = (0, · · ·, 0).
You should verify these properties all hold. For example, consider 3.7
α (v + w) = α (v1 + w1 , · · ·, vn + wn )
= (α (v1 + w1 ) , · · ·, α (vn + wn ))
= (αv1 + αw1 , · · ·, αvn + αwn )
= (αv1 , · · ·, αvn ) + (αw1 , · · ·, αwn )
= αv + αw.
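The componentwise verification above can also be spot-checked numerically; a minimal Python sketch (tuples standing in for vectors in Rⁿ, the helper names being mine):

```python
# Componentwise operations on F^n and a spot check of axiom (3.7):
# alpha(v + w) = alpha v + alpha w.
def add(v, w):
    return tuple(x + y for x, y in zip(v, w))

def scale(alpha, v):
    return tuple(alpha * x for x in v)

v, w, alpha = (1.0, -2.0, 3.0), (4.0, 0.5, -1.0), 2.5
print(scale(alpha, add(v, w)) == add(scale(alpha, v), scale(alpha, w)))  # True
```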
where the ci are scalars. The set of all linear combinations of these vectors is called
span (x1 , · · ·, xn ) . If V ⊆ Y, then V is called a subspace if whenever α, β are scalars
and u and v are vectors of V, it follows αu + βv ∈ V . That is, it is “closed under
the algebraic operations of vector addition and scalar multiplication” and is therefore, a
vector space. A linear combination of vectors is said to be trivial if all the scalars in
the linear combination equal zero. A set of vectors is said to be linearly independent if
the only linear combination of these vectors which equals the zero vector is the trivial
linear combination. Thus {x1 , · · ·, xp } is called linearly independent if whenever
∑_{k=1}^{p} c_k x_k = 0
it follows that all the scalars, c_k , equal zero. A set of vectors, {x1 , · · ·, xp } , is called
linearly dependent if it is not linearly independent. Thus the set of vectors is linearly
dependent if there exist scalars, c_i , i = 1, · · ·, p, not all zero, such that ∑_{k=1}^{p} c_k x_k = 0.
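One standard computational test for linear independence (not taken from the text, but a direct consequence of the definition): row-reduce the list of vectors and compare the rank with the number of vectors. A self-contained Python sketch:

```python
# Linear independence test by Gaussian elimination: the vectors are
# independent exactly when the only solution of sum c_k x_k = 0 is
# trivial, i.e. when the rank equals the number of vectors.
def rank(rows, tol=1e-10):
    m = [list(r) for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if abs(m[i][c]) > tol), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and abs(m[i][c]) > tol:
                f = m[i][c] / m[r][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    return rank(vectors) == len(vectors)

print(independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # True
print(independent([(1, 2, 3), (2, 4, 6)]))             # False: second = 2 * first
```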
then
0 = 1x_k + ∑_{j≠k} (−c_j) x_j ,
a nontrivial linear combination, contrary to assumption. This shows that if the set is
linearly independent, then none of the vectors is a linear combination of the others.
Now suppose no vector is a linear combination of the others. Is {x1 , · · ·, xp } linearly
independent? If it is not, there exist scalars, ci , not all zero such that
∑_{i=1}^{p} c_i x_i = 0.
Not all of these scalars can equal zero because if this were the case, it would follow
that x1 = 0 and so {x1 , · · ·, xr } would not be linearly independent. Indeed, if x1 = 0, then
1x1 + ∑_{i=2}^{r} 0x_i = x1 = 0
and so there would exist a nontrivial linear combination of the vectors {x1 , · · ·, xr }
which equals zero.
Say c_k ≠ 0. Then solve (3.11) for y_k and obtain
y_k ∈ span {x1 , y1 , · · ·, y_{k−1} , y_{k+1} , · · ·, y_s } ,
a list consisting of x1 together with s − 1 of the y_j .
Now replace the yk in the above with a linear combination of the vectors, {x1 , z1 , · · ·, zs−1 }
to obtain v ∈ span {x1 , z1 , · · ·, zs−1 } . The vector yk , in the list {y1 , · · ·, ys } , has now
been replaced with the vector x1 and the resulting modified list of vectors has the same
span as the original list of vectors, {y1 , · · ·, ys } .
Now suppose that r > s and that span {x1 , · · ·, xl , z1 , · · ·, zp } = V where the vectors,
z1 , ···, zp are each taken from the set, {y1 , · · ·, ys } and l+p = s. This has now been done
for l = 1 above. Then since r > s, it follows that l ≤ s < r and so l + 1 ≤ r. Therefore,
xl+1 is a vector not in the list, {x1 , · · ·, xl } and since span {x1 , · · ·, xl , z1 , · · ·, zp } = V
there exist scalars, ci and dj such that
x_{l+1} = ∑_{i=1}^{l} c_i x_i + ∑_{j=1}^{p} d_j z_j . (3.12)
Now not all the dj can equal zero because if this were so, it would follow that {x1 , · · ·, xr }
would be a linearly dependent set because one of the vectors would equal a linear
combination of the others. Therefore, (3.12) can be solved for one of the zi , say zk , in
terms of xl+1 and the other zi and just as in the above argument, replace that zi with
xl+1 to obtain
span {x1 , · · ·, x_l , x_{l+1} , z1 , · · ·, z_{k−1} , z_{k+1} , · · ·, z_p } = V,
a list containing p − 1 of the z_j .
Continuing this way eventually yields
span (x1 , · · ·, xs ) = V.
But then xr ∈ span {x1 , · · ·, xs } contrary to the assumption that {x1 , · · ·, xr } is linearly
independent. Therefore, r ≤ s as claimed.
Here is another proof in case you didn’t like the above proof.
Theorem 3.2.4 If {u1 , · · ·, ur } is a linearly independent set of vectors of V and if
span (v1 , · · ·, vs ) = V, then r ≤ s.
Proof: Suppose r > s. Let Ep denote a finite list of vectors of {v1 , · · ·, vs } and
let |Ep | denote the number of vectors in the list. Let Fp denote the first p vectors in
{u1 , · · ·, ur }. In case p = 0, Fp will denote the empty set. For 0 ≤ p ≤ s, let Ep have
the property
span (Fp , Ep ) = V
and |Ep | is as small as possible for this to happen. I claim |Ep | ≤ s−p if Ep is nonempty.
Here is why. For p = 0, it is obvious. Suppose true for some p < s. Then since
span (F_p , E_p ) = V, there exist scalars c_i and d_i such that
u_{p+1} = ∑_{i=1}^{p} c_i u_i + ∑_{i=1}^{m} d_i z_i
for
{z1 , · · ·, zm } ⊆ {v1 , · · ·, vs } .
Then not all the di can equal zero because this would violate the linear independence
of the {u1 , · · ·, ur } . Therefore, you can solve for one of the zk as a linear combination
of {u1 , · · ·, up+1 } and the other zj . Thus you can change Fp to Fp+1 and include one
fewer vector in Ep . Thus |Ep+1 | ≤ m − 1 ≤ s − p − 1. This proves the claim.
Therefore, Es is empty and span (u1 , · · ·, us ) = V. However, this gives a contradiction
because it would require
us+1 ∈ span (u1 , · · ·, us )
which violates the linear independence of these vectors. This proves the theorem. ■
Proof: From the exchange theorem, r ≤ s and s ≤ r. Now note the vectors,
of hissing as in “The sixth sheik’s sixth sheep is sick”. This is the reason that bases is used instead of
basiss.
3.2. SUBSPACES SPANS AND BASES 29
Is it also in V ?
α ∑_{k=1}^{r} c_k v_k + β ∑_{k=1}^{r} d_k v_k = ∑_{k=1}^{r} (αc_k + βd_k) v_k ∈ V
Definition 3.2.8 Let V be a vector space. Then dim (V ) read as the dimension
of V is the number of vectors in a basis.
Of course you should wonder right now whether an arbitrary subspace of a finite
dimensional vector space even has a basis. In fact it does and this is in the next theorem.
First, here is an interesting lemma.
Proof: This follows immediately from the proof of Theorem 3.2.10. You do exactly
the same argument except you start with {v1 , · · ·, vr } rather than {v1 }.
It is also true that any spanning set of vectors can be restricted to obtain a basis.
Proof: Let r be the smallest positive integer with the property that for some set,
{v1 , · · ·, vr } ⊆ {u1 , · · ·, up } ,
span (v1 , · · ·, vr ) = V.
Then r ≤ p and it must be the case that {v1 , · · ·, vr } is linearly independent because if
it were not so, one of the vectors, say vk , would be a linear combination of the others.
But then you could delete this vector from {v1 , · · ·, vr } and the resulting list of r − 1
vectors would still span V, contrary to the definition of r. This proves the theorem. ■
Definition 3.3.1 Let V and W be two finite dimensional vector spaces. A func-
tion, L which maps V to W is called a linear transformation and written as L ∈ L (V, W )
if for all scalars α and β, and vectors v, w,
Here
v ≡ (v1 , · · ·, vn ) ∈ Fn ,
written as a column vector.
In the general case, the space of linear transformations is itself a vector space. This
will be discussed next.
(L + M ) v ≡ Lv + M v.
αL (v) ≡ α (Lv) .
3.3. LINEAR TRANSFORMATIONS 31
You should verify that all the axioms of a vector space hold for L (V, W ) with
the above definitions of vector addition and scalar multiplication. What about the
dimension of L (V, W )?
Before answering this question, here is a lemma.
Lemma 3.3.3 Let V and W be vector spaces and suppose {v1 , · · ·, vn } is a basis for V. Then if L : V → W is given by Lvk = wk ∈ W and
L ( Σ_{k=1}^{n} ak vk ) ≡ Σ_{k=1}^{n} ak Lvk = Σ_{k=1}^{n} ak wk
then L is well defined and is in L (V, W ) . Also, if L, M are two linear transformations such that Lvk = M vk for all k, then M = L.
and so L = M because they give the same result for every vector in V .
The message is that when you define a linear transformation, it suffices to tell what
it does to a basis.
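The content of Lemma 3.3.3 can be checked numerically. The following sketch (the basis and images are arbitrary choices, not from the text) expands a vector in a basis of R² and maps it by linearity, exactly as in the formula L(Σ ak vk) ≡ Σ ak wk.

```python
import numpy as np

# Illustrative example: a basis {v1, v2} of V = R^2 and prescribed images
# w1, w2 in W = R^3.  Lemma 3.3.3 says these data determine L completely.
v = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]            # basis for V
w = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 3.0, 1.0])]   # images in W

def L(x):
    # Expand x in the basis {v1, v2}: solve for the coordinates a_k ...
    a = np.linalg.solve(np.column_stack(v), x)
    # ... then map by linearity: L x = sum_k a_k w_k.
    return sum(ak * wk for ak, wk in zip(a, w))

# L is linear: L(alpha x + beta y) = alpha L x + beta L y.
x, y = np.array([2.0, 5.0]), np.array([-1.0, 4.0])
print(np.allclose(L(3 * x + 2 * y), 3 * L(x) + 2 * L(y)))  # True
```

Note that L(v1) = w1 and L(v2) = w2 by construction, so L agrees with any other linear map taking the same values on the basis.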
for V and W respectively. Using Lemma 3.3.3, let wi vj ∈ L (V, W ) be the linear
transformation defined on the basis, {v1 , · · ·, vn }, by
wi vk (vj ) ≡ wi δ jk .
Note that to define these special linear transformations, sometimes called dyadics, it is
necessary that {v1 , · · ·, vn } be a basis since their definition requires giving the values of
the linear transformation on a basis.
Let L ∈ L (V, W ). Since {w1 , · · ·, wm } is a basis, there exist constants djr such that
Lvr = Σ_{j=1}^{m} djr wj
which shows
L = Σ_{j=1}^{m} Σ_{k=1}^{n} djk wj vk
because the two linear transformations agree on a basis. Since L is arbitrary this shows
{wi vk : i = 1, · · ·, m, k = 1, · · ·, n}
spans L (V, W ).
If
Σ_{i,k} dik wi vk = 0,
then
0 = Σ_{i,k} dik wi vk (vl ) = Σ_{i=1}^{m} dil wi
and so, since {w1 , · · ·, wm } is a basis, dil = 0 for each i = 1, · · ·, m. Since l is arbitrary,
this shows dil = 0 for all i and l. Thus these linear transformations form a basis and
this shows the dimension of L (V, W ) is mn as claimed.
Definition 3.3.6 Let V, W be finite dimensional vector spaces such that a basis
for V is
{v1 , · · ·, vn }
and a basis for W is
{w1 , · · ·, wm }.
Then as explained in Theorem 3.3.5, for L ∈ L (V, W ) , there exist scalars lij such that
L = Σ_{ij} lij wi vj
Consider a rectangular array of scalars such that the entry in the ith row and the j th
column is lij ,
l11  l12  · · ·  l1n
l21  l22  · · ·  l2n
 ·    ·          ·
lm1  lm2  · · ·  lmn
This is called the matrix of the linear transformation with respect to the two bases. This
will typically be denoted by (lij ) . It is called a matrix and in this case the matrix is
m × n because it has m rows and n columns.
Theorem 3.3.7 Let L ∈ L (V, W ) and let (lij ) be the matrix of L with respect
to the two bases,
{v1 , · · ·, vn } and {w1 , · · ·, wm }.
of V and W respectively. Then for v ∈ V having components (x1 , · · ·, xn ) with respect
to the basis {v1 , · · ·, vn }, the components of Lv with respect to the basis {w1 , · · ·, wm }
are
( Σ_j l1j xj , · · ·, Σ_j lmj xj )
Theorem 3.3.8 Let (V, {v1 , · · ·, vn }) , (U, {u1 , · · ·, um }) , (W, {w1 , · · ·, wp }) be three
vector spaces along with bases for each one. Let L ∈ L (V, U ) and M ∈ L (U, W ) . Then
M L ∈ L (V, W ) and if (cij ) is the matrix of M L with respect to {v1 , · · ·, vn } and
{w1 , · · ·, wp } and (lij ) and (mij ) are the matrices of L and M respectively with respect
to the given bases, then
crj = Σ_{s=1}^{m} mrs lsj .
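Theorem 3.3.8 can be sanity-checked in the special case where V, U, W are coordinate spaces with their standard bases, so that linear maps are just matrices. The sizes below are illustrative choices.

```python
import numpy as np

# Check that the matrix of a composition is the matrix product:
# c_rj = sum_s m_rs l_sj.
rng = np.random.default_rng(0)
l = rng.standard_normal((4, 3))   # matrix of L : F^3 -> F^4
m = rng.standard_normal((2, 4))   # matrix of M : F^4 -> F^2

# Compute c entrywise from the formula of the theorem ...
c = np.array([[sum(m[r, s] * l[s, j] for s in range(4)) for j in range(3)]
              for r in range(2)])

# ... and compare with the built-in matrix product.
print(np.allclose(c, m @ l))  # True
```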
and
wi vl δ jk (vr ) = wi δ jk δ lr
which shows
(wi uj ) (uk vl ) = wi vl δ jk (3.13)
Therefore,
ML = ( Σ_{rs} mrs wr us ) ( Σ_{ij} lij ui vj )
= Σ_{rsij} mrs lij (wr us ) (ui vj ) = Σ_{rsij} mrs lij wr vj δ is
= Σ_{rsj} mrs lsj wr vj = Σ_{rj} ( Σ_s mrs lsj ) wr vj
Theorem 3.3.9 Suppose (V, {v1 , · · ·, vn }) is a vector space and a basis and (V, {v1′ , · · ·, vn′ }) is the same vector space with a different basis. Suppose L ∈ L (V, V ) . Let (lij ) be the matrix of L taken with respect to {v1 , · · ·, vn } and let (lij′ ) be the n × n matrix of L taken with respect to {v1′ , · · ·, vn′ } . That is,
L = Σ_{ij} lij vi vj , L = Σ_{rs} lrs′ vr′ vs′ .
Then there exist n × n matrices (dij ) and (dij′ ) satisfying
Σ_j dij djk′ = δ ik
such that
lij′ = Σ_{rs} dir lrs dsj′
Proof: First consider the identity map, id, defined by id (v) = v, with respect to the two bases, {v1 , · · ·, vn } and {v1′ , · · ·, vn′ } :
id = Σ_{tu} dtu′ vt′ vu , id = Σ_{ij} dij vi vj′ (3.14)
Therefore,
Σ_i dti′ dij = δ tj .
Switching the order of the above products shows
Σ_i dti dij′ = δ tj
In terms of matrices, this says (dij′ ) is the inverse matrix of (dij ) .
Now using 3.14 and the cancellation property 3.13,
L = Σ_{iu} liu vi vu = Σ_{rs} lrs′ vr′ vs′ = id ( Σ_{rs} lrs′ vr′ vs′ ) id
= Σ_{ij} dij vi vj′ Σ_{rs} lrs′ vr′ vs′ Σ_{tu} dtu′ vt′ vu
= Σ_{ijturs} dij lrs′ dtu′ (vi vj′ ) (vr′ vs′ ) (vt′ vu )
= Σ_{ijturs} dij lrs′ dtu′ vi vu δ jr δ st = Σ_{iu} ( Σ_{js} dij ljs′ dsu′ ) vi vu
and since the linear transformations, {vi vu } , are linearly independent, this shows
liu = Σ_{js} dij ljs′ dsu′
Since the ij th entries are equal, the two matrices are equal.
Next consider the uniqueness of the inverse. If AB = BA = I, then using the
associative law,
B = IB = (A^{-1} A) B = A^{-1} (AB) = A^{-1} I = A^{-1}
Thus if it acts like the inverse, it is the inverse.
Consider now the inverse of a product.
(AB) (B^{-1} A^{-1}) = A (BB^{-1}) A^{-1} = AIA^{-1} = I
Similarly, (B^{-1} A^{-1}) (AB) = I. Hence from what was just shown, (AB)^{-1} exists and equals B^{-1} A^{-1} .
Finally consider the statement about transposes.
((AB)^T )_{ij} ≡ (AB)_{ji} ≡ Σ_k Ajk Bki ≡ Σ_k (B^T )_{ik} (A^T )_{kj} ≡ (B^T A^T )_{ij}
Since the ij th entries are the same, the two matrices are equal. This proves the theorem. ¥
In terms of matrix multiplication, Theorem 3.3.9 says that if M1 and M2 are matrices
for the same linear transformation relative to two different bases, it follows there exists
an invertible matrix, S such that
M1 = S^{-1} M2 S
This is called a similarity transformation and is important in linear algebra but this is
as far as the theory will be developed here.
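A numerical sketch of the similarity relation, with both the matrix and the second basis chosen arbitrarily for illustration: similar matrices represent the same transformation, so quantities such as the determinant agree (this is what makes Definition 3.5.24 below well defined).

```python
import numpy as np

M2 = np.array([[2.0, 1.0], [0.0, 3.0]])   # matrix of some L in the standard basis
S = np.array([[1.0, 1.0], [1.0, -1.0]])   # columns are the new basis vectors

# Matrix of the same transformation with respect to the new basis:
M1 = np.linalg.inv(S) @ M2 @ S

# Similar matrices share determinant and trace.
print(np.isclose(np.linalg.det(M1), np.linalg.det(M2)))  # True
print(np.isclose(np.trace(M1), np.trace(M2)))            # True
```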
You know how to do this from the above definition of matrix multiplication. You get
( AE + BG   AF + BH
  CE + DG   CF + DH ) .
where Aij is an si × pj matrix, where si does not depend on j and pj does not depend on i. Such a matrix is called a block matrix, also a partitioned matrix. Let n = Σ_j pj and k = Σ_i si so A is a k × n matrix. What is Ax where x ∈ Fn ? From the process of multiplying a matrix times a vector, the following lemma follows.
and that for all i, j, it makes sense to multiply Bis Asj for all s ∈ {1, · · ·, m} and that for each s, Bis Asj is the same size so that it makes sense to write Σ_s Bis Asj .
Theorem 3.4.2 Let B be a block matrix as in 3.16 and let A be a block matrix
as in 3.17 such that Bis is conformable with Asj and each product, Bis Asj is of the
same size so they can be added. Then BA is a block matrix such that the ij th block is of the form
Σ_s Bis Asj . (3.18)
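Theorem 3.4.2 can be checked in a small concrete case. The partition sizes below are arbitrary choices for the illustration; the only requirement is that the column partition of B matches the row partition of A.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 6))
A = rng.standard_normal((6, 4))
rows_B, cols_B = [2, 3], [2, 4]   # B partitioned into 2x2 blocks
cols_A = [1, 3]                   # A's columns; A's rows are partitioned as cols_B

def blocks(M, row_sizes, col_sizes):
    # Split M into a list-of-lists of blocks with the given partition sizes.
    r = np.cumsum([0] + row_sizes)
    c = np.cumsum([0] + col_sizes)
    return [[M[r[i]:r[i+1], c[j]:c[j+1]] for j in range(len(col_sizes))]
            for i in range(len(row_sizes))]

Bb, Ab = blocks(B, rows_B, cols_B), blocks(A, cols_B, cols_A)
# Assemble BA from the block formula (3.18): (BA)_ij = sum_s B_is A_sj.
BA_blocks = [[sum(Bb[i][s] @ Ab[s][j] for s in range(2)) for j in range(2)]
             for i in range(2)]
print(np.allclose(np.block(BA_blocks), B @ A))  # True
```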
3.5 Determinants
3.5.1 The Determinant Of A Matrix
The following Lemma will be essential in the definition of the determinant.
Lemma 3.5.1 There exists a unique function, sgnn which maps each list of numbers
from {1, · · ·, n} to one of the three numbers, 0, 1, or −1 which also has the following
properties.
sgnn (1, · · ·, n) = 1 (3.19)
sgnn (i1 , · · ·, p, · · ·, q, · · ·, in ) = − sgnn (i1 , · · ·, q, · · ·, p, · · ·, in ) (3.20)
In words, the second property states that if two of the numbers are switched, the value of the function is multiplied by −1. Also, in the case where n > 1 and {i1 , · · ·, in } = {1, · · ·, n} so that every number from {1, · · ·, n} appears in the ordered list, (i1 , · · ·, in ) ,
sgn_n (i1 , · · ·, iθ−1 , n, iθ+1 , · · ·, in ) ≡ (−1)^{n−θ} sgn_{n−1} (i1 , · · ·, iθ−1 , iθ+1 , · · ·, in ) (3.21)
where θ is the position of n in the ordered list.
Proof: To begin with, it is necessary to show the existence of such a function. This
is clearly true if n = 1. Define sgn1 (1) ≡ 1 and observe that it works. No switching
is possible. In the case where n = 2, it is also clearly true. Let sgn2 (1, 2) = 1 and
sgn2 (2, 1) = −1 while sgn2 (2, 2) = sgn2 (1, 1) = 0 and verify it works. Assuming such a
function exists for n, sgnn+1 will be defined in terms of sgnn . If there are any repeated
numbers in (i1 , · · ·, in+1 ) , sgnn+1 (i1 , · · ·, in+1 ) ≡ 0. If there are no repeats, then n + 1
appears somewhere in the ordered list. Let θ be the position of the number n + 1 in the
list. Thus, the list is of the form (i1 , · · ·, iθ−1 , n + 1, iθ+1 , · · ·, in+1 ) . From 3.21 it must
be that
sgn_{n+1} (i1 , · · ·, iθ−1 , n + 1, iθ+1 , · · ·, in+1 ) ≡ (−1)^{n+1−θ} sgn_n (i1 , · · ·, iθ−1 , iθ+1 , · · ·, in+1 ) .
It is necessary to verify this satisfies 3.19 and 3.20 with n replaced with n + 1. The first
of these is obviously true because
sgn_{n+1} (1, · · ·, n, n + 1) ≡ (−1)^{n+1−(n+1)} sgn_n (1, · · ·, n) = 1.
If there are repeated numbers in (i1 , · · ·, in+1 ) , then it is obvious 3.20 holds because
both sides would equal zero from the above definition. It remains to verify 3.20 in the
case where there are no numbers repeated in (i1 , · · ·, in+1 ) . Consider
sgn_{n+1} (i1 , · · ·, p, · · ·, q, · · ·, in+1 ) ,
where the number p is in the rth position and the number q is in the sth position. Suppose first that r < θ < s, θ being the position of n + 1. Then
sgn_{n+1} (i1 , · · ·, p, · · ·, n + 1, · · ·, q, · · ·, in+1 ) ≡ (−1)^{n+1−θ} sgn_n (i1 , · · ·, p, · · ·, q, · · ·, in+1 )
where on the right, p is in the rth position and q is in the (s − 1)th position, while
sgn_{n+1} (i1 , · · ·, q, · · ·, n + 1, · · ·, p, · · ·, in+1 ) = (−1)^{n+1−θ} sgn_n (i1 , · · ·, q, · · ·, p, · · ·, in+1 )
with q in the rth position and p in the (s − 1)th position, and so, by induction, a switch of p and q introduces a minus sign in the result. Similarly, if θ > s or if θ < r it also follows that 3.20 holds. The interesting case is when θ = r or θ = s. Consider the case where θ = r and note the other case is entirely similar.
sgn_{n+1} (i1 , · · ·, n + 1, · · ·, q, · · ·, in+1 ) = (−1)^{n+1−r} sgn_n (i1 , · · ·, q, · · ·, in+1 ) (3.22)
where n + 1 is in the rth position and, on the right, q is in the (s − 1)th position, while
sgn_{n+1} (i1 , · · ·, q, · · ·, n + 1, · · ·, in+1 ) = (−1)^{n+1−s} sgn_n (i1 , · · ·, q, · · ·, in+1 ) (3.23)
where q is in the rth position on both sides.
By making s − 1 − r switches, move the q which is in the (s − 1)th position in 3.22 to the rth position as in 3.23. By induction, each of these switches introduces a factor of −1 and so
sgn_n (i1 , · · ·, q, · · ·, in+1 ) with q in the (s − 1)th position equals (−1)^{s−1−r} sgn_n (i1 , · · ·, q, · · ·, in+1 ) with q in the rth position.
Therefore, with q in the (s − 1)th position in the first expression on the right and in the rth position thereafter,
sgn_{n+1} (i1 , · · ·, n + 1, · · ·, q, · · ·, in+1 ) = (−1)^{n+1−r} sgn_n (i1 , · · ·, q, · · ·, in+1 )
= (−1)^{n+1−r} (−1)^{s−1−r} sgn_n (i1 , · · ·, q, · · ·, in+1 )
= (−1)^{n+s} sgn_n (i1 , · · ·, q, · · ·, in+1 ) = (−1)^{2s−1} (−1)^{n+1−s} sgn_n (i1 , · · ·, q, · · ·, in+1 )
= − sgn_{n+1} (i1 , · · ·, q, · · ·, n + 1, · · ·, in+1 ) .
This proves the existence of the desired function.
To see this function is unique, note that you can obtain any ordered list of distinct numbers from (1, · · ·, n) by a sequence of switches. If there exist two functions, f and g, both satisfying 3.19 and 3.20, you could start with f (1, · · ·, n) = g (1, · · ·, n) and, applying the same sequence of switches, eventually arrive at f (i1 , · · ·, in ) = g (i1 , · · ·, in ) .
are repeated, then 3.20 gives both functions are equal to zero for that ordered list. This
proves the lemma. ¥
In what follows sgn will often be used rather than sgnn because the context supplies
the appropriate n.
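The recursion used in the existence part of the proof translates directly into code. The following sketch implements it with 0-based list indices (the text's positions are 1-based).

```python
# sgn_n maps an ordered list of numbers from {1,...,n} to 0, 1, or -1,
# following the recursive construction in the proof of Lemma 3.5.1.
def sgn(ordered_list):
    lst = list(ordered_list)
    n = len(lst)
    if len(set(lst)) < n:        # repeated numbers: the value is 0
        return 0
    if n == 1:
        return 1
    theta = lst.index(n)         # 0-based position of the largest number n
    # sgn_n(..., n in position theta+1, ...) = (-1)^(n-(theta+1)) sgn_{n-1}(rest)
    return (-1) ** (n - (theta + 1)) * sgn(lst[:theta] + lst[theta + 1:])

# Property 3.19, the effect of one switch (3.20), and a repeated entry:
print(sgn([1, 2, 3]), sgn([2, 1, 3]), sgn([1, 1, 2]))  # 1 -1 0
```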
Definition 3.5.2 Let f be a real valued function which has the set of ordered
lists of numbers from {1, · · ·, n} as its domain. Define
Σ_{(k1 ,···,kn )} f (k1 · · · kn )
to be the sum of all the f (k1 · · · kn ) for all possible choices of ordered lists (k1 , · · ·, kn ) of numbers of {1, · · ·, n} . For example,
Σ_{(k1 ,k2 )} f (k1 , k2 ) = f (1, 2) + f (2, 1) + f (1, 1) + f (2, 2) .
Definition 3.5.3 Let (aij ) = A denote an n × n matrix. The determinant of A, denoted by det (A) , is defined by
det (A) ≡ Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) a1k1 · · · ankn
where the sum is taken over all ordered lists of numbers from {1, · · ·, n}. Note it suffices to take the sum over only those ordered lists in which there are no repeats because if there are, sgn (k1 , · · ·, kn ) = 0 and so that term contributes 0 to the sum.
Let A be an n × n matrix, A = (aij ) , and let (r1 , · · ·, rn ) denote an ordered list of n numbers from {1, · · ·, n}. Let A (r1 , · · ·, rn ) denote the matrix whose k th row is the rk row of the matrix, A. Thus
det (A (r1 , · · ·, rn )) = Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) ar1 k1 · · · arn kn (3.24)
and
A (1, · · ·, n) = A.
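The permutation-sum formula for the determinant can be evaluated directly for small matrices and compared against a library routine. The matrix below is an arbitrary example; the sign here is computed by counting inversions, which agrees with sgn on lists without repeats.

```python
import numpy as np
from itertools import permutations

def perm_sign(p):
    # Parity by counting inversions; equals sgn on repeat-free lists.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det_by_sum(A):
    # det(A) = sum over ordered lists (k1,...,kn) without repeats of
    # sgn(k1,...,kn) a_{1 k1} ... a_{n kn}.
    n = A.shape[0]
    return sum(perm_sign(p) * np.prod([A[i, p[i]] for i in range(n)])
               for p in permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
print(np.isclose(det_by_sum(A), np.linalg.det(A)))  # True
```

The n! terms in the sum make this impractical beyond very small n, which is one reason the cofactor expansion and row reduction matter in practice.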
Proposition 3.5.4 Let (r1 , · · ·, rn ) be an ordered list of numbers from {1, · · ·, n}. Then
sgn (r1 , · · ·, rn ) det (A) = Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) ar1 k1 · · · arn kn (3.25)
Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kr , · · ·, ks , · · ·, kn ) a1k1 · · · arkr · · · asks · · · ankn ,
where it took p switches to obtain (r1 , · · ·, rn ) from (1, · · ·, n). By Lemma 3.5.1, this implies
det (A (r1 , · · ·, rn )) = (−1)^p det (A) = sgn (r1 , · · ·, rn ) det (A)
and proves the proposition in the case when there are no repeated numbers in the ordered list, (r1 , · · ·, rn ). However, if there is a repeat, say the rth row equals the sth row, then the reasoning of 3.27 - 3.28 shows that det (A (r1 , · · ·, rn )) = 0 and also sgn (r1 , · · ·, rn ) = 0 so the formula holds in this case also.
Observation 3.5.5 There are n! ordered lists of distinct numbers from {1, · · ·, n} .
With the above, it is possible to give a more symmetric description of the determinant from which it will follow that det (A) = det (A^T ) .
Summing over all ordered lists, (r1 , · · ·, rn ) where the ri are distinct, (If the ri are not distinct, sgn (r1 , · · ·, rn ) = 0 and so there is no contribution to the sum.)
n! det (A) = Σ_{(r1 ,···,rn )} Σ_{(k1 ,···,kn )} sgn (r1 , · · ·, rn ) sgn (k1 , · · ·, kn ) ar1 k1 · · · arn kn .
Since this formula gives the same number for A as it does for A^T , this proves the corollary. ¥
Corollary 3.5.7 If two rows or two columns in an n × n matrix, A, are switched, the determinant of the resulting matrix equals (−1) times the determinant of the original matrix. If A is an n × n matrix in which two rows are equal or two columns are equal then det (A) = 0. Suppose the ith row of A equals (xa1 + yb1 , · · ·, xan + ybn ). Then
det (A) = x det (A1 ) + y det (A2 )
where the ith row of A1 is (a1 , · · ·, an ) and the ith row of A2 is (b1 , · · ·, bn ) , all other rows of A1 and A2 coinciding with those of A. In other words, det is a linear function of each row of A. The same is true with the word “row” replaced with the word “column”.
Proof: By Proposition 3.5.4 when two rows are switched, the determinant of the
resulting matrix is (−1) times the determinant of the original matrix. By Corollary
3.5.6 the same holds for columns because the columns of the matrix equal the rows of
the transposed matrix. Thus if A1 is the matrix obtained from A by switching two
columns,
det (A) = det (A^T ) = − det (A1^T ) = − det (A1 ) .
If A has two equal columns or two equal rows, then switching them results in the same
matrix. Therefore, det (A) = − det (A) and so det (A) = 0.
It remains to verify the last assertion.
det (A) ≡ Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) a1k1 · · · (xaki + ybki ) · · · ankn
= x Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) a1k1 · · · aki · · · ankn
+ y Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) a1k1 · · · bki · · · ankn
= x det (A1 ) + y det (A2 ) .
By Corollary 3.5.7,
det (A) = Σ_{k=1}^{r} ck det ( a1 · · · ar · · · an−1 ak ) = 0.
The case for rows follows from the fact that det (A) = det (A^T ) . This proves the corollary. ¥
Recall the following definition of matrix multiplication.
One of the most important rules about determinants is that the determinant of a
product equals the product of the determinants.
det (AB) = Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) c1k1 · · · cnkn
= Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) ( Σ_{r1} a1r1 br1 k1 ) · · · ( Σ_{rn} anrn brn kn )
= Σ_{(r1 ,···,rn )} Σ_{(k1 ,···,kn )} sgn (k1 , · · ·, kn ) br1 k1 · · · brn kn (a1r1 · · · anrn )
= Σ_{(r1 ,···,rn )} sgn (r1 · · · rn ) a1r1 · · · anrn det (B) = det (A) det (B) .
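The product rule just derived is easy to confirm numerically on random matrices (sizes and seed are arbitrary choices for the illustration).

```python
import numpy as np

# Check det(AB) = det(A) det(B) on a random example.
rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # True
```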
Proof: Denote M by (mij ) . Thus in the first case, mnn = a and mni = 0 if i ≠ n while in the second case, mnn = a and min = 0 if i ≠ n. From the definition of the determinant,
det (M ) ≡ Σ_{(k1 ,···,kn )} sgn_n (k1 , · · ·, kn ) m1k1 · · · mnkn
Letting θ denote the position of n in the ordered list, (k1 , · · ·, kn ) , then using the earlier conventions used to prove Lemma 3.5.1, det (M ) equals
Σ_{(k1 ,···,kn )} (−1)^{n−θ} sgn_{n−1} (k1 , · · ·, kθ−1 , kθ+1 , · · ·, kn ) m1k1 · · · mnkn
Now suppose 3.31. Then if kn ≠ n, the term involving mnkn in the above expression equals zero. Therefore, the only terms which survive are those for which θ = n or in other words, those for which kn = n. Therefore, the above expression reduces to
a Σ_{(k1 ,···,kn−1 )} sgn_{n−1} (k1 , · · ·, kn−1 ) m1k1 · · · m(n−1)kn−1 = a det (A) .
To get the assertion in the situation of 3.30 use Corollary 3.5.6 and 3.31 to write
det (M ) = det (M^T ) = det ( [ A^T  0 ; ∗  a ] ) = a det (A^T ) = a det (A) .
The first formula consists of expanding the determinant along the ith row and the second
expands the determinant along the j th column.
Proof: Let (ai1 , · · ·, ain ) be the ith row of A. Let Bj be the matrix obtained from A
by leaving every row the same except the ith row which in Bj equals (0, · · ·, 0, aij , 0, · · ·, 0) .
Then by Corollary 3.5.7,
det (A) = Σ_{j=1}^{n} det (Bj )
Denote by A^{ij} the (n − 1) × (n − 1) matrix obtained by deleting the ith row and the j th column of A. Thus cof (A)ij ≡ (−1)^{i+j} det (A^{ij}) . At this point, recall that from
Proposition 3.5.4, when two rows or two columns in a matrix, M, are switched, this
results in multiplying the determinant of the old matrix by −1 to get the determinant
of the new matrix. Therefore, by Lemma 3.5.11,
det (Bj ) = (−1)^{n−j} (−1)^{n−i} det ( [ A^{ij}  ∗ ; 0  aij ] )
= (−1)^{i+j} det ( [ A^{ij}  ∗ ; 0  aij ] ) = aij cof (A)ij .
Therefore,
det (A) = Σ_{j=1}^{n} aij cof (A)ij
which is the formula for expanding det (A) along the ith row. Also,
det (A) = det (A^T ) = Σ_{j=1}^{n} a^T_{ij} cof (A^T )_{ij} = Σ_{j=1}^{n} aji cof (A)ji
which is the formula for expanding det (A) along the ith column. This proves the theorem. ¥
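The cofactor expansion gives a natural recursive algorithm for small determinants. This sketch expands along the first row; the test matrix is an arbitrary example.

```python
import numpy as np

def minor(A, i, j):
    # Delete the i-th row and j-th column (A^{ij} in the text).
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def det_cofactor(A):
    # det(A) = sum_j a_{1j} cof(A)_{1j}, cof(A)_{ij} = (-1)^(i+j) det(A^{ij}).
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    return sum(A[0, j] * (-1) ** j * det_cofactor(minor(A, 0, j)) for j in range(n))

A = np.array([[1.0, 2.0, 3.0], [0.0, 4.0, 5.0], [1.0, 0.0, 6.0]])
print(np.isclose(det_cofactor(A), np.linalg.det(A)))  # True
```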
Note that this gives an easy way to write a formula for the inverse of an n×n matrix.
Theorem 3.5.14 A^{-1} exists if and only if det (A) ≠ 0. If det (A) ≠ 0, then A^{-1} = (a^{-1}_{ij}) where
a^{-1}_{ij} = det (A)^{-1} cof (A)ji
for cof (A)ij the ij th cofactor of A.
Now consider
Σ_{i=1}^{n} air cof (A)ik det (A)^{-1}
when k ≠ r. Replace the k th column with the rth column to obtain a matrix, Bk , whose determinant equals zero by Corollary 3.5.7. However, expanding this matrix along the k th column yields
0 = det (Bk ) det (A)^{-1} = Σ_{i=1}^{n} air cof (A)ik det (A)^{-1}
Summarizing,
Σ_{i=1}^{n} air cof (A)ik det (A)^{-1} = δ rk .
Thus
a^{-1}_{ij} = cof (A)ji det (A)^{-1} .
Corollary 3.5.15 Let A be an n×n matrix and suppose there exists an n×n matrix,
B such that BA = I. Then A−1 exists and A−1 = B. Also, if there exists C an n × n
matrix such that AC = I, then A−1 exists and A−1 = C.
det B det A = 1
thus solving the system. Now in the case that A^{-1} exists, there is a formula for A^{-1} given above. Using this formula,
xi = Σ_{j=1}^{n} a^{-1}_{ij} yj = Σ_{j=1}^{n} (1 / det (A)) cof (A)ji yj .
where here the ith column of A is replaced with the column vector, (y1 , · · ·, yn )^T , and the determinant of this modified matrix is taken and divided by det (A). This formula is known as Cramer’s rule.
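Cramer's rule is straightforward to implement and check against a direct solver. The 2 × 2 system below is an arbitrary well-conditioned example.

```python
import numpy as np

def cramer(A, y):
    # x_i = det(A_i) / det(A), where A_i is A with its i-th column replaced by y.
    d = np.linalg.det(A)
    x = np.empty(len(y))
    for i in range(len(y)):
        Ai = A.copy()
        Ai[:, i] = y          # replace the i-th column with y
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
y = np.array([3.0, 5.0])
print(np.allclose(cramer(A, y), np.linalg.solve(A, y)))  # True
```

Like the cofactor formula for the inverse, this is mainly of theoretical interest; elimination-based solvers are preferred numerically.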
Theorem 3.5.19 If A has determinant rank, r, then there exist r rows of the
matrix such that every other row is a linear combination of these r rows.
Proof: Suppose the determinant rank of A = (aij ) equals r. If rows and columns
are interchanged, the determinant rank of the modified matrix is unchanged. Thus rows
and columns can be interchanged to produce an r × r matrix in the upper left corner of
the matrix which has non zero determinant. Now consider the (r + 1) × (r + 1) matrix, M,
a11  · · ·  a1r  a1p
 ·          ·    ·
ar1  · · ·  arr  arp
al1  · · ·  alr  alp
where C will denote the r × r matrix in the upper left corner which has non zero
determinant. I claim det (M ) = 0.
There are two cases to consider in verifying this claim. First, suppose p > r. Then
the claim follows from the assumption that A has determinant rank r. On the other
hand, if p < r, then the determinant is zero because there are two identical columns.
Expand the determinant along the last column and divide by det (C) to obtain
alp = − Σ_{i=1}^{r} (cof (M )ip / det (C)) aip .
Now note that cof (M )ip does not depend on p. Therefore the above sum is of the form
alp = Σ_{i=1}^{r} mi aip
which shows the lth row is a linear combination of the first r rows of A. Since l is arbitrary, this proves the theorem. ¥
Corollary 3.5.21 If A has determinant rank, r, then there exist r columns of the
matrix such that every other column is a linear combination of these r columns. Also
the column rank equals the determinant rank.
Proof: This follows from the above by considering AT . The rows of AT are the
columns of A and the determinant rank of AT and A are the same. Therefore, from
Corollary 3.5.20, column rank of A = row rank of AT = determinant rank of AT =
determinant rank of A.
The following theorem is of fundamental importance and ties together many of the
ideas presented above.
1. det (A) = 0.
2. A and A^T are not one to one.
3. A is not onto.
Proof: Suppose det (A) = 0. Then the determinant rank of A = r < n. Therefore, there exist r columns such that every other column is a linear combination of these columns by Theorem 3.5.19. In particular, it follows that for some m, the mth column is a linear combination of all the others. Thus letting A = ( a1 · · · am · · · an ) where the columns are denoted by ai , there exist scalars, αi , such that
am = Σ_{k≠m} αk ak .
Now consider the column vector, x ≡ ( α1 · · · −1 · · · αn )^T . Then
Ax = −am + Σ_{k≠m} αk ak = 0.
Since also A0 = 0, it follows A is not one to one. Similarly, AT is not one to one by the
same argument applied to AT . This verifies that 1.) implies 2.).
Now suppose 2.). Then since AT is not one to one, it follows there exists x 6= 0 such
that
AT x = 0.
Taking the transpose of both sides yields
xT A = 0
3. A is onto.
Definition 3.5.24 Let L ∈ L (V, V ) and let {v1 , · · ·, vn } be a basis for V . Thus
the matrix of L with respect to this basis is (lij ) ≡ ML where
L = Σ_{ij} lij vi vj
Then define
det (L) ≡ det ((lij )) .
ML′ = S^{-1} ML S
because S −1 S = I and det (I) = 1. This shows the definition is well defined.
Also there is an equivalence just as in the case of matrices between various properties
of L and the nonvanishing of the determinant.
1. det (L) = 0.
2. L is not one to one.
3. L is not onto.
Proof: Suppose 1.). Let {v1 , · · ·, vn } be a basis for V and let (lij ) be the matrix of L with respect to this basis. By definition, det ((lij )) = 0 and so (lij ) is not one to one. Thus there is a nonzero vector x ∈ Fn such that Σ_j lij xj = 0 for each i. Then letting v ≡ Σ_{j=1}^{n} xj vj ,
Lv = Σ_{rs} lrs vr vs ( Σ_{j=1}^{n} xj vj ) = Σ_j Σ_{rs} lrs vr δ sj xj
= Σ_r ( Σ_j lrj xj ) vr = 0
Then if {Lvi }_{i=1}^{n} were linearly independent, it would follow that
0 = Lv = Σ_i xi Lvi
and so all the xi would equal zero which is not the case. Hence these vectors cannot be
linearly independent so they do not span V . Hence there exists
and therefore, there is no u ∈ V such that Lu = w because if there were such a u, then
u = Σ_i xi vi
and so Lu = Σ_i xi Lvi ∈ span (Lv1 , · · ·, Lvn ) .
Finally suppose L is not onto. Then (lij ) also cannot be onto Fn . Therefore, det ((lij )) ≡ det (L) = 0. Why can’t (lij ) be onto? If it were, then for any y ∈ Fn , there exists x ∈ Fn such that yi = Σ_j lij xj . Thus
Σ_k yk vk = Σ_{rs} lrs vr vs ( Σ_j xj vj ) = L ( Σ_j xj vj )
but the expression on the left in the above formula is that of a general element of V
and so L would be onto. This proves the theorem. ¥
λ id −A ∈ L (V, V )
and since it has zero determinant, it is not one to one so there exists v 6= 0 such that
(λ id −A) v = 0.
The following lemma gives the existence of something called the minimal polynomial.
It is an interesting application of the notion of the dimension of L (V, V ).
Lemma 3.6.3 Let A ∈ L (V, V ) where V is either a real or a complex finite dimensional vector space of dimension n. Then there exists a polynomial of the form
p (λ) = λ^m + cm−1 λ^{m−1} + · · · + c1 λ + c0
such that p (A) = 0 and m is as small as possible for this to take place.
This implies there exists a polynomial, q (λ) , which has the property that q (A) = 0. In fact, one example is q (λ) ≡ c0 + Σ_{k=1}^{n²} ck λ^k . Dividing by the leading term, it can
be assumed this polynomial is of the form λ^m + cm−1 λ^{m−1} + · · · + c1 λ + c0 , a monic polynomial. Now consider all such monic polynomials, q, such that q (A) = 0 and pick the one which has the smallest degree. This is called the minimal polynomial and will be denoted here by p (λ) . This proves the lemma. ¥
Av = µv.
p (λ) = (λ − µ) k (λ)
where k (λ) is a polynomial having coefficients in F. Since p has minimal degree, k (A) ≠ 0 and so there exists a vector, u ≠ 0, such that k (A) u ≡ v ≠ 0. But then
The next claim about the existence of an eigenvalue follows from the fundamental
theorem of algebra and what was just shown.
It has been shown that every zero of p (λ) is an eigenvalue which has an eigenvector
in V . Now suppose µ is an eigenvalue which has an eigenvector in V so that Av = µv
for some v ∈ V, v 6= 0. Does it follow µ is a zero of p (λ)?
0 = p (A) v = p (µ) v
and so det (tI − BA) = pBA (t) = t^{n−m} det (tI − AB) = t^{n−m} pAB (t) . This proves the theorem. ¥
3.7 Exercises
1. Let M be an n × n matrix. Thus letting M x be defined by ordinary matrix
multiplication, it follows M ∈ L (Cn , Cn ) . Show that all the zeros of the mini-
mal polynomial are also zeros of the characteristic polynomial. Explain why this
requires the minimal polynomial to divide the characteristic polynomial. Thus
q (λ) = p (λ) k (λ) for some polynomial k (λ) where q (λ) is the characteristic poly-
nomial. Now explain why q (M ) = 0. That every n × n matrix satisfies its char-
acteristic polynomial is the Cayley Hamilton theorem. Can you extend this to a
result about L ∈ L (V, V ) for V an n dimensional real or complex vector space?
2. Give examples of subspaces of Rn and examples of subsets of Rn which are not
subspaces.
4. Let L ∈ L (V, W ) . Then L (V ) denotes those vectors w in W such that w = Lv for some v ∈ V. Show L (V ) is a subspace.
5. Let L ∈ L (V, W ) and suppose {w1 , · · ·, wk } are linearly independent and that
Lzi = wi . Show {z1 , · · ·, zk } is also linearly independent.
9. Let M (t) = (b1 (t) , · · ·, bn (t)) where each bk (t) is a column vector whose component functions are differentiable functions. For such a column vector,
b (t) = (b1 (t) , · · ·, bn (t))^T ,
define
b′ (t) ≡ (b′1 (t) , · · ·, b′n (t))^T
Show
det (M (t))′ = Σ_{i=1}^{n} det Mi (t)
where Mi (t) has all the same columns as M (t) except the ith column is replaced with b′i (t).
where ei is the vector which has a 1 in the ith place and zeros elsewhere.
11. Let {w1 , · · ·, wn } be a basis for the vector space, V. Show id, the identity map is
given by
id = Σ_{ij} δ ij wi wj
This is also often denoted by (x, y) and is called an inner product. I will use either
notation.
Notice how you put the conjugate on the entries of the vector, y. It makes no
difference if the vectors happen to be real vectors but with complex vectors you must
do it this way. The reason for this is that when you take the dot product of a vector
with itself, you want to get the square of the length of the vector, a positive number.
Placing the conjugate on the components of y in the above definition assures this will
take place. Thus
x · x = Σ_j xj \overline{xj} = Σ_j |xj |² ≥ 0.
If you didn’t place a conjugate as in the above definition, things wouldn’t work out correctly. For example,
(1 + i)² + 2² = 4 + 2i
and this is not a positive number.
The following properties of the dot product follow immediately from the definition
and you should verify each of them.
Properties of the dot product:
1. u · v = \overline{v · u}.
2. If a, b are numbers and u, v, z are vectors then (au + bv) · z = a (u · z) + b (v · z) .
3. u · u ≥ 0 and it equals 0 if and only if u = 0.
(x · αy) = \overline{(αy · x)} = \overline{α} \overline{(y · x)} = \overline{α} (x · y)
Equality holds in this inequality if and only if one vector is a multiple of the other.
θ (x · y) = |(x · y)|
Consider p (t) ≡ ((x + t\overline{θ}y) · (x + t\overline{θ}y)) where t ∈ R. Then from the above list of properties of the dot product,
0 ≤ p (t) = (x · x) + tθ (x · y) + t\overline{θ} (y · x) + t² (y · y)
= (x · x) + tθ (x · y) + t\overline{θ (x · y)} + t² (y · y)
= (x · x) + 2t Re (θ (x · y)) + t² (y · y)
= (x · x) + 2t |(x · y)| + t² (y · y) (3.34)
and this must hold for all t ∈ R. Therefore, if (y · y) = 0 it must be the case that
|(x · y)| = 0 also since otherwise the above inequality would be violated. Therefore, in
this case,
|(x · y)| ≤ (x · x)^{1/2} (y · y)^{1/2} .
On the other hand, if (y · y) 6= 0, then p (t) ≥ 0 for all t means the graph of y = p (t) is
a parabola which opens up and it either has exactly one real zero in the case its vertex
touches the t axis or it has no real zeros. From the quadratic formula this happens
exactly when
4 |(x · y)|² − 4 (x · x) (y · y) ≤ 0
which is equivalent to 3.33.
It is clear from a computation that if one vector is a scalar multiple of the other then equality holds in 3.33. Conversely, suppose equality does hold. Then this is equivalent to saying 4 |(x · y)|² − 4 (x · x) (y · y) = 0 and so from the quadratic formula, there exists one real zero to p (t) = 0. Call it t0 . Then
p (t0 ) ≡ ((x + t0 \overline{θ}y) · (x + t0 \overline{θ}y)) = |x + t0 \overline{θ}y|² = 0
and so x = −t0 \overline{θ}y, a multiple of y.
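The Cauchy Schwarz inequality for the complex dot product can be spot-checked numerically (random vectors; `np.vdot` conjugates its first argument, so x · y = Σ xj conj(yj) is `np.vdot(y, x)`).

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(5):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    lhs = abs(np.vdot(y, x))                      # |(x . y)|
    rhs = np.linalg.norm(x) * np.linalg.norm(y)   # (x.x)^(1/2) (y.y)^(1/2)
    print(lhs <= rhs + 1e-12)                     # True each time

# Equality when one vector is a multiple of the other:
z = (2 - 3j) * x
print(np.isclose(abs(np.vdot(z, x)), np.linalg.norm(x) * np.linalg.norm(z)))  # True
```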
Theorem 3.8.6 For length defined in Definition 3.8.5, the following hold.
Proof: The first two claims are left as exercises. To establish the third,
|z + w|² ≡ ((z + w) · (z + w))
= z · z + w · w + w · z + z · w
= |z|² + |w|² + 2 Re (w · z)
≤ |z|² + |w|² + 2 |w · z|
≤ |z|² + |w|² + 2 |w| |z| = (|z| + |w|)² .
which implies
||z|| − ||w|| ≤ ||z − w||
and now switching z and w, yields
ab ≤ a^p / p + b^{p′} / p′ .
Σ_{i=1}^{n} (|xi | / A) (|yi | / B) ≤ Σ_{i=1}^{n} [ (1/p) (|xi | / A)^p + (1/p′) (|yi | / B)^{p′} ]
= (1 / (p A^p )) Σ_{i=1}^{n} |xi |^p + (1 / (p′ B^{p′})) Σ_{i=1}^{n} |yi |^{p′}
= 1/p + 1/p′ = 1
and so
Σ_{i=1}^{n} |xi | |yi | ≤ AB = ( Σ_{i=1}^{n} |xi |^p )^{1/p} ( Σ_{i=1}^{n} |yi |^{p′} )^{1/p′} .
Proof: It is obvious that ||·||p does indeed satisfy most of the norm axioms. The only one that is not clear is the triangle inequality. To save notation write ||·|| in place of ||·||p in what follows. Note also that p/p′ = p − 1. Then using the Holder inequality,
||x + y||^p = Σ_{i=1}^{n} |xi + yi |^p
≤ Σ_{i=1}^{n} |xi + yi |^{p−1} |xi | + Σ_{i=1}^{n} |xi + yi |^{p−1} |yi |
= Σ_{i=1}^{n} |xi + yi |^{p/p′} |xi | + Σ_{i=1}^{n} |xi + yi |^{p/p′} |yi |
≤ ( Σ_{i=1}^{n} |xi + yi |^p )^{1/p′} [ ( Σ_{i=1}^{n} |xi |^p )^{1/p} + ( Σ_{i=1}^{n} |yi |^p )^{1/p} ]
= ||x + y||^{p/p′} ( ||x||p + ||y||p )
so dividing by ||x + y||^{p/p′} , it follows
||x + y||^p ||x + y||^{−p/p′} = ||x + y|| ≤ ||x||p + ||y||p
( p − p/p′ = p (1 − 1/p′) = p (1/p) = 1 ). This proves the theorem. ¥
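Both the Holder and Minkowski (triangle) inequalities are easy to spot-check numerically. Below p = 3 and the conjugate exponent q = p/(p − 1) = 3/2, applied to random vectors.

```python
import numpy as np

rng = np.random.default_rng(3)
p = 3.0
q = p / (p - 1.0)                 # the conjugate exponent p', 1/p + 1/q = 1
x, y = rng.standard_normal(6), rng.standard_normal(6)

def pnorm(v, r):
    # ||v||_r = (sum |v_i|^r)^(1/r)
    return np.sum(np.abs(v) ** r) ** (1.0 / r)

holder_lhs = np.sum(np.abs(x) * np.abs(y))
print(holder_lhs <= pnorm(x, p) * pnorm(y, q) + 1e-12)        # True (Holder)
print(pnorm(x + y, p) <= pnorm(x, p) + pnorm(y, p) + 1e-12)   # True (Minkowski)
```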
It only remains to prove Lemma 3.8.9.
Proof of the lemma: Let p′ = q to save on notation and consider the graph of the increasing function x = t^{p−1} , equivalently t = x^{q−1} . The area ab of the rectangle with sides a and b is no larger than the area between this curve and the t axis for t ∈ [0, a] together with the area between the curve and the x axis for x ∈ [0, b] . Therefore,
ab ≤ ∫_0^a t^{p−1} dt + ∫_0^b x^{q−1} dx = a^p / p + b^q / q.
Note equality occurs when a^p = b^q .
Alternate proof of the lemma: Let
f (t) ≡ (1/p) (at)^p + (1/q) (b/t)^q , t > 0.
You see right away it is decreasing for a while, having an asymptote at t = 0, and then it reaches a minimum and increases from then on. Take its derivative:
f ′ (t) = (at)^{p−1} a + (b/t)^{q−1} (−b/t²)
This equals zero when
t^{p+q} = b^q / a^p . (3.42)
Thus
t = b^{q/(p+q)} / a^{p/(p+q)}
and so at this value of t,
at = (ab)^{q/(p+q)} , b/t = (ab)^{p/(p+q)} .
Therefore, the minimum value of f is
(1/p) ( (ab)^{q/(p+q)} )^p + (1/q) ( (ab)^{p/(p+q)} )^q = (ab)^{pq/(p+q)}
but recall 1/p + 1/q = 1 and so pq/ (p + q) = 1. Thus the minimum value of f is ab.
Letting t = 1, this shows
ab ≤ a^p / p + b^q / q.
Note that equality occurs when the minimum value happens for t = 1 and this indicates
from 3.42 that ap = bq . This proves the lemma. ¥
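Young's inequality ab ≤ a^p/p + b^q/q and its equality case a^p = b^q can be confirmed on random positive numbers (exponent chosen arbitrarily).

```python
import numpy as np

rng = np.random.default_rng(4)
p = 2.5
q = p / (p - 1.0)                 # conjugate exponent, 1/p + 1/q = 1
for _ in range(5):
    a, b = rng.uniform(0.1, 5.0, size=2)
    print(a * b <= a ** p / p + b ** q / q + 1e-12)   # True each time

a = 1.7
b = a ** (p / q)                  # forces a^p = b^q, the equality case
print(np.isclose(a * b, a ** p / p + b ** q / q))     # True
```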
where the denominator is not equal to zero because the x_j form a basis and so

x_{k+1} ∉ span(x_1, ···, x_k) = span(u_1, ···, u_k).

Thus by induction,

u_{k+1} ∈ span(u_1, ···, u_k, x_{k+1}) = span(x_1, ···, x_k, x_{k+1}).

Also, x_{k+1} ∈ span(u_1, ···, u_k, u_{k+1}), which is seen easily by solving 3.43 for x_{k+1}, and it follows

span(x_1, ···, x_k, x_{k+1}) = span(u_1, ···, u_k, u_{k+1}).

If l ≤ k, then denoting by C the scalar | x_{k+1} − Σ_{j=1}^{k} (x_{k+1} · u_j) u_j |^{−1},

(u_{k+1} · u_l) = C [ (x_{k+1} · u_l) − Σ_{j=1}^{k} (x_{k+1} · u_j)(u_j · u_l) ]
= C [ (x_{k+1} · u_l) − Σ_{j=1}^{k} (x_{k+1} · u_j) δ_{lj} ]
= C [ (x_{k+1} · u_l) − (x_{k+1} · u_l) ] = 0.
The vectors {u_j}_{j=1}^{n} generated in this way are therefore an orthonormal basis because each vector has unit length.
The process by which these vectors were generated is called the Gram Schmidt process.
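The Gram Schmidt process translates directly into code. The sketch below (assuming NumPy; the function name is mine) orthonormalizes the columns of a matrix exactly as in the construction above and checks that the result is orthonormal.

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X (assumed linearly independent)."""
    U = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        v = X[:, k].copy()
        for j in range(k):
            # subtract the component of x_k along each previous u_j
            v -= (X[:, k] @ U[:, j]) * U[:, j]
        # the norm is nonzero because x_{k+1} is not in span(u_1, ..., u_k)
        U[:, k] = v / np.linalg.norm(v)
    return U

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U = gram_schmidt(X)
print(np.allclose(U.T @ U, np.eye(3)))   # orthonormal columns
```

By construction each u_{k+1} lies in span(x_1, ···, x_{k+1}), matching the induction in the proof.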
Theorem 3.8.14 Let H be a finite dimensional inner product space and let L ∈ L(H, F). Then there exists a unique z ∈ H such that for all x ∈ H,

Lx = (x · z).

Proof: By the Gram Schmidt process, there exists an orthonormal basis for H, {e_1, ···, e_n}. First note that if x is arbitrary, there exist unique scalars x_i such that

x = Σ_{i=1}^{n} x_i e_i.

For uniqueness, if z_1 and z_2 both represent L in this way, then

(x · (z_1 − z_2)) = 0

for all x and in particular for x = z_1 − z_2, which requires z_1 = z_2. This proves the theorem. ¥
Now with this theorem, it becomes easy to define something called the adjoint of a linear operator. Let L ∈ L(H_1, H_2) where H_1 and H_2 are finite dimensional inner product spaces. Then letting (·)_i denote the inner product in H_i,

x → (Lx · y)_2

is in L(H_1, F) and so from Theorem 3.8.14 there exists a unique element of H_1, denoted by L*y, such that for all x ∈ H_1,

(Lx · y)_2 = (x · L*y)_1,

and

(x · αL*y + βL*z)_1 = α (x · L*y)_1 + β (x · L*z)_1.

Since

(x · L*(αy + βz))_1 = (x · αL*y + βL*z)_1

for all x, this requires

L*(αy + βz) = αL*y + βL*z.

In simple words, when you take it across the dot, you put a star on it. More precisely, here is the definition.

I will not bother to place subscripts on the symbol for the dot product in the future. It will be clear from context which inner product is meant.
This does the first claim. The second part was discussed earlier when the adjoint was
defined.
You should verify this is so from the definition of the usual inner product on Fk . The
following little proposition is useful.
Proposition 3.8.18 Suppose A is an m × n matrix where m ≤ n. Also suppose

det(AA*) ≠ 0.

Then A has m linearly independent rows and m linearly independent columns.

Proof: Since det(AA*) ≠ 0, it follows the m × m matrix AA* has m independent rows. If this is not true of A, then there exists x, a 1 × m matrix, such that

xA = 0.

Hence

xAA* = 0,

and this contradicts the independence of the rows of AA*. Thus the row rank of A equals m, and by Corollary 3.5.20 this implies the column rank of A also equals m. This proves the proposition. ¥
where {v_1, ···, v_n} is a basis. Of course different bases will yield different matrices (l_{ij}). Schur's theorem gives the existence of a basis in an inner product space such that (l_{ij}) is particularly simple.
Proof: If dim (H) = 1 let H = span (w) where |w| = 1. Then Lw = kw for some
k. Then
L = kww
because by definition, ww (w) = w. Therefore, the theorem holds if H is 1 dimensional.
Now suppose the theorem holds for n − 1 = dim (H) . By Theorem 3.6.4 and the
assumption, there exists wn , an eigenvector for L∗ . Dividing by its length, it can be
assumed |wn | = 1. Say L∗ wn = µwn . Using the Gram Schmidt process, there exists an
orthonormal basis for H of the form {v1 , · · ·, vn−1 , wn } . Then
which shows
L : H1 ≡ span (v1 , · · ·, vn−1 ) → span (v1 , · · ·, vn−1 ) .
Denote by L1 the restriction of L to H1 . Since H1 has dimension n − 1, the induction
hypothesis yields an orthonormal basis, {w1 , · · ·, wn−1 } for H1 such that
L_1 = Σ_{j=1}^{n−1} Σ_{i=1}^{j} c_{ij} w_i w_j. (3.44)
has the property that its dot product with wn is 0 so in particular, this is true for the
vectors {w1 , · · ·, wn−1 }. Now define cin to be the scalars satisfying
Lw_n ≡ Σ_{i=1}^{n} c_{in} w_i (3.45)

and let

B ≡ Σ_{j=1}^{n} Σ_{i=1}^{j} c_{ij} w_i w_j.
Then by 3.45,

Bw_n = Σ_{j=1}^{n} Σ_{i=1}^{j} c_{ij} w_i δ_{nj} = Σ_{i=1}^{n} c_{in} w_i = Lw_n.

If 1 ≤ k ≤ n − 1,

Bw_k = Σ_{j=1}^{n} Σ_{i=1}^{j} c_{ij} w_i δ_{kj} = Σ_{i=1}^{k} c_{ik} w_i
det (λI − C)
where C is the upper triangular matrix which has cij for i ≤ j and zeros elsewhere.
This equals 0 if and only if λ is one of the diagonal entries, one of the ckk . This proves
the theorem. ¥
There is a technical assumption in the above theorem about the eigenvalues of re-
strictions of L∗ being in F, the field of scalars. If F = C this is no restriction. There is
also another situation in which F = R for which this will hold.
Proof: It suffices to verify the two linear transformations are equal on {w_1, ···, w_n}. Then

(w_p · (w_i w_j)* w_k) ≡ ((w_i w_j) w_p · w_k) = (w_i δ_{jp} · w_k) = δ_{jp} δ_{ik},

(w_p · (w_j w_i) w_k) = (w_p · w_j δ_{ik}) = δ_{ik} δ_{jp}.

Since w_p is arbitrary, it follows from the properties of the inner product that

(x · (w_i w_j)* w_k) = (x · (w_j w_i) w_k)

for all x ∈ H and hence (w_i w_j)* w_k = (w_j w_i) w_k. Since w_k is arbitrary, this proves the lemma. ¥
Proof: Let {w_1, ···, w_n} be an orthonormal basis for H and let (l_{ij}) be the matrix of L with respect to this orthonormal basis. Thus

L = Σ_{ij} l_{ij} w_i w_j,  id = Σ_{ij} δ_{ij} w_i w_j.
Denote by M_L the matrix whose ij-th entry is l_{ij}. Then by definition of what is meant by the determinant of a linear transformation, det(λ id − L) = det(λI − M_L), so by the above lemma, the eigenvalues are real and are therefore in the field of scalars.
Now with this lemma, the following theorem is obtained. This is another major
theorem. It is equivalent to the theorem in matrix theory which states every self adjoint
matrix can be diagonalized.
The scalars are the eigenvalues and wk is an eigenvector for λk for each k.
where c_{ij} = 0 if i > j. Now using Lemma 3.8.21 and Proposition 3.8.16 along with the assumption that L is self adjoint,

L = Σ_{j=1}^{n} Σ_{i=1}^{n} c_{ij} w_i w_j = L* = Σ_{j=1}^{n} Σ_{i=1}^{n} c̄_{ij} w_j w_i = Σ_{i=1}^{n} Σ_{j=1}^{n} c̄_{ji} w_i w_j.

If i < j, then this shows c_{ij} = c̄_{ji}, and the second number equals zero because j > i. Thus c_{ij} = 0 if i < j, and it is already known that c_{ij} = 0 if i > j. Therefore, let λ_k = c_{kk} and the above reduces to

L = Σ_{j=1}^{n} λ_j w_j w_j = Σ_{j=1}^{n} λ̄_j w_j w_j,

which shows the λ_j are real and all the w_k are eigenvectors. This proves the theorem. ¥

3.9. POLAR DECOMPOSITIONS 65
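Numerically, the spectral representation L = Σ_j λ_j w_j w_j of a self adjoint operator is exactly what a Hermitian eigensolver produces; it is also the representation used for the operator U in the polar decomposition below. A sketch assuming NumPy, where the dyadic w_j w_j is realized as an outer product:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
L = B + B.conj().T                       # a self adjoint (Hermitian) matrix
lam, W = np.linalg.eigh(L)               # real eigenvalues, orthonormal eigenvector columns
# rebuild L as sum_j lambda_j w_j w_j* (the w_j w_j of the text)
rebuilt = sum(lam[j] * np.outer(W[:, j], W[:, j].conj()) for j in range(4))
print(np.allclose(rebuilt, L), np.all(np.isreal(lam)))
```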
Lemma 3.9.1 Suppose R ∈ L (X, Y ) where X, Y are finite dimensional inner prod-
uct spaces and R preserves distances,
|Rx|Y = |x|X .
Then R∗ R = I.
Proof: Since R preserves distances, |Rx| = |x| for every x. Therefore from the
axioms of the dot product,
|x|² + |y|² + (x · y) + (y · x) = |x + y|²
= (R(x + y) · R(x + y))
= (Rx · Rx) + (Ry · Ry) + (Rx · Ry) + (Ry · Rx)
= |x|² + |y|² + (R*Rx · y) + (y · R*Rx).
Then

((R*Rx − x) · y) = 0

for all x, y because the given x, y were arbitrary. Let y = R*Rx − x to conclude that for all x,

R*Rx − x = 0.
R*R = RR* = id.

F = RU,  U = U* (U is Hermitian),  U² = F*F,  R*R = I,

λ_i (v_i · v_i) = (F*F v_i · v_i) = (F v_i · F v_i) ≥ 0.
Let

U ≡ Σ_{i=1}^{n} λ_i^{1/2} v_i v_i.
Let {U x1 , · · ·, U xr } be an orthonormal basis for U (X) . Extend this using the Gram
Schmidt procedure to an orthonormal basis for X,
{U x1 , · · ·, U xr , yr+1 , · · ·, yn } .
{F x1 , · · ·, F xr , zr+1 , · · ·, zm } .
Then

Rx ≡ Σ_{k=1}^{r} c_k F x_k + Σ_{k=r+1}^{n} d_k z_k. (3.46)
à à r
! r
!
X X
∗
= F F bk xk − x · bk xk − x
k=1 k=1
à à r ! à r !!
X X
= U2 bk xk − x · bk xk − x
k=1 k=1
à à r ! à r !!
X X
= U bk xk − x ·U bk xk − x
k=1 k=1
à r r
!
X X
= bk U xk − U x · bk U xk − U x =0
k=1 k=1
by 3.47.
Since |Rx| = |x| , it follows R∗ R = I from Lemma 3.9.1. This proves the theorem.
¥
The following corollary follows as a simple consequence of this theorem. It is called
the left polar decomposition.
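One common way to compute a right polar decomposition F = RU numerically goes through the singular value decomposition rather than the constructive proof above; this is a sketch under that assumption (NumPy assumed). With F = WΣVᴴ, take U = VΣVᴴ = (F*F)^{1/2} and R = WVᴴ.

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((3, 3))
W, s, Vh = np.linalg.svd(F)            # F = W diag(s) Vh
U = Vh.conj().T @ np.diag(s) @ Vh      # U = (F*F)^(1/2): self adjoint, eigenvalues >= 0
R = W @ Vh                             # R preserves distances: R*R = I
print(np.allclose(R @ U, F))
print(np.allclose(R.conj().T @ R, np.eye(3)))
print(np.allclose(U, U.conj().T))
```

The design choice mirrors the proof: U is built from the nonnegative square roots of the eigenvalues of F*F, and R is the distance-preserving factor.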
Hence

Σ_i (R*w_j · w_i)(Rw_i · w_l) = δ_{jl} (3.48)

because

id = Σ_{jl} δ_{jl} w_l w_j.

Thus letting M be the matrix whose ij-th entry is (Rw_i · w_j), det(R) is defined as det(M). Since (R*w_j · w_i) = (M*)_{ji}, where M* denotes the conjugate transpose of M, 3.48 says

Σ_i (M*)_{ji} M_{il} = δ_{jl}.

It follows

1 = det(M) det(M*) = |det(M)|²

since det(M*) is the complex conjugate of det(M).
3.10 Exercises
1. For u, v vectors in F³, define the product u ∗ v ≡ u_1 v̄_1 + 2u_2 v̄_2 + 3u_3 v̄_3. Show the axioms for a dot product all hold for this funny product. Prove

|u ∗ v| ≤ (u ∗ u)^{1/2} (v ∗ v)^{1/2}.
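The weighted Cauchy Schwarz inequality in Problem 1 can be illustrated numerically before proving it; the sketch below (NumPy assumed, real vectors for simplicity) tests it on random samples.

```python
import numpy as np

w = np.array([1.0, 2.0, 3.0])          # the weights of the "funny" product

def star(u, v):
    # u * v = u_1 v_1 + 2 u_2 v_2 + 3 u_3 v_3 (real case)
    return np.sum(w * u * v)

rng = np.random.default_rng(10)
ok = True
for _ in range(500):
    u = rng.standard_normal(3)
    v = rng.standard_normal(3)
    if abs(star(u, v)) > np.sqrt(star(u, u)) * np.sqrt(star(v, v)) + 1e-10:
        ok = False
print(ok)
```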
2. Suppose you have a real or complex vector space. Can it always be considered
as an inner product space? What does this mean about Schur’s theorem? Hint:
Start with a basis and decree the basis is orthonormal. Then define an inner
product accordingly.
3. Show that (a · b) = (1/4) [ |a + b|² − |a − b|² ].

4. Prove from the axioms of the dot product the parallelogram identity,

|a + b|² + |a − b|² = 2|a|² + 2|b|².
5. Suppose f, g are two Darboux Stieltjes integrable functions defined on [0, 1]. Define

(f · g) = ∫_0^1 f(x) g(x) dF.

Show this dot product satisfies the axioms for the inner product. Explain why the Cauchy Schwarz inequality continues to hold in this context and state the Cauchy Schwarz inequality in terms of integrals. Does the Cauchy Schwarz inequality still hold if

(f · g) = ∫_0^1 f(x) g(x) p(x) dF,

where p(x) is a given nonnegative function? If so, what would it be in terms of integrals?
6. If A is an n × n matrix considered as an element of L(Cⁿ, Cⁿ) by ordinary matrix multiplication, use the inner product in Cⁿ to show that (A*)_{ij} = Ā_{ji}. In words, the adjoint is the transpose of the conjugate.
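Problem 6 can be illustrated numerically: for matrix multiplication on Cⁿ the adjoint is the conjugate transpose. A sketch (NumPy assumed), using the inner product (u · v) = Σ_i u_i v̄_i; note that `np.vdot(a, b)` conjugates its first argument, so (u · v) is `np.vdot(v, u)`.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_star = A.conj().T                    # (A*)_{ij} = conj(A_{ji})

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
# defining property of the adjoint: (Ax . y) = (x . A*y)
lhs = np.vdot(y, A @ x)                # (Ax . y)
rhs = np.vdot(A_star @ y, x)           # (x . A*y)
print(np.isclose(lhs, rhs))
```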
7. A symmetric matrix is a real n × n matrix A which satisfies Aᵀ = A. Show every symmetric matrix is self adjoint and that there exists an orthonormal set of real vectors {x_1, ···, x_n} such that

A = Σ_k λ_k x_k x_k.

That is, with respect to this basis the matrix of A is diagonal. Hint: This is a harder version of what was done to prove Theorem 3.8.23. Use Schur's theorem to write A = Σ_{j=1}^{n} Σ_{i=1}^{n} B_{ij} w_i w_j where (B_{ij}) is an upper triangular matrix. Then use the condition that A is normal and eventually get an equation

Σ_k B_{ik} B_{lk} = Σ_k B_{ki} B_{kl}.

Next let i = l and consider first l = 1, then l = 2, etc. If you are careful, you will find B_{ij} = 0 unless i = j.
where the vectors {w1 , · · ·, wn } are an orthonormal set. Show that A must be
normal. In other words, you can’t represent A ∈ L (H, H) in this very convenient
way unless it is normal.
10. If L is a self adjoint operator defined on an inner product space H such that L has only nonnegative eigenvalues, explain how to define L^{1/n} and show why what you come up with is indeed the nth root of the operator. For a self adjoint operator L on an inner product space, can you define

sin(L) ≡ Σ_{k=0}^{∞} (−1)^k L^{2k+1} / (2k + 1)! ?

What does the infinite series mean? Can you make some sense of this using the representation of L given in Theorem 3.8.23?
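The natural way to make sense of L^{1/n} or sin(L) for self adjoint L is through the representation of Theorem 3.8.23: apply the scalar function to each eigenvalue λ_j. A sketch (NumPy assumed; the function name is mine):

```python
import numpy as np

def operator_function(L, f):
    """Apply a scalar function to a self adjoint matrix: sum_j f(lam_j) w_j w_j*."""
    lam, W = np.linalg.eigh(L)
    return W @ np.diag(f(lam)) @ W.conj().T

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4))
L = B @ B.T + np.eye(4)                  # self adjoint with positive eigenvalues
S = operator_function(L, np.sqrt)        # a square root of L
sinL = operator_function(L, np.sin)      # sin(L): the power series summed eigenvalue-wise
print(np.allclose(S @ S, L))
print(np.allclose(sinL, sinL.T))         # still self adjoint
```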
11. If L is a self adjoint linear transformation in L(H, H), for H an inner product space, which has all eigenvalues nonnegative, show the square root is unique.
12. Using Problem 11 show F ∈ L (H, H) for H an inner product space is normal if
and only if RU = U R where F = RU is the right polar decomposition defined
above. Recall R preserves distances and U is self adjoint. What is the geometric
significance of a linear transformation being normal?
13. Suppose you have a basis {v_1, ···, v_n} in an inner product space X. The Grammian matrix is the n × n matrix whose ij-th entry is (v_i · v_j). Show this matrix is invertible. Hint: You might try to show that the inner product of two vectors, Σ_k a_k v_k and Σ_k b_k v_k, has something to do with the Grammian.
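The hint in Problem 13 can be seen numerically: the inner product of Σ a_k v_k and Σ b_k v_k is aᵀG b, where G is the Grammian. A sketch (NumPy assumed, real case):

```python
import numpy as np

rng = np.random.default_rng(6)
V = rng.standard_normal((5, 4))        # columns: a generically independent set v_1..v_4
G = V.T @ V                            # Grammian: G_{ij} = (v_i . v_j)
a = rng.standard_normal(4)
b = rng.standard_normal(4)
lhs = (V @ a) @ (V @ b)                # (sum_k a_k v_k . sum_k b_k v_k)
print(np.isclose(lhs, a @ G @ b))
print(np.linalg.det(G) != 0.0)         # invertible when the v_k are independent
```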
72 SEQUENCES
Theorem 4.1.5 Suppose {an } and {bn } are vector valued sequences and that
Also,

lim_{n→∞} (a_n · b_n) = (a · b) (4.2)

where here

x_k = (x_k^1, ···, x_k^n),  x = (x^1, ···, x^n).
so for n > n_ε,

||xa + yb − (xa_n + yb_n)|| < |x| · ε/(2(1 + |x| + |y|)) + |y| · ε/(2(1 + |x| + |y|)) ≤ ε.
Now consider the second. Let ε > 0 be given and choose n_1 such that if n ≥ n_1 then

|a_n − a| < 1.

For such n, it follows from the Cauchy Schwarz inequality and properties of the inner product that, for n ≥ n_ε,

|a_n · b_n − a · b| ≤ (|a| + 1) |b_n − b| + |b| |a_n − a|
< (|a| + 1) · ε/(2(|a| + 1)) + |b| · ε/(2(|b| + 1)) ≤ ε.
This proves 4.2. The claim 4.3 is left for you to do.

Finally consider the last claim. If 4.4 holds, then from the definition of distance in Fⁿ,

lim_{k→∞} |x − x_k| ≡ lim_{k→∞} ( Σ_{j=1}^{n} (x^j − x_k^j)² )^{1/2} = 0.

On the other hand, if lim_{k→∞} |x − x_k| = 0, then since |x_k^j − x^j| ≤ |x − x_k|, it follows from the squeezing theorem that

lim_{k→∞} |x_k^j − x^j| = 0.
Theorem 4.1.7 Let {x_n} be a sequence of real numbers, suppose each x_n ≤ l (respectively ≥ l), and lim_{n→∞} x_n = x. Then x ≤ l (respectively ≥ l). More generally, suppose {x_n} and {y_n} are two sequences such that lim_{n→∞} x_n = x and lim_{n→∞} y_n = y. Then if x_n ≤ y_n for all n sufficiently large, x ≤ y.
Proof: Let ε > 0 be given. Then for n large enough,
l ≥ xn > x − ε
and so
l + ε ≥ x.
Since ε > 0 is arbitrary, this requires l ≥ x. The other case is entirely similar or else
you could consider −l and {−xn } and apply the case just considered.
Consider the last claim. There exists N such that if n ≥ N then xn ≤ yn and
|x − xn | + |y − yn | < ε/2.
Then considering n > N in what follows,
x − y ≤ xn + ε/2 − (yn − ε/2) = xn − yn + ε ≤ ε.
Since ε was arbitrary, it follows
x − y ≤ 0.
This proves the theorem. ¥
Theorem 4.1.8 Let {x_n} be a sequence of vectors, suppose each ||x_n|| ≤ l (respectively ≥ l), and lim_{n→∞} x_n = x. Then ||x|| ≤ l (respectively ≥ l). More generally, suppose {x_n} and {y_n} are two sequences such that lim_{n→∞} x_n = x and lim_{n→∞} y_n = y. Then if ||x_n|| ≤ ||y_n|| for all n sufficiently large, ||x|| ≤ ||y||.
Proof: It suffices to just prove the second part since the first part is similar. By
the triangle inequality,
|||xn || − ||x||| ≤ ||xn − x||
and for large n this is given to be small. Thus {||xn ||} converges to ||x|| . Similarly
{||yn ||} converges to ||y||. Now the desired result follows from Theorem 4.1.7. This
proves the theorem. ¥
Consequently, if k ≤ l,

a_k ≤ a_l ≤ b_l ≤ b_k. (4.6)

Now define

c ≡ sup{ a_l : l = 1, 2, ··· }.

By the first inequality in 4.5, and by 4.6,

a_k ≤ c = sup{ a_l : l = k, k + 1, ··· } ≤ b_k (4.7)

for each k = 1, 2, ···. Thus c ∈ I_k for every k, and this proves the lemma. ¥ If this went too fast, the reason for the last inequality in 4.7 is that from 4.6, b_k is an upper bound to { a_l : l = k, k + 1, ··· }. Therefore, it is at least as large as the least upper bound.
and xn2 ∈ I2 , n3 such that n3 > n2 and xn3 ∈ I3 , etc. (This can be done because in each
case the intervals contained xn for infinitely many values of n.) By the nested interval
lemma there exists a point, c contained in all these intervals. Furthermore,
|xnk − c| < (b − a) 2−k
and so limk→∞ xnk = c ∈ [a, b] . This proves the theorem. ¥
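The halving argument in the proof can be imitated computationally. In the sketch below (the sequence sin n is a hypothetical example) each stage keeps a half interval still containing "infinitely many" terms, approximated here by the fuller half, and picks one term per stage with increasing indices.

```python
import math

x = [math.sin(n) for n in range(1, 5001)]   # a bounded sequence in [-1, 1]
a, b = -1.0, 1.0
remaining = list(enumerate(x))              # (index, value) pairs, indices increasing
sub = []
for _ in range(10):
    m = (a + b) / 2.0
    left = [(n, v) for n, v in remaining if v <= m]
    right = [(n, v) for n, v in remaining if v > m]
    # keep a half still containing "infinitely many" terms (here: the fuller half)
    if len(left) >= len(right):
        remaining, b = left, m
    else:
        remaining, a = right, m
    sub.append(remaining[0][1])             # next subsequence term, trapped in [a, b]
    remaining = remaining[1:]               # later picks must use larger indices
print(b - a)                                # 2 / 2**10; later terms differ by at most this
```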
Theorem 4.3.2 The intersection of any finite collection of open sets is open.
The union of any collection of open sets is open. The intersection of any collection of
closed sets is closed and the union of any finite collection of closed sets is closed.
Proof: To see that any union of open sets is open, note that every point of the
union is in at least one of the open sets. Therefore, it is an interior point of that set
and hence an interior point of the entire union.
Now let {U_1, ···, U_m} be some open sets and suppose p ∈ ∩_{k=1}^{m} U_k. Then there exists r_k > 0 such that B(p, r_k) ⊆ U_k. Let 0 < r ≤ min(r_1, r_2, ···, r_m). Then B(p, r) ⊆ ∩_{k=1}^{m} U_k and so the finite intersection is open. Note that if the finite intersection is empty, there is nothing to prove because it is certainly true in this case that every point in the intersection is an interior point because there aren't any such points.

Suppose {H_1, ···, H_m} is a finite set of closed sets. Then ∪_{k=1}^{m} H_k is closed if its complement is open. However, from DeMorgan's laws,

( ∪_{k=1}^{m} H_k )^C = ∩_{k=1}^{m} H_k^C,

a finite intersection of open sets, which is open by what was just shown.

Next let C be some collection of closed sets. Then

( ∩C )^C = ∪{ H^C : H ∈ C },

a union of open sets which is therefore open by the first part of the proof. Thus ∩C is closed. This proves the theorem. ¥
Next there is the concept of a limit point which gives another way of characterizing
closed sets.
Definition 4.3.3 Let A be any nonempty set and let x be a point. Then x is
said to be a limit point of A if for every r > 0, B (x, r) contains a point of A which is
not equal to x.
Example 4.3.4 Consider A = B (x, δ) , an open ball in a normed vector space. Then
every point of B (x, δ) is a limit point. There are more general situations than normed
vector spaces in which this assertion is false.
Proof: Suppose first a is a limit point of A. There exists a1 ∈ B (a, 1) ∩ A such that
a1 6= a. Now supposing distinct points, a1 , · · ·, an have been chosen such that none are
equal to a and for each k ≤ n, ak ∈ B (a, 1/k) , let
0 < r_{n+1} < min{ 1/(n + 1), ||a − a_1||, ···, ||a − a_n|| }.
Then there exists an+1 ∈ B (a, rn+1 ) ∩ A with an+1 6= a. Because of the definition of
rn+1 , an+1 is not equal to any of the other ak for k < n+1. Also since ||a − am || < 1/m,
it follows limm→∞ am = a. Conversely, if there exists a sequence of distinct points of A
converging to a, then B (a, r) contains all an for n large enough. Thus B (a, r) contains
infinitely many points of A since all are distinct. Thus at least one of them is not equal
to a. This establishes the first part of the theorem.
Now consider the second claim. If A is closed then it is the complement of an open
set. Since AC is open, it follows that if a ∈ AC , then there exists δ > 0 such that
B (a, δ) ⊆ AC and so no point of AC can be a limit point of A. In other words, every
limit point of A must be in A. Conversely, suppose A contains all its limit points. Then
AC does not contain any limit points of A. It also contains no points of A. Therefore, if
a ∈ AC , since it is not a limit point of A, there exists δ > 0 such that B (a, δ) contains
no points of A different than a. However, a itself is not in A because a ∈ AC . Therefore,
B (a, δ) is entirely contained in AC . Since a ∈ AC was arbitrary, this shows every point
of AC is an interior point and so AC is open. This proves the theorem. ¥
Closed subsets of sequentially compact sets are sequentially compact.
Proof: Let H be a closed and bounded set in Fⁿ. Then H ⊆ B(0, r) for some r. Therefore, if x ∈ H, x = (x_1, ···, x_n), it must be that

( Σ_{i=1}^{n} |x_i|² )^{1/2} < r
However, this sequence cannot have any convergent subsequence because if k_{m_k} → k, then k ∈ B(0, m) ⊆ D(0, m) for large enough m, while k_{m_k} ∈ B(0, m)^C for all k large enough, and this is a contradiction because there can only be finitely many points of the sequence in B(0, m). If K is not closed, then it is missing a limit point. Say k_∞ is a limit point of K which is not in K. Pick k_m ∈ B(k_∞, 1/m) ∩ K. Then {k_m} converges to k_∞ and so every subsequence also converges to k_∞ by Theorem 4.1.6. Thus there is no point of K which is a limit of some subsequence of {k_m}, a contradiction. This proves the theorem. ¥
What are some examples of closed and bounded sets in a general normed vector
space and more specifically Fn ?
Then D(z, r) is closed and bounded. Also, let S(z, r) denote the set of points

{ w ∈ V : ||w − z|| = r }.

Then S(z, r) is closed and bounded. It follows that if V = Fⁿ, then these sets are sequentially compact.
||z − x|| ≥ ||z − y|| − ||y − x|| > ||x − y|| + r − ||x − y|| = r
Proof: Let ε = 1 in the definition of a Cauchy sequence and let n > n_1. Then from the definition,

||a_n − a_{n_1}|| < 1.

It follows that for all n > n_1,

||a_n|| < 1 + ||a_{n_1}||.

Therefore, for all n,

||a_n|| ≤ 1 + ||a_{n_1}|| + Σ_{k=1}^{n_1} ||a_k||.
Proof: Let ε > 0 be given and suppose a_n → a. Then from the definition of convergence, there exists n_ε such that if n > n_ε, it follows that

||a_n − a|| < ε/2.

Therefore, if m, n ≥ n_ε + 1, it follows that

||a_n − a_m|| ≤ ||a_n − a|| + ||a − a_m|| < ε/2 + ε/2 = ε,

showing that, since ε > 0 is arbitrary, {a_n} is a Cauchy sequence.
The following theorem is very useful. It is identical to an earlier theorem. All that
is required is to put things in bold face to indicate they are vectors.
Theorem 4.4.4 Suppose {an } is a Cauchy sequence in any normed vector space
and there exists a subsequence, {ank } which converges to a. Then {an } also converges
to a.
Proof: Let ε > 0 be given. There exists N such that if m, n > N, then
Definition 4.4.5 If V is a normed vector space having the property that every
Cauchy sequence converges, then V is called complete. It is also referred to as a Banach
space.
Theorem 4.5.2 Let {F_n}_{n=1}^{∞} be a sequence of closed sets in Fⁿ such that

|p − q| ≤ diam(F_k).
4.6 Exercises
1. For a nonempty set S in a normed vector space V, define a function

dist(x, S) ≡ inf{ ||x − y|| : y ∈ S }.

Show

|dist(x, S) − dist(y, S)| ≤ ||x − y||.
3. The interior of a set was defined above. Tell why the interior of a set is always an
open set. The interior of a set A is sometimes denoted by A0 .
4. Give an example of a set A whose interior is empty but whose closure is all of Rⁿ.
6. Give an example of a finite dimensional normed vector space where the field of
scalars is the rational numbers which is not complete.
7. Explain why as far as the theorems of this chapter are concerned, Cn is essentially
the same as R2n .
10. Suppose A ⊆ Rⁿ and z ∈ co(A). Thus z = Σ_{k=1}^{p} t_k a_k for t_k ≥ 0 and Σ_k t_k = 1. Show there exist n + 1 of the points {a_1, ···, a_p} such that z is a convex combination of these n + 1 points. Hint: Show that if p > n + 1, then the vectors {a_k − a_1}_{k=2}^{p} must be linearly dependent. Conclude from this the existence of scalars {α_i} such that Σ_{i=1}^{p} α_i a_i = 0. Now for s ∈ R, z = Σ_{k=1}^{p} (t_k + sα_k) a_k. Consider small s and adjust till one or more of the t_k + sα_k vanish. Now you are in the same situation as before but with only p − 1 of the a_k. Repeat the argument till you end up with only n + 1, at which time you can't repeat again.
11. Show that any uncountable set of points in Fn must have a limit point.
12. Let V be any finite dimensional vector space having a basis {v_1, ···, v_n}. For x ∈ V, let

x = Σ_{k=1}^{n} x_k v_k

so that the scalars x_k are the components of x with respect to the given basis. Define for x, y ∈ V

(x · y) ≡ Σ_{i=1}^{n} x_i ȳ_i.

Show this is a dot product for V satisfying all the axioms of a dot product presented earlier.
13. In the context of Problem 12 let |x| denote the norm of x which is produced by this inner product and suppose ||·|| is some other norm on V. Thus

|x| ≡ ( Σ_i |x_i|² )^{1/2}

where

x = Σ_k x_k v_k. (4.8)

Show there exist positive numbers δ < ∆, independent of x, such that

δ |x| ≤ ||x|| ≤ ∆ |x|.

This is referred to by saying the two norms are equivalent. Hint: The top half is easy using the Cauchy Schwarz inequality. The bottom half is somewhat harder. Argue that if it is not so, there exists a sequence {x_k} such that |x_k| = 1 but k^{−1}|x_k| = k^{−1} ≥ ||x_k||, and then note the vector of components of x_k is on S(0, 1), which was shown to be sequentially compact. Pass to a limit in 4.8 and use the assumed inequality to get a contradiction to {v_1, ···, v_n} being a basis.
14. It was shown above that in Fn , the sequentially compact sets are exactly those
which are closed and bounded. Show that in any finite dimensional normed vector
space, V the closed and bounded sets are those which are sequentially compact.
15. Two norms on a finite dimensional vector space, ||·||_1 and ||·||_2, are said to be equivalent if there exist positive numbers δ < ∆ such that

δ ||x||_1 ≤ ||x||_2 ≤ ∆ ||x||_1 for all x.

Show the statement that two norms are equivalent is an equivalence relation. Explain using the result of Problem 13 why any two norms on a finite dimensional vector space are equivalent.
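Problems 13 and 15 can be illustrated concretely: on R⁴ the ratio ||x||_1 / ||x||_2 always stays between 1 and √4 = 2, so δ = 1 and ∆ = 2 witness the equivalence of these two norms. A sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(7)
ratios = []
for _ in range(2000):
    x = rng.standard_normal(4)
    ratios.append(np.sum(np.abs(x)) / np.linalg.norm(x))   # ||x||_1 / ||x||_2
delta, Delta = min(ratios), max(ratios)
# mathematically 1 <= ||x||_1 / ||x||_2 <= 2 on R^4, so these delta, Delta work
print(delta, Delta)
```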
16. A normed vector space V is separable if there is a countable set {w_k}_{k=1}^{∞} such that whenever B(x, δ) is an open ball in V, there exists some w_k in this open ball. Show that Fⁿ is separable. This set of points is called a countable dense set.
17. Let V be any normed vector space with norm ||·||. Using Problem 13 show that
V is separable.
18. Suppose V is a normed vector space. Show there exists a countable set of open balls B ≡ {B(x_k, r_k)}_{k=1}^{∞} having the remarkable property that any open set U is the union of some subset of B. This collection of balls is called a countable basis. Hint: Use Problem 17 to get a countable dense set of points {x_k}_{k=1}^{∞} and then consider balls of the form B(x_k, 1/r) where r ∈ N. Show this collection of balls is countable and then show it has the remarkable property mentioned.
19. Suppose S is any nonempty set in V, a finite dimensional normed vector space. Suppose C is a set of open sets such that ∪C ⊇ S. (Such a collection of sets is called an open cover.) Show using Problem 18 that there are countably many sets from C, {U_k}_{k=1}^{∞}, such that S ⊆ ∪_{k=1}^{∞} U_k. This is called the Lindelöf property: every open cover can be reduced to a countable subcover.
20. A set H in a normed vector space is said to be compact if whenever C is a set of open sets such that ∪C ⊇ H, there are finitely many sets of C, {U_1, ···, U_p}, such that

H ⊆ ∪_{i=1}^{p} U_i.

Show using Problem 19 that if a set in a normed vector space is sequentially compact, then it must be compact. Next show using Problem 14 that a set in a normed vector space is compact if and only if it is closed and bounded. Explain why the sets which are compact, closed and bounded, and sequentially compact are the same sets in any finite dimensional normed vector space.
Continuous Functions
Continuous functions are defined as they are for a function of one variable.
There is a theorem which makes it easier to verify certain functions are continuous
without having to always go to the above definition. The statement of this theorem is
purposely just a little vague. Some of these things tend to hold in almost any context,
certainly for any normed vector space.
2. If f and g have values in Fⁿ and they are each continuous at x, then f · g is continuous at x. If g has values in F and g(x) ≠ 0 with g continuous, then f /g is continuous at x.
Proof: First consider 1.). Let ε > 0 be given. By assumption, there exists δ_1 > 0 such that whenever |x − y| < δ_1, it follows |f(x) − f(y)| < ε/(2(|a| + |b| + 1)), and there exists δ_2 > 0 such that whenever |x − y| < δ_2, it follows that |g(x) − g(y)| < ε/(2(|a| + |b| + 1)). Then let 0 < δ ≤ min(δ_1, δ_2). If |x − y| < δ, then everything happens at once. Therefore, using the triangle inequality
Now consider 2.) There exists δ 1 > 0 such that if |y − x| < δ 1 , then |f (x) − f (y)| <
1. Therefore, for such y,
|f (y)| < 1 + |f (x)| .
It follows that for such y,
|f · g (x) − f · g (y)| ≤ |f (x) · g (x) − g (x) · f (y)| + |g (x) · f (y) − f (y) · g (y)|
Now let ε > 0 be given. There exists δ_2 such that if |x − y| < δ_2, then

|g(x) − g(y)| < ε / (2(2 + |g(x)| + |f(x)|)).
Now let 0 < δ ≤ min (δ 1 , δ 2 , δ 3 ) . Then if |x − y| < δ, all the above hold at once and so
|f · g (x) − f · g (y)| ≤
This proves the first part of 2.) To obtain the second part, let δ 1 be as described above
and let δ 0 > 0 be such that for |x − y| < δ 0 ,
which implies |g (y)| ≥ |g (x)| /2, and |g (y)| < 3 |g (x)| /2.
Then if |x − y| < min(δ_0, δ_1),

| f(x)/g(x) − f(y)/g(y) | = | (f(x) g(y) − f(y) g(x)) / (g(x) g(y)) |
≤ |f(x) g(y) − f(y) g(x)| / (|g(x)|² / 2)
≤ (2/|g(x)|²) [ |f(x) g(y) − f(y) g(y) + f(y) g(y) − f(y) g(x)| ]
≤ (2/|g(x)|²) [ |g(y)| |f(x) − f(y)| + |f(y)| |g(y) − g(x)| ]
≤ (2/|g(x)|²) [ (3/2)|g(x)| |f(x) − f(y)| + (1 + |f(x)|) |g(y) − g(x)| ]
≤ (2/|g(x)|²) (1 + 2|f(x)| + 2|g(x)|) [ |f(x) − f(y)| + |g(y) − g(x)| ]
≡ M [ |f(x) − f(y)| + |g(y) − g(x)| ]

where M is defined by

M ≡ (2/|g(x)|²) (1 + 2|f(x)| + 2|g(x)|).
of f^{−1}(U). Next suppose inverse images of open sets are open. Then apply this condition to the open set B(f(x), ε). The condition says f^{−1}(B(f(x), ε)) is open, and since x ∈ f^{−1}(B(f(x), ε)), it follows x is an interior point of f^{−1}(B(f(x), ε)), so there exists δ > 0 such that B(x, δ) ⊆ f^{−1}(B(f(x), ε)). This says f(B(x, δ)) ⊆ B(f(x), ε). In other words, whenever ||y − x|| < δ, ||f(y) − f(x)|| < ε, which is the condition for continuity at the point x. Since x is arbitrary, this proves the theorem. ¥
Proof: Suppose first that f is continuous at x and let xn → x. Let ε > 0 be given.
By continuity, there exists δ > 0 such that if ||y − x|| < δ, then ||f (x) − f (y)|| < ε.
However, there exists nδ such that if n ≥ nδ , then ||xn − x|| < δ and so for all n this
large,
||f (x) − f (xn )|| < ε
which shows f (xn ) → f (x) .
Now suppose the condition about taking convergent sequences to convergent sequences holds at x. Suppose f fails to be continuous at x. Then there exists ε > 0 and x_n ∈ D(f) such that ||x − x_n|| < 1/n, yet
||f (x)|| ≤ l (≥ l) .
Proof: Since ||f (xn )|| ≤ l and f is continuous at x, it follows from the triangle
inequality, Theorem 4.1.8 and Theorem 5.1.1,
Proof: Suppose f (kn ) → f (k) . Does it follow kn → k? If this does not happen,
then there exists ε > 0 and a subsequence still denoted as {kn } such that
|kn − k| ≥ ε (5.1)
Now since K is compact, there exists a further subsequence, still denoted as {kn } such
that
kn → k0 ∈ K
However, the continuity of f requires
f (kn ) → f (k0 )
and so f (k0 ) = f (k). Since f is one to one, this requires k0 = k, a contradiction to 5.1.
This proves the theorem. ¥
Since K is sequentially compact, there exists a subsequence, {xkl } such that liml→∞ xkl =
x ∈ K. Then by continuity of f,
which shows f achieves its maximum on K. To see it achieves its minimum, you could
repeat the argument with a minimizing sequence or else you could consider −f and
apply what was just shown to −f , −f having its minimum when f has its maximum.
This proves the theorem. ¥
Proof: First of all, denote by C the set of closed sets which contain A. Then

Ā = ∩C.

Each H^C for H ∈ C is open and so the union of all these open sets must also be open. This is because if x is in this union, then it is in at least one of them. Hence it is an interior point of that one. But this implies it is an interior point of the union of them all, which is an even larger set. Thus Ā is closed.

The interesting part is the next claim. First note that from the definition, A ⊆ Ā, so if x ∈ A, then x ∈ Ā. Now consider y ∈ A′ but y ∉ A. If y ∉ Ā, a closed set, then there exists B(y, r) ⊆ Ā^C. Thus y cannot be a limit point of A, a contradiction. Therefore,

A ∪ A′ ⊆ Ā.
S = A ∪ B,  A, B ≠ ∅,  and Ā ∩ B = B̄ ∩ A = ∅.

In this case, the sets A and B are said to separate S. A set is connected if it is not separated. Remember Ā denotes the closure of the set A.
Note that the concept of connected sets is defined in terms of what it is not. This
makes it somewhat difficult to understand. One of the most important theorems about
connected sets is the following.
Theorem 5.3.4 Suppose U and V are connected sets having nonempty intersec-
tion. Then U ∪ V is also connected.
It follows one of these sets must be empty since otherwise, U would be separated. It
follows that U is contained in either A or B. Similarly, V must be contained in either
A or B. Since U and V have nonempty intersection, it follows that both V and U are
contained in one of the sets, A, B. Therefore, the other must be empty and this shows
U ∪ V cannot be separated and is therefore, connected. ¥
5.3. CONNECTED SETS 91
S ≡ {t ∈ [x, y] : [x, t] ⊆ A}
Corollary 5.3.9 Let E be a connected set in a normed vector space and suppose
f : E → R and that y ∈ (f (e1 ) , f (e2 )) where ei ∈ E. Then there exists e ∈ E such that
f (e) = y.
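Corollary 5.3.9 is the abstract fact behind the bisection method: on an interval, a continuous f attains any value y between f(e_1) and f(e_2), and repeated halving locates such an e. A sketch, with a hypothetical example function:

```python
def bisect(f, lo, hi, y, tol=1e-12):
    """Find e in [lo, hi] with f(e) ~ y, assuming f(lo) < y < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < y:
            lo = mid            # the value y is still attained to the right
        else:
            hi = mid            # the value y is still attained to the left
    return (lo + hi) / 2.0

# solve t^3 - t = 3 on [1, 2]: f(1) = 0 < 3 < 6 = f(2)
e = bisect(lambda t: t ** 3 - t, 1.0, 2.0, 3.0)
print(abs(e ** 3 - e - 3.0) < 1e-9)
```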
You can verify that this set of points in the normed vector space R2 is not arcwise
connected but is connected.
Proof: This is easy from the convexity of the set. If x, y ∈ B (z, r) , then let
γ (t) = x + t (y − x) for t ∈ [0, 1] .
Proof: Suppose not. Then it achieves two different values, k and l ≠ k. Then Ω = f^{−1}(l) ∪ f^{−1}({m ∈ Z : m ≠ l}) and these are disjoint nonempty open sets which separate Ω. To see they are open, note

f^{−1}({m ∈ Z : m ≠ l}) = f^{−1}( ∪_{m≠l} (m − 1/6, m + 1/6) ),

which is the inverse image of an open set, while f^{−1}(l) = f^{−1}((l − 1/6, l + 1/6)), also an open set. ¥
Definition 5.5.2 Let {fn } be a sequence of functions. Then the sequence con-
verges pointwise to a function f if for all x ∈ D, the domain of the functions in the
sequence,
f (x) = lim_{n→∞} f_n (x)
Thus you consider for each x ∈ D the sequence of numbers {fn (x)} and if this
sequence converges for each x ∈ D, the thing it converges to is called f (x).
The sequence is said to converge uniformly to f on D if for every ε > 0 there exists N such that whenever n ≥ N,
||f_n (x) − f (x)|| < ε
for all x ∈ D.
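The distinction between pointwise and uniform convergence can be seen numerically. The following Python sketch (an illustration, not part of the text) uses the classical example f_n(x) = xⁿ on [0, 1]: the pointwise limit is 0 for x < 1 and 1 at x = 1, but the sup of the error never shrinks, so the convergence is not uniform.

```python
def f_n(n, x):
    return x ** n

def pointwise_limit(x):
    # the pointwise limit of x**n on [0, 1]
    return 1.0 if x == 1.0 else 0.0

def sup_error(n, num_points=1000):
    # approximate sup over [0,1] of |f_n(x) - f(x)| on a grid
    xs = [i / num_points for i in range(num_points + 1)]
    return max(abs(f_n(n, x) - pointwise_limit(x)) for x in xs)

# at each fixed x the sequence converges, yet the sup error stays near 1
print(sup_error(10), sup_error(100))
```

Because the limit function is discontinuous while each f_n is continuous, Theorem 5.5.4 below already rules out uniform convergence here.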
Proof: Let ε > 0 be given and pick z ∈ D. By uniform convergence, there exists N such that if n > N, then for all x ∈ D, ||f (x) − f_n (x)|| < ε/3. Pick such an n. By continuity of f_n at z, there exists δ > 0 such that if ||y − z|| < δ, then ||f_n (y) − f_n (z)|| < ε/3. Then for y ∈ D with ||y − z|| < δ,
||f (y) − f (z)|| ≤ ||f (y) − f_n (y)|| + ||f_n (y) − f_n (z)|| + ||f_n (z) − f (z)|| < ε/3 + ε/3 + ε/3 = ε,
which shows f is continuous at z. Since z ∈ D was arbitrary, f is continuous on D. ■
Proof: This follows from Theorem 5.5.6 and Theorem 5.5.4. This proves the corollary. ■
Here is one more fairly obvious theorem.
Proof: If the sequence converges pointwise, then by Theorem 4.4.3 the sequence
{fn (x)} is a Cauchy sequence for each x ∈ D. Conversely, if {fn (x)} is a Cauchy se-
quence for each x ∈ D, then {fn (x)} converges for each x ∈ D because of completeness.
Now suppose {fn } is uniformly Cauchy. Then from Theorem 5.5.6 there exists f
such that {fn } converges uniformly on D to f . Conversely, if {fn } converges uniformly
to f on D, then if ε > 0 is given, there exists N such that if n ≥ N, then ||f_n (x) − f (x)|| < ε/2 for all x ∈ D. Hence for m, n ≥ N and any x ∈ D,
|f_n (x) − f_m (x)| ≤ |f_n (x) − f (x)| + |f (x) − f_m (x)| < ε/2 + ε/2 = ε.
and its value at x is given by the limit of the sequence of partial sums in 5.4. If for all x ∈ D, the limit in 5.4 exists, then 5.5 is said to converge pointwise. Σ_{k=1}^∞ f_k is said to converge uniformly on D if the sequence of partial sums,
{ Σ_{k=1}^n f_k },
converges uniformly. If the indices for the functions start at some other value than 1, you make the obvious modification to the above definition.
for all x ∈ D.
Proof: The first part follows from Theorem 5.5.8. The second part follows from observing the condition is equivalent to the sequence of partial sums forming a uniformly Cauchy sequence; then by Theorem 5.5.6, these partial sums converge uniformly to a function which is the definition of Σ_{k=1}^∞ f_k. This proves the theorem. ■
Is there an easy way to recognize when 5.6 happens? Yes, there is. It is called the
Weierstrass M test.
Proof: Let z ∈ D. Then letting m < n and using the triangle inequality,
|| Σ_{k=1}^n f_k (z) − Σ_{k=1}^m f_k (z) || ≤ Σ_{k=m+1}^n ||f_k (z)|| ≤ Σ_{k=m+1}^∞ M_k < ε
whenever m is large enough because of the assumption that Σ_{n=1}^∞ M_n converges. Therefore, the sequence of partial sums is uniformly Cauchy on D and therefore converges uniformly to Σ_{k=1}^∞ f_k on D. This proves the theorem. ■
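The uniform Cauchy estimate in the proof can be observed numerically. In the following sketch (my own illustration, not from the text), f_k(x) = cos(kx)/k² satisfies ||f_k|| ≤ M_k = 1/k² with Σ M_k convergent, and the sup over x of |S_n(x) − S_m(x)| is bounded by the tail Σ_{k=m+1}^n M_k, independently of x.

```python
import math

def partial_sum(n, x):
    # S_n(x) = sum_{k=1}^n cos(kx)/k^2
    return sum(math.cos(k * x) / k ** 2 for k in range(1, n + 1))

def tail_bound(m, n):
    # sum_{k=m+1}^n M_k with M_k = 1/k^2
    return sum(1.0 / k ** 2 for k in range(m + 1, n + 1))

xs = [i * 0.01 for i in range(629)]  # grid covering [0, 2*pi]
m, n = 50, 500
sup_diff = max(abs(partial_sum(n, x) - partial_sum(m, x)) for x in xs)
print(sup_diff, tail_bound(m, n))  # sup_diff never exceeds the tail bound
```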
Theorem 5.5.12 If {f_n} is a sequence of continuous functions defined on D and Σ_{k=1}^∞ f_k converges uniformly, then the function Σ_{k=1}^∞ f_k must also be continuous.
Proof: This follows from Theorem 5.5.4 applied to the sequence of partial sums of the above series which is assumed to converge uniformly to the function Σ_{k=1}^∞ f_k. ■
5.6 Polynomials
General considerations about what a function is have already been considered earlier.
For functions of one variable, the special kind of functions known as a polynomial has
a corresponding version when one considers a function of many variables. This is found
in the next definition.
Definition 5.6.1 Let α be an n dimensional multi-index. This means
α = (α1 , · · ·, αn )
where each α_i is a positive integer or zero. Also, let
|α| ≡ Σ_{i=1}^n |α_i|
Then x^α means
x^α ≡ x_1^{α_1} x_2^{α_2} ··· x_n^{α_n}
where each x_j ∈ F. An n dimensional polynomial of degree m is a function of the form
p (x) = Σ_{|α|≤m} d_α x^α
where the dα are complex or real numbers. Rational functions are defined as the quotient
of two polynomials. Thus these functions are defined on Fn .
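The multi-index notation translates directly into a short evaluation routine. The following Python sketch (names are my own) evaluates p(x) = Σ_{|α|≤m} d_α x^α from a dictionary mapping multi-index tuples α to coefficients d_α.

```python
def eval_poly(coeffs, x):
    # coeffs: dict mapping multi-index tuples alpha to scalars d_alpha;
    # each term is d_alpha * x_1^{alpha_1} * ... * x_n^{alpha_n}
    total = 0.0
    for alpha, d in coeffs.items():
        term = d
        for xi, ai in zip(x, alpha):
            term *= xi ** ai
        total += term
    return total

# the example from the text: f(x) = x1*x2^2 + 7*x3^4*x1, a polynomial of degree 5
f = {(1, 2, 0): 1.0, (1, 0, 4): 7.0}
print(eval_poly(f, (2.0, 1.0, 1.0)))  # 2*1 + 7*1*2 = 2 + 14 = 16
```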
For example, f (x) = x_1 x_2^2 + 7 x_3^4 x_1 is a polynomial of degree 5 and
(x_1 x_2^2 + 7 x_3^4 x_1 + x_2^3) / (4 x_1^3 x_2^2 + 7 x_3^2 x_1 − x_2^3)
is a rational function.
Note that in the case of a rational function, the domain of the function might not be all of Fⁿ. For example, if
f (x) = (x_1 x_2^2 + 7 x_3^4 x_1 + x_2^3) / (x_2^2 + 3 x_1^2 − 4),
the domain of f would be all complex numbers such that x_2^2 + 3 x_1^2 ≠ 4.
By Theorem 5.0.2 all polynomials are continuous. To see this, note that the function,
π k (x) ≡ xk
is a continuous function because of the inequality
|π k (x) − π k (y)| = |xk − yk | ≤ |x − y| .
Polynomials are simple sums of scalars times products of these functions. Similarly,
by this theorem, rational functions, quotients of polynomials, are continuous at points
where the denominator is non zero. More generally, if V is a normed vector space,
consider a V valued function of the form
f (x) ≡ Σ_{|α|≤m} d_α x^α
Next, using what was just shown and the binomial theorem again,
Σ_{k=0}^m C(m,k) k² x^k (1 − x)^{m−k} = Σ_{k=1}^m C(m,k) k (k − 1) x^k (1 − x)^{m−k} + Σ_{k=0}^m C(m,k) k x^k (1 − x)^{m−k}
= Σ_{k=2}^m C(m,k) k (k − 1) x^k (1 − x)^{m−k} + mx
= Σ_{k=0}^{m−2} C(m,k+2) (k + 2) (k + 1) x^{k+2} (1 − x)^{m−2−k} + mx
= m (m − 1) Σ_{k=0}^{m−2} C(m−2,k) x^{k+2} (1 − x)^{m−2−k} + mx
= x² m (m − 1) Σ_{k=0}^{m−2} C(m−2,k) x^k (1 − x)^{m−2−k} + mx
= x² m (m − 1) + mx = x² m² − x² m + mx
It follows
Σ_{k=0}^m C(m,k) (k − mx)² x^k (1 − x)^{m−k} = Σ_{k=0}^m C(m,k) (k² − 2kmx + x² m²) x^k (1 − x)^{m−k}
and from what was just shown along with the binomial theorem again, this equals
x² m² − x² m + mx − 2mx (mx) + x² m² = −x² m + mx = m/4 − m (x − 1/2)².
Thus the expression is maximized when x = 1/2 and yields m/4 in this case. This proves the lemma. ■
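The identity just proved, Σ_{k=0}^m C(m,k)(k − mx)² x^k (1 − x)^{m−k} = mx − mx² ≤ m/4, can be checked numerically. The following Python sketch (illustrative, not part of the text) sums the left side directly and compares it with m x (1 − x).

```python
from math import comb

def weighted_variance(m, x):
    # sum_{k=0}^m C(m,k) (k - m x)^2 x^k (1-x)^(m-k)
    return sum(comb(m, k) * (k - m * x) ** 2 * x ** k * (1 - x) ** (m - k)
               for k in range(m + 1))

m = 20
for x in (0.1, 0.5, 0.9):
    # both columns agree, and neither exceeds m/4 = 5
    print(weighted_variance(m, x), m * x * (1 - x))
```

Probabilistically, this is the variance of a binomial(m, x) random variable, which explains why the maximum m/4 occurs at x = 1/2.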
Now let f be a continuous function defined on [0, 1] . Let pn be the polynomial
defined by
p_n (x) ≡ Σ_{k=0}^n C(n,k) f (k/n) x^k (1 − x)^{n−k}. (5.7)
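The uniform convergence of the Bernstein polynomials of 5.7 can be watched directly. The following Python sketch (the choice f(x) = |x − 1/2| is my own) computes p_n on a grid and shows the sup error shrinking as n grows.

```python
from math import comb

def bernstein(f, n, x):
    # p_n(x) = sum_{k=0}^n C(n,k) f(k/n) x^k (1-x)^(n-k), as in 5.7
    return sum(comb(n, k) * f(k / n) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)          # continuous but not differentiable at 1/2
grid = [i / 200 for i in range(201)]

def sup_error(n):
    # approximate sup over [0,1] of |p_n(x) - f(x)|
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

print(sup_error(10), sup_error(100))  # the error shrinks as n grows
```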
Now for f a continuous function defined on [0, 1]^n and for x = (x_1, ···, x_n), consider the polynomial,
p_m (x) ≡ Σ_{k_1=0}^m ··· Σ_{k_n=0}^m C(m,k_1) C(m,k_2) ··· C(m,k_n) x_1^{k_1} (1 − x_1)^{m−k_1} x_2^{k_2} (1 − x_2)^{m−k_2} ··· x_n^{k_n} (1 − x_n)^{m−k_n} f (k_1/m, ···, k_n/m). (5.8)
Also define, if I is a set in Rⁿ,
||h||_I ≡ sup {||h (x)|| : x ∈ I} .
To simplify the notation, let k = (k_1, ···, k_n) where each k_i ∈ [0, m], let k/m ≡ (k_1/m, ···, k_n/m), and let
C(m,k) ≡ C(m,k_1) C(m,k_2) ··· C(m,k_n).
Also define
||k||_∞ ≡ max {k_i : i = 1, 2, ···, n}
x^k (1 − x)^{m−k} ≡ x_1^{k_1} (1 − x_1)^{m−k_1} x_2^{k_2} (1 − x_2)^{m−k_2} ··· x_n^{k_n} (1 − x_n)^{m−k_n}.
This is the n dimensional version of the Bernstein polynomials, which is what results in the case where n = 1.
Lemma 5.7.2 For x ∈ [0, 1]^n, f a continuous F valued function defined on [0, 1]^n, and p_m given in 5.8, p_m converges uniformly to f on [0, 1]^n as m → ∞.
and so for x ∈ [0, 1]^n,
|p_m (x) − f (x)| ≤ Σ_{||k||_∞ ≤ m} C(m,k) x^k (1 − x)^{m−k} |f (k/m) − f (x)|
≤ Σ_{k∈G} C(m,k) x^k (1 − x)^{m−k} |f (k/m) − f (x)| + Σ_{k∈G^C} C(m,k) x^k (1 − x)^{m−k} |f (k/m) − f (x)| (5.9)
Letting M ≥ max {|f (x)| : x ∈ [0, 1]^n}, it follows, because on G^C,
1 ≤ (k_j − m x_j)² / (η² m²), j = 1, ···, n,
that by Lemma 5.7.1,
|p_m (x) − f (x)| ≤ ε + 2M (1/(η² m²))^n (m/4)^n.
Therefore, since the right side does not depend on x, it follows that for all m sufficiently
large,
||pm − f ||[0,1]n ≤ 2ε
and since ε is arbitrary, this shows lim_{m→∞} ||p_m − f||_{[0,1]^n} = 0. This proves the lemma. ■
Proof: Let gk : [0, 1] → [ak , bk ] be linear, one to one, and onto and let
Proof: The continuity of x → dist (x, S) is obvious if the inequality 5.11 is estab-
lished. So let x, y ∈ Rn . Without loss of generality, assume dist (x, S) ≥ dist (y, S) and
pick z ∈ S such that |y − z| − ε < dist (y, S) . Then
Lemma 5.7.5 Let H, K be two nonempty disjoint closed subsets of Rn . Then there
exists a continuous function, g : Rn → [−1, 1] such that g (H) = −1/3, g (K) =
1/3, g (Rn ) ⊆ [−1/3, 1/3] .
Proof: Let
dist (x, H)
f (x) ≡ .
dist (x, H) + dist (x, K)
The denominator is never equal to zero because if dist (x, H) = 0, then x ∈ H because
H is closed. (To see this, pick hk ∈ B (x, 1/k) ∩ H. Then hk → x and since H is closed,
x ∈ H.) Similarly, if dist (x, K) = 0, then x ∈ K and so the denominator is never zero
as claimed. Hence f is continuous and from its definition, f = 0 on H and f = 1 on K. Now let g (x) ≡ (2/3) (f (x) − 1/2). Then g has the desired properties. ■
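The function of Lemma 5.7.5 is entirely computable once dist is. The following Python sketch (the closed sets H, K and the test points are my own choices, with H, K finite so that dist is a simple minimum) builds g(x) = (2/3)(f(x) − 1/2) with f(x) = dist(x, H)/(dist(x, H) + dist(x, K)).

```python
import math

def dist(x, S):
    # distance from the point x to the finite set S
    return min(math.dist(x, s) for s in S)

H = [(0.0, 0.0), (1.0, 0.0)]   # a closed (finite) set
K = [(5.0, 5.0)]               # a disjoint closed set

def g(x):
    dH, dK = dist(x, H), dist(x, K)
    f = dH / (dH + dK)         # denominator nonzero since H, K are disjoint
    return (2.0 / 3.0) * (f - 0.5)

# g = -1/3 on H, g = 1/3 on K, and g stays inside [-1/3, 1/3] everywhere
print(g((0.0, 0.0)), g((5.0, 5.0)), g((2.0, 2.0)))
```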
Proof: Let H = f −1 ([−1, −1/3]) , K = f −1 ([1/3, 1]) . Thus H and K are disjoint
closed subsets of M. Suppose first H, K are both nonempty. Then by Lemma 5.7.5 there
exists g such that g is a continuous function defined on all of Rn and g (H) = −1/3,
g (K) = 1/3, and g (Rn ) ⊆ [−1/3, 1/3] . It follows ||f − g||M < 2/3. If H = ∅, then f
has all its values in [−1/3, 1] and so letting g ≡ 1/3, the desired condition is obtained.
If K = ∅, let g ≡ −1/3. This proves the lemma. ■
Proof: Using Lemma 5.7.7, let g_1 be such that g_1 (Rⁿ) ⊆ [−1/3, 1/3] and
||f − g_1||_M ≤ 2/3.
Suppose g_1, ···, g_m have been chosen such that g_j (Rⁿ) ⊆ [−1/3, 1/3] and
|| f − Σ_{i=1}^m (2/3)^{i−1} g_i ||_M < (2/3)^m. (5.12)
Then (3/2)^m ( f − Σ_{i=1}^m (2/3)^{i−1} g_i ) can play the role of f in the first step of the proof. Therefore, there exists g_{m+1} defined and continuous on all of Rⁿ such that its values are in [−1/3, 1/3] and
|| (3/2)^m ( f − Σ_{i=1}^m (2/3)^{i−1} g_i ) − g_{m+1} ||_M ≤ 2/3.
Hence
|| f − Σ_{i=1}^m (2/3)^{i−1} g_i − (2/3)^m g_{m+1} ||_M ≤ (2/3)^{m+1}.
It follows there exists a sequence, {g_i}, such that each has its values in [−1/3, 1/3] and for every m, 5.12 holds. Then let
g (x) ≡ Σ_{i=1}^∞ (2/3)^{i−1} g_i (x).
It follows
|g (x)| ≤ Σ_{i=1}^∞ (2/3)^{i−1} |g_i (x)| ≤ Σ_{i=1}^∞ (2/3)^{i−1} (1/3) = 1
and
|(2/3)^{i−1} g_i (x)| ≤ (2/3)^{i−1} (1/3),
so the Weierstrass M test applies and shows convergence is uniform. Therefore g must be continuous. The estimate 5.12 implies f = g on M. ■
The following is the Tietze extension theorem.
Theorem 5.7.9 Let M be a closed nonempty subset of Rn and let f : M → [a, b]
be continuous at every point of M. Then there exists a function, g continuous on all of
Rn which coincides with f on M such that g (Rn ) ⊆ [a, b] .
Proof: Let f_1 (x) = 1 + (2/(b − a)) (f (x) − b). Then f_1 satisfies the conditions of Lemma 5.7.8 and so there exists g_1 : Rⁿ → [−1, 1] such that g_1 is continuous on Rⁿ and equals f_1 on M. Let g (x) = (g_1 (x) − 1) ((b − a)/2) + b. This works. ■
With the Tietze extension theorem, here is a better version of the Weierstrass ap-
proximation theorem.
Theorem 5.7.10 Let K be a closed and bounded subset of Rn and let f : K → R
be continuous. Then there exists a sequence of polynomials {pm } such that
lim (sup {|f (x) − pm (x)| : x ∈ K}) = 0.
m→∞
Proof: By the Tietze extension theorem, there is a function g continuous on Rⁿ which coincides with f on K; let R be a closed rectangle containing K. Then by the Weierstrass approximation theorem, Theorem 5.7.3, there exists a sequence
of polynomials {p_m} converging uniformly to g on R. Therefore, this sequence of polynomials converges uniformly to g = f on K as well. This proves the theorem. ■
By considering the real and imaginary parts of a function which has values in C one
can generalize the above theorem.
Corollary 5.7.11 Let K be a closed and bounded subset of Rn and let f : K → F
be continuous. Then there exists a sequence of polynomials {pm } such that
lim (sup {|f (x) − pm (x)| : x ∈ K}) = 0.
m→∞
Definition 5.8.1 Let V, W be two finite dimensional normed vector spaces having norms ||·||_V and ||·||_W respectively. Let L ∈ L (V, W). Then the operator norm of L, denoted by ||L||, is defined as
||L|| ≡ sup {||Lx||_W : ||x||_V ≤ 1} .
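For a concrete feel for the operator norm, consider a matrix L acting between spaces that both carry the max norm ||·||_∞. In that special case the operator norm has the closed form "maximum absolute row sum" (a standard fact, used here only for illustration); the sketch below checks the defining inequality ||Lx|| ≤ ||L|| ||x|| on random vectors.

```python
import random

def norm_inf(x):
    return max(abs(xi) for xi in x)

def apply(L, x):
    # matrix-vector product, L given as a list of rows
    return [sum(Lij * xj for Lij, xj in zip(row, x)) for row in L]

def operator_norm_inf(L):
    # operator norm of L : (R^n, ||.||_inf) -> (R^m, ||.||_inf):
    # the maximum absolute row sum
    return max(sum(abs(Lij) for Lij in row) for row in L)

L = [[1.0, -2.0], [3.0, 0.5]]
opn = operator_norm_inf(L)       # = max(1+2, 3+0.5) = 3.5

random.seed(0)
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(2)]
    # the defining property ||Lx|| <= ||L|| ||x||
    assert norm_inf(apply(L, x)) <= opn * norm_inf(x) + 1e-12
print(opn)
```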
Then the following theorem discusses the main properties of this norm. In the future,
I will dispense with the subscript on the symbols for the norm because it is clear from
the context which norm is meant. Here is a useful lemma.
Lemma 5.8.2 Let V be a normed vector space having a basis {v_1, ···, v_n}. Let
A ≡ { a ∈ Fⁿ : || Σ_{k=1}^n a_k v_k || ≤ 1 }
where a = (a_1, ···, a_n). Then A is a closed and bounded subset of Fⁿ.
and now it is apparent that if |a − b| is sufficiently small so that each |ak − bk | is small
enough, this expression is larger than 1. Thus there exists δ > 0 such that B (a, δ) ⊆ AC
showing that AC is open. Therefore, A is closed.
Next consider the claim that A is bounded. Suppose this is not so. Then there exists a sequence {a_k} of points of A,
a_k = (a_k^1, ···, a_k^n),
such that |a_k| → ∞. Let
b_k = ( a_k^1 / |a_k|, ···, a_k^n / |a_k| ).
Then |bk | = 1 so bk is contained in the closed and bounded set, S (0, 1) which is
sequentially compact in Fn . It follows there exists a subsequence, still denoted by {bk }
such that it converges to b ∈ S (0, 1). Passing to the limit in 5.13 using the following inequality,
|| Σ_{j=1}^n (a_k^j / |a_k|) v_j − Σ_{j=1}^n b_j v_j || ≤ Σ_{j=1}^n | a_k^j / |a_k| − b_j | ||v_j||,
to see that the sum converges to Σ_{j=1}^n b_j v_j, it follows
Σ_{j=1}^n b_j v_j = 0
and this is a contradiction because {v_1, ···, v_n} is a basis and not all the b_j can equal zero. Therefore, A must be bounded after all. This proves the lemma. ■
1. ||L|| < ∞
2. For all x ∈ X, ||Lx|| ≤ ||L|| ||x|| and if L ∈ L (V, W ) while M ∈ L (W, Z) , then
||M L|| ≤ ||M || ||L||.
3. ||·|| is a norm. In particular,
(a) ||L|| ≥ 0 and ||L|| = 0 if and only if L = 0, the linear transformation which
sends every vector to 0.
(b) ||aL|| = |a| ||L|| whenever a ∈ F
(c) ||L + M || ≤ ||L|| + ||M ||
Finally consider 3.) If ||L|| = 0 then from 2.), ||Lx|| ≤ 0 and so Lx = 0 for every x
which is the same as saying L = 0. If Lx = 0 for every x, then L = 0 by definition. Let
a ∈ F. Then from the properties of the norm, in the vector space,
L (v) ∈ B (L (v) , δ) ⊆ U.
Then if w ∈ V,
||L (v − w)|| = ||L (v) − L (w)|| ≤ ||L|| ||v − w||
and so if ||v − w|| is sufficiently small, ||v − w|| < δ/ ||L|| , then L (w) ∈ B (L (v) , δ)
which shows B (v, δ/ ||L||) ⊆ L−1 (U ) and since v ∈ L−1 (U ) was arbitrary, this shows
L⁻¹ (U) is open. This proves the theorem. ■
The operator norm will be very important in the chapter on the derivative.
Part 1.) of Theorem 5.8.3 says that if L ∈ L (V, W ) where V and W are two normed
vector spaces, then there exists K such that for all v ∈ V,
||Lv||W ≤ K ||v||V
An obvious case is to let L = id, the identity map on V and let there be two different
norms on V, ||·||1 and ||·||2 . Thus (V, ||·||1 ) is a normed vector space and so is (V, ||·||2 ) .
Then Theorem 5.8.3 implies that
Theorem 5.8.4 Let V be a finite dimensional vector space and let ||·||1 and ||·||2
be two norms for V. Then these norms are equivalent which means there exist constants,
δ, ∆ such that for all v ∈ V
A set, K is sequentially compact if and only if it is closed and bounded. Also every finite
dimensional normed vector space is complete. Also any closed and bounded subset of a
finite dimensional normed vector space is sequentially compact.
and so
(1/K_1) ||v||_1 ≤ ||v||_2 ≤ K_2 ||v||_1.
Next consider the claim that all closed and bounded sets in a normed vector space
are sequentially compact. Let L : Fn → V be defined by
L (a) ≡ Σ_{k=1}^n a_k v_k
where {v1 , · · ·, vn } is a basis for V . Thus L ∈ L (Fn , V ) and so by Theorem 5.8.3 this
is a continuous function. Hence if K is a closed and bounded subset of V it follows
L⁻¹ (K) = Fⁿ \ L⁻¹ (K^C) = Fⁿ \ (an open set) = a closed set.
Also L⁻¹ (K) is bounded. To see this, note that L is one to one and onto V and so L⁻¹ ∈ L (V, Fⁿ). Therefore,
|L⁻¹ (v)| ≤ ||L⁻¹|| ||v|| ≤ ||L⁻¹|| r
where K ⊆ B (0, r) . Since K is bounded, such an r exists. Thus L−1 (K) is a closed
and bounded subset of Fn and is therefore sequentially compact.
It follows that if {v_k}_{k=1}^∞ ⊆ K, there is a subsequence {v_{k_l}}_{l=1}^∞ such that {L⁻¹ v_{k_l}} converges to a point, a ∈ L⁻¹ (K). Hence by continuity of L,
v_{k_l} = L (L⁻¹ (v_{k_l})) → La ∈ K.
It is clear most axioms of a norm hold. The triangle inequality also holds because by
the triangle inequality for Fn ,
||x + y|| ≡ ( Σ_{j=1}^n |x^j + y^j|² )^{1/2} ≤ ( Σ_{j=1}^n |x^j|² )^{1/2} + ( Σ_{j=1}^n |y^j|² )^{1/2} ≡ ||x|| + ||y|| .
By the first part of this theorem, this norm is equivalent to the norm on V. Thus K is closed and bounded with respect to this new norm. It follows that for each j, {x_k^j}_{k=1}^∞ is a bounded sequence in F and so by the theorems about sequential compactness in F it follows, upon taking subsequences n times, there exists a subsequence x_{k_l} such that for each j,
lim_{l→∞} x_{k_l}^j = x^j
Example 5.8.5 Let V be a vector space and let {v_1, ···, v_n} be a basis. Define a norm on V as follows. For v = Σ_{k=1}^n a_k v_k,
||v|| ≡ max {|a_k| : k = 1, ···, n} .
In the above example, this is a norm on the vector space, V. It is clear ||av|| = |a| ||v|| and that ||v|| ≥ 0 and equals 0 if and only if v = 0. The hard part is the triangle inequality. Let v = Σ_{k=1}^n a_k v_k and w = Σ_{k=1}^n b_k v_k.
Proposition 5.9.2 The above definition yields a norm and in fact C (K; V ) is a
complete normed linear space.
Proof: This is obviously a vector space. Just verify the axioms. The main thing to show is that the above is a norm. First note that ||f|| = 0 if and only if f = 0
and ||αf || = |α| ||f || whenever α ∈ F, the field of scalars, C or R. As to the triangle
inequality,
||f + g|| ≡ sup {||(f + g) (x)|| : x ∈ K}
Furthermore, the function x → ||f (x)||V is continuous thanks to the triangle inequality
which implies
|||f (x)||V − ||f (y)||V | ≤ ||f (x) − f (y)||V .
Therefore, ||f || is a well defined nonnegative real number.
It remains to verify completeness. Suppose then {fk } is a Cauchy sequence with
respect to this norm. Then from the definition it is a uniformly Cauchy sequence and
since by Theorem 5.8.4 V is a complete normed vector space, it follows from Theorem
5.5.6, there exists f ∈ C (K; V ) such that {fk } converges uniformly to f . That is,
lim_{k→∞} ||f − f_k|| = 0.
||f || ≤ C
for all f ∈ F.
B (x, ε) ∩ D ≠ ∅.
Proof: Let n ∈ N. Pick k_1^n ∈ K. If B (k_1^n, 1/n) ⊇ K, stop. Otherwise pick
k_2^n ∈ K \ B (k_1^n, 1/n).
Continue this way till the process ends. It must end because if it didn't, there would exist a convergent subsequence which would imply two of the k_j^n would have to be closer than 1/n which is impossible from the construction. Denote this collection of points by D_n. Then D ≡ ∪_{n=1}^∞ D_n. This must work because if ε > 0 is given and x ∈ K, let 1/n < ε/3 and the construction implies x ∈ B (k_i^n, 1/n) for some k_i^n ∈ D_n ⊆ D. Then k_i^n ∈ B (x, ε).
D is countable because it is the countable union of finite sets. This proves the lemma. ■
Definition 5.9.5 More generally, if K is any subset of a normed vector space and there exists D such that D is countable and for all x ∈ K and every ε > 0,
B (x, ε) ∩ D ≠ ∅,
then K is called separable.
Now here is another remarkable result about equicontinuous functions.
Lemma 5.9.6 Suppose {f_k}_{k=1}^∞ is equicontinuous and the functions are defined on a sequentially compact set K. Suppose also for each x ∈ K,
lim_{k→∞} f_k (x) = f (x).
Then f is continuous and the convergence is uniform.
Proof: Uniform convergence would say that for every ε > 0, there exists nε such
that if k, l ≥ nε , then for all x ∈ K,
||fk (x) − fl (x)|| < ε.
Thus if the given sequence does not converge uniformly, there exists ε > 0 such that for
all n, there exists k, l ≥ n and xn ∈ K such that
||fk (xn ) − fl (xn )|| ≥ ε
Since K is sequentially compact, there exists a subsequence, still denoted by {xn } such
that limn→∞ xn = x ∈ K. Then letting k, l be associated with n as just described,
ε ≤ ||fk (xn ) − fl (xn )||V ≤ ||fk (xn ) − fk (x)||V
+ ||fk (x) − fl (x)||V + ||fl (x) − fl (xn )||V
By equicontinuity, if n is large enough, this implies
ε ≤ ε/3 + ||f_k (x) − f_l (x)||_V + ε/3
and now taking n still larger if necessary, the middle term on the right in the above is
also less than ε/3 which yields a contradiction. Hence convergence is uniform and so it
follows from Theorem 5.5.6 the function f is actually continuous and
lim_{k→∞} ||f − f_k|| = 0.
Proof: Denote by {f_{(k,n)}}_{n=1}^∞ a subsequence of {f_{(k−1,n)}}_{n=1}^∞ where the index denoted by (k − 1, k − 1) is always less than the index denoted by (k, k). Also let the countable dense subset of Lemma 5.9.4 be D = {d_k}_{k=1}^∞. Then consider the following diagram.
f(1,1) , f(1,2) , f(1,3) , f(1,4) , · · · → d1
f(2,1) , f(2,2) , f(2,3) , f(2,4) , · · · → d1 , d2
f(3,1) , f(3,2) , f(3,3) , f(3,4) , · · · → d1 , d2 , d3
f(4,1) , f(4,2) , f(4,3) , f(4,4) , · · · → d1 , d2 , d3 , d4
..
.
The meaning is as follows. {f_{(1,k)}}_{k=1}^∞ is a subsequence of the original sequence which converges at d_1. Such a subsequence exists because {f_k (d_1)}_{k=1}^∞ is contained in a bounded set so a subsequence converges by Theorem 5.8.4. (It is given to be in a bounded set and so the closure of this bounded set is both closed and bounded, hence sequentially compact.) Now {f_{(2,k)}}_{k=1}^∞ is a subsequence of the first subsequence which converges at d_2. Then by Theorem 4.1.6 this new subsequence continues to converge at d_1. Thus, as indicated by the diagram, it converges at both d_1 and d_2. Continuing this way explains the meaning of the diagram. Now consider the subsequence of the original sequence, {f_{(k,k)}}_{k=1}^∞. For k ≥ n, this subsequence is a subsequence of the subsequence {f_{(n,k)}}_{k=1}^∞ and so it converges at d_1, d_2, ···, d_n. This being true for all n, it follows {f_{(k,k)}}_{k=1}^∞ converges at every point of D. To save on notation, I shall simply denote this as {f_k}.
Then letting d ∈ D,
Thus for k, l large enough, the right side is less than ε. This shows that for each x ∈ K,
∞
{fk (x)}k=1 is a Cauchy sequence and so by completeness of V this converges. Let f (x)
be the thing to which it converges. Then f is continuous and the convergence is uniform
by Lemma 5.9.6. This proves the theorem. ■
5.10 Exercises
1. In Theorem 5.7.3 it is assumed f has values in F. Show there is no change if f has values in V, a normed vector space, provided you redefine the definition of a polynomial to be something of the form Σ_{|α|≤m} a_α x^α where a_α ∈ V.
2. How would you generalize the conclusion of Corollary 5.7.11 to include the situa-
tion where f has values in a finite dimensional normed vector space?
3. If {fn } and {gn } are sequences of Fn valued functions defined on D which converge
uniformly, show that if a, b are constants, then afn + bgn also converges uniformly.
If there exists a constant, M such that |fn (x)| , |gn (x)| < M for all n and for all
x ∈ D, show {fn · gn } converges uniformly. Let fn (x) ≡ 1/ |x| for x ∈ B (0,1)
and let gn (x) ≡ (n − 1) /n. Show {fn } converges uniformly on B (0,1) and {gn }
converges uniformly but {fn gn } fails to converge uniformly.
4. Formulate a theorem for series of functions of n variables which will allow you to
conclude the infinite series is uniformly continuous based on reasonable assump-
tions about the functions in the sum.
5. If f and g are real valued functions which are continuous on some set, D, show
that
min (f, g) , max (f, g)
are also continuous. Generalize this to any finite collection of continuous functions.
Hint: Note max (f, g) = (|f − g| + f + g)/2. Now recall the triangle inequality which can be used to show |·| is a continuous function.
A function f is Holder continuous if there exists a constant K such that
|f (x) − f (y)| ≤ K |x − y|^α
for some α ≤ 1 for all x, y. Show every Holder continuous function is uniformly continuous.
23. The operator norm was defined for L (V, W ) above. This is the usual norm used
for this vector space of linear transformations. Show that any other norm used on
L (V, W ) is equivalent to the operator norm. That is, show that if ||·||1 is another
norm, there exist scalars δ, ∆ such that
δ ||L|| ≤ ||L||_1 ≤ ∆ ||L||
for all L ∈ L (V, W) where here ||·|| denotes the operator norm.
24. One alternative norm which is very popular is as follows. Let L ∈ L (V, W ) and
let (lij ) denote the matrix of L with respect to some bases. Then the Frobenius
norm is defined by
||L||_F ≡ ( Σ_{ij} |l_{ij}|² )^{1/2}.
One can likewise consider ( Σ_{ij} |l_{ij}|^p )^{1/p} where p ≥ 1 or even
||L||_∞ = max_{ij} |l_{ij}| .
25. Explain why L (V, W ) is always a complete normed vector space whenever V, W
are finite dimensional normed vector spaces for any choice of norm for L (V, W ).
Also explain why every closed and bounded subset of L (V, W ) is sequentially
compact for any choice of norm on this space.
26. Let L ∈ L (V, V ) where V is a finite dimensional normed vector space. Define
e^L ≡ Σ_{k=0}^∞ L^k / k!
Explain the meaning of this infinite sum and show it converges in L (V, V ) for any
choice of norm on this space. Now tell how to define sin (L).
27. Let X be a finite dimensional normed vector space, real or complex. Show that X is separable. Hint: Let {v_i}_{i=1}^n be a basis and define a map θ from Fⁿ to X as follows: θ ( Σ_{k=1}^n x_k e_k ) ≡ Σ_{k=1}^n x_k v_k. Show θ is continuous and has a continuous inverse. Now let D be a countable dense set in Fⁿ and consider θ (D).
where
||f|| ≡ sup {|f (x)| : x ∈ X}
and
ρ_α (f) ≡ sup { |f (x) − f (y)| / |x − y|^α : x, y ∈ X, x ≠ y } .
Show that (C α (X; Rn ) , ||·||α ) is a complete normed linear space. This is called a
Holder space. What would this space consist of if α > 1?
30. Let {f_n}_{n=1}^∞ ⊆ C^α (X; Rⁿ) where X is a compact subset of Rᵖ and suppose
||f_n||_α ≤ M
for all n.
for all n. Show there exists a subsequence, nk , such that fnk converges in C (X; Rn ).
The given sequence is precompact when this happens. (This also shows the em-
bedding of C α (X; Rn ) into C (X; Rn ) is a compact embedding.) Hint: You might
want to use the Ascoli Arzela theorem.
31. This problem is for those who know about the derivative and the integral of a
function of one variable. Let f :R × Rn → Rn be continuous and bounded and let
x0 ∈ Rn . If
x : [0, T ] → Rn
and h > 0, let
τ_h x (s) ≡ x_0 if s ≤ h, and τ_h x (s) ≡ x (s − h) if s > h.
For t ∈ [0, T], let
x_h (t) = x_0 + ∫_0^t f (s, τ_h x_h (s)) ds.
Show using the Ascoli Arzela theorem that there exists a sequence h → 0 such
that
xh → x
in C ([0, T ] ; Rn ). Next argue
x (t) = x_0 + ∫_0^t f (s, x (s)) ds
so that x is a solution to the initial value problem
x′ = f (t, x), t ∈ [0, T],
x (0) = x_0.
32. Let D (x_0, r) ≡ {x : |x − x_0| ≤ r} where this is the usual norm coming from the dot product. Let P : Rⁿ → D (x_0, r) be defined by
P (x) ≡ x if x ∈ D (x_0, r), and P (x) ≡ x_0 + r (x − x_0)/|x − x_0| if x ∉ D (x_0, r).
Show that |P x − P y| ≤ |x − y| for all x ∈ Rⁿ.
33. Use Problem 31 to obtain local solutions to the initial value problem where f is
not assumed to be bounded. It is only assumed to be continuous. This means
there is a small interval whose length is perhaps not T such that the solution to
the differential equation exists on this small interval.
The Derivative
lim_{||v||→0} g (v) / ||v|| = 0 (6.1)
Note that from Theorem 5.8.4 the question whether a given function is differentiable
is independent of the norm used on the finite dimensional vector space. That is, a
function is differentiable with one norm if and only if it is differentiable with another
norm.
The definition 6.1 means the error,
f (x + v) − f (x) − Lv
converges to 0 faster than ||v||. Thus the above definition is equivalent to saying
or equivalently,
||f (y) − f (x) − Df (x) (y − x)||
lim = 0. (6.3)
y→x ||y − x||
The symbol, o (v) should be thought of as an adjective. Thus, if t and k are con-
stants,
o (v) = o (v) + o (v) , o (tv) = o (v) , ko (v) = o (v)
and other similar observations hold.
Proof: First note that for a fixed vector, v, o (tv) = o (t). This is because
lim_{t→0} o (tv) / |t| = lim_{t→0} ||v|| ( o (tv) / ||tv|| ) = 0
Now suppose both L1 and L2 work in the above definition. Then let v be any vector
and let t be a real scalar which is chosen small enough that tv + x ∈ U . Then
Therefore, subtracting these two yields (L2 − L1 ) (tv) = o (tv) = o (t). Therefore, di-
viding by t yields (L2 − L1 ) (v) = o(t)t . Now let t → 0 to conclude that (L2 − L1 ) (v) =
0. Since this is true for all v, it follows L_2 = L_1. This proves the theorem. ■
Theorem 6.2.1 (The chain rule) Let U and V be open sets, U ⊆ X and V ⊆ Y .
Suppose f : U → V is differentiable at x ∈ U and suppose g : V → Fq is differentiable
at f (x) ∈ V . Then g ◦ f is differentiable at x and
Proof: This follows from a computation. Let B (x,r) ⊆ U and let r also be small
enough that for ||v|| ≤ r, it follows that f (x + v) ∈ V . Such an r exists because f is
continuous at x. For ||v|| < r, the definition of differentiability of g and f implies
g (f (x + v)) − g (f (x)) =
Since ε > 0 is arbitrary, this shows o (f (x + v) − f (x)) = o (v) because whenever ||v||
is small enough,
||o (f (x + v) − f (x))|| / ||v|| ≤ ε.
By 6.4, this shows
It follows
lim_{t→0} π_j ( (f (x + t v_k) − f (x)) / t ) ≡ lim_{t→0} (f_j (x + t v_k) − f_j (x)) / t ≡ D_{v_k} f_j (x) = π_j ( Σ_i J_{ik} (x) w_i ) = J_{jk} (x)
i
In other words, the matrix of Df (x) is nothing more than the matrix of partial derivatives. The kth column of the matrix (J_{ij}) is
∂f/∂x_k (x) = lim_{t→0} (f (x + t e_k) − f (x)) / t ≡ D_{e_k} f (x).
Thus the matrix of Df (x) with respect to the usual basis vectors is the matrix of the form
[ f_{1,x_1} (x)  f_{1,x_2} (x)  ···  f_{1,x_n} (x) ]
[      ⋮              ⋮                  ⋮        ]
[ f_{m,x_1} (x)  f_{m,x_2} (x)  ···  f_{m,x_n} (x) ]
where the notation g_{,x_k} denotes the kth partial derivative given by the limit,
lim_{t→0} (g (x + t e_k) − g (x)) / t ≡ ∂g/∂x_k.
The above discussion is summarized in the following theorem.
Theorem 6.3.1 Let f : Fⁿ → Fᵐ and suppose f is differentiable at x. Then all the partial derivatives ∂f_i (x)/∂x_j exist, and if J_f (x) is the matrix of the linear transformation, Df (x), with respect to the standard basis vectors, then the ij-th entry is given by ∂f_i/∂x_j (x), also denoted as f_{i,j} or f_{i,x_j}.
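The fact that the matrix of Df(x) is the matrix of partial derivatives is easy to check numerically. The sketch below (the example function is my own) approximates each J_ij = ∂f_i/∂x_j by a central difference quotient and compares with the analytic Jacobian.

```python
import math

def f(x):
    # an example map f : R^2 -> R^2
    return [x[0] ** 2 * x[1], math.sin(x[0])]

def numerical_jacobian(f, x, h=1e-6):
    # central difference quotients column by column: J_ij ~ df_i/dx_j
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

x = [1.0, 2.0]
J = numerical_jacobian(f, x)
# analytic Jacobian: rows (2xy, x^2) and (cos x, 0)
exact = [[2 * x[0] * x[1], x[0] ** 2], [math.cos(x[0]), 0.0]]
print(J, exact)
```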
D_v f (x)
is defined by
lim_{t→0} (f (x + t v) − f (x)) / t
where t ∈ F. This is often called the Gateaux derivative.
What if all the partial derivatives of f exist? Does it follow that f is differentiable?
Consider the following function, f : R² → R,
f (x, y) = xy/(x² + y²) if (x, y) ≠ (0, 0), and f (x, y) = 0 if (x, y) = (0, 0).
Then from the definition of partial derivatives,
lim_{h→0} (f (h, 0) − f (0, 0)) / h = lim_{h→0} (0 − 0)/h = 0
and
lim_{h→0} (f (0, h) − f (0, 0)) / h = lim_{h→0} (0 − 0)/h = 0
However f is not even continuous at (0, 0) which may be seen by considering the behavior
of the function along the line y = x and along the line x = 0. By Lemma 6.1.3 this
implies f is not differentiable. Therefore, it is necessary to consider the correct definition
of the derivative given above if you want to get a notion which generalizes the concept
of the derivative of a function of one variable in such a way as to preserve continuity
whenever the function is differentiable.
Lemma 6.4.1 Let Y be a normed vector space and suppose h : [0, 1] → Y is differ-
entiable and satisfies
||h0 (t)|| ≤ M.
Then
||h (1) − h (0)|| ≤ M.
Suppose t < 1. Then there exist positive numbers, hk decreasing to 0 such that
and now it follows from 6.5 and the triangle inequality that
and so
||h (t + hk ) − h (t)|| > (M + ε) hk
Now dividing by hk and letting k → ∞
||h0 (t)|| ≥ M + ε,
Proof: Let
h (t) ≡ f (x + t (y − x)) .
Then by the chain rule,
h0 (t) = Df (x + t (y − x)) (y − x)
and so
by Lemma 6.4.1
Theorem 6.5.1 Let X be a normed vector space having basis {v1 , · · ·, vn } and
let Y be another normed vector space having basis {w1 , · · ·, wm } . Let U be an open set
in X and let f : U → Y have the property that the Gateaux derivatives,
D_{v_k} f (x) ≡ lim_{t→0} (f (x + t v_k) − f (x)) / t
exist and are continuous functions of x. Then Df (x) exists and
Df (x) v = Σ_{k=1}^n D_{v_k} f (x) a_k
where
v = Σ_{k=1}^n a_k v_k.
= Σ_{k=1}^n [f (x + a_k v_k) − f (x)] +
Σ_{k=1}^n [ ( f (x + Σ_{j=1}^k a_j v_j) − f (x + a_k v_k) ) − ( f (x + Σ_{j=1}^{k−1} a_j v_j) − f (x) ) ] (6.6)
Now without loss of generality, it can be assumed the norm on X is given by that of Example 5.8.5,
||v|| ≡ max { |a_k| : v = Σ_{k=1}^n a_k v_k }
because by Theorem 5.8.4 all norms on X are equivalent. Therefore, from 6.7 and the assumption that the Gateaux derivatives are continuous,
||h′ (t)|| = || ( D_{v_k} f (x + Σ_{j=1}^{k−1} a_j v_j + t a_k v_k) − D_{v_k} f (x + t a_k v_k) ) a_k || ≤ ε |a_k| ≤ ε ||v||
provided ||v|| is sufficiently small. Since ε is arbitrary, it follows from Lemma 6.4.1 that the expression in 6.6 is o(v) because this expression equals a finite sum of terms of the form h(1) − h(0) where ||h′(t)|| ≤ ε ||v||. Thus
    f(x + v) − f(x) = Σ_{k=1}^{n} [f(x + ak vk) − f(x)] + o(v)

    = Σ_{k=1}^{n} D_{vk} f(x) ak + Σ_{k=1}^{n} [f(x + ak vk) − f(x) − D_{vk} f(x) ak] + o(v).
Defining

    Df(x) v ≡ Σ_{k=1}^{n} D_{vk} f(x) ak

where v = Σ_k ak vk, it follows Df(x) ∈ L(X, Y) and is given by the above formula.
It remains to verify x → Df (x) is continuous.
    ||(Df(x) − Df(y)) v|| ≤ Σ_{k=1}^{n} ||(D_{vk} f(x) − D_{vk} f(y)) ak||

    ≤ max{|ak|, k = 1, · · ·, n} Σ_{k=1}^{n} ||D_{vk} f(x) − D_{vk} f(y)||

    = ||v|| Σ_{k=1}^{n} ||D_{vk} f(x) − D_{vk} f(y)||

and so

    ||Df(x) − Df(y)|| ≤ Σ_{k=1}^{n} ||D_{vk} f(x) − D_{vk} f(y)||
which proves the continuity of Df because of the assumption that the Gateaux derivatives are continuous. This proves the theorem. ¥
This motivates the following definition of what it means for a function to be C 1 .
Definition 6.5.2 Let U be an open subset of a normed finite dimensional vector space X and let f : U → Y, where Y is another finite dimensional normed vector space. Then f is said to be C¹ if there exists a basis for X, {v1, · · ·, vn}, such that the Gateaux derivatives

    D_{vk} f(x)

exist on U and are continuous.
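In coordinates the theorem above is easy to test numerically: the Gateaux derivatives along a basis assemble into the action of Df(x). A minimal sketch, where the map f below is a hypothetical smooth example and the finite-difference step h is an illustration only, not part of the theorem:

```python
import numpy as np

# Hypothetical smooth map f : R^2 -> R^2, used only for illustration.
def f(x):
    return np.array([np.sin(x[0]) * x[1], x[0]**2 + np.exp(x[1])])

def gateaux(f, x, v, h=1e-6):
    """Central-difference approximation of the Gateaux derivative D_v f(x)."""
    return (f(x + h * v) - f(x - h * v)) / (2 * h)

x = np.array([0.3, -0.7])
basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
v = np.array([0.5, 2.0])          # v = 0.5*e1 + 2.0*e2, so a = (0.5, 2.0)

# Theorem 6.5.1: Df(x)v equals the sum of a_k * D_{e_k} f(x).
Dfv_from_gateaux = sum(a * gateaux(f, x, e) for a, e in zip(v, basis))
Dfv_direct = gateaux(f, x, v)     # directional derivative along v itself

assert np.allclose(Dfv_from_gateaux, Dfv_direct, atol=1e-5)
```

The agreement of the two quantities is exactly the linearity in v asserted by the theorem.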
Now the following major theorem states these two definitions are equivalent.
Proof: It was shown in Theorem 6.5.1 that Definition 6.5.2 implies 6.5.3. Suppose
then that Definition 6.5.3 holds. Then if v is any vector,
    lim_{t→0} [f(x + tv) − f(x)] / t = lim_{t→0} [Df(x) tv + o(tv)] / t

    = Df(x) v + lim_{t→0} o(tv)/t = Df(x) v
Thus Dv f (x) exists and equals Df (x) v. By continuity of x → Df (x) , this establishes
continuity of x → Dv f (x) and proves the theorem. ¥
Note that the proof of the theorem also implies the following corollary.
x → Df (x)
Thus,
Df (x + v) − Df (x) = D2 f (x) v + o (v) .
This implies
D2 f (x) ∈ L (X, L (X, Y )) , D2 f (x) (u) (v) ∈ Y,
and the map
(u, v) → D2 f (x) (u) (v)
is a bilinear map having values in Y . In other words, the two functions,
and D³f(x) may be considered as a trilinear map having values in Y. In general, D^k f(x) may be considered a k-linear map. This means the function
with similar conventions for higher derivatives than 3. Another convention which is
often used is the notation
    D^k f(x) v^k

instead of

    D^k f(x) (v, · · ·, v).
Note that for every k, D^k f maps U to a normed vector space. As mentioned above, Df(x) has values in L(X, Y), D²f(x) has values in L(X, L(X, Y)), etc. Thus it makes sense to consider whether D^k f is continuous. This is described in the following definition.
Definition 6.6.2 Let U be an open subset of X, a normed vector space, and let f : U → Y. Then f is C^k(U) if f and its first k derivatives are all continuous. Also, D^k f(x), when it exists, can be considered a Y valued multilinear function.
6.7 C^k Functions
Recall that for a C¹ function f,
    Df(x) v = Σ_j D_{vj} f(x) aj = Σ_{ij} D_{vj} fi(x) wi aj

    = Σ_{ij} D_{vj} fi(x) wi vj ( Σ_k ak vk ) = Σ_{ij} D_{vj} fi(x) wi vj (v)

where Σ_k ak vk = v and

    f(x) = Σ_i fi(x) wi.    (6.8)

This is because

    wi vj ( Σ_k ak vk ) ≡ Σ_k ak wi δ_{jk} = wi aj.

Thus

    Df(x) = Σ_{ij} D_{vj} fi(x) wi vj
I propose to iterate this observation, starting with f and then going to Df and then
D2 f and so forth. Hopefully it will yield a rational way to understand higher order
derivatives in the same way that matrices can be used to understand linear transforma-
tions. Thus beginning with the derivative,
    Df(x) = Σ_{i j1} D_{v_{j1}} fi(x) wi v_{j1}.
Then in this special case, the following definition is equivalent to the above as a
definition of what is meant by a C k function.
Definition 6.7.5 Let U be an open subset of Rⁿ and let f : U → Y. Then for k a nonnegative integer, f is C^k if for every |α| ≤ k, D^α f exists and is continuous.
Proof: Recall by Theorem 5.8.4 it does not matter how this norm is defined and
the definition above is convenient. It obviously satisfies most axioms of a norm. The
only one which is not obvious is the triangle inequality. I will show this now.
Suppose then that ||x|| + ||x1|| ≥ ||y|| + ||y1||. Then the above equals ||x|| + ||x1||, and

    ||x|| + ||x1|| ≤ max(||x||, ||y||) + max(||x1||, ||y1||) ≡ ||(x, y)|| + ||(x1, y1)||
B ((x, y), r) ⊆ U.
This says that if (u, v) ∈ X × Y is such that ||(u, v) − (x, y)|| < r, then (u, v) ∈ U. Thus if

    ||(u, y) − (x, y)|| = ||u − x|| < r,

then (u, y) ∈ U. This has just said that B(x, r), the ball taken in X, is contained in Uy.
This proves the lemma. ¥
Of course one could also consider
Ux ≡ {y : (x, y) ∈ U }
in the same way and conclude this set is open in Y. Also, the generalization to many factors yields the same conclusion. In this case, for x ∈ ∏_{i=1}^{n} Xi, let

    ||x|| ≡ max { ||xi||_{Xi} : x = (x1, · · ·, xn) }

Then a similar argument to the above shows this is a norm on ∏_{i=1}^{n} Xi.
Corollary 6.8.2 Let U ⊆ ∏_{i=1}^{n} Xi and let

    U_{(x1,···,xi−1,xi+1,···,xn)} ≡ { x ∈ F^{ri} : (x1, · · ·, xi−1, x, xi+1, · · ·, xn) ∈ U }.
where v = (v1 , · · ·, vn ) .
Proof: Suppose then that Di g exists and is continuous for each i. Note that
    Σ_{j=1}^{k} θj vj = (v1, · · ·, vk, 0, · · ·, 0).

Thus Σ_{j=1}^{n} θj vj = v and define Σ_{j=1}^{0} θj vj ≡ 0. Therefore,
    g(x + v) − g(x) = Σ_{k=1}^{n} [ g(x + Σ_{j=1}^{k} θj vj) − g(x + Σ_{j=1}^{k−1} θj vj) ]    (6.10)

    g(x + Σ_{j=1}^{k} θj vj) − g(x + θk vk) − [ g(x + Σ_{j=1}^{k−1} θj vj) − g(x) ]    (6.12)
and the expression in 6.12 is of the form h (vk ) − h (0) where for small w ∈ Xk ,
    h(w) ≡ g(x + Σ_{j=1}^{k−1} θj vj + θk w) − g(x + θk w).

Therefore,

    Dh(w) = Dk g(x + Σ_{j=1}^{k−1} θj vj + θk w) − Dk g(x + θk w)
j=1
and by continuity, ||Dh (w)|| < ε provided ||v|| is small enough. Therefore, by Theorem
6.4.2, whenever ||v|| is small enough,
which shows that since ε is arbitrary, the expression in 6.12 is o (v). Now in 6.11
which shows Dg (x) exists and equals the formula given in 6.9.
Next suppose g is C 1 . I need to verify that Dk g (x) exists and is continuous. Let
v ∈ Xk sufficiently small. Then
g (x + θk v) − g (x) = Dg (x) θk v + o (θk v)
= Dg (x) θk v + o (v)
since ||θk v|| = ||v||. Then Dk g (x) exists and equals
Dg (x) ◦ θk
Now x → Dg(x) is continuous. Since θk is linear, it follows from Theorem 5.8.3 that θk : Xk → ∏_{i=1}^{n} Xi is also continuous. This proves the theorem. ¥
The way this is usually used is in the following corollary, a case of Theorem 6.8.5
obtained by letting Xi = F in the above theorem.
Corollary 6.8.6 Let U be an open subset of Fn and let f :U → Fm be C 1 in the sense
that all the partial derivatives of f exist and are continuous. Then f is differentiable
and
    f(x + v) = f(x) + Σ_{k=1}^{n} (∂f/∂xk)(x) vk + o(v).
Letting (s, t) → (0, 0) and using the continuity of fxy and fyx at (x, y) ,
By considering the real and imaginary parts of f in the case where f has values in
C you obtain the following corollary.
    fx = y (x⁴ − y⁴ + 4x²y²)/(x² + y²)²,    fy = x (x⁴ − y⁴ − 4x²y²)/(x² + y²)²
Now

    fxy(0, 0) ≡ lim_{y→0} [fx(0, y) − fx(0, 0)] / y = lim_{y→0} −y⁴/(y²)² = −1

while

    fyx(0, 0) ≡ lim_{x→0} [fy(x, 0) − fy(0, 0)] / x = lim_{x→0} x⁴/(x²)² = 1
showing that although the mixed partial derivatives do exist at (0, 0) , they are not equal
there.
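The two limits above can be checked numerically straight from the displayed formulas for fx and fy; a sketch, where the step sizes are arbitrary illustrations:

```python
def fx(x, y):
    # partial derivative formula given in the text (valid for (x, y) != (0, 0))
    return y * (x**4 - y**4 + 4*x**2*y**2) / (x**2 + y**2)**2

def fy(x, y):
    return x * (x**4 - y**4 - 4*x**2*y**2) / (x**2 + y**2)**2

# difference quotients at the origin, using fx(0,0) = fy(0,0) = 0
y = 1e-6
fxy_00 = (fx(0.0, y) - 0.0) / y    # -> -1
x = 1e-6
fyx_00 = (fy(x, 0.0) - 0.0) / x    # -> +1

assert abs(fxy_00 + 1.0) < 1e-9
assert abs(fyx_00 - 1.0) < 1e-9
```

So the mixed partials both exist at (0, 0) yet disagree, exactly as computed above.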
and

    ||(I − A)⁻¹|| ≤ (1 − r)⁻¹.    (6.15)

Furthermore, if

    I ≡ { A ∈ L(X, X) : A⁻¹ exists },

the map A → A⁻¹ is continuous on I and I is an open subset of L(X, X).
Then à !
n
X
(I − A) ck vk =0
k=1
n
which requires each ck = 0 because the {vk } are independent. Hence {(I − A) vk }k=1
is a basis for X because there are n of these vectors and every basis has the same size.
Therefore, if y ∈ X, there exist scalars, ck such that
n
à n !
X X
y= ck (I − A) vk = (I − A) ck vk
k=1 k=1
−1
so (I − A) is onto as claimed. Thus (I − A) ∈ L (X, X) and it remains to estimate
its norm.
which shows the map which takes a linear transformation in I to its inverse is continuous.
This proves the lemma. ¥
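The estimate 6.15 reflects the Neumann series (I − A)⁻¹ = Σ_{k≥0} A^k, which converges when ||A|| ≤ r < 1. A numerical sketch, where the matrix A is a hypothetical example:

```python
import numpy as np

A = np.array([[0.2, 0.1],
              [0.0, 0.3]])
r = np.linalg.norm(A, 2)          # operator (spectral) norm; here r < 1
assert r < 1

# Partial sums of the Neumann series I + A + A^2 + ...
S = np.zeros_like(A)
term = np.eye(2)
for _ in range(200):
    S = S + term
    term = term @ A

inv = np.linalg.inv(np.eye(2) - A)
assert np.allclose(S, inv)
# estimate 6.15: ||(I - A)^{-1}|| <= 1/(1 - r)
assert np.linalg.norm(inv, 2) <= 1.0 / (1.0 - r) + 1e-12
```

The norm bound holds because the series is dominated term by term by the geometric series Σ r^k.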
The next theorem is a very useful result in many areas. It will be used in this
section to give a short proof of the implicit function theorem but it is also useful in
studying differential equations and integral equations. It is sometimes called the uniform
contraction principle.
Theorem 6.10.2 Let X, Y be finite dimensional normed vector spaces. Also let
E be a closed subset of X and F a closed subset of Y. Suppose for each (x, y) ∈ E × F,
T (x, y) ∈ E and satisfies
Then for each y ∈ F there exists a unique “fixed point” for T (·, y), x ∈ E, satisfying

    T(x, y) = x    (6.19)

and

    ||x(y) − x(y0)|| ≤ (M / (1 − r)) ||y − y0||.    (6.20)
Proof: First consider the claim there exists a fixed point for the mapping, T (·, y).
For a fixed y, let g (x) ≡ T (x, y). Now pick any x0 ∈ E and consider the sequence,
Then by 6.17,

    ||x_{k+p} − x_k|| ≤ Σ_{i=1}^{p} r^{k+i−1} ||g(x0) − x0|| ≤ r^k ||g(x0) − x0|| / (1 − r).
Since 0 < r < 1, this shows that {xk}_{k=1}^{∞} is a Cauchy sequence. Therefore, by completeness of E it converges to a point x ∈ E. To see x is a fixed point, use the continuity of g to obtain

    x ≡ lim_{k→∞} xk = lim_{k→∞} x_{k+1} = lim_{k→∞} g(xk) = g(x).
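The iteration in this proof is exactly how fixed points are computed in practice. A minimal sketch with the hypothetical contraction g = cos on [0, 1], where the contraction constant is r = sin 1 < 1:

```python
import math

def g(x):
    # g = cos is a contraction on [0, 1]: |g'(x)| = |sin x| <= sin(1) < 1 there
    return math.cos(x)

x = 0.0                # any starting point x0 in [0, 1] works
for _ in range(200):   # the Picard iteration x_{k+1} = g(x_k)
    x = g(x)

# x is (numerically) the unique fixed point: g(x) = x
assert abs(g(x) - x) < 1e-12
```

The iterates converge geometrically at rate r, which is what makes the Cauchy estimate in the proof work.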
Then there exist positive constants, δ, η, such that for every y ∈ B (y0 , η) there exists a
unique x (y) ∈ B (x0 , δ) such that
f (x (y) , y) = 0. (6.22)
    ||D1 T(x, y)|| < 1/2.    (6.24)
Also, it can be assumed δ is small enough that
    ||D1 f(x0, y0)⁻¹|| ||D2 f(x, y)|| < M    (6.25)

where M > ||D1 f(x0, y0)⁻¹|| ||D2 f(x0, y0)||. By Theorem 6.4.2, whenever x, x′ ∈ B(x0, δ) and y ∈ B(y0, δ),

    ||T(x, y) − T(x′, y)|| ≤ (1/2) ||x − x′||.    (6.26)
Solving 6.23 for D1 f (x, y) ,
and
    D2 g(x, y) = −D1 f(x0, y0)⁻¹ D2 f(x, y).
Also note that T(x, y) = x is the same as saying f(x, y) = 0, and also that g(x0, y0) = 0.
Thus by 6.25 and Theorem 6.4.2, it follows that for such (x, y) ∈ B (x0 , δ) × B (y0 , η),
≤ M ||y2 − y1 || . (6.29)
From now on assume ||x − x0 || < δ and ||y − y0 || < η so that 6.29, 6.27, 6.28, 6.26, and
6.25 all hold. By 6.29, 6.26, 6.28, and the uniform contraction principle, Theorem 6.10.2
applied to E ≡ B(x0, 5δ/6) and F ≡ B(y0, η) implies that for each y ∈ B(y0, η), there exists a unique x(y) ∈ B(x0, δ) (actually in B(x0, 5δ/6)) such that T(x(y), y) = x(y)
which is equivalent to
f (x (y) , y) = 0.
Furthermore,
||x (y) − x (y0 )|| ≤ 2M ||y − y0 || . (6.30)
This proves the implicit function theorem except for the verification that y → x (y)
is C 1 . This is shown next. Letting v be sufficiently small, Theorem 6.8.5 and Theorem
6.4.2 imply
0 = f (x (y + v) , y + v) − f (x (y) , y) =
D1 f (x (y) , y) (x (y + v) − x (y)) +
+D2 f (x (y) , y) v + o ((x (y + v) − x (y) , v)) .
The last term in the above is o (v) because of 6.30. Therefore, using 6.27, solve the
above equation for x (y + v) − x (y) and obtain
    x(y + v) − x(y) = −D1 f(x(y), y)⁻¹ D2 f(x(y), y) v + o(v)
Now it follows from the continuity of D2 f, D1 f, the inverse map, 6.30, and this formula for Dx(y) that x(·) is C¹(B(y0, η)). This proves the theorem. ¥
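The formula for Dx(y) can be checked on a concrete case. The sketch below uses the hypothetical scalar function f(x, y) = x³ + x − y, for which D1 f = 3x² + 1 is always invertible, and solves f(x(y), y) = 0 by Newton's method (an implementation choice, not part of the theorem):

```python
def f(x, y):
    # hypothetical example: f(x, y) = x^3 + x - y, so D1 f = 3x^2 + 1 > 0
    return x**3 + x - y

def x_of_y(y):
    # solve f(x, y) = 0 for x by Newton's method; f is strictly
    # increasing in x, so the root x(y) is unique
    x = 0.0
    for _ in range(60):
        x -= f(x, y) / (3 * x**2 + 1)
    return x

y0, h = 2.0, 1e-6
numeric = (x_of_y(y0 + h) - x_of_y(y0 - h)) / (2 * h)   # dx/dy by differences
x0 = x_of_y(y0)                                          # here x(2) = 1
formula = -(1.0 / (3 * x0**2 + 1)) * (-1.0)              # -D1f^{-1} D2f, D2f = -1
assert abs(numeric - formula) < 1e-6
```

For this f the implicit derivative is 1/(3x² + 1), which at x(2) = 1 equals 1/4, and the difference quotient reproduces it.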
The next theorem is a very important special case of the implicit function theo-
rem known as the inverse function theorem. Actually one can also obtain the implicit
function theorem from the inverse function theorem. It is done this way in [28] and in
[2].
x0 ∈ W ⊆ U, (6.33)
f −1 is C 1 , (6.35)
F (x, y) ≡ f (x) − y
where y0 ≡ f (x0 ). Thus the function y → x (y) defined in that theorem is f −1 . Now
let
W ≡ B (x0 , δ) ∩ f −1 (B (y0 , η))
and
V ≡ B (y0 , η) .
This proves the theorem. ¥
The scalar valued entries of the matrix of D2 f (x (y) , y) have the same differentiabil-
ity as the function y →D2 f (x (y) , y) . This is because the linear projection map, π ij
mapping L (Y, Z) to F given by π ij L ≡ Lij , the ij th entry of the matrix of L with
respect to the given bases is continuous thanks to Theorem 5.8.3. Similar considera-
tions apply to D1 f (x (y) , y) and the entries of its matrix, D1 f (x (y) , y)ij taken with
respect to suitable bases. From the formula for the inverse of a matrix, Theorem 3.5.14,
the ijth entries of the matrix of D1 f(x(y), y)⁻¹, denoted D1 f(x(y), y)⁻¹_{ij}, also have the same differentiability as y → D1 f(x(y), y).
Now consider the formula for the derivative of the implicitly defined function in 6.31,
    Dx(y) = −D1 f(x(y), y)⁻¹ D2 f(x(y), y).    (6.36)
The above derivative is in L (Y, X) . Let {w1 , · · ·, wm } be a basis for Y and let {v1 , · · ·, vn }
be a basis for X. Letting xi be the ith component of x with respect to the basis for
X, it follows from Theorem 6.7.1, y → x (y) will be C k if all such Gateaux derivatives,
Dwj1 wj2 ···wjr xi (y) exist and are continuous for r ≤ k and for any i. Consider what is
required for this to happen. By 6.36,
    D_{wj} xi(y) = − Σ_k ( D1 f(x(y), y)⁻¹ )_{ik} ( D2 f(x(y), y) )_{kj} ≡ G1(x(y), y)    (6.37)
where (x, y) → G1 (x, y) is C k−1 because it is assumed f is C k and one derivative has
been taken to write the above. If k ≥ 2, then another Gateaux derivative can be taken.
Since a similar result holds for all i and any choice of wj , wk , this shows x is at least
C 2 . If k ≥ 3, then another Gateaux derivative can be taken because then (x, y, z) →
G2 (x, y, z) is C 1 and it has been established Dx is C 1 . Continuing this way shows
Dwj1 wj2 ···wjr xi (y) exists and is continuous for r ≤ k. This proves the following corollary
to the implicit and inverse function theorems.
Corollary 6.10.5 In the implicit and inverse function theorems, you can replace
C 1 with C k in the statements of the theorems for any k ∈ N.
f : U ⊆ Rn × Rm → Rn
and f(x0, y0) = 0 while f is C¹. How can you recognize the condition of the implicit function theorem which says D1 f(x0, y0)⁻¹ exists? This is really not hard. You recall the matrix of the transformation D1 f(x0, y0) with respect to the usual basis vectors is
    [ f1,x1(x0, y0)  · · ·  f1,xn(x0, y0) ]
    [      ⋮                     ⋮       ]
    [ fn,x1(x0, y0)  · · ·  fn,xn(x0, y0) ]

and so D1 f(x0, y0)⁻¹ exists exactly when the determinant of the above matrix is
nonzero. This is the condition to check. In the general case, you just need to verify
D1 f (x0 , y0 ) is one to one and this can also be accomplished by looking at the matrix
of the transformation with respect to some bases on X and Z.
Then F (1) = F (0) = 0. Therefore, by Rolle’s theorem there exists t between 0 and 1
such that F 0 (t) = 0. Thus,
    0 = −F′(t) = h′(t) + Σ_{k=1}^{m} [ h^{(k+1)}(t) (1 − t)^k / k! − h^{(k)}(t) k (1 − t)^{k−1} / k! ] − K (m + 1)(1 − t)^m

And so

    0 = h′(t) + Σ_{k=1}^{m} h^{(k+1)}(t) (1 − t)^k / k! − Σ_{k=0}^{m−1} h^{(k+1)}(t) (1 − t)^k / k! − K (m + 1)(1 − t)^m

    = h′(t) + h^{(m+1)}(t) (1 − t)^m / m! − h′(t) − K (m + 1)(1 − t)^m

and so

    K = h^{(m+1)}(t) / (m + 1)!.
This proves the theorem. ¥
Now let f : U → R where U ⊆ X a normed vector space and suppose f ∈ C m (U ).
Let x ∈ U and let r > 0 be such that
B (x,r) ⊆ U.
It follows from Taylor’s formula for a function of one variable given above that
    f(x + v) = f(x) + Σ_{k=1}^{m} D^{(k)} f(x) v^k / k! + D^{(m+1)} f(x + tv) v^{m+1} / (m + 1)!.    (6.38)
B (x,r) ⊆ U,
and ||v|| < r, there exists t ∈ (0, 1) such that 6.38 holds.
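Formula 6.38 with m = 1 can be observed numerically: after the second-order terms the error shrinks like o(||v||²) (in fact like ||v||³ for smooth f). A sketch with the hypothetical function f(x, y) = eˣ sin y, whose gradient and Hessian are entered by hand:

```python
import math

def f(p):
    # hypothetical smooth example f(x, y) = exp(x) * sin(y)
    return math.exp(p[0]) * math.sin(p[1])

x = (0.2, 0.5)
fx = math.exp(x[0]) * math.sin(x[1])                       # f(x)
grad = (math.exp(x[0]) * math.sin(x[1]),                   # f_x
        math.exp(x[0]) * math.cos(x[1]))                   # f_y
H = [[ math.exp(x[0]) * math.sin(x[1]), math.exp(x[0]) * math.cos(x[1])],
     [ math.exp(x[0]) * math.cos(x[1]), -math.exp(x[0]) * math.sin(x[1])]]

errs = []
for t in (1e-1, 1e-2, 1e-3):
    v = (t, -0.5 * t)
    taylor = (fx + grad[0]*v[0] + grad[1]*v[1]
              + 0.5 * (H[0][0]*v[0]*v[0] + 2*H[0][1]*v[0]*v[1] + H[1][1]*v[1]*v[1]))
    errs.append(abs(f((x[0] + v[0], x[1] + v[1])) - taylor))

# shrinking v by a factor 10 shrinks the error by roughly 1000 (third order)
assert errs[1] < errs[0] / 100 and errs[2] < errs[1] / 100
```

The cubic decay of the error is the numerical signature of the remainder term in 6.38.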
    f(x + v) = f(x) + Df(x) v + D² f(x + tv) v² / 2.    (6.39)
Consider
where

    H_{ij}(x + tv) = D² f(x + tv)(ei)(ej) = ∂² f(x + tv) / ∂xj ∂xi.
From Theorem 6.9.1, this is a symmetric real matrix, thus self adjoint. By the
continuity of the second partial derivative,
    f(x + v) = f(x) + Df(x) v + (1/2) vᵀ H(x) v + (1/2) vᵀ (H(x + tv) − H(x)) v    (6.40)

where the last two terms involve ordinary matrix multiplication and

    vᵀ = (v1 · · · vn)
f (y) ≤ f (x) .
Proof: Since Df(x) = 0, formula 6.40 holds and by continuity of the second derivative, H(x) is a symmetric matrix. Thus H(x) has all real eigenvalues. Suppose first that H(x) has all positive eigenvalues and that all are larger than δ² > 0. Then by Theorem 3.8.23, H(x) has an orthonormal basis of eigenvectors, {vi}_{i=1}^{n}, and if u is an arbitrary vector such that u = Σ_{j=1}^{n} uj vj where uj = u · vj, then

    uᵀ H(x) u = ( Σ_{j=1}^{n} uj vj )ᵀ H(x) ( Σ_{j=1}^{n} uj vj )

    = Σ_{j=1}^{n} uj² λj ≥ δ² Σ_{j=1}^{n} uj² = δ² |u|².

From this and 6.40, if ||v|| is small enough,

    f(x + v) ≥ f(x) + (1/2) δ² |v|² − (1/4) δ² |v|² = f(x) + (δ²/4) |v|².
2 4 4
This shows the first claim of the theorem. The second claim follows from similar rea-
soning. Suppose H (x) has a positive eigenvalue λ2 . Then let v be an eigenvector for
this eigenvalue. Then from 6.40,
    f(x + tv) = f(x) + (1/2) t² vᵀ H(x) v + (1/2) t² vᵀ (H(x + tv) − H(x)) v

which implies

    f(x + tv) = f(x) + (1/2) t² λ² |v|² + (1/2) t² vᵀ (H(x + tv) − H(x)) v

    ≥ f(x) + (1/4) t² λ² |v|²
whenever t is small enough. Thus in the direction v the function has a local minimum
at x. The assertion about the local maximum in some direction follows similarly. This
proves the theorem. ¥
This theorem is an analogue of the second derivative test for higher dimensions. As
in one dimension, when there is a zero eigenvalue, it may be impossible to determine
from the Hessian matrix what the local qualitative behavior of the function is. For
example, consider
f1 (x, y) = x4 + y 2 , f2 (x, y) = −x4 + y 2 .
Then Dfi (0, 0) = 0 and for both functions, the Hessian matrix evaluated at (0, 0) equals
    [ 0  0 ]
    [ 0  2 ]
but the behavior of the two functions is very different near the origin. The second has
a saddle point while the first has a minimum there.
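The failure of the test in the presence of a zero eigenvalue is easy to see numerically: both functions above have the same Hessian at the origin. A sketch using finite-difference Hessians, where the step h is an arbitrary illustration:

```python
import numpy as np

def hessian(f, p, h=1e-4):
    # symmetric finite-difference Hessian, for illustration only
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            H[i, j] = (f(p + e_i + e_j) - f(p + e_i - e_j)
                       - f(p - e_i + e_j) + f(p - e_i - e_j)) / (4 * h * h)
    return H

f1 = lambda p: p[0]**4 + p[1]**2    # minimum at the origin
f2 = lambda p: -p[0]**4 + p[1]**2   # saddle behavior at the origin

H1 = hessian(f1, np.zeros(2))
H2 = hessian(f2, np.zeros(2))

# both Hessians at the origin are [[0, 0], [0, 2]] (a zero eigenvalue)...
assert np.allclose(H1, [[0, 0], [0, 2]], atol=1e-4)
assert np.allclose(H2, [[0, 0], [0, 2]], atol=1e-4)
# ...yet the local behavior along the x-axis differs:
assert f1(np.array([0.1, 0.0])) > 0 and f2(np.array([0.1, 0.0])) < 0
```

Since the Hessians agree, no second-order test can distinguish the two behaviors; the difference is carried entirely by the fourth-order terms.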
gi (x) = 0, i = 1, · · ·, m (6.41)
be a collection of equality constraints with m < n. Now consider the system of nonlinear
equations
f (x) = a
gi (x) = 0, i = 1, · · ·, m.
x0 is a local maximum if f (x0 ) ≥ f (x) for all x near x0 which also satisfies the
constraints 6.41. A local minimum is defined similarly. Let F : U × R → Rm+1 be
defined by
              [ f(x) − a ]
    F(x, a) ≡ [  g1(x)   ]    (6.42)
              [    ⋮     ]
              [  gm(x)   ]
Now consider the (m + 1) × n Jacobian matrix, the matrix of the linear transformation D1 F(x, a) with respect to the usual basis for Rⁿ and R^{m+1}:

    [  fx1(x0)   · · ·   fxn(x0)  ]
    [ g1,x1(x0)  · · ·  g1,xn(x0) ]
    [     ⋮                 ⋮     ]
    [ gm,x1(x0)  · · ·  gm,xn(x0) ]
If this matrix has rank m + 1 then some (m + 1) × (m + 1) submatrix has nonzero determinant. It follows from the implicit function theorem that there exist m + 1 variables, xi1, · · ·, xim+1, such that the system
F (x,a) = 0 (6.43)
specifies these m + 1 variables as a function of the remaining n − (m + 1) variables and
a in an open set of Rn−m . Thus there is a solution (x,a) to 6.43 for some x close to x0
whenever a is in some open interval. Therefore, x0 cannot be either a local minimum or
a local maximum. It follows that if x0 is either a local maximum or a local minimum,
then the above matrix must have rank less than m + 1 which, by Corollary 3.5.20,
requires the rows to be linearly dependent. Thus, there exist m scalars,
λ1 , · · ·, λm ,
are linearly independent, then, µ 6= 0 and dividing by µ yields an expression of the form
    [ fx1(x0) ]        [ g1,x1(x0) ]                [ gm,x1(x0) ]
    [    ⋮    ]  = λ1  [     ⋮     ]  + · · · + λm  [     ⋮     ]    (6.46)
    [ fxn(x0) ]        [ g1,xn(x0) ]                [ gm,xn(x0) ]
at every point x0 which is either a local maximum or a local minimum. This proves the
following theorem.
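As a concrete illustration of the multiplier condition 6.46, take the hypothetical problem of maximizing f(x, y) = x + y on the circle x² + y² = 1 (so m = 1, n = 2):

```python
import math

# Maximize f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
# Condition 6.46 reads (1, 1) = lambda * (2x, 2y), so x = y = 1/(2*lambda);
# the constraint then forces x = y = 1/sqrt(2).
x = y = 1.0 / math.sqrt(2.0)
lam = 1.0 / (2.0 * x)

# the gradient condition and the constraint both hold:
assert abs(1.0 - lam * 2 * x) < 1e-12 and abs(1.0 - lam * 2 * y) < 1e-12
assert abs(x * x + y * y - 1.0) < 1e-12
# and no sampled constrained point does better than f = sqrt(2):
best = max(math.cos(2 * math.pi * t / 1000) + math.sin(2 * math.pi * t / 1000)
           for t in range(1000))
assert best <= x + y + 1e-9
```

Solving the multiplier equations together with the constraint is the standard recipe the theorem justifies.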
6.13 Exercises
1. Suppose L ∈ L (X, Y ) and suppose L is one to one. Show there exists r > 0 such
that for all x ∈ X,
||Lx|| ≥ r ||x|| .
Hint: You might argue that |||x||| ≡ ||Lx|| is a norm.
2. Show every polynomial, Σ_{|α|≤k} dα x^α, is C^k for every k.
5. The existence of partial derivatives does not imply continuity as was shown in an
example. However, much more can be said than this. Consider
    f(x, y) = (x² − y⁴)² / (x² + y⁴)²   if (x, y) ≠ (0, 0),
    f(x, y) = 1                          if (x, y) = (0, 0).
Show each Gateaux derivative, Dv f (0) exists and equals 0 for every v. Also show
each Gateaux derivative exists at every other point in R2 . Now consider the curve
x2 = y 4 and the curve y = 0 to verify the function fails to be continuous at (0, 0).
This is an example of an everywhere Gateaux differentiable function which is not
differentiable and not continuous.
Determine whether f is continuous at (0, 0). Find fx(0, 0) and fy(0, 0). Are the partial derivatives of f continuous at (0, 0)? Find D(u,v) f((0, 0)), lim_{t→0} f(t(u, v))/t. Is the mapping (u, v) → D(u,v) f((0, 0)) linear? Is f differentiable at (0, 0)?
whenever t ∈ [0, 1]. Suppose also that f is differentiable. Show then that for every
x, y ∈ V,
(Df (x) − Df (y)) (x − y) ≥ 0.
(u · v (x)) = Df (x) u.
This special vector is called the gradient and is usually denoted by ∇f (x) . Hint:
You might review the Riesz representation theorem presented earlier.
Hint: Consider T x = f (x)−Lx and argue ||DT (x)|| < k. Then consider Theorem
6.4.2.
x1 , x2 ∈ B (x0 , δ) ,
then
    |f(x1) − f(x2) − Df(x0)(x1 − x2)| ≤ (r/2) |x1 − x2|    (6.47)
then use Problem 1.
12. Suppose M ∈ L(X, Y) and suppose M is onto. Show there exists L ∈ L(Y, X) such that

    LMx = Px

where P ∈ L(X, X) and P² = P. Also show L is one to one and onto. Hint: Let {y1, · · ·, ym} be a basis of Y and let Mxi = yi. Then define

    Ly ≡ Σ_{i=1}^{m} αi xi   where   y = Σ_{i=1}^{m} αi yi.
Show {x1, · · ·, xm} is a linearly independent set and show you can obtain {x1, · · ·, xm, · · ·, xn}, a basis for X in which Mxj = 0 for j > m. Then let

    Px ≡ Σ_{i=1}^{m} αi xi

where

    x = Σ_{i=1}^{n} αi xi.
Show that there exists ε > 0 and an open subset of B (x0 , δ) , V , such that
f : V →B (f (x0 ) , ε) is one to one and onto. Also Df −1 (y) exists for each y ∈
B (f (x0 ) , ε) and is given by the formula
    Df⁻¹(y) = [Df(f⁻¹(y))]⁻¹.
Hint: Let

    Ty(x) ≡ T(x, y) ≡ x − L⁻¹(f(x) − y)

and for |y − f(x0)| < (1 − r)δ / (2||L⁻¹||), consider {Tyⁿ(x0)}. This is a version of the inverse function theorem for f only differentiable, not C¹.
16. Recall the nth derivative can be considered a multilinear function defined on X n
with values in some normed vector space. Now define a function denoted as
wi vj1 · · · vjn which maps X n → Y in the following way
    ≡ Σ_{k1 k2 ··· kn} wi (ak1 ak2 · · · akn) δ_{j1 k1} δ_{j2 k2} · · · δ_{jn kn} = wi aj1 aj2 · · · ajn    (6.49)
Show each wi vj1 · · · vjn is an n linear Y valued function. Next show the set of n
linear Y valued functions is a vector space and these special functions, wi vj1 ···vjn
for all choices of i and the jk is a basis of this vector space. Find the dimension
of the vector space.
17. Minimize Σ_{j=1}^{n} xj subject to the constraint Σ_{j=1}^{n} xj² = a². Your answer should be some function of a which you may assume is a positive number.
18. Find the point (x, y, z) on the level surface 4x² + y² − z² = 1 which is closest to (0, 0, 0).
19. A curve is formed from the intersection of the plane, 2x + 3y + z = 3 and the
cylinder x2 + y 2 = 4. Find the point on this curve which is closest to (0, 0, 0) .
20. A curve is formed from the intersection of the plane, 2x + 3y + z = 3 and the
sphere x2 + y 2 + z 2 = 16. Find the point on this curve which is closest to (0, 0, 0) .
21. Find the point on the plane, 2x + 3y + z = 4 which is closest to the point (1, 2, 3) .
22. Let A = (Aij) be an n × n matrix which is symmetric. Thus Aij = Aji and recall (Ax)i = Aij xj where as usual we sum over the repeated index. Show ∂/∂xi (Aij xj xi) = 2Aij xj. Show that when you use the method of Lagrange multipliers to maximize the function Aij xj xi subject to the constraint Σ_{j=1}^{n} xj² = 1, the value of λ which corresponds to the maximum value of this function is such that Aij xj = λxi. Thus Ax = λx and λ is an eigenvalue of the matrix A.
23. Let x1 , · · ·, x5 be 5 positive numbers. Maximize their product subject to the
constraint that
x1 + 2x2 + 3x3 + 4x4 + 5x5 = 300.
24. Let f(x1, · · ·, xn) = x1ⁿ x2^{n−1} · · · xn¹. Then f achieves a maximum on the set

    S ≡ { x ∈ Rⁿ : Σ_{i=1}^{n} i xi = 1 and each xi ≥ 0 }.
25. Let (x, y) be a point on the ellipse, x2 /a2 +y 2 /b2 = 1 which is in the first quadrant.
Extend the tangent line through (x, y) till it intersects the x and y axes and let
A (x, y) denote the area of the triangle formed by this line and the two coordinate
axes. Find the minimum value of the area of this triangle as a function of a and
b.
26. Maximize ∏_{i=1}^{n} xi² (≡ x1² × x2² × x3² × · · · × xn²) subject to the constraint Σ_{i=1}^{n} xi² = r². Show the maximum is (r²/n)ⁿ. Now show from this that

    ( ∏_{i=1}^{n} xi² )^{1/n} ≤ (1/n) Σ_{i=1}^{n} xi²

and there exist values of the xi for which equality holds. This says the “geometric mean” is always smaller than the arithmetic mean.
27. Maximize x²y² subject to the constraint

    x^{2p}/p + y^{2q}/q = r²

where p, q are real numbers larger than 1 which have the property that

    1/p + 1/q = 1.

Show the maximum is achieved when x^{2p} = y^{2q} and equals r². Now conclude that if x, y > 0, then

    xy ≤ x^p/p + y^q/q

and there are values of x and y where this inequality is an equation.
Measures And Measurable Functions
The integral to be discussed next is the Lebesgue integral. This integral is more general
than the Riemann integral of beginning calculus. It is not as easy to define as this
integral but is vastly superior in every application. In fact, the Riemann integral has
been obsolete for over 100 years. There exist convergence theorems for this integral
which are not available for the Riemann integral and unlike the Riemann integral, the
Lebesgue integral generalizes readily to abstract settings used in probability theory.
Much of the analysis done in the last 100 years applies to the Lebesgue integral. For
these reasons, and because it is very easy to generalize the Lebesgue integral to functions
of many variables, I will present the Lebesgue integral here. First it is convenient to discuss outer measures, measures, and measurable functions in a general setting.
    K ⊆ ∪_{k=1}^{m} Uk.
It was shown earlier that in any finite dimensional normed vector space the closed
and bounded sets are those which are sequentially compact. The next theorem says that
in any normed vector space, sequentially compact and compact are the same.1 First
here is a very interesting lemma about the existence of something called a Lebesgue
number, the number r in the next lemma.
Lemma 7.1.2 Let K be a sequentially compact set in a normed vector space and let
U be an open cover of K. Then there exists r > 0 such that if x ∈ K, then B (x, r) is a
subset of some set of U.
¹ Actually, this is true more generally than for normed vector spaces. It is also true for metric spaces, those on which there is a distance defined.
Proof: Suppose no such r exists. Then in particular, 1/n does not work for each n ∈ N. Therefore, there exists xn ∈ K such that B (xn, 1/n) is not a subset of any of the
sets of U. Since K is sequentially compact, there exists a subsequence, {xnk } converging
to a point x of K. Then there exists r > 0 such that B (x, r) ⊆ U ∈ U because U is an
open cover. Also xnk ∈ B (x,r/2) for all k large enough and also for all k large enough,
1/nk < r/2. Therefore, there exists xnk ∈ B (x,r/2) and 1/nk < r/2. But this is a
contradiction because
B (xnk , 1/nk ) ⊆ B (x, r) ⊆ U
contrary to the choice of xnk which required B (xnk , 1/nk ) is not contained in any set
of U . This proves the lemma. ¥
Proof: Suppose first K is sequentially compact and let U be an open cover. Let r be a Lebesgue number as described in Lemma 7.1.2. Pick x1 ∈ K. Then B(x1, r) ⊆ U1 for some U1 ∈ U. Suppose {B(xi, r)}_{i=1}^{m} have been chosen such that

    B(xi, r) ⊆ Ui ∈ U.

If their union contains K then {Ui}_{i=1}^{m} is a finite subcover of U. If {B(xi, r)}_{i=1}^{m} does not cover K, then there exists x_{m+1} ∉ ∪_{i=1}^{m} B(xi, r) and so B(x_{m+1}, r) ⊆ U_{m+1} ∈ U. This process must stop after finitely many choices of B(xi, r) because if not, {xk}_{k=1}^{∞} would have a subsequence which converges to a point of K, which cannot occur because whenever i ≠ j,

    ||xi − xj|| ≥ r.

Therefore, eventually

    K ⊆ ∪_{k=1}^{m} B(xk, r) ⊆ ∪_{k=1}^{m} Uk.
Corollary 7.1.4 Let X be a finite dimensional normed vector space and let K ⊆ X.
Then the following are equivalent.
In words, you look at all coverings of A with open intervals. For each of these
open coverings, you add the “lengths” of the individual open intervals and you take the
infimum of all such numbers obtained.
Then 1.) is obvious because if a countable collection of open intervals covers B then
it also covers A. Thus the set of numbers obtained for B is smaller than the set of
numbers for A. Why is µ (∅) = 0? Pick a point of continuity of F. Such points exist
because F is increasing and so it has only countably many points of discontinuity. Let
a be this point. Then ∅ ⊆ (a − δ, a + δ) and so µ (∅) ≤ 2δ for every δ > 0.
Consider 2.). If any µ (Ai ) = ∞, there is nothing to prove. The assertion simply is
∞ ≤ ∞. Assume then that µ (Ai ) < ∞ for all i. Then for each m ∈ N there exists a
countable set of open intervals, {(ai^m, bi^m)}_{i=1}^{∞}, such that

    µ(Am) + ε/2^m > Σ_{i=1}^{∞} (F(bi^m −) − F(ai^m +)).
By Theorem 7.1.3, finitely many of these intervals also cover [a, b]. It follows there exist finitely many of these intervals, {(ai, bi)}_{i=1}^{n}, which overlap such that a ∈ (a1, b1), b1 ∈ (a2, b2), · · ·, b ∈ (an, bn). Therefore,
    µ([a, b]) ≤ Σ_{i=1}^{n} (F(bi −) − F(ai +))
It follows

    Σ_{i=1}^{n} (F(bi −) − F(ai +)) ≥ µ([a, b]) ≥ Σ_{i=1}^{n} (F(bi −) − F(ai +)) − ε ≥ F(b+) − F(a−) − ε
Since ε is arbitrary, this shows
µ ([a, b]) ≥ F (b+) − F (a−)
but also, from the definition, the following inequality holds for all δ > 0.
µ ([a, b]) ≤ F ((b + δ) −) − F ((a − δ) −) ≤ F (b + δ) − F (a − δ)
Therefore, letting δ → 0 yields
µ ([a, b]) ≤ F (b+) − F (a−)
This establishes 3.).
Consider 4.). For small δ > 0,
µ ([a + δ, b − δ]) ≤ µ ((a, b)) ≤ µ ([a, b]) .
Therefore, from 3.) and the definition of µ,
F ((b − δ)) − F ((a + δ)) ≤ F ((b − δ) +) − F ((a + δ) −)
= µ ([a + δ, b − δ]) ≤ µ ((a, b)) ≤ F (b−) − F (a+)
Now letting δ decrease to 0 it follows
F (b−) − F (a+) ≤ µ ((a, b)) ≤ F (b−) − F (a+)
This shows 4.)
Consider 5.). From 3.) and 4.)
F (b+) − F ((a + δ))
≤ F (b+) − F ((a + δ) −)
= µ ([a + δ, b]) ≤ µ ((a, b])
≤ µ ((a, b + δ)) = F ((b + δ) −) − F (a+)
≤ F (b + δ) − F (a+) .
This establishes 5.) and 6.) is entirely similar to 5.). This proves the theorem. ¥
The triple (Ω, S, µ) is often called a measure space. Sometimes people refer to (Ω, S) as
a measurable space, making no reference to the measure. Sometimes (Ω, S) may also be
called a measure space.
    µ(∪_{i=1}^{∞} Ei) = lim_{n→∞} µ(En)    (7.1)

    µ(∩_{i=1}^{∞} Ei) = lim_{n→∞} µ(En).    (7.2)
n→∞
Stated more succinctly, Ek ↑ E implies µ (Ek ) ↑ µ (E) and Ek ↓ E with µ (E1 ) < ∞
implies µ (Ek ) ↓ µ (E).
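The first continuity property is easy to watch numerically for length measure on nested intervals; a sketch with the hypothetical increasing sets Ek = (0, 1 − 1/k), whose union is E = (0, 1):

```python
# Illustrates: Ek increasing to E implies mu(Ek) increasing to mu(E),
# for the length measure of intervals and Ek = (0, 1 - 1/k).
def length(a, b):
    return max(b - a, 0.0)

mu_Ek = [length(0.0, 1.0 - 1.0 / k) for k in range(1, 10001)]

# the measures increase...
assert all(m1 <= m2 for m1, m2 in zip(mu_Ek, mu_Ek[1:]))
# ...and converge to mu(E) = length of (0, 1) = 1
assert abs(mu_Ek[-1] - length(0.0, 1.0)) < 1e-3
```

The decreasing case 7.2 behaves the same way, but only because the first set has finite measure, as the statement requires.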
1. For every F ∈ F
2. For every F ∈ F
The first of the above conditions is called inner regularity and the second is called
outer regularity.
S (0,r) ≡ {x ∈ Y : ||x|| = r}
D (0,r) ≡ {x ∈ Y : ||x|| ≤ r}
B (0,r) ≡ {x ∈ Y : ||x|| < r}
Thus S (0,r) is a closed set as is D (0,r) while B (0,r) is an open set. These are closed
or open as stated in Y . Since S (0,r) and D (0,r) are intersections of closed sets, Y and
a closed set in X, these are also closed in X. Of course B (0, r) might not be open in
X. This would happen if Y has empty interior in X for example. However, S (0,r) and
D (0,r) are compact.
Proof: It is desired to show that in this setting outer regularity implies inner
regularity. First suppose F ⊆ D (0, n) where n ∈ N and F ∈ S. The following diagram
will help to follow the technicalities. In this picture, V is the material between the two
dotted curves, F is the inside of the solid curve and D (0,n) is inside the larger solid
curve.
[Diagram omitted: V is the region between the two dotted curves, F the inside of the solid curve, and D(0, n) \ F lies between them, as described above.]
because it is given that S contains the open sets. By 7.4 there exists an open set,
V ⊇ D (0, n) \ F such that
Since µ is a measure,
µ (V \ (D (0, n) \ F )) + µ (D (0, n) \ F ) = µ (V )
and by 7.6,
µ (V \ (D (0, n) \ F )) < ε
so in particular,
µ (V ∩ F ) < ε.
Now

    V ⊇ D(0, n) ∩ F^C

and so

    V^C ⊆ D(0, n)^C ∪ F

which implies

    V^C ∩ D(0, n) ⊆ F ∩ D(0, n) = F

Since F ⊆ D(0, n),

    µ(F \ (V^C ∩ D(0, n))) = µ(F ∩ (V^C ∩ D(0, n))^C) = µ((F ∩ V) ∪ (F ∩ D(0, n)^C)) = µ(F ∩ V) < ε
l + ε < µ (Fn )
provided n is large enough. Now it was just shown there exists K, a compact subset of Fn , such that µ (Fn ) < µ (K) + ε. Then K ⊆ F and

l < µ (Fn ) − ε < µ (K)

and so whenever l < µ (F ) , it follows there exists K, a compact subset of F, such that

l < µ (K) .
Lemma 7.4.4 Let X be a normed vector space and let S be any nonempty subset of
X. Define
dist (x, S) ≡ inf {||x − y|| : y ∈ S}
Then
|dist (x1 , S) − dist (x2 , S)| ≤ ||x1 −x2 || .
Proof: Suppose dist (x1 , S) ≥ dist (x2 , S) . Then let y ∈ S be such that

||x2 − y|| < dist (x2 , S) + ε.

Then

|dist (x1 , S) − dist (x2 , S)| = dist (x1 , S) − dist (x2 , S)
≤ ||x1 − y|| − (||x2 − y|| − ε)
≤ ||x1 − x2 || + ||x2 − y|| − ||x2 − y|| + ε = ||x1 − x2 || + ε.

Since ε is arbitrary, this proves the lemma in case dist (x1 , S) ≥ dist (x2 , S) . The case where dist (x2 , S) ≥ dist (x1 , S) is entirely similar. This proves the lemma. ■
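The Lipschitz estimate of Lemma 7.4.4 is easy to probe numerically. This sketch (with an arbitrary finite S ⊂ R and random test points, all invented for illustration) checks |dist (x1 , S) − dist (x2 , S)| ≤ ||x1 − x2 ||:

```python
import random

def dist(x, S):
    """dist(x, S) = inf over y in S of |x - y| (finite S, real line)."""
    return min(abs(x - y) for y in S)

random.seed(0)
S = [random.uniform(-10, 10) for _ in range(200)]
for _ in range(1000):
    x1, x2 = random.uniform(-20, 20), random.uniform(-20, 20)
    # Lipschitz constant 1, up to floating point slack
    assert abs(dist(x1, S) - dist(x2, S)) <= abs(x1 - x2) + 1e-12
```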
The next lemma says that regularity comes free for finite measures defined on the
Borel sets. Actually, it only almost says this. The following theorem will say it. This
lemma deals with closed in place of compact.
µ (F ) = inf {µ (V ) : V ⊇ F, V is open}
Proof: For convenience, I will call a measure which satisfies the above two conditions
“almost regular”. It would be regular if closed were replaced with compact. First note
every open set is the countable union of closed sets and every closed set is the countable
intersection of open sets. Here is why. Let V be an open set and let
Kk ≡ { x ∈ V : dist (x, V^C ) ≥ 1/k } .
Then clearly the union of the Kk equals V and each is closed because x→ dist (x, S) is
always a continuous function whenever S is any nonempty set. Next, for K closed, let

Wk ≡ { x : dist (x, K) < 1/k } .

Then each Wk is open and K = ∩∞_{k=1} Wk , a countable intersection of open sets. Now let F denote those sets of B (Y ) on which µ is both inner regular with respect to closed sets and outer regular. Then F contains the open sets. I want to show F is a σ algebra and then it will follow F = B (Y ).
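The exhaustion of an open set by the closed sets Kk can be made concrete. For V = (0, 1) in R (an assumed example, not from the text), dist (x, V^C ) = min(x, 1 − x), so Kk = [1/k, 1 − 1/k]:

```python
from fractions import Fraction as Fr

def dist_to_complement(x):
    # for V = (0, 1), dist(x, V^C) = min(x, 1 - x) when x lies in V
    return min(x, 1 - x)

def in_Kk(x, k):
    # membership in K_k = { x in V : dist(x, V^C) >= 1/k }
    return 0 < x < 1 and dist_to_complement(x) >= Fr(1, k)

# every point of V lies in K_k for all k large enough
for x in [Fr(1, 1000), Fr(1, 3), Fr(999, 1000)]:
    assert any(in_Kk(x, k) for k in range(2, 2001))

# and K_7 really is the closed interval [1/7, 6/7]
assert in_Kk(Fr(1, 7), 7) and not in_Kk(Fr(1, 8), 7)
```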
First I will show F is closed with respect to complements. Let F ∈ F. Then since
µ is finite and F is inner regular, there exists K ⊆ F such that
µ (F \ K) = µ (F ) − µ (K) < ε.
But K C \ F C = F \ K and so
µ (K^C \ F^C ) = µ (K^C ) − µ (F^C ) < ε
showing that µ is outer regular on F C . I have just approximated the measure of F C with
the measure of K C , an open set containing F C . A similar argument works to show F C
is inner regular. You start with V ⊇ F such that µ (V \ F ) < ε, note F^C \ V^C = V \ F, and then conclude µ (F^C \ V^C ) < ε, thus approximating F^C with the closed subset V^C .
Next I will show F is closed with respect to taking countable unions. Let {Fk } be
a sequence of sets in F. Then since Fk ∈ F, there exist {Kk } such that Kk ⊆ Fk and
µ (Fk \ Kk ) < ε/2k+1 . First choose m large enough that
µ ((∪∞_{k=1} Fk ) \ (∪m_{k=1} Fk )) < ε/2.
Then
µ ((∪m_{k=1} Fk ) \ (∪m_{k=1} Kk )) ≤ µ (∪m_{k=1} (Fk \ Kk )) ≤ Σm_{k=1} ε/2^{k+1} < ε/2
and so
µ ((∪∞_{k=1} Fk ) \ (∪m_{k=1} Kk )) ≤ µ ((∪∞_{k=1} Fk ) \ (∪m_{k=1} Fk )) + µ ((∪m_{k=1} Fk ) \ (∪m_{k=1} Kk ))
< ε/2 + ε/2 = ε
Since µ is outer regular on Fk , there exists Vk such that µ (Vk \ Fk ) < ε/2k . Then
µ ((∪∞_{k=1} Vk ) \ (∪∞_{k=1} Fk )) ≤ µ (∪∞_{k=1} (Vk \ Fk )) ≤ Σ∞_{k=1} µ (Vk \ Fk ) < Σ∞_{k=1} ε/2^k = ε
and this completes the demonstration that F is a σ algebra. This proves the lemma. ■
The next theorem is the main result. It shows regularity is automatic if µ (K) < ∞
for all compact K.
Proof: From Lemma 7.4.5 µ is outer regular. Now let F ∈ B (Y ). Then since µ is
finite, it follows from Lemma 7.4.5 that there exists a closed set H ⊆ F such that
µ (F ) < µ (H) + ε.
Then let Kk ≡ H ∩ B (0, k). Thus Kk is a closed and bounded, hence compact, set and ∪∞_{k=1} Kk = H. Therefore by Theorem 7.3.2, for all k large enough,

µ (F ) < µ (Kk ) + ε ≤ sup {µ (K) : K ⊆ F and K compact} + ε ≤ µ (F ) + ε
S (0, r) = ∩∞_{k=1} (B (0, r + 1/k) \ D (0, r − 1/k)) ,
a countable intersection of open sets which are decreasing as k → ∞. Since µ (B (0, r)) <
∞ by assumption, it follows from Theorem 7.3.2 that for each rk there exists an open
set, Uk ⊇ S (0,rk ) such that
µ (Uk ) < ε/2k+1 .
Now suppose µ (F ) < ∞; there is nothing to show if µ (F ) = ∞. Define finite measures µk as follows.
µ1 (A) ≡ µ (B (0, 1) ∩ A) ,
µ2 (A) ≡ µ ((B (0, 2) \ D (0, 1)) ∩ A) ,
µ3 (A) ≡ µ ((B (0, 3) \ D (0, 2)) ∩ A)
etc. Thus
µ (A) = Σ∞_{k=1} µk (A)
and each µk is a finite measure. By the first part there exists an open set Vk such that
Vk ⊇ F

and

µk (Vk ) < µk (F ) + ε/2^{k+1}
Without loss of generality Vk ⊆ (B (0, k) \ D (0, k − 1)) , since you can take the intersection of Vk with this open set. Thus
such that
whenever m is large enough. Therefore, letting µm (A) ≡ µ (A ∩ B (0, m)) , there exists
a compact set, K ⊆ F ∩ B (0, m) such that
was either oil, bread, flour or fish. In mathematics such things have also been done with sets. In the book by Bruckner, Bruckner and Thomson there is an interesting discussion of the Banach-Tarski paradox, which says it is possible to divide a ball in R3 into five disjoint pieces and assemble the pieces to form two disjoint balls of the same size as the first. The details can be found in: The Banach-Tarski Paradox by Wagon, Cambridge University Press, 1985. It is known that all such examples must involve the axiom of choice.
Definition 7.5.2 (µbS)(A) ≡ µ(S ∩ A) for all A ⊆ Ω. Thus µbS is the name
of a new outer measure, called µ restricted to S.
The next lemma indicates that the property of measurability is not lost by consid-
ering this restricted measure.
Theorem 7.5.4 Let Ω be a set and let µ be an outer measure on P (Ω). The
collection of µ measurable sets, S, forms a σ algebra and
If Fi ∈ S and Fi ∩ Fj = ∅ for i ≠ j, then

µ (∪∞_{i=1} Fi ) = Σ∞_{i=1} µ (Fi ). (7.9)
This measure space is also complete which means that if µ (F ) = 0 for some F ∈ S then
if G ⊆ F, it follows G ∈ S also.
Proof: First note that ∅ and Ω are obviously in S. Now suppose A, B ∈ S. I will
show A \ B ≡ A ∩ B C is in S. To do so, consider the following picture.
7.5. MEASURES AND OUTER MEASURES 165
[Venn diagram omitted: S partitioned into the four regions S ∩ A ∩ B, S ∩ A ∩ B^C , S ∩ A^C ∩ B, and S ∩ A^C ∩ B^C ]
Since µ is subadditive,
µ (S) ≤ µ (S ∩ A ∩ B^C ) + µ (A ∩ B ∩ S) + µ (S ∩ B ∩ A^C ) + µ (S ∩ A^C ∩ B^C ) .
Now using A, B ∈ S,
µ (S) ≤ µ (S ∩ A ∩ B^C ) + µ (S ∩ A ∩ B) + µ (S ∩ B ∩ A^C ) + µ (S ∩ A^C ∩ B^C )
= µ (S ∩ A) + µ (S ∩ A^C ) = µ (S)
It follows equality holds in the above. Now observe, using the picture if you like, that
(A ∩ B ∩ S) ∪ (S ∩ B ∩ A^C ) ∪ (S ∩ A^C ∩ B^C ) = S \ (A \ B)

and therefore,

µ (S) = µ (S ∩ A ∩ B^C ) + µ (A ∩ B ∩ S) + µ (S ∩ B ∩ A^C ) + µ (S ∩ A^C ∩ B^C )
≥ µ (S ∩ (A \ B)) + µ (S \ (A \ B)) .
By induction, if Ai ∩ Aj = ∅ and Ai ∈ S,
µ (∪n_{i=1} Ai ) = Σn_{i=1} µ (Ai ). (7.12)
Now let A = ∪∞_{i=1} Ai where Ai ∩ Aj = ∅ for i ≠ j. Then

Σ∞_{i=1} µ (Ai ) ≥ µ (A) ≥ µ (∪n_{i=1} Ai ) = Σn_{i=1} µ (Ai ).
Since this holds for all n, you can take the limit as n → ∞ and conclude,
Σ∞_{i=1} µ (Ai ) = µ (A) .
Therefore, letting F ≡ ∪∞_{k=1} Fk ,
In order to establish 7.11, let the Fn be as given there. Then, since (F1 \ Fn ) increases to (F1 \ F ), 7.10 implies

µ (F1 ) − µ (F ) = µ (F1 \ F ) = lim_{n→∞} µ (F1 \ Fn ) = µ (F1 ) − lim_{n→∞} µ (Fn )

which implies
lim µ (Fn ) ≤ µ (F ) .
n→∞
But since F ⊆ Fn ,
µ (F ) ≤ lim µ (Fn )
n→∞
and this establishes 7.11. Note that it was assumed µ (F1 ) < ∞ because µ (F1 ) was
subtracted from both sides.
It remains to show S is closed under countable unions. Recall that if A ∈ S, then
A^C ∈ S and S is closed under finite unions. Let Ai ∈ S, A = ∪∞_{i=1} Ai , Bn = ∪n_{i=1} Ai . Then

Bn ↑ A, Bn^C ↓ A^C .

By Lemma 7.5.3, Bn is (µbS) measurable and so is Bn^C . I want to show µ(S) ≥ µ(S \ A) + µ(S ∩ A). If µ(S) = ∞, there is nothing to prove. Assume µ(S) < ∞. Then apply Parts 7.11 and 7.10 to the outer measure µbS in 7.13 and let n → ∞.
µ (S) ≥ µ (S ∩ G) + µ (S \ G)
However,
µ (S ∩ G) + µ (S \ G) ≤ µ (S ∩ F ) + µ (S \ F ) + µ (F \ G)
= µ (S ∩ F ) + µ (S \ F ) = µ (S)
Theorem 7.5.6 Let (Ω, F, µ) be a σ finite measure space. Then there exists a unique measure space, (Ω, F̄, µ̄) , satisfying

1. (Ω, F̄, µ̄) is a complete measure space.

2. µ̄ = µ on F

3. F̄ ⊇ F

4. For every E ∈ F̄ there exists G ∈ F such that G ⊇ E and µ (G) = µ̄ (E).

5. For every E ∈ F̄ there exist F, G ∈ F such that F ⊆ E ⊆ G and

µ (G \ F ) = µ̄ (G \ F ) = 0 (7.14)
Proof: First consider the claim about uniqueness. Suppose (Ω, F1 , ν 1 ) and (Ω, F2 , ν 2 ) both satisfy 1.) - 4.) and let E ∈ F1 . Also let µ (Ωn ) < ∞, Ωn ⊆ Ωn+1 , and ∪∞_{n=1} Ωn = Ω. Define En ≡ E ∩ Ωn . Then there exists Gn ⊇ En such that µ (Gn ) = ν 1 (En ) , Gn ∈ F and Gn ⊆ Ωn . I claim there exists Fn ∈ F such that Gn ⊇ En ⊇ Fn and µ (Gn \ Fn ) = 0. To see this, look at the following diagram.
[diagram omitted: the sets Hn , En , Fn and Gn \ En ]
Then
µ (S) = µ (∪i Si ) ≤ µ (∪i Ei ) ≤ Σi µ (Ei ) ≤ Σi (µ (Si ) + ε/2^i ) = Σi µ (Si ) + ε.
If F ⊇ E and F ∈ F , then µ (F ) ≥ µ̄ (E) and so µ̄ (E) is a lower bound for all such µ (F ) , which shows that

µ̄ (E) = inf {µ (F ) : F ⊇ E, F ∈ F} .

This verifies 2.
Next consider 3. Let E ∈ F and let S be a set. I must show
µ (S) ≥ µ (S \ E) + µ (S ∩ E) .
If µ (S) = ∞ there is nothing to show. Therefore, suppose µ (S) < ∞. Then from the
definition of µ there exists G ⊇ S such that G ∈ F and µ (G) = µ (S) . Then from the
definition of µ,
µ (S) ≤ µ (S \ E) + µ (S ∩ E)
≤ µ (G \ E) + µ (G ∩ E)
= µ (G) = µ (S)
This verifies 3.
Claim 4 comes by the definition of µ as used above. The other case is when µ (S) =
∞. However, in this case, you can let G = Ω.
It only remains to verify 5. Let the Ωn be as described above and let E ∈ F̄ be such that E ⊆ Ωn . By 4 there exists H ∈ F such that H ⊆ Ωn , H ⊇ Ωn \ E, and

µ (H) = µ̄ (Ωn \ E) .

Then let F ≡ Ωn \ H. Thus F ∈ F , F ⊆ E, and µ̄ (E \ F ) = 0. It follows

µ̄ (E) = µ̄ (F ) = µ (F ) .
In the case where E ∈ F is arbitrary, not necessarily contained in some Ωn , it follows
from what was just shown that there exists Fn ∈ F such that Fn ⊆ E ∩ Ωn and
µ (Fn ) = µ (E ∩ Ωn ) .
Letting F ≡ ∪n Fn ,

µ (E \ F ) ≤ µ (∪n ((E ∩ Ωn ) \ Fn )) ≤ Σn µ ((E ∩ Ωn ) \ Fn ) = 0.
Proof: Let F ∈ F̄ with µ̄ (F ) < ∞. By Theorem 7.5.6 there exists G ∈ B (X) such that

µ̄ (G) = µ (G) = µ̄ (F ) .

Now by regularity of µ there exists an open set V ⊇ G ⊇ F such that

µ̄ (F ) + ε = µ (G) + ε > µ (V ) = µ̄ (V ) .
Thus µ is also inner regular. The last assertion follows from the uniqueness part of Theorem 7.5.6. This proves the theorem. ■
A repeat of the above argument yields the following corollary.
Corollary 7.5.8 The conclusion of the above theorem holds for X replaced with Y
where Y is a closed subset of X.
Theorem 7.6.1 Let S denote the σ algebra of Theorem 7.5.4 applied to the outer
measure µ in Theorem 7.2.1 on which µ is a measure. Then every open interval is in
S. So are all open and closed sets. Furthermore, if E is any set in S,

µ (E) = sup {µ (K) : K ⊆ E, K compact} = inf {µ (V ) : V ⊇ E, V open} .
Proof: The first task is to show (a, b) ∈ S. I need to show that for every S ⊆ R,
µ (S) ≥ µ (S ∩ (a, b)) + µ (S ∩ (a, b)^C ) (7.18)
Suppose first S is an open interval, (c, d) . If (c, d) has empty intersection with (a, b) or
is contained in (a, b) there is nothing to prove. The above expression reduces to nothing
more than µ (S) = µ (S). Suppose next that (c, d) ⊇ (a, b) . In this case, the right side
of the above reduces to
The only other cases are c ≤ a < d ≤ b or a ≤ c < d ≤ b. Consider the first of these
cases. Then the right side of 7.18 for S = (c, d) is
The last case is entirely similar. Thus 7.18 holds whenever S is an open interval. Now
it is clear 7.18 also holds if µ (S) = ∞. Suppose then that µ (S) < ∞ and let
S ⊆ ∪∞_{k=1} (ak , bk )

such that

µ (S) + ε > Σ∞_{k=1} (F (bk −) − F (ak +)) = Σ∞_{k=1} µ ((ak , bk )) .
Then since µ is an outer measure, and using what was just shown,
µ (S ∩ (a, b)) + µ (S ∩ (a, b)^C )
≤ µ (∪∞_{k=1} (ak , bk ) ∩ (a, b)) + µ (∪∞_{k=1} (ak , bk ) ∩ (a, b)^C )
≤ Σ∞_{k=1} ( µ ((ak , bk ) ∩ (a, b)) + µ ((ak , bk ) ∩ (a, b)^C ) )
≤ Σ∞_{k=1} µ ((ak , bk )) ≤ µ (S) + ε.
Since ε is arbitrary, this shows 7.18 holds for any S and so any open interval is in S.
It follows any open set is in S. This follows from Theorem 5.3.10 which implies that
if U is open, it is the countable union of disjoint open intervals. Since each of these
open intervals is in S and S is a σ algebra, their union is also in S. It follows every
closed set is in S also. This is because S is a σ algebra and if a set is in S then so is its
complement. The closed sets are those which are complements of open sets.
Thus the σ algebra of µ measurable sets, F , includes B (R). Consider the completion of the measure space (R, B (R) , µ), denoted by (R, B̄ (R), µ̄). By the uniqueness assertion in Theorem 7.5.6 and the fact that (R, F, µ) is complete, this coincides with (R, F, µ) because the construction of µ implies µ is outer regular and for every F ∈ F , there exists G ∈ B (R) containing F such that µ (F ) = µ (G) . In fact, you can take G to equal a countable intersection of open sets.
countable intersection of open sets. By Theorem 7.4.6 µ is regular on every set of B (R) ,
this because µ is finite on compact sets. Therefore, by Theorem 7.5.7 µ = µ is regular
on F which verifies the last two claims. This proves the theorem. ¥
Proof: First note that the first and the third are equivalent. To see this, observe

f^{−1} ([d, ∞]) = ∩∞_{n=1} f^{−1} ((d − 1/n, ∞]),

and

f^{−1} ((d, ∞]) = ∪∞_{n=1} f^{−1} ([d + 1/n, ∞]),
so the first and fourth conditions are equivalent. Thus the first four conditions are
equivalent and if any of them hold, then for −∞ < a < b < ∞,
and so the third condition holds. Therefore, all five conditions are equivalent. This proves the lemma. ■
This lemma allows for the following definition of a measurable function having values
in (−∞, ∞].
7.7. MEASURABLE FUNCTIONS 173
Definition 7.7.2 Let (Ω, F, µ) be a measure space and let f : Ω → (−∞, ∞].
Then f is said to be F measurable if any of the equivalent conditions of Lemma 7.7.1
hold.
Proof: The idea is to show f^{−1} ((a, b)) ∈ F. Let Vm ≡ (a + 1/m, b − 1/m) and V̄m = [a + 1/m, b − 1/m]. Then for all m, V̄m ⊆ (a, b) and

(a, b) = ∪∞_{m=1} Vm = ∪∞_{m=1} V̄m .
Note that Vm ≠ ∅ for all m large enough. Since f is the pointwise limit of fn ,

f^{−1} (Vm ) ⊆ {ω : fk (ω) ∈ Vm for all k large enough} ⊆ f^{−1} (V̄m ).

You should note that the expression in the middle is of the form

∪∞_{n=1} ∩∞_{k=n} fk^{−1} (Vm ).
Therefore,

f^{−1} ((a, b)) = ∪∞_{m=1} f^{−1} (Vm ) ⊆ ∪∞_{m=1} ∪∞_{n=1} ∩∞_{k=n} fk^{−1} (Vm )
⊆ ∪∞_{m=1} f^{−1} (V̄m ) = f^{−1} ((a, b)).
It follows f −1 ((a, b)) ∈ F because it equals the expression in the middle which is mea-
surable. This shows f is measurable.
Proposition 7.7.4 Let (Ω, F, µ) be a measure space and let f : Ω → (−∞, ∞].
Then f is F measurable if and only if f −1 (U ) ∈ F whenever U is an open set in R.
f^{−1} (U ) = ∪∞_{k=1} f^{−1} ((ak , bk )) ∈ F
because F is a σ algebra.
From this proposition, it follows one can generalize the definition of a measurable
function to those which have values in any normed vector space as follows.
Now here is an important theorem which shows that you can do lots of things to
measurable functions and still have a measurable function.
Theorem 7.7.6 Let (Ω, F, µ) be a measure space and let X, Y be normed vector
spaces and g : X → Y continuous. Then if f : Ω → X is F measurable, it follows g ◦ f
is also F measurable.
Proof: From the definition, it suffices to show (g ◦ f )^{−1} (U ) ∈ F whenever U is an open set in Y. However, since g is continuous, it follows g^{−1} (U ) is open and so

(g ◦ f )^{−1} (U ) = f^{−1} (g^{−1} (U )) = f^{−1} (an open set) ∈ F.
Lemma 7.7.7 Let ||x|| ≡ max {|xi | , i = 1, 2, · · ·, n} for x ∈ Fn . Then every set U
which is open in Fn is the countable union of balls of the form B (x,r) where the open
ball is defined in terms of the above norm.
Proof: By Theorem 5.8.3 if you consider the two normed vector spaces (Fn , |·|)
and (Fn , ||·||) , the identity map is continuous in both directions. Therefore, if a set, U
is open with respect to |·| it follows it is open with respect to ||·|| and the other way
around. The other thing to notice is that there exists a countable dense subset of F.
The rationals will work if F = R and if F = C, then you use Q + iQ. Letting D be
a countable dense subset of F, Dn is a countable dense subset of Fn . It is countable
because it is a finite Cartesian product of countable sets and you can use Theorem 2.1.7
of Page 15 repeatedly. It is dense because if x ∈ Fn , then by density of D, there exists
dj ∈ D such that
|dj − xj | < ε
then d ≡ (d1 , · · ·, dn ) is such that ||d − x|| < ε.
Now consider the set of open balls,
B ≡ {B (d, r) : d ∈ Dn , r ∈ Q} .
This collection of open balls is countable by Theorem 2.1.7 of Page 15. I claim every
open set is the union of balls from B. Let U be an open set in Fn and x ∈ U . Then there
exists δ > 0 such that B (x, δ) ⊆ U. There exists d ∈ Dn ∩ B (x, δ/5) . Then pick a rational number r with δ/5 < r < 2δ/5 and consider the ball B (d, r) ∈ B. Then x ∈ B (d, r) because r > δ/5. However, it is also the case that B (d, r) ⊆ B (x, δ) because if y ∈ B (d, r) then

||y − x|| ≤ ||y − d|| + ||d − x|| < 2δ/5 + δ/5 < δ.
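The arithmetic in this argument can be checked directly. The sketch below (with a made-up point x and radius δ) picks a rational center d and rational radius r ∈ (δ/5, 2δ/5) and verifies x ∈ B (d, r) and B (d, r) ⊆ B (x, δ) in the sup norm:

```python
from fractions import Fraction as Fr

def supnorm(u, v):
    # ||u - v|| = max_i |u_i - v_i|
    return max(abs(a - b) for a, b in zip(u, v))

x, delta = (0.3337, -1.25), 1.0
# rational point within delta/5 of x, and rational radius in (delta/5, 2 delta/5)
d = tuple(Fr(round(c * 100), 100) for c in x)   # within 0.005, well inside delta/5
r = Fr(3, 10)                                    # delta/5 = 0.2 < 0.3 < 0.4 = 2 delta/5

assert supnorm(x, d) < r                         # x lies in B(d, r)
# B(d, r) subset of B(x, delta): ||y - d|| < r implies ||y - x|| < r + ||d - x|| < delta
assert float(r) + supnorm(x, d) < delta
```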
Corollary 7.7.8 Let (Ω, F, µ) be a measure space and let X be a normed vector
space with basis {v1 , · · ·, vn } . Let π k be the k th projection map onto the k th component.
Thus

πk x ≡ xk where x = Σn_{i=1} xi vi .

Then f : Ω → X is F measurable if and only if each πk ◦ f is an F valued measurable function.
Proof: The if part has already been noted. Suppose that each π k ◦ f is an F valued
measurable function. Let g : X → Fn be given by
g (x) ≡ (π 1 x, · · ·,π n x) .
Thus g is linear, one to one, and onto. By Theorem 5.8.3 both g and g−1 are continuous.
Therefore, every open set in X is of the form g−1 (U ) where U is an open set in Fn . To
see this, start with V open set in X. Since g−1 is continuous, g (V ) is open in Fn and
so V = g−1 (g (V )) . Therefore, it suffices to show that for every U an open set in Fn ,
¡ ¢ −1
f −1 g−1 (U ) = (g ◦ f ) (U ) ∈ F.
By Lemma 7.7.7 there are countably many open balls of the form B (xj , rj ) such that
U is equal to the union of these balls. Thus
(g ◦ f )^{−1} (U ) = (g ◦ f )^{−1} (∪∞_{k=1} B (xk , rk )) = ∪∞_{k=1} (g ◦ f )^{−1} (B (xk , rk )) (7.19)
Now from the definition of the norm,
B (xk , rk ) = Πn_{j=1} (xkj − rk , xkj + rk )

and so

(g ◦ f )^{−1} (B (xk , rk )) = ∩n_{j=1} (πj ◦ f )^{−1} ((xkj − rk , xkj + rk )) ∈ F.
It follows 7.19 is the countable union of sets in F and so it is also in F. This proves the corollary. ■
Note that if {fi }n_{i=1} are measurable functions defined on (Ω, F, µ) having values in F, then letting f ≡ (f1 , · · ·, fn ) , it follows f is a measurable Fn valued function. Now let Σ : Fn → F be given by Σ (x) ≡ Σn_{k=1} ak xk . Then Σ is linear and so by Theorem 5.8.3
it follows Σ is continuous. Hence by Theorem 7.7.6, Σ (f ) is an F valued measurable
function. Thus linear combinations of measurable functions are measurable. By similar
reasoning, products of measurable functions are measurable. In general, it seems like
you can start with a collection of measurable functions and do almost anything you like
with them and the result, if it is a function will be measurable. This is in stark contrast
to the functions which are generalized Riemann integrable.
The following theorem considers the case of functions which have values in a normed
vector space.
Theorem 7.7.9 Let {fn } be a sequence of measurable functions mapping Ω to
X where X is a normed vector space and (Ω, F) is a measure space. Suppose also that
f (ω) = limn→∞ fn (ω) for all ω ∈ Ω. Then f is also a measurable function.
Proof: It is required to show f −1 (U ) is measurable for all U open. Let
Vm ≡ { x ∈ U : dist (x, U^C ) > 1/m } .

Thus

V̄m ⊆ { x ∈ U : dist (x, U^C ) ≥ 1/m }

and Vm ⊆ V̄m ⊆ Vm+1 and ∪m Vm = U. Then since Vm is open, it follows that if
f (ω) ∈ Vm then for all sufficiently large k, it must be the case fk (ω) ∈ Vm also. That
is, ω ∈ fk−1 (Vm ) for all sufficiently large k. Thus
f^{−1} (Vm ) = ∪∞_{n=1} ∩∞_{k=n} fk^{−1} (Vm )
and so

f^{−1} (U ) = ∪∞_{m=1} f^{−1} (Vm )
= ∪∞_{m=1} ∪∞_{n=1} ∩∞_{k=n} fk^{−1} (Vm )
⊆ ∪∞_{m=1} f^{−1} (V̄m ) = f^{−1} (U )
which shows f^{−1} (U ) is measurable. The step from the second to the last line follows because if ω ∈ ∪∞_{n=1} ∩∞_{k=n} fk^{−1} (Vm ) , this says fk (ω) ∈ Vm for all k large enough. Therefore, the point of X to which the sequence {fk (ω)} converges must be in V̄m , which equals Vm ∪ Vm′ , the union of Vm with its limit points. This proves the theorem. ■
Now here is a simple observation involving something called simple functions. It
uses the following notation.
where each xk ∈ X and the Ak are disjoint measurable sets. (Such functions are often
referred to as simple functions.) Then f is measurable.
0 ≤ sn (ω) (7.20)
Then tn (ω) ≤ f (ω) for all ω and limn→∞ tn (ω) = f (ω) for all ω. This is because tn (ω) = n for ω ∈ I and if f (ω) ∈ [0, 2^{n+1} ), then

0 ≤ f (ω) − tn (ω) ≤ 1/n. (7.22)
Thus whenever ω ∉ I, the above inequality will hold for all n large enough.
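A common way to realize such an increasing sequence of simple functions is the dyadic construction sn (ω) = min(⌊2^n f (ω)⌋/2^n , n). This is an assumption here — the text's tn may differ in its details — but it exhibits the same behavior: sn increases in n, never exceeds f, and the error is controlled once f (ω) < n:

```python
import math

def s_n(fw, n):
    # dyadic simple-function approximation of the value fw = f(omega)
    return min(math.floor(2 ** n * fw) / 2 ** n, n)

f_vals = [0.0, 0.3, 1.75, 2.5, 7.1]
for n in range(1, 20):
    for fw in f_vals:
        sn = s_n(fw, n)
        assert sn <= s_n(fw, n + 1) + 1e-12   # increasing in n
        assert sn <= fw                        # never exceeds f
        if fw < n:
            assert fw - sn <= 2.0 ** (-n)      # dyadic error bound below level n
```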
7.8 Exercises
1. Let C be a set whose elements are σ algebras of subsets of Ω. Show ∩C is a σ
algebra also.
2. Let Ω be any set. Show P (Ω) , the set of all subsets of Ω is a σ algebra. Now let
L denote some subset of P (Ω) . Consider all σ algebras which contain L. Show
the intersection of all these σ algebras which contain L is a σ algebra containing
L and it is the smallest σ algebra containing L, denoted by σ (L). When Ω is a
normed vector space, and L consists of the open sets, σ (L) is called the σ algebra
of Borel sets.
3. Consider Ω = [0, 1] and let S denote all subsets of [0, 1] , F such that either F C
or F is countable. Note the empty set must be countable. Show S is a σ algebra.
(This is a sick σ algebra.) Now let µ : S → [0, ∞] be defined by µ (F ) = 1 if F C
is countable and µ (F ) = 0 if F is countable. Show µ is a measure on S.
4. Let Ω = N, the positive integers and let a σ algebra be given by F = P (N), the
set of all subsets of N. What are the measurable functions having values in C?
Let µ (E) be the number of elements of E where E is a subset of N. Show µ is a
measure.
5. Let F be a σ algebra of subsets of Ω and suppose F has infinitely many elements.
Show that F is uncountable. Hint: You might try to show there exists a countable sequence of disjoint sets of F, {Ai }. It might be easiest to verify this by contradiction rather than by a direct construction; I have seen this done several ways. Once this has been done, you can define a map, θ,
from P (N) into F which is one to one by θ (S) = ∪i∈S Ai . Then argue P (N) is
uncountable and so F is also uncountable.
6. A probability space is a measure space, (Ω, F, P ) where the measure, P has the
property that P (Ω) = 1. Such a measure is called a probability measure. Random
vectors are measurable functions, X, mapping a probability space, (Ω, F, P ) to
Rn . Thus X (ω) ∈ Rn for each ω ∈ Ω and P is a probability measure defined on
the sets of F, a σ algebra of subsets of Ω. For E a Borel set in Rn , define
¡ ¢
µ (E) ≡ P X−1 (E) ≡ probability that X ∈ E.
Show this is a well defined probability measure on the Borel sets of Rn . Thus
µ (E) = P (X (ω) ∈ E) . It is called the distribution. Explain why µ must be
regular.
7. Suppose (Ω, S, µ) is a measure space which may not be complete. Show that another way to complete the measure space is to define S̄ to consist of all sets of the form E for which there exists F ∈ S such that (F \ E) ∪ (E \ F ) ⊆ N for some N ∈ S which has measure zero, and then let µ̄ (E) = µ (F ). Explain.
The Abstract Lebesgue Integral
The general Lebesgue integral requires a measure space, (Ω, F, µ) and, to begin with,
a nonnegative measurable function. I will use Lemma 2.3.3 about interchanging two
supremums frequently. Also, I will use the observation that if {an } is an increasing
sequence of points of [0, ∞] , then supn an = limn→∞ an which is obvious from the
definition of sup.
where the integral on the right is the usual Riemann integral because eventually M > f .
For f a nonnegative decreasing function defined on [0, ∞),
∫_0^∞ f dλ ≡ lim_{R→∞} ∫_{R^{−1}}^R f dλ = sup_{R>1} ∫_{R^{−1}}^R f dλ
Since decreasing bounded functions are Riemann integrable, the above definition is
well defined. Now here are some obvious properties.
Note both sides could equal +∞. This proves the lemma. ■
which makes sense because λ → µ ([f > λ]) is nonnegative and decreasing.
Lemma 8.2.1 If f (λ) = 0 for all λ > a, where f is a decreasing nonnegative function, then

∫_0^∞ f (λ) dλ = ∫_0^a f (λ) dλ.
where the ci are each nonnegative real numbers, the distinct values of s.
Lemma 8.2.2 Let s (ω) = Σp_{i=1} ai XEi (ω) be a nonnegative simple function with the ai the distinct non zero values of s. Then

∫ s dµ = Σp_{i=1} ai µ (Ei ). (8.1)
Proof: Without loss of generality, assume 0 ≡ a0 < a1 < a2 < · · · < ap and that
µ (Ei ) < ∞. Here is why. If µ (Ei ) = ∞, then letting a ∈ (ai−1 , ai ) by Lemma 8.2.1,
the left side would be
∫_0^{ap} µ ([s > λ]) dλ ≥ ∫_0^{ap} µ ([ai XEi > λ]) dλ
≥ sup_M ∫_0^a (µ ([ai XEi > λ]) ∧ M ) dλ = sup_M aM = ∞
and so both sides are equal to ∞. Thus it can be assumed for each i, µ (Ei ) < ∞. Then
letting a0 ≡ 0, it follows from Lemma 8.2.1 and Lemma 8.1.2,
∫_0^∞ µ ([s > λ]) dλ = ∫_0^{ap} µ ([s > λ]) dλ = Σp_{k=1} ∫_{a_{k−1}}^{a_k} µ ([s > λ]) dλ
= Σp_{k=1} ∫_{a_{k−1}}^{a_k} Σp_{j=k} µ (Ej ) dλ = Σp_{k=1} (ak − a_{k−1} ) Σp_{j=k} µ (Ej )
= Σp_{k=1} ak Σp_{j=k} µ (Ej ) − Σp_{k=1} a_{k−1} Σp_{j=k} µ (Ej )
= Σp_{k=1} ak Σp_{j=k} µ (Ej ) − Σ^{p−1}_{k=0} ak Σp_{j=k+1} µ (Ej )
= ap µ (Ep ) − 0 + Σ^{p−1}_{k=1} ak ( Σp_{j=k} µ (Ej ) − Σp_{j=k+1} µ (Ej ) )
= ap µ (Ep ) + Σ^{p−1}_{k=1} ak µ (Ek ) = Σp_{k=1} ak µ (Ek ).
where the ci are not necessarily distinct but the Ei are disjoint. It follows that
∫ s dµ = Σn_{i=1} ci µ (Ei ).
Proof: Let the values of s be {a1 , · · · , am }. Therefore, since the Ei are disjoint, each ai equals one of the cj . Let Ai ≡ ∪ {Ej : cj = ai }. Then from Lemma 8.2.2 it follows that

∫ s dµ = Σm_{i=1} ai µ (Ai ) = Σm_{i=1} ai Σ_{ {j : cj = ai} } µ (Ej )
= Σm_{i=1} Σ_{ {j : cj = ai} } cj µ (Ej ) = Σn_{i=1} ci µ (Ei ).
Proof: Let
s(ω) = Σn_{i=1} αi XAi (ω), t(ω) = Σm_{j=1} β j XBj (ω)
where αi are the distinct values of s and the β j are the distinct values of t. Clearly as+bt
is a nonnegative simple function because it has finitely many values on measurable sets
In fact,

(as + bt)(ω) = Σm_{j=1} Σn_{i=1} (aαi + bβ j ) XAi ∩Bj (ω)
because the sum on the right is a lower sum for the Riemann integral on the left. Now
from the definition,
∫_{R^{−1}}^R µ ([f > λ]) dλ ≡ sup_M ∫_{R^{−1}}^R (µ ([f > λ]) ∧ M ) dλ
= sup_M sup_{k∈N} Σk_{i=1} hk (µ ([f > ihk ]) ∧ M )
= sup_{k∈N} lim_{M→∞} Σk_{i=1} hk (µ ([f > ihk ]) ∧ M ) = sup_{k∈N} Σk_{i=1} hk µ ([f > ihk ])
Note this last equality holds even if µ ([f > ihk ]) = ∞. This proves the lemma. ■
= sup_{k∈N} lim_{n→∞} Σk_{i=1} µ ([fn > ihk ]) hk = sup_{k∈N} sup_n Σk_{i=1} µ ([fn > ihk ]) hk
= sup_n sup_{k∈N} Σk_{i=1} µ ([fn > ihk ]) hk = sup_n ∫_{R^{−1}}^R µ ([fn > λ]) dλ,

the sequence {µ ([fn > λ])} being increasing in n. Then from the above,

∫ f dµ ≡ ∫_0^∞ µ ([f > λ]) dλ ≡ sup_R ∫_{R^{−1}}^R µ ([f > λ]) dλ
= sup_R sup_n ∫_{R^{−1}}^R µ ([fn > λ]) dλ = sup_n sup_R ∫_{R^{−1}}^R µ ([fn > λ]) dλ
≡ sup_n ∫_0^∞ µ ([fn > λ]) dλ = lim_{n→∞} ∫ fn dµ
Similarly this also shows that for such a nonnegative measurable function,

∫ f dµ = sup { ∫ s dµ : 0 ≤ s ≤ f, s simple } .
In other words,

∫ lim inf_{n→∞} fn dµ ≤ lim inf_{n→∞} ∫ fn dµ
Proof: By Theorem 7.7.12 on Page 176 there exist increasing sequences of nonneg-
ative simple functions, sn → f and tn → g. Then af + bg, being the pointwise limit
of the simple functions asn + btn , is measurable. Now by the monotone convergence
theorem and Lemma 8.2.4,
∫ (af + bg) dµ = lim_{n→∞} ∫ (asn + btn ) dµ = lim_{n→∞} ( a ∫ sn dµ + b ∫ tn dµ ) = a ∫ f dµ + b ∫ g dµ.
where ck ∈ C and µ (Ek ) < ∞. For s a complex simple function as above, define
I (s) ≡ Σn_{k=1} ck µ (Ek ).
Lemma 8.7.3 The definition, 8.7.2 is well defined. Furthermore, I is linear on the
vector space of complex simple functions. Also the triangle inequality holds,
|I (s)| ≤ I (|s|) .
Proof: Suppose Σn_{k=1} ck XEk (ω) = 0. Does it follow that Σk ck µ (Ek ) = 0? The supposition implies

Σn_{k=1} Re ck XEk (ω) = 0, Σn_{k=1} Im ck XEk (ω) = 0. (8.4)

Choose λ large and positive so that λ + Re ck ≥ 0 for each k. Then adding Σk λXEk to both sides of the first equation above,

Σn_{k=1} (λ + Re ck ) XEk (ω) = Σn_{k=1} λXEk (ω)
8.7. THE LEBESGUE INTEGRAL, L1 187
and by Lemma 8.2.4 on Page 182, it follows, upon taking ∫ of both sides, that

Σn_{k=1} (λ + Re ck ) µ (Ek ) = Σn_{k=1} λµ (Ek )

which implies Σn_{k=1} Re ck µ (Ek ) = 0. Similarly,

Σn_{k=1} Im ck µ (Ek ) = 0

and so Σn_{k=1} ck µ (Ek ) = 0. Thus if
Σj cj XEj = Σk dk XFk

then Σj cj XEj + Σk (−dk ) XFk = 0 and so the result just established verifies

Σj cj µ (Ej ) − Σk dk µ (Fk ) = 0
Then pick θ ∈ C such that θI (s) = |I (s)| and |θ| = 1. Then from the triangle inequality
for sums of complex numbers,
|I (s)| = θI (s) = I (θs) = Σj θcj µ (Ej ) = | Σj θcj µ (Ej ) | ≤ Σj |θcj | µ (Ej ) = I (|s|).
Then
I (f ) ≡ lim_{n→∞} I (sn ). (8.6)
and for m, n large enough this last is given to be small so {I (sn )} is a Cauchy sequence
in C and so it converges. This verifies the limit in 8.6 at least exists. It remains to
consider another sequence {tn } having the same properties as {sn } and verifying I (f )
determined by this other sequence is the same. By Lemma 8.7.3 and Fatou’s lemma,
Theorem 8.5.1 on Page 185,
|I (sn ) − I (tn )| ≤ I (|sn − tn |) = ∫ |sn − tn | dµ ≤ ∫ (|sn − f | + |f − tn |) dµ
≤ lim inf_{k→∞} ∫ |sn − sk | dµ + lim inf_{k→∞} ∫ |tn − tk | dµ < ε
whenever n is large enough. Since ε is arbitrary, this shows the limit from using the tn
is the same as the limit from using sn . This proves the lemma. ■
Consider the following picture. I have just given a definition of an integral for functions having values in C. However, [0, ∞) ⊆ C. What if f has values in [0, ∞)? Earlier, ∫ f dµ was defined for such functions and now I (f ) has been defined. Are they the same? If so, I can be regarded as an extension of ∫ dµ to a larger class of functions.
Lemma 8.7.6 Suppose f has values in [0, ∞) and f ∈ L1 (Ω) . Then f is measurable
and Z
I (f ) = f dµ.
and so f is measurable. Also it is always the case that if a, b are real numbers,

|a^+ − b^+ | ≤ |a − b|

and so

∫ |(Re sn )^+ − (Re sm )^+ | dµ ≤ ∫ |Re sn − Re sm | dµ ≤ ∫ |sn − sm | dµ
8.8. APPROXIMATION WITH SIMPLE FUNCTIONS 189
where x^+ ≡ (1/2) (|x| + x) , the positive part of the real number x.¹ Thus there is no loss of generality in assuming {sn } is a sequence of complex simple functions having values in [0, ∞). Then since for such complex simple functions, I (s) = ∫ s dµ,

| I (f ) − ∫ f dµ | ≤ |I (f ) − I (sn )| + | ∫ sn dµ − ∫ f dµ | < ε + ∫ |sn − f | dµ
whenever n is large enough. But by Fatou’s lemma, Theorem 8.5.1 on Page 185, the
last term is no larger than
lim inf_{k→∞} ∫ |sn − sk | dµ < ε

whenever n is large enough. Since ε is arbitrary, this shows I (f ) = ∫ f dµ as claimed. ■
As explained above, I can be regarded as an extension of ∫ dµ, so from now on, the usual symbol ∫ dµ will be used. It is now easy to verify that ∫ dµ is linear on L1 (Ω).
∫ f dµ = ∫ (Re f )^+ dµ − ∫ (Re f )^− dµ + i ( ∫ (Im f )^+ dµ − ∫ (Im f )^− dµ ) ,
Also for every f ∈ L1 (Ω) , for every ε > 0 there exists a simple function s such that
∫ |f − s| dµ < ε.
Proof: First it is necessary to verify that L1 (Ω) is really a vector space because
it makes no sense to speak of linear maps without having these maps defined on a
vector space. Let f, g be in L1 (Ω) and let a, b ∈ C. Then let {sn } and {tn } be
sequences of complex simple functions associated with f and g respectively as described
in Definition 8.7.4. Consider {asn + btn } , another sequence of complex simple functions.
Then asn (ω) + btn (ω) → af (ω) + bg (ω) for each ω. Also, from Lemma 8.7.3
∫ |asn + btn − (asm + btm )| dµ ≤ |a| ∫ |sn − sm | dµ + |b| ∫ |tn − tm | dµ

¹ The negative part of the real number x is defined to be x^− ≡ (1/2) (|x| − x) . Thus |x| = x^+ + x^− and x = x^+ − x^− .
and the sum of the two terms on the right converge to zero as m, n → ∞. Thus af +bg ∈
L1 (Ω) . Also
∫ (af + bg) dµ ≡ lim_{n→∞} ∫ (asn + btn ) dµ = lim_{n→∞} ( a ∫ sn dµ + b ∫ tn dµ )
= a lim_{n→∞} ∫ sn dµ + b lim_{n→∞} ∫ tn dµ = a ∫ f dµ + b ∫ g dµ.
(Re f )^+ = (1/2) (|Re f | + Re f )

and

(Re f )^− = (1/2) (|Re f | − Re f )

so both of these functions are in L1 (Ω). Similar formulas establish that (Im f )^+ and (Im f )^− are in L1 (Ω).
The formula follows from the observation that

f = (Re f )^+ − (Re f )^− + i ( (Im f )^+ − (Im f )^− )

and the fact shown first that f → ∫ f dµ is linear.
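The positive and negative part identities used here are elementary and easy to sanity-check in code (the sample values are arbitrary):

```python
def pos(x):
    """x^+ = (|x| + x) / 2"""
    return 0.5 * (abs(x) + x)

def neg(x):
    """x^- = (|x| - x) / 2"""
    return 0.5 * (abs(x) - x)

for x in [-3.5, -1.0, 0.0, 2.25, 7.0]:
    assert pos(x) + neg(x) == abs(x)   # |x| = x^+ + x^-
    assert pos(x) - neg(x) == x        # x = x^+ - x^-

# the inequality |a^+ - b^+| <= |a - b| used earlier in this section
for a, b in [(-2.0, 3.0), (1.5, 1.25), (-4.0, -1.0)]:
    assert abs(pos(a) - pos(b)) <= abs(a - b)
```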
To verify the triangle inequality, let {sn } be complex simple functions for f as in
Definition 8.7.4. Then
| ∫ f dµ | = lim_{n→∞} | ∫ sn dµ | ≤ lim_{n→∞} ∫ |sn | dµ = ∫ |f | dµ.
Now the last assertion follows from the definition. There exists a sequence of simple
functions {sn } converging pointwise to f such that for all m, n large enough,
\[ \frac{\varepsilon}{2} > \int |s_n - s_m| \, d\mu. \]
Fix m and let n → ∞. By Fatou’s lemma
\[ \varepsilon > \frac{\varepsilon}{2} \ge \liminf_{n\to\infty} \int |s_n - s_m| \, d\mu \ge \int |f - s_m| \, d\mu. \]
Proof: Suppose $f \in L^1(\Omega)$. Then from Definition 8.7.4, it follows that both the real and imaginary parts of $f$ are measurable. Just take the real and imaginary parts of $s_n$ and observe that the real and imaginary parts of $f$ are limits of the real and imaginary parts of $s_n$ respectively. By Theorem 8.8.1 this shows the only if part.
The more interesting part is the if part. Suppose then that $f$ is measurable and $\int |f| \, d\mu < \infty$. Suppose first that $f$ has values in $[0, \infty)$. It is necessary to obtain the
sequence of complex simple functions. By Theorem 7.7.12, there exists an increasing
sequence of nonnegative simple functions, {sn } such that sn (ω) ↑ f (ω). Then by the
monotone convergence theorem,
\[ \lim_{n\to\infty} \int (2f - (f - s_n)) \, d\mu = \int 2f \, d\mu \]
and so
\[ \lim_{n\to\infty} \int (f - s_n) \, d\mu = 0. \]
Letting $m$ be large enough, it follows $\int (f - s_m) \, d\mu < \varepsilon$ and so if $n > m$,
\[ \int |s_m - s_n| \, d\mu \le \int |f - s_m| \, d\mu < \varepsilon. \]
Lemma 8.8.3 Let $\{a_n\}$ be a sequence in $[-\infty, \infty]$. Then $\lim_{n\to\infty} a_n$ exists if and only if
\[ \liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n \]
and in this case, the limit equals the common value of these two numbers.
Since $\varepsilon$ is arbitrary, the two must be equal and they both must equal $a$. Next suppose $\lim_{n\to\infty} a_n = \infty$. Then if $l \in \mathbb{R}$, there exists $N$ such that for $n \ge N$, $l \le a_n$ and therefore $\inf_{n > N} a_n \ge l$. Since $l$ is arbitrary, it follows $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = \infty$. The case for $-\infty$ is similar. This proves the lemma. ¥
and there exists a measurable function $g$, with values in $[0, \infty]$,² such that
\[ |f_n(\omega)| \le g(\omega) \quad \text{and} \quad \int g(\omega) \, d\mu < \infty. \]
² Note that, since $g$ is allowed to have the value $\infty$, it is not known that $g \in L^1(\Omega)$.
8.9. THE DOMINATED CONVERGENCE THEOREM 193
Subtracting $\int 2g \, d\mu$,
\[ 0 \le -\limsup_{n\to\infty} \int |f - f_n| \, d\mu. \]
Hence
\begin{align*}
0 &\ge \limsup_{n\to\infty} \left( \int |f - f_n| \, d\mu \right) \\
&\ge \liminf_{n\to\infty} \left( \int |f - f_n| \, d\mu \right) \ge \left| \int f \, d\mu - \int f_n \, d\mu \right| \ge 0.
\end{align*}
This proves the theorem by Lemma 8.8.3 because the lim sup and lim inf are equal. ¥
\begin{align*}
\liminf_{n\to\infty} \int (g_n + g) \, d\mu &- \limsup_{n\to\infty} \int |f - f_n| \, d\mu \\
&= \liminf_{n\to\infty} \int \left( (g_n + g) - |f - f_n| \right) d\mu \ge \int 2g \, d\mu
\end{align*}
and so $-\limsup_{n\to\infty} \int |f - f_n| \, d\mu \ge 0$. Thus
\begin{align*}
0 &\ge \limsup_{n\to\infty} \left( \int |f - f_n| \, d\mu \right) \\
&\ge \liminf_{n\to\infty} \left( \int |f - f_n| \, d\mu \right) \ge \left| \int f \, d\mu - \int f_n \, d\mu \right| \ge 0.
\end{align*}
\[ \{ E \cap A : A \in \mathcal{F} \} \]
and the measure is $\mu$ restricted to this smaller $\sigma$ algebra. Clearly, if $f \in L^1(\Omega)$, then
\[ f X_E \in L^1(E). \]
\[ W = \cup_{k=1}^{m} D(x_k, r_{x_k}) \]
and $W$ is a compact subset of $V$ because it is closed and bounded, being the finite union of closed and bounded sets. Now define
\[ f(x) \equiv \frac{\operatorname{dist}(x, W^C)}{\operatorname{dist}(x, W^C) + \operatorname{dist}(x, K)}. \]
K ≺ f ≺ V. (8.7)
It remains to prove the last assertion. By Theorem 7.4.6, µ is regular and so there
exist compact sets, {Kk } and open sets {Vk } such that Vk ⊇ Vk+1 , Kk ⊆ Kk+1 for all
k, and
\[ K_k \subseteq E \subseteq V_k, \quad \mu(V_k \setminus K_k) < 2^{-k}. \]
8.10. APPROXIMATION WITH CC (Y ) 195
From the first part of the lemma, there exists a sequence $\{f_k\}$ such that
\[ K_k \prec f_k \prec V_k. \]
Then $f_k(x)$ converges to $X_E(x)$ a.e. because if convergence fails to take place, then $x$ must be in infinitely many of the sets $V_k \setminus K_k$. Thus $x$ is in
\[ \cap_{m=1}^{\infty} \cup_{k=m}^{\infty} V_k \setminus K_k. \]
Now the functions are all bounded above by 1 and below by 0 and are equal to zero off
V1 , a set of finite measure so by the dominated convergence theorem,
\[ \lim_{k\to\infty} \int |X_E(x) - f_k(x)| \, d\mu = 0, \]
the dominating function being XE (x) + XV1 (x) . This proves the lemma. ¥
With this lemma, here is an important theorem.
Proof: By considering separately the positive and negative parts of the real and
imaginary parts of f it suffices to consider only the case where f ≥ 0. Then by Theorem
7.7.12 and the monotone convergence theorem, there exists a simple function,
\[ s(x) \equiv \sum_{m=1}^{p} c_m X_{E_m}(x), \qquad s(x) \le f(x), \]
such that
\[ \int |f(x) - s(x)| \, d\mu < \varepsilon/2. \]
By Lemma 8.10.2, there exist functions $\{h_{mk}\}_{k=1}^{\infty}$ in $C_c(Y)$ such that
\[ \lim_{k\to\infty} \int_Y |X_{E_m} - h_{mk}| \, d\mu = 0. \]
Let
\[ g_k(x) \equiv \sum_{m=1}^{p} c_m h_{mk}(x). \]
You should verify this mostly satisfies the axioms of a norm. The problem comes in
asserting f = 0 if ||f || = 0 which strictly speaking is false. However, the other axioms
of a norm do hold.
The Lebesgue integral taken with respect to this measure is called the Lebesgue Stieltjes integral. Note that any real valued continuous function is measurable with
respect to S. This is because if f is continuous, inverse images of open sets are open
and open sets are in S. Thus f is measurable because f −1 ((a, b)) ∈ S. Similarly if
f has complex values this argument applied to its real and imaginary parts yields the
conclusion that f is measurable.
For f a continuous function, how does the Lebesgue Stieltjes integral compare with
the Darboux Stieltjes integral? To answer this question, here is a technical lemma.
Thus D cannot contain an interval of length 2ε. Since ε is arbitrary, D cannot contain
any interval.
Since f is continuous, it follows from Theorem 5.4.2 on Page 94 that f is uniformly
continuous. Therefore, there exists δ > 0 such that if |x − y| ≤ 3δ, then
Now let $\{x_0, \cdots, x_{m_n}\}$ be a partition of $[a, b]$ such that $|x_i - x_{i-1}| < \delta$ for each $i$. For $k = 1, 2, \cdots, m_n - 1$, let $z_k^n \notin D$ and $|z_k^n - x_k| < \delta$. Then
\[ |z_k^n - z_{k-1}^n| \le |z_k^n - x_k| + |x_k - x_{k-1}| + |x_{k-1} - z_{k-1}^n| < 3\delta. \]
Proof: Since $F$ is an increasing function, it can have only countably many discontinuities. The reason for this is that the only kind of discontinuity it can have is a jump, where $F(x+) > F(x-)$. Since $F$ is increasing, the intervals $(F(x-), F(x+))$ for $x$ a point of discontinuity are disjoint; each must contain a rational number, and since the rational numbers are countable, there are only countably many such intervals.
Let $D$ denote this countable set of discontinuities of $F$. Then if $l, r \notin D$, $[l, r] \subseteq [a, b]$, it follows quickly from the definition of the Darboux Stieltjes integral that
\[ \int_a^b X_{[l,r)} \, dF = F(r) - F(l) = F(r-) - F(l-) = \mu([l, r)) = \int X_{[l,r)} \, d\mu. \]
Now let $\{s_n\}$ be the sequence of step functions of Lemma 8.11.2 such that these step functions converge uniformly to $f$ on $[c, d]$. Then
\[ \left| \int \left( X_{[c,d]} f - X_{[c,d]} s_n \right) d\mu \right| \le \int \left| X_{[c,d]} (f - s_n) \right| d\mu \le \frac{1}{n} \, \mu([c, d]) \]
and
\[ \left| \int_a^b \left( X_{[c,d]} f - X_{[c,d]} s_n \right) dF \right| \le \int_a^b X_{[c,d]} |f - s_n| \, dF < \frac{1}{n} \left( F(b) - F(a) \right). \]
\begin{align*}
&= \sum_{k=1}^{m_n} f(z_{k-1}^n) \, \mu\left( [z_{k-1}^n, z_k^n) \right) \\
&= \sum_{k=1}^{m_n} f(z_{k-1}^n) \left( F(z_k^n-) - F(z_{k-1}^n-) \right) \\
&= \sum_{k=1}^{m_n} f(z_{k-1}^n) \left( F(z_k^n) - F(z_{k-1}^n) \right) \\
&= \sum_{k=1}^{m_n} f(z_{k-1}^n) \int_a^b X_{[z_{k-1}^n, z_k^n)} \, dF = \int_a^b s_n \, dF.
\end{align*}
Therefore,
\begin{align*}
&\left| \int X_{[c,d]} f \, d\mu - \int_a^b X_{[c,d]} f \, dF \right| \\
&\le \left| \int X_{[c,d]} f \, d\mu - \int X_{[c,d]} s_n \, d\mu \right| \\
&\quad + \left| \int X_{[c,d]} s_n \, d\mu - \int_a^b s_n \, dF \right| + \left| \int_a^b s_n \, dF - \int_a^b X_{[c,d]} f \, dF \right| \\
&\le \frac{1}{n} \, \mu([c, d]) + \frac{1}{n} \left( F(b) - F(a) \right)
\end{align*}
and since $n$ is arbitrary, this shows
\[ \int X_{[c,d]} f \, d\mu - \int_a^b X_{[c,d]} f \, dF = 0. \]
Thus, if $F(x) = x$ so the Darboux Stieltjes integral is the usual integral from calculus,
\[ \int_a^b f(t) \, dt = \int X_{[a,b]} f \, d\mu \]
where $\mu$ is the measure which comes from $F(x) = x$ as described above. This measure is often denoted by $m$. Thus when $f$ is continuous,
\[ \int_a^b f(t) \, dt = \int X_{[a,b]} f \, dm \]
for either the Lebesgue or the Riemann integral. Furthermore, when f is continuous,
you can compute the Lebesgue integral by using the fundamental theorem of calculus
because in this case, the two integrals are equal.
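The agreement between the two viewpoints for a continuous function can be checked numerically. The sketch below (an illustration of mine, not part of the text) compares a vertical-slicing midpoint Riemann sum for $\int_0^1 t^2 \, dt$ with a horizontal "layer cake" computation $\int_0^{\max f} m(\{t : f(t) > y\}) \, dy$, which mimics how the Lebesgue integral is built from measures of level sets; both approximate $1/3$.

```python
def riemann(f, a, b, n=10000):
    # midpoint Riemann sum: slice the domain [a, b] vertically
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def layer_cake(f, a, b, n=1000):
    # Lebesgue-style "layer cake": integrate y -> m({t in [a,b] : f(t) > y})
    h = (b - a) / n
    vals = [f(a + (i + 0.5) * h) for i in range(n)]
    top = max(vals)
    dy = top / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * dy
        total += sum(h for v in vals if v > y) * dy  # measure of the level set, times dy
    return total

f = lambda t: t * t
print(riemann(f, 0, 1))     # close to 1/3
print(layer_cake(f, 0, 1))  # also close to 1/3
```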
8.12 Exercises
1. Let $\Omega = \mathbb{N} = \{1, 2, \cdots\}$. Let $\mathcal{F} = \mathcal{P}(\mathbb{N})$, the set of all subsets of $\mathbb{N}$, and let $\mu(S) =$ number of elements in $S$. Thus $\mu(\{1\}) = 1 = \mu(\{2\})$, $\mu(\{1, 2\}) = 2$, etc. Show $(\Omega, \mathcal{F}, \mu)$ is a measure space. It is called counting measure. What functions are measurable in this case? For a nonnegative function $f$ defined on $\mathbb{N}$, show
\[ \int_{\mathbb{N}} f \, d\mu = \sum_{k=1}^{\infty} f(k). \]
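For counting measure the reduction of the integral to a series is easy to watch numerically: the truncations $f \cdot X_{\{1,\dots,N\}}$ increase to $f$, so by the monotone convergence theorem the partial sums increase to the integral. A small sketch of mine:

```python
import math

def integral_counting(f, N):
    # integral over {1,...,N} with counting measure: each point has measure 1
    return sum(f(k) for k in range(1, N + 1))

f = lambda k: 1.0 / k**2
# the truncations increase to the integral (monotone convergence)
approx = [integral_counting(f, N) for N in (10, 100, 100000)]
print(approx[-1])  # close to pi^2 / 6
```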
3. If $(\Omega, \mathcal{F}, \mu)$ is a measure space and $f \ge 0$ is measurable, show that if $g(\omega) = f(\omega)$ a.e. $\omega$ and $g \ge 0$, then $\int g \, d\mu = \int f \, d\mu$. Show that if $f, g \in L^1(\Omega)$ and $g(\omega) = f(\omega)$ a.e. then $\int g \, d\mu = \int f \, d\mu$.
\[ A \equiv \cup_{i=1}^{\infty} C_i. \]
5. Let A ⊆ P (Ω) where P (Ω) denotes the set of all subsets of Ω. Let σ (A) denote
the intersection of all σ algebras which contain A, one of these being P (Ω). Show
σ (A) is also a σ algebra.
6. We say a function g mapping a normed vector space, Ω to a normed vector space
is Borel measurable if whenever U is open, g −1 (U ) is a Borel set. (The Borel
sets are those sets in the smallest σ algebra which contains the open sets.) Let
f : Ω → X and let g : X → Y where X is a normed vector space and Y equals
C, R, or (−∞, ∞] and F is a σ algebra of sets of Ω. Suppose f is measurable and
g is Borel measurable. Show g ◦ f is measurable.
7. Let (Ω, F, µ) be a measure space. Define µ : P(Ω) → [0, ∞] by
Show µ satisfies
Let S = {ω ∈ Ω such that ω ∈ Ei for infinitely many values of i}. Show µ(S) = 0
and S is measurable. This is part of the Borel Cantelli lemma. Hint: Write S
in terms of intersections and unions. Something is in S means that for every n
there exists k > n such that it is in Ek . Remember the tail of a convergent series
is small.
9. ↑ Let $\{f_n\}$, $f$ be measurable functions with values in $\mathbb{C}$. $\{f_n\}$ converges in measure if
\[ \lim_{n\to\infty} \mu\left( x \in \Omega : |f(x) - f_n(x)| \ge \varepsilon \right) = 0 \]
for all f ∈ S.
11. Let (Ω, F, µ) be a measure space and suppose f, g : Ω → (−∞, ∞] are measurable.
Prove the sets
{ω : f (ω) < g(ω)} and {ω : f (ω) = g(ω)}
are measurable. Hint: The easy way to do this is to write
Note that l (x, y) = x−y is not continuous on (−∞, ∞] so the obvious idea doesn’t
work.
12. Let {fn } be a sequence of real or complex valued measurable functions. Let
Show S is measurable. Hint: You might try to exhibit the set where fn converges
in terms of countable unions and intersections using the definition of a Cauchy
sequence.
13. Suppose $u_n(t)$ is a differentiable function for $t \in (a, b)$ and suppose that for $t \in (a, b)$,
\[ |u_n(t)|, \; |u_n'(t)| < K_n \]
where $\sum_{n=1}^{\infty} K_n < \infty$. Show
\[ \left( \sum_{n=1}^{\infty} u_n(t) \right)' = \sum_{n=1}^{\infty} u_n'(t). \]
Hint: This is an exercise in the use of the dominated convergence theorem and
the mean value theorem.
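A numerical sanity check of this exercise, with my own choice $u_n(t) = \sin(nt)/n^3$, so that $|u_n| \le 1/n^3$ and $|u_n'| \le 1/n^2 \equiv K_n$ with $\sum K_n < \infty$: a central difference of the summed series matches the term-by-term derivative.

```python
import math

N = 2000  # truncation of the series; the tails are controlled by sum 1/n^2

def S(t):
    # S(t) = sum of u_n(t) with u_n(t) = sin(n t)/n^3
    return sum(math.sin(n * t) / n**3 for n in range(1, N + 1))

def S_prime(t):
    # term-by-term derivative: sum of u_n'(t) = cos(n t)/n^2
    return sum(math.cos(n * t) / n**2 for n in range(1, N + 1))

t, h = 0.7, 1e-5
finite_diff = (S(t + h) - S(t - h)) / (2 * h)
print(finite_diff, S_prime(t))  # the two agree closely
```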
∞
14. Let $E$ be a countable subset of $\mathbb{R}$. Show $m(E) = 0$. Hint: Let the set be $\{e_i\}_{i=1}^{\infty}$ and let $e_i$ be the center of an open interval of length $\varepsilon/2^i$.
15. ↑ If S is an uncountable set of irrational numbers, is it necessary that S has
a rational number as a limit point? Hint: Consider the proof of Problem 14
when applied to the rational numbers. (This problem was shown to me by Lee
Erlebach.)
16. Suppose {fn } is a sequence of nonnegative measurable functions defined on a
measure space, (Ω, S, µ). Show that
\[ \int \sum_{k=1}^{\infty} f_k \, d\mu = \sum_{k=1}^{\infty} \int f_k \, d\mu. \]
Hint: Use the monotone convergence theorem along with the fact the integral is
linear.
17. The integral $\int_{-\infty}^{\infty} f(t) \, dt$ will denote the Lebesgue integral taken with respect to one dimensional Lebesgue measure as discussed earlier. Show that for $\alpha > 0$, $t \to e^{-\alpha t^2}$ is in $L^1(\mathbb{R})$. The gamma function is defined for $x > 0$ as
\[ \Gamma(x) \equiv \int_0^{\infty} e^{-t} t^{x-1} \, dt. \]
Show $t \to e^{-t} t^{x-1}$ is in $L^1(\mathbb{R})$ for all $x > 0$. Also show that
\[ \Gamma(x + 1) = x\Gamma(x), \qquad \Gamma(1) = 1. \]
How does $\Gamma(n)$ for $n$ an integer compare with $(n - 1)!$?
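The recursion and the factorial connection can be checked against a library gamma function and a direct midpoint quadrature of the defining integral. A numerical sketch of mine, assuming `math.gamma` implements $\Gamma$:

```python
import math

def gamma_quad(x, T=60.0, n=200000):
    # midpoint rule for the defining integral on [0, T]; the tail past T is negligible
    h = T / n
    return sum(math.exp(-t) * t**(x - 1) for t in ((i + 0.5) * h for i in range(n))) * h

# Gamma(x + 1) = x Gamma(x)
for x in (0.5, 1.3, 2.7):
    assert abs(math.gamma(x + 1) - x * math.gamma(x)) < 1e-10

# Gamma(n) = (n - 1)! for positive integers n
for n in range(1, 10):
    assert abs(math.gamma(n) - math.factorial(n - 1)) < 1e-6

print(gamma_quad(2.5), math.gamma(2.5))  # both near 1.329
```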
18. This problem outlines a treatment of Stirling’s formula which is a very useful
approximation to n! based on a section in [34]. It is an excellent application of
the monotone convergence theorem. Follow and justify the following steps using
the convergence theorems for the Lebesgue integral as needed. Here x > 0.
\[ \Gamma(x + 1) = \int_0^{\infty} e^{-t} t^x \, dt \]
where this last improper integral equals a well defined constant (why?). It is very easy, when you know something about multiple integrals of functions of more than one variable, to verify this constant is $\sqrt{\pi}$ but the necessary mathematical machinery has not yet been presented. It can also be done through much more difficult arguments in the context of functions of only one variable. See [34] for these clever arguments.
19. To show you the power of Stirling's formula, find whether the series
\[ \sum_{n=1}^{\infty} \frac{n! \, e^n}{n^n} \]
converges. The ratio test falls flat but you can try it if you like. Now explain why, if $n$ is large enough,
\[ n! \ge \left( \frac{1}{2} \int_{-\infty}^{\infty} e^{-s^2} \, ds \right) \sqrt{2} \, e^{-n} n^{n+(1/2)} \equiv c\sqrt{2} \, e^{-n} n^{n+(1/2)}. \]
Use this.
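A log-scale computation (my own sketch, using `lgamma` to avoid overflow) shows the terms $n! \, e^n / n^n$ behave like $\sqrt{2\pi n}$, so they grow without bound and the series diverges by the $n$-th term test.

```python
import math

def log_term(n):
    # log of n! e^n / n^n, computed stably via lgamma(n + 1) = log(n!)
    return math.lgamma(n + 1) + n - n * math.log(n)

# the terms grow: no chance of convergence
assert log_term(10) < log_term(1000) < log_term(10**6)

# Stirling: term / sqrt(2 pi n) -> 1
ratio = math.exp(log_term(10**6)) / math.sqrt(2 * math.pi * 10**6)
print(ratio)  # very close to 1
```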
20. The Riemann integral is only defined for bounded functions defined on a bounded interval. If either of these two criteria is not satisfied, then the integral is not the Riemann integral. Suppose $f$ is Riemann integrable on a bounded interval, $[a, b]$. Show that it must also be Lebesgue integrable with respect to one dimensional Lebesgue measure and the two integrals coincide.
21. Give a theorem in which the improper Riemann integral coincides with a suitable Lebesgue integral. (There are many such situations; just find one.)
22. Note that $\int_0^{\infty} \frac{\sin x}{x} \, dx$ is a valid improper Riemann integral defined by
\[ \lim_{R\to\infty} \int_0^R \frac{\sin x}{x} \, dx \]
but this function, $\sin x / x$, is not in $L^1([0, \infty))$. Why?
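Numerically (a sketch of mine, using a composite Simpson rule): the signed integrals $\int_0^R \sin x / x \, dx$ settle near $\pi/2$, while $\int_0^R |\sin x|/x \, dx$ keeps growing roughly like $(2/\pi)\log R$, which is why $\sin x / x \notin L^1([0, \infty))$.

```python
import math

def simpson(f, a, b, n):
    # composite Simpson rule, n even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3

g = lambda x: math.sin(x) / x if x != 0 else 1.0

signed = simpson(g, 0.0, 200.0, 100000)
absolute_200 = simpson(lambda x: abs(g(x)), 0.0, 200.0, 100000)
absolute_2000 = simpson(lambda x: abs(g(x)), 0.0, 2000.0, 400000)
print(signed)                        # near pi/2
print(absolute_200, absolute_2000)   # the absolute integrals keep growing
```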
23. Let $f$ be a nonnegative strictly decreasing function defined on $[0, \infty)$. For $0 \le y \le f(0)$, let $f^{-1}(y) = x$ where $y \in [f(x+), f(x-)]$. (Draw a picture. $f$ could have jump discontinuities.) Show that $f^{-1}$ is nonincreasing and that
\[ \int_0^{\infty} f(t) \, dt = \int_0^{f(0)} f^{-1}(y) \, dy. \]
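For $f(t) = e^{-t}$ the generalized inverse is $f^{-1}(y) = -\ln y$ on $(0, 1]$, and both integrals equal 1. A quick midpoint-rule check of mine; the midpoint grid conveniently avoids the endpoint singularity:

```python
import math

def midpoint(f, a, b, n=100000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

left = midpoint(lambda t: math.exp(-t), 0.0, 40.0)   # tail past 40 is about e^-40
right = midpoint(lambda y: -math.log(y), 0.0, 1.0)   # integrable singularity at 0
print(left, right)  # both near 1
```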
25. ↑ Consider the sequence of functions defined in the following way. Let $f_1(x) = x$ on $[0, 1]$. To get from $f_n$ to $f_{n+1}$, let $f_{n+1} = f_n$ on all intervals where $f_n$ is constant. If $f_n$ is nonconstant on $[a, b]$, let $f_{n+1}(a) = f_n(a)$, $f_{n+1}(b) = f_n(b)$, $f_{n+1}$ is piecewise linear and equal to $\frac{1}{2}(f_n(a) + f_n(b))$ on the middle third of $[a, b]$. Sketch a few of these and you will see the pattern. Show $\{f_n\}$ converges uniformly on $[0, 1]$. If $f(x) = \lim_{n\to\infty} f_n(x)$, show that $f(0) = 0$, $f(1) = 1$, $f$ is continuous, and $f'(x) = 0$ for all $x \notin P$ where $P$ is the Cantor set. This function is called the Cantor function. It is a very important example to remember. Note it has derivative equal to zero a.e. and yet it succeeds in climbing from 0 to 1. Thus
\[ \int_0^1 f'(t) \, dt = 0 \ne f(1) - f(0). \]
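The limit $f$ can also be evaluated directly from ternary digits: follow the digits of $x$, stop with an extra bit once a digit 1 appears ($x$ is then in a removed middle third, where $f$ is constant), and turn digits 2 into binary bits 1. This closed form is equivalent to the iteration in the exercise; the sketch below is mine.

```python
def cantor(x, depth=50):
    # Cantor function via the ternary digits of x in [0, 1]
    if x >= 1.0:
        return 1.0
    value, p = 0.0, 0.5
    for _ in range(depth):
        d = int(3 * x)          # next ternary digit
        if d == 1:
            return value + p    # x lies in a removed middle third: f is constant there
        value += p * (d // 2)   # digit 0 -> bit 0, digit 2 -> bit 1
        x = 3 * x - d
        p /= 2
    return value

print(cantor(0.0), cantor(1.0), cantor(1/3), cantor(0.5))  # 0.0, 1.0, ~0.5, 0.5
```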
26. Let $m(W) > 0$, $W$ is measurable, $W \subseteq [a, b]$. Show there exists a nonmeasurable subset of $W$. Hint: Let $x \sim y$ if $x - y \in \mathbb{Q}$. Observe that $\sim$ is an equivalence relation on $\mathbb{R}$. See Definition 2.1.9 on Page 16 for a review of this terminology. Let $\mathcal{C}$ be the set of equivalence classes and let $\mathcal{D} \equiv \{C \cap W : C \in \mathcal{C} \text{ and } C \cap W \ne \emptyset\}$. By the axiom of choice, there exists a set, $A$, consisting of exactly one point from each of the nonempty sets which are the elements of $\mathcal{D}$. Show
\[ W \subseteq \cup_{r\in\mathbb{Q}} (A + r) \tag{a.} \]
\[ (A + r_1) \cap (A + r_2) = \emptyset \text{ if } r_1 \ne r_2, \; r_i \in \mathbb{Q}. \tag{b.} \]
Observe that since $A \subseteq [a, b]$, then $A + r \subseteq [a - 1, b + 1]$ whenever $|r| < 1$. Use this to show that if $m(A) = 0$, or if $m(A) > 0$, a contradiction results. Show there exists some set $S$ such that $\overline{m}(S) < \overline{m}(S \cap A) + \overline{m}(S \setminus A)$ where $\overline{m}$ is the outer measure determined by $m$.
27. ↑ This problem gives a very interesting example found in the book by McShane [31]. Let $g(x) = x + f(x)$ where $f$ is the strange function of Problem 25. Let $P$ be the Cantor set of Problem 24. Let $[0, 1] \setminus P = \cup_{j=1}^{\infty} I_j$ where $I_j$ is open and $I_j \cap I_k = \emptyset$ if $j \ne k$. These intervals are the connected components of the complement of the Cantor set. Show $m(g(I_j)) = m(I_j)$ so
\[ m\left( g(\cup_{j=1}^{\infty} I_j) \right) = \sum_{j=1}^{\infty} m(g(I_j)) = \sum_{j=1}^{\infty} m(I_j) = 1. \]
Thus $m(g(P)) = 1$ because $g([0, 1]) = [0, 2]$. By Problem 26 there exists a set, $A \subseteq g(P)$ which is non measurable. Define $\phi(x) = X_A(g(x))$. Thus $\phi(x) = 0$ unless $x \in P$. Tell why $\phi$ is measurable. (Recall $m(P) = 0$ and Lebesgue measure is complete.) Now show that $X_A(y) = \phi(g^{-1}(y))$ for $y \in [0, 2]$. Tell why $g^{-1}$ is
9.1 π Systems
The approach to n dimensional Lebesgue measure will be based on a very elegant idea
due to Dynkin.
For example, if $\mathbb{R}^n = \Omega$, an example of a $\pi$ system would be the set of all open sets. Another example would be sets of the form $\prod_{k=1}^n A_k$ where $A_k$ is a Lebesgue measurable set.
The following is the fundamental lemma which shows these π systems are useful.
1. $\mathcal{K} \subseteq \mathcal{G}$

2. If $A \in \mathcal{G}$, then $A^C \in \mathcal{G}$

3. If $\{A_i\}_{i=1}^{\infty}$ is a sequence of disjoint sets from $\mathcal{G}$ then $\cup_{i=1}^{\infty} A_i \in \mathcal{G}$.

\[ \mathcal{H} \equiv \{ \mathcal{G} : \text{1 - 3 all hold} \} \]
then $\cap \mathcal{H}$ yields a collection of sets which also satisfies 1 - 3. Therefore, I will assume in the argument that $\mathcal{G}$ is the smallest collection of sets satisfying 1 - 3, the intersection of all such collections. Let $A \in \mathcal{K}$ and define
\[ \mathcal{G}_A \equiv \{ B \in \mathcal{G} : A \cap B \in \mathcal{G} \}. \]
I want to show GA satisfies 1 - 3 because then it must equal G since G is the smallest
collection of subsets of Ω which satisfies 1 - 3. This will give the conclusion that for
A ∈ K and B ∈ G, A ∩ B ∈ G. This information will then be used to show that if
A, B ∈ G then A ∩ B ∈ G. From this it will follow very easily that G is a σ algebra
which will imply it contains σ (K). Now here are the details of the argument.
208 THE LEBESGUE INTEGRAL FOR FUNCTIONS OF N VARIABLES
\[ A \cap \cup_{i=1}^{\infty} B_i = \cup_{i=1}^{\infty} (A \cap B_i) \in \mathcal{G} \]
\[ \mathcal{G}_B \equiv \{ A \in \mathcal{G} : A \cap B \in \mathcal{G} \}. \]
I just proved K ⊆ GB . The other arguments are identical to show GB satisfies 1 - 3 and
is therefore equal to G. This shows that whenever A, B ∈ G it follows A ∩ B ∈ G.
This implies G is a σ algebra. To show this, all that is left is to verify G is closed
under countable unions because then it follows G is a σ algebra. Let {Ai } ⊆ G. Then
let $A_1' = A_1$ and
because finite intersections of sets of $\mathcal{G}$ are in $\mathcal{G}$. Since the $A_i'$ are disjoint, it follows
\[ \cup_{i=1}^{\infty} A_i = \cup_{i=1}^{\infty} A_i' \in \mathcal{G}. \]
Therefore, $\mathcal{G} \supseteq \sigma(\mathcal{K})$ because it is a $\sigma$ algebra which contains $\mathcal{K}$. This proves the lemma. ¥
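On a finite $\Omega$ the lemma can be watched in action: start from a $\pi$ system, close under complements and disjoint unions (conditions 1 - 3), and the result is automatically closed under intersections, i.e. a $\sigma$ algebra. A brute-force sketch; the particular $\pi$ system is my own example.

```python
from itertools import combinations

OMEGA = frozenset(range(4))
# a pi system: pairwise intersections stay inside the collection
K = {frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2}), OMEGA}

def closure_1_3(K):
    # smallest G containing K, closed under complement and disjoint union
    G = set(K)
    changed = True
    while changed:
        changed = False
        for A in list(G):
            if OMEGA - A not in G:
                G.add(OMEGA - A)
                changed = True
        for A, B in combinations(list(G), 2):
            if not (A & B) and (A | B) not in G:  # disjoint unions only
                G.add(A | B)
                changed = True
    return G

G = closure_1_3(K)
# Dynkin's lemma predicts G = sigma(K), here the full power set of a 4 point space
print(len(G))                                      # 16
print(all((A & B) in G for A in G for B in G))     # closed under intersection: True
```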
and continue this way. The iterated integral is said to make sense if the process just described makes sense at each step. Thus, to make sense, it is required that
\[ x_{i_1} \to f(x_1, \cdots, x_n) \]
can be integrated. Either the function has values in $[0, \infty]$ and is measurable or it is a function in $L^1$. Then it is required that
\[ x_{i_2} \to \int f(x_1, \cdots, x_n) \, dx_{i_1} \]
can be integrated and so forth. The symbol in 9.1 is called an iterated integral.
With the above explanation of iterated integrals, it is now time to define n dimen-
sional Lebesgue measure.
Proposition 9.2.2 There exists a $\sigma$ algebra of sets of $\mathbb{R}^n$ which contains the open sets, $\mathcal{F}^n$, and a measure $m_n$ defined on this $\sigma$ algebra such that if $f : \mathbb{R}^n \to [0, \infty)$ is measurable with respect to $\mathcal{F}^n$ then for any permutation $(i_1, \cdots, i_n)$ of $\{1, \cdots, n\}$ it follows
\[ \int_{\mathbb{R}^n} f \, dm_n = \int \cdots \int f(x_1, \cdots, x_n) \, dx_{i_1} \cdots dx_{i_n} \tag{9.2} \]
In particular, this implies that if $A_i$ is Lebesgue measurable for each $i = 1, \cdots, n$ then
\[ m_n\left( \prod_{i=1}^{n} A_i \right) = \prod_{i=1}^{n} m(A_i). \]
make sense and are equal. Now define $\mathcal{G}$ to be those subsets of $\mathbb{R}^n$ which have property P. Thus $\mathcal{K} \subseteq \mathcal{G}$ because if $(i_1, \cdots, i_n)$ is any permutation of $\{1, 2, \cdots, n\}$ and
\[ A = \prod_{i=1}^{n} A_i \in \mathcal{K} \]
then
\[ \int \cdots \int X_{R_p \cap A} \, dx_{i_1} \cdots dx_{i_n} = \prod_{i=1}^{n} m([-p, p] \cap A_i). \]
Now suppose $F \in \mathcal{G}$ and let $(i_1, \cdots, i_n)$ and $(j_1, \cdots, j_n)$ be two permutations. Then
\[ R_p = (R_p \cap F^C) \cup (R_p \cap F) \]
and so
\[ \int \cdots \int X_{R_p \cap F^C} \, dx_{i_1} \cdots dx_{i_n} = \int \cdots \int \left( X_{R_p} - X_{R_p \cap F} \right) dx_{i_1} \cdots dx_{i_n}. \]
Since $R_p \in \mathcal{G}$ the iterated integrals on the right and hence on the left make sense. Then continuing with the expression on the right and using that $F \in \mathcal{G}$, it equals
\begin{align*}
&(2p)^n - \int \cdots \int X_{R_p \cap F} \, dx_{i_1} \cdots dx_{i_n} \\
&= (2p)^n - \int \cdots \int X_{R_p \cap F} \, dx_{j_1} \cdots dx_{j_n} \\
&= \int \cdots \int \left( X_{R_p} - X_{R_p \cap F} \right) dx_{j_1} \cdots dx_{j_n} \\
&= \int \cdots \int X_{R_p \cap F^C} \, dx_{j_1} \cdots dx_{j_n}
\end{align*}
\[ = \int \cdots \int \lim_{N\to\infty} \sum_{k=1}^{N} X_{R_p \cap F_k} \, dx_{i_1} \cdots dx_{i_n} \]
Do the iterated integrals make sense? Note that the iterated integral makes sense for $\sum_{k=1}^{N} X_{R_p \cap F_k}$ as the integrand because it is just a finite sum of functions for which the iterated integral makes sense. Therefore,
\[ x_{i_1} \to \sum_{k=1}^{\infty} X_{R_p \cap F_k}(x) \]
is also measurable. Therefore, one can do another integral to this function. Continu-
ing this way using the monotone convergence theorem, it follows the iterated integral
makes sense. The same reasoning shows the iterated integral makes sense for any other
permutation.
Now applying the monotone convergence theorem as needed,
\begin{align*}
\int \cdots \int X_{R_p \cap F} \, dx_{i_1} \cdots dx_{i_n} &= \int \cdots \int \sum_{k=1}^{\infty} X_{R_p \cap F_k} \, dx_{i_1} \cdots dx_{i_n} \\
&= \int \cdots \int \lim_{N\to\infty} \sum_{k=1}^{N} X_{R_p \cap F_k} \, dx_{i_1} \cdots dx_{i_n} \\
&= \cdots = \lim_{N\to\infty} \sum_{k=1}^{N} \int \cdots \int X_{R_p \cap F_k} \, dx_{i_1} \cdots dx_{i_n} \\
&= \lim_{N\to\infty} \sum_{k=1}^{N} \int \cdots \int X_{R_p \cap F_k} \, dx_{j_1} \cdots dx_{j_n},
\end{align*}
the limit and the finite sum being moved outside one integral at a time,
the last step holding because each $F_k \in \mathcal{G}$. Then repeating the steps above in the opposite order, this equals
\[ \int \cdots \int \sum_{k=1}^{\infty} X_{R_p \cap F_k} \, dx_{j_1} \cdots dx_{j_n} = \int \cdots \int X_{R_p \cap F} \, dx_{j_1} \cdots dx_{j_n}. \]
Using the monotone convergence theorem repeatedly as in the first part of the argument, this equals
\[ \lim_{p\to\infty} \sum_{k=1}^{\infty} \int \cdots \int X_{R_p \cap F_k} \, dx_{j_1} \cdots dx_{j_n} \equiv \sum_{k=1}^{\infty} m_n(F_k). \]
Applying the monotone convergence theorem repeatedly on the right, this yields that the iterated integral makes sense and
\[ \int_{\mathbb{R}^n} X_F \, dm_n = \int \cdots \int X_F \, dx_{j_1} \cdots dx_{j_n}. \]
It follows 9.2 holds for every nonnegative simple function in place of $f$ because these are just linear combinations of functions $X_F$. Now taking an increasing sequence of nonnegative simple functions $\{s_k\}$ which converges to a measurable nonnegative function $f$,
\begin{align*}
\int_{\mathbb{R}^n} f \, dm_n &= \lim_{k\to\infty} \int_{\mathbb{R}^n} s_k \, dm_n \\
&= \lim_{k\to\infty} \int \cdots \int s_k \, dx_{j_1} \cdots dx_{j_n} \\
&= \int \cdots \int f \, dx_{j_1} \cdots dx_{j_n}.
\end{align*}
In particular, iterated integrals for any permutation of {1, · · ·, n} are all equal.
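The permutation statement is easy to probe numerically in two dimensions: a double midpoint sum in either order approximates the same number for a nonnegative $f$. A sketch of mine, with an arbitrary choice of integrand:

```python
def iterated(f, order, n=400):
    # midpoint approximation to the iterated integral over [0,1]^2
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    if order == "dx dy":
        return sum(sum(f(x, y) for x in pts) * h for y in pts) * h
    return sum(sum(f(x, y) for y in pts) * h for x in pts) * h

f = lambda x, y: x * x * y   # nonnegative; the exact iterated integral is 1/6
I_xy = iterated(f, "dx dy")
I_yx = iterated(f, "dy dx")
print(I_xy, I_yx)  # both near 1/6, independent of the order
```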
Proof: It suffices to prove this for $f$ having real values because if this is shown the general case is obtained by taking real and imaginary parts. Since $f \in L^1(\mathbb{R}^n)$,
\[ \int_{\mathbb{R}^n} |f| \, dm_n < \infty \]
9.2. N DIMENSIONAL LEBESGUE MEASURE AND INTEGRALS 213
and so both $\frac{1}{2}(|f| + f)$ and $\frac{1}{2}(|f| - f)$ are in $L^1(\mathbb{R}^n)$ and are each nonnegative. Hence from Proposition 9.2.2,
\begin{align*}
\int_{\mathbb{R}^n} f \, dm_n &= \int_{\mathbb{R}^n} \left[ \frac{1}{2}(|f| + f) - \frac{1}{2}(|f| - f) \right] dm_n \\
&= \int_{\mathbb{R}^n} \frac{1}{2}(|f| + f) \, dm_n - \int_{\mathbb{R}^n} \frac{1}{2}(|f| - f) \, dm_n \\
&= \int \cdots \int \frac{1}{2}\left( |f(x)| + f(x) \right) dx_{i_1} \cdots dx_{i_n} - \int \cdots \int \frac{1}{2}\left( |f(x)| - f(x) \right) dx_{i_1} \cdots dx_{i_n} \\
&= \int \cdots \int \left[ \frac{1}{2}\left( |f(x)| + f(x) \right) - \frac{1}{2}\left( |f(x)| - f(x) \right) \right] dx_{i_1} \cdots dx_{i_n} \\
&= \int \cdots \int f(x) \, dx_{i_1} \cdots dx_{i_n}.
\end{align*}
Corollary 9.2.4 Suppose $f$ is measurable with respect to $\mathcal{F}^n$ and suppose for some permutation $(i_1, \cdots, i_n)$,
\[ \int \cdots \int |f(x)| \, dx_{i_1} \cdots dx_{i_n} < \infty. \]
Then $f \in L^1(\mathbb{R}^n)$.
Theorem 9.2.5 Let $\mathcal{B}(\mathbb{R}^n)$ be the Borel sets on $\mathbb{R}^n$. There exists a measure $m_n$ defined on $\mathcal{B}(\mathbb{R}^n)$ such that if $f$ is a nonnegative Borel measurable function,
\[ \int_{\mathbb{R}^n} f \, dm_n = \int \cdots \int f(x) \, dx_{i_1} \cdots dx_{i_n} \tag{9.3} \]
then f ∈ L1 (Rn ). The measure mn is both inner and outer regular on the Borel sets.
That is, if E ∈ B (Rn ),
Proof: Most of it was shown earlier since $\mathcal{B}(\mathbb{R}^n) \subseteq \mathcal{F}^n$. The two assertions about regularity follow from observing that $m_n$ is finite on compact sets and then using Theorem 7.4.6. It remains to show the assertion about the product of Borel sets. If each $A_k$ is open, there is nothing to show because the result is an open set. Suppose then that whenever $A_1, \cdots, A_m$, $m \le n$ are open, the product $\prod_{k=1}^{n} A_k$ is a Borel set. Let $\mathcal{K}$ be the open sets in $\mathbb{R}$ and let $\mathcal{G}$ be those Borel sets such that if $A_m \in \mathcal{G}$ it follows $\prod_{k=1}^{n} A_k$ is Borel. Then $\mathcal{K}$ is a $\pi$ system and is contained in $\mathcal{G}$. Now suppose $F \in \mathcal{G}$. Then
\[ \left( \prod_{k=1}^{m-1} A_k \times F \times \prod_{k=m+1}^{n} A_k \right) \cup \left( \prod_{k=1}^{m-1} A_k \times F^C \times \prod_{k=m+1}^{n} A_k \right) = \prod_{k=1}^{m-1} A_k \times \mathbb{R} \times \prod_{k=m+1}^{n} A_k \]
This is of the form
\[ B \cup A = D, \]
where $B$, $A$ are disjoint and $B$ and $D$ are Borel. Therefore, $A = D \setminus B$ which is a Borel set. Thus $\mathcal{G}$ is closed with respect to complements. If $\{F_i\}$ is a sequence of disjoint elements of $\mathcal{G}$,
\[ \prod_{k=1}^{m-1} A_k \times \left( \cup_i F_i \right) \times \prod_{k=m+1}^{n} A_k = \cup_{i=1}^{\infty} \left( \prod_{k=1}^{m-1} A_k \times F_i \times \prod_{k=m+1}^{n} A_k \right) \]
which is a countable union of Borel sets and is therefore, Borel. Hence $\mathcal{G}$ is also closed
which is a countable union of Borel sets and is therefore, Borel. Hence G is also closed
with respect to countable unions of disjoint sets. Thus by the Lemma on π systems
G ⊇ σ (K) = B (R) and this shows that Am can be any Borel set. Thus the assertion
about the product is true if only A1 , · · ·, Am−1 are open while the rest are Borel.
Continuing this way shows the assertion remains true for each Ai being Borel. Now the
final formula about the measure of a product follows from 9.3.
\begin{align*}
\int_{\mathbb{R}^n} X_{\prod_{k=1}^{n} A_k} \, dm_n &= \int \cdots \int X_{\prod_{k=1}^{n} A_k}(x) \, dx_1 \cdots dx_n \\
&= \int \cdots \int \prod_{k=1}^{n} X_{A_k}(x_k) \, dx_1 \cdots dx_n = \prod_{k=1}^{n} m(A_k).
\end{align*}
9.3 Exercises
1. Find $\int_0^2 \int_0^{6-2z} \int_{x/2}^{3-z} (3 - z) \cos\left( y^2 \right) dy \, dx \, dz$.

2. Find $\int_0^1 \int_0^{18-3z} \int_{x/3}^{6-z} (6 - z) \exp\left( y^2 \right) dy \, dx \, dz$.

3. Find $\int_0^2 \int_0^{24-4z} \int_{y/4}^{6-z} (6 - z) \exp\left( x^2 \right) dx \, dy \, dz$.

5. Find $\int_0^{20} \int_0^{1} \int_{y/5}^{5-z} \frac{\sin x}{x} \, dx \, dz \, dy + \int_{20}^{25} \int_0^{5 - y/5} \int_{y/5}^{5-z} \frac{\sin x}{x} \, dx \, dz \, dy$. Hint: You might try doing it in the order $dy \, dx \, dz$.
6. Explain why for each $t > 0$, $x \to e^{-tx}$ is a function in $L^1(\mathbb{R})$ and
\[ \int_0^{\infty} e^{-tx} \, dx = \frac{1}{t}. \]
Thus
\[ \int_0^R \frac{\sin(t)}{t} \, dt = \int_0^R \int_0^{\infty} \sin(t) e^{-tx} \, dx \, dt. \]
Now explain why you can change the order of integration in the above iterated integral. Then compute what you get. Next pass to a limit as $R \to \infty$ and show
\[ \int_0^{\infty} \frac{\sin(t)}{t} \, dt = \frac{1}{2}\pi. \]
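On a finite rectangle $[0, R] \times [0, X]$ the order swap can be confirmed by brute force, and for large $X$ the value is close to $\int_0^R \sin t / t \, dt$. A numerical sketch; the parameters and names are mine, and $\mathrm{Si}(10) \approx 1.6583$ is used only as a reference value.

```python
import math

def both_orders(R, X, n=800):
    # midpoint double sums for sin(t) e^{-t x} over [0,R] x [0,X], in both orders
    ht, hx = R / n, X / n
    ts = [(i + 0.5) * ht for i in range(n)]
    xs = [(j + 0.5) * hx for j in range(n)]
    dx_inside = sum(math.sin(t) * sum(math.exp(-t * x) for x in xs) * hx for t in ts) * ht
    dt_inside = sum(sum(math.sin(t) * math.exp(-t * x) for t in ts) * ht for x in xs) * hx
    return dx_inside, dt_inside

a, b = both_orders(10.0, 50.0)
print(a, b)  # equal up to rounding; both near Si(10) ~ 1.66
```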
7. Explain why $\int_a^{\infty} f(t) \, dt \equiv \lim_{r\to\infty} \int_a^r f(t) \, dt$ whenever $f \in L^1(a, \infty)$; that is, $f X_{[a,\infty)} \in L^1(\mathbb{R})$.
8. $B(p, q) = \int_0^1 x^{p-1}(1 - x)^{q-1} \, dx$, $\Gamma(p) = \int_0^{\infty} e^{-t} t^{p-1} \, dt$ for $p, q > 0$. The first of these is called the beta function, while the second is the gamma function. Show a.) $\Gamma(p + 1) = p\Gamma(p)$; b.) $\Gamma(p)\Gamma(q) = B(p, q)\Gamma(p + q)$. Explain why the gamma function makes sense for any $p > 0$.
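Part b.) can be probed numerically by comparing a midpoint approximation of $B(p, q)$ with $\Gamma(p)\Gamma(q)/\Gamma(p + q)$. A sketch of mine; `math.gamma` is assumed to implement $\Gamma$, and $p, q > 1$ are chosen so the integrand is bounded:

```python
import math

def beta_num(p, q, n=200000):
    # midpoint rule for the beta integral on (0, 1)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x**(p - 1) * (1 - x)**(q - 1)
    return total * h

p, q = 2.5, 3.5
lhs = beta_num(p, q)
rhs = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
print(lhs, rhs)  # both near 0.0368
```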
9. Let $f(y) = g(y) = |y|^{-1/2}$ if $y \in (-1, 0) \cup (0, 1)$ and $f(y) = g(y) = 0$ if $y \notin (-1, 0) \cup (0, 1)$. For which values of $x$ does it make sense to write the integral $\int_{\mathbb{R}} f(x - y) g(y) \, dy$?
10. Let $\{a_n\}$ be an increasing sequence of numbers in $(0, 1)$ which converges to 1. Let $g_n$ be a nonnegative function which equals zero outside $(a_n, a_{n+1})$ such that $\int g_n \, dx = 1$. Now for $(x, y) \in [0, 1) \times [0, 1)$ define
\[ f(x, y) \equiv \sum_{n=1}^{\infty} g_n(y)\left( g_n(x) - g_{n+1}(x) \right). \]
Explain why this is actually a finite sum for each such $(x, y)$ so there are no convergence questions in the infinite sum. Explain why $f$ is a continuous function on $[0, 1) \times [0, 1)$. You can extend $f$ to equal zero off $[0, 1) \times [0, 1)$ if you like. Show the iterated integrals exist but are not equal. In fact, show
\[ \int_0^1 \int_0^1 f(x, y) \, dy \, dx = 1 \ne 0 = \int_0^1 \int_0^1 f(x, y) \, dx \, dy. \]
Does this example contradict the Fubini theorem? Explain why or why not.
It follows that for each $E \in \mathcal{F}_n$ there exists $F \in \mathcal{B}(\mathbb{R}^n)$ such that $F \supseteq E$ and $m_n(E) = m_n(F)$. If $A_k$ is Lebesgue measurable then $\prod_{k=1}^{n} A_k \in \mathcal{F}_n$ and
\[ m_n\left( \prod_{k=1}^{n} A_k \right) = \prod_{k=1}^{n} m(A_k). \]
\[ V_{mk} \supseteq E_m \]
and
\[ m_n(V_{mk}) - m_n(E_m) = m_n(V_{mk} \setminus E_m) < \left( 2^{-k} \right)\left( 2^{-m} \right). \]
Then
\[ E = \cup_{m=1}^{\infty} E_m \subseteq \cup_{m=1}^{\infty} V_{mk} \equiv V_k \]
and
\[ \left( \cap_{l=1}^{k} V_l \right) \setminus E \subseteq \cup_{m=1}^{\infty} V_{mk} \setminus E \]
so
\[ m_n\left( \left( \cap_{l=1}^{k} V_l \right) \setminus E \right) \le \sum_{m=1}^{\infty} m_n(V_{mk} \setminus E) < \sum_{m=1}^{\infty} \left( 2^{-k} \right)\left( 2^{-m} \right) \le 2^{-k}. \]
Let $G \equiv \cap_{k=1}^{\infty} \cap_{l=1}^{k} V_l$. Then from the above, and passing to the limit, it follows
\[ m_n(G \setminus E) = 0. \]
\[ m_n\left( E \setminus \cup_{m=1}^{\infty} K_{mk} \right) \le m_n\left( \cup_{m=1}^{\infty} (E_m \setminus K_{mk}) \right) < 2^{-k}. \]
Let $F = \cup_{k=1}^{\infty} \cup_{m=1}^{\infty} K_{mk}$. Then
\[ m_n(E \setminus F) \le m_n\left( E \setminus \cup_{m=1}^{\infty} K_{mk} \right) < 2^{-k}. \]
$m(C_k)$. In fact, you can have $B_k$ equal a countable intersection of open sets and $C_k$ a countable union of compact sets. Then
\begin{align*}
\prod_{k=1}^{n} m(A_k) = \prod_{k=1}^{n} m(C_k) &\le m_n\left( \prod_{k=1}^{n} C_k \right) \\
&\le m_n\left( \prod_{k=1}^{n} A_k \right) \le m_n\left( \prod_{k=1}^{n} B_k \right) \\
&= \prod_{k=1}^{n} m(B_k) = \prod_{k=1}^{n} m(A_k).
\end{align*}
It remains to prove the claim about the measure being translation invariant. Let $\mathcal{K}$ denote all sets of the form
\[ \prod_{k=1}^{n} U_k \]
which is also a finite Cartesian product of finitely many open sets. Also,
\begin{align*}
m_n\left( x + \prod_{k=1}^{n} U_k \right) &= m_n\left( \prod_{k=1}^{n} (x_k + U_k) \right) \\
&= \prod_{k=1}^{n} m(x_k + U_k) \\
&= \prod_{k=1}^{n} m(U_k) = m_n\left( \prod_{k=1}^{n} U_k \right).
\end{align*}
The step to the last line is obvious because an arbitrary open set in $\mathbb{R}$ is the disjoint union of open intervals and the lengths of these intervals are unchanged when they are slid to another location.
Now let $\mathcal{G}$ denote those Borel sets $E$ with the property that for each $p \in \mathbb{N}$
\[ m_n\left( x + E \cap (-p, p)^n \right) = m_n\left( E \cap (-p, p)^n \right) \]
and the set $x + E \cap (-p, p)^n$ is a Borel set. Thus $\mathcal{K} \subseteq \mathcal{G}$. If $E \in \mathcal{G}$ then
\[ \left( x + E^C \cap (-p, p)^n \right) \cup \left( x + E \cap (-p, p)^n \right) = x + (-p, p)^n \]
which implies $x + E^C \cap (-p, p)^n$ is a Borel set since it equals a difference of two Borel sets. Now consider the following.
\begin{align*}
m_n\left( x + E^C \cap (-p, p)^n \right) + m_n\left( E \cap (-p, p)^n \right) &= m_n\left( x + E^C \cap (-p, p)^n \right) + m_n\left( x + E \cap (-p, p)^n \right) \\
&= m_n\left( x + (-p, p)^n \right) = m_n\left( (-p, p)^n \right) \\
&= m_n\left( E^C \cap (-p, p)^n \right) + m_n\left( E \cap (-p, p)^n \right)
\end{align*}
which shows
\[ m_n\left( x + E^C \cap (-p, p)^n \right) = m_n\left( E^C \cap (-p, p)^n \right), \]
9.5. MOLLIFIERS 219
showing that E C ∈ G.
If $\{E_k\}$ is a sequence of disjoint sets of $\mathcal{G}$,
\[ m_n\left( x + \cup_{k=1}^{\infty} E_k \cap (-p, p)^n \right) = m_n\left( \cup_{k=1}^{\infty} x + E_k \cap (-p, p)^n \right). \]
Now the sets $\{x + E_k \cap (-p, p)^n\}$ are also disjoint and so the above equals
\[ \sum_{k} m_n\left( x + E_k \cap (-p, p)^n \right) = \sum_{k} m_n\left( E_k \cap (-p, p)^n \right) = m_n\left( \cup_{k=1}^{\infty} E_k \cap (-p, p)^n \right). \]
Thus $\mathcal{G}$ is also closed with respect to countable disjoint unions. It follows from the lemma on $\pi$ systems that $\mathcal{G} \supseteq \sigma(\mathcal{K})$. But from Lemma 7.7.7 on Page 174, every open set is a countable union of sets of $\mathcal{K}$ and so $\sigma(\mathcal{K})$ contains the open sets. Therefore, $\mathcal{B}(\mathbb{R}^n) \supseteq \mathcal{G} \supseteq \sigma(\mathcal{K}) \supseteq \mathcal{B}(\mathbb{R}^n)$ which shows $\mathcal{G} = \mathcal{B}(\mathbb{R}^n)$.
I have just shown that for every $E \in \mathcal{B}(\mathbb{R}^n)$ and any $p \in \mathbb{N}$,
\[ m_n\left( x + E \cap (-p, p)^n \right) = m_n\left( E \cap (-p, p)^n \right). \]
Letting $p \to \infty$, it follows
\[ m_n(x + E) = m_n(E). \]
\begin{align*}
m_n(F) = m_n(E) = m_n(x + E) &= m_n\left( x + (E \setminus F) \cup F \right) \\
&= m_n(x + E \setminus F) + m_n(x + F) = m_n(x + F).
\end{align*}
This proves the theorem. ¥
9.5 Mollifiers
From Theorem 8.10.3, every function in L1 (Rn ) can be approximated by one in Cc (Rn )
but even more incredible things can be said. In fact, you can approximate an arbitrary
function in L1 (Rn ) with one which is infinitely differentiable having compact support.
This is very important in partial differential equations. I am just giving a short intro-
duction to this concept here. Consider the following example.
Then a little work shows ψ ∈ Cc∞ (U ). The following also is easily obtained.
You show this by verifying the partial derivatives all exist and are continuous. The only place this is hard is when $|x - z| = r$. It is left as an exercise. You might consider a simpler example,
\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} \]
and reduce the above to a consideration of something like this simpler case.
Proof: This follows right away from the change of variables formula. In the left, let $x - y \equiv u$. Then the left side equals
\[ \int f(x - u) g(u) \, du \]
because the absolute value of the determinant of the derivative is 1. Now replace $u$ with $y$. This proves the proposition. ¥
The following lemma will be useful in what follows. It says among other things that one of these very irregular functions in $L^1_{loc}(\mathbb{R}^n)$ is smoothed out by convolving with a mollifier.
Lemma 9.5.7 Let f ∈ L1loc (Rn ), and g ∈ Cc∞ (Rn ). Then f ∗ g is an infinitely
differentiable function. Also, if {ψ m } is a mollifier and U is an open set and f ∈
C 0 (U ) ∩ L1loc (Rn ) , then at every x ∈ U,
such that
\[ |g(x + te_j - y) - g(x - y)| \le M|t| \]
for any choice of $x$ and $y$. Therefore, there exists a dominating function for the integrand of the above integral which is of the form $C|f(y)| X_K$ where $K$ is a compact set depending on the support of $g$. It follows from the dominated convergence theorem the limit of the difference quotient above passes inside the integral as $t \to 0$ and so
\[ \frac{\partial}{\partial x_j}(f * g)(x) = \int f(y) \frac{\partial}{\partial x_j} g(x - y) \, dy. \]
Now letting $\frac{\partial}{\partial x_j} g$ play the role of $g$ in the above argument, a repeat of the above reasoning shows partial derivatives of all orders exist. A similar use of the dominated convergence theorem shows all these partial derivatives are also continuous.
It remains to verify the claim about the mollifier. Let $x \in U$ and let $m$ be large enough that $B\left( x, \frac{1}{m} \right) \subseteq U$. Then
\[ |f * \psi_m(x) - f(x)| \le \int_{B\left( 0, \frac{1}{m} \right)} |f(x - y) - f(x)| \, \psi_m(y) \, dy. \]
By continuity of $f$ at $x$, for all $m$ sufficiently large, the above is dominated by
\[ \varepsilon \int_{B\left( 0, \frac{1}{m} \right)} \psi_m(y) \, dy = \varepsilon \]
and this proves the claim.
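The claim is easy to visualize with a one dimensional discrete convolution: convolving the (discontinuous) Heaviside step with a normalized bump of width $\varepsilon$ reproduces the function exactly at distance $\ge \varepsilon$ from the jump and averages across the jump itself. A sketch; all names and parameters are mine.

```python
import math

def bump(y, eps):
    # the standard mollifier profile, supported in (-eps, eps); normalized below
    if abs(y) >= eps:
        return 0.0
    return math.exp(-1.0 / (1.0 - (y / eps) ** 2))

def mollify(f, x, eps, n=2000):
    # midpoint approximation to (f * psi_eps)(x) = integral of f(x - y) psi_eps(y) dy
    h = 2.0 * eps / n
    ys = [-eps + (i + 0.5) * h for i in range(n)]
    w = [bump(y, eps) for y in ys]
    Z = sum(w) * h                      # normalizing constant, so the bump integrates to 1
    return sum(f(x - y) * wy for y, wy in zip(ys, w)) * h / Z

step = lambda t: 1.0 if t > 0 else 0.0  # discontinuous, but locally integrable

print(mollify(step, 1.0, 0.1))    # 1.0: far from the jump, f * psi = f
print(mollify(step, -1.0, 0.1))   # 0.0
print(mollify(step, 0.0, 0.1))    # 0.5: the jump is averaged
print(mollify(step, 0.05, 0.01))  # 1.0 again once eps is small enough
```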
Now consider the formula in the case where $f \in C^1(U)$. Using Proposition 9.5.6,
\begin{align*}
&\frac{f * \psi_m(x + he_i) - f * \psi_m(x)}{h} \\
&= \frac{1}{h}\left( \int_{B\left( 0, \frac{1}{m} \right)} f(x + he_i - y) \, \psi_m(y) \, dy - \int_{B\left( 0, \frac{1}{m} \right)} f(x - y) \, \psi_m(y) \, dy \right) \\
&= \int_{B\left( 0, \frac{1}{m} \right)} \frac{f(x + he_i - y) - f(x - y)}{h} \, \psi_m(y) \, dy.
\end{align*}
¡ 1¢
Now letting m be small enough that B 0, m is contained in U along with the continuity
of the partial derivatives, it follows the difference quotients are uniformly bounded for
all h sufficiently small and so one can apply the dominated convergence theorem and
pass to the limit obtaining
Z
fxi (x − y) ψ m (y) dy ≡ fxi ∗ ψ m (x)
Rn
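Though outside the text, the smoothing and convergence claims of Lemma 9.5.7 are easy to observe numerically. The following sketch (all function and parameter choices are hypothetical, and integrals are discretized with the trapezoid rule) convolves a step function with a normalized bump ψ_m supported in (−1/m, 1/m) and evaluates f ∗ ψ_m at a point of continuity of f.

```python
import numpy as np

def bump(z):
    """Smooth bump supported in (-1, 1)."""
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(-1.0 / (1.0 - z[inside] ** 2))
    return out

def mollify_at(f, x, m, npts=4001):
    """Approximate (f * psi_m)(x), where psi_m has support (-1/m, 1/m) and unit mass."""
    y = np.linspace(-1.0 / m, 1.0 / m, npts)
    w = bump(m * y)
    w /= np.trapz(w, y)          # normalize: the integral of psi_m equals 1
    return np.trapz(f(x - y) * w, y)

f = lambda t: np.where(t > 0.0, 1.0, 0.0)    # step function, continuous at x = 0.5

for m in (2, 8, 32):
    print(m, mollify_at(f, 0.5, m))
# once 1/m < 0.5, the support of psi_m misses the jump and f * psi_m (0.5) = f(0.5) = 1
```

At the discontinuity x = 0 the mollified values settle near the average 1/2 instead, which is consistent with the lemma: pointwise recovery is claimed only at points of continuity.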
Theorem 9.5.8 Let K be a compact subset of an open set, U . Then there exists
a function, h ∈ Cc∞ (U ), such that h(x) = 1 for all x ∈ K and h(x) ∈ [0, 1] for all x.
Also there exists an open set W such that
K ⊆ W ⊆ W̄ ⊆ U
Proof: Let r > 0 be small enough that K+B(0, 3r) ⊆ U. The symbol, K+B(0, 3r)
means
{k + x : k ∈ K and x ∈ B (0, 3r)} .
Thus this is simply a way to write
∪ {B (k, 3r) : k ∈ K} .
[Figure: nested sets K ⊆ K_r ⊆ U, where K_r ≡ K + B(0, r).]

Consider X_{K_r} ∗ ψ_m where ψ_m is a mollifier and K_r ≡ K + B(0, r). Let m be so large that 1/m < r. Then from the definition of what is meant by a convolution, and using that ψ_m has support in B(0, 1/m), X_{K_r} ∗ ψ_m = 1 on K and its support is in K + B(0, 3r), a bounded set. Now using Lemma 9.5.7, X_{K_r} ∗ ψ_m is also infinitely differentiable. Therefore, let h = X_{K_r} ∗ ψ_m.

As to the existence of the open set W, let W ≡ h^{−1}((1/2, 1]), an open set containing K whose closure is contained in the closed and bounded set h^{−1}([1/2, 1]) ⊆ U. This proves the theorem. ¥
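A one dimensional version of the construction in Theorem 9.5.8 can be carried out numerically: with K = [−1, 1], K_r = K + (−r, r), and 1/m < r, the convolution X_{K_r} ∗ ψ_m equals 1 on K and vanishes off K + (−3r, 3r). The discretization below is an illustrative sketch, not part of the text.

```python
import numpy as np

def bump(z):
    """Smooth bump supported in (-1, 1)."""
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(-1.0 / (1.0 - z[inside] ** 2))
    return out

def smooth_indicator(x, K=(-1.0, 1.0), r=0.5, m=4, npts=2001):
    """h(x) = (X_{K_r} * psi_m)(x) with K_r = K + (-r, r) and 1/m < r."""
    a, b = K
    y = np.linspace(-1.0 / m, 1.0 / m, npts)
    w = bump(m * y)
    w /= np.trapz(w, y)                                  # psi_m has unit mass
    indicator = ((x - y) >= a - r) & ((x - y) <= b + r)  # X_{K_r}(x - y)
    return np.trapz(indicator * w, y)

# h equals 1 on K, vanishes well outside K_r, and interpolates smoothly between
print(smooth_indicator(0.0), smooth_indicator(1.0), smooth_indicator(3.0))
```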
The following is the remarkable theorem mentioned above. First, here is some no-
tation.
Theorem 9.5.10 C_c^∞(R^n) is dense in L^1(R^n). Here the measure is Lebesgue measure.

Proof: Let f ∈ L^1(R^n) and let ε > 0 be given. Choose g ∈ C_c(R^n) such that

∫ |g − f| dm_n < ε/2.

Now consider g ∗ ψ_m ∈ C_c^∞(R^n). The claim is that

∫ |g ∗ ψ_m − g| dm_n < ε/2

whenever m is large enough. This follows because, since g has compact support, it is uniformly continuous on R^n, and so if η > 0 is given, then whenever |y| is sufficiently small,

|g(x) − g(x − y)| < η

for all x. Thus, since g has compact support, if y is small enough, it follows

∫ |g − g_y| dm_n < ε/2.

There is no measurability problem in the use of Fubini's theorem because the function (x, y) → |g(x − y) − g(x)| ψ_m(y) is continuous. Then by the triangle inequality, ∫ |f − g ∗ ψ_m| dm_n < ε. ¥
The useful thing about a locally finite collection of sets is that the closure of their union equals the union of their closures. This is clearly true of a finite collection.
Lemma 9.5.12 Let H be a locally finite collection of sets of a normed vector space V. Then

\overline{∪H} = ∪ {H̄ : H ∈ H}.
Proof: It is obvious that ⊇ holds in the above claim. It remains to go the other way. Suppose then that p is a limit point of ∪H and p ∉ ∪H. There exists r > 0 such that B(p, r) has nonempty intersection with only finitely many sets of H, say H_1, ···, H_m. Then p must be a limit point of one of these. If this were not so, there would exist r_0 such that 0 < r_0 < r with B(p, r_0) having empty intersection with each of these H_i. But then p would fail to be a limit point of ∪H. Therefore, p is contained in the right side. It is clear ∪H is also contained in the right side, and so this proves the lemma. ¥
A good example to consider is the rational numbers, each regarded as a singleton set in R. This is not a locally finite collection of sets, and note that Q̄ = R ≠ ∪{ {x} : x ∈ Q } = Q. By contrast, Z is a locally finite collection of sets, the sets consisting of individual integers. The closure of Z is equal to Z because Z has no limit points, so it contains them all.
is in C^∞(R^n).

K_1 ⊆ W_1 ⊆ W̄_1 ⊆ V_1

W̄_i ⊆ U_i ⊆ Ū_i ⊆ V_i, with Ū_i compact.

Similarly, {U_i}_{i=1}^∞ is locally finite.

[Figure: nested open sets W_i ⊆ U_i ⊆ V_i.]

Since the sets {W_i}_{i=1}^∞ are locally finite, it follows that \overline{∪_{i=1}^∞ W_i} = ∪_{i=1}^∞ W̄_i, and so it is possible to define φ_i and γ, infinitely differentiable functions having compact support, such that

Ū_i ≺ φ_i ≺ V_i, ∪_{i=1}^∞ W̄_i ≺ γ ≺ ∪_{i=1}^∞ U_i.
Now define

ψ_i(x) ≡ γ(x)φ_i(x) / Σ_{j=1}^∞ φ_j(x) if Σ_{j=1}^∞ φ_j(x) ≠ 0, and ψ_i(x) ≡ 0 if Σ_{j=1}^∞ φ_j(x) = 0.

If x is such that Σ_{j=1}^∞ φ_j(x) = 0, then x ∉ ∪_{i=1}^∞ Ū_i because φ_i equals one on Ū_i. Consequently γ(y) = 0 for all y near x, thanks to the fact that ∪_{i=1}^∞ Ū_i is closed, and so ψ_i(y) = 0 for all y near x. Hence ψ_i is infinitely differentiable at such x. If Σ_{j=1}^∞ φ_j(x) ≠ 0, this situation persists near x because each φ_j is continuous, and so ψ_i is infinitely differentiable at such points also. Therefore ψ_i is infinitely differentiable. If x ∈ K, then γ(x) = 1 and so Σ_{j=1}^∞ ψ_j(x) = 1. Clearly 0 ≤ ψ_i(x) ≤ 1 and spt(ψ_j) ⊆ V_j. This proves the theorem. ¥
The functions, {ψ i } are called a C ∞ partition of unity.
The method of proof of this lemma easily implies the following useful corollary.
Proof: Keep V_i the same but replace V_j with Ṽ_j ≡ V_j \ H for j ≠ i. Now in the proof above, applied to this modified collection of open sets, if j ≠ i, then φ_j(x) = 0 whenever x ∈ H. Therefore, ψ_i(x) = 1 on H. ¥
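The formula defining ψ_i is easy to experiment with. Below is a hedged one dimensional sketch (the cover, bump widths, and the simplification γ ≡ 1 are illustrative choices, not from the text): bumps φ_i subordinate to overlapping intervals are normalized by their sum, and the resulting ψ_i sum to 1 on the covered interval.

```python
import numpy as np

def bump(z):
    """Smooth bump supported in (-1, 1)."""
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(-1.0 / (1.0 - z[inside] ** 2))
    return out

# cover [0, 4] by overlapping intervals V_i = (c_i - 1.5, c_i + 1.5)
centers = [0.0, 1.0, 2.0, 3.0, 4.0]
x = np.linspace(0.0, 4.0, 401)
phis = np.array([bump((x - c) / 1.5) for c in centers])   # spt(phi_i) inside V_i

total = phis.sum(axis=0)          # strictly positive on [0, 4]
psis = phis / total               # psi_i as in the theorem, with gamma = 1

print(total.min() > 0, np.allclose(psis.sum(axis=0), 1.0))   # prints: True True
```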
or any other. The proof given here is from Basic Analysis [27]. It first considers the
case of open balls and then generalizes to balls which may be neither open nor closed.
If B1 , B2 ∈ G then B1 ∩ B2 = ∅, (9.5)
G is maximal with respect to 9.4 and 9.5. (9.6)
By this is meant that if H is a collection of balls satisfying 9.4 and 9.5, then H cannot
properly contain G.
Proof: If no ball of F has radius larger than k, let G = ∅. Assume therefore that some balls have radius larger than k, and let F ≡ {B_i}_{i=1}^∞. Now let B_{n_1} be the first ball in the list which has radius greater than k. If every ball having radius larger than k intersects this one, then stop; the maximal set is {B_{n_1}}. Otherwise, let B_{n_2} be the next ball having radius larger than k which is disjoint from B_{n_1}. Continue this way, obtaining {B_{n_i}}_{i=1}^∞, a finite or infinite sequence of disjoint balls having radius larger than k. Then let G ≡ {B_{n_i}}. To see G is maximal with respect to 9.4 and 9.5, suppose B ∈ F, B has radius larger than k, and G ∪ {B} satisfies 9.4 and 9.5. Then at some point in the process, B would have been chosen because it would be the ball of radius larger than k which has the smallest index. Therefore, B ∈ G, and this shows G is maximal with respect to 9.4 and 9.5. ¥
For an open ball B = B(x, r), denote by B̃ the open ball B(x, 4r).
Suppose

∞ > M ≡ sup{r : B(p, r) ∈ F} > 0.

Then there exists G ⊆ F such that G consists of disjoint balls and

A ⊆ ∪{B̃ : B ∈ G}.
Then B(p, r) must have nonempty intersection with some ball from G_1 ∪ ··· ∪ G_m because if it didn't, then G_m would fail to be maximal. Denote by B(p_0, r_0) a ball in G_1 ∪ ··· ∪ G_m which has nonempty intersection with B(p, r). Thus

r_0 > (2/3)^m M.
[Figure: the balls B(p_0, r_0) and B(p, r) meeting at a point w, with x a point of B(p, r).]
Then, for x ∈ B(p, r), since |x − p| < r, |p − w| < r, |w − p_0| < r_0, and r ≤ (2/3)^{m−1} M < (3/2) r_0,

|x − p_0| ≤ |x − p| + |p − w| + |w − p_0| < r + r + r_0 ≤ 2 (2/3)^{m−1} M + r_0 < 2 (3/2) r_0 + r_0 = 4 r_0.
This proves the lemma since it shows B (p, r) ⊆ B (p0 , 4r0 ). ¥
With this lemma, consider a version of the Vitali covering theorem in which the balls do not have to be open. In this theorem, B will denote an open ball B(x, r) along with either part or all of the points where ||x|| = r, and ||·|| is any norm for R^n.

Definition 9.6.3 Let B be a ball centered at x having radius r. Denote by B̂ the open ball B(x, 5r).
A ≡ ∪ {B : B ∈ F}.

Suppose

∞ > M ≡ sup{r : B(p, r) ∈ F} > 0.

Then there exists G ⊆ F such that G consists of disjoint balls and

A ⊆ ∪{B̂ : B ∈ G}.
Proof: For B one of these balls, say \overline{B(x, r)} ⊇ B ⊇ B(x, r), denote by B_1 the open ball B(x, 5r/4). Let F_1 ≡ {B_1 : B ∈ F} and let A_1 denote the union of the balls in F_1. Apply Lemma 9.6.2 to F_1 to obtain

A_1 ⊆ ∪{B̃_1 : B_1 ∈ G_1}.
Definition 9.7.1 Let F be a collection of balls that cover a set, E, which have
the property that if x ∈ E and ε > 0, then there exists B ∈ F, diameter of B < ε and
x ∈ B. Such a collection covers E in the sense of Vitali.
Recall that from this definition, if S ⊆ R^n there exists E_1 ⊇ S such that m_n(E_1) = m̄_n(S). To see this, note that it suffices to assume in the above definition of m̄_n that the E_k are also disjoint. If not, replace them with the sequence given by

F_1 = E_1, F_2 ≡ E_2 \ F_1, ···, F_m ≡ E_m \ ∪_{k=1}^{m−1} E_k,

etc. Then for each l > m̄_n(S), there exists {E_k} such that

l > Σ_k m_n(E_k) ≥ Σ_k m_n(F_k) = m_n(∪_k E_k) ≥ m̄_n(S).

Taking a sequence of such covers G_k ≡ ∪_j E_j ⊇ S with m_n(G_k) → m̄_n(S), then let E_1 = ∩_k G_k.
Note this implies that if m̄_n(S) = 0, then S must be in F_n because of completeness of Lebesgue measure.
Theorem 9.7.2 Let E ⊆ R^n and suppose 0 < m̄_n(E) < ∞, where m̄_n is the outer measure determined by m_n, n dimensional Lebesgue measure, and let F be a collection of closed balls of bounded radii such that F covers E in the sense of Vitali. Then there exists a countable collection of disjoint balls from F, {B_j}_{j=1}^∞, such that m̄_n(E \ ∪_{j=1}^∞ B_j) = 0.
Proof: From the definition of outer measure there exists a Lebesgue measurable set E_1 ⊇ E such that m_n(E_1) = m̄_n(E). Now by outer regularity of Lebesgue measure, there exists U, an open set containing E_1, which satisfies

m_n(E_1) > (1 − 10^{−n}) m_n(U).

Discarding the balls of F which are not contained in U still leaves a Vitali cover of E by closed balls, so by the Vitali lemma above there is a countable disjoint collection {B_j} from F with

E ⊆ ∪_{j=1}^∞ B̂_j, B_j ⊆ U.
Therefore,

m_n(E_1) = m̄_n(E) ≤ m_n(∪_{j=1}^∞ B̂_j) ≤ Σ_j m_n(B̂_j) = 5^n Σ_j m_n(B_j) = 5^n m_n(∪_{j=1}^∞ B_j).
Then

m_n(E_1) > (1 − 10^{−n}) m_n(U)
≥ (1 − 10^{−n}) [ m_n(E_1 \ ∪_{j=1}^∞ B_j) + m_n(∪_{j=1}^∞ B_j) ]
≥ (1 − 10^{−n}) [ m_n(E_1 \ ∪_{j=1}^∞ B_j) + 5^{−n} m_n(E_1) ],

using m̄_n(E) = m_n(E_1) in the last step, and so

(1 − (1 − 10^{−n}) 5^{−n}) m_n(E_1) ≥ (1 − 10^{−n}) m_n(E_1 \ ∪_{j=1}^∞ B_j),

which implies

m_n(E_1 \ ∪_{j=1}^∞ B_j) ≤ [ (1 − (1 − 10^{−n}) 5^{−n}) / (1 − 10^{−n}) ] m_n(E_1).

Now a short computation shows

0 < (1 − (1 − 10^{−n}) 5^{−n}) / (1 − 10^{−n}) < 1.

Hence, denoting by θ_n a number such that

(1 − (1 − 10^{−n}) 5^{−n}) / (1 − 10^{−n}) < θ_n < 1,

it follows that

m̄_n(E \ ∪_{j=1}^∞ B_j) ≤ m_n(E_1 \ ∪_{j=1}^∞ B_j) < θ_n m_n(E_1) = θ_n m̄_n(E).

Now using Theorem 7.3.2 on Page 155, there exists N_1 large enough that

θ_n m̄_n(E) ≥ m_n(E_1 \ ∪_{j=1}^{N_1} B_j) ≥ m̄_n(E \ ∪_{j=1}^{N_1} B_j). (9.7)
Let F_1 = {B ∈ F : B_j ∩ B = ∅, j = 1, ···, N_1}. If E \ ∪_{j=1}^{N_1} B_j = ∅, then F_1 = ∅ and

m̄_n(E \ ∪_{j=1}^{N_1} B_j) = 0.

Therefore, in this case let B_k = ∅ for all k > N_1. Consider the case where

E \ ∪_{j=1}^{N_1} B_j ≠ ∅.
In this case, since the balls are closed and F is a Vitali cover, F_1 ≠ ∅ and covers E \ ∪_{j=1}^{N_1} B_j in the sense of Vitali. Repeat the same argument, letting E \ ∪_{j=1}^{N_1} B_j play the role of E. (You pick a different E_1 whose measure equals the outer measure of E \ ∪_{j=1}^{N_1} B_j and proceed as before.) Then choosing B_j for j = N_1 + 1, ···, N_2 as in the above argument,

θ_n m̄_n(E \ ∪_{j=1}^{N_1} B_j) ≥ m̄_n(E \ ∪_{j=1}^{N_2} B_j).

Continuing this way yields an increasing sequence N_k with

m̄_n(E \ ∪_{j=1}^{N_k} B_j) ≤ θ_n^k m̄_n(E)

for every k ∈ N. Therefore, the conclusion holds in this case also because θ_n < 1. This proves the theorem. ¥
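The "short computation" bounding θ_n, and the geometric decay θ_n^k that finishes the proof, can be spot-checked numerically (a sketch outside the text):

```python
def theta_bound(n):
    """(1 - (1 - 10^-n) 5^-n) / (1 - 10^-n), which must lie strictly in (0, 1)."""
    return (1 - (1 - 10.0 ** -n) * 5.0 ** -n) / (1 - 10.0 ** -n)

for n in range(1, 8):
    t = theta_bound(n)
    print(n, t)
    # any theta_n in (t, 1) works, and theta_n ** k -> 0 gives the conclusion
```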
There is an obvious corollary which removes the assumption that 0 < mn (E).
Corollary 9.7.3 Let E ⊆ R^n and suppose m̄_n(E) < ∞, where m̄_n is the outer measure determined by m_n, n dimensional Lebesgue measure, and let F be a collection of closed balls of bounded radii such that F covers E in the sense of Vitali. Then there exists a countable collection of disjoint balls from F, {B_j}_{j=1}^∞, such that m̄_n(E \ ∪_{j=1}^∞ B_j) = 0.
Proof: If 0 = mn (E) you simply pick any ball from F for your collection of disjoint
balls. ¥
It is also not hard to remove the assumption that mn (E) < ∞.
Proof: Let R_m ≡ (−m, m)^n be the open rectangle having sides of length 2m which is centered at 0, and let R_0 = ∅. Let H_m ≡ R̄_m \ R_m. Since both R̄_m and R_m have the same measure, (2m)^n, it follows m_n(H_m) = 0. Now for all k ∈ N, R_k ⊆ R̄_k ⊆ R_{k+1}. Consider the disjoint open sets U_k ≡ R_{k+1} \ R̄_k. Thus R^n = ∪_{k=0}^∞ U_k ∪ N, where N is a set of measure zero equal to the union of the H_k. Let F_k denote those balls of F which are contained in U_k and let E_k ≡ U_k ∩ E. Then from Theorem 9.7.2, there exists a sequence of disjoint balls D_k ≡ {B_i^k}_{i=1}^∞ of F_k such that m̄_n(E_k \ ∪_{j=1}^∞ B_j^k) = 0. Letting {B_i}_{i=1}^∞ be an enumeration of all the balls of ∪_k D_k, it follows that

m̄_n(E \ ∪_{j=1}^∞ B_j) ≤ m_n(N) + Σ_{k=1}^∞ m̄_n(E_k \ ∪_{j=1}^∞ B_j^k) = 0.

¥
Also, you don’t have to assume the balls are closed.
and so

m̄_n(E \ ∪_{i=1}^∞ B_i) ≤ m̄_n(E \ ∪_{i=1}^∞ B̄_i) + m_n(∪_{i=1}^∞ B̄_i \ B_i) = m̄_n(E \ ∪_{i=1}^∞ B̄_i) = 0.

¥
This implies you can fill up an open set with balls which cover the open set in the
sense of Vitali.
Note that if

h(x) = Lx

where L ∈ L(R^n, R^n), then L is included in 9.8 because

L(x + v) − Lx = Lv + o(v).

In fact, o(v) = 0.

It is convenient in the following lemma to use the norm on R^n given by

||x|| ≡ max{|x_k| : k = 1, ···, n},

and so m_n(B(x, r)) = (2r)^n.
and so there exist arbitrarily small r_x < 1 such that B(x, 5r_x) ⊆ V and whenever ||v|| ≤ r_x, ||o(v)|| < k||v||. Thus

Σ_{i=1}^∞ m_n(h(B̂_i)) ≤ Σ_{i=1}^∞ m_n(B(h(x_i), 2k r_{x_i}))
= Σ_{i=1}^∞ m_n(B(x_i, 2k r_{x_i})) = (2k)^n Σ_{i=1}^∞ m_n(B(x_i, r_{x_i}))
≤ (2k)^n m_n(V) ≤ (2k)^n ε.
m_n(h(T)) = lim_{k→∞} m_n(h(T_k)) = 0.
Proof: By Theorem 9.4.2 there exists F which is a countable union of compact sets, F = ∪_{k=1}^∞ K_k, such that

F ⊆ S, m_n(S \ F) = 0.

Then since h is continuous,

h(F) = ∪_k h(K_k) ∈ B(R^n).

Also, m_n(h(S \ F)) = 0 by the above lemma, so h(S \ F) ∈ F_n by completeness of Lebesgue measure. Hence

h(S) = h(F) ∪ h(S \ F) ∈ F_n

because it is the union of two sets which are in F_n. This proves the lemma. ¥
In particular, this proves the following corollary.
In the next lemma, the norm used for defining balls will be the usual norm,

|x| = ( Σ_{k=1}^n |x_k|^2 )^{1/2}.

Thus a unitary transformation preserves distances measured with respect to this norm. In particular, if R is unitary (R*R = RR* = I), then

|Rx − Ry| = |x − y|.
Lemma 9.8.4 Let R be unitary and let V be an open set. Then m_n(RV) = m_n(V).
Proof: First assume V is a bounded open set. By Corollary 9.7.6 there is a disjoint sequence of closed balls {B_i} such that V = ∪_{i=1}^∞ B_i ∪ N, where m_n(N) = 0. Denote by x_i the center of B_i and let r_i be the radius of B_i. Then by Lemma 9.8.1, m_n(RV) = Σ_{i=1}^∞ m_n(RB_i). Now by invariance of translation of Lebesgue measure, this equals

Σ_{i=1}^∞ m_n(RB_i − Rx_i) = Σ_{i=1}^∞ m_n(RB(0, r_i)).

Since R is unitary, it preserves all distances and so RB(0, r_i) = B(0, r_i), and therefore

m_n(RV) = Σ_{i=1}^∞ m_n(B(0, r_i)) = Σ_{i=1}^∞ m_n(B_i) = m_n(V).

This proves the lemma in the case that V is bounded. Suppose now that V is just an open set. Let V_k = V ∩ B(0, k). Then m_n(RV_k) = m_n(V_k). Letting k → ∞ yields the desired conclusion. ¥
Lemma 9.8.5 Let E be a Lebesgue measurable set in R^n and let R be unitary. Then m_n(RE) = m_n(E).
Proof: Let K be the open sets. Thus K is a π system. Let G denote those Borel sets F such that for each p ∈ N,

m_n(R(F ∩ (−p, p)^n)) = m_n(F ∩ (−p, p)^n).

Thus G contains K from Lemma 9.8.4. It is also routine to verify G is closed with respect to complements and countable disjoint unions. Therefore from the π systems lemma,

G ⊇ σ(K) = B(R^n) ⊇ G,

and this proves the lemma whenever E ∈ B(R^n). If E is only in F_n, it follows from Theorem 9.4.2 that

E = F ∪ N

where m_n(N) = 0 and F is a countable union of compact sets. Thus by Lemma 9.8.1, m_n(RN) = 0, and so m_n(RE) = m_n(RF) = m_n(F) = m_n(E). ¥
where d_j ≥ 0 and {e_j} is the usual orthonormal basis of R^n. Then for all E ∈ F_n,

m_n(DE) = |det(D)| m_n(E).

To see this, note that

D( ∏_{k=1}^n (a_k, b_k) ) = { Σ_{k=1}^n d_k x_k e_k : x_k ∈ (a_k, b_k) } = ∏_{k=1}^n (d_k a_k, d_k b_k).

It follows

m_n( D( ∏_{k=1}^n (a_k, b_k) ) ) = ( ∏_{k=1}^n d_k ) ( ∏_{k=1}^n (b_k − a_k) ) = |det(D)| m_n( ∏_{k=1}^n (a_k, b_k) ).
Thus K ⊆ G.
Suppose now that F ∈ G and first assume D is one to one. Then
m_n(D(F^C ∩ (−p, p)^n)) + m_n(D(F ∩ (−p, p)^n)) = m_n(D(−p, p)^n)

and so

m_n(D(F^C ∩ (−p, p)^n)) + |det(D)| m_n(F ∩ (−p, p)^n) = |det(D)| m_n((−p, p)^n),

which shows

m_n(D(F^C ∩ (−p, p)^n)) = |det(D)| [ m_n((−p, p)^n) − m_n(F ∩ (−p, p)^n) ] = |det(D)| m_n(F^C ∩ (−p, p)^n).

In case D is not one to one, it follows some d_j = 0 and so |det(D)| = 0 and

0 ≤ m_n(D(F^C ∩ (−p, p)^n)) ≤ m_n(D(−p, p)^n) = ∏_{i=1}^n (d_i p + d_i p) = 0 = |det(D)| m_n(F^C ∩ (−p, p)^n),

so F^C ∈ G.
If {Fk } is a sequence of disjoint sets of G and D is one to one
m_n(D(∪_{k=1}^∞ F_k ∩ (−p, p)^n)) = Σ_{k=1}^∞ m_n(D(F_k ∩ (−p, p)^n))
= |det(D)| Σ_{k=1}^∞ m_n(F_k ∩ (−p, p)^n) = |det(D)| m_n(∪_k F_k ∩ (−p, p)^n).

If D is not one to one, then det(D) = 0 and so the right side of the above equals 0. The left side is also equal to zero because it is no larger than

m_n(D(−p, p)^n) = 0.
Thus G is closed with respect to complements and countable disjoint unions. Hence it
contains σ (K) , the Borel sets. But also G ⊆ B (Rn ) and so G equals B (Rn ) . Letting
p → ∞ yields the conclusion of the lemma in case E ∈ B (Rn ).
Now for E ∈ F_n arbitrary, it follows from Theorem 9.4.2 that

E = F ∪ N

where N is a set of measure zero and F is a countable union of compact sets. Hence, as before, m_n(DN) = 0 by Lemma 9.8.1, and so m_n(DE) = m_n(DF) = |det(D)| m_n(F) = |det(D)| m_n(E). ¥
Proof: Let RU be the right polar decomposition (Theorem 3.9.3 on Page 66) of A. Thus R is unitary and

U = Σ_k d_k w_k ⊗ w_k.

Recall from Lemma 3.9.5 on Page 68 that the determinant of a unitary transformation has absolute value equal to 1. Then from Lemma 9.8.5,

m_n(AE) = m_n(RUE) = m_n(UE).

Let

Q = Σ_j w_j ⊗ e_j,

so that Qe_j = w_j, and let D ≡ Σ_k d_k e_k ⊗ e_k. Then U = QDQ*: apply both sides to w_k and observe both sides give d_k w_k. Since the two linear operators agree on a basis, they must be the same. Thus, since Q is unitary,

m_n(UE) = m_n(QDQ*E) = m_n(DQ*E) = |det(D)| m_n(Q*E) = |det(A)| m_n(E). ¥
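The conclusion m_n(AE) = |det(A)| m_n(E) can be checked with a quick Monte Carlo experiment; the matrix and sample sizes below are arbitrary illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])            # |det A| = 5.5

# E = unit square in R^2; estimate m_2(AE) by rejection sampling in a box around AE
samples = rng.uniform(size=(200_000, 2))
image = samples @ A.T                 # points of AE
lo, hi = image.min(axis=0), image.max(axis=0)
box_vol = np.prod(hi - lo)

# a point y lies in AE iff A^{-1} y lies in the unit square
trial = rng.uniform(lo, hi, size=(200_000, 2))
pre = trial @ np.linalg.inv(A).T
inside = np.all((pre >= 0) & (pre <= 1), axis=1)
estimate = box_vol * inside.mean()

print(estimate, abs(np.linalg.det(A)))   # both close to 5.5
```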
Lemma 9.9.1 Let U and V be bounded open sets in R^n and let h, h^{−1} be C^1 functions such that h(U) = V. Also let f ∈ C_c(V). Then

∫_V f(y) dm_n = ∫_U f(h(x)) |det(Dh(x))| dm_n.

Proof: First note h^{−1}(spt(f)) is a closed subset of the bounded set U, and so it is compact. Thus x → f(h(x)) |det(Dh(x))| is bounded and continuous.

Let x ∈ U. By the assumption that h and h^{−1} are C^1, whenever x_1 is sufficiently close to x,

|f(h(x_1)) |det(Dh(x_1))| − f(h(x)) |det(Dh(x))|| < ε. (9.11)
Corollary 9.9.2 Let U and V be bounded open sets in R^n and let h, h^{−1} be C^1 functions such that h(U) = V. Also let E ⊆ V be measurable. Then

∫_V X_E(y) dm_n = ∫_U X_E(h(x)) |det(Dh(x))| dm_n.
m_n(N) ≤ m_n(∪_{k=m}^∞ G_k \ K_k) ≤ Σ_{k=m}^∞ m_n(G_k \ K_k) < Σ_{k=m}^∞ 2^{−k} = 2^{−(m−1)},

showing m_n(N) = 0. Then f_k(h(x)) must converge to X_E(h(x)) for all x ∉ h^{−1}(N), a set of measure zero by Lemma 9.8.1. Thus

∫_V f_k(y) dm_n = ∫_U f_k(h(x)) |det(Dh(x))| dm_n.
Corollary 9.9.3 Let U and V be open sets in R^n and let h, h^{−1} be C^1 functions such that h(U) = V. Also let E ⊆ V be measurable. Then

∫_V X_E(y) dm_n = ∫_U X_E(h(x)) |det(Dh(x))| dm_n.
Proof: For each x ∈ U, there exists r_x such that B(x, r_x) ⊆ U and r_x < 1. Then by the mean value inequality, Theorem 6.4.2, it follows h(B(x, r_x)) is also bounded. These balls B(x, r_x) give a Vitali cover of U, and so by Corollary 9.7.6 there is a disjoint sequence of these balls {B_i} such that h(B_i) is bounded and

m_n(U \ ∪_i B_i) = 0.

It follows from Lemma 9.8.1 that h(U \ ∪_i B_i) also has measure zero. Then from Corollary 9.9.2,

∫_V X_E(y) dm_n = Σ_i ∫_{h(B_i)} X_{E ∩ h(B_i)}(y) dm_n
= Σ_i ∫_{B_i} X_E(h(x)) |det(Dh(x))| dm_n
= ∫_U X_E(h(x)) |det(Dh(x))| dm_n.
Theorem 9.9.4 Let U and V be open sets in R^n and let h, h^{−1} be C^1 functions such that h(U) = V. Then if g is a nonnegative Lebesgue measurable function,

∫_V g(y) dm_n = ∫_U g(h(x)) |det(Dh(x))| dm_n. (9.13)

Proof: From Corollary 9.9.3, 9.13 holds for any nonnegative simple function in place of g. In general, let {s_k} be an increasing sequence of simple functions which converges to g pointwise. Then from the monotone convergence theorem,

∫_V g(y) dm_n = lim_{k→∞} ∫_V s_k dm_n = lim_{k→∞} ∫_U s_k(h(x)) |det(Dh(x))| dm_n = ∫_U g(h(x)) |det(Dh(x))| dm_n.
Corollary 9.9.5 Let U and V be open sets in R^n and let h, h^{−1} be C^1 functions such that h(U) = V. Let g ∈ L^1(V). Then

∫_V g(y) dm_n = ∫_U g(h(x)) |det(Dh(x))| dm_n.
This is a pretty good theorem but it isn’t too hard to generalize it. In particular, it
is not necessary to assume h−1 is C 1 .
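Before the generalization, the change of variables formula itself can be verified numerically for a concrete C^1 diffeomorphism. The sketch below (map, region, and integrand are illustrative assumptions) uses the polar map h(r, t) = (r cos t, r sin t) on U = (0.5, 1) × (0, π/2), so that |det Dh| = r and V = h(U) is a quarter annulus.

```python
import numpy as np

f = lambda y1, y2: y1 ** 2 + y2                  # any continuous integrand

# right side: integral over U of f(h(x)) |det Dh(x)|, |det Dh| = r
r = np.linspace(0.5, 1.0, 801)
t = np.linspace(0.0, np.pi / 2, 801)
R, T = np.meshgrid(r, t, indexing="ij")
rhs_grid = f(R * np.cos(T), R * np.sin(T)) * R
rhs = np.trapz(np.trapz(rhs_grid, t, axis=1), r)

# left side: integral of f over V = h(U), done directly in Cartesian coordinates
y = np.linspace(0.0, 1.0, 1201)
Y1, Y2 = np.meshgrid(y, y, indexing="ij")
rad = np.hypot(Y1, Y2)
in_V = (rad > 0.5) & (rad < 1.0) & (Y1 > 0) & (Y2 > 0)
lhs = np.trapz(np.trapz(np.where(in_V, f(Y1, Y2), 0.0), y, axis=1), y)

print(lhs, rhs)   # the two integrals agree up to discretization error
```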
it follows that

m_n(K_ε) ≤ 2^n ε (diam(K) + ε)^{n−1}.
Proof: Using the Gram Schmidt procedure, there exists an orthonormal basis {v_1, ···, v_{n−1}} for V; let

{v_1, ···, v_{n−1}, v_n}

be an orthonormal basis for R^n. Now define a linear transformation Q by Qv_i = e_i. Thus QQ* = Q*Q = I, and Q preserves all distances and is a unitary transformation because

|Q Σ_i a_i v_i|^2 = |Σ_i a_i e_i|^2 = Σ_i |a_i|^2 = |Σ_i a_i v_i|^2.

Thus m_n(K_ε) = m_n(QK_ε). Letting k_0 ∈ K, it follows K ⊆ B(k_0, diam(K)) and so

QK ⊆ B^{n−1}(Qk_0, diam(QK)) × {0},

where B^{n−1} refers to the ball taken with respect to the usual norm in R^{n−1}. Every point of K_ε is within ε of some point of K, and so it follows that every point of QK_ε is within ε of some point of QK. Therefore,

QK_ε ⊆ B^{n−1}(Qk_0, diam(QK) + ε) × (−ε, ε).

To see this, let x ∈ QK_ε. Then there exists k ∈ QK such that |k − x| < ε. Therefore, |(x_1, ···, x_{n−1}) − (k_1, ···, k_{n−1})| < ε and |x_n − k_n| < ε, and so x is contained in the set on the right in the above inclusion because k_n = 0. However, the measure of the set on the right is smaller than

[2 (diam(QK) + ε)]^{n−1} (2ε) = 2^n [(diam(K) + ε)]^{n−1} ε. ¥
and

|h(x + v) − h(x) − Dh(x)v| / |v| ≤ | ∫_0^1 Dh(x + tv)v dt − Dh(x)v | / |v|
≤ ( ∫_0^1 |Dh(x + tv)v − Dh(x)v| dt ) / |v|.

Now from uniform continuity of Dh on the compact set {x : dist(x, K) ≤ δ}, it follows there exists r_1 < δ such that if |v| ≤ r_1, then ||Dh(x + tv) − Dh(x)|| < ε for every x ∈ K. From the above formula, it follows that if |v| ≤ r_1,

|h(x + v) − h(x) − Dh(x)v| / |v| ≤ ( ∫_0^1 |Dh(x + tv)v − Dh(x)v| dt ) / |v| < ( ∫_0^1 ε|v| dt ) / |v| = ε.
Then mn (h (Z)) = 0.
Proof: Let {U_k}_{k=1}^∞ be an increasing sequence of open sets whose closures are compact and whose union equals U, and let Z_k ≡ Z ∩ U_k. To obtain such a sequence, let

U_k = { x ∈ U : dist(x, U^C) > 1/k } ∩ B(0, k).
First it is shown that h(Z_k) has measure zero. Let W be an open set contained in U_{k+1} which contains Z_k and satisfies

m_n(Z_k) + ε > m_n(W),

and let r_1 > 0 be a constant as in Lemma 9.9.8 such that whenever x ∈ U_k and 0 < |v| ≤ r_1,

|h(x + v) − h(x) − Dh(x)v| < ε|v|. (9.14)

Now the closures of balls which are contained in W and which have the property that their diameters are less than r_1 yield a Vitali covering of W. Therefore, by Corollary 9.7.6 there is a disjoint sequence of these closed balls {B̃_i} such that

W = ∪_{i=1}^∞ B̃_i ∪ N

where N is a set of measure zero. Denote by {B_i} those closed balls in this sequence which have nonempty intersection with Z_k, let d_i be the diameter of B_i, and let z_i be a point in B_i ∩ Z_k. Since z_i ∈ Z_k, it follows Dh(z_i)B(0, d_i) = D_i, where D_i is contained in a subspace V which has dimension n − 1 and the diameter of D_i is no larger than 2C_k d_i, where

C_k ≥ max{ ||Dh(x)|| : x ∈ Z_k }.
Then by 9.14, if z ∈ B_i,

h(z) − h(z_i) ∈ Dh(z_i)(z − z_i) + B(0, ε d_i) ⊆ D_i + B(0, ε d_i).

Thus

h(B_i) ⊆ h(z_i) + D_i + B(0, ε d_i).

By Lemma 9.9.6,

m_n(h(B_i)) ≤ 2^n (2C_k d_i + ε d_i)^{n−1} ε d_i ≤ d_i^n 2^n [2C_k + ε]^{n−1} ε ≤ C_{n,k} m_n(B_i) ε.
Proof: Let Z = {x : det(Dh(x)) = 0}, a closed set. Then by the inverse function theorem, h^{−1} is C^1 on h(U \ Z), and h(U \ Z) is an open set. Therefore, from Lemma 9.9.9, h(Z) has measure zero, and so by Theorem 9.9.4,

∫_{h(U)} g(y) dm_n = ∫_{h(U \ Z)} g(y) dm_n = ∫_{U \ Z} g(h(x)) |det(Dh(x))| dm_n = ∫_U g(h(x)) |det(Dh(x))| dm_n.
and Z the set where |det Dh (x)| = 0, Lemma 9.9.9 implies mn (h(Z)) = 0. For x ∈ U+ ,
the inverse function theorem implies there exists an open set Bx ⊆ U+ , such that h is
one to one on Bx .
Let {B_i} be a countable subset of {B_x}_{x ∈ U_+} such that U_+ = ∪_{i=1}^∞ B_i. Let E_1 = B_1. If E_1, ···, E_k have been chosen, let E_{k+1} = B_{k+1} \ ∪_{i=1}^k E_i. Thus

∪_{i=1}^∞ E_i = U_+, h is one to one on E_i, E_i ∩ E_j = ∅,

and each E_i is a Borel set contained in the open set B_i. Now define

n(y) ≡ Σ_{i=1}^∞ X_{h(E_i)}(y) + X_{h(Z)}(y).
The sets h(E_i), h(Z) are measurable by Lemma 9.8.2. Thus n(·) is measurable.
= Σ_{i=1}^∞ ∫_{h(U)} X_{h(E_i)}(y) X_F(y) dm_n
= Σ_{i=1}^∞ ∫_{h(B_i)} X_{h(E_i)}(y) X_F(y) dm_n
= Σ_{i=1}^∞ ∫_{B_i} X_{E_i}(x) X_F(h(x)) |det Dh(x)| dm_n
= Σ_{i=1}^∞ ∫_U X_{E_i}(x) X_F(h(x)) |det Dh(x)| dm_n
= ∫_U Σ_{i=1}^∞ X_{E_i}(x) X_F(h(x)) |det Dh(x)| dm_n
= ∫_{U_+} X_F(h(x)) |det Dh(x)| dm_n = ∫_U X_F(h(x)) |det Dh(x)| dm_n.
Observe that

#(y) = n(y) a.e. (9.16)

because n(y) = #(y) if y ∉ h(Z), a set of measure 0. Therefore, # is a measurable function because of completeness of Lebesgue measure.
Proof: From 9.16 and Lemma 9.10.1, 9.17 holds for all g, a nonnegative simple
function. Approximating an arbitrary measurable nonnegative function, g, with an
increasing pointwise convergent sequence of simple functions and using the monotone
convergence theorem, yields 9.17 for an arbitrary nonnegative measurable function, g.
This proves the theorem. ¥
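A one dimensional instance makes the multiplicity function concrete. With h(x) = x^2 on U = (−1, 1), every y ∈ (0, 1) has two preimages, so #(y) = 2 there. The hedged numerical check below picks an illustrative set F and compares the two sides of the formula.

```python
import numpy as np

# h(x) = x^2 maps U = (-1, 1) onto [0, 1); each y in (0, 1) has #(y) = 2 preimages
h = lambda x: x ** 2
dh = lambda x: 2 * x
F = (0.25, 0.81)                      # an illustrative measurable F inside h(U)

x = np.linspace(-1.0, 1.0, 200_001)
integrand = ((h(x) > F[0]) & (h(x) < F[1])) * np.abs(dh(x))
rhs = np.trapz(integrand, x)          # int_U X_F(h(x)) |h'(x)| dx

lhs = 2 * (F[1] - F[0])               # int #(y) X_F(y) dy, since #(y) = 2 on F

print(lhs, rhs)
```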
y_1 = ρ cos θ
y_2 = ρ sin θ
where ρ > 0 and θ ∈ R. Thus these transformation equations are not one to one but they
are one to one on (0, ∞)×[0, 2π). Here I am writing ρ in place of r to emphasize a pattern
which is about to emerge. I will consider polar coordinates as spherical coordinates in
two dimensions. I will also simply refer to such coordinate systems as polar coordinates
regardless of the dimension. This is also the reason I am writing y1 and y2 instead of
the more usual x and y. Now consider what happens when you go to three dimensions.
The situation is depicted in the following picture.
[Figure: the point (y_1, y_2, y_3) in R^3 at distance ρ from the origin, making angle φ_1 with the y_3 axis; (y_1, y_2) lies in the plane R^2.]
From this picture, you see that y_3 = ρ cos φ_1. Also the distance between (y_1, y_2) and (0, 0) is ρ sin(φ_1). Therefore, using polar coordinates to write (y_1, y_2) in terms of θ and this distance,

y_1 = ρ sin φ_1 cos θ,
y_2 = ρ sin φ_1 sin θ,
y_3 = ρ cos φ_1,
where φ1 ∈ R and the transformations are one to one if φ1 is restricted to be in [0, π] .
What was done is to replace ρ with ρ sin φ1 and then to add in y3 = ρ cos φ1 . Having
done this, there is no reason to stop with three dimensions. Consider the following
picture:
[Figure: the point (y_1, y_2, y_3, y_4) in R^4 at distance ρ from the origin, making angle φ_2 with the y_4 axis; (y_1, y_2, y_3) lies in R^3.]
From this picture, you see that y4 = ρ cos φ2 . Also the distance between (y1 , y2 , y3 )
and (0, 0, 0) is ρ sin (φ2 ) . Therefore, using polar coordinates to write (y1 , y2 , y3 ) in terms
of θ, φ1 , and this distance,
Continuing this way, given spherical coordinates in R^n, to get the spherical coordinates in R^{n+1}, you let y_{n+1} = ρ cos φ_{n−1} and then replace every occurrence of ρ with ρ sin φ_{n−1} to obtain y_1, ···, y_n in terms of φ_1, φ_2, ···, φ_{n−1}, θ, and ρ.

It is always the case that ρ measures the distance from the point in R^n to the origin in R^n, 0. Each φ_i ∈ R, and the transformations will be one to one if each φ_i ∈ [0, π] and θ ∈ [0, 2π). It can be shown using math induction that these coordinates map ∏_{i=1}^{n−2} [0, π] × [0, 2π) × (0, ∞) one to one onto R^n \ {0}.
Proof: Formula 9.18 is obvious from the definition of the spherical coordinates. The first claim is also clear from the definition and math induction. It remains to verify 9.19. Let A_0 ≡ ∏_{i=1}^{n−2} (0, π) × (0, 2π). Then it is clear that (A \ A_0) × (0, ∞) ≡ N is a set of measure zero in R^n. Therefore, from Lemma 9.8.1 it follows h(N) is also a set of measure zero. Therefore, using the change of variables theorem, Fubini's theorem, and Sard's lemma,

∫_{R^n} f(y) dy = ∫_{R^n \ {0}} f(y) dy = ∫_{R^n \ ({0} ∪ h(N))} f(y) dy
= ∫_{A_0 × (0, ∞)} f(h(φ, θ, ρ)) ρ^{n−1} Φ(φ, θ) dm_n
= ∫ X_{A × (0, ∞)}(φ, θ, ρ) f(h(φ, θ, ρ)) ρ^{n−1} Φ(φ, θ) dm_n
= ∫_0^∞ ρ^{n−1} ( ∫_A f(h(φ, θ, ρ)) Φ(φ, θ) dφ dθ ) dρ.

Now the claim about f ∈ L^1 follows routinely from considering the positive and negative parts of the real and imaginary parts of f in the usual way. This proves the theorem.
¥
Notation 9.11.2 Often this is written differently. Note that from the spherical coordinate formulas, f(h(φ, θ, ρ)) = f(ρω) where |ω| = 1. Letting S^{n−1} denote the unit sphere {ω ∈ R^n : |ω| = 1}, the inside integral in the above formula is sometimes written as

∫_{S^{n−1}} f(ρω) dσ,

where σ is a measure on S^{n−1}. See [27] for another description of this measure. It isn't an important issue here. Either 9.19 or the formula

∫_0^∞ ρ^{n−1} ( ∫_{S^{n−1}} f(ρω) dσ ) dρ

will be referred to as polar coordinates and is very useful in establishing estimates. Here σ(S^{n−1}) ≡ ∫_A Φ(φ, θ) dφ dθ.
Example 9.11.3 For what values of s is the integral ∫_{B(0,R)} (1 + |y|^2)^s dy bounded independent of R? Here B(0, R) is the ball {x ∈ R^n : |x| ≤ R}.

¹Actually Φ is a function of the first argument only, but this is not important in what follows.
I think you can see immediately that s must be negative, but exactly how negative? It turns out it depends on n, and using polar coordinates, you can find just exactly what is needed. From the polar coordinates formula above,

∫_{B(0,R)} (1 + |y|^2)^s dy = ∫_0^R ∫_{S^{n−1}} (1 + ρ^2)^s ρ^{n−1} dσ dρ = C_n ∫_0^R (1 + ρ^2)^s ρ^{n−1} dρ.

Now the very hard problem has been reduced to considering an easy one variable problem of finding when

∫_0^R ρ^{n−1} (1 + ρ^2)^s dρ

is bounded independent of R. Since the integrand behaves like ρ^{n−1+2s} for large ρ, this happens exactly when n − 1 + 2s < −1, that is, when s < −n/2.
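The threshold s < −n/2 shows up clearly in a numerical experiment (n = 2 here; the grid sizes and values of s are arbitrary illustrative choices):

```python
import numpy as np

def radial_integral(s, R, n=2, npts=200_001):
    """C_n times this is the integral of (1 + |y|^2)^s over B(0, R) in R^n."""
    rho = np.linspace(0.0, R, npts)
    return np.trapz(rho ** (n - 1) * (1 + rho ** 2) ** s, rho)

for s in (-0.8, -1.0, -1.5):
    print(s, [radial_integral(s, R) for R in (10.0, 100.0, 1000.0)])
# only s = -1.5 < -n/2 = -1 stays bounded as R grows; s = -1 grows like log R
```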
where here (Dg)_{ij} ≡ g_{i,j} ≡ ∂g_i/∂x_j. Also, cof(Dg)_{ij} = ∂ det(Dg) / ∂g_{i,j}.

and so

∂ det(Dg) / ∂g_{i,j} = cof(Dg)_{ij}, (9.20)

which shows the last claim of the lemma. Also

δ_{kj} det(Dg) = Σ_i g_{i,k} (cof(Dg))_{ij}. (9.21)
Subtracting the first sum on the right from both sides and using the equality of mixed partials,

Σ_i g_{i,k} Σ_j (cof(Dg))_{ij,j} = 0.

If det(g_{i,k}) ≠ 0, so that (g_{i,k}) is invertible, this shows Σ_j (cof(Dg))_{ij,j} = 0. If det(Dg) = 0, let

g_k(x) = g(x) + ε_k x,

where ε_k → 0 and det(Dg + ε_k I) ≡ det(Dg_k) ≠ 0. Then

Σ_j (cof(Dg))_{ij,j} = lim_{k→∞} Σ_j (cof(Dg_k))_{ij,j} = 0.
Definition 9.12.2 Let h be a function defined on an open set U ⊆ R^n. Then h ∈ C^k(Ū) if there exists a function g defined on an open set W containing Ū such that g = h on Ū and g is C^k(W).
In the following lemma, you could use any norm in defining the balls and everything
would work the same but I have in mind the usual norm.
Lemma 9.12.3 There does not exist h ∈ C^2(B̄(0, R)) such that h : B̄(0, R) → ∂B(0, R) which also has the property that h(x) = x for all x ∈ ∂B(0, R). Such a function is called a retraction.
Proof: Suppose such an h exists. Let λ ∈ [0, 1] and let p_λ(x) ≡ x + λ(h(x) − x). This function p_λ is called a homotopy of the identity map and the retraction h. Let

I(λ) ≡ ∫_{B(0,R)} det(Dp_λ(x)) dx.

Now by assumption, h_i(x) = x_i on ∂B(0, R), and so one can form iterated integrals and integrate by parts in each of the one dimensional integrals to obtain

I′(λ) = − Σ_i ∫_{B(0,R)} Σ_j cof(Dp_λ(x))_{ij,j} (h_i(x) − x_i) dx = 0.

Therefore, I is constant, and I(0) = m_n(B(0, R)) > 0. But

I(1) = ∫_{B(0,R)} det(Dh(x)) dm_n = ∫_{∂B(0,R)} #(y) dm_n = 0

because, from polar coordinates or other elementary reasoning, m_n(∂B(0, R)) = 0. This contradiction proves the lemma. ¥
The following is the Brouwer fixed point theorem for C 2 maps.
Lemma 9.12.4 If h ∈ C^2(B̄(0, R)) and h : B̄(0, R) → B̄(0, R), then h has a fixed point x such that h(x) = x.
Proof: Suppose the lemma is not true. Then for all x, |x − h(x)| ≠ 0. Then define

g(x) = h(x) + t(x) (x − h(x)) / |x − h(x)|,

where t(x) is nonnegative and is chosen such that g(x) ∈ ∂B(0, R). This mapping is illustrated in the following picture.

[Figure: g(x) is the point where the ray from h(x) through x meets ∂B(0, R).]
Then

H_t(x, t) = 2 ( h(x), (x − h(x)) / |x − h(x)| ) + 2t.

If this is nonzero for all x near B̄(0, R), it follows from the implicit function theorem that t is a C^2 function of x. From 9.22,

2t = −2 ( h(x), (x − h(x)) / |x − h(x)| ) ± sqrt( 4 ( h(x), (x − h(x)) / |x − h(x)| )^2 − 4 ( |h(x)|^2 − R^2 ) )
and so

H_t(x, t) = 2t + 2 ( h(x), (x − h(x)) / |x − h(x)| )
= ± sqrt( 4 ( R^2 − |h(x)|^2 ) + 4 ( h(x), (x − h(x)) / |x − h(x)| )^2 ).
If |h (x)| < R, this is nonzero. If |h (x)| = R, then it is still nonzero unless
(h (x) , x − h (x)) = 0.
But this cannot happen because the angle between h (x) and x − h (x) cannot be π/2.
Alternatively, if the above equals zero, you would need

(h(x), x) = |h(x)|^2 = R^2,

which cannot happen unless x = h(x), which is assumed not to happen. Therefore, x → t(x) is C^2 near B̄(0, R), and so g(x) given above contradicts Lemma 9.12.3. This proves the lemma. ¥
Now it is easy to prove the Brouwer fixed point theorem.
Theorem 9.12.5 Let f : B (0, R) → B (0, R) be continuous. Then f has a fixed
point.
Proof: If this is not so, there exists ε > 0 such that for all x ∈ B (0, R),
|x − f (x)| > ε.
By the Weierstrass approximation theorem, there exists h, a polynomial such that
max { |h(x) − f(x)| : x ∈ B̄(0, R) } < ε/2.

Then for all x ∈ B̄(0, R),

|x − h(x)| ≥ |x − f(x)| − |h(x) − f(x)| > ε − ε/2 = ε/2,

contradicting Lemma 9.12.4. This proves the theorem. ¥
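Brouwer's theorem is non-constructive, but for a concrete continuous self-map of the closed disk a fixed point can be located numerically. The map below is a hypothetical example; it happens to be a contraction, so simple iteration converges (for a general continuous map, which the theorem also covers, one would need a more careful search).

```python
import numpy as np

def f(p):
    """A continuous map of the closed unit disk into itself: rotate, shrink, shift."""
    x, y = p
    c, s = np.cos(1.0), np.sin(1.0)
    return 0.5 * np.array([x * c - y * s, x * s + y * c]) + np.array([0.3, 0.1])

# |f(p)| <= 0.5 |p| + |(0.3, 0.1)| < 1 on the disk, so f maps the disk into itself
p = np.zeros(2)
for _ in range(100):
    p = f(p)

print(p, np.linalg.norm(f(p) - p))   # residual is essentially zero at the fixed point
```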
9.13 Exercises
1. Recall the definition of f_y. Prove that if f ∈ L^1(R^n), then

lim_{y→0} ∫_{R^n} |f − f_y| dm_n = 0.
This is known as continuity of translation. Hint: Use the theorem about being
able to approximate an arbitrary function in L1 (Rn ) with a function in Cc (Rn ).
2. Show that if a, b ≥ 0 and if p, q > 0 are such that
$$\frac{1}{p} + \frac{1}{q} = 1,$$
then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$
Hint: You might consider, for fixed a ≥ 0, the function $h(b) \equiv \frac{a^p}{p} + \frac{b^q}{q} - ab$ and find its minimum.
3. (Hölder's inequality) Show that
$$\int |fg|\, d\mu \le \left(\int |f|^p\, d\mu\right)^{1/p} \left(\int |g|^q\, d\mu\right)^{1/q}.$$
Hint: If either of the factors on the right equals 0, explain why there is nothing to show. Now let $a = |f| \big/ \left(\int |f|^p\, d\mu\right)^{1/p}$ and $b = |g| \big/ \left(\int |g|^q\, d\mu\right)^{1/q}$ and apply the inequality of the previous problem.
4. Let E be a Lebesgue measurable set in R. Suppose m(E) > 0. Consider the set
$$E - E = \{x - y : x \in E,\ y \in E\}$$
and let $f(x) \equiv \int \mathcal{X}_E(t)\, \mathcal{X}_E(t + x)\, dm(t)$. Explain why f is continuous at 0 and f(0) > 0, and use continuity of translation in L¹ to conclude that E − E contains an open interval about 0.
5. If f ∈ L¹(Rⁿ), show there exists g ∈ L¹(Rⁿ) which is Borel measurable and satisfies g(x) = f(x) for a.e. x.
6. Suppose f, g ∈ L¹(Rⁿ). Define f ∗ g(x) by
$$f * g(x) \equiv \int f(x - y)\, g(y)\, dm_n(y).$$
Show this makes sense for a.e. x and that in fact, for a.e. x,
$$\int |f(x - y)|\, |g(y)|\, dm_n(y) < \infty.$$
Next show
$$\int |f * g(x)|\, dm_n(x) \le \int |f|\, dm_n \int |g|\, dm_n.$$
Hint: Use Problem 5. Show first there is no problem if f, g are Borel measurable. The reason for this is that you can use Fubini's theorem to write
$$\int\!\!\int |f(x - y)|\, |g(y)|\, dm_n(y)\, dm_n(x) = \int\!\!\int |f(x - y)|\, |g(y)|\, dm_n(x)\, dm_n(y) = \int |f(z)|\, dm_n \int |g(y)|\, dm_n.$$
Explain. Then explain why, if f and g are replaced by functions which are equal to f and g a.e. but are Borel measurable, the convolution is unchanged.
7. In the situation of Problem 6, show x → f ∗ g(x) is continuous whenever g is also bounded. Hint: Use Problem 1.
8. Let f : [0, ∞) → R be in L¹(R, m). The Laplace transform $\hat{f}$ is given by
$$\hat{f}(x) = \int_0^\infty e^{-xt} f(t)\, dt.$$
Let f, g be in L¹(R, m), and let $h(x) = \int_0^x f(x - t)\, g(t)\, dt$. Show h ∈ L¹ and $\hat{h} = \hat{f}\hat{g}$.
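The identity $\hat h = \hat f \hat g$ can be checked numerically for a concrete pair. In the Python sketch below (our own illustration with the choices f(t) = e^{−t}, g(t) = e^{−2t}; `laplace` is a crude trapezoid-rule quadrature), the convolution has the closed form h(x) = e^{−x} − e^{−2x}, and its transform should equal 1/((s+1)(s+2)):

```python
import math

def laplace(func, s, T=60.0, n=200000):
    # trapezoid rule for ∫_0^T e^{-st} func(t) dt; T is large enough that the
    # truncated tail is negligible for these exponentially decaying integrands
    dt = T / n
    total = 0.5 * (func(0.0) + math.exp(-s * T) * func(T))
    for k in range(1, n):
        t = k * dt
        total += math.exp(-s * t) * func(t)
    return total * dt

# f(t) = e^{-t}, g(t) = e^{-2t}: h(x) = ∫_0^x f(x-t) g(t) dt = e^{-x} - e^{-2x},
# and the convolution theorem predicts Lh(s) = 1/(s+1) * 1/(s+2)
h = lambda x: math.exp(-x) - math.exp(-2 * x)
s = 1.5
print(abs(laplace(h, s) - 1.0 / ((s + 1) * (s + 2))) < 1e-6)
```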
9. Suppose A is covered by a finite collection of balls, F. Show that then there exists a disjoint collection of these balls, $\{B_i\}_{i=1}^p$, such that $A \subseteq \cup_{i=1}^p \hat{B}_i$, where $\hat{B}_i$ has the same center as $B_i$ but 3 times the radius. Hint: Since the collection of balls is finite, they can be arranged in order of decreasing radius.
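The greedy selection suggested by the hint is easy to implement. The following Python sketch (our own illustration; the names are arbitrary) keeps a ball, in order of decreasing radius, whenever it is disjoint from the balls already kept, and then checks that the 3-times dilations of the kept balls cover every original ball:

```python
# Greedy proof-of-concept for the finite 3r covering lemma in the plane
import math, random

random.seed(2)
balls = [((random.uniform(0, 10), random.uniform(0, 10)), random.uniform(0.2, 1.5))
         for _ in range(40)]

# keep balls in order of decreasing radius whenever disjoint from those kept;
# any discarded ball then meets a kept ball of radius at least its own
kept = []
for (c, r) in sorted(balls, key=lambda b: -b[1]):
    if all(math.dist(c, c2) >= r + r2 for (c2, r2) in kept):
        kept.append((c, r))

def covered(c, r):
    # is B(c, r) inside some kept ball dilated by a factor of 3?
    return any(math.dist(c, c2) + r <= 3 * r2 for (c2, r2) in kept)

print(len(kept), all(covered(c, r) for (c, r) in balls))
```

The covering check is exactly the triangle-inequality argument of the lemma: a discarded B(c, r) meets a kept B(c₂, r₂) with r₂ ≥ r, so every point of B(c, r) is within r + r₂ + r ≤ 3r₂ of c₂.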
10. Let f be a function defined on an interval, (a, b). The Dini derivates are defined as
$$D_+ f(x) \equiv \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D^+ f(x) \equiv \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h},$$
$$D_- f(x) \equiv \liminf_{h \to 0+} \frac{f(x) - f(x - h)}{h}, \qquad D^- f(x) \equiv \limsup_{h \to 0+} \frac{f(x) - f(x - h)}{h}.$$
Suppose f is continuous on (a, b) and for all x ∈ (a, b), $D^+ f(x) \ge 0$. Show that then f is increasing on (a, b). Hint: Consider the function $H(x) \equiv f(x)(d - c) - x(f(d) - f(c))$ where a < c < d < b. Thus H(c) = H(d). Also it is easy to see that H cannot be constant if f(d) < f(c), due to the assumption that $D^+ f(x) \ge 0$. If there exists x₁ ∈ (a, b) where H(x₁) > H(c), then let x₀ ∈ (c, d) be the point where the maximum of H occurs. Consider $D^+ H(x_0)$. If, on the other hand, H(x) < H(c) for all x ∈ (c, d), then consider $D^+ H(c)$.
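The four derivates are easy to approximate for concrete functions. In the Python sketch below (our own illustration), for f(x) = |x| at x = 0 the difference quotients are constant in h, so all four limits are visible immediately: $D^+ f(0) = D_+ f(0) = 1$ and $D^- f(0) = D_- f(0) = -1$:

```python
# Finite-h approximations of the Dini derivates of f(x) = |x| at x = 0
f = abs                       # f(x) = |x|, examined at x = 0
x = 0.0
hs = [10.0**(-k) for k in range(1, 10)]
right = [(f(x + h) - f(x)) / h for h in hs]   # right difference quotients
left  = [(f(x) - f(x - h)) / h for h in hs]   # left difference quotients
print(max(right), min(right), max(left), min(left))   # -> 1.0 1.0 -1.0 -1.0
```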
11. ↑ Suppose in the situation of the above problem we only know
$$D^+ f(x) \ge 0 \quad \text{a.e.}$$
Does the conclusion still follow? What if we only know $D^+ f(x) \ge 0$ for every x outside a countable set? Hint: In the case of $D^+ f(x) \ge 0$ a.e., consider the bad function in the exercises for the chapter on the construction of measures which was based on the Cantor set. In the case where $D^+ f(x) \ge 0$ for all but countably many x, by replacing f(x) with $\tilde{f}(x) \equiv f(x) + \varepsilon x$, consider the situation where $D^+ \tilde{f}(x) > 0$ for all but countably many x. If in this situation, $\tilde{f}(c) > \tilde{f}(d)$ for some c < d, and $y \in \left(\tilde{f}(d), \tilde{f}(c)\right)$, let
$$z \equiv \sup\left\{x \in [c, d] : \tilde{f}(x) > y\right\}$$
and conclude that aside from a set of measure zero, $D^+ f(x) = D_+ f(x)$. Similar reasoning will show $D^- f(x) = D_- f(x)$ a.e. and $D^+ f(x) = D_- f(x)$ a.e., and so off some set of measure zero, we have
$$D_- f(x) = D^- f(x) = D^+ f(x) = D_+ f(x),$$
which implies the derivative exists and equals this common value. Hint: To show 9.23, let U be an open set containing $N_{pq}$ such that $m(N_{pq}) + \varepsilon > m(U)$. For each $x \in N_{pq}$ there exist y > x arbitrarily close to x such that
$$f(y) - f(x) < p(y - x).$$
Thus the set of such intervals, {[x, y]}, which are contained in U constitutes a Vitali cover of $N_{pq}$. Let $\{[x_i, y_i]\}$ be disjoint and
$$m\left(N_{pq} \setminus \cup_i [x_i, y_i]\right) = 0,$$
and so $m(N_{pq} \cap V) = m(N_{pq})$. For each $x \in N_{pq} \cap V$, there exist y > x arbitrarily close to x such that
$$f(y) - f(x) > q(y - x).$$
Thus the set of such intervals, {[x′, y′]}, which are contained in V is a Vitali cover of $N_{pq} \cap V$. Let $\{[x_i', y_i']\}$ be disjoint, and therefore $(q - p)\, m(N_{pq}) \le p\varepsilon$. Since ε > 0 is arbitrary, this proves that there is a right derivative a.e. A similar argument does the other cases.
13. Suppose f is a function in L¹(R) and f is infinitely differentiable. Does it follow that f′ ∈ L¹(R)? Hint: What if $\varphi \in C_c^\infty(0, 1)$ and $f(x) = \varphi(2^n(x - n))$ for x ∈ (n, n + 1), f(x) = 0 if x < 0?
14. For a function f ∈ L¹(Rⁿ), the Fourier transform, Ff, is given by
$$F f(t) \equiv \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-i t \cdot x} f(x)\, dx.$$
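With this normalization in one dimension, the Gaussian $e^{-x^2/2}$ should be its own Fourier transform. The Python sketch below (our own illustration; `F` is a crude Riemann-sum approximation of the defining integral, not a library routine) checks this numerically:

```python
# Check that e^{-x^2/2} is a fixed point of the Fourier transform (n = 1)
import math, cmath

def F(f, t, L=12.0, n=6000):
    # Riemann-sum approximation of (1/sqrt(2*pi)) ∫_{-L}^{L} e^{-itx} f(x) dx
    dx = 2 * L / n
    s = sum(cmath.exp(-1j * t * (-L + k * dx)) * f(-L + k * dx) for k in range(n))
    return s * dx / math.sqrt(2 * math.pi)

g = lambda x: math.exp(-x * x / 2)
err = max(abs(F(g, t) - math.exp(-t * t / 2)) for t in (0.0, 0.7, 1.5, 3.0))
print(err < 1e-6)
```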
16. For this problem define $\int_a^\infty f(t)\, dt \equiv \lim_{r \to \infty} \int_a^r f(t)\, dt$. Note this coincides with the Lebesgue integral when f ∈ L¹(a, ∞). Show

(a) $\int_0^\infty \frac{\sin(u)}{u}\, du = \frac{\pi}{2}$

(b) $\lim_{r \to \infty} \int_\delta^\infty \frac{\sin(ru)}{u}\, du = 0$ whenever δ > 0.

(c) If f ∈ L¹(R), then $\lim_{r \to \infty} \int_{\mathbb{R}} \sin(ru)\, f(u)\, du = 0$.

Hint: For the first two, use $\frac{1}{u} = \int_0^\infty e^{-ut}\, dt$ and apply Fubini's theorem to $\int_0^R \sin u \int_0^\infty e^{-ut}\, dt\, du$. For the last part, first establish it for $f \in C_c^\infty(\mathbb{R})$ and then use the density of this set in L¹(R) to obtain the result. This is called the Riemann Lebesgue lemma.
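Part (a) can be watched converge numerically. The Python sketch below (our own illustration) approximates $\int_0^r \frac{\sin u}{u}\, du$ by the midpoint rule and observes it approach π/2 as r grows, with error of size roughly 1/r:

```python
# ∫_0^r sin(u)/u du -> π/2 as r -> ∞
import math

def si(r, n=100000):
    # midpoint-rule approximation of ∫_0^r sin(u)/u du (midpoints avoid u = 0)
    du = r / n
    return sum(math.sin((k + 0.5) * du) / ((k + 0.5) * du) for k in range(n)) * du

vals = [si(r) for r in (10.0, 100.0, 1000.0)]
print([round(abs(v - math.pi / 2), 4) for v in vals])   # shrinking errors
```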
17. ↑ Suppose that g ∈ L¹(R) and that at some x > 0, g is locally Hölder continuous from the right and from the left. This means
$$\lim_{r \to 0+} g(x + r) \equiv g(x+)$$
exists,
$$\lim_{r \to 0+} g(x - r) \equiv g(x-)$$
exists, and there exist constants K, δ > 0 and r ∈ (0, 1] such that for |x − y| < δ,
$$|g(x+) - g(y)| < K|x - y|^r \ \text{ for } y > x, \qquad |g(x-) - g(y)| < K|x - y|^r \ \text{ for } y < x.$$
Show that then
$$\lim_{R \to \infty} \frac{2}{\pi} \int_0^\infty \frac{\sin(Ru)}{u}\, \frac{g(x - u) + g(x + u)}{2}\, du = \frac{g(x+) + g(x-)}{2}.$$
18. ↑ Let g ∈ L¹(R) and suppose g is locally Hölder continuous from the right and from the left at x. Show that then
$$\lim_{R \to \infty} \frac{1}{2\pi} \int_{-R}^R e^{ixt} \int_{-\infty}^\infty e^{-ity} g(y)\, dy\, dt = \frac{g(x+) + g(x-)}{2}.$$
19. ↑ Assume that g has exponential growth as above and is Hölder continuous from the right and from the left at t. Pick γ > η. Show that
$$\lim_{R \to \infty} \frac{1}{2\pi} \int_{-R}^R e^{\gamma t} e^{iyt}\, Lg(\gamma + iy)\, dy = \frac{g(t+) + g(t-)}{2}.$$
This formula is sometimes written in the form
$$\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{st}\, Lg(s)\, ds$$
and is called the complex inversion integral for Laplace transforms. It can be used
to find inverse Laplace transforms. Hint:
$$\frac{1}{2\pi} \int_{-R}^R e^{\gamma t} e^{iyt}\, Lg(\gamma + iy)\, dy = \frac{1}{2\pi} \int_{-R}^R \int_0^\infty e^{\gamma t} e^{iyt} e^{-(\gamma + iy)u}\, g(u)\, du\, dy.$$
Now use Fubini’s theorem and do the integral from −R to R to get this equal to
$$\frac{e^{\gamma t}}{\pi} \int_{-\infty}^\infty e^{-\gamma u} g(u)\, \frac{\sin(R(t - u))}{t - u}\, du,$$
where g is the zero extension of g off [0, ∞). Then this equals
$$\frac{e^{\gamma t}}{\pi} \int_{-\infty}^\infty e^{-\gamma(t - u)} g(t - u)\, \frac{\sin(Ru)}{u}\, du,$$
which equals
$$\frac{2 e^{\gamma t}}{\pi} \int_0^\infty \frac{g(t - u)\, e^{-\gamma(t - u)} + g(t + u)\, e^{-\gamma(t + u)}}{2}\, \frac{\sin(Ru)}{u}\, du,$$
and then apply the result of Problem 17.
20. Let K be a nonempty closed and convex subset of Rⁿ. Recall K is convex means that if x, y ∈ K, then for all t ∈ [0, 1], tx + (1 − t)y ∈ K. Show that if x ∈ Rⁿ there exists a unique z ∈ K such that
$$|x - z| = \min\{|x - y| : y \in K\}.$$
This z will be denoted as Px. Hint: First note you do not know K is compact. Establish the parallelogram identity if you have not already done so,
$$|u - v|^2 + |u + v|^2 = 2|u|^2 + 2|v|^2.$$
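For a concrete closed convex set the projection P is easy to compute. The Python sketch below (our own illustration, taking K = [0,1]² in R², where P is coordinatewise clamping) checks the minimizing property against random points of K and verifies the parallelogram identity from the hint:

```python
# Projection onto the closed convex box K = [0,1]^2, plus the parallelogram law
import math, random

def P(x):
    # projection onto K = [0,1]^2: clamp each coordinate
    return tuple(min(1.0, max(0.0, xi)) for xi in x)

x = (2.5, -0.7)
z = P(x)                                 # the claimed closest point of K

random.seed(3)
d = math.dist(x, z)
ok = all(math.dist(x, (random.random(), random.random())) >= d
         for _ in range(10000))          # no random point of K is closer

# parallelogram identity |u-v|^2 + |u+v|^2 = 2|u|^2 + 2|v|^2
u, v = (1.2, -0.4), (0.3, 2.0)
lhs = math.dist(u, v)**2 + sum((a + b)**2 for a, b in zip(u, v))
rhs = 2 * sum(a * a for a in u) + 2 * sum(b * b for b in v)
print(z, ok, abs(lhs - rhs) < 1e-9)
```

For the box, clamping each coordinate is exactly the nearest-point map; for a general closed convex K no such formula exists, which is why the exercise proves existence and uniqueness abstractly.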
Brouwer Degree

This chapter is on the Brouwer degree, a very useful concept with numerous and important applications. The degree can be used to prove some difficult theorems in topology such as the Brouwer fixed point theorem, the Jordan separation theorem, and the invariance of domain theorem. It is also used in bifurcation theory and many other areas in which it is an essential tool. This is an advanced calculus course, so the degree will be developed for Rⁿ. When this is understood, it is not too difficult to extend to versions of the degree which hold in Banach space. There is more on degree theory in the book by Deimling [9], and much of the presentation here follows this reference.
To give you an idea what the degree is about, consider a real valued C¹ function defined on an interval, I, and let y ∈ f(I) be such that f′(x) ≠ 0 for all x ∈ f⁻¹(y). In this case the degree is the sum of the signs of f′(x) for x ∈ f⁻¹(y), written as d(f, I, y). For example, if the graph of f crosses the horizontal line at height y four times, twice with f′ > 0 and twice with f′ < 0, then d(f, I, y) is 0 because there are two places where the sign is 1 and two where it is −1.
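This one dimensional description is easy to compute directly. In the Python sketch below (our own illustration; the crossing-detection scheme is ours), f(x) = x³ − 3x on I = (−2.5, 2.5) with y = 0 has f⁻¹(0) = {−√3, 0, √3}, where the signs of f′ are +1, −1, +1, so d(f, I, y) = 1:

```python
# d(f, I, y) = sum of sgn f'(x) over x in f^{-1}(y), computed by locating
# sign changes of f - y on a fine grid
f  = lambda x: x**3 - 3 * x
df = lambda x: 3 * x**2 - 3

def degree_1d(f, df, a, b, y, n=99991):
    # n is chosen so no root of f - y lands exactly on a grid point here
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    deg = 0
    for u, v in zip(xs, xs[1:]):
        if (f(u) - y) * (f(v) - y) < 0:           # a crossing of level y
            deg += 1 if df((u + v) / 2) > 0 else -1
    return deg

print(degree_1d(f, df, -2.5, 2.5, 0.0))
```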
The amazing thing about this is the number you obtain in this simple manner is
a specialization of something which is defined for continuous functions and which has
nothing to do with differentiability.
There are many ways to obtain the Brouwer degree. The method I will use here
is due to Heinz [23] and appeared in 1959. It involves first studying the degree for
functions in C 2 and establishing all its most important topological properties with the
aid of an integral. Then when this is done, it is very easy to extend to general continuous
functions.
When you have the topological degree, you can get all sorts of amazing theorems
like the invariance of domain theorem and others.
Definition 10.1.1 For Ω a bounded open set, denote by $C\left(\overline{\Omega}\right)$ the set of functions which are continuous on $\overline{\Omega}$ and by $C^m\left(\overline{\Omega}\right)$, m ≤ ∞, the space of restrictions of functions in $C_c^m(\mathbb{R}^n)$ to $\overline{\Omega}$. The norm in $C\left(\overline{\Omega}\right)$ is defined as follows.
$$\|f\|_\infty = \|f\|_{C\left(\overline{\Omega}\right)} \equiv \sup\left\{|f(x)| : x \in \overline{\Omega}\right\}.$$
If the functions take values in Rⁿ, write $C^m\left(\overline{\Omega}; \mathbb{R}^n\right)$ or $C\left(\overline{\Omega}; \mathbb{R}^n\right)$ for these functions if there is no differentiability assumed. The norm on $C\left(\overline{\Omega}; \mathbb{R}^n\right)$ is defined in the same way as above,
$$\|f\|_\infty = \|f\|_{C\left(\overline{\Omega}; \mathbb{R}^n\right)} \equiv \sup\left\{|f(x)| : x \in \overline{\Omega}\right\}.$$
Also, C(Ω; Rⁿ) consists of functions which are continuous on Ω that have values in Rⁿ, and C^m(Ω; Rⁿ) denotes the functions which have m continuous derivatives defined on Ω.
Theorem 10.1.2 Let Ω be a bounded open set in Rⁿ and let $f \in C\left(\overline{\Omega}\right)$. Then for each ε > 0 there exists $g \in C^\infty\left(\overline{\Omega}\right)$ with $\|g - f\|_{C\left(\overline{\Omega}\right)} < \varepsilon$. In fact, g can be assumed to equal a polynomial for all $x \in \overline{\Omega}$.
Lemma 9.12.1 on Page 246 will also play an important role in the definition of the
Brouwer degree. Earlier it made possible an easy proof of the Brouwer fixed point
theorem. Later in this chapter, it is used to show the definition of the degree is well
defined. For convenience, here it is stated again.
where here $(Dg)_{ij} \equiv g_{i,j} \equiv \frac{\partial g_i}{\partial x_j}$. Also, $\operatorname{cof}(Dg)_{ij} = \frac{\partial \det(Dg)}{\partial g_{i,j}}$.
Another simple result which will be used whenever convenient is the following lemma,
stated in somewhat more generality than needed.
Lemma 10.1.5 Let K be a compact set and C a closed set in a complete normed vector space such that K ∩ C = ∅. Then
$$d \equiv \inf\{\|k - c\| : k \in K,\ c \in C\} > 0.$$
Proof: Let
$$d \equiv \inf\{\|k - c\| : k \in K,\ c \in C\}.$$
Let {kₙ}, {cₙ} be such that
$$d + \frac{1}{n} > \|k_n - c_n\|.$$
Since K is compact, there is a subsequence still denoted by {kn } such that kn → k ∈ K.
Then also
||cn − cm || ≤ ||cn − kn || + ||kn − km || + ||cm − km ||
If d = 0, then as m, n → ∞ it follows ||cn − cm || → 0 and so {cn } is a Cauchy sequence
which must converge to some c ∈ C. But then ||c − k|| = limn→∞ ||cn − kn || = 0 and so
c = k ∈ C ∩ K, a contradiction to these sets being disjoint. This proves the lemma. ■

In particular, the distance between a point and a closed set is always positive if the point is not in the closed set. Of course this is obvious even without the above lemma.
$$h : \Omega \times [0, 1] \to \mathbb{R}^n$$
such that h(x, 1) = g(x), h(x, 0) = f(x), and x → h(x, t) ∈ $U_y$ for all t ∈ [0, 1] (that is, y ∉ h(∂Ω, t)). This function, h, is called a homotopy, and f and g are homotopic.

It is obvious that $U_y$ is an open subset of $C\left(\overline{\Omega}; \mathbb{R}^n\right)$. Next consider the claim that [f] is also an open set. If f ∈ $U_y$, there exists δ > 0 such that B(y, 2δ) ∩ f(∂Ω) = ∅. Let $f_1 \in C\left(\overline{\Omega}; \mathbb{R}^n\right)$ with $\|f_1 - f\|_\infty < \delta$. Then if t ∈ [0, 1] and x ∈ ∂Ω,
By Sard's lemma, Lemma 9.9.9 on Page 240, g(S) is a set of measure zero and so in particular contains no open ball; hence there exist regular values of g arbitrarily close to y. Let $\tilde{y}$ be one of these regular values, $|y - \tilde{y}| < \varepsilon/2$, and consider
$$g_1(x) \equiv g(x) + y - \tilde{y}.$$
It follows g₁(x) = y if and only if g(x) = $\tilde{y}$ and so, since Dg(x) = Dg₁(x), y is a regular value of g₁. Then for t ∈ [0, 1] and x ∈ ∂Ω,
Then
$$d(g, \Omega, y) \equiv \lim_{\varepsilon \to 0} \int_\Omega \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx.$$

Lemma 10.2.5 The above definition is well defined. In particular, the limit exists. In fact,
$$\int_\Omega \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx$$
does not depend on ε whenever ε is small enough. If y is a regular value for g then for all ε small enough,
$$\int_\Omega \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx \equiv \sum\left\{\operatorname{sgn}(\det Dg(x)) : x \in g^{-1}(y)\right\}. \tag{10.1}$$
If f, g are two functions in $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ such that for all x ∈ ∂Ω and all t ∈ [0, 1],
$$y \notin t\, f(x) + (1 - t)\, g(x), \tag{10.2}$$
then
$$d(f, \Omega, y) = d(g, \Omega, y).$$
Proof:
First consider the case where y is a regular value of g. I will show that in this case,
the integral expression is eventually constant for small ε > 0 and equals the right side
of 10.1. I claim the right side of this equation is actually a finite sum. This follows from
the inverse function theorem because g−1 (y) is a closed, hence compact subset of Ω due
to the assumption that y ∉ g(∂Ω). If g⁻¹(y) had infinitely many points in it, there
would exist a sequence of distinct points $\{x_k\} \subseteq g^{-1}(y)$. Since Ω is bounded, some subsequence $\{x_{k_l}\}$ would converge to a limit point $x_\infty$. By continuity of g, it follows $x_\infty \in g^{-1}(y)$ also and so $x_\infty \in \Omega$. Therefore, since y is a regular value, there is an open set, $U_{x_\infty}$, containing $x_\infty$ such that g is one to one on this open set, contradicting the assertion that $\lim_{l \to \infty} x_{k_l} = x_\infty$. Therefore, this set is finite and so the sum is well defined.
Thus the right side of 10.1 is finite when y is a regular value. Next I need to show the left side of this equation is eventually constant. By what was just shown, there are finitely many points, $\{x_i\}_{i=1}^m = g^{-1}(y)$. By the inverse function theorem, there exist disjoint open sets, $U_i$ with $x_i \in U_i$, such that g is one to one on $U_i$ with $\det(Dg(x))$ having constant sign on $U_i$ and $g(U_i)$ an open set containing y. Then let ε be small enough that $B(y, \varepsilon) \subseteq \cap_{i=1}^m g(U_i)$ and let $V_i \equiv g^{-1}(B(y, \varepsilon)) \cap U_i$.
[Figure: the disjoint sets $V_i \ni x_i$ inside the $U_i$, each mapped by g onto the ball $B(y, \varepsilon) \subseteq g(U_1) \cap g(U_2) \cap g(U_3)$.]
The reason for this is as follows. The integrand on the left is nonzero only if g(x) − y ∈ B(0, ε), which occurs only if g(x) ∈ B(y, ε), which is the same as x ∈ g⁻¹(B(y, ε)). Therefore, the integrand is nonzero only if x is contained in exactly one of the disjoint sets, $V_i$. Now using the change of variables theorem,
$$= \sum_{i=1}^m \int_{g(V_i) - y} \varphi_\varepsilon(z)\, \det Dg\left(g^{-1}(y + z)\right) \left|\det Dg^{-1}(y + z)\right| dz.$$
By the chain rule, $I = Dg\left(g^{-1}(y + z)\right) Dg^{-1}(y + z)$ and so
$$\det Dg\left(g^{-1}(y + z)\right)\left|\det Dg^{-1}(y + z)\right| = \operatorname{sgn}\left(\det Dg\left(g^{-1}(y + z)\right)\right) \left|\det Dg\left(g^{-1}(y + z)\right)\right|\left|\det Dg^{-1}(y + z)\right|$$
$$= \operatorname{sgn}\left(\det Dg\left(g^{-1}(y + z)\right)\right),$$
which equals $\operatorname{sgn}(\det Dg(x_i))$ for $z \in g(V_i) - y$. Therefore, the expression above equals
$$\sum_{i=1}^m \operatorname{sgn}(\det Dg(x_i)) \int_{B(0, \varepsilon)} \varphi_\varepsilon(z)\, dz = \sum_{i=1}^m \operatorname{sgn}(\det Dg(x_i)).$$
In case g⁻¹(y) = ∅, there exists ε > 0 such that $g\left(\overline{\Omega}\right) \cap B(y, \varepsilon) = \emptyset$ and so for ε this small,
$$\int_\Omega \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx = 0.$$
With this done it is necessary to show that the integral in the definition of the degree
is constant for small enough ε even if y is not a regular value. To do this, I will first
show that if 10.2 holds, then 10.3 holds. This particular part of the argument is the
trick which makes surprising things happen. This is where the fact the functions are
twice continuously differentiable is used. Suppose then that f , g satisfy 10.2. Also let
ε > 0 be such that for all t ∈ [0, 1] ,
In this formula, the function det is considered as a function of the n² entries in the n × n matrix, and the subscript αj represents the derivative with respect to the αjᵗʰ entry. Now as in the proof of Lemma 9.12.1 on Page 246, and so
$$B = \int_\Omega \sum_\alpha \sum_j \varphi_\varepsilon(f - y + t(g - f)) \cdot \cdots$$
is in $C_c^1(\Omega)$ because if x ∈ ∂Ω, it follows by 10.4 that for all t ∈ [0, 1], $\varphi_\varepsilon(f(x) - y + t(g(x) - f(x))) = 0$. Furthermore, this situation persists for x near ∂Ω. Therefore, integrate by parts and write
$$B = -\int_\Omega \sum_\alpha \sum_j \frac{\partial}{\partial x_j}\left(\varphi_\varepsilon(f - y + t(g - f))\right) \cdot \cdots$$
The second term equals zero by Lemma 10.1.4. Simplifying the first term yields
$$B = -\int_\Omega \sum_\alpha \sum_j \sum_\beta \varphi_{\varepsilon,\beta}(f - y + t(g - f)) \cdot \cdots = -\int_\Omega \sum_\alpha \varphi_{\varepsilon,\alpha}(f - y + t(g - f)) \cdots = -A.$$
Therefore, H′(t) = 0 and so H is a constant.
Now let $g \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$. By Sard's lemma, Lemma 9.9.9, there exists a regular value y₁ of g which is very close to y. This is because, by this lemma, the set of points which are not regular values has measure zero, so this set of points must have empty interior. Let
$$g_1(x) \equiv g(x) + y - y_1$$
and let y₁ − y be so small that for all t ∈ [0, 1],
$$y \notin ((1 - t)\, g_1 + t\, g)(\partial\Omega) \equiv (g_1 + t(g - g_1))(\partial\Omega).$$
Then g₁(x) = y if and only if g(x) = y₁, which is a regular value. Note also D(g(x)) = D(g₁(x)). Then from what was just shown, letting f = g and g = g₁ in the above and using g − y₁ = g₁ − y,
$$\int_\Omega \varphi_\varepsilon(g(x) - y_1)\, \det(D(g(x)))\, dx = \int_\Omega \varphi_\varepsilon(g_1(x) - y)\, \det(D(g(x)))\, dx = \int_\Omega \varphi_\varepsilon(g(x) - y)\, \det(D(g(x)))\, dx.$$
Since y1 is a regular value of g it follows from the first part of the argument that the
first integral in the above is eventually constant for small enough ε. It follows the last
integral is also eventually constant for small enough ε. This proves the claim about
the limit existing and in fact being constant for small ε. The last claim follows right
away from the above. Suppose 10.2 holds. Then choosing ε small enough, it follows
d (f , Ω, y) = d (g, Ω, y) because the two integrals defining the degree for small ε are
equal. This proves the lemma. ■
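The integral in the definition can be approximated directly for a concrete map. In the Python sketch below (our own illustration; a Gaussian of unit mass stands in for the mollifier $\varphi_\varepsilon$), g is the complex squaring map z → z² viewed as a map of R², Ω is the unit disk, and y = (0.25, 0) is a regular value with the two preimages ±(0.5, 0), at both of which det Dg > 0, so the computed value should be close to 2:

```python
# Approximate ∫_Ω φ_ε(g(x) - y) det Dg(x) dx for g(x1,x2) = (x1²-x2², 2 x1 x2)
import math

eps, n, R = 0.04, 600, 1.0
h = 2.0 * R / n
total = 0.0
for i in range(n):
    x1 = -R + (i + 0.5) * h
    for j in range(n):
        x2 = -R + (j + 0.5) * h
        if x1 * x1 + x2 * x2 > R * R:
            continue                      # integrate over the disk Ω = B(0,1) only
        u = x1 * x1 - x2 * x2 - 0.25      # first component of g(x) - y
        v = 2.0 * x1 * x2                 # second component of g(x) - y
        phi = math.exp(-(u * u + v * v) / (2 * eps * eps)) / (2 * math.pi * eps * eps)
        total += phi * 4.0 * (x1 * x1 + x2 * x2) * h * h   # det Dg = 4(x1² + x2²)
print(round(total))
```

Each preimage contributes the mollifier's unit mass with the sign of det Dg, exactly as in the change of variables computation above, so the grid sum lands near +1 + 1 = 2.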
Next I will show that if f ∼ g where $f, g \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$, then d(f, Ω, y) = d(g, Ω, y). In the special case where y ∉ (tf + (1 − t)g)(∂Ω) for all t ∈ [0, 1], this has already been done in the above lemma. In the following lemma, the two functions k, l are only assumed to be continuous.
Proof: This lemma is not really very surprising. By Lemma 10.2.3, [k] is an open set, and since everything in [k] is homotopic to k, it is also connected because it is path connected. The lemma merely asserts there exists a piecewise linear curve joining k and l which stays within the open connected set, [k], in such a way that the vertices of this curve are in the dense set $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$. This is the abstract idea. Now here is a more down to earth treatment.
where δ > 0 is small. By Lemma 10.2.3, for each i ∈ {1, ···, m}, there exists $g_i \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ such that
$$\|g_i - h(\cdot, t_i)\|_\infty < \delta. \tag{10.7}$$
Thus
$$\|g_i - g_{i-1}\|_\infty \le \|g_i - h(\cdot, t_i)\|_\infty + \|h(\cdot, t_i) - h(\cdot, t_{i-1})\|_\infty + \|g_{i-1} - h(\cdot, t_{i-1})\|_\infty < 3\delta.$$
(Recall g₀ ≡ k in case i = 1.) It was just shown that for each x ∈ ∂Ω and all t ∈ [0, 1], |h(x, t) − y| > 5δ. This proves the lemma. ■
With this lemma, the homotopy invariance of the degree on $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ is easy to obtain. The following theorem gives this homotopy invariance and summarizes one of the results of Lemma 10.2.5.

Theorem 10.2.7 Let $f, g \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ and suppose f ∼ g. Then
$$d(f, \Omega, y) = d(g, \Omega, y).$$
When $f \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ and y is a regular value of g with f ∼ g,
$$d(f, \Omega, y) = \sum\left\{\operatorname{sgn}(\det Dg(x)) : x \in g^{-1}(y)\right\}.$$
Proof: From Lemma 10.2.6 there exists a sequence of functions in $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ having the properties listed there. Then from Lemma 10.2.5, the degrees of consecutive functions in the sequence agree, and so d(f, Ω, y) = d(g, Ω, y).
The second assertion follows from Lemma 10.2.5. Finally, consider the claim that the degree is an integer. This is obvious if y is a regular point. If y is not a regular point, let y₁ be a regular value close to y, let
$$g_1(x) \equiv g(x) + y - y_1,$$
with
$$y \notin (t\, g_1 + (1 - t)\, g)(\partial\Omega).$$
Letting
$$f_1(x) \equiv f(x) + y - y_1,$$
it follows
$$y \notin (t\, f + (1 - t)\, f_1)(\partial\Omega).$$
Then from Lemma 10.2.5,
$$d(f, \Omega, y) = d(f_1, \Omega, y) = \lim_{\varepsilon \to 0} \int_\Omega \varphi_\varepsilon(f_1(x) - y)\, \det Df(x)\, dx = \lim_{\varepsilon \to 0} \int_\Omega \varphi_\varepsilon(f(x) - y_1)\, \det Df(x)\, dx \equiv d(f, \Omega, y_1).$$
$$d(f, \Omega, y) \equiv d(g, \Omega, y)$$
where $g \in U_y \cap C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ and f ∼ g.
Theorem 10.2.9 The definition of the degree given in Definition 10.2.8 is well defined, equals an integer, and satisfies the following properties. In what follows, id(x) = x.

1. d(id, Ω, y) = 1 if y ∈ Ω.

2. If $\Omega_i \subseteq \Omega$, $\Omega_i$ open, and $\Omega_1 \cap \Omega_2 = \emptyset$, and if $y \notin f\left(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2)\right)$, then d(f, Ω₁, y) + d(f, Ω₂, y) = d(f, Ω, y).

3. If $y \notin f\left(\overline{\Omega} \setminus \Omega_1\right)$ and Ω₁ is an open subset of Ω, then
$$d(f, \Omega, y) = d(f, \Omega_1, y).$$
Proof: First it is necessary to show the definition is well defined. There are two
parts to this. First I need to show there exists g with the desired properties and then
I need to show that it doesn’t matter which g I happen to pick. The first part is easy.
Let δ be small enough that
$$B(y, \delta) \cap f(\partial\Omega) = \emptyset.$$
Then by Lemma 10.2.3 there exists $g \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ such that $\|g - f\|_\infty < \delta$. It follows that for t ∈ [0, 1],
$$y \notin (t\, g + (1 - t)\, f)(\partial\Omega)$$
and so g ∼ f . This does the first part. Now consider the second part. Suppose g ∼ f
and g1 ∼ f . Then by Lemma 10.2.3 again
g ∼ g1
and by Theorem 10.2.7 it follows d (g,Ω, y) = d (g1 ,Ω, y) which shows the definition is
well defined. Also d (f ,Ω, y) must be an integer because it equals d (g,Ω, y) which is an
integer.
Now consider the properties. The first one is obvious from Theorem 10.2.7 since y
is a regular point of id.
Consider the second property. The assumption implies
$$y \notin (tg + (1-t)f)(\partial\Omega), \qquad y \notin (tg + (1-t)f)(\partial\Omega_1), \qquad y \notin (tg + (1-t)f)(\partial\Omega_2) \tag{10.10}$$
for all t ∈ [0, 1]. Then it follows from Lemma 10.2.5, for all ε small enough,
$$d(g, \Omega, y) = \int_\Omega \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx.$$
For such small ε,
$$\varphi_\varepsilon(g(x) - y) = 0 \ \text{ if } x \notin \Omega_1 \cup \Omega_2,$$
and so
$$d(g, \Omega, y) = \lim_{\varepsilon \to 0}\left(\int_{\Omega_1} \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx + \int_{\Omega_2} \varphi_\varepsilon(g(x) - y)\, \det Dg(x)\, dx\right)$$
$$= d(g, \Omega_1, y) + d(g, \Omega_2, y) = d(f, \Omega_1, y) + d(f, \Omega_2, y),$$
and so
$$d(f, \Omega, y) = d(g, \Omega, y) = d(f, \Omega_1, y) + d(f, \Omega_2, y).$$
The seventh claim is done already for the case where $f \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ in Theorem 10.2.7. It remains to verify this for the case where f is only continuous. This will be done by showing y → d(f, Ω, y) is continuous. Let y₀ ∈ Rⁿ \ f(∂Ω) and let δ be small enough that
$$B(y_0, 4\delta) \cap f(\partial\Omega) = \emptyset.$$
Now let $g \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ be such that $\|g - f\|_\infty < \delta$. Then for x ∈ ∂Ω, t ∈ [0, 1], and y ∈ B(y₀, δ), the homotopy tg + (1 − t)f misses y on ∂Ω, and so
$$d(f, \Omega, y) = d(g, \Omega, y).$$
Lemma 10.3.2 Let $g \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ be an odd map. Then for every ε > 0, there exists $h \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ such that h is also an odd map, $\|h - g\|_\infty < \varepsilon$, and 0 is a regular value of h.
Proof: In this argument η > 0 will be a small positive number and C will be a constant which depends only on the diameter of Ω. Let h₀(x) = g(x) + ηx where η is chosen such that det Dh₀(0) ≠ 0. Now let $\Omega_i \equiv \{x \in \Omega : x_i \neq 0\}$. In other words, leave out the plane $x_i = 0$ from Ω in order to obtain $\Omega_i$. A succession of modifications is about to take place on Ω₁, Ω₁ ∪ Ω₂, etc. Finally a function will be obtained on $\cup_{j=1}^n \Omega_j$, which is everything except 0.
Define $h_1(x) \equiv h_0(x) - y^1 x_1^3$, where $|y^1| < \eta$ and $y^1 = (y_1^1, \cdots, y_n^1)$ is a regular value of the function $x \to \frac{h_0(x)}{x_1^3}$ for x ∈ Ω₁. The existence of $y^1$ follows from Sard's lemma because this function is in C²(Ω₁; Rⁿ). Thus h₁(x) = 0 if and only if $y^1 = \frac{h_0(x)}{x_1^3}$.
Since $y^1$ is a regular value, it follows that for such x,
$$\det\left(\frac{h_{0i,j}(x)\, x_1^3 - \frac{\partial}{\partial x_j}\left(x_1^3\right) h_{0i}(x)}{x_1^6}\right) = \det\left(\frac{h_{0i,j}(x)\, x_1^3 - \frac{\partial}{\partial x_j}\left(x_1^3\right) y_i^1 x_1^3}{x_1^6}\right) \neq 0,$$
implying that
$$\det\left(h_{0i,j}(x) - \frac{\partial}{\partial x_j}\left(x_1^3\right) y_i^1\right) = \det(Dh_1(x)) \neq 0.$$
This shows 0 is a regular value of h₁ on the set Ω₁, and it is clear h₁ is an odd map in $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ and $\|h_1 - g\|_\infty \le C\eta$, where C depends only on the diameter of Ω.
Now suppose for some k such that 1 ≤ k < n there exists an odd mapping $h_k$ in $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ such that 0 is a regular value of $h_k$ on $\cup_{i=1}^k \Omega_i$ and $\|h_k - g\|_\infty \le C\eta$. Sard's theorem implies there exists $y^{k+1}$, a regular value of the function $x \to h_k(x)/x_{k+1}^3$ defined on $\Omega_{k+1}$, such that $|y^{k+1}| < \eta$, and let $h_{k+1}(x) \equiv h_k(x) - y^{k+1} x_{k+1}^3$. As before, $h_{k+1}(x) = 0$ if and only if $h_k(x)/x_{k+1}^3 = y^{k+1}$, a regular value of $x \to h_k(x)/x_{k+1}^3$.
Consider such x for which $h_{k+1}(x) = 0$. First suppose $x \in \Omega_{k+1}$. Then
$$\det\left(\frac{h_{ki,j}(x)\, x_{k+1}^3 - \frac{\partial}{\partial x_j}\left(x_{k+1}^3\right) y_i^{k+1} x_{k+1}^3}{x_{k+1}^6}\right) \neq 0. \tag{10.11}$$
However, if $x \in \cup_{i=1}^k \Omega_i$ but $x \notin \Omega_{k+1}$, then $x_{k+1} = 0$ and so the left side of 10.11 reduces to $\det(h_{ki,j}(x))$, which is not zero because 0 is assumed a regular value of $h_k$. Therefore, 0 is a regular value for $h_{k+1}$ on $\cup_{i=1}^{k+1} \Omega_i$. (For $x \in \cup_{i=1}^{k+1} \Omega_i$, either $x \in \Omega_{k+1}$ or $x \notin \Omega_{k+1}$. If $x \in \Omega_{k+1}$, 0 is a regular value by the construction above. In the other case, 0 is a regular value by the induction hypothesis.) Also $h_{k+1}$ is odd and in $C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$, and $\|h_{k+1} - g\|_\infty \le C\eta$.
Let h ≡ hₙ. Then 0 is a regular value of h for $x \in \cup_{j=1}^n \Omega_j$. The only point of Ω which is not in $\cup_{j=1}^n \Omega_j$ is 0. If x = 0, then from the construction, Dh(0) = Dh₀(0), and so 0 is a regular value of h for x ∈ Ω. By choosing η small enough, it follows $\|h - g\|_\infty < \varepsilon$. This proves the lemma. ■
Theorem 10.3.3 (Borsuk) Let $f \in C\left(\overline{\Omega}; \mathbb{R}^n\right)$ be odd and let Ω be symmetric with 0 ∉ f(∂Ω). Then d(f, Ω, 0) equals an odd integer.
Proof: Let δ > 0 be small, let $g_1 \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ be such that $\|f - g_1\|_\infty < \delta$, and let g denote the odd part of g₁. Thus
$$g(x) \equiv \frac{1}{2}\left(g_1(x) - g_1(-x)\right).$$
Since f is odd,
$$|f(x) - g(x)| = \left|\frac{1}{2}(f(x) - f(-x)) - \frac{1}{2}(g_1(x) - g_1(-x))\right| \le \frac{1}{2}|f(x) - g_1(x)| + \frac{1}{2}|f(-x) - g_1(-x)| < \delta.$$
Thus $\|f - g\|_\infty < \delta$ also. By Lemma 10.3.2 there exists an odd $h \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ for which 0 is a regular value and $\|h - g\|_\infty < \delta$. Therefore, from the definition of the degree, d(f, Ω, 0) = d(h, Ω, 0).
Since 0 is a regular point of h, $h^{-1}(0) = \{x_i, -x_i, 0\}_{i=1}^m$, and since h is odd, Dh(−xᵢ) = Dh(xᵢ) and so
$$d(h, \Omega, 0) \equiv \sum_{i=1}^m \operatorname{sgn}\det(Dh(x_i)) + \sum_{i=1}^m \operatorname{sgn}\det(Dh(-x_i)) + \operatorname{sgn}\det(Dh(0)),$$
an odd integer. ■
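A one dimensional instance of Borsuk's theorem can be checked by hand and by machine. In the Python sketch below (our own illustration), the odd map f(x) = x³ − x on the symmetric interval (−2, 2) has f⁻¹(0) = {−1, 0, 1}, and the sum of the signs of f′ at those points is +1 − 1 + 1 = 1, an odd integer; note that by oddness of f the contributions at x and −x always agree, which is why the parity is forced:

```python
# d(f, (-2,2), 0) for the odd map f(x) = x^3 - x via the sum-of-signs formula
roots = (-1.0, 0.0, 1.0)               # f^{-1}(0)
dfs = [3 * r * r - 1 for r in roots]   # f'(x) = 3x^2 - 1 at each root
deg = sum(1 if d > 0 else -1 for d in dfs)
print(deg, deg % 2)                    # an odd integer
```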
10.4 Applications
With these theorems it is possible to give easy proofs of some very important and
difficult theorems.
As a first application, consider the invariance of domain theorem. This result says
that a one to one continuous map takes open sets to open sets. It is an amazing result
which is essential to understand if you wish to study manifolds. In fact, the following
theorem only requires f to be locally one to one. First here is a lemma which has the
main idea.
Lemma 10.4.2 Let g : B(0, r) → Rⁿ be one to one and continuous, where here B(0, r) is the ball centered at 0 of radius r in Rⁿ. Then there exists δ > 0 such that
$$g(0) + B(0, \delta) \subseteq g(B(0, r)).$$
The symbol on the left means: {g(0) + x : x ∈ B(0, δ)}.
Proof: For t ∈ [0, 1], let
$$h(x, t) \equiv g\left(\frac{x}{1 + t}\right) - g\left(\frac{-tx}{1 + t}\right).$$
Then for x ∈ ∂B(0, r), h(x, t) ≠ 0, because if this were not so, the fact that g is one to one would imply
$$\frac{x}{1 + t} = \frac{-tx}{1 + t},$$
and this requires x = 0, which is not the case. Since ∂B(0, r) × [0, 1] is compact, there exists δ > 0 such that for all t ∈ [0, 1] and x ∈ ∂B(0, r),
$$h(x, t) \notin B(0, \delta).$$
In particular, when t = 0, B(0, δ) is contained in a single component of
$$\mathbb{R}^n \setminus (g - g(0))(\partial B(0, r))$$
and so
$$d(g - g(0), B(0, r), z)$$
is constant for z ∈ B(0, δ). Therefore, from the properties of the degree in Theorem 10.2.9,
$$d(g - g(0), B(0, r), z) = d(g - g(0), B(0, r), 0) = d(h(\cdot, 0), B(0, r), 0) = d(h(\cdot, 1), B(0, r), 0) \neq 0,$$
the last assertion following from Borsuk's theorem, Theorem 10.3.3, and the observation that h(·, 1) is odd. From Theorem 10.2.9 again, it follows that for all z ∈ B(0, δ) there exists x ∈ B(0, r) such that
$$z = g(x) - g(0),$$
which shows g(0) + B(0, δ) ⊆ g(B(0, r)). This proves the lemma. ■
Now with this lemma, it is easy to prove the very important invariance of domain
theorem.
Theorem 10.4.3 (invariance of domain) Let Ω be any open subset of Rⁿ and let f : Ω → Rⁿ be continuous and locally one to one. Then f maps open subsets of Ω to open sets in Rⁿ.
Proof: Let B (x0 , r) ⊆ U ⊆ Ω where f is one to one on B (x0 , r) and U is an open
subset of Ω. Let g be defined on B (0, r) given by
g (x) ≡ f (x + x0 )
Then g satisfies the conditions of Lemma 10.4.2, being one to one and continuous. It
follows from that lemma there exists δ > 0 such that
$$f(U) \supseteq f(B(x_0, r)) = f(x_0 + B(0, r)) = g(B(0, r)) \supseteq g(0) + B(0, \delta) = f(x_0) + B(0, \delta) = B(f(x_0), \delta).$$
This shows that for any x₀ ∈ U, f(x₀) is an interior point of f(U), which shows f(U) is open. This proves the theorem. ■
Corollary 10.4.4 If n > m there does not exist a continuous one to one map from Rⁿ to Rᵐ.
Proof: By the invariance of domain theorem, f(Rⁿ) is an open set. It is also true that f(Rⁿ) is a closed set. Here is why. If f(xₖ) → y, the growth condition ensures that {xₖ} is a bounded sequence. Taking a subsequence which converges to x ∈ Rⁿ and using the continuity of f, it follows f(x) = y. Thus f(Rⁿ) is both open and closed, which implies f must be an onto map, since otherwise Rⁿ would not be connected. ■
The next theorem is the famous Brouwer fixed point theorem.
Proof: Consider h(x, t) ≡ tf(x) − x for t ∈ [0, 1]. Then if there is no fixed point in B for f, it follows that 0 ∉ h(∂B, t) for all t. When t = 1, this follows from there being no fixed point for f. If t < 1, then if this were not so, then for some x ∈ ∂B,
$$t\, f(x) = x.$$
Definition 10.4.7 f is a retraction of $\overline{B(0, r)}$ onto ∂B(0, r) if f is continuous, $f\left(\overline{B(0, r)}\right) \subseteq \partial B(0, r)$, and f(x) = x for all x ∈ ∂B(0, r).
Theorem 10.4.8 There does not exist a retraction of $\overline{B(0, r)}$ onto its boundary, ∂B(0, r).
Proof: Suppose f were such a retraction. Then for all x ∈ ∂B(0, r), f(x) = x and so, from the properties of the degree (the one which says that if two functions agree on ∂Ω, then they have the same degree),
$$d(f, B(0, r), 0) = d(\text{id}, B(0, r), 0) = 1.$$
But this is impossible because $0 \notin f\left(\overline{B(0, r)}\right) \subseteq \partial B(0, r)$, which forces d(f, B(0, r), 0) = 0. ■
Theorem 10.4.9 Let Ω be a symmetric open set in Rn such that 0 ∈ Ω and let
f : ∂Ω → V be continuous where V is an m dimensional subspace of Rn , m < n. Then
f (−x) = f (x) for some x ∈ ∂Ω.
Proof: Suppose not. Using the Tietze extension theorem, extend f to all of $\overline{\Omega}$, with $f\left(\overline{\Omega}\right) \subseteq V$. (Here the extended function is also denoted by f.) Let g(x) = f(x) − f(−x). Then 0 ∉ g(∂Ω) and so for some r > 0, B(0, r) ⊆ Rⁿ \ g(∂Ω). For z ∈ B(0, r),
$$d(g, \Omega, z) = d(g, \Omega, 0) \neq 0,$$
and consequently
$$V \supseteq g(\Omega) \supseteq B(0, r),$$
contradicting V being an m dimensional subspace with m < n.
Theorem 10.4.10 Let n be odd and let Ω be an open bounded set in Rⁿ with 0 ∈ Ω. Suppose f : ∂Ω → Rⁿ \ {0} is continuous. Then for some x ∈ ∂Ω and λ ≠ 0, f(x) = λx.
Proof: Using the Tietze extension theorem, extend f to all of $\overline{\Omega}$. Also denote the extended function by f. Suppose for all x ∈ ∂Ω, f(x) ≠ λx for all λ ∈ R. Then
$$0 \notin t\, f(x) + (1 - t)\, x, \quad (x, t) \in \partial\Omega \times [0, 1],$$
$$0 \notin t\, f(x) - (1 - t)\, x, \quad (x, t) \in \partial\Omega \times [0, 1].$$
Thus there exists a homotopy of f and id and a homotopy of f and −id. Then by the homotopy invariance of degree,
$$d(f, \Omega, 0) = d(\text{id}, \Omega, 0), \qquad d(f, \Omega, 0) = d(-\text{id}, \Omega, 0).$$
But this is impossible because d(id, Ω, 0) = 1 while d(−id, Ω, 0) = (−1)ⁿ = −1. This proves the theorem. ■
$$\|f - f_1\|_\infty \le \|f - f_0\|_\infty + \|f_0 - f_1\|_\infty = \|f - f_0\|_\infty + |\tilde{y}_1 - y_1| < \frac{\delta}{3r} + \frac{\delta}{2}.$$
Suppose now there exists $f_k \in C^2\left(\overline{\Omega}; \mathbb{R}^n\right)$ with each of the $y_i$ for i = 1, ···, k a regular value of $f_k$ and
$$\|f - f_k\|_\infty < \frac{\delta}{2} + \left(\frac{k}{r}\right)\frac{\delta}{3}.$$
Then letting $S_k$ denote the singular values of $f_k$, Sard's theorem implies there exists $\tilde{y}_{k+1}$ such that
$$|\tilde{y}_{k+1} - y_{k+1}| < \frac{\delta}{3r}$$
and
$$\tilde{y}_{k+1} \notin S_k \cup \cup_{i=1}^k (S_k + y_{k+1} - y_i). \tag{10.12}$$
Let
$$f_{k+1}(x) \equiv f_k(x) + y_{k+1} - \tilde{y}_{k+1}. \tag{10.13}$$
If $f_{k+1}(x) = y_i$ for some i ≤ k, then
$$f_k(x) + y_{k+1} - y_i = \tilde{y}_{k+1},$$
and since by 10.12, $\tilde{y}_{k+1} \notin S_k + y_{k+1} - y_i$, it follows that $f_k(x) \notin S_k$. Therefore, for i ≤ k, $y_i$ is a regular value of $f_{k+1}$ since by 10.13, $Df_{k+1} = Df_k$. Now suppose $f_{k+1}(x) = y_{k+1}$. Then
$$y_{k+1} = f_k(x) + y_{k+1} - \tilde{y}_{k+1},$$
Let $\tilde{f} \equiv f_r$. Then
$$\left\|\tilde{f} - f\right\|_\infty < \frac{\delta}{2} + \frac{\delta}{3} < \delta.$$
The product formula considers the situation depicted in the following diagram, in which y ∉ g(f(∂Ω)) and the $K_i$ are the connected components of Rⁿ \ f(∂Ω).
$$\Omega \xrightarrow{\ f\ } f\left(\overline{\Omega}\right) \xrightarrow{\ g\ } \mathbb{R}^n, \qquad y \in \mathbb{R}^n \setminus f(\partial\Omega) = \cup_i K_i.$$
[Diagram: f carries Ω over the components K₁, K₂, K₃ of Rⁿ \ f(∂Ω), and g carries K₁ to y.]
Lemma 10.5.3 Let $f \in C\left(\overline{\Omega}; \mathbb{R}^n\right)$, $g \in C^2(\mathbb{R}^n, \mathbb{R}^n)$, and y ∉ g(f(∂Ω)). Suppose also that y is a regular value of g. Then the following product formula holds, where $K_i$ are the bounded components of Rⁿ \ f(∂Ω).
$$d(g \circ f, \Omega, y) = \sum_{i=1}^\infty d(f, \Omega, K_i)\, d(g, K_i, y).$$
$$= \sum_{i=1}^\infty \sum_{j=1}^{m_i} \sum_{z \in \tilde{f}^{-1}(x_{ij})} \operatorname{sgn}\det Dg\Big(\overbrace{\tilde{f}(z)}^{x_{ij}}\Big)\, \operatorname{sgn}\det D\tilde{f}(z)$$
$$= \sum_{i=1}^\infty \sum_{j=1}^{m_i} \operatorname{sgn}\left(\det Dg\left(x_{ij}\right)\right) d\left(\tilde{f}, \Omega, x_{ij}\right) = \sum_{i=1}^\infty d(g, K_i, y)\, d\left(\tilde{f}, \Omega, x_{ij}\right)$$
$$= \sum_{i=1}^\infty d(g, K_i, y)\, d(f, \Omega, K_i).$$
Theorem 10.5.4 (product formula) Let {K_i}_{i=1}^∞ be the bounded components of ℝⁿ \ f(∂Ω) for f ∈ C(Ω̄; ℝⁿ), let g ∈ C(ℝⁿ, ℝⁿ), and suppose that y ∉ g(f(∂Ω)). Then

d(g ∘ f, Ω, y) = Σ_{i=1}^∞ d(g, K_i, y) d(f, Ω, K_i).   (10.16)
|g(f(x)) + t( g̃(f(x)) − g(f(x)) ) − y| ≥ |g(f(x)) − y| − t |g̃(f(x)) − g(f(x))| ≥ 3δ − tδ > 0.

It follows that

d(g ∘ f, Ω, y) = d(g̃ ∘ f, Ω, y).   (10.17)

Now also, ∂K_i ⊆ f(∂Ω), and so if z ∈ ∂K_i, then g(z) ∈ g(f(∂Ω)). Consequently, for such z,

|g(z) + t( g̃(z) − g(z) ) − y| ≥ |g(z) − y| − tδ ≥ 3δ − tδ > 0,

which shows that, by homotopy invariance,

d(g, K_i, y) = d(g̃, K_i, y).   (10.18)
and the sum has only finitely many nonzero terms. This proves the product formula. Note there are no convergence problems: these sums are actually finite sums because, as in the previous lemma, g⁻¹(y) ∩ f(Ω̄) is a compact set covered by the components of ℝⁿ \ f(∂Ω), and so it is covered by finitely many of these components. For the other components, d(f, Ω, K_i) = 0 or else d(g, K_i, y) = 0. ■
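In one dimension the product formula can be checked directly (an illustrative sketch; the maps and values below are assumptions, not from the text). For continuous f on [a, b] with y ∉ f({a, b}), the one-dimensional degree is d(f, (a, b), y) = (sgn(f(b) − y) − sgn(f(a) − y))/2.

```python
import numpy as np

# 1-D degree via the endpoint formula.
def deg1(f, a, b, y):
    return (np.sign(f(b) - y) - np.sign(f(a) - y)) / 2

f = lambda x: 2 * x - 1          # f(0) = -1, f(1) = 1
g = lambda z: z ** 2
y = 0.25                         # y not in g(f({0, 1})) = {1}

lhs = deg1(lambda x: g(f(x)), 0.0, 1.0, y)

# The only bounded component of R \ f({0, 1}) is K1 = (-1, 1);
# d(f, Omega, K1) is evaluated at any point of K1, e.g. 0.
rhs = deg1(f, 0.0, 1.0, 0.0) * deg1(g, -1.0, 1.0, y)

assert lhs == rhs  # both sides vanish for this choice
```

Here d(f, Ω, K₁) = 1 but d(g, K₁, y) = 0, so both sides of the product formula are 0.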
The following theorem is the Jordan separation theorem, a major result. A homeomorphism is a function which is one to one, onto, and continuous, having a continuous inverse. Before the theorem, here is a helpful lemma.
Lemma 10.5.5 Let Ω be a bounded open set in ℝⁿ, f ∈ C(Ω̄; ℝⁿ), and suppose {Ω_i}_{i=1}^∞ are disjoint open sets contained in Ω such that

y ∉ f( Ω̄ \ ∪_{j=1}^∞ Ω_j ).

Then

d(f, Ω, y) = Σ_{j=1}^∞ d(f, Ω_j, y),

where the sum has all but finitely many terms equal to 0.
Proof: By assumption, the compact set f⁻¹(y) has empty intersection with

Ω̄ \ ∪_{j=1}^∞ Ω_j

and so this compact set is covered by finitely many of the Ω_j, say {Ω₁, ···, Ω_{n−1}}, and

y ∉ f( ∪_{j=n}^∞ Ω_j ).

Letting O ≡ ∪_{j=n}^∞ Ω_j, so that d(f, O, y) = 0,

d(f, Ω, y) = Σ_{j=1}^{n−1} d(f, Ω_j, y) + d(f, O, y) = Σ_{j=1}^∞ d(f, Ω_j, y).
Lemma 10.5.6 Define ∂U to be those points x with the property that for every r > 0, B(x, r) contains points of U and points of U^C. Then for U an open set,

∂U = Ū \ U.

Let C be a closed subset of ℝⁿ and let K denote the set of components of ℝⁿ \ C. Then if K is one of these components, it is open and

∂K ⊆ C,

because by Lemma 10.5.6, ∂K ⊆ C, and on C the extension agrees with f. Thus, since f(∂K) ⊆ f(C), the right side is of the form

f⁻¹( f(∂K) ) ⊆ C.
Let H denote the set of bounded components of ℝⁿ \ f(∂K). By the product formula,

1 = d( f⁻¹ ∘ f, K, y ) = Σ_{H∈H} d(f, K, H) d( f⁻¹, H, y ),   (10.19)

the sum being a finite sum from the product formula. It might help to consult the following diagram.

[Diagram: f maps ℝⁿ \ C to ℝⁿ \ f(C) and f⁻¹ maps back; K ∈ K is the component of ℝⁿ \ C containing y; H, H₁ denote members of H, the bounded components of ℝⁿ \ f(∂K); L denotes the set of components of ℝⁿ \ f(C), and L_H those members of L contained in H.]
Now letting x ∈ L ∈ L, if S is a connected set containing x and contained in ℝⁿ \ f(C), then it follows S is contained in ℝⁿ \ f(∂K) because ∂K ⊆ C. Therefore, every set of L is contained in some set of H. Furthermore, if any L ∈ L has nonempty intersection with H ∈ H, then it must be contained in H. This is because

L = (L ∩ H) ∪ (L ∩ ∂H) ∪ (L ∩ H^C),

L ∩ ∂H ⊆ L ∩ f(∂K) ⊆ L ∩ f(C) = ∅.

Since L is connected, L ∩ H^C = ∅. Letting L_H denote those sets of L which are contained in H (equivalently, having nonempty intersection with H), if p ∈ H \ ∪L_H = H \ ∪L, then p ∈ H ∩ f(C), and so

Claim 1: H \ ∪L_H ⊆ f(C).
but this is just Claim 2. By Lemma 10.5.5, I can write the above sum in place of d( f⁻¹, H, y ). Therefore,

1 = Σ_{H∈H₁} d(f, K, H) d( f⁻¹, H, y ) = Σ_{H∈H₁} d(f, K, H) Σ_{L∈L_H} d( f⁻¹, L, y ).   (10.21)

By definition,

d(f, K, H) = d(f, K, x)

where x is any point of H. In particular, d(f, K, H) = d(f, K, L) for any L ∈ L_H. Therefore, the above reduces to

Σ_{L∈L} d(f, K, L) d( f⁻¹, L, y ).   (10.22)
Here is why. There are finitely many H ∈ H₁ for which the term in the double sum of 10.21 is not zero, say H₁, ···, H_m. Then the above sum in 10.22 equals

Σ_{k=1}^m Σ_{L∈L_{H_k}} d(f, K, L) d( f⁻¹, L, y ) + Σ_{L ∈ L\∪_{k=1}^m L_{H_k}} d(f, K, L) d( f⁻¹, L, y ).

The second sum equals 0 because those L are contained in some H ∈ H for which

0 = d(f, K, H) d( f⁻¹, H, y ) = d(f, K, H) Σ_{L∈L_H} d( f⁻¹, L, y ) = Σ_{L∈L_H} d(f, K, L) d( f⁻¹, L, y ),

which is the same as the sum in 10.21. Therefore, 10.22 does follow. Then the sum in 10.22 reduces to

Σ_{L∈L} d(f, K, L) d( f⁻¹, L, K ),
and all but finitely many terms in the sum are 0. Letting |K| denote the number of elements in K, and similarly for L,

|K| = Σ_{K∈K} 1 = Σ_{K∈K} ( Σ_{L∈L} d(f, K, L) d( f⁻¹, L, K ) ),

|L| = Σ_{L∈L} 1 = Σ_{L∈L} ( Σ_{K∈K} d(f, K, L) d( f⁻¹, L, K ) ).

Suppose |K| < ∞. Then you can switch the order of summation in the double sum for |K|, and so

|K| = Σ_{K∈K} ( Σ_{L∈L} d(f, K, L) d( f⁻¹, L, K ) ) = Σ_{L∈L} ( Σ_{K∈K} d(f, K, L) d( f⁻¹, L, K ) ) = |L|.

It follows that if either |K| or |L| is finite, then they are equal. Thus if one is infinite, so is the other. This proves the theorem, because if n > 1 there is exactly one unbounded component of both ℝⁿ \ C and ℝⁿ \ f(C), and if n = 1 there are exactly two unbounded components. ■
As an application, here is a very interesting little result. It has to do with d (f , Ω, f (x))
in the case where f is one to one and Ω is connected. You might imagine this should
equal 1 or −1 based on one dimensional analogies. In fact this is the case and it is a
nice application of the Jordan separation theorem and the product formula.
where Ω is a bounded open set, and also let h be C¹(ℝⁿ; ℝⁿ), vanishing outside some bounded set. Then there exists ε₀ > 0 such that whenever 0 < ε < ε₀,

d(h, Ω, y) = ∫_Ω φ_ε(h(x) − y) det Dh(x) dx

for all y ∈ S.
h_m ≡ h ∗ ψ_m.

Thus h_m ∈ C^∞(Ω̄; ℝⁿ), and h_m converges uniformly to h while Dh_m converges uniformly to Dh. Here ||·||_∞ denotes the uniform norm, ||g||_∞ ≡ sup_x |g(x)|.
For y ∈ S, let z ∈ B(y, ε) where ε < ε₀, suppose x ∈ ∂Ω, and let k, m ≥ M. Then for t ∈ [0, 1], |(1 − t)h_m(x) + t h_k(x) − z| > 0, showing that for each y ∈ S, B(y, ε) ∩ ((1 − t)h_m + t h_k)(∂Ω) = ∅. By Lemma 10.2.5, for all y ∈ S,

∫_Ω φ_ε(h_m(x) − y) det(Dh_m(x)) dx = ∫_Ω φ_ε(h_k(x) − y) det(Dh_k(x)) dx   (10.24)
for all k, m ≥ M . By this lemma again, which says that for small enough ε the integral
is constant and the definition of the degree in Definition 10.2.4,
Z
d (y,Ω, hm ) = φε (hm (x) − y) det (Dhm (x)) dx (10.25)
Ω
d (y,Ω, h) = d (y,Ω, hm ) =
Z
φε (hm (x) − y) det (Dhm (x)) dx
Ω
whenever ε is small enough. Fix such an ε < ε0 and use 10.24 to conclude the right side
of the above equation is independent of m > M . Now from the uniform convergence
noted above,
Z
d (y,Ω, h) = lim φε (hm (x) − y) det (Dhm (x)) dx
m→∞ Ω
Z
= φε (h (x) − y) det (Dh (x)) dx.
Ω
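The integral representation above can be tested numerically in one dimension (an assumed example, not from the text): take h(x) = x³ − x on Ω = (−2, 2) and y = 0, whose preimages −1, 0, 1 have derivative signs +, −, +, so the degree should be 1. A narrow Gaussian stands in for the compactly supported mollifier φ_ε.

```python
import numpy as np

# Riemann-sum approximation of  d(h, Omega, 0) ~ int phi_eps(h(x)) h'(x) dx.
eps = 0.01
x = np.linspace(-2.0, 2.0, 400001)
h = x**3 - x
dh = 3 * x**2 - 1
phi = np.exp(-0.5 * (h / eps) ** 2) / (eps * np.sqrt(2 * np.pi))
deg = float(np.sum(phi * dh) * (x[1] - x[0]))
assert abs(deg - 1.0) < 0.05   # +1 - 1 + 1 from the three preimages
```

Each preimage contributes sgn h′ once ε is small relative to the spacing of the critical values, mirroring the change-of-variables argument in the proof.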
Lemma 10.7.2 Let h ∈ C¹(ℝⁿ; ℝⁿ) with h vanishing off a bounded set, and let m_n(A) = 0. Then h(A) also has measure zero.

Proof: Both sides depend only on h restricted to Ω̄, and so the above results apply and give the formula of the theorem for any h̃ ∈ C¹(ℝⁿ; ℝⁿ) which vanishes off a bounded set and coincides with h on Ω̄. This proves the lemma. ■
Lemma 10.7.4 Let h ∈ C¹(Ω̄, ℝⁿ) for Ω a bounded connected open set with ∂Ω having measure zero, and let h be one to one on Ω. Then for any Borel set E,

∫_{h(Ω)} X_E(y) d(y, Ω, h) dy = ∫_Ω det(Dh(x)) X_E(h(x)) dx.

Furthermore, off a set of measure zero, det(Dh(x)) has constant sign, equal to the sign of d(y, Ω, h).
Let

f_j(y) ≡ dist(y, W_j^C) / ( dist(y, K_j) + dist(y, W_j^C) ).

Thus f_j is nonnegative, increasing in j, has compact support in W_j, and is continuous; eventually f_j(y) = 1 for all y ∈ O, and f_j(y) = 0 for all y ∉ O. Thus lim_{j→∞} f_j(y) = X_O(y).
Now let O ⊆ h(∂Ω)^C. Then from the above, let f_j be as described above for the open set O ∩ h(Ω). (By invariance of domain, h(Ω) is open.)

∫_{h(Ω)} f_j(y) d(y, Ω, h) dy = ∫_Ω det(Dh(x)) f_j(h(x)) dx.

From Proposition 10.6.2, d(y, Ω, h) either equals 1 or −1 for all y ∈ h(Ω). Then by the monotone convergence theorem on the left, using the fact that d(y, Ω, h) is either always 1 or always −1, and the dominated convergence theorem on the right, it follows

∫_{h(Ω)} X_O(y) d(y, Ω, h) dy = ∫_Ω det(Dh(x)) X_{O∩h(Ω)}(h(x)) dx = ∫_Ω det(Dh(x)) X_O(h(x)) dx.
Then as shown above, G contains the π system of open sets. Since |det(Dh(x))| is bounded uniformly, it follows easily that if E ∈ G then E^C ∈ G. This is because, since ℝⁿ is an open set,

∫_{h(Ω)} X_E(y) d(y, Ω, h) dy + ∫_{h(Ω)} X_{E^C}(y) d(y, Ω, h) dy = ∫_{h(Ω)} d(y, Ω, h) dy = ∫_Ω det(Dh(x)) dx
= ∫_Ω det(Dh(x)) X_{E^C}(h(x)) dx + ∫_Ω det(Dh(x)) X_E(h(x)) dx.

Now cancelling

∫_Ω det(Dh(x)) X_E(h(x)) dx

from the right with

∫_{h(Ω)} X_E(y) d(y, Ω, h) dy

from the left shows E^C ∈ G. For the last claim, suppose d(y, Ω, h) = −1 and, for ε > 0, let E_ε ≡ {x ∈ Ω : det Dh(x) > ε}. Applying the formula to E = h(E_ε) makes the left side nonpositive, while the right side satisfies

∫_{E_ε} det(Dh(x)) dx ≥ ε m_n(E_ε),

and so m_n(E_ε) = 0. Therefore, if E is the set of x ∈ Ω where det(Dh(x)) > 0, it equals ∪_{k=1}^∞ E_{1/k}, a set of measure zero. Thus off this set, det Dh(x) ≤ 0. Similarly, if d(y, Ω, h) = 1, det Dh(x) ≥ 0 off a set of measure zero. This proves the lemma. ■
Theorem 10.7.5 Let f ≥ 0 be measurable, for Ω a bounded open connected set, where h is in C¹(Ω̄; ℝⁿ) and is one to one on Ω. Then

∫_{h(Ω)} f(y) d(y, Ω, h) dy = ∫_Ω det(Dh(x)) f(h(x)) dx.

Proof: First suppose ∂Ω has measure zero. From Proposition 10.6.2, d(y, Ω, h) either equals 1 or −1 for all y ∈ h(Ω), and det(Dh(x)) has the same sign as the sign of the degree a.e. Suppose d(y, Ω, h) = −1. By Theorem 9.9.10, if f ≥ 0 and is Lebesgue measurable,

∫_{h(Ω)} f(y) dy = ∫_Ω |det(Dh(x))| f(h(x)) dx = (−1) ∫_Ω det(Dh(x)) f(h(x)) dx.

The case where d(y, Ω, h) = 1 is similar. This proves the theorem when ∂Ω has measure zero.
Next I show it is not necessary to assume ∂Ω has measure zero. By Corollary 9.7.6 there exists a sequence of disjoint balls {B_i} such that

m_n( Ω \ ∪_{i=1}^∞ B_i ) = 0.

Since ∂B_i has measure zero, the above holds for each B_i, and so

∫_{h(B_i)} f(y) d(y, B_i, h) dy = ∫_{B_i} det(Dh(x)) f(h(x)) dx.

From Lemma 10.7.4, det(Dh(x)) has the same sign, off a set of measure zero, as the constant sign of d(y, Ω, h). Therefore, using the monotone convergence theorem and the fact that h is one to one,

∫_Ω det(Dh(x)) f(h(x)) dx = ∫_{∪_{i=1}^∞ B_i} det(Dh(x)) f(h(x)) dx
= Σ_{i=1}^∞ ∫_{B_i} det(Dh(x)) f(h(x)) dx = Σ_{i=1}^∞ ∫_{h(B_i)} f(y) d(y, Ω, h) dy
= ∫_{∪_{i=1}^∞ h(B_i)} f(y) d(y, Ω, h) dy = ∫_{h(∪_{i=1}^∞ B_i)} f(y) d(y, Ω, h) dy = ∫_{h(Ω)} f(y) d(y, Ω, h) dy,

the last line following from Lemma 10.7.2. This proves the theorem. ■
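The signed change-of-variables identity can be verified numerically for an orientation-reversing map (an assumed example, not from the text): h(x) = −x on Ω = (0, 1), so h(Ω) = (−1, 0), d(y, Ω, h) = −1 and det Dh = −1. With f(y) = y², both sides should equal −1/3.

```python
import numpy as np

# LHS: integral over h(Omega) of f(y) * degree;  RHS: integral over Omega
# of det(Dh) * f(h(x)).  Both via simple Riemann sums.
x = np.linspace(0.0, 1.0, 200001)
dx = x[1] - x[0]
f = lambda y: y ** 2
rhs = float(np.sum((-1.0) * f(-x)) * dx)        # det Dh = -1
y = np.linspace(-1.0, 0.0, 200001)
lhs = float(np.sum(f(y) * (-1.0)) * dx)         # d(y, Omega, h) = -1
assert abs(lhs - rhs) < 1e-3
assert abs(lhs + 1.0 / 3.0) < 1e-3
```

The degree factor is exactly what converts |det Dh| in the unsigned change of variables into det Dh here.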
10.8 Exercises
1. Show the Brouwer fixed point theorem is equivalent to the nonexistence of a
continuous retraction onto the boundary of B (0, r).
2. Using the Jordan separation theorem, prove the invariance of domain theorem.
Hint: You might consider B (x, r) and show f maps the inside to one of two
components of Rn \ f (∂B (x, r)) . Thus an open ball goes to some open set.
3. Give a version of Proposition 10.6.2 which is valid for the case where n = 1.
4. Suppose n > m. Does there exist a continuous one to one map f which maps ℝᵐ onto ℝⁿ? This is a very interesting question because there do exist continuous maps of [0, 1] which cover a square, for example. Hint: First show that if K is compact and f : K → f(K) is one to one, then f⁻¹ must be continuous. Now consider the increasing sequence of compact sets {B̄(0, k)}_{k=1}^∞ whose union is ℝᵐ and the increasing sequence {f(B̄(0, k))}_{k=1}^∞ which you might assume covers ℝⁿ. You might use this to argue that if such a function exists, then f⁻¹ must be continuous, and then apply Corollary 10.4.4.
5. Can there exist a one to one, onto, continuous map f which takes the unit interval to the unit disk? Hint: Think in terms of invariance of domain and use the hint to Problem 4.
Thus ∂Ui = Γ.
9. Let K be a nonempty closed and convex subset of Rn . Recall K is convex means
that if x, y ∈ K, then for all t ∈ [0, 1] , tx + (1 − t) y ∈ K. Show that if x ∈ Rn
there exists a unique z ∈ K such that
|x − z| = min {|x − y| : y ∈ K} .
This z will be denoted as P x. Hint: First note you do not know K is compact.
Establish the parallelogram identity if you have not already done so,

|u − v|² + |u + v|² = 2|u|² + 2|v|²,

and then use this to argue {z_k} is a Cauchy sequence. Then if z_i works for i = 1, 2, consider (z₁ + z₂)/2 to get a contradiction.
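The parallelogram identity in the hint is easy to confirm numerically before using it (a quick check, not part of the exercise):

```python
import numpy as np

# |u - v|^2 + |u + v|^2 = 2|u|^2 + 2|v|^2 for random vectors in R^5.
rng = np.random.default_rng(0)
for _ in range(100):
    u, v = rng.normal(size=5), rng.normal(size=5)
    lhs = np.sum((u - v) ** 2) + np.sum((u + v) ** 2)
    rhs = 2 * np.sum(u ** 2) + 2 * np.sum(v ** 2)
    assert abs(lhs - rhs) < 1e-9
```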
10. In Problem 9 show that Px satisfies the following variational inequality:

(x − Px) · (y − Px) ≤ 0

for all y ∈ K. Then show that |Px₁ − Px₂| ≤ |x₁ − x₂|. Hint: For the first part, note that if y ∈ K, the function t → |x − (Px + t(y − Px))|² achieves its minimum on [0, 1] at t = 0. For the second part,

(x₁ − Px₁) · (Px₂ − Px₁) ≤ 0,  (x₂ − Px₂) · (Px₁ − Px₂) ≤ 0.

Explain why

(x₂ − Px₂ − (x₁ − Px₁)) · (Px₂ − Px₁) ≥ 0

and then use some manipulations and the Cauchy–Schwarz inequality to get the desired inequality.
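Both conclusions of Problems 9–10 can be observed for one concrete convex set, the closed unit ball (an assumed example, where the projection has the closed form Px = x/max(1, |x|)):

```python
import numpy as np

# Projection onto the closed unit ball in R^3.
def P(x):
    return x / max(1.0, np.linalg.norm(x))

rng = np.random.default_rng(1)
for _ in range(200):
    x1, x2 = rng.normal(size=3) * 3, rng.normal(size=3) * 3
    y = rng.normal(size=3)
    y = y / max(1.0, np.linalg.norm(y))          # a point of K
    # variational inequality and nonexpansiveness
    assert np.dot(x1 - P(x1), y - P(x1)) <= 1e-9
    assert np.linalg.norm(P(x1) - P(x2)) <= np.linalg.norm(x1 - x2) + 1e-12
```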
11. Establish the Brouwer fixed point theorem for any convex compact set in Rn .
Hint: If K is a compact and convex set, let R be large enough that the closed
ball, D (0, R) ⊇ K. Let P be the projection onto K as in Problem 10 above. If f
is a continuous map from K to K, consider f ◦P . You want to show f has a fixed
point in K.
12. Suppose D is a set which is homeomorphic to B̄(0, 1). This means there exists a continuous one to one map h such that h(B̄(0, 1)) = D and such that h⁻¹ is also continuous. Show that if f is a continuous function which maps D to D, then f has a fixed point. Now show that it suffices to say that h is one to one and continuous; in this case the continuity of h⁻¹ is automatic. Sets which have the property that continuous functions taking the set to itself have at least one fixed point are said to have the fixed point property. Work Problem 6 using this notion of fixed point property. What about a solid ball and a donut?
13. There are many different proofs of the Brouwer fixed point theorem. Let l be a line segment. Label one end with A and the other end with B. Now partition the segment into n little pieces and label each of these partition points with either A or B. Show there is an odd number of little segments with one end labeled A and the other labeled B. If f : l → l is continuous, use the fact that it is uniformly continuous and this little labeling result to give a proof of the Brouwer fixed point theorem for a one dimensional segment. Next consider a triangle. Label the vertices A, B, C and subdivide the triangle into little triangles T₁, ···, T_m in such a way that any pair of these little triangles intersects either along an entire edge or at a vertex. Now label the unlabeled vertices of these little triangles with A, B, or C in any way. Show there is an odd number of little triangles having their vertices labeled A, B, C. Use this to show the Brouwer fixed point theorem for any triangle. This approach generalizes to higher dimensions, and you will see how this would take place if you are successful in going this far. This is an outline of the Sperner's lemma approach to the Brouwer fixed point theorem. Are there other sets besides compact convex sets which have the fixed point property?
14. Using the definition of the derivative and the Vitali covering theorem, show that if f ∈ C¹(Ū, ℝⁿ) and ∂U has n dimensional measure zero, then f(∂U) also has measure zero. (This problem has little to do with this chapter. It is a review.)
15. Suppose Ω is any open bounded subset of Rn which contains 0 and that f : Ω → Rn
is continuous with the property that
f (x) · x ≥ 0
for all x ∈ ∂Ω. Show that then there exists x ∈ Ω such that f (x) = 0. Give a
similar result in the case where the above inequality is replaced with ≤. Hint:
You might consider the function
h (t, x) ≡ tf (x) + (1 − t) x.
where the a_j are small distinct real numbers, and argue that both this function and f are analytic, but that 0 is a regular value for g although it is not so for f. However, for each a_j small but distinct, d(f, B_r, 0) = d(g, B_r, 0).
19. Using Problem 18, prove the fundamental theorem of algebra as follows. Let p(z) be a nonconstant polynomial of degree n. Show that for large enough r, |p(z)| > |p(z) − a_n zⁿ| for all z ∈ ∂B(0, r). Now from Problem 17 you can conclude d(p, B_r, 0) = d(f, B_r, 0) = n, where f(z) = a_n zⁿ.
20. Generalize Theorem 10.7.5 to the situation where Ω is not necessarily a connected
open set. You may need to make some adjustments on the hypotheses.
Integration Of Differential Forms

11.1 Manifolds

Manifolds are sets which resemble ℝⁿ locally. A manifold with boundary resembles half of ℝⁿ locally. To make this concept of a manifold more precise, here is a definition.

where

ℝⁿ₀ ≡ {u ∈ ℝⁿ : u₁ = 0}

and ∂Ω is called the boundary of Ω. Note that if n = 1, ℝⁿ₀ is just the single point 0. By convention, we will consider the boundary of such a 0 dimensional manifold to be empty.
Theorem 11.1.4 Let ∂Ω and int(Ω) be as defined above. Then int(Ω) is open in Ω and ∂Ω is closed in Ω. Furthermore, ∂Ω ∩ int(Ω) = ∅, Ω = ∂Ω ∪ int(Ω), and for n ≥ 1, ∂Ω is an n − 1 dimensional manifold for which ∂(∂Ω) = ∅. The property of being in int(Ω) or ∂Ω does not depend on the choice of atlas.

Proof: It is clear that Ω = ∂Ω ∪ int(Ω). First consider the claim that ∂Ω ∩ int(Ω) = ∅. Suppose this does not happen. Then there would exist x ∈ ∂Ω ∩ int(Ω). Therefore, there would exist two mappings R_i and R_j such that R_j x ∈ ℝⁿ₀ and R_i x ∈ ℝⁿ_< with x ∈ U_i ∩ U_j. Now consider the map R_j ∘ R_i⁻¹, a continuous one to one map from ℝⁿ_≤ to ℝⁿ_≤ having a continuous inverse. By continuity, there exists r > 0 small enough that

R_i⁻¹ B(R_i x, r) ⊆ U_i ∩ U_j.

Therefore, R_j ∘ R_i⁻¹ (B(R_i x, r)) ⊆ ℝⁿ_≤ and contains a point of ℝⁿ₀, namely R_j x. However, this cannot occur because it contradicts the theorem on invariance of domain, Theorem 10.4.3, which requires that R_j ∘ R_i⁻¹ (B(R_i x, r)) be an open subset of ℝⁿ, and this one isn't because of the point on ℝⁿ₀. Therefore, ∂Ω ∩ int(Ω) = ∅ as claimed. This same argument shows that the property of being in int(Ω) or ∂Ω does not depend on the choice of the atlas.

To verify that ∂(∂Ω) = ∅, let S_i be the restriction of R_i to ∂Ω ∩ U_i. Thus the images lie in

{u ∈ ℝⁿ : u₁ = 0},

and, with k_i chosen sufficiently large that (R_i(V_i))₂ − k_i < 0, it follows that {(V_i, S'_i)} is an atlas for ∂Ω as an n − 1 dimensional manifold such that every point of ∂Ω is sent to ℝⁿ⁻¹_< and none gets sent to ℝⁿ⁻¹₀. It follows ∂Ω is an n − 1 dimensional manifold with empty boundary. In case n = 1, the result follows by the definition of the boundary of a 0 dimensional manifold.

Next consider the claim that int(Ω) is open in Ω. If x ∈ int(Ω), are all points of Ω which are sufficiently close to x also in int(Ω)? If this were not true, there would exist {x_n} such that x_n ∈ ∂Ω and x_n → x. Since there are only finitely many charts of interest, this would imply the existence of a subsequence, still denoted by x_n, and a single map R_i such that R_i(x_n) ∈ ℝⁿ₀. But then R_i(x_n) → R_i(x), and so R_i(x) ∈ ℝⁿ₀, showing x ∈ ∂Ω, a contradiction to int(Ω) ∩ ∂Ω = ∅. Now it follows that ∂Ω is closed in Ω because ∂Ω = Ω \ int(Ω). This proves the theorem. ■
The mappings R_i ∘ R_j⁻¹ are called the overlap maps. In the case where k = 0, the R_i are only assumed continuous, so there is no differentiability available; in this case, the manifold is oriented if whenever A is an open connected subset of int(R_i(U_i ∩ U_j)) whose boundary has measure zero and which separates ℝⁿ into two components,

d( y, A, R_j ∘ R_i⁻¹ ) ∈ {1, 0}   (11.2)

depending on whether y ∈ R_j ∘ R_i⁻¹(A). An atlas satisfying 11.1, or more generally 11.2, is called an oriented atlas. By Lemma 10.7.4 and Proposition 10.6.2, this definition in terms of the degree, when applied to the situation of a C^k manifold, gives the same thing as the earlier definition in terms of the determinant of the derivative.

The advantage of using the degree in the above definition to define orientation is that it does not depend on any kind of differentiability, and since I am trying to relax smoothness of the boundary, this is a good idea.

In calculus, you probably looked at piecewise smooth curves. The following is an attempt to generalize this to the present situation.
R_i⁻¹ ∈ C¹( R_i(U_i \ L); ℝᵐ )   (11.3)

R_j ∘ R_i⁻¹ ∈ C¹( R_i((U_i ∩ U_j) \ L); ℝⁿ )   (11.4)

sup { ||DR_i⁻¹(u)||_F : u ∈ R_i(U_i \ L) } < ∞   (11.5)

where the norm is the Frobenius norm,

||M||_F ≡ ( Σ_{i,j} |M_{ij}|² )^{1/2}.

If v = R_i ∘ R_j⁻¹(u), I will often write

∂(v₁ ··· v_n)/∂(u₁ ··· u_n) ≡ det D( R_i ∘ R_j⁻¹ )(u).

Thus in this situation, ∂(v₁ ··· v_n)/∂(u₁ ··· u_n) ≥ 0.
Since this is true for arbitrary E ⊆ A, it follows that det D( R_i ∘ R_j⁻¹ )(u) ≥ 0 for a.e. u ∈ A, because if not, you could take E_δ ≡ { u : det D( R_i ∘ R_j⁻¹ )(u) < −δ }, and for some δ > 0 this would have positive measure; then the right side of the above would be negative while the left is nonnegative. By the Vitali covering theorem, Corollary 9.7.6, and the assumptions of PC¹, there exists a sequence of disjoint open balls {A_k} contained in R_i ∘ R_j⁻¹(U_i ∩ U_j \ L) such that

int( R_i ∘ R_j⁻¹(U_j ∩ U_i) ) = L ∪ ∪_{k=1}^∞ A_k,

and from the above, there exist sets of measure zero N_k ⊆ A_k such that det D( R_i ∘ R_j⁻¹ )(u) ≥ 0 for all u ∈ A_k \ N_k. Then det D( R_i ∘ R_j⁻¹ )(u) ≥ 0 on int( R_i ∘ R_j⁻¹(U_j ∩ U_i) ) \ (L ∪ ∪_{k=1}^∞ N_k). This proves one direction.

Now consider the other direction. Suppose det D( R_i ∘ R_j⁻¹ )(u) ≥ 0 a.e. Then by Theorem 10.7.5,

∫_{R_i ∘ R_j⁻¹(A)} d( v, A, R_i ∘ R_j⁻¹ ) dv = ∫_A det D( R_i ∘ R_j⁻¹ )(u) du ≥ 0.

The degree is constant on the connected open set R_i ∘ R_j⁻¹(A). By Proposition 10.6.2, the degree equals either −1 or 1. The above inequality shows it can't equal −1, and so it must equal 1. This proves the proposition. ■
This shows it would be fine to simply use 11.6 as the definition of orientable in the case of a PC¹ manifold and not bother with the definition in terms of the degree. However, that one is more general because it does not depend on any differentiability. I will use this result whenever convenient in what follows.
11.2 Some Important Measure Theory

The first result is Egoroff's theorem. Let µ be a measure defined on Ω with µ(Ω) < ∞, and let f_n, f be complex valued functions such that Re f_n, Im f_n are all measurable and

lim_{n→∞} f_n(ω) = f(ω)

for all ω ∉ E, where µ(E) = 0. Then for every ε > 0, there exists a set F ⊇ E with µ(F) < ε, such that f_n converges uniformly to f on F^C.
Proof: First suppose E = ∅ so that convergence is pointwise everywhere. It follows then that Re f and Im f are pointwise limits of measurable functions and are therefore measurable. Let E_km = {ω ∈ Ω : |f_n(ω) − f(ω)| ≥ 1/m for some n > k}. Note that

|f_n(ω) − f(ω)| = ( (Re f_n(ω) − Re f(ω))² + (Im f_n(ω) − Im f(ω))² )^{1/2},

and so

[ |f_n − f| ≥ 1/m ]

is measurable. Hence E_km is measurable because

E_km = ∪_{n=k+1}^∞ [ |f_n − f| ≥ 1/m ].

For fixed m, ∩_{k=1}^∞ E_km = ∅ because f_n converges to f. Therefore, if ω ∈ Ω there exists k such that if n > k, |f_n(ω) − f(ω)| < 1/m, which means ω ∉ E_km. Note also that

E_km ⊇ E_{(k+1)m}.

Since µ(E_{1m}) < ∞, Theorem 7.3.2 on Page 155 implies

0 = µ( ∩_{k=1}^∞ E_km ) = lim_{k→∞} µ(E_km).

Let k(m) be chosen such that µ(E_{k(m)m}) < ε 2^{−m}, and let

F = ∪_{m=1}^∞ E_{k(m)m},

so that µ(F) < ε. Given η > 0, choose m₀ with 1/m₀ < η. For ω ∈ F^C, in particular ω ∈ E_{k(m₀)m₀}^C, so

|f_n(ω) − f(ω)| < 1/m₀ < η

for all n > k(m₀). This holds for all ω ∈ F^C, and so f_n converges uniformly to f on F^C.

Now if E ≠ ∅, consider {X_{E^C} f_n}_{n=1}^∞. Each X_{E^C} f_n has measurable real and imaginary parts, and the sequence converges pointwise to X_{E^C} f everywhere. Therefore, from the first part, there exists a set F of measure less than ε such that on F^C, {X_{E^C} f_n} converges uniformly to X_{E^C} f. Therefore, on (E ∪ F)^C, {f_n} converges uniformly to f. This proves the theorem. ■
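A standard illustration of Egoroff's theorem (an assumed example, not from the text): f_n(x) = xⁿ on [0, 1) converges pointwise to 0 but not uniformly, while off the small set (1 − δ, 1) the convergence is uniform, since sup xⁿ = (1 − δ)ⁿ → 0.

```python
import numpy as np

# Uniform convergence of x^n on [0, 1 - delta]: the sup norms decrease to 0.
delta = 0.05
x = np.linspace(0.0, 1.0 - delta, 1000)
sup_on_complement = [float(np.max(x ** n)) for n in (1, 10, 100, 500)]
assert sup_on_complement == sorted(sup_on_complement, reverse=True)
assert sup_on_complement[-1] < 1e-9   # (0.95)^500 is tiny
```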
Proof: Let ε > 0 be given and suppose S is uniformly integrable. First suppose the functions are real valued. Let δ be such that if µ(E) < δ, then

| ∫_E f dµ | < ε/2.

Let ε > 0 be given and choose R large enough that ∫_{[|f|>R]} |f| dµ < ε/2. Now let µ(E) < ε/(2R). Then

∫_E |f| dµ = ∫_{E∩[|f|≤R]} |f| dµ + ∫_{E∩[|f|>R]} |f| dµ < Rµ(E) + ε/2 < ε/2 + ε/2 = ε.

This proves the lemma. ■

The following gives a nice way to identify a uniformly integrable set of functions.
Lemma 11.2.4 Let S be a subset of L¹(Ω, µ) where µ(Ω) < ∞. Let t → h(t) be a continuous function which satisfies

lim_{t→∞} h(t)/t = ∞,

and suppose sup{ ∫_Ω h(|f|) dµ : f ∈ S } ≡ N < ∞. Then S is uniformly integrable.

Proof: First I show S is bounded in L¹(Ω; µ), which means there exists a constant M such that for all f ∈ S,

∫_Ω |f| dµ ≤ M.

From the properties of h, there exists R_n such that if t ≥ R_n, then h(t) ≥ nt. Therefore,

∫_Ω |f| dµ = ∫_{[|f|≥R_n]} |f| dµ + ∫_{[|f|<R_n]} |f| dµ.

Letting n = 1,

∫_Ω |f| dµ ≤ ∫_{[|f|≥R₁]} h(|f|) dµ + R₁ µ([|f| < R₁]) ≤ N + R₁ µ(Ω) ≡ M.

Next, for a measurable set E,

∫_E |f| dµ ≤ (1/n) ∫_Ω h(|f|) dµ + R_n µ(E) ≤ N/n + R_n µ(E),

and letting n be large enough, this is less than

ε/2 + R_n µ(E).
Theorem 11.2.5 Let {f_n} be a uniformly integrable set of complex valued functions, µ(Ω) < ∞, and f_n(x) → f(x) a.e., where f is a measurable complex valued function. Then f ∈ L¹(Ω) and

lim_{n→∞} ∫_Ω |f_n − f| dµ = 0.   (11.7)

Proof: First it will be shown that f ∈ L¹(Ω). By uniform integrability, there exists δ > 0 such that if µ(E) < δ, then

∫_E |f_n| dµ < 1

for all n. By Egoroff's theorem, there exists a set E of measure less than δ such that on E^C, {f_n} converges uniformly. Therefore, for p large enough and n > p,

∫_{E^C} |f_p − f_n| dµ < 1,

which implies

∫_{E^C} |f_n| dµ < 1 + ∫_Ω |f_p| dµ.

Then since there are only finitely many functions f_n with n ≤ p, there exists a constant M₁ such that for all n,

∫_{E^C} |f_n| dµ < M₁.

But also,

∫_Ω |f_m| dµ = ∫_{E^C} |f_m| dµ + ∫_E |f_m| dµ ≤ M₁ + 1 ≡ M.
11.4 The Area Measure On A Manifold

det(BA) = (1/m!) Σ_{r₁,···,r_m} Σ_{i₁···i_m} Σ_{j₁···j_m} sgn(i₁ ··· i_m) sgn(j₁ ··· j_m) B_{i₁r₁} ··· B_{i_m r_m} A_{r₁j₁} ··· A_{r_m j_m}

= Σ_{r₁,···,r_m} (1/m!) ( Σ_{i₁···i_m} sgn(i₁ ··· i_m) B_{i₁r₁} B_{i₂r₂} ··· B_{i_m r_m} ) ( Σ_{j₁···j_m} sgn(j₁ ··· j_m) A_{r₁j₁} A_{r₂j₂} ··· A_{r_m j_m} )

= Σ_{r₁,···,r_m} (1/m!) sgn(r₁ ··· r_m)² det( B(r₁ ··· r_m) ) det( A(r₁ ··· r_m) )
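The computation above is the Cauchy–Binet formula. In its usual form, for B an m × n matrix and A an n × m matrix with m ≤ n, det(BA) is the sum over increasing multi-indices r₁ < ··· < r_m of det(B columns r) · det(A rows r); the symmetrized sum with the 1/m! and sgn factors reduces to this because each increasing multi-index is hit m! times. A quick numerical check (illustrative, with random matrices):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
m, n = 2, 4
B = rng.normal(size=(m, n))
A = rng.normal(size=(n, m))

direct = np.linalg.det(B @ A)
# Sum of products of m x m minors over increasing column/row selections.
minors = sum(np.linalg.det(B[:, list(r)]) * np.linalg.det(A[list(r), :])
             for r in combinations(range(n), m))
assert abs(direct - minors) < 1e-10
```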
where

J_i(u) ≡ ( det( DR_i⁻¹(u)* DR_i⁻¹(u) ) )^{1/2},

where the sum is taken over all increasing strings of n indices (i₁, ···, i_n) and

∂(x_{i₁} ··· x_{i_n})/∂(u₁ ··· u_n) (u) ≡ det [ x_{i_k,u_l} ]_{k,l=1}^n (u),   (11.8)

the determinant of the n × n matrix whose (k, l) entry is x_{i_k,u_l}.
v = S_j ∘ R_i⁻¹(u)   (11.9)

for u ∈ R_i((V_j ∩ U_i) \ L). Then R_i⁻¹ = S_j⁻¹ ∘ ( S_j ∘ R_i⁻¹ ), and so by the chain rule,

DS_j⁻¹(v) D( S_j ∘ R_i⁻¹ )(u) = DR_i⁻¹(u).

Therefore,

J_i(u) = ( det( DR_i⁻¹(u)* DR_i⁻¹(u) ) )^{1/2}
= ( det( D( S_j ∘ R_i⁻¹ )(u)* DS_j⁻¹(v)* DS_j⁻¹(v) D( S_j ∘ R_i⁻¹ )(u) ) )^{1/2}
= |det( D( S_j ∘ R_i⁻¹ )(u) )| J_j(v).   (11.10)

Similarly,

J_j(v) = |det( D( R_i ∘ S_j⁻¹ )(v) )| J_i(u).   (11.11)

In the situation of 11.9, it is convenient to use the notation

∂(v₁ ··· v_n)/∂(u₁ ··· u_n) ≡ det( D( S_j ∘ R_i⁻¹ )(u) ),

and this will be used occasionally below.
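The transformation rule 11.10 can be observed numerically for a curve (n = 1, m = 2), with two assumed parametrizations of the same circular arc: R⁻¹(u) = (cos u, sin u) and S⁻¹(v) = (cos 2v, sin 2v), so v = u/2 and det D(S ∘ R⁻¹) = 1/2.

```python
import numpy as np

# J(u) = (det(DR^{-1}(u)^T DR^{-1}(u)))^{1/2}, the length element of the curve.
def J(jacobian_col):
    D = np.array(jacobian_col).reshape(-1, 1)    # m x n Jacobian with n = 1
    return float(np.sqrt(np.linalg.det(D.T @ D)))

u = 0.7
v = u / 2.0
Ji = J([-np.sin(u), np.cos(u)])                  # DR^{-1}(u), gives J = 1
Jj = J([-2 * np.sin(2 * v), 2 * np.cos(2 * v)])  # DS^{-1}(v), gives J = 2

# 11.10:  J_i(u) = |det D(S o R^{-1})(u)| J_j(v) = (1/2) J_j(v)
assert abs(Ji - 0.5 * Jj) < 1e-12
```

The area element itself changes chart to chart, but the Jacobian factor compensates exactly, which is what makes the surface measure below well defined.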
It is necessary to show the above definition is well defined.

Theorem 11.4.2 The above definition of surface measure is well defined. That is, suppose Ω is an n dimensional orientable PC¹ manifold with boundary, and let {(U_i, R_i)}_{i=1}^p and {(V_j, S_j)}_{j=1}^q be two atlases of Ω. Then for E a Borel set, the computation of σ_n(E) using the two different atlases gives the same thing. This defines a Borel measure on Ω. Furthermore, if E ⊆ U_i, σ_n(E) reduces to

∫_{R_i U_i} X_E( R_i⁻¹(u) ) J_i(u) du.

Also σ_n(L) = 0.
= Σ_{j=1}^q ∫_{R_i(U_i∩V_j)} η_j( R_i⁻¹(u) ) ψ_i( R_i⁻¹(u) ) X_E( R_i⁻¹(u) ) J_i(u) du

= Σ_{j=1}^q ∫_{int R_i(U_i∩V_j\L)} η_j( R_i⁻¹(u) ) ψ_i( R_i⁻¹(u) ) X_E( R_i⁻¹(u) ) J_i(u) du.

The reason this can be done is that points not on the interior of R_i(U_i ∩ V_j) are on the plane u₁ = 0, which is a set of measure zero, and by assumption R_i(L ∩ U_i ∩ V_j) has measure zero. Of course the determinants in the definition of J_i(u) in the integrand are not even defined on the set of measure zero R_i(L ∩ U_i). It follows the definition of σ_n(E) in terms of the atlas {(U_i, R_i)}_{i=1}^p and the specified partition of unity is

Σ_{i=1}^p Σ_{j=1}^q ∫_{int R_i(U_i∩V_j\L)} η_j( R_i⁻¹(u) ) ψ_i( R_i⁻¹(u) ) X_E( R_i⁻¹(u) ) J_i(u) du.

The integral is unchanged if it is taken over S_j(U_i ∩ V_j), and so, by 11.11, this equals

Σ_{i=1}^p Σ_{j=1}^q ∫_{S_j(U_i∩V_j)} η_j( S_j⁻¹(v) ) ψ_i( S_j⁻¹(v) ) X_E( S_j⁻¹(v) ) J_j(v) dv

= Σ_{j=1}^q ∫_{S_j(U_i∩V_j)} η_j( S_j⁻¹(v) ) X_E( S_j⁻¹(v) ) J_j(v) dv,

which equals the definition of σ_n(E) taken with respect to the other atlas and partition of unity. Thus the definition is well defined. This also has shown that the partition of unity can be picked at will.
It remains to verify the claim. First suppose E = K, a compact subset of U_i. Then using Lemma 11.5.3 there exists a partition of unity such that ψ_k = 1 on K. Consider the sum used to define σ_n(K),

Σ_{i=1}^p ∫_{R_i U_i} ψ_i( R_i⁻¹(u) ) X_K( R_i⁻¹(u) ) J_i(u) du.

Next consider the general case. By Theorem 7.4.6 the Borel measure σ_n is regular. Also Lebesgue measure is regular. Therefore, there exists an increasing sequence {K_r}_{r=1}^∞ of compact subsets of E such that σ_n(E) = lim_{r→∞} σ_n(K_r). Next take an increasing sequence of compact sets contained in R_k(E) whose measures converge to that of R_k(E). Thus

σ_n(E) = ∫_{R_k U_k} X_E( R_k⁻¹(u) ) J_k(u) du

as claimed.

So what is the measure of L? By definition it equals

σ_n(L) ≡ Σ_{i=1}^p ∫_{R_i U_i} ψ_i( R_i⁻¹(u) ) X_L( R_i⁻¹(u) ) J_i(u) du = Σ_{i=1}^p ∫_{R_i U_i} ψ_i( R_i⁻¹(u) ) X_{R_i(L∩U_i)}(u) J_i(u) du = 0,

because by assumption, the measure of each R_i(L ∩ U_i) is zero. This proves the theorem. ■
where here (Dg)_{ij} ≡ g_{i,j} ≡ ∂g_i/∂x_j.
11.5 Integration Of Differential Forms On Manifolds

Proposition 11.5.2 Let Ω be an open connected bounded set in ℝⁿ such that ℝⁿ \ ∂Ω consists of two (three if n = 1) connected components. Let f ∈ C(Ω̄; ℝⁿ) be continuous and one to one. Then f(Ω) is the bounded component of ℝⁿ \ f(∂Ω), and for y ∈ f(Ω), d(f, Ω, y) either equals 1 or −1.

Also recall the following fundamental facts on partitions of unity, from Lemma 9.5.15 and Corollary 9.5.14. If K is a compact subset of U₁ (U_i), there exist such functions such that also ψ₁(x) = 1 (ψ_i(x) = 1) for all x ∈ K.

With the above, what follows is the definition of what a differential form is and how to integrate one.
Definition 11.5.4 Let I denote an ordered list of n indices taken from the set {1, ···, m}. Thus I = (i₁, ···, i_n). It is an ordered list because the order matters. A differential form of order n in ℝᵐ is a formal expression,

ω = Σ_I a_I(x) dx_I,

where dx_I denotes

dx_{i₁} ∧ ··· ∧ dx_{i_n},

and the sum is taken over all ordered lists of indices taken from the set {1, ···, m}. For Ω an orientable n dimensional manifold with boundary, let {(U_i, R_i)} be an oriented atlas for Ω. Each U_i is the intersection of an open set O_i in ℝᵐ with Ω, and so there exists a C^∞ partition of unity subordinate to the open cover {O_i} which sums to 1 on Ω. Thus ψ_i ∈ C_c^∞(O_i), has values in [0, 1], and satisfies Σ_i ψ_i(x) = 1 for all x ∈ Ω. Define

∫_Ω ω ≡ Σ_{i=1}^p Σ_I ∫_{R_i U_i} ψ_i( R_i⁻¹(u) ) a_I( R_i⁻¹(u) ) ∂(x_{i₁} ··· x_{i_n})/∂(u₁ ··· u_n) du.   (11.12)

Note that

∂(x_{i₁} ··· x_{i_n})/∂(u₁ ··· u_n),

given by 11.8, is not defined on R_i(U_i ∩ L), but this is assumed a set of measure zero, so it is not important in the integral.

Of course there are all sorts of questions related to whether this definition is well defined. What if you had a different atlas and a different partition of unity? Would ∫_Ω ω change? In general, the answer is yes. However, there is a sense in which the integral of a differential form is well defined. This involves the concept of orientation, which looks a lot like the concept of an oriented manifold.
The above definition of ∫_Ω ω is well defined in the sense that any two atlases which have the same orientation deliver the same value for this symbol.

= Σ_{i=1}^p Σ_{j=1}^q Σ_I ∫_{R_i(U_i∩V_j)} η_j( R_i⁻¹(u) ) ψ_i( R_i⁻¹(u) ) a_I( R_i⁻¹(u) ) ∂(x_{i₁} ··· x_{i_n})/∂(u₁ ··· u_n) du

= Σ_{i=1}^p Σ_{j=1}^q Σ_I ∫_{int R_i(U_i∩V_j\L)} η_j( R_i⁻¹(u) ) ψ_i( R_i⁻¹(u) ) a_I( R_i⁻¹(u) ) ∂(x_{i₁} ··· x_{i_n})/∂(u₁ ··· u_n) du.

The reason this can be done is that points not on the interior of R_i(U_i ∩ V_j) are on the plane u₁ = 0, which is a set of measure zero, and by assumption R_i(L ∩ U_i ∩ V_j) has measure zero. Of course the above determinant in the integrand is not even defined on R_i(L ∩ U_i ∩ V_j). By the change of variables formula, Theorem 9.9.10, and Proposition 11.1.7, this equals

Σ_{i=1}^p Σ_{j=1}^q Σ_I ∫_{int S_j(U_i∩V_j\L)} η_j( S_j⁻¹(v) ) ψ_i( S_j⁻¹(v) ) a_I( S_j⁻¹(v) ) ∂(x_{i₁} ··· x_{i_n})/∂(v₁ ··· v_n) dv

= Σ_{i=1}^p Σ_{j=1}^q Σ_I ∫_{S_j(U_i∩V_j)} η_j( S_j⁻¹(v) ) ψ_i( S_j⁻¹(v) ) a_I( S_j⁻¹(v) ) ∂(x_{i₁} ··· x_{i_n})/∂(v₁ ··· v_n) dv

= Σ_{j=1}^q Σ_I ∫_{S_j(V_j)} η_j( S_j⁻¹(v) ) a_I( S_j⁻¹(v) ) ∂(x_{i₁} ··· x_{i_n})/∂(v₁ ··· v_n) dv,

which is the definition of ∫_Ω ω using the other atlas {(V_j, S_j)} and partition of unity. This proves the theorem. ■
¹This is so that issues of existence for the various integrals will not arise. This is leading to Stoke's theorem, in which even more will be assumed on a_I.
11.6. STOKE’S THEOREM AND THE ORIENTATION OF ∂Ω 307
Σ_I Σ_{k=1}^m a_I(x) [∂/∂x_k (Σ_{j=1}^p ψ_j)] dx_k ∧ dx_{i_1} ∧ ··· ∧ dx_{i_{n−1}}    (the inner sum equals 1)

+ Σ_I Σ_{k=1}^m (Σ_{j=1}^p ψ_j(x)) ∂a_I/∂x_k dx_k ∧ dx_{i_1} ∧ ··· ∧ dx_{i_{n−1}}

= Σ_I Σ_{k=1}^m ∂a_I/∂x_k (x) dx_k ∧ dx_{i_1} ∧ ··· ∧ dx_{i_{n−1}} ≡ dω

The first sum vanishes because Σ_{j=1}^p ψ_j = 1, so its partial derivatives are zero.
It follows

∫_Ω dω = Σ_I Σ_{k=1}^m Σ_{j=1}^p ∫_{R_j(U_j)} ∂(ψ_j a_I)/∂x_k (R_j^{−1}(u)) ∂(x_k, x_{i_1} ··· x_{i_{n−1}})/∂(u_1, ···, u_n) du

= Σ_I Σ_{k=1}^m Σ_{j=1}^p ∫_{R_j(U_j)} ∂(ψ_j a_I)/∂x_k (R_{jε}^{−1}(u)) ∂(x_{kε}, x_{i_1 ε} ··· x_{i_{n−1} ε})/∂(u_1, ···, u_n) du

+ Σ_I Σ_{k=1}^m Σ_{j=1}^p ∫_{R_j(U_j)} ∂(ψ_j a_I)/∂x_k (R_j^{−1}(u)) ∂(x_k, x_{i_1} ··· x_{i_{n−1}})/∂(u_1, ···, u_n) du

− Σ_I Σ_{k=1}^m Σ_{j=1}^p ∫_{R_j(U_j)} ∂(ψ_j a_I)/∂x_k (R_{jε}^{−1}(u)) ∂(x_{kε}, x_{i_1 ε} ··· x_{i_{n−1} ε})/∂(u_1, ···, u_n) du    (11.16)
Here

R_{jε}^{−1}(u) ≡ R_j^{−1} ∗ φ_ε(u)

for φ_ε a mollifier, and x_{iε} is the iᵗʰ component mollified. Thus by Lemma 9.5.7, this function with the subscript ε is infinitely differentiable. The last two expressions in 11.16 sum to e(ε), which converges to 0 as ε → 0. Here is why.

∂(ψ_j a_I)/∂x_k (R_{jε}^{−1}(u)) → ∂(ψ_j a_I)/∂x_k (R_j^{−1}(u)) a.e.
(Note l goes up to n, not m.) Recall R_j(U_j) is relatively open in R^n_≤. Consider the integral where l > 1. Integrate first with respect to u_l. In this case the boundary term vanishes because of ψ_j and you get

− ∫_{R_j(U_j)} A_{1l,l} (ψ_j a_I) ∘ R_{jε}^{−1}(u) du    (11.18)
Next consider the case where l = 1. Integrating first with respect to u_1, the term reduces to

∫_{R_j V_j} (ψ_j a_I) ∘ R_{jε}^{−1}(0, u_2, ···, u_n) A_{11} du_1 − ∫_{R_j(U_j)} A_{11,1} (ψ_j a_I) ∘ R_{jε}^{−1}(u) du    (11.19)

where du_1 represents du_2 du_3 ··· du_n on R_j V_j for short. Thus V_j is just the part of ∂Ω which is in U_j, and the mappings S_j^{−1}, given on R_j V_j = R_j(U_j ∩ ∂Ω) by

S_j^{−1}(u_2, ···, u_n) ≡ R_j^{−1}(0, u_2, ···, u_n),

are such that {(S_j, V_j)} is an atlas for ∂Ω. Then if 11.18 and 11.19 are placed in 11.17, it follows from Lemma 11.5.1 that 11.17 reduces to

Σ_I Σ_{j=1}^p ∫_{R_j V_j} (ψ_j a_I) ∘ R_{jε}^{−1}(0, u_2, ···, u_n) A_{11} du_1 + e(ε)
Now as before, each ∂x_{sε}/∂u_r converges pointwise a.e. to ∂x_s/∂u_r off R_j(V_j ∩ L), assumed to be a set of measure zero, and the integrands are bounded. Using the Vitali convergence theorem again, pass to a limit as ε → 0 to obtain

Σ_I Σ_{j=1}^p ∫_{R_j V_j} (ψ_j a_I) ∘ R_j^{−1}(0, u_2, ···, u_n) A_{11} du_1

= Σ_I Σ_{j=1}^p ∫_{S_j V_j} (ψ_j a_I) ∘ S_j^{−1}(u_2, ···, u_n) A_{11} du_1

= Σ_I Σ_{j=1}^p ∫_{S_j V_j} (ψ_j a_I) ∘ S_j^{−1}(u_2, ···, u_n) ∂(x_{i_1} ··· x_{i_{n−1}})/∂(u_2, ···, u_n) (0, u_2, ···, u_n) du_1    (11.20)
This of course is the definition of ∫_{∂Ω} ω provided {(S_j, V_j)} is an oriented atlas. Note the integral is well defined because of the assumption that R_i(L ∩ U_i ∩ ∂Ω) has m_{n−1} measure zero. That ∂Ω is orientable and that this atlas is an oriented atlas is shown next. I will write u_1 ≡ (u_2, ···, u_n).
What if spt a_I ⊆ K ⊆ U_i ∩ U_j for each I? Then using Lemma 11.5.3 it follows that

∫ dω = Σ_I ∫_{S_j(V_i∩V_j)} a_I ∘ S_j^{−1}(u_2, ···, u_n) ∂(x_{i_1} ··· x_{i_{n−1}})/∂(u_2, ···, u_n) (0, u_2, ···, u_n) du_1

This is done by using a partition of unity which has the property that ψ_j equals 1 on K, which forces all the other ψ_k to equal zero there. Using the same trick involving a judicious choice of the partition of unity, ∫ dω is also equal to

Σ_I ∫_{S_i(V_i∩V_j)} a_I ∘ S_i^{−1}(v_2, ···, v_n) ∂(x_{i_1} ··· x_{i_{n−1}})/∂(v_2, ···, v_n) (0, v_2, ···, v_n) dv_1

Since S_i(L ∩ U_i), S_j(L ∩ U_j) have measure zero, the above integrals may be taken over

S_j(V_i ∩ V_j \ L), S_i(V_i ∩ V_j \ L)
respectively. Also these are equal, both being ∫ dω. To simplify the notation, let π_I denote the projection onto the components corresponding to I. Thus if I = (i_1, ···, i_n),

π_I x ≡ (x_{i_1}, ···, x_{i_n}).

Then the second integral above equals

Σ_I ∫_{S_i(V_i∩V_j\L)} a_I ∘ S_i^{−1}(v_1) det Dπ_I S_i^{−1}(v_1) dv_1

and both equal ∫ dω. Thus, using the change of variables formula, Theorem 9.9.10, it follows the second of these equals

Σ_I ∫_{S_j(V_i∩V_j\L)} a_I ∘ S_j^{−1}(u_1) det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)) |det D(S_i ∘ S_j^{−1})(u_1)| du_1    (11.21)
I want to argue det D(S_i ∘ S_j^{−1})(u_1) ≥ 0. Let A be the open subset of S_j(V_i ∩ V_j \ L) on which, for δ > 0,

det D(S_i ∘ S_j^{−1})(u_1) < −δ    (11.22)

I want to show A = ∅, so assume A is nonempty. If this is the case, we could consider an open ball contained in A. To simplify notation, assume A is an open ball. Letting f_I be a smooth function which vanishes off a compact subset of S_j^{−1}(A), the above argument and the chain rule imply

Σ_I ∫_{S_j(V_i∩V_j\L)} f_I ∘ S_j^{−1}(u_1) det Dπ_I S_j^{−1}(u_1) du_1
= Σ_I ∫_A f_I ∘ S_j^{−1}(u_1) det Dπ_I S_j^{−1}(u_1) du_1
= Σ_I ∫_A f_I ∘ S_j^{−1}(u_1) det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)) det D(S_i ∘ S_j^{−1})(u_1) du_1

and consequently

0 = 2 Σ_I ∫_A f_I ∘ S_j^{−1}(u_1) det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)) det D(S_i ∘ S_j^{−1})(u_1) du_1

Now for each I, let {f_I^k ∘ S_j^{−1}}_{k=1}^∞ be a sequence of bounded functions having compact support in A which converge pointwise to det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)). Then it follows from the Vitali convergence theorem that one can pass to the limit and obtain

0 = 2 ∫_A Σ_I (det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)))² det D(S_i ∘ S_j^{−1})(u_1) du_1
≤ −2δ ∫_A Σ_I (det Dπ_I S_i^{−1}(S_i ∘ S_j^{−1}(u_1)))² du_1

It follows that for a.e. v_1 ∈ S_i ∘ S_j^{−1}(A) and each I,

det Dπ_I S_i^{−1}(v_1) ≡ 0    (11.23)

However, for a.e. such v_1 there is some I with

Dπ_I S_i^{−1}(v_1) ≠ 0,

contradicting 11.23. Thus A = ∅, and since δ > 0 was arbitrary, this shows

det D(S_i ∘ S_j^{−1})(u_1) ≥ 0.

11.7. GREEN'S THEOREM, AN EXAMPLE 311
where the two integrals are taken with respect to the given oriented atlases.
g (u2 , · · · , un ) < b for (u2 , · · · , un ) ∈ B. Also g vanishes outside some compact set in
Rn−1 and g is continuous.
Note that finitely many of these sets Q cover ∂Ω because ∂Ω is compact. Assume there
exists a closed subset of ∂Ω, L such that the closed set SQ defined by
has m_{n−1} measure zero, g ∈ C¹(B \ S_Q), and all the partial derivatives of g are uniformly bounded on B \ S_Q. The following picture describes the situation; the pointy places symbolize the set L.

[Figure: the orthogonal map R carries the piece W of Ω onto R(Q); the u_1 coordinate runs between a and b.]
Define P_1 : R^n → R^{n−1} by

P_1 u ≡ (u_2, ···, u_n)

and Σ : R^n → R^n by

Σu ≡ u − g(P_1 u)e_1 ≡ (u_1 − g(u_2, ···, u_n), u_2, ···, u_n)

Thus Σ is invertible and

Σ^{−1}u = u + g(P_1 u)e_1 ≡ (u_1 + g(u_2, ···, u_n), u_2, ···, u_n)

For x ∈ ∂Ω ∩ Q, it follows the first component of Rx is g(P_1(Rx)). Now define R : W → R^n_≤ as

u ≡ Rx ≡ Rx − g(P_1(Rx))e_1 ≡ ΣRx

and so it follows

R^{−1} = R*Σ^{−1}.
These mappings R involve first a rotation followed by a variable shear in the direction of the u_1 axis. From the above description, R(L ∩ Q) = 0 × S_Q, a set of m_{n−1} measure zero. This is because
(u2 , · · · , un ) ∈ SQ
if and only if
(g (u2 , · · · , un ) , u2 , · · · , un ) ∈ R (L ∩ Q)
if and only if
(0, u2 , · · · , un ) ≡ Σ (g (u2 , · · · , un ) , u2 , · · · , un )
∈ ΣR (L ∩ Q) ≡ R (L ∩ Q) .
Since ∂Ω is compact, there are finitely many of these open sets, Q1 , · · · , Qp which
cover ∂Ω. Let the orthogonal transformations and other quantities described above
However,

det(DΣ_j) = 1 = det(DΣ_j^{−1})

and det(R_i) = det(R_i*) = 1 by assumption. Therefore, for a.e. u ∈ R_j ∘ R_i^{−1}(A),

det D(R_j ∘ R_i^{−1})(u) > 0.
By Proposition 11.1.7 Ω is indeed an oriented manifold with the given atlas.
Proof: From the definition and using the usual technique of ignoring the exceptional set of measure zero,

∫_Ω dω ≡ Σ_{i=1}^p Σ_{k=1}^n ∫_{R_i W_i} (−1)^{k−1} ψ_i(R_i^{−1}(u)) ∂a_k(R_i^{−1}(u))/∂x_k · ∂(x_1 ··· x_n)/∂(u_1 ··· u_n) du

Now from the above description of R_i^{−1}, the determinant in the above integrand equals 1. Therefore, the change of variables theorem applies, and the above reduces to

Σ_{i=1}^p Σ_{k=1}^n ∫_{W_i} (−1)^{k−1} ∂a_k(x)/∂x_k ψ_i(x) dx = ∫_Ω Σ_{i=1}^p Σ_{k=1}^n (−1)^{k−1} ∂a_k(x)/∂x_k ψ_i(x) dm_n

= ∫_Ω Σ_{k=1}^n (−1)^{k−1} ∂a_k(x)/∂x_k dm_n
This follows because the differential form on the left is of the form

P dx̂ ∧ dy + Q dx ∧ dŷ,

the hat indicating an omitted factor.

[Figure: W is carried by the orthogonal map R onto R(W), then by the shear Σ onto R(W); the vertical arrow indicates the direction of increasing u_2.]
11.8. THE DIVERGENCE THEOREM 315
The vertical arrow at the end indicates the direction of increasing u2 . The vertical side
of R (W ) shown there corresponds to the curved side in R (W ) which corresponds to
the part of ∂Ω which is selected by Q as shown in the picture. Here R is an orthogonal
transformation which has determinant equal to 1. Now the shear which goes from the
diagram on the right to the one on its left preserves the direction of motion relative to
the surface the curve is bounding. This is geometrically clear. Similarly, the orthogonal
transformation R∗ which goes from the curved part of the boundary of R (W ) to the
corresponding part of ∂Ω preserves the direction of motion relative to the surface. This
is because orthogonal transformations in R2 whose determinants are 1 correspond to
rotations. Thus increasing u2 corresponds to counter clockwise motion around R(W )
along the vertical side of R(W ) which corresponds to counter clockwise motion around
R(W ) along the curved side of R(W ) which corresponds to counter clockwise motion
around Ω in the sense that the direction of motion along the curve is always such that
if you were walking in this direction, your left hand would be over the surface. In other
words this agrees with the usual calculus conventions.
The assertion between the first and second lines follows right away from properties of determinants and the definition of the integral of the above wedge products in terms of determinants. From Green's theorem and the change of variables formula applied to the individual terms in the description of ∫_Ω dω,

∫_Ω div(F) dx = Σ_{j=1}^p Σ_{k=1}^n ∫_{B_j} (−1)^{k−1} (ψ_j F_k) ∘ R_j^{−1}(0, u_2, ···, u_n) ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n) du_1,
the above with ψ_s F. Next let {η_j} be a partition of unity, η_j ≺ Q_j, such that η_s = 1 on spt ψ_s. This partition of unity exists by Lemma 11.5.3. Then

∫_Ω div(ψ_s F) dx = Σ_{j=1}^p Σ_{k=1}^n ∫_{B_j} (−1)^{k−1} (η_j ψ_s F_k) ∘ R_j^{−1}(0, u_2, ···, u_n) ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n) du_1

= Σ_{k=1}^n ∫_{B_s} (−1)^{k−1} (ψ_s F_k) ∘ R_s^{−1}(0, u_2, ···, u_n) ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n) du_1    (11.25)

because η_s = 1 on spt ψ_s, so all the other η_j equal zero there.
Consider the vector N defined for u_1 ∈ R_s(W_s \ L) ∩ R^n_0 whose kᵗʰ component is

N_k = (−1)^{k−1} ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n) = (−1)^{k+1} ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n)    (11.26)

Suppose you dot this vector with a tangent vector ∂R_s^{−1}/∂u_i. This yields

Σ_k (−1)^{k+1} ∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n) · ∂x_k/∂u_i = 0,

a determinant with two equal columns, provided i ≥ 2. Thus this vector is at least in some sense normal to ∂Ω. If i = 1, then the above dot product is just

∂(x_1 ··· x_n)/∂(u_1 ··· u_n) = 1
This vector is called an exterior normal. The important thing is the existence of the vector, but does it deserve to be called an exterior normal? Consider the following picture of R_s(W_s).

[Figure: R_s(W_s) lies in the half space u_1 ≤ 0, with e_1 pointing in the direction of increasing u_1 and the coordinates (u_2, ···, u_n) along the boundary plane u_1 = 0.]
We got this by first doing a rotation of a piece of Ω and then a shear in the direction of e_1. Also it was shown above that

R_s^{−1}(u) = R_s* (u_1 + g(u_2, ···, u_n), u_2, ···, u_n)^T

Hence if θ is the angle between N and (x(u − he_1) − x(u))/h, it must be the case that θ > π/2. However,

(x(u − he_1) − x(u))/h

points into Ω for small h > 0, because x(u − he_1) ∈ W_s while x(u) is on the boundary of W_s. Therefore, N should be pointing away from Ω, at least at the points where u → x(u) is differentiable. Thus it is geometrically reasonable to use the word exterior for this vector N.
One could normalize N given in 11.26 by dividing by its magnitude. Then it would be the unit exterior normal n. The norm of this vector is

(Σ_{k=1}^n (∂(x_1, ··· x̂_k ···, x_n)/∂(u_2, ···, u_n))²)^{1/2}
The integrand

u_1 → (ψ_s F) ∘ R_s^{−1}(u_1) · n(R_s^{−1}(u_1))

is Borel measurable and bounded. Writing it as a sum of positive and negative parts and using Theorem 7.7.12, there exists a sequence of bounded simple functions {s_k} which converges pointwise a.e. to this function. Also the resulting integrands are uniformly integrable. Then by the Vitali convergence theorem and Theorem 11.4.2 applied to these approximations,

∫_{B_s} (ψ_s F) ∘ R_s^{−1}(u_1) · n(R_s^{−1}(u_1)) J(u_1) du_1
= lim_{k→∞} ∫_{B_s} s_k(R_s(R_s^{−1}(u_1))) J(u_1) du_1
= lim_{k→∞} ∫_{W_s∩∂Ω} s_k(R_s(x)) dσ_{n−1}
= ∫_{W_s∩∂Ω} ψ_s(x) F(x) · n(x) dσ_{n−1}
= ∫_{∂Ω} ψ_s(x) F(x) · n(x) dσ_{n−1}

Recall the exceptional set on ∂Ω has σ_{n−1} measure zero. Upon summing over all s, using that the ψ_s add to 1,

Σ_s ∫_{∂Ω} ψ_s(x) F(x) · n(x) dσ_{n−1} = ∫_{∂Ω} F(x) · n(x) dσ_{n−1}
On the other hand, from 11.25, the left side of the above equals

Σ_s ∫_Ω div(ψ_s F) dx = Σ_s ∫_Ω Σ_{i=1}^n (ψ_{s,i} F_i + ψ_s F_{i,i}) dx

= ∫_Ω div(F) dx + ∫_Ω Σ_{i=1}^n (Σ_s ψ_s)_{,i} F_i dx

= ∫_Ω div(F) dx ¥
This proves the following general divergence theorem.
Theorem 11.8.1 Let Ω be a bounded open set having P C¹ boundary as described above. Also let F be a vector field with the property that, for F_k a component function of F, F_k ∈ C¹(Ω̄). Then there exists an exterior normal vector n, defined σ_{n−1} a.e. (off the exceptional set L) on ∂Ω, such that

∫_{∂Ω} F · n dσ_{n−1} = ∫_Ω div(F) dx
It is worth noting that everything above will work if you relax the requirement in P C¹ that the partial derivatives be bounded off an exceptional set. Instead, it would suffice to say that for some p > n, all integrals of the form

∫_{R_i(U_i)} |∂x_k/∂u_j|^p du

are bounded. Here x_k is the kᵗʰ component of R_i^{−1}. This is because this condition will suffice to use the Vitali convergence theorem. This would have required more work to show, however, so I have not included it. This is also a reason for featuring the Vitali convergence theorem rather than the dominated convergence theorem, which could have been used in many of the steps in the above presentation.
All of the above can be done more elegantly and in greater generality if you have
Rademacher’s theorem which gives the almost everywhere differentiability of Lipschitz
functions. In fact, some of the details become a little easier. However, this approach
requires more real analysis than I want to include in this book, but the main ideas are
all the same. You convolve with a mollifier and then do the hard computations with
the mollified function exploiting equality of mixed partial derivatives and then pass to
the limit.
[Figure: the wedge W_I(E) between two concentric spheres centered at 0.]
11.9. SPHERICAL COORDINATES 319
Definition 11.9.1 The symbol W_I(E) represents the piece of a wedge between two concentric spheres such that the points x ∈ W_I(E) have the property that x/|x| ∈ E, a subset of the unit sphere S^{n−1} in R^n, and |x| ∈ I, an interval on the real line which does not contain 0.
Now here are some technical results which are interesting for their own sake. The
first gives the existence of a countable basis for Rn . This is a countable set of open sets
which has the property that every open set is the union of these special open sets.
Lemma 11.9.2 Let B denote the countable set of all balls in Rn which have centers
x ∈ Qn and rational radii. Then every open set is the union of sets of B.
Proof: Let U be an open set and let y ∈ U. Then B (y, R) ⊆ U for some R > 0.
Now by density of Qn in Rn , there exists x ∈ B (y, R/10) ∩ Qn . Now let r ∈ Q and
satisfy R/10 < r < R/3. Then y ∈ B (x, r) ⊆ B (y, R) ⊆ U. This proves the lemma.
With the above countable basis, the following theorem is very easy to obtain. It is
called the Lindelöf property.
Theorem 11.9.3 Let C be any collection of open sets and let U = ∪C. Then
there exist countably many sets of C whose union is also equal to U .
Proof: Let B0 denote those sets of B in Lemma 11.9.2 which are contained in some
set of C. By this lemma, it follows ∪B 0 = U . Now use axiom of choice to select for each
B0 a single set of C containing it. Denote the resulting countable collection C 0 . Then
U = ∪B0 ⊆ ∪C 0 ⊆ U
This proves the theorem.
Now consider all the open subsets of Rn \ {0} . If U is any such open set, it is clear
that if y ∈ U, then there exists a set open in S n−1 , E and an open interval I such that
y ∈ WI (E) ⊆ U. It follows from Theorem 11.9.3 that every open set which does not
contain 0 is the countable union of the sets of the form WI (E) for E open in S n−1 .
The divergence theorem and Green’s theorem hold for sets WI (E) whenever E is
the intersection of S n−1 with a finite intersection of balls. This is because the resulting
set has P C 1 boundary. Therefore, from the divergence theorem and letting I = (0, 1)
∫_{W_I(E)} div(x) dx = ∫_E x · (x/|x|) dσ + ∫_{straight part} x · n dσ

where I am going to denote by σ the measure on S^{n−1} which corresponds to the divergence theorem and other theorems given above. On the straight parts of the boundary of W_I(E), the vector field x is parallel to the surface while n is perpendicular to it, all this off a set of measure zero of course. Therefore, the integrand vanishes and the above reduces to

n m_n(W_I(E)) = σ(E)
Now let G denote those Borel sets of S n−1 such that the above holds for I = (0, 1) ,
both sides making sense because both E and WI (E) are Borel sets in S n−1 and Rn \{0}
respectively. Then G contains the π system of sets which are the finite intersection of
balls with S^{n−1}. Also if {E_i} are disjoint sets in G, then W_I(∪_{i=1}^∞ E_i) = ∪_{i=1}^∞ W_I(E_i), and so

n m_n(W_I(∪_{i=1}^∞ E_i)) = n m_n(∪_{i=1}^∞ W_I(E_i)) = n Σ_{i=1}^∞ m_n(W_I(E_i)) = Σ_{i=1}^∞ σ(E_i) = σ(∪_{i=1}^∞ E_i)
and so G is closed with respect to countable disjoint unions. Next let E ∈ G. Then

n m_n(W_I(E^C)) + n m_n(W_I(E)) = n m_n(W_I(S^{n−1})) = σ(S^{n−1}) = σ(E) + σ(E^C)

Now subtracting the equal quantities n m_n(W_I(E)) and σ(E) from both sides yields
E C ∈ G also. Therefore, by the Lemma on π systems Lemma 9.1.2, it follows G contains
the σ algebra generated by these special sets E the intersection of finitely many open
balls with S n−1 . Therefore, since any open set is the countable union of balls, it follows
the sets open in S n−1 are contained in this σ algebra. Hence G equals the Borel sets.
This has proved the following important theorem.
Theorem 11.9.4 Let σ be the Borel measure on S^{n−1} which goes with the divergence theorem and other theorems like Green's and Stoke's theorem. Then for all E Borel,

σ(E) = n m_n(W_I(E))

where I = (0, 1). Furthermore, W_I(E) is Borel for any interval I. Also

m_n(W_{[a,b]}(E)) = m_n(W_{(a,b)}(E)) = (b^n − a^n) m_n(W_{(0,1)}(E))
Proof: To show W_I(E) is Borel for any I, first suppose I is open of the form (0, r). Then

W_I(E) = r W_{(0,1)}(E)

and the mapping x → rx is continuous with continuous inverse, so it maps Borel sets to Borel sets. If I = (0, r],

W_I(E) = ∩_{n=1}^∞ W_{(0, r + 1/n)}(E)

and so it is Borel.

W_{[a,b]}(E) = W_{(0,b]}(E) \ W_{(0,a)}(E)
so this is also Borel. Similarly W(a,b] (E) is Borel. The last assertion is obvious and
follows from the change of variables formula. This proves the theorem.
Now with this preparation, it is possible to discuss polar coordinates (spherical
coordinates) a different way than before.
Note that if ρ = |x| and ω ≡ x/|x|, then x = ρω. Also the map which takes (0, ∞) × S^{n−1} to R^n \ {0}, given by (ρ, ω) → ρω = x, is one to one, onto, and continuous. In addition to this, it follows right away from the definition that if I is any interval and E ⊆ S^{n−1},

X_{W_I(E)}(ρω) = X_E(ω) X_I(ρ)

Thus for I = (a, b),

∫_0^∞ ρ^{n−1} ∫_{S^{n−1}} X_{W_I(E)}(ρω) dσ dρ = ∫_a^b ρ^{n−1} σ(E) dρ = ∫_a^b ρ^{n−1} n m_n(W_{(0,1)}(E)) dρ = (b^n − a^n) m_n(W_{(0,1)}(E))

and by Theorem 11.9.4, this equals m_n(W_I(E)). If I is an interval which contains 0, the above conclusion still holds because both sides are unchanged if 0 is included on the left and ρ = 0 is included on the right. In particular, the conclusion holds for B(0, r) in place of F.
Now let G be those Borel sets F such that the desired conclusion holds for F ∩ B(0, M). This contains the π system of sets of the form W_I(E) and is closed with respect to countable unions of disjoint sets and complements. Therefore, it equals the Borel sets. Thus
m_n(F ∩ B(0, M)) = ∫_0^∞ ∫_{S^{n−1}} ρ^{n−1} X_{F∩B(0,M)}(ρω) dσ dρ

Now let M → ∞ and use the monotone convergence theorem. This proves the lemma.
The lemma implies right away that for s a simple function,

∫_{R^n} s dm_n = ∫_0^∞ ∫_{S^{n−1}} ρ^{n−1} s(ρω) dσ dρ
11.10 Exercises
1. Let

ω(x) ≡ Σ_I a_I(x) dx_I

be a differential form where x ∈ R^m and the I are increasing lists of n indices taken from 1, ···, m. Also assume each a_I(x) has the property that all mixed partial derivatives are equal. For example, from Corollary 6.9.2 this happens if the function is C². Show that under this condition, d(d(ω)) = 0. To show this, first explain why

dx_i ∧ dx_j ∧ dx_I = −dx_j ∧ dx_i ∧ dx_I
When you integrate one you get −1 times the integral of the other. This is the sense in which the above formula holds. When you have a differential form ω with the property that dω = 0, it is called a closed form. If ω = dα, then ω is called exact. Thus every exact form is closed, provided you have sufficient smoothness on the coefficients of the differential form.
2. Recall that in the definition of area measure, you use

J(u) = det(DR^{−1}(u)* DR^{−1}(u))^{1/2}

Now in the special case of the manifold of Green's theorem, where

R^{−1}(u_2, ···, u_n) = R*(g(u_2, ···, u_n), u_2, ···, u_n),

show

J(u) = √(1 + (∂g/∂u_2)² + ··· + (∂g/∂u_n)²)
3. Let u_1, ···, u_p be vectors in R^n. Show det M ≥ 0 where M_{ij} ≡ u_i · u_j. Hint: Show this matrix has all nonnegative eigenvalues and then use the theorem which says the determinant is the product of the eigenvalues. This matrix is called the Grammian matrix. The details follow from noting that M is of the form

U*U ≡ (u_1* ; ··· ; u_p*) (u_1 ··· u_p)

(the rows of the first factor are the u_i*), and then showing that U*U has all nonnegative eigenvalues.
4. Suppose {v_1, ···, v_n} are n vectors in R^m for m ≥ n. Show that the only appropriate definition of the volume of the n dimensional parallelepiped determined by these vectors,

{ Σ_{j=1}^n s_j v_j : s_j ∈ [0, 1] },

is

det(M*M)^{1/2}
where M is the m × n matrix which has columns v1 , · · · , vn . Hint: Show this is
clearly true if n = 1 because the above just yields the usual length of the vector.
Now suppose the formula gives the right thing for n − 1 vectors and argue it gives
the right thing for n vectors. In doing this, you might want to show that a vector
which is perpendicular to the span of v1 , · · · , vn−1 is
det ( u_1 u_2 ··· u_n ; v_{11} v_{12} ··· v_{1n} ; ⋮ ; v_{n−1,1} v_{n−1,2} ··· v_{n−1,n} )
where {u1 , · · · , un } is an orthonormal basis for span (v1 , · · · , vn ) and vij is the
j th component of vi with respect to this orthonormal basis. Then argue that
if you replace the top line with vn1 , · · · , vnn , the absolute value of the resulting
determinant is the appropriate definition of the volume of the parallelepiped. Next
note you could get this number by taking the determinant of the transpose of the
above matrix times that matrix and then take a square root. After this, identify
this product with a Grammian matrix and then the desired result follows.
5. Why is the definition of area on a manifold given above reasonable, and what is its geometric meaning? Each function

u_i → R^{−1}(u_1, ···, u_n)

yields a curve which lies in Ω. Thus R^{−1}_{,u_i} is a vector tangent to this curve and R^{−1}_{,u_i} du_i is an "infinitesimal" vector tangent to the curve. Now use the previous problem to see that when you find the area of a set on Ω, you are essentially summing the volumes of infinitesimal parallelepipeds which are "tangent" to Ω.
Here

∂u/∂n ≡ ∇u · n

where n is the unit outer normal described above. Establish this formula, which is known as Green's identity. Hint: You might establish the following easy identity.
This material is mostly in the book by Evans [14] which is where I got it. It is really
partial differential equations but it is such a nice illustration of the divergence theorem
and other advanced calculus theorems, that I am including it here even if it is somewhat
out of place and would normally be encountered in a partial differential equations course.
12.1 Balls
Recall, B (x, r) denotes the set of all y ∈ Rn such that |y − x| < r. By the change of
variables formula for multiple integrals or simple geometric reasoning, all balls of radius
r have the same volume. Furthermore, simple reasoning or change of variables formula
will show that the volume of the ball of radius r equals αn rn where αn will denote the
volume of the unit ball in Rn . With the divergence theorem, it is now easy to give a
simple relationship between the surface area of the ball of radius r and the volume. By
the divergence theorem,

∫_{B(0,r)} div x dx = ∫_{∂B(0,r)} x · (x/|x|) dσ_{n−1}

because the unit outward normal on ∂B(0, r) is x/|x|. Therefore, since div x = n and x · x/|x| = |x| = r on ∂B(0, r),

n α_n r^n = r σ_{n−1}(∂B(0, r))

and so

σ_{n−1}(∂B(0, r)) = n α_n r^{n−1}.
You recall the surface area of the sphere {x ∈ R³ : |x| = r} is 4πr², while the volume of the ball B(0, r) is (4/3)πr³. This follows the above pattern: you just take the derivative with respect to the radius of the volume of the ball of radius r to get the area of the surface of this ball. Let ω_n denote the area of the sphere S^{n−1} = {x ∈ R^n : |x| = 1}. I just showed that

ω_n = n α_n.    (12.1)
I want to find αn now and also to get a relationship between ω n and ω n−1 . Consider
the following picture of the ball of radius ρ seen on the side.
326 THE LAPLACE AND POISSON EQUATIONS
[Figure: the ball of radius ρ seen from the side; a horizontal slice at height y is an (n−1)-ball of radius r = √(ρ² − y²).]

Taking slices at height y as shown and using that these slices have n − 1 dimensional volume equal to α_{n−1} r^{n−1}, it follows from Fubini's theorem that

α_n ρ^n = 2 α_{n−1} ∫_0^ρ (ρ² − y²)^{(n−1)/2} dy    (12.2)
Lemma 12.1.1 Γ(1/2) = √π.

Proof:

Γ(1/2) ≡ ∫_0^∞ e^{−t} t^{−1/2} dt

Now change the variables, letting t = s², so dt = 2s ds, and the integral becomes

2 ∫_0^∞ e^{−s²} ds = ∫_{−∞}^∞ e^{−s²} ds

Thus Γ(1/2) = ∫_{−∞}^∞ e^{−x²} dx, so Γ(1/2)² = ∫_{−∞}^∞ ∫_{−∞}^∞ e^{−(x²+y²)} dx dy, and by polar coordinates this equals π. ¥

For the base case of the induction,

α_1 = 2 = π^{1/2}/((1/2)Γ(1/2)) = π^{1/2}/Γ(1/2 + 1)

from the above lemma. Now suppose the theorem is true for n. Then letting ρ = 1,
12.2 implies

α_{n+1} = 2 (π^{n/2}/Γ(n/2 + 1)) ∫_0^1 (1 − y²)^{n/2} dy

At this point, use the result of Problem 8 on Page 216 to simplify the messy integral, which equals the beta function. Thus the above equals

π^{n/2} Γ(1/2) Γ((n+2)/2) / (Γ(n/2 + 1) Γ((n+2)/2 + 1/2)) = π^{n/2} π^{1/2} / Γ(n/2 + 3/2) = π^{(n+1)/2} / Γ((n+1)/2 + 1)

and this gives the correct formula for α_{n+1}. This proves the theorem. ¥

12.2 Poisson's Problem
∆u = f, in U, u = g on ∂U . (12.3)
Here U is an open bounded set for which the divergence theorem holds. For example, it could be a P C¹ manifold. When f = 0 this is called Laplace's equation, and the boundary condition given is called a Dirichlet boundary condition. When ∆u = 0, the function u is said to be harmonic. When f ≠ 0, it is called Poisson's equation. I will give a way of representing the solution to these problems. When this has been done, great and marvelous conclusions may be drawn about the solutions. Before doing anything else however, it is wise to prove a fundamental result called the weak maximum principle.
and

∆u ≥ 0 in U.

Then

max{u(x) : x ∈ Ū} = max{u(x) : x ∈ ∂U}.

u(x_0) ≤ u(x_1),

contrary to the property of x_0 above. It follows that my claim is verified. Pick such an ε. Then w_ε assumes its maximum value in U, say at x_2. Then by the second derivative test,

∆w_ε(x_2) = ∆u(x_2) + 2ε ≤ 0,

which requires ∆u(x_2) ≤ −2ε, contrary to the assumption that ∆u ≥ 0. This proves the theorem. ¥
The theorem makes it very easy to verify the following uniqueness result.
and
∆u = 0 in U, u = 0 on ∂U.
Then u = 0.
Proof: From the weak maximum principle, u ≤ 0. Now apply the weak maximum
principle to −u which satisfies the same conditions as u. Thus −u ≤ 0 and so u ≥ 0.
Therefore, u = 0 as claimed. This proves the corollary. ¥
Define

r_n(x) ≡ ln|x| if n = 2,    r_n(x) ≡ 1/|x|^{n−2} if n > 2.
Proof: I will verify the case where n ≥ 3 and leave the other case for you.

D_{x_i} (Σ_{i=1}^n x_i²)^{−(n−2)/2} = −(n − 2) x_i (Σ_j x_j²)^{−n/2}

Therefore,

D_{x_i}(D_{x_i}(r_n)) = (Σ_j x_j²)^{−(n+2)/2} (n − 2) (n x_i² − Σ_{j=1}^n x_j²).

It follows

∆r_n = (Σ_j x_j²)^{−(n+2)/2} (n − 2) (n Σ_{i=1}^n x_i² − Σ_{i=1}^n Σ_{j=1}^n x_j²) = 0.
[Figure: U_ε, the region U with the small ball B(x, ε) = B_ε centered at x ∈ U removed.]
Then the divergence theorem will continue to hold for U_ε (why?) and so I can use Green's identity, Problem 6 on Page 323, to write the following for u, v ∈ C²(Ū):

∫_{U_ε} (u∆v − v∆u) dx = ∫_{∂U} (u ∂v/∂n − v ∂u/∂n) dσ − ∫_{∂B_ε} (u ∂v/∂n − v ∂u/∂n) dσ    (12.4)
ψ_x(y) = r_n(y − x)

so that 12.5 vanishes for y ∈ ∂U, and ψ_x is in C²(Ū) and also satisfies

∆ψ_x = 0.

The existence of such a function is another issue. For now, assume such a function exists.¹ Then, assuming such a function exists, 12.4 reduces to

− ∫_{U_ε} v∆u dx = ∫_{∂U} u ∂v/∂n dσ − ∫_{∂B_ε} (u ∂v/∂n − v ∂u/∂n) dσ.    (12.6)
The idea now is to let ε → 0 and see what happens. Consider the term

∫_{∂B_ε} v ∂u/∂n dσ.

The area is O(ε^{n−1}), while the integrand is O(ε^{−(n−2)}) in the case where n ≥ 3. In the case where n = 2, the area is O(ε) and the integrand is O(|ln|ε||). Now you know that lim_{ε→0} ε ln|ε| = 0, and so in the case n = 2, this term converges to 0 as ε → 0. In the case that n ≥ 3, it also converges to zero, because in this case the integral is O(ε).
Next consider the term

− ∫_{∂B_ε} u ∂v/∂n dσ = − ∫_{∂B_ε} u(y) (∂r_n/∂n (y − x) − ∂ψ_x/∂n (y)) dσ.
¹In fact, if the boundary of U is smooth enough, such a function will always exist, although this requires more work to show. But this is not the point. The point is to explicitly find it, and this will only be possible for certain simple choices of U.
This term does not disappear as ε → 0. First note that, since ψ_x has bounded derivatives,

lim_{ε→0} − ∫_{∂B_ε} u(y) (∂r_n/∂n (y − x) − ∂ψ_x/∂n (y)) dσ = lim_{ε→0} − ∫_{∂B_ε} u(y) ∂r_n/∂n (y − x) dσ    (12.7)

and so it is just this last item which is of concern.
First consider the case that n = 2. In this case,

∇r_2(y) = (y_1/|y|², y_2/|y|²)

Also, on ∂B_ε, the exterior unit normal n equals

(1/ε)(y_1 − x_1, y_2 − x_2).

It follows that on ∂B_ε,

∂r_2/∂n (y − x) = (1/ε)(y_1 − x_1, y_2 − x_2) · ((y_1 − x_1)/|y − x|², (y_2 − x_2)/|y − x|²) = 1/ε.
∫_{B_ε} |v∆u| dx ≤ C ∫_{B_ε} |r_n(y − x) − ψ_x(y)| dy ≤ C ∫_{B_ε} |r_n(y − x)| dy + O(ε^n)

Using polar coordinates to evaluate this improper integral in the case where n ≥ 3,

C ∫_{B_ε} |r_n(y − x)| dy = C ∫_0^ε ∫_{S^{n−1}} ρ^{−(n−2)} ρ^{n−1} dσ dρ = C ∫_0^ε ∫_{S^{n−1}} ρ dσ dρ
which also converges to 0 as ε → 0. Therefore, returning to 12.6 and using the above limits yields, in the case where n ≥ 3,

− ∫_U v∆u dx = ∫_{∂U} u ∂v/∂n dσ + u(x)(n − 2)ω_n,    (12.10)

and in the case where n = 2,

− ∫_U v∆u dx = ∫_{∂U} u ∂v/∂n dσ − 2π u(x).    (12.11)

These two formulas show that it is possible to represent the solutions to Poisson's problem provided the function ψ_x can be determined. I will show you can determine this function in the case that U = B(0, r).
This proves the first claim. Next suppose |x|, |y| < r and suppose, contrary to what is claimed, that
\[ \frac{y|x|}{r} - \frac{rx}{|x|} = 0. \]
Then \( y|x|^2 = r^2 x \) and so \( |y|\,|x|^2 = r^2 |x| \), which implies
\[ |y|\,|x| = r^2, \]
contradicting \( |x|\,|y| < r^2 \).
Note that
\[ \lim_{x\to 0}\left| \frac{y|x|}{r} - \frac{rx}{|x|} \right| = r. \]
Then ψ_x(y) = r_n(y − x) if |y| = r, and Δψ_x = 0. This last claim is obviously true if x ≠ 0. If x = 0, then ψ_0(y) equals a constant and so it is also obvious in this case that Δψ_x = 0.
The following lemma is easy to obtain.
Then
\[ \nabla f(y) = \begin{cases} \dfrac{-(n-2)(y-x)}{|y-x|^n} & \text{if } n \ge 3, \\[6pt] \dfrac{y-x}{|y-x|^2} & \text{if } n = 2. \end{cases} \]
Also, the outer normal on ∂B(0, r) is y/r.
From Lemma 12.2.5 it follows easily that for v(y) = r_n(y − x) − ψ_x(y) and y ∈ ∂B(0, r), then for n ≥ 3,
\[
\frac{\partial v}{\partial n} = \frac{-(n-2)(y-x)}{|y-x|^n}\cdot\frac{y}{r} + (n-2)\left(\frac{|x|}{r}\right)^{-(n-2)} \frac{\left(y - \frac{r^2}{|x|^2}\,x\right)}{\left|y - \frac{r^2}{|x|^2}\,x\right|^n}\cdot\frac{y}{r}
\]
\[
= \frac{-(n-2)\left(r^2 - y\cdot x\right)}{r\,|y-x|^n} + \frac{(n-2)\left(r^2 - \frac{r^2}{|x|^2}\,x\cdot y\right)}{r\left(\frac{|x|}{r}\right)^{n-2}\left|y - \frac{r^2}{|x|^2}\,x\right|^n}
\]
\[
= \frac{-(n-2)\left(r^2 - y\cdot x\right)}{r\,|y-x|^n} + \frac{(n-2)\left(\frac{|x|^2}{r^2}\,r^2 - x\cdot y\right)}{r\left|\frac{|x|}{r}\,y - \frac{r}{|x|}\,x\right|^n}.
\]
Since \( \left|\frac{|x|}{r}\,y - \frac{r}{|x|}\,x\right| = |y-x| \) when |y| = r, this reduces to
\[ \frac{\partial v}{\partial n} = \frac{-(n-2)\left(r^2 - |x|^2\right)}{r\,|y-x|^n}. \]
Thus
\[
u(x) = \frac{1}{\omega_n (n-2)}\left[\int_U \left(\psi_x(y) - r_n(y-x)\right) f(y)\,dy + \int_{\partial U} g(y)\,\frac{(n-2)\left(r^2 - |x|^2\right)}{r\,|y-x|^n}\,d\sigma(y)\right]. \tag{12.12}
\]
\[ \Delta u = 0 \text{ in } U, \qquad u = g \text{ on } \partial U. \]
From 12.12 it follows that if u solves the above problem, known as the Dirichlet problem, then
\[ u(x) = \frac{r^2 - |x|^2}{\omega_n r} \int_{\partial U} \frac{g(y)}{|y-x|^n}\, d\sigma(y). \]
I have shown this in the case $u \in C^2\left(\overline{U}\right)$, which is more specific than to say $u \in C^2(U) \cap C\left(\overline{U}\right)$. Nevertheless, it is enough to give the following lemma.
For n = 2,
\[ 1 = \int_{\partial U} \frac{1}{2\pi r}\,\frac{r^2 - |x|^2}{|y-x|^2}\, d\sigma(y). \]
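The n = 2 Poisson integral formula can be checked numerically on the unit circle. In the sketch below, the boundary data, quadrature size, and evaluation point are illustrative choices of mine, not from the text; it checks both that the kernel has total mass 1 and that the harmonic boundary data g(y) = y₁ is reproduced as u(x) = x₁.

```python
import math

# Poisson integral for n = 2 on U = B(0, r):
#   u(x) = (r^2 - |x|^2)/(2*pi*r) * integral over the circle of g(y)/|y-x|^2 dσ(y)
r, N = 1.0, 2000                     # radius and quadrature size (illustrative)
ys = [(r*math.cos(2*math.pi*j/N), r*math.sin(2*math.pi*j/N)) for j in range(N)]
ds = 2*math.pi*r/N                   # arc length element

def poisson(g, x):
    s = sum(g(y) / ((y[0]-x[0])**2 + (y[1]-x[1])**2) for y in ys) * ds
    return (r**2 - (x[0]**2 + x[1]**2)) / (2*math.pi*r) * s

x = (0.3, -0.4)
print(abs(poisson(lambda y: 1.0, x) - 1.0) < 1e-6)    # kernel integrates to 1
print(abs(poisson(lambda y: y[0], x) - x[0]) < 1e-6)  # g(y)=y1 gives u(x)=x1
```

The trapezoid rule converges very rapidly for smooth periodic integrands, which is why a plain uniform sum over the circle suffices here.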
The representation formula 12.14 is called Poisson's integral formula. I have now shown it works better than you had a right to expect for the Laplace equation. What happens when f ≠ 0?
Lemma 12.2.10 Let $f \in C\left(\overline{U}\right)$ or in $L^p(U)$ for p > n/2.² Then for x ∈ U and x₀ ∈ ∂U,
\[ \lim_{x\to x_0} \frac{1}{\omega_n(n-2)} \int_U \left( \psi_x(y) - r_n(y-x) \right) f(y)\, dy = 0. \]
Proof: There are two parts to this lemma. First the following claim is shown in
which an integral is taken over B (x0 , δ). After this, the integral over U \ B (x0 , δ) will
be considered. First note that
\[ \lim_{x\to x_0} \left( \psi_x(y) - r_n(y-x) \right) = 0. \]
Claim:
\[ \lim_{\delta\to 0} \int_{B(x_0,\delta)} \psi_x(y)\,|f(y)|\,dy = 0, \qquad \lim_{\delta\to 0} \int_{B(x_0,\delta)} r_n(y-x)\,|f(y)|\,dy = 0. \]
which converges to 0 as δ → 0.
If $f \in L^p(U)$, then by Hölder's inequality (Problem 3 on Page 250), for $\frac{1}{p} + \frac{1}{q} = 1$,
\[
\int_0^\delta \int_{S^{n-1}} |f(x_0 + \rho w)|\, \rho\, d\sigma\, d\rho = \int_0^\delta \int_{S^{n-1}} |f(x_0 + \rho w)|\, \rho^{2-n}\, \rho^{n-1}\, d\sigma\, d\rho
\]
\[
\le \left( \int_0^\delta \int_{S^{n-1}} |f(x_0 + \rho w)|^p\, \rho^{n-1}\, d\sigma\, d\rho \right)^{1/p} \left( \int_0^\delta \int_{S^{n-1}} \left(\rho^{2-n}\right)^q \rho^{n-1}\, d\sigma\, d\rho \right)^{1/q} \le C\,\|f\|_{L^p(U)}.
\]
²This means f is measurable and |f|^p has finite integral.
Now apply the dominated convergence theorem in this last integral to conclude it converges to 0 as x → x₀. This proves the lemma. ∎
The following lemma follows from this one and Theorem 12.2.7.
Lemma 12.2.11 Let $f \in C\left(\overline{U}\right)$ or in $L^p(U)$ for p > n/2 and let g ∈ C(∂U). Then if u is given by 12.12 in the case where n ≥ 3 or by 12.13 in the case where n = 2, then if x₀ ∈ ∂U,
\[ \lim_{x\to x_0} u(x) = g(x_0). \]
Not surprisingly, you can relax the condition that g ∈ C (∂U ) but I won’t do so here.
The next question is about the partial differential equation satisfied by u for u given
by 12.12 in the case where n ≥ 3 or by 12.13 for n = 2. This is going to introduce a
new idea. I will just sketch the main ideas and leave you to work out the details, most
of which have already been considered in a similar context.
Let φ ∈ Cc∞ (U ) and let x ∈ U. Let Uε denote the open set which has B (y, ε)
deleted from it, much as was done earlier. In what follows I will denote with a subscript
of x things for which x is the variable. Then denoting by G (y, x) the expression
ψ x (y) − rn (y − x) , it is easy to verify that ∆x G (y, x) = 0 and so by Fubini’s theorem,
\[
\int_U \left[ \frac{1}{\omega_n(n-2)} \int_U \left(\psi_x(y) - r_n(y-x)\right) f(y)\, dy \right] \Delta_x\phi(x)\, dx
\]
\[
= \lim_{\varepsilon\to 0} \int_U \left( \int_{U_\varepsilon} \frac{1}{\omega_n(n-2)} \overbrace{\left(\psi_x(y) - r_n(y-x)\right)}^{G(y,x)}\, \Delta_x\phi(x)\, dx \right) f(y)\, dy
\]
\[
= \lim_{\varepsilon\to 0} \frac{1}{\omega_n(n-2)} \int_U f(y) \left[ -\int_{\partial B(y,\varepsilon)} \left( G\,\frac{\partial\phi}{\partial n_x} - \phi\,\frac{\partial G}{\partial n_x} \right) d\sigma(x) \right] dy
\]
\[
= \lim_{\varepsilon\to 0} \frac{1}{\omega_n(n-2)} \int_U f(y) \int_{\partial B(y,\varepsilon)} \phi\, \frac{\partial G}{\partial n_x}\, d\sigma(x)\, dy.
\]
Now x ↦ ψ_x(y) and its partial derivatives are continuous and so the above reduces to
\[
= \lim_{\varepsilon\to 0} \frac{1}{\omega_n(n-2)} \int_U f(y) \int_{\partial B(y,\varepsilon)} \phi\, \frac{\partial r_n}{\partial n_x}(x-y)\, d\sigma(x)\, dy
\]
\[
= \lim_{\varepsilon\to 0} \frac{1}{\omega_n} \int_U f(y) \int_{\partial B(y,\varepsilon)} \phi\, \frac{1}{\varepsilon^{n-1}}\, d\sigma(x)\, dy = \int_U f(y)\,\phi(y)\, dy.
\]
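Restating the computation just completed (this is a summary, not a new claim from the text), the volume potential term in 12.12 satisfies

```latex
\int_U \left[\frac{1}{\omega_n(n-2)}\int_U \left(\psi_x(y)-r_n(y-x)\right)f(y)\,dy\right]
\Delta_x\phi(x)\,dx = \int_U f(y)\,\phi(y)\,dy
\qquad \text{for all } \phi\in C_c^\infty(U),
```

that is, this term satisfies Δu = f in the weak (distributional) sense.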
Lemma 12.3.1 Let u be harmonic and C² on an open set V and let B(x₀, r) ⊆ V. Then
\[ u(x_0) = \frac{1}{m_n(B(x_0,r))} \int_{B(x_0,r)} u(y)\, dy. \]
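The volume mean value property is easy to check numerically. A minimal sketch, where the harmonic function u(x, y) = x² − y², the disk, and the grid sizes are illustrative choices of mine:

```python
import math

def disk_average(u, cx, cy, R, n_r=200, n_t=256):
    """Average of u over the disk B((cx, cy), R), midpoint rule in polar coordinates."""
    total = 0.0
    for i in range(n_r):
        rho = (i + 0.5) * R / n_r
        for j in range(n_t):
            th = (j + 0.5) * 2 * math.pi / n_t
            total += u(cx + rho * math.cos(th), cy + rho * math.sin(th)) * rho
    cell = (R / n_r) * (2 * math.pi / n_t)
    return total * cell / (math.pi * R ** 2)

u = lambda x, y: x * x - y * y        # a harmonic function (illustrative choice)
avg = disk_average(u, 0.3, 0.4, 0.5)
print(abs(avg - u(0.3, 0.4)) < 1e-6)  # average over the ball equals u at the center
```

For this quadratic u the polar midpoint rule is essentially exact, since the angular sums of cos θ, sin θ, and cos 2θ over equally spaced nodes vanish.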
The fact that u ≥ 0 is used in going to the last line. Since U₁ is compact, there exist finitely many balls having centers in U₁, $\{B(x_i, r)\}_{i=1}^m$, such that
\[ U_1 \subseteq \cup_{i=1}^m B(x_i, r/2). \]
Furthermore each of these balls must have nonempty intersection with at least one of the
others because if not, it would follow that U1 would not be connected. Letting x, y ∈ U1 ,
there must be a sequence of these balls, B1 , B2 , · · · , Bk such that x ∈ B1 , y ∈ Bk , and
Bi ∩ Bi+1 6= ∅ for i = 1, 2, · · · , k − 1. Therefore, picking a point, zi+1 ∈ Bi ∩ Bi+1 , the
above estimate implies
\[ u(x) \ge \frac{1}{2^n}\, u(z_2), \quad u(z_2) \ge \frac{1}{2^n}\, u(z_3), \quad u(z_3) \ge \frac{1}{2^n}\, u(z_4), \; \cdots, \; u(z_k) \ge \frac{1}{2^n}\, u(y). \]
Therefore,
\[ u(x) \ge \left( \frac{1}{2^n} \right)^k u(y) \ge \left( \frac{1}{2^n} \right)^m u(y). \]
and so
\[ \max\{u(x) : x \in U_1\} = \sup\{u(y) : y \in U_1\} \le (2^n)^m \inf\{u(x) : x \in U_1\} = (2^n)^m \min\{u(x) : x \in U_1\}. \]
This proves the inequality. ∎
The next theorem comes from the representation formula for harmonic functions
given above.
Proof: Let B(x₀, r) ⊆ U. I will show that u ∈ C^∞(B(x₀, r)). From 12.20, it follows that for x ∈ B(x₀, r),
\[ \frac{r^2 - |x-x_0|^2}{\omega_n r} \int_{\partial B(x_0,r)} \frac{u(y)}{|y-x|^n}\, d\sigma(y) = u(x). \]
It is obvious that $x \mapsto \frac{r^2 - |x-x_0|^2}{\omega_n r}$ is infinitely differentiable. Therefore, consider
\[ x \mapsto \int_{\partial B(x_0,r)} \frac{u(y)}{|y-x|^n}\, d\sigma(y). \tag{12.22} \]
equals
\[ -n\,|x + t\theta(t)e_k - y|^{-(n+2)}\,(x_k + \theta(t)t - y_k) \]
and as t → 0, this converges uniformly for y ∈ ∂B(x₀, r) to
\[ -n\,|x-y|^{-(n+2)}\,(x_k - y_k). \]
This uniform convergence implies you can take a partial derivative of the function of x
given in 12.22 obtaining the partial derivative with respect to xk equals
\[ \int_{\partial B(x_0,r)} \frac{-n\,(x_k - y_k)\, u(y)}{|y-x|^{n+2}}\, d\sigma(y). \]
Now exactly the same reasoning applies to this function of x yielding a similar formula.
The continuity of the integrand as a function of x implies continuity of the partial
derivatives. The idea is there is never any problem because y ∈ ∂B (x0 , r) and x is a
given point not on this boundary. This proves the theorem. ∎
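The differentiation-under-the-integral step above can also be checked numerically. In this sketch for n = 2 on the unit circle, the boundary data u(y) = y₁, the evaluation point, and all step sizes are illustrative choices of mine:

```python
import math

# Check that the x_1-derivative of F(x) = sum over the circle of u(y)/|y-x|^n dσ(y)
# matches the differentiated kernel  -n (x_k - y_k) / |y-x|^{n+2}.
n, N = 2, 4000
pts = [(math.cos(2*math.pi*j/N), math.sin(2*math.pi*j/N)) for j in range(N)]
ds = 2*math.pi/N
u = lambda y: y[0]                    # illustrative boundary data

def F(x):
    return sum(u(y) / ((y[0]-x[0])**2 + (y[1]-x[1])**2)**(n/2) for y in pts) * ds

def dF1(x):                           # kernel formula for the partial in x_1
    return sum(-n * (x[0]-y[0]) * u(y)
               / ((y[0]-x[0])**2 + (y[1]-x[1])**2)**((n+2)/2) for y in pts) * ds

x, h = (0.3, 0.2), 1e-5
fd = (F((x[0]+h, x[1])) - F((x[0]-h, x[1]))) / (2*h)   # central difference
print(abs(fd - dF1(x)) < 1e-4)
```

As the text points out, there is never any problem here because x stays a fixed distance away from the boundary sphere, so the kernel and all its derivatives are bounded.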
Liouville’s theorem is a famous result in complex variables which asserts that an
entire bounded function is constant. A similar result holds for harmonic functions.
12.4. LAPLACE’S EQUATION FOR GENERAL SETS 341
and these terms converge to 0 as r → ∞. Since the inequality holds for all r > |x|, it follows that ∂u(x)/∂x_k = 0. Similarly all the other partial derivatives equal zero as well and so u is a constant. This proves the theorem. ∎
∆u = 0 on U, and u = g on ∂U.
I will be presenting Perron’s method for this problem. This method is based on exploit-
ing properties of subharmonic functions which are functions satisfying the following
definition.
Proof: Suppose x ∈ U and $u(x) = \max\{u(y) : y \in \overline{U}\} \equiv M$. Let V denote the connected component of U which contains x. Then since u is subharmonic on V, it follows that for all small r > 0, u(y) = M for all y ∈ ∂B(x, r). Therefore, there exists some r₀ > 0 such that u(y) = M for all y ∈ B(x, r₀) and this shows {x ∈ V : u(x) = M} is an open subset of V. However, since u is continuous, it is also a closed subset of V. Therefore, since V is connected,
\[ \{x \in V : u(x) = M\} = V \]
and so by continuity of u, it must be the case that u(y) = M for all y ∈ ∂V ⊆ ∂U. This proves the theorem because M = u(y) for some y ∈ ∂U. ∎
As a simple corollary, the proof of the above theorem shows the following startling
result.
Corollary 12.4.3 Suppose U is a connected open set and that u is subharmonic on
U. Then either
u (x) < sup {u (y) : y ∈ U }
for all x ∈ U or
u (x) ≡ sup {u (y) : y ∈ U }
for all x ∈ U .
The next result indicates that the maximum of any finite list of subharmonic func-
tions is also subharmonic.
Lemma 12.4.4 Let U be an open set and let u1 , u2 , · · · , up be subharmonic functions
defined on U. Then letting
v ≡ max (u1 , u2 , · · · , up ) ,
it follows that v is also subharmonic.
Proof: Let x ∈ U. Then whenever r is small enough to satisfy the subharmonicity condition for each u_i,
\[
v(x) = \max\left(u_1(x), u_2(x), \cdots, u_p(x)\right) \le \max\left( \frac{1}{\omega_n r^{n-1}} \int_{\partial B(x,r)} u_1(y)\, d\sigma(y), \; \cdots, \; \frac{1}{\omega_n r^{n-1}} \int_{\partial B(x,r)} u_p(y)\, d\sigma(y) \right)
\]
\[
\le \frac{1}{\omega_n r^{n-1}} \int_{\partial B(x,r)} \max\left(u_1, u_2, \cdots, u_p\right)(y)\, d\sigma(y) = \frac{1}{\omega_n r^{n-1}} \int_{\partial B(x,r)} v(y)\, d\sigma(y).
\]
This proves the lemma. ∎
The next lemma concerns modifying a subharmonic function on an open ball in such
a way as to make the new function harmonic on the ball. Recall Corollary 12.2.8 which
I will list here for convenience.
Corollary 12.4.5 Let U = B(x₀, r) and let g ∈ C(∂U). Then there exists a unique solution $u \in C^2(U) \cap C\left(\overline{U}\right)$ to the problem
\[ \Delta u = 0 \text{ in } U, \qquad u = g \text{ on } \partial U. \]
This solution is given by the formula
\[ u(x) = \frac{1}{\omega_n r} \int_{\partial U} g(y)\, \frac{r^2 - |x-x_0|^2}{|y-x|^n}\, d\sigma(y) \tag{12.24} \]
for every n ≥ 2. Here ω₂ = 2π.
Thus u_{x₀,r} is harmonic on B(x₀, r), and equals u off B(x₀, r). The wonderful thing about this is that u_{x₀,r} is still subharmonic on all of U. Also note that from Corollary 12.2.9 on Page 335 every harmonic function is subharmonic.
Lemma 12.4.7 Let U be an open set and B (x0 ,r) ⊆ U as in the above definition.
Then ux0 ,r is subharmonic on U and u ≤ ux0 ,r .
Proof: First I show that u ≤ u_{x₀,r}. This follows from the maximum principle. Here is why. The function u − u_{x₀,r} is subharmonic on B(x₀, r) and equals zero on ∂B(x₀, r). Here is why: For z ∈ B(x₀, r),
\[ u(z) - u_{x_0,r}(z) = u(z) - \frac{1}{\omega_n \rho^{n-1}} \int_{\partial B(z,\rho)} u_{x_0,r}(y)\, d\sigma(y) \]
for all ρ small enough. This is by the mean value property of harmonic functions and the observation that u_{x₀,r} is harmonic on B(x₀, r). Therefore, from the fact that u is subharmonic,
\[ u(z) - u_{x_0,r}(z) \le \frac{1}{\omega_n \rho^{n-1}} \int_{\partial B(z,\rho)} \left( u(y) - u_{x_0,r}(y) \right) d\sigma(y) \]
where Sg consists of those functions u which are subharmonic with u (y) ≤ g (y) for all
y ∈ ∂U and u (y) ≥ min {g (y) : y ∈ ∂U } ≡ m.
Proposition 12.4.9 Let U be a bounded open set and let g ∈ C (∂U ). Then wg ∈ Sg
and in addition to this, wg is harmonic.
Proof: Let B(x₀, 2r) ⊆ U and let $\{x_k\}_{k=1}^\infty$ denote a countable dense subset of B(x₀, r). Let {u_{1k}} denote a sequence of functions of S_g with the property that
\[ \lim_{k\to\infty} u_{1k}(x_1) = w_g(x_1). \]
By Lemma 12.4.7, it can be assumed each u_{1k} is a harmonic function in B(x₀, 2r) since otherwise, you could use the process of replacing u with u_{x₀,2r}. Similarly, for each l, there exists a sequence of harmonic functions in S_g, {u_{lk}}, with the property that
\[ \lim_{k\to\infty} u_{lk}(x_l) = w_g(x_l). \]
Now define
\[ w_k = \left( \max\left(u_{1k}, \cdots, u_{kk}\right) \right)_{x_0,2r}. \]
Then each w_k ∈ S_g, each w_k is harmonic in B(x₀, 2r), and for each x_l,
\[ \lim_{k\to\infty} w_k(x_l) = w_g(x_l). \]
For x ∈ B(x₀, r),
\[ w_k(x) = \frac{1}{\omega_n\, 2r} \int_{\partial B(x_0,2r)} w_k(y)\, \frac{(2r)^2 - |x-x_0|^2}{|y-x|^n}\, d\sigma(y) \tag{12.25} \]
and so there exists a constant C, which is independent of k, such that for all i = 1, 2, ···, n and x ∈ B(x₀, r),
\[ \left| \frac{\partial w_k(x)}{\partial x_i} \right| \le C \]
which shows that w is also harmonic. I have shown that w = wg on a dense set. Also, it
follows that w (x) ≤ wg (x) for all x ∈ B (x0 , r). It remains to verify these two functions
are in fact equal.
Claim: w_g is lower semicontinuous on U.
Proof of claim: Suppose z_k → z. I need to verify that
\[ w_g(z) \le \liminf_{k\to\infty} w_g(z_k). \]
Let ε > 0 be given and pick u ∈ S_g such that w_g(z) − ε < u(z). Then
Definition 12.4.10 A bounded open set, U has the barrier condition at z ∈ ∂U,
if there exists a function, bz called a barrier function which has the property that bz is
subharmonic on U, bz (z) = 0, and for all x ∈ ∂U \ {z} , bz (x) < 0.
Theorem 12.4.11 Let U be a bounded open set which has the barrier condition
at z ∈ ∂U and let g ∈ C (∂U ) . Then the function, wg , defined above is in C 2 (U ) and
satisfies
∆wg = 0 in U,
lim wg (x) = g (z) .
x→z
Proof: From Proposition 12.4.9 it follows ∆w_g = 0. Let z ∈ ∂U and let b_z be the barrier function at z. Then letting ε > 0 be given, the function u_−(x) ≡ g(z) − ε + K b_z(x) satisfies u_−(x) ≤ g(x) for x ∈ ∂U provided
\[ b_z(x) \le \frac{g(x) - g(z) + \varepsilon}{K}, \]
for any choice of positive K. Now choose K large enough that B_δ < (g(x) − g(z) + ε)/K for all x ∈ ∂U. This can be done because B_δ < 0. It follows the above inequality holds for all x ∈ ∂U. This proves the claim.
Let K be large enough that the conclusion of the above claim holds. Then u_−(x) ≤ g(x) for all x ∈ ∂U, and so u_− ∈ S_g, which implies u_− ≤ w_g and so
but this would be wrong because I do not know that wg is continuous at a boundary
point. I only have shown that it is harmonic in U. Therefore, a little more is required.
Let
u+ (x) ≡ g (z) + ε − Kbz (x) .
Then −u+ is subharmonic and also if K is large enough, it follows from reasoning similar
to that of the above claim that
−u+ (x) = −g (z) − ε + Kbz (x) ≤ −g (x)
on ∂U. Therefore, letting u ∈ Sg , u − u+ is a subharmonic function which satisfies for
x ∈ ∂U,
u (x) − u+ (x) ≤ g (x) − g (x) = 0.
Consequently, the maximum principle implies u ≤ u+ and so since this holds for every
u ∈ Sg , it follows
\[ w_g(x) \le u_+(x) = g(z) + \varepsilon - K b_z(x). \]
It follows that
\[ g(z) - \varepsilon + K b_z(x) \le w_g(x) \le g(z) + \varepsilon - K b_z(x) \]
and so,
\[ g(z) - \varepsilon \le \liminf_{x\to z} w_g(x) \le \limsup_{x\to z} w_g(x) \le g(z) + \varepsilon. \]
Since ε is arbitrary, this shows
\[ \lim_{x\to z} w_g(x) = g(z). \]
Proposition 12.4.14 Suppose Condition 12.4.13 holds. Then U satisfies the bar-
rier condition.
In fact, you have to have a fairly pathological example in order to find something
which does not satisfy the barrier condition. You might try to think of some examples.
Think of B (0, 1) \ {z axis} for example. The points on the z axis which are in B (0, 1)
become boundary points of this new set. Thus this set can’t satisfy the above condition.
Could this set have the barrier property?
The Jordan Curve Theorem
This short chapter is devoted to giving an elementary proof of the Jordan curve theorem
which is independent of the chapter on degree theory. I am following lecture notes from
a topology course given by Fernley at BYU in the 1970’s. The ideas used in this
presentation are elementary and also lead to more general notions in algebraic topology.
In addition to this, these techniques are very useful in complex analysis.
[Figure: a grating, with examples of a 0 cell, a 1 cell, and several 2 cells (one of them unbounded) labeled.]
For k = 0, 1, 2, one speaks of k chains. For $\{a_j\}_{j=1}^n$ a set of k cells, the k chain is denoted as a formal sum
\[ C = a_1 + a_2 + \cdots + a_n \]
where the sum is taken modulo 2. The sums are just formal expressions like the above. Thus for a a k cell, a + a = 0, 0 + a = a, and the summation is commutative. In other words, if a k cell is repeated an even number of times in the formal sum, it disappears, resulting in 0, defined by 0 + a = a + 0 = a. For a a k cell, |a| denotes the points of the plane which are contained in a. For a k chain C as above,
\[ |C| \equiv \cup\,\{|a_j| : j = 1, \cdots, n\}, \]
so |C| is the union of the k cells in the sum, remembering that when a k cell occurs twice, it is gone and does not contribute to |C|.
The following picture illustrates the above definition. The following is a picture of
the 2 cells in a 2 chain. The dotted lines indicate the lines in the grating.
Now the following is a picture of the 1 chain consisting of the sum of the 1 cells
which are the edges of the above 2 cells. Remember when a 1 cell is added to itself, it
disappears from the chain. Thus if you add up the 1 cells which are the edges of the
above 2 cells, lots of them cancel off. In fact all the edges which are shared between two
2 cells disappear. The following is what results.
In the second of the above pictures, you have a 1 cycle. Here is a picture of another
one in which the boundary of another 2 cell has been included over on the right.
This 1 cycle shown above is the boundary of exactly two 2 chains. What are they?
C1 consists of the 2 cells in the first picture above along with the 2 cell whose boundary
is the 1 cycle over on the right. C2 is all the other 2 cells of the grating. You see this
clearly works. Could you make that 2 cell on the right be in C2 ? No, you couldn’t do
it. This is because the 1 cells which are shown would disappear, being listed twice.
This illustrates the fundamental lemma of the plane which comes next.
Lemma 13.0.17 If C is a bounded 1 cycle (∂C = 0), then there are exactly two 2
chains D1 , D2 such that
C = ∂D1 = ∂D2 .
Proof : The lemma is vacuously true unless there are at least two vertical lines and
at least two horizontal lines in the grating G. It is also obviously true if there are exactly
two vertical lines and two horizontal lines in G. Suppose the theorem is true for n lines
in G. Then as just mentioned, there is nothing to prove unless there are either 2 or more
vertical lines and two or more horizontal lines. Suppose without loss of generality there are at least as many vertical lines as there are horizontal lines and that this number is at least 3. If it is only two, there is nothing left to show. Let l be the second vertical line from the left. Let {e₁, ···, e_m} be the 1 cells of C with the property that |e_j| ⊆ l. Note that e_j occurs only once in C since if it occurred twice, it would disappear because of the rule for addition. Pick one of the 2 cells adjacent to e_j, say b_j, and add in ∂b_j, which is a 1 cycle. Thus
\[ C + \sum_j \partial b_j \]
is a bounded 1 cycle and it has the property that it has no 1 cells contained in l. Thus
you could eliminate l from the grating G and all the 1 cells of the above 1 chain are edges
of the grating G \ {l}. By induction, there are exactly two 2 chains D1 , D2 composed
of 2 cells of G \ {l} such that for i = 1, 2,
\[ \partial D_i = C + \sum_j \partial b_j \tag{13.1} \]
and this shows there exist two 2 chains which have C as the boundary. If ∂D_i' = C, then
\[ \partial D_i' + \sum_j \partial b_j = \partial\left( D_i' + \sum_j b_j \right) = C + \sum_j \partial b_j \]
and by induction, there are exactly two 2 chains which $D_i' + \sum_j b_j$ can equal. Thus, adding $\sum_j b_j$, there are exactly two 2 chains which D_i' can equal.
Here is another proof which is not by induction. This proof also gives an algorithm
for identifying the two 2 chains. The 1 cycle is bounded and so every 1 cell in it is part
of the boundary of a 2 cell which is bounded. For the unbounded 2 cells, label them all
as A. Now starting from the left and moving toward the right, toggle between A and
B every time you hit a vertical 1 cell of C. This will label every 2 cell with either A
or B. Next, starting at the top move down and toggle between A and B every time
you encounter a horizontal 1 cell of C. This also labels every 2 cell as either A or
B. Suppose there is a contradiction in the labeling. Pick the first column in which a
contradiction occurs and then pick the top contradictory 2 cell in this column. There
are various cases which can occur, each leading to the existence of a vertex of C which
is contained in an odd number of 1 cells of C, thus contradicting the conclusion that C
is a 1 cycle. In the following picture, AB will mean the labeling from the left to right
gives A and the labeling from top to bottom yields B with similar modification for AA
and BB.
AA BB AA AA
BB AB AA AB
AB
Thus that center vertex is a boundary point of C and so C is not a 1 cycle after
all. Similar considerations would hold if the contradictory 2 cell were labeled BA. Thus
there can be no contradiction in the two labeling schemes. They label the 2 cells in G
either A or B in an unambiguous manner.
The labeling algorithm encounters every 1 cell of C (in fact of G) and gives a label
to every 2 cell of G. Define the two 2 chains as A and B where A consists of those
labeled as A and B those labeled as B. The 1 cells which cause a change to take place
in the labeling are exactly those in C and each is contained in one 2 cell from A and one
2 cell from B. Therefore, each of these 1 cells of C appears in ∂A and ∂B which shows
C ⊆ ∂A and C ⊆ ∂B. On the other hand, if l is a 1 cell in ∂A, then it can only occur
in a single 2 cell of A and so the 2 cell adjacent to that one along l must be in B and so
l is one of the 1 cells of C by definition. As to uniqueness, in moving from left to right,
you must assign adjacent 2 cells joined at a 1 cell of C to different 2 chains or else the
1 cell would not appear when you take the boundary of either A or B since it would be
added in twice. Thus there are exactly two 2 chains with the desired property. ∎
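The labeling algorithm in the second proof can be carried out by machine. In the sketch below, the grid size and the 2 chain D₁ are illustrative choices of mine: the boundary of a 2 chain is computed mod 2, every bounded 2 cell is labeled by toggling between A and B at the vertical 1 cells of C, and the B-labeled cells recover exactly D₁.

```python
# The A/B labeling algorithm from the second proof of the fundamental lemma
# of the plane.  Cells are (row, column) pairs; edges are tagged 'v'/'h'.

def boundary(cells):
    """Boundary of a 2 chain: the mod-2 sum of the edges of its 2 cells."""
    edges = set()
    for (i, j) in cells:
        for e in [('v', i, j), ('v', i, j + 1),   # left and right 1 cells
                  ('h', i, j), ('h', i + 1, j)]:  # bottom and top 1 cells
            edges ^= {e}                          # adding an edge twice cancels it
    return edges

D1 = {(1, 1), (1, 2), (2, 1), (2, 2)}  # an illustrative 2 chain
C = boundary(D1)                       # its boundary, a 1 cycle

# Scan each row left to right, toggling between A and B whenever a vertical
# 1 cell of C is crossed; the unbounded region on the far left is labeled A.
N = 5
labels = {}
for i in range(N):
    current = 'A'
    for j in range(N):
        if ('v', i, j) in C:           # crossing a vertical 1 cell of C
            current = 'B' if current == 'A' else 'A'
        labels[(i, j)] = current

B_cells = {c for c, lab in labels.items() if lab == 'B'}
print(B_cells == D1)   # the B-labeled 2 chain is exactly D1
```

The 1 cells that cause a toggle are exactly those of C, which is why the two label classes are the two 2 chains the lemma describes.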
The next lemma is interesting because it gives the existence of a continuous curve
joining two points.
Lemma 13.0.18 Let C be a 1 chain with ∂C = x + y. Then there is a continuous curve in |C| joining x to y.
Proof: There are an odd number of 1 cells of C which have x at one end; otherwise ∂C ≠ x + y. Begin at x and move along an edge leading away from x. Continue until there is no new edge to travel along. You must be at y, since otherwise you would have found another boundary point. Thus there is a continuous curve in |C| joining x to y. ∎
The next lemma gives conditions under which you can go around a couple of closed
sets. It is called Alexander’s lemma. The following picture is a rough illustration of the
situation.
[Figure: 1 chains C₁ and C₂ joining x and y, together with the sets F₁ and F₂; C₁ misses F₁ and C₂ misses F₂.]
Lemma 13.0.19 Let F₁ be compact and F₂ closed. Suppose C₁, C₂ are two 1 chains, ∂C_i = x + y, where x, y ∉ F₁ ∪ F₂. Suppose C₂ does not intersect F₂ and C₁ does not intersect F₁. Also suppose the 1 cycle C₁ + C₂ bounds a 2 chain D for which |D| ∩ F₁ ∩ F₂ = ∅. Then there exists a 1 chain C such that ∂C = x + y and |C| ∩ (F₁ ∪ F₂) = ∅. In particular x, y cannot be in different components of the complement of F₁ ∪ F₂.
Definition 13.0.21 A Jordan arc is a set of points of the form Γ ≡ r([a, b]) where r is a one to one continuous map from [a, b] to the plane. For p, q ∈ Γ, say p < q if p = r(t₁), q = r(t₂) for t₁ < t₂. Also let pq denote the arc r([t₁, t₂]).
Proof: Suppose this is not so. Then there exist x, y, points in Γ^C, which are in different components of Γ^C. Let G be a grating having x, y as points of intersection of a horizontal line and a vertical line of G and let p, q be the points at the ends of the Jordan arc, p = r(a) and q = r(b). Now let z = r((a+b)/2) and consider the two arcs pz and zq.
If ∂C = x + y, then it is required that |C| ∩ Γ ≠ ∅, since otherwise these two points would not be in different components. Suppose there exists C₁, ∂C₁ = x + y, with |C₁| ∩ zq = ∅, and C₂, ∂C₂ = x + y, with |C₂| ∩ pz = ∅. Then C₁ + C₂ is a 1 cycle and so by Lemma 13.0.17 there are exactly two 2 chains whose boundaries are C₁ + C₂. Since z ∉ |C_i|, it follows z = pz ∩ zq can only be in one of these 2 chains because it is a single point. Then by Lemma 13.0.19, Alexander's lemma, there exists C, a 1 chain with ∂C = x + y and |C| ∩ (pz ∪ zq) = ∅, so by Lemma 13.0.18, x, y are not in different components of Γ^C, contrary to the assumption they are in different components. Hence one of pz, zq has the property that every 1 chain with ∂C = x + y goes through it. Say every such 1 chain goes through zq. Then let zq play the role of pq and conclude every 1 chain C such that ∂C = x + y goes through either zw or wq where
\[ w = r\left( \frac{1}{2}\left( \frac{a+b}{2} + b \right) \right). \]
Thus, continuing this way, there is a sequence of Jordan arcs p_k q_k, where r(t_k) = q_k and r(s_k) = p_k with |t_k − s_k| < (b − a)/2^k and [s_k, t_k] ⊆ [a, b], such that every C with ∂C = x + y has nonempty intersection with p_k q_k. The intersection of these arcs is r(s), where $\{s\} = \cap_{k=1}^\infty [s_k, t_k]$. Then all such C must go through r(s) because such C with ∂C = x + y must intersect p_k q_k for each k and the intersection of these arcs is r(s). But now there is an obvious contradiction to having every 1 chain whose boundary is x + y intersect r(s). Pick a 1 chain C whose boundary is x + y. Let D be the 2 chain of at most four 2 cells consisting of those 2 cells which have r(s) on some edge. Then ∂(C + ∂D) = ∂C = x + y but r(s) ∉ |C + ∂D|. Therefore, this contradiction shows Γ^C must be connected after all. ∎
The other important observation about a Jordan arc is that it has no interior points.
This will follow later from a harder result but it is also easy to prove.
Lemma 13.0.23 Let Γ = r ([a, b]) be a Jordan arc where r is as above, one to one,
onto and continuous. Then Γ has no interior points.
Proof: Suppose to the contrary that Γ has an interior point p. Then for some ρ₀ > 0,
\[ B(p, \rho_0) \subseteq \Gamma. \]
Consider the circles of radius δ < ρ₀ centered at p. Denoting by C_δ one of these, it follows the C_δ are disjoint. Therefore, since r is one to one, the sets r⁻¹(C_δ) are also disjoint. Now r is continuous and one to one, mapping to a compact set, and therefore r⁻¹ is also continuous. It follows r⁻¹(C_δ) is connected and compact. Thus by Theorem 5.3.8 each of these sets is a closed interval of positive length, since r is one to one. It follows there exist disjoint nonempty open intervals, the interiors of the r⁻¹(C_δ), {I_δ}_{δ<ρ₀}. Since Q is dense, each I_δ contains a rational number, and distinct I_δ yield distinct rationals; this contradicts the fact that Q is at most countable. ∎
Definition 13.0.24 Let r map [a, b] to the plane such that r is one to one on
[a, b) and (a, b] but r (a) = r (b). Then J = r ([a, b]) is called a simple closed curve. It is
also called a Jordan curve. Also since the term “boundary” has been given a specialized
meaning relative to chains of various sizes, we say x is in the frontier of S if every open
ball containing x contains points of S as well as points of S C .
Note that if J is a Jordan curve, then it is the union of two Jordan arcs whose
intersection is two distinct points of J. You could pick z ∈ (a, b) and consider r ([a, z])
and r ([z, b]) as the two Jordan arcs.
The next lemma gives a probably more convenient way of thinking about a Jordan
curve. It says essentially that a Jordan curve is a wriggly circle.
Lemma 13.0.25 J is a simple closed curve if and only if there exists a mapping θ : S¹ → J, where S¹ is the unit circle
\[ \{(x, y) : x^2 + y^2 = 1\}, \]
such that θ is one to one, onto, and continuous.
Proof : First suppose J is the image of the unit circle as just explained. Then let
p : [0, 2π] → S 1 be defined as p (t) ≡ (cos (t) , sin (t)) . Then consider r (t) ≡ θ (p (t)).
r is one to one on [0, 2π) and (0, 2π] with r (0) = r (2π) and is continuous, being the
composition of continuous functions.
Suppose now that J is a simple closed curve so there is a parameterization r and
an interval [a, b] such that r is continuous and one to one on [a, b) and (a, b] with
r(a) = r(b). Define θ⁻¹ : J → S¹ by
\[ \theta^{-1}(x) \equiv \left( \cos\left( \frac{2\pi}{b-a}\left(r^{-1}(x) - a\right) \right), \; \sin\left( \frac{2\pi}{b-a}\left(r^{-1}(x) - a\right) \right) \right). \]
Note that θ⁻¹ is onto S¹. The function is well defined because it sends the point r(a) = r(b) to the same point, (1, 0). It is also one to one. To see this, note r⁻¹ is one to one on J \ {r(a), r(b)}. What about the case where x ≠ r(a) = r(b)? Could θ⁻¹(x) = θ⁻¹(r(a))? In this case, r⁻¹(x) is in (a, b) while r⁻¹(r(a)) = a, so
Hence by continuity of r, x_n → r(t) and so r(t) must equal r(a) = r(b). It follows from the assumption of what a simple curve is that t ∈ {a, b}. Hence θ⁻¹(x_n) converges to either
\[ \left( \cos\left( \frac{2\pi}{b-a}(a-a) \right), \; \sin\left( \frac{2\pi}{b-a}(a-a) \right) \right) \]
or
\[ \left( \cos\left( \frac{2\pi}{b-a}(b-a) \right), \; \sin\left( \frac{2\pi}{b-a}(b-a) \right) \right) \]
but these are the same point. This has shown that if x_n → r(a) = r(b), there is a subsequence such that θ⁻¹(x_n) → θ⁻¹(r(a)). Thus θ⁻¹ is continuous at r(a) = r(b).
Next suppose x_n → x ≠ r(a) ≡ p. Then there exists a ball B(p, δ) such that for all n large enough, x_n and x are contained in the compact set J \ B(p, δ) ≡ K. Then r is continuous and one to one on the compact set r⁻¹(K) ⊆ (a, b) and so by Theorem 5.1.3, r⁻¹ is continuous on K. In particular it is continuous at x, so θ⁻¹(x_n) → θ⁻¹(x). This proves the lemma. ∎
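The construction of θ⁻¹ in this proof can be made concrete. In the sketch below, the ellipse parameterization r, its inverse, and the sample points are my own illustrative choices; with [a, b] = [0, 2π] the formula sends the curve point r(t) to (cos t, sin t) on S¹.

```python
import math

# Illustration of Lemma 13.0.25 for an ellipse: r(t) = (2 cos t, sin t) on [0, 2π].
a, b = 0.0, 2 * math.pi

def r(t):
    return (2 * math.cos(t), math.sin(t))

def r_inv(x):
    """Inverse of r on J, returning t in [0, 2π)."""
    return math.atan2(x[1], x[0] / 2) % (2 * math.pi)

def theta_inv(x):
    """theta^{-1}(x) = (cos(2π (r^{-1}(x) - a)/(b - a)), sin(same))."""
    s = 2 * math.pi * (r_inv(x) - a) / (b - a)
    return (math.cos(s), math.sin(s))

# theta^{-1} sends the curve point r(t) to the circle point (cos t, sin t).
ok = all(
    math.isclose(theta_inv(r(t))[0], math.cos(t), abs_tol=1e-12) and
    math.isclose(theta_inv(r(t))[1], math.sin(t), abs_tol=1e-12)
    for t in [0.1, 1.0, 2.5, 4.0, 6.0]
)
print(ok)
```

This is exactly the sense in which "a Jordan curve is a wriggly circle": θ and θ⁻¹ match up the curve with S¹ point by point.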
Now with this preparation, here is the main result, the Jordan curve theorem.
Proof: To begin with, consider the claim there are no more than two components. Suppose this is not so. Then there exist x, y, z, each of which is in a different component of J^C. Let J = H ∪ K where H and K are two Jordan arcs joined at the points a and b.
[Figure: the Jordan curve J as the union of two Jordan arcs H and K joined at the points a and b.]
Let C1 be a square shaped curve (having vertical and horizontal sides) which joins
x and y which misses H. Then let C2 be such a square shaped curve which joins x and
y and misses K. This is possible by Theorem 13.0.22 which says the complements of
these Jordan arcs are connected and Problem 7 on Page 113 which says an open set is
connected if and only if every pair of points can be joined by a square curve. Similarly,
let C3 and C4 be square curves joining y and z such that C3 misses H and C4 misses
K. Let G be a grating which has each of the corners of all these square curves as
intersections of horizontal and vertical lines of G as well as each of x, y, z.
Thus ∂C1 = x + y = ∂C2 and ∂C3 = y + z = ∂C4 . Then C1 + C2 is a cycle and so
by Lemma 13.0.17 there are exactly two 2 chains D, E such that
∂E = ∂D = C1 + C2 .
I claim that neither |D| nor |E| can contain both a and b. Here is why. If they were
both in |D| for example, then E would be a 2 chain whose boundary is C1 + C2 which
misses {a, b} ≡ H ∩ K. (Note {a, b} = H ∩ K and so C1 + C2 does not contain either a
or b.) Therefore, by Lemma 13.0.19 there exists a 1 chain C such that ∂C = x + y which
misses both H and K which along with Lemma 13.0.18 then contradicts the assertion
that x, y are in different components. Similarly, not both a, b can be in |E| . Say D is
the two chain such that a ∈ |D| and b ∈ |E|. Similarly there are exactly two 2 chains
P, Q such that ∂P = ∂Q = C3 + C4 and by similar reasoning neither |P | nor |Q| can
contain both a and b. Let a ∈ |P | and b ∈ |Q|. Now (C1 + C3 ) + (C2 + C4 ) is a 1 cycle
because
∂ (C1 + C3 + C2 + C4 ) = x + y + y + z + x + y + y + z = 0
and so D + Q is one of these two 2 chains. Therefore, |C₁ + C₃| misses H and |C₂ + C₄| misses K and both a and b are in |D + Q|. It follows the other 2 chain, of which C₃ + C₄ + C₁ + C₂ is the boundary, has empty intersection with {a, b} = H ∩ K and by Alexander's lemma there exists a 1 chain C such that x + z = ∂C, and so x, z are not in different components after all. Thus there are at most two components to J^C.
Next, why are there at least two components in J C ? Suppose there is only one and
let a, b be the points of J described above and H, K also as above. Let Q be a small
square 1 cycle which encloses a on its inside such that b is not inside Q. Thus a is on
the inside of Q and b is on the outside of Q as shown in the picture.
[Figure: the small square 1 cycle Q enclosing a, with b outside, together with the arcs H and K and a 1 chain C.]
B(0, R) where J ⊆ B(0, R). Thus there are two components for J^C, the unbounded one, which contains B(0, R)^C, and the bounded one, which must be contained in B(0, R). This proves the theorem. ∎
Line Integrals
where the sums are taken over all possible lists {a = t₀ < ··· < t_n = b}. The set of points traced out will be denoted by γ* ≡ γ([a, b]). The function γ is called a parameterization of γ*. The set of points γ* is called a rectifiable curve. If a set of points γ* = γ([a, b]) where γ is continuous and γ is one to one on [a, b) and also one to one on (a, b], then γ* is called a simple curve. A closed curve is one which has a parameterization γ defined on an interval [a, b] such that γ(a) = γ(b). It is a simple closed curve if there is a parameterization γ such that γ is one to one on [a, b) and one to one on (a, b] with γ(a) = γ(b).
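The supremum over partitions defining length can be previewed numerically. In this sketch, the curve, the two parameterizations, and the partition sizes are illustrative choices of mine: the sums Σ|γ(t_i) − γ(t_{i−1})| for two different parameterizations of the same semicircle both approach its length π.

```python
import math

def poly_length(gamma, a, b, n=4000):
    """Sum of |gamma(t_i) - gamma(t_{i-1})| over a uniform partition of [a, b]."""
    pts = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

# Two parameterizations of the same simple curve: the upper unit semicircle.
g1 = lambda t: (math.cos(t), math.sin(t))          # t in [0, pi]
g2 = lambda s: (math.cos(s * s), math.sin(s * s))  # s in [0, sqrt(pi)]

L1 = poly_length(g1, 0.0, math.pi)
L2 = poly_length(g2, 0.0, math.sqrt(math.pi))
print(abs(L1 - math.pi) < 1e-4, abs(L1 - L2) < 1e-4)
```

Each polygonal sum is a lower bound for the length, and refining the partition drives both toward the common value, which is the parameterization independence the text establishes next.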
The case of most interest is for simple curves. It turns out that in this case, the above
concept of length is a property which γ ∗ possesses independent of the parameterization
γ used to describe the set of points γ ∗ . To show this, it is helpful to use the following
lemma.
Let x_t ≡ tx₁ + (1 − t)x₂ and y_t ≡ ty₁ + (1 − t)y₂. Then x_t < y_t for all t ∈ [0, 1] because
\[ t x_1 \le t y_1 \quad \text{and} \quad (1-t)\,x_2 \le (1-t)\,y_2, \]
with strict inequality holding for at least one of these inequalities since not both t and (1 − t) can equal zero. Now define
\[ h(t) \equiv \phi(y_t) - \phi(x_t). \]
Since h is continuous and h(0) < 0, while h(1) > 0, there exists t ∈ (0, 1) such that h(t) = 0. Therefore, both x_t and y_t are points of (a, b) and φ(y_t) − φ(x_t) = 0, contradicting the assumption that φ is one to one. It follows φ is either strictly increasing or strictly decreasing on (a, b).
This property of being either strictly increasing or strictly decreasing on (a, b) carries
over to [a, b] by the continuity of φ. Suppose φ is strictly increasing on (a, b) , a similar
argument holding for φ strictly decreasing on (a, b) . If x > a, then pick y ∈ (a, x) and
from the above, φ (y) < φ (x) . Now by continuity of φ at a,
φ (a) = lim_{z→a+} φ (z) ≤ φ (y) < φ (x) .
Therefore, φ (a) < φ (x) whenever x ∈ (a, b) . Similarly φ (b) > φ (x) for all x ∈ (a, b).
It only remains to verify φ⁻¹ is continuous. Suppose then that sn → s where sn
and s are points of φ ([a, b]) . It is desired to verify that φ⁻¹ (sn ) → φ⁻¹ (s) . If this
does not happen, there exists ε > 0 and a subsequence, still denoted by sn , such that
|φ⁻¹ (sn ) − φ⁻¹ (s)| ≥ ε. Using the sequential compactness of [a, b] there exists a further
subsequence, still denoted by n, such that φ⁻¹ (sn ) → t1 ∈ [a, b] , t1 ≠ φ⁻¹ (s) . Then by
continuity of φ, it follows sn → φ (t1 ) and so s = φ (t1 ) . Therefore, t1 = φ⁻¹ (s) after
all, a contradiction. This proves the lemma. ■
Now suppose γ and η are two parameterizations of the simple curve γ ∗ as described
above. Thus γ ([a, b]) = γ ∗ = η ([c, d]) and the two continuous functions γ, η are one
to one on their respective open intervals. I need to show the two definitions of length
yield the same thing with either parameterization. Since γ ∗ is compact, it follows
from Theorem 5.1.3 on Page 88, both γ −1 and η −1 are continuous. Thus γ −1 ◦ η :
[c, d] → [a, b] is continuous. It is also uniformly continuous because [c, d] is compact. Let
P ≡ {t0 , · · ·, tn } be a partition of [a, b] , t0 < t1 < · · · < tn such that for L < V (γ, [a, b]) ,
L < Σ_{k=1}^{n} |γ (tk ) − γ (tk−1 )| ≤ V (γ, [a, b])
Note the sums approximating the total variation are all no larger than the total variation
because when another point is added in to the partition, it is an easy exercise in the
triangle inequality to show the corresponding sum either becomes larger or stays the
same.
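The refinement remark above can be checked numerically. The following sketch (the curve and partitions are illustrative choices, not from the text) shows a polygonal sum growing under refinement while staying below the total variation:

```python
import math

def polygonal_sum(gamma, ts):
    """Sum of |gamma(t_k) - gamma(t_{k-1})| over the partition ts."""
    pts = [gamma(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

# gamma traces the unit circle once; the exact length is 2*pi.
gamma = lambda t: (math.cos(t), math.sin(t))

coarse = [2 * math.pi * k / 8 for k in range(9)]    # 8 subintervals
fine = [2 * math.pi * k / 16 for k in range(17)]    # a refinement of coarse

s_coarse = polygonal_sum(gamma, coarse)
s_fine = polygonal_sum(gamma, fine)
# By the triangle inequality, s_coarse <= s_fine <= V(gamma, [0, 2*pi]) = 2*pi.
```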
Let γ −1 ◦ η (sk ) = tk so that {s0 , · · ·, sn } is a partition of [c, d] . By the lemma, the
sk are either strictly decreasing or strictly increasing as a function of k, depending on
whether γ −1 ◦ η is increasing or decreasing. Thus γ (tk ) = η (sk ) and so
L < Σ_{k=1}^{n} |η (sk ) − η (sk−1 )| ≤ V (γ, [a, b])
It follows that whenever L < V (γ, [a, b]) , there exists a partition of [c, d] , {s0 , · · ·, sn }
such that
L < Σ_{k=1}^{n} |η (sk ) − η (sk−1 )|
It follows that for every L < V (γ, [a, b]) , V (η, [c, d]) ≥ L which requires V (η, [c, d]) ≥
V (γ, [a, b]). Turning the argument around, it follows
V (η, [c, d]) = V (γ, [a, b]) .
This proves the following fundamental theorem.
Theorem 14.1.3 Let Γ be a simple curve and let γ be a parameterization for
Γ where γ is one to one on (a, b), continuous on [a, b] and of bounded variation. Then
the total variation
V (γ, [a, b])
can be used as a definition for the length of Γ in the sense that if Γ = η ([c, d]) where
η is a continuous function which is one to one on (c, d) with η ([c, d]) = Γ,
V (γ, [a, b]) = V (η, [c, d]) .
This common value can be denoted by V (Γ) and is called the length of Γ.
The length is not dependent on parameterization. Simple curves which have such
parameterizations are called rectifiable.
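Theorem 14.1.3 lends itself to a numerical illustration. In this sketch, which is not part of the text, two different one to one parameterizations of the same semicircle are given, and fine-partition sums approximate the same length π:

```python
import math

def variation(gamma, a, b, n=2000):
    """Approximate V(gamma, [a, b]) by a polygonal sum over a fine partition."""
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    pts = [gamma(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

# Two one-to-one parameterizations of the same semicircle (illustrative choices).
gamma = lambda t: (math.cos(t), math.sin(t))                            # t in [0, pi]
eta = lambda s: (math.cos(math.pi * s * s), math.sin(math.pi * s * s))  # s in [0, 1]

v1 = variation(gamma, 0.0, math.pi)
v2 = variation(eta, 0.0, 1.0)
# Both approximate the common length pi of the semicircle.
```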
14.1.2 Orientation
There is another notion called orientation. For simple rectifiable curves, you can think
of it as a direction of motion over the curve but what does this really mean for a wriggly
curve? A precise description is needed.
Definition 14.1.4 Let η, γ be continuous one to one parameterizations for a
simple rectifiable curve. If η −1 ◦ γ is increasing, then γ and η are said to be equivalent
parameterizations and this is written as γ ∼ η. It is also said that the two parameteri-
zations give the same orientation for the curve when γ ∼ η.
When the parameterizations are equivalent, they preserve the direction of motion
along the curve and this also shows there are exactly two orientations of the curve
since either η −1 ◦ γ is increasing or it is decreasing thanks to Lemma 14.1.2. In simple
language, the message is that there are exactly two directions of motion along a simple
curve.
Lemma 14.1.5 The following hold for ∼.
γ ∼ γ, (14.1)
If γ ∼ η then η ∼ γ, (14.2)
If γ ∼ η and η ∼ θ, then γ ∼ θ. (14.3)
Proof: Formula 14.1 is obvious because γ⁻¹ ◦ γ (t) = t so it is clearly an increasing
function. If γ ∼ η then γ⁻¹ ◦ η is increasing. Now η⁻¹ ◦ γ must also be increasing because
it is the inverse of γ⁻¹ ◦ η. This verifies 14.2. To see 14.3, γ⁻¹ ◦ θ = (γ⁻¹ ◦ η) ◦ (η⁻¹ ◦ θ)
and so since both of these functions are increasing, it follows γ⁻¹ ◦ θ is also increasing.
This proves the lemma. ■
Definition 14.1.6 Let Γ be a simple rectifiable curve and let γ be a parame-
terization for Γ. Denoting by [γ] the equivalence class of parameterizations determined
by the above equivalence relation, the following pair will be called an oriented curve.
(Γ, [γ])
In simple language, an oriented curve is one which has a direction of motion specified.
Actually, people usually just write Γ and there is understood a direction of motion
or orientation on Γ. How can you identify which orientation is being considered?
Proposition 14.1.7 Let (Γ, [γ]) be an oriented simple curve and let p, q be any
two distinct points of Γ. Then [γ] is determined by the order of γ −1 (p) and γ −1 (q).
This means that η ∈ [γ] if and only if η −1 (p) and η −1 (q) occur in the same order as
γ −1 (p) and γ −1 (q).
Proof: Suppose γ⁻¹ (p) < γ⁻¹ (q) and let η ∈ [γ] . Is it true that η⁻¹ (p) <
η⁻¹ (q)? Of course it is because γ⁻¹ ◦ η is increasing. Therefore, if η⁻¹ (p) > η⁻¹ (q)
it would follow
γ⁻¹ (p) = γ⁻¹ ◦ η (η⁻¹ (p)) > γ⁻¹ ◦ η (η⁻¹ (q)) = γ⁻¹ (q)
which is a contradiction. Thus if γ⁻¹ (p) < γ⁻¹ (q) for one γ ∈ [γ] , then this is true
for all η ∈ [γ].
Now suppose η is a parameterization for Γ defined on [c, d] which has the property
that
η⁻¹ (p) < η⁻¹ (q) .
Is η ∈ [γ]? By Lemma 14.1.2, γ⁻¹ ◦ η is either increasing or decreasing, so it suffices
to verify γ⁻¹ ◦ η (η⁻¹ (p)) < γ⁻¹ ◦ η (η⁻¹ (q)). Yes because these reduce to γ⁻¹ (p) on the left and γ⁻¹ (q) on the right. It is given
that γ⁻¹ (p) < γ⁻¹ (q) . This proves the proposition. ■
This shows that the direction of motion on the curve is determined by any two
points and the determination of which is encountered first by any parameterization in
the equivalence class of parameterizations which determines the orientation. Sometimes
people indicate this direction of motion by drawing an arrow.
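Proposition 14.1.7 can be illustrated with a small sketch. The parameterizations below, of the segment from (0, 0) to (1, 1), are hypothetical choices; the point is that equivalent parameterizations encounter p before q while the reversed one does not:

```python
# Each parameterization of the segment from (0,0) to (1,1) is listed with its
# explicit inverse, evaluated at two marked points p and q.
gamma_inv = lambda pt: pt[0]            # gamma(t) = (t, t),      t in [0, 1]
eta_inv   = lambda pt: pt[0] ** 0.5     # eta(s)   = (s^2, s^2),  s in [0, 1]
theta_inv = lambda pt: 1 - pt[0]        # theta(u) = (1-u, 1-u),  u in [0, 1]

p, q = (0.25, 0.25), (0.64, 0.64)

# eta traverses the segment in the same direction as gamma; theta reverses it.
same_order_eta = (gamma_inv(p) < gamma_inv(q)) == (eta_inv(p) < eta_inv(q))
same_order_theta = (gamma_inv(p) < gamma_inv(q)) == (theta_inv(p) < theta_inv(q))
```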
Now here is an interesting observation relative to two simple closed rectifiable curves.
The situation is illustrated by the following picture.
(Figure: the oriented curves Γ1 and Γ2 meeting along l, with γ 1 (α) = γ 2 (β) at one endpoint of l and γ 1 (δ) = γ 2 (θ) at the other.)
Proposition 14.1.8 Let Γ1 and Γ2 be two simple closed rectifiable oriented curves
and let their intersection be l. Suppose also that l is itself a simple curve. Also suppose
the orientation of l when considered a part of Γ1 is opposite its orientation when consid-
ered a part of Γ2 . Then if the open segment (l except for its endpoints) of l is removed,
the result is a simple closed rectifiable curve Γ. This curve has a parameterization γ
with the property that on γj⁻¹ (Γ ∩ Γj ) , γ⁻¹ ◦ γ j is increasing. In other words, Γ has an
orientation consistent with that of Γ1 and Γ2 . Furthermore, if Γ has such a consistent
orientation, then the orientations of l as part of the two simple closed curves, Γ1 and
Γ2 are opposite.
(Figure: the combined simple closed curve Γ after removing the open segment of l, still passing through γ 1 (α) = γ 2 (β) and γ 1 (δ) = γ 2 (θ).)
Proof: Let Γ1 = γ 1 ([a, b]) , γ 1 (a) = γ 1 (b) , and Γ2 = γ 2 ([c, d]) , γ 2 (c) = γ 2 (d) ,
with l = γ 1 ([α, δ]) = γ 2 ([θ, β]). (Recall continuous images of connected sets are con-
nected and the connected sets on the real line are intervals.) By the assumption the two
orientations are opposite, something can be said about the relationship of α, δ, θ, β. Sup-
pose without loss of generality that α < δ. Then because of this assumption it follows
γ 2 (θ) = γ 1 (δ) , γ 2 (β) = γ 1 (α). The following diagram might be useful to summarize
what was just said.
a ——— α ——— δ ——— b   (γ 1 )
c ——— θ ——— β ——— d   (γ 2 )
Note the first of the interval [β, d] matches the last of the interval [a, α] and the first
of [δ, b] matches the last of [c, θ] , all this in terms of where these points are sent.
Now I need to describe the parameterization of Γ ≡ Γ1 ∪ Γ2 . To verify it is a
simple closed curve, I must produce an interval and a mapping from this interval to Γ
which satisfies the conditions needed for γ to be a simple closed rectifiable curve. The
following is the definition as well as a description of which part of Γj is being obtained.
It is helpful to look at the above picture and the following picture in which there are
intervals placed next to each other. Above each is where the left end point starts off
followed by its length and finally where it ends up.
γ 1 (a), α − a, γ 1 (α) γ 2 (β), d − β, γ 2 (d) γ 2 (c), θ − c, γ 2 (θ) γ 1 (δ), b − δ, γ 1 (b)
Note it ends up where it started, at γ 1 (a) = γ 1 (b). The following involved description
is nothing but the above picture with the edges of the little intervals computed along
with a description of γ which corresponds to the above picture.
Then γ (t) is given by

γ (t) ≡
  γ 1 (t) ,  t ∈ [a, α] ;  γ 1 (a) → γ 1 (α) = γ 2 (β)
  γ 2 (t + β − α) ,  t ∈ [α, α + d − β] ;  γ 2 (β) → γ 2 (d) = γ 2 (c)
  γ 2 (t + c − α − d + β) ,  t ∈ [α + d − β, α + d − β + θ − c] ;  γ 2 (c) = γ 2 (d) → γ 2 (θ) = γ 1 (δ)
  γ 1 (t − α − d + β − θ + c + δ) ,  t ∈ [α + d − β + θ − c, α + d − β + θ − c + b − δ] ;  γ 1 (δ) → γ 1 (b) = γ 1 (a)
The construction shows γ is one to one on
(a, α + d − β + θ − c + b − δ)
and if t is in this open interval, then
γ (t) 6= γ (a) = γ 1 (a)
and
γ (t) 6= γ (α + d − β + θ − c + b − δ) = γ 1 (b) .
Also
γ (a) = γ 1 (a) = γ (α + d − β + θ − c + b − δ) = γ 1 (b)
so it is a simple closed curve. The claim about preserving the orientation is also obvious
from the formula. Note that t is never subtracted.
It only remains to prove the last claim. Suppose then that it is not so and l has
the same orientation as part of each Γj . Then from a repeat of the above argument,
you could change the orientation of l relative to Γ2 and obtain an orientation of Γ
which is consistent with that of Γ1 and Γ2 . Call a parameterization which has this
new orientation γ n while γ is the one which is assumed to exist. This new orientation
of l changes the orientation of Γ2 because there are two points in l. Therefore on
γ2⁻¹ (Γ ∩ Γ2 ) , γn⁻¹ ◦ γ 2 is decreasing while γ⁻¹ ◦ γ 2 is assumed to be increasing. Hence γ
and γ n are not equivalent. However, the above construction would leave the orientation
of both γ 1 ([a, α]) and γ 1 ([δ, b]) unchanged and at least one of these must have at least
two points. Thus the orientation of Γ must be the same for γ n as for γ. That is, γ ∼ γ n .
This is a contradiction. This proves the proposition. ■
There is a slightly different aspect of the above proposition which is interesting. It
involves using the shared segment to orient the simple closed curve Γ.
Corollary 14.1.9 Let the intersection of simple closed rectifiable curves, Γ1 and Γ2
consist of the simple curve l. Then place opposite orientations on l, and use these two
different orientations to specify orientations of Γ1 and Γ2 . Then letting Γ denote the
simple closed curve which is obtained from deleting the open segment of l, there exists
an orientation for Γ which is consistent with the orientations of Γ1 and Γ2 obtained
from the given specification of opposite orientations on l.
where τ j ∈ [tj−1 , tj ] . (Note this notation is a little sloppy because it does not identify
the specific point, τ j used. It is understood that this point is arbitrary.) Define ∫γ f ·dγ
as the unique number which satisfies the following condition. For all ε > 0 there exists
a δ > 0 such that if ||P|| ≤ δ, then
|∫γ f ·dγ − S (P)| < ε.
Then γ ∗ is a set of points in Rn and as t moves from a to b, γ (t) moves from γ (a)
to γ (b) . Thus γ ∗ has a first point and a last point. (In the case of a closed curve these
are the same point.) If φ : [c, d] → [a, b] is a continuous nondecreasing function, then
γ ◦ φ : [c, d] → Rn is also of bounded variation and yields the same set of points in Rn
with the same first and last points.
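The definition of ∫γ f ·dγ through the sums S (P), together with its invariance under an increasing reparameterization φ, can be sketched numerically. The field, curve, and φ below are illustrative choices, not from the text:

```python
import math

def rs_sum(f, gamma, a, b, n=2000):
    """Riemann-Stieltjes sum  sum_j f(gamma(tau_j)) . (gamma(t_j) - gamma(t_{j-1}))
    with midpoint tags on a uniform partition of [a, b]."""
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        fx, fy = f(gamma(0.5 * (t0 + t1)))
        (x0, y0), (x1, y1) = gamma(t0), gamma(t1)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

f = lambda p: (-p[1], p[0])                   # f(x, y) = (-y, x)
gamma = lambda t: (math.cos(t), math.sin(t))  # quarter circle, t in [0, pi/2]
phi = lambda s: s * s                         # increasing reparameterization

i1 = rs_sum(f, gamma, 0.0, math.pi / 2)
i2 = rs_sum(f, lambda s: gamma(phi(s)), 0.0, math.sqrt(math.pi / 2))
# Both sums approximate the same line integral (exact value pi/2 here).
```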
exists, so does
∫_{γ◦φ} f ·d (γ ◦ φ)
and
∫γ f ·dγ = ∫_{γ◦φ} f ·d (γ ◦ φ) .   (14.4)
Proof: There exists δ > 0 such that if P is a partition of [a, b] such that ||P|| < δ,
then
|∫γ f ·dγ − S (P)| < ε.
By continuity of φ, there exists σ > 0 such that if Q is a partition of [c, d] with ||Q|| <
σ, Q = {s0 , · · ·, sn } , then |φ (sj ) − φ (sj−1 )| < δ. Thus letting P denote the points in
[a, b] given by φ (sj ) for sj ∈ Q, it follows that ||P|| < δ and so
|∫γ f ·dγ − Σ_{j=1}^{n} f (γ (φ (τ j ))) · (γ (φ (sj )) − γ (φ (sj−1 )))| < ε
S (P) ≡ Σ_{j=1}^{p−1} f (γ (τ j )) · (γ (tj ) − γ (tj−1 )) + f (γ (σ∗ )) · (γ (t∗ ) − γ (tp−1 ))
+ f (γ (σ∗ )) · (γ (tp ) − γ (t∗ )) + Σ_{j=p+1}^{n} f (γ (σ j )) · (γ (tj ) − γ (tj−1 )) ,
Therefore,
|S (P) − S (Q)| ≤ Σ_{j=1}^{p−1} (1/m) |γ (tj ) − γ (tj−1 )| + (1/m) |γ (t∗ ) − γ (tp−1 )|
+ (1/m) |γ (tp ) − γ (t∗ )| + Σ_{j=p+1}^{n} (1/m) |γ (tj ) − γ (tj−1 )| ≤ (1/m) V (γ, [a, b]) .   (14.7)
Clearly the extreme inequalities would be valid in 14.7 if Q had more than one extra
point. You simply do the above trick more than one time. Let S (P) and S (Q) be
Riemann Stieltjes sums for which ||P|| and ||Q|| are less than δm and let R ≡ P ∪ Q.
Then from what was just observed,
|S (P) − S (Q)| ≤ |S (P) − S (R)| + |S (R) − S (Q)| ≤ (2/m) V (γ, [a, b]) .
Then
|∫γ f ·dγ| ≤ M V (γ, [a, b]) .   (14.9)
In case γ (a) = γ (b) so the curve is a closed curve and for fk the k th component of f ,
mk ≤ fk (x) ≤ Mk
Proof: Let 14.8 hold. From the proof of Theorem 14.2.3, when ||P|| < δm ,
|∫γ f ·dγ − S (P)| ≤ (2/m) V (γ, [a, b])
and so
|∫γ f ·dγ| ≤ |S (P)| + (2/m) V (γ, [a, b])
Using the Cauchy Schwarz inequality and the above estimate in S (P) ,
≤ Σ_{j=1}^{n} M |γ (tj ) − γ (tj−1 )| + (2/m) V (γ, [a, b]) ≤ M V (γ, [a, b]) + (2/m) V (γ, [a, b]) .
This proves 14.9 since m is arbitrary.
To verify 14.10 use the above inequality to write
|∫γ f ·dγ − ∫γ fm · dγ| = |∫γ (f − fm ) · dγ (t)|
Proof of the claim: Let P ≡ {t0 , · · · , tp } be a partition with the property that
|∫γ c·dγ − Σ_{k=1}^{p} c· (γ (tk ) − γ (tk−1 ))| < ε.
where
||γ′||∞ ≡ max {|γ′ (t)| : t ∈ [a, b]}
which exists because γ′ is given to be continuous. Therefore it follows V (γ, [a, b]) ≤
||γ′||∞ (b − a) . This proves the lemma. ■
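The estimate V (γ, [a, b]) ≤ ||γ′||∞ (b − a) is easy to check numerically. Here is a sketch with an illustrative C¹ curve, not taken from the text:

```python
import math

gamma = lambda t: (t, math.sin(t))       # a C^1 curve on [0, 2*pi]
dgamma = lambda t: (1.0, math.cos(t))    # its derivative gamma'
a, b, n = 0.0, 2 * math.pi, 4000

ts = [a + (b - a) * j / n for j in range(n + 1)]
# Polygonal approximation of the total variation V(gamma, [a, b]).
variation = sum(math.dist(gamma(t0), gamma(t1)) for t0, t1 in zip(ts, ts[1:]))
# Grid approximation of ||gamma'||_inf.
sup_norm = max(math.hypot(*dgamma(t)) for t in ts)
bound = sup_norm * (b - a)
# The lemma asserts variation <= bound.
```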
The following is a useful theorem for reducing bounded variation curves to ones
which have a C 1 parameterization.
Proof: Extend γ to be defined on all R according to the rule γ (t) = γ (a) if t < a
and γ (t) = γ (b) if t > b. Now define
γ h (t) ≡ (1/(2h)) ∫_{−2h+t+(2h/(b−a))(t−a)}^{t+(2h/(b−a))(t−a)} γ (s) ds
where the integral is defined in the obvious way, that is componentwise. Since γ is
continuous, this is certainly possible. Then
γ h (b) ≡ (1/(2h)) ∫_b^{b+2h} γ (s) ds = (1/(2h)) ∫_b^{b+2h} γ (b) ds = γ (b) ,
γ h (a) ≡ (1/(2h)) ∫_{a−2h}^{a} γ (s) ds = (1/(2h)) ∫_{a−2h}^{a} γ (a) ds = γ (a) .
Also, because of continuity of γ and the fundamental theorem of calculus,
γ h′ (t) = (1/(2h)) {γ (t + (2h/(b−a)) (t − a)) (1 + 2h/(b−a)) − γ (−2h + t + (2h/(b−a)) (t − a)) (1 + 2h/(b−a))}
and so γ h ∈ C¹ ([a, b]) . The following lemma is significant.
Σ_{j=1}^{n} |γ h (tj ) − γ h (tj−1 )|
≤ (1/(2h)) ∫_0^{2h} Σ_{j=1}^{n} |γ (s − 2h + tj + (2h/(b−a)) (tj − a)) − γ (s − 2h + tj−1 + (2h/(b−a)) (tj−1 − a))| ds.
For a given s ∈ [0, 2h] , the points s − 2h + tj + (2h/(b−a)) (tj − a) for j = 1, · · ·, n form an
increasing list of points in the interval [a − 2h, b + 2h] and so the integrand is bounded
above by V (γ, [a − 2h, b + 2h]) = V (γ, [a, b]) . It follows
Σ_{j=1}^{n} |γ h (tj ) − γ h (tj−1 )| ≤ V (γ, [a, b])
for all h < 1. Here S (P) is a Riemann Stieltjes sum of the form
Σ_{i=1}^{n} f (γ (τ i )) · (γ (ti ) − γ (ti−1 ))
and Sh (P) is a similar Riemann Stieltjes sum taken with respect to γ h instead of γ.
Because of 14.15 γ h (t) has values in H ⊆ Ω. Therefore, fix the partition P, and choose
h small enough that in addition to this, the following inequality is valid.
|S (P) − Sh (P)| < ε/3
Let η ≡ γ h . Formula 14.14 follows from the lemma. This proves the theorem. ■
This is a very useful theorem because if γ is C¹ ([a, b]) , it is easy to calculate ∫γ f ·dγ
and the above theorem allows a reduction to the case where γ is C¹. The next theorem
shows how easy it is to compute these integrals in the case where γ is C¹. First note
that if f is continuous and γ ∈ C¹ ([a, b]) , then by Lemma 14.2.5 and the fundamental
existence theorem, Theorem 14.2.3, ∫γ f ·dγ exists.
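The moving average γ h used in the proof of the approximation theorem can be sketched numerically. The quadrature below is a hypothetical discretization of the defining integral; the endpoint identities γ h (a) = γ (a) and γ h (b) = γ (b) come out exactly because the extended curve is constant on the end windows:

```python
import math

def mollify(gamma, a, b, h, m=400):
    """gamma_h(t): average of the (constantly extended) gamma over the sliding
    window [t - 2h + (2h/(b-a))(t-a), t + (2h/(b-a))(t-a)] of length 2h."""
    def gamma_ext(s):                      # constant extension off [a, b]
        return gamma(min(max(s, a), b))
    def gamma_h(t):
        lo = -2 * h + t + (2 * h / (b - a)) * (t - a)
        vals = [gamma_ext(lo + 2 * h * (k + 0.5) / m) for k in range(m)]
        return tuple(sum(c) / m for c in zip(*vals))
    return gamma_h

gamma = lambda t: (math.cos(t), math.sin(t))   # a test curve on [0, pi]
a, b, h = 0.0, math.pi, 0.05
gamma_h = mollify(gamma, a, b, h)

# The smoothed curve has variation no larger than V(gamma, [a, b]) = pi.
n = 100
ts = [a + (b - a) * j / n for j in range(n + 1)]
v_smooth = sum(math.dist(gamma_h(t0), gamma_h(t1)) for t0, t1 in zip(ts, ts[1:]))
```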
Proof: Let P be a partition of [a, b], P = {t0 , · · ·, tn } and ||P|| is small enough that
whenever |t − s| < ||P|| ,
|f (γ (t)) − f (γ (s))| < ε (14.17)
and
|∫γ f ·dγ − Σ_{j=1}^{n} f (γ (τ j )) · (γ (tj ) − γ (tj−1 ))| < ε.
Now
Σ_{j=1}^{n} f (γ (τ j )) · (γ (tj ) − γ (tj−1 )) = ∫_a^b Σ_{j=1}^{n} f (γ (τ j )) · X[tj−1 ,tj ] (s) γ′ (s) ds
where here
X[p,q] (s) ≡ { 1 if s ∈ [p, q] ; 0 if s ∉ [p, q] } .
Also,
∫_a^b f (γ (s)) · γ′ (s) ds = ∫_a^b Σ_{j=1}^{n} f (γ (s)) · X[tj−1 ,tj ] (s) γ′ (s) ds
and so, subtracting and using 14.17,
|Σ_{j=1}^{n} f (γ (τ j )) · (γ (tj ) − γ (tj−1 )) − ∫_a^b f (γ (s)) · γ′ (s) ds|
≤ Σ_{j=1}^{n} ∫_{tj−1}^{tj} |f (γ (τ j )) − f (γ (s))| |γ′ (s)| ds
≤ ||γ′||∞ ε Σ_j (tj − tj−1 ) = ε ||γ′||∞ (b − a) .
It follows that
|∫γ f ·dγ − ∫_a^b f (γ (s)) · γ′ (s) ds|
≤ |∫γ f ·dγ − Σ_{j=1}^{n} f (γ (τ j )) · (γ (tj ) − γ (tj−1 ))|
+ |Σ_{j=1}^{n} f (γ (τ j )) · (γ (tj ) − γ (tj−1 )) − ∫_a^b f (γ (s)) · γ′ (s) ds|
≤ ε ||γ′||∞ (b − a) + ε.
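The reduction just proved, ∫γ f ·dγ = ∫_a^b f (γ (s)) · γ′ (s) ds for a C¹ parameterization, can be checked numerically. The curve and field below are illustrative choices; for them the exact value is 1:

```python
gamma = lambda t: (t, t * t)      # C^1 curve on [0, 1]
dgamma = lambda t: (1.0, 2 * t)   # its derivative
f = lambda p: (p[1], p[0])        # f(x, y) = (y, x); exact integral is 1

n = 2000
ts = [j / n for j in range(n + 1)]

# Riemann-Stieltjes sum with midpoint tags (left side of the identity).
rs = 0.0
for t0, t1 in zip(ts, ts[1:]):
    fx, fy = f(gamma(0.5 * (t0 + t1)))
    (x0, y0), (x1, y1) = gamma(t0), gamma(t1)
    rs += fx * (x1 - x0) + fy * (y1 - y0)

# Midpoint-rule quadrature of f(gamma(s)) . gamma'(s) (right side).
quad = 0.0
for t0, t1 in zip(ts, ts[1:]):
    s = 0.5 * (t0 + t1)
    fx, fy = f(gamma(s))
    dx, dy = dgamma(s)
    quad += (fx * dx + fy * dy) * (t1 - t0)
```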
The following lemma is useful and follows quickly from Theorem 14.2.2.
Lemma 14.2.10 In the above definition, there exists a continuous bounded variation
function, γ defined on some closed interval, [c, d] , such that γ ([c, d]) = ∪_{k=1}^{m} γ k ([ak , bk ])
and γ (c) = γ 1 (a1 ) while γ (d) = γ m (bm ) . Furthermore,
∫γ f · dγ = Σ_{k=1}^{m} ∫_{γ k} f · dγ k .
The following theorem shows that it is very easy to compute a line integral when
the function has a potential.
Proof: By Theorem 14.2.6 there exists η ∈ C¹ ([a, b]) such that γ (a) = η (a) , and
γ (b) = η (b) such that
|∫γ f · dγ − ∫η f · dη| < ε.
Then from Theorem 14.2.8, since η is in C¹ ([a, b]) , it follows from the chain rule and
the fundamental theorem of calculus that
∫η f · dη = ∫_a^b f (η (t)) · η′ (t) dt = ∫_a^b (d/dt) F (η (t)) dt
= F (η (b)) − F (η (a)) = F (γ (b)) − F (γ (a)) .
Therefore,
|(F (γ (b)) − F (γ (a))) − ∫γ f (z) dz| < ε
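The conclusion ∫γ f · dγ = F (γ (b)) − F (γ (a)) for a field with a potential F can be sketched numerically. The potential and curve below are illustrative choices, not from the text:

```python
import math

F = lambda p: p[0] ** 2 * p[1] + p[1]              # potential F(x, y) = x^2 y + y
gradF = lambda p: (2 * p[0] * p[1], p[0] ** 2 + 1)  # f = grad F
gamma = lambda t: (math.cos(t), math.sin(t))        # curve from (1,0) to (0,1)
a, b, n = 0.0, math.pi / 2, 2000

# Riemann-Stieltjes sum approximating the line integral of grad F over gamma.
ts = [a + (b - a) * j / n for j in range(n + 1)]
integral = 0.0
for t0, t1 in zip(ts, ts[1:]):
    fx, fy = gradF(gamma(0.5 * (t0 + t1)))
    (x0, y0), (x1, y1) = gamma(t0), gamma(t1)
    integral += fx * (x1 - x0) + fy * (y1 - y0)

difference = F(gamma(b)) - F(gamma(a))   # the theorem says these agree
```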
is path independent for all γ a bounded variation curve such that γ ∗ is contained in Ω.
This means the above line integral depends only on γ (a) and γ (b).
Proof: The first part was proved in Theorem 14.2.11. It remains to verify the
existence of a potential in the situation of path independence.
Let x0 ∈ Ω be fixed. Let S be the points x of Ω which have the property there is
a bounded variation curve joining x0 to x. Let γ x0 x denote such a curve. Note first
that S is nonempty. To see this, B (x0 , r) ⊆ Ω for r small enough. Every x ∈ B (x0 , r)
is in S. Then S is open because if x ∈ S, then B (x, r) ⊆ Ω for small enough r and if
y ∈ B (x, r) , you could take γ x0 x and from x follow the straight line segment joining
x to y. In addition to this, Ω \ S must also be open because if x ∈ Ω \ S, then choosing
B (x, r) ⊆ Ω, no point of B (x, r) can be in S because then you could take the straight
line segment from that point to x and conclude that x ∈ S after all. Therefore, since Ω
is connected, it follows Ω \ S = ∅. Thus for every x ∈ S, there exists γ x0 x , a bounded
variation curve from x0 to x.
Define
F (x) ≡ ∫_{γ x0 x} f · dγ x0 x
F is well defined by assumption. Now let lx(x+tek ) denote the linear segment from x to
x + tek . Thus to get to x + tek you could first follow γ x0 x to x and from there follow
lx(x+tek ) to x + tek . Hence
(F (x + tek ) − F (x)) /t = (1/t) ∫_{lx(x+tek )} f · dlx(x+tek ) = (1/t) ∫_0^t f (x + sek ) · ek ds → fk (x)
by continuity of f . Thus ∇F = f . This proves the theorem. ■
Corollary 14.2.14 Let Ω be a connected open set and f : Ω → Rn . Then f has a
potential if and only if every closed, γ (a) = γ (b) , bounded variation curve contained
in Ω has the property that
∫γ f · dγ = 0
Proof: Using Lemma 14.2.10, this condition about closed curves is equivalent to
the condition that the line integrals of the above theorem are path independent. This
proves the corollary. ■
Such a vector valued function is called conservative.
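Corollary 14.2.14 can be illustrated numerically: for a gradient field the integral over a closed curve is (numerically) zero, while for a field that is not conservative it is not. The fields below are illustrative choices, not from the text:

```python
import math

def closed_integral(f, n=4000):
    """Riemann-Stieltjes sum of f over the unit circle traversed once."""
    ts = [2 * math.pi * j / n for j in range(n + 1)]
    total = 0.0
    for t0, t1 in zip(ts, ts[1:]):
        tm = 0.5 * (t0 + t1)
        fx, fy = f((math.cos(tm), math.sin(tm)))
        total += fx * (math.cos(t1) - math.cos(t0)) + fy * (math.sin(t1) - math.sin(t0))
    return total

grad_field = lambda p: (2 * p[0] * p[1], p[0] ** 2 + 1)  # = grad(x^2 y + y), conservative
swirl = lambda p: (-p[1], p[0])                          # not conservative on R^2

i_grad = closed_integral(grad_field)   # close to 0
i_swirl = closed_integral(swirl)       # close to 2*pi, hence nonzero
```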
Note that θ⁻¹ is onto S¹ . The function is well defined because it sends the point γ (a) =
γ (b) to the same point, (1, 0) . It is also one to one. To see this note γ⁻¹ is one
to one on Γ \ {γ (a) , γ (b)} . What about the case where x ≠ γ (a) = γ (b)? Could
θ⁻¹ (x) = θ⁻¹ (γ (a))? In this case, γ⁻¹ (x) is in (a, b) while γ⁻¹ (γ (a)) = a so
θ⁻¹ (x) ≠ θ⁻¹ (γ (a)) = (1, 0) .
Thus θ⁻¹ is one to one on Γ.
Why is θ⁻¹ continuous? Suppose xn → γ (a) = γ (b) first. Why does θ⁻¹ (xn ) →
(1, 0) = θ⁻¹ (γ (a))? Let {xn } denote any subsequence of the given sequence. Then by
compactness of [a, b] there exists a further subsequence, still denoted by xn such that
γ⁻¹ (xn ) → t ∈ [a, b]
Hence by continuity of γ, xn → γ (t) and so γ (t) must equal γ (a) = γ (b) . It follows
from the assumption of what a simple curve is that t ∈ {a, b} . Hence θ⁻¹ (xn ) converges
to either
(cos ((2π/(b−a)) (a − a)) , sin ((2π/(b−a)) (a − a)))
or
(cos ((2π/(b−a)) (b − a)) , sin ((2π/(b−a)) (b − a)))
but these are the same point. This has shown that if xn → γ (a) = γ (b) , there is a
subsequence such that θ⁻¹ (xn ) → θ⁻¹ (γ (a)) . Thus θ⁻¹ is continuous at γ (a) = γ (b).
Next suppose xn → x ≠ γ (a) ≡ p. Then there exists B (p, r) such that for all n
large enough, xn and x are contained in the compact set Γ \ B (p, r) ≡ K. Then γ is
continuous and one to one on the compact set γ⁻¹ (K) ⊆ (a, b) and so by Theorem 5.1.3
γ⁻¹ is continuous on K. In particular it is continuous at x so θ⁻¹ (xn ) → θ⁻¹ (x). This
proves the lemma. ■
Theorem 14.3.4 Let C denote the unit circle, {(x, y) ∈ R2 : x2 + y2 = 1}. Suppose γ : C → Γ ⊆ R2 is one to one onto and continuous. Then R2 \ Γ consists of two
components, a bounded component (called the inside) Ui and an unbounded component
(called the outside), Uo . Also the boundary of each of these two components of R2 \ Γ
is Γ and Γ has empty interior.
Proof: That R2 \ Γ consists of two components, Uo and Ui follows from the Jordan
separation theorem. There is exactly one unbounded component because Γ is bounded
and so Ui is defined as the bounded component. It remains to verify the assertion about
Γ being the boundary. Let x be a limit point of Ui . Then it can’t be in Uo because these
are both open sets. Therefore, all the limit points of Ui are in Ui ∪ Γ. Similarly all the
limit points of Uo are in Uo ∪ Γ. Thus ∂Ui ⊆ Γ and ∂Uo ⊆ Γ.
I claim Γ has empty interior. This follows because by Theorem 5.1.3 on Page 88,
γ and γ −1 must both be continuous since C is compact. Thus if B is an open ball
contained in Γ, it follows from invariance of domain that γ −1 (B) is an open set in
R2 . But this needs to be contained in C which is a contradiction because C has empty
interior obviously.
Now let x ∈ R2 be such that x ∉ Uo ∪ Ui . Then x ∈ Γ and must be a limit point of
either Uo or Ui since if this were not so, Γ would be forced to have nonempty interior.
Hence the closures of Uo and Ui cover R2 . Next I will show ∂Ui = ∂Uo . Suppose then that
p ∈ ∂Ui \ ∂Uo
Ui = U1i ∪ γ ∗o ∪ U2i
Proof: Denote by C the unit circle and let θ : C → Γ be continuous one to one and
onto. Say θ (a) = p and θ (b) = q. Let Cj , j = 1, 2 denote the two circular arcs joining
a and b. Thus letting Γj ≡ θ (Cj ) it follows Γ1 , Γ2 are simple curves whose union is Γ
which intersect at the points p and q. Letting Γj ∪ γ ∗ ≡ Jj it follows Jj is a simple
closed curve. Here is why. Define
h1 (x) ≡ { θ (x) if x ∈ C1 ; f2 (x) if x ∈ C2 }
(Figures: the simple closed curve Γ through p, split by the segment γ∗ into Γ1 and Γ2 with inside regions U1i and U2i ; the set U1i ∪ γ∗o ∪ U2i . The second copy of the picture marks a point y on γ∗ .)
Thus γ ∗o divides B (y, r) into two halves, H1 , H2 . The ball contains points of U2i
by the Jordan curve theorem. Say H2 contains some of these points. Then I claim H2
cannot contain any points of U1i . This is because if it did, there would be a segment
joining a point of U1i with a point of U2i which is contained in H2 which is a connected
open set which is therefore contained in a single component of J1^C . This is a contradiction
because as shown above, U2i ⊆ U1o . Could H2 contain any points of U2o ? No because
then there would be a segment joining a point of U2o to a point of U2i which is contained
in the same component of J2^C . Therefore, H2 consists entirely of points of U2i . Similarly
H1 consists entirely of points of U1i . Therefore, U1i ∪ γ ∗o ∪ U2i is an open set because
the only points which could possibly fail to be interior points, those on γ∗o , are interior
points of U1i ∪ γ ∗o ∪ U2i .
Suppose equality does not hold in 14.19. Then there exists w ∈ Ui \ (U1i ∪ γ∗o ∪ U2i ) .
Let x ∈ U1i ∪ γ∗o ∪ U2i . Then since Ui is connected and open, there exists a continuous
mapping r : [0, 1] → Ui such that r (0) = x and r (1) = w. Since U1i ∪ γ∗o ∪ U2i is open,
there exists a first point s in the closed set r⁻¹ ((U1i ∪ γ∗o ∪ U2i )^C) . Thus r (s) is a
limit point of U1i ∪ γ∗o ∪ U2i but is not in this set which implies it is in U1o ∩ U2o . It
follows r (s) is a limit point of either U1i or U2i because each point of γ∗o is a limit point
of U1i and U2i . Also, r (s) cannot be in γ∗o because it is not in U1i ∪ γ∗o ∪ U2i . Suppose
without loss of generality it is a limit point of U1i . Then every ball containing r (s)
must contain points of U1o ∩ U2o ⊆ U1o as well as points U1i . But by the Jordan curve
theorem, this implies r (s) is in J1 but is not in γ ∗o . Therefore, r (s) is a point of Γ and
this contradicts r (s) ∈ Ui . Therefore, equality must hold in 14.19 after all. This proves
the lemma. ■
The following lemma has to do with decomposing the inside and boundary of a simple
closed rectifiable curve into small pieces. The argument is like one given in Apostol
[3]. In doing this I will refer to a region as the union of a connected open set with its
boundary. Also, two regions will be said to be non overlapping if they either have empty
intersection or the intersection is contained in the intersection of their boundaries. The
height of a set A equals sup {|y1 − y2 | : (x1 , y1 ) , (x2 , y2 ) ∈ A} . The width of A will be
defined similarly.
Lemma 14.3.6 Let Γ be a simple closed rectifiable curve. Also let δ > 0 be given
such that 2δ is smaller than both the height and width of Γ. Then there exist finitely
many non overlapping regions {Rk }_{k=1}^{n} consisting of simple closed rectifiable curves
along with their interiors whose union equals Ui ∪ Γ. These regions consist of two kinds,
those contained in Ui and those with nonempty intersection with Γ. These latter regions
are called “border” regions. The boundary of a border region consists of straight line
segments parallel to the coordinate axes of the form x = mδ or y = kδ for m, k integers
along with arcs from Γ. The regions contained in Ui consist of rectangles. Thus all of
these regions have boundaries which are rectifiable simple closed curves. Also all regions
14.3. SIMPLE CLOSED RECTIFIABLE CURVES 379
are contained in a square having sides of length no more than 2δ. There are at most
4 (V (Γ) /δ + 1)
border regions. The construction also yields an orientation for Γ and for all these
regions, and the orientations for any segment shared by two regions are opposite.
Proof: Let y1 ≡ max {γ 2 (t) : t ∈ [a, b]} and let
y2 ≡ min {γ 2 (t) : t ∈ [a, b]} .
Thus (x1 , y1 ) is the “top” point of Γ while (x2 , y2 ) is the “bottom” point of Γ. Consider
the lines y = y1 and y = y2 . By assumption |y1 − y2 | > 2δ. Consider the line l given
by y = mδ where m is chosen to make mδ as close as possible to (y1 + y2 ) /2. Thus
y1 > mδ > y2 . By Theorem 14.3.4 or 13.0.26, (xj , yj ) , j = 1, 2, being on Γ, are both
limit points of Ui so there exist points pj ∈ Ui such that p1 is above l and p2 is below
l. (Simply pick pj very close to (xj , yj ) and yet in Ui and this will take place.) The
horizontal line l must have nonempty intersection with Ui because Ui is connected. If
it had empty intersection it would be possible to separate Ui into two nonempty open
sets, one containing p1 and the other containing p2 .
Let q be a point of Ui which is also in l. Then there exists a maximal segment
of the line l containing q which is contained in Ui ∪ Γ. This segment, γ ∗ satisfies the
conditions of Lemma 14.3.5 and so it divides Ui into disjoint open connected sets whose
boundaries are simple rectifiable closed curves. Note the line segment has finite length.
Letting Γj be the simple closed curve which contains pj , orient γ ∗ as part of Γ2 such
that motion is from right to left. As part of Γ1 the motion along the curve is from left
to right. By Proposition 14.1.7 this provides an orientation to each Γj . By Proposition
14.1.8 there exists an orientation for Γ which is consistent with these two orientations
on the Γj .
Now do the same process to the two simple closed curves just obtained and continue
till all regions have height less than 2δ. Each application of the process yields two new
non overlapping regions of the desired sort in place of an earlier region of the desired
sort except possibly the regions might have excessive height. The orientation of a new
line segment in the construction is determined from the orientations of the simple closed
curves obtained earlier. By Proposition 14.1.7 the orientations of the segments shared
by two regions are opposite so eventually the line integrals over these segments cancel.
Eventually this process ends because all regions have height less than 2δ. The reason for
this is that if it did not end, the curve Γ could not have finite total variation because
there would exist an arbitrarily large number of non overlapping regions each of which
have a pair of points which are farther apart than 2δ. This takes care of finding the
subregions so far as height is concerned.
Now follow the same process just described on each of the non overlapping “short”
regions just obtained using vertical rather than horizontal lines, letting the orientation
of the vertical edges be determined from the orientation already obtained, but this time
feature width instead of height and let the lines be vertical of the form x = kδ where k
is an integer.
How many border regions are there? Denote by V (Γ) the length of Γ. Now de-
compose Γ into N arcs of length δ with maybe one having length less than δ. Thus
N − 1 ≤ V (Γ) /δ and so
N ≤ V (Γ) /δ + 1
The resulting regions are each contained in a box having sides of length no more than
2δ in length. Each of these N arcs can’t intersect any more than four of these boxes.
Therefore, at most 4N boxes of the construction can intersect Γ. Thus there are no
more than
4 (V (Γ) /δ + 1)
border regions. This proves the lemma. ■
Lemma 14.3.7 Let R = [a, b] × [c, d] be a rectangle and let P, Q be functions which
are C 1 in some open set containing R. Orient the boundary of R as shown in the fol-
lowing picture. This is called the counter clockwise direction or the positive orientation
where
f (x, y) ≡ (P (x, y) , Q (x, y)) .
In this context the line integral is usually written using the notation
∫_{∂R} P dx + Qdy.
Proof: This follows from direct computation. A parameterization for the bottom
line of R is
γ B (t) = (a + t (b − a) , c) , t ∈ [0, 1]
A parameterization for the top line of R with the given orientation is
γ T (t) = (b + t (a − b) , d) , t ∈ [0, 1]
14.3. SIMPLE CLOSED RECTIFIABLE CURVES 381
= − ∫_a^b ∫_c^d Py (x, y) dy dx + ∫_c^d ∫_a^b Qx (x, y) dx dy

= ∫_R (Qx − Py ) dm2
by Fubini’s theorem, Theorem 9.2.3 on Page 212. (To use this theorem you can extend the functions to equal 0 off R.) This proves the lemma. ∎
Note that if the rectangle were oriented in the opposite way, you would get

∫_γ f · dγ = ∫_R (Py − Qx ) dm2
With this lemma, it is possible to prove Green’s theorem and also give an ana-
lytic criterion which will distinguish between different orientations of a simple closed
rectifiable curve. First here is a discussion which amounts to a computation.
Let Γ be a rectifiable simple closed curve with inside Ui and outside Uo . Let {Rk }_{k=1}^{nδ} denote the non overlapping regions of Lemma 14.3.6 all oriented as explained there and
let Γ also be oriented as explained there. It could be shown that all the regions contained
in Ui have positive orientation but this will not be fussed over here. What can be said
with no fussing is that since the shared edges have opposite orientations, all these interior
regions are either oriented positively or they are all oriented negatively.
Let Bδ be the set of border regions and let Iδ be the rectangles contained in Ui . Thus
in taking the sum of the line integrals over the boundaries of the interior rectangles, the
integrals over the “interior edges” cancel out and you are left with a line integral over
the exterior edges of a polygon which is composed of the union of the squares in Iδ .
Now let f (x, y) = (P (x, y) , Q (x, y)) be a vector field which is C 1 on Ui , and suppose
also that both Py and Qx are in L1 (Ui ) (Absolutely integrable) and that P, Q are
continuous on Ui ∪Γ. (An easy way to get all this to happen is to let P, Q be restrictions
to Ui ∪ Γ of functions which are C 1 on some open set containing Ui ∪ Γ.) Note that
∪δ>0 {R : R ∈ Iδ } = Ui
Let ∂R denote the boundary of R for R one of these regions of Lemma 14.3.6 oriented as described. Let wδ (R) denote

[ (max {Q (x) : x ∈ ∂R} − min {Q (x) : x ∈ ∂R})^2 + (max {P (x) : x ∈ ∂R} − min {P (x) : x ∈ ∂R})^2 ]^{1/2}

By uniform continuity of P, Q on the compact set Ui ∪ Γ, if δ is small enough, wδ (R) < ε for all R ∈ Bδ . Then for R ∈ Bδ , it follows from Theorem 14.2.4

| ∫_{∂R} f · dγ | ≤ wδ (R) (V (∂R)) < ε (V (∂R))   (14.20)
Denote by ΓR the part of Γ which is contained in R ∈ Bδ and V (ΓR ) its length. Then the above sum equals

ε ( Σ_{R∈Bδ} V (ΓR ) + Bδ ) = ε (V (Γ) + Bδ )
where Bδ is the sum of the lengths of the straight edges. This is easy to estimate. Recall from 14.3.6 there are no more than

4 (V (Γ) /δ + 1)

of these border regions. Furthermore, the sum of the lengths of all four edges of one of these is no more than 8δ and so

Bδ ≤ 4 (V (Γ) /δ + 1) 8δ = 32V (Γ) + 32δ.
Thus the absolute value of the second sum on the right in 14.21 is dominated by ε (33V (Γ) + 32δ) . Since ε was arbitrary, this formula implies with Green’s theorem proved above for squares,

∫_Γ f · dγ = lim_{δ→0} Σ_{R∈Iδ} ∫_{∂R} f · dγ + lim_{δ→0} Σ_{R∈Bδ} ∫_{∂R} f · dγ

= lim_{δ→0} Σ_{R∈Iδ} ∫_{∂R} f · dγ = lim_{δ→0} ± ∫_{Iδ} (Qx − Py ) dm2 = ± ∫_{Ui} (Qx − Py ) dm2

where the ± adjusts for whether the interior rectangles are all oriented positively or all oriented negatively. ∎
This has proved the general form of Green’s theorem, which is stated in the following theorem: if Γ is a rectifiable simple closed curve with inside Ui , f = (P, Q) is C 1 on Ui and continuous on Ui ∪ Γ, and

Qx , Py ∈ L1 (Ui ) ,

then for one of the two orientations of Γ,

∫_Γ f · dγ = ∫_{Ui} (Qx − Py ) dm2 .

If the formula fails for a given orientation, just take the other orientation for Γ. This proves the theorem. ∎
With this wonderful theorem, it is possible to give an analytic description of the two different orientations of a rectifiable simple closed curve. The positive orientation is the one for which Green’s theorem holds and the other one, called the negative orientation, is the one for which

∫_Γ f · dγ = ∫_{Ui} (Py − Qx ) dm2 .
There are other regions for which Green’s theorem holds besides just the inside and
boundary of a simple closed curve. For Γ a simple closed curve and Ui its inside, let’s refer to Ui ∪ Γ as a Jordan region. When you have two non overlapping Jordan regions
which intersect in a finite number of simple curves, you can delete the interiors of these
simple curves and what results will also be a region for which Green’s theorem holds.
This is illustrated in the following picture.
(Figure: two Jordan regions with insides U1i and U2i .)
There are two Jordan regions here with insides U1i and U2i and these regions intersect
in three simple curves. As indicated in the picture, opposite orientations are given to
each of these three simple curves. Then the line integrals over these cancel. The area
integrals add. Recall the two dimensional area of a bounded variation curve equals 0.
Denote by Γ the curve on the outside of the whole thing and Γ1 and Γ2 the oriented
boundaries of the two holes which result when the curves of intersection are removed,
the orientations as shown. Then letting f (x, y) = (P (x, y) , Q (x, y)) and U the indicated region,

(Figure: the region U with outer boundary Γ and hole boundaries Γ1 and Γ2 .)

∫_{∂U} f · dγ = ∫_U (Qx − Py ) dm2

where ∂U is oriented as indicated in the picture and involves the three oriented curves Γ, Γ1 , Γ2 .
Proof: Let

K ≡ { y ∈ R2 : dist (y, α∗ ) ≤ r }

where r is small enough that K ⊆ U . This is easily done because α∗ is compact.
Consider

Σ_{j=0}^{n−1} |R (α (tj+1 )) − R (α (tj ))|   (14.22)

Now if P is any partition, 14.22 can always be made larger by adding in points to P till ||P|| < δ and so this shows
Then
D1 G (v, x) = DR (x + v) − DR (x)
and so by uniform continuity of DR on the compact set K, it follows there exists δ > 0
such that if |v| < δ, then for all x ∈ α∗ ,
By Theorem 6.4.2 again it follows that for all x ∈ α∗ and |v| < δ,
Σ_{j=0}^{n−1} F (R (α (tj ))) · (R (α (tj+1 )) − R (α (tj )))

= Σ_{j=0}^{n−1} F (R (α (tj ))) · [ DR (α (tj )) (α (tj+1 ) − α (tj )) + o (α (tj+1 ) − α (tj )) ]
where, by 14.23,

|o (α (tj+1 ) − α (tj ))| < ε |α (tj+1 ) − α (tj )|
It follows

| Σ_{j=0}^{n−1} F (γ (tj )) · (γ (tj+1 ) − γ (tj )) − Σ_{j=0}^{n−1} F (R (α (tj ))) · DR (α (tj )) (α (tj+1 ) − α (tj )) |   (14.24)

≤ Σ_{j=0}^{n−1} |o (α (tj+1 ) − α (tj ))| ≤ ε Σ_{j=0}^{n−1} |α (tj+1 ) − α (tj )| ≤ ε V (α, [a, b])
A typical term of the second sum in 14.24, with DR expanded, is

F (R (α (tj ))) · (Ru (α (tj )) (α1 (tj+1 ) − α1 (tj )) + Rv (α (tj )) (α2 (tj+1 ) − α2 (tj )))
where the determinant is expanded formally along the top row. Let f : U → R3 for
U ⊆ R3 denote a vector field. The curl of the vector field yields another vector field
and it is defined as follows.
where here ∂j means the partial derivative with respect to xj and the subscript of i in
(curl (f ) (x))i means the ith Cartesian component of the vector, curl (f ) (x) . Thus the
curl is evaluated by expanding the following determinant along the top row.
| i  j  k |
| ∂/∂x  ∂/∂y  ∂/∂z | .
| f1 (x, y, z)  f2 (x, y, z)  f3 (x, y, z) |

Note the similarity with the cross product. More precisely and less evocatively,

∇ × f (x, y, z) ≡ ( ∂f3/∂y − ∂f2/∂z ) i + ( ∂f1/∂z − ∂f3/∂x ) j + ( ∂f2/∂x − ∂f1/∂y ) k.
where ∂U consists of some simple closed rectifiable oriented curves as explained above.
Here the u and v axes are in the same relation as the x and y axes.
γ j = R ◦ αj ,
where the αj are parameterizations for the oriented curves making up the boundary of
U such that the conclusion of Green’s theorem holds. Let S denote the surface,
S ≡ {R (u, v) : (u, v) ∈ U } ,
Σ_{j=1}^n ∫_{αj} ((F ◦ R) · Ru , (F ◦ R) · Rv ) · dαj
By the assumption that the conclusion of Green’s theorem holds for U , this equals

∫_U [ ((F ◦ R) · Rv )u − ((F ◦ R) · Ru )v ] dm2

= ∫_U [ (F ◦ R)u · Rv + (F ◦ R) · Rvu − (F ◦ R) · Ruv − (F ◦ R)v · Ru ] dm2

= ∫_U [ (F ◦ R)u · Rv − (F ◦ R)v · Ru ] dm2
the last step holding by equality of mixed partial derivatives, a result of the assumption
that R is C 2 . Now by Lemma 14.4.3, this equals
∫_U (Ru (u, v) × Rv (u, v)) · (∇ × F (R (u, v))) dm2
For an example of a right handed system of vectors, see the following picture.
(Figure: a right handed system of vectors a, b, c with c pointing up from the plane determined by a and b.)
In this picture the vector c points upwards from the plane determined by the other
two vectors. You should consider how a right hand system would differ from a left hand
system. Try using your left hand and you will see that the vector, c would need to point
in the opposite direction as it would for a right hand system.
From now on, the vectors, i, j, k will always form a right handed system. To repeat,
if you extend the fingers of your right hand along i and close them in the direction j,
the thumb points in the direction of k. Recall these are the basis vectors e1 , e2 , e3 .
The following is the geometric description of the cross product. It gives both the
direction and the magnitude and therefore specifies the vector.
(Figure: vectors a and b with angle θ between them; the component of b perpendicular to a has length |b| sin (θ).)
a × b = − (b × a) , a × a = 0, (14.26)
For α a scalar,
(αa) ×b = α (a × b) = a× (αb) , (14.27)
a× (b + c) = a × b + a × c, (14.28)
(b + c) × a = b × a + c × a. (14.29)
Formula 14.26 follows immediately from the definition. The vectors a × b and b × a
have the same magnitude, |a| |b| sin θ, and an application of the right hand rule shows
they have opposite direction. Formula 14.27 is also fairly clear. If α is a nonnegative
scalar, the direction of (αa) ×b is the same as the direction of a × b,α (a × b) and
a× (αb) while the magnitude is just α times the magnitude of a × b which is the same
as the magnitude of α (a × b) and a× (αb) . Using this yields equality in 14.27. In the
case where α < 0, everything works the same way except the vectors are all pointing in
the opposite direction and you must multiply by |α| when comparing their magnitudes.
The distributive laws are much harder to establish but the second follows from the first
quite easily. Thus, assuming the first, and using 14.26,
(b + c) × a = −a× (b + c)
= − (a × b + a × c)
= b × a + c × a.
To verify the distributive law one can consider something called the box product, which involves the parallelepiped determined by three vectors. If you pick three numbers, r, s, and t, each in [0, 1] , and form ra + sb + tc, then the collection of all such points is what is meant by the parallelepiped determined by these three vectors.
14.5. INTERPRETATION AND REVIEW 391
(Figure: the parallelepiped determined by a, b, c; the vector a × b is perpendicular to the base determined by a and b, and θ is the angle between c and a × b.)
You notice the area of the base of the parallelepiped, the parallelogram determined
by the vectors, a and b has area equal to |a × b| while the altitude of the parallelepiped
is |c| cos θ where θ is the angle shown in the picture between c and a × b. Therefore,
the volume of this parallelepiped is the area of the base times the altitude which is just
|a × b| |c| cos θ = a × b · c.
This expression is known as the box product and is sometimes written as [a, b, c] . You
should consider what happens if you interchange the b with the c or the a with the c.
You can see geometrically from drawing pictures that this merely introduces a minus
sign. In any case the box product of three vectors always equals either the volume of
the parallelepiped determined by the three vectors or else minus this volume. From
geometric reasoning like this you see that
a · b × c = a × b · c.
In other words, you can switch the × and the ·.
With this information, the following gives the coordinate description of the cross prod-
uct.
Proposition 14.5.4 Let a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k be two vectors.
Then
a × b = (a2 b3 − a3 b2 ) i + (a3 b1 − a1 b3 ) j + (a1 b2 − a2 b1 ) k.   (14.30)
Proof: From the above table and the properties of the cross product listed,
(a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k) =
a1 b2 i × j + a1 b3 i × k + a2 b1 j × i + a2 b3 j × k+
+a3 b1 k × i + a3 b2 k × j
= a1 b2 k − a1 b3 j − a2 b1 k + a2 b3 i + a3 b1 j − a3 b2 i
= (a2 b3 − a3 b2 ) i+ (a3 b1 − a1 b3 ) j+ (a1 b2 − a2 b1 ) k (14.31)
This proves the proposition. ∎
The easy way to remember the above formula is to write it as follows.
        | i   j   k  |
a × b = | a1  a2  a3 |   (14.32)
        | b1  b2  b3 |
where you expand the determinant along the top row. This yields
(a2 b3 − a3 b2 ) i− (a1 b3 − a3 b1 ) j+ (a1 b2 − a2 b1 ) k (14.33)
which is the same as 14.31.
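The coordinate formula is easy to check by machine. A minimal Python sketch (the function names are ad hoc, not from the text) implements 14.30 and verifies the properties 14.26 and the box product identity a · b × c = a × b · c on sample vectors:

```python
def cross(a, b):
    """Coordinate formula 14.30, i.e. expanding 14.32 along the top row."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b, c = (1.0, 2.0, 3.0), (4.0, -1.0, 0.5), (-2.0, 0.0, 5.0)
# a x b = -(b x a), formula 14.26
print(cross(a, b), cross(b, a))
# the box product: a . (b x c) = (a x b) . c
print(dot(a, cross(b, c)), dot(cross(a, b), c))
```

The two box product values agree, illustrating that the × and the · may be switched.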
This motivates the following definition of what is meant by the integral over a paramet-
rically defined surface in R3 .
where the αj are parameterizations for the oriented bounded variation curves bounding
the region U oriented such that the conclusion of Green’s theorem holds. Let S denote
the surface,
S ≡ {R (u, v) : (u, v) ∈ U } ,
Proof: Formula 14.37 was established in Theorem 14.4.4. The unit normal of the
point R (u, v) of S is (Ru × Rv ) / |Ru × Rv | and from the definition of the integral over
the surface, Definition 14.5.7, Formula 14.38 follows. ∎
Functions such as conjugation, z = x + iy → z̄ ≡ x − iy, are continuous, as is any f (z) ≡ u (x, y) + iv (x, y) for which u and v are continuous; u and v are called the real and
imaginary parts of f . The only new thing is that writing an ordered pair (x, y) as x + iy
with the convention i2 = −1 makes C into a field. Now here is the definition of what it
means for a function to be analytic.
lim_{∆z→0} ( f (z + ∆z) − f (z) ) / ∆z ≡ f ′ (z)
exists and is a continuous function of z ∈ U . For a function having values in C denote
by u (x, y) the real part of f and v (x, y) the imaginary part. Both u and v have real
values and
f (x + iy) ≡ f (z) ≡ u (x, y) + iv (x, y)
14.6. INTRODUCTION TO COMPLEX ANALYSIS 395
and

ux = vy ,  uy = −vx ,

and all these partial derivatives, ux , uy , vx , vy are continuous on U . (The above equations are called the Cauchy Riemann equations.)
Proof: First suppose f is analytic. Let ∆z = ih and take the limit of the difference quotient as h → 0 in the definition. Thus from the definition,

f ′ (z) ≡ lim_{h→0} ( f (z + ih) − f (z) ) / (ih)

= lim_{h→0} ( u (x, y + h) + iv (x, y + h) − (u (x, y) + iv (x, y)) ) / (ih)

= (1/i) (uy (x, y) + ivy (x, y)) = −iuy (x, y) + vy (x, y)
Next let ∆z = h for h real. Then

f ′ (z) ≡ lim_{h→0} ( f (z + h) − f (z) ) / h

= lim_{h→0} ( u (x + h, y) + iv (x + h, y) − (u (x, y) + iv (x, y)) ) / h

= ux (x, y) + ivx (x, y) .
Equating real and imaginary parts of the two expressions for f ′ (z) ,

ux = vy , vx = −uy
and this yields the Cauchy Riemann equations. Since z → f 0 (z) is continuous, it follows
the real and imaginary parts of this function must also be continuous. Thus from the
above formulas for f 0 (z) , it follows from the continuity of z → f 0 (z) all the partial
derivatives of the real and imaginary parts are continuous.
Next suppose the Cauchy Riemann equations hold and these partial derivatives are all continuous. For ∆z = h + ik, the Cauchy Riemann equations and continuity of the partial derivatives imply

f (z + ∆z) − f (z) = (ux (x, y) + ivx (x, y)) ∆z + o (∆z)

Dividing by ∆z and taking a limit yields that f ′ (z) exists and equals ux (x, y) + ivx (x, y) , which is continuous by assumption. This proves the proposition. ∎
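The Cauchy Riemann equations can be observed numerically for a concrete analytic function. The sketch below (helper names are ad hoc) uses central differences on f (z) = z², whose real and imaginary parts are u = x² − y² and v = 2xy:

```python
def u(x, y): return x * x - y * y      # real part of z^2
def v(x, y): return 2 * x * y          # imaginary part of z^2

def partial(g, x, y, which, h=1e-6):
    """Central difference approximation of a partial derivative of g."""
    if which == 'x':
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 0.7, -1.3
ux, uy = partial(u, x, y, 'x'), partial(u, x, y, 'y')
vx, vy = partial(v, x, y, 'x'), partial(v, x, y, 'y')
print(ux - vy, vx + uy)      # both close to 0: the Cauchy Riemann equations hold
# and f'(z) = ux + i vx should equal 2z
print(complex(ux, vx), 2 * complex(x, y))
```

Both residuals vanish up to rounding, and ux + i vx recovers f ′ (z) = 2z, matching the proof.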
where τ j ∈ [tj−1 , tj ] . (Note this notation is a little sloppy because it does not identify the specific point, τ j used. It is understood that this point is arbitrary.) Define ∫_γ f (z) dz as the unique number which satisfies the following condition. For all ε > 0 there exists a δ > 0 such that if ||P|| ≤ δ, then

| ∫_γ f (z) dz − S (P) | < ε.
You note that this is essentially the same definition given earlier for the line integral
only this time the function has values in C rather than Rn and there is no dot product
involved. Instead, you multiply by the complex number γ (tj ) − γ (tj−1 ) in the Riemann
Stieltjes sum. To tie this in with the line integral even more, consider a typical term in
the sum for S (P). Let γ (t) = γ 1 (t) + iγ 2 (t) . Then letting u be the real part of f and
v the imaginary part, S (P) equals
Σ_{j=1}^n ( u (γ 1 (τ j ) , γ 2 (τ j )) + iv (γ 1 (τ j ) , γ 2 (τ j )) ) ( γ 1 (tj ) − γ 1 (tj−1 ) + i (γ 2 (tj ) − γ 2 (tj−1 )) )

= Σ_{j=1}^n u (γ 1 (τ j ) , γ 2 (τ j )) (γ 1 (tj ) − γ 1 (tj−1 ))

− Σ_{j=1}^n v (γ 1 (τ j ) , γ 2 (τ j )) (γ 2 (tj ) − γ 2 (tj−1 ))

+ i Σ_{j=1}^n v (γ 1 (τ j ) , γ 2 (τ j )) (γ 1 (tj ) − γ 1 (tj−1 ))

+ i Σ_{j=1}^n u (γ 1 (τ j ) , γ 2 (τ j )) (γ 2 (tj ) − γ 2 (tj−1 ))
Since the functions u and v are continuous, the limit as ||P|| → 0 of the above equals

∫_γ (u, −v) · dγ + i ∫_γ (v, u) · dγ
This proves most of the following lemma.
Proof: The existence of the two line integrals as limits of S (P) as ||P|| → 0
follows from continuity of u, v and Theorem 14.2.3 along with the above discussion
which decomposes the sum for the contour integral into the expression of 14.39 for
which the two sums converge to the line integrals in the above formula. This proves the
lemma. ∎
The lemma implies all the algebraic properties for line integrals hold in the same
way for contour integrals. In particular, if γ is C 1 , then
∫_γ f (z) dz = ∫_a^b f (γ (t)) γ ′ (t) dt.
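For a C¹ curve this formula gives a direct way to approximate contour integrals. A Python sketch (ad hoc names) over the positively oriented unit circle:

```python
import cmath

def contour_integral(f, gamma, dgamma, a, b, n=20000):
    """Approximate the contour integral of f along the C^1 curve gamma
    via the integral of f(gamma(t)) * gamma'(t) dt (midpoint rule)."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * h
        total += f(gamma(t)) * dgamma(t) * h
    return total

# gamma traces the positively oriented unit circle
gamma  = lambda t: cmath.exp(1j * t)
dgamma = lambda t: 1j * cmath.exp(1j * t)

I1 = contour_integral(lambda z: 1 / z, gamma, dgamma, 0.0, 2 * cmath.pi)
I2 = contour_integral(lambda z: z, gamma, dgamma, 0.0, 2 * cmath.pi)
print(I1)   # close to 2*pi*i
print(I2)   # close to 0, since z has the primitive z^2/2
```

The second value illustrates Proposition 14.6.5 below: a function with a primitive integrates to 0 over a closed curve.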
Proposition 14.6.5 Suppose F 0 (z) = f (z) for all z ∈ Ω, an open set containing
γ ∗ where γ : [a, b] → C is a continuous bounded variation curve. Then
∫_γ f (z) dz = F (γ (b)) − F (γ (a)) .
Proof: Letting u and v be real and imaginary parts of f, it follows from Lemma
14.6.4 that

∫_γ f (z) dz = ∫_γ (u, −v) · dγ + i ∫_γ (v, u) · dγ   (14.41)
Lemma 14.6.6 Let U be an open set in C and let Γ be a simple closed rectifiable
curve contained in U having parameterization γ. Also let f be analytic in U . Then
∫_γ f (z) dz = 0.
Proof: This follows right away from the Cauchy Riemann equations and the formula
14.40. Assume without loss of generality the orientation of Γ is the positive orientation.
If not, the argument is the same. Then from formula 14.40,
∫_γ f (z) dz = ∫_γ (u, −v) · dγ + i ∫_γ (v, u) · dγ
Proof: Let Bδ , Iδ be those regions of Lemma 14.3.6 where as earlier Iδ are those
which have empty intersection with Γ and Bδ are the border regions. Without loss of
generality, assume Γ is positively oriented. As in the proof of Green’s theorem you can
apply the same argument to the line integrals on the right of 14.40 to obtain, just as in
the proof of Green’s theorem
Σ_{R∈Iδ} ∫_{∂R} f (z) dz + Σ_{R∈Bδ} ∫_{∂R} f (z) dz = ∫_γ f (z) dz
In this case the first sum on the left in the above formula equals 0 from Lemma 14.6.6
for any δ > 0. Now just as in the proof of Green’s theorem, you can choose δ small
enough that
Σ_{R∈Bδ} | ∫_{∂R} f (z) dz | < ε.
Lemma 14.6.8 Let R be a rectangle such that ∂R is positively oriented. Recall this
means the direction of motion is counter clockwise.
Proof: This follows from a routine computation and is left to you. In the case
where z is on the outside of ∂R, the conclusion follows from the Cauchy integral formula
Theorem 14.6.7 as you can verify by noting that f (w) ≡ 1/ (w − z) is analytic on an
open set containing R and that in fact its derivative equals what you would think, −1/ (w − z)^2 .
Proof: In constructing the special regions in the proof of Green’s theorem, always
choose δ such that the point z is not on any of the lines mδ = y and x = kδ. This makes
it possible to avoid thinking about the case where z is not on the interior of any of the
rectangles of Iδ . Pick δ small enough that Iδ 6= ∅ and z is contained in some R0 ∈ Iδ .
From Lemma 14.6.8 it follows for each R ∈ Iδ

(1/2πi) ∫_{∂R} f (w) / (w − z) dw − f (z) = (1/2πi) ∫_{∂R} ( f (w) − f (z) ) / (w − z) dw
As in the proof of Green’s theorem, choosing δ small enough the second sum on the left
in the above satisfies
| Σ_{R∈Bδ} ∫_{∂R} 1/ (w − z) dw | ≤ Σ_{R∈Bδ} | ∫_{∂R} 1/ (w − z) dw | < ε.
where R0 is the rectangle for which z is on the inside of ∂R0 . Then by this lemma
again, this equals 2πi. Therefore for such small δ, 14.43 reduces to
| 2πi − ∫_γ 1/ (w − z) dw | < ε
Proof:

( g (z + h) − g (z) ) / h = (1/h) ∫_γ ( f (w) / (w − z − h) − f (w) / (w − z) ) dw

= (1/h) ∫_γ f (w) ( h / ( (w − z − h) (w − z) ) ) dw = ∫_γ f (w) ( 1 / ( (w − z − h) (w − z) ) ) dw
Consider only h ∈ C such that 2 |h| < dist (z, γ ∗ ) . The factor 1/ ( (w − z − h) (w − z) ) converges to 1/ (w − z)^2 . Then for these values of h,
| 1 / ( (w − z − h) (w − z) ) − 1/ (w − z)^2 | = | h / ( (w − z − h) (w − z)^2 ) |

≤ |h| / ( dist (z, γ ∗ )^3 / 2 ) = 2 |h| / dist (z, γ ∗ )^3
showing the convergence of the integrand is uniform for |h| < dist (z, γ ∗ ) /2 . Using Theorem 14.2.4, it follows
g ′ (z) = lim_{h→0} ( g (z + h) − g (z) ) / h

= lim_{h→0} (1/h) ∫_γ ( f (w) / (w − z − h) − f (w) / (w − z) ) dw

= ∫_γ f (w) / (w − z)^2 dw.
One can then differentiate the above expression using the same arguments. Contin-
uing this way results in the following formula.
g^{(n)} (z) = n! ∫_γ f (w) / (w − z)^{n+1} dw
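This formula can be tested numerically. The sketch below (ad hoc names) uses the standard normalization f^{(n)}(z0) = (n!/2πi) ∫ f (w) / (w − z0)^{n+1} dw, that is, the above g^{(n)} divided by 2πi, applied to f = exp, all of whose derivatives equal exp:

```python
import cmath, math

def cauchy_derivative(f, z0, n, r=1.0, m=4000):
    """n-th derivative of f at z0 via the Cauchy formula
    f^(n)(z0) = (n!/(2 pi i)) * integral of f(w)/(w - z0)^(n+1) dw
    over the circle |w - z0| = r (midpoint rule in the parameter)."""
    h = 2 * math.pi / m
    total = 0j
    for k in range(m):
        t = (k + 0.5) * h
        w = z0 + r * cmath.exp(1j * t)
        dw = 1j * r * cmath.exp(1j * t) * h
        total += f(w) / (w - z0) ** (n + 1) * dw
    return math.factorial(n) * total / (2j * math.pi)

z0 = 0.3 + 0.2j
for n in (1, 2, 3):
    print(cauchy_derivative(cmath.exp, z0, n), cmath.exp(z0))
```

Each computed derivative agrees with exp (z0) to high accuracy, since the trapezoid-type rule converges very rapidly for smooth periodic integrands.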
direction as shown.
(Figure: triangle T with vertices z1 , z2 , z3 .)

Denote by ∫_{∂T} f (z) dz the expression ∫_{γ(z1 ,z2 ,z3 ,z1 )} f (z) dz. Consider the following picture.
(Figure: T subdivided into four triangles T1^1 , T2^1 , T3^1 , T4^1 by joining the midpoints of its sides, with shared edges receiving opposite orientations.)
Thus

∫_{∂T} f (z) dz = Σ_{k=1}^4 ∫_{∂Tk^1} f (z) dz.   (14.44)
On the “inside lines” the integrals cancel because there are two integrals going in op-
posite directions for each of these inside lines.
Now let T1 play the same role as T . Subdivide as in the above picture, and obtain T2 such that

| ∫_{∂T2} f (w) dw | ≥ α/4^2

and, continuing this way, obtain Tk such that

| ∫_{∂Tk} f (w) dw | ≥ α/4^k .
Then let z ∈ ∩_{k=1}^∞ Tk and note that by assumption, f ′ (z) exists. Therefore, for all k large enough,

∫_{∂Tk} f (w) dw = ∫_{∂Tk} ( f (z) + f ′ (z) (w − z) + g (w) ) dw
where |g (w)| < ε |w − z| . Now observe that w → f (z) + f ′ (z) (w − z) has a primitive, namely,

F (w) = f (z) w + f ′ (z) (w − z)^2 /2.
Therefore, by Proposition 14.6.5,
∫_{∂Tk} f (w) dw = ∫_{∂Tk} g (w) dw.
From Theorem 14.2.4 applied to contour integrals or the definition of the contour inte-
gral,
α/4^k ≤ | ∫_{∂Tk} g (w) dw | ≤ ε diam (Tk ) (length of ∂Tk )

and so

α ≤ ε (length of ∂T ) diam (T ) .

Since ε is arbitrary, this shows α = 0, a contradiction. Thus ∫_{∂T} f (w) dw = 0 as claimed. ∎
This fundamental result yields the following important theorem.
Theorem 14.6.12 (Morera¹) Let Ω be an open set and let f ′ (z) exist for all z ∈ Ω. Let D ≡ B (z0 , r) ⊆ Ω. Then there exists ε > 0 such that f has a primitive on B (z0 , r + ε). (Recall this is a function F such that F ′ (z) = f (z) .)
Proof: Choose ε > 0 small enough that B (z0 , r + ε) ⊆ Ω. Then for w ∈ B (z0 , r + ε) ,
define

F (w) ≡ ∫_{γ(z0 ,w)} f (u) du.
which converges to f (w) due to the continuity of f at w. This proves the theorem. ∎
The following is a slight generalization of the above theorem which is also referred to
as Morera’s theorem. It contains the proof that the condition of continuity of z → f 0 (z)
is redundant.
1 Giacinto Morera 1856-1909. This theorem or one like it dates from around 1886
Proof: As in the proof of Morera’s theorem, let B (z0 , r) ⊆ Ω and use the given
condition to construct a primitive, F for f on B (z0 , r) . (As just shown in Theorem
14.6.12, the given condition is satisfied whenever f 0 (z) exists for all z ∈ Ω.) Then F is
analytic and so by the Cauchy integral formula, for z ∈ B (z0 , r)
F (z) = (1/2πi) ∫_{∂B(z0 ,r)} F (w) / (w − z) dw.
It follows from Theorem 14.6.10 that F and hence f have infinitely many derivatives,
implying that f is analytic on B (z0 , r) . Since z0 is arbitrary, this shows f is analytic on
Ω. In particular z → f 0 (z) is continuous because actually this function is differentiable.
This proves the corollary. ∎
This shows that an equivalent definition of what it means for a function to be analytic
is the following definition.
Definition 14.6.14 Let U be an open set in C and suppose f 0 (z) exists for all
z ∈ U. Then f is called analytic.
These theorems form the foundation for the study of functions of a complex variable.
Some important theorems will be discussed in the exercises.
14.7 Exercises
1. Suppose f : [a, b] → [c, d] is continuous and one to one on (a, b) . For s ∈ (c, d) ,
show
d (f, (a, b) , s) = ±1

and show it is 1 if f is increasing and −1 if f is decreasing. How can this be used to
relate the degree to orientation?
2. In defining a simple curve the assumption was made that γ (t) 6= γ (a) and γ (t) 6=
γ (b) if t ∈ (a, b) . Is this fussy condition really necessary? Which theorems and
lemmas hold with simply assuming γ is one to one on (a, b)? Does the fussy
condition follow from assuming γ is one to one on (a, b)?
3. Show that for many open sets U in R2 , Area of U = ∫_{∂U} x dy, and Area of U = ∫_{∂U} −y dx, and Area of U = (1/2) ∫_{∂U} −y dx + x dy. Hint: Use Green’s theorem.
an = f^{(n)} (z0 ) / n!.   (14.46)

Hint: You use the Cauchy integral formula. For z ∈ B (z0 , r) and C the positively oriented boundary,

f (z) = (1/2πi) ∫_C f (w) / (w − z) dw = (1/2πi) ∫_C ( f (w) / (w − z0 ) ) · 1 / ( 1 − (z − z0 ) / (w − z0 ) ) dw

= (1/2πi) ∫_C Σ_{n=0}^∞ ( f (w) / (w − z0 )^{n+1} ) (z − z0 )^n dw
Now explain why you can switch the sum and the integral. You will need to argue
the sum converges uniformly which is what will justify this manipulation. Next
use the result of Theorem 14.6.10.
14. Prove the following amazing result about the zeros of an analytic function. Let
Ω be a connected open set (region) and let f : Ω → X be analytic. Then the
following are equivalent.
Z ≡ {z ∈ Ω : f (z) = 0} .

f (z) = Σ_{n=m}^∞ ( f^{(n)} (z0 ) / n! ) (z − z0 )^n

f (z) / (z − z0 )^m = f^{(m)} (z0 ) / m! + Σ_{n=m+1}^∞ ( f^{(n)} (z0 ) / n! ) (z − z0 )^{n−m}
Now let zn → z0 , zn 6= z0 but f (zn ) = 0. What does this say about f (m) (z0 )?
Clearly the first two conditions are equivalent and they imply the third.
15. You want to define ez for z complex such that it is analytic on C. Using Problem
14 explain why there is at most one way to do it and still have it coincide with ex
when z = x + i0. Then show using the Cauchy Riemann equations that
d z
e = ez .
dz
Hint: For the first part, suppose two functions, f, g work. Then consider f − g. This is analytic and its zero set contains R.
14.7. EXERCISES 407
16. Do the same thing as Problem 15 for sin (z) , cos (z) . Also explain with a very short
argument why all identities for these functions continue to hold for the extended
functions. This argument shouldn’t require any computations at all. Why is
sin (z) no longer bounded if z is allowed to be complex? Hint: You might try
something involving the above formula for ez to get the definition.
17. Show that if f is analytic on C and f 0 (z) = 0 for all z, then f (z) ≡ c for some
constant c ∈ C. You might want to use Problem 14 to do this really quickly. Now
using Theorem 14.6.10 prove Liouville’s theorem which states that a function
which is analytic on all of C which is also bounded is constant. Hint: By that
theorem,

f ′ (z) = (1/2πi) ∫_{Cr} f (w) / (w − z)^2 dw
where Cr is the positively oriented circle of radius r which is centered at z. Now
consider what happens as r → ∞. You might use the corresponding version of
Theorem 14.2.4 applied to contour integrals and note the total length of Cr is 2πr.
18. Using Problem 17 prove the fundamental theorem of algebra which says every nonconstant polynomial having complex coefficients has at least one zero in C.
(This is the very best way to prove the fundamental theorem of algebra.) Hint:
If p (z) has no zeros, consider 1/p (z) and prove it must then be bounded and
analytic on all of C.
19. Let f be analytic on Ui , the inside of Γ, a rectifiable simple closed curve positively
oriented with parameterization γ. Suppose also there are no zeros of f on Γ.
Show then that the number of zeros, of f contained in Ui counted according to
multiplicity is given by the formula
(1/2πi) ∫_γ f ′ (z) / f (z) dz

Hint: You ought to first show f (z) = Π_{k=1}^m (z − zk ) g (z) where the zk are the
zeros of f in Ui and g (z) is an analytic function which never vanishes in Ui ∪ Γ. In
the above product there might be some repeats corresponding to repeated zeros.
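The counting formula of this problem can be checked numerically. The sketch below (ad hoc names) applies it to p (z) = z³ − z, whose zeros are 0, 1, −1:

```python
import cmath, math

def count_zeros(f, df, center=0j, r=2.0, m=20000):
    """(1/(2 pi i)) * integral of f'(z)/f(z) dz over |z - center| = r,
    which should be the number of zeros inside, counted with multiplicity."""
    h = 2 * math.pi / m
    total = 0j
    for k in range(m):
        t = (k + 0.5) * h
        z = center + r * cmath.exp(1j * t)
        dz = 1j * r * cmath.exp(1j * t) * h
        total += df(z) / f(z) * dz
    n = total / (2j * math.pi)
    return round(n.real)

p  = lambda z: z**3 - z          # zeros at 0, 1, -1
dp = lambda z: 3 * z**2 - 1
print(count_zeros(p, dp, r=2.0))   # 3: all zeros inside |z| = 2
print(count_zeros(p, dp, r=0.5))   # 1: only the zero at 0 inside |z| = 0.5
```

Shrinking the circle excludes zeros and the count drops accordingly, illustrating that the formula counts only zeros inside the curve.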
20. An open connected set U is said to be star shaped if there exists a point z0 ∈ U, called a star center, such that for all z ∈ U, γ (z0 , z)^∗ as described before the proof of the Cauchy Goursat theorem is contained in U . For example, pick
any complex number α and consider everything left after leaving out the ray
{tα : t ≥ 0} . Show this is star shaped with a star center tα for t < 0. Now for U
a star shaped open connected set, suppose g is analytic on U and g (z) 6= 0 for all
z ∈ U. Show there exists an analytic function h defined on U such that
eh(z) = g (z) .
This function h (z) is like log (g (z)). Hint: Use an argument like that used to
prove Morera’s theorem and the Cauchy Goursat theorem to obtain a primitive
for g ′ /g, called h1 . Next consider the function

g e^{−h1}

Using the chain rule and the product rule, show (d/dz) ( g e^{−h1} ) = 0. Using one of the
results of Problem 17 show
g = ceh1
for some constant c. Tell why c can be written as ea+ib . Then let h = h1 + a + ib.
21. One of the most amazing theorems is the open mapping theorem. Let U be an
open connected set in C and suppose f : U → C is analytic. Then f (U ) is either
a point or an open connected set. In the case where f (U ) is an open connected
set, it follows that for each z0 ∈ U, there exists an open set, V containing z0 and
m ∈ N such that for all z ∈ V,
f (z) = f (z0 ) + φ (z)^m   (14.47)
f (z) − f (z0 ) 6= 0.
Explain why there exists g (z) analytic and nonzero on B (z0 , r) such that for some
positive integer m,

f (z) − f (z0 ) = (z − z0 )^m g (z)
Next one tries to take the mth root of g (z) . Using Problem 20 there exists h analytic such that

g (z) = e^{h(z)} ,  g (z) = ( e^{h(z)/m} )^m

Now let φ (z) = (z − z0 ) e^{h(z)/m} . This yields the formula 14.47. Also φ ′ (z0 ) = e^{h(z0 )/m} ≠ 0. Now consider
but both are in B (0, δ). Hence there exists z2 6= z1 such that φ (z2 ) = ei2π/m φ (z1 )
(φ is one to one) but f (z2 ) = f (z1 ). If f is one to one, then the above shows
that f −1 is continuous and for each z, the m in the above is always 1 so f ′ (z) = e^{h(z)/1} ≠ 0. Hence

( f −1 )′ (f (z)) = lim_{f (z1 )→f (z)} ( f −1 (f (z1 )) − f −1 (f (z)) ) / ( f (z1 ) − f (z) ) = lim_{z1 →z} (z1 − z) / ( f (z1 ) − f (z) ) = 1 / f ′ (z)
22. Let U be what is left when you leave out the ray tα for t ≥ 0. This is a star
shaped open set and g (z) = z is nonzero on this set. Therefore, there exists h (z)
such that z = e^{h(z)} by Problem 20. Explain why h (z) is analytic on U. When α = −1 this is called the principal branch of the logarithm. In this case define Arg (z) ≡ θ ∈ (−π, π) such that the given z equals |z| e^{iθ} . Explain why this principal branch of the logarithm is

log (z) = ln |z| + iArg (z)

Note it follows from the open mapping theorem this is an analytic function on U .
You don’t have to fuss with any tedium in order to show this.
23. Suppose Γ is a simple closed curve and let Ui be the inside. Suppose f is analytic
on Ui and continuous on Ui ∪ Γ. Consider the function z → |f (z)|. This is a
continuous function. Show that if it achieves its maximum at any point of Ui then
f must be a constant. Hint: You might use the open mapping theorem.
24. Let f, g be analytic on Ui , the inside of Γ, a rectifiable simple closed curve positively
oriented with parameterization γ. Suppose either

|f (z) + g (z)| < |f (z)| + |g (z)| on Γ

or

|f (z) − g (z)| < |f (z)| on Γ
Let Zf denote the number of zeros in Ui and let Zg denote the number of zeros of
g in Ui . Then neither f , g, nor f /g can equal zero anywhere on Γ and Zf = Zg .
Hint: The first condition implies for all z ∈ Γ,

f (z) /g (z) ∈ C \ [0, ∞)
and argue

0 = (1/2πi) ∫_γ ( (f /g)′ / (f /g) ) dz = (1/2πi) ∫_γ ( f ′ /f ) dz − (1/2πi) ∫_γ ( g ′ /g ) dz = Zf − Zg .
You could consider F = L (f /g) where L is the analytic function defined on
C \ [0, ∞) with the property that
e^{L(z)} = z.

Thus

e^{L(z)} L ′ (z) = 1,  L ′ (z) = 1/z.
In the second case, show g/f ∉ (−∞, 0] and so a similar thing can be done. This problem is a case of Rouche’s theorem.
25. Use the result of Problem 24 to give another proof of the fundamental theorem of
algebra as follows. Let g (z) be a polynomial of degree n, an z n + an−1 z n−1 + · · · +
a1 z + a0 where an 6= 0. Now let f (z) = an z n . Let Γ be a big circle, large enough
that |f (z) − g (z)| < |f (z)| on this circle. Then tell why g and f have the same
number of zeros where they are counted according to multiplicity.
26. Let p (z) = z 7 + 11z 3 − 5z 2 + 5. Identify a ball B (0, r) which must contain all the
zeros of p (z) . Try to make r reasonably small. Use Problem 24.
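One way to attack this problem via Problem 24: take f (z) = z⁷ and g = p. On |z| = 2 the triangle inequality gives |p (z) − z⁷| ≤ 11·8 + 5·4 + 5 = 113 < 128 = |z⁷|, so all the zeros lie in B (0, 2). The following sketch (ad hoc names) confirms the inequality numerically on sample points of the circle:

```python
import cmath, math

# p(z) = z^7 + 11 z^3 - 5 z^2 + 5; take f(z) = z^7 as the dominant part,
# so that p(z) - f(z) is the lower order part below
def lower_order(z):
    return 11 * z**3 - 5 * z**2 + 5

r = 2.0
worst = max(abs(lower_order(r * cmath.exp(1j * 2 * math.pi * k / 3600)))
            for k in range(3600))
print(worst, r**7)   # worst is at most 113, which is less than 128
```

Since |p (z) − z⁷| < |z⁷| on |z| = 2, Rouche's theorem gives p the same number of zeros in B (0, 2) as z⁷, namely seven, counted with multiplicity.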
27. Here is another approach to the open mapping theorem which I think might be a
little easier and shorter which is based on Rouche’s theorem and makes no reference
to real variable techniques. Let f : U → C where U is an open connected set.
Then f (U ) is either an open connected set or a point. Hint: Suppose f (U ) is
not a point. Then explain why for any z0 ∈ U there exists r > 0 such that

f (z) − f (z0 ) = g (z) (z − z0 )^m
where g is analytic and nonzero on B (z0 , r). Now consider the function z →
f (z) − w. I would like to use Rouché's theorem to claim this function has the
same number of zeros, namely m, as the function z → f (z) − f (z0 ). Let w be close enough to f (z0 ) that
$$|(f(z)-w)-(f(z)-f(z_0))|=|w-f(z_0)|<|f(z)-f(z_0)|$$
for each z ∈ ∂B (z0 , r) and so you can apply Rouché's theorem. What does this
say about when f is one to one? Why is f (U ) open? Why is f (U ) connected?
28. Let γ : [a, b] → C be of bounded variation, γ (a) = γ (b) and suppose z ∉ γ∗.
Define
$$n(\gamma,z)\equiv\frac{1}{2\pi i}\int_{\gamma}\frac{dw}{w-z}.$$
This is called the winding number. When γ ∗ is positively oriented and a simple
closed curve, this number equals 1 by the Cauchy integral formula. However, it is
always an integer. Furthermore, z → n (γ, z) is continuous and so is constant on
every component of C \ γ ∗ . For z in the unbounded component, n (γ, z) = 0. Most
modern treatments of complex analysis feature the winding number extensively in
the statement of all the major theorems. This is because it makes possible the most
general form of the theorems. Prove the above properties of the winding number.
Hint: The continuity is easy. It follows right away from a simple estimate and
Theorem 14.2.4 applied to contour integrals. The tricky part is in showing it is an
integer. This is where it is convenient to use Theorem 14.2.6 applied to contour
integrals. There exists η : [a, b] → C which is C¹ on [a, b], satisfies η (a) = η (b) , and
approximates γ closely enough that the two contour integrals are as close as desired,
which shows
$$\frac{1}{2\pi i}\int_{\gamma}\frac{dw}{w-z}$$
is an integer as claimed. So how do you show the contour integral involving η
yields an integer? As mentioned above,
$$\frac{1}{2\pi i}\int_{\eta}\frac{dw}{w-z}=\frac{1}{2\pi i}\int_{a}^{b}\frac{\eta'(t)}{\eta(t)-z}\,dt$$
Let
$$g(t)\equiv\int_{a}^{t}\frac{\eta'(s)}{\eta(s)-z}\,ds$$
Formally this is a lot like some sort of log (η (s) − z) (recall beginning calculus) so
it is reasonable to consider
$$\left(\frac{e^{g(t)}}{\eta(t)-z}\right)'.$$
Show this equals 0. Explain why it follows that the function being differentiated
must be constant. Thus
$$\frac{e^{g(a)}}{\eta(a)-z}=\frac{e^{g(b)}}{\eta(b)-z}$$
Now η (a) = η (b) , g (a) = 0, and so eg(a) = 1 = eg(b) . Explain why this requires
g (b) = 2mπi for m an integer. Now this gives the desired result.
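The integer-valuedness of the winding number in Problem 28 is easy to observe numerically. The sketch below (function name illustrative) integrates dw/(w − z) around a circle traversed twice: the result is 2 for a point inside and 0 for a point in the unbounded component, as the problem asserts.

```python
import cmath

# Numeric illustration of Problem 28: n(γ, z) = (1/2πi) ∮ dw/(w - z) is an
# integer, constant on components of the complement of the curve.

def winding_number(z, radius=1.0, loops=2, steps=40000):
    total = 0.0j
    span = 2 * cmath.pi * loops
    for k in range(steps):
        t0 = span * k / steps
        t1 = span * (k + 1) / steps
        wm = radius * cmath.exp(1j * (t0 + t1) / 2)   # arc midpoint
        dw = radius * (cmath.exp(1j * t1) - cmath.exp(1j * t0))
        total += dw / (wm - z)
    return (total / (2j * cmath.pi)).real

assert abs(winding_number(0.2 + 0.1j) - 2) < 1e-6   # inside: winds twice
assert abs(winding_number(3.0 + 0.0j) - 0) < 1e-6   # unbounded component: 0
```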
29. Let
B′ (a, r) ≡ {z ∈ C : 0 < |z − a| < r} .
Thus this is the usual ball without the center. A function is said to have an
isolated singularity at the point a ∈ C if f is analytic on B′ (a, r) for some r > 0.
An isolated singularity of f is said to be removable if there exists a function, g,
analytic at a and near a such that f = g at all points near a. A major
theorem is the following: the isolated singularity at a is removable if and only if
$$\lim_{z\to a}f(z)(z-a)=0.$$
Prove this theorem. Hint: Let h (z) = f (z) (z − a)² . Then h (a) = 0 and h′ (a)
exists and equals 0. Show this. Also h is analytic near a. Therefore,
$$h(z)=\sum_{k=2}^{\infty}a_k(z-a)^{k}$$
Maybe consider g (z) = h (z) / (z − a)² . Argue g is analytic and equals f for z
near a.
30. Another really amazing theorem in complex analysis is the Casorati-Weierstrass
theorem. Suppose f has an isolated singularity at a which is not removable. Then
either f (B′ (a, r)) is dense in C for each small r > 0, or
$$f(z)=\frac{g(z)}{(z-a)^{M}}$$
where g (z) is analytic near a and g (a) ≠ 0. When the above formula holds, f is said to have a
pole of order M at a.
Prove this theorem. Hint: Suppose a is not removable and B (z0 , δ) has no points
of f (B′ (a, r)) . Such a ball must exist if f (B′ (a, r)) is not dense in the plane.
This means that for all 0 < |z − a| < r,
$$|f(z)-z_0|\ge\delta>0.$$
Hence
$$\lim_{z\to a}\,(z-a)\,\frac{1}{f(z)-z_0}=0$$
and so 1/ (f (z) − z0 ) has a removable singularity at a. See Problem 29. Let g (z)
be analytic at and near a and agree with this function. Thus
$$g(z)=\sum_{n=0}^{\infty}a_n(z-a)^{n}.$$
There are two cases, g (a) = 0 and g (a) ≠ 0. First suppose g (a) = 0. Then explain
why
$$g(z)=h(z)(z-a)^{m}$$
where h (z) is analytic and nonzero near a. Then
$$f(z)-z_0=\frac{1}{h(z)(z-a)^{m}}$$
Show this yields the desired conclusion. Next suppose g (a) ≠ 0. Then explain why
g (z) ≠ 0 near a and this would contradict the assertion that a is not removable.
31. One of the very important techniques in complex analysis is the method of residues.
When a is a pole, the residue of f at a, denoted by res (f, a) , is defined as b_{−1} in
14.48. Suppose a is a pole and Γ is a simple closed rectifiable curve containing a
on the inside with no other singular points on Γ or anywhere else inside Γ. Show
that under these conditions,
$$\int_{\Gamma}f(z)\,dz=2\pi i\,(\mathrm{res}(f,a))$$
Also describe a way to find res (f, a) by multiplying by (z − a)^m and differentiating.
Hint: You should show ∫_Γ (z − a)^{−m} dz = 0 whenever m > 1. This is because
the integrand then has a primitive, (z − a)^{1−m} / (1 − m) , on C \ {a}.
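The identity in Problem 31 can be verified numerically for a concrete pole. In the sketch below (all names are illustrative), f has a double pole at a with residue b by construction, and the contour integral over a small circle about a returns 2πi·b.

```python
import cmath

# Numeric sketch of Problem 31: ∮ f dz over a small circle about the pole a
# equals 2πi · res(f, a); the (z-a)^{-2} term and the analytic part contribute 0.

def contour_integral(f, a, r=0.5, steps=20000):
    total = 0.0j
    for k in range(steps):
        t0 = 2 * cmath.pi * k / steps
        t1 = 2 * cmath.pi * (k + 1) / steps
        zm = a + r * cmath.exp(1j * (t0 + t1) / 2)        # arc midpoint
        dz = r * (cmath.exp(1j * t1) - cmath.exp(1j * t0))
        total += f(zm) * dz
    return total

a, b = 1.0 + 0.0j, 3.0
f = lambda z: 1 / (z - a) ** 2 + b / (z - a) + cmath.exp(z)
residue_estimate = contour_integral(f, a) / (2j * cmath.pi)
assert abs(residue_estimate - b) < 1e-6   # recovers res(f, a) = b
```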
32. Using Problem 9 give a holy version of the Cauchy integral theorem. This is
it. Let Γ be a positively oriented rectifiable simple closed curve with inside Ui
and remove finitely many open discs B (zj , rj ) from Ui . Thus the result is a holy
region, one with finitely many holes.
This is the very important residue theorem for computing line integrals. Hint:
You should use Problem 32 and Problem 30, the Casorati Weierstrass theorem.
Hausdorff Measures And Area
Formula
Definition 15.1.1 For a set, E, denote by r (E) the number which is half the
diameter of E. Thus
$$r(E)\equiv\frac{1}{2}\sup\{|x-y|:x,y\in E\}\equiv\frac{1}{2}\,\mathrm{diam}(E)$$
Let E ⊆ Rn and define
$$H^s_\delta(E)\equiv\inf\Big\{\sum_{j=1}^{\infty}\beta(s)(r(C_j))^{s}:E\subseteq\bigcup_{j=1}^{\infty}C_j,\ r(C_j)\le\delta\Big\}$$
and then H^s (E) ≡ lim_{δ→0} H^s_δ (E) .
Proof: It is clear that Hs (∅) = 0 and if A ⊆ B, then Hs (A) ≤ Hs (B) with similar
assertions valid for H^s_δ . Suppose E = ∪_{i=1}^{∞} Ei and H^s_δ (Ei ) < ∞ for each i. Let {C^i_j}_{j=1}^{∞}
be a covering of Ei with
$$\sum_{j=1}^{\infty}\beta(s)\big(r(C^i_j)\big)^{s}-\varepsilon/2^{i}<H^s_\delta(E_i)$$
which shows Hδs is an outer measure. Now notice that Hδs (E) is increasing as δ → 0.
Picking a sequence δ k decreasing to 0, the monotone convergence theorem implies
$$H^s(E)\le\sum_{i=1}^{\infty}H^s(E_i).$$
Recall that a set E is H^s measurable when, for every S ⊆ Rn ,
$$H^s(S)=H^s(S\cap E)+H^s(S\setminus E).$$
Next I will show the σ algebra of Hs measurable sets includes the Borel sets. This
is done by the following very interesting condition known as Caratheodory’s criterion.
If µ (A ∪ B) = µ (A) + µ (B) whenever dist(A, B) > 0, then the σ algebra of measurable sets contains the Borel sets.
Proof: It suffices to show that closed sets are in F, the σ-algebra of measurable
sets, because then the open sets are also in F and consequently F contains the Borel
sets. Let K be closed and let S be a subset of Ω. Is µ(S) ≥ µ(S ∩ K) + µ(S \ K)? It
suffices to assume µ(S) < ∞. Let
$$K_n\equiv\Big\{x:\mathrm{dist}(x,K)\le\frac{1}{n}\Big\}$$
By Lemma 7.4.4 on Page 159, x → dist (x, K) is continuous and so Kn is closed. By
the assumption of the theorem, since dist (S ∩ K, S \ Kn ) > 0,
$$\mu(S)\ge\mu(S\cap K)+\mu(S\setminus K_n) \tag{15.1}$$
while, by subadditivity,
$$\mu(S\setminus K)\le\mu(S\setminus K_n)+\mu((K_n\setminus K)\cap S). \tag{15.2}$$
If limn→∞ µ((Kn \ K) ∩ S) = 0 then the theorem will be proved because this limit
along with 15.2 implies limn→∞ µ (S \ Kn ) = µ (S \ K) and then taking a limit in 15.1,
µ(S) ≥ µ(S ∩ K) + µ(S \ K) as desired. Therefore, it suffices to establish this limit.
Since K is closed, a point x ∉ K must be at a positive distance from K and so
$$K_n\setminus K=\bigcup_{k=n}^{\infty}K_k\setminus K_{k+1}.$$
Therefore
$$\mu(S\cap(K_n\setminus K))\le\sum_{k=n}^{\infty}\mu(S\cap(K_k\setminus K_{k+1})). \tag{15.3}$$
If
$$\sum_{k=1}^{\infty}\mu(S\cap(K_k\setminus K_{k+1}))<\infty, \tag{15.4}$$
then µ(S ∩ (Kn \ K)) → 0 because it is dominated by the tail of a convergent series so
it suffices to show 15.4.
$$\sum_{k=1}^{M}\mu(S\cap(K_k\setminus K_{k+1}))=\sum_{k\ \text{even},\,k\le M}\mu(S\cap(K_k\setminus K_{k+1}))+\sum_{k\ \text{odd},\,k\le M}\mu(S\cap(K_k\setminus K_{k+1})). \tag{15.5}$$
By the construction, the distance between any pair of sets, S ∩ (Kk \ Kk+1 ) for different
even values of k is positive and the distance between any pair of sets, S ∩ (Kk \ Kk+1 )
for different odd values of k is positive. Therefore,
$$\sum_{k\ \text{even},\,k\le M}\mu(S\cap(K_k\setminus K_{k+1}))+\sum_{k\ \text{odd},\,k\le M}\mu(S\cap(K_k\setminus K_{k+1}))\le\mu\Big(\bigcup_{k\ \text{even}}S\cap(K_k\setminus K_{k+1})\Big)+\mu\Big(\bigcup_{k\ \text{odd}}S\cap(K_k\setminus K_{k+1})\Big)\le 2\mu(S)<\infty$$
and so for all M, $\sum_{k=1}^{M}\mu(S\cap(K_k\setminus K_{k+1}))\le 2\mu(S)$, showing 15.4 and proving the
theorem. ¥
The next theorem applies the Caratheodory criterion above to Hs .
Theorem 15.1.5 The σ algebra of H^s measurable sets contains the Borel sets
and H^s has the property that for all E ⊆ Rn , there exists a Borel set F ⊇ E such that
H^s (F ) = H^s (E).
Proof: Suppose dist (A, B) = 2δ₀ > 0 and let {Cj } be a covering of A ∪ B with r (Cj ) ≤ δ < δ₀ for each j and
$$H^s_\delta(A\cup B)+\varepsilon>\sum_{j=1}^{\infty}\beta(s)(r(C_j))^{s}.$$
Thus
$$H^s_\delta(A\cup B)+\varepsilon>\sum_{j\in J_1}\beta(s)(r(C_j))^{s}+\sum_{j\in J_2}\beta(s)(r(C_j))^{s}$$
where
J1 = {j : Cj ∩ A ≠ ∅}, J2 = {j : Cj ∩ B ≠ ∅}.
Recall dist(A, B) = 2δ₀ and so J1 ∩ J2 = ∅. It follows, on letting ε → 0 and then δ → 0, that
$$H^s(A\cup B)\ge H^s(A)+H^s(B).$$
For the second assertion, let {Cj } be a covering of E by closed sets (taking closures does not change r (Cj )) with r (Cj ) ≤ δ and
$$H^s_\delta(E)+\delta>\sum_{j=1}^{\infty}\beta(s)(r(C_j))^{s}.$$
Let
$$F_\delta=\bigcup_{j=1}^{\infty}C_j.$$
Thus Fδ ⊇ E and
$$H^s_\delta(E)\le H^s_\delta(F_\delta)\le\sum_{j=1}^{\infty}\beta(s)(r(C_j))^{s}<\delta+H^s_\delta(E).$$
Now take a sequence δ k → 0 and let F = ∩k Fδk , a Borel set containing E. Letting k → ∞,
$$H^s(E)\le H^s(F)\le H^s(E)$$
This proves the theorem. ¥
A measure satisfying the first conclusion of Theorem 15.1.5 is sometimes called a
Borel regular measure.
15.1.2 Hn And mn
Next I will compare Hn and mn . To do this, recall the following covering theorem which
is a summary of Corollaries 9.7.5 and 9.7.4 found on Page 230.
In the next lemma, the balls are the usual balls taken with respect to the usual
distance in Rn .
Lemma 15.1.7 If mn (S) = 0 then Hn (S) = Hδn (S) = 0. Also, there exists a
constant, k such that Hn (E) ≤ kmn (E) for all E Borel. Also, if Q0 ≡ [0, 1)n , the unit
cube, then Hn ([0, 1)n ) > 0.
Now letting E be Borel, it follows from the outer regularity of mn that there exists a
decreasing sequence of open sets, {Vi }, containing E such that mn (Vi ) → mn (E) .
Then from the above,
$$H^n_\delta(E)\le\lim_{i\to\infty}H^n_\delta(V_i)\le\lim_{i\to\infty}k\,m_n(V_i)=k\,m_n(E).$$
Now let Bi be a ball having radius equal to diam (Ci ) = 2r (Ci ) which contains Ci . It
follows
$$m_n(B_i)=\alpha(n)2^{n}\,r(C_i)^{n}=\frac{\alpha(n)2^{n}}{\beta(n)}\,\beta(n)\,r(C_i)^{n}$$
which implies
$$1>\sum_{i=1}^{\infty}\beta(n)\,r(C_i)^{n}=\frac{\beta(n)}{\alpha(n)2^{n}}\sum_{i=1}^{\infty}m_n(B_i)=\infty,$$
a contradiction. This proves the lemma. ¥
Lemma 15.1.8 Every open set in Rn is the countable disjoint union of half open
boxes of the form
$$\prod_{i=1}^{n}(a_i,\,a_i+2^{-k}]$$
where ai = l2^{−k} for some integers, l, k. The sides of these boxes are of equal length.
One could also have half open boxes of the form
$$\prod_{i=1}^{n}[a_i,\,a_i+2^{-k})$$
Proof: Let
$$\mathcal{C}_k=\Big\{\text{all half open boxes }\prod_{i=1}^{n}(a_i,\,a_i+2^{-k}]\ \text{where each }a_i=l2^{-k}\ \text{for some integer }l\Big\}.$$
Let U be open and let B1 ≡ all sets of C1 which are contained in U . If B1 , · · ·, Bk have
been chosen, Bk+1 ≡ all sets of Ck+1 contained in
$$U\setminus\cup\big(\cup_{i=1}^{k}\mathcal{B}_i\big).$$
Let B∞ = ∪∞ i=1 Bi . In fact ∪B∞ = U . Clearly ∪B∞ ⊆ U because every box of every Bi
is contained in U . If p ∈ U , let k be the smallest integer such that p is contained in a
box from Ck which is also a subset of U . Thus
p ∈ ∪Bk ⊆ ∪B∞ .
Hence B∞ is the desired countable disjoint collection of half open boxes whose union
is U . The last assertion about the other type of half open rectangle is obvious. This
proves the lemma. ¥
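The greedy construction in the proof can be sketched in one dimension (all names below are illustrative): choose dyadic half open intervals (l·2^{−k}, (l+1)·2^{−k}] inside an open interval, coarsest level first, skipping anything already covered; by dyadic nesting, testing an interval's midpoint against the previously chosen intervals suffices.

```python
# One-dimensional sketch of Lemma 15.1.8: disjoint dyadic half open intervals
# chosen greedily inside U = (a, b); their total length approaches b - a.

def dyadic_length(a, b, max_k):
    chosen = []
    def covered(x):  # dyadic nesting: a midpoint test decides containment
        return any(c < x <= d for c, d in chosen)
    for k in range(1, max_k + 1):
        h = 2.0 ** (-k)
        l = int(a / h)
        while (l + 1) * h < b:
            c, d = l * h, (l + 1) * h
            if c >= a and not covered((c + d) / 2):
                chosen.append((c, d))
            l += 1
    # the chosen boxes are disjoint and contained in (a, b)
    return sum(d - c for c, d in chosen)

length = dyadic_length(0.1, 0.9, 14)
assert length <= 0.8                  # disjoint boxes inside (0.1, 0.9)
assert abs(length - 0.8) < 1e-3      # the union fills U up to O(2^-max_k)
```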
Proof: I will show Hn is a positive multiple of mn for any choice of β (n) . Define
$$k=\frac{m_n(Q_0)}{H^n(Q_0)}$$
where Q0 = [0, 1)n is the half open unit cube in Rn . I will show kHn (E) = mn (E) for
any Lebesgue measurable set. When this is done, it will follow that by adjusting β (n)
the multiple can be taken to be 1.
Let Q = ∏_{i=1}^{n} [ai , ai + 2^{−k}) be a half open box where ai = l2^{−k} . Thus Q0 is the
union of (2^k)^n of these identical half open boxes. By translation invariance of H^n and
mn ,
$$\big(2^{k}\big)^{n}H^n(Q)=H^n(Q_0)=\frac{1}{k}\,m_n(Q_0)=\frac{1}{k}\big(2^{k}\big)^{n}m_n(Q).$$
Therefore, kHn (Q) = mn (Q) for any such half open box and by translation invariance,
for the translation of any such half open box. It follows from Lemma 15.1.8 that
kHn (U ) = mn (U ) for all open sets. It follows immediately, since every compact set
is the countable intersection of open sets that kHn = mn on compact sets. Therefore,
they are also equal on all closed sets because every closed set is the countable union of
compact sets. Now let F be an arbitrary Lebesgue measurable set. I will show that F
is Hn measurable and that kHn (F ) = mn (F ). Let Fl = B (0, l) ∩ F. Then there exists
H a countable union of compact sets and G a countable intersection of open sets such
that
H ⊆ Fl ⊆ G (15.6)
and mn (G \ H) = 0 which implies by Lemma 15.1.7
mn (G \ H) = kHn (G \ H) = 0. (15.7)
To do this, let {Gi } be a decreasing sequence of bounded open sets containing Fl and
let {Hi } be an increasing sequence of compact sets contained in Fl such that
mn (Gi \ Hi ) < 2^{−i} . Then letting G = ∩i Gi and H = ∪i Hi this establishes 15.6 and 15.7. Then by completeness of H^n it follows Fl is H^n measurable and
Let
$$S_k=\bigcup_{m=1}^{N_k}E^{k}_m\times(-c^{k}_m,\,c^{k}_m).$$
Then (x, y) ∈ Sk if and only if f (x) > 0 and |y| < sk (x) ≤ f (x). It follows that
Sk ⊆ Sk+1 and
$$S=\bigcup_{k=1}^{\infty}S_k.$$
But each Sk is a Borel set and so S is also a Borel set. This proves the lemma. ¥
Let Pi be the projection onto span (e1 , · · ·, ei−1 , ei+1 , · · ·, en ) , let A_{Pi x} denote the slice
{xi ∈ R : Pi x + xi ei ∈ A} , and consider the functions x → m(A_{Pi x}) ;
these are Borel measurable functions of Pi x. Also, if {Ai } is a disjoint sequence of
sets in G then
$$m\big((\cup_i A_i\cap R_k)_{P_ix}\big)=\sum_i m\big((A_i\cap R_k)_{P_ix}\big)$$
Pi x → m(APi x )
Proof : The first assertion is obvious from the definition. The Borel measurability
of S(A, ei ) follows from the definition and Lemmas 15.2.2 and 15.2.1. To show Formula
15.8,
$$m_n(S(A,e_i))=\int_{P_i\mathbb{R}^n}\int_{-2^{-1}m(A_{P_ix})}^{2^{-1}m(A_{P_ix})}dx_i\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_n=\int_{P_i\mathbb{R}^n}m(A_{P_ix})\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_n=m(A).$$
Now suppose x1 , x2 ∈ S(A, ei ) and write x1 = Pi x1 + y1 ei , x2 = Pi x2 + y2 ei .
For x ∈ A define
l(x) = sup{y : Pi x+yei ∈ A}.
g(x) = inf{y : Pi x+yei ∈ A}.
Then it is clear that
$$l(x_1)-g(x_1)\ge m(A_{P_ix_1})\ge 2|y_1|, \tag{15.10}$$
$$l(x_2)-g(x_2)\ge m(A_{P_ix_2})\ge 2|y_2|. \tag{15.11}$$
Claim: |y1 − y2 | ≤ |l(x1 ) − g(x2 )| or |y1 − y2 | ≤ |l(x2 ) − g(x1 )|.
If |y1 − y2 | ≤ |l(x2 ) − g(x1 )|, then we use the same argument but let
Since x1 , x2 are arbitrary elements of S(A, ei ) and ε is arbitrary, this proves 15.9. ¥
The next lemma says that if A is already symmetric with respect to the j th direction,
then this symmetry is not destroyed by taking S (A, ei ).
Proof : By definition,
Pj x + ej xj ∈ S(A, ei )
if and only if
|xi | < 2−1 m(APi (Pj x+ej xj ) ).
Now
xi ∈ APi (Pj x+ej xj )
if and only if
xi ∈ APi (Pj x+(−xj )ej )
by the assumption on A which says that A is symmetric in the ej direction. Hence
Pj x + ej xj ∈ S(A, ei )
if and only if
|xi | < 2−1 m(APi (Pj x+(−xj )ej ) )
if and only if
Pj x+(−xj )ej ∈ S(A, ei ).
This proves the lemma. ¥
Proof: Suppose first that A is Borel. Let A1 = S(A, e1 ) and let Ak = S(Ak−1 , ek ).
Then by the preceding lemmas, An is a Borel set, diam(An ) ≤ diam(A), mn (An ) =
mn (A), and An is symmetric. Thus x ∈ An if and only if −x ∈ An . It follows that
An ⊆ B (0, r (An )) . (If x ∈ An \ B(0, r (An )), then −x ∈ An \ B(0, r (An )) and so diam (An ) ≥ 2|x| > diam(An ), a contradiction.)
Therefore,
mn (An ) ≤ α(n)(r (An ))n ≤ α(n)(r (A))n .
It remains to establish this inequality for arbitrary measurable sets. Letting A be such
a set, let {Kk } be an increasing sequence of compact subsets of A such that
mn (A) = limk→∞ mn (Kk ) . Then
$$H^n(B(0,1))+\varepsilon\ge\frac{\beta(n)}{\alpha(n)}\,H^n(B(0,1))$$
By the Vitali covering theorem, there exists a sequence of disjoint balls, {Bi } , such
that B (0, 1) = (∪_{i=1}^{∞} Bi ) ∪ N where mn (N ) = 0. Then H^n_δ (N ) = 0 can be concluded
because H^n_δ ≤ H^n and Lemma 15.1.7. Using mn (B (0, 1)) = H^n (B (0, 1)) again,
$$\begin{aligned}
H^n_\delta(B(0,1))=H^n_\delta(\cup_i B_i)&\le\sum_{i=1}^{\infty}\beta(n)\,r(B_i)^{n}=\frac{\beta(n)}{\alpha(n)}\sum_{i=1}^{\infty}\alpha(n)\,r(B_i)^{n}=\frac{\beta(n)}{\alpha(n)}\sum_{i=1}^{\infty}m_n(B_i)\\
&=\frac{\beta(n)}{\alpha(n)}\,m_n(\cup_i B_i)=\frac{\beta(n)}{\alpha(n)}\,m_n(B(0,1))=\frac{\beta(n)}{\alpha(n)}\,H^n(B(0,1))
\end{aligned}$$
which implies α (n) ≤ β (n) and so the two are equal. This proves that if α (n) = β (n) ,
then H^n = mn on the measurable sets of Rn .
This gives another way to think of Lebesgue measure which is a particularly nice
way because it is coordinate free, depending only on the notion of distance.
For s < n, note that Hs is not a Radon measure because it will not generally
be finite on compact sets. For example, let n = 2 and consider H1 (L) where L is a
line segment joining (0, 0) to (1, 0). Then H1 (L) is no smaller than H1 (L) when L is
considered a subset of R1 , n = 1. Thus by what was just shown, H1 (L) ≥ 1. Hence
H1 ([0, 1] × [0, 1]) = ∞. The situation is this: L is a one-dimensional object inside R2
and H1 is giving a one-dimensional measure of this object. In fact, Hausdorff measures
can make such heuristic remarks as these precise. Define the Hausdorff dimension of a
set, A, as
dim(A) = inf{s : Hs (A) = 0}
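The definition of Hausdorff dimension can be illustrated heuristically with the middle-thirds Cantor set (this example and the names below are not from the text): the natural stage-k covering uses 2^k intervals of length 3^{−k}, and with β(s) set to 1 for comparison purposes, the covering sums stay stable exactly at s = log 2 / log 3, the Hausdorff dimension of the Cantor set.

```python
import math

# Heuristic sketch of Hausdorff dimension for the middle-thirds Cantor set:
# 2^k covering sets of radius (3^-k)/2 each contribute r^s to the sum.

def covering_sum(s, k):
    return (2 ** k) * (3.0 ** (-k) / 2) ** s

s_star = math.log(2) / math.log(3)          # ≈ 0.6309
ratios = [covering_sum(s_star, k) / covering_sum(s_star, k + 1) for k in range(5)]
assert all(abs(r - 1) < 1e-9 for r in ratios)   # stable at the dimension
assert covering_sum(s_star + 0.1, 100) < 1e-3   # sums → 0 above the dimension
assert covering_sum(s_star - 0.1, 100) > 1e3    # sums → ∞ below the dimension
```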
The following identities hold:
$$\Gamma(p)\Gamma(q)=\Big(\int_0^1 x^{p-1}(1-x)^{q-1}\,dx\Big)\Gamma(p+q),\qquad\Gamma\Big(\frac{1}{2}\Big)=\sqrt{\pi}.$$
Next,
$$\Gamma(p)\Gamma(q)=\int_0^{\infty}e^{-t}t^{p-1}\,dt\int_0^{\infty}e^{-s}s^{q-1}\,ds.$$
Now
$$\Big(\int_0^{\infty}e^{-x^2}dx\Big)^{2}=\int_0^{\infty}e^{-x^2}dx\int_0^{\infty}e^{-y^2}dy=\int_0^{\infty}\!\!\int_0^{\infty}e^{-(x^2+y^2)}\,dx\,dy=\int_0^{\infty}\!\!\int_0^{\pi/2}e^{-r^2}r\,d\theta\,dr=\frac{1}{4}\pi$$
and so
$$\Gamma\Big(\frac{1}{2}\Big)=2\int_0^{\infty}e^{-u^2}\,du=\sqrt{\pi}.$$
Theorem 15.2.7 α(n) = π n/2 (Γ(n/2+1))−1 where Γ(s) is the gamma function
$$\Gamma(s)=\int_0^{\infty}e^{-t}t^{s-1}\,dt.$$
Thus
$$\pi^{1/2}\big(\Gamma(1/2+1)\big)^{-1}=\frac{2}{\sqrt{\pi}}\,\sqrt{\pi}=2=\alpha(1),$$
and this shows the theorem is true if n = 1.
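The gamma-function facts used above are easy to spot-check numerically. The midpoint-rule helper below is an illustrative sketch, adequate for p, q ≥ 1 where the integrand is bounded.

```python
import math

# Numeric check of Γ(p)Γ(q) = (∫₀¹ x^{p-1}(1-x)^{q-1} dx) Γ(p+q)
# and of Γ(1/2) = √π, using the standard library gamma function.

def beta_numeric(p, q, steps=200000):
    h = 1.0 / steps   # midpoint rule on [0, 1]
    return h * sum(((k + 0.5) * h) ** (p - 1) * (1 - (k + 0.5) * h) ** (q - 1)
                   for k in range(steps))

p, q = 2.5, 3.0
lhs = math.gamma(p) * math.gamma(q)
rhs = beta_numeric(p, q) * math.gamma(p + q)
assert abs(lhs - rhs) < 1e-6 * lhs
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
```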
Assume the theorem is true for n and let Bn+1 be the unit ball in Rn+1 . Then by
the result in Rn ,
$$m_{n+1}(B_{n+1})=\int_{-1}^{1}\alpha(n)\big(1-x_{n+1}^{2}\big)^{n/2}\,dx_{n+1}=2\alpha(n)\int_0^{1}(1-t^{2})^{n/2}\,dt.$$
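Assuming the closed form of Theorem 15.2.7, the slicing recursion just written can be verified numerically for several n (the helper names below are illustrative):

```python
import math

# Check of the induction step: with α(n) = π^{n/2}/Γ(n/2 + 1), the slicing
# formula α(n+1) = 2 α(n) ∫₀¹ (1 - t²)^{n/2} dt should hold for each n.

def alpha(n):
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def slice_integral(n, steps=200000):
    h = 1.0 / steps   # midpoint rule for ∫₀¹ (1 - t²)^{n/2} dt
    return h * sum((1 - ((k + 0.5) * h) ** 2) ** (n / 2) for k in range(steps))

for n in range(1, 6):
    assert abs(alpha(n + 1) - 2 * alpha(n) * slice_integral(n)) < 1e-6
```

For instance α(2) = π and α(3) = 4π/3, the familiar area and volume of the unit disk and ball.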
Also recall the right polar decomposition, Theorem 3.9.3 on Page 66. This theorem says
you can write a linear transformation as the composition of two linear transformations,
one which preserves length and one which distorts.
The one which distorts is the one which will have a nontrivial interaction with Hausdorff
measure while the one which preserves lengths does not change Hausdorff measure.
These ideas are behind the following theorems and lemmas.
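The decomposition itself is easy to compute and inspect numerically. The sketch below builds Dh(x) = RU from the SVD (the matrix A is an arbitrary stand-in for a derivative matrix, not an example from the text): U = (AᵀA)^{1/2} is the symmetric distorting factor and R has orthonormal columns, so it preserves lengths.

```python
import numpy as np

# Right polar decomposition A = RU via the SVD A = W diag(s) Vᵀ:
# U = Vᵀᵀ diag(s) Vᵀ = (AᵀA)^{1/2}, R = W Vᵀ with RᵀR = I.

A = np.array([[2.0, 1.0], [0.0, 1.5], [1.0, -1.0]])   # a 3x2 "derivative"
W, s, Vt = np.linalg.svd(A, full_matrices=False)
U = Vt.T @ np.diag(s) @ Vt    # the distorting factor
R = W @ Vt                    # the length-preserving factor

assert np.allclose(R @ U, A)                   # A = RU
assert np.allclose(R.T @ R, np.eye(2))         # R preserves lengths
assert np.allclose(U, U.T)                     # U is self adjoint
assert np.all(np.linalg.eigvalsh(U) >= -1e-12) # U is nonnegative
```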
The first lemma says that if P preserves lengths, then
$$H^n(PA)=H^n(A).$$
Proof: Let {Cj } be a covering of P A with r (Cj ) ≤ δ for each j and
$$H^n_\delta(PA)+\varepsilon>\sum_{j=1}^{\infty}\alpha(n)(r(C_j))^{n}.$$
Since P preserves lengths, it follows P is one to one on P (Rn ) and P −1 also preserves
lengths on P (Rn ) . Replacing each Cj with Cj ∩ (P A),
$$H^n_\delta(PA)+\varepsilon>\sum_{j=1}^{\infty}\alpha(n)\,r(C_j\cap(PA))^{n}=\sum_{j=1}^{\infty}\alpha(n)\,r\big(P^{-1}(C_j\cap(PA))\big)^{n}\ge H^n_\delta(A).$$
Now let {Cj } be a covering of A with r (Cj ) ≤ δ and
$$H^n_\delta(A)+\varepsilon\ge\sum_{j=1}^{\infty}\alpha(n)(r(C_j))^{n}.$$
Then
$$H^n_\delta(A)+\varepsilon\ge\sum_{j=1}^{\infty}\alpha(n)(r(C_j))^{n}=\sum_{j=1}^{\infty}\alpha(n)(r(PC_j))^{n}\ge H^n_\delta(PA).$$
Hence Hδn (P A) = Hδn (A). Letting δ → 0 yields the desired conclusion in the case where
A is bounded. For the general case, let Ar = A ∩ B (0, r). Then Hn (P Ar ) = Hn (Ar ).
Now let r → ∞. This proves the lemma. ¥
$$H^n(FA)=H^n(RUA)$$
Proof: Let
Tk ≡ {x ∈ T : ||Dh (x)|| < k} .
Thus T = ∪k Tk . I will show h (Tk ) has H^n measure zero and then it will follow that
$$h(T)=\bigcup_{k=1}^{\infty}h(T_k)$$
has H^n measure zero as well.
and so
h (B (x, 5rx )) ⊆ B (h (x) , 6krx ).
Letting δ > 0 be given, the Vitali covering theorem implies there exists a sequence of
disjoint balls {Bi } , Bi = B (xi , rxi ), which are contained in W such that the sequence
of enlarged balls, {B̂i }, having the same center but 5 times the radius, covers Tk and
6krxi < δ. Then
$$H^n_\delta(h(T_k))\le H^n_\delta\Big(h\Big(\bigcup_{i=1}^{\infty}\hat B_i\Big)\Big)\le\sum_{i=1}^{\infty}H^n_\delta\big(h(\hat B_i)\big)\le\sum_{i=1}^{\infty}H^n_\delta\big(B(h(x_i),6kr_{x_i})\big)$$
$$\begin{aligned}
&\le\sum_{i=1}^{\infty}\alpha(n)(6kr_{x_i})^{n}=(6k)^{n}\,\alpha(n)\sum_{i=1}^{\infty}r_{x_i}^{n}=(6k)^{n}\sum_{i=1}^{\infty}m_n(B(x_i,r_{x_i}))\\
&\le(6k)^{n}\,m_n(W)\le(6k)^{n}\,\frac{\varepsilon}{k^{n}6^{n}}=\varepsilon.
\end{aligned}$$
Since ε > 0 is arbitrary, this shows Hδn (h (Tk )) = 0. Since δ is arbitrary, this implies
Hn (h (Tk )) = 0. Now
$$H^n(h(T))=\lim_{k\to\infty}H^n(h(T_k))=0.$$
where {vk } is an orthonormal basis and, since U (x)^{−1} is given to exist, each ak > 0.
Therefore, (U (x) w, w) ≥ δ (x) |w|² where δ (x) > 0.
−1
Next I claim h is continuous on h (V ) . Suppose then that h (xk ) → h (x) . Then
let the sequentially compact set B (x, r) ⊆ V. Without loss of generality all xk may be
assumed to lie in B (x, r). If {xk } fails to converge to x, then since B (x, r) is sequentially
compact, there exists a subsequence {xkl } converging to z ≠ x. But then
$$=R\big(h^{-1}(y)\big)U\big(h^{-1}(y)\big)\big(h^{-1}(y+v)-h^{-1}(y)\big)+o\big(h^{-1}(y+v)-h^{-1}(y)\big)$$
therefore
$$R\big(h^{-1}(y)\big)^{*}v=U\big(h^{-1}(y)\big)\big(h^{-1}(y+v)-h^{-1}(y)\big)+o\big(h^{-1}(y+v)-h^{-1}(y)\big) \tag{15.13}$$
Using continuity of h^{−1} , it follows that if v is small enough,
$$\big|o\big(h^{-1}(y+v)-h^{-1}(y)\big)\big|\le\frac{\eta}{2}\,\big|h^{-1}(y+v)-h^{-1}(y)\big|$$
Taking the inner product of both sides of 15.13 with h−1 (y + v) − h−1 (y) yields
$$\big|R\big(h^{-1}(y)\big)^{*}v\big|\,\big|h^{-1}(y+v)-h^{-1}(y)\big|\ge\eta\,\big|h^{-1}(y+v)-h^{-1}(y)\big|^{2}-\frac{\eta}{2}\,\big|h^{-1}(y+v)-h^{-1}(y)\big|^{2}.$$
Now since R preserves distances and R∗ R = I
and so
|R∗ v| ≤ |v| . (15.14)
Thus the above formula implies
$$|v|\ge\frac{\eta}{2}\,\big|h^{-1}(y+v)-h^{-1}(y)\big|. \tag{15.15}$$
Since Nη has H^n measure zero, there exist {Ck } covering Nη such that r (Ck ) < δ and
$$\frac{\varepsilon\eta^{n}}{4^{n}}>\sum_{k=1}^{\infty}\alpha(n)\,r(C_k)^{n}.$$
Without loss of generality each Ck has nonempty intersection with Nη , containing yk .
Now {h^{−1} (Ck )} covers h^{−1} (Nη ) and from 15.15
$$\mathrm{diam}\big(h^{-1}(C_k)\big)\le\frac{2}{\eta}\times 2\times\mathrm{diam}(C_k)$$
and so
$$H^n_{4\delta/\eta}\big(h^{-1}(N_\eta)\big)\le\sum_{k}\alpha(n)\Big(\frac{2}{\eta}\,\mathrm{diam}(C_k)\Big)^{n}\le\frac{4^{n}}{\eta^{n}}\sum_{k}\alpha(n)\,r(C_k)^{n}<\frac{\varepsilon\eta^{n}}{4^{n}}\,\frac{4^{n}}{\eta^{n}}=\varepsilon$$
Since ε is arbitrary, H^n_{4δ/η} (h^{−1} (Nη )) = 0 and letting δ → 0 yields
$$H^n\big(h^{-1}(N_\eta)\big)=m_n\big(h^{-1}(N_\eta)\big)=0$$
because by Theorem 15.1.9 Lebesgue and Hausdorff measure are the same on the
Lebesgue measurable sets of Rn . Now take η k → 0:
$$m_n\big(h^{-1}(N)\big)=\lim_{k\to\infty}m_n\big(h^{-1}(N_{\eta_k})\big)=0.$$
Letting δ → 0,
$$H^n(PE)\le L^{n}H^n(E)+L^{n}\varepsilon.$$
Also,
$$H^n(T)\ge H^n(RR^{*}T)=H^n(R^{*}T),$$
which implies
$$h(x+B(0,r_x))\supseteq RU\,B(0,r_x(1-\varepsilon))+h(x).$$
$$|\det U(x)|\le M,\quad x\in V$$
and so
$$H^n\big(h\big(\cap_m O_m\setminus\big(h^{-1}(W)\cap A\big)\big)\big)=0$$
by Lemma 15.4.1. Therefore, if m is large enough, letting O = Om gives the desired
open set.
Let x ∈ h−1 (W ) ∩ A. First note h−1 (W ) is a Borel set because
$$\mathcal{S}\equiv\big\{E\in\mathcal{B}(\mathbb{R}^m):h^{-1}(E)\in\mathcal{B}(\mathbb{R}^n)\big\}$$
is a σ algebra which contains the open sets due to the fact h is continuous. Therefore,
S = B (Rm ). Thus h−1 (W ) ∩ A is measurable.
There exists 1 > rx > 0 small enough that 15.19 and 15.18 both hold. There exists
a possibly smaller rx such that
B (x, rx ) ⊆ O (15.20)
and
||det (U (x1 ))| − |det (U (x))|| < ε (15.21)
whenever x1 ∈ B (x, rx ) .
The collection of such balls is a Vitali cover of h−1 (W ) ∩ A. By Corollary 9.7.5 there
is a sequence of disjoint balls {Bi } such that for
$$N\equiv\big(h^{-1}(W)\cap A\big)\setminus\bigcup_{i=1}^{\infty}B_i,$$
$$H^n(h(N))=0.$$
$$h^{-1}(W)\cap A=\Big(\bigcup_{i=1}^{\infty}B_i\cap A\cap h^{-1}(W)\Big)\cup N$$
$$W\cap h(A)=\Big(\bigcup_{i=1}^{\infty}h(B_i\cap A)\cap W\Big)\cup h(N) \tag{15.22}$$
where mn (N ) = Hn (h (N )) = 0.
Denote by xi the center of Bi and ri the radius. Using 15.18, Lemma 15.4.1 which
says h takes sets of Lebesgue measure zero to sets of Hn measure zero, the translation
invariance of Hn , Lemma 15.3.2 which gives the rule for taking a linear transformation
outside the Hausdorff measure of something, 15.21, and the assumption that h is one
to one,
$$\begin{aligned}
\int_{h(A)}X_W(y)\,dH^n&=\int_{h(A)\cap W}dH^n=\int_{\cup_{i=1}^{\infty}h(B_i\cap A)\cap W}dH^n\\
&=\int_{h(\cup_{i=1}^{\infty}B_i\cap A\cap h^{-1}(W))}dH^n\ge\int_{h(\cup_{i=1}^{\infty}B_i)}dH^n-\varepsilon\\
&\ge\sum_{i=1}^{\infty}H^n(h(B_i))-\varepsilon\ge\sum_{i=1}^{\infty}H^n\big(Dh(x_i)\big(B(0,(1-\varepsilon)r_i)\big)\big)-\varepsilon\\
&=\sum_{i=1}^{\infty}|\det(U(x_i))|\,m_n\big(B(0,(1-\varepsilon)r_i)\big)-\varepsilon\\
&=(1-\varepsilon)^{n}\sum_{i=1}^{\infty}|\det(U(x_i))|\,m_n(B(x_i,r_i))-\varepsilon\\
&\ge(1-\varepsilon)^{n}\sum_{i=1}^{\infty}\Big(\int_{B_i}|\det(U(x))|\,dm_n-\varepsilon m_n(B_i)\Big)-\varepsilon\\
&\ge(1-\varepsilon)^{n}\sum_{i=1}^{\infty}\int_{B_i\cap A\cap h^{-1}(W)}|\det(U(x))|\,dm_n-(1-\varepsilon)^{n}\varepsilon m_n(V)-\varepsilon\\
&=(1-\varepsilon)^{n}\int_V X_{\cup_{i=1}^{\infty}h(B_i\cap A\cap h^{-1}(W))}(h(x))\,|\det(U(x))|\,dm_n-(1-\varepsilon)^{n}\varepsilon m_n(V)-\varepsilon\\
&=(1-\varepsilon)^{n}\int_V X_{W\cap h(A)}(h(x))\,|\det(U(x))|\,dm_n-(1-\varepsilon)^{n}\varepsilon m_n(V)\\
&\qquad-(1-\varepsilon)^{n}\int_V X_N(x)\,|\det(U(x))|\,dm_n-\varepsilon\\
&=(1-\varepsilon)^{n}\int_A X_W(h(x))\,|\det(U(x))|\,dm_n-(1-\varepsilon)^{n}\varepsilon m_n(V)-\varepsilon.
\end{aligned}$$
The last three lines follow from 15.22. Recall mn (N ) = 0. Since ε > 0 is arbitrary,
this shows
$$\int_{h(A)}X_W(y)\,dH^n\ge\int_A X_W(h(x))\,|\det(Dh(x))|\,dm_n$$
The opposite inequality can be established in exactly the same way using 15.19
instead of 15.18 and turning all the inequalities around featuring (1 + ε) instead of
(1 − ε) , much as was done in the proof of Lemma 9.9.1. Thus
$$\begin{aligned}
\int_{h(A)}X_W(y)\,dH^n&=\int_{h(A)\cap W}dH^n=\sum_{i=1}^{\infty}\int_{h(B_i\cap A)\cap W}dH^n\\
&=\sum_{i=1}^{\infty}H^n(h(B_i\cap A)\cap W)\le\sum_{i=1}^{\infty}H^n(h(B_i))\\
&\le\sum_{i=1}^{\infty}H^n\big(Dh(x_i)\big(B(0,(1+\varepsilon)r_i)\big)\big)\\
&=\sum_{i=1}^{\infty}|\det(U(x_i))|\,m_n\big(B(0,(1+\varepsilon)r_i)\big)\\
&=(1+\varepsilon)^{n}\sum_{i=1}^{\infty}|\det(U(x_i))|\,m_n(B(x_i,r_i))\\
&\le(1+\varepsilon)^{n}\sum_{i=1}^{\infty}\Big(\int_{B_i}|\det(U(x))|\,dm_n+\varepsilon m_n(B_i)\Big)\\
&\le(1+\varepsilon)^{n}\sum_{i=1}^{\infty}\int_{B_i\cap A\cap h^{-1}(W)}|\det(U(x))|\,dm_n+(1+\varepsilon)^{n}\varepsilon m_n(V)\\
&\qquad+(1+\varepsilon)^{n}\int_{O\setminus(A\cap h^{-1}(W))}|\det(U(x))|\,dm_n\\
&\le(1+\varepsilon)^{n}\int_V X_{\cup_{i=1}^{\infty}h(B_i\cap A)\cap W}(h(x))\,|\det(U(x))|\,dm_n+(1+\varepsilon)^{n}\varepsilon m_n(V)+\varepsilon(1+\varepsilon)^{n}M\\
&=(1+\varepsilon)^{n}\int_V X_{W\cap h(A)}(h(x))\,|\det(U(x))|\,dm_n+(1+\varepsilon)^{n}\varepsilon m_n(V)\\
&\qquad+\varepsilon(1+\varepsilon)^{n}M-(1+\varepsilon)^{n}\int_V X_N(x)\,|\det(U(x))|\,dm_n\\
&=(1+\varepsilon)^{n}\int_A X_W(h(x))\,|\det(U(x))|\,dm_n+(1+\varepsilon)^{n}\varepsilon m_n(V)+\varepsilon(1+\varepsilon)^{n}M.
\end{aligned}$$
Since ε is arbitrary, this proves the lemma. ¥
Next the Borel sets will be enlarged to Hn measurable sets.
|det U (x)| ≤ M, x ∈ V
Note that
and the second term on the left equals Xh−1 (F \W ) (x) and is measurable because h−1 (F \ W )
has Lebesgue measure zero by Lemma 15.4.1 while the term on the right is Lebesgue
measurable because F is Borel so XF ◦ h is measurable because it is a Borel measurable
function composed with a continuous function (why?). Therefore, x →XW (h (x)) |det (U (x))|
is also measurable. This proves the lemma. ¥
You don’t need to assume the open sets are bounded and you don’t need to assume
a bound on |det U (x)|.
Corollary 15.5.3 Let V be an open set in Rn and let h ∈ C 1 (V ) be one to one and
also for Dh (x) = R (x) U (x) the right polar decomposition, U (x) is one to one. Let E
be Hn measurable and let A ⊆ V be Lebesgue measurable. Then
$$\int_{h(A)}X_E(y)\,dH^n=\int_A X_E(h(x))\,|\det(U(x))|\,dm_n.$$
Proof: For each x ∈ A, there exists rx such that B (x, rx ) ⊆ V and rx < 1. Then by
the mean value inequality Theorem 6.4.2 and the observation that ||Dh (x)|| is bounded
on the compact set B (x, rx ), it follows h (B (x, rx )) is also bounded. Also |det U (x)| is
bounded on the compact set B (x, rx ). These balls are a Vitali cover of A. By Corollary
9.7.5 there is a sequence of these disjoint balls {Bi } such that mn (A \ ∪_{i=1}^{∞} Bi ) = 0 and
so
$$A=\bigcup_{i=1}^{\infty}(B_i\cap A)\cup N$$
Proof: From Corollary 15.5.3, 15.24 holds for any nonnegative simple function in
place of g. In general, let {sk } be an increasing sequence of simple functions which
converges to g pointwise. Then from the monotone convergence theorem
$$\int_{h(A)}g(y)\,dH^n=\lim_{k\to\infty}\int_{h(A)}s_k\,dH^n=\lim_{k\to\infty}\int_A s_k(h(x))\,|\det(U(x))|\,dm_n=\int_A g(h(x))\,|\det(U(x))|\,dm_n.$$
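The change of variables just established can be illustrated with a minimal numeric instance in the case n = m, where H^n agrees with m_n (the setup below is an example of mine, not from the text): for the polar coordinate map h(r, θ) = (r cos θ, r sin θ) on A = (1/2, 1) × (0, π/2), one has |det U| = |det Dh| = r, so ∫_{h(A)} 1 dH² equals ∫_A r dm₂, and both equal the area 3π/16 of the quarter annulus h(A).

```python
import math

# ∫_A r dm₂ over A = (1/2, 1) × (0, π/2) by a midpoint rule; the integrand
# is independent of θ, so only the r-integral needs discretizing.

steps = 1000
dr, dth = 0.5 / steps, (math.pi / 2) / steps
rhs = 0.0
for i in range(steps):
    r = 0.5 + (i + 0.5) * dr
    rhs += r * dr * dth * steps    # sum over all θ-cells at this r

lhs = 3 * math.pi / 16             # exact area of the quarter annulus h(A)
assert abs(lhs - rhs) < 1e-9       # midpoint rule is exact for linear r
```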
You don’t need to assume U (x) is one to one. The following lemma is like Sard’s
lemma presented earlier. However, it might seem a little easier and if so, it is because
the area formula above is available and Hausdorff measures are in some ways easier to
work with since they only depend on distance.
Lemma 15.5.6 Let E ≡ {x ∈ V : det U (x) = 0} . Then H^n (h (E)) = 0.
Proof: Recall the notation J (x) ≡ |det (U (x))| discussed earlier. Modify it slightly
as
Jh (x) ≡ |det (U (x))|
where Dh (x) = R (x) U (x). First suppose V is bounded and so is ||Dh (x)||. Define
kε : Rn → Rm × Rn by
$$k_\varepsilon(x)\equiv\begin{pmatrix}h(x)\\ \varepsilon x\end{pmatrix}$$
Thus
$$k_\varepsilon(x+v)-k_\varepsilon(x)=\begin{pmatrix}h(x+v)-h(x)\\ \varepsilon v\end{pmatrix}=\begin{pmatrix}Dh(x)v\\ \varepsilon v\end{pmatrix}+o(v)$$
and so
$$Dk_\varepsilon(x)=\begin{pmatrix}Dh(x)\\ \varepsilon\,\mathrm{id}\end{pmatrix}\in L(\mathbb{R}^n,\mathbb{R}^m\times\mathbb{R}^n)$$
It is left as an exercise to explain how
$$Dk_\varepsilon(x)^{*}=\big(\,Dh(x)^{*}\ \ \varepsilon\,\mathrm{id}\,\big)\in L(\mathbb{R}^m\times\mathbb{R}^n,\mathbb{R}^n)$$
and
$$Jk_\varepsilon(x)^{2}\equiv\det\big(Dk_\varepsilon(x)^{*}Dk_\varepsilon(x)\big)=\det\big(Dh(x)^{*}Dh(x)+\varepsilon^{2}\,\mathrm{id}\big)$$
Now there is an orthonormal basis, {vk } for Rn such that
$$Dh(x)^{*}Dh(x)=\sum_{k=1}^{n}a_k\,v_k\otimes v_k,\qquad\varepsilon^{2}\,\mathrm{id}=\sum_{k=1}^{n}\varepsilon^{2}\,v_k\otimes v_k$$
where each ak ≥ 0 and on E, at least one equals 0. Thus the determinant above is the
determinant of a diagonal matrix which has all positive entries on the diagonal but at
least one of them is ε2 . Since ||Dh (x)|| is bounded, this shows there exists a constant
C independent of x ∈ V such that on E, Jkε (x) ≤ εC. Now by the earlier area formula,
Theorem 15.5.4,
$$\int_{k_\varepsilon(V)}X_{k_\varepsilon(E)}(z)\,dH^n=\int_V X_{k_\varepsilon(E)}(k_\varepsilon(x))\,Jk_\varepsilon(x)\,dm_n=\int_V X_E(x)\,Jk_\varepsilon(x)\,dm_n$$
Thus
Hn (kε (E)) ≤ εCmn (E)
However, |h (x) − h (x1 )| ≤ |kε (x) − kε (x1 )| and so, letting P (y, x) ≡ y, Lemma 15.4.4 gives H^n (kε (E)) ≥ H^n (h (E)) . Thus
Hn (h (E)) ≤ εCmn (E)
and since ε is arbitrary, this shows Hn (h (E)) = 0.
In the general case when V might not be bounded define for each k ∈ N sufficiently
large that the sets are nonempty
$$V_k\equiv B(0,k)\cap\{x\in V:\mathrm{dist}(x,V^{C})>1/k\}$$
Then for g a nonnegative measurable function,
$$\int_{h(V)}g(y)\,dH^n=\int_V g(h(x))\,|\det(U(x))|\,dm_n. \tag{15.25}$$
Proof: Let E = {x ∈ V : U (x)^{−1} does not exist} , which is the same as the set
where det U (x) = 0. Then by Lemma 15.5.6 H^n (h (E)) = 0 and so
$$\begin{aligned}
\int_{h(V)}g(y)\,dH^n&=\int_{h(V)}X_{h(V\setminus E)}(y)\,g(y)\,dH^n\\
&=\int_V X_{h(V\setminus E)}(h(x))\,g(h(x))\,|\det(U(x))|\,dm_n\\
&=\int_V X_{V\setminus E}(x)\,g(h(x))\,|\det(U(x))|\,dm_n\\
&=\int_V g(h(x))\,|\det(U(x))|\,dm_n
\end{aligned}$$
Lemma 15.6.1 For x ∈ V+ , there exists an open ball Bx ⊆ V+ such that h is one
to one on Bx .
Proof: Let Dh (x) = R (x) U (x) be the right polar decomposition. Recall that
U (x) is self adjoint and satisfies U (x) v · v ≥ δ |v|² for some δ > 0, where δ is the
square root of the smallest eigenvalue of U (x)² = Dh (x)∗ Dh (x) . Let r > 0 be such
that B (x, r) ⊆ V+ . Then for y ∈ B (x, r), the continuity
2
of y → U (y) , resulting for the assumption that h is C 1 , implies
³ ´
2 2 2 2
U (y) v · v = U (x) v · v + U (y) − U (x) v · v
2 δ2 2 δ2 2
≥ δ 2 |v| − |v| = |v|
2 2
provided r is sufficiently small. Thus for small enough r, the eigenvalues of U (y)² for
y ∈ B (x, r) are at least as large as δ²/2 and so the eigenvalues of U (y) for these values
of y are at least as large as δ/√2. Thus for y ∈ B (x, r),
$$U(y)v\cdot v\ge\frac{\delta}{\sqrt{2}}\,|v|^{2}.$$
Now make r still smaller if necessary such that for y, z ∈ B (x, r),
$$||Dh(y)-Dh(z)||<\frac{\delta}{2}.$$
Then for any y, z of this sort,
$$|h(z)-h(y)-Dh(y)(z-y)|<\frac{\delta}{2}\,|z-y|. \tag{15.26}$$
This follows from the mean value inequality, Theorem 6.4.2 because if you define for
such a fixed y ∈ B (x, r)
$$\bigcup_{i=1}^{\infty}E_i=V_+,\qquad h\ \text{is one to one on each}\ E_i,\qquad E_i\cap E_j=\emptyset\ (i\ne j),$$
and each Ei is a Borel set contained in the open set Bi . Now define
$$n(y)\equiv\sum_{i=1}^{\infty}X_{h(E_i\cap A)}(y)+X_{h(Z)}(y).$$
i=1
Thus
$$\sum_{i=1}^{\infty}X_{h(E_i\cap A)}(y)=\#\big(h^{-1}(y)\cap A_+\big)$$
where A+ ≡ V+ ∩ A. The sets h (Ei ∩ A) , h (Z) are H^n measurable by Lemma 15.4.2. Thus
n (·) is H^n measurable.
Lemma 15.6.2 Let V be an open set, F ⊆ h(V ) be Hn measurable and let A be a
Lebesgue measurable subset of V . Then
$$\int_{h(A)}n(y)\,X_F(y)\,dH^n=\int_A X_F(h(x))\,|\det Dh(x)|\,dm_n.$$
Proof:
$$\begin{aligned}
\int_{h(A)}n(y)\,X_F(y)\,dH^n&=\sum_{i=1}^{\infty}\int_{h(A)}X_{h(E_i)}(y)\,X_F(y)\,dH^n\\
&=\sum_{i=1}^{\infty}\int_{h(B_i\cap A)}X_{h(E_i\cap A)}(y)\,X_F(y)\,dH^n\\
&=\sum_{i=1}^{\infty}\int_{B_i\cap A}X_{E_i\cap A}(x)\,X_F(h(x))\,|\det U(x)|\,dm_n\\
&=\sum_{i=1}^{\infty}\int_A X_{E_i\cap A}(x)\,X_F(h(x))\,|\det U(x)|\,dm_n\\
&=\int_A\sum_{i=1}^{\infty}X_{E_i\cap A}(x)\,X_F(h(x))\,|\det U(x)|\,dm_n\\
&=\int_{A_+}X_F(h(x))\,|\det U(x)|\,dm_n=\int_A X_F(h(x))\,|\det U(x)|\,dm_n.
\end{aligned}$$
The integrand of the integral on the left was shown to be Lebesgue measurable
in the above argument. Therefore, the integrand of the integral on the right is also
Lebesgue measurable because it equals
$$X_F(h(x))\,|\det U(x)|\,X_{A_+}(x)+0\cdot X_Z(x)$$
and both functions in the sum are measurable. This proves the lemma. ¥
Definition 15.6.3 For y ∈ h(A), define a function, #, according to the for-
mula
#(y) ≡ number of elements in h−1 (y).
Observe that
#(y) = n(y) Hn a.e. (15.28)
because n(y) = #(y) if y ∉ h(Z), a set of H^n measure 0. Therefore, # is a measurable
function because of completeness of H^n .
function because of completeness of Hn .
Proof: From 15.28 and Lemma 15.6.2, 15.29 holds for all g, a nonnegative simple
function. Approximating an arbitrary measurable nonnegative function, g, with an
increasing pointwise convergent sequence of simple functions and using the monotone
convergence theorem, yields 15.29 for an arbitrary nonnegative measurable function, g.
This proves the theorem. ¥
Lemma 15.7.1 Let A ∈ L (Rn , Rm ) . Then the nonzero eigenvalues of AA∗ and A∗ A
are the same and occur with the same algebraic multiplicities.
Proof: This follows from Theorem 3.6.5 on Page 51 applied to the matrices of A
and A∗ .
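Lemma 15.7.1 is easy to confirm numerically on a random example (the setup below is illustrative): for a 3×5 real matrix A, the five eigenvalues of A*A are the three eigenvalues of AA* together with two zeros.

```python
import numpy as np

# The nonzero eigenvalues of AA* and A*A agree with multiplicity.

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
small = np.sort(np.linalg.eigvalsh(A @ A.T))   # 3 eigenvalues of AA*
big = np.sort(np.linalg.eigvalsh(A.T @ A))     # 5 eigenvalues of A*A
big_nonzero = big[np.abs(big) > 1e-10]         # drop the two (numerical) zeros

assert big_nonzero.shape == small.shape
assert np.allclose(big_nonzero, small)
```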
and
$$\sum_{i=1}^{\infty}\alpha(s)(r(S_i))^{s}<t.$$
I claim these sets can be taken to be open sets. Choose λ > 1 but close enough to 1
that
$$\sum_{i=1}^{\infty}\alpha(s)(\lambda\,r(S_i))^{s}<t.$$
Replace Si with Si + B (0, η i ) where η i is small enough that
diam (Si ) + 2η i < λ diam (Si ) .
Then
diam (Si + B (0, η i )) ≤ λ diam (Si )
and so r (Si + B (0, η i )) ≤ λr (Si ) . Thus
$$\sum_{i=1}^{\infty}\alpha(s)\,\big(r(S_i+B(0,\eta_i))\big)^{s}<t.$$
Hence you could replace Si with Si + B (0, η i ) and so one can assume the sets Si are
open.
Claim: If z is close enough to y, then A ∩ h−1 (z) ⊆ ∪∞ i=1 Si .
Proof: If not, then there exists a sequence {zk } such that
zk → y,
and
xk ∈ (A ∩ h^{−1} (zk )) \ ∪_{i=1}^{∞} Si .
By compactness of A, there exists a subsequence still denoted by k such that
zk → y, xk → x ∈ A \ ∪_{i=1}^{∞} Si .
Hence
$$h(x)=\lim_{k\to\infty}h(x_k)=\lim_{k\to\infty}z_k=y.$$
But x ∉ ∪_{i=1}^{∞} Si , contrary to the assumption that A ∩ h^{−1} (y) ⊆ ∪_{i=1}^{∞} Si .
It follows from this claim that whenever z is close enough to y,
$$H^s_\delta\big(A\cap h^{-1}(z)\big)<t.$$
This shows
$$\big\{z\in\mathbb{R}^p:H^s_\delta\big(A\cap h^{-1}(z)\big)<t\big\}$$
is an open set and so y → H^s_δ (A ∩ h^{−1} (y)) is Borel measurable whenever A is compact.
Now let V be an open set and let
Ak ↑ V, Ak compact.
Then ¡ ¢ ¡ ¢
Hδs V ∩ h−1 (y) = lim Hδs Ak ∩ h−1 (y)
k→∞
s
¡ −1
¢
so y →Hδ V ∩ h (y) is Borel measurable for all V open. This proves the lemma. ¥
Lemma 15.7.5 Let A be a Lebesgue measurable subset of an open set V and let $h : V \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be Lipschitz continuous. Then
$$y \mapsto \mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right)$$
is Lebesgue measurable. Furthermore,
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right) dm_m \le 2^m \left(\operatorname{Lip}(h)\right)^m \frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\, m_n(A).$$
Proof: Let A be a bounded Lebesgue measurable set in $\mathbb{R}^n$. Then by inner and outer regularity of Lebesgue measure there exist an increasing sequence of compact sets $\{K_k\}$ contained in A and a decreasing sequence of open sets $\{V_k\}$ containing A such that $m_n\left(V_k \setminus K_k\right) < 2^{-k}$. Thus $m_n\left(V_1\right) \le m_n(A) + 1$. By Lemma 15.7.4,
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}_{\delta}\left(V_1 \cap h^{-1}(y)\right) dm_m < 2^m \left(\operatorname{Lip}(h)\right)^m \frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)} \left(m_n(A) + 1\right).$$
Also
$$\mathcal{H}^{n-m}_{\delta}\left(K_k \cap h^{-1}(y)\right) \le \mathcal{H}^{n-m}_{\delta}\left(A \cap h^{-1}(y)\right) \le \mathcal{H}^{n-m}_{\delta}\left(V_k \cap h^{-1}(y)\right). \tag{15.31}$$
By Lemma 15.7.4,
$$\int_{\mathbb{R}^m} \left(\mathcal{H}^{n-m}_{\delta}\left(V_k \cap h^{-1}(y)\right) - \mathcal{H}^{n-m}_{\delta}\left(K_k \cap h^{-1}(y)\right)\right) dm_m$$
$$= \int_{\mathbb{R}^m} \mathcal{H}^{n-m}_{\delta}\left(\left(V_k \setminus K_k\right) \cap h^{-1}(y)\right) dm_m$$
$$\le 2^m \left(\operatorname{Lip}(h)\right)^m \frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\, m_n\left(V_k \setminus K_k\right) < 2^m \left(\operatorname{Lip}(h)\right)^m \frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\, 2^{-k}.$$
Let the Borel measurable functions g and f be defined by
$$g(y) \equiv \lim_{k \to \infty} \mathcal{H}^{n-m}_{\delta}\left(V_k \cap h^{-1}(y)\right),\qquad f(y) \equiv \lim_{k \to \infty} \mathcal{H}^{n-m}_{\delta}\left(K_k \cap h^{-1}(y)\right).$$
It follows from the dominated convergence theorem, using $\mathcal{H}^{n-m}_{\delta}\left(V_1 \cap h^{-1}(y)\right)$ as a dominating function, and from 15.31 that
$$f(y) \le \mathcal{H}^{n-m}_{\delta}\left(A \cap h^{-1}(y)\right) \le g(y)$$
and
$$\int_{\mathbb{R}^m} \left(g(y) - f(y)\right) dm_m = 0.$$
By completeness of $m_m$, this establishes that $y \mapsto \mathcal{H}^{n-m}_{\delta}\left(A \cap h^{-1}(y)\right)$ is Lebesgue measurable. Then by Lemma 15.7.4 again,
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}_{\delta}\left(A \cap h^{-1}(y)\right) dm_m \le 2^m \left(\operatorname{Lip}(h)\right)^m \frac{\alpha(n-m)\,\alpha(m)}{\alpha(n)}\, m_n(A).$$
Letting δ → 0 and using the monotone convergence theorem yields the desired inequality for $\mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right)$.
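The inequality of Lemma 15.7.5 can be checked by hand in the simplest case. The following is my illustrative example, not the author's: take $n = 2$, $m = 1$, $h(x_1, x_2) = x_1$ (Lipschitz constant 1), and A the unit square, so each slice $A \cap h^{-1}(y)$ is a segment of length 1 for $y \in [0,1]$ and the left side equals 1.

```python
import math

# Sanity check of Lemma 15.7.5 for h(x1, x2) = x1 on the unit square
# (an illustration; the map and the set A are my choices, not the text's).
def alpha(s):
    # volume of the unit ball in R^s
    return math.pi ** (s / 2) / math.gamma(s / 2 + 1)

n, m, Lip = 2, 1, 1.0
m_n_A = 1.0                      # m_2 of the unit square

# Left side: integral over y in [0,1] of slice length 1.
lhs = 1.0
rhs = 2 ** m * Lip ** m * alpha(n - m) * alpha(m) / alpha(n) * m_n_A

assert lhs <= rhs                # 1 <= 8/pi
print(rhs)                       # 8/pi, about 2.546
```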
Then the following formula holds along with all measurability assertions needed for it to make sense:
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right) dy = \int_A Jh(x)\, dx. \tag{15.32}$$
By assumption, for each x ∈ V there exists i such that $\det\left(D_{x_i}h(x)\right) \ne 0$, which implies $\det Dh_i(x) \ne 0$; this follows from Proposition 3.8.18, which says that $Dh(x)$ has m independent columns. Hence
$$\cup_{i,j} F^i_j = V.$$
Thus
$$\cup_{i,j} F^i_j \cap A = A.$$
The problem is that the $\left\{F^i_j\right\}$ might not be disjoint. Let $\left\{E^i_j\right\}$ be measurable sets such that $E^i_j \subseteq F^i_k \cap A$ for some k, the sets $E^i_j$ are disjoint, and their union equals A. Then
$$\int_A Jh(x)\, dx = \sum_{i \in \Lambda(n,m)} \sum_{j=1}^{\infty} \int_{E^i_j} \det\left(Dh(x)\, Dh(x)^*\right)^{1/2} dx. \tag{15.33}$$
Now each $E^i_j$ is contained in some $B^i_k$ and so $h_i$ has an inverse on $h_i\left(B^i_k\right)$, which I will denote by g. Thus, letting $\pi_{i^c} x \equiv x_{i^c}$ and using the definition of g,
$$g\left(h(x), x_{i^c}\right) = x$$
on $h_i\left(E^i_j\right)$. Changing the variables using the area formula, the expression in 15.33 equals
$$\int_A Jh(x)\, dx = \sum_{i \in \Lambda(n,m)} \sum_{j=1}^{\infty} \int_{h_i\left(E^i_j\right)} \det\left(Dh(g(y))\, Dh(g(y))^*\right)^{1/2} \left|\det Dg(y)\right| dy$$
$$= \sum_{i \in \Lambda(n,m)} \sum_{j=1}^{\infty} \int_{h_i\left(E^i_j\right)} \det\left(Dh(g(y))\, Dh(g(y))^*\right)^{1/2} \left|\det Dh_i(g(y))\right|^{-1} dy. \tag{15.34}$$
Note the integrands are all Borel measurable functions because they are continuous functions of the entries of matrices whose entries come from taking limits of difference quotients of continuous functions. Thus from 15.33,
$$\int_{E^i_j} \det\left(Dh(x)\, Dh(x)^*\right)^{1/2} dx$$
$$= \int_{\mathbb{R}^n} \mathcal{X}_{h_i\left(E^i_j\right)}(y) \det\left(Dh(g(y))\, Dh(g(y))^*\right)^{1/2} \left|\det Dh_i(g(y))\right|^{-1} dy. \tag{15.35}$$
Next this integral is split using Fubini's theorem. Fix $y_1 \in \mathbb{R}^m$; it is necessary to decide where $y_2$ lies. We need
$$\left(y_1, y_2\right) \in h_i\left(E^i_j\right) = \left(h\left(E^i_j\right), \pi_{i^c}\left(E^i_j\right)\right).$$
This requires $y_2 \in \pi_{i^c}\left(E^i_j\right)$. However, $y_1$ is also given. Now $y_1 = h(x)$ and so
$$x = \left(x_i, x_{i^c}\right) = \left(x_i, y_2\right) \in h^{-1}\left(y_1\right),$$
which implies $\pi_{i^c} x = y_2 \in \pi_{i^c} h^{-1}\left(y_1\right)$. Thus
$$y_2 \in \pi_{i^c}\left(h^{-1}\left(y_1\right) \cap E^i_j\right).$$
It follows that 15.35 equals
$$\int_{\mathbb{R}^m} \int_{\pi_{i^c}\left(h^{-1}\left(y_1\right) \cap E^i_j\right)} \det\left(Dh(g(y))\, Dh(g(y))^*\right)^{1/2} \left|\det D_{x_i}h(g(y))\right|^{-1} dy_2\, dy_1. \tag{15.36}$$
Now consider the inner integral in 15.36 in which $y_1$ is fixed. The integrand equals
$$\left[\det\left(\begin{pmatrix} D_{x_i}h(g(y)) & D_{x_{i^c}}h(g(y)) \end{pmatrix} \begin{pmatrix} D_{x_i}h(g(y))^* \\ D_{x_{i^c}}h(g(y))^* \end{pmatrix}\right)\right]^{1/2} \left|\det D_{x_i}h(g(y))\right|^{-1}. \tag{15.37}$$
I want to massage the above expression slightly. Since y1 is fixed, and
it follows
Corollary 15.7.7 Let A be a Lebesgue measurable set in an open set V and let $h : V \subseteq \mathbb{R}^n \to \mathbb{R}^m$, where $m \le n$, be $C^1$ and such that
$$Jh(x) \equiv \det\left(Dh(x)\, Dh(x)^*\right)^{1/2} \ne 0.$$
Then the following formula holds along with all measurability assertions needed for it to make sense:
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right) dy = \int_A Jh(x)\, dx. \tag{15.39}$$
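Since $Dh(x)$ is an $m \times n$ matrix, the Jacobian $Jh(x) = \det\left(Dh(x)\,Dh(x)^*\right)^{1/2}$ is directly computable, and by Lemma 15.7.1 it also equals the product of the singular values of $Dh(x)$. A short numerical sketch, using a hypothetical map $h(x,y,z) = (x + yz,\; y - z)$ evaluated at $(1,2,3)$:

```python
import numpy as np

# Coarea Jacobian Jh = det(Dh Dh*)^{1/2}, computed two ways.
def Jh(Dh):
    # Dh is the m x n derivative matrix; Dh Dh* is m x m.
    return np.sqrt(np.linalg.det(Dh @ Dh.T))

# For h(x, y, z) = (x + y*z, y - z), Dh = [[1, z, y], [0, 1, -1]];
# at the point (1, 2, 3) this gives:
Dh = np.array([[1.0, 3.0, 2.0],
               [0.0, 1.0, -1.0]])

direct = Jh(Dh)
# Product of the singular values of Dh equals the same quantity.
via_svd = np.prod(np.linalg.svd(Dh, compute_uv=False))
assert np.isclose(direct, via_svd)
print(direct)        # sqrt(27), about 5.196
```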
Proof: For each x ∈ A consider a ball $B\left(x, r_x\right)$ with $\overline{B\left(x, r_x\right)} \subseteq V$ and $r_x < 1$. By the mean value inequality, Theorem 6.4.2, h is Lipschitz on each such ball. Letting $\{B_i\}$ be an open covering of A by these balls, let $C_i \equiv B_i \cap A$, and let $\{A_i\}$ be disjoint measurable sets with $A_i \subseteq C_i$ and $\cup_i A_i = A$. The conclusion of Lemma 15.7.6 applies with $B_i$ playing the role of V in that lemma, and one can write
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A_i \cap h^{-1}(y)\right) dy = \int_{A_i} Jh(x)\, dx.$$
Summing over i,
$$\int_{\mathbb{R}^m} \sum_{i=1}^{\infty} \mathcal{H}^{n-m}\left(A_i \cap h^{-1}(y)\right) dy = \lim_{M \to \infty} \int_{\mathbb{R}^m} \sum_{i=1}^{M} \mathcal{H}^{n-m}\left(A_i \cap h^{-1}(y)\right) dy$$
$$= \lim_{M \to \infty} \sum_{i=1}^{M} \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A_i \cap h^{-1}(y)\right) dy = \lim_{M \to \infty} \sum_{i=1}^{M} \int \mathcal{X}_{A_i}(x)\, Jh(x)\, dx$$
$$= \int \sum_{i=1}^{\infty} \mathcal{X}_{A_i}(x)\, Jh(x)\, dx = \int \mathcal{X}_{\cup_{i=1}^{\infty} A_i}(x)\, Jh(x)\, dx = \int \mathcal{X}_A(x)\, Jh(x)\, dx$$
where $Jh(x) \equiv \det\left(Dh(x)\, Dh(x)^*\right)^{1/2}$. ■
Proof: By Corollary 15.7.7, this formula is true for all measurable A contained in the open set $\mathbb{R}^n \setminus S$. It remains to verify the formula for all measurable sets A, whether or not they intersect S.
Consider the case where A is a compact subset of S and, for ε > 0, $k(x, w) \equiv h(x) + \varepsilon w$. Then
$$Dk(x, w) = \left(Dh(x),\; \varepsilon I\right)$$
and so
$$Jk^2 = \det\left(\left(Dh(x),\; \varepsilon I\right) \begin{pmatrix} Dh(x)^* \\ \varepsilon I \end{pmatrix}\right) = \det\left(Dh(x)\, Dh(x)^* + \varepsilon^2 I\right).$$
Now $\left\|Dh(x)\, Dh(x)^*\right\|$ is bounded on A because A is compact and Dh is continuous.
Since A ⊆ S, at least one of the singular values $\lambda_k$ of $Dh(x)$ equals zero. However, they are all bounded by some constant C for all x ∈ A, due to the existence of an upper bound for $\left\|Dh(x)\, Dh(x)^*\right\|$. Thus
$$Jk^2 = \prod_{i=1}^{m} \left(\lambda_i^2 + \varepsilon^2\right) \in \left[\varepsilon^{2m},\; C^2 \varepsilon^2\right] \tag{15.41}$$
where
$$C_{nm} = \frac{\alpha(n)}{\alpha(n-m)\,\alpha(m)}.$$
It is formula 15.30 applied to the situation where h = p. It is clear p is Lipschitz continuous with Lipschitz constant 1 since p is a projection.
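The constants appearing here are all computable: α(n) is the volume of the unit ball in $\mathbb{R}^n$, and the formula of Section 15.2.4 gives $\alpha(n) = \pi^{n/2}/\Gamma(n/2 + 1)$. A quick numerical evaluation of α and of $C_{nm}$ (an illustration only):

```python
import math

# alpha(n) = pi^{n/2} / Gamma(n/2 + 1), the volume of the unit ball in R^n.
def alpha(n):
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

# C_nm = alpha(n) / (alpha(n-m) * alpha(m)) as in the text.
def C(n, m):
    return alpha(n) / (alpha(n - m) * alpha(m))

assert math.isclose(alpha(1), 2.0)               # length of [-1, 1]
assert math.isclose(alpha(2), math.pi)           # area of the unit disk
assert math.isclose(alpha(3), 4 * math.pi / 3)   # volume of the unit ball
print(C(3, 1))   # alpha(3)/(alpha(2) alpha(1)) = (4 pi/3)/(2 pi) = 2/3
```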
Claim:
$$\mathcal{H}^{n-m}\left(k^{-1}(y) \cap p^{-1}(w) \cap \left(A \times B(0,1)\right)\right) \ge \mathcal{X}_{B(0,1)}(w)\, \mathcal{H}^{n-m}\left(h^{-1}\left(y - \varepsilon w\right) \cap A\right).$$
Indeed, for $w \in B(0,1)$, a point $\left(x, w_1\right)$ lies in the set on the left if and only if $w_1 = w$, $x \in A$, and $h(x) = y - \varepsilon w$, if and only if
$$\left(x, w_1\right) \in \left(h^{-1}\left(y - \varepsilon w\right) \cap A\right) \times \{w\}.$$
Therefore, for $w \in B(0,1)$,
$$\mathcal{H}^{n-m}\left(k^{-1}(y) \cap p^{-1}(w) \cap \left(A \times B(0,1)\right)\right)$$
$$\ge \mathcal{H}^{n-m}\left(\left(h^{-1}\left(y - \varepsilon w\right) \cap A\right) \times \{w\}\right) = \mathcal{H}^{n-m}\left(h^{-1}\left(y - \varepsilon w\right) \cap A\right).$$
(Actually equality holds in the claim.) From the claim, 15.42 is at least as large as
$$C_{nm} \int_{\mathbb{R}^m} \int_{B(0,1)} \mathcal{H}^{n-m}\left(h^{-1}\left(y - \varepsilon w\right) \cap A\right) dw\, dy \tag{15.43}$$
$$= C_{nm} \int_{B(0,1)} \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(h^{-1}\left(y - \varepsilon w\right) \cap A\right) dy\, dw$$
$$= C_{nm}\, m_m\left(B(0,1)\right) \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(h^{-1}(y) \cap A\right) dy. \tag{15.44}$$
The use of Fubini's theorem is justified because the integrand is Borel measurable. Now by 15.44 this has shown
$$\varepsilon C\, m_{n+m}\left(A \times B(0,1)\right) \ge C_{nm}\, m_m\left(B(0,1)\right) \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(h^{-1}(y) \cap A\right) dy.$$
Since this holds for arbitrary compact sets in S, it follows from Lemma 15.7.5 and inner
regularity of Lebesgue measure that the equation holds for all measurable subsets of S.
Thus if A is any measurable set contained in V,
$$\int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A \cap h^{-1}(y)\right) dy = \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(A \cap S \cap h^{-1}(y)\right) dy$$
$$+ \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(\left(A \setminus S\right) \cap h^{-1}(y)\right) dy$$
$$= \int_{\mathbb{R}^m} \mathcal{H}^{n-m}\left(\left(A \setminus S\right) \cap h^{-1}(y)\right) dy$$
$$= \int_{A \setminus S} Jh(x)\, dx = \int_A Jh(x)\, dx$$
where $Jh(x) = \det\left(Dh(x)\, Dh(x)^*\right)^{1/2}$. ■
15.8. A NONLINEAR FUBINI’S THEOREM 453
$$= \sum_{i=1}^{p} c_i \int_{h\left(E_i\right)} \mathcal{H}^{n-m}\left(E_i \cap h^{-1}(y)\right) dy$$
$$= \int_{h(V)} \sum_{i=1}^{p} c_i\, \mathcal{H}^{n-m}\left(E_i \cap h^{-1}(y)\right) dy$$
$$= \int_{h(V)} \left[\int_{h^{-1}(y)} s\, d\mathcal{H}^{n-m}\right] dy. \tag{15.46}$$
Proof: Let $s_i \uparrow g$ where each $s_i$ is a simple function satisfying 15.46. Then let i → ∞ and use the monotone convergence theorem to replace $s_i$ with g. This proves the nonlinear version of Fubini's theorem. ■
Note that this formula is a nonlinear version of Fubini's theorem. The "n − m dimensional surface" $h^{-1}(y)$ plays the role of $\mathbb{R}^{n-m}$, and $\mathcal{H}^{n-m}$ is like (n − m)-dimensional Lebesgue measure. The term $J\left(Dh(x)\right)$ corrects for the error occurring because of the lack of flatness of $h^{-1}(y)$.
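A familiar special case makes the picture concrete: for $h(x) = |x|$ on $\mathbb{R}^2$ (away from the origin), $Jh = |\nabla h| = 1$, the level set $h^{-1}(r)$ is the circle of radius r with $\mathcal{H}^1$ measure 2πr, and the formula reduces to integration in polar coordinates. The following numerical check is my illustration, with a hypothetical test function and quadrature grid:

```python
import numpy as np

# Nonlinear Fubini for h(x) = |x| on R^2:
#   integral_{R^2} g dx = integral_0^inf [ integral_{|x|=r} g dH^1 ] dr,
# checked for the radial function g(x) = exp(-|x|^2), whose integral is pi.
f = lambda r: np.exp(-r ** 2)

# Left side: direct 2D quadrature over a large square.
N, L = 1201, 6.0
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x)
lhs = np.sum(f(np.hypot(X, Y))) * dx ** 2

# Right side: for each level r, the circle integral is f(r) * 2*pi*r.
r = np.linspace(0.0, L, 2001)
dr = r[1] - r[0]
rhs = np.sum(f(r) * 2 * np.pi * r) * dr

assert abs(lhs - np.pi) < 1e-2 and abs(rhs - np.pi) < 1e-2  # both near pi
print(lhs, rhs)
```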