
Wendell H. Fleming - Functions of Several Variables-ADDISON-WESLEY (1965)

Functions of
Several Variables
This book is in the

ADDISON-WESLEY SERIES IN MATHEMATICS

Lynn H. Loomis, Consulting Editor


Functions of
Several Variables

WENDELL H. FLEMING
Brown University

ADDISON-WESLEY PUBLISHING COMPANY, INC.

READING, MASSACHUSETTS · PALO ALTO · LONDON · DALLAS · ATLANTA


Copyright © 1965

Philippines Copyright 1965

ADDISON-WESLEY PUBLISHING COMPANY, INC.

Printed in the United States of America

ALL RIGHTS RESERVED. THIS BOOK, OR PARTS THEREOF,
MAY NOT BE REPRODUCED IN ANY FORM
WITHOUT WRITTEN PERMISSION OF THE PUBLISHER.

Library of Congress Catalog Card No. 65-15697



To Brown UNIVERSITY

on the occasion of its bicentennial

1764-1964
Digitized by the Internet Archive
in 2022 with funding from
Kahle/Austin Foundation

https://archive.org/details/functionsofsever0000wend
Preface

The purpose of this book is to give a systematic development of differential
and integral calculus for functions of several variables. The traditional topics
from advanced calculus are included: maxima and minima, chain rule, implicit
function theorem, multiple integrals, divergence and Stokes’ theorems, and so
on. However, the treatment differs in several important respects from the
traditional one. Vector notation is used throughout, and the distinction is
maintained between n-dimensional euclidean space $E^n$ and its dual. By introducing
convex and concave functions a more thorough treatment of extrema is possible.
The elements of the Lebesgue theory of integrals are given. In place of the
traditional vector analysis in $E^3$, we first introduce exterior algebra and the
calculus of exterior differential forms. The formulas of vector analysis then become
special cases of formulas about differential forms and integrals over r-manifolds
in euclidean $E^n$, for arbitrary dimensions r and n.
The book is suitable for a college course at the advanced undergraduate level.
By omitting certain chapters, a one-semester course can be based on it. For
instance, if the students already have a good knowledge of partial differentiation,
then Chapters 1 and 2 can be quickly reviewed, omitting those topics concerning
convexity. Substantial parts of Chapters 4, 5, 6, and 7 can then be covered
in a semester. There is also enough material for a more leisurely full-year course.
Some knowledge of linear algebra and elementary topology of $E^n$ is presumed.
However, the results needed from linear algebra are reviewed (in some cases
without proof), and the necessary topological material is given in the Appendix.
The author is indebted to many colleagues and students at Brown University.
Without the stimulation they provided, this book would not have been written.
Thanks are especially due to Ubiratan D’Ambrosio and John Brothers, who
carefully read the entire manuscript and furnished many improvements to it.
Thanks are also due to Fred Almgren, William Tyndall, and William Ziemer, who
read various chapters; and to Joan Phillips and Lance McVay, who verified
answers to the homework problems.

Providence, Rhode Island                                        W. H. F.
December 1964
Contents

CHAPTER 1  EUCLIDEAN SPACES, CONVEXITY

  1-1  Euclidean $E^n$
  1-2  Sets, functions
  1-3  Linear functions
  1-4  Convex sets
  1-5  Convex and concave functions
  1-6  Noneuclidean norms

CHAPTER 2  DIFFERENTIATION OF REAL-VALUED FUNCTIONS

  2-1  Directional and partial derivatives
  2-2  Differentiable functions
  2-3  Functions of class $C^{(q)}$
  2-4  Convex and concave functions (continued)
  2-5  Relative extrema
  2-6  Differential 1-forms

CHAPTER 3  VECTOR-VALUED FUNCTIONS OF ONE VARIABLE

  3-1  Derivatives
  3-2  Curves in $E^n$
  3-3  Line integrals
 *3-4  Gradient method

CHAPTER 4  VECTOR-VALUED FUNCTIONS OF SEVERAL VARIABLES

  4-1  Transformations
  4-2  Linear and affine transformations
  4-3  Differentiable transformations
  4-4  Composition
  4-5  The inverse function theorem
  4-6  The implicit function theorem
  4-7  Manifolds
  4-8  The multiplier rule

CHAPTER 5  INTEGRATION

  5-1  Intervals
  5-2  Measure
  5-3  Integrals over $E^n$
  5-4  Integrals over bounded sets
  5-5  Iterated integrals
  5-6  The unbounded case
  5-7  Change of measure under affine transformations
  5-8  Transformation of integrals
  5-9  Coordinate systems in $E^n$
  5-10 Convergence theorems
  5-11 Differentiation under the integral sign
 *5-12 $L^p$-spaces

CHAPTER 6  EXTERIOR ALGEBRA AND DIFFERENTIAL CALCULUS

  6-1  Alternating multilinear functions
  6-2  Multicovectors
  6-3  Multivectors
  6-4  Induced linear transformations
  6-5  Differential forms
  6-6  The adjoint and codifferential
  6-7  Special results for n = 3

CHAPTER 7  INTEGRATION ON MANIFOLDS

  7-1  Regular transformations
  7-2  Coordinate systems on manifolds
  7-3  Measure and integration on manifolds
  7-4  Orientations, integrals of r-forms
  7-5  The divergence theorem
  7-6  Stokes’ formula
  7-7  Closed and exact differential forms

APPENDIX

  A-1  The real number system
  A-2  Axioms for a vector space
  A-3  Basic topological notions in $E^n$
  A-4  Sequences in $E^n$
  A-5  Limits and continuity of transformations
  A-6  Topological spaces
  A-7  Connected spaces
  A-8  Compact spaces
  A-9  Review of Riemann integration
  A-10 Monotone functions

HISTORICAL NOTES

REFERENCES

ANSWERS TO SELECTED PROBLEMS

INDEX
CHAPTER 1

Euclidean Spaces, Convexity

This book is about the differential and integral calculus of functions of
several variables. For this purpose one needs first to know some basic properties
of euclidean space of arbitrary finite dimension n. We begin in Section 1-1
with that topic. Later in the chapter, convex sets and convex functions are
discussed.
Some knowledge of linear algebra and elementary topology is needed to
read this book. For the reader’s convenience the necessary material about
topology has been included in the Appendix. An acquaintance with elementary
calculus for functions of one variable is presumed. However, some of the basic
theorems from elementary calculus are given at the end of the Appendix.
Linear algebra is reviewed as needed in various parts of the book.
Format. The word Theorem has been reserved for what the author considers
the most important results. Results of lesser depth or interest are labeled
Proposition. The symbol § indicates the end of the proof of a theorem or
proposition. Occasionally part of a proof is left to the reader as a homework exercise.
The sections marked with an asterisk (*) may be omitted without disrupting
the organization. References and a brief historical survey are given at the
end of the book.

1-1 EUCLIDEAN $E^n$
While calculus has been motivated in large part by problems from geometry
and physics, its foundations rest upon the idea of number. Therefore a
thorough treatment of calculus should begin with a study of the real numbers.
The real number system satisfies a list of axioms about arithmetic and order,
which express properties of numbers with which everyone is familiar from
elementary mathematics. To be more precise, the real number system is what
is called in algebra an ordered field. To this list of axioms must be added one
further axiom which expresses the completeness of the real number system.
The completeness axiom can be introduced in several different forms. Of these
we shall take the property that any nonempty set of real numbers which is
bounded above has a least upper bound. This axiom is more subtle than the
others and is the foundation stone for some of the most important theorems in
calculus. The axioms for the real number system are listed in Section 1 of the
Appendix.
Scalars and vectors. By scalar we shall mean a real number. In elementary
mathematics a vector is described as a quantity which has both direction
and length. Vectors are illustrated by drawing arrows issuing from a given
point $\mathbf{0}$. The point at the head of the arrow specifies the vector. Therefore we
may (and shall) say that this point is the vector. Thus in two dimensions a
vector is just a point $(x, y)$ of the plane $E^2$. Vectors in $E^2$ are added by the
parallelogram law, which amounts to adding corresponding components. Thus

$$(x, y) + (u, v) = (x + u, y + v).$$

The product of $(x, y)$ by a scalar $c$ is the vector $(cx, cy)$. The zero vector is
$(0, 0)$.
With this in mind, let us define the space $E^n$ for any positive integer n.
The elements of $E^n$ are n-tuples $(x^1, \dots, x^n)$ of real numbers. For short, we
write $\mathbf{x}$ for the n-tuple $(x^1, \dots, x^n)$. The notation $\mathbf{x} \in E^n$ means “$\mathbf{x}$ is an
element of $E^n$.” The elements of $E^n$ will be called vectors, and also points,
depending on which term seems more suggestive in the context. Addition and
scalar multiplication are defined in $E^n$ as follows. If

$$\mathbf{x} = (x^1, \dots, x^n), \qquad \mathbf{y} = (y^1, \dots, y^n)$$

are any two elements of $E^n$, then

$$\mathbf{x} + \mathbf{y} = (x^1 + y^1, \dots, x^n + y^n).$$

If $\mathbf{x} \in E^n$ and $c$ is a scalar, then

$$c\mathbf{x} = (cx^1, \dots, cx^n).$$

The zero element of $E^n$ is

$$\mathbf{0} = (0, \dots, 0).$$

With these definitions $E^n$ satisfies the axioms for a vector space (Appendix A-2).
The term “vector” will be reserved for elements of $E^n$ rather than those of any
space satisfying these axioms.
The superscripts should not be confused with powers of x. For instance,
$(x^i)^2$ means the square of the $i$th entry $x^i$ of the n-tuple $(x^1, \dots, x^n)$.
If n = 1 we identify the 1-tuple $\mathbf{x} = (x)$ with the scalar $x$. In this case
addition and scalar multiplication reduce to ordinary addition and multiplication
of real numbers. If n = 2 or 3 we usually write $(x, y)$ or $(x, y, z)$ as is
commonly done in elementary analytic geometry, rather than $(x^1, x^2)$ or
$(x^1, x^2, x^3)$. Practically all of the theorems will be stated and proved for
arbitrary dimension n. However, the special cases n = 2, 3 will frequently appear
in the examples and homework problems.
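
The componentwise definitions above translate almost word for word into code. The following is only a minimal sketch: it assumes vectors are modeled as Python tuples of floats, and the helper names `add`, `scale`, and `zero` are illustrative choices, not notation from the text.

```python
# Vectors in E^n are modeled as tuples of floats; the helper names
# add, scale, and zero are hypothetical, not taken from the text.

def add(x, y):
    """Return x + y, adding corresponding components."""
    if len(x) != len(y):
        raise ValueError("both vectors must lie in the same E^n")
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(c, x):
    """Return the scalar multiple cx."""
    return tuple(c * xi for xi in x)

def zero(n):
    """Return the zero element (0, ..., 0) of E^n."""
    return (0.0,) * n

x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)
total = add(x, y)        # componentwise sum
doubled = scale(2.0, x)  # each component multiplied by 2
```

Note that the code indexes components from 0, whereas the text writes $x^1, \dots, x^n$.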
The notions of vector sum and multiplication by scalars determine the
vector space structure of $E^n$, but are not enough to define the concepts of
distance and angle. These arise by introducing an inner product in $E^n$. An
inner product assigns to each pair $\mathbf{x}$, $\mathbf{y}$ of vectors a scalar, and must have the
five properties listed in Problem 2 at the end of the section. The one which we
shall use is the standard euclidean inner product, denoted by $\cdot$ ,

$$\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} x^i y^i.$$

The vector space $E^n$ with this inner product is called euclidean n-space. Other
inner products in $E^n$ will be considered later in Section 1-6.
The euclidean norm (or length) of a vector $\mathbf{x}$ is

$$|\mathbf{x}| = (\mathbf{x} \cdot \mathbf{x})^{1/2}.$$

It is positive except when $\mathbf{x} = \mathbf{0}$, and satisfies the following two important
inequalities: For every $\mathbf{x}, \mathbf{y} \in E^n$,

$$|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}|\,|\mathbf{y}| \quad \text{(Cauchy’s inequality)}, \tag{1-1}$$

$$|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}| \quad \text{(triangle inequality)}. \tag{1-2}$$
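Proof of (1-1). If $\mathbf{y} = \mathbf{0}$, then both sides of (1-1) are 0. Therefore let us
suppose that $\mathbf{y} \ne \mathbf{0}$. For every scalar $t$,

$$(\mathbf{x} + t\mathbf{y}) \cdot (\mathbf{x} + t\mathbf{y}) = \mathbf{x} \cdot \mathbf{x} + 2t\,\mathbf{x} \cdot \mathbf{y} + t^2\,\mathbf{y} \cdot \mathbf{y},$$

since the inner product is commutative and distributive [Problem 2,* parts
(a), (b), (c)]. The left-hand side is $|\mathbf{x} + t\mathbf{y}|^2$, and $\mathbf{x} \cdot \mathbf{x} = |\mathbf{x}|^2$, $\mathbf{y} \cdot \mathbf{y} = |\mathbf{y}|^2$.
The right-hand side is quadratic in $t$, and has a minimum when

$$t = t_0 = -\frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{y}|^2}.$$

Substituting this expression for $t$, we find that

$$0 \le |\mathbf{x} + t_0\mathbf{y}|^2 = |\mathbf{x}|^2 - \frac{(\mathbf{x} \cdot \mathbf{y})^2}{|\mathbf{y}|^2},$$

$$|\mathbf{x} \cdot \mathbf{y}|^2 \le |\mathbf{x}|^2 |\mathbf{y}|^2.$$

The last inequality is equivalent to Cauchy’s inequality. §

From the proof we see that equality in Cauchy’s inequality is equivalent
to the fact that $|\mathbf{x} + t_0\mathbf{y}| = 0$, that is, that $\mathbf{x} + t_0\mathbf{y} = \mathbf{0}$. Thus, if $\mathbf{y} \ne \mathbf{0}$,
$|\mathbf{x} \cdot \mathbf{y}| = |\mathbf{x}|\,|\mathbf{y}|$ if and only if $\mathbf{x}$ is a scalar multiple of $\mathbf{y}$. If $\mathbf{x} \cdot \mathbf{y} = |\mathbf{x}|\,|\mathbf{y}|$, then
$\mathbf{x}$ is a nonnegative scalar multiple of $\mathbf{y}$ (and conversely).

*The problem number, for example, Problem 2, refers to the end of the section in
which it is cited, unless stated otherwise.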
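Proof of (1-2). We write, as before,

$$|\mathbf{x} + \mathbf{y}|^2 = (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) = \mathbf{x} \cdot \mathbf{x} + 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y}.$$

From Cauchy’s inequality,

$$|\mathbf{x} + \mathbf{y}|^2 \le |\mathbf{x}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| + |\mathbf{y}|^2,$$

or

$$|\mathbf{x} + \mathbf{y}|^2 \le (|\mathbf{x}| + |\mathbf{y}|)^2.$$

This is equivalent to the triangle inequality. §

If $\mathbf{y} \ne \mathbf{0}$, equality holds in (1-2) if and only if $\mathbf{x}$ is a nonnegative scalar
multiple of $\mathbf{y}$.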
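
A numerical spot check can make the two inequalities and the minimizing value $t_0$ from the proof more concrete. The sketch below is an illustration only, not a proof; the function names `dot` and `norm` and the sample vectors are assumptions chosen for the example.

```python
import math

def dot(x, y):
    """Standard euclidean inner product: sum of x^i * y^i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm |x| = (x . x)^(1/2)."""
    return math.sqrt(dot(x, x))

x = (1.0, 2.0, 2.0)
y = (3.0, 0.0, -4.0)
s = tuple(a + b for a, b in zip(x, y))  # the sum x + y

# Cauchy's inequality (1-1) and the triangle inequality (1-2).
cauchy_ok = abs(dot(x, y)) <= norm(x) * norm(y)
triangle_ok = norm(s) <= norm(x) + norm(y)

# The quadratic |x + t y|^2 is minimized at t0 = -(x . y) / |y|^2.
t0 = -dot(x, y) / dot(y, y)
r = tuple(a + t0 * b for a, b in zip(x, y))
# Difference between |x + t0 y|^2 and |x|^2 - (x . y)^2 / |y|^2;
# it should vanish up to rounding error.
gap = dot(r, r) - (dot(x, x) - dot(x, y) ** 2 / dot(y, y))
```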
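Using the fact that $|c\mathbf{x}| = |c|\,|\mathbf{x}|$, one can easily prove by induction on $m$
the following extension of the triangle inequality:

$$\Bigl|\sum_{j=1}^{m} c^j \mathbf{x}_j\Bigr| \le \sum_{j=1}^{m} |c^j|\,|\mathbf{x}_j| \tag{1-3}$$

for every choice of scalars $c^1, \dots, c^m$ and of vectors $\mathbf{x}_1, \dots, \mathbf{x}_m$. We recall
that $\sum_j c^j \mathbf{x}_j$ is called a linear combination of $\mathbf{x}_1, \dots, \mathbf{x}_m$.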
[Figure 1-1: points x, y, z, with sides labeled |x − y| and |y − z|.]
[Figure 1-2: vectors x and y issuing from 0.]

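The euclidean distance between $\mathbf{x}$ and $\mathbf{y}$ is $|\mathbf{x} - \mathbf{y}|$. If $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ are
vectors, then

$$\mathbf{x} - \mathbf{z} = (\mathbf{x} - \mathbf{y}) + (\mathbf{y} - \mathbf{z}).$$

Applying (1-2) to $\mathbf{x} - \mathbf{y}$ and $\mathbf{y} - \mathbf{z}$, we have

$$|\mathbf{x} - \mathbf{z}| \le |\mathbf{x} - \mathbf{y}| + |\mathbf{y} - \mathbf{z}|,$$

which justifies the name “triangle inequality.” See Fig. 1-1.
If $\mathbf{x}$ and $\mathbf{y}$ are nonzero vectors, the angle $\theta$ between $\mathbf{x}$ and $\mathbf{y}$ is defined
by the formula

$$\cos\theta = \frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}|\,|\mathbf{y}|}, \qquad 0 \le \theta \le \pi.$$

This formula agrees in dimensions n = 2, 3 with the one in elementary analytic
geometry. The vectors $\mathbf{x}$ and $\mathbf{y}$ are orthogonal if $\mathbf{x} \cdot \mathbf{y} = 0$, in other words,
if $\theta$ is a right angle. We have

$$|\mathbf{x} - \mathbf{y}|^2 = (\mathbf{x} - \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) = |\mathbf{x}|^2 + |\mathbf{y}|^2 - 2\,\mathbf{x} \cdot \mathbf{y},$$

or

$$|\mathbf{x} - \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2 - 2|\mathbf{x}|\,|\mathbf{y}| \cos\theta,$$

which is the law of cosines from trigonometry (Fig. 1-2).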
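
As a hedged illustration of the angle formula and the law of cosines, the following sketch computes $\theta$ for one pair of vectors and compares both sides of the identity. The name `angle` and the clamping of the cosine to $[-1, 1]$ (a guard against floating-point rounding) are choices made for this example, not part of the text.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle theta in [0, pi] between nonzero vectors x and y."""
    if norm(x) == 0.0 or norm(y) == 0.0:
        raise ValueError("the angle is defined only for nonzero vectors")
    c = dot(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1]: rounding can push c slightly outside that range.
    return math.acos(max(-1.0, min(1.0, c)))

x = (1.0, 0.0, 0.0)
y = (1.0, 1.0, 0.0)
theta = angle(x, y)

# Law of cosines: |x - y|^2 = |x|^2 + |y|^2 - 2 |x| |y| cos(theta).
diff = tuple(a - b for a, b in zip(x, y))
lhs = dot(diff, diff)
rhs = dot(x, x) + dot(y, y) - 2.0 * norm(x) * norm(y) * math.cos(theta)
```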

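Orthonormal bases. $E^n$ is an n-dimensional vector space, and any linearly
independent set of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ with n elements is a basis for it.
A basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ for $E^n$ is called orthonormal if $\mathbf{v}_i \cdot \mathbf{v}_j = \delta_{ij}$, where

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j, \end{cases} \qquad i, j = 1, \dots, n.$$

The symbol $\delta_{ij}$ was first introduced by the mathematician Kronecker, and
consequently is called “Kronecker’s delta.” The unit coordinate vectors

$$\mathbf{e}_1 = (1, 0, \dots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \dots, 0), \quad \dots, \quad \mathbf{e}_n = (0, \dots, 0, 1)$$

form the standard orthonormal basis for $E^n$. We have, for each $\mathbf{x} \in E^n$,

$$\mathbf{x} = \sum_{i=1}^{n} x^i \mathbf{e}_i.$$

For instance, $(2, -1, 3) = 2\mathbf{e}_1 - \mathbf{e}_2 + 3\mathbf{e}_3$.

If $\mathbf{v}$ is any unit vector ($|\mathbf{v}| = 1$), then $\mathbf{x} \cdot \mathbf{v}$ is the component of $\mathbf{x}$ with
respect to $\mathbf{v}$. Since $\mathbf{x} \cdot \mathbf{e}_i = x^i$, the components of $\mathbf{x}$ with respect to the standard
orthonormal basis vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$ are $x^1, \dots, x^n$. If $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is any
orthonormal basis for $E^n$, then $\mathbf{v}_1, \dots, \mathbf{v}_n$ are mutually orthogonal unit vectors.
Each $\mathbf{x} \in E^n$ can be uniquely represented as a linear combination

$$\mathbf{x} = c^1 \mathbf{v}_1 + \cdots + c^n \mathbf{v}_n. \tag{1-4}$$

Taking the inner product of each side with $\mathbf{v}_i$ and using the formula $\mathbf{v}_i \cdot \mathbf{v}_j = \delta_{ij}$,
we obtain

$$\mathbf{x} \cdot \mathbf{v}_i = c^i.$$

The coefficients $c^i$ in (1-4) are just the components of $\mathbf{x}$ with respect to the
orthonormal basis vectors.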
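
To see formula (1-4) at work, the sketch below takes a rotated orthonormal basis of $E^2$, computes the coefficients $c^i = \mathbf{x} \cdot \mathbf{v}_i$, and rebuilds $\mathbf{x}$ from them. The rotation angle and the vector `x` are arbitrary choices for illustration.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# An orthonormal basis of E^2 obtained by rotating e_1, e_2 through pi/6.
a = math.pi / 6
v = [(math.cos(a), math.sin(a)), (-math.sin(a), math.cos(a))]

x = (2.0, 1.0)

# Components of x with respect to the basis: c^i = x . v_i.
c = [dot(x, vi) for vi in v]

# Rebuild x as the linear combination c^1 v_1 + c^2 v_2, as in (1-4).
rebuilt = tuple(sum(c[i] * v[i][k] for i in range(2)) for k in range(2))
```

Up to rounding error, `rebuilt` agrees with `x`, which is the uniqueness statement of (1-4) checked in one instance.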
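PROBLEMS
1. Let n = 4, $\mathbf{x} = \mathbf{e}_1 - \mathbf{e}_2 + 2\mathbf{e}_4 = (1, -1, 0, 2)$, $\mathbf{y} = 3\mathbf{e}_1 - \mathbf{e}_2 + \mathbf{e}_3 + \mathbf{e}_4 =
(3, -1, 1, 1)$. Find $\mathbf{x} + \mathbf{y}$, $\mathbf{x} - \mathbf{y}$, $|\mathbf{x} + \mathbf{y}|$, $|\mathbf{x} - \mathbf{y}|$, $|\mathbf{x}|$, $|\mathbf{y}|$, $\mathbf{x} \cdot \mathbf{y}$. Verify (1-1)
and (1-2) in this example.
2. Prove that the standard euclidean inner product in $E^n$ has the following five
properties:
(a) $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$. (b) $(\mathbf{x} + \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot \mathbf{z} + \mathbf{y} \cdot \mathbf{z}$.
(c) $(c\mathbf{x}) \cdot \mathbf{y} = c(\mathbf{x} \cdot \mathbf{y})$. (d) $\mathbf{0} \cdot \mathbf{x} = 0$. (e) $\mathbf{x} \cdot \mathbf{x} > 0$ if $\mathbf{x} \ne \mathbf{0}$.
3. Using Problem 2, show that
$(\mathbf{w} + c\mathbf{x}) \cdot (\mathbf{y} + d\mathbf{z}) = \mathbf{w} \cdot \mathbf{y} + c\,\mathbf{x} \cdot \mathbf{y} + d\,\mathbf{w} \cdot \mathbf{z} + cd\,\mathbf{x} \cdot \mathbf{z}$.
4. Show that $2|\mathbf{x}|^2 + 2|\mathbf{y}|^2 = |\mathbf{x} + \mathbf{y}|^2 + |\mathbf{x} - \mathbf{y}|^2$. What does this say about
parallelograms? See Fig. 1-3.

[Figure 1-3]

5. Show that $|\mathbf{x} + \mathbf{y}|\,|\mathbf{x} - \mathbf{y}| \le |\mathbf{x}|^2 + |\mathbf{y}|^2$ with equality if and only if $\mathbf{x} \cdot \mathbf{y} = 0$.
What does this say about parallelograms?
6. Prove (1-3), using (1-2) and induction on $m$.
7. Let n = 4, and $\mathbf{v}_1 = \tfrac{1}{5}(3\mathbf{e}_1 + 4\mathbf{e}_3)$, $\mathbf{v}_2 = \tfrac{1}{5}(4\mathbf{e}_2 - 3\mathbf{e}_4)$, $\mathbf{v}_3 = (\sqrt{2}/10)(-4\mathbf{e}_1 +
3\mathbf{e}_2 + 3\mathbf{e}_3 + 4\mathbf{e}_4)$. Show that $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ are mutually orthogonal unit vectors.
Find a unit vector $\mathbf{v}_4$ such that $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$, $\mathbf{v}_4$ form an orthonormal basis for $E^4$.
8. Show that the distance between any two elements of an orthonormal basis for
$E^n$ is $\sqrt{2}$.
9. (Gram-Schmidt process.) Let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be a basis for $E^n$. Let $\mathbf{v}_1 = |\mathbf{x}_1|^{-1}\mathbf{x}_1$,
$\mathbf{y}_2 = \mathbf{x}_2 - (\mathbf{x}_2 \cdot \mathbf{v}_1)\mathbf{v}_1$, $\mathbf{v}_2 = |\mathbf{y}_2|^{-1}\mathbf{y}_2$, $\mathbf{y}_3 = \mathbf{x}_3 - (\mathbf{x}_3 \cdot \mathbf{v}_1)\mathbf{v}_1 - (\mathbf{x}_3 \cdot \mathbf{v}_2)\mathbf{v}_2$, $\mathbf{v}_3 =
|\mathbf{y}_3|^{-1}\mathbf{y}_3$, ..., $\mathbf{v}_n = |\mathbf{y}_n|^{-1}\mathbf{y}_n$. Show that $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is an orthonormal basis
for $E^n$.
Note: In this book, “Show that...” and “Prove that...” both mean “give a
valid mathematical proof.”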
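
The recursive construction in Problem 9 can be carried out numerically, which may help in checking hand computations; it does not replace the requested proof. The sketch below assumes vectors are tuples of floats. The name `gram_schmidt` and the tolerance `1e-12` used to detect linear dependence are assumptions of this example.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def gram_schmidt(xs):
    """Orthonormalize the basis xs following the recipe of Problem 9."""
    vs = []
    for x in xs:
        # y_k = x_k - sum over earlier v_j of (x_k . v_j) v_j
        y = x
        for v in vs:
            c = dot(x, v)
            y = tuple(yi - c * vi for yi, vi in zip(y, v))
        length = norm(y)
        if length < 1e-12:
            raise ValueError("input vectors are (numerically) linearly dependent")
        # v_k = |y_k|^{-1} y_k
        vs.append(tuple(yi / length for yi in y))
    return vs

basis = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
vs = gram_schmidt(basis)
# Matrix of inner products v_i . v_j; it should be close to the identity.
gram = [[dot(vi, vj) for vj in vs] for vi in vs]
```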

1-2 SETS, FUNCTIONS

In this section we have collected a number of basic definitions and some
notation which will be used repeatedly. We presume that the reader is
acquainted with the most elementary aspects of set theory. The symbols

$$\in, \quad \notin, \quad \cup, \quad \cap, \quad -, \quad \subset$$

stand, respectively, for is an element of, is not an element of, union, intersection,
difference, and inclusion. Sets will ordinarily be denoted by capital italicized
letters. A set will be described either by listing its elements or by some property
characterizing them. Thus $\{2, 5, 7\}$ is the set whose elements are the three
numbers 2, 5, and 7. If $S$ is a set and $\pi$ a property pertaining to elements of $S$,
then $\{p \in S : \pi\}$ denotes the set of all $p \in S$ with property $\pi$. For example,
$\{(x, y) \in E^2 : x^2 + y^2 = 1\}$ is the circle with center $(0, 0)$ and radius 1. The
set $\{(x, y) \in E^2 : x^2 + y^2 = -1\}$ is the empty set. The set $\{(x, y) \in E^2 :
x^2 + y^2 \ge 0\}$ is all of $E^2$.
When the set $S$ in question is clear from the context, we write simply
$\{p : \pi\}$.

Topology of $E^n$. By the $\delta$-neighborhood of a point $\mathbf{x}$, where $\delta > 0$, let us
mean the set of all $\mathbf{y}$ distant less than $\delta$ from $\mathbf{x}$. Thus, if $U$ denotes the
$\delta$-neighborhood of $\mathbf{x}$, then

$$U = \{\mathbf{y} : |\mathbf{y} - \mathbf{x}| < \delta\}.$$

A set $A \subset E^n$ is open if for every point $\mathbf{x} \in A$ there is a neighborhood $U$ of
$\mathbf{x}$ such that $U \subset A$. A set $A$ is closed if its complement $A^c = E^n - A$ is open.
The open sets determine what is called the topology of $E^n$. Such notions as
limit, continuity, connected set, and compact set can be expressed in terms of
open sets and are called topological notions. They are described in the
Appendix.
Using the triangle inequality it can be shown that any neighborhood $U$
is an open set (see Section A-3). We shall sometimes refer to neighborhoods
as open spherical n-balls. When n = 1, 2, 3, they are respectively open intervals
in $E^1$, open circular disks in $E^2$, and open spherical balls in $E^3$. A set of
the form $\{\mathbf{y} : |\mathbf{y} - \mathbf{x}| \le \delta\}$, where $\delta > 0$, is closed. Such sets are called closed
spherical n-balls.

Cartesian product sets. If $S$ and $T$ are sets, then the cartesian product set
$S \times T$ is formed by taking all ordered pairs $(p, q)$ where $p \in S$ and $q \in T$.
For example, if $S = \{1, 2, \dots, n\}$ and $T = \{1, 2, \dots, m\}$, then the elements
of $S \times T$ are pairs $(i, j)$ of positive integers with $1 \le i \le n$, $1 \le j \le m$. In
the same way, the plane $E^2$ is the cartesian product $E^1 \times E^1$.
If $S_1, \dots, S_n$ are sets, then the n-fold cartesian product $S_1 \times \cdots \times S_n$ is
formed by taking all (ordered) n-tuples $(p_1, \dots, p_n)$, where $p_i \in S_i$ for each
$i = 1, \dots, n$. In particular $E^n = E^1 \times \cdots \times E^1$.
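
For finite sets the cartesian product can be enumerated directly. The sketch below uses `itertools.product` from the Python standard library; the particular values of `n` and `m` and the small grid used in place of the infinite set $E^1 \times E^1 \times E^1$ are illustrative assumptions.

```python
from itertools import product

n, m = 3, 2
S = range(1, n + 1)  # S = {1, ..., n}
T = range(1, m + 1)  # T = {1, ..., m}

# All ordered pairs (i, j) with 1 <= i <= n and 1 <= j <= m.
pairs = list(product(S, T))

# E^3 = E^1 x E^1 x E^1 is infinite, so only a finite sample is listed here:
# the 3-fold product of the two-point set {0.0, 1.0} with itself.
grid = list(product([0.0, 1.0], repeat=3))
```

The list `pairs` has $nm$ elements, one for each choice of $i$ and $j$.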
Functions. A function $f$ assigns to each element $p$ of some set $S$ an element
$f(p)$ of another set $T$. The element $f(p)$ is called the value of $f$ at $p$.
This is not a satisfactory definition of “function” because of the ambiguity
of the word “assigns.” A more careful definition is the following. Any subset
$f$ of the cartesian product $S \times T$ is called a relation between $S$ and $T$. A relation
$f$ is called a function if for every $p \in S$ there is exactly one $q \in T$ such
that $(p, q) \in f$. This element $q$ is denoted by $f(p)$.
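
The set-theoretic definition can be tested mechanically on finite examples. In the sketch below a relation is modeled as a Python set of pairs; the names `is_function` and `value` and the sample relations `f`, `g`, `h` are hypothetical, introduced only for this illustration.

```python
def is_function(rel, S, T):
    """True if rel is a subset of S x T with exactly one pair (p, q) per p in S."""
    if any(p not in S or q not in T for p, q in rel):
        return False  # rel is not even a relation between S and T
    return all(len([q for (p2, q) in rel if p2 == p]) == 1 for p in S)

def value(rel, p):
    """The element q = f(p), assuming rel is a function."""
    (q,) = [q for (p2, q) in rel if p2 == p]
    return q

S = {1, 2, 3}
T = {"a", "b"}

f = {(1, "a"), (2, "a"), (3, "b")}            # a function from S into T
g = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}  # two values at p = 1
h = {(1, "a"), (2, "b")}                      # no value at p = 3
```

Here `f` passes the test, while `g` and `h` fail it for the two different reasons noted in the comments.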
The set $S$ is the domain of $f$. We shall sometimes say that $f$ is a function
from $S$ into $T$. If for every $q \in T$ there is some $p \in S$ such that $q = f(p)$,
then we say that $f$ is onto $T$. A function $f$ is univalent (or one-one) if $p_1 \ne p_2$
implies that $f(p_1) \ne f(p_2)$.
This book is about the calculus of functions whose domains are subsets
of $E^n$. Such functions are frequently called by the suggestive but imprecise
name, “functions of n real variables.” We may occasionally use this name in
passages intended to motivate a more careful discussion to follow. However,
we never try to make precise the phrase “n real variables.” It was only after
such vague terms as “variable” and “quantity” were abandoned that calculus
was put on a foundation acceptable by present-day standards.
A function $f$ from a set $S$ into $E^1$ is a real-valued function. When $S \subset E^1$,
$f$ is a real-valued “function of one real variable.” Among such functions are
the algebraic functions and elementary transcendental functions (sin, cos,
tan, log, etc.), which should be familiar from elementary calculus. The exponential
function is denoted in this book by “exp.” Thus $\exp x = e^x$, where
$e$ is the base for natural logarithms.
Functions with values in some euclidean $E^n$, $n > 1$, are called vector-valued
and will be indicated by boldface letters (say, $\mathbf{g}$). A vector-valued function $\mathbf{g}$
from a subset $A$ of some euclidean space into $E^n$ will also be called a transformation
from $A$ into $E^n$. By merely writing “transformation” in place of “vector-valued
function,” we have of course introduced no new mathematical idea.
However, the word “transformation” is supposed to have a geometric flavor
which aids intuition. Some authors say “mapping” instead of “transformation.”
The differential calculus of transformations is developed in Chapters 3 and 4.
If $\mathbf{f}$ and $\mathbf{g}$ are functions with the same domain $S$ and values in a vector
space $\mathcal{V}$, then the sum $\mathbf{f} + \mathbf{g}$ is defined by

$$(\mathbf{f} + \mathbf{g})(p) = \mathbf{f}(p) + \mathbf{g}(p)$$

for every $p \in S$. In particular, it makes sense to speak of sums of real-valued
functions or of transformations. If $\mathbf{f}$ has values in $\mathcal{V}$ and $\phi$ is real valued, then
$\phi\mathbf{f}$ is the $\mathcal{V}$-valued function given by

$$(\phi\mathbf{f})(p) = \phi(p)\mathbf{f}(p)$$

for every $p \in S$. If $\phi$ is a constant function, $\phi(p) = c$ for every $p \in S$, then
we write $c\mathbf{f}$ instead of $\phi\mathbf{f}$.
Often one is interested only in the values of a function $f$ for elements of
some subset $A$ of its domain. The restriction of $f$ to $A$ is the function with
domain $A$ and the same values as $f$ there. It is denoted by $f|A$. Thus

$$f|A = \{(p, f(p)) : p \in A\}.$$

For instance, if a real-valued function $f$ is integrated over an interval $J \subset E^1$,
then it is only $f|J$ which is important. The values of $f$ outside $J$ do not affect
the integral.
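
On a finite domain the pointwise sum, the product with a real-valued function, and the restriction $f|A$ can all be written out explicitly. The sketch below models a function as a Python dictionary mapping each $p$ to $f(p)$ and uses real values for simplicity; the helper names and sample functions are assumptions made for this illustration.

```python
def add_functions(f, g):
    """Pointwise sum (f + g)(p) = f(p) + g(p) on a common domain."""
    if f.keys() != g.keys():
        raise ValueError("f and g must have the same domain")
    return {p: f[p] + g[p] for p in f}

def multiply(phi, f):
    """Pointwise product (phi f)(p) = phi(p) f(p) on a common domain."""
    if phi.keys() != f.keys():
        raise ValueError("phi and f must have the same domain")
    return {p: phi[p] * f[p] for p in f}

def restrict(f, A):
    """The restriction f|A = {(p, f(p)) : p in A}."""
    return {p: q for p, q in f.items() if p in A}

S = [0, 1, 2, 3]
f = {p: float(p * p) for p in S}  # f(p) = p^2
g = {p: float(2 * p) for p in S}  # g(p) = 2p
phi = {p: 3.0 for p in S}         # the constant function phi(p) = 3

h = add_functions(f, g)
three_f = multiply(phi, f)        # what the text writes as cf with c = 3
f_on_A = restrict(f, {1, 2})
```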
Functions on cartesian products. Let $S_1$ and $S_2$ be sets, and $f$ a function
from the cartesian product $S_1 \times S_2$ into a set $T$. Given $p_1 \in S_1$, let $f(p_1, \cdot)$
denote the function from $S_2$ into $T$ whose value at each $p_2 \in S_2$ is $f(p_1, p_2)$.
Given $p_2 \in S_2$, the function $f(\cdot, p_2)$ from $S_1$ into $T$ is similarly defined.

1-3 LINEAR FUNCTIONS

Let $L$ be a real-valued function whose domain is $E^n$.

Definition. The function $L$ is linear if:

(a) $L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y})$ for every $\mathbf{x}, \mathbf{y} \in E^n$; and
(b) $L(c\mathbf{x}) = cL(\mathbf{x})$ for every $\mathbf{x} \in E^n$ and scalar $c$.

These two conditions are equivalent to the single condition $L(c\mathbf{x} + d\mathbf{y}) =
cL(\mathbf{x}) + dL(\mathbf{y})$ for every $\mathbf{x}, \mathbf{y} \in E^n$ and scalars $c$, $d$. By induction, if $L$ is linear,
then

$$L\Bigl(\sum_{j=1}^{m} c^j \mathbf{x}_j\Bigr) = \sum_{j=1}^{m} c^j L(\mathbf{x}_j) \tag{1-5}$$

for every $m$, $\mathbf{x}_1, \dots, \mathbf{x}_m \in E^n$, and scalars $c^1, \dots, c^m$. In words this states
that “$L$ of a linear combination of $\mathbf{x}_1, \dots, \mathbf{x}_m$ is the corresponding linear
combination of $L(\mathbf{x}_1), \dots, L(\mathbf{x}_m)$.”
If $a_1, \dots, a_n$ are real numbers, then the function $L$ defined by

$$L(\mathbf{x}) = a_1 x^1 + \cdots + a_n x^n, \tag{1-6}$$

for every $\mathbf{x} \in E^n$, is linear. Conversely, if $L$ is a linear function, let

$$a_i = L(\mathbf{e}_i), \qquad i = 1, \dots, n.$$

For each $\mathbf{x}$ we have

$$\mathbf{x} = x^1 \mathbf{e}_1 + \cdots + x^n \mathbf{e}_n.$$

Applying (1-5) with $c^j = x^j$, $\mathbf{x}_j = \mathbf{e}_j$, and $m = n$, we get

$$L(\mathbf{x}) = x^1 L(\mathbf{e}_1) + \cdots + x^n L(\mathbf{e}_n).$$

This has the form (1-6).

Let us denote the right-hand side of (1-6) by $\mathbf{a} \cdot \mathbf{x}$. We have proved:
Proposition 1. A real-valued function $L$ is linear if and only if there exist
real numbers $a_1, \dots, a_n$ such that $L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$ for every $\mathbf{x} \in E^n$. §
The object $\mathbf{a}$ will be called a covector and $a_1, \dots, a_n$ are its components.
Covectors are not elements of $E^n$, but belong to the n-dimensional vector space
$(E^n)^*$ dual to $E^n$. This will be explained in more detail later in the section.
Note that the components $x^i$ of a vector are denoted with superscripts and the
components $a_i$ of a covector with subscripts. The number $\mathbf{a} \cdot \mathbf{x}$ is called the
scalar product of the covector $\mathbf{a}$ and vector $\mathbf{x}$. The components of $\mathbf{a}$ satisfy the
formula $a_i = \mathbf{a} \cdot \mathbf{e}_i$.
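
The converse half of Proposition 1 is constructive: the components of the covector are read off by evaluating $L$ on the standard basis. The sketch below illustrates this for one linear function on $E^3$; the names `unit`, `covector_of`, and the particular `L` are assumptions made for the example, and the code indexes components from 0 rather than from 1.

```python
def unit(i, n):
    """Standard basis vector with 1.0 in position i (0-based) and 0.0 elsewhere."""
    return tuple(1.0 if k == i else 0.0 for k in range(n))

def covector_of(L, n):
    """Components a_i = L(e_i) of the covector representing a linear L."""
    return tuple(L(unit(i, n)) for i in range(n))

def scalar_product(a, x):
    """The scalar product a . x = a_1 x^1 + ... + a_n x^n."""
    return sum(ai * xi for ai, xi in zip(a, x))

def L(x):
    # A sample linear function on E^3 (hypothetical, for illustration only).
    return 2.0 * x[0] - x[1] + 4.0 * x[2]

a = covector_of(L, 3)
x = (1.0, 5.0, -2.0)
# For a linear L, L(x) and a . x agree at every x.
agree = L(x) == scalar_product(a, x)
```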
If $\mathbf{a}$ is a covector, then there is a vector $\mathbf{y}$ with the same components,
$y^i = a_i$ for $i = 1, \dots, n$. This vector $\mathbf{y}$ has the property that

$$\mathbf{y} \cdot \mathbf{x} = \sum_{i=1}^{n} y^i x^i = \sum_{i=1}^{n} a_i x^i = \mathbf{a} \cdot \mathbf{x}. \tag{1-7}$$

The $\cdot$ denotes the scalar product on the right-hand side and the euclidean
inner product on the left-hand side.
Since covectors can be changed into vectors by this simple device, it is
not immediately apparent why there is any need to distinguish between $E^n$
and its dual. In fact, by this device we shall avoid practically any mention of
covectors until Chapter 2. However, they become useful there in the definition
of differential (Section 2-2) and later in the statement of the chain rule
(Section 4-4). The distinction between vectors and covectors is essential in the
development of the last two chapters (6 and 7) of the book.
Note about terminology. What we call covectors are often called covariant
vectors. What we call vectors are then called contravariant vectors. A vector-valued
function (which we call a transformation) is often called a contravariant
vector field. Similarly a covector-valued function (which we shall call a differential
form of degree 1 in Section 2-6) is the same thing as a covariant vector
field.
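Hyperplanes, half-spaces. The solutions of a linear equation $\mathbf{a} \cdot \mathbf{x} = c$
form what is called a hyperplane (point for n = 1, line for n = 2, plane for
n = 3). More precisely:
Definition. A hyperplane in $E^n$ is a set of the form $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} = c\}$, where
$\mathbf{a} \ne \mathbf{0}$.

Of course, $\mathbf{a} \ne \mathbf{0}$ means that $a_i \ne 0$ for at least one $i$. If $b$ is a scalar,
then $ba_1, \dots, ba_n$ are the components of the covector $b\mathbf{a}$. For any nonzero
scalar $b$, $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} = c\} = \{\mathbf{x} : (b\mathbf{a}) \cdot \mathbf{x} = bc\}$. Thus the covector $\mathbf{a}$ and scalar $c$
defining a hyperplane are determined only up to a scalar multiple. If $c \ne 0$,
we may, for instance, always take $c = 1$.
A hyperplane $P = \{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} = c\}$ is parallel to $P_1 = \{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} = c_1\}$ for
any $c_1 \ne c$. If $c_1 = 0$, then $P_1$ contains $\mathbf{0}$, and $P_1$ is a vector subspace of $E^n$
of dimension $n - 1$. This last statement follows from Problem 5.
Example. Find the hyperplane $P$ in $E^4$ which contains the four points $\mathbf{e}_1$, $\mathbf{e}_1 +
2\mathbf{e}_2$, $\mathbf{e}_2 + 3\mathbf{e}_3$, $\mathbf{e}_3 + 4\mathbf{e}_4$. Every $\mathbf{x} \in P$ must satisfy the equation $\mathbf{a} \cdot \mathbf{x} = c$, where
$\mathbf{a}$ and $c$ must be found. Taking in turn $\mathbf{x} = \mathbf{e}_1$, $\mathbf{x} = \mathbf{e}_1 + 2\mathbf{e}_2$, ..., we obtain

$$c = \mathbf{a} \cdot \mathbf{e}_1 = a_1, \qquad c = \mathbf{a} \cdot (\mathbf{e}_1 + 2\mathbf{e}_2) = a_1 + 2a_2,$$
$$c = \mathbf{a} \cdot (\mathbf{e}_2 + 3\mathbf{e}_3) = a_2 + 3a_3, \qquad c = \mathbf{a} \cdot (\mathbf{e}_3 + 4\mathbf{e}_4) = a_3 + 4a_4.$$

From these equations

$$a_1 = c, \qquad a_2 = 0, \qquad a_3 = c/3, \qquad a_4 = c/6.$$

Taking for convenience $c = 6$, we have

$$P = \{\mathbf{x} : 6x^1 + 2x^3 + x^4 = 6\}.$$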
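
The back-substitution in the example above can be replayed with exact rational arithmetic, and the resulting covector can then be checked against the four given points. This is a minimal sketch; the function name `hyperplane_coeffs` is hypothetical, and `fractions.Fraction` is used only to avoid rounding.

```python
from fractions import Fraction

def hyperplane_coeffs(c):
    """Solve the four equations of the example in turn for a_1, ..., a_4."""
    a1 = c             # c = a_1
    a2 = (c - a1) / 2  # c = a_1 + 2 a_2
    a3 = (c - a2) / 3  # c = a_2 + 3 a_3
    a4 = (c - a3) / 4  # c = a_3 + 4 a_4
    return (a1, a2, a3, a4)

def scalar_product(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

a = hyperplane_coeffs(Fraction(6))

# The four points e1, e1 + 2 e2, e2 + 3 e3, e3 + 4 e4 of the example.
points = [(1, 0, 0, 0), (1, 2, 0, 0), (0, 1, 3, 0), (0, 0, 1, 4)]
values = [scalar_product(a, p) for p in points]
```

With $c = 6$ every entry of `values` equals 6, so each of the four points lies on $P$.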

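Definition. Let $\mathbf{a} \ne \mathbf{0}$. A closed half-space is a set of the form $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} \ge c\}$,
and an open half-space is a set of the form $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} > c\}$.

[Figure 1-4: a hyperplane P = {x : a · x = c} and a half-space bounded by it.]

See Fig. 1-4. A set $H$ of the form $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} \le c\}$, $\mathbf{a} \ne \mathbf{0}$, is a closed half-space,
since $H$ is also $\{\mathbf{x} : (-\mathbf{a}) \cdot \mathbf{x} \ge -c\}$. The same remark applies to
open half-spaces. A hyperplane $P = \{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} = c\}$ divides $E^n$ into two half-spaces.
More precisely, $E^n - P$ is the union of the open half-spaces

$$\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} > c\} \quad \text{and} \quad \{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} < c\}.$$

The definition suggests that a closed half-space is a closed set. To prove
this we use the fact that if $f$ is a continuous real-valued function with domain
$E^n$, then $\{\mathbf{x} : f(\mathbf{x}) \ge c\}$ is a closed set. See Section A-6. Every linear function is
continuous (Problem 4) and has domain $E^n$. Therefore letting $f(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$
for every $\mathbf{x} \in E^n$, we conclude that $\{\mathbf{x} : \mathbf{a} \cdot \mathbf{x} \ge c\}$ is closed. Similarly, every
open half-space is an open set.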

The dual space of $E^n$. Let us now give a more thorough description of the
space of covectors, dual to $E^n$. The reader may postpone this discussion and
study it together with later chapters.
According to Section A-2 every vector space $\mathcal{V}$ has a dual $\mathcal{V}^*$, whose elements
are the real-valued linear functions with domain $\mathcal{V}$. If $\mathcal{V}$ has finite
dimension n, then $\mathcal{V}^*$ also has dimension n. Moreover, given any basis for $\mathcal{V}$,
there is a dual basis for $\mathcal{V}^*$.
Now let $\mathcal{V} = E^n$.

Definition. A covector is an element of the space $(E^n)^*$ dual to $E^n$.

Thus a covector is just a real-valued linear function $L$. The basis dual to
the standard basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ for $E^n$ is the set $\{X^1, \dots, X^n\}$, where $X^i$ is
the linear function such that $X^i(\mathbf{x}) = x^i$ for every $\mathbf{x}$. This is seen from the
discussion of dual bases in A-2, taking the basis there to be $\mathbf{e}_1, \dots, \mathbf{e}_n$ and
the dual basis to be $X^1, \dots, X^n$.

The functions $X^1, \dots, X^n$ are called the standard cartesian coordinate functions.
In order to emphasize the duality between vectors and covectors, it is
desirable to change the notation for covector. From now on we shall ordinarily
denote covectors by $\mathbf{a}$, $\mathbf{b}$, ... rather than by, say, $L$. As in Proposition 1, we
write $\mathbf{a} \cdot \mathbf{x}$ for $L(\mathbf{x})$, and call $\mathbf{a} \cdot \mathbf{x}$ the scalar product. The basis dual to
$\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ will be denoted by $\{\mathbf{e}^1, \dots, \mathbf{e}^n\}$ rather than $\{X^1, \dots, X^n\}$.
The notation is chosen so that for every formula about vectors there will
be a corresponding formula about covectors obtained by interchanging subscripts
and superscripts. For instance, the components of a vector $\mathbf{x}$ satisfy
$x^i = X^i(\mathbf{x}) = \mathbf{e}^i \cdot \mathbf{x}$. The corresponding formula for the components of a
covector $\mathbf{a}$ is $a_i = \mathbf{a} \cdot \mathbf{e}_i$. In $(E^n)^*$ a euclidean inner product and norm are
defined in the same way as in $E^n$.
These facts are summarized in the table below.

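                           Vectors                                        Covectors

Standard bases             $\mathbf{e}_1, \dots, \mathbf{e}_n$                        $\mathbf{e}^1, \dots, \mathbf{e}^n$

                           $\mathbf{x} = \sum_{i=1}^{n} x^i \mathbf{e}_i$                     $\mathbf{a} = \sum_{j=1}^{n} a_j \mathbf{e}^j$

Euclidean inner product    $\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} x^i y^i$             $\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i$

Euclidean norm             $|\mathbf{x}|^2 = \mathbf{x} \cdot \mathbf{x}$                     $|\mathbf{a}|^2 = \mathbf{a} \cdot \mathbf{a}$

Scalar product             $\mathbf{a} \cdot \mathbf{x} = \sum_{i=1}^{n} a_i x^i$

                           $\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j$

                           $\mathbf{e}^i \cdot \mathbf{x} = x^i$                              $\mathbf{a} \cdot \mathbf{e}_i = a_i$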
In the table, $\delta^i_j = \delta_{ij}$ is Kronecker’s delta (Section 1-1). The scalar product
of a vector and a covector involves only the vector space structure of $E^n$ and
its dual. It does not depend on the fact that we chose the euclidean inner
product rather than some other inner product. If $E^n$ is given a noneuclidean
inner product, then the appropriate formula for changing covectors into vectors
is (1-14b) in Section 1-6.
One important fact about vectors and covectors is that their components
$x^i$ and $a_i$ change oppositely with respect to linear transformations. This will
be seen in Section 4-2.

PROBLEMS
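1. Let n = 3. Find the plane which contains the three points $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3 - 3\mathbf{e}_1$.
Sketch its intersection with the first octant in $E^3$.
2. (a) Find the hyperplane in $E^4$ containing the four points $\mathbf{0}$, $\mathbf{e}_1 + \mathbf{e}_2$, $\mathbf{e}_1 - \mathbf{e}_2 +
2\mathbf{e}_3$, $3\mathbf{e}_4 - \mathbf{e}_2$.
(b) Find the value of $t$ for which $t(\mathbf{e}_1 - \mathbf{e}_2) + (1 - t)\mathbf{e}_3$ is in this hyperplane.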
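3. Prove that any hyperplane is a closed set.

4. Prove that any linear function is continuous. [Hint: $|\mathbf{a} \cdot \mathbf{x} - \mathbf{a} \cdot \mathbf{y}| = |\mathbf{a} \cdot (\mathbf{x} - \mathbf{y})|$.
Use Cauchy’s inequality.]
5. Let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be a basis for $E^n$. Define $L$ by the formula
$L(c^1\mathbf{x}_1 + \cdots + c^n\mathbf{x}_n) = c^n$
for every $c^1, \dots, c^n$. Show that:
(a) $L$ is a linear function.
(b) The set $P = \{\mathbf{x} : L(\mathbf{x}) = 0\}$ is a hyperplane containing $\mathbf{0}, \mathbf{x}_1, \dots, \mathbf{x}_{n-1}$.
(c) $P$ is the only hyperplane containing these n points.
6. Let $\mathbf{x}_0, \mathbf{x}_1, \dots, \mathbf{x}_{n-1}$ be such that $\mathbf{x}_1 - \mathbf{x}_0, \dots, \mathbf{x}_{n-1} - \mathbf{x}_0$ are linearly independent.
Prove that there is exactly one hyperplane containing $\mathbf{x}_0, \mathbf{x}_1, \dots, \mathbf{x}_{n-1}$.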

1-4 CONVEX SETS


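In order to say what the term “convex set” means, let us first define the notion of
line segment.
Definition. Let $\mathbf{x}_1, \mathbf{x}_2 \in E^n$ with $\mathbf{x}_1 \ne \mathbf{x}_2$.
The line through $\mathbf{x}_1$ and $\mathbf{x}_2$ is

$$\{\mathbf{x} : \mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2,\ t \text{ any scalar}\}.$$

If we set $\mathbf{h} = \mathbf{x}_1 - \mathbf{x}_2$, then this can be rewritten as

$$\{\mathbf{x} : \mathbf{x} = \mathbf{x}_2 + t\mathbf{h},\ t \text{ any scalar}\}.$$

In the plane $E^2$ the vector equation $\mathbf{x} = \mathbf{x}_2 + t\mathbf{h}$ becomes

$$x = x_2 + t(x_1 - x_2), \qquad y = y_2 + t(y_1 - y_2),$$

which, in elementary analytic geometry, are called parametric equations of
the line through $(x_1, y_1)$ and $(x_2, y_2)$.
The line segment joining $\mathbf{x}_1$ and $\mathbf{x}_2$ is

$$\{\mathbf{x} : \mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2,\ t \in [0, 1]\},$$

where $[a, b]$ denotes the set of real numbers $t$ such that $a \le t \le b$ (Section A-1).
For example, if $t = \tfrac12$, then $\mathbf{x}$ is the midpoint of the line segment joining
$\mathbf{x}_1$ and $\mathbf{x}_2$ (Fig. 1-5). The points corresponding to $t = \tfrac13, \tfrac23$ trisect the line
segment.

[Figure 1-5]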
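The parametrization of the line through two points can be tried out numerically. The following is a minimal sketch, not part of the original text; the function name `point_on_line` and the sample points are assumptions chosen for illustration, and exact fractions are used so the midpoint and trisection points come out exactly.

```python
from fractions import Fraction

def point_on_line(x1, x2, t):
    """Return t*x1 + (1 - t)*x2, the point with parameter t on the line through x1 and x2."""
    return tuple(t * a + (1 - t) * b for a, b in zip(x1, x2))

# Two sample points of the plane (hypothetical data).
x1 = (Fraction(3), Fraction(0))
x2 = (Fraction(0), Fraction(6))

# t = 1/2 gives the midpoint; t = 1/3 and t = 2/3 trisect the segment.
midpoint = point_on_line(x1, x2, Fraction(1, 2))
trisection = [point_on_line(x1, x2, Fraction(k, 3)) for k in (1, 2)]
```

Values of `t` in $[0, 1]$ stay on the segment; other values of `t` give the remaining points of the line.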

Definition. Let $K \subset E^n$. Then $K$ is a convex set if the line segment joining
any two points of $K$ is contained in $K$ (Fig. 1-6).

[Figure 1-6: a convex set and a set that is not convex]

$E^n$ itself is a convex set. The empty set and sets with just one point trivially
satisfy the definition; hence they are convex. The reader should be able to
think of several kinds of geometric objects such as lines, planes, spherical balls,
regular solids, and so on, which appear to be convex sets. However, geometric
intuition is not always a reliable guide, especially in four or more dimensions.
In any case, intuition is no substitute for a proof that the set in question is
actually convex.
The convex subsets of $E^n$ have many remarkable geometric properties.
There is an extensive mathematical literature devoted to them [7],* [10], [13].
However, in the present section we shall go no further than to obtain a few
basic facts about convex sets which are useful in calculus. The main result
(Theorem 1) will be the characterization of closed convex sets as intersections
of closed half-spaces.
The definition of convex set makes sense in any vector space. During
recent years the study of convexity in infinite-dimensional vector spaces has
furnished powerful new tools in such diverse branches of mathematical analysis
as complex function theory, differential equations, and calculus of variations.
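Let us consider some familiar subsets of $E^n$ and prove that they are convex.
To show that a set $K$ is convex directly from the definition, we must verify
that for every $\mathbf{x}_1, \mathbf{x}_2 \in K$ and $t \in [0, 1]$, the point $\mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2$ also
belongs to $K$. In the definition, we assumed that $\mathbf{x}_1 \ne \mathbf{x}_2$. But if $\mathbf{x}_1 = \mathbf{x}_2$, it
is trivial that $\mathbf{x} \in K$, since $\mathbf{x} = \mathbf{x}_1 = \mathbf{x}_2$.
Example 1. Any closed half-space is a convex set. Let $H = \{\mathbf{x} : \mathbf{a}\cdot\mathbf{x} \ge c\}$,
$\mathbf{a} \ne \mathbf{0}$. Let $\mathbf{x}_1, \mathbf{x}_2 \in H$ and $\mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2$, where $t \in [0, 1]$. Then $\mathbf{a}\cdot\mathbf{x}_1 \ge c$
and $\mathbf{a}\cdot\mathbf{x}_2 \ge c$. Since $t \ge 0$, $t\,\mathbf{a}\cdot\mathbf{x}_1 \ge tc$; and since $1 - t \ge 0$, $(1 - t)\,\mathbf{a}\cdot\mathbf{x}_2 \ge (1 - t)c$.
Consequently,

$$\mathbf{a}\cdot\mathbf{x} = t\,\mathbf{a}\cdot\mathbf{x}_1 + (1 - t)\,\mathbf{a}\cdot\mathbf{x}_2 \ge tc + (1 - t)c = c.$$

This shows that $\mathbf{x} \in H$. Therefore $H$ is a convex set. Similarly, any hyperplane is a
convex set (Problem 2) and any open half-space is a convex set.
Example 2. Let $U$ be a neighborhood, namely, $U = \{\mathbf{x} : |\mathbf{x} - \mathbf{x}_0| < \delta\}$, for
some $\mathbf{x}_0$ and $\delta > 0$. To show that $U$ is a convex set, we proceed as in Example 1.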

*Numbers in brackets refer to references at the end of the book.



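Let $\mathbf{x}_1, \mathbf{x}_2 \in U$ and $\mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2$, where $t \in [0, 1]$. Then

$$|\mathbf{x}_1 - \mathbf{x}_0| < \delta, \qquad |\mathbf{x}_2 - \mathbf{x}_0| < \delta,$$
$$\mathbf{x} - \mathbf{x}_0 = t(\mathbf{x}_1 - \mathbf{x}_0) + (1 - t)(\mathbf{x}_2 - \mathbf{x}_0),$$
$$|\mathbf{x} - \mathbf{x}_0| \le t|\mathbf{x}_1 - \mathbf{x}_0| + (1 - t)|\mathbf{x}_2 - \mathbf{x}_0| < \delta.$$

Hence $\mathbf{x} \in U$.
Example 3. Let $n = 1$. The nonempty convex subsets of $E^1$ are just the intervals.
(See Section A-7 for the definition of interval.)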

In more complicated examples it is inconvenient to verify convexity directly
from the definition. Instead, it is easier to apply one of the following criteria
for convexity.
Proposition 2a. If $K_1, \ldots, K_m$ are convex sets, then their intersection
$K_1 \cap \cdots \cap K_m$ is convex.
Proof. Let $\mathbf{x}_1, \mathbf{x}_2$ be any two points of $K_1 \cap \cdots \cap K_m$, $\mathbf{x}_1 \ne \mathbf{x}_2$. Let $l$
denote the line segment joining $\mathbf{x}_1$ and $\mathbf{x}_2$. For each $j = 1, \ldots, m$,
$\mathbf{x}_1, \mathbf{x}_2 \in K_j$. Since each $K_j$ is convex, $l \subset K_j$ for each $j = 1, \ldots, m$. Thus
$l \subset K_1 \cap \cdots \cap K_m$. ∎
A set which is the intersection of a finite number of closed half-spaces is
called a convex polytope. Since a half-space is a convex set, any convex polytope
is a convex set by Proposition 2a.
Example 4. Let $T$ be a triangle in the plane $E^2$. Then $T$ is the intersection of
three half-planes, bounded by the lines through the sides of $T$.
A convex polytope is the set of all points $\mathbf{x}$ which satisfy a given finite
system of linear inequalities of the form $\mathbf{a}^j\cdot\mathbf{x} \ge c^j$, $j = 1, \ldots, m$. The
theory of linear programming is concerned with the problem of maximizing or
minimizing a linear function subject to such a system of linear inequalities.
It has various interesting economic and engineering applications (see references
[10] and [13]). In Section 2-5, it is shown that the maximum and minimum
values of a linear function must occur at “extreme points” of $K$, at least if $K$
is compact.
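The description of a convex polytope by a finite system of linear inequalities translates directly into a membership test. The sketch below is illustrative only: the function name `in_polytope` and the particular triangle are assumptions, not taken from the text. Each constraint is stored as a pair $(\mathbf{a}, c)$ meaning $\mathbf{a}\cdot\mathbf{x} \ge c$.

```python
def in_polytope(x, constraints):
    """Return True if a . x >= c for every pair (a, c) in constraints."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) >= c for a, c in constraints)

# The triangle with vertices (0,0), (1,0), (0,1), written as the intersection
# of three closed half-planes:  x >= 0,  y >= 0,  -x - y >= -1.
triangle = [((1, 0), 0), ((0, 1), 0), ((-1, -1), -1)]

inside = in_polytope((0.25, 0.25), triangle)   # an interior point
on_edge = in_polytope((1, 0), triangle)        # a vertex, still in the closed set
outside = in_polytope((1, 1), triangle)        # violates -x - y >= -1
```

This is the same representation used in Example 4, where a triangle is the intersection of three half-planes.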
In the proof of Proposition 2a we did not really use the fact that the number
of sets $K_j$ is finite. Therefore we have:
Proposition 2b. The intersection of any collection of convex sets is a convex
set.

In particular, the intersection of any collection of half-spaces is convex.


The intersection of any collection of closed sets is a closed set. Hence, if each
of the half-spaces is closed, the intersection is a closed, convex set. An important
fact about closed convex sets is that, excluding trivial cases, the converse holds.
The converse can be stated in a slightly sharper form, in that only half-spaces
bounded by supporting hyperplanes need be used (Theorem 1). In order to
do this we first state the following.

Definition. Let $K$ be a closed convex set. Assume that $K$ is neither the
empty set nor $E^n$. A hyperplane $P$ is called supporting for $K$ if $P \cap K$
is not empty and $K$ is contained in one of the two half-spaces bounded
by $P$.

If $P$ is supporting for $K$, the set $P \cap K$ is convex by Proposition 2a, and
contains only boundary points of $K$ (Problem 7). Moreover, given any boundary
point of $K$, there is at least one supporting hyperplane containing it (Problem 8).
If $K$ has interior points and the boundary $\operatorname{fr} K$ is “sufficiently smooth,”
then given $\mathbf{y} \in \operatorname{fr} K$, there is just one supporting hyperplane containing $\mathbf{y}$.
It is the tangent hyperplane to $\operatorname{fr} K$ at $\mathbf{y}$, and can be found by the methods of
calculus. This will be explained in Section 4-7.
If, for example, $T$ is a triangle in $E^2$, then each vertex is contained in an
infinite number of supporting lines to $T$. Each other boundary point is contained
in a single supporting line, the line through the edge of $T$ containing
it (Fig. 1-7).
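[Figure 1-7: supporting lines to a triangle]

Example 5. Let $B = \{\mathbf{x} : |\mathbf{x}| \le 1\}$, the closed unit $n$-ball. Then

$$\operatorname{fr} B = \{\mathbf{x} : |\mathbf{x}| = 1\}.$$

Given $\mathbf{y} \in \operatorname{fr} B$, let

$$H_{\mathbf{y}} = \{\mathbf{x} : \mathbf{y}\cdot\mathbf{x} \le 1\}, \qquad P_{\mathbf{y}} = \{\mathbf{x} : \mathbf{y}\cdot\mathbf{x} = 1\},$$

so that $P_{\mathbf{y}}$ is the hyperplane bounding $H_{\mathbf{y}}$. By Cauchy's inequality and the fact
that $|\mathbf{y}| = 1$,

$$\mathbf{y}\cdot\mathbf{x} \le |\mathbf{y}|\,|\mathbf{x}| = |\mathbf{x}|.$$

Equality holds if and only if $\mathbf{x}$ is a positive scalar multiple of $\mathbf{y}$. Hence $B \subset H_{\mathbf{y}}$
and $B \cap P_{\mathbf{y}}$ consists only of $\mathbf{y}$. The supporting hyperplane to $B$ at $\mathbf{y}$ is $P_{\mathbf{y}}$ (Fig. 1-8).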

Again let $K$ be any nonempty, closed convex set which is a proper subset
of $E^n$ ($K \ne E^n$). Let $\mathcal{H}_K$ denote the collection of all closed half-spaces $H$ such
that $K \subset H$ and the hyperplane $P$ bounding $H$ is supporting for $K$. For
instance, the collection $\mathcal{H}_B$ in the above example consists of the various half-spaces
$H_{\mathbf{y}}$ for all possible choices of $\mathbf{y} \in \operatorname{fr} B$.
The notation

$$\bigcap_{H \in \mathcal{H}_K} H$$

stands for the intersection of all half-spaces $H \in \mathcal{H}_K$ (see Section A-3).

Theorem 1. $K = \bigcap_{H \in \mathcal{H}_K} H$.

[Figures 1-8 and 1-9]

Proof. For convenience let us set

$$K_1 = \bigcap_{H \in \mathcal{H}_K} H.$$

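Since $K \subset H$ for each $H \in \mathcal{H}_K$, $K \subset K_1$. Let us show that $K = K_1$.
Suppose it is not. Then there exists some $\mathbf{x}_1 \in K_1 - K$. Since $K$ is a
closed set, there is a point $\mathbf{x}_0 \in K$ nearest $\mathbf{x}_1$, that is, $|\mathbf{x} - \mathbf{x}_1| \ge |\mathbf{x}_0 - \mathbf{x}_1|$
for every $\mathbf{x} \in K$ (Fig. 1-9). See Problem 5(c), Section A-8. Consider the
closed half-space

$$H_0 = \{\mathbf{x} : (\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x} - \mathbf{x}_0) \ge 0\}.$$

Then $\mathbf{x}_1 \notin H_0$ since

$$(\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x}_1 - \mathbf{x}_0) = -|\mathbf{x}_0 - \mathbf{x}_1|^2 < 0.$$

To show that $K \subset H_0$ let $\mathbf{x}$ be any point of $K$. Since $K$ is convex,
$t\mathbf{x} + (1 - t)\mathbf{x}_0 \in K$ for every $t \in [0, 1]$. Then

$$|t\mathbf{x} + (1 - t)\mathbf{x}_0 - \mathbf{x}_1|^2 \ge |\mathbf{x}_0 - \mathbf{x}_1|^2,$$

or

$$|(\mathbf{x}_0 - \mathbf{x}_1) + t(\mathbf{x} - \mathbf{x}_0)|^2 \ge |\mathbf{x}_0 - \mathbf{x}_1|^2,$$
$$|\mathbf{x}_0 - \mathbf{x}_1|^2 + 2t(\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x} - \mathbf{x}_0) + t^2|\mathbf{x} - \mathbf{x}_0|^2 \ge |\mathbf{x}_0 - \mathbf{x}_1|^2.$$

Subtracting $|\mathbf{x}_0 - \mathbf{x}_1|^2$ from both sides and dividing by $t$, we get for $0 < t \le 1$

$$2(\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x} - \mathbf{x}_0) + t|\mathbf{x} - \mathbf{x}_0|^2 \ge 0.$$

Letting $t \to 0^+$ we find that $2(\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x} - \mathbf{x}_0) \ge 0$, which shows that
$\mathbf{x} \in H_0$. Thus $K \subset H_0$. The boundary of $H_0$ is

$$P_0 = \{\mathbf{x} : (\mathbf{x}_0 - \mathbf{x}_1)\cdot(\mathbf{x} - \mathbf{x}_0) = 0\},$$

and $\mathbf{x}_0 \in P_0$. Hence $P_0 \cap K$ is not empty, and therefore $P_0$ is supporting for $K$.
This shows that $H_0 \in \mathcal{H}_K$. Consequently, $K_1 \subset H_0$. But $\mathbf{x}_1 \in K_1$, $\mathbf{x}_1 \notin H_0$,
a contradiction. Therefore $K = K_1$. ∎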
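The construction in the proof (take the nearest point $\mathbf{x}_0$ of $K$ to an outside point $\mathbf{x}_1$, then use $\mathbf{x}_0 - \mathbf{x}_1$ as the normal of a half-space) can be checked on a concrete set. The sketch below assumes, purely for illustration, that $K$ is the unit cube $[0,1]^3$, for which the nearest point is obtained by clamping each coordinate; the function names are hypothetical.

```python
from itertools import product

def nearest_point_in_cube(x1):
    """Nearest point of the unit cube [0,1]^n to x1 (coordinatewise clamping)."""
    return tuple(min(1.0, max(0.0, c)) for c in x1)

def separating_halfspace(x1):
    """Return (normal, x0) describing H0 = {x : normal . (x - x0) >= 0}."""
    x0 = nearest_point_in_cube(x1)
    normal = tuple(a - b for a, b in zip(x0, x1))   # x0 - x1, as in the proof
    return normal, x0

def in_halfspace(x, normal, x0):
    """Test the inequality normal . (x - x0) >= 0."""
    return sum(n * (a - b) for n, a, b in zip(normal, x, x0)) >= 0

x1 = (2.0, 0.5, -1.0)                      # a point outside the cube
normal, x0 = separating_halfspace(x1)
vertices = list(product((0.0, 1.0), repeat=3))
cube_inside = all(in_halfspace(v, normal, x0) for v in vertices)
```

Since $H_0$ is convex, it contains the whole cube as soon as it contains the eight vertices, while $\mathbf{x}_1$ itself fails the inequality.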

We give an example to illustrate the theorem.

Example 6. Let $f$ be a real-valued function with domain $E^1$, which has everywhere a
derivative $f'(x)$. Assume that $f'$ is an increasing function. Let

$$A = \{(x, y) : y \ge f(x)\}.$$

For each $s$ consider the closed half-space $H_s$ above the tangent line to $f$ at $(s, f(s))$,
$H_s = \{(x, y) : y - f(s) \ge f'(s)(x - s)\}$. (See Fig. 1-10.) Consider $(x, y) \in A$ and
suppose first that $x > s$. By the mean value theorem, $f(x) - f(s) = f'(t)(x - s)$,
where $s < t < x$. Since $f'$ is increasing, $f'(t) > f'(s)$. Hence

$$y - f(s) \ge f(x) - f(s) > f'(s)(x - s),$$

which shows that $(x, y) \in H_s$. Similarly, $(x, y) \in H_s$ if $x < s$ or if $x = s$. Therefore
$A \subset H_s$ for every $s \in E^1$. If $(x, y) \notin A$, then $y < f(x)$, and $(x, y) \notin H_s$ for $s = x$.
Therefore $A$ is the intersection of the collection of closed half-spaces $H_s$. The tangent
line at $(s, f(s))$ is supporting for $A$ at that point. The collection $\mathcal{H}_A$ consists of the
half-spaces $H_s$.

[Figure 1-10]

A sufficient condition that $f'$ be an increasing function is that $f''(x) > 0$ for every
$x$. More generally, $f'$ is increasing if $f''(x) \ge 0$ for every $x$ and each point where
$f''(x) = 0$ is isolated.
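As a numerical illustration of Example 6, the following sketch takes $f(x) = e^x$ (an assumed choice, for which $f' = f$ is increasing) and tests membership in the half-planes $H_s$. The helper name `in_H` is hypothetical.

```python
import math

def in_H(s, x, y, f=math.exp, fprime=math.exp):
    """True if (x, y) lies in H_s, the closed half-plane on or above the
    tangent line to the graph of f at (s, f(s))."""
    return y - f(s) >= fprime(s) * (x - s)

sample_s = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0, 3.0]

# The point (1, e) is on the graph, so it belongs to A and to every H_s.
on_graph = all(in_H(s, 1.0, math.exp(1.0)) for s in sample_s)

# The point (1, 2) lies below the graph (2 < e), and H_s with s = 1 excludes it.
below_graph = in_H(1.0, 1.0, 2.0)
```

This mirrors the two halves of the argument: points of $A$ satisfy every tangent-line inequality, and a point below the graph fails the inequality for $s = x$.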

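Convex combinations. The definition of convex set is expressed in terms
of pairs of points. It can also be given in terms of convex combinations of any
finite number $m$ of points. Let $\mathbf{x}_1, \ldots, \mathbf{x}_m$ be distinct points ($\mathbf{x}_j \ne \mathbf{x}_k$ if $j \ne k$).

Definition. A point $\mathbf{x}$ is a convex combination of $\mathbf{x}_1, \ldots, \mathbf{x}_m$ if there exist
scalars $t^1, \ldots, t^m$ such that

$$\mathbf{x} = \sum_{j=1}^{m} t^j\mathbf{x}_j, \qquad \sum_{j=1}^{m} t^j = 1, \qquad t^j \ge 0 \ \text{ for } j = 1, \ldots, m.$$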

To say that $\mathbf{x}$ is a convex combination of two points of $S$ is merely to say
that $\mathbf{x}$ lies on some line segment with endpoints in $S$. For instance, if $S$ is the
circle with equation $x^2 + y^2 = a^2$, then every point in the circular disk
$\{(x, y) : x^2 + y^2 \le a^2\}$ bounded by $S$ is a convex combination of two points
of $S$.
On the other hand, if $S$ consists of three noncollinear points $(x_0, y_0)$,
$(x_1, y_1)$, $(x_2, y_2)$, then each boundary point of the triangle with these points
as vertices is a convex combination of two points of $S$, but the interior points
are not. However, each interior point $(x, y)$ is a convex combination of $(x_2, y_2)$
and some point $(u, v)$ on the edge opposite $(x_2, y_2)$ (Fig. 1-11). Since $(u, v)$ is
a convex combination of $(x_0, y_0)$ and $(x_1, y_1)$, we can write $(x, y)$ as a convex
combination of the three points $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$ as follows. Writing
$\mathbf{x} = (x, y)$, $\mathbf{x}_j = (x_j, y_j)$, there exist $s, t \in [0, 1]$ such that

$$\mathbf{x} = t[s\mathbf{x}_0 + (1 - s)\mathbf{x}_1] + (1 - t)\mathbf{x}_2 = t^0\mathbf{x}_0 + t^1\mathbf{x}_1 + t^2\mathbf{x}_2,$$

where $t^0 = ts$, $t^1 = t(1 - s)$, $t^2 = 1 - t$ are nonnegative and $t^0 + t^1 + t^2 = 1$.


Proposition 3. A set $K$ is convex if and only if every convex combination of
points of $K$ is a point of $K$.

Proof. Let $K$ be convex. Let us prove by induction on $m$ that if $\mathbf{x}$ is any
convex combination of $\mathbf{x}_1, \ldots, \mathbf{x}_m \in K$, then $\mathbf{x} \in K$. The case $m = 1$ is
trivial. Assuming the result true for the integer $m \ge 1$, let $\mathbf{x}$ be a convex
combination of points $\mathbf{x}_1, \ldots, \mathbf{x}_{m+1}$ of $K$,

$$\mathbf{x} = \sum_{j=1}^{m+1} t^j\mathbf{x}_j, \qquad \sum_{j=1}^{m+1} t^j = 1, \qquad t^j \ge 0 \ \text{ for } j = 1, \ldots, m + 1.$$

If $t^{m+1} = 1$, then $t^j = 0$ for $j \le m$ and $\mathbf{x} = \mathbf{x}_{m+1}$ is in $K$. If $t^{m+1} < 1$, let

$$t = 1 - t^{m+1}, \qquad s^j = t^j/t \ \text{ for } j = 1, \ldots, m, \qquad \mathbf{y} = \sum_{j=1}^{m} s^j\mathbf{x}_j.$$

Then $\mathbf{y}$ is a convex combination of $\mathbf{x}_1, \ldots, \mathbf{x}_m$. By the induction hypothesis,
$\mathbf{y} \in K$. But

$$\mathbf{x} = t\mathbf{y} + (1 - t)\mathbf{x}_{m+1}$$

and $t \in [0, 1]$. Hence $\mathbf{x} \in K$.

[Figures 1-11 and 1-12]

Conversely, assume that every convex combination of points of $K$ is a
point of $K$. In particular this is true for convex combinations of any pair
$\mathbf{x}_1, \mathbf{x}_2$ of points of $K$. Hence $K$ is convex. ∎
Let $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r$, $r \le n$, be distinct points of $E^n$ such that the differences
$\mathbf{x}_1 - \mathbf{x}_0, \ldots, \mathbf{x}_r - \mathbf{x}_0$ form a linearly independent set. The set of
all convex combinations of $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r$ is called the $r$-simplex with vertices
$\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r$. A 1-simplex is a line segment, a 2-simplex a triangle, and a
3-simplex a tetrahedron. According to Proposition 3, any simplex whose vertices
lie in a convex set $K$ is contained in $K$ (Fig. 1-12).
A point $\mathbf{x}$ of an $r$-simplex can be written in a unique way as a convex
combination

$$\mathbf{x} = \sum_{j=0}^{r} t^j\mathbf{x}_j$$

of the vertices $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r$ (Problem 6). The numbers $t^0, t^1, \ldots, t^r$ are called
the barycentric coordinates of $\mathbf{x}$. The $(r - 1)$-dimensional face opposite the
vertex $\mathbf{x}_i$ is the set of points of the $r$-simplex with $t^i = 0$.
For example, the vertices $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2$ of a triangle have barycentric coordinates
$(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$, respectively. The midpoint of the face opposite
$\mathbf{x}_0$ has barycentric coordinates $(0, \tfrac12, \tfrac12)$. The interior points of the triangle
have barycentric coordinates $(t^0, t^1, t^2)$, all of which are strictly positive. In
each case $t^0 + t^1 + t^2 = 1$.
The simplex with vertices $\mathbf{0}, \mathbf{e}_1, \ldots, \mathbf{e}_n$ is called the standard $n$-simplex.
It will be denoted by $\Sigma$, and will be of use later in Section 5-7 in connection with
integration. The barycentric coordinates $(t^0, t^1, \ldots, t^n)$ of a point $\mathbf{x} \in \Sigma$ are
given by $t^i = x^i$ for $i = 1, \ldots, n$, $t^0 = 1 - (x^1 + \cdots + x^n)$.
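The formula for barycentric coordinates in the standard simplex is simple enough to compute directly. The sketch below is an illustration under assumed names (`barycentric_standard`, `from_barycentric`); it uses exact fractions and the standard 2-simplex with vertices $\mathbf{0}, \mathbf{e}_1, \mathbf{e}_2$.

```python
from fractions import Fraction

def barycentric_standard(x):
    """Barycentric coordinates [t0, t1, ..., tn] of x in the standard n-simplex:
    t_i = x_i for i >= 1 and t_0 = 1 - (x_1 + ... + x_n)."""
    t = [1 - sum(x)] + list(x)
    if any(c < 0 for c in t):
        raise ValueError("point is outside the standard simplex")
    return t

def from_barycentric(t, vertices):
    """Convex combination sum_j t[j] * vertices[j]."""
    n = len(vertices[0])
    return tuple(sum(tj * v[i] for tj, v in zip(t, vertices)) for i in range(n))

x = (Fraction(1, 4), Fraction(1, 2))
t = barycentric_standard(x)
vertices = [(0, 0), (1, 0), (0, 1)]          # 0, e1, e2
rebuilt = from_barycentric(t, vertices)      # should reproduce x
```

Recombining the vertices with the computed weights returns the original point, which is the uniqueness statement of Problem 6 seen from the other side.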
*Further results about convex combinations. In the definition of convex
combination, no upper bound was put on the positive integer $m$. However,
for most purposes one need consider only $m \le n + 1$. More precisely:
Proposition. If $S \subset E^n$ and $\mathbf{x}$ is a convex combination of points of $S$, then
$\mathbf{x}$ is a convex combination of $n + 1$ or fewer points of $S$.

Proof. Let $\mathbf{x}$ be a convex combination with $m > n + 1$, $t^j > 0$ for $j = 1, \ldots, m$,
and $\mathbf{x}_1, \ldots, \mathbf{x}_m \in S$. Let us show that $\mathbf{x}$ is a convex combination
of $m - 1$ points of $S$. Since $m - 1 > n$, there exist $c^1, \ldots, c^{m-1}$ not all 0
such that

$$c^1(\mathbf{x}_1 - \mathbf{x}_m) + \cdots + c^{m-1}(\mathbf{x}_{m-1} - \mathbf{x}_m) = \mathbf{0}.$$

Let $c^m = -(c^1 + \cdots + c^{m-1})$. Then

$$\sum_{j=1}^{m} c^j\mathbf{x}_j = \mathbf{0}, \qquad \sum_{j=1}^{m} c^j = 0.$$

Let

$$s^j = t^j - \alpha c^j, \qquad j = 1, \ldots, m,$$

where $\alpha$ is a positive number chosen so that $s^j \ge 0$ for each $j = 1, \ldots, m$ and $s^k = 0$
for some $k$. Explicitly,

$$\frac{1}{\alpha} = \max\left\{\frac{c^1}{t^1}, \ldots, \frac{c^m}{t^m}\right\}.$$

Then

$$\mathbf{x} = \sum_{j \ne k} s^j\mathbf{x}_j, \qquad \sum_{j \ne k} s^j = 1,$$

and consequently $\mathbf{x}$ is a convex combination of the $m - 1$ points $\mathbf{x}_1, \ldots, \mathbf{x}_{k-1}$,
$\mathbf{x}_{k+1}, \ldots, \mathbf{x}_m$.
Either $m - 1 = n + 1$, or else the same argument shows that $\mathbf{x}$ is a
convex combination of $m - 2$ points of $S$. Continuing, we find that $\mathbf{x}$ must
be a convex combination of $n + 1$ or fewer points of $S$.

[Figure 1-13]
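The proof is constructive, so the reduction step can be carried out mechanically. The following is a rough sketch under stated assumptions, not an implementation taken from the text: the names `null_vector` and `caratheodory` are hypothetical, the linear dependence is found by Gauss-Jordan elimination over exact fractions, and the step uses the update $s^j = t^j - \alpha c^j$ with $1/\alpha = \max_j c^j/t^j$.

```python
from fractions import Fraction

def null_vector(columns):
    """Return a nonzero list c with sum_j c[j] * columns[j] = 0.

    Assumes there are more columns than coordinates, so a solution exists.
    """
    k, n = len(columns), len(columns[0])
    rows = [[Fraction(columns[j][i]) for j in range(k)] for i in range(n)]
    pivots = []                      # pivots[i] is the pivot column of rows[i]
    for col in range(k):
        r = len(pivots)
        if r == n:
            break
        p = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if p is None:
            continue                 # no pivot here: col stays a free column
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
    free = next(j for j in range(k) if j not in pivots)
    c = [Fraction(0)] * k
    c[free] = Fraction(1)            # set one free variable to 1, the others to 0
    for i, col in enumerate(pivots):
        c[col] = -rows[i][free]
    return c

def caratheodory(points, weights):
    """Rewrite sum_j weights[j] * points[j] using at most n + 1 of the points."""
    pts = [tuple(Fraction(v) for v in p) for p in points]
    ts = [Fraction(w) for w in weights]
    n = len(pts[0])
    pts, ts = [p for p, t in zip(pts, ts) if t > 0], [t for t in ts if t > 0]
    while len(pts) > n + 1:
        last = pts[-1]
        diffs = [tuple(a - b for a, b in zip(p, last)) for p in pts[:-1]]
        c = null_vector(diffs)
        c.append(-sum(c))            # now sum c = 0 and sum c[j] * pts[j] = 0
        inv_alpha = max(cj / tj for cj, tj in zip(c, ts))   # positive
        s = [tj - cj / inv_alpha for cj, tj in zip(c, ts)]
        keep = [j for j, sj in enumerate(s) if sj > 0]      # drops at least one point
        pts, ts = [pts[j] for j in keep], [s[j] for j in keep]
    return pts, ts

# The center of a square, written with all four corners, needs at most three.
corners = [(0, 0), (2, 0), (0, 2), (2, 2)]
pts, ts = caratheodory(corners, [Fraction(1, 4)] * 4)
```

Each pass through the loop removes at least one point while keeping the weights nonnegative with sum 1, exactly as in the proof.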

If $S$ is the set of vertices of an $n$-simplex $T$ and each of the barycentric
coordinates of $\mathbf{x}$ is positive, then $\mathbf{x}$ is not a convex combination of fewer than
$n + 1$ points of $S$. Hence the number $n + 1$ is the best possible in the proposition.
However, if $S$ is a connected set, then $n + 1$ can be replaced by $n$.
This is proved as follows. Suppose that $\mathbf{x}^*$ is a point which is a convex combination
of $n + 1$ points $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$ of a connected set $S$, but not of fewer
than $n + 1$ points of $S$. The differences $\mathbf{x}_1 - \mathbf{x}_0, \ldots, \mathbf{x}_n - \mathbf{x}_0$ form a linearly
independent set; for if not, the reasoning used to prove the proposition above
shows that $\mathbf{x}^*$ is a convex combination of $n$ of the points $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$. Therefore
$\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$ are the vertices of an $n$-simplex $T$, and all the barycentric
coordinates of $\mathbf{x}^*$ are positive. Let $T_0$ be the face of $T$ opposite $\mathbf{x}_0$, and

$$K_0 = \{\mathbf{x} : \mathbf{x}^* = t\mathbf{x} + (1 - t)\mathbf{y}, \text{ where } \mathbf{y} \in T_0 \text{ and } t \in (0, 1]\}.$$

$K_0$ is a convex polytope, and its boundary $\operatorname{fr} K_0$ consists of portions of the
hyperplanes which contain $\mathbf{x}^*$ and the $(n - 2)$-dimensional faces of $T$ (we
leave the verification of this to the reader). If $\operatorname{fr} K_0$ intersects $S$, then $\mathbf{x}^*$ is
a convex combination of fewer than $n + 1$ points of $S$, contrary to hypothesis
(Fig. 1-13). Hence $S \cap \operatorname{fr} K_0$ is empty. The interior $\operatorname{int} K_0$ and the complement
$K_0^c = E^n - K_0$ are open sets, their union contains $S$, and their intersection
is empty. But $\mathbf{x}_0 \in \operatorname{int} K_0$ and $\mathbf{x}_i \in K_0^c$ for $i = 1, \ldots, n$. Hence
$S \cap \operatorname{int} K_0$ and $S \cap K_0^c$ are relatively open, nonempty sets, which implies
that $S$ is disconnected (Section A-7). This is a contradiction.
By slightly refining the proof, an even stronger result is obtained. Suppose
that $S = S_1 \cup \cdots \cup S_k$, where $k \le n$ and $S_1, \ldots, S_k$ are connected sets.
For each $i = 1, \ldots, n$ consider the corresponding convex polytope $K_i$. Then
$\operatorname{int} K_i \cap \operatorname{int} K_j$ is empty whenever $i \ne j$ and $S \cap \operatorname{fr} K_i$ is empty for every $i$.
Moreover, $\mathbf{x}_i \in \operatorname{int} K_i$. Since $k \le n$, some pair of the points $\mathbf{x}_i, \mathbf{x}_j$ must belong
to the same set $S_l$. Then $S_l$ is not connected, a contradiction. Hence, if $S$ is
the union of $n$ or fewer connected sets, every $\mathbf{x}$ which is a convex combination
of points of $S$ is a convex combination of $n$ or fewer points of $S$.

PROBLEMS

1. Show that $K$ is a convex set by directly applying the definition. Sketch $K$ in
the cases $n = 1, 2, 3$.
(a) $K = \{\mathbf{x} : |x^1| + \cdots + |x^n| \le 1\}$.
(b) $K = \{\mathbf{x} = c^1\mathbf{v}_1 + \cdots + c^n\mathbf{v}_n,\ 0 \le c^i \le 1 \text{ for } i = 1, \ldots, n\}$, where
$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $E^n$. This is the $n$-parallelepiped spanned by
$\mathbf{v}_1, \ldots, \mathbf{v}_n$ with $\mathbf{0}$ as a vertex.
2. Let $P$ be a hyperplane. Prove that the line through any two points of $P$ is contained
in $P$. Why does this imply that $P$ is a convex set?
3. Show that each of the following subsets of $E^2$ is closed and convex by writing it
as the intersection of closed half-planes:
(a) The regular hexagon with center $(0, 0)$ and $\mathbf{e}_1$ as one vertex.
(DOr y)s Ye eig LSet 1}.
(c) $\{(x, y) : y \le \log x,\ x > 0\}$. [Hint: Use the method of Example 6.]
(d) $\{(x, y) : 0 \le y \le \sin x,\ 0 \le x \le \pi\}$.
4. Write the standard $n$-simplex as the intersection of $n + 1$ closed half-spaces.
Illustrate for $n = 2$ and $n = 3$.
. Write $e; + 4e2 as a convex combination of e1, $e2 — e1. Also write it as a
convex combination of 0, e2, e; + eg. Illustrate.
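6. Show that if $\mathbf{x}$ can be represented in two ways as a convex combination of
$\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r$, then $\mathbf{x}_1 - \mathbf{x}_0, \ldots, \mathbf{x}_r - \mathbf{x}_0$ form a linearly dependent set. [Hint:
If $\mathbf{x} = t^0\mathbf{x}_0 + \cdots + t^r\mathbf{x}_r$ and $t^0 + \cdots + t^r = 1$, then
$\mathbf{x} - \mathbf{x}_0 = t^1(\mathbf{x}_1 - \mathbf{x}_0) + \cdots + t^r(\mathbf{x}_r - \mathbf{x}_0)$.]
7. Prove that a supporting hyperplane for a closed convex set $K$ can contain no
interior point of $K$.
8. Let $\mathbf{y}$ be any boundary point of a closed convex set $K$. Show that $K$ has a supporting
hyperplane $P$ which contains $\mathbf{y}$. [Hint: Let $\{\mathbf{y}_m\}$ be a sequence of points
exterior to $K$ such that $\mathbf{y}_m$ tends to $\mathbf{y}$ as $m \to \infty$. Let $\mathbf{x}_m$ be a point of $K$ nearest
to $\mathbf{y}_m$ and $\mathbf{u}_m = (\mathbf{y}_m - \mathbf{x}_m)/|\mathbf{y}_m - \mathbf{x}_m|$. Then $|\mathbf{u}_m| = 1$ and $\mathbf{x}_m$ tends to $\mathbf{y}$ as
$m \to \infty$. By the proof of Theorem 1 there is a supporting hyperplane of the form
$\{\mathbf{x} : \mathbf{u}_m\cdot(\mathbf{x} - \mathbf{x}_m) = 0\}$. Let $\mathbf{u}$ be an accumulation point of the bounded set
$\{\mathbf{u}_1, \mathbf{u}_2, \ldots\}$ and $P = \{\mathbf{x} : \mathbf{u}\cdot(\mathbf{x} - \mathbf{y}) = 0\}$.]
9. The barycenter of an $r$-simplex is the point at which the barycentric coordinates
are equal, $t^0 = t^1 = \cdots = t^r$.
(a) Show that the barycenter of a triangle is at the intersection of the medians.
(b) State and prove a corresponding result for $r \ge 3$.
10. Let $\mathbf{x}$ be a convex combination of $\mathbf{x}_1, \ldots, \mathbf{x}_m$ and let $\mathbf{x}_j$ be a convex combination
of $\mathbf{y}_{j1}, \ldots, \mathbf{y}_{jm_j}$, $j = 1, \ldots, m$. Show that $\mathbf{x}$ is a convex combination of $\mathbf{z}_1, \ldots, \mathbf{z}_p$,
which are the distinct elements of the set $\{\mathbf{y}_{jk} : k = 1, \ldots, m_j,\ j = 1, \ldots, m\}$.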

11. Let $S$ be any subset of $E^n$. The set $\hat{S}$ of all convex combinations of points of $S$
is the convex hull of $S$.
(a) Using Problem 10, show that $\hat{S}$ is convex.
(b) Using Proposition 3, show that if $K$ is convex and $S \subset K$, then $\hat{S} \subset K$. Thus
the convex hull is the smallest convex set containing $S$.
12. Given $\mathbf{x}_0$ and $\delta > 0$, let $C = \{\mathbf{x} : |x^i - x_0^i| \le \delta,\ i = 1, \ldots, n\}$, an $n$-cube with
center $\mathbf{x}_0$ and side length $2\delta$. The vertices of $C$ are those $\mathbf{x}$ with $|x^i - x_0^i| = \delta$
for $i = 1, \ldots, n$. Show that $C$ is the convex hull of its set of vertices. [Hint:
Use induction on $n$.]
13. Let $K$ be a closed subset of $E^n$ such that both $K$ and its complement $E^n - K$
are nonempty convex sets. Prove that $K$ is a half-space.
14. Let $K$ be any convex set. Prove that its interior and its closure are also convex
sets.
15. Let $A$ and $B$ be convex subsets of $E^n$. The join of $A$ and $B$ is the set of all $\mathbf{x}$ such
that $\mathbf{x}$ lies on a line segment with one endpoint in $A$ and the other in $B$. Show
that the join of $A$ and $B$ is a convex set.

1-5 CONVEX AND CONCAVE FUNCTIONS

Functions which are either concave or convex arise naturally in connection
with the study of convex sets. They also occur in a wide variety of applications
of calculus. We will see in Section 2-5 that the theory of maxima and
minima is much simpler for them than for functions which are neither concave
nor convex.
Let $f$ be a real-valued function and $K$ a convex subset of the domain of $f$.
Definition. The function $f$ is convex on $K$ if, for every $\mathbf{x}_1, \mathbf{x}_2 \in K$ and
$t \in [0, 1]$,

$$f(t\mathbf{x}_1 + (1 - t)\mathbf{x}_2) \le tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2). \qquad \text{(1-8a)}$$

If strict inequality holds in (1-8a) whenever $\mathbf{x}_1 \ne \mathbf{x}_2$ and $0 < t < 1$,
then $f$ is strictly convex on $K$.
The assumption that $K$ is a convex set is needed to ensure that the point
$t\mathbf{x}_1 + (1 - t)\mathbf{x}_2$ belongs to the domain of $f$. In order to see the geometric
meaning of convexity let us denote points of $E^{n+1}$ by $(x^1, \ldots, x^n, z)$ or, for
short, by $(\mathbf{x}, z)$. Let

$$K^+ = \{(\mathbf{x}, z) : \mathbf{x} \in K,\ z \ge f(\mathbf{x})\}.$$

If $\mathbf{x}_1 = \mathbf{x}_2$, then (1-8a) holds trivially. Therefore suppose that $\mathbf{x}_1 \ne \mathbf{x}_2$.
Let $l$ denote the line segment in $E^{n+1}$ joining $(\mathbf{x}_1, f(\mathbf{x}_1))$ and $(\mathbf{x}_2, f(\mathbf{x}_2))$.
Points of $l$ are of the form

$$(t\mathbf{x}_1 + (1 - t)\mathbf{x}_2,\ tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2)),$$

where $t \in [0, 1]$. Inequality (1-8a) says that such points belong to $K^+$. Therefore
the definition says geometrically that the line segment $l$ is contained in
$K^+$ for every pair of points $\mathbf{x}_1, \mathbf{x}_2 \in K$.

Proposition 4. The function $f$ is convex on $K$ if and only if $K^+$ is a convex
subset of $E^{n+1}$.

Proof. Let $f$ be convex on $K$. Let $(\mathbf{x}_1, z_1), (\mathbf{x}_2, z_2) \in K^+$, $(\mathbf{x}_1, z_1) \ne (\mathbf{x}_2, z_2)$,
and $l'$ be the line segment joining them. Let $(\mathbf{x}, z)$ be any point of $l'$.
Then

$$\mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2, \qquad z = tz_1 + (1 - t)z_2,$$

where $0 \le t \le 1$. Since $z_1 \ge f(\mathbf{x}_1)$ and $z_2 \ge f(\mathbf{x}_2)$, we have (Fig. 1-14)

$$f(\mathbf{x}) \le tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2) \le z.$$

Hence $(\mathbf{x}, z) \in K^+$. This proves that $K^+$ is a convex set.

[Figure 1-14]

Suppose, conversely, that $f$ is not convex on $K$. Then there exist $\mathbf{x}_1, \mathbf{x}_2 \in K$
and $t \in [0, 1]$ such that (1-8a) does not hold. The point $(t\mathbf{x}_1 + (1 - t)\mathbf{x}_2,\
tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2))$ belongs to the line segment $l$ joining the points
$(\mathbf{x}_1, f(\mathbf{x}_1))$ and $(\mathbf{x}_2, f(\mathbf{x}_2))$, but not to $K^+$. Since $(\mathbf{x}_1, f(\mathbf{x}_1)), (\mathbf{x}_2, f(\mathbf{x}_2)) \in K^+$,
the set $K^+$ is not convex. ∎
As an illustration of Proposition 4 let us return to Example 6 of Section 1-4. If we
let $K = E^1$, the set $A$ defined there is merely $K^+$. Therefore $f$ is convex on
$E^1$ if $f'$ is an increasing function. Actually, the convexity is strict. This is a
special case of an even stronger result.

Proposition 5. ($n = 1$) Let $K \subset E^1$ be an interval, and $f$ a function which
has a derivative $f'(x)$ for every $x \in K$. Then:
(a) $f$ is convex on $K$ if and only if $f'$ is nondecreasing on $K$.
(b) $f$ is strictly convex on $K$ if and only if $f'$ is increasing on $K$.

It is convenient to postpone the proof until Section 2-4. If $f$ is convex on
$K$ but not strictly convex, then $f'(x)$ is constant on some subinterval of $K$.
This means that the graph of $f$ contains some line segment.
From Proposition 5 we can deduce the following useful test.

Corollary. Let $f$ have a second derivative $f''(x)$ for every $x \in K$, where $K \subset E^1$
is an interval. Then:
(a) $f$ is convex on $K$ if and only if $f''(x) \ge 0$ for every $x \in K$.
(b) If $f''(x) > 0$ except at a finite number of points of $K$, then $f$ is strictly
convex on $K$.

Proof. Apply Proposition 5. ∎



In Section 2-4 these results are generalized to functions of $n$ variables.
For any real number $c$, let

$$K_c = \{\mathbf{x} \in K : f(\mathbf{x}) \le c\}.$$

Proposition 6. If $f$ is convex on $K$, then $K_c$ is a convex set for every $c$.

Proof. For every $\mathbf{x}_1, \mathbf{x}_2 \in K_c$ and $t \in [0, 1]$,

$$f(t\mathbf{x}_1 + (1 - t)\mathbf{x}_2) \le tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2) \le tc + (1 - t)c = c.$$

Hence $t\mathbf{x}_1 + (1 - t)\mathbf{x}_2 \in K_c$. ∎

The same proof shows that $\{\mathbf{x} : f(\mathbf{x}) < c\}$ is also convex. The converse to
Proposition 6 is false; for example, let $f$ be any increasing function with domain
$E^1$. Then $K_c$ is either all of $E^1$, a semi-infinite interval, or the empty set. In
each case $K_c$ is convex. However, $f$ need not be a convex function; for instance,
if $f(x) = x^3$, then $f$ is not convex on $E^1$.
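Example 1. Let $A$ be any nonempty closed subset of $E^n$. For every $\mathbf{x}$, let $f(\mathbf{x})$ be
the distance from $\mathbf{x}$ to $A$, namely,

$$f(\mathbf{x}) = \min\{|\mathbf{x} - \mathbf{y}| : \mathbf{y} \in A\}.$$

Let us show that the function $f$ so defined is convex on $E^n$ if and only if $A$ is a convex
set. If $f$ is convex on $E^n$, then $K_0 = A$ and $A$ is convex by Proposition 6 with $c = 0$.
Conversely, assume that $A$ is convex. Given $\mathbf{x}_1, \mathbf{x}_2 \in E^n$ and $t \in [0, 1]$, let $\mathbf{y}_1$ be a
point of $A$ nearest $\mathbf{x}_1$, $\mathbf{y}_2$ a point of $A$ nearest $\mathbf{x}_2$, and

$$\mathbf{x} = t\mathbf{x}_1 + (1 - t)\mathbf{x}_2, \qquad \mathbf{y} = t\mathbf{y}_1 + (1 - t)\mathbf{y}_2.$$

(Actually, one can say “the nearest point” rather than “a nearest point” since the
set $A$ is convex. This fact is not needed here.) Since $A$ is convex, $\mathbf{y} \in A$. By definition
of $f$, $f(\mathbf{x}) \le |\mathbf{x} - \mathbf{y}|$. Then

$$\mathbf{x} - \mathbf{y} = t(\mathbf{x}_1 - \mathbf{y}_1) + (1 - t)(\mathbf{x}_2 - \mathbf{y}_2),$$
$$f(\mathbf{x}) \le t|\mathbf{x}_1 - \mathbf{y}_1| + (1 - t)|\mathbf{x}_2 - \mathbf{y}_2| = tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2).$$

Hence $f$ is a convex function on $E^n$.
In particular, let $A$ consist of a single point $\mathbf{x}_0$. Then $f(\mathbf{x}) = |\mathbf{x} - \mathbf{x}_0|$ and this
function is convex on $E^n$.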
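A numerical spot check of this example can be made for a particular convex set. In the sketch below, $A$ is assumed to be the unit square $[0,1]^2$ (a choice made only for illustration), for which the distance has a closed form; `dist_to_box` and `convexity_gap` are hypothetical helper names.

```python
import math

def dist_to_box(x, lo, hi):
    """Euclidean distance from x to the box with opposite corners lo and hi."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2 for c, l, h in zip(x, lo, hi)))

def convexity_gap(f, x1, x2, t):
    """t*f(x1) + (1-t)*f(x2) - f(t*x1 + (1-t)*x2); nonnegative when (1-8a) holds."""
    x = tuple(t * a + (1 - t) * b for a, b in zip(x1, x2))
    return t * f(x1) + (1 - t) * f(x2) - f(x)

lo, hi = (0.0, 0.0), (1.0, 1.0)
f = lambda x: dist_to_box(x, lo, hi)

# Sample points on a coarse grid around the square and three values of t.
grid = [(a, b) for a in (-2.0, 0.5, 3.0) for b in (-1.0, 0.25, 4.0)]
smallest_gap = min(convexity_gap(f, p, q, t)
                   for p in grid for q in grid for t in (0.25, 0.5, 0.75))
```

A nonnegative `smallest_gap` (up to rounding) is consistent with inequality (1-8a); it is of course a check on samples, not a proof.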

Concave functions. The definition of concave function is obtained by
reversing the inequality sign in (1-8a): $f$ is concave on $K$ if, for every $\mathbf{x}_1, \mathbf{x}_2 \in K$
and $t \in [0, 1]$,

$$f(t\mathbf{x}_1 + (1 - t)\mathbf{x}_2) \ge tf(\mathbf{x}_1) + (1 - t)f(\mathbf{x}_2). \qquad \text{(1-8b)}$$

If strict inequality holds whenever $\mathbf{x}_1 \ne \mathbf{x}_2$ and $0 < t < 1$, then $f$ is strictly
concave on $K$.
There are propositions about concave functions corresponding to Propositions
4, 5, and 6 for convex functions. In them $K^+$ must be replaced by

$$K^- = \{(\mathbf{x}, z) : \mathbf{x} \in K,\ z \le f(\mathbf{x})\},$$

and $K_c$ by

$$K^c = \{\mathbf{x} \in K : f(\mathbf{x}) \ge c\}.$$

The words “increasing” and “decreasing” must be exchanged. A function $f$ is
concave on $K$ if and only if $-f$ is convex on $K$. By using this fact, or by repeating
the proofs of Propositions 4, 5, and 6, it is easy to prove these propositions
about concave functions.
Many useful inequalities can be obtained from (1-8a) or (1-8b) by judiciously
choosing the function $f$ and the number $t$.

Example 2. Let $p > 1$, $p$ not necessarily an integer.
Let $f(x) = |x|^p$ for every $x \in E^1$. Then

$$f'(x) = \begin{cases} p|x|^{p-1} & \text{if } x > 0, \\ -p|x|^{p-1} & \text{if } x < 0, \end{cases}$$

and $f'(0) = 0$. The function $f'$ is increasing. Hence
$f$ is strictly convex on $E^1$. Taking $t = \tfrac12$, we have

$$f[\tfrac12(x_1 + x_2)] \le \tfrac12 f(x_1) + \tfrac12 f(x_2).$$

Multiplying both sides of the inequality by $2^p$, we get

$$|x_1 + x_2|^p \le 2^{p-1}(|x_1|^p + |x_2|^p). \qquad \text{(1-9)}$$

The inequality is strict unless $x_1 = x_2$.

[Figure 1-15]
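Inequality (1-9) is easy to test on specific numbers. The helper below, named `both_sides` only for this illustration, returns the two sides so that the strict and the equality cases can be compared.

```python
def both_sides(x1, x2, p):
    """Return (|x1 + x2|^p, 2^(p-1) * (|x1|^p + |x2|^p)), the two sides of (1-9)."""
    return abs(x1 + x2) ** p, 2 ** (p - 1) * (abs(x1) ** p + abs(x2) ** p)

strict_case = both_sides(1, 3, 2)     # x1 != x2: left side is smaller
equal_case = both_sides(2, 2, 3)      # x1 == x2: the two sides agree
opposite_signs = both_sides(1, -1, 2) # left side is 0
```

With integer inputs and integer $p$ the arithmetic is exact, so the equality case can be compared without rounding concerns.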

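*Continuity of convex functions

Theorem. Let $K$ be an open convex set and $f$ convex on $K$. Then $f$ is continuous
on $K$.

Proof.† Let $\mathbf{x}_0$ be any point of $K$, and $d$ the distance from $\mathbf{x}_0$ to the boundary
of $K$ ($d = +\infty$ if $K = E^n$). Let $C$ be an $n$-cube with center $\mathbf{x}_0$ and side length
$2\delta$, where $n^{1/2}\delta < d$. Let $V$ denote the set of vertices of $C$ (see Problem 12,
Section 1-4). $V$ is a finite set. Let

$$M = \max\{f(\mathbf{x}) : \mathbf{x} \in V\}.$$

By Proposition 6, $K_M$ is a convex set. Since $C$ is the convex hull of $V$ and
$V \subset K_M$, $C \subset K_M$.
Let $\mathbf{x}$ be any point such that $0 < |\mathbf{x} - \mathbf{x}_0| < \delta$, and define $\mathbf{x}_0 + \mathbf{u}$, $\mathbf{x}_0 - \mathbf{u}$
on the line through $\mathbf{x}_0$ and $\mathbf{x}$ as in Fig. 1-15. Let us write $\mathbf{x}$ as a convex combination
of $\mathbf{x}_0 + \mathbf{u}$ and $\mathbf{x}_0$, and $\mathbf{x}_0$ as a convex combination of $\mathbf{x}$ and $\mathbf{x}_0 - \mathbf{u}$.
If $t = \delta^{-1}|\mathbf{x} - \mathbf{x}_0|$, then

$$\mathbf{x} = t(\mathbf{x}_0 + \mathbf{u}) + (1 - t)\mathbf{x}_0,$$
$$\mathbf{x}_0 = \frac{1}{1 + t}\,\mathbf{x} + \frac{t}{1 + t}\,(\mathbf{x}_0 - \mathbf{u}).$$

Since $f$ is convex,

$$f(\mathbf{x}) \le tf(\mathbf{x}_0 + \mathbf{u}) + (1 - t)f(\mathbf{x}_0) \le tM + (1 - t)f(\mathbf{x}_0),$$
$$f(\mathbf{x}_0) \le \frac{1}{1 + t}\,f(\mathbf{x}) + \frac{t}{1 + t}\,f(\mathbf{x}_0 - \mathbf{u}) \le \frac{f(\mathbf{x}) + tM}{1 + t}.$$

The inequalities give

$$-t[M - f(\mathbf{x}_0)] \le f(\mathbf{x}) - f(\mathbf{x}_0) \le t[M - f(\mathbf{x}_0)],$$

or

$$|f(\mathbf{x}) - f(\mathbf{x}_0)| \le \frac{M - f(\mathbf{x}_0)}{\delta}\,|\mathbf{x} - \mathbf{x}_0|. \qquad \text{(1-10)}$$

The estimate (1-10) shows that $f$ is continuous at $\mathbf{x}_0$. ∎

† This proof was suggested by F. J. Almgren.

If $K$ is not open, then a convex function $f$ may be discontinuous at boundary
points of $K$. See the example below. The interior of $K$ is an open convex
set, and by the theorem $f$ is continuous at every interior point.
Example 3. Let $K = [0, 1]$ and $f(x) = x$ if $0 < x \le 1$, $f(0) = 1$. Then $f$ is convex
on $K$ but is discontinuous at the left endpoint 0.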

PROBLEMS
1. In each case find those intervals of $E^1$ on which $f$ is convex and those on which
it is concave. Illustrate with a sketch.
(20 fe) i= 127: (b) f(z) = exp (—2z).
(c) f(z) = 2/(1 — |), || ¥ 1. (d) f(z) = log (w?+ 1).
2. (a) Show that no polynomial of odd degree ($\ge 3$) is a convex function on $E^1$.
(b) Which fourth degree polynomials are convex functions on $E^1$?
(c) Why must a polynomial (of degree $\ge 2$) which is a convex function on $E^1$
be a strictly convex function?
3. Let $f(x) = x\phi(x)$ and $g(x) = \phi(1/x)$, where $\phi$ has a second derivative $\phi''(x)$ for
every $x > 0$. Show that $f$ is convex on $(0, \infty)$ if and only if $g$ is convex on $(0, \infty)$.
4. Let $x > 0$, $y > 0$, $0 \le t \le 1$. Show that $tx + (1 - t)y \ge x^t y^{1-t}$. [Hint: Log
is an increasing, concave function.]
5. Prove by induction on $m$ that if $f$ is concave on $K$, then

$$f\left(\sum_{j=1}^{m} t^j\mathbf{x}_j\right) \ge \sum_{j=1}^{m} t^j f(\mathbf{x}_j)$$

for every $\mathbf{x}_1, \ldots, \mathbf{x}_m \in K$ and scalars $t^1, \ldots, t^m$ such that each $t^j \ge 0$ and
$t^1 + \cdots + t^m = 1$. [For convex functions the sense of the inequality is reversed.]
6. (a) Generalizing Problem 4, show that if $x_1, \ldots, x_m$ are positive numbers,
$t^j \ge 0$ for $j = 1, \ldots, m$ and $t^1 + \cdots + t^m = 1$, then

$$t^1x_1 + \cdots + t^mx_m \ge x_1^{t^1} \cdots x_m^{t^m}.$$

(b) Prove that the geometric mean is no more than the arithmetic mean, namely,

$$(x_1x_2 \cdots x_m)^{1/m} \le \frac{x_1 + \cdots + x_m}{m}.$$

7. Show that if $f$ and $g$ are convex on $K$, then $f + g$ is convex on $K$.
8. (a) Let $f$ and $g$ be convex on $K$, and let $h(\mathbf{x}) = \max[f(\mathbf{x}), g(\mathbf{x})]$ for every $\mathbf{x} \in K$.
Show that $h$ is convex on $K$. [Hint: Use Proposition 4.]
(b) Illustrate for the case f(x) = |x — 1], gv) = 2/2.
9. Let $f$ be strictly convex on $E^n$, with $f(\mathbf{0}) = 0$. Given $\mathbf{x} \in E^n$, $\mathbf{x} \ne \mathbf{0}$, let
$\phi(t) = f(t\mathbf{x})/t$. Prove that $\phi$ is increasing on $\{t : t > 0\}$.
10. Let $K$ be compact and convex. Let $f$ be continuous and strictly convex on $K$.
Let $m$ be the minimum value of $f$ on $K$. Prove that $K_m$ has precisely one element.
State the corresponding result for strictly concave functions.
11. Let $f$ be both convex and concave on $E^n$. Show that there exist $\mathbf{a}$ and $b$ such
that $f(\mathbf{x}) = \mathbf{a}\cdot\mathbf{x} + b$ for every $\mathbf{x} \in E^n$.
12. Let $f$ be continuous on $K$, and assume that $f(\tfrac12(\mathbf{x}_1 + \mathbf{x}_2)) \le \tfrac12 f(\mathbf{x}_1) + \tfrac12 f(\mathbf{x}_2)$
for every $\mathbf{x}_1, \mathbf{x}_2 \in K$. Show that $f$ is convex on $K$. [Hint: First show (1-8a)
when $t = j/2^k$, where $j = 0, 1, \ldots, 2^k$ and $k$ is a positive integer.]

*1-6 NONEUCLIDEAN NORMS

It is sometimes advantageous to consider norms on $E^n$ other than the
standard euclidean norm. The distance between two points $\mathbf{x}$ and $\mathbf{y}$ defined
by such a norm need not agree with the euclidean distance. As a result, such
geometric notions as length, area, and spherical ball are changed when considered
with respect to a noneuclidean norm. However, we shall see that any
noneuclidean norm leads to the same collection of open sets as the euclidean
norm. Since the collection of open sets determines all of the topological properties
of $E^n$, these properties are therefore independent of the particular norm
chosen.
Definition. A norm is a real-valued function $\|\ \|$ with domain $E^n$ such that:
(1) $\|\mathbf{x}\| > 0$ for every $\mathbf{x} \ne \mathbf{0}$,
(2) $\|c\mathbf{x}\| = |c|\,\|\mathbf{x}\|$ for every $c$ and $\mathbf{x}$, and
(3) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ for every $\mathbf{x}$ and $\mathbf{y}$.
This agrees with the definition of norm on any vector space $\mathcal{V}$
(Section A-6, Problem 5) if we take $\mathcal{V} = E^n$. The notation $\|\ \|$ rather than,
say, $f$ is customary. From Axiom (2) with $c = 0$, $\|\mathbf{0}\| = 0$. Axiom (3) is the
triangle inequality. Just as for the euclidean norm one can easily prove using
(2), (3), and induction on $m$ that

$$\left\|\sum_{j=1}^{m} c^j\mathbf{x}_j\right\| \le \sum_{j=1}^{m} |c^j|\,\|\mathbf{x}_j\| \qquad \text{(1-11)}$$

for every choice of $c^1, \ldots, c^m$ and $\mathbf{x}_1, \ldots, \mathbf{x}_m$. In particular, let $m = 2$,
$c^1 = t$, $c^2 = 1 - t$, where $0 \le t \le 1$. Then

$$\|t\mathbf{x}_1 + (1 - t)\mathbf{x}_2\| \le t\|\mathbf{x}_1\| + (1 - t)\|\mathbf{x}_2\|.$$

This shows that $\|\ \|$ is a convex function.
The distance between $\mathbf{x}$ and $\mathbf{y}$ with respect to the norm $\|\ \|$ is defined as
$\|\mathbf{x} - \mathbf{y}\|$. From Axiom (1) the distance between any two distinct points is
positive. The $\delta$-neighborhood with center $\mathbf{x}_0$ with respect to the norm $\|\ \|$ is
$\{\mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| < \delta\}$. The closed $n$-ball with center $\mathbf{x}_0$ and radius $\delta$ is

$$\{\mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| \le \delta\}.$$

As in the euclidean case, they are convex sets.

Example 1. Let

$$\|\mathbf{x}\| = \sum_{i=1}^{n} |x^i|.$$

The $n$-balls with respect to this norm are convex polytopes. For example, if $n = 2$,
the closed unit 2-ball $\{\mathbf{x} : \|\mathbf{x}\| \le 1\}$ is the square with vertices $\mathbf{e}_1, \mathbf{e}_2, -\mathbf{e}_1, -\mathbf{e}_2$.
Compare with Problem 1(a), Section 1-4.
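The norm of Example 1 and its unit ball are straightforward to experiment with. The sketch below uses the hypothetical names `norm1` and `in_unit_ball`; it checks the vertices of the square and one instance of the triangle inequality with integer data.

```python
def norm1(x):
    """The norm of Example 1: the sum of the absolute values of the components."""
    return sum(abs(c) for c in x)

def in_unit_ball(x):
    """Membership in the closed unit ball {x : norm1(x) <= 1}."""
    return norm1(x) <= 1

# For n = 2 the four vertices e1, e2, -e1, -e2 lie on the boundary of the ball.
vertices = [(1, 0), (0, 1), (-1, 0), (0, -1)]
vertex_norms = [norm1(v) for v in vertices]

# The corner (1, 1) of the euclidean-looking square is not in this ball.
corner_inside = in_unit_ball((1, 1))

# One instance of Axiom (3), the triangle inequality.
x, y = (3, -4), (-1, 2)
triangle_ok = norm1(tuple(a + b for a, b in zip(x, y))) <= norm1(x) + norm1(y)
```

The unit ball is the intersection of the half-planes $\pm x^1 \pm x^2 \le 1$, which is why it is a convex polytope.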
Example 2. Inner products, quadratic norms. A real-valued function $B$ on the
cartesian product $E^n \times E^n$ is bilinear if for each $x, y \in E^n$ the functions $B(x, \cdot)$ and
$B(\cdot, y)$ are linear. An inner product in $E^n$ is a bilinear function such that: (a) $B(x, y) =
B(y, x)$ for every $x, y$, and (b) $B(x, x) > 0$ for every $x \neq 0$. With any inner product
$B$ is associated a quadratic norm given by

$$\|x\| = \sqrt{B(x, x)}.$$

Axiom (1) follows from (b), and (2) from $B(cx, cx) = c^2 B(x, x)$. The proof of the
triangle inequality (3) is the same as for the standard euclidean norm (1-2).
With any bilinear function $B$ is associated an $n \times n$ matrix $(c_{ij})$ such that

$$B(x, y) = \sum_{i,j=1}^{n} c_{ij}\, x^i y^j. \tag{1-12}$$

In fact, if

$$x = \sum_{i=1}^{n} x^i e_i, \qquad y = \sum_{j=1}^{n} y^j e_j,$$

then

$$B(x, y) = \sum_{i=1}^{n} x^i B(e_i, y) = \sum_{i,j=1}^{n} x^i y^j B(e_i, e_j),$$

and we set $c_{ij} = B(e_i, e_j)$. Condition (a) states that the matrix $(c_{ij})$ is symmetric,
$c_{ij} = c_{ji}$ for $i, j = 1, \dots, n$. For the standard euclidean inner product, $c_{ij} = \delta_{ij}$
and the matrix is the identity.
The n-balls with respect to any quadratic norm are n-dimensional ellipsoids.
This will be proved later (Section 4-8) by finding a new orthonormal basis for $E^n$
for which the matrix associated with B is diagonal.
If $y$ is a vector, then there is a covector $a$ such that

$$a \cdot x = B(x, y) \tag{1-13}$$

for every vector $x$. The components of $a$ are

$$a_i = \sum_{j=1}^{n} c_{ij}\, y^j, \qquad i = 1, \dots, n. \tag{1-14a}$$

On the other hand, given a covector $a$ there is a vector $y$ such that (1-13) holds. Its
components are

$$y^i = \sum_{j=1}^{n} c^{ij} a_j, \qquad i = 1, \dots, n, \tag{1-14b}$$

where $(c^{ij})$ is the inverse of the matrix $(c_{ij})$. For the standard euclidean inner product,
$y^i = a_i$, and formula (1-13) becomes (1-7).
Proposition. Corresponding to any norm $\|\ \|$ on $E^n$ there exist positive numbers
$m$ and $M$ such that for every $x \in E^n$,

$$m|x| \leq \|x\| \leq M|x|. \tag{1-15}$$


Proof. Let

$$M = n \max\{\|e_1\|, \dots, \|e_n\|\}.$$

Then writing

$$x = x^1 e_1 + \cdots + x^n e_n$$

and using (1-11), we have

$$\|x\| \leq \sum_{i=1}^{n} |x^i|\,\|e_i\| \leq \frac{M}{n} \sum_{i=1}^{n} |x^i|.$$

But $|x^i| \leq |x|$ for each $i = 1, \dots, n$. Hence

$$\|x\| \leq M|x|.$$

From this $\|x - y\| \leq M|x - y|$ for every $x$ and $y$, which implies that
$\|\ \|$ is a continuous function, since $\bigl|\,\|x\| - \|y\|\,\bigr| \leq \|x - y\|$ by the triangle
inequality. Therefore it has a minimum value $m$ on the compact set $\{x : |x| = 1\}$,

$$m = \min\{\|x\| : |x| = 1\}.$$

By Axiom (1) $m > 0$. If $x = 0$, all terms in (1-15) are 0. Given any $x \neq 0$,
let $c = |x|^{-1}$. Then $|cx| = c|x| = 1$, and hence $\|cx\| \geq m$. By Axiom (2)
$\|cx\| = |c|\,\|x\|$, from which

$$\|x\| \geq m|x|. \qquad ∎$$
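
The constants in (1-15) can be checked numerically. The following is a minimal sketch, not part of the text: it takes the norm of Example 1 as the sample norm, computes $M = n\max\|e_i\|$ as in the proof, and assumes $m = 1$ (which works for this particular norm because $\sum |x^i| \geq |x|$). The function names are illustrative only.

```python
import math
import random


def euclid(x):
    """Euclidean norm |x|."""
    return math.sqrt(sum(c * c for c in x))


def norm1(x):
    """The norm of Example 1: the sum of |x^i|."""
    return sum(abs(c) for c in x)


def upper_constant(norm, n):
    """M = n * max ||e_i||, the constant chosen in the proof."""
    basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    return n * max(norm(e) for e in basis)


def bounds_hold(norm, n, m, M, trials=1000, seed=0):
    """Check m|x| <= ||x|| <= M|x| on random sample points (not a proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        e = euclid(x)
        if not (m * e - 1e-12 <= norm(x) <= M * e + 1e-12):
            return False
    return True
```

The constant $M$ produced by the proof is not the best possible one; the proposition only needs some pair of positive constants.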
From the proposition,

$$m|x - y| \leq \|x - y\| \leq M|x - y|$$

for every $x$ and $y$. This says that the ratio of the $\|\ \|$-distance to the euclidean
distance is bounded between $m$ and $M$. Let us call a set $D$ $\|\ \|$-open if every
$x_0 \in D$ has some neighborhood with respect to this norm which is contained
in $D$. If $|x - x_0| < \delta/M$, then $\|x - x_0\| < \delta$. Therefore the $\delta$-neighborhood
of $x_0$ with respect to the norm $\|\ \|$ contains the ordinary euclidean $(\delta/M)$-neighborhood
of $x_0$. If $D$ is $\|\ \|$-open, then every $x_0 \in D$ has a euclidean neighborhood
contained in $D$. Hence $D$ is open in the ordinary sense. Similarly, the
euclidean $\delta$-neighborhood of $x_0$ contains the $(m\delta)$-neighborhood of $x_0$ with
respect to the norm $\|\ \|$. It follows that every set which is open in the ordinary
sense is also $\|\ \|$-open. Thus a set is $\|\ \|$-open if and only if it is open in the usual
sense. Since the open subsets of a topological space determine the topology
(Section A-6), all norms on $E^n$ lead to the same topology of $E^n$.
The idea of norm on an infinite dimensional vector space is also very im-
portant; for instance, see [19]. In fact, it is the starting point for modern-day
functional analysis. It is not true in the infinite dimensional case that any two
norms on the same vector space define the same class of open sets.
The closed unit $n$-ball

$$K = \{x : \|x\| \leq 1\} \tag{1-16}$$

with respect to any norm has the following four properties:
(1) K is compact;
(2) K is convex;
(3) K is symmetric about 0; and
(4) K contains a euclidean neighborhood of 0.
Symmetry about 0 means that $-x \in K$ for every $x \in K$. By Axiom (2)
with $c = -1$, $\|-x\| = \|x\|$, and hence $K$ has Property (3). From the proposition,
$K$ has Property (4) and is bounded. For any continuous function $f$,
$\{x : f(x) \leq 1\}$ is a closed set. Since $\|\ \|$ is continuous, $K$ is closed. Hence
Property (1) holds. We already noted (2), which also follows from Proposition 6
since $\|\ \|$ is a convex function.
Let us show that, conversely, any set K with these four properties gives
rise to a norm with respect to which K is the closed unit n-ball.
Theorem. Let $K$ be any set with Properties (1)-(4). Let $\|0\| = 0$, and for
every $x \neq 0$, let

$$\|x\| = \frac{1}{\max\{t : tx \in K\}}. \tag{1-17}$$

Then $\|\ \|$ is a norm and (1-16) holds.
Proof. By (1) and (4) there exist $r_1 > 0$ and $r_2 > 0$ such that $y \in K$ if
$|y| < r_1$ and $y \notin K$ if $|y| > r_2$. Hence given $x \neq 0$, $tx \in K$ if $|t| < r_1/|x|$,
and $tx \notin K$ if $|t| > r_2/|x|$. (See Fig. 1-16.) Let

$$S_x = \{t : tx \in K\}.$$

Then $S_x$ contains the $(r_1/|x|)$-neighborhood of 0 and is bounded above by
$r_2/|x|$. Since $K$ is a closed set, $S_x$ is also closed. Hence $S_x$ has a largest element
$\max S_x$, which is positive. This shows that $\|x\| = 1/\max S_x$ is well-defined and
is positive. Moreover, by (2) and the fact that $0 \in K$, the line segment between
0 and any point of $K$ is contained in $K$. Therefore $\max S_x \geq 1$ if and
only if $x \in K$, which says that (1-16) holds.
It remains to verify Axioms (2) and (3) for a norm. By Property (3),
$\|-x\| = \|x\|$. It is left to the reader to check that if $c > 0$, then

$$\max S_{cx} = \frac{1}{c} \max S_x,$$

and consequently $\|cx\| = c\|x\|$. Then Axiom (2) holds. For (3) we may assume
that $x \neq 0$ and $y \neq 0$. Let

$$s = \max S_x, \qquad t = \max S_y, \qquad u = \frac{t}{s + t}.$$

Observe that $0 < u < 1$ and

$$\frac{1}{su} = \frac{1}{s} + \frac{1}{t} = \|x\| + \|y\|.$$

[Figure 1-16]

A little manipulation shows that $su = (1 - u)t$. Consequently,

$$su(x + y) = u(sx) + (1 - u)ty.$$

Since $sx, ty \in K$ and $K$ is a convex set by Property (2), $su(x + y) \in K$. Therefore
$su \leq \max S_{x+y}$, and

$$\|x\| + \|y\| = \frac{1}{su} \geq \|x + y\|.$$

This verifies Axiom (3). ∎
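
Formula (1-17) can be evaluated numerically when $K$ is given only by a membership test. The sketch below is an illustration under stated assumptions, not the author's construction: `in_K` is a hypothetical predicate describing a set with Properties (1)-(4), and bisection is used to approximate $\max\{t : tx \in K\}$, which is legitimate because the segment from 0 to any point of $K$ lies in $K$.

```python
def gauge(in_K, x, tol=1e-12):
    """Approximate ||x|| = 1 / max{t : t*x in K} by bisection.

    Assumes in_K tests membership in a compact convex set that is
    symmetric about 0 and contains a neighborhood of 0.
    """
    if all(c == 0 for c in x):
        return 0.0

    def scaled(t):
        return [t * c for c in x]

    lo, hi = 0.0, 1.0
    # Grow hi until hi*x leaves K (possible because K is bounded).
    while in_K(scaled(hi)):
        lo, hi = hi, 2.0 * hi
    # Invariant: lo*x is in K (or lo == 0) and hi*x is not in K.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if in_K(scaled(mid)):
            lo = mid
        else:
            hi = mid
    return 1.0 / lo


def in_diamond(p):
    """Sample K: the closed unit 2-ball of the norm in Example 1."""
    return abs(p[0]) + abs(p[1]) <= 1.0
```

With the sample set `in_diamond`, `gauge(in_diamond, [3.0, -4.0])` should agree with $|3| + |-4| = 7$ up to the bisection tolerance.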

The dual norm. With any norm $\|\ \|$ on $E^n$ is associated a norm on the
dual space $(E^n)^*$. This norm on $(E^n)^*$ is also denoted by $\|\ \|$. It is defined,
for every $a \in (E^n)^*$, by

$$\|a\| = \max\{a \cdot x : \|x\| = 1\}. \tag{1-18}$$

A linear function is continuous and has a maximum value on the compact
set $\{x : \|x\| = 1\}$. Hence we are justified in writing max in (1-18). Let us verify
Axioms (1), (2), and (3). If $a \neq 0$, then $a \cdot x \neq 0$ for some $x$. We can assume
that $a \cdot x > 0$, since if not, $x$ can be replaced by $-x$. If $y = (1/\|x\|)x$, then
$a \cdot y > 0$ and $\|y\| = 1$. Thus $\{a \cdot x : \|x\| = 1\}$ contains some positive number,
and its maximum $\|a\|$ is positive. The reader should verify that $\|ca\| =
|c|\,\|a\|$. Given $a, b \in (E^n)^*$, we have whenever $\|x\| = 1$,

$$(a + b) \cdot x = a \cdot x + b \cdot x \leq \|a\| + \|b\|.$$

Hence the number $\|a\| + \|b\|$ is an upper bound for $\{(a + b) \cdot x : \|x\| = 1\}$.
The least upper bound of this set is $\|a + b\|$. Thus

$$\|a + b\| \leq \|a\| + \|b\|.$$

This verifies Axioms (1), (2), and (3).
For any $x \neq 0$, $(1/\|x\|)x$ has norm 1. Hence

$$\frac{1}{\|x\|}\, a \cdot x = a \cdot \Bigl(\frac{1}{\|x\|}\, x\Bigr) \leq \|a\|,$$

or $a \cdot x \leq \|a\|\,\|x\|$. Replacing $x$ by $-x$, we find that $-a \cdot x \leq \|a\|\,\|x\|$.
Therefore

$$|a \cdot x| \leq \|a\|\,\|x\|. \tag{1-19}$$

This inequality corresponds to Cauchy's inequality.
There is a formula dual to (1-18) for $\|x\|$:

$$\|x\| = \max\{a \cdot x : \|a\| = 1\}. \tag{1-20}$$

From (1-19), $a \cdot x \leq \|x\|$ for every covector $a$ such that $\|a\| = 1$. Hence

$$\|x\| \geq \max\{a \cdot x : \|a\| = 1\}.$$

To prove the opposite inequality, consider first any $y$ with $\|y\| = 1$. Then $y$
is a boundary point of the closed convex set $K$ defined in (1-16). By a corollary
to Theorem 1 (Problem 8, Section 1-4), $K$ has a supporting hyperplane at $y$.
Thus there exists a covector $b$ such that

$$b \cdot x \leq 1 \ \text{ for every } x \in K, \quad \text{and} \quad b \cdot y = 1.$$

By definition of the dual norm, $\|b\| = 1$. Then

$$1 = b \cdot y \leq \max\{a \cdot y : \|a\| = 1\}.$$

Thus $\|y\| \leq \max\{a \cdot y : \|a\| = 1\}$. We have already proved the opposite
inequality. Hence (1-20) is true for elements of norm 1. Since $\|cy\| = |c|\,\|y\|$
and

$$\max\{a \cdot (cy) : \|a\| = 1\} = |c| \max\{a \cdot y : \|a\| = 1\},$$

(1-20) then follows for elements of arbitrary norm.
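
As a concrete instance of (1-18) and (1-19), the sketch below takes the norm of Example 1 as the sample norm. It relies on one assumption that is not in the text: a linear function on a convex polytope attains its maximum at a vertex, so for this norm the maximum in (1-18) can be found by testing only the vertices $\pm e_i$ of the unit ball. All names are illustrative.

```python
import random


def norm1(x):
    """The norm of Example 1: the sum of |x^i|."""
    return sum(abs(c) for c in x)


def dot(a, x):
    """Value a . x of the covector a at the vector x."""
    return sum(ai * xi for ai, xi in zip(a, x))


def dual_norm_of_norm1(a):
    """max{a . x : ||x|| = 1} for the norm of Example 1.

    Only the vertices +e_i and -e_i of the unit ball are tested
    (assumption: a linear function is maximized at a vertex).
    """
    n = len(a)
    vertices = [
        [s if j == i else 0.0 for j in range(n)]
        for i in range(n)
        for s in (1.0, -1.0)
    ]
    return max(dot(a, v) for v in vertices)


def inequality_1_19_holds(a, trials=1000, seed=0):
    """Sample check of |a . x| <= ||a|| ||x|| (not a proof)."""
    rng = random.Random(seed)
    bound = dual_norm_of_norm1(a)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in a]
        if abs(dot(a, x)) > bound * norm1(x) + 1e-12:
            return False
    return True
```

The value returned by `dual_norm_of_norm1` is the largest $|a_i|$, in line with the statement of Problem 4 below.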
PROBLEMS
1. Let $\|x\| = \max\{|x^1|, \dots, |x^n|\}$.
(a) Show that this is a norm.
(b) Describe the neighborhoods with respect to it.
(c) Show that the triangle with vertices $0, e_1, e_2$ is equilateral with respect to the
distance which it defines.
2. The ellipse $K = \{(x, y) : x^2 + xy + 4y^2 \leq 1\}$ has Properties (1)-(4). (See p. 31.)
(a) For what (quadratic) norm is it the closed unit 2-ball?
(b) Find $\|e_1 - e_2\|$.
3. Let $p \geq 1$ and let $\|x\| = \bigl(\sum_{i=1}^{n} |x^i|^p\bigr)^{1/p}$. [For $p = 1$ this is Example 1 above,
and for $p = 2$ this is the euclidean norm.] Show that this is a norm, in the following
steps:
(a) Let $f(x) = \sum_{i=1}^{n} |x^i|^p$. Show that $f$ is convex on $E^n$. (Hint: In Section 1-5
we showed this for $n = 1$. Hence if $0 \leq t \leq 1$, $|t x^i + (1 - t) y^i|^p \leq t|x^i|^p +
(1 - t)|y^i|^p$.)
(b) Let $K = \{x : f(x) \leq 1\}$. Show that $K$ satisfies Properties (1)-(4) on p. 31.
(c) Show that $\|x\|$ is given by (1-17). [Note: For this norm the inequality
$\|x + y\| \leq \|x\| + \|y\|$ is called Minkowski's inequality. There is a related
inequality for integrals which we shall prove in Section 5-12.]
4. Show that if $p = 1$ in Problem 3, the dual norm is given by

$$\|a\| = \max\{|a_1|, \dots, |a_n|\}.$$

[If $p > 1$, the dual norm will be found using calculus in Section 4-8.]
5. A seminorm on $E^n$ is a real-valued function $f$ satisfying: (i) $f(x) \geq 0$ for every $x$;
(ii) $f(cx) = |c| f(x)$ for every $c$ and $x$; and (iii) $f(x + y) \leq f(x) + f(y)$ for every
$x$ and $y$.
(a) Let $f$ be a seminorm and $K = \{x : f(x) \leq 1\}$. Show that $K$ is closed and
satisfies Properties (2), (3), and (4), p. 31. Show that $K$ is compact if and
only if $f$ is a norm.
(b) Conversely, let $K$ be any closed set satisfying Properties (2), (3), and (4). Let
$f(x) = 0$ if $x = 0$ or if the line through 0 and $x$ is contained in $K$. Otherwise, let

$$f(x) = \frac{1}{\max\{t : tx \in K\}}$$

as in (1-17). Show that $f$ is a seminorm.
(c) Letn = 3and f(z, y,z) = |z| + 2|y|. Sketch K and show that
f is a seminorm.
CHAPTER 2

Differentiation of
Real-Valued Functions

We shall now begin the differential calculus for real-valued functions of
several variables. The first step is to define the basic notions (directional
derivative, differentiable function, and so on) and to prove some basic facts
about differentiable functions and functions of class $C^{(q)}$. Taylor's formula is
then obtained. It is applied to the characterization of convex functions of class
$C^{(2)}$ and to problems of relative extrema. The chain rule for partial derivatives
is postponed to Chapter 4, since it is a natural corollary of the composite function
theorem for vector-valued functions to be proved there.

2-1 DIRECTIONAL AND PARTIAL DERIVATIVES

If $f$ is a function of one variable, then its derivative at a point $x_0$ is defined by

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h},$$

provided the limit exists. The corresponding expression for functions of several
variables does not make sense, since h is then a vector and division by h is
undefined. Therefore we must find an acceptable substitute for it. Let us first
consider the derivative of f in various directions.
Let us call any unit vector $v$ (that is, vector with $|v| = 1$) a direction
in $E^n$. The directions are just the points of the $(n - 1)$-dimensional sphere
which bounds the unit $n$-ball. If $n = 1$, the only directions are $e_1$ and $-e_1$,
which we have identified with the scalars 1 and $-1$. If $n = 2$, every direction
can be written $(\cos\theta, \sin\theta)$ where $0 \leq \theta < 2\pi$. The angle $\theta$ determines the
direction. For any $n \geq 2$ the components of a direction $v$ satisfy $v^i = \cos\theta_i$,
$i = 1, \dots, n$, where $\theta_i$ is the angle between $v$ and $e_i$.
Given $x_0$ and a direction $v$, the line through $x_0 + v$ and $x_0$ is called the
line through $x_0$ with direction $v$. According to the definition on p. 13, this line is

$$\{x : x = x_0 + tv,\ t \text{ any scalar}\}. \tag{2-1}$$


Let f be a function with domain D C E”, and let xo be an interior point


of D.

Definition. The derivative of $f$ at $x_0$ in the direction $v$ is

$$\lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t} \tag{2-2}$$

if the limit exists.

Since $x_0$ is an interior point, the $\delta$-neighborhood of $x_0$ is contained in $D$
for some $\delta > 0$ (Fig. 2-1). Since

$$|(x_0 + tv) - x_0| = |tv| = |t|,$$

$x_0 + tv \in D$ provided $|t| < \delta$. The domain of
the function $\phi$ defined by

$$\phi(t) = f(x_0 + tv)$$

contains the $\delta$-neighborhood of 0. The derivative of $f$ in the direction $v$ is
$\phi'(0)$, if $\phi$ has a derivative at 0.

[Figure 2-1]

The line through $x_0$ with direction $-v$ is the same line as the one through
$x_0$ with direction $v$. However, the derivative in the direction $-v$ is the negative
of the derivative in direction $v$ (Problem 6). The direction $v$ defines an orientation
of this line, and $-v$ the opposite orientation. When the orientation
changes, the directional derivative changes sign. In effect, by assigning the
orientation $v$ we agree that the point $x_0 + sv$ precedes $x_0 + tv$ on the line if
$s < t$.

Example 1. Let $D = E^2$ and

$$f(x, y) = \frac{2xy}{x^2 + y^2}, \quad \text{if } (x, y) \neq (0, 0).$$

Since the domain $D$ is to be $E^2$ we must give $f$ some value at $(0, 0)$. More or less
arbitrarily, we let $f(0, 0) = 1$. Let us find the directional derivatives of $f$ at $(0, 0)$.
Given a direction $(\cos\theta, \sin\theta)$,

$$\phi(t) = f(t\cos\theta, t\sin\theta) = \frac{2t^2 \cos\theta \sin\theta}{t^2(\cos^2\theta + \sin^2\theta)},$$

or $\phi(t) = \sin 2\theta$, for every $t \neq 0$. But $\phi(0) = 1$. If $\sin 2\theta = 1$, then $\phi$ is the constant
function with value 1 everywhere, and $\phi'(0) = 0$. Thus if $\theta = \pi/4$ or $5\pi/4$, the
directional derivative exists and is 0. For all other values of $\theta$ the function $\phi$ is
discontinuous at 0, and consequently $\phi'(0)$ does not exist. Thus $f$ has a directional
derivative at $(0, 0)$ only in the directions $(\sqrt{2}/2, \sqrt{2}/2)$ and $(-\sqrt{2}/2, -\sqrt{2}/2)$.
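
The behavior in this example is easy to observe numerically. The following is a small sketch, assuming the function $f(x, y) = 2xy/(x^2 + y^2)$ with $f(0, 0) = 1$ as above; the helper name `difference_quotient` is illustrative. In the direction $\theta = \pi/4$ the difference quotient of (2-2) stays near 0, while in the direction $\theta = 0$ it grows like $-1/t$.

```python
import math


def f(x, y):
    """The function of the example, with the value 1 assigned at the origin."""
    if x == 0.0 and y == 0.0:
        return 1.0
    return 2.0 * x * y / (x * x + y * y)


def difference_quotient(theta, t):
    """[f(t cos(theta), t sin(theta)) - f(0, 0)] / t, as in (2-2)."""
    return (f(t * math.cos(theta), t * math.sin(theta)) - f(0.0, 0.0)) / t


for t in (1e-1, 1e-3, 1e-6):
    # First column stays near 0; second column is unbounded as t -> 0.
    print(t, difference_quotient(math.pi / 4, t), difference_quotient(0.0, t))
```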
In the next section we shall see that if f is differentiable at xo, then the de-
rivative in every direction exists and is easily calculated. Thus the unpleasant
phenomenon illustrated by the example cannot occur for differentiable functions.
The partial derivatives of $f$ are defined as the derivatives in the directions
$e_1, \dots, e_n$, if these directional derivatives exist. There are several equivalent
notations in use for partial derivatives. Of these we shall adopt just two. The
$i$th partial derivative of $f$ at $x$, $i = 1, \dots, n$, is denoted by

$$f_i(x) \quad \text{or} \quad \frac{\partial f}{\partial x^i}.$$

Thus

$$f_i(x) = \lim_{t \to 0} \frac{f(x^1, \dots, x^{i-1}, x^i + t, x^{i+1}, \dots, x^n) - f(x^1, \dots, x^n)}{t}, \tag{2-3}$$

provided the limit exists. Stated in less precise terms, $f_i(x)$ is the derivative
taken with respect to the $i$th variable while holding all other variables fixed.

Example 2. Let $f(x, y, z) = x^2 + y + \cos(y^2 z)$. Then

$$f_1(x, y, z) = 2x,$$

$$f_2(x, y, z) = 1 - 2yz \sin(y^2 z),$$

$$f_3(x, y, z) = -y^2 \sin(y^2 z).$$
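
Formulas of this kind can be sanity-checked against the definition (2-3) by a finite-difference approximation. The sketch below assumes the function $f(x, y, z) = x^2 + y + \cos(y^2 z)$ of this example; `numeric_partials` is an illustrative helper that uses central differences, which is a numerical device and not the limit in (2-3) itself.

```python
import math


def f(x, y, z):
    return x ** 2 + y + math.cos(y ** 2 * z)


def partials(x, y, z):
    """The partial derivatives f_1, f_2, f_3 stated in the example."""
    s = math.sin(y ** 2 * z)
    return (2.0 * x, 1.0 - 2.0 * y * z * s, -(y ** 2) * s)


def numeric_partials(g, point, h=1e-6):
    """Central-difference approximations to the partial derivatives of g."""
    result = []
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        result.append((g(*plus) - g(*minus)) / (2.0 * h))
    return tuple(result)
```

Comparing `partials(0.7, -1.3, 2.1)` with `numeric_partials(f, (0.7, -1.3, 2.1))` should show agreement to several decimal places.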

The symbol $f_i$ will denote the real-valued function whose value at $x$ is $f_i(x)$.
Its domain is the set of points where $f$ has an $i$th partial derivative.
For purposes of brevity, we shall occasionally abuse the notation by writing
$f_i$ for the value $f_i(x)$ at some particular $x$. In each such instance this abuse will
be indicated either explicitly or by the context.

Example 3. Let $f(x) = \psi[g(x)]$ for every $x \in D$. Suppose that the $i$th partial derivative
of $g$ at $x_0$ and the derivative of $\psi$ at $g(x_0)$ exist. By the composite function theorem
for functions of one variable

$$f_i(x_0) = \psi'[g(x_0)]\, g_i(x_0). \tag{2-4}$$

This theorem will be proved in Section 4-4 as a special case of the composite function
theorem for transformations.

PROBLEMS
Unless otherwise stated, the domain D of f is £” for the particular n indicated
in the problem.
1. In each case find the partial derivatives of $f$.
(a) $f(x, y) = x \log(xy)$, $D = \{(x, y) : xy > 0\}$.
(Dray 2) a= (a2 2y2ct-2) 8.
(c) f(x) = x-x.
2. Let $f(x, y) = (x - 1)^2 - y^2$. Find the derivative of $f$ at $e_2$ in any direction $v$,
using the definition of directional derivative.
3. Let $f(x) = x^{1/3}$. Show that $f$ has no derivative at 0.
4. Let $f(x, y) = (xy)^{1/3}$. (a) Using the definition of directional derivative, show that
$f_1(0, 0) = f_2(0, 0) = 0$, and that $\pm e_1$, $\pm e_2$ are the only directions in which the
derivative at $(0, 0)$ exists.
(b) Show that $f$ is continuous at $(0, 0)$.
5. Let $f(x, y, z) = |x + y + z|$. Find those directions in which the derivative of $f$
at $e_1 - e_2$ exists. [Hint: The absolute value function, $g(t) = |t|$ for every $t \in E^1$,
has no derivative at 0.]
6. Show that the derivative of f at xo in the direction —v is the negative of the deriva-
tive at Xo in the direction v.

2-2 DIFFERENTIABLE FUNCTIONS


The existence of a derivative for a function of one variable is a fact of
considerable interest. Geometrically, it says that a tangent line exists. How-
ever, the fact that a function of several variables has partial derivatives is
not in itself of much interest. For one thing, the existence of derivatives in the
directions of the standard basis vectors $e_1, \dots, e_n$ does not imply that derivatives
exist in other directions. Moreover, the function need not have a tangent
hyperplane even if there is a derivative in every direction (see Example 2 below).
We shall now define a more natural notion, that of differentiability.
Geometrically, differentiability means the existence of a tangent hyperplane.
It will be shown that most of the basic properties of differentiable functions of
one variable remain true for differentiable functions of several variables.
Let us again consider an interior point Xo of the domain D of a real-valued
function f.

Definition. The function $f$ is differentiable at $x_0$ if there is a linear function
$L$ (depending on $x_0$) such that

$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - L(h)}{|h|} = 0. \tag{2-5}$$
Let us show that if $f$ is differentiable at $x_0$, then $f$ has a derivative at $x_0$
in every direction $v$. Taking $h = tv$, (2-5) implies that

$$\lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0) - L(tv)}{|t|} = 0,$$

by Proposition A-5, and therefore, since $L(tv) = tL(v)$,

$$\lim_{t \to 0} \Bigl[\frac{f(x_0 + tv) - f(x_0)}{t} - L(v)\Bigr] = 0,$$

$$\lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t} = L(v).$$

This shows that $L(v)$ is the derivative at $x_0$ in the direction $v$.


The linear function $L$ is called the differential of $f$ at $x_0$, and will be denoted
by $df(x_0)$. As in Section 1-3, the linear function $df(x_0)$ is also called a covector,
and its value at a vector $h$ is denoted by $df(x_0) \cdot h$ rather than $L(h)$. If $v =
e_i$, then the number $a_i = L(e_i)$ is the $i$th partial derivative $f_i(x_0)$. Hence
the components of the covector $df(x_0)$ are the partial derivatives:

$$df(x_0) = \sum_{i=1}^{n} f_i(x_0)\, e^i, \tag{2-6}$$

$$df(x_0) \cdot h = \sum_{i=1}^{n} f_i(x_0)\, h^i. \tag{2-7}$$
If $x \in D$, let us set $x = x_0 + h$. The vector $h = x - x_0$ is often called
the "increment between $x$ and $x_0$," and in the time-honored notation of calculus
one would write $\Delta x$ for $h$. The number $f(x_0 + h) - f(x_0)$ is the corresponding
"increment in $f$." However, we shall use neither the word increment nor the
notation $\Delta x$.

[Figure 2-2]

If $f$ is differentiable at $x_0$, then the differential at $x_0$ furnishes a linear
approximation $df(x_0) \cdot (x - x_0)$ to $f(x) - f(x_0)$ when $x$ is near $x_0$. The error
in this approximation is $f(x) - f(x_0) - df(x_0) \cdot (x - x_0)$, which is the numerator
in (2-5). It is small compared to the distance $|x - x_0|$ when $|x - x_0|$ is
small. Geometrically, this means that the hyperplane in $E^{n+1}$ whose equation
is $z = f(x_0) + df(x_0) \cdot (x - x_0)$ is tangent to $f$ at $(x_0, f(x_0))$. This is illustrated
by Fig. 2-2 for $n = 2$. A precise definition of tangent hyperplane will
be given later in Section 4-7.

Example 1. Let $f(x, y) = (xy)^{1/3}$. Find the tangent plane at $(1, 1, 1)$. By elementary
calculus

$$f_1(x, y) = \tfrac{1}{3} x^{-2/3} y^{1/3}, \qquad f_2(x, y) = \tfrac{1}{3} x^{1/3} y^{-2/3},$$

except at $(0, 0)$. Moreover, $f_1$ and $f_2$ are continuous functions except at $(0, 0)$. By
Theorem 2 in the next section, $f$ is differentiable at any $(x_0, y_0) \neq (0, 0)$. The components
of $df(x_0, y_0)$ are $f_1(x_0, y_0)$ and $f_2(x_0, y_0)$. The equation for the tangent plane
at $(x_0, y_0, f(x_0, y_0))$ is

$$z = f(x_0, y_0) + f_1(x_0, y_0)(x - x_0) + f_2(x_0, y_0)(y - y_0).$$

Taking $x_0 = y_0 = 1$, the equation of the tangent plane at $(1, 1, 1)$ is

$$z = 1 + \tfrac{1}{3}(x - 1) + \tfrac{1}{3}(y - 1).$$

The partial derivatives $f_1(0, 0)$ and $f_2(0, 0)$ are both 0, according to Problem 4,
Section 2-1. However, there is no tangent plane at $(0, 0, 0)$. If there were a tangent
plane at $(0, 0, 0)$, then $f$ would have to be differentiable at $(0, 0)$. Since there is not a
derivative in every direction at $(0, 0)$, $f$ is not differentiable there.
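
The defining property (2-5) can be observed numerically at the point $(1, 1)$. The sketch below assumes $f(x, y) = (xy)^{1/3}$ and the tangent plane $z = 1 + \tfrac{1}{3}(x - 1) + \tfrac{1}{3}(y - 1)$; the function `error_ratio` and the sample increment $h = s(1, -2)$ are illustrative choices. The ratio of the approximation error to $|h|$ should shrink roughly in proportion to $s$.

```python
import math


def f(x, y):
    """f(x, y) = (xy)^(1/3), used here only where xy > 0."""
    return (x * y) ** (1.0 / 3.0)


def tangent_plane(x, y):
    """z = 1 + (x - 1)/3 + (y - 1)/3, the tangent plane at (1, 1, 1)."""
    return 1.0 + (x - 1.0) / 3.0 + (y - 1.0) / 3.0


def error_ratio(s):
    """|f(x0 + h) - plane(x0 + h)| / |h| for the increment h = s * (1, -2)."""
    h1, h2 = s, -2.0 * s
    err = abs(f(1.0 + h1, 1.0 + h2) - tangent_plane(1.0 + h1, 1.0 + h2))
    return err / math.hypot(h1, h2)


for s in (1e-1, 1e-2, 1e-3, 1e-4):
    print(s, error_ratio(s))
```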

Proposition 7. If $f$ is differentiable at $x_0$ then $f$ is continuous at $x_0$.

Proof. For any $h$

$$f(x_0 + h) - f(x_0) = [f(x_0 + h) - f(x_0) - df(x_0) \cdot h] + df(x_0) \cdot h. \tag{$*$}$$

From the definition of limit (with $\epsilon = 1$) there is a positive number $\delta_0$ such
that if $0 < |h| < \delta_0$ the quotient in (2-5) has absolute value less than 1,

$$|f(x_0 + h) - f(x_0) - df(x_0) \cdot h| < |h|.$$

By Cauchy's inequality

$$|df(x_0) \cdot h| \leq |df(x_0)|\,|h|.$$

Applying the triangle inequality to the right side of ($*$),

$$|f(x_0 + h) - f(x_0)| \leq |f(x_0 + h) - f(x_0) - df(x_0) \cdot h| + |df(x_0) \cdot h|.$$

Consequently, if $0 < |h| < \delta_0$,

$$|f(x_0 + h) - f(x_0)| < C|h|, \tag{2-8}$$

where $C = 1 + |df(x_0)|$. Given $\epsilon > 0$, let $\delta = \min\{\delta_0, \epsilon/C\}$. Then $|f(x_0 + h) -
f(x_0)| < \epsilon$ for every $h$ such that $0 < |h| < \delta$. This shows that

$$\lim_{h \to 0} f(x_0 + h) = f(x_0),$$

in other words, that $f$ is continuous at $x_0$. ∎


For $n = 1$, $f$ is differentiable at $x_0$ if and only if the derivative $f'(x_0)$
exists, since both statements are equivalent to the existence of a tangent line
at $(x_0, f(x_0))$. However, for $n \geq 2$ a function $f$ may have a derivative at $x_0$
in every direction yet not be differentiable or even continuous at $x_0$. This is
shown by the following example.

Example 2. Let

$$f(x, y) = \frac{2xy^2}{x^2 + y^4} \ \text{ if } (x, y) \neq (0, 0), \quad \text{and} \quad f(0, 0) = 0.$$

There are two cases to consider. If $\cos\theta \neq 0$, then the derivative at $(0, 0)$ in the
direction $(\cos\theta, \sin\theta)$ is

$$\lim_{t \to 0} \frac{f(t\cos\theta, t\sin\theta) - f(0, 0)}{t} = \lim_{t \to 0} \frac{2\cos\theta \sin^2\theta}{\cos^2\theta + t^2 \sin^4\theta} = \frac{2\sin^2\theta}{\cos\theta}.$$

If $\cos\theta = 0$, then $f(t\cos\theta, t\sin\theta) = 0$ for every $t$ and the directional derivative
at $(0, 0)$ is 0. However, $f(y^2, y) = 1$ for every $y \neq 0$. Since $f(0, 0) = 0$, $f$ is not
continuous at $(0, 0)$. By Proposition 7, $f$ is not differentiable at $(0, 0)$.
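
Both halves of this example can be checked numerically. The following sketch assumes $f(x, y) = 2xy^2/(x^2 + y^4)$ with $f(0, 0) = 0$; the helper names are illustrative. The difference quotient along a fixed direction settles near $2\sin^2\theta/\cos\theta$, yet $f$ stays equal to 1 along the parabola $x = y^2$ arbitrarily close to the origin.

```python
import math


def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2.0 * x * y * y / (x * x + y ** 4)


def difference_quotient(theta, t):
    """[f(t cos(theta), t sin(theta)) - f(0, 0)] / t."""
    return (f(t * math.cos(theta), t * math.sin(theta)) - f(0.0, 0.0)) / t


def predicted_derivative(theta):
    """2 sin^2(theta) / cos(theta), valid when cos(theta) != 0."""
    return 2.0 * math.sin(theta) ** 2 / math.cos(theta)


theta = math.pi / 3
print(difference_quotient(theta, 1e-5), predicted_derivative(theta))
# Along x = y^2 the value stays at 1, so f is not continuous at (0, 0).
print([f(y * y, y) for y in (1e-1, 1e-2, 1e-3)])
```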

Let us next state a proposition which, although of no interest in itself, will


be useful later.

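Proposition 8. Let $\phi(t) = f(x_0 + th)$. Then for every $t$ such that $f$ is differentiable
at $x_0 + th$,

$$\phi'(t) = df(x_0 + th) \cdot h.$$

Proof. If $h = 0$ the result is trivial. If $h \neq 0$, then

$$0 = \lim_{\eta \to 0} \frac{f(x_0 + th + \eta) - f(x_0 + th) - df(x_0 + th) \cdot \eta}{|\eta|}.$$

In particular, let $\eta = uh$. Then $|\eta| = |u|\,|h|$ and $f(x_0 + th + uh) = \phi(t + u)$, so that

$$0 = \lim_{u \to 0} \frac{\phi(t + u) - \phi(t) - u\, df(x_0 + th) \cdot h}{|u|},$$

which says that $\phi'(t) = df(x_0 + th) \cdot h$. ∎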

Note that if h is a direction (|h| = 1) and ¢t = 0, we again obtain the
formula df(xo) - h for the directional derivative.
As a first application of Proposition 8 let us extend the mean value theorem
to functions of several variables. Let (0, 1) denote the open interval with end-
points 0 and 1 (as in Section A-1).

Mean Value Theorem. Let $f$ be differentiable at every point of the line segment
joining $x_0$ and $x_0 + h$. Then there exists a number $s \in (0, 1)$ such that

$$f(x_0 + h) - f(x_0) = df(x_0 + sh) \cdot h.$$

Proof. Let $\phi$ have the same meaning as in Proposition 8. By the mean
value theorem for functions of one variable (Section A-8) there exists $s \in (0, 1)$
such that $\phi(1) - \phi(0) = \phi'(s)$. We apply Proposition 8. ∎
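
Although the theorem is used without ever computing $s$, a small numerical illustration may help fix ideas. The sketch below is hypothetical throughout: it takes the sample function $f(x, y) = x^2 + y^3$, and it locates $s$ by bisection under the extra assumption that $df(x_0 + sh) \cdot h - [f(x_0 + h) - f(x_0)]$ is negative at $s = 0$ and positive at $s = 1$, which holds for the sample data but not in general.

```python
import math


def f(p):
    x, y = p
    return x * x + y ** 3


def df(p):
    """Components (f_1, f_2) of the differential of the sample f."""
    x, y = p
    return (2.0 * x, 3.0 * y * y)


def mean_value_point(x0, h, steps=60):
    """Find s in (0, 1) with f(x0 + h) - f(x0) = df(x0 + s h) . h by bisection."""
    end = [a + b for a, b in zip(x0, h)]
    target = f(end) - f(x0)

    def gap(s):
        p = [a + s * b for a, b in zip(x0, h)]
        return sum(c * b for c, b in zip(df(p), h)) - target

    lo, hi = 0.0, 1.0
    if not (gap(lo) < 0.0 < gap(hi)):
        raise ValueError("sketch assumes a sign change of gap on [0, 1]")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $x_0 = (0, 0)$ and $h = (1, 1)$ the equation reduces to $2s + 3s^2 = 2$, whose root in $(0, 1)$ is $(\sqrt{28} - 2)/6$.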
Note that the point $x_0 + sh$ is on the line segment joining $x_0$ and $x_0 + h$.
The number $s$ in the mean value theorem is not unique. We have no interest
in actually calculating $s$. The mean value theorem is used to obtain various
estimates which are valid no matter where $s$ is in the interval $(0, 1)$. The mean
value theorem is often stated in a slightly sharper form, in which $f$ is required
to be differentiable at each point $x_0 + th$ for $t$ in the open interval $(0, 1)$ and
continuous at $x_0$ and $x_0 + h$. The proof is the same.

Definition. If f is differentiable at every point of a subset A of its domain


D, then we say that f is differentiable on A. If D is an open set and f is
differentiable at every point of D, then f is called a differentiable function.

The mean value theorem has the following corollaries.

Corollary 1. Let $f$ be differentiable on a convex set $K$ and $C \geq 0$ a number
such that $|df(x)| \leq C$ for every $x \in K$. Then for every $x, y \in K$,

$$|f(x) - f(y)| \leq C|x - y|.$$

Proof. By the mean value theorem, with $x_0 = y$, $x_0 + h = x$,

$$f(x) - f(y) = df(y + s(x - y)) \cdot (x - y),$$

where $s \in (0, 1)$. By Cauchy's inequality,

$$|f(x) - f(y)| \leq |df(y + s(x - y))|\,|x - y| \leq C|x - y|. \qquad ∎$$
Corollary 2. Let $f$ be a differentiable function whose domain $D$ is an open,
connected set, such that $df(x) = 0$ for every $x \in D$. Then $f$ is a constant
function.

Proof. Let $x_0$ be some point of $D$, and let $D_1 = \{x : f(x) = f(x_0)\}$. If
$x \in D_1$, then some neighborhood $U$ of $x$ is contained in $D$. Every neighborhood
is a convex set. By Corollary 1, with $C = 0$ and $K = U$, $f(y) = f(x) = f(x_0)$
for every $y \in U$. Hence $U \subset D_1$. This shows that $D_1$ is an open set.
Since $f$ is differentiable, $f$ is continuous by Proposition 7. Therefore
$D - D_1 = \{x : f(x) \neq f(x_0)\}$ is also open by the corollary to Proposition A-6.
If $D - D_1$ is not empty, then $D$ is the union of two disjoint, nonempty open
sets $D_1$ and $D - D_1$. Since $D$ is connected, this is impossible. Hence $D - D_1$
is empty, and $D = D_1$. ∎

Corollary 2 generalizes the result that if $f'(x) = 0$ for every $x$ in an open
interval, then $f$ is constant there.
Note: H. Whitney (Duke Math. J. 1 (1935), 514-517) gave an example
of a connected set $A \subset E^2$ and a differentiable function $f$, such that $df(x, y) = 0$
for every $(x, y) \in A$ but $f(x, y)$ is not constant on $A$. The set $A$ in Whitney's
example has no interior point.
If $x$ is any point where $f$ is differentiable, then besides the covector

$$df(x) = \sum_{i=1}^{n} f_i(x)\, e^i$$

whose components are the partial derivatives $f_i(x)$, it is sometimes more suitable
to think instead of the vector with these same components. This is called
the gradient vector at $x$ and is denoted by $\operatorname{grad} f(x)$. Thus

$$\operatorname{grad} f(x) = \sum_{i=1}^{n} f_i(x)\, e_i.$$

Another common notation for the gradient vector is $\nabla f(x)$.

*Note: This definition of the gradient vector is correct only if we use the
euclidean inner product in $E^n$. If $E^n$ is given some other inner product $B$,
then one should use formula (1-14b) for changing covectors into vectors.
The gradient vector in that case is

$$\operatorname{grad} f(x) = \sum_{i,j=1}^{n} c^{ij} f_j(x)\, e_i,$$

and according to formula (1-13),

$$B(\operatorname{grad} f(x), h) = df(x) \cdot h,$$

which becomes for the euclidean inner product simply $\operatorname{grad} f(x) \cdot h = df(x) \cdot h$.
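
The relation $B(\operatorname{grad} f(x), h) = df(x) \cdot h$ can be checked on a small example. The sketch below works in $E^2$ with a sample symmetric positive definite matrix $(c_{ij})$; the matrix, the covector components, and the function names are all illustrative assumptions. The gradient is formed by applying the inverse matrix $(c^{ij})$ to the components of $df(x)$, as in (1-14b).

```python
def inverse_2x2(c):
    """Inverse (c^{ij}) of a 2 x 2 matrix (c_{ij}) with nonzero determinant."""
    (p, q), (r, s) = c
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def gradient(c, df):
    """Components of grad f: the sum over j of c^{ij} f_j."""
    inv = inverse_2x2(c)
    return [sum(inv[i][j] * df[j] for j in range(2)) for i in range(2)]


def B(c, u, v):
    """Inner product B(u, v): the sum over i, j of c_{ij} u^i v^j."""
    return sum(c[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


c = [[2.0, 1.0], [1.0, 3.0]]     # sample symmetric positive definite matrix
df = [1.0, -2.0]                 # sample components f_1(x), f_2(x)
h = [0.5, 4.0]
print(B(c, gradient(c, df), h), sum(a * b for a, b in zip(df, h)))
```

When $(c_{ij})$ is the identity matrix, `gradient` returns the components of $df(x)$ unchanged, which is the euclidean case.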

PROBLEMS
In Problems 1, 2, 3, and 8, assume that f is differentiable. In each case this
follows from Theorem 2 in the next section.
1. Let $f(x, y) = 3x^2 y + 2xy^2$. Find the tangent plane at $(1, -2, 2)$.
2. Using the formula df(xo) : v for directional derivative, find the derivative of f at
xo in the direction v.

(a) f(z, y) = zy, Xo = (1,3), Vv= cH ey

(ay) = «exp (24), x0 =e) = eg) v= re Gicech)


(c) f(a, y, 2) = ax” + by? + cz", Xo = €1, V = e3.
3. Let f(z, y) = log (x? + 2y+ 1) + Jo cos () dt, y > —3.
(a) Find df(a, y).
(b) Find approximately f(.03, .03).
4, Find grad f(x) for each of the following functions:
(a) f(x) =x0-x. (b) f(x) = x, x #0. (c) fe) = (xox)?
5. Let $f(x, y) = 2xy^2/(x^2 + y^4)$, if $(x, y) \neq (0, 0)$, and $f(0, 0) = 0$, as in Example 2.
(a) Show that $-1 \leq f(x, y) \leq 1$ for every $(x, y)$.
(b) Find $\{(x, y) : f(x, y) = 1\}$ and $\{(x, y) : f(x, y) = -1\}$.
(c) Find $\{(x, y) : \operatorname{grad} f(x, y) = (0, 0)\}$.
(d) Find $\{(x, y) : f(x, y) = c\}$ for any $c$, and illustrate with a sketch.
6. Let $f$ and $g$ be differentiable at $x_0$. (a) Prove that the sum $f + g$ is differentiable
at $x_0$, and $d(f + g)(x_0) = df(x_0) + dg(x_0)$. (b) Prove that the product $fg$ is differentiable
at $x_0$, and $d(fg)(x_0) = f(x_0)\,dg(x_0) + g(x_0)\,df(x_0)$. [Hint: Recall the
proof for $n = 1$.]
7. (Euler's formula). Let $p$ be a real number. A function $f$ is called homogeneous of
degree $p$ if $f(tx) = t^p f(x)$ for every $x \neq 0$ and $t > 0$. Let $f$ be differentiable for
all $x \neq 0$. Show that if $f$ is homogeneous of degree $p$, then

$$df(x) \cdot x = p f(x)$$

for every $x \neq 0$, and conversely. [Hint: Let $\phi(t) = f(tx)$ and use Proposition 8
with $x_0 = 0$. For the converse, show that for fixed $x$, $\phi(t)\,t^{-p}$ is a constant.]
8. Let $Q(x) = \sum_{i,j=1}^{n} c_{ij}\, x^i x^j$, where $c_{ij} = c_{ji}$ and $Q(x) > 0$ for every $x \neq 0$. Let
f(x) = [Q(x)]*/*. Calculate df(x) and verify Euler’s formula for this function.

2-3 FUNCTIONS OF CLASS $C^{(q)}$

Let $f$ be a function whose domain is an open set $D \subset E^n$.

Definition. If $f$ is continuous, then $f$ is said to be a function of class $C^{(0)}$.
If the partial derivatives $f_1(x), \dots, f_n(x)$ exist for every $x \in D$ and
$f_1, \dots, f_n$ are continuous functions, then $f$ is a function of class $C^{(1)}$.
The classes $C^{(q)}$ of functions, where $q = 2, 3, \dots$, will be defined below.
We will first prove the following sufficient condition for differentiability, which
is adequate for most purposes.
Theorem 2. If $f$ is a function of class $C^{(1)}$, then $f$ is a differentiable function.

Proof. Let us proceed by induction on the dimension $n$. If $n = 1$, differentiability
means simply that $f'(x)$ exists for every $x \in D$, while if $f$ is of class
$C^{(1)}$ then $f'$ is a continuous function. Let us assume that the theorem is true in
dimension $n - 1$.
Let $x_0$ be any point of $D$ and $\delta_0 > 0$ such
that the $\delta_0$-neighborhood of $x_0$ is contained in $D$.
Let us write (Fig. 2-3)

$$\hat{x} = (x^1, \dots, x^{n-1}), \qquad x = (\hat{x}, x^n), \qquad \phi(\hat{x}) = f(\hat{x}, x_0^n),$$

provided the point $(\hat{x}, x_0^n)$ is in $D$. The partial
derivatives of $\phi$ are

$$\phi_i(\hat{x}) = f_i(\hat{x}, x_0^n), \qquad i = 1, \dots, n - 1.$$

[Figure 2-3]

Since $f$ is of class $C^{(1)}$, each $f_i$ is continuous. Hence each $\phi_i$ is continuous and
$\phi$ is of class $C^{(1)}$. By the induction hypothesis $\phi$ is differentiable at $\hat{x}_0$. Therefore,
given $\epsilon > 0$ there exists $\delta_1$, $0 < \delta_1 < \delta_0$, such that

$$\Bigl|\phi(\hat{x}_0 + \hat{h}) - \phi(\hat{x}_0) - \sum_{i=1}^{n-1} \phi_i(\hat{x}_0)\, h^i\Bigr| \leq \frac{\epsilon}{2}\,|\hat{h}|$$

whenever $|\hat{h}| < \delta_1$. Since $f_n$ is continuous, there exists $\delta_2$, $0 < \delta_2 < \delta_0$, such
that $|f_n(y) - f_n(x_0)| < \epsilon/2$ whenever $|y - x_0| < \delta_2$. Let $\delta = \min\{\delta_1, \delta_2\}$
and let $|h| < \delta$. By the mean value theorem,

$$f(x_0 + h) - \phi(\hat{x}_0 + \hat{h}) = f(\hat{x}_0 + \hat{h}, x_0^n + h^n) - f(\hat{x}_0 + \hat{h}, x_0^n) = f_n(\hat{x}_0 + \hat{h}, x_0^n + sh^n)\, h^n$$

for some $s \in (0, 1)$. Setting $y = (\hat{x}_0 + \hat{h}, x_0^n + sh^n)$,

$$|y - x_0| \leq |h| < \delta \leq \delta_2.$$

Since $f(x_0) = \phi(\hat{x}_0)$,

$$f(x_0 + h) - f(x_0) = [f(x_0 + h) - \phi(\hat{x}_0 + \hat{h})] + [\phi(\hat{x}_0 + \hat{h}) - \phi(\hat{x}_0)],$$

$$f(x_0 + h) - f(x_0) - df(x_0) \cdot h = [f_n(y)h^n - f_n(x_0)h^n] + \Bigl[\phi(\hat{x}_0 + \hat{h}) - \phi(\hat{x}_0) - \sum_{i=1}^{n-1} \phi_i(\hat{x}_0)\, h^i\Bigr].$$

Using the above inequalities, the triangle inequality, and the fact that $|h^n| \leq
|h|$, $|\hat{h}| \leq |h|$, we get

$$|f(x_0 + h) - f(x_0) - df(x_0) \cdot h| \leq \frac{\epsilon}{2}\,|h^n| + \frac{\epsilon}{2}\,|\hat{h}| \leq \epsilon|h|$$

whenever $|h| < \delta$. This proves that $f$ is differentiable at $x_0$. ∎

Corollary. Every function of class $C^{(1)}$ is of class $C^{(0)}$.

Proof. Apply Proposition 7 and Theorem 2. ∎

Example 1. If $f$ and $g$ are functions of class $C^{(1)}$ with the same domain $D$, then $f + g$
is of class $C^{(1)}$. Using the product rule from elementary calculus, the partial derivatives
of the product are

$$(fg)_i = f_i g + f g_i.$$

Since sums and products of continuous functions are again continuous, $(fg)_i$ is continuous
for each $i = 1, \dots, n$. Hence $fg$ is of class $C^{(1)}$.
Example 2. The composite of two functions of class $C^{(1)}$ is also of class $C^{(1)}$. For
suppose that $f = \psi \circ g$, where $\psi$ and $g$ are of class $C^{(1)}$. By formula (2-4), p. 37,
$f_i = (\psi' \circ g) g_i$. Since $\psi'$ and $g$ are continuous, their composite $\psi' \circ g$ is continuous
(Proposition A-7). Since $g_i$ is continuous, the product $(\psi' \circ g) g_i$ is continuous. Thus
$f$ is of class $C^{(1)}$.

Higher-order partial derivatives. The partial derivatives $f_1(x), \dots, f_n(x)$
are often called the first-order partial derivatives of $f$ at $x$. The functions
$f_1, \dots, f_n$ may themselves possess partial derivatives. If $f_i$ has a $j$th partial
derivative at $x$, then this partial derivative is called a partial derivative of order
2, and is denoted by

$$f_{ij}(x) \quad \text{or} \quad \frac{\partial^2 f}{\partial x^i\, \partial x^j}(x).$$

For example, if $f(x, y) = x^2 y^3$, then $f_1(x, y) = 2xy^3$, $f_{11}(x, y) = 2y^3$,
$f_{12}(x, y) = 6xy^2$.
If all of the partial derivatives $f_{ij}(x)$, $i, j = 1, \dots, n$, exist at every
$x \in D$ and each $f_{ij}$ is a continuous function, then $f$ is called a function of class
$C^{(2)}$. By the corollary to Theorem 2, if $f$ is of class $C^{(2)}$ then $f_1, \dots, f_n$ are
continuous. Hence any function of class $C^{(2)}$ is also of class $C^{(1)}$.
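
Second-order partial derivatives can be approximated by differencing twice. The sketch below is an illustration only: it takes $f(x, y) = x^2 y^3$ as a sample function, for which $f_{12}(x, y) = 6xy^2$, and builds $f_{12}$ and $f_{21}$ by nesting a central-difference operator. The helper `partial` is hypothetical, and the step size is a rough choice balancing truncation and rounding error.

```python
def f(x, y):
    return x ** 2 * y ** 3


def partial(g, i, h=1e-4):
    """Return a function approximating the ith partial derivative of g (i = 0 or 1)."""
    def dg(x, y):
        if i == 0:
            return (g(x + h, y) - g(x - h, y)) / (2.0 * h)
        return (g(x, y + h) - g(x, y - h)) / (2.0 * h)
    return dg


f12 = partial(partial(f, 0), 1)   # differentiate in x first, then in y
f21 = partial(partial(f, 1), 0)   # differentiate in y first, then in x

x, y = 1.5, -0.5
print(f12(x, y), f21(x, y), 6.0 * x * y ** 2)
```

For this smooth sample function the two nested approximations agree with each other and with $6xy^2$ to within the accuracy of the differencing.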
The partial derivatives of f of order gq = 3,4,... are defined similarly,
wherever they exist. The notation for partial derivative at x, first in the direc-
tion e;,, second in the direction e;,, and so on, is
WE Aes) il << 11 << n, = il. ooo y Ue

Definition. If all of the $q$th-order derivatives of $f$ exist at every $x \in D$ and each $f_{i_1 \cdots i_q}$ is a continuous function, then $f$ is a function of class $C^{(q)}$.

By the corollary to Theorem 2, any function of class $C^{(q)}$ is also of class $C^{(q-1)}$. As $q$ increases, more and more restrictive conditions are placed on the smoothness of $f$. In many parts of differential calculus it is sufficient to assume that $f$ is of class $C^{(1)}$ or class $C^{(2)}$. However, for some purposes one needs $C^{(q)}$ for $q > 2$. For instance, in Taylor's formula it is assumed that $f$ is of class $C^{(q)}$.
The sum and product of two functions of class $C^{(q)}$ are of class $C^{(q)}$. If, in Example 2, $\psi$ and $g$ are of class $C^{(q)}$, their composite $f$ is also of class $C^{(q)}$.

Example 3. Let $p > 0$, $p$ not an integer, and $\psi(x) = |x|^p$ for every $x \in E^1$. Then $\psi$ is of class $C^{(q)}$ if $q < p$ but not if $q > p$. The proof of this is left to the reader (Problem 4). Thus for every $q$ there exist functions of class $C^{(q)}$ which are not of class $C^{(q+1)}$.

Example 4. Any polynomial in $n$ variables is a function of class $C^{(q)}$ for every $q$. If $f$ is a rational function, $f(x) = P(x)/Q(x)$ where $P$ and $Q$ are polynomials, then $f$ is of class $C^{(q)}$ on any open set where $Q(x) \ne 0$.

It can happen that a function $f$ has the second partial derivatives $f_{ij}$ and $f_{ji}$, $i \ne j$, but that $f_{ij} \ne f_{ji}$. See Problem 6. However, this undesirable phenomenon cannot occur if $f$ is of class $C^{(2)}$. This is even true under the slightly weaker hypotheses of the following theorem, in which no assumption is made about the other second-order partial derivatives of $f$.

Theorem 3. If $f$ is of class $C^{(1)}$ and both $f_{ij}$ and $f_{ji}$ are continuous, $i \ne j$, then $f_{ij} = f_{ji}$.

Proof. Suppose first that $n = 2$. We need to show that $f_{12} = f_{21}$. Let $(x_0, y_0)$ be any point of $D$, and $\delta_0 > 0$ such that the $\delta_0$-neighborhood of $(x_0, y_0)$ is contained in $D$. For $0 < u < \delta_0/\sqrt{2}$ let

\[ A(u) = \frac{1}{u^2}\,\bigl[f(x_0 + u, y_0 + u) - f(x_0, y_0 + u) - f(x_0 + u, y_0) + f(x_0, y_0)\bigr]. \]

$A(u)$ is sometimes called the second difference quotient. Let

\[ g(x) = f(x, y_0 + u) - f(x, y_0) \]

for every $x$ such that $(x, y_0 + u), (x, y_0) \in D$. The domain of $g$ is an open subset of $E^1$ which contains the closed interval $[x_0, x_0 + u]$. Moreover, $g'(x) = f_1(x, y_0 + u) - f_1(x, y_0)$. Thus $g$ is of class $C^{(1)}$ since $f_1$ is continuous, and

\[ A(u) = \frac{1}{u^2}\,\bigl[g(x_0 + u) - g(x_0)\bigr]. \]

Applying the mean value theorem to $g$, there exists $\xi \in (x_0, x_0 + u)$ such that

\[ A(u) = \frac{1}{u}\,g'(\xi) = \frac{1}{u}\,\bigl[f_1(\xi, y_0 + u) - f_1(\xi, y_0)\bigr]. \]

See Fig. 2-4. Of course the number $\xi$ depends on $u$. Let

\[ h(y) = f_1(\xi, y) \]

for every $y$ such that $(\xi, y) \in D$. The domain of $h$ is open and contains $[y_0, y_0 + u]$. Moreover, $h'(y) = f_{12}(\xi, y)$, $h$ is of class $C^{(1)}$ since $f_{12}$ is continuous, and

\[ A(u) = \frac{1}{u}\,\bigl[h(y_0 + u) - h(y_0)\bigr]. \]

Another application of the mean value theorem gives

\[ A(u) = f_{12}(\xi, \eta) \]

for some $\eta \in (y_0, y_0 + u)$, depending on $u$. By reversing the roles of the first and second variables and repeating the proof, we find that

\[ A(u) = f_{21}(\xi^*, \eta^*) \]

for some $\xi^* \in (x_0, x_0 + u)$ and $\eta^* \in (y_0, y_0 + u)$.

[Figure 2-4: the square with vertices $(x_0, y_0)$, $(x_0 + u, y_0)$, $(x_0, y_0 + u)$, $(x_0 + u, y_0 + u)$.]



Since $f_{12}$ and $f_{21}$ are continuous, given $\epsilon > 0$ there exists $\delta \in (0, \delta_0)$ such that if $0 < u < \delta/\sqrt{2}$,

\[ |f_{12}(\xi, \eta) - f_{12}(x_0, y_0)| < \epsilon, \qquad |f_{21}(\xi^*, \eta^*) - f_{21}(x_0, y_0)| < \epsilon . \]

This shows that

\[ \lim_{u \to 0^+} A(u) = f_{12}(x_0, y_0) = f_{21}(x_0, y_0), \]

and proves the theorem if $n = 2$.


If $n > 2$ we need consider only the case $i < j$. Given $x_0 = (x_0^1, \dots, x_0^n) \in D$, let

\[ \phi(x, y) = f(x_0^1, \dots, x_0^{i-1}, x, x_0^{i+1}, \dots, x_0^{j-1}, y, x_0^{j+1}, \dots, x_0^n) \]

for every $(x, y)$ in some open set containing $(x_0^i, x_0^j)$. Applying the theorem to $\phi$, we find that

\[ f_{ij}(x_0) = \phi_{12}(x_0^i, x_0^j) = \phi_{21}(x_0^i, x_0^j) = f_{ji}(x_0). \qquad ∎ \]
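
The second difference quotient $A(u)$ used in the proof can be watched converging numerically. The following is only an illustrative sketch: the test function $f(x, y) = x^2 y^3$ and the base point $(1, 2)$ are assumptions chosen here, not part of the proof. For this $f$, $f_{12} = f_{21} = 6xy^2$, so the limit should be 24.

```python
def second_difference_quotient(f, x0, y0, u):
    """A(u) from the proof of Theorem 3, for a function f of two variables."""
    return (f(x0 + u, y0 + u) - f(x0, y0 + u) - f(x0 + u, y0) + f(x0, y0)) / u**2


def f(x, y):
    # Assumed sample function; f_12 = f_21 = 6*x*y**2.
    return x**2 * y**3


x0, y0 = 1.0, 2.0
exact_mixed_partial = 6 * x0 * y0**2  # value of f_12(x0, y0) = f_21(x0, y0)

for u in (1e-1, 1e-2, 1e-3):
    # A(u) should approach the common value of the mixed partials as u -> 0+.
    print(u, second_difference_quotient(f, x0, y0, u), exact_mixed_partial)
```

The step sizes are kept moderate on purpose: dividing by $u^2$ amplifies rounding error, so very small $u$ would make the floating-point estimate worse rather than better.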

From Theorem 3 it follows that in calculating any $q$th-order partial derivative of a function of class $C^{(q)}$ it is only the number of partial differentiations with respect to each of the variables which matters, and not the order in which they are taken. Thus for functions of class $C^{(3)}$, $f_{123} = f_{132}$, $f_{112} = f_{121}$, and so on.
We are now going to prove a stronger version of the mean value theorem, which is valid for functions of class $C^{(q)}$. Let $f$ be of class $C^{(q)}$ and $x, x_0 \in D$ such that the line segment joining $x$ and $x_0$ is contained in $D$. In particular, if $D$ is convex then $x$ and $x_0$ can be any pair of points of $D$. Let $h = x - x_0$, and define $\phi$ as in Proposition 8 by $\phi(t) = f(x_0 + th)$. The domain of $\phi$ is $\{t : x_0 + th \in D\}$, which is an open subset of $E^1$ containing the closed interval $[0, 1]$. By repeated application of Proposition 8 to $f, f_i, f_{ij}, \dots$, we find that:

\[ \phi'(t) = \sum_{i=1}^{n} f_i(x_0 + th)\,h^i, \]

\[ \phi''(t) = \sum_{i,j=1}^{n} f_{ij}(x_0 + th)\,h^i h^j, \]

\[ \phi^{(q)}(t) = \sum_{i_1, \dots, i_q = 1}^{n} f_{i_1 \cdots i_q}(x_0 + th)\,h^{i_1} \cdots h^{i_q}. \]

By Taylor's formula for functions of one variable (Section A-8) there exists $s \in (0, 1)$ such that

\[ \phi(1) = \phi(0) + \phi'(0) + \frac{1}{2!}\,\phi''(0) + \cdots + \frac{1}{(q-1)!}\,\phi^{(q-1)}(0) + \frac{1}{q!}\,\phi^{(q)}(s). \]


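
But $\phi(1) = f(x)$, $\phi(0) = f(x_0)$, and we have by substitution:

Taylor's formula with remainder:

\[ f(x) = f(x_0) + \sum_{i=1}^{n} f_i(x_0)(x^i - x_0^i) + \frac{1}{2!} \sum_{i,j=1}^{n} f_{ij}(x_0)(x^i - x_0^i)(x^j - x_0^j) + \cdots + \frac{1}{(q-1)!} \sum_{i_1, \dots, i_{q-1} = 1}^{n} f_{i_1 \cdots i_{q-1}}(x_0)\,h^{i_1} \cdots h^{i_{q-1}} + R_q(x), \tag{2-9} \]

where $h^i = x^i - x_0^i$, $s \in (0, 1)$, and

\[ R_q(x) = \frac{1}{q!} \sum_{i_1, \dots, i_q = 1}^{n} f_{i_1 \cdots i_q}(x_0 + sh)\,h^{i_1} \cdots h^{i_q}. \tag{2-10} \]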

If we ignore the remainder $R_q(x)$, the right-hand side is a polynomial in $h^1, \dots, h^n$ of degree $q - 1$. If the remainder is small this polynomial furnishes an approximation to $f(x)$. Notice that the first terms on the right-hand side are just the first degree approximation $f(x_0) + df(x_0) \cdot h$ to $f(x)$ considered in Section 2-2.
If $f$ is a polynomial of degree $q - 1$ then the Taylor approximation is exact; in other words, $R_q(x) = 0$ for every $x$.

Example 5. Let $f(x, y) = x^2 y$ and $(x_0, y_0) = (1, -1)$. Then

\[ f_1 = 2xy, \quad f_2 = x^2, \quad f_{11} = 2y, \]

\[ f_{12} = f_{21} = 2x, \quad f_{112} = f_{121} = f_{211} = 2, \]

and all other partial derivatives are 0. Here we have written $f_1$ for short in place of $f_1(x, y)$, and so on. $R_4(x, y) = 0$, and Taylor's formula becomes

\[ f(x, y) = f(1, -1) + f_1\,(x - 1) + f_2\,(y + 1) + \frac{1}{2!}\bigl[f_{11}(x - 1)^2 + 2 f_{12}(x - 1)(y + 1)\bigr] + \frac{1}{3!}\bigl[3 f_{112}(x - 1)^2 (y + 1)\bigr], \]

where the partial derivatives on the right-hand side are evaluated at $(1, -1)$. Thus

\[ x^2 y = -1 - 2(x - 1) + (y + 1) - (x - 1)^2 + 2(x - 1)(y + 1) + (x - 1)^2 (y + 1). \]
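
Since the remainder $R_4$ vanishes, the expansion in Example 5 is an identity between polynomials, and that can be checked with exact rational arithmetic. The grid of sample points below is an arbitrary choice made for this sketch; agreement on it illustrates the identity rather than proving it.

```python
from fractions import Fraction


def f(x, y):
    return x * x * y


def taylor_example5(x, y):
    """Right-hand side of the expansion of x^2*y about (1, -1) from Example 5."""
    a, b = x - 1, y + 1
    return -1 - 2 * a + b - a * a + 2 * a * b + a * a * b


# Arbitrary rational sample points; Fractions avoid any rounding error.
points = [(Fraction(p, 3), Fraction(q, 5)) for p in range(-6, 7) for q in range(-10, 11)]
expansion_is_exact = all(f(x, y) == taylor_example5(x, y) for x, y in points)
print(expansion_is_exact)
```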


Functions of class $C^{(q)}$ on a set. In many instances either $f$ is not of class $C^{(q)}$ on its entire domain $D$, or else one is interested only in its values on some subset of $D$.

Definition. Let $A$ be a nonempty subset of the domain of $f$. Then $f$ is of class $C^{(q)}$ on $A$ if there exists an open set $D_1$ containing $A$ and a function $F$ of class $C^{(q)}$ with domain $D_1$ such that $F(x) = f(x)$ for every $x \in A$.

If $A$ is open, then we may take $D_1 = A$. In that case $F = f|A$, where $f|A$ is the restriction of $f$ to $A$. When $A$ is an open subset of $D$, $f$ is of class $C^{(q)}$ on $A$ if and only if $f|A$ is a function of class $C^{(q)}$.
The function $F$ in the definition is called an extension of class $C^{(q)}$ of $f|A$. It is generally not easy to determine whether there is such an extension $F$. However, if $A$ has some simple geometrical shape, there is sometimes a method for explicitly constructing extensions. Let us illustrate this only in the following case, which will be of interest in Section 3-2. Let $A$ be a closed interval $[a, b] \subset E^1$, and for simplicity let $q = 1$. Suppose that $f$ is continuous on $[a, b]$, of class $C^{(1)}$ on $(a, b)$, and that the one-sided limits (Section A-6)

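
\[ l_1 = \lim_{x \to a^+} f'(x), \qquad l_2 = \lim_{x \to b^-} f'(x) \]

exist. Let

\[ g(x) = \begin{cases} l_1 & \text{if } x \le a, \\ f'(x) & \text{if } a < x < b, \\ l_2 & \text{if } x \ge b. \end{cases} \]

The function $g$ is continuous on $E^1$. Let

\[ F(x) = f(a) + \int_a^x g(u)\,du \]

for every $x \in E^1$. Then $F$ is of class $C^{(1)}$, and by the fundamental theorem of calculus, $f(x) = F(x)$ for every $x \in [a, b]$. Therefore $f$ is of class $C^{(1)}$ on $[a, b]$.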
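
The construction of $F$ can be sketched in a few lines. Because $g$ is constant outside $(a, b)$, the integral has a closed form there, and on $[a, b]$ it returns $f$ itself. The helper names `extend_c1` and `slope`, and the sample choice $f = \sin$ on $[0, \pi/2]$ (so that $l_1 = 1$, $l_2 = 0$), are assumptions made for this illustration only.

```python
import math


def extend_c1(f, a, b, l1, l2):
    """Return F(x) = f(a) + integral from a to x of g, where g = l1 for x <= a,
    g = f' on (a, b), and g = l2 for x >= b (l1, l2 are the one-sided limits of f')."""
    def F(x):
        if x < a:
            return f(a) + l1 * (x - a)   # integral of the constant l1
        if x > b:
            return f(b) + l2 * (x - b)   # f(b) - f(a) is the integral of f' over [a, b]
        return f(x)                      # fundamental theorem of calculus
    return F


# Assumed example: f(x) = sin x on [0, pi/2]; f'(x) = cos x, so l1 = 1 and l2 = 0.
a, b = 0.0, math.pi / 2
F = extend_c1(math.sin, a, b, l1=1.0, l2=0.0)


def slope(G, x, step=1e-6):
    """Central difference estimate of G'(x)."""
    return (G(x + step) - G(x - step)) / (2 * step)


# The estimated slopes at the endpoints should be close to l1 and l2,
# reflecting that F' = g is continuous across a and b.
print(slope(F, a), slope(F, b))
```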
By induction, if $f$ is continuous on $[a, b]$, of class $C^{(q)}$ on $(a, b)$, and

\[ \lim_{x \to a^+} f^{(q)}(x), \qquad \lim_{x \to b^-} f^{(q)}(x) \]

exist, then $f$ is of class $C^{(q)}$ on $[a, b]$.


Some general theorems about extensions of class $C^{(q)}$ were proved by H. Whitney. Let us cite without proof a result which is a special case of a theorem of Whitney [Ann. of Math. 35 (1934), 485]. Let $A$ be the closure of an open set $B$, and assume either that $A$ is convex or that its boundary fr $A$ is an $(n - 1)$-manifold of class $C^{(1)}$ (see Section 4-7). Let $f$ be of class $C^{(q)}$ on $B$, and continuous on $A$. Moreover, assume that for each $i_1, \dots, i_q$ there is a function $F_{i_1 \cdots i_q}$ continuous on $A$ such that $F_{i_1 \cdots i_q}(x)$ equals the $q$th-order partial derivative $f_{i_1 \cdots i_q}(x)$ for every $x \in B$. Then there exists a function $F$ of class $C^{(q)}$ on $E^n$ such that $F(x) = f(x)$ for every $x \in A$. Hence $f$ is of class $C^{(q)}$ on $A$.
Actually, what one needs to assume about $A$ to apply Whitney's theorem is the following: Every $x_0 \in A$ has a neighborhood $U$ such that any pair of points $x, y \in U \cap B$ can be joined in $B$ by a polygon of length no more than $c\,|x - y|$, where $c \ge 1$ depends only on $U$. If $A$ is convex, then the line segment joining $x$ and $y$ lies in $B$ and one may take $c = 1$.

For other extension theorems of Whitney, see Trans. Amer. Math. Soc. 36 (1934), and Bull. Amer. Math. Soc. 50 (1944).
*Functions of class $C^{(\infty)}$; real analytic functions. Let us say that $f$ is of class $C^{(\infty)}$ if $f$ is of class $C^{(q)}$ for every $q$. If $f$ is of class $C^{(\infty)}$ and $\lim_{q \to \infty} R_q(x) = 0$, then in place of Taylor's formula with remainder we may put the corresponding infinite series. This infinite series is called the Taylor series for $f(x)$ at $x_0$.
If $K$ is a convex subset of $D$ and $x_0 \in K$, then the following is a sufficient condition that $f(x)$ be the sum of its Taylor series for every $x \in K$. Suppose that there is a positive number $M$ whose $q$th power bounds every $q$th-order partial derivative of $f$, namely,

\[ |f_{i_1 \cdots i_q}(x)| \le M^q \tag{2-11} \]

for every $x \in K$, $q = 1, 2, \dots$, and $1 \le i_1, \dots, i_q \le n$. Since $K$ is convex, for each $x \in K$ the estimate (2-11) also holds at $x_0 + sh$. Since $|h^i| \le |h|$, from (2-10) we have

\[ |R_q(x)| \le \frac{n^q M^q |h|^q}{q!} = \frac{C^q}{q!}, \]

where $C = nM|h|$. Since $C^q/q! \to 0$ as $q \to \infty$,

\[ \lim_{q \to \infty} R_q(x) = 0 \]

for every $x \in K$, provided inequalities (2-11) hold.
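
The key step above is that $C^q/q!$ tends to 0 however large the constant $C = nM|h|$ is. A short computation makes the behavior visible: the bound first grows (while $q < C$) and then collapses. The values $n = 2$, $M = 3$, $|h| = 2$ are arbitrary assumptions for this sketch.

```python
import math


def remainder_bound(n, M, h_norm, q):
    """The bound C**q / q! on |R_q(x)|, with C = n * M * |h|."""
    C = n * M * h_norm
    return C**q / math.factorial(q)


# With C = 12 the bound increases up to about q = 12 and then decays rapidly.
for q in (1, 5, 10, 20, 40, 80):
    print(q, remainder_bound(n=2, M=3.0, h_norm=2.0, q=q))
```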


A function $f$ is called analytic if every $x_0 \in D$ has a neighborhood $U_{x_0}$ such that the Taylor series at $x_0$ converges to $f(x)$ for every $x \in U_{x_0}$. It would lead us away from our main objectives to discuss analytic functions in any detail. Therefore let us issue just one word of caution, namely, not every function of class $C^{(\infty)}$ is analytic. As an example, let $D = E^1$ and let

\[ f(x) = \begin{cases} \exp\left(-\dfrac{1}{x^2}\right) & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases} \]

Let us show that this function is of class $C^{(\infty)}$ and that $f^{(q)}(0) = 0$ for every $q = 1, 2, \dots$ For $x \ne 0$ the derivatives $f^{(q)}(x)$ can be computed by elementary calculus, and each $f^{(q)}$ is continuous on $E^1 - \{0\}$. It is at the point 0 where $f$ must be examined. Now

\[ \lim_{u \to +\infty} u^k \exp(-u) = 0 \quad \text{for each } k = 0, 1, 2, \dots, \tag{2-12} \]

a fact which we shall prove immediately below. If $x < 0$, then $f(x) = f'(x) = f''(x) = \cdots = 0$. Using (2-12) with $k = 0$, $\exp(-1/x^2) \to 0$ as $x \to 0^+$. Since $f(0) = 0$, $f$ is continuous. If $x > 0$,

\[ f'(x) = \frac{2}{x^3} \exp\left(-\frac{1}{x^2}\right), \quad\text{and for } 0 < x < 1, \quad 0 < f'(x) < \frac{2}{x^4} \exp\left(-\frac{1}{x^2}\right). \]

Using (2-12) with $k = 2$, $f'(x) \to 0$ as $x \to 0^+$. Therefore $\lim_{x \to 0} f'(x) = 0$. By Problem 3, $f'(0) = 0$ and $f$ is of class $C^{(1)}$. For each $q = 2, 3, \dots$, $f^{(q)}(x)$ is a polynomial in $1/x$ times $\exp(-1/x^2)$ for $x > 0$. Hence $\lim_{x \to 0} f^{(q)}(x) = 0$. By Problem 3 and induction on $q$, $f^{(q)}(0) = 0$ and $f \in C^{(q)}$ for every $q$. Thus $f \in C^{(\infty)}$. If we expand $f$ by Taylor's formula about 0, then $f(x) = R_q(x)$ for every $x$. If $x > 0$ the remainder $R_q(x)$ does not tend to 0 as $q \to \infty$. Hence $f$ is not an analytic function.
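
A small numerical experiment illustrates why this $f$ is so flat at 0 and yet not analytic. The sketch below assumes nothing beyond the definition of $f$; the sample points and exponents are arbitrary choices. It shows $f(x)/x^k$ shrinking as $x \to 0^+$ for several fixed $k$ (a consequence of (2-12)), while $f(x)$ itself is positive for $x > 0$, so it cannot equal its identically zero Taylor series.

```python
import math


def f(x):
    """exp(-1/x^2) for x > 0, and 0 for x <= 0."""
    return math.exp(-1.0 / x**2) if x > 0 else 0.0


# f(x) / x**k -> 0 as x -> 0+ for every fixed k: f vanishes faster than any power.
for k in (0, 2, 5, 10):
    print(k, [f(x) / x**k for x in (0.5, 0.2, 0.1)])

# Every Taylor coefficient of f at 0 is 0, so each Taylor polynomial is identically 0,
# yet f(0.5) = exp(-4) is strictly positive.
taylor_polynomial_value = 0.0
print(f(0.5), taylor_polynomial_value)
```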
Proof of (2-12). For each $u > 0$ let $\psi(u) = u^{-k} \exp u$. Then

\[ \psi'(u) = (u - k)\,u^{-k-1} \exp u, \qquad \psi''(u) = [u^2 - 2ku + k(k + 1)]\,u^{-k-2} \exp u. \]

The expression in brackets has a minimum when $u = k$ and is positive there. Hence $\psi''(u) > 0$ and for each $u_0$ (p. 24)

\[ \psi(u) \ge \psi(u_0) + \psi'(u_0)(u - u_0). \]

If $u_0 > k$, then $\psi'(u_0) > 0$ and the right-hand side tends to $+\infty$ as $u \to +\infty$. Hence $\psi(u) \to +\infty$ and $1/\psi(u) \to 0$ as $u \to +\infty$. ∎

PROBLEMS
1. Expand $f(x, y, z) = xyz$ by Taylor's formula about $x_0 = (1, -1, 0)$, with $q = 4$.
2. Let $f(x, y) = \psi(ax + by)$, where $a$ and $b$ are scalars and $\psi$ is of class $C^{(q)}$ in some open set containing 0. Show that Taylor's formula about $(0, 0)$ becomes

\[ f(x, y) = \sum_{m=0}^{q-1} \frac{\psi^{(m)}(0)}{m!} \sum_{j=0}^{m} \binom{m}{j} (ax)^j (by)^{m-j} + R_q(x, y), \]

where $\binom{m}{j}$ is the binomial coefficient (which equals the number of $j$-element subsets of a set with $m$ elements).
3. Let $f$ be continuous on an open set $D$ and of class $C^{(1)}$ on $D - \{x_0\}$. Suppose moreover that $l_i = \lim_{x \to x_0} f_i(x)$ exists for each $i = 1, \dots, n$. Prove that $l_i = f_i(x_0)$, and consequently that $f$ is of class $C^{(1)}$ on $D$. State and prove a corresponding result in case $q > 1$.
4. Prove the statement made in Example 3.
5. Let $f(x) = x^k \sin(1/x)$ if $x \ne 0$, and $f(0) = 0$. Show that:
(a) If $k = 0$, then $f$ is discontinuous at 0.
(b) If $k = 1$, then $f$ is of class $C^{(0)}$ (that is, continuous) but not differentiable at 0.
(c) If $k = 2$, then $f$ is differentiable but not of class $C^{(1)}$.
(d) What can you say for $k \ge 3$?
6. Let $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ if $(x, y) \ne (0, 0)$, and $f(0, 0) = 0$.
(a) If $(x, y) \ne (0, 0)$, find $f_{12}(x, y)$ and $f_{21}(x, y)$ by elementary calculus, and verify that they are equal.
(b) Using Problem 3 show that $f_1(0, 0) = f_2(0, 0) = 0$ and $f$ is of class $C^{(1)}$.
(c) Using the definition of partial derivative, show that $f_{12}(0, 0)$ and $f_{21}(0, 0)$ exist but are not equal. Why does this not contradict Theorem 3?
7. Given $n$ and $q$, how many solutions of the equation $i_1 + \cdots + i_n = q$ are there with $i_1, \dots, i_n$ nonnegative integers? With $i_1, \dots, i_n$ positive integers? What does this say about the number of different $q$th-order partial derivatives of a function of class $C^{(q)}$?

2-4 CONVEX AND CONCAVE FUNCTIONS (continued)

If a function $f$ is sufficiently smooth, then $f$ can be tested for convexity or concavity by using calculus. To begin with let us assume that $f$ is differentiable. Later in the section we make the stronger assumption that $f$ is of class $C^{(2)}$ and obtain a test, in terms of the second-order partial derivatives (Theorem 4), which reduces when $n = 1$ to the second derivative test given in Section 1-5.
Figure 1-14 in Section 1-5 suggests that convexity of a differentiable function $f$ is equivalent to the fact that $f$ lies above its tangent hyperplane at each point $(x_0, f(x_0))$. The following proposition shows that this is indeed so.

Proposition 9a. Let $f$ be differentiable on a convex set $K$. Then $f$ is convex on $K$ if and only if

\[ f(x) \ge f(x_0) + df(x_0) \cdot (x - x_0) \tag{2-13a} \]

for every $x_0, x \in K$.

Proof. Let $f$ be convex on $K$, and let $x_0, x$ be any two points of $K$. Let $h = x - x_0$ and $t \in (0, 1)$. By definition of convex function,

\[ f(x_0 + th) \le t f(x_0 + h) + (1 - t) f(x_0). \]

This inequality may be rewritten as

\[ f(x_0 + th) - f(x_0) \le t\,[f(x_0 + h) - f(x_0)]. \tag{2-14} \]

Subtracting $t\,df(x_0) \cdot h$ from both sides and dividing by $t$,

\[ \frac{f(x_0 + th) - f(x_0) - t\,df(x_0) \cdot h}{t} \le f(x_0 + h) - f(x_0) - df(x_0) \cdot h. \]

The left-hand side tends to 0 as $t \to 0^+$. Hence the right-hand side is nonnegative, which says that (2-13a) holds.
Conversely, assume that (2-13a) holds for every $x_0, x \in K$. Let $x_1, x_2 \in K$ and let $t \in (0, 1)$. Let

\[ x_0 = t x_1 + (1 - t) x_2, \qquad h = x_1 - x_0. \]

A little manipulation shows that

\[ x_2 = x_0 - \frac{t}{1 - t}\,h. \]

By (2-13a) we have

\[ f(x_1) \ge f(x_0) + df(x_0) \cdot h, \]
\[ f(x_2) \ge f(x_0) + df(x_0) \cdot \left(-\frac{t}{1 - t}\,h\right). \]

Multiplying by $t/(1 - t)$ in the first inequality and adding, we get

\[ \frac{t}{1 - t}\,f(x_1) + f(x_2) \ge \left(\frac{t}{1 - t} + 1\right) f(x_0), \quad\text{or}\quad t f(x_1) + (1 - t) f(x_2) \ge f(x_0). \]

But this is just the inequality (1-8a) in the definition of convex function. We assumed that $t \in (0, 1)$, but if $t = 0$ or 1, (1-8a) trivially holds. Therefore $f$ is convex on $K$. ∎
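
Inequality (2-13a) says the graph of a convex function never dips below any of its tangent hyperplanes. The sketch below samples this numerically for one assumed function, $f(x, y) = x^2 + xy + y^2 + e^x$ (a positive definite quadratic plus a convex term, hence convex on $E^2$); the function, its hand-computed gradient, and the random sample are illustrative choices, and a sampled check is evidence, not a proof.

```python
import math
import random


def f(x, y):
    # Assumed convex test function on E^2.
    return x * x + x * y + y * y + math.exp(x)


def grad_f(x, y):
    # Components of df, computed by hand for the function above.
    return (2 * x + y + math.exp(x), x + 2 * y)


def tangent_gap(p0, p):
    """f(x) - [f(x0) + df(x0).(x - x0)]; nonnegative for convex f by (2-13a)."""
    g = grad_f(*p0)
    return f(*p) - f(*p0) - (g[0] * (p[0] - p0[0]) + g[1] * (p[1] - p0[1]))


random.seed(0)
pts = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(200)]
min_gap = min(tangent_gap(p0, p) for p0 in pts for p in pts)
print(min_gap)  # smallest gap over all sampled pairs; expected to be >= 0 up to rounding
```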
By sharpening the inequality in (2-13a) we get a necessary and sufficient
condition for strict convexity.

Proposition 9b. Let $f$ be differentiable on a convex set $K$. Then $f$ is strictly convex on $K$ if and only if

\[ f(x) > f(x_0) + df(x_0) \cdot (x - x_0) \tag{2-13b} \]

for every $x, x_0 \in K$ with $x \ne x_0$.

Proof. Let $f$ be strictly convex on $K$. In particular, $f$ is convex on $K$ and (2-13a) holds for every $x, x_0 \in K$. Suppose that $x \ne x_0$, and let $h = x - x_0$. For every $t \in (0, 1)$

\[ df(x_0) \cdot (th) \le f(x_0 + th) - f(x_0), \]

by (2-13a) applied with $x$ replaced by $x_0 + th$. But according to (2-14), which holds strictly since $f$ is strictly convex,

\[ f(x_0 + th) - f(x_0) < t\,[f(x_0 + h) - f(x_0)]. \]

Therefore

\[ t\,df(x_0) \cdot h < t\,[f(x_0 + h) - f(x_0)]. \]

Upon dividing both sides by $t$ we get (2-13b).

The proof of the converse is the same as for Proposition 9a, all inequalities now being strict.

For concave functions the inequality signs must be reversed in (2-13a) and (2-13b). The first of these inequalities then says geometrically that $f$ lies below its tangent hyperplane at $(x_0, f(x_0))$, and the second says that this is strictly true except at the point $(x_0, f(x_0))$ itself.
We can now easily prove Proposition 5, which was previously stated in Section 1-5.

Proof of Proposition 5. Let $f$ be convex on $K$, where $K \subset E^1$ is an interval. Let $x, y \in K$, $y < x$. By (2-13a) applied with $x_0 = y$,

\[ f(x) - f(y) \ge f'(y)(x - y). \]

By (2-13a) applied with $x_0 = x$,

\[ f(x) - f(y) \le f'(x)(x - y). \]

Therefore $f'(y)(x - y) \le f'(x)(x - y)$, from which $f'(y) \le f'(x)$. This proves that $f'$ is nondecreasing on $K$. If $f$ is strictly convex, then each of these inequalities is strict. In particular, $f'(y) < f'(x)$, which shows that $f'$ is increasing on $K$.
Conversely, assume that $f'$ is nondecreasing on $K$. Let $x_0, x \in K$, and suppose first that $x_0 < x$. By the mean value theorem there exists $y \in (x_0, x)$ such that

\[ f(x) - f(x_0) = f'(y)(x - x_0). \]

Since $f'$ is nondecreasing, $f'(x_0) \le f'(y)$. Therefore

\[ f(x) - f(x_0) \ge f'(x_0)(x - x_0), \]

which is equivalent to inequality (2-13a). Similarly, (2-13a) holds if $x < x_0$. By Proposition 9a, $f$ is convex on $K$. If $f'$ is increasing, then $f'(x_0) < f'(y)$, and the proof shows that (2-13b) holds. By Proposition 9b, $f$ is strictly convex on $K$. ∎

Let us next prove a theorem which provides a convenient test for concavity or convexity of a function of class $C^{(2)}$. Let $f$ be of class $C^{(2)}$ on an open set $D$. Let $Q$ be the function with domain $D \times E^n$ defined by the formula

\[ Q(x, h) = \sum_{i,j=1}^{n} f_{ij}(x)\,h^i h^j. \tag{2-15} \]

It is the sign of $Q$ which determines whether $f$ is convex, concave, or neither.

Given $x$, (2-15) defines a function on $E^n$ which we denote by $Q(x,\ )$. The function $Q(x,\ )$ is a quadratic polynomial which in linear algebra is called the quadratic form corresponding to the $n \times n$ symmetric matrix $(f_{ij}(x))$ of second partial derivatives. Theorem 3 guarantees that this matrix is symmetric.
Let us write $Q(x,\ ) \ge 0$ if $Q(x, h) \ge 0$ for every $h$, and $Q(x,\ ) > 0$ if $Q(x, h) > 0$ for every $h \ne 0$. Note that $Q(x, 0) = 0$. In the theory of quadratic forms $Q(x,\ )$ is called positive semidefinite if $Q(x,\ ) \ge 0$, and positive definite if $Q(x,\ ) > 0$.
Similarly, we write $Q(x,\ ) \le 0$ if $Q(x, h) \le 0$ for every $h$, and $Q(x,\ ) < 0$ if $Q(x, h) < 0$ for every $h \ne 0$. The corresponding terms are negative semidefinite and negative definite.
If $Q(x,\ )$ has values of both signs, then it is indefinite.

Theorem 4. Let $f$ be of class $C^{(2)}$ on an open, convex set $K$. Then:

(a) $f$ is convex on $K$ if and only if $Q(x,\ ) \ge 0$ for every $x \in K$.
(a') If $Q(x,\ ) > 0$ for every $x \in K$, then $f$ is strictly convex on $K$.
(b) $f$ is concave on $K$ if and only if $Q(x,\ ) \le 0$ for every $x \in K$.
(b') If $Q(x,\ ) < 0$ for every $x \in K$, then $f$ is strictly concave on $K$.

Proof. Since $K$ is convex we may use Taylor's formula with $q = 2$ and any pair of points $x_0, x \in K$:

\[ f(x) = f(x_0) + df(x_0) \cdot h + \tfrac{1}{2}\,Q(x_0 + sh, h), \tag{2-16} \]

where $s \in (0, 1)$ and $h = x - x_0$. Let us first prove (a'). By hypothesis, $Q(y,\ ) > 0$ for every $y \in K$, and in particular for $y = x_0 + sh$. Therefore $Q(x_0 + sh, h) > 0$ if $h \ne 0$, from which

\[ f(x) > f(x_0) + df(x_0) \cdot h. \]

By Proposition 9b, $f$ is strictly convex on $K$.
Let us next prove (a). If $Q(x,\ ) \ge 0$ for every $x \in K$, then the same reasoning shows that

\[ f(x) \ge f(x_0) + df(x_0) \cdot h. \]

By Proposition 9a, $f$ is convex on $K$. On the other hand, if it is not true that $Q(x,\ ) \ge 0$ for every $x \in K$, then $Q(x_0, h_0) < 0$ for some $x_0 \in K$ and $h_0 \ne 0$. Since $f$ is of class $C^{(2)}$, $Q(\ , h_0)$ is continuous on $K$. Hence there exists $\delta > 0$ such that $Q(y, h_0) < 0$ for every $y$ in the $\delta$-neighborhood of $x_0$. Let $h = c h_0$, where $c > 0$ is small enough that $|h| < \delta$, and let $x = x_0 + h$. Since $Q(x_0 + sh,\ )$ is quadratic,

\[ Q(x_0 + sh, h) = c^2\,Q(x_0 + sh, h_0) < 0. \]

From (2-16)

\[ f(x) < f(x_0) + df(x_0) \cdot h. \]

By Proposition 9a, $f$ is not convex on $K$.
This proves (a) and (a'). Parts (b) and (b') follow respectively from (a) and (a') by considering $-f$. ∎

If $n = 1$, then $Q(x, h) = f''(x)h^2$. The sign of $f''(x)$ determines whether $Q(x,\ )$ is positive definite, negative definite, or 0. For $n = 1$, Theorem 4 restates, in a slightly weaker form, the second derivative test given in Section 1-5 (corollary to Proposition 5).

Example 1. Let $f$ be a homogeneous quadratic polynomial,

\[ f(x) = \sum_{i,j=1}^{n} c_{ij}\,x^i x^j \]

for each $x \in E^n$, where the $n \times n$ matrix $(c_{ij})$ is symmetric. Then $f_{ij}(x) = 2c_{ij}$ and $Q(x, h) = 2f(h)$. Hence $f$ is convex on $E^n$ in case $f(x) \ge 0$ for every $x$, and concave in case $f(x) \le 0$ for every $x$. If $f$ has values of both signs, then $f$ is neither convex nor concave.

Example 2. Let $f(x) = \exp[g(x)]$, where $g$ is of class $C^{(2)}$ and convex on $K$. Then

\[ f_i(x) = \exp[g(x)]\,g_i(x). \]

Using the product rule, we get

\[ f_{ij}(x) = \exp[g(x)]\,[g_i(x) g_j(x) + g_{ij}(x)]. \]

Writing for short $g_i$ for $g_i(x)$, and so on,

\[ Q(x, h) = \exp[g(x)] \left[ \sum_{i,j=1}^{n} g_i g_j h^i h^j + \sum_{i,j=1}^{n} g_{ij} h^i h^j \right]. \]

The last term on the right-hand side is nonnegative since $g$ is convex. Moreover,

\[ \sum_{i,j=1}^{n} g_i g_j h^i h^j = \left( \sum_{i=1}^{n} g_i h^i \right)^2 \ge 0, \]

and $\exp[g(x)] > 0$. Hence $Q(x, h) \ge 0$ for every $x \in K$ and every $h$. Therefore $f$ is convex on $K$. This example is a special case of Problem 4, since the exponential function is increasing and convex on $E^1$.

In both of these examples we could determine the sign of $Q(x,\ )$ by direct calculations. When this is not feasible, one of the following tests for definiteness may be applied.

I. ($n = 2$). In this case

\[ Q(x, y, h, k) = f_{11} h^2 + 2 f_{12} hk + f_{22} k^2, \]

where we have written $(h, k)$ for $(h^1, h^2)$ and $f_{ij}$ for $f_{ij}(x, y)$.
If the discriminant $-(f_{11} f_{22} - f_{12}^2)$ is negative, the equation $Q(x, y, h, k) = 0$ has no roots $(h, k)$ except the trivial one $(0, 0)$. The sign of $f_{11}$ and $f_{22}$ determines whether $Q(x, y,\ ,\ ) > 0$ or $Q(x, y,\ ,\ ) < 0$.
If $f_{11} f_{22} - f_{12}^2 < 0$, then $\{(h, k) : Q(x, y, h, k) = 0\}$ consists of two lines intersecting at $(0, 0)$. They divide the $(h, k)$-plane into four parts, on two of which $Q(x, y, h, k) > 0$ and on the other two of which $Q(x, y, h, k) < 0$. In this case, $Q(x, y,\ ,\ )$ is indefinite. Thus

$Q(x, y,\ ,\ ) > 0$ if $f_{11} > 0$, $f_{22} > 0$, $f_{11} f_{22} - f_{12}^2 > 0$.
$Q(x, y,\ ,\ ) < 0$ if $f_{11} < 0$, $f_{22} < 0$, $f_{11} f_{22} - f_{12}^2 > 0$.
$Q(x, y,\ ,\ )$ is indefinite if $f_{11} f_{22} - f_{12}^2 < 0$.

If $f_{11} f_{22} - f_{12}^2 = 0$, then

\[ Q(x, y, h, k) = c\,(ah + bk)^2, \]

where the numbers $a, b, c$ satisfy

\[ ca^2 = f_{11}, \quad cab = f_{12}, \quad cb^2 = f_{22}. \]

If $c > 0$, then $Q(x, y,\ ,\ )$ is positive semidefinite but not positive definite. Similarly, if $c < 0$ then $Q(x, y,\ ,\ )$ is negative semidefinite.

Example 3. Let $f(x, y) = \tfrac{1}{6}(x^3 + y^3) + xy$. Then $f_{11} = x$, $f_{22} = y$, $f_{11} f_{22} - f_{12}^2 = xy - 1$. Hence $f$ is strictly convex on the part of the first quadrant above the hyperbola $xy = 1$, and strictly concave on the part of the third quadrant below this hyperbola.
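
Criterion I translates directly into a small decision procedure. The function name `classify_2x2` and the three sample points are assumptions made for this sketch; the Hessian entries used are those of Example 3, namely $f_{11} = x$, $f_{12} = 1$, $f_{22} = y$. Exact comparison with 0 is adequate for an illustration but would need a tolerance in serious floating-point work.

```python
def classify_2x2(f11, f12, f22):
    """Criterion I for Q(h, k) = f11*h^2 + 2*f12*h*k + f22*k^2."""
    det = f11 * f22 - f12 * f12
    if det > 0:
        # f11 and f22 are nonzero with the same sign here.
        return "positive definite" if f11 > 0 else "negative definite"
    if det < 0:
        return "indefinite"
    # det == 0: Q = c*(a*h + b*k)^2, and the sign of c is the sign of f11 or f22.
    if f11 > 0 or f22 > 0:
        return "positive semidefinite"
    if f11 < 0 or f22 < 0:
        return "negative semidefinite"
    return "zero"


def hessian_example3(x, y):
    # f(x, y) = (x^3 + y^3)/6 + x*y  gives  f11 = x, f12 = 1, f22 = y.
    return x, 1.0, y


print(classify_2x2(*hessian_example3(2.0, 1.0)))    # first quadrant, x*y > 1
print(classify_2x2(*hessian_example3(-2.0, -1.0)))  # third quadrant, x*y > 1
print(classify_2x2(*hessian_example3(0.5, 0.5)))    # x*y < 1
```

Theorem 4 needs the sign of $Q$ at every point of a convex set, so a pointwise check like this only indicates where convexity or concavity can hold.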

II. For any $n$ let

\[ d_1(x) = f_{11}(x), \quad d_2(x) = \det \begin{pmatrix} f_{11}(x) & f_{12}(x) \\ f_{21}(x) & f_{22}(x) \end{pmatrix}, \quad \dots, \quad d_n(x) = \det\bigl(f_{ij}(x)\bigr). \]

These are called the principal minor determinants of the matrix $(f_{ij}(x))$. The $m$th principal minor $d_m(x)$ is the determinant of the matrix obtained by deleting the last $n - m$ rows and columns. The determinant $d_n(x)$ is called the Hessian of $f$ at $x$.
Let us state without proof the following criterion:

$Q(x,\ ) > 0$ iff $d_m(x) > 0$ for $m = 1, \dots, n$.
$Q(x,\ ) < 0$ iff $(-1)^m d_m(x) > 0$ for $m = 1, \dots, n$.

For a proof of the first of these two statements, see reference [3], especially pp. 140, 147. The second follows from the first by considering $-Q$. Here iff is an abbreviation for "if and only if."
Criterion II is fairly convenient for small values of $n$, but becomes unwieldy for larger ones. This is because of the very large number of operations required to calculate the determinant of an $m \times m$ matrix even for moderately small $m$.
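
Criterion II can be sketched as follows. The determinant is computed here by Gaussian elimination with exact rational arithmetic, which is one possible choice and not something the text prescribes; the helper names and the sample matrix `H` are assumptions for illustration.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def leading_minors(matrix):
    """d_1, ..., d_n: determinants of the upper-left m x m blocks."""
    return [det([row[:m] for row in matrix[:m]]) for m in range(1, len(matrix) + 1)]


def is_positive_definite(matrix):
    return all(d > 0 for d in leading_minors(matrix))


def is_negative_definite(matrix):
    return all((-1) ** m * d > 0 for m, d in enumerate(leading_minors(matrix), start=1))


# Assumed sample symmetric matrix standing in for (f_ij(x)) at one point x.
H = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
print(leading_minors(H), is_positive_definite(H))
```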
III. In linear algebra it is shown that any quadratic form can be written as a linear combination of squares by suitably choosing a new orthonormal basis for $E^n$. This fact is also proved in Section 4-8 below. Therefore

\[ Q(x, h) = \sum_{i=1}^{n} \lambda_i(x)\,[\eta^i(x)]^2, \tag{2-17} \]

where for each $h \in E^n$, $\eta^1(x), \dots, \eta^n(x)$ are the components of $h$ with respect to some orthonormal basis $\{v_1(x), \dots, v_n(x)\}$ for $E^n$,

\[ h = \sum_{i=1}^{n} \eta^i(x)\,v_i(x). \]

The numbers $\lambda_1(x), \dots, \lambda_n(x)$ are just the characteristic values of the matrix $(f_{ij}(x))$.
If $\lambda_i(x) > 0$ for each $i = 1, \dots, n$, then from (2-17) $Q(x, h) > 0$ unless $\eta^i(x) = 0$ for each $i$ (that is, unless $h = 0$). In this case $Q(x,\ )$ is positive definite. Conversely, if $h = v_i(x)$, then $Q(x, h) = \lambda_i(x)$. Therefore, if $Q(x,\ ) > 0$, then in particular $Q(x, v_i(x)) > 0$, and $\lambda_i(x) > 0$. This proves the first of the following statements:

$Q(x,\ ) > 0$ iff $\lambda_i(x) > 0$ for $i = 1, \dots, n$.
$Q(x,\ ) < 0$ iff $\lambda_i(x) < 0$ for $i = 1, \dots, n$.

The second is proved in the same way. Replacing on both sides "$> 0$" by "$\ge 0$" we get a criterion for nonnegative semidefiniteness, and replacing "$< 0$" by "$\le 0$," one for nonpositive semidefiniteness.
If $n$ is fairly large it is better, instead of criterion II, to try some numerical method for putting $Q(x,\ )$ in the form (2-17).
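
One classical numerical method of this kind is the cyclic Jacobi rotation method, which repeatedly applies plane rotations (orthonormal changes of basis) until the matrix is diagonal; the diagonal entries are then the characteristic values $\lambda_i$. The text does not name a method, so the choice of Jacobi rotations, the function name, the tolerances, and the sample matrix are all assumptions of this sketch.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Characteristic values of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break  # numerically diagonal
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Rotation angle chosen so that the new (p, q) entry is zero.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (update columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A (update rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


# Assumed sample symmetric matrix; its characteristic values are 2 - sqrt(2), 2, 2 + sqrt(2).
H = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
eigenvalues = jacobi_eigenvalues(H)
print(eigenvalues, all(lam > 0 for lam in eigenvalues))  # all positive: Q is positive definite
```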
PROBLEMS
1. Use Theorem 4 to determine whether f is convex on K, concave on K, or neither.
Unless otherwise indicated, $K = E^2$ or $E^3$.
(Ah fGyy, 2 = a? iy? — 42?
(b) f(z, Yy; z) Se Te ane ee
(oy f{@,y) = (ea y+ 1)’,K = {@,y):2+ y+ 1 > 0}.
GG giz) = exp (0 ayy? 27).
(e) $f(x, y) = \exp(xy)$.
In which cases is the convexity or concavity strict?
2. Let $f(x, y) = \phi(x^2 + y^2)$, where $\phi$ is of class $C^{(2)}$, increasing and concave. Show that $f$ is convex on the circular disk $x^2 + y^2 < a^2$ if and only if $\phi'(u) + 2u\phi''(u) \ge 0$ whenever $0 < u < a^2$.
3. Using Problem 2, find the largest $a$ such that $f$ is convex on $x^2 + y^2 < a^2$.
(a) $f(x, y) = \log(1 + x^2 + y^2)$. (b) $f(x, y) = \sin(x^2 + y^2)$.
4. Let $K$ be an open, convex set, and $g$ a function which is convex and of class $C^{(2)}$ on $K$. Let $I \subset E^1$ be an interval such that $g(x) \in I$ for every $x \in K$. Let $\phi$ be a function which is of class $C^{(2)}$, nondecreasing, and convex on $I$. Let $f$ be the composite of $\phi$ and $g$, $f(x) = \phi[g(x)]$ for every $x \in K$.
(a) Using Theorem 4, prove that $f$ is convex on $K$.
(b) Prove the same result without the assumption that $\phi$ and $g$ are of class $C^{(2)}$, by using directly the definitions of convex function and nondecreasing function.
5. Using Problem 4, show that each of the following functions is convex on $E^n$:
(a) f(x) = |x|’,p = 1. (De Ryeae let XX) aie.
(eo) fxe=) (ie lx|7)?*, 9 > 1. [Hint: First consider p = 1.]
6. Let $f$ be of class $C^{(1)}$, decreasing, and convex on a semi-infinite interval $(c, \infty)$. Prove that if $f(x) > 0$ for every $x > c$, then $\lim_{x \to +\infty} f'(x) = 0$. [Hint: Let $l = \sup\{f'(x) : x > c\}$. Either $l = 0$ or $l < 0$. Using the fundamental theorem of calculus, show that if $l < 0$, then $\lim_{x \to +\infty} f(x) = -\infty$.]
7. Let $K$ be a convex set with nonempty interior int $K$, and $x^*$ some point of int $K$. Prove each of the following:
(a) For every $x \in K$ and $s \in (0, 1]$, $sx^* + (1 - s)x \in$ int $K$.
(b) If $f$ is continuous on $K$ and convex on int $K$, then $f$ is convex on $K$. [Hint: Use (a).]
(c) Suppose that fr $K$ contains no line segment, and that $f$ is continuous on $K$ and strictly convex on int $K$. Then $f$ is strictly convex on $K$.

2-5 RELATIVE EXTREMA


Let $A$ be some subset of $E^n$ and $f$ a function whose domain contains $A$. Let us consider the problem of minimizing or maximizing $f(x)$ on $A$.

Definitions. If $x_0$ is a point of $A$ such that $f(x_0) \le f(x)$ for every $x \in A$, then $f$ has an absolute minimum at $x_0$. The number

\[ f(x_0) = \min\{f(x) : x \in A\} \]

is the minimum value of $f$ on $A$. (Of course, there need not be any such point $x_0$. However, if $A$ is a compact set, then by the corollary to Theorem A-6, any continuous function has an absolute minimum at some point of $A$.) If $f(x_0) < f(x)$ for every $x \in A$ except $x_0$, then $f$ has a strict absolute minimum at $x_0$.

We say that $f$ has a relative minimum at $x_0$ if there is a neighborhood $U$ of $x_0$ such that $f(x_0) \le f(x)$ for every $x \in A \cap U$. If $U$ can be so chosen that $f(x_0) < f(x)$ for every $x \in A \cap U$ except $x_0$, then $f$ has a strict relative minimum at $x_0$.

The notions of absolute maximum and relative maximum are defined similarly by reversing the inequality signs. We say extremum for either maximum or minimum.
In some cases the extrema can be found by inspection. For example, if $A = E^n$ and $f(x) = |x|$, then $f(0) = 0$ and $f(x) > 0$ for every $x \ne 0$. Hence $f$ has a strict absolute minimum at 0. Since this function is not differentiable at 0, the minimum could not have been found through the use of calculus.
If $f$ and $A$ are smooth enough, the relative extrema can be found by using calculus. In the present section we assume that $A$ is an open set. In Section 4-8 we shall learn a technique for finding the extrema when $A$ is a smooth submanifold of $E^n$.

Definition. A point $x_0$ is a critical point if $df(x_0) = 0$.

If $f$ is a differentiable function, one need look only among the critical points for relative extrema.

Proposition 10. If $f$ has a relative extremum at $x_0$ and $f$ is differentiable at $x_0$, then $x_0$ is a critical point.

Proof. Given a direction $v$, let $\phi(t) = f(x_0 + tv)$ for every $t$ in some open subset of $E^1$ containing 0. Then $\phi$ has a relative extremum at 0, and consequently by elementary calculus $\phi'(0) = 0$. But $\phi'(0) = df(x_0) \cdot v$ is the derivative at $x_0$ in the direction $v$. Hence $df(x_0) \cdot v = 0$ for every $v$, which implies that $df(x_0) = 0$. ∎

It is illuminating to look at this result in a slightly different way. In place of the covector $df(x)$ let us consider the vector grad $f(x)$ with the same components (p. 43). Suppose that $x$ is not a critical point. Then grad $f(x) \ne 0$. Let us find the direction $v$ for which the directional derivative at $x$ is maximum. By Cauchy's inequality,

\[ \operatorname{grad} f(x) \cdot v \le |\operatorname{grad} f(x)|\,|v|, \]

and equality holds if and only if $v = v(x)$, where

\[ v(x) = \frac{1}{|\operatorname{grad} f(x)|}\,\operatorname{grad} f(x). \]

This direction is called the direction of the gradient at $x$, and is the one which maximizes the directional derivative. The maximum value of the directional derivative is

\[ \operatorname{grad} f(x) \cdot v(x) = \frac{1}{|\operatorname{grad} f(x)|}\,\operatorname{grad} f(x) \cdot \operatorname{grad} f(x) = |\operatorname{grad} f(x)|. \]

By going a short distance from $x$ in the direction $v(x)$, $f(x)$ is increased. Hence $f$ cannot have a relative maximum at $x$. The direction $-v(x)$ minimizes the directional derivative at $x$. In the same way, $f$ cannot have a relative minimum at $x$. This confirms the conclusion of Proposition 10.
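
The claim that the gradient direction maximizes the directional derivative among unit vectors is easy to test numerically. In this sketch the function $f(x, y) = x^2 y + y$, its hand-computed gradient, the base point $(1, 2)$, and the sampling of 3600 unit vectors are all assumptions made for illustration.

```python
import math


def grad_f(x, y):
    # Gradient of the assumed example f(x, y) = x^2 * y + y.
    return (2 * x * y, x * x + 1)


def directional_derivative(g, v):
    """grad f . v for a direction (unit vector) v."""
    return g[0] * v[0] + g[1] * v[1]


g = grad_f(1.0, 2.0)                      # equals (4, 2), which is nonzero: not a critical point
norm = math.hypot(*g)
v_star = (g[0] / norm, g[1] / norm)       # direction of the gradient
best = directional_derivative(g, v_star)  # should equal |grad f|

# Compare against unit vectors sampled around the circle.
samples = [(math.cos(2 * math.pi * k / 3600), math.sin(2 * math.pi * k / 3600))
           for k in range(3600)]
largest_sampled = max(directional_derivative(g, v) for v in samples)
print(best, norm, largest_sampled)
```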
This discussion is the basis for the gradient method (or method of steepest ascent) for finding maxima. A good intuitive picture of the gradient method may be obtained by thinking of an ambitious mountain climber who always takes the steepest direction. Let us suppose that the surface of the mountain can be represented in the form $\{(x, y, f(x, y)) : (x, y) \in A\}$, where $f$ is a smooth function. In particular, no vertical cliffs, overhangs, or sharp ridges are allowed. If the mountain has the shape indicated in Fig. 2-5(a), that is, if $f$ is a strictly concave function, then it appears that the summit will be reached by this technique. However, if the mountain has a more complicated shape, the climber may reach a false summit or a saddle as in Fig. 2-5(b). Once he reaches any critical point, the gradient method tells him to stay there.

[Figure 2-5: panels (a) and (b).]

The gradient method will be defined more precisely later (Section 3-4).
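
Ahead of the precise definition, the climber's rule can be sketched as a discrete iteration: take a small step along the gradient and repeat until the gradient is numerically zero. This fixed-step scheme is an assumption of the sketch and is not claimed to be the formulation of Section 3-4; the strictly concave test function, the step size, and the stopping tolerance are likewise illustrative choices.

```python
def grad_f(x, y):
    # Assumed example: f(x, y) = -(x - 1)^2 - 2*(y + 0.5)^2,
    # strictly concave with its summit at (1, -0.5).
    return (-2 * (x - 1), -4 * (y + 0.5))


def steepest_ascent(grad, start, step=0.1, tol=1e-10, max_iter=10_000):
    """Repeatedly move a short distance along the gradient."""
    x, y = start
    for _ in range(max_iter):
        gx, gy = grad(x, y)
        if gx * gx + gy * gy < tol * tol:
            break  # numerically a critical point: the method stays there
        x, y = x + step * gx, y + step * gy
    return x, y


summit = steepest_ascent(grad_f, start=(5.0, 5.0))
print(summit)
```

Started exactly at a saddle or a false summit, the same loop would stop immediately, which is the behavior the text warns about.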
For functions which are convex or concave, Proposition 10 has a converse.

Theorem 5. Let $f$ be differentiable and convex on an open convex set $A$ and $x_0 \in A$ a critical point. Then $f$ has an absolute minimum at $x_0$.

Proof. Since $df(x_0) = 0$, $f(x) \ge f(x_0)$ for every $x \in A$ by Proposition 9a. ∎

Similarly, any differentiable concave function has an absolute maximum at any critical point.

Corollary. A differentiable function which is strictly convex (or strictly concave) has at most one critical point.

Proof. Let $f$ be strictly convex on $A$. By Theorem 5, $f$ has an absolute minimum at each of its critical points, so it suffices to rule out two such points. Suppose that $f$ has an absolute minimum at distinct critical points $x_0, x_1 \in A$,

\[ f(x_0) = f(x_1) \le f(x) \]

for every $x \in A$. Since $df(x_0) = 0$, by Proposition 9b (with $x = x_1$), $f(x_1) > f(x_0)$. This is a contradiction. ∎

For functions which are neither convex nor concave, the theory of relative extrema is more complicated. We shall consider only functions of class $C^{(2)}$ on $A$. The main result is:

Theorem 6. Let $f$ be of class $C^{(2)}$ on an open set $A$, and $x_0 \in A$ a critical point. Then:

(a) $Q(x_0,\ ) \ge 0$ is necessary for a relative minimum at $x_0$.
(a') $Q(x_0,\ ) > 0$ is sufficient for a strict relative minimum at $x_0$.
(b) $Q(x_0,\ ) \le 0$ is necessary for a relative maximum at $x_0$.
(b') $Q(x_0,\ ) < 0$ is sufficient for a strict relative maximum at $x_0$.

Proof. Let f have a relative minimum at x_0. Then there exists a neighbor-
hood U of x_0 such that

f(x) ≥ f(x_0) for every x ∈ U ∩ A.

Since A is open, we may assume that U ⊂ A. Since df(x_0) = 0, Taylor's
formula with q = 2 becomes

f(x) = f(x_0) + (1/2) Q(x_0 + sh, h), (2-18)

where h = x − x_0, x ∈ U. Suppose that Q(x_0, h_0) < 0 for some h_0. The
proof of Theorem 4 shows that f(x) < f(x_0) for some x ∈ U of the form
x_0 + ch_0. This is a contradiction. Therefore Q(x_0, ·) ≥ 0, which proves (a).
To prove (a'), suppose that Q(x_0, ·) > 0. Using Problem 8 and the fact
that the functions f_ij are continuous at x_0, there exists a neighborhood U of
x_0 such that U ⊂ A and Q(y, ·) > 0 for every y ∈ U. Taking y = x_0 + sh,
we find from (2-18) that f(x) > f(x_0) for every x ∈ U, x ≠ x_0. This proves
that f has a strict relative minimum at x_0. Statements (b), (b') follow respec-
tively from (a), (a') by considering −f. ∎

Definition. A critical point x is nondegenerate if the Hessian determinant
d_n(x) = det (f_ij(x)) is not 0.

A nondegenerate critical point may be tested by one of the three criteria
I, II, or III at the end of Section 2-4. Note that in applying Theorem 6 we
need to know the sign of Q(x_0, ·) at the critical point x_0 itself, while to apply
Theorem 4 one must know the sign of Q(x, ·) at every point of K.
If n = 2, then f has a relative extremum at any critical point where
f_11 f_22 − (f_12)² > 0. The sign of f_11 and f_22 determines whether it is a relative
maximum or relative minimum. A critical point where f_11 f_22 − (f_12)² < 0 is
called a saddle point. The function illustrated by Fig. 2-5(b) has one point of
absolute maximum, one of relative maximum, and one saddle point.

Figure 2-6

Example. Let f(x, y) = 2y² − x(x − 1)² for every (x, y) ∈ E² and A = E². This
function has two critical points, (1/3, 0) and (1, 0). We find that

f_11 = 4 − 6x, f_22 = 4, f_11 f_22 − (f_12)² = 16 − 24x.

The point (1/3, 0) gives a relative minimum and (1, 0) is a saddle point. In this
example it is instructive to find the level sets {(x, y) : f(x, y) = c}. They are indicated
in Fig. 2-6 for the critical values −4/27 = f(1/3, 0), 0 = f(1, 0), and for nearby
values of c.
The point (1/3, 0) of relative minimum is an isolated point of the level set con-
taining it. For −4/27 < c < 0 the level set has two parts. The one which encloses
(1/3, 0) resembles a small ellipse if c is near −4/27. This can be attributed to the fact
that near (1/3, 0), f(x, y) is approximated by the first two nonzero terms in its Taylor
expansion about (1/3, 0), namely,

f(1/3, 0) + (1/2) Q((1/3, 0), (x − 1/3, y)) = −4/27 + (x − 1/3)² + 2y².

The level sets of this quadratic function are ellipses with center (1/3, 0) if c > −4/27.
Similarly, f(x, y) is approximated by −(x − 1)² + 2y² near the saddle point (1, 0).
The level sets −(x − 1)² + 2y² = c are hyperbolas if c ≠ 0. Near (1, 0) the level
sets of f resemble these hyperbolas. For c = 0 we get the lines √2 y = ±(x − 1)
tangent to the level set of f at (1, 0).
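
The n = 2 test can be checked mechanically for this example. The sketch below is a hypothetical helper, not part of the text: the function names are assumptions, and the second partial derivatives f_11 = 4 − 6x, f_12 = 0, f_22 = 4 are entered by hand rather than computed. Exact rational arithmetic is used so that the critical value −4/27 is reproduced without rounding.

```python
from fractions import Fraction


def f(x, y):
    """The example function f(x, y) = 2y^2 - x(x - 1)^2."""
    return 2 * y * y - x * (x - 1) ** 2


def second_partials(x, y):
    """(f_11, f_12, f_22) for this particular f, entered by hand."""
    return 4 - 6 * x, 0, 4


def classify(f11, f12, f22):
    """Apply the n = 2 criterion based on the sign of f_11 f_22 - (f_12)^2."""
    det = f11 * f22 - f12 * f12
    if det > 0:
        return "relative minimum" if f11 > 0 else "relative maximum"
    if det < 0:
        return "saddle point"
    return "degenerate (the test gives no information)"


critical_points = [(Fraction(1, 3), Fraction(0)), (Fraction(1), Fraction(0))]
report = {p: (classify(*second_partials(*p)), f(*p)) for p in critical_points}
for point, (kind, value) in report.items():
    print(point, kind, value)
```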

If f is continuous on a compact set A, then f has absolute extrema on A.
They may occur either at interior or at boundary points of A. If an absolute
maximum occurs at an interior point x_0 of A, then x_0 is among the relative
maxima in int A. We can try to find it by Theorem 6. However, Theorem 6
does not apply at boundary points of A.
If x_0 ∈ fr A and x_0 gives an absolute maximum, then f(x) ≤ f(x_0) for
every x ∈ A, and in particular for every x ∈ fr A. Therefore x_0 also gives an
absolute maximum among points of fr A. If fr A is sufficiently smooth, the
Lagrange multiplier rule (Section 4-8) can be applied.

*Extrema of linear functions. Let f be a linear function. Then calculus
is of no help in finding the extrema of f. Since f(x) = a·x = a_1 x^1 + ··· + a_n x^n,
the partial derivatives are f_i(x) = a_i. If f has a critical point, then a = 0 and
f(x) = 0 for every x.
Let us assume that a ≠ 0 and consider the problem of extremum on a
convex polytope K (p. 15). If K is contained in {x : x^i ≥ 0, i = 1, ..., n},
this is a problem in linear programming and has various interesting applications.
See [10] and [13].
For simplicity let us assume that K is compact. The extrema of f must
occur on the boundary fr K. Let us show that they can be found by consider-
ing only certain points of fr K, called extreme points.

Definition. Let K be a convex set. A point x ∈ K is an extreme point of K
if there do not exist distinct points x_1, x_2 ∈ K and t ∈ (0, 1) such that
x = t x_1 + (1 − t) x_2.

Stated geometrically, x is extreme if it is interior to no line segment in K.

Examples. The extreme points of a simplex are the vertices. If K is a closed n-ball
then every point of fr K is extreme. A half-space has no extreme points.

Proposition. Let K be compact and convex. Then every point of K is a convex
combination of extreme points of K.

Proof. Let us proceed by induction on the dimension n. If n = 1, then
K is an interval or a single point. Suppose that the proposition is true in
dimension n − 1. Let x_0 ∈ K. If x_0 is a boundary point, then by Problem 8,
Section 1-4, K has a supporting hyperplane P containing x_0. By an isometry of
E^n (see Section 4-2), we may arrange that x_0 = 0 and the equation of P is
x^n = 0. The set K ∩ P is compact and convex. By the induction hypothesis
x_0 is a convex combination of extreme points of K ∩ P and hence (Problem 11)
of extreme points of K.
If x_0 ∈ int K, then any line through x_0 intersects K in a segment with
endpoints x_1, x_2 ∈ fr K. Since x_1 and x_2 are convex combinations of the set
of extreme points, so is x_0 (Problem 10, Section 1-4). ∎

By the proposition on p. 20, taking as S the set of extreme points of K,
each point of K is a convex combination of n + 1 or fewer extreme points.
If S is connected, then n + 1 may be replaced by n.
Let C be the maximum value on K of the linear function f and K_1 =
{x ∈ K : f(x) = C}. If K_1 is found, the problem of maximum is solved.

Corollary. K_1 is the convex set spanned by those extreme points of K at which
f has an absolute maximum.

Proof. Let x ∈ K_1. By the proposition x = Σ t^j x_j, where x_1, ..., x_m
are extreme points, each t^j > 0, and Σ t^j = 1. All sums are from 1 to m.
Since C is the maximum value, f(x_j) ≤ C. Since f is linear,

f(x) = Σ t^j f(x_j) ≤ Σ t^j C = C.

But f(x) = C, and since each t^j > 0 we must have f(x_j) = C for j = 1, ..., m.
Thus x_1, ..., x_m ∈ K_1. Conversely, if f(x_j) = C for each j and x is a convex
combination of x_1, ..., x_m, then f(x) = C. ∎

If K is a convex polytope, then by induction on n the set of extreme points
is finite. The problem is no longer one of calculus, but instead that of maxi-
mizing f on this finite set. Except in the simplest situations, the method of
unsystematic search among the extreme points is of little value. The best
known systematic method is called the simplex method of linear programming.
In a sense it is an adaptation of the gradient method.
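
For a very small polytope the unsystematic search is still easy to carry out, and it illustrates the corollary. The sketch below assumes that the extreme points of K are already known and supplied as a list; the function name, the tolerance, and the unit square used as K are illustrative assumptions. It is a brute-force search, not the simplex method.

```python
def maximize_over_extreme_points(a, extreme_points, tol=1e-12):
    """Return the maximum of f(x) = a . x over the listed extreme points,
    together with the extreme points at which that maximum is attained."""
    values = [sum(ai * xi for ai, xi in zip(a, v)) for v in extreme_points]
    best = max(values)
    maximizers = [v for v, val in zip(extreme_points, values) if abs(val - best) <= tol]
    return best, maximizers


# Hypothetical K: the unit square in E^2, listed by its four vertices.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# f(x, y) = x + 2y has a single maximizing extreme point.
best, where = maximize_over_extreme_points((1, 2), square)

# f(x, y) = x is maximal on a whole edge; K_1 is spanned by two extreme points.
best_edge, where_edge = maximize_over_extreme_points((1, 0), square)

print(best, where)
print(best_edge, where_edge)
```

In the second case the set K_1 of the corollary is the segment joining the two maximizing vertices.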

PROBLEMS

In Problems 1 through 6 let A = E^n for the indicated n.


1. Find the critical points, relative extrema, and saddle points. Make a sketch
indicating the level sets.

(fC ce eee rae al I OG) ="@ 441) y— 2):


(c) f(x, y) = sin (xy). (d) f(x, y) = xy(x − 1).

2. Find the critical points, relative extrema, and saddle points.


(a) f(z, y) = 28+ 2 — 4ay — 2y?.
(b) f(z, y) = ay +1) — ay.
(c) f(x, y) = cos x cosh y.
[Note: The hyperbolic functions sinh and cosh are defined by

sinh x = (1/2)[exp x − exp (−x)],
cosh x = (1/2)[exp x + exp (−x)].

Their derivatives are given by the formulas sinh' = cosh, cosh' = sinh.]

3. Let f(x, y, z) = x² + y² − z². Show that f has one critical point, which does
not give a relative extremum. Describe the level sets.
4. Let f(x, y, z) = x² + 3y² + 2z² − 2xy + 2xz. Show that 0 is the minimum
value of f.
5. Given x_1, ..., x_m, find the point x where Σ_{j=1}^m |x − x_j|² has an absolute mini-
mum, and find the minimum value.
6. (a) In Problem 1(a) find the (absolute) maximum and minimum values of f on
the circular disk x² + y² ≤ 1.
(b) Do the same for 1(c).
7. (a) Show that under the hypotheses of Theorem 5, {x ∈ A : df(x) = 0} is a
convex set.
(b) Illustrate this result in case A = E² and f(x, y) = (x − y)².
8. Let g(h) = Σ_{i,j=1}^n c_ij h^i h^j. Assume that g > 0, that is, that g(h) > 0 for
every h ≠ 0.
(a) Show that there exists a number m > 0 such that g(h) ≥ m|h|² for every h.
[Hint: The polynomial g is continuous, and has a positive minimum value
m on the unit (n − 1)-sphere.]
(b) Suppose that |C_ij − c_ij| < εn^(−1) for each i, j = 1, ..., n. Let G(h) =
Σ_{i,j=1}^n C_ij h^i h^j. Show that G(h) ≥ (m − ε)|h|² for every h. Hence G > 0
if ε < m.
9. Let f(x) = ψ(a·x), where ψ is of class C^(2). Show that every critical point of
f is degenerate.
10. Let x_0 be a nondegenerate critical point of a function f of class C^(2). Show that
x_0 is isolated, that is, that x_0 has a neighborhood U containing no other critical
points of f. [Hint: Let x be another critical point in U. Apply the mean value
theorem to each of the functions f_1, ..., f_n to find that

0 = Σ_{j=1}^n f_ij(y_i)(x^j − x_0^j), i = 1, ..., n, (*)

where each y_i ∈ U. Show that if U is small enough, det (f_ij(y_i)) ≠ 0 and con-
sequently the system of equations (*) has only the solution x − x_0 = 0, a
contradiction.]
11. Let K be closed and convex, and P a supporting hyperplane for K. Show that
any extreme point of K ∩ P is an extreme point of K.

*12. Let K be a closed convex polytope (not necessarily compact) and f be a linear
function such that f(x) is bounded above on K. Show that f has an absolute
maximum on K.

2-6 DIFFERENTIAL 1-FORMS

Let us first give a rough description of this notion and afterward be more
precise. A differential form ω of degree 1 is supposed to be an “expression linear
in the differentials dx^1, ..., dx^n”:

ω = ω_1 dx^1 + ··· + ω_n dx^n, (2-19)

where the coefficients ω_1, ..., ω_n are real-valued functions. In case there is
a real-valued differentiable function f such that ω_i is the ith partial derivative
f_i for each i = 1, ..., n, then ω is called the differential of f and is written
df. Thus

df = f_1 dx^1 + ··· + f_n dx^n. (2-20)

It is important to know whether or not a given differential form ω is the
differential of a function. A considerable part of the discussion in this section
and in Section 3-3 is directed to just this question. One gets a necessary
condition (2-22) from the fact that the mixed partial derivatives f_ij and f_ji of a
function f of class C^(2) are equal. This necessary condition turns out to be
sufficient if the domain is simply connected.
We recall from Section 1-3 that the elements of the space (E^n)* dual to
E^n are called covectors, and that the components a_i of a covector a are written
with subscripts. No matter what precise meaning we shall give to the symbols
dx^1, ..., dx^n, the functions ω_1, ..., ω_n must determine the differential form ω.
For each x, the numbers ω_1(x), ..., ω_n(x) are the components of a covector.
This suggests that we may define a differential form as a function whose values
are covectors.
To state this precisely:

Definition. A differential form of degree 1 is a function ω with domain
D ⊂ E^n and values in (E^n)*.

For short we shall usually say “1-form” instead of “differential form of
degree 1.” In Chapter 6 differential forms of any degree r = 0, 1, 2, ..., n
are defined.
The value of ω at x is denoted by ω(x). It is the covector

ω(x) = ω_1(x)e^1 + ··· + ω_n(x)e^n, (2-21)

where, as in Section 1-3, e^1, ..., e^n are the standard basis covectors.
A 1-form ω is a constant form if there is a covector a such that ω(x) = a
for every x ∈ D. In particular, for each i = 1, ..., n let us consider the
constant 1-form with value e^i. This 1-form is denoted by dx^i. Since (E^n)* is a
vector space, the sum ω + ζ of two functions ω and ζ with the same domain D
and values in (E^n)* is defined (p. 8). Similarly, the product fω is defined
if f is a real-valued function and ω a 1-form, with the same domain D. In
particular, ω_i dx^i is the 1-form whose value at each x is ω_i(x)e^i. From (2-21),
ω_1 dx^1 + ··· + ω_n dx^n is the 1-form whose value at each x is ω(x). Therefore
formula (2-19) is correct.

The differential of a function. Let us now suppose that D is an open set.
Let f be a real-valued differentiable function with domain D. The differential
of f at x is the covector df(x) whose components are the partial derivatives
f_1(x), ..., f_n(x).

Definition. The differential of f is the differential form df of degree 1
whose value at each x ∈ D is the covector df(x).

Some authors define df as the real-valued function whose domain is the
cartesian product D × E^n and whose value at each pair (x, h) is the number
df(x)·h. Knowing df(x), one can find df(x)·h for every h ∈ E^n, and vice
versa. Hence this definition is equivalent to the one which we have given.
If f and g are differentiable functions with the same domain D, then

d(f + g) = df + dg, d(fg) = f dg + g df.

These formulas follow from Problem 6, Section 2-2. Similarly, writing c for
the constant function with value c,

dc = 0,

where 0 denotes the “zero form” whose value is 0 everywhere. If D is connected,
then, conversely, df = 0 implies that f is a constant function. This is just a
restatement of Corollary 2, Section 2-2.
If L is a linear function, then dL is a constant 1-form. For let L(x) =
a·x, where a is just another notation for the linear function L (p. 9). Then
the ith partial derivative of L is a_i and dL(x) = a for every x.
In particular, the standard cartesian coordinate functions X^1, ..., X^n
(p. 11) are linear. In fact X^i(x) = e^i·x = x^i, and dX^i(x) = e^i. Hence
dX^i is just the constant 1-form which we have denoted by dx^i. The common
practice of writing dx^i instead of dX^i arises from the habit of confusing
notationally a function with its value at some particular point x, in this case of
confusing X^i with x^i = X^i(x). Nevertheless, following custom, we adhere to
the notation dx^i.

Definition. A 1-form ω is exact if there is a function f such that ω = df.

If df = dg, then d(f − g) = 0. If D is connected, f − g is a constant
function. Hence the function f whose differential is a given exact 1-form ω is
determined up to the addition of a constant function, if the domain is connected.
A 1-form ω is of class C^(q) if its components ω_i are functions of class C^(q).
If ω = df, then ω_i = ∂f/∂x^i. In this case ω is of class C^(q) if and only if f is
of class C^(q+1).
Let us look for some criteria to determine whether a 1-form ω is exact or
not. If ω is of class C^(1) and ω = df, then f is of class C^(2). Using Theorem 3
and the ∂/∂x^i notation for partial derivatives,

∂ω_i/∂x^j = ∂²f/∂x^j∂x^i = ∂²f/∂x^i∂x^j = ∂ω_j/∂x^i.

Thus the conditions

∂ω_i/∂x^j = ∂ω_j/∂x^i, i, j = 1, ..., n, (2-22)

are necessary for exactness of ω.

Definition. A 1-form ω of class C^(1) which satisfies (2-22) is called a closed
1-form.

In (2-22) we may as well suppose that i < j. Thus the definition says in
effect that ω is a closed differential form if its components ω_1, ..., ω_n satisfy
these n(n − 1)/2 conditions. For instance, if n = 2 let us write dx and dy
instead of dx^1 and dx^2, and

M(x, y) = ω_1(x, y), N(x, y) = ω_2(x, y).

The expression for a 1-form is then

ω = M dx + N dy.

The condition that ω be closed is that the components M and N satisfy

∂M/∂y = ∂N/∂x.

We have shown that every exact 1-form is closed. The converse is false,
as Example 2 below shows. It is comparatively easy to check whether condi-
tions (2-22) are satisfied or not. Therefore it is very desirable to find some
additional condition which will guarantee that the converse holds. Such a
condition is that the domain D be simply connected. We shall prove in Chapter 7
that if D is simply connected, then every closed 1-form with domain D
is exact. We shall define the term “simply connected” in Section 7-7. For
the present, let us merely say that any convex set, and in particular E^n, is
simply connected. For n = 2, an open, connected set D is simply connected
if and only if, roughly speaking, D has no holes.
We have been careful to distinguish notationally between functions and
their values. One can scarcely attain a sound knowledge of calculus until this
distinction is recognized. Nevertheless, in examples we sometimes abuse the
notation for brevity. For instance, d(x²y) = 2xy dx + x² dy is short for the
statement “df = f_1 dx + f_2 dy, where f(x, y) = x²y, f_1(x, y) = 2xy, f_2(x, y) =
x² for every (x, y) ∈ E².”

Example 1. Let ω = 2xy dx + (x² + 2y) dy, D = E². This is of course an abbrevia-
tion for ω = M dx + N dy, where M(x, y) = 2xy, N(x, y) = x² + 2y, for every
(x, y) ∈ E². In this example,

∂M/∂y (x, y) = 2x = ∂N/∂x (x, y)

for every (x, y). Hence ω is a closed 1-form. Since E² is connected and simply
connected, ω = df where f is determined up to the addition of a constant function. The
function f can be found by partial integration with respect to the first variable, as
follows:

∂f/∂x (x, y) = M(x, y) = 2xy, f(x, y) = x²y + φ(y),

where the function φ is determined from

∂f/∂y (x, y) = x² + φ'(y) = N(x, y) = x² + 2y.

Of course these equations hold for every (x, y) ∈ E². Then φ'(y) = 2y, and φ(y) =
y² + c for every y, where the “constant of integration” c is a number which may be
chosen arbitrarily. Hence for every (x, y) ∈ E²,

f(x, y) = x²y + y² + c.

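
The two steps of Example 1 (checking that ω is closed, then checking that df = ω for the function found) can be confirmed numerically. The sketch below uses central difference quotients in place of exact partial derivatives; the helper name `partial`, the step h, the sample points, and the tolerance are assumptions chosen for illustration.

```python
def M(x, y):
    return 2 * x * y


def N(x, y):
    return x * x + 2 * y


def f(x, y, c=0.0):
    """Candidate potential from Example 1: f(x, y) = x^2 y + y^2 + c."""
    return x * x * y + y * y + c


def partial(func, point, index, h=1e-6):
    """Central difference approximation to the partial derivative in slot `index`."""
    lo, hi = list(point), list(point)
    lo[index] -= h
    hi[index] += h
    return (func(*hi) - func(*lo)) / (2 * h)


samples = [(0.5, -1.0), (2.0, 3.0), (-1.5, 0.25)]

# Closedness: dM/dy should agree with dN/dx at every sample point.
closed_ok = all(abs(partial(M, p, 1) - partial(N, p, 0)) < 1e-6 for p in samples)

# Exactness: the partial derivatives of f should reproduce M and N.
exact_ok = all(
    abs(partial(f, p, 0) - M(*p)) < 1e-6 and abs(partial(f, p, 1) - N(*p)) < 1e-6
    for p in samples
)
print(closed_ok, exact_ok)
```

A check at finitely many points is of course no proof; it only guards against slips in the partial integration.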
Example 2. Let D = E² − {(0, 0)}. By removing (0, 0) we have made a hole, and
D is not simply connected. Let ω = M dx + N dy, where for every (x, y) ∈ D

M(x, y) = −y/(x² + y²), N(x, y) = x/(x² + y²).

A computation shows that ∂M/∂y = ∂N/∂x in D, hence ω is closed. Let us show
that ω is not exact. Let D_1 be the open subset of D obtained by deleting
the positive x-axis. For every (x, y) ∈ D_1 let Θ(x, y) be the angle from the
positive x-axis to (x, y), 0 < Θ(x, y) < 2π (Fig. 2-7). Using elementary cal-
culus, we find that in D_1, dΘ = ω [Problem 6(a)]. If there were a function f of
class C^(1) on D such that ω = df, then upon restricting f to D_1 we would have
d(f − Θ) = 0. Since D_1 is connected, f − Θ would be constant on D_1. This would
imply that Θ can be continuously extended across the positive x-axis, which is false.
Hence ω is not exact.

Figure 2-7
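
The obstruction in Example 2 is the jump of the angle function across the positive x-axis, and it can be observed numerically. In the sketch below Θ is computed with `math.atan2` reduced modulo 2π, which is an assumed implementation of the angle described in the example; the helper `partial`, the step sizes, and the sample point are likewise illustrative choices.

```python
import math


def theta(x, y):
    """Angle from the positive x-axis to (x, y), taken in (0, 2*pi) on D_1."""
    return math.atan2(y, x) % (2.0 * math.pi)


def M(x, y):
    return -y / (x * x + y * y)


def N(x, y):
    return x / (x * x + y * y)


def partial(func, x, y, index, h=1e-6):
    """Central difference approximation to a partial derivative of func at (x, y)."""
    if index == 0:
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


# Away from the deleted half-axis, the partial derivatives of theta match M and N.
p = (-1.0, 0.5)
d_theta = (partial(theta, *p, 0), partial(theta, *p, 1))
omega = (M(*p), N(*p))

# Approaching (1, 0) from above and from below gives values differing by about 2*pi.
eps = 1e-9
jump = theta(1.0, -eps) - theta(1.0, eps)
print(d_theta, omega, jump)
```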

Example 3. In some cases it can be seen by inspection that ω is exact. For instance,
if ω = 2x^1 dx^1 + ··· + 2x^n dx^n and D = E^n, then

ω = d[(x^1)² + ··· + (x^n)² + c] = d(x·x + c).

The reader may have also discovered by inspection that the form ω in Example 1
is exact.

PROBLEMS
1. Let n = 1. Give a precise interpretation of the formula df/dx = f’ from ele-
mentary calculus. [Hint: The quotient of two real-valued functions is defined
wherever the denominator does not have the value 0.]
2. Let n = 3 and ω = M dx + N dy + O dz. What do conditions (2-22) become in
this case?
3. In each case determine whether or not ω is exact. If exact, find all functions f such
that ω = df.
(ayi @ = zy dr -+ (@2/2) dy, D = E?. (b) w = 2dxr-+ xz dy + zy dz, D = E?.
(cnc =" y det D = 8,
(d) ω = (1/x² + 1/y²)(y dx − x dy), D = {(x, y) : x ≠ 0 and y ≠ 0}.
4. Let ω = dy + p(x)y dx and D be the vertical strip {(x, y) : a < x < b}. Let p be
continuous on (a, b) and P be an antiderivative of p, that is, a function such that
P'(x) = p(x) for every x ∈ (a, b). Let f(x) = exp [P(x)]. Show that fω is exact.
5. Show that

(a) d[(Σ_{i=1}^n x^i)²] = 2 Σ_{i,j=1}^n x^i dx^j.

(b) d De ax’ |= 2 De Ss vde'.


xj (lke

[Hint for (b): What is (Σ x^i)² − Σ (x^i)²?]

6. In Example 2: (a) Show that Θ_1(x, y) = M(x, y), Θ_2(x, y) = N(x, y) for every
(x, y) ∈ D_1. You may use the formulas for the derivatives of the inverse
trigonometric functions.
(b) Verify that ∂M/∂y = ∂N/∂x by calculating these partial derivatives.
7. Let g be continuous on E^1. Show that

g(|x|) Σ_{i=1}^n x^i dx^i

is an exact 1-form. [Hint: Let h(u) = ug(u) for every u ∈ E^1. The function h
has an antiderivative.]

CHAPTER 3

Vector-Valued Functions
of One Variable

In this chapter, we shall first define the derivative of a function g from a
set J ⊂ E^1 into E^n. The derivative has many of the same properties as the
derivative of a real-valued function in elementary calculus. When J is an inter-
val, g represents a curve in E^n, provided its derivative g'(t) is never 0. Any
vector-valued function f obtained from g by a suitable parameter change
represents the same curve as g. The line integral of a differential 1-form ω
along a curve γ is defined in Section 3-3. It turns out that the line integral
depends just on the endpoints of γ if and only if ω is exact (Theorem 7). In
Section 3-4 the gradient method for extrema is described.
Except for the introductory Section 3-1, this chapter may be postponed
and read together with Chapter 7.

3-1 DERIVATIVES

Let g be a function from a set J ⊂ E^1 into E^n. Let t be an interior point
of J. Then the derivative of g at t is the vector

g'(t) = lim_{u→0} (1/u)[g(t + u) − g(t)], (3-1)

provided the limit exists.
The derivative of a vector-valued function has many of the same properties
as in the case of real-valued functions. If f and g both have a derivative at t,
then

(f + g)'(t) = f'(t) + g'(t),
(f·g)'(t) = f'(t)·g(t) + f(t)·g'(t). (3-2)

Here f·g is the real-valued function whose value at each t is the inner product
f(t)·g(t). The proof of these two formulas is left to the reader (Problem 4).
The derivative has a geometric interpretation as a tangent vector. Let
us suppose that J is an interval. As t traverses J from left to right, the point
g(t) traverses some curve in E^n. A precise definition of the term “curve” is
given in the next section.
Let us assume that t_0 is a point of J at which g'(t_0) ≠ 0. Let

x_0 = g(t_0), v_0 = g'(t_0),
x = g(t), y = x_0 + u v_0,

where t = t_0 + u and |u| is small enough that t ∈ J. The ratio of the distances
|x − y| and |x − x_0| may be written, upon multiplying numerator and de-
nominator by 1/|u|, in the form

|x − y| / |x − x_0| = |(1/u)[g(t_0 + u) − g(t_0)] − v_0| / |(1/u)[g(t_0 + u) − g(t_0)]|.

Hence

lim_{u→0} |x − y| / |x − x_0| = 0 / |v_0| = 0.

This justifies calling v_0 a tangent vector at x_0, and the line through x_0 and
x_0 + v_0 a tangent line at x_0 (Fig. 3-1). Note that we have used the assumption
that g'(t_0) ≠ 0.

Figure 3-1

The number t is often called a parameter. It need not have any geometric
or physical significance. However, if n = 3 and t happens to denote time in
a physical problem, then g'(t) is the velocity vector.
A vector-valued function g has components g^1, ..., g^n, which are the
real-valued functions such that

g(t) = Σ_{i=1}^n g^i(t) e_i

for every t ∈ J. If g'(t) exists, then the ith component (1/u)[g^i(t + u) − g^i(t)]
of the expression on the right side of (3-1) tends to g^i'(t) as u → 0, by Propo-
sition A-4b, and

g'(t) = Σ_{i=1}^n g^i'(t) e_i. (3-3)

Conversely, if g^i'(t) exists for each i = 1, ..., n, then g'(t) exists and is given
by (3-3).

Example. Let n = 2, g(t) = t²e_1 + (log t)e_2. Find the tangent line at e_1. In this
example g^1(t) = t², g^2(t) = log t, and t_0 = 1, x_0 = g(1) = e_1. Then g'(t) =
2te_1 + t^(−1)e_2, and v_0 = g'(1) = 2e_1 + e_2. The tangent line goes through e_1 and
3e_1 + e_2. Its equation is 2y = x − 1.
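
The example can be checked by approximating the limit (3-1) componentwise, as (3-3) permits. The sketch below replaces the limit by a symmetric difference quotient with a small fixed u; that replacement, the function names, and the tolerances implied by it are assumptions of this illustration.

```python
import math


def g(t):
    """The curve of the example: g(t) = t^2 e_1 + (log t) e_2."""
    return (t * t, math.log(t))


def derivative(curve, t, u=1e-6):
    """Componentwise symmetric difference quotient approximating g'(t)."""
    ahead, behind = curve(t + u), curve(t - u)
    return tuple((a - b) / (2 * u) for a, b in zip(ahead, behind))


t0 = 1.0
x0 = g(t0)               # the point e_1 = (1, 0)
v0 = derivative(g, t0)   # approximately 2 e_1 + e_2


def tangent_point(s):
    """Point x_0 + s v_0 on the tangent line."""
    return tuple(p + s * v for p, v in zip(x0, v0))


# Every point of the tangent line should nearly satisfy 2y = x - 1.
residuals = [2 * y - (x - 1) for x, y in (tangent_point(s) for s in (-2.0, 0.5, 1.0))]
print(x0, v0, residuals)
```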

PROBLEMS
1. Find the tangent line at 2^(−1/2)e_1 − 2^(1/2)e_2 to the ellipse represented by g(t) =
(cos t)e_1 + (2 sin t)e_2, J = [0, 2π]. Illustrate with a sketch.
2. Find the tangent line at e: + e2 + e3’to the curve represented by
g(t) = tey + t'/eq+ t3e3, 4 < t < 2.
3. A particle moves along the parabola y² = 4x with constant speed 2 and so that
dy/dt = g^2'(t) > 0. Find the velocity vector g'(t) at e_1 − 2e_2. [Note: The speed
is |g'(t)|.]
4. Give a proof of formulas (3-2):
(a) Using the corresponding formulas for derivatives of real-valued functions
and (3-3).
(b) Directly from the definition (3-1).
5. Let g(t) = [3t/(1 + t³)]e_1 + [3t²/(1 + t³)]e_2, t ≠ −1.
(a) Sketch the curve traversed by g(t) on the interval (−∞, −1). On the interval
(−1, ∞).
(b) Show that {g(t) : t ≠ −1} = {(x, y) : x³ + y³ = 3xy}. This set is called the
folium of Descartes.

3-2 CURVES IN E”
Let g be a function from an interval J ⊂ E^1 into E^n. Then g(t) traverses
a curve in E^n as the “parameter” t traverses J. It is better not to call g itself
a curve. Instead one should regard any vector-valued function f obtained from
g by a suitable change of parameter as representing the same curve as g. We
shall define a curve as an equivalence class of equivalent parametric represen-
tations. To simplify matters we shall at first consider only curves with con-
tinuously changing tangents.
Let us now be more precise. Let us for simplicity assume that J = [a, b],
a closed bounded interval, and that the components g^1, ..., g^n are of class
C^(1) on [a, b]. By g^i'(a) and g^i'(b) we mean respectively right-hand and left-
hand derivatives. They are equal to the derivatives at a and b of any class C^(1)
extension of g^i to an open set containing [a, b]. See p. 50.
Definition. If g'(t) ≠ 0 for every t ∈ [a, b], then g is a parametric repre-
sentation of class C^(1) on [a, b].
To motivate the definition of equivalence which we are going to make,
let us first consider an example.
Example 1. Let g(t) = te_1 + t²e_2, 1 ≤ t ≤ 2. Then g^1(t) = t, g^2(t) = t², g'(t) =
e_1 + 2te_2 ≠ 0. Hence g is a parametric representation of class C^(1) on the interval
[1, 2]. In fact, it represents the arc of the parabola y = x² between (1, 1) and (2, 4),
traversed from left to right (Fig. 3-2). If we let f(τ) = (exp τ)e_1 + (exp 2τ)e_2,
0 ≤ τ ≤ log 2, then f also represents this same parabolic arc. In effect, f is obtained
from g by the parameter change t = exp τ. It is reasonable to regard f and g as
equivalent, and we shall do so.
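
That f and g in Example 1 trace the same arc can be confirmed pointwise. The sketch below names the parameter change phi(τ) = exp τ (the name is an assumption) and compares f(τ) with g(phi(τ)) on a grid of sample values of τ; it also checks the endpoint conditions phi(0) = 1 and phi(log 2) = 2.

```python
import math


def g(t):
    """g(t) = t e_1 + t^2 e_2 on [1, 2]."""
    return (t, t * t)


def f(tau):
    """f(tau) = (exp tau) e_1 + (exp 2 tau) e_2 on [0, log 2]."""
    return (math.exp(tau), math.exp(2 * tau))


def phi(tau):
    """Parameter change t = exp(tau); its derivative exp(tau) is positive."""
    return math.exp(tau)


taus = [k * math.log(2) / 10 for k in range(11)]

# Largest componentwise gap between f(tau) and g(phi(tau)) over the sample grid.
max_gap = max(abs(a - b) for tau in taus for a, b in zip(f(tau), g(phi(tau))))
print(phi(taus[0]), phi(taus[-1]), max_gap)
```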

Figure 3-2

Now let g be any parametric representation of class C^(1) on [a, b]. Let φ
be any real-valued function of class C^(1) on some closed interval [α, β] such that

φ'(τ) > 0 for every τ ∈ [α, β], φ(α) = a, φ(β) = b. (3-4)

Let f be the composite of g and φ, denoted by f = g∘φ. Then

f(τ) = g[φ(τ)] for every τ ∈ [α, β].

From the composite function theorem

f^i'(τ) = g^i'[φ(τ)]φ'(τ), for i = 1, ..., n,

which is the same as

f'(τ) = g'[φ(τ)]φ'(τ). (3-5)

In particular, f'(τ) ≠ 0 and f is also a parametric representation of class C^(1). The
tangent vector f'(τ) differs from the tangent vector g'[φ(τ)] by the positive scalar
multiple φ'(τ). (Scalar multiplication on the right means the same thing as on the
left, vc = cv.)

Definition. We say that f is equivalent to g if there exists φ satisfying the
above conditions such that f = g∘φ.

The properties of reflexivity, symmetry, and transitivity required of an
equivalence relation hold (Problem 6). By an equivalence class is meant the
collection of all parametric representations of class C^(1) equivalent to a given
one. The reader may have encountered the notion of equivalence class else-
where in mathematics. An example is the definition of the rational numbers
starting from the integers.

Definition. A curve γ of class C^(1) is an equivalence class of parametric
representations of class C^(1).

By requiring that the components g^1, ..., g^n be of class C^(q), q ≥ 2, and
allowing only parameter changes φ of class C^(q), the notion of curve of class
C^(q) can be defined in the same way. To study curvature of curves one needs
to assume class C^(2) at least. See reference [22]. However, for present purposes
we need only class C^(1). From now on we shall say “curve” instead of “curve
of class C^(1),” and “representation” instead of “parametric representation of
class C^(1).”
Each curve has an infinite number of representations. If g is one such
representation, then each parameter change φ leads to another. It is often
highly advantageous to make a judicious choice of parameter. In a physical
problem, time (measured according to some preassigned scale) may be the
preferred parameter. For certain curves one of the components x^1, ..., x^n
can be taken as the parameter. For example, if the first component g^1'(t) of
the tangent vector g'(t) is everywhere positive, then g^1 has an inverse (Sec-
tion A-10). Let us take for φ the inverse of g^1. Formally this amounts simply
to solving the equation x^1 = g^1(t) for t, obtaining t = φ(x^1). Set τ = x^1.
Then x^1 is the new parameter and f^1(x^1) = x^1.
Figure 3-3 illustrates this situation for n = 2.

Figure 3-3 [f^1(x) = x, f^2(x) = f(x)]

A curve γ is to be regarded as the path traversed by a moving point, and we have
not excluded the possibility that γ passes through the same point several times. The
multiplicity of a point x is the number of points t ∈ [a, b] such that g(t) = x. The
multiplicity does not depend on the particular representation g chosen for γ, since
any φ satisfying (3-4) is a univalent function, namely, φ(τ_1) ≠ φ(τ_2) if τ_1 ≠ τ_2.
The trace of γ is the set of points of positive multiplicity, that is, the set of points
through which γ passes at least once. If x has multiplicity 1, then x is called a
simple point. If every point of the trace is simple, then γ is called a simple arc.
The point g(a) is called the initial endpoint of γ and g(b) the final endpoint.
If g(a) = g(b), then γ is called a closed curve. A closed curve is called simple
if every point of the trace is simple except g(a), which has multiplicity 2
(Fig. 3-4).
Example 2. Let g(t) = x_0 + t(x_1 − x_0), 0 ≤ t ≤ 1. Then γ is the line segment
joining x_0 and x_1, traversed from x_0 to x_1. It is a simple arc.

Example 3. Let g(t) = (cos mt)e_1 + (sin mt)e_2, 0 ≤ t ≤ 2π, where m is an integer
not 0. The trace is the unit circle x² + y² = 1. The closed curve γ which g repre-
sents goes around the circle |m| times, counterclockwise if m > 0 and clockwise if
m < 0. If m = ±1, then γ is a simple closed curve.

At this point we need some properties of integrals, which are reviewed
in Section A-9. In the present chapter we employ the Riemann definition of
integral, as is customary in calculus. The more sophisticated Lebesgue theory
of integrals is developed in Chapter 5.
Definition. The length l of a curve γ is

l = ∫_a^b |g'(t)| dt. (3-6)

If f is equivalent to g, then

∫_α^β |f'(τ)| dτ = ∫_α^β |g'[φ(τ)]| φ'(τ) dτ = ∫_a^b |g'(t)| dt,

by (3-5) and the theorem about change of variables in integrals (Section A-9).
Thus l does not depend on the particular representation chosen for γ.

Figure 3-4 [panels: simple arc; simple closed curve; not simple]

Formula (3-6) is suggested by considering inscribed polygons. Let
a = t_0 < t_1 < ··· < t_{m−1} < t_m = b, and let μ = max {t_1 − t_0, t_2 − t_1, ...,
t_m − t_{m−1}}. The polygon which joins successively g(t_{j−1}) with g(t_j) has ele-
mentary length

Σ_{j=1}^m |g(t_j) − g(t_{j−1})|. (*)

The length l is the limit of the elementary lengths of polygons inscribed in γ.
More precisely, given ε > 0, there exists δ > 0 such that |(*) − l| < ε
whenever μ < δ. Since we mention this fact just to motivate the definition
(3-6), the proof will only be indicated. Since the derivative g' is continuous,
g(t_j) − g(t_{j−1}) can be replaced by g'(s_j)(t_j − t_{j−1}) and the sum (*) by

Σ_{j=1}^m |g'(s_j)|(t_j − t_{j−1}), (**)

with error tending to 0 as μ → 0, where s_j can be chosen arbitrarily in
[t_{j−1}, t_j]. But (**) is a Riemann sum for the integral (3-6), and tends to l as
μ → 0. The proof that (*) can be replaced by (**) with small error makes use
of the fact that the continuous function g' is uniformly continuous on the
compact set [a, b] (Section A-8, Problem 6).
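
The convergence of the sums (*) to l can be watched on a curve whose length is known. The sketch below computes the elementary length of the inscribed polygon with m equal parameter steps for the unit circle g(t) = (cos t)e_1 + (sin t)e_2, 0 ≤ t ≤ 2π, whose length by (3-6) is 2π. The equal spacing of the t_j and the function names are assumptions made for simplicity.

```python
import math


def polygon_length(curve, a, b, m):
    """Elementary length (*) of the polygon joining g(t_0), ..., g(t_m),
    with equally spaced t_j = a + j (b - a) / m."""
    points = [curve(a + (b - a) * j / m) for j in range(m + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def circle(t):
    return (math.cos(t), math.sin(t))


coarse = polygon_length(circle, 0.0, 2 * math.pi, 8)
fine = polygon_length(circle, 0.0, 2 * math.pi, 1000)
print(coarse, fine, 2 * math.pi)
```

The inscribed polygons are shorter than the circle, and the gap shrinks as μ = 2π/m decreases.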
Every smooth curve ¥ has a representation of particular geometric interest.
It is called the standard representation, or representation with arc length s
as parameter, and is defined in the following way. Let g represent 7 on {a, b],
and let J
Sp i; \g’(w)| du for every t € [a, 5].

The length of the part of Y represented on {a, ¢] is S(t). Clearly S(a) = 0 and
S(b) = l. By the fundamental theorem of calculus,

S(t) = |g’(O| > 0


78 Vector-Valued Functions of One Variable 3—2

for every t ∈ [a, b]. In particular, if t signifies time then S'(t) is the length of
the velocity vector, that is, the speed of motion.
Since S'(t) > 0 the equation s = S(t) can be solved for t. More precisely,
the function S has an inverse φ of class C^(1) on [0, l]. Let G = g ∘ φ. Then G
is the standard representation of γ. From (3-5)

\[ G'(s) = g'[\phi(s)]\,\phi'(s), \quad \text{for every } s \in [0, l]. \]

Since

\[ \phi'(s) = \frac{1}{S'[\phi(s)]} = \frac{1}{|g'[\phi(s)]|}, \]

we find that

\[ |G'(s)| = 1 \qquad (3-7a) \]

for every s ∈ [0, l]. Hence G'(s) is a unit tangent vector at the point G(s). If we
write dx^i/ds for G^i'(s), then (3-7a) can be rewritten

\[ \left(\frac{dx^1}{ds}\right)^2 + \cdots + \left(\frac{dx^n}{ds}\right)^2 = 1. \qquad (3-7b) \]

Example 3 (continued). Let m > 0. Then |g'(t)| = m, S(t) = mt. Solving the equation
s = S(t) for t, we obtain the standard representation

\[ G(s) = (\cos s)e_1 + (\sin s)e_2, \quad 0 \le s \le 2m\pi. \]
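The passage from g to the standard representation G = g ∘ φ can be sketched numerically by computing S(t) with a quadrature rule and inverting it by bisection. The sketch below is illustrative only. It assumes, as the formulas of Example 3 suggest, that g(t) = (cos mt)e_1 + (sin mt)e_2 on [0, 2π]; the helper names arc_length and inverse_arc_length and the value m = 3 are hypothetical.

```python
import math

def arc_length(speed, a, t, steps=2000):
    """S(t): integral of |g'(u)| over [a, t], by the midpoint rule."""
    h = (t - a) / steps
    return sum(speed(a + (k + 0.5) * h) for k in range(steps)) * h

def inverse_arc_length(speed, a, b, s, tol=1e-10):
    """Solve S(t) = s for t by bisection; valid because S is increasing."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_length(speed, a, mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m = 3.0  # illustrative value of m > 0

def g(t):
    # Assumed form of the representation in Example 3.
    return (math.cos(m * t), math.sin(m * t))

def speed(t):
    return m  # |g'(t)| = m

s = 2.0
t = inverse_arc_length(speed, 0.0, 2 * math.pi, s)  # t = phi(s)
G_s = g(t)                                          # G(s) = g(phi(s))
```

With these choices φ(s) = s/m, and G_s should agree with (cos s, sin s) up to the bisection tolerance.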

Piecewise smooth curves. It is not difficult to adapt the preceding discussion
to curves which are of class C^(1) except for a finite number of corners
and cusps. By a parametric representation of a piecewise smooth curve is
meant a continuous function g on an interval [a, b] with the following property:
There exist t_0, t_1, ..., t_p with

\[ a = t_0 < t_1 < \cdots < t_{p-1} < t_p = b \]

such that the restriction of g to each of the closed subintervals [t_{j-1}, t_j], j =
1, ..., p, is a parametric representation of class C^(1). In particular, g has at
each t_j interior to [a, b] right- and left-hand derivatives which need not be
equal. Parameter changes which are piecewise of class C^(1) are admitted. A
piecewise smooth curve is an equivalence class of parametric representations
which are piecewise of class C^(1).
Example 4. Let n = 2 and g(t) = te_1 + |t − 1|e_2, 0 ≤ t ≤ 2. This represents
the polygon from e_2 to e_1 to 2e_1 + e_2 with a corner at e_1. Let φ(τ) = τ³ + 1 and
f(τ) = g(τ³ + 1) = (τ³ + 1)e_1 + |τ³|e_2, −1 ≤ τ ≤ 1. Then f^1(τ) = τ³ + 1 and
f^2(τ) = |τ|³. The components f^1, f^2 are of class C^(1), which might lead one to think
that there is no corner. However, φ does not define an admissible parameter change,
since φ'(0) = 0 contrary to (3-4). Since f'(0) = 0, f is not a parametric representation
of class C^(1). This example emphasizes the importance of the restriction φ'(τ) > 0
in (3-4).

PROBLEMS
1. Which of the following represent simple arcs? Simple closed curves? Illustrate
with a sketch.
(a) g(t) = (a cos t)e_1 + (b sin t)e_2, a > 0, b > 0, J = [0, 2π].
(b) Same as (a) except J = [−π, 2π].
(c) g(t) = (−cosh t)e_1 + (sinh t)e_2, J = [−1, 1] (see p. 66 for the definition
of cosh and sinh).
2. (a) Let γ be represented by g(x) = xe_1 + f(x)e_2, a ≤ x ≤ b, where f is of class
C^(1) on [a, b]. Show that

\[ l = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx. \]

(b) Find l in case f(x) = |x|^{3/2}, a = −b.
3. Find the standard representation of the helical curve represented on [0, 2π] by
g(t) = (cos t)e_1 + (sin t)e_2 + te_3. Sketch the trace.
4. Sketch the trace of the curve γ represented on [0, 2π] by g(t) = (cos t)e_1 + (sin 2t)e_2.
Find the tangent vectors to γ at the double point (0, 0).
5. Let g^1(t) = cos (1/t) exp (−1/t), g^2(t) = sin (1/t) exp (−1/t) if 0 < t ≤ 1, and
g^1(0) = g^2(0) = 0.
(a) Show that g^1 and g^2 are of class C^(1) on [0, 1]. [Hint: u^k exp (−u) → 0 as
u → +∞.]
(b) Does g = g^1 e_1 + g^2 e_2 represent a curve of class C^(1)? Illustrate with a sketch.
6. Let us write f ~ g to mean f is equivalent to g. Prove that:
(a) g ~ g (reflexivity).
(b) If f ~ g, then g ~ f (symmetry).
(c) If g_1 ~ g_2 and g_2 ~ g_3, then g_1 ~ g_3 (transitivity).
7. Let γ be a curve of class C^(1). Prove that the multiplicity of any point x is finite.
8. Let γ_0 and γ_1 be curves represented on [a, b] by g_0 and g_1, respectively. For every
u ∈ [0, 1] let γ_u be the curve represented by g_u(t) = ug_1(t) + (1 − u)g_0(t),
a ≤ t ≤ b. Let l(u) be the length of γ_u. Prove that l is a convex function on
[0, 1]. When is the convexity strict?

3-3 LINE INTEGRALS


Let D be an open subset of E^n. Let ω be a 1-form with domain D, and γ
a curve whose trace is contained in D. We assume that ω is continuous. Let
us consider an inscribed polygon joining successively the points g(t_{j-1}) and
g(t_j) as in the previous section. If s_j ∈ [t_{j-1}, t_j], then ω[g(s_j)] is a covector,
and its scalar product with the vector g(t_j) − g(t_{j-1}) is a number. Let us
consider the sum

\[ \sum_{j=1}^{m} \omega[g(s_j)] \cdot [g(t_j) - g(t_{j-1})]. \qquad (*) \]

By reasoning like that indicated on p. 77 the sum (*) tends to the integral
(3-8a) as μ → 0. This integral is called a line integral.

Definition. Let ω be continuous and γ piecewise smooth. The line integral
of ω along γ is

\[ \int_a^b \omega[g(t)] \cdot g'(t)\,dt, \qquad (3-8a) \]

where g represents γ on [a, b].
The line integral exists, since if g is piecewise of class C^(1) the integrand
in (3-8a) is bounded and has a finite number of discontinuities. If f is equivalent
to g, then using (3-5)

\[ \int_a^b \omega[g(t)] \cdot g'(t)\,dt = \int_\alpha^\beta \omega[g(\phi(\tau))] \cdot g'[\phi(\tau)]\,\phi'(\tau)\,d\tau = \int_\alpha^\beta \omega[f(\tau)] \cdot f'(\tau)\,d\tau. \]

Hence the line integral does not depend on the particular representation
g chosen for γ.
Line integrals have an important role in many parts of mathematical
analysis, for example, in the theory of complex analytic functions. Several
fundamental physical concepts are also expressed in terms of line integrals.
Two of these—work and circulation of a steadily flowing fluid—will be men-
tioned at the end of the section.
The notation for line integral is ∫_γ ω. Writing out the scalar product
in (3-8a),

\[ \int_\gamma \omega = \int_a^b \sum_{i=1}^{n} \omega_i[g(t)]\,g^{i\prime}(t)\,dt. \qquad (3-8b) \]

The notation for differential form is supposed to suggest (3-8b). Let us
write ω = ω_1 dx^1 + ··· + ω_n dx^n as in (2-19) and formally multiply and
divide the right-hand side by dt. If we set x = g(t) and write dx^i/dt for g^i'(t),
then we get the integrand on the right-hand side of (3-8b).
Example 1. Let γ be the semicircle with center (0, 0) and endpoints ±ae_2, directed
from −ae_2 to ae_2. Let us evaluate ∫_γ x dy − y dx. Points (x, y) of the semicircle
satisfy the equations x = a cos θ, y = a sin θ, where −π/2 ≤ θ ≤ π/2. The most
convenient representation for γ is on [−π/2, π/2] with g(θ) = (a cos θ)e_1 +
(a sin θ)e_2. Then

\[ \int_\gamma (x\,dy - y\,dx) = \int_{-\pi/2}^{\pi/2} \left( x\frac{dy}{d\theta} - y\frac{dx}{d\theta} \right) d\theta = a^2 \int_{-\pi/2}^{\pi/2} d\theta = \pi a^2. \]
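Formula (3-8a) translates directly into a quadrature. The sketch below is an illustration under stated assumptions rather than anything from the text: it uses the midpoint rule, the radius a = 2, and a hypothetical helper line_integral that takes the components of ω, a representation g, and its derivative. It evaluates the integral of Example 1 and also the integral along the reversed curve.

```python
import math

def line_integral(omega, g, dg, a, b, steps=20000):
    """Midpoint-rule approximation of (3-8a): integral of omega[g(t)] . g'(t) dt."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        w = omega(g(t))   # components of the covector at g(t)
        v = dg(t)         # components of the tangent vector g'(t)
        total += sum(wi * vi for wi, vi in zip(w, v))
    return total * h

radius = 2.0  # illustrative value of a

def omega(p):
    # x dy - y dx has components (-y, x).
    return (-p[1], p[0])

def g(theta):
    return (radius * math.cos(theta), radius * math.sin(theta))

def dg(theta):
    return (-radius * math.sin(theta), radius * math.cos(theta))

value = line_integral(omega, g, dg, -math.pi / 2, math.pi / 2)

# Reversed curve: f(tau) = g(-tau), so f'(tau) = -g'(-tau).
reversed_value = line_integral(
    omega,
    lambda tau: g(-tau),
    lambda tau: tuple(-c for c in dg(-tau)),
    -math.pi / 2, math.pi / 2,
)
```

Here value should be close to πa², and reversed_value should be its negative, since reversing the direction of the curve changes the sign of the line integral.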

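Elementary properties of line integrals. From the corresponding linearity
property of ordinary integrals,

\[ \int_\gamma (\omega + \zeta) = \int_\gamma \omega + \int_\gamma \zeta, \qquad \int_\gamma (c\omega) = c\int_\gamma \omega, \qquad (3-9) \]

for any pair of 1-forms ω, ζ, and scalar c. Let g represent γ on [a, b], and let
φ be of class C^(1) on [α, β] with

\[ \phi'(\tau) < 0 \ \text{for every } \tau \in [\alpha, \beta], \qquad \phi(\alpha) = b, \quad \phi(\beta) = a. \]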



The formula for change of variables in integrals still holds if we agree as usual
in calculus that ∫_b^a = −∫_a^b. The composite f = g ∘ φ represents a curve,
which is denoted by −γ and is called the curve obtained by reversing the sense of
direction of γ. From the change of variables formula

\[ \int_{-\gamma} \omega = -\int_\gamma \omega. \qquad (3-10) \]

Let γ_1, ..., γ_p be piecewise smooth curves such that the final endpoint
of γ_j is the initial endpoint of γ_{j+1} for j = 1, ..., p − 1. Let γ be obtained
by "joining together" the curves γ_1, ..., γ_p. More precisely, let us divide
[0, 1] into p subintervals [(j − 1)/p, j/p] of the same length. Each curve γ_j
has a representation on an interval [a_j, b_j]. By a linear change of parameter
we may assume that a_j = (j − 1)/p, b_j = j/p. Let g_j be such a representation
of γ_j for each j = 1, ..., p. Then g_j(j/p) = g_{j+1}(j/p). Let g be the
function such that g(t) = g_j(t) for t ∈ [(j − 1)/p, j/p]. Then g is a parametric
representation which is piecewise of class C^(1), and γ is the curve which
g represents. Let us call γ the sum of these curves and write γ = γ_1 + ··· + γ_p.
Since an ordinary integral over [a, b] is the sum of the integrals over the
subintervals [(j − 1)/p, j/p], we have

\[ \int_{\gamma_1 + \cdots + \gamma_p} \omega = \int_{\gamma_1} \omega + \cdots + \int_{\gamma_p} \omega. \qquad (3-11) \]

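Example 2. Let γ be the boundary of a rectangle in E², directed counterclockwise
as in Fig. 3-5. Let ω = M dx + N dy. Then, writing γ_1, γ_2, γ_3, γ_4 for the four
sides taken in order, (3-11) gives

\[ \int_\gamma \omega = \int_{\gamma_1} \omega + \int_{\gamma_2} \omega + \int_{\gamma_3} \omega + \int_{\gamma_4} \omega. \]

The most convenient representation for γ_1 is obtained by setting g^1(t) = t, g^2(t) =
c, a ≤ t ≤ b. Taking similar representations for γ_2, −γ_3, −γ_4 and using (3-10),
we find that

\[ \int_\gamma \omega = \int_a^b M(t, c)\,dt + \int_c^d N(b, t)\,dt - \int_a^b M(t, d)\,dt - \int_c^d N(a, t)\,dt. \]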

FIGURE 3-5

Let us now consider the case when ω is an exact differential form, ω = df,
where f is of class C^(1) on an open set D containing the trace of γ. By the
definition (3-8a),

\[ \int_\gamma df = \int_a^b df[g(t)] \cdot g'(t)\,dt. \]

Let us use the formula

\[ (f \circ g)'(t) = df[g(t)] \cdot g'(t). \]

This is a special case of the chain rule, which will be proved later (Section 4-4).
When arc length is the parameter it says that the derivative at s of f ∘ G equals the
derivative of f in the direction of the unit tangent vector at G(s). By the
fundamental theorem of calculus

\[ \int_a^b (f \circ g)'(t)\,dt = f[g(b)] - f[g(a)]. \]

Let x_0 = g(a) and x_1 = g(b) be the endpoints of γ. Then

\[ \int_\gamma df = f(x_1) - f(x_0). \qquad (3-12) \]

This is a generalization of the fundamental theorem of calculus. It shows that
the line integral of an exact 1-form depends only on the endpoints of γ. In
particular, if γ is closed then the line integral is 0.
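Formula (3-12) can be checked numerically by integrating an exact 1-form along two different curves with the same endpoints. The following sketch is illustrative only: the function f(x, y) = x²y + y, the two curves from (0, 0) to (1, 1), and the helper line_integral (a midpoint-rule version of (3-8a)) are assumptions, not taken from the text.

```python
def line_integral(omega, g, dg, a, b, steps=2000):
    """Midpoint-rule approximation of the line integral (3-8a)."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += sum(w * v for w, v in zip(omega(g(t)), dg(t)))
    return total * h

def f(p):
    x, y = p
    return x * x * y + y

def df(p):
    # Components of df = 2xy dx + (x^2 + 1) dy.
    x, y = p
    return (2 * x * y, x * x + 1)

# Two curves from (0, 0) to (1, 1): a line segment and a parabolic arc.
along_segment = line_integral(df, lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)
along_parabola = line_integral(df, lambda t: (t, t * t), lambda t: (1.0, 2 * t), 0.0, 1.0)

endpoint_difference = f((1.0, 1.0)) - f((0.0, 0.0))
```

Both quadratures should agree with endpoint_difference = 2 up to the discretization error of the midpoint rule.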
The following theorem shows that each of these properties characterizes
exact 1-forms. We say that γ lies in D if its trace is a subset of D. By curve
we mean here piecewise smooth curve.

Theorem 7. Let D ⊂ E^n be open, and ω a continuous 1-form with domain D.
The following three statements are equivalent:
(1) ω is exact.
(2) For every closed curve γ lying in D, ∫_γ ω = 0.
(3) If γ_1 and γ_2 are any two curves lying in D with the same initial endpoint
and the same final endpoint, then ∫_{γ_1} ω = ∫_{γ_2} ω. (See Fig. 3-6.)

FIGURE 3-6    FIGURE 3-7

Proof. We have seen that (1) implies (2) in Theorem 7. If γ_1 and γ_2
have the same endpoints, then γ_1 − γ_2 is closed. If (2) holds, then

\[ 0 = \int_{\gamma_1 - \gamma_2} \omega = \int_{\gamma_1} \omega - \int_{\gamma_2} \omega. \]

Hence (2) implies (3).
It remains to show that (3) implies (1). For simplicity let us assume that
D is connected. If D is not connected, the construction to follow must be
applied separately to each component (that is, maximal connected subset) of D.
Let x_0 be some point of D, and define f as follows. Since D is open and
connected, any point of D can be joined to x_0 by a curve (Section A-7). For

every x ∈ D let

\[ f(x) = \int_\gamma \omega, \]

where γ is any curve lying in D with initial endpoint x_0 and final endpoint x.
Since we are assuming (3) in Theorem 7, it does not matter which curve with
these properties is chosen. Let us show that df = ω.
Given x ∈ D, let U be a neighborhood of x contained in D and δ the
radius of U. Let 0 < u < δ and for each i = 1, ..., n let γ_i be the line
segment from x to x + ue_i. (See Fig. 3-7.) Then

\[ f(x + ue_i) - f(x) = \int_{\gamma + \gamma_i} \omega - \int_\gamma \omega = \int_{\gamma_i} \omega. \]

Let g_i(t) = x + te_i, ψ_i(t) = ω_i(x + te_i). Then g_i represents γ_i on [0, u], and
hence

\[ \frac{1}{u}[f(x + ue_i) - f(x)] = \frac{1}{u}\int_{\gamma_i} \omega = \frac{1}{u}\int_0^u \psi_i(t)\,dt. \]

Since ω_i is a continuous function, ψ_i is continuous. Consequently, the right-hand
side tends to ψ_i(0) = ω_i(x) as u → 0+ by the fundamental theorem of
calculus. Similarly, u^{-1}[f(x + ue_i) − f(x)] tends to ω_i(x) as u → 0−.
We have shown that each partial derivative of f of order 1 exists at x
and that

\[ f_i(x) = \omega_i(x), \quad i = 1, \ldots, n. \]

Therefore df(x) = ω(x). Since this is true for every x ∈ D, ω = df. ∎

Corollary. If D is simply connected and ω is of class C^(1), then each of the
statements (1), (2), and (3) of Theorem 7 is equivalent to the statement that
ω is closed.

Work. Let D be an open connected subset of E³. In mechanics the idea
of force field is considered. A force field assigns at each x ∈ D a linear function,
which we shall call the force covector acting at x and shall denote by ω(x). If h is
a "small displacement" from x, then the work done moving a particle along the
line segment from x to x + h is approximately ω(x)·h. The force field is the
differential form ω of degree 1 whose value at each x ∈ D is the force covector
ω(x).
For present purposes it is simpler to regard force as a covector rather than
a vector. However, one can also consider the force vector F(x) = ω_1(x)e_1 +
ω_2(x)e_2 + ω_3(x)e_3 with the same components as ω(x). This simple device
for changing covectors into vectors is justified since we use the standard euclidean
inner product. If E³ is given another inner product, then the components
of F(x) would be found by formula (1-14b).
Let γ be a piecewise smooth curve lying in D. Using the notation on
p. 79, with for sake of simplicity s_j = t_{j-1}, the vector h_j = g(t_j) − g(t_{j-1})
is a displacement from g(t_{j-1}), which is small if μ is small. The work done going
along γ from g(t_{j-1}) to g(t_j) should be approximately ω[g(t_{j-1})]·h_j. This
suggests the following.

Definition. The work w done in moving a particle along γ is

\[ w = \int_\gamma \omega. \]

If arc length is used as parameter, then from the definition (3-8a),

\[ w = \int_0^l \omega[G(s)] \cdot G'(s)\,ds. \]

The expression ω[G(s)]·G'(s) is called the component of the field at G(s) in
the direction of the unit tangent vector G'(s) to γ.
A force field ω of class C^(1) is called conservative if ω is closed. By the
corollary to Theorem 7 this is the same as saying ω is exact if D is simply connected.
If ω is exact and ω = df, then f is a potential of the field ω. If D is
connected, f is determined up to the addition of a constant function.
Example 3. Let ω = −ρ^{-3}(x dx + y dy + z dz), where ρ² = x² + y² + z² and
D = E³ − {0}. If we agree that f(x, y, z) → 0 as ρ → ∞, then the potential f is
given by f(x) = ρ^{-1}. Except for a multiplicative constant it is the Newtonian potential
due to a mass concentrated at 0.
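The claim that f = ρ^{-1} is a potential of ω can be checked by comparing difference quotients of f with the components of ω. The sketch below is illustrative: the sample point (1, 2, 2), the step size, and the helper names are assumptions, not part of the text.

```python
import math

def potential(p):
    """f = 1/rho, with rho the euclidean distance from 0."""
    return 1.0 / math.sqrt(sum(c * c for c in p))

def force_covector(p):
    """Components of omega = -rho^(-3) (x dx + y dy + z dz)."""
    rho = math.sqrt(sum(c * c for c in p))
    return tuple(-c / rho ** 3 for c in p)

def partials(f, p, h=1e-5):
    """Central difference quotients approximating the partial derivatives of f at p."""
    result = []
    for i in range(len(p)):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        result.append((f(plus) - f(minus)) / (2 * h))
    return tuple(result)

point = (1.0, 2.0, 2.0)   # rho = 3 at this illustrative point
numeric_df = partials(potential, point)
omega_at_point = force_covector(point)
```

At this point ω has components −(1, 2, 2)/27, and the difference quotients in numeric_df should match them to several decimal places.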

Steady fluid flow. Let ω(x) be interpreted as the velocity at x associated
with a fluid flowing in D. It is assumed that the velocity at x does not vary
with time. The component of the velocity in the direction of the unit tangent
vector is ω[G(s)]·G'(s). The above expression ∫_γ ω is called the circulation
along γ. The flow is called irrotational if ∫_γ ω = 0 for every closed curve γ
which lies in some simply connected open subset of D. If ω is of class C^(1),
this is equivalent to the fact that ω is closed.
In Section 7-6 we shall again mention fluid flows, and shall define there
the rotation covector curl ω(x) which is everywhere 0 if the velocity field is
irrotational.

PROBLEMS

1. Evaluate ½ ∫_γ x dy − y dx in case:
(a) γ bounds the triangle shown in Fig. 3-8.
(b) γ is represented by g(t) = (a cos t)e_1 + (b sin t)e_2, 0 ≤ t ≤ 2π, where a, b > 0.
Your answer should be the area of the set enclosed by γ. This is a very special case of
Green's theorem (Section 7-6).

FIGURE 3-8. Triangle with vertices (0, 0), (a, 0), (b, c).

2. Let g_1(t) = te_1 + (2t − 1)e_2, 1 ≤ t ≤ 2, and g_2(t) = (t + 1)e_1 + (t² + t + 1)e_2,
0 ≤ t ≤ 1. Evaluate ∫_{γ_1} ω and ∫_{γ_2} ω for each of the 1-forms (a), (c), and (d) in
Problem 3, Section 2-6, where γ_1 and γ_2 are the curves represented by g_1 and g_2,
respectively.

3. Let n = 1, ω = f dx, and γ be the interval [a, b] directed from a to b. Show that

\[ \int_\gamma \omega = \int_a^b f(x)\,dx. \]

4. Let D ⊂ E² be open and simply connected, and u, v functions of class C^(1) which
satisfy

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \]

[Note: These two first-order partial differential equations are called the Cauchy-
Riemann equations. They are fundamental to the theory of complex analytic
functions.]
Show that for any closed curve γ lying in D,

\[ \int_\gamma u\,dx - v\,dy = 0, \qquad \int_\gamma v\,dx + u\,dy = 0. \]

5. Show that the force field ω = ψ(ρ²)(x dx + y dy + z dz) is conservative, where ψ is
any function of class C^(1) on E¹ and ρ² = x² + y² + z². Find its potential f,
OMe 0:
Other line integrals. The following problems deal not with integrals of 1-forms,
but with some other types of line integrals which often occur.
6. If f is continuous on D, then (by definition)

\[ \int_\gamma f\,ds = \int_a^b f[g(t)]\,|g'(t)|\,dt. \]

Show that this integral does not depend on the particular representation g chosen
for γ. In particular, if arc length is the parameter, then

\[ \int_\gamma f\,ds = \int_0^l f[G(s)]\,ds. \]

7. The moment of inertia of a curve γ about a point x_0 is ∫_γ |x − x_0|² ds. Find the
moment of inertia about 0 of the line segment in E³ joining e_1 and e_2 + 2e_3.
8. The centroid x̄ of a curve γ is the point such that

\[ \bar{x}^i = \frac{1}{l}\int_\gamma x^i\,ds, \quad i = 1, \ldots, n, \]

where l is the length of γ.
(a) Find the centroid of the helical curve in Problem 3, Section 3-2.
(b) Find its moment of inertia about πe_3.
9. Let W be continuous on D × E^n and satisfy the homogeneity condition W(x, ch) =
cW(x, h) whenever c > 0. Then (by definition)

\[ \int_\gamma W = \int_a^b W[g(t), g'(t)]\,dt. \]

Show that:
(a) This integral does not depend on the particular representation g of γ.
(b) If W(x, h) = ω(x)·h, then ∫_γ W = ∫_γ ω; and if W(x, h) = f(x)|h|, ∫_γ W =
∫_γ f ds.
(c) Let W(x, h) = ||h||, where || || is any norm on E^n (Section 1-6). Then ∫_γ W
is called the length of γ with respect to this norm. Show that if γ is the line
segment joining x_1 and x_2, then the length is ||x_1 − x_2||.

*3-4 GRADIENT METHOD


In Section 2-5 we found relative extrema of a function f by calculating
the critical points and testing them by Theorem 6. However, in practice the
equation df(x) = 0 for the critical points can be explicitly solved only when f
has some special form. On p. 61 a method for finding critical points approxi-
mately was indicated. It is called the gradient method, or method of steepest
ascent, and will now be described more precisely.
Let D be an open set and F = (F^1, ..., F^n) a vector-valued function
whose components F^i are of class C^(1) on D. A function g from an interval J
into E^n is called a solution of the system of first-order ordinary differential
equations

\[ \frac{dx^i}{dt} = F^i(x), \quad i = 1, \ldots, n, \]

if g'(t) = F[g(t)] for every t ∈ J. An existence theorem for such systems
(see [5], Chap. 1) states that given x_1 ∈ D there is a solution g on some open
interval J containing 0 such that g(0) = x_1.
Now let f be of class C^(2) on D. Let F(x) = grad f(x) be the gradient vector
of f at x. Assume that x_1 is not a critical point of f. A solution g of

\[ g'(t) = \operatorname{grad} f[g(t)], \qquad g(0) = x_1, \qquad (3-13) \]

is called a gradient trajectory of f through x_1. We shall prove later (Section 4-7)
that grad f(x) is a normal vector to the level set of f containing x. Therefore the
gradient trajectories are normal to the level sets, as indicated in Fig. 3-9.

FIGURE 3-9

Let φ(t) = f[g(t)]. By the chain rule, which will be proved in Section 4-4,

\[ \phi'(t) = \operatorname{grad} f[g(t)] \cdot g'(t) = |\operatorname{grad} f[g(t)]|^2 > 0. \]

Hence φ is increasing. In other words, the values of f increase along each gradient
trajectory as t increases.
Let us assume that there is a gradient trajectory g through x_1 which is
defined for every t ≥ 0.
By the uniqueness theorem for systems of differential equations ([5], Chap. 1)
g(t) is never a critical point. However, one may ask whether g(t) approaches
a critical point x_0 as t → +∞. While this is not always true, we shall prove
two partial results in this direction.

Proposition. If g(t) tends to a limit x_0 as t → +∞, then x_0 is a critical
point of f.

Proof. Define φ as above. Then φ is increasing and

\[ f(x_0) = \lim_{t \to +\infty} \phi(t). \]

Suppose that grad f(x_0) ≠ 0. Since grad f is continuous there exists a neighborhood
U of x_0 and m > 0 such that |grad f(x)| ≥ m for every x ∈ U. There
exists t_1 such that g(t) ∈ U for every t ≥ t_1. By the fundamental theorem of
calculus, if t_1 < t_2 then

\[ \phi(t_2) = \phi(t_1) + \int_{t_1}^{t_2} \phi'(t)\,dt \ge \phi(t_1) + m^2 (t_2 - t_1). \]

The right-hand side tends to +∞ as t_2 → +∞, but φ(t_2) ≤ f(x_0). This is a
contradiction. Hence grad f(x_0) = 0. ∎
Note: It may happen that a trajectory g remains in a compact set K ⊂ D
for every t ≥ 0, but g(t) does not approach a limit x_0 as t → +∞. In that
case it can be shown that g has a "limit set" B, which consists of all accumulation
points of sequences [g(t_j)] for all possible sequences t_1, t_2, ... tending
to +∞. B is a compact, connected subset of the level set {x : f(x) = C}, where
C = lim_{t→+∞} φ(t), and every point of B is critical. If f has only isolated critical
points, then B is a single point x_0 and g(t) → x_0 as t → +∞. We shall not
prove this.

However, let us show that if x_0 is a nondegenerate critical point at which
f has a relative maximum, then any trajectory starting sufficiently near x_0
leads to x_0.

Proposition. Let x_0 be a nondegenerate critical point such that Q(x_0, ·) is
negative definite. Then there is a neighborhood U of x_0 such that x_0 =
lim_{t→+∞} g(t) provided x_1 ∈ U.

Proof. By Problem 8, Section 2-5, there exists m > 0 such that Q(x_0, h) ≤
−m|h|² for every h. Since f is of class C^(2) there is a neighborhood U of x_0
such that |f_{ij}(x_0 + sh) − f_{ij}(x_0)| < m/(2n²) for i, j = 1, ..., n, whenever
x_0 + h ∈ U and s ∈ (0, 1). Since f_i(x_0) = 0 there is by the mean value theorem
s_i ∈ (0, 1) such that

\[ f_i(x_0 + h) = \sum_{j=1}^{n} f_{ij}(x_0 + s_i h)\,h^j. \]

If x ∈ U and h = x − x_0, then

\[ \operatorname{grad} f(x) \cdot (x - x_0) = \sum_{i,j=1}^{n} f_{ij}(x_0 + s_i h)\,h^i h^j \le -\frac{m}{2}|h|^2. \]

Now let ψ(t) = |g(t) − x_0|², the square of the distance from x_0. Taking
x = g(t), we have provided g(t) ∈ U

\[ \psi'(t) = 2[g(t) - x_0] \cdot g'(t) \le -m|g(t) - x_0|^2, \]

which becomes ψ'(t) ≤ −mψ(t). Now g(0) = x_1 ∈ U, and since ψ is decreasing,
g(t) ∈ U for every t ≥ 0. Dividing by ψ(t) in the inequality ψ'(t) ≤ −mψ(t)
and integrating over [0, t],

\[ \log \psi(t) - \log \psi(0) \le -mt. \]

The right-hand side tends to −∞ as t → +∞.
Hence so does log ψ(t); and ψ(t) tends to 0. ∎

FIGURE 3-10
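The estimate ψ(t) ≤ ψ(0) exp(−mt) obtained in the proof can be observed on a simple example. The sketch below is illustrative and makes several assumptions that are not in the text: f(x, y) = −x² − 2y², whose quadratic form Σ f_{ij} h^i h^j = −2(h^1)² − 4(h^2)² satisfies the hypothesis with m = 2; the starting point (1, −0.5); and the explicit Euler method as a crude stand-in for the exact solution of (3-13).

```python
import math

def f(p):
    # Illustrative function with a nondegenerate maximum at (0, 0).
    x, y = p
    return -x * x - 2.0 * y * y

def grad_f(p):
    x, y = p
    return (-2.0 * x, -4.0 * y)

def euler_trajectory(start, dt, n_steps):
    """Explicit Euler steps for g'(t) = grad f[g(t)], g(0) = start."""
    path = [start]
    p = start
    for _ in range(n_steps):
        gx, gy = grad_f(p)
        p = (p[0] + dt * gx, p[1] + dt * gy)
        path.append(p)
    return path

dt, n_steps = 1e-3, 3000
path = euler_trajectory((1.0, -0.5), dt, n_steps)

psi = [x * x + y * y for x, y in path]   # squared distance from x_0 = (0, 0)
values = [f(p) for p in path]            # f along the approximate trajectory
m = 2.0
T = dt * n_steps
decay_bound = psi[0] * math.exp(-m * T)  # psi(0) exp(-mT)
```

Along the computed path the values of f increase and psi decreases, and psi at time T lies below decay_bound, in line with the proposition.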

Figure 3-9 indicates the behavior of the level sets and gradient trajectories
near a nondegenerate maximum. The situation is similar near a nondegenerate
minimum. The gradient trajectories in that case are followed as t → −∞.
Near a saddle point (n = 2) the behavior of the trajectories is indicated
in Fig. 3-10.

Example 1. Let f(x, y) = x² − y². The equations of the gradient trajectories are

\[ \frac{dx}{dt} = 2x, \qquad \frac{dy}{dt} = -2y, \]

whose solutions are g^1(t) = x_1 exp (2t), g^2(t) = y_1 exp (−2t). The trajectories lie
on the hyperbolas xy = k orthogonal to the level sets x² − y² = c, and on the
coordinate axes. Only trajectories starting from points (0, y_1) lead to the saddle
point (0, 0).
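The two observations of this example, that xy is constant along each trajectory and that only starting points on the y-axis lead to the saddle point, can be checked directly from the closed-form solutions. The starting points and sample times in the following sketch are illustrative assumptions.

```python
import math

def trajectory(x1, y1, t):
    """Gradient trajectory of f(x, y) = x**2 - y**2 through (x1, y1), in closed form."""
    return (x1 * math.exp(2 * t), y1 * math.exp(-2 * t))

times = [0.0, 0.5, 1.0, 2.0]

# Off the axes: the product xy keeps the value x1 * y1.
products = [x * y for x, y in (trajectory(0.3, 1.5, t) for t in times)]

# On the y-axis: the trajectory runs into the saddle point (0, 0).
on_axis = trajectory(0.0, 1.5, 10.0)

# Off the y-axis: the first coordinate grows without bound.
off_axis = trajectory(0.3, 1.5, 10.0)
```

The entries of products all equal 0.45 up to rounding, on_axis is very close to (0, 0), and the first coordinate of off_axis is large.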

PROBLEMS
1. Sketch the level sets and gradient trajectories.
(2) (Gy) = 12? — 2y?,
(b) f(x, y) = xy + y.
2. Let f be a strictly concave function on E^n which has an absolute maximum at
x_0. By Theorem 5, f has no other critical points. Consider any gradient trajectory g.
(a) Let ψ(t) = |g(t) − x_0|². Show that ψ is nonincreasing.
(b) Show that ψ(t) → 0, and hence g(t) → x_0, as t → +∞.
3. Let G be the representation with arc length as parameter of a gradient trajectory g.
Show that G'(s) is the direction of the gradient at G(s). See pp. 61, 78.
CHAPTER 4

Vector-Valued Functions
of Several Variables

In this chapter we shall study the differential calculus of functions of
several variables with values in E^n. Among the main results will be the theorems
about composition and inverses, and the implicit function theorem. Later in
the chapter, subsets of E^n which are smooth manifolds are considered, and the
spaces of tangent and normal vectors at a point of a smooth manifold are
found. These ideas are then applied to obtain the Lagrange multiplier rule
for constrained extremum problems.
Functions with values in E^n will be called transformations rather than
vector-valued functions. This term has a useful geometric connotation, and
it also agrees with rather common usage. Some authors use instead the term
"mapping." The differential calculus of transformations is based on local
linear approximations, just as for the special case (n = 1) of real-valued
functions already considered in Chapter 2. Consequently, it is first necessary
to review some results about linear transformations. This is done in Section 4-2.

4-1 TRANSFORMATIONS
Let n and r be positive integers. Let g be a function with domain Δ ⊂ E^r
and values in E^n. Such a vector-valued function g is called a transformation
from Δ into E^n. Let points of Δ be denoted by t = (t^1, ..., t^r). The image
of a set B ⊂ Δ is the set {g(t) : t ∈ B}. It is denoted by g(B). The inverse
image of a set A ⊂ E^n is the set {t ∈ Δ : g(t) ∈ A}. It is denoted by g^{-1}(A).
These ideas are indicated schematically in Fig. 4-1.
We recall that if g is continuous then the image of any compact set is
compact and the image of any connected set is connected. Moreover, if g is
continuous then the inverse image of any open set is open relative to Δ. See
Section A-6. The inverse image of a compact set under a continuous transformation
need not be compact, nor the inverse image of a connected set connected.
The image of an open set need not be open.

Figure 4-1

Example 1. Let n = 1. If we regard E¹ as a 1-dimensional vector space over itself,
then every real-valued function is a transformation.

Example 2. Let r = 1, Δ = [a, b], and g be a parametric representation of a curve γ
as in Section 3-2. The image of Δ is the trace of the curve. If A = {x} is the set
consisting of a single point x of the trace, then g^{-1}(A) is a finite subset of Δ. The
number of its elements is the multiplicity of x.

Example 3. Let n = r = 2. Points of the domain Δ are denoted by (s, t) and those
of the image g(Δ) by (x, y). It is helpful to think of two copies of the plane E². The
first contains Δ and will be called the st-plane. The second contains g(Δ) and will be
called the xy-plane.
In our example we let Δ be the whole st-plane, and

\[ g(s, t) = (s^2 + t^2)e_1 + 2st\,e_2 \]

for every (s, t) ∈ E². If (x, y) ∈ g(Δ), then x = s² + t², y = 2st and x + y ≥ 0,
x − y ≥ 0. Therefore g(Δ) is contained in the quadrant Q shown in Fig. 4-2. In
fact, g(Δ) = Q. This is seen as follows: Let C be a circle with center (0, 0) and radius
a > 0. Points of C are given by s = a cos φ, t = a sin φ, where 0 ≤ φ ≤ 2π. The
image g(C) is the trace of the curve represented on [0, 2π] by

\[ g(a\cos\phi, a\sin\phi) = a^2 e_1 + (a^2 \sin 2\phi)\,e_2. \]

FIGURE 4-2. The st-plane (Δ = E²) and the quadrant Q in the xy-plane; x = s² + t², y = 2st.

This trace is the vertical line segment shown in Fig. 4-2. By letting a take all possible
nonnegative values, one gets a collection of line segments covering Q. If a = 0,
the line segment degenerates to the point (0, 0). This shows that Q = g(Δ).

The components g^1, ..., g^n of a transformation g with respect to the
standard basis for E^n are the real-valued functions such that

\[ g(t) = (g^1(t), \ldots, g^n(t)) = \sum_{i=1}^{n} g^i(t)\,e_i \]

for every t ∈ Δ. In Example 3,

\[ g^1(s, t) = s^2 + t^2, \qquad g^2(s, t) = 2st. \]

Composition and inverses. The composite of two transformations f and g
is denoted by f ∘ g. It is defined whenever g has its values in the domain of f.
We recall that if f and g are continuous, then the composite is also continuous
(Proposition A-7). In Section 4-4 we shall prove that if f and g are differentiable,
then f ∘ g is also differentiable.
A transformation g is univalent if distinct points of Δ have distinct images;
that is, if g(t_1) = g(t_2) implies t_1 = t_2. (The equivalent term "one-one transformation"
is also used.) If g is univalent, let g^{-1} be the transformation whose
value at each point x ∈ g(Δ) is the point t ∈ Δ such that g(t) = x. Then
g^{-1} has domain g(Δ) and

\[ (g^{-1} \circ g)(t) = t, \qquad (g \circ g^{-1})(x) = x \]

for every t ∈ Δ and x ∈ g(Δ). The transformation g^{-1} is called the inverse
of g. The notation g^{-1}(A) for inverse image of a set is consistent with this
one. If g is univalent and A ⊂ g(Δ), then g^{-1}(A) is the image of A under g^{-1}.

Example 3 (continued). This transformation g is not univalent. If (x, y) = g(s, t),
then (s + t)² = x + y, (s − t)² = x − y. These equations have four solutions
for (s, t) if (x, y) is interior to Q. However, if we take in each case the principal square
root, then s − t = √(x − y), s + t = √(x + y),

\[ s = \frac{\sqrt{x + y} + \sqrt{x - y}}{2}, \qquad t = \frac{\sqrt{x + y} - \sqrt{x - y}}{2}. \]

Such points (s, t) belong to Q (regarded as a subset of the st-plane). Let ĝ be the
restriction of g to Q. Then ĝ is univalent and ĝ(Q) = g(Δ) = Q. The value of the
inverse ĝ^{-1} at any (x, y) ∈ Q is

\[ \hat{g}^{-1}(x, y) = \left( \frac{\sqrt{x + y} + \sqrt{x - y}}{2},\ \frac{\sqrt{x + y} - \sqrt{x - y}}{2} \right). \]

The main theorem about inverses of transformations appears in Section 4-5.
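The failure of univalence and the inverse of the restricted transformation in Example 3 can be illustrated with a few lines of code. This is a sketch only: the function names g and g_hat_inverse and the sample point (2, 1) are illustrative choices, not notation from the text.

```python
import math

def g(s, t):
    """The transformation of Example 3: (s, t) -> (s^2 + t^2, 2st)."""
    return (s * s + t * t, 2 * s * t)

def g_hat_inverse(x, y):
    """Inverse of g restricted to Q = {s + t >= 0, s - t >= 0}.

    Requires x + y >= 0 and x - y >= 0, that is, (x, y) in Q.
    """
    p = math.sqrt(x + y)   # s + t
    q = math.sqrt(x - y)   # s - t
    return ((p + q) / 2, (p - q) / 2)

# Four distinct points of the st-plane with the same image.
preimages = [(2.0, 1.0), (-2.0, -1.0), (1.0, 2.0), (-1.0, -2.0)]
images = [g(s, t) for s, t in preimages]

# The restricted inverse singles out the preimage lying in Q.
recovered = g_hat_inverse(*images[0])
```

All four points are sent to (5, 4), which is interior to Q, and recovered is the one preimage (2, 1) that lies in Q.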



PROBLEMS
1. Let n = 1, Δ = E^r, and g(t) = t·t − 2|t|.
(a) Find the image g(E^r).
(b) For each c find the inverse image of the semi-infinite interval [c, ∞).
2. In Example 3 find:
(a) The image of any vertical line s = c.
(b) The inverse image of any line y = mx through the origin.
(c) The image of the circular disk bounded by C in Fig. 4-2.
3. Let g(s, t) = |s − t|e_1 + |s + t|e_2, Δ = E². Find g(E²) and answer questions
(a), (b), and (c) in Problem 2 for this transformation.
4. Let g(s, t) = (t cos 2πs)e_1 + (t sin 2πs)e_2 + (1 − t)e_3, Δ = E².
(a) Show that g(E²) is a cone with vertex e_3.
(b) What is the image of the square {(s, t) : 0 ≤ s ≤ 1, 0 ≤ t ≤ 1}?
(c) Find g^{-1}({e_3}) and g^{-1}({e_1}).
5. Let g(s, t) = 1/(s² + st + t²)e_1 + 1/(s² + st + t²)²e_2, Δ = {(s, t) : 0 < s² + t² < 1}.
(a) Show that g(Δ) is part of the parabola y = x², and find it.
(b) Find g^{-1}({(c, c²)}).

4-2 LINEAR AND AFFINE TRANSFORMATIONS


In this section we shall collect some facts about linear transformations
from one euclidean vector space into another. For those results which are
stated without proof, references are given to [12]. However, the results in
question are standard in linear algebra and may be found in practically any
good book on the subject.

Definition. A set P ⊂ E^n is a vector subspace of E^n if:
(1) The sum x + y of any two elements x, y ∈ P is also an element of
P; and
(2) Any scalar multiple cx of an element x ∈ P is also an element of P.

In other words, P is a vector subspace of E^n if P provided with the addition
and scalar multiplication in E^n satisfies the axioms for a vector space.
P has a dimension ρ and 0 ≤ ρ ≤ n. If ρ = 0 then P = {0}. If ρ = 1,
then P is a line containing 0; if ρ = 2, P is a plane containing 0, and so on.

Definition. A transformation L from E^r into E^n is linear if:
(1) L(s + t) = L(s) + L(t) for every s, t ∈ E^r; and
(2) L(cs) = cL(s) for every s ∈ E^r and scalar c.

This is a special case of the definition in Section A-2 of the Appendix.

If L is a linear transformation, then the set L(E^r) is a vector subspace
of E^n. To prove this, let x, y ∈ L(E^r). Then x = L(s), y = L(t) for some
s, t ∈ E^r. But x + y = L(s) + L(t) = L(s + t). Hence x + y ∈ L(E^r).
Similarly, if x ∈ L(E^r) then cx ∈ L(E^r) for every scalar c. The dimension ρ
of the vector space L(E^r) is called the rank of L. The kernel of L is {t : L(t) = 0}.
It is a vector subspace of E^r (Problem 2). The dimension ν of the kernel is
called the nullity of L. The rank and nullity are related by (see [12], p. 66)

\[ \rho + \nu = r. \qquad (4-1) \]
The matrix of L. Let us denote the standard basis vectors for E^r by
ε_1, ..., ε_r and those for E^n by e_1, ..., e_n. With a linear transformation L
and these bases is associated a matrix (c^i_j) with n rows and r columns, in the
following way. For each j = 1, ..., r let v_j = L(ε_j), and let c^i_j be the ith
component of the vector v_j:

\[ v_j = \sum_{i=1}^{n} c^i_j e_i, \quad j = 1, \ldots, r. \qquad (4-2) \]

Then (c^i_j) is the matrix of L, and v_1, ..., v_r are the column vectors of this
matrix. Note that the superscript i indicates the row, and subscript j the
column, of the matrix.
Actually, for any pair of bases for E^r and E^n there is a matrix associated
with L. It is shown in linear algebra that by suitable choice of bases the associated
matrix can be made to have some special form [for instance, the Jordan
canonical form if r = n ([12], p. 207)]. What we have called "the" matrix of
L is the matrix corresponding to the standard bases for E^r and E^n.
Since L is linear,

\[ L(t) = L\Big( \sum_{j=1}^{r} t^j \varepsilon_j \Big) = \sum_{j=1}^{r} t^j L(\varepsilon_j). \]

Hence L(t) is a linear combination of the column vectors:

\[ L(t) = \sum_{j=1}^{r} t^j v_j. \qquad (4-3a) \]

If x = L(t) and we take components of each side of (4-3a), then

\[ x^i = \sum_{j=1}^{r} c^i_j t^j, \quad i = 1, \ldots, n. \qquad (4-4a) \]

The components L^1, ..., L^n of L are real-valued linear functions, in other
words, covectors. Hence

\[ L^i(t) = w^i \cdot t, \]

where w^i is just another notation for the covector L^i (p. 12). From (4-4a)
the components of w^i are the entries c^i_1, ..., c^i_r of the ith row. For that reason
w^1, ..., w^n are called the row covectors of the matrix (c^i_j).

\[
\begin{array}{c|cccc}
 & v_1 & v_2 & \cdots & v_r \\ \hline
w^1 & c^1_1 & c^1_2 & \cdots & c^1_r \\
w^2 & c^2_1 & c^2_2 & \cdots & c^2_r \\
\vdots & \vdots & \vdots & & \vdots \\
w^n & c^n_1 & c^n_2 & \cdots & c^n_r
\end{array}
\]

By (4-3a) the column vectors v,,...,V, span L(#’). The rank p equals
the largest number of linearly independent column vectors of the matrix. Since
row rank equals column rank ({12], p. 105), p is also the largest number of
linearly independent row covectors of the matrix.

Composition. Let L be linear from $E^r$ into $E^n$, and M linear from $E^n$ into $E^p$. The composite $M \circ L$ is linear. Its matrix is the product of the matrices of M and L (Problem 4).

The case r = n. Let I denote the identity linear transformation, I(t) = t for every $t \in E^n$. Its matrix is $(\delta^i_j)$, which has 1 for each element of the principal diagonal and 0 elsewhere. L is nonsingular if it has rank $\rho = n$, and singular if $\rho < n$. A nonsingular linear transformation L has an inverse $L^{-1}$, which is also a linear transformation with

$$L^{-1} \circ L = L \circ L^{-1} = I.$$

If r = n, then the $n \times n$ matrix $(c^i_j)$ has a determinant, denoted by $\det (c^i_j)$. This number will also be called the determinant of L. Thus by definition

$$\det L = \det (c^i_j).$$

Among the properties of determinants we recall:

$\det (M \circ L) = \det M \, \det L$. (See Reference [12], p. 143.)

$\det L = 0$ if and only if L is singular. (See Reference [12], p. 150.)

By (4-1) L is singular if and only if $\nu > 0$. But $\nu > 0$ means that L(t) = 0 for some $t \neq 0$. Therefore, from (4-4a) the system of homogeneous linear equations

$$\sum_{j=1}^{n} c^i_j t^j = 0, \qquad i = 1, \ldots, n, \tag{4-5}$$

has a nontrivial solution if and only if det L = 0.

In later chapters we shall see that the absolute value of the determinant
is the ratio of n-dimensional volumes, and the sign of the determinant deter-
mines an orientation.

Example 1. Let n = r = p = 2. Let $L(s, t) = (2s + t)e_1 + (3s - t)e_2$. The matrix of L is

$$\begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}.$$

The row covectors are $w^1 = 2e^1 + e^2$, $w^2 = 3e^1 - e^2$. The column vectors are $v_1 = 2e_1 + 3e_2$, $v_2 = e_1 - e_2$. Since $\det L = -5 \neq 0$, L is nonsingular. (See Fig. 4-3.)

[Figure 4-3. The transformation (x, y) = L(s, t), with x = 2s + t, y = 3s - t.]

Let $M(x, y) = (2x - 5y)E_1 - xE_2$, where $E_1, E_2$ denote the standard basis vectors for the plane $E^2$ in which M has its values. The matrix of M is

$$\begin{pmatrix} 2 & -5 \\ -1 & 0 \end{pmatrix}.$$

Since $\det M = -5 \neq 0$, M is also nonsingular. The composite is found by

$$(M \circ L)(s, t) = M(2s + t, 3s - t) = [2(2s + t) - 5(3s - t)]E_1 - (2s + t)E_2,$$

$$(M \circ L)(s, t) = (-11s + 7t)E_1 - (2s + t)E_2.$$

The matrix of $M \circ L$ is

$$\begin{pmatrix} -11 & 7 \\ -2 & -1 \end{pmatrix}.$$

As expected, $\det (M \circ L) = 25 = \det M \, \det L$.
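The arithmetic of Example 1 can be checked mechanically. The following Python sketch is an illustration added to the text, not part of the original; the helper names matmul and det2 are hypothetical, and matrices are stored as lists of rows.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


L = [[2, 1], [3, -1]]    # columns are v_1 = (2, 3) and v_2 = (1, -1)
M = [[2, -5], [-1, 0]]   # M(x, y) = (2x - 5y, -x)

ML = matmul(M, L)        # matrix of the composite M o L
print(ML)
print(det2(M), det2(L), det2(ML))
```

The product of the two matrices reproduces the matrix of the composite, and the determinants multiply as stated.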

Example 2. Let n = r = 3, and let L be the linear transformation which takes the standard basis vectors $\epsilon_1, \epsilon_2, \epsilon_3$ respectively into

$$v_1 = e_1 + 2e_2 - e_3, \qquad v_2 = -e_1 + e_2, \qquad v_3 = -e_1 + 4e_2 - e_3.$$

The matrix is

$$\begin{pmatrix} 1 & -1 & -1 \\ 2 & 1 & 4 \\ -1 & 0 & -1 \end{pmatrix},$$

which has $v_1, v_2, v_3$ as column vectors. The determinant is 0, and therefore L is singular. In fact, $v_3$ is a linear combination of $v_1$ and $v_2$, namely, $v_3 = v_1 + 2v_2$. Since $v_1$ and $v_2$ are linearly independent, the rank of L is 2. $L(E^3)$ is the plane containing 0, $v_1$, $v_2$. By (4-1) the kernel has dimension 1. It is found by solving the system (4-5) of homogeneous linear equations. One solution is $t_1 = \epsilon_1 + 2\epsilon_2 - \epsilon_3$. The kernel consists of all scalar multiples of $t_1$.
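As a quick check of Example 2, the sketch below (an added illustration; apply and det3 are hypothetical helper names) confirms that the determinant vanishes, that $v_3 = v_1 + 2v_2$, and that $t_1$ is carried to 0.

```python
def apply(m, t):
    """Apply the matrix m (list of rows) to the vector t."""
    return [sum(row[j] * t[j] for j in range(len(t))) for row in m]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[1, -1, -1],
     [2, 1, 4],
     [-1, 0, -1]]   # columns are v_1, v_2, v_3

v1, v2, v3 = ([row[j] for row in A] for j in range(3))
t1 = [1, 2, -1]     # the kernel vector found in Example 2

print(det3(A))
print(v3 == [a + 2 * b for a, b in zip(v1, v2)])
print(apply(A, t1))
```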

The dual L* of a linear transformation. This is the linear transformation from the dual space $(E^n)^*$ into $(E^r)^*$, defined by the formula

$$L^*(a) \cdot t = a \cdot L(t)$$

for every covector $a \in (E^n)^*$ and vector $t \in E^r$.

$$E^r \xrightarrow{\;L\;} E^n, \qquad (E^r)^* \xleftarrow{\;L^*\;} (E^n)^*$$

In particular, let $a = e^i$. Then

$$L^*(e^i) \cdot t = e^i \cdot L(t) = L^i(t) = w^i \cdot t$$

for every $t \in E^r$. Hence

$$L^*(e^i) = w^i, \qquad i = 1, \ldots, n,$$

where $w^i$ is the ith row covector. The formula dual to (4-3a) is

$$L^*(a) = \sum_{i=1}^{n} a_i w^i. \tag{4-3b}$$

If b = L*(a), then its components are given by

$$b_j = \sum_{i=1}^{n} a_i c^i_j, \qquad j = 1, \ldots, r. \tag{4-4b}$$

This follows from (4-3b) since the jth component of $w^i$ is $c^i_j$.

Affine transformations. If L is linear, then L(0) = 0. This fact gives 0 a special role which is somewhat unnatural from the geometric viewpoint. To avoid this it is sometimes better to deal instead with affine transformations.

Definition. A transformation g is affine if there exist a linear transformation L and $x_0 \in E^n$ such that

$$g(t) = L(t) + x_0 \quad \text{for every } t \in E^r. \tag{4-6}$$

If r = n and L = I, then g is a translation.

A translation merely takes each t into $t + x_0$. If g is affine, then $g(0) = x_0$. Hence an affine transformation g is linear if and only if g(0) = 0. Every affine transformation is the composite of a translation and a linear transformation.

Isometries of $E^n$. Let g be a transformation from $E^n$ into $E^n$. If g preserves the distance between each pair of points, then g is called an isometry.

Definition. If $|g(s) - g(t)| = |s - t|$ for every $s, t \in E^n$, then g is an isometry of $E^n$.

Let us first suppose that g is an isometry of $E^n$ which leaves 0 fixed, namely, g(0) = 0. Then taking t = 0, we have $|g(s)| = |s|$ for every $s \in E^n$. Using the formula $|x - y|^2 = |x|^2 - 2x \cdot y + |y|^2$, we have

$$|g(s)|^2 - 2g(s) \cdot g(t) + |g(t)|^2 = |s|^2 - 2s \cdot t + |t|^2,$$

for every $s, t \in E^n$. Therefore

$$g(s) \cdot g(t) = s \cdot t, \tag{4-7}$$

which says that g preserves the inner product. Let $v_i = g(e_i)$, $i = 1, \ldots, n$. Then $|v_i| = 1$ and from (4-7)

$$v_i \cdot v_j = e_i \cdot e_j = \delta_{ij}$$

for each $i, j = 1, \ldots, n$, where, as in Section 1-1, $\delta_{ij}$ is Kronecker's delta. Hence $v_1, \ldots, v_n$ form an orthonormal basis for $E^n$. Let us show that g is a linear transformation. For each s, t, we have from (4-7)

$$g(s) \cdot v_j = s \cdot e_j = s^j, \qquad g(t) \cdot v_j = t^j, \qquad g(s + t) \cdot v_j = s^j + t^j,$$

and hence

$$[g(s + t) - g(s) - g(t)] \cdot v_j = 0,$$

for each $j = 1, \ldots, n$. The vector $g(s + t) - g(s) - g(t)$ has component 0 with respect to each basis vector $v_j$. Hence

$$g(s + t) = g(s) + g(t).$$

Similarly g(cs) = cg(s) for every s and scalar c. Thus g is linear. The column vectors of its matrix are $v_1, \ldots, v_n$.

Definition. A linear transformation which preserves the standard euclidean inner product is an orthogonal transformation.

Proposition 11. (r = n). L is an orthogonal transformation if and only if the column vectors $v_1, \ldots, v_n$ form an orthonormal basis for $E^n$.

Proof. We have already shown that if L is orthogonal, then $v_1, \ldots, v_n$ form an orthonormal basis. To prove the converse, we see from (4-3a) that

$$L(s) \cdot L(t) = \sum_{i,j=1}^{n} s^i t^j \, v_i \cdot v_j.$$

If $v_1, \ldots, v_n$ is an orthonormal basis, then $v_i \cdot v_j = \delta_{ij}$, and

$$L(s) \cdot L(t) = s \cdot t$$

for every $s, t \in E^n$. ∎

Theorem 8. (r = n). A transformation g is an isometry of $E^n$ if and only if g is an affine transformation of the form $g(t) = L(t) + x_0$ for every $t \in E^n$, where L is orthogonal.

Proof. Let g be an isometry of $E^n$. Let $f(t) = g(t) - x_0$ for every $t \in E^n$, where $x_0 = g(0)$. Then

$$|f(s) - f(t)| = |g(s) - g(t)| = |s - t|$$

for every $s, t \in E^n$. Hence f is an isometry. Moreover, f(0) = 0. We have already shown that f must be orthogonal.
Conversely, let L be orthogonal. Then $L(s) \cdot L(t) = s \cdot t$ for every $s, t \in E^n$. Taking s = t, we have

$$|L(s)|^2 = L(s) \cdot L(s) = s \cdot s = |s|^2.$$

Hence $|L(s)| = |s|$. Replacing s by s - t, we have

$$|L(s) - L(t)| = |L(s - t)| = |s - t|$$

for every $s, t \in E^n$. Hence L is an isometry of $E^n$. Since $|g(s) - g(t)| = |L(s) - L(t)|$, g is also an isometry of $E^n$. ∎
Let $L^t$ denote the linear transformation which is defined by the formula

$$y \cdot L(t) = L^t(y) \cdot t \tag{4-8}$$

for every $y, t \in E^n$. The $\cdot$ here denotes inner product rather than scalar product. If we did not distinguish between vectors and covectors, then $L^t$ would be the same as L*.
The ith column vector $L^t(e_i)$ has the same components as the row covector $w^i = L^*(e^i)$. Thus the matrix of $L^t$ is the transposed matrix obtained by exchanging rows and columns of the matrix of L.
Applying (4-8) with y = L(s), we get

$$L(s) \cdot L(t) = (L^t \circ L)(s) \cdot t.$$

From this equation, L is orthogonal if and only if $s \cdot t = (L^t \circ L)(s) \cdot t$ for every $s, t \in E^n$. But this is equivalent to the statement that $s = (L^t \circ L)(s)$ for every s, in other words, that $I = L^t \circ L$. Hence L is orthogonal if and only if

$$L^t = L^{-1}.$$

If L is orthogonal, then

$$1 = \det I = \det L^t \, \det L.$$

But $\det L^t = \det L$ ([12], p. 146). Hence $1 = (\det L)^2$, and $\det L = \pm 1$.
If L is orthogonal and det L = 1, then L is called a rotation of $E^n$ about 0.

Example 3. Any translation is an isometry of $E^n$, and L = I.

Example 4. Let S be the orthogonal transformation which takes each $t = (t^1, \ldots, t^n)$ into $S(t) = (t^1, \ldots, t^{n-1}, -t^n)$. S is a reflection of $E^n$ about the hyperplane $t^n = 0$. Its matrix is

$$\begin{pmatrix} 1 & & & 0 \\ & \ddots & & \\ & & 1 & \\ 0 & & & -1 \end{pmatrix}.$$

Two such reflections take each t into itself; that is, $S \circ S = I$. Hence $S = S^{-1} = S^t$. If M is any orthogonal transformation with det M = -1, then $L = S \circ M$ is a rotation of $E^n$ about 0 and

$$M = S \circ L.$$

Thus any orthogonal transformation is either a rotation or the composite of S and a rotation.

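Example 5. Let n = 2, and L be a rotation of the plane $E^2$ about (0, 0). Since $|v_1| = 1$, $v_1 = (\cos \theta)e_1 + (\sin \theta)e_2$ for some $\theta \in [0, 2\pi)$. Since L is a rotation, $v_2 = (-\sin \theta)e_1 + (\cos \theta)e_2$. The matrix is

$$\begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}.$$

The angle of rotation is $\theta$.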
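The properties of a rotation matrix stated above can be tested numerically. The sketch below is an added illustration with hypothetical helper names (rotation, transpose, matmul, close); it checks that $L^t \circ L = I$, that det L = 1, and that composing two rotations adds their angles.

```python
import math


def rotation(theta):
    """Matrix of the rotation of the plane through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2 x 2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


I2 = [[1.0, 0.0], [0.0, 1.0]]
R = rotation(0.7)
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]

print(close(matmul(transpose(R), R), I2))             # L^t o L = I
print(abs(det - 1.0) < 1e-12)                         # det L = 1
print(close(matmul(rotation(0.4), rotation(0.3)), R)) # angles add
```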

PROBLEMS
1. Let r = 3, n = 2, and L be the linear transformation such that $L(\epsilon_1) = e_1 - 2e_2$, $L(\epsilon_2) = e_1$, $L(\epsilon_3) = 5e_1 + e_2$. Find the matrix of L, the rank, and the kernel.
2. Show that the kernel of a linear transformation is a vector subspace of its domain.
3. Let r = n, and let $L^i(t) = c^i t^i$ for every $t \in E^n$, where $c^1, \ldots, c^n$ are scalars.
(a) What is the matrix?
(b) Find $L^{-1}$ if it exists.
(c) If $c^1 = \cdots = c^n > 0$, then L is called homothetic about 0. Describe L geometrically. Show that if L and M are homothetic about 0, then $L^{-1}$ and $M \circ L$ are also homothetic about 0.
4. (a) Show directly from the definitions that the composite of two linear trans-
formations is also linear.
(b) Let $(c^i_j)$, $(d^l_i)$, and $(b^l_j)$ denote respectively the matrices of L, M, and $M \circ L$. Show that

$$b^l_j = \sum_{i=1}^{n} d^l_i c^i_j, \qquad l = 1, \ldots, p, \quad j = 1, \ldots, r. \tag{4-9}$$

5. Let n = r = p = 2.
(a) Describe geometrically the linear transformation L with matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
(b) Find $S \circ L$ and $L \circ S$, where S is the same as in Example 4. Show that both are rotations of $E^2$ about (0, 0).
6. (a) Show that the vectors $v_1 = (1/\sqrt{5})(e_1 + 2e_3)$, $v_2 = (1/\sqrt{10})(-2e_1 + \sqrt{5}e_2 + e_3)$, $v_3 = (1/\sqrt{10})(2e_1 + \sqrt{5}e_2 - e_3)$ form an orthonormal basis for $E^3$.
(b) Let L be the orthogonal transformation whose matrix has $v_1, v_2, v_3$ as column vectors. Find $L^{-1}$ and verify that $L^t \circ L = I$. Is L a rotation?
7. Let L and M be rotations of $E^n$ about 0. Show that $L^{-1}$ and $M \circ L$ are also rotations.
8. (a) Show that the composite of two affine transformations is also affine.
(b) Which affine transformations are univalent?

4-3 DIFFERENTIABLE TRANSFORMATIONS

Let g be a transformation from $A \subset E^r$ into $E^n$, and let $t_0$ be an interior point of A. We would like to find a local linear approximation for the difference $g(t) - g(t_0)$. If there is such an approximation, then g is said to be differentiable at $t_0$. More precisely:

Definition. A transformation g is differentiable at $t_0$ if there exists a linear transformation L (depending on $t_0$) such that

$$\lim_{k \to 0} \frac{1}{|k|}\,[g(t_0 + k) - g(t_0) - L(k)] = 0. \tag{4-10a}$$

If we set $t = t_0 + k$, then $L(t - t_0)$ is the desired local approximation to $g(t) - g(t_0)$. If n = 1, the definition agrees with the one in Section 2-2, p. 388. Moreover, for n > 1 the expression in (4-10a) tends to 0 if and only if each of its components tends to 0 as $k \to 0$ (see Proposition A-4b). Thus (4-10a) is equivalent to

$$\lim_{k \to 0} \frac{1}{|k|}\,[g^i(t_0 + k) - g^i(t_0) - L^i(k)] = 0, \qquad i = 1, \ldots, n. \tag{4-10b}$$

The partial derivatives of the components $g^i$ are denoted by $g^i_j$ or $\partial g^i/\partial t^j$, as in Chapter 2.

Proposition 12. A transformation g is differentiable at $t_0$ if and only if each of its components $g^1, \ldots, g^n$ is differentiable at $t_0$.
If g is differentiable at $t_0$, then the matrix of the linear transformation L is the matrix of partial derivatives $g^i_j(t_0)$.

Proof. Since (4-10b) states that each component $g^i$ is differentiable at $t_0$, the first assertion follows at once. If g is differentiable at $t_0$, then $L^i(k) = dg^i(t_0) \cdot k$ for every $k \in E^r$. Hence the row covectors are $dg^1(t_0), \ldots, dg^n(t_0)$, and the elements of the matrix are the partial derivatives $g^i_j(t_0)$. ∎

Definitions. 1. The column vectors of the matrix are called the partial derivatives of the transformation g at $t_0$, and are denoted by $g_j(t_0)$ or $(\partial g/\partial t^j)(t_0)$. Thus

$$g_j(t_0) = \frac{\partial g}{\partial t^j}(t_0) = \sum_{i=1}^{n} g^i_j(t_0)\, e_i, \tag{4-11a}$$

for each $j = 1, \ldots, r$. The jth partial derivative $g_j(t_0)$ can be regarded as the derivative of g with respect to the jth variable while all of the other variables are held fixed, in the sense described in Section 3-1.
The formula, dual to (4-11a), for the row covectors is

$$dg^i(t_0) = \sum_{j=1}^{r} g^i_j(t_0)\, e^j. \tag{4-11b}$$

2. If A is an open set and g is differentiable at each point of A, then g is called a differentiable transformation.
3. The linear transformation L in (4-10a) is called the differential of g at $t_0$ and is denoted by $Dg(t_0)$. The differential of a differentiable transformation g is the function Dg whose value at each $t \in A$ is Dg(t).
4. In case r = n the determinant of the linear transformation Dg(t) is called the Jacobian of g at t. It is denoted by Jg(t). Thus

$$Jg(t) = \det Dg(t) = \det (g^i_j(t)). \tag{4-12}$$

Another common notation for the Jacobian is

$$\frac{\partial(g^1, \ldots, g^n)}{\partial(t^1, \ldots, t^n)}(t).$$

We shall see that the Jacobian often plays the same role in the calculus of functions of several variables as the derivative does in the case r = n = 1. In particular, this is so in the theorems about inverses (Section 4-5) and transforming multiple integrals (Section 5-8). If the Jacobian is 0 at a point $t_0$, then $Dg(t_0)$ is singular. This suggests some kind of irregularity in the behavior of g near $t_0$. In order to exclude such irregularities we shall repeatedly have to make the assumption that the Jacobian is not 0.

Example 1. If n = 1, then Dg(t) has the single row covector dg(t). We may identify Dg with dg and Dg(t) with dg(t). If $dg(t) \neq 0$, the rank of Dg(t) is 1; otherwise it is 0.

Example 2. If r = 1, then there is a single column vector. It is the derivative g'(t). If $g'(t) \neq 0$, the rank is 1; otherwise it is 0.

Example 3. Let n = r = 2, $A = E^2$, and

$$g(s, t) = (s^2 - t^2)e_1 + 2st\,e_2.$$

The partial derivatives of g are the column vectors

$$g_1(s, t) = 2s\,e_1 + 2t\,e_2, \qquad g_2(s, t) = -2t\,e_1 + 2s\,e_2.$$

The matrix of Dg(s, t) is

$$\begin{pmatrix} 2s & -2t \\ 2t & 2s \end{pmatrix}.$$

The Jacobian is

$$Jg(s, t) = \det \begin{pmatrix} 2s & -2t \\ 2t & 2s \end{pmatrix} = 4(s^2 + t^2).$$

It is 0 only at (0, 0). If $(s, t) \neq (0, 0)$, the rank is 2. At (0, 0) the rank is 0.
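The Jacobian of Example 3 can be compared with a finite-difference approximation. The following sketch is an added illustration; jacobian_det is a hypothetical helper that approximates each partial derivative by a central difference with an assumed step h.

```python
def g(s, t):
    return (s * s - t * t, 2 * s * t)   # the transformation of Example 3


def jacobian_det(fn, p, h=1e-6):
    """Determinant of the central-difference matrix of partial derivatives at p."""
    a, b = p
    d1 = [(u - v) / (2 * h) for u, v in zip(fn(a + h, b), fn(a - h, b))]  # first column
    d2 = [(u - v) / (2 * h) for u, v in zip(fn(a, b + h), fn(a, b - h))]  # second column
    return d1[0] * d2[1] - d2[0] * d1[1]


s, t = 1.5, -0.5
print(jacobian_det(g, (s, t)), 4 * (s * s + t * t))   # both should be close to 10
```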

Definition. If the components $g^1, \ldots, g^n$ are of class $C^{(q)}$, $q \geq 0$, then g is a transformation of class $C^{(q)}$. Similarly, if $g^1, \ldots, g^n$ are of class $C^{(q)}$ on $B \subset A$, then g is of class $C^{(q)}$ on B.

In Section 2-3 we called a real-valued g a function of class $C^{(0)}$ if g is continuous. A transformation g is continuous if and only if $g^1, \ldots, g^n$ are continuous (Section A-5). Hence the transformations of class $C^{(0)}$ are just the continuous ones.

Theorem 9. Every differentiable transformation is continuous. Every transformation of class $C^{(1)}$ is differentiable.

Proof. Apply Propositions 7, 12, and Theorem 2. ∎

For most theorems in the differential calculus of transformations one needs to assume that g is of class $C^{(1)}$ at least. An exception is the composite function theorem (Section 4-4), in which only differentiability need be assumed.
In the remainder of this section we shall establish several inequalities of a rather technical nature. These inequalities will be used in the proofs of theorems to follow.
We first need to introduce a norm which measures the "size" of a linear transformation. Let

$$\|L\| = \max \{|L(t)| : |t| \leq 1\}.$$

The set of all linear transformations with domain $E^r$ and values in $E^n$ forms a vector space of dimension nr (see Problem 2, Section A-2). The usual properties of a norm are satisfied (Problem 3).
Let us show that

$$|L(t)| \leq \|L\|\,|t| \tag{4-13}$$

for every $t \in E^r$. If t = 0, then both sides are 0. If $t \neq 0$, let $c = |t|^{-1}$. Since L is linear, L(ct) = cL(t). Since $|ct| = 1$, $|L(ct)| \leq \|L\|$. Thus $|t|^{-1}|L(t)| \leq \|L\|$, which is the same as (4-13).
Since L(s) - L(t) = L(s - t), we have upon replacing t by s - t in (4-13)

$$|L(s) - L(t)| \leq \|L\|\,|s - t|. \tag{4-14}$$
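For a concrete feel for this norm, the sketch below (an added illustration, not from the text) treats the matrix of Example 1, Section 4-2. It relies on a standard fact assumed here rather than proved in this section: for the euclidean norm, $\|L\|$ is the square root of the largest eigenvalue of $L^t \circ L$. The helper names op_norm_2x2, apply, and length are hypothetical.

```python
import math
import random


def op_norm_2x2(m):
    """||L|| for a 2 x 2 matrix: square root of the largest eigenvalue of L^t L."""
    (a, b), (c, d) = m
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d   # L^t L = [[p, q], [q, r]]
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(largest)


def apply(m, t):
    return [m[0][0] * t[0] + m[0][1] * t[1], m[1][0] * t[0] + m[1][1] * t[1]]


def length(v):
    return math.hypot(v[0], v[1])


L = [[2.0, 1.0], [3.0, -1.0]]
norm_L = op_norm_2x2(L)

# Largest value of |L(t)| over sampled unit vectors; it should approach ||L||.
sampled = max(length(apply(L, [math.cos(k * math.pi / 1800),
                               math.sin(k * math.pi / 1800)]))
              for k in range(3600))

random.seed(0)
pairs = [([random.uniform(-5, 5), random.uniform(-5, 5)],
          [random.uniform(-5, 5), random.uniform(-5, 5)]) for _ in range(1000)]
# Inequality (4-14): |L(s) - L(t)| <= ||L|| |s - t| (small slack for rounding)
ok = all(length([x - y for x, y in zip(apply(L, s), apply(L, t))])
         <= norm_L * length([x - y for x, y in zip(s, t)]) + 1e-9
         for s, t in pairs)
print(norm_L, sampled, ok)
```

The sampled maximum never exceeds the computed norm, and inequality (4-14) holds for every random pair tried.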
Proposition 13a. Let g be differentiable at $t_0$. Then given $\epsilon > 0$ there exists a neighborhood $\Omega_0$ of $t_0$ such that $\Omega_0 \subset A$ and

$$|g(t) - g(t_0)| \leq (\|Dg(t_0)\| + \epsilon)\,|t - t_0| \tag{4-15}$$

for every $t \in \Omega_0$.

Proof. Let $L = Dg(t_0)$ and set $\tilde{g}(t) = g(t) - L(t)$. Since $DL(t_0) = L$, $D\tilde{g}(t_0) = 0$ (the zero linear transformation). By (4-10a), in which g is replaced by $\tilde{g}$, there is a neighborhood $\Omega_0$ of $t_0$ such that

$$|\tilde{g}(t) - \tilde{g}(t_0)| \leq \epsilon\,|t - t_0| \tag{*}$$

for every $t \in \Omega_0$. But

$$g(t) - g(t_0) = [L(t) - L(t_0)] + [\tilde{g}(t) - \tilde{g}(t_0)].$$

From (*), (4-14), and the triangle inequality we get (4-15). ∎

If g is of class $C^{(1)}$, there is a stronger version of Proposition 13a.

Proposition 13b. Let g be of class $C^{(1)}$ and $t_0 \in A$. Then given $\epsilon > 0$ there exists a neighborhood $\Omega$ of $t_0$ such that $\Omega \subset A$ and

$$|g(s) - g(t)| \leq (\|Dg(t_0)\| + \epsilon)\,|s - t| \tag{4-16}$$

for every $s, t \in \Omega$.

Proof. Let $\tilde{g}$ be as before. The row covectors $d\tilde{g}^i(t_0)$ are all 0. Since the partial derivatives of $\tilde{g}$ are continuous, given $\epsilon > 0$ there is a neighborhood $\Omega$ of $t_0$ such that $|d\tilde{g}^i(u)| \leq \epsilon/n$ for every $u \in \Omega$ and $i = 1, \ldots, n$. By Corollary 1, p. 42, for every $s, t \in \Omega$,

$$|\tilde{g}^i(s) - \tilde{g}^i(t)| \leq \frac{\epsilon}{n}\,|s - t|, \qquad i = 1, \ldots, n,$$

$$|\tilde{g}(s) - \tilde{g}(t)| \leq \sum_{i=1}^{n} |\tilde{g}^i(s) - \tilde{g}^i(t)| \leq \epsilon\,|s - t|. \tag{**}$$

From (**) and (4-14) we obtain (4-16) in the same way as before. ∎

If r = n and L is nonsingular, there is besides the upper estimate (4-13) a lower estimate for $|L(t)|$. Let x = L(t). Then $t = L^{-1}(x)$, and applying (4-13) to $L^{-1}$, we get $|t| \leq \|L^{-1}\|\,|x|$. Replacing t by s - t, we have

$$\frac{1}{\|L^{-1}\|}\,|s - t| \leq |L(s) - L(t)| \tag{4-17}$$

for every $s, t \in E^n$. From this inequality we get a lower estimate for $|g(s) - g(t)|$ as follows.

Proposition 14. Besides the hypotheses of Proposition 13b, assume that r = n and $Dg(t_0)$ is nonsingular. Let $c = 1/\|Dg(t_0)^{-1}\|$. Then

$$|g(s) - g(t)| \geq (c - \epsilon)\,|s - t| \tag{4-18}$$

for every $s, t \in \Omega$.

Proof. We have

$$g(s) - g(t) = [L(s) - L(t)] + [\tilde{g}(s) - \tilde{g}(t)].$$

From the triangle inequality, $|x + y| \geq |x| - |y|$. Therefore (4-18) follows from (**) and (4-17). ∎

PROBLEMS

1. For each of Problems 3, 4, and 5, Section 4-1, find:


(a) Where g is differentiable.
(b) The partial derivatives of g.
(c) The rank of Dg(s, t).
(d) The Jacobian Jg(s, t), where applicable.
2. (a) Let g be affine, $g(t) = L(t) + x_0$ for every $t \in E^r$. Show that Dg(t) = L for every $t \in E^r$.
(b) Let g be a differentiable transformation such that Dg is a constant function and A is a connected open set. Show that g is the restriction to A of an affine transformation.
3. Show that:
(a) $\|L\| > 0$ unless L has rank 0.
(b) $\|cL\| = |c|\,\|L\|$.
(c) $\|L + L'\| \leq \|L\| + \|L'\|$.
(d) $\|M \circ L\| \leq \|M\|\,\|L\|$.
4. Another norm for linear transformations, which we denote by ||| |||, is defined
as follows:
LW [Wola = [wet
where $w^1, \ldots, w^n$ are the row covectors. Show that properties (a)-(d) of Problem 3 hold for this norm. Show that $\|L\| \leq |||L|||$.
5. Let r = n and g be a differentiable transformation. Then g is called conformal if there exists a real-valued function $\mu$ such that $\mu(t) > 0$ and $\mu(t)\,Dg(t)$ is a rotation of $E^n$ for every $t \in A$.
(a) Using Proposition 11 show that g is conformal if and only if, for every $t \in A$, Jg(t) > 0 and the partial derivatives of g satisfy:

$$g_i(t) \cdot g_j(t) = 0 \quad \text{if } i \neq j, \tag{1}$$

and

$$|g_1(t)| = |g_2(t)| = \cdots = |g_n(t)| = 1/\mu(t). \tag{2}$$

(b) Show that if g is conformal, then $\mu(t) = [Jg(t)]^{-1/n}$.
(c) Let n = 2. Show that g is conformal if and only if Jg(t) > 0 and $g^1_1(t) = g^2_2(t)$, $g^1_2(t) = -g^2_1(t)$ for every $t \in A$. [Note: The partial differential equations $g^1_1 = g^2_2$, $g^1_2 = -g^2_1$ are the Cauchy-Riemann equations (Problem 4, Section 3-3).]
6. Show that g is conformal:
(a) g as in Example 3, $A = E^2 - \{(0, 0)\}$.
(b) $g^1(s, t) = \exp (s^2 - t^2) \cos 2st$, $g^2(s, t) = \exp (s^2 - t^2) \sin 2st$, $A = E^2 - \{(0, 0)\}$.
7. Let g be of class $C^{(1)}$. The maximum rank possible for Dg(t) is min (r, n). Show that $\{t : \operatorname{rank} Dg(t) = \min (r, n)\}$ is open.

4-4 COMPOSITION
We shall now derive a rule for differentiating the composite of two dif-
ferentiable transformations. As corollaries of the basic formula (4-19) we then
obtain a formula for Jacobians and the chain rule for partial derivatives.
Let g be a transformation from an open set $A \subset E^r$ into an open set $D \subset E^n$, and let f be a transformation from D into $E^p$.

Composite Function Theorem. Let g be differentiable at $t_0$ and f be differentiable at $x_0 = g(t_0)$. Then the composite $F = f \circ g$ is differentiable at $t_0$ and

$$DF(t_0) = Df(x_0) \circ Dg(t_0). \tag{4-19}$$

Proof. Let $L = Dg(t_0)$, $M = Df(x_0)$, and $\tilde{f} = f - M$. Using the fact that M is linear, we have

$$F(t_0 + k) - F(t_0) - (M \circ L)(k) = \tilde{f}[g(t_0 + k)] - \tilde{f}[g(t_0)] + M[g(t_0 + k) - g(t_0) - L(k)].$$

To prove the theorem let us show that

$$\lim_{k \to 0} \frac{1}{|k|}\,\{\tilde{f}[g(t_0 + k)] - \tilde{f}[g(t_0)]\} = 0, \tag{*}$$

and

$$\lim_{k \to 0} \frac{1}{|k|}\,M[g(t_0 + k) - g(t_0) - L(k)] = 0. \tag{**}$$

Let $C = \|L\| + 1$. By Proposition 13a with $\epsilon = 1$, there exists $\delta_0 > 0$ such that

$$|g(t_0 + k) - g(t_0)| \leq C|k|$$

whenever $|k| < \delta_0$. Since $D\tilde{f}(x_0) = 0$, by Proposition 13a given $\epsilon > 0$, there exists $\eta > 0$ such that

$$|\tilde{f}(x) - \tilde{f}(x_0)| \leq \frac{\epsilon}{C}\,|x - x_0|$$

whenever $|x - x_0| < \eta$. Let $\delta = \min \{C^{-1}\eta, \delta_0\}$. If $|k| < \delta$, then taking $x = g(t_0 + k)$ we get

$$|\tilde{f}[g(t_0 + k)] - \tilde{f}[g(t_0)]| \leq \frac{\epsilon}{C}\,C|k| = \epsilon|k|.$$

This proves (*).
For every y, $|M(y)| \leq \|M\|\,|y|$. Hence the norm of the expression in (**) is no more than $\|M\|\,|k|^{-1}\,|g(t_0 + k) - g(t_0) - L(k)|$, which tends to 0 as $k \to 0$ since g is differentiable at $t_0$. This proves (**). ∎

The matrix of $DF(t_0)$ is the product of the matrices of M and L. Let us consider the special case p = 1, f and F now being real-valued. If we abbreviate by writing

$$F_j = F_j(t_0), \qquad f_i = f_i(x_0), \qquad g^i_j = g^i_j(t_0)$$

and use (4-9), p. 100, with

$$b_j = F_j, \qquad d_i = f_i, \qquad c^i_j = g^i_j,$$

we obtain:

Corollary 1. (Chain Rule)

$$F_j = \sum_{i=1}^{n} f_i\, g^i_j, \qquad j = 1, \ldots, r. \tag{4-20}$$


Another suggestive form for this important formula is obtained by writing it with the other notation for partial derivatives:

$$\frac{\partial F}{\partial t^j} = \frac{\partial f}{\partial x^1}\frac{\partial g^1}{\partial t^j} + \cdots + \frac{\partial f}{\partial x^n}\frac{\partial g^n}{\partial t^j}.$$

The chain rule is just a particular case of (4-4b) which describes how the components of a covector change under the dual L* of a linear transformation. In the present instance $L = Dg(t_0)$, and $dF(t_0) = L^*[df(x_0)]$.
If r = 1, then we may identify $DF(t_0)$ with its column vector $F'(t_0)$, and $Dg(t_0)$ with $g'(t_0)$. From (4-19) we then get:

Corollary 2. Let r = 1. Then $F'(t_0) = Df(x_0)[g'(t_0)]$. ∎
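The chain rule (4-20) lends itself to a direct numerical check. In the sketch below, an added illustration, the functions f and g are hypothetical choices (not taken from the text) with n = r = 2 and p = 1; the partial derivatives given by (4-20) are compared with central differences of the composite.

```python
def f(x, y):
    return x * x * y            # hypothetical real-valued f on E^2


def f_partials(x, y):
    return (2 * x * y, x * x)   # (f_1, f_2)


def g(s, t):
    return (s + t, s * t)       # hypothetical g from E^2 into E^2


def g_partials(s, t):
    # row i holds (g^i_1, g^i_2)
    return ((1.0, 1.0), (t, s))


def chain_rule(s, t):
    """F_j = sum_i f_i[g(s, t)] g^i_j(s, t) for j = 1, 2, as in (4-20)."""
    fi = f_partials(*g(s, t))
    gij = g_partials(s, t)
    return tuple(sum(fi[i] * gij[i][j] for i in range(2)) for j in range(2))


def finite_difference(s, t, h=1e-6):
    """Central-difference partial derivatives of F = f o g."""
    F = lambda a, b: f(*g(a, b))
    return ((F(s + h, t) - F(s - h, t)) / (2 * h),
            (F(s, t + h) - F(s, t - h)) / (2 * h))


print(chain_rule(1.0, 2.0))
print(finite_difference(1.0, 2.0))
```

At (s, t) = (1, 2) the chain rule gives $F_1 = 30$ and $F_2 = 21$, which agrees with differentiating $F(s, t) = (s + t)^2 st$ directly.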

Corollary 2 will be used in the discussion of tangent vectors in Section 4-7.

Again let p = 1, and suppose that f and g are of class $C^{(q)}$ for some $q \geq 1$. In particular, f and g are differentiable. Formula (4-20) applies at every point $t \in A$ and the corresponding point $x = g(t) \in D$. Thus

$$F_j(t) = \sum_{i=1}^{n} f_i[g(t)]\, g^i_j(t),$$

for every $t \in A$ and $j = 1, \ldots, r$. Since all of the functions $f_i$, g, $g^i_j$ are continuous, each partial derivative $F_j$ is continuous. Hence F is of class $C^{(1)}$. If $q \geq 2$, then repeated application of the chain rule shows that F is of class $C^{(q)}$ and gives formulas for calculating its partial derivatives of orders 1, 2, ..., q.
In case p > 1 and f, g are of class $C^{(q)}$, the preceding discussion shows that the components $F^1, \ldots, F^p$ of F are of class $C^{(q)}$, since $F^l = f^l \circ g$ for each $l = 1, \ldots, p$. Therefore F is of class $C^{(q)}$. We have proved:

Corollary 3. If f and g are of class $C^{(q)}$, then F is of class $C^{(q)}$. ∎


Example 1. Let r = 1. The chain rule becomes

$$F'(t) = \sum_{i=1}^{n} f_i[g(t)]\,(g^i)'(t),$$

which can also be written

$$F'(t) = df[g(t)] \cdot g'(t).$$

If in addition n = 1, it becomes $F'(t) = f'[g(t)]\,g'(t)$, which is the composite function rule of elementary calculus.

Example 2. Let F(x) = f[x, g(x)], where f and g are of class $C^{(2)}$. In this case $g^1(x) = x$, $g^2(x) = g(x)$, and the formula in Example 1 becomes

$$F'(x) = f_1[x, g(x)] + f_2[x, g(x)]\,g'(x).$$

Another application of the chain rule together with the formula for the derivative of a product gives

$$F''(x) = f_{11} + 2f_{12}\,g'(x) + f_{22}\,[g'(x)]^2 + f_2\,g''(x).$$

In this formula the partial derivatives of f are evaluated at (x, g(x)).

Example 3. Let f be of class $C^{(2)}$ and let

$$F(r, \theta) = f[r \cos \theta, r \sin \theta].$$

Let us show that

$$f_{11} + f_{22} = F_{11} + \frac{1}{r^2}F_{22} + \frac{1}{r}F_1.$$

The expression on the left-hand side is called the Laplacian of f. The partial differential equation $f_{11} + f_{22} = 0$ is called Laplace's equation. Its solutions are called harmonic functions. The formula above expresses the Laplacian in polar coordinates.
In this example

$$g^1(r, \theta) = r \cos \theta, \qquad g^2(r, \theta) = r \sin \theta.$$

Using the chain rule, we get

$$F_1 = f_1 g^1_1 + f_2 g^2_1 = f_1 \cos \theta + f_2 \sin \theta,$$

$$F_2 = f_1 g^1_2 + f_2 g^2_2 = f_1(-r \sin \theta) + f_2(r \cos \theta).$$

Further application of the chain rule gives

$$F_{11} = \cos \theta\,(f_{11} \cos \theta + f_{12} \sin \theta) + \sin \theta\,(f_{21} \cos \theta + f_{22} \sin \theta),$$

$$F_{22} = -r \sin \theta\,[f_{11}(-r \sin \theta) + f_{12}(r \cos \theta)] + r \cos \theta\,[f_{21}(-r \sin \theta) + f_{22}(r \cos \theta)] - f_1 r \cos \theta - f_2 r \sin \theta.$$

Combining terms and using the fact that $f_{21} = f_{12}$, we get

$$F_{11} + \frac{1}{r^2}F_{22} = f_{11} + f_{22} - \frac{1}{r}(f_1 \cos \theta + f_2 \sin \theta) = f_{11} + f_{22} - \frac{1}{r}F_1.$$

This is what we wished to show.
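The polar form of the Laplacian can be tested on a particular function. In the sketch below (an added illustration), $f(x, y) = x^2 y + y^3$ is a hypothetical choice with $f_{11} + f_{22} = 8y$; the derivatives of F are approximated by central differences with an assumed step h.

```python
import math


def f(x, y):
    return x * x * y + y ** 3      # hypothetical f; f_11 + f_22 = 2y + 6y = 8y


def F(r, theta):
    return f(r * math.cos(theta), r * math.sin(theta))


def polar_laplacian(r, theta, h=1e-3):
    """F_11 + F_22 / r^2 + F_1 / r, with central differences for the derivatives."""
    F1 = (F(r + h, theta) - F(r - h, theta)) / (2 * h)
    F11 = (F(r + h, theta) - 2 * F(r, theta) + F(r - h, theta)) / (h * h)
    F22 = (F(r, theta + h) - 2 * F(r, theta) + F(r, theta - h)) / (h * h)
    return F11 + F22 / (r * r) + F1 / r


r, theta = 2.0, 0.7
print(polar_laplacian(r, theta), 8 * r * math.sin(theta))
```

The two printed values agree up to the discretization error of the difference quotients.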

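Corollary 4. Let n = r = p, and let f and g be differentiable. Then, for every $t \in A$,

$$\frac{\partial(F^1, \ldots, F^n)}{\partial(t^1, \ldots, t^n)} = \frac{\partial(f^1, \ldots, f^n)}{\partial(x^1, \ldots, x^n)}\;\frac{\partial(g^1, \ldots, g^n)}{\partial(t^1, \ldots, t^n)}, \tag{4-21}$$

the Jacobians being evaluated at t and at x = g(t).
Proof. By (4-19), $DF(t) = Df(x) \circ Dg(t)$. Hence

$$\det DF(t) = \det Df(x) \, \det Dg(t). \; ∎$$

Example 4. Let n = r = p = 2. Let $f(x, y) = f^1(x, y)E_1 + f^2(x, y)E_2$, $g(r, \theta) = (r \cos \theta)e_1 + (r \sin \theta)e_2$. As before, $E_1, E_2$ denote the standard basis vectors for the plane $E^2$ in which f takes its values. Then

$$\frac{\partial(g^1, g^2)}{\partial(r, \theta)} = \det \begin{pmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{pmatrix} = r.$$

Hence

$$\frac{\partial(F^1, F^2)}{\partial(r, \theta)} = r\,\frac{\partial(f^1, f^2)}{\partial(x, y)}.$$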
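Formula (4-21) and Example 4 can be checked numerically. In the following added sketch, g is the polar-coordinate transformation of Example 4, while f is a hypothetical choice with Jacobian $4(x^2 + y^2)$; jacobian_det is a hypothetical helper based on central differences.

```python
import math


def g(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))   # polar coordinates, as in Example 4


def f(x, y):
    return (x * x - y * y, 2 * x * y)                   # hypothetical f; Jf = 4(x^2 + y^2)


def F(r, theta):
    return f(*g(r, theta))                              # the composite F = f o g


def jacobian_det(fn, p, h=1e-6):
    """Determinant of the central-difference matrix of partial derivatives at p."""
    a, b = p
    d1 = [(u - v) / (2 * h) for u, v in zip(fn(a + h, b), fn(a - h, b))]  # first column
    d2 = [(u - v) / (2 * h) for u, v in zip(fn(a, b + h), fn(a, b - h))]  # second column
    return d1[0] * d2[1] - d2[0] * d1[1]


t = (2.0, 0.7)
JF = jacobian_det(F, t)
Jf = jacobian_det(f, g(*t))
Jg = jacobian_det(g, t)
print(JF, Jf * Jg, Jg)
```

Here Jg is close to r = 2, and JF is close to the product of Jf and Jg, as (4-21) requires.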

PROBLEMS
Assume that all functions which occur in these problems are of class $C^{(2)}$.
1. Let F(x, y) = f(x, xy). Find the mixed partial derivative $F_{12}$.
2. Let F(x, y) = f[x, y, g(x, y)]. Express the partial derivatives of F of orders 1 and 2 in terms of those of f and g.
3. Let n = r = p = 2. Find the Jacobian $\partial(F^1, F^2)/\partial(s, t)$ at the indicated point by means of Corollary 4.
(a) $f(x, y) = xyE_1 + x^2 yE_2$, $g(s, t) = (s + t)e_1 + (s^2 - t^2)e_2$, $(s_0, t_0) = (2, 1)$.
(b) $f(x, y) = \phi(x + y)E_1 + \phi(x - y)E_2$, $g(s, t) = (\exp t)e_1 + \exp (-s)e_2$, $(s_0, t_0) = (\log 2)e_1$.
4. (a) Show that the chain rule is still true if p > 1, namely,

A ees
(b) Use it to find (OF /ds) (so, to) and (OF /dt) (so, to) in Problem 3(a).
5. Let G be the standard representation of a curve $\gamma$ of class $C^{(2)}$.
(a) Show that $G'(s) \cdot G''(s) = 0$. [Hint: Use the fact that $|G'(s)|^2 = 1$.]
(b) Let g be any parametric representation of class $C^{(2)}$ of $\gamma$ and define S(t) as in Section 3-2. Show that $S''(t) = g'(t) \cdot g''(t)/S'(t)$.
(c) If $G''(s) \neq 0$, then $G''(s)$ is called the principal normal vector and $|G''(s)|$ the absolute curvature at G(s). Show that

$$|G''[S(t)]|^2 = \frac{|g'(t)|^2\,|g''(t)|^2 - (g'(t) \cdot g''(t))^2}{|g'(t)|^6}.$$

6. Let $f(x, y) = \phi(x - cy) + \psi(x + cy)$, where c is a scalar. Show that $f_{22} = c^2 f_{11}$.
7. Let n = 4 and $\rho = [(x^1)^2 + (x^2)^2 + (x^3)^2]^{1/2}$. Let $f(x) = [\phi(\rho - cx^4) + \psi(\rho + cx^4)]/\rho$. Show that $f_{44} = c^2(f_{11} + f_{22} + f_{33})$. [Note: The partial differential equation $f_{nn} = c^2(f_{11} + \cdots + f_{n-1,n-1})$ is called the wave equation in n variables. Problem 6 gives D'Alembert's solution for n = 2. Solutions of the type in Problem 7 are called spherical waves.]
8. Suppose that f satisfies the partial differential equation $f_2 = f_{11} + bf$, where b is a scalar. Let $F(x, y) = \exp (-by)\,f(x, y)$. Show that $F_2 = F_{11}$.
9. Let $F = f \circ L$, where L is a linear transformation with matrix $(c^i_j)$. Show that

$$F_{jl} = \sum_{i,k=1}^{n} f_{ik}\, c^i_j c^k_l, \qquad j, l = 1, \ldots, r.$$

10. Using Problem 9 show that if r = n and L is orthogonal, then $F_{11} + \cdots + F_{nn} = f_{11} + \cdots + f_{nn}$. In other words, the Laplacian is invariant under orthogonal transformations of $E^n$. [Hint: $L^t = L^{-1}$.]
11. Let n = r. A linear transformation L is a Lorentz transformation of $E^n$ if $L^{-1} = S \circ L^t \circ S$, where S is as in Example 4, Section 4-2.
(a) Show that if M and L are Lorentz, then $M \circ L$ and $L^{-1}$ are also Lorentz. [Hint: $S^2 = I$.]
(b) Show that L is Lorentz if and only if $S = L^t \circ S \circ L$.
(c) Show that L is Lorentz if and only if

$$\sum_{i=1}^{n-1} [L^i(t)]^2 - [L^n(t)]^2 = \sum_{i=1}^{n-1} (t^i)^2 - (t^n)^2$$

for every t. [Hint: The right-hand side is $S(t) \cdot t$. Use (b).]

12. Show that if L is Lorentz, then $F_{11} + \cdots + F_{n-1,n-1} - F_{nn} = f_{11} + \cdots + f_{n-1,n-1} - f_{nn}$. In other words, the wave operator is invariant under Lorentz transformations (c = 1).

4-5 THE INVERSE FUNCTION THEOREM

Let us assume that r = n. If g is a univalent transformation, then g has an inverse $g^{-1}$ (and conversely). It occasionally happens that $g^{-1}$ can be found explicitly by solving the system of equations $x^i = g^i(t)$, $i = 1, \ldots, n$, for the components $t^1, \ldots, t^n$ in terms of x. However, the more common situation is either that these equations cannot be explicitly solved, or that it is inconvenient to solve them explicitly. One would like a criterion which guarantees that the inverse $g^{-1}$ exists, and a formula for its differential, without explicitly finding $g^{-1}$ itself.
If n = 1 and the domain is an interval, we merely have to require that g is differentiable and g'(t) is never 0. Then g is a strictly monotone function (Section A-10) and its inverse is differentiable. The derivative of the inverse is given by

$$(g^{-1})'(x) = 1/g'(t), \quad \text{if } x = g(t). \tag{4-22}$$

This is proved in elementary calculus. For example, see [9].
In two or more dimensions the Jacobian Jg(t) takes the place of the derivative g'(t). However, the situation is by no means as simple as before. First of all, we shall have to assume that g is at least of class $C^{(1)}$, a stronger condition than differentiability. Second, and more important, is the fact that the Jacobian's being of one sign implies only that g locally has an inverse. Any point $t_0 \in A$ has a neighborhood $\Omega$ such that the restriction $g|\Omega$ of g to $\Omega$ has an inverse f. An example given below shows that g itself need not have an inverse.
It is plausible that a local inverse exists. For t near $t_0$, g(t) is approximated by $G(t) = L(t - t_0) + x_0$, where $L = Dg(t_0)$, $x_0 = g(t_0)$. Since $\det L = Jg(t_0) \neq 0$, the affine transformation G has an inverse. This suggests, but of course does not prove, the following.

Inverse Function Theorem. Let g be a transformation of class $C^{(q)}$, $q \geq 1$, from an open set $A \subset E^n$ into $E^n$. Assume that $Jg(t) \neq 0$ for every $t \in A$. Then given any $t_0 \in A$ there exists a neighborhood $\Omega$ of $t_0$ such that $\Omega \subset A$ and:
(1) The restriction $g|\Omega$ is univalent.
(2) The set $U = g(\Omega)$ is open.
(3) The inverse f of $g|\Omega$ is of class $C^{(q)}$. (See Fig. 4-4.)

[Figure 4-4. The inverse f has domain U; $x_0 = g(t_0)$.]

Proof. Step 1. Let $t_0 \in A$. Let c be as in Proposition 14 (p. 104), and let $\Omega$ be as in Propositions 13b and 14, with $\epsilon = c/2$. Then

$$|g(s) - g(t)| \geq \frac{c}{2}\,|s - t|$$

for every $s, t \in \Omega$. In particular, g(s) = g(t) implies that s = t. Hence the restriction of g to $\Omega$ is univalent.

Step 2. Let $U = g(\Omega)$. To show that U is open we shall show that any $x_1 \in U$ has a neighborhood $U_1$ such that $U_1 \subset U$. From Step 1, $x_1 = g(t_1)$ for exactly one $t_1 \in \Omega$. Let $\Omega_1$ be a neighborhood of $t_1$ whose closure cl $\Omega_1$ is contained in $\Omega$, and let $\Gamma_1$ denote the boundary of $\Omega_1$. Since the restriction of g to $\Omega$ is univalent and $t_1 \notin \Gamma_1$, $x_1 \notin g(\Gamma_1)$. (See Fig. 4-5.) Since g is of class $C^{(1)}$, and therefore is continuous, the image $g(\Gamma_1)$ of the compact set $\Gamma_1$ is compact. Let $\sigma_1$ be one-half the distance from $x_1$ to $g(\Gamma_1)$, and let $U_1$ be the neighborhood of $x_1$ of radius $\sigma_1$.

[Figure 4-5]

Let $x \in U_1$. Then for every $t \in \Gamma_1$,

$$x_1 - g(t) = (x_1 - x) + [x - g(t)].$$

Using the triangle inequality, we get

$$2\sigma_1 \leq |x_1 - g(t)| \leq |x_1 - x| + |x - g(t)|.$$

Since $|x_1 - x| < \sigma_1$, we must have

$$\sigma_1 < |x - g(t)|$$

for every $t \in \Gamma_1$. Let

$$\psi(t) = |x - g(t)|^2 = \sum_{i=1}^{n} [x^i - g^i(t)]^2.$$

The real-valued function $\psi$ is of class $C^{(1)}$ and has a minimum on the closed n-ball cl $\Omega_1 = \Omega_1 \cup \Gamma_1$. But

$$\psi(t_1) = |x - x_1|^2 < \sigma_1^2,$$

and

$$\psi(t) > \sigma_1^2 \quad \text{for every } t \in \Gamma_1.$$

Hence the minimum value is less than $\sigma_1^2$, and must be attained at some interior point $t_2 \in \Omega_1$. By Proposition 10, $d\psi(t_2) = 0$. Since

$$d\psi(t) = -2 \sum_{i=1}^{n} [x^i - g^i(t)]\, dg^i(t),$$

we have, upon setting $c^i = x^i - g^i(t_2)$,

$$0 = \sum_{i=1}^{n} c^i\, dg^i(t_2).$$

Since $Jg(t_2) \neq 0$, the row covectors $dg^1(t_2), \ldots, dg^n(t_2)$ are linearly independent. Therefore $c^i = 0$ for $i = 1, \ldots, n$, and $x = g(t_2)$.
We have shown that if $x \in U_1$, then $x \in g(\Omega_1)$. Thus $U_1 \subset g(\Omega_1) \subset U$.
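Step 3. The existence of the inverse $\mathbf{f}$ required in (3) of the theorem is immediate from (1). We must show that $\mathbf{f}$ is of class $C^{(1)}$. Let $\mathbf{x}_1 \in U$ and $\mathbf{t}_1 = \mathbf{f}(\mathbf{x}_1)$ as in Step 2. Let $L_1 = D\mathbf{g}(\mathbf{t}_1)$. (The subscript on $L$ does not denote a partial derivative.) Let us first show that $\mathbf{f}$ is differentiable at $\mathbf{x}_1$ and $D\mathbf{f}(\mathbf{x}_1) = L_1^{-1}$. Let $c_1 = 1/\|L_1^{-1}\|$. Given $\epsilon > 0$, there is a neighborhood $\Omega_2$ of $\mathbf{t}_1$, $\Omega_2 \subset \Omega$, such that

$$|\mathbf{g}(\mathbf{t}) - \mathbf{g}(\mathbf{t}_1) - L_1(\mathbf{t} - \mathbf{t}_1)| \le \frac{\epsilon c c_1}{2}\,|\mathbf{t} - \mathbf{t}_1| \tag{*}$$

for every $\mathbf{t} \in \Omega_2$, where $c$ is as in Step 1.

By (2) there is a neighborhood $U_2$ of $\mathbf{x}_1$ such that $U_2 \subset \mathbf{g}(\Omega_2)$. Let $\mathbf{x} \in U_2$. Then $\mathbf{x} = \mathbf{g}(\mathbf{t})$, where $\mathbf{t} \in \Omega_2$. Since $\mathbf{x}_1 = \mathbf{g}(\mathbf{t}_1)$, from Step 1 we have

$$\frac{c}{2}\,|\mathbf{t} - \mathbf{t}_1| \le |\mathbf{x} - \mathbf{x}_1|. \tag{**}$$

Moreover, since $\mathbf{t} = \mathbf{f}(\mathbf{x})$ and $\mathbf{t}_1 = \mathbf{f}(\mathbf{x}_1)$,

$$L_1[\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}_1) - L_1^{-1}(\mathbf{x} - \mathbf{x}_1)] = -[\mathbf{g}(\mathbf{t}) - \mathbf{g}(\mathbf{t}_1) - L_1(\mathbf{t} - \mathbf{t}_1)].$$

Since $c_1|\boldsymbol{\tau}| \le |L_1(\boldsymbol{\tau})|$ for every $\boldsymbol{\tau}$, we get

$$c_1\,|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}_1) - L_1^{-1}(\mathbf{x} - \mathbf{x}_1)| \le |\mathbf{g}(\mathbf{t}) - \mathbf{g}(\mathbf{t}_1) - L_1(\mathbf{t} - \mathbf{t}_1)|.$$

Then from (*) and (**),

$$|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}_1) - L_1^{-1}(\mathbf{x} - \mathbf{x}_1)| \le \epsilon\,|\mathbf{x} - \mathbf{x}_1|$$

for every $\mathbf{x} \in U_2$. This shows that $\mathbf{f}$ is differentiable at $\mathbf{x}_1$ and $D\mathbf{f}(\mathbf{x}_1) = L_1^{-1}$.
We have shown that $\mathbf{f}$ is a differentiable function, and that

$$D\mathbf{f}(\mathbf{x}) = \{D\mathbf{g}[\mathbf{f}(\mathbf{x})]\}^{-1} \tag{4-23}$$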


for every $\mathbf{x} \in U$. By Theorem 9, $\mathbf{f}$ is continuous. Since each $g^i_j$ is a continuous function, the composite $g^i_j \circ \mathbf{f}$ is continuous. If $(y^i_j)$ is a nonsingular matrix and $(z^i_j)$ its inverse, then for each $i = 1, \ldots, n$ the elements $z^i_1, \ldots, z^i_n$ of the $i$th row satisfy the system of linear equations

$$\sum_{l=1}^{n} z^i_l\, y^l_j = \delta^i_j, \qquad j = 1, \ldots, n.$$

By Cramer's rule ([12], p. 151), each $z^i_l$ is a rational function (quotient of two polynomials) in $y^1_1, \ldots, y^n_n$. Applying this with $y^i_j = g^i_j[\mathbf{f}(\mathbf{x})]$, $z^i_j = f^i_j(\mathbf{x})$, we find from (4-23) that the partial derivatives $f^i_j$ are continuous. Hence $\mathbf{f}$ is of class $C^{(1)}$. If $\mathbf{g}$ is of class $C^{(2)}$, then $g^i_j$ is of class $C^{(1)}$ and by Corollary 3, Section 4-4, $g^i_j \circ \mathbf{f}$ is of class $C^{(1)}$. Hence $f^i_j$ is of class $C^{(1)}$ and $\mathbf{f}$ of class $C^{(2)}$. Repeating this argument, we find that if $\mathbf{g}$ is of class $C^{(q)}$ then $\mathbf{f}$ is also of class $C^{(q)}$. This completes the proof of the inverse function theorem. ∎

Formula (4-23) is an extension of formula (4-22). By taking the determinants, we obtain another formula which is also an extension of (4-22):

$$J\mathbf{f}(\mathbf{x}) = \frac{1}{J\mathbf{g}(\mathbf{t})}, \qquad \mathbf{t} = \mathbf{f}(\mathbf{x}). \tag{4-24}$$
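As a numerical illustration of (4-24), the following sketch checks that the two Jacobians are reciprocal for one familiar transformation. The choice of polar coordinates, the sample point, and the helper `jacobian` (a central-difference approximation) are assumptions made here for illustration; they do not come from the text.

```python
import math

def g(r, theta):
    """Polar-to-cartesian map, taken here as a sample transformation on r > 0."""
    return (r * math.cos(theta), r * math.sin(theta))

def f(x, y):
    """Local inverse of g, valid for r > 0 and -pi < theta < pi."""
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(func, a, b, h=1e-6):
    """Determinant of the 2 x 2 matrix of partial derivatives (central differences)."""
    d1 = [(p - m) / (2 * h) for p, m in zip(func(a + h, b), func(a - h, b))]
    d2 = [(p - m) / (2 * h) for p, m in zip(func(a, b + h), func(a, b - h))]
    return d1[0] * d2[1] - d2[0] * d1[1]

r, theta = 2.0, 0.7
x, y = g(r, theta)
Jg = jacobian(g, r, theta)   # analytically equal to r
Jf = jacobian(f, x, y)       # analytically equal to 1 / r
print(Jg, Jf, Jg * Jf)       # the product should be close to 1
```

Here the analytic values are Jg = r and Jf = 1/r, so the product is 1 up to discretization error.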
The inverse function theorem has the following:

Corollary. Let $\mathbf{g}$ satisfy the hypotheses of the inverse function theorem. Then the image of any open subset of $A$ is open.

Proof. Let $B \subset A$ be open, and let $\mathbf{x}_0 \in \mathbf{g}(B)$. Then $\mathbf{x}_0 = \mathbf{g}(\mathbf{t}_0)$ for some $\mathbf{t}_0 \in B$. (There may be several possible choices for $\mathbf{t}_0$.) By the inverse function theorem, applied to $\mathbf{g}|B$, there exists a neighborhood $\Omega$ of $\mathbf{t}_0$ such that $\mathbf{g}(\Omega)$ is an open subset of $\mathbf{g}(B)$. Therefore some neighborhood of $\mathbf{x}_0$ is contained in $\mathbf{g}(B)$. This shows that $\mathbf{g}(B)$ is an open set. ∎

Example. Let $r = n = 2$, and

$$\mathbf{g}(s, t) = (\cosh s \cos t)\,\mathbf{e}_1 + (\sinh s \sin t)\,\mathbf{e}_2,$$

where cosh and sinh are hyperbolic functions. Then

$$g^1_1(s, t) = \sinh s \cos t, \qquad g^1_2(s, t) = -\cosh s \sin t,$$

$$g^2_1(s, t) = \cosh s \sin t, \qquad g^2_2(s, t) = \sinh s \cos t.$$

The Jacobian is $\sinh^2 s \cos^2 t + \cosh^2 s \sin^2 t$, which simplifies because $\cosh^2 s = 1 + \sinh^2 s$ and $\cos^2 t + \sin^2 t = 1$ to

$$J\mathbf{g}(s, t) = \sinh^2 s + \sin^2 t.$$

If we take for $A$ the right half-plane $s > 0$, then $\sinh s > 0$ and $J\mathbf{g}(s, t) > 0$. The hypotheses of the inverse function theorem are satisfied, hence local inverses exist. Since cos and sin are periodic, $\mathbf{g}(s, t + 2\pi) = \mathbf{g}(s, t)$. The transformation $\mathbf{g}$ is not univalent, and consequently has no inverse. By the corollary, $\mathbf{g}(A)$ is an open set which, as we shall soon see, is $E^2$ with a line segment removed.
Let $\tilde{A} = \{(s, t) : s > 0,\ 0 < t < 2\pi\}$, and let $\tilde{\mathbf{g}}$ be the restriction of $\mathbf{g}$ to $\tilde{A}$. Let us show that $\tilde{\mathbf{g}}$ has an inverse. It is not easy to solve the equations

$$x = g^1(s, t) = \cosh s \cos t, \qquad y = g^2(s, t) = \sinh s \sin t$$

explicitly for $s$ and $t$. However, let us consider what happens on vertical straight lines $s = c$. For each $c > 0$, $\mathbf{g}(c, t)$ represents on $[0, 2\pi]$ an ellipse with major semiaxis of length $\cosh c > 1$ and minor semiaxis of length $\sinh c$. Each of these ellipses has $\pm\mathbf{e}_1$ as foci, and $\mathbf{g}(c, 0) = \mathbf{g}(c, 2\pi) = (\cosh c)\,\mathbf{e}_1$. If $s_1 \ne s_2$, then the points $\tilde{\mathbf{g}}(s_1, t_1)$ and $\tilde{\mathbf{g}}(s_2, t_2)$ lie on different ellipses. Moreover, $\tilde{\mathbf{g}}(s, t_1) = \tilde{\mathbf{g}}(s, t_2)$ implies $t_1 = t_2$. Hence $\tilde{\mathbf{g}}(s_1, t_1) = \tilde{\mathbf{g}}(s_2, t_2)$ implies that $(s_1, t_1) = (s_2, t_2)$, and $\tilde{\mathbf{g}}$ is univalent. The image of $\tilde{A}$ is $E^2$ with the semi-infinite line on the $x$-axis from $-\mathbf{e}_1$ to $\infty$ deleted. The part of the boundary of $\tilde{A}$ on the $s$-axis is transformed onto the part of the line from $\mathbf{e}_1$ to $\infty$, and the vertical part of the boundary onto the part from $-\mathbf{e}_1$ to $\mathbf{e}_1$. Hence $\mathbf{g}(\text{cl}\,\tilde{A}) = E^2$. By periodicity each value which $\mathbf{g}$ takes on $A$ is also taken somewhere on $\tilde{A}$ or its lower boundary. Hence $\mathbf{g}(A)$ is $E^2$ with the line segment joining $-\mathbf{e}_1$ and $\mathbf{e}_1$ removed. (See Fig. 4-6.)
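The two facts used in this example, the closed form of the Jacobian and the periodicity in t, can be spot-checked numerically. The sketch below is only an illustration: the sample points and the finite-difference helper `jacobian_fd` are choices made here, not part of the text.

```python
import math

def g(s, t):
    """The transformation of the example: (cosh s cos t, sinh s sin t)."""
    return (math.cosh(s) * math.cos(t), math.sinh(s) * math.sin(t))

def jacobian_fd(s, t, h=1e-6):
    """Jacobian determinant of g approximated by central differences."""
    ds = [(p - m) / (2 * h) for p, m in zip(g(s + h, t), g(s - h, t))]
    dt = [(p - m) / (2 * h) for p, m in zip(g(s, t + h), g(s, t - h))]
    return ds[0] * dt[1] - dt[0] * ds[1]

def jacobian_formula(s, t):
    """Closed form derived in the example: sinh^2 s + sin^2 t."""
    return math.sinh(s) ** 2 + math.sin(t) ** 2

samples = [(0.5, 0.3), (1.0, 2.0), (2.0, 5.0)]
max_error = max(abs(jacobian_fd(s, t) - jacobian_formula(s, t)) for s, t in samples)

# Periodicity in t: g(s, t + 2*pi) = g(s, t), so g is not univalent on s > 0.
p = g(1.0, 0.4)
q = g(1.0, 0.4 + 2 * math.pi)
period_gap = max(abs(a - b) for a, b in zip(p, q))

print(max_error, period_gap)
```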

Regular transformations

Definition. ($r = n$). A transformation $\mathbf{g}$ is regular if:

(1) $\mathbf{g}$ is of class $C^{(1)}$,
(2) $\mathbf{g}$ is univalent, and
(3) $J\mathbf{g}(\mathbf{t}) \ne 0$ for every $\mathbf{t} \in A$.

A regular transformation $\mathbf{g}$ has an inverse $\mathbf{g}^{-1}$ which is also of class $C^{(1)}$.

Regular transformations are called by many authors diffeomorphisms of class $C^{(1)}$. [A transformation of class $C^{(0)}$ which has an inverse of class $C^{(0)}$ is called a homeomorphism (Section A-6).]
One might expect naively that at worst a transformation distorts shapes,
and that the image of a set has basically the same structure as the original.
For instance, the image of a smooth simple arc should be a smooth simple arc,
the interior of a set should transform onto the interior of the image, and so on.
From various examples we know by now that this need not be the case at all.
However, it is so for regular transformations. They are the ones which behave
properly throughout calculus.
The notion of regular transformation is the basis for the discussion in
Chapter 7 of coordinate changes on manifolds. The transformation law for
multiple integrals will be proved in Chapter 5 only for regular transformations.

PROBLEMS
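1. Determine whether $\mathbf{g}$ satisfies the hypotheses of the inverse function theorem. Find $\mathbf{g}(A)$. If $\mathbf{g}$ is univalent, find $\mathbf{g}^{-1}$ explicitly.
(a) $\mathbf{g}(\mathbf{t}) = \mathbf{t} + \mathbf{x}_0$ (a translation), $A = E^n$.
(b) $\mathbf{g}(s, t) = (s + t)\,\mathbf{e}_1 + (s - t)\,\mathbf{e}_2$, $A = E^2$.
(c) $\mathbf{g}(s, t) = (s^2 - s - 2)\,\mathbf{e}_1 + 3t\,\mathbf{e}_2$, $A = E^2$.
(d) $\mathbf{g}(s, t) = (s^2 - t)\,\mathbf{e}_1 + st\,\mathbf{e}_2$, $A = E^2 - \{(0, 0)\}$.
(e) $\mathbf{g}(s, t) = (\log st)\,\mathbf{e}_1 + [1/(s^2 + t^2)]\,\mathbf{e}_2$, $A = \{(s, t) : 0 < t < s\}$.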
Dalet giye— ¢-2i*, A = (0,0), Find 7.
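3. Let $r = n = 3$, and $\mathbf{g}(s, t, u) = (u \cos st)\,\mathbf{e}_1 + (u \sin st)\,\mathbf{e}_2 + (s + u)\,\mathbf{e}_3$. Then $\mathbf{g}(\mathbf{e}_1 + \mathbf{e}_3) = \mathbf{e}_1 + 2\mathbf{e}_3$. Let $\mathbf{f}$ be a local inverse of $\mathbf{g}$ such that $\mathbf{f}(\mathbf{e}_1 + 2\mathbf{e}_3) = \mathbf{e}_1 + \mathbf{e}_3$. Find $D\mathbf{f}(\mathbf{e}_1 + 2\mathbf{e}_3)$ using (4-23).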
4. In the example on p. 114, what are the images of horizontal straight lines? Show
that g is a conformal transformation (Problem 5, Section 4-3), and hence that the
images of vertical and horizontal straight lines intersect at right angles. Illustrate
with a sketch.
5. Let $\mathbf{g}(s, t) = (\exp s \cos t)\,\mathbf{e}_1 + (\exp s \sin t)\,\mathbf{e}_2$ and $A = E^2$.
(a) Show that $\mathbf{g}$ satisfies the conditions of the inverse function theorem, but is not univalent.
(b) Let $\tilde{A} = \{(s, t) : 0 < t < 2\pi\}$. Show that the restriction of $\mathbf{g}$ to $\tilde{A}$ is univalent, and find its inverse.
(c) Find $\mathbf{g}(E^2)$.
(d) Show that $\mathbf{g}$ is conformal.
6. Let $g^1(s, t) = s + f(t)$, $g^2(s, t) = t + f(s)$, where $f$ is of class $C^{(1)}$ and $|f'(s)| \le c < 1$ for every $s \in E^1$.
(a) Show that $\mathbf{g}(E^2) = E^2$. [Hint: Given $(x, y)$ define $\psi(s, t)$ as in Step 2 of the proof of the inverse function theorem. Prove that $\psi$ has a minimum at some point $(s^*, t^*)$ and that $\mathbf{g}(s^*, t^*) = (x, y)$.]
(b) Show that $\mathbf{g}$ is univalent.
7. Let $A$ be an open convex set and $\mathbf{g}$ a differentiable transformation such that $\sum_{i,j=1}^{n} g^i_j(\mathbf{t})\, h^i h^j > 0$ for every $\mathbf{t} \in A$ and $\mathbf{h} \ne \mathbf{0}$. Show that $\mathbf{g}$ is univalent. [Hint: Suppose that $\mathbf{g}(\mathbf{t}_1) = \mathbf{g}(\mathbf{t}_2)$. Let $\mathbf{h} = \mathbf{t}_2 - \mathbf{t}_1$, $f(\mathbf{t}) = [\mathbf{g}(\mathbf{t}) - \mathbf{g}(\mathbf{t}_1)] \cdot \mathbf{h}$, and apply the mean value theorem to $f$.] This result is due to H. Nikaido.

4-6 THE IMPLICIT FUNCTION THEOREM


There is a principle, often carelessly stated, that an equation $\Phi(\mathbf{x}) = 0$ "implicitly determines one of the variables $x^1, \ldots, x^n$ as a function of the remaining $n - 1$ variables." More generally, if $1 \le m < n$, then $m$ equations $\Phi^1(\mathbf{x}) = \cdots = \Phi^m(\mathbf{x}) = 0$ are supposed to determine implicitly $m$ of the variables in terms of the other $n - m$.
Simple examples show that this principle is invalid unless some additional assumptions are made. Let us suppose that $\Phi^1, \ldots, \Phi^m$ are of class $C^{(q)}$, $q \ge 1$. Let

$$\boldsymbol{\Phi} = (\Phi^1, \ldots, \Phi^m).$$

The implicit function theorem guarantees the local validity of the principle near any point $\mathbf{x}_0$ such that $\boldsymbol{\Phi}(\mathbf{x}_0) = \mathbf{0}$ and $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has maximum rank $m$.
Before stating and proving this theorem, let us indicate what it asserts when $n = 3$, $m = 1$. Let $\mathbf{x}_0 = (x_0, y_0, z_0)$ be a point such that $\Phi(\mathbf{x}_0) = 0$ and $d\Phi(\mathbf{x}_0) \ne \mathbf{0}$. Since the components of $d\Phi(\mathbf{x}_0)$ are the partial derivatives $\Phi_1(\mathbf{x}_0)$, $\Phi_2(\mathbf{x}_0)$, $\Phi_3(\mathbf{x}_0)$, at least one of the components is not 0. For instance, suppose that $\Phi_3(\mathbf{x}_0) \ne 0$. Then in some neighborhood $U$ of $\mathbf{x}_0$, $\Phi_3(\mathbf{x}) \ne 0$ and the equation $\Phi(\mathbf{x}) = 0$ "determines $z$ as a function of $x$ and $y$." More precisely, there is a function $\phi$ of class $C^{(q)}$ such that for $(x, y, z) \in U$, $\Phi(x, y, z) = 0$ if and only if $z = \phi(x, y)$. The domain $R$ of $\phi$ is an open subset of the $xy$-plane.

Example 1. Let $\Phi(x, y, z) = x^2 + y^2 - z^2 - 1$. The set $M = \{\mathbf{x} : \Phi(\mathbf{x}) = 0\}$ is a hyperboloid. Solving the equation $\Phi(\mathbf{x}) = 0$ for $z$, we get $z = \pm(x^2 + y^2 - 1)^{1/2}$. If $z_0 > 0$, then we take $\phi(x, y) = (x^2 + y^2 - 1)^{1/2}$, so that $z_0 = \phi(x_0, y_0)$. While the theorem guarantees only the existence of $\phi$ in some neighborhood of $(x_0, y_0)$, in this example $\phi$ is actually of class $C^{(\infty)}$ on the complement of the disk $x^2 + y^2 \le 1$. If $z_0 < 0$, then one should take $\phi(x, y) = -(x^2 + y^2 - 1)^{1/2}$. If $z_0 = 0$, then $\Phi_3(\mathbf{x}_0) = 0$. In this case the equation $\Phi = 0$ determines near $\mathbf{x}_0$ one of the variables $x$ or $y$ as a function of the other two.

Returning to the general case, let $r = n - m$. If $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has rank $m$, then some set of $m$ columns of its matrix is linearly independent. For the present let us assume that the last $m$ columns are linearly independent. This means that the square matrix obtained by deleting columns $1, 2, \ldots, r$ must have nonzero determinant. Thus

$$\frac{\partial(\Phi^1, \ldots, \Phi^m)}{\partial(x^{r+1}, \ldots, x^n)} \ne 0 \quad \text{at } \mathbf{x}_0. \tag{4-25}$$

Let us introduce the notation

$$\hat{\mathbf{x}} = (x^1, \ldots, x^r), \qquad \hat{\mathbf{x}}_0 = (x_0^1, \ldots, x_0^r)$$

for the vectors obtained by taking only the first $r$ components of $\mathbf{x}$ and $\mathbf{x}_0$.

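Implicit Function Theorem. Let $\Phi^1, \ldots, \Phi^m$ be of class $C^{(q)}$ on an open set $D$ containing $\mathbf{x}_0$, where $q \ge 1$ and $1 \le m < n$. Assume that $\boldsymbol{\Phi}(\mathbf{x}_0) = \mathbf{0}$ and that (4-25) holds. Then there exist a neighborhood $U$ of $\mathbf{x}_0$, an open set $R \subset E^r$ containing $\hat{\mathbf{x}}_0$, and functions $\phi^1, \ldots, \phi^m$ of class $C^{(q)}$ on $R$ such that:

$$\frac{\partial(\Phi^1, \ldots, \Phi^m)}{\partial(x^{r+1}, \ldots, x^n)} \ne 0 \quad \text{at every } \mathbf{x} \in U; \tag{1}$$

and

$$\{\mathbf{x} \in U : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}\} = \{\mathbf{x} \in U : \hat{\mathbf{x}} \in R,\ x^{r+l} = \phi^l(\hat{\mathbf{x}}) \text{ for } l = 1, \ldots, m\}. \tag{2}$$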



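Proof. Since $\boldsymbol{\Phi}$ is at least of class $C^{(1)}$, the Jacobian in (1) is a continuous function. By assumption (4-25) it is not zero at $\mathbf{x}_0$, and therefore is not zero for $\mathbf{x}$ in some neighborhood $U_0$ of $\mathbf{x}_0$.
Let us consider the transformation $\mathbf{f}$, with domain $U_0$ and values in $E^n$, which has components $X^1, \ldots, X^r, \Phi^1, \ldots, \Phi^m$. As in Section 1-3, $X^i$ is the $i$th standard cartesian coordinate function. For every $\mathbf{x} \in U_0$,

$$f^i(\mathbf{x}) = X^i(\mathbf{x}) = x^i, \qquad i = 1, \ldots, r,$$

$$f^{r+l}(\mathbf{x}) = \Phi^l(\mathbf{x}), \qquad l = 1, \ldots, m.$$

The transformation $\mathbf{f}$ is of class $C^{(q)}$. Its matrix of partial derivatives is

$$\begin{pmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
\Phi^1_1 & \cdots & \Phi^1_r & \Phi^1_{r+1} & \cdots & \Phi^1_n \\
\vdots & & \vdots & \vdots & & \vdots \\
\Phi^m_1 & \cdots & \Phi^m_r & \Phi^m_{r+1} & \cdots & \Phi^m_n
\end{pmatrix}.$$

By properties of determinants, the Jacobian $J\mathbf{f}(\mathbf{x})$ equals the determinant of the $m \times m$ block in the lower right-hand corner. Since the latter is just the Jacobian in (4-25), $J\mathbf{f}(\mathbf{x}) \ne 0$. By the inverse function theorem, there is a neighborhood $U$ of $\mathbf{x}_0$ such that $\mathbf{f}(U)$ is an open set and the restriction $\mathbf{f}|U$ has an inverse $\mathbf{g}$ of class $C^{(q)}$. Note here that the roles of the symbols $\mathbf{f}$ and $\mathbf{g}$ in Section 4-5 have been reversed.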
Figure 4-7

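Writing $(\hat{\mathbf{x}}, \mathbf{0})$ for $(x^1, \ldots, x^r, 0, \ldots, 0)$, let (see Fig. 4-7)

$$R = \{\hat{\mathbf{x}} : (\hat{\mathbf{x}}, \mathbf{0}) \in \mathbf{f}(U)\}.$$

Since $\mathbf{f}(U)$ is an open set, $R$ is open. For every $\hat{\mathbf{x}} \in R$, let

$$\phi^l(\hat{\mathbf{x}}) = g^{r+l}(\hat{\mathbf{x}}, \mathbf{0}), \qquad l = 1, \ldots, m.$$

For $\mathbf{x} \in U$, $\boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}$ if and only if $\hat{\mathbf{x}} \in R$ and $\mathbf{f}(\mathbf{x}) = (\hat{\mathbf{x}}, \mathbf{0})$. Since $\mathbf{f}|U$ and $\mathbf{g}$ are inverses, $\mathbf{f}(\mathbf{x}) = (\hat{\mathbf{x}}, \mathbf{0})$ if and only if $\mathbf{x} = \mathbf{g}(\hat{\mathbf{x}}, \mathbf{0})$. ∎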
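The construction in this proof can be imitated numerically in the smallest case n = 2, m = 1, r = 1. The sketch below is an illustration under assumptions made here: the function Φ(x, y) = x² + y² − 1, the base point (0, 1), and the use of Newton's method to evaluate the local inverse g are all choices for the example, not part of the proof.

```python
import math

def Phi(x, y):
    """Sample constraint (an assumption for this sketch): the unit circle."""
    return x * x + y * y - 1.0

def Phi_2(x, y):
    """Partial derivative of Phi with respect to y; nonzero near (0, 1)."""
    return 2.0 * y

def f(x, y):
    """The transformation f = (X^1, Phi) used in the proof."""
    return (x, Phi(x, y))

def g(u, v, y_start=1.0, steps=50):
    """Local inverse of f near (0, 1): solve Phi(u, y) = v for y by Newton's method."""
    y = y_start
    for _ in range(steps):
        y -= (Phi(u, y) - v) / Phi_2(u, y)
    return (u, y)

def phi(x):
    """phi(x) = g^2(x, 0), the implicitly defined function."""
    return g(x, 0.0)[1]

for x in (-0.5, 0.0, 0.3):
    # Compare with the explicit solution of x^2 + y^2 = 1 near y = 1.
    print(x, phi(x), math.sqrt(1.0 - x * x))
```

In this example the implicit function is known in closed form, so the last column gives a direct check on the values produced by the construction.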

The partial derivatives of $\phi^1, \ldots, \phi^m$ can be calculated in terms of those of $\Phi^1, \ldots, \Phi^m$ by means of the chain rule and Cramer's rule. We illustrate the technique in two special cases. Let us suppose that $q \ge 2$.
Let $n = 3$, $m = 1$ as at the beginning of the section. Then

$$\Phi[x, y, \phi(x, y)] = 0, \tag{*}$$

and $\Phi_3[x, y, \phi(x, y)] \ne 0$ for every $(x, y) \in R$. Applying the chain rule to (*), we get

$$\Phi_1 + \Phi_3\phi_1 = 0, \qquad \phi_1 = -\frac{\Phi_1}{\Phi_3},$$

$$\Phi_2 + \Phi_3\phi_2 = 0, \qquad \phi_2 = -\frac{\Phi_2}{\Phi_3}. \tag{**}$$

In the formulas (**) the partial derivatives of $\Phi$ are evaluated at $(x, y, \phi(x, y))$.
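As a check on the formulas (**), one can compare them with difference quotients in a case where the implicit function is known explicitly. The sketch below uses Φ(x, y, z) = x² + y² − z² − 1 from Example 1 with the branch z > 0; the sample point (1.5, 2.0) and the helper names are assumptions made here for illustration.

```python
import math

def phi(x, y):
    """Explicit solution of Phi = 0 for z > 0 (Example 1)."""
    return math.sqrt(x * x + y * y - 1.0)

def Phi_partials(x, y, z):
    """Partial derivatives (Phi_1, Phi_2, Phi_3) of Phi = x^2 + y^2 - z^2 - 1."""
    return (2.0 * x, 2.0 * y, -2.0 * z)

def phi_partials_implicit(x, y):
    """phi_1 = -Phi_1 / Phi_3 and phi_2 = -Phi_2 / Phi_3, evaluated at (x, y, phi(x, y))."""
    P1, P2, P3 = Phi_partials(x, y, phi(x, y))
    return (-P1 / P3, -P2 / P3)

def phi_partials_fd(x, y, h=1e-6):
    """Central-difference approximations to the same partial derivatives."""
    return ((phi(x + h, y) - phi(x - h, y)) / (2 * h),
            (phi(x, y + h) - phi(x, y - h)) / (2 * h))

x, y = 1.5, 2.0
implicit = phi_partials_implicit(x, y)
numeric = phi_partials_fd(x, y)
print(implicit, numeric)
```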
To calculate the second-order partial derivatives $\phi_{11}$, $\phi_{12}$, $\phi_{22}$, the chain rule is applied again. For instance, taking the partial derivative with respect to the second variable in the first of equations (**), we get

$$\Phi_{12} + \Phi_{13}\phi_2 + [\Phi_{32} + \Phi_{33}\phi_2]\phi_1 + \Phi_3\phi_{12} = 0.$$

Substituting the expressions for $\phi_1$, $\phi_2$ obtained above and solving for $\phi_{12}$, we get

$$\phi_{12} = -\frac{(\Phi_3)^2\Phi_{12} - \Phi_2\Phi_3\Phi_{13} - \Phi_1\Phi_3\Phi_{23} + \Phi_1\Phi_2\Phi_{33}}{(\Phi_3)^3}.$$
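For a quick sanity check of the expression for the mixed partial derivative, it can be evaluated for the function of Example 1, where the implicit function is known explicitly. The choice of that function as a test case is made here for illustration only.

```latex
% Test case: \Phi(x,y,z) = x^2 + y^2 - z^2 - 1, with z = \phi(x,y) = (x^2+y^2-1)^{1/2}.
\Phi_1 = 2x,\quad \Phi_2 = 2y,\quad \Phi_3 = -2z,\quad
\Phi_{12} = \Phi_{13} = \Phi_{23} = 0,\quad \Phi_{33} = -2.
% Substituting into the formula for \phi_{12}:
\phi_{12} = -\frac{\Phi_1\Phi_2\Phi_{33}}{(\Phi_3)^3}
          = -\frac{(2x)(2y)(-2)}{(-2z)^3}
          = -\frac{xy}{z^3}.
% Direct differentiation gives the same result:
\phi_1 = \frac{x}{\phi},\qquad
\phi_{12} = -\frac{x\,\phi_2}{\phi^2} = -\frac{xy}{\phi^3}.
```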

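Let $m = 2$, $n = 3$, $r = n - m = 1$. Writing $\boldsymbol{\Phi} = (\Phi, \Psi)$ rather than $(\Phi^1, \Phi^2)$, and $\phi$, $\psi$ rather than $\phi^1$, $\phi^2$, we have

$$\Phi[x, \phi(x), \psi(x)] = 0,$$

$$\Psi[x, \phi(x), \psi(x)] = 0,$$

and $\Phi_2\Psi_3 - \Phi_3\Psi_2 \ne 0$ for every $x \in R$. The partial derivatives in question are evaluated at $(x, \phi(x), \psi(x))$. By the chain rule

$$\Phi_1 + \Phi_2\phi' + \Phi_3\psi' = 0,$$

$$\Psi_1 + \Psi_2\phi' + \Psi_3\psi' = 0,$$

and by Cramer's rule

$$\phi' = \frac{\Phi_3\Psi_1 - \Phi_1\Psi_3}{\Phi_2\Psi_3 - \Phi_3\Psi_2}.$$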
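The Cramer's rule computation can be tried on a pair of equations whose solution is known. In the sketch below, the functions Φ = y² − x and Ψ = z − xy (with y = √x, z = x^{3/2} for x > 0) are assumptions chosen for illustration. The expression used for ψ′ is the companion formula obtained from the same pair of linear equations by Cramer's rule, namely ψ′ = (Φ₁Ψ₂ − Φ₂Ψ₁)/(Φ₂Ψ₃ − Φ₃Ψ₂).

```python
import math

def partials(x, y, z):
    """First partial derivatives of Phi = y^2 - x and Psi = z - x*y (sample functions)."""
    dPhi = (-1.0, 2.0 * y, 0.0)
    dPsi = (-y, -x, 1.0)
    return dPhi, dPsi

def derivatives(x):
    """Return (phi'(x), psi'(x)) from the chain rule and Cramer's rule."""
    y, z = math.sqrt(x), x * math.sqrt(x)      # the explicit solution curve
    (P1, P2, P3), (Q1, Q2, Q3) = partials(x, y, z)
    det = P2 * Q3 - P3 * Q2                    # must be nonzero
    dphi = (P3 * Q1 - P1 * Q3) / det
    dpsi = (P1 * Q2 - P2 * Q1) / det
    return dphi, dpsi

x = 4.0
dphi, dpsi = derivatives(x)
# Direct differentiation: (sqrt x)' = 1 / (2 sqrt x), (x^(3/2))' = 1.5 sqrt x.
print(dphi, 1.0 / (2.0 * math.sqrt(x)))
print(dpsi, 1.5 * math.sqrt(x))
```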

The second derivatives $\phi''$, $\psi''$ can be found by another application of the chain rule.
For convenience we assumed in (4-25) that the last $m$ columns of the $m \times n$ matrix of partial derivatives $(\Phi^l_j(\mathbf{x}_0))$ were linearly independent. More generally, one need merely suppose that some set of $m$ columns is linearly independent, in other words, that the linear transformation $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has maximum rank $m$. Let us suppose that columns $j_1, j_2, \ldots, j_m$ form a linearly independent set, where we may suppose that $j_1 < j_2 < \cdots < j_m$. Let $i_1, \ldots, i_r$ be those integers between 1 and $n$ not included among $j_1, \ldots, j_m$, with $i_1 < \cdots < i_r$. For brevity let us write $\lambda$ for the $r$-tuple of integers $(i_1, \ldots, i_r)$ and $\mathbf{x}^\lambda$ for the $r$-tuple $(x^{i_1}, \ldots, x^{i_r})$.
The implicit function theorem now states, roughly speaking, that locally the equation $\boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}$ determines $x^{j_1}, \ldots, x^{j_m}$ as functions of $\mathbf{x}^\lambda$. More precisely, $U$, $R$, and $\phi^1, \ldots, \phi^m$ exist as before such that

$$\frac{\partial(\Phi^1, \ldots, \Phi^m)}{\partial(x^{j_1}, \ldots, x^{j_m})} \ne 0 \quad \text{at every } \mathbf{x} \in U,$$

and

$$\{\mathbf{x} \in U : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}\} = \{\mathbf{x} : \mathbf{x}^\lambda \in R,\ x^{j_l} = \phi^l(\mathbf{x}^\lambda) \text{ for } l = 1, \ldots, m\}.$$

In the case we considered above, $j_1, \ldots, j_m$ are the integers $r + 1, \ldots, n$, $\lambda = (1, 2, \ldots, r)$, and $\mathbf{x}^\lambda = \hat{\mathbf{x}}$.

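Example 2. Suppose that $m = 2$, $n = 5$, and

$$\frac{\partial(\Phi^1, \Phi^2)}{\partial(x^1, x^4)} \ne 0 \quad \text{at } \mathbf{x}_0.$$

Then we can take $j_1 = 1$, $j_2 = 4$, $\lambda = (2, 3, 5)$.

Let $m = 1$. Then $D\Phi(\mathbf{x}_0) = d\Phi(\mathbf{x}_0)$, and $D\Phi(\mathbf{x}_0)$ has maximum rank 1 if and only if at least one partial derivative $\Phi_j(\mathbf{x}_0)$ is not zero. If $\Phi_j(\mathbf{x}_0) \ne 0$, then we can take $j_1 = j$ and $\mathbf{x}^\lambda = (x^1, \ldots, x^{j-1}, x^{j+1}, \ldots, x^n)$.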

Example 3. Let

$$\Phi(x, y, z) = x^2 + y^2 + z^2 - 2xz - 4.$$

Then

$$\Phi_1(x, y, z) = 2x - 2z, \qquad \Phi_2(x, y, z) = 2y, \qquad \Phi_3(x, y, z) = 2z - 2x.$$

If $d\Phi(x, y, z) = \mathbf{0}$, then $y = 0$, $x = z$, and $\Phi(x, y, z) = -4 \ne 0$. The implicit function theorem applies at any $(x_0, y_0, z_0)$ where $\Phi(x_0, y_0, z_0) = 0$. If $x_0 \ne z_0$, then $\Phi_3(x_0, y_0, z_0) \ne 0$. We may take $j_1 = 3$, $\lambda = (1, 2)$, and proceed as above. We may equally well take $j_1 = 1$, $\lambda = (2, 3)$. However, if $x_0 = z_0$, then we must take $j_1 = 2$, $\lambda = (1, 3)$.

PROBLEMS

In each problem assume that $\boldsymbol{\Phi}$ is of class $C^{(2)}$.


1. Let $\Phi[x, \phi(x)] = 0$ and $\Phi_2[x, \phi(x)] \ne 0$ for every $x \in R$. Find $\phi'$ and $\phi''$.
2. Let $\Phi[\phi(y, z), y, z] = 0$ and $\Phi_1[\phi(y, z), y, z] \ne 0$ for every $(y, z) \in R$. Find $\phi_{11}$.
3. Let $m = 2$, $n = 4$, $\Phi(\mathbf{x}) = (x^2)^2 + (x^4)^2 - 2x^1x^3$, $\Psi(\mathbf{x}) = (x^2)^3 + (x^4)^3 + (x^1)^3 - (x^3)^3$, and $\boldsymbol{\Phi} = (\Phi, \Psi)$. Let $\mathbf{x}_0 = (1, -1, 1, 1)$, $j_1 = 1$, $j_2 = 3$.
(a) Show that the hypotheses of the implicit function theorem are satisfied.
(b) Write $\phi^1 = \phi$, $\phi^2 = \psi$, where according to the theorem,

$$(x^2)^2 + (x^4)^2 - 2\phi(x^2, x^4)\,\psi(x^2, x^4) = 0,$$

$$(x^2)^3 + (x^4)^3 + [\phi(x^2, x^4)]^3 - [\psi(x^2, x^4)]^3 = 0$$

for every $(x^2, x^4) \in R$. Find the first-order partial derivatives of $\phi$ and $\psi$ at $\mathbf{x}_0^\lambda = (-1, 1)$.
4. Let $\Phi(x, y, z) = x^2 + 4y^2 - 2yz - z^2$, $\mathbf{x}_0 = 2\mathbf{e}_1 + \mathbf{e}_2 - 4\mathbf{e}_3$.
(a) Verify the hypotheses of the implicit function theorem.
(b) Find the largest neighborhood $U$ of $\mathbf{x}_0$ such that $\Phi_3(x, y, z) \ne 0$ for every $(x, y, z) \in U$.
(c) Find the largest neighborhood of $\mathbf{x}_0$ containing no critical point of $\Phi$.
5. Let $\Phi(x, y) = x^2 - y^2$, $\mathbf{x}_0 = (0, 0)$.
(a) Let $U$ be any neighborhood of $(0, 0)$, of radius $a$, and $R = (-a/\sqrt{2}, a/\sqrt{2})$. Find a function $\phi$ such that $\Phi[x, \phi(x)] = 0$ for every $x \in R$.
(b) Show that no $\phi$ exists for which Eq. (2) of the implicit function theorem holds.
6. (a) Let $m = 2$. Let $\boldsymbol{\Phi} = (\Phi, \Psi)$, where $\Psi(\mathbf{x}) = \theta(\mathbf{x})\Phi(\mathbf{x})$ for every $\mathbf{x} \in D$ and $\theta$ is a real-valued function. Show that $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has rank less than 2 at any $\mathbf{x}_0$ such that $\Phi(\mathbf{x}_0) = 0$.
(b) State and prove a corresponding result for $m > 2$.
7. Give an alternate proof of the implicit function theorem, in case $m = 1$, $n = 2$, by carrying out the following steps. Let $\Phi$ be of class $C^{(1)}$ and suppose that $\Phi(x_0, y_0) = 0$, $\Phi_2(x_0, y_0) \ne 0$. For definiteness assume that $\Phi_2(x_0, y_0) > 0$.
(a) Show that there exists $\epsilon > 0$ such that $\Phi(x_0, y) < 0$ if $y_0 - \epsilon \le y < y_0$ and $\Phi(x_0, y) > 0$ if $y_0 < y \le y_0 + \epsilon$.
(b) Show that there exists $\delta > 0$ such that $\Phi(x, y_0 - \epsilon) < 0$ and $\Phi(x, y_0 + \epsilon) > 0$ if $|x - x_0| < \delta$.
(c) Let $I = \{(x, y) : |x - x_0| < \delta,\ |y - y_0| < \epsilon\}$. The numbers $\epsilon$ and $\delta$ in (a) and (b) may be so chosen that $\Phi_2(x, y) > 0$ for every $(x, y) \in I$. Show that if $|x_1 - x_0| < \delta$ the equation $\Phi(x_1, y) = 0$ has exactly one solution $y_1$ with $(x_1, y_1) \in I$. Set $y_1 = \phi(x_1)$. This defines $\phi$ on the open interval $(x_0 - \delta, x_0 + \delta)$.
(d) Show that $\phi$ is differentiable and that

$$\phi'(x) = -\frac{\Phi_1[x, \phi(x)]}{\Phi_2[x, \phi(x)]}.$$

In this proof of the theorem, the rectangle $I$ replaces the circular neighborhood $U$, but this is unimportant. Can you extend this proof to the case $m = 1$, $n > 2$?

4-7 MANIFOLDS
The word manifold is used in mathematics to describe a topological space which locally is "like" euclidean $E^r$, for some $r$ called the dimension of the manifold. For instance, a circle is locally like $E^1$. Such geometric figures in $E^3$ as ellipsoids, cylinders, and tori are locally like $E^2$. A cone is not locally like $E^2$ near its vertex.
We shall approach the idea of manifold from a rather concrete viewpoint. For us, a manifold $M$ is a subset of some euclidean $E^n$ which can locally be described by an equation $\boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}$, where $D\boldsymbol{\Phi}(\mathbf{x})$ must have maximum rank. Another definition of manifold can be given abstractly in terms of coordinate systems. It has the advantage that one need not presuppose that $M$ is a subset of some euclidean space. This will be discussed in Chapter 7.

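Definition. Let $1 \le r < n$, $q \ge 1$. A nonempty set $M \subset E^n$ is a manifold of dimension $r$ and class $C^{(q)}$ if $M$ has the following property: For every $\mathbf{x}_0 \in M$ there exist a neighborhood $U$ of $\mathbf{x}_0$ and $\boldsymbol{\Phi} = (\Phi^1, \ldots, \Phi^{n-r})$ of class $C^{(q)}$ on $U$, such that $D\boldsymbol{\Phi}(\mathbf{x})$ has rank $n - r$ for every $\mathbf{x} \in U$ and

$$M \cap U = \{\mathbf{x} \in U : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}\}.$$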

Throughout the following discussion we shall take $q = 1$. For brevity we shall say $r$-manifold instead of "manifold of dimension $r$ and class $C^{(1)}$." If $r = n$, let us call any open subset of $E^n$ an $n$-manifold.
Let us indicate how an $r$-manifold $M$ is locally like $E^r$. First assume that (4-25) holds at $\mathbf{x}_0$, and define $\mathbf{f}$ as in the proof of the implicit function theorem. The neighborhood of $\mathbf{x}_0$ chosen on p. 118 need not coincide with the neighborhood $U$ in the definition of manifold. In this section let us denote the former neighborhood by $U_1$ rather than $U$, and let us denote the set $R$ on p. 118 by $R_1$. We may suppose that $U_1 \subset U$. Now $\mathbf{f}|U_1$ is a regular transformation (p. 115), and

$$\mathbf{f}(M \cap U_1) = \{(\hat{\mathbf{x}}, \mathbf{0}) : \hat{\mathbf{x}} \in R_1\}$$

is a relatively open subset of the $r$-dimensional subspace of $E^n$ spanned by $\mathbf{e}_1, \ldots, \mathbf{e}_r$. Therefore it is reasonable to say that $M \cap U_1$ is "like" $E^r$ (Fig. 4-8). In case (4-25) does not hold, one must replace the $r$-tuple of integers $1, 2, \ldots, r$ by some other $r$-tuple $\lambda$, as indicated on p. 120.

Figure 4-8

Example 1. Let $n = 2$, $r = 1$, and $H$ be the hyperbola $x^2 - y^2 = 1$. To show that $H$ is a 1-manifold, let us take $\Phi(x, y) = x^2 - y^2 - 1$. Then $D\Phi(x, y) = d\Phi(x, y) = 2x\,\mathbf{e}^1 - 2y\,\mathbf{e}^2$. If $(x, y) \ne (0, 0)$, then $d\Phi(x, y) \ne \mathbf{0}$ and the rank is 1. Given $(x_0, y_0) \in H$, let $U$ be any neighborhood of $(x_0, y_0)$ which does not contain $(0, 0)$. In this example the choice of $\Phi$ does not depend on $(x_0, y_0)$.

Example 2. Let $M$ be the union of $H$ and one of its asymptotes,

$$M = \{(x, y) : x^2 - y^2 = 1\} \cup \{(x, y) : y = x\}.$$

To show that $M$ is a 1-manifold we must show that given $(x_0, y_0) \in M$, there exist $U$ and $\Phi$ such that $d\Phi(x, y) \ne \mathbf{0}$ in $U$ and $M \cap U = \{(x, y) \in U : \Phi(x, y) = 0\}$. If $(x_0, y_0) \in H$, then we let $\Phi(x, y) = x^2 - y^2 - 1$ as before, and let $U$ be any neighborhood of $(x_0, y_0)$ which does not meet the asymptote $y = x$. However, if $(x_0, y_0)$ is on the asymptote, we take $\Phi(x, y) = y - x$ and $U$ any neighborhood of $(x_0, y_0)$ which does not meet $H$. In this example our choice of $\Phi$ depends on $(x_0, y_0)$.

Example 3. Let $M = \{(x, y) : x^2 = y^2\}$. This set consists of the two lines $y = \pm x$, and is not a 1-manifold. Roughly speaking, $M$ is not like $E^1$ near the crossing point $(0, 0)$. More precisely, if $M$ were a 1-manifold, then by the implicit function theorem the following would be true: Each $(x_0, y_0) \in M$ has a neighborhood $U_1$ such that either $M \cap U_1 = \{(x, \phi(x)) : x \in R_1\}$ or $M \cap U_1 = \{(\psi(y), y) : y \in R_2\}$, where $R_1$, $R_2$ are open. In the present example, $(0, 0)$ has no such neighborhood.

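Most examples of manifolds which we shall consider are obtained in the following way. Let $\boldsymbol{\Phi}$ be a transformation of class $C^{(1)}$, from an open set $D \subset E^n$ into $E^m$. Let

$$M = \{\mathbf{x} : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0} \text{ and } D\boldsymbol{\Phi}(\mathbf{x}) \text{ has rank } m\}. \tag{4-26}$$

If $M$ is not empty, then it is an $r$-manifold, where $m = n - r$. To show that $M$ is an $r$-manifold, in the definition of manifold let us choose this same $\boldsymbol{\Phi}$ for every $\mathbf{x}_0 \in M$. By Problem 7, Section 4-3, $\{\mathbf{x} : D\boldsymbol{\Phi}(\mathbf{x}) \text{ has rank } m\}$ is open. Hence any $\mathbf{x}_0 \in M$ has a neighborhood $U$ such that $D\boldsymbol{\Phi}(\mathbf{x})$ has rank $m$ for every $\mathbf{x} \in U$; and $M \cap U = \{\mathbf{x} \in U : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}\}$.
When (4-26) holds, we say that $M$ is the $r$-manifold determined by $\boldsymbol{\Phi}$.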

Example 4. The $(n - 1)$-sphere $\{\mathbf{x} : |\mathbf{x}| = 1\}$ is an $(n - 1)$-manifold. In fact, it is the $(n - 1)$-manifold determined by $\Phi$, where $\Phi(\mathbf{x}) = |\mathbf{x}|^2 - 1$. The only critical point of $\Phi$ is $\mathbf{0}$, which is not on the $(n - 1)$-sphere.

Example 5. Let $F$ be real-valued and of class $C^{(1)}$. Consider a level set

$$B_c = \{\mathbf{x} : F(\mathbf{x}) = c\}.$$

Let $\Phi(\mathbf{x}) = F(\mathbf{x}) - c$. Then $d\Phi(\mathbf{x}) = dF(\mathbf{x})$. If $B_c$ is not empty and contains no critical points of $F$, then $B_c$ is the $(n - 1)$-manifold determined by $\Phi$. If $B_c$ contains critical points, then the $(n - 1)$-manifold determined by $\Phi$ is

$$M_c = B_c - (\text{set of critical points of } F),$$

unless $M_c$ happens to be empty. We saw in Section 2-5 that $B_c$ need not resemble $E^{n-1}$ near a critical point contained in $B_c$.

Example 6. Let $F(x, y) = \exp(xy)$. The partial derivatives are $F_1(x, y) = y \exp(xy)$ and $F_2(x, y) = x \exp(xy)$. The only critical point is $(0, 0)$, and $F(0, 0) = 1$. If $c \le 0$, then $B_c$ is empty. If $c > 0$, $c \ne 1$, then $B_c$ is a 1-manifold. In fact, $B_c$ is the hyperbola $xy = \log c$. The level set $B_1$ is the union of the $x$- and $y$-axes, and is not a 1-manifold. $M_1 = B_1 - \{(0, 0)\}$ is a 1-manifold.

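Example 7. As in Example 5, if $\mathbf{F}$ is of class $C^{(1)}$ and has values in $E^m$, $m = n - r$, then

$$M_{\mathbf{c}} = \{\mathbf{x} : \mathbf{F}(\mathbf{x}) = \mathbf{c} \text{ and } D\mathbf{F}(\mathbf{x}) \text{ has rank } m\}$$

is either empty or an $r$-manifold.

Example 8. In particular, let $\mathbf{a}^1, \ldots, \mathbf{a}^m$ be linearly independent covectors and let $\mathbf{c} \in E^m$. The set

$$P = \{\mathbf{x} : \mathbf{a}^l \cdot \mathbf{x} = c^l \text{ for } l = 1, \ldots, m\}$$

is called an $r$-dimensional plane. Let $L$ be the linear transformation with row covectors $\mathbf{a}^1, \ldots, \mathbf{a}^m$. Since $L$ has rank $m$, $L(E^n) = E^m$. Hence $P = \{\mathbf{x} : L(\mathbf{x}) = \mathbf{c}\}$ is not empty. Moreover, $DL(\mathbf{x}) = L$ has rank $m$ for every $\mathbf{x}$. Thus, any $r$-dimensional plane $P$ is an $r$-manifold.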

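Tangent vectors to a manifold. Let $M$ be a manifold and $\mathbf{x}_0 \in M$.

Definition. A vector $\mathbf{h}$ is a tangent vector to $M$ at $\mathbf{x}_0$ if there exists a function $\boldsymbol{\psi}$ from an interval $(-\delta, \delta)$ into $M$ such that $\boldsymbol{\psi}(0) = \mathbf{x}_0$ and $\boldsymbol{\psi}'(0) = \mathbf{h}$.

The definition can be restated in a way which is more appealing geometrically. For brevity let us set $\mathbf{x}_t = \mathbf{x}_0 + t\mathbf{h}$. Then $\mathbf{h}$ is a tangent vector if, for some $\delta > 0$, there exists $\mathbf{y}_t \in M$ whenever $0 < |t| < \delta$, such that

$$\lim_{t \to 0} \frac{|\mathbf{y}_t - \mathbf{x}_t|}{|t|} = 0. \tag{*}$$

If we set $\boldsymbol{\psi}(t) = \mathbf{y}_t$ for $0 < |t| < \delta$, and $\boldsymbol{\psi}(0) = \mathbf{x}_0$, then (*) states that $\boldsymbol{\psi}'(0) = \mathbf{h}$. (See Fig. 4-9.)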
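The limit condition (*) can be observed numerically on a simple manifold. In the sketch below, M is taken to be the unit circle, with base point (1, 0), candidate tangent vector h = (0, 1), and points y_t = (cos t, sin t) of M; all of these are choices made here for illustration.

```python
import math

x0 = (1.0, 0.0)   # base point on the unit circle
h = (0.0, 1.0)    # candidate tangent vector at x0

def y(t):
    """A point of M (the unit circle) chosen for each t."""
    return (math.cos(t), math.sin(t))

def ratio(t):
    """|y_t - x_t| / |t|, where x_t = x0 + t h lies on the tangent line."""
    xt = (x0[0] + t * h[0], x0[1] + t * h[1])
    yt = y(t)
    return math.hypot(yt[0] - xt[0], yt[1] - xt[1]) / abs(t)

ratios = [ratio(t) for t in (0.1, 0.01, 0.001, -0.001)]
print(ratios)   # the values shrink roughly like |t| / 2
```

The ratio behaves like |t|/2 for small t, which is consistent with h being a tangent vector at (1, 0).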
Let $T(\mathbf{x}_0)$ denote the set of all tangent vectors at $\mathbf{x}_0$. It is called the tangent space to $M$ at $\mathbf{x}_0$. If $r$ is the dimension of $M$, then it is plausible that the tangent space is a vector space of dimension $r$. Let us show that this is true. Let $U$ and $\boldsymbol{\Phi}$ be the same as in the definition of manifold.

Theorem 10. The tangent space $T(\mathbf{x}_0)$ is the kernel of the linear transformation $D\boldsymbol{\Phi}(\mathbf{x}_0)$.

Since $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has rank $m$, the kernel $T(\mathbf{x}_0)$ is a vector subspace of $E^n$ with dimension $r = n - m$.

Figure 4-9 Figure 4-10

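Proof of Theorem 10. Let $L = D\boldsymbol{\Phi}(\mathbf{x}_0)$. We must show that $\mathbf{h}$ is a tangent vector if and only if $L(\mathbf{h}) = \mathbf{0}$.
Let $\mathbf{h} \in T(\mathbf{x}_0)$. Let $\boldsymbol{\psi}$ be as in the definition of tangent vector. Then $\boldsymbol{\Phi}[\boldsymbol{\psi}(t)] = \mathbf{0}$ for every $t \in (-\delta, \delta)$. Calculating the derivative of $\boldsymbol{\Phi} \circ \boldsymbol{\psi}$ by Corollary 2, Section 4-4, we have

$$\mathbf{0} = L[\boldsymbol{\psi}'(0)] = L(\mathbf{h}).$$

Conversely, let $L(\mathbf{h}) = \mathbf{0}$. For simplicity let us assume that (4-25) holds, that is, that the last $m$ columns of the matrix of $L$ are linearly independent. Let $\mathbf{f} = (X^1, \ldots, X^r, \Phi^1, \ldots, \Phi^m)$, and $U_1$ be a neighborhood of $\mathbf{x}_0$ such that the restriction of $\mathbf{f}$ to $U_1$ is regular, as in the proof of the implicit function theorem. There exists $\delta > 0$ such that $\hat{\mathbf{x}}_0 + t\hat{\mathbf{h}} \in R_1$ for every $t \in (-\delta, \delta)$. Let $\mathbf{g} = (\mathbf{f}|U_1)^{-1}$ and

$$\boldsymbol{\psi}(t) = \mathbf{g}(\hat{\mathbf{x}}_0 + t\hat{\mathbf{h}}, \mathbf{0}).$$

Then $\boldsymbol{\psi}(t) \in M$ and $\boldsymbol{\psi}$ is of class $C^{(1)}$ (Fig. 4-10). We must show that $\boldsymbol{\psi}'(0) = \mathbf{h}$.
Let $\Lambda = D\mathbf{f}(\mathbf{x}_0)$. By formula (4-23), $\Lambda^{-1}$ is the differential of $\mathbf{g}$ at the point $(\hat{\mathbf{x}}_0, \mathbf{0}) = \mathbf{f}(\mathbf{x}_0)$. Therefore

$$\boldsymbol{\psi}'(0) = \Lambda^{-1}(\hat{\mathbf{h}}, \mathbf{0}).$$

By definition of $\mathbf{f}$,

$$\Lambda^i(\mathbf{h}) = h^i, \qquad i = 1, \ldots, r,$$

$$\Lambda^{r+l}(\mathbf{h}) = d\Phi^l(\mathbf{x}_0) \cdot \mathbf{h}, \qquad l = 1, \ldots, m.$$

Since $\mathbf{h}$ is in the kernel of $D\boldsymbol{\Phi}(\mathbf{x}_0)$, $\Lambda^{r+l}(\mathbf{h}) = 0$. Thus $\Lambda(\mathbf{h}) = (\hat{\mathbf{h}}, \mathbf{0})$, and

$$\mathbf{h} = \Lambda^{-1}(\hat{\mathbf{h}}, \mathbf{0}) = \boldsymbol{\psi}'(0).$$ ∎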
Normal vectors to a manifold. A vector $\mathbf{n}$ is called normal to $M$ at $\mathbf{x}_0$ if $\mathbf{n} \cdot \mathbf{h} = 0$ for every $\mathbf{h} \in T(\mathbf{x}_0)$. The normal vectors form a vector space of dimension $m = n - r$, the orthogonal complement of the tangent space $T(\mathbf{x}_0)$. The gradient grad $\Phi^l(\mathbf{x}_0)$ is the vector with the same components as the covector $d\Phi^l(\mathbf{x}_0)$. By Theorem 10

$$\text{grad}\,\Phi^l(\mathbf{x}_0) \cdot \mathbf{h} = d\Phi^l(\mathbf{x}_0) \cdot \mathbf{h} = 0$$

for every $\mathbf{h} \in T(\mathbf{x}_0)$ and $l = 1, \ldots, m$. Hence grad $\Phi^1(\mathbf{x}_0), \ldots,$ grad $\Phi^m(\mathbf{x}_0)$ are normal vectors to $M$ at $\mathbf{x}_0$. Since $D\boldsymbol{\Phi}(\mathbf{x}_0)$ has rank $m$, these vectors are linearly independent, and they form a basis for the space of normal vectors.
In particular, if $M$ is an $(n - 1)$-manifold, then $m = 1$; grad $\Phi(\mathbf{x}_0)$ is a normal vector, and all others are scalar multiples of it.
Figure 4-11. Tangent r-plane

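Tangent r-planes. The $r$-plane tangent to $M$ at $\mathbf{x}_0$ is

$$\{\mathbf{x} : \mathbf{x} = \mathbf{x}_0 + \mathbf{h},\ \mathbf{h} \in T(\mathbf{x}_0)\}.$$

(See Fig. 4-11.) The terms tangent line, tangent plane, and tangent hyperplane are used when $r = 1$, 2, and $n - 1$, respectively. The tangent $r$-plane is

$$\{\mathbf{x} : d\Phi^l(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = 0 \text{ for } l = 1, \ldots, m\}.$$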


Example 9. Let

$$M = \{(x, y, z) : x^2 + y^2 + z^2 - 2xz - 4 = 0\}.$$

According to Example 3, Section 4-6, $M$ is the 2-manifold determined by the function $\Phi$ of that example. Then

$$d\Phi(x, y, z) = 2(x - z)\,\mathbf{e}^1 + 2y\,\mathbf{e}^2 + 2(z - x)\,\mathbf{e}^3.$$

Let us find the spaces of tangent and normal vectors to $M$ at $(2, \sqrt{3}, 1)$.

$$\text{grad}\,\Phi(2, \sqrt{3}, 1) = 2\mathbf{e}_1 + 2\sqrt{3}\,\mathbf{e}_2 - 2\mathbf{e}_3.$$

This is a normal vector, and any other is a scalar multiple of it. The tangent vectors $\mathbf{h}$ satisfy

$$0 = d\Phi(2, \sqrt{3}, 1) \cdot \mathbf{h} = 2h^1 + 2\sqrt{3}\,h^2 - 2h^3.$$

Two linearly independent solutions of this equation are $\mathbf{e}_1 + \mathbf{e}_3$ and $\sqrt{3}\,\mathbf{e}_1 - \mathbf{e}_2$. These vectors form a basis for $T(2, \sqrt{3}, 1)$. The equation of the tangent plane is $2(x - 2) + 2\sqrt{3}(y - \sqrt{3}) - 2(z - 1) = 0$, or $x + \sqrt{3}\,y - z = 4$.
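The numbers in Example 9 are easy to verify by direct evaluation. The following sketch does so; the helper names `Phi`, `grad_Phi`, and `dot` are introduced here for illustration.

```python
import math

def Phi(x, y, z):
    """The function of Example 3, Section 4-6."""
    return x * x + y * y + z * z - 2.0 * x * z - 4.0

def grad_Phi(x, y, z):
    """Gradient of Phi: (2(x - z), 2y, 2(z - x))."""
    return (2.0 * (x - z), 2.0 * y, 2.0 * (z - x))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

p = (2.0, math.sqrt(3.0), 1.0)
n = grad_Phi(*p)                                  # the normal vector at p
basis = [(1.0, 0.0, 1.0), (math.sqrt(3.0), -1.0, 0.0)]

on_manifold = Phi(*p)                             # should vanish: p lies on M
dots = [dot(n, h) for h in basis]                 # should vanish: h is tangent
plane_value = p[0] + math.sqrt(3.0) * p[1] - p[2] # should equal 4

print(n, on_manifold, dots, plane_value)
```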

Example 10. Let $M \subset E^3$ be a 1-manifold. Let us find the tangent space at $(x_0, y_0, z_0) \in M$. Let us write $\boldsymbol{\Phi} = (\Phi, \Psi)$. The tangent vectors satisfy

$$0 = d\Phi(x_0, y_0, z_0) \cdot \mathbf{h} = d\Psi(x_0, y_0, z_0) \cdot \mathbf{h}.$$

From Cramer's rule one solution of this pair of linear equations is the vector $\mathbf{d}$ with components (Problem 8)

$$d^1 = \Phi_2\Psi_3 - \Phi_3\Psi_2, \qquad d^2 = \Phi_3\Psi_1 - \Phi_1\Psi_3, \qquad d^3 = \Phi_1\Psi_2 - \Phi_2\Psi_1,$$

the partial derivatives being evaluated at $(x_0, y_0, z_0)$. The tangent space $T(x_0, y_0, z_0)$ consists of all scalar multiples of $\mathbf{d}$.
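The vector d of Example 10 can be computed for a concrete curve. In the sketch below the 1-manifold is taken to be the circle cut from the sphere x² + y² + z² = 3 by the plane z = 1, with base point (1, 1, 1); this choice of Φ, Ψ, and point is an assumption made here for illustration.

```python
def tangent_direction(dPhi, dPsi):
    """Components d^1, d^2, d^3 of Example 10 from the partial derivatives of Phi and Psi."""
    P1, P2, P3 = dPhi
    Q1, Q2, Q3 = dPsi
    return (P2 * Q3 - P3 * Q2, P3 * Q1 - P1 * Q3, P1 * Q2 - P2 * Q1)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y, z = 1.0, 1.0, 1.0
dPhi = (2.0 * x, 2.0 * y, 2.0 * z)   # Phi = x^2 + y^2 + z^2 - 3
dPsi = (0.0, 0.0, 1.0)               # Psi = z - 1

d = tangent_direction(dPhi, dPsi)
print(d)                             # direction of the tangent line to the circle
print(dot(dPhi, d), dot(dPsi, d))    # both products vanish
```

For this curve the result is a multiple of (1, −1, 0), which is the expected tangent direction to the circle x² + y² = 2, z = 1 at (1, 1, 1).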

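*Intersections of manifolds. Let $M$ be an $r$-manifold and $N$ an $s$-manifold, with $M \cap N$ nonempty. Let us assume that $r + s > n$. Let $T_1(\mathbf{x}_0)$ denote the tangent space to $M$ at a point $\mathbf{x}_0 \in M \cap N$, and $T_2(\mathbf{x}_0)$ the tangent space to $N$ at $\mathbf{x}_0$. There exist a neighborhood $U_1$ of $\mathbf{x}_0$ and $\boldsymbol{\Phi} = (\Phi^1, \ldots, \Phi^{n-r})$ such that $D\boldsymbol{\Phi}(\mathbf{x})$ has rank $n - r$ for every $\mathbf{x} \in U_1$ and

$$M \cap U_1 = \{\mathbf{x} \in U_1 : \boldsymbol{\Phi}(\mathbf{x}) = \mathbf{0}\}.$$

In the same way, there exists a neighborhood $U_2$ of $\mathbf{x}_0$ and $\boldsymbol{\Psi} = (\Psi^1, \ldots, \Psi^{n-s})$ such that $D\boldsymbol{\Psi}(\mathbf{x})$ has rank $n - s$ for every $\mathbf{x} \in U_2$ and

$$N \cap U_2 = \{\mathbf{x} \in U_2 : \boldsymbol{\Psi}(\mathbf{x}) = \mathbf{0}\}.$$

Let $U = U_1 \cap U_2$ and $\boldsymbol{\Theta} = (\Phi^1, \ldots, \Phi^{n-r}, \Psi^1, \ldots, \Psi^{n-s})$. Then

$$(M \cap N) \cap U = \{\mathbf{x} \in U : \boldsymbol{\Theta}(\mathbf{x}) = \mathbf{0}\}.$$

The kernel of $D\boldsymbol{\Theta}(\mathbf{x}_0)$ is $T_1(\mathbf{x}_0) \cap T_2(\mathbf{x}_0)$. Hence $D\boldsymbol{\Theta}(\mathbf{x})$ has the desired rank $n - (r + s - n) = (n - r) + (n - s)$. From the definition, $M \cap N$ is an $(r + s - n)$-manifold.
Example. Let $n = 3$, $r = s = 2$. If the tangent planes of $M$ and $N$ do not coincide at any point of $M \cap N$, then $M \cap N$ is a 1-manifold. The tangent line at $\mathbf{x}_0 \in M \cap N$ is the intersection of the tangent planes to $M$ and $N$ at $\mathbf{x}_0$.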

PROBLEMS
In Problems 1-6 and 11, one can show that the set in question is a manifold by
verifying (4-26) for suitable ®.
1. Let F(x, y) = exp (x? + 2y?-+ 2). Find the level sets and determine which are
1-manifolds.
2. (a) Show that if $c \ne 0$, then the hyperboloid $x^2 + y^2 - 4z^2 = c$ is a 2-manifold. Is the cone $x^2 + y^2 = 4z^2$ a 2-manifold?
(b) Find the tangent plane at $2\mathbf{e}_1 - \mathbf{e}_2 + \mathbf{e}_3$ to the hyperboloid $x^2 + y^2 - 4z^2 = 1$.
3. Let $M = \{(x, y, z) : xy = 0,\ x^2 + y^2 + z^2 = 1,\ z \ne \pm 1\}$. Show that $M$ is a 1-manifold. Sketch $M$.

4. Let $f$ be of class $C^{(1)}$ on an open set $A \subset E^2$. Let $M = \{(x, y, f(x, y)) : (x, y) \in A\}$.
(a) Show that $M$ is a 2-manifold.
(b) Show that $(f_1(x, y), f_2(x, y), -1)$ is a normal vector to $M$ at $(x, y, f(x, y))$.
(c) Show that the equation for a tangent plane agrees with the one in Section 2-2.
5. Let $A \subset E^1$ be open, and $f$, $g$ be real-valued functions of class $C^{(1)}$ on $A$.
(a) Show that $M = \{(x, f(x), g(x)) : x \in A\}$ is a 1-manifold.
(b) Show that $(1, f'(x), g'(x))$ is a tangent vector to $M$ at $(x, f(x), g(x))$.
6. Let $M = \{(x, y) : x^y = y^x,\ x > 0,\ y > 0,\ (x, y) \ne (e, e)\}$, where $e$ is the base for natural logarithms. Show that $M$ is a 1-manifold. Make a sketch.
7. Let $M = \{(x, y, z) : xy = xz = 0\}$. Is $M$ a 1-manifold?
8. In Example 10 show that $\mathbf{d}$ is a tangent vector, using Cramer's rule.
9. Let $M$ and $N$ be $r$-manifolds such that $(\text{cl}\,M) \cap N$ and $M \cap \text{cl}\,N$ are empty, $M \subset E^n$, $N \subset E^n$. Prove that $M \cup N$ is an $r$-manifold.
10. Let $M$ be an $r$-manifold and $A$ an open set such that $M \cap A$ is not empty. Prove that $M \cap A$ is an $r$-manifold.
11. Let $M = \{\mathbf{x} : \sum_{i,j=1}^{n} c_{ij}\,x^i x^j = 1\}$, where the matrix $(c_{ij})$ has rank $n$ and is symmetric.
(a) Show that $M$ is an $(n - 1)$-manifold.
(b) Show that the equation of the tangent hyperplane at $\mathbf{x}_0 \in M$ is

$$\sum_{i,j=1}^{n} c_{ij}\,x^i x_0^j = 1.$$

12. Product manifolds. Let $M \subset E^n$ be an $r$-manifold and $N \subset E^m$ an $s$-manifold. Regarding $E^{n+m}$ as the cartesian product $E^n \times E^m$, show that $M \times N$ is an $(r + s)$-manifold. Show that the tangent space at a point of $M \times N$ is the cartesian product of the tangent spaces at the corresponding points of $M$ and $N$.
13. Let points of $E^4$ be denoted by $(x, y, u, v)$. Let $C = \{\mathbf{x} : x^2 + y^2 = 1,\ u^2 + v^2 = 1\}$, $K = \{\mathbf{x} : x^2 + y^2 \le 1,\ u^2 + v^2 \le 1\}$, and $B = \text{fr}\,K$ (boundary of $K$).
(a) Show that $C$ is a 2-manifold.
(b) Show that $B - C$ is a 3-manifold. [Hint: Use Problems 9 and 12.]
14. Let $M$, $\boldsymbol{\Phi}$, and $U$ be as in the definition of "manifold." For each $l = 1, \ldots, m$ let $\Psi^l(\mathbf{x}) = g^l(\mathbf{x})\Phi^l(\mathbf{x})$, where $g^l$ is of class $C^{(1)}$ and $g^l(\mathbf{x}_0) \ne 0$. Show that there is a neighborhood $U_0$ of $\mathbf{x}_0$ such that $D\boldsymbol{\Psi}(\mathbf{x})$ has rank $m$ for every $\mathbf{x} \in U_0$ and $M \cap U_0 = \{\mathbf{x} \in U_0 : \boldsymbol{\Psi}(\mathbf{x}) = \mathbf{0}\}$. [Hint: Show that $d\Psi^1(\mathbf{x}_0), \ldots, d\Psi^m(\mathbf{x}_0)$ are linearly independent.]
15. Let us identify $E^{n^2}$ with the set of all linear transformations from $E^n$ into itself by associating with each linear transformation $L$ the vector

$$\sum_{i,j=1}^{n} c^i_j\,\mathbf{e}_{i+(j-1)n},$$

where $(c^i_j)$ is the matrix of $L$. Let $O(n)$ be the set of orthogonal transformations of $E^n$.
(a) Show that $O(n)$ is a manifold of dimension $\frac{1}{2}n(n - 1)$ and class $C^{(\infty)}$. [Hint: Proposition 11.]
(b) Let $I$ be the identity transformation of $E^n$. Show that $L$ is a tangent vector to $O(n)$ at $I$ if and only if $L^t = -L$. (Such a linear transformation is called skew symmetric.)
(c) Let $SO(n)$ be the set of all rotations of $E^n$ about $\mathbf{0}$. Show that $SO(n)$ is a relatively open subset of $O(n)$, and hence $SO(n)$ is also a manifold of dimension $\frac{1}{2}n(n - 1)$. Show that $SO(n)$ is the largest connected subset of $O(n)$ which contains $I$. [Hint: Using induction on $n$, show that any $L \in SO(n)$ can be joined with $I$ by a path in $SO(n)$.]

4-8 THE MULTIPLIER RULE

Let $M$ be a manifold and $f$ be a real-valued function of class $C^{(1)}$ on some
open set containing $M$. Let us consider the problem of finding the extrema of
the function $f|M$. This is called a problem of constrained extrema.
If $x_0$ is a point of $M$ at which $f$ has a constrained relative maximum, then
$x_0$ has a neighborhood $U_0$ such that

$$f(x) \le f(x_0) \quad \text{for every } x \in M \cap U_0.$$

(Recall the definition on p. 60.) If $f$ has a constrained relative minimum at
$x_0$, then the inequality sign is reversed.
Since $M$ is a manifold, there exists a neighborhood $U$ of $x_0$ and $\Phi$ of class
$C^{(1)}$ on $U$ such that
$$M \cap U = \{x \in U : \Phi(x) = 0\}$$

and $D\Phi(x)$ has maximum rank m for every $x \in U$. We may assume that
$U \subset U_0$.
Roughly speaking, the multiplier rule states that by introducing suitable
multipliers $\sigma_1,\dots,\sigma_m$ the constrained extremum problem can be treated as
one of ordinary (unconstrained) extremum. More precisely:

Lagrange Multiplier Rule. Let $f$ have a constrained relative extremum at $x_0$.
Then there exist real numbers $\sigma_1,\dots,\sigma_m$ such that $x_0$ is a critical point of
the function
$$F = f + \sigma_1\Phi^1 + \cdots + \sigma_m\Phi^m.$$

Proof. It suffices to consider the case of a constrained relative maximum.
Let $h$ be any tangent vector to $M$ at $x_0$. Let $\phi(t) = f[\psi(t)]$, where $\psi$ is the
same as in the definition of tangent vector. Since $\psi(t) \in M$ and $f|M$ has a
relative maximum at $x_0$, $\phi$ has a relative maximum at 0. Therefore $\phi'(0) = 0$.
By the chain rule,
$$\phi'(0) = df[\psi(0)] \cdot \psi'(0) = df(x_0) \cdot h.$$

Let $L = D\Phi(x_0)$. The equation $b = L^t(a)$ has a solution $a$ if $b \cdot h = 0$
for every $h$ in the kernel of $L$ ([12], p. 103). Let $b = df(x_0)$. By (4-8b)
and Theorem 10, the covector $df(x_0)$ is a linear combination of the row covectors
of $D\Phi(x_0)$:
$$df(x_0) = a_1\, d\Phi^1(x_0) + \cdots + a_m\, d\Phi^m(x_0).$$

Let $\sigma_l = -a_l$ for $l = 1,\dots,m$. Then $dF(x_0) = 0$. ∎

Example 1. Let $f(x, y, z) = x - y + 2z$. Let us find the maximum and minimum
values of $f$ on the ellipsoid

$$M = \{(x, y, z) : x^2 + y^2 + 2z^2 = 2\}.$$

Let $\Phi(x, y, z) = 2 - (x^2 + y^2 + 2z^2)$ and $F = f + \sigma\Phi$. The multiplier $\sigma$ is yet to
be determined. From the multiplier rule we get three equations

$$F_1 = 1 - 2\sigma x_0 = 0, \qquad F_2 = -1 - 2\sigma y_0 = 0, \qquad F_3 = 2 - 4\sigma z_0 = 0.$$

From these and the fourth equation $\Phi = 0$ we get

$$x_0 = \frac{1}{2\sigma}, \qquad y_0 = -\frac{1}{2\sigma}, \qquad z_0 = \frac{1}{2\sigma}, \qquad \sigma^2 = \frac12.$$

Therefore $x_0 = \pm(\sqrt2/2)(e_1 - e_2 + e_3)$, depending on which of the two possible values
for $\sigma$ is used. Since $f$ is continuous and $M$ is a compact set, $f$ has a maximum and a
minimum value on $M$. One of the two critical points obtained by the multiplier rule
must give the maximum and the other the minimum. Since $f[(\sqrt2/2)(e_1 - e_2 + e_3)] = 2\sqrt2$
and $f[-(\sqrt2/2)(e_1 - e_2 + e_3)] = -2\sqrt2$, these numbers are the maximum
and minimum values, respectively.
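
The computation in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names (`phi`, `critical_points`, `random_point_on_ellipsoid`) are illustrative assumptions, and the random sampling is only a plausibility check that no point of the ellipsoid gives a larger value than $2\sqrt2$.

```python
import math
import random


def f(x, y, z):
    """Objective function of Example 1."""
    return x - y + 2.0 * z


def phi(x, y, z):
    """Constraint function; the ellipsoid M is the set where phi = 0."""
    return 2.0 - (x * x + y * y + 2.0 * z * z)


def critical_points():
    """Solve F_1 = F_2 = F_3 = 0 together with phi = 0.

    The three equations give x = 1/(2s), y = -1/(2s), z = 1/(2s) for the
    multiplier s, and phi = 0 then forces s**2 = 1/2.
    """
    result = []
    for s in (math.sqrt(0.5), -math.sqrt(0.5)):
        point = (1.0 / (2.0 * s), -1.0 / (2.0 * s), 1.0 / (2.0 * s))
        result.append((s, point))
    return result


def random_point_on_ellipsoid(rng):
    """Rescale a random direction so that it lands on M (illustrative helper)."""
    while True:
        u = [rng.gauss(0.0, 1.0) for _ in range(3)]
        q = u[0] ** 2 + u[1] ** 2 + 2.0 * u[2] ** 2
        if q > 0.0:
            t = math.sqrt(2.0 / q)
            return (t * u[0], t * u[1], t * u[2])


if __name__ == "__main__":
    for s, p in critical_points():
        print("sigma =", s, "point =", p, "f =", f(*p), "phi =", phi(*p))
    rng = random.Random(0)
    largest = max(f(*random_point_on_ellipsoid(rng)) for _ in range(10000))
    print("largest sampled value:", largest, "bound:", 2.0 * math.sqrt(2.0))
```

Sampling cannot replace the compactness argument of the example; it only illustrates that the two critical values bracket the values of $f$ on $M$.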

Example 2. Let $f(x) = \sum_{i=1}^{n} b_i (x^i)^2$, where $b_i \ne 0$ for each $i = 1,\dots,n$. Let $M$
be the hyperplane $\{x : a \cdot x = 1\}$, and $F(x) = f(x) + \sigma(1 - a \cdot x)$. If $x_0$ is a critical
point of $F$, then
$$0 = dF(x_0) = df(x_0) - \sigma a.$$

Thus $df(x_0) = \sigma a$, or

$$2 b_i x_0^i = \sigma a_i, \qquad i = 1,\dots,n.$$
From this and the equation $a \cdot x_0 = 1$,
$$x_0^i = \frac{\sigma a_i}{2 b_i}, \qquad \frac{1}{\sigma} = \frac12 \sum_{i=1}^{n} \frac{a_i^2}{b_i},$$

provided the sum is not zero. To determine whether $x_0$ gives an extremum, we use
the formula
$$f(x_0 + h) = f(x_0) + df(x_0) \cdot h + f(h),$$

which is valid for homogeneous quadratic polynomials. Points of the hyperplane $M$
are of the form $x_0 + h$, where $a \cdot h = 0$. Since $df(x_0) = \sigma a$, the above formula
simplifies to
$$f(x_0 + h) = f(x_0) + f(h).$$
If $f(h) \ge 0$ for every $h$ satisfying $a \cdot h = 0$, then $f$ has an absolute constrained
minimum at $x_0$.
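
The formulas of Example 2 can be tried on a concrete case. The sketch below is an illustration under assumptions: the data $a = (1, 2, 2)$, $b = (1, 2, 4)$ and the function names are not from the text. It uses exact rational arithmetic so that the identity $f(x_0 + h) = f(x_0) + f(h)$ for $a \cdot h = 0$ can be checked without rounding.

```python
from fractions import Fraction


def quadratic(b, x):
    """f(x) = sum of b_i * (x^i)^2."""
    return sum(bi * xi * xi for bi, xi in zip(b, x))


def dot(a, x):
    """Inner product a . x."""
    return sum(ai * xi for ai, xi in zip(a, x))


def constrained_critical_point(a, b):
    """Return (sigma, x0) with 2 b_i x0^i = sigma a_i and a . x0 = 1."""
    inv_sigma = sum(Fraction(ai) ** 2 / Fraction(bi) for ai, bi in zip(a, b)) / 2
    if inv_sigma == 0:
        raise ValueError("the sum of a_i^2 / b_i is zero; no critical point")
    sigma = 1 / inv_sigma
    x0 = [sigma * Fraction(ai) / (2 * Fraction(bi)) for ai, bi in zip(a, b)]
    return sigma, x0


if __name__ == "__main__":
    a = [1, 2, 2]   # assumed sample data
    b = [1, 2, 4]   # all b_i > 0, so f(h) >= 0 and x0 gives a minimum
    sigma, x0 = constrained_critical_point(a, b)
    print("sigma =", sigma, "x0 =", x0, "f(x0) =", quadratic(b, x0))
    h = [Fraction(2), Fraction(-1), Fraction(0)]  # a . h = 0
    shifted = [p + q for p, q in zip(x0, h)]
    print(quadratic(b, shifted) == quadratic(b, x0) + quadratic(b, h))
```

Choosing all $b_i > 0$ makes $f(h) \ge 0$ automatic; with mixed signs the same code still finds the critical point, but the sign of $f(h)$ on the hyperplane $a \cdot h = 0$ has to be examined separately.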

The characteristic values of a symmetric matrix. Let $(c^i_j)$ be an $n \times n$
matrix and $L$ the corresponding linear transformation. A number $\lambda$ is a characteristic
value of $L$ if the linear transformation $L - \lambda I$ is singular. There are
n characteristic values $\lambda_1,\dots,\lambda_n$, counting multiplicities. If $L(x) = \lambda_k x$
and $x \ne 0$, then $x$ is a characteristic vector corresponding to the characteristic
value $\lambda_k$. The numbers $\lambda_1,\dots,\lambda_n$ may be complex, and the characteristic
vectors may have complex components ([12], p. 164).
Let us suppose, however, that $(c^i_j)$ is a symmetric matrix, $c^i_j = c^j_i$ for
$i, j = 1,\dots,n$. Let us show that the characteristic values are real. In fact,
$\lambda_k$ can be characterized as the value of a certain constrained maximum.
Consider the homogeneous quadratic polynomial

$$f(x) = L(x) \cdot x = \sum_{i,j=1}^{n} c^i_j\, x^i x^j,$$

and let $M_1$ be the unit $(n-1)$-sphere in $E^n$, $M_1 = \{x : |x| = 1\}$. Let

$$\lambda_1 = \max \{f(x) : x \in M_1\}, \tag{4-27}$$

and let $v_1$ be a point of $M_1$ at which the maximum is attained. By the multiplier
rule, with
$$F(x) = f(x) + \sigma(1 - x \cdot x),$$
there is a multiplier $\sigma$ such that $v_1$ is a critical point of $F$. Then $\operatorname{grad} f(x) = 2L(x)$,
and hence
$$0 = \operatorname{grad} F(v_1) = 2L(v_1) - 2\sigma v_1.$$

Hence $L(v_1) = \sigma v_1$, which shows that $\sigma$ is a characteristic value and $v_1$ a
characteristic vector. Since $f(x) = L(x) \cdot x$,

$$\lambda_1 = f(v_1) = L(v_1) \cdot v_1 = \sigma\, v_1 \cdot v_1,$$

and since $v_1 \cdot v_1 = 1$, $\sigma = \lambda_1$.


We next let
$$M_2 = \{x : |x| = 1,\ x \cdot v_1 = 0\},$$

$$\lambda_2 = \max \{f(x) : x \in M_2\},$$

and $v_2 \in M_2$ be a point such that $f(v_2) = \lambda_2$. We have added another constraint.
Hence $M_2 \subset M_1$ and $\lambda_2 \le \lambda_1$. Obviously $v_2 \cdot v_1 = 0$. For $k = 2,\dots,n$ let
$$M_k = \{x : |x| = 1,\ x \cdot v_i = 0 \text{ for } i = 1,\dots,k-1\},$$

$$\lambda_k = \max \{f(x) : x \in M_k\}, \tag{4-28}$$

and $v_k \in M_k$ be such that $\lambda_k = f(v_k)$. Then

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$$

and $\{v_1,\dots,v_n\}$ is an orthonormal basis for $E^n$. Let us show by induction
on k that $\lambda_k$ is a characteristic value and $v_k$ a characteristic vector. This is
true if $k = 1$. Let $k \ge 2$. Applying the multiplier rule with

$$F(x) = f(x) + \sigma_1(1 - x \cdot x) + \sum_{i=1}^{k-1} \sigma_{i+1}\, x \cdot v_i,$$

we get
$$0 = 2L(v_k) - 2\sigma_1 v_k + \sum_{i=1}^{k-1} \sigma_{i+1} v_i.$$
Since $v_i \cdot v_j = 0$ for $i \ne j$, we get upon taking the inner product with $v_k$,

$$0 = 2L(v_k) \cdot v_k - 2\sigma_1, \qquad \sigma_1 = L(v_k) \cdot v_k = f(v_k),$$

or $\sigma_1 = \lambda_k$. Since $(c^i_j)$ is symmetric, $L^t = L$ (p. 98) and

$$L(v_k) \cdot v_j = v_k \cdot L(v_j).$$

Using the induction hypothesis, we get

$$L(v_k) \cdot v_j = \lambda_j\, v_k \cdot v_j = 0, \qquad j = 1,\dots,k-1.$$

Taking the inner product of the critical point equation with $v_j$ then gives
$\sigma_{j+1} = 0$. Hence $\sigma_2 = \cdots = \sigma_k = 0$, and $L(v_k) = \lambda_k v_k$.

If $k = n$, the multiplier rule does not apply. However, we used the multiplier
rule only to show that $L(v_k)$ is a linear combination of $v_1,\dots,v_k$. If
$k = n$, this is clear from the fact that $\{v_1,\dots,v_n\}$ is a basis for $E^n$.
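
For $n = 2$ the successive maximum problems can be carried out by brute force, since the unit sphere is a circle. The sketch below is an illustration under assumptions: the sample matrix, the sample count, and the function names are not from the text, and a scan over angles stands in for the exact maximization. It finds $\lambda_1$ as the largest value of $f$ on the circle and $\lambda_2$ as the value of $f$ at the unit vector orthogonal to $v_1$.

```python
import math


def quadratic_form(c, x):
    """f(x) = L(x) . x for a 2 x 2 matrix c."""
    return sum(c[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def characteristic_values_by_maximization(c, samples=20000):
    """Approximate lambda_1 >= lambda_2 for a symmetric 2 x 2 matrix c.

    lambda_1 is the maximum of f over unit vectors (cos t, sin t). Because
    f(x) = f(-x), scanning half of the circle is enough. M_2 consists of the
    two unit vectors orthogonal to v_1, so lambda_2 is the value of f there.
    """
    best_t = 0.0
    best_val = -math.inf
    for s in range(samples):
        t = math.pi * s / samples
        val = quadratic_form(c, (math.cos(t), math.sin(t)))
        if val > best_val:
            best_t, best_val = t, val
    v1 = (math.cos(best_t), math.sin(best_t))
    v2 = (-math.sin(best_t), math.cos(best_t))
    return best_val, v1, quadratic_form(c, v2), v2


if __name__ == "__main__":
    c = [[2.0, 1.0], [1.0, 3.0]]  # assumed sample symmetric matrix
    lam1, v1, lam2, v2 = characteristic_values_by_maximization(c)
    print("lambda_1 ~", lam1, "v_1 ~", v1)
    print("lambda_2 ~", lam2, "v_2 ~", v2)
    # Closed form for comparison: (5 +/- sqrt(5)) / 2.
    print((5.0 + math.sqrt(5.0)) / 2.0, (5.0 - math.sqrt(5.0)) / 2.0)
```

In higher dimensions a scan is not practical, and one would use a numerical eigenvalue routine instead; the point here is only that the constrained maxima reproduce the characteristic values.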

Theorem 11. The characteristic values of the symmetric matrix $(c^i_j)$ are
$\lambda_1,\dots,\lambda_n$. For each $i = 1,\dots,n$, $v_i$ is a characteristic vector corresponding
to $\lambda_i$. If $\xi^1,\dots,\xi^n$ denote the components of $x$ with respect to the orthonormal
basis $\{v_1,\dots,v_n\}$, then

$$f(x) = \lambda_1(\xi^1)^2 + \cdots + \lambda_n(\xi^n)^2 \tag{4-29}$$

for every $x \in E^n$.

Proof. The first two statements have already been proved. To prove the
third, we have

$$x = \sum_{i=1}^{n} \xi^i v_i, \qquad L(x) = \sum_{i=1}^{n} \xi^i L(v_i) = \sum_{i=1}^{n} \lambda_i \xi^i v_i,$$

$$f(x) = L(x) \cdot x = \sum_{i=1}^{n} \lambda_i \xi^i\, v_i \cdot x,$$

which is just (4-29), since $v_i \cdot x = \xi^i$. ∎

Corollary. $f$ is positive definite if and only if $\lambda_i > 0$ for each $i = 1,\dots,n$.

Proof. If each $\lambda_i$ is positive, then by (4-29), $f(x) > 0$ whenever $x \ne 0$.
Conversely, if $f(x) > 0$ for every $x \ne 0$, then $\lambda_i = f(v_i) > 0$ for each i. ∎

These results have a geometric interpretation. Let

$$B = \{x : f(x) = 1\}.$$

If $f$ is positive definite, then $B$ is called an $(n-1)$-dimensional ellipsoid. Setting
$m_i = 1/\sqrt{\lambda_i}$, we have (Fig. 4-12)

$$B = \Big\{x : \sum_{i=1}^{n} \Big(\frac{\xi^i}{m_i}\Big)^2 = 1\Big\}.$$

If $n = 2$ and $\lambda_1 > 0$, $\lambda_2 < 0$, then $B$ is a
hyperbola. If $n = 3$ and $\lambda_1 > 0$, $\lambda_3 < 0$,
then $B$ is a hyperboloid. It has one sheet
if $\lambda_2 > 0$ and two sheets if $\lambda_2 < 0$.

PROBLEMS
1. Set $M = \{(x, y, z) : x + y + z = 1\}$ and $f(x, y, z) = 3x^2 + 3y^2 + z^2$. Show
that there is a constrained absolute minimum, and find the minimum value of
$f$ on $M$.
2. Use the multiplier rule to find the distance to the parabola $y^2 = x$ from the point
$c\,e_1$, $c \ne 0$. [Hint: Let $f(x, y) = (x - c)^2 + y^2$, which is the square of the
distance.]
3. Find the distance from the point $e_1 - 2e_2 - e_3$ to the line $\{(x, y, z) : x = y = z\}$.
4. (a) Use the multiplier rule to show that $|a| = \max \{a \cdot x : |x| = 1\}$.
(b) Deduce the same result from Cauchy's inequality.
5. Let $M$ be a manifold, $x_1 \notin M$, and suppose that $x_0$ is a point of $M$ nearest $x_1$.
Using the multiplier rule, show that $x_1 - x_0$ is a normal vector to $M$ at $x_0$.
6. Show that the distance from a point $x_1$ to the hyperplane $\{x : a \cdot x = b\}$ is
$|a \cdot x_1 - b| / |a|$.
7. Let $f(x) = x^1 x^2 \cdots x^n$ and $M = \{x : x^1 + \cdots + x^n = 1,\ x^i > 0 \text{ for } i = 1,\dots,n\}$.
(a) Show that $f(x) \le n^{-n}$ for every $x \in M$, with equality if $x^1 = \cdots = x^n = n^{-1}$.
[Hint: First show that $f$ has an absolute maximum on $M$. Apply the multiplier
rule to $\log f$, which has a maximum at the same point where $f$ has one.]
(b) Using (a), prove that the geometric mean of n positive numbers is no more
than their arithmetic mean. See Problem 6(b), Section 1-5.
8. Let $p > 1$ and $p'$ be the number such that $p^{-1} + (p')^{-1} = 1$. Let $\|x\|$ be as in
Problem 3, Section 1-6; and for each covector $a$ let $\|a\| = \max \{a \cdot x : \|x\| = 1\}$.
Show that
$$\|a\| = \Big(\sum_{i=1}^{n} |a_i|^{p'}\Big)^{1/p'}.$$

[Note: For these norms the inequality $|a \cdot x| \le \|a\|\,\|x\|$ [formula (1-18), p. 32] is
called Hölder's inequality. A related inequality for integrals is given in Section 5-12.]
9. Let $f(x, y, z) = 2xz + y^2$.
(a) Find the characteristic values $\lambda_1, \lambda_2, \lambda_3$.
(b) Sketch the surface with equation $2xz + y^2 = 1$. With equation $2xz + y^2 = 0$.
10. Let $\|L\|$ be defined as in Section 4-3. Show that $\|L\|^2$ is the largest characteristic
value of $L^t \circ L$. [Hint: Use (4-8) with y = L(t)].
11. (Second derivative test for constrained relative maxima.) Let $f$ and $\Phi$ be of
class $C^{(2)}$, and let $Q(x, h) = \sum_{i,j=1}^{n} F_{ij}(x) h^i h^j$ where $F$ is as in the multiplier rule.
Show that:
(a) If $f|M$ has a relative maximum at $x_0$, then $Q(x_0, h) \le 0$ for every $h \in T(x_0)$.
(b) If $Q(x_0, h) < 0$ for every $h \in T(x_0)$, $h \ne 0$, then $f|M$ has a strict relative
maximum at $x_0$. [Hints: See the proof of Theorem 6. Set $h_t = t^{-1}(y_t - x_0)$,
and show that $\lim_{t \to 0} Q(x_0, h_t) = Q(x_0, h)$.]
CHAPTER 5

Integration

The integral of a real-valued function over a set is a generalization of the
notion of sum. It is defined by approximating in a suitable way by certain
finite sums. The first careful definition was due to Riemann (see the Historical
Notes). Riemann defined the integral of a function over an interval $[a, b]$ of
the real line $E^1$. In the succeeding years Riemann's idea was extended in several
ways. However, the Riemann integral has several intrinsic drawbacks, and for
a truly satisfactory treatment of integration a different approach had to be
found.
About 1900 Lebesgue discovered a more sophisticated and flexible theory
of integrals. In this chapter the elements of the Lebesgue theory are given.
The first step is to define the measure of a set $A \subset E^n$. For n = 1, 2, or 3, the
measure is respectively the length, area, or volume of $A$. An important property
of Lebesgue measure is its countable additivity [formula (5-9)]. While not
every set $A$ is assigned a measure, countable additivity ensures that the class
of measurable sets is large enough for all applications encountered in mathematical
analysis.
After measure, the integral of a bounded function f over a bounded set A
is defined using upper and lower integrals. The integral exists under the very
mild assumptions that A is measurable and f is measurable on A. Later
(Sections 5-6 and 5-10), the integral is studied without those boundedness
assumptions.
The definition of an integral does not furnish an effective procedure for
the actual evaluation of integrals. However, the theorems on iterated integrals
and transformation of integrals (Sections 5-5, 5-8), together with the funda-
mental theorem of calculus, provide a useful technique for this purpose.
Among the important features of the Lebesgue theory are the theorems
about integration term by term in sequences of functions. Such questions are
treated in Section 5-10.

Notation. The n-dimensional measure of a set $A$ will be denoted by $V_n(A)$.
If the dimension n is clear from the context, then we write simply "measure"
rather than "n-dimensional measure" and $V(A)$ rather than $V_n(A)$. The
integral of $f$ over $A$ is denoted by

$$\int_A f\, dV_n \quad \text{or} \quad \int_A f(x)\, dV_n(x).$$

If $A = E^n$, we write simply $\int f\, dV_n$. The symbols $dV_n$ and $dV_n(x)$ are used
after an integral sign merely for convenience and for traditional reasons. They
will have no significance by themselves. (On the other hand, $df$ has a meaning
already explained in Section 2-6. In particular, $dx^i$ stands for the differential
of the ith standard cartesian coordinate function $X^i$.)
If n = 1, then we write, as is customary, $\int_A f(x)\, dx$ instead of
$\int_A f(x)\, dV_1(x)$, and if $A = [a, b]$, we write $\int_a^b f(x)\, dx$.

5-1 INTERVALS
What is the n-dimensional measure of a subset $A$ of $E^n$? To start with,
let us consider the simplest possible case, where $A$ is an n-dimensional interval.
A 2-dimensional interval is a rectangle with sides parallel to the coordinate
axes (Fig. 5-1). Its area is the product of the lengths of its sides. Since $E^2$ is
the cartesian product $E^1 \times E^1$, a 2-dimensional interval is just the cartesian
product of 1-dimensional intervals. Similarly, a set $I \subset E^n$ is called an n-dimensional
interval if $I$ is the cartesian product of 1-dimensional intervals:

$$I = J_1 \times \cdots \times J_n,$$

where each $J_i$ is a finite interval of $E^1$. The interval $I$ is closed if each $J_i$ is
closed, and open if each $J_i$ is open. For instance, if $I$ is closed, then there exist
$$x_0 = (x_0^1,\dots,x_0^n), \qquad x_1 = (x_1^1,\dots,x_1^n)$$

with $x_0^i < x_1^i$ for each $i = 1,\dots,n$ such that $J_i = [x_0^i, x_1^i]$ and

$$I = \{x : x_0^i \le x^i \le x_1^i,\ i = 1,\dots,n\}.$$
The n-dimensional measure $V(I)$ of $I$ is the product of the lengths of the intervals $J_i$:
$$V(I) = (x_1^1 - x_0^1) \cdots (x_1^n - x_0^n).$$
We shall next define the measure of a set which is a finite union of
n-dimensional intervals. For this purpose the idea of a grid of hyperplanes is
introduced.

Grids. For each $i = 1,\dots,n$ let us take a finite set of real numbers; let
the elements of these sets be denoted by $x^i_j$, where

$$x^i_1 < x^i_2 < \cdots < x^i_{m_i+1}$$
and $m_i + 1$ is the number of elements of the ith set. Let $P^i_j$ be the hyperplane
with equation $x^i = x^i_j$, and let $\Pi$ be the union of all of these hyperplanes.
Such a set $\Pi$ is called a grid of hyperplanes. A grid divides $E^n$ into a finite
number of n-dimensional intervals, called intervals of $\Pi$, and a finite number
of unbounded sets. The latter could be called semi-infinite intervals of $\Pi$, but
we shall have no occasion to do so. The intervals of $\Pi$ have the form
$I = J_1 \times \cdots \times J_n$ where $J_i = [x^i_{j_i}, x^i_{j_i+1}]$ and the integers $j_1,\dots,j_n$ may be
chosen arbitrarily subject to $1 \le j_i \le m_i$. There are $m_1 \cdots m_n$ intervals
of the grid, and for convenience we have taken them to be closed. (See Fig. 5-2.)

Figure 5-1    Figure 5-2
Let us call a set $Y$ a figure if $Y$ is the union of certain intervals $I_1,\dots,I_p$
of some grid $\Pi$. The measure of $Y$ is

$$V(Y) = V(I_1) + \cdots + V(I_p). \tag{5-1}$$

There are many possible choices for $\Pi$. Consequently, we must show that
$V(Y)$ depends only on $Y$ and not on the particular grid chosen. Let us call
$\Pi'$ a refinement of $\Pi$ if $\Pi \subset \Pi'$. It is easy to show that $V(Y)$ is unchanged
if $\Pi'$ is obtained by adding one hyperplane to $\Pi$, and hence by induction if
$\Pi$ is replaced by any refinement of it. Now let $\Pi$ and $\Pi'$ be any two grids such
that $Y$ is the union of intervals of $\Pi$ and also the union of intervals of $\Pi'$. Then
$\Pi \cup \Pi'$ is a refinement of both. Consequently, $V(Y)$ is the same whether
$\Pi$ or $\Pi'$ is used.
This same reasoning shows that if $Y$ and $Z$ are figures, then $Y$ and $Z$ can
be written as unions of intervals of the same grid $\Pi$. Therefore $Y \cup Z$ is a
figure. Moreover,
$$V(Y \cup Z) \le V(Y) + V(Z). \tag{5-2}$$
If $Y \cap Z$ is empty, then equality holds in (5-2).
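
The common-grid argument translates directly into a computation. The sketch below is an illustration under assumptions: a figure is represented as a list of closed boxes, each box being a list of `(lo, hi)` pairs, and the function name `figure_measure` is hypothetical. The grid is built from all box endpoints, and the measure is the sum of the measures of the grid intervals contained in some box, as in formula (5-1).

```python
from itertools import product


def figure_measure(boxes):
    """Measure of the union of closed boxes, computed on a common grid.

    Each box is a list of (lo, hi) pairs, one pair per coordinate. The grid
    hyperplanes are taken through every endpoint, so each grid interval is
    either contained in a given box or meets it only along its boundary.
    """
    if not boxes:
        return 0
    n = len(boxes[0])
    cuts = [sorted({b[i][0] for b in boxes} | {b[i][1] for b in boxes})
            for i in range(n)]
    total = 0
    for index in product(*(range(len(c) - 1) for c in cuts)):
        cell = [(cuts[i][j], cuts[i][j + 1]) for i, j in enumerate(index)]
        inside = any(
            all(b[i][0] <= cell[i][0] and cell[i][1] <= b[i][1] for i in range(n))
            for b in boxes
        )
        if inside:
            volume = 1
            for lo, hi in cell:
                volume *= hi - lo
            total += volume
    return total


if __name__ == "__main__":
    Y = [[(0, 2), (0, 1)]]   # assumed sample figures in the plane
    Z = [[(1, 3), (0, 2)]]
    W = [[(4, 5), (0, 1)]]
    print(figure_measure(Y), figure_measure(Z), figure_measure(Y + Z))
    print(figure_measure(Y + W) == figure_measure(Y) + figure_measure(W))
```

For the overlapping pair the union has measure less than the sum of the two measures, and for the disjoint pair the two agree, which is the content of (5-2).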

PROBLEMS
1. Let n = 1 and $Y = [0, 1] \cup [2, 3]$, $Z = [1, 3] \cup [4, 5]$. Verify formula (5-2) in
this example.
2. Let n = 2 and $Y = [0, 2] \times [0, 1] \cup [1, 3] \times [1, 2]$, $Z = [-1, 2] \times [-1, 3]$. Find
a grid $\Pi$ such that both $Y$ and $Z$ are unions of intervals of $\Pi$. Find the areas of
$Y$, $Z$, $Y \cup Z$, and $Y \cap Z$, and verify that $V(Y) + V(Z) - V(Y \cup Z) = V(Y \cap Z)$.
3. Let $I_1 = [0, 1] \times [0, 1] \times [0, 1]$ and $I_2 = [\tfrac12, 2] \times [0, 2] \times [-1, 2]$. Find the
volume of $I_1 \cup I_2$ and of $I_1 \cap I_2$.
4. Let m be a positive integer, $f(x) = \exp x$, and $Y = I_1 \cup \cdots \cup I_m$, where for
$k = 1,\dots,m$,
$$I_k = [(k-1)/m,\ k/m] \times [0, f(k/m)].$$
Find the area $V_2(Y)$. Show that it is approximately $e - 1$ if m is large.
5. (a) Let $I_1$ and $I_2$ be n-dimensional intervals. Show that $I_1 \cap I_2$ is also an interval
provided that it has nonempty interior.
(b) When is $I_1 \cup I_2$ an interval?

5-2 MEASURE
We shall now define the measure of a bounded set $A$. This is done in two
stages. First, the measure of an open set $G$ is defined by approximating $G$
from within by figures, and that of a compact set $K$ by approximating $K$ from
without by figures. In the second stage, $A$ is approximated from within by
compact sets and from without by open sets. This two-stage approximation
process is an important feature of the Lebesgue theory of measure.
There is an older theory of measure due to Jordan. In this theory A is
approximated simultaneously from within and without by figures. The Jordan
theory is unsatisfactory for several reasons. Among them is the fact that the
class of sets to which it applies is too small. For instance, there are compact
sets to which the Jordan theory does not assign any measure.
Let G be an open set. If Y is a figure contained in G, then the measure
of G must be more than V(Y). It is defined to be the least upper bound of the
set of all such numbers V(Y). (See Fig. 5-3.)

Figure 5-3

Definition. The measure of an open set $G$ is $V(G) = \sup \{V(Y) : Y \subset G\}$.

If the set $S = \{V(Y) : Y \subset G\}$ has no upper bound, then we set $V(G) = +\infty$.
For instance, $V(E^n) = +\infty$. If $G$ is bounded, then $G$ is contained in
some interval $I$, and $V(I)$ is an upper bound for $S$. In this case $V(G)$ is finite.
If $H$ is an open subset of $G$, let $T = \{V(Y) : Y \subset H\}$. Since $T \subset S$,
$\sup T \le \sup S$. Thus $V(H) \le V(G)$.
By definition (Section A-1), the least upper bound $\sup S$ has the property
that $s \le \sup S$ for every $s \in S$, and given $\epsilon > 0$ there exists an $s \in S$ such
that $\sup S < s + \epsilon$.
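
The definition of $V(G)$ as a least upper bound over figures contained in $G$ can be illustrated for the open unit disk in $E^2$. The sketch below is an illustration under assumptions: it uses only the grids with spacing $1/n$ and the hypothetical function name `inner_figure_area`, so it produces particular elements of the set $S$, not the supremum itself. The values increase toward $\pi$ as the grid is refined.

```python
import math


def inner_figure_area(n):
    """Area of the figure made of grid squares of side 1/n inside the open unit disk.

    The closed square [i/n, (i+1)/n] x [j/n, (j+1)/n] lies in the open disk
    exactly when its corner farthest from the origin has norm less than 1.
    Integer arithmetic (in units of 1/n) keeps the test exact.
    """
    count = 0
    for i in range(-n, n):
        far_x = max(abs(i), abs(i + 1))
        for j in range(-n, n):
            far_y = max(abs(j), abs(j + 1))
            if far_x * far_x + far_y * far_y < n * n:
                count += 1
    return count / (n * n)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, inner_figure_area(n))
    print("pi =", math.pi)
```

The grids for n = 4, 16, 64 are successive refinements, so the corresponding figures are nested and their areas cannot decrease.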

Example 1. Let $G = \operatorname{int} Z$ where $Z$ is a figure (recall that any figure is a closed set).
Let us show that $V(\operatorname{int} Z) = V(Z)$. For any figure $Y \subset \operatorname{int} Z$, $V(Y) \le V(Z)$; and
given $\epsilon > 0$ one can find such a figure $Y$ with $V(Z) < V(Y) + \epsilon$. Thus
$V(Z) = \sup \{V(Y) : Y \subset \operatorname{int} Z\}$.

We shall now establish a formula for the sum of the measures of two open
sets. For this purpose a topological lemma is needed.

Lemma 1. Let $G$ and $H$ be open sets and $K$ be a compact subset of $G \cup H$.
Then there exists $d > 0$ such that the d-neighborhood of any $x \in K$ is either
contained in $G$ or in $H$.

Proof. The sets $E^n - G$ and $E^n - H$ are closed. Let

$$f(x) = \operatorname{dist}(x, E^n - G), \qquad g(x) = \operatorname{dist}(x, E^n - H),$$

where $\operatorname{dist}(x, A)$ is the distance from $x$ to the set $A$ (Section A-8). The functions
$f$ and $g$ are continuous and $f(x) + g(x) > 0$ for every $x \in G \cup H$. The
continuous function $f + g$ has a positive minimum value $c$ on the compact set
$K$. Let $d = c/2$. For every $x \in K$, either $f(x) \ge d$ or $g(x) \ge d$. ∎

Lemma 2a. Let $G$ and $H$ be open sets of finite measure. Then

$$V(G \cup H) \le V(G) + V(H). \tag{5-3}$$

Proof. Let $W$ be any figure such that $W \subset G \cup H$. Let $d$ be as in Lemma 1
with $K = W$. The figure $W$ is the union of intervals of a grid $\Pi$. By refining
$\Pi$ if necessary we may suppose that each interval of $\Pi$ has diameter less than $d$.
Let $Y$ be the union of those intervals of $\Pi$ which are contained in $G$, and $Z$
the union of those contained in $H$. By Lemma 1, $W \subset Y \cup Z$. Consequently,

$$V(W) \le V(Y \cup Z) \le V(Y) + V(Z).$$

Since $Y \subset G$, $V(Y) \le V(G)$; similarly, $V(Z) \le V(H)$. Hence

$$V(W) \le V(G) + V(H).$$

The number $V(G) + V(H)$ is an upper bound for $\{V(W) : W \subset G \cup H\}$; and
hence it is no less than the least upper bound $V(G \cup H)$. This proves (5-3). ∎

We next define the measure of a compact set $K$ by approximating $K$ from
without by figures. For this purpose we consider those figures $Z$ such that $K$
is contained in the interior of $Z$.

Definition. The measure of a compact set $K$ is

$$V(K) = \inf \{V(Z) : K \subset \operatorname{int} Z\}.$$

Example 2. Any figure Y is a compact set. The new and old definitions of V(Y)
agree. The proof is like the one in Example 1.

Lemma 3. Let $K$ and $L$ be compact sets such that $K \cap L$ is empty. Then

$$V(K \cup L) \ge V(K) + V(L). \tag{5-4}$$

Proof. Let $f(x) = \operatorname{dist}(x, L)$. Since $K \cap L$ is empty and $L$ is closed,
$f(x) > 0$ for every $x \in K$. Since $K$ is compact and $f$ is continuous, $f$ has a
positive minimum value $d$ on $K$.
Let $W$ be any figure such that $K \cup L \subset \operatorname{int} W$. Then $W$ is the union of
intervals $I_1,\dots,I_p$ with diameters less than $d$. Let $Y$ be the union of those
intervals $I_k$ such that $(\operatorname{int} I_k) \cap K$ is not empty, and $Z$ the union of those
such that $(\operatorname{int} I_k) \cap L$ is not empty. Then $Y \cup Z \subset W$, $Y \cap Z$ is empty,
and $K \subset \operatorname{int} Y$, $L \subset \operatorname{int} Z$. Hence

$$V(K) + V(L) \le V(Y) + V(Z) = V(Y \cup Z) \le V(W).$$

This shows that $V(K) + V(L)$ is a lower bound for $\{V(W) : K \cup L \subset \operatorname{int} W\}$,
and hence is no more than the greatest lower bound $V(K \cup L)$. ∎

Now let $A$ be any bounded set. Its outer measure, denoted by $\overline{V}(A)$, is
defined by approximating from without by open sets:

$$\overline{V}(A) = \inf \{V(G) : A \subset G\}.$$

Similarly, the inner measure $\underline{V}(A)$ is defined by approximating from within
by compact sets:
$$\underline{V}(A) = \sup \{V(K) : K \subset A\}.$$
Lemma 4. Let $K$ be a compact set and $G$ an open set such that $K \subset G$. Then
there is a figure $Y$ such that $K \subset \operatorname{int} Y$ and $Y \subset G$.

The proof is left to the reader (Problem 6).


If $K \subset A$, then by Lemma 4 $V(K) \le V(G)$ whenever $A \subset G$. Hence
$V(K)$ is a lower bound for $\{V(G) : A \subset G\}$, and $V(K) \le \overline{V}(A)$. Thus $\overline{V}(A)$
is an upper bound for $\{V(K) : K \subset A\}$, and

$$\underline{V}(A) \le \overline{V}(A).$$
It is easy to show that if $B \subset A$, then

$$\underline{V}(B) \le \underline{V}(A), \qquad \overline{V}(B) \le \overline{V}(A).$$

If $H$ is a bounded open set, then $V(H) \le V(G)$ for any open set $G$ containing
$H$, and there is equality when $G = H$. Hence $\overline{V}(H) = V(H)$. Given
$\epsilon > 0$, there is a figure $Y \subset H$ with $V(H) - \epsilon < V(Y)$. Since $Y$ is a compact
set, $V(Y) \le \underline{V}(H)$, and hence $V(H) - \epsilon < \underline{V}(H)$. Since this is true
for every $\epsilon > 0$, $V(H) \le \underline{V}(H)$. But $\underline{V}(H) \le \overline{V}(H)$, and therefore

$$\underline{V}(H) = \overline{V}(H) = V(H).$$

Similarly, if $L$ is a compact set then

$$\underline{V}(L) = \overline{V}(L) = V(L).$$

Definition. A bounded set $A$ is called measurable if its outer and inner
measures are equal. If $A$ is measurable, then the number

$$V(A) = \underline{V}(A) = \overline{V}(A)$$

is called the n-dimensional measure of $A$.

We showed above that bounded open sets and compact sets are measur-
able and that the new definition of their measures agrees with the previous one.
Many sets which are neither open nor compact are also measurable. In fact,
the only examples of nonmeasurable sets are obtained in a quite nonconstruc-
tive way using the “axiom of choice” of set theory (see [15], p. 157).
We shall now show that finite unions, intersections, and differences of
bounded measurable sets are also measurable. For this purpose let us first
prove the following.

Lemma 5. Let $A$ and $B$ be bounded sets. Then

$$\overline{V}(A \cup B) \le \overline{V}(A) + \overline{V}(B). \tag{5-5}$$

If $A \cap B$ is empty, then

$$\underline{V}(A \cup B) \ge \underline{V}(A) + \underline{V}(B). \tag{5-6}$$

Proof. Given $\epsilon > 0$, there are open sets $G \supset A$, $H \supset B$ such that

$$V(G) < \overline{V}(A) + \epsilon/2, \qquad V(H) < \overline{V}(B) + \epsilon/2.$$

Then $G \cup H$ is an open set containing $A \cup B$, and from Lemma 2a

$$\overline{V}(A \cup B) \le V(G \cup H) \le V(G) + V(H),$$

$$\overline{V}(A \cup B) < \overline{V}(A) + \overline{V}(B) + \epsilon.$$

Since the last inequality is true for every $\epsilon > 0$, we must have (5-5).
Similar reasoning, using Lemma 3, gives (5-6). ∎

Proposition 15a. Let $A$ and $B$ be bounded measurable sets such that $A \cap B$
is empty. Then $A \cup B$ is measurable and

$$V(A \cup B) = V(A) + V(B). \tag{5-7}$$

Proof. By Lemma 5,

$$V(A) + V(B) \le \underline{V}(A \cup B) \le \overline{V}(A \cup B) \le V(A) + V(B).$$

Since the extreme left-hand and right-hand sides are the same, both $\underline{V}(A \cup B)$
and $\overline{V}(A \cup B)$ must equal $V(A) + V(B)$. ∎

Let us call a finite collection $\{A_1,\dots,A_m\}$ of sets disjoint if $A_k \cap A_l$ is
empty whenever $k \ne l$.

Corollary 1. If $\{A_1,\dots,A_m\}$ is a finite disjoint collection of bounded,
measurable sets, then $A_1 \cup \cdots \cup A_m$ is measurable and

$$V(A_1 \cup \cdots \cup A_m) = \sum_{k=1}^{m} V(A_k). \tag{5-8}$$

Proof. Use induction on m. ∎

Corollary 2. Let $A$ be a bounded set. Then $A$ is measurable if and only if
for every $\epsilon > 0$ there exist a compact set $K$ and an open set $G$ such that
$K \subset A \subset G$ and $V(G - K) < \epsilon$.

The proof is left to the reader (Problem 7).

Proposition 16. Let $A$ and $B$ be bounded measurable sets. Then $A - B$,
$A \cap B$, and $A \cup B$ are also measurable.

Proof. Let us first prove that $A - B$ is measurable. Given $\epsilon > 0$, let
$G$, $G'$ be open sets and $K$, $K'$ compact sets such that $K \subset A \subset G$, $K' \subset B \subset G'$,
and
$$V(G - K) < \epsilon/2, \qquad V(G' - K') < \epsilon/2.$$

Let $H = G - K'$, $L = K - G'$. Then $H$ is open, $L$ is compact,

$$L \subset A - B \subset H.$$

Moreover, $H - L$ is open and

$$H - L \subset (G - K) \cup (G' - K').$$
By Lemma 2a,

$$V(H - L) \le V(G - K) + V(G' - K') < \epsilon.$$

By Corollary 2, this shows that $A - B$ is measurable.

Now $A \cap B = A - (A - B)$. Since $A$ and $A - B$ are bounded
measurable sets, by the first part of the proposition their difference $A \cap B$
is measurable. Moreover,

$$A \cup B = (A - B) \cup (B - A) \cup (A \cap B),$$

and the three sets on the right-hand side are measurable and no two of them
intersect. By Corollary 1, $A \cup B$ is measurable. ∎

Countable additivity of measure. The result expressed by Corollary 1 is
called finite additivity of measure. Let us now prove a stronger result.
A series $\sum_{k=1}^{\infty} a_k$ of nonnegative numbers converges if the partial sums
$s_m = \sum_{k=1}^{m} a_k$ are bounded, since the partial sums then form a bounded
nondecreasing sequence (see Section A-4). If the sequence of partial sums is
unbounded, the series is said to diverge to $+\infty$. A sequence $A_1, A_2, \dots$ of
sets is disjoint if $A_k \cap A_l$ is empty whenever $k \ne l$.

Proposition 15b. If $A_1, A_2, \dots$ is a disjoint sequence of measurable sets
and if $A = A_1 \cup A_2 \cup \cdots$ is bounded, then $A$ is measurable and

$$V(A) = \sum_{k=1}^{\infty} V(A_k). \tag{5-9}$$

This property is called countable additivity of measure. If we let $A_k$ be empty
for $k > m$, then (5-8) is a special case of (5-9).
To prove Proposition 15b we first state:

Lemma 2b. Let $G_1, G_2, \dots$ be open sets each of which has finite measure.
If $G = G_1 \cup G_2 \cup \cdots$, then

$$V(G) \le \sum_{k=1}^{\infty} V(G_k).$$
Proof. Let $Y$ be a figure with $Y \subset G$. The figure $Y$ is a compact set and $G_1, G_2, \dots$ form an
open covering of $Y$. Hence $Y \subset G_1 \cup \cdots \cup G_m$ for some m, and by Lemma 2a

$$V(Y) \le \sum_{k=1}^{m} V(G_k) \le \sum_{k=1}^{\infty} V(G_k).$$
Since this is true for every such $Y$, the lemma follows. ∎

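Proof of Proposition 15b. We have $A_1 \cup \cdots \cup A_m \subset A$ for every m.
Using (5-8),

$$\sum_{k=1}^{m} V(A_k) \le \underline{V}(A).$$

Since this is true for each m,

$$\sum_{k=1}^{\infty} V(A_k) \le \underline{V}(A).$$

On the other hand, given $\epsilon > 0$ let $G_k$ be an open set such that $A_k \subset G_k$ and
$V(G_k) < V(A_k) + \epsilon 2^{-k}$, $k = 1, 2, \dots$, and let $G = G_1 \cup G_2 \cup \cdots$. Then
$A \subset G$ and therefore $\overline{V}(A) \le V(G)$. By Lemma 2b

$$\overline{V}(A) \le \sum_{k=1}^{\infty} V(A_k) + \epsilon \sum_{k=1}^{\infty} 2^{-k},$$

and $\sum_{k=1}^{\infty} 2^{-k} = 1$. Since this is true for any $\epsilon > 0$, $A$ is measurable and (5-9)
holds. ∎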

A sequence $A_1, A_2, \dots$ of sets is called monotone nondecreasing if
$A_1 \subset A_2 \subset \cdots$.

Theorem 12a. Let $A$ be a bounded set such that $A = A_1 \cup A_2 \cup \cdots$,
where $A_k$ is measurable for each $k = 1, 2, \dots$ Then $A$ is measurable, and

$$V(A) \le \sum_{k=1}^{\infty} V(A_k). \tag{5-10}$$

If the sequence of sets $A_1, A_2, \dots$ is disjoint, then equality holds in (5-10).
If the sequence $A_1, A_2, \dots$ is nondecreasing, then

$$V(A) = \lim_{m \to \infty} V(A_m). \tag{5-11}$$

Proof. Let

$$B_1 = A_1, \quad B_2 = A_2 - A_1, \quad \dots, \quad B_k = A_k - (A_1 \cup \cdots \cup A_{k-1}), \quad \dots$$

By Proposition 16 each $B_k$ is measurable. Moreover, $B_1, B_2, \dots$ are disjoint
and their union is $A$. By Proposition 15b, $A$ is measurable and

$$V(A) = \sum_{k=1}^{\infty} V(B_k).$$
Since $B_k \subset A_k$, $V(B_k) \le V(A_k)$, giving (5-10). If the sequence $A_1, A_2, \dots$
is disjoint, $B_k = A_k$ and (5-10) becomes formula (5-9) which has already been
proved.
If $A_1 \subset A_2 \subset \cdots$, then $A_m = B_1 \cup \cdots \cup B_m$ and

$$V(A_m) = \sum_{k=1}^{m} V(B_k).$$

Taking the limit, we get (5-11). ∎


Unbounded sets. Let $U_r = \{x : |x| < r\}$. A set $A$ is called measurable
if $A \cap U_r$ is measurable for every $r > 0$. The measure of $A$ is

$$V(A) = \lim_{r \to +\infty} V(A \cap U_r). \tag{5-12}$$

If we set $\phi(r) = V(A \cap U_r)$, then $\phi$ is a nondecreasing function. Therefore
the limit exists in (5-12). It may be finite or $+\infty$ (Section A-10). If $A$
is a bounded set, then $A \subset U_{r_0}$ for some $r_0$, and for $r \ge r_0$, $A = A \cap U_r$. For
bounded sets this definition agrees with the previous one.
It will be proved in Section 5-10 that Theorem 12a remains true without
the assumption that the sets $A, A_1, A_2, \dots$ are bounded.
Let $A_1, A_2, \dots$ be measurable sets such that $A_1 \supset A_2 \supset \cdots$, and let
$A = A_1 \cap A_2 \cap \cdots$; then formula (5-11) is still correct provided $V(A_1)$ is
finite. This is also proved in Section 5-10.

Example 3. Let $A_m = [m, \infty)$. Then $V(A_m) = +\infty$ for each $m = 1, 2, \dots$, but
$V(A_1 \cap A_2 \cap \cdots) = 0$ since the intersection is empty.

Example 4. Let $G$ be an open subset of $E^1$. If $G$ is the union of a finite number of
disjoint open intervals $I_1,\dots,I_m$, then $V(G) = V(I_1) + \cdots + V(I_m)$. It can also
happen that $G$ is the union of a disjoint infinite sequence $I_1, I_2, \dots$ of open intervals,
in which case

$$V(G) = \sum_{k=1}^{\infty} V(I_k).$$

The third possibility is that $G$ contains some half-line. In that case $V(G) = +\infty$.

The sets of measure 0 play a special role. They turn out to be negligible
in integration theory, and for that reason we shall call them null sets.

Definition. If $V(A) = 0$, then $A$ is a null set.

Corollary. If $A_1, A_2, \dots$ are null sets, then $A_1 \cup A_2 \cup \cdots$ is a null set.
If $B \subset A$ and $A$ is a null set, then $B$ is a null set.

Proof. By (5-10)

$$0 \le V(A_1 \cup A_2 \cup \cdots) \le \sum_{k=1}^{\infty} V(A_k) = 0,$$

which proves the first assertion. If $A$ is a bounded null set and $B \subset A$, then

$$0 \le \underline{V}(B) \le \overline{V}(B) \le \overline{V}(A) = 0.$$

Hence $B$ is measurable and is a null set. If $A$ is any null set, then $A \cap U_r$ is a
null set for each r and $B \cap U_r \subset A \cap U_r$. Hence $B \cap U_r$ is a null set, which
implies that $B$ is measurable and is a null set. ∎

Example 5. Let A C M, where M is an (n — 1)-manifold. It is plausible that the


n-dimensional measure of A is 0, and this fact will be proved in Section 5-8. Hence
any such set A is a null set.

Example 6. A set $A$ is countable if either $A$ is a finite set or its elements can be arranged
in an infinite sequence, $A = \{x_1, x_2, \dots\}$ where $x_k \ne x_l$ for $k \ne l$. Any one point
set is a null set. Hence, taking $A_k = \{x_k\}$, we find that any countable set is a null set.

Example 7. Let $A$ be the set of rational numbers in the interval (0, 1). Then $A$ is
countable. For instance, one can write

$$A = \{\tfrac12, \tfrac13, \tfrac23, \tfrac14, \tfrac34, \tfrac15, \dots\}.$$

Hence $A$ is a null set. Since $V_1(A) + V_1[(0, 1) - A] = V_1[(0, 1)] = 1$, the set of
irrational numbers in (0, 1) has measure 1 and therefore must be an uncountable set.
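
Why a countable set is a null set can be made concrete: cover the kth point by an open interval of length $\epsilon 2^{-k}$, so that the total length of the cover is less than $\epsilon$. The sketch below is an illustration under assumptions: the particular enumeration of the rationals and the function names are choices made here, not taken from the text.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval(count):
    """First `count` rationals in (0, 1), listed by increasing denominator."""
    out = []
    q = 2
    while len(out) < count:
        for p in range(1, q):
            if gcd(p, q) == 1:
                out.append(Fraction(p, q))
                if len(out) == count:
                    break
        q += 1
    return out


def covering_intervals(points, eps):
    """Open intervals, the kth centered at the kth point with length eps / 2**k."""
    intervals = []
    for k, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        intervals.append((x - half, x + half))
    return intervals


def total_length(intervals):
    """Sum of the lengths; an upper bound for the measure of the union."""
    return sum(hi - lo for lo, hi in intervals)


if __name__ == "__main__":
    pts = rationals_in_unit_interval(50)
    cover = covering_intervals(pts, Fraction(1, 100))
    print(pts[:5], float(total_length(cover)))
```

However many points are covered, the total length stays below the chosen $\epsilon$, which is the inequality (5-10) applied to the covering intervals.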

PROBLEMS
In 1, 2, and 3 assume that the sets are bounded.
1. Let $A$ and $B$ be measurable. Show that:
(a) $V(A - B) = V(A) - V(A \cap B)$.
(b) $V(A \cup B) + V(A \cap B) = V(A) + V(B)$.
2. Show that if $A$, $B$, and $C$ are measurable, then

$$V(A \cup B \cup C) = V(A) + V(B) + V(C) - V(A \cap B) - V(A \cap C) - V(B \cap C) + V(A \cap B \cap C).$$
3. Show that if $A$ is measurable and $B$ is a null set, then

$$V(A \cup B) = V(A - B) = V(A).$$

4. Let $A = A_1 \cup A_2 \cup \cdots$, where $A_k = \{(x, y) : x = 1/k,\ 0 \le y \le 1\}$ for
$k = 1, 2, \dots$ Show that $V_2(A) = 0$.
5. Let $A_0$ be the circular disk with center (0, 0) and radius 1. For $k = 1, 2, \dots$, let
$A_k$ be the circular disk with center $(1 - 4^{-k})e_1$ and radius $4^{-k-1}$. Let
$A = A_0 - (A_1 \cup A_2 \cup \cdots)$. Find $V_2(A)$.
6. Prove Lemma 4.
7. Prove Corollary 2 to Proposition 15a. [Hint: If $K \subset G$, then $G = K \cup (G - K)$.
By Proposition 15a, $V(G) = V(K) + V(G - K)$.]
8. (a) Show that if A and B are countable sets, then A U B is countable.
(b) Show that if B C A and A is countable, then B is countable.
(c) Show that if Ai, A2,... are countable sets, then 41 U Ag U- ~ is countable.
9. Let A = {21, z2,...} be a countable subset of (0,1). Given 0 < « < 1, let
€; = ne Ih: = (tx, = Ge Lp+ ex), and G = Ty UV) Io U-:-

(a) Show that Vi(G@) < e.


(b) In particular, let A be the set of rational numbers in (0,1). Let K =
[0,1] — G. Then K is a compact subset of the irrational numbers. Show
that Vi(K) > 1 — «.
(c) Show that K = fr K.
5-3 Integrals Over E” 147

10. Let G = Ay U A2U-:> where Aj = (4, 3), 4o= (, 2) U J, 8), Ag =


(sy, zy) U--:U (33, 38),--- Thus 4; is the union of 2’! intervals of length
377. (See Fig. 5-4.)

0 Ag Ay Ag 1
I
——————
oe
As As ‘As As
Figure 5-4

(a) Show that $V_1(G) = 1$. Hence the compact set $K = [0, 1] - G$ is a null set. K is called the Cantor discontinuum.
(b) Show that no connected subset of K contains more than one point.
(c) Show that $x \in K$ if and only if $x = \sum_{i=1}^{\infty} a_i 3^{-i}$ where $a_i = 0$ or 2, $i = 1, 2, \ldots$
(d) Let $f(x) = \sum_{i=1}^{\infty} a_i 2^{-i-1}$ for $x \in K$. Show that $f(K) = [0, 1]$. Hence K is uncountable.
(e) For x in the kth interval of $A_j$ let f(x) have the constant value $(2k - 1)2^{-j}$, $k = 1, 2, \ldots, 2^{j-1}$, $j = 1, 2, \ldots$ Show that f is continuous and nondecreasing on [0, 1]. [Note: f is called the Cantor function.]
11. Show that any straight line in $E^2$ has area 0.
12. Show that if A is an unbounded measurable set, then
$$V(A) = \sup\{V(K) : K \subset A\}.$$
(If $V(A) = +\infty$, this means that for every C > 0 there is a compact set $K \subset A$ with $V(K) \ge C$.)

5-3 INTEGRALS OVER E”


Let f be real-valued with domain $E^n$. The support of f is the smallest closed set K such that f(x) = 0 for every $x \notin K$.
Example 1. Let f(x) = x + 1 if $x \in (0, 1)$ and f(x) = 0 if $x \notin (0, 1)$. The support of f is the closed interval [0, 1].

The object of this section is to define the integral $\int f\,dV$ of f over $E^n$ when f is bounded and has compact support. This is done first for functions taking only a finite number of values. In that case the integral is just a certain finite sum. A function $\phi$ is called a step function if there exists a disjoint collection $\{A_1, \ldots, A_m\}$ of bounded measurable sets such that $\phi(x)$ is constant on each $A_k$ and $\phi(x) = 0$ for $x \notin A_1 \cup \cdots \cup A_m$. If $\phi(x) = c_k$ for every $x \in A_k$, then the integral of $\phi$ over $E^n$ is

$$\int \phi\,dV = \sum_{k=1}^{m} c_k V(A_k). \tag{5-13}$$
Example 2. Let $\phi(x) = k/m$ for $x \in [(k - 1)/m, k/m)$, $k = 1, 2, \ldots, m$, and $\phi(x) = 0$ for $x \notin [0, 1)$. Then
$$\int \phi\,dx = \frac{1}{m^2} + \frac{2}{m^2} + \cdots + \frac{m}{m^2} = \frac{m + 1}{2m}.$$
We write dx instead of $dV_1$ in case n = 1.
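
The finite sum (5-13) is easy to evaluate mechanically. The following Python sketch is only an illustration: the helper names `step_integral` and `example_pieces` are hypothetical, and a step function is represented, by assumption, as a list of pairs (value $c_k$, measure $V(A_k)$). It reproduces the value $(m + 1)/(2m)$ of Example 2 in exact arithmetic.

```python
from fractions import Fraction

def step_integral(pieces):
    """Sum of c_k * V(A_k) over (value, measure) pairs, as in (5-13)."""
    return sum(c * v for c, v in pieces)

def example_pieces(m):
    # phi = k/m on [(k-1)/m, k/m), an interval of length 1/m, for k = 1, ..., m
    return [(Fraction(k, m), Fraction(1, m)) for k in range(1, m + 1)]

m = 4
print(step_integral(example_pieces(m)))  # 5/8, which is (m + 1)/(2m) for m = 4
```

Exact fractions are used here so that the comparison with $(m + 1)/(2m)$ involves no rounding.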

The collection $\{A_1, \ldots, A_m\}$ is not uniquely determined by $\phi$. However, let $\{B_1, \ldots, B_p\}$ be another disjoint collection of bounded measurable sets such that $\phi(x)$ is constant on each $B_l$ and $\phi(x) = 0$ for $x \notin B_1 \cup \cdots \cup B_p$. For each $k = 1, \ldots, m$ the collection $\{A_k \cap B_1, \ldots, A_k \cap B_p\}$ is disjoint, and applying (5-8) to it,

$$\sum_{k=1}^{m} c_k V(A_k) = \sum_{k=1}^{m} \sum_{l=1}^{p} c_k V(A_k \cap B_l).$$

In the same way, if $d_l = \phi(x)$ for $x \in B_l$, then

$$\sum_{l=1}^{p} d_l V(B_l) = \sum_{l=1}^{p} \sum_{k=1}^{m} d_l V(A_k \cap B_l).$$

Since $d_l = c_k$ whenever $B_l \cap A_k$ is not empty, the right-hand sides are equal. This shows that $\int \phi\,dV$ does not depend on which collection is used.
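Similar reasoning shows that the sum $\phi + \psi$ of two step functions is also a step function. Moreover,

$$\int (\phi + \psi)\,dV = \int \phi\,dV + \int \psi\,dV. \tag{5-14a}$$

$$\int (c\phi)\,dV = c \int \phi\,dV \quad \text{for any scalar } c. \tag{5-14b}$$

$$\int \psi\,dV \le \int \phi\,dV \quad \text{if } \psi \le \phi. \tag{5-14c}$$

The notation $\psi \le \phi$ means that $\psi(x) \le \phi(x)$ for every $x \in E^n$.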
We are now ready to define upper and lower integrals. Let f be a function which is bounded and has compact support. The upper integral of f will be denoted by $\overline{\int} f\,dV$. If $\phi$ is any step function such that $\phi \ge f$, then $\int \phi\,dV$ is an upper estimate for it. We take the greatest lower bound of the set of all such numbers $\int \phi\,dV$.

Definition. The upper integral of f over $E^n$ is

$$\overline{\int} f\,dV = \inf\left\{\int \phi\,dV : \phi \ge f\right\}. \tag{5-15a}$$

We must check that there is at least one such step function, to insure that the set on the right-hand side is not empty. However, since f is bounded, there is a number C such that $|f(x)| \le C$ for every x. Since its support is compact there is an interval I such that f(x) = 0 for every $x \notin I$. Let $\phi_0(x) = C$ if $x \in I$ and $\phi_0(x) = 0$ if $x \notin I$. Then $\phi_0 \ge f$ and $\phi_0$ is a step function.
In the same way, the lower integral of f over $E^n$ is denoted by $\underline{\int} f\,dV$. It is the least upper bound of the set of all numbers $\int \psi\,dV$, where $\psi$ is a step function and $\psi \le f$:
$$\underline{\int} f\,dV = \sup\left\{\int \psi\,dV : \psi \le f\right\}. \tag{5-15b}$$
If $\psi \le f \le \phi$, then by (5-14c), $\int \psi\,dV \le \int \phi\,dV$.
This implies that

$$\underline{\int} f\,dV \le \overline{\int} f\,dV.$$

If $\phi$ is a step function, then $\underline{\int} \phi\,dV = \overline{\int} \phi\,dV = \int \phi\,dV$.

Definition. A bounded function f with compact support is integrable if its upper and lower integrals are equal. Its integral over $E^n$ is

$$\int f\,dV = \overline{\int} f\,dV = \underline{\int} f\,dV. \tag{5-16}$$


We just observed that any step function is integrable. In the next section
it will be shown that every function in a much larger class is integrable.

Proposition 17. Let f and g be integrable functions. Then f + g is integrable and cf is integrable for any scalar c. Moreover,

$$\int (f + g)\,dV = \int f\,dV + \int g\,dV. \tag{5-17a}$$

$$\int (cf)\,dV = c \int f\,dV. \tag{5-17b}$$

$$\int f\,dV \le \int g\,dV \quad \text{if } f \le g. \tag{5-17c}$$

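Proof. Given $\varepsilon > 0$ there exists a step function $\phi_1 \ge f$ such that

$$\int \phi_1\,dV < \overline{\int} f\,dV + \varepsilon/2,$$

and a step function $\phi_2 \ge g$ such that

$$\int \phi_2\,dV < \overline{\int} g\,dV + \varepsilon/2.$$

Then $\phi_1 + \phi_2$ is a step function and $\phi_1 + \phi_2 \ge f + g$. Consequently,

$$\overline{\int} (f + g)\,dV \le \int (\phi_1 + \phi_2)\,dV.$$

Using (5-14a), we have

$$\overline{\int} (f + g)\,dV < \overline{\int} f\,dV + \overline{\int} g\,dV + \varepsilon.$$

Since this is true for every $\varepsilon > 0$,

$$\overline{\int} (f + g)\,dV \le \overline{\int} f\,dV + \overline{\int} g\,dV. \tag{5-18a}$$

Similarly,
$$\underline{\int} (f + g)\,dV \ge \underline{\int} f\,dV + \underline{\int} g\,dV. \tag{5-18b}$$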

If f and g are integrable, then the right-hand sides of (5-18a) and (5-18b) are equal. The left-hand side of (5-18b) is not greater than the left-hand side of (5-18a). Hence both upper and lower integrals of f + g equal $\int f\,dV + \int g\,dV$. This proves that f + g is integrable and (5-17a). The rest of the proof is left to the reader (Problem 4). ∎

An n-dimensional interval $I = J_1 \times \cdots \times J_n$ is half-open to the right if each of the 1-dimensional intervals $J_i$ is half-open to the right. (In the definition of figure we could equally well have used intervals half-open to the right instead of closed intervals.) Let us call a function $\phi$ an elementary step function if $\phi$ is constant on each interval of some grid $\Pi$ and $\phi$ has the value 0 outside the intervals of $\Pi$. To avoid ambiguity about the values of $\phi$ on the bounding faces of intervals we take the intervals of $\Pi$ half-open to the right.

*Riemann integral. If in (5-15a) only elementary step functions $\phi$ are allowed, then the upper Riemann integral of f is obtained. The lower Riemann integral of f is obtained by allowing only elementary step functions $\psi$ in (5-15b). Let us denote upper and lower Riemann integrals by $\overline{S}(f)$ and $\underline{S}(f)$. Then

$$\underline{S}(f) \le \underline{\int} f\,dV \le \overline{\int} f\,dV \le \overline{S}(f). \tag{5-19}$$

If $\underline{S}(f) = \overline{S}(f)$, then f is called Riemann integrable. Their common value S(f) is the Riemann integral of f. From (5-19), if f is Riemann integrable, then f is integrable [in the sense of (5-16)] and

$$S(f) = \int f\,dV. \tag{5-20}$$


It can be proved that a bounded function f with compact support is
Riemann integrable if and only if V({x:f is discontinuous at x}) = 0. See
[1], pp. 230 and 260.
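
The squeeze between lower and upper Riemann integrals can be watched on a simple case. The sketch below is a hedged illustration, not part of the text: it assumes n = 1, takes f(x) = x² on [0, 1) (and 0 elsewhere), and uses a grid of m equal half-open cells; the function name `riemann_bounds` is hypothetical. For a nondecreasing f the best elementary step functions on such a grid take the value of f at the left endpoint (below) and at the right endpoint (above) of each cell.

```python
from fractions import Fraction

def riemann_bounds(f, m):
    """Integrals of the best elementary step functions below and above a
    nondecreasing f on [0, 1), for a grid of m equal half-open cells."""
    h = Fraction(1, m)
    lower = sum(f(k * h) * h for k in range(m))        # inf of f on [kh, (k+1)h) is f(kh)
    upper = sum(f((k + 1) * h) * h for k in range(m))  # sup of f on [kh, (k+1)h) is f((k+1)h)
    return lower, upper

lower, upper = riemann_bounds(lambda x: x * x, 10)
print(lower, upper)  # 57/200 77/200
```

The two sums differ by exactly 1/m here, so they pinch the common value 1/3 as the grid is refined.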

PROBLEMS

1. Determine whether f is bounded. Find its support.


(a) $f(x) = x - |x|$.
(b) $f(x, y) = x \exp(-x^2 - y^2)$.
(c) f(x, y) = 1 if either x or y is a rational number, f(x, y) = 0 if both x and y are irrational.
(d) $f(x, y) = (x - y)|x + y| - (x + y)|x - y|$ if $|x| + |y| < 1$, $f(x, y) = 0$ if $|x| + |y| \ge 1$. Illustrate with a sketch.
2. Let [a] denote the largest integer which is no greater than a (for instance, $[\pi] = 3$). Let $\phi(x, y) = [x + y]$ if $0 \le x < r$, $0 \le y < s$, where r and s are positive integers. For all other (x, y) let $\phi(x, y) = 0$. Show that

$$\int \phi\,dV_2 = rs(r + s - 1)/2.$$



3. Let a unit square be divided into a small square in the center and 2m annular figures of equal width surrounding it, as shown in Fig. 5-5. Let $\phi(x, y) = 0$ for (x, y) in the small square or outside the large square. Let $\phi(x, y) = (-1)^k k$ in the kth annular figure, $k = 1, \ldots, 2m$. Show that

$$\int \phi\,dV_2 = 8m(2m + 1)/(4m + 1)^2.$$

What is this approximately when m is large?

Figure 5-5

4. (a) Show that if f is integrable, then $\int (cf)\,dV = c \int f\,dV$. [Hint: Show that this is true if $c \ge 0$, and that $\overline{\int} g\,dV = -\underline{\int} (-g)\,dV$ for every g. If c < 0, set g = cf and g = -cf.]
(b) Show that $\int f\,dV \le \int g\,dV$ if $f \le g$.

5-4 INTEGRALS OVER BOUNDED SETS


Let A be a bounded measurable set and f be a function which is bounded on A. More precisely, the domain of f contains A and there is a number C such that $|f(x)| \le C$ for every $x \in A$. Let us consider a new function with the same values as f on A and the value 0 otherwise. This function will be denoted by $f_A$. Thus
$$f_A(x) = \begin{cases} f(x) & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$
The function $f_A$ is bounded and has compact support. The values of f outside A should contribute nothing to the integral of f over A.
Definition. The function f is integrable over A if $f_A$ is an integrable function. The integral of f over A is the number

$$\int_A f\,dV = \int f_A\,dV. \tag{5-21}$$


By Proposition 17 sums and scalar multiples of functions integrable over
A are again integrable over A.

The integral has a number of basic properties which we summarize in the next theorem.

Theorem 13. If all the integrals involved exist, then:

(1) $\int_A (f + g)\,dV = \int_A f\,dV + \int_A g\,dV$.
(2) $\int_A (cf)\,dV = c \int_A f\,dV$.
(3) $\int_A 1\,dV = V(A)$.
(4) If $f(x) \le g(x)$ for every $x \in A$, then $\int_A f\,dV \le \int_A g\,dV$.
(5) If $|f(x)| \le C$ for every $x \in A$, then $\left|\int_A f\,dV\right| \le \int_A |f|\,dV \le CV(A)$.
(6) If A is a null set, then $\int_A f\,dV = 0$.
(7) If $A \cap B$ is a null set, then $\int_{A \cup B} f\,dV = \int_A f\,dV + \int_B f\,dV$.
Proof. First of all,

$$(f + g)_A = f_A + g_A, \qquad (cf)_A = c f_A.$$

Therefore (1) and (2) follow from Proposition 17. Let $1_A$ denote the function with the value 1 on A and otherwise 0. It is a step function, called the characteristic function of A, and by (5-13) with m = 1, $c_1 = 1$, $\int 1_A\,dV = V(A)$. This establishes (3). For (4) we have $f_A \le g_A$ and apply Proposition 17. To prove (5) we have

$$f(x) \le |f(x)| \le C \quad \text{for every } x \in A.$$

Hence from (4)

$$\int_A f\,dV \le \int_A |f|\,dV \le \int_A C\,dV,$$

and the right-hand side is CV(A). Similarly, $-f(x) \le |f(x)|$ and

$$-\int_A f\,dV \le \int_A |f|\,dV \le CV(A).$$

Since $\left|\int_A f\,dV\right|$ is either $\int_A f\,dV$ or its negative, this proves (5). Part (6) follows from (5). To prove (7), $f_{A \cup B} = f_A + f_B - f_{A \cap B}$. By Proposition 17

$$\int f_{A \cup B}\,dV = \int f_A\,dV + \int f_B\,dV - \int f_{A \cap B}\,dV,$$

and the last term is 0 by (6). ∎



From (6) and (7), the integral is unchanged if A is replaced by $A \cup N$ or $A - N$, where N is any null set. Similarly, if f(x) = g(x) except for x in some null set, then f and g have the same integral.

Example. If fr A is a null set, then

$$\int_{\operatorname{cl} A} f\,dV = \int_A f\,dV = \int_{\operatorname{int} A} f\,dV.$$

In elementary examples fr A is always a null set. If A is the set of rational numbers in [0, 1], then fr A = [0, 1], which is not a null set. Problem 9, p. 146 furnishes an example of a compact subset of $E^1$ with frontier of positive length. There are open, connected subsets of $E^2$ with frontiers which have positive area.

Let us next show that under quite mild assumptions about f, the integral
exists. It is for this purpose that the idea of measurable function is introduced.
Let f have domain $E^n$.

Definition. If {x: f(x) > c} is a measurable set for every scalar c, then
f is a measurable function.

It will be shown in Section 5-10 that such operations as taking the sum of
two measurable functions or the limit of a sequence of measurable functions
lead again to measurable functions. Just as for nonmeasurable sets, the only
examples of nonmeasurable functions are obtained in a nonconstructive way
using the axiom of choice.

Lemma. If f is a bounded, nonnegative, measurable function with compact support, then f is integrable.

Proof. By replacing f by f/C, where C is an upper bound for f(x), we may assume that $0 \le f(x) \le 1$ for every x. Let I be an interval containing the support of f. Given $\varepsilon > 0$, let m be a positive integer such that $V(I) < \varepsilon m$. For each $k = 1, 2, \ldots$, let

$$E_k = \{x : f(x) > (k - 1)/m\}, \qquad A_k = \{x : (k - 1)/m < f(x) \le k/m\}.$$

(See Fig. 5-6.) Since f is measurable, each $E_k$ is a measurable subset of I, hence $A_k = E_k - E_{k+1}$ is also measurable. Let $\phi$ and $\psi$ be step functions defined by

$$\phi(x) = k/m, \qquad \psi(x) = (k - 1)/m$$

for $x \in A_k$, and $\phi(x) = \psi(x) = 0$ if f(x) = 0. Then $\phi(x) - \psi(x) = 1/m$ on $E_1 = \{x : f(x) > 0\}$ and is 0 otherwise. Hence

$$\int \phi\,dV - \int \psi\,dV = V(E_1)/m, \qquad V(E_1)/m \le V(I)/m < \varepsilon.$$

Figure 5-6

Moreover, $\psi \le f \le \phi$, from which

$$\int \psi\,dV \le \underline{\int} f\,dV \le \overline{\int} f\,dV \le \int \phi\,dV.$$


Hence the upper and lower integrals differ by less than $\varepsilon$. Since this is true for every positive $\varepsilon$, f is integrable. ∎
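
The construction in this proof slices the range of f rather than its domain. As a hedged numerical illustration (not part of the text), the sketch below assumes f(x) = x² on [0, 1] and 0 elsewhere, for which each level set $A_k$ is an interval whose length can be written down directly; the name `level_set_bounds` is hypothetical.

```python
import math

def level_set_bounds(m):
    """Integrals of psi <= f <= phi from the lemma, for f(x) = x**2 on [0, 1]."""
    lower = upper = 0.0
    for k in range(1, m + 1):
        # A_k = {x : (k-1)/m < x**2 <= k/m} is the interval (sqrt((k-1)/m), sqrt(k/m)]
        measure = math.sqrt(k / m) - math.sqrt((k - 1) / m)
        lower += (k - 1) / m * measure  # psi = (k-1)/m on A_k
        upper += k / m * measure        # phi = k/m on A_k
    return lower, upper

lower, upper = level_set_bounds(100)
print(lower, upper, upper - lower)  # the gap is V(E_1)/m = 1/100 up to rounding
```

The two sums bracket the integral 1/3 of this f, and the gap between them is 1/m, in line with the estimate $V(E_1)/m$ in the proof.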

Now let A be a bounded measurable set, and f be a function whose domain contains A. If $\{x \in A : f(x) > c\}$ is measurable for every scalar c, then we call f measurable on A.

Theorem 14. If f is bounded and measurable on A, then f is integrable over A.

Proof. Let us first assume that $f \ge 0$. If $c \ge 0$, then the set

$$\{x : f_A(x) > c\} = \{x \in A : f(x) > c\}$$

is measurable since f is measurable on A. If c < 0, the left-hand side is $E^n$ which is measurable. Hence $f_A$ is measurable. By the lemma, $f_A$ is integrable.
If f has negative values on A, let g(x) = f(x) + C where C is an upper bound for |f(x)| on A. Then $g \ge 0$, and g is bounded and measurable on A. Hence g is integrable over A, and so is f. ∎

Corollary. If f is bounded and continuous on A, then f is integrable over A.

Proof. For every c, $\{x \in A : f(x) > c\}$ is open relative to A. In other words, it is the intersection of A with an open set, and hence is measurable. ∎

In particular, if A is compact, then any f continuous on A is bounded and therefore integrable over A.
Theorem 14 has a sort of converse. If a bounded function f is integrable
over A, then f is measurable on A. We shall not prove this.

*Relation to the Riemann integral. A function f is Riemann integrable over A if $f_A$ is Riemann integrable according to the definition on p. 150. Its Riemann integral over A is $S(f_A)$. If f is Riemann integrable over A, then from (5-20), $S(f_A) = \int_A f\,dV$.

A bounded set A is Jordan measurable if its characteristic function $1_A$ is Riemann integrable. It can be shown that A is Jordan measurable if and only if fr A is a null set. (See [1], p. 256.) If A = [a, b], a closed interval of $E^1$, then the definition of Riemann integral given above can rather easily be shown to agree with Riemann's original definition of integral as the limit of sums (Section A-9).

PROBLEMS
1. Let fs) = MU —2* if OS 2 < 2, f(z) = 0 otherwise. Using the notation in
the proof of the lemma, describe the sets $A_1, \ldots, A_m$. Sketch the step functions $\phi$ and $\psi$ in the case m = 4.
2. In each case show that f is integrable over A.
(a) fiz) = z* expz, A = (0, a}.
(b) $f(x) = \sin(1/x)$ if $x \ne 0$, f(0) = 5, A = [-1, 1].
() fz,y = —/)/@ —»,A = ey, a] <1,\yl < 1,2"¥ y}.
(d) f(x) = 0 if x is irrational, f(x) = 1/q if x = p/q where p and q are integers with no common factor, A = (0, 1).
(e) f(x) = 1 if x is irrational, f(x) = 0 if x is rational, A = [a, b].
3. For each part of Problem 2 describe the sets $\{x \in A : f(x) > c\}$.
4. Show that if f, g, and h are integrable over A and $|f(x) - g(x)| \le h(x)$ for every $x \in A$, then $\left|\int_A f\,dV - \int_A g\,dV\right| \le \int_A h\,dV$.
5. Let f be of class $C^{(2)}$ on [0, a] and $b = \max\{|f''(x)| : 0 \le x \le a\}$. Let $g(x) = f(0) + f'(0)x$. Using Problem 4, show that $\left|\int_0^a f\,dx - af(0) - a^2 f'(0)/2\right| \le a^3 b/6$.
Use this result to estimate f}/* exp(—z?/2) dz.
6. (Mean value theorem for integrals.) Let A be compact and connected. Let f be continuous on A and g be integrable over A with $g(x) \ge 0$ for every $x \in A$. Prove that there exists $x^* \in A$ such that

$$\int_A fg\,dV = f(x^*) \int_A g\,dV.$$

[Hint: Let C and c be the maximum and minimum values of f on A. Then $cg \le fg \le Cg$. Use (2) and (4) of Theorem 13 and the intermediate value theorem.]

5-5 ITERATED INTEGRALS


Thus far we have given no effective procedure for the actual evaluation of integrals. One method for doing this is by writing the integral as an iterated integral and applying the fundamental theorem of calculus. Let $1 \le s < n$. In most cases we shall take s = 1 or s = n - 1. Then $E^n$ can be regarded as the cartesian product $E^s \times E^{n-s}$. Let us write $x = (x', x'')$, where

$$x' = (x^1, \ldots, x^s) \in E^s, \qquad x'' = (x^{s+1}, \ldots, x^n) \in E^{n-s}.$$



Let A be a bounded set and (Fig. 5-7)

$$A(x') = \{x'' : (x', x'') \in A\}, \qquad R = \{x' : A(x') \text{ is not empty}\}. \tag{5-22}$$

Figure 5-7

Let f be a function whose domain contains A. For each fixed $x' \in R$ let $f(x', \cdot)$ denote the function whose value at each $x'' \in A(x')$ is $f(x', x'')$. Let us assume that A is compact and f is continuous on A. Then $A(x')$ and R are compact and $f(x', \cdot)$ is continuous (Section A-8). In stating the theorem about iterated integrals let us use the longer notation $\int_A f(x)\,dV_n(x)$ for the integral. Let

$$g(x') = \int_{A(x')} f(x', x'')\,dV_{n-s}(x'').$$

Then the integral of f over A equals the integral of g over R.

Theorem 15. Let f be continuous on a compact set A. Then

$$\int_A f(x)\,dV_n(x) = \int_R g(x')\,dV_s(x'). \tag{5-23a}$$

Actually, Theorem 15 is true if A is merely measurable and f is integrable over A. In that case there may be a subset of R of s-dimensional measure 0 for which $g(x')$ is undefined. We shall not prove this general form of Theorem 15. However, in the next section the theorem will be extended to the case when A is a σ-compact set.
In the proof the following fact about monotone sequences of functions will be used. For each $\nu = 1, 2, \ldots$ let $F_\nu$ be a bounded measurable function with compact support such that

$$F_1 \ge F_2 \ge \cdots \ge 0.$$

Let
$$F(x) = \lim_{\nu \to \infty} F_\nu(x) \quad \text{for every } x.$$

Then F is measurable (and of course bounded), and

$$\int F\,dV_n = \lim_{\nu \to \infty} \int F_\nu\,dV_n.$$

A proof of this fact will be given in the section on convergence theorems (Corollary 3, Section 5-10).
Proof of Theorem 15. The proof will proceed by observing that the theorem is true for elementary step functions, and then by constructing a monotone sequence of elementary step functions tending to $f_A$.

Let us first show that

$$\int \Phi(x)\,dV_n(x) = \int \left\{\int \Phi(x', x'')\,dV_{n-s}(x'')\right\} dV_s(x') \tag{*}$$

if $\Phi$ is any elementary step function. If I is any n-dimensional interval then $I = I' \times I''$, where $I'$ and $I''$ are s- and (n - s)-dimensional intervals. If $\Phi$ is the characteristic function $1_I$ of I, then (*) becomes $V_n(I) = V_s(I')V_{n-s}(I'')$, which is true by definition of measure for intervals. But any elementary step function can be written as a linear combination $\Phi = c_1\Phi_1 + \cdots + c_p\Phi_p$, where each $\Phi_k$ is the characteristic function of an n-dimensional interval. Then

$$\int \Phi\,dV_n = \sum_{k} c_k \int \left\{\int \Phi_k\,dV_{n-s}\right\} dV_s = \int \left\{\int \Big(\sum_{k} c_k \Phi_k\Big)\,dV_{n-s}\right\} dV_s,$$

which is just (*).


Now let A be a compact set and f be continuous on A. For the moment, assume that $f \ge 0$. Let $F = f_A$. Let us define a monotone sequence of elementary step functions $F_1 \ge F_2 \ge \cdots$ as follows. Let $I_0$ be some interval containing A and let C be the maximum value of f on A. Let $F_1(x) = C$ for $x \in I_0$, and $F_1(x) = 0$ otherwise. Divide $I_0$ into $2^n$ congruent subintervals $I_1, \ldots, I_{2^n}$ and let $A_k = A \cap \operatorname{cl} I_k$. If $x \in I_k$ and $A_k$ is not empty, let $F_2(x)$ be the maximum value of f on $A_k$. Otherwise, let $F_2(x) = 0$. Then $F_1 \ge F_2 \ge F$. The function $F_3$ is defined similarly by dividing each interval $I_k$ into $2^n$ congruent subintervals, and so on (Fig. 5-8).

Figure 5-8

Let $d_0 = \operatorname{diam} I_0$. At the νth step the diameter $d_\nu$ of each interval is $2^{-\nu+1} d_0$, and $F_\nu(x) = 0$ except on those intervals of diameter $d_\nu$ whose closures meet A. Let us show that for every x

$$F(x) = \lim_{\nu \to \infty} F_\nu(x). \tag{**}$$

If $x \notin A$, then since A is closed there exists $\nu(x)$ such that $d_\nu < \operatorname{dist}(x, A)$ for every $\nu \ge \nu(x)$. In this case $0 = F(x) = F_\nu(x)$ when $\nu \ge \nu(x)$. Thus (**) holds for every $x \notin A$. Suppose that $x \in A$. Since f is continuous, given $\varepsilon > 0$ there exists $\delta(x) > 0$ such that $|f(y) - f(x)| < \varepsilon$ for every $y \in A$ such that $|y - x| < \delta(x)$. Choose $\nu(x)$ such that $d_\nu < \delta(x)$ for every $\nu \ge \nu(x)$. Then $F(x) \le F_\nu(x) < F(x) + \varepsilon$ for every $\nu \ge \nu(x)$. Therefore (**) also holds when $x \in A$.
By the monotone sequences theorem

$$\int F\,dV_n = \lim_{\nu \to \infty} \int F_\nu\,dV_n.$$

For each $x'$, the functions $F_\nu(x', \cdot)$ form a monotone sequence tending to $F(x', \cdot)$. Let
$$G_\nu(x') = \int F_\nu(x', \cdot)\,dV_{n-s}, \qquad G(x') = \int F(x', \cdot)\,dV_{n-s}.$$
Applying the monotone sequences theorem to the sequence $[F_\nu(x', \cdot)]$,

$$\lim_{\nu \to \infty} G_\nu(x') = G(x')$$

for every $x'$. Moreover, $G_1 \ge G_2 \ge \cdots$ Applying the monotone sequences theorem to this sequence,
$$\lim_{\nu \to \infty} \int G_\nu\,dV_s = \int G\,dV_s.$$

Since $F_\nu$ is an elementary step function, we have by (*) with $\Phi = F_\nu$,

$$\int F_\nu\,dV_n = \int G_\nu\,dV_s.$$
Therefore
$$\int F\,dV_n = \int G\,dV_s. \tag{5-23b}$$

However, $f_A = F$ and $g_R = G$. Hence (5-23b) is the same as (5-23a).


To remove the assumption $f \ge 0$, write f = (f + C) - C, where C is the maximum value of |f| on A. Then f is the difference of nonnegative continuous functions for each of which (5-23a) holds. By subtraction, (5-23a) holds for f. ∎

Corollary.
$$V_n(A) = \int_R V_{n-s}[A(x')]\,dV_s(x'). \tag{5-24}$$
Proof. Take f(x) = 1 in (5-23a). ∎
In particular, let s = 1. Writing $x' = x^1 = u$, (5-24) becomes

$$V_n(A) = \int_R V_{n-1}[A(u)]\,du. \tag{5-25}$$

The set A(u) is congruent to the intersection of A and the hyperplane $x^1 = u$. In effect, (5-25) states that $V_n(A)$ is the integral of the (n - 1)-dimensional measures of these intersections. For n = 3, this is the method of "volumes by slices" of elementary calculus.
Example 1. Let A be the n-simplex with vertices $0, ce_1, \ldots, ce_n$, where c > 0. For c = 1 this is the standard n-simplex (Section 1-4). Let us show by induction on n that $V_n(A) = c^n/n!$. If n = 1, then A = [0, c] and $V_1(A) = c$. Assuming the result in dimension n - 1, we apply (5-25). (See Fig. 5-9.) Now

$$A = \{x : x^1 + \cdots + x^n \le c,\ x^i \ge 0 \text{ for } i = 1, \ldots, n\},$$

$$A(u) = \{x'' : x^2 + \cdots + x^n \le c - u,\ x^i \ge 0 \text{ for } i = 2, \ldots, n\},$$

and R = [0, c]. Therefore

$$V_n(A) = \int_0^c \frac{(c - u)^{n-1}}{(n - 1)!}\,du = -\frac{(c - u)^n}{n(n - 1)!}\Big|_0^c = \frac{c^n}{n!}.$$
To evaluate this integral we have of course used the fundamental theorem of calculus.
A proof of this theorem is given in Section A-9.
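
The induction in Example 1 can be imitated numerically by applying (5-25) recursively. The sketch below is an illustration under stated assumptions, not the author's method: it replaces the exact integral over u by a midpoint-rule sum, and the function name `simplex_volume` and the parameter `slices` are hypothetical.

```python
import math

def simplex_volume(n, c, slices=200):
    """Approximate V_n of {x : x^i >= 0, x^1 + ... + x^n <= c} by slicing as in (5-25)."""
    if n == 1:
        return c  # V_1([0, c]) = c
    du = c / slices
    # Midpoint rule in u; the slice A(u) is an (n-1)-simplex with parameter c - u.
    return du * sum(simplex_volume(n - 1, c - (j + 0.5) * du, slices)
                    for j in range(slices))

approx = simplex_volume(3, 2.0)
exact = 2.0 ** 3 / math.factorial(3)
print(approx, exact)  # the two values agree to several decimal places
```

The cost grows like `slices` to the power n - 1, so this is only practical for small n; the closed form $c^n/n!$ is of course preferable whenever it is available.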

Figure 5-9

Example 2. Let s = n - 1. Suppose that A has the particular form

$$A = \{x : h(x') \le x^n \le H(x'),\ x' \in R\}.$$

Then
$$A(x') = [h(x'), H(x')]$$
and
$$V_n(A) = \int_R [H(x') - h(x')]\,dV_{n-1}(x').$$
For instance, if n = 3 then A is a solid bounded above by the surface with equation z = H(x, y) and below by the one with equation z = h(x, y), $(x, y) \in R$.

It is not essential in Theorem 15 that $x' = (x^1, \ldots, x^s)$. One can equally well take integers $i_1 < \cdots < i_s$ and

$$x' = (x^{i_1}, \ldots, x^{i_s}), \qquad x'' = (x^{j_1}, \ldots, x^{j_{n-s}}),$$

where $j_1 < \cdots < j_{n-s}$ are those integers between 1 and n not included among $i_1, \ldots, i_s$.

In particular, let n = 2. Then s = 1 and we can take either $x' = x$ or $x' = y$. To avoid writing parentheses we write $\int dx \int f\,dy$ instead of $\int \{\int f\,dy\}\,dx$. Then
$$\int_A f\,dV_2 = \int dx \int f\,dy = \int dy \int f\,dx,$$

where the iterated integrals are taken over the appropriate subsets of $E^1$. Many authors write the iterated integral as $\iint f\,dy\,dx$, but this notation would lead to confusion when we come to the exterior differential calculus in later chapters.
The iterated integral is usually easier to evaluate when taken in one of the
two possible orders than in the other order.
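
Equality of the two orders of integration can be checked numerically on a small case. The following sketch is hypothetical and not taken from the text: it assumes f(x, y) = xy on the triangle $\{(x, y) : 0 \le y \le x \le 1\}$, whose integral is 1/8, and approximates each one-dimensional integral by a midpoint rule; the names `midpoint`, `dy_inner`, and `dx_inner` are illustrative.

```python
def midpoint(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (j + 0.5) * h) for j in range(n))

def f(x, y):
    return x * y

def dy_inner():
    # x' = x: outer integral over x in [0, 1], inner over A(x) = [0, x]
    return midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x), 0.0, 1.0)

def dx_inner():
    # x' = y: outer integral over y in [0, 1], inner over A(y) = [y, 1]
    return midpoint(lambda y: midpoint(lambda x: f(x, y), y, 1.0), 0.0, 1.0)

print(dy_inner(), dx_inner())  # both are close to 1/8
```

The only thing that changes between the two orders is the description of the cross sections: [0, x] for fixed x, and [y, 1] for fixed y.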

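Example 3. Consider the iterated integral

$$\int_{-1}^{1} dx \int_{x^2}^{1} (x^2 + y)\,dy.$$

Then $f(x, y) = x^2 + y$ and A is as shown (Fig. 5-10). Evaluating the inner integral first, we get

$$\int_{-1}^{1} \left(x^2 y + \frac{y^2}{2}\right)\Big|_{x^2}^{1}\,dx = \int_{-1}^{1} \left(\frac{1}{2} + x^2 - \frac{3}{2}x^4\right) dx = \frac{16}{15}.$$

Writing $\int_A f\,dV_2$ as an iterated integral in the opposite order, we get

$$\int_0^1 dy \int_{-\sqrt{y}}^{\sqrt{y}} (x^2 + y)\,dx = \int_0^1 \left(\frac{x^3}{3} + xy\right)\Big|_{-\sqrt{y}}^{\sqrt{y}}\,dy = \frac{8}{3}\int_0^1 y^{3/2}\,dy = \frac{16}{15}.$$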

If n = 3, then the integral can be written as an iterated triple integral.

Example 4. Let $A = \{(x, y, z) : x \ge 0,\ z \ge 0,\ 0 \le y \le 4 - x^2 - z^2\}$. Then

$$R = \{(x, y) : x \ge 0,\ 0 \le y \le 4 - x^2\},$$

and A(x, y) is the interval $[0, \sqrt{4 - x^2 - y}]$. (See Fig. 5-11.) Writing the integral over R as an iterated integral, we get

$$\int_A f\,dV_3 = \int_0^2 dx \int_0^{4 - x^2} dy \int_0^{\sqrt{4 - x^2 - y}} f\,dz.$$

There are 5 other possible ways of writing $\int_A f\,dV_3$ as an iterated triple integral. For instance,

$$\int_A f\,dV_3 = \int_S \left\{\int_0^{4 - x^2 - z^2} f\,dy\right\} dV_2(x, z) = \int_0^2 dz \int_0^{\sqrt{4 - z^2}} dx \int_0^{4 - x^2 - z^2} f\,dy,$$

where $S = \{(x, z) : x \ge 0,\ z \ge 0,\ x^2 + z^2 \le 4\}$.

In the same way, for any n the integral can be written in n! possible ways
as an n-fold iterated integral.

Figure 5-10        Figure 5-11

Moments. Let B be a closed set, and A be a bounded measurable set. For each even integer k the kth moment of a point x about B is $[\operatorname{dist}(x, B)]^k$. The moment of A about B is
$$\int_A [\operatorname{dist}(x, B)]^k\,dV_n(x).$$

If k is odd, then we define the kth moment only in case B is a hyperplane. Let $B = \{x : a \cdot x = c\}$. We may suppose that |a| = 1. This determines a up to a change of sign. We make a particular choice for a, which amounts to choosing one of the two half spaces bounded by B to be "positive" and the other "negative." Then $\operatorname{dist}(x, B) = |a \cdot x - c|$, by Problem 6, Section 4-8. The kth moment of a point x about B is defined to be $[a \cdot x - c]^k$, and the kth moment of A about B to be

$$\int_A [a \cdot x - c]^k\,dV_n(x).$$

If k is even, this agrees with the previous formula.

Centroids. Let us denote by $m^i$ the first moment of A about the hyperplane $x^i = 0$, taking $a = e^i$, c = 0. Thus

$$m^i = \int_A x^i\,dV_n(x), \qquad i = 1, \ldots, n.$$

Let $m = (m^1, \ldots, m^n)$. The centroid of A is the point $\bar{x}$ such that $m = V_n(A)\bar{x}$, provided $V_n(A) > 0$. [Centroids of lower dimensional sets can be defined similarly by replacing n-dimensional measures and integrals by r-fold ones. See Section 7-3.] The first moment of A about any hyperplane B is then seen to be $a \cdot m - cV(A)$, which is V(A) times the first moment of $\bar{x}$ about B.
If $n \le 3$, then one may think of A as a body made of some material of density $\rho(x)$. The moments of mass are defined by inserting the factor $\rho(x)$ under the integral sign in each of the above formulas. The center of mass is the point $\hat{x}$ with

$$\hat{x}^i = \frac{\int_A x^i \rho(x)\,dV_n(x)}{\int_A \rho(x)\,dV_n(x)}, \qquad i = 1, \ldots, n.$$
If $\rho$ is constant, then $\hat{x} = \bar{x}$.

Example 5. Find the centroid of the hemispherical n-ball $H = \{x : |x| \le 1,\ x^1 \ge 0\}$. (See Fig. 5-12.) Now $V_n(H) = \alpha_n/2$, where $\alpha_n$ is the measure of the unit n-ball (Problem 7). By symmetry, $m^i = 0$ for i > 1, and

$$m^1 = \int_H x^1\,dV_n(x) = \int_0^1 u\,du \int_{H(u)} dV_{n-1}(x'') = \alpha_{n-1} \int_0^1 u(1 - u^2)^{(n-1)/2}\,du = \frac{\alpha_{n-1}}{n + 1}.$$

Thus
$$\bar{x} = \frac{2\alpha_{n-1}}{(n + 1)\alpha_n}\,e_1.$$

Figure 5-12
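
The result of Example 5 can be evaluated for specific n. The sketch below is an illustration only; it assumes the standard closed form $\alpha_n = \pi^{n/2}/\Gamma(n/2 + 1)$ for the measure of the unit n-ball (with the convention $\alpha_0 = 1$), which is not derived in this section, and the function names are hypothetical.

```python
import math

def unit_ball_measure(n):
    """alpha_n = pi**(n/2) / Gamma(n/2 + 1), assumed closed form (alpha_0 = 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def half_ball_centroid_height(n):
    """First coordinate of the centroid of {x : |x| <= 1, x^1 >= 0},
    namely 2 * alpha_{n-1} / ((n + 1) * alpha_n)."""
    return 2 * unit_ball_measure(n - 1) / ((n + 1) * unit_ball_measure(n))

for n in (1, 2, 3):
    print(n, half_ball_centroid_height(n))
```

For n = 2 and n = 3 this gives the familiar values $4/(3\pi)$ for a half disk and 3/8 for a solid hemisphere.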


PROBLEMS

1. Find the area and also the centroid of:

OCs Sen) ORIGy ei


2. Express the iterated integral
1 f(y)
[uf ” ey dx, where f(y) = min (1,log (1/y)],
0 0

as an integral over a set A C E?, and then as an iterated integral in the opposite
order. Evaluate it.
3. Express as an iterated triple integral:

i!fdV3, where A = {(a, y, 2): va a 2 <& y" <8 — (x? + Ze


A
4. Find the volume of

$$\{(x, y, z) : |x| + |y| + |z| \le 2,\ |x| \le 1,\ |y| \le 1\}.$$


5. Find the volume of
Ue ye) cP yl |Zit 02,278)
6. (a) Suppose that f(x, y) = g(x)h(y) for every $(x, y) \in A$ and that $A = R \times S$. Show that

$$\int_A f\,dV_2 = \left(\int_R g\,dx\right)\left(\int_S h\,dy\right).$$

1 1
(b) Evaluate iL dxf exp (a + y) dy.

(c) Evaluate $\int_0^{\pi} dy \int_0^{\pi/2} xy \cos(x + y)\,dx$.
7. Let $\alpha_n$ be the measure of the unit n-ball $\{x : |x| \le 1\}$. Prove by induction that $\alpha_n r^n$ is the measure of an n-ball of radius r, and $\alpha_n = 2\alpha_{n-1} \int_0^1 (1 - u^2)^{(n-1)/2}\,du$. Show that $\alpha_4 = \pi^2/2$. [In Section 5-9 we give a general formula for $\alpha_n$.]

8. Let $\Sigma$ be the standard n-simplex.
(a) Show that the centroid of $\Sigma$ is at the barycenter.
(b) Show that the second moment of $\Sigma$ about the (n - r)-dimensional plane spanned by $e_{r+1}, \ldots, e_n$ is $2r/(n + 2)!$
9. A sequence of functions $F_1, F_2, \ldots$ is said to converge uniformly to F if given $\varepsilon > 0$ there exists $\nu_0$ (depending only on $\varepsilon$) such that $|F_\nu(x) - F(x)| < \varepsilon$ for every x and $\nu \ge \nu_0$. Show that the sequence constructed in the proof of Theorem 15 converges uniformly to $f_A$ if and only if f(x) = 0 for every $x \in \operatorname{fr} A$. [Hint: f is uniformly continuous on the compact set A. Uniform continuity of a function is defined in Problem 5, Section A-8.]

5-6 THE UNBOUNDED CASE

In the previous sections the integral of a bounded function f over a bounded set A has been considered. However, in some instances the integral can be defined without these boundedness conditions. In the present section let us make the following two simplifying assumptions:
(a) f is continuous on A.
(b) There is a nondecreasing sequence of compact sets $K_1 \subset K_2 \subset \cdots$ such that $A = K_1 \cup K_2 \cup \cdots$
Any set A with property (b) is called σ-compact. In Section 5-10 the integral will be considered under the weaker assumptions that A is a measurable set and f is measurable on A. However, in all elementary examples either (a) and (b) are satisfied, or else they hold as soon as A is replaced by A - N where N is a suitable null set. Moreover, both (a) and (b) will always be satisfied in the applications made in Chapter 7 to integrals over manifolds.
The definition is given in two steps.
Step 1. Let us first assume that $f \ge 0$. For any compact set $K \subset A$ the integral $\int_K f\,dV$ exists in the sense of Section 5-4. The integral of f over A is defined as the least upper bound of such integrals:

$$\int_A f\,dV = \sup\left\{\int_K f\,dV : K \subset A\right\}. \tag{5-26}$$

If the set of numbers on the right-hand side has an upper bound, then f is integrable over A. Otherwise, the integral diverges to $+\infty$. The definition has two immediate consequences, which we state in the following form.

Lemma. Let g and h be continuous on A, with $g \ge 0$, $h \ge 0$.

(a) If $h \le g$ and g is integrable over A, then h is integrable over A.
(b) If g and h are both integrable over A, then g + h is integrable over A.

Proof of (a). For every compact set $K \subset A$,

$$\int_K h\,dV \le \int_K g\,dV \le \int_A g\,dV.$$

Hence $\int_A g\,dV$ is an upper bound for $\{\int_K h\,dV : K \subset A\}$. ∎

Proof of (b). For every compact set $K \subset A$,

$$\int_K (g + h)\,dV = \int_K g\,dV + \int_K h\,dV \le \int_A g\,dV + \int_A h\,dV.$$

Hence the right-hand side is an upper bound for $\{\int_K (g + h)\,dV : K \subset A\}$. ∎
Step 2. If a continuous function f also has negative values, then its integral is defined by writing f as the difference of two nonnegative functions. Let

$$f^+(x) = \max\{f(x), 0\}, \qquad f^-(x) = \max\{-f(x), 0\}.$$

Then $f^+$ and $f^-$ are continuous on A (Problem 9) and

$$f(x) = f^+(x) - f^-(x), \qquad |f(x)| = f^+(x) + f^-(x), \tag{5-27}$$

for every $x \in A$. (See Fig. 5-13.) The function f is called integrable over A if $f^+$ and $f^-$ are integrable over A. Its integral is

$$\int_A f\,dV = \int_A f^+\,dV - \int_A f^-\,dV. \tag{5-28}$$


If both f and A are bounded, the new definition of integral agrees with the
one in Section 5-4 (Problem 10).
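
The decomposition into positive and negative parts is simple to spot-check. The short Python sketch below is illustrative only (the helper names are hypothetical); it forms the two parts of a given function and verifies both identities in (5-27) at a few sample points.

```python
import math

def positive_part(f):
    """Return the function x -> max(f(x), 0)."""
    return lambda x: max(f(x), 0.0)

def negative_part(f):
    """Return the function x -> max(-f(x), 0)."""
    return lambda x: max(-f(x), 0.0)

f = math.sin
f_plus, f_minus = positive_part(f), negative_part(f)
for x in (-2.0, -0.5, 0.0, 1.0, 4.0):
    assert f_plus(x) - f_minus(x) == f(x)       # f = f+ - f-
    assert f_plus(x) + f_minus(x) == abs(f(x))  # |f| = f+ + f-
```

At each point at most one of the two parts is nonzero, which is why both identities hold exactly.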

Figure 5-13

Some authors assign a value $+\infty$ or $-\infty$ to the integral in case at most one of the functions $f^+$ and $f^-$ has a divergent integral. However, in no case should one try to evaluate the meaningless expression $\infty - \infty$ if both $f^+$ and $f^-$ have divergent integrals.
The lemma above has two important corollaries.

Corollary 1. Let f be continuous on A. Then f is integrable over A if and only if |f| is integrable over A.
Proof. Let f be integrable over A. Then $f^+$ and $f^-$ are integrable over A, and by (b) of the lemma so is their sum $|f| = f^+ + f^-$. Conversely, let |f| be integrable over A. Since $0 \le f^+ \le |f|$, $0 \le f^- \le |f|$, by (a) of the lemma $f^+$ and $f^-$ are integrable over A. Hence f is integrable over A. ∎
Corollary 2. (Comparison test). Let f and g be continuous on A. If $|f| \le g$ and g is integrable over A, then f is integrable over A.

Proof. In (a) of the lemma, let h = |f|. ∎

Corollary 3. Let f and g be continuous on A. If f and g are integrable over A, then f + g is integrable over A.

Proof. Note that $|f + g| \le |f| + |g|$. By Corollary 1 and (b) of the lemma, |f| + |g| is integrable over A. Apply the comparison test. ∎

The integral can be written as a limit. Let $K_1, K_2, \ldots$ be any nondecreasing sequence of compact sets such that $A = K_1 \cup K_2 \cup \cdots$ Then

$$\int_A f\,dV = \lim_{\nu \to \infty} \int_{K_\nu} f\,dV. \tag{5-29}$$

This will be proved later as a corollary to a theorem in Section 5-10. Notice that the limit is the same for all such sequences $K_1, K_2, \ldots$ If an additional mild assumption is made about the sequence $K_1, K_2, \ldots$, then (5-29) can be verified directly from the definition of integral (Problem 11).
The elementary properties of integrals listed in Theorem 13 still hold. Parts (1), (2), (4), and (5) are proved by applying Theorem 13 on each set $K_\nu$ and passing to the limit by means of (5-29). Of course, the right-hand estimate in (5) is meaningless if either f is unbounded or A has infinite measure. Part (3) is Problem 12, Section 5-2. To prove (6), let V(A) = 0. Since $K_\nu \subset A$, $V(K_\nu) = 0$. By Theorem 13 the integral of f over each $K_\nu$ is 0, and by (5-29) the integral of f over A is 0.
Let us prove (7). If A and B are σ-compact, then

$$A = K_1 \cup K_2 \cup \cdots, \qquad B = L_1 \cup L_2 \cup \cdots,$$

where $K_\nu$ and $L_\nu$ are compact for each $\nu = 1, 2, \ldots$ and $K_1 \subset K_2 \subset \cdots$, $L_1 \subset L_2 \subset \cdots$ Then $K_\nu \cup L_\nu$ is also compact for each $\nu$, and

$$(K_1 \cup L_1) \subset (K_2 \cup L_2) \subset \cdots,$$

$$A \cup B = (K_1 \cup L_1) \cup (K_2 \cup L_2) \cup \cdots$$

Hence $A \cup B$ is σ-compact. Taking $K_\nu \cap L_\nu$ instead of $K_\nu \cup L_\nu$, the same reasoning shows that $A \cap B$ is σ-compact. If $A \cap B$ is a null set, then $K_\nu \cap L_\nu$ is a null set for each $\nu$. Hence

$$\int_{K_\nu \cup L_\nu} f\,dV = \int_{K_\nu} f\,dV + \int_{L_\nu} f\,dV.$$
Taking the limit as $\nu \to \infty$,

$$\int_{A \cup B} f\,dV = \int_A f\,dV + \int_B f\,dV.$$
Note: Not every measurable set is σ-compact. For instance, it can be shown that the set of irrational numbers in [0, 1] is not σ-compact. However, it turns out that any measurable set A has a σ-compact subset B such that A - B is a null set.
Let us now consider some important particular cases.
Case 1. A is closed. For every r > 0 let $A(r) = \{x \in A : |x| \le r\}$. Each of these sets is compact. Let $r_1, r_2, \ldots$ be any nondecreasing sequence tending to $+\infty$, and $K_\nu = A(r_\nu)$. Let f be integrable over A, and let

$$\psi(r) = \int_{A(r)} f\,dV.$$

By (5-29), $\psi(r_\nu)$ tends to the integral of f over A as $\nu \to \infty$. Since this is true for every such sequence $r_1, r_2, \ldots$, we have

$$\int_A f\,dV = \lim_{r \to +\infty} \int_{A(r)} f\,dV. \tag{5-30}$$

In particular, if n = 1 and $A = [a, \infty)$, then

$$\int_a^{\infty} f\,dx = \lim_{r \to +\infty} \int_a^r f\,dx.$$

Example 1. Let $f(x) = x^{-p}$. Then

$$\int_1^\infty x^{-p}\,dx = \frac{1}{p - 1}$$

if $p > 1$. If $p \le 1$, then the integral diverges to $+\infty$.
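The limit in (5-30) can be watched numerically for Example 1. The following is a minimal sketch, not part of the text: the helper name `psi` is hypothetical, and it uses the elementary antiderivative of $x^{-p}$ on $[1, r]$ in place of a general integration routine.

```python
import math

def psi(r, p):
    """Truncated integral psi(r): the integral of x**(-p) over [1, r], in closed form."""
    if p == 1:
        return math.log(r)
    return (r ** (1 - p) - 1) / (1 - p)

# For p > 1 the values increase toward 1/(p - 1); for p <= 1 they grow without bound.
for r in (10.0, 1e3, 1e6):
    print(r, psi(r, 2.0), psi(r, 1.0), psi(r, 0.5))
```

For $p = 2$ the printed values approach 1, while for $p = 1$ and $p = 1/2$ they keep growing, in line with the dichotomy stated in the example.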

Example 2. Let $f(x) = \exp(-bx)$, $b > 0$. Then $f$ is integrable over $[a, \infty)$ for any $a$.

In these examples $f \ge 0$. When $f \ge 0$ the function $\psi$ is nondecreasing;
if $\psi(r)$ is bounded, then $\psi(r)$ tends to a finite limit equal to the integral of $f$
over $A$. If $\psi(r)$ is unbounded, then the integral of $f$ over $A$ diverges to $+\infty$.
On the other hand, if $f$ has both positive and negative values on $A$, then
there may be a finite limit in (5-30) while the integral of $|f|$ diverges to $+\infty$.
In that case the right-hand side of (5-30) defines a conditionally convergent
integral. While conditionally convergent integrals are important in some
parts of mathematical analysis (for example, in the treatment of Fourier
integrals), they are not within the scope of the Lebesgue theory. We treat them
only in Problems 5 and 8.
Case 2. Let $A = K - \{\mathbf{x}_0\}$, where $K$ is compact. The function $f$ is continuous
on $A$, but may be unbounded on any neighborhood of $\mathbf{x}_0$. For each $\delta > 0$ let
$A'(\delta) = \{\mathbf{x} \in K : |\mathbf{x} - \mathbf{x}_0| \ge \delta\}$ (Fig. 5-14). Each of the sets $A'(\delta)$ is compact,
and

$$\int_A f\,dV = \lim_{\delta \to 0^+} \int_{A'(\delta)} f\,dV. \tag{5-31}$$

Figure 5-14

The same formula holds if $f$ is continuous on $A = K - L$, where $L$ is any
closed set. In this case $A'(\delta)$ is the set of points of $K$ distant at least $\delta$ from $L$.

Example 3. Let $A = (0, 1]$ and $f(x) = x^{-p}$. Then $\int_0^1 x^{-p}\,dx = 1/(1 - p)$ if $p < 1$,
and the integral diverges to $+\infty$ if $p \ge 1$.

Note: If $A \subset [a, b]$ and $[a, b] - A$ is a null set, then we still use the
notation $\int_a^b$ for $\int_A$.
Case 3. Let $f$ be continuous on $A = K - \{\mathbf{x}_0\}$, where $K$ is closed but not
compact. Let $A_1 = \{\mathbf{x} \in K : |\mathbf{x} - \mathbf{x}_0| \ge 1\}$, $A_2 = \{\mathbf{x} \in K : 0 < |\mathbf{x} - \mathbf{x}_0| \le 1\}$.
Then $\int_{A_1} f\,dV$ and $\int_{A_2} f\,dV$ can be treated respectively as in Cases 1 and 2.
Since $A_1 \cap A_2$ is a null set, if both of these integrals exist their sum is $\int_A f\,dV$.
If either the integral of $f$ over $A_1$ or the integral over $A_2$ does not exist, then
$f$ is not integrable over $A$. To show this, suppose for instance that $f$ is not
integrable over $A_1$. By Corollary 1, $\int_{A_1} |f|\,dV$ diverges to $+\infty$. Since $A_1 \subset A$,
so does $\int_A |f|\,dV$. Hence $f$ is not integrable over $A$.

Example 3 (continued). Let $A = (0, \infty)$ and $f(x) = x^{-p}$. Taking $A_1 = [1, \infty)$,
$A_2 = (0, 1]$, the integral over $A_1$ exists only if $p > 1$, and over $A_2$ only if $p < 1$.
Hence $\int_0^\infty x^{-p}\,dx$ diverges to $+\infty$ for every $p$.

Case 4. Any open set $A$ is $\sigma$-compact (Problem 6). In most elementary
examples integrals over an open set can be treated by the methods just
described, recalling that the integral is not affected by adding or removing
any null set.
In many instances it can be shown that $f$ is integrable by comparing $|f|$
with a function $g$ known to be integrable (Corollary 2).

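Example 4. Let $A = K - \{\mathbf{x}_0\}$, where $K$ is an $n$-ball with center $\mathbf{x}_0$. Then

$$\int_A |\mathbf{x} - \mathbf{x}_0|^{-p}\,dV$$

exists if $p < n$ and diverges to $+\infty$ if $p \ge n$. This is proved in Section 5-9 by
introducing generalized spherical coordinates. Let $f(\mathbf{x}) = \phi(\mathbf{x})|\mathbf{x} - \mathbf{x}_0|^{-p}$, where $\phi$ is
continuous and $|\phi(\mathbf{x})| \le C$ for every $\mathbf{x} \in A$. Let $g(\mathbf{x}) = C|\mathbf{x} - \mathbf{x}_0|^{-p}$. By the
comparison test, $f$ is integrable over $A$ if $p < n$. If $\phi(\mathbf{x}) \ge c > 0$ for every $\mathbf{x} \in A$,
let $h(\mathbf{x}) = c|\mathbf{x} - \mathbf{x}_0|^{-p}$. If $p \ge n$, then $f \ge h$. The integral of $h$ over $A$ diverges
to $+\infty$, and hence so does the integral of $f$.
In the same way

$$\int_{E^n - K} |\mathbf{x} - \mathbf{x}_0|^{-p}\,dV$$

exists if $p > n$ and diverges to $+\infty$ if $p \le n$. A similar discussion applies.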

Example 5. Let

$$\Gamma(u) = \int_0^\infty x^{u-1} \exp(-x)\,dx, \qquad u > 0. \tag{5-32}$$

The function $\Gamma$ is called the gamma function. Let $f(x) = x^{u-1}\exp(-x)$. Since
$f(x) \le x^{u-1}$ and $p = 1 - u < 1$, $f$ is integrable over $(0, 1]$ by comparison with
$x^{-p}$. For any $b$, $x^b \exp(-x) \to 0$ as $x \to +\infty$ and therefore is bounded on $[1, \infty)$.
Letting $b = u + 1$ we see that there is a number $C$ such that $f(x) \le C/x^2$ for every
$x \in [1, \infty)$. Thus $f$ is integrable over $[1, \infty)$ and over $(0, 1]$, therefore over $(0, \infty)$.
The gamma function generalizes the factorial. First of all,

$$\Gamma(1) = \int_0^\infty \exp(-x)\,dx = 1.$$

Integrating by parts,

$$\Gamma(u + 1) = \int_0^\infty x^u \exp(-x)\,dx = -x^u \exp(-x)\Big|_0^\infty + u\int_0^\infty x^{u-1}\exp(-x)\,dx,$$

which gives

$$\Gamma(u + 1) = u\,\Gamma(u). \tag{5-33}$$

The integration by parts over $(0, \infty)$ is justified by taking it first over intervals
$[\delta, 1]$ and $[1, r]$ and letting $\delta \to 0^+$, $r \to +\infty$. In particular, if $m$ is an integer, then
$\Gamma(m + 1) = m\,\Gamma(m)$. Since $\Gamma(1) = 1$,

$$\Gamma(m + 1) = m!. \tag{5-34}$$
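Formulas (5-32) to (5-34) lend themselves to a quick numerical check. The sketch below is illustrative only: `gamma_quad` is a hypothetical helper that applies Simpson's rule to (5-32) on a truncated interval $[0, r]$, assuming $u \ge 1$ so that the integrand is bounded at 0, and the comparison values come from Python's standard `math.gamma` and `math.factorial`.

```python
import math

def gamma_quad(u, r=50.0, n=20000):
    """Simpson's rule for (5-32) on [0, r]; assumes u >= 1 and n even.

    The tail beyond r is neglected, which is harmless for r = 50."""
    h = r / n
    f = lambda x: x ** (u - 1) * math.exp(-x)
    total = f(0.0) + f(r)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

# Quadrature of (5-32) against the library value.
approx, library = gamma_quad(2.5), math.gamma(2.5)

# The functional equation (5-33) and the factorial identity (5-34).
recursion_gap = abs(math.gamma(3.5) - 2.5 * math.gamma(2.5))
factorial_ok = all(math.isclose(math.gamma(m + 1), math.factorial(m)) for m in range(1, 10))
```

The truncation at $r = 50$ mirrors the argument in the text: the integral over $(0, \infty)$ is approached through integrals over bounded intervals.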

Theorem 15 about iterated integrals can be extended as follows: Let us first
consider the case $f \ge 0$. Let $F = f_A$, $F_\nu = f_{K_\nu}$, and

$$G(\mathbf{x}') = \int F(\mathbf{x}', \cdot)\,dV_{n-s},$$

$$G_\nu(\mathbf{x}') = \int F_\nu(\mathbf{x}', \cdot)\,dV_{n-s}.$$

From Section 5-5 we know that each $G_\nu$ is measurable and

$$\int F_\nu\,dV_n = \int G_\nu\,dV_s$$

for each $\nu = 1, 2, \ldots$ The sequences $F_1, F_2, \ldots$ and $G_1, G_2, \ldots$ are
nondecreasing. Using the monotone sequences theorem (p. 190) three times, just
as in the previous section,

$$\int F\,dV_n = \int G\,dV_s,$$

which is the formula (5-23b) for iterated integrals. If either integral in (5-23b)
diverges to $+\infty$, then so does the other. Therefore, if $f \ge 0$, one way to show
that $f$ is integrable over $A$ is to show that the iterated integral exists.
Example 6. Let $f(x, y) = |x|/(1 + x^2 + y^2)^2$, $A = E^2$. By symmetry

$$\int_A f\,dV_2 = 4\int_0^\infty dy \int_0^\infty \frac{x\,dx}{(1 + x^2 + y^2)^2} = 2\int_0^\infty \frac{dy}{1 + y^2} = \pi.$$
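The value in Example 6 can be approached through the compact squares $K_\nu = [-R, R]^2$, as the theory of this section suggests. The sketch below is only an illustration under stated assumptions: `inner` uses the elementary antiderivative $-\tfrac12 (1 + x^2 + y^2)^{-1}$ for the $x$-integral, and `truncated` (a hypothetical name) applies a midpoint rule in $y$.

```python
import math

def inner(y, R):
    """Closed form of the integral of x / (1 + x^2 + y^2)^2 over 0 <= x <= R."""
    return 0.5 * (1.0 / (1.0 + y * y) - 1.0 / (1.0 + R * R + y * y))

def truncated(R, n=20000):
    """Integral of f over the square [-R, R]^2: four times a midpoint rule in y on [0, R]."""
    h = R / n
    return 4.0 * h * sum(inner((k + 0.5) * h, R) for k in range(n))

# Nondecreasing in R, since f >= 0, and tending to pi.
values = [truncated(R) for R in (1.0, 10.0, 100.0, 1000.0)]
```

Because $f \ge 0$, the truncated integrals increase with $R$, and the existence of the iterated integral is what guarantees integrability over $E^2$.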

Now let $f$ be any continuous function integrable over $A$. The iterated
integrals theorem applies to $f^+$ and $f^-$, and hence by subtraction to $f$. Caution: If
$f$ has both positive and negative values on $A$, then one cannot conclude that $f$
is integrable from the fact that the iterated integral exists.

Example 7. Let $f(x, y) = y^{-1}\cos x$, $A = [0, \pi] \times (0, 1]$. Then

$$\int_0^1 dy \int_0^\pi y^{-1}\cos x\,dx = \int_0^1 0\,dy = 0.$$

The iterated integral in the opposite order does not exist, hence $f$ is not integrable
over $A$.
Note: In the discussion of iterated integrals above, the integral of $f(\mathbf{x}', \cdot)$
over $A(\mathbf{x}')$ is in the sense of the present section. However, the function $g$ in
(5-23a) need not be continuous on $R$. In fact, $g(\mathbf{x}')$ may be $+\infty$ for certain
values of $\mathbf{x}'$. If $g$ is discontinuous and unbounded on $R$, then its integral has
to be taken in the sense described in Section 5-10.

PROBLEMS
1. Determine whether the integral exists or is divergent to $+\infty$.
2

(a) / yo [Hint: Let (x) = 1/2 + 1]

us /2

(b) |sin z| ” dx. [Hint: x/sinz > lasz > 01]


—1/2

(c) $\int_0^\infty P(x)\exp(-cx)\,dx$, $P$ a polynomial, $c > 0$.

dx
(d) / FS oer Tees
ees Eel)
Fae
(e) / mine daz.

Daechow thatet(2,4,2)00 <2 — (2-- y7)/ry, 0 < « < 1,0 < y X< 1} has in-
finite volume.
3. Show that the volume of $\{(x, y, z) : 0 \le z \le |xy| \exp(-x^2 - y^2)\}$ is 1.
4. Show that ff(2, y) dVo(a, y) exists if f is bounded, continuous, and |f(z, y)| <
Care ie) aarp 2:
5. If $\lim_{r \to \infty} \int_{-r}^{r} f(x)\,dx$ exists but $f$ is not integrable on $E^1$, then this limit is called
the Cauchy principal value. Find the Cauchy principal value:
(a) fa) = 2/7 2). (c) Any odd function f, f(—z) = —f(z)
(b) f(z) = x +1/(1+ 2”). for every «.

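6. Let $A$ be open. Let

$$K_\nu = \{\mathbf{x} : |\mathbf{x}| \le r_\nu,\ \operatorname{dist}(\mathbf{x}, E^n - A) \ge \delta_\nu\},$$

where $r_1, r_2, \ldots$, $\delta_1, \delta_2, \ldots$ are monotone sequences tending respectively to
$+\infty$ and 0. Show that each $K_\nu$ is compact and that $K_1 \subset K_2 \subset \cdots$,
$A = K_1 \cup K_2 \cup \cdots$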
7. Show that each of the following integrals over $E^n$ converges.

(a) $\int (|\mathbf{x}|^2 + 1)^{-p/2}\,dV(\mathbf{x})$, $p > n$.


(b) if\x|~*!
dV(x).
8. Let $f(x) = (-1)^m/m$ if $x \in [m, m + 1)$, $m = 1, 2, \ldots$
(a) Show that $\lim_{r \to \infty} \int_1^r f(x)\,dx$ exists and equals $-1 + \tfrac12 - \tfrac13 + \cdots$
(b) Let K, = (Ux-1 [2k — 1, 2k]) U (U7: (27, 22+ 1]). Show that

$$\lim_{\nu \to \infty} \int_{K_\nu} f(x)\,dx \ne \lim_{r \to \infty} \int_1^r f(x)\,dx.$$

9. (a) Assume that for every real number $c$, both of the sets $\{\mathbf{x} \in A : f(\mathbf{x}) < c\}$
and $\{\mathbf{x} \in A : f(\mathbf{x}) > c\}$ are open relative to $A$. Show that $f$ is continuous
on $A$. [Hint: Show that $\{\mathbf{x} \in A : c < f(\mathbf{x}) < d\}$ is open relative to $A$.]
(b) Using (a) and the Corollary to Proposition A-6 (Section A-6), prove: if $f$ is
continuous on $A$, then $f^+$ is continuous on $A$.
10. Let $A$ be a bounded, $\sigma$-compact set, and let $f$ be bounded and continuous on $A$.
Show that $\int_A f\,dV$, as defined by (5-28), is the same as in Section 5-4. [Hint:
Suppose first that $f \ge 0$. For any compact set $K \subset A$, we have, taking the integral
over $A$ in the previous sense, $0 \le \int_A f\,dV - \int_K f\,dV \le C\,V(A - K)$ provided
$f(\mathbf{x}) \le C$ for every $\mathbf{x} \in A$. But $V(A - K) = V(A) - V(K)$ is arbitrarily
small. Apply this to $f^+$, $f^-$, and subtract.]
11. Let $K_1, K_2, \ldots$ be a nondecreasing sequence of compact sets, and $A =
K_1 \cup K_2 \cup \cdots$ Assume that if $K$ is any compact subset of $A$, then $K \subset K_\nu$ for
some $\nu$ (sufficiently large). Prove (5-29). [Hint: Suppose first that $f \ge 0$, the
integral being then given by (5-26).]
12. (Difficult.) Let $f$ be continuous on an open set $D$. Assume that the integrals of
$f^+$ and $f^-$ over $D$ both diverge to $+\infty$. Show that given any number $l$ there is a
sequence of compact sets $K_1 \subset K_2 \subset \cdots$ such that $D = K_1 \cup K_2 \cup \cdots$ and
$\lim_{\nu \to \infty} \int_{K_\nu} f\,dV = l$.

5-7 CHANGE OF MEASURE UNDER AFFINE TRANSFORMATIONS


Our next objective is to give formulas which describe how measure and
integrals change under a regular transformation g. In this section we consider
the special case when g is affine and prove the formula for the measure of g(A)
when A is a compact set. This special result is then used in the proof of the
general transformation formula (5-38).
Let $g$ be an affine transformation from $E^n$ into $E^n$. According to
Section 4-2, there exist $\mathbf{x}_0$ and a linear transformation $L$ such that

$$g(\mathbf{t}) = L(\mathbf{t}) + \mathbf{x}_0$$

for every $\mathbf{t} \in E^n$.

Theorem 16. For every compact set K,

V[g(K)] = |det L|V(K). (5-35)

Proof. First of all, (5-35) is true for some particular kinds of affine trans-
formations. Certain elementary details of the proofs are left to the reader
(Problem 4).
(1) g is a translation, L = I. Then V[g(A)] = V(A) for any figure A,
and hence for any compact set.
From (1) we may assume from now on that $\mathbf{x}_0 = \mathbf{0}$, $g = L$.
(2) For some $k$ and $l$, the transformation $L$ merely interchanges the
components $t^k$ and $t^l$ of $\mathbf{t}$. Then $V[L(A)] = V(A)$ for every figure, and hence for
every compact set.
(3) The matrix of $L$ is diagonal. In this case $L^i(\mathbf{t}) = c_i^i t^i$, where $c_1^1, \ldots, c_n^n$
are the diagonal elements, and $\det L = c_1^1 \cdots c_n^n$. If $A$ is a figure, then
$V[L(A)] = |\det L|\,V(A)$. The same is then true if $A$ is any compact set.
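(4) There exist $k$ and $l$, $k \ne l$, such that

$$L^k(\mathbf{t}) = t^k + c\,t^l, \qquad L^i(\mathbf{t}) = t^i \ \text{ for } i \ne k.$$

Let us for notational simplicity take $l = 1$. Let $Q = L(K)$, and using
the notation on p. 158 let $Q(u) = \{\mathbf{x}'' : (u, \mathbf{x}'') \in Q\}$, $K(u) = \{\mathbf{t}'' : (u, \mathbf{t}'') \in K\}$.
If $\mathbf{x} = L(\mathbf{t})$, then $\mathbf{x}'' = c\,t^1 \mathbf{e}_k + \mathbf{t}''$. Hence $Q(u)$ is just $K(u)$ translated by
$c\,u\,\mathbf{e}_k$. Therefore

$$V_n(K) = \int V_{n-1}[K(u)]\,du = \int V_{n-1}[Q(u)]\,du = V_n(Q).$$

Since $\det L = 1$, this proves (5-35) for linear transformations of type (4).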
Next, we observe that if $M$ and $N$ are any two linear transformations for
which the theorem holds, then

$$V[(M \circ N)(K)] = |\det M|\,V[N(K)] = |\det M|\,|\det N|\,V(K),$$

and $(\det M)(\det N) = \det M \circ N$. Hence the theorem also holds for their
composite $M \circ N$.
If $N$ has row covectors $\mathbf{w}^1, \ldots, \mathbf{w}^n$ and $M$ is of type (4), then the $k$th
row covector of $M \circ N$ is $\mathbf{w}^k + c\,\mathbf{w}^l$ and the others are unchanged. The $l$th
column vector of $N \circ M$ is $\mathbf{v}_l + c\,\mathbf{v}_k$ and the others are unchanged, where
$\mathbf{v}_1, \ldots, \mathbf{v}_n$ are the column vectors of $N$. A transformation $M$ of type (2) when
applied on the left interchanges the $k$th and $l$th row covectors of $N$, and when
applied on the right interchanges the $k$th and $l$th column vectors. Moreover,
the inverse of a transformation of type (2) or (4) is of the same type.
Now let $L$ be any linear transformation. Then

$$L = M_p \circ \cdots \circ M_1,$$

where $M_1, \ldots, M_p$ are of types (2), (3), and (4). Since Theorem 16 holds for
each $M_k$, it is true for $L$.
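The bookkeeping in this last step can be seen on a small case. The sketch below is illustrative only: it builds one transformation of each of the types (2), (3), and (4) on $E^3$, with arbitrarily chosen entries, composes them, and compares $|\det L|$ with the product of the individual factors. The helper names `matmul` and `det3` are hypothetical.

```python
def matmul(a, b):
    """Matrix of the composite of the linear maps with matrices a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # type (2): interchanges t^1 and t^2
scale = [[2, 0, 0], [0, -3, 0], [0, 0, 5]]  # type (3): diagonal
shear = [[1, 0, 0], [0, 1, 0], [4, 0, 1]]   # type (4): L^3(t) = t^3 + 4 t^1

L = matmul(shear, matmul(scale, swap))
factors = [abs(det3(m)) for m in (shear, scale, swap)]  # measure scale factor of each step
```

Each step multiplies measure by its own factor (1 for the interchange and the shear, the product of the diagonal entries in absolute value for the diagonal map), and the product agrees with `abs(det3(L))`.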

Corollary. If $g$ is an isometry of $E^n$, then $V[g(K)] = V(K)$.

Proof. For then $\det L = \pm 1$.

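Let us apply Theorem 16 to calculate the measures of simplexes and
parallelepipeds.

Measure of an n-simplex. Let $S$ be an $n$-simplex with vertices $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$
(see p. 20). Let $\mathbf{v}_i = \mathbf{x}_i - \mathbf{x}_0$. The vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent.
Let $L$ be the linear transformation with $\mathbf{v}_1, \ldots, \mathbf{v}_n$ as column vectors, and
let $g(\mathbf{t}) = L(\mathbf{t}) + \mathbf{x}_0$. As before, let $\Sigma$ be the standard $n$-simplex. If $\mathbf{t} =
(t^1, \ldots, t^n) \in \Sigma$, let $t^0 = 1 - (t^1 + \cdots + t^n)$ and let $\mathbf{x} = g(\mathbf{t})$. Then

$$\mathbf{x} = \mathbf{x}_0 + \sum_{i=1}^{n} t^i(\mathbf{x}_i - \mathbf{x}_0) = \sum_{i=0}^{n} t^i \mathbf{x}_i.$$

Hence $\mathbf{x} \in S$ and $t^0, t^1, \ldots, t^n$ are its barycentric coordinates. Conversely,
every $\mathbf{x} \in S$ is obtained in this way. Thus $S = g(\Sigma)$. By Example (1),
Section 5-5, $V(\Sigma) = 1/n!$. Hence, from (5-35),

$$V(S) = |\det L|/n!. \tag{5-36}$$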


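Measure of an n-parallelepiped. Given $\mathbf{x}_0$ and linearly independent
vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$, let

$$P = \Big\{\mathbf{x} : \mathbf{x} = \mathbf{x}_0 + \sum_{i=1}^{n} t^i \mathbf{v}_i,\ 0 \le t^i \le 1,\ i = 1, \ldots, n\Big\}.$$

Then $P$ is the $n$-parallelepiped spanned by $\mathbf{v}_1, \ldots, \mathbf{v}_n$ with $\mathbf{x}_0$ as vertex
(Fig. 5-15). Let $I_0$ be the unit $n$-cube. Then $P = g(I_0)$, where $g$ is as before.
Since $V(I_0) = 1$,

$$V(P) = |\det L|. \tag{5-37}$$

If $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly dependent, then $P$ is called a degenerate $n$-parallelepiped.
In that case $V(P) = 0$.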
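Formulas (5-36) and (5-37) translate directly into a short computation. The following is a sketch under stated assumptions, not the book's procedure: the function names are hypothetical, vertices are given as lists of exact rationals or integers, and the determinant is found by Gaussian elimination over `fractions.Fraction`.

```python
from fractions import Fraction
from math import factorial

def det(rows):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # linearly dependent columns
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return sign * result

def edge_matrix(vertices):
    """Matrix L whose column vectors are v_i = x_i - x_0."""
    x0, rest = vertices[0], vertices[1:]
    cols = [[xi - x0i for xi, x0i in zip(x, x0)] for x in rest]
    return [list(row) for row in zip(*cols)]

def simplex_measure(vertices):
    """V(S) = |det L| / n!, formula (5-36)."""
    n = len(vertices) - 1
    return abs(det(edge_matrix(vertices))) / factorial(n)

def parallelepiped_measure(x0, vectors):
    """V(P) = |det L|, formula (5-37); returns 0 in the degenerate case."""
    return abs(det([list(row) for row in zip(*vectors)]))
```

For instance, `simplex_measure([[0, 0], [4, 0], [0, 3]])` is the area of a right triangle with legs 4 and 3, and the vertex `x0` does not affect `parallelepiped_measure`, in agreement with step (1) of the proof of Theorem 16.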
Figure 5-15

PROBLEMS
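1. Find the volume of the tetrahedron with vertices $\mathbf{e}_1$, $-\mathbf{e}_2$, $\mathbf{e}_3$, $\mathbf{e}_1 + 2\mathbf{e}_2 + \mathbf{e}_3$.
2. Find the area of the parallelogram with vertices $\mathbf{e}_1 - \mathbf{e}_2$, $2\mathbf{e}_1 + \mathbf{e}_2$, $-2\mathbf{e}_1$, $-\mathbf{e}_1 + 2\mathbf{e}_2$.
3. Let $S$ be an $n$-simplex, with vertices $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$. Show that

$$V(S) = \frac{1}{n!}\left|\det\begin{pmatrix} x_0^1 & x_1^1 & \cdots & x_n^1 \\ \vdots & \vdots & & \vdots \\ x_0^n & x_1^n & \cdots & x_n^n \\ 1 & 1 & \cdots & 1 \end{pmatrix}\right|.$$

[Hint: Subtract the first column from each of the other columns. This does not
change the determinant.]
4. Complete the details of steps (1), (2), and (3) of the proof of Theorem 16.
5. Let $(c_{ij})$ be a positive definite symmetric $n \times n$ matrix, and let $\lambda_1, \ldots, \lambda_n$ be its
characteristic values. Show that

$$V\Big(\Big\{\mathbf{x} : \sum_{i,j=1}^{n} c_{ij}x^i x^j \le 1\Big\}\Big) = \alpha_n/\sqrt{\lambda_1 \cdots \lambda_n},$$

where $\alpha_n$ is the measure of the unit $n$-ball. [Hint: See Theorem 11.]
6. Show that if $|\mathbf{v}_i| \le C$ for each $i = 1, \ldots, n$ and $P$ is a parallelepiped spanned by
$\mathbf{v}_1, \ldots, \mathbf{v}_n$, then $V(P) \le C^n$. [Hint: Use induction, a suitable rotation of $E^n$,
and the method of slices.]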

5-8 TRANSFORMATION OF INTEGRALS


Let $g$ be a regular transformation with domain an open set $\Delta \subset E^n$.
Regularity means that $g$ is of class $C^{(1)}$ and has an inverse $g^{-1}$ of class $C^{(1)}$ (p. 115).
Let $f$ be continuous on the open set $D = g(\Delta)$. The object of this section is
to prove the formula (5-38) which expresses the integral of $f$ over a set $A \subset D$
as an integral over the corresponding subset $B = g^{-1}(A)$ of $\Delta$.

Figure 5-16

Let us first give an imprecise derivation of the transformation formula
(5-38). (See Fig. 5-16.) For the moment, let $B$ be compact, and let $Z$ be a
figure approximating $B$ from without. Let $I \subset Z$ and $\mathbf{t}_0$ be a point of $I$. Let

$$\mathbf{x}_0 = g(\mathbf{t}_0), \qquad L = Dg(\mathbf{t}_0).$$

The affine approximation $G$ to $g$ at $\mathbf{t}_0$ is given by

$$G(\mathbf{t}) = \mathbf{x}_0 + L(\mathbf{t} - \mathbf{t}_0).$$

If $I$ is small, then $G(I)$ and $g(I)$ nearly coincide. Hence $V[g(I)]$ is nearly
$V[G(I)]$, which by Theorem 16 equals $|\det L|\,V(I)$. Since $f$ is continuous, its
integral over $g(I)$ is approximately $f(\mathbf{x}_0)V[g(I)]$, and since $\det L$ is the
Jacobian $Jg(\mathbf{t}_0)$,

$$\int_{g(I)} f\,dV \approx f[g(\mathbf{t}_0)]\,|Jg(\mathbf{t}_0)|\,V(I).$$

Let

$$\phi(\mathbf{t}) = f[g(\mathbf{t})]\,|Jg(\mathbf{t})|.$$

Since $f$ is continuous and $g$ is of class $C^{(1)}$, the composite $f \circ g$ is continuous
and $|Jg|$ is continuous. Hence $\phi$ is continuous. The integral of $\phi$ over $I$ is
approximately $\phi(\mathbf{t}_0)V(I)$, which is the right-hand side of the above expression.
The figure $Z$ is the union of small intervals $I_1, \ldots, I_p$. Let $\mathbf{t}_k \in I_k$. We
should have approximately

$$\int_A f\,dV \approx \sum_{k=1}^{p} \phi(\mathbf{t}_k)V(I_k) \approx \int_B \phi\,dV.$$

This suggests the formula

$$\int_A f\,dV = \int_B \phi\,dV.$$
The following theorem states that the formula is correct, and not merely for
compact sets.
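The key approximation in this derivation, $V[g(I)] \approx |Jg(\mathbf{t}_0)|\,V(I)$ for small $I$, can be checked by hand for the polar map $g(r, \theta) = (r\cos\theta, r\sin\theta)$, whose Jacobian is $r$. The sketch below is an illustration under that choice of $g$, which is an assumption of the example and not part of the derivation; the function names are hypothetical. The image of a small square in the $(r, \theta)$-plane is an annular sector whose area is known exactly.

```python
def sector_area(r0, h):
    """Exact area of g(I) for the square I = [r0, r0 + h] x [theta0, theta0 + h]
    under the polar map; it does not depend on theta0."""
    return 0.5 * ((r0 + h) ** 2 - r0 ** 2) * h

def affine_estimate(r0, h):
    """|Jg(t0)| V(I), with Jg(r, theta) = r evaluated at the corner t0 = (r0, theta0)."""
    return r0 * h * h

# The ratio V[g(I)] / (|Jg(t0)| V(I)) equals 1 + h / (2 r0) and tends to 1 as h -> 0.
ratios = [sector_area(1.0, h) / affine_estimate(1.0, h) for h in (0.5, 0.1, 0.01)]
```

The ratio tends to 1 as the side length shrinks, which is the content of Lemma 1 below in this special case.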
Theorem 17. Let $g$ be a regular transformation from $\Delta$ onto $D$. Let $f$ be
continuous on $D$ and $A$ be any $\sigma$-compact subset of $D$. Then

$$\int_A f(\mathbf{x})\,dV(\mathbf{x}) = \int_{g^{-1}(A)} f[g(\mathbf{t})]\,|Jg(\mathbf{t})|\,dV(\mathbf{t}), \tag{5-38}$$

provided either integral exists.

More generally, Theorem 17 is true if $A \subset D$ is measurable and $f$ is any
function measurable on $D$. We shall not prove this.

Proof of Theorem 17. If $A$ is compact, then $g^{-1}(A)$ is compact since $g^{-1}$
is continuous. Since $f$ and $\phi$ are continuous, both integrals exist. The theorem
states that they are equal. Suppose for the moment that the theorem is known
for compact sets. If $A$ is $\sigma$-compact, then $A$ is the union of a nondecreasing
sequence $K_1, K_2, \ldots$ of compact sets. Let $L_\nu = g^{-1}(K_\nu)$. Then $B = g^{-1}(A)$
is the union of the nondecreasing sequence $L_1, L_2, \ldots$ of compact sets. If
$f \ge 0$, then $\phi \ge 0$ and

$$\int_A f\,dV = \lim_{\nu \to \infty} \int_{K_\nu} f\,dV,$$

$$\int_B \phi\,dV = \lim_{\nu \to \infty} \int_{L_\nu} \phi\,dV.$$

The right-hand sides are equal. If either integral diverges to $+\infty$, then so does
the other. Thus (5-38) holds if $f \ge 0$. In the general case, write $f = f^+ - f^-$,
$\phi = \phi^+ - \phi^-$. Since $|Jg| > 0$, we have

$$\phi^+ = (f^+ \circ g)\,|Jg|, \qquad \phi^- = (f^- \circ g)\,|Jg|.$$

If either $f$ or $\phi$ is integrable, then so is the other and (5-38) holds.
It remains to prove (5-38) in case $A$ is compact. We may assume that
$f \ge 0$. An indirect proof of (5-38) will be given. For this purpose let us
prove three lemmas. In these lemmas $I$ denotes an $n$-dimensional cube. It
is convenient to take $I$ half-open to the right. Then $I = \operatorname{cl} I - N$, where
$N$ is a certain compact set composed of $(n - 1)$-dimensional faces of $I$. Hence
$g(I) = g(\operatorname{cl} I) - g(N)$ is the difference of compact sets, and so is measurable.
Let $l$ be the side length of $I$, $d$ its diameter, and $I'$ the concentric closed $n$-cube
of side length $(1 + \tau)l$. Let $G$, $L$, $\mathbf{x}_0$ be as defined at the
beginning of the section.

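Lemma 1. Let $\mathbf{t}_0 \in \Delta$ and $\tau > 0$ be given. Then $\mathbf{t}_0$ has a neighborhood
$\Omega \subset \Delta$ such that $g(I) \subset G(I')$ for every $n$-cube $I \subset \Omega$ with $\mathbf{t}_0 \in \operatorname{cl} I$.
(See Fig. 5-17.)

Figure 5-17

Proof. If $\mathbf{x} = G(\mathbf{s})$, $\mathbf{y} = G(\mathbf{t})$, then $\mathbf{x} - \mathbf{y} = L(\mathbf{s} - \mathbf{t}_0) - L(\mathbf{t} - \mathbf{t}_0)$.
Since $L$ is linear, $\mathbf{x} - \mathbf{y} = L(\mathbf{s} - \mathbf{t})$. Let $C = \|L^{-1}\|$ (p. 103). Then
$\mathbf{s} - \mathbf{t} = L^{-1}(\mathbf{x} - \mathbf{y})$, and

$$|\mathbf{s} - \mathbf{t}| \le C\,|\mathbf{x} - \mathbf{y}|. \tag{*}$$

Let $\sigma = \tau/(2\sqrt{n}\,C)$. Since $g$ is differentiable, $\mathbf{t}_0$ has a neighborhood
$\Omega \subset \Delta$ such that

$$|g(\mathbf{t}) - G(\mathbf{t})| \le \sigma\,|\mathbf{t} - \mathbf{t}_0| \tag{**}$$

for every $\mathbf{t} \in \Omega$.
Let $\mathbf{x} \in g(I)$ and $\mathbf{s} = G^{-1}(\mathbf{x})$. Then $\mathbf{x} = G(\mathbf{s}) = g(\mathbf{t})$ for some $\mathbf{t} \in I$.
By (*) and (**)

$$|\mathbf{s} - \mathbf{t}| \le C\sigma\,|\mathbf{t} - \mathbf{t}_0|.$$

Since $\mathbf{t}_0 \in \operatorname{cl} I$, $|\mathbf{t} - \mathbf{t}_0| \le d$. Since $d = \sqrt{n}\,l$,

$$|\mathbf{s} - \mathbf{t}| \le C\sigma\sqrt{n}\,l = \tau l/2.$$

This implies that $\mathbf{s} \in I'$, and $\mathbf{x} \in G(I')$.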
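Lemma 2. Let $\mathbf{t}_0 \in \Delta$ and $\epsilon > 0$ be given. Then $\mathbf{t}_0$ has a neighborhood
$\Omega_1 \subset \Delta$ such that

$$\int_{g(I)} f\,dV \le \int_I \phi\,dV + \epsilon V(I)$$

for every $n$-cube $I \subset \Omega_1$ with $\mathbf{t}_0 \in \operatorname{cl} I$.

Proof. Let

$$a = f(\mathbf{x}_0) = f[g(\mathbf{t}_0)], \qquad b = |Jg(\mathbf{t}_0)|.$$

Let $\psi(\xi, \tau) = (a + \xi)\,b\,(1 + \tau)^n - (a - \xi)(b - \xi)$. Then $\psi(0, 0) = 0$ and
$\psi$ is continuous. Hence $\psi(\xi, \tau) < \epsilon$ for every $(\xi, \tau)$ in some neighborhood $V$
of $(0, 0)$. Choose some $\xi > 0$, $\tau > 0$ small enough that $(\xi, \tau) \in V$. Then

$$(a + \xi)\,b\,(1 + \tau)^n < (a - \xi)(b - \xi) + \epsilon. \tag{*}$$

Since $f$ is continuous, there is a neighborhood $U$ of $\mathbf{x}_0$ such that
$a - \xi < f(\mathbf{x}) < a + \xi$ for every $\mathbf{x} \in U$. Since $g$ and $|Jg|$ are continuous,
there is a neighborhood $\Omega_1$ of $\mathbf{t}_0$ such that $g(\mathbf{t}) \in U$ and $b - \xi < |Jg(\mathbf{t})|$ for
every $\mathbf{t} \in \Omega_1$. We may assume that $\Omega_1 \subset \Omega$, where $\Omega$ is as in Lemma 1.
Then

$$\int_{g(I)} f\,dV \le (a + \xi)\,V[g(I)],$$

and by Lemma 1, $V[g(I)] \le V[G(I')]$. By Theorem 16,

$$V[G(I')] = b\,V(I') = b\,(1 + \tau)^n\,V(I).$$

Therefore

$$\int_{g(I)} f\,dV \le (a + \xi)\,b\,(1 + \tau)^n\,V(I).$$

On the other hand,

$$\int_I \phi\,dV \ge (a - \xi)(b - \xi)\,V(I).$$

These two inequalities and (*) give Lemma 2.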

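Lemma 3. Let $I \subset \Delta$ be an $n$-cube. Then

$$\int_{g(I)} f\,dV \le \int_I \phi\,dV.$$

Proof. Suppose this is false for some $n$-cube $I^0$. Then

$$c = \int_{g(I^0)} f\,dV - \int_{I^0} \phi\,dV$$

is positive. Divide $I^0$ into $m = 2^n$ congruent $n$-cubes $I_1, \ldots, I_m$, half-open
to the right. Since $I_1, \ldots, I_m$ are disjoint and $g$ is univalent, the sets
$g(I_1), \ldots, g(I_m)$ are disjoint. Hence

$$\int_{I^0} \phi\,dV = \sum_{j=1}^{m} \int_{I_j} \phi\,dV,$$

$$\int_{g(I^0)} f\,dV = \sum_{j=1}^{m} \int_{g(I_j)} f\,dV.$$

For at least one $j$ we must have

$$\int_{g(I_j)} f\,dV - \int_{I_j} \phi\,dV \ge \frac{c}{m}.$$

Choose some such $j$ and let $I^1 = I_j$. By dividing $I^1$ into $2^n$ congruent $n$-cubes
and repeating the argument, we obtain $I^2$. Continuing, we obtain a sequence
of $n$-cubes $I^1 \supset I^2 \supset \cdots$ such that

$$\int_{g(I^l)} f\,dV - \int_{I^l} \phi\,dV \ge \frac{c}{m^l}, \qquad l = 1, 2, \ldots,$$

and the diameter of $I^l$ tends to 0 as $l \to \infty$. By Theorem A-3, $(\operatorname{cl} I^1) \cap
(\operatorname{cl} I^2) \cap \cdots$ contains just one point $\mathbf{t}_0$. Let $0 < \epsilon < c/V(I^0)$. Then

$$\int_{g(I^l)} f\,dV - \int_{I^l} \phi\,dV \ge \frac{c}{m^l} = \frac{c\,V(I^l)}{V(I^0)} > \epsilon V(I^l).$$

If $\Omega_1$ is as in Lemma 2, then $I^l \subset \Omega_1$ for large enough $l$ and we obtain a
contradiction.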

To complete the proof of Theorem 17, let $A$ be compact and $B = g^{-1}(A)$.
Let $Z_0$ be some $n$-cube, half-open to the right, containing $B$. Divide $Z_0$ into
$2^n$ congruent $n$-cubes and let $Z_1$ be the union of those which meet $B$. Then
divide each $n$-cube of $Z_1$ into $2^n$ congruent $n$-cubes, and let $Z_2$ be the union
of those meeting $B$, and so on. Then $Z_1 \supset Z_2 \supset \cdots$ and their intersection
is $B$. There exists $\nu_0$ such that $Z_\nu \subset \Delta$ for all $\nu \ge \nu_0$.
Applying Lemma 3 to each of these congruent disjoint $n$-cubes comprising
$Z_\nu$, and adding, we get

$$\int_A f\,dV \le \int_{g(Z_\nu)} f\,dV \le \int_{Z_\nu} \phi\,dV$$

for each $\nu \ge \nu_0$. Moreover,

$$\int_B \phi\,dV = \lim_{\nu \to \infty} \int_{Z_\nu} \phi\,dV.$$

This fact will be proved in the section on convergence theorems [Corollary 5(b),
Section 5-10]. Thus

$$\int_A f\,dV \le \int_B \phi\,dV.$$

But $g^{-1}$ is also a regular transformation. Interchanging the roles of $A$
and $B$,

$$\int_B \phi\,dV \le \int_A \phi[g^{-1}(\mathbf{x})]\,|Jg^{-1}(\mathbf{x})|\,dV(\mathbf{x}).$$

But $|Jg^{-1}(\mathbf{x})| = |Jg(\mathbf{t})|^{-1}$ if $\mathbf{x} = g(\mathbf{t})$, and

$$\phi[g^{-1}(\mathbf{x})]\,|Jg^{-1}(\mathbf{x})| = f(\mathbf{x}).$$

Therefore

$$\int_B \phi\,dV \le \int_A f\,dV,$$

which proves (5-38) for the case of compact sets. This completes the proof
of Theorem 17.

In particular, taking $f(\mathbf{x}) = 1$ we have:

Corollary 1.

$$V(A) = \int_{g^{-1}(A)} |Jg(\mathbf{t})|\,dV(\mathbf{t}). \tag{5-39}$$

Corollary 2. If $B$ is a null set, then $g(B)$ is a null set.

Corollary 3. If $A$ is a measurable subset of an $r$-manifold $M$, where $r \le n - 1$,
then $A$ is a null set.

Proof of Corollary 3. By the implicit function theorem (Section 4-6),
any $\mathbf{x}_0 \in M$ has a neighborhood $U$ such that $U \cap M = g(R)$, where $R$ is a
relatively open subset of an $r$-dimensional vector subspace. Then $R$ is a
$\sigma$-compact null set. By Corollary 2, $U \cap M$ is a null set.
If $K \subset M$ and $K$ is compact, then there exist a finite number of such
neighborhoods $U_1, \ldots, U_m$ such that $K \subset (U_1 \cup \cdots \cup U_m) \cap M$. Hence
$K$ is a null set. If $A$ is any measurable set, then $V(A) = \sup\{V(K) : K \subset A\}$.
If $A \subset M$, then $A$ is a null set.

Figure 5-18
Example 1. Let $A = \{(x, y) : x > 0,\ y > 0,\ 0 < xy < 3,\ x < y < 2x\}$, $f(x, y) = y^2$,
$g(s, t) = \sqrt{s/t}\,\mathbf{e}_1 + \sqrt{st}\,\mathbf{e}_2$ for $s > 0$, $t > 0$. We show that $g$ is univalent by solving
the equations

$$x = \sqrt{s/t}, \qquad y = \sqrt{st}$$

explicitly, and find that $s = xy$, $t = y/x$. Since $Jg(s, t) = 1/2t \ne 0$, $g$ is regular.
Moreover, the part in $A$ of the hyperbola $xy = c$, $0 < c < 3$, corresponds to the
segment $s = c$, $1 < t < 2$ in $B$. Hence $B$ is as shown in Fig. 5-18, and

$$\int_A y^2\,dV_2(x, y) = \frac{1}{2}\int_1^2 dt \int_0^3 s\,ds = \frac{9}{4}.$$
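Both sides of (5-38) can be approximated independently for Example 1. The sketch below is only a numerical illustration with hypothetical function names: `lhs` applies a midpoint rule to $y^2$ over $A$ using the indicator of $A$ on the bounding box $[0, \sqrt3] \times [0, \sqrt6]$, and `rhs` applies a midpoint rule to $\phi(s, t) = f[g(s, t)]\,|Jg(s, t)|$ over $B = (0, 3) \times (1, 2)$.

```python
import math

def lhs(n=600):
    """Midpoint rule for the integral of y^2 over A, via the indicator of A.

    On A we have x^2 < xy < 3 and y^2 / 2 < xy < 3, which gives the bounding box."""
    bx, by = math.sqrt(3.0), math.sqrt(6.0)
    hx, hy = bx / n, by / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            if x < y < 2 * x and x * y < 3:
                total += y * y
    return total * hx * hy

def rhs(n=200):
    """Midpoint rule for the integral of f[g(s, t)] |Jg(s, t)| over 0 < s < 3, 1 < t < 2."""
    hs, ht = 3.0 / n, 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * hs
        for j in range(n):
            t = (j + 0.5) * ht
            y = math.sqrt(s * t)               # second component of g(s, t)
            total += y * y * (1.0 / (2.0 * t))  # f(g(s, t)) * |Jg(s, t)|
    return total * hs * ht
```

The right-hand side is far easier to approximate because $B$ is an interval and $\phi(s, t) = s/2$ is linear, which is exactly the advantage the change of variables is meant to provide.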
Example 2. Let $P$ be an $n$-parallelepiped. Then $P = g(I_0)$ where, as on p. 172, $g$ is
affine and $I_0$ is the unit $n$-cube. Then

$$\int_P f(\mathbf{x})\,dV(\mathbf{x}) = |\det L| \int_{I_0} f[g(\mathbf{t})]\,dV(\mathbf{t}).$$

PROBLEMS
1. Letn = 1, g(t) = t? — 2+ 3, B = (0,1). What does (5-38) become?
2. Let $g(s, t) = (s^2 + t^2)\mathbf{e}_1 + (s^2 - t^2)\mathbf{e}_2$, $s > 0$, $t > 0$, and $A = \{(x, y) : 2 < x + y < 4,\
x - y > 0,\ y > 0\}$. Show that $g$ is regular and evaluate $\int_A x^{-1}\,dV_2(x, y)$.
3. Find the second moment about (0, 0) of the parallelogram with vertices (0, 0),
$\mathbf{e}_1 + \mathbf{e}_2$, $-2\mathbf{e}_1 + 3\mathbf{e}_2$, $-\mathbf{e}_1 + 4\mathbf{e}_2$. [Hint: Let $g$ be the linear transformation $L$
with column vectors $\mathbf{e}_1 + \mathbf{e}_2$, $-2\mathbf{e}_1 + 3\mathbf{e}_2$.]
4. Let $B$ be a compact set, $\bar{\mathbf{x}}$ its centroid, and $g$ an affine transformation. Show that
$g(\bar{\mathbf{x}})$ is the centroid of $g(B)$.
5. Let $A$ be symmetric about $\mathbf{0}$, that is, $\mathbf{x} \in A$ implies $-\mathbf{x} \in A$. Let $f$ be integrable
over $A$ and $f(-\mathbf{x}) = -f(\mathbf{x})$ for every $\mathbf{x} \in A$. Show that $\int_A f\,dV = 0$.
6. Let $g$ be of class $C^{(1)}$ on $\Delta$ and $K$ be a compact subset of $\Delta$. Show that there is a
number $C$ such that $|g(\mathbf{s}) - g(\mathbf{t})| \le C|\mathbf{s} - \mathbf{t}|$ for every $\mathbf{s}, \mathbf{t} \in K$. [Hint: Proposition
13b, Section 4-3.]
7. Let $g$ be regular and $K$, $C$ as in Problem 6. Show that if $B \subset \operatorname{int} K$, then
$V[g(B)] \le C^n V(B)$. [Hint: For $\mathbf{t} \in B$ the partial derivatives of $g$ satisfy $|g_i(\mathbf{t})| \le C$.
Use Problem 6, p. 173 to see that $|Jg(\mathbf{t})| \le C^n$.]

5-9 COORDINATE SYSTEMS IN E”


Let $D$ be an open subset of $E^n$. Let $f^1, \ldots, f^n$ be functions of class $C^{(1)}$
on $D$ such that the transformation $f = (f^1, \ldots, f^n)$ is regular. Since a regular
transformation $f$ is univalent, the numbers $f^1(\mathbf{x}), \ldots, f^n(\mathbf{x})$ uniquely specify
$\mathbf{x}$ and can be regarded as a set of "coordinates" for $\mathbf{x}$.

Definition. A regular transformation $f$ from $D$ into $E^n$ is a coordinate
system for $D$. The numbers $f^1(\mathbf{x}), \ldots, f^n(\mathbf{x})$ are the coordinates of $\mathbf{x}$ in
this coordinate system.

Since we have already considered regular transformations, this definition
involves nothing new except a change of viewpoint. In many problems it is $D$
which has actual geometric or physical significance. The transformation $f$
is introduced simply as a device for solving the problem, and the open set
$\Delta = f(D)$ has only an auxiliary status.
In particular, many integrals can be evaluated by introducing a suitable
coordinate system. The transformation formula (5-38) is applied with $B =
f(A)$, $g = f^{-1}$. The objective usually is to choose a coordinate system for which
$B$ is simpler than $A$ (for instance an interval) or $\phi$ is simpler than $f$, or both.
Let us consider some particular coordinate systems.
1. The identity transformation $I$ gives the standard cartesian coordinate
system for $E^n$. The components $x^1, \ldots, x^n$ are the standard cartesian
coordinates of $\mathbf{x}$.
2. If $f$ is an affine transformation, then the coordinate system is called
affine.
3. If $M$ is an $r$-manifold, $\Phi = (\Phi^1, \ldots, \Phi^{n-r})$ is as in Section 4-7 and

$$\frac{\partial(\Phi^1, \ldots, \Phi^{n-r})}{\partial(x^{r+1}, \ldots, x^n)} \ne 0$$

at $\mathbf{x}_0$, then $(X^1, \ldots, X^r, \Phi^1, \ldots, \Phi^{n-r})$ is a coordinate system for some
neighborhood $U$ of $\mathbf{x}_0$. The coordinates of a point $\mathbf{x} \in M \cap U$ in this system are
$(x^1, \ldots, x^r, 0, \ldots, 0)$.
4. Polar coordinates. Let $D$ be $E^2$ with the positive $x$-axis $N$ removed.
Let $R(x, y) = \sqrt{x^2 + y^2}$ and $\Theta(x, y)$ be the angle from $N$ to the half-line
from (0, 0) through $(x, y)$, with $0 < \Theta(x, y) < 2\pi$. Then $(R, \Theta)$ is the polar
coordinate system for $D$. If $r = R(x, y)$, $\theta = \Theta(x, y)$, then $(r, \theta)$ are the
coordinates of $(x, y)$ and

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad (x, y) = g(r, \theta),$$

where $g = (R, \Theta)^{-1}$. Since $Jg(r, \theta) = r$, the transformation formula becomes

$$\int_A f(x, y)\,dV_2(x, y) = \int_B f[r\cos\theta, r\sin\theta]\,r\,dV_2(r, \theta),$$

where $g(B) = A - N$. Since $N$ is a null set, the integral over $A$ is the same
as the integral over $A - N$.
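As an illustration of the polar formula, the following sketch evaluates the right-hand side by a midpoint rule in $(r, \theta)$ when $A$ is a disk centered at the origin. The test integrand $\exp(-(x^2 + y^2))$, the function name `polar_integral`, and the grid sizes are all assumptions of the example; for this integrand over the unit disk the value $\pi(1 - e^{-1})$ follows from the elementary antiderivative of $r\exp(-r^2)$.

```python
import math

def polar_integral(f, r_max, n_r=400, n_theta=64):
    """Midpoint rule for the integral of f(r cos th, r sin th) * r
    over 0 < r < r_max, 0 < th < 2 pi."""
    hr, ht = r_max / n_r, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for j in range(n_theta):
            th = (j + 0.5) * ht
            total += f(r * math.cos(th), r * math.sin(th)) * r  # factor r is Jg(r, theta)
    return total * hr * ht

value = polar_integral(lambda x, y: math.exp(-(x * x + y * y)), 1.0)
exact = math.pi * (1.0 - math.exp(-1.0))
```

Here $B$ is the interval $(0, 1) \times (0, 2\pi)$, which is what makes the polar system the natural choice for a disk.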
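5. Spherical coordinates in $E^n$. For $n = 2$, this is the polar coordinate
system. Proceeding inductively, let $r = |\mathbf{x}|$, $\theta^1$ be the angle from the positive
$x^1$-axis to $\mathbf{x}$ (more precisely, $\theta^1 = \cos^{-1}(x^1/r)$, $0 < \theta^1 < \pi$), and $(\rho, \theta^2, \ldots, \theta^{n-1})$
be spherical coordinates for $\mathbf{x}'' = (x^2, \ldots, x^n)$, where $\rho = |\mathbf{x}''| = r\sin\theta^1$.
The coordinates of $\mathbf{x}$ are

$$\begin{aligned}
x^1 &= r\cos\theta^1 \\
x^2 &= r\sin\theta^1\cos\theta^2 \\
&\ \ \vdots \\
x^{n-1} &= r\sin\theta^1\sin\theta^2\cdots\sin\theta^{n-2}\cos\theta^{n-1} \\
x^n &= r\sin\theta^1\sin\theta^2\cdots\sin\theta^{n-2}\sin\theta^{n-1}.
\end{aligned}$$

This defines the spherical coordinate system $(R, \Theta^1, \ldots, \Theta^{n-1})$ on $D =
E^n - N$, where $N$ is a certain null set. The Jacobian is

$$Jg(r, \theta^1, \ldots, \theta^{n-1}) = r^{n-1}\sin^{n-2}\theta^1\,\sin^{n-3}\theta^2\cdots\sin\theta^{n-2}.$$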
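The Jacobian formula for spherical coordinates can be spot-checked numerically. The sketch below is a hedged illustration, not a proof: all function names are hypothetical, the derivative matrix is approximated by central differences, and only absolute values are compared, so nothing is claimed here about the sign convention.

```python
import math

def spherical_to_cartesian(r, thetas):
    """x^1 = r cos th1, x^2 = r sin th1 cos th2, ..., x^n = r sin th1 ... sin th_{n-1}."""
    x, s = [], r
    for th in thetas:
        x.append(s * math.cos(th))
        s *= math.sin(th)
    x.append(s)
    return x

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda row: abs(a[row][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for row in range(c + 1, n):
            factor = a[row][c] / a[c][c]
            for k in range(c, n):
                a[row][k] -= factor * a[c][k]
    return d

def numeric_jacobian(r, thetas, h=1e-6):
    """Central-difference approximation to det Dg at (r, th1, ..., th_{n-1})."""
    point = [r] + list(thetas)
    cols = []
    for i in range(len(point)):
        up, down = point[:], point[:]
        up[i] += h
        down[i] -= h
        xu = spherical_to_cartesian(up[0], up[1:])
        xd = spherical_to_cartesian(down[0], down[1:])
        cols.append([(u - v) / (2 * h) for u, v in zip(xu, xd)])
    return det([list(row) for row in zip(*cols)])

def jacobian_formula(r, thetas):
    """r^(n-1) sin^(n-2)(th1) sin^(n-3)(th2) ... sin(th_{n-2})."""
    n = len(thetas) + 1
    value = r ** (n - 1)
    for k, th in enumerate(thetas[:-1], start=1):
        value *= math.sin(th) ** (n - 1 - k)
    return value
```

For example, with $n = 4$ the call `jacobian_formula(2.0, [0.7, 1.1, 2.3])` evaluates $r^3 \sin^2\theta^1 \sin\theta^2$, and it can be compared with `abs(numeric_jacobian(2.0, [0.7, 1.1, 2.3]))`.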

Example 1. Suppose that $f(\mathbf{x}) = \phi(|\mathbf{x}|)$, where $\phi$ is continuous on $(r_1, r_2)$. Then

$$\int_{r_1 < |\mathbf{x}| < r_2} f\,dV = \beta_n \int_{r_1}^{r_2} \phi(r)\,r^{n-1}\,dr,$$

where $\beta_n$ is a number not depending on $\phi$, $r_1$, or $r_2$. To find $\beta_n$, set $\phi = 1$. Then

$$\alpha_n(r_2^n - r_1^n) = \beta_n \int_{r_1}^{r_2} r^{n-1}\,dr = \beta_n(r_2^n - r_1^n)/n,$$

where $\alpha_n$ is the measure of the unit $n$-ball. Hence $\beta_n = n\alpha_n$, which turns out to be
the $(n - 1)$-dimensional measure of the unit $(n - 1)$-sphere.

6. Cylindrical coordinates in $E^3$. Let $(R, \Theta)$ be the polar coordinate
system. Then $(R, \Theta, Z)$ is a coordinate system for $D = E^3 - \{(x, 0, z) : x \ge 0\}$.
The equations $x = r\cos\theta$, $y = r\sin\theta$, $z = z$ relate the cylindrical and
standard cartesian coordinates of a point $\mathbf{x} \in D$. In a similar way cylindrical
coordinates can be introduced in $E^n$.
7. The idea of barycentric coordinates (p. 20) does not agree precisely
with the definition in this section. However, let $t^0, t^1, \ldots, t^n$ be the barycentric
coordinates of $\mathbf{x}$, with respect to the vertices $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$ of an $n$-simplex $S$.
Let $g$ be the affine transformation defined on p. 172, and let $f = g^{-1}$. Then
$t^1, \ldots, t^n$ are the coordinates of $\mathbf{x}$ in the affine coordinate system $f$, and
$t^0 = 1 - (t^1 + \cdots + t^n)$.

*Gamma and beta functions. The gamma function was defined on p. 167.
If we let $x = g(s) = s^2/2$, then

$$\int_0^\infty x^{u-1}\exp(-x)\,dx = 2^{1-u}\int_0^\infty (s^2)^{u-1}\exp(-s^2/2)\,s\,ds,$$

and we obtain another expression for $\Gamma(u)$:

$$\Gamma(u) = 2^{1-u}\int_0^\infty s^{2u-1}\exp(-s^2/2)\,ds, \qquad u > 0. \tag{5-40}$$

Let us calculate the product $\Gamma(u)\Gamma(v)$. Now

$$\Gamma(u)\Gamma(v) = 2^{2-u-v}\int_0^\infty s^{2u-1}\exp(-s^2/2)\,ds \int_0^\infty t^{2v-1}\exp(-t^2/2)\,dt.$$

Writing the iterated integral as an integral over the first quadrant $Q$ and
introducing polar coordinates,

$$\begin{aligned}
\Gamma(u)\Gamma(v) &= 2^{2-u-v}\int_Q s^{2u-1}t^{2v-1}\exp\big[-(s^2 + t^2)/2\big]\,dV_2(s, t) \\
&= 2^{2-u-v}\int_0^\infty r^{2u+2v-1}\exp(-r^2/2)\,dr \int_0^{\pi/2} (\cos\theta)^{2u-1}(\sin\theta)^{2v-1}\,d\theta.
\end{aligned}$$

The first integral on the right-hand side is $2^{u+v-1}\,\Gamma(u + v)$ by (5-40). Let

$$B(u, v) = 2\int_0^{\pi/2} (\cos\theta)^{2u-1}(\sin\theta)^{2v-1}\,d\theta, \qquad u > 0,\ v > 0. \tag{5-41}$$

The function $B$ is called the beta function, and we have just shown that

$$B(u, v) = \frac{\Gamma(u)\Gamma(v)}{\Gamma(u + v)}. \tag{5-42}$$
Example 2. Let $u = v = \tfrac12$. Then $B(\tfrac12, \tfrac12) = 2\int_0^{\pi/2} d\theta = \pi$. Hence
$[\Gamma(\tfrac12)]^2 = \pi\,\Gamma(1) = \pi$, or

$$\Gamma(\tfrac12) = \sqrt{\pi}. \tag{5-43}$$

Using the formula

$$\Gamma(u + 1) = u\,\Gamma(u), \tag{5-44}$$

proved earlier, $\Gamma(m + \tfrac12)$ can be found explicitly for any positive integer $m$. For
instance, $\Gamma(\tfrac32) = \tfrac12\,\Gamma(\tfrac12) = \sqrt{\pi}/2$.

If $u \le 0$, then the integral defining $\Gamma(u)$ diverges. However, if $u$ is not an
integer $m = 0, -1, -2, \ldots$, one can use (5-44) to define $\Gamma(u)$. For instance,
if $-1 < u < 0$, then $0 < u + 1 < 1$ and by definition $\Gamma(u) = \Gamma(u + 1)/u$.
Next $\Gamma(u)$ is defined for $-2 < u < -1$, and so on. The gamma function can also
be defined for complex values of $u$. (See [20], p. 148.)
To obtain another expression for $B(u, v)$, set $\cos^2\theta = g(\theta) = x$ and apply the
transformation formula. Then $|g'(\theta)| = 2\cos\theta\sin\theta$, and

$$B(u, v) = \int_0^{\pi/2} (\cos^2\theta)^{u-1}(\sin^2\theta)^{v-1}\,2\cos\theta\sin\theta\,d\theta,$$

$$B(u, v) = \int_0^1 x^{u-1}(1 - x)^{v-1}\,dx. \tag{5-45}$$

A variety of integrals can be reduced to either (5-41) or (5-45) and hence can be
evaluated in terms of the gamma function. See Problem 10.
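The measure $\alpha_n$ of the unit $n$-ball. According to Problem 7, p. 162,

$$\frac{\alpha_n}{\alpha_{n-1}} = 2\int_0^1 (1 - u^2)^{(n-1)/2}\,du.$$

Setting $u = g(x) = \sqrt{x}$, then $g'(x) = \tfrac12 x^{-1/2}$ and

$$\frac{\alpha_n}{\alpha_{n-1}} = \int_0^1 (1 - x)^{(n-1)/2}\,x^{-1/2}\,dx = B\Big(\frac12, \frac{n+1}{2}\Big),$$

$$\alpha_n = \alpha_{n-1}\,\frac{\Gamma\big(\frac12\big)\,\Gamma\big(\frac{n+1}{2}\big)}{\Gamma\big(\frac{n}{2} + 1\big)}.$$

Moreover, $\alpha_1 = 2$. By induction on $n$ and formula (5-43),

$$\alpha_n = \frac{\pi^{n/2}}{(n/2)\,\Gamma(n/2)}. \tag{5-46}$$

If $n$ is even, $n = 2l$, then $l\,\Gamma(l) = l!$ and $\alpha_{2l} = \pi^l/l!$.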
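Formula (5-46) and the recursion that leads to it are easy to evaluate. The sketch below is illustrative; the function names are hypothetical and `math.gamma` is used for $\Gamma$. It checks the familiar low-dimensional values and the ratio $\alpha_n/\alpha_{n-1}$.

```python
import math

def unit_ball_measure(n):
    """alpha_n = pi^(n/2) / ((n/2) Gamma(n/2)), formula (5-46)."""
    return math.pi ** (n / 2) / ((n / 2) * math.gamma(n / 2))

def ratio(n):
    """alpha_n / alpha_{n-1} = Gamma(1/2) Gamma((n+1)/2) / Gamma(n/2 + 1)."""
    return math.sqrt(math.pi) * math.gamma((n + 1) / 2) / math.gamma(n / 2 + 1)

# Length of [-1, 1], area of the unit disk, volume of the unit ball in E^3.
low_dimensional = [unit_ball_measure(n) for n in (1, 2, 3)]
```

The three values are $2$, $\pi$, and $4\pi/3$, and for even $n = 2l$ the formula reduces to $\pi^l/l!$, as stated above.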

PROBLEMS
1. Let $A = \{(x, y) : x^2 + y^2 \le a^2,\ x \ge 0\}$. Evaluate $\int_A xy^2\,dV_2(x, y)$ by
introducing polar coordinates.
2. Find the area of $\{(x, y) : x \le y \le 2x,\ 1 \le x + 4y \le 4\}$ by introducing $f^1(x, y) =
y/x$, $f^2(x, y) = x + 4y$ as coordinates of $(x, y)$.
3. Let A = {(2,y): O< 2? +y? < 2,2?-—y?<1,2>0,y> 0}. Find fa rdVo(z, y)
by introducing the coordinates f'(z, y) = 2? + y?, f?(z,y) = a? — y?.
4. Write the iterated integral fj drfj7” dy ff?”
f(a, y, 2) dz as an iterated integral in cylindrical
coordinates.
5. Find $V_3(\{(x, y, z) : x^2 + y^2 + z^2 \le a^2,\ x^2 + y^2 \ge b^2\})$ where $a > b$.
6. (Solids of revolution.) Let $S$ be a compact subset of the right half-plane and

$$A = \{(x, y, z) : (r, z) \in S\},$$

where $r^2 = x^2 + y^2$. Show that $V_3(A) = 2\pi\bar{r}\,V_2(S)$, where $(\bar{r}, \bar{z})$ is the centroid of $S$
(Fig. 5-19).

Figure 5-19

7. Find V3({(q, y, 2): exp (—22) > y?+ 27,2 > 0}). .
8. Let $(f^1, \ldots, f^n)$ be a coordinate system for $D_1 \subset E^n$ and $(\phi^1, \ldots, \phi^m)$ a
coordinate system for $D_2 \subset E^m$. Show that $(f^1, \ldots, f^n, \phi^1, \ldots, \phi^m)$, regarded as
functions on $D_1 \times D_2$, form a coordinate system for $D_1 \times D_2$.
9. (Bipolar coordinates in $E^4$.) In this system the coordinates of $\mathbf{x}$ are $r\cos\theta$, $r\sin\theta$,
$\rho\cos\alpha$, $\rho\sin\alpha$, where $(r, \theta)$ are polar coordinates in the $x^1x^2$-plane and $(\rho, \alpha)$ polar
coordinates in the $x^3x^4$-plane. Find, using bipolar coordinates,

ik (x!)? dV a(x)
KXK

where $K$ is the unit circular disk $x^2 + y^2 \le 1$.


10. In terms of the I’-function, find:

(a) | \/1 — x8 dx. [Hint: Let x? = z.]


0
(b) The area of {(2, y):0 < y < Vecosz, —1/2 < x < w/2}.

(c) i x exp (—2°) dz,a > —1,6> 0.


0

(d) / loss)'x" dt. ces) ld < =I.

(e) [ (x ae dV n(x), k = 1,2,...,>) the standard n-simplex.

) (ethan > rs dV,(x), k = 1,2,...,a; > Ofort = 1,...,n.


t=]

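11. Let $Q(\mathbf{x}) = \sum_{i,j=1}^{n} c_{ij}x^i x^j > 0$ for every $\mathbf{x} \ne \mathbf{0}$, where $c_{ij} = c_{ji}$ for $i, j = 1, \ldots, n$.
Show that

$$\int_{E^n} \exp[-Q(\mathbf{x})/2]\,dV_n(\mathbf{x}) = (2\pi)^{n/2}\,[\det(c_{ij})]^{-1/2}.$$

[Hint: Make a suitable orthogonal transformation.]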

5-10 CONVERGENCE THEOREMS


In this section we establish some properties of measurable functions. Then
the Lebesgue integral is defined without the special assumptions of Section 5-6.
Two important theorems about interchanging integral and limit signs are
proved—the theorem about monotone sequences and Lebesgue’s dominated
convergence theorem.
Let us call the ordered number field $E^1$ with two "ideal points" $-\infty$ and
$+\infty$ adjoined the extended real number system. The points $+\infty$, $-\infty$ are not
numbers. However, for present purposes we agree that $-\infty < a < +\infty$
for every number $a$, and that

$$(+\infty) + a = a + (+\infty) = +\infty,$$

$$(+\infty)\cdot a = a\cdot(+\infty) = \begin{cases} +\infty & \text{if } a > 0, \\ -\infty & \text{if } a < 0. \end{cases}$$

Similar conventions are made regarding $-\infty$. However, $(+\infty) + (-\infty)$
is undefined.
The extended real number system will be denoted by H!. If S is a non-
empty subset of #', let sup S be the smallest b € H! such that x < b for
every x ES. If +o ES, then clearly sup S = +c. Moreover, if « < +o
for every x € S but S has no upper bound in E’, then sup S = +o.
The definition of inf S is similar. By neighborhood of + let us mean
any set {c € H':x > c}. Then convergence to -+« of sequences in EZ! makes
sense. Any nondecreasing sequence 2, 22,... in H! such that x, > —o%
for some m has a limit x9 which is finite (that is, in #1) or +o.
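The following lemma about doubly indexed monotone sequences in $\bar{E}^1$ is used several times.

Lemma 1. Assume that $a_{m\nu} \ge 0$, $a_{m\nu} \le a_{m,\nu+1}$, $a_{m\nu} \le a_{m+1,\nu}$ for every $m, \nu = 1, 2, \ldots$ Then
\[ \lim_{m\to\infty} \lim_{\nu\to\infty} a_{m\nu} = \lim_{\nu\to\infty} \lim_{m\to\infty} a_{m\nu}. \]

Proof. Let
\[ b_m = \lim_{\nu\to\infty} a_{m\nu}, \qquad c_\nu = \lim_{m\to\infty} a_{m\nu}. \]
These limits exist since $a_{m1} \le a_{m2} \le \cdots$ and $a_{1\nu} \le a_{2\nu} \le \cdots$ Moreover, $b_1 \le b_2 \le \cdots$ and $c_1 \le c_2 \le \cdots$ Let
\[ b = \lim_{m\to\infty} b_m, \qquad c = \lim_{\nu\to\infty} c_\nu. \]
For every m and $\nu$, $a_{m\nu} \le c$. Hence $b_m \le c$ for every m, and it follows that $b \le c$. Similarly, $c \le b$. ∎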
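The role of monotonicity in Lemma 1 can be seen in a rough numerical sketch. The two doubly indexed sequences below are illustrative choices, not taken from the text: `a` satisfies the hypotheses of the lemma, while `b` is decreasing in its second index, and its two iterated limits differ. The "limits" are only approximated by taking the inner index much larger than the outer one.

```python
def a(m, nu):
    # nonnegative and nondecreasing in both indices (hypotheses of Lemma 1)
    return (m / (m + 1)) * (nu / (nu + 1))

def b(m, nu):
    # nondecreasing in m but decreasing in nu, so Lemma 1 does not apply
    return m / (m + nu)

def limit_nu_then_m(term, inner=10**9, outer=10**4):
    # crude stand-in for lim_m lim_nu: the inner index nu is far larger than m
    return term(outer, inner)

def limit_m_then_nu(term, inner=10**9, outer=10**4):
    # crude stand-in for lim_nu lim_m: the inner index m is far larger than nu
    return term(inner, outer)

print(limit_nu_then_m(a), limit_m_then_nu(a))  # both close to 1
print(limit_nu_then_m(b), limit_m_then_nu(b))  # close to 0 and close to 1
```

For `b` the order of the limits matters, which is exactly what the monotonicity assumptions rule out.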

Measure. In Section 5-2 the theory of Lebesgue measure was developed for bounded sets. For unbounded sets only the definition of measure was given there. Let us now show that Theorem 12a, which was the main result of Section 5-2, still holds without assuming boundedness.

Theorem 12b. Let $A = A_1 \cup A_2 \cup \cdots$, where $A_k$ is measurable for each $k = 1, 2, \ldots$ Then the conclusions of Theorem 12 hold, providing formulas (5-10) and (5-11) are interpreted in the extended real number system $\bar{E}^1$.

Proof. Let $U_r$ denote the r-neighborhood of 0, as on p. 145. By definition, $A_m \cap U_r$ is bounded and measurable for every r > 0. Moreover,
\[ A \cap U_r = (A_1 \cap U_r) \cup (A_2 \cap U_r) \cup \cdots \]
By Theorem 12, $A \cap U_r$ is measurable for each r. Hence A is measurable.
Let us next assume that $A_1 \subset A_2 \subset \cdots$ and prove (5-11). Considering only integer values of r, let $a_{m\nu} = V(A_m \cap U_\nu)$. The hypotheses of Lemma 1 are satisfied. Moreover, by the definition (5-12),
\[ b_m = \lim_{\nu\to\infty} V(A_m \cap U_\nu) = V(A_m). \]
By (5-11) for the case of bounded nondecreasing sequences of sets,
\[ c_\nu = \lim_{m\to\infty} V(A_m \cap U_\nu) = V(A \cap U_\nu). \]
By Lemma 1 and (5-12),
\[ \lim_{m\to\infty} V(A_m) = \lim_{\nu\to\infty} V(A \cap U_\nu) = V(A). \]
This is (5-11).
If $A_1, A_2, \ldots$ are disjoint, let $B_m = A_1 \cup \cdots \cup A_m$. Then $B_1 \subset B_2 \subset \cdots$, $A = B_1 \cup B_2 \cup \cdots$, and by what we have just proved
\[ \sum_{k=1}^{\infty} V(A_k) = \lim_{m\to\infty} V(B_m) = V(A). \]
This is (5-9). Since differences of measurable sets are measurable [Problem 1(a)], the same reasoning as on p. 144 establishes (5-10). ∎

Some additional properties of measurable sets are listed in Problem 1.

Corollary 1. Let $A_1, A_2, \ldots$ be measurable sets such that $A_1 \supset A_2 \supset \cdots$ and $V(A_1)$ is finite. Then $A = A_1 \cap A_2 \cap \cdots$ is measurable and
\[ V(A) = \lim_{\nu\to\infty} V(A_\nu). \tag{5-47} \]

Proof. By Problem 1(c), A is measurable. Let $H_\nu = A_1 - A_\nu$, $\nu = 1, 2, \ldots$, and $H = A_1 - A$. By Problem 1(b), $V(H_\nu) = V(A_1) - V(A_\nu)$ and $V(H) = V(A_1) - V(A)$. Since $H_1 \subset H_2 \subset \cdots$ and $H = H_1 \cup H_2 \cup \cdots$, by Theorem 12b
\[ V(A_1) - V(A) = \lim_{\nu\to\infty} [V(A_1) - V(A_\nu)]. \]
Since $V(A_1)$ does not depend on $\nu$, we obtain (5-47). ∎

In the discussion of integrals to follow, it is convenient to consider functions which may have the values $-\infty$ or $+\infty$.

Properties of measurable functions. Let f be a function with domain $E^n$ and values in the extended real number system $\bar{E}^1$. Then f is measurable if
\[ \{x : f(x) > c\} \tag{5-48} \]
is a measurable set for every real number c. Let us list and prove some properties of measurable functions.

(1) If f is measurable, then for every c the sets $\{x : f(x) \le c\}$, $\{x : f(x) \ge c\}$, and $\{x : f(x) < c\}$ are measurable sets.

Proof. By Problem 1(a), with $A = E^n$, the complement of a measurable set is measurable. Since $\{x : f(x) \le c\}$ is the complement of the set in (5-48), it is measurable. Now $\{x : f(x) \ge c\} = \bigcap_{m=1}^{\infty} \{x : f(x) > c - 1/m\}$. Each set on the right is measurable, and by Problem 1(c) their intersection is measurable. Taking complements, the third set is measurable. ∎
From the definition and (1), $f^{-1}(J)$ is measurable if J is any semi-infinite interval. It can be shown that $f^{-1}(H)$ is measurable if H is any open set or closed set.
In the next statement we agree that $0f = 0$ even when f has extended real values.
(2) If f is measurable, then af is measurable for any real number a.

Proof. This follows from the definition and (1). ∎


(3a) If f and g are measurable and
\[ h(x) = \max \{f(x), g(x)\} \quad \text{for every } x, \]
then h is measurable.
Proof. $\{x : h(x) > c\} = \{x : f(x) > c\} \cup \{x : g(x) > c\}$. ∎
In particular, if f is measurable and
\[ f^+(x) = \max \{f(x), 0\}, \qquad f^-(x) = \max \{-f(x), 0\}, \]
then $f^+$ and $f^-$ are measurable.
Statement (3a) extends to the maximum of a finite number of measurable functions and, more importantly, to sequences of functions.
(3b) If $f_1, f_2, \ldots$ are measurable and
\[ g(x) = \sup \{f_1(x), f_2(x), \ldots\} \quad \text{for every } x, \]
then g is measurable.
Proof. $\{x : g(x) > c\} = \bigcup_{\nu=1}^{\infty} \{x : f_\nu(x) > c\}$. ∎

In particular, if $f_1 \le f_2 \le \cdots$ then g is the limit of this nondecreasing sequence.
Similarly, if
\[ h(x) = \inf \{f_1(x), f_2(x), \ldots\} \quad \text{for every } x, \]
then h is measurable.
Let $y_1, y_2, \ldots$ be any sequence in $\bar{E}^1$. Let
\[ z_\nu = \inf \{y_\nu, y_{\nu+1}, \ldots\}, \qquad \nu = 1, 2, \ldots \]
Then $z_1 \le z_2 \le \cdots$ The limit of the monotone sequence $z_1, z_2, \ldots$ is called the lower limit of the sequence $y_1, y_2, \ldots$, and is denoted by $\liminf_{\nu\to\infty} y_\nu$. Similarly, if
\[ w_\nu = \sup \{y_\nu, y_{\nu+1}, \ldots\}, \qquad \nu = 1, 2, \ldots, \]
then $w_1 \ge w_2 \ge \cdots$ The limit of the monotone sequence $w_1, w_2, \ldots$ is the upper limit $\limsup_{\nu\to\infty} y_\nu$. Since $z_\nu \le w_\nu$ for each $\nu = 1, 2, \ldots$, we must have
\[ \liminf_{\nu\to\infty} y_\nu \le \limsup_{\nu\to\infty} y_\nu. \]
Equality holds if and only if the sequence $y_1, y_2, \ldots$ has a limit.
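The tail infima and suprema used to define the lower and upper limits can be computed for a finite stretch of a sequence. The sketch below is only an illustration: the helper name `tail_inf_sup` and the sample sequence $y_\nu = (-1)^\nu (1 + 1/\nu)$ are assumptions, and for a general sequence a finite truncation only approximates the infinite tails.

```python
def tail_inf_sup(y):
    """Return (z, w) with z[k] = min(y[k:]) and w[k] = max(y[k:])."""
    z, w = [], []
    lo, hi = float("inf"), float("-inf")
    for value in reversed(y):          # sweep from the end so each tail is seen once
        lo, hi = min(lo, value), max(hi, value)
        z.append(lo)
        w.append(hi)
    z.reverse()
    w.reverse()
    return z, w

# y_nu = (-1)^nu * (1 + 1/nu), nu = 1, ..., 2000
y = [(-1) ** nu * (1 + 1 / nu) for nu in range(1, 2001)]
z, w = tail_inf_sup(y)
print(z[0], z[999])   # tail infima, nondecreasing, approaching -1
print(w[0], w[999])   # tail suprema, nonincreasing, approaching +1
```

Here the lower limit is -1 and the upper limit is +1, so the sequence has no limit.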
(4) If $f_1, f_2, \ldots$ are measurable and
\[ f(x) = \liminf_{\nu\to\infty} f_\nu(x) \quad \text{for every } x, \]
then f is measurable.
Proof. For each $\nu = 1, 2, \ldots$ let $h_\nu(x) = \inf \{f_\nu(x), f_{\nu+1}(x), \ldots\}$. By (3b) each $h_\nu$ is measurable; $h_1 \le h_2 \le \cdots$, and by the definition of “lim inf,” $f(x) = \lim_{\nu\to\infty} h_\nu(x)$ for every x. Therefore f is measurable. ∎

(5) If $f_1, f_2, \ldots$ are measurable and
\[ f(x) = \lim_{\nu\to\infty} f_\nu(x) \quad \text{for every } x, \]
then f is measurable.
Proof. This is a particular case of (4). ∎

In the next statement it is assumed that f(x) + g(x) is everywhere defined. In other words, for no x is it true that $f(x) = +\infty$, $g(x) = -\infty$ or vice versa.
(6) If f and g are measurable, then f + g is measurable.
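Proof. For each $\nu = 1, 2, \ldots$, let
\[ f_\nu(x) = \begin{cases} \nu & \text{if } f(x) \ge \nu, \\ -\nu & \text{if } f(x) < -\nu, \\ j/\nu & \text{if } (j-1)/\nu \le f(x) < j/\nu, \end{cases} \]
where j is an integer, and $1 - \nu^2 \le j \le \nu^2$. This construction is like one on p. 153. Then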
\[ f(x) = \lim_{\nu\to\infty} f_\nu(x) \]
for every x. Defining $g_\nu$ in the same way,
\[ g(x) = \lim_{\nu\to\infty} g_\nu(x). \]
For each $\nu$ the functions $f_\nu$ and $g_\nu$ are measurable and take a finite number of values. It is easy to show that their sum $f_\nu + g_\nu$ is measurable. Since for every x
\[ f(x) + g(x) = \lim_{\nu\to\infty} [f_\nu(x) + g_\nu(x)], \]
f + g is measurable by (5). ∎
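The approximation by finitely-valued functions in the proof of (6) can be sketched numerically. The helper `quantize` below is a hypothetical name, and the rounding convention (clamp to [-nu, nu], otherwise move up to the next grid point j/nu) is an assumption about the construction; any convention with grid spacing 1/nu gives the same convergence.

```python
import math

def quantize(value, nu):
    """Finitely-valued approximation f_nu of an extended real value.

    Values at or above nu map to nu, values below -nu map to -nu, and a
    value in [(j-1)/nu, j/nu) maps to j/nu (assumed convention).
    """
    if value >= nu:
        return float(nu)
    if value < -nu:
        return float(-nu)
    return (math.floor(value * nu) + 1) / nu

for nu in (1, 10, 100, 1000):
    # finite values are approximated within 1/nu once nu exceeds |value|;
    # +infinity is approximated by nu, which tends to +infinity
    print(nu, quantize(math.pi, nu), quantize(float("inf"), nu))
```

Each `quantize(., nu)` takes at most finitely many values, which is what makes the measurability of the sum easy to check before passing to the limit.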

Definition of the integral. The integral of a bounded measurable function with compact support was defined in Section 5-3. Let us extend the definition in three steps.

(a) Let f be bounded, measurable, and $f \ge 0$. For each r > 0, f is measurable on the r-neighborhood $U_r$ of 0. By Theorem 14, f is integrable over $U_r$. Let $\phi(r)$ equal the integral of f over $U_r$. Since $f \ge 0$, $\phi$ is nondecreasing. Hence $\phi(r)$ tends to a limit, finite or $+\infty$, as $r \to +\infty$. Let
\[ \int f \, dV = \lim_{r\to\infty} \int_{U_r} f \, dV. \tag{5-49} \]
If the limit is finite, then f is integrable (over $E^n$). Otherwise the left-hand side of (5-49) equals $+\infty$. If $0 \le f \le g$, then $\int f \, dV \le \int g \, dV$.
(b) Let f be measurable and $f \ge 0$. For any t > 0 consider the function ${}_t f$ such that
\[ {}_t f(x) = \min \{f(x), t\} \quad \text{for every } x. \]
For each t, ${}_t f$ is a bounded, measurable function. It is called the truncation of f at height t. If s < t, then ${}_s f \le {}_t f$ and hence $\int {}_s f \, dV \le \int {}_t f \, dV$. Let
\[ \int f \, dV = \lim_{t\to+\infty} \int {}_t f \, dV. \tag{5-50} \]
If the limit is finite, then f is integrable. Otherwise $\int f \, dV = +\infty$. If $0 \le f \le g$, then ${}_t f \le {}_t g$ for every t. Consequently, $\int f \, dV \le \int g \, dV$.
If f is bounded, then ${}_t f = f$ for all sufficiently large t. The definitions in (a) and (b) agree. Moreover, if f has compact support, then $\phi(r)$ is constant for all sufficiently large r and the definition of integral in (a) agrees with the one in Section 5-3.
If A is a measurable set, then its characteristic function $1_A$ (see p. 152) is bounded and measurable. From (5-12), (5-49), and (3) of Theorem 13, V(A) equals the integral of $1_A$.

(c) A measurable function f is integrable if $f^+$ and $f^-$ are integrable. If f is integrable, then
\[ \int f \, dV = \int f^+ \, dV - \int f^- \, dV. \tag{5-51} \]
In case f is bounded and has compact support, then the new definition agrees with the one in Section 5-3. For if the integral is taken in the sense of Section 5-3, then formula (5-51) holds. (See Proposition 17 and Theorem 14.) Since $f^+$ and $f^-$ have the same integral in the new sense as in Section 5-3, so does f.
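Step (b) of the definition can be illustrated with a one-dimensional example that is not from the text: $f(x) = x^{-1/2}$ on (0, 1) and 0 elsewhere. For $t \ge 1$ the truncation at height t has integral $2 - 1/t$, which increases to 2 as $t \to +\infty$. The midpoint rule below is just a convenient numerical stand-in for the integral of the bounded truncated function.

```python
def truncated_integral(t, n=100_000):
    """Midpoint-rule value of the integral over (0, 1) of min(x**-0.5, t)."""
    h = 1.0 / n
    return h * sum(min(((i + 0.5) * h) ** -0.5, t) for i in range(n))

for t in (1, 2, 10, 100):
    # second column: numerical integral of the truncation at height t
    # third column: closed form 2 - 1/t, valid for t >= 1
    print(t, truncated_integral(t), 2 - 1 / t)
```

The integrals of the truncations are nondecreasing in t and bounded, so this unbounded f is integrable in the sense of (5-50).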

Let us now give some general theorems about the validity of interchanging the symbols “$\lim_{\nu\to\infty}$” and “$\int$”.

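Lemma 2. Let K be a compact set and $\phi_1, \phi_2, \ldots$ be bounded measurable functions with supports contained in K such that $\phi_1 \ge \phi_2 \ge \cdots$ and $\lim_{\nu\to\infty} \phi_\nu(x) = 0$ for every x. Then
\[ \lim_{\nu\to\infty} \int \phi_\nu \, dV = 0. \]

Proof. There exists C such that $\phi_\nu(x) \le C$ for every x and $\nu = 1$, hence also for $\nu = 2, 3, \ldots$ Given $\varepsilon > 0$ let $c = \varepsilon/[2V(K)]$. Let
\[ A_\nu = \{x : \phi_\nu(x) > c\}. \]
By hypothesis $K \supset A_1 \supset A_2 \supset \cdots$ and $A_1 \cap A_2 \cap \cdots$ is the empty set. Since $V(A_1)$ is finite, we may apply formula (5-47), obtaining
\[ \lim_{\nu\to\infty} V(A_\nu) = V(A_1 \cap A_2 \cap \cdots) = 0. \]
Let $\nu_0$ be such that $V(A_\nu) < \varepsilon/2C$ for every $\nu \ge \nu_0$. Then
\[ \int_{A_\nu} \phi_\nu \, dV \le C\, V(A_\nu) < \varepsilon/2, \]
while for every $\nu$,
\[ \int_{K - A_\nu} \phi_\nu \, dV \le c\, V(K - A_\nu) \le c\, V(K) = \varepsilon/2. \]
Hence for every $\nu \ge \nu_0$,
\[ \int \phi_\nu \, dV = \int_{A_\nu} \phi_\nu \, dV + \int_{K - A_\nu} \phi_\nu \, dV < \varepsilon. \] ∎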


Monotone sequences theorem. Let $f_1, f_2, \ldots$ be measurable functions with $0 \le f_1 \le f_2 \le \cdots$, and let $f(x) = \lim_{\nu\to\infty} f_\nu(x)$ for every x. Then
\[ \int f \, dV = \lim_{\nu\to\infty} \int f_\nu \, dV. \tag{5-52} \]

Proof. We have already shown that the limit f is measurable. First let us assume that f is bounded and has compact support. Let $\phi_\nu = f - f_\nu$. Since $0 \le f_\nu \le f$, each function $f_\nu$ is also bounded and has compact support. Moreover
\[ \int \phi_\nu \, dV = \int f \, dV - \int f_\nu \, dV, \qquad \nu = 1, 2, \ldots \]
From Lemma 2, we get (5-52).
Next let us suppose merely that f is bounded. Let
\[ a_{m\nu} = \int_{U_m} f_\nu \, dV, \qquad m, \nu = 1, 2, \ldots, \]
and apply Lemma 1. Finally, if f is unbounded observe that for each $m = 1, 2, \ldots$,
\[ 0 \le {}_m f_1 \le {}_m f_2 \le \cdots, \qquad {}_m f(x) = \lim_{\nu\to\infty} {}_m f_\nu(x), \]
where ${}_m f_\nu$ is the truncation of $f_\nu$ at height m. Apply Lemma 1 again with
\[ a_{m\nu} = \int {}_m f_\nu \, dV, \qquad m, \nu = 1, 2, \ldots \] ∎

Note: The theorem states in particular that f is integrable if and only if the nondecreasing sequence of numbers $\int f_1 \, dV, \int f_2 \, dV, \ldots$ has a finite limit.
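A concrete instance of the monotone sequences theorem, chosen here for illustration and not taken from the text: on the real line let $f_\nu(x) = (1 - x/\nu)^\nu$ for $0 \le x \le \nu$ and 0 elsewhere. These functions are nondecreasing in $\nu$ and tend to $e^{-x}$ on $x \ge 0$; their integrals are $\nu/(\nu + 1)$, which tend to 1, the integral of the limit. The midpoint rule is only a numerical check.

```python
import math

def f(nu, x):
    # (1 - x/nu)^nu on [0, nu], zero elsewhere; nondecreasing in nu for fixed x
    return (1 - x / nu) ** nu if 0 <= x <= nu else 0.0

def integral(nu, n=100_000):
    # midpoint rule on [0, nu], the interval where f_nu is nonzero
    h = nu / n
    return h * sum(f(nu, (i + 0.5) * h) for i in range(n))

for nu in (1, 2, 5, 20, 100):
    print(nu, integral(nu), nu / (nu + 1))   # closed form nu/(nu + 1) tends to 1
```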

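The double limiting process (5-49) and (5-50) can be replaced by a single one. Let f be measurable and $f \ge 0$. Let
\[ f_\nu(x) = \begin{cases} \nu & \text{if } f(x) \ge \nu,\ |x| \le \nu, \\ 0 & \text{if } |x| > \nu, \\ f(x) & \text{otherwise.} \end{cases} \tag{*} \]
Then $0 \le f_1 \le f_2 \le \cdots$ and $f_\nu(x)$ tends to f(x) for every x. Hence by the theorem,
\[ \int f \, dV = \lim_{\nu\to\infty} \int f_\nu \, dV. \]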

Corollary 2. If f and g are real-valued integrable functions, then f + g is integrable and
\[ \int (f + g) \, dV = \int f \, dV + \int g \, dV. \tag{5-53} \]
Proof. By (6) f + g is measurable. If $f \ge 0$, $g \ge 0$, define $f_\nu$ and $g_\nu$ by (*). For each $\nu$, $f_\nu$ and $g_\nu$ are bounded and have compact supports. By Proposition 17, Section 5-3,
\[ \int (f_\nu + g_\nu) \, dV = \int f_\nu \, dV + \int g_\nu \, dV. \]
The sequences $f_1, f_2, \ldots$; $g_1, g_2, \ldots$; $f_1 + g_1, f_2 + g_2, \ldots$ are nondecreasing and tend respectively to f, g, f + g. By the monotone sequences theorem, the corollary is true when $f \ge 0$, $g \ge 0$.
In the general case,
\[ 0 \le (f + g)^+ \le f^+ + g^+. \]
Since $f^+$ and $g^+$ are integrable, so is $(f + g)^+$. Similarly $(f + g)^-$ is integrable. Then
\[ f^+ + g^+ = (f + g)^+ + \phi, \]
\[ f^- + g^- = (f + g)^- + \phi, \]
where $\phi \ge 0$. Since the corollary is true for nonnegative functions,
\[ \int f^+ \, dV + \int g^+ \, dV = \int (f + g)^+ \, dV + \int \phi \, dV, \]
\[ \int f^- \, dV + \int g^- \, dV = \int (f + g)^- \, dV + \int \phi \, dV. \]
Subtracting, we get (5-53). ∎

In particular, if f is integrable, then $|f| = f^+ + f^-$ is integrable. It is easy to show that cf is integrable if f is integrable, and
\[ \int (cf) \, dV = c \int f \, dV. \]
Corollary 3. Let $f_1, f_2, \ldots$ be measurable, $f_1$ be integrable, $f_1 \ge f_2 \ge \cdots \ge 0$, and let $f(x) = \lim_{\nu\to\infty} f_\nu(x)$ for every x. Then (5-52) holds.

Proof. Let $g_\nu = f_1 - f_\nu$ and apply the theorem to the nondecreasing sequence $g_1, g_2, \ldots$ which has $f_1 - f$ as limit. ∎

For sequences which are not necessarily monotone, there is a result called:

Fatou’s lemma. Let $f_\nu$ be measurable and $f_\nu \ge 0$ for each $\nu = 1, 2, \ldots$ Then
\[ \int (\liminf_{\nu\to\infty} f_\nu) \, dV \le \liminf_{\nu\to\infty} \int f_\nu \, dV. \tag{5-54a} \]

Proof. Let $h_\nu = \inf \{f_\nu, f_{\nu+1}, \ldots\}$. Since $h_\nu \le f_m$ whenever $\nu \le m$,
\[ \int h_\nu \, dV \le \int f_m \, dV, \qquad m = \nu, \nu + 1, \ldots, \]
\[ \int h_\nu \, dV \le \liminf_{m\to\infty} \int f_m \, dV, \qquad \nu = 1, 2, \ldots \tag{**} \]
Let $f = \liminf_{\nu\to\infty} f_\nu$. The sequence $h_1, h_2, \ldots$ is nondecreasing and tends to f. By the monotone sequences theorem,
\[ \int f \, dV = \lim_{\nu\to\infty} \int h_\nu \, dV. \]
Since (**) holds for each $\nu$, the limit is no more than the right-hand side of (**). ∎
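The inequality in Fatou's lemma can be strict. The sketch below uses an illustrative sequence that is not from the text: on [0, 1], $f_\nu$ is the characteristic function of [0, 1/2) for even $\nu$ and of [1/2, 1] for odd $\nu$. The lower limit is 0 at every point, while every $f_\nu$ has integral 1/2. The midpoint sums are exact here up to rounding because the functions are step functions with a jump at a grid point.

```python
def f(nu, x):
    # indicator of [0, 1/2) for even nu, of [1/2, 1] for odd nu
    left = 0 <= x < 0.5
    right = 0.5 <= x <= 1
    return 1.0 if (left if nu % 2 == 0 else right) else 0.0

def midpoint_integral(g, n=1000):
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def liminf_f(x):
    # the sequence has period 2 in nu, so every tail infimum equals min(f_1, f_2)
    return min(f(1, x), f(2, x))

lhs = midpoint_integral(liminf_f)                                      # integral of lim inf f_nu
rhs = min(midpoint_integral(lambda x, k=k: f(k, x)) for k in (1, 2))   # lim inf of the integrals
print(lhs, rhs)
```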

A statement about measurable functions is said to hold almost everywhere if it is true except for x in some null set.

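Proposition. If f is integrable, then f(x) is finite almost everywhere.

Proof. If $f \ge 0$, let
\[ A_m = \{x : f(x) \ge m\}, \qquad A_\infty = \{x : f(x) = +\infty\}. \]
Then $A_\infty = A_1 \cap A_2 \cap \cdots$ and $A_1 \supset A_2 \supset \cdots$ For each m, let $\phi_m(x) = f(x)$ if $x \in A_m$, and otherwise $\phi_m(x) = 0$. Then $\phi_m \ge m 1_{A_m}$, where $1_{A_m}$ is the characteristic function of $A_m$. Hence
\[ m V(A_m) = \int m 1_{A_m} \, dV \le \int \phi_m \, dV. \]
Since $\phi_m \le f$ we have, dividing by m,
\[ V(A_m) \le \frac{1}{m} \int f \, dV. \]
Since $A_\infty \subset A_m$ for every m and the right-hand side tends to 0 as $m \to \infty$, $V(A_\infty) = 0$.
In the general case, $f^+$ and $f^-$ are integrable and
\[ \{x : f(x) = \pm\infty\} = \{x : f^+(x) = +\infty\} \cup \{x : f^-(x) = +\infty\}. \] ∎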


If f is measurable and g(x) = f(x) almost everywhere, then g is measurable. Moreover, if either is integrable then so is the other and $\int f \, dV = \int g \, dV$ (Problem 3).
In Fatou’s lemma the hypothesis “$f_\nu \ge 0$ for every $\nu$” can be replaced by “$f_\nu \ge \phi$ for every $\nu$, where $\phi$ is integrable.” For this purpose we make the following convention. If $f \ge \phi$ and $\phi$ is integrable, then $\phi(x)$ is finite almost everywhere. By $f - \phi$ let us mean the function with value $f(x) - \phi(x)$ if $\phi(x) \ne \pm\infty$, and value 0 if $\phi(x) = \pm\infty$. Then $\int (f - \phi) \, dV = \int f \, dV - \int \phi \, dV$ if $f - \phi$ is integrable. If the integral of $f - \phi$ diverges to $+\infty$, then we agree that the integral of f also diverges to $+\infty$.
Let $g_\nu = f_\nu - \phi$. By assumption, $g_\nu \ge 0$ for each $\nu = 1, 2, \ldots$ Moreover,
\[ \liminf_{\nu\to\infty} g_\nu(x) = \liminf_{\nu\to\infty} f_\nu(x) - \phi(x), \]
provided $\phi(x) \ne \pm\infty$, hence almost everywhere. Since for each $\nu$
\[ \int g_\nu \, dV = \int f_\nu \, dV - \int \phi \, dV, \]
we have
\[ \liminf_{\nu\to\infty} \int g_\nu \, dV = \liminf_{\nu\to\infty} \int f_\nu \, dV - \int \phi \, dV. \]
Applying Fatou’s lemma to the sequence $g_1, g_2, \ldots$ and adding $\int \phi \, dV$ to each side, we again get (5-54a).
Similarly, if for each $\nu$, $f_\nu \le \phi$ where $\phi$ is integrable, then
\[ \int (\limsup_{\nu\to\infty} f_\nu) \, dV \ge \limsup_{\nu\to\infty} \int f_\nu \, dV. \tag{5-54b} \]

Note: In the monotone sequences theorem and its corollary the hypothesis $f_\nu \ge 0$ can also be replaced by $f_\nu \ge \phi$, where $\phi$ is integrable.

Lebesgue’s dominated convergence theorem. Let $f_1, f_2, \ldots$ be measurable functions such that:

(a) $\lim_{\nu\to\infty} f_\nu(x) = f(x)$ almost everywhere.

(b) There is an integrable function g such that $|f_\nu| \le g$ for $\nu = 1, 2, \ldots$
Then
\[ \int f \, dV = \lim_{\nu\to\infty} \int f_\nu \, dV. \]

Proof. Since $-g \le f_\nu \le g$ and both $-g$ and g are integrable, we can apply (5-54a) and (5-54b). But
\[ \liminf_{\nu\to\infty} f_\nu(x) = \limsup_{\nu\to\infty} f_\nu(x) = f(x) \]
almost everywhere. Hence
\[ \limsup_{\nu\to\infty} \int f_\nu \, dV \le \int f \, dV \le \liminf_{\nu\to\infty} \int f_\nu \, dV. \]
But lim inf $\le$ lim sup, and hence $\int f_\nu \, dV$ tends to $\int f \, dV$. ∎
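A sequence that is neither monotone nor uniformly convergent can still satisfy the hypotheses of the dominated convergence theorem. As an illustration not taken from the text, let $f_\nu(x) = \nu x/(1 + \nu^2 x^2)$ on [0, 1]. Then $0 \le f_\nu \le 1/2$, so the constant 1/2 dominates, $f_\nu(x) \to 0$ for every x, and yet $f_\nu(1/\nu) = 1/2$ for every $\nu$. The integrals have the closed form $\log(1 + \nu^2)/(2\nu)$ and tend to 0; the midpoint rule is only a numerical check.

```python
import math

def f(nu, x):
    # s / (1 + s^2) with s = nu * x, which never exceeds 1/2
    return nu * x / (1 + (nu * x) ** 2)

def integral(nu, n=100_000):
    # midpoint rule on [0, 1]
    h = 1.0 / n
    return h * sum(f(nu, (i + 0.5) * h) for i in range(n))

for nu in (1, 10, 100, 1000):
    exact = math.log(1 + nu ** 2) / (2 * nu)
    # last column: the value at x = 1/nu stays near 1/2, so convergence is not uniform
    print(nu, integral(nu), exact, f(nu, 1 / nu))
```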

Integrals over measurable sets. Let A be a measurable set. Just as in Section 5-4, a function f is called measurable on A if, for every real c, $\{x \in A : f(x) > c\}$ is measurable. If the function $f_A$ (p. 151) is integrable, then f is integrable over A and we set
\[ \int_A f \, dV = \int f_A \, dV. \]
If f and g are integrable over A, then f + g is integrable over A. This follows from Corollary 2, since $(f + g)_A = f_A + g_A$. Similarly cf is integrable over A.
The basic properties of integrals listed in Theorem 13, p. 152, remain true. The formula $V(A) = \int_A 1 \, dV$ was established above, and the remaining parts of Theorem 13 are proved exactly as before.

The monotone sequences theorem, Fatou’s lemma, and Lebesgue’s dominated convergence theorem remain true for integrals over A. In these theorems one has simply to write $\int_A$ in place of $\int$, and to replace the phrases

“measurable” by “measurable on A,”
“integrable” by “integrable over A,”
“for every x” by “for every $x \in A$,”
“almost everywhere” by “almost everywhere in A,”
“$|f_\nu| \le g$” by “$|f_\nu(x)| \le g(x)$ for every $x \in A$.”
In each case let $F_\nu = (f_\nu)_A$, $F = f_A$. Then
\[ \int F_\nu \, dV = \int_A f_\nu \, dV, \qquad \int F \, dV = \int_A f \, dV. \]
If, for every $x \in A$, $0 \le f_1(x) \le f_2(x) \le \cdots$ and $f_\nu(x)$ tends to f(x) as $\nu \to \infty$, then $0 \le F_1(x) \le F_2(x) \le \cdots$ and $F_\nu(x)$ tends to F(x) for every $x \in E^n$. Applying the monotone sequences theorem to the sequence $F_1, F_2, \ldots$, we get
\[ \int_A f \, dV = \lim_{\nu\to\infty} \int_A f_\nu \, dV. \]
The proofs of Fatou’s lemma and the dominated convergence theorem for integrals over A are similar.

Corollary 4. Let A be a bounded measurable set of finite measure and $f_1, f_2, \ldots$ measurable on A. Assume that:
(a) $\lim_{\nu\to\infty} f_\nu(x) = f(x)$ almost everywhere in A.
(b) There is a number C such that $|f_\nu(x)| \le C$ for every $x \in A$ and $\nu = 1, 2, \ldots$ Then
\[ \int_A f \, dV = \lim_{\nu\to\infty} \int_A f_\nu \, dV. \]

Proof. Let g(x) = C for every x. Since V(A) is finite, g is integrable over A. The corollary is then a special case of the dominated convergence theorem for integrals over A. ∎

Corollary 5. (a) Let $A_1, A_2, \ldots$ be a nondecreasing sequence of measurable sets. Let f be integrable over $A = A_1 \cup A_2 \cup \cdots$ Then
\[ \int_A f \, dV = \lim_{\nu\to\infty} \int_{A_\nu} f \, dV. \tag{5-55} \]

(b) Let $A_1, A_2, \ldots$ be a nonincreasing sequence of measurable sets, and let $A = A_1 \cap A_2 \cap \cdots$ Then (5-55) holds provided f is integrable over $A_1$.
Proof of (a). Let $f_\nu = f_{A_\nu}$, $\nu = 1, 2, \ldots$ Then $\lim_{\nu\to\infty} f_\nu(x) = f(x)$ for every $x \in A$ and $|f_\nu(x)| \le |f(x)|$. The conclusion follows from the dominated convergence theorem, with $g = |f|$. ∎

The proof of (b) is similar.

Note: If $f \ge 0$ and the sequence $A_1, A_2, \ldots$ is nondecreasing, then we could have appealed instead to the monotone sequences theorem. In that case it is unnecessary to assume that f is integrable over A. This observation is useful in proving the next corollary.

Corollary 6. Let A be $\sigma$-compact, f be continuous on A, and $f \ge 0$. Then
\[ \int_A f \, dV = \sup \left\{ \int_K f \, dV : K \subset A,\ K \text{ is compact} \right\}. \tag{5-56} \]

Proof. Let s denote the right-hand side of (5-56). Since $\int_K f \, dV \le \int_A f \, dV$ whenever $K \subset A$, we must have $s \le \int_A f \, dV$. Let $K_1, K_2, \ldots$ be a nondecreasing sequence of compact sets such that $A = K_1 \cup K_2 \cup \cdots$ Then $\int_{K_\nu} f \, dV \le s$ for each $\nu = 1, 2, \ldots$ Setting $A_\nu = K_\nu$ in (5-55), we find that $\int_A f \, dV \le s$. Hence $\int_A f \, dV = s$. ∎

The right-hand side of (5-56) was taken in Section 5-6 as the definition. Corollary 6 shows that the definition there agrees with the one in the present section in case $f \ge 0$. Since the procedure for defining the integral when f also has negative values was the same in both sections [see (5-28) and (5-51)], the two definitions agree in general.

PROBLEMS

1. Show that:
(a) If A and B are measurable, then A − B is measurable. [Hint: $(A - B) \cap U_r = A \cap U_r - B \cap U_r$.]
(b) If $B \subset A$ and A has finite measure, then V(A − B) = V(A) − V(B).
(c) If $A_1, A_2, \ldots$ are measurable, then $A_1 \cap A_2 \cap \cdots$ is measurable. [Hint: $\left(\bigcap_{m=1}^{\infty} A_m\right)^c = \bigcup_{m=1}^{\infty} A_m^c$.]
(d) If $A_1, A_2, \ldots$ are $\sigma$-compact, then $A_1 \cup A_2 \cup \cdots$ is $\sigma$-compact. [Hint: Let $A_m = K_{m1} \cup K_{m2} \cup \cdots$, where $K_{m1} \subset K_{m2} \subset \cdots$ Let $K_\nu = \bigcup_{j,k=1}^{\nu} K_{jk}$.]
(e) If A is any measurable set, then $A = B \cup N$ where B is $\sigma$-compact and N is a null set. [Hint: Show that the result is true for bounded measurable sets.]

2. Let f and g be measurable real-valued functions. Show that their product is measurable. [Hint: Show that the square of a measurable function is measurable. Then: $2fg = (f + g)^2 - f^2 - g^2$.]

3. Show that:
(a) If A is measurable, N is a null set, and $A - N \subset B \subset A \cup N$, then B is measurable and V(A) = V(B). [Hint: First consider the case of bounded sets.]
(b) If f is measurable and f(x) = g(x) almost everywhere, then g is measurable.
(c) Moreover, if f is integrable, then $\int f \, dV = \int g \, dV$. [Hint: The result is known from Section 5-4 to be true if f and g are bounded and have compact supports.]
4. Let f be measurable and $f \ge 0$. Show that if $\int f \, dV = 0$, then f(x) = 0 almost everywhere.
5. Let $f_\nu(x) = \sin \nu\pi x$. Show that:
(a) $\liminf_{\nu\to\infty} f_\nu(x) = -1$ and $\limsup_{\nu\to\infty} f_\nu(x) = 1$ whenever x is irrational. [Hint: If x is irrational, then every arc of the circle $s^2 + t^2 = 1$ contains $(\cos \nu\pi x, \sin \nu\pi x)$ for infinitely many $\nu$.]
(b) $\int_0^1 (\liminf_{\nu\to\infty} f_\nu) \, dx = -1$, $\liminf_{\nu\to\infty} \int_0^1 f_\nu \, dx = 0$.

6. Let $f_\nu(x) = \nu$ if $x \in (0, \nu^{-1})$ and $f_\nu(x) = 0$ otherwise, $\nu = 1, 2, \ldots$ Show that $\lim_{\nu\to\infty} f_\nu(x) = 0$ for every x, but $\int f_\nu \, dx = 1$. Why does this not contradict Lebesgue’s dominated convergence theorem?
7. Let $f_\nu(x) = \nu (x^2 + \nu^2)^{-1}$, $\nu = 1, 2, \ldots$ Show that $0 < f_\nu(x) \le 1$, $\lim_{\nu\to\infty} f_\nu(x) = 0$ for every x, and $\int f_\nu \, dx = \pi$. Why does this not contradict the dominated convergence theorem?
8. (a) Let $f_1, f_2, \ldots$ be integrable over a measurable set A, and assume that $\sum_{k=1}^{\infty} \int_A |f_k| \, dV$ is finite. Let $G(x) = \sum_{k=1}^{\infty} |f_k(x)|$. Since the terms of this series are nonnegative, it either converges or diverges to $+\infty$ for every $x \in A$. Show that G is integrable over A and hence G(x) is finite for almost every $x \in A$. [Hint: Apply the monotone sequences theorem to the sequence $G_1, G_2, \ldots$, where $G_\nu = |(f_1)_A| + \cdots + |(f_\nu)_A|$.]
(b) By (a) the series $\sum_{k=1}^{\infty} f_k(x)$ converges absolutely for almost every $x \in A$. Let F(x) be the sum of the series. Show that $\int_A F \, dV = \sum_{k=1}^{\infty} \int_A f_k \, dV$. [Hint: Apply the dominated convergence theorem to the sequence $F_1, F_2, \ldots$, where $F_\nu = (f_1)_A + \cdots + (f_\nu)_A$.]
9. Let $f_1, f_2, \ldots$ be integrable over A, where A has finite measure. Assume that $|f_k(x)| \le C_k$ for every $x \in A$ and $k = 1, 2, \ldots$, and that the series $\sum_{k=1}^{\infty} C_k$ converges. Show that the series $\sum_{k=1}^{\infty} f_k(x)$ converges absolutely for almost every $x \in A$; and if F(x) is the sum of the series, then $\int_A F \, dV_n = \sum_{k=1}^{\infty} \int_A f_k \, dV_n$. [Hint: Use Problem 8.]

5-11 DIFFERENTIATION UNDER THE INTEGRAL SIGN


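Let A be a $\sigma$-compact subset of $E^n$, B be an open subset of $E^l$, and $A \times B = \{(x, t) : x \in A, t \in B\}$ be their cartesian product. In this section we are concerned with the validity of the formula
\[ \frac{\partial}{\partial t^j} \int_A f(x, t) \, dV_n(x) = \int_A \frac{\partial f}{\partial t^j}(x, t) \, dV_n(x), \qquad j = 1, \ldots, l. \tag{5-57} \]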

Lemma 1. Let f be continuous on $A \times B$. Assume that there is a function g integrable over A such that $|f(x, t)| \le g(x)$ for every $x \in A$, $t \in B$. Let
\[ \phi(t) = \int_A f(x, t) \, dV_n(x), \qquad t \in B. \tag{5-58} \]
Then $\phi$ is continuous on B.
Proof. Let $t_0$ be any point of B. If $t_1, t_2, \ldots$ is any sequence in B tending to $t_0$, then since $f(x, \cdot)$ is continuous at $t_0$
\[ f(x, t_0) = \lim_{m\to\infty} f(x, t_m) \quad \text{for every } x \in A. \]
Since $|f(x, t_m)| \le g(x)$, by Lebesgue’s dominated convergence theorem
\[ \int_A f(x, t_0) \, dV_n(x) = \lim_{m\to\infty} \int_A f(x, t_m) \, dV_n(x). \]
Thus $\phi(t_m) \to \phi(t_0)$ as $m \to \infty$. Since this is true for every sequence in B tending to $t_0$, $\phi$ is continuous at $t_0$. ∎
Lemma 2. Let l = 1. Assume that f and $\partial f/\partial t$ are continuous on $A \times B$ and satisfy
\[ |f(x, t)| \le g(x), \qquad \left| \frac{\partial f}{\partial t}(x, t) \right| \le h(x) \quad \text{for every } x \in A,\ t \in B, \]
where g and h are integrable over A. Then
\[ \phi'(t) = \int_A \frac{\partial f}{\partial t}(x, t) \, dV_n(x). \tag{5-59} \]

Proof. Let $t_0 \in B$, and let $\delta > 0$ be such that B contains the interval $(t_0 - \delta, t_0 + \delta)$. If $0 < u < \delta$, then $\partial f/\partial t$ is integrable over $A \times [t_0, t_0 + u]$. This set is $\sigma$-compact, and by the iterated integrals theorem,
\[ \int_{t_0}^{t_0+u} dt \int_A \frac{\partial f}{\partial t}(x, t) \, dV_n(x) = \int_A \left\{ \int_{t_0}^{t_0+u} \frac{\partial f}{\partial t}(x, t) \, dt \right\} dV_n(x). \tag{*} \]
By the fundamental theorem of calculus the inner integral on the right-hand side is $f(x, t_0 + u) - f(x, t_0)$. If we let $\psi(t)$ denote the right-hand side of (5-59), then (*) becomes
\[ \int_{t_0}^{t_0+u} \psi(t) \, dt = \phi(t_0 + u) - \phi(t_0). \tag{**} \]
Similarly (**) is true when $-\delta < u < 0$. Lemma 1 implies that $\psi$ is continuous; hence by the fundamental theorem of calculus, $\psi(t_0) = \phi'(t_0)$. ∎
If, instead of an open set, B is a closed interval [a, b], then the proof of Lemma 2 shows that (5-59) is still true provided $\phi'(t)$ means the one-sided derivative at the endpoints.

Theorem 18. Let f and $\partial f/\partial t^j$, $j = 1, \ldots, l$, be continuous on $A \times B$ and satisfy
\[ |f(x, t)| \le g(x), \qquad \left| \frac{\partial f}{\partial t^j}(x, t) \right| \le h_j(x) \quad \text{for every } x \in A,\ t \in B, \]
where g and $h_1, \ldots, h_l$ are integrable over A. Then the function $\phi$ in (5-58) is of class $C^{(1)}$ on B and its partial derivatives are given by (5-57).

Proof. Applying Lemma 2 with $t^1, \ldots, t^{j-1}, t^{j+1}, \ldots, t^l$ fixed, we get
\[ \frac{\partial \phi}{\partial t^j} = \int_A \frac{\partial f}{\partial t^j}(x, t) \, dV_n(x), \qquad j = 1, \ldots, l. \]
Applying Lemma 1 to the function $\partial f/\partial t^j$, we find that $\partial \phi/\partial t^j$ is continuous on B for each j. Hence $\phi$ is of class $C^{(1)}$. ∎

Corollary. The conclusion of the theorem holds if A is compact and f together with $\partial f/\partial t^j$, $j = 1, \ldots, l$, are continuous on $A \times B$.

Proof. Let U be any neighborhood whose closure is contained in B. Since $A \times \operatorname{cl} U$ is compact, $|f(x, t)|$ and
\[ \left| \frac{\partial f}{\partial t^j}(x, t) \right|, \qquad j = 1, \ldots, l, \]
are bounded on $A \times U$ by some number C. Let $g = h_j = C$. By the theorem with B replaced by U, $\phi$ is of class $C^{(1)}$ on U and (5-57) holds there. Since this is true for every such U, $\phi$ is of class $C^{(1)}$ on B and (5-57) holds for every $t \in B$. ∎

Example. Let $\phi(t) = \int_1^\infty x^{-1} \exp(-xt) \, dx$, t > 0. Find $\phi'(t)$. Using (5-59)
\[ \phi'(t) = -\int_1^\infty \exp(-xt) \, dx = -\frac{\exp(-t)}{t}. \]
If B is an interval $(a, \infty)$, a > 0, the hypotheses of Lemma 2 are satisfied with $g(x) = h(x) = \exp(-ax)$. The formula for $\phi'(t)$ is correct for all t in any such interval, and hence for every t > 0.
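The example can be checked numerically. The sketch below compares a central finite difference of $\phi$ with the integral of $\partial f/\partial t$ and with the closed form $-\exp(-t)/t$. The cutoff `UPPER`, the step `dt`, and the midpoint rule are assumptions made for the illustration; the improper integral is truncated because the neglected tail is negligible for t near 1.

```python
import math

def midpoint(g, a, b, n=100_000):
    # composite midpoint rule on [a, b]
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

UPPER = 60.0  # assumed cutoff; the neglected tail is below exp(-UPPER * t) / t

def phi(t):
    # phi(t) = integral from 1 to infinity of exp(-x t) / x, truncated at UPPER
    return midpoint(lambda x: math.exp(-x * t) / x, 1.0, UPPER)

def integral_of_partial(t):
    # integral over x of d/dt [exp(-x t) / x] = -exp(-x t), truncated at UPPER
    return midpoint(lambda x: -math.exp(-x * t), 1.0, UPPER)

t, dt = 1.0, 1e-3
finite_difference = (phi(t + dt) - phi(t - dt)) / (2 * dt)
print(finite_difference, integral_of_partial(t), -math.exp(-t) / t)
```

All three printed numbers should agree to several decimal places, which is what (5-59) asserts for this f.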

PROBLEMS
1. Find $\phi'(t)$ if $\phi(t)$ is:

(a) [ log (2 + t”) ap, th 4 (b) Le "exp (xt) sin x dz.


1 _—

2. Let o(t)= fo log (2 — xt?) dx. Show that $’(0)= 0 and that ¢ is concave
on the interval (—1, 1).
3. For $x \in E^1$ let $\phi(x) = \int_0^\infty \exp(-t^2) \cos xt \, dt$. Show that $\phi'(x) = -\tfrac{1}{2} x \phi(x)$ and find $\phi(x)$.

4. Leibnitz’ rule. Show that
\[ \frac{d}{dt} \int_{a(t)}^{b(t)} f(x, t) \, dx = f[b(t), t]\, b'(t) - f[a(t), t]\, a'(t) + \int_{a(t)}^{b(t)} \frac{\partial f}{\partial t}(x, t) \, dx, \]
provided f and $\partial f/\partial t$ are continuous on $[a_0, b_0] \times B$ where B is open, that $a_0 \le a(t) \le b_0$ and $a_0 \le b(t) \le b_0$ for every $t \in B$, and that the functions a, b are of class $C^{(1)}$. [Hint: Let $G(x, t) = \int_{a_0}^{x} f(s, t) \, ds$, so that $\partial G/\partial x = f$. Calculate the derivative of $G[b(t), t] - G[a(t), t]$.]
wileto@) = J.., 0° exp G12) de,t > 1.) Find ot);
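6. Let $\phi(x, t) = \int_{x-ct}^{x+ct} f(s) \, ds$, where c > 0 and f is of class $C^{(1)}$ on $E^1$. Show that $\partial^2\phi/\partial t^2 = c^2 (\partial^2\phi/\partial x^2)$.
7. Let $\phi(x, y) = \int_0^y dt \int_0^x f(s, t) \, ds$, x > 0, y > 0. Show that if f is continuous on $\{(x, y) : x \ge 0, y \ge 0\}$, then $\partial^2\phi/\partial x \, \partial y = f$.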
8. Let
\[ \phi(x) = \int_0^\infty \exp\left(-t^2 - \frac{x^2}{t^2}\right) dt, \quad \text{for } x \in E^1. \]
(a) Show that $\phi(x) = \tfrac{1}{2}\sqrt{\pi} \exp(-2|x|)$ for all x. [Hint: For x > 0, show that $\phi'(x) = -2\phi(x)$ using the substitution s = x/t.]
(b) Note that application of (5-59) gives a false result at x = 0. Why is this not surprising?

9. Let $\phi(t) = \int_0^\infty f(x, t) \, dx$, for t > 0, where $f(x, t) = \exp(-xt)\, x^{-1} \sin x$. Show that:
(a) $\phi'(t) = -1/(1 + t^2)$.
(b) $\phi(t) \to 0$ as $t \to +\infty$. [Hint: Apply the dominated convergence theorem with $f_\nu(x) = f(x, t_\nu)$ where $1 \le t_1 \le t_2 \le \cdots$ and $t_\nu \to +\infty$ as $\nu \to \infty$.]
(c) $\lim_{r\to\infty} \int_0^r \frac{\sin x}{x} \, dx = \frac{\pi}{2}$.

Hints: From (a) and (b), $\phi(t) = \pi/2 - \tan^{-1} t$. Integrate by parts to show that

f sen ae ee ee ot | 2 anr
By the dominated convergence theorem,

im, fi) Gh =| a
(07 1.0 Ome

*5-12 $L^p$-SPACES
Let A be a measurable set of positive measure, and let p be a number such that $p \ge 1$. The collection of all measurable real-valued functions f with domain A such that $|f|^p$ is integrable over A will be denoted by $L^p(A)$. For instance, $L^1(A)$ is just the collection of all real-valued functions integrable over A.
By inequality (1-9),
\[ |f(x) + g(x)|^p \le 2^{p-1} (|f(x)|^p + |g(x)|^p). \]
Hence if $|f|^p$ and $|g|^p$ are integrable over A, then $|f + g|^p$ is also integrable over A. This shows that the sum of two elements in $L^p(A)$ is also in $L^p(A)$. Clearly $cf \in L^p(A)$ if $f \in L^p(A)$ and c is a real number. Thus $L^p(A)$ is a vector space over the real number field. The p-norm of a function $f \in L^p(A)$ is the number
\[ \|f\|_p = \left( \int_A |f|^p \, dV_n \right)^{1/p}. \tag{5-60} \]
The p-norm has the following three properties:
(a) $\|f\|_p = 0$ if and only if f(x) = 0 for almost all $x \in A$.
(b) $\|cf\|_p = |c| \, \|f\|_p$ for every real c.
(c) $\|f + g\|_p \le \|f\|_p + \|g\|_p$.
The proofs of (a) and (b) are left to the reader (Problem 4). Property (c) is called Minkowski’s inequality. Let us defer the proof to p. 203. A finite dimensional version of Minkowski’s inequality was given on p. 34.
Let us regard as equivalent any two functions whose values agree almost everywhere in A. Then (a), (b), and (c) state that $L^p(A)$ is a normed vector space (see Problem 6, Section A-6). The distance between two functions $f, g \in L^p(A)$ is $\|f - g\|_p$. It is called the distance in mean of order p. A sequence $f_1, f_2, \ldots$ converges to f in mean of order p if $\|f_m - f\|_p \to 0$ as $m \to \infty$. If for every $\varepsilon > 0$ there exists N such that $\|f_l - f_m\|_p < \varepsilon$ for every $l, m \ge N$, then the sequence $f_1, f_2, \ldots$ is Cauchy in mean of order p.
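Properties (b) and (c) of the p-norm can be spot-checked on discretized functions. In the sketch below the integral over A = (0, 1) is replaced by a Riemann sum with equal cell weights; the helper `p_norm` and the sample functions are illustrative assumptions. Minkowski's inequality also holds exactly for such weighted sums, so the comparison is meaningful up to rounding.

```python
import math

def p_norm(values, weight, p):
    """Riemann-sum stand-in for (integral of |f|^p)^(1/p); `weight` is the cell volume."""
    return (weight * sum(abs(v) ** p for v in values)) ** (1.0 / p)

n = 1000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]
f = [math.sin(7 * x) for x in xs]
g = [x ** 2 - 0.3 for x in xs]

for p in (1, 1.5, 2, 4):
    lhs = p_norm([a + b for a, b in zip(f, g)], h, p)   # norm of f + g
    rhs = p_norm(f, h, p) + p_norm(g, h, p)             # sum of the norms
    print(p, lhs, rhs, lhs <= rhs)
```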

Theorem. Every Cauchy sequence $f_1, f_2, \ldots$ in $L^p(A)$ converges in mean of order p to a limit $f \in L^p(A)$.

This theorem is one of the remarkable features of the Lebesgue theory.

Proof of Theorem. Let $f_1, f_2, \ldots$ be a Cauchy sequence in $L^p(A)$. There is an increasing sequence of positive integers $N_1, N_2, \ldots$ such that $\|f_m - f_l\|_p < 2^{-k-k/p}$ for every $m, l \ge N_k$. Let $g_k = f_{N_k}$ and let
\[ F(x) = \sum_{k=1}^{\infty} 2^{kp} |g_k(x) - g_{k+1}(x)|^p. \tag{*} \]
Since its terms are nonnegative, this series either converges or diverges to $+\infty$ for every $x \in A$. But
\[ \sum_{k=1}^{\infty} \int_A 2^{kp} |g_k - g_{k+1}|^p \, dV = \sum_{k=1}^{\infty} 2^{kp} (\|g_k - g_{k+1}\|_p)^p < \sum_{k=1}^{\infty} 2^{-k}. \]
Since the series on the right converges, so does the one on the left. Therefore F is integrable over A and $B = \{x \in A : F(x) = +\infty\}$ is a null set [Problem 8(a), p. 197]. Now each term of a nonnegative series is no more than the sum of the series. Applying this observation to (*), we find that for $x \in A - B$
\[ |g_k(x) - g_{k+1}(x)| \le 2^{-k} [F(x)]^{1/p}. \]
Therefore, for $s = 1, 2, \ldots$ and $x \in A - B$,
\[ |g_k(x) - g_{k+s}(x)| \le \sum_{r=0}^{s-1} |g_{k+r}(x) - g_{k+r+1}(x)| \le [F(x)]^{1/p} \sum_{r=0}^{s-1} 2^{-k-r} < 2^{1-k} [F(x)]^{1/p}. \tag{**} \]
Since the right-hand side approaches 0 as $k \to \infty$, the sequence of real numbers $g_1(x), g_2(x), \ldots$ is Cauchy. Let
\[ f(x) = \lim_{k\to\infty} g_k(x), \quad \text{if } x \in A - B, \]
and let f(x) = 0 if $x \in B$. Letting $s \to \infty$ in (**),
\[ |g_k(x) - f(x)| \le 2^{1-k} [F(x)]^{1/p} \]
for $x \in A - B$. Since B is a null set,
\[ \int_A |g_k - f|^p \, dV \le 2^{(1-k)p} \int_A F \, dV. \]
Since F is integrable over A, the right-hand side tends to 0. Therefore
\[ \lim_{k\to\infty} \|g_k - f\|_p = 0. \]
If $m \ge N_k$, then $\|g_k - f_m\|_p < 2^{-k-k/p}$ (recall the definition of $g_k$). Using Minkowski’s inequality
\[ \|f_m - f\|_p \le \|f_m - g_k\|_p + \|g_k - f\|_p. \]
Since the right-hand side tends to 0 as $k \to \infty$, this shows that
\[ \lim_{m\to\infty} \|f_m - f\|_p = 0. \] ∎

Note: The idea of convergent sequence makes sense in any metric space S. If every Cauchy sequence in S converges, then S is called a complete metric space. A normed vector space U which is complete is called a Banach space. The theorem above states that $L^p(A)$ is a Banach space.
If p > 1, then the number $p'$ such that
\[ \frac{1}{p} + \frac{1}{p'} = 1 \]
is called conjugate to p.

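Theorem. If $f \in L^p(A)$ and $g \in L^{p'}(A)$, then fg is integrable over A and
\[ \|fg\|_1 \le \|f\|_p \, \|g\|_{p'} \quad \text{(Hölder’s inequality)}. \tag{5-61} \]

Proof. Let $\phi(t) = t^{1/p}$ for $t \ge 0$. Since $0 < 1/p < 1$ the function $\phi$ is concave. Hence $\phi(t) \le \phi(1) + \phi'(1)(t - 1)$, or
\[ t^{1/p} \le 1 + \frac{1}{p}(t - 1) = \frac{t}{p} + \frac{1}{p'}. \]
Setting $t = u^p v^{-p'}$, where v > 0, we find since $1 - p' = -p'/p$ that
\[ uv \le \frac{u^p}{p} + \frac{v^{p'}}{p'}. \tag{5-62} \]
Obviously this inequality also holds when v = 0.
If $\|f\|_p = 0$, then f(x) = 0 almost everywhere in A and both sides of Hölder’s inequality are 0. Similarly both sides are 0 if $\|g\|_{p'} = 0$. Suppose that $\|f\|_p > 0$, $\|g\|_{p'} > 0$, and let
\[ \tilde{f} = (1/\|f\|_p) f, \qquad \tilde{g} = (1/\|g\|_{p'}) g. \]
Then
\[ \int_A |\tilde{f}|^p \, dV = \int_A |\tilde{g}|^{p'} \, dV = 1, \]
and setting $u = |\tilde{f}(x)|$, $v = |\tilde{g}(x)|$ in (5-62)
\[ \int_A |\tilde{f}(x) \tilde{g}(x)| \, dV(x) \le \frac{1}{p} \int_A |\tilde{f}|^p \, dV + \frac{1}{p'} \int_A |\tilde{g}|^{p'} \, dV = \frac{1}{p} + \frac{1}{p'} = 1. \]
But the left-hand side is $\|fg\|_1 / \|f\|_p \|g\|_{p'}$. This proves Hölder’s inequality. ∎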
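Inequality (5-62) and Hölder's inequality can be spot-checked numerically. In this sketch the helper names `conjugate`, `young_gap`, and `holder_sides` are hypothetical, and the integrals over A = (0, 1) are replaced by Riemann sums with equal cell weights, for which Hölder's inequality also holds exactly.

```python
import math

def conjugate(p):
    # p' with 1/p + 1/p' = 1, for p > 1
    return p / (p - 1.0)

def young_gap(u, v, p):
    """u^p/p + v^p'/p' - u*v, which (5-62) says is nonnegative for u, v >= 0."""
    q = conjugate(p)
    return u ** p / p + v ** q / q - u * v

def holder_sides(f, g, weight, p):
    # discrete versions of ||fg||_1 and ||f||_p * ||g||_p'
    q = conjugate(p)
    lhs = weight * sum(abs(a * b) for a, b in zip(f, g))
    norm_f = (weight * sum(abs(a) ** p for a in f)) ** (1 / p)
    norm_g = (weight * sum(abs(b) ** q for b in g)) ** (1 / q)
    return lhs, norm_f * norm_g

n = 1000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]
f = [math.exp(x) for x in xs]
g = [math.cos(5 * x) for x in xs]
print(young_gap(2.0, 3.0, 3.0))    # strictly positive
print(young_gap(1.0, 1.0, 3.0))    # equality case u^p = v^p', zero up to rounding
print(holder_sides(f, g, h, 3.0))  # first entry does not exceed the second
```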

Note: The finite dimensional version of Hölder’s inequality (Problem 8, p. 134) expresses the fact that the p-norm on $E^n$ and the $p'$-norm on the dual space $(E^n)^*$ are dually related. There is an infinite dimensional analog which we state without proof. Let $[L^p(A)]'$ denote the set of all real-valued linear functions on $L^p(A)$ which are continuous. On $[L^p(A)]'$ there is defined a dual norm, just as in the finite dimensional case. Then $[L^p(A)]'$ is isomorphic with $L^{p'}(A)$, and the dual norm is just the $p'$-norm. (See [15], p. 211.)

Proof of Minkowski’s inequality.
\[ \int_A |f + g|^p \, dV \le \int_A |f| \, |f + g|^{p-1} \, dV + \int_A |g| \, |f + g|^{p-1} \, dV. \tag{*} \]
By Hölder’s inequality,
\[ \int_A |f| \, |f + g|^{p-1} \, dV \le \left( \int_A |f|^p \, dV \right)^{1/p} \left( \int_A |f + g|^{(p-1)p'} \, dV \right)^{1/p'}. \]
But $(p - 1)p' = p$. Estimating similarly the last term in (*), we get
\[ \int_A |f + g|^p \, dV \le (\|f\|_p + \|g\|_p) \left( \int_A |f + g|^p \, dV \right)^{1/p'}. \]
If $\|f + g\|_p = 0$, both sides are 0. Otherwise we divide both sides by $(\int_A |f + g|^p \, dV)^{1/p'}$. Since $1 - 1/p' = 1/p$, we get Minkowski’s inequality. ∎

If $p = 1$, then (formally) $p' = \infty$. Let us call $f$ essentially bounded if
$f$ is equivalent to a bounded function $g$ [equivalent means that $f(x) = g(x)$ almost
everywhere in $A$]. Let $L^\infty(A)$ be the collection of all essentially bounded measurable
functions. For $f \in L^\infty(A)$ let $\|f\|_\infty = \operatorname{ess\,sup} \{|f(x)| : x \in A\}$, where the
right-hand side means $\inf \{(\sup \{|g(x)| : x \in A\}) : g \text{ is equivalent to } f\}$. Then
Hölder's inequality is still true when $p = 1$.
If $p = 2$, then $p' = 2$. In $L^2(A)$ let us introduce an inner product $\cdot$
as follows:
$$f \cdot g = \int_A fg \, dV.$$
Then $f \cdot f = (\|f\|_2)^2$. Moreover $|f \cdot g| \le \|fg\|_1$, and hence

$$|f \cdot g| \le \|f\|_2 \, \|g\|_2. \tag{5-63}$$

This formula corresponds to Cauchy's inequality in $E^n$. In fact, the space
$L^2(A)$ is an infinite dimensional analog of euclidean $E^n$.
An inner product space $H$ is a vector space with an inner product $\cdot$ satisfying
a list of axioms corresponding to those on p. 6. If $H$ is infinite dimensional
and complete, then $H$ is a Hilbert space. The preceding theorem shows
that $L^2(A)$ is a Hilbert space.

PROBLEMS
1. Let $f(x) = |x|^{-\alpha}$. Show that if $A$ is the unit $n$-ball, then $f \in L^p(A)$ for $p < n/\alpha$
but not for $p \ge n/\alpha$. [Hint: See Section 5-6, Example (4).]
2. Let $f(x) = x^{-1} (\log x)^{-2}$, $A = (0, \tfrac{1}{2})$. Show that
$f \in L^p(A)$ only for $p = 1$.
3. Let $1 \le q < p$. Using Hölder's inequality show that if $A$ has finite measure, then
$f \in L^p(A)$ implies that $f \in L^q(A)$. Give an example to show that it is necessary
to assume $A$ has finite measure. [Hint: Apply Hölder's inequality to the functions
$|f|^q$ and 1 with $p$ replaced by $p/q$.]
4. Prove properties (a) and (b) (see p. 201) of the p-norm.
CHAPTER 6

Exterior Algebra and Differential Calculus

In this chapter we shall introduce the calculus of differential forms, which
also goes by the name "exterior differential calculus." We recall from Section 2-6
that a differential form of degree 1 is a covector-valued function. In
order to define differential forms of higher degree $r > 1$ we first introduce
multicovectors of degree $r$. For brevity, they are called $r$-covectors. An
$r$-covector is an alternating, multilinear function with domain the $r$-fold cartesian
product $E^n \times \cdots \times E^n$. It turns out that the $r$-covectors form a vector
space of dimension $\binom{n}{r}$, which is denoted by $(E^n_r)^*$.
Dually, an alternating, multilinear function with domain the $r$-fold cartesian
product $(E^n)^* \times \cdots \times (E^n)^*$ is called an $r$-vector. The $r$-vectors form
a vector space $E^n_r$, whose dual space turns out to be $(E^n_r)^*$.
There is a natural product for multicovectors called the exterior product
and denoted by the symbol $\wedge$. If $\omega$ is an $r$-covector and $\zeta$ an $s$-covector, then
$\omega \wedge \zeta$ is a certain $(r + s)$-covector. Dually, the exterior product of an $r$-vector
$\alpha$ with an $s$-vector $\beta$ is an $(r + s)$-vector $\alpha \wedge \beta$. The exterior product is
associative, and it is commutative except for a possible sign change (Proposition 20).
Certain multivectors, called decomposable, have an interesting geometric
interpretation. An $r$-vector $\alpha$ is decomposable if there are 1-vectors $h_1, \dots, h_r$
such that $\alpha = h_1 \wedge \cdots \wedge h_r$. It turns out (Theorem 19) that if $\alpha \ne 0$,
then $h_1, \dots, h_r$ span an $r$-dimensional vector subspace $P$ of $E^n$. With $\alpha$ is
associated an orientation of $P$. If two of the vectors $h_1, \dots, h_r$ are interchanged,
then $\alpha$ changes sign and the orientation of $P$ changes. The norm $|\alpha|$ of a
decomposable $r$-vector $\alpha$ equals the $r$-dimensional measure of a certain $r$-parallelepiped.
A differential form of degree $r$ (called for brevity an $r$-form) is defined as
an $r$-covector-valued function.

Every $r$-form $\omega$ of class $C^{(1)}$ has an exterior differential $d\omega$, which is a
form of degree $r + 1$. The usual formulas for the differentials of sums and
products remain true except for a possible sign change in the product rule.
Another important fact is that $d(d\omega) = 0$ for any form $\omega$ of class $C^{(2)}$. Besides
its differential, w has a codifferential dw which is a form of degree r—1. In the
next chapter the codifferential is used only for r = 1, in which case it becomes
the divergence. In the last section of the chapter the basic formulas of vector
analysis in $E^3$ are derived.

6-1 ALTERNATING MULTILINEAR FUNCTIONS


Let us call a real-valued function $L$ with domain $E^n$ 1-linear if $L$ is linear.
For any integer $r \ge 1$ we shall now consider functions called $r$-linear. For
simplicity let us first consider $r = 2$. Let $B$ be a real-valued function with
domain the cartesian product $E^n \times E^n$. The elements of $E^n \times E^n$ are pairs
of vectors, denoted by $(h, k)$. We recall from p. 29 that the function $B$ is bilinear
if $B(h, \ )$ and $B(\ , k)$ are linear functions for every $(h, k)$. It was shown there
that if $B$ is bilinear and

$$\omega_{ij} = B(e_i, e_j), \qquad i, j = 1, \dots, n, \tag{6-1a}$$

then for every $(h, k)$

$$B(h, k) = \sum_{i,j=1}^{n} \omega_{ij} h^i k^j. \tag{6-2a}$$

In this chapter we are interested in a special class of $r$-linear functions,
called alternating. For $r = 2$, $B$ is alternating if $B(h, k) = -B(k, h)$ for every
$(h, k)$. If $B$ is bilinear and alternating, then $\omega_{ij} = -\omega_{ji}$, and in particular
$\omega_{ii} = 0$. Formula (6-2a) can be rewritten

$$B(h, k) = \sum_{i<j} (\omega_{ij} h^i k^j + \omega_{ji} h^j k^i),$$

or
$$B(h, k) = \sum_{i<j} \omega_{ij} (h^i k^j - h^j k^i). \tag{6-3}$$

Conversely, given $n(n - 1)/2$ numbers $\omega_{ij}$, $i < j$, formula (6-3) defines an
alternating bilinear function.
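The converse statement can be tried out numerically. The Python sketch below builds $B$ from given numbers $\omega_{ij}$, $i < j$, by formula (6-3); the function name `alternating_bilinear` and the sample components are assumptions made for illustration only, with vectors stored as lists and indices counted from 1 as in the text.

```python
def alternating_bilinear(omega, n):
    """Build B from components omega[(i, j)], i < j (1-based), via formula (6-3)."""
    def B(h, k):
        return sum(
            omega.get((i, j), 0.0) * (h[i - 1] * k[j - 1] - h[j - 1] * k[i - 1])
            for i in range(1, n + 1)
            for j in range(i + 1, n + 1)
        )
    return B

# Sample data for n = 3 (arbitrary illustrative values).
B = alternating_bilinear({(1, 2): 2.0, (1, 3): -1.0, (2, 3): 4.0}, 3)
h, k = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]
value = B(h, k)  # 2*(1) + (-1)*(-1) + 4*(-5) = -17
```

Interchanging the arguments changes the sign, and $B(h, h) = 0$, as the alternating property requires.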
Similarly, for any $r \ge 2$ let $M$ be a real-valued function with domain the
$r$-fold cartesian product $E^n \times \cdots \times E^n$. The elements of $E^n \times \cdots \times E^n$ are
$r$-tuples of vectors, denoted by $(h_1, \dots, h_r)$.

Definition. The function $M$ is multilinear of degree $r$ if for each $l = 1, \dots, r$
and $h_1, \dots, h_{l-1}, h_{l+1}, \dots, h_r$ the function $M(h_1, \dots, h_{l-1}, \ , h_{l+1}, \dots, h_r)$
is linear.

For brevity we write $r$-linear instead of multilinear of degree $r$. When
$r = 2$ we wrote $h_1 = h$, $h_2 = k$. The new definition agrees for $r = 2$ with the
definition of bilinear function. The formula which generalizes (6-2a) to multilinear
functions is

$$M(h_1, \dots, h_r) = \sum_{i_1, \dots, i_r = 1}^{n} \omega_{i_1 \cdots i_r} h_1^{i_1} \cdots h_r^{i_r}, \tag{6-2b}$$

where
$$\omega_{i_1 \cdots i_r} = M(e_{i_1}, \dots, e_{i_r}). \tag{6-1b}$$

This is proved by induction on $r$.

Interchanges. Let $S$ be some set. For our purposes $S$ will be either a set of
vectors or a set of integers. If $(p_1, \dots, p_r)$ and $(p'_1, \dots, p'_r)$ are $r$-tuples
of elements of $S$, let us say that the second $r$-tuple is obtained from the first
by interchanging $p_s$ and $p_t$ if $p'_s = p_t$, $p'_t = p_s$, and $p'_l = p_l$ for $l \ne s, t$.
Examples. The triple of vectors $(h_3, h_2, h_1)$ is obtained from $(h_1, h_2, h_3)$ by
interchanging $h_1$ and $h_3$. The 4-tuple of integers $(1, 5, 3, 7)$ is obtained from the 4-tuple
$(1, 7, 3, 5)$ by interchanging 5 and 7.

Definition. An $r$-linear function $M$ is alternating if $M(h_1, \dots, h_r)$ changes
sign whenever two vectors in an $r$-tuple $(h_1, \dots, h_r)$ are interchanged.

We know that the sum of two linear functions is a linear function. From
this fact and the definition of multilinear function, the sum $M + N$ of two
$r$-linear functions $M$ and $N$ is $r$-linear. If $M$ and $N$ are alternating, then $M + N$
is alternating. Similarly, if $c$ is a scalar then $cM$ is $r$-linear when $M$ is $r$-linear
and alternating when $M$ is alternating.
Let $(E^n_r)^*$ denote the set of all alternating, $r$-linear functions with domain
$E^n \times \cdots \times E^n$. By the remarks just made, $(E^n_r)^*$ satisfies the axioms for a
vector space. Let us now prove two propositions which enable us to find the
dimension of $(E^n_r)^*$ and a basis for it.
An $r$-tuple $(h_1, \dots, h_r)$ is called linearly dependent if there exist scalars
$c^1, \dots, c^r$, not all 0, such that $c^1 h_1 + \cdots + c^r h_r = 0$.
Proposition 18. Let $M$ be $r$-linear and alternating. If $(h_1, \dots, h_r)$ is a
linearly dependent $r$-tuple, then $M(h_1, \dots, h_r) = 0$.

Proof. First of all, the conclusion is true if some vector in the $r$-tuple is
repeated. For instance, suppose that $h_1 = h_2$. Since $M$ is alternating,

$$M(h_1, h_2, h_3, \dots, h_r) = -M(h_2, h_1, h_3, \dots, h_r).$$

Then $M(h_1, h_2, h_3, \dots, h_r)$ is its own negative, and must be 0.
Suppose for instance that $h_r$ is a linear combination of the vectors preceding it,
$$h_r = c^1 h_1 + \cdots + c^{r-1} h_{r-1}.$$
Since $M(h_1, \dots, h_{r-1}, \ )$ is a linear function,

$$M(h_1, \dots, h_r) = \sum_{l=1}^{r-1} c^l M(h_1, \dots, h_{r-1}, h_l).$$

In the $l$th term on the right-hand side, the vector $h_l$ is repeated, and hence
each term is 0. Thus $M(h_1, \dots, h_r) = 0$. ∎

For any $r \ge 2$ there is the trivial alternating $r$-linear function 0, which
has the value 0 for every $r$-tuple $(h_1, \dots, h_r)$. If $r > n$, then $(h_1, \dots, h_r)$
must be linearly dependent and from the proposition we get the following.

Corollary. If $r > n$, then 0 is the only alternating $r$-linear function.

Therefore let us suppose that $r \le n$. It is now convenient to introduce
some more notation. The letter $\lambda$ will denote an $r$-tuple of integers,

$$\lambda = (i_1, \dots, i_r),$$

where $1 \le i_k \le n$ for each $k = 1, \dots, r$. There are $n^r$ such $r$-tuples of
integers. If $i_1 < \cdots < i_r$, then $\lambda$ is called an increasing $r$-tuple. There are
$\binom{n}{r}$ increasing $r$-tuples, where $\binom{n}{r} = n!/r!(n - r)!$ is a binomial coefficient. We
write $\sum_{\lambda}$ for a sum over all $r$-tuples and $\sum_{[\lambda]}$ for a sum over all increasing
$r$-tuples.
The following generalization of the Kronecker symbol $\delta^i_j$ will be used.
Let $\lambda = (i_1, \dots, i_r)$, $\mu = (j_1, \dots, j_r)$ be $r$-tuples of integers. Then $\delta^{i_k}_{j_l}$ is an
element of an $r \times r$ matrix; it is 1 if $i_k = j_l$, 0 otherwise. Let

$$\delta^\lambda_\mu = \det (\delta^{i_k}_{j_l}).$$
The important properties of $\delta^\lambda_\mu$ are:
(1) If no integer is repeated in the $r$-tuple $\lambda$ and $\mu = \lambda$, then $\delta^\lambda_\mu = 1$.
In this case $i_k = i_l$ if and only if $k = l$. Hence $\delta^\lambda_\lambda = \det (\delta^k_l) = 1$.
(2) If no integer is repeated in the $r$-tuple $\mu$ and $\lambda$ is obtained from $\mu$ by $p$
interchanges, then $\delta^\lambda_\mu = (-1)^p$.
Each interchange of elements of $\mu$ interchanges two column vectors of the
matrix $(\delta^{i_k}_{j_l})$ and changes the sign of the determinant. Therefore (2) follows
from (1).
(3) In all other cases, $\delta^\lambda_\mu = 0$.
If some integer is repeated in $\mu$, then two column vectors of the matrix are
the same and the determinant is 0. If the integers $j_1, \dots, j_r$ are distinct and
some $i_k$ does not appear among them, then the $k$th row covector of the matrix
is 0 and the determinant is 0.
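The three properties can be checked on small cases directly from the defining determinant. The sketch below is only an illustration: `det` is a naive permutation-expansion determinant and `delta` a hypothetical helper name, with $r$-tuples written as Python tuples of integers.

```python
from itertools import permutations

def det(matrix):
    """Determinant via the permutation expansion (adequate for small r)."""
    r = len(matrix)
    total = 0
    for perm in permutations(range(r)):
        inversions = sum(
            1 for a in range(r) for b in range(a + 1, r) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row in range(r):
            term *= matrix[row][perm[row]]
        total += term
    return total

def delta(lam, mu):
    """Generalized Kronecker symbol: det of the matrix with entry 1 where i_k == j_l."""
    return det([[1 if i == j else 0 for j in mu] for i in lam])

same = delta((1, 2, 3), (1, 2, 3))           # property (1)
odd = delta((2, 3, 4, 5), (5, 4, 2, 3))      # property (2), odd number of interchanges
repeated = delta((1, 2, 3, 1), (1, 2, 3, 1)) # property (3), an integer is repeated
missing = delta((1, 2), (1, 3))              # property (3), 2 does not appear in mu
```

Here rows are indexed by the entries of $\lambda$ and columns by the entries of $\mu$, matching the convention used in the discussion of properties (2) and (3).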
Now let $M$ be an alternating $r$-linear function. For brevity let us set
$$\omega_\lambda = \omega_{i_1 \cdots i_r}.$$
Sometimes we will still write $\omega_{i_1 \cdots i_r}$ rather than $\omega_\lambda$, particularly when $r \le 3$
or $r = n$. If $\lambda$ is obtained from $\mu$ by one interchange, then $\omega_\lambda = -\omega_\mu$. In
particular, $\omega_\lambda = 0$ if any integer is repeated. If $\lambda$ is obtained from $\mu$ by $p$
interchanges, then $\omega_\mu = (-1)^p \omega_\lambda = \delta^\lambda_\mu \omega_\lambda$.
If $\mu$ has no repetitions, then exactly one increasing $\lambda$ is obtained from $\mu$
by interchanges. Hence for every $\mu$,

$$\omega_\mu = \sum_{[\lambda]} \omega_\lambda \delta^\lambda_\mu, \tag{6-4}$$

where at most one term on the right-hand side is different from 0.

Examples. Let $n = 5$, $r = 4$. Then $\omega_{1231} = 0$ since 1 is repeated in the 4-tuple
$\lambda = (1, 2, 3, 1)$. Since $(2, 3, 4, 5)$ is obtained from $(5, 4, 2, 3)$ by an odd number of
interchanges, $\omega_{2345} = -\omega_{5423}$.

Let us now consider some particular elements of the space $(E^n_r)^*$. For each
$r$-tuple $\lambda = (i_1, \dots, i_r)$ let $e^\lambda$ be the function such that

$$e^\lambda(h_1, \dots, h_r) = \det (h^{i_k}_l) \tag{6-5a}$$

for every $r$-tuple of vectors $(h_1, \dots, h_r)$. Note that the $r \times r$ matrix $(h^{i_k}_l)$ is
formed from rows $i_1, \dots, i_r$ of the $n \times r$ matrix $(h^i_l)$ which has $h_1, \dots, h_r$
as column vectors. By properties of determinants, $e^\lambda$ is $r$-linear and alternating.
Thus $e^\lambda$ belongs to $(E^n_r)^*$.
Taking in particular $h_1, \dots, h_r$ to be standard basis vectors, $h_l = e_{j_l}$ for
$l = 1, \dots, r$, we obtain in (6-5a) the matrix $(\delta^{i_k}_{j_l})$ whose determinant is $\delta^\lambda_\mu$.
Thus
$$e^\lambda(e_{j_1}, \dots, e_{j_r}) = \delta^\lambda_\mu. \tag{6-6}$$

If $\lambda$ is obtained from $\mu$ by an interchange, then two row covectors of the matrix
$(h^{i_k}_l)$ in (6-5a) are interchanged. The determinant changes sign. Hence

$$e^\lambda(h_1, \dots, h_r) = -e^\mu(h_1, \dots, h_r)$$

for every $r$-tuple $(h_1, \dots, h_r)$, which means that $e^\lambda = -e^\mu$. In particular,
$e^\lambda = 0$ if $\lambda$ has any repetitions. If $\lambda$ is obtained from $\mu$ by $p$ interchanges,
then $e^\lambda = (-1)^p e^\mu$.
Let us make the convention that $e^\lambda = 0$ in case $r > n$. This is useful
in defining the exterior product in the next section.
When $r = 2$ and $\lambda = (i, j)$, $e^{ij}(h, k) = h^i k^j - h^j k^i$. If $B$ is bilinear and
alternating, then
$$B = \sum_{i<j} \omega_{ij} e^{ij},$$

since by formula (6-3) both sides have the same value for each pair of vectors
$(h, k)$. This is a particular case of the following.

Proposition 19. ($r \le n$.) Let $M$ be $r$-linear and alternating. Then

$$M = \sum_{[\lambda]} \omega_\lambda e^\lambda, \tag{6-7}$$

where the numbers $\omega_\lambda$ are given by (6-1b).
Proof. Let $\tilde M$ equal the right-hand side of (6-7). For each $\mu = (j_1, \dots, j_r)$,

$$\tilde M(e_{j_1}, \dots, e_{j_r}) = \sum_{[\lambda]} \omega_\lambda e^\lambda(e_{j_1}, \dots, e_{j_r}).$$

From (6-4) and (6-6),
$$\tilde M(e_{j_1}, \dots, e_{j_r}) = \sum_{[\lambda]} \omega_\lambda \delta^\lambda_\mu = \omega_\mu.$$

But $M$ and $\tilde M$ are $r$-linear and have the same value $\omega_\mu$ at $(e_{j_1}, \dots, e_{j_r})$ for
each $\mu$. By (6-2b) $\tilde M = M$. ∎
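Proposition 19 can be illustrated numerically for $n = 4$, $r = 2$. In the sketch below, which is only an example under stated assumptions, the alternating bilinear function $M(h, k) = (a \cdot h)(b \cdot k) - (a \cdot k)(b \cdot h)$ is built from two arbitrarily chosen covectors $a$, $b$; the helper names `e_lam`, `standard_basis_vector`, and `expansion` are hypothetical.

```python
from itertools import combinations, permutations

def det(matrix):
    """Determinant via the permutation expansion (adequate for small r)."""
    r = len(matrix)
    total = 0
    for perm in permutations(range(r)):
        inversions = sum(
            1 for a in range(r) for b in range(a + 1, r) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row in range(r):
            term *= matrix[row][perm[row]]
        total += term
    return total

def e_lam(lam, vectors):
    """e^lam(h_1,...,h_r) = det(h_l^{i_k}), as in (6-5a); indices are 1-based."""
    return det([[h[i - 1] for h in vectors] for i in lam])

def standard_basis_vector(i, n):
    return [1 if c == i else 0 for c in range(1, n + 1)]

def expansion(M, n, r, vectors):
    """Right-hand side of (6-7): sum over increasing lam of omega_lam * e^lam."""
    total = 0
    for lam in combinations(range(1, n + 1), r):
        omega_lam = M(*[standard_basis_vector(i, n) for i in lam])  # (6-1b)
        total += omega_lam * e_lam(lam, vectors)
    return total

def dot(a, h):
    return sum(x * y for x, y in zip(a, h))

a, b = [1, 2, 0, -1], [3, 0, 1, 2]  # sample covectors (illustrative values)

def M(h, k):
    """A sample alternating bilinear function on E^4."""
    return dot(a, h) * dot(b, k) - dot(a, k) * dot(b, h)

h, k = [1, 0, 2, 1], [0, 1, 1, 3]
lhs, rhs = M(h, k), expansion(M, 4, 2, [h, k])
```

Both `lhs` and `rhs` come out to 7 for this data, in agreement with (6-7).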

PROBLEMS

i, eng = H ihel
651, 851, 6214, 255, 425.
2. Let $n = 4$, $r = 3$, $\omega_{123} = 2$, $\omega_{134} = -1$, and $\omega_\lambda = 0$ for every other increasing
triple $\lambda$. Find $M(e_4, e_1 - e_3, e_2 + e_3)$.
3. Show that:

(a) 5p = dh. (b) & = DT dy = 1| DT by8r.


N
(u] B

Ke 1 m
(@) SEX, = REE

[Hint for (c): Use (b) and (6-6).]


4. Let M be r-linear, not necessarily alternating. Let w, be as in (6-1b) and &, =
(1/r!)>°) @,63. The function My = 301) &)e? is r-linear and alternating.
(a) Show that My = (1/r!)>0, Gye". (Hint: &, = Yit,1 G64; use Problem 3(c).]
(b) Show that if M/ is alternating, then ®, = w, and hence M, = M.

6-2 MULTICOVECTORS

Let us now introduce a different name and a different notation for alternating,
multilinear functions.

Definition. A multicovector of degree $r$ is an alternating $r$-linear function
with domain the $r$-fold cartesian product $E^n \times \cdots \times E^n$.
For brevity, multicovectors of degree $r$ are called $r$-covectors. From now
on multicovectors will ordinarily be denoted by the Greek letters $\omega$ or $\zeta$ rather
than $M$ as in the previous section.

We observed in the last section that the set $(E^n_r)^*$ of all $r$-covectors satisfies
the axioms for a vector space. When $r > n$, its only element is 0 by Proposition 18.
When $1 \le r \le n$, Proposition 19 states that if $\omega$ is any $r$-covector, then

$$\omega = \sum_{[\lambda]} \omega_\lambda e^\lambda. \tag{6-8}$$

Therefore the $r$-covectors $e^\lambda$ with $\lambda$ increasing span $(E^n_r)^*$. These $r$-covectors
form a linearly independent set (Problem 7), which is therefore a basis for
$(E^n_r)^*$. It is called the standard basis. The number $\omega_\lambda$ is the component of $\omega$
with respect to the basis element $e^\lambda$. Since there are $\binom{n}{r}$ increasing $r$-tuples
of integers between 1 and $n$, $(E^n_r)^*$ has dimension $\binom{n}{r}$.
Every 1-linear function is alternating. Thus a 1-covector is just a covector,
and $(E^n_1)^* = (E^n)^*$ is the dual space of $E^n$. If we identify the 1-tuple $(i)$
with $i$, then the standard basis 1-covectors $e^1, \dots, e^n$ are just those introduced
in Section 1-3. As in previous chapters we shall use the letters $a$, $b$ to denote
1-covectors.
If $r = n$, then the $n$-covector $e^{1 \cdots n}$ is essentially the determinant function.
Its value at $(h_1, \dots, h_n)$ is $\det (h^i_j)$, which is the determinant of the
$n \times n$ matrix with column vectors $h_1, \dots, h_n$. Since $(E^n_n)^*$ is one-dimensional,
every $n$-covector has the form $\omega = c e^{1 \cdots n}$ where $c = \omega_{1 \cdots n}$.

Example. Let $n = 5$, $r = 3$, and $\omega = 6e^{145} - 2e^{431} - e^{514}$. The increasing
triple $(1, 4, 5)$ is obtained from $(5, 1, 4)$ by an even number of interchanges. Hence
$e^{514} = e^{145}$. The increasing triple $(1, 3, 4)$ is obtained from $(4, 3, 1)$ by one interchange.
Hence $e^{431} = -e^{134}$, and $\omega = 2e^{134} + 5e^{145}$. This expresses $\omega$ as a linear
combination of the standard basis 3-covectors. The components of $\omega$ are $\omega_{134} = 2$,
$\omega_{145} = 5$, and $\omega_\lambda = 0$ for every other increasing triple $\lambda$.
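The reduction carried out in this example is mechanical and can be sketched in Python. The representation chosen here, a dictionary from index tuples to coefficients, and the names `sort_with_sign` and `normalize` are assumptions of the sketch, not notation from the text. Sorting is done by adjacent interchanges so that the sign $(-1)^p$ can be tracked.

```python
def sort_with_sign(indices):
    """Sort by adjacent interchanges; return (sign, increasing tuple), or (0, None) on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, None  # e^lam = 0 when an integer is repeated
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for a in range(end):
            if idx[a] > idx[a + 1]:
                idx[a], idx[a + 1] = idx[a + 1], idx[a]
                sign = -sign  # each interchange changes the sign
    return sign, tuple(idx)

def normalize(terms):
    """Rewrite sum of coeff * e^lam in terms of the standard basis (lam increasing)."""
    out = {}
    for lam, coeff in terms.items():
        sign, increasing = sort_with_sign(lam)
        if sign:
            out[increasing] = out.get(increasing, 0) + sign * coeff
    return {lam: c for lam, c in out.items() if c != 0}

# omega = 6 e^{145} - 2 e^{431} - e^{514}
omega = normalize({(1, 4, 5): 6, (4, 3, 1): -2, (5, 1, 4): -1})
```

For the data of the example, `omega` has the components $\omega_{134} = 2$ and $\omega_{145} = 5$ found above.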

Products. In $(E^n_r)^*$ we define the euclidean inner product

$$\omega \cdot \zeta = \sum_{[\lambda]} \omega_\lambda \zeta_\lambda$$

and set $|\omega|^2 = \omega \cdot \omega$. The standard basis elements are orthonormal with
respect to this inner product.
Another important product is the exterior product, denoted by the symbol $\wedge$.
The exterior product of an $r$-covector and an $s$-covector is an
$(r + s)$-covector, defined as follows: If

$$\lambda = (i_1, \dots, i_r), \qquad \nu = (j_1, \dots, j_s),$$

let us write $\lambda, \nu$ for the $(r + s)$-tuple

$$(i_1, \dots, i_r, j_1, \dots, j_s).$$

Definition. Let $1 \le r \le n$, $1 \le s \le n$. If $\lambda$ and $\nu$ are increasing, then

$$e^\lambda \wedge e^\nu = e^{\lambda, \nu}. \tag{6-9}$$

If $\omega$ is an $r$-covector and $\zeta$ is an $s$-covector, with respective components
$\omega_\lambda$, $\zeta_\nu$, then
$$\omega \wedge \zeta = \sum_{[\lambda]} \sum_{[\nu]} \omega_\lambda \zeta_\nu e^{\lambda, \nu}.$$

Note that if $r + s > n$ then $\omega \wedge \zeta$, being an $(r + s)$-covector, must be 0.

Examples. Let $n = 4$. Then $e^{12} \wedge e^{34} = e^{1234}$,

$$e^3 \wedge e^{124} = e^{3124} = e^{1234}, \qquad e^{14} \wedge e^{24} = e^{1424} = 0,$$

since the integer 4 is repeated.

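Proposition 20. The exterior product has the following properties:

(1) $(\omega + \zeta) \wedge \eta = (\omega \wedge \eta) + (\zeta \wedge \eta)$.
(2) $(c\omega) \wedge \zeta = c(\omega \wedge \zeta)$.
(3) $\zeta \wedge \omega = (-1)^{rs} \omega \wedge \zeta$, if $\omega$ has degree $r$ and $\zeta$ has degree $s$.
(4) $(\omega \wedge \zeta) \wedge \eta = \omega \wedge (\zeta \wedge \eta)$.
Proof. The proof of (1) and (2) is almost immediate from the definition
and is left to the reader (Problem 8). To prove (3), note that

$$\nu, \lambda = (j_1, \dots, j_s, i_1, \dots, i_r).$$

By $s$ interchanges we may bring $i_1$ to the left past $j_1, \dots, j_s$. Similarly, $s$
interchanges bring each of $i_2, \dots, i_r$ in turn past $j_1, \dots, j_s$. Thus $\lambda, \nu$ is obtained
from $\nu, \lambda$ by $rs$ interchanges, and $e^{\nu, \lambda} = (-1)^{rs} e^{\lambda, \nu}$. Hence
$$\zeta \wedge \omega = \sum_{[\nu]} \sum_{[\lambda]} \zeta_\nu \omega_\lambda e^{\nu, \lambda} = (-1)^{rs} \sum_{[\lambda]} \sum_{[\nu]} \omega_\lambda \zeta_\nu e^{\lambda, \nu},$$
which proves (3).
Let us first prove the associative law (4) for basis elements. Let $\lambda =
(i_1, \dots, i_r)$, $\nu = (j_1, \dots, j_s)$, and $\rho = (k_1, \dots, k_t)$ be increasing $r$-, $s$-, and
$t$-tuples, respectively. Let

$$\lambda, \nu, \rho = (i_1, \dots, i_r, j_1, \dots, j_s, k_1, \dots, k_t).$$

Let us show that

$$e^{\lambda, \nu, \rho} = (e^\lambda \wedge e^\nu) \wedge e^\rho.$$

If some integer is repeated in the $(r + s)$-tuple $\lambda, \nu$, then both sides are 0.
If no integer is repeated, then

$$(e^\lambda \wedge e^\nu) \wedge e^\rho = e^{\lambda, \nu} \wedge e^\rho = ((-1)^p e^\tau) \wedge e^\rho = (-1)^p e^{\tau, \rho},$$

where $\tau$ is an increasing $(r + s)$-tuple obtained from $\lambda, \nu$ by $p$ interchanges.
These same $p$ interchanges change the $(r + s + t)$-tuple $\lambda, \nu, \rho$ into $\tau, \rho$.
Hence
$$e^{\lambda, \nu, \rho} = (-1)^p e^{\tau, \rho} = (e^\lambda \wedge e^\nu) \wedge e^\rho.$$
Similarly $e^{\lambda, \nu, \rho} = e^\lambda \wedge (e^\nu \wedge e^\rho)$, and hence
$$(e^\lambda \wedge e^\nu) \wedge e^\rho = e^\lambda \wedge (e^\nu \wedge e^\rho). \tag{6-10}$$
From this formula it is a straightforward matter to obtain (4) (Problem 9). ∎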

If either $r$ or $s$ is even, then the exterior product is commutative. If $r =
s = 1$, we have $a \wedge b = -b \wedge a$.
The exterior product of any finite number of multicovectors is defined by
induction. Using (6-9) repeatedly, we find that if $\lambda$ is increasing,

$$e^\lambda = e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_r}. \tag{6-11}$$
Since both sides of (6-9) change sign under interchanges in $\lambda$ or $\nu$, formula
(6-9) is also true for nonincreasing $r$-tuples. Thus (6-11) is valid whether $\lambda$
is increasing or not.
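The definition of the exterior product by components translates directly into a short computation. The following Python sketch is one possible illustration, assuming a multicovector is stored as a dictionary from increasing index tuples to its components; the names `sort_with_sign` and `wedge` are hypothetical. It uses (6-9) for arbitrary tuples together with the sign rule $e^\lambda = (-1)^p e^\mu$.

```python
def sort_with_sign(indices):
    """Sort by adjacent interchanges; return (sign, increasing tuple), or (0, None) on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, None
    sign = 1
    for end in range(len(idx) - 1, 0, -1):
        for a in range(end):
            if idx[a] > idx[a + 1]:
                idx[a], idx[a + 1] = idx[a + 1], idx[a]
                sign = -sign
    return sign, tuple(idx)

def wedge(omega, zeta):
    """Exterior product: sum of omega_lam * zeta_nu * e^{lam,nu}, reduced to increasing tuples."""
    out = {}
    for lam, a in omega.items():
        for nu, b in zeta.items():
            sign, increasing = sort_with_sign(lam + nu)  # concatenated tuple lam, nu
            if sign:
                out[increasing] = out.get(increasing, 0) + sign * a * b
    return {lam: c for lam, c in out.items() if c != 0}

# Sample 1-covectors with n = 3 (illustrative values).
a = {(1,): 2, (2,): -1}
b = {(2,): 3, (3,): 1}
c = {(1,): 1, (3,): -2}
ab = wedge(a, b)
left = wedge(wedge(a, b), c)
right = wedge(a, wedge(b, c))
```

With this data `ab` is $6e^{12} + 2e^{13} - e^{23}$, `wedge(b, a)` is its negative, and `left` and `right` agree, as properties (3) and (4) of Proposition 20 predict.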

Examples. Let n = 5. Then

(e ay 3e') A (e"4 2e'*) na ee De 265° & Dour” = feuL pa en de és.

2153 1235
e” A (3e! -— 2e”) Veer eee (3e7! — 2e”") Ne m= 3e = 3e

*Remarks. The exterior product has been defined in terms of the standard
bases. It is not clear that it is "coordinate free," in other words, that the same
exterior product would be obtained starting from different bases. However,
let us add one additional property to the list (1)-(4):

(5) If $\omega = a^1 \wedge \cdots \wedge a^r$, then $\omega(h_1, \dots, h_r) = \det (a^k \cdot h_l)$ for every
$r$-tuple $(h_1, \dots, h_r)$.

This property of the exterior product will be proved in Section 6-3 [see
(6-12), (6-14)]. Formula (6-11) is a special case of (5). This is seen by taking
$a^k = e^{i_k}$ and recalling (6-5a). Moreover, (6-9) is a consequence of (6-11) and
the associative law (4). Once the product is known for basis elements, Properties
(1) and (2) determine it in general. Thus $\wedge$ is the only product with
Properties (1)-(5). In fact, (3) can be omitted from the list since it follows
from the other four. Since none of these five properties refers to bases, the
exterior product is coordinate free.

*Note about terminology. A multilinear function $M$ of degree $r$ and
domain $E^n \times \cdots \times E^n$ is often called a covariant tensor of rank $r$. An $r$-covector
is then called an alternating covariant tensor of rank $r$.

The sum of an $r$-covector and an $s$-covector has been defined only when
$r = s$. However, one may form the direct sum

$$(\Lambda^n)^* = (E^n_0)^* \oplus (E^n_1)^* \oplus \cdots \oplus (E^n_r)^* \oplus \cdots,$$

where we agree that $(E^n_0)^*$ is the scalar field. The exterior product induces a
product in $(\Lambda^n)^*$, which is then an algebra over the real numbers. This algebra
is called the exterior algebra of $(E^n)^*$. See reference [4]. $(\Lambda^n)^*$ is sometimes
called the covariant Grassmann algebra or the covariant alternating tensor algebra
of $E^n$.

PROBLEMS
1. Write down the standard basis for $(E^4_r)^*$ for each $r = 1, 2, 3, 4$. Find all products
$e^\lambda \wedge e^\nu$ where $\lambda = (i)$ and $\nu = (j, k, l)$ is an increasing triple.
Y, Ihe me = B, Sumyollnys
(a) (2e! — e?) A (8e?-+ e). (ay) CHIN Ce
(c) (et — e? + 3e?) A e?!. (d) (e73 + e3!) A (Se! — e?).
3. Let n = 5. Simplify:
(a) e253 Ix (el4 + er, (b) (e? + e°) A el A (e® sod e*).

4. Let $a$ and $b$ be 1-covectors and $\omega = a \wedge b$. Show that $\omega_{ij} = a_i b_j - a_j b_i$.
5. Show that if $a$, $b$, $c$ are 1-covectors, then
$$a \wedge b + b \wedge c + c \wedge a = (a - b) \wedge (b - c).$$
6. Show that if w=a IX b, then WijWer—E WinW1; WiIWj, = 0 for Dy De k, j=
1,...,n. [Hint: Using Problem 4,

since the first row is a linear combination of the second and third rows.]
7. Show that if $\sum_{[\lambda]} c_\lambda e^\lambda = 0$, then $c_\lambda = 0$ for every increasing $\lambda$. [Hint: See (6-6).]
8. Prove (1) and (2) of Proposition 20.
9. Prove the associative law (4) of Proposition 20, using (1), (2), and (6-10).
10. Show that $\omega \wedge \zeta \wedge \eta = -\eta \wedge \zeta \wedge \omega$ if $\omega$ has degree $r$, $\eta$ has degree $t$, and
both $r$, $t$ are odd.

6-3 MULTIVECTORS
If $U$ is any vector space, then alternating $r$-linear functions on $U \times \cdots \times U$
can be defined just as in Section 6-1 where we took $U = E^n$. Let us now take
$U = (E^n)^*$, the dual space to $E^n$.
Definition. A multivector of degree $r$ is an alternating $r$-linear function with
domain the $r$-fold cartesian product $(E^n)^* \times \cdots \times (E^n)^*$.

For brevity, multivectors of degree $r$ are called $r$-vectors. They will usually
be denoted by the Greek letters $\alpha$ or $\beta$. When $r = 1$, the 1-linear functions
on $(E^n)^*$ are identified with the elements of $E^n$ in the way explained in Section
A-2. Then a 1-vector is just a vector, and will be denoted as usual by $x$ or $h$.
For every statement about multicovectors in Sections 6-1 and 6-2, there
is a dual statement about multivectors obtained by everywhere exchanging
the words "vector" and "covector." For instance, if $\alpha$ is an $r$-vector and
$\lambda = (i_1, \dots, i_r)$, let

$$\alpha^\lambda = \alpha(e^{i_1}, \dots, e^{i_r}).$$

This is dual to the formula [see (6-1b) with $M = \omega$]

$$\omega_\lambda = \omega(e_{i_1}, \dots, e_{i_r}).$$

Let $e_\lambda$ be the $r$-vector defined by the formula dual to (6-5a):

$$e_\lambda(a^1, \dots, a^r) = \det (a^l_{i_k}) \tag{6-5b}$$

for every $r$-tuple $(a^1, \dots, a^r)$ of covectors.
Let $E^n_r$ denote the set of all $r$-vectors. Then $E^n_r$ satisfies the axioms for a
vector space. It consists of 0 only if $r > n$. For $1 \le r \le n$ the $r$-vectors
$e_\lambda$ with $\lambda$ increasing form the standard basis for $E^n_r$. The number $\alpha^\lambda$ is the
component of $\alpha$ with respect to $e_\lambda$.
The inner product $\alpha \cdot \beta$ of two $r$-vectors, and the exterior product $\alpha \wedge \beta$
of an $r$-vector $\alpha$ and an $s$-vector $\beta$ are defined by the formulas dual to those in
Section 6-2. In each instance subscripts are replaced by superscripts and vice
versa. The exterior product of multivectors has the same properties listed in
Proposition 20. The scalar product $\omega \cdot \alpha$ of an $r$-covector $\omega$ and an $r$-vector $\alpha$
is defined in the third from last line of Table 6-1. The last two lines of
the table are particular cases of the formula for $\omega \cdot \alpha$.
The formulas in the second line and in the last two lines are true whether
$\lambda$ and $\mu$ are increasing or not, since they are known to be true for increasing
$r$-tuples, and both sides of each formula change sign under interchanges.
The reader should compare this table with the corresponding table for
$r = 1$, p. 12. According to the definition (Section A-2), the dual space of $E^n_r$
consists of all real-valued linear functions $F$ with domain $E^n_r$. The dual space
may be identified with $(E^n_r)^*$ in the following way. Given an $r$-covector $\omega$,
let $F(\alpha) = \omega \cdot \alpha$ for every $\alpha \in E^n_r$. This establishes an isomorphism between
$(E^n_r)^*$ and the dual space of $E^n_r$. The next to last line of the table implies that
the standard bases for $E^n_r$ and $(E^n_r)^*$ are dual.

*Note about terminology. Multivectors of degree $r$ are also called alternating
contravariant tensors of rank $r$. The exterior algebra $\Lambda^n$ of $E^n$ can be
introduced in the way indicated at the end of Section 6-2.
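
TABLE 6-1

                              r-vectors                                                    r-covectors

Elements of                   $E^n_r$                                                      $(E^n_r)^*$

Standard basis elements       $e_\lambda = e_{i_1} \wedge \cdots \wedge e_{i_r}$           $e^\lambda = e^{i_1} \wedge \cdots \wedge e^{i_r}$
($\lambda$ increasing)

                              $\alpha = \sum_{[\lambda]} \alpha^\lambda e_\lambda$         $\omega = \sum_{[\lambda]} \omega_\lambda e^\lambda$

Euclidean inner product       $\alpha \cdot \beta = \sum_{[\lambda]} \alpha^\lambda \beta^\lambda$   $\omega \cdot \zeta = \sum_{[\lambda]} \omega_\lambda \zeta_\lambda$

Euclidean norm                $|\alpha|^2 = \alpha \cdot \alpha$                           $|\omega|^2 = \omega \cdot \omega$

Scalar product                $\omega \cdot \alpha = \sum_{[\lambda]} \omega_\lambda \alpha^\lambda$

                              $e^\lambda \cdot e_\mu = \delta^\lambda_\mu$

                              $e^\lambda \cdot \alpha = \alpha^\lambda$, $\omega \cdot e_\lambda = \omega_\lambda$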

Definition. An $r$-covector $\omega$ is decomposable if there exist covectors $a^1, \dots, a^r$
such that $\omega = a^1 \wedge \cdots \wedge a^r$. Similarly, an $r$-vector $\alpha$ is decomposable
if there exist vectors $h_1, \dots, h_r$ such that $\alpha = h_1 \wedge \cdots \wedge h_r$.

In the remainder of this section we shall mainly discuss decomposable
$r$-vectors. Each statement about them has a dual which applies to decomposable
$r$-covectors. Clearly every 1-vector is decomposable. If $\alpha$ is an $n$-vector,
then
$$\alpha = c e_{1 \cdots n} = (c e_1) \wedge e_2 \wedge \cdots \wedge e_n,$$

where $c = \alpha^{1 \cdots n}$. Hence every $n$-vector is decomposable. In Section 6-6 it
will be shown that any $(n - 1)$-vector is decomposable. However, for
$2 \le r \le n - 2$ there are nondecomposable $r$-vectors; see Problem 9. Since
$e_\lambda = e_{i_1} \wedge \cdots \wedge e_{i_r}$, the standard basis $r$-vectors are decomposable.
It is not correct to identify a decomposable $r$-vector $\alpha$ with the $r$-tuple
$(h_1, \dots, h_r)$ since there are many ways to write $\alpha$ as an exterior product of
vectors. The corollary to Theorem 19 below will furnish a geometric description
of all possible such decompositions of $\alpha$.

Proposition 21. If $\alpha = h_1 \wedge \cdots \wedge h_r$, then for every $r$-covector $\omega$,

$$\omega \cdot \alpha = \omega(h_1, \dots, h_r). \tag{6-12}$$

Proof. Let $\tilde\omega(h_1, \dots, h_r) = \omega \cdot (h_1 \wedge \cdots \wedge h_r)$ for every $r$-tuple
$(h_1, \dots, h_r)$. Then $\tilde\omega$ is an alternating $r$-linear function. Moreover

$$\tilde\omega(e_{j_1}, \dots, e_{j_r}) = \omega \cdot e_\mu = \omega_\mu$$
for every $\mu$. Hence $\tilde\omega = \omega$. ∎
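Proposition 22. Let $\alpha = h_1 \wedge \cdots \wedge h_r$ and $\omega = a^1 \wedge \cdots \wedge a^r$. Then
for every $r$-tuple $\lambda = (i_1, \dots, i_r)$,

$$\alpha^\lambda = \det (h^{i_k}_l), \tag{6-13a}$$

$$\omega_\lambda = \det (a^k_{i_l}), \tag{6-13b}$$
and
$$\omega \cdot \alpha = \det (a^k \cdot h_l). \tag{6-14}$$
Proof. Taking $\omega = e^\lambda$ in formula (6-12), we get

$$\alpha^\lambda = e^\lambda \cdot \alpha = e^\lambda(h_1, \dots, h_r).$$

Recalling the definition (6-5a) of $e^\lambda$, we get (6-13a). The formula (6-13b)
dual to (6-13a) is obtained similarly. Let $\omega'(h_1, \dots, h_r) = \det (a^k \cdot h_l)$ for
every $(h_1, \dots, h_r)$. Then $\omega'$ is multilinear and alternating and for every $\lambda$

$$\omega'(e_{i_1}, \dots, e_{i_r}) = \det (a^k \cdot e_{i_l}) = \det (a^k_{i_l}).$$

Using (6-13b), we get $\omega' = \omega$. Then (6-14) follows from (6-12). ∎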

The formulas (6-13a), (6-13b), and (6-14) may not provide the easiest
way to compute the components and the scalar product in numerical examples.
For instance, see Examples (1) and (2) below. However, they are important
for various other reasons.

Proposition 23. If $\omega \cdot \alpha = \omega \cdot \beta$ for every decomposable $r$-covector $\omega$, then
$\alpha = \beta$.

Proof. The standard basis $r$-covectors $e^\lambda$ are decomposable. Hence for
every increasing $\lambda$,
$$\alpha^\lambda = e^\lambda \cdot \alpha = e^\lambda \cdot \beta = \beta^\lambda.$$
∎
The decomposable $r$-vectors have an important geometric significance
which will be described next.
First we recall the following results from linear algebra:
(1) Any linearly independent set $\{h_1, \dots, h_r\}$ is a basis for the vector
subspace $P \subset E^n$ spanned by these vectors (definition).
(2) Given any such set there exist $h_{r+1}, \dots, h_n$ such that $\{h_1, \dots, h_n\}$
is a basis for $E^n$.
(3) For every basis $\{h_1, \dots, h_n\}$ for $E^n$ there is a dual basis $\{a^1, \dots, a^n\}$
for $(E^n)^*$: $a^k \cdot h_l = \delta^k_l$ for $k, l = 1, \dots, n$ (Section A-2).

Definition. A linearly independent $r$-tuple $(h_1, \dots, h_r)$ is called a frame
for the vector subspace $P$ spanned by $h_1, \dots, h_r$.

The only difference between the notions of basis and frame is that the
latter takes into account the order in which the basis vectors $h_1, \dots, h_r$ are
written.
Theorem 19. (a) An $r$-tuple $(h_1, \dots, h_r)$ is linearly dependent if and only
if $h_1 \wedge \cdots \wedge h_r = 0$.

(b) Let $P \subset E^n$ be an $r$-dimensional vector subspace and $(h_1, \dots, h_r)$,
$(h'_1, \dots, h'_r)$ be any two frames for $P$. Then there is a scalar $c$ such that

$$h'_1 \wedge \cdots \wedge h'_r = c \, h_1 \wedge \cdots \wedge h_r. \tag{6-15}$$

(c) Conversely, if $(h_1, \dots, h_r)$ and $(h'_1, \dots, h'_r)$ are frames which satisfy
(6-15) for some scalar $c$, then they are frames for the same vector subspace $P$.
(See Fig. 6-1.)
Proof of (a). Let $(h_1, \dots, h_r)$ be linearly dependent and $\alpha = h_1 \wedge \cdots \wedge h_r$.
By Propositions 18 and 21, $\omega \cdot \alpha = 0$ for every $\omega$. By Proposition 23, $\alpha = 0$.
On the other hand, if $(h_1, \dots, h_r)$ is linearly independent, let $a^1, \dots, a^r$ be
covectors such that $a^k \cdot h_l = \delta^k_l$, and let $\omega = a^1 \wedge \cdots \wedge a^r$. By (6-14)

$$\omega \cdot \alpha = \det (\delta^k_l) = 1.$$

Hence $\alpha \ne 0$.
Proof of (b). Each $h'_l$ is a linear combination of $h_1, \dots, h_r$,

$$h'_l = \sum_{m=1}^{r} c^m_l h_m, \qquad l = 1, \dots, r.$$

If $\omega = a^1 \wedge \cdots \wedge a^r$ is any decomposable $r$-covector, then

$$\omega \cdot \alpha' = \det (a^k \cdot h'_l) = \det \Big( \sum_{m=1}^{r} (a^k \cdot h_m) c^m_l \Big),$$

where $\alpha' = h'_1 \wedge \cdots \wedge h'_r$. The matrix on the right is the product of the
matrices $(a^k \cdot h_m)$ and $(c^m_l)$. Hence if $c = \det (c^m_l)$,

$$\omega \cdot \alpha' = c \det (a^k \cdot h_l) = c \, \omega \cdot \alpha = \omega \cdot (c\alpha).$$

This is true for every decomposable $\omega$; hence $\alpha' = c\alpha$ by Proposition 23.

[Figure 6-1]

Proof of (c). Since $\alpha = h_1 \wedge \cdots \wedge h_r$, $\alpha' = h'_1 \wedge \cdots \wedge h'_r$ are not
0 by hypothesis, (a) implies that $\{h_1, \dots, h_r\}$ and $\{h'_1, \dots, h'_r\}$ are linearly
independent sets. It suffices to show that each $h'_l$ is a linear combination of
$h_1, \dots, h_r$. Suppose that this is false for some $l$, say for $l = 1$. Then
$\{h_1, \dots, h_r, h'_1\}$ is a linearly independent set. Let $a^1, \dots, a^{r+1}$ be such that
$a^k \cdot h_l = \delta^k_l$ for $k, l = 1, \dots, r + 1$, where we have set $h'_1 = h_{r+1}$. Let
$\omega = a^1 \wedge \cdots \wedge a^r$. Then $\omega \cdot \alpha = 1$, but $\omega \cdot \alpha' = 0$ since the elements
$a^k \cdot h'_1 = \delta^k_{r+1}$ of the first column of the $r \times r$ matrix $(a^k \cdot h'_l)$ are 0. This
contradicts the assumption that $\alpha' = c\alpha$, $c \ne 0$. ∎

Example 1. Show that $2e_1 + 3e_2 - e_3$, $e_1 + 2e_2$, $e_1 - 2e_3$ are linearly dependent.
Their exterior product is

$$(2e_1 + 3e_2 - e_3) \wedge (e_1 + 2e_2) \wedge (e_1 - 2e_3) = (e_{12} - e_{31} - 2e_{32}) \wedge (e_1 - 2e_3) = -2e_{123} - 2e_{321} = 0.$$
Example 2. Show that $(e_1 + 3e_3, e_2 - e_3)$ is a frame for the same 2-dimensional
vector subspace of $E^3$ as $(2e_1 + e_2 + 5e_3, 4e_1 + e_2 + 11e_3)$. Calculating the
exterior products, we get

$$(e_1 + 3e_3) \wedge (e_2 - e_3) = e_{12} - e_{13} - 3e_{23},$$

$$(2e_1 + e_2 + 5e_3) \wedge (4e_1 + e_2 + 11e_3) = -2e_{12} + 2e_{13} + 6e_{23}.$$
The second 2-vector is $-2$ times the first.
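Although the text notes that (6-13a) is not always the easiest route by hand, it is convenient for a machine check of Examples 1 and 2. The sketch below computes the components $\alpha^\lambda = \det (h^{i_k}_l)$ for increasing $\lambda$; the names `det` and `wedge_components` are hypothetical helpers, and vectors are stored as lists of components.

```python
from itertools import combinations, permutations

def det(matrix):
    """Determinant via the permutation expansion (adequate for small r)."""
    r = len(matrix)
    total = 0
    for perm in permutations(range(r)):
        inversions = sum(
            1 for a in range(r) for b in range(a + 1, r) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row in range(r):
            term *= matrix[row][perm[row]]
        total += term
    return total

def wedge_components(vectors, n):
    """Components of h_1 ^ ... ^ h_r on increasing lam, by (6-13a)."""
    r = len(vectors)
    return {
        lam: det([[h[i - 1] for h in vectors] for i in lam])
        for lam in combinations(range(1, n + 1), r)
    }

# Example 1: the exterior product of three dependent vectors vanishes.
example1 = wedge_components([[2, 3, -1], [1, 2, 0], [1, 0, -2]], 3)

# Example 2: two frames for the same 2-dimensional subspace of E^3.
first = wedge_components([[1, 0, 3], [0, 1, -1]], 3)
second = wedge_components([[2, 1, 5], [4, 1, 11]], 3)
```

Here `example1` has the single component 0, and every component of `second` is $-2$ times the corresponding component of `first`.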

Let $\alpha \ne 0$ be a decomposable $r$-vector. Then $\alpha = h_1 \wedge \cdots \wedge h_r$, and
the vector subspace $P$ spanned by $h_1, \dots, h_r$ is called the $r$-space of $\alpha$. If
$\alpha = h'_1 \wedge \cdots \wedge h'_r$, then taking $c = 1$ in part (c) of Theorem 19, we see
that $h'_1, \dots, h'_r$ also span this same vector subspace $P$. Thus $P$ depends only
on $\alpha$ and not on the particular way $\alpha$ is written as the exterior product of vectors.
If $c \ne 0$, then $(ch_1, h_2, \dots, h_r)$ is another frame for $P$ and $c\alpha =
(ch_1) \wedge h_2 \wedge \cdots \wedge h_r$. Thus $\alpha$ and $c\alpha$ have the same $r$-space $P$. On the
other hand, if $\alpha'$ is not a scalar multiple of $\alpha$, then $\alpha$ and $\alpha'$ have different
$r$-spaces.

Orientations. Let $P$ be an $r$-dimensional vector subspace of $E^n$.

Definition. A decomposable $r$-vector $\alpha_0$ is an orientation for $P$ if $|\alpha_0| = 1$
and $P$ is the $r$-space of $\alpha_0$.
If $\alpha$ is any $r$-vector whose $r$-space is $P$, then $c\alpha$ is an orientation for $P$
provided $|c\alpha| = 1$. Since $|c\alpha| = |c| \, |\alpha|$, we must have $c = \pm |\alpha|^{-1}$. $P$ has
two orientations. If $\alpha_0$ is one of them, then $-\alpha_0$ is the other.

Example 3. Let $\lambda = (i_1, \dots, i_r)$ have no repetitions. The $r$-space of $e_\lambda$ is spanned
by $e_{i_1}, \dots, e_{i_r}$. Since $|e_\lambda| = 1$, $e_\lambda$ is an orientation for it.

If $(h_1, \dots, h_r)$ is any frame for $P$ and $\alpha_0$ is an orientation of $P$, then

$$h_1 \wedge \cdots \wedge h_r = c\alpha_0, \qquad c = \pm |h_1 \wedge \cdots \wedge h_r| \ne 0.$$



Let us say that the frame $(h_1, \dots, h_r)$ has orientation $\alpha_0$ if $c > 0$, and
orientation $-\alpha_0$ if $c < 0$.
If two vectors in the frame $(h_1, \dots, h_r)$ are interchanged, then the exterior
product $h_1 \wedge \cdots \wedge h_r$ changes sign. Thus the orientation of a frame changes
under interchanges.
In Example (2), $(11)^{-1/2}(e_{12} - e_{13} - 3e_{23})$ is an orientation. The frame
$(e_1 + 3e_3, e_2 - e_3)$ has this orientation.
Let $r = n$. Then $P = E^n$ and $\pm e_{1 \cdots n}$ are the two orientations. Let us
call $e_{1 \cdots n}$ the standard, or positive, orientation of $E^n$ and $-e_{1 \cdots n}$ the negative
orientation of $E^n$. When $r < n$ we do not attempt to call one orientation of $P$
positive and the other negative. If $(h_1, \dots, h_n)$ is a frame for $E^n$, then by
(6-13a)

$$h_1 \wedge \cdots \wedge h_n = \det (h^i_l) \, e_{1 \cdots n}.$$

The frame has positive orientation if $\det (h^i_l) > 0$ and negative orientation
if $\det (h^i_l) < 0$.
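Measure for r-parallelepipeds. It was shown in Section 5-7 that if $K$ is
an $n$-parallelepiped spanned by $\mathbf{h}_1, \ldots, \mathbf{h}_n$ with $\mathbf{x}_0$ as vertex, then

$$V_n(K) = |\det(h^i_j)| = |\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_n|.$$

More generally, if $\mathbf{x}_0, \mathbf{h}_1, \ldots, \mathbf{h}_r$ are vectors, then

$$K = \Big\{\mathbf{x} : \mathbf{x} = \mathbf{x}_0 + \sum_{k=1}^{r} t^k \mathbf{h}_k,\ 0 \le t^k \le 1 \text{ for } k = 1, \ldots, r\Big\}$$

is the $r$-parallelepiped spanned by $\mathbf{h}_1, \ldots, \mathbf{h}_r$ with $\mathbf{x}_0$ as vertex.

Definition. The $r$-dimensional measure of $K$ is

$$V_r(K) = |\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r|. \tag{6-16}$$

By part (a) of Theorem 19, $V_r(K) = 0$ if and only if $(\mathbf{h}_1, \ldots, \mathbf{h}_r)$ is linearly
dependent.
We now have a criterion which shows when two frames lead to the same
$r$-vector.

Corollary. Let $(\mathbf{h}_1, \ldots, \mathbf{h}_r)$, $(\mathbf{h}_1', \ldots, \mathbf{h}_r')$ be frames. Then
$\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r = \mathbf{h}_1' \wedge \cdots \wedge \mathbf{h}_r'$ if and only if these frames span the same $r$-space $P$, have
the same orientation, and their parallelepipeds with 0 as vertex have the same
$r$-measure.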

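Measure for r-simplices. Let $S$ be an $r$-simplex with vertices $\mathbf{x}_0, \ldots, \mathbf{x}_r$.
Let $\mathbf{h}_k = \mathbf{x}_k - \mathbf{x}_0$, $k = 1, \ldots, r$. Reasoning as in Section 5-7, we have

$$S = \Big\{\mathbf{x} : \mathbf{x} = \mathbf{x}_0 + \sum_{k=1}^{r} t^k \mathbf{h}_k,\ t^k \ge 0 \text{ for } k = 1, \ldots, r,\ \sum_{k=1}^{r} t^k \le 1\Big\}.$$

The $r$-dimensional measure of $S$ is defined to be

$$V_r(S) = \frac{1}{r!}\,|\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r|. \tag{6-17}$$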

Both (6-16) and (6-17) are very special cases of a general formula (7-5)
in Chapter 7 for r-dimensional measure.

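Example 4. The area of the triangle in $E^3$ with vertices $0$, $3\mathbf{e}_1 + \mathbf{e}_2$, $\mathbf{e}_3 - \mathbf{e}_2$ is
$\tfrac{1}{2}|(3\mathbf{e}_1 + \mathbf{e}_2) \wedge (\mathbf{e}_3 - \mathbf{e}_2)|$. Since

$$(3\mathbf{e}_1 + \mathbf{e}_2) \wedge (\mathbf{e}_3 - \mathbf{e}_2) = -3\mathbf{e}_{12} + 3\mathbf{e}_{13} + \mathbf{e}_{23},$$

the components are $\alpha^{12} = -3$, $\alpha^{13} = 3$, $\alpha^{23} = 1$. Since $|\alpha|^2 = \sum_{[\lambda]} (\alpha^\lambda)^2$, the
area is $\sqrt{19}/2$. The area can also be calculated from formula (6-18) below.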

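To show that the definition (6-16) of $r$-measure for parallelepipeds is
reasonable, let us show that $V_r(K)$ is the product of the lengths of the vectors
$\mathbf{h}_1, \ldots, \mathbf{h}_r$ in case these vectors are mutually orthogonal. If $\mathbf{v}_1, \ldots, \mathbf{v}_r$ are
vectors, let $\mathbf{a}^k$ be the covector with the same components as the vector $\mathbf{v}_k$.
Then $\mathbf{a}^1 \wedge \cdots \wedge \mathbf{a}^r$ is the $r$-covector with the same components as the $r$-vector
$\mathbf{v}_1 \wedge \cdots \wedge \mathbf{v}_r$. By (6-14),

$$(\mathbf{v}_1 \wedge \cdots \wedge \mathbf{v}_r) \cdot (\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r) = \det(\mathbf{v}_k \cdot \mathbf{h}_l),$$

where the $\cdot$ now denotes inner product. In particular, let $\mathbf{v}_l = \mathbf{h}_l$. Then

$$|\mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r|^2 = \det(\mathbf{h}_k \cdot \mathbf{h}_l). \tag{6-18}$$

Taking square roots, we get a formula for $V_r(K)$. If $\mathbf{h}_1, \ldots, \mathbf{h}_r$ are mutually
orthogonal, then $\mathbf{h}_k \cdot \mathbf{h}_l = 0$ for $k \neq l$ and $\det(\mathbf{h}_k \cdot \mathbf{h}_l) = |\mathbf{h}_1|^2 \cdots |\mathbf{h}_r|^2$. In
this case $V_r(K) = |\mathbf{h}_1| \cdots |\mathbf{h}_r|$ as required.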
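
Formula (6-18) and the area computed in Example 4 can be cross-checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, components are indexed by increasing 0-based tuples rather than the 1-based indices of the text, and the components of an exterior product of vectors are taken to be the $r \times r$ minors of the matrix whose columns are the vectors.

```python
from itertools import combinations
from math import sqrt


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def wedge(frame):
    """Components of h_1 ^ ... ^ h_r, keyed by increasing 0-based index tuples."""
    n, r = len(frame[0]), len(frame)
    return {lam: det([[h[i] for h in frame] for i in lam])
            for lam in combinations(range(n), r)}


def norm_squared(alpha):
    """Sum of squares of the components, taken over increasing index tuples."""
    return sum(c * c for c in alpha.values())


def gram_det(frame):
    """Determinant of the matrix of inner products h_k . h_l, as in (6-18)."""
    return det([[sum(a * b for a, b in zip(h, k)) for k in frame] for h in frame])


# Example 4: triangle with vertices 0, 3e_1 + e_2, e_3 - e_2.
h, k = (3, 1, 0), (0, -1, 1)
alpha = wedge([h, k])
area = sqrt(norm_squared(alpha)) / 2
print(alpha, norm_squared(alpha), gram_det([h, k]), area)

# A frame of three vectors in E^4 that is not orthogonal (chosen arbitrarily).
frame = [(1, 0, 2, -1), (0, 3, 1, 1), (2, 1, 0, 1)]
print(norm_squared(wedge(frame)) == gram_det(frame))
```

Both routes give $|\alpha|^2 = 19$ for Example 4, so the area is $\sqrt{19}/2$ either way.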

PROBLEMS

1. Simplify (n = 6):
(a) es A es A e24.
(b) eg A e3 A e6e.
(c) e1 A (e14 + e684).
(d) (e; + 3e4 — e6) A (2e23 + e36) A e45.
(e) (e12 + e13) A (e34 + e25) A (e56 + e468).
2. Evaluate the indicated scalar products (n = 4), using (6-14).
(a) (et + e?) - (e1 + eg).
(b) e!? + e34.
(c) e!94- (e431 + 3e124).
(d) (e1 — e*) A (e?-+ e*) - (ex + 2e4) A (e2 — 2e4).
3. Using Theorem 19 show that (2e; + e3, e2 + e4, e1 + e4, e3 + e4) is a frame
for E*. What is its orientation?

4, Do e; + e4, e2 + es, e3 + e6, €1 + €5, e2 + e6, €3 + e4 form a basis for H®?


5. Show that (e1 — e2, e2 — e3) and (3e1 — eg — 2e3, 2e1 — eg — e3) are frames
for the same vector subspace of 3. Do their orientations agree?
6. Find the area of the triangle with vertices 2e3, e1— e2 + 2e3, e1 + 3e3.
7. Find the volume of the 3-simplex in H* with vertices 0, e1 — e3, e2, e3 + 2e4.
8. Let K be an r-parallelepiped spanned by hj,...,h, with 0 as vertex. For each
increasing \ let K* = X*(K) where X is the projection onto the r-space of
e,. (X* leaves the components z41,...,2'* of any x unchanged and replaces
each of the other components by 0.) Let a = hi A -:: A h,. Show that
lo} = V,(K*) and hence [V,(K)]? = do: [V-(K®)]*. Illustrate for n = 3 and
ep == ily A
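9. Show that:
(a) If $\alpha$ is decomposable, then $\alpha \wedge \alpha = 0$.
(b) If $\alpha$ and $\beta$ are decomposable $r$-vectors, then $(\alpha + \beta) \wedge (\alpha + \beta) = 2\alpha \wedge \beta$
if $r$ is even and is 0 if $r$ is odd.
(c) The 2-vector $\mathbf{e}_{12} + \mathbf{e}_{34}$ is not decomposable. [Hint: Use (a).]
10. Let $\alpha$ and $\beta$ be decomposable nonzero 2-vectors, and $P$, $Q$ be their respective
2-spaces. Show that if $P \cap Q = \{0\}$, then $\alpha + \beta$ is not decomposable; and if
$P \cap Q$ is a line through 0, then $\alpha + \beta$ is decomposable and $\alpha \neq c\beta$. [Hints: In
the first instance $\alpha = \mathbf{h} \wedge \mathbf{k}$, $\beta = \mathbf{h}' \wedge \mathbf{k}'$, where $\{\mathbf{h}, \mathbf{k}, \mathbf{h}', \mathbf{k}'\}$ is a linearly
independent set. In the second $\alpha = \mathbf{h} \wedge \mathbf{k}$, $\beta = \mathbf{h} \wedge \mathbf{k}'$, where $\mathbf{h} \in P \cap Q$.]
11. Let $\alpha = \mathbf{h} \wedge \mathbf{k}$, $\alpha \neq 0$. Show that the matrix $(\alpha^{ij})$ has rank 2. [Hint: Show
that each column vector of the matrix is a linear combination of $\mathbf{h}$ and $\mathbf{k}$.]
12. Let $(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_r)$ be an $(r + 1)$-tuple such that the vectors $\mathbf{x}_1 - \mathbf{x}_0, \ldots, \mathbf{x}_r - \mathbf{x}_0$
are linearly independent. Such an $(r + 1)$-tuple defines an oriented $r$-simplex.
Its $r$-vector is $(1/r!)(\mathbf{x}_1 - \mathbf{x}_0) \wedge \cdots \wedge (\mathbf{x}_r - \mathbf{x}_0)$. Let $\beta_i$ be the $(r - 1)$-vector
of the $i$th oriented face $(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{i-1}, \mathbf{x}_{i+1}, \ldots, \mathbf{x}_r)$. Show that

$$\sum_{i=0}^{r} (-1)^i \beta_i = 0.$$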

6-4 INDUCED LINEAR TRANSFORMATIONS

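Let $m$ and $n$ be positive integers. With any linear transformation $L$ from
$E^m$ into $E^n$ is associated for each $r = 1, 2, \ldots$ a linear transformation $L_r$ from
$E^m_r$ into $E^n_r$ with the following property. If $(\mathbf{k}_1, \ldots, \mathbf{k}_r)$ is any $r$-tuple of vectors
in $E^m$, then we require that

$$L_r(\mathbf{k}_1 \wedge \cdots \wedge \mathbf{k}_r) = L(\mathbf{k}_1) \wedge \cdots \wedge L(\mathbf{k}_r). \tag{6-19}$$

For $r = 2$ this is illustrated in Fig. 6-2.

Figure 6-2

With this in mind let us define $L_r$ as follows. Let $\boldsymbol{\epsilon}_1, \ldots, \boldsymbol{\epsilon}_m$ be the standard
basis elements of $E^m$. Then $\mathbf{v}_j = L(\boldsymbol{\epsilon}_j)$ is the $j$th column vector of $L$. Let
$\mu = (j_1, \ldots, j_r)$ be increasing. Then $\boldsymbol{\epsilon}_\mu = \boldsymbol{\epsilon}_{j_1} \wedge \cdots \wedge \boldsymbol{\epsilon}_{j_r}$, and remembering
that we want (6-19) to be correct, we set

$$L_r(\boldsymbol{\epsilon}_\mu) = \mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r}. \tag{6-20a}$$

Since $L_r$ is to be linear, its value at any $\beta$ is determined once the values at the
basis elements $\boldsymbol{\epsilon}_\mu$ are known. For any $\beta = \sum_{[\mu]} \beta^\mu \boldsymbol{\epsilon}_\mu$,

$$L_r(\beta) = \sum_{[\mu]} \beta^\mu L_r(\boldsymbol{\epsilon}_\mu). \tag{6-21}$$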
[H]

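The linear transformations $L_r$ are said to be induced by $L$. Of course
$L_1 = L$. If $r > m$, then $E^m_r$ has the single element 0 and $L_r(0) = 0$.
Let us show that for every $\beta \in E^m_r$, $\gamma \in E^m_s$,

$$L_{r+s}(\beta \wedge \gamma) = L_r(\beta) \wedge L_s(\gamma). \tag{6-22a}$$

Let $\mu = (j_1, \ldots, j_r)$ and $\nu = (k_1, \ldots, k_s)$ be increasing. Then

$$L_r(\boldsymbol{\epsilon}_\mu) \wedge L_s(\boldsymbol{\epsilon}_\nu) = (\mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r}) \wedge (\mathbf{v}_{k_1} \wedge \cdots \wedge \mathbf{v}_{k_s}) = \mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r} \wedge \mathbf{v}_{k_1} \wedge \cdots \wedge \mathbf{v}_{k_s}.$$

If any integer is repeated in the $(r + s)$-tuple $(\mu, \nu)$, then this is 0. Otherwise,
the right-hand side is $(-1)^p L_{r+s}(\boldsymbol{\epsilon}_\tau)$ where $\tau$ is the increasing $(r + s)$-tuple
obtained from $(\mu, \nu)$ by $p$ interchanges. Since $\boldsymbol{\epsilon}_{\mu,\nu} = (-1)^p \boldsymbol{\epsilon}_\tau$ and $L_{r+s}$ is
linear, $(-1)^p L_{r+s}(\boldsymbol{\epsilon}_\tau) = L_{r+s}(\boldsymbol{\epsilon}_{\mu,\nu})$. Thus

$$L_r(\boldsymbol{\epsilon}_\mu) \wedge L_s(\boldsymbol{\epsilon}_\nu) = L_{r+s}(\boldsymbol{\epsilon}_\mu \wedge \boldsymbol{\epsilon}_\nu).$$

Therefore (6-22a) is correct for basis elements of $E^m_r$ and $E^m_s$. Since each of
these transformations is linear, (6-22a) then holds in general.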
By induction there is a generalization of (6-22a) for products of any num-
ber of multivectors. In particular, in this way we get the required formula
(6-19) for products of vectors.
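Let $\beta \in E^m_r$ and $\alpha = L_r(\beta)$. Let us find a formula for the components $\alpha^\lambda$
in terms of the components of $\beta$. If $\lambda = (i_1, \ldots, i_r)$, $\mu = (j_1, \ldots, j_r)$, and
$(c^i_j)$ is the matrix of $L$, let

$$c^\lambda_\mu = \det(c^{i_k}_{j_l}).$$

By (6-13a), $c^\lambda_\mu$ is the $\lambda$th component of $\mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r}$. By (6-20a) and (6-21),

$$\sum_{[\lambda]} \alpha^\lambda \mathbf{e}_\lambda = \sum_{[\mu]} \beta^\mu\, \mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r}.$$

Since both sides have the same components,

$$\alpha^\lambda = \sum_{[\mu]} c^\lambda_\mu \beta^\mu. \tag{6-23a}$$

When $r = 1$, this becomes (4-4a).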
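
Formula (6-23a) says that the matrix of $L_r$ consists of the $r \times r$ minors of the matrix of $L$. As a hedged illustration (the function names and the sample matrix below are invented for this sketch, and indices are 0-based), one can confirm property (6-19) for $r = 2$ by computing both sides independently:

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def minor(mat, rows, cols):
    """Determinant of the submatrix of mat with the given rows and columns."""
    return det([[mat[i][j] for j in cols] for i in rows])


def wedge(vectors):
    """Components of v_1 ^ ... ^ v_r, keyed by increasing 0-based index tuples."""
    r, n = len(vectors), len(vectors[0])
    columns = [[v[i] for v in vectors] for i in range(n)]  # n x r, the vectors are columns
    return {lam: minor(columns, lam, range(r)) for lam in combinations(range(n), r)}


def apply(L, k):
    """Matrix-vector product L k."""
    return [sum(L[i][j] * k[j] for j in range(len(k))) for i in range(len(L))]


def induced(L, r, beta):
    """alpha^lambda = sum over increasing mu of c^lambda_mu * beta^mu, as in (6-23a)."""
    n, m = len(L), len(L[0])
    return {lam: sum(minor(L, lam, mu) * beta[mu] for mu in combinations(range(m), r))
            for lam in combinations(range(n), r)}


# An arbitrary L from E^3 into E^4 and two vectors k_1, k_2 in E^3.
L = [[1, 2, 0], [0, 1, -1], [3, 0, 1], [1, 1, 1]]
k1, k2 = (1, 0, 2), (0, 1, -1)

lhs = induced(L, 2, wedge([k1, k2]))          # L_2(k_1 ^ k_2)
rhs = wedge([apply(L, k1), apply(L, k2)])     # L(k_1) ^ L(k_2)
print(lhs == rhs)
```

The agreement of the two sides is an instance of the Cauchy-Binet formula for minors of a matrix product.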


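The dual transformation. Let $L_r^*$ be the linear transformation from $(E^n_r)^*$
into $(E^m_r)^*$ which is dual to $L_r$. It is defined from the formula

$$\omega \cdot L_r(\beta) = L_r^*(\omega) \cdot \beta \tag{6-24}$$

for every $\beta \in E^m_r$, $\omega \in (E^n_r)^*$. Let us prove the formula dual to (6-20a):

$$L_r^*(\mathbf{e}^\lambda) = \mathbf{w}^{i_1} \wedge \cdots \wedge \mathbf{w}^{i_r}, \tag{6-20b}$$

where $\mathbf{w}^1, \ldots, \mathbf{w}^n$ are the row covectors of $L$. By (6-13b) the $\mu$th component
of $\mathbf{w}^{i_1} \wedge \cdots \wedge \mathbf{w}^{i_r}$ is $c^\lambda_\mu$. If in (6-24) we set $\omega = \mathbf{e}^\lambda$, $\beta = \boldsymbol{\epsilon}_\mu$ and recall (6-20a),
then

$$\mathbf{e}^\lambda \cdot (\mathbf{v}_{j_1} \wedge \cdots \wedge \mathbf{v}_{j_r}) = L_r^*(\mathbf{e}^\lambda) \cdot \boldsymbol{\epsilon}_\mu.$$

The left-hand side equals $c^\lambda_\mu$, and the right-hand side is the $\mu$th component of
$L_r^*(\mathbf{e}^\lambda)$. Since both sides of (6-20b) have the same components $c^\lambda_\mu$, they are equal.
From (6-20b) the formulas dual to (6-19)-(6-23a) follow in the same way
as before.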

PROBLEMS
1. Let m = 2,n = 3, L(s, 2t) = (s — 2t)ey — seo -+ (28+ 3t)e3. Find:
(a) L*(a). (b) cf). (c) Le(@).
(d) L3(). Opi Gay
2. Prove the dual of (6-22a):

$$L^*_{r+s}(\omega \wedge \zeta) = L^*_r(\omega) \wedge L^*_s(\zeta) \tag{6-22b}$$

if $\omega \in (E^n_r)^*$, $\zeta \in (E^n_s)^*$.
3. Prove the dual of (6-23a):

= et ene: (6-23b)
IM
4. Let $L$ be an orthogonal transformation of $E^n$. Show that $|L_r(\alpha)| = |\alpha|$:
(a) If $\alpha$ is decomposable. [Hint: (6-18).]
(b) For any $r$-vector $\alpha$.

6-5 DIFFERENTIAL FORMS

In Section 2-6 a differential form of degree 1 was defined as a covector-valued
function. It was shown that any such differential form $\omega$ is a linear
combination of $dx^1, \ldots, dx^n$,

$$\omega = \sum_{i=1}^{n} \omega_i\, dx^i,$$

where the coefficients $\omega_1, \ldots, \omega_n$ are real-valued functions. For $r \ge 2$ a
differential form of degree $r$ is supposed to be an alternating polynomial of
degree $r$ in $dx^1, \ldots, dx^n$ with coefficients $\omega_\lambda$ which are real-valued functions.
This idea is expressed more precisely by the following definition.

Definition. A differential form of degree $r$ is a function $\omega$ with domain
$D \subset E^n$ and values in $(E^n_r)^*$. The value of $\omega$ at $\mathbf{x}$ is denoted by $\omega(\mathbf{x})$.

The values of $\omega$ are $r$-covectors. The same Greek letters $\omega$ and $\zeta$ used in
the last section to denote $r$-covectors are now used to denote differential forms.
The context will indicate clearly which is intended.
For brevity we say "$r$-form" instead of "differential form of degree $r$."
It is convenient to call any real-valued function $f$ a 0-form. If $r > n$, then the
only $r$-form is the one which has the value 0 for every $\mathbf{x} \in D$. We also use 0
to denote this $r$-form.
Let $\omega$ be an $r$-form and $\zeta$ be an $s$-form, with the same domain $D$. The
exterior product $\omega \wedge \zeta$ is the $(r + s)$-form defined by

$$(\omega \wedge \zeta)(\mathbf{x}) = \omega(\mathbf{x}) \wedge \zeta(\mathbf{x})$$

for every $\mathbf{x} \in D$. Similarly, $f\omega$ is the $r$-form such that

$$(f\omega)(\mathbf{x}) = f(\mathbf{x})\,\omega(\mathbf{x})$$

for every $\mathbf{x} \in D$. The rules for multiplication of multicovectors described in
Proposition 20 hold also for products of differential forms.
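We recall that $dx^i$ is the 1-form with constant value $\mathbf{e}^i$ for every $\mathbf{x}$. Then
$dx^i \wedge dx^j$ is the 2-form with constant value $\mathbf{e}^i \wedge \mathbf{e}^j = \mathbf{e}^{ij}$ for every $\mathbf{x}$. Let
us denote the components of the 2-covector $\omega(\mathbf{x})$ by $\omega_{ij}(\mathbf{x})$. By (6-8)

$$\omega(\mathbf{x}) = \sum_{i<j} \omega_{ij}(\mathbf{x})\, \mathbf{e}^{ij}$$

for every $\mathbf{x} \in D$. If $\omega_{ij}$ is the real-valued function whose value at $\mathbf{x}$ is $\omega_{ij}(\mathbf{x})$,
then

$$\omega = \sum_{i<j} \omega_{ij}\, dx^i \wedge dx^j.$$

(Strictly speaking, the 2-form $dx^i \wedge dx^j$ has domain $E^n$, and we mean here
its restriction to $D$.)

Similarly, for any $r$-tuple $\lambda = (i_1, \ldots, i_r)$ the $r$-form $dx^{i_1} \wedge \cdots \wedge dx^{i_r}$
has constant value $\mathbf{e}^\lambda$. Hence if $\omega$ is an $r$-form, then

$$\omega = \sum_{[\lambda]} \omega_\lambda\, dx^{i_1} \wedge \cdots \wedge dx^{i_r}, \tag{6-25}$$

where the value of $\omega_\lambda$ at $\mathbf{x}$ is $\omega_\lambda(\mathbf{x})$. Using Problem 4, Section 6-1, one can
also write

$$\omega = \frac{1}{r!} \sum_{\lambda} \omega_\lambda\, dx^{i_1} \wedge \cdots \wedge dx^{i_r}.$$

We say that an $r$-form $\omega$ is of class $C^{(q)}$ if the functions $\omega_\lambda$ in (6-25) are
of class $C^{(q)}$.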
We recall that if $f$ is a 0-form of class $C^{(1)}$, then $df$ is the 1-form

$$df = f_1\, dx^1 + \cdots + f_n\, dx^n,$$

where $f_1, \ldots, f_n$ are the partial derivatives. In particular, if $\omega$ is an $r$-form of
class $C^{(1)}$, then $\omega_\lambda$ is a 0-form of class $C^{(1)}$ and $d\omega_\lambda$ is defined.

Definition. Let $\omega$ be an $r$-form of class $C^{(1)}$. The exterior differential $d\omega$
is the $(r + 1)$-form defined by the formula

$$d\omega = \sum_{[\lambda]} d\omega_\lambda \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_r}. \tag{6-26}$$

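Example 1. Let $r = 1$, $\omega = \omega_1\, dx^1 + \cdots + \omega_n\, dx^n$. Then

$$d\omega = \sum_{i=1}^{n} d\omega_i \wedge dx^i = \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} \frac{\partial \omega_i}{\partial x^j}\, dx^j\Big) \wedge dx^i.$$

From the formulas $\mathbf{e}^j \wedge \mathbf{e}^i = -\mathbf{e}^i \wedge \mathbf{e}^j$, $\mathbf{e}^i \wedge \mathbf{e}^i = 0$, we have

$$dx^j \wedge dx^i = -dx^i \wedge dx^j, \qquad dx^i \wedge dx^i = 0.$$

Therefore

$$d\omega = \sum_{i<j} \Big(\frac{\partial \omega_j}{\partial x^i} - \frac{\partial \omega_i}{\partial x^j}\Big)\, dx^i \wedge dx^j. \tag{6-27}$$

In particular, if $n = 2$ and $\omega = M\, dx + N\, dy$, then

$$d\omega = dM \wedge dx + dN \wedge dy = \Big(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\Big)\, dx \wedge dy.$$

Example 2. If $r = n$, then

$$\omega = f\, dx^1 \wedge \cdots \wedge dx^n,$$

where $f = \omega_{1 \cdots n}$. Since $d\omega$ is an $(n + 1)$-form, $d\omega = 0$.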



Example 3. If $n = 3$ and $\omega = 2\,dx + z^2\,dy + x^2 y\,dz$, then

$$d\omega = d(2) \wedge dx + d(z^2) \wedge dy + d(x^2 y) \wedge dz = 2z\,dz \wedge dy + 2xy\,dx \wedge dz + x^2\,dy \wedge dz.$$
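
The definition (6-26) can be carried out mechanically when the coefficients are polynomials. The following Python sketch is an illustration under stated assumptions, not the book's method: a polynomial is stored as a dictionary from exponent tuples to integer coefficients, a form as a dictionary from increasing 0-based index tuples to polynomials, and the names `pdiff`, `padd`, and `ext_d` are hypothetical. It recomputes Example 3 and checks that applying $d$ twice gives 0.

```python
def pdiff(p, j):
    """Partial derivative of polynomial p with respect to variable j."""
    out = {}
    for exps, c in p.items():
        if exps[j] > 0:
            lowered = list(exps)
            lowered[j] -= 1
            out[tuple(lowered)] = c * exps[j]
    return out


def padd(p, q, sign=1):
    """Return p + sign * q, dropping terms whose coefficient becomes 0."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + sign * c
        if out[key] == 0:
            del out[key]
    return out


def ext_d(form, n):
    """Exterior differential: sum of d(coefficient) ^ dx^lambda over the terms of the form."""
    out = {}
    for lam, p in form.items():
        for j in range(n):
            if j in lam:
                continue  # dx^j ^ dx^lambda = 0 when j is repeated
            dp = pdiff(p, j)
            if not dp:
                continue
            # Moving dx^j into increasing position costs one interchange
            # for every index of lambda that is smaller than j.
            sign = (-1) ** sum(1 for i in lam if i < j)
            key = tuple(sorted(lam + (j,)))
            merged = padd(out.get(key, {}), dp, sign)
            if merged:
                out[key] = merged
            else:
                out.pop(key, None)
    return out


# Variables (x, y, z) are indexed 0, 1, 2.
# omega = 2 dx + z^2 dy + x^2 y dz; the constant first coefficient contributes nothing to d(omega).
omega = {(0,): {(0, 0, 0): 2}, (1,): {(0, 0, 2): 1}, (2,): {(2, 1, 0): 1}}
d_omega = ext_d(omega, 3)
print(d_omega)
print(ext_d(d_omega, 3))
```

In the increasing basis the result reads $(x^2 - 2z)\,dy \wedge dz + 2xy\,dx \wedge dz$, which agrees with the example since $dz \wedge dy = -dy \wedge dz$.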

Proposition 24. The exterior differential has the following properties:

(1) $d(\omega + \zeta) = d\omega + d\zeta$, if $\omega$ and $\zeta$ are $r$-forms of class $C^{(1)}$.
(2) $d(\omega \wedge \zeta) = d\omega \wedge \zeta + (-1)^r \omega \wedge d\zeta$, if $\omega$ is an $r$-form and $\zeta$ is an
$s$-form, both of class $C^{(1)}$.
(3) $d(d\omega) = 0$ if $\omega$ is an $r$-form of class $C^{(2)}$.

Note: If $r = 0$ we agree that $f \wedge \zeta = f\zeta$. Similarly, if $s = 0$ then
$\omega \wedge f = f\omega$. The proposition remains true if $r = 0$ or $s = 0$.
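Proof. The coefficients of $\omega + \zeta$ in (6-25) are $\omega_\lambda + \zeta_\lambda$ and $d(\omega_\lambda + \zeta_\lambda) =
d\omega_\lambda + d\zeta_\lambda$. Therefore (1) holds. Similarly, $d(c\omega) = c\,d\omega$.
To prove (2) let us for brevity set

$$E^\lambda = dx^{i_1} \wedge \cdots \wedge dx^{i_r}.$$

Let us first show that

$$d(fE^\lambda \wedge E^\nu) = df \wedge E^\lambda \wedge E^\nu. \tag{*}$$

If any integer is repeated in the $(r + s)$-tuple $(\lambda, \nu)$ then both sides are 0. Otherwise,
$E^\lambda \wedge E^\nu = (-1)^p E^\tau$ where $\tau$ is increasing. By definition, $d(fE^\tau) =
df \wedge E^\tau$. Multiplying both sides by $(-1)^p$ we get (*). Now

$$\omega \wedge \zeta = \sum_{[\lambda][\nu]} \omega_\lambda \zeta_\nu\, E^\lambda \wedge E^\nu.$$

By the ordinary product rule

$$d(\omega_\lambda \zeta_\nu) = \zeta_\nu\, d\omega_\lambda + \omega_\lambda\, d\zeta_\nu.$$

By (*) with $f = \omega_\lambda \zeta_\nu$,

$$d(\omega_\lambda \zeta_\nu\, E^\lambda \wedge E^\nu) = (\zeta_\nu\, d\omega_\lambda + \omega_\lambda\, d\zeta_\nu) \wedge E^\lambda \wedge E^\nu.$$

Since $d\zeta_\nu$ has degree 1 and $E^\lambda$ degree $r$, by (3) of Proposition 20

$$d\zeta_\nu \wedge E^\lambda = (-1)^r E^\lambda \wedge d\zeta_\nu.$$

The scalar-valued function $\zeta_\nu$ commutes with any differential form. Hence

$$d(\omega_\lambda \zeta_\nu\, E^\lambda \wedge E^\nu) = (d\omega_\lambda \wedge E^\lambda) \wedge (\zeta_\nu E^\nu) + (-1)^r (\omega_\lambda E^\lambda) \wedge (d\zeta_\nu \wedge E^\nu). \tag{**}$$

Using (1),

$$d(\omega \wedge \zeta) = \sum_{[\lambda][\nu]} d(\omega_\lambda \zeta_\nu\, E^\lambda \wedge E^\nu),$$

while

$$\sum_{[\lambda][\nu]} (d\omega_\lambda \wedge E^\lambda) \wedge (\zeta_\nu E^\nu) = \Big(\sum_{[\lambda]} d\omega_\lambda \wedge E^\lambda\Big) \wedge \Big(\sum_{[\nu]} \zeta_\nu E^\nu\Big) = d\omega \wedge \zeta.$$

Similarly,

$$(-1)^r \sum_{[\lambda][\nu]} (\omega_\lambda E^\lambda) \wedge (d\zeta_\nu \wedge E^\nu) = (-1)^r \omega \wedge d\zeta,$$

which proves (2).
If $f$ is of class $C^{(2)}$, then from (6-27)

$$d(df) = \sum_{i<j} \Big(\frac{\partial^2 f}{\partial x^i\, \partial x^j} - \frac{\partial^2 f}{\partial x^j\, \partial x^i}\Big)\, dx^i \wedge dx^j = 0.$$

The form $E^\lambda$ has constant coefficients and hence $dE^\lambda = 0$. Using the product
rule (2), $d(df \wedge E^\lambda) = 0$. Taking $f = \omega_\lambda$ and using (1),

$$d(d\omega) = d\Big(\sum_{[\lambda]} d\omega_\lambda \wedge E^\lambda\Big) = \sum_{[\lambda]} d(d\omega_\lambda \wedge E^\lambda) = 0.$$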
Definition. An $r$-form $\omega$ is closed if $d\omega = 0$. If $\omega = d\zeta$ for some $(r - 1)$-form
$\zeta$, then $\omega$ is an exact $r$-form.

If $r = 1$, these definitions agree with the ones given in Section 2-6. If
$\omega$ is exact and $\zeta$ can be chosen to be of class $C^{(2)}$, then $d\omega = d(d\zeta) = 0$. Hence
$\omega$ is closed. Poincaré's lemma states that if the domain $D$ is star-shaped then
conversely any closed form $\omega$ is exact. This will be proved in Section 7-7.

*Remark. The exterior differential $d$ is uniquely determined by Properties
(1), (2), (3) and the following property.
(4) For $r = 0$, $df$ agrees with its definition in Section 2-6.

Let $d'$ also have these four properties. Then $d'E^\lambda = d'(d'x^{i_1} \wedge E^\nu)$, where
$\nu = (i_2, \ldots, i_r)$. But $dx^i = d'x^i$ by (4) since $dx^i$ stands for the differential of
the coordinate function $X^i$. Using (2), (3), and induction on $r$, $d'E^\lambda = 0$.
Using (2) and (4), $d'(fE^\lambda) = df \wedge E^\lambda$. Using (1),

$$d'\omega = d'\Big(\sum_{[\lambda]} \omega_\lambda E^\lambda\Big) = \sum_{[\lambda]} d'(\omega_\lambda E^\lambda) = \sum_{[\lambda]} d\omega_\lambda \wedge E^\lambda.$$

Thus $d'\omega = d\omega$ for every $\omega$ of class $C^{(1)}$. In particular, this proves that the
exterior differential $d$ is "coordinate free."
Transformation law for differential forms. Let $\mathbf{g}$ be a transformation of
class $C^{(1)}$ from an open set $A \subset E^m$ into $E^n$. Let $D$ be an open set containing
the image $\mathbf{g}(A)$. If $\omega$ is any $r$-form with domain $D$, then there is a corresponding
$r$-form denoted by $\omega^\sharp$ with domain $A$. Formally, $\omega^\sharp$ is obtained by merely
substituting $\mathbf{g}(\mathbf{t})$ for $\mathbf{x}$ and $dg^i$ for $dx^i$. The precise definition of $\omega^\sharp$ is as follows.

Figure 6-3

Definition. For each $\mathbf{t} \in A$, the value of $\omega^\sharp$ at $\mathbf{t}$ is

$$\omega^\sharp(\mathbf{t}) = L_r^*[\omega(\mathbf{x})], \tag{6-28}$$

where

$$\mathbf{x} = \mathbf{g}(\mathbf{t}), \qquad L = D\mathbf{g}(\mathbf{t}),$$

and $L_r^*$ is the dual linear transformation induced by $L$ (p. 224). In case $r = 0$
we agree that $f^\sharp = f \circ \mathbf{g}$. (The notation $\sharp$ is used for brevity even though it
does not indicate the dependence on $\mathbf{g}$ and the degree $r$.)

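Proposition 25. The operation $\sharp$ has the following properties:

(1) $(\omega + \zeta)^\sharp = \omega^\sharp + \zeta^\sharp$ if $\omega$ and $\zeta$ are of degree $r$.
(2) $(\omega \wedge \zeta)^\sharp = \omega^\sharp \wedge \zeta^\sharp$, if $\omega$ is of degree $r$ and $\zeta$ of degree $s$.
(3) $(df)^\sharp = d(f \circ \mathbf{g})$, if $f$ is of class $C^{(1)}$.
(4) $(dx^{i_1} \wedge \cdots \wedge dx^{i_r})^\sharp = dg^{i_1} \wedge \cdots \wedge dg^{i_r}$.
(5) $d\omega^\sharp = (d\omega)^\sharp$, if $\omega$ is of class $C^{(1)}$ and $\mathbf{g}$ of class $C^{(2)}$.

Proof. Since $L_r^*$ is linear,

$$(\omega + \zeta)^\sharp(\mathbf{t}) = \omega^\sharp(\mathbf{t}) + \zeta^\sharp(\mathbf{t}).$$

Since this is true for every $\mathbf{t} \in A$, this proves (1). Using (6-22b), we get

$$(\omega \wedge \zeta)^\sharp(\mathbf{t}) = \omega^\sharp(\mathbf{t}) \wedge \zeta^\sharp(\mathbf{t})$$

for every $\mathbf{t} \in A$, which proves (2). By the chain rule (p. 106), $d(f \circ \mathbf{g})(\mathbf{t}) =
L_1^*[df(\mathbf{x})]$. By (6-28) the right-hand side is $(df)^\sharp(\mathbf{t})$. Thus (3) holds. Recall
that $dx^i$ stands for $dX^i$, where $X^i(\mathbf{x}) = x^i$ for each $\mathbf{x}$. Then $g^i = X^i \circ \mathbf{g}$ and
from (3) with $f = X^i$,

$$(dx^i)^\sharp = dg^i. \tag{6-29}$$

Then (4) follows from this and (2). To prove (5) we have from (1)-(4)

$$\omega^\sharp = \sum_{[\lambda]} (\omega_\lambda \circ \mathbf{g})\, dg^{i_1} \wedge \cdots \wedge dg^{i_r},$$

$$d\omega^\sharp = \sum_{[\lambda]} d\big((\omega_\lambda \circ \mathbf{g})\, dg^{i_1} \wedge \cdots \wedge dg^{i_r}\big).$$

By (3), $(d\omega_\lambda)^\sharp = d(\omega_\lambda \circ \mathbf{g})$. Since $\mathbf{g}$ is of class $C^{(2)}$, $d(dg^i) = 0$. Therefore,
by the product rule

$$d(dg^{i_1} \wedge \cdots \wedge dg^{i_r}) = 0.$$

Using the product rule again, we have

$$d\omega^\sharp = \sum_{[\lambda]} (d\omega_\lambda)^\sharp \wedge dg^{i_1} \wedge \cdots \wedge dg^{i_r} = (d\omega)^\sharp.$$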

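As in Chapter 4 let $g^i_j$ denote the $j$th partial derivative of the component
$g^i$. Let

$$g^\lambda_\mu = \frac{\partial(g^{i_1}, \ldots, g^{i_r})}{\partial(t^{j_1}, \ldots, t^{j_r})} = \det(g^{i_k}_{j_l}).$$

The matrix of $D\mathbf{g}(\mathbf{t})$ is $(g^i_j(\mathbf{t}))$, and the row covectors are $dg^1(\mathbf{t}), \ldots, dg^n(\mathbf{t})$.
By (6-13b) the $\mu$th component of $dg^{i_1} \wedge \cdots \wedge dg^{i_r}$ is $g^\lambda_\mu$. Therefore

$$(dx^{i_1} \wedge \cdots \wedge dx^{i_r})^\sharp = \sum_{[\mu]} g^\lambda_\mu\, dt^{j_1} \wedge \cdots \wedge dt^{j_r}. \tag{6-30}$$

In applying this formula in the next chapter we shall usually take $r = m$.
In that case the only increasing $r$-tuple is $\mu = (1, 2, \ldots, r)$ and the right-hand
side of (6-30) has just one term.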

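Example 4. Let $n = 3$, $r = m = 2$, $(x, y, z) = \mathbf{g}(s, t)$. Then

$$(dx \wedge dy)^\sharp = \frac{\partial(g^1, g^2)}{\partial(s, t)}\, ds \wedge dt.$$

If $\omega = f\, dx \wedge dz$, then $\omega^\sharp = f \circ \mathbf{g}\, (dx \wedge dz)^\sharp$.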

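Example 5. Let $m = n = r$. Then, writing $f = \omega_{1 \cdots n}$, we have

$$\omega^\sharp = (f\, dx^1 \wedge \cdots \wedge dx^n)^\sharp = f \circ \mathbf{g}\, (dx^1 \wedge \cdots \wedge dx^n)^\sharp = f \circ \mathbf{g}\, \frac{\partial(g^1, \ldots, g^n)}{\partial(t^1, \ldots, t^n)}\, dt^1 \wedge \cdots \wedge dt^n.$$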
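
For $m = n = r$ the transformed form has a single coefficient, $f \circ \mathbf{g}$ times the Jacobian determinant of $\mathbf{g}$. As a rough numerical illustration (the polar-coordinate map, the finite-difference step, and all function names are assumptions made for this sketch), the coefficient of $(dx \wedge dy)^\sharp$ under $x = s\cos\theta$, $y = s\sin\theta$ should be close to $s$:

```python
from math import cos, sin, isclose


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def jacobian(g, t, step=1e-6):
    """Approximate the matrix (g^i_j(t)) by central differences."""
    m, n = len(t), len(g(t))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        plus, minus = list(t), list(t)
        plus[j] += step
        minus[j] -= step
        g_plus, g_minus = g(plus), g(minus)
        for i in range(n):
            J[i][j] = (g_plus[i] - g_minus[i]) / (2 * step)
    return J


def pullback_coefficient(f, g, t):
    """Coefficient of dt^1 ^ ... ^ dt^n in the transform of f dx^1 ^ ... ^ dx^n (case m = n)."""
    return f(g(t)) * det(jacobian(g, t))


def polar(t):
    s, theta = t
    return (s * cos(theta), s * sin(theta))


coefficient = pullback_coefficient(lambda x: 1.0, polar, (2.0, 0.7))
print(isclose(coefficient, 2.0, rel_tol=1e-6))
```

The finite-difference Jacobian is only an approximation; an exact treatment would differentiate $\mathbf{g}$ symbolically.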
Example 6. Let $m = r = 1$. Then $\omega^\sharp(t) = \omega[\mathbf{g}(t)] \cdot \mathbf{g}'(t)$, and the definition (3-8a)
of the line integral can be rewritten

$$\int_\gamma \omega = \int_a^b \omega^\sharp.$$

*Note. In tensor language a differential form of degree $r$, being an $r$-covector-valued
function, is called a covariant alternating tensor field of rank $r$. From (6-23b)
we obtain the transformation law for the components of such a tensor field:

$$\zeta_\mu = \sum_{[\lambda]} (\omega_\lambda \circ \mathbf{g})\, g^\lambda_\mu \quad \text{if } \zeta = \omega^\sharp.$$

PROBLEMS
Assume that all forms which appear are of class C).
. Find the exterior differential of:
(a) «2y dy — xy? dz.
(b) cos (ry?) dx A dz.
(c) f(a, z) dz.
(d) ady A dz+ydz fA dx-+2zdz A dy.
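2. Let $P$, $Q$, $R$ have domain $D \subset E^3$. Show that

$$d(P\, dy \wedge dz + Q\, dz \wedge dx + R\, dx \wedge dy) = \Big(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}\Big)\, dx \wedge dy \wedge dz.$$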

. (a) Find an (n — 1)-form ¢ such that df = dz! A -++ A dx”. [Hint: Prob-
lem 1(d).]
(b) Find an (r — 1)-form ¢ such that d?* = E?.
(c) Show that if the coefficients w, in (6-25) are constant functions, then w is
exact.
. (a) Show that if w and ¢ are closed differential forms, then w A ¢ is closed.
(b) Show that if w is closed and ¢ is exact, thenw / ¢ is exact.
. Find the exterior differential of:
(a) "dw \ > —w i dg.
(b) dwA SEAn+w di A n+owA ¢ A dy, if w and § are of even degree,
. A function f is an integrating factor for a 1-form w if f(x) + 0 for every x € D
and fw is closed. Show that if w has an integrating factor thenw A dw = 0.
_.Letn = m=2,r=1, 0 = Mdzr+Ndy. Find explicitly dw and w? and
verify that (dw)? = dw’.
. Letn = m = 8, g(s, t, uw) = (scost)er + (ssin t)e2 + ue3. Find:
(a) (fdx A dy A dz)*. (b) (a dy A dz)*.
. Show that if w is a 2-form, then

0; Oox: , Iw
dw = >> (aay tea Beit) a’ A dx’ \ da’.
i<j<k

10. Let w!,...,w? be 1-forms such that w* = )°7_ it dg ate lee pe Assume
that the functions f} are of class C™, the g’ are of class C, and that the 1-covectors
ENG@ eset » 0? (x) are linearly independent for every x € D. Find 1-forms 0;
such that dw' = >?_,0; A wi. [Hint: The p X p matrix (fj(x)) must be
nonsingular.]

[Note: Conversely, if dw’ is a linear combination of w!,...,w? with coeffi-


cients 1-forms 6%, then locally functions fj, g’ as above can be found. This result
is called the Frobenius integration theorem, and has important applications in
geometry and differential equations. See [9], p. zl

6-6 THE ADJOINT AND CODIFFERENTIAL


To each $r$-vector $\alpha$ we shall now assign a certain $(n - r)$-vector, which
is called the adjoint of $\alpha$ and is denoted by $*\alpha$. Let us begin with the special
dimension $r = n - 1$, which is the only one needed in connection with the
divergence theorem in the next chapter.
Let $\alpha = \mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_{n-1}$. If $\alpha = 0$, then we set $*\alpha = 0$. If $\alpha \neq 0$,
then $*\alpha$ will turn out to be the vector $\mathbf{h}$ with the following three properties:
(1) $\mathbf{h}$ is a vector normal to the $(n - 1)$-space $P$ spanned by $\mathbf{h}_1, \ldots, \mathbf{h}_{n-1}$;
(2) $(\mathbf{h}, \mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ is a positively oriented frame for $E^n$; (3) $|\mathbf{h}| = |\alpha|$.
Condition (3) says that the length of $\mathbf{h}$ equals $V_{n-1}(K)$, where $K$ is the
$(n - 1)$-parallelepiped with vertex 0 spanned by $\mathbf{h}_1, \ldots, \mathbf{h}_{n-1}$. (See Fig. 6-4.)
With this in mind, let us define $*$ first for the standard basis $(n - 1)$-vectors. Let
$i' = (1, \ldots, i - 1, i + 1, \ldots, n)$. Since $i - 1$ interchanges will change the $n$-tuple
$(i, i')$ into the increasing $n$-tuple $(1, \ldots, n)$,

$$(-1)^{i-1}\, \mathbf{e}_i \wedge \mathbf{e}_{i'} = \mathbf{e}_{1 \cdots n}.$$

Therefore we set

$$*\mathbf{e}_{i'} = (-1)^{i-1}\, \mathbf{e}_i. \tag{6-31}$$

Figure 6-4

We want the operation $*$ to behave linearly.
For any $(n - 1)$-vector $\alpha = \sum_{i=1}^{n} \alpha^{i'} \mathbf{e}_{i'}$, let $*\alpha$ be the vector $\mathbf{h} =
\sum_{i=1}^{n} \alpha^{i'} (*\mathbf{e}_{i'})$. Its components are

$$h^i = (-1)^{i-1} \alpha^{i'}, \qquad i = 1, \ldots, n. \tag{6-32a}$$

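Example 1. Let $n = 3$. In this particular dimension it is useful to consider instead
of the standard basis $\{\mathbf{e}_{23}, \mathbf{e}_{13}, \mathbf{e}_{12}\}$ for $E^3_2$ the basis $\{\mathbf{e}_{23}, \mathbf{e}_{31}, \mathbf{e}_{12}\}$, where $\mathbf{e}_{31} = -\mathbf{e}_{13}$.
Then any 2-vector $\alpha$ can be written $\alpha = \alpha^{23}\mathbf{e}_{23} + \alpha^{31}\mathbf{e}_{31} + \alpha^{12}\mathbf{e}_{12}$, where $\alpha^{31} =
-\alpha^{13}$; and

$$*\mathbf{e}_{23} = \mathbf{e}_1, \qquad *\mathbf{e}_{31} = \mathbf{e}_2, \qquad *\mathbf{e}_{12} = \mathbf{e}_3,$$

$$\alpha^{23} = h^1, \qquad \alpha^{31} = h^2, \qquad \alpha^{12} = h^3 \qquad \text{if } \mathbf{h} = *\alpha.$$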

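Let us show that $*\alpha$ has Properties (1), (2), and (3) above. Given a frame
$(\mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$, these three properties determine a vector, which we denote
temporarily by $\bar{\mathbf{h}}$. Let $(\mathbf{h}_1', \ldots, \mathbf{h}_{n-1}')$ be an orthogonal frame for $P$, $\mathbf{h}_k' \cdot \mathbf{h}_l' = 0$
if $k \neq l$. Then $\alpha$ is a scalar multiple $b$ of $\mathbf{h}_1' \wedge \cdots \wedge \mathbf{h}_{n-1}'$ and replacing
$\mathbf{h}_1'$ by $b\mathbf{h}_1'$ we may assume that $\alpha = \mathbf{h}_1' \wedge \cdots \wedge \mathbf{h}_{n-1}'$. By (1), $\bar{\mathbf{h}} \cdot \mathbf{h}_k' = 0$
for each $k = 1, \ldots, n - 1$. Therefore

$$|\bar{\mathbf{h}} \wedge \alpha| = |\bar{\mathbf{h}}|\, |\mathbf{h}_1'| \cdots |\mathbf{h}_{n-1}'| = |\bar{\mathbf{h}}|\, |\alpha|.$$

By (2), $\bar{\mathbf{h}} \wedge \alpha = c\,\mathbf{e}_{1 \cdots n}$ where $c > 0$. In fact, $c = |\bar{\mathbf{h}} \wedge \alpha|$. Let $\mathbf{h} = *\alpha$.
From (6-32a), $|\mathbf{h}| = |\alpha|$ and consequently $c = |\bar{\mathbf{h}}|\, |\mathbf{h}|$.
On the other hand,

$$\bar{\mathbf{h}} \wedge \alpha = \Big(\sum_{i=1}^{n} \bar{h}^i \mathbf{e}_i\Big) \wedge \Big(\sum_{j=1}^{n} \alpha^{j'} \mathbf{e}_{j'}\Big) = \sum_{i=1}^{n} \bar{h}^i \alpha^{i'}\, \mathbf{e}_i \wedge \mathbf{e}_{i'},$$

since $\mathbf{e}_i \wedge \mathbf{e}_{j'} = 0$ unless $i = j$. By (6-32a)

$$\bar{\mathbf{h}} \wedge \alpha = \Big(\sum_{i=1}^{n} \bar{h}^i h^i\Big)\, \mathbf{e}_{1 \cdots n},$$

and hence $\bar{\mathbf{h}} \cdot \mathbf{h} = c = |\bar{\mathbf{h}}|\, |\mathbf{h}|$. Equality holds in Cauchy's inequality (Section 1-1)
and therefore $\bar{\mathbf{h}}$ is a positive scalar multiple of $\mathbf{h}$. By (3), $|\bar{\mathbf{h}}| = |\mathbf{h}|$,
and hence $\bar{\mathbf{h}} = \mathbf{h}$ as required.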
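
Properties (1), (2), and (3) can also be verified numerically from the component formula (6-32a). The sketch below is illustrative only: `adjoint` is a hypothetical name, the frame in $E^4$ is chosen arbitrarily, and the component $\alpha^{i'}$ is computed as the determinant obtained by deleting the $i$th row of the matrix whose columns are the frame vectors.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def adjoint(frame):
    """The vector h = *(h_1 ^ ... ^ h_{n-1}) with components h^i = (-1)^(i-1) alpha^{i'}."""
    n = len(frame[0])
    assert len(frame) == n - 1
    h = []
    for i in range(n):  # i is 0-based here, so (-1)**i plays the role of (-1)^(i-1)
        rows = [r for r in range(n) if r != i]
        alpha_i = det([[v[r] for v in frame] for r in rows])
        h.append((-1) ** i * alpha_i)
    return h


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


frame = [(1, 0, 2, -1), (0, 3, 1, 1), (2, 1, 0, 1)]
h = adjoint(frame)

# (1) h is normal to each vector of the frame.
print([dot(h, v) for v in frame])
# (2) (h, h_1, ..., h_{n-1}) is positively oriented: the determinant with these columns is positive.
columns = [h] + [list(v) for v in frame]
d = det([[col[i] for col in columns] for i in range(4)])
print(d > 0)
# (3) |h|^2 equals the sum of squares of the components of alpha; here it also equals d.
print(dot(h, h) == d)
```

For $n = 3$ the same function returns the familiar cross product, for example `adjoint([(1, 0, 0), (0, 1, 0)])` gives `[0, 0, 1]`.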
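We can now show that every $(n - 1)$-vector $\alpha$ is decomposable. Let
$\alpha \neq 0$, and let $\mathbf{h} = *\alpha$. The vector $\mathbf{h}$ is normal to an $(n - 1)$-dimensional
subspace $P$. Let $\bar{\alpha}$ be an $(n - 1)$-vector of $P$ whose orientation and norm
are chosen such that $\bar{\alpha}$ and $\mathbf{h}$ are related by (1)-(3). Then $\bar{\alpha}$ is decomposable
and $\mathbf{h} = *\bar{\alpha}$. Thus $*\bar{\alpha} = *\alpha$, which by (6-32a) implies that $\bar{\alpha} = \alpha$.
If $\omega$ is an $(n - 1)$-covector, then $*\omega$ is the 1-covector whose components
are given by the dual to (6-32a):

$$(*\omega)_i = (-1)^{i-1} \omega_{i'}, \qquad i = 1, \ldots, n. \tag{6-32b}$$

If $\omega$ is an $(n - 1)$-form, then $*\omega$ is the 1-form such that $(*\omega)(\mathbf{x}) = *\omega(\mathbf{x})$ for
each $\mathbf{x}$ in the domain $D$ of $\omega$. The dual to (6-31) is $*\mathbf{e}^{i'} = (-1)^{i-1} \mathbf{e}^i$, and
hence

$$*(dx^1 \wedge \cdots \wedge dx^{i-1} \wedge dx^{i+1} \wedge \cdots \wedge dx^n) = (-1)^{i-1}\, dx^i.$$

If $\omega$ is an $(n - 1)$-form of class $C^{(1)}$, then $d\omega$ is an $n$-form. Consequently,
$d\omega = f\, dx^1 \wedge \cdots \wedge dx^n$, where $f$ is a scalar-valued function. To get a convenient
expression for $f$, let $\zeta = *\omega$. Its components are given by (6-32b).
Let

$$\operatorname{div} \zeta = \frac{\partial \zeta_1}{\partial x^1} + \cdots + \frac{\partial \zeta_n}{\partial x^n}. \tag{6-33}$$

This function is called the divergence of the 1-form $\zeta$. By a short calculation
(Problem 2), the desired function $f$ is just $\operatorname{div} \zeta$. Thus

$$d\omega = \operatorname{div} \zeta\; dx^1 \wedge \cdots \wedge dx^n, \qquad \text{if } \zeta = *\omega. \tag{6-34}$$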



When $n = 3$ the divergence has an important physical significance which
will be indicated in Section 7-4.

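The remainder of this section will not be used in Chapter 7. Let us define
$*\alpha$ for any $r$-vector $\alpha$ when $0 \le r \le n$. If $r = 0$ or $n$ we set

$$*c = c\,\mathbf{e}_{1 \cdots n}, \qquad *(c\,\mathbf{e}_{1 \cdots n}) = c.$$

If $0 < r < n$, let $\lambda = (i_1, \ldots, i_r)$ be any increasing $r$-tuple, and let $\lambda' =
(j_1, \ldots, j_{n-r})$ be the increasing $(n - r)$-tuple whose entries are those integers
$j$ between 1 and $n$ which do not appear in $\lambda$. Let $\epsilon_\lambda$ denote the sign of the
$n$-tuple $(\lambda', \lambda)$.
It is $-1$ or $+1$, depending on whether an odd or an even number of interchanges
puts $(\lambda', \lambda)$ in increasing order. If $\alpha$ is any $r$-vector, then its adjoint is the
$(n - r)$-vector $*\alpha$ whose components satisfy

$$(*\alpha)^{\lambda'} = \epsilon_\lambda \alpha^\lambda. \tag{6-35a}$$

If $r = n - 1$ and $\lambda = i'$, then $\lambda' = (i)$, $\epsilon_\lambda = (-1)^{i-1}$, and (6-35a) agrees
with (6-32a). From the definition (6-35a),

$$*(\alpha + \beta) = *\alpha + *\beta, \qquad *(c\alpha) = c\,{*\alpha}.$$

Moreover, $*\alpha = 0$ if and only if $\alpha = 0$. Thus the operation $*$ gives an isomorphism
between $E^n_r$ and $E^n_{n-r}$. This isomorphism preserves inner products.
In fact, if $\alpha$ and $\beta$ are $r$-vectors then

$$*\alpha \cdot *\beta = \sum_{[\lambda]} (*\alpha)^{\lambda'} (*\beta)^{\lambda'} = \sum_{[\lambda]} (\epsilon_\lambda)^2 \alpha^\lambda \beta^\lambda.$$

Since $(\epsilon_\lambda)^2 = 1$, $*\alpha \cdot *\beta = \alpha \cdot \beta$. Taking $\alpha = \beta$ we have in particular
$|{*\alpha}| = |\alpha|$. Since $\epsilon_\lambda \epsilon_{\lambda'} = (-1)^{r(n-r)}$,

$${*}{*}\alpha = (-1)^{r(n-r)} \alpha.$$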
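
The component formula (6-35a) translates directly into a small routine. This is a sketch under stated assumptions rather than a definitive implementation: an $r$-vector is stored as a dictionary from increasing 1-based index tuples to components, and `perm_sign` and `star` are hypothetical names. It checks that $*$ preserves inner products and that applying it twice multiplies by $(-1)^{r(n-r)}$.

```python
from itertools import combinations


def perm_sign(seq):
    """Sign of an arrangement of distinct integers, found by counting inversions."""
    inversions = sum(1 for a, b in combinations(seq, 2) if a > b)
    return -1 if inversions % 2 else 1


def star(alpha, n):
    """Adjoint of an r-vector in E^n: (*alpha)^{lambda'} = sign(lambda', lambda) * alpha^lambda."""
    out = {}
    for lam, a in alpha.items():
        complement = tuple(j for j in range(1, n + 1) if j not in lam)
        out[complement] = perm_sign(complement + lam) * a
    return out


def inner(alpha, beta):
    """Inner product: sum of products of matching components."""
    return sum(a * beta.get(lam, 0) for lam, a in alpha.items())


n = 4
alpha = {(1, 2): 3, (1, 3): -1, (2, 4): 5}   # a 2-vector, r(n - r) = 4 is even
beta = {(1, 2): 2, (2, 4): 1, (3, 4): 7}
gamma = {(1,): 2, (3,): -1}                   # a 1-vector, r(n - r) = 3 is odd

print(star(star(alpha, n), n) == alpha)
print(star(star(gamma, n), n) == {lam: -c for lam, c in gamma.items()})
print(inner(star(alpha, n), star(beta, n)) == inner(alpha, beta))
print(star({(2, 3): 1}, 3), star({(1, 3): 1}, 3))
```

The last line reproduces $*\mathbf{e}_{23} = \mathbf{e}_1$ and $*\mathbf{e}_{13} = -\mathbf{e}_2$ for $n = 3$, in agreement with (6-31).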

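Now let $\nu$ be any increasing $(n - r)$-tuple. Then consider $\mathbf{e}_\nu \wedge \mathbf{e}_\lambda$,
which is 0 if $\nu \neq \lambda'$ and is $\epsilon_\lambda \mathbf{e}_{1 \cdots n}$ if $\nu = \lambda'$. If $\beta$ is any $(n - r)$-vector, then

$$\beta \wedge \alpha = \sum_{[\nu][\lambda]} \beta^\nu \alpha^\lambda\, \mathbf{e}_\nu \wedge \mathbf{e}_\lambda = \Big(\sum_{[\lambda]} \beta^{\lambda'} \alpha^\lambda \epsilon_\lambda\Big)\, \mathbf{e}_{1 \cdots n},$$

and

$$\beta \wedge \alpha = (\beta \cdot *\alpha)\, \mathbf{e}_{1 \cdots n}. \tag{6-36}$$

If $\alpha \neq 0$ is decomposable, then $*\alpha$ has the following geometric interpretation.
Let $\alpha = \mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_r$, and let $\mathbf{h}_{r+1}, \ldots, \mathbf{h}_n$ be vectors such that:
(1) $(\mathbf{h}_{r+1}, \ldots, \mathbf{h}_n)$ is a frame for the orthogonal complement of the $r$-space
of $\alpha$; (2) $(\mathbf{h}_{r+1}, \ldots, \mathbf{h}_n, \mathbf{h}_1, \ldots, \mathbf{h}_r)$ is a positively oriented frame for $E^n$;
(3) $|\mathbf{h}_{r+1} \wedge \cdots \wedge \mathbf{h}_n| = |\alpha|$. Then $*\alpha = \mathbf{h}_{r+1} \wedge \cdots \wedge \mathbf{h}_n$. The proof
is similar to the one given above for $r = n - 1$.
If $\omega$ is an $r$-covector, then $*\omega$ is the $(n - r)$-covector such that

$$(*\omega)_{\lambda'} = \epsilon_\lambda \omega_\lambda. \tag{6-35b}$$

If $\omega$ is a differential form of degree $r$, then $*\omega$ is the $(n - r)$-form such that
$(*\omega)(\mathbf{x}) = *\omega(\mathbf{x})$ for every $\mathbf{x}$ in the domain of $\omega$.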

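Example 2. $*(f\, dx^{i_1} \wedge \cdots \wedge dx^{i_r}) = \epsilon_\lambda f\, dx^{j_1} \wedge \cdots \wedge dx^{j_{n-r}}$, where $\lambda' = (j_1, \ldots, j_{n-r})$.
For $r = 0$ or $n$,

$$*f = f\, dx^1 \wedge \cdots \wedge dx^n, \qquad *(f\, dx^1 \wedge \cdots \wedge dx^n) = f.$$

Let $\omega$ be an $r$-form of class $C^{(1)}$. Then $d(*\omega)$ is an $(n - r + 1)$-form,
and $*d(*\omega)$ is an $(r - 1)$-form.

Definition. The codifferential of $\omega$ is

$$\delta\omega = (-1)^{r(n-r)}\, {*d(*\omega)}. \tag{6-37a}$$

Since ${*}{*}\omega = (-1)^{r(n-r)}\omega$, substituting $*\omega$ for $\omega$ we get

$$\delta(*\omega) = *d\omega. \tag{6-37b}$$

If $r = 0$, we invent a form 0 of degree $-1$ and agree that $\delta f = 0$. If
$\zeta$ is a 1-form, consider the $(n - 1)$-form $\omega = (-1)^{n-1}\, {*\zeta}$. Then $*\omega = \zeta$ and
by (6-37b) $\delta\zeta = *d\omega$. By (6-34) $\delta\zeta = \operatorname{div} \zeta$. Thus the codifferential of a
1-form $\zeta$ is just the 0-form $\operatorname{div} \zeta$.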

Notes. Many authors define the adjoint so that in (2) above $(\mathbf{h}_1, \ldots, \mathbf{h}_r,
\mathbf{h}_{r+1}, \ldots, \mathbf{h}_n)$ is a positively oriented frame for $E^n$. When $r(n - r)$ is odd,
according to that definition $*\alpha$ has opposite sign to the one here.
The definition of the adjoint involves the euclidean norm. Hence both
the adjoint and the codifferential depend on the euclidean structure inherited
by $E^n_r$ and $(E^n_r)^*$ from the euclidean inner product in $E^n$; while the notions
of $\wedge$ and $d$ actually depend only on the vector space structure and not the inner
product.
In riemannian geometry one is provided at each point $\mathbf{x}$ with an inner
product $B_{\mathbf{x}}$, not necessarily the euclidean inner product. The definition of
adjoint must be modified accordingly. The codifferential is again defined by
(6-37a). However, the formula (6-33) for the divergence and its generalization
(Problem 5) must be modified. See [9] and Chap. V of [17].

PROBLEMS
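1. Let $n = 2$. Show that
(a) $*\mathbf{h} = h^2\mathbf{e}_1 - h^1\mathbf{e}_2$.
(b) $*(M\, dx + N\, dy) = N\, dx - M\, dy$.
(c) $*d(N\, dx - M\, dy) = -(\partial M/\partial x + \partial N/\partial y)$.
2. (a) Let $n = 3$, and $\omega = P\, dy \wedge dz + Q\, dz \wedge dx + R\, dx \wedge dy$ be a 2-form.
Show that
$*\omega = P\, dx + Q\, dy + R\, dz$, $d\omega = (\partial P/\partial x + \partial Q/\partial y + \partial R/\partial z)\, dx \wedge dy \wedge dz$.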
(b) Let w be an (n — 1)-form,

i—1 n
w= Do wvde' A--- A de Ade” Aw Kae,
seat
and let ¢ = *w. Show that dw = div¢dz! A -:- A da”.
3. Show that: (a) div (df) is the Laplacian of f. (b) div(fw) = fdivw + df-o,
where (¢- w)(x) = ¢(x) - w(x) for 1-forms §, w.

4. Let a and B be r-vectors. Show that:


(a) (ta) A B = a: Be1...n.
(bisa N” Br a1) ana AO (+8),
(c) (#w) + (*a) = w-a@ for any r-covector w.
5. Show that the components of dw satisfy (dw), = 071 0w,,;/dx', where (v,1) =
(Cie Saas heey MeN
6. Show that the components of dw satisfy (dw), = >-jt} (—1)/7} dwy,/d2x's, where
Neiset be T=UUplen(? ym tay os219 tT, © ey CeLe
7. If ¢ is a 1-form and w is an r-form let ¢ - w be the (r — 1)-form such that *(€-w) =
(—1)"—1¢ A (#w). Show that:
(a) d(fw) = fdw+df-w. — (b) § +), = Vier Ciw,,s.

*6-7 SPECIAL RESULTS FOR n = 3

Vector analysis in $E^3$ is traditionally based on four operations besides the
usual vector addition, scalar multiplication, and inner product. These operations
are the cross product, triple scalar product, curl, and divergence. The
last of these was defined in the previous section, for any dimension $n$. The
other three are special to three dimensions, and can be expressed in terms
of $\wedge$, $*$, and $d$ as follows.
If $\mathbf{h}_1$ and $\mathbf{h}_2$ are vectors, their cross product is denoted by $\mathbf{h}_1 \times \mathbf{h}_2$. It
is the vector

$$\mathbf{h}_1 \times \mathbf{h}_2 = *(\mathbf{h}_1 \wedge \mathbf{h}_2). \tag{6-38a}$$

See Fig. 6-4. The cross product distributes with vector addition and scalar
multiplication, and $\mathbf{h}_2 \times \mathbf{h}_1 = -\mathbf{h}_1 \times \mathbf{h}_2$. However, it is not associative.

The triple scalar product of three vectors is denoted by $[\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3]$. It is
given by

$$[\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3] = *(\mathbf{h}_1 \wedge \mathbf{h}_2 \wedge \mathbf{h}_3). \tag{6-39}$$

Its absolute value equals $|\mathbf{h}_1 \wedge \mathbf{h}_2 \wedge \mathbf{h}_3|$, which is the volume of the parallelepiped
spanned by $\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3$ with vertex 0. The sign of the triple scalar
product is positive if $(\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3)$ is a positively oriented frame for $E^3$ and
negative if this frame is negatively oriented.
When $n = 3$, $r(n - r)$ is always even and $(-1)^{r(n-r)} = 1$. Then
$*(\mathbf{h}_1 \times \mathbf{h}_2) = \mathbf{h}_1 \wedge \mathbf{h}_2$. Using Problem 4(a), Section 6-6,

$$\mathbf{h}_1 \wedge \mathbf{h}_2 \wedge \mathbf{h}_3 = (\mathbf{h}_1 \times \mathbf{h}_2) \cdot \mathbf{h}_3\; \mathbf{e}_{123},$$

which gives another formula for the triple scalar product:

$$[\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3] = (\mathbf{h}_1 \times \mathbf{h}_2) \cdot \mathbf{h}_3.$$
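
The two descriptions of the triple scalar product, as the coefficient of $\mathbf{e}_{123}$ in $\mathbf{h}_1 \wedge \mathbf{h}_2 \wedge \mathbf{h}_3$ (a $3 \times 3$ determinant) and as $(\mathbf{h}_1 \times \mathbf{h}_2) \cdot \mathbf{h}_3$, can be compared on a concrete frame. The sketch below uses hypothetical helper names and arbitrarily chosen integer vectors; the cross product is written out from the components $\alpha^{23}, \alpha^{31}, \alpha^{12}$ of $\mathbf{h}_1 \wedge \mathbf{h}_2$.

```python
def cross(h, k):
    """h x k = *(h ^ k): components alpha^23, alpha^31, alpha^12 of h ^ k."""
    return (h[1] * k[2] - h[2] * k[1],
            h[2] * k[0] - h[0] * k[2],
            h[0] * k[1] - h[1] * k[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def det3(h1, h2, h3):
    """Determinant of the 3 x 3 matrix whose columns are h1, h2, h3."""
    return (h1[0] * (h2[1] * h3[2] - h2[2] * h3[1])
            - h2[0] * (h1[1] * h3[2] - h1[2] * h3[1])
            + h3[0] * (h1[1] * h2[2] - h1[2] * h2[1]))


h1, h2, h3 = (1, 2, 0), (0, 1, 3), (2, 0, 1)
print(cross(h1, h2), dot(cross(h1, h2), h3), det3(h1, h2, h3))

# The cross product is anticommutative but not associative.
e1, e2 = (1, 0, 0), (0, 1, 0)
print(cross(e1, cross(e1, e2)), cross(cross(e1, e1), e2))
```

Both computations give 13 for this frame, so the frame is positively oriented and spans a parallelepiped of volume 13.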


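The cross product of two covectors, or of two 1-forms, is given by

$$\omega \times \zeta = *(\omega \wedge \zeta). \tag{6-38b}$$

The curl of a 1-form $\omega$ is the 1-form $\operatorname{curl} \omega$ given by

$$\operatorname{curl} \omega = *d\omega. \tag{6-40}$$

Its physical significance will be indicated in Section 7-6 in connection with
Stokes' formula.

Example. Show that $\operatorname{div}(\operatorname{curl} \omega) = 0$ for every 1-form $\omega$ of class $C^{(2)}$. Using the fact
that $\delta * = *d$ (formula 6-37b),

$$\operatorname{div}(\operatorname{curl} \omega) = \delta(*d\omega) = *d(d\omega) = *0 = 0.$$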

PROBLEMS
Assume that all forms are of class C®.
1. Show that:
(a) h Xk = —k xX h. (b) h X (k1 + ko) = hX ki +h X ko.
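2. Let $\omega = M\, dx + N\, dy + O\, dz$. Show that $\operatorname{curl} \omega = (\partial O/\partial y - \partial N/\partial z)\, dx +
(\partial M/\partial z - \partial O/\partial x)\, dy + (\partial N/\partial x - \partial M/\partial y)\, dz$.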
3. Find e; X e; for all pairs 7,7 = 1, 2, 3.
4. With the aid of (6-38a) and (6-40), show that:
(a) div (¢ X w) = Oif ¢ and w are closed. 4 (b) curl (fw) = feurlw + df X w.
(c) curl (f df) = 0. (d) curl (§ X w) = dF A @).
(e) curl (curlw) = d(divw) — Lapl w, where Lapl (Mdz-+ N dy+ Odz) =
(Lapl M) dx + (Lapl N) dy + (Lap! O) dz and Laplf is the Laplacian of the
function f.
(f) ¢-curlw — w-curlg = div (w X ¢). (Hint: By the dual to (6-36), § - *dw =
#(€ A dw).J

5. Show that:
(a) hy X (he X hg) = (hi -h3)he — (hy-he)hg. ([Hint: Since both sides are
trilinear in (hy, hg, hg) it suffices to prove this when hy, he, hg are standard
basis vectors. Use Problem 3.] .
(b) The cross product is not associative.
(c) (hi X he) X (h3 X ha) = [hi, he, haJh3 — [h1, ho, haha.
6. Let w = (#; dz! + Ee dx?+ E3 dx?) A dx*+ By dx? A dx? + Be dr? A dx! +
B3 dz! A dx?, where the functions B;, E; are of class C‘ on an open subset of
E*, Show that dw = 0 if and only if curl E + 0B/dz* = 0, divB = 0. Here
curl and div are taken in the variables (x!, x”, 23). [Note: The equation dw = 0
represents one-half of Maxwell’s equations for an electromagnetic field in free
space. The functions £;, H2, H3 represent the electrical components of the field
and B,, Be, B3 the components of a magnetic induction vector. There is a similar
equation which represents the other half of Maxwell’s equations. See [9], p. 45.]
CHAPTER 7

Integration on Manifolds

The topic of this chapter is integration over subsets of an $r$-manifold
$M \subset E^n$. For this purpose we first study regular transformations from one
$r$-manifold into another. A regular transformation from a set $S \subset M$ into $E^r$
defines a coordinate system for $S$. It is not always possible to find a single
coordinate system for all of $M$. However, from the implicit function theorem,
coordinates can be introduced locally. Using this fact, together with a device
called partition of unity, the integral of a continuous function f over a set
A C M is defined in Section 7-3. Next, the idea of orientation on a manifold
is introduced, and integrals of differential forms of degree r are defined.
The divergence theorem states that the integral of an (n — 1)-form w
over the boundary fr D of an open set D equals the integral over D of dw.
The orientations on D and fr D must be chosen consistently, and certain regu-
larity assumptions (p. 264) are made. When n = 2 and 3 the divergence theorem
is equivalent to theorems in vector analysis commonly attributed to Green
and to Gauss.
The divergence theorem is a special case of a result which states that the
integral of the differential dw of an (r — 1)-form w over a portion A of an
oriented r-manifold M equals the integral of w over the suitably oriented
boundary of A. This is called Stokes’ formula. In the final section the idea of
homotopy between two transformations is introduced and is applied to give
sufficient conditions in order that a closed differential form be exact.

7-1 REGULAR TRANSFORMATIONS


In Chapter 4 an $r$-manifold $M$ was defined as a subset of some euclidean
$E^n$ which can locally be described by setting equal to 0 functions $\Phi^1, \ldots, \Phi^{n-r}$
with linearly independent differentials. For the precise definition, see Sec-
tion 4-7. An $r$-manifold $M$ has at each $x \in M$ a tangent space, denoted in the
present section by $T_M(x)$.
For purposes of integration it is necessary to consider manifolds from a
different point of view. We must show that a manifold can be locally described
by a system of $r$ coordinate functions $F^1, \ldots, F^r$. This idea will be made
precise in the next section. We need first to introduce the idea of regular trans-
formation from one $r$-manifold into another.
Let $g$ be a transformation whose domain is a set $N \subset E^m$. As in Section 2-3,
we shall say that $g$ is of class $C^{(1)}$ on $N$ if there exists a transformation $G$ of
class $C^{(1)}$ on some open set $A$ containing $N$ such that $g = G|N$. The trans-
formation $G$ is an extension of $g$ of class $C^{(1)}$.
Now let $N \subset E^m$ be an $r$-manifold, and $g$ a transformation from $N$ into
$E^n$, where $r \le \min\{m, n\}$. Let $g$ be of class $C^{(1)}$ on $N$. If $r = m$, $N$ is itself
an open subset of $E^m$ and we shall take $N = A$, $G = g$.
When $r < m$, different extensions of $g$ may lead to different values for
the differential $DG(t)$ at a point $t \in N$. However, let us now show that the
restriction of $DG(t)$ to the tangent space at $t$ is the same for all extensions.

Proposition 26. Let $G$ and $\tilde{G}$ be of class $C^{(1)}$ on some open set containing $N$,
and let $G|N = \tilde{G}|N$. Then $DG(t)(k) = D\tilde{G}(t)(k)$ for every $t \in N$ and
$k \in T_N(t)$.

Proof. Let $t_0 \in N$ and $k \in T_N(t_0)$. By definition of tangent vector
(Section 4-7), there is a function $\gamma$ from an interval $(-\delta, \delta)$ into $N$ such that
$\gamma(0) = t_0$, $\gamma'(0) = k$. By Corollary 2, Section 4-4, the derivative of $G \circ \gamma$
at 0 is $DG(t_0)(k)$. But $G \circ \gamma = \tilde{G} \circ \gamma$ since $G|N = \tilde{G}|N$.
By this proposition we may set without ambiguity $Dg(t) = DG(t)|T_N(t)$,
and may then write $Dg(t)(k)$ in place of $DG(t)(k)$ if $k$ is any tangent vector
to $N$ at $t$. Let us next show that if the values of $g$ lie in an $r$-manifold $M$, then
$Dg(t)$ takes the tangent space at $t$ into the tangent space at $g(t)$.

Proposition 27. Let $g$ be a transformation of class $C^{(1)}$ from $N$ into an
$r$-manifold $M$. If $k \in T_N(t_0)$ and $h = Dg(t_0)(k)$, then $h \in T_M[g(t_0)]$.

Proof. Let $G$ be an extension of $g$ of class $C^{(1)}$. Let $\gamma$ be as in the proof
of Proposition 26, and let $\theta = G \circ \gamma = g \circ \gamma$. Then $\theta'(0) = h$.
Let $x_0 = g(t_0)$, $U$ be a neighborhood of $x_0$, and $\Phi = (\Phi^1, \ldots, \Phi^{n-r})$ be
the same as in the definition of manifold. Then $\Phi[\theta(s)] = 0$ for every $s$ in some
interval about 0. Calculating the derivative at 0 of $\Phi \circ \theta$ by Corollary 2,
Section 4-4,
$$D\Phi(x_0)[\theta'(0)] = D\Phi(x_0)(h) = 0.$$
By Theorem 10, $h \in T_M(x_0)$.

When $r = m = n$, $g$ is a transformation from an open set $A \subset E^r$ into $E^r$.
In this chapter we call such a transformation flat. A flat transformation $g$
has at each $t \in A$ a Jacobian $Jg(t)$. We recall that the factor $|Jg(t)|$ appears
in the formula (5-38) for transforming integrals.
For arbitrary $r$, $m$, and $n$ let us now introduce a nonnegative number
$\mathcal{J}g(t)$, which for flat transformations becomes $|Jg(t)|$. Let $\{k_1, \ldots, k_r\}$ be
any basis for the tangent space $T_N(t)$. Let
$$\mathcal{J}g(t) = \frac{|h_1 \wedge \cdots \wedge h_r|}{|k_1 \wedge \cdots \wedge k_r|}, \tag{7-1}$$
where
$$h_l = Dg(t)(k_l), \qquad l = 1, \ldots, r.$$
By Theorem 19 the denominator is not 0. Stated geometrically, $\mathcal{J}g(t) =
V_r(K')/V_r(K)$, where $K$ and $K'$ are $r$-parallelepipeds with 0 as vertex spanned
respectively by $k_1, \ldots, k_r$ and by $h_1, \ldots, h_r$.
We must show that $\mathcal{J}g(t)$ does not depend on the particular basis chosen
for $T_N(t)$.
Let $L = Dg(t)$, $\beta = k_1 \wedge \cdots \wedge k_r$, $\alpha = h_1 \wedge \cdots \wedge h_r$. Then
$\alpha = L_r(\beta)$, where $L_r$ is the induced linear transformation defined in Section 6-4.
Let $\{k_1', \ldots, k_r'\}$ be another basis for $T_N(t)$, and consider the corresponding
$h_l' = L(k_l')$, $\beta' = k_1' \wedge \cdots \wedge k_r'$, $\alpha' = h_1' \wedge \cdots \wedge h_r'$. By Theorem 19
$\beta' = c\beta$, where $c$ is a scalar. Since $L_r$ is linear, $\alpha' = L_r(c\beta) = cL_r(\beta)$. Thus
$\alpha' = c\alpha$ and $|\alpha'|/|\beta'| = |\alpha|/|\beta|$. This shows that $\mathcal{J}g(t)$ does not depend on
the particular basis chosen for the tangent space at $t$.
The condition $\mathcal{J}g(t) > 0$ means that $h_1 \wedge \cdots \wedge h_r \ne 0$, which by
Theorem 19 is equivalent to linear independence of the set $\{h_1, \ldots, h_r\}$.
Proposition 27 then has the following corollary.

Corollary. If $\mathcal{J}g(t_0) > 0$, then $Dg(t_0)$ takes $T_N(t_0)$ onto $T_M[g(t_0)]$.

Proof. Since $Dg(t_0)$ is linear, it takes $T_N(t_0)$ onto a vector subspace $P$ of
$T_M[g(t_0)]$. If $\{k_1, \ldots, k_r\}$ is a basis for $T_N(t_0)$, then the set $\{h_1, \ldots, h_r\}$ is
linearly independent and each $h_l \in P$. Since the vector spaces $P$ and $T_M[g(t_0)]$
have the same dimension $r$ and $P \subset T_M[g(t_0)]$, they are the same.

If $r = m$ and $N = A$ is an open subset of $E^r$, then $T_N(t) = E^r$ and for
$k_1, \ldots, k_r$ we may take the standard basis vectors $e_1, \ldots, e_r$. In this case,
$h_l = g_l(t)$ is the $l$th partial derivative of $g$ at $t$. Since $|e_1 \wedge \cdots \wedge e_r| =
|e_{1 \cdots r}| = 1$, we have
$$\mathcal{J}g(t) = |g_1(t) \wedge \cdots \wedge g_r(t)| \quad \text{if } r = m. \tag{7-2}$$
If $r = n = m$, then the right-hand side equals the absolute value of the deter-
minant of $Dg(t)$; and thus $\mathcal{J}g(t) = |Jg(t)|$ if $g$ is a flat transformation.
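As a numerical illustration of (7-1), the following Python sketch computes the ratio $|h_1 \wedge \cdots \wedge h_r| / |k_1 \wedge \cdots \wedge k_r|$ for a linear transformation from $E^2$ into $E^3$ and checks that two different bases give the same value. It is a sketch under stated assumptions, not part of the text: the norm of an $r$-vector is evaluated through the standard Gram determinant identity $|v_1 \wedge \cdots \wedge v_r|^2 = \det(v_i \cdot v_j)$, and the names `wedge_norm`, `script_j`, and `apply_linear` as well as the particular map are hypothetical choices made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def wedge_norm(vectors):
    """|v_1 ^ ... ^ v_r| via the Gram determinant det(v_i . v_j) (assumed identity)."""
    gram = [[dot(u, v) for v in vectors] for u in vectors]
    return math.sqrt(max(det(gram), 0.0))

def script_j(images, basis):
    """The ratio in (7-1): |h_1 ^ ... ^ h_r| / |k_1 ^ ... ^ k_r|."""
    return wedge_norm(images) / wedge_norm(basis)

def apply_linear(k):
    """Hypothetical linear map E^2 -> E^3 playing the role of Dg(t)."""
    return [k[0], k[1], 2.0 * k[0] + 2.0 * k[1]]

standard_basis = [[1.0, 0.0], [0.0, 1.0]]
other_basis = [[1.0, 1.0], [0.0, 2.0]]

j_standard = script_j([apply_linear(k) for k in standard_basis], standard_basis)
j_other = script_j([apply_linear(k) for k in other_basis], other_basis)
print(j_standard, j_other)
```

Both bases should give the same ratio, in agreement with the independence argument above.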
Definition. A transformation $g$ from an $r$-manifold $N$ into an $r$-manifold
$M$ is regular if:
(1) $g$ is of class $C^{(1)}$ on $N$;
(2) $g$ is univalent; and
(3) $\mathcal{J}g(t) > 0$ for every $t \in N$.

A regular transformation $g$ may distort shapes. However, if $g$ is regular,
then the image $g(B)$ of any set $B \subset N$ is "qualitatively" the same as $B$. We
shall prove (Theorem 20) that the inverse $g^{-1}$ is regular. In particular, $g$ and
$g^{-1}$ are continuous, which implies that $B$ and $g(B)$ are the same topologically
[to use the correct technical term, $B$ and $g(B)$ are homeomorphic]. Condi-
tions (1) and (3) insure that $g$ is properly behaved from the viewpoint of dif-
ferential calculus. For instance, we have just shown that the differential takes
tangent spaces to $N$ onto the corresponding tangent spaces to $M$.
A regular transformation is called by many authors a diffeomorphism
of class $C^{(1)}$.
Note that we have assumed that $g(N) \subset M$ for some $r$-manifold $M$. One
might guess that conditions (1), (2), and (3) imply that $g(N)$ lies in an $r$-mani-
fold; but Problem 3 shows that this is false. Later in the section we shall find
some additional conditions under which the guess is correct (see Theorem 21
and its Corollary 1).
Example 1. Let $A \subset E^1$ be an open interval, and $g$ a transformation from $A$ into a
1-manifold $M \subset E^n$. Let us assume (1), (2), and the condition $g'(t) \ne 0$. Since
$r = 1$, by (7-2) $\mathcal{J}g(t) = |g'(t)| > 0$. The vector $g'(t)$ is a tangent vector to $M$ at
$g(t)$, and $T_M[g(t)]$ consists of all scalar multiples of it. The set $g(A)$ is called an open
simple arc. If $J \subset A$ is a closed interval, then $g|J$ represents a simple arc with end-
points included (see Section 3-2).

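Proposition 28. Let $g$ be a regular transformation from $N$ into $M$, and $\phi$ be
a transformation of class $C^{(1)}$ from $M$ into $E^p$. Then
$$\mathcal{J}(\phi \circ g)(t) = \mathcal{J}\phi(x)\,\mathcal{J}g(t), \qquad x = g(t). \tag{7-3}$$

Proof. Let $k_l$, $h_l$ be as above, and let $\eta_l = D\phi(x)(h_l)$, $l = 1, \ldots, r$. By
the composite function theorem, $\eta_l = D(\phi \circ g)(t)(k_l)$. If $\mathcal{J}\phi(x) > 0$, then
$$\frac{|\eta_1 \wedge \cdots \wedge \eta_r|}{|k_1 \wedge \cdots \wedge k_r|} = \frac{|\eta_1 \wedge \cdots \wedge \eta_r|}{|h_1 \wedge \cdots \wedge h_r|} \cdot \frac{|h_1 \wedge \cdots \wedge h_r|}{|k_1 \wedge \cdots \wedge k_r|},$$
which is just (7-3). If $\mathcal{J}\phi(x) = 0$, then the set $\{\eta_1, \ldots, \eta_r\}$ is linearly
dependent, since $\{h_1, \ldots, h_r\}$ is a basis for $T_M(x)$. Therefore
$\mathcal{J}(\phi \circ g)(t) = 0$, and (7-3) holds in this case also.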
Corollary. If $\phi$ and $g$ are regular, then their composite $\phi \circ g$ is regular.

Proof. Since $\phi$ and $g$ are of class $C^{(1)}$ and univalent, so is $\phi \circ g$. Since
$\mathcal{J}\phi(x) > 0$ and $\mathcal{J}g(t) > 0$, by (7-3) $\mathcal{J}(\phi \circ g)(t) > 0$.
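Example 2. Let $S \subset E^3$ be a set such that
$$S = \{(x, y, \phi(x, y)) : (x, y) \in R\},$$
where $R$ is an open subset of $E^2$ and $\phi$ is of class $C^{(1)}$ on $R$. The set $S$ is a 2-manifold.
To see this, let $\Phi(x, y, z) = z - \phi(x, y)$, $D = \{(x, y, z) : (x, y) \in R\}$. Then $d\Phi(x) \ne 0$
and $S = \{x \in D : \Phi(x) = 0\}$. Hence $S$ is the 2-manifold determined by $\Phi$.
Let
$$g(x, y) = xe_1 + ye_2 + \phi(x, y)e_3.$$
Then $g$ is of class $C^{(1)}$ from $R$ onto $S$ and is univalent. The vectors $\partial g/\partial x$ and $\partial g/\partial y$
give a basis for the tangent space $T_S[g(x, y)]$. By (7-2), $\mathcal{J}g(x, y) = |\partial g/\partial x \wedge \partial g/\partial y|$.
See Fig. 7-1.
Calculating these partial derivatives, we get
$$\frac{\partial g}{\partial x} = e_1 + \frac{\partial \phi}{\partial x}e_3, \qquad \frac{\partial g}{\partial y} = e_2 + \frac{\partial \phi}{\partial y}e_3,$$
$$\frac{\partial g}{\partial x} \wedge \frac{\partial g}{\partial y} = e_{12} - \frac{\partial \phi}{\partial y}e_{31} - \frac{\partial \phi}{\partial x}e_{23},$$
$$\left|\frac{\partial g}{\partial x} \wedge \frac{\partial g}{\partial y}\right| = \left[1 + \left(\frac{\partial \phi}{\partial x}\right)^2 + \left(\frac{\partial \phi}{\partial y}\right)^2\right]^{1/2} = (1 + |d\phi|^2)^{1/2}.$$
Since the last line is positive, $\mathcal{J}g(x, y) > 0$.
Thus $g$ is a regular transformation.

Figure 7-1

Let $X$, $Y$, and $Z$ be the standard cartesian coordinate functions for $E^3$; and let
$$F = (X|S, Y|S).$$
Since $X$ and $Y$ are of class $C^{(1)}$, $F$ is of class $C^{(1)}$ on $S$. Moreover, $F$ is univalent; in
fact, $F = g^{-1}$. Since $F \circ g$ is the identity transformation, $\mathcal{J}(F \circ g) = 1$. By (7-3),
$$\mathcal{J}F(x) = \frac{1}{\mathcal{J}g(x, y)} > 0, \quad \text{if } x = g(x, y).$$
Thus $F$ is also a regular transformation.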
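The computation in Example 2 can be checked numerically. The following Python sketch is an illustration only: it assumes the particular function $\phi(x, y) = x^2 - y$ (a hypothetical choice, not taken from the text) and uses the fact that in $E^3$ the norm of $u \wedge v$ equals the length of the cross product $u \times v$. It compares $|\partial g/\partial x \wedge \partial g/\partial y|$ with $(1 + |d\phi|^2)^{1/2}$.

```python
import math

def phi_grad(x, y):
    """Partial derivatives of the assumed example function phi(x, y) = x**2 - y."""
    return (2.0 * x, -1.0)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def script_j_graph(x, y):
    """|dg/dx ^ dg/dy| for g(x, y) = x e1 + y e2 + phi(x, y) e3, via the cross product."""
    px, py = phi_grad(x, y)
    g_x = (1.0, 0.0, px)
    g_y = (0.0, 1.0, py)
    n = cross(g_x, g_y)
    return math.sqrt(sum(c * c for c in n))

def closed_form(x, y):
    """(1 + |d phi|^2)^(1/2), the value obtained in Example 2."""
    px, py = phi_grad(x, y)
    return math.sqrt(1.0 + px * px + py * py)

for point in [(0.0, 0.0), (0.5, 2.0), (-1.5, 0.25)]:
    print(point, script_j_graph(*point), closed_form(*point))
```

The two columns should agree at every sample point, and the reciprocal of either value is the corresponding value of $\mathcal{J}F$.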

Example 3. Generalizing Example 2, let $\lambda = (i_1, \ldots, i_r)$ be an increasing $r$-tuple
of integers, $1 \le i_k \le n$ for $k = 1, \ldots, r$; and let $(j_1, \ldots, j_{n-r})$ be the increasing
$(n - r)$-tuple complementary to $\lambda$. Let $R$ be an open subset of $E^r$, and $\phi^1, \ldots, \phi^{n-r}$
of class $C^{(1)}$ on $R$. Let $g$ be the transformation from $R$ into $E^n$ such that
$$g(x^\lambda) = \sum_{k=1}^{r} x^{i_k}e_{i_k} + \sum_{l=1}^{n-r} \phi^l(x^\lambda)e_{j_l},$$
where
$$x^\lambda = (x^{i_1}, \ldots, x^{i_r}).$$
Then $g$ is of class $C^{(1)}$ and univalent. The explicit formula for $\mathcal{J}g(x^\lambda)$ is complicated.
However, we can show that $\mathcal{J}g(x^\lambda) > 0$ as follows. Since
$$\frac{\partial g}{\partial x^{i_k}} = e_{i_k} + \sum_{l=1}^{n-r} \frac{\partial \phi^l}{\partial x^{i_k}}e_{j_l},$$
we have
$$\frac{\partial g}{\partial x^{i_1}} \wedge \cdots \wedge \frac{\partial g}{\partial x^{i_r}} = e_\lambda + \text{other terms}.$$
This $r$-vector is not 0 since its $\lambda$th component (the coefficient of $e_\lambda$) is 1. Hence
$\mathcal{J}g(x^\lambda) > 0$, which shows that $g$ is regular.
Let $S = g(R)$ and $F = X^\lambda|S$, where $X^\lambda = (X^{i_1}, \ldots, X^{i_r})$ and $X^1, \ldots, X^n$
are the standard cartesian coordinate functions for $E^n$. As in Example 2, $S$ is an
$r$-manifold and $F = g^{-1}$ is regular. In the next section $F$ will be called a cartesian
coordinate system for $S$.
The importance of Example 3 lies in the implicit function theorem. If $M$ is an
$r$-manifold, then for every $x_0 \in M$ there exist an increasing $r$-tuple $\lambda$ and a neighbor-
hood $U_1$ of $x_0$ such that $S = M \cap U_1$ has a cartesian coordinate system $F$ of the type
just described. See p. 120.

Let us return from these examples to establish some further properties of
regular transformations. We recall that a set $S \subset M$ is relatively open if
$S = M \cap D$, where $D$ is an open subset of $E^n$ (Section A-6). A relatively
open subset of an $r$-manifold is itself an $r$-manifold.

Theorem 20. Let $g$ be a regular transformation from an $r$-manifold $N$ into
an $r$-manifold $M$. Then $g(N)$ is a relatively open subset of $M$ and $g^{-1}$ is a
regular transformation.

Proof. We know already that the theorem is true in the following two
particular cases:
(a) If $\phi$ is a flat regular transformation from an open set $A \subset E^r$ into $E^r$,
then by the inverse function theorem $\phi(A)$ is open and $\phi^{-1}$ is regular.
(b) If $g$ is as in Example 3 and $S = g(R) = M \cap U_1$, where $U_1$ is a neighborhood of
some $x_0 \in M$, then $g(R)$ is open relative to $M$ and $F = g^{-1}$ is regular.
In the general case, let $g$ be regular from $N$ into $M$. Let $t_0 \in N$ and
$x_0 = g(t_0)$. By the implicit function theorem, there exist an increasing $r$-tuple
$\lambda$ and a relative neighborhood $S$ of $x_0$ such that $F = X^\lambda|S$ is regular. The
set $R = F(S)$ is open and $F^{-1}$ is regular from $R$ onto $S$. (In Example 3, $F^{-1}$
was denoted by $g$.) Similarly, there exist a relative neighborhood $S'$ of $t_0$ and
$F'$ regular from $S'$ onto an open set $R' \subset E^r$ such that $(F')^{-1}$ is also regular.
Since $g$ is continuous, we may arrange that $g(S') \subset S$. Consider the transfor-
mation $\phi = F \circ g \circ (F')^{-1}$ from $R'$ into $R$. By the corollary to Proposition 28,
$\phi$ is regular. Since $\phi$ is flat, $\phi(R')$ is open and $\phi^{-1}$ is continuous. Let $S_1$ be a
relative neighborhood of $x_0$ such that $F(S_1) \subset \phi(R')$. If $x \in S_1$, then $x = g(t)$
for $t = ((F')^{-1} \circ \phi^{-1} \circ F)(x)$. Therefore $S_1 \subset g(N)$. Moreover (see Fig. 7-2),
$$g^{-1}|S_1 = (F')^{-1} \circ \phi^{-1} \circ (F|S_1).$$

Figure 7-2

Since each of the three transformations on the right-hand side is regular,
$g^{-1}|S_1$ is regular.
Since every $x_0 \in g(N)$ has such a neighborhood $S_1$, this proves Theorem 20.
In the next theorem we drop the assumption that $g(N)$ is contained in
an $r$-manifold. Instead we deduce it from conditions (1), (2), and (3) in the
definition of regularity and the additional assumption that $g$ is an open trans-
formation.

Definition. A transformation $g$ from a set $N \subset E^m$ into $E^n$ is open if $g(B)$
is open relative to $g(N)$ for every set $B$ open relative to $N$.

Theorem 21. Let $g$ be a transformation from an $r$-manifold $N$ into $E^n$, such
that $g$ is open and satisfies (1), (2), and (3). Then $g(N)$ is an $r$-manifold,
and $g$ is regular.

Proof. We must show that $g(N)$ is an $r$-manifold. Once this is proved,
the regularity of $g$ follows from the definition. Let us first prove the theorem
in case $N$ is an open set $A \subset E^r$. For each increasing $r$-tuple $\lambda = (i_1, \ldots, i_r)$
let $g^\lambda$ be the flat transformation from $A$ into $E^r$ with components $g^{i_1}, \ldots, g^{i_r}$.
By formula (6-13a) the $\lambda$th component of the $r$-vector $g_1(t) \wedge \cdots \wedge g_r(t)$
is the Jacobian $Jg^\lambda(t)$.
Let $S = g(A)$, and let $x_0 = g(t_0)$ be any point of $S$. By (7-2),
$g_1(t_0) \wedge \cdots \wedge g_r(t_0) \ne 0$. Hence there is some $\lambda$ such that $Jg^\lambda(t_0) \ne 0$.
By the inverse function theorem there is a neighborhood $\Omega$ of $t_0$ such that
$g^\lambda|\Omega$ is regular. The set $R = g^\lambda(\Omega)$ is open and contains $x_0^\lambda$. Let $\phi = (g^\lambda|\Omega)^{-1}$
and $G = g \circ \phi$. Since $G^{i_k}(x^\lambda) = x^{i_k}$ for each $k = 1, \ldots, r$, $G$ is of the type
considered in Example 3. Therefore $G(R)$ is an $r$-manifold. But $x_0 \in G(R)$
and $G(R) = g(\Omega)$ is a relatively open subset of $S$, since $g$ is an open trans-
formation. Since any $x_0 \in S$ lies in a relatively open subset of $S$ which is an
$r$-manifold, $S$ is an $r$-manifold. This proves Theorem 21 in the case $N \subset E^r$.
In the general case, let $t_0 \in N$. Then $t_0$ has a relative neighborhood $S'$
with which is associated by the implicit function theorem a transformation $F'$
as in the proof of Theorem 20. Let $\tilde{g} = g \circ (F')^{-1}$. Since $F'$ and $(F')^{-1}$ are
regular, $\tilde{g}$ also satisfies the hypotheses of Theorem 21. Its domain $R' = F'(S')$
is an open subset of $E^r$. By what has already been proved, $g(S') = \tilde{g}(R')$ is
an $r$-manifold. Since every $t_0 \in N$ has such a neighborhood $S'$, $g(N)$ is an
$r$-manifold.

Corollary 1. If $g$ satisfies (1), (2), and (3) and $g^{-1}$ is continuous, then
$g(N)$ is an $r$-manifold and $g$ is regular.

Proof. Let $B \subset N$ be relatively open. Then $g(B) = (g^{-1})^{-1}(B)$ is open
relative to $g(N)$ by Proposition A-6. Therefore $g$ is an open transformation.

Corollary 2. Let $g$ be regular from an $r$-manifold $N$ into an $r$-manifold $M$.
If $Q \subset N$ is a $p$-manifold, $p \le r$, then $g(Q)$ is a $p$-manifold and $g|Q$ is
regular.

Proof. Clearly the restriction $g|Q$ is of class $C^{(1)}$ and univalent since $g$
has these properties. If $t \in Q$, then there is a basis $\{k_1, \ldots, k_r\}$ for $T_N(t)$
such that $\{k_1, \ldots, k_p\}$ is a basis for $T_Q(t)$. Since $g$ is regular, the set of images
$\{h_1, \ldots, h_r\}$ of these basis elements under $Dg(t)$ is linearly independent.
Therefore $\{h_1, \ldots, h_p\}$ is a linearly independent set. This shows that
$\mathcal{J}(g|Q)(t) > 0$ for every $t \in Q$.
Now $(g|Q)^{-1} = g^{-1}|g(Q)$. By Theorem 20, $g^{-1}$ is regular; and in particular
$g^{-1}$ is continuous. Hence its restriction to $g(Q)$ is continuous. The conclusion
follows from Corollary 1.
Note: If $M$ and $N$ are manifolds of class $C^{(q)}$, $q \ge 1$, then we may consider
regular transformations of class $C^{(q)}$ from $N$ into $M$. In that case all trans-
formations which appear in the proof of Theorem 20 are of class $C^{(q)}$, and
hence $g^{-1}$ is of class $C^{(q)}$. Similarly, in Theorem 21 and its corollaries, $g(N)$
and $g(Q)$ are of class $C^{(q)}$ if $g$, $N$, and $Q$ are of class $C^{(q)}$.
All of the results of this chapter are correct if one assumes merely that
$q = 1$. However, to simplify the proof of the divergence theorem and Stokes'
formula, we shall later take $q = 2$.

PROBLEMS
1. For each of the following transformations from $A \subset E^2$ into $E^3$, find $\mathcal{J}g(s, t)$ and
$g(A)$. Show that $g$ is univalent.
(a) $g(s, t) = (s + t)e_1 + (s - 3t)e_2 + (-2s + 2t + 2)e_3$, $A = \{(s, t) : 0 < s + t$
Leib SA? SS OPS
(b) $g(\rho, \theta) = (\rho \cos \alpha)e_1 + (\rho \sin \alpha \cos \theta)e_2 + (\rho \sin \alpha \sin \theta)e_3$, $0 < \alpha < \pi/2$ ($\alpha$
fixed), $A = (0, \infty) \times (0, 2\pi)$.
(c) $g(s, t) = ste_1 + se_2 + te_3$, $A = E^2$.
2. Let $G(s, t, u) = ase_1 + bte_2 + cue_3$, where $a$, $b$, and $c$ are positive. Let $N$ be
the sphere $s^2 + t^2 + u^2 = 1$ and $M$ the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.
(a) Find $T_N(t)$, $T_M[G(t)]$, $t = (s, t, u)$. Verify that the image of $T_N(t)$ under
$DG(t)$ is $T_M[G(t)]$.
(b) Let $g = G|N$. Calculate $\mathcal{J}g(t)$ from (7-1) and show that $g$ is regular from
$N$ onto $M$.
3. Let $g(t) = (\cos t)e_1 + (\sin 2t)e_2$, $A = (0, 3\pi/2)$. Sketch $g(A)$. Show that $g$ is
univalent, and that $\mathcal{J}g(t) > 0$, but that $g(A)$ is not a 1-manifold. Why does this
not contradict Theorem 21?
4. Let $n = 4$, $r = 2$, $\lambda = (1, 2)$. Show that in Example 3, $\mathcal{J}g(x^1, x^2) = [1 + |d\phi^1|^2 +
|d\phi^2|^2 + (\phi^1_1\phi^2_2 - \phi^1_2\phi^2_1)^2]^{1/2}$.
5. Let $A \subset E^r$ be open and bounded. Let $g$ be continuous and univalent on cl $A$.
Show that if $g|A$ is of class $C^{(1)}$ and $\mathcal{J}g(t) > 0$ for every $t \in A$, then $g(A)$ is an
$r$-manifold and $g|A$ is regular. [Hint: Problem 8(d), Section A-8, and Corollary 1.]
6. Let $g$ be of class $C^{(1)}$ from an $r$-manifold $N$ into $E^n$, and let $\mathcal{J}g(t_0) > 0$. Show that
$t_0$ has a neighborhood $\Omega$ such that $g(N \cap \Omega)$ is an $r$-manifold and $g|(N \cap \Omega)$ is
regular. [Hints: First consider the case $N \subset E^r$. By generalizing Proposition 14,
Section 4-3, find a neighborhood $\Omega_0$ such that $g$ is univalent on $N \cap \text{cl}\,\Omega_0$. Use
Problem 5.]

7-2 COORDINATE SYSTEMS ON MANIFOLDS

Let $M$ be an $r$-manifold. Since $M$ is $r$-dimensional it should be possible
to find, at least locally on $M$, $r$ functions $F^1, \ldots, F^r$ such that the numbers
$F^1(x), \ldots, F^r(x)$ will serve as coordinates of a point $x \in M$. When $r = n$,
$M$ is an open subset of $E^n$. In that case we saw already in Section 5-9 that any
regular transformation $F = (F^1, \ldots, F^n)$ will serve to coordinatize $M$.
The definition of coordinate system on an $r$-manifold is similar.

Definition. Let $S$ be a nonempty, relatively open subset of an $r$-manifold
$M \subset E^n$. Any regular transformation $F$ from $S$ into $E^r$ is a coordinate
system for $S$. The coordinates of a point $x \in S$ in this system are
$F^1(x), \ldots, F^r(x)$.

By Theorem 20, the set $A = F(S)$ is an open subset of $E^r$ and the trans-
formation $g = F^{-1}$ is regular from $A$ onto $S$. It will be $g$ rather than $F$ which
is ordinarily used for calculations in the sections to follow.

Example 1. Let us return to the three examples in Section 7-1. In the first of them,
the function $F = g^{-1}$ is a coordinate for the open simple arc $S = g(A)$. The co-
ordinate of a point $x = g(t)$ is $t$ in this system.
In the second of those examples, $F$ is a coordinate system for $S$. The coordinates
of a point $(x, y, \phi(x, y)) \in S$ in this system are $x$, $y$. In the third example, $F = X^\lambda|S$.
The coordinates of a point $x$ in the coordinate system $F$ are $x^{i_1}, \ldots, x^{i_r}$. Such a
coordinate system will be called cartesian.

Example 2. Let $F = (F^1, \ldots, F^r)$ be a coordinate system for $S$, and let $S_c =
\{x \in S : F^1(x) = c\}$. Assume that $S_c$ is not empty. Then $S_c$ is an $(r - 1)$-manifold
and $W_c = (F^2|S_c, \ldots, F^r|S_c)$ is a coordinate system for $S_c$. The proof is left to the
reader (Problem 4). This is illustrated in Fig. 7-3 when $r = n$ and $S$ is an open set $D$.

Figure 7-3

Example 3. Let $(R, \Theta^1, \ldots, \Theta^{n-1})$ be the spherical coordinate system for the open
set $D$ in Example 5, p. 181. The complement of $D$ is a null set. Setting $R(x) = 1$
we get, according to Example 2, a spherical coordinate system for the intersection
of $D$ with the unit $(n - 1)$-sphere $S^{n-1}$. It turns out that $S^{n-1} - D$ is null in dimen-
sion $n - 1$, in the sense to be explained in the next section. Consequently, this
coordinate system can be used to evaluate integrals over $S^{n-1}$.

It is not ordinarily possible to find a single coordinate system for all of a
manifold $M$. If $S \subset M$ has a coordinate system $F$, then by Theorem 20 both
$F$ and $F^{-1}$ are continuous. Therefore $S$ is homeomorphic with an open set
$A \subset E^r$. Since $A$ cannot be both open and compact, $S$ is not compact. In par-
ticular, a compact manifold $M$ (for instance, a sphere or torus) cannot be
coordinatized by a single system.

Definition. A relatively open set S C M which has a coordinate system


is a coordinate patch on M.

By the implicit function theorem every point of $M$ lies in some coordinate
patch $S$. Actually, each point of $M$ lies in an infinite number of coordinate
patches. Let us now show how different coordinate systems are related in
overlapping patches.
Coordinate changes. If $S$ is a coordinate patch and $F$ a coordinate system
for $S$, then another coordinate system $F'$ for $S$ can be obtained as follows.
Let $\phi$ be a regular flat transformation from $A = F(S)$ into $E^r$, and let $F' =
\phi \circ F$. Since $\phi$ and $F$ are regular, $F'$ is regular. Hence $F'$ is a coordinate system
for $S$.
Now let $S$ and $\tilde{S}$ be coordinate patches, $F$ a coordinate system for $S$, and
$\tilde{F}$ a coordinate system for $\tilde{S}$. Suppose that $S \cap \tilde{S}$ is not empty. Let us show
that in $S \cap \tilde{S}$, $\tilde{F}$ can be obtained from $F$ by a regular flat transformation $\phi$.
Let
$$A_0 = F(S \cap \tilde{S}), \qquad \tilde{A}_0 = \tilde{F}(S \cap \tilde{S}),$$
and $g = F^{-1}$, $\tilde{g} = \tilde{F}^{-1}$. By Theorem 20, $g$ is regular. Its restriction to $A_0$
is also regular, and hence the composite $\phi = \tilde{F} \circ (g|A_0)$ is regular. Moreover,
$\tilde{F}(x) = \phi[F(x)]$ for every $x \in S \cap \tilde{S}$. See Fig. 7-4.

Figure 7-4 (the coordinate change $\phi = \tilde{F} \circ (g|A_0)$)
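To make the coordinate change $\phi = \tilde{F} \circ (g|A_0)$ concrete, here is a small Python sketch on the upper unit hemisphere. It is an illustration under stated assumptions, not a construction from the text: $F$ is taken to be the cartesian system $(X, Y)$, $\tilde{F}$ the spherical system (colatitude and longitude), and the function names `g`, `F`, `F_tilde`, `change`, and `jacobian_det` are hypothetical. The Jacobian of the change is estimated by central differences and should be nonzero wherever both systems are defined.

```python
import math

def g(x, y):
    """Inverse of the cartesian system F = (X, Y) on the upper unit hemisphere."""
    return (x, y, math.sqrt(1.0 - x * x - y * y))

def F(p):
    return (p[0], p[1])

def F_tilde(p):
    """Spherical system: colatitude and longitude (longitude taken in [0, 2*pi))."""
    return (math.acos(p[2]), math.atan2(p[1], p[0]) % (2.0 * math.pi))

def change(x, y):
    """The coordinate change phi = F_tilde o g."""
    return F_tilde(g(x, y))

def jacobian_det(x, y, eps=1e-6):
    """Central-difference estimate of the Jacobian of the coordinate change."""
    a, b = change(x + eps, y), change(x - eps, y)
    c, d = change(x, y + eps), change(x, y - eps)
    d00 = (a[0] - b[0]) / (2 * eps)
    d10 = (a[1] - b[1]) / (2 * eps)
    d01 = (c[0] - d[0]) / (2 * eps)
    d11 = (c[1] - d[1]) / (2 * eps)
    return d00 * d11 - d01 * d10

# A sample point with colatitude pi/4 and longitude pi/3.
x0 = math.sin(math.pi / 4) * math.cos(math.pi / 3)
y0 = math.sin(math.pi / 4) * math.sin(math.pi / 3)
p = g(x0, y0)

print(F_tilde(p), change(*F(p)))   # the two coordinate pairs coincide
print(jacobian_det(x0, y0))        # nonzero, so the change is locally regular
```

At this sample point the exact Jacobian is $1/(\sin\varphi\cos\varphi)$ with colatitude $\varphi = \pi/4$, that is, 2.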

Example 1 (continued). If $r = 1$ and $A \subset E^1$ is an open interval, then a coordinate
change is determined by a real-valued $\phi$ such that $\phi'(t) \ne 0$. If $\phi'(t) > 0$ for every
$t \in A$, then $\phi$ was called in Section 3-2 a "parameter change." The condition $\phi'(t) > 0$
insures that $\phi$ does not reverse the orientation (see Section 7-4) of the 1-manifold $M$.

*Manifolds defined by coordinate systems. We have seen that an $r$-mani-
fold $M \subset E^n$ is covered by coordinate patches. Conversely, let $M$ be a subset
of $E^n$ with the following property: There is a collection $\mathcal{S}$ of relatively open
subsets of $M$ which cover $M$; and with each $S \in \mathcal{S}$ there is associated a homeo-
morphism $F$ from $S$ onto an open set $A \subset E^r$ such that $F^{-1}$ is of class $C^{(1)}$ and
$\mathcal{J}F^{-1}(t) > 0$ for each $t \in A$. By Corollary 1 to Theorem 21, each $S \in \mathcal{S}$ is an
$r$-manifold; and hence $M$ is an $r$-manifold. This shows that instead of the
approach in Section 4-7 we could have defined manifolds in terms of coordinate
systems.
We have taken a rather concrete approach to the idea of manifold, con-
sidering only manifolds given as subsets of some euclidean space. However,
the manifolds encountered in practice often are not given in this way. The
approach via coordinate systems allows one to take a more abstract point of
view. From this viewpoint the definition of manifold is as follows: An $r$-manifold
of class $C^{(1)}$ is a Hausdorff topological space $Z$ provided with a collection of
open subsets $S$ (called coordinate patches) covering $Z$ and for each coordinate
patch a homeomorphism $F$ from $S$ onto an open set $A \subset E^r$. The regularity
of the flat transformation $\phi$ in Fig. 7-4 is now imposed as an axiom.
If $Z$ is an $r$-manifold according to this more abstract definition and $Z$ is
separable, then $Z$ can be realized as a submanifold of some euclidean space,
in fact as a submanifold of $E^{2r+1}$. See [21], Chap. IV. A result of this type
is called an embedding theorem. By separable we mean that $Z$ has a covering
either by finitely many coordinate patches or by a sequence of coordinate
patches.

PROBLEMS
1. Let M = {(a,y, 2): 22+ 24y+ 2? = 3,2 >0,y > |z|}. Let F = (x M, Y|M)
and $\tilde{F} = (X|M, Z|M)$. Describe $A$, $\tilde{A}$, $g$, $\tilde{g}$, and $\phi$ (see Fig. 7-4).
2. Let $M = \{(y^2 + z^2, y, z) : y > 0\}$ and let $F(x, y, z) = (y + z, \exp z)$ for
$(x, y, z) \in M$. Show that $F$ is a coordinate system for $M$ and find $F(M)$. [Hint:
First take $y$ and $z$ as coordinates on $M$ and then find a suitable coordinate change
$\phi$ giving the system $F$.]
3. Stereographic projection. Let $M$ be the sphere $x^2 + y^2 + (z - 1)^2 = 1$. For
each $x = (x, y, z) \in M$ except the "north pole" $2e_3$, let $(s, t, 0)$ be the point where
the line through $2e_3$ and $x$ meets the plane $z = 0$. Let $F(x) = (s, t)$.
(a) Show that $F$ is a coordinate system for $M - \{2e_3\}$.
(b) Let $h_1$, $h_2$ be tangent vectors to $M$ at $x$, and let $k_l = DF(x)(h_l)$, $l = 1, 2$.
Show that the angle between $k_1$ and $k_2$ equals the angle between $h_1$ and $h_2$.
4. In Example 2 prove that $S_c$ is an $(r - 1)$-manifold and $W_c$ is a coordinate system
for it. [Hints: Let $A = F(S)$. $F(S_c)$, being the intersection of $A$ with the hyper-
plane $t^1 = c$, is an $(r - 1)$-manifold. See Corollary 2 to Theorem 21.]
5. Let $(F^1, \ldots, F^r)$ be a coordinate system for $S$. Let $s < r$ and $S_c = \{x : F^1(x) =
c^1, \ldots, F^s(x) = c^s\}$. Show that if $S_c$ is not empty, then $S_c$ is an $(r - s)$-manifold
and $(F^{s+1}|S_c, \ldots, F^r|S_c)$ a coordinate system for it.
6. Let $f^1, \ldots, f^r$, $\Phi^1, \ldots, \Phi^{n-r}$ be functions of class $C^{(1)}$ on an open set $D$. Suppose
that $F = (f^1|S, \ldots, f^r|S)$ is a coordinate system for $S$, that $S = \{x \in D : \Phi(x) = 0\}$,
and that $D\Phi(x)$ has rank $n - r$ for every $x \in D$. Show that each $x_0 \in S$ has a
neighborhood $U$ such that $(f^1, \ldots, f^r, \Phi^1, \ldots, \Phi^{n-r})$ restricted to $U$ is a co-
ordinate system for $U$.
7. Let $1 < r < n$, and let $\mathfrak{M}(n, r) = \{\alpha \in E^n_r : |\alpha| = 1, \alpha \text{ is decomposable}\}$. Identify
$E^n_r$ with $E^{\binom{n}{r}}$, and show that $\mathfrak{M}(n, r)$ is a manifold of dimension $r(n - r)$. [Hint:
Given $\alpha_0 \in \mathfrak{M}(n, r)$ let $(v_1, \ldots, v_n)$ be an orthonormal frame for $E^n$ such that
$\alpha_0 = v_1 \wedge \cdots \wedge v_r$. Show that if $\alpha$ is in a small enough neighborhood of $\alpha_0$,
then $\alpha$ can be uniquely written in the form
$$\alpha = c\left(v_1 + \sum_{k=r+1}^{n} t_{1k}v_k\right) \wedge \cdots \wedge \left(v_r + \sum_{k=r+1}^{n} t_{rk}v_k\right).$$
The $r(n - r)$ numbers $t_{lk}$ can be taken as coordinates of $\alpha$.]

7-3 MEASURE AND INTEGRATION ON MANIFOLDS

Let us now define $r$-dimensional measure for subsets of an $r$-manifold
and integrals with respect to it. The $r$-dimensional measure of a set $A \subset M$
will be denoted by $V_r(A)$, and the integral of a function $f$ over $A$ by $\int_A f\,dV_r$
or by $\int_A f(x)\,dV_r(x)$. When $r = n$, these turn out to have the same meaning
as in Chapter 5.
For simplicity, the integral will be defined only for continuous functions.
Moreover, it is assumed throughout this section that $A$ is a $\sigma$-compact set
(Section 5-6). By these assumptions, we avoid some slightly tedious discus-
sion of measurability which for present purposes is irrelevant. We recall from
p. 175 that if $A$ is $\sigma$-compact and $F$ is continuous on $A$, then $F(A)$ is also
$\sigma$-compact.
Let us first consider the case when $A$ is contained in some coordinate
patch $S$. Let $F$ be a coordinate system for $S$, and let $g = F^{-1}$. The following
discussion is intended to motivate the definition of $\int_A f\,dV_r$. Let us consider
a "small" $r$-cube $I$, of side length $a$ and vertex $t_0$, as indicated in Fig. 7-5.
Let us set $k_l = ae_l$, $h_l = ag_l(t_0)$, $l = 1, \ldots, r$. Then $|k_1 \wedge \cdots \wedge k_r| =
V_r(I)$, and by (6-16) $|h_1 \wedge \cdots \wedge h_r| = V_r(K)$, where $K$ is the $r$-parallele-
piped with vertex $x_0$ spanned by $h_1, \ldots, h_r$. By (7-1) the ratio of these two
numbers is $\mathcal{J}g(t_0)$. Thus
$$V_r(K) = \mathcal{J}g(t_0)\,V_r(I).$$

Figure 7-5 (the cube $I$ with vertex $t_0$ and the parallelepiped $K$ with vertex $x_0 = g(t_0)$ spanned by $h_l = ag_l(t_0)$)

Moreover, $f[g(t_0)]V_r(K)$ should furnish a good approximation to $\int_{g(I)} f\,dV_r$.
If $Z$ is a figure composed of small nonoverlapping $r$-cubes $I_1, \ldots, I_m$, then
$\int_{g(Z)} f\,dV_r$ should be approximately the corresponding sum
$$\sum_{k=1}^{m} f[g(t_k)]\,\mathcal{J}g(t_k)\,V_r(I_k).$$
In the exact formula $Z$ is replaced by the set $B = g^{-1}(A)$, and the sum by
an integral.

Definition. Let $A$ be a $\sigma$-compact subset of a coordinate patch $S$, and let
$f$ be continuous on $A$. Then
$$\int_A f(x)\,dV_r(x) = \int_B f[g(t)]\,\mathcal{J}g(t)\,dV_r(t), \qquad A = g(B), \tag{7-4}$$
provided the function $(f \circ g)\mathcal{J}g$ is integrable over $B$.

The integral over $B$ is taken in the sense of Section 5-6. By (7-2), $\mathcal{J}g(t) =
|g_1(t) \wedge \cdots \wedge g_r(t)|$. Since $g$ is of class $C^{(1)}$, $\mathcal{J}g$ is continuous. Hence $(f \circ g)\mathcal{J}g$
is continuous. If $f \ge 0$, then the integral over $B$ either exists or diverges to
$+\infty$. When the latter occurs we agree that the integral of $f$ over $A$ also diverges
to $+\infty$.
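We must show that the integral does not depend on the particular choice
of coordinate system. Let $\tilde{S}$ be another coordinate patch such that $A \subset \tilde{S}$,
and let $\tilde{F}$ be a coordinate system for $\tilde{S}$. Let us adopt the notation of Fig. 7-4.
Then $g = \tilde{g} \circ \phi$, and by (7-3)
$$\mathcal{J}g(t) = \mathcal{J}\tilde{g}[\phi(t)]\,\mathcal{J}\phi(t).$$
Let $\tilde{B} = \tilde{F}(A)$. Then $\tilde{B} = \phi(B)$ and by the transformation formula for
integrals (Theorem 17)
$$\int_B f[g(t)]\,\mathcal{J}g(t)\,dV_r(t) = \int_{\tilde{B}} f[g(\phi^{-1}(\tau))]\,\mathcal{J}g[\phi^{-1}(\tau)]\,|J\phi^{-1}(\tau)|\,dV_r(\tau).$$
Since $\phi^{-1}$ is a flat transformation, $|J\phi^{-1}| = \mathcal{J}\phi^{-1}$. Therefore, since $g \circ \phi^{-1} = \tilde{g}$,
$$\int_B f[g(t)]\,\mathcal{J}g(t)\,dV_r(t) = \int_{\tilde{B}} f[\tilde{g}(\tau)]\,\mathcal{J}\tilde{g}(\tau)\,dV_r(\tau),$$
as required.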
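The independence of the coordinate system can be observed numerically in the simplest case $r = 1$. The sketch below is an illustration only, with a hypothetical integrand $f(x, y) = y$ and a hypothetical helper `midpoint` (a plain midpoint rule); it integrates $f$ over the arc of the unit circle between the angles $\pi/4$ and $3\pi/4$ in two coordinate systems, once with the angle and once with the abscissa $x$ as coordinate. In each case the integrand is $f[g(t)]\,\mathcal{J}g(t)$ with $\mathcal{J}g = |g'|$.

```python
import math

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def f(x, y):
    """Hypothetical integrand on the arc."""
    return y

# Coordinate system 1: the angle t, g(t) = (cos t, sin t), |g'(t)| = 1.
by_angle = midpoint(lambda t: f(math.cos(t), math.sin(t)) * 1.0,
                    math.pi / 4, 3 * math.pi / 4)

# Coordinate system 2: the abscissa x, g(x) = (x, sqrt(1 - x^2)),
# |g'(x)| = 1 / sqrt(1 - x^2).
c = math.sqrt(2.0) / 2.0
by_abscissa = midpoint(lambda x: f(x, math.sqrt(1 - x * x)) / math.sqrt(1 - x * x),
                       -c, c)

print(by_angle, by_abscissa)
```

Both values approximate the same number, $\sqrt{2}$, as the argument above predicts.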
If we take $f(x) = 1$, then (7-4) defines the $r$-dimensional measure of $A$:
$$V_r(A) = \int_B \mathcal{J}g(t)\,dV_r(t), \qquad A = g(B). \tag{7-5}$$

Example 1. Let $M \subset E^3$ be a 2-manifold, and $S$ a relatively open subset of $M$ on
which $x$ and $y$ can be taken as coordinates, as in Example 2, Section 7-1. Since
$\mathcal{J}g = [1 + |d\phi|^2]^{1/2}$, we have
$$\int_A f\,dV_2 = \int_B f[x, y, \phi(x, y)]\,[1 + |d\phi(x, y)|^2]^{1/2}\,dV_2(x, y).$$

Example 2. Let $M$ be a 1-manifold and $B$ be a closed interval $[a, b]$. Using the ter-
minology of Section 3-2, $A$ is the trace of the simple arc $\gamma$ represented on $[a, b]$ by $g$.
Formula (7-4) becomes
$$\int_A f\,dV_1 = \int_a^b f[g(t)]\,|g'(t)|\,dt.$$
The right-hand side is $\int_\gamma f\,ds$, as defined on p. 85.

*Note. The line integral $\int_\gamma f\,ds$ was defined in Section 3-2 without requiring that
$\gamma$ be simple. If $\gamma$ is not simple, then its trace $A = g([a, b])$ need not be contained
in a 1-manifold. There is a more general notion of $r$-dimensional measure and integral
for sets which are not necessarily subsets of an $r$-manifold. The general formula which
becomes for simple arcs the one in Example 2 is
$$\int_A f(x)N(x)\,dV_1(x) = \int_a^b f[g(t)]\,|g'(t)|\,dt,$$
where $N(x)$ is the multiplicity of the point $x$. For any $r \ge 1$ there is a similar formula
$$\int_A f(x)N(x)\,dV_r(x) = \int_B f[g(t)]\,\mathcal{J}g(t)\,dV_r(t), \tag{$*$}$$
where $B$ is a $\sigma$-compact subset of $E^r$, $g$ is of class $C^{(1)}$ on $B$, $A = g(B)$, and
again $N(x)$ (= number of points $t \in B$ such that $g(t) = x$) is the multiplicity of $x$.
[See p. 144, H. Federer, "The $(\phi, k)$ rectifiable subsets of $n$ space," Trans. Amer.
Math. Soc. 62 (1947), 114-192.]
In formula ($*$) it is not necessary to assume that $Dg(t)$ has maximum rank $r$.
If $B' = \{t \in B : \operatorname{rank} Dg(t) < r\}$, then $\mathcal{J}g(t) = 0$ for every $t \in B'$. Therefore
$B'$ contributes nothing to the integral on the right-hand side of ($*$). It turns out that
$g(B')$ has $r$-dimensional measure 0, and therefore contributes nothing to the integral
on the left-hand side of ($*$).
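A simple instance of the multiplicity formula for $r = 1$ can be checked by machine. In the following Python sketch (an illustration with hypothetical names, not taken from the text) the unit circle is traversed twice by $g(t) = (\cos t, \sin t)$ on $[0, 4\pi]$, so that $N(x) = 2$ except at the single point $g(0)$, which has multiplicity 3 and does not affect the integral. The integrand $f(x, y) = y^2$ is an assumed example; the left-hand side is evaluated with a once-around parametrization of the circle.

```python
import math

def midpoint(func, a, b, n=4000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def f(x, y):
    """Hypothetical integrand."""
    return y * y

# Right-hand side: the circle traversed twice, |g'(t)| = 1.
rhs = midpoint(lambda t: f(math.cos(t), math.sin(t)) * 1.0, 0.0, 4 * math.pi)

# Left-hand side: multiplicity N = 2 times the integral of f over the circle,
# computed with a once-around parametrization.
lhs = 2 * midpoint(lambda t: f(math.cos(t), math.sin(t)) * 1.0, 0.0, 2 * math.pi)

print(lhs, rhs)
```

Both sides approximate $2\pi$.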
Example 3. Let $H$ be the hemisphere $x^2 + y^2 + z^2 = 1$, $z > 0$. Introducing spher-
ical coordinates on $H$, let
$$g(\phi, \theta) = (\sin \phi \cos \theta)e_1 + (\sin \phi \sin \theta)e_2 + (\cos \phi)e_3.$$
The image of a small square $[\phi, \phi + a] \times [\theta, \theta + a]$ in the $(\phi, \theta)$ plane is a small
sector of the hemisphere which is approximately a rectangle of side lengths $a$ and
$a \sin \phi$. Since $|\partial g/\partial \phi \wedge \partial g/\partial \theta|a^2$ is approximately the area of the sector, this suggests
that $\mathcal{J}g(\phi, \theta) = |\partial g/\partial \phi \wedge \partial g/\partial \theta| = \sin \phi$. The reader should check this formula
(Problem 3). If we take $B = (0, \pi/2) \times (0, 2\pi)$ then $H - g(B)$ is an arc of a great
circle corresponding to $\theta = 0$. This arc is 2-dimensionally null in the sense defined
below, and hence
$$\int_H f(x)\,dV_2(x) = \int_{g(B)} f(x)\,dV_2(x) = \int_0^{\pi/2}\!\int_0^{2\pi} f[g(\phi, \theta)] \sin \phi\,d\theta\,d\phi.$$
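The formula of Example 3 lends itself to a quick numerical check. The following Python sketch is an illustration with hypothetical helper names: it verifies $\mathcal{J}g(\phi, \theta) = \sin \phi$ at a sample point by means of the cross product of the two partial derivatives (in $E^3$ the norm of $u \wedge v$ is the length of $u \times v$), and then approximates the integral over $B = (0, \pi/2) \times (0, 2\pi)$ by a midpoint sum of the kind used to motivate (7-4). With $f = 1$ the result should be near the area $2\pi$ of the hemisphere, and with the assumed integrand $f(x, y, z) = z$ it should be near $\pi$.

```python
import math

def g(phi, theta):
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def script_j(phi, theta):
    """|dg/dphi ^ dg/dtheta| computed from the analytic partial derivatives."""
    g_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    g_theta = (-math.sin(phi) * math.sin(theta), math.sin(phi) * math.cos(theta), 0.0)
    n = (g_phi[1] * g_theta[2] - g_phi[2] * g_theta[1],
         g_phi[2] * g_theta[0] - g_phi[0] * g_theta[2],
         g_phi[0] * g_theta[1] - g_phi[1] * g_theta[0])
    return math.sqrt(sum(c * c for c in n))

def integrate_over_hemisphere(f, n=200):
    """Midpoint sum of f[g(t)] * Jg(t) * (area of each small square) over B."""
    d_phi = (math.pi / 2) / n
    d_theta = (2 * math.pi) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        for j in range(n):
            theta = (j + 0.5) * d_theta
            total += f(*g(phi, theta)) * script_j(phi, theta) * d_phi * d_theta
    return total

area = integrate_over_hemisphere(lambda x, y, z: 1.0)
height_integral = integrate_over_hemisphere(lambda x, y, z: z)
print(script_j(0.7, 1.3), math.sin(0.7))
print(area, height_integral)
```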
Let us turn to the general case when $A$ is not necessarily contained in some
coordinate patch. To simplify matters let us at first assume that $M$ is a compact
manifold and $f$ is continuous everywhere on $M$. The traditional way to proceed
is to dissect $M$ into a finite number of nonoverlapping pieces $S_1, \ldots, S_m$ each
of which has a coordinate system, with fr $S_i \cap$ fr $S_j$ contained in a finite union
of $(r - 1)$-manifolds and $M = \text{cl}\,S_1 \cup \cdots \cup \text{cl}\,S_m$. Then
$$\int_A f\,dV_r = \sum_{k=1}^{m} \int_{A \cap S_k} f\,dV_r. \tag{7-6}$$
In simple examples it is easy to find such dissections of $M$. However, the
theorem that every compact $r$-manifold $M$ has such a dissection is a difficult
one to prove. See [21], Chap. IV. Nor is it evident that the integral is inde-
pendent of the particular dissection chosen.
The same result can be achieved by a simpler device called partition of
unity. The basic difficulty with dissections is that $S_i$ and $S_j$ cannot overlap.
With partitions of unity this problem is avoided.

Partition of unity. Let us recall from Section 5-3 that the support of a
function $\psi$ is the smallest closed set outside of which $\psi(\mathbf{x}) = 0$. Let us first
find for every $\mathbf{x}_0$ and $r > 0$ a function $\psi$ of class $C^{(\infty)}$ on $E^n$ such that $\psi(\mathbf{x}) > 0$
on the open r-ball with center $\mathbf{x}_0$ and the support of $\psi$ is the closed r-ball with
center $\mathbf{x}_0$. In fact, let

\[ h(x) = \exp\left(\frac{-1}{1 - x^2}\right), \quad -1 < x < 1; \qquad h(x) = 0, \quad |x| \ge 1. \]

From the example at the end of Section 2-3 and the composite function
theorem, $h$ is of class $C^{(\infty)}$ on $E^1$. Let

\[ \psi(\mathbf{x}) = h\left(\frac{|\mathbf{x} - \mathbf{x}_0|}{r}\right). \]
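The construction of the bump function can be sketched in code. This is a minimal illustration, assuming the definitions h(x) = exp(-1/(1 - x²)) for |x| < 1, h(x) = 0 otherwise, and ψ(x) = h(|x - x0| / r); the function names `h` and `psi` simply mirror the symbols of the text.

```python
import math

def h(x):
    """Smooth function on E^1: positive on (-1, 1), zero for |x| >= 1."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def psi(x, x0, r):
    """Bump function on E^n: positive on the open r-ball about x0,
    zero outside it, so its support is the closed r-ball."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x0)))
    return h(dist / r)

inside = psi((0.5, 0.0), (0.0, 0.0), 1.0)    # positive
outside = psi((2.0, 0.0), (0.0, 0.0), 1.0)   # exactly 0.0
```

Note that h depends on x only through x², so ψ is smooth at the center x0 even though the distance |x - x0| is not differentiable there.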

Definition. Let $M$ be a compact manifold. A collection of functions
$\{\phi_1, \ldots, \phi_m\}$ is a partition of unity for $M$ if:
(1) $\phi_k$ is of class $C^{(\infty)}$ on $M$ and $\phi_k \ge 0$, $k = 1, \ldots, m$;
(2) The support of $\phi_k$ is a compact subset of some coordinate patch,
$k = 1, \ldots, m$; and
(3) $\sum_{k=1}^m \phi_k(\mathbf{x}) = 1$ for every $\mathbf{x} \in M$.

Proposition 29. Any compact manifold M has a partition of unity.


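Proof. Every $\mathbf{x} \in M$ is contained in some coordinate patch $S$. Since $S$
is relatively open there is a neighborhood $U$ of $\mathbf{x}$ such that $M \cap \operatorname{cl} U \subset S$.
Since $M$ is compact, a finite collection $\{U_1, \ldots, U_m\}$ of such neighborhoods
covers $M$. Let $\mathbf{x}_k$ be the center of $U_k$, $r_k$ the radius, and $\psi_k$ the function of
class $C^{(\infty)}$ constructed above with $\mathbf{x}_0 = \mathbf{x}_k$, $r = r_k$. The collection of functions
$\{\psi_1, \ldots, \psi_m\}$ satisfies (1) and (2) of the definition, but not necessarily (3).
However, by construction $\psi_1(\mathbf{x}) + \cdots + \psi_m(\mathbf{x}) > 0$ for every $\mathbf{x} \in M$. Let

\[ \phi_k(\mathbf{x}) = \frac{\psi_k(\mathbf{x})}{\psi_1(\mathbf{x}) + \cdots + \psi_m(\mathbf{x})}, \quad \mathbf{x} \in M, \quad k = 1, \ldots, m. \]

The collection $\{\phi_1, \ldots, \phi_m\}$ is a partition of unity for $M$. ∎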
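The normalization step in this proof can be illustrated on a concrete compact manifold. The sketch below takes M to be the unit circle in E², covered by four open balls of radius 1 centered at (±1, 0) and (0, ±1); these centers and the radius are choices made for the illustration only, and each ball meets the circle in an arc, which plays the role of a coordinate patch. The bump function h is assumed to be exp(-1/(1 - x²)) on (-1, 1) and 0 elsewhere.

```python
import math

def h(x):
    """Smooth bump on E^1, positive exactly on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

# Hypothetical cover of the unit circle by four balls of radius 1.
CENTERS = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
RADIUS = 1.0

def psi(k, x):
    """Unnormalized bump psi_k, supported in the closed ball about CENTERS[k]."""
    cx, cy = CENTERS[k]
    return h(math.hypot(x[0] - cx, x[1] - cy) / RADIUS)

def partition_of_unity(x):
    """Values phi_1(x), ..., phi_m(x) at a point x of the circle.

    The balls cover the circle, so the sum of the psi_k is positive
    there and the division is legitimate.
    """
    values = [psi(k, x) for k in range(len(CENTERS))]
    total = sum(values)
    return [v / total for v in values]

point = (math.cos(0.3), math.sin(0.3))
phis = partition_of_unity(point)   # nonnegative numbers summing to 1
```

Each φ_k inherits the support of ψ_k, so properties (1) and (2) are untouched by the division, while property (3) holds by construction.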

Let $f$ be continuous on $M$. Since $M$ is compact, $f$ is bounded on $M$. This
insures that all of the following integrals exist. If the support of $f$ is a compact
subset $K$ of a coordinate patch, then we let

\[ \int_A f\,dV_r = \int_{A \cap K} f\,dV_r. \]

In particular, if $\{\phi_1, \ldots, \phi_m\}$ is a partition of unity, then for any $f$ the support
of $f\phi_k$ is compact and lies in some coordinate patch. Hence the integral of
$f\phi_k$ is defined.

Definition. Let $A$ be a $\sigma$-compact subset of a compact manifold $M$, and
$\{\phi_1, \ldots, \phi_m\}$ be a partition of unity for $M$. Then for any $f$ continuous
on $M$

\[ \int_A f\,dV_r = \sum_{k=1}^m \int_A f\phi_k\,dV_r. \tag{7-7} \]

In case $A$ is contained in some coordinate patch $S$, this agrees with the
earlier definition (7-4), since $\sum\phi_k(\mathbf{x}) = 1$ for every $\mathbf{x} \in A$. We must show
that the integral does not depend on the particular partition of unity chosen
for $M$. Let $\{\chi_1, \ldots, \chi_p\}$ be another partition of unity for $M$. Then for every
$\mathbf{x} \in M$

\[ f(\mathbf{x})\chi_l(\mathbf{x}) = \sum_{k=1}^m f(\mathbf{x})\phi_k(\mathbf{x})\chi_l(\mathbf{x}). \]

Since the support of $f\chi_l$ is contained in some coordinate patch, its integral
over $A$ can be written according to (7-4) as an integral over a set $B \subset E^r$.
By Theorem 13, the integral over $B$ of a finite sum is the sum of the integrals.
Hence

\[ \int_A f\chi_l\,dV_r = \sum_{k=1}^m \int_A f\phi_k\chi_l\,dV_r, \quad l = 1, \ldots, p, \]

\[ \sum_{l=1}^p \int_A f\chi_l\,dV_r = \sum_{l=1}^p \sum_{k=1}^m \int_A f\phi_k\chi_l\,dV_r. \]

In the same way

\[ \sum_{k=1}^m \int_A f\phi_k\,dV_r = \sum_{k=1}^m \sum_{l=1}^p \int_A f\phi_k\chi_l\,dV_r. \]

Since the right-hand sides are equal, the integral of $f$ over $A$ does not depend
on which partition of unity is chosen.
If $f(\mathbf{x}) = 1$ for every $\mathbf{x} \in M$, then the integral gives the r-dimensional
measure

\[ V_r(A) = \sum_{k=1}^m \int_A \phi_k\,dV_r. \]

When $A$ is a subset of some coordinate patch, this agrees with the previous
definition. If $V_r(A) = 0$, then $A$ is called an r-null set. The integral has the
same elementary properties listed in Theorem 13 for $r = n$ (Problem 9).
Moreover, $V_r$ is countably additive (Problem 10).

Measure and integration on noncompact manifolds. If $M$ is not compact
then the discussion is somewhat more complicated. Partitions of unity
consisting of infinite collections $\{\phi_1, \phi_2, \ldots\}$ must be considered. To (1)-(3)
must be added:
(4) If $K$ is any compact subset of $M$, then the support of $\phi_k$ meets $K$ for
only finitely many $k$.
The sum in (3) is now an infinite series. However, on any compact set
only finitely many terms are different from 0. Every manifold has a partition
of unity. This can be proved by an elaboration of the proof of Proposition 29,
which we shall not give.
Let $f$ be continuous on $A$. Then $f$ is called integrable over $A$ if
$\sum_{k=1}^\infty \int_A |f|\phi_k\,dV_r$ is finite. If $f$ is integrable over $A$, then its integral is
$\sum_{k=1}^\infty \int_A f\phi_k\,dV_r$.
The following results are true whether $M$ is compact or not. However,
we shall give the proof only for the compact case.

Proposition 30. If $A$ is a subset of $M$ which is an $(r - 1)$-manifold, then
$A$ is an r-null set.

Proof. Suppose first that $A \subset S$, where $S$ is a coordinate patch. Let $\mathbf{F}$ be
a coordinate system for $S$. By Theorem 21 the set $B = \mathbf{F}(A)$ is an $(r - 1)$-manifold.
By Corollary 3, p. 178, $V_r(B) = 0$. Therefore, from (7-5) $V_r(A) = 0$.
In the general case, let $\{\phi_1, \ldots, \phi_m\}$ be a partition of unity for $M$, and
let $K_k$ be the support of $\phi_k$. Then $V_r(A \cap K_k) = 0$ and hence

\[ \int_A \phi_k\,dV_r = \int_{A \cap K_k} \phi_k\,dV_r = 0. \]

Summing from 1 to $m$, $V_r(A) = 0$. ∎

Corollary. If $A$ is contained in a countable union of $(r - 1)$-manifolds,
then $A$ is an r-null set.

Formula (5-38) about change of variables in an integral by a regular flat


transformation has the following generalization.

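Theorem 22. Let $\mathbf{g}$ be a regular transformation from an r-manifold $N$ into
an r-manifold $M$. Let $f$ be continuous on $M$, and $A$ be any $\sigma$-compact subset
of $M$. Then

\[ \int_A f(\mathbf{x})\,dV_r(\mathbf{x}) = \int_{\mathbf{g}^{-1}(A)} f[\mathbf{g}(\mathbf{t})]\,J\mathbf{g}(\mathbf{t})\,dV_r(\mathbf{t}), \tag{7-7} \]

provided either integral exists.
Proof. By using a partition of unity, it suffices to consider the case when
$f$ has compact support contained in some coordinate patch $S$ and $A \subset S$.
Let $\mathbf{F}$ be a coordinate system for $S$. Then $\mathbf{F} \circ \mathbf{g}$ is a coordinate system for
$\mathbf{g}^{-1}(S)$. Let $\mathbf{G} = \mathbf{F}^{-1}$, $\mathbf{G}' = (\mathbf{F} \circ \mathbf{g})^{-1} = \mathbf{g}^{-1} \circ \mathbf{G}$. Let $A = \mathbf{G}(B)$. Then
$\mathbf{g}^{-1}(A) = \mathbf{G}'(B)$. By (7-4) the left-hand side of (7-7) equals $\int_B (f \circ \mathbf{G})\,J\mathbf{G}\,dV_r$,
while the right-hand side equals $\int_B [f \circ \mathbf{g} \circ \mathbf{G}'][(J\mathbf{g}) \circ \mathbf{G}']\,J\mathbf{G}'\,dV_r$. But $\mathbf{G} = \mathbf{g} \circ \mathbf{G}'$,
and by (7-3), $J\mathbf{G} = [(J\mathbf{g}) \circ \mathbf{G}']\,J\mathbf{G}'$. Therefore both sides of (7-7) are
the same. ∎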

PROBLEMS

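1. Find the area of $\{(x, y, xy) : x^2 + y^2 < 1\}$.

2. Let $n = 3$, $r = 2$. Show that (7-5) becomes

\[ V_2(A) = \int_B \left\{ \left[\frac{\partial(g^1, g^2)}{\partial(s, t)}\right]^2 + \left[\frac{\partial(g^2, g^3)}{\partial(s, t)}\right]^2 + \left[\frac{\partial(g^3, g^1)}{\partial(s, t)}\right]^2 \right\}^{1/2} dV_2(s, t). \]

3. In Example 3 calculate $\partial\mathbf{g}/\partial\phi \wedge \partial\mathbf{g}/\partial\theta$ and verify that its norm is $\sin\phi$.
Moments, centroids. These are defined in the same way as for $r = n$. For example,
if $A$ has positive r-dimensional measure then the components of its centroid are

\[ \bar{x}^i = \frac{1}{V_r(A)} \int_A x^i\,dV_r(\mathbf{x}), \quad i = 1, \ldots, n. \]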

If $r = 2$, $n = 3$, and $A$ is thought of as a surface with continuous density $\rho(\mathbf{x})$
(mass per unit of area), then the mass is $\int_A \rho(\mathbf{x})\,dV_2(\mathbf{x})$ and the components of
the center of mass are

\[ \bar{x}^i = \int_A x^i\rho(\mathbf{x})\,dV_2(\mathbf{x}) \Big/ \int_A \rho(\mathbf{x})\,dV_2(\mathbf{x}), \quad i = 1, 2, 3. \]

4. Find the second moment about the z-axis of:
(a) The sphere $x^2 + y^2 + z^2 = 1$.
(b) The triangle with vertices $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.
5. Show that $\tfrac{1}{2}\mathbf{e}_3$ is the centroid of the hemisphere $H$ in Example 3. Use spherical
coordinates on $H$.
6. (Surfaces of revolution). Let $\gamma$ be a simple arc (or simple closed curve) lying in
the half $y > 0$ of the $(x, y)$ plane. From Section 3-2, $\gamma$ has a standard representation
$\mathbf{G}$ on $[0, l]$, where $l$ is the length and $|\mathbf{G}'(s)| = 1$ for $0 \le s \le l$. Let
$\mathbf{g}(s, t) = G^1(s)\mathbf{e}_1 + G^2(s)[(\cos t)\mathbf{e}_2 + (\sin t)\mathbf{e}_3]$, and let $M = \mathbf{g}((0, l) \times [0, 2\pi])$.
(a) Prove Pappus' theorem: $V_2(M) = 2\pi\bar{y}l$, where $(\bar{x}, \bar{y})$ is the centroid of $\gamma$.
(b) Find the area of a torus (doughnut) of major radius $r_1$ and minor radius $r_2$.
7. Let $S$ be the unit $(n - 1)$-sphere in $E^n$. Show that the $(n - 1)$-measure of the
"zone" $\{\mathbf{x} \in S : a < x^n < b\}$ depends only on the difference $b - a$ when $n = 3$,
but this is false when $n \ne 3$. Assume that $-1 \le a < b \le 1$.
8. Let $v(r) = \alpha_n r^n$ be the n-dimensional measure of a spherical n-ball of radius $r$.
Show that $v'(r)$ is the $(n - 1)$-measure of its boundary. [Note: $\alpha_n$ was calculated
on p. 183. If $\beta_{n-1} = V_{n-1}$[unit $(n - 1)$-sphere], then $\beta_{n-1} = n\alpha_n$.]
9. Prove that the statements (1)-(7) obtained by replacing $n$ by $r$ everywhere in
Theorem 13 are true for integrals over $\sigma$-compact subsets of a compact r-manifold.
10. Let $M$ be a compact r-manifold. Let $A = A_1 \cup A_2 \cup \cdots$, where $A_1, A_2, \ldots$
are disjoint $\sigma$-compact subsets of $M$. Show that $V_r(A) = V_r(A_1) + V_r(A_2) + \cdots$.
[Hint: Use a partition of unity and Theorem 12.]

7-4 ORIENTATIONS; INTEGRALS OF r-FORMS


Let $M$ be an r-manifold. For each $\mathbf{x} \in M$ the tangent space $T(\mathbf{x})$ is an
r-dimensional vector subspace of $E^n$. According to Section 6-3 $T(\mathbf{x})$ has two
possible orientations, each of which is an r-vector of norm 1. If one of these
orientations is denoted by $\mathbf{o}(\mathbf{x})$, then the other is $-\mathbf{o}(\mathbf{x})$. We would like to
choose the orientation for $T(\mathbf{x})$ consistently on $M$; in other words we want the
function $\mathbf{o}$ whose value at $\mathbf{x}$ is $\mathbf{o}(\mathbf{x})$ to be continuous on $M$.

Definition. $M$ is an orientable manifold if there exists a continuous r-vector-valued
function $\mathbf{o}$ such that $\mathbf{o}(\mathbf{x})$ is an orientation for the tangent space
$T(\mathbf{x})$ for every $\mathbf{x} \in M$. The function $\mathbf{o}$ is an orientation for $M$.

It can be shown that a connected manifold has at most two orientations.
Let us find out what orientability means in the extreme dimensions $r = 1$,
$r = n$, and $r = n - 1$.

$r = 1$. If $M$ is a 1-manifold, then the two orientations for the 1-dimensional
vector space $T(\mathbf{x})$ are unit tangent vectors at $\mathbf{x}$ pointing in opposite
directions. $M$ is oriented by assigning a unit tangent vector $\mathbf{v}(\mathbf{x})$ continuously
on $M$ (Fig. 7-6). It can be shown that every 1-manifold is orientable.

Figure 7-6

$r = n$. In this case the n-manifold $M$ is an open subset of $E^n$. The possible
values for $\mathbf{o}(\mathbf{x})$ are $\pm\mathbf{e}_{1 \cdots n}$. If $M$ is connected, then $\mathbf{o}(\mathbf{x})$ must be constant.
If $\mathbf{o}(\mathbf{x}) = \mathbf{e}_{1 \cdots n}$ for every $\mathbf{x} \in M$, then $M$ is positively oriented; and if $\mathbf{o}(\mathbf{x}) =
-\mathbf{e}_{1 \cdots n}$ for every $\mathbf{x} \in M$, then $M$ is negatively oriented.
$r = n - 1$. If $M$ is an $(n - 1)$-manifold in $E^n$, then the adjoint
$\mathbf{n}(\mathbf{x}) = {*}\mathbf{o}(\mathbf{x})$ is a unit normal vector to $M$ at $\mathbf{x}$. The condition that $M$ be
orientable is that a unit normal can be chosen continuously on $M$. If $D$ is an
open set which is on one side of its boundary $\operatorname{fr} D$ (see Section 7-5), then the
exterior unit normal orients $\operatorname{fr} D$. If $M$ is not the boundary of an open set, then
$M$ may not be orientable. This is shown by the following famous surface.

Figure 7-7

Example 1. The Möbius strip. This is a 2-manifold $M \subset E^3$ which is not orientable.
It may be visualized by twisting a strip of paper and pasting together the ends
(Fig. 7-7). The edge of the strip must be omitted in order that $M$ be locally like $E^2$.
The fact that a unit normal cannot be chosen continuously may be expressed more
picturesquely by saying that the Möbius strip is a surface with "only one side."
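The failure of a continuous unit normal can be observed numerically. The sketch below uses one common parametrization of the Möbius strip, x(u, v) = ((1 + v cos(u/2)) cos u, (1 + v cos(u/2)) sin u, v sin(u/2)); this parametrization is an assumption of the illustration and is not taken from the text. Along the center circle v = 0 the normal is carried continuously from u = 0 to u = 2π, where it returns to the same point with the opposite sign.

```python
import math

def center_point(u):
    """Point of the center circle v = 0 of the strip."""
    return (math.cos(u), math.sin(u), 0.0)

def mobius_normal(u):
    """Unit normal dx/du x dx/dv evaluated on the center circle v = 0."""
    c, s = math.cos(u / 2), math.sin(u / 2)
    du = (-math.sin(u), math.cos(u), 0.0)          # dx/du at v = 0
    dv = (c * math.cos(u), c * math.sin(u), s)     # dx/dv at v = 0
    n = (du[1] * dv[2] - du[2] * dv[1],
         du[2] * dv[0] - du[0] * dv[2],
         du[0] * dv[1] - du[1] * dv[0])
    length = math.sqrt(sum(t * t for t in n))
    return tuple(t / length for t in n)

n_start = mobius_normal(0.0)          # normal at the point (1, 0, 0)
n_end = mobius_normal(2 * math.pi)    # same point, after one circuit
```

Since the two normals at the common point are opposite, no continuous choice of unit normal along the center circle can close up, which is the content of the "one side" description.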
Example 2. The Möbius strip is not a compact 2-manifold. An example of a compact,
nonorientable 2-manifold is the Klein bottle, or twisted torus. It is obtained by also
joining together the lateral edges of the rectangle used to make the Möbius strip, as
indicated in Fig. 7-8. The Klein bottle cannot be realized as a submanifold of $E^3$,
since it can be proved that any compact $(n - 1)$-manifold $M \subset E^n$ is the boundary
of an open set and hence is orientable. However, the Klein bottle can be realized as a
submanifold of $E^4$.

Figure 7-8

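Integrals of r-forms. Let $M$ be an r-manifold with orientation $\mathbf{o}$, and
$\omega$ an r-form continuous on $M$. For each $\mathbf{x} \in M$ consider $\omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})$, the
scalar product of the r-covector $\omega(\mathbf{x})$ and the r-vector $\mathbf{o}(\mathbf{x})$. Since $\omega$ and $\mathbf{o}$ are
continuous functions, $\omega \cdot \mathbf{o}$ is a continuous real-valued function. Let $A$ be a
$\sigma$-compact subset of $M$.

Definition. The integral of $\omega$ over $A$ with the orientation $\mathbf{o}$ is

\[ \int_{A^{\mathbf{o}}} \omega = \int_A \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})\,dV_r(\mathbf{x}), \tag{7-8} \]

provided $\omega \cdot \mathbf{o}$ is integrable over $A$.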
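In particular, if $M$ is compact, then $\omega \cdot \mathbf{o}$ is continuous, bounded, and
hence integrable over any $\sigma$-compact subset of $M$. The integral has the following
elementary properties:

(1) $\int_{A^{\mathbf{o}}} (\omega + \zeta) = \int_{A^{\mathbf{o}}} \omega + \int_{A^{\mathbf{o}}} \zeta$.

(2) $\int_{A^{\mathbf{o}}} c\omega = c\int_{A^{\mathbf{o}}} \omega$, for any scalar $c$.

(3) $\int_{A^{-\mathbf{o}}} \omega = -\int_{A^{\mathbf{o}}} \omega$.

(4) If $|\omega(\mathbf{x})| \le C$ for every $\mathbf{x} \in A$, then $\left|\int_{A^{\mathbf{o}}} \omega\right| \le CV_r(A)$.

(5) $\int_{A^{\mathbf{o}}} \omega = \int_{A_1^{\mathbf{o}}} \omega + \int_{A_2^{\mathbf{o}}} \omega$ if $A = A_1 \cup A_2$ and $A_1 \cap A_2$ is empty.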

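These follow at once from corresponding elementary properties of the
right-hand side of (7-8). See Problem 9, Section 7-3.
For instance, in (3)

\[ \int_A \omega(\mathbf{x}) \cdot [-\mathbf{o}(\mathbf{x})]\,dV_r(\mathbf{x}) = -\int_A \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})\,dV_r(\mathbf{x}). \]

Since $|\mathbf{o}(\mathbf{x})| = 1$, $|\omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})| \le |\omega(\mathbf{x})|$. Then in (4)

\[ \left|\int_A \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})\,dV_r(\mathbf{x})\right| \le \int_A |\omega(\mathbf{x})|\,dV_r(\mathbf{x}) \le CV_r(A). \]

In (1), one can take more generally a finite number of r-forms $\omega^1, \ldots, \omega^m$, or
more generally an infinite sequence $\omega^1, \omega^2, \ldots$ provided $\sum_{k=1}^\infty \int_A |\omega^k(\mathbf{x})|\,dV_r(\mathbf{x})$
converges. Similarly, the generalization of (5) is still true if $A = A_1 \cup A_2 \cup \cdots$,
where $A_1, A_2, \ldots$ are disjoint $\sigma$-compact sets and $\sum_{k=1}^\infty \int_{A_k} |\omega(\mathbf{x})|\,dV_r(\mathbf{x})$
converges.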
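The case $r = n$. Let $A^+$ denote $A$ with the positive orientation $\mathbf{e}_{1 \cdots n}$
of $E^n$. Let $\omega = f\,dx^1 \wedge \cdots \wedge dx^n$ be an n-form. Then

\[ \omega(\mathbf{x}) \cdot \mathbf{e}_{1 \cdots n} = \omega_{1 \cdots n}(\mathbf{x}) = f(\mathbf{x}), \]

and (7-8) becomes

\[ \int_{A^+} f\,dx^1 \wedge \cdots \wedge dx^n = \int_A f\,dV_n. \tag{7-9} \]

The left-hand side of (7-9) changes sign if either the orientation of $A$ is reversed
or two differentials $dx^i$ and $dx^j$ are interchanged. For instance, if $n = 2$ then

\[ \int_{A^+} f\,dx \wedge dy = \int_A f\,dV_2, \]

\[ \int_{A^+} f\,dx \wedge dy = \int_{A^-} f\,dy \wedge dx = -\int_{A^+} f\,dy \wedge dx. \]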
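Orientation induced by a regular transformation. Let $N$ be an r-manifold
which is oriented by an orientation $\mathbf{O}$; and let $\mathbf{g}$ be regular from $N$ into $M$.
Let $\mathbf{t} \in N$, and let $L_r$ be the linear transformation induced by $D\mathbf{g}(\mathbf{t})$, as in
the discussion on page 241. Let $\alpha(\mathbf{x}) = L_r[\mathbf{O}(\mathbf{t})]$. Since $|\mathbf{O}(\mathbf{t})| = 1$, from
(7-1) we have

\[ |\alpha(\mathbf{x})| = J\mathbf{g}(\mathbf{t})\,|\mathbf{O}(\mathbf{t})| = J\mathbf{g}(\mathbf{t}), \quad \mathbf{x} = \mathbf{g}(\mathbf{t}). \]

Let $\mathbf{o}(\mathbf{x}) = |\alpha(\mathbf{x})|^{-1}\alpha(\mathbf{x})$. Then $\mathbf{o}$ is an orientation for $\mathbf{g}(N)$, called the
orientation induced from $\mathbf{O}$ by $\mathbf{g}$.
If $N$ is a positively oriented open set $\Delta \subset E^r$, then as in the derivation of
formula (7-2), $\alpha(\mathbf{x}) = \mathbf{g}_1(\mathbf{t}) \wedge \cdots \wedge \mathbf{g}_r(\mathbf{t})$.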

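Example 3. Let $M \subset E^3$ be an orientable 2-manifold, and $\mathbf{o}$ a given orientation
for $M$. Let $\mathbf{g}$ be regular from $\Delta \subset E^2$ into $M$. We must determine whether the
orientation induced from the positive orientation of $E^2$ agrees with $\mathbf{o}$. If $\mathbf{x} = \mathbf{g}(s, t)$, then

\[ \alpha(\mathbf{x}) = \frac{\partial\mathbf{g}}{\partial s} \wedge \frac{\partial\mathbf{g}}{\partial t} = \sum_{i<j} \alpha^{ij}(\mathbf{x})\,\mathbf{e}_{ij}, \]

where by (6-13a) $\alpha^{ij}(\mathbf{x}) = \partial(g^i, g^j)/\partial(s, t)$. Then $\mathbf{o}(\mathbf{x}) = c(\mathbf{x})\alpha(\mathbf{x})$, where $c(\mathbf{x}) =
\pm|\alpha(\mathbf{x})|^{-1}$. The induced orientation is the given one, provided $c(\mathbf{x}) > 0$.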

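Example 4. Let $H$ be the hemisphere in Example 3, p. 253, oriented so that $o^{12}(\mathbf{x}) > 0$
for every $\mathbf{x} \in H$. The vector $\mathbf{n}(\mathbf{x}) = {*}\mathbf{o}(\mathbf{x})$ is normal to $H$, and its third component
$n^3(\mathbf{x})$ equals $o^{12}(\mathbf{x})$. We have oriented $H$ so that the normal "points upward."
If $(\phi, \theta)$ are spherical coordinates of $\mathbf{x}$, and $\mathbf{g}$ is as on p. 253, then

\[ \alpha^{12}(\mathbf{x}) = \frac{\partial(g^1, g^2)}{\partial(\phi, \theta)} = \sin\phi\cos\phi > 0. \]

Therefore the induced orientation is $\mathbf{o}$.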
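
The determinant in Example 4 can be expanded directly. The following worked computation is a sketch that assumes the parametrization of p. 253, namely $g^1 = \sin\phi\cos\theta$ and $g^2 = \sin\phi\sin\theta$:

```latex
\[
\frac{\partial(g^1, g^2)}{\partial(\phi, \theta)}
= \det\begin{pmatrix}
\cos\phi\cos\theta & -\sin\phi\sin\theta \\
\cos\phi\sin\theta & \phantom{-}\sin\phi\cos\theta
\end{pmatrix}
= \sin\phi\cos\phi\,(\cos^2\theta + \sin^2\theta)
= \sin\phi\cos\phi .
\]
```

This is positive for $0 < \phi < \pi/2$, which is the range of $\phi$ on the open hemisphere.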

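Let $\omega$ be an r-form which is continuous on $M$, and let $\omega^{\#}$ be the r-form
defined in Section 6-5. It has the property that

\[ \omega^{\#}(\mathbf{t}) \cdot \mathbf{O}(\mathbf{t}) = \omega(\mathbf{x}) \cdot \alpha(\mathbf{x}), \quad \text{if } \mathbf{x} = \mathbf{g}(\mathbf{t}). \]

Since $\mathbf{g}$ is of class $C^{(1)}$, $\omega^{\#}$ is continuous on $N$.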

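Proposition 31. Let $A = \mathbf{g}(B)$, where $B$ is a $\sigma$-compact subset of $N$. Let $\mathbf{o}$
be the orientation induced by $\mathbf{g}$ from the orientation $\mathbf{O}$ on $N$. Then

\[ \int_{A^{\mathbf{o}}} \omega = \int_{B^{\mathbf{O}}} \omega^{\#}, \tag{7-10} \]

provided either integral exists.

Proof. From the discussion above, $\alpha(\mathbf{x}) = J\mathbf{g}(\mathbf{t})\,\mathbf{o}(\mathbf{x})$ if $\mathbf{x} = \mathbf{g}(\mathbf{t})$. Dividing
by $J\mathbf{g}(\mathbf{t})$, we have

\[ \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x}) = \frac{\omega^{\#}(\mathbf{t}) \cdot \mathbf{O}(\mathbf{t})}{J\mathbf{g}(\mathbf{t})}. \]

By Theorem 22,

\[ \int_A \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x})\,dV_r(\mathbf{x}) = \int_B \frac{\omega^{\#}(\mathbf{t}) \cdot \mathbf{O}(\mathbf{t})}{J\mathbf{g}(\mathbf{t})}\,J\mathbf{g}(\mathbf{t})\,dV_r(\mathbf{t}). \]

Canceling $J\mathbf{g}(\mathbf{t})$ on the right-hand side, we get (7-10). ∎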

An important particular case of Proposition 31 is obtained by taking
for $N$ an open set $\Delta \subset E^r$. This proposition, together with a judicious choice
of coordinate systems on $M$, furnishes a tool for evaluating integrals of r-forms.

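Example 5. Let $r = 1$, $\Delta \subset E^1$. Then $\omega^{\#}(t) = \omega[\mathbf{g}(t)] \cdot \mathbf{g}'(t)$. If $B$ is an interval,
then $\int_{B^+} \omega^{\#}$ is the line integral of $\omega$ along the curve in $E^n$ represented parametrically
on $B$ by $\mathbf{g}$.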

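Example 6. Let $\omega = f\,dx^{i_1} \wedge \cdots \wedge dx^{i_r}$, and let $\Delta \subset E^r$. From Section 6-5

\[ \omega^{\#} = f \circ \mathbf{g}\;dg^{i_1} \wedge \cdots \wedge dg^{i_r} = f \circ \mathbf{g}\,\frac{\partial(g^{i_1}, \ldots, g^{i_r})}{\partial(t^1, \ldots, t^r)}\,dt^1 \wedge \cdots \wedge dt^r. \]

By formulas (7-9) and (7-10),

\[ \int_{A^{\mathbf{o}}} f\,dx^{i_1} \wedge \cdots \wedge dx^{i_r} = \int_B f \circ \mathbf{g}\,\frac{\partial(g^{i_1}, \ldots, g^{i_r})}{\partial(t^1, \ldots, t^r)}\,dV_r(\mathbf{t}), \]

provided $\mathbf{o}$ is the orientation induced by $\mathbf{g}$.
Continuing Example 4, we have for instance

\[ \int_{H^{\mathbf{o}}} f\,dx \wedge dy = \int_B f \circ \mathbf{g}\,\frac{\partial(g^1, g^2)}{\partial(\phi, \theta)}\,dV_2(\phi, \theta)
= \int_B f(\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi)\sin\phi\cos\phi\,dV_2(\phi, \theta), \]

where $B = (0, \pi/2) \times (0, 2\pi)$.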
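
The last formula lends itself to a numerical check. The sketch below assumes the upward-oriented unit hemisphere of Example 4 and the integrand weight sin φ cos φ, and approximates the integral over B by a midpoint rule; the function name `integrate_dx_dy` is illustrative only.

```python
import math

def integrate_dx_dy(f, n=200):
    """Midpoint-rule approximation of the integral of f dx ^ dy over the
    upward-oriented unit hemisphere, using the pulled-back integrand
    f(g(phi, theta)) * sin(phi) * cos(phi) on (0, pi/2) x (0, 2*pi)."""
    dphi = (math.pi / 2) / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        weight = math.sin(phi) * math.cos(phi)   # d(g1, g2)/d(phi, theta)
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            total += f(x, y, z) * weight
    return total * dphi * dtheta

disk_area = integrate_dx_dy(lambda x, y, z: 1.0)   # close to pi
half_ball = integrate_dx_dy(lambda x, y, z: z)     # close to 2*pi/3
```

With f = 1 the value approximates π, the area of the unit disk onto which H projects; with f = z it approximates 2π/3, the volume under the hemisphere, as one expects from integrating z over the projected disk.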



PROBLEMS
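1. Let $\Delta \subset E^n$ have the positive orientation and let $\mathbf{g}$ be a regular flat transformation
from $\Delta$ into $E^n$.
(a) Show that $\mathbf{g}$ induces the positive orientation on $\mathbf{g}(\Delta)$ if and only if $J\mathbf{g}(\mathbf{t}) > 0$
for every $\mathbf{t} \in \Delta$.
(b) Show that (7-10) becomes

\[ \int_{\mathbf{g}(\Delta)^+} f\,dx^1 \wedge \cdots \wedge dx^n = \int_{\Delta^+} f \circ \mathbf{g}\,J\mathbf{g}\;dt^1 \wedge \cdots \wedge dt^n, \]

provided $J\mathbf{g}(\mathbf{t}) > 0$ for every $\mathbf{t} \in \Delta$.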


2. Let $A = \{(x, y, z) : y = x^2 + z^2,\ y \le 4\}$, oriented so that $o^{31}(\mathbf{x}) > 0$. Evaluate:
(a) $\int_{A^{\mathbf{o}}} z\,dx \wedge dy$. (b) $\int_{A^{\mathbf{o}}} \exp y\,dz \wedge dx$.

[Hint: Use polar coordinates in the $(x, z)$-plane.]


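3. Let $A$ be the triangle in $E^3$ with vertices $\mathbf{e}_1$, $-\mathbf{e}_2$, $2\mathbf{e}_3$.
(a) Show that $\mathbf{o} = \tfrac{1}{3}(2\mathbf{e}_{23} - 2\mathbf{e}_{31} + \mathbf{e}_{12})$ is an orientation for the plane
containing $A$.
(b) Evaluate $\int_{A^{\mathbf{o}}} x\,dy \wedge dz$. [Hint: Take $\mathbf{g}$ affine such that $\mathbf{g}(\Sigma) = A$, where
$\Sigma$ is the standard 2-simplex.]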
4. Let A = {(2, y, 2): 272+ y? = 22,2 > 0,0 < z < 1}, oriented so that 0!7(x) < 0.
Evaluate
/ 2 Gi IN, Wee

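5. Let $n = 4$ and $M = \{\mathbf{x} : (x^1)^2 + (x^2)^2 = 1,\ (x^3)^2 + (x^4)^2 = 1\}$. Let $\mathbf{g}(s, t) =
(\cos s)\mathbf{e}_1 + (\sin s)\mathbf{e}_2 + (\cos t)\mathbf{e}_3 + (\sin t)\mathbf{e}_4$, $0 \le s, t \le 2\pi$.
(a) Find the orientation $\mathbf{o}$ induced by $\mathbf{g}$ from the positive orientation of $E^2$.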
(b) Evaluate
[oa A dx* + aie? dx” IX dx’.
M

6. Suppose that $M$ is the r-manifold determined by $\Phi$, in the sense that $M$ satisfies
(4-26), p. 123. Show that $M$ is orientable.

7-5 THE DIVERGENCE THEOREM

This is an n-dimensional generalization of the fundamental theorem of


calculus and has numerous applications in geometry and in physics. We shall
first state the theorem in two different ways and derive some corollaries of it.
A proof is given later in the section.
The divergence theorem equates the integral of an (n — 1)-form w over
the boundary of an open set D and the integral of dw over D. The integral
of a differential form depends on an orientation. We must assign on the bound-
ary fr D an orientation corresponding to the positive orientation on D.
For this purpose we must assume that D lies on one side of its boundary.
This is expressed precisely as follows.

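Definition. Let $\mathbf{x}_0 \in \operatorname{fr} D$, and let $U$ be a neighborhood of $\mathbf{x}_0$ such that
$(\operatorname{fr} D) \cap U$ is an $(n - 1)$-manifold. Then $D$ is on one side of its boundary
in $U$ if there exists a function $\Phi$ of class $C^{(1)}$ on $U$ such that $d\Phi(\mathbf{x}) \ne 0$
for every $\mathbf{x} \in U$ and

\[ (\operatorname{fr} D) \cap U = \{\mathbf{x} \in U : \Phi(\mathbf{x}) = 0\}, \]
\[ D \cap U = \{\mathbf{x} \in U : \Phi(\mathbf{x}) < 0\}. \]

If $\operatorname{fr} D$ is an $(n - 1)$-manifold and every $\mathbf{x}_0 \in \operatorname{fr} D$ has such a neighborhood
$U$, then $D$ is on one side of its boundary.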

Example 1. Let D = {x:|x| < 1 or 1 < |x| < 2}. Then fr D is the union of two
concentric (n — 1)-spheres of radii 1 and 2. However, D is on both sides of the
inner (n — 1)-sphere.
Actually this example is rather artificial. If D is the interior of its closure, then
using the implicit function theorem it can be shown that D is on one side of fr D.

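Definition. Let $\mathbf{x} \in \operatorname{fr} D$, and $\mathbf{n} \ne 0$ be a vector normal to $\operatorname{fr} D$ at $\mathbf{x}$.
Then $\mathbf{n}$ is an exterior normal at $\mathbf{x}$ if there exists $\delta > 0$ such that $\mathbf{x} + t\mathbf{n} \in D$
for $-\delta < t < 0$ and $\mathbf{x} + t\mathbf{n} \in (\operatorname{cl} D)^c$ for $0 < t < \delta$.

From the definition, all exterior normals at $\mathbf{x}$ are positive scalar multiples
of any particular one. We shall be principally concerned with the unit exterior
normal, which will be denoted by $\nu(\mathbf{x})$ ($|\nu(\mathbf{x})| = 1$).
Let $D$ be on one side of its boundary in $U$. The vector $\mathbf{n}(\mathbf{x}) = \operatorname{grad} \Phi(\mathbf{x})$ is normal
to $\operatorname{fr} D$ at $\mathbf{x} \in (\operatorname{fr} D) \cap U$. Let $\psi(t) = \Phi(\mathbf{x} + t\mathbf{n}(\mathbf{x}))$. Then $\psi(0) = 0$
and

\[ \psi'(0) = \operatorname{grad} \Phi(\mathbf{x}) \cdot \mathbf{n}(\mathbf{x}) = |\operatorname{grad} \Phi(\mathbf{x})|^2 > 0. \]

There exists $\delta > 0$ such that $\psi(t) < 0$ for $-\delta < t < 0$ and $\psi(t) > 0$ for $0 < t < \delta$.
Therefore $\operatorname{grad} \Phi(\mathbf{x})$ is an exterior normal at $\mathbf{x}$. The vector

\[ \nu(\mathbf{x}) = |\operatorname{grad} \Phi(\mathbf{x})|^{-1} \operatorname{grad} \Phi(\mathbf{x}) \]

is the unit exterior normal to $D$ at $\mathbf{x}$. Since $\Phi$ is of class $C^{(1)}$, $\nu$ is a continuous
function on $(\operatorname{fr} D) \cap U$. See Fig. 7-9.

Figure 7-9

Let $\mathbf{o}(\mathbf{x})$ be the $(n - 1)$-vector such that $\nu(\mathbf{x}) = {*}\mathbf{o}(\mathbf{x})$. The $(n - 1)$-space
of $\mathbf{o}(\mathbf{x})$ is the tangent space $T(\mathbf{x})$, and $|\mathbf{o}(\mathbf{x})| = |\nu(\mathbf{x})| = 1$. Hence $\mathbf{o}(\mathbf{x})$
is an orientation for $T(\mathbf{x})$. A frame $(\mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ for $T(\mathbf{x})$ has this orientation
if and only if $(\nu(\mathbf{x}), \mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ is a positively oriented frame for $E^n$. Since
$\nu$ is continuous on $(\operatorname{fr} D) \cap U$ and the components of $\nu$ and $\mathbf{o}$ are related by
(6-32a), the function $\mathbf{o}$ is continuous there. Thus $\mathbf{o}$ is an orientation for
$(\operatorname{fr} D) \cap U$, which we call the positive orientation.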

The preceding discussion was local. However, let $\operatorname{fr} D$ be an $(n - 1)$-manifold,
and let $D$ lie on one side of it. There is a (unique) exterior unit
normal $\nu(\mathbf{x})$ at each $\mathbf{x} \in \operatorname{fr} D$, and the orientation $\mathbf{o}(\mathbf{x})$ such that ${*}\mathbf{o}(\mathbf{x}) = \nu(\mathbf{x})$
is defined for every $\mathbf{x} \in \operatorname{fr} D$. Since every $\mathbf{x}_0 \in \operatorname{fr} D$ has a relative neighborhood
in which $\mathbf{o}$ is continuous, the function $\mathbf{o}$ is continuous on $\operatorname{fr} D$.
This defines the positive orientation $\mathbf{o}$ on $\operatorname{fr} D$. Let us write $\partial D^+$ instead
of $(\operatorname{fr} D)^{\mathbf{o}}$ for $\operatorname{fr} D$ with this orientation. (The symbol $\partial$ is widely used to
denote a boundary.)
Let us state and prove the divergence theorem for the following class of
open sets, which will be called regular domains.

Definition. An open set $D \subset E^n$ is a regular domain if: (1) $D$ is bounded;
(2) $\operatorname{fr} D$ is an $(n - 1)$-manifold of class $C^{(2)}$; and (3) $D$ is on one side of
its boundary.

Divergence theorem. Let $D$ be a regular domain and $\omega$ an $(n - 1)$-form
of class $C^{(1)}$ on $\operatorname{cl} D$. Then

\[ \int_{\partial D^+} \omega = \int_{D^+} d\omega. \tag{7-11a} \]

Let us defer the proof until later in the section. The last assumption
means that $\omega$ is the restriction to $\operatorname{cl} D$ of a form of class $C^{(1)}$ on some open
set $D_0$ containing $\operatorname{cl} D$. The somewhat restrictive assumption (2) about $\operatorname{fr} D$
is made to simplify the proof. The theorem is still true if $\operatorname{fr} D$ is not a manifold
but instead consists of a finite number of pieces of class $C^{(1)}$ intersecting in
sets of dimension $n - 2$. For example, if $D$ is the interior of an n-cube then
the pieces are the faces, which are cubes of dimension $n - 1$ and intersect in
$(n - 2)$-dimensional cubes. This more general form of the divergence theorem
will be precisely stated at the end of the section. For certain special kinds of
sets $D$ there is an easy proof of the theorem (Problems 5, 6).

The case $n = 2$. Suppose that $\operatorname{fr} D = C_1 \cup \cdots \cup C_m$ where each $C_k$ is the
trace of a simple closed curve $\gamma_k$, and $C_1, \ldots, C_m$ are disjoint. The orientation
is chosen by selecting the unit tangent vector $\mathbf{v}(x, y)$ so that $(\nu(x, y), \mathbf{v}(x, y))$ is
a positively oriented orthonormal frame for $E^2$. Intuitively speaking this
means that as the boundary is traversed, $D$ is always on the left. Then

\[ \int_{\partial D^+} \omega = \sum_{k=1}^m \int_{\gamma_k} \omega. \]

If we write $\omega = M\,dx + N\,dy$, then (7-11a) becomes

\[ \int_{\partial D^+} M\,dx + N\,dy = \int_{D^+} \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right) dx \wedge dy. \tag{7-12} \]

This is known as Green's theorem. (See Fig. 7-10.)



Figure 7-10

Example 2. Let $\omega = \tfrac{1}{2}(x\,dy - y\,dx)$. Then $d\omega = dx \wedge dy$ and $V_2(D) = \int_{D^+} dx \wedge dy$.
Hence the area of $D$ can be written as an integral over the boundary:

\[ V_2(D) = \frac{1}{2} \int_{\partial D^+} x\,dy - y\,dx. \]
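
For a polygon the boundary integral of Example 2 reduces to a finite sum, often called the shoelace formula. The sketch below is an illustration only: a polygon has corners, so it is not a regular domain in the sense defined above, and the computation relies on the more general form of the theorem for piecewise smooth boundaries. Along a straight edge from (x0, y0) to (x1, y1) the integral of ½(x dy - y dx) equals ½(x0·y1 - x1·y0).

```python
def polygon_area(vertices):
    """Signed area of a polygon from the boundary integral
    (1/2) * integral of (x dy - y dx).

    The vertices are listed in order around the boundary. The result is
    positive when the interior lies on the left as the boundary is
    traversed (counterclockwise order), negative otherwise.
    """
    total = 0.0
    count = len(vertices)
    for k in range(count):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % count]
        total += x0 * y1 - x1 * y0   # twice the contribution of one edge
    return 0.5 * total

square = polygon_area([(0, 0), (1, 0), (1, 1), (0, 1)])     # 1.0
triangle = polygon_area([(0, 0), (4, 0), (0, 3)])           # 6.0
reversed_square = polygon_area([(0, 1), (1, 1), (1, 0), (0, 0)])  # -1.0
```

Reversing the order of the vertices reverses the orientation of the boundary and changes the sign of the result, in agreement with property (3) of integrals of forms.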


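The divergence theorem is often stated in a different way which does not
involve integrals of differential forms. Let $\zeta$ be a 1-form of class $C^{(1)}$ on $\operatorname{cl} D$.
Its divergence

\[ \operatorname{div} \zeta = \sum_{i=1}^n \frac{\partial\zeta_i}{\partial x^i} \]

is continuous. Let $\omega$ be the $(n - 1)$-form such that ${*}\omega = \zeta$. By formulas
(6-32) and (6-34),

\[ \omega(\mathbf{x}) \cdot \mathbf{o}(\mathbf{x}) = {*}\omega(\mathbf{x}) \cdot {*}\mathbf{o}(\mathbf{x}) = \zeta(\mathbf{x}) \cdot \nu(\mathbf{x}), \]
\[ d\omega = \operatorname{div} \zeta\;dx^1 \wedge \cdots \wedge dx^n. \]

Therefore

\[ \int_{\partial D^+} \omega = \int_{\operatorname{fr} D} \zeta \cdot \nu\,dV_{n-1}, \]

\[ \int_{D^+} d\omega = \int_{D^+} \operatorname{div} \zeta\;dx^1 \wedge \cdots \wedge dx^n = \int_D \operatorname{div} \zeta\,dV_n. \]

The conclusion (7-11a) of the divergence theorem can be restated as:

\[ \int_{\operatorname{fr} D} \zeta(\mathbf{x}) \cdot \nu(\mathbf{x})\,dV_{n-1}(\mathbf{x}) = \int_D \operatorname{div} \zeta(\mathbf{x})\,dV_n(\mathbf{x}). \tag{7-11b} \]

The number $\zeta(\mathbf{x}) \cdot \nu(\mathbf{x})$ is called the (exterior) normal component of the
covector $\zeta(\mathbf{x})$.
Note: In (7-11b) the distinction between vectors and covectors which we
have maintained is not customary. If the covector $\zeta(\mathbf{x})$ is replaced by the
vector with the same components, then $\cdot$ means the standard euclidean inner
product (see remarks in Section 3-3). In Green's formulas below, $df$ should
then be replaced by $\operatorname{grad} f$.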
For n = 3, the divergence theorem is often called Gauss’ theorem or
Ostrogradsky’s theorem. It has various interesting physical interpretations.
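Formula (7-11b) can be tested numerically in a simple case. The sketch below takes D to be the unit disk in E² and the field with components (x³, y³); both the domain and the field are choices made for the illustration. On the unit circle the exterior unit normal at (x, y) is (x, y) itself, and both sides of (7-11b) should be close to 3π/2.

```python
import math

def flux_through_unit_circle(zeta, n=2000):
    """Midpoint-rule approximation of the integral of zeta . nu over the
    unit circle, where nu(x, y) = (x, y) is the exterior unit normal."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        x, y = math.cos(theta), math.sin(theta)
        zx, zy = zeta(x, y)
        total += (zx * x + zy * y) * dtheta
    return total

def integral_over_unit_disk(f, n=300):
    """Midpoint-rule approximation of the integral of f over the unit
    disk, computed in polar coordinates (area element r dr dtheta)."""
    dr = 1.0 / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += f(r * math.cos(theta), r * math.sin(theta)) * r
    return total * dr * dtheta

flux = flux_through_unit_circle(lambda x, y: (x ** 3, y ** 3))
volume_side = integral_over_unit_disk(lambda x, y: 3 * x * x + 3 * y * y)
```

Here div ζ = 3x² + 3y², so the right-hand side is 3·2π·(1/4) = 3π/2, while the flux is the integral of cos⁴θ + sin⁴θ over the circle, which has the same value.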

Let $\zeta$ be a force field acting in some open set $D_0 \subset E^3$. For each $\mathbf{x} \in D_0$,
$\zeta(\mathbf{x})$ is the force covector acting at $\mathbf{x}$. For notational simplicity, let us set
$M = \operatorname{fr} D$ throughout the discussion to follow. The number $\int_M \zeta(\mathbf{x}) \cdot \nu(\mathbf{x})\,dV_2(\mathbf{x})$
is called the outward flux across the boundary $M$. The divergence theorem
expresses the outward flux as a volume integral over $D$. If $D$ has small diameter
and contains $\mathbf{x}_0$, then the outward flux is approximately $V_3(D) \operatorname{div} \zeta(\mathbf{x}_0)$.
To make this statement more precise let us state the following.

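Lemma. If $f$ is continuous on an open set $D_0$ containing $\mathbf{x}_0$, then

\[ f(\mathbf{x}_0) = \lim_{\operatorname{diam} D \to 0} \frac{1}{V_n(D)} \int_D f(\mathbf{x})\,dV_n(\mathbf{x}). \]

In words, this formula says that given $\epsilon > 0$ there exists $\delta > 0$ such that
if $D$ is any open set of diameter less than $\delta$ with $\mathbf{x}_0 \in D$, then

\[ \left| V_n(D)f(\mathbf{x}_0) - \int_D f(\mathbf{x})\,dV_n(\mathbf{x}) \right| < \epsilon V_n(D). \]

Proof. Given $\epsilon > 0$ let $\delta > 0$ be such that $|f(\mathbf{x}) - f(\mathbf{x}_0)| < \epsilon$ whenever
$|\mathbf{x} - \mathbf{x}_0| < \delta$. If $\mathbf{x}_0 \in D$ and $\operatorname{diam} D < \delta$, then

\[ \left| V_n(D)f(\mathbf{x}_0) - \int_D f(\mathbf{x})\,dV_n(\mathbf{x}) \right| = \left| \int_D [f(\mathbf{x}_0) - f(\mathbf{x})]\,dV_n(\mathbf{x}) \right|
\le \int_D |f(\mathbf{x}_0) - f(\mathbf{x})|\,dV_n(\mathbf{x}) < \epsilon V_n(D). \; ∎ \]

If in the lemma we take $D$ regular and $f = \operatorname{div} \zeta$, then for any $n$ and $\mathbf{x}_0$
in the domain of $\zeta$

\[ \operatorname{div} \zeta(\mathbf{x}_0) = \lim_{\operatorname{diam} D \to 0} \frac{1}{V_n(D)} \int_{\operatorname{fr} D} \zeta(\mathbf{x}) \cdot \nu(\mathbf{x})\,dV_{n-1}(\mathbf{x}). \tag{7-13} \]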
As another physical interpretation, consider a fluid flowing in an open set
$D_0 \subset E^3$. Let $t$ denote time and $\mathbf{x} = (x, y, z)$. Let $\rho(\mathbf{x}, t)$ be the density and
$\mathbf{v}(\mathbf{x}, t)$ the velocity at $\mathbf{x}$ and time $t$. Let $\zeta = \rho\mathbf{v}$. Suppose that $D$ is regular and
$\operatorname{cl} D \subset D_0$. The left-hand side of (7-11b) represents the rate at which mass
is flowing out of $D$. Therefore, if $m(t)$ is the mass of the fluid in $D$ at time $t$,
then from the divergence theorem

\[ -\frac{dm}{dt} = \int_D \operatorname{div}(\rho\mathbf{v})\,dV_3. \]

On the other hand,

\[ m(t) = \int_D \rho(\mathbf{x}, t)\,dV_3(\mathbf{x}). \]

Differentiating under the integral sign (Section 5-11),

\[ \frac{dm}{dt} = \int_D \frac{\partial\rho}{\partial t}\,dV_3. \]

For each $t_0$ the functions $-\operatorname{div}(\rho\mathbf{v})$ and $\partial\rho/\partial t$ have the same integral over
every regular $D$ with $\operatorname{cl} D \subset D_0$. By the lemma, for every $\mathbf{x}_0 \in D_0$ these
functions have the same value at $(\mathbf{x}_0, t_0)$. In other words,

\[ \frac{\partial\rho}{\partial t} = -\operatorname{div}(\rho\mathbf{v}). \]

If the density $\rho$ is constant, then the fluid is incompressible. Thus for incompressible
fluids $\operatorname{div} \mathbf{v} = 0$ at every time $t$.
If $\operatorname{div} \zeta(\mathbf{x}) = 0$ for every $\mathbf{x}$ in its domain $D_0$, then $\zeta$ is called divergence
free (or solenoidal). The divergence theorem has the following corollary.

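Corollary. Let $\zeta$ be of class $C^{(1)}$ on an open set $D_0$. Then $\zeta$ is divergence
free if and only if

\[ \int_{\operatorname{fr} D} \zeta(\mathbf{x}) \cdot \nu(\mathbf{x})\,dV_{n-1}(\mathbf{x}) = 0 \tag{*} \]

for every regular domain $D$ such that $\operatorname{cl} D \subset D_0$.

Proof. If $\zeta$ is divergence free, then the equation (*) is immediate from
(7-11b). Conversely if (*) holds for every such $D$, then by (7-13) $\operatorname{div} \zeta(\mathbf{x}_0) = 0$
for every $\mathbf{x}_0 \in D_0$. ∎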

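Green's formulas. Let $f$ be of class $C^{(2)}$ on $\operatorname{cl} D$. Let $f_\nu(\mathbf{x})$ denote the
derivative of $f$ in the direction of the exterior normal at $\mathbf{x} \in M$, namely,

\[ f_\nu(\mathbf{x}) = df(\mathbf{x}) \cdot \nu(\mathbf{x}). \]

Let $\phi$ be another function of class $C^{(2)}$ on $\operatorname{cl} D$, and let $\zeta(\mathbf{x}) = \phi(\mathbf{x})\,df(\mathbf{x})$.
Then

\[ \zeta(\mathbf{x}) \cdot \nu(\mathbf{x}) = \phi(\mathbf{x})f_\nu(\mathbf{x}), \]

\[ \operatorname{div} \zeta = \sum_{i=1}^n \frac{\partial}{\partial x^i}\left(\phi\,\frac{\partial f}{\partial x^i}\right) = d\phi \cdot df + \phi \operatorname{Lapl} f. \]

Hence we get the first Green's formula:

\[ \int_M \phi f_\nu\,dV_{n-1} = \int_D [d\phi \cdot df + \phi \operatorname{Lapl} f]\,dV_n. \tag{7-14} \]

In the same way

\[ \int_M f\phi_\nu\,dV_{n-1} = \int_D [df \cdot d\phi + f \operatorname{Lapl} \phi]\,dV_n. \]

Subtracting, we get the second Green's formula

\[ \int_M [\phi f_\nu - f\phi_\nu]\,dV_{n-1} = \int_D [\phi \operatorname{Lapl} f - f \operatorname{Lapl} \phi]\,dV_n. \tag{7-15} \]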

Example 3. A function $f$ is called harmonic if $\operatorname{Lapl} f = 0$. Let $f$ be harmonic, and
apply the first Green's formula with $\phi = f$. Then

\[ \int_M ff_\nu\,dV_{n-1} = \int_D |df|^2\,dV_n. \tag{7-16} \]

When $n = 3$ the right-hand side often has (except for a suitable multiplicative
constant) the physical interpretation of energy.
If $f$ is harmonic and $f(\mathbf{x}) = 0$ for every $\mathbf{x} \in M$, then from (7-16) the integral
of the nonnegative continuous function $|df|^2$ is 0. Hence $df(\mathbf{x}) = 0$ for every $\mathbf{x} \in \operatorname{cl} D$.
Given $\mathbf{x}_0 \in D$ let $\mathbf{x}_1$ be a point of $M$ nearest $\mathbf{x}_0$. The line joining $\mathbf{x}_0$ and $\mathbf{x}_1$ lies in
$\operatorname{cl} D$, and from the mean value theorem $f$ is constant on it. Since $f(\mathbf{x}_1) = 0$, we must
have $f(\mathbf{x}_0) = 0$. Thus $f(\mathbf{x}) = 0$ for every $\mathbf{x} \in \operatorname{cl} D$.
Suppose that $f$ and $g$ are both of class $C^{(2)}$ on $\operatorname{cl} D$ and harmonic, and that $f(\mathbf{x}) =
g(\mathbf{x})$ for every $\mathbf{x} \in M$. Then $\phi(\mathbf{x}) = f(\mathbf{x}) - g(\mathbf{x}) = 0$ for $\mathbf{x} \in M$ and $\phi$ is harmonic.
Hence $\phi(\mathbf{x}) = 0$, and $f(\mathbf{x}) = g(\mathbf{x})$, for every $\mathbf{x} \in \operatorname{cl} D$. This shows that there is at
most one harmonic function of class $C^{(2)}$ on $\operatorname{cl} D$ with given values on the boundary
$M$. It is more difficult to show that there is in fact a harmonic function $f$ with given
boundary values. This is called Dirichlet's problem. If the boundary data $f|M$ are
merely continuous, then $f$ is continuous on $\operatorname{cl} D$ and of class $C^{(2)}$ and harmonic on $D$.
See [14], Chap. XI. If the boundary data are smooth enough, then f is of class C
and harmonic on cl D. For instance this is true if J is of class C@) and f is of class C®
on M.
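
Formula (7-16) can be checked numerically for a specific harmonic function. The sketch below takes D to be the unit disk in E² and f(x, y) = x² - y², which satisfies Lapl f = 0; the domain and the function are choices made for the illustration. Both sides of (7-16) should be close to 2π.

```python
import math

def f(x, y):
    """A harmonic function on the plane: its Laplacian is 2 - 2 = 0."""
    return x * x - y * y

def grad_f(x, y):
    """Components of df for f = x^2 - y^2."""
    return (2 * x, -2 * y)

def boundary_side(n=2000):
    """Integral of f * f_nu over the unit circle, with nu(x, y) = (x, y)."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        x, y = math.cos(theta), math.sin(theta)
        fx, fy = grad_f(x, y)
        total += f(x, y) * (fx * x + fy * y) * dtheta
    return total

def energy_side(n=300):
    """Integral of |df|^2 over the unit disk, in polar coordinates."""
    dr = 1.0 / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            fx, fy = grad_f(r * math.cos(theta), r * math.sin(theta))
            total += (fx * fx + fy * fy) * r
    return total * dr * dtheta

left = boundary_side()    # close to 2*pi
right = energy_side()     # close to 2*pi
```

On the circle the integrand of the left side is 2cos²(2θ), whose integral is 2π, and |df|² = 4r² integrates to 2π over the disk, so the two computed values agree to within the quadrature error.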

Let us now turn to the proof of the divergence theorem. The proof will
proceed by first proving the theorem when $D$ is either $E^n$ or a half-space and
$\omega$ has compact support. The general case will then be reduced to these two
by introducing local coordinates on $\operatorname{fr} D$ and a partition of unity. As before,
we may let $\zeta = {*}\omega$ and may prove either of the two equivalent formulas
(7-11a) and (7-11b). As in Chapter 5, $\int f\,dV_n$ denotes the integral of $f$ over
all of $E^n$.

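Lemma 1. Let $\zeta$ be a 1-form of class $C^{(1)}$ on $E^n$ such that $\zeta$ has compact
support. Then $\int \operatorname{div} \zeta\,dV_n = 0$.

Proof. Let $1 \le i \le n$. By the iterated integrals theorem

\[ \int \frac{\partial\zeta_i}{\partial x^i}\,dV_n = \int \left\{ \int_{-\infty}^{\infty} \frac{\partial\zeta_i}{\partial x^i}\,dx^i \right\} dV_{n-1}(\mathbf{x}^{i'}), \]

where $\mathbf{x}^{i'} = (x^1, \ldots, x^{i-1}, x^{i+1}, \ldots, x^n)$. Since $\zeta_i$ has compact support,
the inner integral is 0 by the fundamental theorem of calculus. Therefore
$\int \partial\zeta_i/\partial x^i\,dV_n = 0$. Summing from 1 to $n$ we get the lemma. ∎

In the next lemma we write (as in Section 5-5) $\mathbf{x}' = (x^1, \ldots, x^{n-1})$
instead of $\mathbf{x}^{n'}$.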

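Lemma 2. Let $H$ be the half-space $\{\mathbf{x} : x^n < 0\}$, and let $\zeta$ be as in Lemma 1.
Then

\[ \int_H \operatorname{div} \zeta\,dV_n = \int \zeta_n(\mathbf{x}', 0)\,dV_{n-1}(\mathbf{x}'). \]

Proof. If $i < n$, then $\int_H \partial\zeta_i/\partial x^i\,dV_n = 0$ as in the proof of Lemma 1.
For $i = n$ we have

\[ \int_H \frac{\partial\zeta_n}{\partial x^n}\,dV_n = \int \left\{ \int_{-\infty}^{0} \frac{\partial\zeta_n}{\partial x^n}\,dx^n \right\} dV_{n-1}(\mathbf{x}'). \]

By the fundamental theorem of calculus the inner integral is $\zeta_n(\mathbf{x}', 0)$, since
$\zeta_n$ has compact support. ∎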

Proposition 32. Let $\mathbf{f}$ be a regular flat transformation from an open set
$D_1 \subset E^n$ onto an open set $\Delta_1 \subset E^n$. Let $D$ be a regular domain such that
$D \cap D_1$ is not empty, and let $\Delta = \mathbf{f}(D \cap D_1)$, $S = (\operatorname{fr} D) \cap D_1$, $N = \mathbf{f}(S)$.
Then:
(a) $\Delta$ is open and $N = (\operatorname{fr} \Delta) \cap \Delta_1$.
(b) $\Delta$ is on one side of its boundary in a neighborhood of each point of $N$.
(c) If $J\mathbf{f}(\mathbf{x}) > 0$ for every $\mathbf{x} \in D_1$, then the positive orientation for $D$ is
induced by $\mathbf{f}^{-1}$ from the positive orientation for $\Delta$, and the positive orientation
for $S$ from the positive orientation for $N$.

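Proof. Let $\mathbf{g} = \mathbf{f}^{-1}$. The first assertion (a) follows from the fact that a
regular transformation $\mathbf{f}$ is a homeomorphism. Let $\mathbf{t}_0 \in N$, and $\mathbf{x}_0 = \mathbf{g}(\mathbf{t}_0)$.
Let $U$ and $\Phi$ be as in the definition, p. 263. We may assume that $U \subset D_1$.
Let $\Psi(\mathbf{t}) = \Phi[\mathbf{g}(\mathbf{t})]$ for $\mathbf{t} \in \mathbf{f}(U)$. Since $D\mathbf{g}(\mathbf{t})$ has maximum rank $n$ and
$d\Phi(\mathbf{x}) \ne 0$, the chain rule implies that $d\Psi(\mathbf{t}) \ne 0$. The open set $\mathbf{f}(U)$ contains
a neighborhood $\Omega$ of $\mathbf{t}_0$, and

\[ (\operatorname{fr} \Delta) \cap \Omega = \{\mathbf{t} \in \Omega : \Psi(\mathbf{t}) = 0\}, \]
\[ \Delta \cap \Omega = \{\mathbf{t} \in \Omega : \Psi(\mathbf{t}) < 0\}. \]

Therefore $\Delta$ is on one side of its boundary in $\Omega$. This proves (b).
Since $J\mathbf{g}(\mathbf{t}) = 1/J\mathbf{f}(\mathbf{x}) > 0$, $\mathbf{g}$ preserves the positive orientation of $E^n$.
We must show that the orientation induced on $S$ from the positive orientation
on $N$ is positive.
Let $\mathbf{k}_0 = \operatorname{grad} \Psi(\mathbf{t}_0)$ and $\mathbf{n}_0 = \operatorname{grad} \Phi(\mathbf{x}_0)$. They are exterior normals
to $\Delta$ and to $D$ respectively. Let $\mathbf{h}_0 = \mathbf{L}(\mathbf{k}_0)$, where $\mathbf{L} = D\mathbf{g}(\mathbf{t}_0)$. From the
chain rule, $\mathbf{k}_0 = \mathbf{L}^t(\mathbf{n}_0)$ where $\mathbf{L}^t$ is the transpose of $\mathbf{L}$. By formula (4-8)

\[ \mathbf{h}_0 \cdot \mathbf{n}_0 = [\mathbf{L} \circ \mathbf{L}^t(\mathbf{n}_0)] \cdot \mathbf{n}_0 = \mathbf{L}^t(\mathbf{n}_0) \cdot \mathbf{L}^t(\mathbf{n}_0), \]

\[ \mathbf{h}_0 \cdot \mathbf{n}_0 = |\mathbf{L}^t(\mathbf{n}_0)|^2 > 0. \]

Let $(\mathbf{k}_1, \ldots, \mathbf{k}_{n-1})$ be a positively oriented frame for the tangent space to
$N$ at $\mathbf{t}_0$, and let $\mathbf{h}_i = \mathbf{L}(\mathbf{k}_i)$, $i = 1, \ldots, n - 1$. Then $(\mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ is a
frame for the tangent space $T(\mathbf{x}_0)$ to $\operatorname{fr} D$ at $\mathbf{x}_0$. Since $(\mathbf{k}_0, \mathbf{k}_1, \ldots, \mathbf{k}_{n-1})$
is a positively oriented frame and $\mathbf{g}$ preserves the orientation of $E^n$,
$(\mathbf{h}_0, \mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ is a positively oriented frame.
Now $\mathbf{h}_0 = c\mathbf{n}_0 + \mathbf{h}$, where $c = (\mathbf{h}_0 \cdot \mathbf{n}_0)/(\mathbf{n}_0 \cdot \mathbf{n}_0)$ and $\mathbf{h} \in T(\mathbf{x}_0)$. From
this,

\[ \mathbf{h}_0 \wedge \mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_{n-1} = c\,\mathbf{n}_0 \wedge \mathbf{h}_1 \wedge \cdots \wedge \mathbf{h}_{n-1}. \]

Since $\mathbf{h}_0 \cdot \mathbf{n}_0 > 0$, $c > 0$. Therefore the frame $(\mathbf{n}_0, \mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ has positive
orientation, which implies that $(\mathbf{h}_1, \ldots, \mathbf{h}_{n-1})$ orients $S$ positively at $\mathbf{x}_0$. ∎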

If $J\mathbf{f}(\mathbf{x}) < 0$ for every $\mathbf{x} \in D_1$, then $\mathbf{f}^{-1}$ induces the negative orientation
(corresponding to the interior normal) on $S$.
Proof of divergence theorem. Let us show that each x9 € cl D has a neigh-
borhood Uo such that (7-lla) holds provided w has compact support con-
tained in cl Up. If xo € D, let Ug be a small enough neighborhood that
cel Up C D. Then fap+ w = 0, and by Lemma 1, fap+ dw = 0.
Let x_0 ∈ fr D, and let H be as in Lemma 2. Let us find a neighborhood
D_1 of x_0 and a regular transformation f with domain D_1 such that

\[ f(D \cap D_1) \subset H, \qquad f[(\operatorname{fr} D) \cap D_1] \subset \operatorname{fr} H, \]

and Jf(x) > 0 for every x ∈ D_1. For this purpose let us first suppose that
ν^n(x_0) > 0, where ν(x_0) is the unit exterior normal at x_0. Let Φ be as on p. 263.
Then grad Φ(x) = cν(x), where c > 0. Taking nth components, Φ_n(x_0) =
cν^n(x_0). For D_1 take a neighborhood of x_0 in which Φ_n(x) > 0, and let

\[ f^i(x) = x^i, \quad i = 1, \ldots, n-1, \qquad f^n(x) = \Phi(x). \]

Then Jf(x) = Φ_n(x) > 0 for every x ∈ D_1. If the condition ν^n(x_0) > 0 is
not satisfied, then for f we take f̃ ∘ L, where L is a rotation of E^n such that
L[ν(x_0)] is a vector whose last component is positive and f̃ is of the type just
described.
Let U_0 be a neighborhood of x_0 such that cl U_0 ⊂ D_1, and let ω have com-
pact support contained in U_0. Let g = f^{-1}. Since g is of class C^(2) and pre-
serves the positive orientation of E^n, by Proposition 31 and (c) of Proposition 32,

\[ \int_{D^+} d\omega = \int_{H^+} (d\omega)^{\#} = \int_{H^+} d\omega^{\#}, \tag{$*$} \]

\[ \int_{\partial D^+} \omega = \int_{\partial H^+} \omega^{\#}. \tag{$**$} \]

But by Lemma 2, the right-hand sides of (*) and (**) are equal.
Since cl D is a compact set, a finite number of such neighborhoods
U_1, ..., U_m cover cl D. Let ψ_k(x) and φ_k(x) be defined as in the proof of
Proposition 29, for x ∈ cl D. Since φ_k ω has compact support contained in
cl U_k,

\[ \int_{\partial D^+} \phi_k\,\omega = \int_{D^+} d(\phi_k\,\omega), \qquad k = 1, \ldots, m. \tag{$*$} \]

By the product rule,

\[ d(\phi_k\,\omega) = d\phi_k \wedge \omega + \phi_k\, d\omega. \]

Since Σ φ_k = 1, Σ dφ_k = 0. Summing from 1 to m in (*), we have

\[ \int_{\partial D^+} \Bigl(\sum_{k=1}^{m} \phi_k\Bigr)\,\omega = \int_{D^+} \Bigl(\sum_{k=1}^{m} \phi_k\Bigr)\, d\omega, \]

which is precisely (7-11a). ∎

The assumption that fr D is a manifold of class C^(2) can be considerably
weakened. Let us state without proof a somewhat more general version of the
divergence theorem. Let D be an open, bounded set. Assume that

\[ \operatorname{fr} D = A_1 \cup \cdots \cup A_p \cup B, \]

where: (a) A_k is a relatively open subset of fr D and cl A_k is a compact subset
of an (n − 1)-manifold M_k, k = 1, ..., p; and (b) B is a compact set con-
tained in a finite union of (n − 2)-manifolds, and (cl A_k) ∩ (cl A_l) ⊂ B
whenever k ≠ l. Moreover, assume that D is on one side of its boundary in a
neighborhood of each point of (fr D) − B. On each A_k we assign the
positive orientation, determined by the exterior normal. Then

\[ \sum_{k=1}^{p} \int_{A_k^+} \omega = \int_{D^+} d\omega, \]

provided ω is of class C^(1) on cl D. (See Fig. 7-11.)

Figure 7-11

Let us say that such a set D has a boundary which is piecewise of class C^(1).

Example 4. Let D be an n-simplex. Let A_0, ..., A_n be its (open) (n − 1)-dimensional
faces, let M_k be the hyperplane containing A_k, and let B be the union of the (n − 2)-
dimensional faces of D.

PROBLEMS
Unless otherwise indicated, assume that D is a regular domain.
1. Let n = 2. Show that:

(a)a) VD)VoD) ==—[—f ,yde.ude


(b) ∫_D (x^2 + y^2) dV_2 = (1/3) ∫_{∂D^+} (x^3 dy − y^3 dx).

2. Evaluate ∫_{∂D^+} y^2 dx ∧ dz, where D is the standard 3-simplex.

3. Let D be the disk x^2 + y^2 < 1 and ω = (x dy − y dx)/(x^2 + y^2). Then
∫_{∂D^+} ω = 2π while ∫_{D^+} dω = 0. Why does this not contradict Green’s theorem?

4. Let n = 4 and D = {x : (x^1)^2 + (x^2)^2 + (x^3)^2 < (x^4)^2, 0 < x^4 < 1}.
Evaluate:

(a) fl .@? +24) de! A de’ A dx’. 0b) foal?detA da? A aa.
aD aD
5. Suppose that D = {(x, y) : f(x) < y < g(x), a < x < b}
= {(x, y) : φ(y) < x < ψ(y), c < y < d}. Show directly from the fundamental
theorem of calculus and properties of line integrals that

Adding, we get the Green’s theorem for regular domains of this special type.
(See Fig. 7-12.)

Figure 7-12

6. Prove the divergence theorem directly from the fundamental theorem of calculus
when D is:
(a) The n-cube {x : 0 < x^i < a, i = 1, ..., n}.
(b) The standard n-simplex.
7. For each t in some interval (−a, a) let T_t be a regular flat transformation with
domain D_0. Assume that T_0(x) = x for every x ∈ D_0 and that T is of class C^(2)
as a function of (x, t) on D_0 × (−a, a). Let v(t) = V_n[T_t(D)] where cl D ⊂ D_0.
Prove that v′(0) = ∫_D div W_0 dV_n, where W_t = ∂T_t/∂t. [Hint: Show that the
integrand is (∂/∂t)JT_t(x) evaluated at t = 0.]
In Problems 8 and 9 let M = fr D.
8. Show that:
(a) ∫_M ν^i(x) dV_{n-1}(x) = 0.
(b) ∫_M x · ν(x) dV_{n-1}(x) = nV_n(D).
(c) ∫_M f_ν(x) dV_{n-1}(x) = ∫_D Lapl f(x) dV_n(x).

9. Let D be connected, f harmonic, and f_ν(x) = 0 for every x ∈ M. Show that
f(x) is constant on D.

10. Let D = {x : a < |x| < b}, where 0 < a < b.
(a) Show that if f(x) = ψ(|x|), then f_ν(x) = ψ′(|x|) when |x| = b and f_ν(x) =
−ψ′(|x|) when |x| = a.
(b) Let ψ(r) = −[(n − 2)β_{n-1}]^{-1} r^{2-n}, where n > 2 and β_{n-1} is the (n − 1)-
measure of the unit (n − 1)-sphere. Let f be as in (a). Show that f is harmonic.
(c) Let φ be harmonic on the n-ball B = {x : |x| < b}. Show that φ(0) =
(β_{n-1} b^{n-1})^{-1} ∫_{fr B} φ dV_{n-1}. [Hint: Apply the second Green’s formula with
D and f as above and let a → 0^+.]

7-6 STOKES’ FORMULA

The divergence theorem is a special case of a result which is nowadays
called Stokes’ formula. Let ω be an (r − 1)-form. Stokes’ formula equates
the integral of dω over a portion A of an oriented r-manifold M and the integral
of ω over the (suitably oriented) boundary of A.
Let us begin with the following particular case and afterward generalize.
Let B ⊂ E^r be a regular domain, and let A = g(B) where g is a regular trans-
formation of class C^(2) from some open set containing cl B into M. Let o be
the orientation induced on A from the positive orientation of E^r. The (r − 1)-
manifold K = g(fr B) is the boundary of A relative to M. Let ∂A^o denote
K with the orientation induced from the positive orientation of fr B.
Let ω be an (r − 1)-form of class C^(1) on cl A. Then

\[ \int_{A^o} d\omega = \int_{B^+} (d\omega)^{\#} = \int_{B^+} d\omega^{\#}, \qquad \int_{\partial A^o} \omega = \int_{\partial B^+} \omega^{\#}. \]

By the divergence theorem, the right-hand sides are equal. Therefore we have

Stokes’ formula

\[ \int_{\partial A^o} \omega = \int_{A^o} d\omega. \tag{7-17} \]

The case r = 2, n = 3. Then ω = P dx + Q dy + R dz is a 1-form
and dω is a 2-form. The 1-form *dω is called curl ω, and the vector n(x) = *o(x)
is a unit normal to A. Since dω(x) · o(x) = curl ω(x) · n(x), formula (7-17)
becomes

\[ \int_{\partial A^o} \omega = \int_{A} \operatorname{curl} \omega(x) \cdot n(x)\, dV_2(x). \tag{7-18} \]

The name Stokes’ formula was traditionally applied to (7-18), and not its
generalization (7-17).
Figure 7-13

Example. Let ∂B^+ consist of a single simple closed curve γ in E^2. Then ∂A^o consists
of a simple closed curve Γ in E^3.
The normal n(x) varies continuously on A. At a boundary point x of A, n(x)
can be visualized in the following way. Let x = g(s, t), where (s, t) ∈ fr B. Let
ν be the exterior normal and τ the positively oriented unit tangent vector to γ at
(s, t). The vector h = Dg(s, t)(τ) is a tangent vector to Γ at x. If h_0 = Dg(s, t)(ν),
then (h_0, h) is a frame for the tangent space to M at x and has the required orientation
o(x). Hence (n(x), h_0, h) is a positively oriented frame for E^3. (See Fig. 7-13.)

If P, Q, R are regarded as the components of a velocity field, then ∫_Γ ω
represents the circulation along the boundary Γ. Stokes’ formula expresses
the circulation as the integral over A of the normal component curl ω(x) · n(x)
of the curl.
In particular, let A lie in a plane Π, oriented by a unit vector n_0 normal to
Π. Then n(x) = n_0 for every x ∈ A. Let x_0 be a point of the domain of ω.
If A contains x_0 and A has small diameter, then the right-hand side of (7-18)
is approximately curl ω(x_0) · n_0 V_2(A). More precisely,

\[ \operatorname{curl} \omega(x_0) \cdot n_0 = \lim_{\operatorname{diam} A \to 0} \frac{1}{V_2(A)} \int_{\partial A^o} \omega. \]

This is proved using a lemma similar to the one for the proof of the corresponding
formula (7-12) for the divergence.
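
The limit formula for the normal component of the curl can be checked numerically. The following Python sketch is an illustration only and is not part of the original text: the 1-form ω = (−y + xz) dx + xy^2 dy + z^2 dz, the point x_0, and the helper names `omega`, `curl`, and `circulation` are all hypothetical choices. It takes A to be a disk of radius ρ in the plane through x_0 with normal n_0 = e_3 and compares the circulation divided by V_2(A) = πρ^2 with curl ω(x_0) · n_0.

```python
import math

def omega(p):
    """Coefficients (P, Q, R) of the illustrative 1-form at p = (x, y, z)."""
    x, y, z = p
    return (-y + x * z, x * y * y, z * z)

def curl(p):
    """Curl of (P, Q, R) above, computed by hand: (0, x, y^2 + 1)."""
    x, y, z = p
    return (0.0, x, y * y + 1.0)

def circulation(center, a, b, radius, steps=2000):
    """Midpoint-rule line integral of omega around the circle
    center + radius * (cos t * a + sin t * b), 0 <= t <= 2*pi."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        c, s = math.cos(t), math.sin(t)
        point = tuple(center[i] + radius * (c * a[i] + s * b[i]) for i in range(3))
        velocity = tuple(radius * (-s * a[i] + c * b[i]) for i in range(3))
        w = omega(point)
        total += sum(w[i] * velocity[i] for i in range(3)) * dt
    return total

x0 = (0.3, 0.5, 0.2)
a, b = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)   # orthonormal frame for the plane
n0 = (0.0, 0.0, 1.0)                      # n0 = a x b, so (n0, a, b) is positively oriented
exact = sum(curl(x0)[i] * n0[i] for i in range(3))
ratios = [circulation(x0, a, b, r) / (math.pi * r * r) for r in (0.5, 0.1, 0.02)]
```

For this particular form Green’s theorem gives the ratio exactly as 1 + y_0^2 + ρ^2/4, so the entries of `ratios` should approach `exact` as the radius shrinks.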
Some generalizations. Let M be an orientable manifold of class C^(2).
We proved Stokes’ formula above in case cl A is contained in some coordinate
patch. By using partitions of unity, this restriction can be removed.
Proposition 33. Let M be compact and o an orientation for M. Then

\[ \int_{M^o} d\omega = 0 \]

for every (r − 1)-form ω of class C^(1) on M.

Proof. Let {φ_1, ..., φ_m} be a partition of unity for M. Let g_k be a regular
transformation of class C^(2) from an open set Δ_k ⊂ E^r onto a coordinate patch
S_k containing the support of φ_k. Then

\[ \int_{M^o} d(\phi_k\,\omega) = \pm \int_{\Delta_k} [d(\phi_k\,\omega)]^{\#} = \pm \int_{\Delta_k} d(\phi_k\,\omega)^{\#} = 0 \]

by Lemma 1 of the last section. Since Σ φ_k = 1, Σ dφ_k = 0, we get, as in
the proof of the divergence theorem,

\[ \int_{M^o} d\omega = \sum_{k=1}^{m} \int_{M^o} d(\phi_k\,\omega) = 0. \]

∎

Since M has empty boundary relative to itself, one would expect to obtain
0 on the left-hand side of (7-17) when A = M. Proposition 33 states that
this is correct.

Now let M be any orientable r-manifold of class C^(2). Let us call a rela-
tively open set A ⊂ M a regular domain on M if:
(1) cl A is a compact subset of M;
(2) the boundary K of A relative to M is an (r − 1)-manifold of class C^(2);
(3) A is on one side of K.

By condition (3) we mean that if F is any coordinate system for S ⊂ M,
then F(A ∩ S) is on one side of F(K ∩ S) in a neighborhood of each point
of F(K ∩ S).
Let o be an orientation for M. Then o determines an orientation o′ on K
as follows. Let S be a coordinate patch and F a coordinate system for S. If
the orientation induced by F^{-1} from the positive orientation of E^r is o, then
for x ∈ K ∩ S, o′(x) is the orientation induced from the positive orientation
of F(K ∩ S). Otherwise, o′(x) is the orientation opposite to this one. From
part (c) of Proposition 32, Section 7-5, it can be shown that o′(x) is independent
of the particular coordinate system chosen (Problem 3).
Let K with the orientation o′ be denoted by ∂A^o.

Theorem 23. Let A be a regular domain on M, and let ω be an (r − 1)-form
of class C^(1) on cl A. Then

\[ \int_{\partial A^o} \omega = \int_{A^o} d\omega. \tag{7-17} \]

This theorem can be proved using the divergence theorem and a partition
of unity in much the same way as for Proposition 33. We shall not give the
details.
We have assumed that M is of class C^(2), but Theorem 23 is still true for
manifolds of class C^(1). Moreover, the relative boundary K may be piecewise
of class C^(1) in the sense explained at the end of Section 7-5. For instance, if
M is an r-plane and A an r-simplex contained in M, then the boundary of A
relative to M is piecewise of class C^(1).

PROBLEMS
1. Let ω = yz dx + x dy + dz. Let γ be the unit circle in the xy-plane, oriented in
the counterclockwise direction. Calculate ∫_γ ω and ∫_{A^o} dω and verify that they
are equal, where the orientation o is chosen so that ∂A^o = γ and:
(a) A is the disk x^2 + y^2 < 1 in the xy-plane.
(eee etre, Lee.) ee ye LP
2. Let ω = z exp (−y) dx + z dy + y dz. Evaluate ∫_{A^o} dω when A is:
(a) The ellipsoid x^2/a^2 + y^2/b^2 + z^2/c^2 = 1 oriented by the exterior normal.
(b) The square with vertices 0, e: + e2, V2 e3,e1 + e2 + 1/2 e3, oriented so
that 073(x) > 0.
(ec) The paraboloid y = x? + z? oriented so that 03!(x) > 0.
3. Show that the orientation o’ for K does not depend on the particular choice of
coordinate systems for M used in its definition.

4. Let M = fr D, where D is a regular domain in E^n. Show that ∫_M (*dω) · ν dV_{n-1} = 0


if w is an (n — 2)-form of class C™ on M.
5. Prove the following:

\[ d\omega(x_0) \cdot \alpha_0 = \lim_{\operatorname{diam} A \to 0} [V_r(A)]^{-1} \int_{\partial A^o} \omega, \]

where x_0 ∈ A and A lies in an r-plane Π oriented by α_0.

6. Let α be the r-vector of an r-simplex S_0 and β_0, β_1, ..., β_r the (r − 1)-vectors of
its oriented faces (Problem 12, Section 6-3). Show that

\[ d\omega(x_0) \cdot \alpha = \sum_{i=0}^{r} (-1)^i\, \omega(x_0) \cdot \beta_i. \]

[Hint: Consider simplexes S similar to S_0 and containing x_0. Apply Problem 5
with A = S.]

7-7 CLOSED AND EXACT DIFFERENTIAL FORMS

Any exact differential form ω = dζ is closed, provided ζ is of class C^(2).
This is a consequence of the formula in Section 6-5 d(dζ) = 0. Whether,
conversely, every closed form ω is exact depends on the topological nature of
the domain D of ω. In this section we shall give two sufficient conditions that
every closed r-form with domain D be exact. The first is that D be simply
connected and applies when r = 1. The second is that D be star-shaped and
applies for any degree r.

Homotopies. Let f and g be transformations of class C^(2) from a set
B ⊂ E^m into a set A ⊂ E^n. We are interested in whether it is possible to
smoothly interpolate in A between f and g. If this is possible then f and g
are called homotopic in A. To state this more precisely, let us consider the
subset [0, 1] × B of E^{m+1}.

Definition. If there is a transformation H of class C^(2) on [0, 1] × B such
that H(s, t) ∈ A for every (s, t) ∈ [0, 1] × B and H(0, t) = f(t), H(1, t) = g(t)
for every t ∈ B, then f and g are homotopic in A.

In the usual definition of homotopy in topology, H is required to be merely
continuous. What we call homotopy is then called a homotopy of class C^(2).

Example 1. Let A be convex. Then we may take

H(s, t) = sg(t) + (1 − s)f(t).

Therefore any two transformations f and g of class C^(2) with values in a convex set
A are homotopic in A. In particular, this is true when A = E^n.
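
The straight-line homotopy of Example 1 is easy to write down concretely. The sketch below is illustrative only: the function name `straight_line_homotopy` and the two sample closed curves (a circle f and an ellipse g in the convex set E^2) are hypothetical choices, not taken from the text.

```python
import math

def straight_line_homotopy(f, g):
    """Return H with H(s, t) = s*g(t) + (1 - s)*f(t), as in Example 1."""
    def H(s, t):
        ft, gt = f(t), g(t)
        return tuple(s * gi + (1.0 - s) * fi for fi, gi in zip(ft, gt))
    return H

# Two closed curves on [0, 2*pi] with values in the plane.
f = lambda t: (math.cos(t), math.sin(t))              # unit circle
g = lambda t: (2.0 * math.cos(t), 0.5 * math.sin(t))  # ellipse

H = straight_line_homotopy(f, g)
midway = H(0.5, math.pi / 3)   # a point of the intermediate curve s = 1/2
```

Since f and g both take values in the closed disk of radius 2, which is convex, every H(s, t) stays in that disk; and because f and g are closed curves, H(s, 0) = H(s, 2π) for each s.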

To define simple connectedness one may take B to be a circle. However,
instead of a circle it is more convenient to let B be an interval [a, b] with the
endpoints identified. Let f and g be transformations from [a, b] into A such
that f(a) = f(b) and g(a) = g(b). Then f and g are strictly homotopic in A if
the homotopy H in the definition above can be chosen so that H(s, a) = H(s, b)
for every s ∈ [0, 1].
If ∂H/∂t ≠ 0, then for each s the transformation H(s, ·) represents on
[a, b] a closed curve γ_s of class C^(1) in the sense of Section 3-2. Intuitively, one
may regard a homotopy as a smooth interpolation by the curves γ_s between the
curve γ_0 represented by f and the curve γ_1 represented by g. However, for
technical reasons it is disadvantageous to include the condition ∂H/∂t ≠ 0 in
the definition of homotopy.

Definition. If g is strictly homotopic in A to a constant transformation f,
then g is null homotopic in A.

If f(t) = x_0 for every t ∈ [a, b], then one should think intuitively that
γ_s shrinks to the point x_0 as s → 0^+. When A is an open subset of E^2 this is
possible roughly speaking provided γ_1 does not loop around any holes which
may be present in A. In Fig. 7-14, A has two holes and the curves shown in the
figure are not null homotopic in A.

Figure 7-14

Let D be an open set, and ω a 1-form with domain D. Let us set

\[ (g, \omega) = \int_a^b \omega[g(t)] \cdot g'(t)\, dt. \]

In case g′(t) ≠ 0, (g, ω) is just another notation for the line integral of ω along
the curve represented by g.

Proposition 34. Let ω be closed. If f and g are strictly homotopic in D, then
(f, ω) = (g, ω).

Proof. Let ω^# be the 1-form on the rectangle R = [0, 1] × [a, b] induced
by the transformation H. Since dω = 0, dω^# = (dω)^# = 0. By Green’s
theorem

\[ \int_{\partial R^+} \omega^{\#} = \int_{R^+} d\omega^{\#} = 0. \]

The integral over ∂R^+ is the sum of the integrals over the four segments
λ_1, ..., λ_4 indicated in Fig. 7-15. Now

\[ \omega^{\#} = \sum_{i=1}^{n} (\omega_i \circ H)\, dH^i = \sum_{i=1}^{n} (\omega_i \circ H) \Bigl( \frac{\partial H^i}{\partial s}\, ds + \frac{\partial H^i}{\partial t}\, dt \Bigr), \]

\[ \int_{\lambda_2} \omega^{\#} = \int_a^b \sum_{i=1}^{n} \omega_i\, \frac{\partial H^i}{\partial t}\, dt, \]

ω_i being evaluated at H(1, t) and ∂H^i/∂t at (1, t). Since H(1, t) = g(t), the
right-hand side is just (g, ω). Similarly, since H(0, t) = f(t),

\[ \int_{\lambda_4} \omega^{\#} = -(f, \omega). \]

Since H(s, a) = H(s, b),

\[ \int_{\lambda_1} \omega^{\#} = -\int_{\lambda_3} \omega^{\#}. \]

∎

Figure 7-15

Example 2. Let n = 2 and let D be the plane with (0, 0) removed. Let ω =
(x dy − y dx)/(x^2 + y^2). Formally, ω = dΘ, where Θ(x, y) is the angle from the
positive x-axis to (x, y), 0 < Θ(x, y) < 2π. However, Θ is defined only in the plane
with a slit removed even though ω is defined and of class C^(∞) in D. For each integer
m ≠ 0 let g_m(t) = (cos mt)e_1 + (sin mt)e_2, 0 ≤ t ≤ 2π. Then (g_m, ω) = 2mπ,
which shows that g_m and g_l are not strictly homotopic in D when m ≠ l. The trans-
formation g_m represents the unit circle traversed |m| times, counterclockwise if m > 0
and clockwise if m < 0.
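
The value (g_m, ω) = 2mπ can be confirmed by direct quadrature of the defining integral. The Python sketch below is an illustration under stated assumptions: the helper names `pairing`, `omega`, and `circle` are hypothetical, and the midpoint rule stands in for the exact integral.

```python
import math

def pairing(g, dg, form, a, b, steps=4000):
    """Midpoint-rule approximation of (g, omega) = int_a^b omega[g(t)] . g'(t) dt."""
    dt = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * dt
        w = form(g(t))
        v = dg(t)
        total += sum(wi * vi for wi, vi in zip(w, v)) * dt
    return total

def omega(p):
    """Coefficients of (x dy - y dx)/(x^2 + y^2) with respect to (dx, dy)."""
    x, y = p
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circle(m):
    """g_m(t) = (cos mt, sin mt) together with its derivative."""
    g = lambda t: (math.cos(m * t), math.sin(m * t))
    dg = lambda t: (-m * math.sin(m * t), m * math.cos(m * t))
    return g, dg

values = {m: pairing(*circle(m), omega, 0.0, 2.0 * math.pi) for m in (-2, -1, 1, 3)}
```

Each entry `values[m]` should agree with 2mπ up to rounding, since the integrand ω[g_m(t)] · g_m′(t) is identically m.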

Proposition 34 has the following corollaries.

Corollary 1. If g is null homotopic in D, then (g, ω) = 0.

Proof. If f is constant, then (f, ω) = 0. ∎

Definition. An open set D is simply connected if every transformation g
of class C^(2) from an interval [a, b] into D, satisfying g(a) = g(b), is null
homotopic in D.

Roughly speaking, D is simply connected if every closed curve in D can
be shrunk in D to a point. When D ⊂ E^2 this amounts to saying that D “has
no holes.” Removal of a single point, as in Example 2, must be counted as
introducing a hole.
If D = {x ∈ E^3 : |x| > 1}, then D is simply connected, yet D has a
“hole.”

Corollary 2. If D is a simply connected open subset of E^n, then every closed
1-form with domain D is exact.

Proof. By Theorem 7, Section 3-3, it suffices to show that ∫_γ ω = 0 for
every piecewise smooth closed curve γ lying in D. Let g be a representation
of such a curve γ on [0, 1], such that g is piecewise of class C^(1). There is a
sequence g_1, g_2, ... of transformations of class C^(2) on [0, 1] such that:
(1) g_m(0) = g_m(1), m = 1, 2, ...; (2) g_m(t) → g(t) for every t ∈ [0, 1],
and g_m′(t) → g′(t) except at the (finitely many) points of discontinuity of g′,
as m → ∞; (3) |g_m(t)| and |g_m′(t)| are bounded by some number C. Such a
sequence can be found by a standard smoothing technique (Problem 5). By
Lebesgue’s dominated convergence theorem, (g_m, ω) → (g, ω) as m → ∞.
By Corollary 1, (g_m, ω) = 0 for each m = 1, 2, ... Therefore (g, ω) =
∫_γ ω = 0. ∎
Let us turn to the question of finding a condition on D which insures that
any closed form of arbitrary degree r is exact. For this purpose, let B be an
open subset of E^m. Let us introduce an operation which changes any r-form
η of class C^(1) on [0, 1] × B into an (r − 1)-form of class C^(1) on B. The latter
form is denoted by ∫_0^1 η. If r = 1, then η = f ds + η^1, where η^1 involves the
differentials dt^1, ..., dt^m. In this case ∫_0^1 η = ∫_0^1 f(s, ·) ds, which is of class C^(1)
on B. Next, if η = ds ∧ θ = Σ_{[μ]} θ_μ ds ∧ dt^{μ_1} ∧ ··· ∧ dt^{μ_{r-1}}, then we set

\[ \int_0^1 \eta = \sum_{[\mu]} \Bigl( \int_0^1 \theta_\mu\, ds \Bigr)\, dt^{\mu_1} \wedge \cdots \wedge dt^{\mu_{r-1}}. \tag{7-19} \]

Finally, any r-form η on [0, 1] × B can be written η = ds ∧ θ + η^1, where
η^1 involves only the differentials dt^1, ..., dt^m:

\[ \eta^1 = \sum_{[\lambda]} \eta_\lambda\, dt^{\lambda_1} \wedge \cdots \wedge dt^{\lambda_r}. \]

We set ∫_0^1 η = ∫_0^1 ds ∧ θ. Using the rules for exterior differentiation, we
find that

\[ d\eta = d(ds \wedge \theta) + d\eta^1 = -ds \wedge d'\theta + ds \wedge \frac{\partial \eta^1}{\partial s} + d'\eta^1, \]

where d′ denotes the differential with respect to t of a form on [0, 1] × B and
the components of the r-form ∂η^1/∂s are the partial derivatives ∂η_λ/∂s.
Therefore

\[ \int_0^1 d\eta = -\int_0^1 ds \wedge d'\theta + \int_0^1 ds \wedge \frac{\partial \eta^1}{\partial s} = -\int_0^1 ds \wedge d'\theta + \eta^1(1) - \eta^1(0), \tag{$*$} \]

where η^1(s) is the r-form on B with coefficients η_λ(s, ·). Differentiating under
the integral sign,

\[ \frac{\partial}{\partial t^j} \int_0^1 f\, ds = \int_0^1 \frac{\partial f}{\partial t^j}\, ds, \]

provided f is of class C^(1). Hence

\[ d \int_0^1 f\, ds = \sum_{j=1}^{m} \Bigl( \int_0^1 \frac{\partial f}{\partial t^j}\, ds \Bigr)\, dt^j = \int_0^1 ds \wedge d'f. \]

Applying this in (7-19), if η is of class C^(1) we get

\[ d \int_0^1 \eta = \sum_{[\mu]} d\Bigl( \int_0^1 \theta_\mu\, ds \Bigr) \wedge dt^{\mu_1} \wedge \cdots \wedge dt^{\mu_{r-1}} = \int_0^1 ds \wedge \sum_{[\mu]} d'\theta_\mu \wedge dt^{\mu_1} \wedge \cdots \wedge dt^{\mu_{r-1}} = \int_0^1 ds \wedge d'\theta. \tag{$**$} \]

From (*) and (**) we get

\[ \int_0^1 d\eta + d \int_0^1 \eta = \eta^1(1) - \eta^1(0). \tag{7-20} \]

Now let ω be an r-form of class C^(1) on A. Let H be a homotopy between
transformations f and g, and let ω_f^#, ω_g^#, ω_H^# denote the r-forms induced
respectively by f, g, and H. Let η = ω_H^#. Then η^1(1) = ω_g^# and η^1(0) = ω_f^#.
Therefore

\[ \int_0^1 d\omega_H^{\#} + d \int_0^1 \omega_H^{\#} = \omega_g^{\#} - \omega_f^{\#}. \tag{7-21} \]

With this formula we can readily deduce a result about closed forms which is
called Poincaré’s lemma.

Definition. A set A is star-shaped if there is a point x_0 ∈ A such that for
every x ∈ A the line segment joining x_0 and x is contained in A (Fig. 7-16).

Figure 7-16

Poincaré’s lemma. Let D be a star-shaped open set and let 1 ≤ r ≤ n.
Then every closed r-form with domain D is exact.

Proof. Let x_0 be a point with respect to which D is star-shaped. Let
f(x) = x_0, g(x) = x, H(s, x) = x_0 + s(x − x_0), B = D. [This homotopy
merely shrinks D radially to the point x_0.] Then ω_g^# = ω; and since r ≥ 1
and df^i = 0, ω_f^# = 0. Since ω is closed, dω_H^# = (dω)_H^# = 0. Let ζ = ∫_0^1 ω_H^#.
Then by (7-21), dζ = ω. ∎
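
For r = 1 the form ζ = ∫_0^1 ω_H^# built in this proof is the function ζ(x) = ∫_0^1 ω[x_0 + s(x − x_0)] · (x − x_0) ds, because the ds-coefficient of ω_H^# is Σ ω_i(H)(x^i − x_0^i). The Python sketch below checks dζ = ω numerically. It is an illustration only: the closed 1-form ω = (2xy + 1) dx + (x^2 + 3y^2) dy on D = E^2, the base point x_0 = 0, and the helper names `zeta` and `gradient` are assumptions made for the example.

```python
def omega(p):
    """Coefficients of the closed 1-form (2xy + 1) dx + (x^2 + 3y^2) dy."""
    x, y = p
    return (2.0 * x * y + 1.0, x * x + 3.0 * y * y)

def zeta(form, x, x0=(0.0, 0.0), steps=200):
    """Composite Simpson approximation (steps must be even) of
    int_0^1 form(x0 + s*(x - x0)) . (x - x0) ds."""
    d = tuple(xi - ci for xi, ci in zip(x, x0))

    def integrand(s):
        p = tuple(ci + s * di for ci, di in zip(x0, d))
        return sum(wi * di for wi, di in zip(form(p), d))

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

def gradient(func, x, h=1e-5):
    """Central-difference approximation of the partial derivatives of func at x."""
    grads = []
    for i in range(len(x)):
        up = tuple(xj + (h if j == i else 0.0) for j, xj in enumerate(x))
        down = tuple(xj - (h if j == i else 0.0) for j, xj in enumerate(x))
        grads.append((func(up) - func(down)) / (2.0 * h))
    return tuple(grads)

x = (1.5, -0.7)
d_zeta = gradient(lambda p: zeta(omega, p), x)   # should be close to omega(x)
```

For this example the integral can also be done by hand, giving ζ(x, y) = x^2 y + x + y^3, whose differential is ω.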

*Note: Poincaré’s lemma gives only a sufficient condition on D that every
closed form be exact. A necessary and sufficient condition can be obtained
from DeRham’s theorem ([21], Chap. IV or [17], Chap. IV).

Let us state without proof the following version of the theorem. Let
Z^r(D) denote the set of all closed r-forms of class C^(∞) on D. If ω and ζ are
closed, then ω + ζ is closed and cω is closed for any scalar c. Thus Z^r(D) is
a vector space over E^1. Similarly, let ℰ^r(D) denote the vector space consisting
of all exact r-forms of the type ω = dζ where ζ is of class C^(∞) on D. Then
ℰ^r(D) ⊂ Z^r(D). According to DeRham’s theorem, the quotient vector space
ℋ^r(D) = Z^r(D)/ℰ^r(D) is isomorphic to the r-dimensional cohomology group
of D with real coefficients. (The homology and cohomology groups of a space
are defined in algebraic topology. They contain a great deal of topological
information about the space.) In particular, every closed r-form is exact if
and only if ℋ^r(D) = 0.

PROBLEMS
1. Let D be the solid torus obtained by rotating the circular disk (y − a)^2 + z^2 < b^2,
0 < b < a, about the z-axis. Let γ be the circular path traversed by the center
of the disk. Show that ∫_γ (x dy − y dx)/(x^2 + y^2) ≠ 0. Hence by Corollary 1,
γ is not null homotopic in D.
2. Let S be the sphere x^2 + y^2 + z^2 = a^2, oriented by the unit exterior normal.
Let

ω = ρ^{-3} (x dy ∧ dz + y dz ∧ dx + z dx ∧ dy), ρ^2 = x^2 + y^2 + z^2,

the domain of ω being E^3 − {0}. Show that:
(a) ω is closed.
(b) ∫_{S^o} ω = 4π. [Hint: Find *ω(x) · ν(x), where ν(x) is the exterior normal.]
(c) E^3 − {0} is simply connected.
3. Let D be star-shaped and let D̃ = g(D), where g is a regular flat transformation
of class C^(2). Show that every closed form with domain D̃ is exact.
4. The winding number of a closed curve γ in E^2 about a point (x_0, y_0) not in the
trace of γ is

\[ w(x_0, y_0) = \frac{1}{2\pi} \int_\gamma \frac{(x - x_0)\, dy - (y - y_0)\, dx}{(x - x_0)^2 + (y - y_0)^2}. \]

Let γ be the positively oriented boundary of a regular domain D.

(a) Show that w(x_0, y_0) = 1 if (x_0, y_0) ∈ D. [Hint: Apply Green’s theorem to

D_ε = {(x, y) ∈ D : (x − x_0)^2 + (y − y_0)^2 > ε^2},

where ε < dist [(x_0, y_0), fr D]. Note that m = 2 in formula (7-12).]

(b) Show that w(x_0, y_0) = 0 if (x_0, y_0) ∉ cl D.



5. For m = 1, 2, ... let h_m be a function of class C^(2) on E^1 such that h_m ≥ 0,
∫_{−∞}^{∞} h_m dx = 1, h_m(x) = 0 whenever |x| ≥ 1/m. [For instance, we may take
h_m(x) = m h(mx), where h is as on p. 253.] Let ψ be a piecewise continuous function
on E^1 which is periodic of period 1. Let ψ_m(x) = ∫_{−∞}^{∞} ψ(y) h_m(x − y) dy =
∫_{−1/m}^{1/m} ψ(x + z) h_m(z) dz. Show that:
(a) If |ψ(x)| ≤ C for every x, then |ψ_m(x)| ≤ C for every x and m = 1, 2, ...
(b) ψ_m is of class C^(2) and of period 1, m = 1, 2, ...
(c) If ∫_0^1 ψ dx = 0, then ∫_0^1 ψ_m dx = 0, m = 1, 2, ...
(d) At each point x_0 of continuity of ψ, ψ_m(x_0) → ψ(x_0) as m → ∞. [Hint:
ψ_m(x_0) − ψ(x_0) = ∫_{−1/m}^{1/m} [ψ(x_0 + z) − ψ(x_0)] h_m(z) dz.]
[Note: In the proof of Corollary 2, let ψ^i be a periodic extension of (g^i)′, and let
g_m^i(t) = g^i(0) + ∫_0^t ψ_m^i(x) dx, i = 1, ..., n.]
Appendix

A-1 THE REAL NUMBER SYSTEM

Let us begin with a list of axioms which describes the real number system E^1.

Axiom I. (a) Any two real numbers have a sum x + y and a product xy,
which are also real numbers. Moreover,
Commutative law    x + y = y + x,    xy = yx,
Associative law    x + (y + z) = (x + y) + z,    x(yz) = (xy)z,
Distributive law   x(y + z) = xy + xz
for every x, y, and z.
(b) There are two (distinct) real numbers 0 and 1 which are identity
elements respectively under addition and multiplication:

x + 0 = x,    x1 = x
for every x.
(c) Every real number x has an inverse −x with respect to addition, and
if x ≠ 0, an inverse x^{-1} with respect to multiplication:

x + (−x) = 0,    xx^{-1} = 1.

Axiom II. There is a relation < between real numbers such that:

(a) For every pair of numbers x and y, exactly one of the following alter-
natives holds: x < y, x = y, y < x.
(b) w < x and x < y imply w < y (transitive law).
(c) x < y implies x + z < y + z for every z.
(d) x < y implies xz < yz whenever 0 < z.
From Axioms I and II follow all of the ordinary laws of arithmetic. In
algebra any set with two operations (usually called “addition” and “multi-
plication”) having the properties listed in Axiom I is called a field. A field is
called ordered if there is in it a relation < satisfying Axiom II.

The real numbers form an ordered field. However, this is by no means the
only ordered field. For example, the rational numbers also form an ordered
field. We recall that x rational means that x = p/q where p and q are integers
and q ≠ 0. Yet another axiom is needed to characterize the real number
system. This axiom can be introduced in several ways. Perhaps the simplest
of these is in terms of least upper bounds.
Let S be a nonempty set of real numbers. If there is a number c such that
x ≤ c for every x ∈ S, then c is called an upper bound for S. If c is an upper
bound for S and b > c, then b is also an upper bound for S.
Axiom IIIa. Any set S of real numbers which has an upper bound has a
least upper bound.
The least upper bound for S will be denoted by sup S. If S has no upper
bound, then we set sup S = +∞.
A number d is a lower bound for S if d ≤ x for every x ∈ S. If S has a
lower bound, then (Problem 2) S has a greatest lower bound. It is denoted
by inf S. If S has no lower bound, then we set inf S = −∞.

Example 1. Let S = {1, 2, 3, ...}, the set of positive integers. Then sup S = +∞
and inf S = 1.

Example 2. Let a and b be real numbers with a < b. The sets

[a, b] = {x : a ≤ x ≤ b},    (a, b) = {x : a < x < b},
[a, b) = {x : a ≤ x < b},    (a, b] = {x : a < x ≤ b}

are called finite intervals with endpoints a and b. The first of these intervals is called
closed, the second open, the last two half-open. In each instance b is the least upper
bound and a is the greatest lower bound.
In the same way the semi-infinite intervals

[a, ∞) = {x : x ≥ a},    (a, ∞) = {x : x > a}

are called respectively closed and open, and have a as greatest lower bound. The
corresponding intervals (−∞, b], (−∞, b) have b as least upper bound.

Let S be a set which has an upper bound. Example 2 shows that the
number sup S need not belong to S. If sup S does happen to be an element of S,
then it is the largest element of S and we write “max S” instead of “sup S.”
Similarly, if S is bounded below and inf S is an element of S, then we write for it
“min S.”
Example 3. Let S = {x : x^2 < 2 and x is a rational number}. Then √2 = sup S
and −√2 = inf S. Since √2 is not a rational number, this example shows that the
least upper bound axiom would no longer hold if we replaced the real number system
by the rational number system.
Example 4. Let S = {sin x : x ∈ [−π, π]}. Then −1 = min S, 1 = max S.
For every ε > 0, x > 0 there is a positive integer m such that x < mε.
This is called the archimedean property of the real number system. To prove it,
suppose to the contrary that for some pair ε, x of positive numbers, mε ≤ x
for every m = 1, 2, ... Then x is an upper bound for the set S = {ε, 2ε, 3ε, ...}.
Let c = sup S. Then (m + 1)ε ≤ c and therefore mε ≤ c − ε, for each
m = 1, 2, ... Hence c − ε is an upper bound for S smaller than sup S, a
contradiction. This proves the archimedean property.
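
For rational ε and x the integer m promised by the archimedean property can be exhibited explicitly. The Python sketch below is only an illustration with exact rational arithmetic; the function name `archimedean_witness` is hypothetical, and the use of `math.floor` on a quotient of rationals is a computational shortcut, not the least-upper-bound argument given above.

```python
from fractions import Fraction
import math

def archimedean_witness(x, eps):
    """Return the smallest positive integer m with x < m * eps,
    for positive rational x and eps."""
    x, eps = Fraction(x), Fraction(eps)
    if x <= 0 or eps <= 0:
        raise ValueError("x and eps must be positive")
    # floor(x / eps) * eps <= x < (floor(x / eps) + 1) * eps
    return math.floor(x / eps) + 1

m = archimedean_witness(Fraction(7, 2), Fraction(1, 3))
```

Here x/ε = 21/2, so m is one more than the integer part of 21/2; the strict inequality x < mε holds while (m − 1)ε ≤ x.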
We shall not prove that there actually is a system satisfying Axioms I,
II, and IIIa. There are two well-known methods of constructing the real
number system, starting from the rational numbers. One of them is the method
of Dedekind cuts and the other is Cantor’s method of Cauchy sequences.
Axioms I, II, and IIIa characterize the real numbers; in other words,
any two systems satisfying these three axioms are essentially the same. To
put this more precisely in algebraic language, any two ordered fields satisfying
Axiom IIIa are isomorphic.
For proofs of these facts, refer to [2], Chap. III.

PROBLEMS
1. Find the least upper bound and greatest lower bound of each of the following sets:
(a) {x : x^2 − 3x + 2 < 0}.
(b) {2:a3.-+ 2? — 27 < 2}.
(c) {sin x + cos x : x ∈ [0, π]}.
(d) {x exp x : x < 0}. [Note: exp denotes the exponential function, exp x = e^x,
where e is the base for natural logarithms.]
(e) ke z, 3, Ts O50 ap

2. Let T = {x : −x ∈ S}. Show that −sup T = inf S.

3. Let x and y be real numbers with x < y. Show that there is a rational number z
such that x < z < y. [Hint: By the archimedean property there is a positive
integer q such that q^{-1} < y − x. Let z = p/q, where p is the smallest positive
integer such that qx < p.]

A-2 AXIOMS FOR A VECTOR SPACE


A vector space over the real number field is a nonempty set V together
with two operations called “addition” and “scalar multiplication.” The sum
u + v of two elements u, v ∈ V is also an element of V and the scalar multiple
cu of u ∈ V by the real number c is an element of V. These operations are
required to satisfy the following axioms:
(1) Addition is associative and commutative.
(2) There is a zero element θ such that u + θ = u for every u ∈ V.
(3) The distributive laws hold:

(c + d)u = cu + du,    c(u + v) = cu + cv

for every real c, d and u, v ∈ V.
(4) (cd)u = c(du) for every real c, d, and u ∈ V.
(5) 1u = u for every u ∈ V.

It is easy to show that E^n satisfies these five axioms. However, a multi-
tude of other important vector spaces besides E^n occur in mathematics.
A subset B of a vector space V is called a linearly dependent set if there
exist distinct elements u_1, ..., u_m ∈ B and real numbers c^1, ..., c^m, not all 0,
such that

c^1 u_1 + ··· + c^m u_m = θ.

If B is not linearly dependent, then B is a linearly independent set. V is a finite
dimensional vector space if some finite subset B of V spans V, namely, if
every element u ∈ V is a linear combination u = c^1 u_1 + ··· + c^m u_m where
u_1, ..., u_m ∈ B.
A basis for V is a linearly independent set which spans V. If V is finite
dimensional, then every basis B has the same number n of elements (see [12],
p. 43). The number n is the dimension of V. If n = 0, then V has the single
element θ. If n > 0 and B = {u_1, ..., u_n} is a basis for V, then every u ∈ V
can be uniquely written as a linear combination

u = c^1 u_1 + ··· + c^n u_n.

Let V and W be vector spaces. Let L be a function with domain V and
values in W. Then L is linear if
(a) L(u + v) = L(u) + L(v) for every u, v ∈ V; and
(b) L(cu) = cL(u) for every u ∈ V and real c.

Let L and M be linear. The sum L + M is given by

(L + M)(u) = L(u) + M(u)

for every u ∈ V. The function L + M has properties (a) and (b), and thus
is linear. If c is a real number, then cL is the linear function given by (cL)(u) =
cL(u) for every u ∈ V.
Let ℒ(V, W) denote the set of all linear functions with domain V and
values in W, together with these operations of sum of functions and multi-
plication of functions by scalars. Then ℒ(V, W) satisfies Axioms (1)-(5) for
a vector space. The zero element of ℒ(V, W) is the function whose value at
every u ∈ V is the zero element of W.

The dual space of V. Let us now suppose that W = E^1 and set
V* = ℒ(V, E^1). The vector space V* is called the dual space of V. Let us
show that if V has positive, finite dimension n, then V* also has dimension n.
Let B = {u_1, ..., u_n} be a basis for V. Let L^1, ..., L^n be the real-valued
functions such that for each i = 1, ..., n and u = c^1 u_1 + ··· + c^n u_n,

L^i(c^1 u_1 + ··· + c^n u_n) = c^i.

These functions L^i are linear, and therefore belong to V*. They are specified
by their values at the basis elements:

L^i(u_j) = δ^i_j,    i, j = 1, ..., n,    (*)

where δ^i_j = 1 if i = j and δ^i_j = 0 if i ≠ j.
Let us show that B* = {L^1, ..., L^n} is a basis for V*. Suppose that
b_1 L^1 + ··· + b_n L^n = θ, where θ is the zero function. Then, for every u ∈ V,

b_1 L^1(u) + ··· + b_n L^n(u) = θ(u) = 0.

Taking u = u_j and applying formula (*), b_j = 0 for each j = 1, ..., n. Thus
B* is a linearly independent set. To show that B* spans V*, given L ∈ V*
let a_i = L(u_i). If u = c^1 u_1 + ··· + c^n u_n, then since L is linear L(Σ c^i u_i) =
Σ c^i L(u_i). Therefore

\[ L(u) = \sum_{i=1}^{n} a_i c^i = \sum_{i=1}^{n} a_i L^i(u). \]

Since this is true for every u ∈ V,

\[ L = \sum_{i=1}^{n} a_i L^i, \]

which shows that B* spans V*.

The basis B* is called dual to the basis B.
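
When V = E^n and each functional is written as a row of coefficients, the dual basis can be computed by inverting the matrix whose columns are u_1, ..., u_n, since row i of the inverse then satisfies L^i(u_j) = δ^i_j. The Python sketch below illustrates this with exact rational arithmetic; the function names `dual_basis` and `apply` and the sample basis are hypothetical, and Gauss-Jordan elimination is simply one convenient way to obtain the inverse.

```python
from fractions import Fraction

def dual_basis(basis):
    """Given basis vectors u_1..u_n of E^n (lists of numbers), return the
    coefficient rows of L^1..L^n, where L^i(u_j) = 1 if i == j and 0 otherwise."""
    n = len(basis)
    # Augmented matrix [U | I], where column j of U is the vector u_j.
    aug = [[Fraction(basis[j][i]) for j in range(n)] +
           [Fraction(int(i == k)) for k in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]   # rows of the inverse of U

def apply(functional, vector):
    """Value of the linear functional with the given coefficient row at vector."""
    return sum(a * x for a, x in zip(functional, vector))

basis = [[1, 1], [1, 2]]              # u_1 = (1, 1), u_2 = (1, 2)
duals = dual_basis(basis)
coords = [apply(L, [3, 5]) for L in duals]   # components c^1, c^2 of u = (3, 5)
```

The list `coords` recovers the components c^i = L^i(u) of u with respect to the basis, in agreement with the defining property of the L^i.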
A function φ from a vector space V into a vector space W is an isomorphism
if φ is linear and φ(u) ≠ φ(v) whenever u ≠ v. If there is such an isomorphism
from V onto W, then V and W are isomorphic vector spaces. All n-dimensional
vector spaces are isomorphic. If {u_1, ..., u_n} is a basis for V and {w_1, ..., w_n}
a basis for W, then the linear function φ such that φ(u_i) = w_i for i = 1, ..., n
is an isomorphism from V onto W.
In particular, any finite-dimensional vector space V is isomorphic with
its dual V*. However, this isomorphism is unnatural from several points of
view. In this book we maintain the distinction between V and V*.
A more natural isomorphism is the following one from a vector space V
into the dual V** of V*. For each u ∈ V let φ(u) = l_u, where l_u ∈ V** is
the real-valued linear function such that l_u(L) = L(u) for every L ∈ V*.
This isomorphism is onto V** if V is finite dimensional.

PROBLEMS

1. Let U be the set of all polynomials p(x) = a_0 x^m + a_1 x^{m−1} + ... + a_{m−1} x + a_m
of degree ≤ m, with the usual notions of addition and scalar multiplication. Here
x^k denotes the kth power of x. Show that U is a vector space of dimension m + 1,
and find a basis for U.
2. Show that if U and W are vector spaces of finite positive dimensions n and r, then
ℒ(U, W) has dimension nr.

A-3 BASIC TOPOLOGICAL NOTIONS IN E”


Let E^n be euclidean n-dimensional space, defined in Section 1-1. A
neighborhood of a point x_0 ∈ E^n is a spherical ball U = {x : |x − x_0| < δ}, where
δ > 0 is called the radius of U. We also call U the δ-neighborhood of x_0. Let
A be a subset of E^n. A point x is called interior to A if there is some
neighborhood U of x such that U ⊂ A. If some neighborhood of x is contained in
the complement A^c = E^n − A, then x is exterior to A. If every neighborhood
of x contains at least one point of A and at least one point of A^c, then x is a
frontier point of A. An interior point of A necessarily is a point of A, and an
exterior point must be a point of A^c. However, a frontier point may belong
either to A or to A^c.

Example 1. Let U be the δ-neighborhood of x_0 (Fig. A-1). Let us show that
every point of U is interior to U. Given x ∈ U, let r = δ − |x − x_0| and let V
be the r-neighborhood of x. If y ∈ V, then y − x_0 = (y − x) + (x − x_0) and by
the triangle inequality

|y − x_0| ≤ |y − x| + |x − x_0|,
|y − x_0| < r + |x − x_0| = δ.

Figure A-1

Hence y ∈ U. This shows that V ⊂ U. Similarly, every point x such that |x − x_0| > δ
is exterior to U. If |x − x_0| = δ, x is a frontier point.

Example 2. Let n = 1. Then neighborhoods are open intervals (x_0 − δ, x_0 + δ).
Let A be the set of all rational numbers. Any open interval contains both rational
and irrational numbers. Hence every point of E^1 is a frontier point of A.

Definition. The interior of a set A is the set of all points interior to A. It
is denoted by int A. The set of all frontier points of A is the boundary
(or frontier) of A, and is denoted by fr A. The set A ∪ fr A is the closure
of A and is denoted by cl A.

In Example 1 int U = U and fr U is the (n − 1)-dimensional sphere of
radius δ. The set cl U = {x : |x − x_0| ≤ δ} is called the closed spherical
n-ball with center x_0 and radius δ. In Example 2 int A is the empty set, and
fr A = cl A = E^1.

Example 3. Let A = E^n. Then int E^n = cl E^n = E^n and fr E^n is the empty set.

The closure of a set A consists of those points not exterior to A. Thus

(cl A)^c = int (A^c).

It is always true that int A ⊂ A. If these two sets are the same, then A is
called an open set.

Definition. A set A is open if every point of A is interior to A.

From the above examples any neighborhood is an open set and E^n is an
open set. The empty set furnishes another example of an open set.

Proposition A-1a. If A and B are open sets, then A ∪ B and A ∩ B are
open.

Proof. Let x ∈ A ∪ B. By the definition of the union of two sets, either
x ∈ A or x ∈ B. If x ∈ A, then there is a neighborhood U of x such that U ⊂ A.
Since A ⊂ A ∪ B, U ⊂ A ∪ B. Similarly, if x ∈ B there is a neighborhood of
x contained in A ∪ B. This proves that A ∪ B is open.
If x ∈ A ∩ B, then x has neighborhoods U_1, U_2 such that U_1 ⊂ A and
U_2 ⊂ B. Let U_3 = U_1 ∩ U_2. Then U_3 is a neighborhood of x and U_3 ⊂ A ∩ B.
Therefore A ∩ B is open. ∎

Similarly, if A_1, ..., A_m are open sets, then their union A_1 ∪ ... ∪ A_m
and their intersection A_1 ∩ ... ∩ A_m are open. As far as unions are
concerned, the same is still true if the number of open sets is infinite. In order to
make this last statement precise let us introduce some set-theoretic notation.
By indexed collection of sets let us mean a function with domain some nonempty
set J (called an index set) whose values are subsets of some set S. In the present
instance S = E^n. Let A_μ denote the value of the function at μ ∈ J. Moreover,
let

∪_{μ∈J} A_μ = {p ∈ S : p ∈ A_μ for some μ ∈ J},

∩_{μ∈J} A_μ = {p ∈ S : p ∈ A_μ for every μ ∈ J}.

These sets are, respectively, the union and intersection of the indexed collection.
If J is a finite set, then the indexed collection is called finite. If J = {1, 2, ...},
then the indexed collection is an infinite sequence of sets and is written
A_1, A_2, ... or [A_m], m = 1, 2, ... In that case the union is written
A_1 ∪ A_2 ∪ ... or ∪_{m=1}^∞ A_m, with similar notations for the intersection.

Proposition A-1b. The union of any indexed collection of open sets is open.

The proof is the same as for the first part of Proposition A-1a.

Example 4. Let A_m be the (1/m)-neighborhood of a point x_0, m = 1, 2, ... Then

A_1 ∩ A_2 ∩ ... = {x_0},

which is not an open set.
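
To see why no point other than x_0 survives in the intersection of Example 4, note that a point at distance d > 0 from x_0 is missing from A_m as soon as 1/m ≤ d. The following sketch is only an illustration; the function name `first_excluding_index` and the sample distance are assumptions, not part of the text.

```python
from fractions import Fraction

def first_excluding_index(distance):
    """Smallest m with 1/m <= distance, i.e. the first A_m omitting the point."""
    m = 1
    while Fraction(1, m) > distance:
        m += 1
    return m

d = Fraction(3, 100)            # assumed value of |x - x_0| for some x != x_0
m = first_excluding_index(d)
print(m)                        # x lies in A_1, ..., A_(m-1) but not in A_m
```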

Definition. A set A is closed if its complement A^c is open.

In other words, A is closed if A contains all of its frontier points, which is
to say A = cl A. Since

(∩_{μ∈J} A_μ)^c = ∪_{μ∈J} A_μ^c,

(∪_{μ∈J} A_μ)^c = ∩_{μ∈J} A_μ^c,

we have the following statement from Propositions A-1a and A-1b.

Proposition A-2. The intersection of any indexed collection of closed sets is
closed. The union of any finite indexed collection of closed sets is closed.

Besides indexed collections, we shall have occasion to consider unindexed
collections of sets. (We use the term “collection of sets,” rather than “set of
sets,” for a set whose elements are subsets of some given set S.) Let us use
German script letters to denote collections of sets. For instance, the elements
of a finite collection 𝔄 = {A_1, ..., A_m} of subsets of E^n are the sets A_k ⊂ E^n,
k = 1, ..., m.
The union and intersection of a collection 𝔄 of sets are, respectively, the sets

∪_{A∈𝔄} A = {p ∈ S : p ∈ A for some A ∈ 𝔄},

∩_{A∈𝔄} A = {p ∈ S : p ∈ A for every A ∈ 𝔄}.

If each set of the collection is indexed by itself (taking J = 𝔄, A_A = A), then
this definition of union and intersection agrees with the one for indexed
collections. Propositions A-1b and A-2 remain true for unindexed collections.

PROBLEMS
1. Find int A, fr A, cl A if A is:
(amex OF<a|xi—-xo| <0 0820:
(b) {x : |x − x_0| = δ}, δ > 0.
(CRG) 0s ye<i a ole yl)
(Dimi rcOs.Oayocit 0) 0 pute <m eO e<aO) <0 Dan
(e) {(x, y) : x or y is irrational}.
(f) Any finite set.
(g) we 3, 3) Om te n=l.
2. In Problem 1 which sets are open? Which are closed?
3. Let A be any set. Show that int A is open, and that both fr A and cl A are closed.
4. Show that:
() aie 41 se (ALS), (b) el A =) cl (lA):
(c) fr A = cl A ∩ cl (A^c). (d) int A = (cl (A^c))^c.
5. Show by giving examples that the following are in general false:
(a) int (cl A) = int A. (is) sue Gie AD) = iiie Al.
6. Let A be open and B closed. Show that A — B is open, and that B — A is closed.

A-4 SEQUENCES IN E”

An infinite sequence is a function whose domain is the set of positive integers.
For brevity, we shall say sequence to mean infinite sequence. In this section let
us consider sequences with values in E^n. It is customary to denote by x_m
the value of the function at the integer m = 1, 2, ..., and to call x_m the
mth term of the sequence. The sequence itself is denoted by x_1, x_2, ..., or for
brevity by [x_m]. It must not be confused with the set {x_1, x_2, ...} whose
elements are the terms of the sequence. This set may be finite or infinite. For
instance if x_m = (−1)^m then the sequence is −1, 1, −1, ..., and the set
{x_1, x_2, ...} has only two elements −1 and 1.

Definition. Suppose that for every ε > 0 there exists a positive integer N
such that |x_m − x_0| < ε for every m ≥ N. Then x_0 is the limit of
the sequence [x_m].

The notations “x_0 = lim_{m→∞} x_m” and “x_m → x_0 as m → ∞” are used to
mean that x_0 is the limit of the sequence [x_m]. A sequence is called convergent
if it has a limit, otherwise divergent. The integer N in the definition depends of
course on ε. Given ε there is a smallest possible choice for N. However, for
purposes of the theory of limits it is of no interest to calculate it. What matters
is the fact that some N exists.
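
Although the theory never needs the smallest N, computing it for one concrete sequence can make the dependence of N on ε tangible. The sketch below assumes the example sequence x_m = (m + 1)/m with limit 1, which is not taken from the text; the names `x` and `smallest_N` are likewise hypothetical.

```python
from fractions import Fraction

def x(m):
    """Assumed example sequence x_m = (m + 1)/m, whose limit is 1."""
    return Fraction(m + 1, m)

def smallest_N(eps, limit=Fraction(1)):
    """First index N with |x_N - limit| < eps.

    Here |x_m - 1| = 1/m decreases with m, so the inequality then holds
    for every m >= N and this N is the smallest admissible choice.
    """
    N = 1
    while abs(x(N) - limit) >= eps:
        N += 1
    return N

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(eps, smallest_N(eps))
```

The search works only because the distances |x_m − 1| are decreasing for this particular sequence; for a general sequence the first index satisfying the inequality need not serve as N.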

Proposition A-3a. Let x_0 = lim_{m→∞} x_m, y_0 = lim_{m→∞} y_m. Then:

(a) x_0 + y_0 = lim_{m→∞} (x_m + y_m).

(b) c x_0 = lim_{m→∞} c x_m for any scalar c.

(c) x_0 · y_0 = lim_{m→∞} x_m · y_m.

The proof of this is left to the reader (Problem 6). Only superficial changes
are needed in the proof given for real-valued sequences in any careful elementary
calculus text. Moreover, it is similar to the proof of Proposition A-3b in the
next section.
Proposition A-4a. x_0 = lim_{m→∞} x_m if and only if x_0^i = lim_{m→∞} x_m^i
for each i = 1, ..., n.

Proof. For any vector h,

|h^i| ≤ |h| ≤ |h^1| + ... + |h^n|,   i = 1, ..., n.   (*)

In particular, this is true with h = x_m − x_0. Suppose that x_m → x_0 as
m → ∞. From the definition of “limit” and the fact that |x_m^i − x_0^i| ≤ |x_m − x_0|,
given ε > 0 there exists N such that |x_m^i − x_0^i| < ε for every m ≥ N. Hence
x_m^i → x_0^i as m → ∞. Conversely, suppose that x_m^i → x_0^i as m → ∞ for each
i = 1, ..., n. Then given ε > 0 there exists for each i an N_i such that
|x_m^i − x_0^i| < ε/n for every m ≥ N_i. Let N = max {N_1, ..., N_n}. If m ≥ N,
then

|x_m − x_0| ≤ Σ_{i=1}^n |x_m^i − x_0^i| < n(ε/n) = ε.

Hence x_m → x_0 as m → ∞. ∎

Let us next prove three theorems which depend on the least upper bound
axiom. It will be seen later that when n = 1 each of these theorems describes
a property of the real number system which, taken together with the archi-
medean property, is equivalent to the least upper bound axiom.
A sequence [x_m] of real numbers is called monotone if either x_1 ≤ x_2 ≤
x_3 ≤ ... or x_1 ≥ x_2 ≥ x_3 ≥ ... In the first instance the sequence is
nondecreasing, in the second nonincreasing. A sequence [x_m] is bounded if there is
a number C such that |x_m| ≤ C for every m = 1, 2, ...

Theorem A-1. Every bounded monotone sequence of real numbers has a limit.

Proof. Let [x_m] be nondecreasing and bounded. Let x_0 = sup {x_1, x_2, ...}.
Given ε > 0 there exists an N such that x_0 − x_N < ε. Otherwise x_m ≤ x_0 − ε
for every m = 1, 2, ..., and x_0 − ε would be a smaller upper bound than the
least upper bound x_0. Since the sequence is nondecreasing, x_N ≤ x_m ≤ x_0
for every m ≥ N. Hence |x_m − x_0| = x_0 − x_m < ε for every m ≥ N.
This shows that x_m → x_0 as m → ∞.
If the sequence is nonincreasing and bounded, let x_0 = inf {x_1, x_2, ...}.
In the same way x_m → x_0 as m → ∞. ∎

Example 1. Let 0 < a < 1. The sequence a, a^2, a^3, ... of its powers is decreasing,
and is bounded below by 0. The limit of the sequence is b = inf {a, a^2, a^3, ...}.
Since b ≤ a^{m+1}, a^{−1}b ≤ a^m for m = 1, 2, ... Therefore a^{−1}b ≤ b. However
a^{−1}b ≥ b since 0 < a < 1; and hence a^{−1}b = b. This implies that b = 0.

Let us next obtain a criterion for convergence due to Cauchy.

Definition. If for every ε > 0 there exists a positive integer N such that
|x_l − x_m| < ε for every l, m ≥ N, then [x_m] is a Cauchy sequence.

Theorem A-2. (Cauchy convergence criterion.) A sequence [x_m] is convergent
if and only if it is a Cauchy sequence.

Proof. Let [x_m] be convergent, and x_0 be its limit. Then given ε > 0
there exists N such that |x_m − x_0| < ε/2 for every m ≥ N. Now

x_l − x_m = (x_l − x_0) + (x_0 − x_m).

If l, m ≥ N, then by the triangle inequality

|x_l − x_m| ≤ |x_l − x_0| + |x_0 − x_m| < ε/2 + ε/2 = ε.

Therefore [x_m] is a Cauchy sequence.


The proof of the converse is more difficult. Let us first show that every
Cauchy sequence is bounded. If [x_m] is Cauchy, then taking ε = 1 in the
definition there is an N such that |x_l − x_m| < 1 for every l, m ≥ N. In
particular let l = N and let C = max {|x_1|, ..., |x_{N−1}|, |x_N| + 1}. Then by
the triangle inequality and the fact that x_m = x_N + (x_m − x_N),

|x_m| ≤ |x_N| + |x_m − x_N| < |x_N| + 1

if m ≥ N. Therefore |x_m| ≤ C for every m = 1, 2, ...
Next, let [x_m] be a Cauchy sequence of real numbers, and C be a number
such that |x_m| ≤ C for every m. For each m = 1, 2, ... let

y_m = inf {x_m, x_{m+1}, ...}.

Since {x_m, x_{m+1}, ...} ⊃ {x_{m+1}, x_{m+2}, ...}, we must have y_m ≤ y_{m+1}. Since
x_m ≤ C for every m, y_m ≤ C for every m. The sequence [y_m] is nondecreasing
and bounded. By Theorem A-1 the sequence [y_m] has a limit y_0. Let
us show that x_m → y_0 as m → ∞. Given ε > 0 there exists N such that
for every m ≥ N, |x_N − x_m| < ε/2, and hence

x_N − ε/2 < x_m < x_N + ε/2.

If m ≥ N, then x_N − ε/2 is a lower bound and x_N + ε/2 an upper bound
for {x_m, x_{m+1}, ...}. Therefore, when m ≥ N,

x_N − ε/2 ≤ y_m ≤ y_0 ≤ x_N + ε/2,

|x_m − y_0| ≤ |x_m − x_N| + |x_N − y_0| < ε/2 + ε/2 = ε.

This proves that x_m → y_0 as m → ∞.


Finally, if [x_m] is a Cauchy sequence in E^n, then for each i = 1, ..., n
the components form a Cauchy sequence [x_m^i] of real numbers (Problem 4).
Each of these sequences has a limit y_0^i. By Proposition A-4a, x_m → y_0 as
m → ∞. ∎

A nonempty set is called bounded if it is contained in some spherical n-ball.


The diameter of a nonempty set A is

diam A = sup {|x — y|:x,y € A}.

A is bounded if and only if diam A is finite.



Theorem A-3. (Cantor). Let [A_m] be a sequence of closed sets such that
A_1 ⊃ A_2 ⊃ ... and 0 = lim_{m→∞} diam A_m. Then A_1 ∩ A_2 ∩ ...
contains a single point.

Proof. For each m = 1, 2, ... let x_m be a point of A_m. Let us show that
[x_m] is a Cauchy sequence. Given ε > 0, there exists N such that diam A_N < ε.
If l, m ≥ N, then x_l, x_m ∈ A_N since A_l ⊂ A_N, A_m ⊂ A_N. Therefore

|x_l − x_m| ≤ diam A_N < ε.

By Theorem A-2 the sequence [x_m] has a limit x_0. For each l = 1, 2, ...,
x_m ∈ A_l for every m ≥ l since A_m ⊂ A_l. Since A_l is closed, x_0 ∈ A_l
(Problem 5). Since this is true for each l, x_0 ∈ A_1 ∩ A_2 ∩ ...
If x ∈ A_1 ∩ A_2 ∩ ..., then x ∈ A_m and

0 ≤ |x − x_0| ≤ diam A_m,   m = 1, 2, ...

Since diam A_m → 0 as m → ∞, |x − x_0| = 0 and x = x_0. ∎

If in Theorem A-3 it is not assumed that the diameter of A_m tends to 0,
then it is still true that A_1 ∩ A_2 ∩ ... is not empty, provided some set A_m
is bounded. (See Problem 4, Section A-8.)

Example 2. Let n = 1, A_m = {m, m + 1, m + 2, ...}. Then each A_m is closed,
unbounded, and A_1 ⊃ A_2 ⊃ ... The intersection A_1 ∩ A_2 ∩ ... is empty.

*Note. Let n = 1. Suppose that we took instead of the least upper bound
property of E^1 (Axiom IIIa, Section A-1) the following.

Axiom IIIb. (a) E^1 has the archimedean property.
(b) Every Cauchy sequence of real numbers is convergent.

Then the least upper bound property becomes a theorem. To prove this,
let S be any subset of E^1 which is bounded above, and let c be some upper
bound for S. Let us define a sequence of closed intervals I_1, I_2, ... as follows:
Let a be some point of S, and I_1 = [a, c]. Divide I_1 at the midpoint (a + c)/2
into two congruent closed intervals. If (a + c)/2 is an upper bound for S,
let I_2 be the left-hand interval, otherwise let I_2 be the right-hand interval.
In general, suppose m ≥ 1 and I_m has been defined. If the midpoint of I_m is
an upper bound for S, let I_{m+1} be the left half of I_m, otherwise the right half.
The archimedean property implies that for any x > 0, the sequence [m^{−1}x]
tends to 0 as m → ∞. Since 0 < 2^{−m} < m^{−1}, the sequence [2^{−m}x] also
tends to 0. Let x = 2(c − a). Now I_1 ⊃ I_2 ⊃ ... and the length of I_m is
2^{−m}x. By the proof of Theorem A-3, I_1 ∩ I_2 ∩ ... contains a single point
x_0. By the construction, x_0 = sup S. (See Fig. A-2.)
This shows that (in the presence of Axioms I and II) Axioms IIIa and
IIIb are equivalent. The endpoints of the intervals I_m form monotone
sequences. If we took as an axiom the archimedean property and Theorem A-1,
then similar reasoning shows that these sequences must converge to a common
limit x_0 and x_0 = sup S. Again, this is equivalent to Axiom IIIa.

Figure A-2
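
The interval-halving construction in the Note can be carried out mechanically for a concrete set. The sketch below is a minimal illustration under assumptions that are not in the text: S is taken to be {x : x^2 < 2}, the starting interval is [1, 2], and the names `bisect_sup` and `is_upper_bound_sqrt2` are hypothetical. Exact rational arithmetic is used so that the interval lengths can be checked exactly.

```python
from fractions import Fraction

def bisect_sup(is_upper_bound, a, c, steps):
    """Halve [a, c] repeatedly, keeping the half that must contain sup S.

    Assumes a is a point of S and c is an upper bound for S.
    """
    lo, hi = Fraction(a), Fraction(c)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if is_upper_bound(mid):
            hi = mid        # midpoint is an upper bound: keep the left half
        else:
            lo = mid        # otherwise keep the right half
    return lo, hi

def is_upper_bound_sqrt2(t):
    """True when t is an upper bound for the assumed set S = {x : x^2 < 2}."""
    return t >= 0 and t * t >= 2

lo, hi = bisect_sup(is_upper_bound_sqrt2, 1, 2, 20)
print(float(lo), float(hi))     # both endpoints approximate sup S
```

After k halvings the interval has length (c − a)/2^k, which is the quantity 2^{−m}x of the Note with m = k + 1 and x = 2(c − a).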

Infinite series. Formally, an infinite series is an expression written Σ_{k=1}^∞ x_k
or x_1 + x_2 + ... To be more precise, with any sequence [x_k] is associated
another sequence [s_m], where s_m = x_1 + ... + x_m is called the mth partial
sum. This pair of sequences defines an infinite series. If the sequence of partial
sums has a limit s, then the series is convergent and s is its sum. This is denoted
by s = x_1 + x_2 + ... If the sequence of partial sums has no limit, then
the series is divergent.
If s = x_1 + x_2 + ..., t = y_1 + y_2 + ..., then s + t = (x_1 + y_1) +
(x_2 + y_2) + ... and cs = (c x_1) + (c x_2) + ... for any scalar c. This follows
from the definition and Proposition A-3a. Some further elementary properties
are given in Problems 7(c) and 8.
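
As a small illustration of the passage from terms to partial sums, the sketch below forms s_1, ..., s_10 for the assumed example series with terms x_k = 2^{−k} (a choice made here, not in the text). For this series s_m = 1 − 2^{−m}, so the partial sums approach 1.

```python
from fractions import Fraction
from itertools import accumulate

# Terms x_k = 2^(-k) for k = 1, ..., 10 (an assumed example series).
terms = [Fraction(1, 2 ** k) for k in range(1, 11)]

# accumulate yields the running totals s_1, s_2, ..., s_10.
partial_sums = list(accumulate(terms))

for m, s in enumerate(partial_sums, start=1):
    print(m, s, 1 - s)          # 1 - s_m is the distance to the sum
```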

PROBLEMS
In Problems 1 and 2 you may use the results of Problems 9 and 10.
1. Find the limit if it exists.
(aan ll= 2" — 25%)/ 8" 35"):
(b) 8g 3 I sin (mm/2).
@) Bp = SM Gas.
(d) x_m = ((m + 1)/(m − 1))^m. [Hint: (1 + 1/m)^m → e as m → ∞.]
(Cleo nem n(n Salm? — 1) Jr.
2. Find the limit if it exists, using Proposition A—4a.
(8). Gray Ym) = (A m)/ — 2m), 1/0 + m)).
(DMG Yr) Sole on List am)e
(c) (x_m, y_m) = (1 − 2^{−m}, (m^2 + 3^m)/m!).
3. Show that a sequence [x_m] has at most one limit x_0. [Hint: If y_0 were another
limit, let ε = |x_0 − y_0|/2.]
4. Show that [x_m] is a Cauchy sequence if and only if [x_m^i] is a Cauchy sequence
for each i = 1, ..., n.
5. Show that if x_m ∈ A for every m ≥ 1 and x_0 = lim_{m→∞} x_m, then x_0 ∈ cl A.
6. Prove Proposition A-3a.
7. (Comparison tests.) Show that:
(a) If 0 ≤ x_m ≤ y_m for every m ≥ 1 and y_m → 0 as m → ∞, then x_m → 0
as m → ∞.
(b) If [x_m], [y_m] are nondecreasing sequences such that x_m ≤ y_m for each
m = 1, 2, ... and y_m → y as m → ∞, then [x_m] has a limit x ≤ y.
(c) If 0 ≤ x_m ≤ y_m for every m = 1, 2, ... and t = y_1 + y_2 + ..., then the
series x_1 + x_2 + ... converges with sum s ≤ t.
8. An infinite series x_1 + x_2 + ... converges absolutely if the series of nonnegative
numbers |x_1| + |x_2| + ... converges. Prove that any absolutely convergent
infinite series is convergent. [Hint: Show that the sequence [s_m] of partial sums
is Cauchy.]
9. Show that if a > 0, then
(a) lim_{m→∞} a^{1/m} = 1. (b) lim_{m→∞} a^m/m! = 0.

(c) lim_{m→∞} (x_m)^{1/m} = 1 provided lim_{m→∞} x_m = a.

[Hints: For part (a) reduce to the case 0 < a < 1. By Example 1, if b < 1
then a < b^m for only finitely many m. For part (b), compare with the sequence
[c/m] for suitable c and suitable / in Problem 7(a).]
10. Let x_0 = lim_{m→∞} x_m, y_0 = lim_{m→∞} y_m, and suppose that y_m ≠ 0 for m = 0, 1, 2, ...
Show that x_0/y_0 = lim_{m→∞} x_m/y_m. [Hint: By (c) of Proposition A-3a it suffices
to show that y_0^{−1} = lim_{m→∞} y_m^{−1}.]

A-5 LIMITS AND CONTINUITY OF TRANSFORMATIONS

In discussing continuity of functions it is convenient to introduce the
following terminology. Let f be a function from a set S into a set T. The image
under f of a set A ⊂ S is the set f(A) = {f(p) : p ∈ A}. It is a subset of T,
and in fact the restriction f|A (Section 1-2) is a function from A onto f(A).
The inverse image of a set B ⊂ T is the set f^{−1}(B) = {p : f(p) ∈ B}. It is a
subset of S. See Section 4-1 for illustrations and examples.
Let us now suppose that f is a function from a set D ⊂ E^n into E^m, where
n and m are positive integers. Such functions are called transformations in
this book.
The definition of “limit” for transformations is patterned after the one
encountered in elementary calculus for real-valued functions of one variable.
A punctured neighborhood of x_0 is a neighborhood with the center x_0 removed.
Let us assume that D contains some punctured neighborhood of x_0. For the
definition of “limit”, x_0 itself need not be in D. If x_0 ∈ D, the value of f at
x_0 is irrelevant.

Definition. If for every neighborhood V of y_0 there is a punctured
neighborhood U of x_0 such that f(U) ⊂ V, then y_0 is the limit of the
transformation f at x_0 (Fig. A-3).

In the definition it is understood that the radius of U is small enough so
that U ⊂ D. The notations y_0 = lim_{x→x_0} f(x) and f(x) → y_0 as x → x_0 are
used to mean that y_0 is the limit of f at x_0.

Figure A-3

If we let ε and δ denote respectively the radii of V and U, then the
definition may be rephrased as follows: f(x) → y_0 as x → x_0 if for every ε > 0
there exists δ > 0 such that |f(x) − y_0| < ε whenever 0 < |x − x_0| < δ.
The number δ depends of course on ε and may also depend on x_0. Given ε
and x_0, there is a largest possible δ. However, there is ordinarily no reason
to try to calculate it.
Let us first show that limits behave properly with respect to sums and
products. Let f and g have the same domain and values in the same euclidean E^m.

Proposition A-3b. If y_0 = lim_{x→x_0} f(x) and z_0 = lim_{x→x_0} g(x), then

(1) y_0 + z_0 = lim_{x→x_0} [f(x) + g(x)].

(2) c y_0 = lim_{x→x_0} c f(x), for any scalar c.

(3) y_0 · z_0 = lim_{x→x_0} f(x) · g(x).


Proof. Let V be any neighborhood of y_0 + z_0 and let ε be its radius.
Let V_1, V_2 be the neighborhoods of radius ε/2 of y_0, z_0 respectively. If y ∈ V_1
and z ∈ V_2, then by the triangle inequality

|(y + z) − (y_0 + z_0)| ≤ |y − y_0| + |z − z_0| < ε/2 + ε/2 = ε.

Hence y + z ∈ V. By hypothesis there exist punctured neighborhoods U_1, U_2
of x_0 such that f(U_1) ⊂ V_1 and g(U_2) ⊂ V_2. Let U = U_1 ∩ U_2, which is
also a punctured neighborhood of x_0. If x ∈ U, then f(x) ∈ V_1, g(x) ∈ V_2.
Consequently, f(x) + g(x) ∈ V, which shows that (f + g)(U) ⊂ V. This
proves (1). The proof of (2) is left to the reader (Problem 2).
To prove (3) let V_0 be the neighborhood of y_0 of radius 1, and U_0 be a
punctured neighborhood of x_0 such that f(U_0) ⊂ V_0. Let

C = max {|y_0| + 1, |z_0|}.

If y ∈ V_0, then

|y| ≤ |y_0| + |y − y_0| < |y_0| + 1,

and hence |y| < C.
Now
f(x) · g(x) − y_0 · z_0 = f(x) · [g(x) − z_0] + z_0 · [f(x) − y_0].

From the triangle inequality and Cauchy’s inequality,

|f(x) · g(x) − y_0 · z_0| ≤ |f(x)| |g(x) − z_0| + |z_0| |f(x) − y_0|.   (*)

Given ε > 0, let V_1, V_2 be the neighborhoods of radius ε/2C of y_0, z_0
respectively, and let V_1' = V_0 ∩ V_1. If ε ≤ 2C, then V_1' = V_1. By hypothesis
there are punctured neighborhoods U_1, U_2 of x_0 such that f(U_1) ⊂ V_1',
g(U_2) ⊂ V_2. Let U = U_1 ∩ U_2. For every x ∈ U, f(x) ∈ V_0 and hence
|f(x)| < C. From (*),

|f(x) · g(x) − y_0 · z_0| < C(ε/2C) + C(ε/2C) = ε

for every x ∈ U. This proves (3). ∎

A transformation f is called bounded on a set A if there exists C such that
|f(x)| ≤ C for every x ∈ A. In the course of the proof we showed that if f
has a limit at x_0, then f is bounded on some punctured neighborhood U_0 of x_0.

Proposition A-4b. y_0 = lim_{x→x_0} f(x) if and only if y_0^i = lim_{x→x_0} f^i(x) for each
i = 1, ..., m.

The proof is like that for Proposition A-4a.

Proposition A-5. If y_0 = lim_{x→x_0} f(x), then for any v ≠ 0,

y_0 = lim_{t→0} f(x_0 + tv).

Proof. Let V be any neighborhood of y_0. There exists δ > 0 such that
f(x) ∈ V whenever 0 < |x − x_0| < δ. If 0 < |t| < δ/|v|, then

|(x_0 + tv) − x_0| = |t| |v| < δ.

Hence f(x_0 + tv) ∈ V for every t in the punctured δ/|v|-neighborhood of 0. ∎

The points x_0 + tv lie on the line through x_0 and x_0 + v. Roughly
speaking, Proposition A-5 states that if f has a limit y_0 at x_0, then y_0 is also
the limit as x_0 is approached along any line containing x_0. When f fails to have
a limit at x_0 this fact can often be discovered by testing f along various lines.

Example 1. Let f(x, y) = x^2/(x^2 + y^2), (x, y) ≠ (0, 0), and let x_0 = (0, 0). Taking
v = e_1 = (1, 0), f(t, 0) = 1 for every t ≠ 0. Hence f(t, 0) → 1 as t → 0. Similarly,
taking v = e_2 = (0, 1), f(0, t) = 0 for every t ≠ 0 and f(0, t) → 0 as t → 0. Since
these limits are different, f has no limit at (0, 0).

The converse to Proposition A-5 is false, as the following example shows.

Example 2. Let f(x, y) = (y^2 − x)^2/(y^4 + x^2), (x, y) ≠ (0, 0), and again let
x_0 = (0, 0). Consider any v = (h, k) ≠ (0, 0). Then

f(th, tk) = (tk^2 − h)^2/(t^2 k^4 + h^2),

which tends to 1 as t → 0. However, every punctured neighborhood of (0, 0)
contains part of the parabola y^2 = x, and f(y^2, y) = 0. Hence f does not have a limit
at (0, 0).
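
The behavior described in Example 2 can be observed numerically. The sketch below takes f(x, y) = (y^2 − x)^2/(y^4 + x^2) as in that example; the sample directions, the parameter value t, and the variable names are assumptions chosen for illustration. Values along straight lines through the origin are close to 1, while values on the parabola y^2 = x are exactly 0.

```python
from fractions import Fraction

def f(x, y):
    """The function of Example 2, defined for (x, y) != (0, 0)."""
    return (y * y - x) ** 2 / (y ** 4 + x * x)

t = Fraction(1, 10 ** 6)                          # a small nonzero parameter
directions = [(1, 0), (0, 1), (1, 1), (-2, 3)]    # sample vectors v = (h, k)

# Values at the points tv on four lines through the origin.
along_lines = [f(t * h, t * k) for h, k in directions]

# Values at points (s^2, s) of the parabola y^2 = x near the origin.
on_parabola = [f(s * s, s) for s in (Fraction(1, 10), Fraction(1, 1000))]

print([float(v) for v in along_lines])
print(on_parabola)
```

A finite sample of lines proves nothing by itself; it only illustrates why testing along lines cannot establish that a limit exists.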

Continuity. It often happens that x_0 ∈ D and the limit at x_0 is just f(x_0).

Definition. A transformation f is continuous at x_0 if f(x_0) = lim_{x→x_0} f(x).

When f is continuous at x_0, punctured neighborhoods may be replaced
by neighborhoods. The definition may be restated: f is continuous at x_0 if
for every neighborhood V of f(x_0) there is a neighborhood U of x_0 such that
f(U) ⊂ V.
From Proposition A-3b, sums and products of transformations
continuous at x_0 are also continuous at x_0. From Proposition A-4b, f is
continuous at x_0 if and only if its components f^1, ..., f^m are continuous at x_0.

Example 3. Let I(x) = x, the identity transformation. Then I is everywhere
continuous (take U = V above). Therefore the components of I are everywhere
continuous. In this book these components are called the standard cartesian coordinate
functions, and are denoted by X^1, ..., X^n. For each x, X^i(x) = x^i. See Section 1-3.

Example 4. Any polynomial in n variables is everywhere continuous. This is proved
by induction on the degree of the polynomial using the continuity of the coordinate
functions X^i and of constant functions.

Example 5. A rational function f(x) = P(x)/Q(x), where P and Q are polynomials,
is continuous at each point where Q(x) ≠ 0 (see Problem 4). For instance, in
Examples 1 and 2, f is continuous at each (x, y) ≠ (0, 0).

It will be shown later (Proposition A-7) that the composite of two
continuous transformations is continuous. In Section 4-4 it is shown that any
differentiable transformation is continuous.

Limits at ∞. Let us call a set of the form {x : |x| > b} a punctured
neighborhood of ∞. The definition of “limit at ∞” then reads: y_0 = lim_{|x|→∞} f(x)
if for every neighborhood V of y_0 there exists a punctured neighborhood U
of ∞ such that f(U) ⊂ V.
When f is real valued we say that lim_{x→x_0} f(x) = +∞ if for every C > 0
there is a punctured neighborhood U of x_0 such that f(x) > C whenever
x ∈ U. The definition of “lim_{x→x_0} f(x) = −∞” is similar.

PROBLEMS
1. Find the limit at x_0 if it exists.
(a) f@pye— ty (r> 47) xor—" er eo:
(b) I(x; y) = cy) ce ae y”), XO (0, 0). ie
(c) f(x) = (1 − cos x)/x^2, x_0 = 0. [Hint: lim_{x→0} (sin x)/x = 1.]
(d) f(x) = |x − 2| e_1 + |x + 2| e_2, x_0 = 3.
(e) f(x, y) = y e_1 + (xy)^2/[(xy)^2 + (x − y)^2] e_2, x_0 = (0, 0).
At which points is each of these functions continuous?
2. Prove (2) of Proposition A-3b.
3. Show that if y_0 = lim_{x→x_0} f(x), then |y_0| = lim_{x→x_0} |f(x)|. By an example, show
that the converse is false.
4. Let y_0 = lim_{x→x_0} f(x), z_0 = lim_{x→x_0} g(x). Show that if z_0 ≠ 0 then

lim_{x→x_0} f(x)/g(x) = y_0/z_0.

5. Show that lim_{x→x_0} f(x) = +∞ if and only if lim_{x→x_0} [f(x)]^{−1} = 0 and f(x) > 0
for every x in some punctured neighborhood of x_0.
6. Find the limit if it exists.
A 4
gS. oi zy”
a lim : (b) Li oes
(a) (e,4)=1(0,0) 22-1 Y* (x,y) (0,0) & ae y

(c) lim_{|x|→∞} (x · x_1)(x · x_2)/(x · x), where x_1 and x_2 are given vectors not 0.

(d) lim_{|x|→∞} |x − x_1|/|x − x_2|.
7. Show that f is continuous at x_0 if and only if f(x_0) = lim_{m→∞} f(x_m) for every
sequence [x_m] such that x_m ∈ D for m = 1, 2, ... and x_m → x_0 as m → ∞.

A-6 TOPOLOGICAL SPACES

The notion of topological space occurs in practically all branches of
mathematics. There are several equivalent definitions; of these, we shall give
the one in terms of neighborhoods.
Definition. Let S be a nonempty set. For every p ∈ S let 𝔘_p be a
collection of subsets of S called neighborhoods of p such that:
(1) Every point p has at least one neighborhood.
(2) Every neighborhood of p contains p.
(3) If U_1 and U_2 are neighborhoods of p, then there is a neighborhood
U_3 of p such that U_3 ⊂ U_1 ∩ U_2.
(4) If U is a neighborhood of p and q ∈ U, then there is a neighborhood
V of q such that V ⊂ U.
Then S is a topological space.

More precisely, the topological space is S together with the collections
𝔘_p of neighborhoods. However, it is common practice to omit explicit reference
to the collections of neighborhoods when no ambiguity can arise.
For our purposes, the following two examples of topological spaces are
of primary importance.
Example 1. Let S = E^n, and as in Section A-3 let 𝔘_x be the collection of all open
spherical n-balls with center x. Clearly, Axioms (1) and (2) of the definition above
are satisfied, and in (3) we may take U_3 = U_1 ∩ U_2. Axiom (4) was verified in
Example 1, Section A-3. Thus E^n is a topological space.

Example 2. Let S ⊂ E^n. Let neighborhoods of x ∈ S be all sets S ∩ U where U is a
neighborhood of x in E^n (Fig. A-4). These are called relative neighborhoods of x and
the topology on S defined by the collections of relative neighborhoods is the relative
topology. It is discussed further later in the section. Roughly speaking, the relative
topology is the one obtained by simply ignoring the complementary set S^c = E^n − S.

Figure A-4

In any topological space S the basic notions of interior, frontier, and closure
are defined just as in Section A-3 for the topological space E^n. For instance, p
is interior to a set A ⊂ S if some neighborhood of p is contained in A. An open
subset of the topological space S is a subset A each point of which is interior to A.
If S − A is open, then A is a closed subset of S. Axiom (4) guarantees that any
neighborhood is an open set. Propositions A-1a, A-1b, and A-2 are still true,
and the proofs are almost the same as before.
It often happens that two different collections of neighborhoods 𝔘_p, 𝔘_p'
lead to the same collection of open subsets of S. In E^n we need not have started
with spherical neighborhoods. For instance, the neighborhoods obtained from
any noneuclidean norm on E^n lead to the same open sets as in Section A-3.
See Section 1-6.
The open sets, and not the particular kinds of neighborhoods from which
they were obtained, determine all of the topological properties of S. Thus we
say that the collections 𝔘_p, 𝔘_p' define the same topology on S if they have the
same collection of open sets.
To give some idea of the breadth of the notion of topological space, let us
give a few more examples.
Example 3. Let S be any set, and let every p € S have exactly one neighborhood,
namely S itself. The only open sets are S and the empty set.

Example 4. Let S be any set, and let the sole neighborhood of p be the set {p} with
the one element p. Then every subset of S is open.

These examples represent opposite extremes. In Example 3, S is called


an indiscrete space, in Example 4 a discrete space.
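
On a finite set the open sets determined by a choice of neighborhoods can simply be enumerated. The sketch below is an illustration under assumptions: S is taken to be the three-point set {0, 1, 2}, and the function name `open_sets` and the dictionaries of neighborhoods are hypothetical. A subset A is counted as open when each of its points has a neighborhood contained in A.

```python
from itertools import combinations

def open_sets(points, neighborhoods):
    """Return all subsets A such that each p in A has a neighborhood inside A."""
    result = []
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            A = frozenset(subset)
            # The empty set passes vacuously, since it has no points to test.
            if all(any(U <= A for U in neighborhoods[p]) for p in A):
                result.append(A)
    return result

S = (0, 1, 2)
indiscrete = {p: [frozenset(S)] for p in S}    # Example 3: sole neighborhood is S
discrete = {p: [frozenset({p})] for p in S}    # Example 4: sole neighborhood is {p}

print(len(open_sets(S, indiscrete)))           # number of open sets in Example 3
print(len(open_sets(S, discrete)))             # number of open sets in Example 4
```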
Example 5. Let ℱ be the set whose elements are all bounded real-valued functions on
the interval [0, 1]. Let us call the distance between functions f and g the number

d(f, g) = sup {|f(x) − g(x)| : x ∈ [0, 1]}.

Let neighborhoods of f be all sets of the form {g ∈ ℱ : d(f, g) < ε}. (See Fig. A-5.)

Figure A-5

Continuous functions. Let f be a function from a topological space S into


a topological space 7’.

Definition. The function f is continuous at p_0 if for every neighborhood V
of f(p_0) there exists a neighborhood U of p_0 such that f(U) ⊂ V. If f is
continuous at every point of S, then f is continuous on S.

In this book the following cases will be of interest: (a) S is an open subset
of E^n and T = E^m. In that case the definition of continuity agrees with the
one in Section A-5. (b) S ⊂ E^n and T ⊂ E^m. The sets S and T are given
the relative topology, defined below. (c) S is an open subset of E^n, and T is
some other finite-dimensional vector space. Specifically, for T we shall take
either the dual space (E^n)*, the space E^n_r of multivectors of degree r, or its
dual (E^n_r)*. Each of these spaces has a euclidean norm. Just as for E^n,
neighborhoods of a point p in each of them are of the form {q : |q − p| < δ}, δ > 0.
Case (c) could be reduced to (a), since there is a norm preserving isomorphism
between each of these vector spaces and euclidean E^m of the appropriate
dimension.

Proposition A-6. f is continuous on S if and only if the inverse image f⁻¹(B)
of any open set B is open.

Proof. Let f be continuous on S and B ⊂ T be open. Let p be any point
of f⁻¹(B) and V be a neighborhood of f(p) such that V ⊂ B. Since f is
continuous, there is a neighborhood U of p such that f(U) ⊂ V. Then U ⊂ f⁻¹(B),
which shows that f⁻¹(B) is open.
Conversely, let f⁻¹(B) be open for each open set B. Let p be any point
of S, and V be any neighborhood of f(p). Since V is open, f⁻¹(V) is open and
contains p. Let U be a neighborhood of p such that U ⊂ f⁻¹(V). Then f(U) ⊂ V,
which shows that f is continuous at p. Since this is true for every p ∈ S, f is
continuous on S. ∎
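
As an illustrative aside, not part of the original text, Proposition A-6 can be tried out on finite topological spaces, where the open sets can be listed explicitly. The Python sketch below assumes a hypothetical two-point space S = {a, b} carrying either the indiscrete or the discrete topology (in the spirit of Examples 3 and 4), and a two-point discrete space T. The names preimage and is_continuous are chosen here for illustration only.

```python
def preimage(f, subset, domain):
    """Return the inverse image f^{-1}(subset) as a frozenset of points of the domain."""
    return frozenset(p for p in domain if f[p] in subset)


def is_continuous(f, domain, open_S, open_T):
    """Criterion of Proposition A-6: every open set B of T has an open inverse image."""
    return all(preimage(f, B, domain) in open_S for B in open_T)


# Hypothetical two-point spaces (assumptions made for this sketch).
S = frozenset({"a", "b"})
T = frozenset({0, 1})
indiscrete_S = {frozenset(), S}
discrete_S = {frozenset(), frozenset({"a"}), frozenset({"b"}), S}
discrete_T = {frozenset(), frozenset({0}), frozenset({1}), T}

f = {"a": 0, "b": 1}      # a non-constant function from S into T
const = {"a": 0, "b": 0}  # a constant function from S into T

print(is_continuous(f, S, indiscrete_S, discrete_T))
print(is_continuous(const, S, indiscrete_S, discrete_T))
print(is_continuous(f, S, discrete_S, discrete_T))
```

In this sketch the non-constant function fails the test when S is indiscrete and passes when S is discrete, while the constant function passes in both cases. This agrees with what Problem 1 of this section asks the reader to show.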

Corollary. If f is real valued and continuous on S, then

{p : f(p) > c}, {p : f(p) < c} are open sets,
{p : f(p) ≥ c}, {p : f(p) ≤ c} are closed sets.

Proof. In this case T = E¹. The semi-infinite interval (c, ∞) is open
and {p : f(p) > c} = f⁻¹[(c, ∞)]. Similarly, {p : f(p) < c} = f⁻¹[(−∞, c)].
The last two sets are complements of the first two.

Composites. Let f be a function from S into T, and g from R into S. The
composite f ∘ g is defined by

(f ∘ g)(r) = f[g(r)] for every r ∈ R.

Proposition A-7. If g is continuous at r₀ and f is continuous at p₀ = g(r₀),
then f ∘ g is continuous at r₀.

Figure A-6

Proof. Let V be any neighborhood of f(p₀). (See Fig. A-6.) There is a
neighborhood U of p₀ such that f(U) ⊂ V. Moreover, there is a neighborhood
W of r₀ such that g(W) ⊂ U. Then (f ∘ g)(W) = f[g(W)] ⊂ f(U) ⊂ V. This
shows that f ∘ g is continuous at r₀.

Example 6. Let D ⊂ Eⁿ be open. Given x₀ and v, let g(t) = x₀ + tv for every scalar t.
Then g is continuous from E¹ into Eⁿ. By Proposition A-6, A = {t : x₀ + tv ∈ D}
is open. Let f be continuous on D and let φ(t) = f(x₀ + tv) for t ∈ A. Then φ =
f ∘ (g|A), and by Proposition A-7 φ is continuous on A. This result is similar to
Proposition A-5.

Example 7. Let 1 ≤ s ≤ n − 1. Let us regard Eⁿ as the cartesian product
Eˢ × Eⁿ⁻ˢ, and write x = (x′, x″), where x′ = (x¹, ..., xˢ), x″ = (xˢ⁺¹, ..., xⁿ).
Given x₀′ ∈ Eˢ, let g be the function from Eⁿ⁻ˢ into Eⁿ such that g(x″) = (x₀′, x″)
for every x″ ∈ Eⁿ⁻ˢ. Such a function g is called an injection. Since |g(x″) − g(y″)| =
|x″ − y″|, g is continuous. Let D ⊂ Eⁿ be open, and let D(x₀′) = {x″ : (x₀′, x″) ∈ D}.
Since D(x₀′) = g⁻¹(D), by Proposition A-6 D(x₀′) is an open subset of Eⁿ⁻ˢ. Let f be
continuous on D. The function f(x₀′, ) whose value at each x″ ∈ D(x₀′) is f(x₀′, x″)
is the composite of f and g|D(x₀′). By Proposition A-7, f(x₀′, ) is continuous. Similarly,
given x₀″, the set {x′ : (x′, x₀″) ∈ D} is open and the function f( , x₀″) is continuous.

Subspaces. Let S₀ be a topological space and S a nonempty subset of S₀.
By disregarding the complement S₀ − S, S becomes a topological space in
the following way. If p ∈ S, then a neighborhood of p relative to S is a set S ∩ U
where U is a neighborhood in S₀ of p. Axioms (1)–(4) are satisfied. For instance,
to prove (3) let V₁ and V₂ be relative neighborhoods of p. Then V₁ = S ∩ U₁,
V₂ = S ∩ U₂, where U₁ and U₂ are neighborhoods in S₀ of p. Since S₀ is a
topological space, there is a neighborhood U₃ in S₀ of p with U₃ ⊂ U₁ ∩ U₂.
Then V₃ = S ∩ U₃ is a relative neighborhood of p and V₃ ⊂ V₁ ∩ V₂. This
topology is called the relative topology induced on S by the topology of S₀,
and S is a topological subspace of S₀. In particular, if S₀ = Eⁿ this is the
relative topology on S mentioned in Example 2.

Proposition A-8. A set A is open relative to S if and only if A = S ∩ D,
where D is an open subset of S₀.

Proof. Let A = S ∩ D, where D is open in S₀. If p ∈ A, then there is a
neighborhood U in S₀ of p such that U ⊂ D. Then V = S ∩ U is a relative
neighborhood of p and V ⊂ A. Hence A is relatively open.
Conversely, let A be relatively open. For each p ∈ A, let V_p be some
relative neighborhood of p with V_p ⊂ A. Then V_p = S ∩ U_p, where U_p is a
neighborhood in S₀ of p. Let D be the union of the sets U_p over all p ∈ A.
Then D is open and A = S ∩ D. ∎

Taking complements we find that the relatively closed sets are those of the
form S ∩ E where E is a closed subset of S₀.
If A ⊂ S and A is an open subset of S₀, then A is relatively open (take
D = A in Proposition A-8). On the other hand, there are generally many
sets which are relatively open but not open.

Example 8. Let S₀ = E¹, S = [a, b]. If a < x < b, then the interval [a, x) is open
relative to S but is not an open subset of E¹. A function f is said to have right-hand
limit y₀ at a if for every ε > 0 there exists δ > 0 such that |f(x) − y₀| < ε
whenever a < x < a + δ. The idea of left-hand limit at the other endpoint b is defined
similarly. A function f is continuous on [a, b] if and only if f is continuous on the
open interval (a, b) and

$$f(a) = \lim_{x \to a^+} f(x), \qquad f(b) = \lim_{x \to b^-} f(x),$$

where the limits are respectively right- and left-hand.



Example 9. Let S₀ = E¹, S be the set of rational numbers, A = S ∩ (−∞, √2) =
S ∩ (−∞, √2]. Then A is both open and closed relative to S. However, A is
neither an open nor a closed subset of E¹.

Homeomorphisms. Let f be a univalent function from a topological space
S into a topological space T. Then f has an inverse f⁻¹, whose domain is f(S).
For each q ∈ f(S) its value f⁻¹(q) is the unique p ∈ S such that f(p) = q.
If both f and its inverse f⁻¹ are continuous functions, then f is a homeomorphism.
If there is a homeomorphism from S onto T, then S and T are homeomorphic
topological spaces.
In topology, homeomorphic spaces S and T may be regarded as indistinguishable.
Every topological property enjoyed by S is also enjoyed by T.

PROBLEMS

1. Show that:
(a) In Example 3 any real-valued function continuous on S is constant.
(b) In Example 4 every function with domain S is continuous.
2. Let S be an open subset of S₀. Show that the relatively open sets are just those
open subsets of S₀ contained in S.
3. Consider the following nonstandard topology on the plane E²: In this topology
the “δ-neighborhood” of (x₀, y₀) is the set {(x, y) : x₀ − δ < x < x₀ + δ, y = y₀}.
(a) Verify that Axioms (1)–(4) are satisfied.
(b) Show that if D ⊂ E² is open in the usual sense, then D is open in this topology,
but not conversely.
(c) Let f(x, y) = g(x)h(y), where g and h have domain E¹ and g is continuous
in the usual topology of E¹. Show that f is continuous in this topology.
4. Let S₀ be a topological space, and let f be continuous on S₀. Let S ⊂ S₀ have the
relative topology. Show that f|S is continuous on S.
5. A metric space is a nonempty set S together with a real-valued function d with
domain the cartesian product S × S, such that:
(i) d(p, q) ≥ 0 for every p, q ∈ S, d(p, q) = 0 if and only if p = q;
(ii) d(p, q) = d(q, p) for every p, q ∈ S;
(iii) d(p, q) ≤ d(p, r) + d(r, q) for every p, q, r ∈ S.
Let the collection 𝒰_p consist of all sets {q : d(p, q) < δ}, where δ > 0.
(a) Verify Axioms (1)–(4) for a topological space.
(b) Show that {q : d(p, q) < δ} is open and that {q : d(p, q) ≤ δ} is closed.
6. A normed vector space is a vector space 𝒱 together with a real-valued function ‖ ‖
with domain 𝒱 such that: (a) ‖u‖ > 0 for every u ∈ 𝒱, u ≠ 0, (b) ‖cu‖ = |c| ‖u‖
for every c and u ∈ 𝒱, and (c) ‖u + v‖ ≤ ‖u‖ + ‖v‖ for every u, v ∈ 𝒱. Let
d(u, v) = ‖u − v‖. Verify the axioms (i), (ii), and (iii) for a metric space in
Problem 5. [Note: If 𝒱 = Eⁿ, then it was shown in Section 1-6 that every norm
on Eⁿ leads to the same topology. In Example 5 above ℱ is a normed vector space.
The norm is ‖f‖ = sup {|f(x)| : x ∈ [0, 1]}. The Lᵖ-spaces (Section 5-12) furnish
other examples of normed vector spaces.]

A-7 CONNECTED SPACES


From the intuitive point of view a set should be regarded as connected if
it consists of one piece. Thus an interval on the real line E¹ is connected, while
the set [0, 1] ∪ [2, 3] is disconnected. For more complicated sets, intuition
is not a reliable guide.

Definition. A topological space S is disconnected if there exist nonempty
open sets A and B such that S = A ∪ B and A ∩ B is empty. If S is
not disconnected, then S is a connected space.
A subset S of a topological space S₀ is connected if S is a connected space
in the relative topology.

Example 1. Let S = [0, 1] ∪ [2, 3]. Let A = [0, 1], B = [2, 3]. Then A = S ∩ (−1, 3/2),
which shows that A is open relative to S. Similarly, B is open relative to S. Since
S = A ∪ B and A ∩ B is empty, S is disconnected.

Definition. A nonempty set J ⊂ E¹ is an interval if for every x, y ∈ J,
x < y, the set [x, y] is contained in J (Fig. A-7).

Figure A-7

The intervals can be classified into 10 types (Problem 4).

Proposition A-9. A set S ⊂ E¹ is connected if and only if S is an interval.

Proof. If S is not an interval, then there exist x, y ∈ S, x < y, and z ∉ S
such that x < z < y. Let A = S ∩ (−∞, z), B = S ∩ (z, ∞). Then A
and B are nonempty and relatively open, A ∪ B = S, and A ∩ B is empty.
Therefore S is disconnected.
Conversely, suppose that some interval J is disconnected. Then J =
A ∪ B where A and B are not empty and open relative to J, and A ∩ B is
empty. Let x₁ ∈ A and x₂ ∈ B. The notation A, B may be chosen so that
x₁ < x₂. Since J is an interval, [x₁, x₂] ⊂ J; hence the fact that A is relatively
open implies that there exists δ₁ > 0 such that [x₁, x₁ + δ₁) ⊂ A. Similarly,
there exists δ₂ > 0 such that (x₂ − δ₂, x₂] ⊂ B. Letting B₁ = {x ∈ B : x > x₁}
and y = inf B₁, we have x₁ < y < x₂. Since J is an interval, y ∈ J. If
y ∈ A, then some interval (y − δ, y + δ) is contained in A and y + δ is a
lower bound for B₁, contrary to the fact that y is the greatest lower bound.
Similarly, if y ∈ B, then some interval (y − δ, y + δ) is contained in B₁,
and y is not a lower bound. This is a contradiction. ∎

Let f be a function from a topological space S into a topological space T.
In the definition of continuity in the last section we may as well assume that
T = f(S). For if V is a neighborhood of f(p₀), then f(U) ⊂ V if and only if
f(U) ⊂ V ∩ f(S). The sets V ∩ f(S) are just the relative neighborhoods of
f(p₀) in the topological subspace f(S).

Theorem A-4. If S is a connected space and f is continuous on S, then f(S)
is connected.

Proof. Suppose that f(S) is disconnected. Then f(S) = P ∪ Q where
P and Q are open relative to f(S), nonempty, and P ∩ Q is empty. By
Proposition A-6, taking T = f(S), the nonempty sets A = f⁻¹(P), B = f⁻¹(Q)
are open. Moreover, S = A ∪ B and A ∩ B is empty. Hence S is disconnected,
contrary to hypothesis. ∎

Corollary (Intermediate value theorem). If S is a connected space and f is
real valued and continuous on S, then f(S) is an interval.

Proof. By Proposition A-9 every connected subset of E¹ is an interval. ∎

*Pathwise connectedness. Let p and q be points of a topological space S.
A path in S from p to q is a continuous function g from [0, 1] into S with g(0) = p,
g(1) = q. If every such pair of points can be joined by a path in S, then S is
called pathwise connected.

Proposition A-10. If S is pathwise connected, then S is connected.

Proof. If S is disconnected, then S = A ∪ B as in the definition of
disconnected space. Let p ∈ A, q ∈ B, and g be a path in S joining p and q.
Since g is continuous, g⁻¹(A) and g⁻¹(B) are open relative to [0, 1], their
union is [0, 1], and their intersection is empty. This contradicts the fact that
[0, 1] is connected. ∎

Example 2. Let S = S₁ ∪ S₂, where

S₁ = {(0, y) : −1 ≤ y ≤ 1},
S₂ = {(x, sin (1/x)) : 0 < x ≤ 1}.

Then it can be shown that S is connected but not pathwise connected.

On the other hand, any open connected subset D of Eⁿ is pathwise connected.
In fact, any two points of D can be connected by a polygonal path (Problem 9).

PROBLEMS
1. Show from the definition that the following are disconnected subsets of the plane E²:
(a) The hyperbola x² − y² = 1.
(b) Any finite subset of E² with at least two elements.
(c) {(x, y) : x² < y²}.
2. Show that each of the following sets is pathwise connected.
(a) Any convex set (see Section 1-4).
(b) The unit circle x² + y² = 1 in E².
(c) The unit sphere x² + y² + z² = 1 in E³.
3. Show that a space S is disconnected if and only if S has a nonempty proper subset
A which is both open and closed. (Proper subset means A ≠ S.)
4. Show that:
(a) Each interval of the eight types described in Section A-1 is an interval
according to the definition in the present section.
(b) Every interval is either one of these types, a point, or E¹.
5. Let S be as in Example 2. Show that:
(a) S is a closed set.
(b) There is no path in S joining (0, 0) and any point of S₂.
(c) S is a connected set.
6. Instead of Axiom IIIa (least upper bound property) about the real numbers, take
as an axiom the property that E¹ is connected. Prove Axiom IIIa as a theorem.
[Hint: Let A = {all upper bounds of S} and B = Aᶜ. Show that B is open;
and if S has no least upper bound, then A is open.]
7. Let g₁ be a path from p₁ to p₂ and g₂ a path from p₂ to p₃. Let h(t) = g₁(2t)
if 0 ≤ t ≤ 1/2, and h(t) = g₂(2t − 1) if 1/2 ≤ t ≤ 1. Show that h is a path from
p₁ to p₃.
8. Let D ⊂ Eⁿ be open. By polygonal path in D from x to y let us mean a path g in
D from x to y with the following property: There exist t₀, t₁, ..., t_m such that
0 = t₀ < t₁ < ··· < t_{m−1} < t_m = 1 and g(t) = g(t_k) + (t − t_k)v_k for t_k ≤ t ≤ t_{k+1},
where v_k = (t_{k+1} − t_k)⁻¹[g(t_{k+1}) − g(t_k)], k = 0, 1, ..., m − 1.
Let g be a polygonal path in D from x to y, and U be any convex set (Section 1-4)
such that y ∈ U and U ⊂ D. Using Problem 7, find a polygonal path in D from
x to any point z ∈ U.
9. Let D ⊂ Eⁿ be open. Given x ∈ D, let A = {y : there is a polygonal path in D
from x to y} and let B = D − A. Using Problem 8, show that A and B are open
sets and A is not empty. [Hint: Any neighborhood is convex.] If B is not empty,
then D is disconnected.

A-8 COMPACT SPACES

Let us begin the discussion of compact spaces by considering subsets of Eⁿ.

Definition. A point x₀ is an accumulation point of a set A ⊂ Eⁿ if every
neighborhood of x₀ contains an infinite number of points of A.

For instance every point of the closed interval [a, b] is an accumulation
point of the open interval (a, b). The set {1, 1/2, 1/3, ...} has the single
accumulation point 0. The set of positive integers has no accumulation point.
We recall from Section A-4 that a set A ⊂ Eⁿ is called bounded if A has
finite diameter. It is plausible that if a bounded set has an infinite number
of points, then its points must accumulate somewhere. The truth of this is
expressed by the following.

Bolzano-Weierstrass theorem. Every bounded infinite subset of Eⁿ has at
least one accumulation point.

Proof. Let A be a bounded infinite set and I₁ be some closed n-cube
containing A. A closed n-cube has the form {x : |xⁱ − x₀ⁱ| ≤ a/2, i = 1, ..., n},
where x₀ is the center and a is the side length. Divide I₁ into m = 2ⁿ closed
congruent n-cubes I₁₁, ..., I₁ₘ as indicated in Fig. A-8. Since A is an infinite set,
A ∩ I₁ₖ must be infinite for at least one k = 1, ..., m. Choose some such k and
let I₁ₖ = I₂. In the same way divide I₂ into 2ⁿ closed congruent n-cubes
I₂₁, ..., I₂ₘ. As before, A ∩ I₂ₖ is infinite for at least one k. Choose such a k
and let I₂ₖ = I₃. Continuing, we obtain closed n-cubes I₁ ⊃ I₂ ⊃ ··· such that
A ∩ I_l is infinite for each l = 1, 2, ... and diam I_l → 0 as l → ∞. By
Theorem A-3, I₁ ∩ I₂ ∩ ··· has a single point x₀. If U is any neighborhood of x₀,
then I_l ⊂ U for large enough l. Since A ∩ I_l ⊂ A ∩ U, A ∩ U is an infinite set.
Therefore x₀ is an accumulation point of A. ∎

Figure A-8
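
The subdivision argument can be imitated numerically. The following Python sketch is an illustration only and makes two assumptions that are not in the text: it works in one dimension, and it replaces the infinite set A by a large finite sample of the set {1, 1/2, 1/3, ...}, so "infinitely many points" is replaced by "at least as many sample points as the other half". The function name bisect_accumulation is hypothetical.

```python
def bisect_accumulation(points, lo, hi, steps=30):
    """Repeatedly halve [lo, hi], keeping a half that holds the most sample points.

    This mimics the choice of the cubes I_1, I_2, ... in the proof, with
    "most sample points" standing in for "infinitely many points of A".
    """
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        left = [x for x in points if lo <= x <= mid]
        right = [x for x in points if mid <= x <= hi]
        if len(left) >= len(right):
            points, hi = left, mid
        else:
            points, lo = right, mid
    return lo, hi


# Finite sample of the set {1, 1/2, 1/3, ...}, whose only accumulation point is 0.
sample = [1.0 / k for k in range(1, 100001)]
lo, hi = bisect_accumulation(sample, 0.0, 1.0)
print(lo, hi)
```

With a finite sample the nested intervals cannot shrink exactly to the accumulation point 0; they only get close to it, since below the smallest sample point there is nothing left to count. The proof avoids this difficulty because an infinite set always leaves infinitely many points in at least one of the subcubes.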

Definition. A set S ⊂ Eⁿ is compact if every infinite set A ⊂ S has at least
one accumulation point x₀ ∈ S.

Theorem A-5. A subset S of Eⁿ is compact if and only if S is bounded and
closed.

Proof. Let S be bounded and closed, and let A be any infinite subset of S.
Since S is bounded, A is bounded. By the Bolzano-Weierstrass theorem A has
an accumulation point x₀. If x₀ ∉ S, then x₀ is exterior to S since S is closed.
Thus x₀ has a neighborhood U which does not intersect S. Since A ∩ U is
not empty, this is impossible. Therefore x₀ ∈ S, which shows that S is compact.
To prove the converse, suppose that S is unbounded. Then for each
m = 1, 2, ... there exists x_m ∈ S such that |x_m| ≥ m. The set A =
{x₁, x₂, ...} is infinite and has no accumulation point. Hence S is not compact.
If S is not closed, then there exists a point x₀ ∈ fr S − S. For m = 1, 2, ...
there exists x_m ∈ S such that |x_m − x₀| < 1/m. The set A = {x₁, x₂, ...}
is infinite and has the single accumulation point x₀. But x₀ ∉ S. Hence S is
not compact. ∎

Let us next give another description of compactness, in terms of open
coverings. Let S be a subset of a topological space S₀. A collection 𝔄 of sets
is a covering of S if every point of S belongs to some set A ∈ 𝔄, that is, if
S is contained in the union of the sets A ∈ 𝔄. If 𝔄′ ⊂ 𝔄 and 𝔄′ is also a
covering of S, then 𝔄′ is called a subcovering. If every A ∈ 𝔄 is open, then
𝔄 is an open covering of S.
In the following theorem S₀ = Eⁿ.

Heine-Borel theorem. If S is a compact subset of Eⁿ, then every open covering
of S contains a finite subcovering.

Proof. Let 𝔄 be an open covering of S. Suppose that no finite collection
𝔄′ ⊂ 𝔄 covers S. Let us define a sequence of compact sets S₁ ⊃ S₂ ⊃ ···
such that diam S_k → 0 as k → ∞ and no finite subcollection of 𝔄 covers any
S_k. Let S₁ = S. Since S₁ is bounded, some closed n-cube I₁ contains S₁.
Divide I₁ into n-cubes I₁₁, ..., I₁ₘ as in the proof of the Bolzano-Weierstrass
theorem, and let S₁ₖ = S₁ ∩ I₁ₖ. Since S₁ and I₁ₖ are closed, so is S₁ₖ. If
for every k = 1, ..., m some finite collection 𝔄ₖ ⊂ 𝔄 covered S₁ₖ, then
𝔄₁ ∪ ··· ∪ 𝔄ₘ would be a finite subcovering of S, contrary to assumption.
Choose some k for which no finite subcollection of 𝔄 covers S₁ₖ, and let
S₁ₖ = S₂. Repeating this process, we obtain the desired sequence of compact
sets.
By Theorem A-3, S₁ ∩ S₂ ∩ ··· contains a single point x₀. Since 𝔄
covers S, x₀ belongs to some set A ∈ 𝔄; and since 𝔄 is an open covering, A is
an open set. Therefore there is a neighborhood U of x₀ such that U ⊂ A.
Since diam S_k → 0 as k → ∞, S_k ⊂ U for large enough k. For such k, S_k is
covered by the subcollection of 𝔄 consisting of the single set A. Since by
construction no finite subcollection of 𝔄 covers any S_k, this is a contradiction. ∎

Example 1. Let D be open and S be a compact subset of D. Each x ∈ S has a
neighborhood U_x such that U_x ⊂ D. Let 𝔄 be the collection of these neighborhoods U_x.
Then there is a finite subset {x₁, ..., x_p} of S such that S ⊂ U_{x₁} ∪ ··· ∪ U_{x_p} ⊂ D.

The converse to the Heine-Borel theorem is true. To prove it, suppose that
S is not compact. Then either S is unbounded or S is not closed. If S is
unbounded, let A_m be the neighborhood of 0 of radius m = 1, 2, ... Then
{A₁, A₂, ...} is an open covering of S which has no finite subcovering. If S
is not closed, let x₀ ∈ fr S − S and A_m = {x : |x − x₀| > 1/m}. Then
{A₁, A₂, ...} is an open covering of S with no finite subcovering. This proves
the converse.
The definition of compactness in terms of accumulation points was the
first historically. The characterization in terms of open coverings has less
intuitive appeal but is more useful for proving theorems. Moreover, it is the
appropriate notion of compactness in general topological spaces.

Definition. A subset S of a topological space S₀ is compact if every open
covering of S contains a finite subcovering.

From the Heine-Borel theorem and its converse, if S₀ = Eⁿ this definition
is equivalent to the previous one.
If S₀, considered as a subset of itself, is compact, then S₀ is called a
compact topological space.

Theorem A-6. If S is a compact space and f is continuous on S, then f(S)
is compact.

Proof. Let 𝔅 be any open covering of f(S). Since f is continuous, f⁻¹(B)
is open for every B ∈ 𝔅. The collection of sets f⁻¹(B) is an open covering of S.
Since S is compact, a finite subcollection {f⁻¹(B₁), ..., f⁻¹(B_m)} covers S.
Then {B₁, ..., B_m} is a finite subcollection of 𝔅 which covers f(S). Hence
f(S) is compact. ∎

Example 2. Let A ⊂ Eⁿ be compact. Using the notation of Example 7 of Section A-6, the
set A(x₀′) is closed since its complement is open. Since A is bounded, A(x₀′) is bounded.
Therefore A(x₀′) is compact. Let p(x) = x′. The transformation p is called a
projection of Eⁿ onto Eˢ. Since p is continuous, the set p(A) = {x′ : (x′, x″) ∈ A for
some x″} is compact.

An important particular case of Theorem A-6 is obtained by taking
T ⊂ E¹ as follows.

Corollary. If S is a compact space and f is real valued and continuous on S,
then f has a maximum and a minimum value on S.

Proof. Any compact subset of E¹ has a least and a greatest element
(Problem 2). ∎

As an application of the corollary let us prove the following.

Mean value theorem. Let f be real valued and continuous on a closed interval
[a, b], and let the derivative f′(x) exist for every x ∈ (a, b). Then there exists
c ∈ (a, b) such that

f(b) − f(a) = f′(c)(b − a).

Proof. Let m = [f(b) − f(a)]/(b − a) and let F(x) = f(b) − f(x) −
m(b − x). Then F is continuous on [a, b] and F′(x) = −f′(x) + m for x ∈ (a, b).
Since [a, b] is compact, F has a maximum and a minimum value on [a, b]. If
the maximum value is positive, then since F(a) = F(b) = 0, the maximum
must occur at some x₁ ∈ (a, b). By elementary calculus F′(x₁) = 0 and we
may take c = x₁. Similarly, if the minimum value is negative we may take
c = x₂, where F(x₂) is the minimum value. If neither of these possibilities
occurs, then F(x) = 0 on [a, b] and c is arbitrary. ∎

The mean value theorem has the following generalization.

Taylor’s theorem with remainder. Let f together with its derivatives
f′, f″, ..., f^(q−1) be continuous on a closed interval [a, b] and let the qth-order
derivative f^(q)(x) exist for every x ∈ (a, b). Then there exists c ∈ (a, b) such
that

$$f(b) = f(a) + f'(a)(b - a) + \frac{f''(a)}{2!}(b - a)^2 + \cdots + \frac{f^{(q-1)}(a)}{(q-1)!}(b - a)^{q-1} + R_q,$$

where

$$R_q = \frac{f^{(q)}(c)}{q!}(b - a)^q.$$

Proof. Let

$$G(x) = f(b) - f(x) - f'(x)(b - x) - \frac{f''(x)}{2!}(b - x)^2 - \cdots - \frac{f^{(q-1)}(x)}{(q-1)!}(b - x)^{q-1} - \frac{K}{q!}(b - x)^q,$$

where the number K is so chosen that G(a) = 0. Then G(b) = 0 and, using
the product rule,

$$G'(x) = -\frac{f^{(q)}(x)}{(q-1)!}(b - x)^{q-1} + \frac{K}{(q-1)!}(b - x)^{q-1}.$$

Repeating the reasoning in the proof of the mean value theorem, there exists
c ∈ (a, b) such that G′(c) = 0. Then f^(q)(c) = K. ∎

The number R_q is called the remainder. If f has derivatives of every
order q and if R_q → 0 as q → ∞, then

$$f(b) = \sum_{q=0}^{\infty} \frac{f^{(q)}(a)}{q!}(b - a)^q.$$

This is called Taylor’s series about a. We have set f^(0) = f, 0! = 1. A sufficient
condition that f(b) be given by the Taylor series about a is that there be a
positive number M such that |f^(q)(x)| ≤ M^q for every x ∈ [a, b] and q = 1,
2, ... For if C = M(b − a), then

$$|R_q| \le \frac{C^q}{q!},$$

which tends to 0 as q → ∞.
We assumed that a < b, but the case a > b is similar.
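
The sufficient condition for convergence of the Taylor series can be checked numerically in a simple case. The Python sketch below is an illustration under stated assumptions, not part of the original text: it takes f = exp on [a, b] = [0, 1], for which every derivative is again exp, and uses M = e, so that |f^(q)(x)| ≤ e ≤ M^q for q ≥ 1. The function name taylor_partial_sum_exp is hypothetical.

```python
import math


def taylor_partial_sum_exp(a, b, q):
    """Sum of f^(j)(a)/j! * (b - a)^j for j = 0, ..., q - 1, with f = exp.

    For f = exp every derivative satisfies f^(j)(a) = exp(a).
    """
    return sum(math.exp(a) * (b - a) ** j / math.factorial(j) for j in range(q))


a, b = 0.0, 1.0
M = math.e          # assumed bound: |f^(q)(x)| = e^x <= e <= M^q on [0, 1], q >= 1
C = M * (b - a)

for q in range(1, 15):
    remainder = math.exp(b) - taylor_partial_sum_exp(a, b, q)   # this is R_q
    bound = C ** q / math.factorial(q)                          # C^q / q!
    print(q, remainder, bound)
```

In each printed row the remainder should stay below the bound C^q/q!, and both columns should shrink toward 0 as q grows. The bound is far from sharp in this example, since M^q grows much faster than the actual derivatives.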

PROBLEMS
1. A point x is an isolated point of A if there is a neighborhood U of x such that
A ∩ U = {x}. Show that every point of cl A is either an isolated point of A or
an accumulation point of A.
2. Let S be a compact subset of E¹. Show that inf S ∈ S and sup S ∈ S.
3. Show that S is a compact subset of S₀ if and only if S is a compact topological
space in the relative topology. [Hint: If 𝔄 is an open covering of S, then the sets
S ∩ A, A ∈ 𝔄, form a covering by relatively open sets. Conversely, every
covering by relatively open sets can be obtained from some such 𝔄.]
4. Let A₁, A₂, ... be nonempty compact subsets of Eⁿ such that A₁ ⊃ A₂ ⊃ ···
Show that the intersection of the sets A_m, m = 1, 2, ..., is not empty. [Hint: For
each m choose x_m ∈ A_m. Apply the Bolzano-Weierstrass theorem to the set
A = {x₁, x₂, ...}.]
5. Let A be a nonempty subset of Eⁿ, and let f(x) = inf {|x − y| : y ∈ A}. This
is the distance from x to A. Show that:
(a) f(x) = 0 if and only if x ∈ cl A.
(b) |f(x₁) − f(x₂)| ≤ |x₁ − x₂| for every x₁, x₂ ∈ Eⁿ; and consequently f is a
continuous function on Eⁿ. [Hint: Triangle inequality.]
(c) If A is closed and x ∉ A, then there is a point y ∈ A nearest x. [Hint: Let
g(y) = |x − y|. Apply the corollary to Theorem A-6 to show that g has a
minimum on S = A ∩ K_r, where K_r = {y : |x − y| ≤ r} and r > f(x).]
6. Uniform continuity. A transformation f is uniformly continuous on S ⊂ Eⁿ if given
ε > 0 there exists δ > 0 (depending only on ε) such that |f(x) − f(y)| < ε for
every x, y ∈ S with |x − y| < δ. Show that if S is compact then every f
continuous on S is uniformly continuous on S. [Hint: If not, then there exists ε > 0
and for m = 1, 2, ... x_m, y_m ∈ S such that |f(x_m) − f(y_m)| ≥ ε and |x_m − y_m| < 1/m.
Let x₀ be an accumulation point of {x₁, x₂, ...}. Show that the continuity of f at
x₀ is contradicted.]
7. Let A′ ⊂ Eˢ and A″ ⊂ Eⁿ⁻ˢ be compact. Show that A′ × A″ is compact. [Note:
This is a very special case of Tykhonov’s theorem, which states that the cartesian
product of compact topological spaces is compact. See [16], p. 175.]
8. Let S₀ be a topological space.
(a) Show that if S₀ is compact, then any closed set B ⊂ S₀ is compact. [Hint:
Let 𝔄 be any open covering of B. To the collection 𝔄 add the open set S₀ − B.]
(b) Suppose that S₀ has the following property: (H). For every p, q ∈ S₀, p ≠ q, there
exist a neighborhood U of p and a neighborhood V of q such that U ∩ V is
empty. Show that any compact set S ⊂ S₀ is closed. [Note: A topological
space with property (H) is called a Hausdorff space.]
(c) Show that any metric space (Problem 5, Section A-6) is a Hausdorff space.
(d) Let f be continuous and univalent from a compact space S onto a Hausdorff
space T. Show that f⁻¹ is continuous from T onto S. [Hint: Let B ⊂ S be
closed. Show that (f⁻¹)⁻¹(B) is closed and use Proposition A-6.]

A-9 REVIEW OF RIEMANN INTEGRATION


Let f be real valued and continuous on an interval [a, b]. Then f has an
integral over [a, b], denoted by $\int_a^b f(t)\,dt$. According to Riemann’s definition
of the integral, it is the limit of sums:

$$\int_a^b f(t)\,dt = \lim_{\mu \to 0} \sum_{j=1}^{m} f(s_j)(t_j - t_{j-1}),$$

where

$$a = t_0 < t_1 < \cdots < t_m = b, \qquad t_{j-1} \le s_j \le t_j,$$

and

$$\mu = \max\{t_1 - t_0,\ t_2 - t_1,\ \ldots,\ t_m - t_{m-1}\}.$$

More generally, the Riemann integral exists for any bounded function
with a finite number of discontinuities. It agrees with the integral in Lebesgue’s
sense, which is defined in Chapter 5 for a much wider class of functions.
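
As a small numerical illustration, not part of the original text, the Python sketch below evaluates Riemann sums for one particular choice: a uniform partition of [a, b] into m subintervals, with each sample point taken at the midpoint of its subinterval. The function name riemann_sum, the midpoint rule, and the test integrand t² are assumptions made for this example; Riemann's definition allows any partition and any sample points.

```python
def riemann_sum(f, a, b, m):
    """Riemann sum of f over [a, b] for a uniform partition into m subintervals.

    The j-th subinterval is [a + (j - 1) h, a + j h] with h = (b - a) / m,
    and the sample point is its midpoint a + (j - 0.5) h.
    """
    h = (b - a) / m
    return sum(f(a + (j - 0.5) * h) * h for j in range(1, m + 1))


# The integral of t^2 over [0, 1] equals 1/3.
for m in (10, 100, 1000):
    print(m, riemann_sum(lambda t: t * t, 0.0, 1.0, m))
```

As m grows, the largest subinterval length 1/m tends to 0 and the printed sums approach 1/3, as the definition of the integral requires.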

Fundamental theorem of calculus. Let f be continuous on [a, b] and let

$$F(t) = \int_a^t f(s)\,ds, \qquad a \le t \le b.$$

Then F′(t) = f(t) for every t ∈ [a, b].

Proof. By elementary properties of the integral, if h > 0 and t + h ≤ b,
then

$$\frac{F(t + h) - F(t)}{h} = \frac{1}{h}\int_t^{t+h} f(s)\,ds.$$

Since f is continuous, given ε > 0 there exists δ > 0 such that f(t) − ε <
f(s) < f(t) + ε whenever |s − t| < δ. Then if h < δ,

$$h[f(t) - \varepsilon] \le \int_t^{t+h} f(s)\,ds \le h[f(t) + \varepsilon],$$

$$f(t) - \varepsilon \le \frac{F(t + h) - F(t)}{h} \le f(t) + \varepsilon.$$

Hence

$$f(t) = \lim_{h \to 0^+} \frac{F(t + h) - F(t)}{h}.$$

The right-hand side is the right-hand derivative of F at t. Similarly, f(t) equals
the left-hand derivative of F at t. ∎

In the theorem, F’(a) means the right-hand derivative and F’(b) means
the left-hand derivative.
The fundamental theorem says that F is an antiderivative of f. If G is
any antiderivative of f, then G′(t) − F′(t) = 0 for every t ∈ [a, b], and by
the mean value theorem G(t) − F(t) is constant on [a, b]. Thus G(t) − F(t) =
G(a) − F(a) = G(a), and upon setting t = b we obtain

$$G(b) - G(a) = \int_a^b f(s)\,ds.$$

Change of variables in integrals. Let φ be any real-valued function
possessing a continuous derivative on some closed interval [α, β] such that

φ′(τ) > 0 for every τ ∈ [α, β], φ(α) = a, φ(β) = b.

Then a ≤ φ(τ) ≤ b for every τ ∈ [α, β].
If U = F ∘ φ, then

U′(τ) = F′[φ(τ)]φ′(τ) = f[φ(τ)]φ′(τ)

for every τ ∈ [α, β]. Since

F(a) = U(α) = 0, F(b) = U(β),

$$\int_a^b f(t)\,dt = \int_\alpha^\beta f[\phi(\tau)]\,\phi'(\tau)\,d\tau.$$

This is the formula for change of variables in integrals. If φ′(τ) < 0 for every
τ ∈ [α, β], then φ(α) > φ(β). The same formula holds if we agree that

$$\int_a^b f(t)\,dt = -\int_b^a f(t)\,dt \quad \text{when } a > b.$$

A-10 MONOTONE FUNCTIONS

Let f be real valued with domain S ⊂ E¹.

Definition. If for every x, y ∈ S such that x < y,

f(x) < f(y), then f is increasing.
f(x) ≤ f(y), then f is nondecreasing.
f(x) > f(y), then f is decreasing.
f(x) ≥ f(y), then f is nonincreasing.

If f is either an increasing function or a decreasing function, then f is
called strictly monotone. If f is either nondecreasing or nonincreasing, then f is
monotone. If the restriction of f to A is monotone, then f is monotone on A.
A function f is univalent if f(x) ≠ f(y) whenever x ≠ y. Clearly, any
strictly monotone function is univalent. If S is an interval, then conversely
any continuous univalent function must be strictly monotone. This can be
proved from the intermediate value theorem.
Let A be an interval and assume that the derivative f′(x) exists for every
x ∈ A. It is proved in elementary calculus that:
(a) f is nondecreasing on A if and only if f′(x) ≥ 0 for every x ∈ A.
(b) If f′(x) > 0 except at a finite number of points of A, then f is increasing
on A.
If f is increasing, then f is univalent and consequently has an inverse f⁻¹.
The derivative of the inverse is given by

(f⁻¹)′(t) = 1/f′(x), (*)

if t = f(x) and f′(x) ≠ 0. For proofs of these facts see, for instance, [8].
Among the examples of strictly monotone functions from calculus are
the exponential function exp, whose inverse is log. The restriction to [−π/2, π/2]
of the function sin is strictly monotone. Its inverse is denoted by sin⁻¹.
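
The rule that the derivative of an inverse function is the reciprocal of the derivative of the function can be checked numerically for the restriction of sin to [−π/2, π/2]. The Python sketch below is an illustration only: the point x = 0.4, the step size, and the helper name central_difference are assumptions made for this example, and the derivative of the inverse is only approximated by a difference quotient.

```python
import math


def central_difference(g, t, h=1e-6):
    """Approximate g'(t) by the symmetric difference quotient with step h."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


x = 0.4                    # a point of (-pi/2, pi/2), where f'(x) = cos x is nonzero
t = math.sin(x)            # t = f(x) with f = sin

inverse_derivative = central_difference(math.asin, t)   # approximates (f^{-1})'(t)
reciprocal = 1.0 / math.cos(x)                           # 1 / f'(x)

print(inverse_derivative, reciprocal)
```

The two printed numbers should agree to several decimal places. At the endpoints x = ±π/2 the comparison breaks down, since there f′(x) = 0 and the rule does not apply.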
Limits at ±∞. We say that f(x) → y₀ as x → +∞ if for every ε > 0
there exists b such that |f(x) − y₀| < ε for every x > b. By the same proof
as for Theorem A-1, if f is monotone and bounded on a semi-infinite interval
[a, ∞) then f has a limit as x → +∞. Similarly, if f is monotone and bounded
on (−∞, a], then f has a limit as x → −∞.

PROBLEMS
1. Let us say that f(x) → +∞ as x → +∞ if for every C > 0 there exists b such that
f(x) > C for every x > b. Show that if f is nondecreasing and unbounded on
[a, ∞), then f(x) → +∞ as x → +∞.
2. (a) Give a precise definition, similar to that in Problem 1, for “x_m → +∞ as
m → ∞.”
(b) Let f have domain [a, ∞). Show that f(x) → y₀ as x → +∞ if and only if
f(x_m) → y₀ for every nondecreasing sequence [x_m] such that x_m ≥ a for
m = 1, 2, ... and x_m → +∞ as m → ∞.
Historical Notes

Chapter 1. The ideas of space of n-tuples and n-dimensional geometry go
back at least to the middle of the nineteenth century. Many of the early
contributions to the theory of convex sets were made by H. Minkowski around
1900. In particular, he defined supporting hyperplanes and gave one of the
first proofs of Theorem 1. Convex functions were introduced by J. Jensen
(1906). Further historical background about convexity may be found in
Inequalities, by G. H. Hardy, J. E. Littlewood, and G. Pólya, Cambridge Univ.
Press, 1934, and in Theorie der Konvexen Körper, by T. Bonnesen and W. Fenchel,
Springer, Berlin, 1934.
Chapters 2, 3, 4. The formal rules of differential calculus for functions of
several variables were practically all known by the early nineteenth century.
Developments of the nineteenth century showed clearly the necessity in calculus
of admitting a quite general notion of function. It is not enough to consider
merely those functions “defined analytically” [for example, the elementary
functions and those obtained implicitly from elementary functions by solving
equations of the type F(x, y) = 0]. Precise statements of the rules of several
variable calculus and sound proofs of them came considerably later.
The definition of differentiable function was first given by W. H. Young
(1908) and by M. Fréchet (1911). Afterward Fréchet developed the idea of
differential of a function f from a normed vector space 𝒱 into a normed vector
space 𝒲. The differential (often called today the Fréchet differential) of f
assigns at each p ∈ 𝒱 a linear function from 𝒱 into 𝒲. In Chapter 4 we have
taken 𝒱 and 𝒲 to be euclidean vector spaces, and in Chapter 2 we took 𝒲 = E¹.
Some economy of thought is gained by developing differential calculus from the
start for general 𝒱 and 𝒲; this is done, for instance, in [6].
The determinant bearing his name was introduced by C. Jacobi in 1841.
However, for a long time afterward the inverse and implicit function theorems
were stated in an imprecise way, which often led those applying them to
overlook the local character of these theorems. The false result that a transformation
with nonzero Jacobian is globally one-to-one has been too often quoted by
mathematicians, both pure and applied.

Chapter 5. In an article on representation of functions by trigonometric series


(1854), B. Riemann defined the integral of a function f over an interval [a, 6]
as a limit of sums. Upper and lower Riemann integrals were introduced by
J. Darboux (1875). The theory of finitely additive measure, usually called today
Jordan content (or Jordan measure), was discovered around 1890 by G. Peano
and by C. Jordan. The crucial importance of requiring that measure be count-
ably additive was realized by E. Borel (1898). Soon afterward (1902), H.
Lebesgue’s thesis appeared, which decisively changed the course of integration
theory. The L^p-spaces were introduced by F. Riesz (1910), who was one of the
pioneers in the development of functional analysis (this includes the study of
infinite-dimensional normed vector spaces).
For a more detailed historical account of integration theory, see N. Bourbaki,
Éléments d’Histoire des Mathématiques, Hermann, Paris, 1960.
Chapters 6, 7. The algebra of multivectors (exterior algebra) was invented by
H. Grassmann (1862). For many years his work was not properly appreciated.
Exterior differential forms were introduced by H. Poincaré and by E. Cartan
(about 1900). Poincaré used differential forms in his theory of integral invariants
in mechanics, while Cartan first applied them to Pfaffian systems of differential
equations. Since that time, exterior differential forms have found many uses in
differential geometry, topology, and mathematical physics. See, for instance, [9].
The adjoint operation * for differential forms on a riemannian manifold is due
to W. Hodge (see his book, The Theory and Application of Harmonic Integrals,
Cambridge Univ. Press, 1941). As we have defined it, *w may differ in sign from
Hodge’s definition.
For further historical information about exterior algebra and calculus see the
Note Historique at the end of reference [4], or pp. 78-91 of Éléments d’Histoire
des Mathématiques cited above.
The classical formulas 7-12, 7-11b to which the divergence theorem reduces
when n = 2, 3, were employed in the theory of potential by G. Green (1828)
and C. Gauss (1839). Despite the name Green’s theorem when n = 2, this
formula actually appeared earlier in the works of Gauss and Lagrange.
Some authors call “converse of Poincaré’s lemma” what we have called (according
to rather common practice) Poincaré’s lemma. The result in question
was actually first proved by V. Volterra. See [17, p. 98].
References

1. T. M. Apostol, Mathematical Analysis. Reading, Mass.: Addison-Wesley
(1957).
2. G. Birkhoff and S. MacLane, A Survey of Modern Algebra. New York:
Macmillan (1953).
3. M. Bôcher, Introduction to Higher Algebra. New York: Macmillan (1929).
4. N. Bourbaki, Éléments de Mathématique, Livre II Algèbre, Chapitre 3,
Algèbre Multilinéaire, Actualités Scientifiques et Industrielles, No. 1044.
Hermann, Paris (1958).
5. E. A. Coddington and N. Levinson, Theory of Ordinary Differential
Equations. New York: McGraw-Hill (1955).
6. J. Dieudonné, Foundations of Modern Analysis. New York: Academic
Press (1960).
7. H. G. Eggleston, Convexity. New York: Cambridge University Press
(1958).
8. H. Federer and B. Jónsson, Analytic Geometry and Calculus. New York:
Ronald (1961).
9. H. Flanders, Differential Forms with Applications to the Physical Sciences.
New York: Academic Press (1963).
10. D. Gale, The Theory of Linear Economic Models. New York: McGraw-Hill
(1960).
11. P. R. Halmos, Naive Set Theory. Princeton: D. Van Nostrand (1960).
12. K. Hoffman and R. Kunze, Linear Algebra. Englewood Cliffs, N. J.:
Prentice-Hall (1961).
13. S. Karlin, Mathematical Methods and Theory in Games, Programming, and
Economics. Reading, Mass.: Addison-Wesley (1959).
14. O. D. Kellogg, Foundations of Potential Theory. Berlin: Springer (1929).
15. E. J. McShane and T. A. Botts, Real Analysis. Princeton: D. Van
Nostrand (1959).
16. B. Mendelson, Introduction to Topology. Boston: Allyn and Bacon (1962).
17. G. de Rham, Variétés Différentiables, Actualités Scientifiques et Industrielles,
No. 1222, Hermann, Paris (1955).
18. F. M. Stewart, Introduction to Linear Algebra. Princeton: D. Van Nostrand
(1963).
19. A. Taylor, Introduction to Functional Analysis. New York: Wiley (1958).
20. E. C. Titchmarsh, The Theory of Functions. London: Oxford University
Press (1939).
21. H. Whitney, Geometric Integration Theory. Princeton: Princeton University
Press (1957).
22. T. J. Willmore, An Introduction to Differential Geometry. London: Oxford
University Press (1959).
Answers to Problems

Section 1-1

1. 4e_1 − 2e_2 + e_3 + 3e_4, −2e_1 − e_3 + e_4, √30, √6, √6, √12, 6
7. v_4 = ±(√2/10)(4e_1 + 3e_2 − 3e_3 + 4e_4)
Section 1-3

1. {x:a2!1+
22+ 473 = 1}
Pe Ge Xoo eon oro te = Ob) t= 4s
Section 1-4

BO PP ad
9. (b) The barycenter is at the intersection of the line segments which join the
vertices with the barycenters of the opposite (r − 1)-dimensional faces.
Section 1-5

1. (a) Concave on E^1 (b) Convex on E^1
(c) Concave on (−∞, −1] and on (0, 1), convex on (−1, 0] and on (1, ∞)
(d) Concave on (−∞, −1] and on [1, ∞), convex on [−1, 1]
2. (b) If f(x) = a_0 x^4 + a_1 x^3 + a_2 x^2 + a_3 x + a_4, then a_0 > 0, 24 a_0 a_2 > 9 a_1^2.
Section 1-6

1. (b) n-cubes
2. (a) ||(x, y)|| = [x^2 + xy + 4y^2]^{1/2} (b) 2
Section 2-1

1. (a) f_1(x, y) = 1 + log (xy), f_2(x, y) = x/y
(b) f_1(x, y, z) = 6x(x^2 + 2y^2 + z)^2, f_2(x, y, z) = 12y(x^2 + 2y^2 + z)^2,
f_3(x, y, z) = 3(x^2 + 2y^2 + z)^2
(c) f_i(x) = 2x^i
2. The derivative in direction (cos θ, sin θ) is −2(cos θ + sin θ).
5. The derivative is 0 in those directions for which v^1 + v^2 + v^3 = 0. There is no
derivative in other directions.

Section 2-2

1. It has the equation 4 + 5y +2-+ 4 = 0.


5) (a) 4/5" (b) I/v/2e (c) 0
3. (a) [22(w?+ 2y+ 1)—-! + cos («?)Je1 + 2(x? + 2y + 1) te?
(b) 0.09
4, (a) Xo (b) xi*x (c) 2(x0


* X)Xo
SDN Gry le — yea 7 OY) eet m0)
(c) The union of the sets in 5(b) and the x-axis, with (0, 0) excluded since f is not
differentiable there
(d) {(x, y) : x = y(c^{-1} ± √(c^{-2} − 1)), x ≠ 0} if |c| < 1, c ≠ 0; the x- and
y-axes if c = 0

Section 2-3

1. wyz = —z-+ (y+ lz (x 124 @ Uy = lz


Cmong12(0,0) = ——1 7510, 0) =a
— 1 — 1
(ae ); « ) DG Onn> g
0] i = |

Section 2-4

1. (a) Neither (b) Concave, not strictly
(c) Convex if p < 0 or p > 1. Concave if 0 < p < 1. Not strictly
(d) Strictly convex
(e) Neither. However, f is strictly convex on each half of {(x, y) : 2xy < −1}.
3. (a) a = 1 (b) a^2 satisfies the equation cot a^2 = 2a^2, a^2 < π/2.

Section 2-5

1. (a) Maximum at $e1 (b) Saddle point at —e; + 2e2


(c) Maximum at each point where xy = π/2 + 2mπ; minimum at each point
where xy = −π/2 + 2mπ, m any integer. Saddle point at 0
(d) Saddle points at 0 and at e_1
2. (a) Maximum at —e,; + eg, saddle point at 4(—e, + e2)
(b) Saddle points at −e_2 and at e_1 + e_2
(c) Saddle points at me, m any integer
5. x = (1/m) (x1 + ---+ xX). The minimum value is
1 m m™m 2

pap Pa ayes
i= k=1
8.8)
65 @) 2,0 (b) sn 3, —sing

Section 2-6

ES i Ne be OP ite
3. (a) f(z, y) = 422y +c (b) Not exact (c) Not exact
(d) f(x, y) = x/y − y/x + φ(x, y), where φ is constant on each of the four quadrants
into which the coordinate axes divide E^2.

Section 3-1

ly = 2@ — v2)
2 l= eG) se le = Be)
3. /2(—e1 + eo)

Section 3-2

1. (a) Simple closed curve (b) Neither (c) Simple arc


. (b) 3 [(4 + 9b)3/2 — 8]
. G(s) = gls/V2], 0< s < 2V2n
—e 1 — 2eo, e; — 2eg, and any scalar multiples of these tangent vectors
o.
bd
WwW
PR (b) No, since g’(0) = 0.

Section 3-3

1. (a) 4ac (b) πab


2. (a) 44 (ce) 2, (d) —2
5. f(z, y, 2) = 46(p?), where (u) = JSoy(v) dv
bs 2076 8. (a) mes (b) 2/203+ 20/20

Section 4-1

i (a) Lt; © )

(b) {t:|t] >1+¢+1)'%; ife > 0,


{t:|.|t] — 1] > (+1473 if -—1 <c<0; Hife < —1
2. (a) The parabola x = c^2 + y^2/4c^2 if c ≠ 0; the positive x-axis if c = 0
(b) The lines t = s(k ± √(k^2 − 1)) if k = 1/m, m^2 < 1; the line t = s if m = 1;
the line t = −s if m = −1; {(0, 0)} if m^2 > 1
@ni@, y) €Q:a < a7}
Sry) wa, yee 2 Ony = OF
(a) The parts of the lines x + y = 2|c|, y = x + 2|c|, y + 2|c| = x in g(E^2)
(b) The union of the lines s + t = m(s − t) and s + t = −m(s − t)
nme a Vacs 2aseae Oy) 70}
4. (b) The part of the cone between the plane z = 0 and the vertex
(c) The s-axis; {(m, 1) : m any integer}
Or- (a) g(A) = {@,27):2 > 3
(b) If c > 2, the part of the ellipse s^2 + st + t^2 = 1/c in A; if c < 2, the empty
set

Section 4-2

i ( : ; 2)- The rank is 2. The kernel consists of all scalar multiples of

e1 — lleg + 2e3.

3. (a) The diagonal elements are c^1, ..., c^n; all other elements are 0.
(b) (L^{-1})^i(x) = (c^i)^{-1} x^i provided c^i ≠ 0 for every i
5. (a) Reflection in the line s = t
(b) They are rotations through angle 3π/2, π/2 respectively.
6. L is not a rotation.
8. (b) If g(t) = L(t) + xo, then L must be nonsingular.

Section 4-3

1. (a) Except where s^2 = t^2; everywhere in E^2; everywhere in A
(b) g_1 = e_1 + e_2, g_2 = −e_1 + e_2 in Q_1 = {(s, t) : s − t > 0, s + t > 0},
g_1 = −e_1 + e_2, g_2 = e_1 + e_2 in Q_2 = {(s, t) : s − t < 0, s + t > 0},
with similar expressions in the other two of the four quadrants Q_3, Q_4 into
which the lines s = ±t divide E^2.
g_1 = −2πt(sin 2πs)e_1 + 2πt(cos 2πs)e_2,
g_2 = (cos 2πs)e_1 + (sin 2πs)e_2 − e_3;
g_1 = −(s^2 + st + t^2)^{-2}(2s + t)[e_1 − 2(s^2 + st + t^2)^{-1} e_2],
g_2 = −(s^2 + st + t^2)^{-2}(s + 2t)[e_1 − 2(s^2 + st + t^2)^{-1} e_2].
(c) 2; 2 unless t = 0, and 1 if t = 0; 1
(d) 2 in Q_1, Q_3, −2 in Q_2, Q_4; not applicable; 0

Section 4-4

1. F_12 = x f_12 + xy f_22 + f_2, the partial derivatives of f being evaluated at (x, xy)
2. F_1 = f_1 + f_3 g_1, F_2 = f_2 + f_3 g_2,
F_11 = f_11 + 2 f_13 g_1 + f_33 (g_1)^2 + f_3 g_11,
F_12 = f_12 + f_13 g_2 + f_32 g_1 + f_33 g_1 g_2 + f_3 g_12,
F_22 = f_22 + 2 f_23 g_2 + f_33 (g_2)^2 + f_3 g_22
3. (a) 162 (b) —¢’(3)¢'(3)
Ay A See
Section 4-5

1. (a) Yes, g(E") = E", g(x) = x — x0


(b) Yes, g(B?) = B®. g(x,y) = $e + 2yer + (@ — yea]
(c) No, g(#?) is the half-plane 42 > —9.
(d) Yes, g(A) = A, g is not univalent since g(—s, —t) = g(s, t).
(e) Yes, g(A) = {(z,y):0<y < Sexp (—2)}, g7"(z, y) = [a+ bd)e1+ (a — d)eal,
where a = (y—! + 2e*)"/2,6 = (y—1! — 2e*)1/2,
2 Giant) eal al) tel 18 re AO
—1 0 1
3. Its matrix is ® i @
i © @
4. If 0 < cos c < 1, the image of the line t = c is the right half of the hyperbola
x^2/cos^2 c − y^2/sin^2 c = 1; and if −1 < cos c < 0 it is the left half. If cos c = 0,
it is the y-axis; if cos c = 1, the right half of the x-axis; if cos c = −1, the left half.
5. (b) (g|A)^{-1}(x, y) = log R(x, y) e_1 + Θ(x, y) e_2, where R(x, y) = (x^2 + y^2)^{1/2}, Θ(x, y)
is the angle from the positive x-axis to (x, y).
(c) g(E^2) = E^2 − {(0, 0)}
Section 4-6

1. g' = −Φ_1/Φ_2, g'' = −[(Φ_2)^2 Φ_11 − 2Φ_1 Φ_2 Φ_12 + (Φ_1)^2 Φ_22]/(Φ_2)^3


2. dir = —[(B2)?h11 — 261 2by2 + (61)?b22]/(6)3
3.¢1 = —l, ¥1 = 0, gd2 = 0, Po = 1 at (—1,1)
4. (b) Radius 3√2/2 (c) Radius √21

Section 4-7

1. {(x, y) : F(x, y) = c} is an ellipse if log c > 2, is the one point set {(0, 0)} if
log c = 2, and is empty if log c < 2. Any ellipse is a 1-manifold.
2. (a) The cone is not a 2-manifold.
(b) 2@ — 2) =" + 1) + 4¢@—1) =
7. No

Section 4-8

ih,
2. |clife < 4, Vc— fife> 3
3. V14/3
9
Section 5-1

mah lea eee (2 est VY UZ) 18) VY1) Z) us


3. Vi U Ie) = 48, Vilin Ie) = $
an 1/m
4. (e — 1) exp ein)

Section 5-2

5. 239/240.

Section 5-3

1. (a) Unbounded; (−∞, 0]
(b) Bounded; E^2
(c) Bounded; E^2
(d) Beaded {(x, y)2y aC) erealy/ lesa}
Same

Section 5-5

1. (a) 2, (4, 2 (b) 1+ 1/2, (%, 0) where F = 2/(6-+ 37)


Zia Se
i ,2 (ae
aoe ate 1a
3. dx ie
—2 Vv4_— x Ve2422 ome!

4, 8
51
- 75
6. (b) (e — 1)? (Ce 2a oT

Section 5-6

1. (a) Exists (b) Exists if p < 1; divergent if p > 1


(c) Exists (d) Exists
(e) Exists if p < 1, p + q > 1; divergent otherwise
5. (a) 0 (b) a (c) 0

Section 5-7

fem? 27
Section 5-8
3 1 : ae :
il. i io) Ghd — i;fig()\(2 — 2t) dt, provided either integral exists.
2 0
2, 2 log 2
3. 48
Section 5-9

ll, 2a Ms Tis te By a
m [2 co in 1 (rcos6@,
r sin @) ¥
4, I do i 8 o rar flr cos 0, r sin 8, 2] dz
0

5. 2
Ge = ihe ak Oper i

10. (a) an
AV) @)/T Ce (b) P(H)1(4)/T@)
a+1 fae as
(c) Soleszs) (a) ey T(e
+ 1)
fe) he 1) /Pa@aek=- 1)

Gear. a‘ >}Ure i

Section 5-11

=) 1+ exp (rt)
1. (a) 2tan (1/2) (b) Le ae

8. $(2) = (Wx/2) exp (—2"/4)


i. ‘a2 exp (t*) — (1 + 1/log t) exp (? log” t)]

Section 6-1

es hy SA UN
2. —1

Section 6-2
1. el A e234 = e3 A e124 = 1234 92 A e134 = ef A e123 — 91234

2. (a) 6e!2 + 2el3 — e283 (b) 0


(c) —3e123 (d) 4e!23
Gy (a) —el2345 (b) el235 _s el234 s= el3s45

Section 6-3

1. (@) —e2345 (b) 0 (c) —e146


(d) 2eg
+ e2 — 2e1’, where’ = (1,...,4—1,¢+1,...,6)
(e) 2€1.--6

Li Ginee) ar - Gra
3. Negative 4. Yes 5. No 6. 3/2 We NiH

Section 6-4

1. (a) ale = L* (a), en bi = G1 — Ge oh 2a3, bo = —2a, + 3a3


(b) ci} = —2, ciB = 7, cB = —3
(c) If @ = cee, then Lo(8) = evi A vo = c(—2e12 + 7e13 — 3e23), where vy
and v2 are the column vectors.
(d) lf w = wel? + w 3e13 += wo3e79, then L3(w) = (—2w12 + 7013 — 3w23)e!?.
(e) 0

Section 6-5

1. (a) day dx A dy (b) 2ay sin (xy?) dx A dy A dz


(c) —fodx A dz (@)) Bake IN Gh IN We
ise Peieieers
ee) Sade eA de= Ads
i
A =. Ade n
=f

1 - SpA. 5 é 5
(b) 7 a (—1)**17"*B"* where \y = Cs Sa ee oa)

5. (a) Oif ris odd; —2dw A d¢ if r is even


(b) 0
70 w f = Meed :
gdg + Negdg?.
gdg.
D (dw)”
(dw)
1 = fe
@
AN
Ht)° Bas
Bes ds/\ dt

8. (a) fogsds A dt A du (b) scostsintds A du+


s? cos? tdt A du
Section 6-7

3. e_1 × e_2 = e_3, e_1 × e_3 = −e_2, e_2 × e_3 = e_1

Section 7-1

1. (a) 4√3; the triangle with vertices 2e_3, e_1 + e_2, e_1 − 3e_2 + 4e_3
(b) p sin a; a solid cone from which points (x, 0, z), 0 < x < z, are deleted
(c) [1 + s^2 + t^2]^{1/2}; the hyperboloid x = yz
2. (a) Ty(t) = {k:k!s+ k7t-+ k3u = O},
Tu(x) = fh:h!x/a? + h2y/b? + h3z/c? = 0}
(b) Jg(t) = [a2b2u2 + a2c7¢2 + b2¢252]1/2

3. g is not an open transformation.


Section 7-2

1. A = {(@, y):|z| < 1, lz] < y < ( — 2”)/2},


A = {(a,2):|2| < 1,2 > OR0, 2 < 3 — 2? — Qlz|},
PG) menren-piyes 4-0 2-7 — 2y)/#e3,
g(v,2) = rer + $(3 — 2? — z?)eo + 203,
CG) (G)(8 — 2?" 2y) 1!)
Dah (MV je=at(eit) =t < exp s}
Section 7-3

iB (eat |
- rane 2 ;
pe=21s=
— = (sin¢ cos d)e12 + (sin ¢sin A)e31 + (sin ¢ cos A)e23
dp

4. (a) 82/3 ~— (b) 3/6


6. (b) 44° rire
Section 7-4

2. (a) —8r _— (b) r(e*# — 1) >


3.74

a) o(x) = [(— sin s)e; + (cos s)e2] A [(— sin t)e3 + (cos t)ea]

Section 7-5

oars1
3. w is not of class C@ on el D.

4. (a) 5 (b) -3

Section 7-6

he 9 2. (a) 0 (b) V2(1 — e7?) (c) 0

Section A-1

il, @) B&B, 2 (b) V2, no lower bound


(c) 10% <1 (d) 0, ie: (e) Is 5

Section A-2

JerOne basis tor Uas (iva a2). 0".

Section A-3

Poe oes OecuiX = Xole< eo, o\X: |X[e=20 OF 0b 4X:|X) <2 0h


(b) Empty set, A, A
(c) A, the union of the half-lines y = 0, y = a+ 1,24 > —1, {@,y):0<y <
z+i1,z2 > —l}
(d) A, the union of the circle x? + y? = 1 and the line segment joining (0, 0) and
(1, 0), the closed circular disk x? + y? < 1
(e) Empty set, E?, E?
(f) Empty set, A, A
(g) Empty set, AU {0}, AU {0}
2. Open in (c), (d); closed in (b), (f)

Section A-4

il, ) © (b) No limit (c) 0 (d) e? (e) 1


2. (a) (−4, 0) (b) No limit (c) (1, 0)
Section A-5

1. (a) 4 (b) No limit (c) 4 (d) (1, 5) = e_1 + 5e_2 (e) No limit


Continuity: In (a), (b), (e) except at (0, 0). In (c) except at 0. If we set f(0) =
3, then f is also continuous at 0. In (d) at every point of FE}
6. (a) 0 (b) No limit (c) No limit (d) 1
Index

Absolute curvature, 109 Characteristic values and vectors, of a


Accumulation point, 308 linear transformation, 131
Adjoint, 232 of a matrix, 131
Affine transformation, 96 Closed differential forms, 69, 228
Alternating multilinear function, 207 Closed sets, 301
Analytic function (real), 51 Closure, 288
Angle between two vectors, 4 Codifferential, 235
Archimedean property, 284 Compact space, 309
Complete metric space, 202
Banach space, 202 Components, 5, 9, 211, 215
Composite function theorem, 105
Barycenter, 22
Concave function, 25
Barycentric coordinates, 20
Conformal transformation, 105
Beta function, 182
Connected space, 306
Bolzano-Weierstrass theorem, 309
Continuous functions, 302
Boundary, 288
Continuous transformations, 299
Bounded set, 293
Contravariant vectors, 10
Bounded transformation, 298
Convergence theorems, 184
Convergent sequences, 291
Cantor’s: Convex:
- discontinuum, 147 combinations, 18, 20
function, 147 function, 23
theorem, 294 hull, 28
Cartesian product, 7 polytope, 15
Cauchy’s: set, 14
convergence criterion, 292 Coordinate:
inequality, 3 changes, 248
principal value, 169 patches, 248
sequences, 292 systems, in #”, 180
Center of mass, 161 on manifolds, 247
Centroids, 85, 161, 256 Coordinates, barycentric, 20
Chain rule for transformations, 106 cylindrical, 181
Change of variables in Riemann spherical, 181
integrals, 315 Countable additivity of measure, 143
Characteristic function, 152 Covariant vector, 10
Covector, 9, 11 Fatou’s lemma, 192


Coverings, 309 Field, 283
Critical point, 60 Figure, 137
Cube, n-dimensional, 23 Flat transformation, 240
Curves in E”, related notions, 75 Fluid flow, 84
Cylindrical coordinates, 181 Frame, 218
Frontier point, 288
Function, analytic (real), 51
De Rham’s theorem, 281 beta, 182
Derivative, in a direction, 36 Canlore 147
of vector-valued functions, 72 ae
; characteristic, 152
Diameter of a set, 293 (q)
: : of class C™, 44, 46, 49
Diffeomorphism, 115, 242 (c0)
: E of class C™, 51
Differentiable:
3 concave, 25
function, 38, 42 ;
: continuous, 302
transformation, 100
Differential: CE Bt
f 67 295 differentiable, 38, 42
rine! gamma, 168, 181
us 9 OIE oe; ee : harmonic, 108, 268
Differentiation, under integral sign, 197
homogeneous, 44
Directional derivative, 35 ;
Directions in E”, 35 ae ey ae
Dirichlet problem, 268 :
: measurable, 153
Distance, 4, 29
: monotone, 315
Divergence, 233 Si
Divergence theorem, 262, 264, 270 BuppOrernl47
1 2 : :
BEESON IIE Functions, related notions, 7
aes : ‘ Fundamental theorem of calculus, 314
of a linear transformation, 96
norm, 32
space of £”, 11 Gamma function, 168, 181
vector space, 286 Gauss’ theorem, 264
Gradient, 43
Eaelideant method, 86
palaces Gram-Schmidt orthogonalization
distance, 4
: pees process, 6
nan , Grassman algebra, 214
eee Greatest lower bound, g. 1. b., 284
norm, 3
Green’s:
space HE”, 1, 288
formulas, 267
Euler’s formula, 44
theorem, 264
Exact differential form, 68, 228, 276
Extension of functions, 50 Pee
Exterior:
algebra, 214 Half-space, 10
points, 288 Harmonic function, 108, 268
Extrema, relative, 60 Hausdorff space, 313
Extreme points, 64 Heine-Borel theorem, 310
Hessian determinant, 63 Least upper bound, I. u. b., 284


Hilbert space, 204 Lebesgue dominated convergence
Hölder’s inequality, 134 theorem, 194
Homeomorphism, 115, 305 Leibnitz rule, 200
Homogeneous function, 44 Length, 3
Homotopies, 276 of a curve, 76
Hyperplane, 10 Level sets, 63
Limit of a sequence, 291
Implicit function theorem, 117 Limit of a transformation, 295
Improper integral (= the unbounded Line, 13
case), 163 integral, 80
Infinite series, 295 segment, 13
Inner measure, 140 Linear:
Inner product, 3 dependence, 286
Inner product space, 204 function, 9
Integrable function, 149 independence, 286
Integrals, change of variables in, 315 transformation, 92
differentiation under sign of, 197 Lorentz transformation, 110
iterated, 155 L?-spaces, 200
line, 80
over bounded sets, 151 Manifolds, 122
over EH”, 147 Matrix, of a linear transformation, 93
over manifolds, 239 Maximum, 60
over measurable sets, 194 Maxwell’s equations, 238
of r-forms, 259 Mean value theorem, 155, 311
Riemann, 150, 313 Measurable:
transformations of, 173 function, 153
Interior points, 288 set, 141
Intersection, of manifolds, 127 Measure, 139
Interval, 136 and integration on manifolds, 250
Inverse function theorem, 111 of an n-parallelepiped, 172, 220
Inversion of order of partial derivatives, of an n-simplex, 158, 172, 220
47 outer and inner, 140
Isometries, 97 of unbounded sets, 145
Isomorphisms, 287 of the unit n-ball, 162, 183
Iterated integrals, 155 Metric space, 305
Minimum, 60
Minkowski’s inequality, 34, 201
Jacobian, of a transformation, 101 Möbius strip, 258
Jordan measurable set, 155 Moment of inertia, 85
Moments, 160, 256
Klein bottle, 258 Monotone, function, 315
Kronecker symbol, 5, 12 sequence, 292
generalized, 208 sequences theorem, 190
Multicovector, 210
Lagrange multiplier rule, 129 Multilinear function, 206
Laplacean, 108 Multivector, 214
n-dimensional interval, 136 Saddle point, 63


Neighborhoods, 300 Scalar, 2
Nondegenerate critical points, 63 product, 10, 12
Norm, 28 Seminorm, 34
dual, 32 Sequences, 291
Normal vector to a manifold, 125 Series, 295
Normed vector space, 305 Sets, related notions, 6
Null sets, 145 Simplex, 20
Simply connected set, 278
Spherical coordinates, 181
Open set, 289, 301 Step function, 147
Open transformation, 245 Stereographic projection, 249
Orientations, 36, 219 Stokes’ theorem, 273
for manifolds, 257 Support, of a function, 147
Orthogonal: Supporting hyperplane, 16
transformation, 97 Surface of revolution, 257
vectors, 5
Orthogonalization process of
Gram-Schmidt, 6 Tangent plane to a manifold, 126
Orthonormal bases, 5
Tangent vector, to a curve, 73
Ostrogradsky’s theorem, 264
to a manifold, 124
Outer measure, 140
Taylor’s:
formula, 49
Pappus’ theorem, 257 theorem, 311
Partial derivative, of a function, 37 Tensor:
of a transformation, 101 algebra, 214
Partition of unity, 253 field, 231
Pathwise connectedness, 307 Topological:
Poincaré’s lemma, 280
Point, 2 subspace, 304
of accumulation, 308 Topology of #”; related notions, 7
critical, 60 Transformation, affine, 96
Polygonal path, 308 bounded, 298
Polytope, convex, 15 chain rule for, 106
Principal normal vector, 109 of class C, 102
Product manifolds, 128 conformal, 105
continuous, 299
differentiable, 100
Quadratic form, 55
flat, 240
of integrals, 173
Real numbers, 283 Jacobian of a, 101
Regular transformations, 115, 241 linear, 92
Relative: characteristic values and
extrema, 60 vectors of a, 131
topology, 301 dual of a, 96
Riemann integral, 150, 313 matrix of a, 93
Rotations, 99 open, 245
orthogonal, 97 Uniform continuity, 313


partial derivative of a, 101
regular, 115, 241
Vector, 2
Transformations; related notions, 89
field, 10
Translations, 97
space, 285
Triangle inequality, 3

Unbounded case in measure and Wave equation, 109


integration, 163 Work, 84
