Math 107 Fall 2011 Notes

This document outlines the topics that will be covered in a course on vector calculus. It begins by motivating the generalization of the Fundamental Theorem of Calculus to higher dimensions. This will involve studying n-dimensional Euclidean space, differentiable manifolds, vector-valued functions like paths and vector fields, and differential forms, which allow integration to be defined on manifolds. The document provides an overview of the key concepts and theorems that will be discussed to build the mathematical foundations for a multidimensional generalization of the Fundamental Theorem of Calculus.

Vector Calculus

Lecture Notes

Adolfo J. Rumbos

© Draft date November 23, 2011
Contents

1 Motivation for the course 5

2 Euclidean Space 7
2.1 Definition of 𝑛–Dimensional Euclidean Space . . . . . . . . . . . 7
2.2 Spans, Lines and Planes . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Dot Product and Euclidean Norm . . . . . . . . . . . . . . . . . 11
2.4 Orthogonality and Projections . . . . . . . . . . . . . . . . . . . 13
2.5 The Cross Product in ℝ3 . . . . . . . . . . . . . . . . . . . . . . 18
2.5.1 Defining the Cross–Product . . . . . . . . . . . . . . . . . 20
2.5.2 Triple Scalar Product . . . . . . . . . . . . . . . . . . . . 24

3 Functions 27
3.1 Types of Functions in Euclidean Space . . . . . . . . . . . . . . . 27
3.2 Open Subsets of Euclidean Space . . . . . . . . . . . . . . . . . . 28
3.3 Continuous Functions . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.1 Images and Pre–Images . . . . . . . . . . . . . . . . . . . 35
3.3.2 An alternate definition of continuity . . . . . . . . . . . . 35
3.3.3 Compositions of Continuous Functions . . . . . . . . . . . 37
3.3.4 Limits and Continuity . . . . . . . . . . . . . . . . . . . . 38

4 Differentiability 41
4.1 Definition of Differentiability . . . . . . . . . . . . . . . . . . . . 42
4.2 The Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3 Example: Differentiable Scalar Fields . . . . . . . . . . . . . . . . 44
4.4 Example: Differentiable Paths . . . . . . . . . . . . . . . . . . . . 49
4.5 Sufficient Condition for Differentiability . . . . . . . . . . . . . . 51
4.5.1 Differentiability of Paths . . . . . . . . . . . . . . . . . . . 51
4.5.2 Differentiability of Scalar Fields . . . . . . . . . . . . . . . 53
4.5.3 𝐶 1 Maps and Differentiability . . . . . . . . . . . . . . . . 55
4.6 Derivatives of Compositions . . . . . . . . . . . . . . . . . . . . . 56

5 Integration 61
5.1 Path Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.1.1 Arc Length . . . . . . . . . . . . . . . . . . . . . . . . . . 64


5.1.2 Defining the Path Integral . . . . . . . . . . . . . . . . . . 67


5.2 Line Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3 Gradient Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.4 Flux Across Plane Curves . . . . . . . . . . . . . . . . . . . . . . 73
5.5 Differential Forms . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.6 Calculus of Differential Forms . . . . . . . . . . . . . . . . . . . . 89
5.7 Evaluating 2–forms: Double Integrals . . . . . . . . . . . . . . . . 91
5.8 Fundamental Theorem of Calculus in ℝ2 . . . . . . . . . . . . . . 96
5.9 Changing Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 99

A Mean Value Theorem 105

B Reparametrizations 107
Chapter 1

Motivation for the course

We start with the statement of the Fundamental Theorem of Calculus (FTC)


in one–dimension:

Theorem 1.0.1 (Fundamental Theorem of Calculus). Let 𝑓 : 𝐼 → ℝ denote a
continuous¹ function defined on an open interval, 𝐼, which contains the closed
interval [𝑎, 𝑏], where 𝑎, 𝑏 ∈ ℝ with 𝑎 < 𝑏. Suppose that there exists a
differentiable² function 𝐹 : 𝐼 → ℝ such that

𝐹′(𝑥) = 𝑓(𝑥) for all 𝑥 ∈ 𝐼.

Then

∫ₐᵇ 𝑓(𝑥) d𝑥 = 𝐹(𝑏) − 𝐹(𝑎). (1.1)

The main goal of this course is to extend this result to higher dimensions. In
order to indicate how we intend to do so, we first re-write the integral in (1.1)
as follows:
First denote the interval [𝑎, 𝑏] by 𝑀 ; then, its boundary, denoted by ∂𝑀 , consists
of the end–points 𝑎 and 𝑏 of the interval; thus,

∂𝑀 = {𝑎, 𝑏}.

Since 𝐹′ = 𝑓, the expression 𝑓(𝑥)d𝑥 is 𝐹′(𝑥)d𝑥, or the differential of 𝐹, denoted
by d𝐹. We therefore may write the integral in (1.1) as

∫ₐᵇ 𝑓(𝑥) d𝑥 = ∫_𝑀 d𝐹.

¹Recall that a function 𝑓 : 𝐼 → ℝ is continuous at 𝑐 ∈ 𝐼 if (i) 𝑓(𝑐) is defined, (ii) lim_{𝑥→𝑐} 𝑓(𝑥) exists, and (iii) lim_{𝑥→𝑐} 𝑓(𝑥) = 𝑓(𝑐).
²Recall that a function 𝑓 : 𝐼 → ℝ is differentiable at 𝑐 ∈ 𝐼 if lim_{𝑥→𝑐} (𝑓(𝑥) − 𝑓(𝑐))/(𝑥 − 𝑐) exists.


The reason for doing this change in notation is so that later on we can talk
about integrals over regions 𝑀 in Euclidean space, and not just integrals over
intervals. Thus, the concept of the integral will also have to be expanded. To
see how this might come about, we discuss briefly how the right–hand side of
the expression in (1.1) might also be expressed as an integral.
Rewrite the right–hand side of (1.1) as the sum

(−1)𝐹 (𝑎) + (+1)𝐹 (𝑏);

thus, we are adding the values of the function 𝐹 on the boundary of 𝑀 taking
into account the convention that, as we do the integration on the left–hand side
of (1.1), we go from left to right along the interval [𝑎, 𝑏]; hence, as we integrate,
“we leave 𝑎” (this explains the −1 in front of 𝐹(𝑎)) and “we enter 𝑏” (hence the
+1 in front of 𝐹(𝑏)). Since integration of a function is, in some sense, the sum
of its values over a certain region, we are therefore led to suggesting that the
right–hand side in (1.1) may be written as:

∫_{∂𝑀} 𝐹.

Thus the result of the Fundamental Theorem of Calculus in equation (1.1) may
now be written in a more general form as
∫_𝑀 d𝐹 = ∫_{∂𝑀} 𝐹. (1.2)

This is known as the Generalized Stokes’ Theorem and a precise statement of this
theorem will be given later in the course. It says that under certain conditions
on the sets 𝑀 and ∂𝑀 , and the “integrands,” also to be made precise later
in this course, integrating the “differential” of “something” over some “set,” is
the same as integrating that “something” over the boundary of the set. Before
we get to the stage at which we can state and prove this generalized form of
the Fundamental Theorem of Calculus, we will need to introduce concepts and
theory that will make the terms “something,” “set” and “integration on sets”
make sense. This will motivate the topics that we will discuss in this course.
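Before introducing the general machinery, the one–dimensional statement (1.1) can at least be checked numerically for a concrete pair 𝑓 and 𝐹. The sketch below (in Python, with the illustrative choice 𝑓(𝑥) = 𝑥², 𝐹(𝑥) = 𝑥³/3 on [0, 1]; the helper name `riemann_midpoint` is ours, not from the notes) compares a midpoint Riemann sum of 𝑓 with 𝐹(𝑏) − 𝐹(𝑎).

```python
# Numerically verify the FTC for f(x) = x^2, F(x) = x^3/3 on [a, b] = [0, 1]:
# a midpoint Riemann sum of f should approach F(b) - F(a) = 1/3.

def riemann_midpoint(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

f = lambda x: x ** 2          # the integrand
F = lambda x: x ** 3 / 3      # an antiderivative of f

lhs = riemann_midpoint(f, 0.0, 1.0)   # approximates the integral of f over [0, 1]
rhs = F(1.0) - F(0.0)                 # F(b) - F(a) = 1/3

print(abs(lhs - rhs) < 1e-6)          # True: the two sides of (1.1) agree
```

With 10,000 subintervals the two sides agree to well within 10⁻⁶.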
Here is a broad outline of what we will be studying.

∙ The sets 𝑀 and ∂𝑀 are instances of what is known as differentiable man-


ifolds. In this course, they will be subsets of 𝑛–dimensional Euclidean
space satisfying certain properties that will allow us to define integration
and differentiation on them.
∙ The manifolds 𝑀 and ∂𝑀 live in 𝑛–dimensional Euclidean space and
therefore we will be spending some time studying the essential properties
of Euclidean space.
∙ The generalization of the integrands 𝐹 and d𝐹 will lead to the study of
vector valued functions (paths and vector fields) and differential forms.
Chapter 2

Euclidean Space

2.1 Definition of 𝑛–Dimensional Euclidean Space


Euclidean space of dimension 𝑛, denoted by ℝ𝑛, is the vector space of column
vectors with real entries of the form

⎛ 𝑥1 ⎞
⎜ 𝑥2 ⎟
⎜ ⋮  ⎟ .
⎝ 𝑥𝑛 ⎠

Remark 2.1.1. In the text, elements of ℝ𝑛 are denoted by row–vectors; in the


lectures and homework assignments, we will use column vectors. The convention
that I will try to follow in the lectures is that if we are interested in locating a
point in space, we will use a row vector; for instance, a point 𝑃 in ℝ𝑛 will be
indicated by 𝑃 (𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ), where 𝑥1 , 𝑥2 , . . . , 𝑥𝑛 are the coordinates of the
point. Vectors in ℝ𝑛 can also be used to locate points; for instance, the point
𝑃 (𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ) is located by the vector
      ⎛ 𝑥1 ⎞
−−→   ⎜ 𝑥2 ⎟
𝑂𝑃 =  ⎜ ⋮  ⎟ ,
      ⎝ 𝑥𝑛 ⎠

where 𝑂 denotes the origin, or zero vector, in 𝑛–dimensional Euclidean space. In


−−→
this case, we picture 𝑂𝑃 as a directed line segment (“an arrow”) starting at 𝑂
−−→
and ending at 𝑃 . On the other hand, the vector 𝑂𝑃 can also be used to indicate
the direction of the line segment and its length; in this case, the directed line
segment can be drawn as emanating from any point. The direction and length
of the segment are what matter in the latter case.
As a vector space, ℝ𝑛 is endowed with the algebraic operations of


∙ Vector Addition

  Given vectors

      ⎛ 𝑥1 ⎞            ⎛ 𝑦1 ⎞
      ⎜ 𝑥2 ⎟            ⎜ 𝑦2 ⎟
  𝑣 = ⎜ ⋮  ⎟  and  𝑤 = ⎜ ⋮  ⎟ ,
      ⎝ 𝑥𝑛 ⎠            ⎝ 𝑦𝑛 ⎠

  the vector sum of 𝑣 and 𝑤 is

          ⎛ 𝑥1 + 𝑦1 ⎞
          ⎜ 𝑥2 + 𝑦2 ⎟
  𝑣 + 𝑤 = ⎜    ⋮    ⎟
          ⎝ 𝑥𝑛 + 𝑦𝑛 ⎠

∙ Scalar Multiplication

  Given a real number 𝑡, also called a scalar, and a vector

      ⎛ 𝑥1 ⎞
      ⎜ 𝑥2 ⎟
  𝑣 = ⎜ ⋮  ⎟ ,
      ⎝ 𝑥𝑛 ⎠

  the scaling of 𝑣 by 𝑡, denoted by 𝑡𝑣, is given by

       ⎛ 𝑡𝑥1 ⎞
       ⎜ 𝑡𝑥2 ⎟
  𝑡𝑣 = ⎜  ⋮  ⎟
       ⎝ 𝑡𝑥𝑛 ⎠

Remark 2.1.2. In some texts, vectors are denoted with an arrow over the
symbol for the vector; for instance, 𝑣⃗, 𝑟⃗, etc. In the text that we are using
this semester, vectors are denoted in bold face type, v, r, etc. For the most part,
we will do away with arrows over symbols and bold face type in these notes,
lectures, and homework assignments. The context will make clear whether a
given symbol represents a point, a number, a vector, or a matrix.

2.2 Spans, Lines and Planes


The span of a single vector 𝑣 in ℝ𝑛 is the set of all scalar multiples of 𝑣:

span{𝑣} = {𝑡𝑣 ∣ 𝑡 ∈ ℝ}.

Geometrically, if 𝑣 is not the zero vector in ℝ𝑛 , span{𝑣} is the line through the
origin on ℝ𝑛 in the direction of the vector 𝑣.
If 𝑃 is a point in ℝ𝑛 and 𝑣 is a non–zero vector also in ℝ𝑛 , then the line
through 𝑃 in the direction of 𝑣 is the set
−−→ −−→
𝑂𝑃 + span{𝑣} = {𝑂𝑃 + 𝑡𝑣 ∣ 𝑡 ∈ ℝ}.
Example 2.2.1 (Parametric Equations of a line in ℝ3). Let

    ⎛ 2 ⎞
𝑣 = ⎜−3 ⎟
    ⎝ 1 ⎠

be a vector in ℝ3 and 𝑃 the point with coordinates (1, 0, −1). Find the line through
𝑃 in the direction of 𝑣.

Solution: The line through 𝑃 in the direction of 𝑣 is the set of points (𝑥, 𝑦, 𝑧) in ℝ3 with

⎛ 𝑥 ⎞   ⎛ 1 ⎞     ⎛ 2 ⎞
⎜ 𝑦 ⎟ = ⎜ 0 ⎟ + 𝑡 ⎜−3 ⎟ ,  𝑡 ∈ ℝ,
⎝ 𝑧 ⎠   ⎝−1 ⎠     ⎝ 1 ⎠

or, equivalently,

⎛ 𝑥 ⎞   ⎛ 1 + 2𝑡 ⎞
⎜ 𝑦 ⎟ = ⎜  −3𝑡   ⎟ ,  𝑡 ∈ ℝ.
⎝ 𝑧 ⎠   ⎝ −1 + 𝑡 ⎠

Thus, for a point (𝑥, 𝑦, 𝑧) to be on the line, 𝑥, 𝑦 and 𝑧 must satisfy
the equations

⎧ 𝑥 = 1 + 2𝑡;
⎨ 𝑦 = −3𝑡;
⎩ 𝑧 = −1 + 𝑡,

for some 𝑡 ∈ ℝ. These are known as the parametric equations of
the line. The variable 𝑡 is known as a parameter. □
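The parametric equations of Example 2.2.1 are easy to evaluate mechanically; a minimal Python sketch (the helper name `line_point` is ours):

```python
# Generate points on the line of Example 2.2.1: x = 1 + 2t, y = -3t, z = -1 + t,
# i.e. the line through P(1, 0, -1) in the direction v = (2, -3, 1).

def line_point(t, P=(1, 0, -1), v=(2, -3, 1)):
    """Return the point OP + t*v on the line, as a tuple of coordinates."""
    return tuple(p + t * a for p, a in zip(P, v))

print(line_point(0))   # (1, 0, -1): the base point P, at t = 0
print(line_point(1))   # (3, -3, 0)
print(line_point(-1))  # (-1, 3, -2)
```

Each value of the parameter 𝑡 produces one point of the line.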

In general, the parametric equations of a line through 𝑃(𝑏1, 𝑏2, . . . , 𝑏𝑛) in the
direction of a vector 𝑣 in ℝ𝑛 with components 𝑎1, 𝑎2, . . . , 𝑎𝑛 are

⎧ 𝑥1 = 𝑏1 + 𝑎1𝑡
⎨ 𝑥2 = 𝑏2 + 𝑎2𝑡
⎪     ⋮
⎩ 𝑥𝑛 = 𝑏𝑛 + 𝑎𝑛𝑡

In some cases we are interested in the directed line segment from a point
𝑃1 (𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ) to a point 𝑃2 (𝑦1 , 𝑦2 , . . . , 𝑦𝑛 ) in ℝ𝑛 . We will denote this set
by [𝑃1 𝑃2 ]; so that,
−−→ −−−→
[𝑃1 𝑃2 ] = {𝑂𝑃1 + 𝑡𝑃1 𝑃2 ∣ 0 ⩽ 𝑡 ⩽ 1}.

The span of two linearly independent vectors, 𝑣1 and 𝑣2 , in ℝ𝑛 is a two–


dimensional subspace of ℝ𝑛 . In three–dimensional Euclidean space, ℝ3 , span{𝑣1 , 𝑣2 }

is a plane through the origin containing the points located by the vectors 𝑣1 and
𝑣2 .
If 𝑃 is a point in ℝ3 , the plane through 𝑃 generated by the linearly inde-
pendent vectors 𝑣1 and 𝑣2 , also in ℝ3 , is given by
−−→ −−→
𝑂𝑃 + span{𝑣1 , 𝑣2 } = {𝑂𝑃 + 𝑡𝑣1 + 𝑠𝑣2 ∣ 𝑡, 𝑠 ∈ ℝ}.
Example 2.2.2 (Equations of planes in ℝ3). Let

     ⎛ 2 ⎞            ⎛ 6 ⎞
𝑣1 = ⎜−3 ⎟  and  𝑣2 = ⎜ 2 ⎟
     ⎝ 1 ⎠            ⎝−3 ⎠

be vectors in ℝ3 and 𝑃 the point with coordinates (1, 0, −1). Give the equation
of the plane through 𝑃 spanned by the vectors 𝑣1 and 𝑣2.

Solution: The plane through 𝑃 spanned by the vectors 𝑣1 and 𝑣2
is the set of points (𝑥, 𝑦, 𝑧) in ℝ3 with

⎛ 𝑥 ⎞   ⎛ 1 ⎞     ⎛ 2 ⎞     ⎛ 6 ⎞
⎜ 𝑦 ⎟ = ⎜ 0 ⎟ + 𝑡 ⎜−3 ⎟ + 𝑠 ⎜ 2 ⎟ ,  𝑡, 𝑠 ∈ ℝ.
⎝ 𝑧 ⎠   ⎝−1 ⎠     ⎝ 1 ⎠     ⎝−3 ⎠

This leads to the parametric equations

⎧ 𝑥 = 1 + 2𝑡 + 6𝑠
⎨ 𝑦 = −3𝑡 + 2𝑠
⎩ 𝑧 = −1 + 𝑡 − 3𝑠.

We can write this set of parametric equations as a single equation
involving only 𝑥, 𝑦 and 𝑧. We do this by first solving the system

⎧  2𝑡 + 6𝑠 = 𝑥 − 1
⎨ −3𝑡 + 2𝑠 = 𝑦
⎩   𝑡 − 3𝑠 = 𝑧 + 1

for 𝑡 and 𝑠.
Using Gaussian elimination, we can determine conditions on 𝑥,
𝑦 and 𝑧 that will allow us to solve for 𝑡 and 𝑠:

⎛  2   6 ∣ 𝑥 − 1 ⎞     ⎛  1   3 ∣ (𝑥 − 1)/2 ⎞
⎜ −3   2 ∣ 𝑦     ⎟  →  ⎜ −3   2 ∣ 𝑦         ⎟  →
⎝  1  −3 ∣ 𝑧 + 1 ⎠     ⎝  1  −3 ∣ 𝑧 + 1     ⎠

⎛ 1   3 ∣ (𝑥 − 1)/2               ⎞     ⎛ 1   3 ∣ (𝑥 − 1)/2                     ⎞
⎜ 0  11 ∣ (3/2)(𝑥 − 1) + 𝑦        ⎟  →  ⎜ 0   1 ∣ (3/22)(𝑥 − 1) + (1/11)𝑦       ⎟  →
⎝ 0  −6 ∣ −(1/2)(𝑥 − 1) + (𝑧 + 1) ⎠     ⎝ 0  −1 ∣ −(1/12)(𝑥 − 1) + (1/6)(𝑧 + 1) ⎠

⎛ 1   3 ∣ (𝑥 − 1)/2                               ⎞
⎜ 0   1 ∣ (3/22)(𝑥 − 1) + (1/11)𝑦                 ⎟
⎝ 0   0 ∣ (7/132)(𝑥 − 1) + (1/11)𝑦 + (1/6)(𝑧 + 1) ⎠
2.3. DOT PRODUCT AND EUCLIDEAN NORM 11

Thus, for the system to be solvable for 𝑡 and 𝑠, the third row must
be a row of zeros. We therefore get the equation

(7/132)(𝑥 − 1) + (1/11)𝑦 + (1/6)(𝑧 + 1) = 0,
or
7(𝑥 − 1) + 12(𝑦 − 0) + 22(𝑧 + 1) = 0.
This is the equation of the plane. □
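The plane equation just derived can be spot-checked: 𝑃 itself and the points reached from 𝑃 along 𝑣1 and 𝑣2 must all satisfy it. A small Python sketch (the helper name `plane` is ours):

```python
# Spot-check the plane equation 7(x - 1) + 12*y + 22*(z + 1) = 0 from Example 2.2.2:
# P itself and the points P + v1, P + v2 must all satisfy it.

def plane(x, y, z):
    """Left-hand side of the plane equation; 0 exactly on the plane."""
    return 7 * (x - 1) + 12 * y + 22 * (z + 1)

P  = (1, 0, -1)
v1 = (2, -3, 1)
v2 = (6, 2, -3)

print(plane(*P))                                   # 0
print(plane(*(p + a for p, a in zip(P, v1))))      # 0: P + v1 is on the plane
print(plane(*(p + a for p, a in zip(P, v2))))      # 0: P + v2 is on the plane
```

Any point 𝑃 + 𝑡𝑣1 + 𝑠𝑣2 of the plane makes the left-hand side vanish.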

In general, the equation

𝑎(𝑥 − 𝑥𝑜 ) + 𝑏(𝑦 − 𝑦𝑜 ) + 𝑐(𝑧 − 𝑧𝑜 ) = 0

represents a plane in ℝ3 through the point 𝑃 (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ). We will see in a later


section that 𝑎, 𝑏 and 𝑐 are the components of a vector perpendicular to the
plane.

2.3 Dot Product and Euclidean Norm



Definition 2.3.1. Given vectors

    ⎛ 𝑥1 ⎞            ⎛ 𝑦1 ⎞
    ⎜ 𝑥2 ⎟            ⎜ 𝑦2 ⎟
𝑣 = ⎜ ⋮  ⎟  and  𝑤 = ⎜ ⋮  ⎟ ,
    ⎝ 𝑥𝑛 ⎠            ⎝ 𝑦𝑛 ⎠

the inner product, or dot product, of 𝑣 and 𝑤 is the real number (or scalar), denoted by
𝑣 ⋅ 𝑤, obtained as follows:

𝑣 ⋅ 𝑤 = 𝑣ᵀ𝑤 = 𝑥1𝑦1 + 𝑥2𝑦2 + ⋯ + 𝑥𝑛𝑦𝑛.

The superscript 𝑇 in the above definition indicates that the column vector
𝑣 has been transposed into a row vector.
The inner or dot product defined above satisfies the following properties
which can be easily checked:

(i) Symmetry: 𝑣 ⋅ 𝑤 = 𝑤 ⋅ 𝑣

(ii) Bi-Linearity: (𝑐1 𝑣1 + 𝑐2 𝑣2 ) ⋅ 𝑤 = 𝑐1 𝑣1 ⋅ 𝑤 + 𝑐2 𝑣2 ⋅ 𝑤, for scalars 𝑐1 and 𝑐2 ;


and

(iii) Positive Definiteness: 𝑣 ⋅ 𝑣 ⩾ 0 for all 𝑣 ∈ ℝ𝑛 and 𝑣 ⋅ 𝑣 = 0 if and only if 𝑣


is the zero vector.
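The three properties above are easy to confirm on sample vectors; a minimal Python sketch (the helper name `dot` is ours):

```python
# Check symmetry, bilinearity and positive definiteness of the dot product
# on a few sample vectors.

def dot(v, w):
    """Dot product of two vectors given as equal-length tuples."""
    return sum(x * y for x, y in zip(v, w))

v, w, u = (1, 2, 3), (4, -5, 6), (0, 1, -2)

print(dot(v, w) == dot(w, v))                  # True: symmetry
lhs = dot(tuple(2 * a + 3 * b for a, b in zip(v, u)), w)
print(lhs == 2 * dot(v, w) + 3 * dot(u, w))    # True: bilinearity in the first slot
print(dot(v, v) > 0)                           # True: positive definiteness for v != 0
```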

Given an inner product in a vector space, we can define a norm as follows.



Definition 2.3.2 (Euclidean Norm in ℝ𝑛). For any vector 𝑣 ∈ ℝ𝑛, its Euclidean
norm, denoted ∥𝑣∥, is defined by

∥𝑣∥ = √(𝑣 ⋅ 𝑣).
Observe that, by the positive definiteness of the inner product, this definition
makes sense. Note also that we have defined the norm of a vector to be the
positive square root of the inner product of the vector with itself. Thus, the
norm of any vector is always non–negative.
If 𝑃 is a point in ℝ𝑛 with coordinates (𝑥1, 𝑥2, . . . , 𝑥𝑛), the norm of the vector
𝑂𝑃 that goes from the origin to 𝑃 is the distance from 𝑃 to the origin; that is,

dist(𝑂, 𝑃) = ∥𝑂𝑃∥ = √(𝑥1² + 𝑥2² + ⋯ + 𝑥𝑛²).

If 𝑃1(𝑥1, 𝑥2, . . . , 𝑥𝑛) and 𝑃2(𝑦1, 𝑦2, . . . , 𝑦𝑛) are any two points in ℝ𝑛, then the
distance from 𝑃1 to 𝑃2 is given by

dist(𝑃1, 𝑃2) = ∥𝑂𝑃2 − 𝑂𝑃1∥ = √((𝑦1 − 𝑥1)² + (𝑦2 − 𝑥2)² + ⋯ + (𝑦𝑛 − 𝑥𝑛)²).
As a consequence of the properties of the inner product, we obtain the fol-
lowing properties of the norm:
Proposition 2.3.3 (Properties of the Norm). Let 𝑣 denote a vector in ℝ𝑛 and
𝑐 a scalar. Then,
(i) ∥𝑣∥ ⩾ 0 and ∥𝑣∥ = 0 if and only if 𝑣 is the zero vector.
(ii) ∥𝑐𝑣∥ = ∣𝑐∣∥𝑣∥.
We also have the following very important inequality
Theorem 2.3.4 (The Cauchy–Schwarz Inequality). Let 𝑣 and 𝑤 denote vectors
in ℝ𝑛 ; then,
∣𝑣 ⋅ 𝑤∣ ⩽ ∥𝑣∥∥𝑤∥.
Proof. Consider the function 𝑓 : ℝ → ℝ given by
𝑓 (𝑡) = ∥𝑣 − 𝑡𝑤∥2 for all 𝑡 ∈ ℝ.
Using the definition of the norm, we can write
𝑓 (𝑡) = (𝑣 − 𝑡𝑤) ⋅ (𝑣 − 𝑡𝑤).
We can now use the properties of the inner product to expand this expression
and get
𝑓 (𝑡) = ∥𝑣∥2 − 2𝑡𝑣 ⋅ 𝑤 + 𝑡2 ∥𝑤∥2 .
Thus, 𝑓 (𝑡) is a quadratic polynomial in 𝑡 which is always non–negative. There-
fore, it can have at most one real root. It then follows that
(2𝑣 ⋅ 𝑤)2 − 4∥𝑤∥2 ∥𝑣∥2 ⩽ 0,
from which we get
(𝑣 ⋅ 𝑤)2 ⩽ ∥𝑤∥2 ∥𝑣∥2 .
Taking square roots on both sides yields the inequality.
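The inequality, and the fact that equality occurs exactly when one vector is a scalar multiple of the other, can be spot-checked numerically; a small Python sketch (helper names ours):

```python
# Spot-check the Cauchy-Schwarz inequality |v . w| <= ||v|| ||w||.
import math

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

v, w = (1, -2, 3), (4, 0, -1)
print(abs(dot(v, w)) <= norm(v) * norm(w))   # True

# Equality holds when the vectors are parallel, e.g. w2 = 2v:
w2 = tuple(2 * x for x in v)
print(math.isclose(abs(dot(v, w2)), norm(v) * norm(w2)))   # True
```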

The Cauchy–Schwarz inequality, together with the properties of the inner


product and the definition of the norm, yields the following inequality known
as the Triangle Inequality.
Proposition 2.3.5 (The Triangle Inequality). For any 𝑣 and 𝑤 in ℝ𝑛 ,
∥𝑣 + 𝑤∥ ⩽ ∥𝑣∥ + ∥𝑤∥.
Proof. This is an Exercise.

2.4 Orthogonality and Projections


We begin this section with the following geometric example.
Example 2.4.1 (Distance from a point to a line). Let 𝑣 denote a non–zero
vector in ℝ𝑛 ; then, span{𝑣} is a line through the origin in the direction of 𝑣.
Given a point 𝑃 in ℝ3 which is not in the span of 𝑣, we would like to find the
distance from 𝑃 to the line; in other words, the shortest distance from 𝑃 to any
point on the line. There are two parts to this problem:
∙ first, locate the point, 𝑡𝑣, on the line that is closest to 𝑃 , and
∙ second, compute the distance from that point to 𝑃 .
Figure 2.4.1 shows a sketch of the line in ℝ3 representing span{𝑣}.

[Figure 2.4.1: Line in ℝ3 representing span{𝑣}, with the point 𝑃, the vector 𝑤 from the origin to 𝑃, the direction vector 𝑣, and the point 𝑡𝑣 on the line]


−−→
To do this, we first let 𝑤 = 𝑂𝑃 denote the vector from the origin to 𝑃 (see
sketch in Figure 2.4.1), and define the function
𝑓 (𝑡) = ∥𝑤 − 𝑡𝑣∥2 for any 𝑡 ∈ ℝ;

that is, 𝑓 (𝑡) is the square of the distance from 𝑃 to any point on the line through
𝑂 in the direction of 𝑣. We wish to minimize this function.
Observe that 𝑓 (𝑡) can be written in terms of the dot product as
𝑓 (𝑡) = (𝑤 − 𝑡𝑣) ⋅ (𝑤 − 𝑡𝑣),
which can be expanded by virtue of the properties of the inner product and the
definition of the Euclidean norm into
𝑓 (𝑡) = ∥𝑤∥2 − 2𝑡𝑣 ⋅ 𝑤 + 𝑡2 ∥𝑣∥2 .
Thus, 𝑓(𝑡) is a quadratic polynomial in 𝑡 which can be shown to have an absolute
minimum when

𝑡 = (𝑣 ⋅ 𝑤)/∥𝑣∥².

Thus, the point on span{𝑣} which is closest to 𝑃 is the point

((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣,

where 𝑤 = 𝑂𝑃.
The distance from 𝑃 to the line (i.e., the shortest distance) is then

∥ ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣 − 𝑤 ∥.
Remark 2.4.2. The argument of the previous example can be used to show that
the point on the line

𝑂𝑃𝑜 + span{𝑣},

for a given point 𝑃𝑜, which is closest to 𝑃 is given by

𝑂𝑃𝑜 + ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣,

where 𝑤 = 𝑃𝑜𝑃, and the distance from 𝑃 to the line is

∥ ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣 − 𝑤 ∥.
Definition 2.4.3 (Orthogonality). Two vectors 𝑣 and 𝑤 in ℝ𝑛 are said to be
orthogonal, or perpendicular, if
𝑣 ⋅ 𝑤 = 0.
Definition 2.4.4 (Orthogonal Projection). The vector

((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣

is called the orthogonal projection of 𝑤 onto 𝑣. We denote it by 𝑃𝑣(𝑤). Thus,

𝑃𝑣(𝑤) = ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣.

𝑃𝑣(𝑤) is called the orthogonal projection of 𝑤 = 𝑂𝑃 onto 𝑣 because it lies
along a line through 𝑃 which is perpendicular to the direction of 𝑣. To see why
this is the case, compute

(𝑃𝑣(𝑤) − 𝑤) ⋅ 𝑃𝑣(𝑤) = ∥𝑃𝑣(𝑤)∥² − 𝑃𝑣(𝑤) ⋅ 𝑤
                     = (𝑣 ⋅ 𝑤)²/∥𝑣∥² − ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣 ⋅ 𝑤
                     = (𝑣 ⋅ 𝑤)²/∥𝑣∥² − (𝑣 ⋅ 𝑤)²/∥𝑣∥²
                     = 0.

Thus, 𝑃𝑣(𝑤) is perpendicular to the line connecting 𝑃 to 𝑃𝑣(𝑤).


By the previous calculation we also see that any vector 𝑤 can be written as

𝑤 = 𝑃𝑣 (𝑤) + (𝑤 − 𝑃𝑣 (𝑤));

that is, the sum of a vector parallel to 𝑣 and another vector perpendicular to 𝑣.
This is known as the orthogonal decomposition of 𝑤 with respect to 𝑣.
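The projection and the orthogonal decomposition can be computed directly; a short Python sketch (the helper names `dot` and `proj` are ours):

```python
# Orthogonal projection P_v(w) = ((v . w)/||v||^2) v and the decomposition
# w = P_v(w) + (w - P_v(w)), whose second piece is perpendicular to v.
import math

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def proj(v, w):
    """Orthogonal projection of w onto the nonzero vector v."""
    c = dot(v, w) / dot(v, v)
    return tuple(c * x for x in v)

v, w = (1, 2, 2), (3, 0, 4)
p = proj(v, w)                            # component of w parallel to v
r = tuple(a - b for a, b in zip(w, p))    # component of w perpendicular to v

print(all(math.isclose(a, b + c) for a, b, c in zip(w, p, r)))   # True: w = p + r
print(math.isclose(dot(r, v), 0.0, abs_tol=1e-12))               # True: r is perpendicular to v
```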

Example 2.4.5. Let 𝐿 denote the line given parametrically by the equations

⎧ 𝑥 = 1 − 𝑡
⎨ 𝑦 = 2𝑡                (2.1)
⎩ 𝑧 = 2 + 𝑡,

for 𝑡 ∈ ℝ. Find the point on the line, 𝐿, which is closest to the point 𝑃 (1, 2, 0)
and compute the distance from 𝑃 to 𝐿.

Solution: Let 𝑃𝑜 be the point on 𝐿 with coordinates (1, 0, 2) (note
that 𝑃𝑜 is the point in ℝ3 corresponding to 𝑡 = 0). Put

          ⎛ 0 ⎞
𝑤 = 𝑃𝑜𝑃 = ⎜ 2 ⎟ ,
          ⎝−2 ⎠

and let

    ⎛−1 ⎞
𝑣 = ⎜ 2 ⎟ ;
    ⎝ 1 ⎠

𝑣 is the direction of the line 𝐿, so that any point on 𝐿 is of the
form 𝑂𝑃𝑜 + 𝑡𝑣, for some 𝑡 in ℝ.
The point on the line 𝐿 which is closest to 𝑃 is

𝑂𝑃𝑜 + 𝑃𝑣(𝑤),

where 𝑃𝑣(𝑤) is the orthogonal projection of 𝑤 onto 𝑣; that is,

𝑃𝑣(𝑤) = ((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑣 = (2/6) 𝑣 = (1/3) 𝑣.

Thus, the point on 𝐿 which is closest to 𝑃 corresponds to 𝑡 = 1/3 in
(2.1); that is, the point 𝑄(2/3, 2/3, 7/3) is the point on 𝐿 which is
closest to 𝑃.
The distance from 𝑃 to the line 𝐿 is

dist(𝑃, 𝐿) = dist(𝑃, 𝑄) = ∥𝑂𝑃 − 𝑂𝑄∥,

so that

dist(𝑃, 𝐿) = ∥(1/3, 4/3, −7/3)ᵀ∥ = (1/3)∥(1, 4, −7)ᵀ∥ = (1/3)√(1 + 16 + 49) = √66/3.
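The numbers in Example 2.4.5 can be reproduced mechanically; a Python sketch following the same steps (helper name `dot` is ours):

```python
# Re-derive the numbers in Example 2.4.5: the point on L closest to P(1, 2, 0)
# and the distance from P to L.
import math

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

Po = (1, 0, 2)                                   # point on L at t = 0
v  = (-1, 2, 1)                                  # direction of L
P  = (1, 2, 0)
w  = tuple(p - q for p, q in zip(P, Po))         # w = PoP = (0, 2, -2)

t = dot(v, w) / dot(v, v)                        # 2/6 = 1/3
Q = tuple(p + t * a for p, a in zip(Po, v))      # closest point, (2/3, 2/3, 7/3)

dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(P, Q)))
print(math.isclose(t, 1 / 3))                    # True
print(math.isclose(dist, math.sqrt(66) / 3))     # True
```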

Definition 2.4.6 (Unit Vectors). A vector 𝑢 ∈ ℝ𝑛 is said to be a unit vector


if ∥𝑢∥ = 1; that is, 𝑢 has unit length.
If 𝑢 is a unit vector in ℝ𝑛 , then the orthogonal projection of 𝑤 ∈ ℝ𝑛 onto 𝑢
is given by
𝑃𝑢 (𝑤) = (𝑤 ⋅ 𝑢)𝑢.
We call this vector the orthogonal component of 𝑤 in the direction of 𝑢.
If 𝑣 is a non–zero vector in ℝ𝑛, we can scale 𝑣 to obtain a unit vector in the
direction of 𝑣 as follows: (1/∥𝑣∥) 𝑣.
Denote this vector by ˆ𝑣; then, ˆ𝑣 = (1/∥𝑣∥) 𝑣 and

∥ˆ𝑣∥ = ∥(1/∥𝑣∥) 𝑣∥ = (1/∥𝑣∥) ∥𝑣∥ = 1.

As a convention, we will always try to denote unit vectors in a given direction


with a hat upon the symbol for the direction vector.
Example 2.4.7. The vectors ˆ𝑖 = (1, 0, 0)ᵀ, ˆ𝑗 = (0, 1, 0)ᵀ, and ˆ𝑘 = (0, 0, 1)ᵀ are unit
vectors in ℝ3. Observe also that they are mutually orthogonal; that is,

ˆ𝑖 ⋅ ˆ𝑗 = 0,  ˆ𝑖 ⋅ ˆ𝑘 = 0,  and  ˆ𝑗 ⋅ ˆ𝑘 = 0.

Note also that every vector 𝑣 in ℝ3 can be written as

𝑣 = (𝑣 ⋅ ˆ𝑖)ˆ𝑖 + (𝑣 ⋅ ˆ𝑗)ˆ𝑗 + (𝑣 ⋅ ˆ𝑘)ˆ𝑘.

This is known as the orthogonal decomposition of 𝑣 with respect to the basis
{ˆ𝑖, ˆ𝑗, ˆ𝑘} in ℝ3.
Example 2.4.8 (Normal Direction to a Plane in ℝ3 ). The equation of a plane
in ℝ3 is given by
𝑎𝑥 + 𝑏𝑦 + 𝑐𝑧 = 𝑑
where 𝑎, 𝑏, 𝑐 and 𝑑 are real constants.
Suppose that 𝑃𝑜(𝑥𝑜, 𝑦𝑜, 𝑧𝑜) is a point on the plane. Then,

𝑎𝑥𝑜 + 𝑏𝑦𝑜 + 𝑐𝑧𝑜 = 𝑑. (2.2)

Similarly, if 𝑃(𝑥, 𝑦, 𝑧) is another point on the plane, then

𝑎𝑥 + 𝑏𝑦 + 𝑐𝑧 = 𝑑. (2.3)

Subtracting equation (2.2) from equation (2.3) we then obtain that

𝑎(𝑥 − 𝑥𝑜) + 𝑏(𝑦 − 𝑦𝑜) + 𝑐(𝑧 − 𝑧𝑜) = 0.

This is the general equation of a plane derived in a previous example. This
equation can be interpreted as saying that the dot product of the vector
𝑛 = (𝑎, 𝑏, 𝑐)ᵀ with the vector 𝑃𝑜𝑃 = (𝑥 − 𝑥𝑜, 𝑦 − 𝑦𝑜, 𝑧 − 𝑧𝑜)ᵀ is zero. Thus the
vector 𝑛 is orthogonal, or perpendicular, to any vector lying on the plane. We
then say that 𝑛 is a normal vector to the plane. In the next section we will see
how to obtain a normal vector to the plane determined by three non–collinear
points.
Example 2.4.9 (Distance from a point to a plane). Let 𝐻 denote the plane in
ℝ3 given by

𝐻 = {(𝑥, 𝑦, 𝑧)ᵀ ∈ ℝ3 ∣ 𝑎𝑥 + 𝑏𝑦 + 𝑐𝑧 = 𝑑}.

Let 𝑃 denote a point which is not on the plane 𝐻. Find the shortest distance
from the point 𝑃 to 𝐻.

Solution: Let 𝑃𝑜(𝑥𝑜, 𝑦𝑜, 𝑧𝑜) be any point in the plane 𝐻, and define
the vector 𝑤 = 𝑃𝑜𝑃, which goes from the point 𝑃𝑜 to the point 𝑃.
The shortest distance from 𝑃 to the plane will be the norm of the
projection of 𝑤 onto the vector

𝑛 = (𝑎, 𝑏, 𝑐)ᵀ,

which is orthogonal to the plane 𝐻. Then,

dist(𝑃, 𝐻) = ∥𝑃𝑛(𝑤)∥,

where 𝑃𝑛(𝑤) = ((𝑤 ⋅ 𝑛)/∥𝑛∥²) 𝑛. □

Example 2.4.10. Let 𝐻 be the plane in ℝ3 given by the equation

2𝑥 + 3𝑦 + 6𝑧 = 6.

Find the distance from 𝐻 to 𝑃(0, 2, 2).

Solution: Let 𝑃𝑜 denote the 𝑧–intercept of the plane; namely,
𝑃𝑜(0, 0, 1), and put

𝑤 = 𝑃𝑜𝑃 = (0, 2, 1)ᵀ.

Then, according to the result of Example 2.4.9,

dist(𝑃, 𝐻) = ∣𝑤 ⋅ 𝑛∣ / ∥𝑛∥,

where

𝑛 = (2, 3, 6)ᵀ,

so that

𝑤 ⋅ 𝑛 = 12,

and

∥𝑛∥ = √(4 + 9 + 36) = 7.

Consequently,

dist(𝑃, 𝐻) = 12/7. □
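The arithmetic in Example 2.4.10 can be confirmed in a few lines of Python:

```python
# Confirm Example 2.4.10: distance from P(0, 2, 2) to the plane 2x + 3y + 6z = 6.
import math

n  = (2, 3, 6)                                   # normal vector to the plane
Po = (0, 0, 1)                                   # z-intercept: 2*0 + 3*0 + 6*1 = 6
P  = (0, 2, 2)
w  = tuple(p - q for p, q in zip(P, Po))         # w = PoP = (0, 2, 1)

dot_wn = sum(a * b for a, b in zip(w, n))        # w . n = 12
norm_n = math.sqrt(sum(a * a for a in n))        # ||n|| = 7

print(dot_wn, norm_n)                            # 12 7.0
print(math.isclose(dot_wn / norm_n, 12 / 7))     # True
```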

2.5 The Cross Product in ℝ3


We begin this section by first showing how to compute the area of the parallelogram
determined by two linearly independent vectors in ℝ2.

Example 2.5.1 (Area of a Parallelogram). Let 𝑣 and 𝑤 denote two linearly
independent vectors in ℝ2 given by

𝑣 = (𝑎1, 𝑎2)ᵀ  and  𝑤 = (𝑏1, 𝑏2)ᵀ.

[Figure 2.5.2: Vectors 𝑣 and 𝑤 on the 𝑥𝑦–plane, with the height ℎ from 𝑤 to the line along 𝑣]

Figure 2.5.2 shows a sketch of the arrows representing 𝑣 and 𝑤 for the
special case in which they lie in the first quadrant of the 𝑥𝑦–plane.
We would like to compute the area of the parallelogram, 𝑃(𝑣, 𝑤), determined
by 𝑣 and 𝑤. This may be computed as follows:

area(𝑃(𝑣, 𝑤)) = ∥𝑣∥ℎ,

where ℎ may be obtained as ∥𝑤 − 𝑃𝑣(𝑤)∥; that is, the distance from 𝑤 to its
orthogonal projection along 𝑣. Squaring both sides of the previous equation we
have that

(area(𝑃(𝑣, 𝑤)))² = ∥𝑣∥²∥𝑤 − 𝑃𝑣(𝑤)∥²
                 = ∥𝑣∥²(𝑤 − 𝑃𝑣(𝑤)) ⋅ (𝑤 − 𝑃𝑣(𝑤))
                 = ∥𝑣∥²(∥𝑤∥² − 2𝑤 ⋅ 𝑃𝑣(𝑤) + ∥𝑃𝑣(𝑤)∥²)
                 = ∥𝑣∥²(∥𝑤∥² − 2((𝑣 ⋅ 𝑤)/∥𝑣∥²) 𝑤 ⋅ 𝑣 + (𝑣 ⋅ 𝑤)²/∥𝑣∥²)
                 = ∥𝑣∥²∥𝑤∥² − 2(𝑣 ⋅ 𝑤)² + (𝑣 ⋅ 𝑤)²
                 = ∥𝑣∥²∥𝑤∥² − (𝑣 ⋅ 𝑤)².


20 CHAPTER 2. EUCLIDEAN SPACE

Writing this in terms of the coordinates of 𝑣 and 𝑤 we then have that

(area(𝑃(𝑣, 𝑤)))² = ∥𝑣∥²∥𝑤∥² − (𝑣 ⋅ 𝑤)²
                 = (𝑎1² + 𝑎2²)(𝑏1² + 𝑏2²) − (𝑎1𝑏1 + 𝑎2𝑏2)²
                 = 𝑎1²𝑏1² + 𝑎1²𝑏2² + 𝑎2²𝑏1² + 𝑎2²𝑏2² − (𝑎1²𝑏1² + 2𝑎1𝑏1𝑎2𝑏2 + 𝑎2²𝑏2²)
                 = 𝑎1²𝑏2² + 𝑎2²𝑏1² − 2𝑎1𝑏1𝑎2𝑏2
                 = 𝑎1²𝑏2² − 2(𝑎1𝑏2)(𝑎2𝑏1) + 𝑎2²𝑏1²
                 = (𝑎1𝑏2 − 𝑎2𝑏1)².     (2.4)
Taking square roots on both sides, we get

area(𝑃 (𝑣, 𝑤)) = ∣𝑎1 𝑏2 − 𝑎2 𝑏1 ∣.

Observe that the expression in the absolute value on the right-hand side of the
previous equation is the determinant of the matrix

⎛ 𝑎1  𝑏1 ⎞
⎝ 𝑎2  𝑏2 ⎠ .

We then have that the area of the parallelogram determined by 𝑣 and 𝑤 is


the absolute value of the determinant of a 2 × 2 matrix whose columns are the
vectors 𝑣 and 𝑤. If we denote the matrix by [𝑣 𝑤], then we obtain the formula

area(𝑃 (𝑣, 𝑤)) = ∣ det([𝑣 𝑤])∣.

Observe that this formula works even in the case in which 𝑣 and 𝑤 are not
linearly independent. In this case we get that the area of the parallelogram
determined by the two vectors is 0.
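The determinant formula gives a one-line area computation; a Python sketch (the helper name `parallelogram_area` is ours):

```python
# Area of the parallelogram spanned by v = (a1, a2) and w = (b1, b2) in R^2
# as |det([v w])| = |a1*b2 - a2*b1|.

def parallelogram_area(v, w):
    """Absolute value of the 2x2 determinant with columns v and w."""
    return abs(v[0] * w[1] - v[1] * w[0])

print(parallelogram_area((1, 0), (0, 1)))   # 1: the unit square
print(parallelogram_area((2, 1), (1, 3)))   # 5
print(parallelogram_area((2, 1), (4, 2)))   # 0: linearly dependent vectors
```

The last call illustrates the remark above: dependent vectors give area 0.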

2.5.1 Defining the Cross–Product


Given two linearly independent vectors, 𝑣 and 𝑤, in ℝ3 , we would like to asso-
ciate to them a vector, denoted by 𝑣 × 𝑤 and called the cross product of 𝑣 and
𝑤, satisfying the following properties:
∙ 𝑣 × 𝑤 is perpendicular to the plane spanned by 𝑣 and 𝑤.
∙ There are two choices for a perpendicular direction to the span of 𝑣 and
𝑤. The direction for 𝑣 × 𝑤 is determined according to the so called “right–
hand rule”:
With the fingers of your right hand, follow the direction of 𝑣
while curling them towards the direction of 𝑤. The thumb will
point in the direction of 𝑣 × 𝑤.

∙ The norm of 𝑣 × 𝑤 is the area of the parallelogram, 𝑃 (𝑣, 𝑤), determined


by the vectors 𝑣 and 𝑤.
These properties imply that the cross product is not a symmetric operation;
in fact, it is antisymmetric:
𝑤 × 𝑣 = −𝑣 × 𝑤 for all 𝑣, 𝑤 ∈ ℝ3 .
From this property we immediately get that
𝑣 × 𝑣 = 0 for all 𝑣 ∈ ℝ3 ,
where 0 denotes the zero vector in ℝ3 .
Putting the properties defining the cross product together we get that

𝑣 × 𝑤 = ± area(𝑃(𝑣, 𝑤)) ˆ𝑛,

where ˆ𝑛 is a unit vector perpendicular to the plane determined by 𝑣 and 𝑤, and
the sign is determined by the right hand rule.
In order to compute 𝑣 × 𝑤, we first consider the special case in which 𝑣 and
𝑤 lie in the 𝑥𝑦–plane. More specifically, suppose that

𝑣 = (𝑎1, 𝑎2, 0)ᵀ  and  𝑤 = (𝑏1, 𝑏2, 0)ᵀ.
Figure 2.5.3 shows the situation in which 𝑣 and 𝑤 lie on the first quadrant of
the 𝑥𝑦–plane.

[Figure 2.5.3: Vectors 𝑣 and 𝑤 on the 𝑥𝑦–plane]


For the situation shown in the figure, 𝑣 × 𝑤 is in the direction of ˆ𝑘 = (0, 0, 1)ᵀ.
We then have that

𝑣 × 𝑤 = area(𝑃(𝑣, 𝑤)) ˆ𝑘,
22 CHAPTER 2. EUCLIDEAN SPACE

where the area of the parallelogram 𝑃(𝑣, 𝑤) is computed as in Example 2.5.1 to
obtain

area(𝑃(𝑣, 𝑤)) = det ⎛ 𝑎1  𝑏1 ⎞
                    ⎝ 𝑎2  𝑏2 ⎠ .

It turns out that putting the columns in the matrix in the order that we did
takes into account the sign convention dictated by the right–hand–rule. We
then have that

𝑣 × 𝑤 = det ⎛ 𝑎1  𝑏1 ⎞ ˆ𝑘.
            ⎝ 𝑎2  𝑏2 ⎠

In order to simplify notation, we will write

∣ 𝑎1  𝑏1 ∣
∣ 𝑎2  𝑏2 ∣

for the determinant of that matrix. Thus,

𝑣 × 𝑤 = ∣ 𝑎1  𝑏1 ∣ ˆ𝑘.
        ∣ 𝑎2  𝑏2 ∣

Observe that, since the determinant of the transpose of a matrix is the same
as that of the matrix, we can also write

𝑣 × 𝑤 = ∣ 𝑎1  𝑎2 ∣ ˆ𝑘,     (2.5)
        ∣ 𝑏1  𝑏2 ∣

for vectors 𝑣 = (𝑎1, 𝑎2, 0)ᵀ and 𝑤 = (𝑏1, 𝑏2, 0)ᵀ lying in the 𝑥𝑦–plane.
In general, the cross product of the vectors

𝑣 = (𝑎1, 𝑎2, 𝑎3)ᵀ  and  𝑤 = (𝑏1, 𝑏2, 𝑏3)ᵀ

in ℝ3 is the vector

𝑣 × 𝑤 = ∣ 𝑎2  𝑎3 ∣ ˆ𝑖 − ∣ 𝑎1  𝑎3 ∣ ˆ𝑗 + ∣ 𝑎1  𝑎2 ∣ ˆ𝑘,     (2.6)
        ∣ 𝑏2  𝑏3 ∣      ∣ 𝑏1  𝑏3 ∣      ∣ 𝑏1  𝑏2 ∣

where ˆ𝑖 = (1, 0, 0)ᵀ, ˆ𝑗 = (0, 1, 0)ᵀ, and ˆ𝑘 = (0, 0, 1)ᵀ are the standard basis
vectors in ℝ3.
Observe that if 𝑎3 = 𝑏3 = 0 in the definition of 𝑣 × 𝑤 in (2.6), we recover the
expression in (2.5),

𝑣 × 𝑤 = ∣ 𝑎1  𝑎2 ∣ ˆ𝑘,
        ∣ 𝑏1  𝑏2 ∣

for the cross product of vectors lying entirely in the 𝑥𝑦–plane.
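Formula (2.6) translates directly into code. The sketch below (helper names ours) also evaluates it on the vectors 𝑣1 and 𝑣2 from Example 2.2.2, whose cross product recovers the normal vector (7, 12, 22)ᵀ found there, and checks orthogonality.

```python
# Cross product via the cofactor formula (2.6), with checks that the result is
# orthogonal to both factors; v and w below are v1 and v2 from Example 2.2.2.

def cross(v, w):
    a1, a2, a3 = v
    b1, b2, b3 = w
    return (a2 * b3 - a3 * b2,        # coefficient of i-hat
            -(a1 * b3 - a3 * b1),     # coefficient of j-hat (note the minus sign)
            a1 * b2 - a2 * b1)        # coefficient of k-hat

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

print(cross((1, 0, 0), (0, 1, 0)))    # (0, 0, 1): i-hat x j-hat = k-hat

v, w = (2, -3, 1), (6, 2, -3)
n = cross(v, w)
print(n)                              # (7, 12, 22): the normal found in Example 2.2.2
print(dot(n, v), dot(n, w))           # 0 0: n is orthogonal to both v and w
```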

In the remainder of this section, we verify that the cross product of two
vectors, 𝑣 and 𝑤, in ℝ3 defined in (2.6) does indeed satisfy the properties
listed at the beginning of the section. To check that 𝑣 × 𝑤 is orthogonal to the
plane spanned by 𝑣 and 𝑤, write

𝑣 = (𝑎1, 𝑎2, 𝑎3)ᵀ  and  𝑤 = (𝑏1, 𝑏2, 𝑏3)ᵀ

and compute the dot product 𝑣 ⋅ (𝑣 × 𝑤) = 𝑣ᵀ(𝑣 × 𝑤), which gives

𝑣 ⋅ (𝑣 × 𝑤) = 𝑎1 ∣ 𝑎2  𝑎3 ∣ − 𝑎2 ∣ 𝑎1  𝑎3 ∣ + 𝑎3 ∣ 𝑎1  𝑎2 ∣ .
                 ∣ 𝑏2  𝑏3 ∣      ∣ 𝑏1  𝑏3 ∣      ∣ 𝑏1  𝑏2 ∣

We recognize in the right–hand side of this equation the expansion along the
first row of the determinant

∣ 𝑎1  𝑎2  𝑎3 ∣
∣ 𝑎1  𝑎2  𝑎3 ∣
∣ 𝑏1  𝑏2  𝑏3 ∣ ,

which is 0 since the first two rows are the same. Thus,

𝑣 ⋅ (𝑣 × 𝑤) = 0,

and therefore 𝑣 × 𝑤 is orthogonal to 𝑣. Similarly, we can compute

              ∣ 𝑏1  𝑏2  𝑏3 ∣
𝑤 ⋅ (𝑣 × 𝑤) = ∣ 𝑎1  𝑎2  𝑎3 ∣ = 0,
              ∣ 𝑏1  𝑏2  𝑏3 ∣

which shows that 𝑣 × 𝑤 is also orthogonal to 𝑤. Hence, 𝑣 × 𝑤 is orthogonal to


the span of 𝑣 and 𝑤.
Next, to see that ∥𝑣 × 𝑤∥ gives the area of the parallelogram spanned by 𝑣
and 𝑤, compute

∥𝑣 × 𝑤∥² = (𝑎1² + 𝑎2² + 𝑎3²)(𝑏1² + 𝑏2² + 𝑏3²) − (𝑎1𝑏1 + 𝑎2𝑏2 + 𝑎3𝑏3)²,

which can be written as

∥𝑣 × 𝑤∥² = ∥𝑣∥²∥𝑤∥² − (𝑣 ⋅ 𝑤)². (2.7)

The calculations displayed in (2.4) then show that (2.7) can be written as

∥𝑣 × 𝑤∥² = [area(𝑃(𝑣, 𝑤))]²,

from which it follows that

∥𝑣 × 𝑤∥ = area(𝑃(𝑣, 𝑤)).

2.5.2 Triple Scalar Product


Example 2.5.2 (Volume of a Parallelepiped). Three linearly independent vec-
tors, 𝑢, 𝑣 and 𝑤, in ℝ3 determine a solid figure called a parallelepiped (see
Figure 2.5.4 on page 24). In this section, we see how to compute the volume of
that object, which we shall denote by 𝑃 (𝑢, 𝑣, 𝑤).

[Figure 2.5.4: Volume of Parallelepiped: the parallelepiped spanned by 𝑢, 𝑣 and 𝑤, with height ℎ measured along 𝑛 = 𝑣 × 𝑤]

First, observe that the volume of the parallelepiped, 𝑃(𝑣, 𝑤, 𝑢), drawn in
Figure 2.5.4 is the area of the parallelogram spanned by 𝑣 and 𝑤 times the
height, ℎ, of the parallelepiped:

volume(𝑃(𝑣, 𝑤, 𝑢)) = area(𝑃(𝑣, 𝑤)) ⋅ ℎ, (2.8)

where ℎ can be obtained by projecting 𝑢 onto the cross–product, 𝑣 × 𝑤, of 𝑣
and 𝑤; that is,

ℎ = ∥𝑃𝑛(𝑢)∥ = ∥ ((𝑢 ⋅ 𝑛)/∥𝑛∥²) 𝑛 ∥,

where 𝑛 = 𝑣 × 𝑤.
2.5. THE CROSS PRODUCT IN ℝ3 25

We then have that

ℎ = ∣𝑢 ⋅ (𝑣 × 𝑤)∣ / ∥𝑣 × 𝑤∥.

Consequently, since area(𝑃(𝑣, 𝑤)) = ∥𝑣 × 𝑤∥, we get from (2.8) that

volume(𝑃(𝑣, 𝑤, 𝑢)) = ∣𝑢 ⋅ (𝑣 × 𝑤)∣. (2.9)

The scalar, 𝑢⋅(𝑣 ×𝑤), in the right–hand side of the equation in (2.9) is called
the triple scalar product of 𝑢, 𝑣 and 𝑤.
Given three vectors 𝑢 = (𝑐1, 𝑐2, 𝑐3), 𝑣 = (𝑎1, 𝑎2, 𝑎3) and 𝑤 = (𝑏1, 𝑏2, 𝑏3) in ℝ3, the triple scalar product of 𝑢, 𝑣 and 𝑤 is given by

    𝑢 ⋅ (𝑣 × 𝑤) = 𝑐1(𝑎2𝑏3 − 𝑎3𝑏2) − 𝑐2(𝑎1𝑏3 − 𝑎3𝑏1) + 𝑐3(𝑎1𝑏2 − 𝑎2𝑏1),

which is the expansion along the first row of the determinant of the 3 × 3 matrix whose rows are (𝑐1, 𝑐2, 𝑐3), (𝑎1, 𝑎2, 𝑎3) and (𝑏1, 𝑏2, 𝑏3);
that is, 𝑢 ⋅ (𝑣 × 𝑤) is the determinant of the 3 × 3 matrix whose rows are the vectors 𝑢, 𝑣 and 𝑤, in that order. Since the determinant of the transpose of a matrix is the same as the determinant of the original matrix, we may also write

    𝑢 ⋅ (𝑣 × 𝑤) = det[ 𝑢 𝑣 𝑤 ],

the determinant of the 3 × 3 matrix whose columns are the vectors 𝑢, 𝑣 and 𝑤, in that order.
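A quick numerical check of this determinant formula (the vectors 𝑢, 𝑣 and 𝑤 below are hypothetical sample values): the triple scalar product 𝑢 ⋅ (𝑣 × 𝑤) agrees with the first–row expansion of the 3 × 3 determinant, and its absolute value is the volume in (2.9):

```python
# Verify u . (v x w) == det of the 3x3 matrix with rows u, v, w,
# and that |u . (v x w)| gives the parallelepiped volume (2.9).
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(r1, r2, r3):
    # Expansion along the first row, as in the text.
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))

u, v, w = (1, 0, 2), (3, 1, 0), (0, 2, 1)   # hypothetical sample vectors
triple = dot(u, cross(v, w))
print(triple, det3(u, v, w))                # the two computations agree
volume = abs(triple)                        # volume(P(v, w, u)) = |u . (v x w)|
```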
Chapter 3

Functions

3.1 Types of Functions in Euclidean Space


Given a subset 𝐷 of 𝑛–dimensional Euclidean space, ℝ𝑛 , we are interested in
functions that map 𝐷 to 𝑚–dimensional Euclidean space, ℝ𝑚 , where 𝑛 and 𝑚
could possibly be the same. We write

𝐹 : 𝐷 → ℝ𝑚

and call 𝐷 the domain of 𝐹 ; that is, the set where the function is defined.

Example 3.1.1. The function 𝑓 given by

    𝑓(𝑥, 𝑦) = 1 / √(1 − 𝑥² − 𝑦²)

is defined over the set

    𝐷 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥² + 𝑦² < 1},

the open unit disc in ℝ2. In this case, 𝑛 = 2 and 𝑚 = 1.

There are different types of functions that we will be studying in this course.
Some of the types have received traditional names, and we present them here.

∙ Vector Fields. If 𝑚 = 𝑛 > 1, then the map

𝐹 : 𝐷 → ℝ𝑛

is called a vector field on 𝐷. The idea here is that each point in 𝐷 gets assigned a vector. A picture for this is provided by a model of fluid flow, in which each point in the region where the fluid is flowing gets assigned a vector giving the velocity of the flow at that particular point.


∙ Scalar Fields. For the case in which 𝑚 = 1 and 𝑛 > 1, every point in
𝐷 now gets assigned a scalar (a real number). An example of this in
applications would be the temperature distribution over a region in space.
Scalar fields in this course will usually be denoted by lower case letters (𝑓 ,
𝑔, etc.). The value of a scalar field
𝑓: 𝐷 →ℝ
at a point 𝑃 (𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ) in 𝐷 will be denoted by
𝑓 (𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ).
If 𝐷 is a region in the 𝑥𝑦–plane, we simply write
𝑓 (𝑥, 𝑦) for (𝑥, 𝑦) ∈ 𝐷.

∙ Paths. If 𝑛 = 1, 𝑚 > 1 and 𝐷 is an interval, 𝐼, of the real line, then the map


𝜎 : 𝐼 → ℝ𝑚
is called a path in ℝ𝑚 .
Example 3.1.2. Let 𝜎(𝑡) = (cos 𝑡, sin 𝑡) for 𝑡 ∈ (−𝜋, 𝜋], then
𝜎 : (−𝜋, 𝜋] → ℝ2
is a path in ℝ2. A picture of this map would be a particle in the 𝑥𝑦–plane moving along the unit circle in the counterclockwise direction.

3.2 Open Subsets of Euclidean Space


In Example 3.1.1 we saw that the function 𝑓 given by
1
𝑓 (𝑥, 𝑦) = √
1 − 𝑥2 − 𝑦 2
has the open unit disc, 𝐷 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 < 1}, as its domain. 𝐷 is an
example of what is known as an open set.
Definition 3.2.1 (Open Balls). Given 𝑥 ∈ ℝ𝑛 , the open ball of radius 𝑟 > 0 in
ℝ𝑛 about 𝑥 is defined to be the set
𝐵𝑟 (𝑥) = {𝑦 ∈ ℝ𝑛 ∣ ∥𝑦 − 𝑥∥ < 𝑟}.
That is, 𝐵𝑟 (𝑥) is the set of points in ℝ𝑛 which are within a distance of 𝑟 from
𝑥.
Definition 3.2.2 (Open Sets). A set 𝑈 ⊆ ℝ𝑛 is said to be open if and only if
for every 𝑥 ∈ 𝑈 there exists 𝑟 > 0 such that
𝐵𝑟 (𝑥) ⊆ 𝑈.
The empty set, ∅, is considered to be open.

Example 3.2.3. For any 𝑅 > 0, the open ball 𝐵𝑅 (𝑂) = {𝑦 ∈ ℝ𝑛 ∣ ∥𝑦∥ < 𝑅} is
an open set.

Proof. Let 𝑥 be an arbitrary point in 𝐵𝑅 (𝑂); then ∥𝑥∥ < 𝑅. Put 𝑟 = 𝑅−∥𝑥∥ > 0
and consider the open ball 𝐵𝑟 (𝑥). If 𝑦 ∈ 𝐵𝑟 (𝑥), then, by the triangle inequality,

∥𝑦∥ = ∥𝑦 − 𝑥 + 𝑥∥ ⩽ ∥𝑦 − 𝑥∥ + ∥𝑥∥ < 𝑟 + ∥𝑥∥ = 𝑅,

which shows that 𝑦 ∈ 𝐵𝑅 (𝑂). Consequently,

𝐵𝑟 (𝑥) ⊆ 𝐵𝑅 (𝑂).

It then follows that 𝐵𝑅(𝑂) is open by Definition 3.2.2.
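The choice 𝑟 = 𝑅 − ∥𝑥∥ in the proof can be illustrated numerically. The sketch below (with a hypothetical radius 𝑅 and point 𝑥) samples points of 𝐵𝑟(𝑥) and confirms that every one of them lies in 𝐵𝑅(𝑂), exactly as the triangle inequality predicts:

```python
# Sample points of B_r(x), with r = R - ||x||, and confirm they lie in B_R(O).
import math
import random

def norm(p):
    return math.sqrt(sum(c * c for c in p))

R = 2.0
x = (1.0, 0.5)            # a hypothetical point with ||x|| < R
r = R - norm(x)           # the radius chosen in the proof
assert r > 0

random.seed(0)
for _ in range(1000):
    # Sample y with ||y - x|| < r; then ||y|| <= ||y - x|| + ||x|| < R.
    ang = random.uniform(0.0, 2 * math.pi)
    rad = random.uniform(0.0, r * 0.999)
    y = (x[0] + rad * math.cos(ang), x[1] + rad * math.sin(ang))
    assert norm(y) < R
print("every sampled y in B_r(x) lies in B_R(O)")
```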

Example 3.2.4. The set 𝐴 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑦 = 0} is not an open subset of ℝ2. To see why this is the case, observe that for any 𝑟 > 0, the ball 𝐵𝑟((0, 0)) is not a subset of 𝐴, since, for instance, the point (0, 𝑟/2) is in 𝐵𝑟((0, 0)), but it is not an element of 𝐴.

3.3 Continuous Functions


In single variable Calculus you learned that a real valued function, 𝑓 : (𝑎, 𝑏) → ℝ,
defined in the open interval (𝑎, 𝑏), is continuous at 𝑐 ∈ (𝑎, 𝑏) if

    lim_{𝑥→𝑐} 𝑓(𝑥) = 𝑓(𝑐).

We may re–write the last expression as

    lim_{∣𝑥−𝑐∣→0} ∣𝑓(𝑥) − 𝑓(𝑐)∣ = 0.

This is the expression that we will use to generalize the notion of continuity at a
point to vector valued functions on subsets of Euclidean space. We will simply
replace the absolute values by norms.

Definition 3.3.1. Let 𝑈 be an open subset of ℝ𝑛 and 𝐹 : 𝑈 → ℝ𝑚 be a vector–valued map on 𝑈 . 𝐹 is said to be continuous at 𝑥 ∈ 𝑈 if

    lim_{∥𝑦−𝑥∥→0} ∥𝐹(𝑦) − 𝐹(𝑥)∥ = 0.

If 𝐹 is continuous at every 𝑥 in 𝑈 , then we say that 𝐹 is continuous on 𝑈 .

Example 3.3.2. Let 𝑇 : ℝ𝑛 → ℝ be a linear transformation. Then, 𝑇 is continuous on ℝ𝑛.

Proof: Since 𝑇 is linear, there exists a vector, 𝑤, in ℝ𝑛 such that

𝑇 (𝑣) = 𝑤 ⋅ 𝑣 for all 𝑣 ∈ ℝ𝑛 .



It then follows that, for any 𝑢 and 𝑣 in ℝ𝑛,

    ∣𝑇(𝑣) − 𝑇(𝑢)∣ = ∣𝑤 ⋅ (𝑣 − 𝑢)∣ ⩽ ∥𝑤∥∥𝑣 − 𝑢∥,

by the Cauchy–Schwarz inequality. Hence, by the Squeeze (or Sandwich) Theorem from single–variable Calculus, we obtain that

    lim_{∥𝑣−𝑢∥→0} ∣𝑇(𝑣) − 𝑇(𝑢)∣ = 0,

and so 𝑇 is continuous at 𝑢. Since 𝑢 is an arbitrary element of ℝ𝑛, it follows that 𝑇 is continuous on ℝ𝑛.
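The key estimate in this proof — the difference of values of a linear functional is bounded by ∥𝑤∥∥𝑣 − 𝑢∥ — can be observed numerically. The vectors below are hypothetical illustration values:

```python
# Cauchy-Schwarz bound for T(v) = w . v: |T(v) - T(u)| <= ||w|| ||v - u||.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

w = (2.0, -1.0, 3.0)       # the vector representing T

def T(v):
    return dot(w, v)

u = (1.0, 1.0, 1.0)        # hypothetical nearby points
v = (1.1, 0.9, 1.05)
lhs = abs(T(v) - T(u))
rhs = norm(w) * norm(tuple(a - b for a, b in zip(v, u)))
print(lhs <= rhs)          # the bound used with the Squeeze Theorem
```

As ∥𝑣 − 𝑢∥ shrinks, the right–hand side, and hence the left–hand side, is forced to 0.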
Example 3.3.3. Let 𝐹 : ℝ2 → ℝ2 be given by

    𝐹(𝑥, 𝑦) = (𝑥², −𝑦), for all (𝑥, 𝑦) ∈ ℝ2.

Prove that 𝐹 is continuous at every (𝑥𝑜, 𝑦𝑜) ∈ ℝ2.

Solution: First, estimate

    ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥² = (𝑥² − 𝑥𝑜²)² + (𝑦 − 𝑦𝑜)²,

which may be written as

    ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥² = (𝑥 + 𝑥𝑜)²(𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)², (3.1)

after factoring.

Next, restrict to values of (𝑥, 𝑦) ∈ ℝ2 such that

    ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥ ⩽ 1. (3.2)

It follows from (3.2) that

    ∣𝑥 − 𝑥𝑜∣ = √((𝑥 − 𝑥𝑜)²) ⩽ √((𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²) ⩽ 1.

Consequently, if (3.2) holds, then

    ∣𝑥∣ = ∣𝑥 − 𝑥𝑜 + 𝑥𝑜∣ ⩽ ∣𝑥 − 𝑥𝑜∣ + ∣𝑥𝑜∣ ⩽ 1 + ∣𝑥𝑜∣, (3.3)

where we have used the triangle inequality. It follows from the last inequality in (3.3) that

    ∣𝑥 + 𝑥𝑜∣ ⩽ ∣𝑥∣ + ∣𝑥𝑜∣ ⩽ 1 + 2∣𝑥𝑜∣, (3.4)

where we have, again, used the triangle inequality. Applying the estimate in (3.4) to the equation in (3.1), we obtain

    ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥² ⩽ (1 + 2∣𝑥𝑜∣)²(𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²,

which implies that

    ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥² ⩽ (1 + 2∣𝑥𝑜∣)²[(𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²]. (3.5)

Taking the positive square root on both sides of the inequality in (3.5) then yields

    ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥ ⩽ (1 + 2∣𝑥𝑜∣)√((𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²). (3.6)

From (3.6) we get that, if (3.2) holds, then

    0 ⩽ ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥ ⩽ (1 + 2∣𝑥𝑜∣)∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥. (3.7)

Applying the Squeeze Theorem to the inequality in (3.7), we see that, since the rightmost expression in (3.7) goes to 0 as ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥ goes to 0,

    lim_{∥(𝑥,𝑦)−(𝑥𝑜,𝑦𝑜)∥→0} ∥𝐹(𝑥, 𝑦) − 𝐹(𝑥𝑜, 𝑦𝑜)∥ = 0.

Hence, 𝐹 is continuous at (𝑥𝑜, 𝑦𝑜). □

Example 3.3.4. Let 𝑓 : ℝ2 → ℝ be given by

𝑓 (𝑥, 𝑦) = 𝑥𝑦, for all (𝑥, 𝑦) ∈ ℝ2 .

Prove that 𝑓 is continuous at every (𝑥𝑜 , 𝑦𝑜 ) ∈ ℝ2 .

Solution: We want to show that, for every (𝑥𝑜, 𝑦𝑜) ∈ ℝ2,

    lim_{∥(𝑥,𝑦)−(𝑥𝑜,𝑦𝑜)∥→0} ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ = 0. (3.8)

First, write

𝑓 (𝑥, 𝑦) − 𝑓 (𝑥𝑜 , 𝑦𝑜 ) = 𝑥𝑦 − 𝑥𝑜 𝑦𝑜 = 𝑥𝑦 − 𝑥𝑜 𝑦 + 𝑥𝑜 𝑦 − 𝑥𝑜 𝑦𝑜 ,

or
𝑓 (𝑥, 𝑦) − 𝑓 (𝑥𝑜 , 𝑦𝑜 ) = 𝑦(𝑥 − 𝑥𝑜 ) + 𝑥𝑜 (𝑦 − 𝑦𝑜 ). (3.9)

Taking absolute values on both sides of (3.9) and applying the triangle inequality yields that

    ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ ⩽ ∣𝑦∣∣𝑥 − 𝑥𝑜∣ + ∣𝑥𝑜∣∣𝑦 − 𝑦𝑜∣. (3.10)

Restricting to values of (𝑥, 𝑦) such that

    ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥ ⩽ 1, (3.11)
we see that

    ∣𝑦 − 𝑦𝑜∣ = √((𝑦 − 𝑦𝑜)²) ⩽ √((𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²) ⩽ 1,

so that

    ∣𝑦∣ = ∣𝑦 − 𝑦𝑜 + 𝑦𝑜∣ ⩽ ∣𝑦 − 𝑦𝑜∣ + ∣𝑦𝑜∣ ⩽ 1 + ∣𝑦𝑜∣, (3.12)

provided that (3.11) holds. Thus, using the estimate in (3.12) in (3.10), we obtain that, if (𝑥, 𝑦) satisfies (3.11),

    ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ ⩽ (1 + ∣𝑦𝑜∣)∣𝑥 − 𝑥𝑜∣ + ∣𝑥𝑜∣∣𝑦 − 𝑦𝑜∣. (3.13)

Next, apply the Cauchy–Schwarz inequality to the right–hand side of (3.13) to obtain

    ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ ⩽ √((1 + ∣𝑦𝑜∣)² + 𝑥𝑜²) √((𝑥 − 𝑥𝑜)² + (𝑦 − 𝑦𝑜)²),

or

    ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ ⩽ 𝐶𝑜 ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥,

for values of (𝑥, 𝑦) within 1 of (𝑥𝑜, 𝑦𝑜), where 𝐶𝑜 = √((1 + ∣𝑦𝑜∣)² + 𝑥𝑜²). We then have that, if ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥ ⩽ 1, then

    0 ⩽ ∣𝑓(𝑥, 𝑦) − 𝑓(𝑥𝑜, 𝑦𝑜)∣ ⩽ 𝐶𝑜 ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥. (3.14)

The claim in (3.8) now follows by applying the Squeeze Theorem to the expressions in (3.14), since the rightmost term in (3.14) goes to 0 as ∥(𝑥, 𝑦) − (𝑥𝑜, 𝑦𝑜)∥ → 0. □
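The final estimate (3.14) can be spot–checked numerically. The sketch below samples points within distance 1 of a hypothetical base point (𝑥𝑜, 𝑦𝑜) and verifies the bound with 𝐶𝑜 = √((1 + ∣𝑦𝑜∣)² + 𝑥𝑜²):

```python
# Check |x*y - xo*yo| <= Co * ||(x, y) - (xo, yo)|| whenever the
# restriction (3.11), ||(x, y) - (xo, yo)|| <= 1, holds.
import math
import random

xo, yo = 2.0, -3.0                        # hypothetical base point
Co = math.sqrt((1 + abs(yo)) ** 2 + xo ** 2)

random.seed(1)
for _ in range(1000):
    dx = random.uniform(-0.7, 0.7)
    dy = random.uniform(-0.7, 0.7)
    dist = math.hypot(dx, dy)             # always <= sqrt(0.98) < 1 here
    x, y = xo + dx, yo + dy
    assert abs(x * y - xo * yo) <= Co * dist + 1e-12
print("bound (3.14) holds at all sampled points")
```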
Proposition 3.3.5. Let 𝑈 denote an open subset of ℝ𝑛 and 𝐹 : 𝑈 → ℝ𝑚 be a vector–valued function defined on 𝑈 and given by

    𝐹(𝑣) = (𝑓1(𝑣), 𝑓2(𝑣), . . . , 𝑓𝑚(𝑣)), for all 𝑣 ∈ 𝑈,

where 𝑓𝑗 : 𝑈 → ℝ, for 𝑗 = 1, 2, . . . , 𝑚, are real–valued functions defined on 𝑈 . The vector–valued function, 𝐹 , is continuous at 𝑢 ∈ 𝑈 if and only if each one of its components, 𝑓𝑗 , for 𝑗 = 1, 2, . . . , 𝑚, is continuous at 𝑢.

Proof: 𝐹 is continuous at 𝑢 ∈ 𝑈 if and only if

    lim_{∥𝑣−𝑢∥→0} ∥𝐹(𝑣) − 𝐹(𝑢)∥ = 0,

if and only if

    lim_{∥𝑣−𝑢∥→0} ( ∑_{𝑗=1}^{𝑚} ∣𝑓𝑗(𝑣) − 𝑓𝑗(𝑢)∣² )^{1/2} = 0,

if and only if

    lim_{∥𝑣−𝑢∥→0} ∑_{𝑗=1}^{𝑚} ∣𝑓𝑗(𝑣) − 𝑓𝑗(𝑢)∣² = 0,

if and only if

    lim_{∥𝑣−𝑢∥→0} ∣𝑓𝑗(𝑣) − 𝑓𝑗(𝑢)∣² = 0, for all 𝑗 = 1, 2, . . . , 𝑚,

if and only if

    lim_{∥𝑣−𝑢∥→0} ∣𝑓𝑗(𝑣) − 𝑓𝑗(𝑢)∣ = 0, for all 𝑗 = 1, 2, . . . , 𝑚,

if and only if each 𝑓𝑗 is continuous at 𝑢, for 𝑗 = 1, 2, . . . , 𝑚.


Example 3.3.6 (Continuous Paths). Let (𝑎, 𝑏) denote the open interval from 𝑎 to 𝑏. A path 𝜎 : (𝑎, 𝑏) → ℝ𝑚, defined by

    𝜎(𝑡) = (𝑥1(𝑡), 𝑥2(𝑡), . . . , 𝑥𝑚(𝑡)), for all 𝑡 ∈ (𝑎, 𝑏),

where each 𝑥𝑖, for 𝑖 = 1, 2, . . . , 𝑚, denotes a real–valued function defined on (𝑎, 𝑏), is continuous if and only if each 𝑥𝑖 is continuous.
Proof. Let 𝑡𝑜 denote an arbitrary element in (𝑎, 𝑏). By Proposition 3.3.5, 𝜎 is continuous at 𝑡𝑜 if and only if each 𝑥𝑖 : (𝑎, 𝑏) → ℝ is continuous at 𝑡𝑜. Since this is true for every 𝑡𝑜 ∈ (𝑎, 𝑏), the result follows.
A particular instance of the previous example is the path in ℝ2 given by

𝜎(𝑡) = (cos 𝑡, sin 𝑡)

for all 𝑡 in some interval (𝑎, 𝑏) of real numbers. Since the sine and cosine
functions are continuous everywhere on ℝ, it follows that the path is continuous.

Example 3.3.7 (Linear Functions are Continuous). Let 𝐹 : ℝ𝑛 → ℝ𝑚 be a linear function. Then 𝐹 is continuous on ℝ𝑛; that is, 𝐹 is continuous at every 𝑢 ∈ ℝ𝑛.

Proof: Write

    𝐹(𝑣) = (𝑤1ᵀ𝑣, 𝑤2ᵀ𝑣, . . . , 𝑤𝑚ᵀ𝑣), for all 𝑣 ∈ ℝ𝑛,

where 𝑤1ᵀ, 𝑤2ᵀ, . . . , 𝑤𝑚ᵀ are the rows of the matrix representation of the function 𝐹 relative to the standard basis in ℝ𝑛. It then follows that

    𝐹(𝑣) = (𝑓1(𝑣), 𝑓2(𝑣), . . . , 𝑓𝑚(𝑣)), for all 𝑣 ∈ ℝ𝑛,

where

    𝑓𝑗(𝑣) = 𝑤𝑗 ⋅ 𝑣, for all 𝑣 ∈ ℝ𝑛,

and 𝑗 = 1, 2, . . . , 𝑚. As shown in Example 3.3.2, each 𝑓𝑗 is continuous at every 𝑢 ∈ ℝ𝑛. It then follows from Proposition 3.3.5 that 𝐹 is continuous at every 𝑢 ∈ ℝ𝑛.
Example 3.3.8. Define 𝑓 : ℝ𝑛 → ℝ by 𝑓(𝑥1, 𝑥2, . . . , 𝑥𝑛) = 𝑥𝑖, for a fixed 𝑖 in {1, 2, . . . , 𝑛}. Show that 𝑓 is continuous on ℝ𝑛.
Solution: Observe that 𝑓 is linear. In fact, note that
𝑓 (𝑣) = 𝑒𝑖 ⋅ 𝑣, for all 𝑣 ∈ ℝ𝑛 ,
where 𝑒𝑖 is the 𝑖th vector in the standard basis of ℝ𝑛 . It follows
from the result of Example 3.3.7 that 𝑓 is continuous on ℝ𝑛 . □
Example 3.3.9 (Orthogonal Projections are Continuous). Let û denote a unit vector in ℝ𝑛 and define 𝑃û : ℝ𝑛 → ℝ𝑛 by

    𝑃û(𝑣) = (𝑣 ⋅ û)û, for all 𝑣 ∈ ℝ𝑛.

Prove that 𝑃û is continuous on ℝ𝑛.

Solution: Observe that 𝑃û is linear. In fact, for any 𝑐1, 𝑐2 ∈ ℝ and 𝑣1, 𝑣2 ∈ ℝ𝑛,

    𝑃û(𝑐1𝑣1 + 𝑐2𝑣2) = [(𝑐1𝑣1 + 𝑐2𝑣2) ⋅ û]û
                    = (𝑐1 𝑣1 ⋅ û + 𝑐2 𝑣2 ⋅ û)û
                    = (𝑐1 𝑣1 ⋅ û)û + (𝑐2 𝑣2 ⋅ û)û
                    = 𝑐1(𝑣1 ⋅ û)û + 𝑐2(𝑣2 ⋅ û)û
                    = 𝑐1𝑃û(𝑣1) + 𝑐2𝑃û(𝑣2).

It then follows from the result of Example 3.3.7 that 𝑃û is continuous on ℝ𝑛. □

3.3.1 Images and Pre–Images


Let 𝑈 denote an open subset of ℝ𝑛 and 𝐹 : 𝑈 → ℝ𝑚 be a map.
Definition 3.3.10. Given 𝐴 ⊆ 𝑈 , we define the image of 𝐴 under 𝐹 to be the set

    𝐹(𝐴) = {𝑦 ∈ ℝ𝑚 ∣ 𝑦 = 𝐹(𝑥) for some 𝑥 ∈ 𝐴}.

Given 𝐵 ⊆ ℝ𝑚, we define the pre–image of 𝐵 under 𝐹 to be the set

    𝐹⁻¹(𝐵) = {𝑥 ∈ 𝑈 ∣ 𝐹(𝑥) ∈ 𝐵}.

Example 3.3.11. Let 𝜎 : ℝ → ℝ2 be given by 𝜎(𝑡) = (cos 𝑡, sin 𝑡) for all 𝑡 ∈ ℝ.


If 𝐴 = (0, 2𝜋], then the image of 𝐴 under 𝜎 is the unit circle around the origin
in the 𝑥𝑦–plane, or

𝜎((0, 2𝜋]) = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1}.

Example 3.3.12. Let 𝜎 be as in the previous example, and 𝐴 = (0, 𝜋/2). Then,

𝜎(𝐴) = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1, 0 < 𝑥 < 1, 0 < 𝑦 < 1}.

Example 3.3.13. Let 𝑓 : ℝ2 → ℝ be given by

    𝑓(𝑥, 𝑦) = 1 − 𝑥² − 𝑦², for (𝑥, 𝑦) ∈ ℝ2.

Find the pre–image of 𝐵 = {0} under 𝑓 .
Find the pre–image of 𝐵 = {0} under 𝑓 .

Solution:
𝑓 −1 (0) = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑓 (𝑥, 𝑦) = 0}.
Now, 𝑓 (𝑥, 𝑦) = 0 if and only if

1 − 𝑥2 − 𝑦 2 = 0

if and only if
𝑥2 + 𝑦 2 = 1.
Thus,
𝑓 −1 (0) = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1},
or the unit circle around the origin in ℝ2 . □

3.3.2 An alternate definition of continuity


In this section we will prove the following proposition.
Proposition 3.3.14. Let 𝑈 denote an open subset of ℝ𝑛 . A map 𝐹 : 𝑈 → ℝ𝑚
is continuous on 𝑈 if and only if the pre–image of any open subset of ℝ𝑚 under
𝐹 is an open subset of 𝑈 .

Proof. Suppose that 𝐹 is continuous on 𝑈 . Then, according to Definition 3.3.1, for every 𝑥 ∈ 𝑈 ,

    lim_{∥𝑦−𝑥∥→0} ∥𝐹(𝑦) − 𝐹(𝑥)∥ = 0.

In other words, 𝐹(𝑦) can be made arbitrarily close to 𝐹(𝑥) by making 𝑦 sufficiently close to 𝑥.
Let 𝑉 denote an arbitrary open subset of ℝ𝑚 and consider

𝐹 −1 (𝑉 ) = {𝑥 ∈ 𝑈 ∣ 𝐹 (𝑥) ∈ 𝑉 }.

We claim that 𝐹 −1 (𝑉 ) is open. To see why this is the case, let 𝑥 ∈ 𝐹 −1 (𝑉 ).


Then, 𝐹 (𝑥) ∈ 𝑉 . Therefore, since 𝑉 is open, there exists 𝜀 > 0 such that

𝐵𝜀 (𝐹 (𝑥)) ⊆ 𝑉.

This implies that any 𝑤 ∈ ℝ𝑚 satisfying ∥𝑤 − 𝐹(𝑥)∥ < 𝜀 is also an element of 𝑉 .
Now, by the continuity of 𝐹 at 𝑥, we can make ∥𝐹(𝑦) − 𝐹(𝑥)∥ < 𝜀 by making ∥𝑦 − 𝑥∥ sufficiently small; say, smaller than some 𝛿 > 0. It then follows that
∥𝑦 − 𝑥∥ < 𝛿 implies that ∥𝐹 (𝑦) − 𝐹 (𝑥)∥ < 𝜀,
which in turn implies that 𝐹 (𝑦) ∈ 𝑉 , or 𝑦 ∈ 𝐹 −1 (𝑉 ). We then have that

𝑦 ∈ 𝐵𝛿 (𝑥) implies that 𝑦 ∈ 𝐹 −1 (𝑉 ).

In other words,
𝐵𝛿 (𝑥) ⊆ 𝐹 −1 (𝑉 ).
Therefore, 𝐹 −1 (𝑉 ) is open, and so the claim is proved.
Conversely, assume that for any open subset, 𝑉 , of ℝ𝑚 , 𝐹 −1 (𝑉 ) is open. We
show that this implies that 𝐹 is continuous at any 𝑥 ∈ 𝑈 . To see this, suppose
that 𝑥 ∈ 𝑈 and let 𝜀 > 0 be arbitrary. Now, since 𝐵𝜀 (𝐹 (𝑥)), the open ball of
radius 𝜀 around 𝐹 (𝑥), is an open subset of ℝ𝑚 , it follows that

𝐹 −1 (𝐵𝜀 (𝐹 (𝑥)))

is open, by the assumption we are making in this part of the proof. Hence, since
𝑥 ∈ 𝐹 −1 (𝐵𝜀 (𝐹 (𝑥))), there exists 𝛿 > 0 such that

𝐵𝛿 (𝑥) ⊆ 𝐹 −1 (𝐵𝜀 (𝐹 (𝑥))).

This is equivalent to saying that

∥𝑦 − 𝑥∥ < 𝛿 implies that 𝑦 ∈ 𝐹 −1 (𝐵𝜀 (𝐹 (𝑥))),

or
∥𝑦 − 𝑥∥ < 𝛿 implies that 𝐹 (𝑦) ∈ 𝐵𝜀 (𝐹 (𝑥)),

or
∥𝑦 − 𝑥∥ < 𝛿 implies that ∥𝐹 (𝑦) − 𝐹 (𝑥)∥ < 𝜀.
Thus, given an arbitrary 𝜀 > 0, there exists 𝛿 > 0 such that

∥𝑦 − 𝑥∥ < 𝛿 implies that ∥𝐹 (𝑦) − 𝐹 (𝑥)∥ < 𝜀.

This is precisely the definition of

    lim_{∥𝑦−𝑥∥→0} ∥𝐹(𝑦) − 𝐹(𝑥)∥ = 0.

3.3.3 Compositions of Continuous Functions


Proposition 3.3.14 provides another characterization of continuity: a map is continuous if and only if the pre–image of any open set under the map is open. We will now use this alternate definition to prove that a composition of continuous functions is continuous.
Let 𝑈 be an open subset of ℝ𝑛 and 𝑄 an open subset of ℝ𝑚 . Suppose that
we are given two maps 𝐹 : 𝑈 → ℝ𝑚 and 𝐺 : 𝑄 → ℝ𝑘 . Recall that in order to
define the composition of 𝐺 and 𝐹 , we must require that the image of 𝑈 under
𝐹 is contained in the domain, 𝑄, of 𝐺; that is,

𝐹 (𝑈 ) ⊆ 𝑄.

If this is the case, then we define the composition of 𝐺 and 𝐹 , denoted 𝐺 ∘ 𝐹 ,


by
𝐺 ∘ 𝐹 (𝑥) = 𝐺(𝐹 (𝑥)) for all 𝑥 ∈ 𝑈.
This yields a map
𝐺 ∘ 𝐹 : 𝑈 → ℝ𝑘 .
Proposition 3.3.15. Let 𝑈 be an open subset of ℝ𝑛 and 𝑄 an open subset of
ℝ𝑚 . Suppose that the maps 𝐹 : 𝑈 → ℝ𝑚 and 𝐺 : 𝑄 → ℝ𝑘 are continuous on
their respective domains and that 𝐹 (𝑈 ) ⊆ 𝑄. Then, the composition 𝐺∘𝐹 : 𝑈 →
ℝ𝑘 is continuous on 𝑈 .
Proof. According to Proposition 3.3.14, it suffices to prove that, for any open
set 𝑉 ⊆ ℝ𝑘 , the pre–image (𝐺 ∘ 𝐹 )−1 (𝑉 ) is an open subset of 𝑈 . Thus, let
𝑉 ⊆ ℝ𝑘 be open and observe that

𝑥 ∈ (𝐺 ∘ 𝐹 )−1 (𝑉 ) iff (𝐺 ∘ 𝐹 )(𝑥) ∈ 𝑉
iff 𝐺(𝐹 (𝑥)) ∈ 𝑉
iff 𝐹 (𝑥) ∈ 𝐺−1 (𝑉 )
iff 𝑥 ∈ 𝐹 −1 (𝐺−1 (𝑉 )),

so that
(𝐺 ∘ 𝐹 )−1 (𝑉 ) = 𝐹 −1 (𝐺−1 (𝑉 )).

Now, 𝐺 is continuous; consequently, since 𝑉 is open, 𝐺−1 (𝑉 ) is an open subset of 𝑄 by Proposition 3.3.14. Similarly, since 𝐹 is continuous, it follows again from Proposition 3.3.14 that 𝐹 −1 (𝐺−1 (𝑉 )) is open. Thus, (𝐺 ∘ 𝐹 )−1 (𝑉 ) is open. Since 𝑉 was an arbitrary open subset of ℝ𝑘, it follows from Proposition 3.3.14 that 𝐺 ∘ 𝐹 is continuous on 𝑈 .

Example 3.3.16 (Evaluating scalar fields on paths). Let (𝑎, 𝑏) denote an open
interval of real numbers and 𝜎 : (𝑎, 𝑏) → ℝ𝑛 be a path. Given a scalar field
𝑓 : ℝ𝑛 → ℝ, we can define the composition

𝑓 ∘ 𝜎 : (𝑎, 𝑏) → ℝ

by 𝑓 ∘ 𝜎(𝑡) = 𝑓 (𝜎(𝑡)) for all 𝑡 ∈ (𝑎, 𝑏). Thus, 𝑓 ∘ 𝜎 is a real valued function
of a single variable like those studied in Calculus I and II. An example of a
composition 𝑓 ∘ 𝜎 is provided by evaluating the electrostatic potential, 𝑓 , along
the path of a particle moving according to 𝜎(𝑡), where 𝑡 denotes time.
According to Proposition 3.3.15, if both 𝑓 and 𝜎 are continuous, then so is the function 𝑓 ∘ 𝜎. Therefore, if lim_{𝑡→𝑡𝑜} 𝜎(𝑡) = 𝑥𝑜 for some 𝑡𝑜 ∈ (𝑎, 𝑏) and 𝑥𝑜 ∈ ℝ𝑛, then

    lim_{𝑡→𝑡𝑜} 𝑓(𝜎(𝑡)) = 𝑓(𝑥𝑜).

The point here is that, if 𝑓 is continuous at 𝑥𝑜, the limit of 𝑓 along any continuous path that approaches 𝑥𝑜 must yield the same value, 𝑓(𝑥𝑜).
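This behavior can be illustrated numerically. The sketch below evaluates a continuous scalar field along a continuous path (both chosen as hypothetical examples) and confirms that 𝑓(𝜎(𝑡)) approaches 𝑓(𝜎(𝑡𝑜)) as 𝑡 → 𝑡𝑜:

```python
# Evaluate f(x, y) = x^2 + y^2 along the circular path sigma(t) = (cos t, sin t);
# the composition f . sigma is constant, so f(sigma(t)) -> f(sigma(t0)) = 1.
import math

def f(x, y):
    return x ** 2 + y ** 2

def sigma(t):
    return (math.cos(t), math.sin(t))

t0 = math.pi / 3
values = [f(*sigma(t0 + h)) for h in (0.1, 0.01, 0.001)]
print(values)                              # each value is (numerically) 1
print(abs(f(*sigma(t0)) - 1.0) < 1e-12)
```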

3.3.4 Limits and Continuity


In the previous example we saw that if a scalar field, 𝑓 , is continuous at a point 𝑥𝑜 ∈ ℝ𝑛, then for any continuous path 𝜎 with the property that 𝜎(𝑡) → 𝑥𝑜 as 𝑡 → 𝑡𝑜,

    lim_{𝑡→𝑡𝑜} 𝑓(𝜎(𝑡)) = 𝑓(𝑥𝑜).

In other words, taking the limit along any continuous path approaching 𝑥𝑜 as
𝑡 → 𝑡𝑜 must yield one, and only one, value.

Example 3.3.17. Let 𝑓 : ℝ2∖{(0, 0)} → ℝ be given by

    𝑓(𝑥, 𝑦) = ∣𝑥∣ / √(𝑥² + 𝑦²), for (𝑥, 𝑦) ≠ (0, 0).

Show that lim_{(𝑥,𝑦)→(0,0)} 𝑓(𝑥, 𝑦) does not exist.

Solution: If the limit did exist, then we would be able to define 𝑓 at (0, 0) so that 𝑓 was continuous there. In other words, suppose that

    lim_{(𝑥,𝑦)→(0,0)} 𝑓(𝑥, 𝑦) = 𝐿.

Then, the function 𝑓̂ : ℝ2 → ℝ defined by

    𝑓̂(𝑥, 𝑦) = 𝑓(𝑥, 𝑦)  if (𝑥, 𝑦) ≠ (0, 0);
    𝑓̂(𝑥, 𝑦) = 𝐿        if (𝑥, 𝑦) = (0, 0),

would be continuous on ℝ2. Thus, for any continuous path, 𝜎, with the property that 𝜎(𝑡) → (0, 0) as 𝑡 → 0, we would have that

    lim_{𝑡→0} 𝑓̂(𝜎(𝑡)) = 𝑓̂(0, 0) = 𝐿,

since 𝑓̂ ∘ 𝜎 would be continuous by Proposition 3.3.15.

However, if 𝜎1(𝑡) = (0, 𝑡) for 𝑡 ∈ ℝ, then 𝜎1 is continuous and 𝜎1(𝑡) → (0, 0) as 𝑡 → 0 and

    lim_{𝑡→0} 𝑓̂(𝜎1(𝑡)) = 0;

while, if 𝜎2(𝑡) = (𝑡, 0) for 𝑡 ∈ ℝ, then 𝜎2 is continuous and 𝜎2(𝑡) → (0, 0) as 𝑡 → 0 and

    lim_{𝑡→0} 𝑓̂(𝜎2(𝑡)) = 1.

This yields a contradiction, and therefore

    lim_{(𝑥,𝑦)→(0,0)} ∣𝑥∣ / √(𝑥² + 𝑦²)

cannot exist. □
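The two–path computation above is easy to reproduce numerically:

```python
# f along sigma1(t) = (0, t) is identically 0, while along sigma2(t) = (t, 0)
# it is identically 1 -- two different limiting values, so no limit exists.
import math

def f(x, y):
    return abs(x) / math.sqrt(x ** 2 + y ** 2)

along_sigma1 = [f(0.0, t) for t in (0.1, 0.01, 0.001)]
along_sigma2 = [f(t, 0.0) for t in (0.1, 0.01, 0.001)]
print(along_sigma1)   # -> [0.0, 0.0, 0.0]
print(along_sigma2)   # -> [1.0, 1.0, 1.0]
```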
Chapter 4

Differentiability

In single variable Calculus, a real valued function, 𝑓 : 𝐼 → ℝ, defined on an open interval 𝐼, is said to be differentiable at a point 𝑎 ∈ 𝐼 if the limit

    lim_{𝑥→𝑎} (𝑓(𝑥) − 𝑓(𝑎)) / (𝑥 − 𝑎)

exists. If this limit exists, we denote it by 𝑓′(𝑎) and call it the derivative of 𝑓 at 𝑎. We then have that

    lim_{𝑥→𝑎} (𝑓(𝑥) − 𝑓(𝑎)) / (𝑥 − 𝑎) = 𝑓′(𝑎).
The last expression is equivalent to

    lim_{𝑥→𝑎} [ (𝑓(𝑥) − 𝑓(𝑎)) / (𝑥 − 𝑎) − 𝑓′(𝑎) ] = 0,

which we can re–write as

    lim_{𝑥→𝑎} ∣𝑓(𝑥) − 𝑓(𝑎) − 𝑓′(𝑎)(𝑥 − 𝑎)∣ / ∣𝑥 − 𝑎∣ = 0. (4.1)

Expression (4.1) has the familiar geometric interpretation learned in Calculus I: If 𝑓 is differentiable at 𝑎, then the graph of 𝑦 = 𝑓(𝑥) can be approximated by that of the tangent line,

    𝐿𝑎(𝑥) = 𝑓(𝑎) + 𝑓′(𝑎)(𝑥 − 𝑎) for all 𝑥 ∈ ℝ,

in the sense that, if

    𝐸𝑎(𝑥 − 𝑎) = 𝑓(𝑥) − 𝐿𝑎(𝑥)

is the error in the approximation, then

    lim_{𝑥→𝑎} ∣𝐸𝑎(𝑥 − 𝑎)∣ / ∣𝑥 − 𝑎∣ = 0;


that is, the error in the linear approximation to 𝑓 at 𝑎 goes to 0 more rapidly than ∣𝑥 − 𝑎∣ goes to 0 as 𝑥 gets closer to 𝑎.
If we are interested in differentiability of 𝑓 at a variable point 𝑥 ∈ 𝐼, and not a fixed point 𝑎, then we can rewrite (4.1) more generally as

    lim_{𝑦→𝑥} ∣𝑓(𝑦) − 𝑓(𝑥) − 𝑓′(𝑥)(𝑦 − 𝑥)∣ / ∣𝑦 − 𝑥∣ = 0,

or

    lim_{∣𝑦−𝑥∣→0} ∣𝑓(𝑦) − 𝑓(𝑥) − 𝑓′(𝑥)(𝑦 − 𝑥)∣ / ∣𝑦 − 𝑥∣ = 0. (4.2)
The limit expression in (4.2) is the one we are going to be able to extend to
higher dimensions for a vector–valued function 𝐹 : 𝑈 → ℝ𝑚 defined on an open
subset, 𝑈 , of ℝ𝑛 . The symbols 𝑥 and 𝑦 will represent vectors in 𝑈 , and the
absolute values will turn into norms. To see how the expression 𝑓 ′ (𝑥)(𝑦 − 𝑥) can
be generalized to higher dimensions, let 𝑓 ′ (𝑥) = 𝑚𝑥 , the slope of the tangent
line to the graph of 𝑓 at 𝑥, and 𝑦 = 𝑥 + 𝑤; then,
    𝑓(𝑥 + 𝑤) − 𝑓(𝑥) = 𝑚𝑥𝑤 + 𝐸𝑥(𝑤),

where

    lim_{𝑤→0} ∣𝐸𝑥(𝑤)∣ / ∣𝑤∣ = 0.
Observe that the map

    𝑤 ↦ 𝑚𝑥𝑤
defines a linear map from ℝ to ℝ. We then conclude that if 𝑓 is differentiable at 𝑥, there exists a linear map that approximates the difference 𝑓(𝑥 + 𝑤) − 𝑓(𝑥), in the sense that the error in the approximation goes to 0 as
𝑤 → 0 at a faster rate than ∣𝑤∣ approaches 0. This notion of using linear
maps to approximate functions locally is the key to extending the concept of
differentiability to higher dimensions.

4.1 Definition of Differentiability


Definition 4.1.1 (Differentiability). Let 𝑈 denote an open subset of ℝ𝑛 and
𝐹 : 𝑈 → ℝ𝑚 be a vector–valued map defined on 𝑈 . 𝐹 is said to be differentiable
at 𝑥 ∈ 𝑈 if and only if there exists a linear transformation 𝑇𝑥 : ℝ𝑛 → ℝ𝑚 such
that

    lim_{∥𝑦−𝑥∥→0} ∥𝐹(𝑦) − 𝐹(𝑥) − 𝑇𝑥(𝑦 − 𝑥)∥ / ∥𝑦 − 𝑥∥ = 0. (4.3)
Thus, 𝐹 is differentiable at 𝑥 ∈ 𝑈 iff it can be approximated by a linear
function for values sufficiently close to 𝑥.
Rewrite the expression in (4.3) by putting 𝑦 = 𝑥 + 𝑤; then 𝐹 is differentiable at 𝑥 ∈ 𝑈 iff there exists a linear transformation 𝑇𝑥 : ℝ𝑛 → ℝ𝑚 such that

    lim_{∥𝑤∥→0} ∥𝐹(𝑥 + 𝑤) − 𝐹(𝑥) − 𝑇𝑥(𝑤)∥ / ∥𝑤∥ = 0. (4.4)

We can also say that 𝐹 : 𝑈 → ℝ𝑚 is differentiable at 𝑥 ∈ 𝑈 iff there exists a


linear transformation 𝑇𝑥 : ℝ𝑛 → ℝ𝑚 such that
𝐹 (𝑥 + 𝑤) = 𝐹 (𝑥) + 𝑇𝑥 (𝑤) + 𝐸𝑥 (𝑤), (4.5)
where 𝐸𝑥(𝑤), the error term, has the property that

    lim_{∥𝑤∥→0} ∥𝐸𝑥(𝑤)∥ / ∥𝑤∥ = 0. (4.6)

4.2 The Derivative


Proposition 4.2.1 (Uniqueness of the Linear Approximation). Let 𝑈 denote
an open subset of ℝ𝑛 and 𝐹 : 𝑈 → ℝ𝑚 be a map. If 𝐹 is differentiable at 𝑥 ∈ 𝑈 ,
then the linear transformation, 𝑇𝑥 , given in Definition 4.1.1 is unique.
Proof. Suppose there is another linear transformation, 𝑇 : ℝ𝑛 → ℝ𝑚 , given
by Definition 4.1.1 in addition to 𝑇𝑥 . We show that 𝑇 and 𝑇𝑥 are the same
transformation.
From (4.5) and (4.6) we get that

    𝐹(𝑥 + 𝑤) = 𝐹(𝑥) + 𝑇𝑥(𝑤) + 𝐸𝑥(𝑤),

where

    lim_{∥𝑤∥→0} ∥𝐸𝑥(𝑤)∥ / ∥𝑤∥ = 0.

Similarly,

    𝐹(𝑥 + 𝑤) = 𝐹(𝑥) + 𝑇(𝑤) + 𝐸(𝑤),

where

    lim_{∥𝑤∥→0} ∥𝐸(𝑤)∥ / ∥𝑤∥ = 0.
It then follows that
𝑇 (𝑤) + 𝐸(𝑤) = 𝑇𝑥 (𝑤) + 𝐸𝑥 (𝑤) (4.7)


for all 𝑤 ∈ ℝ𝑛 sufficiently close to 0.
Let û denote a unit vector and put 𝑤 = 𝑡û in (4.7), for 𝑡 ∈ ℝ sufficiently close to 0. Then, by the linearity of 𝑇 and 𝑇𝑥,

    𝑡𝑇(û) + 𝐸(𝑡û) = 𝑡𝑇𝑥(û) + 𝐸𝑥(𝑡û).
Dividing by 𝑡 ≠ 0 we get

    𝑇(û) + 𝐸(𝑡û)/𝑡 = 𝑇𝑥(û) + 𝐸𝑥(𝑡û)/𝑡. (4.8)

Next, observe that

    lim_{∣𝑡∣→0} ∥𝐸𝑥(𝑡û)∥ / ∣𝑡∣ = lim_{∥𝑡û∥→0} ∥𝐸𝑥(𝑡û)∥ / ∥𝑡û∥ = 0

by (4.6). Similarly,

    lim_{∣𝑡∣→0} ∥𝐸(𝑡û)∥ / ∣𝑡∣ = 0.

Thus, letting 𝑡 → 0 in (4.8) we get that

    𝑇(û) = 𝑇𝑥(û).

Hence 𝑇 agrees with 𝑇𝑥 on every unit vector û. Therefore, 𝑇 and 𝑇𝑥 agree on the standard basis {𝑒1, 𝑒2, . . . , 𝑒𝑛} of ℝ𝑛. Consequently, since 𝑇 and 𝑇𝑥 are linear,

    𝑇(𝑣) = 𝑇𝑥(𝑣) for all 𝑣 ∈ ℝ𝑛;

that is, 𝑇 and 𝑇𝑥 are the same transformation.


Proposition 4.2.1 allows us to talk about the derivative of 𝐹 at 𝑥.
Definition 4.2.2 (Derivative of a Map). Let 𝑈 denote an open subset of ℝ𝑛
and 𝐹 : 𝑈 → ℝ𝑚 be a map. If 𝐹 is differentiable at 𝑥 ∈ 𝑈 , then the unique
linear transformation, 𝑇𝑥 , given in Definition 4.1.1 is called the derivative of 𝐹
at 𝑥 and is denoted by 𝐷𝐹 (𝑥). We then have that if 𝐹 is differentiable at 𝑥 ∈ 𝑈 ,
there exists a unique linear transformation, 𝐷𝐹 (𝑥) : ℝ𝑛 → ℝ𝑚 , such that

𝐹 (𝑥 + 𝑤) = 𝐹 (𝑥) + 𝐷𝐹 (𝑥)𝑤 + 𝐸𝑥 (𝑤),

where

    lim_{∥𝑤∥→0} ∥𝐸𝑥(𝑤)∥ / ∥𝑤∥ = 0.

4.3 Example: Differentiable Scalar Fields


Let 𝑈 denote an open subset of ℝ𝑛 and let 𝑓 : 𝑈 → ℝ be a scalar field on 𝑈 . If
𝑓 is differentiable at 𝑥 ∈ 𝑈 , there exists a unique linear map 𝐷𝑓 (𝑥) : ℝ𝑛 → ℝ
such that
𝑓 (𝑥 + 𝑤) = 𝑓 (𝑥) + 𝐷𝑓 (𝑥)𝑤 + 𝐸𝑥 (𝑤) (4.9)
for 𝑤 ∈ ℝ𝑛 with sufficiently small norm, ∥𝑤∥, where

    lim_{∥𝑤∥→0} ∣𝐸𝑥(𝑤)∣ / ∥𝑤∥ = 0. (4.10)

Now, since 𝐷𝑓 (𝑥) is a linear map from ℝ𝑛 to ℝ, there exists an 𝑛–row vector

𝑣 = [ 𝑎1 𝑎2 ⋅⋅⋅ 𝑎𝑛 ]

such that
𝐷𝑓 (𝑥)𝑤 = 𝑣 ⋅ 𝑤 for all 𝑤 ∈ ℝ𝑛 ; (4.11)
that is, 𝐷𝑓(𝑥)𝑤 is the dot–product of 𝑣 and 𝑤. We would like to know what the differentiability of 𝑓 implies about the components of the vector 𝑣.

Apply (4.9) to the case in which 𝑤 = 𝑡ê𝑗, where 𝑡 ∈ ℝ is sufficiently close to 0 and ê𝑗 is the 𝑗th vector in the standard basis for ℝ𝑛, to get that

    𝑓(𝑥 + 𝑡ê𝑗) = 𝑓(𝑥) + 𝐷𝑓(𝑥)(𝑡ê𝑗) + 𝐸𝑥(𝑡ê𝑗). (4.12)

Using the linearity of 𝐷𝑓(𝑥) and (4.11), we get from (4.12) that

    𝑓(𝑥 + 𝑡ê𝑗) − 𝑓(𝑥) = 𝑡 𝑣 ⋅ ê𝑗 + 𝐸𝑥(𝑡ê𝑗).

Dividing by 𝑡 ≠ 0 we then get that

    (𝑓(𝑥 + 𝑡ê𝑗) − 𝑓(𝑥)) / 𝑡 = 𝑎𝑗 + 𝐸𝑥(𝑡ê𝑗)/𝑡. (4.13)

It follows from (4.10) that

    lim_{𝑡→0} ∣𝐸𝑥(𝑡ê𝑗)∣ / ∣𝑡∣ = lim_{∣𝑡∣→0} ∣𝐸𝑥(𝑡ê𝑗)∣ / ∥𝑡ê𝑗∥ = 0,

and therefore, we get from (4.13) that

    lim_{𝑡→0} (𝑓(𝑥 + 𝑡ê𝑗) − 𝑓(𝑥)) / 𝑡 = 𝑎𝑗. (4.14)
Definition 4.3.1 (Partial Derivatives). Let 𝑈 be an open subset of ℝ𝑛, let 𝑓 : 𝑈 → ℝ denote a scalar field, and let 𝑥 ∈ 𝑈 . If

    lim_{𝑡→0} (𝑓(𝑥 + 𝑡ê𝑗) − 𝑓(𝑥)) / 𝑡

exists, we call it the partial derivative of 𝑓 at 𝑥 with respect to 𝑥𝑗 and denote it by ∂𝑓/∂𝑥𝑗 (𝑥).
The argument leading up to equation (4.14) then shows that if the scalar field 𝑓 : 𝑈 → ℝ is differentiable at 𝑥 ∈ 𝑈 , then its partial derivatives at 𝑥 exist and they are the components of the matrix representation of the linear map 𝐷𝑓(𝑥) : ℝ𝑛 → ℝ with respect to the standard basis in ℝ𝑛:

    [𝐷𝑓(𝑥)] = [ ∂𝑓/∂𝑥1 (𝑥)  ∂𝑓/∂𝑥2 (𝑥)  ⋅⋅⋅  ∂𝑓/∂𝑥𝑛 (𝑥) ].

Definition 4.3.2 (Gradient). Suppose that the partial derivatives of a scalar field 𝑓 : 𝑈 → ℝ exist at 𝑥 ∈ 𝑈 . The expression

    [ ∂𝑓/∂𝑥1 (𝑥)  ∂𝑓/∂𝑥2 (𝑥)  ⋅⋅⋅  ∂𝑓/∂𝑥𝑛 (𝑥) ],

usually written as the row vector

    ( ∂𝑓/∂𝑥1 (𝑥), ∂𝑓/∂𝑥2 (𝑥), . . . , ∂𝑓/∂𝑥𝑛 (𝑥) ),

is called the gradient of 𝑓 at 𝑥, and is denoted by the symbol ∇𝑓(𝑥). We then have that

    ∇𝑓(𝑥) = ( ∂𝑓/∂𝑥1 (𝑥), ∂𝑓/∂𝑥2 (𝑥), . . . , ∂𝑓/∂𝑥𝑛 (𝑥) ),

or, in terms of the standard basis in ℝ𝑛,

    ∇𝑓(𝑥) = ∂𝑓/∂𝑥1 (𝑥) ê1 + ∂𝑓/∂𝑥2 (𝑥) ê2 + ⋅⋅⋅ + ∂𝑓/∂𝑥𝑛 (𝑥) ê𝑛.

Example 4.3.3. Let 𝑓 : ℝ2 → ℝ be given by

    𝑓(𝑥, 𝑦) = 𝑒^(−1/(𝑥²+𝑦²))  if (𝑥, 𝑦) ≠ (0, 0);
    𝑓(𝑥, 𝑦) = 0               if (𝑥, 𝑦) = (0, 0).

Compute the partial derivatives of 𝑓 and its gradient. Is 𝑓 differentiable at (0, 0)?

Solution: According to Definition 4.3.1,

    ∂𝑓/∂𝑥 (𝑥, 𝑦) = lim_{𝑡→0} (𝑓(𝑥 + 𝑡, 𝑦) − 𝑓(𝑥, 𝑦)) / 𝑡.

Thus, we compute the rate of change of 𝑓 as 𝑥 changes while 𝑦 is fixed. For the case in which (𝑥, 𝑦) ≠ (0, 0), we may compute ∂𝑓/∂𝑥 as follows:

    ∂𝑓/∂𝑥 (𝑥, 𝑦) = ∂/∂𝑥 [ 𝑒^(−1/(𝑥²+𝑦²)) ]
                 = 𝑒^(−1/(𝑥²+𝑦²)) ⋅ ∂/∂𝑥 ( −1/(𝑥² + 𝑦²) )
                 = 𝑒^(−1/(𝑥²+𝑦²)) ⋅ 2𝑥/(𝑥² + 𝑦²)²
                 = ( 2𝑥/(𝑥² + 𝑦²)² ) 𝑒^(−1/(𝑥²+𝑦²)).

That is, we took the one–dimensional derivative with respect to 𝑥 and thought of 𝑦 as a constant (or fixed with respect to 𝑥). Notice

that we used the Chain Rule twice in the previous calculation. A similar calculation shows that

    ∂𝑓/∂𝑦 (𝑥, 𝑦) = ( 2𝑦/(𝑥² + 𝑦²)² ) 𝑒^(−1/(𝑥²+𝑦²))

for (𝑥, 𝑦) ≠ (0, 0).


To compute the partial derivatives at (0, 0), we must compute the
limit in Definition 4.3.1. For instance,

∂𝑓 𝑓 (𝑡, 0) − 𝑓 (0, 0)
(0, 0) = lim
∂𝑥 𝑡→0 𝑡
1
− 2
𝑒 𝑡
= lim
𝑡→0 𝑡
1/𝑡
= lim .
𝑡→0 1/𝑡2
𝑒
Applying L’Hospital’s Rule we then have that

∂𝑓 1/𝑡2
(0, 0) = lim 2
∂𝑥 𝑡→0
2/𝑡3 𝑒1/𝑡
1 𝑡
= lim
2 𝑡→0 𝑒1/𝑡2
= 0.

Similarly, ∂𝑓/∂𝑦 (0, 0) = 0. It then follows that

    ∇𝑓(0, 0) = (0, 0),

or the zero vector, and, for (𝑥, 𝑦) ≠ (0, 0),

    ∇𝑓(𝑥, 𝑦) = ( 2𝑒^(−1/(𝑥²+𝑦²)) / (𝑥² + 𝑦²)² ) (𝑥, 𝑦),

or

    ∇𝑓(𝑥, 𝑦) = ( 2𝑒^(−1/(𝑥²+𝑦²)) / (𝑥² + 𝑦²)² ) (𝑥 î + 𝑦 ĵ).

To show that 𝑓 is differentiable at (0, 0), we show that

𝑓 (𝑥, 𝑦) = 𝑓 (0, 0) + 𝑇 (𝑥, 𝑦) + 𝐸(𝑥, 𝑦),



where

    lim_{(𝑥,𝑦)→(0,0)} ∣𝐸(𝑥, 𝑦)∣ / √(𝑥² + 𝑦²) = 0,

and 𝑇 is the zero linear transformation from ℝ2 to ℝ. In this case,

    𝐸(𝑥, 𝑦) = 𝑒^(−1/(𝑥²+𝑦²)) if (𝑥, 𝑦) ≠ (0, 0).

Thus, for (𝑥, 𝑦) ≠ (0, 0),

    ∣𝐸(𝑥, 𝑦)∣ / √(𝑥² + 𝑦²) = 𝑒^(−1/(𝑥²+𝑦²)) / √(𝑥² + 𝑦²) = 𝑒^(−1/𝑢²) / 𝑢,

where we have set 𝑢 = √(𝑥² + 𝑦²). Thus,

    lim_{(𝑥,𝑦)→(0,0)} ∣𝐸(𝑥, 𝑦)∣ / √(𝑥² + 𝑦²) = lim_{𝑢→0} 𝑒^(−1/𝑢²) / 𝑢 = 0,

by the same calculation involving L'Hospital's Rule that was used to compute ∂𝑓/∂𝑥 at (0, 0). Consequently, 𝑓 is differentiable at (0, 0) and its derivative is the zero map. □
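The computations in this example can be corroborated with finite differences. The sketch below compares a central difference quotient with the Chain–Rule formula for ∂𝑓/∂𝑥 away from the origin (at a hypothetical sample point), and checks that the difference quotient at the origin is negligibly small:

```python
# Finite-difference check of the partial derivatives of
# f(x, y) = exp(-1/(x^2 + y^2)), with f(0, 0) = 0.
import math

def f(x, y):
    if (x, y) == (0.0, 0.0):
        return 0.0
    return math.exp(-1.0 / (x ** 2 + y ** 2))

def fx_analytic(x, y):
    # (2x / (x^2 + y^2)^2) * exp(-1/(x^2 + y^2)), from the Chain Rule above
    r2 = x ** 2 + y ** 2
    return 2 * x / r2 ** 2 * math.exp(-1.0 / r2)

h = 1e-6
x0, y0 = 1.0, 1.0                         # hypothetical sample point
approx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
print(abs(approx - fx_analytic(x0, y0)) < 1e-6)

# At the origin, the difference quotient e^{-1/t^2}/t is already tiny:
t = 0.1
print(f(t, 0.0) / t)                      # essentially 0, matching the limit
```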
We have seen that if a scalar field 𝑓 : 𝑈 → ℝ is differentiable at 𝑥 ∈ 𝑈 , then

    𝑓(𝑥 + 𝑤) = 𝑓(𝑥) + ∇𝑓(𝑥) ⋅ 𝑤 + 𝐸𝑥(𝑤)

for all 𝑤 ∈ ℝ𝑛 with sufficiently small norm, ∥𝑤∥, where ∇𝑓(𝑥) is the gradient of 𝑓 at 𝑥 ∈ 𝑈 , and

    lim_{∥𝑤∥→0} ∣𝐸𝑥(𝑤)∣ / ∥𝑤∥ = 0.
Applying this to the case where 𝑤 = 𝑡û, for a unit vector û, we get that

    𝑓(𝑥 + 𝑡û) − 𝑓(𝑥) = 𝑡 ∇𝑓(𝑥) ⋅ û + 𝐸𝑥(𝑡û)

for 𝑡 ∈ ℝ sufficiently close to 0. Dividing by 𝑡 ≠ 0 and letting 𝑡 → 0 leads to

    lim_{𝑡→0} (𝑓(𝑥 + 𝑡û) − 𝑓(𝑥)) / 𝑡 = ∇𝑓(𝑥) ⋅ û,

where we have used (4.10).
Definition 4.3.4 (Directional Derivatives). Let 𝑓 : 𝑈 → ℝ denote a scalar field defined on an open subset 𝑈 of ℝ𝑛, and let û be a unit vector in ℝ𝑛. If the limit

    lim_{𝑡→0} (𝑓(𝑥 + 𝑡û) − 𝑓(𝑥)) / 𝑡

exists, we call it the directional derivative of 𝑓 at 𝑥 in the direction of the unit vector û. We denote it by 𝐷û𝑓(𝑥).

We have then shown that if the scalar field 𝑓 is differentiable at 𝑥, then its directional derivative at 𝑥 in the direction of a unit vector û is given by

    𝐷û𝑓(𝑥) = ∇𝑓(𝑥) ⋅ û;

that is, the dot–product of the gradient of 𝑓 at 𝑥 with the unit vector û. In other words, the directional derivative of 𝑓 at 𝑥 in the direction of a unit vector û is the component of the orthogonal projection of ∇𝑓(𝑥) along the direction of û.
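The identity 𝐷û𝑓(𝑥) = ∇𝑓(𝑥) ⋅ û can be tested numerically. The sketch below uses the scalar field 𝑓(𝑥, 𝑦) = 𝑥𝑦 from Example 3.3.4, at a hypothetical point and unit direction:

```python
# Compare the symmetric difference quotient of t -> f(x + t*u) with the
# dot product grad f(x) . u, for f(x, y) = x*y (so grad f = (y, x)).
def f(x, y):
    return x * y

x0, y0 = 2.0, 1.0                         # hypothetical base point
grad = (y0, x0)                           # analytic gradient of f at (x0, y0)

u = (3.0 / 5.0, 4.0 / 5.0)                # a unit vector
t = 1e-6
quotient = (f(x0 + t * u[0], y0 + t * u[1])
            - f(x0 - t * u[0], y0 - t * u[1])) / (2 * t)
directional = grad[0] * u[0] + grad[1] * u[1]
print(abs(quotient - directional) < 1e-8)
```

For this quadratic 𝑓 the symmetric quotient is exact up to rounding, so the agreement is essentially to machine precision.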

4.4 Example: Differentiable Paths


Example 4.4.1. Let 𝐼 denote an open interval in ℝ, and suppose that the path 𝜎 : 𝐼 → ℝ𝑛 is differentiable at 𝑡 ∈ 𝐼. It then follows that there exists a linear map 𝐷𝜎(𝑡) : ℝ → ℝ𝑛 such that

    𝜎(𝑡 + ℎ) − 𝜎(𝑡) = 𝐷𝜎(𝑡)(ℎ) + 𝐸𝑡(ℎ), (4.15)

where

    lim_{ℎ→0} ∥𝐸𝑡(ℎ)∥ / ∣ℎ∣ = 0. (4.16)

(a) Show that the linear map 𝐷𝜎(𝑡) : ℝ → ℝ𝑛 is of the form

    𝐷𝜎(𝑡)(ℎ) = ℎv(𝑡) for all ℎ ∈ ℝ,

where the vector v(𝑡) is obtained from

v(𝑡) = 𝐷𝜎(𝑡)(1);

that is, v(𝑡) is the image of the real number 1 under the linear transforma-
tion 𝐷𝜎(𝑡).

Solution: Let ℎ denote any real number; then, by the linearity of 𝐷𝜎(𝑡),

    𝐷𝜎(𝑡)(ℎ) = 𝐷𝜎(𝑡)(ℎ ⋅ 1) = ℎ𝐷𝜎(𝑡)(1) = ℎv(𝑡).

(b) Write 𝜎(𝑡) = (𝑥1 (𝑡), 𝑥2 (𝑡), . . . , 𝑥𝑛 (𝑡)) for all 𝑡 ∈ 𝐼. Show that if 𝜎 : 𝐼 → ℝ𝑛
is differentiable at 𝑡 ∈ 𝐼 and v = 𝐷𝜎(𝑡)(1), then each function 𝑥𝑗 : 𝐼 → ℝ,
for 𝑗 = 1, 2, . . . , 𝑛, is differentiable at 𝑡, and

𝑥′𝑗 (𝑡) = 𝑣𝑗 (𝑡),

where 𝑣1 , 𝑣2 , . . . , 𝑣𝑛 are the components of the vector v; that is,

v(𝑡) = (𝑣1 (𝑡), 𝑣2 (𝑡), . . . , 𝑣𝑛 (𝑡)), for all 𝑡 ∈ 𝐼.



Solution: Writing 𝜎(𝑡) and v(𝑡) in components, equation (4.15) takes the form

    (𝑥1(𝑡 + ℎ) − 𝑥1(𝑡), . . . , 𝑥𝑛(𝑡 + ℎ) − 𝑥𝑛(𝑡)) = ℎ(𝑣1(𝑡), . . . , 𝑣𝑛(𝑡)) + 𝐸𝑡(ℎ),

or, after division by ℎ ≠ 0,

    ( (𝑥1(𝑡 + ℎ) − 𝑥1(𝑡))/ℎ, . . . , (𝑥𝑛(𝑡 + ℎ) − 𝑥𝑛(𝑡))/ℎ ) = (𝑣1(𝑡), . . . , 𝑣𝑛(𝑡)) + 𝐸𝑡(ℎ)/ℎ.

It then follows from (4.16) that

    lim_{ℎ→0} (𝑥𝑗(𝑡 + ℎ) − 𝑥𝑗(𝑡)) / ℎ = 𝑣𝑗(𝑡) for each 𝑗 = 1, 2, . . . , 𝑛,

which shows that each 𝑥𝑗 : 𝐼 → ℝ is differentiable at 𝑡 with

    𝑥′𝑗(𝑡) = 𝑣𝑗(𝑡)

for each 𝑗 = 1, 2, . . . , 𝑛. □

Notation: If 𝜎 : 𝐼 → ℝ𝑛 is differentiable at every 𝑡 ∈ 𝐼, the vector valued


function v : 𝐼 → ℝ𝑛 given by v(𝑡) = 𝐷𝜎(𝑡)(1) is called the velocity of the path
𝜎, and is usually denoted by 𝜎 ′ (𝑡). We then have that
𝐷𝜎(𝑡)(ℎ) = ℎ𝜎 ′ (𝑡) for all ℎ ∈ ℝ
and all 𝑡 at which the path 𝜎 is differentiable. We can then re–write (4.15) as
𝜎(𝑡 + ℎ) = 𝜎(𝑡) + ℎ𝜎 ′ (𝑡) + 𝐸𝑡 (ℎ).
Re–writing this expression once more, by replacing 𝑡 by 𝑡𝑜 and 𝑡 + ℎ by 𝑡, we
have that
𝜎(𝑡) = 𝜎(𝑡𝑜 ) + (𝑡 − 𝑡𝑜 )𝜎 ′ (𝑡𝑜 ) + 𝐸𝑡𝑜 (𝑡 − 𝑡𝑜 ). (4.17)
where

lim_{𝑡→𝑡𝑜} ∥𝐸𝑡𝑜 (𝑡 − 𝑡𝑜 )∥/∣𝑡 − 𝑡𝑜 ∣ = 0. (4.18)
The expression
𝜎(𝑡𝑜 ) + (𝑡 − 𝑡𝑜 )𝜎 ′ (𝑡𝑜 )
in (4.17) gives the vector–parametric equation of a straight line through 𝜎(𝑡𝑜 )
in the direction of the velocity vector, 𝜎 ′ (𝑡𝑜 ), of the path 𝜎(𝑡) at 𝑡𝑜 . Thus,
(4.17) and (4.18) yield the following interpretation of differentiability of a path
𝜎(𝑡) at 𝑡𝑜 :

If a path 𝜎 : 𝐼 → ℝ𝑛 is differentiable at 𝑡𝑜 , then, near 𝑡𝑜 , it can be
approximated by a straight line through 𝜎(𝑡𝑜 ) in the direction of the
velocity vector 𝜎 ′ (𝑡𝑜 ).
Definition 4.4.2 (Tangent line to a path). The straight line given parametrically
by the vector equation

r(𝑡) = 𝜎(𝑡𝑜 ) + (𝑡 − 𝑡𝑜 )𝜎 ′ (𝑡𝑜 ) for 𝑡 ∈ ℝ

is called the tangent line to the path 𝜎(𝑡) at the point 𝜎(𝑡𝑜 ).
Example 4.4.3. Give the tangent line to the path

𝜎(𝑡) = (cos 𝑡, 𝑡, sin 𝑡) for 𝑡 ∈ ℝ

when 𝑡𝑜 = 𝜋/4.

Solution: The equation of the tangent line is given by

𝑟(𝑡) = 𝜎(𝑡𝑜 ) + (𝑡 − 𝑡𝑜 )𝜎 ′ (𝑡𝑜 ),

where 𝜎 ′ (𝑡) = (− sin 𝑡, 1, cos 𝑡); so that, for 𝑡𝑜 = 𝜋/4, we get that
𝑟(𝑡) = (√2/2, 𝜋/4, √2/2) + (𝑡 − 𝜋/4)(−√2/2, 1, √2/2),

for 𝑡 ∈ ℝ.
Writing (𝑥, 𝑦, 𝑧) for the vector 𝑟(𝑡), we obtain the parametric equa-
tions for the tangent line:
𝑥 = √2/2 − (√2/2)(𝑡 − 𝜋/4)
𝑦 = 𝜋/4 + (𝑡 − 𝜋/4)
𝑧 = √2/2 + (√2/2)(𝑡 − 𝜋/4). □
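The first–order approximation property of the tangent line can be checked numerically. The sketch below (an illustration, not from the notes) measures ∥𝜎(𝑡) − 𝑟(𝑡)∥/∣𝑡 − 𝑡𝑜 ∣ for the helix above and confirms that it shrinks as 𝑡 → 𝑡𝑜 :

```python
import math

# Tangent-line approximation check for the helix sigma(t) = (cos t, t, sin t).

def sigma(t):
    return (math.cos(t), t, math.sin(t))

def sigma_prime(t):
    return (-math.sin(t), 1.0, math.cos(t))

t0 = math.pi / 4

def r(t):
    # tangent line at t0: r(t) = sigma(t0) + (t - t0) * sigma'(t0)
    p, v = sigma(t0), sigma_prime(t0)
    return tuple(p[i] + (t - t0) * v[i] for i in range(3))

def err_ratio(dt):
    # ||sigma(t0 + dt) - r(t0 + dt)|| / |dt|; should vanish as dt -> 0
    return math.dist(sigma(t0 + dt), r(t0 + dt)) / abs(dt)

for dt in (1e-2, 1e-3, 1e-4):
    print(dt, err_ratio(dt))
```

The printed ratios decrease roughly in proportion to ∣𝑡 − 𝑡𝑜 ∣, which is exactly the statement that the error term is 𝑜(∣𝑡 − 𝑡𝑜 ∣).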

4.5 Sufficient Condition for Differentiability


4.5.1 Differentiability of Paths
Let 𝐼 be an open interval of real numbers and 𝜎 : 𝐼 → ℝ𝑛 denote a path in
ℝ𝑛 . Write 𝜎(𝑡) = (𝑥1 (𝑡), 𝑥2 (𝑡), . . . , 𝑥𝑛 (𝑡)), for all 𝑡 ∈ 𝐼, and suppose that the
functions 𝑥1 (𝑡), 𝑥2 (𝑡), . . . , 𝑥𝑛 (𝑡) are all differentiable in 𝐼. We show that the path 𝜎 is
differentiable according to Definition 4.1.1.

Let 𝑡 ∈ 𝐼 and ℎ ∈ ℝ be such that 𝑡 + ℎ ∈ 𝐼. Since each 𝑥𝑖 : 𝐼 → ℝ is


differentiable at 𝑡, we can write

𝑥𝑗 (𝑡 + ℎ) = 𝑥𝑗 (𝑡) + 𝑥′𝑗 (𝑡)ℎ + 𝐸𝑗 (𝑡, ℎ), for all 𝑗 = 1, 2, . . . , 𝑛, (4.19)

where

lim_{ℎ→0} ∣𝐸𝑗 (𝑡, ℎ)∣/∣ℎ∣ = 0, for all 𝑗 = 1, 2, . . . , 𝑛. (4.20)
It follows from (4.19) that

𝑥𝑗 (𝑡 + ℎ) − 𝑥𝑗 (𝑡) − ℎ𝑥′𝑗 (𝑡) = 𝐸𝑗 (𝑡, ℎ) for 𝑗 = 1, 2, . . . , 𝑛. (4.21)

Putting

𝜎 ′ (𝑡) = (𝑥′1 (𝑡), 𝑥′2 (𝑡), . . . , 𝑥′𝑛 (𝑡)), (4.22)
we obtain from the equations in (4.21) that

𝜎(𝑡 + ℎ) − 𝜎(𝑡) − ℎ𝜎 ′ (𝑡) = (𝐸1 (𝑡, ℎ), 𝐸2 (𝑡, ℎ), . . . , 𝐸𝑛 (𝑡, ℎ)),

where 𝐸1 (𝑡, ℎ), 𝐸2 (𝑡, ℎ), . . . , 𝐸𝑛 (𝑡, ℎ) are given in (4.19) and satisfy (4.20). It
then follows that, for ℎ ∕= 0 and ∣ℎ∣ small enough,

(1/ℎ)(𝜎(𝑡 + ℎ) − 𝜎(𝑡) − ℎ𝜎 ′ (𝑡)) = (𝐸1 (𝑡, ℎ)/ℎ, 𝐸2 (𝑡, ℎ)/ℎ, . . . , 𝐸𝑛 (𝑡, ℎ)/ℎ).

Taking the square of the norm on both sides we get that

∥𝜎(𝑡 + ℎ) − 𝜎(𝑡) − ℎ𝜎 ′ (𝑡)∥²/∣ℎ∣² = ∑_{𝑗=1}^{𝑛} (𝐸𝑗 (𝑡, ℎ)/ℎ)².

Hence, by virtue of (4.20),

lim_{ℎ→0} ∥𝜎(𝑡 + ℎ) − 𝜎(𝑡) − ℎ𝜎 ′ (𝑡)∥/∣ℎ∣ = 0,

which shows that 𝜎 is differentiable at 𝑡. Furthermore, 𝐷𝜎(𝑡) : ℝ → ℝ𝑛 is given


by
𝐷𝜎(𝑡)ℎ = ℎ𝜎 ′ (𝑡), for all ℎ ∈ ℝ,
where 𝜎 ′ (𝑡) is given in (4.22).

4.5.2 Differentiability of Scalar Fields


Let 𝑈 denote an open subset of ℝ𝑛 and 𝑓 : 𝑈 → ℝ be a scalar field defined on
𝑈 . Suppose also that the partial derivatives of 𝑓 ,

∂𝑓/∂𝑥1 (𝑥), ∂𝑓/∂𝑥2 (𝑥), . . . , ∂𝑓/∂𝑥𝑛 (𝑥),
exist for all 𝑥 ∈ 𝑈 . We show in this section that, if the partial derivatives
of 𝑓 are continuous on 𝑈 , then the scalar field 𝑓 is differentiable according to
Definition 4.1.1.
Observe that ∇𝑓 defines a map from 𝑈 to ℝ𝑛 by

∇𝑓 (𝑥) = (∂𝑓/∂𝑥1 (𝑥), ∂𝑓/∂𝑥2 (𝑥), . . . , ∂𝑓/∂𝑥𝑛 (𝑥)) for all 𝑥 ∈ 𝑈.
Note that, if the partial derivatives of 𝑓 are continuous on 𝑈 , then the vector
field
∇𝑓 : 𝑈 → ℝ𝑛
is a continuous map.
Proposition 4.5.1. Let 𝑈 denote an open subset of ℝ𝑛 and 𝑓 : 𝑈 → ℝ be a
scalar field defined on 𝑈 . Suppose that the partial derivatives of 𝑓 are continuous
on 𝑈 . Then the scalar field 𝑓 is differentiable.
Proof: We present the proof here for the case 𝑛 = 2. In this case we may write

∇𝑓 (𝑥, 𝑦) = (∂𝑓/∂𝑥 (𝑥, 𝑦), ∂𝑓/∂𝑦 (𝑥, 𝑦)),

where we are assuming that the functions ∂𝑓/∂𝑥 and ∂𝑓/∂𝑦 are continuous on 𝑈 .
Let (𝑥, 𝑦) ∈ 𝑈 ; then, since 𝑈 is open, there exists 𝑟 > 0 such that 𝐵𝑟 (𝑥, 𝑦) ⊆
𝑈 . It then follows that, for (ℎ, 𝑘) ∈ 𝐵𝑟 (0, 0), (𝑥 + ℎ, 𝑦 + 𝑘) ∈ 𝑈 . For (ℎ, 𝑘) ∈
𝐵𝑟 (0, 0) we define

𝐸(ℎ, 𝑘) = 𝑓 (𝑥 + ℎ, 𝑦 + 𝑘) − 𝑓 (𝑥, 𝑦) − ∇𝑓 (𝑥, 𝑦) ⋅ (ℎ, 𝑘). (4.23)

We prove that

lim_{(ℎ,𝑘)→(0,0)} ∣𝐸(ℎ, 𝑘)∣/√(ℎ² + 𝑘²) = 0. (4.24)

Assume that ℎ > 0 and 𝑘 > 0 (the other cases can be treated in an analogous
manner). By the mean value theorem, there are real numbers 𝜃 and 𝜂 such that
0 < 𝜃 < 1 and 0 < 𝜂 < 1 and

𝑓 (𝑥 + ℎ, 𝑦 + 𝑘) − 𝑓 (𝑥, 𝑦 + 𝑘) = ∂𝑓/∂𝑥 (𝑥 + 𝜃ℎ, 𝑦 + 𝑘) ⋅ ℎ,

and

𝑓 (𝑥, 𝑦 + 𝑘) − 𝑓 (𝑥, 𝑦) = ∂𝑓/∂𝑦 (𝑥, 𝑦 + 𝜂𝑘) ⋅ 𝑘.

Consequently,

𝑓 (𝑥 + ℎ, 𝑦 + 𝑘) − 𝑓 (𝑥, 𝑦) = ∂𝑓/∂𝑥 (𝑥 + 𝜃ℎ, 𝑦 + 𝑘) ⋅ ℎ + ∂𝑓/∂𝑦 (𝑥, 𝑦 + 𝜂𝑘) ⋅ 𝑘.

Thus, in view of (4.23), we see that

𝐸(ℎ, 𝑘) = (∂𝑓/∂𝑥 (𝑥 + 𝜃ℎ, 𝑦 + 𝑘) − ∂𝑓/∂𝑥 (𝑥, 𝑦)) ℎ + (∂𝑓/∂𝑦 (𝑥, 𝑦 + 𝜂𝑘) − ∂𝑓/∂𝑦 (𝑥, 𝑦)) 𝑘.

Thus, 𝐸(ℎ, 𝑘) is the dot product of the vector 𝑣(ℎ, 𝑘), given by

𝑣(ℎ, 𝑘) = (∂𝑓/∂𝑥 (𝑥 + 𝜃ℎ, 𝑦 + 𝑘) − ∂𝑓/∂𝑥 (𝑥, 𝑦), ∂𝑓/∂𝑦 (𝑥, 𝑦 + 𝜂𝑘) − ∂𝑓/∂𝑦 (𝑥, 𝑦)),

and the vector (ℎ, 𝑘). Consequently, by the Cauchy–Schwarz inequality,

∣𝐸(ℎ, 𝑘)∣ ⩽ ∥𝑣(ℎ, 𝑘)∥∥(ℎ, 𝑘)∥.

Dividing by ∥(ℎ, 𝑘)∥ for (ℎ, 𝑘) ∕= (0, 0) we then get

∣𝐸(ℎ, 𝑘)∣/√(ℎ² + 𝑘²) ⩽ ∥𝑣(ℎ, 𝑘)∥, (4.25)

where

∥𝑣(ℎ, 𝑘)∥ = √[(∂𝑓/∂𝑥 (𝑥 + 𝜃ℎ, 𝑦 + 𝑘) − ∂𝑓/∂𝑥 (𝑥, 𝑦))² + (∂𝑓/∂𝑦 (𝑥, 𝑦 + 𝜂𝑘) − ∂𝑓/∂𝑦 (𝑥, 𝑦))²]

tends to 0 as (ℎ, 𝑘) → (0, 0) since the partial derivatives of 𝑓 are continuous on


𝑈 . It then follows from the estimate in (4.25) and the Sandwich Theorem that

lim_{(ℎ,𝑘)→(0,0)} ∣𝐸(ℎ, 𝑘)∣/√(ℎ² + 𝑘²) = 0,

which is (4.24). This shows that 𝑓 is differentiable at (𝑥, 𝑦). Since (𝑥, 𝑦) was
arbitrary, the result follows.

4.5.3 𝐶 1 Maps and Differentiability


Definition 4.5.2 (𝐶 1 Maps). Let 𝑈 denote an open subset of ℝ𝑛 . The vector
valued map

𝐹 (𝑥) = (𝑓1 (𝑥), 𝑓2 (𝑥), . . . , 𝑓𝑚 (𝑥)) for all 𝑥 ∈ 𝑈,

where 𝑓𝑖 : 𝑈 → ℝ are scalar fields on 𝑈 , is said to be of class 𝐶 1 , or a 𝐶 1 map,


if the partial derivatives

∂𝑓𝑖 /∂𝑥𝑗 (𝑥), for 𝑖 = 1, 2, . . . , 𝑚 and 𝑗 = 1, 2, . . . , 𝑛,

are continuous on 𝑈 .

Proposition 4.5.1 then says that a 𝐶 1 scalar field must be differentiable.


Thus, being a 𝐶 1 scalar field is sufficient for the map being differentiable. However,
it is not necessary. For example, the function

𝑓 (𝑥, 𝑦) = (𝑥² + 𝑦²) sin(1/(𝑥² + 𝑦²)) if (𝑥, 𝑦) ∕= (0, 0), and 𝑓 (0, 0) = 0,

is differentiable at (0, 0); however, its partial derivatives are not continuous at
the origin (this is shown in Problem 9 of Assignment #5).
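A numerical sketch of why this 𝑓 is differentiable at the origin (with the gradient there equal to (0, 0)): the error quotient ∣𝐸(ℎ, 𝑘)∣/∥(ℎ, 𝑘)∥ is bounded by ∥(ℎ, 𝑘)∥ itself, so it tends to 0.

```python
import math

# f(x, y) = (x^2 + y^2) sin(1/(x^2 + y^2)) for (x, y) != (0, 0), f(0, 0) = 0.

def f(x, y):
    r2 = x*x + y*y
    return r2 * math.sin(1.0 / r2) if r2 != 0 else 0.0

def error_quotient(h, k):
    # With grad f(0,0) = (0,0), E(h,k) = f(h,k), so the quotient is
    # |f(h,k)| / ||(h,k)||, which is bounded by ||(h,k)|| itself.
    return abs(f(h, k)) / math.hypot(h, k)

for t in (1e-1, 1e-3, 1e-5):
    q = error_quotient(t, t)
    print(t, q)
    assert q <= math.hypot(t, t)   # quotient <= ||(h,k)||, so it tends to 0
```

The same bound ∣𝑓 (ℎ, 𝑘)∣ ⩽ ℎ² + 𝑘² is what one uses in the analytic proof of differentiability at (0, 0).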

The result of Proposition 4.5.1 applies more generally to 𝐶 1 vector–valued


maps:

Proposition 4.5.3 (𝐶 1 implies Differentiability). Let 𝑈 denote an open subset


of ℝ𝑛 and 𝐹 : 𝑈 → ℝ𝑚 be a vector field on 𝑈 defined by

𝐹 (𝑥) = (𝑓1 (𝑥), 𝑓2 (𝑥), . . . , 𝑓𝑚 (𝑥)) for all 𝑥 ∈ 𝑈,

where the scalar fields 𝑓𝑖 : 𝑈 → ℝ are of class 𝐶 1 in 𝑈 , for 𝑖 = 1, 2, . . . , 𝑚.


Then, the vector–valued map 𝐹 is differentiable in 𝑈 and the matrix representation
of the linear transformation

𝐷𝐹 (𝑥) : ℝ𝑛 → ℝ𝑚

is given by

⎛ ∂𝑓1 /∂𝑥1 (𝑥)   ∂𝑓1 /∂𝑥2 (𝑥)   ⋅ ⋅ ⋅   ∂𝑓1 /∂𝑥𝑛 (𝑥) ⎞
⎜ ∂𝑓2 /∂𝑥1 (𝑥)   ∂𝑓2 /∂𝑥2 (𝑥)   ⋅ ⋅ ⋅   ∂𝑓2 /∂𝑥𝑛 (𝑥) ⎟
⎜      ..              ..        ..          ..      ⎟      (4.26)
⎝ ∂𝑓𝑚 /∂𝑥1 (𝑥)   ∂𝑓𝑚 /∂𝑥2 (𝑥)   ⋅ ⋅ ⋅   ∂𝑓𝑚 /∂𝑥𝑛 (𝑥) ⎠
The matrix of partial derivatives of the components of 𝐹 in equation (4.26) is
called the Jacobian matrix of the map 𝐹 at 𝑥. It is the matrix that represents
the derivative map 𝐷𝐹 (𝑥) : ℝ𝑛 → ℝ𝑚 with respect to the standard bases in
ℝ𝑛 and ℝ𝑚 . We will therefore denote it by 𝐷𝐹 (𝑥). Hence, 𝐷𝐹 (𝑥)𝑤 can be
understood as matrix multiplication of the Jacobian matrix of 𝐹 at 𝑥 by the
column vector 𝑤. If 𝑚 = 𝑛, then the determinant of the square matrix 𝐷𝐹 (𝑥)
is called the Jacobian determinant of 𝐹 at 𝑥, and is denoted by the symbols
𝐽𝐹 (𝑥) or ∂(𝑓1 , 𝑓2 , . . . , 𝑓𝑛 )/∂(𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ). We then have that

𝐽𝐹 (𝑥) = ∂(𝑓1 , 𝑓2 , . . . , 𝑓𝑛 )/∂(𝑥1 , 𝑥2 , . . . , 𝑥𝑛 ) = det 𝐷𝐹 (𝑥).

Example 4.5.4. Let 𝐹 : ℝ2 → ℝ2 be the map

𝐹 (𝑥, 𝑦) = (𝑥² − 𝑦², 2𝑥𝑦) for all (𝑥, 𝑦) ∈ ℝ2 .

Then, the Jacobian matrix of 𝐹 is

𝐷𝐹 (𝑥, 𝑦) = ⎛ 2𝑥   −2𝑦 ⎞
            ⎝ 2𝑦    2𝑥 ⎠   for all (𝑥, 𝑦) ∈ ℝ2 ,

and the Jacobian determinant is

𝐽𝐹 (𝑥, 𝑦) = 4(𝑥2 + 𝑦 2 ).

If we let 𝑢 = 𝑥² − 𝑦² and 𝑣 = 2𝑥𝑦, we can write the Jacobian determinant as
∂(𝑢, 𝑣)/∂(𝑥, 𝑦).
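A finite–difference sketch (the sample point (1.5, −0.5) is an arbitrary choice) confirming both the Jacobian matrix above and the determinant 𝐽𝐹 (𝑥, 𝑦) = 4(𝑥² + 𝑦²):

```python
# Finite-difference Jacobian of F(x, y) = (x^2 - y^2, 2xy); the sample
# point (1.5, -0.5) is an arbitrary choice for this sketch.

def F(x, y):
    return (x*x - y*y, 2*x*y)

def jacobian_fd(F, x, y, h=1e-6):
    # columns are forward-difference partial derivatives in x and in y
    f0, fx, fy = F(x, y), F(x + h, y), F(x, y + h)
    return [[(fx[i] - f0[i]) / h, (fy[i] - f0[i]) / h] for i in range(2)]

x, y = 1.5, -0.5
J = jacobian_fd(F, x, y)
det = J[0][0]*J[1][1] - J[0][1]*J[1][0]

# analytic values: DF = [[2x, -2y], [2y, 2x]], J_F = 4(x^2 + y^2)
print(abs(J[0][0] - 2*x) < 1e-4, abs(det - 4*(x*x + y*y)) < 1e-3)
```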

4.6 Derivatives of Compositions


The goal of this section is to prove that compositions of differentiable functions
are differentiable:
Theorem 4.6.1 (The Chain Rule). Let 𝑈 denote an open subset of ℝ𝑛 and
𝑄 an open subset of ℝ𝑚 , and let 𝐹 : 𝑈 → ℝ𝑚 and 𝐺 : 𝑄 → ℝ𝑘 be maps.

Suppose that 𝐹 (𝑈 ) ⊆ 𝑄. If 𝐹 is differentiable at 𝑥 ∈ 𝑈 and 𝐺 is differentiable


at 𝑦 = 𝐹 (𝑥) ∈ 𝑄, then the composition

𝐺 ∘ 𝐹 : 𝑈 → ℝ𝑘

is differentiable at 𝑥 and the derivative map 𝐷(𝐺 ∘ 𝐹 )(𝑥) : ℝ𝑛 → ℝ𝑘 is given by

𝐷(𝐺 ∘ 𝐹 )(𝑥)𝑤 = 𝐷𝐺(𝑦)𝐷𝐹 (𝑥)𝑤 for all 𝑤 ∈ ℝ𝑛 .

Proof. Since 𝐹 is differentiable at 𝑥 ∈ 𝑈 , for 𝑤 ∈ ℝ𝑛 with ∥𝑤∥ sufficiently small,

𝐹 (𝑥 + 𝑤) = 𝐹 (𝑥) + 𝐷𝐹 (𝑥)𝑤 + 𝐸𝐹 (𝑤), (4.27)

where

lim_{∥𝑤∥→0} ∥𝐸𝐹 (𝑤)∥/∥𝑤∥ = 0. (4.28)
Similarly, for 𝑣 ∈ ℝ𝑚 with ∥𝑣∥ sufficiently small,

𝐺(𝑦 + 𝑣) = 𝐺(𝑦) + 𝐷𝐺(𝑦)𝑣 + 𝐸𝐺 (𝑣), (4.29)

where

lim_{∥𝑣∥→0} ∥𝐸𝐺 (𝑣)∥/∥𝑣∥ = 0. (4.30)
It then follows from (4.27) that, for 𝑤 ∈ ℝ𝑛 with ∥𝑤∥ sufficiently small,

(𝐺 ∘ 𝐹 )(𝑥 + 𝑤) = 𝐺(𝐹 (𝑥 + 𝑤))


= 𝐺(𝐹 (𝑥) + 𝐷𝐹 (𝑥)𝑤 + 𝐸𝐹 (𝑤)) (4.31)
= 𝐺(𝐹 (𝑥) + 𝑣),

where we have set


𝑣 = 𝐷𝐹 (𝑥)𝑤 + 𝐸𝐹 (𝑤). (4.32)
Observe that, by the triangle inequality and the Cauchy–Schwarz inequality,

∥𝑣∥ ⩽ ∥𝐷𝐹 (𝑥)∥∥𝑤∥ + ∥𝐸𝐹 (𝑤)∥, (4.33)

where

∥𝐷𝐹 (𝑥)∥ = √( ∑_{𝑖=1}^{𝑚} ∑_{𝑗=1}^{𝑛} (∂𝑓𝑖 /∂𝑥𝑗 (𝑥))² );

so that, by virtue of (4.28), we can make ∥𝑣∥ small by making ∥𝑤∥ small. It
then follows from (4.29) and (4.31) that

(𝐺 ∘ 𝐹 )(𝑥 + 𝑤) = 𝐺(𝐹 (𝑥)) + 𝐷𝐺(𝐹 (𝑥))𝑣 + 𝐸𝐺 (𝑣),

where 𝑣 as given in (4.32) can be made sufficiently small in norm by making


∥𝑤∥ sufficiently small. It then follows that, for ∥𝑤∥ sufficiently small,

(𝐺 ∘ 𝐹 )(𝑥 + 𝑤) = (𝐺 ∘ 𝐹 )(𝑥) + 𝐷𝐺(𝑦)𝐷𝐹 (𝑥)𝑤 + 𝐷𝐺(𝑦)𝐸𝐹 (𝑤) + 𝐸𝐺 (𝑣). (4.34)



Put
𝐸(𝑤) = 𝐷𝐺(𝑦)𝐸𝐹 (𝑤) + 𝐸𝐺 (𝑣) (4.35)

for 𝑤 ∈ ℝ𝑛 and 𝑣 as given in (4.32). The differentiability of 𝐺 ∘ 𝐹 at 𝑥 will then
follow from (4.34) if we can prove that

lim_{∥𝑤∥→0} ∥𝐸(𝑤)∥/∥𝑤∥ = 0. (4.36)
This will also prove that

𝐷(𝐺 ∘ 𝐹 )(𝑥)𝑤 = 𝐷𝐺(𝑦)𝐷𝐹 (𝑥)𝑤 for all 𝑤 ∈ ℝ𝑛 .

To prove (4.36), take the norm of 𝐸(𝑤) defined in (4.35), apply the triangle and
Cauchy–Schwarz inequalities, and divide by ∥𝑤∥ to get that

∥𝐸(𝑤)∥/∥𝑤∥ ⩽ ∥𝐷𝐺(𝑦)∥ ∥𝐸𝐹 (𝑤)∥/∥𝑤∥ + (∥𝐸𝐺 (𝑣)∥/∥𝑣∥)(∥𝑣∥/∥𝑤∥), (4.37)

where, by virtue of the inequality in (4.33),

∥𝑣∥/∥𝑤∥ ⩽ ∥𝐷𝐹 (𝑥)∥ + ∥𝐸𝐹 (𝑤)∥/∥𝑤∥.

The proof of (4.36) will then follow from this last estimate, (4.28), (4.30), (4.37)
and the Squeeze Theorem. This completes the proof of the Chain Rule.

Example 4.6.2. Let 𝑈 be an open subset of the 𝑥𝑦–plane, ℝ2 , and 𝑓 : 𝑈 → ℝ


be a differentiable scalar field. Let 𝑄 be an open subset of the 𝑢𝑣–plane, ℝ2 , and
Φ : 𝑄 → ℝ2 be a differentiable map such that Φ(𝑄) ⊆ 𝑈 . Then, by the Chain
Rule, the map
𝑓 ∘ Φ: 𝑄 → ℝ
is differentiable. Furthermore, putting

𝑔(𝑢, 𝑣) = (𝑓 ∘ Φ)(𝑢, 𝑣),

where

Φ(𝑢, 𝑣) = (𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)), for (𝑢, 𝑣) ∈ 𝑄,
we have that
𝐷𝑔(𝑢, 𝑣) = 𝐷𝑓 (𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣))𝐷Φ(𝑢, 𝑣).
Writing this in terms of Jacobian matrices we get
( ∂𝑔/∂𝑢   ∂𝑔/∂𝑣 ) = ( ∂𝑓/∂𝑥   ∂𝑓/∂𝑦 ) ⎛ ∂𝑥/∂𝑢   ∂𝑥/∂𝑣 ⎞
                                        ⎝ ∂𝑦/∂𝑢   ∂𝑦/∂𝑣 ⎠ ,

from which we get that

∂𝑔/∂𝑢 = (∂𝑓/∂𝑥)(∂𝑥/∂𝑢) + (∂𝑓/∂𝑦)(∂𝑦/∂𝑢)

and

∂𝑔/∂𝑣 = (∂𝑓/∂𝑥)(∂𝑥/∂𝑣) + (∂𝑓/∂𝑦)(∂𝑦/∂𝑣).
In the previous example, if Φ : 𝑄 → ℝ2 is a one–to–one map, then Φ is called
a change of variables map. Writing Φ in terms of its components,

𝑥 = 𝑥(𝑢, 𝑣)
𝑦 = 𝑦(𝑢, 𝑣),

we see that Φ changes from 𝑢𝑣–coordinates to 𝑥𝑦–coordinates. As a more concrete example, consider the change to polar coordinates map

𝑥 = 𝑟 cos 𝜃
𝑦 = 𝑟 sin 𝜃,

where 0 ⩽ 𝑟 < ∞ and −𝜋 < 𝜃 ⩽ 𝜋. We then have that

∂𝑓/∂𝑟 = (∂𝑓/∂𝑥)(∂𝑥/∂𝑟) + (∂𝑓/∂𝑦)(∂𝑦/∂𝑟)

and

∂𝑓/∂𝜃 = (∂𝑓/∂𝑥)(∂𝑥/∂𝜃) + (∂𝑓/∂𝑦)(∂𝑦/∂𝜃)
give the partial derivatives of 𝑓 with respect to the polar variables 𝑟 and 𝜃 in
terms of the partial derivatives of 𝑓 with respect to the Cartesian coordinates
𝑥 and 𝑦 and the derivative of the change of variables map
Φ(𝑟, 𝜃) = (𝑟 cos 𝜃, 𝑟 sin 𝜃).
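These chain–rule formulas are easy to check numerically. The sketch below uses an illustrative field 𝑓 (𝑥, 𝑦) = 𝑥²𝑦 (an assumption for the example, not from the notes) and compares the chain–rule values of ∂𝑓/∂𝑟 and ∂𝑓/∂𝜃 with finite differences of 𝑔(𝑟, 𝜃) = 𝑓 (𝑟 cos 𝜃, 𝑟 sin 𝜃):

```python
import math

# Chain-rule check in polar coordinates for an illustrative field
# f(x, y) = x^2 * y (this field is an assumption for the sketch).

def f(x, y):
    return x*x*y

def fx(x, y):
    return 2*x*y       # df/dx

def fy(x, y):
    return x*x         # df/dy

def g(r, th):
    # f expressed in polar coordinates
    return f(r*math.cos(th), r*math.sin(th))

r, th = 2.0, math.pi / 6
x, y = r*math.cos(th), r*math.sin(th)

# chain rule with dx/dr = cos th, dy/dr = sin th,
# dx/dth = -r sin th, dy/dth = r cos th
df_dr = fx(x, y)*math.cos(th) + fy(x, y)*math.sin(th)
df_dth = fx(x, y)*(-r*math.sin(th)) + fy(x, y)*(r*math.cos(th))

h = 1e-6
fd_r = (g(r + h, th) - g(r, th)) / h
fd_th = (g(r, th + h) - g(r, th)) / h
print(abs(fd_r - df_dr) < 1e-4, abs(fd_th - df_dth) < 1e-4)
```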

Example 4.6.3. Let 𝑈 denote an open subset of ℝ𝑛 and 𝐼 an open interval


of real numbers. Suppose that 𝑓 : 𝑈 → ℝ is a scalar differentiable field and
𝜎 : 𝐼 → ℝ𝑛 is a differentiable path with 𝜎(𝐼) ⊆ 𝑈 . Then, by the Chain Rule,
𝑓 (𝜎(𝑡)) is differentiable for all 𝑡 ∈ 𝐼, and

(d/d𝑡) 𝑓 (𝜎(𝑡)) = ∇𝑓 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) for all 𝑡 ∈ 𝐼.

Example 4.6.4 (Tangent plane to a sphere). Let 𝑓 : ℝ3 → ℝ be given by

𝑓 (𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧² for all (𝑥, 𝑦, 𝑧) ∈ ℝ3 .



Define the set


𝑆 = {(𝑥, 𝑦, 𝑧) ∈ ℝ3 ∣ 𝑓 (𝑥, 𝑦, 𝑧) = 1}.
Then, 𝑆 is the sphere of radius 1 around the origin in ℝ3 , or the unit sphere in
ℝ3 .
Let 𝜎 : 𝐼 → ℝ3 denote a 𝐶 1 path that lies entirely on the unit sphere; that
is,
𝑓 (𝜎(𝑡)) = 1 for all 𝑡 ∈ 𝐼.
Then, differentiating with respect to 𝑡 on both sides,

(d/d𝑡) 𝑓 (𝜎(𝑡)) = 0 for all 𝑡 ∈ 𝐼,
and applying the Chain Rule, we obtain that

∇𝑓 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) = 0 for all 𝑡 ∈ 𝐼.

Thus, the gradient of 𝑓 is perpendicular to the tangent to the path 𝜎.


For a fixed point, (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ), on the sphere 𝑆, consider the collection of all
𝐶 1 paths, 𝜎 : 𝐼 → ℝ3 on the sphere, such that 𝜎(𝑡𝑜 ) = (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ) for a fixed
𝑡𝑜 ∈ 𝐼. What we have just derived shows that the tangent vectors to the path
at (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ) all lie on a plane perpendicular to ∇𝑓 (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ). This plane is
called the tangent plane to 𝑆 at (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ), and it has ∇𝑓 (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ) as its
normal vector.
For example, the tangent plane to 𝑆 at the point (1/2, 1/2, 1/√2) has normal
vector

𝑛 = ∇𝑓 (1/2, 1/2, 1/√2),

where

∇𝑓 (𝑥, 𝑦, 𝑧) = 2𝑥 ˆ𝑖 + 2𝑦 ˆ𝑗 + 2𝑧 ˆ𝑘;

so that

𝑛 = ˆ𝑖 + ˆ𝑗 + √2 ˆ𝑘.

Consequently, the tangent plane to 𝑆 at the point (1/2, 1/2, 1/√2) has equation

(1)(𝑥 − 1/2) + (1)(𝑦 − 1/2) + (√2)(𝑧 − 1/√2) = 0,

which simplifies to

𝑥 + 𝑦 + √2 𝑧 = 2.
Chapter 5

Integration

In this chapter we extend the concept of the Riemann integral

∫_𝑎^𝑏 𝑓 (𝑥) d𝑥

for a real valued function, 𝑓 , defined on a closed and bounded interval [𝑎, 𝑏].
We begin by defining integrals of scalar fields over curves in ℝ𝑛 which can be
parametrized by 𝐶 1 paths.

5.1 Path Integrals


Definition 5.1.1 (Simple Curve). A curve 𝐶 in ℝ𝑛 is said to be a 𝐶 1 , simple
curve if there exists a 𝐶 1 path 𝜎 : 𝐼 → ℝ𝑛 , for some open interval 𝐼 containing
a closed and bounded interval [𝑎, 𝑏], such that
(i) 𝜎([𝑎, 𝑏]) = 𝐶,
(ii) 𝜎 is one–to–one on [𝑎, 𝑏], and
(iii) 𝜎 ′ (𝑡) is never the zero vector for all 𝑡 in 𝐼.
The path 𝜎 is called a parametrization of the curve 𝐶.
Example 5.1.2. Let 𝐶 denote the arc of the unit circle in ℝ2 given by
𝐶 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1; 𝑦 ⩾ 0; 0 ⩽ 𝑥 ⩽ 1}.
Figure 5.1.1 shows a picture of 𝐶. The path 𝜎 : [0, 𝜋/2] → ℝ2 given by
𝜎(𝑡) = (cos 𝑡, sin 𝑡) for all 𝑡 ∈ [0, 𝜋/2]
provides a parametrization of 𝐶. Observe that 𝜎 is a 𝐶 1 path defined for all 𝑡 ∈ ℝ
since sin and cos are infinitely differentiable functions in all of ℝ. Furthermore,
observe that
𝜎 ′ (𝑡) = (− sin 𝑡, cos 𝑡) for all 𝑡 ∈ ℝ

Figure 5.1.1: Curve 𝐶

always has norm 1; thus, condition (iii) in Definition 5.1.1 is satisfied.


To show that 𝜎 is one–to–one on [0, 𝜋/2], suppose that

𝜎(𝑡1 ) = 𝜎(𝑡2 )

for some 𝑡1 and 𝑡2 in [0, 𝜋/2]. Then,

(cos(𝑡1 ), sin(𝑡1 )) = (cos(𝑡2 ), sin(𝑡2 ))

and so
cos(𝑡1 ) = cos(𝑡2 ).
Since cos is one–to–one on [0, 𝜋/2], it follows that

𝑡1 = 𝑡2 ,

and, therefore, 𝜎 is one–to–one. Thus, condition (ii) in Definition 5.1.1 also


holds true for 𝜎.
Condition (i) in Definition 5.1.1 is left for the reader to verify.
There is more than one way to parametrize a given simple curve. For
instance, in the previous example, we could have used 𝛾 : [0, 𝜋] → ℝ2 given by

𝛾(𝑡) = (cos(𝑡/2), sin(𝑡/2)) for all 𝑡 ∈ [0, 𝜋].

𝛾 is called a reparametrization of the curve 𝐶. Observe that, since

∥𝛾 ′ (𝑡)∥ = 1/2, for all 𝑡 ∈ ℝ,
this new parametrization of 𝐶 amounts to traversing the curve 𝐶 at a slower
speed.
Definition 5.1.3. Let 𝜎 : [𝑎, 𝑏] → ℝ𝑛 be a differentiable, one–to–one path.
Suppose also that 𝜎 ′ (𝑡), is never the zero vector. Let ℎ : [𝑐, 𝑑] → [𝑎, 𝑏] be a
one–to–one and onto map such that ℎ′ (𝑡) ∕= 0 for all 𝑡 ∈ [𝑐, 𝑑]. Define

𝛾(𝑡) = 𝜎(ℎ(𝑡)) for all 𝑡 ∈ [𝑐, 𝑑].

𝛾 : [𝑐, 𝑑] → ℝ𝑛 is called a reparametrization of 𝜎.



Observe that the path 𝜎 : [0, 1] → ℝ2 given by



𝜎(𝑡) = (𝑡, √(1 − 𝑡²)) for all 𝑡 ∈ [0, 1]

also parametrizes the quarter circle 𝐶 in the previous example. However, it is


not a 𝐶 1 parametrization of 𝐶 in the sense of Definition 5.1.1 since the derivative
map

𝜎 ′ (𝑡) = (1, −𝑡/√(1 − 𝑡²)), for ∣𝑡∣ < 1,
does not extend to a continuous map on an open interval containing [0, 1] since
it is undefined at 𝑡 = 1.

Figure 5.1.2: Curves which are not simple

Definition 5.1.4 (Simple Closed Curve). A curve 𝐶 in ℝ𝑛 is said to be a 𝐶 1 ,


simple closed curve if there exists a 𝐶 1 parametrization of 𝐶, 𝜎 : [𝑎, 𝑏] → ℝ𝑛 ,
satisfying:
(i) 𝜎([𝑎, 𝑏]) = 𝐶,
(ii) 𝜎(𝑎) = 𝜎(𝑏),
(iii) 𝜎 is one–to–one on [𝑎, 𝑏), and
(iv) 𝜎 ′ (𝑡) is never the zero vector for all 𝑡 where it is defined.

Example 5.1.5. The unit circle, 𝐶, in ℝ2 given by

𝐶 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1},

is a 𝐶 1 , simple closed curve. The path 𝜎 : [0, 2𝜋] → ℝ2 given by


𝜎(𝑡) = (cos 𝑡, sin 𝑡) for all 𝑡 ∈ [0, 2𝜋]
provides a 𝐶 1 parametrization of 𝐶 satisfying all the conditions in Definition
5.1.4. The verification of this is left to the reader.
Remark 5.1.6. Condition (ii) in Definition 5.1.1 and condition (iii) in Def-
inition 5.1.4 guarantee that a simple curve does not have self–intersections or
crossings. Thus, the plane curves pictured in Figure 5.1.2 are not simple curves.

5.1.1 Arc Length


Definition 5.1.7 (Arc Length of a Simple Curve). Let 𝐶 denote a simple curve
(either closed or otherwise). We define the arc length of 𝐶, denoted ℓ(𝐶), by
ℓ(𝐶) = ∫_𝑎^𝑏 ∥𝜎 ′ (𝑡)∥ d𝑡,

where 𝜎 : [𝑎, 𝑏] → ℝ𝑛 is a 𝐶 1 parametrization of 𝐶, over a closed and bounded
interval [𝑎, 𝑏], satisfying the conditions in Definition 5.1.1 (or in Definition 5.1.4
for the case of a simple closed curve).
Example 5.1.8. Let 𝐶 denote the quarter of the unit circle in ℝ2 defined in
Example 5.1.2 (see also Figure 5.1.1). In this case,
𝜎(𝑡) = (cos 𝑡, sin 𝑡) for all 𝑡 ∈ [0, 𝜋/2]
provides a 𝐶 1 parametrization of 𝐶 with
𝜎 ′ (𝑡) = (− sin 𝑡, cos 𝑡) for all 𝑡 ∈ ℝ;
so that ∥𝜎 ′ (𝑡)∥ = 1 for all 𝑡 and therefore

ℓ(𝐶) = ∫_0^{𝜋/2} ∥𝜎 ′ (𝑡)∥ d𝑡 = ∫_0^{𝜋/2} d𝑡 = 𝜋/2.
To see why the definition of arc length in Definition 5.1.7 is plausible, con-
sider a simple curve pictured in Figure 5.1.3 and parametrized by the 𝐶 1 path
𝜎 : [𝑎, 𝑏] → ℝ𝑛 .
Subdivide the interval [𝑎, 𝑏] into 𝑁 subintervals by means of a partition
𝑎 = 𝑡𝑜 < 𝑡1 < 𝑡2 < ⋅ ⋅ ⋅ < 𝑡𝑖−1 < 𝑡𝑖 < ⋅ ⋅ ⋅ < 𝑡𝑁 −1 < 𝑡𝑁 = 𝑏.
This partition generates a polygon in ℝ𝑛 constructed by joining 𝜎(𝑡𝑖−1 ) to 𝜎(𝑡𝑖 )
by straight line segments, for 𝑖 = 1, 2, . . . , 𝑁 (see Figure 5.1.3). If we denote
the polygon by 𝑃 , then we can approximate ℓ(𝐶) by ℓ(𝑃 ); we then have that
ℓ(𝐶) ≈ ∑_{𝑖=1}^{𝑁} ∥𝜎(𝑡𝑖 ) − 𝜎(𝑡𝑖−1 )∥.
Figure 5.1.3: Approximating arc length

Now, since 𝜎 is 𝐶 1 , and hence differentiable,

𝜎(𝑡𝑖 ) − 𝜎(𝑡𝑖−1 ) = (𝑡𝑖 − 𝑡𝑖−1 )𝜎 ′ (𝑡𝑖−1 ) + 𝐸𝑖 (𝑡𝑖 − 𝑡𝑖−1 )

for each 𝑖 = 1, 2, . . . , 𝑁 , where

∥𝐸𝑖 (ℎ)∥
lim = 0,
ℎ→0 ∣ℎ∣

for each 𝑖 = 1, 2, . . . , 𝑁 . Now, by making 𝑁 larger and larger, while assuring


that the largest of the differences 𝑡𝑖 − 𝑡𝑖−1 , for each 𝑖 = 1, 2, . . . , 𝑁 , gets smaller
and smaller, we can make the further approximation
ℓ(𝐶) ≈ ∑_{𝑖=1}^{𝑁} ∥𝜎 ′ (𝑡𝑖−1 )∥(𝑡𝑖 − 𝑡𝑖−1 ).

Observe that the expression


∑_{𝑖=1}^{𝑁} ∥𝜎 ′ (𝑡𝑖−1 )∥(𝑡𝑖 − 𝑡𝑖−1 )

is a Riemann sum for the function ∥𝜎 ′ (𝑡)∥ over the interval [𝑎, 𝑏]. Now, since
we are assuming that 𝜎 is of class 𝐶 1 , it follows that the map 𝑡 ↦ ∥𝜎 ′ (𝑡)∥ is

continuous on [𝑎, 𝑏]. Thus, a theorem from analysis guarantees that the sums
∑_{𝑖=1}^{𝑁} ∥𝜎 ′ (𝑡𝑖−1 )∥(𝑡𝑖 − 𝑡𝑖−1 )

converge as 𝑁 → ∞ while

max_{1⩽𝑖⩽𝑁} (𝑡𝑖 − 𝑡𝑖−1 ) → 0.

The limit will be the Riemann integral of ∥𝜎 ′ (𝑡)∥ over the interval [𝑎, 𝑏]. Thus,
it makes sense to define

ℓ(𝐶) = ∫_𝑎^𝑏 ∥𝜎 ′ (𝑡)∥ d𝑡.
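The polygonal sums above are easy to experiment with. For the quarter circle of Example 5.1.8 (exact length 𝜋/2), the following sketch computes the chord sums ℓ(𝑃) for finer and finer partitions:

```python
import math

# Polygonal approximation of arc length for the quarter circle
# sigma(t) = (cos t, sin t), 0 <= t <= pi/2; exact length is pi/2.

def sigma(t):
    return (math.cos(t), math.sin(t))

def polygon_length(a, b, n):
    # sum of chord lengths ||sigma(t_i) - sigma(t_{i-1})||
    ts = [a + i*(b - a)/n for i in range(n + 1)]
    return sum(math.dist(sigma(ts[i-1]), sigma(ts[i])) for i in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, polygon_length(0.0, math.pi/2, n))
```

The chord sums underestimate the arc length and converge to 𝜋/2 as the partition is refined, as the limiting argument above predicts.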

We next see that we will always get the same value of the integral for any
𝐶 1 parametrization of 𝜎.
Let 𝛾(𝑡) = 𝜎(ℎ(𝑡)), for all 𝑡 ∈ [𝑐, 𝑑], be a reparametrization of 𝜎 : [𝑎, 𝑏] → ℝ𝑛 ;
that is, ℎ is a one–to–one, differentiable function from [𝑐, 𝑑] to [𝑎, 𝑏] with ℎ′ (𝑡) > 0
for all 𝑡 ∈ (𝑐, 𝑑). We consider the integral
∫_𝑐^𝑑 ∥𝛾 ′ (𝑡)∥ d𝑡.

By the Chain Rule,

𝛾 ′ (𝑡) = (d/d𝑡)[𝜎(ℎ(𝑡))] = ℎ′ (𝑡)𝜎 ′ (ℎ(𝑡)).
We then have that

∫_𝑐^𝑑 ∥𝛾 ′ (𝑡)∥ d𝑡 = ∫_𝑐^𝑑 ∥ℎ′ (𝑡)𝜎 ′ (ℎ(𝑡))∥ d𝑡

            = ∫_𝑐^𝑑 ∥𝜎 ′ (ℎ(𝑡))∥ ∣ℎ′ (𝑡)∣ d𝑡

            = ∫_𝑐^𝑑 ∥𝜎 ′ (ℎ(𝑡))∥ ℎ′ (𝑡) d𝑡,

since ℎ′ (𝑡) > 0. Next, make the change of variables 𝜏 = ℎ(𝑡). Then, d𝜏 = ℎ′ (𝑡)d𝑡
and

∫_𝑐^𝑑 ∥𝜎 ′ (ℎ(𝑡))∥ ℎ′ (𝑡) d𝑡 = ∫_𝑎^𝑏 ∥𝜎 ′ (𝜏 )∥ d𝜏.

It then follows from Definition 5.1.7 that


ℓ(𝐶) = ∫_𝑐^𝑑 ∥𝛾 ′ (𝑡)∥ d𝑡

for any reparametrization 𝛾 = 𝜎 ∘ ℎ of 𝜎, with ℎ′ > 0. In the case in which


ℎ′ < 0, we get the same result with the understanding that ℎ(𝑐) = 𝑏 and
ℎ(𝑑) = 𝑎. Thus, any reparametrization of 𝜎 will yield the same value for the
integral ℓ(𝐶) given in Definition 5.1.7.
It remains to see that any two parametrizations

𝜎 : [𝑎, 𝑏] → ℝ𝑛 and 𝛾 : [𝑐, 𝑑] → ℝ𝑛

of a simple curve 𝐶 are reparametrizations of each other. This will be proved


in Appendix B.

5.1.2 Defining the Path Integral


Let 𝑈 be an open subset of ℝ𝑛 and 𝐶 be a 𝐶 1 simple curve (closed or otherwise)
which is entirely contained in 𝑈 . Suppose that 𝑓 : 𝑈 → ℝ is a continuous scalar
field defined on 𝑈 . We define the integral of 𝑓 over the curve 𝐶, denoted by
∫_𝐶 𝑓 , as follows:

∫_𝐶 𝑓 = ∫_𝑎^𝑏 𝑓 (𝜎(𝑡))∥𝜎 ′ (𝑡)∥ d𝑡, (5.1)

where 𝜎 : [𝑎, 𝑏] → ℝ𝑛 is a 𝐶 1 parametrization of 𝐶, over a closed and bounded


interval [𝑎, 𝑏], satisfying the conditions in Definition 5.1.1 (or in Definition 5.1.4
for the case of a simple closed curve).
∫_𝐶 𝑓 is called the path integral of 𝑓 over 𝐶. This integral is guaranteed to
exist as a limit of Riemann sums of the function 𝑓 (𝜎(𝑡))∥𝜎 ′ (𝑡)∥ over [𝑎, 𝑏] by
virtue of the continuity of 𝑓 and the fact that 𝜎 is a 𝐶 1 parametrization of 𝐶.
Example 5.1.9. A metal wire is in the shape of the portion of a parabola
𝑦 = 𝑥2 from 𝑥 = −1 to 𝑥 = 1. Suppose the linear mass density along the wire
(in grams per centimeter) is proportional to the distance to the 𝑦–axis (the axis
of the parabola). Compute the mass of the wire.
Solution: The wire is parametrized by the path

𝜎(𝑡) = (𝑡, 𝑡2 ) for − 1 ⩽ 𝑡 ⩽ 1.

Let 𝐶 denote the image of 𝜎. Let 𝑓 (𝑥, 𝑦) denote the linear mass
density of the wire. Then, 𝑓 (𝑥, 𝑦) = 𝑘∣𝑥∣ for some constant of pro-
portionality 𝑘. It then follows that the mass of the wire is
𝑀 = ∫_𝐶 𝑓 = ∫_{−1}^{1} 𝑘∣𝑡∣ ∥𝜎 ′ (𝑡)∥ d𝑡,

where
𝜎 ′ (𝑡) = (1, 2𝑡),

so that

∥𝜎 ′ (𝑡)∥ = √(1 + 4𝑡²).

Hence, by the symmetry of the wire with respect to the 𝑦–axis,

𝑀 = ∫_𝐶 𝑓 = 2 ∫_0^1 𝑘𝑡 √(1 + 4𝑡²) d𝑡.

Evaluating this integral yields

𝑀 = (𝑘/6)(5√5 − 1). □
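The closed form can be confirmed by numerical quadrature (a sketch; the constant 𝑘 = 1 is an arbitrary choice):

```python
import math

# Midpoint-rule quadrature for M = 2 * int_0^1 k*t*sqrt(1 + 4 t^2) dt,
# with k = 1 chosen arbitrarily for the check.
k = 1.0

def integrand(t):
    return k * t * math.sqrt(1 + 4*t*t)

n = 100_000
dt = 1.0 / n
approx = 2 * sum(integrand((i + 0.5) * dt) * dt for i in range(n))
exact = (k / 6) * (5 * math.sqrt(5) - 1)
print(abs(approx - exact) < 1e-8)
```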


The definition of ∫_𝐶 𝑓 given in (5.1) is based on a choice of parametrization,
𝜎 : [𝑎, 𝑏] → ℝ𝑛 , for 𝐶. Thus, in order to see that ∫_𝐶 𝑓 is well defined, we need
to show that the value of ∫_𝐶 𝑓 is independent of the choice of parametrization;
more precisely, we need to see that if 𝛾 : [𝑐, 𝑑] → ℝ𝑛 is another parametrization
of 𝐶, then

∫_𝑐^𝑑 𝑓 (𝛾(𝑡))∥𝛾 ′ (𝑡)∥ d𝑡 = ∫_𝑎^𝑏 𝑓 (𝜎(𝑡))∥𝜎 ′ (𝑡)∥ d𝑡. (5.2)

Consider first the case in which 𝛾 is a reparametrization of 𝜎; that is, the case
in which 𝛾(𝑡) = 𝜎(ℎ(𝑡)), for all 𝑡 ∈ [𝑐, 𝑑], where ℎ is a one–to–one, differentiable
function from [𝑐, 𝑑] to [𝑎, 𝑏] with ℎ′ (𝑡) > 0 for all 𝑡 ∈ (𝑐, 𝑑). In this case, (5.2)
follows from the Chain Rule and the change of variables 𝜏 = ℎ(𝑡), for 𝑡 ∈ [𝑐, 𝑑].
In fact, we have

𝛾 ′ (𝑡) = (d/d𝑡)[𝜎(ℎ(𝑡))] = ℎ′ (𝑡)𝜎 ′ (ℎ(𝑡)),

so that

∫_𝑐^𝑑 𝑓 (𝛾(𝑡))∥𝛾 ′ (𝑡)∥ d𝑡 = ∫_𝑐^𝑑 𝑓 (𝜎(ℎ(𝑡)))∥𝜎 ′ (ℎ(𝑡))∥ ℎ′ (𝑡) d𝑡,

since ℎ′ (𝑡) > 0. Thus, since d𝜏 = ℎ′ (𝑡)d𝑡, we can write

∫_𝑐^𝑑 𝑓 (𝛾(𝑡))∥𝛾 ′ (𝑡)∥ d𝑡 = ∫_𝑎^𝑏 𝑓 (𝜎(𝜏 ))∥𝜎 ′ (𝜏 )∥ d𝜏,

which is (5.2) for the case in which one of the paths is a reparametrization of the
other. Finally, using the results of Appendix B in these notes, we see that (5.2)
holds for any two parametrizations, 𝜎 : [𝑎, 𝑏] → ℝ𝑛 and 𝛾 : [𝑐, 𝑑] → ℝ𝑛 , of the
𝐶 1 simple curve 𝐶.

5.2 Line Integrals


In the previous section we saw how to integrate a scalar field on a 𝐶 1 , simple
curve. In this section we describe how to integrate vector fields on curves.
Technically, what we’ll be doing is integrating a component (which is a scalar)
of a vector field on the given curve. More precisely, let 𝑈 denote an open subset
of ℝ𝑛 and let 𝐹 : 𝑈 → ℝ𝑛 be a vector field on 𝑈 . Suppose that there is a curve,
𝐶, which is contained in 𝑈 and which is parametrized by a 𝐶 1 path

𝜎 : [𝑎, 𝑏] → ℝ𝑛 .

We have seen that the vector 𝜎 ′ (𝑡) gives the tangent direction to the path at
𝜎(𝑡). The vector

𝑇 (𝑡) = 𝜎 ′ (𝑡)/∥𝜎 ′ (𝑡)∥

is, therefore, a unit tangent vector to the path. The tangential component of
the vector field 𝐹 is then given by the dot product of 𝐹 and 𝑇 :

𝐹 ⋅ 𝑇.

The line integral of 𝐹 on the curve 𝐶 parametrized by 𝜎 is given by

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝑎^𝑏 𝐹 (𝜎(𝑡)) ⋅ 𝑇 (𝑡) ∥𝜎 ′ (𝑡)∥ d𝑡.

Observe that we can re–write this as

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝑎^𝑏 𝐹 (𝜎(𝑡)) ⋅ (𝜎 ′ (𝑡)/∥𝜎 ′ (𝑡)∥) ∥𝜎 ′ (𝑡)∥ d𝑡;

therefore,

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝑎^𝑏 𝐹 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) d𝑡. (5.3)

Example 5.2.1. Let 𝐹 : ℝ2 ∖ {(0, 0)} → ℝ2 be given by

𝐹 (𝑥, 𝑦) = (−𝑦/(𝑥² + 𝑦²)) ˆ𝑖 + (𝑥/(𝑥² + 𝑦²)) ˆ𝑗 for (𝑥, 𝑦) ∕= (0, 0),

and let 𝐶 denote the unit circle traversed in the counterclockwise direction.
Evaluate ∫_𝐶 𝐹 ⋅ 𝑇 d𝑠.

Solution: The path

𝜎(𝑡) = (cos 𝑡, sin 𝑡), for 𝑡 ∈ [0, 2𝜋],

is a 𝐶 1 parametrization for 𝐶 with

𝜎 ′ (𝑡) = (− sin 𝑡, cos 𝑡), for 𝑡 ∈ ℝ.



Applying the definition of the line integral in (5.3) yields

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_0^{2𝜋} 𝐹 (cos 𝑡, sin 𝑡) ⋅ (− sin 𝑡, cos 𝑡) d𝑡

            = ∫_0^{2𝜋} (sin² 𝑡 + cos² 𝑡) d𝑡

            = 2𝜋. □
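A Riemann–sum version of this computation (a sketch): since 𝐹 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) ≡ 1 on the unit circle, the sum recovers 2𝜋.

```python
import math

# Riemann sum for int_0^{2 pi} F(sigma(t)) . sigma'(t) dt with
# F(x, y) = (-y/(x^2+y^2), x/(x^2+y^2)) and sigma(t) = (cos t, sin t).

def F(x, y):
    r2 = x*x + y*y
    return (-y/r2, x/r2)

n = 10_000
dt = 2*math.pi / n
total = 0.0
for i in range(n):
    t = (i + 0.5) * dt
    x, y = math.cos(t), math.sin(t)
    Fx, Fy = F(x, y)
    total += (Fx*(-math.sin(t)) + Fy*math.cos(t)) * dt   # integrand is 1
print(abs(total - 2*math.pi) < 1e-9)
```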


Let

𝐹 (𝑥, 𝑦) = 𝑃 (𝑥, 𝑦) ˆ𝑖 + 𝑄(𝑥, 𝑦) ˆ𝑗

denote a vector field defined in a region 𝑈 of ℝ2 , where 𝑃 and 𝑄 are continuous
scalar fields defined on 𝑈 . Let

𝜎(𝑡) = 𝑥(𝑡) ˆ𝑖 + 𝑦(𝑡) ˆ𝑗, for 𝑡 ∈ [𝑎, 𝑏],

be a 𝐶 1 parametrization of a 𝐶 1 curve, 𝐶, contained in 𝑈 . Then

𝜎 ′ (𝑡) = 𝑥′ (𝑡) ˆ𝑖 + 𝑦 ′ (𝑡) ˆ𝑗 for 𝑡 ∈ (𝑎, 𝑏),

and, applying the definition of the line integral of 𝐹 on 𝐶 in (5.3), yields

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝑎^𝑏 (𝑃 (𝑥(𝑡), 𝑦(𝑡))𝑥′ (𝑡) + 𝑄(𝑥(𝑡), 𝑦(𝑡))𝑦 ′ (𝑡)) d𝑡

            = ∫_𝑎^𝑏 (𝑃 (𝑥(𝑡), 𝑦(𝑡)) 𝑥′ (𝑡)d𝑡 + 𝑄(𝑥(𝑡), 𝑦(𝑡)) 𝑦 ′ (𝑡)d𝑡).

Next, use the notation d𝑥 = 𝑥′ (𝑡)d𝑡 and d𝑦 = 𝑦 ′ (𝑡)d𝑡 for the differentials of 𝑥
and 𝑦, respectively, to re–write the line integral as

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝐶 𝑃 d𝑥 + 𝑄 d𝑦. (5.4)

Equation (5.4) suggests another way to evaluate the line integral of a 2–dimensional
vector field on a plane curve.

Example 5.2.2. Evaluate the line integral ∫_𝐶 −𝑦 d𝑥 + (𝑥 − 1) d𝑦, where 𝐶 is
the simple closed curve made up of the line segment from (−1, 0) to (1, 0) and
the top portion of the unit circle traversed in the counterclockwise direction (see
picture in Figure 5.2.4).

Solution: Observe that 𝐶 is not a 𝐶 1 curve since no tangent vector


can be defined at the points (−1, 0) and (1, 0). However, 𝐶 can be
decomposed into two 𝐶 1 curves (see Figure 5.2.4):
Figure 5.2.4: Example 5.2.2 Picture

(i) 𝐶1 : the directed line segment from (−1, 0) to (1, 0), and
(ii) 𝐶2 = {(𝑥, 𝑦) ∈ ℝ2 ∣ 𝑥2 + 𝑦 2 = 1, 𝑦 ⩾ 0}; the top portion of the
unit circle in ℝ2 traversed in the counterclockwise sense.
Then,

∫_𝐶 −𝑦 d𝑥 + (𝑥 − 1) d𝑦 = ∫_{𝐶1} −𝑦 d𝑥 + (𝑥 − 1) d𝑦 + ∫_{𝐶2} −𝑦 d𝑥 + (𝑥 − 1) d𝑦.

We evaluate each of the integrals separately.


On 𝐶1 : 𝑥 = 𝑡 and 𝑦 = 0 for −1 ⩽ 𝑡 ⩽ 1; so that d𝑥 = d𝑡 and d𝑦 = 0.
Thus,

∫_{𝐶1} −𝑦 d𝑥 + (𝑥 − 1) d𝑦 = 0.
On 𝐶2 : 𝑥 = cos 𝑡 and 𝑦 = sin 𝑡 for 0 ⩽ 𝑡 ⩽ 𝜋; so that d𝑥 = − sin 𝑡d𝑡
and d𝑦 = cos 𝑡d𝑡. Thus
∫_{𝐶2} −𝑦 d𝑥 + (𝑥 − 1) d𝑦 = ∫_0^𝜋 (− sin 𝑡(− sin 𝑡) d𝑡 + (cos 𝑡 − 1) cos 𝑡 d𝑡)

            = ∫_0^𝜋 (sin² 𝑡 + cos² 𝑡 − cos 𝑡) d𝑡

            = ∫_0^𝜋 (1 − cos 𝑡) d𝑡

            = [𝑡 − sin 𝑡]_0^𝜋

            = 𝜋.
It then follows that

∫_𝐶 −𝑦 d𝑥 + (𝑥 − 1) d𝑦 = 𝜋. □



We can obtain an analogous equation to that in (5.4) for the case of a three
dimensional field

𝐹 = 𝑃 ˆ𝑖 + 𝑄 ˆ𝑗 + 𝑅 ˆ𝑘,

where 𝑃 , 𝑄 and 𝑅 are scalar fields defined in some region 𝑈 of ℝ3 which contains
the simple curve 𝐶:

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_𝐶 𝑃 d𝑥 + 𝑄 d𝑦 + 𝑅 d𝑧. (5.5)

5.3 Gradient Fields


Suppose that a field 𝐹 : 𝑈 → ℝ𝑛 is the gradient of a 𝐶 1 scalar field, 𝑓 , defined
on 𝑈 ; that is, 𝐹 = ∇𝑓 . Then, for any 𝐶 1 parametrization,

𝜎 : [0, 1] → ℝ𝑛 ,

of a curve 𝐶 in 𝑈 connecting a point 𝑥𝑜 to 𝑥1 , also in 𝑈 ,

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = ∫_0^1 𝐹 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) d𝑡

            = ∫_0^1 ∇𝑓 (𝜎(𝑡)) ⋅ 𝜎 ′ (𝑡) d𝑡

            = ∫_0^1 (d/d𝑡)(𝑓 (𝜎(𝑡))) d𝑡

            = 𝑓 (𝜎(1)) − 𝑓 (𝜎(0))

            = 𝑓 (𝑥1 ) − 𝑓 (𝑥𝑜 ).
Thus, the line integral of 𝐹 = ∇𝑓 on a curve 𝐶 is determined by the values of
𝑓 at the endpoints of the curve.
A field 𝐹 with the property that 𝐹 = ∇𝑓 , for a 𝐶 1 scalar field, 𝑓 , is called
a gradient field, and 𝑓 is called a potential for the field 𝐹 .
Example 5.3.1 (Gravitational Potential). According to Newton’s Law of Uni-
versal Gravitation, the earth exerts a gravitational pull on an object of mass 𝑚
at a point (𝑥, 𝑦, 𝑧) above the surface of the earth, which is at a distance of

𝑟 = √(𝑥² + 𝑦² + 𝑧²)

from the center of the earth (located at the origin of three dimensional space),
and is given by

𝐹 (𝑥, 𝑦, 𝑧) = −(𝑘𝑚/𝑟²) 𝑟ˆ, (5.6)

where 𝑟ˆ is a unit vector in the direction of the vector ⃗𝑟 = 𝑥 ˆ𝑖 + 𝑦 ˆ𝑗 + 𝑧 ˆ𝑘. The
minus sign indicates that the force is directed towards the center of the earth.
Show that the field 𝐹 is a gradient field.

Solution: We claim that 𝐹 = ∇𝑓 , where

𝑓 (𝑟) = 𝑘𝑚/𝑟 and 𝑟 = √(𝑥² + 𝑦² + 𝑧²) ∕= 0. (5.7)

To see why this is so, use the Chain Rule to compute

∂𝑓/∂𝑥 = 𝑓 ′ (𝑟) ∂𝑟/∂𝑥 = −(𝑘𝑚/𝑟²)(𝑥/𝑟).

Similarly,

∂𝑓/∂𝑦 = −(𝑘𝑚/𝑟²)(𝑦/𝑟), and ∂𝑓/∂𝑧 = −(𝑘𝑚/𝑟²)(𝑧/𝑟).

It then follows that

∇𝑓 = (∂𝑓/∂𝑥) ˆ𝑖 + (∂𝑓/∂𝑦) ˆ𝑗 + (∂𝑓/∂𝑧) ˆ𝑘

   = −(𝑘𝑚/𝑟²)(𝑥/𝑟) ˆ𝑖 − (𝑘𝑚/𝑟²)(𝑦/𝑟) ˆ𝑗 − (𝑘𝑚/𝑟²)(𝑧/𝑟) ˆ𝑘

   = −(𝑘𝑚/𝑟²)((𝑥/𝑟) ˆ𝑖 + (𝑦/𝑟) ˆ𝑗 + (𝑧/𝑟) ˆ𝑘)

   = −(𝑘𝑚/𝑟²)(1/𝑟)(𝑥 ˆ𝑖 + 𝑦 ˆ𝑗 + 𝑧 ˆ𝑘)

   = −(𝑘𝑚/𝑟²) 𝑟ˆ,
which is the vector field 𝐹 defined in (5.6). □
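The claim 𝐹 = ∇𝑓 can also be verified by finite differences (a sketch; the value 𝑘𝑚 = 3 and the point (1, 2, 2) are arbitrary choices):

```python
import math

km = 3.0   # the product k*m; an arbitrary value for this check

def f(x, y, z):
    # gravitational potential f = km / r
    return km / math.sqrt(x*x + y*y + z*z)

def F(x, y, z):
    # the field F = -(km / r^2) * r_hat
    r = math.sqrt(x*x + y*y + z*z)
    return tuple(-(km / (r*r)) * (c / r) for c in (x, y, z))

def grad_fd(g, p, h=1e-6):
    # forward finite-difference gradient of g at the point p
    out = []
    for i in range(3):
        q = list(p)
        q[i] += h
        out.append((g(*q) - g(*p)) / h)
    return tuple(out)

p = (1.0, 2.0, 2.0)   # a sample point with r = 3
ok = all(abs(a - b) < 1e-5 for a, b in zip(grad_fd(f, p), F(*p)))
print(ok)
```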

It follows from the fact that the Newtonian gravitational field 𝐹 defined in
(5.6) is a gradient field that the line integral of 𝐹 along any curve 𝐶 in ℝ3 , which
does not go through the origin, connecting ⃗𝑟𝑜 = (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ) to ⃗𝑟1 = (𝑥1 , 𝑦1 , 𝑧1 ),
is given by

∫_𝐶 𝐹 ⋅ 𝑇 d𝑠 = 𝑓 (𝑥1 , 𝑦1 , 𝑧1 ) − 𝑓 (𝑥𝑜 , 𝑦𝑜 , 𝑧𝑜 ) = 𝑘𝑚/𝑟1 − 𝑘𝑚/𝑟𝑜 ,

where 𝑟𝑜 = √(𝑥𝑜² + 𝑦𝑜² + 𝑧𝑜²) and 𝑟1 = √(𝑥1² + 𝑦1² + 𝑧1²). The function 𝑓 defined in
(5.7) is called the gravitational potential.

5.4 Flux Across Plane Curves


According to the Jordan Curve Theorem, a simple closed curve in the plane divides
the plane into two connected regions:

(i) a bounded region called the “inside” of the curve, and



(ii) an unbounded region called the “outside” of the curve.


Let 𝐶 denote a 𝐶 1 , simple, closed curve in the plane parametrized by the 𝐶 1
path
𝜎 : [𝑎, 𝑏] → ℝ2 .
We can then define a unit vector, 𝑛ˆ, perpendicular to the tangent unit vector,
𝑇 , to the curve, and pointing towards the outside of the curve. 𝑛ˆ is called the
outward unit normal to the curve.
Example 5.4.1. The outward unit normal to the unit circle, 𝐶, parametrized
by the path
𝜎(𝑡) = (cos 𝑡, sin 𝑡), for 𝑡 ∈ [0, 2𝜋],
is the vector
ˆ𝑛(𝑡) = (cos 𝑡, sin 𝑡), for 𝑡 ∈ [0, 2𝜋].
In general, if the parametrization of a 𝐶¹, simple, closed curve, 𝐶, is given by

𝜎(𝑡) = (𝑥(𝑡), 𝑦(𝑡)) for 𝑎 ⩽ 𝑡 ⩽ 𝑏,

where 𝑥 and 𝑦 are 𝐶¹ functions of 𝑡, then the vector

ˆ𝑛(𝑡) = ± (1/∥𝜎′(𝑡)∥) ((d𝑦/d𝑡) ˆ𝑖 − (d𝑥/d𝑡) ˆ𝑗),

where the sign is chosen appropriately, will be the outward unit normal to the
curve. We assume, for convenience, that the path 𝜎 is always oriented so that
the positive sign indicates the outward direction.
Given a vector field, 𝐹 = 𝑃 ˆ𝑖 + 𝑄 ˆ𝑗, defined on a region containing a 𝐶¹,
simple, closed curve, 𝐶, we define the flux of 𝐹 across 𝐶 to be the integral

∫_𝐶 𝐹 ⋅ ˆ𝑛 d𝑠 = ∫_𝑎^𝑏 𝐹(𝜎(𝑡)) ⋅ (1/∥𝜎′(𝑡)∥) ((d𝑦/d𝑡) ˆ𝑖 − (d𝑥/d𝑡) ˆ𝑗) ∥𝜎′(𝑡)∥ d𝑡
            = ∫_𝑎^𝑏 (𝑃 ˆ𝑖 + 𝑄 ˆ𝑗) ⋅ ((d𝑦/d𝑡) ˆ𝑖 − (d𝑥/d𝑡) ˆ𝑗) d𝑡
            = ∫_𝑎^𝑏 (𝑃 (d𝑦/d𝑡) − 𝑄 (d𝑥/d𝑡)) d𝑡.

Thus, using the definitions of the differentials of 𝑥 and 𝑦, we can write the flux
of 𝐹 across the curve 𝐶 as

∫_𝐶 𝐹 ⋅ ˆ𝑛 d𝑠 = ∫_𝐶 𝑃 d𝑦 − 𝑄 d𝑥. (5.8)

Example 5.4.2. Compute the flux of the field 𝐹(𝑥, 𝑦) = 𝑥 ˆ𝑖 + 𝑦 ˆ𝑗 across the
unit circle
𝐶 = {(𝑥, 𝑦) ∈ ℝ² ∣ 𝑥² + 𝑦² = 1}
traversed in the counterclockwise direction.

Solution: Parametrize the circle with 𝑥 = cos 𝑡, 𝑦 = sin 𝑡, for
𝑡 ∈ [0, 2𝜋]. Then, d𝑥 = − sin 𝑡 d𝑡, d𝑦 = cos 𝑡 d𝑡, and, using the definition
of flux in (5.8),

∫_𝐶 𝐹 ⋅ ˆ𝑛 d𝑠 = ∫_𝐶 𝑃 d𝑦 − 𝑄 d𝑥
            = ∫_0^{2𝜋} (cos² 𝑡 + sin² 𝑡) d𝑡
            = 2𝜋.
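The value 2𝜋 can be corroborated with a direct Riemann–sum approximation of the flux integral in (5.8). The following sketch (plain Python; the function names are ours) is a minimal implementation, not taken from the text.

```python
import math

def flux(F, sigma, dsigma, a, b, n=1000):
    # Midpoint-rule approximation of the flux integral P dy - Q dx
    # for F = (P, Q) along the parametrization sigma(t), t in [a, b].
    total, dt = 0.0, (b - a) / n
    for k in range(n):
        t = a + (k + 0.5) * dt
        P, Q = F(*sigma(t))
        dxdt, dydt = dsigma(t)
        total += (P * dydt - Q * dxdt) * dt
    return total

F = lambda x, y: (x, y)                        # the field of Example 5.4.2
sigma = lambda t: (math.cos(t), math.sin(t))   # unit circle, counterclockwise
dsigma = lambda t: (-math.sin(t), math.cos(t))

approx = flux(F, sigma, dsigma, 0.0, 2 * math.pi)
assert abs(approx - 2 * math.pi) < 1e-9
```

Here the integrand 𝑃 (d𝑦/d𝑡) − 𝑄 (d𝑥/d𝑡) is identically 1, so the sum reproduces 2𝜋 essentially exactly.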

An interpretation of the flux of a vector field is provided by the following
situation in fluid dynamics: Let 𝑉 (𝑥, 𝑦) denote the velocity field of a plane fluid
in some region 𝑈 in ℝ2 containing the simple closed curve 𝐶. Then, at each
point (𝑥, 𝑦) in 𝑈 , 𝑉 (𝑥, 𝑦) gives the velocity of the fluid as it goes through that
point in units of length per unit time. Suppose we know the density of the fluid
as a function, 𝜌(𝑥, 𝑦), of the position of the fluid in 𝑈 (this is a scalar field) in
units of mass per unit area (since this is a two–dimensional fluid). Then, the
vector field
𝐹 (𝑥, 𝑦) = 𝜌(𝑥, 𝑦)𝑉 (𝑥, 𝑦),
in units of mass per unit length per unit time, gives the rate of fluid flow per
unit length at the point (𝑥, 𝑦). The integrand
𝐹 ⋅ ˆ𝑛 d𝑠,
in the flux definition in (5.8), is then in units of mass per unit time and measures
the amount of fluid that crosses a section of the curve 𝐶 of length d𝑠 in the
outward normal direction. The flux then gives the rate at which the fluid is
crossing the curve 𝐶 from the inside to the outside; in other words, the flux
gives the rate of flow of fluid out of the region bounded by 𝐶.

5.5 Differential Forms


The expression 𝑃 d𝑥 + 𝑄d𝑦 + 𝑅d𝑧 in equation (5.4), where 𝑃 , 𝑄 and 𝑅 are
scalar fields defined in some open region in ℝ3 is an example of a differential
form; more precisely, it is called a differential 1–form. The discussion presented
in this section parallels the discussion found in Chapter 11 of Baxandall and
Liebeck’s text.
Let 𝑈 denote an open subset of ℝ𝑛 . Denote by ℒ(ℝ𝑛 , ℝ) the space of linear
transformations from ℝ𝑛 to ℝ. The space ℒ(ℝ𝑛 , ℝ) is also referred to as the
dual of ℝ𝑛 and denoted by (ℝ𝑛 )∗ .
Definition 5.5.1 (Preliminary Definition of Differential 1–Forms in 𝑈 ). A dif-
ferential 1–form, 𝜔, is a map 𝜔 : 𝑈 → ℒ(ℝ𝑛 , ℝ) which assigns to each 𝑝 ∈ 𝑈 , a
linear transformation 𝜔𝑝 : ℝ𝑛 → ℝ.

It was shown in Problem 4 of Assignment 2 that to every linear transformation 𝜔𝑝 : ℝ𝑛 → ℝ there corresponds a unique vector, 𝑤𝑝 ∈ ℝ𝑛 , such that

𝜔𝑝 (ℎ) = 𝑤𝑝 ⋅ ℎ, for all ℎ ∈ ℝ𝑛 . (5.9)

Denoting the vector 𝑤𝑝 by (𝐹1 (𝑝), 𝐹2 (𝑝), . . . , 𝐹𝑛 (𝑝)), we can then write the ex-
pression in (5.9) as

𝜔𝑝 (ℎ) = 𝐹1 (𝑝)ℎ1 +𝐹2 (𝑝)ℎ2 +⋅ ⋅ ⋅ +𝐹𝑛 (𝑝)ℎ𝑛 , for (ℎ1 , ℎ2 , . . . , ℎ𝑛 ) ∈ ℝ𝑛 . (5.10)

Thus, a differential 1–form, 𝜔, defines a vector field 𝐹 : 𝑈 → ℝ𝑛 given by

𝐹 (𝑝) = (𝐹1 (𝑝), 𝐹2 (𝑝), . . . , 𝐹𝑛 (𝑝)), for all 𝑝 ∈ 𝑈. (5.11)

Conversely, a vector field, 𝐹 : 𝑈 → ℝ𝑛, as in (5.11), gives rise to a differential
1–form, 𝜔, by means of the formula in (5.10). Thus, there is a one–to–one
correspondence between differential 1–forms and the space of vector fields on 𝑈.
In the final definition of a differential 1–form, we will require that the vector
field associated to a given form, 𝜔, be at least 𝐶¹; in fact, we will require that
the field be 𝐶^∞, or smooth.
Definition 5.5.2 (Differential 1–Forms in 𝑈 ). A differential 1–form, 𝜔, on 𝑈
is a (smooth) map 𝜔 : 𝑈 → ℒ(ℝ𝑛 , ℝ) which assigns to each 𝑝 ∈ 𝑈 a linear
transformation, 𝜔𝑝 : ℝ𝑛 → ℝ, given by

𝜔𝑝 (ℎ) = 𝐹1 (𝑝)ℎ1 + 𝐹2 (𝑝)ℎ2 + ⋅ ⋅ ⋅ + 𝐹𝑛 (𝑝)ℎ𝑛 ,

for all ℎ = (ℎ1 , ℎ2 , . . . , ℎ𝑛 ) ∈ ℝ𝑛 , where the vector field 𝐹 = (𝐹1 , 𝐹2 , . . . , 𝐹𝑛 ) is
a smooth vector field in 𝑈 .
Example 5.5.3. Given a smooth function, 𝑓 : 𝑈 → ℝ, the vector field ∇𝑓 : 𝑈 →
ℝ𝑛 gives rise to a differential 1–form denoted by 𝑑𝑓 and defined by
𝑑𝑓𝑝(ℎ) = (∂𝑓/∂𝑥1)(𝑝) ℎ1 + (∂𝑓/∂𝑥2)(𝑝) ℎ2 + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛)(𝑝) ℎ𝑛, (5.12)
for all ℎ = (ℎ1 , ℎ2 , . . . , ℎ𝑛 ) ∈ ℝ𝑛 .
Example 5.5.4. As a special instance of Example 5.5.3, for 𝑗 ∈ {1, 2, . . . , 𝑛},
consider the function 𝑥𝑗 : 𝑈 → ℝ given by

𝑥𝑗 (𝑝) = 𝑝𝑗 , for all 𝑝 = (𝑝1 , 𝑝2 , . . . , 𝑝𝑛 ) ∈ 𝑈.

The differential 1–form, 𝑑𝑥𝑗, is then given by

(𝑑𝑥𝑗)𝑝(ℎ) = (∂𝑥𝑗/∂𝑥1)(𝑝) ℎ1 + (∂𝑥𝑗/∂𝑥2)(𝑝) ℎ2 + ⋅⋅⋅ + (∂𝑥𝑗/∂𝑥𝑛)(𝑝) ℎ𝑛,
for all ℎ = (ℎ1 , ℎ2 , . . . , ℎ𝑛 ) ∈ ℝ𝑛 ; so that,

(𝑑𝑥𝑗 )𝑝 (ℎ) = ℎ𝑗 , for all ℎ = (ℎ1 , ℎ2 , . . . , ℎ𝑛 ) ∈ ℝ𝑛 . (5.13)



Combining the result in (5.12) in Example 5.5.3 with that of (5.13) in Example 5.5.4, we see that for a smooth function 𝑓 : 𝑈 → ℝ,

𝑑𝑓𝑝(ℎ) = (∂𝑓/∂𝑥1)(𝑝) 𝑑𝑥1(ℎ) + (∂𝑓/∂𝑥2)(𝑝) 𝑑𝑥2(ℎ) + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛)(𝑝) 𝑑𝑥𝑛(ℎ),

for all ℎ ∈ ℝ𝑛, which can be written as

𝑑𝑓𝑝 = (∂𝑓/∂𝑥1)(𝑝) 𝑑𝑥1 + (∂𝑓/∂𝑥2)(𝑝) 𝑑𝑥2 + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛)(𝑝) 𝑑𝑥𝑛,

for 𝑝 ∈ 𝑈, or

𝑑𝑓 = (∂𝑓/∂𝑥1) 𝑑𝑥1 + (∂𝑓/∂𝑥2) 𝑑𝑥2 + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛) 𝑑𝑥𝑛, (5.14)
which gives an interpretation of the differential of a smooth function, 𝑓 , as
a differential 1–form. The expression in (5.14) displays 𝑑𝑓 as a linear combi-
nation of the set of differential 1–forms {𝑑𝑥1 , 𝑑𝑥2 , . . . , 𝑑𝑥𝑛 }. In fact, the set
{𝑑𝑥1 , 𝑑𝑥2 , . . . , 𝑑𝑥𝑛 } is a basis for the space of differential 1–forms. Thus, any
differential 1–form, 𝜔, can be written as

𝜔 = 𝐹1 𝑑𝑥1 + 𝐹2 𝑑𝑥2 + ⋅ ⋅ ⋅ + 𝐹𝑛 𝑑𝑥𝑛 , (5.15)

where 𝐹 = (𝐹1 , 𝐹2 , . . . , 𝐹𝑛 ) is a smooth vector field defined in 𝑈 .


Differential 1 forms act on oriented, smooth curves, 𝐶, by means on integra-
tion; we write
∫ ∫
𝜔(𝐶) = 𝜔= 𝐹1 𝑑𝑥1 + 𝐹2 𝑑𝑥2 + ⋅ ⋅ ⋅ + 𝐹𝑛 𝑑𝑥𝑛 .
𝐶 𝐶

Example 5.5.5 (Action on Directed Line Segments). Given points 𝑃1 and 𝑃2
in ℝ𝑛, the segment of the line going from 𝑃1 to 𝑃2, denoted by [𝑃1, 𝑃2], is called
the directed line segment from 𝑃1 to 𝑃2. Thus,

[𝑃1, 𝑃2] = { ⃗𝑂𝑃1 + 𝑡 ⃗𝑃1𝑃2 ∣ 0 ⩽ 𝑡 ⩽ 1 },

where 𝑂 is the origin in ℝ𝑛. Thus, [𝑃1, 𝑃2] is a simple, 𝐶¹ curve parametrized
by the path

𝜎(𝑡) = ⃗𝑂𝑃1 + 𝑡 ⃗𝑃1𝑃2, 0 ⩽ 𝑡 ⩽ 1.

The action of a differential 1–form, 𝜔 = 𝐹1 𝑑𝑥1 + 𝐹2 𝑑𝑥2 + ⋅⋅⋅ + 𝐹𝑛 𝑑𝑥𝑛, is then

𝜔([𝑃1, 𝑃2]) = ∫_{[𝑃1,𝑃2]} 𝐹 ⋅ d⃗𝑟.

Example 5.5.6. Evaluate the differential 1–form 𝜔 = 𝑦𝑧 d𝑥 + 𝑥𝑧 d𝑦 + 𝑥𝑦 d𝑧 on
the directed line segment from the point 𝑃1(1, 1, 0) to the point 𝑃2(3, 2, 1).

Solution: We compute

𝜔([𝑃1, 𝑃2]) = ∫_{[𝑃1,𝑃2]} 𝑦𝑧 d𝑥 + 𝑥𝑧 d𝑦 + 𝑥𝑦 d𝑧,

where [𝑃1, 𝑃2] is parametrized by

𝑥 = 1 + 2𝑡, 𝑦 = 1 + 𝑡, 𝑧 = 𝑡, for 0 ⩽ 𝑡 ⩽ 1.

Then,

𝑑𝑥 = 2 𝑑𝑡, 𝑑𝑦 = 𝑑𝑡, 𝑑𝑧 = 𝑑𝑡,

and

∫_{[𝑃1,𝑃2]} 𝑦𝑧 d𝑥 + 𝑥𝑧 d𝑦 + 𝑥𝑦 d𝑧 = ∫_0^1 [2(1 + 𝑡)𝑡 + (1 + 2𝑡)𝑡 + (1 + 2𝑡)(1 + 𝑡)] d𝑡
                               = ∫_0^1 (2𝑡 + 2𝑡² + 𝑡 + 2𝑡² + 1 + 𝑡 + 2𝑡 + 2𝑡²) d𝑡
                               = ∫_0^1 (1 + 6𝑡 + 6𝑡²) d𝑡
                               = 6.

Thus, the differential 1–form 𝜔 = 𝑦𝑧 d𝑥 + 𝑥𝑧 d𝑦 + 𝑥𝑦 d𝑧 maps the
directed line segment [(1, 1, 0), (3, 2, 1)] to the real number 6. □
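The computation above can be double–checked numerically with a midpoint Riemann sum along the same parametrization; a quick sketch in plain Python (the helper name is ours):

```python
def omega_on_segment(P1, P2, n=10000):
    # Midpoint-rule approximation of the integral of yz dx + xz dy + xy dz
    # over the directed segment [P1, P2], parametrized by
    # sigma(t) = P1 + t (P2 - P1), 0 <= t <= 1.
    v = [b - a for a, b in zip(P1, P2)]
    dx, dy, dz = v                      # constant differentials along the segment
    total, dt = 0.0, 1.0 / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = (a + t * d for a, d in zip(P1, v))
        total += (y * z * dx + x * z * dy + x * y * dz) * dt
    return total

assert abs(omega_on_segment((1, 1, 0), (3, 2, 1)) - 6.0) < 1e-6
```

The sum converges to the exact value 6 obtained in the example.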

Example 5.5.7. Let 𝜔 = 𝑘1 𝑑𝑥1 + 𝑘2 𝑑𝑥2 + ⋅⋅⋅ + 𝑘𝑛 𝑑𝑥𝑛, where 𝑘1, 𝑘2, . . . , 𝑘𝑛 are
real constants, be a constant differential 1–form. For any two distinct points,
𝑃𝑜 and 𝑃1, in ℝ𝑛, compute 𝜔([𝑃𝑜, 𝑃1]).

Solution: The vector field corresponding to 𝜔 is

𝐹(𝑥) = (𝑘1, 𝑘2, . . . , 𝑘𝑛), for all 𝑥 ∈ ℝ𝑛.

Compute

𝜔([𝑃𝑜, 𝑃1]) = ∫_{[𝑃𝑜,𝑃1]} 𝐹 ⋅ d⃗𝑟 = ∫_0^1 𝐹(𝜎(𝑡)) ⋅ 𝜎′(𝑡) d𝑡,

where

𝜎(𝑡) = ⃗𝑂𝑃𝑜 + 𝑡𝑣, for 0 ⩽ 𝑡 ⩽ 1,

and 𝑣 = ⃗𝑃𝑜𝑃1, the vector that goes from 𝑃𝑜 to 𝑃1. Thus,

𝜔([𝑃𝑜, 𝑃1]) = 𝐾 ⋅ 𝑣,

where 𝐾 = (𝑘1, 𝑘2, . . . , 𝑘𝑛) is the constant value of the field 𝐹. □

Definition 5.5.8 (Differential 0–Forms). A differential 0–form in 𝑈 ⊆ ℝ𝑛 is a
𝐶^∞ scalar field 𝑓 : 𝑈 → ℝ which acts on points in 𝑈 by means of evaluation of
the function at those points; that is,

𝑓𝑝 = 𝑓(𝑝), for all 𝑝 ∈ 𝑈.

Definition 5.5.9 (Differential of a 0–Form). The differential of a 0–form, 𝑓, in
𝑈 is the differential 1–form given by

𝑑𝑓 = (∂𝑓/∂𝑥1) 𝑑𝑥1 + (∂𝑓/∂𝑥2) 𝑑𝑥2 + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛) 𝑑𝑥𝑛.

Example 5.5.10. Given a 0–form 𝑓 in ℝ𝑛 , evaluate 𝑑𝑓 ([𝑃1 , 𝑃2 ]).

Solution: Compute the line integral

∫_{[𝑃1,𝑃2]} d𝑓 = ∫_{[𝑃1,𝑃2]} (∂𝑓/∂𝑥1) 𝑑𝑥1 + (∂𝑓/∂𝑥2) 𝑑𝑥2 + ⋅⋅⋅ + (∂𝑓/∂𝑥𝑛) 𝑑𝑥𝑛
             = ∫_0^1 ∇𝑓(𝜎(𝑡)) ⋅ 𝜎′(𝑡) d𝑡,

where

𝜎(𝑡) = ⃗𝑂𝑃1 + 𝑡 ⃗𝑃1𝑃2, 0 ⩽ 𝑡 ⩽ 1.

Thus, by the Chain Rule,

∫_{[𝑃1,𝑃2]} 𝑑𝑓 = ∫_0^1 (𝑑/𝑑𝑡)[𝑓(𝜎(𝑡))] d𝑡 = 𝑓(𝑃2) − 𝑓(𝑃1),

where we have used the Fundamental Theorem of Calculus. The value
d𝑓([𝑃1, 𝑃2]) is therefore determined by the values of 𝑓 at the endpoints
of the directed line segment [𝑃1, 𝑃2]. □

Example 5.5.11. For two distinct points 𝑃𝑜(𝑥𝑜, 𝑦𝑜, 𝑧𝑜) and 𝑃1(𝑥1, 𝑦1, 𝑧1) in
ℝ³, compute 𝑑𝑥([𝑃𝑜, 𝑃1]), 𝑑𝑦([𝑃𝑜, 𝑃1]) and 𝑑𝑧([𝑃𝑜, 𝑃1]).

Solution: Apply the result of the previous example to the function
𝑓(𝑥, 𝑦, 𝑧) = 𝑥, for all (𝑥, 𝑦, 𝑧) ∈ ℝ³, to obtain that

𝑑𝑥([𝑃𝑜, 𝑃1]) = 𝑓(𝑃1) − 𝑓(𝑃𝑜) = 𝑥1 − 𝑥𝑜.

Similarly,

𝑑𝑦([𝑃𝑜, 𝑃1]) = 𝑦1 − 𝑦𝑜,

and

𝑑𝑧([𝑃𝑜, 𝑃1]) = 𝑧1 − 𝑧𝑜.

Next, we define differential 2–forms. Before we give a formal definition, we


need to define bilinear, skew–symmetric forms.
Definition 5.5.12 (Bilinear Forms on ℝ𝑛 ). A bilinear form on ℝ𝑛 is a function
from ℝ𝑛 × ℝ𝑛 to ℝ which is linear in both variables; that is, 𝐵 : ℝ𝑛 × ℝ𝑛 → ℝ
is bilinear if
𝐵(𝑐1 𝑣1 + 𝑐2 𝑣2 , 𝑤) = 𝑐1 𝐵(𝑣1 , 𝑤) + 𝑐2 𝐵(𝑣2 , 𝑤),
for all 𝑣1 , 𝑣2 , 𝑤 ∈ ℝ𝑛 , and all 𝑐1 , 𝑐2 ∈ ℝ, and
𝐵(𝑣, 𝑐1 𝑤1 + 𝑐2 𝑤2 ) = 𝑐1 𝐵(𝑣, 𝑤1 ) + 𝑐2 𝐵(𝑣, 𝑤2 ),
for all 𝑣, 𝑤1 , 𝑤2 ∈ ℝ𝑛 , and all 𝑐1 , 𝑐2 ∈ ℝ.
Example 5.5.13. The function 𝐵 : ℝ𝑛 × ℝ𝑛 → ℝ given by 𝐵(𝑣, 𝑤) = 𝑣 ⋅ 𝑤, the
dot–product of 𝑣 and 𝑤, is bilinear.
Definition 5.5.14 (Skew–Symmetric Bilinear Forms on ℝ𝑛 ). A bilinear form,
𝐵 : ℝ𝑛 × ℝ𝑛 → ℝ on ℝ𝑛 , is said to be skew–symmetric if
𝐵(𝑤, 𝑣) = −𝐵(𝑣, 𝑤), for all 𝑣, 𝑤 ∈ ℝ𝑛 .
Example 5.5.15. For a fixed vector, 𝑢, in ℝ³, define 𝐵 : ℝ³ × ℝ³ → ℝ by
𝐵(𝑣, 𝑤) = 𝑢 ⋅ (𝑣 × 𝑤), the triple scalar product of 𝑢, 𝑣 and 𝑤, for all 𝑣 and 𝑤 in
ℝ³. Then, 𝐵 is skew–symmetric.
Example 5.5.16 (Skew–Symmetric Forms in ℝ²). Let 𝐵 : ℝ² × ℝ² → ℝ be
a skew–symmetric bilinear form on ℝ². We then have that 𝐵(ˆ𝑖, ˆ𝑖) =
𝐵(ˆ𝑗, ˆ𝑗) = 0 and 𝐵(ˆ𝑗, ˆ𝑖) = −𝐵(ˆ𝑖, ˆ𝑗). Set 𝜆 = 𝐵(ˆ𝑖, ˆ𝑗). Then, for any vectors
𝑣 = 𝑎ˆ𝑖 + 𝑏ˆ𝑗 and 𝑤 = 𝑐ˆ𝑖 + 𝑑ˆ𝑗 in ℝ², we have that

𝐵(𝑣, 𝑤) = 𝐵(𝑎ˆ𝑖 + 𝑏ˆ𝑗, 𝑐ˆ𝑖 + 𝑑ˆ𝑗)
       = 𝑎𝑐 𝐵(ˆ𝑖, ˆ𝑖) + 𝑎𝑑 𝐵(ˆ𝑖, ˆ𝑗) + 𝑏𝑐 𝐵(ˆ𝑗, ˆ𝑖) + 𝑏𝑑 𝐵(ˆ𝑗, ˆ𝑗)
       = (𝑎𝑑 − 𝑏𝑐) 𝐵(ˆ𝑖, ˆ𝑗)
       = 𝜆 (𝑎𝑑 − 𝑏𝑐)
       = 𝜆 det[ 𝑣 𝑤 ].

We have therefore shown that for every skew–symmetric, bilinear form, 𝐵 : ℝ² × ℝ² → ℝ, there exists 𝜆 ∈ ℝ such that

𝐵(𝑣, 𝑤) = 𝜆 det[ 𝑣 𝑤 ], for all 𝑣, 𝑤 ∈ ℝ2 , (5.16)

where [ 𝑣 𝑤 ] denotes the 2 × 2 matrix whose first column are the entries of 𝑣,
and whose second column are the entries of 𝑤.
Example 5.5.17 (Skew–Symmetric Forms in ℝ³). Let 𝐵 : ℝ³ × ℝ³ → ℝ be a
skew–symmetric bilinear form on ℝ³. We then have that

𝐵(ˆ𝑖, ˆ𝑖) = 𝐵(ˆ𝑗, ˆ𝑗) = 𝐵(ˆ𝑘, ˆ𝑘) = 0 (5.17)

and

𝐵(ˆ𝑗, ˆ𝑖) = −𝐵(ˆ𝑖, ˆ𝑗), 𝐵(ˆ𝑘, ˆ𝑖) = −𝐵(ˆ𝑖, ˆ𝑘), 𝐵(ˆ𝑘, ˆ𝑗) = −𝐵(ˆ𝑗, ˆ𝑘). (5.18)

Set

𝜆1 = 𝐵(ˆ𝑗, ˆ𝑘), 𝜆2 = 𝐵(ˆ𝑘, ˆ𝑖), 𝜆3 = 𝐵(ˆ𝑖, ˆ𝑗). (5.19)

Then, for any vectors 𝑣 = 𝑎1ˆ𝑖 + 𝑎2ˆ𝑗 + 𝑎3ˆ𝑘 and 𝑤 = 𝑏1ˆ𝑖 + 𝑏2ˆ𝑗 + 𝑏3ˆ𝑘 in ℝ³, we
have that

𝐵(𝑣, 𝑤) = 𝐵(𝑎1ˆ𝑖 + 𝑎2ˆ𝑗 + 𝑎3ˆ𝑘, 𝑏1ˆ𝑖 + 𝑏2ˆ𝑗 + 𝑏3ˆ𝑘)
       = 𝑎1𝑏2 𝐵(ˆ𝑖, ˆ𝑗) + 𝑎1𝑏3 𝐵(ˆ𝑖, ˆ𝑘) + 𝑎2𝑏1 𝐵(ˆ𝑗, ˆ𝑖)
         + 𝑎2𝑏3 𝐵(ˆ𝑗, ˆ𝑘) + 𝑎3𝑏1 𝐵(ˆ𝑘, ˆ𝑖) + 𝑎3𝑏2 𝐵(ˆ𝑘, ˆ𝑗),

where we have used (5.17). Rearranging terms we obtain

𝐵(𝑣, 𝑤) = 𝑎2𝑏3 𝐵(ˆ𝑗, ˆ𝑘) + 𝑎3𝑏2 𝐵(ˆ𝑘, ˆ𝑗)
         + 𝑎3𝑏1 𝐵(ˆ𝑘, ˆ𝑖) + 𝑎1𝑏3 𝐵(ˆ𝑖, ˆ𝑘)
         + 𝑎1𝑏2 𝐵(ˆ𝑖, ˆ𝑗) + 𝑎2𝑏1 𝐵(ˆ𝑗, ˆ𝑖). (5.20)

Next, use (5.18) and (5.19) to rewrite (5.20) as

𝐵(𝑣, 𝑤) = 𝜆1 (𝑎2𝑏3 − 𝑎3𝑏2) − 𝜆2 (𝑎1𝑏3 − 𝑎3𝑏1) + 𝜆3 (𝑎1𝑏2 − 𝑎2𝑏1). (5.21)

Recognizing the right–hand side of (5.21) as the triple scalar product of the
vector Λ = 𝜆1ˆ𝑖 + 𝜆2ˆ𝑗 + 𝜆3ˆ𝑘 with the vectors 𝑣 and 𝑤, we have shown that for
every skew–symmetric, bilinear form, 𝐵 : ℝ³ × ℝ³ → ℝ, there exists a vector
Λ ∈ ℝ³ such that

𝐵(𝑣, 𝑤) = Λ ⋅ (𝑣 × 𝑤), for all 𝑣, 𝑤 ∈ ℝ³. (5.22)
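The representation (5.22) can be tested computationally: any skew–symmetric matrix 𝐴 defines a skew–symmetric bilinear form 𝐵(𝑣, 𝑤) = 𝑣ᵀ𝐴𝑤, and the vector Λ with components 𝐵(ˆ𝑗, ˆ𝑘), 𝐵(ˆ𝑘, ˆ𝑖), 𝐵(ˆ𝑖, ˆ𝑗) should reproduce 𝐵 via the triple scalar product. A sketch in plain Python (the specific matrix and vectors are arbitrary choices of ours):

```python
def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# An arbitrary skew-symmetric matrix A (A^T = -A) defines a
# skew-symmetric bilinear form B(v, w) = v^T A w.
A = [[0, 3, -2],
     [-3, 0, 5],
     [2, -5, 0]]

def B(v, w):
    return sum(v[i] * A[i][j] * w[j] for i in range(3) for j in range(3))

# Build Lambda from the values of B on the basis pairs.
i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)
Lam = (B(j_hat, k_hat), B(k_hat, i_hat), B(i_hat, j_hat))

v, w = (1.0, -2.0, 4.0), (0.5, 3.0, -1.0)
assert abs(B(v, w) - dot(Lam, cross(v, w))) < 1e-12
```

With this matrix, Λ = (5, 2, 3), and both sides evaluate to −32 for the sample vectors.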

Let 𝒜(ℝ𝑛 ×ℝ𝑛 , ℝ) denote the space of skew–symmetric bilinear forms in ℝ𝑛 .


Definition 5.5.18 (Differential 2–Forms). Let 𝑈 denote an open subset of ℝ𝑛 .
A differential 2–form in 𝑈 is a smooth map, 𝜔 : 𝑈 → 𝒜(ℝ𝑛 × ℝ𝑛 , ℝ), which
assigns to each 𝑝 ∈ 𝑈 , a skew–symmetric, bilinear form, 𝜔𝑝 : ℝ𝑛 × ℝ𝑛 → ℝ.
Example 5.5.19 (Differential 2–forms in ℝ²). Let 𝑈 denote an open subset of
ℝ² and 𝜔 : 𝑈 → 𝒜(ℝ² × ℝ², ℝ) be a differential 2–form. Then, by Definition
5.5.18, for each 𝑝 ∈ 𝑈, 𝜔𝑝 is a skew–symmetric, bilinear form in ℝ². By the
result in Example 5.5.16 expressed in equation (5.16), for each 𝑝 ∈ 𝑈, there
exists a scalar, 𝑓(𝑝), such that

𝜔𝑝(𝑣, 𝑤) = 𝑓(𝑝) det[ 𝑣 𝑤 ], for all 𝑣, 𝑤 ∈ ℝ². (5.23)

In order to fulfill the smoothness condition in Definition 5.5.18, we require that
the scalar field 𝑓 : 𝑈 → ℝ given in (5.23) be smooth.
Example 5.5.20 (Differential 2–forms in ℝ³). Let 𝑈 denote an open subset of
ℝ³ and 𝜔 : 𝑈 → 𝒜(ℝ³ × ℝ³, ℝ) be a differential 2–form. Then, by Definition
5.5.18, for each 𝑝 ∈ 𝑈, 𝜔𝑝 is a skew–symmetric, bilinear form in ℝ³. Thus, using
the representation formula in (5.22) of Example 5.5.17, for each 𝑝 ∈ 𝑈, there
exists a vector, 𝐹(𝑝) ∈ ℝ³, such that

𝜔𝑝(𝑣, 𝑤) = 𝐹(𝑝) ⋅ (𝑣 × 𝑤), for all 𝑣, 𝑤 ∈ ℝ³. (5.24)

The smoothness condition in Definition 5.5.18 requires that the vector field
𝐹 : 𝑈 → ℝ³ given in (5.24) be smooth.
Definition 5.5.21 (Wedge Product of 1–Forms). Given two differential 1–
forms, 𝜔 and 𝜂, in some open subset, 𝑈 , of ℝ𝑛 , we can define a differential
2–form in 𝑈 , denoted by 𝜔 ∧ 𝜂, as follows
(𝜔 ∧ 𝜂)𝑝 (𝑣, 𝑤) = 𝜔𝑝 (𝑣)𝜂𝑝 (𝑤) − 𝜔𝑝 (𝑤)𝜂𝑝 (𝑣), for 𝑝 ∈ 𝑈, and 𝑣, 𝑤 ∈ ℝ𝑛 . (5.25)
To see that the expression for (𝜔 ∧ 𝜂)𝑝 given in (5.25) does define a bilinear
form, compute

(𝜔 ∧ 𝜂)𝑝(𝑐1𝑣1 + 𝑐2𝑣2, 𝑤) = 𝜔𝑝(𝑐1𝑣1 + 𝑐2𝑣2)𝜂𝑝(𝑤) − 𝜔𝑝(𝑤)𝜂𝑝(𝑐1𝑣1 + 𝑐2𝑣2)
                        = [𝑐1𝜔𝑝(𝑣1) + 𝑐2𝜔𝑝(𝑣2)]𝜂𝑝(𝑤) − 𝜔𝑝(𝑤)[𝑐1𝜂𝑝(𝑣1) + 𝑐2𝜂𝑝(𝑣2)]
                        = 𝑐1𝜔𝑝(𝑣1)𝜂𝑝(𝑤) + 𝑐2𝜔𝑝(𝑣2)𝜂𝑝(𝑤) − 𝑐1𝜔𝑝(𝑤)𝜂𝑝(𝑣1) − 𝑐2𝜔𝑝(𝑤)𝜂𝑝(𝑣2),

so that

(𝜔 ∧ 𝜂)𝑝(𝑐1𝑣1 + 𝑐2𝑣2, 𝑤) = 𝑐1[𝜔𝑝(𝑣1)𝜂𝑝(𝑤) − 𝜔𝑝(𝑤)𝜂𝑝(𝑣1)] + 𝑐2[𝜔𝑝(𝑣2)𝜂𝑝(𝑤) − 𝜔𝑝(𝑤)𝜂𝑝(𝑣2)]
                        = 𝑐1(𝜔 ∧ 𝜂)𝑝(𝑣1, 𝑤) + 𝑐2(𝜔 ∧ 𝜂)𝑝(𝑣2, 𝑤),

for all 𝑣1, 𝑣2, 𝑤 ∈ ℝ𝑛 and 𝑐1, 𝑐2 ∈ ℝ. A similar calculation shows that

(𝜔 ∧ 𝜂)𝑝(𝑣, 𝑐1𝑤1 + 𝑐2𝑤2) = 𝑐1(𝜔 ∧ 𝜂)𝑝(𝑣, 𝑤1) + 𝑐2(𝜔 ∧ 𝜂)𝑝(𝑣, 𝑤2),

for all 𝑣, 𝑤1, 𝑤2 ∈ ℝ𝑛 and 𝑐1, 𝑐2 ∈ ℝ.
Similarly, to see that (𝜔 ∧ 𝜂)𝑝 : ℝ𝑛 × ℝ𝑛 → ℝ is skew–symmetric, compute

(𝜔 ∧ 𝜂)𝑝(𝑤, 𝑣) = 𝜔𝑝(𝑤)𝜂𝑝(𝑣) − 𝜔𝑝(𝑣)𝜂𝑝(𝑤)
              = −[𝜔𝑝(𝑣)𝜂𝑝(𝑤) − 𝜔𝑝(𝑤)𝜂𝑝(𝑣)]
              = −(𝜔 ∧ 𝜂)𝑝(𝑣, 𝑤).

Proposition 5.5.22 (Properties of the Wedge Product). Let 𝜔, 𝜂 and 𝛾 denote
1–forms in 𝑈, an open subset of ℝ𝑛. Then,
(i) 𝜔 ∧ 𝜂 = −𝜂 ∧ 𝜔;
(ii) 𝜔 ∧ 𝜔 = 0, where 0 denotes the bilinear form that maps every pair of
vectors to 0;
(iii) (𝜔 + 𝜂) ∧ 𝛾 = 𝜔 ∧ 𝛾 + 𝜂 ∧ 𝛾;
(iv) 𝜔 ∧ (𝜂 + 𝛾) = 𝜔 ∧ 𝜂 + 𝜔 ∧ 𝛾.
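Definition 5.5.21 and properties (i)–(ii) above are easy to check computationally when the 1–forms are given by constant covectors; a small sketch in plain Python (the names are ours):

```python
def wedge(omega, eta):
    # (omega ^ eta)(v, w) = omega(v) eta(w) - omega(w) eta(v), as in (5.25),
    # for constant 1-forms represented by covectors omega and eta.
    def pair(c, u):
        return sum(a * b for a, b in zip(c, u))
    return lambda v, w: pair(omega, v) * pair(eta, w) - pair(omega, w) * pair(eta, v)

dx = (1, 0)   # dx(h) = h1
dy = (0, 1)   # dy(h) = h2

v, w = (2.0, 5.0), (-1.0, 3.0)
assert wedge(dx, dy)(v, w) == 2.0 * 3.0 - (-1.0) * 5.0   # det[v w]
assert wedge(dy, dx)(v, w) == -wedge(dx, dy)(v, w)       # property (i)
assert wedge(dx, dx)(v, w) == 0.0                        # property (ii)
```

The first assertion anticipates Example 5.5.23: 𝑑𝑥 ∧ 𝑑𝑦 evaluates the determinant of [𝑣 𝑤].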
Example 5.5.23. Let 𝑃𝑜(𝑥𝑜, 𝑦𝑜), 𝑃1(𝑥1, 𝑦1) and 𝑃2(𝑥2, 𝑦2) denote three non–
collinear points in the 𝑥𝑦–plane. Put

𝑣 = ⃗𝑃𝑜𝑃1 = (𝑥1 − 𝑥𝑜)ˆ𝑖 + (𝑦1 − 𝑦𝑜)ˆ𝑗

and

𝑤 = ⃗𝑃𝑜𝑃2 = (𝑥2 − 𝑥𝑜)ˆ𝑖 + (𝑦2 − 𝑦𝑜)ˆ𝑗.

Then, according to (5.25) in Definition 5.5.21,

(𝑑𝑥 ∧ 𝑑𝑦)(𝑣, 𝑤) = 𝑑𝑥(𝑣) 𝑑𝑦(𝑤) − 𝑑𝑥(𝑤) 𝑑𝑦(𝑣)
               = (𝑥1 − 𝑥𝑜)(𝑦2 − 𝑦𝑜) − (𝑥2 − 𝑥𝑜)(𝑦1 − 𝑦𝑜),

where we have used the result of Example 5.5.11. The right–hand side is the
determinant of the 2 × 2 matrix, [𝑣 𝑤], whose first column are the entries of 𝑣,
and whose second column are the entries of 𝑤. In other words,

(𝑑𝑥 ∧ 𝑑𝑦)(𝑣, 𝑤) = det[𝑣 𝑤]. (5.26)

We have therefore shown that (𝑑𝑥 ∧ 𝑑𝑦)(𝑣, 𝑤) gives the signed area of the
parallelogram determined by the vectors 𝑣 and 𝑤.

Example 5.5.24. Let 𝑃𝑜(𝑥𝑜, 𝑦𝑜, 𝑧𝑜), 𝑃1(𝑥1, 𝑦1, 𝑧1) and 𝑃2(𝑥2, 𝑦2, 𝑧2) denote
three non–collinear points in ℝ³. Put

𝑣 = ⃗𝑃𝑜𝑃1 = (𝑥1 − 𝑥𝑜)ˆ𝑖 + (𝑦1 − 𝑦𝑜)ˆ𝑗 + (𝑧1 − 𝑧𝑜)ˆ𝑘

and

𝑤 = ⃗𝑃𝑜𝑃2 = (𝑥2 − 𝑥𝑜)ˆ𝑖 + (𝑦2 − 𝑦𝑜)ˆ𝑗 + (𝑧2 − 𝑧𝑜)ˆ𝑘.

Then, as in Example 5.5.23, we compute

(𝑑𝑥 ∧ 𝑑𝑦)(𝑣, 𝑤) = (𝑥1 − 𝑥𝑜)(𝑦2 − 𝑦𝑜) − (𝑥2 − 𝑥𝑜)(𝑦1 − 𝑦𝑜). (5.27)

Similarly, we compute

(𝑑𝑦 ∧ 𝑑𝑧)(𝑣, 𝑤) = (𝑦1 − 𝑦𝑜)(𝑧2 − 𝑧𝑜) − (𝑦2 − 𝑦𝑜)(𝑧1 − 𝑧𝑜), (5.28)

and

(𝑑𝑧 ∧ 𝑑𝑥)(𝑣, 𝑤) = (𝑧1 − 𝑧𝑜)(𝑥2 − 𝑥𝑜) − (𝑧2 − 𝑧𝑜)(𝑥1 − 𝑥𝑜). (5.29)

We recognize in (5.28), (5.29) and (5.27) the components of the cross product
of the vectors 𝑣 and 𝑤,

𝑣 × 𝑤 = [(𝑦1 − 𝑦𝑜)(𝑧2 − 𝑧𝑜) − (𝑦2 − 𝑦𝑜)(𝑧1 − 𝑧𝑜)] ˆ𝑖
       − [(𝑥1 − 𝑥𝑜)(𝑧2 − 𝑧𝑜) − (𝑥2 − 𝑥𝑜)(𝑧1 − 𝑧𝑜)] ˆ𝑗
       + [(𝑥1 − 𝑥𝑜)(𝑦2 − 𝑦𝑜) − (𝑥2 − 𝑥𝑜)(𝑦1 − 𝑦𝑜)] ˆ𝑘.

We can therefore write

(𝑑𝑦 ∧ 𝑑𝑧)(𝑣, 𝑤) = (𝑣 × 𝑤) ⋅ ˆ𝑖, (5.30)

(𝑑𝑧 ∧ 𝑑𝑥)(𝑣, 𝑤) = (𝑣 × 𝑤) ⋅ ˆ𝑗, (5.31)

and

(𝑑𝑥 ∧ 𝑑𝑦)(𝑣, 𝑤) = (𝑣 × 𝑤) ⋅ ˆ𝑘. (5.32)
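Formulas (5.30)–(5.32) say that the three basic wedge products read off the components of the cross product. A quick numeric check in plain Python (the sample vectors are arbitrary choices of ours):

```python
def wedge(omega, eta):
    # Wedge of two constant 1-forms, as in (5.25).
    def pair(c, u):
        return sum(a * b for a, b in zip(c, u))
    return lambda v, w: pair(omega, v) * pair(eta, w) - pair(omega, w) * pair(eta, v)

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

dx, dy, dz = (1, 0, 0), (0, 1, 0), (0, 0, 1)
v, w = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # arbitrary sample vectors
vxw = cross(v, w)

assert wedge(dy, dz)(v, w) == vxw[0]   # (5.30)
assert wedge(dz, dx)(v, w) == vxw[1]   # (5.31)
assert wedge(dx, dy)(v, w) == vxw[2]   # (5.32)
```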

Differential 0–forms act on points. Differential 1–forms act on directed line


segments and, more generally, on oriented curves. We will next see how to define
the action of differential 2–forms on oriented triangles. We first define oriented
triangles in the plane.
Definition 5.5.25 (Oriented Triangles in ℝ²). Given three non–collinear points
𝑃𝑜, 𝑃1 and 𝑃2 in the plane, we denote by 𝑇 = [𝑃𝑜, 𝑃1, 𝑃2] the triangle with
vertices 𝑃𝑜, 𝑃1 and 𝑃2. 𝑇 is a 2–dimensional object consisting of the simple
curve generated by the directed line segments [𝑃𝑜, 𝑃1], [𝑃1, 𝑃2], and [𝑃2, 𝑃𝑜], as
well as the interior of the curve. If the curve is traversed in the counterclockwise
sense, then 𝑇 has positive orientation; if the curve is traversed in the clockwise
sense, then 𝑇 has negative orientation.
Definition 5.5.26 (Action of a Differential 2–Form on an Oriented Triangle
in ℝ²). The differential 2–form, d𝑥 ∧ d𝑦, acts on an oriented triangle 𝑇 by
evaluating its area, if 𝑇 has a positive orientation, and the negative of the area
if 𝑇 has a negative orientation:

d𝑥 ∧ d𝑦(𝑇) = ± area(𝑇).

We denote this by

∫_𝑇 d𝑥 ∧ d𝑦 = signed area of 𝑇. (5.33)

According to the formula (5.26) in Example 5.5.23, the expression in (5.33)
may also be obtained by computing

∫_𝑇 d𝑥 ∧ d𝑦 = (1/2) (𝑑𝑥 ∧ 𝑑𝑦)(⃗𝑃𝑜𝑃1, ⃗𝑃𝑜𝑃2), (5.34)

since (𝑑𝑥 ∧ 𝑑𝑦)(⃗𝑃𝑜𝑃1, ⃗𝑃𝑜𝑃2) gives the signed area of the parallelogram generated
by the vectors ⃗𝑃𝑜𝑃1 and ⃗𝑃𝑜𝑃2. By embedding the vectors ⃗𝑃𝑜𝑃1 and ⃗𝑃𝑜𝑃2 in
the 𝑥𝑦–coordinate plane in ℝ³, we may also use the formula in (5.32) to obtain
that

∫_{[𝑃𝑜,𝑃1,𝑃2]} d𝑥 ∧ d𝑦 = (1/2) (⃗𝑃𝑜𝑃1 × ⃗𝑃𝑜𝑃2) ⋅ ˆ𝑘. (5.35)

Example 5.5.27. Let 𝑃𝑜(0, 0), 𝑃1(1, 2) and 𝑃2(2, 1), and let 𝑇 = [𝑃𝑜, 𝑃1, 𝑃2]
denote the oriented triangle generated by those points. Evaluate ∫_𝑇 d𝑥 ∧ d𝑦.

Solution: Embed the points 𝑃𝑜, 𝑃1 and 𝑃2 in ℝ³ by appending 0
as the last coordinate, and let

𝑣 = ⃗𝑃𝑜𝑃1 = (1, 2, 0) and 𝑤 = ⃗𝑃𝑜𝑃2 = (2, 1, 0).

Then ∫_𝑇 d𝑥 ∧ d𝑦 is one half of the component of the vector 𝑣 × 𝑤
along the direction of ˆ𝑘; that is,

∫_𝑇 d𝑥 ∧ d𝑦 = (1/2) (𝑣 × 𝑤) ⋅ ˆ𝑘,

where

𝑣 × 𝑤 = (2 ⋅ 0 − 0 ⋅ 1) ˆ𝑖 − (1 ⋅ 0 − 0 ⋅ 2) ˆ𝑗 + (1 ⋅ 1 − 2 ⋅ 2) ˆ𝑘 = (1 − 4) ˆ𝑘 = −3 ˆ𝑘.

It then follows that

∫_𝑇 d𝑥 ∧ d𝑦 = −3/2.

We see that (1/2)(𝑣 × 𝑤) ⋅ ˆ𝑘 gives the appropriate sign for d𝑥 ∧ d𝑦(𝑇),
since in this case 𝑇 has negative orientation. □
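The signed area −3/2 agrees with the shoelace formula applied to the oriented boundary of 𝑇, which offers an independent check (a plain-Python sketch; the function name is ours):

```python
def signed_area(pts):
    # Shoelace formula: half the sum of x0*y1 - x1*y0 around the polygon.
    # Positive for counterclockwise traversal, negative for clockwise.
    n = len(pts)
    s = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return s / 2.0

T = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]   # the oriented triangle of Example 5.5.27
assert signed_area(T) == -1.5               # negative: T is traversed clockwise
```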

In general, for non–collinear points 𝑃𝑜, 𝑃1 and 𝑃2 in ℝ³, the value of d𝑥 ∧ d𝑦
on 𝑇 = [𝑃𝑜, 𝑃1, 𝑃2] is obtained by the formula in (5.35); namely,

d𝑥 ∧ d𝑦(𝑇) = ∫_𝑇 𝑑𝑥 ∧ 𝑑𝑦 = (1/2) (𝑣 × 𝑤) ⋅ ˆ𝑘,

where

𝑣 = ⃗𝑃𝑜𝑃1 and 𝑤 = ⃗𝑃𝑜𝑃2.

This gives the signed area of the orthogonal projection of the triangle 𝑇 onto
the 𝑥𝑦–plane. Similarly, using the formulas in (5.30) and (5.31), we obtain the
values of the differential 2–forms d𝑦 ∧ d𝑧 and d𝑧 ∧ d𝑥 on the oriented triangle
𝑇 = [𝑃𝑜, 𝑃1, 𝑃2]:

d𝑦 ∧ d𝑧(𝑇) = ∫_𝑇 d𝑦 ∧ d𝑧 = (1/2) (𝑣 × 𝑤) ⋅ ˆ𝑖,

and

d𝑧 ∧ d𝑥(𝑇) = ∫_𝑇 d𝑧 ∧ d𝑥 = (1/2) (𝑣 × 𝑤) ⋅ ˆ𝑗.
Example 5.5.28. Evaluate ∫_𝑇 d𝑦 ∧ d𝑧, ∫_𝑇 d𝑧 ∧ d𝑥, and ∫_𝑇 d𝑥 ∧ d𝑦, where
𝑇 = [𝑃𝑜, 𝑃1, 𝑃2] for the points

𝑃𝑜(−1, 1, 2), 𝑃1(3, 2, 1) and 𝑃2(4, 7, 1).

Solution: Set

𝑣 = ⃗𝑃𝑜𝑃1 = (4, 1, −1) and 𝑤 = ⃗𝑃𝑜𝑃2 = (5, 6, −1),

and compute

𝑣 × 𝑤 = (1 ⋅ (−1) − (−1) ⋅ 6) ˆ𝑖 − (4 ⋅ (−1) − (−1) ⋅ 5) ˆ𝑗 + (4 ⋅ 6 − 1 ⋅ 5) ˆ𝑘
      = 5 ˆ𝑖 − ˆ𝑗 + 19 ˆ𝑘.

It then follows that

∫_𝑇 d𝑦 ∧ d𝑧 = 5/2,

∫_𝑇 d𝑧 ∧ d𝑥 = −1/2,

and

∫_𝑇 d𝑥 ∧ d𝑦 = 19/2.

We end this section by showing that the space of differential 2–forms
in an open subset 𝑈 of ℝ³ is generated by the set

{𝑑𝑦 ∧ 𝑑𝑧, 𝑑𝑧 ∧ 𝑑𝑥, 𝑑𝑥 ∧ 𝑑𝑦}, (5.36)

in the sense that, for every differential 2–form, 𝜔, in 𝑈, there exists a smooth
vector field 𝐹 : 𝑈 → ℝ³,

𝐹(𝑥, 𝑦, 𝑧) = 𝐹1(𝑥, 𝑦, 𝑧) ˆ𝑖 + 𝐹2(𝑥, 𝑦, 𝑧) ˆ𝑗 + 𝐹3(𝑥, 𝑦, 𝑧) ˆ𝑘,

such that

𝜔𝑝 = 𝐹1(𝑝) 𝑑𝑦 ∧ 𝑑𝑧 + 𝐹2(𝑝) 𝑑𝑧 ∧ 𝑑𝑥 + 𝐹3(𝑝) 𝑑𝑥 ∧ 𝑑𝑦, for all 𝑝 ∈ 𝑈.
Let 𝜔 : 𝑈 → 𝒜(ℝ³ × ℝ³, ℝ) be a differential 2–form in an open subset, 𝑈, of ℝ³.
We consider vectors

𝑣 = 𝑎1 ˆ𝑖 + 𝑎2 ˆ𝑗 + 𝑎3 ˆ𝑘 and 𝑤 = 𝑏1 ˆ𝑖 + 𝑏2 ˆ𝑗 + 𝑏3 ˆ𝑘

in ℝ³. For each 𝑝 ∈ 𝑈, we compute

𝜔𝑝(𝑣, 𝑤) = 𝜔𝑝(𝑎1 ˆ𝑖 + 𝑎2 ˆ𝑗 + 𝑎3 ˆ𝑘, 𝑏1 ˆ𝑖 + 𝑏2 ˆ𝑗 + 𝑏3 ˆ𝑘)
        = 𝑎1𝑏2 𝜔𝑝(ˆ𝑖, ˆ𝑗) + 𝑎1𝑏3 𝜔𝑝(ˆ𝑖, ˆ𝑘)
          + 𝑎2𝑏1 𝜔𝑝(ˆ𝑗, ˆ𝑖) + 𝑎2𝑏3 𝜔𝑝(ˆ𝑗, ˆ𝑘)
          + 𝑎3𝑏1 𝜔𝑝(ˆ𝑘, ˆ𝑖) + 𝑎3𝑏2 𝜔𝑝(ˆ𝑘, ˆ𝑗), (5.37)

where we have used the fact that

𝜔𝑝(ˆ𝑖, ˆ𝑖) = 𝜔𝑝(ˆ𝑗, ˆ𝑗) = 𝜔𝑝(ˆ𝑘, ˆ𝑘) = 0,

which follows from the skew–symmetry of the form 𝜔𝑝 : ℝ³ × ℝ³ → ℝ. Using
the skew–symmetry again, we obtain from (5.37) that

𝜔𝑝(𝑣, 𝑤) = 𝑎2𝑏3 𝜔𝑝(ˆ𝑗, ˆ𝑘) + 𝑎3𝑏2 𝜔𝑝(ˆ𝑘, ˆ𝑗)
          + 𝑎1𝑏3 𝜔𝑝(ˆ𝑖, ˆ𝑘) + 𝑎3𝑏1 𝜔𝑝(ˆ𝑘, ˆ𝑖)
          + 𝑎1𝑏2 𝜔𝑝(ˆ𝑖, ˆ𝑗) + 𝑎2𝑏1 𝜔𝑝(ˆ𝑗, ˆ𝑖)
        = 𝜔𝑝(ˆ𝑗, ˆ𝑘)(𝑎2𝑏3 − 𝑎3𝑏2)
          + 𝜔𝑝(ˆ𝑘, ˆ𝑖)(𝑎3𝑏1 − 𝑎1𝑏3)
          + 𝜔𝑝(ˆ𝑖, ˆ𝑗)(𝑎1𝑏2 − 𝑎2𝑏1). (5.38)

Next, use Definition 5.5.21 to compute

𝑑𝑦 ∧ 𝑑𝑧(𝑣, 𝑤) = 𝑑𝑦(𝑣)𝑑𝑧(𝑤) − 𝑑𝑦(𝑤)𝑑𝑧(𝑣) = 𝑎2𝑏3 − 𝑏2𝑎3, (5.39)

𝑑𝑧 ∧ 𝑑𝑥(𝑣, 𝑤) = 𝑑𝑧(𝑣)𝑑𝑥(𝑤) − 𝑑𝑧(𝑤)𝑑𝑥(𝑣) = 𝑎3𝑏1 − 𝑏3𝑎1, (5.40)

and

𝑑𝑥 ∧ 𝑑𝑦(𝑣, 𝑤) = 𝑑𝑥(𝑣)𝑑𝑦(𝑤) − 𝑑𝑥(𝑤)𝑑𝑦(𝑣) = 𝑎1𝑏2 − 𝑏1𝑎2. (5.41)
Substituting the expressions obtained in (5.39)–(5.41) into the last expression
on the right–hand side of (5.38) yields

𝜔𝑝(𝑣, 𝑤) = 𝜔𝑝(ˆ𝑗, ˆ𝑘) 𝑑𝑦 ∧ 𝑑𝑧(𝑣, 𝑤) + 𝜔𝑝(ˆ𝑘, ˆ𝑖) 𝑑𝑧 ∧ 𝑑𝑥(𝑣, 𝑤) + 𝜔𝑝(ˆ𝑖, ˆ𝑗) 𝑑𝑥 ∧ 𝑑𝑦(𝑣, 𝑤),

from which we get that

𝜔𝑝 = 𝜔𝑝(ˆ𝑗, ˆ𝑘) 𝑑𝑦 ∧ 𝑑𝑧 + 𝜔𝑝(ˆ𝑘, ˆ𝑖) 𝑑𝑧 ∧ 𝑑𝑥 + 𝜔𝑝(ˆ𝑖, ˆ𝑗) 𝑑𝑥 ∧ 𝑑𝑦. (5.42)

Setting

𝐹1(𝑝) = 𝜔𝑝(ˆ𝑗, ˆ𝑘), 𝐹2(𝑝) = 𝜔𝑝(ˆ𝑘, ˆ𝑖), 𝐹3(𝑝) = 𝜔𝑝(ˆ𝑖, ˆ𝑗),

we see from (5.42) that

𝜔𝑝 = 𝐹1(𝑝) 𝑑𝑦 ∧ 𝑑𝑧 + 𝐹2(𝑝) 𝑑𝑧 ∧ 𝑑𝑥 + 𝐹3(𝑝) 𝑑𝑥 ∧ 𝑑𝑦, (5.43)

which shows that every differential 2–form in ℝ³ is in the span of the set in
(5.36).
To show that the representation in (5.43) is unique, assume that

𝐹1(𝑝) 𝑑𝑦 ∧ 𝑑𝑧 + 𝐹2(𝑝) 𝑑𝑧 ∧ 𝑑𝑥 + 𝐹3(𝑝) 𝑑𝑥 ∧ 𝑑𝑦 = 0, (5.44)

the differential 2–form that maps every pair of vectors (𝑣, 𝑤) ∈ ℝ³ × ℝ³ to the
real number 0. Then, applying the form in (5.44) to the pair (ˆ𝑗, ˆ𝑘) we obtain
that

𝐹1(𝑝) 𝑑𝑦 ∧ 𝑑𝑧(ˆ𝑗, ˆ𝑘) + 𝐹2(𝑝) 𝑑𝑧 ∧ 𝑑𝑥(ˆ𝑗, ˆ𝑘) + 𝐹3(𝑝) 𝑑𝑥 ∧ 𝑑𝑦(ˆ𝑗, ˆ𝑘) = 0,

which implies that

𝐹1(𝑝) = 0,

in view of the results of the calculations in (5.39)–(5.41). Similarly, applying
(5.44) to (ˆ𝑘, ˆ𝑖) and (ˆ𝑖, ˆ𝑗), successively, leads to

𝐹2(𝑝) = 0 and 𝐹3(𝑝) = 0,

respectively. Thus, the set in (5.36) is also linearly independent; hence, the
representation in (5.43) is unique.

5.6 Calculus of Differential Forms


Proposition 5.5.22 lists some of the algebraic properties
of the wedge product of differential 1–forms defined in Definition 5.5.21. Properties
(i) and (ii) in Proposition 5.5.22 can be verified for the differential 1–forms
𝑑𝑥 and 𝑑𝑦 directly from the definition and the results in Example 5.5.11. In
fact, for non–collinear points 𝑃𝑜(𝑥𝑜, 𝑦𝑜), 𝑃1(𝑥1, 𝑦1) and 𝑃2(𝑥2, 𝑦2) in ℝ², using
Definition 5.5.21 we compute

(𝑑𝑦 ∧ 𝑑𝑥)(⃗𝑃𝑜𝑃1, ⃗𝑃𝑜𝑃2) = 𝑑𝑦(⃗𝑃𝑜𝑃1)𝑑𝑥(⃗𝑃𝑜𝑃2) − 𝑑𝑦(⃗𝑃𝑜𝑃2)𝑑𝑥(⃗𝑃𝑜𝑃1)
                      = −[𝑑𝑥(⃗𝑃𝑜𝑃1)𝑑𝑦(⃗𝑃𝑜𝑃2) − 𝑑𝑥(⃗𝑃𝑜𝑃2)𝑑𝑦(⃗𝑃𝑜𝑃1)]
                      = −(𝑑𝑥 ∧ 𝑑𝑦)(⃗𝑃𝑜𝑃1, ⃗𝑃𝑜𝑃2).

Consequently,
𝑑𝑦 ∧ 𝑑𝑥 = −𝑑𝑥 ∧ 𝑑𝑦. (5.45)
From this we can deduce that

𝑑𝑥 ∧ 𝑑𝑥 = 0. (5.46)

Thus, the wedge product of differential 1–forms is anti–symmetric.


We can also multiply 0–forms and 1–forms; for instance, the differential
1–form,
𝑃 (𝑥, 𝑦) 𝑑𝑥,
where 𝑃 : 𝑈 → ℝ is a smooth function on an open subset, 𝑈 , of ℝ2 , is the
product of a 0–form and a differential 1–form.
The differential 1–form, 𝑃 𝑑𝑥, can be added to another 1–form, 𝑄 𝑑𝑦, to
obtain, for example, the differential 1–form

𝑃 𝑑𝑥 + 𝑄 𝑑𝑦, (5.47)

where 𝑃 and 𝑄 are smooth scalar fields. We can also multiply the differential
1–form in (5.47) by the 1–form 𝑑𝑥:

(𝑃 𝑑𝑥 + 𝑄 𝑑𝑦) ∧ 𝑑𝑥 = 𝑃 𝑑𝑥 ∧ 𝑑𝑥 + 𝑄 𝑑𝑦 ∧ 𝑑𝑥 = −𝑄 𝑑𝑥 ∧ 𝑑𝑦,

where we have used (5.45) and (5.46).


We have already seen how to obtain a differential 1–form from a differential
0–form, 𝑓 , by computing the differential of 𝑓 :
∂𝑓 ∂𝑓 ∂𝑓
𝑑𝑓 = 𝑑𝑥 + 𝑑𝑦 + 𝑑𝑧.
∂𝑥 ∂𝑦 ∂𝑧
This defines an operator, 𝑑, from the class of 0–forms to the class of 1–forms.
This operator, 𝑑, also acts on the 1–form

𝜔 = 𝑃 (𝑥, 𝑦) 𝑑𝑥 + 𝑄(𝑥, 𝑦) 𝑑𝑦

in ℝ², where 𝑃 and 𝑄 are smooth scalar fields, as follows:

d𝜔 = (𝑑𝑃) ∧ 𝑑𝑥 + (𝑑𝑄) ∧ 𝑑𝑦
   = ((∂𝑃/∂𝑥) 𝑑𝑥 + (∂𝑃/∂𝑦) 𝑑𝑦) ∧ 𝑑𝑥 + ((∂𝑄/∂𝑥) 𝑑𝑥 + (∂𝑄/∂𝑦) 𝑑𝑦) ∧ 𝑑𝑦
   = (∂𝑃/∂𝑥) 𝑑𝑥 ∧ 𝑑𝑥 + (∂𝑃/∂𝑦) 𝑑𝑦 ∧ 𝑑𝑥 + (∂𝑄/∂𝑥) 𝑑𝑥 ∧ 𝑑𝑦 + (∂𝑄/∂𝑦) 𝑑𝑦 ∧ 𝑑𝑦
   = (∂𝑄/∂𝑥 − ∂𝑃/∂𝑦) 𝑑𝑥 ∧ 𝑑𝑦,

where we have used (5.45) and (5.46). Thus, the differential of the 1–form

𝜔 = 𝑃 𝑑𝑥 + 𝑄 𝑑𝑦

in ℝ² is the differential 2–form

𝑑𝜔 = (∂𝑄/∂𝑥 − ∂𝑃/∂𝑦) 𝑑𝑥 ∧ 𝑑𝑦.

Thus, the differential, 𝑑𝜔, of the 1–form, 𝜔, acts on oriented triangles,

𝑇 = [𝑃1, 𝑃2, 𝑃3],

in ℝ². By analogy with what happens to the differential, 𝑑𝑓, of a 0–form, 𝑓,
when it is integrated over a directed line segment, we expect that

∫_𝑇 𝑑𝜔

is completely determined by the action of 𝜔 on the boundary, ∂𝑇, of 𝑇, which
is a simple, closed curve made up of the directed line segments [𝑃1, 𝑃2], [𝑃2, 𝑃3]
and [𝑃3, 𝑃1]. More specifically, if 𝑇 has positive orientation, we expect that

∫_𝑇 𝑑𝜔 = ∫_∂𝑇 𝜔. (5.48)

This is the Fundamental Theorem of Calculus in two dimensions for the special
case of oriented triangles, and we will prove it in the following sections. We will
first see how to evaluate the 2–form 𝑑𝜔 on oriented triangles.

5.7 Evaluating 2–forms: Double Integrals


Given an oriented triangle, 𝑇 = [𝑃1, 𝑃2, 𝑃3], in the 𝑥𝑦–plane and with positive
orientation, we would like to evaluate the 2–form 𝑓(𝑥, 𝑦) d𝑥 ∧ d𝑦 on 𝑇, for a
given continuous scalar field 𝑓; that is, we would like to evaluate

∫_𝑇 𝑓(𝑥, 𝑦) 𝑑𝑥 ∧ 𝑑𝑦.

For the case in which 𝑇 has a positive orientation, we will denote the value of
∫_𝑇 𝑓(𝑥, 𝑦) d𝑥 ∧ d𝑦 by

∫_𝑇 𝑓(𝑥, 𝑦) d𝑥d𝑦 (5.49)

and call it the double integral of 𝑓 over 𝑇. In this sense, we then have that

∫_𝑇 𝑓(𝑥, 𝑦) d𝑦 ∧ d𝑥 = −∫_𝑇 𝑓(𝑥, 𝑦) d𝑥d𝑦,

for the case in which 𝑇 has a positive orientation.
We first see how to evaluate the double integral in (5.49) for the case in
which 𝑇 is the unit triangle 𝑈 = [(0, 0), (1, 0), (0, 1)] in Figure 5.7.5, which is
oriented in the positive direction. We evaluate ∫_𝑈 𝑓(𝑥, 𝑦) d𝑥d𝑦 by computing
two iterated integrals as follows:

∫_𝑈 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫_0^1 { ∫_0^{1−𝑥} 𝑓(𝑥, 𝑦) d𝑦 } d𝑥. (5.50)

Figure 5.7.5: Unit Triangle 𝑈 (vertices (0, 0), (1, 0) and (0, 1); the hypotenuse lies on the line 𝑥 + 𝑦 = 1)

Observe that the “inside” integral,

∫_0^{1−𝑥} 𝑓(𝑥, 𝑦) d𝑦,

yields a function of 𝑥 for 𝑥 ∈ [0, 1]; call this function 𝑔; that is,

𝑔(𝑥) = ∫_0^{1−𝑥} 𝑓(𝑥, 𝑦) d𝑦, for all 𝑥 ∈ [0, 1].

Then,

∫_𝑈 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫_0^1 𝑔(𝑥) d𝑥.
We could also do the integration with respect to 𝑥 first, then integrate with
respect to 𝑦:

∫_𝑈 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫_0^1 { ∫_0^{1−𝑦} 𝑓(𝑥, 𝑦) d𝑥 } d𝑦. (5.51)

In this case the inner integral yields a function of 𝑦 which can then be integrated
from 0 to 1.
Observe that the iterated integrals in (5.50) and (5.51) correspond to alternate
descriptions of 𝑈 as

𝑈 = {(𝑥, 𝑦) ∈ ℝ² ∣ 0 ⩽ 𝑥 ⩽ 1, 0 ⩽ 𝑦 ⩽ 1 − 𝑥}

or

𝑈 = {(𝑥, 𝑦) ∈ ℝ² ∣ 0 ⩽ 𝑥 ⩽ 1 − 𝑦, 0 ⩽ 𝑦 ⩽ 1},

respectively.
The fact that the iterated integrals in equations (5.50) and (5.51) yield the
same value, at least for the case in which 𝑓 is continuous on a region containing
𝑈, is a special case of a theorem in Advanced Calculus or Real Analysis known
as Fubini's Theorem.

Example 5.7.1. Evaluate ∫_𝑈 𝑥 d𝑥d𝑦.

Solution: Using the iterated integral in (5.50) we get

∫_𝑈 𝑥 d𝑥d𝑦 = ∫_0^1 { ∫_0^{1−𝑥} 𝑥 d𝑦 } d𝑥
           = ∫_0^1 𝑥(1 − 𝑥) d𝑥
           = ∫_0^1 (𝑥 − 𝑥²) d𝑥
           = 1/6.

We could have also used the iterated integral in (5.51):

∫_𝑈 𝑥 d𝑥d𝑦 = ∫_0^1 { ∫_0^{1−𝑦} 𝑥 d𝑥 } d𝑦
           = ∫_0^1 (1/2)(1 − 𝑦)² d𝑦
           = (1/2) ∫_0^1 𝑢² d𝑢
           = 1/6,

where we made the substitution 𝑢 = 1 − 𝑦. □
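Both orders of integration can also be approximated numerically; a midpoint–rule sketch in plain Python (the function names are ours) confirms that the two iterated integrals agree, illustrating Fubini's Theorem:

```python
def iterated_dy_dx(f, n=400):
    # Midpoint-rule approximation of the iterated integral (5.50):
    # integrate f in y over [0, 1 - x], then in x over [0, 1].
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        hy = (1.0 - x) / n
        total += sum(f(x, (j + 0.5) * hy) for j in range(n)) * hy * h
    return total

def iterated_dx_dy(f, n=400):
    # The other order, as in (5.51): integrate in x over [0, 1 - y] first.
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        hx = (1.0 - y) / n
        total += sum(f((i + 0.5) * hx, y) for i in range(n)) * hx * h
    return total

a = iterated_dy_dx(lambda x, y: x)
b = iterated_dx_dy(lambda x, y: x)
assert abs(a - 1.0 / 6.0) < 1e-5
assert abs(b - 1.0 / 6.0) < 1e-5
```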

Iterated integrals can be used to evaluate double integrals over plane regions
other than triangles. For instance, suppose a region, 𝑅, is bounded by the
vertical lines 𝑥 = 𝑎 and 𝑥 = 𝑏, where 𝑎 < 𝑏, and by the graphs of two functions
𝑔1(𝑥) and 𝑔2(𝑥), where 𝑔1(𝑥) ⩽ 𝑔2(𝑥) for 𝑎 ⩽ 𝑥 ⩽ 𝑏; that is,

𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ 𝑔1(𝑥) ⩽ 𝑦 ⩽ 𝑔2(𝑥), 𝑎 ⩽ 𝑥 ⩽ 𝑏};

then,

∫_𝑅 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫_𝑎^𝑏 { ∫_{𝑔1(𝑥)}^{𝑔2(𝑥)} 𝑓(𝑥, 𝑦) d𝑦 } d𝑥.

Example 5.7.2. Let 𝑅 denote the region in the first quadrant bounded by the
unit circle, 𝑥² + 𝑦² = 1; that is, 𝑅 is the quarter unit disc. Evaluate ∫_𝑅 𝑦 d𝑥d𝑦.

Solution: In this case, the region 𝑅 is described by

𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ 0 ⩽ 𝑦 ⩽ √(1 − 𝑥²), 0 ⩽ 𝑥 ⩽ 1},

so that

∫_𝑅 𝑦 d𝑥d𝑦 = ∫_0^1 ∫_0^{√(1−𝑥²)} 𝑦 d𝑦d𝑥
           = ∫_0^1 (1/2)(1 − 𝑥²) d𝑥
           = 1/3. □

Alternatively, the region 𝑅 can be described by

𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ ℎ1(𝑦) ⩽ 𝑥 ⩽ ℎ2(𝑦), 𝑐 ⩽ 𝑦 ⩽ 𝑑},

where ℎ1(𝑦) ⩽ ℎ2(𝑦) for 𝑐 ⩽ 𝑦 ⩽ 𝑑. In this case,

∫_𝑅 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫_𝑐^𝑑 { ∫_{ℎ1(𝑦)}^{ℎ2(𝑦)} 𝑓(𝑥, 𝑦) d𝑥 } d𝑦.

Example 5.7.3. Identify the region, 𝑅, in the plane over which the following
iterated integral

∫_0^1 ∫_𝑦^1 1/√(1 + 𝑥²) d𝑥d𝑦

is computed. Change the order of integration and then evaluate the double
integral

∫_𝑅 1/√(1 + 𝑥²) d𝑥d𝑦.

Solution: In this case, the region 𝑅 is

𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ 𝑦 ⩽ 𝑥 ⩽ 1, 0 ⩽ 𝑦 ⩽ 1}.

Figure 5.7.6: Region 𝑅 in Example 5.7.3 (bounded by the lines 𝑦 = 0, 𝑥 = 𝑦 and 𝑥 = 1)

This is also represented by

𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ 0 ⩽ 𝑥 ⩽ 1, 0 ⩽ 𝑦 ⩽ 𝑥};

see the picture in Figure 5.7.6. It then follows that

∫_𝑅 1/√(1 + 𝑥²) d𝑥d𝑦 = ∫_0^1 ∫_0^𝑥 1/√(1 + 𝑥²) d𝑦d𝑥
                    = ∫_0^1 𝑥/√(1 + 𝑥²) d𝑥
                    = (1/2) ∫_1^2 1/√𝑢 d𝑢
                    = [√𝑢]_1^2
                    = √2 − 1,

where we made the substitution 𝑢 = 1 + 𝑥². □
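Reversing the order of integration reduced the problem to a one–variable integral; a midpoint–rule check of the swapped form (plain Python; the function name is ours) reproduces √2 − 1:

```python
import math

def integral_swapped(n=2000):
    # Midpoint-rule approximation of the swapped iterated integral,
    # which reduces to the one-variable integral of x / sqrt(1 + x^2) on [0, 1].
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x / math.sqrt(1.0 + x * x) * h
    return total

assert abs(integral_swapped() - (math.sqrt(2) - 1)) < 1e-6
```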

If 𝑅 is a bounded region of ℝ², and 𝑓(𝑥, 𝑦) ⩾ 0 for all (𝑥, 𝑦) ∈ 𝑅, then

∫_𝑅 𝑓(𝑥, 𝑦) d𝑥d𝑦

gives the volume of the three dimensional solid that lies below the graph of the
surface 𝑧 = 𝑓(𝑥, 𝑦) and above the region 𝑅.

Example 5.7.4. Let 𝑎, 𝑏 and 𝑐 be positive real numbers. Compute the volume
of the tetrahedron whose base is the triangle 𝑇 = [(0, 0), (𝑎, 0), (0, 𝑏)] and which
lies below the plane

𝑥/𝑎 + 𝑦/𝑏 + 𝑧/𝑐 = 1.

Solution: We need to evaluate ∫_𝑇 𝑧 d𝑥d𝑦, where

𝑧 = 𝑐 (1 − 𝑥/𝑎 − 𝑦/𝑏).

Then,

∫_𝑇 𝑧 d𝑥d𝑦 = ∫_0^𝑎 ∫_0^{𝑏(1−𝑥/𝑎)} 𝑐 (1 − 𝑥/𝑎 − 𝑦/𝑏) d𝑦d𝑥
          = 𝑐 ∫_0^𝑎 [𝑦 − 𝑥𝑦/𝑎 − 𝑦²/(2𝑏)]_0^{𝑏(1−𝑥/𝑎)} d𝑥
          = 𝑐 ∫_0^𝑎 [𝑏(1 − 𝑥/𝑎) − (𝑥/𝑎) 𝑏(1 − 𝑥/𝑎) − (𝑏/2)(1 − 𝑥/𝑎)²] d𝑥
          = 𝑏𝑐 ∫_0^𝑎 (1/2 − 𝑥/𝑎 + 𝑥²/(2𝑎²)) d𝑥
          = 𝑏𝑐 [𝑎/2 − 𝑎/2 + 𝑎/6]
          = 𝑎𝑏𝑐/6. □
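A midpoint Riemann sum over the triangular base reproduces 𝑎𝑏𝑐/6 for sample values of 𝑎, 𝑏 and 𝑐 (a plain-Python sketch; the function name and sample values are ours):

```python
def tetra_volume(a, b, c, n=500):
    # Midpoint-rule approximation of the double integral of
    # z = c (1 - x/a - y/b) over the triangle 0 <= x <= a, 0 <= y <= b(1 - x/a).
    hx = a / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        ymax = b * (1.0 - x / a)
        hy = ymax / n
        inner = sum(c * (1.0 - x / a - (j + 0.5) * hy / b) for j in range(n)) * hy
        total += inner * hx
    return total

a, b, c = 2.0, 3.0, 4.0
assert abs(tetra_volume(a, b, c) - a * b * c / 6.0) < 1e-4
```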

5.8 Fundamental Theorem of Calculus in ℝ2


In this section we prove the Fundamental Theorem of Calculus in two dimensions
expressed in (5.48). More precisely, we have the following theorem:

Proposition 5.8.1 (Fundamental Theorem of Calculus for Oriented Triangles
in ℝ²). Let 𝜔 be a 𝐶¹ 1–form defined on some plane region containing a positively
oriented triangle 𝑇. Then,

∫_𝑇 d𝜔 = ∫_∂𝑇 𝜔. (5.52)

More specifically, let 𝜔 = 𝑃 d𝑥 + 𝑄 d𝑦 be a differential 1–form for which 𝑃 and 𝑄 are 𝐶¹ scalar fields defined in some region containing a positively oriented triangle 𝑇. Then

∫_𝑇 (∂𝑄/∂𝑥 − ∂𝑃/∂𝑦) d𝑥d𝑦 = ∫_{∂𝑇} 𝑃 d𝑥 + 𝑄 d𝑦. (5.53)

This version of the Fundamental Theorem of Calculus is known as Green’s Theorem.

Proof of Green’s Theorem for the Unit Triangle in ℝ². We shall first prove Proposition 5.8.1 for the unit triangle 𝑈 = [(0, 0), (1, 0), (0, 1)] = [𝑃₁, 𝑃₂, 𝑃₃]:

∫_𝑈 (∂𝑄/∂𝑥 − ∂𝑃/∂𝑦) d𝑥d𝑦 = ∫_{∂𝑈} 𝑃 d𝑥 + 𝑄 d𝑦, (5.54)

where 𝑃 and 𝑄 are 𝐶¹ scalar fields defined on some region containing 𝑈, and ∂𝑈 is made up of the directed line segments [𝑃₁, 𝑃₂], [𝑃₂, 𝑃₃] and [𝑃₃, 𝑃₁] traversed in the counterclockwise sense.
We will prove separately that

∫_𝑈 (∂𝑄/∂𝑥) d𝑥d𝑦 = ∫_{∂𝑈} 𝑄 d𝑦, (5.55)

and

∫_𝑈 (−∂𝑃/∂𝑦) d𝑥d𝑦 = ∫_{∂𝑈} 𝑃 d𝑥. (5.56)

Together, (5.55) and (5.56) will establish (5.54).

Figure 5.8.7: Unit Triangle 𝑈, with vertices 𝑃₁ = (0, 0), 𝑃₂ = (1, 0), 𝑃₃ = (0, 1) and hypotenuse along the line 𝑥 + 𝑦 = 1

Evaluating the double integral in (5.55) we get

∫_𝑈 (∂𝑄/∂𝑥) d𝑥d𝑦 = ∫_0^1 ∫_0^{1−𝑦} (∂𝑄/∂𝑥) d𝑥 d𝑦.

Using the Fundamental Theorem of Calculus to evaluate the inner integral, we then obtain that

∫_𝑈 (∂𝑄/∂𝑥) d𝑥d𝑦 = ∫_0^1 [𝑄(1 − 𝑦, 𝑦) − 𝑄(0, 𝑦)] d𝑦. (5.57)

Next, we evaluate the line integral in (5.55) to get

∫_{∂𝑈} 𝑄 d𝑦 = ∫_{[𝑃₁,𝑃₂]} 𝑄 d𝑦 + ∫_{[𝑃₂,𝑃₃]} 𝑄 d𝑦 + ∫_{[𝑃₃,𝑃₁]} 𝑄 d𝑦,

or

∫_{∂𝑈} 𝑄 d𝑦 = ∫_{[𝑃₂,𝑃₃]} 𝑄 d𝑦 + ∫_{[𝑃₃,𝑃₁]} 𝑄 d𝑦, (5.58)

since d𝑦 = 0 on [𝑃₁, 𝑃₂].
Now, parametrize [𝑃₂, 𝑃₃] by

𝑥 = 1 − 𝑦, 𝑦 = 𝑦,

for 0 ⩽ 𝑦 ⩽ 1. It then follows that

∫_{[𝑃₂,𝑃₃]} 𝑄 d𝑦 = ∫_0^1 𝑄(1 − 𝑦, 𝑦) d𝑦. (5.59)

Parametrizing [𝑃₃, 𝑃₁] by

𝑥 = 0, 𝑦 = 1 − 𝑡,

for 0 ⩽ 𝑡 ⩽ 1, we get that

d𝑥 = 0 d𝑡, d𝑦 = −d𝑡,

and

∫_{[𝑃₃,𝑃₁]} 𝑄 d𝑦 = −∫_0^1 𝑄(0, 1 − 𝑡) d𝑡,

which we can re-write (via the substitution 𝑦 = 1 − 𝑡) as

∫_{[𝑃₃,𝑃₁]} 𝑄 d𝑦 = −∫_1^0 𝑄(0, 𝑦)(−d𝑦) = −∫_0^1 𝑄(0, 𝑦) d𝑦. (5.60)

Substituting (5.60) and (5.59) into (5.58) yields

∫_{∂𝑈} 𝑄 d𝑦 = ∫_0^1 𝑄(1 − 𝑦, 𝑦) d𝑦 − ∫_0^1 𝑄(0, 𝑦) d𝑦. (5.61)

Comparing the right–hand sides of equations (5.57) and (5.61), we see that (5.55) is true. A similar calculation shows that (5.56) is also true. Hence, Proposition 5.8.1 is proved for the unit triangle 𝑈.
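Green’s identity (5.54) can also be checked numerically for a concrete pair of fields. In the sketch below (our own addition, not from the notes), the sample choices 𝑃 = −𝑦² and 𝑄 = 𝑥𝑦 give ∂𝑄/∂𝑥 − ∂𝑃/∂𝑦 = 3𝑦, and both sides come out to 1/2:

```python
# Numerical check of Green's identity (5.54) on the unit triangle U,
# with the sample fields P(x, y) = -y^2 and Q(x, y) = x*y,
# for which dQ/dx - dP/dy = y - (-2y) = 3y.

P = lambda x, y: -y**2
Q = lambda x, y: x * y
curl = lambda x, y: 3 * y

def double_integral(n=400):
    """Midpoint rule for the area integral of curl over U."""
    total, hy = 0.0, 1.0 / n
    for i in range(n):
        y = (i + 0.5) * hy
        hx = (1 - y) / n
        for j in range(n):
            x = (j + 0.5) * hx
            total += curl(x, y) * hx * hy
    return total

def line_integral(n=4000):
    """P dx + Q dy over the three directed edges of U, counterclockwise."""
    total, h = 0.0, 1.0 / n
    for i in range(n):
        t = (i + 0.5) * h
        total += P(t, 0.0) * h                     # [P1,P2]: (t, 0),   dx = dt
        total += (-P(1 - t, t) + Q(1 - t, t)) * h  # [P2,P3]: (1-t, t), dx = -dt, dy = dt
        total += -Q(0.0, 1 - t) * h                # [P3,P1]: (0, 1-t), dy = -dt
    return total

lhs, rhs = double_integral(), line_integral()
print(lhs, rhs)  # both ≈ 0.5
```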

In subsequent sections, we show how to extend the proof of Green’s Theorem to arbitrary positively oriented triangles, and then to bounded regions whose boundaries are positively oriented simple curves.

5.9 Changing Variables


We would like to express the integral of a scalar field, 𝑓(𝑥, 𝑦), over an arbitrary triangle, 𝑇, in the 𝑥𝑦–plane,

∫∫_𝑇 𝑓(𝑥, 𝑦) d𝑥d𝑦, (5.62)

as an integral over the unit triangle, 𝑈, in the 𝑢𝑣–plane,

∫∫_𝑈 𝑔(𝑢, 𝑣) d𝑢d𝑣,

where the function 𝑔 will be determined by 𝑓 and an appropriate change of


coordinates that takes 𝑈 to 𝑇 .
We first consider the case of the triangle 𝑇 = [(0, 0), (𝑎, 0), (0, 𝑏)], pictured
in Figure 5.9.8, where 𝑎 and 𝑏 are positive real numbers.

Figure 5.9.8: Triangle [(0, 0), (𝑎, 0), (0, 𝑏)], with a small Δ𝑥 × Δ𝑦 rectangle at the point (𝑥, 𝑦)

Observe that the vector field Φ : ℝ² → ℝ² defined by

Φ(𝑢, 𝑣) = (𝑎𝑢, 𝑏𝑣), for all (𝑢, 𝑣) ∈ ℝ²,
maps the unit triangle, 𝑈 , in the 𝑢𝑣–plane pictured in Figure 5.9.9, to the trian-
gle 𝑇 in the 𝑥𝑦–plane. The reason for this is that the line segment [(0, 0), (1, 0)]
in the 𝑢𝑣–plane, parametrized by
𝑢 = 𝑡, 𝑣 = 0,

Figure 5.9.9: Unit Triangle, 𝑈, in the 𝑢𝑣–plane, with a small Δ𝑢 × Δ𝑣 rectangle at the point (𝑢, 𝑣)

for 0 ⩽ 𝑡 ⩽ 1, gets mapped to


𝑥 = 𝑎𝑡, 𝑦 = 0,

for 0 ⩽ 𝑡 ⩽ 1, which is a parametrization of the line segment [(0, 0), (𝑎, 0)] in
the 𝑥𝑦–plane.
Similarly, the line segment [(1, 0), (0, 1)] in the 𝑢𝑣–plane, parametrized by
𝑢 = 1 − 𝑡, 𝑣 = 𝑡,

for 0 ⩽ 𝑡 ⩽ 1, gets mapped to


𝑥 = 𝑎(1 − 𝑡), 𝑦 = 𝑏𝑡,

for 0 ⩽ 𝑡 ⩽ 1, which is a parametrization of the line segment [(𝑎, 0), (0, 𝑏)] in
the 𝑥𝑦–plane.
Similar considerations show that [(0, 1), (0, 0)] gets mapped to [(0, 𝑏), (0, 0)]
under the action of Φ on ℝ2 .
Writing

(𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) = Φ(𝑢, 𝑣) for all (𝑢, 𝑣) ∈ ℝ²,

we can express the integrand in the double integral in (5.62) as a function of 𝑢 and 𝑣:

𝑓(𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) for (𝑢, 𝑣) in 𝑈.
We presently see how the differential 2–form d𝑥d𝑦 can be expressed in terms of
d𝑢d𝑣. To do this consider the small rectangle of area Δ𝑢Δ𝑣 and lower left–hand
corner at (𝑢, 𝑣) pictured in Figure 5.9.9. We see where the vector field Φ maps
this rectangle in the 𝑥𝑦–plane. In this case, it happens to be a rectangle with

lower–left hand corner Φ(𝑢, 𝑣) = (𝑥, 𝑦) and dimensions 𝑎Δ𝑢 × 𝑏Δ𝑣. In general,
however, the image of the Δ𝑢 × Δ𝑣 rectangle under a change of coordinates Φ
will be a plane region bounded by curves like the one pictured in Figure 5.9.10.
In the general case, we approximate the area of the image region by the area of the parallelogram spanned by vectors tangent to the image curves of the line segments [(𝑢, 𝑣), (𝑢 + Δ𝑢, 𝑣)] and [(𝑢, 𝑣), (𝑢, 𝑣 + Δ𝑣)] under the map Φ at the point (𝑢, 𝑣).

Figure 5.9.10: Image of the Rectangle under Φ

The curves are given parametrically by

𝜎(𝑢) = Φ(𝑢, 𝑣) = (𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) for 𝑢 ⩽ 𝑢 ⩽ 𝑢 + Δ𝑢,

and
𝛾(𝑣) = Φ(𝑢, 𝑣) = (𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) for 𝑣 ⩽ 𝑣 ⩽ 𝑣 + Δ𝑣.

The tangent vectors at the point (𝑢, 𝑣) are, respectively,

Δ𝑢 𝜎′(𝑢) = Δ𝑢 ((∂𝑥/∂𝑢) î + (∂𝑦/∂𝑢) ĵ),

and

Δ𝑣 𝛾′(𝑣) = Δ𝑣 ((∂𝑥/∂𝑣) î + (∂𝑦/∂𝑣) ĵ),

where we have scaled by Δ𝑢 and Δ𝑣 by virtue of the linear approximation provided by the derivative maps 𝐷𝜎(𝑢) and 𝐷𝛾(𝑣), respectively.
The area of the image region can then be approximated by the norm of the cross product of the tangent vectors:

Δ𝑥Δ𝑦 ≈ ∥Δ𝑢 𝜎′(𝑢) × Δ𝑣 𝛾′(𝑣)∥ = ∥𝜎′(𝑢) × 𝛾′(𝑣)∥ Δ𝑢Δ𝑣.

Evaluating the cross product 𝜎′(𝑢) × 𝛾′(𝑣) yields

𝜎′(𝑢) × 𝛾′(𝑣) = ((∂𝑥/∂𝑢) î + (∂𝑦/∂𝑢) ĵ) × ((∂𝑥/∂𝑣) î + (∂𝑦/∂𝑣) ĵ)

= (∂𝑥/∂𝑢)(∂𝑦/∂𝑣) î × ĵ + (∂𝑦/∂𝑢)(∂𝑥/∂𝑣) ĵ × î

= ((∂𝑥/∂𝑢)(∂𝑦/∂𝑣) − (∂𝑦/∂𝑢)(∂𝑥/∂𝑣)) k̂

= (∂(𝑥, 𝑦)/∂(𝑢, 𝑣)) k̂,

where ∂(𝑥, 𝑦)/∂(𝑢, 𝑣) denotes the determinant of the Jacobian matrix of Φ at (𝑢, 𝑣). It then follows that

Δ𝑥Δ𝑦 ≈ ∣∂(𝑥, 𝑦)/∂(𝑢, 𝑣)∣ Δ𝑢Δ𝑣,

which translates in terms of differential forms (for a change of coordinates with positive Jacobian determinant, as in the examples that follow) to

d𝑥d𝑦 = (∂(𝑥, 𝑦)/∂(𝑢, 𝑣)) d𝑢d𝑣.

We therefore obtain the Change of Variables Formula

∫∫_𝑇 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫∫_𝑈 𝑓(𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) ∣∂(𝑥, 𝑦)/∂(𝑢, 𝑣)∣ d𝑢d𝑣. (5.63)

This formula works for any regions 𝑅 and 𝐷 in the plane for which there is a change of coordinates Φ : ℝ² → ℝ² such that Φ(𝐷) = 𝑅:

∫∫_𝑅 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫∫_𝐷 𝑓(𝑥(𝑢, 𝑣), 𝑦(𝑢, 𝑣)) ∣∂(𝑥, 𝑦)/∂(𝑢, 𝑣)∣ d𝑢d𝑣. (5.64)
Example 5.9.1. For the case in which 𝑇 = [(0, 0), (𝑎, 0), (0, 𝑏)], 𝑈 is the unit triangle in ℝ², and Φ is given by

Φ(𝑢, 𝑣) = (𝑎𝑢, 𝑏𝑣) for all (𝑢, 𝑣) ∈ ℝ²,

we have ∂(𝑥, 𝑦)/∂(𝑢, 𝑣) = 𝑎𝑏, so the Change of Variables Formula (5.63) yields

∫∫_𝑇 𝑓(𝑥, 𝑦) d𝑥d𝑦 = 𝑎𝑏 ∫∫_𝑈 𝑓(𝑎𝑢, 𝑏𝑣) d𝑢d𝑣.
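The identity in Example 5.9.1 is easy to verify numerically. In this sketch (our own, not from the notes), the integrand 𝑓(𝑥, 𝑦) = 𝑥 + 𝑦² and the dimensions 𝑎 = 2, 𝑏 = 3 are sample choices:

```python
# Numerical check of Example 5.9.1 with a sample integrand over the
# triangle T = [(0,0), (2,0), (0,3)].

def integrate_triangle(f, a, b, n=400):
    """Midpoint rule for the double integral of f over [(0,0), (a,0), (0,b)]."""
    total, hx = 0.0, a / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = b * (1 - x / a) / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += f(x, y) * hx * hy
    return total

f = lambda x, y: x + y ** 2
a, b = 2.0, 3.0

lhs = integrate_triangle(f, a, b)                  # integral over T of f dx dy
rhs = a * b * integrate_triangle(lambda u, v: f(a * u, b * v), 1.0, 1.0)
print(lhs, rhs)  # the two values agree
```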

Example 5.9.2. Let 𝑅 = {(𝑥, 𝑦) ∈ ℝ² ∣ 𝑥² + 𝑦² ⩽ 1}. Evaluate

∫∫_𝑅 𝑒^{−𝑥²−𝑦²} d𝑥d𝑦.

Solution: Let 𝐷 = {(𝑟, 𝜃) ∈ ℝ² ∣ 0 ⩽ 𝑟 ⩽ 1, 0 ⩽ 𝜃 < 2𝜋} and consider the change of variables

𝑥 = 𝑟 cos 𝜃, 𝑦 = 𝑟 sin 𝜃;

that is, Φ(𝑟, 𝜃) = (𝑟 cos 𝜃, 𝑟 sin 𝜃) for all (𝑟, 𝜃) ∈ ℝ². The change of variables formula (5.64) in this case then reads

∫∫_𝑅 𝑓(𝑥, 𝑦) d𝑥d𝑦 = ∫∫_𝐷 𝑓(𝑟 cos 𝜃, 𝑟 sin 𝜃) ∣∂(𝑥, 𝑦)/∂(𝑟, 𝜃)∣ d𝑟d𝜃,

where 𝑓(𝑥, 𝑦) = 𝑒^{−𝑥²−𝑦²}, and

∂(𝑥, 𝑦)/∂(𝑟, 𝜃) = det( ∂𝑥/∂𝑟  ∂𝑥/∂𝜃 ; ∂𝑦/∂𝑟  ∂𝑦/∂𝜃 ) = det( cos 𝜃  −𝑟 sin 𝜃 ; sin 𝜃  𝑟 cos 𝜃 ) = 𝑟 cos²𝜃 + 𝑟 sin²𝜃 = 𝑟.

Hence,

∫∫_𝑅 𝑒^{−𝑥²−𝑦²} d𝑥d𝑦 = ∫∫_𝐷 𝑒^{−𝑟²} 𝑟 d𝑟d𝜃

= ∫_0^{2𝜋} ∫_0^1 𝑒^{−𝑟²} 𝑟 d𝑟 d𝜃

= ∫_0^{2𝜋} [−(1/2)𝑒^{−𝑟²}]_{𝑟=0}^{𝑟=1} d𝜃

= (1/2)(1 − 𝑒^{−1}) ∫_0^{2𝜋} d𝜃

= 𝜋(1 − 𝑒^{−1}).
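A one-dimensional midpoint-rule sum (our own check, not in the notes) confirms the value 𝜋(1 − 𝑒⁻¹) ≈ 1.9859:

```python
import math

def disk_integral(n=2000):
    """Midpoint rule for the polar integral of e^{-r^2} r over 0 <= r <= 1;
    the theta-integral just contributes a factor of 2*pi."""
    total, hr = 0.0, 1.0 / n
    for i in range(n):
        r = (i + 0.5) * hr
        total += math.exp(-r * r) * r * hr
    return 2 * math.pi * total

approx = disk_integral()
exact = math.pi * (1 - math.exp(-1))
print(approx, exact)  # both ≈ 1.9859
```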

Example 5.9.3 (Green’s Theorem for Arbitrary Triangles in ℝ²).


Appendix A

The Mean Value Theorem in Convex Sets

Definition A.0.4 (Convex Sets). A subset, 𝐴, of ℝ𝑛 is said to be convex if, given any two points 𝑥 and 𝑦 in 𝐴, the straight line segment connecting them is entirely contained in 𝐴; in symbols,

{𝑥 + 𝑡(𝑦 − 𝑥) ∈ ℝ𝑛 ∣ 0 ⩽ 𝑡 ⩽ 1} ⊆ 𝐴.

Example A.0.5. Prove that the ball 𝐵𝑟 (𝑂) = {𝑥 ∈ ℝ𝑛 ∣ ∥𝑥∥ < 𝑟} is a convex
subset of ℝ𝑛 .

Solution: Let 𝑥 and 𝑦 be in 𝐵𝑟 (𝑂); then, ∥𝑥∥ < 𝑟 and ∥𝑦∥ < 𝑟.
For 0 ⩽ 𝑡 ⩽ 1, consider

𝑥 + 𝑡(𝑦 − 𝑥) = (1 − 𝑡)𝑥 + 𝑡𝑦.

Then, taking the norm and using the triangle inequality,

∥𝑥 + 𝑡(𝑦 − 𝑥)∥ = ∥(1 − 𝑡)𝑥 + 𝑡𝑦∥


⩽ (1 − 𝑡)∥𝑥∥ + 𝑡∥𝑦∥
< (1 − 𝑡)𝑟 + 𝑡𝑟 = 𝑟.

Thus, 𝑥 + 𝑡(𝑦 − 𝑥) ∈ 𝐵𝑟 (𝑂) for any 𝑡 ∈ [0, 1]. Since this is true for
any 𝑥, 𝑦 ∈ 𝐵𝑟 (𝑂), it follows that 𝐵𝑟 (𝑂) is convex. □
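The computation in Example A.0.5 can be illustrated numerically (our own sketch; the radius, the dimension, and the sample count are arbitrary choices): random convex combinations of points of 𝐵ᵣ(𝑂) always land back in the ball.

```python
import math, random

# Convex combinations of points in the ball B_r(O) stay in the ball,
# exactly as the triangle-inequality argument above shows.
random.seed(0)
r, n_dim = 1.0, 3

def random_point_in_ball():
    """Rejection-sample a point with norm strictly less than r."""
    while True:
        p = [random.uniform(-r, r) for _ in range(n_dim)]
        if math.sqrt(sum(c * c for c in p)) < r:
            return p

inside = True
for _ in range(1000):
    x, y = random_point_in_ball(), random_point_in_ball()
    t = random.random()
    z = [(1 - t) * xi + t * yi for xi, yi in zip(x, y)]
    inside = inside and math.sqrt(sum(c * c for c in z)) < r
print(inside)  # True
```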

In fact, any ball in ℝ𝑛 is convex.


Proposition A.0.6 (Mean Value Theorem for Scalar Fields on Convex Sets). Let 𝐵 denote an open, convex subset of ℝ𝑛, and let 𝑓 : 𝐵 → ℝ be a scalar field. Suppose that 𝑓 is differentiable on 𝐵. Then, for any pair of points 𝑥 and 𝑦 in 𝐵, there exists a point 𝑧 in the line segment connecting 𝑥 to 𝑦 such that

𝑓(𝑦) − 𝑓(𝑥) = 𝐷_û 𝑓(𝑧) ∥𝑦 − 𝑥∥,

where û is the unit vector in the direction of the vector 𝑦 − 𝑥; that is,

û = (1/∥𝑦 − 𝑥∥)(𝑦 − 𝑥).

Proof. Assume that 𝑥 ∕= 𝑦, for if 𝑥 = 𝑦 the equality certainly holds true.


Define 𝑔 : [0, 1] → ℝ by

𝑔(𝑡) = 𝑓 (𝑥 + 𝑡(𝑦 − 𝑥)) for 0 ⩽ 𝑡 ⩽ 1.

We first show that 𝑔 is differentiable on (0, 1) and that

𝑔 ′ (𝑡) = ∇𝑓 (𝑥 + 𝑡(𝑦 − 𝑥)) ⋅ (𝑦 − 𝑥) for 0 < 𝑡 < 1.

(This has been proved in Exercise 4 of Assignment #10).


Now, by the Mean Value Theorem, there exists 𝜏 ∈ (0, 1) such that

𝑔(1) − 𝑔(0) = 𝑔 ′ (𝜏 )(1 − 0) = 𝑔 ′ (𝜏 ).

It then follows that

𝑓 (𝑦) − 𝑓 (𝑥) = ∇𝑓 (𝑥 + 𝜏 (𝑦 − 𝑥)) ⋅ (𝑦 − 𝑥).

Put 𝑧 = 𝑥 + 𝜏(𝑦 − 𝑥); then, 𝑧 is a point in the line segment connecting 𝑥 to 𝑦, and

𝑓(𝑦) − 𝑓(𝑥) = ∇𝑓(𝑧) ⋅ (𝑦 − 𝑥)

= ∇𝑓(𝑧) ⋅ ((𝑦 − 𝑥)/∥𝑦 − 𝑥∥) ∥𝑦 − 𝑥∥

= ∇𝑓(𝑧) ⋅ û ∥𝑦 − 𝑥∥

= 𝐷_û 𝑓(𝑧) ∥𝑦 − 𝑥∥,

where û = (1/∥𝑦 − 𝑥∥)(𝑦 − 𝑥).
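As a concrete illustration of Proposition A.0.6 (our own example; the field 𝑓 and the points 𝑥, 𝑦 are hypothetical choices): for the quadratic field 𝑓(𝑥₁, 𝑥₂) = 𝑥₁² + 𝑥₂², the reduction 𝑔(𝑡) = 𝑓(𝑥 + 𝑡(𝑦 − 𝑥)) is quadratic, so the mean value point is attained at 𝜏 = 1/2, and the identity 𝑓(𝑦) − 𝑓(𝑥) = ∇𝑓(𝑧) ⋅ (𝑦 − 𝑥) can be checked directly:

```python
# Concrete check of the Mean Value Theorem for the scalar field
# f(x1, x2) = x1^2 + x2^2: the segment midpoint serves as z.

f = lambda p: p[0] ** 2 + p[1] ** 2
grad_f = lambda p: (2 * p[0], 2 * p[1])

x, y = (0.1, -0.2), (0.5, 0.3)
tau = 0.5
z = tuple(xi + tau * (yi - xi) for xi, yi in zip(x, y))   # midpoint of [x, y]

lhs = f(y) - f(x)                                                # f(y) - f(x)
rhs = sum(g * (yi - xi) for g, yi, xi in zip(grad_f(z), y, x))   # grad f(z) . (y - x)
print(lhs, rhs)  # the two sides agree
```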
Appendix B

Reparametrizations

In this appendix we prove that any two parametrizations of a 𝐶¹ simple curve are reparametrizations of each other; more precisely:

Theorem B.0.7. Let 𝐼 and 𝐽 denote open intervals of real numbers containing
closed and bounded intervals [𝑎, 𝑏] and [𝑐, 𝑑], respectively, and 𝛾1 : 𝐼 → ℝ𝑛 and
𝛾2 : 𝐽 → ℝ𝑛 be 𝐶 1 paths. Suppose that 𝐶 = 𝛾1 ([𝑎, 𝑏]) = 𝛾2 ([𝑐, 𝑑]) is a 𝐶 1 simple
curve parametrized by 𝛾1 and 𝛾2 . Then, there exists a differentiable function,
𝜏 : 𝐽 → 𝐼, such that

(i) 𝜏 ′ (𝑡) > 0 for all 𝑡 ∈ 𝐽;

(ii) 𝜏 (𝑐) = 𝑎 and 𝜏 (𝑑) = 𝑏; and

(iii) 𝛾2 (𝑡) = 𝛾1 (𝜏 (𝑡)) for all 𝑡 ∈ 𝐽.

In order to prove Theorem B.0.7, we need to develop the notion of a tangent


space to a 𝐶 1 curve at a given point. We begin with a preliminary definition.

Definition B.0.8 (Tangent Space (Preliminary Definition)). Let 𝐶 denote a 𝐶¹ simple curve parameterized by a 𝐶¹ path, 𝜎 : 𝐼 → ℝ𝑛, where 𝐼 is an open interval containing 0, such that 𝜎(0) = 𝑝. We define the tangent space, 𝑇𝑝(𝐶), of 𝐶 at 𝑝 to be the span of the nonzero vector 𝜎′(0); that is,

𝑇𝑝(𝐶) = span{𝜎′(0)}.

Remark B.0.9. Observe that the set 𝑝 + 𝑇𝑝 (𝐶) is the tangent line to the curve
𝐶 at 𝑝, hence the name “tangent space” for 𝑇𝑝 (𝐶).

The notion of tangent space is important because it allows us to define the


derivative at 𝑝 of a map 𝑔 : 𝐶 → ℝ which is solely defined on the curve 𝐶. The
idea is to consider the composition 𝑔 ∘ 𝜎 : 𝐼 → ℝ and to require that the real
valued function 𝑔 ∘ 𝜎 be differentiable at 𝑡 = 0. For the case of a 𝐶 1 scalar field,


𝑓 , which is defined on an open region containing 𝐶, the Chain Rule implies that
𝑓 ∘ 𝜎 is differentiable at 0 and

(𝑓 ∘ 𝜎)′ (0) = ∇𝑓 (𝜎(0)) ⋅ 𝜎 ′ (0) = ∇𝑓 (𝑝) ⋅ 𝑣,

where 𝑣 = 𝜎 ′ (0) ∈ 𝑇𝑝 (𝐶). Observe that the map

𝑣 ↦ ∇𝑓(𝑝) ⋅ 𝑣 for 𝑣 ∈ 𝑇𝑝(𝐶)

defines a linear map on the tangent space of 𝐶 at 𝑝. We will denote this linear
map by 𝑑𝑓𝑝 ; that is, 𝑑𝑓𝑝 : 𝑇𝑝 (𝐶) → ℝ is given by

𝑑𝑓𝑝 (𝑣) = ∇𝑓 (𝑝) ⋅ 𝑣, for 𝑣 ∈ 𝑇𝑝 (𝐶).

Observe that we can then write, for ℎ ∈ ℝ with ∣ℎ∣ sufficiently small,

(𝑓 ∘ 𝜎)(0 + ℎ) = 𝑓 (𝜎(0)) + 𝑑𝑓𝑝 (ℎ𝜎 ′ (0)) + 𝐸0 (ℎ),

where

lim_{ℎ→0} ∣𝐸0(ℎ)∣/∣ℎ∣ = 0.
Definition B.0.10. Let 𝐶 denote a 𝐶¹ curve parametrized by a 𝐶¹ path, 𝜎 : 𝐼 → ℝ𝑛, where 𝐼 is an open interval containing 0 and such that 𝜎(0) = 𝑝 ∈ 𝐶.
We say that the function 𝑔 : 𝐶 → ℝ is differentiable at 𝑝 if there exists a linear
function
𝑑𝑔𝑝 : 𝑇𝑝 (𝐶) → ℝ
such that
(𝑔 ∘ 𝜎)(ℎ) = 𝑔(𝑝) + 𝑑𝑔𝑝 (ℎ𝜎 ′ (0)) + 𝐸𝑝 (ℎ),
where

lim_{ℎ→0} ∣𝐸𝑝(ℎ)∣/∣ℎ∣ = 0.
We see from Definition B.0.10 that, if 𝑔 : 𝐶 → ℝ is differentiable at 𝑝, then

lim_{ℎ→0} (𝑔(𝜎(ℎ)) − 𝑔(𝑝))/ℎ
exists and equals 𝑑𝑔𝑝 (𝜎 ′ (0)). We have already seen that if 𝑓 is a 𝐶 1 scalar field
defined in an open region containing 𝐶, then

𝑑𝑓𝑝 (𝜎 ′ (0)) = ∇𝑓 (𝑝) ⋅ 𝜎 ′ (0).

If the only information we have about a function 𝑔 is what it does to points on 𝐶, then we see why Definition B.0.10 is relevant. In the general case it might not make sense to talk about the gradient of 𝑔.
An example of a function, 𝑔, which is only defined on 𝐶 is the inverse of a
𝐶 1 parametrization, 𝛾 : 𝐽 → ℝ𝑛 , of 𝐶 where 𝐽 is an interval containing 0 in its

interior with 𝛾(0) = 𝑝. Here we are assuming that 𝛾 is one–to–one and onto 𝐶,
so that
𝑔 = 𝛾 −1 : 𝐶 → 𝐽
is defined. We claim that, since 𝛾 ′ (0) ∕= 0, according to the definition of 𝐶 1
parametrization in Definition 5.1.1 on page 61 in these notes, the function 𝑔 is
differentiable at 𝑝 according to Definition B.0.10. In order to prove this, we
first show that 𝑔 is continuous at 𝑝; that is,
Lemma B.0.11. Let 𝐶 be a 𝐶 1 curve parametrized by a 𝐶 1 map, 𝜎 : 𝐼 → ℝ𝑛 ,
where 𝐼 is an interval of real numbers containing 0 in its interior with 𝜎(0) = 𝑝.
Let 𝛾 : 𝐽 → ℝ𝑛 denote another 𝐶 1 parametrization of 𝐶, where 𝐽 is an interval
of real numbers containing 0 in its interior with 𝛾(0) = 𝑝. For every 𝑞 ∈ 𝐶,
define 𝑔(𝑞) = 𝜏 if and only if 𝛾(𝜏 ) = 𝑞. Then,

lim_{ℎ→0} 𝑔(𝜎(ℎ)) = 0. (B.1)

Proof: Write
𝜏 (ℎ) = 𝑔(𝜎(ℎ)), for ℎ ∈ 𝐼. (B.2)
We will show that
lim_{ℎ→0} 𝜏(ℎ) = 0; (B.3)

this will prove (B.1).


From (B.2) and the definition of 𝑔 we obtain that

𝛾(𝜏 (ℎ)) = 𝜎(ℎ), for ℎ ∈ 𝐼. (B.4)

Letting ℎ = 0 in (B.4) we see that

𝛾(𝜏 (0)) = 𝑝, (B.5)

from which we get that


𝜏 (0) = 0, (B.6)
since 𝛾 : 𝐽 → ℝ𝑛 is a parametrization of 𝐶 with 𝛾(0) = 𝑝.
Write
𝜎(𝑡) = (𝑥1 (𝑡), 𝑥2 (𝑡), . . . , 𝑥𝑛 (𝑡)), for all 𝑡 ∈ 𝐼, (B.7)
𝛾(𝜏 ) = (𝑦1 (𝜏 ), 𝑦2 (𝜏 ), . . . , 𝑦𝑛 (𝜏 )), for all 𝜏 ∈ 𝐽, (B.8)
and
𝑝 = (𝑝1 , 𝑝2 , . . . , 𝑝𝑛 ). (B.9)
Since 𝛾 ′ (𝜏 ) ∕= 0 for all 𝜏 ∈ 𝐽, there exists 𝑗 ∈ {1, 2, . . . , 𝑛} such that

𝑦𝑗′ (0) ∕= 0.

Consequently, there exists 𝛿 > 0 such that

∣𝜏∣ ⩽ 𝛿 ⇒ ∣𝑦𝑗′(𝜏)∣ ⩾ ∣𝑦𝑗′(0)∣/2. (B.10)

It follows from (B.4), (B.7) and (B.8) that

𝑦𝑗 (𝜏 (ℎ)) = 𝑥𝑗 (ℎ), for ℎ ∈ 𝐼. (B.11)

Next, use the differentiability of the function 𝑦𝑗 : 𝐽 → ℝ and the mean value
theorem to obtain 𝜃 ∈ (0, 1) such that

𝑦𝑗 (𝜏 (ℎ)) − 𝑝𝑗 = 𝜏 (ℎ)𝑦𝑗′ (𝜃𝜏 (ℎ)), (B.12)

where we have used (B.5), (B.6) and (B.9). Thus, for

∣𝜏 (ℎ)∣ ⩽ 𝛿,

it follows from (B.10) and (B.12) that

𝑚 ∣𝜏 (ℎ)∣ ⩽ ∣𝑦𝑗 (𝜏 (ℎ)) − 𝑝𝑗 ∣, (B.13)

where we have set

𝑚 = ∣𝑦𝑗′(0)∣/2 > 0. (B.14)
On the other hand, it follows from (B.11) and the differentiability of 𝑥𝑗 that

𝑦𝑗 (𝜏 (ℎ)) = 𝑝𝑗 + ℎ𝑥′𝑗 (0) + 𝐸𝑗 (ℎ), for ℎ ∈ 𝐼, (B.15)

where

lim_{ℎ→0} 𝐸𝑗(ℎ)/ℎ = 0. (B.16)
Consequently, using (B.13) and (B.15), if ∣𝜏(ℎ)∣ ⩽ 𝛿,

𝑚 ∣𝜏 (ℎ)∣ ⩽ ∣ℎ∣∣𝑥′𝑗 (0)∣ + ∣𝐸𝑗 (ℎ)∣. (B.17)

The statement in (B.3) now follows from (B.17) and (B.16), since 𝑚 > 0 by
virtue of (B.14).
Lemma B.0.12. Let 𝐶, 𝜎 : 𝐼 → ℝ𝑛 and 𝛾 : 𝐽 → ℝ𝑛 be as in Lemma B.0.11. For
every 𝑞 ∈ 𝐶, define 𝑔(𝑞) = 𝜏 if and only if 𝛾(𝜏 ) = 𝑞. Then, the function 𝜏 : 𝐼 → 𝐽
is differentiable at 0. Consequently, the function 𝑔 : 𝐶 → 𝐽 is differentiable at
𝑝 and
𝑑𝑔𝑝(𝜎′(0)) = lim_{ℎ→0} (𝑔(𝜎(ℎ)) − 𝑔(𝑝))/ℎ = 𝜏′(0).
Furthermore,

𝛾′(0) = (1/𝜏′(0)) 𝜎′(0). (B.18)
Proof: As in the proof of Lemma B.0.11, let 𝑗 ∈ {1, 2, . . . , 𝑛} be such that

𝑦𝑗′ (0) ∕= 0. (B.19)

Using the differentiability of 𝛾 and 𝜎, we obtain from (B.11) that

𝑝𝑗 + 𝜏 (ℎ)𝑦𝑗′ (0) + 𝐸1 (𝜏 (ℎ)) = 𝑝𝑗 + ℎ𝑥′𝑗 (0) + 𝐸2 (ℎ), (B.20)



where

lim_{𝜏(ℎ)→0} 𝐸1(𝜏(ℎ))/𝜏(ℎ) = 0 and lim_{ℎ→0} 𝐸2(ℎ)/ℎ = 0. (B.21)
We obtain from (B.20) and (B.19) that

(𝜏(ℎ)/ℎ) [1 + (1/𝑦𝑗′(0)) 𝐸1(𝜏(ℎ))/𝜏(ℎ)] = 𝑥𝑗′(0)/𝑦𝑗′(0) + (1/𝑦𝑗′(0)) 𝐸2(ℎ)/ℎ,

from which we get

𝜏(ℎ)/ℎ = [𝑥𝑗′(0)/𝑦𝑗′(0) + (1/𝑦𝑗′(0)) 𝐸2(ℎ)/ℎ] / [1 + (1/𝑦𝑗′(0)) 𝐸1(𝜏(ℎ))/𝜏(ℎ)]. (B.22)

Next, apply Lemma B.0.11 and (B.21) to obtain from (B.22) that

lim_{ℎ→0} 𝜏(ℎ)/ℎ = 𝑥𝑗′(0)/𝑦𝑗′(0),

which shows that 𝜏 is differentiable at 0, in view of (B.6).


Finally, applying the Chain Rule to the expression in (B.4) we obtain

𝜏 ′ (0)𝛾 ′ (0) = 𝜎 ′ (0),

which yields (B.18).

The expression in (B.18) in the statement of Lemma B.0.12 allows us to ex-


pand the preliminary definition of the tangent space of 𝐶 at 𝑝 given in Definition
B.0.8 as follows:

Definition B.0.13 (Tangent Space). Let 𝐶 denote a 𝐶 1 simple curve in ℝ𝑛


and 𝑝 ∈ 𝐶. We define the tangent space, 𝑇𝑝 (𝐶), of 𝐶 at 𝑝 to be

𝑇𝑝 (𝐶) = span{𝜎 ′ (0)},

where 𝜎 : (−𝜀, 𝜀) → 𝐶 is any 𝐶 1 map defined on (−𝜀, 𝜀), for some 𝜀 > 0, such
that 𝜎 ′ (𝑡) ∕= 0 for all 𝑡 ∈ (−𝜀, 𝜀) and 𝜎(0) = 𝑝.

Indeed, if 𝛾 : (−𝜀, 𝜀) → 𝐶 is another 𝐶 1 map with the properties that 𝛾 ′ (𝑡) ∕=


0 for all 𝑡 ∈ (−𝜀, 𝜀) and 𝛾(0) = 𝑝, it follows from (B.18) in Lemma B.0.12 that

span{𝜎 ′ (0)} = span{𝛾 ′ (0)}.

Thus, the definition of 𝑇𝑝 𝐶 in Definition B.0.13 is independent of the choice of


parametrization, 𝜎.
Next, let 𝛾 : 𝐽 → 𝐶 be a parametrization of a 𝐶 1 curve, 𝐶. We note for
future reference that 𝛾 ′ (𝑡) ∈ 𝑇𝛾(𝑡) (𝐶) for all 𝑡 ∈ 𝐽. To see why this is the case,

let 𝜀 > 0 be sufficiently small so that (𝑡−𝜀, 𝑡+𝜀) ⊂ 𝐽, and define 𝜎 : (−𝜀, 𝜀) → 𝐶
by
𝜎(𝜏 ) = 𝛾(𝑡 + 𝜏 ), for all 𝜏 ∈ (−𝜀, 𝜀).
Then, 𝜎 is a 𝐶 1 map satisfying 𝜎 ′ (𝜏 ) = 𝛾 ′ (𝜏 + 𝑡) ∕= 0 for all 𝜏 ∈ (−𝜀, 𝜀) and
𝜎(0) = 𝛾(𝑡). Observe also that 𝜎 ′ (0) = 𝛾 ′ (𝑡). It then follows by Definition
B.0.13 that 𝛾 ′ (𝑡) ∈ 𝑇𝛾(𝑡) 𝐶 for all 𝑡 ∈ 𝐽.

Proposition B.0.14 (Chain Rule). Let 𝐶 be a 𝐶 1 simple curve parametrized


by a 𝐶 1 path, 𝛾 : 𝐽 → ℝ𝑛 . Suppose that 𝑔 : 𝐶 → ℝ is a differentiable function
defined on 𝐶. Then, the map 𝑔 ∘ 𝛾 : 𝐽 → ℝ is a differentiable function and

(d/d𝑡)[𝑔(𝛾(𝑡))] = 𝑑𝑔𝛾(𝑡)(𝛾′(𝑡)), for all 𝑡 ∈ 𝐽. (B.23)
Proof: Put 𝜎(ℎ) = 𝛾(𝑡 + ℎ), for ∣ℎ∣ sufficiently small. By Definition B.0.10,

𝑔(𝛾(𝑡 + ℎ)) = 𝑔(𝛾(𝑡)) + 𝑑𝑔𝛾(𝑡)(ℎ𝛾′(𝑡)) + 𝐸𝛾(𝑡)(ℎ), (B.24)

where

lim_{ℎ→0} ∣𝐸𝛾(𝑡)(ℎ)∣/∣ℎ∣ = 0. (B.25)
The statement in (B.23) now follows from (B.24), (B.25), and the linearity of
the map 𝑑𝑔𝛾(𝑡) : 𝑇𝛾(𝑡) (𝐶) → ℝ.

Proof of Theorem B.0.7: Let 𝐼 and 𝐽 denote open intervals of real numbers con-
taining closed and bounded intervals [𝑎, 𝑏] and [𝑐, 𝑑], respectively, and 𝛾1 : 𝐼 →
ℝ𝑛 and 𝛾2 : 𝐽 → ℝ𝑛 be 𝐶 1 paths. Suppose that 𝐶 = 𝛾1 ([𝑎, 𝑏]) = 𝛾2 ([𝑐, 𝑑]) is a
𝐶 1 simple curve parametrized by 𝛾1 and 𝛾2 . Define 𝜏 : 𝐽 → 𝐼 by 𝜏 = 𝑔 ∘ 𝛾2 ,
where 𝑔 = 𝛾1−1 , the inverse of 𝛾1 . By Lemma B.0.12, 𝑔 : 𝐶 → 𝐼 is differentiable
on 𝐶. It therefore follows by the Chain Rule (Proposition B.0.14) that 𝜏 is
differentiable and

𝜏′(𝑡) = 𝑑𝑔𝛾₂(𝑡)(𝛾₂′(𝑡)), for all 𝑡 ∈ 𝐽.

In addition, we have that

𝛾1 (𝜏 (𝑡)) = 𝛾2 (𝑡), for all 𝑡 ∈ 𝐽.

Thus, by the Chain Rule,

𝜏 ′ (𝑡)𝛾1′ (𝜏 (𝑡)) = 𝛾2′ (𝑡), for all 𝑡 ∈ 𝐽. (B.26)

Taking norms on both sides of (B.26), and using the fact that 𝛾1 and 𝛾2 are
parametrizations, we obtain from (B.26) that

∣𝜏′(𝑡)∣ = ∥𝛾₂′(𝑡)∥ / ∥𝛾₁′(𝜏(𝑡))∥, for all 𝑡 ∈ 𝐽. (B.27)

Since 𝛾2′ (𝑡) ∕= 0 for all 𝑡 ∈ 𝐽, it follows from (B.27) that

𝜏 ′ (𝑡) ∕= 0, for all 𝑡 ∈ 𝐽.

Thus, either

𝜏′(𝑡) > 0, for all 𝑡 ∈ 𝐽, (B.28)

or

𝜏′(𝑡) < 0, for all 𝑡 ∈ 𝐽. (B.29)

If (B.28) holds true, then the proof of Theorem B.0.7 is complete. If (B.29) is true, consider the function 𝜏˜ : 𝐽 → 𝐼 given by

𝜏˜(𝑡) = 𝜏(𝑐 + 𝑑 − 𝑡), for all 𝑡 ∈ 𝐽.

Then, 𝜏˜ satisfies the properties in the conclusion of the theorem.
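Theorem B.0.7 can be illustrated numerically with a concrete pair of parametrizations (our own example, not from the notes): 𝛾₁(𝑠) = (𝑠, 𝑠²) on [0, 2] and 𝛾₂(𝑡) = (2𝑡, 4𝑡²) on [0, 1] parametrize the same arc of 𝑦 = 𝑥², and 𝜏(𝑡) = 2𝑡 satisfies (i)–(iii):

```python
# Numerical illustration of Theorem B.0.7:
#   gamma1(s) = (s, s^2)   on [0, 2],
#   gamma2(t) = (2t, 4t^2) on [0, 1],
# related by the reparametrization tau(t) = 2t.

gamma1 = lambda s: (s, s * s)
gamma2 = lambda t: (2 * t, 4 * t * t)
tau = lambda t: 2 * t        # tau'(t) = 2 > 0, tau(0) = 0, tau(1) = 2

def deriv(path, t, h=1e-6):
    """Central-difference derivative of a path at t, componentwise."""
    fp, fm = path(t + h), path(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))

ok = True
for k in range(1, 10):
    t = k / 10
    # (iii): gamma2(t) = gamma1(tau(t))
    ok = ok and all(abs(a - b) < 1e-12 for a, b in zip(gamma2(t), gamma1(tau(t))))
    # chain rule (B.26): tau'(t) * gamma1'(tau(t)) = gamma2'(t)
    g1p, g2p = deriv(gamma1, tau(t)), deriv(gamma2, t)
    ok = ok and all(abs(2 * a - b) < 1e-4 for a, b in zip(g1p, g2p))
print(ok)  # True
```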
