
Yunzhi Zou

Multi-Variable Calculus
Also of Interest
Single Variable Calculus, A First Step
Yunzhi Zou, 2018
ISBN 978-3-11-052462-8, e-ISBN (PDF) 978-3-11-052778-0,
e-ISBN (EPUB) 978-3-11-052785-8

Hausdorff Calculus, Applications to Fractal Systems


Yingjie Liang, Wen Chen, Wei Cai, 2019
ISBN 978-3-11-060692-8, e-ISBN (PDF) 978-3-11-060852-6,
e-ISBN (EPUB) 978-3-11-060705-5

Stochastic Models for Fractional Calculus


Mark M. Meerschaert, Alla Sikorskii, 2019
ISBN 978-3-11-055907-1, e-ISBN (PDF) 978-3-11-056024-4,
e-ISBN (EPUB) 978-3-11-055914-9

Modern Umbral Calculus, An Elementary Introduction with Applications
to Linear Interpolation and Operator Approximation Theory
Francesco Aldo Costabile, 2019
ISBN 978-3-11-064996-3, e-ISBN (PDF) 978-3-11-065292-5,
e-ISBN (EPUB) 978-3-11-065009-9

Fractional Calculus in Applied Sciences and Engineering


Changpin Li (Ed.)
ISSN 2509-7210
Yunzhi Zou

Multi-Variable Calculus

A First Step
Mathematics Subject Classification 2010
Primary: 26B12, 26B20, 26B15; Secondary: 26B05, 26B10

Author
Prof. Yunzhi Zou
Department of Mathematics
Sichuan University
610065 Chengdu
People’s Republic of China
[email protected]

ISBN 978-3-11-067414-9
e-ISBN (PDF) 978-3-11-067437-8
e-ISBN (EPUB) 978-3-11-067443-9

Library of Congress Control Number: 2019953764

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2020 Walter de Gruyter GmbH, Berlin/Boston


Cover image: shulz/iStock/Getty Images Plus
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck

www.degruyter.com
Contents

Introduction

1 Vectors and the geometry of space
1.1 Vectors
1.1.1 Concepts of vectors
1.1.2 Linear operations involving vectors
1.1.3 Coordinate systems in three-dimensional space
1.1.4 Representing vectors using coordinates
1.1.5 Lengths, direction angles
1.2 Dot product, cross product, and triple product
1.2.1 The dot product
1.2.2 Projections
1.2.3 The cross product
1.2.4 Scalar triple product
1.3 Equations of lines and planes
1.3.1 Lines
1.3.2 Planes
1.4 Curves and vector-valued functions
1.5 Calculus of vector-valued functions
1.5.1 Limits, derivatives, and tangent vectors
1.5.2 Antiderivatives and definite integrals
1.5.3 Length of curves, curvatures, TNB frame
1.6 Surfaces in space
1.6.1 Graph of an equation F(x, y, z) = 0
1.6.2 Cylinder
1.6.3 Quadric surfaces
1.6.4 Surface of revolution
1.7 Parameterized surfaces
1.8 Intersecting surfaces and projection curves
1.9 Regions bounded by surfaces
1.10 Review
1.11 Exercises
1.11.1 Vectors
1.11.2 Lines and planes in space
1.11.3 Curves and surfaces in space

2 Functions of multiple variables
2.1 Functions of multiple variables
2.1.1 Definitions
2.1.2 Graphs and level curves
2.1.3 Functions of more than two variables
2.1.4 Limits
2.1.5 Continuity
2.2 Partial derivatives
2.2.1 Definition
2.2.2 Interpretations of partial derivatives
2.2.3 Partial derivatives of higher order
2.3 Total differential
2.3.1 Linearization and differentiability
2.3.2 The total differential
2.3.3 The linear/differential approximation
2.4 The chain rule
2.4.1 The chain rule with one independent variable
2.4.2 The chain rule with more than one independent variable
2.4.3 Partial derivatives for abstract functions
2.5 The Taylor expansion
2.6 Implicit differentiation
2.6.1 Functions implicitly defined by a single equation
2.6.2 Functions defined implicitly by systems of equations
2.7 Tangent lines and tangent planes
2.7.1 Tangent lines and normal planes to a curve
2.7.2 Tangent planes and normal lines to a surface
2.8 Directional derivatives and gradient vectors
2.9 Maximum and minimum values
2.9.1 Extrema of functions of two variables
2.9.2 Lagrange multipliers
2.10 Review
2.11 Exercises
2.11.1 Functions of two variables
2.11.2 Partial derivatives and differentiability
2.11.3 Chain rules and implicit differentiation
2.11.4 Tangent lines/planes, directional derivatives
2.11.5 Maximum/minimum problems

3 Multiple integrals
3.1 Definition and properties
3.2 Double integrals in rectangular coordinates
3.3 Double integral in polar coordinates
3.4 Change of variables formula for double integrals
3.5 Triple integrals
3.5.1 Triple integrals in rectangular coordinates
3.5.2 Cylindrical and spherical coordinates
3.6 Change of variables in triple integrals
3.7 Other applications of multiple integrals
3.7.1 Surface area
3.7.2 Center of mass, moment of inertia
3.8 Review
3.9 Exercises
3.9.1 Double integrals
3.9.2 Triple integrals
3.9.3 Other applications of multiple integrals

4 Line and surface integrals
4.1 Line integral with respect to arc length
4.1.1 Definition and properties
4.1.2 Evaluating a line integral, $\int_C f(x, y)\,ds$, in $\mathbb{R}^2$
4.1.3 Line integrals $\int_C f(x, y, z)\,ds$ in $\mathbb{R}^3$
4.2 Line integral of a vector field
4.2.1 Vector fields
4.2.2 The line integral of a vector field along a curve C
4.3 The fundamental theorem of line integrals
4.4 Green’s theorem: circulation-curl form
4.4.1 Positive oriented simple curve and simply connected region
4.4.2 Circulation around a closed curve
4.4.3 Circulation density
4.4.4 Green’s theorem: circulation-curl form
4.4.5 Applications of Green’s theorem in circulation-curl form
4.5 Green’s theorem: flux-divergence form
4.5.1 Flux
4.5.2 Flux density – divergence
4.5.3 The divergence-flux form of Green’s theorem
4.6 Source-free vector fields
4.7 Surface integral with respect to surface area
4.8 Surface integrals of vector fields
4.8.1 Orientable surfaces
4.8.2 Flux integral $\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS$
4.9 Divergence theorem
4.9.1 Divergence of a three-dimensional vector field
4.9.2 Divergence theorem
4.10 Stokes theorem
4.10.1 The curl of a three-dimensional vector field
4.10.2 Stokes theorem
4.11 Review
4.12 Exercises
4.12.1 Line integrals
4.12.2 Surface integrals

5 Introduction to ordinary differential equations
5.1 Introduction
5.2 First-order ODEs
5.2.1 General and particular solutions and direction fields
5.2.2 Separable differential equations
5.2.3 Substitution methods
5.2.4 Exact differential equations
5.2.5 First-order linear differential equations
5.3 Second-order ODEs
5.3.1 Reducible second-order equations
5.3.2 Second-order linear differential equations
5.3.3 Variation of parameters
5.4 Other ways of solving differential equations
5.4.1 Power series method
5.4.2 Numerical approximation: Euler’s method
5.5 Review
5.6 Exercises
5.6.1 Introduction to differential equations
5.6.2 First-order differential equations
5.6.3 Second-order differential equations

Further reading

Index

Introduction
Calculus has been widely applied to an incredible number of disciplines since its
inception in the seventeenth century. In particular, the marvelous Maxwell equations
revealed the laws that govern electric and magnetic fields, which led to the
prediction of the existence of electromagnetic waves. The industrial revolution
witnessed the many applications of calculus. The power of calculus never diminishes,
even in today’s scientific world. For this reason, there is no doubt that calculus is one
of the most important courses for undergraduate students at any university in the
world.
On the other hand, during the past century, especially since the 2000s, many Chi-
nese and other non-English speaking people have gone to English speaking countries
to further their studies, and more are on their way. Also, as global cooperations and
communications become important for people to tackle big problems, there are needs
for people to know and understand each other better. Fortunately, Sichuan University
has a long history of global connections. Its summer immersion program is well known
for its size and popularity. Each year, it sends and hosts thousands of students from
different parts of the world. We believe that there are other similar situations where
students come and go to different places or countries without disrupting their studies.
For those students, a suitable textbook is helpful.
However, there are many challenges in developing such a suitable book. First of
all, for most freshmen for whom English is not their first language, the textbook should
employ English as plain as possible. Second, the textbook should take into account
what students have learned in high school and what they need in a calculus course.
Third, there must be a smooth transition from the local standards to those globally
accepted. Furthermore, such a book must have some new insights to inject new energy
into the many already existing texts. This includes, but is not limited to, emphasizing
discovery over rote learning; being as concise as possible while covering the essential
content required by most local and global universities; and being printed in color as
most texts in English are. The book Single Variable Calculus: A First Step, which was
the first such calculus text in China, has provided a response to these challenges since
it was published by the World Publishing Company in 2015 and by De Gruyter in 2018.
The present book, Multi-Variable Calculus: A First Step, makes sure that these efforts
continue.
With more than 10 years of teaching calculus courses to students at the Wu Yuzhang
honors college at Sichuan University, I have had the chance to work with local
students using books and resource materials in English. We adopted or referred
students to many calculus books, for example Thomas’ Calculus, 10th edition, by Finney,
Weir, and Giordano; Calculus, 5th edition, by Stewart; Calculus, by Larson, Edwards,
and Hostetler; Calculus: Ideas and Applications, by Himonas and Howard; Calculus:
Early Transcendentals, 2nd edition, by Briggs, Cochran, and Gillett; and other books in
Chinese, including the calculus books published by the Mathematics departments at
Sichuan University and Tongji University. They are all good textbooks, and I acknowledge
that I was greatly inspired by them, not only by their structure that builds the
contents, but also by the nice problems that enhance understanding. Most of the
exercises in this text were developed over the past decade. My many teaching assistants
contributed a lot by helping create or collect resources in the past years. Among
them are Zengbao Wu, Liang Li, Mengxin Li, Bo Qian, Xi Zhu, and Yang Yang. Most of
the exercises were inspired by the books mentioned before. Some are original, while
others that we believe to be original may be similar to those found in existing books or
other resources. Those are usually classic problems and can be found in many places.

https://doi.org/10.1515/9783110674378-201

My thanks also go to students in the years of 2015, 2016, 2017, and 2018 at Wu
Yuzhang College who helped proofread the manuscript or notes and provided use-
ful feedback. I also appreciate my wonderful colleagues Wengui Hu, Li Ren, and Hao
Wang, who worked with me teaching calculus using the early versions of this book. In
particular, Wengui contributed many good problems. I received valuable suggestions
from my dear friend Dr. Harold Reiter at UNCC and Dr. Wenyuan Liao at the University
of Calgary, who always lends me a hand in solving problems arising in using LaTeX or
MATLAB. Professor Xiaozhan Xu provided me with many excellent PPTs and anima-
tions for teaching the course.
My special thanks go to my dear friend Dr. Jonathan Kane, whose talents in math-
ematics and English improved the manuscript a lot, with thorough and professional
edits. Also the ideas of adding contents such as moment of inertia, the torus problem,
and solving simple PDEs were due to Jon.
I enjoyed very much working with the wonderful people mentioned above. I am
sure that without them this book would not have achieved this level. The following
list is in alphabetical order.

1. Dr. Wengui Hu, Associate Professor of Mathematics
Sichuan University, China
2. Dr. Jonathan Kane, Emeritus Professor of Mathematics
University of Wisconsin, Whitewater, USA
3. Dr. Wenyuan Liao, Associate Professor of Mathematics
University of Calgary, Canada
4. Dr. Harold Reiter, Professor of Mathematics,
University of North Carolina at Charlotte, USA
5. Dr. Li Ren, Associate Professor of Mathematics
Sichuan University, China
6. Dr. Hao Wang, Associate Professor of Mathematics
Sichuan University, China
7. Mr. Xiaozhan Xu, Professor of Mathematics
Sichuan University, China
8. Mr. Zengbao Wu, Mr. Liang Li, Ms. Mengxin Li, Mr. Bo Qian, Mr. Xi Zhu, Mr. Yang
Yang, and Mr. Yi Guo
Graduate students working as teaching assistants

I also would like to thank my Mathematics department and the academic affairs of-
fice at Sichuan University. I always have their encouragement and generous support,
which make me happy to devote time and energy in writing this book and make the
publication of the work possible.
We have been working hard on this version; however, there might still be typos
and even mistakes. The responsibility for any errors in this book lies entirely with
me. I will be happy to receive comments and feedback at any time.
I can be reached via [email protected].

Sincerely,
Yunzhi Zou
Professor of Mathematics
Sichuan University
Chengdu, P.R. China
[email protected]
610065
1 Vectors and the geometry of space
In this chapter we introduce vectors and coordinate systems for three-dimensional
space. They are very helpful in our study of multivariable calculus. In particular, vec-
tors provide simple descriptions and insight concerning curves and planes. We also
introduce some surfaces in space. The graph of a function of two variables is a surface
in space which gives additional insight into the properties of the function.

1.1 Vectors
1.1.1 Concepts of vectors

The term vector is used to indicate a quantity that has both a magnitude and a
direction, for instance, displacement, acceleration, velocity, and force. Scientists often
represent a vector geometrically by an arrow (a directed line segment). The arrow of
the directed line segment points in the direction of the vector, while the length of the
arrow represents the magnitude of the vector. We denote vectors by letters that have
an arrow overbar, such as $\vec{a}$, $\vec{b}$, $\vec{i}$, $\vec{k}$, $\vec{v}$. For example, suppose an object moves along
a straight line from point A to point B. The vector $\vec{s}$ representing this displacement
geometrically has initial point A (the tail) and terminal point B (the head), and we indicate
this by writing $\vec{s} = \overrightarrow{AB}$ (as shown in Figure 1.1(a)). We also denote vectors by printing
the letters in boldface, such as $\mathbf{a}$, $\mathbf{b}$, $\mathbf{i}$, $\mathbf{k}$, $\mathbf{v}$. In this book, we use both notations. We
denote the magnitude (also called the length) of a vector $\vec{a}$ (or $\mathbf{a}$) by $|\vec{a}|$ (or $|\mathbf{a}|$). If $|\vec{a}| = 1$,
then we say that $\vec{a}$ is a unit vector.

(a) (b) (c) (d)

Figure 1.1: Vectors, addition, scalar multiplication, and subtraction.

We say that two vectors $\vec{a}$ and $\vec{b}$ are equivalent (or equal) if they have the same length
and the same direction, and we write $\vec{a} = \vec{b}$. Note that two vectors with the same length
and direction are considered equal even when the vectors are in two different
locations. The zero vector, denoted by $\vec{0}$ or $\mathbf{0}$, has length 0, and, consequently, it is the
only vector with no specific direction. If nonzero vectors $\vec{a}$ and $\vec{b}$ have the same
direction or if $\vec{a}$ has exactly the opposite direction to that of $\vec{b}$, then we say that they are
parallel, and we write $\vec{a} \parallel \vec{b}$.

https://doi.org/10.1515/9783110674378-001

1.1.2 Linear operations involving vectors

We assume that vectors considered here can be represented by directed line segments
or arrows in two-dimensional space, $\mathbb{R}^2$, or three-dimensional space, $\mathbb{R}^3$. However,
vectors can be defined much more generally without reference to directed line segments.

Definition 1.1.1 (Vector addition). If $\vec{a}$ and $\vec{b}$ are vectors positioned so the initial point of $\vec{b}$ is at the
terminal point of $\vec{a}$, then the sum $\vec{a} + \vec{b}$ is the vector from the initial point of $\vec{a}$ to the terminal point of $\vec{b}$.

This definition of vector addition is illustrated in Figure 1.1(b), and you can see why
this definition is sometimes called the triangle law or parallelogram law.

Note. If the initial point of $\vec{b}$ is not at the terminal point of $\vec{a}$, then a copy of $\vec{b}$ (same
length and direction) can be made with its initial point at the terminal point of $\vec{a}$, and
the sum can be created using $\vec{a}$ and this copy of $\vec{b}$.

Vector addition satisfies the following laws for any three vectors $\vec{a}$, $\vec{b}$, $\vec{c}$:
(1) Commutative law: $\vec{a} + \vec{b} = \vec{b} + \vec{a}$.
(2) Associative law: $(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$.

Definition 1.1.2 (Scalar multiplication, negative of a vector). If λ is a scalar (a number) and $\vec{a}$ is a
vector, then the scalar multiple $\lambda\vec{a}$ is also a vector. If λ > 0, then $\lambda\vec{a}$ has the same direction as the vector
$\vec{a}$ and has length λ times the length of $\vec{a}$. If λ < 0, then $\lambda\vec{a}$ has the direction opposite to that
of $\vec{a}$ and has length |λ| times the length of $\vec{a}$. If λ = 0 or $\vec{a} = \vec{0}$ (zero vector), then $\lambda\vec{a} = \vec{0}$.
In particular, the vector $-\vec{a}$ is called the negative of $\vec{a}$; it is the scalar multiple $(-1)\vec{a}$, which has the
same length as $\vec{a}$ but points in the opposite direction.

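Scalar multiplication satisfies the following laws for any two vectors $\vec{a}$, $\vec{b}$ and any
two scalars λ, μ:
(3) Associative law: $\lambda(\mu\vec{a}) = (\lambda\mu)\vec{a} = \mu(\lambda\vec{a})$.
(4) Distributive laws: $(\lambda + \mu)\vec{a} = \lambda\vec{a} + \mu\vec{a}$ and $\lambda(\vec{a} + \vec{b}) = \lambda\vec{a} + \lambda\vec{b}$.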

By the distributive law (4), $\vec{b} + (-\vec{b}) = 1\vec{b} + (-1)\vec{b} = (1 - 1)\vec{b} = \vec{0}$, so $\vec{b}$ and $-\vec{b}$ act as
negatives of each other. Also, we can see that two nonzero vectors are parallel to each
other if they are scalar multiples of one another. The zero vector is considered to be
parallel to all other vectors. It is easy to establish the following theorem.

Theorem 1.1.1. Suppose $\vec{a}$ and $\vec{b}$ are two nonzero vectors. Then $\vec{a} \parallel \vec{b}$ if and only if there exists a number
λ ≠ 0 such that $\vec{a} = \lambda\vec{b}$.

The difference or subtraction $\vec{a} - \vec{b}$ of two vectors is defined in terms of a vector sum as

\[ \vec{a} - \vec{b} = \vec{a} + (-\vec{b}). \]

Hence, we can construct $\vec{a} - \vec{b}$ geometrically by first drawing the negative $-\vec{b}$ of $\vec{b}$,
and then adding $-\vec{b}$ to $\vec{a}$ using the parallelogram law as in Figure 1.1(d). This shows
that the vector $\vec{a} - \vec{b}$ is the vector from the head of $\vec{b}$ to the head of $\vec{a}$. The operation
of subtracting two vectors does not satisfy the commutative law (1) or the associative
law (2), but it does satisfy the distributive law (4), $\lambda(\vec{a} - \vec{b}) = \lambda\vec{a} - \lambda\vec{b}$.

1.1.3 Coordinate systems in three-dimensional space

To locate a point in a plane in a two-dimensional Cartesian coordinate system with
perpendicular x- and y-axes, two numbers or coordinates are necessary, and this is
why a plane is called two-dimensional. That is, the point can be represented as an
ordered pair (a, b) of real numbers where the x-coordinate, a, is the directed distance
from the y-axis to the point, and the y-coordinate, b, is the directed distance from the
x-axis to the point.
To locate a point in three-dimensional space, three coordinates are required. We
start with a fixed point, O, called the origin. We then draw three number lines that
all pass through O and are perpendicular to each other. Usually, we put two number
lines: one horizontal and one vertical. We call the three number lines the coordinate
axes and label them as the x-axis, the y-axis, and the z-axis in a way that satisfies the
right-hand rule. This rule helps determine the direction of the z-axis. If you curl your
right-hand fingers naturally in a 90° rotation from the positive x-axis to the positive
y-axis, then the direction that your thumb points is the positive direction of the z-axis,
as shown in Figure 1.2(a). The three axes determine three coordinate planes called
the xy-plane, the xz-plane, and the yz-plane, as shown in Figure 1.2(b). Therefore, the
space is divided into eight octants. We label them the first octant, the second octant,
the third octant, the fourth octant, the fifth octant, the sixth octant, the seventh octant,
and the eighth octant in a way that is shown in Figure 1.2(c).

(a) (b) (c) (d)

Figure 1.2: Three-dimensional coordinate system, axes, coordinate planes, and octants.

To locate a point P in space, we project the point onto the three coordinate planes.
If the directed distance from the yz-plane to the point P is a, the directed distance
from the xz-plane to the point P is b, and the directed distance from the xy-plane to
the point P is c, then we say that the point P has x-coordinate a, y-coordinate b, and
z-coordinate c, and we use the ordered triple (a, b, c) to represent these coordinates.
This can be seen by drawing a rectangular box where O and P are two end points of the
main diagonal, as shown in Figure 1.3(a). This coordinate system is called the three-
dimensional Cartesian coordinate system. For example, to locate the point with coor-
dinates (1, 2, −1), we start from the origin and go along the x-axis for 1 unit; then turn
left and go parallel to the y-axis for 2 units; then go downward for 1 unit arriving at
(1, 2, −1), which is in the fifth octant as shown in Figure 1.3(b).

(a) (b) (c)

Figure 1.3: Three-dimensional coordinate system, coordinates, points, distance between two points.

Note that there is a one-to-one correspondence between points in the space and the
set of all ordered triples (a, b, c). Sometimes, we use $\mathbb{R}^3$ to denote the Cartesian product
$\mathbb{R} \times \mathbb{R} \times \mathbb{R} = \{(x, y, z) \mid x, y, z \in \mathbb{R}\}$.

Distance between two points in space


In a two-dimensional plane, by using the Pythagorean theorem, we have the following
formula for the distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ in the plane:

\[ \text{distance between two points } d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \]

In three-dimensional space, for any two points $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$, we have a
rectangular box with P and Q as the two endpoints of a main diagonal, as shown in
Figure 1.3(c). Then we apply the Pythagorean theorem twice to get

\[ \text{distance between } P \text{ and } Q = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}. \tag{1.1} \]
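
As a quick numerical check of formula (1.1), the following Python sketch computes the distance between two points and also mirrors the two uses of the Pythagorean theorem described above (first across the base of the box, then along its main diagonal). The function name `distance` and the sample points are illustrative choices, not taken from the text.

```python
import math


def distance(p, q):
    """Distance between points p and q, given as coordinate tuples, by formula (1.1)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Sample points (an assumption for illustration only).
p, q = (1, 2, -1), (4, 6, 11)

# First application of the Pythagorean theorem: diagonal of the base rectangle.
base = math.hypot(p[0] - q[0], p[1] - q[1])
# Second application: the base diagonal and the z-offset give the main diagonal.
full = math.hypot(base, p[2] - q[2])

# The two values agree up to floating-point rounding (13 for these points).
print(distance(p, q), full)
```

The same function works unchanged for points in the plane, since `zip` simply pairs up however many coordinates are supplied.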


1.1.4 Representing vectors using coordinates

It is extremely useful to represent vectors using coordinates. First, we have three
standard basis vectors called $\vec{i}$, $\vec{j}$, and $\vec{k}$, which are three unit vectors in the positive
directions of the x-, y-, and z-axes, respectively. If those vectors have their tails at the origin
O, then their heads will be the points (1, 0, 0), (0, 1, 0), (0, 0, 1), respectively, as shown
in Figure 1.4(a).

(a) (b)

Figure 1.4: Three-dimensional coordinate system, basis vectors, position vectors.

Definition 1.1.3. A vector $\overrightarrow{OP}$ with initial point O, the origin, and terminal point P(x, y, z) is called the
position vector of the point P(x, y, z).

By the definition of vector addition, we must have $\overrightarrow{OP} = x\vec{i} + y\vec{j} + z\vec{k}$. This follows from
the box determined by the vector $\overrightarrow{OP}$ (see Figure 1.4(b)), because the parallelogram
rule for addition gives

\[
\begin{aligned}
\overrightarrow{OP} &= \overrightarrow{OQ} + \overrightarrow{QP} \\
&= \overrightarrow{OT} + \overrightarrow{TQ} + \overrightarrow{QP},
\end{aligned}
\]

where $\overrightarrow{OT}$ is along the x-axis with length x and is $x\vec{i}$, $\overrightarrow{TQ}$ is parallel to the y-axis with
length y and is $y\vec{j}$, and $\overrightarrow{QP}$ is parallel to the z-axis with length z and is $z\vec{k}$. The numbers
x, y, and z are referred to as the components of the vector $\overrightarrow{OP}$.
If we add two vectors expressed in the $\vec{i}$, $\vec{j}$, $\vec{k}$ format, then the commutative and
associative laws of vector addition show that adding two vectors can be done by adding
their components, i.e.,

\[ (x_1\vec{i} + y_1\vec{j} + z_1\vec{k}) + (x_2\vec{i} + y_2\vec{j} + z_2\vec{k}) = (x_1 + x_2)\vec{i} + (y_1 + y_2)\vec{j} + (z_1 + z_2)\vec{k}. \tag{1.2} \]

By the distributive law one can see that multiplying a vector by a scalar λ is the same
as multiplying each component by λ, i.e.,

\[ \lambda(x\vec{i} + y\vec{j} + z\vec{k}) = \lambda x\vec{i} + \lambda y\vec{j} + \lambda z\vec{k}. \tag{1.3} \]

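Example 1.1.1. If $\vec{a} = 5\vec{i} + 2\vec{j} - 3\vec{k}$ and $\vec{b} = 4\vec{i} - 9\vec{k}$, express the vector $2\vec{a} + 3\vec{b}$ in terms of $\vec{i}$, $\vec{j}$, and $\vec{k}$.

Solution. Using properties of vectors, we have

\[
\begin{aligned}
2\vec{a} + 3\vec{b} &= 2(5\vec{i} + 2\vec{j} - 3\vec{k}) + 3(4\vec{i} - 9\vec{k}) \\
&= 10\vec{i} + 4\vec{j} - 6\vec{k} + 12\vec{i} - 27\vec{k} \\
&= 22\vec{i} + 4\vec{j} - 33\vec{k}.
\end{aligned}
\]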
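
The componentwise rules (1.2) and (1.3) translate directly into code. The sketch below, in Python, stores a vector as a tuple of its three components and re-computes the result of Example 1.1.1; the helper names `add` and `scale` are hypothetical and chosen only for this illustration.

```python
def add(a, b):
    """Componentwise sum of two vectors given as tuples, as in (1.2)."""
    return tuple(x + y for x, y in zip(a, b))


def scale(lam, a):
    """Scalar multiple lam * a, computed component by component, as in (1.3)."""
    return tuple(lam * x for x in a)


a = (5, 2, -3)   # 5i + 2j - 3k
b = (4, 0, -9)   # 4i - 9k (the j-component is 0)

result = add(scale(2, a), scale(3, b))
print(result)  # (22, 4, -33), i.e. 22i + 4j - 33k
```

Writing the missing j-component of the second vector explicitly as 0 is what makes the componentwise addition line up.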

Now we use the notation ⟨x, y, z⟩ to denote a position vector with its head at the
point (x, y, z), and this is the coordinate representation of this position vector. Since
any vector in space can be translated so that its initial point is the origin, any vector
in space can be represented in the form ⟨x, y, z⟩. We now give definitions for vector
operations using their coordinate representations as follows.

Definition 1.1.4. If $\vec{a} = \langle x_1, y_1, z_1 \rangle$ and $\vec{b} = \langle x_2, y_2, z_2 \rangle$ are two position vectors and λ is a real number,
then

\[
\begin{aligned}
\vec{a} + \vec{b} &= \langle x_1 + x_2, y_1 + y_2, z_1 + z_2 \rangle, \\
\vec{a} - \vec{b} &= \langle x_1 - x_2, y_1 - y_2, z_1 - z_2 \rangle, \\
\lambda\vec{a} &= \langle \lambda x_1, \lambda y_1, \lambda z_1 \rangle.
\end{aligned}
\]

Note that those operations also work for two-dimensional vectors; the only difference
is that there is no z-component (or the z-component is always 0). Also, from the
definition, we know that

\[ \vec{a} = \vec{b} \iff x_1 = x_2,\ y_1 = y_2,\ \text{and } z_1 = z_2, \tag{1.4} \]

that is, their corresponding components are identical.

Example 1.1.2. Consider any vector $\overrightarrow{PQ}$, where the initial point is $P(x_1, y_1, z_1)$ and the terminal point is
$Q(x_2, y_2, z_2)$. Find the coordinates of the midpoint of the line segment PQ.

Solution. Since

\[ \overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle, \]

if M(x, y, z) is the midpoint of the line segment PQ, then $2\overrightarrow{PM} = \overrightarrow{PQ}$, so we have

\[ \langle 2(x - x_1), 2(y - y_1), 2(z - z_1) \rangle = \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle. \]

This means

\[
\begin{aligned}
2\langle x, y, z \rangle &= \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle + 2\langle x_1, y_1, z_1 \rangle \\
&= \langle x_2 + x_1, y_2 + y_1, z_2 + z_1 \rangle.
\end{aligned}
\]

Hence, we can deduce that the formula for the midpoint M is

\[ M\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}, \frac{z_1 + z_2}{2}\right). \tag{1.5} \]
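
The midpoint formula (1.5) and the relation $2\overrightarrow{PM} = \overrightarrow{PQ}$ used to derive it can be checked with a short Python sketch. It assumes integer coordinates so that `fractions.Fraction` keeps the arithmetic exact; the function name `midpoint` and the sample points are illustrative only.

```python
from fractions import Fraction


def midpoint(p, q):
    """Midpoint of the segment PQ by formula (1.5); expects integer coordinates."""
    return tuple(Fraction(x1 + x2, 2) for x1, x2 in zip(p, q))


# Sample points (an assumption for illustration only).
p, q = (1, 2, -1), (3, -4, 6)
m = midpoint(p, q)

# Check the relation used in the derivation: 2 * PM == PQ.
pm = tuple(mi - pi for mi, pi in zip(m, p))
pq = tuple(qi - pi for qi, pi in zip(q, p))
assert tuple(2 * c for c in pm) == pq

print(m)
```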

1.1.5 Lengths, direction angles

Length and distance formula


The length of a vector is the length of the line segment whose endpoints are the head
and tail of the vector. By using the Pythagorean theorem, we have the following
theorem.

Theorem 1.1.2. If a vector is represented by $\vec{a} = \langle x, y, z \rangle$, then

\[ |\vec{a}| = \sqrt{x^2 + y^2 + z^2}. \]

If $|\vec{a}| = 1$, then $\vec{a}$ is a unit vector. If $\vec{a}$ is not the zero vector, then $\frac{1}{|\vec{a}|}\vec{a}$ is the unit vector in the
direction of $\vec{a}$.

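Example 1.1.3. Find the unit vector in the direction of

(1) $\vec{a} = \langle 1, 2, -1 \rangle$ and (2) $\mathbf{b} = 4\mathbf{i} - \mathbf{j} - 8\mathbf{k}$.

Solution.
1. The length of $\vec{a}$ is $|\vec{a}| = \sqrt{1^2 + 2^2 + (-1)^2} = \sqrt{6}$. So the unit vector $\vec{e}$ in the direction
of $\vec{a}$ is

\[ \vec{e} = \frac{1}{|\vec{a}|}\vec{a} = \frac{1}{\sqrt{6}}\langle 1, 2, -1 \rangle = \left\langle \frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}} \right\rangle. \]

2. The given vector has length

\[ |4\mathbf{i} - \mathbf{j} - 8\mathbf{k}| = \sqrt{4^2 + (-1)^2 + (-8)^2} = \sqrt{81} = 9. \]

So the unit vector with the same direction is

\[ \frac{1}{9}(4\mathbf{i} - \mathbf{j} - 8\mathbf{k}) = \frac{4}{9}\mathbf{i} - \frac{1}{9}\mathbf{j} - \frac{8}{9}\mathbf{k}. \]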
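
The two computations in Example 1.1.3 follow the same recipe: find the length with Theorem 1.1.2, then divide each component by it. A minimal Python sketch of that recipe is given below; the names `length` and `unit` are hypothetical, and the results are floating-point approximations rather than exact fractions.

```python
import math


def length(a):
    """Length of a vector given as a tuple of components (Theorem 1.1.2)."""
    return math.sqrt(sum(x * x for x in a))


def unit(a):
    """Unit vector in the direction of a; undefined for the zero vector."""
    n = length(a)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / n for x in a)


e = unit((1, 2, -1))     # approximately <1/sqrt(6), 2/sqrt(6), -1/sqrt(6)>
u = unit((4, -1, -8))    # approximately <4/9, -1/9, -8/9>

# Both lengths equal 1 up to floating-point rounding.
print(length(e), length(u))
```

The explicit check for length 0 reflects the restriction in the theorem that the vector must not be the zero vector.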

We have seen the distance formula before. Now we can derive it from the length
of a vector as well. The distance between the two points $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$ is,
therefore, the length of the vector $\overrightarrow{PQ}$, so it is

\[ |\overrightarrow{PQ}| = |\overrightarrow{OQ} - \overrightarrow{OP}| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}. \tag{1.6} \]

Example 1.1.4. Find a point P on the y-axis such that |PA| = |PB|, where A(−4, 1, 7) and B(3, 5, 2) are
two points.

Solution. We assume the point P has the coordinates (0, y, 0). From the distance
formula, we have

\[ \sqrt{(-4 - 0)^2 + (1 - y)^2 + (7 - 0)^2} = \sqrt{(3 - 0)^2 + (5 - y)^2 + (2 - 0)^2}. \]

Solving for y, we obtain $y = -\frac{7}{2}$. Therefore, the point P has coordinates $(0, -\frac{7}{2}, 0)$.
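
The step "solving for y" in Example 1.1.4 can be filled in as follows. This is a routine expansion, obtained by squaring both sides of the distance equation and collecting terms; it is a sketch of the omitted algebra rather than a quotation from the text.

```latex
\begin{aligned}
16 + (1 - y)^2 + 49 &= 9 + (5 - y)^2 + 4 \\
y^2 - 2y + 66 &= y^2 - 10y + 38 \\
8y &= -28 \\
y &= -\tfrac{7}{2}
\end{aligned}
```

As a check, substituting $y = -\frac{7}{2}$ gives $|PA|^2 = 65 + \frac{81}{4} = \frac{341}{4}$ and $|PB|^2 = 13 + \frac{289}{4} = \frac{341}{4}$, so the two distances agree.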

Direction angles and direction cosines

Let $\vec{a} = \overrightarrow{OA}$ and $\vec{b} = \overrightarrow{OB}$ be two vectors in a plane or space as in Figure 1.5(a) and (b).

(a) (b) (c) (d)

Figure 1.5: Angle between two vectors, perpendicular vectors, and direction angles.

Definition 1.1.5 (Angle between two vectors, direction angle, and direction cosines). If $\vec{a}$ and $\vec{b}$ are
two vectors with a common tail, then:
1. The angle between the vectors $\vec{a}$ and $\vec{b}$ is the angle θ between 0 and π formed using the two
vectors as sides.
2. The two vectors $\vec{a}$ and $\vec{b}$ are called perpendicular (orthogonal) if and only if the angle between
them is $\frac{\pi}{2}$.
3. The angle between a vector $\vec{a}$ and the x-axis is the angle between $\vec{a}$ and the unit basis vector $\vec{i}$.
4. The angle between a vector $\vec{a}$ and the y-axis is the angle between $\vec{a}$ and the unit basis vector $\vec{j}$.
5. The angle between a vector $\vec{a}$ and the z-axis is the angle between $\vec{a}$ and the unit basis vector $\vec{k}$.
6. The direction angles α, β, and γ of a vector $\vec{a}$ are the angles between $\vec{a}$ and the x-, y-, and z-axes,
respectively; cos α, cos β, and cos γ are called direction cosines of $\vec{a}$.

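From Figure 1.5(d), if the vector $\vec{a} = \langle x, y, z \rangle$ has direction angles α, β, γ, then we have

\[ \cos\alpha = \frac{x}{|\vec{a}|}, \quad \cos\beta = \frac{y}{|\vec{a}|}, \quad \text{and} \quad \cos\gamma = \frac{z}{|\vec{a}|}. \tag{1.7} \]

Since

\[
\begin{aligned}
\cos^2\alpha + \cos^2\beta + \cos^2\gamma &= \frac{x^2}{|\vec{a}|^2} + \frac{y^2}{|\vec{a}|^2} + \frac{z^2}{|\vec{a}|^2} \\
&= \frac{x^2 + y^2 + z^2}{|\vec{a}|^2} \\
&= 1,
\end{aligned}
\]

it follows that

\[ \langle \cos\alpha, \cos\beta, \cos\gamma \rangle = \frac{\langle x, y, z \rangle}{|\vec{a}|} \tag{1.8} \]

is the unit vector in the direction of $\vec{a}$.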

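Example 1.1.5. If $A(2, 2, \sqrt{2})$ and $B(1, 3, 0)$ are two points, find the length, direction cosines, and
direction angles of the vector $\overrightarrow{AB}$.

Solution. Because $\overrightarrow{AB} = \langle 1 - 2, 3 - 2, 0 - \sqrt{2} \rangle = \langle -1, 1, -\sqrt{2} \rangle$, we have

\[ |\overrightarrow{AB}| = \sqrt{(-1)^2 + 1^2 + (-\sqrt{2})^2} = 2. \]

The unit vector in the direction of $\overrightarrow{AB}$ is

\[ \frac{1}{2}\langle -1, 1, -\sqrt{2} \rangle = \left\langle -\frac{1}{2}, \frac{1}{2}, -\frac{\sqrt{2}}{2} \right\rangle. \]

Hence,

\[ \cos\alpha = -\frac{1}{2}, \quad \cos\beta = \frac{1}{2}, \quad \text{and} \quad \cos\gamma = -\frac{\sqrt{2}}{2} \]

are the three direction cosines, and

\[ \alpha = \frac{2\pi}{3}, \quad \beta = \frac{\pi}{3}, \quad \text{and} \quad \gamma = \frac{3\pi}{4} \]

are the three direction angles with the positive x-, y- and z-axes, respectively.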
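
The numbers in Example 1.1.5 can be reproduced numerically from formula (1.7). The following Python sketch is one way to do so; the function name `direction_cosines` is hypothetical, and the angles come out as floating-point values in radians rather than exact multiples of π.

```python
import math


def direction_cosines(a):
    """Direction cosines of a nonzero vector a, by formula (1.7)."""
    n = math.sqrt(sum(x * x for x in a))
    if n == 0:
        raise ValueError("the zero vector has no direction angles")
    return tuple(x / n for x in a)


A = (2.0, 2.0, math.sqrt(2.0))
B = (1.0, 3.0, 0.0)
AB = tuple(q - p for p, q in zip(A, B))   # <-1, 1, -sqrt(2)>

cosines = direction_cosines(AB)
# Clamp to [-1, 1] before acos to guard against rounding just outside the domain.
angles = tuple(math.acos(max(-1.0, min(1.0, c))) for c in cosines)

print(cosines)                        # close to (-1/2, 1/2, -sqrt(2)/2)
print(angles)                         # close to (2*pi/3, pi/3, 3*pi/4)
print(sum(c * c for c in cosines))    # 1 up to rounding
```

Because `math.acos` returns values between 0 and π, it matches the convention in Definition 1.1.5 that the angle between two vectors lies in that range.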

1.2 Dot product, cross product, and triple product


1.2.1 The dot product

So far we have introduced the two operations on vectors: addition and multiplication
by a scalar. Now the following questions arise: How about multiplication? Can
we multiply two vectors to obtain a useful quantity? In fact, there are two commonly
used useful products of vectors called the dot product and the cross product.

Figure 1.6: Work done by a force, dot product.

As shown in Figure 1.6, you may already know from physics that the work done,
W, by a force F applied during a displacement along the vector s is

W = |F||s| cos θ,

where θ is the angle between the two vectors F and s. It is, therefore, useful to define
a product of two vectors in this way.

Definition 1.2.1 (Dot/scalar/inner product). The dot product a ⋅ b of the two vectors a and b is defined
by

a ⋅ b = |a||b| cos θ,

where θ is the angle between vectors a and b.

Example 1.2.1. If the two vectors a and b have length 3 and 4, and the angle between them is π/3, find
a ⋅ b.

Solution. Using the definition, we have

\[ \mathbf a \cdot \mathbf b = |\mathbf a||\mathbf b| \cos(\pi/3) = 3 \cdot 4 \cdot \frac{1}{2} = 6. \]

Well, this definition looks good as it has a physical basis. However, mathemati-
cally, it is not easy to find the dot product directly as we first need to know the angle
between the vectors. Using the coordinate representation of a vector, it turns out that
there is a remarkable way to compute the dot product, as we will see in the following
theorem.

Theorem 1.2.1. If a = ⟨a1 , a2 , a3 ⟩ and b = ⟨b1 , b2 , b3 ⟩, then

a ⋅ b = a1 b1 + a2 b2 + a3 b3 .

Proof. Suppose the angle between a and b is θ. Note that the three vectors, a, b, and
c = b − a form the three sides of a triangle. By the cosine law, we have

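\[ |\mathbf c|^2 = |\mathbf a|^2 + |\mathbf b|^2 - 2|\mathbf a||\mathbf b| \cos\theta. \]

Since

\[ \begin{aligned}
|\mathbf c|^2 &= |\mathbf b - \mathbf a|^2 = (b_1 - a_1)^2 + (b_2 - a_2)^2 + (b_3 - a_3)^2 \\
&= b_1^2 - 2b_1a_1 + a_1^2 + b_2^2 - 2b_2a_2 + a_2^2 + b_3^2 - 2b_3a_3 + a_3^2, \\
|\mathbf a|^2 &= a_1^2 + a_2^2 + a_3^2, \quad\text{and} \\
|\mathbf b|^2 &= b_1^2 + b_2^2 + b_3^2,
\end{aligned} \]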

substituting these values into the cosine law equation and canceling out all the
squares gives

−2b1 a1 − 2b2 a2 − 2b3 a3 = −2|a||b| cos θ.

Therefore, we have

a ⋅ b = a1 b1 + a2 b2 + a3 b3 .

In view of this theorem, we give the following alternative definition of the dot
product.

Definition 1.2.2 (Alternative definition of the dot product). If a = ⟨a1 , a2 , a3 ⟩, b = ⟨b1 , b2 , b3 ⟩, and θ is
the angle between the two vectors, then the dot product is defined by

a ⋅ b = |a||b| cos θ = a1 b1 + a2 b2 + a3 b3 .

Finding the dot product of a and b is incredibly easy by using coordinates. We just
multiply corresponding components and add. Using this definition, we can deduce
the following properties of the dot product.

Theorem 1.2.2 (Properties of the dot product). If a, b, and c are any three vectors and λ is any scalar,
then the dot product satisfies:
1. a ⋅ a = |a|²,
2. a ⋅ b = b ⋅ a,
3. if a and b are two nonzero vectors, then a ⋅ b = 0 means that a and b are perpendicular to each
other,
4. (a + b) ⋅ c = a ⋅ c + b ⋅ c,
5. (λa) ⋅ b = λ(a ⋅ b) = a ⋅ (λb),
6. 0 ⋅ a = 0.

These properties are similar to the rules for real numbers and can be easily proved
by using either of the two definitions of the dot product. However, some properties of
real number multiplication do not apply to the dot product. For example, if two real
numbers satisfy ab = 0, then either a = 0 or b = 0 or both. This is not true for the dot
product. If a and b are two nonzero vectors, then a ⋅ b = 0 indicates the two vectors
are perpendicular to each other, and it is not necessary that either a = 0 or b = 0.
By using the dot product, we can find the angle between two vectors, as shown in
the following example.

Example 1.2.2. Find the angle between the two vectors i + 2j − k and 2j − k.

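Solution. By the definition of the dot product, u ⋅ v = |u||v| cos θ, so cos θ = (u ⋅ v)/(|u||v|). Thus,
\[ \cos\theta = \frac{(\mathbf i + 2\mathbf j - \mathbf k) \cdot (2\mathbf j - \mathbf k)}{|\mathbf i + 2\mathbf j - \mathbf k||2\mathbf j - \mathbf k|} = \frac{1 \cdot 0 + 2 \cdot 2 + (-1) \cdot (-1)}{\sqrt{1^2 + 2^2 + (-1)^2}\,\sqrt{2^2 + (-1)^2}} = \frac{5}{\sqrt{30}} \approx 0.913. \]

So the angle θ ≈ cos⁻¹(0.913) ≈ 0.42 radians (about 24°).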
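
As a rough numerical check of Example 1.2.2, the component formula of Theorem 1.2.1 can be combined with the geometric definition to recover the angle. This is only a sketch; the helper names `dot`, `norm`, and `angle_between` are hypothetical, and the clamp to [−1, 1] is an added safeguard against rounding, not something the text discusses.

```python
import math

def dot(a, b):
    """Dot product by components: a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a, using a . a = |a|^2."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians between two nonzero vectors."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp guards against rounding

u = (1.0, 2.0, -1.0)   # i + 2j - k
v = (0.0, 2.0, -1.0)   # 2j - k
theta = angle_between(u, v)

print(theta, math.degrees(theta))
```

The two printed values should be close to 0.42 radians and 24°, matching the example.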

1.2.2 Projections

Suppose that $\mathbf a = \overrightarrow{OA}$ and $\mathbf b = \overrightarrow{OB}$ are two vectors with the same tail O. If S is the foot of the perpendicular from B to the line containing $\overrightarrow{OA}$, then the vector $\overrightarrow{OS}$ is called the vector projection of the vector b onto the vector a, written as $\operatorname{Proj}_{\mathbf a}\mathbf b$. If $\mathbf e = \mathbf a/|\mathbf a|$ is the unit vector in the direction of $\overrightarrow{OA}$, then the vector projection is λe, where λ = |b| cos θ is the size (positive or negative) of the projection vector and θ is the angle between the two vectors, as shown in Figure 1.7. Hence, the projection of vector b onto vector a is
\[ \operatorname{Proj}_{\mathbf a}\mathbf b = \frac{|\mathbf b| \cos\theta}{|\mathbf a|}\,\mathbf a. \]
The scalar projection of vector b onto vector a is defined as

\[ \operatorname{ProjScal}_{\mathbf a}\mathbf b = |\mathbf b| \cos\theta. \]

Figure 1.7: Vector projections.



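By using the dot product a ⋅ b = |a||b| cos θ, we have

\[ \operatorname{Proj}_{\mathbf a}\mathbf b = \frac{\mathbf a \cdot \mathbf b}{|\mathbf a|^2}\,\mathbf a \quad\text{and}\quad \operatorname{ProjScal}_{\mathbf a}\mathbf b = \frac{\mathbf a \cdot \mathbf b}{|\mathbf a|}. \]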

Example 1.2.3. Show that any vector r = ⟨x, y, z⟩ can be written as

r = Proji r + Projj r + Projk r.

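Solution. For $\operatorname{Proj}_{\mathbf i}\mathbf r$, since |i| = 1, we have

\[ \operatorname{Proj}_{\mathbf i}\mathbf r = \frac{\mathbf r \cdot \mathbf i}{|\mathbf i|^2}\,\mathbf i = (\mathbf r \cdot \mathbf i)\,\mathbf i = (\langle x, y, z\rangle \cdot \langle 1, 0, 0\rangle)\,\mathbf i = x\mathbf i. \]
Similarly, $\operatorname{Proj}_{\mathbf j}\mathbf r = y\mathbf j$ and $\operatorname{Proj}_{\mathbf k}\mathbf r = z\mathbf k$. Therefore,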

r = Proji r + Projj r + Projk r.
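
The projection formulas translate directly into code. The sketch below assumes hypothetical helper names (`vector_projection`, `scalar_projection`) and an arbitrary sample vector r chosen only for illustration; it checks the decomposition of Example 1.2.3 for that one vector.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vector_projection(a, b):
    """Proj_a b = (a . b / |a|^2) a, for a nonzero vector a."""
    scale = dot(a, b) / dot(a, a)
    return tuple(scale * x for x in a)

def scalar_projection(a, b):
    """ProjScal_a b = (a . b) / |a|, for a nonzero vector a."""
    return dot(a, b) / math.sqrt(dot(a, a))

# Sample vector (an arbitrary illustrative choice, not from the text).
r = (3.0, -2.0, 5.0)
i, j, k = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

parts = [vector_projection(e, r) for e in (i, j, k)]
total = tuple(sum(c) for c in zip(*parts))  # Proj_i r + Proj_j r + Proj_k r

print(parts)
print(total)
```

For this sample, the three projections are the vectors 3i, −2j, and 5k, and their sum reproduces r.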

1.2.3 The cross product

In mechanics, the moment of a force $\vec F$ acting on a rod $\overrightarrow{OP}$ is the vector with magnitude $|\vec F||\overrightarrow{OP}| \sin\theta$, where θ is the angle between the vectors $\vec F$ and $\overrightarrow{OP}$. The direction of the moment vector is perpendicular to both $\vec F$ and $\overrightarrow{OP}$ (see Figure 1.8(a)) and satisfies the right-hand rule: if you curl the fingers of your right hand naturally from vector $\vec F$ to vector $\overrightarrow{OP}$, then your thumb points in the direction of the moment vector, as shown in Figure 1.8(b) and (c). Therefore, it makes sense to define a product of two vectors $\vec a$ and $\vec b$ as follows.

(a) (b) (c)

Figure 1.8: Cross product, moment/torque.

Definition 1.2.3 (Cross/vector/outer product). The cross product of vector a and vector b in ℝ³, denoted by a × b, is a new vector that is perpendicular to both a and b. The length of a × b is

|a × b| = |a||b| sin θ

and a, b, a × b, in that order, satisfy the right-hand rule.



According to the above definition and using Figure 1.4(a), we can see that

i × i = 0, i × j = k, i × k = −j, j × j = 0, j × i = −k,
j × k = i, k × i = j, k × j = −i, and k × k = 0.

But in general, how can we compute the cross product? If we try to compute

a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)

by using the distributive rule together with the products of i, j, and k listed above (keeping the order of the factors, since j × i = −(i × j)), we find an interesting vector

c = ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩.

This vector, in fact, satisfies conditions that we have set for a cross product, as we will
see in the following theorem.

Theorem 1.2.3. If a = ⟨a1 , a2 , a3 ⟩, b = ⟨b1 , b2 , b3 ⟩, and

c = ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩,

then:
1. c is perpendicular to both a and b.
2. |c| = |a||b| sin θ, where θ is the angle between a and b.

Proof. We compute the dot product to show they are perpendicular. We have

a ⋅ c = ⟨a1 , a2 , a3 ⟩ ⋅ ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩
= a1 a2 b3 − a1 a3 b2 + a2 a3 b1 − a2 b3 a1 + a3 a1 b2 − a3 a2 b1
= 0.

Similarly b ⋅ c = 0. Therefore, we claim that c is perpendicular to both a and b.


Furthermore,

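\[ \begin{aligned}
|\mathbf c|^2 &= (a_2b_3 - a_3b_2)^2 + (a_3b_1 - b_3a_1)^2 + (a_1b_2 - a_2b_1)^2 \\
&= a_2^2b_3^2 + a_3^2b_2^2 - 2a_2b_3a_3b_2 + a_3^2b_1^2 + b_3^2a_1^2 - 2a_3b_1b_3a_1 + a_1^2b_2^2 + a_2^2b_1^2 - 2a_1b_2a_2b_1 \\
&= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \\
&= |\mathbf a|^2|\mathbf b|^2 - (\mathbf a \cdot \mathbf b)^2 \\
&= |\mathbf a|^2|\mathbf b|^2(1 - \cos^2\theta) = |\mathbf a|^2|\mathbf b|^2 \sin^2\theta.
\end{aligned} \]

So |c| = |a||b| sin θ.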



Now the only issue that remains is whether a, b, and c, in that order, satisfy the right-hand rule. This can be seen in a simple case where a and b are in the first quadrant of the xy-plane with tails at the origin. Then the sign of the term $\frac{a_2}{a_1} - \frac{b_2}{b_1}$ determines the relative positions of a and b, and the sign of the z-component of c, $a_1b_2 - a_2b_1$, determines whether c points upward or downward. This is exactly the right-hand rule: when you curl the fingers of your right hand from a to b, your thumb points in the direction of c.
In light of the above discussion, we now give an alternative definition of the cross
product.

Definition 1.2.4 (Alternative definition of the cross product). Let a = ⟨a1 , a2 , a3 ⟩ and b = ⟨b1 , b2 , b3 ⟩.
Then the cross product (also vector product) a × b is defined by
a × b = ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩.

Using the knowledge of determinants, we have

\[ \mathbf a \times \mathbf b = \langle a_2b_3 - a_3b_2,\ a_3b_1 - b_3a_1,\ a_1b_2 - a_2b_1 \rangle
= \left\langle \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix},\ -\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix},\ \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \right\rangle
= \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}, \tag{1.9} \]

where $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$. This form makes the cross product much easier to remember.
Using the definition of the vector product, we have the following theorem.

Theorem 1.2.4 (Properties of the cross product for three-dimensional vectors). For any three vectors
a, b, and c in ℝ³ and a scalar λ, we have:
1. a × a = 0,
2. if a and b are nonzero vectors, then a × b = 0 if and only if a ‖ b,
3. b × a = −(a × b),
4. a × (b + c) = a × b + a × c,
5. (a + b) × c = a × c + b × c,
6. (λa) × b = λ(a × b) = a × (λb),
7. a ⋅ (b × c) = (a × b) ⋅ c,
8. a × (b × c) = (a ⋅ c)b − (a ⋅ b)c.

Using one of the definitions of the cross product, we can prove these properties by
writing the vectors in their components form. Note that the cross product fails to obey
most of the laws satisfied by real number multiplication, such as the commutative and
associative laws. Check for yourself that a × (b × c) ≠ (a × b) × c for most vectors a, b,
and c.

Example 1.2.4. Find a vector that is perpendicular to the plane containing the three points P(1, 0, 6),
Q(2, 5, −1), and R(−1, 3, 7).

Solution. The cross product of the two vectors $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ is such a vector. This is because the cross product is perpendicular to both $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ and is, thus, perpendicular to the plane through the three points P, Q, and R. Since
\[ \overrightarrow{PQ} = (2 - 1)\vec i + (5 - 0)\vec j + (-1 - 6)\vec k = \vec i + 5\vec j - 7\vec k, \]
\[ \overrightarrow{PR} = (-1 - 1)\vec i + (3 - 0)\vec j + (7 - 6)\vec k = -2\vec i + 3\vec j + \vec k, \]

we evaluate the cross product of these two vectors using the determinant approach, i.e.,
\[ \overrightarrow{PQ} \times \overrightarrow{PR} = \begin{vmatrix} \vec i & \vec j & \vec k \\ 1 & 5 & -7 \\ -2 & 3 & 1 \end{vmatrix} = (5 + 21)\vec i - (1 - 14)\vec j + (3 + 10)\vec k = 26\vec i + 13\vec j + 13\vec k. \]

So the vector ⟨26, 13, 13⟩ is perpendicular to the plane passing through the three points
P, Q, and R. In fact, any nonzero scalar multiple of this vector, such as ⟨2, 1, 1⟩, is also
perpendicular to the plane. Figure 1.9 illustrates the vector perpendicular to the plane.

Figure 1.9: Cross product, Example 1.2.4.
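
The component formula of Definition 1.2.4 is easy to evaluate mechanically. The sketch below recomputes the normal vector of Example 1.2.4 and checks perpendicularity with the dot product; the helper names `sub`, `dot`, and `cross` are illustrative assumptions.

```python
def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """a x b = <a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1>."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

P, Q, R = (1, 0, 6), (2, 5, -1), (-1, 3, 7)
PQ = sub(Q, P)
PR = sub(R, P)
n = cross(PQ, PR)

print(n)                        # normal vector to the plane through P, Q, R
print(dot(n, PQ), dot(n, PR))   # both should be 0
```

Swapping the arguments, `cross(PR, PQ)`, returns the negative of `n`, which illustrates property 3 of Theorem 1.2.4.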

Note that |a × b| = |a||b| sin θ, the length of the vector a × b, is equal to the area of the parallelogram determined by a and b, assuming they have the same initial point, as shown in Figure 1.8(d). Therefore, we have the following theorem.

Theorem 1.2.5. Given two nonzero vectors a and b with a common tail, we have

\[ \text{area of a parallelogram with adjacent sides } \mathbf a \text{ and } \mathbf b = |\mathbf a \times \mathbf b|, \]

\[ \text{area of a triangle with adjacent sides } \mathbf a \text{ and } \mathbf b = \frac{1}{2}|\mathbf a \times \mathbf b|. \]

Example 1.2.5. Find the area of the triangle with vertices P(1, 0, 6), Q(2, 5, −1), and R(−1, 3, 7).

Solution. In the previous example, we already computed that $\overrightarrow{PQ} \times \overrightarrow{PR} = \langle 26, 13, 13\rangle$. The area of the parallelogram with adjacent sides PQ and PR is the magnitude of the cross product, i.e.,
\[ |\overrightarrow{PQ} \times \overrightarrow{PR}| = \sqrt{(26)^2 + (13)^2 + (13)^2} = 13\sqrt{6}. \]
Thus, the area of the triangle PQR is $\frac{13\sqrt{6}}{2}$.
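
Theorem 1.2.5 can be sketched in a few lines of code. The function name `triangle_area` is hypothetical; the vertices are those of Example 1.2.5.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_area(p, q, r):
    """Half the length of PQ x PR, following Theorem 1.2.5."""
    pq = tuple(b - a for a, b in zip(p, q))
    pr = tuple(b - a for a, b in zip(p, r))
    n = cross(pq, pr)
    return 0.5 * math.sqrt(sum(c * c for c in n))

area = triangle_area((1, 0, 6), (2, 5, -1), (-1, 3, 7))
print(area)   # should agree with 13*sqrt(6)/2 up to rounding
```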

1.2.4 Scalar triple product

Suppose three nonplanar vectors a, b, and c, have a common tail. What is the volume
of the parallelepiped determined by these three vectors as shown in Figure 1.10?

Figure 1.10: Triple product, volume of a parallelepiped.

Consider the base parallelogram determined by b and c; its area is A = |b × c|. Let θ be the angle between a and b × c. Since b × c is perpendicular to both b and c, the height h of the parallelepiped is

\[ h = |\mathbf a||\cos\theta| \]

(we use |cos θ| instead of cos θ to ensure that we obtain a nonnegative result when θ > π/2). We conclude that the volume V of the parallelepiped is given as follows:

\[ V = Ah = |\mathbf b \times \mathbf c||\mathbf a||\cos\theta| = |\mathbf a \cdot (\mathbf b \times \mathbf c)|. \]

Thus, we have proved that the volume of the parallelepiped determined by the three vectors a, b, and c with a common tail is given as follows:

\[ V = |\mathbf a \cdot (\mathbf b \times \mathbf c)|. \tag{1.10} \]

A product like a ⋅ (b × c) is called a scalar triple product of the three vectors a, b, and c. Note that we can write this scalar triple product as a 3 × 3 determinant as follows:

\[ \mathbf a \cdot (\mathbf b \times \mathbf c) = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]

If the above scalar triple product is 0, then it means that the volume of the paral-
lelepiped determined by the three vectors a, b, and c is 0. Then, we can conclude
that the three vectors must be coplanar (that is, they lie in the same plane). In terms
of linear algebra, they are linearly dependent.

Example 1.2.6. Use the scalar triple product to determine whether the vectors a = ⟨2, 0, −7⟩, b =
⟨1, −1, −3⟩, and c = ⟨1, 1, −1⟩ are coplanar.

Solution. Since
\[ \mathbf a \cdot (\mathbf b \times \mathbf c) = \begin{vmatrix} 2 & 0 & -7 \\ 1 & -1 & -3 \\ 1 & 1 & -1 \end{vmatrix} = 2 \begin{vmatrix} -1 & -3 \\ 1 & -1 \end{vmatrix} - 0 \begin{vmatrix} 1 & -3 \\ 1 & -1 \end{vmatrix} - 7 \begin{vmatrix} 1 & -1 \\ 1 & 1 \end{vmatrix} = 8 - 0 - 7 \times 2 = -6 \]

is not 0, the vectors a, b, and c are not coplanar.
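
The coplanarity test of Example 1.2.6 can be sketched as follows. The names `scalar_triple` and `are_coplanar` are illustrative, and the exact comparison with 0 assumes integer components; with floating-point data a small tolerance would be needed instead.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scalar_triple(a, b, c):
    """a . (b x c), equal to the 3x3 determinant with rows a, b, c."""
    return dot(a, cross(b, c))

def are_coplanar(a, b, c):
    """Exact test, valid for integer components (an assumption here)."""
    return scalar_triple(a, b, c) == 0

a, b, c = (2, 0, -7), (1, -1, -3), (1, 1, -1)
value = scalar_triple(a, b, c)

print(value)                  # scalar triple product
print(abs(value))             # volume of the parallelepiped, formula (1.10)
print(are_coplanar(a, b, c))
```

The same function can be used to spot-check property 7 of Theorem 1.2.4, since `dot(cross(a, b), c)` should return the same number.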

1.3 Equations of lines and planes


1.3.1 Lines

A line in the two-dimensional xy-plane is determined by a point on the line and the
direction of the line (its slope, or angle of inclination, or a vector parallel to the line).
The equation of the line can be written by using the usual slope-intercept form y =
mx + b.
A line L in ℝ³ is also determined once we know a point P0(x0, y0, z0) on L and the direction of L. However, we do not have the concept of “slope of a line” as we do in ℝ². In three-dimensional space, the direction of a line L can be conveniently described by a vector v = ⟨m, n, p⟩ parallel to L. If P(x, y, z) is an arbitrary point in space, then the vector $\overrightarrow{P_0P}$ is parallel to v exactly when the point P is on the line, as shown in Figure 1.11. So P is on L if and only if, for some real number t, we have
\[ \overrightarrow{P_0P} = t\mathbf v, \quad\text{that is,}\quad \langle x - x_0, y - y_0, z - z_0\rangle = \langle tm, tn, tp\rangle. \]

(a) (b)

Figure 1.11: Lines in space.

Equating the components, we have

x − x0 = tm, y − y0 = tn, and z − z0 = tp,

or

x = x0 + tm, y = y0 + tn and z = z0 + tp, (1.11)

or

\[ \begin{cases} x = x_0 + tm, \\ y = y_0 + tn, \\ z = z_0 + tp. \end{cases} \tag{1.12} \]

Equations (1.11) and (1.12) are called parametric equations of the line passing through the point (x0, y0, z0) with the direction vector v = ⟨m, n, p⟩. If m, n, and p are all nonzero, then eliminating t from equation (1.11) gives

\[ \frac{x - x_0}{m} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \tag{1.13} \]

which are called symmetric equations of the line. If one of m, n, and p is 0, say, m = 0, then we can still use the notation symbolically, i.e.,

\[ \frac{x - x_0}{0} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \]
but this should be interpreted as $x = x_0$ and $\frac{y - y_0}{n} = \frac{z - z_0}{p}$.

Note. In general, if a vector v = ⟨m, n, p⟩ is used to describe the direction of a line L,


then the numbers m, n, and p are called direction numbers of L. Since there are many
vectors parallel to L, any of them could be used to describe the direction of L. We can

also see that any three numbers proportional to m, n, and p are also direction num-
bers for L. The three direction numbers determine the three direction angles; they are
“angles of inclination” with respect to the three coordinate axes. If v = ⟨m, n, p⟩ is a
unit vector, then the three direction numbers are actually its three direction cosines.

Also, equation (1.11) can be written in a vector form,

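\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} + t \begin{pmatrix} m \\ n \\ p \end{pmatrix}, \tag{1.14} \]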

or

r = ⟨x0 , y0 , z0 ⟩ + t⟨m, n, p⟩, (1.15)

or

r = r0 + tv. (1.16)

Equations (1.14)–(1.16) are all called vector equations for the line L passing through the
point (x0 , y0 , z0 ) with direction v.

Example 1.3.1. Find parametric equations, a vector equation, and symmetric equations of the line L which passes through the points A(1, 2, −1) and B(0, 1, 3).

Solution. The vector $\overrightarrow{AB} = \langle 0 - 1, 1 - 2, 3 - (-1)\rangle = \langle -1, -1, 4\rangle$ is a direction vector of the line L. Hence, a vector equation of L is

\[ \mathbf r = \langle 1, 2, -1\rangle + t\langle -1, -1, 4\rangle \]

or

\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} + t \begin{pmatrix} -1 \\ -1 \\ 4 \end{pmatrix}. \]

This gives the parametric equations of line L

\[ x = 1 - t, \quad y = 2 - t, \quad z = -1 + 4t. \]

Symmetric equations of L are obtained by eliminating the parameter t, i.e.,

\[ \frac{x - 1}{-1} = \frac{y - 2}{-1} = \frac{z + 1}{4}. \]

The graph of the line is shown in Figure 1.12(a).



(a) (b)

Figure 1.12: Lines in space, Examples 1.3.1 and 1.3.2.

Example 1.3.2. Show that the lines L1 and L2 with parametric equations

x = 1 + 2t, y = 2 − t, z = −3 + 4t,
x = 2 + s, y = 4 − s, z = 4 + 2s

are skew lines. That is, L1 and L2 do not intersect in a point and are not parallel to each other and,
therefore, do not lie in the same plane.

Solution. The lines are not parallel because the corresponding direction vectors v1 = ⟨2, −1, 4⟩ and v2 = ⟨1, −1, 2⟩ are not parallel: there is no scalar λ such that ⟨1, −1, 2⟩ = λ⟨2, −1, 4⟩. In other words, their components are not proportional. To look for intersection points, we attempt to solve the following system of equations in t and s:

1 + 2t = 2 + s,
2 − t = 4 − s,
−3 + 4t = 4 + 2s.

Solving the first two equations for t and s gives t = 3 and s = 5, but these values do not satisfy the third equation: the left side is −3 + 4(3) = 9, whereas the right side is 4 + 2(5) = 14. Therefore, there are no values of t and s that satisfy all three equations, so the system of equations is inconsistent. Thus, L1 and L2 do not intersect and are skew lines. The graphs of the two lines are shown in Figure 1.12(b).
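
The same conclusion can be reached without solving a system, by using the scalar triple product from Section 1.2.4: two nonparallel lines intersect exactly when the vector joining a point on one to a point on the other is coplanar with the two direction vectors. The sketch below uses this alternative test, not the elimination method of the solution above. The function name `classify_lines` is hypothetical, and the exact comparisons assume integer coordinates.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def classify_lines(p1, v1, p2, v2):
    """Classify the lines r = p1 + t*v1 and r = p2 + s*v2.

    Returns "parallel" (which includes coincident lines),
    "intersecting", or "skew". Assumes integer components.
    """
    n = cross(v1, v2)
    if n == (0, 0, 0):
        return "parallel"
    if dot(sub(p2, p1), n) == 0:
        return "intersecting"
    return "skew"

# Lines L1 and L2 of Example 1.3.2.
result = classify_lines((1, 2, -3), (2, -1, 4), (2, 4, 4), (1, -1, 2))
print(result)
```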

The angle between two lines is the angle between their direction vectors. There-
fore, we can use the dot product to find the angle, as shown in the following example.

Example 1.3.3. Find the acute angle between two lines

\[ L_1: \frac{x - 1}{1} = \frac{y}{-4} = \frac{z + 3}{1} \quad\text{and}\quad L_2: x = 2t,\ y = -2 - 2t,\ z = -t. \]

Solution. A direction vector of L1 is v1 = ⟨1, −4, 1⟩ and of L2 is v2 = ⟨2, −2, −1⟩. If θ is the angle between the two lines, we have

\[ \cos\theta = \frac{\mathbf v_1 \cdot \mathbf v_2}{|\mathbf v_1||\mathbf v_2|} = \frac{1 \cdot 2 + (-4) \cdot (-2) + 1 \cdot (-1)}{\sqrt{1^2 + (-4)^2 + 1^2}\,\sqrt{2^2 + (-2)^2 + (-1)^2}} = \frac{1}{\sqrt{2}}. \]

The desired angle is, therefore, θ = π/4.

Example 1.3.4. Find symmetric equations of the line L that passes through (2, 1, 14) and perpendicularly intersects the line $L_0: \frac{x - 3}{2} = \frac{y}{1} = \frac{z - 1}{1}$.

Solution. Suppose that the line L intersects L0 at the point P(x, y, z). Then the coordi-
nates of P must have the form

x = 3 + 2t, y = t, and z = 1 + t, for some t.

The vector parallel to L with initial point (2, 1, 14) and terminal point P is

⟨3 + 2t − 2, t − 1, 1 + t − 14⟩ = ⟨1 + 2t, t − 1, −13 + t⟩.

Since the two lines intersect perpendicularly, the direction of L0 is also perpen-
dicular to this vector, so

⟨1 + 2t, t − 1, −13 + t⟩ ⋅ ⟨2, 1, 1⟩ = 0.

Solving for t, we have t = 2. Hence, P has coordinates (7, 2, 3) and a vector parallel
to L is ⟨7, 2, 3⟩ − ⟨2, 1, 14⟩ = ⟨5, 1, −11⟩. Therefore, symmetric equations of L are

\[ \frac{x - 7}{5} = \frac{y - 2}{1} = \frac{z - 3}{-11}. \]
Note. This example shows how to find the foot of the perpendicular of a point onto a
given line. This can be used to find the distance from a given point to a given line, as
shown in the following example.

Example 1.3.5. Find the perpendicular distance from the point Q(1, 2, 3) to the straight line with para-
metric equations x = 3 + t, y = 4 − 2t, z = −2 + 2t.

Solution. Let t be the value such that the point N(3 + t, 4 − 2t, −2 + 2t) on the line is the foot of the perpendicular from the point Q to the line. The vector $\overrightarrow{NQ}$ must be perpendicular to the direction of the line, so $\overrightarrow{NQ} \cdot \langle 1, -2, 2\rangle = 0$. This means

\[ \begin{aligned}
\langle 1 - (3 + t),\ 2 - (4 - 2t),\ 3 - (-2 + 2t)\rangle \cdot \langle 1, -2, 2\rangle &= 0, \\
\langle -2 - t,\ -2 + 2t,\ 5 - 2t\rangle \cdot \langle 1, -2, 2\rangle &= 0, \\
-2 - t - 2(-2 + 2t) + 2(5 - 2t) &= 0, \\
t &= \frac{4}{3}.
\end{aligned} \]
So, the foot of the perpendicular from the point Q to the line is $N(3 + t, 4 - 2t, -2 + 2t) = N\left(\frac{13}{3}, \frac{4}{3}, \frac{2}{3}\right)$, and the distance from the point Q to the line is

\[ |\overrightarrow{NQ}| = \sqrt{\left(\frac{13}{3} - 1\right)^2 + \left(\frac{4}{3} - 2\right)^2 + \left(\frac{2}{3} - 3\right)^2} = \sqrt{17}. \]

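Note. The distance can also be obtained by minimizing the function $d(t) = |\overrightarrow{NQ}|$ (or, equivalently, its square) over t. Also, one can show that the distance from a point P to a line r = r0 + vt is
\[ \text{distance from } P \text{ to the line} = \frac{|\overrightarrow{MP} \times \mathbf v|}{|\mathbf v|}, \quad\text{where } M \text{ is any point on the line.} \tag{1.17} \]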
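
Formula (1.17) gives a quick way to cross-check Example 1.3.5. The sketch below is illustrative only; `point_line_distance` is a hypothetical name, and M is taken to be the point on the line at t = 0.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def point_line_distance(p, m, v):
    """Distance from point p to the line r = m + t*v, formula (1.17)."""
    mp = tuple(x - y for x, y in zip(p, m))
    return norm(cross(mp, v)) / norm(v)

Q = (1, 2, 3)
M = (3, 4, -2)    # point on the line x = 3 + t, y = 4 - 2t, z = -2 + 2t at t = 0
v = (1, -2, 2)    # direction vector of the line

d = point_line_distance(Q, M, v)
print(d)          # should agree with sqrt(17) up to rounding
```

Choosing a different point M on the same line (for example, the point at t = 1) should leave the result unchanged, which is a useful sanity check on the formula.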

1.3.2 Planes

A plane is a surface that is determined by a point M0(x0, y0, z0) and a normal vector n. That is, there is a unique plane that passes through the given point M0 and is perpendicular to a given direction n. How do you find an equation for this plane? Assume that M(x, y, z) is a point in space. Then M is in the plane if and only if the vector $\overrightarrow{M_0M}$ is orthogonal to the normal vector n (see Figure 1.13), that is,
\[ \mathbf n \cdot \overrightarrow{M_0M} = 0 \quad\text{or}\quad \mathbf n \cdot (\overrightarrow{OM} - \overrightarrow{OM_0}) = 0. \]

If n = ⟨a, b, c⟩, then expanding the dot product gives

\[ \mathbf n \cdot (\overrightarrow{OM} - \overrightarrow{OM_0}) = \langle a, b, c\rangle \cdot (\langle x, y, z\rangle - \langle x_0, y_0, z_0\rangle). \]

Thus,

a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0. (1.18)

Figure 1.13: Planes in space.



This is called the Cartesian equation/linear equation of the plane through M0 (x0 , y0 , z0 )
with normal vector n = ⟨a, b, c⟩. By collecting terms in the equation, we can write the
equation as

ax + by + cz + d = 0, (1.19)

where d = −(ax0 + by0 + cz0 ). A point (x, y, z) is in the plane if and only if it satisfies
this equation.

Example 1.3.6. The plane x = 0 is the yz-coordinate plane, the plane y = 0 is the xz-coordinate plane,
and the plane z = 0 is the xy-coordinate plane; z = 3 is the plane parallel to the xy-plane with distance
3 units from it.

Example 1.3.7. Find an equation of the plane that passes through the point (2, 2, −1) with normal vec-
tor n⃗ = ⟨1, 2, 3⟩. Also, find the intercepts of the plane with the three coordinate axes and then sketch
the plane.

Solution. Plug a = 1, b = 2, c = 3 and x0 = 2, y0 = 2, z0 = −1 into the equation (1.18).


We get an equation of the plane

1(x − 2) + 2(y − 2) + 3(z + 1) = 0,

or

x + 2y + 3z = 3.

In order to find the x-intercept, we set y = z = 0 in this equation and solve for x to
get x = 3. Similarly, the y-intercept is 3/2 and the z-intercept is 1. The plane is shown
in Figure 1.14(a).

(a) (b)

Figure 1.14: Planes, Examples 1.3.7 and 1.3.8.



Example 1.3.8. Find an equation of the plane through the three points P(−1, −3, 2), Q(0, −1, 7), and R(3, 2, −1).

Solution. The vectors $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ are
\[ \overrightarrow{PQ} = \langle 0, -1, 7\rangle - \langle -1, -3, 2\rangle = \langle 1, 2, 5\rangle \]

and
\[ \overrightarrow{PR} = \langle 3, 2, -1\rangle - \langle -1, -3, 2\rangle = \langle 4, 5, -3\rangle. \]
Their cross product $\overrightarrow{PQ} \times \overrightarrow{PR}$ is orthogonal to the desired plane and, thus, $\vec n = \overrightarrow{PQ} \times \overrightarrow{PR}$ is a normal vector to the plane. Hence, an equation of the plane is
\[ \overrightarrow{PM} \cdot \vec n = \overrightarrow{PM} \cdot (\overrightarrow{PQ} \times \overrightarrow{PR}) = 0, \]

where M(x, y, z) is an arbitrary point in the plane. Using the triple product formula gives
\[ \begin{vmatrix} x - (-1) & y - (-3) & z - 2 \\ 1 & 2 & 5 \\ 4 & 5 & -3 \end{vmatrix} = 0. \]
Simplifying this, we obtain

23y − 31x − 3z + 44 = 0.

The graph of the plane is shown in Figure 1.14(b).
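
The three-point construction can be sketched in code by computing the normal vector with a cross product and then reading off d from equation (1.19). The function name `plane_through` is an illustrative assumption; it returns the coefficients (a, b, c, d) of ax + by + cz + d = 0 and assumes the three points are not collinear.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through(p, q, r):
    """Coefficients (a, b, c, d) of the plane ax + by + cz + d = 0 through p, q, r."""
    n = cross(sub(q, p), sub(r, p))   # normal vector PQ x PR
    d = -dot(n, p)                    # d = -(a*x0 + b*y0 + c*z0)
    return n + (d,)

P, Q, R = (-1, -3, 2), (0, -1, 7), (3, 2, -1)
a, b, c, d = plane_through(P, Q, R)
print(a, b, c, d)

# Each of the three points should satisfy the equation exactly.
for point in (P, Q, R):
    print(a * point[0] + b * point[1] + c * point[2] + d)
```

The coefficients correspond to −31x + 23y − 3z + 44 = 0, the same plane as in Example 1.3.8.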

In general, an equation of the plane passing through three points P1(x1, y1, z1), P2(x2, y2, z2), and P3(x3, y3, z3) is
\[ \begin{vmatrix} x - x_1 & y - y_1 & z - z_1 \\ x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \end{vmatrix} = 0. \]
In particular, if the three points are the intercepts with the x-, y-, and z-axes given by P1(a, 0, 0), P2(0, b, 0), and P3(0, 0, c), then an equation of the plane is
\[ \begin{vmatrix} x - a & y & z \\ -a & b & 0 \\ -a & 0 & c \end{vmatrix} = 0, \]

and this simplifies to (provided a, b, and c are all nonzero)

\[ \frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1. \]

Figure 1.15: Angle between two planes.

We can define the angle between two planes using their normal vectors as shown in
Figure 1.15.

Definition 1.3.1. The angle between two planes is defined as the acute angle between the normal
vectors of the two planes. Two planes are considered to be perpendicular if their normal vectors are
orthogonal.

Example 1.3.9. Find the angle between the planes x − y − 2z = 1 and x + y − 3z = 1.

Solution. The normal vectors of these two planes are

n⃗ 1 = ⟨1, −1, −2⟩ and n⃗ 2 = ⟨1, 1, −3⟩,

respectively. Let θ be the angle between the two planes. Then

\[ \cos\theta = \frac{\vec n_1 \cdot \vec n_2}{|\vec n_1||\vec n_2|} = \frac{1(1) + (-1)(1) + (-2)(-3)}{\sqrt{1^2 + (-1)^2 + (-2)^2}\,\sqrt{1^2 + 1^2 + (-3)^2}} = \frac{6}{\sqrt{66}} \approx 0.73855. \]

So the acute angle between the given planes is cos⁻¹(0.73855) ≈ 42°.

Example 1.3.10. Find a formula for the perpendicular distance D from the point P(x0 , y0 , z0 ) to the
plane ax + by + cz + d = 0.

Solution. Let P1(x1, y1, z1) be any point in the given plane. Then

\[ \overrightarrow{P_1P} = \langle x_0 - x_1, y_0 - y_1, z_0 - z_1\rangle. \]

The vector $\vec n = \langle a, b, c\rangle$ is a normal vector of the plane. Then, as shown in Figure 1.16, the distance D from P to the plane is
\[ D = \bigl|\,|\overrightarrow{P_1P}| \cos\theta\,\bigr|, \]
where θ is the angle between $\overrightarrow{P_1P}$ and $\vec n$.

Figure 1.16: Distance from a point P to a plane ax + by + cz + d = 0.

Thus,

\[ \begin{aligned}
D &= \bigl|\,|\overrightarrow{P_1P}| \cos\theta\,\bigr| = \left|\,|\overrightarrow{P_1P}| \cdot \cos\theta \cdot \frac{|\vec n|}{|\vec n|}\,\right| \\
&= \frac{1}{|\vec n|}\bigl|\,|\overrightarrow{P_1P}| \cdot \cos\theta \cdot |\vec n|\,\bigr| = \frac{|\overrightarrow{P_1P} \cdot \vec n|}{|\vec n|} \\
&= \frac{|a(x_0 - x_1) + b(y_0 - y_1) + c(z_0 - z_1)|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|ax_0 + by_0 + cz_0 - (ax_1 + by_1 + cz_1)|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}},
\end{aligned} \tag{1.20} \]

since −(ax1 + by1 + cz1) = d.

Example 1.3.11. Find the distance between the two parallel planes x + 2y − 2z = 5 and 2x + 4y − 4z = 3.

Solution. The two planes are parallel to each other since their normal vectors ⟨1, 2, −2⟩ and ⟨2, 4, −4⟩ are parallel. In order to find the distance D between the two planes, we can, instead, find the distance from any point in one plane to the other plane. For example, we can put y = z = 0 in the equation of the first plane, to get x = 5, so (5, 0, 0) is a point in the first plane. Using formula (1.20) from Example 1.3.10,

\[ D = \left| \frac{2(5) + 4(0) - 4(0) - 3}{\sqrt{2^2 + 4^2 + (-4)^2}} \right| = \frac{7}{6}. \]

So the distance between the two planes is 7/6.
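
Formula (1.20) is a one-line computation. The sketch below applies it to Example 1.3.11; the function name `point_plane_distance` is hypothetical, and the plane 2x + 4y − 4z = 3 is passed in the form ax + by + cz + d = 0 with d = −3.

```python
import math

def point_plane_distance(point, plane):
    """Distance from point (x0, y0, z0) to the plane ax + by + cz + d = 0.

    `plane` is the tuple (a, b, c, d); the normal (a, b, c) must be nonzero.
    """
    x0, y0, z0 = point
    a, b, c, d = plane
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)

# (5, 0, 0) lies in the plane x + 2y - 2z = 5; measure its distance
# to the parallel plane 2x + 4y - 4z - 3 = 0.
D = point_plane_distance((5, 0, 0), (2, 4, -4, -3))
print(D)   # should agree with 7/6 up to rounding
```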



The intersection of two planes that are not parallel is of course a line. So a line L
can be described as the line of intersection of two planes in the form

\[ L: \begin{cases} A_1x + B_1y + C_1z = D_1, \\ A_2x + B_2y + C_2z = D_2. \end{cases} \tag{1.21} \]

This is a general equation of the line L. The symmetric equations of a line are an exam-
ple of this form. There will, of course, be infinitely many possible choices for the two
planes that intersect in a given line L.

Example 1.3.12. Rewrite the line L determined by the equations below in the form of parametric equa-
tions and then in the form of symmetric equations:

\[ \begin{cases} x + y - z = 1, \\ 2x + y + 3z = 4. \end{cases} \]

Solution. First of all, we find a point on the line by choosing z = 0 and solving the equations for x and y,

\[ \begin{cases} x + y = 1, \\ 2x + y = 4, \end{cases} \]

obtaining x = 3 and y = −2. Therefore, the point (3, −2, 0) lies on line L. Note that the direction vector v of line L is perpendicular to both normal vectors of the given planes, so it is given by the cross product
\[ \mathbf v = \mathbf n_1 \times \mathbf n_2 = \begin{vmatrix} \vec i & \vec j & \vec k \\ 1 & 1 & -1 \\ 2 & 1 & 3 \end{vmatrix} = 4\vec i - 5\vec j - \vec k. \]

Hence, parametric equations of line L are x = 3 + 4t, y = −2 − 5t, z = −t. Symmetric equations of line L are

\[ \frac{x - 3}{4} = \frac{y + 2}{-5} = \frac{z}{-1}. \]
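
The two steps of Example 1.3.12 (pick a point with z = 0, then take the cross product of the normals) can be sketched as follows. The function name `intersection_line` is hypothetical. The sketch assumes, as in the example, that setting z = 0 leaves a 2 × 2 system with nonzero determinant; otherwise a different coordinate would have to be fixed.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersection_line(plane1, plane2):
    """Point and direction of the line where two planes meet.

    Each plane is (A, B, C, D), meaning Ax + By + Cz = D as in (1.21).
    Assumes the 2x2 system obtained by setting z = 0 is solvable.
    """
    a1, b1, c1, d1 = plane1
    a2, b2, c2, d2 = plane2
    det = a1 * b2 - b1 * a2
    if det == 0:
        raise ValueError("z = 0 does not give a unique point; fix another coordinate")
    # Cramer's rule for a1*x + b1*y = d1, a2*x + b2*y = d2.
    x = (d1 * b2 - b1 * d2) / det
    y = (a1 * d2 - a2 * d1) / det
    direction = cross((a1, b1, c1), (a2, b2, c2))
    return (x, y, 0.0), direction

point, direction = intersection_line((1, 1, -1, 1), (2, 1, 3, 4))
print(point, direction)
```

The returned point and direction give the parametric equations x = 3 + 4t, y = −2 − 5t, z = −t found in the example.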

Vector equations of planes


Suppose two nonparallel vectors a and b lie in the plane with their tails at M0(x0, y0, z0). Then, any vector with its head at a point in the plane and its tail at M0 can be given by a linear combination of the two vectors a and b. Assume M0 is the head of the position vector r0. Then, for any position vector r with head at a point in the plane (see Figure 1.17), there must be two scalars λ and μ such that

\[ \mathbf r - \mathbf r_0 = \lambda\mathbf a + \mu\mathbf b. \]

Figure 1.17: Vector equations of a plane.

Therefore, the vector equation

\[ \mathbf r = \mathbf r_0 + \lambda\mathbf a + \mu\mathbf b \tag{1.22} \]

describes a plane in space. Suppose r = ⟨x, y, z⟩, a = ⟨a1, a2, a3⟩, and b = ⟨b1, b2, b3⟩. Then
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} + \lambda \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} + \mu \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \tag{1.23} \]
is a vector equation of the plane. This can be rewritten as

\[ \begin{cases} x = x_0 + \lambda a_1 + \mu b_1, \\ y = y_0 + \lambda a_2 + \mu b_2, \\ z = z_0 + \lambda a_3 + \mu b_3. \end{cases} \tag{1.24} \]
These are parametric equations of the plane. Note that this can be written in the form

\[ \mathbf r(\lambda, \mu) = \langle x(\lambda, \mu), y(\lambda, \mu), z(\lambda, \mu)\rangle, \tag{1.25} \]

which is a vector-valued function with two parameters.

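Example 1.3.13. Rewrite the equation of the plane 2x − y − 3z = 10 in a vector form $\vec r = \vec r_0 + \lambda\vec a + \mu\vec b$.

Solution. We must find a position vector $\vec r_0$ whose terminal point is a point in the plane and two nonparallel vectors $\vec a$ and $\vec b$ which are both parallel to the plane. To find three such vectors, we find three points in the plane. It is easy to check that (5, 0, 0), (0, −10, 0), and (2, 0, −2) are three points in the plane and, therefore,
\[ \vec a = \begin{pmatrix} 0 \\ -10 \\ 0 \end{pmatrix} - \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -5 \\ -10 \\ 0 \end{pmatrix} \quad\text{and}\quad \vec b = \begin{pmatrix} 2 \\ 0 \\ -2 \end{pmatrix} - \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -3 \\ 0 \\ -2 \end{pmatrix} \]

are two vectors parallel to the plane, and $\vec a \nparallel \vec b$. Thus, a vector equation of the plane is given by

\[ \vec r = \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -5 \\ -10 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -3 \\ 0 \\ -2 \end{pmatrix}. \]

Also, we can solve the equation 2x − y − 3z = 10 to find a general solution. For example, we let y = λ and z = μ be two free variables. Then a general solution to the equation,
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \frac{10 + \lambda + 3\mu}{2} \\ \lambda \\ \mu \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} \frac{1}{2} \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} \frac{3}{2} \\ 0 \\ 1 \end{pmatrix}, \]

is in the desired vector form.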

1.4 Curves and vector-valued functions


A line is a special curve in space. As seen from the previous section, a line in space
has parametric equations

\[ \begin{cases} x = x_0 + mt, \\ y = y_0 + nt, \\ z = z_0 + pt, \end{cases} \]
where (x0 , y0 , z0 ) is a point on the line and ⟨m, n, p⟩ is the direction of the line. We can
rewrite this in a vector form

r = r0 + vt

with r = ⟨x, y, z⟩, r0 = ⟨x0 , y0 , z0 ⟩, and v = ⟨m, n, p⟩ is the direction vector. This can be
written as

r(t) = ⟨x0 + mt, y0 + nt, z0 + pt⟩,

which is a vector-valued function of t with each component being a linear function of


t. In general, the graph of a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩ is a curve in
space. Its parametric form is x = x(t), y = y(t), and z = z(t). You can imagine that this
curve is the trajectory of a moving object: at each specific time t, its position vector
is r(t).

Example 1.4.1 (A helix). The graph of the vector-valued function r(t) = 2 cos ti + 2 sin tj + 0.5tk, t ≥ 0,
is called a helix. The curve is shown in Figure 1.18.

Figure 1.18: A picture of a helix.

Example 1.4.2 (Slinky curve). A slinky curve is defined as r(t) = ⟨a(t) cos t, a(t) sin t, 1.2 sin 20t⟩. The
graph of the curve when a(t) = 5 + cos 20t and 0 ≤ t ≤ 2π is shown in Figure 1.19.

Figure 1.19: A picture of a slinky curve.

Sometimes it is helpful to visualize a curve in space by projecting the curve onto one
of the coordinate planes. If a curve has the vector equation r(t) = ⟨x(t), y(t), z(t)⟩, then
its view from above is its projection onto the xy-plane, and when it is projected, its
x- and y-coordinates remain unchanged, but the z-coordinate becomes 0. Thus, the
projection of the curve onto the xy-plane has the equation

r(t) = ⟨x(t), y(t), 0⟩.

In Example 1.4.1, the projection of the helix onto the xy-plane has the equation

r(t) = ⟨2 cos t, 2 sin t, 0⟩

or x² + y² = 4 and z = 0. It is a circle in the xy-plane.



Similarly, to obtain an equation for the projection of the curve r(t) = ⟨x(t), y(t), z(t)⟩
onto the xz-plane, we set the y-coordinate to be 0. To obtain an equation for the projec-
tion curve of the curve r(t) = ⟨x(t), y(t), z(t)⟩ onto the yz-plane, we set the x-coordinate
to be 0. For instance, the projection of the curve r(t) = ⟨2 cos t, 2 sin t, 0.5t⟩ onto the
yz-plane has an x-coordinate equal to 0, giving

r(t) = ⟨0, 2 sin t, 0.5t⟩

or x = 0 and y = 2 sin(2z). It is the graph of y = 2 sin(2z) in the plane x = 0.
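
The two projections of the helix can be spot-checked by sampling the curve. The following sketch is illustrative only; the function names and the choice of sample parameter values are assumptions made for the check.

```python
import math

def helix(t):
    """Position vector r(t) = <2 cos t, 2 sin t, 0.5 t> of Example 1.4.1."""
    return (2 * math.cos(t), 2 * math.sin(t), 0.5 * t)

def project_xy(point):
    """Projection onto the xy-plane: keep x and y, set z to 0."""
    x, y, _ = point
    return (x, y, 0.0)

def project_yz(point):
    """Projection onto the yz-plane: set x to 0, keep y and z."""
    _, y, z = point
    return (0.0, y, z)

samples = [helix(0.25 * n) for n in range(40)]   # t = 0, 0.25, ..., 9.75

# On the xy-projection, x^2 + y^2 should equal 4.
xy_errors = [abs(x * x + y * y - 4.0) for x, y, _ in map(project_xy, samples)]
# On the yz-projection, y should equal 2 sin(2z).
yz_errors = [abs(y - 2 * math.sin(2 * z)) for _, y, z in map(project_yz, samples)]

print(max(xy_errors), max(yz_errors))   # both should be near 0
```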

1.5 Calculus of vector-valued functions


1.5.1 Limits, derivatives, and tangent vectors

We can also define the limit of a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩ at a point t0. Similar to a scalar function, if r(t) → L as t → t0, then we say $\lim_{t \to t_0} \mathbf r(t) = \mathbf L$, where L = ⟨a, b, c⟩ is a constant vector. More precisely, it is defined as follows.

Definition 1.5.1. Let L = ⟨a, b, c⟩ be a constant vector and r(t) = ⟨x(t), y(t), z(t)⟩ be a vector-valued
function. Then $\lim_{t\to t_0}\mathbf{r}(t)=\mathbf{L}$ if and only if for any given ε > 0, there is a number δ > 0 such that

\[ |\mathbf{r}(t)-\mathbf{L}| < \varepsilon \quad\text{whenever}\quad 0 < |t-t_0| < \delta. \]

Since r(t) = x(t)i + y(t)j + z(t)k and

\[ |\mathbf{r}(t)-\mathbf{L}| = \sqrt{(x(t)-a)^2+(y(t)-b)^2+(z(t)-c)^2} < \varepsilon, \]

using the above definition and applying the limit laws for scalar functions, we have
the following theorem.

Theorem 1.5.1. Let L = ⟨a, b, c⟩ be a constant vector and let r(t) = ⟨x(t), y(t), z(t)⟩ be a vector-valued
function. Then

\[ \lim_{t\to t_0}\mathbf{r}(t)=\mathbf{L} \iff \lim_{t\to t_0}x(t)=a,\quad \lim_{t\to t_0}y(t)=b,\quad\text{and}\quad \lim_{t\to t_0}z(t)=c. \]

Therefore,

\[ \lim_{t\to t_0}\mathbf{r}(t)=\left\langle \lim_{t\to t_0}x(t),\ \lim_{t\to t_0}y(t),\ \lim_{t\to t_0}z(t)\right\rangle. \]

That is, to evaluate the limit of a vector-valued function, we evaluate the limit of each component of the
function, provided that all of these limits exist.

Example 1.5.1. Given the vector-valued function $\mathbf{r}(t)=\left\langle \frac{1-\cos t}{t^2},\ e^{-t},\ \tan^{-1}t\right\rangle$, evaluate the limits
(a) $\lim_{t\to 0}\mathbf{r}(t)$ and (b) $\lim_{t\to\infty}\mathbf{r}(t)$.

Solution.
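(a) We have

\[ \lim_{t\to 0}\mathbf{r}(t)=\left\langle \lim_{t\to 0}\frac{1-\cos t}{t^2},\ \lim_{t\to 0}e^{-t},\ \lim_{t\to 0}\tan^{-1}t\right\rangle=\left\langle \lim_{t\to 0}\frac{\sin t}{2t},\ 1,\ 0\right\rangle=\left\langle \frac12,\ 1,\ 0\right\rangle. \]

(b) We have

\[ \lim_{t\to\infty}\mathbf{r}(t)=\left\langle \lim_{t\to\infty}\frac{1-\cos t}{t^2},\ \lim_{t\to\infty}e^{-t},\ \lim_{t\to\infty}\tan^{-1}t\right\rangle=\left\langle 0,\ 0,\ \frac{\pi}{2}\right\rangle. \]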
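As a sanity check on Example 1.5.1, the components can be evaluated at a parameter value close to 0 and at a very large one. This is only a numerical illustration, not a proof; the sample values 1e-4 and 1e6 are arbitrary choices.

```python
import math

def r(t):
    """r(t) = <(1 - cos t)/t^2, e^(-t), arctan t> for t != 0 (Example 1.5.1)."""
    return ((1 - math.cos(t)) / t ** 2, math.exp(-t), math.atan(t))

near_zero = r(1e-4)  # expected to be close to <1/2, 1, 0>
far_out = r(1e6)     # expected to be close to <0, 0, pi/2>

print(near_zero)
print(far_out)
```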

Intuitively, we know that if each component of r(t) is continuous, then the curve r(t)
must be continuous, which means that you can draw the curve continuously, without
lifting your pencil. The formal definition of continuity is given below.

Definition 1.5.2. A vector-valued function r(t) is continuous at t0 if and only if $\lim_{t\to t_0}\mathbf{r}(t)=\mathbf{r}(t_0)$.

This means that each component of r(t) must be continuous at t = t0 .


We now consider the trajectory of a moving object, where at any instant its position vector is given
by r(t). Its displacement over the time interval from t0 to t = t0 + Δt is Δr = r(t) − r(t0) = r(t0 + Δt) − r(t0).
Therefore,

\[ \frac{\Delta\mathbf{r}}{\Delta t}=\frac{\mathbf{r}(t_0+\Delta t)-\mathbf{r}(t_0)}{\Delta t} \]

is the average velocity of the object during this time interval. The limit as Δt → 0, if
it exists, is the instantaneous velocity at t0. This is the definition of a derivative.
Thus,

\[ \mathbf{r}'(t_0)=\lim_{\Delta t\to 0}\frac{\Delta\mathbf{r}}{\Delta t}=\lim_{\Delta t\to 0}\frac{\mathbf{r}(t_0+\Delta t)-\mathbf{r}(t_0)}{\Delta t}. \]

If x(t), y(t), and z(t) are differentiable one-variable functions, then

\[
\begin{aligned}
\mathbf{r}'(t_0) &= \lim_{\Delta t\to 0}\frac{\mathbf{r}(t_0+\Delta t)-\mathbf{r}(t_0)}{\Delta t}\\
&= \lim_{\Delta t\to 0}\frac{(x(t_0+\Delta t)-x(t_0))\mathbf{i}+(y(t_0+\Delta t)-y(t_0))\mathbf{j}+(z(t_0+\Delta t)-z(t_0))\mathbf{k}}{\Delta t}\\
&= \lim_{\Delta t\to 0}\frac{(x(t_0+\Delta t)-x(t_0))\mathbf{i}}{\Delta t}+\lim_{\Delta t\to 0}\frac{(y(t_0+\Delta t)-y(t_0))\mathbf{j}}{\Delta t}+\lim_{\Delta t\to 0}\frac{(z(t_0+\Delta t)-z(t_0))\mathbf{k}}{\Delta t}\\
&= x'(t_0)\mathbf{i}+y'(t_0)\mathbf{j}+z'(t_0)\mathbf{k}.
\end{aligned}
\]

Equivalently, r′(t0) = ⟨x′(t0), y′(t0), z′(t0)⟩. If this vector is not 0, then
it is a tangent vector to the curve r(t) at t0. Figure 1.20(a) illustrates
this idea. We can extend this idea to define the derivative as a function of t as follows.

Definition 1.5.3. If x(t), y(t), and z(t) are three differentiable functions on the interval (a, b), then the
derivative of the vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩ is

r′(t) = ⟨x′(t), y′(t), z′(t)⟩.

If this vector is not 0, then it is a tangent vector to the curve r(t).



(a) (b)

Figure 1.20: Tangent vector/line and normal plane.

In light of the above definition, we are now able to derive an equation for the tangent
line to the curve r(t) at any point t = t0. Since the curve at the point (x(t0), y(t0), z(t0)) has
tangent vector r′(t0) = ⟨x′(t0), y′(t0), z′(t0)⟩, the symmetric equations of the tangent
line, provided r′(t0) ≠ 0, are

\[ \frac{x-x(t_0)}{x'(t_0)}=\frac{y-y(t_0)}{y'(t_0)}=\frac{z-z(t_0)}{z'(t_0)}. \tag{1.26} \]

Parametric equations of the tangent line at t = t0 are

\[ \begin{cases} x=x(t_0)+x'(t_0)t,\\ y=y(t_0)+y'(t_0)t,\\ z=z(t_0)+z'(t_0)t, \end{cases} \tag{1.27} \]

and a vector equation of the tangent line at t = t0 is

\[ \mathbf{r}(t)=\mathbf{r}(t_0)+\mathbf{r}'(t_0)t, \tag{1.28} \]

where r′(t0) is a tangent vector. The unit tangent vector at t = t0 is $\mathbf{T}=\frac{\mathbf{r}'(t_0)}{|\mathbf{r}'(t_0)|}$.
Note that the plane passing through the point of the curve at t = t0 with a normal vector parallel
to the tangent vector to the curve at t = t0 is the normal plane to the curve at t = t0,
as shown in Figure 1.20(b). Writing (x0, y0, z0) = (x(t0), y(t0), z(t0)), the normal plane to the curve at t = t0 has the equation

\[ x'(t_0)(x-x_0)+y'(t_0)(y-y_0)+z'(t_0)(z-z_0)=0. \tag{1.29} \]

Example 1.5.2. Find an equation for the tangent line and normal plane to the curve
r(t) = ⟨sin t, cos t, sin 2t⟩ at t = π/6.

Solution. The point is (sin π/6, cos π/6, sin(2 × π/6)) = (1/2, √3/2, √3/2), and since
r′(t) = ⟨cos t, − sin t, 2 cos 2t⟩,

r′(π/6) = ⟨cos(π/6), − sin(π/6), 2 cos(2 ⋅ π/6)⟩ = ⟨√3/2, −1/2, 1⟩.



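So, the parametric equations for the desired tangent line are

\[ \begin{cases} x=\dfrac12+\dfrac{\sqrt3}{2}t,\\[4pt] y=\dfrac{\sqrt3}{2}-\dfrac12 t,\\[4pt] z=\dfrac{\sqrt3}{2}+t. \end{cases} \]

An equation for the normal plane at t = π/6 is

\[ \frac{\sqrt3}{2}\left(x-\frac12\right)-\frac12\left(y-\frac{\sqrt3}{2}\right)+\left(z-\frac{\sqrt3}{2}\right)=0. \]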
Figure 1.21 shows the tangent line and normal plane at t = π/6.
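The computation in Example 1.5.2 can be cross-checked numerically. The sketch below compares the componentwise derivative with a central-difference approximation and evaluates the left side of the normal plane equation (1.29); the helper names and the step size h = 1e-6 are assumptions made for illustration.

```python
import math

def r(t):
    """The curve r(t) = <sin t, cos t, sin 2t> of Example 1.5.2."""
    return (math.sin(t), math.cos(t), math.sin(2 * t))

def r_prime(t):
    """Componentwise derivative r'(t) = <cos t, -sin t, 2 cos 2t>."""
    return (math.cos(t), -math.sin(t), 2 * math.cos(2 * t))

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) componentwise by (f(t + h) - f(t - h)) / (2h)."""
    hi, lo = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(hi, lo))

t0 = math.pi / 6
point = r(t0)                # (1/2, sqrt(3)/2, sqrt(3)/2)
tangent = r_prime(t0)        # (sqrt(3)/2, -1/2, 1)
numeric_tangent = central_difference(r, t0)

def tangent_line(s):
    """Point on the tangent line with line parameter s, as in (1.28)."""
    return tuple(p + v * s for p, v in zip(point, tangent))

def normal_plane_residual(q):
    """Left side of (1.29) at q; it is zero exactly on the normal plane."""
    return sum(v * (qi - pi) for v, qi, pi in zip(tangent, q, point))

print(tangent, numeric_tangent)
print(normal_plane_residual(point), normal_plane_residual(tangent_line(1.0)))
```

The point of tangency lies on the normal plane, while a point one parameter unit along the tangent line does not: its residual equals |r′(t0)|².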

(a) (b) (c) (d)

Figure 1.21: Tangent line and normal plane, Example 1.5.2.

By using the above definition of the derivative for a vector-valued function, we can
deduce the following theorem, the proof of which is omitted here.

Theorem 1.5.2. Let u(t) and v(t) be two differentiable vector-valued functions and f(t) be a differentiable
scalar-valued function over a < t < b. Let c be a constant vector. Then at any t in (a, b), we
have:
1. $\frac{d}{dt}(\mathbf{c})=\mathbf{0}$,
2. $\frac{d}{dt}(\mathbf{u}(t)\pm\mathbf{v}(t))=\frac{d}{dt}\mathbf{u}(t)\pm\frac{d}{dt}\mathbf{v}(t)$ (sum or difference rule),
3. $\frac{d}{dt}(f(t)\mathbf{u}(t))=\left(\frac{d}{dt}f(t)\right)\mathbf{u}(t)+f(t)\frac{d}{dt}\mathbf{u}(t)$ (scalar multiple rule),
4. $\frac{d}{dt}\mathbf{u}(f(t))=\mathbf{u}'(f(t))f'(t)$ (chain rule),
5. $\frac{d}{dt}(\mathbf{u}(t)\cdot\mathbf{v}(t))=\mathbf{u}'(t)\cdot\mathbf{v}(t)+\mathbf{u}(t)\cdot\mathbf{v}'(t)$ (dot product rule),
6. $\frac{d}{dt}(\mathbf{u}(t)\times\mathbf{v}(t))=\mathbf{u}'(t)\times\mathbf{v}(t)+\mathbf{u}(t)\times\mathbf{v}'(t)$ (cross product rule).

1.5.2 Antiderivatives and definite integrals

Similar to scalar functions, if R′(t) = r(t), then we say R(t) is an antiderivative of r(t),
and we write the indefinite integral of r(t) as

\[ \int \mathbf{r}(t)\,dt=\mathbf{R}(t)+\mathbf{C}, \]

where C is an arbitrary constant vector. For definite integrals, we write $\int_a^b\mathbf{r}(t)\,dt=\mathbf{R}(b)-\mathbf{R}(a)$.
In light of the previous definition of the derivative, we have the following
formal definition using the components of r(t).

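Definition 1.5.4. If r(t) = ⟨x(t), y(t), z(t)⟩ is continuous for a ≤ t ≤ b, then we define

\[ \int\mathbf{r}(t)\,dt=\left\langle \int x(t)\,dt,\ \int y(t)\,dt,\ \int z(t)\,dt\right\rangle \quad\text{and}\quad \int_a^b\mathbf{r}(t)\,dt=\left\langle \int_a^b x(t)\,dt,\ \int_a^b y(t)\,dt,\ \int_a^b z(t)\,dt\right\rangle. \]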

Example 1.5.3. If $\mathbf{r}'(t)=e^{2t}\mathbf{i}+\sec^2 t\,\mathbf{j}+\sin t\,\mathbf{k}$,

1. find r(t);
2. furthermore, if r(0) = ⟨1, 1, 2⟩, determine r(t);
3. find $\int_0^{\pi/4}\mathbf{r}(t)\,dt$.

Solution.
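1. Since $\mathbf{r}'(t)=\langle e^{2t},\ \sec^2 t,\ \sin t\rangle$,

\[
\begin{aligned}
\mathbf{r}(t) &= \int\mathbf{r}'(t)\,dt=\left\langle \int e^{2t}\,dt,\ \int\sec^2 t\,dt,\ \int\sin t\,dt\right\rangle\\
&= \left\langle \tfrac12 e^{2t}+c_1,\ \tan t+c_2,\ -\cos t+c_3\right\rangle\\
&= \left\langle \tfrac12 e^{2t},\ \tan t,\ -\cos t\right\rangle+\langle c_1, c_2, c_3\rangle.
\end{aligned}
\]

2. Since r(0) = ⟨1, 1, 2⟩, we have

\[ \langle 1, 1, 2\rangle=\left\langle \tfrac12 e^{0},\ \tan 0,\ -\cos 0\right\rangle+\langle c_1, c_2, c_3\rangle, \quad\text{so}\quad \langle c_1, c_2, c_3\rangle=\left\langle \tfrac12,\ 1,\ 3\right\rangle. \]

Then, $\mathbf{r}(t)=\left\langle \tfrac12 e^{2t}+\tfrac12,\ \tan t+1,\ -\cos t+3\right\rangle$.

3. By definition,

\[
\begin{aligned}
\int_0^{\pi/4}\mathbf{r}(t)\,dt &= \left\langle \int_0^{\pi/4}\left(\tfrac12 e^{2t}+\tfrac12\right)dt,\ \int_0^{\pi/4}(\tan t+1)\,dt,\ \int_0^{\pi/4}(-\cos t+3)\,dt\right\rangle\\
&= \left\langle \left[\tfrac14 e^{2t}+\tfrac{t}{2}\right]_0^{\pi/4},\ \Big[-\ln|\cos t|+t\Big]_0^{\pi/4},\ \Big[-\sin t+3t\Big]_0^{\pi/4}\right\rangle\\
&= \left\langle \frac{e^{\pi/2}}{4}+\frac{\pi}{8}-\frac14,\ \frac{\ln 2}{2}+\frac{\pi}{4},\ -\frac{\sqrt2}{2}+\frac{3\pi}{4}\right\rangle.
\end{aligned}
\]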

1.5.3 Length of curves, curvatures, TNB frame

As seen before, the arc length s of a plane curve ⟨x(t), y(t)⟩ for a ≤ t ≤ b is

\[ s=\int_a^b\sqrt{[x'(t)]^2+[y'(t)]^2}\,dt. \]

The analog for a curve in space is the length of a curve r(t) = ⟨x(t), y(t), z(t)⟩ for a ≤
t ≤ b, which is

\[ s=\int_a^b\sqrt{[x'(t)]^2+[y'(t)]^2+[z'(t)]^2}\,dt, \]

provided that the integrand is integrable. The integrand is always integrable when the
curve is smooth, that is, x′(t), y′(t), and z′(t) are continuous on [a, b].
Again, thinking of an object moving along the curve, the length of the curve is indeed
the distance traveled by the object over the time interval [a, b]. Since the derivative
of position with respect to time is the velocity v(t), and the derivative of distance traveled
with respect to time t is the speed, we have

\[ \mathbf{v}(t)=\mathbf{r}'(t) \quad\text{and}\quad \frac{ds}{dt}=|\mathbf{v}(t)|. \]

It is not a surprise that the length of the curve is $s=\int_a^b|\mathbf{v}(t)|\,dt$.
We summarize this in the following definition.

Definition 1.5.5. If r′(t) is continuous, the curve r(t) is a smooth curve, and the length of this curve for
a ≤ t ≤ b is defined as

\[ \int_a^b|\mathbf{r}'(t)|\,dt=\int_a^b\sqrt{[x'(t)]^2+[y'(t)]^2+[z'(t)]^2}\,dt. \]

Example 1.5.4. Find the length of the curve r(t) = ⟨3 cos t, 4 cos t, 5 sin t⟩ for 0 ≤ t ≤ 2π.

Solution. The length of the curve s is given by the integral

\[
\begin{aligned}
s &= \int_0^{2\pi}\sqrt{((3\cos t)')^2+((4\cos t)')^2+((5\sin t)')^2}\,dt\\
&= \int_0^{2\pi}\sqrt{9\sin^2 t+16\sin^2 t+25\cos^2 t}\,dt=\int_0^{2\pi}5\,dt=10\pi.
\end{aligned}
\]
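The value 10π can be reproduced by approximating the arc length integral of Definition 1.5.5 numerically. The following is a minimal sketch with a midpoint rule; the names `speed` and `arc_length` and the number of subintervals are illustrative.

```python
import math

def speed(t):
    """|r'(t)| for r(t) = <3 cos t, 4 cos t, 5 sin t> (Example 1.5.4)."""
    dx, dy, dz = -3 * math.sin(t), -4 * math.sin(t), 5 * math.cos(t)
    return math.sqrt(dx * dx + dy * dy + dz * dz)

def arc_length(speed_fn, a, b, n=1000):
    """Midpoint-rule approximation of the integral of speed_fn over [a, b]."""
    h = (b - a) / n
    return h * sum(speed_fn(a + (k + 0.5) * h) for k in range(n))

length = arc_length(speed, 0.0, 2 * math.pi)
print(length, 10 * math.pi)
```

Because the speed is the constant 5 here, the midpoint rule is exact up to rounding; for a curve with varying speed the result would only be an approximation.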

Parameterization by arc length


Now we consider a vector-valued function r(t) = ⟨x(t), y(t)⟩ with the following parametric
representations:

(a) x = R cos t, y = R sin t, 0 ≤ t ≤ 2π;
(b) x = R cos(u/2), y = R sin(u/2), 0 ≤ u ≤ 4π;
(c) x = R cos 2t, y = R sin 2t, 0 ≤ t ≤ π;
(d) x = R sin 3θ, y = R cos 3θ, 0 ≤ θ ≤ 2π/3.

They actually describe the same curve. In this case, it is a circle centered at (0, 0)
with radius R. The name of a parameter, of course, does not matter. However, how
the curve evolves as the parameter increases does make a difference. For example,
in (a), the circle is formed counterclockwise while in (d) it is formed clockwise. The
positive orientation of a curve is the direction in which the curve is generated as the
parameter increases. So the positive orientation of (a) is counterclockwise, while the
positive orientation of (d) is clockwise.
A curve may be parameterized in many ways, as shown above. In some ways, the
parameter may have a nice geometric interpretation. For example, in (a), at each point
on the circle, the corresponding value of the parameter t is exactly the angle (measured
in radians) formed by the corresponding radius and the positive x-axis. We now intro-
duce a very natural way for describing a curve where its parameter represents the arc
length. We first investigate the following curve:

\[ \begin{cases} x=2\cos\dfrac{t}{2},\\[4pt] y=2\sin\dfrac{t}{2}, \end{cases} \quad\text{for } 0\le t\le 4\pi. \]

The initial point is (2, 0). When t = π, the corresponding arc length is also π. When t =
2π, the corresponding arc length is also 2π. In general, the length of the interval [0, t] is
equal to the length of the curve generated. We say that the curve r(t) = ⟨2 cos(t/2), 2 sin(t/2)⟩
is parameterized by arc length. In this case we also write

\[ \mathbf{r}(s)=\left\langle 2\cos\frac{s}{2},\ 2\sin\frac{s}{2}\right\rangle \]

with s being the arc length parameter.


But how do we know whether a curve r(t) is parameterized by arc length? First,
we note that by definition

\[ s=\int_a^t|\mathbf{r}'(u)|\,du, \]

and so $\frac{ds}{dt}=|\mathbf{r}'(t)|$. This means ds = |r′(t)| dt. Therefore, the change in t is equal to
the change in s if and only if |r′(t)| = 1. In particular, if the curve starts at r(a) and
|r′(t)| = 1 for all t, then when t = a, we have s = 0, and when t > a, we have s = t − a.

Example 1.5.5. Determine whether the curves

(a) r(t) = ⟨sin t, 1, cos t⟩ for t ≥ 1 and (b) r(t) = ⟨t, t + 1, 6t⟩ for 0 ≤ t ≤ 12

use arc length as a parameter. If not, find a description that uses arc length as a parameter.

Solution.
1. For (a), r′(t) = ⟨cos t, 0, − sin t⟩, so $|\mathbf{r}'(t)|=\sqrt{(\cos t)^2+0^2+(-\sin t)^2}=1$. Yes, it
uses arc length as a parameter.
2. For (b), r′(t) = ⟨1, 1, 6⟩, so $|\mathbf{r}'(t)|=\sqrt{1^2+1^2+6^2}=\sqrt{38}\ne 1$. No, it does not use arc
length as a parameter. Since

\[ s=\int_0^t|\mathbf{r}'(u)|\,du=\int_0^t\sqrt{38}\,du=\sqrt{38}\,t, \]

if we replace t by $\frac{s}{\sqrt{38}}$, the parameterized curve

\[ \mathbf{r}_1(s)=\left\langle \frac{s}{\sqrt{38}},\ \frac{s}{\sqrt{38}}+1,\ \frac{6s}{\sqrt{38}}\right\rangle \]

uses arc length as a parameter.
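Since the curve in part (b) is a straight segment, the arc length between two of its points is just their distance, which gives a quick numerical check of the reparameterization. The sketch below is illustrative; the names `r`, `r1`, and `distance` are assumptions, not notation from the text.

```python
import math

SQRT38 = math.sqrt(38)

def r(t):
    """Original parameterization r(t) = <t, t + 1, 6t> of Example 1.5.5(b)."""
    return (t, t + 1, 6 * t)

def r1(s):
    """Reparameterization obtained by substituting t = s / sqrt(38)."""
    return r(s / SQRT38)

def distance(p, q):
    """Euclidean distance between two points in space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Move the parameter by 2 units in each description of the line.
gap_t = distance(r(0.0), r(2.0))    # equals 2 * sqrt(38), not 2
gap_s = distance(r1(0.0), r1(2.0))  # equals 2: parameter change = arc length

print(gap_t, gap_s)
```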

Curvature, normal vector, and the TNB frame


If you observe the two curves in Figure 1.22(a,b), which are both bending downward,
you will notice a difference. One curve bends more sharply than the other. To measure
how sharply a curve bends, we need the concept of curvature. Observing
the unit tangent vectors T of the two curves, you will see that the unit tangent vector of
the curve that bends more sharply changes direction more quickly with respect to arc length
than that of the curve that bends less sharply. In other words, over a certain length of curve,
the unit tangent vector changes more in direction if the curve bends more sharply.
Therefore, we have the following definition.

(a) (b) (c)

Figure 1.22: Curvatures, osculating circles.



Definition 1.5.6 (Curvature). If r(t) is a smooth curve and T is its unit tangent vector, then the curvature
κ of r(t) is defined as

\[ \kappa=\left|\frac{d\mathbf{T}}{ds}\right|. \]

Because $\frac{ds}{dt}=|\mathbf{r}'(t)|$, by using the chain rule, we have

\[ \kappa=\left|\frac{d\mathbf{T}}{ds}\right|=\left|\frac{d\mathbf{T}}{dt}\right|\frac{1}{\left|\frac{ds}{dt}\right|}=\frac{1}{|\mathbf{r}'(t)|}\left|\frac{d\mathbf{T}}{dt}\right|. \]

Intuitively speaking, since curvature measures the degree that a curve bends, a
straight line must have 0 curvature. At all points on a circle, the curvature would be the
same constant, and a smaller circle should have larger curvature. Let us use Definition
1.5.6 to verify this understanding.

Example 1.5.6. Find the curvature for the straight line r(t) = ⟨x0 + mt, y0 + nt, z0 + pt⟩.

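Solution. Since $\frac{d\mathbf{r}}{dt}=\langle m, n, p\rangle$ is constant, $\mathbf{T}=\frac{1}{\sqrt{m^2+n^2+p^2}}\langle m, n, p\rangle$. Then,

\[ \frac{d\mathbf{T}}{dt}=\frac{d}{dt}\left(\frac{1}{\sqrt{m^2+n^2+p^2}}\langle m, n, p\rangle\right)=\mathbf{0}. \]

Therefore, the curvature at any point on the line is $\kappa=\frac{1}{|\mathbf{r}'(t)|}\left|\frac{d\mathbf{T}}{dt}\right|=0$. This agrees
with our intuition.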

Example 1.5.7. Find the curvature for the circle r(t) = ⟨R cos t, R sin t⟩.

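Solution. Since $\frac{d\mathbf{r}}{dt}=\langle -R\sin t,\ R\cos t\rangle$,

\[ \mathbf{T}=\frac{1}{\sqrt{(-R\sin t)^2+(R\cos t)^2}}\langle -R\sin t,\ R\cos t\rangle=\langle -\sin t,\ \cos t\rangle. \]

Then the curvature is

\[
\begin{aligned}
\kappa &= \frac{1}{|\mathbf{r}'(t)|}\left|\frac{d\mathbf{T}}{dt}\right|=\frac{1}{\sqrt{(-R\sin t)^2+(R\cos t)^2}}\left|\frac{d\mathbf{T}}{dt}\right|\\
&= \frac1R\left|\frac{d}{dt}\langle -\sin t,\ \cos t\rangle\right|=\frac1R\big|\langle -\cos t,\ -\sin t\rangle\big|=\frac1R\sqrt{(-\cos t)^2+(-\sin t)^2}=\frac1R.
\end{aligned}
\]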

The curvature is the same at each point on a circle, and a larger circle has a smaller
curvature.

In general, calculating the curvature using the definition involves many steps.
However, sometimes it is easier to calculate the curvature of a curve by using the fol-
lowing theorem, which can be derived using Theorem 1.5.2.

Theorem 1.5.3. Let r(t) be a twice differentiable smooth curve. The curvature of r(t) is then

\[ \kappa=\frac{|\mathbf{r}'(t)\times\mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}. \]

This theorem allows us to evaluate the curvature of a parameterized curve by evaluating
its first- and second-order derivatives at a point.

Example 1.5.8. Find the curvature of the curve y = x² at the point with greatest curvature.

Solution. Let x = t, y = t², and z = 0. Then the curve is r(t) = ⟨t, t², 0⟩. At each t,

r′(t) = ⟨1, 2t, 0⟩ and r″(t) = ⟨0, 2, 0⟩,

r′(t) × r″(t) = ⟨1, 2t, 0⟩ × ⟨0, 2, 0⟩ = ⟨0, 0, 2⟩.

So the curvature is

\[ \kappa=\frac{|\langle 0, 0, 2\rangle|}{|\langle 1, 2t, 0\rangle|^3}=\frac{2}{\left(\sqrt{1+4t^2}\right)^3}. \]

The curvature is largest when the denominator is smallest, that is, at t = 0. So at the
origin, the parabola y = x² has the greatest curvature, which is 2.
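The formula of Theorem 1.5.3 translates directly into a few lines of code. The following sketch evaluates it for the parabola of Example 1.5.8 and, as a second check, for the circle of Example 1.5.7; the helper names are illustrative, and the derivatives are entered by hand rather than computed symbolically.

```python
import math

def cross(u, v):
    """Cross product of two vectors in space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))

def curvature(first, second):
    """kappa = |r' x r''| / |r'|^3, given the vectors r'(t) and r''(t)."""
    return norm(cross(first, second)) / norm(first) ** 3

def parabola_curvature(t):
    """Curvature of r(t) = <t, t^2, 0>, using r' = <1, 2t, 0>, r'' = <0, 2, 0>."""
    return curvature((1.0, 2.0 * t, 0.0), (0.0, 2.0, 0.0))

def circle_curvature(R, t):
    """Curvature of r(t) = <R cos t, R sin t, 0>; Example 1.5.7 predicts 1/R."""
    first = (-R * math.sin(t), R * math.cos(t), 0.0)
    second = (-R * math.cos(t), -R * math.sin(t), 0.0)
    return curvature(first, second)

print(parabola_curvature(0.0), parabola_curvature(1.0))
print(circle_curvature(3.0, 0.7))
```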

Note. If a curve r(t) is parameterized by arc length, then |r′(t)| = 1, ds = dt, and so
$\frac{d\mathbf{r}}{ds}=\mathbf{T}$, and we have

\[ \kappa=\frac{1}{|\mathbf{r}'(s)|}\left|\frac{d\mathbf{T}}{ds}\right|=\left|\frac{d\mathbf{T}}{ds}\right|=\left|\frac{d^2\mathbf{r}}{ds^2}\right| \quad\text{or}\quad \kappa=|\mathbf{r}''(s)|. \]

The curvature tells how fast a curve turns. But in which direction does it turn? The
principal unit normal vector determines this.

Definition 1.5.7. Let r be a smooth curve. If $\frac{d\mathbf{T}}{dt}$ is not 0, then the principal unit normal vector N at a
point on the curve is defined to be

\[ \mathbf{N}=\frac{d\mathbf{T}/dt}{|d\mathbf{T}/dt|}. \]

If the curve is parameterized by arc length, then we have

\[ \mathbf{N}=\frac{d\mathbf{T}/ds}{|d\mathbf{T}/ds|}=\frac{1}{\kappa}\frac{d\mathbf{T}}{ds} \quad\text{or}\quad \frac{d\mathbf{T}}{ds}=\kappa\mathbf{N}. \]

Because 1 = T ⋅ T, differentiation with respect to s shows that

\[ 0=\frac{d\mathbf{T}}{ds}\cdot\mathbf{T}+\mathbf{T}\cdot\frac{d\mathbf{T}}{ds}=2\kappa\,\mathbf{N}\cdot\mathbf{T}, \]

meaning that T and N are orthogonal at all points of the curve. Also, N points to the
inside of the curve in the direction where the curve is turning.
There is one more aspect of the curve that we need to consider: the curve might
twist out of the plane determined by T and N. We define the unit binormal vector B to
be T × N. Then
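\[ \frac{d\mathbf{B}}{ds}=\frac{d}{ds}(\mathbf{T}\times\mathbf{N})=\frac{d\mathbf{T}}{ds}\times\mathbf{N}+\mathbf{T}\times\frac{d\mathbf{N}}{ds}=\mathbf{T}\times\frac{d\mathbf{N}}{ds}. \]

Note that $\frac{d\mathbf{T}}{ds}\times\mathbf{N}=\mathbf{0}$, since $\frac{d\mathbf{T}}{ds}$ and N are parallel to each other. So, we know that $\frac{d\mathbf{B}}{ds}$ is
orthogonal to T. On the other hand, since 1 = B ⋅ B, we have

\[ 0=\frac{d(\mathbf{B}\cdot\mathbf{B})}{ds}=\frac{d\mathbf{B}}{ds}\cdot\mathbf{B}+\mathbf{B}\cdot\frac{d\mathbf{B}}{ds}. \]

This means that $\frac{d\mathbf{B}}{ds}$ is also perpendicular to B. Therefore, $\frac{d\mathbf{B}}{ds}$ must be parallel to
B × T = N. Then there is a scalar τ such that

\[ \frac{d\mathbf{B}}{ds}=-\tau\mathbf{N}, \]

where τ is the torsion, whose magnitude is the rate at which the curve twists out of the
TN plane. Furthermore, we can derive $\frac{d\mathbf{N}}{ds}=-\kappa\mathbf{T}+\tau\mathbf{B}$ (left as an exercise).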
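In summary, the formulas

\[ \frac{d\mathbf{T}}{ds}=\kappa\mathbf{N},\qquad \frac{d\mathbf{N}}{ds}=-\kappa\mathbf{T}+\tau\mathbf{B},\qquad \frac{d\mathbf{B}}{ds}=-\tau\mathbf{N} \]

or, in matrix form,

\[ \begin{pmatrix}\mathbf{T}'\\ \mathbf{N}'\\ \mathbf{B}'\end{pmatrix}=\begin{pmatrix}0 & \kappa & 0\\ -\kappa & 0 & \tau\\ 0 & -\tau & 0\end{pmatrix}\begin{pmatrix}\mathbf{T}\\ \mathbf{N}\\ \mathbf{B}\end{pmatrix}, \]

are called the Frenet–Serret formulas or the Frenet–Serret theorem, named after two French
mathematicians. The TNB frame, as shown in Figure 1.23, is useful when it is impossible
or hard to assign a natural coordinate system for a trajectory, as in relativity theory
or models of microbial motion. It is also called the Frenet–Serret frame.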
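To make the TNB frame concrete, the sketch below computes T, N, B, the curvature, and the torsion numerically for the helix r(t) = ⟨2 cos t, 2 sin t, 0.5t⟩ of Example 1.4.1. It is an illustration under stated assumptions: derivatives with respect to arc length are approximated by central differences (step h = 1e-5), and the reference values κ = a/(a² + b²) and τ = b/(a² + b²) for a helix ⟨a cos t, a sin t, bt⟩ are a standard result that is not derived in this section.

```python
import math

A, PITCH = 2.0, 0.5                            # helix <A cos t, A sin t, PITCH t>
SPEED = math.sqrt(A * A + PITCH * PITCH)       # |r'(t)|, constant for a helix

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def unit_tangent(t):
    """T = r'(t) / |r'(t)|."""
    return (-A * math.sin(t) / SPEED, A * math.cos(t) / SPEED, PITCH / SPEED)

def d_ds(f, t, h=1e-5):
    """Arc-length derivative df/ds = (df/dt) / |r'(t)| via central differences."""
    hi, lo = f(t + h), f(t - h)
    return tuple((p - q) / (2 * h * SPEED) for p, q in zip(hi, lo))

def frame(t):
    """Return (T, N, B, kappa) at parameter t."""
    T = unit_tangent(t)
    dT = d_ds(unit_tangent, t)
    kappa = norm(dT)                       # kappa = |dT/ds|
    N = tuple(c / kappa for c in dT)       # N = (dT/ds) / kappa
    return T, N, cross(T, N), kappa        # B = T x N

def binormal(t):
    return frame(t)[2]

def torsion(t):
    """From dB/ds = -tau N, the torsion is tau = -(dB/ds) . N."""
    return -dot(d_ds(binormal, t), frame(t)[1])

T, N, B, kappa = frame(1.0)
tau = torsion(1.0)
print(kappa, A / SPEED ** 2)
print(tau, PITCH / SPEED ** 2)
```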

1.6 Surfaces in space


1.6.1 Graph of an equation F (x, y, z) = 0

As seen before, a plane, which is a simple surface in space, has equation

a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0 or ax + by + cz + d = 0.



(a) (b) (c)

Figure 1.23: TNB frame.

All the solutions to the equation are points in the plane. Also, for any point in the
plane, its coordinates must satisfy the equation. In general, for an equation of three
variables,

F(x, y, z) = 0,

all its solutions are the set of points in space that form a surface, which is called the
graph of the equation.

Example 1.6.1. Find an equation for the sphere with radius R centered at P0 (x0 , y0 , z0 ).

Solution. Suppose P(x, y, z) is a point on the sphere. The distance between P and P0
must be R, that is,

\[ |PP_0|=R, \qquad \sqrt{(x-x_0)^2+(y-y_0)^2+(z-z_0)^2}=R, \]

and so this sphere has the equation

\[ (x-x_0)^2+(y-y_0)^2+(z-z_0)^2=R^2. \]

Example 1.6.2. Find the locus of points with equal distance from the two points A(1, 2, 3) and B(2, 1, −4).

Solution. If P(x, y, z) is any point with equal distance from A and from B, then |PA| =
|PB|, and this becomes

\[ \sqrt{(x-1)^2+(y-2)^2+(z-3)^2}=\sqrt{(x-2)^2+(y-1)^2+(z+4)^2}. \]

Squaring both sides and simplifying this expression gives

\[ 2x-2y-14z-7=0. \]

The locus is a plane.



(a) (b)

Figure 1.24: Definition of a cylinder.

Note that squaring both sides of an equation, as above, can introduce extra solutions,
because A = B and A = −B both square to A² = B². However, this does not happen here
because the square roots on both sides are nonnegative, so neither side can be the
negative of the other unless both are zero.

1.6.2 Cylinder

Consider a plane in space again. A plane can be considered as the surface which is
formed by all lines that are parallel to a given line and pass through a given curve.
Or, in other words, the plane is formed by moving a line along a curve. This type of
surface is called a cylinder, as shown in Figure 1.24.

Definition 1.6.1. A cylinder is defined as a surface that consists of all lines (called rulings) that are
parallel to a given line and pass through a given curve.

We first consider the cases where all rulings are parallel to one of the coordinate axes.

Example 1.6.3 (Parabolic cylinder). Sketch the graph of the surface y² = 2x in three-dimensional
space and show that it is a cylinder.

Solution. Note that the equation y² = 2x does not involve z. This means
that for any x0 and y0 satisfying this equation, there is a line of solutions (x0, y0, z)
for every possible z-value (that is, z is unrestricted by the equation). Furthermore, any
horizontal plane with equation z = k (parallel to the xy-plane) intersects the graph
in the same curve with equation y² = 2x, a parabola. Figure 1.25(a) shows how the
graph is formed by moving a line parallel to the z-axis along the parabola y² = 2x in
the xy-plane. This surface is called a parabolic cylinder. The graph can also be formed
by infinitely many shifted copies of the same parabola y² = 2x along the z-axis.

(a) (b) (c)

Figure 1.25: Examples of cylinders: parabolic cylinder, elliptic cylinder, hyperbolic cylinder.

Note.
1. In general, if one of the variables x, y, or z is missing from the equation of a sur-
face, then the surface is a cylinder with rulings parallel to the axis of the missing
variable.
2. It is useful to sketch surfaces in space by using the traces which are the intersec-
tion curves of the surface and planes parallel to one of the coordinate planes.

Example 1.6.4 (Elliptic cylinder). Identify and sketch in three-dimensional space the surfaces
(a) x² + 2y² = R² and (b) x² + z² = R².

Solution. (a) Since z is missing, this must be a cylinder with rulings parallel to the
z-axis. The graph of the equation x² + 2y² = R², for z = k (a constant), is an ellipse in
the plane z = k. Hence, the surface x² + 2y² = R² is an elliptic cylinder whose rulings
are parallel to the z-axis and, so, are vertical (see Figure 1.25(b)).
(b) Similarly, x² + z² = R² is a circular cylinder whose rulings are parallel to the
y-axis and thus they are horizontal.

Example 1.6.5 (Hyperbolic cylinder). Identify and sketch the surface $y^2-\frac{z^2}{9}=1$.

Solution. Since x is missing, this is a cylinder with rulings parallel to the x-axis. All
the traces for constant x are hyperbolas. This is a hyperbolic cylinder whose graph is
shown in Figure 1.25(c).

Example 1.6.6. Describe the surfaces of (a) x = sin y and (b) z = ln x.

Solution. (a) It is a cylinder with rulings parallel to the z-axis. The trace with z = 0 is
the curve x = sin y in the xy-plane. The cylinder is generated by moving this curve up
and down along the z-axis.

(b) It is a cylinder with rulings parallel to the y-axis. The trace with y = 0 is the
curve z = ln x in the xz-plane. The cylinder is generated by moving this curve left and
right along the y-axis.

1.6.3 Quadric surfaces

A quadric surface is the graph of a second-degree polynomial equation in the three variables
x, y, and z. The most general such equation is

\[ Ax^2+By^2+Cz^2+Dxy+Eyz+Fxz+Gx+Hy+Iz+J=0, \]

where A, B, C, . . . , J are constants. By translation and rotation of the axes (in algebra:
completing the square and making a linear transformation) it is possible to bring this
equation into one of the two standard forms,

\[ Ax^2+By^2+Cz^2+D=0 \quad\text{or}\quad Ax^2+By^2+Iz=0, \]

where A, B, C, and I are nonzero (otherwise, the graphs are cylinders). The signs (positive
or negative) of these constants and whether D is zero lead to the following list of
the types of quadric surfaces in three-dimensional space:
1. elliptic cone $\frac{z^2}{c^2}=\frac{x^2}{a^2}+\frac{y^2}{b^2}$ (D = 0),
2. ellipsoid $\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1$ (D ≠ 0),
3. hyperboloid of one sheet $\frac{x^2}{a^2}+\frac{y^2}{b^2}-\frac{z^2}{c^2}=1$ (D ≠ 0),
4. hyperboloid of two sheets $\frac{x^2}{a^2}-\frac{y^2}{b^2}-\frac{z^2}{c^2}=1$ (D ≠ 0),
5. elliptic paraboloid $z=\frac{x^2}{a^2}+\frac{y^2}{b^2}$,
6. hyperbolic paraboloid $z=\frac{x^2}{a^2}-\frac{y^2}{b^2}$.

One needs to be aware that the same surfaces with different orientations are obtained
when the roles of the variables are interchanged.
Like conic sections in two-dimensional space, quadric surfaces admit similar
geometric and physical properties, which makes them useful in designing satellite
dishes, headlamps, mirrors in telescopes, cooling towers for nuclear power plants,
water tanks, and so forth.
Using traces, it is not hard to sketch the graph of these quadric surfaces. They are
summarized in Figure 1.26.

1.6.4 Surface of revolution

Figure 1.26: Quadric surfaces.

One type of special surface in space is obtained by revolving a curve about a line. For
example, if we revolve the plane curve y = x² about the x-axis, we obtain a surface of
revolution in space. How do we find an equation for this surface? We consider a more
general case. Suppose we have a curve f(y, z) = 0 in the yz-plane (this means x = 0),
and we rotate this curve about the z-axis, as shown in Figure 1.27. To find an equation
for the surface, we consider a point P(x, y, z) on the surface. The point P is obtained
by revolving a point P0(0, y0, z0) on the original curve f(y, z) = 0 about the z-axis. Note
that P and P0 have the same height above the xy-plane and the same distance
from the z-axis; therefore, we have

\[ z_0=z \quad\text{and}\quad |y_0|=\sqrt{x^2+y^2}, \]

and, thus, (±√(x² + y²), z) must satisfy the equation f(y, z) = 0. Therefore, we have

\[ f\left(\pm\sqrt{x^2+y^2},\ z\right)=0. \]

This equation has three variables and is exactly an equation of the surface obtained
by rotating the curve f(y, z) = 0 in the yz-plane about the z-axis.

Figure 1.27: Surface of revolution.

Example 1.6.7. Find an equation of the surface of revolution formed by revolving a straight line L in
the yz-plane with equation z = ay about the z-axis.

Solution. As discussed above, an equation of the surface is

\[ z=\pm a\sqrt{x^2+y^2}. \]

That gives

\[ z^2=a^2(x^2+y^2). \]

This type of surface is called a circular cone. The graph of a circular cone is shown in
Figure 1.28.

(a) (b)

Figure 1.28: Surface of revolution, Example 1.6.7.

Using similar ideas, one can determine that

\[ f\left(y,\ \pm\sqrt{x^2+z^2}\right)=0 \]

is the equation of the surface obtained by revolving the curve f (y, z) = 0 in the yz-plane
about the y-axis. Also, an equation for the surface obtained by revolving a curve in a
coordinate plane other than the yz-plane about one of the axes can be determined in
a similar manner.

Example 1.6.8. Find an equation for each surface of revolution obtained by revolving the curve
$x^2+\frac49 y^2=1$ in the xy-plane about (a) the x-axis and (b) the y-axis.

Solution. For (a), a rotation about the x-axis, keep x unchanged and replace y with
±√(y² + z²), yielding

\[ x^2+\frac49\left(\pm\sqrt{y^2+z^2}\right)^2=1. \]

This simplifies to

\[ x^2+\frac49 y^2+\frac49 z^2=1, \]

which is an equation for the desired surface.
For (b), similarly, we keep y unchanged, and we replace x by ±√(x² + z²) in the equation
of the curve. We obtain

\[ x^2+z^2+\frac49 y^2=1, \]

which is an equation of that surface of revolution. The graphs of these surfaces of
revolution are shown in Figure 1.29.
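Both equations can be tested by actually rotating points of the ellipse and substituting the results. The following sketch does this with elementary rotation formulas; the parameterization `ellipse_point` and the sampled angles are illustrative choices, not part of the example.

```python
import math

def ellipse_point(t):
    """Point on x^2 + (4/9) y^2 = 1 in the xy-plane: (cos t, 1.5 sin t, 0)."""
    return (math.cos(t), 1.5 * math.sin(t), 0.0)

def rotate_about_x(p, angle):
    """Rotate p about the x-axis; x is unchanged."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)

def rotate_about_y(p, angle):
    """Rotate p about the y-axis; y is unchanged."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)

def residual_a(p):
    """x^2 + (4/9) y^2 + (4/9) z^2 - 1, the surface from part (a)."""
    x, y, z = p
    return x * x + 4 / 9 * y * y + 4 / 9 * z * z - 1

def residual_b(p):
    """x^2 + z^2 + (4/9) y^2 - 1, the surface from part (b)."""
    x, y, z = p
    return x * x + z * z + 4 / 9 * y * y - 1

angles = [2 * math.pi * k / 12 for k in range(12)]
worst_a = max(abs(residual_a(rotate_about_x(ellipse_point(t), a)))
              for t in angles for a in angles)
worst_b = max(abs(residual_b(rotate_about_y(ellipse_point(t), a)))
              for t in angles for a in angles)

print(worst_a, worst_b)  # both at rounding-error level
```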

(a) (b) (c)

Figure 1.29: Surface of revolution, Example 1.6.8.

1.7 Parameterized surfaces


As seen in Section 1.3.2, a plane with an equation ax + by + cz + d = 0 also has a vector
equation

r(u, v) = r0 + ua + vb.

In general, the graph of the vector-valued function r(u, v) with two independent pa-
rameters u and v is a surface in space. Its parametric form is

x = x(u, v), y = y(u, v), and z = z(u, v).



Example 1.7.1. Find a parameterization for each of the following surfaces:

(1) 2x + 4y − z = 5, (2) x² + 2y² = 1, (3) x² + y² + z² = 4, (4) z² = x² + 4y².

Solution.
1. Let x = u and y = v. Then z = 2u + 4v − 5. Or

{ x = u,
{
{ y = v,
{
{ z = 2u + 4v − 5.

2. Let x = cos u and $y=\frac{1}{\sqrt2}\sin u$. Since the equation does not involve the variable z,
z could be any real number. Thus,

\[ \begin{cases} x=\cos u,\\ y=\frac{1}{\sqrt2}\sin u,\\ z=v \end{cases} \]

is a parameterization for the cylinder x² + 2y² = 1.

3. For the sphere, one can check that

x = 2 cos ϕ cos θ, y = 2 cos ϕ sin θ, and z = 2 sin ϕ

is a parameterization.
4. For the cone, let x = u cos v and $y=\frac12 u\sin v$, so that z² = u². So

\[ x=u\cos v, \quad y=\frac12 u\sin v, \quad\text{and}\quad z=u \]

is a parameterization.

1.8 Intersecting surfaces and projection curves


Besides using a vector-valued function, there is another way of interpreting a curve in
space. As we have seen before, a line is the curve of intersection of two planes, and a
trace is, in fact, the curve of intersection of a plane and a surface in space.

Example 1.8.1. For each of the following curves, find two surfaces so that the curve is their intersection
curve:
1. the line r(t) = ⟨2 − 3t, 4 + t, −2 − 5t⟩,
2. the helix r(t) = ⟨2 cos t, 2 sin t, 0.5t⟩.

Solution.
1. The line has symmetric equations

\[ \frac{x-2}{-3}=\frac{y-4}{1}=\frac{z+2}{-5}, \]

so

\[ \begin{cases} \dfrac{x-2}{-3}=\dfrac{y-4}{1},\\[6pt] \dfrac{y-4}{1}=\dfrac{z+2}{-5}, \end{cases} \quad\text{which simplifies to}\quad \begin{cases} x+3y-14=0,\\ 5y+z-18=0. \end{cases} \]

So the line is the line of intersection of the two planes x + 3y − 14 = 0 and 5y + z − 18 = 0.
2. The helix has parametric equations

x = 2 cos t, y = 2 sin t, z = 0.5t.

Therefore, x² + y² = 4 and t = z/0.5 = 2z, so x = 2 cos(2z). Both of these equations describe cylinders.
Therefore, the helix is the curve of intersection of the two cylinders, and we have

\[ \begin{cases} x^2+y^2=4,\\ x=2\cos(2z). \end{cases} \]

Figure 1.30 shows the graphs of the cylinder x² + y² = 4 and the cylinder x = 2 cos(2z).

Figure 1.30: A helix is the curve of intersection of two cylinders.

In general, the graphs F(x, y, z) = 0 and G(x, y, z) = 0 are surfaces in space, and

\[ \begin{cases} F(x, y, z)=0,\\ G(x, y, z)=0 \end{cases} \]

describes the curve of intersection of the two surfaces. We call it a general equation of
the curve. Intuitively, if the system of equations has just one independent variable, say,
x, then y and z are dependent variables. Therefore, we have the parametric equations
x = x, y = y(x), and z = z(x), which describe a curve in space. For example,

\[ \begin{cases} z=x^2+2y^2,\\ z=3 \end{cases} \]

describes an ellipse in the z = 3 plane. It has parametric equations $x=\sqrt3\cos t$, $y=\sqrt{\tfrac32}\sin t$, and z = 3.

Example 1.8.2. Describe the curve given by the equations

\[ \begin{cases} z=x^2+y^2,\\ x^2+y^2+z^2=2. \end{cases} \]

Solution. Since z = x² + y² and x² + y² + z² = 2,

\[ z^2+z-2=0. \]

This means that either z = −2 (rejected, as z ≥ 0) or z = 1. Thus,

\[ \begin{cases} z=1,\\ x^2+y^2=1. \end{cases} \]

This curve is a circle in the z = 1 plane. It has parametric equations x = cos t, y = sin t,
and z = 1. This curve is the intersection curve of a paraboloid and a sphere with center
at the origin and radius √2, as shown in Figure 1.31(b).

(a) (b) (c)

Figure 1.31: Curves as intersections of surfaces.

We were fortunate that in the previous example, there exist nice equations to describe
the curve in space. However, in some cases, it might be hard to find a simple equation for
a space curve as the intersection curve of two surfaces. For example, consider

\[ \begin{cases} x+y+2z=0,\\ x^2+y^2+z^2=4 \end{cases} \quad\text{or}\quad \begin{cases} z=x^2+2y^2,\\ x^2+y^2=1. \end{cases} \]

We know that the first example is the intersection curve of a plane and a sphere. Intuitively,
we know this is a circle in space. But for the second one, it might be hard to
visualize it. To study a curve like this, it would be helpful to view it from the top or
side. This means that we can study its projection curves onto one of the coordinate
planes. How can we find equations for those projection curves? We first consider the
case where we project the curve

\[ \begin{cases} F(x, y, z)=0,\\ G(x, y, z)=0 \end{cases} \]

onto the xy-plane.


Well, the xy-plane has the feature that z = 0. So, if we try to eliminate the variable
z from the simultaneous equations, we might obtain an equation H(x, y) = 0. This
is actually a cylinder parallel to the z-axis, and this cylinder contains the curve of
intersection! This is because for any point on the curve of intersection, its coordinates
must satisfy both F(x, y, z) = 0 and G(x, y, z) = 0, and, therefore, satisfy H(x, y) = 0.
Since this cylinder is vertical, if we view it from above, then the projection curve will
be

\[ \begin{cases} H(x, y)=0,\\ z=0. \end{cases} \]

Similarly, if we want the projection curve on the xz- or yz-coordinate planes, we simply
eliminate the variable y or x from the simultaneous equations. This gives a cylinder
parallel to the y- or x-axis, respectively. The curve of intersection of the cylinder with
the xz- or yz-coordinate planes is the desired projection curve. So, finding projection
curves is not hard now.

Example 1.8.3. Find an equation for the projection curve of the intersecting curve of the plane
x + y + 2z = 0 and the sphere x 2 + y 2 + z 2 = 4:
1. onto the xy-plane,
2. onto the yz-plane.

Solution.
1. Onto the xy-plane, we eliminate the variable z (from the plane equation, z = −(x + y)/2) to obtain

\[ x^2 + y^2 + \left(-\frac{x+y}{2}\right)^2 = 4, \]

which simplifies to 5x² + 5y² + 2xy = 16. This is an elliptic cylinder, and the projection curve

\[ \begin{cases} 5x^2 + 5y^2 + 2xy = 16, \\ z = 0 \end{cases} \]

is an ellipse in the xy-plane.


2. Onto the yz-plane, we eliminate the variable x to obtain

(−y − 2z)2 + y2 + z 2 = 4,

which simplifies to 2y2 +5z 2 +4yz = 4. This is an elliptic cylinder, and the projection
curve

2y2 + 5z 2 + 4yz = 4,
{
x=0

is also an ellipse in the yz-plane. Figure 1.32 shows the graphs and projections.

(a) (b) (c) (d)

Figure 1.32: Curves as intersections of surfaces.
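
The elimination in Example 1.8.3 can be checked numerically. The sketch below is only an illustration: the helper names `curve_point` and `residuals` and the choice of the orthonormal vectors u and w are assumptions made here, not part of the text. It parametrizes the circle cut from the sphere by the plane and evaluates the two cylinder equations on it.

```python
import math

def curve_point(t):
    """Point on the circle where the plane x + y + 2z = 0 meets the sphere of radius 2."""
    # u and w are orthonormal and both perpendicular to the plane normal <1, 1, 2>
    # (this particular basis is an assumption of the sketch).
    u = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
    w = (1 / math.sqrt(3), 1 / math.sqrt(3), -1 / math.sqrt(3))
    return tuple(2 * (math.cos(t) * a + math.sin(t) * b) for a, b in zip(u, w))

def residuals(t):
    """Left side minus right side for the plane, the sphere, and both projection cylinders."""
    x, y, z = curve_point(t)
    plane = x + y + 2 * z
    sphere = x * x + y * y + z * z - 4
    xy_cylinder = 5 * x * x + 5 * y * y + 2 * x * y - 16
    yz_cylinder = 2 * y * y + 5 * z * z + 4 * y * z - 4
    return plane, sphere, xy_cylinder, yz_cylinder

for k in range(8):
    t = 2 * math.pi * k / 8
    print([round(v, 12) for v in residuals(t)])
```

All four residuals should vanish up to rounding, which is consistent with the curve lying on both cylinders 5x² + 5y² + 2xy = 16 and 2y² + 5z² + 4yz = 4.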

Example 1.8.4 (Viviani curve). The following curve C is given:

x 2 + y 2 + z 2 = 4,
{
(x − 1)2 + y 2 = 1.

Find the projection curve onto the xy-plane.

Solution. The curve is the intersection curve of a sphere and a circular cylinder. The
variable z is missing in the second equation, so there is no need to eliminate z because
the second equation is already a cylinder containing the curve C. So the projection
curve onto the xy-plane is a circle centered at (1, 0) with radius 1 (as in Figure 1.33)
given by

z = 0,
{
(x − 1)2 + y2 = 1.

Note. This curve has nice parametric equations. If x = 1 + cos t and y = sin t, then

\[
\begin{aligned}
z^2 &= 4 - (1 + \cos t)^2 - (\sin t)^2 \\
    &= 4 - 1 - 2\cos t - (\cos t)^2 - (\sin t)^2 \\
    &= 2 - 2\cos t \\
    &= 2(1 - \cos t) \\
    &= 4\sin^2\frac{t}{2}.
\end{aligned}
\]

Figure 1.33: Viviani curve, top view.

Thus, we can take z = 2 sin(t/2), letting t run over [0, 4π] so that both signs of z occur. A vector equation for this curve is, therefore,

\[ \mathbf{r}(t) = \left\langle 1 + \cos t,\ \sin t,\ 2\sin\frac{t}{2} \right\rangle. \]

Setting the z-component to 0, we have the projection curve onto the xy-plane, i. e.,

r(t) = ⟨1 + cos t, sin t, 0⟩,

which is the same as (x −1)2 +y2 = 1. The intersection curve of the cylinder (x −a)2 +y2 =
a2 and the sphere x2 + y2 + z 2 = 4a2 is called a Viviani curve, named after an Italian
mathematician. Figure 1.33 shows the top and side views of a Viviani curve.
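
As a quick sanity check of the parametrization in the note above, the following sketch (the function names are hypothetical) evaluates r(t) with z = 2 sin(t/2) and tests whether each point lies on both the sphere x² + y² + z² = 4 and the cylinder (x − 1)² + y² = 1.

```python
import math

def viviani(t):
    """Point of the Viviani curve with a = 1, using z = 2 sin(t/2)."""
    return (1 + math.cos(t), math.sin(t), 2 * math.sin(t / 2))

def on_both_surfaces(t, tol=1e-12):
    """True if r(t) satisfies the sphere and cylinder equations up to tol."""
    x, y, z = viviani(t)
    on_sphere = abs(x * x + y * y + z * z - 4) < tol
    on_cylinder = abs((x - 1) ** 2 + y * y - 1) < tol
    return on_sphere and on_cylinder

# t in [0, 4*pi] traces the whole curve, including the part with z < 0.
print(all(on_both_surfaces(4 * math.pi * k / 200) for k in range(201)))
```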

In some cases, we can also find projection curves onto some plane which is not
parallel to any of the coordinate planes.

Example 1.8.5. Find the projection line of the line

\[ L:\ \begin{cases} 2x - y + z = 0, \\ x - y - 2z + 10 = 0 \end{cases} \]

onto the plane y + 2z + 2 = 0.

Solution. It is easy to see that both planes 2x − y + z = 0 and x − y − 2z + 10 = 0 contain


L, and any linear combination of these equations gives a plane containing L. We have

(2x − y + z) + λ(x − y − 2z + 10) = 0,

where λ is any fixed number. Among all these planes that contain L, there must be
exactly one plane which contains the projection line of L onto the plane y + 2z + 2 = 0.
This plane is the one which is perpendicular to the plane y + 2z + 2 = 0. This means
that the normal vectors of the two planes are perpendicular, i. e.,

⟨0, 1, 2⟩ ⋅ ⟨2 + λ, −1 − λ, 1 − 2λ⟩ = 0.

Therefore, we have 0(2 + λ) + 1(−1 − λ) + 2(1 − 2λ) = 0, that is, 1 − 5λ = 0, with solution λ = 1/5. Hence, the plane containing L and the projection line is

\[ (2x - y + z) + \frac{1}{5}(x - y - 2z + 10) = 0, \quad\text{or}\quad 11x - 6y + 3z + 10 = 0. \]

The projection line of L is the line of intersection of this plane and the plane y + 2z + 2 = 0. So the projection line is

y + 2z + 2 = 0,
{
11x − 6y + 3z + 10 = 0.

Figure 1.34 shows these planes and the projection line onto the plane y + 2z + 2 = 0 in
blue.

Figure 1.34: Projection line.
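
The result of Example 1.8.5 can be cross-checked by projecting points of L directly. In this sketch the point (10, 20, 0) and the direction ⟨3, 5, −1⟩ used to parametrize L were worked out for the illustration (they are not given in the text), and the helper names are hypothetical.

```python
def project_onto_plane(p, n, d):
    """Orthogonal projection of the point p onto the plane n . x + d = 0."""
    s = (sum(a * b for a, b in zip(n, p)) + d) / sum(a * a for a in n)
    return tuple(a - s * b for a, b in zip(p, n))

def point_on_L(t):
    # (10, 20, 0) satisfies both equations of L, and
    # <3, 5, -1> = <2, -1, 1> x <1, -1, -2> is a direction vector of L.
    return (10 + 3 * t, 20 + 5 * t, -t)

def check(t):
    """Residuals of the two equations of the projection line at a projected point."""
    x, y, z = project_onto_plane(point_on_L(t), (0, 1, 2), 2)
    return (y + 2 * z + 2, 11 * x - 6 * y + 3 * z + 10)

for t in (-2, 0, 1.5, 7):
    print(check(t))
```

Both residuals should be zero up to rounding for every t, which agrees with the pair of equations found in the solution.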

1.9 Regions bounded by surfaces


Suppose a region in space is bounded by two surfaces F(x, y, z) = 0 and G(x, y, z) = 0.
How do you find its projection region onto one of the three coordinate planes? For
example, the region R is bounded by the paraboloid z = x² + y² and the upper hemisphere z = √(2 − x² − y²). The intersecting curve of the two surfaces is

\[ \begin{cases} z = x^2 + y^2, \\ z = \sqrt{2 - x^2 - y^2}. \end{cases} \]

Substituting x² + y² = z into the second equation gives z² + z − 2 = 0, so z = 1 (since z ≥ 0). Projecting this curve onto the xy-plane therefore gives the projection curve

z = 0 and x² + y² = 1,

and the projection region of R onto the xy-plane is, therefore,

z = 0 and x² + y² ≤ 1.

To project the region onto the xz-plane, we first note that the projection region of z = x² + y² onto the xz-plane is

y = 0 and z ≥ x².

The projection region of z = √(2 − x² − y²) onto the xz-plane is a half-disk,

y = 0 and 0 ≤ z ≤ √(2 − x²).

Therefore, the projection region of R onto the xz-plane is the region bounded by z = x² and z = √(2 − x²) in the plane y = 0.

1.10 Review
The main concepts discussed in this chapter are listed below.
1. Vector operations of addition, subtraction, and scalar multiplication, and the dot product and cross product:

\[ \mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3, \qquad \mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}. \]

2. Equations of lines and planes in space:

lines: r = r₀ + tv, r = ⟨x(t), y(t), z(t)⟩, or

\[ \begin{cases} a_1x + b_1y + c_1z = d_1, \\ a_2x + b_2y + c_2z = d_2, \end{cases} \quad\text{or}\quad \frac{x - x_0}{m} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \]

or x = x₀ + mt, y = y₀ + nt, and z = z₀ + pt;

planes: a(x − x₀) + b(y − y₀) + c(z − z₀) = 0 or ax + by + cz + d = 0.

3. Distance from a point P(x₀, y₀, z₀) to a plane ax + by + cz + d = 0:

\[ \text{distance} = \left|\frac{\overrightarrow{P_1P}\cdot\mathbf{n}}{|\mathbf{n}|}\right| = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}}, \]

where P₁ is any point in the plane and n is a normal vector of the plane.
4. Distance from a point P(x₀, y₀, z₀) to a line r = r₀ + tv:

\[ \text{distance} = \frac{|\overrightarrow{P_1P}\times\mathbf{v}|}{|\mathbf{v}|}, \]

where P₁ is any point on the line and v is a direction vector of the line.
5. Distance between two skew lines r = r₁ + tv₁ and r = r₂ + tv₂:

\[ \text{distance} = \frac{|\overrightarrow{P_1P_2}\cdot(\mathbf{v}_1\times\mathbf{v}_2)|}{|\mathbf{v}_1\times\mathbf{v}_2|}, \]

where P₁ is any point on the first line and P₂ is any point on the second line.
6. For a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩:
(a) r′(t) = ⟨x′(t), y′(t), z′(t)⟩, ∫ r(t) dt = ⟨∫ x(t) dt, ∫ y(t) dt, ∫ z(t) dt⟩,
(b) tangent line at t = t₀: r = r(t₀) + t r′(t₀),
(c) normal plane at t = t₀:

x′(t₀)(x − x(t₀)) + y′(t₀)(y − y(t₀)) + z′(t₀)(z − z(t₀)) = 0,

(d) length of curves: \( s = \int_a^b |\mathbf{r}'(t)|\,dt \), for a ≤ t ≤ b,
(e) curvature:

\[ \kappa = \left|\frac{d\mathbf{T}}{ds}\right| = \frac{1}{|\mathbf{r}'(t)|}\left|\frac{d\mathbf{T}}{dt}\right| = \frac{|\mathbf{r}'(t)\times\mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}, \]

(f) the principal unit normal vector: \( \mathbf{N} = \frac{d\mathbf{T}}{dt} \Big/ \left|\frac{d\mathbf{T}}{dt}\right| \),
(g) the unit binormal vector: B = T × N,
(h) torsion: \( \tau = -\mathbf{N}\cdot\frac{d\mathbf{B}}{ds} \).
7. The cylinders parallel to one of the axes: F(x, y) = 0, G(x, z) = 0, and H(y, z) = 0.
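8. Quadric surfaces:

\[
\begin{array}{ll}
\dfrac{z^2}{a^2} = \dfrac{x^2}{b^2} + \dfrac{y^2}{c^2} \quad \text{elliptic cone} & \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1 \quad \text{ellipsoid} \\[2ex]
\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1 \quad \text{hyperboloid of one sheet} & z = \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} \quad \text{elliptic paraboloid} \\[2ex]
\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1 \quad \text{hyperboloid of two sheets} & z = \dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} \quad \text{hyperbolic paraboloid}
\end{array}
\]
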

9. Surface of revolution: rotating the curve

\[ \begin{cases} f(y, z) = 0, \\ x = 0 \end{cases} \]

about the z-axis gives f(±√(x² + y²), z) = 0, and rotating it about the y-axis gives f(y, ±√(x² + z²)) = 0. Similar results hold for curves in other coordinate planes rotated about one of the axes.
10. Vector form of a plane: r = r0 + ua + vb.
11. Surfaces with vector parametric forms: r = ⟨x(u, v), y(u, v), z(u, v)⟩.
12. Finding the projections of curves by eliminating one of the variables x, y, or z.
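
The three distance formulas in items 3 to 5 translate almost directly into code. The following is a minimal sketch using plain tuples for vectors; the function names and the small test configurations are choices made for this illustration and are not taken from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def point_plane_distance(p, n, d):
    """Distance from p to the plane n . x + d = 0 (item 3)."""
    return abs(dot(n, p) + d) / norm(n)

def point_line_distance(p, r0, v):
    """Distance from p to the line r = r0 + t v (item 4)."""
    return norm(cross(sub(p, r0), v)) / norm(v)

def skew_lines_distance(r1, v1, r2, v2):
    """Distance between the skew lines r = r1 + t v1 and r = r2 + t v2 (item 5)."""
    n = cross(v1, v2)  # nonzero only if the lines are not parallel
    return abs(dot(sub(r2, r1), n)) / norm(n)

print(point_plane_distance((1, 2, 3), (0, 0, 1), 0))              # point to the plane z = 0
print(point_line_distance((0, 0, 1), (0, 0, 0), (1, 0, 0)))       # point to the x-axis
print(skew_lines_distance((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))
```

Note that `skew_lines_distance` divides by |v₁ × v₂|, so it is only meaningful for non-parallel lines, as in item 5.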

1.11 Exercises
1.11.1 Vectors

1. Sketch the following points in a three-dimensional coordinate system:

(1, 2, 3), (−1, 0, 2), (0, 2, 0), (0, −1, 1), (2, −1, 2), (2, 0, 0).

2. Find the three points that are symmetrical to M₀(x₀, y₀, z₀) about the x-axis, the xz-plane, and the origin, respectively.
3. If \( |\overrightarrow{AB}| = 11 \), A(4, −7, 1), and B(6, 2, z), find z.
4. If a⃗ = ⟨5, 7, 8⟩, b⃗ = ⟨3, −4, 6⟩, and c⃗ = ⟨−6, −9, −5⟩, then find the length and direc-
tion angles of the vector a⃗ + b⃗ + c.⃗
5. If r = i − 2j − 2k, then find the unit vector in the direction of r. Also, find the three
direction cosines.
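6. Which of the following expressions make sense? Which do not make sense? Explain your answers.
(1) \( (\vec a\cdot\vec b)\cdot\vec c \), (2) \( (\vec a\cdot\vec b)\,\vec c \), (3) \( |\vec a|(\vec b\cdot\vec c) \),
(4) \( (\vec a + \vec b)\cdot\vec c \), (5) \( \vec a + \vec b\cdot\vec c \), (6) \( |\vec a|(\vec b + \vec c) \).
7. Which of the following identities are true? Which are false? Explain your answers.
(1) \( |\vec a|\,\vec a = \vec a\cdot\vec a \), (2) \( (\vec a\cdot\vec b)(\vec a\cdot\vec b) = (\vec a\cdot\vec a)(\vec b\cdot\vec b) \),
(3) \( (\vec a\cdot\vec b)\,\vec c = \vec a\,(\vec b\cdot\vec c) \), (4) \( (\vec a + \vec b)\cdot(\vec a + \vec b) = \vec a\cdot\vec a + 2\,\vec a\cdot\vec b + \vec b\cdot\vec b \).
8. Simplify the following expressions:
(1) \( \vec i\times(\vec j + \vec k) - \vec j\times(\vec i + \vec k) + \vec k\times(\vec i + \vec j + \vec k) \),
(2) \( (2\vec a + \vec b)\times(\vec c - \vec a) + (\vec b - \vec c)\times(\vec a + \vec b) \).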
9. Prove Theorem 1.2.4.
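10. If \( \vec a = 3\vec i - \vec j - 2\vec k \), \( \vec b = \vec i + 2\vec j - \vec k \), and \( \vec c = \vec i + \vec j \), find
(1) \( \vec a\cdot\vec b \) and \( \vec a\times\vec b \), (2) \( \operatorname{Proj}_{\vec b}\,\vec a \), (3) \( (-2\vec a)\cdot 3\vec b \),
(4) the angle between \( \vec a \) and \( \vec b \), (5) \( (\vec a\times\vec b)\cdot\vec c \).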
11. Prove that a⋅(b × c) = (a × b) ⋅ c.
12. Prove that the four points A(2, −1, −2), B(1, 2, 1), C(2, 3, 0), and D(−1, 5, 4) are not
coplanar.
13. Prove the Cauchy-Schwarz inequality for three-dimensional vectors \( \vec a \) and \( \vec b \),

\[ |\vec a\cdot\vec b| \le |\vec a|\,|\vec b|. \]

A general form of the inequality is given by

\[ \left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2. \]

14. Use the projection method to show that in ℝ² the distance from a point P(x₀, y₀) to the line ax + by + c = 0 is

\[ \frac{|ax_0 + by_0 + c|}{\sqrt{a^2 + b^2}}. \]

15. If r1 , r2 , and r3 are three nonzero position vectors, show that their heads are
collinear if and only if

r1 × r2 + r2 × r3 + r3 × r1 = 0.

1.11.2 Lines and planes in space

1. Find an equation for the line that:


(1) passes through the two points (1, 3, 2) and (−1, 3, −2).
(2) passes through the point (1, 2, −3) and is parallel to the vector v = ⟨0, 1, 1⟩.
(3) passes through the point (−3, 2, 1) and is parallel to the x-axis.
(4) passes through the point (0, 3, 2) and is perpendicular to the xz-plane.
2. Find an equation for the plane that:
(1) passes through the point (6, 3, 2) and is parallel to another plane 3x − 7y + 5z −
12 = 0.
(2) passes through the points (1, 2, 1), (0, 1, 1), and (1, −1, 2).
(3) passes through the point (1, 7, −3) and contains the line x = 1 − 3t, y = t,
z = 2 − t.
3. Determine whether each pair of lines is parallel, perpendicular, intersecting, or skew.
(1) \( \ell_1: \dfrac{x+1}{2} = \dfrac{y-2}{0} = \dfrac{z-3}{2} \), \( \ell_2 \): x = 2 − t, y = 3 + 2t, and z = 4t,
(2) \( \ell_1 \): x = t, y = 0, and z = −2t, \( \ell_2: \dfrac{x+1}{2} = \dfrac{y+7}{9} = \dfrac{z}{-4} \),
(3) \( \ell_1 \): r = ⟨t, t, −3t⟩, \( \ell_2 \): r = ⟨1 + t, 2t, 5 − 3t⟩.

4. Find the shortest distance between the two lines

\[ \frac{x-1}{0} = \frac{y}{1} = \frac{z}{1} \quad\text{and}\quad \frac{x}{2} = \frac{y}{-1} = \frac{z+2}{0}. \]

5. Find the plane that passes through the point (1, 2, 1) and contains the line of inter-
section of the planes x − y + z = 2 and 2x − y − 2z = 1.
6. Find parametric equations of the line that passes through (4, 1, 3) and is parallel to the line

\[ \frac{x-3}{2} = \frac{y}{1} = \frac{z-1}{5}. \]

7. Find parametric equations of the line through (2, 1, 1) that intersects the line \( \dfrac{x+1}{3} = \dfrac{y+3}{1} = \dfrac{z}{2} \) and is parallel to the plane 3x − 4y + z = 10.
8. Find the angle between two planes 4x + 2y + 4z − 7 = 0 and 3x − 4y = 0.
9. Find the distance from the point A(1, 2, 3) to the line x = t, y = 4 − 3t, z = 3 − 2t.
10. Find the acute angle between the line

\[ \begin{cases} x + y + 3z = 0, \\ x - y - z = 0 \end{cases} \]

and the plane x − y − z + 1 = 0.
11. Find symmetric equations of the line that is contained in the plane π: x + y + z + 1 = 0, passes through the point of intersection of the plane π and the line

\[ L:\ \begin{cases} y + z + 1 = 0, \\ x + 2z = 0, \end{cases} \]

and is perpendicular to the line L.
12. Find the distance from the point (1, −1, 2) to the plane x − y + 2z = 2.
13. Find a vector equation in the form r = r0 + ur1 + vr2 for the plane 2x + 5y − z = 8.

1.11.3 Curves and surfaces in space

1. Sketch the following curves in space:


(1) r(t) = cos ti + 2j + 2 sin tk, 0 ≤ t ≤ 2π,
(2) r(t) = 2 cos ti + 2 sin tj + 3k, 0 ≤ t ≤ 2π.
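2. If \( \mathbf{r}(t) = \left\langle t e^{-t},\ \sin\frac{1}{t},\ t\sin\frac{1}{t} \right\rangle \), find \( \lim_{t\to 0}\mathbf{r}(t) \) and \( \lim_{t\to\infty}\mathbf{r}(t) \). Is r(t) continuous at t = 0?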
3. Find the point of intersection, if it exists, of the plane x + y = 0 and the curve
r(t) = ⟨t, sin t, cot t⟩.
4. Prove Theorem 1.5.2.
5. For the vector-valued function

\[ \mathbf{r}(t) = \left\langle \frac{\cos t}{\sqrt{2}} + \frac{\sin t}{\sqrt{3}},\ -\frac{\cos t}{\sqrt{2}} + \frac{\sin t}{\sqrt{3}},\ \frac{\sin t}{\sqrt{3}} \right\rangle, \]

find:
(a) an equation for its tangent line at t = π/2.
(b) an equation for its normal plane at t = π/2.
(c) an equation of the projection curve onto the xy-plane.
6. If \( \mathbf{r}'(t) = \left\langle t e^{t},\ t\cos t^2,\ -\dfrac{2t}{\sqrt{t^2+4}} \right\rangle \) and r(0) = ⟨0, 2, 4⟩, find r(t).
7. A particle is moving in space with velocity v(t) = ⟨3 cos t, 4 cos t, 5 sin t⟩. Initially
it starts at the origin. Find its position when t = π and the distance it traveled
during 0 ≤ t ≤ π.
8. Determine whether the curves
(1) r(t) = cos(t²) i + sin(t²) j, (2) r(t) = 5 cos t i + 3 sin t j + 4 sin t k
use arc length as a parameter. If not, find a description that uses arc length as a
parameter.
9. Prove Theorem 1.5.3.
10. For each curve

(1) r(t) = 3 cos t i + 3 sin t j + 2t k and (2) r(t) = eᵗ cos t i + eᵗ sin t j + eᵗ k,

find
(a) the unit tangent vector T at t = 0.
(b) the principal unit normal vector N at t = 0.

(c) the curvature of the curve at t = 0.


(d) the unit binormal vector B at t = 0.
(e) the torsion at t = 0.
11. Prove that \( \dfrac{d\mathbf{N}}{ds} = -\kappa\mathbf{T} + \tau\mathbf{B} \).
12. On a smooth curve C, the kissing circle (the circle of curvature or osculating circle, see Figure 1.22(c)) at a point P is the circle that (a) is tangent to C at P, (b) has the same curvature as C at P, and (c) lies on the same side of C as the principal normal vector N. The radius of this circle is 1/κ and is called the radius of curvature. If the curve is y = y(x), then the curvature of the curve at P(x₀, y₀) can be shown to be

\[ \frac{|y''(x_0)|}{\left(1 + [y'(x_0)]^2\right)^{3/2}}, \]

and the center of this circle at P(x₀, y₀) can be shown to be

\[ \left( x_0 - y'(x_0)\,\frac{1 + [y'(x_0)]^2}{y''(x_0)},\ \ y_0 + \frac{1 + [y'(x_0)]^2}{y''(x_0)} \right). \]

(a) Find an equation of the kissing circle for the parabola y = x² at x = 2.
(b) Find an equation of the kissing circle for the cycloid r(t) = ⟨t − sin t, 1 − cos t⟩ at t = π.
13. Find an equation of the sphere that passes through the point (2, −4, 3) and contains the circle

\[ \begin{cases} x^2 + y^2 = 5, \\ z = 0. \end{cases} \]

14. Identify and sketch the following surfaces:


(1) z = y², (2) x² + 8y² = 4, (3) x² − z² = 9,
(4) y = x³, (5) 4x² + y² + z²/16 = 1, (6) z = 2x² + 6y²,
2 2 2 x2 x2 y2
(7) z + 2y − 2x = 0, (8) z + y − 4
= 0, (9) 25
+ 16
− z 2 = 0,
(10) x²/25 + y²/16 − z² = 1, (11) −x²/9 − y²/16 + z² = 1.
15. Find the intersections of the given surface with the planes x = k, y = k, z = k and
identify the surface and sketch it.
(1) x2 + y2 + z 2 = 2az, (2) x2 + y2 = 2az, (3) x 2 + z 2 = 2az,
(4) x2 − y2 = z 2 , (5) x2 = 2az.
16. Find an equation for the surface obtained by rotating the parabola y = x 2 , z = 0
about the y-axis.
17. Find an equation for the surface S obtained by rotating the curve

\[ \begin{cases} y^2 = 6 - z, \\ x = 0 \end{cases} \]

about the z-axis. Find the projection curve onto the xy-plane of the curve of intersection of the surface S and the cone z = √(x² + y²).
18. Find an equation of the cylinder consisting of all lines parallel to the direction ⟨2, 1, −1⟩ and passing through the curve

\[ \begin{cases} y^2 - 4x = 0, \\ z = 0. \end{cases} \]

19. Find the projection curve of
(1) \( \begin{cases} z = xy, \\ x^2 + 2y^2 = 1 \end{cases} \) onto the xy-plane,
(2) r(t) = ⟨1 + t², sin t, t⟩ onto the xz-plane,
(3) z = x² + 2y² and x² + y² = 1 onto (a) the xy-plane and (b) the yz-plane.
20. Find the projection regions of
(1) the paraboloid z = x2 + y2 ,
(2) the solid bounded by the cone z = √x2 + 2y2 and the sphere x 2 + y2 + z 2 = 1
onto the three coordinate planes.
21. Find an equation of the projection line of the line r(t) = ⟨2 + t, 3 − 2t, 4t⟩ onto the
plane x + y − z = 1.
22. Try to find an equation for the projection curve of the curve r(t) = ⟨2 cos t, 2 sin t, 4t⟩
onto the plane x − 2y + 3z = 2.
23. (Ruled surfaces) In geometry, a surface S is ruled (also called a scroll) if through
every point of S there is a straight line that lies on S. For example, a plane and a
circular cone are both ruled surfaces. A surface is doubly ruled if through every
point of S there are two distinct lines that lie on the surface.
Show that:
(a) the cylinder x2 + 2z 2 = 1 is ruled.
(b) the hyperbolic paraboloid z = x2 − y2 and the hyperboloid of one sheet x 2 +
y2 − z 2 = 1 are doubly ruled surfaces.
2 Functions of multiple variables
In single-variable calculus, a function depends on only one variable. However, in the
real world, physical quantities often depend on two or more variables. For example,
the volume V of a circular cone depends on its base radius r and height h, so it is a
function of two variables. The temperature T of a city in China depends on the time t,
the longitude x, and the latitude y of the city. The temperature T here is a function of
three variables. A smart person’s IQ may depend on his genes, thinking skills, edu-
cation, and so forth. It is a function of more than three variables. In this chapter, we
study multivariable functions and apply differential calculus to such functions.

2.1 Functions of multiple variables


2.1.1 Definitions

We first start with functions of two variables. Similar to functions of one variable, a
real-valued function f of two variables is defined as follows.

Definition 2.1.1. A real function of two real variables is a rule f (also called a mapping or correspon-
dence) that assigns to each ordered pair of real numbers (x, y) in a set D ⊂ ℝ2 a unique number z ∈ ℝ.
We denote this rule by z = f (x, y). The set D is called the domain of f . The set {f (x, y)|(x, y) ∈ D} is
called the range of the function f . The variables x and y are called the independent variables and z is
called the dependent variable.

We usually denote a function of two variables explicitly by the equation z = f (x, y)


or implicitly by the equation F(x, y, z) = 0. Usually, we assume that the domain of a
function of two variables is all possible ordered pairs (x, y) of real numbers for which
f (x, y) makes physical and/or mathematical sense. That is, when the domain is not
otherwise specified, we take it to be

D = domain of f = {(x, y) : f (x, y) is defined}

or, if f describes some real-life situation,

D = domain of f = {(x, y) : f (x, y) makes physical sense}.

The domain of a function of two variables can have interior points and boundary
points, and it may be an open or closed region in the xy-plane, just the way the do-
mains of one-variable functions defined on subsets of the real number line can (see
Figure 2.1(b) and (c)). We give the following important definitions.

https://doi.org/10.1515/9783110674378-002

(a) (b) (c)

Figure 2.1: Functions of two variables, domain, range, boundary point, interior point.

Definition 2.1.2.
1. A point (x0 , y0 ) in a region (set) R in the xy-plane is an interior point of R if there exists a disk with
center (x0 , y0 ) that lies entirely in R.
2. A point (x0 , y0 ) is a boundary point of R if every disk centered at (x0 , y0 ) contains points that lie
outside of R as well as points that lie in R. The boundary point may or may not belong to R.
3. The interior points of a region make up the interior of the region.
4. The region’s boundary points make up its boundary.
5. A region is open if it consists entirely of interior points.
6. A region is closed if it contains all its boundary points.
7. A region in the plane is bounded if it lies inside a disk of fixed radius.
8. A region is unbounded if it is not bounded.

The structures of domains of two-variable functions can vary considerably from one
function to another, as shown in the following examples.

Example 2.1.1. The function f defined by

f (x, y) = √ (4 − x 2 − y 2 )(x 2 + y 2 − 1)

has a maximum domain consisting of all (x, y) satisfying (4 − x 2 − y 2 )(x 2 + y 2 − 1) ≥ 0 ensuring the
square root exists. Solving this inequality gives

D = {(x, y) : 1 ≤ x 2 + y 2 ≤ 4}.

One can easily draw a picture of this domain, which is an annulus (ring), as shown in Figure 2.2(a).

Example 2.1.2. Find the domain of the function \( z = \sqrt{1 - \dfrac{x^2}{a^2} - \dfrac{y^2}{b^2}} + \ln(x + y) \) (a > 0 and b > 0).

Solution. The function z is defined only for those pairs (x, y) satisfying 1 − x²/a² − y²/b² ≥ 0 and x + y > 0, so the domain for this function of two variables is

\[ D = \left\{ (x, y)\ \middle|\ \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1 \text{ and } x + y > 0 \right\}. \]

(a) (b) (c)

Figure 2.2: Domains of functions of two variables, Example 2.1.1, Example 2.1.2, and Example 2.1.3.

It is a region bounded by the elliptical curve and the line y = −x. This is neither an
open nor a closed region, as shown in Figure 2.2(b).

Example 2.1.3. Find the domain and range of the function z defined by

z(x, y) = √9 − x 2 − y 2 .

Solution. The domain D of z is found by noticing that the expression under a square root cannot be negative, i.e.,

D = {(x, y) | 9 − x² − y² ≥ 0} = {(x, y) | x² + y² ≤ 9},

which is the disk with center (0, 0) and radius 3. The range of z is

{z | z = √(9 − x² − y²), for all (x, y) ∈ D}.

Since z is a nonnegative square root, z ≥ 0, and the domain restriction 0 ≤ x² + y² ≤ 9 shows that

0 ≤ √(9 − x² − y²) ≤ 3.

Hence, the range is the set of real numbers

{z | 0 ≤ z ≤ 3} = [0, 3].

The graph of the function z = √(9 − x² − y²) is shown in Figure 2.2(c).

2.1.2 Graphs and level curves

As seen in the previous chapter, the set of points (x, y, z) whose coordinates satisfy the equation z = f(x, y) forms a surface in space. This surface is called the graph of the function z = f(x, y). For instance, the graph of the function z = 3x − y + 3 is a plane, the graph of the function z = √(a² − x² − y²) is the upper hemisphere of radius a, and the graph of z = x² − 3y² is a hyperbolic paraboloid in space.
It might be hard to visualize or sketch the graph of a function of two variables, for example, z = x² − 4y², which is a hyperbolic paraboloid.
However, we can use the ideas of contour curves and level curves to help visualize the
graph. One may already have an idea from the daily weather forecasts or topographic
mappings of a mountain, as shown in Figure 2.3.

Figure 2.3: Graphs, contours, and level curves.

Assume that you are walking on the surface of a mountain, and the surface is the graph
of a function z = f (x, y). If you walk along a path on which your elevation remains
constant, say, z0 , which is actually the height above the bottom plane of the mountain,
then the path is part of a contour curve which is the intersecting curve of the surface
z = f (x, y) and the plane z = z0 . When the contour curves are projected onto the
xy-plane, those projection curves are called level curves.

Example 2.1.4. Find and sketch the level curves of the following surfaces:
1. z = x 2 − 4y 2 ,
2. f (x, y) = x 4 + y 4 + 8xy.

Solution.
1. The level curves are described by the equations

z0 = x 2 − 4y2 ,
{
z = 0,

where z0 is a constant. For z0 = 0, the level curve is two straight lines x = ±2y
in the xy-plane. For all values of z0 ≠ 0, the level curves are hyperbolas in the
xy-plane. Setting z0 = 1, 4, 8, −1, −4, −8 enables us to obtain Figure 2.4.

Figure 2.4: Graph and level curves for z = x² − 4y².

Figure 2.5: Graph of z = x⁴ + y⁴ + 8xy and its level curves.

2. We have not seen the graph of f (x, y) before. We use a graphing utility to help
sketch the surface and graph the level curves, as shown in Figure 2.5.

2.1.3 Functions of more than two variables

Likewise, a function f of three variables x, y, and z is a rule that assigns to each ordered
triple (x, y, z) in a domain set D ⊂ ℝ3 a unique real number u, and we write

u = f (x, y, z)

to indicate that it is a function of these three variables. If the domain of a function f of three variables x, y, and z is not specified, then it is usually taken to be the set of ordered triples (x, y, z) in ℝ³ for which f(x, y, z) makes physical or mathematical sense.

The graph of a function of three variables is the set of those points (x, y, z, u) in
four-dimensional space where u = f (x, y, z)! Therefore, we cannot visualize it in three-
dimensional space. To get some sense of the graph, we could use an idea similar to
level curves. If we set u0 = f (x, y, z), for any constant u0 , the graph of u0 = f (x, y, z) is
a level surface in space. For example, the level surfaces of the function u = x 2 + y2 + z 2
are spheres in space.
Although most of the functions that we will work with in this textbook will be func-
tions of two or three variables, scientists, engineers, and mathematicians often need
to work with functions of four or more variables. Those functions are defined in a sim-
ilar way. In general, a function f of n variables is a rule that assigns a unique number
u to an n-tuple (x1 , x2 , . . . , xn ) ∈ ℝn of real numbers, and we write u = f (x1 , x2 , . . . , xn ).
Sometimes we use vector notation to write such functions more compactly. That is, if
the ordered n-tuple is considered to be a vector x = ⟨x1 , x2 , . . . , xn ⟩, then we write f (x)
in place of f (x1 , x2 , . . . , xn ), and we can then write the function compactly as u = f (x),
or u = f (P), for P ∈ ℝn .

2.1.4 Limits

Limits for functions of two variables are required to develop the calculus of functions
of two variables. They can often be interpreted in much the same way as we interpret
limits of functions of one variable. For example, the statement

\[ \lim_{(x,y)\to(2,-1)} 3x^2 y = -12 \]

means that the value of 3x²y gets closer and closer to −12 as (x, y) gets closer and closer
to (2, −1). It may seem obvious that if (x, y) is close to (2, −1), then x is close to 2 and y
is close to −1, so the value of 3x²y is close to 3(2²)(−1) = −12. However, we will soon
see that the limit of a function of two variables is not always this clear. Vague phrases
like “closer and closer to” can be hard to interpret in some circumstances, and we can
avoid most of these difficulties by defining the limit more precisely.

Definition 2.1.3. Let D be a region in the xy-plane and (a, b) be an interior point of D. Let f be a real-
valued function of two variables defined on D except possibly at (a, b). We say that a real number L is
the limit of f as (x, y) approaches (a, b), and we write

lim f (x, y) = L
(x,y)→(a,b)

if for every number ε > 0 there is a number δ > 0 (the value of δ depends on ε) such that

\[ |f(x, y) - L| < \varepsilon \quad \text{for all } (x, y) \in D \text{ satisfying } 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta. \]

We also write f(x, y) → L (meaning f(x, y) approaches L) as (x, y) → (a, b) (meaning (x, y) approaches (a, b)).

Note that the set of points (x, y) satisfying the condition 0 < √((x − a)² + (y − b)²) < δ forms a punctured open disk with center (a, b) and radius δ ("punctured" means one point, the center (a, b), is excluded and "open" means the boundary circle is not included). This punctured disk is sometimes denoted as Ů((a, b), δ). Note that we can relax the requirement that the point (a, b) be in the interior of D as long as for every δ > 0 the punctured disk Ů((a, b), δ) contains elements of D where f is defined. For example, \( \lim_{(x,y)\to(0,0)} \sqrt{x + y} \) exists, even though √(x + y) is not defined for x + y < 0.

Example 2.1.5. Show that lim(x,y)→(1,2) (2x + 4y) = 10.

Proof. For any given ε > 0, we choose δ = ε/8. Then when

0 < √((x − 1)² + (y − 2)²) < δ,

we have |x − 1| < ε/8 and |y − 2| < ε/8, and then

\[
\begin{aligned}
|(2x + 4y) - 10| &= |2(x - 1) + 4(y - 2)| \\
&\le 2|x - 1| + 4|y - 2| \\
&< 2\cdot\frac{\varepsilon}{8} + 4\cdot\frac{\varepsilon}{8} = \frac{3\varepsilon}{4} < \varepsilon.
\end{aligned}
\]
So, by the definition, we conclude that lim(x,y)→(1,2) (2x + 4y) = 10.
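
The choice δ = ε/8 in Example 2.1.5 can be probed numerically. The sketch below samples points in the disk of radius δ around (1, 2) and records the largest value of |(2x + 4y) − 10|; the function name, the sample count, and the fixed seed are assumptions of this illustration, and random sampling only supports the proof, it does not replace it.

```python
import math
import random

def worst_error(eps, samples=10000, seed=0):
    """Largest sampled |(2x + 4y) - 10| over points within delta = eps/8 of (1, 2)."""
    rng = random.Random(seed)
    delta = eps / 8
    worst = 0.0
    for _ in range(samples):
        r = delta * rng.random()              # distance from (1, 2), in [0, delta)
        theta = 2 * math.pi * rng.random()    # direction of approach
        x = 1 + r * math.cos(theta)
        y = 2 + r * math.sin(theta)
        worst = max(worst, abs(2 * x + 4 * y - 10))
    return worst

for eps in (1.0, 0.1, 0.001):
    print(eps, worst_error(eps))
```

In each case the reported worst error should stay below ε, in line with the bound obtained in the proof.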

Example 2.1.6. Assume \( f(x, y) = \dfrac{xy}{\sqrt{x^2 + y^2}} \). Show that

\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0. \]

Proof. The domain of the function f(x, y) is D = ℝ² \ {(0, 0)}. Since |2xy| ≤ x² + y², it follows that

\[ |f(x, y) - 0| = \left|\frac{xy}{\sqrt{x^2 + y^2}} - 0\right| = \frac{|xy|}{\sqrt{x^2 + y^2}} \le \frac{1}{2}\cdot\frac{x^2 + y^2}{\sqrt{x^2 + y^2}} = \frac{\sqrt{x^2 + y^2}}{2}. \]

Therefore, for any ε > 0, we can ensure that |f(x, y) − 0| < ε if we choose δ = 2ε. This is because when

0 < √((x − 0)² + (y − 0)²) = √(x² + y²) < δ = 2ε,

we have

\[ |f(x, y) - 0| \le \frac{\sqrt{x^2 + y^2}}{2} < \frac{2\varepsilon}{2} = \varepsilon. \]

That is, whenever the point P(x, y) ∈ ℝ² \ {(0, 0)} and |OP| < δ = 2ε, we always have

|f(x, y) − 0| < ε.

Thus,

\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0. \]

Note. The function f (x, y) is not defined when (x, y) = (0, 0); however, a limit can exist
at a point where the function is not defined.

The definitions and properties of limits of functions of two variables are very sim-
ilar to those of one-variable functions, and can be extended in a very similar way to
functions of more variables.

Theorem 2.1.1 (Limit laws). Suppose that lim(x,y)→(a,b) f (x, y) = L, lim(x,y)→(a,b) g(x, y) = M, where L
and M are real numbers. Then:
1. lim(x,y)→(a,b) (f (x, y) ± g(x, y)) = L ± M (sum/difference rule),
2. lim(x,y)→(a,b) f (x, y)g(x, y) = LM (product rule),
3. \( \lim_{(x,y)\to(a,b)} \dfrac{f(x, y)}{g(x, y)} = \dfrac{L}{M} \) (quotient rule, given that M ≠ 0).

Example 2.1.7. Show that \( \lim_{(x,y)\to(0,2)} \dfrac{\sin xy}{x} = 2 \).

Solution. We know the one-variable limit \( \lim_{x\to 0} \dfrac{\sin x}{x} = 1 \), and we can use this here by considering the product xy to be a single variable such that xy → 0 as (x, y) → (0, 2), as follows:

\[ \lim_{(x,y)\to(0,2)} \frac{\sin(xy)}{x} = \lim_{(x,y)\to(0,2)} \left[\frac{\sin(xy)}{xy}\cdot y\right] = \lim_{xy\to 0} \frac{\sin(xy)}{xy}\cdot\lim_{y\to 2} y = 1\cdot 2 = 2. \]

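Example 2.1.8. Evaluate \( \lim_{(x,y)\to(0,1)} \dfrac{3 - \sqrt{xy + 9}}{xy} \).

Solution. Let t = xy. Then t → 0. We have

\[
\begin{aligned}
\lim_{(x,y)\to(0,1)} \frac{3 - \sqrt{xy + 9}}{xy} &= \lim_{t\to 0} \frac{3 - \sqrt{t + 9}}{t} = \lim_{t\to 0} \frac{(3 - \sqrt{t + 9})(3 + \sqrt{t + 9})}{t(3 + \sqrt{t + 9})} \\
&= \lim_{t\to 0} \frac{-t}{t(3 + \sqrt{t + 9})} = \lim_{t\to 0} \frac{-1}{3 + \sqrt{t + 9}} = -\frac{1}{6}.
\end{aligned}
\]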
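
A numerical look at Example 2.1.8 illustrates why the rationalized form is also the better one to compute with. In this sketch `h` evaluates the quotient as written and `h_rationalized` evaluates −1/(3 + √(xy + 9)); both names are hypothetical.

```python
import math

def h(x, y):
    """The quotient (3 - sqrt(xy + 9)) / (xy), evaluated as written (needs xy != 0)."""
    t = x * y
    return (3 - math.sqrt(t + 9)) / t

def h_rationalized(x, y):
    """The same function after multiplying by the conjugate."""
    return -1 / (3 + math.sqrt(x * y + 9))

# Approach (0, 1) along the line y = 1 and compare with -1/6.
for x in (1e-1, 1e-3, 1e-5):
    print(x, h(x, 1.0), h_rationalized(x, 1.0), -1 / 6)
```

The direct form subtracts two nearly equal numbers when xy is small, so the rationalized form is the numerically safer choice very close to the limit point.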

For limits of functions of a single variable, when we let x approach a number a, there are only two possible directions of approach: from the left or from the right. If the left-hand limit and right-hand limit differ, \( \lim_{x\to a^-} f(x) \ne \lim_{x\to a^+} f(x) \), then \( \lim_{x\to a} f(x) \) does not exist.


For functions of two variables, the situation is different. This is because we can
let (x, y) approach (a, b) from an infinite number of directions in any manner so long
as (x, y) stays within the domain of f , as shown in Figure 2.6(b). The existence of the
limit lim(x,y)→(a,b) f (x, y) means that f (x, y) approaches the same value no matter in
what direction (x, y) approaches (a, b). Therefore, if there are two different routes for
(x, y) → (a, b) along which the function f (x, y) approaches different values, then we
can conclude that the limit lim(x,y)→(a,b) f (x, y) does not exist.

(a) (b)

Figure 2.6: Limits of functions of two variables.

Example 2.1.9. Investigate whether or not the limit lim(x,y)→(0,0) f (x, y) exists when f is defined in two
parts by

\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{when } (x, y) \ne (0, 0), \\[1.5ex] 0 & \text{when } (x, y) = (0, 0). \end{cases} \]

Solution. It is easy to check that if (x, y) → (0, 0) along the x-axis, then f (x, y) → 0.
If (x, y) → (0, 0) along the y-axis, then f (x, y) → 0. Now, we have obtained identical
limits along the two axes. However, this does not show that the given limit is 0. If we
let (x, y) approach (0, 0) along the line y = kx, we have

\[ \lim_{(x,y)\to(0,0),\ y = kx} \frac{xy}{x^2 + y^2} = \lim_{x\to 0} \frac{kx^2}{x^2 + k^2x^2} = \frac{k}{1 + k^2}. \]

Obviously, this limit varies with different values of k. So the limit lim(x,y)→(0,0) f (x, y)
does not exist.
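
The path dependence in Example 2.1.9 is easy to observe numerically. The following sketch (function names chosen for this illustration) evaluates f at a point very close to the origin on the line y = kx for several slopes k.

```python
def f(x, y):
    """The function of Example 2.1.9, with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

def along_line(k, x=1e-6):
    """Value of f at the point (x, kx), close to the origin on the line y = kx."""
    return f(x, k * x)

for k in (0, 1, 2, -1):
    print(k, along_line(k), k / (1 + k * k))
```

The two printed columns should agree, and they change with k, which is exactly why no single limiting value exists at the origin.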

Example 2.1.10. Investigate

x2y
lim .
(x,y)→(0,0) x 4+ y2
74 | 2 Functions of multiple variables

Solution. Allowing (x, y) to approach (0, 0) along any line y = kx, the limit is

x2 y kx3 kx
lim = lim 4 = lim 2 = 0.
(x,y)→(0,0) x 4 +y 2 x→0 x + k 2 x 2 x→0 x + k 2

However, this is not sufficient to prove the existence of the limit, even though we have
infinitely many paths along which the limits are all 0. Let us see what happens when
the path is a parabola, say, y = mx2 . Then the limit is

x2 y mx4 m
lim = lim 4 4 4
= .
4
(x,y)→(0,0) x + y 2 x→0 x +m x 1 + m4
The limit depends on m! This means that when (x, y) approaches (0, 0) along different
parabolas, we have different limits. Therefore, we conclude that the limit

\[
\lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^4 + y^2}
\]

does not exist.
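A short numerical sketch (Python, standard library only; the helper names along_line and along_parabola are invented for this illustration) shows the same contrast: values sampled along straight lines shrink toward 0, while values sampled along a parabola y = mx² stay at a nonzero constant that depends on m.

```python
def g(x, y):
    """g(x, y) = x^2 y / (x^4 + y^2), defined away from the origin."""
    return x * x * y / (x ** 4 + y * y)


def along_line(k, x):
    """Value of g at (x, kx), a point on the line y = kx."""
    return g(x, k * x)


def along_parabola(m, x):
    """Value of g at (x, m x^2), a point on the parabola y = m x^2."""
    return g(x, m * x * x)


for x in (1e-1, 1e-2, 1e-3):
    # On y = 2x the values tend to 0; on y = 2x^2 they stay at m/(1+m^2)
    # with m = 2, because y^2 = m^2 x^4 on the parabola.
    print(x, along_line(2.0, x), along_parabola(2.0, x))
```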

Iterated limits
Now we consider the two iterated limits

\[
\lim_{x\to a}\,\lim_{y\to b} f(x, y)
\quad\text{and}\quad
\lim_{y\to b}\,\lim_{x\to a} f(x, y).
\]

The limit limx→a limy→b f (x, y) means we evaluate the one-variable limit limy→b f (x, y)
first holding x as a constant, and then evaluate the one-variable limit
limx→a (limy→b f (x, y)) letting x → a. Note that the two iterated limits are actually
two specific paths by which a point (x, y) approaches (a, b). Therefore, we have the
following theorem.

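Theorem 2.1.2. If lim(x,y)→(a,b) f(x, y) exists, and the inner one-variable limits limy→b f(x, y) (for each x near a)
and limx→a f(x, y) (for each y near b) also exist, then both limx→a limy→b f(x, y) and limy→b limx→a f(x, y)
exist and

\[
\lim_{(x,y)\to(a,b)} f(x, y)
= \lim_{x\to a}\,\lim_{y\to b} f(x, y)
= \lim_{y\to b}\,\lim_{x\to a} f(x, y).
\]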

This theorem indicates a way to evaluate the limit of a function of two variables, that
is, to evaluate two one-variable limits given that all limits involved exist. For instance,

\[
\lim_{(x,y)\to(1,2)} (2x + 4y)
= \lim_{x\to 1}\Bigl(\lim_{y\to 2} (2x + 4y)\Bigr)
= \lim_{x\to 1} (2x + 8) = 10.
\]

However, the converse is not true. For instance,


\[
\lim_{x\to 0}\,\lim_{y\to 0} \frac{xy}{x^2 + y^2}
= \lim_{y\to 0}\,\lim_{x\to 0} \frac{xy}{x^2 + y^2} = 0,
\]

but lim(x,y)→(0,0) xy/(x² + y²) does not exist.

For functions of more than two variables, say, n variables, we can define the limit
at a point P0 ∈ ℝn in a similar manner. In a compact notation, limP→P0 f (P) = L means
that for any given ε > 0, there is a δ > 0 such that whenever 0 < |PP0 | < δ, we have
|f (P)−L| < ε. Also, the limit laws apply to limits of functions of more than two variables
as well.

Example 2.1.11. Find

\[
\lim_{(x,y,z)\to(1,1,1)} \frac{\sqrt{xy} + \sqrt{yz} - \sqrt{xz} - z}{\sqrt{xz} + \sqrt{yz} - \sqrt{xy} - y}.
\]

Solution. This limit requires (x, y, z) to approach (1, 1, 1), which is a boundary point of
the domain of the function. We can assume all x, y, and z are positive and try factor-
ization. We obtain
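\[
\begin{aligned}
\lim_{(x,y,z)\to(1,1,1)} \frac{\sqrt{xy} + \sqrt{yz} - \sqrt{xz} - z}{\sqrt{xz} + \sqrt{yz} - \sqrt{xy} - y}
&= \lim_{(x,y,z)\to(1,1,1)} \frac{\sqrt{x}(\sqrt{y} - \sqrt{z}) + \sqrt{z}(\sqrt{y} - \sqrt{z})}{\sqrt{x}(\sqrt{z} - \sqrt{y}) + \sqrt{y}(\sqrt{z} - \sqrt{y})} \\
&= \lim_{(x,y,z)\to(1,1,1)} \frac{(\sqrt{x} + \sqrt{z})(\sqrt{y} - \sqrt{z})}{(\sqrt{x} + \sqrt{y})(\sqrt{z} - \sqrt{y})} \\
&= \lim_{(x,y,z)\to(1,1,1)} \left(-\frac{\sqrt{x} + \sqrt{z}}{\sqrt{x} + \sqrt{y}}\right) = -1.
\end{aligned}
\]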

2.1.5 Continuity

Recall that evaluating limits of a continuous function f of a single variable is easy
because f is defined to be continuous at x = a when limx→a f(x) = f(a). That is,
evaluation of the limit can be accomplished by direct substitution of x = a in f(x). This
definition includes three separate conditions: (a) the limit must exist as x → a, (b) the
function value f (a) must exist, and finally (c) the limit and this function value must
be the same. Similarly, we can define the continuity for functions of two variables.

Definition 2.1.4. Let f be a function of two variables, and let (a, b) be in its domain 𝒟. We say that f is
continuous at (a, b) if

\[
\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b).
\]

If f is not continuous at (a, b), then we say that f is discontinuous at (a, b) and that f has a discontinuity
at (a, b). If f is continuous at every point in 𝒟, then we say that f is continuous on 𝒟.

For example, the function


\[
f(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2} & \text{when } x^2 + y^2 \neq 0, \\
0 & \text{when } x^2 + y^2 = 0
\end{cases}
\]

is discontinuous at (0, 0) because the limit does not even exist there, as shown in an
earlier example.

All the points on the circle C = {(x, y) | x² + y² = 1} are discontinuities of the function
f(x, y) = sin(1/(x² + y² − 1)) because the function is not defined at any point of this circle. It is
also possible to define a function f that is discontinuous at (a, b) such that the limit of
f exists as (x, y) → (a, b) and f (a, b) exists, but the two values are different.
Similar to the continuity of functions of one variable, all elementary functions of
two variables are continuous on their natural domains. That is, the limit of an elemen-
tary function f of two variables at point (a, b) in its domain is given by

\[
\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b).
\]

Example 2.1.12. Find lim(x,y)→(1,π) (cos(xy) + sin(xy))/(x + y).

Solution. The function (cos(xy) + sin(xy))/(x + y) is an elementary function and its domain is

D = {(x, y)|x + y ≠ 0}.

Since (1, π) lies in its domain, this function is continuous at the point (1, π) and

\[
\lim_{(x,y)\to(1,\pi)} \frac{\cos(xy) + \sin(xy)}{x + y}
= \frac{\cos\pi + \sin\pi}{1 + \pi}
= -\frac{1}{1 + \pi} \approx -0.241.
\]

The continuity of functions of more than two variables is defined similarly using
the compact notation.

Definition 2.1.5. A function f of n variables is continuous at P0 , if and only if limP→P0 f (P) = f (P0 ),
where P and P0 are points in ℝn .

Finally, the extreme value theorem, intermediate value theorem, and uniform continu-
ity theorem also hold for continuous functions f of n variables defined on a closed,
bounded region in ℝn .

2.2 Partial derivatives


2.2.1 Definition

Suppose that a pollution index Q depends on two factors, x and y, which are outputs
of pollutants from two factories. Now, you have a little money in hand and can invest
it in only one of the factories to reduce the pollution index. Which factory will you put
your money on? Of course, one would like to choose the factory whose small change
in output of pollutants will result in the greatest drop in Q. This involves the idea of
partial derivatives in which we hold one of the variables constant, and try to find the
rate of change of a function with respect to the other variable. So a partial derivative
is simply a one-variable differentiation applied to a two- (or more) variable function.
That is, suppose f is a function of two variables, x and y, but we let only x vary while
holding y = b constant. We have then converted f into a function of a single variable,
x, i. e., g(x) = f(x, b). If g(x) = f(x, b) has a derivative at a, then we call it the partial
derivative of f with respect to x at (a, b) and denote it by fx(a, b) or ∂f(a, b)/∂x. One-variable
derivatives were defined in terms of a limit. Partial derivatives of
functions of two variables are defined using limits in a similar way.

Definition 2.2.1. Let f (x, y) be a real-valued function with domain D ⊂ ℝ2 , and let (a, b) be an interior
point of D. The partial derivative of f with respect to x at (a, b) is denoted and defined by

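\[
\frac{\partial f(a, b)}{\partial x} = \lim_{\Delta x\to 0} \frac{f(a + \Delta x, b) - f(a, b)}{\Delta x}.
\]

The partial derivative of f with respect to y at (a, b) is denoted and defined as

\[
\frac{\partial f(a, b)}{\partial y} = \lim_{\Delta y\to 0} \frac{f(a, b + \Delta y) - f(a, b)}{\Delta y}.
\]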

Note.
1. The derivative f′(x) is interpreted as a rate of change. Partial derivatives are also
rates of change. If z = f (x, y), then 𝜕f /𝜕x represents the rate of change of z with
respect to x when y is held constant. Similarly, 𝜕f /𝜕y represents the rate of change
of z with respect to y when x is held constant.
2. Partial derivatives are sometimes called partials.
3. Sometimes, we use h in the above limits in place of Δx or Δy.

If z = f (x, y) has a partial derivative with respect to x at all points (x, y) ∈ D, then
fx (x, y) is also a function of x and y with domain D. We call it the partial derivative of
f (x, y) with respect to x and denote it by any of the following:
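\[
\frac{\partial z}{\partial x},\quad \frac{\partial f}{\partial x},\quad \frac{\partial f(x, y)}{\partial x},\quad z_x,\quad \text{or}\quad f_x(x, y).
\]

Similarly, we denote the partial derivative of f(x, y) with respect to y by

\[
\frac{\partial z}{\partial y},\quad \frac{\partial f}{\partial y},\quad \frac{\partial f(x, y)}{\partial y},\quad z_y,\quad \text{or}\quad f_y(x, y).
\]

Sometimes the subscript “x” is replaced by the number 1, “y” by 2, and so on, so that

\[
\frac{\partial f}{\partial x} = f_1, \qquad \frac{\partial f}{\partial y} = f_2.
\]
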
So, to compute the partial derivative fx or fy , all we have to remember is that it
is just the ordinary one-variable derivative where we regard y or x as a constant, and
we can, therefore, apply all the derivative laws for functions of a single variable when
finding fx or fy .

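Example 2.2.1. If f(x, y) = cos(x/(2 + y)), compute ∂f/∂x and ∂f/∂y.

Solution. To compute ∂f/∂x, we regard y as a constant. Using the chain rule for functions
of one variable, we have

\[
\frac{\partial f}{\partial x}
= -\sin\Bigl(\frac{x}{2 + y}\Bigr)\cdot\frac{\partial}{\partial x}\Bigl(\frac{x}{2 + y}\Bigr)
= -\sin\Bigl(\frac{x}{2 + y}\Bigr)\cdot\frac{1}{2 + y}.
\]

Similarly, we compute ∂f/∂y as follows:

\[
\frac{\partial f}{\partial y}
= -\sin\Bigl(\frac{x}{2 + y}\Bigr)\cdot\frac{\partial}{\partial y}\Bigl(\frac{x}{2 + y}\Bigr)
= \sin\Bigl(\frac{x}{2 + y}\Bigr)\cdot\frac{x}{(2 + y)^2}.
\]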

Example 2.2.2. If f(x, y) = 2x²y − 3xy² + 2x − y² + 3, then find ∂f/∂x(2, −3) and ∂f/∂y(2, −3), first by using
one-variable differentiation formulas and a second time by using the definition as a limit.

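Solution. Method 1: We have

\[
\frac{\partial f(x, y)}{\partial x}
= \frac{\partial}{\partial x}\bigl(2x^2 y - 3xy^2 + 2x - y^2 + 3\bigr)
= 4xy - 3y^2 + 2.
\]

So

\[
\frac{\partial f}{\partial x}(2, -3)
= \bigl(4xy - 3y^2 + 2\bigr)\Big|_{x=2,\ y=-3}
= 4\cdot 2\cdot(-3) - 3\cdot(-3)^2 + 2 = -49.
\]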

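Method 2: We have

\[
\begin{aligned}
\frac{\partial f}{\partial x}\bigg|_{x=2,\ y=-3}
&= \lim_{\Delta x\to 0} \frac{f(2 + \Delta x, -3) - f(2, -3)}{\Delta x} \\
&= \lim_{\Delta x\to 0} \frac{\bigl(-6(2 + \Delta x)^2 - 25(2 + \Delta x) - 6\bigr) + 80}{\Delta x} \\
&= \lim_{\Delta x\to 0} \frac{-6(\Delta x)^2 - 49\,\Delta x}{\Delta x} \\
&= -49.
\end{aligned}
\]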

Method 3: We can plug in y = −3 first to get

\[
f(x, -3) = 2x^2(-3) - 3x(-3)^2 + 2x - (-3)^2 + 3 = -6x^2 - 25x - 6.
\]

Then fx (x, −3) = −12x − 25 and so fx (2, −3) = −12(2) − 25 = −49.


Similarly, we get ∂f(2, −3)/∂y = 50.
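As a numerical cross-check of the three methods, one can approximate the limit in the definition by a difference quotient with a small step. The sketch below uses Python and a symmetric (central) difference quotient, which is a swapped-in variant of the one-sided quotient in Definition 2.2.1; the function names partial_x and partial_y and the step size h are assumptions made for this illustration only.

```python
def f(x, y):
    """The polynomial of Example 2.2.2."""
    return 2 * x**2 * y - 3 * x * y**2 + 2 * x - y**2 + 3


def partial_x(func, a, b, h=1e-6):
    """Central difference approximation of the partial derivative in x at (a, b)."""
    return (func(a + h, b) - func(a - h, b)) / (2 * h)


def partial_y(func, a, b, h=1e-6):
    """Central difference approximation of the partial derivative in y at (a, b)."""
    return (func(a, b + h) - func(a, b - h)) / (2 * h)


print(partial_x(f, 2.0, -3.0))  # close to -49
print(partial_y(f, 2.0, -3.0))  # close to 50
```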

Example 2.2.3. The Cobb–Douglas production function is defined as

\[
P(L, C) = bL^{\alpha} C^{1-\alpha},
\]

where P is the total production (the monetary value of all goods produced in a period), L is the amount
of labor (some measure of the total labor used in that period), and C is the amount of capital invested
(the monetary worth of all machinery, equipment, and buildings); b and α are constants. Find the
partial derivatives PL and PC .

Solution. We have

\[
P_L(L, C) = \alpha b L^{\alpha-1} C^{1-\alpha}
\quad\text{and}\quad
P_C(L, C) = b(1 - \alpha) L^{\alpha} C^{-\alpha}.
\]

Note. In 1928 Charles Cobb and Paul Douglas used this function to model the growth
of the American economy during the period 1899–1922. Their model turned out to be
remarkably accurate even though there were many factors affecting economic
performance.

Example 2.2.4. The ideal gas equation is given by pV = RT , where R is a constant. Show that

\[
\frac{\partial p}{\partial V}\cdot\frac{\partial V}{\partial T}\cdot\frac{\partial T}{\partial p} = -1.
\]

Proof. From

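\[
\begin{aligned}
p = \frac{RT}{V} &\implies \frac{\partial p}{\partial V} = -\frac{RT}{V^2}, \\
V = \frac{RT}{p} &\implies \frac{\partial V}{\partial T} = \frac{R}{p}, \\
T = \frac{pV}{R} &\implies \frac{\partial T}{\partial p} = \frac{V}{R},
\end{aligned}
\]

it follows that

\[
\frac{\partial p}{\partial V}\cdot\frac{\partial V}{\partial T}\cdot\frac{\partial T}{\partial p}
= -\frac{RT}{V^2}\cdot\frac{R}{p}\cdot\frac{V}{R}
= -\frac{RT}{pV} = -1.
\]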

Note. This example also shows that partial derivatives cannot be interpreted as ratios
of differentials, as otherwise (∂p/∂V) ⋅ (∂V/∂T) ⋅ (∂T/∂p) would be equal to 1.
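The identity can also be checked numerically at sample states. The sketch below (Python) evaluates the three partial derivatives derived in the proof; the numerical value chosen for R and the sample values of V and T are arbitrary assumptions, since the identity holds for any nonzero constant R.

```python
R = 8.314  # assumed value for illustration; any nonzero constant works


def dp_dV(V, T):
    """Partial of p = RT/V with respect to V (T held constant)."""
    return -R * T / V**2


def dV_dT(p):
    """Partial of V = RT/p with respect to T (p held constant)."""
    return R / p


def dT_dp(V):
    """Partial of T = pV/R with respect to p (V held constant)."""
    return V / R


def cyclic_product(V, T):
    """Product of the three partials at the state (p, V, T) with p = RT/V."""
    p = R * T / V
    return dp_dV(V, T) * dV_dT(p) * dT_dp(V)


for V, T in ((1.0, 300.0), (2.5, 273.15), (0.02, 500.0)):
    print(V, T, cyclic_product(V, T))  # each product is -1 up to rounding
```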

Example 2.2.5. Investigate ∂f/∂x and ∂f/∂y at (0, 0) for f(x, y) = √(x² + y²).

Solution. By applying the definition of partial differentiation, we have

\[
\begin{aligned}
\frac{\partial f(0, 0)}{\partial x}
&= \lim_{\Delta x\to 0} \frac{f(0 + \Delta x, 0) - f(0, 0)}{\Delta x} \\
&= \lim_{\Delta x\to 0} \frac{\sqrt{(0 + \Delta x)^2 + 0^2} - \sqrt{0^2 + 0^2}}{\Delta x} \\
&= \lim_{\Delta x\to 0} \frac{|\Delta x|}{\Delta x}.
\end{aligned}
\]

We know that this limit does not exist, as the two one-sided limits as Δx → 0+ and Δx → 0−
(namely 1 and −1) do not match. Therefore, ∂f(0, 0)/∂x does not exist. Similarly, ∂f(0, 0)/∂y
does not exist either.

Now, we have seen that for some functions the partial derivatives exist, and for
some functions they do not. Besides using the definition, we shall try to interpret par-
tials geometrically.

2.2.2 Interpretations of partial derivatives

There is a geometric interpretation of partial derivatives fx(a, b) and fy(a, b). When we
compute ∂f/∂x, we keep y fixed, say, y = b. Therefore, we only consider the points (x, b, z).
So

z = f (x, y) and y = b,

and this is actually the intersection curve C of the plane y = b and the surface S, the
graph of z = f (x, y). The derivative fx (a, b) is therefore the slope of the tangent line to
the curve C at (a, b, f (a, b)) on S. Similarly, fy (a, b) is the slope of the tangent line to the
curve z = f (a, y) in the x = a plane at the point (a, b, f (a, b)). Figure 2.7(a) illustrates
the geometric interpretation of partial derivatives.
Now, we can explain why the function z = √(x² + y²) has no partials at (0, 0). The
graph of this function is an upper cone with its vertex at the origin. The plane y = 0
intersects the cone in two lines z = x and z = −x in the xz-plane. The origin is a
corner of the intersection curve z = |x|, and, therefore, the curve has no derivative there, as shown in
Figure 2.7(b).

Figure 2.7: Geometric interpretation of partial derivatives (panels (a) and (b)).

We already know that when a function of one variable has a derivative at a point P,
it must also be continuous at the point P. However, continuity does not follow for a
function of two variables just because it has partial derivatives at a point. Take, for ex-
ample, the function f (x, y) equal to x+y when either x or y equals 0, but with f (x, y) = 4
at all other points. This function has partial derivatives with respect to x and y equal to
1 at (0, 0), but the function is clearly not continuous at (0, 0). The next example shows
that a function can have partial derivatives at every point yet still be a discontinuous
function.

Example 2.2.6. Find the partial derivatives of the function

\[
f(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2} & \text{when } x^2 + y^2 \neq 0, \\
0 & \text{when } x^2 + y^2 = 0.
\end{cases}
\]

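Solution. If (x, y) ≠ (0, 0), the one-variable quotient rule gives

\[
\frac{\partial f(x, y)}{\partial x}
= \Bigl(\frac{xy}{x^2 + y^2}\Bigr)_x
= \frac{y(x^2 + y^2) - xy\cdot 2x}{(x^2 + y^2)^2}
= \frac{y(y^2 - x^2)}{(x^2 + y^2)^2}.
\]

If (x, y) = (0, 0), then using the definition of partial differentiation we find

\[
\frac{\partial f(0, 0)}{\partial x}
= \lim_{\Delta x\to 0} \frac{f(0 + \Delta x, 0) - f(0, 0)}{\Delta x}
= \lim_{\Delta x\to 0} \frac{0 - 0}{\Delta x} = 0.
\]

Similarly, when (x, y) ≠ (0, 0),

\[
\frac{\partial f(x, y)}{\partial y} = \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}
\]

and

\[
\frac{\partial f(0, 0)}{\partial y} = 0.
\]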

So, this function has a partial derivative at every point (x, y); however, we have already
seen in a previous example that this function is not continuous at the point (0, 0).

The concept of partial derivatives can be extended to functions of more than two
variables in a natural way. For instance, a function u = f (x, y, z) generally has three
partial derivatives and the partial derivative of the function with respect to x is defined
as
\[
f_x(x, y, z) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x},
\]

where (x, y, z) is an interior point in the domain of u. However, there is no nice geo-
metric interpretation for fx as a slope of some visible tangent line. To find the partial
derivative fx we hold y and z constant and use the one-variable derivative rules to find
fx . In a similar manner, we can find fy and fz .

Example 2.2.7. Find the partial derivatives of the function r = √x 2 + y 2 + z 2 .

Solution. To find ∂r/∂x, we regard y and z as constants. Then we find

\[
\frac{\partial r}{\partial x}
= \frac{2x}{2\sqrt{x^2 + y^2 + z^2}}
= \frac{x}{\sqrt{x^2 + y^2 + z^2}}
= \frac{x}{r}.
\]

By symmetry, ∂r/∂y = y/r and ∂r/∂z = z/r.

2.2.3 Partial derivatives of higher order

For a function z = f (x, y), the partial derivatives fx (x, y) and fy (x, y) can themselves be
differentiated, giving four more derivatives, i. e., (fx )x , (fx )y , (fy )x , (fy )y , called second
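derivatives or second-order partial derivatives. The standard notation for these
second-order partial derivatives is similar to the notations y″ and d²y/dx² for the second
derivatives of a function y = f(x) of a single variable. We write

\[
\begin{aligned}
(f_x)_x &= f_{xx} = \frac{\partial}{\partial x}\Bigl(\frac{\partial f}{\partial x}\Bigr) = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 z}{\partial x^2}, \\
(f_x)_y &= f_{xy} = \frac{\partial}{\partial y}\Bigl(\frac{\partial f}{\partial x}\Bigr) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}, \\
(f_y)_x &= f_{yx} = \frac{\partial}{\partial x}\Bigl(\frac{\partial f}{\partial y}\Bigr) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}, \\
(f_y)_y &= f_{yy} = \frac{\partial}{\partial y}\Bigl(\frac{\partial f}{\partial y}\Bigr) = \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 z}{\partial y^2}.
\end{aligned}
\]

The notation fxy or ∂²f/∂y∂x means that we first differentiate with respect to x (keeping
y constant) and then differentiate with respect to y (keeping x constant), whereas in
computing fyx = ∂²f/∂x∂y, the differentiation order is reversed.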

Example 2.2.8. Find the second-order partial derivatives of

f (x, y) = x 2 ey + y cos x.

Solution. We find the partials fx and fy , and we differentiate each of these functions
with respect to each of x and y to give fxx , fxy , fyx , and fyy . Then we obtain the following:

\[
\begin{aligned}
f_x &= 2xe^y - y\sin x, \\
f_y &= x^2 e^y + \cos x, \\
f_{xx} &= 2e^y - y\cos x, \\
f_{xy} &= 2xe^y - \sin x, \\
f_{yx} &= 2xe^y - \sin x, \\
f_{yy} &= x^2 e^y.
\end{aligned}
\]

We note that for this function, whose second-order partial derivatives are continu-
ous, the “mixed partials” fxy and fyx are equal. This result holds generally for functions
with continuous second-order derivatives.
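The equality of the mixed partials in Example 2.2.8 can be illustrated numerically. The sketch below (Python, standard library only) differentiates the first-order partials fx and fy found in the example by central differences; the helper names d_dx and d_dy, the step size, and the sample point (1, 0.5) are assumptions made for this illustration.

```python
import math


def fx(x, y):
    """First partial of f(x, y) = x^2 e^y + y cos x with respect to x."""
    return 2 * x * math.exp(y) - y * math.sin(x)


def fy(x, y):
    """First partial of f(x, y) = x^2 e^y + y cos x with respect to y."""
    return x * x * math.exp(y) + math.cos(x)


def d_dx(func, x, y, h=1e-5):
    """Central difference in the x direction."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def d_dy(func, x, y, h=1e-5):
    """Central difference in the y direction."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x0, y0 = 1.0, 0.5
fxy = d_dy(fx, x0, y0)  # differentiate in x first, then in y
fyx = d_dx(fy, x0, y0)  # differentiate in y first, then in x
exact = 2 * x0 * math.exp(y0) - math.sin(x0)
print(fxy, fyx, exact)
```

Both numerical values agree with 2xe^y − sin x at the sample point up to the error of the difference quotients.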

Theorem 2.2.1 (Clairaut’s theorem: equality of mixed partial derivatives). If z = f (x, y) is defined and
has continuous second-order partial derivatives throughout a domain 𝒟, then the two functions fxy
and fyx are identical at any interior point of 𝒟.

Notations for third- and higher-order partial derivatives are defined in a similar way.

Example 2.2.9. Calculate fxxyz when f (x, y, z) = sin(3x + 2yz).

Solution. We have

fx = 3 cos(3x + 2yz),
fxx = −9 sin(3x + 2yz),
fxxy = −18z cos(3x + 2yz),
fxxyz = −18 cos(3x + 2yz) + 36yz sin(3x + 2yz).

2.3 Total differential


2.3.1 Linearization and differentiability

In one-variable calculus, if a function y = f(x) is differentiable at x = a this means that
the change Δy in y can be written as

Δy = AΔx + o(Δx),

where A is a constant that only depends on the point a, not the change Δx in x. It turns
out that the constant A is exactly f′(a), the derivative at x = a. Thus,

Δy = f′(a)Δx + o(Δx).

When Δx is small, Δy = y − f(a) ≈ f′(a)Δx, which can be rewritten as

y ≈ f(a) + f′(a)(x − a).

The function L(x) = f(a) + f′(a)(x − a) is the local linearization of f(x) at x = a, which
is, in fact, the tangent line approximation of f at x = a.
A similar approximation can be made for a function of two variables, z = f (x, y).
Consider a small change Δz in z at a point (a, b), caused by changes Δx in x and Δy in y:

Δz = f(a + Δx, b + Δy) − f(a, b).

The increment Δz represents the change in the value of f when (x, y) changes from
(a, b) to (a + Δx, b + Δy). In general, the exact increment Δz in z is hard to find. For
example, for z = x^y at (1, 1) with Δx = 0.09 and Δy = −0.02, Δz = 1.09^0.98 − 1^1. However,
even though z is not a linear function, Δz can be very close to a linear expression of Δx
and Δy, and the difference is negligible as Δx → 0 and Δy → 0. When this happens,
we say that the function z = f (x, y) is differentiable at the point (a, b). The formal
definition is given below.

Definition 2.3.1 (Differentiability of a real-valued function z = f (x, y) of two variables). Assume that f
is a real-valued function with domain D ⊂ ℝ2 and that (a, b) is an interior point of D. The function f is
differentiable at (a, b) if there exist constants A and B such that

Δz = AΔx + BΔy + o(ρ),

where A and B depend on a and b but are independent of Δx and Δy, and ρ = √(Δx)2 + (Δy)2 . If
z = f (x, y) is differentiable at every point in D, then we say that z is differentiable on D.

Note that if z = f (x, y) is differentiable at (a, b), then Δz ≈ AΔx + BΔy for some con-
stants A and B. This expression can be rewritten as

f (x, y) ≈ f (a, b) + A(x − a) + B(y − b).

The formula L(x, y) = f (a, b)+A(x−a)+B(y−b) gives the local linearization of z = f (x, y)
at (a, b), which is the equation of a plane that approximates the surface well at points
near (a, b, f (a, b)). This plane, as we will see later, is indeed the tangent plane to the
graph of z = f (x, y) at (a, b, f (a, b)). So, intuitively speaking, if a function z = f (x, y)
is differentiable at a point, then its graph must be continuous and “smooth” at that
point so that there exists a plane that could nicely touch (be tangent to) the surface at
that point. This is indeed the case, as shown in the following two theorems.

Theorem 2.3.1. If z = f (x, y) is differentiable at (a, b), then it must be continuous at the point (a, b).

Proof. In fact, from the above definition, if z = f (x, y) is differentiable at (a, b), then

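\[
\Delta z = A\,\Delta x + B\,\Delta y + o\bigl(\sqrt{(\Delta x)^2 + (\Delta y)^2}\bigr),
\]
\[
\lim_{(\Delta x,\Delta y)\to(0,0)} \Delta z
= \lim_{(\Delta x,\Delta y)\to(0,0)} \Bigl(A\,\Delta x + B\,\Delta y + o\bigl(\sqrt{(\Delta x)^2 + (\Delta y)^2}\bigr)\Bigr) = 0.
\]

But Δz = f(a + Δx, b + Δy) − f(a, b) = f(x, y) − f(a, b); therefore,

\[
\lim_{(\Delta x,\Delta y)\to(0,0)} \Delta z = \lim_{(x,y)\to(a,b)} \bigl(f(x, y) - f(a, b)\bigr) = 0,
\qquad
\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b),
\]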

which means z = f (x, y) is continuous at (a, b).

By the above theorem, we know the following.


If a function is not continuous at (a, b), then it is not differentiable at (a, b).
Therefore, functions such as
\[
f(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2} & (x, y) \neq (0, 0), \\
0 & (x, y) = (0, 0)
\end{cases}
\qquad\text{and}\qquad
f(x, y) = \frac{xy^2}{x^4 + y^4}
\]

are not differentiable at (0, 0) since they are not continuous there.
Also, as in one-variable calculus, if y = f (x) is differentiable at x = a, then dy/dx
exists at x = a. For a function of two variables, we claim that if it is differentiable, then
it has partial derivatives.

Theorem 2.3.2. If z = f(x, y) is differentiable at point (a, b), then the partial derivatives ∂z(a, b)/∂x and
∂z(a, b)/∂y exist. Furthermore,

\[
\Delta z = \frac{\partial z(a, b)}{\partial x}\,\Delta x + \frac{\partial z(a, b)}{\partial y}\,\Delta y + o(\rho).
\]

Proof. If the function z = f (x, y) is differentiable at a point (a, b), then there exist A
and B, independent of Δx and Δy, such that

Δz = AΔx + BΔy + o(ρ), where ρ = √(Δx)2 + (Δy)2 .

In particular, when Δy = 0 (that is, y = b), then ρ = |Δx| and

Δz = AΔx + o(|Δx|).

Dividing both sides by Δx gives


\[
\frac{\Delta z}{\Delta x} = A\,\frac{\Delta x}{\Delta x} + \frac{o(|\Delta x|)}{\Delta x}.
\]

Note that in the case of Δy = 0, Δz = f(a + Δx, b) − f(a, b). Then, when taking the limit
of both sides as Δx → 0, we obtain

\[
\frac{\partial z(a, b)}{\partial x} = A.
\]

Similarly, ∂z(a, b)/∂y = B. This completes the proof.

Note. By the above theorem, we conclude the following.


If either fx (a, b) or fy (a, b) does not exist, then f (x, y) is not differentiable at (a, b).
If one replaces the point (a, b) by a general point (x, y), then when a function
z = f (x, y) is differentiable at (x, y), we have
\[
\Delta z = \frac{\partial z}{\partial x}\,\Delta x + \frac{\partial z}{\partial y}\,\Delta y + o(\rho).
\]

Example 2.3.1. Determine whether the following functions are differentiable at (0, 0):
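\[
\text{(1)}\ f(x, y) = \sin\frac{1}{x^2 + y^2},
\qquad
\text{(2)}\ g(x, y) =
\begin{cases}
\dfrac{xy}{\sqrt{x^2 + y^2}} & (x, y) \neq (0, 0), \\
0 & (x, y) = (0, 0),
\end{cases}
\qquad\text{and}
\]
\[
\text{(3)}\ h(x, y) =
\begin{cases}
(x^2 + y^2)\sin\dfrac{1}{x^2 + y^2} & (x, y) \neq (0, 0), \\
0 & (x, y) = (0, 0).
\end{cases}
\]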

Solution.
1. Since f(x, y) = sin(1/(x² + y²)) is undefined at (0, 0), it is not continuous at (0, 0).
Therefore, it is not differentiable at (0, 0).
2. As seen in a previous example, lim(x,y)→(0,0) xy/√(x² + y²) = 0 = g(0, 0). Therefore, g(x, y)
is continuous at (0, 0). Computing gx(0, 0), we have

\[
g_x(0, 0) = \lim_{\Delta x\to 0} \frac{g(0 + \Delta x, 0) - g(0, 0)}{\Delta x}
= \lim_{\Delta x\to 0} \frac{\dfrac{\Delta x\cdot 0}{\sqrt{(\Delta x)^2 + 0^2}} - 0}{\Delta x} = 0.
\]

Similarly, gy(0, 0) = 0. Thus, if g(x, y) is differentiable at (0, 0), we must have

\[
\begin{aligned}
\Delta g &= g_x(0, 0)\,\Delta x + g_y(0, 0)\,\Delta y + o(\rho), \\
g(0 + \Delta x, 0 + \Delta y) - g(0, 0) &= g_x(0, 0)\,\Delta x + g_y(0, 0)\,\Delta y + o(\rho), \\
\frac{\Delta x\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} &= 0 + 0 + o(\rho), \quad\text{where } \rho = \sqrt{(\Delta x)^2 + (\Delta y)^2}.
\end{aligned}
\]

That is, ΔxΔy/√((Δx)² + (Δy)²) is negligible with respect to √((Δx)² + (Δy)²), as (Δx, Δy) → (0, 0),
i. e.,

\[
\lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\dfrac{\Delta x\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0.
\]

However,

\[
\lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\dfrac{\Delta x\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
= \lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\Delta x\,\Delta y}{(\Delta x)^2 + (\Delta y)^2}.
\]

This limit does not exist: if Δy = kΔx, the limit depends on k. So we have reached a
contradiction, and g(x, y) is not differentiable at (0, 0).
3. For h(x, y), it is easy to see that lim(x,y)→(0,0) h(x, y) = 0 = h(0, 0), so h(x, y) is con-
tinuous at (0, 0). We now compute

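\[
h_x(0, 0) = \lim_{\Delta x\to 0} \frac{h(0 + \Delta x, 0) - h(0, 0)}{\Delta x}
= \lim_{\Delta x\to 0} \frac{\bigl((\Delta x)^2 + 0^2\bigr)\sin\dfrac{1}{(\Delta x)^2 + 0^2} - 0}{\Delta x} = 0.
\]

Similarly, hy(0, 0) = 0. We need to check whether the difference between Δh and
the linearization hx(0, 0)Δx + hy(0, 0)Δy is negligible with respect to √((Δx)² + (Δy)²)
as (Δx, Δy) → (0, 0). We write

\[
\begin{aligned}
\lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\Delta h - h_x(0, 0)\,\Delta x - h_y(0, 0)\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
&= \lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\bigl((\Delta x)^2 + (\Delta y)^2\bigr)\sin\dfrac{1}{(\Delta x)^2 + (\Delta y)^2} - 0}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \\
&= \lim_{(\Delta x,\Delta y)\to(0,0)} \sqrt{(\Delta x)^2 + (\Delta y)^2}\,\sin\frac{1}{(\Delta x)^2 + (\Delta y)^2} = 0.
\end{aligned}
\]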

So Δh − hx (0, 0)Δx − hy (0, 0)Δy = o(√(Δx)2 + (Δy)2 ). That is,

Δh = hx (0, 0)Δx + hy (0, 0)Δy + o(√(Δx)2 + (Δy)2 ),

and the function h(x, y) is differentiable at (0, 0).
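The different behavior of g and h can be seen numerically. Since both partial derivatives vanish at the origin for each function, the quantity to examine is the increment divided by ρ. The sketch below (Python, standard library only) computes this ratio; the helper name remainder_ratio and the sample points are assumptions made for this illustration.

```python
import math


def g(x, y):
    """g of Example 2.3.1: xy / sqrt(x^2 + y^2), with g(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / math.hypot(x, y)


def h(x, y):
    """h of Example 2.3.1: (x^2 + y^2) sin(1 / (x^2 + y^2)), with h(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    r2 = x * x + y * y
    return r2 * math.sin(1.0 / r2)


def remainder_ratio(func, dx, dy):
    """(increment minus linear part) / rho at the origin; the linear part is 0 here
    because both partial derivatives of g and of h vanish at (0, 0)."""
    return (func(dx, dy) - func(0.0, 0.0)) / math.hypot(dx, dy)


for t in (1e-2, 1e-4, 1e-6):
    # Approach the origin along dy = dx: the ratio for g stays at 1/2,
    # while the ratio for h is bounded by rho and so tends to 0.
    print(t, remainder_ratio(g, t, t), remainder_ratio(h, t, t))
```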

Now, we have seen some nice properties of a differentiable function of two
variables. However, how can we determine whether a function is differentiable? In
one-variable calculus, we know that as long as f′(a) exists, y = f(x) is differentiable at
x = a. But in multivariable calculus, this is not the case. The previous theorem shows
that existence of the partial derivatives is a necessary condition for differentiability,
but it is not a sufficient condition for differentiability. In Example 2.2.6, we saw that
the function has partials at (0, 0) but is not continuous there; therefore, it is not dif-
ferentiable there. For a sufficient condition, we have the following theorem.

Theorem 2.3.3 (Test for differentiability of a real-valued function). Let z = f(x, y) be a real-valued
function of two variables and (a, b) an interior point of its domain D. If ∂z/∂x and ∂z/∂y are continuous at
(a, b), then f is differentiable at (a, b).

Proof. Recall the Lagrange mean value theorem for differentiable functions of one
variable, y = f(x),

f(x) − f(a) = f′(ξ)(x − a), where ξ is some number between a and x, or

f(a + Δx) − f(a) = f′(a + θΔx)Δx, where 0 < θ < 1.

Therefore,

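\[
\begin{aligned}
\Delta z &= f(a + \Delta x, b + \Delta y) - f(a, b) \\
&= f(a + \Delta x, b) - f(a, b) + f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b) \\
&= f_x(a + \theta_1\Delta x, b)\,\Delta x + f_y(a + \Delta x, b + \theta_2\Delta y)\,\Delta y, \quad\text{where } 0 < \theta_1, \theta_2 < 1.
\end{aligned}
\]

Since ∂z/∂x = fx(x, y) and ∂z/∂y = fy(x, y) are both continuous at (a, b), we have

\[
\lim_{(\Delta x,\Delta y)\to(0,0)} f_x(a + \theta_1\Delta x, b) = f_x(a, b)
\quad\text{and}\quad
\lim_{(\Delta x,\Delta y)\to(0,0)} f_y(a + \Delta x, b + \theta_2\Delta y) = f_y(a, b).
\]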

Thus, there exist ε1 and ε2 such that ε1 → 0, ε2 → 0 as Δx → 0 and Δy → 0, and

fx (a + θ1 Δx, b) = fx (a, b) + ε1 and fy (a + Δx, b + θ2 Δy) = fy (a, b) + ε2 .

So,

Δz = fx (a + θ1 Δx, b)Δx + fy (a + Δx, b + θ2 Δy)Δy


= (fx (a, b) + ε1 )Δx + (fy (a, b) + ε2 )Δy
= fx (a, b)Δx + fy (a, b)Δy + ε1 Δx + ε2 Δy.

Furthermore,
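\[
\begin{aligned}
\lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\Delta z - f_x(a, b)\,\Delta x - f_y(a, b)\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
&= \lim_{(\Delta x,\Delta y)\to(0,0)} \frac{\varepsilon_1\Delta x + \varepsilon_2\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \\
&= \lim_{(\Delta x,\Delta y)\to(0,0)} \left[\varepsilon_1\Bigl(\frac{\Delta x}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}\Bigr) + \varepsilon_2\Bigl(\frac{\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}\Bigr)\right] \\
&= 0 + 0 = 0,
\end{aligned}
\]

since each of the two quotients in parentheses has absolute value at most 1.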

Therefore,

Δz − fx (a, b)Δx − fy (a, b)Δy = o(√(Δx)2 + (Δy)2 ) or

Δz = fx (a, b)Δx + fy (a, b)Δy + o(√(Δx)2 + (Δy)2 ),

which proves that z = f (x, y) is differentiable at (a, b).

The converse of this theorem is not true. A counterexample is the following:


\[
f(x, y) =
\begin{cases}
(x^2 + y^2)\sin\dfrac{1}{x^2 + y^2} & \text{when } x^2 + y^2 \neq 0, \\
0 & \text{when } x^2 + y^2 = 0.
\end{cases}
\]

One can show that the partial derivatives of f (x, y) exist, and f is differentiable, but
the partial derivatives are not continuous at (0, 0).

2.3.2 The total differential

In one-variable calculus, we defined the differential dy to be the function of the
differential dx by

dy = f′(x)dx,

and when dx = Δx, we have dy = f′(x)Δx. Similarly, we define the differential dz as
follows.

Definition 2.3.2. If z = f (x, y) is differentiable at (a, b), the differential dz (sometimes called the total
differential) at (a, b) is the function defined by

\[
dz = \frac{\partial z(a, b)}{\partial x}\,dx + \frac{\partial z(a, b)}{\partial y}\,dy,
\]

where the two independent variables are the differentials dx and dy.

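Example 2.3.2. When z = e^{xy}, find dz(2, 1).

Solution. Since

\[
\frac{\partial z}{\partial x} = f_x = ye^{xy}
\quad\text{and}\quad
\frac{\partial z}{\partial y} = f_y = xe^{xy}
\]

are both continuous at (2, 1), by the definition of dz at a point (a, b), we have

\[
dz = \frac{\partial z(2, 1)}{\partial x}\,dx + \frac{\partial z(2, 1)}{\partial y}\,dy
= e^2\,dx + 2e^2\,dy.
\]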

If the total differential of z = f (x, y) exists for all points in a region D, then dz
becomes a function of (x, y) ∈ D and (dx, dy) ∈ ℝ2 .

Example 2.3.3. Find the differential dz = df (x, y) when f is the function defined by

z = f (x, y) = 1 + ln(3 − x 2 − y 2 ) for (x, y) ∈ D = [−1, 1] × [−1, 1].

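Solution. Since

\[
\frac{\partial z}{\partial x} = f_x = \frac{-2x}{3 - x^2 - y^2}
\quad\text{and}\quad
\frac{\partial z}{\partial y} = f_y = \frac{-2y}{3 - x^2 - y^2}
\]

are both continuous on D, by the definition of dz, we have

\[
\begin{aligned}
dz &= \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \\
&= \bigl(1 + \ln(3 - x^2 - y^2)\bigr)_x\,dx + \bigl(1 + \ln(3 - x^2 - y^2)\bigr)_y\,dy \\
&= \frac{-2x}{3 - x^2 - y^2}\,dx + \frac{-2y}{3 - x^2 - y^2}\,dy.
\end{aligned}
\]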

The definition for a total differential can be extended naturally to differentiable


functions of more than two variables. For instance, if u = f (x, y, z) is differentiable,
then the total differential du is defined as
\[
du = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz.
\]

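Example 2.3.4. Find du if u = x + sin(y/2) + e^{yz}.

Solution. Since

\[
\frac{\partial u}{\partial x} = 1,
\qquad
\frac{\partial u}{\partial y} = \frac{1}{2}\cos\frac{y}{2} + ze^{yz},
\qquad
\frac{\partial u}{\partial z} = ye^{yz},
\]

it follows that

\[
du = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz
= dx + \Bigl(\frac{1}{2}\cos\frac{y}{2} + ze^{yz}\Bigr)dy + ye^{yz}\,dz.
\]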

2.3.3 The linear/differential approximation

In one-variable calculus, when dx = Δx, then dy = f′(a)Δx and Δy ≈ dy. So the
linearization is essentially the differential approximation. Similarly, for a differentiable
function z = f (x, y) of two variables, when dx = Δx and dy = Δy , then at (a, b) we have

\[
dz = \frac{\partial z(a, b)}{\partial x}\,\Delta x + \frac{\partial z(a, b)}{\partial y}\,\Delta y.
\]

Thus,

Δz = dz + o(√(Δx)2 + (Δy)2 ).

When Δx and Δy are both small, this gives the approximation

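\[
\Delta z = f(x, y) - f(a, b) \approx dz = \frac{\partial z(a, b)}{\partial x}\,\Delta x + \frac{\partial z(a, b)}{\partial y}\,\Delta y, \tag{2.1}
\]

or, equivalently,

\[
f(x, y) \approx f(a, b) + \frac{\partial z(a, b)}{\partial x}\,\Delta x + \frac{\partial z(a, b)}{\partial y}\,\Delta y. \tag{2.2}
\]

The function

\[
L(x, y) = f(a, b) + \frac{\partial z(a, b)}{\partial x}(x - a) + \frac{\partial z(a, b)}{\partial y}(y - b)
\]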

is called the local linearization of the function z = f (x, y) at the point (a, b). This is
essentially the total differential approximation.

Example 2.3.5. Find an approximation for (1.04)^{2.02} using a suitable local linearization.

Solution. Let f(x, y) = x^y. Then 1.04^{2.02} = f(1.04, 2.02) and this is close to f(1, 2) = 1² =
1, so we can use the differential approximation to approximate the change from this
known value. Since

\[
\frac{\partial z}{\partial x} = yx^{y-1}
\quad\text{and}\quad
\frac{\partial z}{\partial y} = x^y\ln x,
\]

their values at (1, 2) are

\[
\frac{\partial z}{\partial x}(1, 2) = 2\times 1^{2-1} = 2
\quad\text{and}\quad
\frac{\partial z}{\partial y}(1, 2) = 1^2\ln 1 = 0.
\]

Changing x : 1 → 1.04 gives Δx = 0.04 and y : 2 → 2.02 gives Δy = 0.02. Thus,

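\[
f(x, y) \approx f(1, 2) + \frac{\partial z(1, 2)}{\partial x}\,\Delta x + \frac{\partial z(1, 2)}{\partial y}\,\Delta y.
\]

By substitution,

\[
f(1.04, 2.02) \approx f(1, 2) + \frac{\partial z}{\partial x}(1, 2)\cdot 0.04 + \frac{\partial z}{\partial y}(1, 2)\cdot 0.02
= 1 + 2\times 0.04 + 0\times 0.02 = 1.08.
\]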

Note. A calculator gives a better approximation, (1.04)^{2.02} = 1.082448 . . . , but our
error is less than 3 × 10⁻³.
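The comparison in the note can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the values f(1, 2) = 1, fx(1, 2) = 2 and fy(1, 2) = 0 obtained in the solution; the function name linearization is an assumption made for this illustration.

```python
def linearization(x, y):
    """Local linearization of f(x, y) = x**y at (1, 2), using
    f(1, 2) = 1, f_x(1, 2) = 2 and f_y(1, 2) = 0 from Example 2.3.5."""
    return 1.0 + 2.0 * (x - 1.0) + 0.0 * (y - 2.0)


approx = linearization(1.04, 2.02)
exact = 1.04 ** 2.02
print(approx, exact, abs(exact - approx))
```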

Example 2.3.6. Use the total differential to estimate the change of the function z = √(20 − 7x² − y²)
when (x, y) changes from (1, 2) to (0.98, 2.03).

Solution. Since

\[
\frac{\partial z(1, 2)}{\partial x}
= \frac{-14x}{2\sqrt{20 - 7x^2 - y^2}}\bigg|_{(1,2)}
= \frac{-14\times 1}{2\sqrt{20 - 7(1)^2 - 2^2}} = -\frac{7}{3}
\]

and

\[
\frac{\partial z(1, 2)}{\partial y}
= \frac{-2y}{2\sqrt{20 - 7x^2 - y^2}}\bigg|_{(1,2)}
= \frac{-2\times 2}{2\sqrt{20 - 7(1)^2 - 2^2}} = -\frac{2}{3},
\]

the differential is

\[
dz = \frac{\partial z(1, 2)}{\partial x}\,\Delta x + \frac{\partial z(1, 2)}{\partial y}\,\Delta y
= -\frac{7}{3}(0.98 - 1) - \frac{2}{3}(2.03 - 2) \approx 0.0266667.
\]

Since Δz = f(0.98, 2.03) − f(1, 2) ≈ dz, the change in z is approximately 0.0266667.
Using a calculator,

\[
\sqrt{20 - 7\times 0.98^2 - 2.03^2} - \sqrt{20 - 7\times 1^2 - 2^2} \approx 0.0259379.
\]

The two values are close, but dz is much easier to evaluate when a calculator is
not available.
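For readers who want to evaluate both quantities themselves, the following Python sketch computes the differential estimate and the actual change for this example. The function names z and dz and the variable names are assumptions made for this illustration; the partial derivatives are the ones derived in the solution, written for a general base point.

```python
import math


def z(x, y):
    """z = sqrt(20 - 7x^2 - y^2) from Example 2.3.6."""
    return math.sqrt(20 - 7 * x * x - y * y)


def dz(a, b, dx, dy):
    """Total differential of z at (a, b) for the increments dx, dy."""
    root = z(a, b)
    zx = -7 * a / root  # simplified form of -14x / (2 sqrt(...))
    zy = -b / root      # simplified form of -2y / (2 sqrt(...))
    return zx * dx + zy * dy


estimate = dz(1.0, 2.0, 0.98 - 1.0, 2.03 - 2.0)
actual = z(0.98, 2.03) - z(1.0, 2.0)
print(estimate, actual)
```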

Note. The linearization for a function of two variables at a point is essentially a plane
approximation. For this function, it is
\[
L(x, y) = 3 - \frac{7}{3}(x - 1) - \frac{2}{3}(y - 2),
\]
which is the equation of a plane. This is the tangent plane at (1, 2) as we will see later
in this chapter.

Linear approximations can be used for differentiable functions of more than two
variables. For example, if u = f (x, y, z) is differentiable at (a, b, c), then

Δu ≈ du and
f (x, y, z) ≈ f (a, b, c) + fx (a, b, c)Δx + fy (a, b, c)Δy + fz (a, b, c)Δz.

There are some exercises involving differential approximation for functions of more
than two variables at the end of this chapter.

2.4 The chain rule


2.4.1 The chain rule with one independent variable

In single-variable calculus, we have found that the chain rule is useful for differen-
tiating a composite function: if y = f (x) and x = x(t) are both differentiable, then
the composite function y is a differentiable function of t, and the derivative of y with
respect to t is
\[
\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}. \tag{2.3}
\]

For functions of several variables, there are several versions of the chain rule.
We first consider z = f(x, y), where each variable x = ϕ(t) and y = ψ(t) is, in turn,
a differentiable function of a variable t. This means z = f(ϕ(t), ψ(t)) is a function of
the variable t. When fx and fy are both continuous, we are able to differentiate z with
respect to t to get dz/dt, as seen in the following theorem.

Theorem 2.4.1. If x = ϕ(t) and y = ψ(t) are two differentiable functions of t, and z = f (x, y) is a
differentiable function of x and y, then the composite function z = f (x, y) = f (ϕ(t), ψ(t)) is a
differentiable function of t and

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{2.4} \]

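Proof. A small change Δt in t causes changes of Δx in x and Δy in y. So, there is a
change of Δz in z, and since z is differentiable, we have
\[ \Delta z = \frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y + o\left(\sqrt{(\Delta x)^2 + (\Delta y)^2}\right). \]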
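Dividing both sides of this equation by Δt, we have

\[
\begin{aligned}
\frac{\Delta z}{\Delta t}
&= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\left(\sqrt{(\Delta x)^2 + (\Delta y)^2}\right)}{\Delta t} \\
&= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\left(\sqrt{(\Delta x)^2 + (\Delta y)^2}\right)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \cdot \frac{\sqrt{(\Delta x)^2 + (\Delta y)^2}}{\Delta t} \\
&= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\left(\sqrt{(\Delta x)^2 + (\Delta y)^2}\right)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \cdot \sqrt{\left(\frac{\Delta x}{\Delta t}\right)^2 + \left(\frac{\Delta y}{\Delta t}\right)^2}.
\end{aligned}
\]

The last step is written for Δt > 0; when Δt < 0 the last factor only changes sign, which
does not affect the limit taken next.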

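Since x is differentiable, it is continuous. So, if we now let Δt → 0, then Δx → 0.
Similarly, we also have Δy → 0. This in turn means that

\[
\begin{aligned}
\frac{dz}{dt} = \lim_{\Delta t \to 0}\frac{\Delta z}{\Delta t}
&= \frac{\partial z}{\partial x}\lim_{\Delta t \to 0}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\lim_{\Delta t \to 0}\frac{\Delta y}{\Delta t} \\
&\quad + \lim_{\Delta x \to 0,\,\Delta y \to 0}\frac{o\left(\sqrt{(\Delta x)^2 + (\Delta y)^2}\right)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \cdot \lim_{\Delta t \to 0}\sqrt{\left(\frac{\Delta x}{\Delta t}\right)^2 + \left(\frac{\Delta y}{\Delta t}\right)^2} \\
&= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} + 0 \cdot \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \\
&= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.
\end{aligned}
\]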
It is helpful to use a tree diagram to remember this chain rule (and other chain
rules as well), as shown in Figure 2.8. For the function in the previous theorem, since
z is a function of x and y, we draw branches from the dependent variable z to the
intermediate variables x and y. Then, we draw branches from x and y to the independent
variable t, since both x and y are functions of t. Then, on each branch, we write the
corresponding derivative. To find dz/dt, we multiply the derivatives along each path from z
to t, and then add these products to get
\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \]

Figure 2.8: Chain rule: one independent variable.

Example 2.4.1. Suppose z = f (x, y) = x² − y², x = sin t, y = cos t. Find dz/dt using the chain rule.

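Solution. Applying the chain rule, we obtain

\[
\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \\
&= 2x \cdot \cos t + (-2y) \cdot (-\sin t) \\
&= 2\sin t\cos t + 2\cos t\sin t = 2\sin 2t.
\end{aligned}
\]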

In this example, we can eliminate the intermediate variables x and y to obtain

\[ z = x^2 - y^2 = (\sin t)^2 - (\cos t)^2 = -\cos 2t \]

by using the double-angle formula. So dz/dt = (− cos 2t)′ = 2 sin 2t, which agrees with
the answer we obtained by using the chain rule.
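
The agreement between the two routes can also be checked numerically. The sketch below is only an illustration: the sample point t0 and the helper names are arbitrary choices, and the finite-difference quotient is an approximation rather than an exact derivative.

```python
import math

def dz_dt_chain(t):
    # Chain rule: dz/dt = z_x * x'(t) + z_y * y'(t) for z = x^2 - y^2
    x, y = math.sin(t), math.cos(t)
    return 2 * x * math.cos(t) + (-2 * y) * (-math.sin(t))

def z_of_t(t):
    # Direct substitution: z(t) = sin^2 t - cos^2 t
    return math.sin(t) ** 2 - math.cos(t) ** 2

def central_difference(g, t, h=1e-6):
    # Symmetric finite-difference estimate of g'(t)
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.7  # arbitrary sample point
chain_value = dz_dt_chain(t0)
closed_form = 2 * math.sin(2 * t0)
numeric_value = central_difference(z_of_t, t0)

print(chain_value, closed_form, numeric_value)
```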

2.4.2 The chain rule with more than one independent variable

We now consider another case, z = f (x, y), but where each of x and y is a function of
two variables s and t, i. e., x = ϕ(s, t), y = ψ(s, t). Then z = f (ϕ(s, t), ψ(s, t)) is indirectly
a function of s and t. We can apply Theorem 2.4.1 to find ∂z/∂t and ∂z/∂s. That is, if we hold
s fixed, then we can compute ∂z/∂t by using Theorem 2.4.1 to get

\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}. \]

Similarly, we can compute ∂z/∂s if we hold t fixed. We summarize this in the following
theorem.

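Theorem 2.4.2. Suppose that z = f (x, y) is a differentiable function of x and y, where x = ϕ(s, t) and
y = ψ(s, t) are differentiable functions of s and t. Then
\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \quad\text{and}\quad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}. \]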

In this version of the chain rule, there are three types of variables, i. e., s and t are
independent variables, x and y are called intermediate variables, and z is the dependent
variable. Like the one-independent-variable case, to remember this chain rule
(or any other one), it is helpful to draw a tree diagram representation of the function
relationships, as shown in Figure 2.9. This time, we have two more branches. To find
∂z/∂s, we multiply the partial derivatives along each path from z to s, and then add these
products, i. e.,
\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}. \]
Similarly, we find
\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} \]
by using the paths from z to t.

Figure 2.9: Chain rule: two independent variables.

Example 2.4.2. Using the chain rule as an aid, find the partial derivatives of the function z = e^x sin y,
where x = st and y = s + t.

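Solution. The tree diagram representation of function relationships is the same as in
Figure 2.9, so we have
\[
\begin{aligned}
\frac{\partial z}{\partial s} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \\
&= e^x \sin y \cdot t + e^x \cos y \cdot 1 \\
&= e^{st}\sin(s + t)\,t + e^{st}\cos(s + t) \\
&= e^{st}(t\sin(s + t) + \cos(s + t)),
\end{aligned}
\]

and, similarly,

\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} = e^{st}[s\sin(s + t) + \cos(s + t)]. \]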
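
The two partial derivatives of this example can be compared with finite-difference quotients of the composite function. This is a minimal sketch: the sample point (s0, t0) and the step h are arbitrary choices made here.

```python
import math

def z(s, t):
    # Composite function z = e^x sin y with x = s*t and y = s + t
    return math.exp(s * t) * math.sin(s + t)

def dz_ds(s, t):
    # Chain-rule result for the partial derivative with respect to s
    return math.exp(s * t) * (t * math.sin(s + t) + math.cos(s + t))

def dz_dt(s, t):
    # Chain-rule result for the partial derivative with respect to t
    return math.exp(s * t) * (s * math.sin(s + t) + math.cos(s + t))

h = 1e-6
s0, t0 = 0.5, 1.2  # arbitrary sample point

# Central differences, holding the other variable fixed
num_ds = (z(s0 + h, t0) - z(s0 - h, t0)) / (2 * h)
num_dt = (z(s0, t0 + h) - z(s0, t0 - h)) / (2 * h)

print(dz_ds(s0, t0), num_ds)
print(dz_dt(s0, t0), num_dt)
```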

The chain rule and the tree diagram representation of function relationships can
be extended to cases that involve both one and more independent variables or func-
tions of more than two variables, as we will see in the following example.

Example 2.4.3. Let f (u, v, w) = e^{uvw}, where u = x², v = x + 2y, and w = x/y. Find ∂f/∂x and ∂f/∂y.

Solution. This example is different from previous ones in several ways. First, we note
that this is a function of three variables, and the intermediate variables are u, v, and w.
Furthermore, u has one independent variable, while v and w both have two indepen-
dent variables. With the help of the tree diagram shown in Figure 2.10(a), we find

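\[
\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial f}{\partial u}\frac{du}{dx} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial x} \\
&= vwe^{uvw} \cdot 2x + uwe^{uvw} \cdot 1 + uve^{uvw} \cdot \frac{1}{y} \\
&= e^{uvw}\left(2vwx + uw + \frac{uv}{y}\right) \\
&= e^{x^3(x+2y)/y}\left(\frac{2x^2(x + 2y) + x^3 + x^2(x + 2y)}{y}\right) = e^{x^3(x+2y)/y}\left(\frac{4x^3}{y} + 6x^2\right).
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\frac{\partial f}{\partial y} &= \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial y} \\
&= uwe^{uvw} \cdot 2 + uve^{uvw} \cdot \left(-\frac{x}{y^2}\right) \\
&= e^{uvw}\left(\frac{2x^3}{y} - \frac{x^3(x + 2y)}{y^2}\right) \\
&= e^{x^3(x+2y)/y}\left(\frac{2x^3}{y} - \frac{x^3(x + 2y)}{y^2}\right) = -\frac{x^4}{y^2}\,e^{x^3(x+2y)/y}.
\end{aligned}
\]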
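
A short numerical check of the two final formulas of this example is sketched below. The sample point and step size are assumptions made for illustration, and the central differences only approximate the true partial derivatives.

```python
import math

def f(x, y):
    # f(u, v, w) = e^{uvw} with u = x^2, v = x + 2y, w = x / y
    u, v, w = x**2, x + 2 * y, x / y
    return math.exp(u * v * w)

def fx_formula(x, y):
    # Final expression obtained for the partial derivative in x
    return f(x, y) * (4 * x**3 / y + 6 * x**2)

def fy_formula(x, y):
    # Final expression obtained for the partial derivative in y
    return -(x**4 / y**2) * f(x, y)

h = 1e-6
x0, y0 = 0.5, 1.0  # arbitrary sample point with y0 != 0

fx_numeric = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
fy_numeric = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)

print(fx_formula(x0, y0), fx_numeric)
print(fy_formula(x0, y0), fy_numeric)
```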


Figure 2.10: Tree diagrams for Examples 2.4.3, 2.4.4, 2.4.5, and 2.4.6.

2.4.3 Partial derivatives for abstract functions

In some theoretical analysis, we define some functions abstractly, for example, u =
f (x − y, xy ), where we do not know an explicit formula for the function f. With
the help of the chain rule, we can still express their partial derivatives in terms of
the derivatives of f.

Example 2.4.4. Let z = ϕ(x² + y²). Find ∂z/∂x.

Solution. Let s = x² + y². Then z = ϕ(s), and by the chain rule

\[ \frac{\partial z}{\partial x} = \frac{dz}{ds}\frac{\partial s}{\partial x} = \phi'(s) \times 2x = 2x\phi'(x^2 + y^2). \]

Example 2.4.5. If z = f (s² − t², st), find ∂z/∂s and ∂z/∂t.

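Solution. Let x = s² − t² and y = st. Then z = f (x, y) and

\[
\begin{aligned}
\frac{\partial z}{\partial s} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = f_x \cdot (2s) + f_y \cdot (t), \\
\frac{\partial z}{\partial t} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} = f_x \cdot (-2t) + f_y \cdot (s).
\end{aligned}
\]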

Example 2.4.6. If u = f (x, x², x³), find du/dx.

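Solution. Let s = x, t = x², and w = x³. Then

\[
\begin{aligned}
\frac{du}{dx} &= \frac{\partial f}{\partial s}\frac{ds}{dx} + \frac{\partial f}{\partial t}\frac{dt}{dx} + \frac{\partial f}{\partial w}\frac{dw}{dx} \\
&= \frac{\partial f}{\partial s} \cdot 1 + \frac{\partial f}{\partial t} \cdot 2x + \frac{\partial f}{\partial w} \cdot 3x^2.
\end{aligned}
\]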

Example 2.4.7. Let u = f (x + y + z, xyz) and suppose that f has continuous second-order partials. Find
∂u/∂z and ∂²u/∂x∂z in terms of the partial derivatives of f.

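Solution. Let s = x + y + z and t = xyz. Then u = f (s, t) and

\[ \frac{\partial u}{\partial z} = \frac{\partial f}{\partial s}\frac{\partial s}{\partial z} + \frac{\partial f}{\partial t}\frac{\partial t}{\partial z} = f_1 + xyf_2, \]

where f1 means the derivative of f with respect to the first variable, and f2 means that
with respect to the second variable. We have
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x\,\partial z} &= (f_1 + xyf_2)_x = (f_1)_x + yf_2 + xy(f_2)_x \\
&= \frac{\partial f_1}{\partial s}\frac{\partial s}{\partial x} + \frac{\partial f_1}{\partial t}\frac{\partial t}{\partial x} + yf_2 + xy\left(\frac{\partial f_2}{\partial s}\frac{\partial s}{\partial x} + \frac{\partial f_2}{\partial t}\frac{\partial t}{\partial x}\right) \\
&= f_{11} + f_{12}\,yz + yf_2 + xy(f_{21} + yzf_{22}) \\
&= f_{11} + yf_{12}(x + z) + yf_2 + xy^2 zf_{22}.
\end{aligned}
\]
The last step uses f12 = f21, which holds because the second-order partials of f are continuous.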

Note. Using the notation f1 , f2 , etc., helps to clarify some ambiguity in the notation
∂f/∂x, which may mean f1 (this is also fs ) or the partial derivative of the entire function
with respect to x.
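
To see the formula for the mixed partial in Example 2.4.7 at work, one can pick a concrete f and compare with a finite-difference estimate. In the sketch below, the choice f(s, t) = s²t + t² is purely hypothetical (it is not from the text), as are the sample point and the step size.

```python
def f(s, t):
    # Hypothetical concrete choice of f, used only for this check
    return s**2 * t + t**2

def u(x, y, z):
    # u = f(x + y + z, xyz)
    return f(x + y + z, x * y * z)

def u_xz_formula(x, y, z):
    # f11 + y(x + z) f12 + y f2 + x y^2 z f22, with partials of the chosen f
    s, t = x + y + z, x * y * z
    f2 = s**2 + 2 * t   # derivative in the second variable
    f11 = 2 * t         # second derivative in the first variable
    f12 = 2 * s         # mixed second derivative
    f22 = 2.0           # second derivative in the second variable
    return f11 + y * (x + z) * f12 + y * f2 + x * y**2 * z * f22

def u_xz_numeric(x, y, z, h=1e-3):
    # Central-difference estimate of the mixed partial of u in x and z
    return (u(x + h, y, z + h) - u(x + h, y, z - h)
            - u(x - h, y, z + h) + u(x - h, y, z - h)) / (4 * h * h)

point = (0.3, 0.5, 0.7)  # arbitrary sample point
formula_value = u_xz_formula(*point)
numeric_value = u_xz_numeric(*point)

print(formula_value, numeric_value)
```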

2.5 The Taylor expansion


For a function of one variable y = f (x), which has an (n + 1)th-order derivative at x = a,
we have

\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - a)^{n+1} \]
or
\[ f(x) = f(a) + f'(a)\Delta x + \frac{f''(a)}{2!}(\Delta x)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(\Delta x)^n + \frac{f^{(n+1)}(a + \theta\Delta x)}{(n + 1)!}(\Delta x)^{n+1}, \]

where Δx = x − a, ξ is between a and x, 0 < θ < 1, and

\[ f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 \]

is called the quadratic approximation. Similarly, for a function of two variables z =
f (x, y), which has (n + 1)th-order continuous partials at (a, b), we define

\[ \phi(t) = f(a + t\Delta x, b + t\Delta y). \]

So

\[ \phi(t) = \phi(0) + \phi'(0)t + \frac{\phi''(0)}{2!}t^2 + \cdots + \frac{\phi^{(n)}(0)}{n!}t^n + \frac{\phi^{(n+1)}(\theta t)}{(n + 1)!}t^{n+1}. \]

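Note that

\[ \phi(t) = f(a + t\Delta x, b + t\Delta y), \quad \phi(1) = f(x, y), \quad\text{and}\quad \phi(0) = f(a, b). \]

Thus,

\[
\begin{aligned}
\phi'(t) &= f_x(a + t\Delta x, b + t\Delta y)\Delta x + f_y(a + t\Delta x, b + t\Delta y)\Delta y \quad\text{and} \\
\phi'(0) &= f_x(a, b)\Delta x + f_y(a, b)\Delta y.
\end{aligned}
\]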

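Furthermore,

\[
\begin{aligned}
\phi''(t) &= f_{xx}(a + t\Delta x, b + t\Delta y)(\Delta x)^2 + f_{xy}(a + t\Delta x, b + t\Delta y)\Delta x\Delta y \\
&\quad + f_{yx}(a + t\Delta x, b + t\Delta y)\Delta x\Delta y + f_{yy}(a + t\Delta x, b + t\Delta y)(\Delta y)^2 \quad\text{and} \\
\phi''(0) &= f_{xx}(a, b)(\Delta x)^2 + 2f_{xy}(a, b)\Delta x\Delta y + f_{yy}(a, b)(\Delta y)^2,
\end{aligned}
\]
since fxy = fyx .

Letting t = 1, we have the quadratic approximation

\[
\begin{aligned}
f(x, y) &\approx f(a, b) + f_x(a, b)\Delta x + f_y(a, b)\Delta y \\
&\quad + \frac{1}{2}\left(f_{xx}(a, b)(\Delta x)^2 + 2f_{xy}(a, b)\Delta x\Delta y + f_{yy}(a, b)(\Delta y)^2\right).
\end{aligned}
\]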

Example 2.5.1. Use a linear and quadratic approximation to estimate
\[ \frac{1}{\sqrt{10 - (2.01)^2 - 2(0.98)^2}}. \]

Solution. Let z = f (x, y) = 1/√(10 − x² − 2y²). The function has continuous partials at (2, 1).
The value of the function at (2, 1) is f (2, 1) = 1/2, and its first partials are

\[
\begin{aligned}
f_x &= -\frac{1}{2}(10 - x^2 - 2y^2)^{-3/2}(-2x) = x(10 - x^2 - 2y^2)^{-3/2}, \\
f_x(2, 1) &= 2(10 - 2^2 - 2(1)^2)^{-3/2} = \frac{1}{4}, \\
f_y &= -\frac{1}{2}(10 - x^2 - 2y^2)^{-3/2}(-4y) = 2y(10 - x^2 - 2y^2)^{-3/2}, \\
f_y(2, 1) &= 2(1)(10 - 2^2 - 2(1)^2)^{-3/2} = \frac{1}{4}.
\end{aligned}
\]
Thus, the linear approximation is

\[
\begin{aligned}
f(2.01, 0.98) &\approx f(2, 1) + f_x(2, 1)\Delta x + f_y(2, 1)\Delta y \\
&= \frac{1}{2} + \frac{1}{4}(0.01) + \frac{1}{4}(-0.02) \\
&= 0.4975.
\end{aligned}
\]

Now, we compute the second partial derivatives. We have

\[
\begin{aligned}
f_{xx} &= (10 - x^2 - 2y^2)^{-3/2} + x\left(-\frac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-2x) \\
&= (10 - x^2 - 2y^2)^{-3/2} + 3x^2(10 - x^2 - 2y^2)^{-5/2}, \\
f_{xx}(2, 1) &= (10 - 2^2 - 2(1)^2)^{-3/2} + 3(2^2)(10 - 2^2 - 2(1)^2)^{-5/2} = \frac{1}{2}, \\
f_{xy} &= \left(x(10 - x^2 - 2y^2)^{-3/2}\right)'_y = x\left(-\frac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-4y), \\
f_{xy}(2, 1) &= 2\left(-\frac{3}{2}\right)(10 - 2^2 - 2(1)^2)^{-5/2}(-4(1)) = \frac{3}{8}, \\
f_{yy} &= \left(2y(10 - x^2 - 2y^2)^{-3/2}\right)'_y \\
&= 2(10 - x^2 - 2y^2)^{-3/2} + 2y\left(-\frac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-4y), \\
f_{yy}(2, 1) &= 2(10 - 2^2 - 2(1)^2)^{-3/2} + 2(1)\left(-\frac{3}{2}\right)(10 - 2^2 - 2(1)^2)^{-5/2}(-4(1)) = \frac{5}{8}.
\end{aligned}
\]

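Therefore, the quadratic approximation is

\[
\begin{aligned}
f(2.01, 0.98) &\approx f(2, 1) + f_x(2, 1)\Delta x + f_y(2, 1)\Delta y \\
&\quad + \frac{1}{2}\left(f_{xx}(2, 1)(\Delta x)^2 + 2f_{xy}(2, 1)\Delta x\Delta y + f_{yy}(2, 1)(\Delta y)^2\right) \\
&= \frac{1}{2} + \frac{1}{4}(0.01) + \frac{1}{4}(-0.02) \\
&\quad + \frac{1}{2}\left(\left(\frac{1}{2}\right)(0.01)^2 + 2\left(\frac{3}{8}\right)(0.01)(-0.02) + \left(\frac{5}{8}\right)(-0.02)^2\right) \\
&\approx 0.49758.
\end{aligned}
\]

Compared with the value of 1/√(10 − (2.01)² − 2(0.98)²) computed by a calculator, 0.49757, the
quadratic approximation gives a better estimate than the linear one.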
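
The comparison can be reproduced with a few lines of Python. The sketch below simply plugs in the values of the partial derivatives at (2, 1) obtained in the solution; the variable names are chosen here for illustration.

```python
import math

def f(x, y):
    # f(x, y) = 1 / sqrt(10 - x^2 - 2y^2)
    return 1 / math.sqrt(10 - x**2 - 2 * y**2)

a, b = 2.0, 1.0
dx, dy = 0.01, -0.02

# Values at (2, 1) taken from the worked solution
f0, fx, fy = 0.5, 0.25, 0.25
fxx, fxy, fyy = 0.5, 0.375, 0.625

linear = f0 + fx * dx + fy * dy
quadratic = linear + 0.5 * (fxx * dx**2 + 2 * fxy * dx * dy + fyy * dy**2)
exact = f(a + dx, b + dy)

print(f"linear    = {linear:.6f}")
print(f"quadratic = {quadratic:.6f}")
print(f"exact     = {exact:.6f}")
```

The quadratic value should be noticeably closer to the directly computed one than the linear value is.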

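If we use the notation

\[ \left(\Delta x\frac{\partial}{\partial x} + \Delta y\frac{\partial}{\partial y}\right)^n f(x, y) = \sum_{k=0}^{n}\binom{n}{k}(\Delta x)^k(\Delta y)^{n-k}\frac{\partial^n f(x, y)}{\partial x^k\,\partial y^{n-k}}, \]

then we can conclude with Taylor’s theorem (or the Taylor expansion) for a function
of two variables.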

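Theorem 2.5.1. If z = f (x, y) has continuous (n + 1)th partial derivatives on some neighborhood D
containing the point (a, b) and (x, y) = (a + Δx, b + Δy) is a point in D, then

\[
\begin{aligned}
f(x, y) &= f(a, b) + \left(\Delta x\frac{\partial}{\partial x} + \Delta y\frac{\partial}{\partial y}\right)f(a, b) + \frac{1}{2!}\left(\Delta x\frac{\partial}{\partial x} + \Delta y\frac{\partial}{\partial y}\right)^2 f(a, b) \\
&\quad + \cdots + \frac{1}{n!}\left(\Delta x\frac{\partial}{\partial x} + \Delta y\frac{\partial}{\partial y}\right)^n f(a, b) \\
&\quad + \frac{1}{(n + 1)!}\left(\Delta x\frac{\partial}{\partial x} + \Delta y\frac{\partial}{\partial y}\right)^{n+1} f(a + \theta\Delta x, b + \theta\Delta y)
\end{aligned}
\]

for some 0 < θ < 1.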



2.6 Implicit differentiation


We have shown some examples of explicitly defined functions where intermediate
variables can be eliminated, allowing us to find partial derivatives of the function with-
out using the chain rule. However, the chain rule can be extremely useful in finding
partials for functions defined implicitly by one or more equations.

2.6.1 Functions implicitly defined by a single equation

Recall that an equation involving several variables can, in theory, be solved to give
one variable, say, z, as a function of the other variables (with domain limitation), and
in this case z is said to be implicitly defined as a function of the other variables by
that equation. If the equation can actually be solved to give a formula for z, then z
is defined explicitly by that formula. Implicit differentiation is a process for finding
derivatives of implicitly defined functions.
The chain rules for functions of two variables can be used to find a formula to
implicitly differentiate a function of one variable implicitly defined by an equation.
We first consider an equation of the form F(x, y) = 0, which defines y implicitly as a
differentiable function of x, for all x in some set D. This means that there exists some
function y = y(x) such that F(x, y(x)) = 0 for all x ∈ D (D is the domain of y), but we
may not have a formula for y(x). In spite of the lack of a formula y(x), we can find a
formula for its derivative by the method of implicit differentiation, provided that F is
differentiable. We develop the formula here for dy/dx by differentiating both sides of the
equation F(x, y) = 0 with respect to x (assuming y is a function of x) using the chain
rule, i. e.,

\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0. \]

If ∂F/∂y ≠ 0, then we can solve for dy/dx, obtaining

\[ \frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{F_x}{F_y}. \]

There is a theorem, called the implicit function theorem, which guarantees the ex-
istence of the derivative dy/dx under some conditions. It says that if F is defined on a
region containing (a, b), where F(a, b) = 0, Fy (a, b) ≠ 0, and Fx and Fy are both con-
tinuous on this region, then the equation F(x, y) = 0 defines a unique y as a function
of x near a with y(a) = b, and the derivative dy/dx does exist and is equal to −Fx /Fy .

Example 2.6.1. Find y′ if x³ + y³ − 6xy + x² − 2y = 1.



Solution. Let

\[ F(x, y) = x^3 + y^3 - 6xy + x^2 - 2y - 1 = 0. \]

Then

\[ F_x = 3x^2 - 6y + 2x \quad\text{and}\quad F_y = 3y^2 - 6x - 2. \]

Therefore,
\[ y' = \frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{3x^2 - 6y + 2x}{3y^2 - 6x - 2}. \]
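
One way to sanity-check this slope is to locate a point on the curve numerically and take a small step along the tangent direction: the residual F should then be of second order in the step, whereas a step with the wrong slope leaves a first-order residual. The sketch below assumes the bracket [0, 1] for y at x = 1 (where F changes sign); the helper names are illustrative.

```python
def F(x, y):
    return x**3 + y**3 - 6 * x * y + x**2 - 2 * y - 1

def slope(x, y):
    # dy/dx = -F_x / F_y from implicit differentiation
    Fx = 3 * x**2 - 6 * y + 2 * x
    Fy = 3 * y**2 - 6 * x - 2
    return -Fx / Fy

def solve_y(x, lo=0.0, hi=1.0):
    # Bisection for F(x, y) = 0 in y; assumes F changes sign on [lo, hi]
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F(x, lo) * F(x, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x0 = 1.0
y0 = solve_y(x0)   # a point (x0, y0) on the curve
m = slope(x0, y0)

h = 1e-3
residual_tangent = F(x0 + h, y0 + m * h)  # step along the tangent line
residual_flat = F(x0 + h, y0)             # step with slope 0, for comparison

print(m, residual_tangent, residual_flat)
```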

Now we consider a more complicated version, where z is defined implicitly as a
function z = z(x, y) by an equation of the form F(x, y, z) = 0. If this equation defines a
differentiable function z = f (x, y) for (x, y) in some region, and F(x, y, z) is also
differentiable, then we can use the chain rule to differentiate F(x, y, z) = 0 with respect to x
(assuming that z is a function of the independent variables x and y) as follows:
\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0. \]
If ∂F/∂z ≠ 0, we solve for ∂z/∂x, i. e.,

\[ \frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z} = -\frac{F_x}{F_z}. \]

In a similar manner, we have

\[ \frac{\partial z}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z} = -\frac{F_y}{F_z}. \]

Example 2.6.2. Use three methods to find ∂z/∂x and ∂z/∂y when the equation x³ + y³ + z³ + 6xyz = 7 defines
z implicitly as a function of x and y.

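Solution. Method 1: Let F(x, y, z) = x³ + y³ + z³ + 6xyz − 7 = 0. Then

\[
\begin{aligned}
\frac{\partial z}{\partial x} &= -\frac{\partial F/\partial x}{\partial F/\partial z} = -\frac{3x^2 + 6yz}{3z^2 + 6xy} = -\frac{x^2 + 2yz}{z^2 + 2xy}, \\
\frac{\partial z}{\partial y} &= -\frac{\partial F/\partial y}{\partial F/\partial z} = -\frac{3y^2 + 6xz}{3z^2 + 6xy} = -\frac{y^2 + 2xz}{z^2 + 2xy}.
\end{aligned}
\]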

Method 2: We avoid the use of the formula by differentiating the equation with
respect to x, using the one-variable chain rule and the one-variable implicit
differentiation, assuming that z is a function of x, and treating y as a constant. Then we have

\[ 3x^2 + 3z^2\frac{\partial z}{\partial x} + 6yz + 6xy\frac{\partial z}{\partial x} = 0. \]

Solving this equation for ∂z/∂x, we obtain

\[ \frac{\partial z}{\partial x} = -\frac{x^2 + 2yz}{z^2 + 2xy}. \]
Similarly, implicit differentiation with respect to y gives the same formula as above for
∂z/∂y.
Method 3: We use the differential operator “d.” Then we obtain

\[
\begin{aligned}
d(x^3 + y^3 + z^3 + 6xyz) &= d(7), \\
d(x^3) + d(y^3) + d(z^3) + d(6xyz) &= d(7), \\
3x^2\,dx + 3y^2\,dy + 3z^2\,dz + 6(x\,d(yz) + yz\,dx) &= 0 \quad\text{(product rule)}, \\
x^2\,dx + y^2\,dy + z^2\,dz + 2(x(y\,dz + z\,dy) + yz\,dx) &= 0, \\
(x^2 + 2yz)\,dx + (y^2 + 2xz)\,dy + (z^2 + 2xy)\,dz &= 0.
\end{aligned}
\]

Then
\[ dz = -\frac{x^2 + 2yz}{z^2 + 2xy}\,dx - \frac{y^2 + 2xz}{z^2 + 2xy}\,dy. \]
This means
\[ \frac{\partial z}{\partial x} = -\frac{x^2 + 2yz}{z^2 + 2xy} \quad\text{and}\quad \frac{\partial z}{\partial y} = -\frac{y^2 + 2xz}{z^2 + 2xy}. \]
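
All three methods give the same formulas, and these can be checked against a numerically computed z(x, y). The sketch below solves the equation for z by bisection near (x, y) = (1, 1); the bracket [0, 1] for z, the sample point, and the step size are assumptions made for this illustration.

```python
def F(x, y, z):
    return x**3 + y**3 + z**3 + 6 * x * y * z - 7

def solve_z(x, y, lo=0.0, hi=1.0):
    # Bisection in z; assumes F(x, y, lo) < 0 < F(x, y, hi), true near (1, 1)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F(x, y, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def partials(x, y, z):
    # Formulas for the partial derivatives of z obtained in the example
    denom = z**2 + 2 * x * y
    return (-(x**2 + 2 * y * z) / denom, -(y**2 + 2 * x * z) / denom)

x0, y0 = 1.0, 1.0
z0 = solve_z(x0, y0)
zx, zy = partials(x0, y0, z0)

h = 1e-5
zx_num = (solve_z(x0 + h, y0) - solve_z(x0 - h, y0)) / (2 * h)
zy_num = (solve_z(x0, y0 + h) - solve_z(x0, y0 - h)) / (2 * h)

print(zx, zx_num)
print(zy, zy_num)
```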

2.6.2 Functions defined implicitly by systems of equations

In general, two equations will allow two variables to be defined implicitly as functions
of the remaining variables, three equations will allow three variables to be defined
implicitly as functions of the remaining variables, and so on.
We first consider the system of equations

\[ \begin{cases} F(x, y, z) = 0, \\ G(x, y, z) = 0. \end{cases} \]
Assume that two functions y(x) and z(x) are implicitly defined as functions of the
independent variable x. How do we find dy/dx and dz/dx? We apply the chain rule to both
equations simultaneously to obtain
\[ \begin{cases} F_x + F_y\dfrac{dy}{dx} + F_z\dfrac{dz}{dx} = 0, \\[1ex] G_x + G_y\dfrac{dy}{dx} + G_z\dfrac{dz}{dx} = 0. \end{cases} \]
If the Jacobian determinant
\[ \begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix} = F_yG_z - F_zG_y \neq 0, \]
then, by using Cramer’s rule, the determinant solution is
\[ \frac{dy}{dx} = -\frac{\begin{vmatrix} F_x & F_z \\ G_x & G_z \end{vmatrix}}{\begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix}} \quad\text{and}\quad \frac{dz}{dx} = -\frac{\begin{vmatrix} F_y & F_x \\ G_y & G_x \end{vmatrix}}{\begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix}}. \tag{2.5} \]

Example 2.6.3. If
\[ \begin{cases} x^2 + y^2 + z^2 = 4, \\ x^2 - y + 2z^2 = 2, \end{cases} \]
find dy/dx and dz/dx.

Solution. We could use equation (2.5), but instead we find the derivative with respect
to x for each side of the equations using implicit differentiation to obtain

\[ \begin{cases} 2x + 2y\dfrac{dy}{dx} + 2z\dfrac{dz}{dx} = 0, \\[1ex] 2x - \dfrac{dy}{dx} + 4z\dfrac{dz}{dx} = 0. \end{cases} \]

Now multiplying the first equation by 2 and subtracting it from the second equation,
we have
\[ \begin{cases} 2x + 2y\dfrac{dy}{dx} + 2z\dfrac{dz}{dx} = 0, \\[1ex] -2x - (1 + 4y)\dfrac{dy}{dx} = 0. \end{cases} \]

Solve for dy/dx to obtain

\[ \frac{dy}{dx} = \frac{-2x}{1 + 4y}. \]

Substituting dy/dx back into the first equation, we have

\[ \frac{dz}{dx} = -\frac{1}{2z}\left(2x + 2y\left(\frac{-2x}{1 + 4y}\right)\right) = -\frac{x + 2xy}{z + 4yz}. \]
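
The determinant formula (2.5) leads to the same derivatives. The following sketch evaluates (2.5) at one sample point lying on both surfaces and compares the result with the closed forms just obtained; the sample point and helper names are chosen here for illustration.

```python
import math

def det2(a, b, c, d):
    # Determinant of the 2x2 matrix [[a, b], [c, d]]
    return a * d - b * c

def derivatives(x, y, z):
    # F = x^2 + y^2 + z^2 - 4,  G = x^2 - y + 2z^2 - 2
    Fx, Fy, Fz = 2 * x, 2 * y, 2 * z
    Gx, Gy, Gz = 2 * x, -1.0, 4 * z
    jac = det2(Fy, Fz, Gy, Gz)
    if jac == 0:
        raise ZeroDivisionError("Jacobian determinant vanishes at this point")
    dy_dx = -det2(Fx, Fz, Gx, Gz) / jac
    dz_dx = -det2(Fy, Fx, Gy, Gx) / jac
    return dy_dx, dz_dx

# A sample point on both surfaces (chosen for illustration)
y0, z0 = 1.2, 0.8
x0 = math.sqrt(4 - y0**2 - z0**2)

dy_dx, dz_dx = derivatives(x0, y0, z0)
dy_closed = -2 * x0 / (1 + 4 * y0)
dz_closed = -(x0 + 2 * x0 * y0) / (z0 + 4 * y0 * z0)

print(dy_dx, dy_closed)
print(dz_dx, dz_closed)
```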

Now we consider the situation where the two variables u and v are defined implicitly
as functions u = u(x, y) and v = v(x, y) by two equations of the following form:

\[ \begin{cases} F(x, y, u, v) = 0, \\ G(x, y, u, v) = 0. \end{cases} \]

We use the chain rule to differentiate the equations with respect to x (keeping y
constant), and then we solve the two equations for ∂u/∂x and ∂v/∂x. Then we have

\[ \begin{cases} F_x + F_u\dfrac{\partial u}{\partial x} + F_v\dfrac{\partial v}{\partial x} = 0, \\[1ex] G_x + G_u\dfrac{\partial u}{\partial x} + G_v\dfrac{\partial v}{\partial x} = 0. \end{cases} \]

Note that these equations are linear in the variables ∂u/∂x and ∂v/∂x with coefficients Fx , Fu ,
Fv , Gx , Gu , and Gv . Consequently, we can solve for ∂u/∂x and ∂v/∂x using the usual methods
for solving linear equations. By Cramer’s rule, the determinant solution is

\[ \frac{\partial u}{\partial x} = -\frac{\begin{vmatrix} F_x & F_v \\ G_x & G_v \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}} \quad\text{and}\quad \frac{\partial v}{\partial x} = -\frac{\begin{vmatrix} F_u & F_x \\ G_u & G_x \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}}, \tag{2.6} \]

given that the denominator is not 0. Similarly, we can find

\[ \frac{\partial u}{\partial y} = -\frac{\begin{vmatrix} F_y & F_v \\ G_y & G_v \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}} \quad\text{and}\quad \frac{\partial v}{\partial y} = -\frac{\begin{vmatrix} F_u & F_y \\ G_u & G_y \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}} \tag{2.7} \]

by differentiating both of the original equations with respect to y.

Example 2.6.4. Suppose two equations xu − yv = 1 and yu + xv = 2 implicitly define u and v as
functions of x and y. Find ∂u/∂x, ∂u/∂y, ∂v/∂x, and ∂v/∂y.

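Solution. Differentiating the system of equations with respect to x (assuming y is a
constant), we have
\[ u + x\frac{\partial u}{\partial x} - y\frac{\partial v}{\partial x} = 0 \quad\text{and}\quad y\frac{\partial u}{\partial x} + v + x\frac{\partial v}{\partial x} = 0. \]
Solving this system of linear equations (treating x, y, u, v as constants) for ∂u/∂x and ∂v/∂x,
we obtain
\[ \frac{\partial u}{\partial x} = -\frac{xu + yv}{x^2 + y^2} \quad\text{and}\quad \frac{\partial v}{\partial x} = \frac{-xv + yu}{x^2 + y^2}. \]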
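We can also use equation (2.7). Since

\[ F(x, y, u, v) = xu - yv - 1 = 0 \quad\text{and}\quad G(x, y, u, v) = yu + xv - 2 = 0, \]

we have

\[ F_u = x, \quad F_v = -y, \quad F_y = -v, \quad G_u = y, \quad G_v = x, \quad\text{and}\quad G_y = u. \]

Then, by equation (2.7), we obtain

\[ \frac{\partial u}{\partial y} = -\frac{\begin{vmatrix} F_y & F_v \\ G_y & G_v \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}} = -\frac{\begin{vmatrix} -v & -y \\ u & x \end{vmatrix}}{\begin{vmatrix} x & -y \\ y & x \end{vmatrix}} = -\frac{-xv + yu}{x^2 + y^2} = \frac{xv - yu}{x^2 + y^2}. \]
Similarly,
\[ \frac{\partial v}{\partial y} = -\frac{\begin{vmatrix} F_u & F_y \\ G_u & G_y \end{vmatrix}}{\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix}} = -\frac{\begin{vmatrix} x & -v \\ y & u \end{vmatrix}}{\begin{vmatrix} x & -y \\ y & x \end{vmatrix}} = -\frac{xu + yv}{x^2 + y^2}. \]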
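
Because this particular system is linear in u and v, it can be solved explicitly, which gives an independent check on the four partial derivatives. In the sketch below, the explicit solution u = (x + 2y)/(x² + y²), v = (2x − y)/(x² + y²) is derived here for the purpose of the check (it is not stated in the text), and the sample point is arbitrary.

```python
def uv(x, y):
    # Explicit solution of x*u - y*v = 1, y*u + x*v = 2 (derived for this check)
    r2 = x**2 + y**2
    return (x + 2 * y) / r2, (2 * x - y) / r2

def partial_formulas(x, y):
    # Results of the example, evaluated with u and v from the explicit solution
    u, v = uv(x, y)
    r2 = x**2 + y**2
    return {
        "u_x": -(x * u + y * v) / r2,
        "v_x": (y * u - x * v) / r2,
        "u_y": (x * v - y * u) / r2,
        "v_y": -(x * u + y * v) / r2,
    }

def partial_numeric(x, y, h=1e-6):
    # Central differences of the explicit solution
    up, vp = uv(x + h, y)
    um, vm = uv(x - h, y)
    uq, vq = uv(x, y + h)
    un, vn = uv(x, y - h)
    return {
        "u_x": (up - um) / (2 * h),
        "v_x": (vp - vm) / (2 * h),
        "u_y": (uq - un) / (2 * h),
        "v_y": (vq - vn) / (2 * h),
    }

x0, y0 = 1.0, 2.0  # arbitrary sample point
exact = partial_formulas(x0, y0)
approx = partial_numeric(x0, y0)
for name in exact:
    print(name, exact[name], approx[name])
```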

Example 2.6.5. Assume x, y are independent variables and

\[ \begin{cases} u = x^2 + y^2 + \cos v, \\ y\sin v + v\sin x = 0. \end{cases} \]

Find ∂v/∂x and ∂u/∂y at the point (x, y, u, v) = (0, 1, 0, π).

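Solution. We differentiate both equations with respect to x to get

\[ \begin{cases} \dfrac{\partial u}{\partial x} = 2x - \sin v\,\dfrac{\partial v}{\partial x}, \\[1ex] y\cos v\,\dfrac{\partial v}{\partial x} + \dfrac{\partial v}{\partial x}\sin x + v\cos x = 0. \end{cases} \]

When x = 0, y = 1, u = 0, v = π, we have
\[ \begin{cases} \dfrac{\partial u}{\partial x} = 0, \\[1ex] -\dfrac{\partial v}{\partial x} + \pi\cos 0 = 0. \end{cases} \]

Thus, ∂v/∂x = π at (0, 1, 0, π). To compute ∂u/∂y, we differentiate both equations with respect to y
to get
\[ \begin{cases} \dfrac{\partial u}{\partial y} = 2y - \sin v\,\dfrac{\partial v}{\partial y}, \\[1ex] \sin v + y\cos v\,\dfrac{\partial v}{\partial y} + \dfrac{\partial v}{\partial y}\sin x = 0. \end{cases} \]

When x = 0, y = 1, u = 0, v = π, the first equation gives ∂u/∂y = 2 at (0, 1, 0, π).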

2.7 Tangent lines and tangent planes


2.7.1 Tangent lines and normal planes to a curve

We have already found tangent lines and normal planes for a curve C in space given
by a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩, where x(t), y(t), and z(t) are differ-
entiable functions of t. The line tangent to the curve at t = t0 is

\[ \mathbf{r}(t) = \mathbf{r}(t_0) + \mathbf{r}'(t_0)t, \]

where r′(t0) is the tangent vector at t = t0. The parametric equations and symmetric
equations of the tangent line are, therefore,

\[ \begin{cases} x = x_0 + x'(t_0)t, \\ y = y_0 + y'(t_0)t, \\ z = z_0 + z'(t_0)t, \end{cases} \quad\text{and}\quad \frac{x - x_0}{x'(t_0)} = \frac{y - y_0}{y'(t_0)} = \frac{z - z_0}{z'(t_0)}, \quad\text{respectively.} \]

The normal plane at the same point is x′(t0)(x − x0) + y′(t0)(y − y0) + z′(t0)(z − z0) = 0.
Now we are able to find tangent lines and normal planes for a curve C that is im-
plicitly defined by a system of equations of the form

C = {(x, y, z)|F(x, y, z) = 0 and G(x, y, z) = 0}.

In general, these equations implicitly define two variables as functions of the third,
say, y = y(x) and z = z(x). Thus, we can parameterize C with parameter x as follows:

x = x, y = y(x), and z = z(x).



The implicit differentiation methods described previously allow us to compute dy/dx and
dz/dx even though we do not have formulas for y(x) and z(x). Consequently, we are able
to find the tangent vector ⟨1, dy/dx, dz/dx⟩ and the tangent line at any point on C.

Example 2.7.1. Find an equation of the tangent line and an equation of the normal plane at the point
(1, −2, 1) of the curve C defined implicitly by

\[ C = \{(x, y, z) \mid x^2 + y^2 + z^2 = 6 \text{ and } x + y + z = 0\}. \]

Solution. If we regard x as the independent variable and as the parameter, then the
system of equations implicitly defines two functions y = y(x) and z = z(x). Choosing
the parameterizations x = x, y = y(x), and z = z(x), a tangent vector of the curve is
⟨1, dy/dx, dz/dx⟩. In order to find the derivatives of these two functions, first implicitly
differentiate the system of equations with respect to x, i. e.,

\[ \begin{cases} 2x + 2y\dfrac{dy}{dx} + 2z\dfrac{dz}{dx} = 0, \\[1ex] 1 + \dfrac{dy}{dx} + \dfrac{dz}{dx} = 0. \end{cases} \]

Solving this system of linear equations for dy/dx and dz/dx gives

\[ \frac{dy}{dx} = \frac{z - x}{y - z} \quad\text{and}\quad \frac{dz}{dx} = \frac{x - y}{y - z}. \]

Therefore, when x = 1, y = −2, and z = 1, we have y′(1) = 0 and z′(1) = −1, and the
tangent vector ⟨1, dy/dx, dz/dx⟩ is

\[ \mathbf{v} = \langle 1, 0, -1\rangle. \]

So, symmetric equations of the tangent line are

\[ \frac{x - 1}{1} = \frac{y + 2}{0} = \frac{z - 1}{-1}. \]

Parametric equations, with parameter t, of the tangent line are

\[ x = 1 + t, \quad y = -2, \quad z = 1 - t. \]

An equation of the normal plane is

\[ 1 \cdot (x - 1) + 0 \cdot (y + 2) + (-1) \cdot (z - 1) = 0. \]

This simplifies to the equation x − z = 0.
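
The computation of the tangent vector can be sketched in code by solving the differentiated system with Cramer's rule. The function names below are illustrative; the check at the end uses the fact that points on the tangent line satisfy the plane equation exactly and the sphere equation up to a second-order term.

```python
def tangent_vector(x, y, z):
    # Solve  2y*y' + 2z*z' = -2x  and  y' + z' = -1  by Cramer's rule
    det = 2 * y * 1 - 2 * z * 1
    if det == 0:
        raise ValueError("x cannot be used as the parameter at this point")
    dy = ((-2 * x) * 1 - (2 * z) * (-1)) / det
    dz = ((2 * y) * (-1) - (-2 * x) * 1) / det
    return (1.0, dy, dz)

def sphere_residual(point, v, t):
    # x^2 + y^2 + z^2 - 6 evaluated at point + t*v
    x, y, z = (p + t * c for p, c in zip(point, v))
    return x**2 + y**2 + z**2 - 6

P = (1, -2, 1)
v = tangent_vector(*P)
residual = sphere_residual(P, v, 0.01)

print(v, residual)
```

At a point where y = z, such as (−2, 1, 1), the determinant is zero and this sketch raises an error, which signals that x is not a suitable parameter there.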



Note. If this example had asked for the tangent line at (−2, 1, 1), then something dif-
ferent would have happened. If you try the parameterization

x = x, y = y(x), and z = z(x),

then you will have the problem that

\[ \begin{cases} 2x + 2y\dfrac{dy}{dx} + 2z\dfrac{dz}{dx} = 0, \\[1ex] 1 + \dfrac{dy}{dx} + \dfrac{dz}{dx} = 0 \end{cases} \]

has no solutions at (−2, 1, 1) since the two equations would be inconsistent. This does
not mean that there is no tangent there, but the tangent line is parallel to the yz-plane
(perpendicular to the x-axis). To solve the problem, we try a different way to
parameterize the curve, i. e.,

\[ x = x(y), \quad y = y, \quad\text{and}\quad z = z(y). \]

Then, differentiating with respect to y, we have

\[ \begin{cases} 2x\dfrac{dx}{dy} + 2y + 2z\dfrac{dz}{dy} = 0, \\[1ex] \dfrac{dx}{dy} + 1 + \dfrac{dz}{dy} = 0. \end{cases} \]

At (−2, 1, 1) we obtain dx/dy = 0 and dz/dy = −1. Therefore, the tangent vector is T = ⟨0, 1, −1⟩.
So, the tangent line is

\[ \frac{x + 2}{0} = \frac{y - 1}{1} = \frac{z - 1}{-1}. \]
Figure 2.11 shows these graphs.

Figure 2.11: Tangent line to the intersection curve of a plane and a sphere.

2.7.2 Tangent planes and normal lines to a surface

Suppose S is a surface with equation F(x, y, z) = 0, and let M(x0 , y0 , z0 ) be a specific
point on S. We show that there is a unique direction that is normal to (perpendicular
to) the surface S at the point M, provided F is differentiable.
Consider any curve C that passes through the point M and lies entirely in the sur-
face S, with parametric equations

x = x(t), y = y(t), and z = z(t), (α ≤ t ≤ β)

such that t = t0 gives the point M. Since C lies on S, any point (x(t), y(t), z(t)) on C must
satisfy the defining equation F(x, y, z) = 0 of S, so that

F(x(t), y(t), z(t)) = 0. (2.8)

If x(t), y(t), and z(t) are differentiable functions of t, and F is also differentiable, then
we can use the chain rule to differentiate both sides of (2.8) as follows:
𝜕F dx 𝜕F dy 𝜕F dz
+ + = 0.
𝜕x dt 𝜕y dt 𝜕z dt

Computing this at M where t = t0 gives

\[ F_x(M)x'(t_0) + F_y(M)y'(t_0) + F_z(M)z'(t_0) = 0, \tag{2.9} \]

and rewriting this as a dot product of two vectors shows

\[ \langle F_x(M), F_y(M), F_z(M)\rangle \cdot \langle x'(t_0), y'(t_0), z'(t_0)\rangle = 0. \]

Denoting the vectors by

\[ \mathbf{n} = \langle F_x(M), F_y(M), F_z(M)\rangle \]

and

\[ \mathbf{v} = \langle x'(t_0), y'(t_0), z'(t_0)\rangle, \]

where v is the tangent vector to C at M, equation (2.9) can be written in terms of a dot
product as

n ⋅ v = 0. (2.10)

This equation shows that n is perpendicular to the tangent vector v at M for any curve C
on S that passes through M and satisfies the above differentiability conditions. There-
fore, all the tangent lines of these curves at M must be coplanar as shown in Figure 2.12.
Those tangent lines form a plane which we define as the tangent plane to the surface
at M.

Figure 2.12: Tangent planes and normal line to a surface at a point.

Definition 2.7.1. Assume a surface in space has equation F(x, y, z) = 0, and F(x, y, z) is differentiable
at M(x0 , y0 , z0 ). Then the tangent plane to the surface at M is

\[ F_x(M)(x - x_0) + F_y(M)(y - y_0) + F_z(M)(z - z_0) = 0. \]

The nonzero vector n = ⟨Fx (M), Fy (M), Fz (M)⟩ is a normal vector of the tangent plane at M. The normal
line at M is
\[ \frac{x - x_0}{F_x(M)} = \frac{y - y_0}{F_y(M)} = \frac{z - z_0}{F_z(M)}. \]

Example 2.7.2. Find equations of the tangent plane and normal line to the ellipsoid

\[ 2x^2 + 4y^2 + z^2 = 10 \]

at the point (−1, 1, −2).

Solution. The equation of the ellipsoid can be written as

\[ F(x, y, z) = 2x^2 + 4y^2 + z^2 - 10 = 0. \]

Therefore, we have

\[ F_x(x, y, z) = 4x, \quad F_y(x, y, z) = 8y, \quad F_z(x, y, z) = 2z, \quad\text{and} \]
\[ F_x(-1, 1, -2) = -4, \quad F_y(-1, 1, -2) = 8, \quad F_z(-1, 1, -2) = -4. \]

The equation of the tangent plane at (−1, 1, −2) is, therefore,

\[ -4(x - (-1)) + 8(y - 1) + (-4)(z - (-2)) = 0, \]

which simplifies to 2y − x − z − 5 = 0.
The symmetric equations of the normal line are

\[ \frac{x + 1}{-4} = \frac{y - 1}{8} = \frac{z + 2}{-4}. \]
Figure 2.13(a) shows the tangent plane and normal line at (−1, 1, −2).
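
The steps of this example (evaluate the partial derivatives at the point, then assemble the plane) can be sketched as follows. The function names are illustrative; the plane is returned in the form Ax + By + Cz = D.

```python
def F(x, y, z):
    # Ellipsoid written as F(x, y, z) = 0
    return 2 * x**2 + 4 * y**2 + z**2 - 10

def normal_vector(x, y, z):
    # (F_x, F_y, F_z) for the ellipsoid above
    return (4 * x, 8 * y, 2 * z)

def tangent_plane(point):
    # Returns (A, B, C, D) for the plane A*x + B*y + C*z = D through the point
    n = normal_vector(*point)
    d = sum(ni * pi for ni, pi in zip(n, point))
    return (*n, d)

M = (-1, 1, -2)
A, B, C, D = tangent_plane(M)
print(F(*M))          # should be 0 if M lies on the surface
print(A, B, C, D)     # coefficients of the tangent plane
```

Dividing the resulting equation −4x + 8y − 4z = 20 by 4 gives the simplified form 2y − x − z − 5 = 0 found in the solution.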


Figure 2.13: Tangent plane and normal line. Examples 2.7.2 and 2.7.3.

The equation of a surface is given explicitly by z = f (x, y)


In the special case in which the equation of a surface S is of the form z = f (x, y), we
can rewrite this equation in the original form as

F(x, y, z) = f (x, y) − z = 0.

If f is differentiable, then

Fx (x, y, z) = fx (x, y), Fy (x, y, z) = fy (x, y), and Fz (x, y, z) = −1,

and a normal vector to the tangent plane is ⟨fx , fy , −1⟩. Thus, an equation of the tangent
plane to the surface at (x0 , y0 , z0 ) becomes

fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ) − (z − z0 ) = 0, (2.11)

or

z = z0 + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ). (2.12)

An equation of the normal line to the surface at (x0 , y0 , z0 ) becomes


\[ \frac{x - x_0}{f_x(x_0, y_0)} = \frac{y - y_0}{f_y(x_0, y_0)} = \frac{z - z_0}{-1}. \tag{2.13} \]

Example 2.7.3. Find the tangent plane and normal line to the elliptic paraboloid z = 2x² + y² at the
point (1, 1, 3).

Solution. Let f (x, y) = 2x² + y². Then

\[ f_x(x, y) = 4x, \quad f_y(x, y) = 2y, \]
\[ f_x(1, 1) = 4, \quad f_y(1, 1) = 2. \]

This gives an equation of the tangent plane at (1, 1, 3) as

\[ 4(x - 1) + 2(y - 1) - 1(z - 3) = 0, \]

or

\[ 4x + 2y - z = 3. \]

Symmetric equations of the normal line at this point are

\[ \frac{x - 1}{4} = \frac{y - 1}{2} = \frac{z - 3}{-1}. \]

Figure 2.13(b) shows the elliptic paraboloid and its tangent plane at (1, 1, 3) that we
found in this example.

Note. If the surface has a parameterization

r(u, v) = ⟨x(u, v), y(u, v), z(u, v)⟩,

then ru × rv is a vector normal to the plane tangent to the surface. If the point P on the
surface is r(u0 , v0 ), then r(u, v0 ) is a curve on the surface passing through P;
thus, ru (u0 , v0 ) is its tangent vector at P. Similarly, rv (u0 , v0 ) is a tangent vector of the
curve r(u0 , v). Thus, a normal vector is obtained by taking the cross product of the two
tangent vectors. If we choose the parameterization

r(u, v) = ⟨u, v, f (u, v)⟩

for the surface z = f (x, y), then a normal vector of its tangent plane is

ru × rv = ⟨1, 0, fu ⟩ × ⟨0, 1, fv ⟩ = ⟨−fu , −fv , 1⟩,

which is ⟨−fx , −fy , 1⟩, the negative of (and hence parallel to) the normal vector ⟨fx , fy , −1⟩ found before.



Tangent plane approximation


Recall the total differential/linear approximation which we discussed in Section 2.3.3.
Note that when dx = Δx = x − x0 , dy = Δy = y − y0 , and Δz = z − z0 , we have

Δz ≈ dz = fx (x0 , y0 )Δx + fy (x0 , y0 )Δy,


z − z0 ≈ fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ).

The linearization z = z0 +fx (x0 , y0 )(x−x0 )+fy (x0 , y0 )(y−y0 ) is exactly an equation of the
tangent plane at the point (x0 , y0 , z0 ). The change Δz is approximated by the change
dz in the corresponding tangent plane, as shown in Figure 2.14.
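
A small numerical illustration of the tangent plane approximation is sketched below. It reuses the paraboloid z = 2x² + y² and the point (1, 1, 3) of Example 2.7.3 as a convenient test case; the nearby point (1.02, 0.97) is an arbitrary choice.

```python
def f(x, y):
    # Illustrative surface: the paraboloid z = 2x^2 + y^2 of Example 2.7.3
    return 2 * x**2 + y**2

def tangent_plane(x, y, x0=1.0, y0=1.0):
    # z = z0 + fx(x0, y0)(x - x0) + fy(x0, y0)(y - y0)
    z0 = f(x0, y0)
    fx0, fy0 = 4 * x0, 2 * y0
    return z0 + fx0 * (x - x0) + fy0 * (y - y0)

x, y = 1.02, 0.97
exact = f(x, y)               # height of the surface
approx = tangent_plane(x, y)  # height of the tangent plane
error = exact - approx

print(exact, approx, error)
```

For this surface the error equals 2(Δx)² + (Δy)², which is of second order in the increments, as expected of a linear approximation.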

Figure 2.14: Tangent plane approximation.

2.8 Directional derivatives and gradient vectors


When you climb a mountain, at a certain point, you might ask, what is the slope of the
surface? Your answer might be “it depends on which direction I go.” This is the idea
behind the directional derivative, i. e., to find the rate of change along some given di-
rection. Recall that if z = f (x, y), then the partial derivatives fx and fy represent the rates
of change of z in the x- and y-direction, respectively (when the independent variables
(x, y) change in the directions of the unit vectors i and j). How can we find the rate of
change in other directions?
Suppose that we now wish to find the rate of change of z at the point P(a, b) when
the independent variables (x, y) change in the direction of an arbitrary unit vector u =
⟨cos α, cos β⟩, where cos α and cos β are the direction cosines of u. If α is measured
counterclockwise from the positive x-axis, then cos β = sin α, so u = ⟨cos α, sin α⟩.
As shown in Figure 2.15, a point on the half-line L in the direction of u
at a distance l ≥ 0 from the point P(a, b) is given by (a + l cos α, b + l sin α). The change
of f (x, y) in the direction u is

Δz = f (a + l cos α, b + l sin α) − f (a, b).



Figure 2.15: Directional derivatives.

This involves one variable l, so we define the directional derivative along the direction
u as follows.

Definition 2.8.1. Let z = f (x, y) be a function of two variables and (a, b) be an interior point in its
domain; u = ⟨cos α, sin α⟩ is a unit vector. The directional derivative of z at the point (a, b) in the
direction u is defined by

\[ D_{\mathbf{u}} f(a, b) = \lim_{l \to 0^+}\frac{f(a + l\cos\alpha, b + l\sin\alpha) - f(a, b)}{l}, \]

provided the limit exists.

Note. We also use notations such as dz/dl, ∂z/∂l, ∂f/∂l, or ∂z/∂ρ for directional derivatives.

Example 2.8.1. Find the directional derivative of z = √(x² + y²) at (0, 0) in the direction ⟨1/√2, 1/√2⟩.

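Solution. The direction u = ⟨1/√2, 1/√2⟩ is a unit vector already, so cos α = 1/√2, sin α = cos β = 1/√2,
and, with (a, b) = (0, 0),
\[
\begin{aligned}
D_{\mathbf{u}} f(0, 0) &= \lim_{l \to 0^+}\frac{f(a + l\cos\alpha, b + l\sin\alpha) - f(a, b)}{l} \\
&= \lim_{l \to 0^+}\frac{\sqrt{\left(0 + l\frac{1}{\sqrt{2}}\right)^2 + \left(0 + l\frac{1}{\sqrt{2}}\right)^2} - \sqrt{0^2 + 0^2}}{l} \\
&= \lim_{l \to 0^+}\frac{l}{l} = 1.
\end{aligned}
\]
It is easy to see from the graph of this right circular cone that, at the vertex, the slope
along any direction is tan 45° = 1.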
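
The role of the one-sided limit can be seen numerically. The sketch below evaluates the difference quotient of the cone at the origin for several small positive l, and also along the x-axis for l of both signs; the helper name and the chosen step sizes are illustrative.

```python
import math

def f(x, y):
    # The cone z = sqrt(x^2 + y^2)
    return math.sqrt(x**2 + y**2)

def quotient(l, ux, uy):
    # Difference quotient of f at the origin along the unit vector (ux, uy)
    return (f(l * ux, l * uy) - f(0.0, 0.0)) / l

c = 1 / math.sqrt(2)
one_sided = [quotient(l, c, c) for l in (1e-1, 1e-3, 1e-6)]  # l -> 0+

right = quotient(1e-6, 1.0, 0.0)    # along the x-axis with l > 0
left = quotient(-1e-6, 1.0, 0.0)    # along the x-axis with l < 0

print(one_sided)
print(right, left)
```

The quotients for l > 0 stay at 1, while the quotient along the x-axis changes sign with l, so the two-sided limit that would define the partial derivative with respect to x does not exist at the origin.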

Note. The cone has no partial derivatives at (0, 0). So, this example shows that a
function may have a directional derivative at a point in some direction, even though it may
not have partial derivatives at that point. This is because the directional derivative is
defined as a one-sided limit!
Surprisingly, if a function is differentiable, its derivative in any direction exists,
and we can find directional derivatives using its partials. This is shown in the following
theorem.

Theorem 2.8.1. If z = f (x, y) is differentiable at P0 (a, b), then the directional derivative of f exists at
P0 in the direction given by any unit vector u = ⟨cos α, cos β⟩ = ⟨cos α, sin α⟩ and

Du f (a, b) = fx (a, b) cos α + fy (a, b) cos β. (2.14)

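Proof. Since f (x, y) is differentiable at the point P0 (a, b), we have

$$\Delta z = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + o\!\left(\sqrt{\Delta x^2 + \Delta y^2}\right),$$

so, taking Δx = l cos α and Δy = l sin α,

$$f(a + l\cos\alpha,\; b + l\sin\alpha) - f(a, b) = f_x(a, b)\, l\cos\alpha + f_y(a, b)\, l\sin\alpha + o\!\left(\sqrt{[l\cos\alpha]^2 + [l\sin\alpha]^2}\right).$$

It follows that

$$\begin{aligned}
D_{\mathbf{u}} f(a, b) &= \lim_{l \to 0^+} \frac{f(a + l\cos\alpha,\; b + l\sin\alpha) - f(a, b)}{l} \\
&= \lim_{l \to 0^+} \frac{f_x(a, b)\, l\cos\alpha + f_y(a, b)\, l\sin\alpha + o\!\left(\sqrt{[l\cos\alpha]^2 + [l\sin\alpha]^2}\right)}{l} \\
&= \lim_{l \to 0^+} \frac{f_x(a, b)\, l\cos\alpha + f_y(a, b)\, l\sin\alpha + o(l)}{l} \\
&= f_x(a, b)\cos\alpha + f_y(a, b)\sin\alpha,
\end{aligned}$$

which is (2.14), because cos β = sin α.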

Now we can rewrite the directional derivative in a dot product form, i. e.,

Du f (a, b) = fx (a, b) cos α + fy (a, b) sin α = ⟨fx (a, b), fy (a, b)⟩ ⋅ ⟨cos α, sin α⟩.

This is

Du f (a, b) = ⟨fx (a, b), fy (a, b)⟩ ⋅ u. (2.15)

Example 2.8.2. Find the directional derivative of z = xe^{2y} at P(1, 0) in the direction from P to the point
Q(2, −1).
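Solution. The unit vector in the direction of the vector from P to Q is

$$\mathbf{u} = \frac{\langle 2 - 1,\, -1 - 0\rangle}{\sqrt{(2 - 1)^2 + (-1 - 0)^2}} = \left\langle \frac{1}{\sqrt{2}},\, \frac{-1}{\sqrt{2}} \right\rangle.$$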

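Since

$$\left.\frac{\partial z}{\partial x}\right|_{(1,0)} = \left. e^{2y}\right|_{(1,0)} = 1 \quad\text{and}\quad \left.\frac{\partial z}{\partial y}\right|_{(1,0)} = \left. 2xe^{2y}\right|_{(1,0)} = 2,$$

by equation (2.15) the desired directional derivative is

$$D_{\mathbf{u}} f(1, 0) = \langle 1, 2\rangle \cdot \left\langle \frac{1}{\sqrt{2}},\, -\frac{1}{\sqrt{2}} \right\rangle = 1\cdot\frac{1}{\sqrt{2}} + 2\cdot\left(-\frac{1}{\sqrt{2}}\right) = -\frac{\sqrt{2}}{2}.$$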
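As a sanity check on Example 2.8.2, the sketch below compares the dot-product formula (2.15) with the one-sided difference quotient from Definition 2.8.1. The function names and the step size l = 1e-6 are illustrative choices, not part of the original text.

```python
import math

def f(x, y):
    return x * math.exp(2 * y)

def grad_f(x, y):
    # Partial derivatives computed by hand: f_x = e^{2y}, f_y = 2x e^{2y}.
    return (math.exp(2 * y), 2 * x * math.exp(2 * y))

def unit(v):
    """Normalize a 2D vector to unit length."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

P, Q = (1.0, 0.0), (2.0, -1.0)
u = unit((Q[0] - P[0], Q[1] - P[1]))

# Formula (2.15): gradient dotted with the unit direction.
gx, gy = grad_f(*P)
by_formula = gx * u[0] + gy * u[1]

# Definition 2.8.1: one-sided difference quotient with a small l > 0.
l = 1e-6
by_limit = (f(P[0] + l * u[0], P[1] + l * u[1]) - f(*P)) / l

print(by_formula, by_limit)
```

Both numbers should be close to −√2/2 ≈ −0.7071; the difference quotient only approximates the limit, so small discrepancies are expected.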

Steepest ascent/descent and the gradient vector


Again, suppose you are somewhere on a mountain and want to know the direction of
the steepest ascending path. The directional derivative can answer this question now.
Assume z = f (x, y) is differentiable at (a, b). The directional derivative in the direction
u is

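$$\begin{aligned}
D_{\mathbf{u}} f &= \langle f_x(a, b), f_y(a, b)\rangle \cdot \mathbf{u} \\
&= \left|\langle f_x(a, b), f_y(a, b)\rangle\right|\,|\mathbf{u}|\cos\theta \\
&= \sqrt{f_x^2(a, b) + f_y^2(a, b)}\,\cos\theta,
\end{aligned}$$

where θ is the angle between ⟨fx (a, b), fy (a, b)⟩ and u, and |u| = 1.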

Therefore, the maximum directional derivative that f can obtain is √(fx² + fy²), and this
happens when θ = 0, that is, when the two vectors ⟨fx (a, b), fy (a, b)⟩ and u point in
the same direction. The minimum directional derivative that f can obtain is −√(fx² + fy²),
and this happens when θ = π, that is, when the two vectors ⟨fx (a, b), fy (a, b)⟩ and u
point in exactly opposite directions. Therefore, along the direction ⟨fx (a, b), fy (a, b)⟩,
the function f obtains its greatest directional derivative; f attains its smallest directional
derivative in the direction −⟨fx (a, b), fy (a, b)⟩, as shown in Figure 2.16. We give
the vector ⟨fx (a, b), fy (a, b)⟩ a special name.

Definition 2.8.2. If z = f (x, y) is a differentiable function, then the gradient of f at the point (a, b) is
the vector ∇f (a, b) defined by

∇f (a, b) = ⟨fx (a, b), fy (a, b)⟩ = fx (a, b)i + fy (a, b)j.

The gradient vector at (x, y) is the vector function ∇f (x, y) defined by

∇f (x, y) = ⟨fx (x, y), fy (x, y)⟩ = fx (x, y)i + fy (x, y)j.

Note. Sometimes ∇f (x, y) is also denoted by grad f.



Figure 2.16: Gradient vectors, steepest ascent/descent.

By the above definition, we can write the directional derivative of f in the direction
given by a unit vector u as

Du f (a, b) = ∇f ⋅ u, (2.16)

and the steepest ascent/steepest slope of f at (x, y) is |∇f |, which occurs in the direction
∇f . The steepest descent of f at (x, y) is −|∇f |, which occurs in the direction of −∇f . In
fact, the directional derivative of f at (a, b) in the direction u is the scalar projection of
∇f onto the vector u.
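The claim that the directional derivative is largest in the direction of ∇f can be illustrated by brute force. The sketch below is only an illustration under stated assumptions: it uses f (x, y) = x² − 3y² at the point (2, 1) as a sample function, and it samples 3600 evenly spaced directions; both choices are arbitrary.

```python
import math

def grad_f(x, y):
    # Gradient of the sample function f(x, y) = x^2 - 3y^2.
    return (2 * x, -6 * y)

gx, gy = grad_f(2.0, 1.0)
norm = math.hypot(gx, gy)

def directional(theta):
    """D_u f = grad f . u for the unit vector u = <cos(theta), sin(theta)>."""
    return gx * math.cos(theta) + gy * math.sin(theta)

# Sample 3600 directions and pick the steepest ascent and steepest descent.
step = 2 * math.pi / 3600
thetas = [k * step for k in range(3600)]
best = max(thetas, key=directional)
worst = min(thetas, key=directional)

# Polar angle of the gradient itself, reduced to [0, 2*pi).
gradient_angle = math.atan2(gy, gx) % (2 * math.pi)

print(directional(best), norm, best, gradient_angle)
print(directional(worst), -norm)
```

Up to the resolution of the sampling, the best direction coincides with the polar angle of the gradient, and the extreme values are ±|∇f |.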

Gradient vectors and level curves


Recall that a level curve of z = f (x, y) is a contour projected onto the xy-plane. Then,
a level curve in the xy-plane has an equation f (x, y) = k, where k is some constant.
Assume the equation f (x, y) = k defines a function y = y(x) implicitly at a point P(a, b).
Differentiating f (x, y) = k with respect to x at the point P gives

$$\begin{aligned}
f_x + f_y\,\frac{dy}{dx} &= 0, \\
f_x(a, b) + f_y(a, b)\left.\frac{dy}{dx}\right|_{(a,b)} &= 0 \quad \text{(computing at } P\text{)}, \\
\langle f_x(a, b), f_y(a, b)\rangle \cdot \left\langle 1,\, \left.\frac{dy}{dx}\right|_{(a,b)} \right\rangle &= 0, \\
\nabla f(a, b) \cdot \left.\left\langle \frac{dx}{dx},\, \frac{dy}{dx} \right\rangle\right|_{(a,b)} &= 0.
\end{aligned}$$

Note that the tangent vector of the level curve written parametrically as x = x and
y = y(x) is given by ⟨dx/dx, dy/dx⟩. The zero value for the dot product proves that the gradient
vector of f is perpendicular to the level curve at P, as shown in Figure 2.17.

Figure 2.17: Gradient vectors and level curves.

Example 2.8.3. For the function z = f (x, y) = x² − 3y²,


1. find the gradient of f (x, y) at (2, 1).
2. find the derivative of f in the direction ⟨3, −1⟩ at (2, 1).
3. find an equation for the level curve C passing through the point (2, 1) in the xy-plane.
4. verify that the gradient vector of f and the tangent vector of C at (2, 1) are perpendicular to each
other.

Solution.
1. The gradient of f at (2, 1) is

∇f (2, 1) = ⟨fx (2, 1), fy (2, 1)⟩ = ⟨2x, −6y⟩|(2,1) = ⟨4, −6⟩.

2. The directional derivative of f in the direction ⟨3, −1⟩ at (2, 1) is

$$D_{\mathbf{u}} f(2, 1) = \nabla f \cdot \mathbf{u} = \langle 4, -6\rangle \cdot \frac{\langle 3, -1\rangle}{\sqrt{3^2 + (-1)^2}} = \frac{12 + 6}{\sqrt{10}} = \frac{18}{\sqrt{10}}.$$

3. When x = 2 and y = 1, f (2, 1) = 2² − 3(1)² = 1. So, the level curve C has the equation

x² − 3y² = 1 and z = 0.

4. Applying implicit differentiation to x² − 3y² = 1, we have

$$2x - 6yy' = 0, \qquad y' = \frac{x}{3y}.$$

So at the point (2, 1) on the level curve C, the slope is y′(2) = 2/3, and the tangent
line is

$$y - 1 = \frac{2}{3}(x - 2).$$
A tangent vector of C at (2, 1) is ⟨3, 2⟩, which is indeed perpendicular to the gradient
∇f (2, 1) = ⟨4, −6⟩ since

⟨3, 2⟩ ⋅ ⟨4, −6⟩ = 12 − 12 = 0.



Example 2.8.4. Suppose the temperature distribution on a plate at any point (x, y) satisfies

T (x, y) = 80 − 2x² − y² − x (°C).

An unlucky ant has fallen onto the plate at (1, 1). Find the best escape path for the ant.

Solution. Note that the temperature at (1, 1) is 76 °C! The strategy is to find the path
along which the temperature decreases most rapidly. This is equivalent to finding the
steepest descent path on the surface (the graph of the function T(x, y)) starting from
the point (1, 1). Well, the direction of this path must be the opposite direction of ∇T. As-
sume the path is y = y(x). Then ⟨dx, dy⟩, which is the tangent vector, must be parallel
to −∇T = ⟨−Tx , −Ty ⟩. So,

⟨dx, dy⟩ = λ⟨−Tx , −Ty ⟩ for some λ

or

$$\frac{dx}{T_x} = \frac{dy}{T_y}.$$

But Tx = −4x − 1 and Ty = −2y, so

$$\frac{1}{-4x - 1}\,dx = \frac{1}{-2y}\,dy.$$

Integrating both sides gives

$$-\int \frac{1}{4x + 1}\,dx = -\int \frac{1}{2y}\,dy.$$

This simplifies to (1/4) ln |4x + 1| = (1/2) ln |y| + C, where C is an arbitrary constant. Then

$$\begin{aligned}
\ln|4x + 1| &= 2\ln|y| + 4C, \\
|4x + 1| &= |y|^2 e^{4C}, \\
4x + 1 &= Dy^2,
\end{aligned}$$

where D is also an arbitrary constant.

Since y(1) = 1, this means 4 + 1 = 1² ⋅ D, so D = 5. Therefore, the escaping route is

4x + 1 = 5y².

This is certainly not the shortest path to the edge of the plate, but the path along which
the temperature decreases most rapidly, as shown in Figure 2.18.
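The escape route can also be traced numerically by following −∇T step by step. The sketch below is an illustration only: the classical fourth-order Runge-Kutta scheme, the step size 0.01, and the helper names are assumptions made for this check, not part of the original solution. It integrates dx/ds = −Tx, dy/ds = −Ty from (1, 1) and measures how far the computed points stray from the curve 4x + 1 = 5y².

```python
def T(x, y):
    return 80 - 2 * x**2 - y**2 - x

def neg_grad_T(x, y):
    # -grad T = <-T_x, -T_y> = <4x + 1, 2y>, the steepest descent direction.
    return (4 * x + 1, 2 * y)

def rk4_step(x, y, h):
    """One classical Runge-Kutta step for (x', y') = -grad T(x, y)."""
    k1 = neg_grad_T(x, y)
    k2 = neg_grad_T(x + h / 2 * k1[0], y + h / 2 * k1[1])
    k3 = neg_grad_T(x + h / 2 * k2[0], y + h / 2 * k2[1])
    k4 = neg_grad_T(x + h * k3[0], y + h * k3[1])
    return (x + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            y + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

# Start where the ant landed and follow the steepest descent direction.
x, y = 1.0, 1.0
path = [(x, y)]
for _ in range(100):
    x, y = rk4_step(x, y, 0.01)
    path.append((x, y))

# Relative deviation of each computed point from the curve 4x + 1 = 5y^2.
max_residual = max(abs(4 * px + 1 - 5 * py**2) / (5 * py**2) for px, py in path)
temperatures = [T(px, py) for px, py in path]

print(max_residual)
print(temperatures[0], temperatures[-1])
```

The residual stays tiny along the whole path, and the temperature falls at every step, which is consistent with the closed-form route found above.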

Figure 2.18: The escaping route.

Functions of more than two variables


The concept and notation of directional derivatives can be extended to functions of
more than two variables. For example, for u = f (x, y, z), the gradient vector is ∇f =
⟨fx , fy , fz ⟩, and the derivative in the direction given by a unit vector u is Du f (x, y, z) =
∇f ⋅ u.
For example, the directional derivative of u = f (x, y, z) at the point (a, b, c) in the
direction u = ⟨cos α, cos β, cos γ⟩ is given by

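$$\begin{aligned}
D_{\mathbf{u}} f(a, b, c) &= f_x(a, b, c)\cos\alpha + f_y(a, b, c)\cos\beta + f_z(a, b, c)\cos\gamma \\
&= \langle f_x(a, b, c), f_y(a, b, c), f_z(a, b, c)\rangle \cdot \mathbf{u} \\
&= \nabla f \cdot \mathbf{u}.
\end{aligned}$$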

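Example 2.8.5. Compute ∂f/∂l at the point (1, 1, 2) in the direction with direction angles α = π/3, β = π/4, and γ = π/3
when f is defined by

f (x, y, z) = xy + yz + zx.

Solution. The unit vector in the direction is

$$\mathbf{u} = \left\langle \cos\frac{\pi}{3},\, \cos\frac{\pi}{4},\, \cos\frac{\pi}{3} \right\rangle = \left\langle \frac{1}{2},\, \frac{\sqrt{2}}{2},\, \frac{1}{2} \right\rangle.$$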

Also,

fx (1, 1, 2) = (y + z)|(1,1,2) = 3,
fy (1, 1, 2) = (x + z)|(1,1,2) = 3,
fz (1, 1, 2) = (y + x)|(1,1,2) = 2,

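so the directional derivative in the direction u is

$$\frac{\partial f(1, 1, 2)}{\partial l} = \langle 3, 3, 2\rangle \cdot \left\langle \frac{1}{2},\, \frac{\sqrt{2}}{2},\, \frac{1}{2} \right\rangle = 3\cdot\frac{1}{2} + 3\cdot\frac{\sqrt{2}}{2} + 2\cdot\frac{1}{2} = \frac{5}{2} + \frac{3}{2}\sqrt{2}.$$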
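The computation in Example 2.8.5 can be reproduced in a few lines. This is a small illustrative sketch (variable names are arbitrary); it also confirms that the given direction angles really define a unit vector, that is, cos²α + cos²β + cos²γ = 1.

```python
import math

# Direction angles from Example 2.8.5.
alpha, beta, gamma = math.pi / 3, math.pi / 4, math.pi / 3
u = (math.cos(alpha), math.cos(beta), math.cos(gamma))

def grad_f(x, y, z):
    # f(x, y, z) = xy + yz + zx, so grad f = <y + z, x + z, x + y>.
    return (y + z, x + z, x + y)

g = grad_f(1.0, 1.0, 2.0)

# Directional derivative as the dot product of the gradient with u.
directional = sum(gi * ui for gi, ui in zip(g, u))

length_squared = sum(c * c for c in u)      # should be 1 for a unit vector
expected = 5 / 2 + 3 / 2 * math.sqrt(2)     # value obtained in the text

print(g, directional, expected, length_squared)
```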

Note. There is a similar idea to level curves for functions of three variables. If we set
f (x, y, z) = k, for a constant k, then we get a level surface of the function u = f (x, y, z).
At any given point P(a, b, c), for any curve passing through P that lies on the surface,
we can show that the gradient vector ∇f is actually perpendicular to its tangent vec-
tor. (We provided a proof in the previous section.) Therefore, ∇f is perpendicular to
the tangent plane to the level surface at P, as shown in Figure 2.19. Thus, the gradi-
ent vector ∇f = ⟨fx , fy , fz ⟩ is a normal vector of the tangent plane to the level surface
f (x, y, z) = k through P.

Figure 2.19: Level surfaces and gradient vectors for functions of three variables.

Now, we can use gradient vectors to find a tangent vector for a curve of intersection of
two surfaces,

$$\begin{cases} f(x, y, z) = 0, \\ g(x, y, z) = 0 \end{cases}$$

at a given point P. Since a tangent vector of the curve at P is perpendicular to both ∇f


and ∇g, one vector tangent to the curve at P is

v = ∇f × ∇g.

Using this idea, we consider Example 2.7.1 again. Since

f (x, y, z) = x² + y² + z² − 6 = 0,
g(x, y, z) = x + y + z = 0,

we have ∇f (1, −2, 1) = ⟨2x, 2y, 2z⟩|(1,−2,1) = ⟨2, −4, 2⟩ and ∇g(1, −2, 1) = ⟨1, 1, 1⟩. Thus,

$$\mathbf{v} = \begin{pmatrix} 2 \\ -4 \\ 2 \end{pmatrix} \times \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -6 \\ 0 \\ 6 \end{pmatrix} = 6\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}.$$

So any vector parallel to ⟨−1, 0, 1⟩ is a tangent vector to the curve at (1, −2, 1). There is
no need for any parameterization.
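The cross product above is easy to verify by machine. The following sketch uses small hand-written helpers (`cross` and `dot` are illustrative names, not from the text) and checks that the resulting vector is perpendicular to both gradients, as a tangent vector of the intersection curve must be.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Gradients of f = x^2 + y^2 + z^2 - 6 and g = x + y + z at (1, -2, 1).
grad_f = (2, -4, 2)
grad_g = (1, 1, 1)

v = cross(grad_f, grad_g)

print(v)                                  # tangent vector of the intersection curve
print(dot(v, grad_f), dot(v, grad_g))     # both dot products vanish
```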

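Note. Using vector notation and some linear algebra, we can now write
the Taylor series for a function of two variables in a more compact way. Let $\mathbf{x} = \langle x, y\rangle$
and $\mathbf{x}_0 = \langle a, b\rangle$. Then $f(x, y) = f(\mathbf{x})$, $f(a, b) = f(\mathbf{x}_0)$, and
$\Delta\mathbf{x} = \mathbf{x} - \mathbf{x}_0 = \langle \Delta x, \Delta y\rangle$. Note that

$$\begin{aligned}
\left(\Delta x\,\frac{\partial}{\partial x} + \Delta y\,\frac{\partial}{\partial y}\right) f(a, b) &= \Delta x\,\frac{\partial f(\mathbf{x}_0)}{\partial x} + \Delta y\,\frac{\partial f(\mathbf{x}_0)}{\partial y} \\
&= \langle f_x(\mathbf{x}_0), f_y(\mathbf{x}_0)\rangle \cdot \langle \Delta x, \Delta y\rangle = \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x}.
\end{aligned}$$

Also, we have the Hessian matrix notation

$$H(\mathbf{x}_0) = \begin{pmatrix} f_{xx}(\mathbf{x}_0) & f_{xy}(\mathbf{x}_0) \\ f_{xy}(\mathbf{x}_0) & f_{yy}(\mathbf{x}_0) \end{pmatrix}.$$

Then, we can write

$$f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x} + \frac{1}{2!}\,\Delta\mathbf{x}^T H(\mathbf{x}_0)\,\Delta\mathbf{x} + \cdots.$$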

2.9 Maximum and minimum values


2.9.1 Extrema of functions of two variables

As shown in Figure 2.20, for a function of two variables, z = f (x, y), there are also
interesting features such as local or global extreme values, as we have seen in one-
variable calculus. We first give the definition of local and global extrema.

Figure 2.20: Local and global extreme values.



Definition 2.9.1. Let z = f (x, y) be a function with domain D ⊂ ℝ². Then, f (x, y) has


1. a local/relative maximum at an interior point (a, b) ∈ D if f (x, y) ≤ f (a, b) for all (x, y) in D close
to (a, b).
2. a local/relative minimum at an interior point (a, b) ∈ D if f (x, y) ≥ f (a, b) for all (x, y) in D close to
(a, b).
3. a global/absolute maximum at a point (a, b) ∈ D if f (x, y) ≤ f (a, b) for all (x, y) in D.
4. a global/absolute minimum at a point (a, b) ∈ D if f (x, y) ≥ f (a, b) for all (x, y) in D.

Figure 2.21 shows that a function whose graph is the upper hemisphere has a local
maximum (also absolute maximum) above its center, and a function whose graph is
the cone with vertex downwards has an absolute minimum at its vertex.

Figure 2.21: An upper sphere and an upper cone.

How can we identify these interesting points? As we saw in one-variable calculus, the
answer is to use derivatives. In this section, we use partial derivatives to help locate
maxima and minima of functions of two variables. We first consider the case that a
differentiable function z = f (x, y) has a local maximum point at (a, b). Then, the inter-
section curve

$$\begin{cases} z = f(x, y), \\ y = b \end{cases}$$

also has a local maximum at the same point (a, b). However, the curve z = f (x, b) in
the y = b plane has just one variable. Therefore, the derivative with respect to x at
x = a must be 0. This means fx (a, b) = 0. Similarly, we can obtain fy (a, b) = 0, or,
equivalently, ∇f (a, b) = 0, as shown in Figure 2.22.

Note. If fx (a, b) = 0 and fy (a, b) = 0, then ∇f (a, b) = 0, and the tangent plane
at (a, b) is z = z0, where z0 = f (a, b). Geometrically, ∇f (a, b) = 0 means that the
graph of f has a horizontal tangent plane at the point (a, b).

Figure 2.22: Candidates for local maximum or local minimum.

If a function has no partial derivatives at a point, it may still have extreme values
there (similar to one-variable calculus, a function that is not differentiable may still
have extreme values). For instance, the upper right circular cone z = √(x² + y²) has no
partial derivatives at (0, 0), but it does have a local minimum value 0 at (0, 0). Therefore,
candidates of extrema for any function are those points where ∇f = 0 or ∇f does
not exist. We give them a special name.

Definition 2.9.2 (Critical points). A point (a, b) is called a critical point of z = f (x, y) if ∇f (a, b) = 0 or
if ∇f (a, b) does not exist.

These above arguments lead to the following theorem.

Theorem 2.9.1 (Candidate theorem). If a function z = f (x, y) has a local maximum/minimum at an


interior point (a, b) in its domain, then ∇f (a, b) = 0 or ∇f (a, b) does not exist.

Theorem 2.9.1 shows that if f has a local maximum or minimum at (a, b), then (a, b)
must be a critical point of f .

Example 2.9.1. Find all critical points for each of the following functions:

(a) z = xy, (b) z = x⁴ + y⁴ − 8xy.

Solution. For (a), the function z = xy is differentiable everywhere, so all critical points
are those such that ∇f = 0, i. e.,

∇f = 0 → fx = 0 and fy = 0,
fx = y = 0 and fy = x = 0.

So the only critical point is (0, 0).



For (b), the function z = x⁴ + y⁴ − 8xy is also differentiable everywhere, so all
critical points are those such that ∇f = 0.

∇f = 0 → fx = 0 and fy = 0,
fx = 4x³ − 8y = 0 and fy = 4y³ − 8x = 0.

Solve for x and y to obtain

x³ = 2y and y³ = 2x.

Substituting y = x³/2 into y³ = 2x, we have

x⁹ = 8(2x),
x(x⁸ − 16) = 0,
x(x⁴ + 4)(x² + 2)(x − √2)(x + √2) = 0.

Thus, x = 0, x = √2, or x = −√2, and, since y = x³/2, all the critical points are

(0, 0), (√2, √2), and (−√2, −√2).

Graphs of the two functions are shown in Figure 2.20.

However, as in single-variable calculus, not all critical points give rise to maxima
or minima. For instance, for the function z = xy, ∇f (0, 0) = 0, but the function value 0
at (0, 0) is neither a maximum nor a minimum because near (0, 0) there are points in
the first and third quadrants of the xy-plane which make z = xy positive and points in
the second and fourth quadrants which make z = xy negative. We also give a definition
for this type of point.

Definition 2.9.3 (Saddle points). A critical point (a, b) of z = f (x, y) at which ∇f (a, b) = 0 but f does
not have a local maximum or a local minimum is called a saddle point.

Note. In one-variable calculus, a saddle point is a point where the function has a hor-
izontal tangent line, and nearby the point you can find places where the graph of the
function is above the tangent line and other places where the graph is below the tan-
gent line. The function f (x) = x3 at x = 0 is a good example of a saddle point. Anal-
ogously, in two-variable calculus, a saddle point is a point where the function has a
horizontal tangent plane, and nearby the point you can find places where the graph
of the function is above the tangent plane and other places where the graph is below
the tangent plane.

Theorem 2.9.2 (Second derivative test). Assume all the second partial derivatives of f are continuous
on a disk with center (a, b), and fx (a, b) = 0 and fy (a, b) = 0, so that (a, b) is a critical point of f . Let

A = fxx (a, b), B = fxy (a, b), C = fyy (a, b).

Then:
(1) If AC − B² > 0 and A > 0, then f (a, b) is a local minimum.
(2) If AC − B² > 0 and A < 0, then f (a, b) is a local maximum.
(3) If AC − B² < 0, then f (a, b) is not a local minimum or maximum, so (a, b) is a saddle point.

Note.
1. In case (3), where the point (a, b) is a saddle point of f , the graph z = f (x, y) crosses
its tangent plane at (a, b), that is, near the saddle point, part of the graph is above
the tangent plane, and part of the graph is below the tangent plane.
2. If AC − B² = 0, the test fails to give any information. In this case, f could have a
local maximum or local minimum at (a, b), or (a, b) could be a saddle point of f .
3. To help remember the formula for AC − B², we can write it in determinant form,

$$AC - B^2 = \begin{vmatrix} A & B \\ B & C \end{vmatrix} = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{vmatrix} = f_{xx} f_{yy} - (f_{xy})^2.$$

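A proof of the second derivative test can be sketched from the vector form of the Taylor
expansion for a function of two variables. Assume $\mathbf{x} = \langle x, y\rangle$, $u = f(\mathbf{x})$, and $f(\mathbf{x})$ has
continuous first and second partial derivatives near $\mathbf{x}_0 = (a, b)$. Then, for small $\Delta\mathbf{x}$,

$$f(\mathbf{x}_0 + \Delta\mathbf{x}) \approx f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x} + \frac{1}{2!}\,\Delta\mathbf{x}^T H(\mathbf{x}_0)\,\Delta\mathbf{x},$$

where $\nabla f(\mathbf{x}_0) = \langle f_x(a, b), f_y(a, b)\rangle$, and $H(\mathbf{x}_0)$ is the Hessian matrix defined as

$$H(\mathbf{x}_0) = \left.\begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\right|_{(a,b)}.$$

Since $\nabla f(\mathbf{x}_0) = 0$ at a candidate $\mathbf{x}_0$, for small $\Delta\mathbf{x} \neq 0$ we have
$f(\mathbf{x}_0 + \Delta\mathbf{x}) > f(\mathbf{x}_0)$ if $H(\mathbf{x}_0)$ is positive definite (A > 0 and AC − B² > 0), and
$f(\mathbf{x}_0 + \Delta\mathbf{x}) < f(\mathbf{x}_0)$ if $H(\mathbf{x}_0)$ is negative definite (A < 0 and AC − B² > 0).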

Example 2.9.2. Locate and classify all the critical points for each of the following functions:

(a) f (x, y) = xy, (b) f (x, y) = x⁴ + y⁴ − 8xy, and (c) z = xye^{−x²−y²}.

Solution.
(a) We have found the critical point (0, 0) for z = xy in Example 2.9.1. Since A = fxx = 0,
B = fxy = 1, and C = fyy = 0, we have

AC − B² = 0 − 1 < 0.

So by the second derivative test, this is a saddle point.



(b) We calculate A = fxx = 12x², B = fxy = −8, and C = fyy = 12y² and apply the second
derivative test to the critical points we found in Example 2.9.1.
At (0, 0), AC − B² = (144x²y² − 64)|(0,0) = −64 < 0, so it is a saddle point.
At (√2, √2), AC − B² = (144x²y² − 64)|(√2,√2) = 144(4) − 64 > 0 and A = 12(√2)² > 0,
so it is a local minimum.
At (−√2, −√2), AC − B² = (144x²y² − 64)|(−√2,−√2) = 144(4) − 64 > 0 and A = 12(−√2)² > 0,
so it is a local minimum.
(c) Solving ∇f = 0, we have

$$\begin{cases}
f_x = ye^{-x^2-y^2} + xye^{-x^2-y^2}(-2x) = e^{-x^2-y^2}(y - 2x^2y) = 0, \\
f_y = xe^{-x^2-y^2} + xye^{-x^2-y^2}(-2y) = e^{-x^2-y^2}(x - 2xy^2) = 0.
\end{cases}$$

Since e^{−x²−y²} ≠ 0, these equations simplify to

$$\begin{cases} y - 2x^2y = 0, \\ x - 2xy^2 = 0. \end{cases}$$

This gives critical points (0, 0), (1/√2, 1/√2), (−1/√2, −1/√2), (−1/√2, 1/√2), and (1/√2, −1/√2).
Then,

$$\begin{aligned}
A = f_{xx} &= e^{-x^2-y^2}(-2x)(y - 2x^2y) + e^{-x^2-y^2}(-4xy), \\
B = f_{xy} &= e^{-x^2-y^2}(-2y)(y - 2x^2y) + e^{-x^2-y^2}(1 - 2x^2), \text{ and} \\
C = f_{yy} &= e^{-x^2-y^2}(-2y)(x - 2xy^2) + e^{-x^2-y^2}(-4xy).
\end{aligned}$$

At (0, 0), A = C = 0 and B = 1, so AC − B² = −1 < 0, and it is a saddle point. At the other
points, note that 1 − 2x² and 1 − 2y² are 0. So, B is 0, and AC − B² = 16e^{−2x²−2y²}x²y² > 0.
Therefore, the function has local extrema at those points. When x and y have opposite signs,
A > 0, and when x and y have the same sign, A < 0. Thus, the function has two
local maxima at (1/√2, 1/√2) and (−1/√2, −1/√2) and two local minima at (−1/√2, 1/√2) and
(1/√2, −1/√2). Graphs of the three functions are shown in Figure 2.20.
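The second derivative test is mechanical enough to express as a small function. The sketch below is one possible way to write it; the function names `classify` and `second_partials` are illustrative, and the second partials are those computed by hand in part (b) for f (x, y) = x⁴ + y⁴ − 8xy.

```python
import math

def classify(A, B, C):
    """Second derivative test from Theorem 2.9.2, using A = f_xx, B = f_xy, C = f_yy."""
    D = A * C - B * B
    if D > 0:
        # D > 0 forces A != 0, so the sign of A decides between min and max.
        return "local minimum" if A > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # AC - B^2 = 0: the test gives no information

def second_partials(x, y):
    # For f(x, y) = x^4 + y^4 - 8xy: f_xx = 12x^2, f_xy = -8, f_yy = 12y^2.
    return (12 * x**2, -8.0, 12 * y**2)

r = math.sqrt(2)
critical_points = [(0.0, 0.0), (r, r), (-r, -r)]
results = {pt: classify(*second_partials(*pt)) for pt in critical_points}

for pt, kind in results.items():
    print(pt, kind)
```

The same `classify` function applies to parts (a) and (c) once A, B, and C have been evaluated at each critical point.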

Global extreme values of functions defined on bounded closed regions


Similar to the closed interval test for global extreme values for a one-variable con-
tinuous function over a closed interval, for a continuous function of two variables
z = f (x, y) defined on a bounded closed set D, we also have the closed region test.
(1) Find the critical points of f in D.
(2) Find the critical points of f when f is considered as a function restricted to the
boundary of D.
(3) Evaluate the function values at all the points found in (1) and (2). The greatest
of these values is the absolute maximum value of f in D while the least of these
values is the absolute minimum value of f in D.

Example 2.9.3. Find the global maximum and global minimum for the function

z = f (x, y) = x² + y² + 8x − 6y + 20, where (x, y) ∈ D = {(x, y) | x² + y² ≤ 36}.

Solution. We first find points such that ∇f = 0. This means

fx = 2x + 8 = 0,
fy = 2y − 6 = 0.

So, (−4, 3) is the only critical point, and it lies in D because (−4)² + 3² = 25 ≤ 36 (a
critical point outside D would be rejected). The function value at (−4, 3) is

f (−4, 3) = (−4)² + (3)² + 8(−4) − 6(3) + 20 = −5.

Now we consider the function values on the boundary of D, x² + y² = 36. Using
x = 6 cos t, y = 6 sin t as the parameterization of the boundary gives

f = (6 cos t)² + (6 sin t)² + 8(6 cos t) − 6(6 sin t) + 20
  = 48 cos t − 36 sin t + 56, for 0 ≤ t ≤ 2π.

Of course, we can use one-variable calculus: set df/dt = 0 and then solve for
t to find candidates for extreme values. However, if we note that a cos t + b sin t =
√(a² + b²) sin(t + α) for some α, then the maximum value of the expression a cos t + b sin t
is √(a² + b²), and the minimum value of the expression a cos t + b sin t is −√(a² + b²).
Thus, we obtain the following maximum and minimum values of the function on the
boundary:

f_max on the boundary = √(48² + 36²) + 56 = 116,
f_min on the boundary = −√(48² + 36²) + 56 = −4.

Comparing these function values, we conclude that

fmax = 116 and fmin = −5.

The graphs of this function and the cylinder are shown in Figure 2.23.
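The closed region test in this example can be mimicked numerically. The sketch below is an approximation under stated assumptions: instead of solving df/dt = 0 on the boundary, it simply samples the boundary circle at 20000 points (an arbitrary resolution) and compares those values with the value at the interior critical point.

```python
import math

def f(x, y):
    return x**2 + y**2 + 8 * x - 6 * y + 20

# Step (1): interior critical point from f_x = 2x + 8 = 0, f_y = 2y - 6 = 0,
# kept only if it lies in the disk x^2 + y^2 <= 36.
interior = [(x, y) for (x, y) in [(-4.0, 3.0)] if x**2 + y**2 <= 36]

# Step (2): sample the boundary x = 6 cos t, y = 6 sin t instead of solving exactly.
n = 20000
boundary = [(6 * math.cos(2 * math.pi * k / n), 6 * math.sin(2 * math.pi * k / n))
            for k in range(n)]

# Step (3): compare all candidate values.
values = [f(x, y) for x, y in interior + boundary]
f_max, f_min = max(values), min(values)
boundary_min = min(f(x, y) for x, y in boundary)

print(f_max, f_min, boundary_min)
```

Up to the sampling error, the output reproduces the values 116, −5, and −4 found above; the global minimum comes from the interior critical point, not from the boundary.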

Note. In some practical circumstances an absolute maximum/minimum is known to


exist and the domain set is not bounded or the boundary is obscure and/or consists
of points of no practical interest. If only one critical point is found, then this critical
point must be the desired absolute maximum/minimum.

Example 2.9.4. A rectangular container without a lid is to be made from 18 m² of wooden board. Find the
maximum volume of such a container.

Figure 2.23: Global extreme values for functions defined over a closed region.

Solution. Let x = length, y = width, and z = height of the box (in meters). Then the
volume of the box is given by

V = xyz.

Computing the area of the four sides and the bottom of the box, which must have a
total area of 18 m², gives an extra equation (a constraint) linking x, y, and z, i. e.,

2xz + 2yz + xy = 18.

The domain satisfies x ≥ 0, y ≥ 0, and z ≥ 0, but it is not bounded (for example, if
y = 0, then 2xz = 18 and as z → 0 it follows that x → ∞). We solve the constraint for
z to obtain

$$z = \frac{18 - xy}{2(x + y)}$$

and use this to express V as a function of just two variables x and y,

$$V = xy\,\frac{18 - xy}{2(x + y)} = \frac{18xy - x^2y^2}{2(x + y)}.$$

We compute the partial derivatives, and we obtain

$$\frac{\partial V}{\partial x} = -\frac{y^2(x^2 + 2xy - 18)}{2(x + y)^2} \quad\text{and}\quad \frac{\partial V}{\partial y} = -\frac{x^2(y^2 + 2xy - 18)}{2(x + y)^2}.$$

Let ∂V/∂x = ∂V/∂y = 0. Note that x ≠ 0 and y ≠ 0. Hence, we must have

x² + 2xy − 18 = 0 and y² + 2xy − 18 = 0.

Subtracting these leads to x² = y² and so x = y (note that x and y must both be positive).
If we put x = y in either equation, we obtain 3x² − 18 = 0, which gives x = √6, y = √6,
and z = (18 − √6 ⋅ √6)/[2(√6 + √6)] = √6/2. The corresponding volume is
V = xyz = 3√6 ≈ 7.35 m³.

Of course, we can show that this indeed gives a local maximum of V by using the
second derivative test. However, from the physical nature of this problem, we could
simply argue that there must be an absolute maximum volume, and it has to occur at
a critical point of V. Since there are no boundary values of interest and there is only
one critical point, the function must take its absolute maximum at the only candidate.

2.9.2 Lagrange multipliers

In Example 2.9.3, we maximized the function f (x, y) under the condition x² + y² = 36.
We found the maximum by finding a parameterization of the boundary ⟨x(t), y(t)⟩ and
then reducing f (x, y) to f (x(t), y(t)), which is a one-variable function. In Example 2.9.4,
we maximized a volume function V = xyz subject to the constraint 2xz + 2yz + xy =
18. We eliminated the constraint by replacing z = (18 − xy)/(2x + 2y) in the objective
function V = xyz. Thus, the problems were reduced to problems without constraints.
However, this approach may be hard or even impossible in some cases. For example,

maximize f (x, y) = x² + 2y², where x² − xy + y² = 1.

We need to consider the more general optimization problem

max / min z = f (x, y), subject to g(x, y) = 0,

which is to find the maximum or minimum value of the objective function z = f (x, y),
subject to the constraint g(x, y) = 0. This type of problem is called a constrained max-
imum/minimum problem.
Sometimes we can convert a constrained maximum/minimum problem to a non-
constrained one, as shown in Example 2.9.3 and Example 2.9.4, by expressing one vari-
able in terms of other variables in the constraint condition. Now, the question is, how
can we identify the candidates for constrained maximum/minimum if elimination of
variables is hard or impossible? Note that if the curve g(x, y) = 0 in the xy-plane has a
parameterization r(t) = ⟨x(t), y(t)⟩, and at t = t0 (this corresponds to some point (a, b)
on the curve) there is a constrained maximum/minimum, then

z = f (x, y) = f (x(t), y(t))

also has a maximum/minimum at t = t0 , as shown in Figure 2.24.


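Therefore,

$$\frac{dz}{dt} = 0 \;\rightarrow\; f_x\,\frac{dx}{dt} + f_y\,\frac{dy}{dt} = 0 \quad\text{or}\quad \left.\langle f_x, f_y\rangle \cdot \left\langle \frac{dx}{dt},\, \frac{dy}{dt} \right\rangle\right|_{t=t_0} = 0.$$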

Figure 2.24: Constrained maximum/minimum.

This indicates that at the point t = t0 , ∇f is perpendicular to the tangent vector of the
curve r(t). On the other hand the curve r(t) = ⟨x(t), y(t)⟩ also satisfies g(x(t), y(t)) = 0.
In a similar manner, we also have

$$g_x\,\frac{dx}{dt} + g_y\,\frac{dy}{dt} = 0 \quad\text{or}\quad \langle g_x, g_y\rangle \cdot \left\langle \frac{dx}{dt},\, \frac{dy}{dt} \right\rangle = 0.$$

This means ∇g is also perpendicular to the tangent vector of the curve r(t) at t0 . Thus,
at t = t0 , we must have ∇f ‖ ∇g. Provided ∇g ≠ 0, there must be some number λ such that
∇f = λ∇g at t = t0 . We summarize this in the following theorem.

Theorem 2.9.3. Suppose both f (x, y) and g(x, y) are differentiable, and at some point (a, b) with
∇g(a, b) ≠ 0, the optimization problem

max / min z = f (x, y) subject to g(x, y) = 0

has a constrained maximum or minimum. Then at the point (a, b), one must have the following condi-
tions:

g(a, b) = 0 and ∇f (a, b) = λ∇g(a, b) for some constant λ.

Note. The constant λ in Theorem 2.9.3 is called the Lagrange multiplier. The theorem
can be also stated in a form of a Lagrange function defined by L(x, y, λ) = f (x, y) −
λg(x, y). The candidates for constrained maximum/minimum must then satisfy

∇L = 0 or equivalently Lx = Ly = Lλ = 0.

The candidate theorem gives possible candidates for constrained maxima or
minima. When a constrained maximum and minimum exist, we evaluate the function at all
of these candidate points: the greatest function value found is the desired constrained
maximum, and the least function value found is the constrained minimum.

Example 2.9.5. Find the constrained maximum and constrained minimum for the optimization prob-
lem

max / min f (x, y) = x + y, subject to x² − xy + y² = 1.

Solution. Let L(x, y, λ) = x + y − λ(x² − xy + y² − 1). Then any candidate must satisfy

Lx = 1 − 2λx + λy = 0,
Ly = 1 + λx − 2λy = 0,
Lλ = −(x² − xy + y² − 1) = 0.

From the first and second equations, we have

$$\frac{1}{2x - y} = \lambda = \frac{1}{2y - x}.$$

Then

2x − y = 2y − x, or x = y.

Substituting into the third equation, we obtain

x² − x(x) + x² − 1 = 0,

so we have x = ±1. Therefore, candidate points are (1, 1) and (−1, −1); f (1, 1) = 2 and
f (−1, −1) = −2, so the constrained maximum is 2 at (1, 1) and the constrained minimum
is −2 at (−1, −1). Figure 2.25(a) shows the graph of the plane z = x + y and z = x² − xy +
y² − 1. Figure 2.25(b) shows the level curves of the plane and the constraint. Note that
the constrained candidates are exactly those points where the constraint curve is tangent to a
level curve.

Figure 2.25: Constrained maximum/minimum, Example 2.9.5.
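The result of Example 2.9.5 can be cross-checked without Lagrange multipliers by walking along the constraint curve. The sketch below rests on an assumption that is not in the original text: it parameterizes the ellipse x² − xy + y² = 1 in polar form, x = r cos t, y = r sin t with r = 1/√(1 − sin t cos t), and samples 20000 values of t. It then checks that ∇f and ∇g are parallel at the best sampled point.

```python
import math

def point_on_constraint(t):
    """Point of x^2 - xy + y^2 = 1 at polar angle t (r^2 (1 - sin t cos t) = 1)."""
    r = 1.0 / math.sqrt(1.0 - math.sin(t) * math.cos(t))
    return (r * math.cos(t), r * math.sin(t))

def f(x, y):
    return x + y

n = 20000
points = [point_on_constraint(2 * math.pi * k / n) for k in range(n)]
best = max(points, key=lambda p: f(*p))
worst = min(points, key=lambda p: f(*p))

# Lagrange condition at the maximizer: grad f = <1, 1> and
# grad g = <2x - y, 2y - x> should be parallel, so their 2D cross product vanishes.
bx, by = best
parallel_check = 1 * (2 * by - bx) - 1 * (2 * bx - by)

print(best, f(*best))
print(worst, f(*worst))
print(parallel_check)
```

The sampled extremes sit at approximately (1, 1) and (−1, −1) with values 2 and −2, matching the candidates produced by the Lagrange system.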

Interpreting constrained max/min using level curves


In the previous example, the condition g(x, y) = 0, in fact, is a cylinder in space. The
constrained maximum/minimum problem is to find the highest point and lowest point
on the curve which is the intersection of the cylinder g(x, y) = 0 and the surface z =
f (x, y). There must be one contour of the surface z = f (x, y) that touches the curve at its
highest point (and, similarly, to its lowest point). When projecting the curve and the
contour onto the xy-plane, we would expect that the two projection curves are tangent
to each other at the projection of the highest point. This is indeed the case. If f (a, b) is
the constrained maximum (same argument for the minimum), then

$$\begin{cases} f(x, y) = f(a, b), \\ z = 0 \end{cases} \text{ is the level curve, and} \qquad \begin{cases} g(x, y) = 0, \\ z = 0 \end{cases} \text{ is the constraint curve.}$$

Since, at (a, b), the tangent vectors of two curves are perpendicular to ∇f (a, b) and
∇g(a, b), respectively, it follows that ∇f (a, b) and ∇g(a, b) are parallel to each other at
(a, b). Thus, the level curve f (x, y) = f (a, b) and the curve g(x, y) = 0 are tangent to
each other at (a, b)!
Figure 2.26(a) shows the graph of z = x² + 2y² and the constraint x² + y² = 1
(in the z = 0 plane). Figure 2.26(b) shows the level curves of z = x² + 2y². One can
find candidates for the constrained maximum and minimum at those points where
the level curve and constraint curve are tangent to each other.

Figure 2.26: Level curves and constrained extrema.



Constrained extrema for functions of more than two variables


We have similar conclusions for functions of more than two variables. For instance, if
the optimization problem

max / min u = f (x, y, z), subject to g(x, y, z) = 0

has a constrained extremum at (a, b, c), then, if both f and g are differentiable, we
must have

∇f (a, b, c) = λ∇g(a, b, c) and g(a, b, c) = 0.

Or we can define a new function, L(x, y, z, λ), called the Lagrangian, with an extra variable
λ called the Lagrange multiplier, by

L(x, y, z, λ) = f (x, y, z) − λg(x, y, z).

Then, we can find the constrained maximum/minimum for f by taking the following
steps.
(1) Find all values of x, y, z, and λ such that the partial derivatives are zero (in other
words, find the critical points of L), by solving

$$\begin{cases}
L_x = f_x(x, y, z) - \lambda g_x(x, y, z) = 0, \\
L_y = f_y(x, y, z) - \lambda g_y(x, y, z) = 0, \\
L_z = f_z(x, y, z) - \lambda g_z(x, y, z) = 0, \\
L_\lambda = -g(x, y, z) = 0.
\end{cases}$$

Note that the fourth equation is just the original constraint.


(2) Evaluate f at all solution points (x, y, z) found in step (1). The largest of these val-
ues is the constrained maximum value of f , the smallest is the constrained mini-
mum value of f .

We illustrate the Lagrange multiplier method using the problem of a previous exam-
ple.

Example 2.9.6. A rectangular container without a lid is to be made from 18 m² of wooden board. Find the
maximum volume of such a container.

Solution. As before, we wish to maximize

V = xyz

subject to the constraint

2xz + 2yz + xy = 18.



Using the method of Lagrange multipliers, L(x, y, z, λ) = xyz − λ(2xz + 2yz + xy − 18), so
the four partial derivatives give the equations

{ Lx = yz − λ(2z + y) = 0,
{
{
{ Ly = xz − λ(2z + x) = 0,
{
{
{ Lz = xy − λ(2x + 2y) = 0,
{
{ Lλ = 2xz + 2yz + xy − 18 = 0.

There are no general rules for solving systems of nonlinear equations, and sometimes
some ingenuity is required. In the present example, you might notice that if we multi-
ply the first equation by x, the second equation by y, and the third equation by z, then
we have

{ xyz = λ(2xz + xy),


{
{ xyz = λ(2yz + xy),
{
{ xyz = λ(2xz + 2yz).
Subtracting the first two gives 0 = λz(x − y), and we observe that z cannot be zero, and
λ ≠ 0 because λ = 0 would imply yz = xz = xy = 0 󳨐⇒ x = y = z = 0. Therefore, we
have x = y. Subtracting the first equation from the third equation then gives 0 = λx(x −
2z) 󳨐⇒ x = 2z. Hence, substituting x = y = 2z into the constraint 2xz + 2yz + xy = 18
and solving for z, which must be positive, gives

4z² + 4z² + 4z² = 18,
12z² = 18,
z = √6/2.

Hence, the only critical point is, as before, x = √6, y = √6, and z = √6/2, and this
gives the maximum volume.
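
The result of Example 2.9.6 can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper names (`volume`, `height_on_constraint`) are hypothetical, and the comparison with nearby feasible points only suggests, rather than proves, that the candidate is a maximum.

```python
import math


def volume(x, y, z):
    """Objective: volume of the open-top box."""
    return x * y * z


def height_on_constraint(x, y):
    """Solve 2xz + 2yz + xy = 18 for z, so that (x, y, z) is feasible."""
    return (18 - x * y) / (2 * (x + y))


# Candidate found in Example 2.9.6.
x = y = math.sqrt(6)
z = math.sqrt(6) / 2

# Multiplier recovered from L_x = yz - lambda(2z + y) = 0.
lam = y * z / (2 * z + y)

# Residuals of the four Lagrange equations; all should be (nearly) zero.
residuals = [
    y * z - lam * (2 * z + y),
    x * z - lam * (2 * z + x),
    x * y - lam * (2 * x + 2 * y),
    2 * x * z + 2 * y * z + x * y - 18,
]

v_star = volume(x, y, z)

# Volumes at nearby feasible points (z is recomputed from the constraint).
steps = (-0.1, 0.0, 0.1)
nearby = [
    volume(x + dx, y + dy, height_on_constraint(x + dx, y + dy))
    for dx in steps
    for dy in steps
]

print("largest residual:", max(abs(r) for r in residuals))
print("candidate volume:", v_star, "best nearby volume:", max(nearby))
```

The candidate volume equals 3√6, and none of the sampled nearby feasible points gives a larger volume.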

There is also a geometric interpretation for constrained extrema for differentiable
functions of three variables. If u = f (x, y, z) has a constrained extremum at (a, b, c) subject
to g(x, y, z) = 0, then for any smooth curve r(t) = ⟨x(t), y(t), z(t)⟩ passing through
(a, b, c) and lying on the surface g(x, y, z) = 0, both ∇f and ∇g must be perpendicular to
r′(t) at (a, b, c). Therefore, both ∇f and ∇g must be perpendicular (normal vectors) to
the tangent plane to g(x, y, z) = 0 at (a, b, c). Thus, ∇f and ∇g are parallel to each other.
So the level surface f (x, y, z) = f (a, b, c) of u = f (x, y, z) and the surface g(x, y, z) = 0
must have the same tangent plane at (a, b, c).

Constrained extrema for more than one constraint


We can also work with optimization problems with more than one constraint. For ex-
ample, for the problem

max / min u = f (x, y, z), subject to g(x, y, z) = 0 and h(x, y, z) = 0



we can define a Lagrange function as

L(x, y, z, λ, u) = f (x, y, z) − λg(x, y, z) − uh(x, y, z).

By solving the equations

∇L = 0 → Lx = Ly = Lz = Lλ = Lu = 0,

we obtain candidates for constrained extreme values.


There is also a geometric interpretation for this case. Points that satisfy g(x, y, z) =
0 and h(x, y, z) = 0 must be on the intersection curve r(t) of the two surfaces. If f (r(t))
has a local max/min at a candidate t = t0 , following an argument similar to the one
before, we have ∇f ⋅ r′(t0 ) = 0, ∇g ⋅ r′(t0 ) = 0, and ∇h ⋅ r′(t0 ) = 0 at the candidate point.
Therefore, ∇f , ∇g, and ∇h must be coplanar. This means

∇f = λ∇g + u∇h, for some constants λ and u. (2.17)

Example 2.9.7. Find the shortest distance from the origin to the line of intersection of the two planes
y + 2z − 12 = 0 and x + y − 6 = 0.

Solution. Suppose the point is (x, y, z), so the distance is √(x² + y² + z²). To minimize
the distance, we minimize x² + y² + z². So

L(x, y, z, λ, u) = x² + y² + z² − λ(y + 2z − 12) − u(x + y − 6).

Then by computing ∇L = 0, we have

2x − u = 0,
2y − λ − u = 0,
2z − 2λ = 0,
y + 2z − 12 = 0,
x + y − 6 = 0.

Solving the equations yields the only candidate (2, 4, 4). Therefore, the shortest distance
must be √(2² + 4² + 4²) = 6.
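
Because the five equations in Example 2.9.7 are linear in (x, y, z, λ, u), they can be solved mechanically. The following is a minimal sketch using exact rational arithmetic; the function `solve_linear` is a hypothetical helper written for this illustration (plain Gauss-Jordan elimination), not a routine from the text.

```python
from fractions import Fraction
import math


def solve_linear(a, b):
    """Solve a square linear system a * v = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Unknowns in the order (x, y, z, lambda, u).
coefficients = [
    [2, 0, 0, 0, -1],   # 2x - u = 0
    [0, 2, 0, -1, -1],  # 2y - lambda - u = 0
    [0, 0, 2, -2, 0],   # 2z - 2*lambda = 0
    [0, 1, 2, 0, 0],    # y + 2z = 12
    [1, 1, 0, 0, 0],    # x + y = 6
]
rhs = [0, 0, 0, 12, 6]

solution = solve_linear(coefficients, rhs)
x, y, z, lam, u = solution
distance = math.sqrt(float(x * x + y * y + z * z))
print(solution, distance)
```

The solver returns x = 2, y = 4, z = 4 with λ = u = 4, which agrees with the candidate point and the distance 6 found above.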

2.10 Review
Main concepts discussed in this chapter are listed below.
1. Definitions of functions of more than one variable, such as z = f (x, y) and u =
f (x, y, z).

2. Limits and continuity of functions of more than one variable.


3. Partial derivatives of z = f (x, y) with respect to x or y:

$$f_x(a, b) = \lim_{\Delta x \to 0} \frac{f(a + \Delta x, b) - f(a, b)}{\Delta x}, \qquad f_y(a, b) = \lim_{\Delta y \to 0} \frac{f(a, b + \Delta y) - f(a, b)}{\Delta y}.$$

4. Differentiability:

Δz = fx (a, b)Δx + fy (a, b)Δy + o(√((Δx)² + (Δy)²)).

5. The total differential:

$$dz = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$

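6. The linear approximation:

Δz ≈ fx (a, b)Δx + fy (a, b)Δy.

7. Clairaut's theorem: if $\frac{\partial^2 f}{\partial x \partial y}$ and $\frac{\partial^2 f}{\partial y \partial x}$ are both continuous at (a, b), then

$$\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x} \quad \text{at } (a, b).$$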

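8. The chain rules:

$$\text{if } z = z(x(t), y(t)), \text{ then } \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt},$$

$$\text{if } z = z(x(s, t), y(s, t)), \text{ then } \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}.$$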

9. The Taylor expansion:

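$$f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x} + \frac{1}{2}\,\Delta\mathbf{x}^T H(\mathbf{x}_0)\,\Delta\mathbf{x} + \cdots.$$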

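10. Implicit differentiation: if F(x, y, z) = 0 defines z = z(x, y), then

$$\frac{\partial z}{\partial x} = -\frac{F_x}{F_z} \quad\text{and}\quad \frac{\partial z}{\partial y} = -\frac{F_y}{F_z}.$$

11. Implicit differentiation: for the system F(x, y, u, v) = 0, G(x, y, u, v) = 0, denote

$$\frac{\partial(f, g)}{\partial(x, y)} = \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix};$$

then

$$\frac{\partial u}{\partial x} = -\frac{\partial(F, G)/\partial(x, v)}{\partial(F, G)/\partial(u, v)} \quad\text{and}\quad \frac{\partial v}{\partial x} = -\frac{\partial(F, G)/\partial(u, x)}{\partial(F, G)/\partial(u, v)}.$$

Similar results hold for $\partial u/\partial y$ and $\partial v/\partial y$.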

12.
(a) For a curve defined by the system f (x, y, z) = 0, g(x, y, z) = 0, its tangent line at P(x0 , y0 , z0 ) is given by

r = ⟨x0 , y0 , z0 ⟩ + t(∇f × ∇g)P .

(b) For a surface defined by F(x, y, z) = 0, its tangent plane at P(x0 , y0 , z0 ) is given
by

(∇F ⋅ Δx)P = 0.

13. The directional derivative of z = f (x, y) in the direction given by unit vector u at
point (a, b) is

Du f (a, b) = ∇f ⋅ u.

14. Candidates for local maxima/minima of the function f are points where ∇f = 0 or
∇f does not exist.
15. If A = fxx , B = fxy = fyx , and C = fyy , then at point P where ∇f (P) = 0,
(a) if AC − B² > 0 and A > 0, there is a local minimum,
(b) if AC − B² > 0 and A < 0, there is a local maximum,
(c) if AC − B² < 0, there is a saddle point.
16. Candidates for maxima/minima of z = f (x, y) subject to g(x, y) = 0 satisfy

∇f = λ∇g and g(x, y) = 0 for some number λ.

Similar results hold for functions of three variables or with more than one restric-
tion.
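
The second-derivative test in item 15 is easy to mechanize once A, B, and C are known at a critical point. The following is a minimal sketch; the function name `classify_critical_point` is hypothetical, and the "inconclusive" branch for AC − B² = 0 (or any case not covered by (a) to (c)) is an addition reflecting the standard statement of the test rather than something listed in the review above.

```python
def classify_critical_point(a, b, c):
    """Classify a critical point from A = f_xx, B = f_xy, C = f_yy."""
    discriminant = a * c - b * b
    if discriminant > 0 and a > 0:
        return "local minimum"
    if discriminant > 0 and a < 0:
        return "local maximum"
    if discriminant < 0:
        return "saddle point"
    # AC - B^2 = 0: the test gives no information (assumed standard case).
    return "inconclusive"


# f = x^2 + y^2 at (0, 0): A = 2, B = 0, C = 2.
print(classify_critical_point(2, 0, 2))
# f = x^2 - y^2 at (0, 0): A = 2, B = 0, C = -2.
print(classify_critical_point(2, 0, -2))
# f = x^4 + y^4 at (0, 0): A = B = C = 0.
print(classify_critical_point(0, 0, 0))
```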

2.11 Exercises
2.11.1 Functions of two variables

1. Find the domain for each of the following functions and sketch it:
(1) z = √(1 − x²) + √(y² − 1), (2) z = √x − √y,
(3) z = ln(1 − x − y), (4) z = ln(y − x) + √x / √(1 − x² − y²),
(5) u = √(R² − x² − y² − z²) + 1/√(x² + y² + z² − r²), R > r.
2. Find f (x, y) if f (x + y, xy) = x³ + y³.
3. Find each of the following limits if it exists:
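(1) $\lim_{(x,y)\to(2,\frac{1}{2})} (2 + xy)^{\frac{1}{y + xy^2}}$, (2) $\lim_{x\to\infty,\ y\to\infty} (x^2 + y^2)\sin\frac{3}{x^2 + y^2}$,
(3) $\lim_{(x,y)\to(0,0)} \frac{1 - \cos(x^2 + y^2)}{(x^2 + y^2)e^{x^2 y^2}}$, (4) $\lim_{(x,y)\to(2,0)} \frac{\sin(2xy)}{y}$,
(5) $\lim_{(x,y)\to(1,0)} \frac{\ln(x + e^y)}{\sqrt{x^2 + y^2}}$, (6) $\lim_{(x,y)\to(0,0)} \frac{xy\cos y}{3x^2 + y^2}$,
(7) $\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^2 + xy}$.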

4. Show that lim(x,y)→(0,0) f (x, y) does not exist, where

$$f(x, y) = \begin{cases} \dfrac{x^2 y}{x^4 + y^4}, & x^2 + y^2 \neq 0,\\ 0, & x^2 + y^2 = 0. \end{cases}$$

5. Determine the set of points at which each of the following functions is continuous:
(1) f (x, y) = sin(xy)/(e^x − y²), (2) z = (y² + 2x)/(y² − 2x),
(3) u = 1/(xyz), (4) z = ln(1 − x² − y²).

2.11.2 Partial derivatives and differentiability

1. Find the first partial derivatives for each of the following functions:
(1) u = xy + xy , (2) u = x
,
√x2 +y2

(3) u = x sin(x − 2y), (4) u = x arctan(xey ),


(5) u = xy , (6) u = ( xy )z ,
(7) u = z xy , (8) u = tan xy .
2. Find the indicated partial derivatives for each of the following functions:
(1) for f (x, y) = ln(x + ln y), fx (1, e) and fy (1, e),
(2) for f (x, y) = sin xy cos xy , fx (2, π).
3. Find the indicated second- or third-order partial derivatives for each of the follow-
ing functions:
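(1) for u = x ln(x + y), find $\frac{\partial^2 u}{\partial x^2}$, $\frac{\partial^2 u}{\partial y^2}$, and $\frac{\partial^2 u}{\partial x \partial y}$,
(2) for u = x³ sin y + y³ sin x, find $\frac{\partial^3 u}{\partial x^2 \partial y}$.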
4. If u = ln √((x − a)² + (y − b)²), where a and b are constants, show that

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$
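5. If
$$f(x, y) = \begin{cases} xy\,\dfrac{x^2 - y^2}{x^2 + y^2} & \text{when } x^2 + y^2 \neq 0,\\ 0 & \text{when } x^2 + y^2 = 0, \end{cases}$$
show that fxy (0, 0) ≠ fyx (0, 0).
6. If
$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{when } x^2 + y^2 \neq 0,\\ 0 & \text{when } x^2 + y^2 = 0, \end{cases}$$
show that both fx (0, 0) and fy (0, 0) exist but f is not differentiable at (0, 0).
7. Show that
$$f(x, y) = \begin{cases} (x^2 + y^2)\sin\left(\dfrac{1}{x^2 + y^2}\right) & \text{when } x^2 + y^2 \neq 0,\\ 0 & \text{when } x^2 + y^2 = 0 \end{cases}$$
is differentiable at (0, 0), but neither fx nor fy is continuous at (0, 0).
8. Let f be the function

$$f(x, y) = \begin{cases} \dfrac{x^2 y^2}{(x^2 + y^2)^{3/2}} & \text{if } (x, y) \neq (0, 0),\\ 0 & \text{if } (x, y) = (0, 0). \end{cases}$$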



(a) Find the limit lim(x,y)→(0,0) f (x, y) or show that it does not exist.
(b) Is the function continuous at (0, 0)?
(c) Is the function differentiable at (0, 0)?
9. Find the total differential for each of the following functions:
(1) z = x³ ln(y²), (2) z = arctan(xy/(1 − xy)), (3) u = √(x² + y² + z²).
10. Explain why the following functions are differentiable at the given point. Then
find the linearization L(x, y) of the function at that point, and use it to approximate
the given number.
(1) f (x, y) = xy at (1, 1), f (0.97, 1.06),
(2) f (x, y, z) = √(x² + y² + z²) at (3, 2, 6), √(3.02² + 1.97² + 5.99²).

2.11.3 Chain rules and implicit differentiation

1. Find the given partial derivative for each given explicitly or implicitly defined
function.
(1) u = ln(e^x + e^y), y = x³. Find du/dx. (2) z = sin(x²y)x. Find ∂z/∂x.
(3) u = x²y − xy² + z, where x = t cos(s), y = t sin(t), and z = t + s. Find ∂u/∂t and ∂u/∂s.
(4) u = f (x² − y², e^{xy}). Find ∂u/∂x and ∂u/∂y. (5) u = f (x, xy, xyz). Find ∂u/∂x, ∂u/∂y, and ∂u/∂z.
(6) u = f (x, x/y). Find ∂²u/∂x², ∂²u/∂y², and ∂²u/∂x∂y. (7) e^z = xyz. Find ∂z/∂x.
(8) yz = ln(x + z). Find ∂z/∂x. (9) u = f (x², y, xy). Find ∂u/∂x and ∂u/∂y.
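2. If z = f (x, y), where x = r cos θ and y = r sin θ, show that

$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 z}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2} + \frac{1}{r}\frac{\partial z}{\partial r}.$$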

3. If u = ϕ(x² + y²), show that $y\frac{\partial u}{\partial x} - x\frac{\partial u}{\partial y} = 0$.
4. If x = x(y, z), y = y(x, z), and z = z(x, y) are three functions implicitly defined
by the equation F(x, y, z) = 0, show that the partial derivatives of these functions
satisfy

$$\frac{\partial x}{\partial y}\,\frac{\partial y}{\partial z}\,\frac{\partial z}{\partial x} = -1.$$
5. If z = x² + y² and x + y + z = 1, find dz/dx and dy/dx.
6. Assume y = f (x, t) is differentiable, and an equation F(x, y, t) = 0 defines t = t(x, y)
implicitly as a function of x and y. If f and F have continuous partials, show that

$$\frac{dy}{dx} = \frac{f_x F_t - f_t F_x}{F_t + f_t F_y}.$$

7. (Derivative under integrals) The famous Leibniz theorem says that if f (x, t) is a
function such that f (x, t) and its partial derivative fx (x, t) are both continuous in

some region of the xt-plane, with a(x) ≤ t ≤ b(x) for two differentiable functions
a(x) and b(x), then
$$\frac{d}{dx}\int_{a(x)}^{b(x)} f(x, t)\,dt = f(x, b(x))\frac{d}{dx}b(x) - f(x, a(x))\frac{d}{dx}a(x) + \int_{a(x)}^{b(x)} \frac{\partial f(x, t)}{\partial x}\,dt.$$

When f (x, t) = f (t), a one-variable function in t, then this is proved by the funda-
mental theorem of calculus, part I. When a(x) = a and b(x) = b are two constants,
then the theorem becomes
$$\frac{d}{dx}\int_a^b f(x, t)\,dt = \int_a^b f_x(x, t)\,dt.$$

This means that the derivative operator can pass through the integral sign. This is
essentially the interchange of two limits (can you see why?).
(a) Prove the Leibniz theorem for the special case where a(x) = a and b(x) = b.
(b) By considering the function $\phi(t) = \int_0^1 \frac{\ln(1 + tx)}{1 + x^2}\,dx$ or otherwise, evaluate
$\int_0^1 \frac{\ln(1 + x)}{1 + x^2}\,dx$.
(c) Find the integral $\int_0^{\pi/2} \ln(\cos^2 x + a^2 \sin^2 x)\,dx$, a > 0.

2.11.4 Tangent lines/planes, directional derivatives

1. Find equations of (a) the tangent line and (b) the normal plane to each of the
following curves at the specified point:
2
(1) x = t 2 , y = 1 − t, z = t 3 , (1, 0, 1), (2) r(t) = ⟨ sin2 t , t+sin2t⋅cos t , sin t⟩, t = π4 ,
2 2 2 2 2
(3) { x x+y+z=0,
+y +z =6,
(1, −2, 1), (4) { (x−1)
2 2
+y =1,
2 (1, 1, √2) .
x +y +z =4,
2. Find equations of (a) the tangent plane and (b) the normal line to the following
surfaces at the given point:
(1) z = 2x² + 4y², (2, 1, 12), (2) x² = 2z, (2, 0, 2),
(3) cos πx + x²y + e^{xz} − yz + 4 = 0, (0, 1, 6).
3. Find the directional derivative of the following functions at the given point in the
direction of the vector v:
(1) f (x, y) = x² − y² at the point (1, 1), given v = ⟨1, √3⟩,
(2) f (x, y, z) = x/(y + z) at the point (4, 1, 1), given v = ⟨1, 2, −1⟩.
4. Find all points at which the direction of fastest change of the function f (x, y) =
x² + y² − 2x − 4y is i + j.
5. Find the maximum rate of change of f at the given points and the direction in
which it occurs:
(1) f (x, y) = x²y + e^{xy} sin y, (1, 0), (2) f (x, y, z) = xy²z, (1, −1, 2),
(3) f (x, y, z) = ln(x² + y² − 1) + y + 6z, (1, 1, 0).

6. A vector tangent to the curve $\begin{cases} f(x, y, z) = 0,\\ g(x, y, z) = 0 \end{cases}$ at P(x, y, z) is ∇f × ∇g. Use this to find
an equation for the tangent line and normal plane to the following curves at the
indicated point:

(1) $\begin{cases} x^2 + y^2 + z^2 - 3y = 0,\\ 2x + y - z - 2 = 0, \end{cases}$ (1, 1, 1), (2) $\begin{cases} e^x - z + xy = 0,\\ x^2 - y^2 + 2z^3 = 1, \end{cases}$ (0, 1, 1).

2.11.5 Maximum/minimum problems

1. Find and classify all the critical points for each of the following functions:
(1) f (x, y) = x² − xy + y² + 9x − 6y + 20, (2) f (x, y) = 4(x − y) − x² − y²,
(3) f (x, y) = 3x − x³ − 2y² + y⁴.
2. If the function z = z(x, y) is implicitly defined by the equation

x² − 6xy + 10y² − 2yz − z² + 18 = 0,

find all critical points of z and local maximum and local minimum values of z.
3. Find the absolute maximum and minimum values of f (x, y) on the set D:
(1) f (x, y) = 2−4x −5y, where D is the closed triangular region with vertices (0, 0),
(2, 0), and (0, 3),
(2) f (x, y) = xy², where D = {(x, y) | x ≥ 0, y ≥ 0, x² + y² ≤ 3},
(3) f (x, y) = 24xy − 8x³ − 6y² on the rectangular region D: 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2.
4. Find three positive numbers x, y, and z whose sum is 100 and whose product is a
maximum.
5. Find all points on the ellipse x² + 4y² = 4 that are closest to the line 2x + 3y − 6 = 0.
6. Find the dimensions of the closed rectangular box with least total surface area if
the volume is given by V m³.
7. Find the maximum value of f (x, y, z) = xyz on the line of intersection of the two
planes x + y + z = 40 and x + y = z.
8. (Least square method) Suppose the two variables x and y are related linearly by
the equation y = kx +b for some constants k and b. However, in practice, observed
pairs of data (x1 , y1 ), (x2 , y2 ), (x3 , y3 ) . . . (xn , yn ) that should satisfy this equation
usually do not lie exactly on a straight line. So scientists want to find the constant
k and b such that the line y = kx + b best “fits” these points.
Let di = yi − (kxi + b) be the vertical deviation of the point (xi , yi ) from the line
y = kx + b. The least square method determines k and b by minimizing $\sum_{i=1}^{n} d_i^2$ (the
sum of the squares of these deviations). Show that the “best fit line of y on x” is
given by

$$y = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\,x + \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i y_i \sum_{i=1}^{n} x_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}.$$

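9. For functions of more than two variables, there are similar tests for local extreme
values using the Hessian matrix H(x). For example, for u = f (x, y, z), its Hessian
matrix is

$$H(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial x \partial z} \\ \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} & \dfrac{\partial^2 f}{\partial y \partial z} \\ \dfrac{\partial^2 f}{\partial z \partial x} & \dfrac{\partial^2 f}{\partial z \partial y} & \dfrac{\partial^2 f}{\partial z^2} \end{pmatrix}.$$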

If all its second derivatives are continuous, then H(x) is a symmetric matrix. If
H(x) is positive definite at x0 , then f attains an isolated local minimum at x0 . If
the Hessian is negative definite at x0 , then f attains an isolated local maximum
at x0 . If the Hessian has both positive and negative eigenvalues at x0 , then x0 is a
saddle point for f . Otherwise the test is inconclusive.
Use a suitable Hessian matrix to classify the critical points of the function
f (x, y, z) = x² + y² − x + z².
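
The closed-form slope and intercept stated in Exercise 8 (least square method) translate directly into code. The following is a minimal sketch that simply evaluates those two quotients; the function name `best_fit_line` and the sample data are hypothetical, and the sketch does not replace the derivation the exercise asks for.

```python
def best_fit_line(xs, ys):
    """Return (k, b) for the least-squares line y = k*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    denominator = n * sum_xx - sum_x * sum_x
    if denominator == 0:
        raise ValueError("all x values are equal; the line is not determined")

    k = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_xx * sum_y - sum_xy * sum_x) / denominator
    return k, b


# Illustrative data lying exactly on y = 2x + 1.
k, b = best_fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(k, b)
```

For data that lie exactly on a line, the formula recovers that line; for noisy data it returns the line minimizing the sum of squared vertical deviations.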
3 Multiple integrals
In this chapter, we extend the idea of the definite integral of a function of one variable
to an analogous concept of a function of two or three variables, called a multiple in-
tegral. Multiple integrals are used in a number of applications, including computing
volumes, surface areas, and masses of two- or three-dimensional objects.

3.1 Definition and properties


We start with some interesting questions. It rained very hard in a city area last night.
How much rain was received by the city? How much water is there in a certain water
reservoir? If a rectangular plate is not uniform, and at each point (x, y) the density
is given by a density function, say, f (x, y), then what is the total mass of this plate?
Well, as in one-variable calculus, to answer these questions, we need to identify the
“elements” which we add up to form a Riemann sum and then take a limit to have an
integral. We first consider an ideal reservoir whose surface is a rectangle and whose
base is a smooth surface. If we turn it upside down, then finding the amount of water
in the reservoir is the same as finding the volume of a solid in space.

Volumes of solids
Now suppose that f is a two-variable function whose domain D is a rectangular region in the
xy-plane and that f (x, y) ≥ 0 for all (x, y) ∈ D. Hence, the graph of
f is a surface S above the xy-plane with equation z = f (x, y), and the projection of S
onto the xy-plane is the domain set D. The solid (three-dimensional) region Ω that lies
above D in the xy-plane and under the graph of f is

Ω = {(x, y, z) ∈ ℝ³ | 0 ≤ z ≤ f (x, y), (x, y) ∈ D}.

Our initial goal is to find a method for computing the volume of Ω, and this provides
a motivation for multiple integrals.
The first step is to subdivide the region D into n small closed subregions Δσ1 ,
Δσ2 , . . . , Δσn , as illustrated in Figure 3.1, where the subregions are created by drawing
lines parallel to the x- and y-axes. The value of n is left unspecified because eventually
we will use a limiting process in which n → ∞.
Arbitrarily choose a point (ξi , ηi ) in each Δσi for i = 1, 2, . . . , n. We approximate
the part of Ω that lies above each Δσi by a thin rectangular box (or “column”) with
base Δσi and height f (ξi , ηi ), as shown in Figure 3.1. The volume ΔVi of the part of Ω above
Δσi is approximately the volume of this column, that is, the height f (ξi , ηi ) multiplied by
the area of the base (we are using Δσi both as the name of the subregion and as the area
of this subregion), so we have

ΔVi ≈ f (ξi , ηi )Δσi .

https://doi.org/10.1515/9783110674378-003

Figure 3.1: Double integral, volume of a solid in space.

If we form the sum of these approximations over all subregions (a Riemann sum), we
get an approximation to the total volume, V, of the three-dimensional region Ω, i. e.,
$$V \approx \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i.$$

If the limit of this sum exists when n → ∞ and the maximum Δσi approaches zero,
then we define this limit to be the volume of Ω, i. e.,
$$V = \lim_{\max|\Delta\sigma_i| \to 0,\ n \to \infty}\ \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i.$$

Note. This limit must be the same value no matter how the subregions Δσi are made
and no matter where the point (ξi , ηi ) is chosen in each subregion Δσi . If the limit is
taken only as n → ∞ (the number of subregions approaches ∞), it would then still be
possible for some subregions Δσi to stay quite large. To avoid this problem, the limit
must also be taken so that the largest subregion approaches zero both in area and in
physical dimensions. We write |Δσi | to indicate the greatest dimension of the subre-
gion Δσi and then write max |Δσi | → 0 to indicate that this greatest dimension must
approach zero for all the subregions. This limit seems very complicated and hard to
compute but, surprisingly, it can be shown to exist whenever the function f is continu-
ous and the domain D is of a suitable form. We will show later that it can be computed
using two one-variable integrals (iterated integrals) that you have studied previously.
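
The Riemann sum above can be evaluated directly for a concrete surface. The following is a minimal sketch under several assumptions that are not in the text: the subregions are equal rectangles, the sample point (ξi, ηi) is taken at the center of each subregion, and the test function f(x, y) = 4 − x² − y² on the unit square is chosen only for illustration (its exact volume is 10/3).

```python
def riemann_sum(f, a, b, c, d, nx, ny):
    """Approximate the volume under z = f(x, y) over [a, b] x [c, d]."""
    dx = (b - a) / nx
    dy = (d - c) / ny
    total = 0.0
    for i in range(nx):
        xi = a + (i + 0.5) * dx          # sample point: center of the cell
        for j in range(ny):
            eta = c + (j + 0.5) * dy
            total += f(xi, eta) * dx * dy  # height times base area
    return total


def surface(x, y):
    return 4 - x * x - y * y


coarse = riemann_sum(surface, 0, 1, 0, 1, 4, 4)
fine = riemann_sum(surface, 0, 1, 0, 1, 100, 100)
print(coarse, fine, 10 / 3)
```

Refining the subdivision moves the sum toward 10/3, which illustrates the limiting process without relying on the iterated integrals introduced later.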

Mass of a lamina
We next investigate a second, quite different problem: computing the mass of a lamina
(thin plate). Surprisingly, it can be found by exactly the same process that we used for
finding the volume of a solid. Suppose a rectangular lamina is represented
by region D of the xy-plane. Suppose further that the density (mass per unit area) of
the lamina at a point corresponding to (x, y) in D is given by μ(x, y), where μ(x, y) is a
continuous function on D, as shown in Figure 3.2. We now derive a way to compute the

Figure 3.2: Double integral, mass of a lamina.

total mass M of the lamina using methods similar to the volume computation above.
We divide D into n small closed subregions Δσ1 , Δσ2 , . . . , Δσn by drawing “nets,” and
arbitrarily choose a point (ξi , ηi ) in each Δσi . If Δσi is very small, so that the density
does not change much over Δσi , then the mass of the part of the lamina represented
by Δσi is approximately the density at (ξi , ηi ) multiplied by the area, i. e., μ(ξi , ηi )Δσi (we
are using Δσi both as the name of the subregion and as the area of this subregion). If
we add all such mass approximations, we get an approximation to the total mass, i. e.,
$$M \approx \sum_{i=1}^{n} \mu(\xi_i, \eta_i)\,\Delta\sigma_i.$$

If the limit of this sum exists as n → ∞ and max |Δσi | → 0, and it is independent of
the choice of subdivision of D and of the sample point (ξi , ηi ) in each Δσi , then we define this
limit to be the mass of the lamina, written

$$M = \lim_{\max|\Delta\sigma_i| \to 0,\ n \to \infty}\ \sum_{i=1}^{n} \mu(\xi_i, \eta_i)\,\Delta\sigma_i.$$

Note this is exactly the same type of limit as that used before to compute the volume
of a three-dimensional region. We will see later that many other applied problems can
be reduced to computing a limit exactly of this type.
We now link this limiting process to a more general definition of the double inte-
gral of a function f (x, y) over a general region D in the xy-plane.

Definition 3.1.1 (Double integrals). The double integral of f over the region D is defined to be the
following limit:

$$\iint_D f(x, y)\,d\sigma = \lim_{\max|\Delta\sigma_i| \to 0,\ n \to \infty}\ \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i,$$

where Δσ1 , Δσ2 , . . . , Δσn are n closed subregions which are a partition of the region D (Δσi also denotes
the area of Δσi ) and (ξi , ηi ) is an arbitrarily chosen point in Δσi , assuming that this limit exists. The

limit must have the same value for any choice of subdivision and the choice of sample points (ξi , ηi ).
The double integral is also often written as ∬D f (x, y)dA.

Note. The motivation for the double integral assumed that f (x, y) ≥ 0 for (x, y) ∈ D, but
the definition here does not assume that f is nonnegative. When f takes both positive
and negative values on D, then the double integral ∬D f (x, y)dσ, if it exists, is equal to
the volume of the part of the solid that lies above the xy-plane minus the volume of
the part of the solid that lies below the xy-plane.

Properties of double integrals


We list the following properties of double integrals, which can be proved using the
definition of a double integral, in a manner similar to the proofs for functions of a
single variable (we assume that all integrals exist):
(1) ∬D 1dσ = A(D), the area of D;
(2) ∬D [f (x, y) + g(x, y)]dσ = ∬D f (x, y)dσ + ∬D g(x, y)dσ;
(3) ∬D kf (x, y)dσ = k ∬D f (x, y)dσ, where k is a constant ((2) and (3) are the linearity
property);
(4) if f (x, y) ≥ g(x, y) for all (x, y) ∈ D, then ∬D f (x, y)dσ ≥ ∬D g(x, y)dσ;
(5) if D = D1 ∪ D2 , where D1 and D2 do not overlap except perhaps on their boundaries,
then

$$\iint_D f(x, y)\,d\sigma = \iint_{D_1} f(x, y)\,d\sigma + \iint_{D_2} f(x, y)\,d\sigma;$$

(6) if m ≤ f (x, y) ≤ M for all (x, y) ∈ D and A(D) denotes the area of D, then

$$m \cdot A(D) \le \iint_D f(x, y)\,d\sigma \le M \cdot A(D).$$

(7) (mean value theorem) if f (x, y) is continuous on the closed region D and A(D) is
the area of D, then there exists (ξ , η) ∈ D such that

$$\iint_D f(x, y)\,d\sigma = f(\xi, \eta)\,A(D).$$

Example 3.1.1. If D = {(x, y) | x² + y² ≤ 4}, evaluate the integral

$$\iint_D \sqrt{x^2 + y^2}\,d\sigma.$$

Solution. Evaluating this double integral directly from the definition as a limit is hard.
However, because √(x² + y²) ≥ 0, we can find the integral by interpreting it as a volume
of a solid Ω. The surface z = √(x² + y²) is a cone with vertex downwards at the origin and
axis along the z-axis and with height 2. Therefore, the given double integral represents
the volume of the solid below this cone and above the disk D. The volume of Ω is the
volume of a cylinder with base D and height 2 minus the volume of a cone with the
same base and height. Thus,

$$\iint_D \sqrt{x^2 + y^2}\,d\sigma = \pi 2^2 \times 2 - \frac{1}{3}\pi 2^2 \times 2 = \frac{16\pi}{3}.$$
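
The geometric answer in Example 3.1.1 can be compared with a direct Riemann sum. The following is a minimal sketch under stated assumptions: the disk is covered by a square grid, only cells whose centers lie in the disk are counted, and the helper name `integrate_over_disk` is hypothetical. Because the boundary is only approximated by grid cells, the agreement is approximate.

```python
import math


def integrate_over_disk(f, radius, n):
    """Midpoint Riemann sum of f over the disk x^2 + y^2 <= radius^2."""
    h = 2 * radius / n                      # side length of each grid cell
    total = 0.0
    for i in range(n):
        x = -radius + (i + 0.5) * h
        for j in range(n):
            y = -radius + (j + 0.5) * h
            if x * x + y * y <= radius * radius:  # keep cells centered in D
                total += f(x, y) * h * h
    return total


approx = integrate_over_disk(lambda x, y: math.sqrt(x * x + y * y), 2.0, 400)
exact = 16 * math.pi / 3
print(approx, exact)
```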

Example 3.1.2. Use the properties of double integrals to estimate the integral $\iint_D e^{\sin x \cos y}\,d\sigma$, where
D is the disk in the xy-plane with radius 2 centered at the origin.

Solution. Because −1 ≤ sin x ≤ 1 and −1 ≤ cos y ≤ 1, we have −1 ≤ sin x cos y ≤ 1 and,
therefore,

$$e^{-1} \le e^{\sin x \cos y} \le e^{1} = e.$$

Let m = e⁻¹ = 1/e and M = e. By using property (6) and noting that the area of D is
given by A(D) = π(2)² = 4π, we obtain

$$\frac{4\pi}{e} \le \iint_D e^{\sin x \cos y}\,d\sigma \le 4\pi e.$$

Symmetry in double integrals


Sometimes, we can take advantage of the symmetry properties in the integrand or the
region of integration, as shown in the following example.

Example 3.1.3. Find the integral ∬D (x³(1 + y²) + 5)dσ for D = {(x, y) | −1 ≤ x ≤ 1, −1 ≤ y ≤ 1}.

Solution. The region of integration D is a square centered at the origin. By the linearity
property,

$$\iint_D (x^3(1 + y^2) + 5)\,d\sigma = \iint_D x^3(1 + y^2)\,d\sigma + \iint_D 5\,d\sigma.$$

In $\iint_D x^3(1 + y^2)\,d\sigma$, the integrand is an odd function with respect to x, and D is symmetric
about the y-axis. This means that the integrand is positive on the half of D where x > 0 and
negative on the half where x < 0, with equal magnitudes at (x, y) and (−x, y). That is, half of the
graph of f (x, y) = x³(1 + y²) is above the xy-plane, and the other half is below it, and the
two halves are symmetric, so this double integral is equal to 0. Since ∬D 5dσ = 5 ∬D 1dσ,
the second double integral is equal to 5 times the area of the region of integration. Thus,

$$\iint_D (x^3(1 + y^2) + 5)\,d\sigma = 0 + 5 \times 2^2 = 20.$$

Figure 3.3 shows the graph of the function f (x, y) = x³(1 + y²) on the region D.
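
The cancellation used in Example 3.1.3 is visible in a Riemann sum whose sample points are placed symmetrically about the y-axis. The following is a minimal sketch; the helper `midpoint_sum` and the grid size are assumptions made for this illustration only.

```python
def midpoint_sum(f, a, b, c, d, n):
    """Midpoint Riemann sum of f over [a, b] x [c, d] with an n-by-n grid."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += f(x, y) * dx * dy
    return total


# Odd part in x: contributions at (x, y) and (-x, y) cancel.
odd_part = midpoint_sum(lambda x, y: x**3 * (1 + y**2), -1, 1, -1, 1, 200)

# Full integrand from Example 3.1.3.
full = midpoint_sum(lambda x, y: x**3 * (1 + y**2) + 5, -1, 1, -1, 1, 200)
print(odd_part, full)
```

Up to rounding error, the odd part sums to zero and the full sum equals 5 times the area of the square.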

Figure 3.3: Symmetry example for a double integral, Example 3.1.3.

3.2 Double integrals in rectangular coordinates


It is very difficult to evaluate a double integral using its definition as a limit. However,
in this section we show how to express a double integral as iterated integrals that can
be evaluated by calculating two single-variable integrals.
We first consider the volume problem where f (x, y) ≥ 0 for all (x, y) in the rectan-
gular domain

D = {(x, y) | a ≤ x ≤ b, and c ≤ y ≤ d}.

If we divide D horizontally and vertically into nm subregions, we note that the area
element Δσ is ΔxΔy. If we let x be fixed, say, $x = x_i^* \in [x_{i-1}, x_i] \subset [a, b]$, then

$$\lim_{\max|\Delta y_j| \to 0}\ \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta y_j = \int_c^d f(x_i^*, y)\,dy$$

gives the area of the cross-section of the solid in the plane $x = x_i^*$, as shown in Figure 3.4.
If we multiply this area by a tiny thickness $\Delta x_i$, this would give us a volume element

$$\Delta V_i \approx \left(\int_c^d f(x_i^*, y)\,dy\right)\Delta x_i.$$

Taking a limit of a Riemann sum will give the volume

$$V = \lim_{\max|\Delta x_i| \to 0}\ \sum_{i=1}^{n} \left(\int_c^d f(x_i^*, y)\,dy\right)\Delta x_i.$$

Figure 3.4: Iterated integrals over a rectangular region.

This is a one-variable integral whose integrand is the function $A(x) = \int_c^d f(x, y)\,dy$. So
if f (x, y) is continuous, we have

$$V = \int_a^b A(x)\,dx = \int_a^b \left(\int_c^d f(x, y)\,dy\right)dx.$$

It is sometimes convenient to write this as

$$V = \int_a^b dx \int_c^d f(x, y)\,dy.$$

In a similar manner, we can also integrate with respect to x first. This gives

$$V = \int_c^d A(y)\,dy = \int_c^d \left(\int_a^b f(x, y)\,dx\right)dy.$$

Note.
1. We can also interpret the two integrals as in the mass of lamina model. The inner
integral $\int_c^d f(x, y)\,dy$ gives the mass of a vertical rod. Then we sum/integrate
those masses of rods to get the total mass of the lamina. This is illustrated in
Figure 3.4(b).
2. As we defined the differentials dx, dy, and dz, we can define dσ = dxdy or dσ =
dydx in rectangular coordinates. We will see that this helps a lot in algebraic
manipulations.

Recall that when f (x, y) ≥ 0, the volume V is exactly represented by the double
integral ∬D f (x, y)dσ. The method discussed above also works even if f (x, y) takes
positive and negative values over a region D. We summarize these arguments in the
following theorem.

Theorem 3.2.1 (Fubini’s theorem: rectangular region). If f (x, y) is continuous on a rectangular region

D = {(x, y)|a ≤ x ≤ b, c ≤ y ≤ d},

then the double integral

$$\iint_D f(x, y)\,d\sigma = \iint_D f(x, y)\,dx\,dy = \int_c^d dy \int_a^b f(x, y)\,dx,$$

and also

$$\iint_D f(x, y)\,d\sigma = \iint_D f(x, y)\,dy\,dx = \int_a^b dx \int_c^d f(x, y)\,dy.$$

Example 3.2.1. Find the double integrals

(a) $\iint_D (x + y^2)\,d\sigma$, where D = {(x, y) | 1 ≤ x ≤ 5, −1 ≤ y ≤ 4},

(b) $\iint_D x^2 y \sin y^2\,d\sigma$, where D = {(x, y) | 0 ≤ x ≤ 6, 0 ≤ y ≤ √π}.

Solution.
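(a) By Fubini’s theorem,

$$\iint_D (x + y^2)\,d\sigma = \int_1^5 dx \int_{-1}^{4} (x + y^2)\,dy = \int_1^5 \left[xy + \frac{y^3}{3}\right]_{y=-1}^{y=4} dx = \int_1^5 \left(5x + \frac{4^3}{3} - \left(-\frac{1}{3}\right)\right)dx = \frac{440}{3}.$$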

(b) Note that the integrand has the form g(x)f (y). Thus,

$$\iint_D x^2 y \sin y^2\,d\sigma = \int_0^6 dx \int_0^{\sqrt{\pi}} x^2 y \sin y^2\,dy = \int_0^6 x^2\,dx \int_0^{\sqrt{\pi}} y \sin y^2\,dy = \left.\frac{x^3}{3}\right|_0^6 \cdot \left.\left(-\frac{1}{2}\cos y^2\right)\right|_0^{\sqrt{\pi}} = 72 \cdot \left(-\frac{1}{2}\right)(\cos \pi - \cos 0) = 72.$$
Note that we have evaluated the two definite integrals separately. That is, for an integrand
of the form f (x)g(y) on a rectangle,

$$\int_a^b dx \int_c^d f(x)g(y)\,dy = \int_a^b f(x)\,dx \int_c^d g(y)\,dy.$$
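
Both parts of Example 3.2.1 can be checked by computing the iterated integrals numerically. The following is a minimal sketch; the one-dimensional midpoint rule `integrate` is a hypothetical helper written for this illustration, and its results are approximations of 440/3 and 72 rather than exact values.

```python
import math


def integrate(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f_a(x, y):
    return x + y * y


# Part (a): integrate in y first, then in x, and in the opposite order.
dy_first = integrate(lambda x: integrate(lambda y: f_a(x, y), -1, 4), 1, 5)
dx_first = integrate(lambda y: integrate(lambda x: f_a(x, y), 1, 5), -1, 4)

# Part (b): the integrand factors, so the double integral is a product.
product_form = integrate(lambda x: x * x, 0, 6) * integrate(
    lambda y: y * math.sin(y * y), 0, math.sqrt(math.pi)
)

print(dy_first, dx_first, 440 / 3)
print(product_form)
```

The two orders of integration agree with each other and with 440/3, and the product of the two single integrals in part (b) is close to 72.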

Now we consider more general regions of integration. If D is the region between
the graphs of two continuous functions of x (this type of region is called a type I region,
as shown in Figure 3.5(a)), that is, there is some interval a ≤ x ≤ b and functions of
one variable y = ϕ1 (x) and y = ϕ2 (x) such that

D = {(x, y) | a ≤ x ≤ b, ϕ1 (x) ≤ y ≤ ϕ2 (x)},

then, for each x in [a, b], the range for y now depends on x with lower bound ϕ1 (x)
and upper bound ϕ2 (x). Therefore, if we keep x constant, we can also interpret
$A(x) = \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\,dy$ as the area of a cross-section of the solid. Thus,

$$V = \int_a^b A(x)\,dx.$$

Therefore, we can still write the volume V and double integral of f (x, y) as two
one-variable integrals (iterated integrals), i. e.,

$$V = \iint_D f(x, y)\,d\sigma = \int_a^b A(x)\,dx = \int_a^b \left[\int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\,dy\right]dx.$$


Figure 3.5: Type I and type II regions.



Similarly, D could be a type II region (see Figure 3.5(b)) bounded by two continuous
functions in the xy-plane, x = ψ1 (y) and x = ψ2 (y) for some interval of y values c ≤
y ≤ d,

D = {(x, y) | ψ1 (y) ≤ x ≤ ψ2 (y), c ≤ y ≤ d}.

A similar derivation to that used above for type I regions shows that

$$\iint_D f(x, y)\,d\sigma = \int_c^d \left[\int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\,dx\right]dy = \int_c^d dy \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\,dx.$$

The above results on iterated integrals are true even if f (x, y) is not nonnegative.
The formal statement is given in Fubini’s theorem. Rigorous proofs of Fubini’s theorem
can be found in more theoretical calculus textbooks.

Theorem 3.2.2 (Fubini’s theorem: general region). If z = f (x, y) is continuous on its domain D, a type
I region

D = {(x, y) | a ≤ x ≤ b, ϕ1 (x) ≤ y ≤ ϕ2 (x)},

then the double integral can be evaluated by the iterated integrals

$$\iint_D f(x, y)\,d\sigma = \int_a^b dx \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\,dy.$$

If z = f (x, y) is continuous on its domain D, a type II region

D = {(x, y)|ψ1 (y) ≤ x ≤ ψ2 (y), c ≤ y ≤ d},

then the double integral can be evaluated by the iterated integrals

$$\iint_D f(x, y)\,d\sigma = \int_c^d dy \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\,dx.$$

Note. The theorem is also true if f is bounded on D and is discontinuous only on a
finite number of smooth curves, provided the iterated integrals exist. However, the
proof of this fact is beyond the scope of this book.

Example 3.2.2. Evaluate ∬D (x + 2y)dσ, where D is the region bounded by straight lines y = 2 and
y = x and the hyperbola xy = 1.

Solution. The hyperbola intersects the two lines at the two points (1/2, 2) and (1, 1), and the
two lines intersect at the point (2, 2), as shown in Figure 3.6.

Figure 3.6: Double integral, Example 3.2.2.

We note that the region D is both a type I region and a type II region, but the description
of D as a type I region is more complicated since the lower boundary consists of two
parts. Therefore, it is better to express D as a type II region bounded on the left by x = 1/y
and on the right by x = y, so

D = {(x, y) | 1/y ≤ x ≤ y, 1 ≤ y ≤ 2}.

We compute the double integral (recall that the inner iterated integral is evaluated as
though y is a constant, like a partial derivative with respect to x):

$$\iint_D (x + 2y)\,d\sigma = \int_1^2 \int_{1/y}^{y} (x + 2y)\,dx\,dy = \int_1^2 \left[\frac{x^2}{2} + 2yx\right]_{x=1/y}^{x=y} dy = \int_1^2 \left(\frac{y^2}{2} + 2y^2 - \frac{1}{2y^2} - 2\right)dy = \frac{43}{12}.$$

If we had expressed D as a type I region, then we would evaluate it in two parts, D1 for
1/2 ≤ x ≤ 1, bounded above by y = 2 and below by y = 1/x, and D2 for 1 ≤ x ≤ 2, bounded
above by y = 2 and below by y = x. Hence,

$$\begin{aligned} \iint_D (x + 2y)\,d\sigma &= \iint_{D_1} (x + 2y)\,d\sigma + \iint_{D_2} (x + 2y)\,d\sigma \\ &= \int_{1/2}^{1} \int_{1/x}^{2} (x + 2y)\,dy\,dx + \int_1^2 \int_x^2 (x + 2y)\,dy\,dx \\ &= \int_{1/2}^{1} \left(2x - \frac{1}{x^2} + 3\right)dx + \int_1^2 (-2x^2 + 2x + 4)\,dx \\ &= \frac{5}{4} + \frac{7}{3} = \frac{43}{12}. \end{aligned}$$

This clearly involves more work than the first method.
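
The two descriptions of D in Example 3.2.2 can be compared numerically. The following is a minimal sketch; `integrate` is a hypothetical midpoint-rule helper, and the inner limits are passed as expressions of the outer variable exactly as in the type I and type II descriptions above.

```python
def integrate(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x + 2 * y


# Type II description: 1 <= y <= 2, 1/y <= x <= y.
type_ii = integrate(lambda y: integrate(lambda x: f(x, y), 1 / y, y), 1, 2)

# Type I description: D1 (1/2 <= x <= 1, 1/x <= y <= 2) plus
# D2 (1 <= x <= 2, x <= y <= 2).
part_d1 = integrate(lambda x: integrate(lambda y: f(x, y), 1 / x, 2), 0.5, 1)
part_d2 = integrate(lambda x: integrate(lambda y: f(x, y), x, 2), 1, 2)
type_i = part_d1 + part_d2

print(type_ii, type_i, 43 / 12)
```

Both computations approximate 43/12, and the two pieces of the type I computation approximate 5/4 and 7/3.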

Example 3.2.3. Evaluate ∬D xy dσ, where D is the region bounded by the line y = x and the parabola
y² = 2x + 8.

Solution. The region is shown in Figure 3.7, and it lies between x = y²/2 − 4 and x = y. So,

$$D = \left\{(x, y) \;\middle|\; -2 \le y \le 4,\ \frac{y^2}{2} - 4 \le x \le y\right\}.$$

Again, D is both a type I and a type II region, but we prefer to express D as a type
II region because it is less complicated. The double integral becomes

$$\iint_D xy\,d\sigma = \int_{-2}^{4} \left[\int_{y^2/2 - 4}^{y} xy\,dx\right]dy = \int_{-2}^{4} \left[\frac{x^2 y}{2}\right]_{x = y^2/2 - 4}^{x = y} dy = \frac{1}{2}\int_{-2}^{4} \left(y^3 - \left(\frac{y^2}{2} - 4\right)^2 y\right)dy = \frac{1}{2}\int_{-2}^{4} \left(-\frac{1}{4}y^5 + 5y^3 - 16y\right)dy = 18.$$

Figure 3.7: Double integral, Example 3.2.3.

Change the order of integration

Example 3.2.4. Evaluate the iterated integral $\int_0^1 \int_x^{\sqrt{x}} \frac{\sin y}{y}\,dy\,dx$.

(a) (b)

Figure 3.8: Change the order of integration.

Solution. Note that the inner integral cannot be evaluated in closed form, since the antiderivative
∫ (sin y)/y dy is not an elementary function. So, we cannot compute the integral as it stands. However, if we change the order
of integration, then it may be possible to evaluate the iterated integral. We first express
the given iterated integral as a double integral, i. e.,

\[ \int_0^1 \int_x^{\sqrt{x}} \frac{\sin y}{y}\,dy\,dx = \iint_D \frac{\sin y}{y}\,d\sigma, \]

where D is the type I region shown in Figure 3.8(a) between the curves y = x and
y = √x,

D = {(x, y)|0 ≤ x ≤ 1, x ≤ y ≤ √x}.

From Figure 3.8(b), we can see that there is an alternative description of D as a
type II region between the curves x = y² and x = y, i. e.,

\[ D = \{(x, y) \mid 0 \le y \le 1,\ y^2 \le x \le y\}. \]

This enables us to express the double integral as an iterated integral in a different
order,

\[ \iint_D \frac{\sin y}{y}\,dx\,dy = \int_0^1 dy \int_{y^2}^{y} \frac{\sin y}{y}\,dx = \int_0^1 \left[x\,\frac{\sin y}{y}\right]_{x=y^2}^{x=y} dy \]
\[ = \int_0^1 (\sin y - y \sin y)\,dy = 1 - \sin 1. \]
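The change of order can be checked numerically. The sketch below approximates the integral in the original order (where no closed-form inner antiderivative exists, but a numerical one does) and in the swapped order, and compares both with 1 − sin 1. The helper `midpoint` is an illustrative name, and the midpoint rule is chosen here because it never samples the endpoint y = 0.

```python
import math


def midpoint(f, a, b, n=400):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def g(y):
    # Integrand sin(y)/y; the midpoint rule never evaluates it at y = 0.
    return math.sin(y) / y


# Original order: 0 <= x <= 1, x <= y <= sqrt(x).
original = midpoint(lambda x: midpoint(g, x, math.sqrt(x)), 0, 1)

# Swapped order: 0 <= y <= 1, y^2 <= x <= y; the inner integral is g(y) * (y - y^2).
swapped = midpoint(lambda y: g(y) * (y - y * y), 0, 1)

exact = 1 - math.sin(1)
print(original, swapped, exact)
```

The swapped order reduces to a one-dimensional integral of an elementary function, which is why it can also be done by hand.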

3.3 Double integral in polar coordinates


If we want to evaluate the double integral

\[ \iint_D (x^2 + y^2)\,d\sigma \quad \text{for } D = \{(x, y) \mid x^2 + y^2 \le a^2\} \]

in rectangular coordinates, we will have to evaluate the integral

\[ \int_{-a}^{a} dx \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} (x^2 + y^2)\,dy. \]

This is certainly not fun. However, if we describe D in polar coordinates, then we will
have D′rθ = {(r, θ)|0 ≤ r ≤ a, 0 ≤ θ ≤ 2π}. This is a rectangular region in the rθ-plane
on which the integration might be easier. In rectangular coordinates, we see the area
element dσ = dxdy. What would this be in polar coordinates? Recall that we found the
area element by dividing the region into many subregions using lines that are parallel
to the x- or y-axis. So an area element is represented by ΔxΔy in rectangular coordinates. Similarly, we can draw circles all centered at the origin with different radii and
half-lines with initial point the origin but different angles from the positive x-axis. This
produces Δr and Δθ. Looking at an area element as shown in Figure 3.9, we approximate it as the difference of areas of two sectors. So, the area approximation is

\[ \Delta\sigma \approx \frac{1}{2}\left(r^* + \frac{\Delta r}{2}\right)^2 \Delta\theta - \frac{1}{2}\left(r^* - \frac{\Delta r}{2}\right)^2 \Delta\theta = r^*\,\Delta r\,\Delta\theta. \]
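The simplification of the two sector areas can be written out explicitly. Assuming r* denotes the mid-radius of the element (which is what the expression above suggests), the squared terms cancel and only the cross term survives:

```latex
\begin{aligned}
\Delta\sigma
  &\approx \tfrac12\Big[\big(r^* + \tfrac{\Delta r}{2}\big)^2
        - \big(r^* - \tfrac{\Delta r}{2}\big)^2\Big]\Delta\theta \\
  &= \tfrac12\,\big(2\,r^*\,\Delta r\big)\,\Delta\theta
   = r^*\,\Delta r\,\Delta\theta .
\end{aligned}
```

This is where the extra factor r in the polar area element r dr dθ comes from.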

This means that we can consider the limit of the Riemann sum

\[ \lim_{\max|\Delta\sigma_i| \to 0} \sum_{i=1}^{m} f(r_i^*, \theta_i^*)\,\Delta\sigma_i = \lim_{\max(\Delta r_i, \Delta\theta_i) \to 0} \sum_{i=1}^{m} f(r_i^*, \theta_i^*)\, r_i^*\, \Delta r_i\, \Delta\theta_i. \]

Then, if the limit exists independent of the way of subdividing the region and the
choice of sample points, we can define

\[ \iint_D f(x, y)\,d\sigma = \iint_{D'_{r\theta}} f(r\cos\theta, r\sin\theta)\, r\,dr\,d\theta. \]

Figure 3.9: Double integral in polar coordinates.



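In particular, by taking f (x, y) = 1, we can see that the area of the region D bounded
by θ = α, θ = β, and r = r(θ) is given by

\[ A(D) = \iint_D 1\,d\sigma = \int_\alpha^\beta \int_0^{r(\theta)} r\,dr\,d\theta = \int_\alpha^\beta \left[\frac{r^2}{2}\right]_0^{r(\theta)} d\theta = \frac{1}{2}\int_\alpha^\beta [r(\theta)]^2\,d\theta. \tag{3.1} \]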

Example 3.3.1. Find the area of the disk D = {(x, y)|x 2 + y 2 ≤ R 2 }.

Solution. In polar coordinates, the circle has equation r = R and D is

{(r, θ)|0 ≤ r ≤ R, 0 ≤ θ ≤ 2π}.

The area is given by

\[ A(D) = \frac{1}{2}\int_\alpha^\beta r^2(\theta)\,d\theta = \frac{1}{2}\int_0^{2\pi} R^2\,d\theta = \frac{1}{2} \times 2\pi \times R^2 = \pi R^2. \]

Example 3.3.2. Find the double integral ∬D (x² + y²) dσ for D = {(x, y)|x² + y² ≤ a²}.

Solution. Using polar coordinates, D becomes D′rθ = {(r, θ)|0 ≤ r ≤ a, 0 ≤ θ ≤ 2π}. So,

\[ \iint_D (x^2 + y^2)\,d\sigma = \iint_{D'_{r\theta}} \left((r\cos\theta)^2 + (r\sin\theta)^2\right) r\,dr\,d\theta \]
\[ = \iint_{D'_{r\theta}} r^3\,dr\,d\theta = \int_0^{2\pi} d\theta \int_0^a r^3\,dr = \frac{\pi a^4}{2}. \]

Example 3.3.3. Find the volume of the solid bounded by the plane z = 0 and the paraboloid z = 1 −
x 2 − y 2 , using polar coordinates.

Solution. Let z = 0 in the equation of the paraboloid. We get x² + y² = 1. Thus, the solid
lies under the paraboloid and above the circular disk D: x² + y² ≤ 1 in the xy-plane. In
polar coordinates, D is described by 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. Since 1 − x² − y² = 1 − r²,
the volume is, therefore, given by

\[ V = \iint_D (1 - x^2 - y^2)\,d\sigma = \int_0^{2\pi}\int_0^1 (1 - r^2)\, r\,dr\,d\theta \]
\[ = \int_0^{2\pi} d\theta \int_0^1 (r - r^3)\,dr = \int_0^{2\pi} \left[\frac{r^2}{2} - \frac{r^4}{4}\right]_0^1 d\theta = \int_0^{2\pi} \frac{1}{4}\,d\theta = \frac{\pi}{2}. \]

Note. If we had used rectangular coordinates instead of polar coordinates, then we
would have to evaluate

\[ V = \iint_D (1 - x^2 - y^2)\,d\sigma = \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} (1 - x^2 - y^2)\,dy\,dx. \]

This integral can be evaluated using a trigonometric substitution and trigonometric
identities, but it is quite complicated.
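Although the rectangular form is awkward by hand, a computer can approximate both forms. The following sketch compares a nested midpoint rule in polar and in rectangular coordinates for the volume under z = 1 − x² − y²; the helper `midpoint` and the grid size are illustrative assumptions.

```python
import math


def midpoint(f, a, b, n=300):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Polar form: 0 <= theta <= 2*pi, 0 <= r <= 1, integrand (1 - r^2) * r.
polar = midpoint(
    lambda theta: midpoint(lambda r: (1 - r * r) * r, 0, 1), 0, 2 * math.pi
)


# Rectangular form: -1 <= x <= 1, |y| <= sqrt(1 - x^2).
def inner(x):
    s = math.sqrt(1 - x * x)
    return midpoint(lambda y: 1 - x * x - y * y, -s, s)


rect = midpoint(inner, -1, 1)

print(polar, rect, math.pi / 2)
```

Note the factor `r` in the polar integrand: it is the area element r dr dθ, not part of the height function.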

Example 3.3.4. Find the volume of the solid that lies under the sphere x² + y² + z² = 4, above the
xy-plane, and inside the cylinder x² + y² = 2x.

Solution. The solid lies above the disk D whose boundary circle, x² + y² = 2x (center
(1, 0), radius 1), is determined by the cylinder. In polar coordinates, we have x² + y² = r²
and x = r cos θ. Then, the boundary circle becomes r² = 2r cos θ ⟹ r = 2 cos θ for
−π/2 ≤ θ ≤ π/2. Thus, the disk D is given by

D = {(r, θ)| − π/2 ≤ θ ≤ π/2, 0 ≤ r ≤ 2 cos θ}.

Hence, the volume is

\[ V = \iint_D \sqrt{4 - x^2 - y^2}\,d\sigma = \int_{-\pi/2}^{\pi/2} \int_0^{2\cos\theta} \sqrt{4 - r^2}\; r\,dr\,d\theta \]
\[ = -\frac{1}{2}\int_{-\pi/2}^{\pi/2} \frac{2}{3}\,(4 - r^2)^{3/2}\Big|_0^{2\cos\theta}\,d\theta = -\frac{1}{3}\int_{-\pi/2}^{\pi/2} \left((4 - 4\cos^2\theta)^{3/2} - 4^{3/2}\right) d\theta \]
\[ = -\frac{1}{3}\int_{-\pi/2}^{\pi/2} \left(8(\sin^2\theta)^{3/2} - 8\right) d\theta = -\frac{2}{3}\int_0^{\pi/2} \left(8(\sin^2\theta)^{3/2} - 8\right) d\theta \]
\[ = -\frac{16}{3}\int_0^{\pi/2} (\sin^3\theta - 1)\,d\theta = -\frac{16}{3}\int_0^{\pi/2} \left(\frac{1}{4}(3\sin\theta - \sin 3\theta) - 1\right) d\theta \]
\[ = -\frac{16}{3}\left(\frac{1}{12}\cos 3\theta - \frac{3}{4}\cos\theta - \theta\right)\Big|_0^{\pi/2} = \frac{16}{3}\left(\frac{\pi}{2} - \frac{2}{3}\right). \]

Note. In the above, note that $-\frac{1}{3}\int_{-\pi/2}^{\pi/2} (8(\sin^2\theta)^{3/2} - 8)\,d\theta \ne -\frac{8}{3}\int_{-\pi/2}^{\pi/2} (\sin^3\theta - 1)\,d\theta$. That
is because (sin²θ)^(3/2) must use the positive square root and so (sin²θ)^(3/2) = |sin θ|³. Also,
sin 3θ = 3 sin θ − 4 sin³θ is an identity.
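The pitfall described in the note is easy to see numerically. This sketch evaluates the θ-integral over [−π/2, π/2] twice: once with (sin²θ)^(3/2) = |sin θ|³ and once with the incorrect replacement sin³θ. The helper `midpoint` and the variable names are illustrative.

```python
import math


def midpoint(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


lo, hi = -math.pi / 2, math.pi / 2

# Correct: (sin^2 t)^(3/2) is |sin t|^3, which is even in t.
correct = -(1 / 3) * midpoint(lambda t: 8 * (math.sin(t) ** 2) ** 1.5 - 8, lo, hi)

# Incorrect: sin^3 t is odd, so its integral over [-pi/2, pi/2] cancels to zero.
naive = -(8 / 3) * midpoint(lambda t: math.sin(t) ** 3 - 1, lo, hi)

exact = (16 / 3) * (math.pi / 2 - 2 / 3)
print(correct, exact)  # these two agree
print(naive)           # this equals 8*pi/3, which is too large
```

The incorrect version silently drops the contribution of sin³θ, which is why the solution first halves the interval using symmetry.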

3.4 Change of variables formula for double integrals


In one-dimensional calculus we often use a change of variables (a substitution) to
modify an integral of a continuous function f . If we let x = g(t), where g is a one-to-one
continuously differentiable function, then the one-dimensional change of variables theorem
is

\[ \int_a^b f(x)\,dx = \int_c^d f(g(t))\,\frac{dx}{dt}\,dt = \int_c^d f(g(t))\,g'(t)\,dt, \]

where c = g⁻¹(a) and d = g⁻¹(b).


Change of variables in double integrals is analogous to this but more complicated. Intuitively, when one has a one-to-one transformation T : (u, v) → (x, y) with
x = x(u, v) and y = y(u, v), the change of variables in a double integral ∬D f (x, y)dσ
becomes an integration over the region D′uv with the integrand f (x(u, v), y(u, v)). What
happens to the area element dσ = dxdy in terms of dudv? We demonstrate the idea as
follows. Let T(u, v) = ⟨x(u, v), y(u, v)⟩ = x(u, v)i + y(u, v)j.
As shown in Figure 3.10, the area of the image is approximated by the parallelogram whose area is the modulus of a cross product, i. e.,

\[ \left|(T(u + \Delta u, v) - T(u, v)) \times (T(u, v + \Delta v) - T(u, v))\right| \approx \left|T_u(u, v)\,\Delta u \times T_v(u, v)\,\Delta v\right| = \left|T_u(u, v) \times T_v(u, v)\right| \Delta u\,\Delta v. \]

Note that $T_u(u, v) = \langle \frac{\partial x}{\partial u}, \frac{\partial y}{\partial u} \rangle$ and $T_v(u, v) = \langle \frac{\partial x}{\partial v}, \frac{\partial y}{\partial v} \rangle$. Then

\[ \left|T_u(u, v) \times T_v(u, v)\right| \Delta u\,\Delta v = \left|\,\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{vmatrix}\,\right| \Delta u\,\Delta v. \]

Therefore, we have

\[ \Delta x\,\Delta y \approx \left|\frac{\partial(x, y)}{\partial(u, v)}\right| \Delta u\,\Delta v. \]

(a) (b) (c)

Figure 3.10: Change of variables in a double integral, area element, transformation.


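We call the determinant

\[ \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{vmatrix} \]

a Jacobian determinant and denote it by J or $\frac{\partial(x, y)}{\partial(u, v)}$.
The Jacobian determinant is a magnification (or reduction) factor. That is, it relates
the area dxdy of a small region in the xy-plane to the area dudv of the corresponding region
in the uv-plane.
Note that

\[ \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}, \]

so we use either of them and denote them by $\frac{\partial(x, y)}{\partial(u, v)}$.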

Theorem 3.4.1 (Change of variables in a double integral). Let f (x, y) be a continuous function on a
bounded and closed region D ⊂ ℝ², and let the functions x = x(u, v) and y = y(u, v) be a continuously
differentiable (x(u, v) and y(u, v) both have continuous first-order partial derivatives) transformation
(mapping) from a region D′ onto the region D. If the transformation is one-to-one, and ∂(x, y)/∂(u, v) ≠ 0 for
all (u, v) ∈ D′, then

\[ \iint_D f(x, y)\,dx\,dy = \iint_{D'} f(x(u, v), y(u, v)) \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv. \tag{3.2} \]

Example 3.4.1. Evaluate the integral

\[ \iint_D \frac{e^{x - y}}{x + y}\,dx\,dy, \]

where D is the rectangular region bounded by x + y = 1/2, x + y = 1, x − y = −1/2, x − y = 1/2.

Solution. We simplify the problem by using the transformation T⁻¹ : u = x − y and
v = x + y from the xy-plane to the uv-plane. In order to use the change of variables
theorem, we solve these for x and y to find the transformation T from the uv-plane to
the xy-plane, i. e.,

\[ x = \tfrac{1}{2}(u + v) \quad\text{and}\quad y = \tfrac{1}{2}(v - u). \]

The Jacobian of T is

\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix} = \frac{1}{2}. \]

To find the region D′ corresponding to D, we find the transformation of each of the
boundary lines of D (see Figure 3.11),

\[ x + y = \tfrac{1}{2} \to v = \tfrac{1}{2}, \qquad x + y = 1 \to v = 1, \]
\[ x - y = -\tfrac{1}{2} \to u = -\tfrac{1}{2}, \qquad x - y = \tfrac{1}{2} \to u = \tfrac{1}{2}. \]

(a) (b)

Figure 3.11: Change of variables in a double integral, Example 3.4.1.

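Thus, the region D′, shown in Figure 3.11(b), is defined by

\[ D' = \{(u, v) \mid -\tfrac{1}{2} \le u \le \tfrac{1}{2},\ \tfrac{1}{2} \le v \le 1\}. \]

Hence, the change of variables formula gives

\[ \iint_D \frac{e^{x - y}}{x + y}\,dx\,dy = \iint_{D'} \frac{e^u}{v} \left|\,\begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix}\,\right| du\,dv = \iint_{D'} \frac{1}{2}\,\frac{e^u}{v}\,du\,dv \]
\[ = \frac{1}{2}\int_{1/2}^{1} \frac{1}{v}\,dv \int_{-1/2}^{1/2} e^u\,du = \frac{\ln 2}{2}\left(\sqrt{e} - \frac{1}{\sqrt{e}}\right). \]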
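To see the change of variables formula at work, the sketch below compares two approximations of the integral in Example 3.4.1: a midpoint sum over the rectangle D′ in the uv-plane with the Jacobian factor 1/2, and a crude brute-force sum over a grid in the xy-plane that keeps only the grid points lying in D. The grid sizes are arbitrary illustrative choices, and the brute-force value is only accurate to about two or three digits.

```python
import math


def f(x, y):
    return math.exp(x - y) / (x + y)


# Transformed integral over D': -1/2 <= u <= 1/2, 1/2 <= v <= 1, with |J| = 1/2.
n = 400
du, dv = 1.0 / n, 0.5 / n
transformed = 0.0
for i in range(n):
    u = -0.5 + (i + 0.5) * du
    for j in range(n):
        v = 0.5 + (j + 0.5) * dv
        transformed += math.exp(u) / v * 0.5 * du * dv

# Brute force in the xy-plane: D fits inside the box [0, 0.75] x [0, 0.75].
m = 601
h = 0.75 / m
direct = 0.0
for i in range(m):
    x = (i + 0.5) * h
    for j in range(m):
        y = (j + 0.5) * h
        # Keep only the cell centers that lie in D.
        if 0.5 <= x + y <= 1 and -0.5 <= x - y <= 0.5:
            direct += f(x, y) * h * h

exact = math.log(2) / 2 * (math.sqrt(math.e) - 1 / math.sqrt(math.e))
print(transformed, direct, exact)
```

Without the factor 0.5 the transformed sum would be twice too large, which is a quick way to catch a forgotten Jacobian.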

The Jacobian determinant has a nice property for a one-to-one transformation T :
(u, v) → (x, y), T⁻¹ : (x, y) → (u, v), i. e.,

\[ \frac{\partial(x, y)}{\partial(u, v)} \cdot \frac{\partial(u, v)}{\partial(x, y)} = 1. \]

This is similar to the one-variable case where if y = f (x) is a differentiable one-to-one
function with inverse x = ϕ(y), then $\frac{dy}{dx} \cdot \frac{dx}{dy} = 1$.

Example 3.4.2. Find the area of the region bounded by the four curves

y = ax 2 , y = bx 2 , xy = c, and xy = d,

where a, b, c, and d are four constants satisfying 0 < a < b and 0 < c < d.

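Solution. The area is given by ∬D 1dσ, which is hard to compute directly. Instead, we use the
transformation

\[ u = \frac{y}{x^2} \quad\text{and}\quad v = xy, \quad\text{thus } a < u < b \text{ and } c < v < d. \]

To compute ∂(x, y)/∂(u, v), we write

\[ \frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{\frac{\partial(u, v)}{\partial(x, y)}} = \frac{1}{\begin{vmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{vmatrix}} = \frac{1}{\begin{vmatrix} -\frac{2y}{x^3} & \frac{1}{x^2} \\ y & x \end{vmatrix}} \]
\[ = \frac{1}{-\frac{2y}{x^2} - \frac{y}{x^2}} = \frac{-1}{3\,\frac{y}{x^2}} = \frac{-1}{3u}. \]

Therefore,

\[ \iint_D 1\,d\sigma = \iint_{D'} 1 \cdot \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv = \iint_{D'} 1 \cdot \left|-\frac{1}{3u}\right| du\,dv \]
\[ = \iint_{D'} \frac{1}{3u}\,du\,dv = \frac{1}{3}\int_a^b \frac{1}{u}\,du \int_c^d dv = \frac{d - c}{3}\,\ln\frac{b}{a}. \]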

Figure 3.12 shows the graphs.

(a) (b)

Figure 3.12: Change of variables in a double integral, Example 3.4.2.
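The area in Example 3.4.2 can be checked by counting grid cells. The sketch below uses the sample constants a = 1, b = 2, c = 1, d = 2 (hypothetical values chosen only for illustration), sums the areas of the cells whose centers lie in D, and compares the result with the value of (1/3)∫ du/u ∫ dv over a ≤ u ≤ b, c ≤ v ≤ d, namely (d − c)/3 · ln(b/a).

```python
import math

a, b, c, d = 1.0, 2.0, 1.0, 2.0  # hypothetical sample constants

# Since x = (v/u)^(1/3) and y = u^(1/3) * v^(2/3), D lies in this bounding box.
x_lo, x_hi = (c / b) ** (1 / 3), (d / a) ** (1 / 3)
y_lo, y_hi = (a * c * c) ** (1 / 3), (b * d * d) ** (1 / 3)

m = 601
hx, hy = (x_hi - x_lo) / m, (y_hi - y_lo) / m
area = 0.0
for i in range(m):
    x = x_lo + (i + 0.5) * hx
    for j in range(m):
        y = y_lo + (j + 0.5) * hy
        # A cell counts if its center is between the parabolas and the hyperbolas.
        if a * x * x <= y <= b * x * x and c <= x * y <= d:
            area += hx * hy

formula = (d - c) / 3 * math.log(b / a)
print(area, formula)
```

For these sample constants the area is (ln 2)/3, and the cell count agrees with it to about three decimal places.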

The transformation from x- and y-coordinates to polar coordinates (with the same origin and with the initial line of the polar coordinates along the x-axis) is given by

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

Hence,

\[ \left|\frac{\partial(x, y)}{\partial(r, \theta)}\right| = \left|\,\begin{vmatrix} x_r & x_\theta \\ y_r & y_\theta \end{vmatrix}\,\right| = \left|\,\begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix}\,\right| = \left|r\cos^2\theta + r\sin^2\theta\right| = |r| = r. \]

Applying the change of variables theorem to this transformation gives

\[ \iint_D f(x, y)\,dx\,dy = \iint_{D'} f(r\cos\theta, r\sin\theta)\, r\,dr\,d\theta. \]

This agrees with what we have done before for double integrals in polar coordinates.

3.5 Triple integrals


3.5.1 Triple integrals in rectangular coordinates

If we have a solid box which is not a uniform one, that is, at each point (x, y, z) inside
the box the density is a continuous function f (x, y, z), then how do you find its total
mass?

Figure 3.13: Triple integrals: mass of a box.

We now follow a process very similar to that used for double integrals. For a bounded
function of three variables, f (x, y, z) defined on a closed bounded region (a solid) Ω ⊂
ℝ³, we construct a Riemann sum as

\[ R_n = \sum_{i=1}^{n} f(x_i, y_i, z_i)\,\Delta V_i, \]

where ΔVi is the volume of a subregion of Ω, (xi , yi , zi ) is an arbitrarily chosen point in
this subregion, and the nonoverlapping subregions for i = 1, 2, 3, . . . , n cover all of Ω.
If the limit of Rn exists as n → ∞ and as the size of the largest subregion approaches
zero, then we say that f is integrable on Ω and denote this as

\[ \iiint_\Omega f(x, y, z)\,dV = \lim_{\max|\Delta V_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x_i, y_i, z_i)\,\Delta V_i, \]

and ∭Ω f (x, y, z)dV is called the triple integral of f over the region Ω. This means that
the limit must exist and have the same value no matter how the subregions are created
and how (xi , yi , zi ) are chosen. It can be shown that the limit always exists when f (x, y, z)
is continuous on Ω, provided Ω satisfies some fairly mild conditions.
When f (x, y, z) = 1 for all (x, y, z) ∈ Ω, then the triple integral gives the volume
V(Ω) of the region Ω, so

\[ V(\Omega) = \iiint_\Omega 1\,dV. \tag{3.3} \]

If the density function of a solid Ω is ρ(x, y, z) (mass per unit volume) at any point (x, y, z)
of Ω, then the mass M of the solid Ω is

\[ M = \iiint_\Omega \rho(x, y, z)\,dV. \tag{3.4} \]

Triple integrals also have properties such as linearity and additivity, as double inte-
grals do.

Example 3.5.1. Evaluate the triple integral ∭Ω (x cos(yz 2 ) + 5)dV , where

Ω = {(x, y, z)| − 1 ≤ x ≤ 1, −2 ≤ y ≤ 2, −3 ≤ z ≤ 3}.

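Solution. Note that x cos(yz²) is an odd function with respect to x, while Ω is symmetric
with respect to the plane x = 0. Therefore,

\[ \iiint_\Omega (x\cos(yz^2) + 5)\,dV = \iiint_\Omega x\cos(yz^2)\,dV + 5\iiint_\Omega 1\,dV \quad \text{(linearity property)} \]
\[ = 0 + 5\iiint_\Omega 1\,dV \quad \text{(symmetry property)} \]
\[ = 5 \times \text{volume of } \Omega = 5 \times 2 \times 4 \times 6 = 240. \]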

In general, how do you evaluate a triple integral? First of all, we consider the mass
model where Ω is a rectangular box given by

Ω = {(x, y, z)|a1 ≤ x ≤ a2 , b1 ≤ y ≤ b2 , c1 ≤ z ≤ c2 }.

Figure 3.14: Triple integrals: iterated integrals, rectangular base.

The projection of the region Ω onto the xy-plane is the rectangular region
D = {(x, y)|a1 ≤ x ≤ a2 and b1 ≤ y ≤ b2 }, as shown in Figure 3.14. If we
let x and y be fixed, then the integral $\int_{c_1}^{c_2} f(x, y, z)\,dz$ gives the mass of a rod. If we add
the masses of all such rods, then the total mass is given by

\[ \iint_D \left(\int_{c_1}^{c_2} f(x, y, z)\,dz\right) d\sigma. \]

This means

\[ \iiint_\Omega f(x, y, z)\,dV = \iint_D \left(\int_{c_1}^{c_2} f(x, y, z)\,dz\right) d\sigma. \]

We can then evaluate a triple integral by finding a definite integral followed by evaluating a double integral. Recalling what we did for double integrals, this eventually
leads to the iterated integrals

\[ \iiint_\Omega f(x, y, z)\,dV = \int_{a_1}^{a_2} dx \int_{b_1}^{b_2} dy \int_{c_1}^{c_2} f(x, y, z)\,dz. \]

We can write the above equation as

\[ \iiint_\Omega f(x, y, z)\,dV = \int_{a_1}^{a_2} \int_{b_1}^{b_2} \int_{c_1}^{c_2} f(x, y, z)\,dz\,dy\,dx, \quad\text{or}\quad \iiint_\Omega f(x, y, z)\,dV = \iiint_\Omega f(x, y, z)\,dz\,dy\,dx, \]

where dV = dzdydx is interpreted as the volume element in rectangular coordinates.
Of course, we can write dV = dxdydz, dV = dydxdz, etc., but these mean using different orders of the iterated integrals.

Example 3.5.2. A box with dimensions 2 × 4 × 8 (height 8) has a density ρ at each point in the
box. The density ρ is proportional to the product of the distance from the point to the bottom of the
box and the distance from the point to the top surface of the box. The proportionality constant is 3.
Find the total mass of the box.

Solution. Set up a coordinate system with the left-most corner as the origin, as shown
in Figure 3.15. Then the density ρ(x, y, z) = 3z(8 − z). The total mass is given by

\[ \iiint_\Omega \rho(x, y, z)\,dV = \iiint_\Omega 3z(8 - z)\,dV = \int_0^2 \int_0^4 \int_0^8 3z(8 - z)\,dz\,dy\,dx \]
\[ = 3\int_0^2 dx \int_0^4 dy \int_0^8 z(8 - z)\,dz = 3 \times 2 \times 4 \times \int_0^8 (8z - z^2)\,dz = 2048 \text{ units}. \]

Figure 3.15: Triple integrals: Example 3.5.2.

Now we consider the case of a so-called type I region where the region Ω is enclosed
by two smooth surfaces z = z1 (x, y) and z = z2 (x, y), as shown in Figure 3.16. If the
projection of the region onto the xy-plane is D, then in a way similar to the double
integral, we have

\[ \iiint_\Omega f(x, y, z)\,dV = \iint_D \left(\int_{z_1(x, y)}^{z_2(x, y)} f(x, y, z)\,dz\right) d\sigma. \]

Figure 3.16: Triple integrals: general region.

Similarly, we can evaluate a triple integral on a type II or type III region. We summarize
these results in the following definition and Fubini’s theorem.

Definition 3.5.1. A region Ω is of type I if, for each (x, y) ∈ D (a region of the xy-plane), the points of Ω
above (x, y) are those whose z-values lie between two surfaces z = z1 (x, y) and z = z2 (x, y), that is,

z1 (x, y) ≤ z ≤ z2 (x, y) and (x, y) ∈ D.

Then (x, y, z) ∈ Ω and all points of Ω are of this type.


In other words, D is the projection of the region Ω onto the xy-plane and Ω is the set of points

Ω = {(x, y, z) ∈ ℝ3 |(x, y) ∈ D, z1 (x, y) ≤ z ≤ z2 (x, y)} .

Similarly, a type II or type III region is defined to lie between two functions with domain D in the xz- or
yz-plane, respectively.

Theorem 3.5.1 (Iterated integral theorem). Assume that f (x, y, z) is continuous on a type I region Ω of
the form

Ω = {(x, y, z) ∈ ℝ³|(x, y) ∈ Dxy , z1 (x, y) ≤ z ≤ z2 (x, y)}.

Then f is integrable on Ω, and the triple integral can be evaluated as a single-variable integration with respect to z (x and
y are held constant) followed by a double integral over the region Dxy in the xy-plane as

\[ \iiint_\Omega f(x, y, z)\,dV = \iint_{D_{xy}} \left(\int_{z_1(x, y)}^{z_2(x, y)} f(x, y, z)\,dz\right) dx\,dy. \]

Similarly, for a type II region Ω = {(x, y, z) ∈ ℝ³|(x, z) ∈ Dxz , y1 (x, z) ≤ y ≤ y2 (x, z)} we have

\[ \iiint_\Omega f(x, y, z)\,dV = \iint_{D_{xz}} \left(\int_{y_1(x, z)}^{y_2(x, z)} f(x, y, z)\,dy\right) dx\,dz, \]

where Dxz is the projection of Ω onto the xz-plane. For a type III region Ω = {(x, y, z) ∈ ℝ³|(y, z) ∈ Dyz ,
x1 (y, z) ≤ x ≤ x2 (y, z)} we have, when Dyz is the projection of Ω onto the yz-plane,

\[ \iiint_\Omega f(x, y, z)\,dV = \iint_{D_{yz}} \left(\int_{x_1(y, z)}^{x_2(y, z)} f(x, y, z)\,dx\right) dy\,dz. \]

Example 3.5.3. Evaluate ∭Ω xdV , where Ω is the solid tetrahedron bounded by the four planes x = 0,
y = 0, and z = 0, and x + y + z = 1, using the method described above.

Solution. When evaluating a triple integral, it is always helpful to draw two diagrams: one of the
solid region Ω, and one of its projection D onto an appropriate coordinate plane. The diagrams
for this example are shown in Figure 3.17.

(a) (b)

Figure 3.17: Triple integrals: Example 3.5.3.

The lower boundary of the tetrahedron is the plane z = 0, and the upper boundary
in the z-direction is the plane z = 1 − x − y. Note that the planes x + y + z = 1 and
z = 0 intersect in the line x + y = 1 in the xy-plane. Thus, the projection of Ω onto the
xy-plane is the triangular region (see Figure 3.17(b)) bounded by the x-axis, the y-axis,
and x + y = 1.
We can treat Ω as a type I region

Ω = {(x, y, z)|(x, y) ∈ D, 0 ≤ z ≤ 1 − x − y}, where

D = {(x, y)|0 ≤ y ≤ 1 − x, 0 ≤ x ≤ 1}.

Then this enables us to evaluate the integral as follows:

\[ \iiint_\Omega x\,dV = \iint_D \left(\int_0^{1 - x - y} x\,dz\right) dA = \iint_D \left[xz\right]_{z=0}^{z=1-x-y} dy\,dx \]
\[ = \iint_D x(1 - x - y)\,dy\,dx = \int_0^1 \int_0^{1 - x} (x - x^2 - xy)\,dy\,dx \]
\[ = \int_0^1 \left[xy - x^2 y - \frac{xy^2}{2}\right]_{y=0}^{y=1-x} dx \]
\[ = \int_0^1 \left((x - x^2)(1 - x) - \frac{x}{2}(1 - x)^2\right) dx = \frac{1}{24}. \]

Sometimes, we may also evaluate the triple integral by first evaluating a double integral and then evaluating a one-variable integral. For example, if Dz is the projection
(onto the xy-plane) of the cross-section (Dz ) of Ω by a horizontal plane with distance
z units from the xy-plane, and all cross-sections of Ω satisfy c1 ≤ z ≤ c2 , then Ω is
defined by

Ω = {(x, y, z)|c1 ≤ z ≤ c2 and (x, y) ∈ Dz }.

In this case, we have

\[ \iiint_\Omega f(x, y, z)\,dV = \int_{c_1}^{c_2} dz \iint_{D_z} f(x, y, z)\,dx\,dy. \]

In less precise language you can think of the double integral $\iint_{D_z} f(x, y, z)\,dx\,dy$ as the
mass of the lamina (Dz ) when the density per unit volume is f (x, y, z), and then the
integration with respect to z computes the mass of the solid Ω by adding the masses
of all of the laminae. This is illustrated in Figure 3.18.

Figure 3.18: Triple integrals: a double integral first.



Similarly, if Dx is the projection (onto the yz-plane) of the cross-section (Dx ) parallel to
the yz-plane at distance x with a1 ≤ x ≤ a2 , and Dy is the projection (onto the xz-plane)
of the cross-section (Dy ) parallel to the xz-plane at distance y with b1 ≤ y ≤ b2 , then
we also have

\[ \iiint_\Omega f(x, y, z)\,dV = \int_{a_1}^{a_2} dx \iint_{D_x} f(x, y, z)\,d\sigma, \]
\[ \iiint_\Omega f(x, y, z)\,dV = \int_{b_1}^{b_2} dy \iint_{D_y} f(x, y, z)\,d\sigma. \]

Example 3.5.4. Find ∭Ω xdV , where Ω is the same region in Example 3.5.3, by evaluating a double
integral first.

Solution. We evaluate the integral ∭Ω x dV by first evaluating a double integral over

\[ D_z = \{(x, y) \mid 0 \le y \le 1 - x - z,\ 0 \le x \le 1 - z\}. \]

This is the projection onto the xy-plane of the triangular cross-section of Ω at height z.
It is bounded by x + y = 1 − z, x = 0, and y = 0, as shown in Figure 3.19.
We have

\[ \iiint_\Omega x\,dV = \int_0^1 dz \iint_{D_z} x\,d\sigma = \int_0^1 dz \int_0^{1 - z} \int_0^{1 - x - z} x\,dy\,dx \]
\[ = \int_0^1 \int_0^{1 - z} x(1 - x - z)\,dx\,dz = \int_0^1 \left(-\frac{1}{6}(z - 1)^3\right) dz = \frac{1}{24}. \]

(a) (b)

Figure 3.19: Triple integrals: Example 3.5.4.
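Examples 3.5.3 and 3.5.4 integrate the same function over the same tetrahedron in two different ways. The sketch below mirrors both set-ups with a nested midpoint rule and compares the results with 1/24; the helper `midpoint` and the choice of 60 subintervals per direction are illustrative assumptions.

```python
def midpoint(f, a, b, n=60):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Example 3.5.3: z innermost (0 <= z <= 1 - x - y), then y, then x.
z_first = midpoint(
    lambda x: midpoint(
        lambda y: midpoint(lambda z: x, 0, 1 - x - y), 0, 1 - x
    ),
    0, 1,
)

# Example 3.5.4: for each height z, a double integral of x over the triangle D_z.
slices = midpoint(
    lambda z: midpoint(
        lambda x: midpoint(lambda y: x, 0, 1 - x - z), 0, 1 - z
    ),
    0, 1,
)

print(z_first, slices, 1 / 24)
```

The only difference between the two computations is which variable is integrated last, which matches the two orders used in the examples.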



Example 3.5.5. Attempt to evaluate the following integral by first considering E to be a type I region,
then a type II region, and then a third method using a double integral as the inner integral:

\[ \iiint_E \sqrt{x^2 + z^2}\,dV, \]

where E is the region bounded by the paraboloid y = x² + z² and the plane y = 1.

Solution. We first sketch the region of E, as shown in Figure 3.20.

Figure 3.20: Triple integrals: Example 3.5.5.

Method 1: We regard the solid E as a type I region, as shown in Figure 3.21(a).
The projection of E onto the xy-plane is the parabolic region x² ≤ y ≤ 1. In order
to find the upper and lower bounding z-value functions, solve y = x² + z² to obtain
z = ±√(y − x²). Hence, the lower boundary surface of E is z = −√(y − x²) and the upper
surface is z = √(y − x²). Therefore, the description of E as a type I region is

\[ E = \{(x, y, z) \mid -1 \le x \le 1,\ x^2 \le y \le 1,\ -\sqrt{y - x^2} \le z \le \sqrt{y - x^2}\}, \]

(a) (b) (c)

Figure 3.21: Triple integrals: Example 3.5.5.


so we obtain

\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_{-1}^{1} \int_{x^2}^{1} \left[\int_{-\sqrt{y - x^2}}^{\sqrt{y - x^2}} \sqrt{x^2 + z^2}\,dz\right] dy\,dx. \]

Although this expression is correct, it is extremely difficult to evaluate. So we try a
second method.
Method 2: Let us regard E as a type II region. As such, the projection of E onto
the xz-plane is the disk Dxz , x² + z² ≤ 1. Then the left boundary of E is the paraboloid
y = x² + z² and the right boundary is the plane y = 1 (see Figure 3.21(b)), so we have

\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \iint_{D_{xz}} \left[\int_{x^2 + z^2}^{1} \sqrt{x^2 + z^2}\,dy\right] d\sigma = \iint_{D_{xz}} \left[y\sqrt{x^2 + z^2}\right]_{y = x^2 + z^2}^{y = 1} d\sigma \]
\[ = \iint_{D_{xz}} (1 - x^2 - z^2)\sqrt{x^2 + z^2}\,d\sigma. \]

Since the domain Dxz is a circular disk, it is easier to convert this to polar coordinates
in the xz-plane, using the substitution x = r cos θ, z = r sin θ; Dxz is now given by
0 ≤ θ ≤ 2π and 0 ≤ r ≤ 1, which gives

\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \iint_{D_{xz}} (1 - x^2 - z^2)\sqrt{x^2 + z^2}\,d\sigma \]
\[ = \int_0^{2\pi} \int_0^1 (1 - r^2)\, r \cdot r\,dr\,d\theta = \int_0^{2\pi} d\theta \int_0^1 (r^2 - r^4)\,dr = 2\pi\left[\frac{r^3}{3} - \frac{r^5}{5}\right]_0^1 = \frac{4\pi}{15}. \]

Method 3: Now we consider computing a double integral first in the xz-plane. The
cross-section of E by the vertical plane passing through (0, y, 0) and perpendicular to
the y-axis is the circular disk Dy : x² + z² ≤ y with center at (0, y, 0) and radius √y (see
Figure 3.21(c)). The triple integral becomes

\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_0^1 dy \iint_{D_y} \sqrt{x^2 + z^2}\,d\sigma. \]

Converting to polar coordinates in the xz-plane (x = r cos θ, z = r sin θ) and computing
the double integral first over the region Dy given by r ≤ √y and 0 ≤ θ ≤ 2π gives

\[ \iint_{D_y} \sqrt{x^2 + z^2}\,dA = \int_0^{2\pi} d\theta \int_0^{\sqrt{y}} r \cdot r\,dr = \frac{2\pi y^{3/2}}{3}. \]

Hence,

\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_0^1 dy \iint_{D_y} \sqrt{x^2 + z^2}\,d\sigma = \int_0^1 \frac{2\pi y^{3/2}}{3}\,dy = \frac{4\pi}{15}. \]
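Methods 2 and 3 can be cross-checked numerically. In the sketch below, the Method 2 double integral over the disk Dxz is approximated directly in Cartesian form (so the check does not rely on the polar conversion), and the Method 3 outer integral is approximated from the slice value 2πy^(3/2)/3. The helper `midpoint` is an illustrative name.

```python
import math


def midpoint(f, a, b, n=300):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Method 2: integrate (1 - x^2 - z^2) * sqrt(x^2 + z^2) over the unit disk x^2 + z^2 <= 1.
def inner(x):
    s = math.sqrt(1 - x * x)
    return midpoint(
        lambda z: (1 - x * x - z * z) * math.sqrt(x * x + z * z), -s, s
    )


method2 = midpoint(inner, -1, 1)

# Method 3: integrate the slice value 2*pi*y^(3/2)/3 over 0 <= y <= 1.
method3 = midpoint(lambda y: 2 * math.pi * y ** 1.5 / 3, 0, 1)

exact = 4 * math.pi / 15
print(method2, method3, exact)
```

Both approximations land close to 4π/15, which supports the claim that the choice of method affects the difficulty of the computation but not the value.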

3.5.2 Cylindrical and spherical coordinates

Cylindrical coordinates
The cylindrical coordinates of a point P are (r, θ, z), as shown in Figure 3.22(a).

(a) (b) (c)

Figure 3.22: Triple integrals: cylindrical coordinates.

The coordinates r and θ are the polar coordinates of the projection of P onto the
xy-plane, and z is the directed distance from the xy-plane to P (the usual z-coordinate).
In cylindrical coordinates, the surfaces analogous to coordinate planes in Carte-
sian coordinates are as follows. If k, l, and m are constants:
r = k is a cylinder whose axis is the z-axis,
θ = l is a half-plane whose edge is the z-axis and its angle with the xz-plane is
θ = l,
z = m is a horizontal plane with distance m from the xy-plane.
The equations relating Cartesian coordinates and cylindrical coordinates of a
point are
x = r cos θ, y = r sin θ, and z = z, (3.5)
where 0 ≤ r < +∞, 0 ≤ θ ≤ 2π, and −∞ < z < ∞. The volume element in cylindrical
coordinates is rdrdθdz (see Figure 3.22(b)).

Clearly, a transformation from Cartesian coordinates to cylindrical coordinates
could also be defined with the polar coordinates in the yz-plane or in the xz-plane.
Integration is often easier in cylindrical coordinates when the region of integration has a cylindrical (not necessarily circular) form with axis parallel to the x-, y-, or
z-axis.

Example 3.5.6. A solid E lies within the cylinder x 2 + y 2 = 1 below the plane z = 2 and above the
paraboloid z = 1 − x 2 − y 2 . The density (mass per unit volume) at any point (x, y, z) is ρ(x, y, z) =
z √x 2 + y 2 . Find the mass of E.

Solution. In cylindrical coordinates, the cylinder has equation r = 1, the upper
boundary is unchanged (z = 2), and the paraboloid has equation z = 1 − r². So, the
region in r-, θ-, and z-coordinates is written

\[ E' = \{(r, \theta, z) \mid 0 \le \theta \le 2\pi,\ 0 \le r \le 1,\ 1 - r^2 \le z \le 2\}. \]

The density function in cylindrical coordinates is ρ(x, y, z) = zr, and, therefore, the
mass M of E is

\[ M = \iiint_E z\sqrt{x^2 + y^2}\,dV = \iiint_{E'} (zr)\, r\,dz\,dr\,d\theta = \int_0^{2\pi} \int_0^1 \int_{1 - r^2}^{2} (rz)\, r\,dz\,dr\,d\theta \]
\[ = \int_0^{2\pi} d\theta \int_0^1 r^2\,dr \int_{1 - r^2}^{2} z\,dz = 2\pi \int_0^1 r^2 \left[\frac{1}{2}z^2\right]_{1 - r^2}^{2} dr \]
\[ = \pi \int_0^1 r^2 \left(4 - (1 - r^2)^2\right) dr = \frac{44\pi}{35}. \]

The solid E is shown in Figure 3.23.
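The mass in Example 3.5.6 can be approximated directly from the cylindrical description of E. In this sketch the θ-integral is replaced by the factor 2π because the integrand does not depend on θ; the function names `midpoint` and `density` are illustrative.

```python
import math


def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def density(r, z):
    # rho = z * sqrt(x^2 + y^2) written in cylindrical coordinates.
    return z * r


# Volume element r dz dr dtheta; limits 1 - r^2 <= z <= 2 and 0 <= r <= 1.
mass = 2 * math.pi * midpoint(
    lambda r: midpoint(lambda z: density(r, z) * r, 1 - r * r, 2),
    0, 1, n=2000,
)

print(mass, 44 * math.pi / 35)
```

The integrand carries two factors of r: one from the density and one from the volume element.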

Example 3.5.7. Evaluate $\int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{\sqrt{x^2 + y^2}}^{1} z\,dz\,dy\,dx$.

Solution. This iterated integral is a triple integral of f (x, y, z) = z over the solid region
Ω, i. e.,

\[ \Omega = \{(x, y, z) \mid -1 \le x \le 1,\ -\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2},\ \sqrt{x^2 + y^2} \le z \le 1\}. \]

The projection of Ω onto the xy-plane is the disk x² + y² ≤ 1, the lower surface of Ω is
the cone z = √(x² + y²), and the upper surface is the plane z = 1. This region has a much
simpler description in cylindrical coordinates, i. e.,

\[ \Omega = \{(r, \theta, z) \mid 0 \le \theta \le 2\pi,\ 0 \le r \le 1,\ r \le z \le 1\}. \]

Hence, converting the triple integral to cylindrical coordinates gives

\[ \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{\sqrt{x^2 + y^2}}^{1} z\,dz\,dy\,dx = \iiint_\Omega z\,dV = \int_0^{2\pi} \int_0^1 \int_r^1 z\, r\,dz\,dr\,d\theta \]
\[ = \int_0^{2\pi} d\theta \int_0^1 r\,dr \int_r^1 z\,dz = \pi \int_0^1 r(1 - r^2)\,dr = \frac{\pi}{4}. \]

Figure 3.23: Triple integrals: cylindrical coordinates, Example 3.5.6.

Spherical coordinates
The spherical coordinates (ρ, θ, ϕ) of a point P are usually defined as in Figure 3.24(a).
The coordinate ρ = |OP| is the distance from the origin to P, θ is the same angle as
in cylindrical coordinates, and ϕ is the angle between the positive z-axis and the line
segment OP. Thus, all points in space have unique spherical coordinates (ρ, θ, ϕ), pro-
vided ρ, θ, ϕ are restricted by ρ ≥ 0, 0 ≤ θ ≤ 2π, and 0 ≤ ϕ ≤ π. The spherical coordi-
nate system is especially useful in problems where the formula of the function being
integrated contains the quantity x2 +y2 +z 2 or where the domain has a spherical nature
with center at the origin. In spherical coordinates, the surfaces analogous to the coor-
dinate planes in Cartesian coordinates are as follows. If k, l, and m are any constants:
ρ = k is a sphere with center the origin and radius k,
θ = l is a half-plane whose edge is the z-axis and angle with the xz-plane is θ = l,
ϕ = m is a half-cone making an angle m with the positive z-axis.
The equations relating spherical and rectangular coordinates of a point are

x = ρ sin ϕ cos θ, y = ρ sin ϕ sin θ, and z = ρ cos ϕ. (3.6)

The volume element in spherical coordinates is ρ² sin ϕ dρ dθ dϕ, as shown in Figure 3.24(b).

(a) (b) (c)

Figure 3.24: Triple integrals: spherical coordinates.

Example 3.5.8. Evaluate $\iiint_E e^{(x^2 + y^2 + z^2)^{3/2}}\,dV$, where E is the top half of the unit ball

E = {(x, y, z)|x² + y² + z² ≤ 1 and z ≥ 0}.

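Solution. Since the boundary of E is part of a sphere, it is wise to try spherical coordinates. The region corresponding to E in spherical coordinates is

\[ E' = \{(\rho, \theta, \phi) \mid 0 \le \rho \le 1,\ 0 \le \theta \le 2\pi,\ 0 \le \phi \le \tfrac{\pi}{2}\}. \]

In addition, spherical coordinates give x² + y² + z² = ρ² and this simplifies the integrand.
Thus,

\[ \iiint_E e^{(x^2 + y^2 + z^2)^{3/2}}\,dV = \iiint_{E'} e^{(\rho^2)^{3/2}} \rho^2 \sin\phi\,d\rho\,d\theta\,d\phi = \int_0^{2\pi} \int_0^{\pi/2} \int_0^1 e^{(\rho^2)^{3/2}} \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \]
\[ = \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\phi\,d\phi \int_0^1 e^{\rho^3} \rho^2\,d\rho = 2\pi \cdot (-\cos\phi)\Big|_0^{\pi/2} \cdot \left(\frac{1}{3}e^{\rho^3}\right)\Big|_0^1 = \frac{2}{3}\pi(e - 1). \]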

Example 3.5.9. Find the volume of the solid Ω enclosed by the sphere x 2 + y 2 + (z − a)2 = a2 and inside
the half-cone z = √3x 2 + 3y 2 .

Figure 3.25: Triple integrals: spherical coordinates, Example 3.5.9.

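Solution. The solid is shown in Figure 3.25. In spherical coordinates, the boundary
surfaces become, after simplification,

\[ \rho = 2a\cos\phi \quad\text{and}\quad \phi = \frac{\pi}{6}. \]

So, the solid Ω is defined by the region Ω′ : 0 ≤ ρ ≤ 2a cos ϕ, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π/6 in
spherical coordinates. Therefore,

\[ V = \iiint_\Omega dx\,dy\,dz = \iiint_{\Omega'} \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta = \int_0^{2\pi} d\theta \int_0^{\pi/6} \sin\phi\,d\phi \int_0^{2a\cos\phi} \rho^2\,d\rho \]
\[ = 2\pi \int_0^{\pi/6} \sin\phi \cdot \left[\frac{\rho^3}{3}\right]_0^{2a\cos\phi} d\phi = \frac{16\pi a^3}{3} \int_0^{\pi/6} \cos^3\phi\,\sin\phi\,d\phi \]
\[ = \frac{16\pi a^3}{3}\left(-\frac{1}{4}\cos^4\phi\right)\Big|_0^{\pi/6} = \frac{7}{12}\pi a^3 \text{ cubic units}. \]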
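The spherical set-up in Example 3.5.9 translates directly into a nested numerical integration. The sketch below uses the sample value a = 1 (a hypothetical choice for illustration) and replaces the θ-integral by the factor 2π, since nothing in the integrand depends on θ.

```python
import math


def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a = 1.0  # hypothetical sample radius


def rho_integral(phi):
    # Inner integral of rho^2 * sin(phi) for 0 <= rho <= 2a*cos(phi).
    return midpoint(
        lambda rho: rho * rho * math.sin(phi), 0, 2 * a * math.cos(phi)
    )


volume = 2 * math.pi * midpoint(rho_integral, 0, math.pi / 6)
exact = 7 * math.pi * a ** 3 / 12
print(volume, exact)
```

The upper limit of the ρ-integral depends on ϕ, so the ρ-integration has to be done before the ϕ-integration.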

3.6 Change of variables in triple integrals


The change of variables formula for triple integrals is similar to that for double integrals. Let T be a continuously differentiable one-to-one transformation that maps a
region Ω′ in uvw-space to a region Ω in the xyz-space with equations

x = x(u, v, w), y = y(u, v, w), and z = z(u, v, w).

The Jacobian of T is the following 3 × 3 determinant:

\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{vmatrix}. \]

We have the following change of variables formula for triple integrals:

\[ \iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega'} f(x(u, v, w), y(u, v, w), z(u, v, w)) \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw. \tag{3.7} \]

Example 3.6.1. Evaluate ∭Ω |xy|dV , where Ω is the solid bounded by the ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$.

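Solution. Taking a, b, c > 0, we use the substitution x = au, y = bv, z = cw and compute its Jacobian

\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{vmatrix} = \begin{vmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{vmatrix} = abc. \]

Under this substitution the ellipsoid becomes the unit sphere u² + v² + w² = 1, so Ω′ is the unit
ball u² + v² + w² ≤ 1. Since |uv| takes the same values in all eight octants, we may integrate
over the first octant and multiply by 8, and then use spherical coordinates in uvw-space:

\[ \iiint_\Omega |xy|\,dV = \iiint_{\Omega'_{uvw}} |au \cdot bv|\,abc\,du\,dv\,dw = a^2 b^2 c \iiint_{u^2 + v^2 + w^2 \le 1} |uv|\,du\,dv\,dw \]
\[ = a^2 b^2 c \times 8 \iiint_{u^2 + v^2 + w^2 \le 1,\ u \ge 0,\ v \ge 0,\ w \ge 0} uv\,du\,dv\,dw \]
\[ = 8a^2 b^2 c \int_0^{\pi/2} d\theta \int_0^{\pi/2} d\phi \int_0^1 \rho\sin\phi\cos\theta \cdot \rho\sin\phi\sin\theta \cdot \rho^2 \sin\phi\,d\rho \]
\[ = 8a^2 b^2 c \int_0^{\pi/2} \cos\theta\sin\theta\,d\theta \int_0^{\pi/2} \sin^3\phi\,d\phi \int_0^1 \rho^4\,d\rho = \frac{8a^2 b^2 c}{15}. \]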
Cylindrical and spherical coordinates are special transformations in a triple integral. The Jacobian of the transformation from Cartesian to cylindrical coordinates
is

\[ \frac{\partial(x, y, z)}{\partial(r, \theta, z)} = \begin{vmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} & \frac{\partial x}{\partial z} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} & \frac{\partial y}{\partial z} \\ \frac{\partial z}{\partial r} & \frac{\partial z}{\partial \theta} & \frac{\partial z}{\partial z} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} = r. \]

Hence, the absolute value of this Jacobian (used in the change of variables formula) is

|r| = r, since r ≥ 0.

Therefore, the change of variables in a triple integral from a region Ω in Cartesian to a
region Ω′ in cylindrical coordinates is

\[ \iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega'} f(r\cos\theta, r\sin\theta, z)\, r\,dr\,d\theta\,dz, \]

thus giving us the formula for triple integration in cylindrical coordinates.


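We can now compute the Jacobian of the transformation from Cartesian coordinates to spherical coordinates as follows:

$$
\frac{\partial(x, y, z)}{\partial(\rho, \theta, \phi)} =
\begin{vmatrix}
\frac{\partial x}{\partial \rho} & \frac{\partial x}{\partial \theta} & \frac{\partial x}{\partial \phi} \\
\frac{\partial y}{\partial \rho} & \frac{\partial y}{\partial \theta} & \frac{\partial y}{\partial \phi} \\
\frac{\partial z}{\partial \rho} & \frac{\partial z}{\partial \theta} & \frac{\partial z}{\partial \phi}
\end{vmatrix}
=
\begin{vmatrix}
\sin\phi\cos\theta & -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\
\sin\phi\sin\theta & \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \\
\cos\phi & 0 & -\rho\sin\phi
\end{vmatrix}
= -\rho^2 \sin\phi.
$$

Since 0 ≤ ϕ ≤ π, we have sin ϕ ≥ 0, and, therefore, the absolute value of the Jacobian (used in the change of variables formula) is

$$
\left|\frac{\partial(x, y, z)}{\partial(\rho, \theta, \phi)}\right| = \left|-\rho^2 \sin\phi\right| = \rho^2 \sin\phi.
$$

So, the change of variables from Cartesian to spherical coordinates makes the following changes in a triple integral:

$$
\iiint_{\Omega} f(x, y, z)\,dV = \iiint_{\Omega'} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2 \sin\phi \,d\rho\,d\theta\,d\phi.
$$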
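The sign and size of the spherical Jacobian can be sanity-checked with finite differences. This is a minimal sketch under stated assumptions: the function names, the sample point, and the step size `h` are illustrative choices, and the columns are ordered (ρ, θ, ϕ) as in the determinant above.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """Map (rho, theta, phi) to (x, y, z), with phi measured from the z-axis."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def jacobian_det(rho, theta, phi, h=1e-6):
    """Central-difference estimate of d(x, y, z)/d(rho, theta, phi)."""
    p = [rho, theta, phi]
    cols = []
    for j in range(3):
        plus, minus = p[:], p[:]
        plus[j] += h
        minus[j] -= h
        fp = spherical_to_cartesian(*plus)
        fm = spherical_to_cartesian(*minus)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # m[i][j] is the partial derivative of coordinate i with respect to parameter j.
    m = [[cols[j][i] for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Arbitrary sample point with 0 < phi < pi.
rho, theta, phi = 2.0, 0.7, 1.1
numeric = jacobian_det(rho, theta, phi)
exact = -rho**2 * math.sin(phi)
print(numeric, exact)
```

Swapping the θ and ϕ columns would flip the sign of the determinant, which is why only the absolute value enters the integral.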

3.7 Other applications of multiple integrals


3.7.1 Surface area

Parameterized surfaces
When using a graphing calculator to sketch a sphere, you may notice that the calculator does a poor job of sketching functions such as $z = \sqrt{a^2 - x^2 - y^2}$. However, if you use the parametric form of the sphere, x = a sin u cos v, y = a sin u sin v, and z = a cos u, the graphing calculator does a much better job. In fact, a parameterization of a surface can be written as a vector-valued function

$$
\mathbf{r}(u, v) = x(u, v)\mathbf{i} + y(u, v)\mathbf{j} + z(u, v)\mathbf{k} \quad\text{or}\quad \mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v)\rangle,
$$

where u and v are two independent variables (parameters).



Example 3.7.1. Find parametric descriptions for the following surfaces:

(1) $x^2 + y^2 = a^2$, (2) $z = a\sqrt{x^2 + y^2}$, (3) $z = x^2 + 2y^2$.

Solution. There are many ways to parameterize a surface.

(1) One way to parameterize the cylinder is to set z = v, x = a cos u, and y = a sin u. We can also write it as a vector-valued function,

$$
\mathbf{r}(u, v) = \langle a\cos u, a\sin u, v\rangle, \quad 0 \le u \le 2\pi,\ -\infty < v < \infty.
$$

Note that $\mathbf{r}(u, v) = \langle a\cos(u^2), a\sin(u^2), v^3\rangle$ is also a parametric description of the same cylinder.

(2) A parametric description of the circular cone is

$$
\mathbf{r}(u, v) = \left\langle \frac{v}{a}\cos u, \frac{v}{a}\sin u, v\right\rangle, \quad 0 \le u \le 2\pi,\ v \ge 0.
$$

(3) One parametric description is

$$
\mathbf{r}(u, v) = \langle u, v, u^2 + 2v^2\rangle, \quad -\infty < u, v < \infty.
$$

Also,

$$
\mathbf{r}(u, v) = \left\langle \sqrt{v}\cos u, \sqrt{\frac{v}{2}}\sin u, v\right\rangle, \quad 0 \le u \le 2\pi,\ v \ge 0,
$$

is a parametric description.
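A quick way to test a proposed parameterization is to substitute sample parameter values into the surface equation. The sketch below does this for three of the descriptions in Example 3.7.1; the function names and the sample value a = 2 are illustrative assumptions, and a > 0 is assumed for the cone.

```python
import math

def cylinder(u, v, a):
    """Parameterization of x^2 + y^2 = a^2."""
    return (a * math.cos(u), a * math.sin(u), v)

def cone(u, v, a):
    """Parameterization of z = a*sqrt(x^2 + y^2), assuming a > 0 and v >= 0."""
    return (v / a * math.cos(u), v / a * math.sin(u), v)

def paraboloid(u, v):
    """Second parameterization of z = x^2 + 2y^2, for v >= 0."""
    return (math.sqrt(v) * math.cos(u), math.sqrt(v / 2) * math.sin(u), v)

def max_residual(a=2.0):
    """Largest violation of the three surface equations over a small sample grid."""
    worst = 0.0
    for i in range(8):
        u = 2 * math.pi * i / 8
        for v in (0.0, 0.5, 1.0, 3.0):
            x, y, z = cylinder(u, v, a)
            worst = max(worst, abs(x * x + y * y - a * a))
            x, y, z = cone(u, v, a)
            worst = max(worst, abs(z - a * math.sqrt(x * x + y * y)))
            x, y, z = paraboloid(u, v)
            worst = max(worst, abs(z - (x * x + 2 * y * y)))
    return worst

print(max_residual())
```

A residual at the level of floating-point rounding indicates that every sampled point lies on the intended surface; it does not by itself show that the whole surface is covered.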

Surface area
We now apply double integrals to the problem of computing the area of a surface S defined by $\mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v)\rangle$, where x = x(u, v), y = y(u, v), and z = z(u, v) all have continuous partial derivatives at (u, v) ∈ D. A special case is a surface with an explicit equation z = f(x, y), for which a parametric description is

$$
\mathbf{r}(u, v) = \langle u, v, f(u, v)\rangle \quad\text{or}\quad \mathbf{r}(x, y) = \langle x, y, f(x, y)\rangle.
$$

To find the “surface area element dS,” we can use the tangent plane approximation, as shown in Figure 3.26, where the area element on the tangent plane is given by

$$
|\mathbf{r}_u \Delta u \times \mathbf{r}_v \Delta v| = |\mathbf{r}_u \times \mathbf{r}_v|\,\Delta u\,\Delta v.
$$

Therefore, adding up these elements, we will have

$$
\text{surface area } S = \iint_{D_{uv}} |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv. \tag{3.8}
$$

Figure 3.27 shows more general cases.



Figure 3.26: Surface area: tangent plane approximation.

(a) (b) (c)

Figure 3.27: Surface area: tangent plane approximation, general case.

Example 3.7.2. Find the surface area of a ball with radius R.

Solution. The surface of the ball can be described by

$$
\mathbf{r}(u, v) = \langle R\sin u\cos v, R\sin u\sin v, R\cos u\rangle, \quad 0 \le v \le 2\pi,\ 0 \le u \le \pi,
$$

where u and v are actually ϕ and θ in spherical coordinates. Since

$$
\mathbf{r}_u \times \mathbf{r}_v =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
x_u & y_u & z_u \\
x_v & y_v & z_v
\end{vmatrix}
=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
R\cos u\cos v & R\cos u\sin v & -R\sin u \\
-R\sin u\sin v & R\sin u\cos v & 0
\end{vmatrix}
= R^2\sin^2 u\cos v\,\mathbf{i} + R^2\sin^2 u\sin v\,\mathbf{j} + R^2\sin u\cos u\,\mathbf{k},
$$

it follows that

$$
|\mathbf{r}_u \times \mathbf{r}_v| = \sqrt{(R^2\sin^2 u\cos v)^2 + (R^2\sin^2 u\sin v)^2 + (R^2\sin u\cos u)^2} = R^2\sin u,
$$

where we used sin u ≥ 0 for 0 ≤ u ≤ π. Thus,

$$
\text{surface area } S = \iint_{D_{uv}} |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv = \iint_{D_{uv}} R^2\sin u\,du\,dv = \int_0^{2\pi} dv \int_0^{\pi} R^2\sin u\,du = 4\pi R^2.
$$
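Formula (3.8) also lends itself to direct numerical evaluation. The following is a rough sketch, not the book's method: it estimates the partial derivatives by central differences and sums |r_u × r_v| ΔuΔv over a midpoint grid. All function names, the grid size, and the sample radius are illustrative assumptions.

```python
import math

def sphere(u, v, R):
    """Spherical parameterization from Example 3.7.2."""
    return (R * math.sin(u) * math.cos(v), R * math.sin(u) * math.sin(v), R * math.cos(u))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_element(r, u, v, h=1e-5):
    """|r_u x r_v| at (u, v), with partial derivatives from central differences."""
    ru = [(p - q) / (2 * h) for p, q in zip(r(u + h, v), r(u - h, v))]
    rv = [(p - q) / (2 * h) for p, q in zip(r(u, v + h), r(u, v - h))]
    return math.sqrt(sum(c * c for c in cross(ru, rv)))

def surface_area(r, u_lo, u_hi, v_lo, v_hi, n=100):
    """Midpoint-rule approximation of the double integral in (3.8)."""
    du = (u_hi - u_lo) / n
    dv = (v_hi - v_lo) / n
    total = 0.0
    for i in range(n):
        u = u_lo + (i + 0.5) * du
        for j in range(n):
            v = v_lo + (j + 0.5) * dv
            total += area_element(r, u, v) * du * dv
    return total

R = 1.5  # sample radius, chosen only for illustration
numeric = surface_area(lambda u, v: sphere(u, v, R), 0.0, math.pi, 0.0, 2 * math.pi)
exact = 4 * math.pi * R**2
print(numeric, exact)
```

Because the routine only needs a callable r(u, v), the same sketch can be pointed at other parameterized surfaces, provided the parameterization is smooth on the chosen rectangle.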

Surface area when the surface is given by an explicit equation z = f (x, y)

In this case, we let x = x, y = y, and z = z(x, y). Then $\mathbf{r}(x, y) = \langle x, y, z(x, y)\rangle$ is a parametric description, and $\mathbf{r}_x = \langle 1, 0, z_x\rangle$, $\mathbf{r}_y = \langle 0, 1, z_y\rangle$, and

$$
\mathbf{r}_x \times \mathbf{r}_y =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & z_x \\
0 & 1 & z_y
\end{vmatrix}
= -z_x\mathbf{i} - z_y\mathbf{j} + \mathbf{k},
$$

so, in rectangular coordinates,

$$
\text{surface area } S = \iint_{D_{xy}} |\mathbf{r}_x \times \mathbf{r}_y|\,d\sigma = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,d\sigma. \tag{3.9}
$$

We use this formula to compute the surface area of a ball with radius R again in the following example.

Example 3.7.3. Find the surface area of the sphere with radius R.

Solution. By symmetry, we compute the area of the top half of the sphere and double it. To make the calculation easier, assume that the center of the sphere is at the origin, so that the equation of the top half of the sphere is

$$
z = f(x, y) = \sqrt{R^2 - x^2 - y^2},
$$

and hence

$$
z_x = \frac{\partial z}{\partial x} = \frac{-x}{\sqrt{R^2 - x^2 - y^2}} \quad\text{and}\quad z_y = \frac{\partial z}{\partial y} = \frac{-y}{\sqrt{R^2 - x^2 - y^2}}.
$$

The projection $D_{xy}$ of this surface onto the xy-plane is the disk $x^2 + y^2 \le R^2$, z = 0.

Applying the surface area formula, equation (3.9), we obtain

$$
\begin{aligned}
S &= 2\iint_{D_{xy}} \sqrt{z_x^2 + z_y^2 + 1}\,d\sigma
= 2\iint_{D_{xy}} \sqrt{\left(\frac{-x}{\sqrt{R^2 - x^2 - y^2}}\right)^2 + \left(\frac{-y}{\sqrt{R^2 - x^2 - y^2}}\right)^2 + 1}\,d\sigma \\
&= 2\iint_{D_{xy}} \frac{R}{\sqrt{R^2 - x^2 - y^2}}\,d\sigma.
\end{aligned}
$$

The integrand (the function being integrated) is unbounded on the region $D_{xy}$: $x^2 + y^2 \le R^2$, so we consider $D'_{xy}$: $x^2 + y^2 \le b^2$, 0 < b < R, and then let b → R. Converting to polar coordinates, we obtain

$$
\begin{aligned}
S &= \lim_{b \to R} 2\int_0^{2\pi} d\theta \int_0^b \frac{R}{\sqrt{R^2 - r^2}}\,r\,dr
= \lim_{b \to R} 4\pi R \int_0^b \frac{r}{\sqrt{R^2 - r^2}}\,dr \\
&= \lim_{b \to R} 4\pi R\left[-\sqrt{R^2 - r^2}\right]_0^b
= 4\pi R \lim_{b \to R}\left(R - \sqrt{R^2 - b^2}\right) = 4\pi R^2.
\end{aligned}
$$

Example 3.7.4. Find the area of the part of the paraboloid $z = x^2 + y^2$ that lies under the plane z = 2.

Solution. The plane z = 2 intersects the paraboloid in the circle $x^2 + y^2 = 2$, z = 2. Therefore, the given surface lies above the disk D: $x^2 + y^2 \le 2$ with center the origin and radius $\sqrt{2}$. So, the required area A is given by

$$
\begin{aligned}
A &= \iint_D \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,dx\,dy
= \iint_D \sqrt{1 + (2x)^2 + (2y)^2}\,dx\,dy \\
&= \iint_D \sqrt{1 + 4(x^2 + y^2)}\,dx\,dy.
\end{aligned}
$$

Converting to polar coordinates in order to simplify this integration, we obtain

$$
A = \int_0^{2\pi}\int_0^{\sqrt{2}} \sqrt{1 + 4r^2}\,r\,dr\,d\theta
= 2\pi \cdot \frac{1}{8} \cdot \frac{2}{3}\left[(1 + 4r^2)^{3/2}\right]_0^{\sqrt{2}}
= \frac{13\pi}{3}.
$$
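As a cross-check on Example 3.7.4, the double integral from formula (3.9) can be approximated on a polar midpoint grid. This is only a sketch: the function name and the grid sizes are illustrative assumptions, and the partial derivatives z_x = 2x and z_y = 2y are hard-coded for this paraboloid.

```python
import math

def paraboloid_area(z_max=2.0, n_r=400, n_theta=64):
    """Area of z = x^2 + y^2 below z = z_max, summing sqrt(1 + zx^2 + zy^2) r dr dtheta."""
    r_max = math.sqrt(z_max)
    dr = r_max / n_r
    dth = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            th = (j + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            zx, zy = 2 * x, 2 * y  # partial derivatives of z = x^2 + y^2
            total += math.sqrt(1 + zx * zx + zy * zy) * r * dr * dth
    return total

numeric = paraboloid_area()
exact = 13 * math.pi / 3
print(numeric, exact)
```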
In Figure 3.26, we note that the area element dσ in the xy-plane is exactly the projection of the area element dS in the tangent plane. This indicates that

$$
dS \cos\gamma = d\sigma,
$$

where γ is the acute angle between the tangent plane and the xy-plane.

Note that if a smooth surface has an equation F(x, y, z) = 0, then $\nabla F = \langle F_x, F_y, F_z\rangle$ is a normal vector of its tangent plane. Then the unit normal vector is

$$
\frac{\nabla F}{|\nabla F|} = \left\langle \frac{F_x}{|\nabla F|}, \frac{F_y}{|\nabla F|}, \frac{F_z}{|\nabla F|}\right\rangle = \langle \cos\alpha, \cos\beta, \cos\gamma\rangle.
$$

So, $dS = \frac{1}{|\cos\gamma|}\,d\sigma = \frac{|\nabla F|}{|F_z|}\,d\sigma = \frac{|\nabla F|}{|F_z|}\,dx\,dy$. Therefore, we conclude that

$$
\text{surface area } S = \iint dS = \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy. \tag{3.10}
$$

If the surface is given by z = f(x, y), then F(x, y, z) = f(x, y) − z = 0, $F_z = -1$, and $\nabla F = \langle z_x, z_y, -1\rangle$. So, $|\nabla F| = \sqrt{1 + z_x^2 + z_y^2}$ and

$$
\text{surface area } S = \iint_{D_{xy}} \frac{\sqrt{1 + z_x^2 + z_y^2}}{|-1|}\,dx\,dy = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy, \tag{3.11}
$$

which is the same formula as the one we derived before. However, equation (3.10) does give us options to project the surface area element dS on the tangent plane onto the other two coordinate planes. Therefore, we have

$$
\begin{aligned}
\text{surface area } S &= \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy && (3.12) \\
&= \iint_{D_{yz}} \frac{|\nabla F|}{|F_x|}\,dz\,dy = \iint_{D_{yz}} \sqrt{1 + x_y^2 + x_z^2}\,dz\,dy && (3.13) \\
&= \iint_{D_{xz}} \frac{|\nabla F|}{|F_y|}\,dx\,dz = \iint_{D_{xz}} \sqrt{1 + y_x^2 + y_z^2}\,dx\,dz, && (3.14)
\end{aligned}
$$

where $D_{xy}$, $D_{yz}$, and $D_{xz}$ are projection regions of the surface onto the xy-, yz-, and xz-planes, respectively.

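Example 3.7.5. Find the surface area of the part of the surface $y = x^{3/2}$ for $0 \le x \le \frac{4}{3}$ and 0 ≤ z ≤ 4.

Solution. We project the surface onto the xz-plane. Since

$$
F(x, y, z) = y - x^{3/2} = 0, \quad\text{we have}\quad \nabla F = \left\langle -\frac{3}{2}x^{1/2}, 1, 0\right\rangle.
$$

The surface area is given by

$$
\text{surface area } S = \iint_{D_{xz}} \frac{|\nabla F|}{|F_y|}\,dx\,dz = \iint_{D_{xz}} \sqrt{1 + \left(-\frac{3}{2}x^{1/2}\right)^2}\,dx\,dz
= \int_0^4 dz \int_0^{4/3} \sqrt{1 + \frac{9x}{4}}\,dx = \frac{224}{27}.
$$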

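Example 3.7.6. Find the surface area of the part of the sphere $x^2 + y^2 + z^2 = R^2$ for z ≥ h, where 0 < h < R.

Solution. Since the surface has equation $x^2 + y^2 + z^2 = R^2$, we let $F(x, y, z) = x^2 + y^2 + z^2 - R^2 = 0$. Then,

$$
\nabla F = \langle 2x, 2y, 2z\rangle \quad\text{and}\quad F_z = 2z.
$$

The projection $D_{xy}$ of this part of the sphere onto the xy-plane is the disk $x^2 + y^2 \le R^2 - h^2$, and z > 0 on the surface. Thus, the surface area is

$$
\begin{aligned}
S &= \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy = \iint_{D_{xy}} \frac{\sqrt{4x^2 + 4y^2 + 4z^2}}{2z}\,dx\,dy \\
&= \iint_{D_{xy}} \frac{R}{z}\,dx\,dy = \iint_{D_{xy}} \frac{R}{\sqrt{R^2 - x^2 - y^2}}\,dx\,dy \\
&= \int_0^{2\pi} d\theta \int_0^{\sqrt{R^2 - h^2}} \frac{R}{\sqrt{R^2 - r^2}}\,r\,dr \\
&= 2\pi R\left[-(R^2 - r^2)^{1/2}\right]_0^{\sqrt{R^2 - h^2}} \\
&= 2\pi R(R - h).
\end{aligned}
$$

Note. The surface area is linear in h!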
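The closed form 2πR(R − h) can be compared with a direct midpoint-rule evaluation of the radial integral. The sketch below assumes 0 < h < R, so the integrand stays bounded on the projection disk; the function name and the sample values are illustrative.

```python
import math

def cap_area(R, h, n=2000):
    """Area of the part of the sphere of radius R with z >= h, via the polar integral above."""
    r_max = math.sqrt(R * R - h * h)
    dr = r_max / n
    radial = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        radial += R * r / math.sqrt(R * R - r * r) * dr
    return 2 * math.pi * radial

# Two sample caps on the same sphere: halving the cap height halves the area.
for R, h in [(2.0, 1.0), (2.0, 0.5)]:
    print(R, h, cap_area(R, h), 2 * math.pi * R * (R - h))
```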

3.7.2 Center of mass, moment of inertia

Center of mass
In physics, the center of mass of an object is the balance point of the object, or the point at which gravity can be considered to act. For a rod with density function ρ(x), the center of mass $\bar{x}$ can be found by using the idea of moment and is given by

$$
\bar{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}. \tag{3.15}
$$

For a two-dimensional lamina with density function ρ(x, y), the center of mass $(\bar{x}, \bar{y})$ is given by

$$
\bar{x} = \frac{\iint_D x\rho(x, y)\,d\sigma}{\iint_D \rho(x, y)\,d\sigma} \quad\text{and}\quad \bar{y} = \frac{\iint_D y\rho(x, y)\,d\sigma}{\iint_D \rho(x, y)\,d\sigma}. \tag{3.16}
$$

For a three-dimensional solid with density function ρ(x, y, z), the center of mass $(\bar{x}, \bar{y}, \bar{z})$ is given by

$$
\bar{x} = \frac{\iiint_\Omega x\rho(x, y, z)\,dV}{\iiint_\Omega \rho(x, y, z)\,dV}, \quad
\bar{y} = \frac{\iiint_\Omega y\rho(x, y, z)\,dV}{\iiint_\Omega \rho(x, y, z)\,dV}, \quad\text{and}\quad
\bar{z} = \frac{\iiint_\Omega z\rho(x, y, z)\,dV}{\iiint_\Omega \rho(x, y, z)\,dV}. \tag{3.17}
$$

If an object is uniform (has constant density at each point), then its center of mass is
at the centroid, the geometric center of the figure.

Moment of inertia
In physics, we know that the moment of inertia of a rigid body m about an axis l is

$$
I = \sum_{i=1}^{n} \Delta m_i\, r_i^2, \tag{3.18}
$$

where $\Delta m_i$ is a mass element, while $r_i$ is the perpendicular distance from this mass element to the axis of rotation.

By taking limits, this can be represented as an integral. If ρ stands for the density function, the moments of inertia of a region D in the xy-plane being rotated about the x-axis and y-axis are, therefore,

$$
I_x = \iint_D \rho y^2\,dA \quad\text{and}\quad I_y = \iint_D \rho x^2\,dA, \tag{3.19}
$$

respectively. Thus, by the perpendicular axis theorem for the moment of inertia of a rigid body in the plane, the moment of inertia of the body about the z-axis is

$$
I_z = I_x + I_y = \iint_D \rho(x^2 + y^2)\,dA. \tag{3.20}
$$

Similarly, for a solid R in space, the moment of inertia of the solid about an axis of rotation is

$$
I = \iiint_R \rho|\mathbf{r}|^2\,dV, \tag{3.21}
$$

where |r| is the distance from the volume element dV to the axis of rotation.

Therefore, finding the coordinates of a center of mass or the moment of inertia of an object about an axis of rotation becomes a matter of evaluating some multiple integrals.
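To illustrate formulas (3.16), (3.19), and (3.20), the sketch below approximates the mass, centroid, and moments of inertia of a half-disk lamina (radius R, y ≥ 0) on a polar midpoint grid. The half-disk, the function name `lamina_moments`, and the uniform density are illustrative assumptions, not an example from the text; for this shape the known values are ȳ = 4R/(3π) and I_x = πR⁴/8 when ρ = 1.

```python
import math

def lamina_moments(density, R, n_r=200, n_theta=200):
    """Mass, center of mass, and moments of inertia of the half-disk r <= R, 0 <= theta <= pi."""
    dr = R / n_r
    dth = math.pi / n_theta
    mass = mx = my = ix = iy = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            th = (j + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            dm = density(x, y) * r * dr * dth  # mass element: rho * dA
            mass += dm
            mx += x * dm
            my += y * dm
            ix += y * y * dm  # distance to the x-axis is |y|
            iy += x * x * dm  # distance to the y-axis is |x|
    return {"mass": mass, "x_bar": mx / mass, "y_bar": my / mass,
            "I_x": ix, "I_y": iy, "I_z": ix + iy}

R = 2.0  # sample radius
result = lamina_moments(lambda x, y: 1.0, R)
print(result)
```

Passing a non-constant `density` changes only the mass element, which mirrors how ρ enters each of the integrals above.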

3.8 Review
Main concepts discussed in this chapter are listed below.
1. Definition and properties of double integrals:

$$
\iint_D f(x, y)\,d\sigma = \lim_{|\Delta\sigma_i| \to 0} \sum f(x_i, y_i)\,\Delta\sigma_i.
$$

2. Double integral in rectangular coordinates:

$$
\iint_D f(x, y)\,d\sigma = \int_a^b dx \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\,dy \quad\text{(type I region)},
$$

$$
\iint_D f(x, y)\,d\sigma = \int_c^d dy \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\,dx \quad\text{(type II region)}.
$$

3. Double integral in polar coordinates:

$$
\iint_D f(x, y)\,d\sigma = \int_\alpha^\beta d\theta \int_{r_1(\theta)}^{r_2(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr.
$$

4. Change of variables in a double integral:

$$
\iint_D f(x, y)\,d\sigma = \iint_{D_{uv}} f(x(u, v), y(u, v)) \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv.
$$

5. Triple integral in rectangular coordinates:

$$
\iiint_\Omega f(x, y, z)\,dV = \iint_{D_{xy}} \left(\int_{z_1(x, y)}^{z_2(x, y)} f(x, y, z)\,dz\right) d\sigma \quad\text{(a definite integral first)},
$$

$$
\iiint_\Omega f(x, y, z)\,dV = \int_a^b \left(\iint_{D_z} f(x, y, z)\,d\sigma\right) dz \quad\text{(a double integral first)}.
$$

6. Cylindrical coordinates:

x = r cos θ, y = r sin θ, and z = z.

7. Spherical coordinates:

x = ρ sin ϕ cos θ, y = ρ sin ϕ sin θ, and z = ρ cos ϕ.

8. Triple integral in cylindrical coordinates:

$$
\iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega'_{r\theta z}} f(r\cos\theta, r\sin\theta, z)\,r\,dz\,dr\,d\theta.
$$

9. Triple integral in spherical coordinates:

$$
\iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega'_{\rho\phi\theta}} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta.
$$

10. Change of variables in a triple integral:

$$
\iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega'_{uvw}} f(x(u, v, w), y(u, v, w), z(u, v, w)) \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| dV'.
$$

11. Surface area for

$$
\mathbf{r} = \mathbf{r}(u, v): \quad S = \iint_{D_{uv}} |\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv,
$$

$$
z = f(x, y): \quad S = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy,
$$

$$
F(x, y, z) = 0: \quad S = \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy.
$$

Similar results hold if the surface is projected onto a coordinate plane other than the xy-plane.
12. Center of mass: if f (x), f (x, y), and f (x, y, z) are density functions, then

$$
\bar{x} = \frac{\int_a^b x f(x)\,dx}{\int_a^b f(x)\,dx} \quad\text{(one dimension, thin rod)},
$$

$$
\bar{x} = \frac{\iint_D x f(x, y)\,d\sigma}{\iint_D f(x, y)\,d\sigma}, \quad
\bar{y} = \frac{\iint_D y f(x, y)\,d\sigma}{\iint_D f(x, y)\,d\sigma} \quad\text{(two dimensions, thin lamina)},
$$

$$
\bar{x} = \frac{\iiint_\Omega x f(x, y, z)\,dV}{\iiint_\Omega f(x, y, z)\,dV}, \quad
\bar{y} = \frac{\iiint_\Omega y f(x, y, z)\,dV}{\iiint_\Omega f(x, y, z)\,dV}, \quad
\bar{z} = \frac{\iiint_\Omega z f(x, y, z)\,dV}{\iiint_\Omega f(x, y, z)\,dV} \quad\text{(three dimensions, solid)}.
$$

13. Moment of inertia: if ρ is the density function, then

$$
I_x = \iint_D \rho y^2\,d\sigma, \quad I_y = \iint_D \rho x^2\,d\sigma, \quad\text{and}\quad I_z = \iint_D \rho(x^2 + y^2)\,d\sigma.
$$

For a solid Ω with density function ρ,

$$
I = \iiint_\Omega \rho|\mathbf{r}|^2\,dV,
$$

where |r| is the distance from the element dV to the axis of rotation.

3.9 Exercises
3.9.1 Double integrals

1. Evaluate each of the following iterated integrals:


2 3 2 2 2−x π sin y
(1) ∫1 ∫0 yx2 ey dxdy, (2) ∫−1 ∫x dydx, (3) ∫0 ∫0 ecos y dxdy.
2. Evaluate each of the following double integrals:
(1) $\iint_D (x^2 + y^2)\,dx\,dy$, where D = {(x, y) : |x| ≤ 1, |y| ≤ 1},
(2) $\iint_D xy^2\,dx\,dy$, where D is the region bounded by the parabola $y^2 = 2x$ and the line $x = \frac{1}{2}$,
(3) $\iint_D x\sqrt{y}\,dx\,dy$, where D is bounded by the two parabolas $y = \sqrt{x}$ and $y = x^2$.
3. If f (x, y) is a continuous function, find the limit $I = \lim_{\rho \to 0} \frac{1}{\pi\rho^2} \iint_{x^2 + y^2 \le \rho^2} f(x, y)\,d\sigma$.
4. Use double integrals to find the area for each of the following plane regions:
(1) region D bounded by curves $y = x^2$ and y = 2,
(2) region R below the curve $y = 4 - x^2$ and the line y = 6 − 3x and above the line y = 2 − x.
5. Find the volume for each of the following solids:
(1) solid Ω beneath the paraboloid $z = 12 - x^2 - 2y^2$ and above the square region D = {(x, y, 0) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1},
(2) solid R bounded by z = 1 + x + y, z = 0, x + y = 1, x = 0, and y = 0,
(3) solid R below the surface of $f(x, y) = 2 + \frac{1}{y}$ and above the rectangular region in the xy-plane with vertices (1, 1), (1, 3), (2, 1), and (2, 3),
(4) solid R bounded by z = 0, z = xy, and x + y + z = 1.
6. Sketch the region of each of the following integrations and change the order of each integration:
(1) $\int_1^2 dx \int_x^{2x} f(x, y)\,dy$,
(2) $\int_0^1 dx \int_0^x f(x, y)\,dy + \int_1^2 dx \int_0^{2-x} f(x, y)\,dy$,
(3) $\int_0^3 \int_{-\sqrt{9 - y^2}}^{\sqrt{9 - y^2}} f(x, y)\,dx\,dy$.
7. Evaluate each of the following integrals by reversing the order of integration:
(1) $\int_0^3 dy \int_{y^2}^9 y\cos(x^2)\,dx$,
(2) $\int_0^{\sqrt{\pi}} \int_y^{\sqrt{\pi}} e^{x^2}\,dx\,dy$.
8. Evaluate the integral $\int_0^1 \frac{x^b - x^a}{\ln x}\,dx$, where 0 < a < b.
9. The average value of a function f (x, y) over a plane region D is defined as

$$
f_{\text{ave}} = \frac{1}{A(D)} \iint_D f(x, y)\,d\sigma, \quad\text{where } A(D) \text{ is the area of the region } D.
$$

Find the average of the function f (x, y) = sin(2x − 3y) over the region D bounded by $0 \le x \le \frac{\pi}{2}$ and $x \le y \le \frac{\pi}{2}$.
10. Use polar coordinates to combine the sum

$$
\int_{1/\sqrt{2}}^{1} dx \int_{\sqrt{1 - x^2}}^{x} xy\,dy + \int_{1}^{\sqrt{2}} dx \int_{0}^{x} xy\,dy + \int_{\sqrt{2}}^{2} dx \int_{0}^{\sqrt{4 - x^2}} xy\,dy.
$$

11. Find the volume for each of the following solids:
(1) solid R enclosed by two cylinders $x^2 + y^2 = 4$ and $x^2 + z^2 = 4$,
(2) solid R inside both the sphere $x^2 + y^2 + z^2 = 16$ and the cylinder $x^2 + y^2 = 4$,
(3) solid R that is below the surface $z = \frac{18}{2 + x^2 + y^2} - 3$ and above the plane z = 0,
(4) solid R between two surfaces $z = x^2 + y^2$ and $z = 2 - x^2 - y^2$.
12. Find the volume of the solid bounded by $z = 25 - (x - 1)^2 - (y + 2)^2$ and z ≥ 0.
13. Use change of variables in a double integral to
(1) find the area of the region bounded by the lines x + y = c, x + y = d, y = ax, and y = bx (0 < c < d, 0 < a < b),
(2) evaluate the double integral $\iint_D x\sin(y - 2x)\,d\sigma$, where D is the region bounded by the parallelogram with vertices (1, 0), (1, 3), (3, 7), (3, 4),
(3) find $\iint_D \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}\,dx\,dy$, where D is the region enclosed by the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.
14. We can define the improper integral

$$
I = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy = \lim_{r \to +\infty} \iint_{D_r} e^{-(x^2 + y^2)}\,dx\,dy,
$$

where $D_r$ is the disk with radius r and center the origin. Show that

$$
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy = \pi
$$

and deduce

$$
\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} \quad\text{and}\quad \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.
$$

3.9.2 Triple integrals

1. Find the region Ω for which the triple integral

$$
\iiint_\Omega (1 - x^2 - 3y^2 - 2z^2)\,dV
$$

is a maximum.
2. Evaluate the iterated integral $\int_0^1 \int_0^z \int_0^{x - z} 6xz\,dy\,dx\,dz$.
3. If $\Omega_2 = \{(x, y, z) : x^2 + y^2 + z^2 \le R^2\}$ and

$$
\Omega_1 = \{(x, y, z) : x^2 + y^2 + z^2 \le R^2,\ x \ge 0,\ y \ge 0,\ z \ge 0\},
$$

find (1) $\iiint_{\Omega_1} 3\,dV$ and (2) $\iiint_{\Omega_2} xyz\,dV$.
4. Evaluate the triple integral $\iiint_\Omega 2x\,dV$, where

$$
\Omega = \{(x, y, z) : 0 \le y \le 2,\ 0 \le x \le \sqrt{4 - y^2},\ 0 \le z \le y\}.
$$

5. The average value of a function f (x, y, z) over a solid region Ω is defined to be

$$
f_{\text{ave}} = \frac{1}{V(\Omega)} \iiint_\Omega f(x, y, z)\,dV, \quad\text{where } V(\Omega) \text{ is the volume of } \Omega.
$$

Find the average value of the function $f(x, y, z) = x^2 z + y^2 z$ over the region enclosed by the paraboloid $z = 1 - x^2 - y^2$ and the plane $z = \frac{3}{4}$.
6. Evaluate the following triple integrals by converting to cylindrical or spherical coordinates:
(1) $\iiint_\Omega (x^2 + y^2)\,dx\,dy\,dz$, where Ω is the region enclosed by the paraboloid $x^2 + y^2 = 2z$ and the plane z = 2,
(2) $\int_0^3 \int_0^{\sqrt{9 - x^2}} \int_0^{\sqrt{x^2 + y^2}} \sqrt{x^2 + y^2}\,dz\,dy\,dx$,
(3) $\iiint_\Omega \sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz$, where Ω is the ball $x^2 + y^2 + z^2 \le z$,
(4) $\int_0^1 dx \int_0^{\sqrt{1 - x^2}} dy \int_{\sqrt{x^2 + y^2}}^{\sqrt{2 - x^2 - y^2}} z^2\,dz$.
7. Use triple integrals to find the volume of the solid
(1) bounded by planes x = 0, y = 0, z = 0, x = 4, and y = 4 and the paraboloid $z = x^2 + y^2 + 1$,
(2) enclosed by the cone $z = \sqrt{x^2 + y^2}$ and the paraboloid $az = x^2 + y^2$, where a > 0,
(3) enclosed by the spherical surfaces ρ = 4 sin ϕ.
8. Use the change of variables in a triple integral to evaluate $\iiint_\Omega z\,dV$, where Ω is bounded by the planes y = x, y = x + 2, z = x, z = x + 2, z = 0, and z = 6.
9. Let f (x) > 0 be a continuous function, $\Omega = \{(x, y, z) : x^2 + y^2 + z^2 \le t^2\}$, and $D = \{(x, y) : x^2 + y^2 \le t^2\}$. Prove that the function

$$
F(t) = \frac{\iiint_\Omega f(x^2 + y^2 + z^2)\,dV}{\iint_D f(x^2 + y^2)\,d\sigma}
$$

is increasing for t > 0.

3.9.3 Other applications of multiple integrals

1. Find the area of each of the following surfaces:
(1) the part of the hyperbolic paraboloid $z = y^2 - x^2$ that lies between the cylinders $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$,
(2) the part of the surface z = xy that lies within the cylinder $x^2 + y^2 = 1$,
(3) the part of the cylinder $x^2 + y^2 = x$ that lies within the sphere $x^2 + y^2 + z^2 = 1$,
(4) the part of the sphere $x^2 + y^2 + z^2 = a^2$ that lies within the cylinder $x^2 + y^2 = ax$ and above the xy-plane,
(5) the part of the cylinder $x^2 + y^2 = 16$ that is between the planes z = 0 and z = 16 − 2x,
(6) the part of the trough $z = x^2$ for −3 ≤ x ≤ 3 and 1 ≤ y ≤ 4.
2. A lamina is represented by the part of the disk $x^2 + y^2 \le 1$ in the first and second quadrants. Find the mass of the lamina if the density at any point is proportional to its distance from the x-axis with constant of proportionality equal to K.
3. A solid has the shape of a half-cylinder bounded by −3 ≤ x ≤ 3, $0 \le y \le \sqrt{9 - x^2}$, and 0 ≤ z ≤ 2. Each point (x, y, z) in the solid has density given by the function $f(x, y, z) = \frac{1}{1 + x^2 + y^2}$. Find the total mass of this solid.
4. Find the coordinates of the centroid of the constant-density cone D bounded by $z = 4 - \sqrt{x^2 + y^2}$ and z = 0.
5. A torus is a surface obtained by rotating a closed plane curve (for example, a circle) about an axis, where the axis usually does not intersect the curve. Consider the torus obtained by rotating the circle $(x - R)^2 + z^2 = r^2$ (r < R) in the xz-plane about the z-axis.
(a) Parameterize the torus.
(b) Find its volume.
(c) Find the centroid of the half-torus (z ≥ 0).
(d) Find the surface area and the moment of inertia of the torus.
6. Find the moment of inertia for:
(a) a uniform semidisk about its straight edge (the diameter),
(b) a uniform semidisk about the axis that is the perpendicular bisector of its straight edge (in the same plane),
(c) a ball with radius R, center at the origin, and density function $\sqrt{x^2 + y^2 + z^2}$ about the z-axis.
4 Line and surface integrals
4.1 Line integral with respect to arc length
The definite integral

$$
\int_a^b f(t)\,dt \quad\text{or equivalently}\quad \int_a^b f(x)\,dx \tag{4.1}
$$

of a one-variable function f defined on an interval [a, b] ⊂ ℝ is used to model many


physical phenomena. For example, if f (t) is the speed of an object at time t, then the
first integral gives the total distance traveled by the object between times t = a and
t = b. If f (x) is the density of a wire at distance x along the wire, then the second
integral of (4.1) is the mass of the wire. If f (x) ≥ 0 on a ≤ x ≤ b, then the second form
of (4.1) is the area of the region under the graph of f and above the x-axis, between
x = a and x = b.
It is useful to extend the integral to functions defined on domains other than an
interval [a, b] in ℝ. For example, how can we find the mass of a wire represented by
a curve C ⊂ ℝ2 , with f (x, y) the density function of the wire at a point (x, y) on C?
As shown in Figure 4.1, what is the area of a “curtain” that is part of a cylinder with
base C, a curve in the xy-plane, and at each point (x, y) the height is given by f (x, y)?
We can use the element method, as we did for other integrals. We split the curve into many small pieces, and for each piece, say, $\Delta s_i$, we arbitrarily choose a sample point, say, $(x_i^*, y_i^*)$, so that $f(x_i^*, y_i^*)\Delta s_i$ approximates the mass element (or area element). Then we form a Riemann sum and take its limit as the largest $\Delta s_i$ tends to 0. This leads to the idea of a line integral. We now give a formal definition of a line integral with respect to arc length.

Figure 4.1: Area of a curtain.

https://doi.org/10.1515/9783110674378-004

4.1.1 Definition and properties

Definition 4.1.1 (Line integral with respect to arc length). Let C be a piecewise smooth curve in the plane ℝ², connecting two fixed points A and B. Let s be the distance along C measured from $A = S_0$. Subdivide C between A and B (Figure 4.2(a)) by points $A = S_0, S_1, S_2, \ldots, S_n = B$, let $s_i$ for each i = 1, 2, . . . , n be the distance along the curve from $S_0$ to $S_i$, and write $\Delta s_i = s_i - s_{i-1}$ (with $s_0 = 0$). Arbitrarily choose $(x_i^*, y_i^*) \in ℝ^2$ on C between $S_{i-1}$ and $S_i$ for i = 1, 2, . . . , n. Let f be a real-valued function defined on C. If the limit

$$
\lim_{\max|\Delta s_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i
$$

exists for all possible subdivisions and choices of the $(x_i^*, y_i^*)$, then we define this to be the line integral of f on the curve C with respect to arc length, and we write

$$
\lim_{\max|\Delta s_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i = \int_C f(x, y)\,ds.
$$

(a) (b)

Figure 4.2: Definition of a line integral with respect to arc length.

Note.
1. It can be proved that this line integral exists when f is continuous, or piecewise continuous, provided C is finite and piecewise smooth. A curve C: r(t), a ≤ t ≤ b, is piecewise smooth if there is a subdivision $a = t_0 < t_1 < t_2 < \cdots < t_n = b$ such that $\mathbf{r}'(t)$ is continuous on each subinterval $t_{i-1} < t < t_i$.
2. Sometimes we also use the notation $\int_L f(x, y)\,ds$ for a line integral.

Properties of the line integral


The definition can be used to prove, without difficulty, the following properties for line integrals of the functions f (x, y) and g(x, y) in ℝ², provided all line integrals involved exist:

(1) if f (x, y) = 1 for all (x, y) ∈ C in ℝ², then $\int_C 1\,ds$ is the length of the curve C,
(2) $\int_C (kf(x, y) + mg(x, y))\,ds = k\int_C f(x, y)\,ds + m\int_C g(x, y)\,ds$ if k, m ∈ ℝ (linearity),
(3) $\int_{C_1 + C_2} f(x, y)\,ds = \int_{C_1} f(x, y)\,ds + \int_{C_2} f(x, y)\,ds$, where $C_1 + C_2$ means the curve formed by combining curves $C_1$ and $C_2$ into one curve,
(4) if f (x, y) ≤ g(x, y), then $\int_C f(x, y)\,ds \le \int_C g(x, y)\,ds$,
(5) if f (x, y) is continuous, then there is a point (η, ξ) on C such that

$$
\int_C f(x, y)\,ds = f(\eta, \xi) \times \text{length of } C.
$$

Example 4.1.1. Evaluate the line integral $\int_C (2x\ln(1 + y^2) + \pi)\,ds$, where C is the half-circle $y = \sqrt{4 - x^2}$.

Solution. By the linearity property, we have

$$
\int_C (2x\ln(1 + y^2) + \pi)\,ds = 2\int_C x\ln(1 + y^2)\,ds + \pi\int_C 1\,ds.
$$

By symmetry, as we saw in the discussion of multiple integrals, $\int_C x\ln(1 + y^2)\,ds$ is equal to 0 since the curve is symmetrical about the y-axis, and $x\ln(1 + y^2)$ is odd with respect to x. Since $\int_C 1\,ds$ gives the length of C, which is half the circumference of a circle of radius 2, we obtain

$$
\int_C (2x\ln(1 + y^2) + \pi)\,ds = 2\int_C x\ln(1 + y^2)\,ds + \pi\int_C 1\,ds = 0 + \pi \times \pi \times 2 = 2\pi^2.
$$
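The symmetry argument can be checked numerically by splitting the integrand into its odd part and its constant part. This sketch assumes the parameterization x = 2 cos t, y = 2 sin t, 0 ≤ t ≤ π, so that ds = 2 dt; the function name and the number of subintervals are illustrative.

```python
import math

def semicircle_integral_parts(n=2000):
    """Midpoint sums for the odd part and the constant part of the integrand on the half-circle."""
    dt = math.pi / n
    odd_part = 0.0
    const_part = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = 2 * math.cos(t), 2 * math.sin(t)
        ds = 2 * dt  # |r'(t)| = 2 on a circle of radius 2
        odd_part += 2 * x * math.log(1 + y * y) * ds
        const_part += math.pi * ds
    return odd_part, const_part

odd_part, const_part = semicircle_integral_parts()
print(odd_part, const_part, 2 * math.pi**2)
```

The odd part cancels because the sample points come in mirror-image pairs across the y-axis, which is the discrete counterpart of the symmetry used in the solution.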

4.1.2 Evaluating a line integral, ∫C f (x, y)ds, in ℝ2

Suppose that a curve C is given by a vector-valued function (in a vector parametric form)

$$
\mathbf{r}(t) = \langle x(t), y(t)\rangle, \quad\text{for } a \le t \le b,
$$

where the functions x(t) and y(t) both have continuous first-order derivatives, and where r(a) and r(b) correspond to the points A and B, respectively. We approximate $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$ (see Figure 4.2(b)). Let $t = t_i^*$ be the value of t that corresponds to the point $(x_i^*, y_i^*)$, and let $t = t_i$ be the value of t that corresponds to the point $S_i$ on C, for i = 1, 2, . . . , n. Writing $\Delta t_i = t_i - t_{i-1}$ and noting that $\max|\Delta t_i| \to 0 \iff \max|\Delta s_i| \to 0$, it follows that

$$
\begin{aligned}
\int_C f(x, y)\,ds &= \lim_{\max|\Delta s_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i \\
&= \lim_{\max|\Delta s_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x(t_i^*), y(t_i^*))\sqrt{\Delta x_i^2 + \Delta y_i^2} \\
&= \lim_{\max|\Delta t_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(x(t_i^*), y(t_i^*))\sqrt{\frac{\Delta x_i^2}{\Delta t_i^2} + \frac{\Delta y_i^2}{\Delta t_i^2}}\,\Delta t_i.
\end{aligned}
$$

The last Riemann sum is an ordinary one-variable Riemann sum. Hence, if the limit exists, we can express the line integral as an ordinary one-variable definite integral, i. e.,

$$
\int_C f(x, y)\,ds = \int_a^b f(x(t), y(t))\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt. \tag{4.2}
$$

We can also write this in a vector form,

$$
\int_C f(x, y)\,ds = \int_a^b f(\mathbf{r}(t))\,|\mathbf{r}'(t)|\,dt. \tag{4.3}
$$

Note that we can interpret ds, the arc length element, as $|\mathbf{r}'(t)|\,dt$. This is consistent with $\frac{ds}{dt} = |\mathbf{r}'(t)|$.
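Formula (4.3) translates almost line by line into a numerical routine. The following is a minimal sketch, assuming a smooth parameterization: the name `line_integral`, the finite-difference step `h`, and the test curve (the parabola y = x², 0 ≤ x ≤ 1, with f(x, y) = x) are illustrative choices and not taken from the text.

```python
import math

def line_integral(f, r, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral of f(r(t)) |r'(t)| dt over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        ahead, behind = r(t + h), r(t - h)
        # |r'(t)| estimated by a central difference in each coordinate
        speed = math.sqrt(sum(((p - q) / (2 * h)) ** 2 for p, q in zip(ahead, behind)))
        total += f(*r(t)) * speed * dt
    return total

# Test case: integral of x ds along y = x^2, 0 <= x <= 1, i.e. of t*sqrt(1 + 4t^2) dt.
numeric = line_integral(lambda x, y: x, lambda t: (t, t * t), 0.0, 1.0)
exact = (5 * math.sqrt(5) - 1) / 12
print(numeric, exact)
```

Since the routine works on tuples of any length, the same sketch also covers curves in ℝ³ when f takes three arguments.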

Example 4.1.2. Compute the circumference of the circle $x^2 + y^2 = R^2$ using a line integral.

Solution. The circumference is $\int_C 1\,ds$, where C is the circle. Choose the standard parameterization,

x = R cos t, y = R sin t, 0 ≤ t ≤ 2π.

Evaluating the integral using equation (4.2), we have

$$
\begin{aligned}
\int_C 1\,ds &= \int_0^{2\pi} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt = \int_0^{2\pi} \sqrt{(-R\sin t)^2 + (R\cos t)^2}\,dt \\
&= \int_0^{2\pi} R\sqrt{\sin^2 t + \cos^2 t}\,dt = 2\pi R.
\end{aligned}
$$

Example 4.1.3. Find the total mass of the wire given by the curve C: $y = x^2$, −2 ≤ x ≤ 2, with density function $f(x, y) = x + \sqrt{y}$.

Solution. We choose a parameterization $\mathbf{r}(t) = \langle t, t^2\rangle$, −2 ≤ t ≤ 2, and then $|\mathbf{r}'(t)| = \sqrt{1^2 + (2t)^2}$. So we have

$$
\begin{aligned}
\text{total mass } m &= \int_C f(x, y)\,ds = \int_C (x + \sqrt{y})\,ds \\
&= \int_{-2}^{2} \left(t + \sqrt{t^2}\right)|\mathbf{r}'(t)|\,dt = \int_{-2}^{2} \left(t + \sqrt{t^2}\right)\sqrt{1^2 + (2t)^2}\,dt \\
&= \int_{-2}^{2} t\sqrt{1^2 + (2t)^2}\,dt + \int_{-2}^{2} \sqrt{t^2}\sqrt{1^2 + (2t)^2}\,dt \\
&= 0 + 2\int_0^2 t\sqrt{1 + 4t^2}\,dt \\
&= \frac{1}{4} \cdot \frac{2}{3}\left.(1 + 4t^2)^{3/2}\right|_0^2 = \frac{17}{6}\sqrt{17} - \frac{1}{6}.
\end{aligned}
$$

4.1.3 Line integrals ∫C f (x, y, z)ds in ℝ3

The process for expressing a line integral in ℝ³ as a definite integral is very similar to the one used in ℝ², and is, therefore, not repeated here. When the curve C is, in a vector parametric form, $\mathbf{r}(t) = \langle x(t), y(t), z(t)\rangle$, for a ≤ t ≤ b, then the result is

$$
\int_C f(x, y, z)\,ds = \int_a^b f(x(t), y(t), z(t))\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\,dt \tag{4.4}
$$

$$
\text{or}\quad \int_C f(x, y, z)\,ds = \int_a^b f(\mathbf{r}(t))\,|\mathbf{r}'(t)|\,dt.
$$

Note. A curve can have many different parametric forms, but the line integral will always have the same value, because every parametric form evaluates the same limit of Riemann sums that defines the line integral (Definition 4.1.1).
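The independence of the parameterization can be observed numerically. The sketch below integrates the density δ = x² + y² + z² over one turn-half of a helix twice: once with the parameter t on [π, 2π] and once after the substitution t = π(1 + s²), 0 ≤ s ≤ 1. The substitution, the helper names, and the integrator are illustrative assumptions; the closed-form value √5(4π + 7π³/3) follows from integrating √5(4 + t²) over [π, 2π].

```python
import math

def line_integral(f, r, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral of f(r(t)) |r'(t)| dt over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        ahead, behind = r(t + h), r(t - h)
        speed = math.sqrt(sum(((p - q) / (2 * h)) ** 2 for p, q in zip(ahead, behind)))
        total += f(*r(t)) * speed * dt
    return total

def helix(t):
    """Helix <2 cos t, 2 sin t, t>."""
    return (2 * math.cos(t), 2 * math.sin(t), t)

def helix_reparam(s):
    """Same curve for pi <= t <= 2*pi, traced with t = pi * (1 + s^2), 0 <= s <= 1."""
    return helix(math.pi * (1 + s * s))

def density(x, y, z):
    return x * x + y * y + z * z

mass_t = line_integral(density, helix, math.pi, 2 * math.pi)
mass_s = line_integral(density, helix_reparam, 0.0, 1.0)
exact = math.sqrt(5) * (4 * math.pi + 7 * math.pi**3 / 3)
print(mass_t, mass_s, exact)
```

In the second parameterization the speed |r′(s)| varies along the curve, yet the product f · |r′| ds still adds up to the same total.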

Example 4.1.4. Find the mass of the wire represented by the helical curve C: $\mathbf{r}(t) = \langle 2\cos t, 2\sin t, t\rangle$, π ≤ t ≤ 2π, when the density at a point (x, y, z) on the wire is given by $\delta(x, y, z) = x^2 + y^2 + z^2$.

Solution. The mass is given by the following line integral:

$$
\begin{aligned}
\int_C \delta(x, y, z)\,ds &= \int_\pi^{2\pi} \left(x^2(t) + y^2(t) + z^2(t)\right)\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\,dt \\
&= \int_\pi^{2\pi} \left(4\cos^2 t + 4\sin^2 t + t^2\right)\sqrt{4\sin^2 t + 4\cos^2 t + 1}\,dt \\
&= \sqrt{5}\int_\pi^{2\pi} (4 + t^2)\,dt = \sqrt{5}\left(4\pi + \frac{7}{3}\pi^3\right).
\end{aligned}
$$

Example 4.1.5. Evaluate $\int_C f\,ds$ when f (x, y, z) = x + yz and $C = C_1 + C_2$, where $C_1$ is the line segment from (0, 0, 2) to (2, 0, 2) and $C_2$ is the line segment from (2, 0, 2) to (1, 1, 1).

Solution. The parametric form of the line through (0, 0, 2) and (2, 0, 2) is

$$
\mathbf{r}(t) = \langle 0, 0, 2\rangle + t(\langle 2, 0, 2\rangle - \langle 0, 0, 2\rangle) = (1 - t)\langle 0, 0, 2\rangle + t\langle 2, 0, 2\rangle.
$$

Since r(0) = ⟨0, 0, 2⟩ and r(1) = ⟨2, 0, 2⟩, the line segment $C_1$ has parameterization

$$
C_1: \mathbf{r}_1(t) = (1 - t)\langle 0, 0, 2\rangle + t\langle 2, 0, 2\rangle = \langle 2t, 0, 2\rangle, \quad 0 \le t \le 1.
$$

Similarly, line segment $C_2$ has parameterization

$$
C_2: \mathbf{r}_2(t) = (1 - t)\langle 2, 0, 2\rangle + t\langle 1, 1, 1\rangle = \langle 2 - t, t, 2 - t\rangle, \quad 0 \le t \le 1.
$$

Using properties of line integrals, we have

$$
\begin{aligned}
\int_C f\,ds &= \int_{C_1} f\,ds + \int_{C_2} f\,ds = \int_0^1 f(\mathbf{r}_1(t))\,|\mathbf{r}_1'(t)|\,dt + \int_0^1 f(\mathbf{r}_2(t))\,|\mathbf{r}_2'(t)|\,dt \\
&= \int_0^1 (2t)\sqrt{2^2}\,dt + \int_0^1 (2 - t + t(2 - t))\sqrt{(-1)^2 + 1 + (-1)^2}\,dt \\
&= \left[2t^2\right]_0^1 + \sqrt{3}\left[2t + \frac{t^2}{2} - \frac{t^3}{3}\right]_0^1 = 2 + \frac{13}{6}\sqrt{3}.
\end{aligned}
$$

Figure 4.3 shows the graph of the curve C.

Figure 4.3: Line integral with respect to arc length, Example 4.1.5.

4.2 Line integral of a vector field


4.2.1 Vector fields

Many physical phenomena can be modeled by associating a vector with each point in
space. Examples include electric fields, magnetic fields, gravitational fields, and the
velocity field for a fluid. A function that associates a vector with each point is called a
vector field.
More precisely, it is defined as follows.

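Definition 4.2.1. A function F whose domain is a set D ⊂ ℝⁿ and whose range is a set of vectors in ℝⁿ is called a vector field.

For example, the functions F, G, and H defined as follows are vector fields:

$$
\mathbf{F}(x, y) = x^2 y\,\mathbf{i} + 2xye^x\,\mathbf{j},
$$

$$
\mathbf{G}(x, y, z) = \langle -x, -y, -z\rangle,
$$

$$
\mathbf{H}(\rho, \theta, \phi) = \langle \rho\cos\theta\sin\phi, \rho\sin\theta\sin\phi, \cos\phi\rangle.
$$

The gradient of a function f is also a vector field, i. e.,

$$
\operatorname{grad} f(x, y) = \nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\rangle,
$$

$$
\operatorname{grad} f(x, y, z) = \nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}.
$$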

We can graph a vector field by drawing arrows at some selected points in ℝ2 or ℝ3 .


Some examples are shown in Figure 4.4 and Figure 4.5.

(a) (b)

Figure 4.4: Some vector fields.



(a) (b)

Figure 4.5: Electric and magnetic fields.

Example 4.2.1. The gravitational field of the Earth (of mass $M_e$) acting on an object of mass m is a vector field. Find the formula for this force field.

Solution. The magnitude F of the gravitational force is given by the inverse square law formula

$$
F = G\frac{M_e m}{r^2},
$$

where G is the gravitational constant, r is the distance between the object and the center of the Earth, and m and $M_e$ are the mass of the object and the Earth, respectively. In order to find the gravitational field, let $\mathbf{r}$ be the vector from the center of the Earth to the object. The force on the object acts along the same line, but in the opposite direction to $\mathbf{r}$, and so the field becomes

$$
\mathbf{F} = -G\frac{M_e m\,\mathbf{r}}{r^3}.
$$

If we define a Euclidean coordinate system with origin at the center of the Earth, then this becomes

$$
\mathbf{F} = -G\frac{M_e m\,\mathbf{r}}{|\mathbf{r}|^3} = -G\frac{M_e m\,(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})}{(x^2 + y^2 + z^2)^{3/2}},
$$

where (x, y, z) is the location of the object.
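The field in Example 4.2.1 is simple to evaluate in code. The sketch below returns the force vector at a point given in coordinates centered at the Earth's center; the function name is hypothetical, and the numerical constants are approximate reference values supplied here for illustration, not values from the text.

```python
import math

G = 6.674e-11        # gravitational constant in N m^2 / kg^2 (approximate value)
M_EARTH = 5.972e24   # mass of the Earth in kg (approximate value)

def gravity_force(x, y, z, m):
    """Force vector -G * M_e * m * r / |r|^3 on a mass m located at (x, y, z)."""
    dist = math.sqrt(x * x + y * y + z * z)
    if dist == 0.0:
        raise ValueError("the field is undefined at the origin")
    k = -G * M_EARTH * m / dist**3
    return (k * x, k * y, k * z)

# Sample point 5000 km from the center, chosen only to exercise the formula.
position = (3.0e6, 4.0e6, 0.0)
force = gravity_force(*position, m=10.0)
magnitude = math.sqrt(sum(c * c for c in force))
print(force, magnitude)
```

The magnitude of the returned vector reproduces the inverse square law, and each component has the opposite sign to the corresponding coordinate, so the force points toward the origin.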

4.2.2 The line Integral of a vector field along a curve C

Suppose an object moves from point A to point B along a smooth curve C in ℝ² in a force field F = ⟨f , g⟩, where f = f (x, y) and g = g(x, y) are two differentiable functions of two variables. How much work does the force field do? Note that the force acting on the object varies from point to point, possibly both in direction and in magnitude. To find the total work, we again use the element method to break down the problem. Thinking about a very small piece of the curve, say, Δs, the work element ΔW can be obtained by using |F| cos θ ⋅ Δs, as shown in Figure 4.6. Suppose T is the unit tangent vector of the curve. Then

$$
\Delta W \approx |\mathbf{F}|\cos\theta \cdot \Delta s = |\mathbf{F}|\cos\theta\,|\mathbf{T}| \cdot \Delta s = (\mathbf{F} \cdot \mathbf{T})\,\Delta s.
$$

Figure 4.6: Work integral.

Adding them up and taking a limit, we will have a line integral similar to the one in the previous section. We now give a formal definition of a line integral of a vector field along a curve C in ℝ².

Definition 4.2.2 (Line integral of a vector field). Let F = ⟨f(x, y), g(x, y)⟩ be a vector field in ℝ², and let
C : r(t), a ≤ t ≤ b, be a curve in the xy-plane. If the limit

$$
\lim_{\substack{\max|\Delta s_i| \to 0 \\ n \to \infty}} \sum_{i=1}^{n} \langle f(x_i^*, y_i^*), g(x_i^*, y_i^*)\rangle \cdot \frac{\Delta\mathbf{r}_i/\Delta t_i}{|\Delta\mathbf{r}_i/\Delta t_i|}\,\Delta s_i
$$

exists and is the same for any possible choice of sample points $(x_i^*, y_i^*)$ between $S_{i-1}$ and $S_i$ on C and
for any subdivision $S_0, S_1, \ldots, S_n$ of the curve C, we define this limit to be the line integral of the vector field F
along the curve C from point A = r(a) to point B = r(b), and we write

$$
\int_C (\mathbf{F}\cdot\mathbf{T})\,ds = \lim_{\substack{\max|\Delta s_i| \to 0 \\ n \to \infty}} \sum_{i=1}^{n} \langle f(x_i^*, y_i^*), g(x_i^*, y_i^*)\rangle \cdot \frac{\Delta\mathbf{r}_i/\Delta t_i}{|\Delta\mathbf{r}_i/\Delta t_i|}\,\Delta s_i.
$$

Note.
1. If both components f and g of F are continuous or piecewise continuous and the
curve C is also piecewise smooth, then the above line integral of the vector field F
along C always exists.
2. Unlike the line integral with respect to arc length, the orientation of the curve C
does make a difference in the line integral of a vector field. We usually define the
positive orientation of a parameterized curve C to be the direction that is consistent
with the increasing value of its parameter. If we use −C to denote the negative
direction of C, then we define

$$
\int_{C} (\mathbf{F}\cdot\mathbf{T})\,ds = -\int_{-C} (\mathbf{F}\cdot\mathbf{T})\,ds.
$$

3. If the curve is closed, that is, r(a) = r(b), then we also adopt the notation
∮C (F ⋅ T)ds.

Notations for line integrals of a vector field


The notation ∫C (F ⋅ T)ds for the line integral of the vector field F = ⟨f , g⟩ along the
curve C : r(t) = ⟨x(t), y(t)⟩, a ≤ t ≤ b, is easy to understand, although it can be mathe-
matically hard to manipulate. Fortunately, we have some other equivalent notations.
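Since

$$
\begin{aligned}
\int_C (\mathbf{F}\cdot\mathbf{T})\,ds &= \int_C \langle f, g\rangle \cdot \frac{\mathbf{r}'(t)}{|\mathbf{r}'(t)|}\,ds \\
&= \int_C \langle f, g\rangle \cdot \frac{\mathbf{r}'(t)}{|\mathbf{r}'(t)|}\,|\mathbf{r}'(t)|\,dt = \int_C \langle f, g\rangle \cdot \mathbf{r}'(t)\,dt \\
&= \int_C \langle f, g\rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt}\right\rangle dt \\
&= \int_C f\,dx + g\,dy,
\end{aligned}
$$

we have

$$
\int_C (\mathbf{F}\cdot\mathbf{T})\,ds = \int_C f\,dx + g\,dy. \tag{4.5}
$$

If we denote dr = ⟨dx, dy⟩, then we also have

$$
\int_C (\mathbf{F}\cdot\mathbf{T})\,ds = \int_C \mathbf{F}\cdot d\mathbf{r} = \int_C \mathbf{F}\cdot\mathbf{r}'(t)\,dt. \tag{4.6}
$$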

Note. We can write ∫C F ⋅ dr = ∫C (F ⋅ T )ds, where T is a unit tangent to C, and the inte-
grals with respect to arc length s are unchanged when the orientation of C is reversed.
However, the orientation property still holds true because the unit vector T is replaced
by its negative, −T, when C is replaced by −C.

Example 4.2.2. Evaluate $\int_C x\,dy - y\,dx$, where C is the arc of the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ from (a, 0) to (0, b).

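Solution. A vector parametric form of the curve is

$$
\mathbf{r}(t) = \langle a\cos t, b\sin t\rangle, \quad 0 \le t \le \frac{\pi}{2},
$$

and $\mathbf{r}'(t) = \langle -a\sin t, b\cos t\rangle$. The vector field is F = ⟨−y, x⟩, so

$$
\begin{aligned}
\int_C x\,dy - y\,dx &= \int_C \mathbf{F}\cdot\mathbf{r}'(t)\,dt = \int_C \langle -y, x\rangle \cdot \langle -a\sin t, b\cos t\rangle\,dt \\
&= \int_0^{\pi/2} \langle -b\sin t, a\cos t\rangle \cdot \langle -a\sin t, b\cos t\rangle\,dt \\
&= \int_0^{\pi/2} \bigl[(-b\sin t)(-a\sin t) + (a\cos t)(b\cos t)\bigr]\,dt \\
&= \int_0^{\pi/2} (ab\cos^2 t + ab\sin^2 t)\,dt = \frac{\pi}{2}ab.
\end{aligned}
$$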

Example 4.2.3. Find the line integral $\int_C x^2\,dx - xy\,dy$, where C consists of the line segment C1 from the
point (1, 0) to the point (0, 0) followed by the vertical line segment C2 from the point (0, 0) to (0, 1).

Solution. The path and the vector field are shown in Figure 4.7(a). Along C1 , we have
y = 0 and dy = 0, so
$$
\int_{C_1} x^2\,dx - xy\,dy = \int_1^0 x^2\,dx = -\frac{1}{3}.
$$

Figure 4.7: Work integral examples, path dependence: (a) the path of Example 4.2.3, (b) the path of Example 4.2.4.

Along C2, x = 0, so dx = 0 and

$$
\int_{C_2} x^2\,dx - xy\,dy = 0.
$$

Altogether, we have

$$
\int_C x^2\,dx - xy\,dy = \int_{C_1} x^2\,dx - xy\,dy + \int_{C_2} x^2\,dx - xy\,dy = -\frac{1}{3}.
$$

Example 4.2.4. Find the work done by the force field $\mathbf{F}(x, y) = x^2\mathbf{i} - xy\,\mathbf{j}$ acting on a particle moving
along the quarter-circle $\mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j}$ for $0 \le t \le \frac{\pi}{2}$.

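Solution. The path and the vector field are shown in Figure 4.7(b). Since x = cos t and
y = sin t, we have

$$
\mathbf{F}(x, y) = \langle x^2, -xy\rangle = \langle \cos^2 t, -\cos t\sin t\rangle
$$

and $\frac{d\mathbf{r}(t)}{dt} = \langle -\sin t, \cos t\rangle$. Therefore, the work done is

$$
\begin{aligned}
\int_C \mathbf{F}\cdot d\mathbf{r} &= \int_0^{\pi/2} \langle \cos^2 t, -\cos t\sin t\rangle \cdot \langle -\sin t, \cos t\rangle\,dt \\
&= \int_0^{\pi/2} (-2\cos^2 t\sin t)\,dt = \left[\frac{2\cos^3 t}{3}\right]_0^{\pi/2} = -\frac{2}{3}.
\end{aligned}
$$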

Note. The two previous examples both compute the work done by the same force field,
$x^2\mathbf{i} - xy\,\mathbf{j}$, between the same two points, but reach different answers. This shows that
the work done by the same vector field may be different when calculated along differ-
ent routes. This is not true of so-called conservative force fields, such as gravity, where
the work done is the same, no matter what path is taken, so long as the starting point
and ending point are fixed. The next example with a conservative force field illustrates
this point.
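The path dependence seen in Examples 4.2.3 and 4.2.4 can also be observed numerically. The sketch below approximates each line integral by a midpoint Riemann sum of F ⋅ Δr, in the spirit of Definition 4.2.2. The helper name `line_integral` and the number of subdivisions are choices made for this illustration, not part of the text.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Approximate the integral of F . dr along r(t), a <= t <= b.

    Each small piece contributes F(midpoint) . (r(t1) - r(t0)).
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y):
    return (x * x, -x * y)   # the field x^2 i - xy j

def seg1(t):                 # segment from (1, 0) to (0, 0)
    return (1.0 - t, 0.0)

def seg2(t):                 # segment from (0, 0) to (0, 1)
    return (0.0, t)

def quarter(t):              # quarter-circle from (1, 0) to (0, 1)
    return (math.cos(t), math.sin(t))

W_segments = line_integral(F, seg1, 0.0, 1.0) + line_integral(F, seg2, 0.0, 1.0)
W_arc = line_integral(F, quarter, 0.0, math.pi / 2)
print(W_segments, W_arc)     # close to -1/3 and -2/3 respectively
```

Both paths start at (1, 0) and end at (0, 1), yet the two approximations differ, in agreement with the hand computations.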

Example 4.2.5. Evaluate the line integral $\int_C xy\,dx + \frac{x^2}{2}\,dy$ between O(0, 0) and B(1, 1) along the following
curves:
(1) the line segment from O(0, 0) to A(1, 0), and then to B(1, 1).
(2) the line segment from O(0, 0) to B(1, 1).
(3) an arc of the parabola $x = y^2$.

Figure 4.8: Work integral examples, path independence.

Solution. The vector field and the paths are shown in Figure 4.8.
(1) Along the line segment from O(0, 0) to A(1, 0), y = 0 and dy = 0, and then

$$
\int_{OA} xy\,dx + \frac{x^2}{2}\,dy = 0.
$$

Along the line segment from A(1, 0) to B(1, 1), x = 1 and dx = 0. Thus

$$
\int_{AB} xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 \frac{1^2}{2}\,dy = \frac{1}{2}.
$$

All together,

$$
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_{OA} xy\,dx + \frac{x^2}{2}\,dy + \int_{AB} xy\,dx + \frac{x^2}{2}\,dy = 0 + \frac{1}{2} = \frac{1}{2}.
$$

(2) Along the line segment from O(0, 0) to B(1, 1), a parametric form is x = x, y = x for
0 ≤ x ≤ 1, so

$$
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 x^2\,dx + \frac{x^2}{2}\,dx = \int_0^1 \frac{3x^2}{2}\,dx = \frac{1}{2}.
$$

(3) Along the parabola $x = y^2$, a parametric form is $x = y^2$, y = y for 0 ≤ y ≤ 1, so

$$
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 y^3\,d(y^2) + \frac{y^4}{2}\,dy = \int_0^1 \frac{5y^4}{2}\,dy = \frac{1}{2}.
$$

Note. In this example, you can see that the line integral of the vector field $\langle xy, \frac{x^2}{2}\rangle$
from (0, 0) to (1, 1) is the same along three different curves. In fact it would be the
same along any curve joining the two points. This is an example of a conservative
vector field. We will discuss this kind of field in more detail in the next section.
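As a numerical cross-check of Example 4.2.5, the sketch below approximates the line integral of ⟨xy, x²/2⟩ along the three paths with a midpoint Riemann sum. The helper `line_integral` and the path function names are assumptions introduced for this illustration.

```python
def line_integral(F, r, a, b, n=20000):
    """Midpoint Riemann sum for the integral of F . dr along r(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y):
    return (x * y, x * x / 2)

def OA(t):        # from O(0, 0) to A(1, 0)
    return (t, 0.0)

def AB(t):        # from A(1, 0) to B(1, 1)
    return (1.0, t)

def diagonal(t):  # straight segment from O to B
    return (t, t)

def parabola(t):  # arc of x = y^2 from O to B
    return (t * t, t)

I1 = line_integral(F, OA, 0.0, 1.0) + line_integral(F, AB, 0.0, 1.0)
I2 = line_integral(F, diagonal, 0.0, 1.0)
I3 = line_integral(F, parabola, 0.0, 1.0)
print(I1, I2, I3)   # all three are close to 0.5
```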

Line integral of a vector field in ℝ³


The work integral $\int_C (\mathbf{F}\cdot\mathbf{T})\,ds = \int_C f\,dx + g\,dy$ can be generalized to a three-dimensional
vector field F = ⟨f, g, h⟩, where f = f(x, y, z), g = g(x, y, z), and h = h(x, y, z) are three
piecewise continuous functions, and C is a piecewise smooth curve in space. In this
case, we have

$$
\int_C (\mathbf{F}\cdot\mathbf{T})\,ds = \int_C \mathbf{F}\cdot d\mathbf{r} = \int_C f\,dx + g\,dy + h\,dz.
$$

Example 4.2.6. Evaluate $\int_C (y - x)\,dx + x\,dy + (x + z)\,dz$, where C consists of the line segment C1 from
(2, 0, 0) to (3, 4, 5) followed by the line segment C2 from (3, 4, 5) to (−1, 4, 1).

Solution. The line segments C1 and C2 can be written parametrically for 0 ≤ t ≤ 1 as

C1 : r1 (t) = (1 − t)⟨2, 0, 0⟩ + t⟨3, 4, 5⟩ = ⟨2 + t, 4t, 5t⟩,


C2 : r2 (t) = (1 − t)⟨3, 4, 5⟩ + t⟨−1, 4, 1⟩ = ⟨3 − 4t, 4, 5 − 4t⟩.

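Thus,

$$
\begin{aligned}
\int_{C_1 + C_2} (y - x)\,dx + x\,dy + (x + z)\,dz
&= \int_0^1 (4t - (2 + t))\,dt + (2 + t)\,d(4t) + (2 + t + 5t)\,d(5t) \\
&\quad + \int_0^1 (4 - (3 - 4t))\,d(3 - 4t) + (3 - 4t)\,d(4) + (3 - 4t + 5 - 4t)\,d(5 - 4t) \\
&= \int_0^1 (16 + 37t)\,dt + \int_0^1 (-36 + 16t)\,dt \\
&= \int_0^1 (-20 + 53t)\,dt = \left[-20t + \frac{53t^2}{2}\right]_0^1 = \frac{13}{2}.
\end{aligned}
$$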
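Since the arithmetic in Example 4.2.6 is easy to get wrong, a numerical cross-check may help. The sketch below sums F ⋅ Δr over both segments using the parameterizations r1 and r2 given in the solution; under these parameterizations the total comes out to 13/2. The helper `line_integral` is an assumption introduced for this illustration.

```python
def line_integral(F, r, a, b, n=2000):
    """Midpoint Riemann sum for the integral of F . dr along r(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y, z):
    return (y - x, x, x + z)

def r1(t):   # segment from (2, 0, 0) to (3, 4, 5)
    return (2 + t, 4 * t, 5 * t)

def r2(t):   # segment from (3, 4, 5) to (-1, 4, 1)
    return (3 - 4 * t, 4.0, 5 - 4 * t)

part1 = line_integral(F, r1, 0.0, 1.0)   # integral of (16 + 37t) dt
part2 = line_integral(F, r2, 0.0, 1.0)   # integral of (-36 + 16t) dt
total = part1 + part2
print(part1, part2, total)
```

Because both the paths and the field are linear, the midpoint sum is exact here up to rounding error.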

4.3 The fundamental theorem of line integrals


As you may have already noted in Example 4.2.5, the line integral of a vector field from
point A to point B may not depend on the path that connects A and B. The property
is called path independence, and its formal definition is given below. A conservative
field has this property. For example, we know that the gravitational force field is con-
servative and path independent.

Definition 4.3.1 (Path independence). Let F = ⟨f , g⟩ be a vector field defined on a region D in ℝ2 . Let A
and B be any two points in D, and C be any path with endpoints A and B. If the value of the line integral
∫C fdx + gdy is independent of the path that connects A and B, then we say the line integral ∫C fdx + gdy
is path-independent and that the vector field F is path-independent in D.
4.3 The fundamental theorem of line integrals | 209

(a) (b)

Figure 4.9: Path independence.

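From the definition of path independence, we know that if F = ⟨f, g⟩ is path-independent
in D, then for any given two points A and B, where C1 is a path (curve) in
D from A to B and C2 is any other path (curve) in D from A to B (see Figure 4.9(a)), we
must have

$$
\int_{C_1} \mathbf{F}\cdot d\mathbf{r} = \int_{C_2} \mathbf{F}\cdot d\mathbf{r}.
$$

This means

$$
0 = \int_{C_1} \mathbf{F}\cdot d\mathbf{r} - \int_{C_2} \mathbf{F}\cdot d\mathbf{r} = \int_{C_1} \mathbf{F}\cdot d\mathbf{r} + \int_{-C_2} \mathbf{F}\cdot d\mathbf{r} = \oint_{C_1 + (-C_2)} \mathbf{F}\cdot d\mathbf{r}.
$$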

Note that C1 +(−C2 ) is a closed curve in D. (We say a parameterized curve r(t), a ≤ t ≤ b,
is closed if r(a) = r(b)).
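On the other hand, if F is defined in D and

$$
\oint_C \mathbf{F}\cdot d\mathbf{r} = 0
$$

for any closed curve C in D, then we claim that the vector field F must be path-independent
in D. This is because for any two points A and B in D, and any two
different paths C1 and C2 from A to B, the curve C = C1 + (−C2) is closed, so we have

$$
\begin{aligned}
0 = \oint_C \mathbf{F}\cdot d\mathbf{r} &= \int_{C_1} \mathbf{F}\cdot d\mathbf{r} + \int_{-C_2} \mathbf{F}\cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{F}\cdot d\mathbf{r} - \int_{C_2} \mathbf{F}\cdot d\mathbf{r}.
\end{aligned}
$$

This means

$$
\int_{C_1} \mathbf{F}\cdot d\mathbf{r} = \int_{C_2} \mathbf{F}\cdot d\mathbf{r}.
$$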
We summarize the above arguments in the following theorem.

Theorem 4.3.1. F = ⟨f, g⟩ is path-independent in D if and only if for any closed curve C in D, we have

$$
\oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C f\,dx + g\,dy = 0.
$$

But how do we know that a line integral is path-independent? Recall the fundamental
theorem of calculus

$$
\int_a^b f'(x)\,dx = f(b) - f(a).
$$

The integration of the rate of change of a function is equal to the net change of the
function over the interval [a, b]. For functions of two or more variables, we have seen
that the gradient of a function plays much the same role as the derivative of a function
does for functions of one variable. We consider the integral ∫C ∇φ ⋅ dr. This means the
vector field F is the gradient of some scalar function φ. This indeed works, and we now
state the fundamental theorem of line integrals.

Theorem 4.3.2. Let φ(x, y) be a differentiable function and F = ∇φ. Suppose that C is any curve that
has a parameterization r(t) = ⟨x(t), y(t)⟩, a ≤ t ≤ b, where r(a) and r(b) correspond to the points A
and B, respectively. Then F is path-independent and

$$
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_C \nabla\varphi\cdot d\mathbf{r} = \varphi(B) - \varphi(A).
$$

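Proof. Since $\nabla\varphi = \left\langle \frac{\partial\varphi}{\partial x}, \frac{\partial\varphi}{\partial y}\right\rangle$ and $\frac{d\mathbf{r}}{dt} = \left\langle \frac{dx}{dt}, \frac{dy}{dt}\right\rangle$, the chain rule gives

$$
\begin{aligned}
\int_C \nabla\varphi\cdot d\mathbf{r} &= \int_C \nabla\varphi\cdot\frac{d\mathbf{r}}{dt}\,dt = \int_C \left\langle \frac{\partial\varphi}{\partial x}, \frac{\partial\varphi}{\partial y}\right\rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt}\right\rangle dt \\
&= \int_a^b \left(\frac{\partial\varphi}{\partial x}\frac{dx}{dt} + \frac{\partial\varphi}{\partial y}\frac{dy}{dt}\right) dt \\
&= \int_a^b \left(\frac{d\varphi}{dt}\right) dt = \varphi(x(b), y(b)) - \varphi(x(a), y(a)) \\
&= \varphi(B) - \varphi(A).
\end{aligned}
$$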

This completes the proof.



Due to the fundamental theorem of line integrals, we now give the definition of a
conservative vector field and a potential function of the vector field F.

Definition 4.3.2 (Conservative field and potential function). A vector field F is called conservative if it
is the gradient of a scalar function φ, that is, F = ∇φ. Such a φ is called a potential function of the
vector field F.

Note that if F = ⟨f, g⟩ = ∇φ, then

$$
f\,dx + g\,dy = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy = d\varphi.
$$

This means

F = ⟨f, g⟩ is conservative ⇐⇒ there is a function φ(x, y) such that F = ∇φ
⇐⇒ there is a function φ(x, y) such that f dx + g dy = dφ.

Also in this case, we can write

$$
\int_C f\,dx + g\,dy = \int_C \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy = \int_C d\varphi(x, y) = \varphi(B) - \varphi(A).
$$

Therefore, if we can find a potential function for the vector field F, then we can evaluate
a work integral along a curve C by directly finding the difference of the potential
function at the initial and terminal points of the curve C. In this case we also write

$$
\int_C f\,dx + g\,dy = \int_A^B f\,dx + g\,dy
$$

to indicate that the line integral is path-independent.

Example 4.3.1. Note from Example 4.2.5 that

$$
xy\,dx + \frac{x^2}{2}\,dy = d\left(\frac{x^2 y}{2}\right),
$$

so $\frac{x^2 y}{2}$ is a potential function of $\langle xy, \frac{x^2}{2}\rangle$. Therefore, the line integral $\int_C xy\,dx + \frac{x^2}{2}\,dy$ from A(0, 0) to
B(1, 1) along any curve C is obtained by

$$
\int_C xy\,dx + \frac{x^2}{2}\,dy = \left.\frac{x^2 y}{2}\right|_{(0,0)}^{(1,1)} = \frac{1^2\cdot 1}{2} - \frac{0^2\cdot 0}{2} = \frac{1}{2}.
$$

We now establish the famous principle of conservation of energy. Let us assume that a
continuous force field F acts on an object of mass m moving along a path C. The path C is given
parametrically by r(t), a ≤ t ≤ b, where r(a) = A is the initial point and r(b) = B is the
terminal point of C. By Newton’s second law of motion, the relation between the force
F at a point on C and the acceleration, a(t) or r''(t), is given by

$$
\mathbf{F}(\mathbf{r}(t)) = m\,\mathbf{r}''(t).
$$

The line integral giving the work done by the force acting on the object while it is
moving along C from A to B is

$$
\begin{aligned}
W = \int_C \mathbf{F}\cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt \\
&= \int_a^b m\,\mathbf{r}''(t)\cdot\mathbf{r}'(t)\,dt \\
&= \frac{m}{2}\int_a^b \frac{d}{dt}\bigl[\mathbf{r}'(t)\cdot\mathbf{r}'(t)\bigr]\,dt \\
&= \frac{m}{2}\int_a^b \frac{d}{dt}\bigl|\mathbf{r}'(t)\bigr|^2\,dt \\
&= \frac{m}{2}\Bigl[\bigl|\mathbf{r}'(t)\bigr|^2\Bigr]_a^b \\
&= \frac{m}{2}\Bigl(\bigl|\mathbf{r}'(b)\bigr|^2 - \bigl|\mathbf{r}'(a)\bigr|^2\Bigr).
\end{aligned}
$$

Therefore, if we write v = r'(t) (the velocity vector), then

$$
W = \frac{1}{2}m\bigl|\mathbf{v}(b)\bigr|^2 - \frac{1}{2}m\bigl|\mathbf{v}(a)\bigr|^2.
$$

The quantity $\frac{1}{2}m|\mathbf{v}(t)|^2$ is called the kinetic energy of the object and is denoted by K(t).
Writing K(A) = K(a) and K(B) = K(b) for its values at the endpoints, we can rewrite the above equation as

$$
W = K(B) - K(A).
$$

This means that the work done by the force acting on the object as it moves along C is
equal to the change in kinetic energy between t = a and t = b, regardless of the path
that C takes between A and B.
Now, if F is also a conservative force field, then we have F = ∇φ for some function
φ. Note that in physics, the potential energy P of an object at the point (x, y, z) is defined
as the negative of a potential function, i. e., P(x, y, z) = −φ(x, y, z). Applying this here,
we have F = −∇P, and then the work done in moving along C from A to B can be written

$$
\begin{aligned}
W = \int_C \mathbf{F}\cdot d\mathbf{r} &= -\int_C \nabla P\cdot d\mathbf{r} \\
&= -\int_a^b \left(\frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}\right) dt \\
&= -\int_a^b \frac{d}{dt}\bigl(P(x(t), y(t), z(t))\bigr)\,dt \\
&= -\bigl[P(\mathbf{r}(b)) - P(\mathbf{r}(a))\bigr] \\
&= P(A) - P(B).
\end{aligned}
$$

Comparing this equation with the previous expression in terms of kinetic energy we
see that

$$
P(A) + K(A) = P(B) + K(B).
$$
This means that if an object moves from point A to point B under the influence of a
conservative force field, then the sum of its potential energy and its kinetic energy
remains constant. That is called the law of conservation of (mechanical) energy, and it
is why such a vector field is called conservative.
By the fundamental theorem of line integrals, we know that a conservative field
in a region D must also be path-independent in D. Conversely, if a vector field is path-
independent in D, do we know whether it is conservative? If the region D is open and
connected, the answer is yes. We say a region D is connected if for any two points in D
there exists a line/curve that lies entirely in D and connects the two points. We have
the following theorem.

Theorem 4.3.3. If D is an open and connected region, and a continuous vector field F = ⟨f , g⟩ is path-
independent in D, then F is also conservative in D. That is, there exists a function φ(x, y) such that
F = ∇φ.

Proof. We construct such a φ in the following way.

Suppose A(a, b) is a fixed point in D and P(x, y) is any point in D. Since F = ⟨f, g⟩
is path-independent in D, the function

$$
\varphi(x, y) = \int_{(a,b)}^{(x,y)} \mathbf{F}\cdot d\mathbf{r}
$$

is independent of the path that connects A(a, b) and the point P(x, y). We are going to
show that

$$
\nabla\varphi(x, y) = \mathbf{F}, \quad\text{or equivalently,}\quad \frac{\partial\varphi}{\partial x} = f \ \text{ and } \ \frac{\partial\varphi}{\partial y} = g.
$$

Since D is open, there exists a disk centered at (x, y) that lies entirely in D. We choose
a point $B(x_0, y)$ that is also in this disk, and then $B(x_0, y)$ and P(x, y) are the two endpoints
of the horizontal line segment BP. As shown in Figure 4.9(b), since D is connected,
there is a path C1 in D that connects A and B, and

$$
\varphi(x, y) = \int_{(a,b)}^{(x,y)} \mathbf{F}\cdot d\mathbf{r} = \int_{A(a,b)}^{B(x_0,y)} \mathbf{F}\cdot d\mathbf{r} + \int_{B(x_0,y)}^{P(x,y)} \mathbf{F}\cdot d\mathbf{r}.
$$

Note that $\int_{A(a,b)}^{B(x_0,y)} \mathbf{F}\cdot d\mathbf{r}$ is independent of x and along the line segment BP, dy = 0. Therefore,

$$
\varphi(x, y) = \int_{A(a,b)}^{B(x_0,y)} \mathbf{F}\cdot d\mathbf{r} + \int_{x_0}^{x} f(t, y)\,dt, \quad\text{where we changed the dummy variable to } t,
$$

and, by the fundamental theorem of calculus,

$$
\frac{\partial\varphi}{\partial x} = 0 + f(x, y) = f(x, y).
$$

In a very similar way, with the aid of a vertical line segment, we could prove

$$
\frac{\partial\varphi}{\partial y} = g(x, y).
$$

Therefore, φ(x, y) is a function such that F = ∇φ, so F is conservative.

Note. If we choose the point A(a, b) such that AP lies in a disk centered at P(x, y) and
lies in D, then by integration along a horizontal line segment followed by a vertical
line segment, we would have a nice formula for finding a potential function,

$$
\varphi(x, y) = \int_a^x f(t, b)\,dt + \int_b^y g(x, t)\,dt, \tag{4.7}
$$

or by integration along a vertical line segment first followed by a horizontal one, we
have

$$
\varphi(x, y) = \int_b^y g(a, t)\,dt + \int_a^x f(t, y)\,dt. \tag{4.8}
$$

Now, there are still some questions. For example, how do we know whether a po-
tential function exists? Observing again, if

F = ⟨f , g⟩ = ∇φ = ⟨φx , φy ⟩,

then $\varphi_x = f$ and $\varphi_y = g$. If f and g both have continuous partial derivatives, then we
would have

$$
f_y = \varphi_{xy} = \varphi_{yx} = g_x.
$$

This means that if f and g both have continuous partial derivatives, then a necessary
condition for F to be a conservative field is

$$
\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}.
$$

Remarkably, it turns out this is also a sufficient condition if the region D is simply
connected, and the curve C is simple and closed. This is proved in the next section by
using Green’s theorem. We now adopt this fact and demonstrate how to find a potential
function in two different ways.

Example 4.3.2. Given a vector field $\mathbf{F} = \langle xy^2, x^2 y + y\rangle$, find a function φ(x, y) such that $\mathbf{F} = \nabla\varphi =
\langle xy^2, x^2 y + y\rangle$.

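Solution. We first note that

$$
\frac{\partial(x^2 y + y)}{\partial x} = 2xy = \frac{\partial(xy^2)}{\partial y}.
$$

So there may exist a potential function φ.

Method 1: If $\mathbf{F} = \nabla\varphi = \langle xy^2, x^2 y + y\rangle$, then we must have

$$
\frac{\partial\varphi}{\partial x} = xy^2 \quad\text{and}\quad \frac{\partial\varphi}{\partial y} = x^2 y + y.
$$

From $\frac{\partial\varphi}{\partial x} = xy^2$, we integrate with respect to x to obtain

$$
\varphi = \int xy^2\,dx = \frac{x^2 y^2}{2} + C(y).
$$

We need to remember that the integration is with respect to x, so the arbitrary constant
may be a function of y. To determine C(y), we take the partial derivative with respect
to y to obtain

$$
\frac{\partial\varphi}{\partial y} = \frac{\partial}{\partial y}\left(\frac{x^2 y^2}{2} + C(y)\right) = x^2 y + C'(y); \quad\text{but we also have}\quad \frac{\partial\varphi}{\partial y} = x^2 y + y.
$$

Thus,

$$
x^2 y + C'(y) = x^2 y + y.
$$

So, $C'(y) = y$. Then, $C(y) = \frac{y^2}{2} + C$, where C is an arbitrary constant. Now we can
conclude that

$$
\varphi(x, y) = \frac{x^2 y^2}{2} + \frac{y^2}{2} + C.
$$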
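Method 2: We can also try either of equation (4.7) or equation (4.8). Then we have

$$
\begin{aligned}
\varphi(x, y) &= \int_a^x f(t, b)\,dt + \int_b^y g(x, t)\,dt \\
&= \int_a^x (tb^2)\,dt + \int_b^y (x^2 t + t)\,dt = \left[\frac{t^2 b^2}{2}\right]_a^x + \left[\frac{x^2 t^2}{2} + \frac{t^2}{2}\right]_b^y \\
&= \frac{x^2 b^2}{2} - \frac{a^2 b^2}{2} + \frac{x^2 y^2}{2} + \frac{y^2}{2} - \frac{x^2 b^2}{2} - \frac{b^2}{2} \\
&= \frac{x^2 y^2}{2} + \frac{y^2}{2} + C,
\end{aligned}
$$

where the constant is $C = -\frac{a^2 b^2}{2} - \frac{b^2}{2}$.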
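Formula (4.7) can also be tried numerically. The following sketch evaluates the two one-dimensional integrals in (4.7) with a simple midpoint rule for the field of Example 4.3.2 and compares the result with the closed form x²y²/2 + y²/2 found by hand. The function names `integrate`, `potential`, and `phi_closed` and the sample points are assumptions made for this illustration.

```python
def integrate(func, lo, hi, n=2000):
    """Midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return x * y * y          # first component, xy^2

def g(x, y):
    return x * x * y + y      # second component, x^2 y + y

def potential(x, y, a=0.0, b=0.0):
    """Formula (4.7): horizontal segment from (a, b), then vertical segment to (x, y)."""
    horizontal = integrate(lambda t: f(t, b), a, x)
    vertical = integrate(lambda t: g(x, t), b, y)
    return horizontal + vertical

def phi_closed(x, y):
    return x * x * y * y / 2 + y * y / 2

v0 = potential(1.5, 2.0)                 # base point (0, 0)
v1 = potential(1.5, 2.0, a=1.0, b=1.0)   # base point (1, 1)
print(v0, phi_closed(1.5, 2.0))
print(v1, phi_closed(1.5, 2.0) - phi_closed(1.0, 1.0))
```

Changing the base point (a, b) only shifts the result by a constant, which matches the arbitrary constant C in the hand computation.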

4.4 Green’s theorem: circulation-curl form


4.4.1 Positively oriented simple curves and simply connected regions

A simple curve is a curve which does not intersect itself except possibly at its end-
points. The set D ⊂ ℝ2 is connected if for any two points in D there exists a line (curve)
that lies entirely in D and connects the two points; D is called simply connected if for
every simple (i. e., nonself-intersecting) closed curve C composed of points of D, the
region inside of C is also part of D, that is, D has no holes and does not consist of
separate parts. In other words, one can continuously shrink any simple closed curve
to a point while remaining in the domain. Figure 4.10 shows some such curves and
regions. The boundary curve of a region D has a positive orientation if, as you walk
along the boundary, the region D is on your left-hand side. Figure 4.11 shows positive
orientation for two connected regions.

Figure 4.10: Simple curve and simply connected region.

(a) (b)

Figure 4.11: Positive orientation.


4.4 Green’s theorem: circulation-curl form | 217

4.4.2 Circulation around a closed curve

The line integral ∫C F⋅dr along an oriented curve C “adds up” the component of the
vector field that is tangent to the curve C. In this sense, the line integral measures how
much the vector field is aligned with the curve. If the curve C is a closed curve, then the
line integral indicates how much the vector field tends to circulate around the curve C.
We call the line integral the “circulation” of F around C, i. e.,

$$
\text{circulation integral} = \oint_C (\mathbf{F}\cdot\mathbf{T})\,ds = \oint_C f\,dx + g\,dy. \tag{4.9}
$$

Note. The symbol “∮C ” is used to indicate a line integral along a closed oriented
curve C.

Example 4.4.1. Investigate the circulation integral

$$
\oint_C \frac{x\,dy - y\,dx}{x^2 + y^2}
$$

for any circle C with center at the origin and radius R > 0 with counterclockwise orientation.

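Solution. Figure 4.12 shows the vector field and the circle. Note that F = ⟨f, g⟩ with
$f = -\frac{y}{x^2 + y^2}$, $g = \frac{x}{x^2 + y^2}$, and C has a parametric form $\mathbf{r}(t) = \langle R\cos t, R\sin t\rangle$, 0 ≤ t ≤ 2π. Then

$$
\begin{aligned}
\oint_C -\frac{y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy &= \oint_C \left\langle -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2}\right\rangle \cdot \mathbf{r}'(t)\,dt \\
&= \int_0^{2\pi} \left\langle -\frac{R\sin t}{R^2}, \frac{R\cos t}{R^2}\right\rangle \cdot \langle -R\sin t, R\cos t\rangle\,dt \\
&= \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt \\
&= \int_0^{2\pi} 1\,dt = 2\pi.
\end{aligned}
$$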
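The fact that the circulation in Example 4.4.1 does not depend on the radius can be checked numerically. The sketch below approximates the circulation integral on circles of several radii with a midpoint Riemann sum; the helper `line_integral` and the chosen radii are assumptions for this illustration.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Midpoint Riemann sum for the integral of F . dr along r(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y):
    d = x * x + y * y
    return (-y / d, x / d)

def circle(R):
    # counterclockwise circle of radius R centered at the origin
    return lambda t: (R * math.cos(t), R * math.sin(t))

circulations = [line_integral(F, circle(R), 0.0, 2 * math.pi) for R in (0.5, 1.0, 3.0)]
print(circulations, 2 * math.pi)
```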

4.4.3 Circulation density

We first introduce the concept of circulation density. As shown in Figure 4.13, we con-
sider the counterclockwise circulation along the rectangle. We start with writing the
following:

Figure 4.12: Circulation integral, Example 4.4.1.

Figure 4.13: Circulation density, the curl.

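the bottom circulation is $\mathbf{F}(x, y - \Delta y)\cdot\mathbf{i}\times 2\Delta x = f(x, y - \Delta y)\times 2\Delta x$,
the top circulation is $\mathbf{F}(x, y + \Delta y)\cdot(-\mathbf{i})\times 2\Delta x = -f(x, y + \Delta y)\times 2\Delta x$,
the right circulation is $\mathbf{F}(x + \Delta x, y)\cdot\mathbf{j}\times 2\Delta y = g(x + \Delta x, y)\times 2\Delta y$,
the left circulation is $\mathbf{F}(x - \Delta x, y)\cdot(-\mathbf{j})\times 2\Delta y = -g(x - \Delta x, y)\times 2\Delta y$.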
So the total counterclockwise circulation along the rectangle is

f (x, y − Δy) × 2Δx − f (x, y + Δy) × 2Δx + g(x + Δx, y) × 2Δy − g(x − Δx, y) × 2Δy,

which we can rearrange to obtain

(g(x + Δx, y) − g(x − Δx, y))2Δy − (f (x, y + Δy) − f (x, y − Δy))2Δx.

The density over the rectangle is, therefore,

$$
\frac{(g(x + \Delta x, y) - g(x - \Delta x, y))\,2\Delta y - (f(x, y + \Delta y) - f(x, y - \Delta y))\,2\Delta x}{2\Delta x \times 2\Delta y}.
$$

Simplifying this, we have

$$
\frac{g(x + \Delta x, y) - g(x - \Delta x, y)}{2\Delta x} - \frac{f(x, y + \Delta y) - f(x, y - \Delta y)}{2\Delta y}.
$$

Taking the limit as Δx → 0 and Δy → 0, and noting that $\frac{f(x + h) - f(x - h)}{2h} \to f'(x)$ as h → 0,
we obtain the circulation density at (x, y)

$$
\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}. \tag{4.10}
$$
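Then an integration of $\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$ over the region D will give a cumulated circulation
effect on the boundary of the region D, i. e.,

$$
\oint_C f\,dx + g\,dy = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma.
$$

This is exactly Green’s theorem. This is illustrated in Figure 4.14.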


The term $\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$ is called the k-component of the curl of the vector field F. The curl
is a vector relating to the rotational effect of F as shown in Figure 4.15. We will discuss
the curl vector again in a later section.
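To see the circulation density emerge from the four-side computation, the sketch below applies the rectangle formula to a sample field and compares it with ∂g/∂x − ∂f/∂y. The field ⟨−y², x³⟩, the evaluation point, and the function names are hypothetical choices made for this illustration only.

```python
def f(x, y):
    return -y * y        # hypothetical first component

def g(x, y):
    return x ** 3        # hypothetical second component

def circulation_density(x, y, dx, dy):
    """Circulation around the rectangle [x-dx, x+dx] x [y-dy, y+dy] divided by its area."""
    bottom = f(x, y - dy) * 2 * dx     # moving in the +i direction
    top = -f(x, y + dy) * 2 * dx       # moving in the -i direction
    right = g(x + dx, y) * 2 * dy      # moving in the +j direction
    left = -g(x - dx, y) * 2 * dy      # moving in the -j direction
    return (bottom + top + right + left) / (2 * dx * 2 * dy)

def curl_k(x, y):
    # dg/dx - df/dy computed by hand for this particular field
    return 3 * x * x + 2 * y

approx = circulation_density(1.0, 2.0, 1e-3, 1e-3)
exact = curl_k(1.0, 2.0)
print(approx, exact)
```

For this field the rectangle quotient equals 3x² + 2y + (Δx)², so the approximation approaches the exact density as the rectangle shrinks.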

4.4.4 Green’s theorem: circulation-curl form

Recall the fundamental theorem of calculus

$$
\int_a^b f'(x)\,dx = f(b) - f(a),
$$

which says that the integration of the rate of change of a function is related to the value
of the function at the boundary points of the integration interval. There is indeed a
similar relationship between the integration of rate of change of some functions on
the region that is bounded by a closed curve and a line integral along the curve. The
remarkable Green’s theorem is, therefore, sometimes called the fundamental theorem
of calculus for double integrals, as it reveals the relationship between a double integral
over the planar region D and a line integral on the boundary of D.
With the ideas of circulation integral and circulation density, we now state the
great Green’s theorem and give a partial proof of it.

Theorem 4.4.1 (Green’s theorem: circulation or tangential form). Let L be a piecewise smooth simple
closed curve in ℝ² having positive orientation (the interior is on the left as you travel around L), and let
the region D inside of L be simply connected. Let $\mathbf{F} = f\,\mathbf{i} + g\,\mathbf{j}$ be a vector field for which f and g have
continuous partial derivatives in a region containing L and D. Then

$$
\oint_L f(x, y)\,dx + g(x, y)\,dy = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma. \tag{4.11}
$$

Figure 4.14: Green’s theorem: the circulation-curl form.

Figure 4.15: The curl vector, axis of rotation.

Verifying Green’s theorem


We first give the following example to demonstrate that the two integrals in Green’s
theorem are equal.

Example 4.4.2. For the vector field F = ⟨−y, x⟩ and the closed curve $x^2 + y^2 = 1$, evaluate both integrals
in Green’s theorem and check that they are equal.

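Solution. Since f = −y, g = x, and the circle has a parametric representation
$\mathbf{r}(t) = \langle \cos t, \sin t\rangle$, 0 ≤ t ≤ 2π, with $\mathbf{r}'(t) = \langle -\sin t, \cos t\rangle$, the line integral is

$$
\begin{aligned}
\oint_C f\,dx + g\,dy &= \oint_C \langle -y, x\rangle \cdot \mathbf{r}'(t)\,dt \\
&= \int_0^{2\pi} \langle -\sin t, \cos t\rangle \cdot \langle -\sin t, \cos t\rangle\,dt \\
&= \int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt = 2\pi.
\end{aligned}
$$

The double integral is

$$
\begin{aligned}
\iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma &= \iint_D \left(\frac{\partial x}{\partial x} - \frac{\partial(-y)}{\partial y}\right) d\sigma \\
&= \iint_D (1 - (-1))\,d\sigma = 2\iint_D 1\,d\sigma \\
&= 2\cdot A(D) = 2\pi.
\end{aligned}
$$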

Therefore, the two integrals in Green’s theorem are equal in this example.

Proof of Green’s theorem in a simple case


Proof. Suppose D is both a type I and a type II region, as shown in Figure 4.16, and the
type I form is

D = {(x, y)|ϕ1 (x) ≤ y ≤ ϕ2 (x), a ≤ x ≤ b},

where L1 : y = ϕ1 (x) and L2 : y = ϕ2 (x), for a ≤ x ≤ b, are two curves comprising L (see
Figure 4.16(a)).

(a) (b)

Figure 4.16: Green’s theorem: a proof in a simple case.

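Since $\frac{\partial f}{\partial y}$ is continuous, by applying the integration formula for double integrals, we
have

$$
\begin{aligned}
\iint_D \frac{\partial f}{\partial y}\,d\sigma &= \int_a^b \left\{\int_{\phi_1(x)}^{\phi_2(x)} \frac{\partial f(x, y)}{\partial y}\,dy\right\} dx \\
&= \int_a^b \{f(x, \phi_2(x)) - f(x, \phi_1(x))\}\,dx.
\end{aligned}
$$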

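On the other hand, the line integral

$$
\begin{aligned}
\oint_L f(x, y)\,dx &= \int_{L_1} f(x, y)\,dx + \int_{L_2} f(x, y)\,dx \\
&= \int_a^b f(x, \phi_1(x))\,dx + \int_b^a f(x, \phi_2(x))\,dx \\
&= -\int_a^b \{f(x, \phi_2(x)) - f(x, \phi_1(x))\}\,dx.
\end{aligned}
$$

So

$$
-\iint_D \frac{\partial f}{\partial y}\,d\sigma = \oint_L f\,dx.
$$

Since D is also a type II region of form $D = \{(x, y) \mid \psi_1(y) \le x \le \psi_2(y),\ c \le y \le d\}$, a
similar argument leads to a proof that

$$
\iint_D \frac{\partial g}{\partial x}\,d\sigma = \oint_L g\,dy.
$$

By the linearity property of integrals, we obtain the following result:

$$
\oint_L f\,dx + g\,dy = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma.
$$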

Note. If a simply connected region is not both type I and type II, then we can partition
it into several subregions which are both type I and type II. Green’s theorem can then
still be proved.

4.4.5 Applications of Green’s theorem in circulation-curl form

Determination of a conservative field


Recall that if a vector field F = ⟨f, g⟩ is path-independent in D, this means that for any
closed curve C in D we have

$$
\oint_C f\,dx + g\,dy = 0.
$$

Now, from Green’s theorem in circulation-curl form, if D is simply connected, we know
that $\oint_C f\,dx + g\,dy = 0$ for every simple closed curve C in D if and only if $\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}$ everywhere
on D. We summarize properties about a conservative field as follows.

Theorem 4.4.2. Let F = ⟨f, g⟩ be a continuous vector field defined in a simply connected region R ⊂ ℝ²,
and suppose f and g both have continuous partial derivatives. Then we have

F = ⟨f, g⟩ is conservative
⇐⇒ there is a function φ such that F = ∇φ or dφ = f dx + g dy
⇐⇒ $\int_C f\,dx + g\,dy$ is path-independent for every piecewise smooth curve C in R
⇐⇒ $\oint_C f\,dx + g\,dy = 0$ for every simple piecewise smooth closed curve C in R
⇐⇒ $\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}$ (that is, $\varphi_{xy} = \varphi_{yx}$) throughout R.

So $\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}$ is a simple criterion to determine whether or not a vector field is conservative.
In this case, we can use the method introduced in Example 4.3.2 to find a potential
function. We can also find a potential function for the vector field by the method
introduced in the following example.

Example 4.4.3. Given a vector field $\mathbf{F} = \langle e^x + y^2, 2xy\rangle$:
(1) Determine whether F is conservative.
(2) Find the line integral $\int_C \mathbf{F}\cdot d\mathbf{r}$, where C is the part of the curve $y = \sin x^2$, $0 \le x \le \sqrt{\pi}$.
(3) Find a potential function of F if it exists.

Solution. The graph of the vector field is shown in Figure 4.17(a). We now use the
simple criterion obtained from Green’s theorem to determine whether it is
conservative or not.
(1) Since f and g have continuous partial derivatives in ℝ² and

$$
\frac{\partial g}{\partial x} = \frac{\partial(2xy)}{\partial x} = 2y = \frac{\partial(e^x + y^2)}{\partial y} = \frac{\partial f}{\partial y},
$$

this field is conservative.
(2) Since the field is conservative, the line integral $\int_C \mathbf{F}\cdot d\mathbf{r}$ is path-independent. Therefore,
we will not use the original path where it is hard to evaluate the integral, and,
instead, we try the line segment from (0, 0) to $(\sqrt{\pi}, 0)$, as shown in Figure 4.17(b).
Note that along this new path, y = 0 and dy = 0. We have

$$
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_C (e^x + y^2)\,dx + 2xy\,dy = \int_0^{\sqrt{\pi}} (e^x + 0^2)\,dx + 0 = e^x\Big|_{x=0}^{x=\sqrt{\pi}} = e^{\sqrt{\pi}} - 1.
$$

(a) (b) (c)

Figure 4.17: Green’s theorem, Example 4.4.3.

(3) We could have used equation (4.7) or equation (4.8). To enhance our understanding,
we use the idea of path independence again. Assume we are going to evaluate
the line integral from (0, 0) to any point, say, (u, v). Since the line integral is
path-independent in this field, if φ(x, y) is a potential function of F, then

$$
\int_{(0,0)}^{(u,v)} (e^x + y^2)\,dx + 2xy\,dy = \varphi(u, v) - \varphi(0, 0).
$$

We now choose the line segments from (0, 0) to (u, 0) and from (u, 0) to (u, v). Note that
on the first line segment y = 0 and dy = 0, and along the second line segment x = u and dx = 0. Then

$$
\begin{aligned}
\int_{(0,0)}^{(u,v)} (e^x + y^2)\,dx + 2xy\,dy &= \int_{(0,0)}^{(u,0)} (e^x + 0^2)\,dx + 2x\cdot 0\,dy + \int_{(u,0)}^{(u,v)} (e^u + y^2)\,dx + 2uy\,dy \\
&= \int_0^u e^x\,dx + 2u\int_0^v y\,dy \\
&= e^u - 1 + u(v^2 - 0^2) \\
&= e^u + uv^2 - 1.
\end{aligned}
$$

This means $\varphi(u, v) - \varphi(0, 0) = e^u + uv^2 - 1$, so

$$
\varphi(u, v) = e^u + uv^2 - 1 + \varphi(0, 0).
$$

Since (u, v) is any point in D, we have a potential function $\varphi(x, y) = e^x + xy^2$ (note
again that any two potential functions could differ by a constant).

Note. If we switch the letters (u, v) and (x, y), then the equation

$$
\varphi(x, y) = \int_0^x f(u, 0)\,du + \int_0^y g(x, v)\,dv
$$

gives a potential function of a conservative field F = ⟨f, g⟩. Of course, the initial point
is not necessarily (0, 0); it could be any qualifying point, say, (a, b), as shown in Figure 4.17(c).
Then

$$
\varphi(x, y) = \int_a^x f(u, b)\,du + \int_b^y g(x, v)\,dv \tag{4.12}
$$

still gives a potential function of the conservative field F = ⟨f, g⟩. These are the same
as equation (4.7) and equation (4.8).
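The shortcut used in Example 4.4.3 can be sanity-checked by integrating numerically along the original curve y = sin x². The sketch below compares that approximation with the difference of the potential e^x + xy² at the endpoints. The helper `line_integral`, the function name `phi`, and the number of subdivisions are assumptions made for this illustration.

```python
import math

def line_integral(F, r, a, b, n=50000):
    """Midpoint Riemann sum for the integral of F . dr along r(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y):
    return (math.exp(x) + y * y, 2 * x * y)

def curve(t):
    # the original path y = sin(x^2), parameterized by x = t
    return (t, math.sin(t * t))

def phi(x, y):
    # potential function found in part (3)
    return math.exp(x) + x * y * y

end = math.sqrt(math.pi)
along_curve = line_integral(F, curve, 0.0, end)
from_potential = phi(*curve(end)) - phi(*curve(0.0))
print(along_curve, from_potential, math.exp(end) - 1)
```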

Finding the circulation integral


Green’s theorem in circulation-curl form offers us an alternative way to evaluate a line
integral by evaluating a double integral. This may save a lot of work, especially when
$\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$ has a simple expression and $\iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma$ is easy to compute.

Example 4.4.4. Evaluate $\int_C (x^2 - 2y)\,dx + (3xy + ye^y)\,dy$, where C is the closed triangular curve consisting
of the line segments from (0, 0) to (1, 0), from (1, 0) to (0, 1), and from (0, 1) to (0, 0).

Solution. The graph of the vector field and the triangular curve are shown in Figure 4.18.
The given line integral could be evaluated directly by integrating along each
line segment. But instead we use Green’s theorem. Note that the region D enclosed by
C is simply connected, and C has positive orientation. If we let $f(x, y) = x^2 - 2y$ and
$g(x, y) = 3xy + ye^y$, then we have

$$
\begin{aligned}
\int_C (x^2 - 2y)\,dx + (3xy + ye^y)\,dy &= \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma = \iint_D (3y - (-2))\,d\sigma \\
&= \iint_D (3y + 2)\,d\sigma = \int_0^1 dx \int_0^{1-x} (3y + 2)\,dy \\
&= \int_0^1 \left(\frac{3}{2}x^2 - 5x + \frac{7}{2}\right) dx = \frac{3}{2}.
\end{aligned}
$$
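For comparison with the Green’s theorem computation in Example 4.4.4, the sketch below evaluates the line integral directly around the three edges of the triangle and also approximates the double integral of 3y + 2, both with midpoint sums. The helper names and the numbers of subdivisions are assumptions made for this illustration.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Midpoint Riemann sum for the integral of F . dr along r(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        p0, p1 = r(t0), r(t0 + h)
        Fm = F(*r(t0 + h / 2))
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(Fm, p0, p1))
    return total

def F(x, y):
    return (x * x - 2 * y, 3 * x * y + y * math.exp(y))

# the three edges of the triangle, traversed counterclockwise
edges = [
    lambda t: (t, 0.0),          # (0, 0) -> (1, 0)
    lambda t: (1.0 - t, t),      # (1, 0) -> (0, 1)
    lambda t: (0.0, 1.0 - t),    # (0, 1) -> (0, 0)
]
circulation = sum(line_integral(F, edge, 0.0, 1.0) for edge in edges)

def double_integral(n=1000, m=50):
    """Iterated midpoint rule for the integral of (3y + 2) over the triangle."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / m
        inner = sum(3 * (j + 0.5) * hy + 2 for j in range(m)) * hy
        total += inner * hx
    return total

area_side = double_integral()
print(circulation, area_side)   # both close to 1.5
```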

Example 4.4.5. Use Green’s theorem to find $\int_C (x^2 - y)\,dx - (x + \sin^2 y)\,dy$, where C is the arc of the
circle $y = \sqrt{2x - x^2}$ from the point O(0, 0) to A(1, 1).

Solution. The graph of the vector field and the curve C are shown in Figure 4.19. Let
$f = x^2 - y$ and $g = -(x + \sin^2 y)$. Note that $\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x} = -1$ for all x and y, so the field is
path-independent. We choose another path, O to B, and then B to A, where B is the point
(1, 0).

Figure 4.18: Green’s theorem, Example 4.4.4.

(a) (b)

Figure 4.19: Green’s theorem, Example 4.4.5.

We compute the line integral over the two line segments separately, i. e.,

$$
\int_{\overrightarrow{BA}} (x^2 - y)\,dx - (x + \sin^2 y)\,dy = -\int_0^1 (1 + \sin^2 y)\,dy = \frac{1}{4}\sin 2 - \frac{3}{2}
$$

and

$$
\int_{\overrightarrow{OB}} (x^2 - y)\,dx - (x + \sin^2 y)\,dy = \int_0^1 x^2\,dx = \frac{1}{3}.
$$

Hence,

$$
\int_C (x^2 - y)\,dx - (x + \sin^2 y)\,dy = \frac{1}{4}\sin 2 - \frac{3}{2} + \frac{1}{3} = \frac{1}{4}\sin 2 - \frac{7}{6}.
$$
C

Finding areas by line integrals


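Note that the area of a region D is given by $\iint_D 1\,d\sigma$. Therefore, we could evaluate this
using Green’s theorem if we had $\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} = 1$. There are many choices of f and g that
achieve this, and some examples are (in each case, L is the boundary curve of D with
positive orientation)

when f = 0, g = x:

$$
\iint_D 1\,d\sigma = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma = \oint_L f\,dx + g\,dy = \oint_L x\,dy, \tag{4.13}
$$

when f = −y, g = 0:

$$
\iint_D 1\,d\sigma = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma = \oint_L f\,dx + g\,dy = -\oint_L y\,dx, \tag{4.14}
$$

when $f = -\frac{y}{2}$, $g = \frac{x}{2}$:

$$
\iint_D 1\,d\sigma = \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma = \oint_L f\,dx + g\,dy = \frac{1}{2}\oint_L x\,dy - y\,dx. \tag{4.15}
$$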

Example 4.4.6. Compute, using Green’s theorem, the area of the region $D \subset \mathbb{R}^2$ that lies above the
parabola $y = x^2$ and below y = 4.

Solution. Using formula (4.13) for finding areas and choosing a parameterization
of the parabola as $x = t$, $y = t^2$, $-2 \le t \le 2$, the area is

\[
\oint_L x\,dy = \int_{-2}^{2} x\,\frac{dy}{dt}\,dt = \int_{-2}^{2} t \cdot 2t\,dt = \frac{32}{3}.
\]

Note. As shown in Figure 4.20, the closed curve L also includes the flat top y = 4
of the region, but dy = 0 on this line segment, so it contributes nothing to the line integral.

Figure 4.20: Green’s theorem, Example 4.4.6.



Example 4.4.7. Find the area of the region D inside the ellipse L given parametrically by x = a cos t,
y = b sin t, 0 ≤ t ≤ 2π.

Solution. Using equation (4.15) for the area A, we obtain

\[
\begin{aligned}
A(D) = \iint_D 1\,d\sigma &= \frac{1}{2}\oint_L x\,dy - y\,dx \\
&= \frac{1}{2}\int_0^{2\pi} (a\cos t)(b\cos t)\,dt - (b\sin t)(-a\sin t)\,dt \\
&= \frac{ab}{2}\int_0^{2\pi} dt = \pi ab.
\end{aligned}
\]
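Formula (4.15) is also the basis of a simple numerical area computation. In the sketch below, the curve is replaced by a polygon through sample points; on each straight edge the integral of x dy − y dx equals x0·y1 − x1·y0 exactly, which gives the well-known shoelace formula. The function name `area_by_line_integral` and the sample sizes are illustrative assumptions.

```python
import math

def area_by_line_integral(points):
    """(1/2) * closed integral of x dy - y dx over the polygon through `points`,
    listed counterclockwise (shoelace formula)."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

# Ellipse x = a cos t, y = b sin t (Example 4.4.7), sampled counterclockwise.
a, b, n = 3.0, 2.0, 100000
ellipse = [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
           for k in range(n)]
ellipse_area = area_by_line_integral(ellipse)

# Region between y = x^2 and y = 4 (Example 4.4.6): sample the parabola from
# t = -2 to t = 2; the closing edge of the polygon is the flat top y = 4.
m = 4000
parabola = [(-2 + 4 * k / m, (-2 + 4 * k / m) ** 2) for k in range(m + 1)]
parabola_area = area_by_line_integral(parabola)

print(ellipse_area, math.pi * a * b)
print(parabola_area, 32 / 3)
```

The values a = 3 and b = 2 are arbitrary test values; the printed pairs should agree closely.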

Generalized Green’s theorem


What happens if the region D is not simply connected? Let us investigate an example
first.

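Example 4.4.8. Investigate the line integral

\[
\oint_C \frac{x\,dy - y\,dx}{x^2 + y^2}
\]

for any circle C with center at the origin and radius R > 0 and oriented counterclockwise.

Solution. Note that F = ⟨f, g⟩ with $f = -\frac{y}{x^2+y^2}$ and $g = \frac{x}{x^2+y^2}$. We compute

\[
\begin{aligned}
\frac{\partial f}{\partial y} &= \left(-\frac{y}{x^2+y^2}\right)'_y = -\frac{(y)'_y\,(x^2+y^2) - y\,(x^2+y^2)'_y}{(x^2+y^2)^2} = \frac{y^2 - x^2}{(x^2+y^2)^2}, \\
\frac{\partial g}{\partial x} &= \left(\frac{x}{x^2+y^2}\right)'_x = \frac{(x)'_x\,(x^2+y^2) - x\,(x^2+y^2)'_x}{(x^2+y^2)^2} = \frac{y^2 - x^2}{(x^2+y^2)^2}.
\end{aligned}
\]

Therefore,

\[
\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}.
\]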

If Green’s theorem applied, the line integral would be 0! However, as seen in Exam-
ple 4.4.1, the value is not 0! Why is this? The answer is that F is undefined at (0, 0),
so f and g do not have continuous partial derivatives on the whole disk enclosed by C, and the
hypotheses of Green’s theorem fail. Removing the origin leaves a region that is not
simply connected.

Theorem 4.4.3 (Generalized Green’s theorem). Suppose the region $D \subset \mathbb{R}^2$ lies between two simple,
closed, piecewise smooth curves $L_1$ and $L_2$, where $L_1$ is completely contained within $L_2$ (the curves can
intersect but not cross each other). Let f(x, y) and g(x, y) be defined on D and have continuous first
partial derivatives on D. If $L_2$ and $L_1$ have positive orientation with respect to D (D stays on the left
as each curve is traversed, so the outer curve $L_2$ is counterclockwise and the inner curve $L_1$ is clockwise), then

\[
\iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) dx\,dy = \oint_{L_1} (f\,dx + g\,dy) + \oint_{L_2} (f\,dx + g\,dy). \tag{4.16}
\]

Proof. A proof can be developed by combining L1 and L2 into one positively oriented
curve L that follows L2 , then follows a line segment S joining L2 to L1 , continuing along
L1 , and finally returns to L2 over the line segment S but in the opposite direction. Ap-
plying Green’s theorem to this curve L will give the generalized result. Figure 4.21 il-
lustrates this idea.

(a) (b)

Figure 4.21: Generalized Green’s theorem, the region is not simply connected.

Example 4.4.9. Show that

\[
\oint_C \frac{x\,dy - y\,dx}{x^2 + y^2} = 2\pi
\]

for every positively oriented simple closed curve C that encloses the origin. Note that $f = -\frac{y}{x^2+y^2}$ and
$g = \frac{x}{x^2+y^2}$ are not defined at the origin. So, Green’s theorem cannot be applied directly.

Solution. Since C is an arbitrary and unknown closed path, it is difficult to compute
the integral directly. Hence, let us first consider a clockwise circle C′ with its center
at the origin and radius a, where a is chosen to be small enough that C′ lies inside
C, as shown in Figure 4.22. If D is the region enclosed between C and C′ (D does not
contain the origin), then using the generalized Green’s theorem (4.16) with $f = -\frac{y}{x^2+y^2}$
and $g = \frac{x}{x^2+y^2}$ gives

\[
\begin{aligned}
\oint_C f\,dx + g\,dy + \oint_{C'} f\,dx + g\,dy &= \iint_D \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) d\sigma \\
&= \iint_D \left(\frac{y^2 - x^2}{(x^2+y^2)^2} - \frac{y^2 - x^2}{(x^2+y^2)^2}\right) d\sigma \\
&= 0.
\end{aligned}
\]

Figure 4.22: Generalized Green’s theorem, the region is not simply connected, Example 4.4.9.

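Therefore,

\[
\oint_C f\,dx + g\,dy = -\oint_{C'} f\,dx + g\,dy = \oint_{-C'} f\,dx + g\,dy,
\]

where −C′ is the circle in the counterclockwise orientation. We now easily compute
this last integral using the parameterization of −C′ given by r(t) = a cos t i + a sin t j,
0 ≤ t ≤ 2π. Thus,

\[
\begin{aligned}
\oint_C f\,dx + g\,dy &= \oint_{-C'} f\,dx + g\,dy = \int_0^{2\pi} \langle f(\mathbf{r}(t)), g(\mathbf{r}(t))\rangle \cdot \frac{d\mathbf{r}(t)}{dt}\,dt \\
&= \int_0^{2\pi} \left\langle \frac{-a\sin t}{a^2\cos^2 t + a^2\sin^2 t}, \frac{a\cos t}{a^2\cos^2 t + a^2\sin^2 t} \right\rangle \cdot \langle -a\sin t, a\cos t\rangle\,dt \\
&= \int_0^{2\pi} dt = 2\pi.
\end{aligned}
\]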

Example 4.4.10. Evaluate

\[
\oint_L \frac{x\,dy - y\,dx}{x^2 + y^2},
\]

where L is a piecewise smooth, simple closed curve that does not contain (0, 0) on L or in its interior.

Solution. Since the origin is neither on L nor inside L, the functions f and g have continuous
partial derivatives on the region enclosed by L, and $\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}$ there, so the answer is 0 by
Green’s theorem. By contrast, the previous example shows that the value is 2π when the origin is inside
a positively oriented L.
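The dichotomy between the two cases (2π if the curve encloses the origin, 0 if it does not) can be observed numerically. The sketch below uses counterclockwise ellipses as test curves; the function name `winding_integral` and the specific centers and radii are assumptions chosen only for illustration.

```python
import math

def winding_integral(cx, cy, rx, ry, n=20000):
    """Midpoint rule for the closed integral of (x dy - y dx)/(x^2 + y^2) over the
    counterclockwise ellipse x = cx + rx cos t, y = cy + ry sin t, 0 <= t <= 2 pi."""
    total = 0.0
    h = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * h
        x, y = cx + rx * math.cos(t), cy + ry * math.sin(t)
        dx, dy = -rx * math.sin(t) * h, ry * math.cos(t) * h
        total += (x * dy - y * dx) / (x * x + y * y)
    return total

encloses_origin = winding_integral(0.5, -0.3, 2.0, 1.0)  # origin lies inside this ellipse
misses_origin = winding_integral(3.0, 0.0, 1.0, 2.0)     # origin lies outside this ellipse

print(encloses_origin, 2 * math.pi)
print(misses_origin)
```

The first value should be close to 2π and the second close to 0, regardless of the particular shape of the curve.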

4.5 Green’s theorem: flux-divergence form


4.5.1 Flux

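Suppose instead we are interested in how much of the vector field points outward across
a given simple closed curve (in the normal direction). If the vector field models fluid
flow, then the question is equivalent to finding the rate (mass per time) at which the
fluid is flowing out of the region through the closed curve. This requires us to resolve
the vector field along the outward normal direction at each point. Then we have

\[
\text{flux integral} = \int_C (\mathbf{F} \cdot \mathbf{N})\,ds. \tag{4.17}
\]

But how do we find N? Well, as shown in Figure 4.23(b), for a positively oriented curve r(t)
with unit tangent vector T we can define the outward unit normal vector

\[
\begin{aligned}
\mathbf{N} &= \mathbf{T} \times \mathbf{k}
= \frac{1}{|\mathbf{r}'(t)|}\frac{d\mathbf{r}}{dt} \times \mathbf{k}
= \frac{1}{|\mathbf{r}'(t)|}\left\langle \frac{dx}{dt}, \frac{dy}{dt}, 0 \right\rangle \times \mathbf{k} \\
&= \frac{1}{|\mathbf{r}'(t)|}
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\frac{dx}{dt} & \frac{dy}{dt} & 0 \\
0 & 0 & 1
\end{vmatrix} \\
&= \frac{1}{|\mathbf{r}'(t)|}\left(\frac{dy}{dt}\,\mathbf{i} - \frac{dx}{dt}\,\mathbf{j}\right)
= \frac{1}{|\mathbf{r}'(t)|}\left\langle \frac{dy}{dt}, -\frac{dx}{dt} \right\rangle.
\end{aligned}
\]

So, since $ds = |\mathbf{r}'(t)|\,dt$,

\[
\int_C (\mathbf{F} \cdot \mathbf{N})\,ds = \int_C \langle f, g\rangle \cdot \frac{1}{|\mathbf{r}'(t)|}\left\langle \frac{dy}{dt}, -\frac{dx}{dt} \right\rangle ds = \int_C \langle f, g\rangle \cdot \left\langle \frac{dy}{dt}, -\frac{dx}{dt} \right\rangle dt.
\]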

(a) (b)

Figure 4.23: Flux, outward normal vector.



Simplifying this, we obtain

\[
\text{flux integral} = \int_C (\mathbf{F} \cdot \mathbf{N})\,ds = \int_C f\,dy - g\,dx. \tag{4.18}
\]
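As a quick sanity check of the formula for N (a worked instance added for illustration, with the circle of radius R chosen as an assumed test curve), take the counterclockwise circle r(t) = ⟨R cos t, R sin t⟩:

```latex
\[
\mathbf{r}'(t) = \langle -R\sin t,\ R\cos t\rangle, \qquad |\mathbf{r}'(t)| = R,
\qquad
\mathbf{N} = \frac{1}{R}\left\langle \frac{dy}{dt},\ -\frac{dx}{dt}\right\rangle
= \frac{1}{R}\langle R\cos t,\ R\sin t\rangle = \langle \cos t,\ \sin t\rangle .
\]
```

This is the unit vector pointing radially away from the center, so N is indeed the outward normal when the curve is traversed counterclockwise.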

Example 4.5.1. Find the flux of the velocity flow v(x, y) = x i + xy j (cm/s) in two dimensions out of the
circular region $x^2 + y^2 \le 4$, whose boundary is the circle C, for a fluid with constant density δ g/cm².

Solution. Choose the parametric form for C: r(t) = 2 cos t i + 2 sin t j, 0 ≤ t ≤ 2π. The
flux is

\[
\begin{aligned}
\int_C \delta(\mathbf{v} \cdot \mathbf{N})\,ds &= \delta \int_C x\,dy - xy\,dx = \delta \int_0^{2\pi} (2\cos t)\,d(2\sin t) - (4\cos t \sin t)\,d(2\cos t) \\
&= 2\delta \int_0^{2\pi} (2\cos^2 t + 4\cos t \sin^2 t)\,dt \\
&= 4\delta\pi \ \text{g/s}.
\end{aligned}
\]

4.5.2 Flux density – divergence

Similar to the density of circulation, we can define the density of flux, which is also
called the divergence. The box shown in Figure 4.24 has length 2Δx and width 2Δy.
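Then, the flux crossing the
bottom is $(\mathbf{F} \cdot (-\mathbf{j}))\,2\Delta x = -g(x, y - \Delta y)\,2\Delta x$,
top is $(\mathbf{F} \cdot \mathbf{j})\,2\Delta x = g(x, y + \Delta y)\,2\Delta x$,
right is $(\mathbf{F} \cdot \mathbf{i})\,2\Delta y = f(x + \Delta x, y)\,2\Delta y$,
left is $(\mathbf{F} \cdot (-\mathbf{i}))\,2\Delta y = -f(x - \Delta x, y)\,2\Delta y$.

Figure 4.24: Density of flux, the divergence.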
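So the total flux is

\[
g(x, y + \Delta y)\,2\Delta x - g(x, y - \Delta y)\,2\Delta x + f(x + \Delta x, y)\,2\Delta y - f(x - \Delta x, y)\,2\Delta y.
\]

The density is, therefore,

\[
\frac{g(x, y + \Delta y)\,2\Delta x - g(x, y - \Delta y)\,2\Delta x + f(x + \Delta x, y)\,2\Delta y - f(x - \Delta x, y)\,2\Delta y}{2\Delta x \times 2\Delta y},
\]

which simplifies to

\[
\frac{f(x + \Delta x, y) - f(x - \Delta x, y)}{2\Delta x} + \frac{g(x, y + \Delta y) - g(x, y - \Delta y)}{2\Delta y}.
\]

Taking the limit as Δx → 0 and Δy → 0 gives

\[
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}.
\]

The density is given a special name, the divergence, denoted by Div(F). So the divergence
of a two-dimensional vector field F = ⟨f, g⟩ is

\[
\operatorname{Div}(\mathbf{F}) = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y},
\]

or in vector form

\[
\operatorname{Div}(\mathbf{F}) = \nabla \cdot \mathbf{F}.
\]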
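The symmetric difference quotients in this derivation translate directly into a numerical estimate of the divergence. The following is a minimal sketch; the function name `divergence`, the step size `h`, and the sample fields are illustrative assumptions.

```python
def divergence(F, x, y, h=1e-5):
    """Central-difference estimate of df/dx + dg/dy for a field F(x, y) -> (f, g),
    mirroring the box of half-widths dx = dy = h used in the derivation."""
    df_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dg_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return df_dx + dg_dy

def sample_field(x, y):
    # f = x^2 - y^2, g = x - 3y, so the exact divergence is 2x - 3.
    return (x**2 - y**2, x - 3 * y)

def rotation_field(x, y):
    # f = -y, g = x: a pure rotation, with exact divergence 0.
    return (-y, x)

print(divergence(sample_field, 0.5, 0.2))    # close to 2*0.5 - 3 = -2
print(divergence(rotation_field, 1.0, 2.0))  # close to 0
```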

4.5.3 The divergence-flux form of Green’s theorem

With the ideas of the flux integral and the flux density (divergence), we can expect
that the flux integral along a simple closed curve C is equal to the double integral of
the flux density over the simply connected region enclosed by C. This is indeed true.
We now state Green’s theorem in the divergence-flux form.

Theorem 4.5.1 (Green’s theorem: flux-divergence form). Let L be a piecewise smooth simple closed
curve in $\mathbb{R}^2$ which has positive orientation (the interior is on the left as you travel round L), and let the
region D inside of L be simply connected. Let $\mathbf{F} = f\,\mathbf{i} + g\,\mathbf{j}$ be a vector field for which f and g have
continuous partial derivatives in a region containing L and D. Then

\[
\oint_L f(x, y)\,dy - g(x, y)\,dx = \iint_D \left(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\right) d\sigma, \tag{4.19}
\]

or, in vector form,

\[
\oint_L (\mathbf{F} \cdot \mathbf{N})\,ds = \iint_D (\nabla \cdot \mathbf{F})\,d\sigma. \tag{4.20}
\]

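Proof. This is proved by using Green’s theorem in the circulation-curl form. The flux
integral

\[
\oint_L (\mathbf{F} \cdot \mathbf{N})\,ds = \oint_L f\,dy - g\,dx \tag{4.21}
\]

can be rearranged as

\[
\oint_L (\mathbf{F} \cdot \mathbf{N})\,ds = \oint_L -g\,dx + f\,dy. \tag{4.22}
\]

Compared with the circulation integral, this might be confusing because f and g are
in different positions now. It may be helpful to memorize Green’s theorem by writing it as

\[
\oint_L \clubsuit\,dx + \heartsuit\,dy = \iint_D \left(\frac{\partial \heartsuit}{\partial x} - \frac{\partial \clubsuit}{\partial y}\right) d\sigma. \tag{4.23}
\]

Applying Green’s theorem in the circulation-curl form to the flux integral, with $\clubsuit = -g$ and $\heartsuit = f$, we obtain
the flux-divergence form of Green’s theorem

\[
\oint_L f\,dy - g\,dx = \oint_L -g\,dx + f\,dy = \iint_D \left(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\right) d\sigma. \tag{4.24}
\]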

Note. Recall that the flux through L measures how much of the vector field “flows” out of the
closed curve. If the net flux is positive, there must be some “source” in D. The term $\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$ is the divergence
of the vector field F in D. If the divergence is positive at a point P, then P is a source.
If the divergence is negative at a point P, then P is a sink.

Example 4.5.2. The graph of the vector field $\mathbf{F} = \langle x^2 - y^2, x - 3y\rangle$ and the curve $x^2 + y^2 = 1$ are shown
in Figure 4.25(a).
1. Make a guess whether the flux along C is positive or negative.
2. Evaluate the two integrals in Green’s theorem in flux-divergence form. Check to see if they are
equal.

Solution.
1. From Figure 4.25(a), it looks like more vector field goes into the circle than comes
out of the circle, so the flux might be negative.
2. We compute the flux integral

\[
\begin{aligned}
\oint_L (\mathbf{F} \cdot \mathbf{N})\,ds &= \oint_L f\,dy - g\,dx = \oint_L (x^2 - y^2)\,dy - (x - 3y)\,dx \\
&= \int_0^{2\pi} (\cos^2 t - \sin^2 t)\,d(\sin t) - (\cos t - 3\sin t)\,d(\cos t) \\
&= \int_0^{2\pi} (\cos^3 t - \cos t \sin^2 t + \cos t \sin t - 3\sin^2 t)\,dt = -3\pi.
\end{aligned}
\]

Figure 4.25: Flux-divergence, Example 4.5.2 and Example 4.6.1.

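We now compute the double integral

\[
\begin{aligned}
\iint_D \left(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\right) d\sigma &= \iint_D \left(\frac{\partial (x^2 - y^2)}{\partial x} + \frac{\partial (x - 3y)}{\partial y}\right) d\sigma \\
&= \iint_D (2x - 3)\,d\sigma = \iint_D 2x\,d\sigma - 3\iint_D 1\,d\sigma \\
&= 0 - 3 \times \pi \times 1^2 = -3\pi.
\end{aligned}
\]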

Indeed, the two integrals have the same value.
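The flux integrals in Examples 4.5.1 and 4.5.2 can be reproduced numerically from formula (4.18). The sketch below assumes a counterclockwise circle centered at the origin; the function name `flux_through_circle` and the number of sample points are illustrative choices.

```python
import math

def flux_through_circle(f, g, radius, n=20000):
    """Midpoint rule for the closed integral of f dy - g dx over the counterclockwise
    circle x = R cos t, y = R sin t, 0 <= t <= 2 pi (outward flux, formula (4.18))."""
    total = 0.0
    h = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * h, radius * math.cos(t) * h
        total += f(x, y) * dy - g(x, y) * dx
    return total

# Example 4.5.1 with density delta = 1: v = <x, xy> on the circle of radius 2.
flux_1 = flux_through_circle(lambda x, y: x, lambda x, y: x * y, 2.0)

# Example 4.5.2: F = <x^2 - y^2, x - 3y> on the unit circle.
flux_2 = flux_through_circle(lambda x, y: x**2 - y**2, lambda x, y: x - 3 * y, 1.0)

print(flux_1, 4 * math.pi)
print(flux_2, -3 * math.pi)
```

The first value should be close to 4π (to be multiplied by δ for a general density) and the second close to −3π.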

4.6 Source-free vector fields


In Green’s theorem in flux-divergence form, we have seen the term $\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$, which measures
the “source” of the vector field. In vector form, Green’s theorem in flux-divergence form
is written as

\[
\oint_L (\mathbf{F} \cdot \mathbf{N})\,ds = \iint_D \nabla \cdot \mathbf{F}\,d\sigma.
\]

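If the divergence of a vector field is 0 at every point in a region D, then the field is
called source-free. A point with positive divergence is called a source, and a point with
negative divergence is called a sink. For a source-free vector field F = ⟨f, g⟩, we define
the stream function ψ, which satisfies

\[
\frac{\partial \psi}{\partial y} = f \quad \text{and} \quad \frac{\partial \psi}{\partial x} = -g.
\]

Since $f_x + g_y = 0$, or equivalently $f_x = -g_y$, the mixed partial derivatives of ψ computed from these two equations agree:

\[
\frac{\partial^2 \psi}{\partial y\,\partial x} = \frac{\partial^2 \psi}{\partial x\,\partial y} \quad \text{or} \quad \psi_{yx} = \psi_{xy}.
\]

Similar to a conservative field, for a source-free field, under suitable conditions, we
have the following results:

\[
\begin{aligned}
\mathbf{F} = \langle f, g\rangle \text{ is source-free} &\iff \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = 0 \\
&\iff \int_C (\mathbf{F} \cdot \mathbf{N})\,ds \text{ is path-independent} \\
&\iff \oint_C (\mathbf{F} \cdot \mathbf{N})\,ds = 0 \text{ for any simple closed curve } C \\
&\iff \text{there is a stream function } \psi \text{ such that } \int_A^B (\mathbf{F} \cdot \mathbf{N})\,ds = \psi(B) - \psi(A).
\end{aligned}
\]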

Example 4.6.1. Compute the divergence of the vector field F = ⟨−y, x⟩. Does it have a stream function?
If so, find one.

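Solution. The graph of the vector field is shown in Figure 4.25(b). Since f = −y and
g = x,

\[
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = \frac{\partial (-y)}{\partial x} + \frac{\partial (x)}{\partial y} = 0,
\]

so it is a source-free field. Thus, there exists a stream function ψ. Since

\[
\frac{\partial \psi}{\partial y} = f = -y, \quad \text{we have} \quad \psi = \int (-y)\,dy = -\frac{y^2}{2} + C(x).
\]

To find C(x), we differentiate with respect to x to obtain

\[
\frac{\partial \psi}{\partial x} = \left(-\frac{y^2}{2} + C(x)\right)'_x = C'(x), \quad \text{but} \quad \frac{\partial \psi}{\partial x} = -g = -x.
\]

So, $C'(x) = -x$. Thus, $C(x) = -\frac{x^2}{2} + C$. A family of stream functions for this vector field
is

\[
\psi(x, y) = -\frac{y^2}{2} - \frac{x^2}{2} + C.
\]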
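The defining property of the stream function, namely that the flux across a curve from A to B equals ψ(B) − ψ(A), can be checked for this field. The sketch below takes C = 0 and uses a straight segment as the test curve; the endpoints A and B and the function names are illustrative assumptions.

```python
def psi(x, y):
    """Stream function found above for F = <-y, x>, with the constant C taken as 0."""
    return -(x**2 + y**2) / 2

def flux_across_segment(A, B, n=1000):
    """Midpoint rule for the integral of f dy - g dx along the segment A -> B,
    with f = -y and g = x."""
    (x0, y0), (x1, y1) = A, B
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        f, g = -y, x
        total += f * dy - g * dx
    return total

A, B = (1.0, 0.0), (0.0, 2.0)
print(flux_across_segment(A, B), psi(*B) - psi(*A))  # both are -1.5
```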

Figure 4.26: Surface integral.

4.7 Surface integral with respect to surface area


Suppose we have a thin metal shell that we can represent as a surface in ℝ3 . If the
density of the shell is variable, how can we find its total mass? Sure, we can again
employ the idea of elements, and then form an integral that we may be able to eval-
uate. To find, approximately, the mass of the metal shell, we divide the surface into
small patches (subregions), as shown in Figure 4.26. Then add up the areas ΔSi of
each patch multiplied by the approximate density f (xi , yi , zi ) of the patch. The approx-
imation improves as the size of the patches decreases. This is exactly the process used
in integration.

Definition 4.7.1. Let S be a bounded surface in space and let f(x, y, z) be a bounded function defined
on S. Subdivide S into n patches (small subregions) $\{S_k\}$, and arbitrarily choose a point
$(x_k, y_k, z_k) \in S_k$ in each patch. Form the Riemann sum

\[
\sum_{k=1}^{n} f(x_k, y_k, z_k)\,\Delta S_k,
\]

where $\Delta S_k$ is the area of subregion $S_k$. If the limit of this Riemann sum exists as n → ∞ and
$\max |\Delta S_k| \to 0$, and is the same for all possible subdivisions and choices of points $(x_k, y_k, z_k)$, then this value is
the surface integral of f(x, y, z) over the surface S, written

\[
\iint_S f\,dS = \iint_S f(x, y, z)\,dS = \lim_{\max |\Delta S_k| \to 0,\ n \to \infty} \sum_{k=1}^{n} f(x_k, y_k, z_k)\,\Delta S_k. \tag{4.25}
\]

Properties of the surface integral


The surface integral has the same basic properties as all integrals. Assuming that k,
l ∈ ℝ are two constants and that all the integrals exist:
1. $\iint_S 1\,dS$ = the area of the surface S, A(S),
2. $\iint_S (k f(x, y, z) + l g(x, y, z))\,dS = k \iint_S f(x, y, z)\,dS + l \iint_S g(x, y, z)\,dS$ (linearity),
3. $\iint_S f(x, y, z)\,dS + \iint_{S'} f(x, y, z)\,dS = \iint_{S + S'} f(x, y, z)\,dS$ (additivity),
4. if f(x, y, z) is continuous, then there exists a point (a, b, c) on S such that

\[
\iint_S f(x, y, z)\,dS = f(a, b, c) \cdot A(S).
\]

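Example 4.7.1. Suppose S is the surface of the unit ball. Evaluate the surface integral

\[
\iint_S \left(2x \ln(1 + y^2 + z^2) - 3(x^2 + y^2 + z^2)\right) dS.
\]

Solution. By the properties of surface integrals,

\[
\begin{aligned}
&\iint_S \left(2x \ln(1 + y^2 + z^2) - 3(x^2 + y^2 + z^2)\right) dS \\
&\quad= 2\iint_S x \ln(1 + y^2 + z^2)\,dS - 3\iint_S (x^2 + y^2 + z^2)\,dS && \text{(linearity)} \\
&\quad= 2 \times 0 - 3\iint_S (x^2 + y^2 + z^2)\,dS && \text{(symmetry: the first integrand is odd in } x\text{)} \\
&\quad= 0 - 3\iint_S 1\,dS && \text{(since all points on } S \text{ satisfy } x^2 + y^2 + z^2 = 1\text{)} \\
&\quad= -3 \times 4\pi \cdot 1^2 = -12\pi.
\end{aligned}
\]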

Surface integrals: surface is described by parametric equations


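Note that in Chapter 3, for a parameterized surface r(u, v) = ⟨x(u, v), y(u, v), z(u, v)⟩,
we approximated $\Delta S_k$ by $|\mathbf{r}_u \times \mathbf{r}_v|\,\Delta u\,\Delta v$. Hence, equation (4.25) for the surface integral
becomes

\[
\begin{aligned}
\iint_S f(x, y, z)\,dS &= \lim_{\Delta S_k \to 0,\ n \to \infty} \sum f(x_k, y_k, z_k)\,\Delta S_k \\
&= \lim_{\Delta S_k \to 0,\ n \to \infty} \sum f(x(u_k, v_k), y(u_k, v_k), z(u_k, v_k))\,|\mathbf{r}_u \times \mathbf{r}_v|\,\Delta u\,\Delta v,
\end{aligned}
\]

and this is now a standard Riemann sum for a double integral, resulting in the final
formula

\[
\iint_S f(x, y, z)\,dS = \iint_{D_{uv}} f(x(u, v), y(u, v), z(u, v))\,|\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv. \tag{4.26}
\]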

Example 4.7.2. Find $\iint_S \frac{1}{z}\,dS$, where S is the part of the sphere $x^2 + y^2 + z^2 = a^2$ that lies above the
plane z = h and h is a constant satisfying $0 < h \le a$.

Figure 4.27: Surface integral, Example 4.7.2.

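Solution. Figure 4.27 shows the surface S.

The surface S has a parametric description

\[
\mathbf{r}(u, v) = \langle a \sin u \cos v,\ a \sin u \sin v,\ a \cos u\rangle,
\]

where $0 \le v \le 2\pi$ and $0 \le u \le \cos^{-1}\frac{h}{a}$. So,

\[
\begin{aligned}
\mathbf{r}_u \times \mathbf{r}_v &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
a \cos u \cos v & a \cos u \sin v & -a \sin u \\
-a \sin u \sin v & a \sin u \cos v & 0
\end{vmatrix} \\
&= (a^2 \sin^2 u \cos v)\,\mathbf{i} + (a^2 \sin^2 u \sin v)\,\mathbf{j} + (a^2 \sin u \cos u)\,\mathbf{k},
\end{aligned}
\]

and $|\mathbf{r}_u \times \mathbf{r}_v| = \sqrt{(a^2 \sin^2 u \cos v)^2 + (a^2 \sin^2 u \sin v)^2 + (a^2 \sin u \cos u)^2} = a^2 \sin u$. Thus,

\[
\begin{aligned}
\iint_S \frac{1}{z}\,dS &= \iint_{D_{uv}} \frac{1}{a \cos u}\,a^2 \sin u\,du\,dv = a \int_0^{2\pi} dv \int_0^{\cos^{-1}\frac{h}{a}} \frac{\sin u}{\cos u}\,du \\
&= 2\pi a \times \Big(-\ln|\cos u|\Big)\Big|_0^{\cos^{-1}\frac{h}{a}} = -2\pi a \ln\frac{h}{a} = 2\pi a \ln\frac{a}{h}.
\end{aligned}
\]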
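Formula (4.26) lends itself to a direct numerical check of this result. The sketch below is a rough midpoint-rule implementation; the helper names (`cross`, `surface_integral`) and the test values a = 2, h = 1 are assumptions made only for illustration.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def surface_integral(f, r, r_u, r_v, u_range, v_range, n=400):
    """Midpoint rule for the double integral of f(r(u, v)) |r_u x r_v| du dv (formula (4.26))."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            nx, ny, nz = cross(r_u(u, v), r_v(u, v))
            total += f(*r(u, v)) * math.sqrt(nx * nx + ny * ny + nz * nz) * hu * hv
    return total

a, h = 2.0, 1.0  # assumed test values with 0 < h <= a

def r(u, v):
    return (a * math.sin(u) * math.cos(v), a * math.sin(u) * math.sin(v), a * math.cos(u))

def r_u(u, v):
    return (a * math.cos(u) * math.cos(v), a * math.cos(u) * math.sin(v), -a * math.sin(u))

def r_v(u, v):
    return (-a * math.sin(u) * math.sin(v), a * math.sin(u) * math.cos(v), 0.0)

approx = surface_integral(lambda x, y, z: 1 / z, r, r_u, r_v,
                          (0.0, math.acos(h / a)), (0.0, 2 * math.pi))
exact = 2 * math.pi * a * math.log(a / h)
print(approx, exact)
```

The two printed values should agree to a few decimal places; the agreement improves as `n` grows.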

Surface integrals: surface is given by an explicit equation z = z(x, y)


Also in Chapter 3, we have seen that if the surface is given by an explicit function
z = z(x, y), then the parametric description r(x, y) = ⟨x, y, z(x, y)⟩ gives
$dS = |\mathbf{r}_x \times \mathbf{r}_y|\,dx\,dy = \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy$ and

\[
\iint_S f(x, y, z)\,dS = \iint_{D_{xy}} f(x, y, z(x, y))\sqrt{1 + z_x^2 + z_y^2}\,dx\,dy. \tag{4.27}
\]

In fact, we can use all the ways that we have developed for the surface area element
dS in Chapter 3 to evaluate a surface integral, so

\[
\iint_S f(x, y, z)\,dS = \iint_{D_{yz}} f(x(y, z), y, z)\sqrt{1 + x_y^2 + x_z^2}\,dy\,dz, \tag{4.28}
\]

\[
\iint_S f(x, y, z)\,dS = \iint_{D_{xz}} f(x, y(x, z), z)\sqrt{1 + y_x^2 + y_z^2}\,dx\,dz. \tag{4.29}
\]

Example 4.7.3. Let S be the closed surface formed by $S_1$, the portion of the cone with equation $z = \sqrt{x^2 + y^2}$
that lies below the plane z = 1, and $S_2$, the circular top of the cone given by z = 1, $x^2 + y^2 \le 1$.
Let f be defined on S by $f(x, y, z) = x^2 + y^2$. Compute the area of S and evaluate $\iint_S f(x, y, z)\,dS$.

Solution. Figure 4.28 shows the cone and its circular top. The area of S does not need
any integration, since standard formulas give the lateral surface area of the cone as
$\sqrt{2}\pi$ and the area of the disk as π, so S has area $\pi(\sqrt{2} + 1)$. We compute the integral of
f by evaluating two surface integrals, i.e.,

\[
\iint_S f(x, y, z)\,dS = \iint_{S_1} f(x, y, z)\,dS + \iint_{S_2} f(x, y, z)\,dS.
\]

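For the first integral, the projection of $S_1$ onto the xy-plane is $D : x^2 + y^2 \le 1$, and since
$z = \sqrt{x^2 + y^2}$, the surface integral becomes

\[
\begin{aligned}
\iint_{S_1} f(x, y, z)\,dS &= \iint_D (x^2 + y^2)\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,d\sigma \\
&= \iint_D (x^2 + y^2)\sqrt{1 + \left(\frac{x}{\sqrt{x^2 + y^2}}\right)^2 + \left(\frac{y}{\sqrt{x^2 + y^2}}\right)^2}\,d\sigma \\
&= \iint_D (x^2 + y^2)\sqrt{2}\,d\sigma.
\end{aligned}
\]

Figure 4.28: Surface integral, Example 4.7.3.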

Converting to polar coordinates over the region $D' : 0 \le r \le 1$, $0 \le \theta \le 2\pi$ changes this
to

\[
\iint_{D'} (r^2)\sqrt{2}\,r\,dr\,d\theta = \sqrt{2}\int_0^{2\pi} d\theta \int_0^1 r^2\,r\,dr = \frac{\sqrt{2}\pi}{2}.
\]

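For the second integral the surface $S_2$ is z = 1, so $z_x = z_y = 0$, and the domain D is the
same as that of the first integral. Hence,

\[
\begin{aligned}
\iint_{S_2} f(x, y, z)\,dS &= \iint_D (x^2 + y^2)\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,d\sigma \\
&= \iint_D (x^2 + y^2)\,d\sigma = \frac{\pi}{2}.
\end{aligned}
\]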

The value of this integral is obtained easily because it has exactly the same integrand
as the first integral above except for a factor of $\sqrt{2}$. Hence, the complete integral of f
is computed as

\[
\iint_{S_1} (x^2 + y^2)\,dS + \iint_{S_2} (x^2 + y^2)\,dS = \frac{\sqrt{2}\pi}{2} + \frac{\pi}{2} = \frac{\sqrt{2} + 1}{2}\pi.
\]

4.8 Surface integrals of vector fields


4.8.1 Orientable surfaces

We have seen the line integral of a two-dimensional vector field, both for finding cir-
culation and for finding flux. Now we discuss flux for a three-dimensional vector field.
As was the case for flux in two dimensions, we first need to orient a surface so that we
know which direction we are talking about. The surfaces we have encountered so far
have two sides, as shown in Figure 4.29. Such surfaces are orientable. However, some
surfaces are one-sided and are not orientable. For example, the Moebius strip is a one-
sided surface (Figure 4.30(a)). It is formed by taking a long and narrow strip of paper
and joining the ends together after giving a half-twist to one end. Any point on the
Moebius strip can be joined to any other point by a path that stays on the surface of
the paper (the path never crosses the edge), showing that it really is one-sided.
Consequently, on a Moebius strip it is not possible to define a unit normal vector,
perpendicular to the surface, that changes continuously along any curve. For example
suppose the unit normal N at any point is defined to be in the direction away from the

(a) (b) (c) (d)

Figure 4.29: Orientable surfaces.

(a) (b)

Figure 4.30: Nonorientable surfaces: Moebius strip and Klein bottle.

paper. Following a path from a point P with unit normal N around to the point P′ on
the other side of the paper from P will give a normal in exactly the opposite direction to
N. This violates the continuity of the normal, since P and P′ are essentially the same
point. Another example for a nonorientable surface is the famous Klein bottle (Fig-
ure 4.30(b)). In the sequel, we only consider orientable surfaces which we can orient
either upward or downward, outward or inward, leftward or rightward, and so forth.
We now give the definition of an orientable surface.

Definition 4.8.1 (Orientable surface). Let S be a surface in $\mathbb{R}^3$. If at each point (x, y, z) of S we can assign
a unit normal $\mathbf{N} = \mathbf{N}(x, y, z)$ that changes continuously along any curve on S, then we say that S
is an orientable surface. Once N is defined for a surface S, the function N defines an orientation on
S, and S is said to be oriented.

4.8.2 Flux integral ∬S (F ⋅ N)dS

If F(x, y, z) is a vector field (think of it as the flow of a fluid at each point in space
measured in mass/time) and N is the unit normal of an orientable surface, then at
each point of the surface, F ⋅ N is the component of F perpendicular to S, as shown in
Figure 4.31. Hence, we have the following definition of the flux.

Figure 4.31: Flux integral.

Definition 4.8.2 (Flux of a three-dimensional vector field). If a surface S is oriented with unit normal
vector N, then the surface integral $\iint_S \mathbf{F} \cdot \mathbf{N}\,dS$ is called the flux of F across S (in the direction defined
by N); $\iint_S \mathbf{F} \cdot \mathbf{N}\,dS$ is also called the integral of the vector field F over S.

In particular, if F(x, y, z) is the velocity vector for the flow of a fluid through a region of
space and the density function of the fluid at point (x, y, z) is δ(x, y, z) = 1, then the flux
element through a surface element ΔS in a given direction (or mass of fluid per unit
time across ΔS) is |F| cos θΔS, where θ is the angle between F and N (the normal vector
pointing in the given direction). Therefore, |F| cos θΔS = |F| cos θ|N|ΔS = (F ⋅ N)ΔS,
and the integration ∬S (F ⋅ N)dS gives the total flux across S in the given direction. This
is, indeed, a surface integral of the form already defined in equation (4.25), so it has
the same properties (for example, linear and additive across two regions S1 and S2 ). In
addition, we have

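\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = -\iint_{-S} (\mathbf{F} \cdot \mathbf{N})\,dS, \tag{4.30}
\]

where −S denotes the surface S with the opposite (negative) orientation.

If we denote $\mathbf{N}\,dS$ by $d\mathbf{S}$, then

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \iint_S \mathbf{F} \cdot d\mathbf{S}. \tag{4.31}
\]

If S is closed, we also adopt the notation ∯S (F ⋅ N)dS, with a circle in the integral
sign.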

Example 4.8.1. Given the vector field $\mathbf{F} = (x^2 - \sin y^2 + z)\,\mathbf{i} - y\,\mathbf{j} + (z + z^2)\,\mathbf{k}$, find the flux out of the top
and bottom faces of the cube

\[
D = \{(x, y, z) \mid 0 \le x \le 2,\ 0 \le y \le 2,\ 0 \le z \le 2\}.
\]

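Solution. The flux is out of two faces: $S_1$, the top side of D, and $S_2$, the bottom
side of D. To find the outward flux leaving the cube, we orient $S_1$ upward and $S_2$ downward,
so we choose $\mathbf{N}_1 = \langle 0, 0, 1\rangle$ and $\mathbf{N}_2 = \langle 0, 0, -1\rangle$. Then,

\[
\begin{aligned}
\iint_{S_1} (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_{S_1} \langle x^2 - \sin y^2 + z,\ -y,\ z + z^2\rangle \cdot \langle 0, 0, 1\rangle\,dS = \iint_{S_1} (z + z^2)\,dS \\
&= \iint_{S_1} (2 + 2^2)\,dS && \text{(since } z = 2 \text{ on the top side)} \\
&= \iint_{S_1} 6\,dS = 6 \times A(S_1) = 24,
\end{aligned}
\]

\[
\begin{aligned}
\iint_{S_2} (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_{S_2} \langle x^2 - \sin y^2 + z,\ -y,\ z + z^2\rangle \cdot \langle 0, 0, -1\rangle\,dS = -\iint_{S_2} (z + z^2)\,dS \\
&= -\iint_{S_2} (0 + 0^2)\,dS && \text{(since } z = 0 \text{ on the bottom side)} \\
&= 0.
\end{aligned}
\]

So, altogether, the desired flux is

\[
\iint_{S_1 + S_2} (\mathbf{F} \cdot \mathbf{N})\,dS = 24 + 0 = 24.
\]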

Evaluating ∬S (F ⋅ N)dS for a surface given parametrically


If a surface S is given parametrically by r(u, v) = ⟨x(u, v), y(u, v), z(u, v)⟩, where (u, v) ∈
D, then two tangent vectors to the surface are ru and rv . Hence, a unit normal vector is
given by the unit vector in the direction of the cross product (vector product), i. e.,
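\[
\mathbf{N} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|}.
\]

But this N may or may not have the same direction as desired; hence,

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_S \mathbf{F}(x(u, v), y(u, v), z(u, v)) \cdot \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|}\,dS.
\]

Using the surface integral evaluation formula, equation (4.26), this becomes

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_D \mathbf{F} \cdot \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|}\,|\mathbf{r}_u \times \mathbf{r}_v|\,du\,dv,
\]

that is,

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_D \mathbf{F} \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,du\,dv. \tag{4.32}
\]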

Example 4.8.2. Find the flux out of the lateral surface of the cylinder $x^2 + y^2 = 4$, $0 \le z \le 2$ for the vector
field $\mathbf{F} = \langle x, y, x + z^2\rangle$.

Solution. The cylinder has a parametric description r(u, v) = 2 cos u i + 2 sin u j + v k,
0 ≤ u ≤ 2π and 0 ≤ v ≤ 2. Then

\[
\mathbf{r}_u \times \mathbf{r}_v =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
-2\sin u & 2\cos u & 0 \\
0 & 0 & 1
\end{vmatrix}
= 2\cos u\,\mathbf{i} + 2\sin u\,\mathbf{j}.
\]

Note that this is an outward normal vector, so we take the positive sign and

\[
\begin{aligned}
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_{D_{uv}} \langle 2\cos u,\ 2\sin u,\ 2\cos u + v^2\rangle \cdot \langle 2\cos u,\ 2\sin u,\ 0\rangle\,du\,dv \\
&= \iint_{D_{uv}} (4\cos^2 u + 4\sin^2 u)\,du\,dv = 4\iint_{D_{uv}} du\,dv = 16\pi.
\end{aligned}
\]
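Formula (4.32) can be turned into a short numerical routine. The following sketch evaluates it for the cylinder of this example with the positive sign (outward normal); the helper names `cross`, `dot`, and `flux_parametric` and the grid size are illustrative assumptions.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]

def flux_parametric(F, r, r_u, r_v, u_range, v_range, n=200):
    """Midpoint rule for the double integral of F(r(u, v)) . (r_u x r_v) du dv (formula (4.32))."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            total += dot(F(*r(u, v)), cross(r_u(u, v), r_v(u, v))) * hu * hv
    return total

# Lateral surface of the cylinder x^2 + y^2 = 4, 0 <= z <= 2.
def r(u, v):
    return (2 * math.cos(u), 2 * math.sin(u), v)

def r_u(u, v):
    return (-2 * math.sin(u), 2 * math.cos(u), 0.0)

def r_v(u, v):
    return (0.0, 0.0, 1.0)

def F(x, y, z):
    return (x, y, x + z**2)

flux = flux_parametric(F, r, r_u, r_v, (0.0, 2 * math.pi), (0.0, 2.0))
print(flux, 16 * math.pi)
```

Whether the sign in (4.32) should be + or − still has to be decided by inspecting the direction of $\mathbf{r}_u \times \mathbf{r}_v$, exactly as in the worked solution.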

Evaluating ∬S (F ⋅ N)dS for a surface z = z(x, y)


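A point (x, y, z(x, y)) on the surface S given explicitly by z = z(x, y) satisfies
$G(x, y, z) = z - z(x, y) = 0$ (we write G for this scalar function to avoid confusion with the vector field F),
so ∇G is a normal vector to this surface. Since $\nabla G = \langle G_x, G_y, G_z\rangle = \langle -z_x, -z_y, 1\rangle$, a unit normal is

\[
\mathbf{N} = \frac{1}{\sqrt{1 + z_x^2 + z_y^2}}\langle -z_x, -z_y, 1\rangle.
\]

This normal vector may or may not have the same direction as desired. Hence, for a
vector field F = ⟨f, g, h⟩ defined on S, the surface integral becomes

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_S \langle f, g, h\rangle \cdot \langle -z_x, -z_y, 1\rangle \frac{1}{\sqrt{1 + z_x^2 + z_y^2}}\,dS.
\]

Using the surface integral evaluation formula, equation (4.27), this becomes

\[
\begin{aligned}
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS &= \pm \iint_{D_{xy}} \langle f, g, h\rangle \cdot \langle -z_x, -z_y, 1\rangle \frac{1}{\sqrt{1 + z_x^2 + z_y^2}}\sqrt{1 + z_x^2 + z_y^2}\,d\sigma \\
&= \pm \iint_{D_{xy}} \langle f, g, h\rangle \cdot \langle -z_x, -z_y, 1\rangle\,d\sigma \\
&= \pm \iint_{D_{xy}} (-f z_x - g z_y + h)\,dx\,dy,
\end{aligned}
\]

where $D_{xy}$ is the projection of S onto the xy-plane. So, we have

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{xy}} (-f z_x - g z_y + h)\,dx\,dy. \tag{4.33}
\]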

Similar formulas hold for surfaces S : x = x(y, z) and S : y = y(x, z). If $D_{yz}$ and $D_{xz}$
are projections of S onto the yz-plane and xz-plane, respectively, then

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{yz}} (f - g x_y - h x_z)\,dy\,dz, \tag{4.34}
\]

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{xz}} (-f y_x + g - h y_z)\,dx\,dz. \tag{4.35}
\]

Example 4.8.3. Evaluate $\iint_S \mathbf{F} \cdot d\mathbf{S}$, where $\mathbf{F}(x, y, z) = y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}$ and S is the boundary of the solid
region R enclosed by the paraboloid $z = 1 - x^2 - y^2$ and the plane z = 0. Assume S is oriented inward.

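Solution. The boundary S consists of a parabolic top surface $S_1$ with $z = 1 - x^2 - y^2$ and
a circular flat bottom surface $S_2$ with z = 0. On $S_1$ the outward normal points upward
and is $\langle -z_x, -z_y, 1\rangle = \langle 2x, 2y, 1\rangle$, so the inward normal
is ⟨−2x, −2y, −1⟩. On $S_2$ the inward normal points upward and can be taken as ⟨0, 0, 1⟩. Both $S_1$ and $S_2$ have the same
projection in the xy-plane, namely, the disk $D : x^2 + y^2 \le 1$. Equation (4.33) gives

\[
\begin{aligned}
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_{S_1} (\mathbf{F} \cdot \mathbf{N})\,dS + \iint_{S_2} (\mathbf{F} \cdot \mathbf{N})\,dS \\
&= \iint_D \langle y, x, 1 - x^2 - y^2\rangle \cdot \langle -2x, -2y, -1\rangle\,d\sigma + \iint_D \langle y, x, 0\rangle \cdot \langle 0, 0, 1\rangle\,d\sigma \\
&= \iint_D \left(-2xy - 2xy - (1 - x^2 - y^2)\right) dx\,dy + \iint_D \left(y \cdot 0 + x \cdot 0 + 0 \cdot 1\right) dx\,dy \\
&= \iint_D (-4xy - 1 + x^2 + y^2)\,dx\,dy + 0 \\
&= \int_0^{2\pi}\!\!\int_0^1 (-1 + r^2)\,r\,dr\,d\theta && \text{(polar coordinates; } -4xy \text{ integrates to 0 over } D \text{ by symmetry)} \\
&= \int_0^{2\pi} d\theta \int_0^1 (-r + r^3)\,dr = -\frac{\pi}{2}.
\end{aligned}
\]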
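As a cross-check, one can anticipate the divergence theorem stated in Section 4.9.2. The computation below is only a sanity check under the assumption that the theorem applies to the solid region R; it is not part of the solution above.

```latex
\[
\nabla \cdot \mathbf{F} = \frac{\partial (y)}{\partial x} + \frac{\partial (x)}{\partial y} + \frac{\partial (z)}{\partial z} = 1,
\qquad
\iiint_R 1\,dV = \int_0^{2\pi}\!\!\int_0^1 (1 - r^2)\,r\,dr\,d\theta = 2\pi\left(\frac{1}{2} - \frac{1}{4}\right) = \frac{\pi}{2}.
\]
```

The outward flux is therefore π/2, and reversing the orientation to inward gives −π/2, in agreement with the direct computation.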

Other forms of ∬S (F ⋅ N)dS


Recall that the unit normal vector N has the form ⟨cos α, cos β, cos γ⟩, where α, β, and γ
are direction angles, and the projection element of dS onto coordinate planes has the
following relations:

\[
\cos\alpha\,dS = dy\,dz, \quad \cos\beta\,dS = dx\,dz, \quad \text{and} \quad \cos\gamma\,dS = dx\,dy.
\]

Then,

\[
\begin{aligned}
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_S \langle f, g, h\rangle \cdot \langle \cos\alpha, \cos\beta, \cos\gamma\rangle\,dS \\
&= \iint_S f\cos\alpha\,dS + g\cos\beta\,dS + h\cos\gamma\,dS \\
&= \iint_S f\,dy\,dz + g\,dx\,dz + h\,dx\,dy.
\end{aligned} \tag{4.36}
\]

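This can also be derived as follows.

Equation (4.34) for evaluating surface integrals as double integrals shows that
when the field is F = ⟨f, 0, 0⟩, then

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{yz}} (f - 0 \cdot x_y - 0 \cdot x_z)\,dy\,dz = \pm \iint_{D_{yz}} f\,dy\,dz.
\]

Similarly, using equation (4.35), we can show that if F = ⟨0, g, 0⟩, then

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{xz}} (-0 \cdot y_x + g - 0 \cdot y_z)\,dx\,dz = \pm \iint_{D_{xz}} g\,dx\,dz,
\]

and, using equation (4.33), if F = ⟨0, 0, h⟩,

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{xy}} (-0 \cdot z_x - 0 \cdot z_y + h)\,dx\,dy = \pm \iint_{D_{xy}} h\,dx\,dy.
\]

Putting the above three equations together and writing F = ⟨f, g, h⟩ = ⟨0, 0, h⟩ +
⟨0, g, 0⟩ + ⟨f, 0, 0⟩, we obtain

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \pm \iint_{D_{yz}} f\,dy\,dz \pm \iint_{D_{xz}} g\,dx\,dz \pm \iint_{D_{xy}} h\,dx\,dy
\]

or, as it is usually written,

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \iint_S f\,dy\,dz + g\,dx\,dz + h\,dx\,dy.
\]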

Note. Again, one must be aware that the normal vectors must be consistent with the
desired direction.

Example 4.8.4. Find the flux integral

\[
\iint_S x\,dy\,dz + (y - z)\,dx\,dz + x\,dx\,dy
\]

when S is part of the plane x + 2y + z = 3 in the first octant with the unit normal N of S pointing to the
side of the surface away from the origin.

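Solution. If we let F(x, y, z) = ⟨x, y − z, x⟩, then we rewrite the integral as

\[
\iint_S x\,dy\,dz + (y - z)\,dx\,dz + x\,dx\,dy = \iint_S (\mathbf{F} \cdot \mathbf{N})\,dS.
\]

Since z = 3 − x − 2y on S, the normal vector pointing away from the origin is
$\langle -z_x, -z_y, 1\rangle = \langle 1, 2, 1\rangle$. Then, by equation (4.33),

\[
\begin{aligned}
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS &= \iint_{D_{xy}} \langle x, y - z, x\rangle \cdot \langle 1, 2, 1\rangle\,d\sigma = \iint_{D_{xy}} \left(x + 2(y - z) + x\right) d\sigma \\
&= \iint_{D_{xy}} \left(x + 2(y - 3 + x + 2y) + x\right) d\sigma \\
&= \iint_{D_{xy}} (4x + 6y - 6)\,d\sigma \\
&= \int_0^{\frac{3}{2}} \left(\int_0^{3 - 2y} (4x + 6y - 6)\,dx\right) dy \\
&= \int_0^{\frac{3}{2}} (-4y^2 + 6y)\,dy \\
&= \frac{9}{4}.
\end{aligned}
\]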

4.9 Divergence theorem


4.9.1 Divergence of a three-dimensional vector field

As seen with two-dimensional vector fields, the divergence, which measures the “source”
of a vector field F, is defined to be ∇ ⋅ F. This definition can be extended to three-dimensional
vector fields as well. For example, if F = ⟨f, g, h⟩, where f, g, and h are
three functions of three variables, then the divergence of F is

\[
\text{divergence of } \mathbf{F} = \operatorname{Div} \mathbf{F} = \nabla \cdot \mathbf{F} = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z}.
\]

Physical interpretation of the divergence ∇ ⋅ F


Construct a box with center (x, y, z) of volume 2Δx × 2Δy × 2Δz with sides parallel to the
coordinate planes with one corner at (x + Δx, y + Δy, z + Δz) and the diagonally opposite
corner at (x − Δx, y − Δy, z − Δz), as in Figure 4.32. The values of Δx, Δy, and Δz are small
changes in x, y, and z, respectively. For a continuously differentiable vector field

F(x, y, z) = ⟨f (x, y, z), g(x, y, z), h(x, y, z)⟩,

we calculate the flux per unit volume out of this box, the density of the flux, and then
let Δx, Δy, Δz → 0 to show that the rate of change of the “quantity” of F at (x, y, z) is
∇ ⋅ F.

Figure 4.32: Divergence, the density of flux, box model.

The component of F in the z-direction is h(x, y, z). Hence, the flow out of the face A
(Figure 4.32) of the box is approximately h(x, y, z + Δz)4ΔxΔy, i.e., the flow per unit
time at the center of the face multiplied by the area of the face. Similarly, the flow
into face A′ of the box is approximately h(x, y, z − Δz)4ΔxΔy. Hence, the change in the
z-direction per unit volume is approximately

\[
\frac{h(x, y, z + \Delta z)\,4\Delta x\Delta y - h(x, y, z - \Delta z)\,4\Delta x\Delta y}{8\Delta x\Delta y\Delta z} = \frac{h(x, y, z + \Delta z) - h(x, y, z - \Delta z)}{2\Delta z},
\]

and in the limiting case as Δz → 0, the flux change per unit volume in the z-direction
is

\[
\lim_{\Delta z \to 0} \left[\frac{h(x, y, z + \Delta z) - h(x, y, z - \Delta z)}{2\Delta z}\right] = \frac{\partial h(x, y, z)}{\partial z}.
\]

Similarly, the changes in the x- and y-directions are $\frac{\partial f}{\partial x}$ and $\frac{\partial g}{\partial y}$, so the total rate of
change per unit volume is

\[
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z} = \nabla \cdot \mathbf{F} = \operatorname{Div} \mathbf{F}.
\]

Note. If, for example, F = ⟨f , g, h⟩ is fluid flow at (x, y, z), then the flow may be in
any direction. However, we know that the flow F is equivalent to the flow of its three
components f , g, and h in the direction of the three coordinate axes. This allows us to
use the box method above to compute the total flux.

4.9.2 Divergence theorem

Recall the divergence-flux form of Green’s theorem,

\[
\oint_C (\mathbf{F} \cdot \mathbf{N})\,ds = \iint_D (\nabla \cdot \mathbf{F})\,d\sigma = \iint_D \left(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\right) d\sigma,
\]

which states that the integral of the divergence over a simply connected region D gives
the total flux out of the boundary C of the region D. The version of this form of
Green’s theorem in three-dimensional space is the following divergence theorem.

Figure 4.33: Divergence theorem.

Theorem 4.9.1 (The divergence theorem). Let S be a closed surface in $\mathbb{R}^3$ oriented outward, enclosing
a simply connected region Ω. Let the vector field F = ⟨f (x, y, z), g(x, y, z), h(x, y, z)⟩ be defined and
have continuous partial derivatives on a region containing Ω and S. Then

\[
\iint_S (\mathbf{F} \cdot \mathbf{N})\,dS = \iiint_\Omega \operatorname{Div} \mathbf{F}\,dV = \iiint_\Omega \nabla \cdot \mathbf{F}\,dV.
\]

Note. As noted previously, the quantity ∬S (F ⋅ N)dS measures the flux of the vector
field across the surface S. If, for example, F measures a fluid flow (mass/unit time in
the direction of F) at each point (x, y, z), then ∬S (F ⋅ N)dS measures the amount per
unit time of fluid crossing the surface S in the direction of N. In this case ∭Ω ∇ ⋅ FdV
measures the rate of change of fluid mass in the region Ω. This is because ∇⋅F measures
the flux per unit volume at the point, or in other words, the rate (per unit volume) at
which the fluid quantity is changing at the point (x, y, z).

Verifying the divergence theorem

Example 4.9.1. Compute the flux of F(x, y, z) = zi + yj + xk out of S : x² + y² + z² = 1 using the two
integrals in the divergence theorem.

Solution. We use the parameterization r(u, v) = ⟨sin u cos v, sin u sin v, cos u⟩, 0 ≤ u
≤ π, 0 ≤ v ≤ 2π. Then as computed before

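$$\mathbf r_u\times\mathbf r_v = (\sin^2 u\cos v)\,\mathbf i+(\sin^2 u\sin v)\,\mathbf j+(\sin u\cos u)\,\mathbf k,$$

$$\begin{aligned}
\iint_S (\mathbf F\cdot\mathbf N)\,dS &= \iint_{D_{uv}} \langle f, g, h\rangle\cdot(\mathbf r_u\times\mathbf r_v)\,du\,dv\\
&= \iint_{D_{uv}} \langle\cos u, \sin u\sin v, \sin u\cos v\rangle\cdot\langle\sin^2 u\cos v, \sin^2 u\sin v, \sin u\cos u\rangle\,du\,dv\\
&= \iint_{D_{uv}} (\sin^2 u\cos u\cos v+\sin^3 u\sin^2 v+\sin^2 u\cos u\cos v)\,du\,dv\\
&= \iint_{D_{uv}} \sin^3 u\sin^2 v\,du\,dv\\
&= \int_0^{\pi}\sin^3 u\,du\int_0^{2\pi}\sin^2 v\,dv = -\int_0^{\pi}(1-\cos^2 u)\,d(\cos u)\int_0^{2\pi}\frac{1-\cos 2v}{2}\,dv\\
&= \left(-\cos u+\frac{1}{3}\cos^3 u\right)\Bigg|_0^{\pi}\times\pi = \frac{4\pi}{3}.
\end{aligned}$$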

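We now evaluate the triple integral, where Ω (the unit ball) is the interior of S. We have

$$\iiint_\Omega \nabla\cdot\mathbf F\,dV = \iiint_\Omega\left(\frac{\partial}{\partial x}(z)+\frac{\partial}{\partial y}(y)+\frac{\partial}{\partial z}(x)\right)dV = \iiint_\Omega 1\,dV = \frac{4\pi\cdot 1^3}{3} = \frac{4\pi}{3}.$$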

Note. The last integral did not require a computation because it is the volume of the
unit ball, that is, the solid enclosed by the sphere of radius 1.
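
As a numerical cross-check of the surface integral, the sketch below applies a midpoint rule to the same parameterization. It is only an illustration: the function name and the grid size `n` are assumptions, and the value it produces is an approximation of 4π/3, not a proof.

```python
import math

def flux_through_unit_sphere(n=200):
    """Midpoint-rule estimate of the flux of F = <z, y, x> out of the unit sphere."""
    du = math.pi / n
    dv = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x = math.sin(u) * math.cos(v)
            y = math.sin(u) * math.sin(v)
            z = math.cos(u)
            # On the unit sphere, r_u x r_v = sin(u) * <x, y, z>.
            nx, ny, nz = math.sin(u) * x, math.sin(u) * y, math.sin(u) * z
            total += (z * nx + y * ny + x * nz) * du * dv
    return total

flux = flux_through_unit_sphere()
target = 4 * math.pi / 3
```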

Example 4.9.2. Evaluate the integral

$$\iint_S (xy\,\vec{i} + yz\,\vec{j} + xz\,\vec{k})\cdot d\vec{S},$$

where S is the surface of the cube bounded by the coordinate planes and by x = 1, y = 1, and z = 1, oriented outward.

Solution. Using the divergence theorem, with B denoting the inside of the box, we
have

$$\begin{aligned}
\iint_S (xy\,\vec{i} + yz\,\vec{j} + xz\,\vec{k})\cdot d\vec{S}
&= \iiint_B \nabla\cdot(xy\,\vec{i} + yz\,\vec{j} + xz\,\vec{k})\,dV = \iiint_B (y+z+x)\,dV\\
&= \int_0^1\!\!\int_0^1\!\!\int_0^1 (y+z+x)\,dx\,dy\,dz = \int_0^1\!\!\int_0^1 \left(y+z+\frac{1}{2}\right)dy\,dz\\
&= \int_0^1 (z+1)\,dz = \frac{3}{2}.
\end{aligned}$$

Example 4.9.3. Find the flux across S, the top hemisphere z = √(4 − x² − y²), oriented outward, if
F = ⟨x³ + 2y sin z, x⁴ + y³, $e^{-y^2-x^2}$ + z³⟩.

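Solution. It would be hard to evaluate ∬S F⋅dS directly. Instead, we add S′ : z = 0,
x² + y² ≤ 4 with orientation downward. So S + S′ is closed and oriented outward. Then,

$$\begin{aligned}
\iint_S \mathbf F\cdot d\mathbf S + \iint_{S'} \mathbf F\cdot d\mathbf S &= \iint_{S+S'} \mathbf F\cdot d\mathbf S = \iiint_\Omega \nabla\cdot\mathbf F\,dV\\
&= \iiint_\Omega (3x^2+3y^2+3z^2)\,dV\\
&= 3\int_0^{2\pi}d\theta\int_0^{\pi/2}d\phi\int_0^{2}\rho^4\sin\phi\,d\rho\\
&= 3\int_0^{2\pi}d\theta\int_0^{\pi/2}\sin\phi\,d\phi\int_0^{2}\rho^4\,d\rho = \frac{192\pi}{5}.
\end{aligned}$$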

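Now we need to subtract ∬S′ F⋅dS in order to get the desired flux.

$$\begin{aligned}
\iint_S \mathbf F\cdot d\mathbf S &= \frac{192\pi}{5} - \iint_{S'} \mathbf F\cdot d\mathbf S\\
&= \frac{192\pi}{5} - \iint_{S'} \langle x^3+2y\sin z,\; x^4+y^3,\; e^{-y^2-x^2}+z^3\rangle\cdot\langle 0, 0, -1\rangle\,dS\\
&= \frac{192\pi}{5} + \iint_{S'} e^{-y^2-x^2}\,dS\\
&= \frac{192\pi}{5} + \iint_{x^2+y^2\le 4} e^{-y^2-x^2}\,d\sigma = \frac{192\pi}{5} + \int_0^{2\pi}d\theta\int_0^{2} e^{-r^2}\,r\,dr\\
&= \frac{192\pi}{5} - 2\pi\left(\frac{1}{2}e^{-4}-\frac{1}{2}\right).
\end{aligned}$$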
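
Although the direct surface integral is awkward by hand, it is easy to approximate numerically, which gives an independent check of the answer. The sketch below is an illustration under stated assumptions: the spherical parameterization of the hemisphere, the midpoint rule, and the grid size `n` are choices made here, not taken from the text.

```python
import math

def hemisphere_flux(n=300):
    """Midpoint-rule flux of F through the hemisphere z = sqrt(4 - x^2 - y^2)."""
    R = 2.0
    du = (math.pi / 2) / n
    dv = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        su, cu = math.sin(u), math.cos(u)
        for j in range(n):
            v = (j + 0.5) * dv
            x, y, z = R * su * math.cos(v), R * su * math.sin(v), R * cu
            f = x**3 + 2 * y * math.sin(z)
            g = x**4 + y**3
            h = math.exp(-y * y - x * x) + z**3
            # For a sphere of radius R, r_u x r_v = R sin(u) <x, y, z> (outward).
            total += (f * x + g * y + h * z) * R * su * du * dv
    return total

numeric = hemisphere_flux()
closed_form = 192 * math.pi / 5 + math.pi * (1 - math.exp(-4))
```

Here `closed_form` is the value obtained above, rewritten as 192π/5 + π(1 − e⁻⁴).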

Example 4.9.4. An electric charge q at the origin creates an electric field F at r (the position vector
from the origin) given by

$$\mathbf F = \frac{q}{4\pi\varepsilon_0}\,\frac{\mathbf r}{r^3},$$

where r = |r| and ε₀ is a constant. Compute ∇ ⋅ F and find the flux across the surface S of the sphere B :
x² + y² + z² ≤ b². Find the flux across any closed surface S₁ containing the charge.

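Solution. A vector field of the form

$$k\,\frac{\mathbf r}{r^3} = k\left\langle \frac{x}{r^3}, \frac{y}{r^3}, \frac{z}{r^3}\right\rangle,$$

where k is a constant, is an inverse square field (such as the fields of an electric charge and of gravity).
Its divergence is zero away from the origin. To see this, note that r = (x² + y² + z²)^(1/2), so r² = x² + y² + z².
Differentiating with respect to x gives 2r ∂r/∂x = 2x, and hence ∂r/∂x = x/r. Thus,

$$\frac{\partial}{\partial x}\left(\frac{x}{r^3}\right) = \frac{r^3 - x\cdot 3r^2\,\frac{\partial r}{\partial x}}{r^6} = \frac{1}{r^3}-\frac{3x^2}{r^5}.$$

By symmetry, we have

$$\begin{aligned}
\nabla\cdot k\,\frac{\mathbf r}{r^3} &= k\,\nabla\cdot\left\langle \frac{x}{r^3}, \frac{y}{r^3}, \frac{z}{r^3}\right\rangle\\
&= k\left(\frac{1}{r^3}-\frac{3x^2}{r^5}\right) + k\left(\frac{1}{r^3}-\frac{3y^2}{r^5}\right) + k\left(\frac{1}{r^3}-\frac{3z^2}{r^5}\right)\\
&= \frac{3k}{r^3} - 3k\left(\frac{x^2+y^2+z^2}{r^5}\right),
\end{aligned}$$

$$\nabla\cdot\mathbf F = \frac{3k}{r^3}-\frac{3k}{r^3} = 0.$$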
Using the divergence theorem when F is an inverse square field would give ∬S F ⋅ dS =
∭Ω ∇ ⋅ F dV = 0, but this is a wrong answer, because the vector field F is not defined
at the origin (in fact |F| approaches ∞ as (x, y, z) → (0, 0, 0)), so the hypotheses of the theorem fail on Ω.
Hence, we must evaluate the flux integral directly without using the divergence
theorem. The outward unit normal to the sphere x² + y² + z² = b² at (x, y, z) ∈ S is
(1/b)⟨x, y, z⟩, so

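$$\begin{aligned}
\iint_S \mathbf F\cdot d\mathbf S &= \iint_S (\mathbf F\cdot\mathbf N)\,dS\\
&= \frac{q}{4\pi\varepsilon_0}\iint_S \left(\frac{x\mathbf i+y\mathbf j+z\mathbf k}{(\sqrt{x^2+y^2+z^2})^3}\right)\cdot\left(\frac{x\mathbf i+y\mathbf j+z\mathbf k}{b}\right)dS\\
&= \frac{q}{4\pi\varepsilon_0}\iint_S \frac{x^2+y^2+z^2}{b^4}\,dS\\
&= \frac{q}{4\pi\varepsilon_0}\iint_S \frac{1}{b^2}\,dS\\
&= \frac{q}{4\pi b^2\varepsilon_0}\cdot 4\pi b^2 \quad(\text{since } 4\pi b^2 \text{ is the area of } S)\\
&= \frac{q}{\varepsilon_0}.
\end{aligned}$$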
Hence, the flux through the sphere of any radius is q/ε₀.
In fact, the flux through any closed orientable surface S₁ containing the charge is
the same value q/ε₀. To see this, let the radius b of the sphere S be sufficiently large so
that S totally encloses S₁. If F = (q/(4πε₀)) r/r³, then the region R between the two surfaces
satisfies the divergence theorem because this region no longer contains the origin.
Hence,

$$\iint_{S+S_1} (\mathbf F\cdot\mathbf N)\,dS = \iiint_R \nabla\cdot\mathbf F\,dV = 0 \implies \iint_S (\mathbf F\cdot\mathbf N)\,dS + \iint_{S_1} (\mathbf F\cdot\mathbf N)\,dS = 0,$$

$$-\iint_{S_1} (\mathbf F\cdot\mathbf N)\,dS = \iint_S (\mathbf F\cdot\mathbf N)\,dS = \frac{q}{\varepsilon_0}.$$

But the unit normal N on S₁ will point out of the region R, which means it points into
the region containing the charge. Changing the direction of N to point outwards from
the charge q shows that

$$\iint_{S_1} (\mathbf F\cdot\mathbf N)\,dS = \frac{q}{\varepsilon_0}.$$
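
The claim that the flux does not depend on the shape of the enclosing surface can be tested numerically on a surface that is not a sphere. The sketch below is an illustration only: the cube [−1, 1]³, the normalization k = 1, and the grid size `n` are assumptions made here. With k = 1 the expected flux is 4π, which corresponds to q/ε₀ when k = q/(4πε₀).

```python
import math

def flux_through_cube(k=1.0, n=200):
    """Flux of F = k r / |r|^3 out of the cube [-1, 1]^3 (midpoint rule).

    By symmetry all six faces carry the same flux, so only the face z = 1
    (outward normal <0, 0, 1>) is integrated.
    """
    h = 2.0 / n
    face = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            r3 = (x * x + y * y + 1.0) ** 1.5
            face += k / r3 * h * h   # F . N = k z / r^3 with z = 1
    return 6.0 * face

flux = flux_through_cube()
target = 4 * math.pi
```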

Proof of the divergence theorem under special conditions


Suppose the solid Ω is bounded by S, which consists of two surfaces, the lower one S1 :
z = z1 (x, y) and the upper one S2 : z = z2 (x, y). The projection of S onto the xy-plane is
Dxy , as shown in Figure 4.33. Since

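$$\iint_S (\mathbf F\cdot\mathbf N)\,dS = \iint_S (f\mathbf i+g\mathbf j+h\mathbf k)\cdot\mathbf N\,dS = \iint_S f\mathbf i\cdot\mathbf N\,dS + \iint_S g\mathbf j\cdot\mathbf N\,dS + \iint_S h\mathbf k\cdot\mathbf N\,dS$$

and

$$\iiint_\Omega \nabla\cdot\mathbf F\,dV = \iiint_\Omega \left(\frac{\partial f}{\partial x}+\frac{\partial g}{\partial y}+\frac{\partial h}{\partial z}\right)dV,$$

we start by showing that

$$\iiint_\Omega \frac{\partial h}{\partial z}\,dV = \iint_S h\mathbf k\cdot\mathbf N\,dS. \tag{4.37}$$

But this follows from

$$\begin{aligned}
\iiint_\Omega \frac{\partial h}{\partial z}\,dV &= \iint_{D_{xy}}\left(\int_{z_1(x,y)}^{z_2(x,y)} \frac{\partial h}{\partial z}\,dz\right)dx\,dy\\
&= \iint_{D_{xy}} \big(h(x, y, z_2(x, y)) - h(x, y, z_1(x, y))\big)\,dx\,dy
\end{aligned}$$

and

$$\begin{aligned}
\iint_S h\mathbf k\cdot\mathbf N\,dS &= \iint_{S_1} h\mathbf k\cdot\mathbf N\,dS + \iint_{S_2} h\mathbf k\cdot\mathbf N\,dS\\
&= -\iint_{D_{xy}} h\mathbf k\cdot\langle -z_x, -z_y, 1\rangle\,dx\,dy + \iint_{D_{xy}} h\mathbf k\cdot\langle -z_x, -z_y, 1\rangle\,dx\,dy\\
&= -\iint_{D_{xy}} h(x, y, z_1(x, y))\,dx\,dy + \iint_{D_{xy}} h(x, y, z_2(x, y))\,dx\,dy\\
&= \iint_{D_{xy}} \big(h(x, y, z_2(x, y)) - h(x, y, z_1(x, y))\big)\,dx\,dy.
\end{aligned}$$

Similarly, we also have

$$\iiint_\Omega \frac{\partial g}{\partial y}\,dV = \iint_S g\mathbf j\cdot\mathbf N\,dS \quad\text{and}\quad \iiint_\Omega \frac{\partial f}{\partial x}\,dV = \iint_S f\mathbf i\cdot\mathbf N\,dS.$$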

Adding them up gives the divergence theorem.



4.10 Stokes theorem


4.10.1 The curl of a three-dimensional vector field

Recall that when we were trying to find the circulation along a plane curve C over a
vector field F = ⟨f, g⟩, we saw the term ∂g/∂x − ∂f/∂y in Green’s theorem in curl-circulation
form, and we noted that

$$\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}$$

is the k-component of the curl, a vector relating to the rotational effect of a vector field.
If we want to find circulation along a simple closed curve over a vector field
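in ℝ³, what can we expect for the i- or j-components of the curl? What do the i- or
j-components look like? In general, we define the curl of the vector field F = ⟨f, g, h⟩
as

$$\operatorname{curl}\mathbf F = \left(\frac{\partial h}{\partial y}-\frac{\partial g}{\partial z}\right)\mathbf i + \left(\frac{\partial f}{\partial z}-\frac{\partial h}{\partial x}\right)\mathbf j + \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)\mathbf k.$$

This is better written as

$$\operatorname{curl}\mathbf F = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] f & g & h \end{vmatrix}.$$

We recall the notation ∇ = ⟨∂/∂x, ∂/∂y, ∂/∂z⟩, so we have the even simpler notation

$$\operatorname{curl}\mathbf F = \nabla\times\mathbf F.$$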

If the curl of a vector field is always 0, then the field is irrotational. An irrotational
vector field F might be conservative, that is, there exists a potential function φ(x, y, z)
such that F = ∇φ.

Example 4.10.1. First compute the curl of the vector field given by

F(x, y, z) = ⟨2xy − z, x² + 4y, −x + 2z⟩.

Is the field irrotational? Attempt to find a potential function φ for the vector field F.

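Solution. Note that f = 2xy − z, g = x² + 4y, and h = −x + 2z. We first compute curl F.
We have

$$\begin{aligned}
\operatorname{curl}\mathbf F = \nabla\times\mathbf F &= \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] 2xy-z & x^2+4y & -x+2z \end{vmatrix}\\
&= \left(\frac{\partial(-x+2z)}{\partial y}-\frac{\partial(x^2+4y)}{\partial z}\right)\mathbf i + \left(\frac{\partial(2xy-z)}{\partial z}-\frac{\partial(-x+2z)}{\partial x}\right)\mathbf j\\
&\quad + \left(\frac{\partial(x^2+4y)}{\partial x}-\frac{\partial(2xy-z)}{\partial y}\right)\mathbf k\\
&= (0-0)\,\mathbf i + (-1-(-1))\,\mathbf j + (2x-2x)\,\mathbf k = 0\mathbf i+0\mathbf j+0\mathbf k.
\end{aligned}$$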

This field is irrotational. Now, we attempt to find a potential function φ. We know that
φ satisfies

$$\frac{\partial\varphi}{\partial x} = 2xy-z,\qquad \frac{\partial\varphi}{\partial y} = x^2+4y,\qquad\text{and}\qquad \frac{\partial\varphi}{\partial z} = -x+2z.$$

Step 1: We integrate the first component of F with respect to x to find φ (incompletely),
i. e.,

$$\varphi(x, y, z) = \int (2xy-z)\,dx = x^2y - xz + C(y, z).$$

Step 2: We differentiate this φ with respect to y, which must give the second component of F, and
use this to deduce more information about φ, i. e.,

$$\begin{aligned}
\frac{\partial\varphi}{\partial y} &= x^2 + \frac{\partial C(y, z)}{\partial y} \quad(\text{but this must be equal to } x^2+4y)\\
&\implies \frac{\partial C(y, z)}{\partial y} = 4y\\
&\implies C(y, z) = 2y^2 + C(z)\\
&\implies \varphi(x, y, z) = x^2y - xz + 2y^2 + C(z).
\end{aligned}$$

Step 3: We differentiate φ again with respect to z, which must give the third component of F, and
we use this to determine φ(x, y, z) (up to a constant), i. e.,

$$\begin{aligned}
\frac{\partial\varphi}{\partial z} &= -x + \frac{dC(z)}{dz} \quad(\text{but this must be equal to } -x+2z)\\
&\implies \frac{dC(z)}{dz} = 2z\\
&\implies C(z) = z^2 + C.
\end{aligned}$$

Hence, we have

$$\varphi(x, y, z) = x^2y - xz + 2y^2 + z^2 + C,$$

where C is an arbitrary constant. The simplest answer is when C = 0 : φ(x, y, z) =
x²y − xz + 2y² + z².
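
A candidate potential can always be checked by differentiating it. The following sketch does this numerically for the potential found in this example; the helper name `grad`, the step `d`, and the test point are assumptions chosen for illustration.

```python
def F(x, y, z):
    """The vector field of Example 4.10.1."""
    return (2 * x * y - z, x * x + 4 * y, -x + 2 * z)

def phi(x, y, z):
    """The potential found in the example (with C = 0)."""
    return x * x * y - x * z + 2 * y * y + z * z

def grad(func, x, y, z, d=1e-6):
    """Central-difference gradient of a scalar function."""
    return (
        (func(x + d, y, z) - func(x - d, y, z)) / (2 * d),
        (func(x, y + d, z) - func(x, y - d, z)) / (2 * d),
        (func(x, y, z + d) - func(x, y, z - d)) / (2 * d),
    )

point = (1.3, -0.7, 2.1)   # arbitrary test point
max_error = max(abs(a - b) for a, b in zip(grad(phi, *point), F(*point)))
```

A small `max_error` at several test points is good evidence (though not a proof) that ∇φ = F.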

4.10.2 Stokes theorem

Recall Green’s theorem in curl-circulation form. If C is a simple and closed plane curve
which is the boundary of a simply connected region D, we have

$$\oint_C \mathbf F\cdot\mathbf T\,ds = \iint_D \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\,dy.$$

Now, in three-dimensional space, if C is a simple closed space curve, which is the
boundary of a smooth orientable surface, is there any similar result? The answer is
Stokes theorem, stated below.

Figure 4.34: Stokes theorem.

Theorem 4.10.1 (Stokes theorem). Suppose that S is a bounded simple orientable smooth surface with
unit normal N and boundary curve C that is oriented with unit tangent T as described in the preceding
sections (a corkscrew following the direction of N turns in the same direction as the positive direction
on C). Let F(x, y, z) be a continuously differentiable vector-valued function defined on S. Then

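$$\oint_C \mathbf F\cdot\mathbf T\,ds = \iint_S (\nabla\times\mathbf F)\cdot\mathbf N\,dS = \iint_S (\operatorname{Curl}\mathbf F\cdot\mathbf N)\,dS,$$

or in another form,

$$\oint_C f\,dx+g\,dy+h\,dz = \iint_S \left(\frac{\partial h}{\partial y}-\frac{\partial g}{\partial z}\right)dy\,dz + \left(\frac{\partial f}{\partial z}-\frac{\partial h}{\partial x}\right)dz\,dx + \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\,dy.$$

Note. The surface integral in Stokes theorem can be written in different forms, i. e.,

$$\begin{aligned}
\oint_C \mathbf F\cdot\mathbf T\,ds &= \iint_S \left(\frac{\partial h}{\partial y}-\frac{\partial g}{\partial z}\right)dy\,dz + \left(\frac{\partial f}{\partial z}-\frac{\partial h}{\partial x}\right)dz\,dx + \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\,dy\\
&= \iint_S \begin{vmatrix} dy\,dz & dz\,dx & dx\,dy\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] f & g & h \end{vmatrix}\\
&= \iint_S \begin{vmatrix} \cos\alpha & \cos\beta & \cos\gamma\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] f & g & h \end{vmatrix} dS \quad(\text{note that } \mathbf N = \langle\cos\alpha, \cos\beta, \cos\gamma\rangle)\\
&= \iint_S (\nabla\times\mathbf F\cdot\mathbf N)\,dS.
\end{aligned}$$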

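Note. In the two-dimensional case, the surface S is a planar region D, and the unit
normal vector N is the basis vector k = ⟨0, 0, 1⟩, so Stokes theorem becomes

$$\oint_C \mathbf F\cdot\mathbf T\,ds = \iint_D \nabla\times\mathbf F\cdot\mathbf k\,dx\,dy \tag{4.38}$$

$$\phantom{\oint_C \mathbf F\cdot\mathbf T\,ds} = \iint_D \begin{vmatrix} 0 & 0 & 1\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] f & g & h \end{vmatrix} dx\,dy = \iint_D \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\,dy, \tag{4.39}$$

which is exactly Green’s theorem in circulation-curl form. Thus, Stokes theorem is an
extension of Green’s theorem to three-dimensional space.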

Note. There is more than one smooth surface with the same boundary C; however, the
integral ∬S ∇×F ⋅ NdS gives the same value for each of these smooth surfaces satisfying
the conditions of the theorem.

Verifying Stokes theorem

Example 4.10.2. Let the vector field F be ⟨2z − y, x, y⟩, and let the surface S be the upper hemisphere
z = √(4 − x² − y²) oriented outward. The boundary curve C of S is x² + y² = 4 in the xy-plane oriented
counterclockwise. Evaluate:
1. ∬S ∇ × F ⋅ NdS.
2. ∬S₁ ∇ × F ⋅ NdS, where S₁ is the disk with boundary C and with N pointing upward.
3. ∮C F ⋅ Tds.

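Solution. The surface and its orientation are shown in Figure 4.35. We first compute the
curl of F, which is

$$\operatorname{curl}\mathbf F = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] 2z-y & x & y \end{vmatrix} = \mathbf i+2\mathbf j+2\mathbf k.$$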

Figure 4.35: Stokes theorem, Example 4.10.2, Example 4.10.4, and Example 4.10.5.

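1. Since z_x = −x/√(4 − x² − y²) and z_y = −y/√(4 − x² − y²), we have

$$\begin{aligned}
\iint_S \nabla\times\mathbf F\cdot\mathbf N\,dS &= \iint_{D_{xy}} \langle 1, 2, 2\rangle\cdot\langle -z_x, -z_y, 1\rangle\,dx\,dy\\
&= \iint_{D_{xy}} \left(\frac{x}{\sqrt{4-x^2-y^2}} + \frac{2y}{\sqrt{4-x^2-y^2}} + 2\right)dx\,dy\\
&= \iint_{D_{xy}} 2\,dx\,dy = 2\times\pi 2^2 = 8\pi,
\end{aligned}$$

where the first two terms integrate to zero because they are odd in x and in y, respectively, and the disk D_xy is symmetric about both axes.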

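2. Since N can be taken as ⟨0, 0, 1⟩, we have

$$\iint_{S_1} \nabla\times\mathbf F\cdot\mathbf N\,dS = \iint_{D_{xy}} \langle 1, 2, 2\rangle\cdot\langle 0, 0, 1\rangle\,dx\,dy = \iint_{D_{xy}} 2\,dx\,dy = 8\pi.$$

3. We parameterize C : r(t) = ⟨2 cos t, 2 sin t, 0⟩, 0 ≤ t ≤ 2π, so r′(t) = ⟨−2 sin t, 2 cos t, 0⟩. Then

$$\begin{aligned}
\oint_C \mathbf F\cdot\mathbf T\,ds &= \int_0^{2\pi} \langle 2z-y, x, y\rangle\cdot\langle -2\sin t, 2\cos t, 0\rangle\,dt\\
&= \int_0^{2\pi} \langle 2\times 0-2\sin t, 2\cos t, 2\sin t\rangle\cdot\langle -2\sin t, 2\cos t, 0\rangle\,dt\\
&= \int_0^{2\pi} (4\sin^2 t+4\cos^2 t)\,dt = 8\pi.
\end{aligned}$$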
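
The circulation in part 3 can also be approximated directly from the parameterization. The sketch below is an illustration, with the function name and the number of sample points `n` chosen here as assumptions; it applies the midpoint rule to the integrand F(r(t)) ⋅ r′(t).

```python
import math

def circulation(n=1000):
    """Midpoint-rule value of the circulation of F = <2z - y, x, y> around C."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y, z = 2 * math.cos(t), 2 * math.sin(t), 0.0
        dx, dy, dz = -2 * math.sin(t), 2 * math.cos(t), 0.0   # r'(t)
        total += ((2 * z - y) * dx + x * dy + y * dz) * dt
    return total

value = circulation()
target = 8 * math.pi
```

Since the integrand is constant along this curve, the numerical value agrees with 8π up to rounding.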

Example 4.10.3. Evaluate ∬S (∇ × F) ⋅ dS = ∬S (∇ × F) ⋅ N dS, when F = ⟨xz, yz, xy⟩ and S is the part
of the sphere x² + y² + z² = 4 inside the cylinder x² + y² = 1 with z ≥ 0.

Solution. By Stokes theorem,

$$\iint_S (\nabla\times\vec{F})\cdot\vec{N}\,dS = \oint_C \vec{F}\cdot\vec{T}\,ds,$$

where C is the boundary curve x² + y² = 1, on which z² = 4 − (x² + y²) = 3, so z = √3.
We can represent C in vector form with positive orientation as

$$\vec{r}(t) = \cos t\,\vec{i} + \sin t\,\vec{j} + \sqrt{3}\,\vec{k},\quad 0\le t\le 2\pi.$$

Hence, on C : F = ⟨xz, yz, xy⟩ = ⟨√3 cos t, √3 sin t, cos t sin t⟩, and

$$\begin{aligned}
\oint_C \vec{F}\cdot\vec{T}\,ds &= \int_0^{2\pi} \vec{F}\cdot\frac{d\vec{r}}{dt}\,dt\\
&= \int_0^{2\pi} \langle\sqrt{3}\cos t, \sqrt{3}\sin t, \cos t\sin t\rangle\cdot\langle -\sin t, \cos t, 0\rangle\,dt\\
&= \int_0^{2\pi} (-\sqrt{3}\cos t\sin t+\sqrt{3}\sin t\cos t)\,dt\\
&= 0.
\end{aligned}$$

Therefore, ∬S (∇ × F) ⋅ dS = 0.

Example 4.10.4. If F = ⟨z²y, −3xy, $e^{-x^2}y^3$⟩ and S is part of the surface z = 5 − x² − y² above z = 4,
oriented upward, find ∬S curl F ⋅ dS.

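Solution. The curl of F is

$$\operatorname{curl}\mathbf F = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] z^2y & -3xy & e^{-x^2}y^3 \end{vmatrix} = \left\langle 3y^2e^{-x^2},\; 2xe^{-x^2}y^3+2zy,\; -z^2-3y\right\rangle.$$

To evaluate ∬S curl F ⋅ dS directly would be hard. We now use Stokes theorem. The
boundary curve is the circle x² + y² = 1 in the plane z = 4, which we orient counterclockwise as viewed
from above. We parameterize the curve by r(t) = ⟨cos t, sin t, 4⟩, 0 ≤ t ≤ 2π. Then,

$$\begin{aligned}
\iint_S \operatorname{curl}\mathbf F\cdot d\mathbf S &= \oint_C \mathbf F\cdot d\mathbf r = \oint_C \langle z^2y, -3xy, e^{-x^2}y^3\rangle\cdot d\langle\cos t, \sin t, 4\rangle\\
&= \int_0^{2\pi} \langle 4^2\sin t, -3\cos t\sin t, e^{-\cos^2 t}(\sin t)^3\rangle\cdot\langle -\sin t, \cos t, 0\rangle\,dt\\
&= \int_0^{2\pi} (-16\sin^2 t-3\sin t\cos^2 t)\,dt\\
&= -8\int_0^{2\pi} (1-\cos 2t)\,dt = -16\pi.
\end{aligned}$$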

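Or, we choose an alternative surface with the same boundary. This alternative surface could
be the disk S₁ : x² + y² ≤ 1, z = 4. Its upward unit normal vector is ⟨0, 0, 1⟩. Thus,

$$\begin{aligned}
\iint_S \operatorname{curl}\mathbf F\cdot d\mathbf S &= \iint_{S_1} \operatorname{curl}\mathbf F\cdot\langle 0, 0, 1\rangle\,dS\\
&= \iint_{S_1} \langle 3y^2e^{-x^2},\; 2xe^{-x^2}y^3+2zy,\; -z^2-3y\rangle\cdot\langle 0, 0, 1\rangle\,dS\\
&= \iint_{S_1} (-z^2-3y)\,dS = \iint_{S_1} (-16-3y)\,dS = \iint_D (-16-3y)\,d\sigma\\
&= \int_0^{2\pi}d\theta\int_0^1 (-16-3r\sin\theta)\,r\,dr\\
&= \int_0^{2\pi} (-8-\sin\theta)\,d\theta = -16\pi.
\end{aligned}$$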


Example 4.10.5. Compute ∮C F ⋅ dr, where F = xz i + xy j + 3xz k and C is the triangular closed
curve with vertices followed in the order (1, 0, 0), (0, 2, 0), (0, 0, 2), (1, 0, 0).

Solution. If S is the triangular part of the plane 2x + y + z = 2 defined by the three
vertices (in the first octant and with boundary curve C), then Stokes theorem gives

$$\oint_C \vec{F}\cdot d\vec{r} = \oint_C \vec{F}\cdot\vec{T}\,ds = \iint_S (\nabla\times\vec{F})\cdot\vec{N}\,dS,$$

and ∇ × F = ⟨0, x − 3z, y⟩. Hence,

$$\iint_S (\nabla\times\vec{F})\cdot\vec{N}\,dS = \iint_S \langle 0, x-3z, y\rangle\cdot\vec{N}\,dS.$$

Evaluating this using equation (4.33), we find ∬D (−f z_x − g z_y + h)dxdy, using z = z(x, y) =
2 − 2x − y, f = 0, g = x − 3z, and h = y, where D, the projection of S onto the xy-plane, is
the triangle with vertices (0, 0, 0), (1, 0, 0), (0, 2, 0), and ⟨−z_x, −z_y, 1⟩ = ⟨2, 1, 1⟩. This
is the correct orientation for the plane. Hence, the integral becomes

$$\begin{aligned}
\oint_C \vec{F}\cdot d\vec{r} = \iint_S (\nabla\times\vec{F})\cdot\vec{N}\,dS &= \iint_D ((x-3z)+y)\,d\sigma\\
&= \iint_D (x+y-3(2-2x-y))\,d\sigma\\
&= \int_0^1\!\!\int_0^{2-2x} (7x+4y-6)\,dy\,dx\\
&= \int_0^1 (-6x^2+10x-4)\,dx\\
&= -1.
\end{aligned}$$
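
The same value can be obtained by integrating along the three edges of the triangle, which serves as an independent check of the Stokes computation. The sketch below is an illustration only; the helper `segment_integral` and the number of subintervals are assumptions made here.

```python
def F(x, y, z):
    """The vector field of Example 4.10.5."""
    return (x * z, x * y, 3 * x * z)

def segment_integral(field, a, b, n=1000):
    """Midpoint-rule line integral of field along the straight segment a -> b."""
    d = tuple(bi - ai for ai, bi in zip(a, b))   # r'(t) for r(t) = a + t (b - a)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        p = tuple(ai + t * di for ai, di in zip(a, d))
        total += sum(fi * di for fi, di in zip(field(*p), d)) / n
    return total

vertices = [(1, 0, 0), (0, 2, 0), (0, 0, 2), (1, 0, 0)]
circulation = sum(
    segment_integral(F, vertices[k], vertices[k + 1]) for k in range(3)
)
```

The three edge integrals are 2/3, 0, and −5/3, so `circulation` should be close to −1.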


Example 4.10.6. Show, using Stokes theorem, that ∮C F ⋅ T ds = 0 for any gradient vector field F with
continuous partial derivatives and any simple orientable smooth surface S with unit normal N and boundary
curve C in ℝ³.

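Solution. We know that if F = ∇φ for some function φ(x, y, z) (a potential function of
F), then ∇ × ∇φ = 0 because

$$\begin{aligned}
\nabla\times\nabla\varphi &= \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] \dfrac{\partial\varphi}{\partial x} & \dfrac{\partial\varphi}{\partial y} & \dfrac{\partial\varphi}{\partial z} \end{vmatrix}\\
&= \left(\frac{\partial^2\varphi}{\partial y\partial z}-\frac{\partial^2\varphi}{\partial z\partial y}\right)\mathbf i + \left(\frac{\partial^2\varphi}{\partial z\partial x}-\frac{\partial^2\varphi}{\partial x\partial z}\right)\mathbf j + \left(\frac{\partial^2\varphi}{\partial x\partial y}-\frac{\partial^2\varphi}{\partial y\partial x}\right)\mathbf k\\
&= \mathbf 0.
\end{aligned}$$

Hence, by Stokes theorem,

$$\oint_C \mathbf F\cdot\mathbf T\,ds = \iint_S (\nabla\times\mathbf F)\cdot\mathbf N\,dS = \iint_S (\nabla\times\nabla\varphi)\cdot\mathbf N\,dS = \iint_S 0\,dS = 0.$$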

By Stokes theorem, we can conclude that if f, g, and h have continuous partial
derivatives, then

$$\mathbf F = \langle f, g, h\rangle \text{ is conservative} \iff \underbrace{\oint_C \mathbf F\cdot\mathbf T\,ds = 0}_{\text{for any simple closed curve } C} \iff \operatorname{curl}\mathbf F = \mathbf 0.$$

Note that the proof of the fundamental theorem of line integrals can be extended to
three-dimensional vector fields in ways similar to the results we have obtained for
two-dimensional vector fields. For a three-dimensional vector field F = ⟨f, g, h⟩, where f,
g, and h all have continuous partial derivatives on a simply connected region D in
ℝ³, we have

$$\begin{aligned}
&\mathbf F = \langle f, g, h\rangle \text{ is conservative}\\
&\iff \text{there is a function } \varphi \text{ such that } \mathbf F = \nabla\varphi, \text{ or } d\varphi = f\,dx+g\,dy+h\,dz\\
&\iff \int_C \mathbf F\cdot\mathbf T\,ds = \int_C f\,dx+g\,dy+h\,dz \text{ is path-independent}\\
&\iff \oint_C \mathbf F\cdot\mathbf T\,ds = 0 \text{ for any simple closed curve } C \text{ in } D\\
&\iff \operatorname{curl}\mathbf F = \mathbf 0.
\end{aligned}$$

The last condition means φ_yz = φ_zy, φ_xz = φ_zx, and φ_xy = φ_yx.

To find a potential function for a conservative field, we can follow the method used
in Example 4.10.1.

Example 4.10.7. Evaluate ∫C ydx + xdy + 2zdz, where

$$C:\ \mathbf r(t) = \frac{t(t-1)}{e^{\sqrt{t}}}\,\mathbf i + \sin\left(\frac{\pi}{2}t^2\right)\mathbf j + \frac{t}{t^2+1}\,\mathbf k,\quad 0\le t\le 1.$$

Solution. Since

$$\nabla\times\mathbf F = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] y & x & 2z \end{vmatrix} = \mathbf 0,$$

this field is conservative and, therefore, the integral is path-independent. The two endpoints are
(0, 0, 0) and (0, 1, 1/2). We could have found a potential function, but we simply choose
a simple route between the endpoints. If we choose the line segment r₁(t) = t⟨0, 1, 1/2⟩,
for 0 ≤ t ≤ 1, then we have x = 0, y = t, and z = t/2 and

$$\int_C y\,dx+x\,dy+2z\,dz = \int_0^1 \left(0+0+\frac{t}{2}\right)dt = \frac{1}{4}.$$
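
Path independence can be checked by integrating along the original, more complicated curve. The sketch below is an illustration under assumptions: it approximates the curve by `n` chords and evaluates F at the midpoint of each chord. Because F = ⟨y, x, 2z⟩ is the gradient of the quadratic function xy + z², this particular rule telescopes, so the sum depends only on the endpoints.

```python
import math

def r(t):
    """The curve C of Example 4.10.7."""
    return (t * (t - 1) / math.exp(math.sqrt(t)),
            math.sin(math.pi * t * t / 2),
            t / (t * t + 1))

def F(x, y, z):
    return (y, x, 2 * z)

def line_integral(n=2000):
    """Sum of F(midpoint of chord) . (chord) over n chords of the curve."""
    total = 0.0
    prev = r(0.0)
    for i in range(1, n + 1):
        cur = r(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        step = tuple(b - a for a, b in zip(prev, cur))
        total += sum(fi * si for fi, si in zip(F(*mid), step))
        prev = cur
    return total

value = line_integral()
```

The result should match the value 1/4 found along the straight segment.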

Proof of Stokes theorem under special conditions


See Figure 4.34. Suppose the surface S : z = z(x, y) is smooth and C is its boundary with
compatible orientations. The projection of S onto the xy-plane is D, and the projection
of C onto the xy-plane is C′. Furthermore, we assume z_xy = z_yx. Note that dz = z_x dx +
z_y dy. Then,

$$\begin{aligned}
\oint_C \mathbf F\cdot\mathbf T\,ds &= \oint_C f\,dx+g\,dy+h\,dz = \oint_{C'} f\,dx+g\,dy+h(z_x\,dx+z_y\,dy)\\
&= \oint_{C'} (f+hz_x)\,dx+(g+hz_y)\,dy\\
&= \iint_D \left(\frac{\partial}{\partial x}(g+hz_y)-\frac{\partial}{\partial y}(f+hz_x)\right)d\sigma \quad(\text{by Green's theorem})\\
&= \iint_D \Big(\big(g_x+g_zz_x+(h_x+h_zz_x)z_y+hz_{yx}\big) - \big(f_y+f_zz_y+(h_y+h_zz_y)z_x+hz_{xy}\big)\Big)\,d\sigma\\
&= \iint_D \big((g_x-f_y)+z_x(g_z-h_y)+z_y(h_x-f_z)\big)\,d\sigma\\
&= \iint_D \langle h_y-g_z,\; f_z-h_x,\; g_x-f_y\rangle\cdot\langle -z_x, -z_y, 1\rangle\,d\sigma\\
&= \iint_S \nabla\times\mathbf F\cdot\mathbf N\,dS.
\end{aligned}$$

This completes the proof.

Interpretation of curl
Now we can shed more light on the meaning of the curl vector using Stokes theorem.
Let P₀ be a point on the surface S and let S_P₀ be a very small patch of S containing P₀,
with boundary curve C. Let A(S_P₀) be the area of the small patch. Then, under the continuity
assumption, we have

$$\oint_C \mathbf F\cdot d\mathbf r = \iint_{S_{P_0}} (\operatorname{curl}\mathbf F\cdot\mathbf N)\,dS \approx (\operatorname{curl}\mathbf F(P_0)\cdot\mathbf N(P_0))\cdot A(S_{P_0}),$$

$$\operatorname{curl}\mathbf F(P_0)\cdot\mathbf N(P_0) \approx \frac{\oint_C \mathbf F\cdot d\mathbf r}{A(S_{P_0})}.$$

Taking the limit as the small patch contracts to P₀, we see that curl F(P₀) ⋅ N(P₀) is
the circulation density at the point P₀. Thus, the integration of curl F ⋅ N generates the
total circulation along the boundary curve C. Also, one sees that the greatest circulation
density occurs when curl F is parallel to N, in which case we have the greatest curling
effect.
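
The circulation-density interpretation can be tried out numerically. The following sketch is an illustration with several assumptions: the patch is taken to be a small flat disk of radius `eps`, the field is the one from Example 4.10.2 (whose curl is ⟨1, 2, 2⟩), and the point `p0` is arbitrary. The ratio of circulation to area should reproduce curl F ⋅ N for each choice of N.

```python
import math

def circulation_density(field, p, e1, e2, eps=1e-2, n=400):
    """Circulation of field around a small circle, divided by the enclosed area.

    The circle has center p and radius eps and lies in the plane spanned by the
    orthonormal vectors e1, e2; it is traversed so that N = e1 x e2.
    """
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        c, s = math.cos(t), math.sin(t)
        point = tuple(pk + eps * (c * a + s * b) for pk, a, b in zip(p, e1, e2))
        velocity = tuple(eps * (-s * a + c * b) for a, b in zip(e1, e2))
        total += sum(fk * vk for fk, vk in zip(field(*point), velocity)) * dt
    return total / (math.pi * eps * eps)

def F(x, y, z):
    return (2 * z - y, x, y)   # field of Example 4.10.2, curl F = <1, 2, 2>

p0 = (0.3, -0.2, 0.5)          # arbitrary point
density_k = circulation_density(F, p0, (1, 0, 0), (0, 1, 0))   # N = k
density_i = circulation_density(F, p0, (0, 1, 0), (0, 0, 1))   # N = i
```

For this field the densities should come out close to 2 for N = k and 1 for N = i, the corresponding components of the curl.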

4.11 Review
Main concepts discussed in this chapter are listed below.
1. Line integral of f(x, y) or f(x, y, z) along a curve C with respect to arc length:

$$\int_C f(x, y)\,ds \quad\text{or}\quad \int_C f(x, y, z)\,ds.$$

2. Some equivalent notations for the line integral of a vector field F = ⟨f, g⟩ along a
curve C:

$$\int_C \mathbf F\cdot\mathbf T\,ds = \int_C \mathbf F\cdot d\mathbf r = \int_C f\,dx+g\,dy,$$

$$\int_C \mathbf F\cdot\mathbf N\,ds = \int_C \mathbf F\cdot d\mathbf s = \int_C f\,dy-g\,dx.$$

3. The fundamental theorem of line integrals:

$$\int_C \nabla\varphi\cdot d\mathbf r = \varphi(B)-\varphi(A), \quad\text{where points } A \text{ and } B \text{ are the two endpoints of } C.$$

4. Circulation and flux integral for F = ⟨f, g⟩:

$$\int_C \mathbf F\cdot d\mathbf r = \int_C f\,dx+g\,dy \quad\text{(circulation integral)},$$

$$\int_C \mathbf F\cdot d\mathbf s = \int_C f\,dy-g\,dx \quad\text{(flux integral)}.$$

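5. Green’s theorem:

$$\oint_C f\,dx+g\,dy = \iint_D \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)d\sigma \quad\text{(circulation-curl form)},$$

$$\oint_C f\,dy-g\,dx = \iint_D \left(\frac{\partial f}{\partial x}+\frac{\partial g}{\partial y}\right)d\sigma \quad\text{(flux-divergence form)}.$$

6. Under suitable conditions,

$$\begin{aligned}
&\text{vector field } \mathbf F = \langle f, g\rangle \text{ is conservative}\\
&\iff \text{there is a function } \varphi(x, y) \text{ such that } \mathbf F = \nabla\varphi\\
&\iff \text{there is a function } \varphi(x, y) \text{ such that } d\varphi(x, y) = f\,dx+g\,dy\\
&\iff \int_C \mathbf F\cdot d\mathbf r \text{ is path-independent}\\
&\iff \oint_C \mathbf F\cdot d\mathbf r = 0\\
&\iff \varphi_{xy} = \varphi_{yx} \quad\left(\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}\right).
\end{aligned}$$

7. Under suitable conditions,

$$\begin{aligned}
&\text{vector field } \mathbf F = \langle f, g\rangle \text{ is source-free}\\
&\iff \text{there is a function } \psi(x, y) \text{ such that } d\psi(x, y) = f\,dy-g\,dx\\
&\iff \int_C \mathbf F\cdot d\mathbf s \text{ is path-independent}\\
&\iff \oint_C \mathbf F\cdot d\mathbf s = 0\\
&\iff \psi_{xy} = \psi_{yx} \quad\left(\frac{\partial f}{\partial x}+\frac{\partial g}{\partial y} = 0\right).
\end{aligned}$$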

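8. Surface integral with respect to surface area:

$$\iint_S f(\mathbf r(u, v))\,dS = \iint_{D_{uv}} f(\mathbf r(u, v))\,|\mathbf r_u\times\mathbf r_v|\,du\,dv \quad\text{for surface } \mathbf r = \mathbf r(u, v),$$

$$\iint_S f(x, y, z)\,dS = \iint_{D_{xy}} f(x, y, z)\sqrt{1+z_x^2+z_y^2}\,dx\,dy \quad\text{for surface } z = z(x, y),$$

$$\iint_S f(x, y, z)\,dS = \iint_{D_{xy}} f(x, y, z)\,\frac{|\nabla F|}{|F_z|}\,dx\,dy \quad\text{for surface } F(x, y, z) = 0.$$

Similar results hold for surfaces that can be projected to coordinate planes other
than the xy-plane.
9. Divergence of a vector field F = ⟨f, g, h⟩:

$$\operatorname{Div}(\mathbf F) = \frac{\partial f}{\partial x}+\frac{\partial g}{\partial y}+\frac{\partial h}{\partial z} = \nabla\cdot\mathbf F.$$

10. Some equivalent notations:

$$\iint_S (\mathbf F\cdot\mathbf N)\,dS = \iint_S \mathbf F\cdot d\mathbf S = \iint_S f\,dy\,dz+g\,dz\,dx+h\,dx\,dy.$$

11. Flux of a vector field F = ⟨f, g, h⟩ crossing a given surface S in a given direction:

$$\iint_S \mathbf F\cdot d\mathbf S = \pm\iint_{D_{xy}} (-fz_x-gz_y+h)\,dx\,dy.$$

Similar results hold for a surface that can be projected onto coordinate planes
other than the xy-plane.
12. The divergence theorem: the outward flux crossing a closed surface S is

$$\iint_S f\,dy\,dz+g\,dz\,dx+h\,dx\,dy = \iiint_\Omega \left(\frac{\partial f}{\partial x}+\frac{\partial g}{\partial y}+\frac{\partial h}{\partial z}\right)dV,$$

$$\iint_S \mathbf F\cdot d\mathbf S = \iiint_\Omega (\nabla\cdot\mathbf F)\,dV.$$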

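13. The curl of a vector field F = ⟨f, g, h⟩:

$$\operatorname{curl}\mathbf F = \left(\frac{\partial h}{\partial y}-\frac{\partial g}{\partial z}\right)\mathbf i + \left(\frac{\partial f}{\partial z}-\frac{\partial h}{\partial x}\right)\mathbf j + \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)\mathbf k = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] f & g & h \end{vmatrix} = \nabla\times\mathbf F.$$

14. Stokes theorem:

$$\oint_C f\,dx+g\,dy+h\,dz = \iint_S \left(\frac{\partial h}{\partial y}-\frac{\partial g}{\partial z}\right)dy\,dz + \left(\frac{\partial f}{\partial z}-\frac{\partial h}{\partial x}\right)dz\,dx + \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dx\,dy,$$

$$\oint_C \mathbf F\cdot d\mathbf r = \iint_S (\nabla\times\mathbf F\cdot\mathbf N)\,dS = \iint_S \operatorname{curl}\mathbf F\cdot d\mathbf S.$$

15. Under suitable conditions, for an irrotational vector field:

$$\begin{aligned}
\operatorname{curl}\mathbf F = \mathbf 0 &\iff \text{there is a function } \varphi \text{ such that } \mathbf F = \nabla\varphi\\
&\iff d\varphi = f\,dx+g\,dy+h\,dz\\
&\iff \int_C \mathbf F\cdot d\mathbf r \text{ is path-independent}.
\end{aligned}$$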

4.12 Exercises
4.12.1 Line integrals

1. Evaluate each of the following line integrals for the given curve C:
(1) ∫C √2y ds, C : x = a(t − sin t), y = a(1 − cos t), 0 ≤ t ≤ 2π,
(2) ∫C (x + y)ds, C consists of three line segments with vertices (0, 0), (1, 0),
and (0, 1),
(3) ∫C cos √(x² + y²) ds, C is the boundary of the region in the first quadrant bounded
by x = y, y = √(R² − x²), and y = 0,
(4) ∫C √(2y² + z²) ds, C : x² + y² + z² = a², x − y = 0,
(5) ∫C (x² + y² + z²)ds, C : x = e^t cos t, y = e^t sin t, and z = e^t, 0 ≤ t ≤ 2π.
2. Evaluate each of the following line integrals of a vector field:


(1) ∫C (x2 + 2xy)dx + (y2 − 2xy)dy, C is the arc of the parabola y = x 2 from (−1, 1)
to (1, 1),
(2) ∫C (y2 − z 2 )dx + 2yzdy − x2 dz, C : x = t, y = t 2 , z = t 3 , 0 ≤ t ≤ 1,
(3) ∮C xdy, C is the triangular path consisting of the line segments from (0, 0) to
(2, 0), from (2, 0) to (0, 3), and from (0, 3) to (0, 0),
(4) ∫C F ⋅ Tds, where F = ⟨−y, x⟩ and C is the unit circle with counterclockwise
orientation,
(5) ∫C F⋅dr, where F = ⟨y2 cos z, x2 , zy⟩ and C: r(t) = ⟨2 cos t, 2 sin t, 3t⟩, 0 ≤ t ≤ π.
3. Evaluate each of the following line integrals:
(1) ∫C ∇(x² + y²) ⋅ dr, where C is the curve r(t) = ⟨1/(1 + t²), cos(tπ)⟩, 0 ≤ t ≤ 1,
(2) ∫C ∇f ⋅ dr, where f (x, y, z) = exy + z 2 and C: r(t) = ⟨ln(1 + t 2 ), t, π8 arctan t⟩,
0 ≤ t ≤ 1.
4. Use Green’s theorem to evaluate each of the following line integrals (assume the
boundary of each region is positively orientated):

(1) ∮C (x + y)2 dx + (x2 − y2 )dy, C is the triangle with vertices A(1, 1), B(3, 3), and
C(3, 5),
(2) ∮C xy2 dx − x2 ydy, C is the circle x2 + y2 = R2 ,
(3) ∫C (y + 2xy)dx + (x2 + 2x + y2 )dy, C is the top-half-arc of the circle x2 + y2 = 4x
from (4, 0) to (0, 0),
(4) ∮C F ⋅ dr, where F(x, y) = ⟨ex (1 − cos y), ex (y − sin y)⟩ and C is the boundary of
the region enclosed by x = 0, x = π, y = 0, and y = sin x,
(5) ∮C ∇(ex + sin(yx2 )) ⋅ dr, where C is any smooth simple closed curve in the
xy-plane,
4
(6) ∫C F ⋅ Tds, where F(x, y) = ⟨ex + y2 , xy + sin(ln y)⟩ and C is the boundary of the
quadrilateral with vertices (1, 1), (1, 2), (2, 3), and (2, 1).
5. Determine whether each of the following vector fields is conservative; if so, find a
potential function:
(1) F = xi − yj, (2) F = ⟨tan y, x sec2 y⟩,
(3) F = ⟨1 − ye−x , e−x ⟩, (4) F = ⟨y + 2xy, x2 + x + y2 ⟩,
(5) F = −yi + xj, (6) F = ⟨ex cos y, −ex sin y⟩,
(7) F = (x2 + 2xy − y2 )i + (x2 − 2xy − y2 )j.
6. Use a line integral to find the area of the region enclosed by the curve x = a cos3 t
and y = a sin3 t.
7. Use Green’s theorem to prove that the centroid of a plane region D in the xy-plane
has coordinates (x̄, ȳ) given by

$$\bar{x} = \frac{1}{2A(D)}\oint_C x^2\,dy \quad\text{and}\quad \bar{y} = -\frac{1}{2A(D)}\oint_C y^2\,dx,$$

where A(D) is the area of the region D. Hence, find the coordinates of the centroid
of the semicircle y = √a2 − x2 .
8. Evaluate the outward flux of each of the following vector fields across the given
curve C:
(1) F = xy2 i + xyj, and C is the boundary of the annulus 1 ≤ x 2 + y2 ≤ 4,
(2) F = ⟨−y, x⟩, and C is the circle with center the origin and radius a.
9. Consider the vector field F = (x/(x² + y²)) i + (y/(x² + y²)) j.

(1) Show that Div(F) = 0.


(2) Show that the outward flux across any circle centered at (0, 0) with radius a
is 2π.
(3) Does this example contradict Green’s theorem in flux-divergence form? Ex-
plain.

4.12.2 Surface integrals

1. Evaluate each of the following surface integrals:


(1) ∬S (x2 + y2 )dS, S is the sphere x2 + y2 + z 2 = R2 ,

(2) ∬S xyzdS, S is the part of the plane x + y + z = 1 that lies in the first octant,
(3) ∬S (xy + yz + zx)dS, S is the part of the cone z = √x 2 + y2 that lies inside the
cylinder x2 + y2 = 2x.
2. Evaluate each surface integral ∬S F ⋅ dS for each of the following vector fields F
and oriented surfaces S:
(1) F = ⟨0, 0, xyz⟩, S is the part of the cylinder x 2 + z 2 = R2 in the first and fifth oc-
tants and between the two planes y = 0 and y = h, with outward orientation,
(2) F = (y − z)i + (z − x)j + (x − y)k, S is the surface of the region E bounded by the
cone z = √x2 + y2 and the plane z = 1, with outward orientation.
3. Evaluate each of the following surface integrals ∬S Pdydz + Qdxdz + Rdxdy:
(1) ∬S xydydz + yzdxdz + zxdxdy, where S is the surface of the solid bounded by
x = 0, y = 0, z = 0, and x + y + z = 1, oriented outward,
2 2 2
(2) ∬S (e−x y + x)dydz + (2e−x y + y)dxdz + (e−x y + z)dxdy, where S is the part of the
plane x − y + z = 1 that is in the fourth octant, oriented outward,
(3) ∬S (x2 − y)dxdz + sin(xy)dxdy, where S is the part of the cylinder x 2 + y2 = 1
that is cut by z = 0 and z = 2, oriented outward.
4. Compute the divergence of each of the following vector fields:
2
(1) F = (x2 + sin y2 )i + (y2 − x)j, (2) F = ⟨x + x3 + yz 2 , e−x + ln(y2 + 1), z + xy⟩,
1
(3) F = (x3 + yz)i−xzj+yzk, (4) F = ⟨x − 1+xy2
, tan−1 z + y, z 2 + 3x⟩.
5. Use the divergence theorem to find ∬S x³dydz + y³dzdx + z³dxdy, where S is the
top half of the sphere x² + y² + z² = a², with outward orientation.


6. Evaluate the integral

$$\iint_S \frac{x\,dy\,dz+y\,dz\,dx+z\,dx\,dy}{\sqrt{(x^2+y^2+z^2)^3}},$$

where S is the ellipsoid x² + 2y² + 5z² = 1, with outward orientation.


7. We define the solid W by x² + y² + z² ≤ 1, and

$$\mathbf F = \left\langle x^3+3x+\frac{1}{z^2+y^2+1},\; y^3+xy,\; z^3-xz+\sin(xy)\right\rangle$$

is a vector field.
(1) Compute the divergence of F.
(2) Find the flux out of W (that is, evaluate ∬S F ⋅ NdS).
8. To evaluate ∬Σ xyzdxdy, where Σ : x2 + y2 + z 2 = 1, (x ≥ 0, y ≥ 0), oriented outward,
two students provided the following solutions:
Solution 1
The integration surface is symmetric about the xy-plane, with half the surface above
the xy-plane and the other half below the xy-plane. The integrand xyz, if keeping
x and y fixed, is an odd function with respect to the variable z. Therefore

$$\begin{aligned}
\iint_\Sigma xyz\,dx\,dy &= \iint_{\Sigma,\,z\ge 0} xyz\,dx\,dy + \iint_{\Sigma,\,z\le 0} xyz\,dx\,dy\\
&= \iint_{\Sigma,\,z\ge 0} xy\sqrt{1-x^2-y^2}\,dx\,dy + \iint_{\Sigma,\,z\le 0} xy\left(-\sqrt{1-x^2-y^2}\right)dx\,dy = 0.
\end{aligned}$$

Solution 2
The second student writes

$$\begin{aligned}
\iint_\Sigma xyz\,dx\,dy &= \iint_\Sigma xy\sqrt{1-x^2-y^2}\,dx\,dy\\
&= \int_0^{\pi/2}d\theta\int_0^1 (r\cos\theta)(r\sin\theta)\sqrt{1-r^2}\,r\,dr\\
&= \int_0^{\pi/2} (\cos\theta\sin\theta)\,d\theta\int_0^1 r^3\sqrt{1-r^2}\,dr = \frac{1}{15}.
\end{aligned}$$

Is one of the solutions correct, and if so, which one?


Evaluate the integral directly first as a flux integral, and then evaluate it by using
the divergence theorem. Hint: Add some surfaces so that one has a closed surface.
9. Compute the curl for each of the following vector fields:
(1) F = (e^x cos y)i − (e^x sin y)j, (2) F = ⟨x + y, x² + 2z, 2y − xz⟩,
(3) F = ⟨z, 1, x⟩, (4) F = ⟨x, y, z⟩/√(x² + y² + z²).
10. Given a vector field F = (2xy − z 2 )i+(x2 + 2z)j+(2y − 2xz)k.
(1) Show that F is conservative.
(2) Find a potential function for this vector field.
(3) Evaluate ∫C F⋅dr, where C is the curve r(t) = ⟨1 + t³, 9/(1 + t²), sin(πt/4)⟩ for 0 ≤ t ≤ 2.
11. Given F = ⟨y + sin x, z² + cos y, x³⟩.
(1) Find Curl(F).
(2) Evaluate ∫C (y + sin x)dx + (z 2 + cos y)dy + x 3 dz, where C is the curve r(t) =
⟨sin t, cos t, sin 2t⟩, 0 ≤ t ≤ 2π.
(3) Evaluate ∫c F ⋅ dr, where C is the curve r(t) = ⟨cos t, sin t + 2 cos t, sin t⟩, for
0 ≤ t ≤ 2π, and is orientated counterclockwise as viewed from above.
12. If F(x, y, z) = (x + y2 )i + (y + z 2 )j + (z + x2 )k, find ∫C F ⋅ dr, where C is the triangle
with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1) and is orientated counterclockwise as
viewed from above.
13. Let S be the part of the spherical surface x2 + y2 + z 2 = 2 lying in z > 1. Orient S
upwards and let C be its bounding circle lying in the plane z = 1 with compatible
orientation.

(a) Parameterize C and use the parameterization to evaluate the line integral

I = ∮ xzdx + xdy + 4dz.


C

(b) Compute the curl of the vector field F = ⟨xz, x, 4⟩.


(c) Evaluate the flux integral ∬S (∇ × F ⋅ N)dS directly.
(d) Which theorem can be applied to evaluate (c) by using I directly?
14. Let F = ⟨2x − y + z, x + y + z, 2y − 3z⟩.
(a) Find ∇ ⋅ F, the divergence of F.
(b) Find ∇ × F, the curl of F.
(c) Evaluate ∬S F ⋅ dS, where S:

{(x, y, z)||2x − y + z| + |x + y + z| + |2y − 3z| = 1}

with outward orientation, using the divergence theorem.


2 2
(d) Evaluate ∮C F ⋅ dr, where C : { x +2y
z=4,
=4, oriented counterclockwise as viewed

from above, using Stokes theorem.


5 Introduction to ordinary differential equations
As we have seen before, equations are used as mathematical models built to solve
practical problems. Algebra is sufficient to solve static problems. However, in many
cases, natural phenomena involve quantities that are changing and can only be
described by equations that describe these changes. Those changes usually are de-
scribed by derivatives of some functions. An equation relating an unknown function
and one or more of its derivatives is called a differential equation. In this chapter,
we develop methods for finding exact solutions for certain types of differential equa-
tions. Also, we introduce some other approaches for finding approximate solutions
by numerical or graphical methods.

5.1 Introduction
We first investigate several examples. We assume the population P(t) is a function of
time, t, subject to constant birth and death rates. Then the rate of change of P with
respect to time t can be modeled as

dP
= kP, where k = birth rate − death rate = some constant.
dt

This equation involves the unknown function $P$ and its first derivative $P'$. It is a first-order
differential equation because it involves a first-order derivative. One can check
that $P = Ce^{kt}$, where $C$ is an arbitrary constant, is a solution to this equation. If $k$ is
positive, it is an exponential growth model, and if k is negative, it is an exponential
decay model. This is the case for the population of a family of bacteria growing or
disappearing over a short period of time. Also, many radioactive materials satisfy this
law.
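The claim that $P = Ce^{kt}$ solves the equation can be checked numerically. The following is a minimal sketch, not part of the original text: the values C = 100 and k = 0.3, the step size h, and the function names are illustrative assumptions. It compares a central-difference estimate of dP/dt with kP.

```python
import math

def P(t, C=100.0, k=0.3):
    """Candidate solution P(t) = C * exp(k t); C and k are illustrative values."""
    return C * math.exp(k * t)

def dP_dt(t, C=100.0, k=0.3, h=1e-6):
    """Central-difference approximation of dP/dt."""
    return (P(t + h, C, k) - P(t - h, C, k)) / (2 * h)

def residual(t, C=100.0, k=0.3):
    """Relative size of dP/dt - k*P; it should be close to zero for a solution."""
    p = P(t, C, k)
    return abs(dP_dt(t, C, k) - k * p) / abs(p)

for t in (0.0, 1.0, 5.0):
    print(t, residual(t))  # tiny values, limited only by floating-point error
```

The same check works for decay: calling `residual(t, k=-0.7)` tests a negative rate constant.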
Newton’s law of cooling says that the rate of change of the temperature T(t) of
a body is proportional to the difference between T and the temperature A of the sur-
rounding medium. If we know T(0) = 50 °C, then we have

$$\frac{dT}{dt} = -k(T - A), \quad T(0) = 50\ \text{°C},$$

where $k$ is a positive constant. Note that if $T > A$, then $\frac{dT}{dt} < 0$, so the temperature is a
decreasing function of $t$ and the body is cooling, but if $T < A$, then $\frac{dT}{dt} > 0$, so that $T$
is increasing. The condition $T(0) = 50$ °C is called an initial condition.
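To make the sign discussion concrete, here is a small sketch under stated assumptions. The closed form $T(t) = A + (T_0 - A)e^{-kt}$ is the standard solution of this initial value problem (it follows from separation of variables, introduced later in the chapter). The surrounding temperature A = 20 and rate k = 0.1 are made-up values, since the text only fixes T(0) = 50.

```python
import math

def temperature(t, T0=50.0, A=20.0, k=0.1):
    """Solution of dT/dt = -k (T - A) with T(0) = T0 (A and k are illustrative)."""
    return A + (T0 - A) * math.exp(-k * t)

def rate(t, T0=50.0, A=20.0, k=0.1):
    """Right-hand side -k (T - A) evaluated along the solution."""
    return -k * (temperature(t, T0, A, k) - A)

print(temperature(0.0))       # starts at the initial temperature T0
print(rate(0.0) < 0)          # body warmer than surroundings: cooling
print(rate(0.0, T0=10.0) > 0) # body colder than surroundings: warming
```

In both cases the temperature approaches A as t grows, which matches the qualitative picture described above.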
https://doi.org/10.1515/9783110674378-005

In a spring-mass model, the mass moves back and forth about an equilibrium
point. If $F$ is the force exerted by the spring on the mass and $F_r$ is the resistance force,
then by Newton's second law we have

$$F + F_r = mx'',$$

where $x$ is the displacement from the equilibrium point, and the derivative is with respect to time. By Hooke's law, we have $F = -kx$, where $k$ is a positive constant. The negative
sign indicates that the spring force and the displacement are
in opposite directions. If $F_r$, the resistance force, is proportional to the velocity of the
mass and opposes the motion, then $F_r = -lx'$ for some positive constant $l$, so we have

$$-kx - lx' = mx'' \quad \text{or} \quad mx'' + lx' + kx = 0.$$

This equation involves an unknown function x(t) and its first- and second-order
derivatives, so it is a second-order differential equation.
The examples above are all ordinary differential equations (ODEs), since the un-
known functions only depend on a single variable. The following differential equa-
tions are all examples of ODEs:

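$$\text{(a)}\ \frac{d^3y}{dx^3} + 2\frac{dy}{dx} - 3y = \frac{1}{x}, \qquad \text{(b)}\ y^{(4)} + y^{(2)}y = y',$$

$$\text{(c)}\ \left(\frac{d^2x}{dt^2}\right)^3 + 2\left(\frac{dx}{dt}\right)^4 = x, \qquad \text{(d)}\ \dot{x} = 2\sqrt{x} + t^2.$$
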
The order of an ODE is the order of the highest derivative that appears in the ODE. Thus, in the
above four ODEs, (a) is of order 3, (b) is of order 4, (c) is of order 2, and (d) has order 1.
The degree of an ODE is the highest power of the highest derivative in that ODE.
Thus, in the above four ODEs, (a), (b), and (d) all have degree 1 while (c) has degree 3.
Scientists and economists also use partial differential equations (PDEs) to solve
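problems. For example, the heat equation is

$$\frac{\partial u}{\partial t} - \alpha\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) = 0,$$

where $u(x, y, z, t)$ is the temperature of a body and $\alpha$ is the thermal diffusivity. The wave
equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2},$$

the harmonic equation

$$\frac{\partial^2 u}{\partial t^2} + \frac{\partial^2 u}{\partial x^2} = 0,$$

and the famous Black–Scholes model for option pricing

$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} = rV - rS\frac{\partial V}{\partial S}$$
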
are all examples of PDEs.
Both ODEs and PDEs have enormously important applications, as seen above. In
this text, we only consider some ODEs.

5.2 First-order ODEs


5.2.1 General and particular solutions and direction fields

We will discuss a first-order differential equation that can be written in the form

$$\frac{dy}{dx} = f(x, y).$$

Example 5.2.1. Solve the differential equation $\frac{dy}{dx} = 2x$.

Solution. Integrating both sides gives all solutions, i.e.,

$$y(x) = \int 2x\,dx,$$

$$y = x^2 + C.$$

This is a general solution of the differential equation $\frac{dy}{dx} = 2x$ since it gives every possible solution to the equation. If, furthermore, we know $y(1) = 2$, then we will be able
to determine the constant $C = 1$ to obtain the particular solution $y = x^2 + 1$.
The graphs of a general solution are a family of curves, called solution curves or
integral curves. Figure 5.1 shows some solution curves for $C = 0$, $\pm 1$, and $3$.

Figure 5.1: Example 5.2.1, some solution curves to $y' = 2x$.

In general, a general solution to a differential equation

$$\frac{dy}{dx} = f(x, y)$$

involves an arbitrary constant, and the graphs are a family of curves.



The problem

$$\begin{cases} \dfrac{dy}{dx} = f(x, y), \\ y(a) = b, \end{cases} \quad \text{alternatively written as} \quad \frac{dy}{dx} = f(x, y), \quad y(a) = b,$$

is called an initial value problem. The solution to an initial value problem is called a
particular solution.

There is a nice theorem about the existence of a particular solution to an initial
value problem. We give, without proof, the theorem that guarantees the existence of
such solutions for a first-order ODE of the form $\frac{dy}{dx} = f(x, y)$.

Theorem 5.2.1 (Existence and uniqueness of solutions). Suppose that the function $f(x, y)$ and its partial derivatives $f_x(x, y)$ and $f_y(x, y)$ are continuous on some region $R$ in the $xy$-plane that contains the
point $(a, b)$ in its interior. Then the initial value problem

$$\frac{dy}{dx} = f(x, y), \quad y(a) = b$$

has one and only one solution that is defined on an open interval I containing the point a.

When given a first-order ODE in the form

$$\frac{dy}{dx} = f(x, y) \quad \text{or} \quad F(x, y, y') = 0,$$

we may not be able to see its solution at first sight, for example,

$$\frac{dy}{dx} = x^2 + y^2 \quad \text{or} \quad \frac{dy}{dx} = x^3 - 2xy.$$

However, we do know, at each point, the derivative of the unknown function $y(x)$. The
derivative is the slope of the tangent line at that point, so we can sketch a small line
segment to indicate its tangent at some points. For example, for the differential equation $\frac{dy}{dx} = x^2 + y^2$, we can compute the derivative at the points $(0, 0)$, $(1, 1)$, and $(2, 1)$ to obtain
$y'(0) = 0$, $y'(1) = 2$, and $y'(2) = 5$. Thus, we can sketch a diagram as in Figure 5.2(a).

Figure 5.2: Direction fields, panels (a), (b), and (c).


This type of diagram is called a direction field or slope field of the differential equation.
With the help of a computer algebra system, we have the direction fields for the above
two differential equations as shown in Figure 5.2(b) and (c). If we know an additional
condition, $y(x_0) = y_0$, then we are able to sketch a solution curve that passes through
the point $(x_0, y_0)$. Several particular solution curves for $y' = x^3 - 2xy$ are shown in
Figure 5.3.

Figure 5.3: Direction field and solution curves.
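The idea of a direction field can be sketched in a few lines of code. The snippet below is an illustrative sketch, not the computer algebra system used for the figures: it tabulates the slopes f(x, y) on a small grid and then follows them with Euler's method (an approximation technique introduced here only for illustration) to trace a rough solution curve. All function names and the step size are assumptions.

```python
import math

def f1(x, y):
    """Right-hand side of dy/dx = x^2 + y^2."""
    return x**2 + y**2

def f2(x, y):
    """Right-hand side of dy/dx = x^3 - 2xy."""
    return x**3 - 2 * x * y

def slope_field(f, xs, ys):
    """Map each grid point (x, y) to the slope of the short tangent segment there."""
    return {(x, y): f(x, y) for x in xs for y in ys}

def euler_curve(f, x0, y0, h, n):
    """Trace an approximate solution curve through (x0, y0) by following the slopes."""
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(n):
        y += h * f(x, y)  # move a short distance along the tangent segment
        x += h
        points.append((x, y))
    return points

slopes = slope_field(f1, xs=[0, 1, 2], ys=[0, 1])
print(slopes[(0, 0)], slopes[(1, 1)], slopes[(2, 1)])  # prints 0 2 5, the slopes used in the text

curve = euler_curve(f2, x0=0.0, y0=0.0, h=0.001, n=1000)
print(curve[-1])  # approximate point near x = 1 on the curve of y' = x^3 - 2xy through (0, 0)
```

Smaller steps h follow the direction field more closely, at the cost of more steps.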

5.2.2 Separable differential equations

For some first-order ODEs of the form

$$\frac{dy}{dx} = f(x, y),$$

if we can factor $f(x, y) = h(x)g(y)$, then we may separate the variables $x$ and $y$ to obtain

$$\frac{dy}{dx} = f(x, y) = h(x)g(y),$$

$$\frac{dy}{g(y)} = h(x)\,dx.$$

Then, we can integrate both sides separately, the left side as a function of $y$ and the
right side as a function of $x$, i.e.,

$$\int \frac{1}{g(y)}\,dy = \int h(x)\,dx.$$

In this way, we may be able to find an exact solution.



Example 5.2.2. Solve the exponential growth/decay model equation

$$\frac{dP}{dt} = kP, \quad \text{where } k \text{ is a nonzero constant.}$$

When $k < 0$, find the time when $P = P_0/2$, where $P_0 = P(0)$.

Solution. We separate the variables to obtain

$$\frac{1}{P}\,dP = k\,dt.$$

We integrate

$$\int \frac{1}{P}\,dP = \int k\,dt$$

to get

$$\ln|P| = kt + C_1, \quad \text{where } C_1 \text{ is an arbitrary constant.}$$

Simplifying this we obtain $P = \pm e^{kt + C_1} = \pm e^{C_1}e^{kt}$. Note that $\pm e^{C_1}$ is an arbitrary
nonzero constant, but $P = 0$ is also a solution, so we can write this as

$$P = Ce^{kt}, \quad \text{where } C \text{ is an arbitrary constant.}$$

When $t = 0$, $P = P_0$; this means $C = P_0$ and $P = P_0e^{kt}$. Set $P = P_0/2$. We have

$$\frac{P_0}{2} = P_0e^{kt},$$

$$\ln\frac{1}{2} = kt,$$

$$t = \frac{-\ln 2}{k}, \quad \text{for } k < 0.$$

Note. The radioactive half-life for a given radioisotope is given by the above formula.
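The half-life formula above translates directly into code. This is a minimal sketch with hypothetical function names. The carbon-14 figure of roughly 5730 years is a commonly quoted value used here only as an illustration.

```python
import math

def half_life(k):
    """Time for P = P0 * exp(k t) to fall to P0 / 2; requires a decay rate k < 0."""
    if k >= 0:
        raise ValueError("half-life is defined only for decay (k < 0)")
    return -math.log(2) / k

def decay_constant(t_half):
    """Inverse relation: the rate constant k that produces a given half-life."""
    return -math.log(2) / t_half

k_c14 = decay_constant(5730.0)   # carbon-14, half-life of about 5730 years
print(k_c14)                     # a small negative rate, per year
print(half_life(k_c14))          # recovers the half-life we started from
print(math.exp(k_c14 * half_life(k_c14)))  # fraction remaining after one half-life
```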

Example 5.2.3. Suppose a curve $y = y(x)$ has derivative $2x(y^2 + 1)$ at each point $(x, y)$.
1. Find any such curve.
2. Furthermore, if the curve passes through the point $(0, 1)$, then find this particular curve.

Solution.
1. The curve $y = y(x)$ satisfies the differential equation

$$\frac{dy}{dx} = 2x(y^2 + 1).$$

We separate the variables to obtain

$$\frac{1}{y^2 + 1}\,dy = 2x\,dx.$$

Then, we integrate

$$\int \frac{1}{y^2 + 1}\,dy = \int 2x\,dx$$

to obtain $\tan^{-1} y = x^2 + C$. Thus, we have $y = \tan(x^2 + C)$.

2. If, furthermore, we know $y(0) = 1$, then $1 = \tan C$ and we may take $C = \frac{\pi}{4}$. So we get the particular curve

$$y = \tan\left(x^2 + \frac{\pi}{4}\right).$$
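One way to gain confidence in a closed-form answer such as this one is to integrate the initial value problem numerically and compare. The sketch below uses the classical fourth-order Runge–Kutta method, which is not discussed in the text and is brought in only as an independent check. The endpoint x = 0.5 is chosen to stay clear of the point where $x^2 + \pi/4 = \pi/2$ and the tangent blows up.

```python
import math

def rk4(f, x0, y0, x_end, n):
    """Approximate y(x_end) for y' = f(x, y), y(x0) = y0, using n Runge-Kutta steps."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def rhs(x, y):
    """Right-hand side of dy/dx = 2x (y^2 + 1)."""
    return 2 * x * (y**2 + 1)

def exact(x):
    """Particular solution found above: y = tan(x^2 + pi/4)."""
    return math.tan(x**2 + math.pi / 4)

numeric = rk4(rhs, 0.0, 1.0, 0.5, 500)
print(numeric, exact(0.5))  # the two values should agree to many digits
```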

5.2.3 Substitution methods

A homogeneous first-order differential equation $\frac{dy}{dx} = f(x, y)$ is one where $f(x, y)$ can be
rewritten as a function of $\frac{y}{x}$, i.e.,

$$\frac{dy}{dx} = F\left(\frac{y}{x}\right). \tag{5.1}$$

This typically happens when $f(x, y)$ is made up of polynomial terms in $x$ and $y$ where
the exponents of $x$ and $y$ add to the same value for all such terms.
If $\frac{dy}{dx} = F\left(\frac{y}{x}\right)$ and we make the substitution

$$v = \frac{y}{x} \quad \text{so that} \quad y = vx \quad \text{and} \quad \frac{dy}{dx} = v + x\frac{dv}{dx},$$

then the differential equation (5.1) is transformed into a separable equation with independent variable $x$ and dependent variable $v$, i.e.,

$$v + x\frac{dv}{dx} = F(v),$$

$$x\frac{dv}{dx} = F(v) - v.$$

So we may solve the differential equation by using separation of variables.

Example 5.2.4. Solve the differential equation

$$2xy\frac{dy}{dx} = 4x^2 + 3y^2.$$

Solution. This is a homogeneous equation (the degree of each term is the same, two
in this case). We rewrite it as

$$\frac{dy}{dx} = \frac{4x^2 + 3y^2}{2xy} = 2\left(\frac{x}{y}\right) + \frac{3}{2}\left(\frac{y}{x}\right).$$

Then, the substitution $y = vx$, $\frac{dy}{dx} = v + x\frac{dv}{dx}$, transforms this to

$$v + x\frac{dv}{dx} = \frac{2}{v} + \frac{3}{2}v,$$

$$x\frac{dv}{dx} = \frac{4 + v^2}{2v},$$

and, hence,

$$\int \frac{2v}{v^2 + 4}\,dv = \int \frac{1}{x}\,dx,$$

$$\ln(v^2 + 4) = \ln|x| + C_1.$$

Thus,

$$v^2 + 4 = |x|e^{C_1},$$

$$\frac{y^2}{x^2} + 4 = Cx, \quad \text{where } C = \pm e^{C_1},$$

$$y^2 + 4x^2 = Cx^3,$$

where $C > 0$ if $x > 0$, and $C < 0$ if $x < 0$.

Example 5.2.5. Solve the differential equation

$$\frac{dy}{dx} = (x + y + 3)^2.$$

Solution. This is not separable or homogeneous. Let us try to simplify it by making
the substitution $v = x + y + 3$, so $y = v - x - 3$. Differentiating this gives

$$\frac{dy}{dx} = \frac{dv}{dx} - 1.$$

So the transformed equation is

$$\frac{dv}{dx} = 1 + v^2.$$

This is a separable equation and

$$\int \frac{dv}{1 + v^2} = \int dx,$$

$$\tan^{-1} v = x + C,$$

$$v = \tan(x + C).$$

So

$$y(x) = \tan(x + C) - x - 3.$$

5.2.4 Exact differential equations

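Suppose that an ODE has the form

$$f(x, y) + g(x, y)\frac{dy}{dx} = 0 \quad \text{or} \quad \frac{dy}{dx} = -\frac{f(x, y)}{g(x, y)},$$

or, equivalently, its differential form

$$f(x, y)\,dx + g(x, y)\,dy = 0. \tag{5.2}$$

If there exists a function $\varphi(x, y)$ such that

$$\frac{\partial \varphi(x, y)}{\partial x} = f(x, y) \quad \text{and} \quad \frac{\partial \varphi(x, y)}{\partial y} = g(x, y),$$

then

$$f(x, y)\,dx + g(x, y)\,dy = \frac{\partial \varphi(x, y)}{\partial x}\,dx + \frac{\partial \varphi(x, y)}{\partial y}\,dy = d\varphi(x, y) = 0.$$

This means that $\varphi(x, y) = C$ is a general solution of the differential equation
$f(x, y)\,dx + g(x, y)\,dy = 0$.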
Equations of this type are called exact differential equations. In Chapter 4, we
showed the following result holds.

Theorem 5.2.2 (Criterion for exactness). Suppose that the functions $f(x, y)$ and $g(x, y)$ are continuous
and have continuous first-order derivatives in the simply connected region $D$. Then the ODE

$$f(x, y)\,dx + g(x, y)\,dy = 0$$

is exact in $D$ if and only if at each point of $D$

$$\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}.$$

Example 5.2.6. Solve the differential equation $\frac{dy}{dx} = \frac{y^2 - 2x + 3}{y - 2xy}$.

Solution. We rearrange the terms to obtain

$$(y^2 - 2x + 3)\,dx + (2xy - y)\,dy = 0.$$

Since

$$\frac{\partial(y^2 - 2x + 3)}{\partial y} = 2y = \frac{\partial(2xy - y)}{\partial x},$$

this is an exact differential equation. Assume $d\varphi(x, y) = (y^2 - 2x + 3)\,dx + (2xy - y)\,dy$.
Then

$$\frac{\partial \varphi}{\partial x} = y^2 - 2x + 3 \quad \text{and} \quad \varphi(x, y) = \int (y^2 - 2x + 3)\,dx = y^2x - x^2 + 3x + h(y).$$

To find $h(y)$, we differentiate $\varphi(x, y)$ with respect to $y$ to obtain

$$\frac{\partial \varphi}{\partial y} = 2xy + 0 + h'(y).$$

But $\frac{\partial \varphi}{\partial y} = 2xy - y$, so $h'(y) = -y$ and one such function is $h(y) = -\frac{y^2}{2}$. Therefore,

$$\varphi(x, y) = y^2x - x^2 + 3x - \frac{y^2}{2},$$

and

$$\varphi(x, y) = C \quad \text{or} \quad y^2x - x^2 + 3x - \frac{y^2}{2} = C$$

is a general solution to the original differential equation. Figure 5.4 shows the direction
field and several solution curves.

Figure 5.4: Direction field and solution curves, Example 5.2.6.
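The exactness test and the potential function of Example 5.2.6 can be checked numerically. The sketch below uses central differences as a stand-in for the partial derivatives. The helper name `partial`, the step size, and the sample point are illustrative assumptions.

```python
def f(x, y):
    """Coefficient of dx in Example 5.2.6."""
    return y**2 - 2 * x + 3

def g(x, y):
    """Coefficient of dy in Example 5.2.6."""
    return 2 * x * y - y

def phi(x, y):
    """Potential function found in the example."""
    return y**2 * x - x**2 + 3 * x - y**2 / 2

def partial(fn, x, y, var, h=1e-5):
    """Central-difference estimate of the partial derivative of fn with respect to var."""
    if var == "x":
        return (fn(x + h, y) - fn(x - h, y)) / (2 * h)
    return (fn(x, y + h) - fn(x, y - h)) / (2 * h)

x0, y0 = 1.3, -0.7  # an arbitrary sample point
print(partial(f, x0, y0, "y"), partial(g, x0, y0, "x"))  # exactness: these should match
print(partial(phi, x0, y0, "x"), f(x0, y0))              # phi_x should equal f
print(partial(phi, x0, y0, "y"), g(x0, y0))              # phi_y should equal g
```

A check at one point is not a proof, but a mismatch at any point would reveal an algebra slip immediately.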



5.2.5 First-order linear differential equations

The first-order linear differential equation has the form

$$a(x)y' + b(x)y = c(x) \quad \text{or} \quad a(x)\frac{dy}{dx} + b(x)y = c(x). \tag{5.3}$$

The term "linear" refers to the fact that $y$ and $\frac{dy}{dx}$ appear only to the first power and are not
combined in some other function, whereas $a(x)$, $b(x)$, and $c(x)$ are
allowed to be nonlinear functions. Thus, in the differential equations

$$\text{(a)}\ x^2y' + 2y = \sin x, \quad \text{(b)}\ (y')^2 + x + 2y = 0, \quad \text{(c)}\ yy' + x = 0, \quad \text{(d)}\ y' + \sqrt{y} = x,$$

only (a) is linear. If $a(x) = 0$, it would not be a differential equation, so we assume
$a(x) \neq 0$, and dividing by $a(x)$ gives

$$\frac{dy}{dx} + \frac{b(x)}{a(x)}y = \frac{c(x)}{a(x)}. \tag{5.4}$$

Let $P(x) = \frac{b(x)}{a(x)}$ and $Q(x) = \frac{c(x)}{a(x)}$. Then equation (5.4) becomes

$$\frac{dy}{dx} + P(x)y = Q(x). \tag{5.5}$$

There is a nice technique for solving equation (5.5). Suppose there exists a function
$\rho(x)$ such that multiplying both sides of the equation $\frac{dy}{dx} + P(x)y = Q(x)$ by $\rho(x)$ transforms the left-hand side into the derivative of the product $\rho(x)y$. Such a function $\rho(x)$
is called an integrating factor. Then,

$$\rho(x)\frac{dy}{dx} + P(x)\rho(x)y = \rho(x)Q(x),$$

$$\frac{d}{dx}\bigl(\rho(x)y\bigr) = \rho(x)Q(x). \tag{5.6}$$

We integrate both sides to obtain

$$\rho(x)y = \int \rho(x)Q(x)\,dx.$$

Therefore, we get a general solution

$$y = \frac{1}{\rho(x)}\int \rho(x)Q(x)\,dx. \tag{5.7}$$

Now the question left is, how do we find $\rho(x)$? Applying the product rule on the left-hand side of equation (5.6) gives

$$\rho(x)\frac{dy}{dx} + y\frac{d\rho(x)}{dx} = \rho(x)Q(x).$$

Comparing this with the original equation multiplied by $\rho(x)$ shows that

$$\frac{d\rho(x)}{dx} = P(x)\rho(x).$$

This is a separable equation, and solving it gives the integrating factor $\rho(x) = e^{\int P(x)\,dx}$.
Substituting $\rho(x)$ in equation (5.7), we obtain a general solution to equation (5.5), i.e.,

$$y(x) = e^{-\int P(x)\,dx}\left(\int e^{\int P(x)\,dx}Q(x)\,dx + C\right). \tag{5.8}$$

Note. One can easily check that $y(x) = Ce^{-\int P(x)\,dx}$ is a general solution of the first-order linear homogeneous equation (right-hand side function $Q(x) = 0$)

$$\frac{dy}{dx} + P(x)y = 0.$$

Note that the function $y^* = e^{-\int P(x)\,dx}\left(\int e^{\int P(x)\,dx}Q(x)\,dx\right)$ is a particular solution to $\frac{dy}{dx} + P(x)y = Q(x)$. Thus, a general solution to $\frac{dy}{dx} + P(x)y = Q(x)$ can be written as

a general solution of $\frac{dy}{dx} + P(x)y = 0$ + a particular solution of $\frac{dy}{dx} + P(x)y = Q(x)$.

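Example 5.2.7. Solve the first-order linear ODE $\frac{dy}{dx} = x^3 - 2xy$.

Solution. Since $P(x) = 2x$ and $Q(x) = x^3$, by equation (5.8), a general solution is

$$\begin{aligned}
y &= e^{-\int 2x\,dx}\left(\int e^{\int 2x\,dx}x^3\,dx + C\right) = e^{-x^2}\left(\int e^{x^2}x^3\,dx + C\right) \\
&= e^{-x^2}\left(\frac{1}{2}\int x^2e^{x^2}\,d(x^2) + C\right) \\
&= e^{-x^2}\left(\frac{1}{2}x^2e^{x^2} - \frac{1}{2}e^{x^2} + C\right) \\
&= \frac{x^2}{2} - \frac{1}{2} + Ce^{-x^2}.
\end{aligned}$$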

Figure 5.3 shows the direction field and several solution curves to this ODE.
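The integrating-factor formula can also be evaluated numerically when the integrals are awkward. The following is a rough sketch under stated assumptions: it applies the definite-integral version of the formula, $y(x) = \bigl(y_0 + \int_{x_0}^{x}\rho Q\,ds\bigr)/\rho(x)$ with $\rho(x) = e^{\int_{x_0}^{x} P\,ds}$, and approximates both integrals with the trapezoidal rule. The function name and the number of steps are hypothetical choices.

```python
import math

def solve_linear(P, Q, x0, y0, x_end, n=2000):
    """Approximate y(x_end) for y' + P(x) y = Q(x), y(x0) = y0, via an integrating factor."""
    h = (x_end - x0) / n
    int_P = 0.0      # running trapezoidal value of the integral of P from x0 to x
    int_rhoQ = 0.0   # running trapezoidal value of the integral of rho * Q from x0 to x
    rho = 1.0        # rho(x0) = exp(0) = 1
    x = x0
    for _ in range(n):
        x_next = x + h
        int_P += 0.5 * h * (P(x) + P(x_next))
        rho_next = math.exp(int_P)
        int_rhoQ += 0.5 * h * (rho * Q(x) + rho_next * Q(x_next))
        x, rho = x_next, rho_next
    return (y0 + int_rhoQ) / rho

# Example 5.2.7 with the initial condition y(0) = 0, which corresponds to C = 1/2.
approx = solve_linear(lambda x: 2 * x, lambda x: x**3, 0.0, 0.0, 1.0)
closed_form = 1.0**2 / 2 - 0.5 + 0.5 * math.exp(-1.0**2)
print(approx, closed_form)  # the two values should agree to several digits
```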

Example 5.2.8. Solve the initial value problem

$$\frac{dy}{dx} - y = 2e^{-x/3}, \quad y(0) = -1.$$

Solution. This is a first-order linear differential equation with $P(x) = -1$ and $Q(x) = 2e^{-x/3}$, so a general solution is

$$\begin{aligned}
y(x) &= e^{-\int P(x)\,dx}\left(\int Q(x)e^{\int P(x)\,dx}\,dx + C\right) \\
&= e^{-\int(-1)\,dx}\left(\int 2e^{-x/3}e^{\int(-1)\,dx}\,dx + C\right) \\
&= e^{x}\left(-\frac{3}{2}e^{-\frac{4}{3}x} + C\right).
\end{aligned}$$

Substitution of $x = 0$ and $y = -1$ shows that $C = \frac{1}{2}$, so the desired particular solution
is

$$y(x) = \frac{1}{2}e^{x} - \frac{3}{2}e^{-x/3}.$$

Example 5.2.9. Solve the differential equation $y\,dx + (y^3 - x)\,dy = 0$ (assume that $y > 0$).

Solution. If we rewrite the equation as

$$\frac{dy}{dx} + \frac{y}{y^3 - x} = 0,$$

it is not linear, homogeneous, separable, or exact as we have discussed so far. However, if we rewrite it as

$$\frac{dx}{dy} + \frac{y^3 - x}{y} = 0,$$

then

$$\frac{dx}{dy} - \frac{1}{y}x = -y^2.$$

It is now linear in $x$, as a function of $y$, with $P = -\frac{1}{y}$ and $Q = -y^2$. The general solution
is given by

$$\begin{aligned}
x &= e^{\int \frac{1}{y}\,dy}\left(\int -y^2e^{-\int \frac{1}{y}\,dy}\,dy + C_1\right) = e^{\ln y}\left(-\int y^2e^{-\ln y}\,dy + C_1\right) \\
&= y\left(-\int y^2 \times \frac{1}{y}\,dy + C_1\right) \\
&= y\left(-\frac{y^2}{2} + C_1\right).
\end{aligned}$$

A general solution is, therefore, $2x = -y^3 + Cy$ (we replaced the constant $2C_1$ by $C$).

Bernoulli equations
A first-order differential equation of the form

$$\frac{dy}{dx} + P(x)y = Q(x)y^n \tag{5.9}$$

is called a Bernoulli equation.

Remark. This type of equation was named after Jacob Bernoulli, who was one of the
many prominent mathematicians in the Bernoulli family.

If either $n = 0$ or $n = 1$, then the equation is linear. Otherwise, dividing both sides
by $y^n$ and making the substitution

$$v = y^{1-n}$$

transforms it into the linear equation

$$\frac{dv}{dx} + (1 - n)P(x)v = (1 - n)Q(x).$$

Rather than memorizing the form of this transformed equation, it is more efficient to
make the substitution explicitly, after dividing both sides by $y^n$, as in the following
example.

Example 5.2.10. Solve the ODE

$$2xy' = 3y + 4x^2y^3.$$

Solution. Divide the ODE by $2x$ to obtain

$$\frac{dy}{dx} - \frac{3}{2x}y = 2xy^3.$$

We see that this is a Bernoulli equation with $P(x) = -\frac{3}{2x}$ and $Q(x) = 2x$ with $n = 3$. We
divide the equation by $y^3$ to obtain

$$y^{-3}\frac{dy}{dx} - \frac{3}{2x}y^{-2} = 2x.$$

Note that the first term $y^{-3}\frac{dy}{dx}$ is exactly $-\frac{1}{2}\frac{d(y^{-2})}{dx}$. Hence, we let $v = y^{-2}$, and the above
equation becomes linear, i.e.,

$$-\frac{1}{2}\frac{d(y^{-2})}{dx} - \frac{3}{2x}y^{-2} = 2x,$$

$$\frac{dv}{dx} - \frac{3 \times (-2)}{2x}v = 2 \times (-2)x,$$

$$\frac{dv}{dx} + \frac{3}{x}v = -4x.$$

A general solution is

$$\begin{aligned}
v &= e^{-\int \frac{3}{x}\,dx}\left(\int -4xe^{\int \frac{3}{x}\,dx}\,dx + C\right) \\
&= x^{-3}\left(\int -4x \times x^3\,dx + C\right) \\
&= x^{-3}\left(-\frac{4}{5}x^5 + C\right), \\
v &= -\frac{4}{5}x^2 + \frac{C}{x^3}.
\end{aligned}$$

Since $v = y^{-2}$, a general solution to the original differential equation is

$$y^{-2} = -\frac{4}{5}x^2 + \frac{C}{x^3}.$$
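The Bernoulli solution above is implicit in $y$, so it is worth checking against the original ODE. The sketch below takes the positive branch $y = v^{-1/2}$ with the illustrative choice C = 2 (an assumption; any C that keeps $v > 0$ at the test points works) and evaluates the residual of $2xy' = 3y + 4x^2y^3$ with a central difference.

```python
def y(x, C=2.0):
    """Positive branch of y^(-2) = -(4/5) x^2 + C / x^3; valid only where v > 0."""
    v = -0.8 * x**2 + C / x**3
    return v ** -0.5

def residual(x, C=2.0, h=1e-6):
    """Left side minus right side of 2 x y' = 3 y + 4 x^2 y^3."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return 2 * x * dy - (3 * y(x, C) + 4 * x**2 * y(x, C) ** 3)

for x in (0.5, 0.8, 1.0):
    print(x, residual(x))  # values near zero indicate the formula satisfies the ODE
```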

5.3 Second-order ODEs


5.3.1 Reducible second-order equations

Differential equations of higher order appear in many applications in science and engineering. For example, the well-known simple harmonic motion equation

$$\frac{d^2x}{dt^2} = -\omega^2x$$

is a second-order differential equation. A second-order differential equation involves
the second derivative of an unknown function $y(x)$. Thus, it has the general form

$$F(x, y, y', y'') = 0. \tag{5.10}$$

A general solution to a second-order ODE involves two arbitrary constants. Thus, to
find a particular solution, we need two additional conditions, say, $y(x_0) = a$ and
$y'(x_0) = b$.
Some types of second-order ODEs can be reduced to a first-order equation, which can
then be solved using methods from previous sections. These are
called reducible second-order ODEs. This is often the case if either the dependent variable $y$ or the independent variable $x$ is missing from a second-order ODE.

1. The dependent variable $y$ and its derivative $y'$ are missing

If $y$ and $y'$ are both missing, then the equation has the form $y'' = f(x)$. Thus, integrating
the equation twice gives a general solution,

$$y' = \int f(x)\,dx + C_1,$$

$$y = \int\left[\int f(x)\,dx + C_1\right]dx + C_2.$$

Note. This is equivalent to solving two first-order ODEs $\frac{dP}{dx} = f(x)$ and $\frac{dy}{dx} = P$.

Example 5.3.1. Solve the differential equation

$$y'' = x + 1.$$

Solution. Integrating once gives

$$y' = \int (x + 1)\,dx = \frac{x^2}{2} + x + C_1.$$

Integrating again gives

$$y = \int\left(\frac{x^2}{2} + x + C_1\right)dx = \frac{x^3}{6} + \frac{x^2}{2} + C_1x + C_2.$$

2. The dependent variable $y$ is missing

If $y$ is missing, then equation (5.10) takes the form

$$F(x, y', y'') = 0. \tag{5.11}$$

The substitution

$$y' = p, \quad y'' = \frac{dp}{dx}$$

results in a first-order differential equation in $p$ and $x$, i.e.,

$$F(x, p, p') = 0.$$

If we can find a general solution $p(x, C_1)$ involving an arbitrary constant $C_1$, then we
can write the solution of the original equation as

$$y(x) = \int y'(x)\,dx = \int p(x, C_1)\,dx + C_2.$$

This gives us a solution of equation (5.11) that involves two arbitrary constants $C_1$ and
$C_2$, as is to be expected in the case of a second-order differential equation.

Example 5.3.2. Solve the equation $xy'' + 2y' = 6x$, in which the dependent variable $y$ is missing.
Determine the particular solution if $y(1) = 2$ and $y'(1) = 1$.

Solution. Let $y' = p$, so that $y'' = \frac{dp}{dx}$. The substitution defined above gives the first-order equation

$$x\frac{dp}{dx} + 2p = 6x, \quad \text{that is,} \quad \frac{dp}{dx} + \frac{2}{x}p = 6.$$

This is linear (in $p$). So, solving it by the method given by equation (5.8) gives

$$p(x) = 2x + \frac{C_1}{x^2}.$$

This means $\frac{dy}{dx} = 2x + \frac{C_1}{x^2}$. A final integration with respect to $x$ yields a general solution
of the original equation $xy'' + 2y' = 6x$, i.e.,

$$y(x) = \int p(x)\,dx = \int\left(2x + \frac{C_1}{x^2}\right)dx = x^2 - \frac{C_1}{x} + C_2.$$

Since $y'(1) = 1$, this means $p(1) = 1 = 2 + \frac{C_1}{1^2}$, so $C_1 = -1$. Since $y(1) = 2$, we have

$$2 = 1^2 - \frac{-1}{1} + C_2, \quad \text{so } C_2 = 0.$$

Thus, the particular solution is

$$y(x) = x^2 + \frac{1}{x}.$$
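As a check on Example 5.3.2 (a short verification added for illustration, using only the formulas already obtained), we can substitute the particular solution back into the equation and the two initial conditions:

```latex
\begin{aligned}
y &= x^2 + \frac{1}{x}, \qquad y' = 2x - \frac{1}{x^2}, \qquad y'' = 2 + \frac{2}{x^3}, \\
xy'' + 2y' &= \left(2x + \frac{2}{x^2}\right) + \left(4x - \frac{2}{x^2}\right) = 6x, \\
y(1) &= 1 + 1 = 2, \qquad y'(1) = 2 - 1 = 1.
\end{aligned}
```

Both the differential equation and the initial conditions are satisfied.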

3. The independent variable $x$ is missing

If $x$ is missing, then equation (5.10) takes the form

$$F(y, y', y'') = 0. \tag{5.12}$$

The substitution

$$p = y', \quad y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}$$

results in a first-order differential equation in terms of $p$ as a function of $y$, i.e.,

$$F\left(y, p, p\frac{dp}{dy}\right) = 0.$$

If we can solve this equation for a general solution $p(y, C_1)$ involving an arbitrary constant $C_1$, then (assuming that $y' \neq 0$) we can find a solution of the original equation,
with $x$ as a function of $y$, by solving the first-order ODE $\frac{dy}{dx} = p(y, C_1)$ as follows:

$$p(y, C_1) = \frac{dy}{dx}, \quad \text{so that} \quad dx = \frac{1}{p(y, C_1)}\,dy,$$

$$x = \int \frac{1}{p(y, C_1)}\,dy + C_2.$$

This leads to the implicit solution $y = y(x)$ of equation (5.12).

Example 5.3.3. Solve the initial value problem $yy'' = (y')^2$, in which the independent variable $x$ is
missing, with the initial conditions $y(0) = 2$ and $y'(0) = 1$.

Solution. We substitute

$$y' = p \quad \text{and} \quad y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}.$$

The original equation becomes

$$yp\frac{dp}{dy} = p^2.$$

One solution is $p = 0$, which gives $y = k$ for a constant $k$. Otherwise divide by $p$ and use separation
of variables. Then we have

$$\int \frac{dp}{p} = \int \frac{dy}{y},$$

$$\ln|p| = \ln|y| + C,$$

$$e^{\ln|p|} = e^{\ln|y|}e^{C},$$

$$|p| = |y|e^{C},$$

$$p = C_1y, \quad \text{where } C_1 = \pm e^{C},$$

$$\frac{dy}{dx} = C_1y.$$

The initial condition $y'(0) = 1$ when $y(0) = 2$ gives $C_1 = \frac{1}{2}$. Hence, integrating again we
obtain

$$dx = \frac{2}{y}\,dy,$$

$$x = 2\ln|y| + C_2.$$

Simplifying the expression gives

$$y^2 = e^{x - C_2} = e^{-C_2}e^{x} = Ae^{x},$$

where $A = e^{-C_2}$ is an arbitrary constant. Substituting the initial condition $y(0) = 2$
gives $A = 4$. Thus, the particular solution satisfying the initial conditions is

$$y = 2\sqrt{e^{x}}.$$

Note that $y(0) = 2$, so we take $y$ as being positive.



5.3.2 Second-order linear differential equations

Early in this chapter, we considered a mass-spring system in which the resistance is
proportional to the mass's velocity. With positive constants $l$ and $k$, it is described by the second-order ODE

$$mx'' + lx' + kx = 0.$$

If, furthermore, there is an external force $f(t)$ acting on the mass, we will have

$$mx'' + lx' + kx = f(t).$$

In general, the so-called nonhomogeneous second-order linear differential equations
have the form

$$a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = f(x). \tag{5.13}$$

The term "linear" applies to $y$, $y'$, and $y''$, and it means that they appear in separate
terms of the ODE without an exponent (other than one) and are not part of another
function (such as $\sqrt{1 + y}$). The functions $a(x)$, $b(x)$, $c(x)$, and $f(x)$ are allowed to be
nonlinear. We also assume that $a(x) \neq 0$.
If in addition $f(x) = 0$ for all $x$ in the above equation, then the differential equation
is called a homogeneous linear equation:

$$a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = 0. \tag{5.14}$$

The ODE

$$e^{x}y'' + (\cos x)y' + (1 + \sqrt{x})y = x$$

is linear and nonhomogeneous. By contrast, the equations

$$y'' = yy' \quad \text{and} \quad y'' - 3(y')^2 + 4y^3 = 0$$

are not linear because they contain products and powers of $y$ or its derivative.
The second-order ODE

$$x^2y'' + 2xy' + 3y = \cos x$$

is linear and nonhomogeneous, whereas the following is linear and homogeneous:

$$x^2y'' + 2xy' + 3y = 0.$$

Homogeneous second-order linear differential equations


We now explore the solutions to equation (5.14).

Theorem 5.3.1 (Principle of superposition for homogeneous equations). Let $y_1(x)$ and $y_2(x)$ be two solutions of the homogeneous linear equation (5.14), $a(x)y'' + b(x)y' + c(x)y = 0$, defined on the interval
$I$. If $C_1$ and $C_2$ are constants, then the linear combination

$$y = C_1y_1(x) + C_2y_2(x)$$

is also a solution of this homogeneous ODE.

Proof. Since $y_1$ and $y_2$ are solutions of equation (5.14), we have

$$a(x)y_1'' + b(x)y_1' + c(x)y_1 = 0 \quad \text{and} \quad a(x)y_2'' + b(x)y_2' + c(x)y_2 = 0.$$

Substituting $y = C_1y_1 + C_2y_2$ into equation (5.14), we have

$$\begin{aligned}
&a(x)(C_1y_1 + C_2y_2)'' + b(x)(C_1y_1 + C_2y_2)' + c(x)(C_1y_1 + C_2y_2) \\
&\quad = a(x)C_1y_1'' + a(x)C_2y_2'' + b(x)C_1y_1' + b(x)C_2y_2' + c(x)C_1y_1 + c(x)C_2y_2 \\
&\quad = C_1[a(x)y_1'' + b(x)y_1' + c(x)y_1] + C_2[a(x)y_2'' + b(x)y_2' + c(x)y_2] \\
&\quad = 0.
\end{aligned}$$

So, $y = C_1y_1 + C_2y_2$ is a solution of equation (5.14).

Thus, if we can find two particular solutions to equation (5.14), and they are lin-
early independent, then the linear combination of the two particular solutions gives
a general solution to equation (5.14). The definition of two linearly independent func-
tions is given below.

Definition 5.3.1 (Linear independence of two functions). Two functions $f$ and $g$ defined on an open interval $I$
are said to be linearly independent on $I$ provided that neither is a constant multiple of the other (alternatively, neither of the two functions $\frac{f}{g}$ or $\frac{g}{f}$ is a constant-valued function on $I$).

For example, $e^{x}$ and $e^{2x}$ are two linearly independent functions, while $e^{2x}$ and $2e^{2x}$ are
linearly dependent. By the superposition theorem, we have the following theorem.

Theorem 5.3.2 (General solution of second-order homogeneous linear ODEs). Let $y_1$ and $y_2$ be two
linearly independent solutions of the homogeneous linear differential equation

$$a(x)y'' + b(x)y' + c(x)y = 0,$$

where $a(x)$ ($\neq 0$), $b(x)$, and $c(x)$ are continuous on some interval $I$. Then, a general solution is

$$y(x) = C_1y_1(x) + C_2y_2(x),$$

where $C_1$ and $C_2$ are two arbitrary constants.


Note. This theorem can be generalized to a linear homogeneous differential equation of order $n$,

$$y^{(n)} + p_1(x)y^{(n-1)} + p_2(x)y^{(n-2)} + \cdots + p_{n-1}(x)y' + p_n(x)y = 0. \tag{5.15}$$

That is, if $y_1, y_2, \ldots, y_n$ are $n$ linearly independent solutions of this equation, then a
general solution is

$$y(x) = c_1y_1 + c_2y_2 + \cdots + c_ny_n, \quad \text{where } c_1, c_2, \ldots, c_n \text{ are } n \text{ arbitrary constants.}$$

Example 5.3.4. For the differential equation

$$y'' - 4y = 0,$$

we can verify that $y_1(x) = e^{2x}$ and $y_2(x) = e^{-2x}$ are two solutions, and $\frac{y_1}{y_2} = e^{4x}$ is not a constant. Therefore, $y_1$ and $y_2$ are two linearly independent solutions. So, a general solution is $y(x) = C_1e^{2x} + C_2e^{-2x}$.

Homogeneous second-order linear differential equations with constant coefficients


As an illustration of the general theory for solutions to a second-order linear ODE, we
now discuss the general second-order homogeneous linear differential equation

$$ay'' + by' + cy = 0 \tag{5.16}$$

with constant coefficients $a$ ($\neq 0$), $b$, and $c$. We first look for a single solution of this
equation, and begin with the observation that exponential functions often play a role
in such solutions, such as the solution of $y'' - 4y = 0$ in the previous example. Note
that if $r$ is a constant, then

$$(e^{rx})' = re^{rx} \quad \text{and} \quad (e^{rx})'' = r^2e^{rx}.$$

Hence, substituting $y = e^{rx}$ into equation (5.16), we find

$$ar^2e^{rx} + bre^{rx} + ce^{rx} = 0.$$

However, $e^{rx}$ is never zero, so we can divide this out of the equation. We conclude
that $y(x) = e^{rx}$ will satisfy the homogeneous linear differential equation (5.16) with
constant coefficients precisely when $r$ is a root of the algebraic equation

$$ar^2 + br + c = 0. \tag{5.17}$$

This quadratic equation is called the characteristic equation or auxiliary equation
of equation (5.16).
It is easy to see that when the characteristic equation (5.17) has two distinct (unequal) roots $r_1$ and $r_2$, then these give two solutions $y_1 = e^{r_1x}$ and $y_2 = e^{r_2x}$ that are
linearly independent (since $e^{r_1x}/e^{r_2x}$ is not a constant). This gives the following result.

Theorem 5.3.3 (Homogeneous linear ODEs – distinct real roots). If $r_1$ and $r_2$ are real and distinct roots
of the characteristic equation $ar^2 + br + c = 0$ of the ODE $ay'' + by' + cy = 0$, then

$$y(x) = C_1e^{r_1x} + C_2e^{r_2x},$$

where $C_1$ and $C_2$ are arbitrary constants, is a general solution of $ay'' + by' + cy = 0$.

Example 5.3.5. Find a general solution of

$$2y'' - 7y' + 3y = 0.$$

Solution. The solutions of the characteristic equation

$$2r^2 - 7r + 3 = 0$$

are $r_1 = \frac{1}{2}$ and $r_2 = 3$. So, a general solution is

$$y(x) = C_1e^{\frac{1}{2}x} + C_2e^{3x}.$$

If the characteristic equation (5.17) has a root of multiplicity 2, then $r_1 = r_2$. In this
case, we only get one particular solution. We cannot say that

$$y = C_1e^{r_1x} + C_2e^{r_2x}$$

is a general solution since $e^{r_1x} = e^{r_2x}$, and they are not linearly independent. There
is, in fact, only one arbitrary constant. To find another solution, we use the method of
variation of parameters. We assume a particular solution has the form

$$y^* = C(x)e^{r_1x}.$$

Then, we will look for such a $C(x)$. We plug $y^*$ into the equation to obtain

$$a(y^*)'' + b(y^*)' + cy^* = 0,$$

$$a\bigl(C(x)e^{r_1x}\bigr)'' + b\bigl(C(x)e^{r_1x}\bigr)' + c\bigl(C(x)e^{r_1x}\bigr) = 0,$$

$$a\bigl(C''(x)e^{r_1x} + 2r_1C'(x)e^{r_1x} + r_1^2C(x)e^{r_1x}\bigr) + b\bigl(C'(x)e^{r_1x} + C(x)r_1e^{r_1x}\bigr) + c\bigl(C(x)e^{r_1x}\bigr) = 0.$$

Since $e^{r_1x} \neq 0$, we can simplify this to

$$a[C''(x) + 2r_1C'(x) + r_1^2C(x)] + b[C'(x) + r_1C(x)] + cC(x) = 0,$$

$$aC''(x) + (2ar_1 + b)C'(x) + (ar_1^2 + br_1 + c)C(x) = 0.$$

Since $ar_1^2 + br_1 + c = 0$ and $2ar_1 + b = 0$ (because $r_1 = r_2 = -\frac{b}{2a}$ is a repeated root), we have

$$aC''(x) = 0.$$

Therefore, such a $C(x)$ does exist; we can choose a simple one, say, $C(x) = x$ (choosing
any $C(x) = kx + l$ with $k \neq 0$ also works). Then, we have another particular solution $y = xe^{r_1x}$
which is linearly independent of $y = e^{r_1x}$. So, we have the following theorem.

Theorem 5.3.4 (Homogeneous linear ODEs – repeated roots). If the characteristic equation
$ar^2 + br + c = 0$ of the ODE $ay'' + by' + cy = 0$ has only one root $r$ (a double root, or repeated
root), then

$$y(x) = (C_1 + C_2x)e^{rx},$$

where $C_1$ and $C_2$ are arbitrary constants, is a general solution of $ay'' + by' + cy = 0$.

Example 5.3.6. Solve the equation $9y'' + 12y' + 4y = 0$.

Solution. The auxiliary equation $9r^2 + 12r + 4 = 0$ can be factored as

$$(3r + 2)^2 = 0.$$

The only root is $r = -\frac{2}{3}$. Thus, a general solution is

$$y = C_1e^{-2x/3} + C_2xe^{-2x/3}.$$

Example 5.3.7. Solve the initial value problem

$$\begin{cases} y'' + 2y' + y = 0, \\ y(0) = 5, \quad y'(0) = -3. \end{cases}$$

Solution. We note first that the characteristic equation $r^2 + 2r + 1 = 0$ has repeated
roots $r_1 = r_2 = -1$. Hence, a general solution for the ODE is

$$y(x) = C_1e^{-x} + C_2xe^{-x}.$$

In order to use the initial conditions, we differentiate to find $y'$, i.e.,

$$y'(x) = -C_1e^{-x} + C_2e^{-x} - C_2xe^{-x}.$$

So, the initial conditions substituted into the equations for $y(x)$ and $y'(x)$ give

$$y(0) = C_1 = 5,$$

$$y'(0) = -C_1 + C_2 = -3,$$

which imply $C_1 = 5$ and $C_2 = 2$. Therefore, the desired solution is

$$y(x) = 5e^{-x} + 2xe^{-x}.$$

In the third case, the discriminant of the auxiliary equation, $b^2 - 4ac$, is less
than 0. The auxiliary equation then has two complex roots, which form a conjugate
pair $r_{1,2} = \alpha \pm \beta i$. The theory still implies that $e^{(\alpha + \beta i)x}$ and $e^{(\alpha - \beta i)x}$ are particular solutions of the linear ODE, but we would not expect to have complex numbers in
the solution of a real problem. The next theorem shows that we can, in fact, find real
solutions via these complex solutions.

Theorem 5.3.5 (Homogeneous linear ODEs – complex conjugate roots). If $r_1 = \alpha + \beta i$ and $r_2 = \alpha - \beta i$
are complex conjugate roots of the characteristic equation $ar^2 + br + c = 0$ of the ODE $ay'' + by' + cy = 0$,
then

$$y = e^{\alpha x}(C_1\cos\beta x + C_2\sin\beta x),$$

where $C_1$ and $C_2$ are arbitrary constants, is a general solution of $ay'' + by' + cy = 0$.

Proof. For the proof, Euler's theorem is required, which states that for any $\theta$, we have
$e^{i\theta} = \cos\theta + i\sin\theta$. The theory of solutions developed above applies, so we can write
a general solution as

$$\begin{aligned}
y &= Ae^{r_1x} + Be^{r_2x} = Ae^{(\alpha + \beta i)x} + Be^{(\alpha - \beta i)x} \\
&= Ae^{\alpha x}(\cos\beta x + i\sin\beta x) + Be^{\alpha x}(\cos\beta x - i\sin\beta x) \\
&= e^{\alpha x}[(A + B)\cos\beta x + i(A - B)\sin\beta x] \\
&= e^{\alpha x}(C_1\cos\beta x + C_2\sin\beta x),
\end{aligned}$$

where $C_1 = A + B$ and $C_2 = i(A - B)$. The solutions are, therefore, real when $C_1$ and $C_2$
are both real.

Note. In fact, without using Euler’s theorem, one can still derive a general solution
by proving that eαx cos βx and eαx sin βx are two linearly independent solutions.

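Example 5.3.8. Solve $x''(t) = -\omega^2 x$.

Solution. Since the characteristic equation, $r^2 + \omega^2 = 0$, has roots $r_1 = \omega i$ and $r_2 = -\omega i$, it follows that a general solution is

\[ x(t) = C_1 e^{0t} \cos \omega t + C_2 e^{0t} \sin \omega t = \sqrt{C_1^2 + C_2^2}\, \sin(\omega t + B), \]

where the phase $B$ is chosen so that $\sin B = C_1/\sqrt{C_1^2 + C_2^2}$ and $\cos B = C_2/\sqrt{C_1^2 + C_2^2}$. Note that if we denote $A = \sqrt{C_1^2 + C_2^2}$, then this general solution could also be written as

\[ x(t) = A \sin(\omega t + B). \]

This is the general solution for a simple harmonic motion where $A$ is the amplitude and $\omega$ is the angular velocity. The period is $T = \frac{2\pi}{\omega}$.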
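
The conversion between the two forms of the solution can be checked numerically. The sketch below is illustrative only: the function name `amplitude_phase` and the sample values are assumptions, not part of the text. It relies on the identity $A\sin(\omega t + B) = A\sin B\cos\omega t + A\cos B\sin\omega t$, so that $C_1 = A\sin B$ and $C_2 = A\cos B$.

```python
import math

def amplitude_phase(c1, c2):
    """Return (A, B) such that c1*cos(w*t) + c2*sin(w*t) = A*sin(w*t + B)."""
    amplitude = math.hypot(c1, c2)   # A = sqrt(c1^2 + c2^2)
    phase = math.atan2(c1, c2)       # sin B = c1/A and cos B = c2/A
    return amplitude, phase

# Sample constants (arbitrary choices for the check).
A, B = amplitude_phase(3.0, 4.0)
omega, t = 2.0, 0.7
lhs = 3.0 * math.cos(omega * t) + 4.0 * math.sin(omega * t)
rhs = A * math.sin(omega * t + B)
```

Using `atan2` rather than a plain arctangent keeps the phase in the correct quadrant when $C_2$ is negative or zero.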

Example 5.3.9. Find a general solution of the differential equation y 󸀠󸀠 − 2y 󸀠 + 5y = 0.


5.3 Second-order ODEs | 297

Solution. The characteristic equation is

r 2 − 2r + 5 = 0

with roots r1,2 = 1 ± 2i. A general solution is, therefore,

y = ex (C1 cos 2x + C2 sin 2x).

Summary
A general solution of $ay'' + by' + cy = 0$ has one of the following forms:

Roots of $ar^2 + br + c = 0$          General solution
$r_1, r_2$ real and distinct          $y = C_1 e^{r_1 x} + C_2 e^{r_2 x}$
$r_1 = r_2 = r$                       $y = C_1 e^{rx} + C_2 x e^{rx}$
$r_{1,2} = \alpha \pm \beta i$        $y = e^{\alpha x}(C_1 \cos \beta x + C_2 \sin \beta x)$
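
The three rows of this table translate directly into a small classification routine. The following is a minimal sketch, assuming real coefficients with $a \neq 0$; the function name and the returned labels are hypothetical choices made for illustration.

```python
def general_solution_form(a, b, c):
    """Classify the roots of a*r^2 + b*r + c = 0 (real a, b, c with a != 0).

    Returns a tuple (kind, p, q):
      ("distinct", r1, r2)      -> y = C1*exp(r1*x) + C2*exp(r2*x)
      ("repeated", r, r)        -> y = C1*exp(r*x) + C2*x*exp(r*x)
      ("complex", alpha, beta)  -> y = exp(alpha*x)*(C1*cos(beta*x) + C2*sin(beta*x))
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        root = disc ** 0.5
        return ("distinct", (-b + root) / (2 * a), (-b - root) / (2 * a))
    if disc == 0:  # exact test; with floating-point input a tolerance may be safer
        r = -b / (2 * a)
        return ("repeated", r, r)
    alpha = -b / (2 * a)
    beta = (-disc) ** 0.5 / abs(2 * a)
    return ("complex", alpha, beta)

# y'' - 2y' + 5y = 0 from Example 5.3.9: roots 1 +/- 2i.
example_539 = general_solution_form(1, -2, 5)
```

For Example 5.3.9 the result is the label `"complex"` with `alpha = 1.0` and `beta = 2.0`, matching $y = e^{x}(C_1 \cos 2x + C_2 \sin 2x)$.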

For higher-order homogeneous linear differential equations of the form

\[ y^{(n)} + p_1 y^{(n-1)} + p_2 y^{(n-2)} + \cdots + p_n y = 0, \quad \text{where all } p_i \text{ are constants,} \]

the results are similar to those that we have obtained for second-order linear ODEs. The principal difference is that there will be $n$ roots of the auxiliary equation, counted with multiplicity, when the order of the ODE is $n$. A general solution is a linear combination of $n$ independent solutions, and each root (or conjugate pair of roots) of the auxiliary equation

\[ r^n + p_1 r^{n-1} + \cdots + p_n = 0 \]

corresponds to one particular solution (or a pair of particular solutions), as shown in the above table.
If there is a root $r$ with multiplicity 3, then this root contributes a term to the general solution with three arbitrary constants, $C_1$, $C_2$, and $C_3$, i.e., $C_1 e^{rx} + C_2 x e^{rx} + C_3 x^2 e^{rx}$. Roots of still higher multiplicity extend this result in an analogous way.

Example 5.3.10. Find a general solution of the equation $y^{(4)} + 2y''' + 3y'' = 0$.

Solution. The auxiliary equation is $r^4 + 2r^3 + 3r^2 = 0$. Its roots are $r = 0$ (repeated) and $r = -1 \pm \sqrt{2}\,i$, so a general solution is

\[
\begin{aligned}
y &= C_1 e^{0x} + C_2 x e^{0x} + e^{-x}(C_3 \cos \sqrt{2}x + C_4 \sin \sqrt{2}x) \\
  &= C_1 + C_2 x + C_3 e^{-x} \cos \sqrt{2}x + C_4 e^{-x} \sin \sqrt{2}x.
\end{aligned}
\]

Nonhomogeneous second-order linear differential equations


We now discuss nonhomogeneous second-order linear differential equations of the form

\[ a(x)y'' + b(x)y' + c(x)y = f(x). \tag{5.18} \]

The associated homogeneous equation

\[ a(x)y'' + b(x)y' + c(x)y = 0, \tag{5.19} \]

where the right-hand side function $f(x)$ is replaced by zero, is called the complementary equation. A general solution of this equation is called a complementary function. In cases where equation (5.18) models a physical system, the nonhomogeneous term $f(x)$ frequently corresponds to some external influence on the system being modeled.
There is a nice connection between the solutions of equation (5.18) and equation (5.19).

Theorem 5.3.6 (General solution of nonhomogeneous linear ODEs). A general solution of a nonhomogeneous differential equation

\[ a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = f(x) \]

can be written as

\[ y(x) = y_c(x) + y_p(x), \tag{5.20} \]

where $y_c(x)$ is a complementary function (a general solution of the associated homogeneous equation (5.19)), and $y_p(x)$ is a particular solution of equation (5.18).

Proof. We first show that $y(x) = y_p(x) + y_c(x)$ is a solution of equation (5.18). Substituting into the left-hand side of that equation, we obtain

\[
\begin{aligned}
& a(x)(y_p + y_c)'' + b(x)(y_p + y_c)' + c(x)(y_p + y_c) \\
&\quad = [a(x)y_p'' + b(x)y_p' + c(x)y_p] + [a(x)y_c'' + b(x)y_c' + c(x)y_c] \\
&\quad = f(x) + 0 \\
&\quad = f(x).
\end{aligned}
\]

Now we show that any solution of equation (5.18) must be of the form of equation (5.20). If $y_*$ is any solution of equation (5.18), then

\[ a(x)y_*'' + b(x)y_*' + c(x)y_* = f(x). \]

But we also have

\[ a(x)y_p'' + b(x)y_p' + c(x)y_p = f(x). \]

So

\[ a(x)(y_* - y_p)'' + b(x)(y_* - y_p)' + c(x)(y_* - y_p) = 0. \]

This means that $y_* - y_p$ must be a solution of the complementary equation $a(x)y'' + b(x)y' + c(x)y = 0$. So, $y_* - y_p = y_c$ for some suitable choice of the constants in $y_c$. This means

\[ y_* = y_c + y_p. \]

We now apply the theory to nonhomogeneous second-order linear differential equations with constant coefficients.

Nonhomogeneous second-order linear differential equations with constant coefficients
We have derived theorems for finding general solutions to

\[ ay'' + by' + cy = 0, \]

which have the form $C_1 y_1 + C_2 y_2$, where $y_1$ and $y_2$ are two linearly independent solutions. If we can find a particular solution $y_p$ to the nonhomogeneous second-order linear differential equation with constant coefficients of the form

\[ ay'' + by' + cy = f(x), \tag{5.21} \]

then, according to Theorem 5.3.6, we obtain a general solution to equation (5.21),

\[ y(x) = C_1 y_1 + C_2 y_2 + y_p. \]

In general, it is very hard to find a particular solution $y_p$ for a nonhomogeneous equation. In the following, we only discuss the cases where the right-hand side function $f(x)$ of equation (5.21) is a linear combination of products of the form $P_m(x)e^{\lambda x}$, where $\lambda$ is a real or complex constant and $P_m(x)$ is a polynomial of degree $m$. We use the method of undetermined coefficients, in which we choose for $y_p$ the most likely function, such as a polynomial multiplied by an exponential, $Q(x)e^{\lambda x}$, and then determine the unknown coefficients by substituting $y_p(x)$ into the ODE.

Example 5.3.11. Solve the differential equation $y'' + y' - 2y = 2x + 1$.

Solution. The roots of the auxiliary equation $r^2 + r - 2 = 0$ are $r = 1$ and $r = -2$. Hence, a complementary function is

\[ y_c = C_1 e^{x} + C_2 e^{-2x}. \]

It seems likely that a polynomial will give a particular solution, because the right-hand side of the differential equation is a polynomial. Since the right-hand side, $2x + 1$, is a polynomial of degree 1, we try $y_p = Ax + B$ of degree 1. Substituting into the given differential equation we have

\[
\begin{aligned}
(Ax + B)'' + (Ax + B)' - 2(Ax + B) &= 2x + 1, \\
A - 2Ax - 2B &= 2x + 1.
\end{aligned}
\]

The polynomial on the left-hand side equals the polynomial on the right-hand side exactly when their coefficients are equal. Thus,

\[ -2A = 2 \quad \text{and} \quad A - 2B = 1. \]

This gives $A = -1$ and $B = -1$. So, $y_p = -x - 1$ is a particular solution. Therefore, a general solution is

\[ y = y_c + y_p = C_1 e^{x} + C_2 e^{-2x} - x - 1. \]

Example 5.3.12. Find a particular solution for each of the following differential equations:

(a) $3y'' - 2y' - y = e^{2x}$, (b) $y'' - 2y' - 3y = e^{3x}$.

Solution. For (a), we try $y_p = Ae^{2x}$. Then

\[
\begin{aligned}
3(Ae^{2x})'' - 2(Ae^{2x})' - (Ae^{2x}) &= e^{2x}, \\
12Ae^{2x} - 4Ae^{2x} - Ae^{2x} &= e^{2x}, \\
A &= \frac{1}{7}.
\end{aligned}
\]

So a particular solution is $y_p = \frac{1}{7} e^{2x}$.
For (b), if we try $y_p = Ae^{3x}$, it will not work. Can you see why? Instead, we try $y_p = Axe^{3x}$. Then

\[
\begin{aligned}
(Axe^{3x})'' - 2(Axe^{3x})' - 3(Axe^{3x}) &= e^{3x}, \\
A(3xe^{3x} + e^{3x})' - 2A(3xe^{3x} + e^{3x}) - 3Axe^{3x} &= e^{3x}, \\
A(3e^{3x} + 9xe^{3x} + 3e^{3x}) - 2A(3xe^{3x} + e^{3x}) - 3Axe^{3x} &= e^{3x}, \\
A(6 + 9x) - 2A(1 + 3x) - 3xA &= 1, \\
A &= \frac{1}{4}.
\end{aligned}
\]

A particular solution is, therefore, $y_p = \frac{1}{4} xe^{3x}$.

Example 5.3.13. Solve the differential equation

\[ y'' - 5y' + 6y = xe^{2x}, \quad y(0) = 1, \quad y'(0) = 2. \]

Solution. The auxiliary equation $r^2 - 5r + 6 = 0$ has roots $r_1 = 2$ and $r_2 = 3$, so a complementary function is $y_c = C_1 e^{2x} + C_2 e^{3x}$. For a particular solution, shall we try $y_p = (Ax + B)e^{2x}$, because it is similar to $xe^{2x}$? The answer is “No” since the complementary function already has the term $Ce^{2x}$. So, instead, we try $y_p = x(Ax + B)e^{2x}$, for which the derivatives are

\[
\begin{aligned}
y_p' &= (Ax^2 + Bx)' e^{2x} + (Ax^2 + Bx)(e^{2x})' \\
     &= (2Ax + B)e^{2x} + 2(Ax^2 + Bx)e^{2x}, \\
y_p'' &= 2Ae^{2x} + 2(2Ax + B)e^{2x} + (4Ax + 2B)e^{2x} + 4(Ax^2 + Bx)e^{2x} \\
      &= 2e^{2x}(A + 2B + 4Ax + 2Bx + 2Ax^2).
\end{aligned}
\]

Substituting into the differential equation, $y'' - 5y' + 6y = xe^{2x}$, gives

\[ 2e^{2x}(A + 2B + 4Ax + 2Bx + 2Ax^2) - 5e^{2x}(B + 2Ax + 2Bx + 2Ax^2) + 6e^{2x}(Bx + Ax^2) = xe^{2x}. \]

Dividing by $e^{2x}$ and collecting coefficients yields

\[
\begin{aligned}
2(A + 2B + 4Ax + 2Bx + 2Ax^2) - 5(B + 2Ax + 2Bx + 2Ax^2) + 6(Bx + Ax^2) &= x, \\
2A - B - 2Ax &= x.
\end{aligned}
\]

Equating coefficients gives

\[ A = -\frac{1}{2} \quad \text{and} \quad B = -1, \]

so a particular solution is

\[ y_p = x\left(-\frac{1}{2}x - 1\right)e^{2x}. \]

Thus, we obtain a general solution

\[ y = C_1 e^{2x} + C_2 e^{3x} - x\left(\frac{1}{2}x + 1\right)e^{2x}. \]

Since $y' = 2C_1 e^{2x} + 3C_2 e^{3x} - 2x\left(\frac{x}{2} + 1\right)e^{2x} - (x + 1)e^{2x}$, under the conditions $y(0) = 1$ and $y'(0) = 2$, we have

\[
\begin{aligned}
1 &= C_1 + C_2, \\
2 &= 2C_1 + 3C_2 - 1.
\end{aligned}
\]

So, $C_1 = 0$ and $C_2 = 1$, and the particular solution is

\[ y = e^{3x} - x\left(\frac{x}{2} + 1\right)e^{2x}. \]
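
A result like this can be sanity-checked numerically before trusting the algebra. The sketch below is not part of the original solution: it estimates the derivatives of the final answer by central differences (the step size `h` and the sample points are arbitrary choices) and evaluates the residual $y'' - 5y' + 6y - xe^{2x}$, which should be close to zero.

```python
import math

def y(x):
    """Final answer of Example 5.3.13: y = e^{3x} - x(x/2 + 1)e^{2x}."""
    return math.exp(3 * x) - x * (x / 2 + 1) * math.exp(2 * x)

def residual(x, h=1e-4):
    """Central-difference estimate of y'' - 5y' + 6y - x*e^{2x} at x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)            # approximates y'(x)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)  # approximates y''(x)
    return d2 - 5 * d1 + 6 * y(x) - x * math.exp(2 * x)

# Largest residual on a few sample points in [0, 1].
max_residual = max(abs(residual(k / 10)) for k in range(11))
# Central-difference estimate of y'(0), to compare with the condition y'(0) = 2.
slope_at_0 = (y(1e-6) - y(-1e-6)) / 2e-6
```

A small residual does not prove the formula, but a large one reliably signals an algebra slip such as a wrong sign or coefficient.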

A general approach to the method of undetermined coefficients


In general, we can make a reasonable guess to get a particular solution of equation (5.21) (or a higher-order extension of this equation), i.e.,

\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]

when $f(x)$ is of the form $P_m(x)e^{\lambda x}$, where $P_m(x)$ is a degree $m$ polynomial, and $\lambda$ is a constant. Our choice for a particular solution takes the form $y_p(x) = Q(x)e^{\lambda x}$, where $Q(x)$ is a polynomial. We substitute $y = Q(x)e^{\lambda x}$ into the ODE above to obtain

\[ a[Q''(x)e^{\lambda x} + 2\lambda Q'(x)e^{\lambda x} + \lambda^2 Q(x)e^{\lambda x}] + b[Q'(x)e^{\lambda x} + \lambda Q(x)e^{\lambda x}] + cQ(x)e^{\lambda x} = P_m(x)e^{\lambda x}. \]

We cancel out the factor $e^{\lambda x}$ from both sides of the equation, resulting in

\[ aQ''(x) + (2a\lambda + b)Q'(x) + (a\lambda^2 + b\lambda + c)Q(x) = P_m(x). \tag{5.22} \]

We now consider the following cases.

Case 1: $a\lambda^2 + b\lambda + c \neq 0$. That is, $\lambda$ is not a root of the characteristic equation of the associated homogeneous equation. The left-hand side of equation (5.22) then has the same degree as $Q(x)$, so $Q(x)$ needs to be of the same degree, $m$, as the polynomial $P_m(x)$.
Case 2: $a\lambda^2 + b\lambda + c = 0$ but $2a\lambda + b \neq 0$. That is, $\lambda$ is a root of the characteristic equation of multiplicity 1. Then equation (5.22) becomes

\[ aQ''(x) + (2a\lambda + b)Q'(x) = P_m(x). \]

This tells us that $Q(x)$ should be chosen to be of degree $m + 1$. That is, in order to use the method of undetermined coefficients, we choose $Q(x)$ to have degree one more than the degree of $P_m(x)$.
Case 3: $a\lambda^2 + b\lambda + c = 0$ and $2a\lambda + b = 0$. That is, $\lambda$ is a root of the characteristic equation of multiplicity 2. Then, equation (5.22) becomes

\[ aQ''(x) = P_m(x). \]

This tells us that $Q(x)$ should be of degree $m + 2$. That is, in order to use the method of undetermined coefficients, we choose $Q(x)$ to have degree two more than the degree of $P_m(x)$.
We summarize the results. If the right-hand side of the ODE is $f(x) = P_m(x)e^{\lambda x}$, then we initially choose $y_p(x) = Q(x)e^{\lambda x}$, where $Q(x)$ is a polynomial of degree $m$. We modify $y_p$ by multiplying it by $x$ if $\lambda$ is a simple root of the auxiliary equation, and by $x^2$ if $\lambda$ is a repeated root of the auxiliary equation. We determine the undetermined coefficients by substituting $y = y_p$ into the differential equation.
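
The three cases can be turned into a short routine that solves equation (5.22) for the coefficients of $Q(x)$ by matching powers of $x$ from the highest degree downward. This is a minimal sketch under stated assumptions: the function name, the coefficient-list convention (lowest degree first), and the zero tolerance `tol` are illustrative choices, not something prescribed by the text. In cases 2 and 3 the lowest one or two coefficients of $Q$ are set to zero, which corresponds to the extra factor $x$ or $x^2$.

```python
def particular_polynomial(a, b, c, lam, P, tol=1e-12):
    """Solve a*Q'' + (2*a*lam + b)*Q' + (a*lam**2 + b*lam + c)*Q = P for Q.

    P and the returned Q are coefficient lists, lowest degree first, so that
    Q(x)*exp(lam*x) is a particular solution of a*y'' + b*y' + c*y = P(x)*exp(lam*x).
    lam may be complex.
    """
    p0 = a * lam ** 2 + b * lam + c   # zero exactly when lam is a root
    p1 = 2 * a * lam + b              # also zero when lam is a double root
    k = 0 if abs(p0) > tol else (1 if abs(p1) > tol else 2)  # case 1, 2 or 3
    m = len(P) - 1
    q = [0] * (m + 3)                 # Q has degree m + k, and k is at most 2
    for j in range(m, -1, -1):        # match the x**j terms, highest power first
        # coefficients multiplying q[j], q[j+1], q[j+2] in the x**j equation
        terms = [p0, p1 * (j + 1), a * (j + 2) * (j + 1)]
        known = sum(terms[i] * q[j + i] for i in range(k + 1, 3))
        q[j + k] = (P[j] - known) / terms[k]
    return q[: m + k + 1]

# y'' - 3y' + 2y = x e^x: lam = 1 is a simple root, so Q = x(Ax + B).
Q_simple_root = particular_polynomial(1, -3, 2, 1, [0, 1])
# y'' + 6y' + 9y = 5 e^{-3x}: lam = -3 is a double root, so Q = A x^2.
Q_double_root = particular_polynomial(1, 6, 9, -3, [5])
# y'' - y = 4x e^{ix}: lam = i is not a root, so Q = Ax + B with complex B.
Q_complex = particular_polynomial(1, 0, -1, 1j, [0, 4])
```

With these inputs the routine returns the coefficients of $-x - \tfrac{1}{2}x^2$, of $\tfrac{5}{2}x^2$, and of $-2i - 2x$ respectively.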

Example 5.3.14. Find a particular solution for the ODE $y'' - 2y' - 3y = 3x + 1$.

Solution. We have $f(x) = 3x + 1 = (3x + 1)e^{0x}$, so $\lambda = 0$ and the polynomial $3x + 1$ has degree 1. The characteristic equation of the associated homogeneous ODE is

\[ r^2 - 2r - 3 = 0. \]

We know $\lambda = 0$ is not a root of this equation, so we can assume a particular solution takes the form

\[ y_p = (Ax + B)e^{0x} = Ax + B. \]

Substitution into $y'' - 2y' - 3y = 3x + 1$ gives

\[ 0 - 2A - 3(Ax + B) = 3x + 1. \]

We equate the coefficients involving the same power of $x$, so we have

\[
\begin{cases}
-3A = 3, \\
-2A - 3B = 1,
\end{cases}
\]

with solution

\[ A = -1 \quad \text{and} \quad B = \frac{1}{3}. \]

Thus, a particular solution for the ODE is $y_p = -x + \frac{1}{3}$, and a general solution is $y = C_1 e^{3x} + C_2 e^{-x} - x + \frac{1}{3}$.

Example 5.3.15. Find a general solution of

\[ y'' - 3y' + 2y = xe^{x}. \]

Solution. The characteristic equation of the associated homogeneous equation is $r^2 - 3r + 2 = 0$ with roots $r = 1$ and $r = 2$. Hence, a complementary function is

\[ y_c = C_1 e^{x} + C_2 e^{2x}. \]

The right-hand side $xe^{x}$ is a polynomial of degree 1 multiplied by $e^{\lambda x}$ with $\lambda = 1$. Since $\lambda = 1$ is one of the roots of the characteristic equation, the particular solution is chosen to be a polynomial of degree 1, multiplied by $x$, and then by $e^{x}$, i.e.,

\[ y_p = x(Ax + B)e^{x}. \]

Substitution into $y'' - 3y' + 2y = xe^{x}$, simplifying, and dividing by $e^{x}$ leads to

\[ -2Ax + (2A - B) = x. \]

Equating the coefficients gives

\[ -2A = 1 \quad \text{and} \quad 2A - B = 0. \]

Solving the system of equations gives $A = -1/2$ and $B = -1$. Thus, a particular solution is

\[ y_p = x\left(-\frac{1}{2}x - 1\right)e^{x}, \]

and a general solution is

\[ y = y_c + y_p = C_1 e^{x} + C_2 e^{2x} - \left(\frac{1}{2}x^2 + x\right)e^{x}. \]

Example 5.3.16. Find a general solution of $y'' + 6y' + 9y = 5e^{-3x}$.

Solution. The characteristic equation of the associated homogeneous equation is $r^2 + 6r + 9 = 0$, with repeated root $r = -3$. Hence, a complementary function is

\[ y_c = (C_1 + xC_2)e^{-3x}. \]

The right-hand side is $P(x)e^{\lambda x} = 5e^{-3x}$, with a polynomial $P(x) = 5$ of degree 0 and $\lambda = -3$. Since $\lambda = -3$ is a double root of the characteristic equation, a particular solution can be chosen to be an arbitrary polynomial of degree 0 (that is, a constant) multiplied by $x^2$ and then by $e^{-3x}$, i.e.,

\[ y_p = Ax^2 e^{-3x}. \]

Plugging $y_p$ into the original equation gives $A = \frac{5}{2}$, so a particular solution is

\[ y_p = \frac{5}{2}x^2 e^{-3x}. \]

Hence, a general solution is

\[ y = y_c + y_p = (C_1 + xC_2)e^{-3x} + \frac{5}{2}x^2 e^{-3x}. \]

Example 5.3.17. Solve $y'' - y = 4x \sin x$.

Solution. The characteristic equation $r^2 - 1 = 0$ has roots $r = \pm 1$. Hence, a complementary function is

\[ y_c = C_1 e^{x} + C_2 e^{-x}. \]

To find a particular solution, we note that the right-hand side, $4x \sin x$, is the imaginary part of $4xe^{ix}$ since $e^{ix} = \cos x + i \sin x$. So we consider

\[
\begin{cases}
y_1'' - y_1 = 4x \cos x, \\
(iy_2)'' - iy_2 = 4x(i \sin x),
\end{cases}
\]

and we add them up, so we obtain

\[ (y_1 + iy_2)'' - (y_1 + iy_2) = 4x(\cos x + i \sin x) = 4xe^{ix}. \]

So, if we can solve

\[ y'' - y = 4xe^{ix} \]

for a particular solution $y_p$, then $y_p$ will be $y_1(x) + iy_2(x)$. The real part of $y_p$, $y_1(x)$, must be a particular solution of $y'' - y = 4x \cos x$, and its imaginary part, $y_2(x)$, must be a particular solution of $y'' - y = 4x \sin x$.
Since $\lambda = i$ is not a root of the characteristic equation, we use the modified right-hand side and choose a particular solution that is a polynomial of degree 1 multiplied by $e^{ix}$, i.e.,

\[ y_p = (Ax + B)e^{ix}. \]

Substituting this into $y'' - y = 4xe^{ix}$, simplifying, and dividing by $e^{ix}$ leads to

\[ -2Ax - 2B + 2iA = 4x. \]

Thus, we obtain the equations

\[ -2A = 4 \quad \text{and} \quad -2B + 2iA = 0, \]

with solutions $A = -2$ and $B = -2i$. Hence, a particular solution of $y'' - y = 4xe^{ix}$ is

\[
\begin{aligned}
y_p &= (-2x - 2i)e^{ix} \\
    &= (-2x - 2i)(\cos x + i \sin x) \\
    &= -2(x \cos x - \sin x) - 2(x \sin x + \cos x)i.
\end{aligned}
\]

The original right-hand side is the imaginary part of $4xe^{ix}$, so we take the imaginary part of $y_p$ to get a particular solution of the original problem:

\[ y_p = -2x \sin x - 2 \cos x. \]

Hence, a general solution is

\[ y = y_c + y_p = C_1 e^{x} + C_2 e^{-x} - 2x \sin x - 2 \cos x. \]



Note. This example shows a special case of a method for finding a particular solution of equation (5.21) when the right-hand side is of the form $f(x) = e^{\lambda x}P(x) \cos mx$ or $f(x) = e^{\lambda x}P(x) \sin mx$. In this method, $f(x)$ is replaced by the function $g(x) = e^{(\lambda+mi)x}P(x)$ (of which $f(x)$ is the real or imaginary part). The particular solution $y_p$ of the new ODE, with $g(x)$ on the right-hand side, can be found using the methods developed before. The real or imaginary part of $y_p$ is a particular solution of the original problem.
Some books give an alternative procedure, using only real-valued functions, where the trial solution (particular solution) is taken to be of the form

\[ y_p(x) = e^{\lambda x}Q_1(x) \cos mx + e^{\lambda x}Q_2(x) \sin mx, \]

where $Q_1(x)$ and $Q_2(x)$ are polynomials with unknown coefficients and of the same degree as $P(x)$, but multiplied by $x$ if $\lambda + mi$ is a root of the corresponding auxiliary equation. (For a second-order equation with real coefficients and $m \neq 0$, such a root is necessarily simple.)
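
As an illustration of this real-valued procedure, the following sketch reworks $y'' - y = 4x \sin x$ from Example 5.3.17. The coefficient names $A_1, B_1, A_2, B_2$ are introduced here only for illustration. Because $\lambda + mi = i$ is not a root of $r^2 - 1 = 0$, no extra factor of $x$ is needed.

\[
\begin{aligned}
y_p &= (A_1 x + B_1)\cos x + (A_2 x + B_2)\sin x, \\
y_p'' &= 2A_2 \cos x - 2A_1 \sin x - (A_1 x + B_1)\cos x - (A_2 x + B_2)\sin x, \\
y_p'' - y_p &= -2A_1 x\cos x + (2A_2 - 2B_1)\cos x - 2A_2 x \sin x - (2A_1 + 2B_2)\sin x = 4x \sin x.
\end{aligned}
\]

Matching coefficients gives $A_1 = 0$, $A_2 = -2$, $B_1 = A_2 = -2$, and $B_2 = -A_1 = 0$, so $y_p = -2x \sin x - 2\cos x$, in agreement with the imaginary part obtained by the complex method.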

Example 5.3.18. Solve $y'' - y = 3e^{2x} + 4x \sin x$.

Solution. The associated auxiliary equation is $r^2 - 1 = 0$, and a complementary function is the same as in the previous example, so we take

\[ y_c = C_1 e^{x} + C_2 e^{-x}. \]

In order to find a particular solution, we separately find particular solutions of the two equations

\[
\begin{aligned}
y'' - y &= 3e^{2x}, \\
y'' - y &= 4x \sin x,
\end{aligned}
\]

and then add them. The first has a particular solution (check this for yourself),

\[ y_1 = e^{2x}, \]

and the second has the particular solution found in the previous example,

\[ y_2 = -2x \sin x - 2 \cos x. \]

We now add them to give a particular solution for the original equation in this example:

\[ y_p = y_1 + y_2 = e^{2x} - 2x \sin x - 2 \cos x. \]

So a general solution is

\[ y = y_c + y_p = C_1 e^{x} + C_2 e^{-x} + e^{2x} - 2x \sin x - 2 \cos x. \]



5.3.3 Variation of parameters

The method of undetermined coefficients is often useful to solve problems when $f(x) = P_m(x)e^{\lambda x}$. We now introduce the method of variation of parameters, which is another way to find a particular solution to the differential equation

\[ ay'' + by' + cy = f(x). \tag{5.23} \]

Assume a general solution to $ay'' + by' + cy = 0$ is

\[ y = C_1 y_1(x) + C_2 y_2(x). \tag{5.24} \]

Now we look for a particular solution to $ay'' + by' + cy = f(x)$ of the form

\[ y_p = u_1(x)y_1(x) + u_2(x)y_2(x). \tag{5.25} \]

Then,

\[ y_p' = u_1' y_1 + u_1 y_1' + u_2' y_2 + u_2 y_2'. \]

To solve for $u_1(x)$ and $u_2(x)$, we need two equations. We already have the condition that $y_p$ is a particular solution, but we need an extra one. Let us impose

\[ u_1' y_1 + u_2' y_2 = 0 \tag{5.26} \]

in order to simplify our calculation. Therefore,

\[ y_p' = u_1 y_1' + u_2 y_2' \quad \text{and} \quad y_p'' = u_1' y_1' + u_1 y_1'' + u_2' y_2' + u_2 y_2''. \]

We substitute these into the differential equation

\[ a(u_1' y_1' + u_1 y_1'' + u_2' y_2' + u_2 y_2'') + b(u_1 y_1' + u_2 y_2') + c(u_1 y_1 + u_2 y_2) = f(x) \]

to obtain

\[ u_1(ay_1'' + by_1' + cy_1) + u_2(ay_2'' + by_2' + cy_2) + a(u_1' y_1' + u_2' y_2') = f(x). \]

Since $y_1$ and $y_2$ are solutions of the homogeneous equation, the first two brackets vanish. This means

\[ a(u_1' y_1' + u_2' y_2') = f(x). \tag{5.27} \]

In view of equation (5.26) and equation (5.27), by Cramer’s rule we have

\[
u_1'(x) = \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}
\quad \text{and} \quad
u_2'(x) = \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}.
\]

Integration gives

\[
u_1(x) = \int \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx
\quad \text{and} \quad
u_2(x) = \int \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx.
\]

Thus, a particular solution is given by

\[
y_p = y_1(x) \cdot \int \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx
    + y_2(x) \cdot \int \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx. \tag{5.28}
\]

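Example 5.3.19. Solve $y'' - y = 4xe^{x}$.

Solution. Since $r^2 - 1 = 0$, we have $r = \pm 1$, and the complementary function $y_c$ is

\[ y_c = C_1 e^{-x} + C_2 e^{x}. \]

By equation (5.28), a particular solution $y_p$ is given by

\[
\begin{aligned}
y_p &= e^{-x} \int \frac{\begin{vmatrix} 0 & e^{x} \\ 4xe^{x} & e^{x} \end{vmatrix}}{\begin{vmatrix} e^{-x} & e^{x} \\ -e^{-x} & e^{x} \end{vmatrix}}\,dx
     + e^{x} \int \frac{\begin{vmatrix} e^{-x} & 0 \\ -e^{-x} & 4xe^{x} \end{vmatrix}}{\begin{vmatrix} e^{-x} & e^{x} \\ -e^{-x} & e^{x} \end{vmatrix}}\,dx \\
    &= e^{-x} \int \frac{-4xe^{2x}}{2}\,dx + e^{x} \int \frac{4x}{2}\,dx \\
    &= 2e^{-x} \int -xe^{2x}\,dx + e^{x} \int 2x\,dx \\
    &= 2e^{-x}\left(-\frac{1}{2}xe^{2x} + \frac{1}{4}e^{2x}\right) + x^2 e^{x} \\
    &= \frac{1}{2}e^{x}(2x^2 - 2x + 1).
\end{aligned}
\]

So, a general solution is given by

\[ y = C_1 e^{-x} + C_2 e^{x} + \frac{1}{2}e^{x}(2x^2 - 2x + 1). \]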
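
When the integrals in equation (5.28) have no convenient closed form, they can be approximated numerically. The sketch below is an illustration only: the helper names, the use of Simpson’s rule, and the lower limit `x0 = 0` are assumptions made here. Fixing the lower limit fixes the constants of integration, so the result may differ from a hand-computed $y_p$ by a solution of the homogeneous equation; for this example it differs from $\frac{1}{2}e^{x}(2x^2 - 2x + 1)$ by $-\frac{1}{2}e^{-x}$.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson approximation of the integral of g over [lo, hi] (n even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def variation_of_parameters(y1, dy1, y2, dy2, f, a, x, x0=0.0):
    """Evaluate y_p(x) = u1*y1 + u2*y2 with u1, u2 integrated from x0 to x."""
    def wronskian(t):
        return y1(t) * dy2(t) - dy1(t) * y2(t)
    # Cramer's rule: u1' = -y2*f/(a*W) and u2' = y1*f/(a*W)
    u1 = simpson(lambda t: -y2(t) * f(t) / (a * wronskian(t)), x0, x)
    u2 = simpson(lambda t: y1(t) * f(t) / (a * wronskian(t)), x0, x)
    return u1 * y1(x) + u2 * y2(x)

# Example 5.3.19: y'' - y = 4x e^x with y1 = e^{-x}, y2 = e^{x}, a = 1.
yp_numeric = variation_of_parameters(
    lambda t: math.exp(-t), lambda t: -math.exp(-t),
    lambda t: math.exp(t), lambda t: math.exp(t),
    lambda t: 4 * t * math.exp(t), 1.0, 1.0,
)
# Closed form of the same particular solution (integration constants fixed at x0 = 0).
yp_closed = 0.5 * math.exp(1.0) * (2 - 2 + 1) - 0.5 * math.exp(-1.0)
```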

5.4 Other ways of solving differential equations


Very often, it is hard or even impossible to solve a differential equation exactly. That
is, there is in theory a function that is the solution of the differential equation, but
we cannot obtain an explicit formula for this solution. This is true even for a simple-
looking equation like

\[ y'' - xy' + x^2 y = 0. \tag{5.29} \]

In this section, we introduce, very briefly, two ways to find exact or approximate
solutions to differential equations: the power series method and Euler’s method.

5.4.1 Power series method

When we cannot find an explicit expression for the solution of a differential equation,
we try to get information about the solution in other ways. One way is to express the
solution in the form of a power series,

\[ y = f(x) = \sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n + \cdots. \]

The method is to substitute this expression into the differential equation and use the equation to determine the values of the coefficients $c_0, c_1, c_2, \ldots$. This technique resembles the method of undetermined coefficients discussed previously. Once a Taylor series solution or some of the initial terms of that Taylor series have been found, this can be used to compute numerical approximations to the solution of the ODE.
We now illustrate the method on the equation $y' - y = x$. We already know how to solve this equation exactly by techniques introduced before, but it is a simple example that helps us to understand the power series method.

Example 5.4.1. Use a power series to solve the initial value problem $y' - y = x$ and $y(0) = 1$.

Solution. We assume there is a solution of the form

\[ y = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots + c_{n-1} x^{n-1} + c_n x^n + \cdots = \sum_{n=0}^{\infty} c_n x^n. \]

We can differentiate the power series term by term to get

\[ y' = c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots + (n-1)c_{n-1} x^{n-2} + nc_n x^{n-1} + \cdots = \sum_{n=1}^{\infty} nc_n x^{n-1}. \]

So, $y' - y = x$ becomes

\[ y' - y = (c_1 - c_0) + (2c_2 - c_1)x + (3c_3 - c_2)x^2 + \cdots + (nc_n - c_{n-1})x^{n-1} + \cdots = x. \]

Now, equating the coefficients gives

\[ c_1 = c_0, \quad 2c_2 - c_1 = 1, \quad 3c_3 - c_2 = 0, \quad \ldots, \quad nc_n - c_{n-1} = 0, \quad \ldots. \]

The initial value $y(0) = 1$ gives $c_0 = 1$. Thus,

\[ c_1 = 1, \quad c_2 = 1 = \frac{2}{2!}, \quad c_3 = \frac{2}{3!}, \quad \ldots, \quad c_n = \frac{2}{n!}, \quad \ldots. \]

Therefore, the complete Taylor series for this solution of $y' - y = x$ is

\[ y = 1 + x + \frac{2}{2!}x^2 + \frac{2}{3!}x^3 + \cdots + \frac{2}{n!}x^n + \cdots. \]

Since $e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, this solution is

\[
\begin{aligned}
y &= 2\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right) - x - 1 \\
  &= 2e^{x} - x - 1.
\end{aligned}
\]
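
The recurrence from this example is easy to automate. The following sketch (the function name and the choice of 20 terms are illustrative assumptions) generates the coefficients exactly with rational arithmetic from $nc_n - c_{n-1} = [\text{coefficient of } x^{n-1} \text{ on the right-hand side}]$ and compares a partial sum with the closed form $2e^{x} - x - 1$.

```python
from fractions import Fraction
import math

def series_coefficients(n_terms, c0=Fraction(1)):
    """Coefficients c_0 .. c_{n_terms-1} of the power series solution of y' - y = x."""
    rhs = {1: Fraction(1)}   # the right-hand side x has coefficient 1 on x^1
    coeffs = [c0]
    for n in range(1, n_terms):
        # matching the x^(n-1) terms: n*c_n - c_(n-1) = rhs coefficient of x^(n-1)
        coeffs.append((coeffs[n - 1] + rhs.get(n - 1, 0)) / n)
    return coeffs

coeffs = series_coefficients(20)
x = 0.5
partial_sum = sum(float(c) * x ** n for n, c in enumerate(coeffs))
exact = 2 * math.exp(x) - x - 1
```

Using `Fraction` keeps the coefficients exact, so the pattern $c_n = 2/n!$ for $n \geq 2$ can be confirmed without rounding error.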

5.4.2 Numerical approximation: Euler’s method

As previously mentioned, it is the exception rather than the rule when a first-order ODE of the general form

\[ \frac{dy}{dx} = f(x, y) \]

can be solved exactly and explicitly by elementary methods like those discussed earlier. Even the simple equations

\[ \frac{dy}{dx} = e^{-x^2} \quad \text{and} \quad \frac{dy}{dx} = \frac{\sin x}{x} \]

cannot be solved this way, since it can be proved that the antiderivatives of $e^{-x^2}$ and $\frac{\sin x}{x}$ are not elementary functions. However, if a solution exists, then we can always find numerical approximations to the solution. The most basic of the approximation methods is Euler’s method.
We consider the initial value problem of the form

\[ \frac{dy}{dx} = f(x, y), \quad y(x_0) = y_0. \]
In Euler’s method we first choose a small step size $h$, and we use this to define a sequence of $x$-values, starting at the initial point $x_0$ and separated by $h$, giving

\[ x_0, \quad x_1 = x_0 + h, \quad x_2 = x_0 + 2h, \quad x_3 = x_0 + 3h, \quad \ldots, \quad x_n = x_0 + nh, \quad \ldots. \]

We compute a succession of approximate $y$-values, $y_1$ at $x_1$, $y_2$ at $x_2$, $y_3$ at $x_3$, and so on, using the iterative formula

\[ y_{n+1} = y_n + hf(x_n, y_n) \quad \text{for } n = 0, 1, 2, 3, \ldots. \]

Euler’s method works because, by Taylor’s theorem, for the solution function $y = y(x)$ we have

\[ y(x_n + h) = y(x_n) + y'(x_n)h + O(h^2). \]

Thus, since $y'(x_n) = f(x_n, y(x_n))$,

\[
\begin{aligned}
y(x_n + h) &\approx y(x_n) + y'(x_n)h \quad \text{when } h \text{ is small,} \\
y_{n+1} &\approx y_n + f(x_n, y_n)h.
\end{aligned}
\]

Example 5.4.2. Use Euler’s method with step size 0.1, and then step size 0.05, to construct a table of approximate values for the solution on the interval $0 \leq x \leq 1$ for the initial value problem

\[ y' = x - y \quad \text{and} \quad y(0) = 1. \]

Solution. We start with $h = 0.1$, $x_0 = 0$, and $y_0 = 1$, so that for $n = 0, 1, 2, 3, \ldots, 10$, $x_n = 0, 0.1, 0.2, 0.3, \ldots, 0.9, 1.0$. Hence, we compute using the formula

\[ y_{n+1} = y_n + 0.1(x_n - y_n). \]

We obtain

\[
\begin{aligned}
y_1 &= y_0 + 0.1(x_0 - y_0) = 1 + 0.1(0 - 1) = 0.9, \\
y_2 &= y_1 + 0.1(x_1 - y_1) = 0.9 + 0.1(0.1 - 0.9) = 0.82, \\
    &\;\;\vdots
\end{aligned}
\]

Proceeding with similar calculations, we find the values in the two tables (for $h = 0.1$ and $h = 0.05$). We have also included the corresponding values of the exact solution, $y = x + 2e^{-x} - 1$, and the deviation (error) of the approximate solution from the exact solution.

Step size h = 0.1:

x      y_n (h = 0.1)   x + 2e^{-x} − 1   Error
0      1               1.0               0
0.1    0.9             0.9096748         0.0096748
0.2    0.82            0.8374615         0.0174615
0.3    0.758           0.7816364         0.0236364
0.4    0.7122          0.7406401         0.0284401
0.5    0.68098         0.7130613         0.0320813
0.6    0.662882        0.6976233         0.0347413
0.7    0.6565938       0.6931706         0.0365768
0.8    0.6609344       0.6986579         0.0377235
0.9    0.6748410       0.7131393         0.0382983
1.0    0.6973569       0.7357589         0.038402

Step size h = 0.05:

x      y_n (h = 0.05)  x + 2e^{-x} − 1   Error
0      1               1.0               0
0.05   0.95            0.9524588         0.0024588
0.1    0.905           0.9096748         0.0046748
0.15   0.86475         0.8714160         0.006666
0.2    0.8290125       0.8374615         0.008449
0.25   0.7975619       0.8076016         0.0100397
0.3    0.7701838       0.7816364         0.0114526
0.35   0.7466746       0.7593762         0.0127016
0.4    0.7268409       0.7406401         0.0137992
0.45   0.7104988       0.7252563         0.0147575
0.5    0.6974739       0.7130613         0.0155874
0.55   0.6876002       0.7038996         0.0162994
0.6    0.6807202       0.6976233         0.0169031
0.65   0.6766842       0.6940916         0.0174074
0.7    0.6753500       0.6931706         0.0178206
0.75   0.6765824       0.6947331         0.0181507
0.8    0.6802533       0.6986579         0.0184046
0.85   0.6862407       0.7048299         0.0185892
0.9    0.6944286       0.7131393         0.0187107
0.95   0.7047072       0.723482          0.0187748
1.0    0.7169718       0.7357589         0.0187871

Graphs of the approximate solutions with h = 0.1 (diamonds), h = 0.05 (crosses), and the exact solution (solid line) are shown in Figure 5.5.

Figure 5.5: Euler’s method, Example 5.4.2.
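
The iteration used in Example 5.4.2 can be written as a few lines of code. This is a minimal sketch rather than a production solver; the function name `euler` and its argument order are assumptions made for illustration. It applies $y_{n+1} = y_n + hf(x_n, y_n)$ to $y' = x - y$, $y(0) = 1$ with both step sizes.

```python
import math

def euler(f, x0, y0, h, n_steps):
    """Return lists (xs, ys) of Euler approximations y_n at x_n = x0 + n*h."""
    xs, ys = [x0], [y0]
    for n in range(n_steps):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))  # y_{n+1} = y_n + h*f(x_n, y_n)
        xs.append(x0 + (n + 1) * h)                # avoids accumulating x + h round-off
    return xs, ys

def rhs(x, y):
    """Right-hand side of y' = x - y."""
    return x - y

xs_coarse, ys_coarse = euler(rhs, 0.0, 1.0, 0.1, 10)   # h = 0.1
xs_fine, ys_fine = euler(rhs, 0.0, 1.0, 0.05, 20)      # h = 0.05
exact_at_1 = 1 + 2 * math.exp(-1) - 1                  # exact y(1) = x + 2e^{-x} - 1 at x = 1
```

Halving the step size roughly halves the error at $x = 1$, which is consistent with the first-order accuracy of Euler’s method.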

Note. Euler’s method is subject to the numerical errors experienced by most itera-
tive methods. The small errors caused by the approximate solution at each step are
incorporated into the calculations for the next step, and so can gradually build up
into large errors. This is illustrated in the figure above. This error build-up can be re-
duced by decreasing the size of the step h. However, as h gets smaller, the number
of computations increases, and this can cause another kind of error during computer
computations. This is because computers approximate numbers by rounding them to
a certain precision, and this introduces minute errors (round-off errors). If an itera-
tive method requires an extremely large number of computations, then the round-off
errors can build up into a significant error.

Example 5.4.3. Apply Euler’s method to approximate the solution of the initial value problem

\[
\begin{cases}
\dfrac{dy}{dx} = \sqrt{x^2 + y^2}, \\
y(0) = -1
\end{cases}
\]

with step size $h = 0.1$ on the interval $[0, 1]$.

Solution. In this case the iterative formula is $y_{n+1} = y_n + 0.1\sqrt{x_n^2 + y_n^2}$, starting from $x_0 = 0$ and $y_0 = -1$, and for $n = 0, 1, 2, 3, \ldots$, the values of $x_n$ are $0, 0.1, 0.2, 0.3, \ldots, 0.9, 1$. A table of the computed approximate solution values is shown.

n     x_n    y_n        sqrt(x_n^2 + y_n^2)
0     0.0    −1.0000    1.0000
1     0.1    −0.9000    0.9055
2     0.2    −0.8094    0.8337
3     0.3    −0.7260    0.7855
4     0.4    −0.6474    0.7610
5     0.5    −0.5713    0.7592
6     0.6    −0.4954    0.7781
7     0.7    −0.4176    0.8151
8     0.8    −0.3361    0.8677
9     0.9    −0.2493    0.9339
10    1.0    −0.1559    1.0121

Note. In this example it is not possible to find an exact solution formula, so a numer-
ical approach is the only way to investigate the solution.

5.5 Review
Main concepts discussed in this chapter are listed below.
1. Separable differential equations, $\frac{dy}{dx} = f(x, y) = g(x)h(y)$, have the solution

\[ \int \frac{1}{h(y)}\,dy = \int g(x)\,dx. \]

2. Substitution method: if

\[ \frac{dy}{dx} = F\left(\frac{y}{x}\right), \]

then $y = xv$ will transform the ODE into a separable one.
3. Exact differential equation:

\[
\begin{aligned}
& f(x, y)dx + g(x, y)dy \text{ is exact} \\
& \iff g_x = f_y \\
& \iff \text{there is a function } \varphi \text{ such that } d\varphi = f(x, y)dx + g(x, y)dy.
\end{aligned}
\]

Thus, a general solution is $\varphi(x, y) = C$.

4. A first-order linear ODE $y' + P(x)y = Q(x)$ has the general solution

\[ y = e^{-\int P(x)dx}\left(\int e^{\int P(x)dx} Q(x)\,dx + C\right). \]

5. For a reducible differential equation $F(x, y, y', y'') = 0$:

$F(x, y', y'') = 0$: $y$ is missing, let $p(x) = y'$,
$F(y, y', y'') = 0$: $x$ is missing, let $p(y) = y'$.

6. A homogeneous second-order linear differential equation with constant coefficients,

\[ ay'' + by' + cy = 0, \]

has a general solution given by

$y = C_1 e^{r_1 x} + C_2 e^{r_2 x}$ if $r_1$ and $r_2$ are two distinct real roots of $ar^2 + br + c = 0$,
$y = C_1 e^{rx} + C_2 xe^{rx}$ if $r$ is a repeated root of $ar^2 + br + c = 0$,
$y = C_1 e^{\alpha x} \cos \beta x + C_2 e^{\alpha x} \sin \beta x$ if $\alpha \pm \beta i$ are two complex roots of $ar^2 + br + c = 0$.

7. A nonhomogeneous second-order linear differential equation

\[ ay'' + by' + cy = f(x), \]

has a general solution given by

\[ y = \underbrace{\text{a general solution of } ay'' + by' + cy = 0}_{\text{complementary function}} + \underbrace{\text{a particular solution}}_{\text{particular integral}}. \]

8. To find a particular solution $y_*$ for $ay'' + by' + cy = f(x)e^{\lambda x}$, where $f(x)$ is a polynomial of degree $m$, let $Q(x)$ be a polynomial of degree $m$ with unknown coefficients and try

$y_* = Q(x)e^{\lambda x}$ if $\lambda$ is not a root of $ar^2 + br + c = 0$,
$y_* = xQ(x)e^{\lambda x}$ if $\lambda$ is a single root of $ar^2 + br + c = 0$,
$y_* = x^2 Q(x)e^{\lambda x}$ if $\lambda$ is a double root of $ar^2 + br + c = 0$.

9. A particular solution of $ay'' + by' + cy = f(x)$ is given by

\[
y = y_1(x) \int \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx
  + y_2(x) \int \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx,
\]

where $y_1$ and $y_2$ are two independent solutions of $ay'' + by' + cy = 0$.


10. There are some other ways to solve an ODE, such as the power series method and
Euler’s method.

5.6 Exercises
5.6.1 Introduction to differential equations

1. Which of the following equations are differential equations? For those that are,
state whether they are ODE or PDE. For those that are ODE, give their orders and
degrees.
(1) y󸀠 = 2x + 6, (2) y = 2x + 3,
d2 y
(3) dx 2
= y + 2x, (4) x2 − 3t = 0,
(5) y = x + y + y2 cos x,
󸀠
(6) yx + 8(y󸀠 )2 + 6y8 = e2t ,
󸀠 2
(7) y(y ) = 1, (8) x2 dx + ydx = 0,
𝜕2 u
(9) y(4) + 2y󸀠 + 3x = 5, (10) 𝜕x2
+ ( 𝜕u
𝜕t
)2 = x2 − t.
2. Verify that $x = 2(\sin 2t - \sin 3t)$ is the solution of the initial value problem $\frac{d^2x}{dt^2} + 4x = 10 \sin 3t$, $x(0) = 0$, $x'(0) = -2$.
3. Graph the slope fields for the following differential equations using computer software:
(1) $y' = \frac{x - y}{x + y}$, (2) $\frac{dy}{dx} = (x + y - 2)^2$,
(3) $\frac{dy}{dx} = \sin x$, (4) $\frac{dy}{dx} = x(6 - x)$.
4. Find an equation of the curve that passes through the point (1, 0) and whose slope
at each point (x, y) is x2 .

5.6.2 First-order differential equations

1. Solve each of the following separable differential equations:
(1) $(xy^2 + x)dx + (y - x^2 y)dy = 0$, (2) $y' = \frac{x^2}{\cos y}$,
(3) $\frac{du}{dt} = \frac{2t + \sec^2 t}{2u}$, $u(0) = -5$, (4) $y' = \frac{x}{y \ln y}$,
(5) $xy(y - xy') = x + yy'$, (6) $\sec^2 x \tan y\,dx + \sec^2 y \tan x\,dy = 0$.
2. Find a general solution for the logistic equation

\[ \frac{dP}{dt} = kP(M - P). \]

Solve the initial value problem

\[ \frac{dP}{dt} = 0.18P(20 - P), \quad P(0) = 4. \]

Sketch the graph of this particular solution. When does $\frac{dP}{dt}$ change fastest?
3. Use a suitable substitution to solve each of the following the differential equa-
tions:
(1) xy󸀠 − y − √y2 − x2 = 0, (2) xy󸀠 = y ln xy ,
y
(3) y󸀠 = e x + xy , y(1) = 0, (4) x2 y󸀠 + y2 = xyy󸀠 ,
dy dy 2y+4
(5) dx
= (x + y − 2)3 , (6) dx
= x+y−1
.

4. Determine whether each of the following differential equations is linear:


(1) y󸀠 + ex y = x2 y2 , (2) y󸀠 = tan y,
(3) x( dx
dt
+ 2) = t 2 , (4) 3x2 + 5y − 5y󸀠 = 0,
3y
(5) y − = x,
󸀠
x
(6) y󸀠 = ln x.
5. Solve each of the following exact differential equations:
dy x−y2
(1) (3x 2 + 6xy2 )dx + (6x2 y + y2 )dy = 0, (2) dx = 2xy+2y 3.

6. Solve each of the following first-order differential equations:


(1) xy󸀠 + y = ex , y(1) = e, (2) y󸀠 + y cos x = e− sin x ,
(3) (x2 + 1)y󸀠 + 2xy = 4x2 , (4) t dy
dt
+ 2y = t 3 , t > 0, y(1) = 0,
y3 dy
(5) y󸀠 + x2 y = x2
, (6) x dx − 4y = x 2 √y.

5.6.3 Second-order differential equations

1. Solve each of the following reducible differential equations:


(1) y󸀠󸀠󸀠 = xex + 2, (2) y󸀠󸀠 = y󸀠 + x,
(3) y󸀠󸀠 = 1 + (y󸀠 )2 , (4) xy󸀠󸀠 + y󸀠 = 0,
1
(5) y = 3√y, y(0) = 1, y (0) = 2,
󸀠󸀠 󸀠
(6) (y󸀠 )2
y󸀠󸀠 = cot y,
(7) y󸀠󸀠 = y󸀠 (1 + y󸀠2 ).
2. To solve the second-order differential equation $y^3 y'' + 1 = 0$, $y(1) = 1$, $y'(1) = 0$, a student provided the following solution. Since

\[ y'' = -\frac{1}{y^3}, \]

we must have

\[ y''\,dy = -\frac{1}{y^3}\,dy \;\rightarrow\; \int y''\,dy = \int -\frac{1}{y^3}\,dy. \]

Thus, he obtained $y' = -\frac{1}{y^2} + C$. Using the initial conditions, he had $C = 1$. So

\[ y' = 1 - \frac{1}{y^2}. \]

This becomes a separable ODE now. Is this solution correct?


3. Solve each of the following differential equations:
(1) $y'' - 3y' + 2y = 0$, (2) $y'' + a^2 y = 0$,
(3) $4y'' + 4y' + y = 0$, $y(0) = 2$, $y'(0) = 0$, (4) $y'' - 6y' + 8y = 0$,
(5) $\frac{d^2 y}{dt^2} + 3\frac{dy}{dt} + 2 = 0$, (6) $\frac{d^2 x}{d\theta^2} + 4\frac{dx}{d\theta} + 4x = 0$,
(7) $\ddot{y} + \dot{y} + y = 0$.
4. Solve $\frac{d^2 y}{dx^2} - 2\frac{dy}{dx} + 3y = 0$, given that $y = 0$ and $\frac{dy}{dx} = 6$ when $x = 0$.

5. Solve each of the following differential equations using the method of undetermined
coefficients:
(1) $y'' - 7y' + 6y = 4x$, (2) $y'' - 2y' - 3y = 6e^{2x}$,
(3) $y'' + 4y = x\cos x$, (4) $y'' - y = 4xe^x$, $y(0) = 0$, $y'(0) = 1$,
(5) $y'' - 2y' + 5y = e^x \sin 2x$, (6) $y'' + y = e^x + \cos x$,
(7) $\ddot{y} + \dot{y} + y = 0$, (8) $\frac{d^2 y}{dt^2} + 16y = 3\cos 4t$,
(9) $2\frac{d^2 x}{dt^2} - 3\frac{dx}{dt} - 5x = 10t^2 + 1$, (10) $\frac{d^2 y}{dx^2} + 2\frac{dy}{dx} + y = e^{-x}$.
6. Solve $20\frac{d^2 x}{dt^2} + 4\frac{dx}{dt} + x = 2t + 11$, given that $x = 1$ and $\frac{dx}{dt} = 2.8$ when $t = 0$. Describe
the behavior of $x$ when $t \to \infty$.
7. Solve the differential equation $y'' + y = \tan x$ using variation of parameters.
8. A spring with a 2 kg mass lies on a table. One end of the spring is fixed to a wall
and the other end is attached to the mass. A force of 40 N holds the spring stretched
0.2 m beyond its natural length. Now suppose the mass is at its equilibrium point and
a push gives it an initial velocity of 2 m/s. Find the position of the mass after
t seconds.
9. The Kirchhoff voltage law says that

$$L\frac{d^2 Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E(t),$$

where $L$ is the inductance, $R$ is the resistance, $C$ is the capacitance, $Q$ is the charge, and $E$ is
the electromotive force. The current $I$ is always equal to $\frac{dQ}{dt}$. Find the charge and
current at time $t$ in a circuit if the initial charge and current are both 0, and $L = 1$,
$R = 40$, $C = 16 \times 10^{-4}$, and $E(t) = 10\sin(2t)$.
10. Attempt to find a general solution to the Euler equation

$$ax^2\frac{d^2 y}{dx^2} + bx\frac{dy}{dx} + cy = 0,$$

where $a \neq 0$, $b$, and $c$ are constants. Hint: Try $y = x^r$.
11. (Solving a simple PDE) In general, solving a PDE for an analytical solution is
not easy. Numerical methods are widely used in obtaining approximate solutions.
However, in some cases, we may be able to find an exact solution to a PDE. Consider
heat conduction in a cube, which can be modeled by

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \quad \text{for } 0 < x, y, z < a.$$

Note that $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}$ is often denoted by $\nabla^2 u$. Suppose $u = P(x)Q(z)$ and the
boundary conditions are

$u = 0$ on $x = 0$ and $x = a$,
$u = 0$ on $z = 0$,
$u = 1$ on $z = a$.

(a) Show that $P''Q + PQ'' = 0$ and that there is a constant $\lambda$ such that $Q'' = \lambda Q$
and $P'' + \lambda P = 0$.
(b) Show that a general solution for $P(x)$ is

$$P(x) = C_1 \cos\sqrt{\lambda}\,x + C_2 \sin\sqrt{\lambda}\,x.$$

(c) Show that $\lambda = n^2 \pi^2 / a^2$.


(d) Show that a general solution for $Q(z)$ is

$$Q(z) = Ae^{\sqrt{\lambda}\,z} + Be^{-\sqrt{\lambda}\,z},$$

where $A$ and $B$ are two arbitrary constants.

(e) Recall that

$$\cosh x = \frac{e^x + e^{-x}}{2} \quad \text{and} \quad \sinh x = \frac{e^x - e^{-x}}{2}.$$

Show that the general solution for $Q(z)$ shown in (d) can be rewritten as

$$Q(z) = C_3 \cosh\sqrt{\lambda}\,z + C_4 \sinh\sqrt{\lambda}\,z.$$

(f) Show that $C_3 = 0$ by using some of the boundary conditions.


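(g) Show that by superimposing solutions,

$$u(x, z) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi x}{a} \sinh\frac{n\pi z}{a}.$$

(h) Using the condition $u = 1$ when $z = a$, show that

$$1 = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi x}{a} \sinh(n\pi).$$

Thus, by using knowledge about Fourier series, prove that

$$a_n = \begin{cases} \dfrac{4}{n\pi \sinh(n\pi)} & \text{if } n \text{ is odd}, \\ 0 & \text{if } n \text{ is even}. \end{cases}$$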

(i) Thus, show that the desired particular solution is

$$u(x, z) = \sum_{k=1}^{\infty} \frac{4\sinh\frac{(2k-1)\pi z}{a}}{(2k-1)\pi \sinh((2k-1)\pi)} \sin\frac{(2k-1)\pi x}{a}.$$
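
The series in Exercise 11 can be explored numerically once the coefficients from part (h) are known. The sketch below, which is not part of the original text, sums the first few odd-indexed terms with $a_n = 4/(n\pi\sinh(n\pi))$ and lets you compare the result with the boundary conditions. The function names `sinh_ratio` and `u_partial`, the default side length `a = 1`, and the default number of terms are illustrative assumptions.

```python
import math


def sinh_ratio(s, frac):
    """Return sinh(s * frac) / sinh(s) for s > 0 and 0 <= frac <= 1.

    Rewritten with decaying exponentials so that large s cannot overflow.
    """
    return (math.exp(-s * (1.0 - frac))
            * (1.0 - math.exp(-2.0 * s * frac))
            / (1.0 - math.exp(-2.0 * s)))


def u_partial(x, z, a=1.0, terms=50):
    """Partial sum of the series for u(x, z) over the odd indices n = 2k - 1.

    Uses a_n = 4 / (n * pi * sinh(n * pi)) from part (h); `terms` is the
    number of odd-indexed terms kept in the sum.
    """
    total = 0.0
    for k in range(1, terms + 1):
        n = 2 * k - 1
        s = n * math.pi
        total += 4.0 / s * sinh_ratio(s, z / a) * math.sin(s * x / a)
    return total


# Values on the boundary and at the centre of the square 0 <= x, z <= a (with a = 1).
bottom = u_partial(0.3, 0.0)              # boundary z = 0
side = u_partial(0.0, 0.7)                # boundary x = 0
top = u_partial(0.5, 1.0, terms=2000)     # boundary z = a, slow convergence
centre = u_partial(0.5, 0.5)
```

Convergence is fast in the interior but slow on the edge z = a, where the terms decay only like 1/n, so many more terms are needed there. By symmetry the value at the centre of the square should be 1/4, which gives a quick independent check of the coefficients.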
Further reading
1. Gilbert Strang. Calculus. Wellesley: Wellesley-Cambridge Press, 1991.
2. Alex Himonas, Alan Howard. Calculus: Ideas and Applications. New Jersey: Wiley, 2002.
3. Michael Spivak. Calculus. 3rd edition. London: Cambridge University Press, 2006.
4. Robert A. Adams, Christopher Essex. Calculus. 7th edition. Toronto: Pearson Education, 2007.
5. James Stewart. Calculus. 6th edition. California: Brooks Cole, 2017.
6. Donald Trim. Calculus for Engineers. 4th edition. Toronto: Pearson Education, 2008.
7. Ross L. Finney, Franklin D. Demana, Bert K. Waits, Daniel Kennedy. Calculus: Graphical, Numerical, Algebraic. 4th edition. New Jersey: Prentice Hall, 2012.
8. Ron Larson, Bruce H. Edwards. Calculus. 10th edition. California: Brooks Cole, 2013.
9. James Stewart. Calculus. 5th edition. Beijing: Higher Education Press, 2004.
10. Thomas’s Calculus. 10th edition. Beijing: Higher Education Press, 2004.
11. Department of Mathematics, Sichuan University. Higher Mathematics. 4th edition. Beijing: Higher Education Press, 2009.
12. Department of Mathematics, Sichuan University. Higher Mathematics. 2nd edition. Chengdu: Sichuan University Press, 2013.
13. Department of Applied Mathematics, Tongji University. Higher Mathematics. 7th edition. Beijing: Higher Education Press, 2014.
14. Ma Jigang, Zou Yunzhi, P. W. Aitchison. Calculus II. Beijing: Higher Education Press, 2010.
15. William Briggs, Lyle Cochran, Bernard Gillett. Calculus: Early Transcendentals. 2nd edition. Malaysia: Pearson Education, 2015.
16. Elgin H. Johnston, Jerold C. Mathews. Calculus (Annotated Instructor’s edition). USA: Pearson Education, 2002.

https://doi.org/10.1515/9783110674378-006
Index

absolute maximum 123
absolute minimum 123
angle between two lines 21
angle between two planes 25
angle between two vectors 8

Bernoulli equation 286
boundary 66
boundary point 66
bounded region 66
bounded region test 127

candidate theorem 124
Cartesian equation of a plane 24
center of mass 187
chain rule 92
chain rule with more than one independent variable 94
chain rule with one independent variable 92
change of variables 161
change of variables in triple integrals 179
change the order of integration 156
circulation 217
circulation density 217
circulation integral 217
Clairaut theorem 83
closed region 66
complementary function 298
components of a vector 5
conservative field 211
constrained maximum 130
constrained minimum 130
continuous functions of two variables 75
coordinate planes 4
criteria for exactness 281
critical point 124
cross product 13
curl 219
curl of a 3D vector field 256
curvature 39
cylinder 44
cylindrical coordinates 175

degree 274
dependent variables 65
difference of vectors 3
differentiability 83
differentiable 84
differential approximation 90
direction 1
direction angle 8
direction cosine 8
direction field 277
direction numbers 20
directional derivative 113
divergence 233
divergence of a 3D vector field 248
divergence theorem 248, 250
domain 65
dot product 9
double integral 147
double integral in polar coordinates 157
double integral in rectangular coordinates 150

ellipsoid 46
elliptic cone 46
elliptic cylinder 45
elliptic paraboloid 46
equivalent vectors 1
Euler’s method 310
exact differential equation 281
extrema of functions of several variables 122

first-order differential equation 273
first-order linear differential equation 283
flux 231
flux density 232
flux integral 242
Fubini theorem 152
functions of multiple variables 65
functions of two variables 65
fundamental theorem of line integrals 208

general solution 275
generalized Green’s theorem 228
global maximum 123
global minimum 123
gradient vector 113
gravitational field 202
Green’s theorem 216
Green’s theorem: circulation-curl form 219
Green’s theorem: flux-divergence form 231

Hessian matrix 122
homogeneous equation 279
homogeneous second-order linear differential equations 293
hyperbolic cylinder 45
hyperbolic paraboloid 46
hyperboloid of one sheet 46
hyperboloid of two sheets 46

implicit differentiation 101
independent variables 65
initial point 1
initial value problem 276
integrating factor 283
interior 66
interior point 66
intermediate variable 95
intersecting curves 50
iterated limits 74

Jacobian determinant 162

Lagrange multiplier 130
length of curves 37
level curves 67
level surface 70
limits for functions of two variables 70
line integral of a vector field 201
line integral with respect to arc length 195, 196
linear approximation 90
linear differential equation 283
linear equation of a plane 24
linear independence of two functions 292
linearization 83
lines in space 18
local linearization 91
local maximum 123
local minimum 123

magnitude 1
method of undetermined coefficients 302
moment of inertia 187, 188

negative of a vector 2
nonhomogeneous second-order linear differential equations 299
normal line 23, 109
normal plane 34, 106

objective function 130
octant 4
ODE 274
open region 66
optimization problem 131
order 274
ordinary differential equation 273, 274
orientable surfaces 241

parabolic cylinder 44
parallelepiped 17
parallelogram law 2
parameterization by arc length 38
parameterized surfaces 181
parametric equations of lines 19
partial derivative 77
partial derivatives of higher order 82
partial differential equation 274
particular solution 275
path independence 208
PDE 274
perpendicular vectors 8
plane 23
position vector 5
positive orientation of a curve 38
positive oriented curve 216
potential function 211
power series method 309
principal unit normal vector 41
principle of superposition for homogeneous equations 292
projection 12
projection curves 50
projection region 56

quadratic approximation 98
quadric surfaces 46

range 65
reducible second-order equations 287
regions bounded by surfaces 56
relative maximum 123
relative minimum 123
right-hand rule 3
ruled surface 63

saddle point 125
scalar multiplication 2
scalar projection 12
scalar triple product 17
second derivative test 126
second-order differential equation 274, 287
second-order linear differential equation 291
second-order ODE 287
separable differential equation 277
simple curve 216
simply connected region 216
skew lines 21
spherical coordinates 175, 177
steepest ascent/descent 116
Stokes theorem 256
surface area 181
surface in space 42
surface integral of vector fields 241
surface integral with respect to surface area 237
symmetric equations of lines 19

tangent line 106
tangent plane 109
tangent vector 34
Taylor expansion 98
terminal point 1
TNB frame 39
total differential 89
tree diagram 93
triangle law 2
triple integral 165
triple integral in rectangular coordinates 165
type I region 153
type II region 153

unbounded region 66
unit binormal vector 42
unit tangent vector 34
unit vector 1

variation of parameters 294, 307
vector 1
vector addition 2
vector equation of a line 20
vector equations of planes 28
vector field 201
vector-valued functions 30
Viviani curve 54

zero vector 1
