Multi-Variable Calculus A First Step PDF
Multi-Variable Calculus
Also of Interest
Single Variable Calculus, A First Step
Yunzhi Zou, 2018
ISBN 978-3-11-052462-8, e-ISBN (PDF) 978-3-11-052778-0,
e-ISBN (EPUB) 978-3-11-052785-8
Multi-Variable Calculus: A First Step
Mathematics Subject Classification 2010
Primary: 26B12, 26B20, 26B15; Secondary: 26B05, 26B10
Author
Prof. Yunzhi Zou
Department of Mathematics
Sichuan University
610065 Chengdu
People’s Republic of China
[email protected]
ISBN 978-3-11-067414-9
e-ISBN (PDF) 978-3-11-067437-8
e-ISBN (EPUB) 978-3-11-067443-9
www.degruyter.com
Contents
Introduction
Index
Introduction
Calculus has been applied to an incredible number of disciplines since its
inception in the seventeenth century. In particular, the marvelous Maxwell equations
revealed the laws that govern electric and magnetic fields, which led to the
prediction of the existence of electromagnetic waves. The industrial revolution
witnessed many applications of calculus. The power of calculus has not diminished,
even in today’s scientific world. For this reason, there is no doubt that calculus is one
of the most important courses for undergraduate students at any university in the
world.
On the other hand, during the past century, especially since the 2000s, many Chinese
and other non-English-speaking people have gone to English-speaking countries
to further their studies, and more are on their way. Also, as global cooperation and
communication become important for tackling big problems, people need
to know and understand each other better. Fortunately, Sichuan University
has a long history of global connections. Its summer immersion program is well known
for its size and popularity. Each year, it sends and hosts thousands of students from
different parts of the world. We believe that there are other similar situations where
students move between different places or countries without disrupting their studies.
For those students, a suitable textbook is helpful.
However, there are many challenges in developing such a suitable book. First of
all, for most freshmen whose English is not their first language, the textbook should
employ English as plain as possible. Second, the textbook should take into account
what students have learned in high school and what they need in a calculus course.
Third, there must be a smooth transition from the local standards to those globally
accepted. Furthermore, such a book must have some new insights to inject new energy
into the many already existing texts. This includes, but is not limited to, emphasizing
discovery over rote learning; being as concise as possible while covering the essential
content required by most local and global universities; and being printed in color, as
most texts in English are. The book Single Variable Calculus: A First Step, which was
the first such calculus text in China, has provided a response to these challenges since
it was published by the World Publishing Company in 2015 and by De Gruyter in 2018.
The present book, Multi-Variable Calculus: A First Step, continues these efforts.
With more than 10 years of teaching calculus courses to students at the Wu Yuzhang
Honors College at Sichuan University, I have had the chance to work with local
students using books and resource materials in English. We adopted or referred students
to many calculus books, for example Thomas’ Calculus, 10th edition, by Finney,
Weir, and Giordano; Calculus, 5th edition, by Stewart; Calculus, by Larson, Edwards,
and Hostetler; Calculus: Ideas and Applications, by Himonas and Howard; Calculus:
Early Transcendentals, 2nd edition, by Briggs, Cochran, and Gillett; and other books in
https://doi.org/10.1515/9783110674378-201
8. Mr. Zengbao Wu, Mr. Liang Li, Ms. Mengxin Li, Mr. Bo Qian, Mr. Xi Zhu, Mr. Yang
Yang, and Mr. Yi Guo
Graduate students working as teaching assistants
I would also like to thank the Department of Mathematics and the Academic Affairs
Office at Sichuan University. Their constant encouragement and generous support
made me happy to devote time and energy to writing this book and made the
publication of the work possible.
We have worked hard on this version; however, there may still be typos
and even mistakes. The responsibility for any errors in this book lies entirely with
me. I will be happy to receive comments and feedback at any time.
I can be reached via [email protected].
Sincerely,
Yunzhi Zou
Professor of Mathematics
Sichuan University
Chengdu, P.R. China
[email protected]
610065
1 Vectors and the geometry of space
In this chapter we introduce vectors and coordinate systems for three-dimensional
space. They are very helpful in our study of multivariable calculus. In particular,
vectors provide simple descriptions and insight concerning curves and planes. We also
introduce some surfaces in space. The graph of a function of two variables is a surface
in space which gives additional insight into the properties of the function.
1.1 Vectors
1.1.1 Concepts of vectors
The term vector is used to indicate a quantity that has both a magnitude and a
direction, for instance, displacement, acceleration, velocity, and force. Scientists often
represent a vector geometrically by an arrow (a directed line segment). The arrow of
the directed line segment points in the direction of the vector, while the length of the
arrow represents the magnitude of the vector. We denote vectors by letters that have
an arrow overbar, such as a⃗, b⃗, i⃗, k⃗, v⃗. For example, suppose an object moves along
a straight line from point A to point B. The vector s⃗ representing this displacement
geometrically has initial point A (the tail) and terminal point B (the head), and we indicate
this by writing $\vec{s} = \overrightarrow{AB}$ (as shown in Figure 1.1(a)). We also denote vectors by printing
the letters in boldface, such as a, b, i, k, v. In this book, we use both notations. We
denote the magnitude (also called the length) of a vector a⃗ (or a) by |a⃗| (or |a|). If |a⃗| = 1,
then we say that a⃗ is a unit vector.
We say that two vectors a⃗ and b⃗ are equivalent (or equal) if they have the same length
and the same direction, and we write a⃗ = b⃗. Note that two vectors with the same length
and direction are considered equal even when the vectors are in two different
locations. The zero vector, denoted by 0⃗ or 0, has length 0, and, consequently, it is the
only vector with no specific direction. If nonzero vectors a⃗ and b⃗ have the same
direction or if a⃗ has exactly the opposite direction to that of b⃗, then we say that they are
parallel, and we write a⃗ ‖ b⃗.
https://doi.org/10.1515/9783110674378-001
We assume that vectors considered here can be represented by directed line segments
or arrows in two-dimensional space, ℝ², or three-dimensional space, ℝ³. However,
vectors can be defined much more generally without reference to directed line
segments.
Definition 1.1.1 (Vector addition). If a⃗ and b⃗ are vectors positioned so the initial point of b⃗ is at the
terminal point of a⃗, then the sum a⃗ + b⃗ is the vector from the initial point of a⃗ to the terminal point of b⃗.
This definition of vector addition is illustrated in Figure 1.1(b), and you can see why
this definition is sometimes called the triangle law or parallelogram law.
Note. If the initial point of b⃗ is not at the terminal point of a⃗, then a copy of b⃗ (same
length and direction) can be made with its initial point at the terminal point of a⃗, and
the sum can be created using a⃗ and this copy of b⃗.
Vector addition satisfies the following laws for any three vectors a⃗, b⃗, c⃗:
(1) Commutative law: a⃗ + b⃗ = b⃗ + a⃗.
(2) Associative law: (a⃗ + b⃗) + c⃗ = a⃗ + (b⃗ + c⃗).
Definition 1.1.2 (Scalar multiplication, negative of a vector). If λ is a scalar (a number) and a⃗ is a
vector, then the scalar multiple λa⃗ is also a vector. If λ > 0, then λa⃗ has the same direction as the vector
a⃗ and has length λ times the length of a⃗. If λ < 0, then λa⃗ has the direction opposite to that
of a⃗ and has length |λ| times the length of a⃗. If λ = 0 or a⃗ = 0⃗ (zero vector), then λa⃗ = 0⃗.
In particular, the vector −a⃗ is called the negative of a⃗; it is the scalar multiple (−1)a⃗, which has the
same length as a⃗ but points in the opposite direction.
Scalar multiplication satisfies the following laws for any two vectors a⃗, b⃗ and any
two scalars λ, μ:
(3) Associative law: λ(μa⃗) = (λμ)a⃗ = μ(λa⃗).
(4) Distributive laws: (λ + μ)a⃗ = λa⃗ + μa⃗ and λ(a⃗ + b⃗) = λa⃗ + λb⃗.
By the distributive law (4), b⃗ + (−b⃗) = 1b⃗ + (−1)b⃗ = (1 − 1)b⃗ = 0⃗, so b⃗ and −b⃗ act as
negatives of each other. Also, we can see that two nonzero vectors are parallel to each
other if they are scalar multiples of one another. The zero vector is considered to be
parallel to all other vectors. It is easy to establish the following theorem.
Theorem 1.1.1. Suppose a⃗ and b⃗ are two nonzero vectors. Then a⃗ ‖ b⃗ if and only if there exists a number
λ ≠ 0 such that a⃗ = λb⃗.
a⃗ − b⃗ = a⃗ + (−b⃗).
Hence, we can construct a⃗ − b⃗ geometrically by first drawing the negative −b⃗ of b⃗,
and then adding −b⃗ to a⃗ using the parallelogram law as in Figure 1.1(d). This shows
that the vector a⃗ − b⃗ is the vector from the head of b⃗ to the head of a⃗. The operation
of subtracting two vectors does not satisfy the commutative law (1) or the associative
law (2), but it does satisfy the distributive law (4), λ(a⃗ − b⃗) = λa⃗ − λb⃗.
Figure 1.2: Three-dimensional coordinate system, axes, coordinate planes, and octants.
space is divided into eight octants. We label them the first octant, the second octant,
the third octant, the fourth octant, the fifth octant, the sixth octant, the seventh octant,
and the eighth octant in the way shown in Figure 1.2(c).
To locate a point P in space, we project the point onto the three coordinate planes.
If the directed distance from the yz-plane to the point P is a, the directed distance
from the xz-plane to the point P is b, and the directed distance from the xy-plane to
the point P is c, then we say that the point P has x-coordinate a, y-coordinate b, and
z-coordinate c, and we use the ordered triple (a, b, c) to represent these coordinates.
This can be seen by drawing a rectangular box where O and P are two end points of the
main diagonal, as shown in Figure 1.3(a). This coordinate system is called the three-
dimensional Cartesian coordinate system. For example, to locate the point with coor-
dinates (1, 2, −1), we start from the origin and go along the x-axis for 1 unit; then turn
left and go parallel to the y-axis for 2 units; then go downward for 1 unit arriving at
(1, 2, −1), which is in the fifth octant as shown in Figure 1.3(b).
Figure 1.3: Three-dimensional coordinate system, coordinates, points, distance between two points.
Note that there is a one-to-one correspondence between points in space and the
set of all ordered triples (a, b, c). Sometimes, we use ℝ³ to denote the Cartesian product
ℝ × ℝ × ℝ = {(x, y, z) | x, y, z ∈ ℝ}.
In three-dimensional space, for any two points P(x₁, y₁, z₁) and Q(x₂, y₂, z₂), we have a
rectangular box with P and Q as the two endpoints of a main diagonal, as shown in
Figure 1.3(c). Then we apply the Pythagorean theorem twice to get
\[ |PQ| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]
It is extremely useful to represent vectors using coordinates. First, we have three
standard basis vectors called i⃗, j⃗, and k⃗, which are the unit vectors in the positive
directions of the x-, y-, and z-axes, respectively. If those vectors have their tails at the origin
O, then their heads will be the points (1, 0, 0), (0, 1, 0), (0, 0, 1), respectively, as shown
in Figure 1.4(a).
Definition 1.1.3. A vector $\overrightarrow{OP}$ with initial point O, the origin, and terminal point P(x, y, z) is called the
position vector of the point P(x, y, z).
By the definition of vector addition, we must have $\overrightarrow{OP} = x\vec{i} + y\vec{j} + z\vec{k}$. This follows from
the box determined by the vector $\overrightarrow{OP}$ (see Figure 1.4(b)), because the parallelogram
rule for addition gives
\[ \overrightarrow{OP} = \overrightarrow{OT} + \overrightarrow{TQ} + \overrightarrow{QP} = x\vec{i} + y\vec{j} + z\vec{k}, \]
where $\overrightarrow{OT}$ is along the x-axis with length x and is $x\vec{i}$, $\overrightarrow{TQ}$ is parallel to the y-axis with
length y and is $y\vec{j}$, and $\overrightarrow{QP}$ is parallel to the z-axis with length z and is $z\vec{k}$. The numbers
x, y, and z are referred to as the components of the vector $\overrightarrow{OP}$.
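If we add two vectors expressed in the i⃗, j⃗, k⃗ format, then the commutative and
associative laws of vector addition show that adding two vectors can be done by adding
their components, i. e.,
\[ (x_1\vec{i} + y_1\vec{j} + z_1\vec{k}) + (x_2\vec{i} + y_2\vec{j} + z_2\vec{k}) = (x_1 + x_2)\vec{i} + (y_1 + y_2)\vec{j} + (z_1 + z_2)\vec{k}. \]
By the distributive law one can see that multiplying a vector by a scalar λ is the same
as multiplying each component by λ, i. e.,
\[ \lambda(x\vec{i} + y\vec{j} + z\vec{k}) = (\lambda x)\vec{i} + (\lambda y)\vec{j} + (\lambda z)\vec{k}. \]
Example 1.1.1. If a⃗ = 5i⃗ + 2j⃗ − 3k⃗ and b⃗ = 4i⃗ − 9k⃗, express the vector 2a⃗ + 3b⃗ in terms of i⃗, j⃗, and k⃗.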
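The component-wise rules above can be tried out on the vectors of Example 1.1.1. The following is a minimal Python sketch, not part of the original text; the helper names `scale` and `add` are illustrative assumptions, and vectors are stored as plain tuples of their i, j, k components.

```python
# Vectors are stored as (x, y, z) tuples holding the i, j, k components.

def scale(lam, v):
    """Multiply each component of v by the scalar lam."""
    return tuple(lam * c for c in v)

def add(u, v):
    """Add two vectors component by component."""
    return tuple(p + q for p, q in zip(u, v))

a = (5, 2, -3)   # a = 5i + 2j - 3k
b = (4, 0, -9)   # b = 4i - 9k (the j-component is 0)

result = add(scale(2, a), scale(3, b))
print(result)  # (22, 4, -33), i.e. 2a + 3b = 22i + 4j - 33k
```

Under these assumptions the sketch gives 2a⃗ + 3b⃗ = 22i⃗ + 4j⃗ − 33k⃗, which agrees with the hand computation 2⟨5, 2, −3⟩ + 3⟨4, 0, −9⟩.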
Now we use the notation ⟨x, y, z⟩ to denote a position vector with its head at the
point (x, y, z), and this is the coordinate representation of this position vector. Since
any vector in space can be translated so that its initial point is the origin, any vector
in space can be represented in the form ⟨x, y, z⟩. We now give definitions for vector
operations using coordinate representations as follows.
Definition 1.1.4. If a⃗ = ⟨x₁, y₁, z₁⟩ and b⃗ = ⟨x₂, y₂, z₂⟩ are two position vectors and λ is a real number,
then
a⃗ + b⃗ = ⟨x₁ + x₂, y₁ + y₂, z₁ + z₂⟩,
a⃗ − b⃗ = ⟨x₁ − x₂, y₁ − y₂, z₁ − z₂⟩,
λa⃗ = ⟨λx₁, λy₁, λz₁⟩.
Note that those operations also work for two-dimensional vectors; the only difference
is that there is no z-component (or the z-component is always 0). Also, from the
definition, we know that
\[ \vec{a} = \vec{b} \iff x_1 = x_2,\ y_1 = y_2,\ \text{and}\ z_1 = z_2, \tag{1.4} \]
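Example 1.1.2. Consider any vector $\overrightarrow{PQ}$, where the initial point is P(x₁, y₁, z₁) and the terminal point is
Q(x₂, y₂, z₂). Find the coordinates of the midpoint of the line segment PQ.
Solution. Since
\[ \overrightarrow{PQ} = \langle x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1 \rangle, \]
if M(x, y, z) is the midpoint of the line segment PQ, then $2\overrightarrow{PM} = \overrightarrow{PQ}$, so we have
\[ 2\langle x - x_1,\ y - y_1,\ z - z_1 \rangle = \langle x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1 \rangle. \]
This means
\[ x = \frac{x_1 + x_2}{2}, \quad y = \frac{y_1 + y_2}{2}, \quad z = \frac{z_1 + z_2}{2}. \]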
\[ |\vec{a}| = \sqrt{x^2 + y^2 + z^2}. \]
If |a⃗| = 1, then a⃗ is a unit vector. If a⃗ is not the zero vector, then $\frac{1}{|\vec{a}|}\vec{a}$ is the unit vector in the
direction of a⃗.
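Solution.
1. The length of a⃗ is $|\vec{a}| = \sqrt{1^2 + 2^2 + (-1)^2} = \sqrt{6}$. So the unit vector e⃗ in the direction
of a⃗ is
\[ \vec{e} = \frac{1}{|\vec{a}|}\vec{a} = \frac{1}{\sqrt{6}}\langle 1, 2, -1 \rangle = \left\langle \frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}} \right\rangle. \]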
2. The given vector has length
We have seen the distance formula before. Now we can derive it from the length
of a vector as well. The distance between the two points P(x₁, y₁, z₁) and Q(x₂, y₂, z₂) is,
therefore, the length of the vector $\overrightarrow{PQ}$, so it is
\[ |\overrightarrow{PQ}| = |\overrightarrow{OQ} - \overrightarrow{OP}| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}. \tag{1.6} \]
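Example 1.1.4. Find a point P on the y-axis such that |PA| = |PB|, where A(−4, 1, 7) and B(3, 5, 2) are
two points.
Solution. We assume the point P has the coordinates (0, y, 0). From the distance
formula, we have
\[ \sqrt{(0 + 4)^2 + (y - 1)^2 + (0 - 7)^2} = \sqrt{(0 - 3)^2 + (y - 5)^2 + (0 - 2)^2}. \]
Squaring both sides and simplifying gives 8y = −28, so y = −7/2 and P is the point (0, −7/2, 0).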
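The equidistance condition of Example 1.1.4 can be checked with exact arithmetic. The sketch below is an illustration only; the helper `dist_sq` and the variable names are assumptions made here. It uses the fact that for P = (0, y, 0) the y² terms cancel in |PA|² = |PB|², leaving a linear equation in y.

```python
from fractions import Fraction

A = (-4, 1, 7)
B = (3, 5, 2)

def dist_sq(p, q):
    """Squared distance between two points given as coordinate tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

# For P = (0, y, 0): |PA|^2 - |PB|^2 = (|A|^2 - |B|^2) - 2*y*(A_y - B_y) = 0.
norm_sq_A = sum(c * c for c in A)
norm_sq_B = sum(c * c for c in B)
y = Fraction(norm_sq_A - norm_sq_B, 2 * (A[1] - B[1]))

P = (0, y, 0)
print(y)                               # -7/2
print(dist_sq(P, A) == dist_sq(P, B))  # True
```

Working with squared distances avoids square roots, so the equality test is exact rather than approximate.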
Figure 1.5: Angle between two vectors, perpendicular vectors, and direction angles.
Definition 1.1.5 (Angle between two vectors, direction angle, and direction cosines). If a⃗ and b⃗ are
two vectors with a common tail, then:
1. The angle between the vectors a⃗ and b⃗ is the angle θ between 0 and π formed using the two
vectors as sides.
2. The two vectors a⃗ and b⃗ are called perpendicular (orthogonal) if and only if the angle between
them is π/2.
3. The angle between a vector a⃗ and the x-axis is the angle between a⃗ and the unit base vector i⃗.
4. The angle between a vector a⃗ and the y-axis is the angle between a⃗ and the unit base vector j⃗.
5. The angle between a vector a⃗ and the z-axis is the angle between a⃗ and the unit base vector k⃗.
6. The direction angles α, β, and γ of a vector a⃗ are the angles between a⃗ and the x-, y-, and z-axes,
respectively; cos α, cos β, and cos γ are called direction cosines of a⃗.
1.2 Dot product, cross product, and triple product
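From Figure 1.5(d), if the vector a⃗ = ⟨x, y, z⟩ has direction angles α, β, γ, then we have
\[ \cos\alpha = \frac{x}{|\vec{a}|}, \quad \cos\beta = \frac{y}{|\vec{a}|}, \quad\text{and}\quad \cos\gamma = \frac{z}{|\vec{a}|}. \tag{1.7} \]
Since
\[ \cos^2\alpha + \cos^2\beta + \cos^2\gamma = \frac{x^2}{|\vec{a}|^2} + \frac{y^2}{|\vec{a}|^2} + \frac{z^2}{|\vec{a}|^2} = \frac{x^2 + y^2 + z^2}{|\vec{a}|^2} = 1, \]
it follows that
\[ \langle \cos\alpha, \cos\beta, \cos\gamma \rangle = \frac{\langle x, y, z \rangle}{|\vec{a}|} \tag{1.8} \]
is the unit vector in the direction of a⃗.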
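Example 1.1.5. If A(2, 2, √2) and B(1, 3, 0) are two points, find the length, direction cosines, and
direction angles of the vector $\overrightarrow{AB}$.
Solution. Because $\overrightarrow{AB} = \langle 1 - 2,\ 3 - 2,\ 0 - \sqrt{2} \rangle = \langle -1, 1, -\sqrt{2} \rangle$, we have
\[ |\overrightarrow{AB}| = \sqrt{(-1)^2 + 1^2 + (-\sqrt{2})^2} = 2. \]
The unit vector in the direction of $\overrightarrow{AB}$ is
\[ \frac{1}{2}\langle -1, 1, -\sqrt{2} \rangle = \left\langle -\frac{1}{2}, \frac{1}{2}, -\frac{\sqrt{2}}{2} \right\rangle. \]
Hence,
\[ \cos\alpha = -\frac{1}{2}, \quad \cos\beta = \frac{1}{2}, \quad\text{and}\quad \cos\gamma = -\frac{\sqrt{2}}{2} \]
are the three direction cosines, and
\[ \alpha = \frac{2\pi}{3}, \quad \beta = \frac{\pi}{3}, \quad\text{and}\quad \gamma = \frac{3\pi}{4} \]
are the three direction angles with the positive x-, y-, and z-axes, respectively.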
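Formulas (1.7) and (1.8) translate directly into a few lines of code. The following Python sketch is illustrative only; the function name `direction_cosines` is an assumption made here. It recomputes the direction cosines and direction angles of Example 1.1.5 in floating point.

```python
import math

def direction_cosines(v):
    """Return (cos alpha, cos beta, cos gamma) for a nonzero vector v, as in (1.7)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

A = (2.0, 2.0, math.sqrt(2))
B = (1.0, 3.0, 0.0)
AB = tuple(q - p for p, q in zip(A, B))   # <-1, 1, -sqrt(2)>

cosines = direction_cosines(AB)
angles = tuple(math.acos(c) for c in cosines)  # direction angles in radians

# The squares of the direction cosines sum to 1 (up to rounding).
print(math.isclose(sum(c * c for c in cosines), 1.0))  # True
print(math.isclose(angles[0], 2 * math.pi / 3))        # True
```

Because the computation is in floating point, comparisons use `math.isclose` rather than exact equality.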
So far we have introduced two operations on vectors: addition and multiplication
by a scalar. Now the following questions arise: How about multiplication? Can
we multiply two vectors to obtain a useful quantity? In fact, there are two commonly
used products of vectors, called the dot product and the cross product.
As shown in Figure 1.6, you may already know from physics that the work done,
W, by a force F applied during a displacement along the vector s is
W = |F||s| cos θ,
where θ is the angle between the two vectors F and s. It is, therefore, useful to define
a product of two vectors in this way.
Definition 1.2.1 (Dot/scalar/inner product). The dot product a ⋅ b of the two vectors a and b is defined
by
a ⋅ b = |a||b| cos θ,
where θ is the angle between a and b.
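Example 1.2.1. If the two vectors a and b have lengths 3 and 4, and the angle between them is π/3, find
a ⋅ b.
Solution. By the definition of the dot product,
\[ \mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos(\pi/3) = 3 \cdot 4 \cdot \frac{1}{2} = 6. \]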
Well, this definition looks good as it has a physical basis. However, mathemati-
cally, it is not easy to find the dot product directly as we first need to know the angle
between the vectors. Using the coordinate representation of a vector, it turns out that
there is a remarkable way to compute the dot product, as we will see in the following
theorem.
Theorem 1.2.1. If a = ⟨a₁, a₂, a₃⟩ and b = ⟨b₁, b₂, b₃⟩, then
a ⋅ b = a₁b₁ + a₂b₂ + a₃b₃.
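Proof. Suppose the angle between a and b is θ. Note that the three vectors a, b, and
c = b − a form the three sides of a triangle. By the cosine law, we have
\[ |\mathbf{c}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2|\mathbf{a}||\mathbf{b}| \cos\theta. \]
Since
\[ |\mathbf{a}|^2 = a_1^2 + a_2^2 + a_3^2, \quad |\mathbf{b}|^2 = b_1^2 + b_2^2 + b_3^2, \quad |\mathbf{c}|^2 = (b_1 - a_1)^2 + (b_2 - a_2)^2 + (b_3 - a_3)^2, \]
substituting these values into the cosine law equation and canceling out all the
squares gives
\[ -2(a_1 b_1 + a_2 b_2 + a_3 b_3) = -2|\mathbf{a}||\mathbf{b}| \cos\theta. \]
Therefore, we have
a ⋅ b = a₁b₁ + a₂b₂ + a₃b₃.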
In view of this theorem, we give the following alternative definition of the dot
product.
Definition 1.2.2 (Alternative definition of the dot product). If a = ⟨a1 , a2 , a3 ⟩, b = ⟨b1 , b2 , b3 ⟩, and θ is
the angle between the two vectors, then the dot product is defined by
a ⋅ b = |a||b| cos θ = a1 b1 + a2 b2 + a3 b3 .
Finding the dot product of a and b is incredibly easy by using coordinates. We just
multiply corresponding components and add. Using this definition, we can deduce
the following properties of the dot product.
Theorem 1.2.2 (Properties of the dot product). If a, b, and c are any three vectors and λ is any scalar,
then the dot product satisfies:
1. a ⋅ a = |a|²,
2. a ⋅ b = b ⋅ a,
3. if a and b are two nonzero vectors, then a ⋅ b = 0 means that a and b are perpendicular to each
other,
4. (a + b) ⋅ c = a ⋅ c + b ⋅ c,
5. (λa) ⋅ b = λ(a ⋅ b) = a ⋅ (λb),
6. 0 ⋅ a = 0.
These properties are similar to the rules for real numbers and can be easily proved
by using either of the two definitions of the dot product. However, some properties of
real number multiplication do not apply to the dot product. For example, if two real
numbers satisfy ab = 0, then either a = 0 or b = 0 or both. This is not true for the dot
product. If a and b are two nonzero vectors, then a ⋅ b = 0 indicates the two vectors
are perpendicular to each other, and it is not necessary that either a = 0 or b = 0.
By using the dot product, we can find the angle between two vectors, as shown in
the following example.
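Example 1.2.2. Find the angle between the two vectors i + 2j − k and 2j − k.
Solution. By the definition of the dot product, u ⋅ v = |u||v| cos θ, so cos θ = (u ⋅ v)/(|u||v|).
Thus,
\[ \cos\theta = \frac{(\mathbf{i} + 2\mathbf{j} - \mathbf{k}) \cdot (2\mathbf{j} - \mathbf{k})}{|\mathbf{i} + 2\mathbf{j} - \mathbf{k}||2\mathbf{j} - \mathbf{k}|} = \frac{1 \cdot 0 + 2 \cdot 2 + (-1) \cdot (-1)}{\sqrt{1^2 + 2^2 + (-1)^2}\,\sqrt{2^2 + (-1)^2}} \approx 0.913. \]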
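The two forms of the dot product combine into a simple recipe for the angle between two vectors: compute a ⋅ b from components, then divide by the product of the lengths. The sketch below is illustrative; the helper names `dot`, `norm`, and `angle_between` are assumptions made here. It reproduces the value of cos θ in Example 1.2.2.

```python
import math

def dot(u, v):
    """Component form of the dot product: multiply corresponding components and add."""
    return sum(p * q for p, q in zip(u, v))

def norm(v):
    """Length of a vector, using a . a = |a|^2."""
    return math.sqrt(dot(v, v))

def angle_between(u, v):
    """Angle in [0, pi] between two nonzero vectors."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

u = (1, 2, -1)   # i + 2j - k
v = (0, 2, -1)   # 2j - k

cos_theta = dot(u, v) / (norm(u) * norm(v))
theta = angle_between(u, v)
print(round(cos_theta, 3))  # 0.913
```

The clamp inside `angle_between` is a defensive choice for nearly parallel vectors, where rounding can push the quotient slightly past ±1.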
1.2.2 Projections
Suppose that $\mathbf{a} = \overrightarrow{OA}$ and $\mathbf{b} = \overrightarrow{OB}$ are two vectors with the same tail O. If S is the foot
of the perpendicular from B to the line containing $\overrightarrow{OA}$, then the vector $\overrightarrow{OS}$ is called the
vector projection of the vector b onto the vector a, written as Proj_a b. If e = a/|a| is the
unit vector in the direction of $\overrightarrow{OA}$, then the vector projection is λe, where λ = |b| cos θ
is the size (positive or negative) of the projection vector and θ is the angle between the
two vectors, as shown in Figure 1.7. Hence, the projection of vector b onto vector a is
\[ \mathrm{Proj}_{\mathbf{a}} \mathbf{b} = \frac{|\mathbf{b}| \cos\theta}{|\mathbf{a}|}\, \mathbf{a}. \]
The scalar projection of vector b onto vector a is defined as the signed size λ = |b| cos θ of this projection.
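Since a ⋅ b = |a||b| cos θ, the factor |b| cos θ / |a| in the projection formula equals (a ⋅ b)/(a ⋅ a), which avoids computing θ. The following Python sketch illustrates this; the function names and the sample vectors are assumptions chosen here for illustration, not taken from the text.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def vector_projection(b, a):
    """Proj_a b = (|b| cos(theta) / |a|) a = ((a . b) / (a . a)) a, for nonzero a."""
    k = dot(a, b) / dot(a, a)
    return tuple(k * c for c in a)

def scalar_projection(b, a):
    """Signed size |b| cos(theta) = (a . b) / |a| of the projection of b onto a."""
    return dot(a, b) / math.sqrt(dot(a, a))

# Hypothetical sample vectors.
a = (1, 2, 2)
b = (3, 0, 3)

proj = vector_projection(b, a)
print(proj)                     # (1.0, 2.0, 2.0)
print(scalar_projection(b, a))  # 3.0

# The remainder b - Proj_a b is perpendicular to a.
residual = tuple(p - q for p, q in zip(b, proj))
print(dot(residual, a))         # 0.0
```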
In mechanics, the moment of a force F⃗ acting on a rod $\overrightarrow{OP}$ is the vector with magnitude
$|\vec{F}||\overrightarrow{OP}| \sin\theta$, where θ is the angle between the vectors F⃗ and $\overrightarrow{OP}$. The direction of the
moment vector is perpendicular to F⃗ and $\overrightarrow{OP}$ (see Figure 1.8(a)) and satisfies the
right-hand rule: if you curl the fingers of your right hand naturally from vector F⃗ to vector $\overrightarrow{OP}$, then your
thumb points in the direction of the moment vector, as shown in Figure 1.8(b) and (c).
Therefore, it makes sense to define a product of two vectors a⃗ and b⃗ as follows.
Definition 1.2.3 (Cross/vector/outer product). The cross product a × b of vector a and
vector b in ℝ³ is a new vector which is perpendicular to both vector a and vector b. The length of a × b
is
|a × b| = |a||b| sin θ,
where θ is the angle between a and b, and the direction of a × b is given by the right-hand rule.
According to the above definition and using Figure 1.4(a), we can see that
i × i = 0, i × j = k, i × k = −j, j × j = 0, j × i = −k,
j × k = i, k × i = j, k × j = −i, and k × k = 0.
But in general, how can we compute the cross product? If we try to compute
a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)
by using the normal rules for numbers, such as the commutative, associative, and
distributive rules, we may find an interesting vector
c = ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩.
This vector, in fact, satisfies conditions that we have set for a cross product, as we will
see in the following theorem.
Theorem 1.2.3. If a = ⟨a₁, a₂, a₃⟩, b = ⟨b₁, b₂, b₃⟩, and
c = ⟨a₂b₃ − a₃b₂, a₃b₁ − b₃a₁, a₁b₂ − a₂b₁⟩,
then:
1. c is perpendicular to both a and b.
2. |c| = |a||b| sin θ, where θ is the angle between a and b.
Proof. We compute the dot product to show they are perpendicular. We have
a ⋅ c = ⟨a1 , a2 , a3 ⟩ ⋅ ⟨a2 b3 − a3 b2 , a3 b1 − b3 a1 , a1 b2 − a2 b1 ⟩
= a1 a2 b3 − a1 a3 b2 + a2 a3 b1 − a2 b3 a1 + a3 a1 b2 − a3 a2 b1
= 0.
Now the only issue that remains is whether a, b, and c, in that order, satisfy the
right-hand rule. This can be seen in a simple case where a and b are in the first
quadrant of the xy-plane with tails at the origin. Then the sign of the term a₂/a₁ − b₂/b₁ determines
the relative positions of a and b, and the sign of the z-component of c, a₁b₂ − a₂b₁,
determines whether c points upward or downward. This is exactly the right-hand rule:
when you curl the fingers of your right hand from a to b, your thumb points in the direction
of c.
In light of the above discussion, we now give an alternative definition of the cross
product.
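Definition 1.2.4 (Alternative definition of the cross product). Let a = ⟨a₁, a₂, a₃⟩ and b = ⟨b₁, b₂, b₃⟩.
Then the cross product (also vector product) a × b is defined by
a × b = ⟨a₂b₃ − a₃b₂, a₃b₁ − b₃a₁, a₁b₂ − a₂b₁⟩.
In determinant form, this reads
\[ \mathbf{a} \times \mathbf{b} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \mathbf{k}, \]
where $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$. This form is much easier to remember.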
Using the definition of the vector product, we have the following theorem.
Theorem 1.2.4 (Properties of the cross product for three-dimensional vectors). For any three vectors
a, b, and c in ℝ3 and a scalar λ, we have:
1. a × a = 0,
2. if a and b are nonzero vectors, then a × b = 0 if and only if a ‖ b,
3. b × a = −(a × b),
4. a × (b + c) = a × b + a × c,
5. (a + b) × c = a × c + b × c,
6. (λa) × b = λ(a × b) = a × (λb),
7. a ⋅ (b × c) = (a × b) ⋅ c,
8. a × (b × c) = (a ⋅ c)b − (a ⋅ b)c.
Using one of the definitions of the cross product, we can prove these properties by
writing the vectors in component form. Note that the cross product fails to obey
most of the laws satisfied by real number multiplication, such as the commutative and
associative laws. Check for yourself that a × (b × c) ≠ (a × b) × c for most vectors a, b,
and c.
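The properties listed in Theorem 1.2.4 can be spot-checked numerically with the component formula from Definition 1.2.4. The sketch below is an illustration, not a proof: the sample vectors a, b, c are arbitrary integer choices made here, and the helper names are assumptions.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Component formula of Definition 1.2.4."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def scale(lam, v):
    return tuple(lam * c for c in v)

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
a, b, c = (1, 2, 3), (-2, 0, 5), (4, -1, 1)  # arbitrary sample vectors

checks = {
    "i x j = k": cross(i, j) == k,
    "property 3: b x a = -(a x b)": cross(b, a) == scale(-1, cross(a, b)),
    "property 7: a.(b x c) = (a x b).c": dot(a, cross(b, c)) == dot(cross(a, b), c),
    "property 8: a x (b x c) = (a.c)b - (a.b)c":
        cross(a, cross(b, c)) == sub(scale(dot(a, c), b), scale(dot(a, b), c)),
    # i x (i x j) = -j, but (i x i) x j = 0, so the product is not associative.
    "not associative": cross(i, cross(i, j)) != cross(cross(i, i), j),
}
for name, ok in checks.items():
    print(name, ok)
```

Integer components keep every comparison exact; with floating-point inputs the equalities would need a tolerance.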
Example 1.2.4. Find a vector that is perpendicular to the plane containing the three points P(1, 0, 6),
Q(2, 5, −1), and R(−1, 3, 7).
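Solution. The cross product of the two vectors $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ is such a vector. This is
because the cross product is perpendicular to both $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ and is, thus, perpendicular
to the plane through the three points P, Q, and R. Since
\[ \overrightarrow{PQ} = (2 - 1)\vec{i} + (5 - 0)\vec{j} + (-1 - 6)\vec{k} = \vec{i} + 5\vec{j} - 7\vec{k}, \]
\[ \overrightarrow{PR} = (-1 - 1)\vec{i} + (3 - 0)\vec{j} + (7 - 6)\vec{k} = -2\vec{i} + 3\vec{j} + \vec{k}, \]
we evaluate the cross product of these two vectors using the determinant approach,
i. e.,
\[ \overrightarrow{PQ} \times \overrightarrow{PR} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & 5 & -7 \\ -2 & 3 & 1 \end{vmatrix} = (5 + 21)\vec{i} - (1 - 14)\vec{j} + (3 + 10)\vec{k} = 26\vec{i} + 13\vec{j} + 13\vec{k}. \]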
So the vector ⟨26, 13, 13⟩ is perpendicular to the plane passing through the three points
P, Q, and R. In fact, any nonzero scalar multiple of this vector, such as ⟨2, 1, 1⟩, is also
perpendicular to the plane. Figure 1.9 illustrates the vector perpendicular to the plane.
Note that the length |a × b| = |a||b| sin θ of the cross product is equal to the area of the
parallelogram determined by a and b, assuming they have the same initial point, as shown
in Figure 1.8(d). Therefore, we have the following theorem.
Theorem 1.2.5. Given two nonzero vectors a and b with a common tail, the area of the parallelogram
determined by a and b is |a × b|.
Example 1.2.5. Find the area of the triangle with vertices P(1, 0, 6), Q(2, 5, −1), and R(−1, 3, 7).
Solution. In the previous example, we already computed that $\overrightarrow{PQ} \times \overrightarrow{PR} = \langle 26, 13, 13 \rangle$.
The area of the parallelogram with adjacent sides PQ and PR is the magnitude of the
cross product, i. e.,
\[ |\overrightarrow{PQ} \times \overrightarrow{PR}| = \sqrt{26^2 + 13^2 + 13^2} = 13\sqrt{6}. \]
Thus, the area of the triangle PQR is $\frac{13\sqrt{6}}{2}$.
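Examples 1.2.4 and 1.2.5 use the same cross product twice: once as a normal vector and once for an area. A short Python sketch, with helper names chosen here purely for illustration, recomputes both.

```python
import math

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

P, Q, R = (1, 0, 6), (2, 5, -1), (-1, 3, 7)

PQ = sub(Q, P)                     # (1, 5, -7)
PR = sub(R, P)                     # (-2, 3, 1)
normal = cross(PQ, PR)             # perpendicular to the plane through P, Q, R
triangle_area = norm(normal) / 2   # half the parallelogram area |PQ x PR|

print(normal)         # (26, 13, 13)
print(triangle_area)  # approximately 15.92, i.e. 13*sqrt(6)/2
```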
Suppose three nonplanar vectors a, b, and c have a common tail. What is the volume
of the parallelepiped determined by these three vectors, as shown in Figure 1.10?
Consider the base parallelogram determined by b and c; its area is A = |b × c|. Let θ be the
angle between a and b × c. Noting that b × c is perpendicular to b and c and the height h of the
parallelepiped is
h = |a||cos θ|
(we should use |cos θ| instead of cos θ to ensure that we obtain a positive result when
θ > π/2), we conclude that the volume V of the parallelepiped is given as follows:
\[ V = Ah = |\mathbf{b} \times \mathbf{c}||\mathbf{a}||\cos\theta| = |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|. \]
Thus, we have proved that the volume of the parallelepiped determined by the three
vectors a, b, and c with a common tail is given as follows:
\[ V = |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|. \]
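A product like a ⋅ (b × c) is called a scalar triple product of the three vectors a,
b, and c. Note that we can write this scalar triple product as a 3 × 3 determinant as
follows:
\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]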
If the above scalar triple product is 0, then it means that the volume of the paral-
lelepiped determined by the three vectors a, b, and c is 0. Then, we can conclude
that the three vectors must be coplanar (that is, they lie in the same plane). In terms
of linear algebra, they are linearly dependent.
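Example 1.2.6. Use the scalar triple product to determine whether the vectors a = ⟨2, 0, −7⟩, b =
⟨1, −1, −3⟩, and c = ⟨1, 1, −1⟩ are coplanar.
Solution. Since
\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} 2 & 0 & -7 \\ 1 & -1 & -3 \\ 1 & 1 & -1 \end{vmatrix} = 2 \begin{vmatrix} -1 & -3 \\ 1 & -1 \end{vmatrix} - 0 \begin{vmatrix} 1 & -3 \\ 1 & -1 \end{vmatrix} - 7 \begin{vmatrix} 1 & -1 \\ 1 & 1 \end{vmatrix} = 8 - 0 - 7 \times 2 = -6 \]
is not zero, the three vectors are not coplanar.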
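The coplanarity test amounts to checking whether a ⋅ (b × c) vanishes. A minimal Python sketch, with the function name `triple_product` assumed here for illustration, repeats the computation of Example 1.2.6.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

a = (2, 0, -7)
b = (1, -1, -3)
c = (1, 1, -1)

value = triple_product(a, b, c)
volume = abs(value)      # volume of the parallelepiped spanned by a, b, c
coplanar = value == 0    # exact test, valid here because the inputs are integers

print(value, volume, coplanar)  # -6 6 False
```

With floating-point components, the test `value == 0` should be replaced by a comparison against a small tolerance.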
A line in the two-dimensional xy-plane is determined by a point on the line and the
direction of the line (its slope, or angle of inclination, or a vector parallel to the line).
The equation of the line can be written by using the usual slope-intercept form y =
mx + b.
A line L in ℝ³ is also determined once we know a point P₀(x₀, y₀, z₀) on L and the
direction of L. However, we do not have the concept of “slope of a line” as we do in ℝ².
In three-dimensional space, the direction of a line L can be conveniently described by
a vector v = ⟨m, n, p⟩ parallel to L. If P(x, y, z) is an arbitrary point in space, then the vector
$\overrightarrow{P_0P}$ is parallel to v exactly when the point P is on the line, as shown in Figure 1.11, so
for a point P on L there is some real number t such that
\[ \overrightarrow{P_0P} = t\mathbf{v}, \]
\[ \langle x - x_0,\ y - y_0,\ z - z_0 \rangle = \langle tm, tn, tp \rangle. \]
1.3 Equations of lines and planes
or
or
\[ \begin{cases} x = x_0 + tm, \\ y = y_0 + tn, \\ z = z_0 + tp. \end{cases} \tag{1.12} \]
Equations (1.11) and (1.12) are called parametric equations of the line passing
through the point (x₀, y₀, z₀) with the direction vector v = ⟨m, n, p⟩. Note that
equation (1.11) can be rewritten as
\[ \frac{x - x_0}{m} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \tag{1.13} \]
\[ \frac{x - x_0}{0} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \]
but this should be interpreted as x = x₀ and (y − y₀)/n = (z − z₀)/p.
also see that any three numbers proportional to m, n, and p are also direction
numbers for L. The three direction numbers determine the three direction angles; they are
“angles of inclination” with respect to the three coordinate axes. If v = ⟨m, n, p⟩ is a
unit vector, then the three direction numbers are actually its three direction cosines.
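\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} + t \begin{pmatrix} m \\ n \\ p \end{pmatrix}, \tag{1.14} \]
or
or
\[ \mathbf{r} = \mathbf{r}_0 + t\mathbf{v}. \tag{1.16} \]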
Equations (1.14)–(1.16) are all called vector equations for the line L passing through the
point (x0 , y0 , z0 ) with direction v.
Example 1.3.1. Find parametric equations, a vector equation, and symmetric equations of the line L
which passes through the points A(1, 2, −1) and B(0, 1, 3).
Solution. The vector $\overrightarrow{AB} = \langle 0 - 1,\ 1 - 2,\ 3 - (-1) \rangle = \langle -1, -1, 4 \rangle$ is a direction vector of the
line L. Hence, a vector equation of L is
or
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} + t \begin{pmatrix} -1 \\ -1 \\ 4 \end{pmatrix}. \]
x = 1 − t, y = 2 − t, z = −1 + 4t.
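The parametric form r = r₀ + tv is easy to evaluate in code. The following Python sketch is an illustration only; the function name `line_through` and its return convention are assumptions made here. It builds the line of Example 1.3.1 from its two points and evaluates it at two parameter values.

```python
def line_through(p, q):
    """Return a direction vector and a function t -> point for the line through p and q."""
    direction = tuple(qi - pi for pi, qi in zip(p, q))

    def point_at(t):
        # Parametric equations: each coordinate is p_i + t * direction_i.
        return tuple(pi + t * di for pi, di in zip(p, direction))

    return direction, point_at

A = (1, 2, -1)
B = (0, 1, 3)

v, point_at = line_through(A, B)
print(v)            # (-1, -1, 4)
print(point_at(0))  # (1, 2, -1), the point A
print(point_at(1))  # (0, 1, 3), the point B
print(point_at(2))  # (-1, 0, 7), from x = 1 - t, y = 2 - t, z = -1 + 4t
```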
Example 1.3.2. Show that the lines L1 and L2 with parametric equations
x = 1 + 2t, y = 2 − t, z = −3 + 4t,
x = 2 + s, y = 4 − s, z = 4 + 2s
are skew lines. That is, L1 and L2 do not intersect and are not parallel to each other and,
therefore, do not lie in the same plane.
Solution. The lines are not parallel because the corresponding direction vectors v1 =
⟨2, −1, 4⟩ and v2 = ⟨1, −1, 2⟩ are not parallel: there is no scalar λ such that
⟨1, −1, 2⟩ = λ⟨2, −1, 4⟩. In other words, their components are not proportional. We
attempt to solve the system of equations in t and s to find any intersection points. We
have
1 + 2t = 2 + s,
2 − t = 4 − s,
−3 + 4t = 4 + 2s.
Solving the first two equations for t and s gives t = 3 and s = 5, but these values do not
satisfy the third equation. Therefore, there are no values of t and s that satisfy all three
equations, so the system of equations is inconsistent. Thus, L1 and L2 do not intersect
and are skew lines. The graphs of the two lines are shown in Figure 1.12(b).
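The reasoning in Example 1.3.2 (rule out parallel lines, solve two of the three equations, test the third) can be mirrored in code. The sketch below is illustrative: the variable names are assumptions, Cramer's rule is used here as one convenient way to solve the 2 × 2 system, and it presumes that the determinant of the x- and y-equations is nonzero, which holds for these two lines.

```python
from fractions import Fraction

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

# L1: p1 + t*v1 and L2: p2 + s*v2, read off from the parametric equations.
p1, v1 = (1, 2, -3), (2, -1, 4)
p2, v2 = (2, 4, 4), (1, -1, 2)

# Two nonzero vectors are parallel exactly when their cross product is zero.
parallel = cross(v1, v2) == (0, 0, 0)

# Solve t*v1 - s*v2 = p2 - p1 in the x- and y-components by Cramer's rule.
d = tuple(q - p for p, q in zip(p1, p2))
det = v1[0] * (-v2[1]) - (-v2[0]) * v1[1]   # assumed nonzero
t = Fraction(d[0] * (-v2[1]) - (-v2[0]) * d[1], det)
s = Fraction(v1[0] * d[1] - v1[1] * d[0], det)

# The lines meet only if the z-coordinates also agree for these t and s.
z_on_L1 = p1[2] + t * v1[2]
z_on_L2 = p2[2] + s * v2[2]
intersect = z_on_L1 == z_on_L2

skew = (not parallel) and (not intersect)
print(t, s, z_on_L1, z_on_L2, skew)  # 3 5 9 14 True
```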
The angle between two lines is the angle between their direction vectors. There-
fore, we can use the dot product to find the angle, as shown in the following example.
\[ L_1: \frac{x - 1}{1} = \frac{y}{-4} = \frac{z + 3}{1} \quad\text{and}\quad L_2: x = 2t,\ y = -2 - 2t,\ z = -t. \]
Example 1.3.4. Find symmetric equations of the line L that passes through (2, 1, 14) and
perpendicularly intersects the line L₀: (x − 3)/2 = y/1 = (z − 1)/1.
Solution. Suppose that the line L intersects L₀ at the point P(x, y, z). Then the
coordinates of P must have the form
\[ x = 3 + 2t, \quad y = t, \quad z = 1 + t. \]
The vector parallel to L with initial point (2, 1, 14) and terminal point P is
\[ \langle 3 + 2t - 2,\ t - 1,\ 1 + t - 14 \rangle = \langle 1 + 2t,\ t - 1,\ t - 13 \rangle. \]
Since the two lines intersect perpendicularly, the direction ⟨2, 1, 1⟩ of L₀ is also
perpendicular to this vector, so
\[ 2(1 + 2t) + (t - 1) + (t - 13) = 0. \]
Solving for t, we have t = 2. Hence, P has coordinates (7, 2, 3) and a vector parallel
to L is ⟨7, 2, 3⟩ − ⟨2, 1, 14⟩ = ⟨5, 1, −11⟩. Therefore, symmetric equations of L are
\[ \frac{x - 2}{5} = \frac{y - 1}{1} = \frac{z - 14}{-11}. \]
Example 1.3.5. Find the perpendicular distance from the point Q(1, 2, 3) to the straight line with para-
metric equations x = 3 + t, y = 4 − 2t, z = −2 + 2t.
Solution. Let t be the value such that the point N(3 + t, 4 − 2t, −2 + 2t) on the line
is the foot of the perpendicular from the point Q to the line. The vector
NQ = ⟨−2 − t, −2 + 2t, 5 − 2t⟩ from N to Q must be perpendicular to the direction
of the line, so NQ ⋅ ⟨1, −2, 2⟩ = 0. This means
(−2 − t) − 2(−2 + 2t) + 2(5 − 2t) = 12 − 9t = 0,
so t = 4/3 and N = (13/3, 4/3, 2/3). The distance is therefore
|NQ| = √((13/3 − 1)² + (4/3 − 2)² + (2/3 − 3)²) = √17.
Note. The distance can also be obtained by minimizing the function d(t) = |NQ|.
Also, one can show that the distance from a point P to a line r = r0 + vt is
distance from P to a line = |MP × v| / |v|, (1.17)
where M is any point on the line and MP is the vector from M to P.
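Formula (1.17) is easy to evaluate directly. The following minimal sketch (function names are hypothetical) applies it to the data of Example 1.3.5, taking M = (3, 4, −2) as the point on the line obtained from t = 0.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(sum(a * a for a in u))


def point_line_distance(p, m, v):
    """Distance from point p to the line r = m + t*v, using |MP x v| / |v|."""
    mp = tuple(a - b for a, b in zip(p, m))
    return norm(cross(mp, v)) / norm(v)


d = point_line_distance((1, 2, 3), (3, 4, -2), (1, -2, 2))
print(d, math.sqrt(17))  # the two printed values should agree
```

Here MP × v = ⟨6, 9, 6⟩, whose length is √153, and |v| = 3, so the quotient is √17, in agreement with the foot-of-perpendicular computation.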
1.3.2 Planes
Thus,
a(x − x0) + b(y − y0) + c(z − z0) = 0.
This is called the Cartesian equation/linear equation of the plane through M0 (x0 , y0 , z0 )
with normal vector n = ⟨a, b, c⟩. By collecting terms in the equation, we can write the
equation as
ax + by + cz + d = 0, (1.19)
where d = −(ax0 + by0 + cz0 ). A point (x, y, z) is in the plane if and only if it satisfies
this equation.
Example 1.3.6. The plane x = 0 is the yz-coordinate plane, the plane y = 0 is the xz-coordinate plane,
and the plane z = 0 is the xy-coordinate plane; z = 3 is the plane parallel to the xy-plane with distance
3 units from it.
Example 1.3.7. Find an equation of the plane that passes through the point (2, 2, −1) with normal vec-
tor n⃗ = ⟨1, 2, 3⟩. Also, find the intercepts of the plane with the three coordinate axes and then sketch
the plane.
Solution. An equation of the plane is
1(x − 2) + 2(y − 2) + 3(z + 1) = 0,
or
x + 2y + 3z = 3.
In order to find the x-intercept, we set y = z = 0 in this equation and solve for x to
get x = 3. Similarly, the y-intercept is 3/2 and the z-intercept is 1. The plane is shown
in Figure 1.14(a).
Example 1.3.8. Find an equation of the plane through the three points P(−1, −3, 2), Q(0, −1, 7), and
R(3, 2, −1).
Solution. The vectors PQ (from P to Q) and PR (from P to R) are
PQ = ⟨0, −1, 7⟩ − ⟨−1, −3, 2⟩ = ⟨1, 2, 5⟩
and
PR = ⟨3, 2, −1⟩ − ⟨−1, −3, 2⟩ = ⟨4, 5, −3⟩.
Their cross product PQ × PR is orthogonal to the desired plane and, thus, n⃗ = PQ × PR
is a normal vector to the plane. Hence, an equation of the plane is
PM ⋅ n⃗ = PM ⋅ (PQ × PR) = 0,
where M(x, y, z) is an arbitrary point in the plane. Using the triple product formula
gives
| x − (−1)   y − (−3)   z − 2 |
|    1          2         5   | = 0.
|    4          5        −3   |
Simplifying this, we obtain
23y − 31x − 3z + 44 = 0.
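The computation in Example 1.3.8 can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names; it returns the coefficients (a, b, c, d) of ax + by + cz + d = 0, using the cross product of PQ and PR as the normal vector.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_through(p, q, r):
    """Coefficients (a, b, c, d) of the plane ax + by + cz + d = 0 through p, q, r."""
    pq = tuple(b - a for a, b in zip(p, q))
    pr = tuple(b - a for a, b in zip(p, r))
    a, b, c = cross(pq, pr)          # normal vector n = PQ x PR
    d = -(a * p[0] + b * p[1] + c * p[2])
    return a, b, c, d


print(plane_through((-1, -3, 2), (0, -1, 7), (3, 2, -1)))
```

For the three points of the example the normal vector is ⟨−31, 23, −3⟩ and d = 44, which is the equation 23y − 31x − 3z + 44 = 0 found above. If the three points are collinear, the cross product is the zero vector and no single plane is determined.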
We can define the angle between two planes using their normal vectors as shown in
Figure 1.15.
Definition 1.3.1. The angle between two planes is defined as the acute angle between the normal
vectors of the two planes. Two planes are considered to be perpendicular if their normal vectors are
orthogonal.
So the acute angle between the given planes is cos⁻¹(0.73855) ≈ 42°.
Example 1.3.10. Find a formula for the perpendicular distance D from the point P(x0 , y0 , z0 ) to the
plane ax + by + cz + d = 0.
Solution. The vector n⃗ = ⟨a, b, c⟩ is a normal vector of the plane. Let P1(x1, y1, z1) be
any point in the plane, and let θ be the angle between n⃗ and the vector P1P. Then, as
shown in Figure 1.16, the distance D from P to the plane is
D = |P1P| |cos θ|.
Thus,
D = |P1P| |cos θ| = |P1P| ⋅ |cos θ| ⋅ |n⃗| / |n⃗| = |P1P ⋅ n⃗| / |n⃗|
= |a(x0 − x1) + b(y0 − y1) + c(z0 − z1)| / √(a² + b² + c²)
= |ax0 + by0 + cz0 − (ax1 + by1 + cz1)| / √(a² + b² + c²)
= |ax0 + by0 + cz0 + d| / √(a² + b² + c²), (1.20)
where the last step uses ax1 + by1 + cz1 = −d, since P1 lies in the plane.
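Formula (1.20) translates directly into code. The sketch below uses a hypothetical function name and, as sample data, the point (5, 0, 0) and the plane 2x + 4y − 4z − 3 = 0 that appear in Example 1.3.11.

```python
import math


def point_plane_distance(p, a, b, c, d):
    """Distance from point p = (x0, y0, z0) to the plane ax + by + cz + d = 0."""
    return abs(a * p[0] + b * p[1] + c * p[2] + d) / math.sqrt(a * a + b * b + c * c)


D = point_plane_distance((5, 0, 0), 2, 4, -4, -3)
print(D)  # |10 - 3| / 6
```

Note that the plane must be written with all terms on one side, so 2x + 4y − 4z = 3 is entered with d = −3.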
Example 1.3.11. Find the distance between the two parallel planes x + 2y − 2z = 5 and 2x + 4y − 4z = 3.
Solution. The two planes are parallel to each other since their normal vectors ⟨1, 2, −2⟩
and ⟨2, 4, −4⟩ are parallel. In order to find the distance D between the two planes, we
can, instead, find the distance from any point in one plane to the other plane. For
example, we can put y = z = 0 in the equation of the first plane, to get x = 5, so
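(5, 0, 0) is a point in the first plane. Using formula (1.20) from Example 1.3.10 with the
second plane written as 2x + 4y − 4z − 3 = 0, we obtain
D = |2(5) + 4(0) − 4(0) − 3| / √(2² + 4² + (−4)²) = 7/6.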
The intersection of two planes that are not parallel is of course a line. So a line L
can be described as the line of intersection of two planes in the form
L: { A1x + B1y + C1z = D1,
   { A2x + B2y + C2z = D2. (1.21)
This is a general equation of the line L. The symmetric equations of a line are an exam-
ple of this form. There will, of course, be infinitely many possible choices for the two
planes that intersect in a given line L.
Example 1.3.12. Rewrite the line L determined by the equations below in the form of parametric equa-
tions and then in the form of symmetric equations:
{ x + y − z = 1,
{ 2x + y + 3z = 4.
Solution. First of all, we find a point on the line by choosing z = 0 and solving the
equations for x and y,
{ x + y = 1,
{ 2x + y = 4,
obtaining x = 3 and y = −2. Therefore, the point (3, −2, 0) lies on line L. Note that the
direction vector v of line L is perpendicular to both normal vectors of the given planes,
so it is given by the cross product
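              | i⃗   j⃗   k⃗ |
v = n1 × n2 = | 1   1  −1 | = 4i⃗ − 5j⃗ − k⃗.
              | 2   1   3 |
Hence, parametric equations of L are x = 3 + 4t, y = −2 − 5t, z = −t, and symmetric
equations of L are
(x − 3)/4 = (y + 2)/(−5) = z/(−1).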
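The two steps of Example 1.3.12 (find one point with z = 0, then take the cross product of the normals) can be sketched as follows. The function name is hypothetical, and the code assumes that setting z = 0 gives a solvable 2 × 2 system; otherwise a different variable would have to be set to 0.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def intersection_line(n1, d1, n2, d2):
    """Line of intersection of the planes n1 . x = d1 and n2 . x = d2.

    Returns (point, direction). The point is found by setting z = 0 and
    solving the remaining 2x2 system with Cramer's rule.
    """
    direction = cross(n1, n2)
    det = n1[0] * n2[1] - n1[1] * n2[0]  # equals the z-component of the direction
    if det == 0:
        raise ValueError("z = 0 does not work here; set x or y to 0 instead")
    x = (d1 * n2[1] - n1[1] * d2) / det
    y = (n1[0] * d2 - d1 * n2[0]) / det
    return (x, y, 0.0), direction


point, direction = intersection_line((1, 1, -1), 1, (2, 1, 3), 4)
print(point, direction)
```

For the planes x + y − z = 1 and 2x + y + 3z = 4 this gives the point (3, −2, 0) and the direction ⟨4, −5, −1⟩ used in the symmetric equations.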
r − r0 = λa + μb.
r = r0 + λa + μb (1.22)
{ x = x0 + λa1 + μb1,
{ y = y0 + λa2 + μb2, (1.24)
{ z = z0 + λa3 + μb3.
These are parametric equations of the plane. Note that this can be written in the form
Example 1.3.13. Rewrite the equation of the plane 2x − y − 3z = 10 in a vector form r ⃗ = r0⃗ + λa⃗ + μb.⃗
Solution. We must find a position vector r0⃗ whose terminal point is a point in the
plane and two nonparallel vectors a⃗ and b⃗ which are both parallel to the plane. To find
three such vectors, we find three points in the plane. It is easy to check that (5, 0, 0),
(0, −10, 0), and (2, 0, −2) are three points in the plane and, therefore,
a⃗ = ⟨0, −10, 0⟩ − ⟨5, 0, 0⟩ = ⟨−5, −10, 0⟩ and b⃗ = ⟨2, 0, −2⟩ − ⟨5, 0, 0⟩ = ⟨−3, 0, −2⟩
are two vectors parallel to the plane, and a⃗ is not parallel to b⃗. Thus, a vector
equation of the plane is given by
r⃗ = ⟨5, 0, 0⟩ + λ⟨−5, −10, 0⟩ + μ⟨−3, 0, −2⟩.
Also, we can solve the equation 2x − y − 3z = 10 to find a general solution. For example,
we let y = λ and z = μ be two free variables. Then a general solution to the equation is
⟨x, y, z⟩ = ⟨(10 + λ + 3μ)/2, λ, μ⟩ = ⟨5, 0, 0⟩ + λ⟨1/2, 1, 0⟩ + μ⟨3/2, 0, 1⟩.
{ x = x0 + mt,
{
{ y = y0 + nt,
{
{ z = z0 + pt,
where (x0 , y0 , z0 ) is a point on the line and ⟨m, n, p⟩ is the direction of the line. We can
rewrite this in a vector form
r = r0 + vt
with r = ⟨x, y, z⟩, r0 = ⟨x0 , y0 , z0 ⟩, and v = ⟨m, n, p⟩ is the direction vector. This can be
written as
Example 1.4.1 (A helix). The graph of the vector-valued function r(t) = 2 cos ti + 2 sin tj + 0.5tk, t ≥ 0,
is called a helix. The curve is shown in Figure 1.18.
1.4 Curves and vector-valued functions | 31
Example 1.4.2 (Slinky curve). A slinky curve is defined as r(t) = ⟨a(t) cos t, a(t) sin t, 1.2 sin 20t⟩. The
graph of the curve when a(t) = 5 + cos 20t and 0 ≤ t ≤ 2π is shown in Figure 1.19.
Sometimes it is helpful to visualize a curve in space by projecting the curve onto one
of the coordinate planes. If a curve has the vector equation r(t) = ⟨x(t), y(t), z(t)⟩, then
its view from above is its projection onto the xy-plane, and when it is projected, its
x- and y-coordinates remain unchanged, but the z-coordinate becomes 0. Thus, the
projection of the curve onto the xy-plane has the equation
r(t) = ⟨x(t), y(t), 0⟩.
In Example 1.4.1, the projection of the helix onto the xy-plane has the equation
r(t) = ⟨2 cos t, 2 sin t, 0⟩.
Similarly, to obtain an equation for the projection of the curve r(t) = ⟨x(t), y(t), z(t)⟩
onto the xz-plane, we set the y-coordinate to be 0. To obtain an equation for the
projection of the curve r(t) = ⟨x(t), y(t), z(t)⟩ onto the yz-plane, we set the x-coordinate
to be 0. For instance, the projection of the curve r(t) = ⟨2 cos t, 2 sin t, 0.5t⟩ onto the
yz-plane has an x-coordinate equal to 0, giving
r(t) = ⟨0, 2 sin t, 0.5t⟩.
We can also define the limit of a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩ at a point
t0 . Similar to a scalar function, if t → t0 implies r(t) → L, then we say limt→t0 r(t) = L,
where L = ⟨a, b, c⟩ is a constant vector. More precisely, it is defined as follows.
Definition 1.5.1. Let L = ⟨a, b, c⟩ be a constant vector and r(t) = ⟨x(t), y(t), z(t)⟩ be a vector-valued
function. Then limt→t0 r(t) = L if and only if for any given ε > 0, there is a number δ > 0 such that
|r(t) − L| < ε whenever 0 < |t − t0| < δ.
Since
|r(t) − L| = √((x(t) − a)² + (y(t) − b)² + (z(t) − c)²) < ε,
using the above definition and applying the limit laws for scalar functions, we have
the following theorem.
Theorem 1.5.1. Let L = ⟨a, b, c⟩ be a constant vector and let r(t) = ⟨x(t), y(t), z(t)⟩ be a vector-valued
function. Then limt→t0 r(t) = L if and only if limt→t0 x(t) = a, limt→t0 y(t) = b, and limt→t0 z(t) = c.
Therefore, limt→t0 r(t) = ⟨limt→t0 x(t), limt→t0 y(t), limt→t0 z(t)⟩. That is, to evaluate the
limit of a vector-valued function, we evaluate the limit of each component of the func-
tion, given that all limits exist.
Solution.
(a) We have limt→0 r(t) = ⟨limt→0 (1 − cos t)/t², limt→0 e^(−t), limt→0 tan⁻¹ t⟩
= ⟨limt→0 (sin t)/(2t), 1, 0⟩ = ⟨1/2, 1, 0⟩.
(b) We have limt→∞ r(t) = ⟨limt→∞ (1 − cos t)/t², limt→∞ e^(−t), limt→∞ tan⁻¹ t⟩ = ⟨0, 0, π/2⟩.
Intuitively, we know that if each component of r(t) is continuous, then the curve r(t)
must be continuous, which means that you can draw the curve continuously, without
lifting your pencil. The formal definition of continuity is given below.
Definition 1.5.2. A vector-valued function r(t) is continuous at t0 if and only if limt→t0 r(t) = r(t0 ).
Definition 1.5.3. If x(t), y(t), and z(t) are three differentiable functions on the interval (a, b), then the
derivative of the vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩ is
r′(t) = ⟨x′(t), y′(t), z′(t)⟩.
In light of the above definition, we are now able to derive an equation for the tangent
line to the curve r(t) at any point t = t0. Since the curve at the point (x(t0), y(t0), z(t0)) has
tangent vector r′(t0) = ⟨x′(t0), y′(t0), z′(t0)⟩, the symmetric equations of the tangent
line, provided r′(t0) ≠ 0, are
(x − x(t0))/x′(t0) = (y − y(t0))/y′(t0) = (z − z(t0))/z′(t0). (1.26)
Parametric equations of the tangent line at t = t0 are
x = x(t0) + x′(t0)t, y = y(t0) + y′(t0)t, z = z(t0) + z′(t0)t,
where r′(t0) is a tangent vector. The unit tangent vector at t = t0 is T = r′(t0)/|r′(t0)|.
Note that the plane passing through the curve at t = t0 with a normal vector parallel
to the tangent vector to the curve at t = t0 is the normal plane to the curve at t = t0,
as shown in Figure 1.20(b). The normal plane to the curve at t = t0 has the equation
x′(t0)(x − x(t0)) + y′(t0)(y − y(t0)) + z′(t0)(z − z(t0)) = 0.
Example 1.5.2. Find an equation for the tangent line and normal plane to the curve
r(t) = ⟨sin t, cos t, sin 2t⟩ at t = π/6.
Solution. The point is (sin π/6, cos π/6, sin(2 × π/6)) = (1/2, √3/2, √3/2), and since
r′(t) = ⟨cos t, −sin t, 2 cos 2t⟩, the tangent vector at t = π/6 is
r′(π/6) = ⟨√3/2, −1/2, 1⟩.
So, the parametric equations for the desired tangent line are
{ x = 1/2 + (√3/2)t,
{ y = √3/2 − (1/2)t,
{ z = √3/2 + t.
An equation for the normal plane at t = π/6 is
(√3/2)(x − 1/2) − (1/2)(y − √3/2) + (z − √3/2) = 0.
Figure 1.21 shows the tangent line and normal plane at t = π/6.
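The tangent line and normal plane of Example 1.5.2 can be checked numerically. In the sketch below, the function names are hypothetical and the derivative is entered by hand as ⟨cos t, −sin t, 2 cos 2t⟩.

```python
import math


def r(t):
    return (math.sin(t), math.cos(t), math.sin(2 * t))


def r_prime(t):
    # Componentwise derivative of r(t)
    return (math.cos(t), -math.sin(t), 2 * math.cos(2 * t))


t0 = math.pi / 6
point = r(t0)          # point of tangency
tangent = r_prime(t0)  # tangent vector at t0


def tangent_line(s):
    """Point on the tangent line for parameter value s."""
    return tuple(p + s * v for p, v in zip(point, tangent))


def normal_plane(x, y, z):
    """Left-hand side of the normal-plane equation; zero for points in the plane."""
    return sum(v * (q - p) for v, q, p in zip(tangent, (x, y, z), point))


print(point, tangent)
```

The tangent vector comes out as approximately ⟨0.866, −0.5, 1⟩, that is, ⟨√3/2, −1/2, 1⟩, and normal_plane vanishes at the point of tangency, as it should.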
By using the above definition of the derivative for a vector-valued function, we can
deduce the following theorem, the proof of which is omitted here.
Theorem 1.5.2. Let u(t) and v(t) be two differentiable vector-valued functions and f(t) be a differentiable
scalar-valued function over a < t < b. Let c be a constant vector. Then at any t in (a, b), we
have:
1. d/dt (c) = 0,
2. d/dt (u(t) ± v(t)) = d/dt u(t) ± d/dt v(t) (sum or difference rule),
3. d/dt (f(t)u(t)) = (d/dt f(t))u(t) + f(t) d/dt u(t) (scalar multiple rule),
4. d/dt u(f(t)) = u′(f(t))f′(t) (chain rule),
5. d/dt (u(t) ⋅ v(t)) = u′(t) ⋅ v(t) + u(t) ⋅ v′(t) (dot product rule),
6. d/dt (u(t) × v(t)) = u′(t) × v(t) + u(t) × v′(t) (cross product rule).
Similar to scalar functions, if R′(t) = r(t), then we say R(t) is an antiderivative of r(t),
and we write the indefinite integral of r(t) as
∫ r(t) dt = R(t) + C,
where C is an arbitrary constant vector. For definite integrals, we write ∫_a^b r(t) dt =
R(b) − R(a). In light of the previous definition for derivative, we have the following
formal definition using the components of r(t).
Definition 1.5.4. If r(t) = ⟨x(t), y(t), z(t)⟩ is continuous for a ≤ t ≤ b, then we define
∫_a^b r(t) dt = ⟨∫_a^b x(t) dt, ∫_a^b y(t) dt, ∫_a^b z(t) dt⟩.
Solution.
1. Since r′(t) = ⟨e^(2t), sec² t, sin t⟩,
r(t) = ∫ r′(t) dt = ⟨(1/2)e^(2t) + c1, tan t + c2, −cos t + c3⟩
= ⟨(1/2)e^(2t), tan t, −cos t⟩ + ⟨c1, c2, c3⟩.
Setting t = 0 gives
⟨1, 1, 2⟩ = ⟨(1/2)e⁰, tan 0, −cos 0⟩ + ⟨c1, c2, c3⟩,
so ⟨c1, c2, c3⟩ = ⟨1/2, 1, 3⟩.
As seen before, s, the arc length, or length of a plane curve ⟨x(t), y(t)⟩ for a ≤ t ≤ b, is
s = ∫_a^b √([x′(t)]² + [y′(t)]²) dt.
The analog for a curve in space is the length of a curve r(t) = ⟨x(t), y(t), z(t)⟩ for a ≤
t ≤ b, which is
s = ∫_a^b √([x′(t)]² + [y′(t)]² + [z′(t)]²) dt,
provided that the integrand is integrable. The integrand is always integrable when the
curve is smooth, that is, x′(t), y′(t), and z′(t) are continuous on [a, b].
Again, thinking of an object moving along the curve, the length of the curve is
indeed the distance traveled by the object over the time interval [a, b]. Since the derivative
of position with respect to time is the velocity, v(t), and the derivative of distance
traveled with respect to time t is the speed, we have
v(t) = r′(t) and ds/dt = |v(t)|.
It is not a surprise that the length of the curve is s = ∫_a^b |v(t)| dt.
We summarize this in the following definition.
Definition 1.5.5. If r′(t) is continuous, the curve r(t) is a smooth curve, and the length of this curve for
a ≤ t ≤ b is defined as
∫_a^b |r′(t)| dt = ∫_a^b √([x′(t)]² + [y′(t)]² + [z′(t)]²) dt.
Example 1.5.4. Find the length of the curve r(t) = ⟨3 cos t, 4 cos t, 5 sin t⟩ for 0 ≤ t ≤ 2π.
Solution. The length is
s = ∫_0^{2π} √(((3 cos t)′)² + ((4 cos t)′)² + ((5 sin t)′)²) dt
= ∫_0^{2π} √(9 sin² t + 16 sin² t + 25 cos² t) dt = ∫_0^{2π} 5 dt = 10π.
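The arc length integral of Example 1.5.4 can be approximated numerically as a cross-check. The sketch below uses the midpoint rule, which is a choice made here for simplicity and is not taken from the text; the function name is hypothetical. Since |r′(t)| = √(9 sin² t + 16 sin² t + 25 cos² t) = 5 for this curve, the result should be close to 10π.

```python
import math


def arc_length(r_prime, a, b, n=1000):
    """Midpoint-rule approximation of the integral of |r'(t)| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += math.sqrt(sum(c * c for c in r_prime(t)))
    return total * h


# r(t) = <3 cos t, 4 cos t, 5 sin t>, so r'(t) = <-3 sin t, -4 sin t, 5 cos t>
s = arc_length(lambda t: (-3 * math.sin(t), -4 * math.sin(t), 5 * math.cos(t)),
               0.0, 2 * math.pi)
print(s, 10 * math.pi)
```

For curves whose speed is not constant, the approximation improves as n grows.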
(a) x = R cos t, y = R sin t, 0 ≤ t ≤ 2π,
(b) x = R cos(u/2), y = R sin(u/2), 0 ≤ u ≤ 4π,
(c) x = R cos 2t, y = R sin 2t, 0 ≤ t ≤ π,
(d) x = R sin 3θ, y = R cos 3θ, 0 ≤ θ ≤ 2π/3.
They actually describe the same curve. In this case, it is a circle centered at (0, 0)
with radius R. The name of a parameter, of course, does not matter. However, how
the curve evolves as the parameter increases does make a difference. For example,
in (a), the circle is formed counterclockwise while in (d) it is formed clockwise. The
positive orientation of a curve is the direction in which the curve is generated as the
parameter increases. So the positive orientation of (a) is counterclockwise, while the
positive orientation of (d) is clockwise.
A curve may be parameterized in many ways, as shown above. In some ways, the
parameter may have a nice geometric interpretation. For example, in (a), at each point
on the circle, the corresponding value of the parameter t is exactly the angle (measured
in radians) formed by the corresponding radius and the positive x-axis. We now intro-
duce a very natural way for describing a curve where its parameter represents the arc
length. We first investigate the following curve:
x = 2 cos(t/2), y = 2 sin(t/2), for 0 ≤ t ≤ 4π.
The initial point is (2, 0). When t = π, the corresponding arc length is also π. When t =
2π, the corresponding arc length is also 2π. In general, the length of the interval [0, t] is
equal to the length of the curve generated. We say that the curve r(t) = ⟨2 cos(t/2), 2 sin(t/2)⟩
is parameterized by arc length. In this case we also write
r(s) = ⟨2 cos(s/2), 2 sin(s/2)⟩
s = ∫_a^t |r′(u)| du,
and so ds/dt = |r′(t)|. This means ds = |r′(t)| dt. Therefore, the change in t is equal to
the change in s if and only if |r′(t)| = 1. In particular, if the curve starts at r(a) and
|r′(t)| = 1 for all t, then when t = a, we have s = 0, and when t ≠ a, we have s = t − a.
1.5 Calculus of vector-valued functions | 39
(a) r(t) = ⟨sin t, 1, cos t⟩ for t ≥ 1 and (b) r(t) = ⟨t, t + 1, 6t⟩ for 0 ≤ t ≤ 12
use arc length as a parameter. If not, find a description that uses arc length as a parameter.
Solution.
1. For (a), r′(t) = ⟨cos t, 0, −sin t⟩, so |r′(t)| = √((cos t)² + 0² + (−sin t)²) = 1. Yes, it
uses arc length as a parameter.
2. For (b), r′(t) = ⟨1, 1, 6⟩, so |r′(t)| = √(1² + 1² + 6²) = √38 ≠ 1. No, it does not use arc
length as a parameter. Since
s = ∫_0^t |r′(u)| du = ∫_0^t √38 du = √38 t,
if we replace t by s/√38, the parameterized curve
r1(s) = ⟨s/√38, s/√38 + 1, 6s/√38⟩, 0 ≤ s ≤ 12√38,
uses arc length as a parameter.
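A quick numerical check of part (b): if a curve is parameterized by arc length, then moving the parameter by h along a straight line must move the point by exactly h. The sketch below, with hypothetical function names, compares the original parameter t with the new parameter s = √38 t.

```python
import math

ROOT38 = math.sqrt(38)


def r(t):
    return (t, t + 1, 6 * t)


def r1(s):
    """Arc-length version of r, obtained by substituting t = s / sqrt(38)."""
    return r(s / ROOT38)


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# For a straight line, a parameter step of 2 should move the point by 2
# under r1, but by 2*sqrt(38) under the original parameterization r.
for s in (0.0, 5.0, 20.0):
    print(s, distance(r1(s), r1(s + 2.0)), distance(r(s), r(s + 2.0)))
```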
Definition 1.5.6 (Curvature). If r(t) is a smooth curve and T is its unit tangent vector, then the curvature
κ of the curve r(t) is defined as
κ = |dT/ds|.
Because ds/dt = |r′(t)|, by using the chain rule, we have
κ = |dT/ds| = |dT/dt| / |ds/dt| = (1/|r′(t)|) |dT/dt|.
Intuitively speaking, since curvature measures the degree that a curve bends, a
straight line must have 0 curvature. At all points on a circle, the curvature would be the
same constant, and a smaller circle should have larger curvature. Let us use Definition
1.5.6 to verify this understanding.
Example 1.5.6. Find the curvature for the straight line r(t) = ⟨x0 + mt, y0 + nt, z0 + pt⟩.
Solution. Since dr/dt = ⟨m, n, p⟩ is constant, T = (1/√(m² + n² + p²))⟨m, n, p⟩. Then,
dT/dt = d/dt((1/√(m² + n² + p²))⟨m, n, p⟩) = 0.
Therefore, the curvature at any point on the line is κ = (1/|r′(t)|)|dT/dt| = 0. This agrees
with our intuition.
Example 1.5.7. Find the curvature for the circle r(t) = ⟨R cos t, R sin t⟩.
Solution. Since dr/dt = ⟨−R sin t, R cos t⟩,
T = (1/√((−R sin t)² + (R cos t)²))⟨−R sin t, R cos t⟩ = ⟨−sin t, cos t⟩.
Therefore,
κ = (1/|r′(t)|)|dT/dt| = (1/√((−R sin t)² + (R cos t)²))|dT/dt|
= (1/R)|d/dt⟨−sin t, cos t⟩| = (1/R)|⟨−cos t, −sin t⟩| = (1/R)√((−cos t)² + (−sin t)²) = 1/R.
The curvature is the same at each point on a circle, and a larger circle has a smaller
curvature.
In general, calculating the curvature using the definition involves many steps.
However, sometimes it is easier to calculate the curvature of a curve by using the fol-
lowing theorem, which can be derived using Theorem 1.5.2.
Theorem 1.5.3. Let r(t) be a twice differentiable smooth curve. The curvature of r(t) is then
κ = |r′(t) × r″(t)| / |r′(t)|³.
Example 1.5.8. Find the curvature of the curve y = x² at the point with greatest curvature.
Solution. Let x = t, y = t², and z = 0. Then the curve is r(t) = ⟨t, t², 0⟩. At each t,
r′(t) = ⟨1, 2t, 0⟩, r″(t) = ⟨0, 2, 0⟩, and r′(t) × r″(t) = ⟨0, 0, 2⟩.
So the curvature is
κ = |⟨0, 0, 2⟩| / |⟨1, 2t, 0⟩|³ = 2 / (√(1 + 4t²))³.
So at t = 0, which is the origin, the parabola y = x² has the greatest curvature, which
is 2.
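The curvature computation of Example 1.5.8 can be sketched in code. The sketch assumes the cross-product expression κ = |r′ × r″| / |r′|³, which is the form the example evaluates; the function names are hypothetical, and the derivatives are supplied by hand.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(sum(a * a for a in u))


def curvature(d1, d2):
    """Curvature from the vectors d1 = r'(t) and d2 = r''(t)."""
    return norm(cross(d1, d2)) / norm(d1) ** 3


def parabola_curvature(t):
    # r(t) = <t, t^2, 0>, so r'(t) = <1, 2t, 0> and r''(t) = <0, 2, 0>
    return curvature((1, 2 * t, 0), (0, 2, 0))


for t in (-1.0, 0.0, 0.5, 1.0):
    print(t, parabola_curvature(t))
```

The printed values are largest at t = 0, where the curvature equals 2, and they decrease symmetrically as |t| grows.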
Note. If a curve r(t) is parameterized by arc length, then |r′(t)| = 1, ds = dt, and so
dr/ds = T, and we have
Definition 1.5.7. Let r be a smooth curve. If dT/dt is not 0, then the principal unit normal vector N at a
point on the curve is defined to be
N = (dT/dt) / |dT/dt|.
All the solutions to the equation are points in the plane. Also, for any point in the
plane, its coordinates must satisfy the equation. In general, for an equation of three
variables,
F(x, y, z) = 0,
all its solutions are the set of points in space that form a surface, which is called the
graph of the equation.
Example 1.6.1. Find an equation for the sphere with radius R centered at P0 (x0 , y0 , z0 ).
Solution. Suppose P(x, y, z) is a point on the sphere. The distance between P and P0
must be R, that is,
|PP0 | = R,
√(x − x0 )2 + (y − y0 )2 + (z − z0 )2 = R,
(x − x0 )2 + (y − y0 )2 + (z − z0 )2 = R2 .
Example 1.6.2. Find the locus of points with equal distance from the two points A(1, 2, 3) and B(2, 1, −4).
Solution. If P(x, y, z) is any point with equal distance from A and from B, then |PA| =
|PB|, and this becomes
√((x − 1)² + (y − 2)² + (z − 3)²) = √((x − 2)² + (y − 1)² + (z + 4)²).
Squaring both sides and simplifying gives
2x − 2y − 14z − 7 = 0.
Note that squaring both sides of an equation, as above, can introduce extra solutions,
because A = B and A = −B both square to A2 = B2 . However, this does not happen here
because the square roots on both sides must be nonnegative, and this does not allow
one side to be negative.
1.6.2 Cylinder
Consider a plane in space again. A plane can be considered as the surface which is
formed by all lines that are parallel to a given line and pass through a given curve.
Or, in other words, the plane is formed by moving a line along a curve. This type of
surface is called a cylinder, as shown in Figure 1.24.
Definition 1.6.1. A cylinder is defined as a surface that consists of all lines (called rulings) that are
parallel to a given line and pass through a given curve.
We first consider the cases where all rulings are parallel to one of the coordinate axes.
Example 1.6.3 (Parabolic cylinder). Sketch the graph of the surface y² = 2x in three-dimensional
space and show that it is a cylinder.
Solution. Note that the equation of the graph y2 = 2x does not involve z. This means
that for any x0 and y0 satisfying this equation, there is a line of solutions (x0 , y0 , z)
for every possible z-value (that is, z is unrestricted by the equation). Furthermore, any
horizontal plane with equation z = k (parallel to the xy-plane) intersects the graph
in the same curve with equation y2 = 2x, a parabola. Figure 1.25(a) shows how the
graph is formed by moving a line parallel to the z-axis along the parabola y2 = 2x in
the xy-plane. This surface is called a parabolic cylinder. The graph can also be formed
by infinitely many shifted copies of the same parabola y2 = 2x along the z-axis.
1.6 Surfaces in space | 45
Figure 1.25: Examples of cylinders: parabolic cylinder, elliptic cylinder, hyperbolic cylinder.
Note.
1. In general, if one of the variables x, y, or z is missing from the equation of a sur-
face, then the surface is a cylinder with rulings parallel to the axis of the missing
variable.
2. It is useful to sketch surfaces in space by using the traces which are the intersec-
tion curves of the surface and planes parallel to one of the coordinate planes.
Example 1.6.4 (Elliptic cylinder). Identify and sketch in three-dimensional space the surfaces
(a) x² + 2y² = R² and (b) x² + z² = R².
Solution. (a) Since z is missing, this must be a cylinder with rulings parallel to the
z-axis. The graph of the equation x2 + 2y2 = R2 , for z = k (a constant), is an ellipse in
the plane z = k. Hence, the surface x2 + 2y2 = R2 is an elliptic cylinder whose rulings
are parallel to the z-axis and, so, are vertical (see Figure 1.25(b)).
(b) Similarly, x2 + z 2 = R2 is a circular cylinder whose rulings are parallel to the
y-axis and thus they are horizontal.
Example 1.6.5 (Hyperbolic cylinder). Identify and sketch the surface y² − z²/9 = 1.
Solution. Since x is missing, this is a cylinder with rulings parallel to the x-axis. All
the traces for constant x are hyperbolas. This is a hyperbolic cylinder whose graph is
shown in Figure 1.25(c).
Solution. (a) It is a cylinder with rulings parallel to the z-axis. The trace with z = 0 is
the curve x = sin y in the xy-plane. The cylinder is generated by moving this curve up
and down along the z-axis.
(b) It is a cylinder with rulings parallel to the y-axis. The trace with y = 0 is the
curve z = ln x in the xz-plane. The cylinder is generated by moving this curve left and
right along the y-axis.
A quadric surface is the graph of a second-degree polynomial equation with three vari-
ables, x, y, and z. The most general such equation is
where A, B, C, . . . , J are constants. By translation and rotation of the axes (in algebra:
completing the square and making a linear transformation) it is possible to bring this
equation into one of the two standard forms,
Ax² + By² + Cz² + D = 0 or Ax² + By² + Iz = 0,
where A, B, C, and I are nonzero (otherwise, the graphs are cylinders). The signs (pos-
itive or negative) of these constants and whether D is zero lead to the following list of
the types of quadric surfaces in three-dimensional space:
1. elliptic cone z²/c² = x²/a² + y²/b² (D = 0),
2. ellipsoid x²/a² + y²/b² + z²/c² = 1 (D ≠ 0),
3. hyperboloid of one sheet x²/a² + y²/b² − z²/c² = 1 (D ≠ 0),
4. hyperboloid of two sheets x²/a² − y²/b² − z²/c² = 1 (D ≠ 0),
5. elliptic paraboloid z = x²/a² + y²/b²,
6. hyperbolic paraboloid z = x²/a² − y²/b².
One needs to be aware that the same surfaces with different orientations are obtained
when the roles of the variables are interchanged.
Like conic sections in two-dimensional space, quadric surfaces admit similar
geometric and physical properties, which makes them useful in designing satellite
dishes, headlamps, mirrors in telescopes, cooling towers for nuclear power plants,
water tanks, and so forth.
Using traces, it is not hard to sketch the graph of these quadric surfaces. They are
summarized in Figure 1.26.
One type of special surface in space is obtained by revolving a curve about a line. For
example, if we revolve the plane curve y = x² about the x-axis, we obtain a surface of
revolution in space. How do we find an equation for this surface? We consider a more
general case. Suppose we have a curve f(y, z) = 0 in the yz-plane (this means x = 0),
and we rotate this curve about the z-axis, as shown in Figure 1.27. To find an equation
for the surface, we consider a point P(x, y, z) on the surface. The point P is obtained
by revolving the point P0(y0, z0) on the original curve f(y, z) = 0 about the z-axis. Note
that P and P0 have the same height above the xy-plane and the same distance
from the z-axis; therefore, we have
z0 = z and |y0| = √(x² + y²)
and, thus, (±√(x² + y²), z) must satisfy the equation f(y, z) = 0. Therefore, we have
f(±√(x² + y²), z) = 0.
This equation has three variables and is exactly an equation of the surface obtained
by rotating the curve f (y, z) = 0 in the yz-plane about the z-axis.
Example 1.6.7. Find an equation of the surface of revolution formed by revolving a straight line L in
the yz-plane with equation z = ay about the z-axis.
Solution. Replacing y with ±√(x² + y²) in the equation z = ay, we obtain
z = ±a√(x² + y²).
That gives
z² = a²(x² + y²).
This type of surface is called a circular cone. The graph of a circular cone is shown in
Figure 1.28.
f(y, ±√(x² + z²)) = 0
is the equation of the surface obtained by revolving the curve f (y, z) = 0 in the yz-plane
about the y-axis. Also, an equation for the surface obtained by revolving a curve in a
coordinate plane other than the yz-plane about one of the axes can be determined in
a similar manner.
1.7 Parameterized surfaces | 49
Example 1.6.8. Find an equation for each surface of revolution obtained by revolving the curve
x² + (4/9)y² = 1 in the xy-plane about (a) the x-axis and (b) the y-axis.
Solution. For (a), a rotation about the x-axis, keep x unchanged and replace y with
±√(y² + z²), yielding
x² + (4/9)(±√(y² + z²))² = 1.
This simplifies to
x² + (4/9)y² + (4/9)z² = 1,
which is an equation for the desired surface.
For (b), similarly, we keep y unchanged, and we replace x by ±√(x² + z²) in the
equation of the curve. We obtain
x² + z² + (4/9)y² = 1,
which is an equation of that surface of revolution. The graphs of these surfaces of
revolution are shown in Figure 1.29.
r(u, v) = r0 + ua + vb.
In general, the graph of the vector-valued function r(u, v) with two independent pa-
rameters u and v is a surface in space. Its parametric form is
Solution.
1. Let x = u and y = v. Then z = 2u + 4v − 5. Or
{ x = u,
{
{ y = v,
{
{ z = 2u + 4v − 5.
2. Let x = cos u and y = √12 sin u. Since the equation does not involve the variable z,
z could be any real number. Thus,
{ x = cos u,
{
{ y = √12 sin u,
{
{ z=v
is a parameterization.
4. For the cone, let x = u cos v, y = (1/2)u sin v, and z² = u². So
x = u cos v, y = (1/2)u sin v, and z = u
is a parameterization.
Example 1.8.1. For each of the following curves, find two surfaces so that the curve is their intersection
curve:
1. the line r(t) = ⟨2 − 3t, 4 + t, −2 − 5t⟩,
2. the helix r(t) = ⟨2 cos t, 2 sin t, 0.5t⟩.
1.8 Intersecting surfaces and projection curves | 51
Solution.
1. The line has symmetric equations
(x − 2)/(−3) = (y − 4)/1 = (z + 2)/(−5).
So the line is the line of intersection of the two planes x + 3y − 14 = 0 and 5y + z − 18 = 0.
2. The helix has parametric equations
x = 2 cos t, y = 2 sin t, z = 0.5t.
Therefore, x² + y² = 4 and t = z/0.5 = 2z, so x = 2 cos(2z). Both of them are cylinders.
Therefore, the helix is the curve of intersection of the two cylinders, and we have
x2 + y2 = 4,
{
x = 2 cos(2z).
Figure 1.30 shows the graphs of the cylinder x2 + y2 = 4 and the cylinder x = 2 cos(2z).
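That the helix really lies on both cylinders can be confirmed by sampling points and substituting them into the two equations. The following is a minimal sketch with hypothetical function names; the sample spacing 0.1 is an arbitrary choice.

```python
import math


def helix(t):
    return (2 * math.cos(t), 2 * math.sin(t), 0.5 * t)


def residuals(point):
    """How far a point is from satisfying each of the two cylinder equations."""
    x, y, z = point
    return (x * x + y * y - 4, x - 2 * math.cos(2 * z))


# Largest violation of either equation over 200 sample points on the helix
worst = max(abs(e) for k in range(200) for e in residuals(helix(0.1 * k)))
print(worst)  # should be at the level of floating-point rounding
```

A check of this kind only shows that the curve is contained in the intersection; it does not by itself show that the intersection contains nothing else.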
In general, the graphs F(x, y, z) = 0 and G(x, y, z) = 0 are surfaces in space, and
F(x, y, z) = 0,
{
G(x, y, z) = 0
describes the curve of intersection of the two surfaces. We call it a general equation of
the curve. Intuitively, if the system of equations has just one independent variable, say,
x, then y and z are dependent variables. Therefore, we have the parametric equations
x = x, y = y(x), and z = z(x), which describe a curve in space. For example,
z = x2 + 2y2 ,
{
z=3
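Example 1.8.2. Describe the curve given by the equations
{ z = x² + y²,
{ x² + y² + z² = 2.
Solution. Substituting x² + y² = z into the second equation gives
z² + z − 2 = 0.
Since z = x² + y² ≥ 0, the only admissible root is z = 1, so the curve is
{ z = 1,
{ x² + y² = 1.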
This curve is a circle in the z = 1 plane. It has parametric equations x = cos t, y = sin t,
and z = 1. This curve is the intersection curve of a paraboloid and a sphere with center
at the origin and radius √2, as shown in Figure 1.31(b).
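The elimination step in Example 1.8.2 can be mirrored in code: substituting x² + y² = z into the sphere gives the quadratic z² + z − 2 = 0, and only nonnegative roots are compatible with z = x² + y². The sketch below uses hypothetical function names.

```python
import math


def intersection_heights():
    """Roots of z^2 + z - 2 = 0 that are allowed by z = x^2 + y^2 >= 0."""
    disc = 1 + 8  # discriminant b^2 - 4ac with a = 1, b = 1, c = -2
    roots = [(-1 + math.sqrt(disc)) / 2, (-1 - math.sqrt(disc)) / 2]
    return [z for z in roots if z >= 0]


def circle_point(t):
    """Point on the claimed intersection curve x = cos t, y = sin t, z = 1."""
    return (math.cos(t), math.sin(t), 1.0)


print(intersection_heights())
x, y, z = circle_point(0.3)
print(z - (x * x + y * y), x * x + y * y + z * z - 2)  # both close to 0
```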
We were fortunate that in the previous example, there exist nice equations to describe
curves in space. However, in some cases, it might be hard to find a simple equation for
a space curve as the intersection curve of two surfaces. For example, consider
{ x + y + 2z = 0,
{ x² + y² + z² = 4
or
{ z = x² + 2y²,
{ x² + y² = 1.
We know that the first example is the intersection curve of a plane and a sphere. In-
tuitively, we know this is a circle in space. But for the second one, it might be hard to
visualize it. To study a curve like this, it would be helpful to view it from the top or
side. This means that we can study its projection curves onto one of the coordinate
planes. How can we find equations for those projection curves? We first consider the
case where we project the curve
F(x, y, z) = 0,
{
G(x, y, z) = 0
Similarly, if we want the projection curve on the xz- or yz-coordinate planes, we simply
eliminate the variable y or x from the simultaneous equations. This gives a cylinder
parallel to the y- or x-axis, respectively. The curve of intersection of the cylinder with
the xz- or yz-coordinate planes is the desired projection curve. So, finding projection
curves is not hard now.
Example 1.8.3. Find an equation for the projection curve of the intersecting curve of the plane
x + y + 2z = 0 and the sphere x 2 + y 2 + z 2 = 4:
1. onto the xy-plane,
2. onto the yz-plane.
Solution.
1. Onto the xy-plane, we eliminate the variable z to obtain
x² + y² + (−(x + y)/2)² = 4,
which simplifies to 5x² + 5y² + 2xy = 16. This is an elliptic cylinder, and the projection
curve
{ 5x² + 5y² + 2xy = 16,
{ z = 0
is an ellipse in the xy-plane.
2. Onto the yz-plane, we eliminate the variable x to obtain
(−y − 2z)² + y² + z² = 4,
which simplifies to 2y2 +5z 2 +4yz = 4. This is an elliptic cylinder, and the projection
curve
2y2 + 5z 2 + 4yz = 4,
{
x=0
is also an ellipse in the yz-plane. Figure 1.32 shows the graphs and projections.
x 2 + y 2 + z 2 = 4,
{
(x − 1)2 + y 2 = 1.
Solution. The curve is the intersection curve of a sphere and a circular cylinder. The
variable z is missing in the second equation, so there is no need to eliminate z because
the second equation is already a cylinder containing the curve C. So the projection
curve onto the xy-plane is a circle centered at (1, 0) with radius 1 (as in Figure 1.33)
given by
z = 0,
{
(x − 1)2 + y2 = 1.
Note. This curve has nice parametric equations. If x = 1 + cos t and y = sin t, then
z² = 4 − x² − y² = 4 − (1 + cos t)² − sin² t
= 2(1 − cos t)
= 4 sin²(t/2).
Thus, we can take z = 2 sin(t/2). A vector equation for this curve is, therefore,
r(t) = ⟨1 + cos t, sin t, 2 sin(t/2)⟩.
Setting the z-component to 0, we have the projection curve onto the xy-plane, i.e.,
r(t) = ⟨1 + cos t, sin t, 0⟩,
which is the same as (x − 1)² + y² = 1. The intersection curve of the cylinder (x − a)² + y² =
a² and the sphere x² + y² + z² = 4a² is called a Viviani curve, named after an Italian
mathematician. Figure 1.33 shows the top and side views of a Viviani curve.
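The parameterization of the Viviani curve can be verified by substitution. The sketch below scales the parameterization ⟨1 + cos t, sin t, 2 sin(t/2)⟩ by a general radius parameter a; that scaling is an assumption made here for illustration (a = 1 gives the curve of Example 1.8.4), and the function names are hypothetical.

```python
import math


def viviani(t, a=1.0):
    """Point on the Viviani curve; a = 1 corresponds to Example 1.8.4."""
    return (a * (1 + math.cos(t)), a * math.sin(t), 2 * a * math.sin(t / 2))


def residuals(point, a=1.0):
    """Violations of the sphere and cylinder equations at a point."""
    x, y, z = point
    sphere = x * x + y * y + z * z - 4 * a * a
    cylinder = (x - a) ** 2 + y * y - a * a
    return sphere, cylinder


for t in (0.0, 1.0, 2.5, 7.0):
    print(t, residuals(viviani(t)))  # both entries should be close to 0
```

The parameter t has to run over an interval of length 4π to trace the whole curve, because z = 2a sin(t/2) has period 4π.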
In some cases, we can also find projection curves onto some plane which is not
parallel to any of the coordinate planes.
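Example 1.8.5. Find the projection line of the line
\[ L: \begin{cases} 2x - y + z = 0, \\ x - y - 2z + 10 = 0 \end{cases} \]
onto the plane $y + 2z + 2 = 0$.
Solution. Every plane of the form
\[ (2x - y + z) + \lambda(x - y - 2z + 10) = 0 \]
contains L, where λ is any fixed number. Among all these planes that contain L, there must be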
exactly one plane which contains the projection line of L onto the plane y + 2z + 2 = 0.
This plane is the one which is perpendicular to the plane y + 2z + 2 = 0. This means
that the normal vectors of the two planes are perpendicular, i. e.,
⟨0, 1, 2⟩ ⋅ ⟨2 + λ, −1 − λ, 1 − 2λ⟩ = 0.
Therefore, we have $0(2 + \lambda) + 1(-1 - \lambda) + 2(1 - 2\lambda) = 0$ with solution $\lambda = \frac{1}{5}$. Hence, the plane containing L and the projection line is
\[ (2x - y + z) + \frac{1}{5}(x - y - 2z + 10) = 0, \quad\text{or}\quad 11x - 6y + 3z + 10 = 0. \]
The projection line is the line of intersection of this plane and the plane $y + 2z + 2 = 0$. So the projection line is
\[ \begin{cases} y + 2z + 2 = 0, \\ 11x - 6y + 3z + 10 = 0. \end{cases} \]
Figure 1.34 shows these planes and the projection line onto the plane y + 2z + 2 = 0 in blue.
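The arithmetic in this example can be reproduced with exact fractions. This is a minimal sketch, assuming the family of planes through L is written as (first plane) + λ(second plane) = 0; the names `plane1`, `plane2`, `target_normal`, and `dot3` are introduced here for illustration only.

```python
from fractions import Fraction

# Coefficients (a, b, c, d) of a x + b y + c z + d = 0 for the two planes defining L.
plane1 = (2, -1, 1, 0)      # 2x - y + z = 0
plane2 = (1, -1, -2, 10)    # x - y - 2z + 10 = 0
target_normal = (0, 1, 2)   # normal of the plane y + 2z + 2 = 0

def dot3(u, v):
    """Dot product of the first three components of u and v."""
    return sum(a * b for a, b in zip(u[:3], v[:3]))

# Perpendicularity: target_normal . (n1 + lam * n2) = 0, solved for lam.
lam = Fraction(-dot3(target_normal, plane1), dot3(target_normal, plane2))

# Coefficients of plane1 + lam * plane2, multiplied through to clear fractions.
coeffs = [lam.denominator * (a + lam * b) for a, b in zip(plane1, plane2)]
print(lam, coeffs)
```

Using `Fraction` keeps λ exact, so the resulting integer coefficients can be compared directly with the plane found in the solution.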
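\[ \begin{cases} z = x^2 + y^2, \\ z = \sqrt{2 - x^2 - y^2}. \end{cases} \]
Projecting this curve onto the xy-plane to obtain an equation of the projection curve gives
\[ z = 0 \quad\text{and}\quad x^2 + y^2 = 1, \]
so the projection region onto the xy-plane is
\[ z = 0 \quad\text{and}\quad x^2 + y^2 \le 1. \]
To project the region onto the xz-plane, we first note that the projection region of $z = x^2 + y^2$ onto the xz-plane is
\[ y = 0 \quad\text{and}\quad z \ge x^2, \]
and the projection region of $z = \sqrt{2 - x^2 - y^2}$ onto the xz-plane is
\[ y = 0 \quad\text{and}\quad z \le \sqrt{2 - x^2}. \]
Therefore, the projection region of R onto the xz-plane is the region bounded by $z = x^2$ and $z = \sqrt{2 - x^2}$ in the plane $y = 0$.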
1.10 Review
The main concepts discussed in this chapter are listed below.
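1. Vector operations of addition, subtraction, and scalar multiplication, and the dot product and cross product:
\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3, \qquad \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}. \]
lines: $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$, $\mathbf{r} = \langle x(t), y(t), z(t) \rangle$, or
\[ \begin{cases} a_1 x + b_1 y + c_1 z = d_1, \\ a_2 x + b_2 y + c_2 z = d_2, \end{cases} \]
or
\[ \frac{x - x_0}{m} = \frac{y - y_0}{n} = \frac{z - z_0}{p}, \]
or $x = x_0 + mt$, $y = y_0 + nt$, and $z = z_0 + pt$,
planes: $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$ or $ax + by + cz + d = 0$.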
where P1 is any point on the first line and P2 is any point on the second line.
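6. For a vector-valued function r(t):
(a) $\mathbf{r}'(t) = \langle x'(t), y'(t), z'(t) \rangle$, $\int \mathbf{r}(t)\,dt = \left\langle \int x(t)\,dt, \int y(t)\,dt, \int z(t)\,dt \right\rangle$,
(b) tangent line at $t = t_0$: $\mathbf{r} = \mathbf{r}(t_0) + t\,\mathbf{r}'(t_0)$,
(c) normal plane at $t = t_0$: $\mathbf{r}'(t_0) \cdot (\mathbf{r} - \mathbf{r}(t_0)) = 0$,
(d) length of curves: $s = \int_a^b |\mathbf{r}'(t)|\,dt$, for $a \le t \le b$,
(e) curvature: $\kappa = \left|\frac{d\mathbf{T}}{ds}\right| = \frac{1}{|\mathbf{r}'(t)|}\left|\frac{d\mathbf{T}}{dt}\right| = \frac{|\mathbf{r}'(t) \times \mathbf{r}''(t)|}{|\mathbf{r}'(t)|^3}$,
(f) the principal unit normal vector: $\mathbf{N} = \frac{d\mathbf{T}}{dt} \Big/ \left|\frac{d\mathbf{T}}{dt}\right|$,
(g) the unit binormal vector: $\mathbf{B} = \mathbf{T} \times \mathbf{N}$,
(h) torsion: $\tau = -\mathbf{N} \cdot \frac{d\mathbf{B}}{ds}$.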
7. The cylinders parallel to one of the axes: F(x, y) = 0, G(x, z) = 0, and H(y, z) = 0.
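8. Quadric surfaces:
\[ \frac{z^2}{a^2} = \frac{x^2}{b^2} + \frac{y^2}{c^2} \ \text{(elliptic cone)}, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \ \text{(ellipsoid)}, \]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \ \text{(hyperboloid of one sheet)}, \qquad z = \frac{x^2}{a^2} + \frac{y^2}{b^2} \ \text{(elliptic paraboloid)}, \]
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \ \text{(hyperboloid of two sheets)}, \qquad z = \frac{x^2}{a^2} - \frac{y^2}{b^2} \ \text{(hyperbolic paraboloid)}. \]
9. Surface of revolution: rotating the curve
\[ \begin{cases} f(y, z) = 0, \\ x = 0 \end{cases} \]
about the z-axis gives the surface $f(\pm\sqrt{x^2 + y^2}, z) = 0$.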
Similar results hold for curves in other coordinate planes rotated about one of the
axes.
10. Vector form of a plane: r = r0 + ua + vb.
11. Surfaces with vector parametric forms: r = ⟨x(u, v), y(u, v), z(u, v)⟩.
12. Finding the projections of curves by eliminating one of the variables x, y, or z.
1.11 Exercises
1.11.1 Vectors
(1, 2, 3), (−1, 0, 2), (0, 2, 0), (0, −1, 1), (2, −1, 2), (2, 0, 0).
2. Find the three points that are symmetrical to M0 (x0 , y0 , z0 ) about the x-axis, the
xz-plane, and the origin, respectively.
3. If $|\overrightarrow{AB}| = 11$, $A(4, -7, 1)$, and $B(6, 2, z)$, find z.
4. If $\vec{a} = \langle 5, 7, 8 \rangle$, $\vec{b} = \langle 3, -4, 6 \rangle$, and $\vec{c} = \langle -6, -9, -5 \rangle$, then find the length and direction angles of the vector $\vec{a} + \vec{b} + \vec{c}$.
5. If r = i − 2j − 2k, then find the unit vector in the direction of r. Also, find the three
direction cosines.
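6. Which of the following expressions make sense? Which do not make sense? Explain your answers.
(1) $(\vec{a} \cdot \vec{b}) \cdot \vec{c}$, (2) $(\vec{a} \cdot \vec{b})\vec{c}$, (3) $|\vec{a}|(\vec{b} \cdot \vec{c})$,
(4) $(\vec{a} + \vec{b}) \cdot \vec{c}$, (5) $\vec{a} + \vec{b} \cdot \vec{c}$, (6) $|\vec{a}|(\vec{b} + \vec{c})$.
7. Which of the following identities are true? Which are false? Explain your answers.
(1) $|\vec{a}|\vec{a} = \vec{a} \cdot \vec{a}$, (2) $(\vec{a} \cdot \vec{b})(\vec{a} \cdot \vec{b}) = (\vec{a} \cdot \vec{a})(\vec{b} \cdot \vec{b})$,
(3) $(\vec{a} \cdot \vec{b})\vec{c} = \vec{a}(\vec{b} \cdot \vec{c})$, (4) $(\vec{a} + \vec{b}) \cdot (\vec{a} + \vec{b}) = \vec{a} \cdot \vec{a} + 2\vec{a} \cdot \vec{b} + \vec{b} \cdot \vec{b}$.
8. Simplify the following expressions:
(1) $\vec{i} \times (\vec{j} + \vec{k}) - \vec{j} \times (\vec{i} + \vec{k}) + \vec{k} \times (\vec{i} + \vec{j} + \vec{k})$, (2) $(2\vec{a} + \vec{b}) \times (\vec{c} - \vec{a}) + (\vec{b} - \vec{c}) \times (\vec{a} + \vec{b})$.
9. Prove Theorem 1.2.4.
10. If $\vec{a} = 3\vec{i} - \vec{j} - 2\vec{k}$, $\vec{b} = \vec{i} + 2\vec{j} - \vec{k}$, and $\vec{c} = \vec{i} + \vec{j}$, find
(1) $\vec{a} \cdot \vec{b}$ and $\vec{a} \times \vec{b}$, (2) $\mathrm{Proj}_{\vec{b}}\,\vec{a}$, (3) $(-2\vec{a}) \cdot 3\vec{b}$,
(4) the angle between $\vec{a}$ and $\vec{b}$, (5) $(\vec{a} \times \vec{b}) \cdot \vec{c}$.
11. Prove that $\vec{a} \cdot (\vec{b} \times \vec{c}) = (\vec{a} \times \vec{b}) \cdot \vec{c}$.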
12. Prove that the four points A(2, −1, −2), B(1, 2, 1), C(2, 3, 0), and D(−1, 5, 4) are not
coplanar.
13. Prove the Cauchy-Schwarz inequality for three-dimensional vectors $\vec{a}$ and $\vec{b}$, i.e., $|\vec{a} \cdot \vec{b}| \le |\vec{a}||\vec{b}|$.
14. Use the projection method to show that in $\mathbb{R}^2$ the distance from a point $P(x_0, y_0)$ to the line $ax + by + c = 0$ is
\[ \frac{|ax_0 + by_0 + c|}{\sqrt{a^2 + b^2}}. \]
15. If r1 , r2 , and r3 are three nonzero position vectors, show that their heads are
collinear if and only if
r1 × r2 + r2 × r3 + r3 × r1 = 0.
5. Find the plane that passes through the point (1, 2, 1) and contains the line of inter-
section of the planes x − y + z = 2 and 2x − y − 2z = 1.
6. Find parametric equations of the line that passes through (4, 1, 3) and is parallel to the line
\[ \frac{x - 3}{2} = \frac{y}{1} = \frac{z - 1}{5}. \]
7. Find parametric equations of the line through (2, 1, 1) that intersects the line $\frac{x + 1}{3} = \frac{y + 3}{1} = \frac{z}{2}$ and is parallel to the plane $3x - 4y + z = 10$.
8. Find the angle between two planes 4x + 2y + 4z − 7 = 0 and 3x − 4y = 0.
9. Find the distance from the point A(1, 2, 3) to the line x = t, y = 4 − 3t, z = 3 − 2t.
find:
(a) an equation for its tangent line at $t = \frac{\pi}{2}$.
(b) an equation for its normal plane at $t = \frac{\pi}{2}$.
(c) an equation of the projection curve onto the xy-plane.
6. If $\mathbf{r}'(t) = \left\langle te^t,\ t\cos t^2,\ -\frac{2t}{\sqrt{t^2 + 4}} \right\rangle$ and $\mathbf{r}(0) = \langle 0, 2, 4 \rangle$, find $\mathbf{r}(t)$.
7. A particle is moving in space with velocity v(t) = ⟨3 cos t, 4 cos t, 5 sin t⟩. Initially
it starts at the origin. Find its position when t = π and the distance it traveled
during 0 ≤ t ≤ π.
8. Determine whether the curves
(1) $\mathbf{r}(t) = \cos t^2\,\mathbf{i} + \sin t^2\,\mathbf{j}$, (2) $\mathbf{r}(t) = 5\cos t\,\mathbf{i} + 3\sin t\,\mathbf{j} + 4\sin t\,\mathbf{k}$
use arc length as a parameter. If not, find a description that uses arc length as a
parameter.
9. Prove Theorem 1.5.3.
10. For each curve
(1) $\mathbf{r}(t) = 3\cos t\,\mathbf{i} + 3\sin t\,\mathbf{j} + 2t\,\mathbf{k}$ and (2) $\mathbf{r}(t) = e^t\cos t\,\mathbf{i} + e^t\sin t\,\mathbf{j} + e^t\,\mathbf{k}$,
find
(a) the unit tangent vector T at t = 0.
(b) the principal unit normal vector N at t = 0.
\[ \frac{|y''(x_0)|}{(1 + [y'(x_0)]^2)^{3/2}}, \]
the z-axis. Find the projection curve onto the xy-plane of the curve of intersection
of the surface S and the cone z = √x2 + y2 .
18. Find an equation of the cylinder consisting of all lines parallel to the direction ⟨2, 1, −1⟩ and passing through the curve
\[ \begin{cases} y^2 - 4x = 0, \\ z = 0. \end{cases} \]
19. Find the projection curve of
(1) $\begin{cases} z = xy, \\ x^2 + 2y^2 = 1 \end{cases}$ onto the xy-plane,
(2) $\mathbf{r}(t) = \langle 1 + t^2, \sin t, t \rangle$ onto the xz-plane,
(3) $z = x^2 + 2y^2$ and $x^2 + y^2 = 1$ onto (a) the xy-plane and (b) the yz-plane.
20. Find the projection regions of
(1) the paraboloid z = x2 + y2 ,
(2) the solid bounded by the cone z = √x2 + 2y2 and the sphere x 2 + y2 + z 2 = 1
onto the three coordinate planes.
21. Find an equation of the projection line of the line r(t) = ⟨2 + t, 3 − 2t, 4t⟩ onto the
plane x + y − z = 1.
22. Try to find an equation for the projection curve of the curve r(t) = ⟨2 cos t, 2 sin t, 4t⟩
onto the plane x − 2y + 3z = 2.
23. (Ruled surfaces) In geometry, a surface S is ruled (also called a scroll) if through
every point of S there is a straight line that lies on S. For example, a plane and a
circular cone are both ruled surfaces. A surface is doubly ruled if through every
point of S there are two distinct lines that lie on the surface.
Show that:
(a) the cylinder x2 + 2z 2 = 1 is ruled.
(b) the hyperbolic paraboloid z = x2 − y2 and the hyperboloid of one sheet x 2 +
y2 − z 2 = 1 are doubly ruled surfaces.
2 Functions of multiple variables
In single-variable calculus, a function depends on only one variable. However, in the
real world, physical quantities often depend on two or more variables. For example,
the volume V of a circular cone depends on its base radius r and height h, so it is a
function of two variables. The temperature T of a city in China depends on the time t,
the longitude x, and the latitude y of the city. The temperature T here is a function of
three variables. A smart person’s IQ may depend on his genes, thinking skills, edu-
cation, and so forth. It is a function of more than three variables. In this chapter, we
study multivariable functions and apply differential calculus to such functions.
We first start with functions of two variables. Similar to functions of one variable, a
real-valued function f of two variables is defined as follows.
Definition 2.1.1. A real function of two real variables is a rule f (also called a mapping or correspon-
dence) that assigns to each ordered pair of real numbers (x, y) in a set D ⊂ ℝ2 a unique number z ∈ ℝ.
We denote this rule by z = f (x, y). The set D is called the domain of f . The set {f (x, y)|(x, y) ∈ D} is
called the range of the function f . The variables x and y are called the independent variables and z is
called the dependent variable.
The domain of a function of two variables can have interior points and boundary
points, and it may be an open or closed region in the xy-plane, just the way the do-
mains of one-variable functions defined on subsets of the real number line can (see
Figure 2.1(b) and (c)). We give the following important definitions.
https://doi.org/10.1515/9783110674378-002
Figure 2.1: Functions of two variables, domain, range, boundary point, interior point.
Definition 2.1.2.
1. A point (x0 , y0 ) in a region (set) R in the xy-plane is an interior point of R if there exists a disk with
center (x0 , y0 ) that lies entirely in R.
2. A point (x0 , y0 ) is a boundary point of R if every disk centered at (x0 , y0 ) contains points that lie
outside of R as well as points that lie in R. The boundary point may or may not belong to R.
3. The interior points of a region make up the interior of the region.
4. The region’s boundary points make up its boundary.
5. A region is open if it consists entirely of interior points.
6. A region is closed if it contains all its boundary points.
7. A region in the plane is bounded if it lies inside a disk of fixed radius.
8. A region is unbounded if it is not bounded.
The structures of domains of two-variable functions can vary considerably from one
function to another, as shown in the following examples.
\[ f(x, y) = \sqrt{(4 - x^2 - y^2)(x^2 + y^2 - 1)} \]
has a maximal domain consisting of all (x, y) satisfying $(4 - x^2 - y^2)(x^2 + y^2 - 1) \ge 0$, which ensures that the square root exists. Solving this inequality gives
\[ D = \{(x, y) : 1 \le x^2 + y^2 \le 4\}. \]
One can easily draw a picture of this domain, which is an annulus (ring), as shown in Figure 2.2(a).
Example 2.1.2. Find the domain of the function $z = \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}} + \ln(x + y)$ ($a > 0$ and $b > 0$).
Solution. The function z is defined only for those pairs (x, y) satisfying $1 - \frac{x^2}{a^2} - \frac{y^2}{b^2} \ge 0$ and $x + y > 0$, so the domain for this function of two variables is
\[ D = \left\{ (x, y) \,\middle|\, \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1 \text{ and } x + y > 0 \right\}. \]
Figure 2.2: Domains of functions of two variables, Example 2.1.1, Example 2.1.2, and Example 2.1.3.
It is a region bounded by the elliptical curve and the line y = −x. This is neither an
open nor a closed region, as shown in Figure 2.2(b).
Example 2.1.3. Find the domain and range of the function z defined by
z(x, y) = √9 − x 2 − y 2 .
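Solution. The domain D of z is found by noticing that the expression under a square root cannot be negative, i.e.,
\[ D = \{(x, y) \mid 9 - x^2 - y^2 \ge 0\} = \{(x, y) \mid x^2 + y^2 \le 9\}, \]
which is the disk with center (0, 0) and radius 3. The range of z is
\[ \{z \mid z = \sqrt{9 - x^2 - y^2},\ (x, y) \in D\}. \]
Since z is a nonnegative square root, $z \ge 0$, and the domain restriction $0 \le x^2 + y^2 \le 9$ shows that
\[ 0 \le \sqrt{9 - x^2 - y^2} \le 3. \]
Hence the range is the interval $[0, 3]$.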
As seen in the previous chapter, the set of points (x, y, z) whose coordinates satisfy the equation z = f(x, y) forms a surface in space. This surface is called the graph of the function z = f(x, y). For instance, the graph of the function $z = 3x - y + 3$ is a plane, the graph of the function $z = \sqrt{a^2 - x^2 - y^2}$ is the upper hemisphere, and the graph of $z = x^2 - 3y^2$ is a hyperbolic paraboloid in space.
As seen in the previous chapter, it might be hard to visualize or sketch the graph of
a function of two variables, for example, z = x 2 −4y2 , which is a hyperbolic paraboloid.
However, we can use the ideas of contour curves and level curves to help visualize the
graph. One may already have an idea from the daily weather forecasts or topographic
mappings of a mountain, as shown in Figure 2.3.
Assume that you are walking on the surface of a mountain, and the surface is the graph
of a function z = f (x, y). If you walk along a path on which your elevation remains
constant, say, z0 , which is actually the height above the bottom plane of the mountain,
then the path is part of a contour curve which is the intersecting curve of the surface
z = f (x, y) and the plane z = z0 . When the contour curves are projected onto the
xy-plane, those projection curves are called level curves.
Example 2.1.4. Find and sketch the level curves of the following surfaces:
1. z = x 2 − 4y 2 ,
2. f (x, y) = x 4 + y 4 + 8xy.
Solution.
1. The level curves are described by the equations
\[ \begin{cases} z_0 = x^2 - 4y^2, \\ z = 0, \end{cases} \]
where z0 is a constant. For z0 = 0, the level curve is two straight lines x = ±2y
in the xy-plane. For all values of z0 ≠ 0, the level curves are hyperbolas in the
xy-plane. Setting z0 = 1, 4, 8, −1, −4, −8 enables us to obtain Figure 2.4.
Figure 2.5: Graph of $z = x^4 + y^4 + 8xy$ and its level curves.
2. We have not seen the graph of f (x, y) before. We use a graphing utility to help
sketch the surface and graph the level curves, as shown in Figure 2.5.
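For the first surface, points on each level curve can be generated without a graphing utility. The following is a small sketch under stated assumptions: the hyperbolic parametrization and the helper names `level_curve_point` and `f` are choices made here for illustration, not part of the original solution.

```python
import math

def f(x, y):
    """The function z = x^2 - 4y^2 from part 1 of the example."""
    return x * x - 4 * y * y

def level_curve_point(z0, s, branch=1):
    """Return a point (x, y) on the level curve x^2 - 4y^2 = z0, for z0 != 0.

    For z0 > 0 the hyperbola opens along the x-axis, for z0 < 0 along the
    y-axis; branch = +1 or -1 selects one of its two branches.
    """
    c = math.sqrt(abs(z0))
    if z0 > 0:
        return (branch * c * math.cosh(s), 0.5 * c * math.sinh(s))
    return (c * math.sinh(s), branch * 0.5 * c * math.cosh(s))

# Points on the level curves used in the text; each should reproduce its level.
for z0 in (1, 4, 8, -1, -4, -8):
    x, y = level_curve_point(z0, 0.7)
    print(z0, round(f(x, y), 12))
```

Feeding such points to any plotting tool gives hyperbolas like those in Figure 2.4; the level z0 = 0 is excluded because it degenerates into the two lines x = ±2y.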
Likewise, a function f of three variables x, y, and z is a rule that assigns to each ordered
triple (x, y, z) in a domain set D ⊂ ℝ3 a unique real number u, and we write
u = f (x, y, z)
The graph of a function of three variables is the set of those points (x, y, z, u) in
four-dimensional space where u = f (x, y, z)! Therefore, we cannot visualize it in three-
dimensional space. To get some sense of the graph, we could use an idea similar to
level curves. If we set u0 = f (x, y, z), for any constant u0 , the graph of u0 = f (x, y, z) is
a level surface in space. For example, the level surfaces of the function u = x 2 + y2 + z 2
are spheres in space.
Although most of the functions that we will work with in this textbook will be func-
tions of two or three variables, scientists, engineers, and mathematicians often need
to work with functions of four or more variables. Those functions are defined in a sim-
ilar way. In general, a function f of n variables is a rule that assigns a unique number
u to an n-tuple (x1 , x2 , . . . , xn ) ∈ ℝn of real numbers, and we write u = f (x1 , x2 , . . . , xn ).
Sometimes we use vector notation to write such functions more compactly. That is, if
the ordered n-tuple is considered to be a vector x = ⟨x1 , x2 , . . . , xn ⟩, then we write f (x)
in place of f (x1 , x2 , . . . , xn ), and we can then write the function compactly as u = f (x),
or u = f (P), for P ∈ ℝn .
2.1.4 Limits
Limits for functions of two variables are required to develop the calculus of functions
of two variables. They can often be interpreted in much the same way as we interpret
limits of functions of one variable. For example, the statement
\[ \lim_{(x,y)\to(2,-1)} 3x^2 y = -12 \]
means that the value of $3x^2 y$ gets closer and closer to −12 as (x, y) gets closer and closer to (2, −1). It may seem obvious that if (x, y) is close to (2, −1), then x is close to 2 and y is close to −1, so the value of $3x^2 y$ is close to $3(2^2)(-1) = -12$. However, we will soon
see that the limit of a function of two variables is not always this clear. Vague phrases
like “closer and closer to” can be hard to interpret in some circumstances, and we can
avoid most of these difficulties by defining the limit more precisely.
Definition 2.1.3. Let D be a region in the xy-plane and (a, b) be an interior point of D. Let f be a real-
valued function of two variables defined on D except possibly at (a, b). We say that a real number L is
the limit of f as (x, y) approaches (a, b), and we write
\[ \lim_{(x,y)\to(a,b)} f(x, y) = L \]
if for every number ε > 0 there is a number δ > 0 (the value of δ depends on ε) such that
\[ |f(x, y) - L| < \varepsilon \]
for all (x, y) ∈ D satisfying $0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta$.
We also write f (x, y) → L (meaning f (x, y) approaches L) as (x, y) → (a, b) (meaning (x, y) approaches
(a, b)).
Note that the set of points (x, y) satisfying the condition $0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta$ forms a punctured open disk with center (a, b) and radius δ (“punctured” means one point, the center (a, b), is excluded and “open” means the boundary circle is not included). This punctured disk is sometimes denoted as $\mathring{U}((a, b), \delta)$. Note that we can relax the requirement that the point (a, b) be in the interior of D as long as for every δ > 0 the punctured disk $\mathring{U}((a, b), \delta)$ contains elements of D where f is defined. For example, $\lim_{(x,y)\to(0,0)} \sqrt{x + y}$ exists, even though $\sqrt{x + y}$ is not defined for $x + y < 0$.
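we have $|x - 1| < \frac{\varepsilon}{8}$ and $|y - 2| < \frac{\varepsilon}{8}$, and then
\[ |(2x + 4y) - 10| = |2(x - 1) + 4(y - 2)| \le 2|x - 1| + 4|y - 2| \le 2 \cdot \frac{\varepsilon}{8} + 4 \cdot \frac{\varepsilon}{8} < \varepsilon. \]
So, by the definition, we conclude that $\lim_{(x,y)\to(1,2)} (2x + 4y) = 10$.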
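A quick numerical experiment can make the ε-δ argument concrete. The sketch below assumes the choice δ = ε/8, which is what the bounds |x − 1| < ε/8 and |y − 2| < ε/8 suggest; the function name `check_delta` and the sampling scheme are illustrative only, and random sampling supports the inequality without proving it.

```python
import math
import random

def check_delta(eps, trials=10_000, seed=0):
    """Sample points with 0 < dist((x, y), (1, 2)) < delta = eps / 8 and
    return the largest observed value of |(2x + 4y) - 10|."""
    rng = random.Random(seed)
    delta = eps / 8
    worst = 0.0
    for _ in range(trials):
        r = delta * rng.uniform(0.001, 0.999)   # radius strictly between 0 and delta
        theta = rng.uniform(0.0, 2 * math.pi)   # direction of approach
        x, y = 1 + r * math.cos(theta), 2 + r * math.sin(theta)
        worst = max(worst, abs((2 * x + 4 * y) - 10))
    return worst

for eps in (1.0, 0.1, 1e-3):
    print(eps, check_delta(eps))   # the second number should stay below eps
```

In every run the worst observed error stays below ε, in line with the estimate 2|x − 1| + 4|y − 2| < ε.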
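Example 2.1.6. Assume $f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}}$. Show that
\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0. \]
Proof. The domain of the function f(x, y) is $D = \mathbb{R}^2 \setminus \{(0, 0)\}$. Since $|2xy| \le x^2 + y^2$, it follows that
\[ |f(x, y) - 0| = \left| \frac{xy}{\sqrt{x^2 + y^2}} - 0 \right| = \frac{|xy|}{\sqrt{x^2 + y^2}} \le \frac{1}{2} \cdot \frac{x^2 + y^2}{\sqrt{x^2 + y^2}} = \frac{\sqrt{x^2 + y^2}}{2}. \]
Therefore, for any ε > 0, we can ensure that |f(x, y) − 0| < ε if we choose δ = 2ε. This is because when
\[ 0 < \sqrt{x^2 + y^2} < \delta = 2\varepsilon \]
we have
\[ |f(x, y) - 0| \le \frac{\sqrt{x^2 + y^2}}{2} < \frac{2\varepsilon}{2} = \varepsilon. \]
That is, whenever the point $P(x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}$ and $|OP| < \delta = 2\varepsilon$, we always have
\[ |f(x, y) - 0| < \varepsilon. \]
Thus,
\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0. \]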
Note. The function f (x, y) is not defined when (x, y) = (0, 0); however, a limit can exist
at a point where the function is not defined.
The definitions and properties of limits of functions of two variables are very sim-
ilar to those of one-variable functions, and can be extended in a very similar way to
functions of more variables.
Theorem 2.1.1 (Limit laws). Suppose that lim(x,y)→(a,b) f (x, y) = L, lim(x,y)→(a,b) g(x, y) = M, where L
and M are real numbers. Then:
1. lim(x,y)→(a,b) (f (x, y) ± g(x, y)) = L ± M (sum/difference rule),
2. lim(x,y)→(a,b) f (x, y)g(x, y) = LM (product rule),
3. $\lim_{(x,y)\to(a,b)} \frac{f(x, y)}{g(x, y)} = \frac{L}{M}$ (quotient rule, given that M ≠ 0).
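Example 2.1.7. Show that $\lim_{(x,y)\to(0,2)} \frac{\sin xy}{x} = 2$.
Solution. We know the one-variable limit $\lim_{x\to 0} \frac{\sin x}{x} = 1$, and we can use this here by considering the product xy to be a single variable such that xy → 0 as (x, y) → (0, 2), as follows:
\[ \lim_{(x,y)\to(0,2)} \frac{\sin xy}{x} = \lim_{(x,y)\to(0,2)} \left( \frac{\sin xy}{xy} \cdot y \right) = 1 \cdot 2 = 2. \]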
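Example 2.1.8. Evaluate $\lim_{(x,y)\to(0,1)} \frac{3 - \sqrt{xy + 9}}{xy}$.
Solution. Let t = xy, so that t → 0 as (x, y) → (0, 1). Then
\[ \lim_{(x,y)\to(0,1)} \frac{3 - \sqrt{xy + 9}}{xy} = \lim_{t\to 0} \frac{3 - \sqrt{t + 9}}{t} = \lim_{t\to 0} \frac{(3 - \sqrt{t + 9})(3 + \sqrt{t + 9})}{t(3 + \sqrt{t + 9})} \]
\[ = \lim_{t\to 0} \frac{-t}{t(3 + \sqrt{t + 9})} = \lim_{t\to 0} \frac{-1}{3 + \sqrt{t + 9}} = -\frac{1}{6}. \]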
For functions of two variables, the situation is different. This is because we can
let (x, y) approach (a, b) from an infinite number of directions in any manner so long
as (x, y) stays within the domain of f , as shown in Figure 2.6(b). The existence of the
limit lim(x,y)→(a,b) f (x, y) means that f (x, y) approaches the same value no matter in
what direction (x, y) approaches (a, b). Therefore, if there are two different routes for
(x, y) → (a, b) along which the function f (x, y) approaches different values, then we
can conclude that the limit lim(x,y)→(a,b) f (x, y) does not exist.
Example 2.1.9. Investigate whether or not the limit $\lim_{(x,y)\to(0,0)} f(x, y)$ exists when f is defined in two parts by
\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{when } (x, y) \ne (0, 0), \\ 0 & \text{when } (x, y) = (0, 0). \end{cases} \]
Solution. It is easy to check that if (x, y) → (0, 0) along the x-axis, then f (x, y) → 0.
If (x, y) → (0, 0) along the y-axis, then f (x, y) → 0. Now, we have obtained identical
limits along the two axes. However, this does not show that the given limit is 0. If we
let (x, y) approach (0, 0) along the line y = kx, we have
\[ \lim_{\substack{(x,y)\to(0,0) \\ y = kx}} \frac{xy}{x^2 + y^2} = \lim_{x\to 0} \frac{kx^2}{x^2 + k^2 x^2} = \frac{k}{1 + k^2}. \]
Obviously, this limit varies with different values of k. So the limit lim(x,y)→(0,0) f (x, y)
does not exist.
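The dependence on the slope k is easy to see numerically. This is a minimal sketch, with the helper names `f` and `along_line` chosen here for illustration; it simply evaluates the function at points on the line y = kx that get closer and closer to the origin.

```python
def f(x, y):
    """xy / (x^2 + y^2) away from the origin."""
    return x * y / (x * x + y * y)

def along_line(k, x):
    """Value of f at the point (x, kx) on the line y = kx, for x != 0."""
    return f(x, k * x)

for k in (0, 1, 2, -3):
    values = [along_line(k, x) for x in (1e-1, 1e-4, 1e-8)]
    print(k, values, k / (1 + k * k))
```

Along each line the values do not change as x shrinks, and they agree with k/(1 + k²), so different lines give different limiting values.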
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^4 + y^2}. \]
Solution. Allowing (x, y) to approach (0, 0) along any line y = kx, the limit is
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^4 + y^2} = \lim_{x\to 0} \frac{kx^3}{x^4 + k^2 x^2} = \lim_{x\to 0} \frac{kx}{x^2 + k^2} = 0. \]
However, this is not sufficient to prove the existence of the limit, even though we have infinitely many paths along which the limits are all 0. Let us see what happens when the path is a parabola, say, $y = mx^2$. Then the limit is
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^4 + y^2} = \lim_{x\to 0} \frac{mx^4}{x^4 + m^2 x^4} = \frac{m}{1 + m^2}. \]
The limit depends on m! This means that when (x, y) approaches (0, 0) along different parabolas, we have different limits. Therefore, we conclude that the limit
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^4 + y^2} \]
does not exist.
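The contrast between straight-line paths and parabolic paths can be observed directly. The sketch below is illustrative only: the name `g` and the particular paths y = x, y = x², and y = −x² are choices made here.

```python
def g(x, y):
    """x^2 y / (x^4 + y^2) away from the origin."""
    return x * x * y / (x ** 4 + y * y)

xs = (1e-1, 1e-2, 1e-3)
line_values = [g(x, x) for x in xs]            # along the line y = x
parabola_up = [g(x, x * x) for x in xs]        # along the parabola y = x^2
parabola_down = [g(x, -x * x) for x in xs]     # along the parabola y = -x^2

print(line_values)     # shrinks toward 0
print(parabola_up)     # stays at 1/2
print(parabola_down)   # stays at -1/2
```

The values along the line shrink toward 0, while the two parabolas hold the function at 1/2 and −1/2, so no single limiting value is possible.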
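Iterated limits
Now we consider the two iterated limits
\[ \lim_{x\to a} \lim_{y\to b} f(x, y) \quad\text{and}\quad \lim_{y\to b} \lim_{x\to a} f(x, y). \]
The limit $\lim_{x\to a} \lim_{y\to b} f(x, y)$ means we evaluate the one-variable limit $\lim_{y\to b} f(x, y)$ first, holding x constant, and then evaluate the one-variable limit $\lim_{x\to a} (\lim_{y\to b} f(x, y))$, letting x → a. Note that the two iterated limits can be viewed as two specific ways in which a point (x, y) approaches (a, b). Therefore, we have the following theorem.
Theorem 2.1.2. If $\lim_{(x,y)\to(a,b)} f(x, y)$ exists, and the inner one-variable limits $\lim_{y\to b} f(x, y)$ (for x near a) and $\lim_{x\to a} f(x, y)$ (for y near b) also exist, then both $\lim_{x\to a} \lim_{y\to b} f(x, y)$ and $\lim_{y\to b} \lim_{x\to a} f(x, y)$ exist and
\[ \lim_{x\to a} \lim_{y\to b} f(x, y) = \lim_{y\to b} \lim_{x\to a} f(x, y) = \lim_{(x,y)\to(a,b)} f(x, y). \]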
This theorem indicates a way to evaluate the limit of a function of two variables, that
is, to evaluate two one-variable limits given that all limits involved exist. For instance,
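As a small worked illustration (the function is chosen here for convenience and is not necessarily the one used in the original text), take $2x + 4y$ at the point (1, 2), where the two-variable limit is known to be 10. Both iterated limits reproduce that value:

```latex
\[
\lim_{x\to 1}\Bigl(\lim_{y\to 2}(2x+4y)\Bigr)=\lim_{x\to 1}(2x+8)=10,
\qquad
\lim_{y\to 2}\Bigl(\lim_{x\to 1}(2x+4y)\Bigr)=\lim_{y\to 2}(2+4y)=10.
\]
```

The converse does not hold. For $f(x, y) = \frac{xy}{x^2 + y^2}$, both iterated limits at (0, 0) equal 0, yet the two-variable limit does not exist, as Example 2.1.9 shows.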
For functions of more than two variables, say, n variables, we can define the limit
at a point P0 ∈ ℝn in a similar manner. In a compact notation, limP→P0 f (P) = L means
that for any given ε > 0, there is a δ > 0 such that whenever 0 < |PP0 | < δ, we have
|f (P)−L| < ε. Also, the limit laws apply to limits of functions of more than two variables
as well.
Solution. This limit requires (x, y, z) to approach (1, 1, 1), which is a boundary point of
the domain of the function. We can assume all x, y, and z are positive and try factor-
ization. We obtain
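\[ \lim_{(x,y,z)\to(1,1,1)} \frac{\sqrt{xy} + \sqrt{yz} - \sqrt{xz} - z}{\sqrt{xz} + \sqrt{yz} - \sqrt{xy} - y} = \lim_{(x,y,z)\to(1,1,1)} \frac{\sqrt{x}(\sqrt{y} - \sqrt{z}) + \sqrt{z}(\sqrt{y} - \sqrt{z})}{\sqrt{x}(\sqrt{z} - \sqrt{y}) + \sqrt{y}(\sqrt{z} - \sqrt{y})} \]
\[ = \lim_{(x,y,z)\to(1,1,1)} \left( -\frac{\sqrt{x} + \sqrt{z}}{\sqrt{x} + \sqrt{y}} \right) = -1. \]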
2.1.5 Continuity
Definition 2.1.4. Let f be a function of two variables, and let (a, b) be in its domain 𝒟. We say that f is continuous at (a, b) if
\[ \lim_{(x,y)\to(a,b)} f(x, y) = f(a, b). \]
If f is not continuous at (a, b), then we say that f is discontinuous at (a, b) and that f has a discontinuity at (a, b). If f is continuous at every point in 𝒟, then we say that f is continuous on 𝒟.
is discontinuous at (0, 0) because the limit does not even exist there, as shown in an
earlier example.
All the points on the circle $C = \{(x, y) \mid x^2 + y^2 = 1\}$ are discontinuities of the function $f(x, y) = \sin\frac{1}{x^2 + y^2 - 1}$ because the function is not defined at any point of this circle. It is also possible to define a function f that is discontinuous at (a, b) such that the limit of f exists as (x, y) → (a, b) and f(a, b) exists, but the two values are different.
Similar to the continuity of functions of one variable, all elementary functions of two variables are continuous on their natural domains. That is, the limit of an elementary function f of two variables at a point (a, b) in its domain is given by
\[ \lim_{(x,y)\to(a,b)} f(x, y) = f(a, b). \]
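Example 2.1.12. Find $\lim_{(x,y)\to(1,\pi)} \frac{\cos(xy) + \sin(xy)}{x + y}$.
Solution. The function $\frac{\cos(xy) + \sin(xy)}{x + y}$ is an elementary function and its domain is
\[ D = \{(x, y) \mid x + y \ne 0\}. \]
Since (1, π) lies in its domain, this function is continuous at the point (1, π) and
\[ \lim_{(x,y)\to(1,\pi)} \frac{\cos(xy) + \sin(xy)}{x + y} = \frac{\cos\pi + \sin\pi}{1 + \pi} = -\frac{1}{1 + \pi}. \]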
The continuity of functions of more than two variables is defined similarly using
the compact notation.
Definition 2.1.5. A function f of n variables is continuous at P0 , if and only if limP→P0 f (P) = f (P0 ),
where P and P0 are points in ℝn .
Finally, the extreme value theorem, intermediate value theorem, and uniform continu-
ity theorem also hold for continuous functions f of n variables defined on a closed,
bounded region in ℝn .
Suppose that a pollution index Q depends on two factors, x and y, which are outputs
of pollutants from two factories. Now, you have a little money in hand and can invest
it in only one of the factories to reduce the pollution index. Which factory will you put
your money on? Of course, one would like to choose the factory whose small change
in output of pollutants will result in the greatest drop in Q. This involves the idea of
partial derivatives in which we hold one of the variables constant, and try to find the
rate of change of a function with respect to the other variable. So a partial derivative is simply a one-variable differentiation applied to a two- (or more) variable function. That is, suppose f is a function of two variables, x and y, but we let only x vary while holding y = b constant. We have then converted f into a function of a single variable, x, i.e., g(x) = f(x, b). If g(x) = f(x, b) has a derivative at a, then we call it the partial derivative of f with respect to x at (a, b) and denote it by $f_x(a, b)$ or $\frac{\partial f(a, b)}{\partial x}$. One-variable derivatives were defined in terms of a limit. Partial derivatives of functions of two variables are defined using limits in a similar way.
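Definition 2.2.1. Let f(x, y) be a real-valued function with domain $D \subset \mathbb{R}^2$, and let (a, b) be an interior point of D. The partial derivative of f with respect to x at (a, b) is denoted and defined by
\[ f_x(a, b) = \lim_{\Delta x\to 0} \frac{f(a + \Delta x, b) - f(a, b)}{\Delta x}, \]
provided that the limit exists. Similarly, the partial derivative of f with respect to y at (a, b) is
\[ f_y(a, b) = \lim_{\Delta y\to 0} \frac{f(a, b + \Delta y) - f(a, b)}{\Delta y}. \]
Note.
1. The derivative $f'(x)$ is interpreted as a rate of change. Partial derivatives are also rates of change. If z = f(x, y), then $\partial f/\partial x$ represents the rate of change of z with respect to x when y is held constant. Similarly, $\partial f/\partial y$ represents the rate of change of z with respect to y when x is held constant.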
2. Partial derivatives are sometimes called partials.
3. Sometimes, we use h in the above limits in place of Δx or Δy.
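The limit in the definition suggests a simple numerical approximation: pick a small h and evaluate a difference quotient. The sketch below uses a central difference, which is a choice made here for accuracy rather than the one-sided quotient of the definition; the names `partial_x` and `partial_y` are hypothetical, and the test function is the polynomial of Example 2.2.2.

```python
def partial_x(f, a, b, h=1e-6):
    """Central-difference estimate of f_x(a, b): vary x only, hold y = b."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-6):
    """Central-difference estimate of f_y(a, b): vary y only, hold x = a."""
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

def f(x, y):
    return 2 * x**2 * y - 3 * x * y**2 + 2 * x - y**2 + 3

print(partial_x(f, 2.0, -3.0))   # close to 4xy - 3y^2 + 2 at (2, -3)
print(partial_y(f, 2.0, -3.0))   # close to 2x^2 - 6xy - 2y at (2, -3)
```

Such estimates are useful as a sanity check on hand computations, but they are approximations and depend on the step size h.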
If z = f (x, y) has a partial derivative with respect to x at all points (x, y) ∈ D, then
fx (x, y) is also a function of x and y with domain D. We call it the partial derivative of
f (x, y) with respect to x and denote it by any of the following:
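\[ \frac{\partial z}{\partial x},\quad \frac{\partial f}{\partial x},\quad \frac{\partial f(x, y)}{\partial x},\quad z_x,\quad\text{or}\quad f_x(x, y). \]
Similarly, we denote the partial derivative of f(x, y) with respect to y by
\[ \frac{\partial z}{\partial y},\quad \frac{\partial f}{\partial y},\quad \frac{\partial f(x, y)}{\partial y},\quad z_y,\quad\text{or}\quad f_y(x, y). \]
Sometimes the subscript “x” is replaced by the number 1, “y” by 2, and so on, so that
\[ \frac{\partial f}{\partial x} = f_1, \qquad \frac{\partial f}{\partial y} = f_2. \]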
So, to compute the partial derivative fx or fy , all we have to remember is that it
is just the ordinary one-variable derivative where we regard y or x as a constant, and
we can, therefore, apply all the derivative laws for functions of a single variable when
finding fx or fy .
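Example 2.2.1. If $f(x, y) = \cos\left(\frac{x}{2 + y}\right)$, compute $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.
Solution. Treating y as a constant and applying the chain rule, we have
\[ \frac{\partial f}{\partial x} = -\sin\left(\frac{x}{2 + y}\right) \cdot \frac{\partial}{\partial x}\left(\frac{x}{2 + y}\right) = -\sin\left(\frac{x}{2 + y}\right) \cdot \frac{1}{2 + y}. \]
Similarly, we compute $\frac{\partial f}{\partial y}$ as follows:
\[ \frac{\partial f}{\partial y} = -\sin\left(\frac{x}{2 + y}\right) \cdot \frac{\partial}{\partial y}\left(\frac{x}{2 + y}\right) = \sin\left(\frac{x}{2 + y}\right) \cdot \frac{x}{(2 + y)^2}. \]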
Example 2.2.2. If $f(x, y) = 2x^2 y - 3xy^2 + 2x - y^2 + 3$, then find $\frac{\partial f}{\partial x}(2, -3)$ and $\frac{\partial f}{\partial y}(2, -3)$, first by using one-variable differentiation formulas and a second time by using the definition as a limit.
So
\[ \frac{\partial f}{\partial x}(2, -3) = \left(4xy - 3y^2 + 2\right)\Big|_{x=2,\ y=-3} = 4 \cdot 2 \cdot (-3) - 3 \cdot (-3)^2 + 2 = -49. \]
Method 2: We have
where P is the total production (the monetary value of all goods produced in a period), L is the amount
of labor (some measure of the total labor used in that period), and C is the amount of capital invested
(the monetary worth of all machinery, equipment, and buildings); b and α are constants. Find the
partial derivatives PL and PC .
Solution. We have
Note. In 1928 Charles Cobb and Paul Douglas used this function to model the growth of the American economy during the period 1899–1922. Their model turned out to be remarkably accurate even though there were many factors affecting economic performance.
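Example 2.2.4. The ideal gas equation is given by pV = RT, where R is a constant. Show that
\[ \frac{\partial p}{\partial V} \cdot \frac{\partial V}{\partial T} \cdot \frac{\partial T}{\partial p} = -1. \]
Proof. From
\[ p = \frac{RT}{V} \ \Rightarrow\ \frac{\partial p}{\partial V} = -\frac{RT}{V^2}, \]
\[ V = \frac{RT}{p} \ \Rightarrow\ \frac{\partial V}{\partial T} = \frac{R}{p}, \]
\[ T = \frac{pV}{R} \ \Rightarrow\ \frac{\partial T}{\partial p} = \frac{V}{R}, \]
it follows that
\[ \frac{\partial p}{\partial V} \cdot \frac{\partial V}{\partial T} \cdot \frac{\partial T}{\partial p} = -\frac{RT}{V^2} \cdot \frac{R}{p} \cdot \frac{V}{R} = -\frac{RT}{pV} = -1. \]
Note. This example also shows that partial derivatives cannot be interpreted as ratios of differentials, as otherwise $\frac{\partial p}{\partial V} \cdot \frac{\partial V}{\partial T} \cdot \frac{\partial T}{\partial p}$ would be equal to 1.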
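The identity can also be checked numerically at a sample state. This is a rough sketch: the value of R, the state (V, T) = (2.0, 300.0), and the helper names are assumptions made here for illustration, and the derivatives are central-difference estimates rather than exact values.

```python
R = 8.314  # gas constant; any positive value works for this check

def p_of(V, T):
    return R * T / V

def V_of(T, p):
    return R * T / p

def T_of(p, V):
    return p * V / R

def d(func, x, h=1e-6):
    """Central-difference derivative of a one-variable function at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

def cyclic_product(V, T):
    """Estimate (dp/dV)(dV/dT)(dT/dp), each taken with the third variable fixed."""
    p = p_of(V, T)
    dp_dV = d(lambda v: p_of(v, T), V)   # T held constant
    dV_dT = d(lambda t: V_of(t, p), T)   # p held constant
    dT_dp = d(lambda q: T_of(q, V), p)   # V held constant
    return dp_dV * dV_dT * dT_dp

print(cyclic_product(2.0, 300.0))   # close to -1
```

Each factor is computed with a different variable held fixed, which is exactly why the three "fractions" do not cancel to +1.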
We know that this limit does not exist as the two one-sided limits, as Δx → 0+ and as Δx → 0−, do not match. Therefore, $\frac{\partial f(0, 0)}{\partial x}$ does not exist. Similarly, $\frac{\partial f(0, 0)}{\partial y}$ does not exist either.
Now, we have seen that for some functions the partial derivatives exist, and for
some functions they do not. Besides using the definition, we shall try to interpret par-
tials geometrically.
There is a geometric interpretation of partial derivatives $f_x(a, b)$ and $f_y(a, b)$. When we compute $\frac{\partial f}{\partial x}$, we keep y fixed, say, y = b. Therefore, we only consider the points (x, b, z). So
\[ z = f(x, y) \quad\text{and}\quad y = b, \]
and this is actually the intersection curve C of the plane y = b and the surface S, the
graph of z = f (x, y). The derivative fx (a, b) is therefore the slope of the tangent line to
the curve C at (a, b, f (a, b)) on S. Similarly, fy (a, b) is the slope of the tangent line to the
curve z = f (a, y) in the x = a plane at the point (a, b, f (a, b)). Figure 2.7(a) illustrates
the geometric interpretation of partial derivatives.
Now, we can explain why the function $z = \sqrt{x^2 + y^2}$ has no partials at (0, 0). The graph of this function is an upper cone with its vertex at the origin. The plane y = 0 intersects the cone in the two rays z = x (x ≥ 0) and z = −x (x ≤ 0) in the xz-plane, that is, in the curve z = |x|. The origin is a corner of this curve, and, therefore, the curve has no tangent line there, as shown in Figure 2.7(b).
We already know that when a function of one variable has a derivative at a point P,
it must also be continuous at the point P. However, continuity does not follow for a
function of two variables just because it has partial derivatives at a point. Take, for ex-
ample, the function f (x, y) equal to x+y when either x or y equals 0, but with f (x, y) = 4
at all other points. This function has partial derivatives with respect to x and y equal to
1 at (0, 0), but the function is clearly not continuous at (0, 0). The next example shows
that a function can have partial derivatives at every point yet still be a discontinuous
function.
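\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{when } x^2 + y^2 \ne 0, \\ 0 & \text{when } x^2 + y^2 = 0. \end{cases} \]
If (x, y) ≠ (0, 0), then the quotient rule gives
\[ \frac{\partial f(x, y)}{\partial x} = \frac{y(y^2 - x^2)}{(x^2 + y^2)^2}. \]
If (x, y) = (0, 0), then using the definition of partial differentiation we find
\[ \frac{\partial f(0, 0)}{\partial x} = \lim_{\Delta x\to 0} \frac{f(\Delta x, 0) - f(0, 0)}{\Delta x} = \lim_{\Delta x\to 0} \frac{0 - 0}{\Delta x} = 0. \]
Similarly, for (x, y) ≠ (0, 0),
\[ \frac{\partial f(x, y)}{\partial y} = \frac{x(x^2 - y^2)}{(x^2 + y^2)^2} \]
and
\[ \frac{\partial f(0, 0)}{\partial y} = 0. \]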
So, this function has a partial derivative at every point (x, y); however, we have already
seen in a previous example that this function is not continuous at the point (0, 0).
The concept of partial derivatives can be extended to functions of more than two
variables in a natural way. For instance, a function u = f (x, y, z) generally has three
partial derivatives and the partial derivative of the function with respect to x is defined
as
\[ f_x(x, y, z) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x}, \]
where (x, y, z) is an interior point in the domain of u. However, there is no nice geometric interpretation for $f_x$ as a slope of some visible tangent line. To find the partial derivative $f_x$ we hold y and z constant and use the one-variable derivative rules. In a similar manner, we can find $f_y$ and $f_z$.
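Solution. To find $\partial r/\partial x$, we regard y and z as constants. Then we find
$$\frac{\partial r}{\partial x} = \frac{2x}{2\sqrt{x^2 + y^2 + z^2}} = \frac{x}{\sqrt{x^2 + y^2 + z^2}} = \frac{x}{r}.$$
By symmetry, $\partial r/\partial y = y/r$ and $\partial r/\partial z = z/r$.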
For a function z = f (x, y), the partial derivatives fx (x, y) and fy (x, y) can themselves be
differentiated, giving four more derivatives, i. e., (fx )x , (fx )y , (fy )x , (fy )y , called second
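derivatives or second-order partial derivatives. The standard notations for these second-order
partial derivatives are similar to the notations y″ and d²y/dx² for the second
derivative of a function y = f(x) of a single variable. We write
$$\begin{aligned}
(f_x)_x &= f_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 z}{\partial x^2}, \\
(f_x)_y &= f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}, \\
(f_y)_x &= f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}, \\
(f_y)_y &= f_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 z}{\partial y^2}.
\end{aligned}$$
The notation $f_{xy}$ or $\partial^2 f/\partial y\,\partial x$ means that we first differentiate with respect to x (keeping
y constant) and then differentiate with respect to y (keeping x constant), whereas in
computing $f_{yx} = \partial^2 f/\partial x\,\partial y$, the differentiation order is reversed.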
$$f(x, y) = x^2 e^y + y \cos x.$$
Solution. We find the partials $f_x$ and $f_y$, and we differentiate each of these functions
with respect to each of x and y to give $f_{xx}$, $f_{xy}$, $f_{yx}$, and $f_{yy}$. Then we obtain the following:
$$\begin{aligned}
f_x &= 2xe^y - y \sin x, \\
f_y &= x^2 e^y + \cos x, \\
f_{xx} &= 2e^y - y \cos x, \\
f_{xy} &= 2xe^y - \sin x, \\
f_{yx} &= 2xe^y - \sin x, \\
f_{yy} &= x^2 e^y.
\end{aligned}$$
We note that for this function, whose second-order partial derivatives are continu-
ous, the “mixed partials” fxy and fyx are equal. This result holds generally for functions
with continuous second-order derivatives.
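As a quick numerical sanity check of the mixed partial computed above, one can compare a finite-difference estimate with the closed form. This is only a sketch: the sample point, the step size, and the helper name `mixed_partial` are assumptions chosen for illustration. The symmetric stencil used here estimates the common value of the mixed partials and does not by itself distinguish the order of differentiation.

```python
import math


def f(x, y):
    return x**2 * math.exp(y) + y * math.cos(x)


def mixed_partial(g, x, y, h=1e-4):
    """Central-difference estimate of the mixed second partial of g at (x, y)."""
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4 * h * h)


x0, y0 = 0.7, -0.3
numeric = mixed_partial(f, x0, y0)
exact = 2 * x0 * math.exp(y0) - math.sin(x0)   # f_xy = f_yx from the example
print(numeric, exact)
```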
Theorem 2.2.1 (Clairaut’s theorem: equality of mixed partial derivatives). If z = f (x, y) is defined and
has continuous second-order partial derivatives throughout a domain 𝒟, then the two functions fxy
and fyx are identical at any interior point of 𝒟.
Notations for third- and higher-order partial derivatives are defined in a similar way.
Solution. We have
$$\begin{aligned}
f_x &= 3 \cos(3x + 2yz), \\
f_{xx} &= -9 \sin(3x + 2yz), \\
f_{xxy} &= -18z \cos(3x + 2yz), \\
f_{xxyz} &= -18 \cos(3x + 2yz) + 36yz \sin(3x + 2yz).
\end{aligned}$$
Δy = AΔx + o(Δx),
where A is a constant that depends only on the point a, not on the change Δx in x. It turns
out that the constant A is exactly f′(a), the derivative at x = a. Thus,
Δy = f′(a)Δx + o(Δx).
The function L(x) = f(a) + f′(a)(x − a) is the local linearization of f(x) at x = a, which
is, in fact, the tangent line approximation of f at x = a.
A similar approximation can be made for a function of two variables, z = f (x, y).
Consider a small change Δz in z at a point (a, b), caused by changes Δx in x and Δy in y:
$$\Delta z = f(a + \Delta x, b + \Delta y) - f(a, b).$$
The increment Δz represents the change in the value of f when (x, y) changes from
(a, b) to (a + Δx, b + Δy). In general, the exact increment Δz in z is hard to find. For
example, for $z = x^y$ at (1, 1) with Δx = 0.09 and Δy = −0.02, $\Delta z = 1.09^{0.98} - 1^1$. However,
even though z is not a linear function, Δz can be very close to a linear expression of Δx
and Δy, and the difference is negligible as Δx → 0 and Δy → 0. When this happens,
we say that the function z = f (x, y) is differentiable at the point (a, b). The formal
definition is given below.
Definition 2.3.1 (Differentiability of a real-valued function z = f (x, y) of two variables). Assume that f
is a real-valued function with domain D ⊂ ℝ2 and that (a, b) is an interior point of D. The function f is
differentiable at (a, b) if there exist constants A and B such that
$$\Delta z = f(a + \Delta x, b + \Delta y) - f(a, b) = A\Delta x + B\Delta y + o(\rho),$$
where A and B depend on a and b but are independent of Δx and Δy, and $\rho = \sqrt{(\Delta x)^2 + (\Delta y)^2}$. If
z = f(x, y) is differentiable at every point in D, then we say that z is differentiable on D.
Note that if z = f(x, y) is differentiable at (a, b), then Δz ≈ AΔx + BΔy for some constants
A and B. With x = a + Δx and y = b + Δy, this expression can be rewritten as
$$f(x, y) \approx f(a, b) + A(x - a) + B(y - b).$$
The formula L(x, y) = f(a, b) + A(x − a) + B(y − b) gives the local linearization of z = f(x, y)
at (a, b), which is the equation of a plane that approximates the surface well at points
near (a, b, f (a, b)). This plane, as we will see later, is indeed the tangent plane to the
graph of z = f (x, y) at (a, b, f (a, b)). So, intuitively speaking, if a function z = f (x, y)
is differentiable at a point, then its graph must be continuous and “smooth” at that
point so that there exists a plane that could nicely touch (be tangent to) the surface at
that point. This is indeed the case, as shown in the following two theorems.
Theorem 2.3.1. If z = f (x, y) is differentiable at (a, b), then it must be continuous at the point (a, b).
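Proof. In fact, from the above definition, if z = f(x, y) is differentiable at (a, b), then
$$\lim_{(\Delta x, \Delta y) \to (0, 0)} \Delta z = \lim_{(\Delta x, \Delta y) \to (0, 0)} \big(A\Delta x + B\Delta y + o(\rho)\big) = 0,$$
that is, f(a + Δx, b + Δy) → f(a, b), so f is continuous at (a, b).
Consequently, functions that are not continuous at (0, 0) are not differentiable at (0, 0).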
Also, as in one-variable calculus, if y = f (x) is differentiable at x = a, then dy/dx
exists at x = a. For a function of two variables, we claim that if it is differentiable, then
it has partial derivatives.
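Theorem 2.3.2. If z = f(x, y) is differentiable at the point (a, b), then the partial derivatives
$\partial z(a, b)/\partial x$ and $\partial z(a, b)/\partial y$ exist. Furthermore,
$$\Delta z = \frac{\partial z(a, b)}{\partial x}\Delta x + \frac{\partial z(a, b)}{\partial y}\Delta y + o(\rho).$$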
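Proof. If the function z = f(x, y) is differentiable at a point (a, b), then there exist A
and B, independent of Δx and Δy, such that
$$\Delta z = A\Delta x + B\Delta y + o(\rho).$$
In particular, setting Δy = 0 gives ρ = |Δx| and
$$\Delta z = A\Delta x + o(|\Delta x|).$$
Dividing by Δx and letting Δx → 0, we obtain $\partial z(a, b)/\partial x = A$. Similarly,
$\partial z(a, b)/\partial y = B$. This completes the proof.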
Example 2.3.1. Determine whether the following functions are differentiable at (0, 0):
$$(1)\ f(x, y) = \sin\frac{1}{x^2 + y^2}, \qquad (2)\ g(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}} & (x, y) \neq (0, 0), \\ 0 & (x, y) = (0, 0), \end{cases} \quad \text{and}$$
$$(3)\ h(x, y) = \begin{cases} (x^2 + y^2) \sin\dfrac{1}{x^2 + y^2} & (x, y) \neq (0, 0), \\ 0 & (x, y) = (0, 0). \end{cases}$$
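Solution.
1. Since $f(x, y) = \sin\frac{1}{x^2 + y^2}$ is undefined at (0, 0), it is not continuous at (0, 0). Therefore,
by Theorem 2.3.1, it is not differentiable at (0, 0).
2. For g(x, y), we have g(x, 0) = 0 and g(0, y) = 0 for all x and y, so both partial derivatives
of g at (0, 0) exist and are equal to 0. If g were differentiable at (0, 0), then by Theorem 2.3.2
we would have
$$\Delta z = g(\Delta x, \Delta y) - g(0, 0) = \frac{\Delta x \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0 \cdot \Delta x + 0 \cdot \Delta y + o(\rho).$$
That is, $\frac{\Delta x \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}$ would be negligible with respect to $\sqrt{(\Delta x)^2 + (\Delta y)^2}$ as (Δx, Δy) → (0, 0),
i.e.,
$$\lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{\dfrac{\Delta x \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0.$$
However,
$$\lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{\dfrac{\Delta x \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = \lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{\Delta x \Delta y}{(\Delta x)^2 + (\Delta y)^2}.$$
This limit does not exist: if Δy = kΔx, the quotient equals k/(1 + k²), which depends on k.
So, we have reached a contradiction, and g(x, y) is not differentiable at (0, 0).
3. For h(x, y), since |h(x, y)| ≤ x² + y², it is easy to see that $\lim_{(x, y) \to (0, 0)} h(x, y) = 0 = h(0, 0)$,
so h(x, y) is continuous at (0, 0). We now compute the partial derivatives at (0, 0) from the definition:
$$h_x(0, 0) = \lim_{\Delta x \to 0} \frac{(\Delta x)^2 \sin\frac{1}{(\Delta x)^2} - 0}{\Delta x} = \lim_{\Delta x \to 0} \Delta x \sin\frac{1}{(\Delta x)^2} = 0,$$
and similarly $h_y(0, 0) = 0$. Moreover,
$$\frac{\Delta z - 0 \cdot \Delta x - 0 \cdot \Delta y}{\rho} = \frac{\rho^2 \sin\frac{1}{\rho^2}}{\rho} = \rho \sin\frac{1}{\rho^2} \to 0 \quad \text{as } \rho \to 0,$$
so Δz = 0·Δx + 0·Δy + o(ρ), and h(x, y) is differentiable at (0, 0).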
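The difference between parts 2 and 3 can be seen numerically by looking at the ratio Δz/ρ at the origin, where both functions have zero partial derivatives. The sketch below is illustrative only; the helper name `remainder_ratio` and the sampled step sizes are assumptions, not part of the text.

```python
import math


def g(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.hypot(x, y)


def h(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    r2 = x * x + y * y
    return r2 * math.sin(1.0 / r2)


def remainder_ratio(func, dx, dy):
    """(dz - 0*dx - 0*dy) / rho at the origin, using the zero partials there."""
    return (func(dx, dy) - func(0.0, 0.0)) / math.hypot(dx, dy)


# Along dy = dx the ratio for g stays at 1/2, while the ratio for h shrinks with rho.
for t in (1e-1, 1e-3, 1e-5):
    print(t, remainder_ratio(g, t, t), remainder_ratio(h, t, t))
```

For g the ratio does not tend to 0, so the remainder is not o(ρ); for h the ratio is bounded by ρ in absolute value.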
Now, we have seen some nice properties of a differentiable function of two variables.
However, how can we determine whether a function is differentiable? In one-variable
calculus, we know that as long as f′(a) exists, y = f(x) is differentiable at
x = a. But in multivariable calculus, this is not the case. The previous theorem shows
that existence of the partial derivatives is a necessary condition for differentiability,
but it is not a sufficient condition for differentiability. In Example 2.2.6, we saw that
the function has partials at (0, 0) but is not continuous there; therefore, it is not dif-
ferentiable there. For a sufficient condition, we have the following theorem.
Theorem 2.3.3 (Test for differentiability of a real-valued function). Let z = f(x, y) be a real-valued
function of two variables and (a, b) an interior point of its domain D. If $\partial z/\partial x$ and $\partial z/\partial y$
are continuous at (a, b), then f is differentiable at (a, b).
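Proof. Recall the Lagrange mean value theorem for differentiable functions of one
variable, y = f(x),
$$f(x + \Delta x) - f(x) = f'(x + \theta \Delta x)\,\Delta x, \quad 0 < \theta < 1.$$
Therefore,
$$\begin{aligned}
\Delta z &= [f(a + \Delta x, b + \Delta y) - f(a, b + \Delta y)] + [f(a, b + \Delta y) - f(a, b)] \\
&= f_x(a + \theta_1 \Delta x, b + \Delta y)\Delta x + f_y(a, b + \theta_2 \Delta y)\Delta y,
\end{aligned}$$
where $0 < \theta_1, \theta_2 < 1$.
Since $\partial z/\partial x = f_x(x, y)$ and $\partial z/\partial y = f_y(x, y)$ are both continuous at (a, b), we have
$$f_x(a + \theta_1 \Delta x, b + \Delta y) = f_x(a, b) + \varepsilon_1, \qquad f_y(a, b + \theta_2 \Delta y) = f_y(a, b) + \varepsilon_2,$$
where $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$ as (Δx, Δy) → (0, 0).
So,
$$\Delta z = f_x(a, b)\Delta x + f_y(a, b)\Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y.$$
Furthermore, since the two ratios in parentheses below are bounded by 1 in absolute value,
$$\begin{aligned}
\lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{\Delta z - f_x(a, b)\Delta x - f_y(a, b)\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
&= \lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{\varepsilon_1 \Delta x + \varepsilon_2 \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \\
&= \lim_{(\Delta x, \Delta y) \to (0, 0)} \left[\varepsilon_1 \left(\frac{\Delta x}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}\right) + \varepsilon_2 \left(\frac{\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}\right)\right] \\
&= 0 + 0 = 0.
\end{aligned}$$
Therefore,
$$\Delta z = f_x(a, b)\Delta x + f_y(a, b)\Delta y + o(\rho),$$
and f is differentiable at (a, b).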
One can show that the partial derivatives of f (x, y) exist, and f is differentiable, but
the partial derivatives are not continuous at (0, 0).
2.3 Total differential | 89
dy = f′(x)dx,
Definition 2.3.2. If z = f (x, y) is differentiable at (a, b), the differential dz (sometimes called the total
differential) at (a, b) is the function defined by
$$dz = \frac{\partial z(a, b)}{\partial x}\,dx + \frac{\partial z(a, b)}{\partial y}\,dy,$$
where the two independent variables are the differentials dx and dy.
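Solution. Since
$$\frac{\partial z}{\partial x} = f_x = ye^{xy} \quad \text{and} \quad \frac{\partial z}{\partial y} = f_y = xe^{xy}$$
are both continuous at (2, 1), by the definition of dz at a point (a, b), we have
$$dz = \frac{\partial z(2, 1)}{\partial x}\,dx + \frac{\partial z(2, 1)}{\partial y}\,dy = e^2\,dx + 2e^2\,dy.$$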
If the total differential of z = f (x, y) exists for all points in a region D, then dz
becomes a function of (x, y) ∈ D and (dx, dy) ∈ ℝ2 .
Example 2.3.3. Find the differential dz = df (x, y) when f is the function defined by
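Solution. Since
$$\frac{\partial z}{\partial x} = f_x = \frac{-2x}{3 - x^2 - y^2} \quad \text{and} \quad \frac{\partial z}{\partial y} = f_y = \frac{-2y}{3 - x^2 - y^2},$$
we have
$$dz = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = -\frac{2x}{3 - x^2 - y^2}\,dx - \frac{2y}{3 - x^2 - y^2}\,dy.$$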
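Example 2.3.4. Find du if $u = x + \sin\frac{y}{2} + e^{yz}$.
Solution. Since
$$\frac{\partial u}{\partial x} = 1, \qquad \frac{\partial u}{\partial y} = \frac{1}{2}\cos\frac{y}{2} + ze^{yz}, \qquad \frac{\partial u}{\partial z} = ye^{yz},$$
it follows that
$$du = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz = dx + \left(\frac{1}{2}\cos\frac{y}{2} + ze^{yz}\right)dy + ye^{yz}\,dz.$$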
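In one-variable calculus, when dx = Δx, then dy = f′(a)Δx and Δy ≈ dy. So the linearization
is essentially the differential approximation. Similarly, for a differentiable
function z = f(x, y) of two variables, when dx = Δx and dy = Δy, then at (a, b) we have
$$dz = \frac{\partial z(a, b)}{\partial x}\Delta x + \frac{\partial z(a, b)}{\partial y}\Delta y.$$
Thus,
$$\Delta z = dz + o\big(\sqrt{(\Delta x)^2 + (\Delta y)^2}\big).$$
Hence, when Δx and Δy are small and (x, y) = (a + Δx, b + Δy),
$$\Delta z = f(x, y) - f(a, b) \approx dz = \frac{\partial z(a, b)}{\partial x}\Delta x + \frac{\partial z(a, b)}{\partial y}\Delta y, \tag{2.1}$$
or, equivalently,
$$f(x, y) \approx f(a, b) + \frac{\partial z(a, b)}{\partial x}\Delta x + \frac{\partial z(a, b)}{\partial y}\Delta y. \tag{2.2}$$
The function
$$L(x, y) = f(a, b) + \frac{\partial z(a, b)}{\partial x}(x - a) + \frac{\partial z(a, b)}{\partial y}(y - b)$$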
is called the local linearization of the function z = f (x, y) at the point (a, b). This is
essentially the total differential approximation.
Example 2.3.5. Find an approximation for $(1.04)^{2.02}$ using a suitable local linearization.
Solution. Let $f(x, y) = x^y$. Then $1.04^{2.02} = f(1.04, 2.02)$ and this is close to $f(1, 2) = 1^2 = 1$,
so we can use the differential approximation to approximate the change from this
known value. Since
$$\frac{\partial z}{\partial x} = yx^{y-1} \quad \text{and} \quad \frac{\partial z}{\partial y} = x^y \ln x,$$
we have
$$f(x, y) \approx f(1, 2) + \frac{\partial z(1, 2)}{\partial x}\Delta x + \frac{\partial z(1, 2)}{\partial y}\Delta y.$$
By substitution,
$$f(1.04, 2.02) \approx f(1, 2) + \frac{\partial z}{\partial x}(1, 2) \cdot 0.04 + \frac{\partial z}{\partial y}(1, 2) \cdot 0.02 = 1 + 2 \times 0.04 + 0 \times 0.02 = 1.08.$$
Note. A calculator gives a better approximation, $(1.04)^{2.02} = 1.082448\ldots$, but our error
is less than $3 \times 10^{-3}$.
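The estimate can be reproduced in a few lines. This is a minimal sketch under the assumption that the linearization is taken at (1, 2) as in the example; the function name `linearization` is a hypothetical helper introduced only for illustration.

```python
import math


def f(x, y):
    return x ** y


a, b = 1.0, 2.0
fx = b * a ** (b - 1)        # y * x**(y - 1) evaluated at (1, 2)
fy = a ** b * math.log(a)    # x**y * ln(x) evaluated at (1, 2)


def linearization(x, y):
    """Local linearization of f at (a, b)."""
    return f(a, b) + fx * (x - a) + fy * (y - b)


estimate = linearization(1.04, 2.02)
exact = f(1.04, 2.02)
print(estimate, exact, abs(exact - estimate))
```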
Example 2.3.6. Use the total differential to estimate the change of the function $z = \sqrt{20 - 7x^2 - y^2}$
when (x, y) changes from (1, 2) to (0.98, 2.03).
Solution. Since
$$\frac{\partial z(1, 2)}{\partial x} = \left.\frac{-14x}{2\sqrt{20 - 7x^2 - y^2}}\right|_{(1,2)} = \frac{-14 \times 1}{2\sqrt{20 - 7(1)^2 - 2^2}} = -\frac{7}{3}$$
and
$$\frac{\partial z(1, 2)}{\partial y} = \left.\frac{-2y}{2\sqrt{20 - 7x^2 - y^2}}\right|_{(1,2)} = \frac{-2 \times 2}{2\sqrt{20 - 7(1)^2 - 2^2}} = -\frac{2}{3},$$
the differential is
$$dz = \frac{\partial z(1, 2)}{\partial x}\Delta x + \frac{\partial z(1, 2)}{\partial y}\Delta y = -\frac{7}{3}(0.98 - 1) - \frac{2}{3}(2.03 - 2) \approx 0.0266667.$$
Since Δz = f(0.98, 2.03) − f(1, 2) ≈ dz, the change in z is approximately 0.0266667.
Using a calculator,
$$\sqrt{20 - 7 \times 0.98^2 - 2.03^2} - \sqrt{20 - 7 \times 1^2 - 2^2} \approx 0.0259379.$$
The two values are close, but dz is much easier to evaluate when a calculator is
not available.
Note. The linearization for a function of two variables at a point is essentially a plane
approximation. For this function, it is
$$L(x, y) = 3 - \frac{7}{3}(x - 1) - \frac{2}{3}(y - 2),$$
which is the equation of a plane. This is the tangent plane at (1, 2) as we will see later
in this chapter.
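To compare the change along this plane with the true change of the function, the following sketch evaluates both directly. It assumes the function and base point of Example 2.3.6 and uses the linearization stated in the note; nothing else is taken from the text.

```python
import math


def f(x, y):
    return math.sqrt(20 - 7 * x**2 - y**2)


def L(x, y):
    """Linearization at (1, 2): 3 - 7/3 (x - 1) - 2/3 (y - 2)."""
    return 3 - 7 / 3 * (x - 1) - 2 / 3 * (y - 2)


dz = L(0.98, 2.03) - L(1, 2)          # change measured on the plane
delta_z = f(0.98, 2.03) - f(1, 2)     # exact change on the surface
print(round(dz, 7), round(delta_z, 7))
```

The two printed numbers differ by less than 10⁻³, which gives a feel for how close the plane stays to the surface near (1, 2).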
Linear approximations can be used for differentiable functions of more than two
variables. For example, if u = f(x, y, z) is differentiable at (a, b, c), then
Δu ≈ du and
$$f(x, y, z) \approx f(a, b, c) + f_x(a, b, c)\Delta x + f_y(a, b, c)\Delta y + f_z(a, b, c)\Delta z.$$
There are some exercises involving differential approximation for functions of more
than two variables at the end of this chapter.
In single-variable calculus, we have found that the chain rule is useful for differen-
tiating a composite function: if y = f (x) and x = x(t) are both differentiable, then
the composite function y is a differentiable function of t, and the derivative of y with
respect to t is
$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}. \tag{2.3}$$
For functions of several variables, there are several versions of the chain rule.
We first consider z = f(x, y), where each variable x = ϕ(t) and y = ψ(t) is, in turn,
a differentiable function of a variable t. This means z = f(ϕ(t), ψ(t)) is a function of
the variable t. When $f_x$ and $f_y$ are both continuous, we are able to differentiate z with
respect to t to get dz/dt, as seen in the following theorem.
2.4 The chain rule | 93
Theorem 2.4.1. If x = ϕ(t) and y = ψ(t) are two differentiable functions of t, and z = f (x, y) is a differ-
entiable function of x and y, then the composite function z = f (x, y) = f (ϕ(t), ψ(t)) is a differentiable
function of t and
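$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{2.4}$$
Proof. A change Δt in t produces changes Δx in x and Δy in y, which in turn produce a change Δz in z.
Since f is differentiable, $\Delta z = \frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y + o\big(\sqrt{(\Delta x)^2 + (\Delta y)^2}\big)$. Dividing by Δt gives
(written for Δt > 0; for Δt < 0 the last factor only changes sign)
$$\begin{aligned}
\frac{\Delta z}{\Delta t} &= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\big(\sqrt{(\Delta x)^2 + (\Delta y)^2}\big)}{\Delta t} \\
&= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\big(\sqrt{(\Delta x)^2 + (\Delta y)^2}\big)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \cdot \frac{\sqrt{(\Delta x)^2 + (\Delta y)^2}}{\Delta t} \\
&= \frac{\partial z}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial z}{\partial y}\frac{\Delta y}{\Delta t} + \frac{o\big(\sqrt{(\Delta x)^2 + (\Delta y)^2}\big)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \cdot \sqrt{\left(\frac{\Delta x}{\Delta t}\right)^2 + \left(\frac{\Delta y}{\Delta t}\right)^2}.
\end{aligned}$$
Letting Δt → 0, so that Δx → 0 and Δy → 0, we obtain
$$\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} + 0 \cdot \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \\
&= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.
\end{aligned}$$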
It is helpful to use a tree diagram to remember this chain rule (and other chain
rules as well), as shown in Figure 2.8. For the function in the previous theorem, since
z is a function of x and y, we draw branches from the dependent variable z to the inter-
mediate variables x and y. Then, we draw branches from x and y to the independent
variable t, since both x and y are functions of t. Then, on each branch, we write the
corresponding derivative. To find dz/dt, we multiply the derivatives along each path from z
to t, and then add these products to get
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$
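Example 2.4.1. Suppose $z = f(x, y) = x^2 - y^2$, x = sin t, y = cos t. Find dz/dt using the chain rule.
Solution. By the chain rule,
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = 2x \cos t + (-2y)(-\sin t) = 2 \sin t \cos t + 2 \cos t \sin t = 2 \sin 2t.$$
Alternatively, substituting first gives $z = \sin^2 t - \cos^2 t = -\cos 2t$
by using the double-angle formula. So dz/dt = (−cos 2t)′ = 2 sin 2t, which agrees with
the answer we obtained by using the chain rule.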
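The agreement between the two computations can also be checked numerically. The sketch below is an illustration only: the sample value of t and the step size are arbitrary assumptions, and `chain_rule` is a hypothetical helper that evaluates formula (2.4) for this example.

```python
import math


def z_of_t(t):
    x, y = math.sin(t), math.cos(t)
    return x**2 - y**2


def chain_rule(t):
    x, y = math.sin(t), math.cos(t)
    dz_dx, dz_dy = 2 * x, -2 * y                # partials of z = x^2 - y^2
    dx_dt, dy_dt = math.cos(t), -math.sin(t)    # derivatives of x and y in t
    return dz_dx * dx_dt + dz_dy * dy_dt        # formula (2.4)


t0, h = 0.4, 1e-6
numeric = (z_of_t(t0 + h) - z_of_t(t0 - h)) / (2 * h)   # central difference
print(chain_rule(t0), 2 * math.sin(2 * t0), numeric)
```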
2.4.2 The chain rule with more than one independent variable
We now consider another case, z = f(x, y), but where each of x and y is a function of
two variables s and t, i.e., x = ϕ(s, t), y = ψ(s, t). Then z = f(ϕ(s, t), ψ(s, t)) is indirectly
a function of s and t. We can apply Theorem 2.4.1 to find $\partial z/\partial t$ and $\partial z/\partial s$. That is, if we hold
s fixed, then we can compute $\partial z/\partial t$ by using Theorem 2.4.1 to get
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$
Theorem 2.4.2. Suppose that z = f (x, y) is a differentiable function of x and y, where x = ϕ(s, t) and
y = ψ(s, t) are differentiable functions of s and t. Then
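$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \quad \text{and} \quad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$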
In this version of the chain rule, there are three types of variables, i. e., s and t are
independent variables, x and y are called intermediate variables, and z is the depen-
dent variable. Like the one-independent-variable case, to remember this chain rule
(or any other one), it is helpful to draw a tree diagram representation of the function
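relationships, as shown in Figure 2.9. This time, we have two more branches. To find
$\partial z/\partial s$, we multiply the partial derivatives along each path from z to s, and then add these
products, i.e.,
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}.$$
Similarly, we find
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}$$
by using the paths from z to t.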
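Example 2.4.2. Using the chain rule as an aid, find the partial derivatives of the function $z = e^x \sin y$,
where x = st and y = s + t.
Solution. By Theorem 2.4.2,
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = e^x \sin y \cdot t + e^x \cos y \cdot 1 = e^{st}\big(t \sin(s + t) + \cos(s + t)\big)$$
and, similarly,
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} = e^x \sin y \cdot s + e^x \cos y \cdot 1 = e^{st}\big(s \sin(s + t) + \cos(s + t)\big).$$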
The chain rule and the tree diagram representation of function relationships can
be extended to cases that involve both one and more independent variables or func-
tions of more than two variables, as we will see in the following example.
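Example 2.4.3. Let $f(u, v, w) = e^{uvw}$, where $u = x^2$, $v = x + 2y$, and $w = x/y$. Find $\partial f/\partial x$ and $\partial f/\partial y$.
Solution. This example is different from previous ones in several ways. First, we note
that this is a function of three variables, and the intermediate variables are u, v, and w.
Furthermore, u has one independent variable, while v and w both have two independent
variables. With the help of the tree diagram shown in Figure 2.10(a), we find
$$\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial f}{\partial u}\frac{du}{dx} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial x} \\
&= vwe^{uvw} \cdot 2x + uwe^{uvw} \cdot 1 + uve^{uvw} \cdot \frac{1}{y} \\
&= e^{uvw}\left(2vwx + uw + \frac{uv}{y}\right) \\
&= e^{x^3(x+2y)/y}\left(\frac{2x^2(x + 2y) + x^3 + x^2(x + 2y)}{y}\right) = e^{x^3(x+2y)/y}\left(\frac{4x^3}{y} + 6x^2\right).
\end{aligned}$$
Similarly,
$$\begin{aligned}
\frac{\partial f}{\partial y} &= \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial y} \\
&= uwe^{uvw} \cdot 2 + uve^{uvw} \cdot \left(-\frac{x}{y^2}\right) \\
&= e^{uvw}\left(\frac{2x^3}{y} - \frac{x^3(x + 2y)}{y^2}\right) \\
&= e^{x^3(x+2y)/y}\left(\frac{2x^3}{y} - \frac{x^3(x + 2y)}{y^2}\right) = -\frac{x^4}{y^2}\,e^{x^3(x+2y)/y}.
\end{aligned}$$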
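The two closed forms obtained in this example can be compared against finite differences of the composite function. This is a hedged sketch: it assumes w = x/y (consistent with the derivatives ∂w/∂x = 1/y and ∂w/∂y = −x/y² used in the solution), and the sample point and step size are arbitrary choices.

```python
import math


def F(x, y):
    """Composite function f(u, v, w) = exp(u v w) with u = x^2, v = x + 2y, w = x/y."""
    u, v, w = x**2, x + 2 * y, x / y
    return math.exp(u * v * w)


def df_dx(x, y):
    return F(x, y) * (4 * x**3 / y + 6 * x**2)


def df_dy(x, y):
    return -F(x, y) * x**4 / y**2


x0, y0, h = 0.5, 1.5, 1e-6
num_x = (F(x0 + h, y0) - F(x0 - h, y0)) / (2 * h)
num_y = (F(x0, y0 + h) - F(x0, y0 - h)) / (2 * h)
print(num_x, df_dx(x0, y0), num_y, df_dy(x0, y0))
```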
Figure 2.10: Tree diagrams for Examples 2.4.3, 2.4.4, 2.4.5, and 2.4.6.
Example 2.4.6. If $u = f(x, x^2, x^3)$, find du/dx.
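Example 2.4.7. Let u = f(x + y + z, xyz) and suppose that f has continuous second-order partials. Find
$\partial u/\partial z$ and $\partial^2 u/\partial x\,\partial z$ in terms of the partial derivatives of f.
Solution. By the chain rule,
$$\frac{\partial u}{\partial z} = f_1 + xy f_2,$$
where $f_1$ means the derivative of f with respect to the first variable, and $f_2$ means that
with respect to the second variable. We have
$$\frac{\partial^2 u}{\partial x\,\partial z} = (f_1 + xy f_2)_x = (f_1)_x + y f_2 + xy (f_2)_x.$$
Applying the chain rule again, $(f_1)_x = f_{11} + yz f_{12}$ and $(f_2)_x = f_{21} + yz f_{22}$, and since $f_{12} = f_{21}$,
$$\frac{\partial^2 u}{\partial x\,\partial z} = f_{11} + y(x + z) f_{12} + xy^2 z f_{22} + y f_2.$$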
Note. Using the notation $f_1$, $f_2$, etc., helps to clarify some ambiguity in the notation
$\partial f/\partial x$, which may mean $f_1$ (this is also $f_u$) or the partial derivative of the entire function
with respect to x.
$$f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2$$
So
Note that
ϕ(t) = f (a + tΔx, b + tΔy), ϕ(1) = f (x, y), and ϕ(0) = f (a, b).
Thus,
Furthermore,
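Example 2.5.1. Use a linear and a quadratic approximation to estimate $\dfrac{1}{\sqrt{10 - (2.01)^2 - 2(0.98)^2}}$.
Solution. Let $z = f(x, y) = \dfrac{1}{\sqrt{10 - x^2 - 2y^2}}$. The function has continuous partials at (2, 1).
The value of the function at (2, 1) is f(2, 1) = 1/2, and its first partials are
$$\begin{aligned}
f_x &= -\tfrac{1}{2}(10 - x^2 - 2y^2)^{-3/2}(-2x) = x(10 - x^2 - 2y^2)^{-3/2}, \\
f_x(2, 1) &= 2(10 - 2^2 - 2(1)^2)^{-3/2} = \tfrac{1}{4}, \\
f_y &= -\tfrac{1}{2}(10 - x^2 - 2y^2)^{-3/2}(-4y) = 2y(10 - x^2 - 2y^2)^{-3/2}, \\
f_y(2, 1) &= 2(1)(10 - 2^2 - 2(1)^2)^{-3/2} = \tfrac{1}{4}.
\end{aligned}$$
Thus, with Δx = 0.01 and Δy = −0.02, the linear approximation is
$$f(2.01, 0.98) \approx \frac{1}{2} + \frac{1}{4}(0.01) + \frac{1}{4}(-0.02) = 0.4975.$$
For the quadratic approximation, we also need the second partials:
$$\begin{aligned}
f_{xx} &= (10 - x^2 - 2y^2)^{-3/2} + x\left(-\tfrac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-2x) \\
&= (10 - x^2 - 2y^2)^{-3/2} + 3x^2(10 - x^2 - 2y^2)^{-5/2}, \\
f_{xx}(2, 1) &= (10 - 2^2 - 2(1)^2)^{-3/2} + 3(2^2)(10 - 2^2 - 2(1)^2)^{-5/2} = \tfrac{1}{2}, \\
f_{xy} &= \big(x(10 - x^2 - 2y^2)^{-3/2}\big)_y = x\left(-\tfrac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-4y), \\
f_{xy}(2, 1) &= 2\left(-\tfrac{3}{2}\right)(10 - 2^2 - 2(1)^2)^{-5/2}(-4(1)) = \tfrac{3}{8}, \\
f_{yy} &= \big(2y(10 - x^2 - 2y^2)^{-3/2}\big)_y \\
&= 2(10 - x^2 - 2y^2)^{-3/2} + 2y\left(-\tfrac{3}{2}\right)(10 - x^2 - 2y^2)^{-5/2}(-4y), \\
f_{yy}(2, 1) &= 2(10 - 2^2 - 2(1)^2)^{-3/2} + 2(1)\left(-\tfrac{3}{2}\right)(10 - 2^2 - 2(1)^2)^{-5/2}(-4(1)) = \tfrac{5}{8}.
\end{aligned}$$
Thus, the quadratic approximation is
$$f(2.01, 0.98) \approx 0.4975 + \frac{1}{2}\left[\frac{1}{2}(0.01)^2 + 2 \cdot \frac{3}{8}(0.01)(-0.02) + \frac{5}{8}(-0.02)^2\right] = 0.497575.$$
Compared with the value of $\dfrac{1}{\sqrt{10 - (2.01)^2 - 2(0.98)^2}}$ computed by a calculator, 0.49757, the
quadratic approximation gives a better estimation.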
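The comparison between the linear and quadratic approximations can be reproduced with a short script. This sketch assumes the standard second-order Taylor form and plugs in the derivative values at (2, 1) obtained in the example; the variable names are illustrative.

```python
import math


def f(x, y):
    return 1.0 / math.sqrt(10 - x**2 - 2 * y**2)


# Values at (2, 1) taken from the worked example.
f0, fx, fy = 0.5, 0.25, 0.25
fxx, fxy, fyy = 0.5, 3 / 8, 5 / 8

dx, dy = 2.01 - 2, 0.98 - 1
linear = f0 + fx * dx + fy * dy
quadratic = linear + 0.5 * (fxx * dx**2 + 2 * fxy * dx * dy + fyy * dy**2)
exact = f(2.01, 0.98)
print(linear, quadratic, exact)
```

The quadratic value lands within about 10⁻⁶ of the directly computed one, while the linear value is off in the fourth decimal place.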
then we can conclude with Taylor’s theorem (or the Taylor expansion) for a function
of two variables.
Theorem 2.5.1. If z = f (x, y) has continuous (n + 1)th partial derivatives on some neighborhood D
containing the point (a, b) and (x, y) = (a + Δx, b + Δy) is a point in D, then
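$$\begin{aligned}
f(x, y) = {} & f(a, b) + \left(\Delta x \frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right) f(a, b) + \frac{1}{2!}\left(\Delta x \frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right)^2 f(a, b) \\
& + \cdots + \frac{1}{n!}\left(\Delta x \frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right)^n f(a, b) \\
& + \frac{1}{(n + 1)!}\left(\Delta x \frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right)^{n+1} f(a + \theta \Delta x, b + \theta \Delta y)
\end{aligned}$$
for some θ with 0 < θ < 1.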
Recall that an equation involving several variables can, in theory, be solved to give
one variable, say, z, as a function of the other variables (with domain limitation), and
in this case z is said to be implicitly defined as a function of the other variables by
that equation. If the equation can actually be solved to give a formula for z, then z
is defined explicitly by that formula. Implicit differentiation is a process for finding
derivatives of implicitly defined functions.
The chain rules for functions of two variables can be used to find a formula to
implicitly differentiate a function of one variable implicitly defined by an equation.
We first consider an equation of the form F(x, y) = 0, which defines y implicitly as a
differentiable function of x, for all x in some set D. This means that there exists some
function y = y(x) such that F(x, y(x)) = 0 for all x ∈ D (D is the domain of y), but we
may not have a formula for y(x). In spite of the lack of a formula y(x), we can find a
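formula for its derivative by the method of implicit differentiation, provided that F is
differentiable. We develop the formula here for dy/dx by differentiating both sides of the
equation F(x, y) = 0 with respect to x (assuming y is a function of x) using the chain
rule, i.e.,
$$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0.$$
If $\partial F/\partial y \neq 0$, solving for dy/dx gives
$$\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{F_x}{F_y}.$$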
There is a theorem, called the implicit function theorem, which guarantees the ex-
istence of the derivative dy/dx under some conditions. It says that if F is defined on a
region containing (a, b), where F(a, b) = 0, Fy (a, b) ≠ 0, and Fx and Fy are both con-
tinuous on this region, then the equation F(x, y) = 0 defines a unique y as a function
of x near a with y(a) = b, and the derivative dy/dx does exist and is equal to −Fx /Fy .
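Solution. Let
$$F(x, y) = x^3 + y^3 - 6xy + x^2 - 2y - 1 = 0.$$
Then
$$F_x = 3x^2 - 6y + 2x \quad \text{and} \quad F_y = 3y^2 - 6x - 2.$$
Therefore,
$$y' = \frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{3x^2 - 6y + 2x}{3y^2 - 6x - 2}.$$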
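The formula can be tested at a concrete point of the curve. The sketch below is illustrative: it uses the point (0, −1), which satisfies F(0, −1) = 0, and a hypothetical helper `solve_y` that recovers y(x) near that point by Newton's method so that a difference quotient can be compared with −Fx/Fy.

```python
def F(x, y):
    return x**3 + y**3 - 6 * x * y + x**2 - 2 * y - 1


def Fx(x, y):
    return 3 * x**2 - 6 * y + 2 * x


def Fy(x, y):
    return 3 * y**2 - 6 * x - 2


def solve_y(x, y_guess, steps=50):
    """Newton's method in y for F(x, y) = 0 with x held fixed."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / Fy(x, y)
    return y


x0, y0 = 0.0, -1.0                       # F(0, -1) = 0, so the point is on the curve
formula = -Fx(x0, y0) / Fy(x0, y0)       # implicit differentiation formula

h = 1e-5
numeric = (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)
print(formula, numeric)
```

Here Fy(0, −1) = 1 ≠ 0, so the implicit function theorem applies near this point and the two values should agree closely.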
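$$\frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z} = -\frac{F_x}{F_z}, \qquad \frac{\partial z}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z} = -\frac{F_y}{F_z}.$$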
Method 2: We avoid the use of the formula by differentiating the equation with
respect to x, using the one-variable chain rule and the one-variable implicit differen-
tiation, assuming that z is a function of x, and treating y as a constant. Then we have
$$3x^2 + 3z^2\frac{\partial z}{\partial x} + 6yz + 6xy\frac{\partial z}{\partial x} = 0.$$
2.6 Implicit differentiation | 103
Then
$$dz = -\frac{x^2 + 2yz}{z^2 + 2xy}\,dx - \frac{y^2 + 2xz}{z^2 + 2xy}\,dy.$$
This means $\dfrac{\partial z}{\partial x} = -\dfrac{x^2 + 2yz}{z^2 + 2xy}$ and $\dfrac{\partial z}{\partial y} = -\dfrac{y^2 + 2xz}{z^2 + 2xy}$.
In general, two equations will allow two variables to be defined implicitly as functions
of the remaining variables, three equations will allow three variables to be defined
implicitly as functions of the remaining variables, and so on.
We first consider the system of equations
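$$\begin{cases} F(x, y, z) = 0, \\ G(x, y, z) = 0. \end{cases}$$
Assume that two functions y(x) and z(x) are implicitly defined as functions of the
independent variable x. How do we find dy/dx and dz/dx? We apply the chain rule to both
equations simultaneously to obtain
$$\begin{cases} F_x + F_y \dfrac{dy}{dx} + F_z \dfrac{dz}{dx} = 0, \\[4pt] G_x + G_y \dfrac{dy}{dx} + G_z \dfrac{dz}{dx} = 0. \end{cases}$$
If the Jacobian determinant $\begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix} = F_y G_z - F_z G_y \neq 0$, then, by using Cramer's rule,
the determinant solution is
$$\frac{dy}{dx} = -\frac{\begin{vmatrix} F_x & F_z \\ G_x & G_z \end{vmatrix}}{\begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix}} \quad \text{and} \quad \frac{dz}{dx} = -\frac{\begin{vmatrix} F_y & F_x \\ G_y & G_x \end{vmatrix}}{\begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix}}. \tag{2.5}$$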
Example 2.6.3. If $\begin{cases} x^2 + y^2 + z^2 = 4, \\ x^2 - y + 2z^2 = 2, \end{cases}$ find $\dfrac{dy}{dx}$ and $\dfrac{dz}{dx}$.
Solution. We could use equation (2.5), but instead we find the derivative with respect
to x for each side of the equations using implicit differentiation to obtain
$$\begin{cases} 2x + 2y \dfrac{dy}{dx} + 2z \dfrac{dz}{dx} = 0, \\[4pt] 2x - \dfrac{dy}{dx} + 4z \dfrac{dz}{dx} = 0. \end{cases}$$
Now multiplying the first equation by 2 and subtracting it from the second equation,
we have
$$\begin{cases} 2x + 2y \dfrac{dy}{dx} + 2z \dfrac{dz}{dx} = 0, \\[4pt] -2x - (1 + 4y)\dfrac{dy}{dx} = 0. \end{cases}$$
Solve for dy/dx to obtain
$$\frac{dy}{dx} = \frac{-2x}{1 + 4y}.$$
Substituting dy/dx back into the first equation, we have
$$\frac{dz}{dx} = -\frac{1}{2z}\left(2x + 2y\left(\frac{-2x}{1 + 4y}\right)\right) = -\frac{x + 2xy}{z + 4yz}.$$
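As a cross-check, the determinant formula (2.5) can be evaluated at a point of the curve and compared with the closed forms just obtained. This is a sketch under stated assumptions: the function name `cramer_derivatives` is hypothetical, and the point (√1.92, 1.2, 0.8) was chosen by hand because it satisfies both equations.

```python
import math


def cramer_derivatives(Fx, Fy, Fz, Gx, Gy, Gz):
    """Return (dy/dx, dz/dx) from formula (2.5); requires a nonzero Jacobian."""
    J = Fy * Gz - Fz * Gy
    if J == 0:
        raise ValueError("Jacobian determinant is zero")
    dy_dx = -(Fx * Gz - Fz * Gx) / J
    dz_dx = -(Fy * Gx - Fx * Gy) / J
    return dy_dx, dz_dx


# A point on both surfaces x^2 + y^2 + z^2 = 4 and x^2 - y + 2 z^2 = 2.
x, y, z = math.sqrt(1.92), 1.2, 0.8

# Partials of F = x^2 + y^2 + z^2 - 4 and G = x^2 - y + 2 z^2 - 2.
dy_dx, dz_dx = cramer_derivatives(2 * x, 2 * y, 2 * z, 2 * x, -1.0, 4 * z)

closed_dy = -2 * x / (1 + 4 * y)
closed_dz = -(x + 2 * x * y) / (z + 4 * y * z)
print(dy_dx, closed_dy, dz_dx, closed_dz)
```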
Now we consider the situation where the two variables u and v are defined implic-
itly as functions u = u(x, y) and v = v(x, y) by two equations of the following form:
F(x, y, u, v) = 0,
{
G(x, y, u, v) = 0.
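We use the chain rule to differentiate the equations with respect to x (keeping y constant),
and then we solve the two resulting equations for $\partial u/\partial x$ and $\partial v/\partial x$. Differentiating gives
$$\begin{cases} F_x + F_u \dfrac{\partial u}{\partial x} + F_v \dfrac{\partial v}{\partial x} = 0, \\[4pt] G_x + G_u \dfrac{\partial u}{\partial x} + G_v \dfrac{\partial v}{\partial x} = 0. \end{cases}$$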
we have
$$\begin{cases} u = x^2 + y^2 + \cos v, \\ y \sin v + v \sin x = 0. \end{cases}$$
Find $\partial v/\partial x$ and $\partial u/\partial y$ at the point (x, y, u, v) = (0, 1, 0, π).
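When x = 0, y = 1, u = 0, v = π, we have
$$\begin{cases} \dfrac{\partial u}{\partial x} = 0, \\[4pt] -\dfrac{\partial v}{\partial x} + \pi \cos 0 = 0. \end{cases}$$
Thus, $\left.\dfrac{\partial v}{\partial x}\right|_{(0,1,0,\pi)} = \pi$. To compute $\partial u/\partial y$, we differentiate both equations with respect to y
to get
$$\begin{cases} \dfrac{\partial u}{\partial y} = 2y - \sin v \dfrac{\partial v}{\partial y}, \\[4pt] \sin v + y \cos v \dfrac{\partial v}{\partial y} + \dfrac{\partial v}{\partial y} \sin x = 0. \end{cases}$$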
We have already found tangent lines and normal planes for a curve C in space given
by a vector-valued function r(t) = ⟨x(t), y(t), z(t)⟩, where x(t), y(t), and z(t) are
differentiable functions of t. The line tangent to the curve at t = t0 is
$$\mathbf{r} = \mathbf{r}(t_0) + t\,\mathbf{r}'(t_0),$$
where r′(t0) is the tangent vector at t = t0. With (x0, y0, z0) = (x(t0), y(t0), z(t0)), the
parametric equations and symmetric equations of the tangent line are, therefore,
$$\begin{cases} x = x_0 + x'(t_0)t, \\ y = y_0 + y'(t_0)t, \\ z = z_0 + z'(t_0)t, \end{cases} \quad \text{and} \quad \frac{x - x_0}{x'(t_0)} = \frac{y - y_0}{y'(t_0)} = \frac{z - z_0}{z'(t_0)}, \quad \text{respectively.}$$
The normal plane at the same point is x′(t0)(x − x0) + y′(t0)(y − y0) + z′(t0)(z − z0) = 0.
Now we are able to find tangent lines and normal planes for a curve C that is implicitly
defined by a system of equations of the form
$$\begin{cases} F(x, y, z) = 0, \\ G(x, y, z) = 0. \end{cases}$$
In general, these equations implicitly define two variables as functions of the third,
say, y = y(x) and z = z(x). Thus, we can parameterize C with parameter x as follows:
$$x = x, \quad y = y(x), \quad z = z(x).$$
The implicit differentiation methods described previously allow us to compute dy/dx and
dz/dx even though we do not have formulas for y(x) and z(x). Consequently, we are able
to find the tangent vector $\left\langle 1, \dfrac{dy}{dx}, \dfrac{dz}{dx} \right\rangle$ and tangent line at any point on C.
Example 2.7.1. Find an equation of the tangent line and an equation of the normal plane at the point
(1, −2, 1) of the curve C defined implicitly by
Solution. If we regard x as the independent variable and as the parameter, then the
system of equations implicitly defines two functions y = y(x) and z = z(x). Choosing
the parameterization x = x, y = y(x), and z = z(x), a tangent vector of the curve is
$\left\langle 1, \dfrac{dy}{dx}, \dfrac{dz}{dx} \right\rangle$. In order to find the derivatives of these two functions, first implicitly
differentiate the system of equations with respect to x, i.e.,
$$\begin{cases} 2x + 2y \dfrac{dy}{dx} + 2z \dfrac{dz}{dx} = 0, \\[4pt] 1 + \dfrac{dy}{dx} + \dfrac{dz}{dx} = 0. \end{cases}$$
Solving this system of linear equations for dy/dx and dz/dx gives
$$\frac{dy}{dx} = \frac{z - x}{y - z} \quad \text{and} \quad \frac{dz}{dx} = \frac{x - y}{y - z}.$$
Therefore, when x = 1, y = −2, and z = 1, we have y′(1) = 0 and z′(1) = −1, and the
tangent vector $\left\langle 1, \dfrac{dy}{dx}, \dfrac{dz}{dx} \right\rangle$ is
v = ⟨1, 0, −1⟩.
The tangent line is therefore
x = 1 + t, y = −2, z = 1 − t,
and the normal plane is
1 ⋅ (x − 1) + 0 ⋅ (y + 2) + (−1) ⋅ (z − 1) = 0, that is, x − z = 0.
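Note. If this example had asked for the tangent line at (−2, 1, 1), then something different
would have happened. If you try the parameterization x = x, y = y(x), z = z(x), then the system
$$\begin{cases} 2x + 2y \dfrac{dy}{dx} + 2z \dfrac{dz}{dx} = 0, \\[4pt] 1 + \dfrac{dy}{dx} + \dfrac{dz}{dx} = 0 \end{cases}$$
has no solutions at (−2, 1, 1) since the two equations would be inconsistent. This does
not mean that there is no tangent there, but the tangent line is parallel to the yz-plane
(perpendicular to the x-axis). To solve the problem, we try a different way to parameterize
the curve, i.e., x = x(y), y = y, z = z(y), and differentiate with respect to y:
$$\begin{cases} 2x \dfrac{dx}{dy} + 2y + 2z \dfrac{dz}{dy} = 0, \\[4pt] \dfrac{dx}{dy} + 1 + \dfrac{dz}{dy} = 0. \end{cases}$$
At (−2, 1, 1) we obtain dx/dy = 0 and dz/dy = −1. Therefore, the tangent vector is T = ⟨0, 1, −1⟩.
So, the tangent line is
$$x = -2, \quad y = 1 + t, \quad z = 1 - t.$$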
Figure 2.11: Tangent line to the intersection curve of a plane and a sphere.
2.7 Tangent lines and tangent planes | 109
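such that t = t0 gives the point M. Since C lies on S, any point (x(t), y(t), z(t)) on C must
satisfy the defining equation F(x, y, z) = 0 of S, so that
$$F(x(t), y(t), z(t)) = 0. \tag{2.8}$$
If x(t), y(t), and z(t) are differentiable functions of t, and F is also differentiable, then
we can use the chain rule to differentiate both sides of (2.8) as follows:
$$\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} + \frac{\partial F}{\partial z}\frac{dz}{dt} = 0. \tag{2.9}$$
If we write, at t = t0,
$$\mathbf{n} = \langle F_x(M), F_y(M), F_z(M) \rangle$$
and
$$\mathbf{v} = \langle x'(t_0), y'(t_0), z'(t_0) \rangle,$$
where v is the tangent vector to C at M, equation (2.9) can be written in terms of a dot
product as
$$\mathbf{n} \cdot \mathbf{v} = 0. \tag{2.10}$$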
This equation shows that n is perpendicular to the tangent vector v at M for any curve C
on S that passes through M and satisfies the above differentiability conditions. There-
fore, all the tangent lines of these curves at M must be coplanar as shown in Figure 2.12.
Those tangent lines form a plane which we define as the tangent plane to the surface
at M.
Definition 2.7.1. Assume a surface in space has equation F(x, y, z) = 0, and F(x, y, z) is differentiable
at M(x0, y0, z0). Then the tangent plane to the surface at M is
$$F_x(M)(x - x_0) + F_y(M)(y - y_0) + F_z(M)(z - z_0) = 0.$$
The nonzero vector n = ⟨Fx(M), Fy(M), Fz(M)⟩ is a normal vector of the tangent plane at M. The normal
line at M is
$$\frac{x - x_0}{F_x(M)} = \frac{y - y_0}{F_y(M)} = \frac{z - z_0}{F_z(M)}.$$
Example 2.7.2. Find equations of the tangent plane and normal line to the ellipsoid
2x 2 + 4y 2 + z 2 = 10
F(x, y, z) = 2x 2 + 4y2 + z 2 − 10 = 0.
Therefore, we have
which simplifies to 2y − x − z − 5 = 0.
Figure 2.13: Tangent plane and normal line. Examples 2.7.2 and 2.7.3.
F(x, y, z) = f (x, y) − z = 0.
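If f is differentiable, then
$$F_x = f_x, \quad F_y = f_y, \quad F_z = -1,$$
and a normal vector to the tangent plane is ⟨fx, fy, −1⟩. Thus, an equation of the tangent
plane to the surface at (x0, y0, z0) becomes
$$f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - z_0) = 0$$
or
$$z = z_0 + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$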
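Example 2.7.3. Find the tangent plane and normal line to the elliptic paraboloid $z = 2x^2 + y^2$ at the
point (1, 1, 3).
Solution. Here $f(x, y) = 2x^2 + y^2$, so $f_x = 4x$ and $f_y = 2y$, and at (1, 1) we have $f_x(1, 1) = 4$ and
$f_y(1, 1) = 2$. The tangent plane at (1, 1, 3) is therefore
$$z = 3 + 4(x - 1) + 2(y - 1),$$
or
$$4x + 2y - z = 3.$$
A normal vector is ⟨4, 2, −1⟩, so the normal line is $\dfrac{x - 1}{4} = \dfrac{y - 1}{2} = \dfrac{z - 3}{-1}$.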
Figure 2.13(b) shows the elliptic paraboloid and its tangent plane at (1, 1, 3) that we
found in this example.
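How closely the plane hugs the surface near the point of tangency can be checked with a few evaluations. The sketch below is only illustrative; the sampling direction (x increases while y decreases by the same step) is an arbitrary assumption.

```python
def f(x, y):
    return 2 * x**2 + y**2


def tangent_plane(x, y):
    """z-value on the tangent plane at (1, 1, 3), i.e. 4x + 2y - z = 3."""
    return 4 * x + 2 * y - 3


# The gap between surface and plane shrinks quadratically with the step.
for step in (0.1, 0.01, 0.001):
    x, y = 1 + step, 1 - step
    gap = f(x, y) - tangent_plane(x, y)
    print(step, gap)
```

Along this direction the gap works out to three times the square of the step, so dividing the step by 10 divides the gap by 100.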
then $\mathbf{r}_u \times \mathbf{r}_v$ is a vector normal to the plane tangent to the surface. If the point P on the
surface is r(u0, v0), then r(u, v0) is a curve on the surface passing through P;
thus, $\mathbf{r}_u(u_0, v_0)$ is its tangent vector at P. Similarly, $\mathbf{r}_v(u_0, v_0)$ is a tangent vector of the
curve r(u0, v). Thus, a normal vector is obtained by taking the cross product of the two
tangent vectors. If we choose the parameterization
$$\mathbf{r}(x, y) = \langle x, y, f(x, y) \rangle$$
for the surface z = f(x, y), then a normal vector of its tangent plane is
$$\mathbf{r}_x \times \mathbf{r}_y = \langle 1, 0, f_x \rangle \times \langle 0, 1, f_y \rangle = \langle -f_x, -f_y, 1 \rangle.$$
The linearization z = z0 + fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) is exactly an equation of the
tangent plane at the point (x0, y0, z0). The change Δz is approximated by the change
dz in the corresponding tangent plane, as shown in Figure 2.14.
This involves one variable l, so we define the directional derivative along the direction
u as follows.
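Definition 2.8.1. Let z = f(x, y) be a function of two variables and (a, b) be an interior point in its
domain; u = ⟨cos α, sin α⟩ is a unit vector. The directional derivative of z at the point (a, b) in the
direction u is defined by
$$D_{\mathbf{u}} f(a, b) = \lim_{l \to 0^+} \frac{f(a + l \cos\alpha, b + l \sin\alpha) - f(a, b)}{l},$$
provided that this limit exists.
Note. We also use notations such as $\dfrac{dz}{dl}$, $\dfrac{\partial z}{\partial l}$, $\dfrac{\partial f}{\partial l}$, or $\dfrac{\partial z}{\partial \rho}$ for directional derivatives.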
Note. The cone has no partial derivative at (0, 0). So, this example shows that a function
may have a directional derivative at a point in some direction, even though it may
not have partial derivatives at that point. This is because the directional derivative is
defined as a one-sided limit!
Surprisingly, if a function is differentiable, its derivative in any direction exists,
and we can find directional derivatives using its partials. This is shown in the following
theorem.
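Theorem 2.8.1. If z = f(x, y) is differentiable at P0(a, b), then the directional derivative of f exists at
P0 in the direction given by any unit vector u = ⟨cos α, sin α⟩ and
$$D_{\mathbf{u}} f(a, b) = f_x(a, b) \cos\alpha + f_y(a, b) \sin\alpha.$$
Proof. Since f is differentiable at (a, b), taking Δx = l cos α and Δy = l sin α with l > 0 gives
$$f(a + l \cos\alpha, b + l \sin\alpha) - f(a, b) = f_x(a, b)\,l \cos\alpha + f_y(a, b)\,l \sin\alpha + o\big(\sqrt{[l \cos\alpha]^2 + [l \sin\alpha]^2}\big).$$
It follows that
$$\begin{aligned}
D_{\mathbf{u}} f(a, b) &= \lim_{l \to 0^+} \frac{f(a + l \cos\alpha, b + l \sin\alpha) - f(a, b)}{l} \\
&= \lim_{l \to 0^+} \frac{f_x(a, b)\,l \cos\alpha + f_y(a, b)\,l \sin\alpha + o\big(\sqrt{[l \cos\alpha]^2 + [l \sin\alpha]^2}\big)}{l} \\
&= \lim_{l \to 0^+} \frac{f_x(a, b)\,l \cos\alpha + f_y(a, b)\,l \sin\alpha + o(l)}{l} \\
&= f_x(a, b) \cos\alpha + f_y(a, b) \sin\alpha.
\end{aligned}$$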
Now we can rewrite the directional derivative in a dot product form, i. e.,
Du f (a, b) = fx (a, b) cos α + fy (a, b) sin α = ⟨fx (a, b), fy (a, b)⟩ ⋅ ⟨cos α, sin α⟩.
This is
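Example 2.8.2. Find the directional derivative of $z = xe^{2y}$ at P(1, 0) in the direction from P to the point
Q(2, −1).
Solution. The unit vector in the direction of $\overrightarrow{PQ}$ is
$$\mathbf{u} = \frac{\langle 2 - 1, -1 - 0 \rangle}{\sqrt{(2 - 1)^2 + (-1 - 0)^2}} = \left\langle \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}} \right\rangle.$$
Since
$$\left.\frac{\partial z}{\partial x}\right|_{(1,0)} = \left. e^{2y} \right|_{(1,0)} = 1 \quad \text{and} \quad \left.\frac{\partial z}{\partial y}\right|_{(1,0)} = \left. 2xe^{2y} \right|_{(1,0)} = 2,$$
we have
$$D_{\mathbf{u}} f(1, 0) = \langle 1, 2 \rangle \cdot \left\langle \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right\rangle = 1 \cdot \frac{1}{\sqrt{2}} + 2 \cdot \left(-\frac{1}{\sqrt{2}}\right) = -\frac{\sqrt{2}}{2}.$$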
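The value −√2/2 can be confirmed by evaluating the defining one-sided difference quotient with a small positive step. This is a minimal sketch; the step size and the variable names are assumptions made for illustration.

```python
import math


def f(x, y):
    return x * math.exp(2 * y)


def gradient(x, y):
    """Partial derivatives of f = x e^{2y}."""
    return (math.exp(2 * y), 2 * x * math.exp(2 * y))


P, Q = (1.0, 0.0), (2.0, -1.0)
length = math.hypot(Q[0] - P[0], Q[1] - P[1])
u = ((Q[0] - P[0]) / length, (Q[1] - P[1]) / length)   # unit vector from P to Q

gx, gy = gradient(*P)
by_formula = gx * u[0] + gy * u[1]

step = 1e-6   # small positive step along u (one-sided quotient)
by_limit = (f(P[0] + step * u[0], P[1] + step * u[1]) - f(*P)) / step
print(by_formula, -math.sqrt(2) / 2, by_limit)
```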
Therefore, the maximum directional derivative that f can attain is $\sqrt{f_x^2 + f_y^2}$, and this
happens when θ = 0, that is, when the two vectors ⟨fx(a, b), fy(a, b)⟩ and u point in
the same direction. The minimum directional derivative that f can attain is $-\sqrt{f_x^2 + f_y^2}$,
and this happens when θ = π, that is, when the two vectors ⟨fx(a, b), fy(a, b)⟩ and u
point in exactly opposite directions. Therefore, along the direction ⟨fx(a, b), fy(a, b)⟩,
the function f attains its greatest directional derivative; f attains its smallest directional
derivative in the direction −⟨fx(a, b), fy(a, b)⟩, as shown in Figure 2.16. We give
the vector ⟨fx(a, b), fy(a, b)⟩ a special name.
Definition 2.8.2. If z = f (x, y) is a differentiable function, then the gradient of f at the point (a, b) is
the vector ∇f (a, b) defined by
∇f (a, b) = ⟨fx (a, b), fy (a, b)⟩ = fx (a, b)i + fy (a, b)j.
∇f (x, y) = ⟨fx (x, y), fy (x, y)⟩ = fx (x, y)i + fy (x, y)j.
By the above definition, we can write the directional derivative of f in the direction
given by a unit vector u as
Du f (a, b) = ∇f ⋅ u, (2.16)
and the steepest ascent/steepest slope of f at (x, y) is |∇f |, which occurs in the direction
∇f . The steepest descent of f at (x, y) is −|∇f |, which occurs in the direction of −∇f . In
fact, the directional derivative of f at (a, b) in the direction u is the scalar projection of
∇f onto the vector u.
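The claim that the largest directional derivative is |∇f| and occurs along ∇f can be illustrated by scanning all directions. This is only a sketch under stated assumptions: the gradient ⟨1, 2⟩ is taken from Example 2.8.2, and the names `d_u`, `steps`, `best`, and `worst` are hypothetical.

```python
import math

grad = (1.0, 2.0)  # gradient of z = x*exp(2y) at (1, 0), from Example 2.8.2

def d_u(alpha):
    """Directional derivative grad . <cos(alpha), sin(alpha)>."""
    return grad[0] * math.cos(alpha) + grad[1] * math.sin(alpha)

steps = 3600  # one sample every 0.1 degree
angles = [2 * math.pi * k / steps for k in range(steps)]

best = max(angles, key=d_u)    # direction of steepest ascent on the grid
worst = min(angles, key=d_u)   # direction of steepest descent on the grid

grad_norm = math.hypot(grad[0], grad[1])
grad_angle = math.atan2(grad[1], grad[0])
print(d_u(best), grad_norm, best, grad_angle)
```

Up to the grid resolution, `best` coincides with the angle of ∇f and `d_u(best)` with |∇f|, while `worst` points the opposite way.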
\[
f_x + f_y \frac{dy}{dx} = 0,
\]
\[
f_x(a, b) + f_y(a, b) \left.\frac{dy}{dx}\right|_{(a,b)} = 0 \quad \text{(computing at } P\text{)},
\]
\[
\langle f_x(a, b),\, f_y(a, b)\rangle \cdot \left\langle 1,\, \left.\frac{dy}{dx}\right|_{(a,b)} \right\rangle = 0,
\]
\[
\nabla f(a, b) \cdot \left.\left\langle \frac{dx}{dx},\, \frac{dy}{dx} \right\rangle\right|_{(a,b)} = 0 .
\]
Note that the tangent vector of the level curve written parametrically as x = x and
y = y(x) is given by ⟨dx/dx, dy/dx⟩. The zero value for the dot product proves that the gradient
vector of f is perpendicular to the level curve at P, as shown in Figure 2.17.
Solution.
1. The gradient of f at (2, 1) is
∇f (2, 1) = ⟨fx (2, 1), fy (2, 1)⟩ = ⟨2x, −6y⟩|(2,1) = ⟨4, −6⟩.
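\[
x^2 - 3y^2 = 1 \quad\text{and}\quad z = 0 .
\]
Differentiating implicitly with respect to x gives
\[
2x - 6yy' = 0, \qquad y' = \frac{x}{3y} .
\]
So at the point (2, 1) on the level curve C, the slope is y′(2) = 2/3, and the tangent
line is
\[
y - 1 = \frac{2}{3}(x - 2) .
\]
A tangent vector of C at (2, 1) is ⟨3, 2⟩, which is indeed perpendicular to the gradient
∇f(2, 1) = ⟨4, −6⟩ since
\[
\langle 3, 2\rangle \cdot \langle 4, -6\rangle = 12 - 12 = 0 .
\]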
Example 2.8.4. Suppose the temperature distribution on a plate at any point (x, y) satisfies
\[
T(x, y) = 80 - 2x^2 - y^2 - x \quad (^\circ\mathrm{C}) .
\]
An unlucky ant falls onto the plate at (1, 1). Find the best escape path for the ant.
Solution. Note that the temperature at (1, 1) is 76 °C! The strategy is to find the path
along which the temperature decreases most rapidly. This is equivalent to finding the
steepest descent path on the surface (the graph of the function T(x, y)) starting from
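the point (1, 1). The direction of this path must be opposite to the direction of ∇T. Assume
the path is y = y(x). Then ⟨dx, dy⟩, which is the tangent vector, must be parallel
to −∇T = ⟨−Tx, −Ty⟩. So,
\[
\frac{dx}{-T_x} = \frac{dy}{-T_y}
\]
or
\[
\frac{dx}{T_x} = \frac{dy}{T_y} .
\]
Since Tx = −4x − 1 and Ty = −2y, this becomes
\[
\frac{1}{-4x - 1}\,dx = \frac{1}{-2y}\,dy .
\]
Integrating both sides gives
\[
-\int \frac{1}{4x + 1}\,dx = -\int \frac{1}{2y}\,dy .
\]
This simplifies to (1/4) ln |4x + 1| = (1/2) ln |y| + C, where C is an arbitrary constant.
Since the path starts at (1, 1), we have C = (1/4) ln 5. Then
\[
4x + 1 = 5y^2 .
\]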
This is certainly not the shortest path to the edge of the plate, but the path along which
the temperature decreases most rapidly, as shown in Figure 2.18.
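As a sanity check on the escape path, one can verify that the tangent of the curve 4x + 1 = 5y² is parallel to −∇T at every point of the curve. The sketch below is illustrative only; the helper names (`neg_grad_T`, `path_x`, `path_tangent`, `cross2`) are assumptions, and the curve is parameterized by y for convenience.

```python
def neg_grad_T(x, y):
    # T(x, y) = 80 - 2x^2 - y^2 - x, so -grad T = <4x + 1, 2y>
    return (4 * x + 1, 2 * y)

def path_x(y):
    # The path 4x + 1 = 5y^2 solved for x
    return (5 * y * y - 1) / 4

def path_tangent(y):
    # Derivative of the point (path_x(y), y) with respect to y
    return (5 * y / 2, 1.0)

def cross2(a, b):
    # z-component of the cross product; zero means the vectors are parallel
    return a[0] * b[1] - a[1] * b[0]

def dot2(a, b):
    return a[0] * b[0] + a[1] * b[1]

samples = [1.0 + 0.25 * k for k in range(9)]  # y from 1 to 3
residuals = [cross2(path_tangent(y), neg_grad_T(path_x(y), y)) for y in samples]
same_way = [dot2(path_tangent(y), neg_grad_T(path_x(y), y)) > 0 for y in samples]
print(path_x(1.0), residuals, same_way)
```

The curve passes through (1, 1), the cross products vanish, and the dot products are positive, so moving along the curve with increasing y follows −∇T.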
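Example 2.8.5. Compute
\[
\left.\frac{\partial f}{\partial l}\right|_{(1,1,2)}
\]
in the direction with direction angles α = π/3, β = π/4, and γ = π/3
when f is defined by
\[
f(x, y, z) = xy + yz + zx .
\]
Solution. The unit vector in the given direction is
\[
\mathbf{u} = \langle \cos \pi/3,\, \cos \pi/4,\, \cos \pi/3 \rangle = \left\langle \frac{1}{2},\, \frac{\sqrt{2}}{2},\, \frac{1}{2} \right\rangle .
\]
Also,
\[
\begin{aligned}
f_x(1, 1, 2) &= (y + z)|_{(1,1,2)} = 3, \\
f_y(1, 1, 2) &= (x + z)|_{(1,1,2)} = 3, \\
f_z(1, 1, 2) &= (y + x)|_{(1,1,2)} = 2,
\end{aligned}
\]
so
\[
\frac{\partial f(1, 1, 2)}{\partial l} = \langle 3, 3, 2\rangle \cdot \left\langle \frac{1}{2},\, \frac{\sqrt{2}}{2},\, \frac{1}{2} \right\rangle
= 3\cdot\frac{1}{2} + 3\cdot\frac{\sqrt{2}}{2} + 2\cdot\frac{1}{2} = \frac{5}{2} + \frac{3}{2}\sqrt{2} .
\]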
Note. There is a similar idea to level curves for functions of three variables. If we set
f (x, y, z) = k, for a constant k, then we get a level surface of the function u = f (x, y, z).
At any given point P(a, b, c), for any curve passing through P that lies on the surface,
we can show that the gradient vector ∇f is actually perpendicular to its tangent vec-
tor. (We provided a proof in the previous section.) Therefore, ∇f is perpendicular to
the tangent plane to the level surface at P, as shown in Figure 2.19. Thus, the gradi-
ent vector ∇f = ⟨fx , fy , fz ⟩ is a normal vector of the tangent plane to the level surface
f (x, y, z) = k through P.
Figure 2.19: Level surfaces and gradient vectors for functions of three variables.
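Now, we can use gradient vectors to find a tangent vector for a curve of intersection of
two surfaces,
\[
\begin{cases}
f(x, y, z) = 0, \\
g(x, y, z) = 0 .
\end{cases}
\]
Since the tangent vector at a point of the curve is perpendicular to both ∇f and ∇g there,
a tangent vector is
\[
\mathbf{v} = \nabla f \times \nabla g .
\]
For example, for the curve
\[
\begin{aligned}
f(x, y, z) &= x^2 + y^2 + z^2 - 6 = 0, \\
g(x, y, z) &= x + y + z = 0,
\end{aligned}
\]
we have ∇f(1, −2, 1) = ⟨2x, 2y, 2z⟩|(1,−2,1) = ⟨2, −4, 2⟩ and ∇g(1, −2, 1) = ⟨1, 1, 1⟩. Thus,
\[
\mathbf{v} = \begin{pmatrix} 2 \\ -4 \\ 2 \end{pmatrix} \times \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} -6 \\ 0 \\ 6 \end{pmatrix} = 6 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} .
\]
So any vector parallel to ⟨−1, 0, 1⟩ is a tangent vector to the curve at (1, −2, 1). There is
no need for any parameterization.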
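The cross product above is easy to reproduce. The following minimal sketch (helper names `cross` and `dot` are assumptions) recomputes v from the two gradients at (1, −2, 1) and confirms that it is perpendicular to both of them.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

grad_f = (2, -4, 2)   # gradient of x^2 + y^2 + z^2 - 6 at (1, -2, 1)
grad_g = (1, 1, 1)    # gradient of x + y + z

v = cross(grad_f, grad_g)
print(v, dot(v, grad_f), dot(v, grad_g))
```

Because v is orthogonal to both normals, it lies in both tangent planes and is therefore tangent to the curve of intersection.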
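Note. Using vector notation and some linear algebra, we can now write
the Taylor series for a function of two variables in a more compact way. Let
\(\mathbf{x} = \langle x, y\rangle\) and \(\mathbf{x}_0 = \langle a, b\rangle\). Then
\(f(x, y) = f(\mathbf{x})\), \(f(a, b) = f(\mathbf{x}_0)\), and
\(\Delta\mathbf{x} = \mathbf{x} - \mathbf{x}_0 = \langle \Delta x, \Delta y\rangle\).
Note that
\[
\begin{aligned}
\left(\Delta x \frac{\partial}{\partial x} + \Delta y \frac{\partial}{\partial y}\right) f(a, b)
&= \Delta x \frac{\partial f(\mathbf{x}_0)}{\partial x} + \Delta y \frac{\partial f(\mathbf{x}_0)}{\partial y} \\
&= \langle f_x(\mathbf{x}_0),\, f_y(\mathbf{x}_0)\rangle \cdot \langle \Delta x, \Delta y\rangle = \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x} .
\end{aligned}
\]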
As shown in Figure 2.20, for a function of two variables, z = f (x, y), there are also
interesting features such as local or global extreme values, as we have seen in one-
variable calculus. We first give the definition of local and global extrema.
Figure 2.21 shows that a function whose graph is the upper hemisphere has a local
maximum (also absolute maximum) above its center, and a function whose graph is
the cone with vertex downwards has an absolute minimum at its vertex.
How can we identify these interesting points? As we saw in one-variable calculus, the
answer is to use derivatives. In this section, we use partial derivatives to help locate
maxima and minima of functions of two variables. We first consider the case that a
differentiable function z = f(x, y) has a local maximum point at (a, b). Then, the
intersection curve
\[
\begin{cases}
z = f(x, y), \\
y = b
\end{cases}
\]
also has a local maximum at the same point (a, b). However, the curve z = f (x, b) in
the y = b plane has just one variable. Therefore, the derivative with respect to x at
x = a must be 0. This means fx (a, b) = 0. Similarly, we can obtain fy (a, b) = 0, or,
equivalently, ∇f (a, b) = 0, as shown in Figure 2.22.
Note. If fx(a, b) = 0 and fy(a, b) = 0, then ∇f(a, b) = 0, and the tangent plane
at (a, b) is z = z0, where z0 = f(a, b). Geometrically, ∇f(a, b) = 0 means that the
graph of f has a horizontal tangent plane at the point (a, b).
If a function has no partial derivatives at a point, it may still have extreme values
there (similar to one-variable calculus, a function that is not differentiable may still
have extreme values). For instance, the upper right circular cone z = √(x² + y²) has no
partial derivatives at (0, 0), but it does have a local minimum value 0 at (0, 0). Therefore,
candidates of extrema for any function are those points where ∇f = 0 or ∇f does
not exist. We give them a special name.
Definition 2.9.2 (Critical points). A point (a, b) is called a critical point of z = f (x, y) if ∇f (a, b) = 0 or
if ∇f (a, b) does not exist.
Theorem 2.9.1 shows that if f has a local maximum or minimum at (a, b), then (a, b)
must be a critical point of f .
Example 2.9.1. Find all critical points for each of the following functions:
Solution. For (a), the function z = xy is differentiable everywhere, so all critical points
are those such that ∇f = 0, i. e.,
∇f = 0 → fx = 0 and fy = 0,
fx = y = 0 and fy = x = 0.
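\[
\nabla f = 0 \;\rightarrow\; f_x = 0 \text{ and } f_y = 0,
\]
\[
f_x = 4x^3 - 8y = 0 \quad\text{and}\quad f_y = 4y^3 - 8x = 0,
\]
\[
x^3 = 2y \quad\text{and}\quad y^3 = 2x .
\]
Thus, we have
\[
\begin{aligned}
x^9 &= 8(2x), \\
x(x^8 - 16) &= 0, \\
x(x^4 + 4)(x^2 + 2)(x - \sqrt{2})(x + \sqrt{2}) &= 0 .
\end{aligned}
\]
The real solutions are x = 0 and x = ±√2, and y = x³/2 then gives the critical points
(0, 0), (√2, √2), and (−√2, −√2).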
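The real roots of the factored equation are x = 0 and x = ±√2, and y follows from x³ = 2y. A short check, written as a sketch with the assumed helper name `grad_f`, confirms that the gradient fx = 4x³ − 8y, fy = 4y³ − 8x vanishes at each resulting point.

```python
import math

def grad_f(x, y):
    # Partial derivatives used in the solution: f_x = 4x^3 - 8y, f_y = 4y^3 - 8x
    return (4 * x**3 - 8 * y, 4 * y**3 - 8 * x)

roots = [0.0, math.sqrt(2), -math.sqrt(2)]       # real roots of x(x^8 - 16) = 0
candidates = [(x, x**3 / 2) for x in roots]      # y = x^3 / 2 from x^3 = 2y

for x, y in candidates:
    print((x, y), grad_f(x, y))
```

Each printed gradient is zero up to rounding, so all three points are critical points.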
However, as in single-variable calculus, not all critical points give rise to maxima
or minima. For instance, for the function z = xy, ∇f (0, 0) = 0, but the function value 0
at (0, 0) is neither a maximum nor a minimum because near (0, 0) there are points in
the first and third quadrants of the xy-plane which make z = xy positive and points in
the second and fourth quadrants which make z = xy negative. We also give a definition
for this type of point.
Definition 2.9.3 (Saddle points). A critical point (a, b) of z = f(x, y) at which ∇f(a, b) = 0 but f does
not have a local maximum or a local minimum is called a saddle point.
Note. In one-variable calculus, a saddle point is a point where the function has a hor-
izontal tangent line, and nearby the point you can find places where the graph of the
function is above the tangent line and other places where the graph is below the tan-
gent line. The function f (x) = x3 at x = 0 is a good example of a saddle point. Anal-
ogously, in two-variable calculus, a saddle point is a point where the function has a
horizontal tangent plane, and nearby the point you can find places where the graph
of the function is above the tangent plane and other places where the graph is below
the tangent plane.
Theorem 2.9.2 (Second derivative test). Assume all the second partial derivatives of f are continuous
on a disk with center (a, b), and fx(a, b) = 0 and fy(a, b) = 0, so that (a, b) is a critical point of f. Let
\[
A = f_{xx}(a, b), \quad B = f_{xy}(a, b), \quad C = f_{yy}(a, b) .
\]
Then:
(1) If AC − B² > 0 and A > 0, then f(a, b) is a local minimum.
(2) If AC − B² > 0 and A < 0, then f(a, b) is a local maximum.
(3) If AC − B² < 0, then f(a, b) is neither a local minimum nor a local maximum, so (a, b) is a saddle point.
Note.
1. In case (3), where the point (a, b) is a saddle point of f, the graph z = f(x, y) crosses
its tangent plane at (a, b); that is, near the saddle point, part of the graph is above
the tangent plane, and part of the graph is below the tangent plane.
2. If AC − B² = 0, the test fails to give any information. In this case, f could have a
local maximum or local minimum at (a, b), or (a, b) could be a saddle point of f.
3. To help remember the formula for AC − B², we can write it in determinant form,
\[
AC - B^2 = \begin{vmatrix} A & B \\ B & C \end{vmatrix}
= \begin{vmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{vmatrix} = f_{xx} f_{yy} - (f_{xy})^2 .
\]
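A proof of the second derivative test can be seen from the vector form of the Taylor
expansion for a function of two variables. Assume \(\mathbf{x} = \langle x, y\rangle\), \(u = f(\mathbf{x})\), and \(f(\mathbf{x})\) has
continuous first and second partial derivatives at \(\mathbf{x}_0 = (a, b)\). Then, for small \(\Delta\mathbf{x}\),
\[
f(\mathbf{x}_0 + \Delta\mathbf{x}) \approx f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \Delta\mathbf{x} + \frac{1}{2!}\, \Delta\mathbf{x}^{T} H(\mathbf{x}_0)\, \Delta\mathbf{x},
\]
where \(\nabla f(\mathbf{x}_0) = \langle f_x(a, b), f_y(a, b)\rangle\), and \(H(\mathbf{x}_0)\) is the Hessian matrix defined as
\[
H(\mathbf{x}_0) = \left.\begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\right|_{(a,b)} .
\]
Since \(\nabla f(\mathbf{x}_0) = 0\) at a candidate \(\mathbf{x}_0\), for small nonzero \(\Delta\mathbf{x}\) we have
\(f(\mathbf{x}_0 + \Delta\mathbf{x}) > f(\mathbf{x}_0)\) if \(H(\mathbf{x}_0)\) is positive definite, and
\(f(\mathbf{x}_0 + \Delta\mathbf{x}) < f(\mathbf{x}_0)\) if \(H(\mathbf{x}_0)\) is negative definite.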
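The second derivative test is mechanical enough to express as a small function. The sketch below assumes the usual reading A = fxx, B = fxy, C = fyy evaluated at a critical point; the function name `classify` and its return strings are hypothetical choices, not notation from the text.

```python
import math

def classify(A, B, C):
    """Second derivative test at a critical point.

    A, B, C are f_xx, f_xy, f_yy evaluated at the point.
    """
    D = A * C - B * B          # determinant of the Hessian
    if D > 0:
        # D > 0 forces A != 0, so the sign of A decides the type
        return "local minimum" if A > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# z = xy at (0, 0): A = 0, B = 1, C = 0
print(classify(0, 1, 0))

# f = x^4 + y^4 - 8xy: A = 12x^2, B = -8, C = 12y^2
r = math.sqrt(2)
print(classify(12 * 0**2, -8, 12 * 0**2))
print(classify(12 * r**2, -8, 12 * r**2))
```

Returning "inconclusive" when AC − B² = 0 mirrors the fact that the test gives no information in that case.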
Example 2.9.2. Locate and classify all the critical points for each of the following functions:
(a) f(x, y) = xy, (b) f(x, y) = x⁴ + y⁴ − 8xy, and (c) \(z = xye^{-x^2 - y^2}\).
Solution.
(a) We have found the critical point (0, 0) for z = xy in Example 2.9.1. Since A = fxx = 0,
B = fxy = 1, and C = fyy = 0, we have
\[
AC - B^2 = 0 - 1 < 0 .
\]
So (0, 0) is a saddle point.
(b) We calculate A = fxx = 12x², B = fxy = −8, and C = fyy = 12y² and apply the second
derivative test to the critical points we found in Example 2.9.1.
At (0, 0), AC − B² = (144x²y² − 64)|(0,0) = −64 < 0, so it is a saddle point.
At (√2, √2), AC − B² = (144x²y² − 64)|(√2,√2) = 144(4) − 64 > 0 and A = 12(√2)² > 0,
so it is a local minimum.
At (−√2, −√2), AC − B² = (144x²y² − 64)|(−√2,−√2) = 144(4) − 64 > 0 and A = 12(−√2)² > 0,
so it is a local minimum.
(c) Solving ∇f = 0, we have
\[
\begin{cases}
f_x = ye^{-x^2 - y^2} + xye^{-x^2 - y^2}(-2x) = e^{-x^2 - y^2}(y - 2x^2 y) = 0, \\
f_y = xe^{-x^2 - y^2} + xye^{-x^2 - y^2}(-2y) = e^{-x^2 - y^2}(x - 2xy^2) = 0 .
\end{cases}
\]
Since \(e^{-x^2 - y^2} \neq 0\), these equations simplify to
\[
\begin{cases}
y - 2x^2 y = 0, \\
x - 2xy^2 = 0 .
\end{cases}
\]
This gives critical points (0, 0), (1/√2, 1/√2), (−1/√2, −1/√2), (−1/√2, 1/√2), and (1/√2, −1/√2).
Then,
\[
\begin{aligned}
A = f_{xx} &= e^{-x^2 - y^2}(-2x)(y - 2x^2 y) + e^{-x^2 - y^2}(-4xy), \\
B = f_{xy} &= e^{-x^2 - y^2}(-2y)(y - 2x^2 y) + e^{-x^2 - y^2}(1 - 2x^2), \text{ and} \\
C = f_{yy} &= e^{-x^2 - y^2}(-2y)(x - 2xy^2) + e^{-x^2 - y^2}(-4xy) .
\end{aligned}
\]
So at (0, 0), AC − B² < 0, and it is a saddle point. At the other points, note that
1 − 2x² and 1 − 2y² are 0. So, B is 0, and \(AC - B^2 = 16e^{-2x^2 - 2y^2}x^2 y^2 > 0\). Therefore,
the function has local extrema at those points. When x and y have opposite signs,
A > 0, and when x and y have the same sign, A < 0. Thus, the function has two
local maxima at (1/√2, 1/√2) and (−1/√2, −1/√2) and two local minima at (−1/√2, 1/√2) and
(1/√2, −1/√2). Graphs of the three functions are shown in Figure 2.20.
Example 2.9.3. Find the global maximum and global minimum for the function
fx = 2x + 8 = 0,
fy = 2y − 6 = 0.
So, (−4, 3) is the critical point in D (if not in D, we will reject it). The function value at
(−4, 3) is
The graphs of this function and the cylinder are shown in Figure 2.23.
Example 2.9.4. A rectangular container without a lid is to be made from 18 m² of woodboard. Find the
maximum volume of such a container.
Figure 2.23: Global extreme values for functions defined over a closed region.
Solution. Let x = length, y = width, and z = height of the box (in meters). Then the
volume of the box is given by
V = xyz.
Computing the area of the four sides and the bottom of the box, which must have a
total area of 18 m², gives an extra equation (a constraint) linking x, y, and z, i. e.,
\[
2xz + 2yz + xy = 18 .
\]
Solving for z and substituting into V gives
\[
V = xy\,\frac{18 - xy}{2(x + y)} = \frac{18xy - x^2 y^2}{2(x + y)} .
\]
Setting Vx = 0 and Vy = 0 (with x > 0 and y > 0) leads to
\[
18 - 2xy - x^2 = 0 \quad\text{and}\quad 18 - 2xy - y^2 = 0 .
\]
Subtracting these leads to x² = y² and so x = y (note that x and y must both be positive).
If we put x = y in either equation, we obtain 3x² − 18 = 0, which gives x = √6, y = √6,
and z = (18 − √6 ⋅ √6)/[2(√6 + √6)] = √6/2.
Of course, we can show that this indeed gives a local maximum of V by using the
second derivative test. However, from the physical nature of this problem, we could
simply argue that there must be an absolute maximum volume, and it has to occur at
a critical point of V. Since there are no boundary values of interest and there is only
one critical point, the function must take its absolute maximum at the only candidate.
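The claim that the single critical point gives the absolute maximum can be probed with a coarse grid search. This is only a sketch: the search window (0, 4] × (0, 4], the step 0.01, and the helper name `volume` are assumptions made for illustration, and a grid search is a numerical check rather than a proof.

```python
import math

def volume(x, y):
    """Volume of the open box after eliminating z with 2xz + 2yz + xy = 18."""
    z = (18 - x * y) / (2 * (x + y))
    return x * y * z

step = 0.01
grid = [step * k for k in range(1, 401)]   # x and y in (0, 4]

# Largest volume on the grid, together with the grid point where it occurs
best_v, best_x, best_y = max((volume(x, y), x, y) for x in grid for y in grid)

print(best_x, best_y, best_v)
print(math.sqrt(6), 3 * math.sqrt(6))      # x = y = sqrt(6), V = 3*sqrt(6)
```

The best grid point sits next to (√6, √6), and the best grid volume is close to 3√6 = √6 ⋅ √6 ⋅ √6/2, in line with the solution above.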
In Example 2.9.3, we maximized the function f (x, y) under the condition x 2 + y2 = 36.
We found the maximum by finding a parameterization of the boundary ⟨x(t), y(t)⟩ and
then reducing f (x, y) to f (x(t), y(t)), which is a one-variable function. In Example 2.9.4,
we maximized a volume function V = xyz subject to the constraint 2xz + 2yz + xy =
18. We eliminated the constraint by replacing z = (18 − xy)/(2x + 2y) in the objective
function V = xyz. Thus, the problems were reduced to problems without constraints.
However, this approach may be hard or even impossible in some cases. For example,
which is to find the maximum or minimum value of the objective function z = f (x, y),
subject to the constraint g(x, y) = 0. This type of problem is called a constrained max-
imum/minimum problem.
Sometimes we can convert a constrained maximum/minimum problem to a non-
constrained one, as shown in Example 2.9.3 and Example 2.9.4, by expressing one vari-
able in terms of other variables in the constraint condition. Now, the question is, how
can we identify the candidates for constrained maximum/minimum if elimination of
variables is hard or impossible? Note that if the curve g(x, y) = 0 in the xy-plane has a
parameterization r(t) = ⟨x(t), y(t)⟩, and at t = t0 (this corresponds to some point (a, b)
on the curve) there is a constrained maximum/minimum, then
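\[
\frac{dz}{dt} = 0 \;\rightarrow\; f_x \frac{dx}{dt} + f_y \frac{dy}{dt} = 0
\quad\text{or}\quad
\left.\langle f_x, f_y\rangle \cdot \left\langle \frac{dx}{dt},\, \frac{dy}{dt} \right\rangle\right|_{t = t_0} = 0 .
\]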
This indicates that at the point t = t0 , ∇f is perpendicular to the tangent vector of the
curve r(t). On the other hand the curve r(t) = ⟨x(t), y(t)⟩ also satisfies g(x(t), y(t)) = 0.
In a similar manner, we also have
\[
g_x \frac{dx}{dt} + g_y \frac{dy}{dt} = 0
\quad\text{or}\quad
\langle g_x, g_y\rangle \cdot \left\langle \frac{dx}{dt},\, \frac{dy}{dt} \right\rangle = 0 .
\]
This means ∇g is also perpendicular to the tangent vector of the curve r(t) at t0. Thus,
at t = t0, we must have ∇f ‖ ∇g. There must be some number λ such that ∇f = λ∇g at
t = t0. We summarize this in the following theorem.
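Theorem 2.9.3. Suppose both f(x, y) and g(x, y) are differentiable, and at some point (a, b), the
optimization problem
\[
\max/\min\ z = f(x, y) \quad \text{subject to } g(x, y) = 0
\]
has a constrained maximum or minimum. Then at the point (a, b), one must have the following
conditions:
\[
\nabla f(a, b) = \lambda \nabla g(a, b), \qquad g(a, b) = 0 .
\]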
Note. The constant λ in Theorem 2.9.3 is called the Lagrange multiplier. The theorem
can be also stated in a form of a Lagrange function defined by L(x, y, λ) = f (x, y) −
λg(x, y). The candidates for constrained maximum/minimum must then satisfy
∇L = 0 or equivalently Lx = Ly = Lλ = 0.
Example 2.9.5. Find the constrained maximum and constrained minimum for the optimization prob-
lem
Solution. Let L(x, y, λ) = x + y − λ(x² − xy + y² − 1). Then any candidate must satisfy
\[
\begin{cases}
L_x = 1 - 2\lambda x + \lambda y = 0, \\
L_y = 1 + \lambda x - 2\lambda y = 0, \\
L_\lambda = x^2 - xy + y^2 - 1 = 0 .
\end{cases}
\]
The first two equations give
\[
\frac{1}{2x - y} = \lambda = \frac{1}{2y - x} .
\]
Then
\[
2x - y = 2y - x, \quad\text{or}\quad x = y .
\]
Substituting x = y into the third equation gives
\[
x^2 - x(x) + x^2 - 1 = 0,
\]
so we have x = ±1. Therefore, candidate points are (1, 1) and (−1, −1); f(1, 1) = 2 and
f(−1, −1) = −2, so the constrained maximum is 2 at (1, 1) and the constrained minimum
is −2 at (−1, −1). Figure 2.25(a) shows the graph of the plane z = x + y and z = x² − xy +
y² − 1. Figure 2.25(b) shows the level curves of the plane and the constraint. Note that
the candidates are exactly the points where the constraint curve is tangent to a
level curve.
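The two candidates can be cross-checked by sampling the constraint curve directly. The sketch below assumes a polar parameterization of x² − xy + y² = 1 (substituting x = r cos θ, y = r sin θ gives r²(1 − sin θ cos θ) = 1); the names `point_on_constraint` and `lagrange_residuals` are hypothetical helpers.

```python
import math

def point_on_constraint(theta):
    """Point of x^2 - xy + y^2 = 1 in the polar direction theta."""
    r = 1.0 / math.sqrt(1.0 - math.sin(theta) * math.cos(theta))
    return (r * math.cos(theta), r * math.sin(theta))

steps = 3600
points = [point_on_constraint(2 * math.pi * k / steps) for k in range(steps)]
values = [x + y for x, y in points]        # objective f = x + y on the curve
f_max, f_min = max(values), min(values)

def lagrange_residuals(x, y, lam):
    """Left-hand sides of L_x = 0, L_y = 0, L_lambda = 0."""
    return (1 - 2 * lam * x + lam * y,
            1 + lam * x - 2 * lam * y,
            x * x - x * y + y * y - 1)

print(f_max, f_min)
print(lagrange_residuals(1, 1, 1), lagrange_residuals(-1, -1, -1))
```

The sampled extremes match 2 and −2, and both candidates satisfy the three Lagrange equations with λ = 1 and λ = −1, respectively.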
Since, at (a, b), the tangent vectors of two curves are perpendicular to ∇f (a, b) and
∇g(a, b), respectively, it follows that ∇f (a, b) and ∇g(a, b) are parallel to each other at
(a, b). Thus, the level curve f (x, y) = f (a, b) and the curve g(x, y) = 0 are tangent to
each other at (a, b)!
Figure 2.26(a) shows the graph of z = x2 + 2y2 and the constraint x 2 + y2 = 1
(in the z = 0 plane). Figure 2.26(b) shows the level curves of z = x 2 + 2y2 . One can
find candidates for the constrained maximum and minimum at those points where
the level curve and constraint curve are tangent to each other.
has a constrained extremum at (a, b, c), then, if both f and g are differentiable, we
must have
Or we can define a new function, L(x, y, z, λ), called the Lagrangian, with an extra variable
λ called the Lagrange multiplier, by
Then, we can find the constrained maximum/minimum for f by taking the following
steps.
(1) Find all values of x, y, z, and λ such that the partial derivatives are zero (in other
words, find the critical points of L), by solving
We illustrate the Lagrange multiplier method using the problem of a previous exam-
ple.
Example 2.9.6. A rectangular container without a lid is to be made from 18 m² of woodboard. Find the
maximum volume of such a container.
\[
V = xyz
\]
Using the method of Lagrange multipliers, L(x, y, z, λ) = xyz − λ(2xz + 2yz + xy − 18), so
the four partial derivatives give the equations
\[
\begin{cases}
L_x = yz - \lambda(2z + y) = 0, \\
L_y = xz - \lambda(2z + x) = 0, \\
L_z = xy - \lambda(2x + 2y) = 0, \\
L_\lambda = 2xz + 2yz + xy - 18 = 0 .
\end{cases}
\]
There are no general rules for solving systems of nonlinear equations, and sometimes
some ingenuity is required. In the present example, you might notice that if we multi-
ply the first equation by x, the second equation by y, and the third equation by z, then
we have
\[
\begin{aligned}
4z^2 + 4z^2 + 4z^2 &= 18, \\
12z^2 &= 18, \\
z &= \sqrt{6}/2 .
\end{aligned}
\]
Hence, the only critical point is, as before, x = √6, y = √6, and z = √6/2, and this
gives the maximum volume.
∇L = 0 → Lx = Ly = Lz = Lλ = Lu = 0,
Example 2.9.7. Find the shortest distance from the origin to the line of intersection of the two planes
y + 2z − 12 = 0 and x + y − 6 = 0.
2x − u = 0,
2y − λ − u = 0,
2z − 2λ = 0,
y + 2z − 12 = 0,
x + y − 6 = 0.
Solving the equations yields the only candidate (2, 4, 4). Therefore, the shortest distance
must be √(2² + 4² + 4²) = 6.
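The candidate (2, 4, 4) can be confirmed by a different route: projecting the origin onto the line of intersection. This sketch does not use Lagrange multipliers; the point `p0` and direction `d` of the line are values worked out here for illustration (both satisfy the two plane equations), and the helper `dot` is an assumed name.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

p0 = (0.0, 6.0, 3.0)    # lies on y + 2z = 12 and on x + y = 6
d = (1.0, -1.0, 0.5)    # orthogonal to both normals <0, 1, 2> and <1, 1, 0>

# Parameter of the point on p0 + t*d closest to the origin
t = -dot(p0, d) / dot(d, d)
closest = tuple(p + t * c for p, c in zip(p0, d))
distance = math.sqrt(dot(closest, closest))

# Multipliers recovered from 2z - 2*lam = 0 and 2x - u = 0
x, y, z = closest
lam, u = z, 2 * x
print(closest, distance, 2 * y - lam - u)
```

The projection lands on (2, 4, 4) at distance 6, and the remaining Lagrange equation 2y − λ − u = 0 holds with λ = u = 4.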
2.10 Review
Main concepts discussed in this chapter are listed below.
1. Definitions of functions of more than one variable, such as z = f (x, y) and u =
f (x, y, z).
4. Differentiability:
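7. Clairaut theorem: if \(\frac{\partial^2 f}{\partial x\,\partial y}\) and \(\frac{\partial^2 f}{\partial y\,\partial x}\)
are both continuous at (a, b), then
\[
\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} \quad\text{at } (a, b) .
\]
\[
\frac{\partial z}{\partial x} = -\frac{F_x}{F_z} \quad\text{and}\quad \frac{\partial z}{\partial y} = -\frac{F_y}{F_z} .
\]
11. Implicit differentiation: for the system
\[
\begin{cases}
F(x, y, u, v) = 0, \\
G(x, y, u, v) = 0,
\end{cases}
\]
denote
\[
\frac{\partial(f, g)}{\partial(x, y)} = \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix} ;
\]
then
\[
\frac{\partial u}{\partial x} = -\frac{\;\dfrac{\partial(F, G)}{\partial(x, v)}\;}{\dfrac{\partial(F, G)}{\partial(u, v)}}
\quad\text{and}\quad
\frac{\partial v}{\partial x} = -\frac{\;\dfrac{\partial(F, G)}{\partial(u, x)}\;}{\dfrac{\partial(F, G)}{\partial(u, v)}} .
\]
12.
(a) For a curve defined by
\[
\begin{cases}
f(x, y, z) = 0, \\
g(x, y, z) = 0,
\end{cases}
\]
its tangent line at P(x0, y0, z0) is given by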
(b) For a surface defined by F(x, y, z) = 0, its tangent plane at P(x0 , y0 , z0 ) is given
by
(∇F ⋅ Δx)P = 0.
13. The directional derivative of z = f (x, y) in the direction given by unit vector u at
point (a, b) is
Du f (a, b) = ∇f ⋅ u.
14. Candidates for local maxima/minima of the function f are points where ∇f = 0 or
∇f does not exist.
15. If A = fxx , B = fxy = fyx , and C = fyy , then at point P where ∇f (P) = 0,
(a) if AC − B2 > 0 and A > 0, there is a local minimum,
(b) if AC − B2 > 0 and A < 0, there is a local maximum,
(c) if AC − B2 < 0, there is a saddle point.
16. Candidates for maxima/minima of z = f (x, y) subject to g(x, y) = 0 satisfy
Similar results hold for functions of three variables or with more than one restric-
tion.
2.11 Exercises
2.11.1 Functions of two variables
1. Find the domain for each of the following functions and sketch it:
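(1) \(z = \sqrt{1 - x^2} + \sqrt{y^2 - 1}\), (2) \(z = \sqrt{x - \sqrt{y}}\),
(3) \(z = \ln(1 - x - y)\), (4) \(z = \ln(y - x) + \dfrac{\sqrt{x}}{\sqrt{1 - x^2 - y^2}}\),
(5) \(u = \sqrt{R^2 - x^2 - y^2 - z^2} + \dfrac{1}{\sqrt{x^2 + y^2 + z^2 - r^2}}\), R > r.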
2. Find f(x, y) if f(x + y, xy) = x³ + y³.
3. Find each of the following limits if it exists:
1
3
(1) lim(x,y)→(2, 1 ) (2 + xy) y+xy2 , (2) limx→∞ (x 2 + y2 ) sin x2 +y 2,
2 y→∞
1−cos(x 2 +y2 ) sin(2xy)
(3) lim(x,y)→(0,0) 2 y2 , (4) lim(x,y)→(2,0) y
,
(x2 +y2 )ex
ln(x+ey ) xy cos y
(5) lim(x,y)→(1,0) , (6) lim(x,y)→(0,0) 3x2 +y2
,
√x2 +y2
xy2
(7) lim(x,y)→(0,0) x2 +y2 +xy
.
\[
f(x, y) = \begin{cases}
\dfrac{x^2 y}{x^4 + y^4}, & x^2 + y^2 \neq 0, \\
0, & x^2 + y^2 = 0 .
\end{cases}
\]
5. Determine the set of points at which each of the following functions is continuous:
sin xy y2 +2x
(1) f (x, y) = ex −y2
, (2) z = y2 −2x
,
1
(3) u = xyz
, (4) z = ln(1 − x2 − y2 ).
1. Find the first partial derivatives for each of the following functions:
(1) u = xy + xy , (2) u = x
,
√x2 +y2
𝜕2 u 𝜕2 u
+ = 0.
𝜕x2 𝜕y2
2 2
xy x2 −y2 when x2 +y2 =0,
5. If f (x, y) = { show that fxy (0, 0) ≠ fyx (0, 0).
̸
x +y
0 when x2 +y2 =0,
xy
when x2 +y2 =0,
6. If f (x, y) = { show that both fx (0, 0) and fy (0, 0) exist but f is not
̸
x2 +y2
0 when x2 +y2 =0,
differentiable at (0, 0).
1
(x2 +y2 ) sin( ) when x2 +y2 =0,
7. Show that f (x, y) = { is differentiable at (0, 0), but nei-
̸
x 2 +y2
0 when x2 +y2 =0
ther fx nor fy is continuous at (0, 0).
8. Let f be the function
x2 y2
{ 3 if (x, y) ≠ (0, 0),
f (x, y) = { (x2 +y2 ) 2
(a) Find the limit lim(x,y)→(0,0) f (x, y) or show that it does not exist.
(b) Is the function continuous at (0, 0)?
(c) Is the function differentiable at (0, 0)?
9. Find the total differential for each of the following functions:
xy
(1) z = x3 ln(y2 ), (2) z = arctan 1−xy , (3) u = √x 2 + y2 + z 2 .
10. Explain why the following functions are differentiable at the given point. Then
find the linearization L(x, y) of the function at that point, and use it to approximate
the given number.
(1) f (x, y) = xy at (1, 1), f (0.97, 1.06),
(2) f (x, y, z) = √x2 + y2 + z 2 at (3, 2, 6), √3.022 + 1.972 + 5.992 .
1. Find the given partial derivative for each given explicitly or implicitly defined
function.
(1) u = ln(ex + ey ), y = x3 . Find du
dx
. (2) z = sin(x2 y)x. Find 𝜕x
𝜕z
.
(3) u = x2 y − xy2 + z, where x = t cos(s), y = t sin(t), and z = t + s. Find 𝜕u
𝜕t
and 𝜕u
𝜕s
.
(4) u = f (x2 − y2 , exy ). Find 𝜕u
𝜕x
and 𝜕u
𝜕y
. (5) u = f (x, xy, xyz). Find 𝜕u
𝜕x
, 𝜕u
𝜕y
, and 𝜕u
𝜕z
.
2 2
x 𝜕u 𝜕u 𝜕2 u z
(6) u = f (x, y ). Find 𝜕x2 , 𝜕y2 , and 𝜕x𝜕y . (7) e = xyz. Find 𝜕z
𝜕x
.
(8) yz = ln(x + z). Find 𝜕x 𝜕z
. (9) u = f (x 2 , y, xy). Find 𝜕u
𝜕x
and 𝜕u
𝜕y
.
2. If z = f (x, y), where x = r cos θ and y = r sin θ, show that
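\[
\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}
= \frac{\partial^2 z}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2} + \frac{1}{r}\frac{\partial z}{\partial r} .
\]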
dy fx Ft − ft Fx
= .
dx Ft + ft Fy
7. (Derivative under integrals) The famous Leibniz theorem says that if f(x, t) is a
function such that f(x, t) and its partial derivative fx(x, t) are both continuous in
some region of the xt-plane, with a(x) ≤ t ≤ b(x) for two differentiable functions
a(x) and b(x), then
\[
\frac{d}{dx} \int_{a(x)}^{b(x)} f(x, t)\,dt
= f(x, b(x))\,\frac{d}{dx} b(x) - f(x, a(x))\,\frac{d}{dx} a(x) + \int_{a(x)}^{b(x)} \frac{\partial f(x, t)}{\partial x}\,dt .
\]
When f(x, t) = f(t), a one-variable function in t, then this is proved by the fundamental
theorem of calculus, part I. When a(x) = a and b(x) = b are two constants,
then the theorem becomes
\[
\frac{d}{dx} \int_{a}^{b} f(x, t)\,dt = \int_{a}^{b} f_x(x, t)\,dt .
\]
This means that the derivative operator can pass through the integral sign. This is
essentially the interchange of two limits (can you see why?).
(a) Prove the Leibniz theorem for the special case where a(x) = a and b(x) = b.
(b) By considering the function \(\phi(t) = \int_0^1 \frac{\ln(1 + tx)}{1 + x^2}\,dx\) or otherwise, evaluate
\(\int_0^1 \frac{\ln(1 + x)}{1 + x^2}\,dx\).
(c) Find the integral \(\int_0^{\pi/2} \ln(\cos^2 x + a^2 \sin^2 x)\,dx\), a > 0.
1. Find equations of (a) the tangent line and (b) the normal plane to each of the
following curves at the specified point:
2
(1) x = t 2 , y = 1 − t, z = t 3 , (1, 0, 1), (2) r(t) = ⟨ sin2 t , t+sin2t⋅cos t , sin t⟩, t = π4 ,
2 2 2 2 2
(3) { x x+y+z=0,
+y +z =6,
(1, −2, 1), (4) { (x−1)
2 2
+y =1,
2 (1, 1, √2) .
x +y +z =4,
2. Find equations of (a) the tangent plane and (b) the normal line to the following
surfaces at the given point:
(1) z = 2x2 + 4y2 , (2, 1, 12), (2) x 2 = 2z, (2, 0, 2),
(3) cos πx + x2 y + exz − yz + 4 = 0, (0, 1, 6).
3. Find the directional derivative of the following functions at the given point in the
direction of the vector v:
(1) f (x, y) = x2 − y2 at the point (1, 1), given v = ⟨1, √3⟩,
x
(2) f (x, y, z) = y+z at the point (4, 1, 1), given v = ⟨1, 2, −1⟩.
4. Find all points at which the direction of fastest change of the function f (x, y) =
x2 + y2 − 2x − 4y is i ⃗ + j.⃗
5. Find the maximum rate of change of f at the given points and the direction in
which it occurs:
(1) f (x, y) = x2 y + exy sin y, (1, 0), (2) f (x, y, z) = xy2 z, (1, −1, 2),
(3) f (x, y, z) = ln(x 2 + y2 − 1) + y + 6z, (1, 1, 0).
x2 + y2 + z 2 − 3y = 0, ex − z + xy = 0,
(1) { (1, 1, 1) , (2) { (0, 1, 1) .
2x + y − z − 2 = 0, x 2 − y2 + 2z 3 = 1,
1. Find and classify all the critical points for each of the following functions:
(1) f (x, y) = x2 − xy + y2 + 9x − 6y + 20, (2) f (x, y) = 4(x − y) − x 2 − y2 ,
(3) f (x, y) = 3x − x3 − 2y2 + y4 .
2. If the function z = z(x, y) is implicitly defined by the equation
find all critical points of z and local maximum and local minimum values of z.
3. Find the absolute maximum and minimum values of f (x, y) on the set D:
(1) f (x, y) = 2−4x −5y, where D is the closed triangular region with vertices (0, 0),
(2, 0), and (0, 3),
(2) f (x, y) = xy2 , where D = {(x, y)|x ≥ 0, y ≥ 0, x 2 + y2 ≤ 3},
(3) f (x, y) = 24xy − 8x 3 − 6y2 on the rectangular region D: 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2.
4. Find three positive numbers x, y, and z whose sum is 100 and whose product is a
maximum.
5. Find all points on the ellipse x2 + 4y2 = 4 that are closest to the line 2x + 3y − 6 = 0.
6. Find the dimensions of the closed rectangular box with least total surface area if
the volume is given by V m3 .
7. Find the maximum value of f (x, y, z) = xyz on the line of intersection of the two
planes x + y + z = 40 and x + y = z.
8. (Least square method) Suppose the two variables x and y are related linearly by
the equation y = kx +b for some constants k and b. However, in practice, observed
pairs of data (x1 , y1 ), (x2 , y2 ), (x3 , y3 ) . . . (xn , yn ) that should satisfy this equation
usually do not lie exactly on a straight line. So scientists want to find the constant
k and b such that the line y = kx + b best “fits” these points.
Let di = yi − (kxi + b) be the vertical deviation of the point (xi , yi ) from the line
y = kx + b. The least square method determines k and b by minimizing ∑ni=1 di2 (the
sum of the squares of these deviations). Show that the “best fit line of y on x” is
given by
9. For functions of more than two variables, there are similar tests for local extreme
values using the Hessian matrix H(x). For example, for u = f (x, y, z), its Hessian
matrix is
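\[
H(\mathbf{x}) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial x\,\partial z} \\[2mm]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} & \dfrac{\partial^2 f}{\partial y\,\partial z} \\[2mm]
\dfrac{\partial^2 f}{\partial z\,\partial x} & \dfrac{\partial^2 f}{\partial z\,\partial y} & \dfrac{\partial^2 f}{\partial z^2}
\end{pmatrix} .
\]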
If all its second derivatives are continuous, then H(x) is a symmetric matrix. If
H(x) is positive definite at x0 , then f attains an isolated local minimum at x0 . If
the Hessian is negative definite at x0 , then f attains an isolated local maximum
at x0 . If the Hessian has both positive and negative eigenvalues at x0 , then x0 is a
saddle point for f . Otherwise the test is inconclusive.
Use a suitable Hessian matrix to classify the critical points of the function
f (x, y, z) = x2 + y2 − x + z 2 .
3 Multiple integrals
In this chapter, we extend the idea of the definite integral of a function of one variable
to an analogous concept of a function of two or three variables, called a multiple in-
tegral. Multiple integrals are used in a number of applications, including computing
volumes, surface areas, and masses of two- or three-dimensional objects.
Volumes of solids
Now suppose that f is a two-variable function defined on a rectangular region D in the
xy-plane, with f(x, y) ≥ 0 for all (x, y) ∈ D. Hence, the graph of
f is a surface S above the xy-plane with equation z = f(x, y), and the projection of S
onto the xy-plane is the domain D. The solid (three-dimensional) region Ω that lies
above D in the xy-plane and under the graph of f is
\[
\Omega = \{(x, y, z) \mid 0 \le z \le f(x, y),\ (x, y) \in D\} .
\]
Our initial goal is to find a method for computing the volume of Ω, and this provides
a motivation for multiple integrals.
The first step is to subdivide the region D into n small closed subregions Δσ1 ,
Δσ2 , . . . , Δσn , as illustrated in Figure 3.1, where the subregions are created by drawing
lines parallel to the x- and y-axes. The value of n is left unspecified because eventually
we will use a limiting process in which n → ∞.
Arbitrarily choose a point (ξi, ηi) in each Δσi for i = 1, 2, . . . , n. We approximate
the part of Ω that lies above each Δσi by a thin rectangular box (or “column”) with
base Δσi and height f(ξi, ηi), as shown in Figure 3.1. The volume ΔVi of this part is
approximately the volume of the column, that is, the height f(ξi, ηi) multiplied by the
area of the base (we are using Δσi both as the name of the subregion and as the area
of this subregion), so we have
\[
\Delta V_i \approx f(\xi_i, \eta_i)\,\Delta\sigma_i .
\]
https://doi.org/10.1515/9783110674378-003
If we form the sum of these approximations over all subregions (a Riemann sum), we
get an approximation to the total volume, V, of the three-dimensional region Ω, i. e.,
\[
V \approx \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i .
\]
If the limit of this sum exists when n → ∞ and the maximum Δσi approaches zero,
then we define this limit to be the volume of Ω, i. e.,
\[
V = \lim_{\max |\Delta\sigma_i| \to 0,\ n \to \infty} \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i .
\]
Note. This limit must be the same value no matter how the subregions Δσi are made
and no matter where the point (ξi , ηi ) is chosen in each subregion Δσi . If the limit is
taken only as n → ∞ (the number of subregions approaches ∞), it would then still be
possible for some subregions Δσi to stay quite large. To avoid this problem, the limit
must also be taken so that the largest subregion approaches zero both in area and in
physical dimensions. We write |Δσi | to indicate the greatest dimension of the subre-
gion Δσi and then write max |Δσi | → 0 to indicate that this greatest dimension must
approach zero for all the subregions. This limit seems very complicated and hard to
compute but, surprisingly, it can be shown to exist whenever the function f is continu-
ous and the domain D is of a suitable form. We will show later that it can be computed
using two one-variable integrals (iterated integrals) that you have studied previously.
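The Riemann sum above can be evaluated directly for a concrete function. The following sketch makes several assumptions for illustration: D is split into n × n equal rectangles, the sample point (ξi, ηi) is the midpoint of each rectangle, and the test function x² + y² on the unit square (whose volume 2/3 follows from one-variable integration) is a hypothetical example, not one from the text.

```python
def riemann_sum(f, x0, x1, y0, y1, n):
    """Midpoint Riemann sum of f over [x0, x1] x [y0, y1] with n*n subrectangles."""
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        xi = x0 + (i + 0.5) * dx           # sample point, x-coordinate
        for j in range(n):
            eta = y0 + (j + 0.5) * dy      # sample point, y-coordinate
            total += f(xi, eta) * dx * dy  # height times base area
    return total

def paraboloid(x, y):
    return x * x + y * y

coarse = riemann_sum(paraboloid, 0.0, 1.0, 0.0, 1.0, 10)
fine = riemann_sum(paraboloid, 0.0, 1.0, 0.0, 1.0, 100)
print(coarse, fine, 2 / 3)
```

As n grows, every subrectangle shrinks in both dimensions, so max |Δσi| → 0 and the sums approach the volume under the surface.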
Mass of a lamina
We now investigate a second, quite different problem: computing the mass of a lamina
(thin plate). Surprisingly, it can be found by exactly the same process that we used for
finding the volume of a solid. Suppose a rectangular lamina is represented
by region D of the xy-plane. Suppose further that the density (mass per unit area) of
the lamina at a point corresponding to (x, y) in D is given by μ(x, y), where μ(x, y) is a
continuous function on D, as shown in Figure 3.2. We now derive a way to compute the
3.1 Definition and properties | 147
total mass M of the lamina using methods similar to the volume computation above.
We divide D into n small closed subregions Δσ1 , Δσ2 , . . . , Δσn by drawing “nets,” and
arbitrarily choose a point (ξi , ηi ) in each Δσi . If Δσi is very small, so that the density
does not change much over Δσi , then the mass of the part of the lamina represented
by Δσi is approximately the density at (ξi , ηi ) multiplied by the area, i. e., μ(ξi , ηi )Δσi (we
are using Δσi both as the name of the subregion and as the area of this subregion). If
we add all such mass approximations, we get an approximation to the total mass, i. e.,
\[ M \approx \sum_{i=1}^{n} \mu(\xi_i, \eta_i)\,\Delta\sigma_i. \]
If the limit of this sum exists as n → ∞ and max |Δσi | → 0, and it is independent of
the choice of subdivision of D and of the sample point (ξi , ηi ) in each Δσi , then we define this
limit to be the mass of the lamina, written
\[ M = \lim_{\max|\Delta\sigma_i|\to 0,\ n\to\infty} \sum_{i=1}^{n} \mu(\xi_i, \eta_i)\,\Delta\sigma_i. \]
Note this is exactly the same type of limit as that used before to compute the volume
of a three-dimensional region. We will see later that many other applied problems can
be reduced to computing a limit exactly of this type.
We now link this limiting process to a more general definition of the double inte-
gral of a function f (x, y) over a general region D in the xy-plane.
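Definition 3.1.1 (Double integrals). The double integral of f over the region D is defined to be the following limit:
\[ \iint_D f(x,y)\,d\sigma = \lim_{\max|\Delta\sigma_i|\to 0,\ n\to\infty} \sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta\sigma_i, \]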
where Δσ1 , Δσ2 , . . . , Δσn are n closed subregions which are a partition of the region D (Δσi also denotes
the area of Δσi ) and (ξi , ηi ) is an arbitrarily chosen point in Δσi , assuming that this limit exists. The
limit must have the same value for any choice of subdivision and the choice of sample points (ξi , ηi ).
The double integral is also often written as ∬D f (x, y)dA.
Note. The motivation for the double integral assumed that f (x, y) ≥ 0 for (x, y) ∈ D, but
the definition here does not assume that f is nonnegative. When f takes both positive
and negative values on D, then the double integral ∬D f (x, y)dσ, if it exists, is equal to
the volume of the part of the solid that lies above the xy-plane minus the volume of
the part of the solid that lies below the xy-plane.
(6) if m ≤ f (x, y) ≤ M for all (x, y) ∈ D and A(D) denotes the area of D, then
\[ m\,A(D) \le \iint_D f(x,y)\,d\sigma \le M\,A(D); \]
(7) (mean value theorem) if f (x, y) is continuous on the closed region D and A(D) is
the area of D, then there exists (ξ , η) ∈ D such that
\[ \iint_D f(x,y)\,d\sigma = f(\xi, \eta)\,A(D). \]
\[ \iint_D \sqrt{x^2 + y^2}\,d\sigma. \]
Solution. Evaluating this double integral directly from the definition as a limit is hard.
However, because √x2 + y2 ≥ 0, we can find the integral by interpreting it as a volume
of a solid Ω. The surface z = √x2 + y2 is a cone with vertex downwards at the origin and
axis along the z-axis and with height 2. Therefore, the given double integral represents
the volume of the solid below this cone and above the disk D. The volume of Ω is the
volume of a cylinder with base D and height 2 minus the volume of a cone with the
same base and height. Thus,
\[ \iint_D \sqrt{x^2 + y^2}\,d\sigma = \pi 2^2 \times 2 - \frac{1}{3}\,\pi 2^2 \times 2 = \frac{16\pi}{3}. \]
Example 3.1.2. Use the properties of double integrals to estimate the integral \(\iint_D e^{\sin x \cos y}\,d\sigma\), where
D is the disk in the xy-plane with radius 2 and center the origin.
Solution. Since −1 ≤ sin x cos y ≤ 1, we have 1/e ≤ e^{sin x cos y} ≤ e on D.
Let m = e^{−1} = 1/e and M = e. By using property (6) and noting that the area of D is
given by A(D) = π(2)² = 4π, we obtain
\[ \frac{4\pi}{e} \le \iint_D e^{\sin x \cos y}\,d\sigma \le 4\pi e. \]
Solution. The region of integration D is a square with center the origin. By the linearity
property,
Figure 3.3 shows the graph of the function f (x, y) = x³(1 + y²) on the region D.
If we divide D horizontally and vertically into nm subregions, we note that the area
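element Δσ is ΔxΔy. If we let x be fixed, say, x = xi∗ ∈ [xi−1 , xi ] ⊂ [a, b], then
\[ \lim_{\max|\Delta y_j|\to 0} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta y_j = \int_c^d f(x_i^*, y)\,dy \]
gives the area of the cross-section of the solid in the plane x = xi∗ , as shown in Figure 3.4.
If we multiply this area by a tiny thickness Δxi , this gives us a volume element.
Summing these volume elements and taking the limit, we get
\[ V = \lim_{\max|\Delta x_i|\to 0} \sum_{i=1}^{n} \left( \int_c^d f(x_i^*, y)\,dy \right) \Delta x_i. \]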
3.2 Double integrals in rectangular coordinates | 151
This is one-variable integration with integrand the function A(x) = ∫_c^d f (x, y)dy. So
if f (x, y) is continuous, we have
\[ V = \int_a^b A(x)\,dx = \int_a^b dx \int_c^d f(x,y)\,dy. \]
In a similar manner, we can also integrate with respect to x first. This gives
\[ V = \int_c^d dy \int_a^b f(x,y)\,dx. \]
Note.
1. We can also interpret the two integrals as in the mass of lamina model. The inner
integral ∫_c^d f (x, y)dy gives the mass of a vertical rod. Then we sum/integrate
those masses of rods to get the total mass of the lamina. This is illustrated in
Figure 3.4(b).
2. As we defined the differentials dx, dy, and dz, we can define dσ = dxdy or dσ =
dydx in rectangular coordinates. We will see that this helps a lot in algebraic
manipulations.
Recall that when f (x, y) ≥ 0, the volume V is exactly represented by the double
integral ∬D f (x, y)dσ. The method discussed above also works even if f (x, y) takes pos-
itive and negative values over a region D. We summarize these arguments in the fol-
lowing theorem.
Theorem 3.2.1 (Fubini’s theorem: rectangular region). If f (x, y) is continuous on a rectangular region
D = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d}, then
\[ \iint_D f(x,y)\,d\sigma = \iint_D f(x,y)\,dx\,dy = \int_c^d dy \int_a^b f(x,y)\,dx, \]
and also
\[ \iint_D f(x,y)\,d\sigma = \iint_D f(x,y)\,dy\,dx = \int_a^b dx \int_c^d f(x,y)\,dy. \]
Solution.
(a) By Fubini’s theorem
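\[
\begin{aligned}
\iint_D (x + y^2)\,d\sigma &= \int_1^5 dx \int_{-1}^{4} (x + y^2)\,dy
= \int_1^5 \left[ xy + \frac{y^3}{3} \right]_{y=-1}^{y=4} dx \\
&= \int_1^5 \left( 5x + \frac{4^3}{3} - \left( -\frac{1}{3} \right) \right) dx = \frac{440}{3}.
\end{aligned}
\]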
(b) Note that the integrand has the form g(x)f (y). Thus,
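\[
\begin{aligned}
\iint_D x^2 y \sin y^2\,d\sigma &= \int_0^6 dx \int_0^{\sqrt{\pi}} x^2 y \sin y^2\,dy
= \int_0^6 x^2\,dx \int_0^{\sqrt{\pi}} y \sin y^2\,dy \\
&= \left[ \frac{x^3}{3} \right]_0^6 \cdot \left[ \frac{-1}{2} \cos y^2 \right]_0^{\sqrt{\pi}}
= 72 \cdot \frac{-1}{2} (\cos \pi - \cos 0) = 72.
\end{aligned}
\]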
Note that we have evaluated the two definite integrals separately. That is,
\[ \int_a^b dx \int_c^d g(x) f(y)\,dy = \int_a^b g(x)\,dx \cdot \int_c^d f(y)\,dy. \]
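Both parts of this example can be checked numerically. The sketch below is an illustration only: the helper name midpoint and the number of subintervals are assumptions, and the midpoint rule stands in for the exact antiderivatives used in the solution. It evaluates part (a) in both orders of integration and part (b) as a product of two one-variable integrals.

```python
import math

def midpoint(g, lo, hi, n=200):
    """One-variable midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

# Part (a): integrate x + y^2 over 1 <= x <= 5, -1 <= y <= 4 in both orders.
f = lambda x, y: x + y * y
dy_first = midpoint(lambda x: midpoint(lambda y: f(x, y), -1.0, 4.0), 1.0, 5.0)
dx_first = midpoint(lambda y: midpoint(lambda x: f(x, y), 1.0, 5.0), -1.0, 4.0)

# Part (b): the integrand x^2 * (y sin y^2) separates into a product of two integrals.
gx = midpoint(lambda x: x * x, 0.0, 6.0)
hy = midpoint(lambda y: y * math.sin(y * y), 0.0, math.sqrt(math.pi))

print(dy_first, dx_first, 440 / 3)   # the first two values should both be near 440/3
print(gx * hy)                       # should be near 72
```

The two orders of integration agree to within the discretization error, as Fubini's theorem predicts for a continuous integrand on a rectangle.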
then, for each x in [a, b], the range for y now depends on x with lower bound ϕ1 (x)
and upper bound ϕ2 (x). Therefore, if we keep x constant, we can also interpret
\[ A(x) = \int_{\phi_1(x)}^{\phi_2(x)} f(x,y)\,dy \]
as the area of a cross-section of the solid. Thus,
\[ V = \int_a^b A(x)\,dx. \]
Therefore, we can still write the volume V and double integral of f (x, y) as two one-
variable integrals (iterated integrals), i. e.,
\[ V = \iint_D f(x,y)\,d\sigma = \int_a^b A(x)\,dx = \int_a^b dx \int_{\phi_1(x)}^{\phi_2(x)} f(x,y)\,dy. \]
Similarly, D could be a type II region (see Figure 3.5(b)) bounded by two continuous
functions in the xy-plane, x = ψ1 (y) and x = ψ2 (y) for some interval of y values c ≤
y ≤ d,
A similar derivation to that used above for type I regions shows that
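\[ \iint_D f(x,y)\,d\sigma = \int_c^d dy \int_{\psi_1(y)}^{\psi_2(y)} f(x,y)\,dx = \int_c^d \int_{\psi_1(y)}^{\psi_2(y)} f(x,y)\,dx\,dy. \]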
The above results on iterated integrals are true even if f (x, y) is not nonnegative.
The formal statement is given in Fubini’s theorem. Rigorous proofs of Fubini’s theorem
can be found in more theoretical calculus textbooks.
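Theorem 3.2.2 (Fubini’s theorem: general region). If z = f (x, y) is continuous on its domain D, a type
I region D = {(x, y) | a ≤ x ≤ b, ϕ1 (x) ≤ y ≤ ϕ2 (x)}, then
\[ \iint_D f(x,y)\,d\sigma = \int_a^b dx \int_{\phi_1(x)}^{\phi_2(x)} f(x,y)\,dy. \]
If D is instead a type II region D = {(x, y) | c ≤ y ≤ d, ψ1 (y) ≤ x ≤ ψ2 (y)}, then
\[ \iint_D f(x,y)\,d\sigma = \int_c^d dy \int_{\psi_1(y)}^{\psi_2(y)} f(x,y)\,dx. \]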
Example 3.2.2. Evaluate ∬D (x + 2y)dσ, where D is the region bounded by straight lines y = 2 and
y = x and the hyperbola xy = 1.
Solution. The hyperbola intersects the two lines at two points (1/2, 2) and (1, 1) and the
two lines intersect at the point (2, 2), as shown in Figure 3.6.
We note that the region D is both a type I region and a type II region, but the description
of D as a type I region is more complicated since the lower boundary consists of two
parts. Therefore, it is better to express D as a type II region bounded on the left by x = 1/y
and on the right by x = y, so
\[ D = \left\{ (x, y) \;\middle|\; \frac{1}{y} \le x \le y,\ 1 \le y \le 2 \right\}. \]
We compute the double integral (recall that the inner iterated integral is evaluated as
though y is a constant – like a partial derivative with respect to x)
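\[
\begin{aligned}
\iint_D (x + 2y)\,d\sigma &= \int_1^2 \int_{1/y}^{y} (x + 2y)\,dx\,dy
= \int_1^2 \left[ \frac{x^2}{2} + 2yx \right]_{x=1/y}^{x=y} dy \\
&= \int_1^2 \left( \frac{y^2}{2} + 2y^2 - \frac{1}{2y^2} - 2 \right) dy = \frac{43}{12}.
\end{aligned}
\]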
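If we had expressed D as a type I region, then we would evaluate it in two parts, D1 for
1/2 ≤ x ≤ 1, bounded above by y = 2 and below by y = 1/x, and D2 for 1 ≤ x ≤ 2, bounded
above by y = 2 and below by y = x. Hence,
\[
\begin{aligned}
\iint_D (x + 2y)\,d\sigma &= \int_{1/2}^{1} dx \int_{1/x}^{2} (x + 2y)\,dy + \int_1^2 dx \int_x^2 (x + 2y)\,dy \\
&= \int_{1/2}^{1} \left( 2x - \frac{1}{x^2} + 3 \right) dx + \int_1^2 (-2x^2 + 2x + 4)\,dx \\
&= \frac{5}{4} + \frac{7}{3} = \frac{43}{12}.
\end{aligned}
\]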
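The agreement between the two descriptions of D can also be checked numerically. This is a rough sketch under stated assumptions: the helper midpoint and its step count are illustrative choices, and the region is entered by hand from the bounds y = 2, y = x, and xy = 1 given in the example.

```python
def midpoint(g, lo, hi, n=200):
    """One-variable midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

f = lambda x, y: x + 2 * y

# Type II description: 1 <= y <= 2 and 1/y <= x <= y (x is integrated first).
type2 = midpoint(lambda y: midpoint(lambda x: f(x, y), 1 / y, y), 1.0, 2.0)

# Type I description, split at x = 1 where the lower boundary changes.
part1 = midpoint(lambda x: midpoint(lambda y: f(x, y), 1 / x, 2.0), 0.5, 1.0)
part2 = midpoint(lambda x: midpoint(lambda y: f(x, y), x, 2.0), 1.0, 2.0)

print(type2, part1 + part2, 43 / 12)
```

Note that the limits of the inner integral are functions of the outer variable, which is exactly what distinguishes a general region from a rectangle.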
Example 3.2.3. Evaluate ∬D xydσ, where D is the region bounded by the line y = x and the parabola
y² = 2x + 8.
Solution. The region is shown in Figure 3.7, and it lies between x = y²/2 − 4 and x = y. So,
\[ D = \left\{ (x, y) \;\middle|\; -2 \le y \le 4,\ \frac{y^2}{2} - 4 \le x \le y \right\}. \]
Again, D is both a type I and a type II region, but we prefer to express D as a type
II region because it is less complicated. The double integral becomes
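\[
\begin{aligned}
\iint_D xy\,d\sigma &= \int_{-2}^{4} \left[ \int_{y^2/2 - 4}^{y} xy\,dx \right] dy
= \int_{-2}^{4} \left[ \frac{x^2 y}{2} \right]_{x = y^2/2 - 4}^{x = y} dy
= \int_{-2}^{4} \frac{1}{2} \left( y^3 - \left( \frac{y^2}{2} - 4 \right)^2 y \right) dy \\
&= \frac{1}{2} \int_{-2}^{4} \left( -\frac{1}{4} y^5 + 5y^3 - 16y \right) dy = 18.
\end{aligned}
\]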
Example 3.2.4. Evaluate the iterated integral
\[ \int_0^1 \int_x^{\sqrt{x}} \frac{\sin y}{y}\,dy\,dx. \]
where D is the type I region shown in Figure 3.8(a) between the curves y = x and
y = √x,
\[ \int_{-a}^{a} dx \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} (x^2 + y^2)\,dy. \]
This is certainly not fun. However, if we describe D in polar coordinates, then we will
have Drθ = {(r, θ)|0 ≤ r ≤ a, 0 ≤ θ ≤ 2π}. This is a rectangular region in the rθ-plane
on which the integration might be easier. In rectangular coordinates, we see the area
element dσ = dxdy. What would this be in polar coordinates? Recall that we found the
area element by dividing the region into many subregions using lines that are parallel
to the x- or y-axis. So an area element is represented by ΔxΔy in rectangular coordi-
nates. Similarly, we can draw circles all centered at the origin with different radii and
half-lines with initial point the origin but different angles from the positive x-axis. This
produces Δr and Δθ. Looking at an area element as shown in Figure 3.9, we approxi-
mate it as the difference of areas of two sectors. So, the area approximation is
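\[ \Delta\sigma \approx \frac{1}{2} \left( r^* + \frac{\Delta r}{2} \right)^2 \Delta\theta - \frac{1}{2} \left( r^* - \frac{\Delta r}{2} \right)^2 \Delta\theta = r^*\,\Delta r\,\Delta\theta. \]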
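The simplification in the last step can be written out in full. The expansion below is a worked sketch of the algebra, assuming (as the formula suggests) that r∗ is the radius at the middle of the cell, so that the two sectors have radii r∗ ± Δr/2:

```latex
\begin{aligned}
\Delta\sigma
&= \tfrac12\Bigl(r^* + \tfrac{\Delta r}{2}\Bigr)^{2}\Delta\theta
 - \tfrac12\Bigl(r^* - \tfrac{\Delta r}{2}\Bigr)^{2}\Delta\theta \\
&= \tfrac12\Bigl[\bigl((r^*)^2 + r^*\Delta r + \tfrac{(\Delta r)^2}{4}\bigr)
 - \bigl((r^*)^2 - r^*\Delta r + \tfrac{(\Delta r)^2}{4}\bigr)\Bigr]\Delta\theta \\
&= \tfrac12\,\bigl(2\,r^*\Delta r\bigr)\,\Delta\theta
 = r^*\,\Delta r\,\Delta\theta .
\end{aligned}
```

The (Δr)² terms cancel, so for a cell bounded by two circular arcs and two rays the product r∗ΔrΔθ is the exact area when r∗ is the mid-radius; for any other sample radius in the cell it differs only by a term of higher order.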
This means that we can consider the limit of the Riemann sum
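\[ \lim_{\max|\Delta\sigma_i|\to 0} \sum_{i=1}^{m} f(r_i^*, \theta_i^*)\,\Delta\sigma_i = \lim_{\max|\Delta r_i,\,\Delta\theta_i|\to 0} \sum_{i=1}^{m} f(r_i^*, \theta_i^*)\,r_i^*\,\Delta r_i\,\Delta\theta_i. \]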
Then, if the limit exists independent of the way of subdividing the region and the
choice of sample points, we can define
In particular, by taking f (x, y) = 1, we can see that the area of the region D bounded
by θ = α, θ = β, and r = r(θ) is given by
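\[ A(D) = \iint_D 1\,d\sigma = \int_\alpha^\beta \int_0^{r(\theta)} r\,dr\,d\theta = \int_\alpha^\beta \left[ \frac{r^2}{2} \right]_0^{r(\theta)} d\theta = \frac{1}{2} \int_\alpha^\beta [r(\theta)]^2\,d\theta. \tag{3.1} \]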
\[ A(D) = \frac{1}{2} \int_\alpha^\beta r^2(\theta)\,d\theta = \frac{1}{2} \int_0^{2\pi} R^2\,d\theta = \frac{1}{2} \cdot 2\pi \times R^2 = \pi R^2. \]
Solution. Using polar coordinates, D becomes Drθ = {(r, θ)|0 ≤ r ≤ a, 0 ≤ θ ≤ 2π}. So,
Example 3.3.3. Find the volume of the solid bounded by the plane z = 0 and the paraboloid z = 1 −
x² − y², using polar coordinates.
Solution. Let z = 0 in the equation of the paraboloid. We get x² + y² = 1. Thus, the solid
lies under the paraboloid and above the circular disk D: x² + y² ≤ 1 in the xy-plane. In
polar coordinates, D is described by 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. Since 1 − x² − y² = 1 − r²,
the volume is, therefore, given by
\[ V = \iint_D (1 - x^2 - y^2)\,d\sigma = \int_0^{2\pi} d\theta \int_0^1 (1 - r^2)\,r\,dr = \int_0^{2\pi} \frac{1}{4}\,d\theta = \frac{\pi}{2}. \]
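As a quick check of this value, the polar Riemann sum can be evaluated directly. The function below is a hypothetical helper written for this illustration (its name and grid sizes are assumptions); the important detail is the extra factor r in the area element r dr dθ.

```python
import math

def polar_integral(f, r_max, n_r=400, n_theta=8):
    """Midpoint sum of f(r, theta) * r over 0 <= r <= r_max, 0 <= theta <= 2*pi."""
    dr = r_max / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += f(r, theta) * r * dr * dtheta   # area element r dr dtheta
    return total

# Height of the paraboloid above the unit disk, written in polar form.
volume = polar_integral(lambda r, theta: 1 - r * r, 1.0)
print(volume, math.pi / 2)
```

Leaving out the factor r would give 2π ∫(1 − r²)dr = 4π/3 instead, which is a common mistake when converting to polar coordinates.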
This integral can be evaluated using trigonometric substitution and using trigonomet-
ric identities, but it is quite complicated.
Example 3.3.4. Find the volume of the solid that lies under the sphere x² + y² + z² = 4, above the
xy-plane, and inside the cylinder x² + y² = 2x.
Solution. The solid lies above the disk D whose boundary circle, x² + y² = 2x (center
(1, 0), radius 1), is determined by the cylinder. In polar coordinates, we have x² + y² = r²
and x = r cos θ. Then, the boundary circle becomes r² = 2r cos θ ⇒ r = 2 cos θ for
−π/2 ≤ θ ≤ π/2. Thus, the disk D is given by
(T(u + Δu, v) − T(u, v)) × (T(u, v + Δv) − T(u, v)) ≈ Tu (u, v)Δu × Tv (u, v)Δv
= Tu (u, v) × Tv (u, v)ΔuΔv.
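\[ T_u(u, v) \times T_v(u, v)\,\Delta u\,\Delta v = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[1.5ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} \Delta u\,\Delta v. \]
Therefore, we have
\[ \Delta x\,\Delta y \approx \left| \frac{\partial(x, y)}{\partial(u, v)} \right| \Delta u\,\Delta v. \]
We call the determinant
\[ \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[1.5ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} \]
a Jacobian determinant and denote it by J or ∂(x, y)/∂(u, v).
The Jacobian determinant is a magnification (or reduction) factor. That is, it relates
the area dxdy of a small region in the xy-plane to the area of the corresponding region
dudv in the uv-plane.
Note that
\[ \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[1.5ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1.5ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}, \]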
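The role of the Jacobian determinant as a local area factor can be explored numerically. The sketch below estimates the four partial derivatives by central differences; the function name jacobian_2d, the step size, and the use of the polar map x = r cos θ, y = r sin θ as a test case are assumptions made for this illustration.

```python
import math

def jacobian_2d(T, u, v, h=1e-6):
    """Central-difference estimate of d(x, y)/d(u, v) for a map T(u, v) -> (x, y)."""
    x_u = (T(u + h, v)[0] - T(u - h, v)[0]) / (2 * h)
    y_u = (T(u + h, v)[1] - T(u - h, v)[1]) / (2 * h)
    x_v = (T(u, v + h)[0] - T(u, v - h)[0]) / (2 * h)
    y_v = (T(u, v + h)[1] - T(u, v - h)[1]) / (2 * h)
    return x_u * y_v - x_v * y_u    # determinant of the 2x2 matrix of partials

# Test map: polar coordinates, with (u, v) playing the role of (r, theta).
polar = lambda r, theta: (r * math.cos(theta), r * math.sin(theta))
print(jacobian_2d(polar, 2.0, 0.7))   # should be close to r = 2
```

For the polar map the estimate agrees with the factor r that appears in the polar area element r dr dθ.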
Theorem 3.4.1 (Change of variables in a double integral). Let f (x, y) be a continuous function on a
bounded and closed region D ⊂ ℝ², and let the functions x = x(u, v) and y = y(u, v) be a continuously
differentiable (x(u, v) and y(u, v) both have continuous first-order partial derivatives) transformation
(mapping) from a region D′ in the uv-plane onto the region D. If the transformation is one-to-one, and
∂(x, y)/∂(u, v) ≠ 0 for all (u, v) ∈ D′, then
\[ \iint_D f(x, y)\,dx\,dy = \iint_{D'} f(x(u, v), y(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du\,dv. \tag{3.2} \]
\[ \iint_D \frac{e^{(x - y)}}{(x + y)}\,dx\,dy, \]
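\[ D' = \left\{ (u, v) \;\middle|\; -\frac{1}{2} \le u \le \frac{1}{2},\ \frac{1}{2} \le v \le 1 \right\}. \]
\[ = \frac{1}{2} \int_{1/2}^{1} \frac{1}{v}\,dv \int_{-1/2}^{1/2} e^{u}\,du = \frac{\ln 2}{2} \left( \sqrt{e} - \frac{1}{\sqrt{e}} \right). \]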
\[ \frac{\partial(x, y)}{\partial(u, v)} \cdot \frac{\partial(u, v)}{\partial(x, y)} = 1. \]
Example 3.4.2. Find the area of the region bounded by the four curves
y = ax², y = bx², xy = c, and xy = d,
where a, b, c, and d are four constants satisfying 0 < a < b and 0 < c < d.
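Solution. The area is given by ∬D 1dσ, which is hard to compute. Instead, we use the
transformation
\[ u = \frac{y}{x^2} \quad \text{and} \quad v = xy, \quad \text{thus } a < u < b \text{ and } c < v < d. \]
To compute ∂(x, y)/∂(u, v), we write
\[
\frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{\dfrac{\partial(u, v)}{\partial(x, y)}}
= \frac{1}{\begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[1.5ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix}}
= \frac{1}{\begin{vmatrix} -\dfrac{2y}{x^3} & \dfrac{1}{x^2} \\[1.5ex] y & x \end{vmatrix}}
= \frac{1}{-\dfrac{2y}{x^2} - \dfrac{y}{x^2}} = \frac{-1}{3\,\dfrac{y}{x^2}} = \frac{-1}{3u}.
\]
Therefore, writing D′ for the corresponding rectangle in the uv-plane,
\[
\begin{aligned}
\iint_D 1\,d\sigma &= \iint_{D'} 1 \cdot \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du\,dv = \iint_{D'} 1 \cdot \left| -\frac{1}{3u} \right| du\,dv \\
&= \iint_{D'} \frac{1}{3u}\,du\,dv = \frac{1}{3} \int_a^b \frac{1}{u}\,du \int_c^d dv = \frac{d - c}{3} \ln \frac{b}{a}.
\end{aligned}
\]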
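Since ∫_a^b du/u = ln b − ln a, the area is (1/3)(d − c)(ln b − ln a). This can be sanity-checked without any change of variables by counting small grid cells whose centers lie in the region. The sketch below is only an illustration: the function name region_area, the grid size, and the sample constants a = c = 1, b = d = 2 are assumptions, and the bounding box uses x³ = v/u and y³ = uv².

```python
import math

def region_area(a, b, c, d, n=600):
    """Estimate the area of {a <= y/x^2 <= b, c <= x*y <= d} by counting grid cells."""
    # Bounding box from x^3 = v/u and y^3 = u*v^2 with a <= u <= b, c <= v <= d.
    x_lo, x_hi = (c / b) ** (1 / 3), (d / a) ** (1 / 3)
    y_lo, y_hi = (a * c * c) ** (1 / 3), (b * d * d) ** (1 / 3)
    dx, dy = (x_hi - x_lo) / n, (y_hi - y_lo) / n
    inside = 0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            if a <= y / (x * x) <= b and c <= x * y <= d:
                inside += 1          # cell center lies in the region
    return inside * dx * dy

a, b, c, d = 1.0, 2.0, 1.0, 2.0      # hypothetical sample constants
print(region_area(a, b, c, d), (d - c) / 3 * (math.log(b) - math.log(a)))
```

Cell counting converges slowly, so only the first two or three digits should be expected to agree; the comparison is meant as a plausibility check, not a proof.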
The transformation from x- and y-coordinates to polar coordinates (with the same ori-
gin and with the initial line of the polar coordinates along the x-axis) is given by
\[ \begin{cases} x = r \cos\theta, \\ y = r \sin\theta. \end{cases} \]
3.5 Triple integrals | 165
Hence,
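\[ \left| \frac{\partial(x, y)}{\partial(r, \theta)} \right| = \left|\, \begin{vmatrix} x_r & x_\theta \\ y_r & y_\theta \end{vmatrix} \,\right| = \left|\, \begin{vmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{vmatrix} \,\right| = |r| = r. \]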
This agrees with what we have done before for double integrals in polar coordinates.
If we have a solid box which is not a uniform one, that is, at each point (x, y, z) inside
the box the density is a continuous function f (x, y, z), then how do you find its total
mass?
We now follow a process very similar to that used for double integrals. For a bounded
function of three variables, f (x, y, z) defined on a closed bounded region (a solid) Ω ⊂
ℝ3 , we construct a Riemann sum as
\[ R_n = \sum_{i=1}^{n} f(x_i, y_i, z_i)\,\Delta V_i, \]
and ∭Ω f (x, y, z)dV is called the triple integral of f over the region Ω. This means that
the limit must exist and have the same value no matter how the subregions are created
and how (xi , yi , zi ) are chosen. It can be shown that the limit always exists when f (x, y, z)
is continuous on Ω, provided Ω satisfies some fairly mild conditions.
When f (x, y, z) = 1 for all (x, y, z) ∈ Ω, then the triple integral gives the volume
V(Ω) of the region Ω, so
\[ V(\Omega) = \iiint_\Omega dV. \]
If the density function of a solid Ω is ρ(x, y, z) mass/unit volume at any point (x, y, z)
of Ω, then the mass M of the solid Ω is
\[ M = \iiint_\Omega \rho(x, y, z)\,dV. \]
Triple integrals also have properties such as linearity and additivity, as double inte-
grals do.
Solution. Note that x cos(yz²) is an odd function with respect to x, while Ω is symmetric
with respect to the yz-plane (x = 0). Therefore,
\[ \iiint_\Omega x \cos(yz^2)\,dV = 0. \]
In general, how do you evaluate a triple integral? First of all, we consider the mass
model where Ω is a rectangular box given by
Ω = {(x, y, z)|a1 ≤ x ≤ a2 , b1 ≤ y ≤ b2 , c1 ≤ z ≤ c2 }.
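The projection of the region Ω onto the xy-plane is a rectangular region on the
xy-plane D = {(x, y)|a1 ≤ x ≤ a2 and b1 ≤ y ≤ b2 } as shown in Figure 3.14. If we
let x and y be fixed, then the integral ∫_{c1}^{c2} f (x, y, z)dz gives the mass of a rod. If we add
the masses of all such rods, then the total mass is given by
\[ \iint_D \left[ \int_{c_1}^{c_2} f(x, y, z)\,dz \right] d\sigma. \]
This means
\[ \iiint_\Omega f(x, y, z)\,dV = \iint_D \left[ \int_{c_1}^{c_2} f(x, y, z)\,dz \right] d\sigma. \]
We can then evaluate a triple integral by finding a definite integral followed by eval-
uating a double integral. Recalling what we did for double integrals, this eventually
leads to the iterated integrals
\[ \iiint_\Omega f(x, y, z)\,dV = \int_{a_1}^{a_2} \int_{b_1}^{b_2} \int_{c_1}^{c_2} f(x, y, z)\,dz\,dy\,dx = \int_{a_1}^{a_2} dx \int_{b_1}^{b_2} dy \int_{c_1}^{c_2} f(x, y, z)\,dz. \]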
Example 3.5.2. A box with dimensions 2 × 4 × 8 (height 8) has a density ρ at each point in the
box. The density ρ is proportional to the product of the distance from the point to the bottom of the
box and the distance from the point to the top surface of the box. The proportionality constant is 3.
Find the total mass of the box.
Solution. Set up a coordinate system with the left-most corner as the origin, as shown
in Figure 3.15. Then the density ρ(x, y, z) = 3z(8 − z). The total mass is given by
\[
\begin{aligned}
M = \iiint_\Omega 3z(8 - z)\,dV &= 3 \int_0^2 dx \int_0^4 dy \int_0^8 z(8 - z)\,dz \\
&= 3 \cdot 2 \cdot 4 \left[ 4z^2 - \frac{z^3}{3} \right]_0^8 = 2048.
\end{aligned}
\]
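The total mass can be cross-checked with a three-dimensional Riemann sum. This is a minimal sketch: the helper box_mass, the midpoint sample points, and the grid size are assumptions chosen for illustration, while the density 3z(8 − z) and the box dimensions come from the example. Evaluating the iterated integral by hand gives 3 · 2 · 4 · (256/3) = 2048.

```python
def box_mass(density, dims, n=40):
    """Midpoint Riemann sum of density(x, y, z) over the box [0,X] x [0,Y] x [0,Z]."""
    X, Y, Z = dims
    dx, dy, dz = X / n, Y / n, Z / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        for j in range(n):
            y = (j + 0.5) * dy
            for k in range(n):
                z = (k + 0.5) * dz
                total += density(x, y, z) * dx * dy * dz   # density times volume element
    return total

mass = box_mass(lambda x, y, z: 3 * z * (8 - z), (2.0, 4.0, 8.0))
print(mass)   # should be close to 2048
```

Because the density depends only on z, the sums over x and y contribute just the factor 2 · 4 = 8, mirroring the way the iterated integral splits.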
Now we consider the case of a so-called type I region where the region Ω is enclosed
by two smooth surfaces z = z1 (x, y) and z = z2 (x, y), as shown in Figure 3.16. If the
projection of the region onto the xy-plane is D, then in a way similar to the double
integral, we have
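\[ \iiint_\Omega f(x, y, z)\,dV = \iint_D \left[ \int_{z_1(x,y)}^{z_2(x,y)} f(x, y, z)\,dz \right] d\sigma. \]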
Similarly, we can evaluate a triple integral on a type II or type III region. We summarize
these results in the following definition and Fubini’s theorem.
Definition 3.5.1. A region Ω is of type I if, for each (x, y) ∈ D (a region of the xy-plane), all points in Ω
lie between two surfaces z = z1 (x, y) and z = z2 (x, y), that is,
\[ \Omega = \{ (x, y, z) \in \mathbb{R}^3 \mid (x, y) \in D,\ z_1(x, y) \le z \le z_2(x, y) \}. \]
Similarly, a type II or type III region is defined to lie between two functions with domain D in the xz- or
yz-plane, respectively.
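Theorem 3.5.1 (Iterated integral theorem). Assume that f (x, y, z) is continuous on a type I region Ω of
the form
\[ \Omega = \{ (x, y, z) \in \mathbb{R}^3 \mid (x, y) \in D,\ z_1(x, y) \le z \le z_2(x, y) \}. \]
Then f is integrable on Ω and can be evaluated as a single-variable integration with respect to z (x and
y are held constant) followed by a double integral over the region D in the xy-plane as
\[ \iiint_\Omega f(x, y, z)\,dV = \iint_D \left[ \int_{z_1(x,y)}^{z_2(x,y)} f(x, y, z)\,dz \right] d\sigma. \]
Similarly, for a type II region Ω = {(x, y, z) ∈ ℝ³ | (x, z) ∈ Dxz , y1 (x, z) ≤ y ≤ y2 (x, z)} we have
\[ \iiint_\Omega f(x, y, z)\,dV = \iint_{D_{xz}} \left[ \int_{y_1(x,z)}^{y_2(x,z)} f(x, y, z)\,dy \right] d\sigma, \]
where Dxz is the projection of Ω onto the xz-plane. For a type III region Ω = {(x, y, z) ∈ ℝ³ | (y, z) ∈ Dyz ,
x1 (y, z) ≤ x ≤ x2 (y, z)} we have, when Dyz is the projection of Ω onto the yz-plane,
\[ \iiint_\Omega f(x, y, z)\,dV = \iint_{D_{yz}} \left[ \int_{x_1(y,z)}^{x_2(y,z)} f(x, y, z)\,dx \right] d\sigma. \]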
Example 3.5.3. Evaluate ∭Ω xdV , where Ω is the solid tetrahedron bounded by the four planes x = 0,
y = 0, and z = 0, and x + y + z = 1, using the method described above.
Solution. It is always helpful if we draw two diagrams: one is the solid region Ω, and
the other is its projection D onto an appropriate coordinate plane when evaluating a
triple integral. The diagrams for this example are shown in Figure 3.17.
The lower boundary of the tetrahedron is the plane z = 0, and the upper boundary
in the z-direction is the plane z = 1 − x − y. Note that the planes x + y + z = 1 and
z = 0 intersect in the line x + y = 1 in the xy-plane. Thus, the projection of Ω onto the
xy-plane is the triangular region (see Figure 3.17(b)) bounded by the x-axis, the y-axis,
and x + y = 1.
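We can treat Ω as a type I region
\[ \Omega = \{ (x, y, z) \mid 0 \le x \le 1,\ 0 \le y \le 1 - x,\ 0 \le z \le 1 - x - y \}, \]
so that
\[ \iiint_\Omega x\,dV = \int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} x\,dz = \int_0^1 dx \int_0^{1-x} x(1 - x - y)\,dy = \int_0^1 \frac{x(1 - x)^2}{2}\,dx = \frac{1}{24}. \]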
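The type I description translates directly into three nested one-variable integrations whose limits depend on the outer variables. The following sketch illustrates this; the helper midpoint and its step count are assumptions. Evaluating the iterated integral by hand gives 1/24, which the numerical value should approach.

```python
def midpoint(g, lo, hi, n=40):
    """One-variable midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

# Tetrahedron as a type I region: 0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 1 - x - y.
value = midpoint(
    lambda x: midpoint(
        lambda y: midpoint(lambda z: x, 0.0, 1.0 - x - y),   # innermost: integrate in z
        0.0, 1.0 - x),                                       # then in y
    0.0, 1.0)                                                # finally in x
print(value, 1 / 24)
```

Reading the nested calls from the inside out reproduces the order dz, dy, dx of the iterated integral.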
Sometimes, we may also evaluate the triple integral by first evaluating a double in-
tegral and then evaluating a one-variable integral. For example, if Dz is the projection
(onto the xy-plane) of the cross-section (Dz ) of Ω by a horizontal plane with distance
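z units from the xy-plane, and all cross-sections of Ω satisfy c1 ≤ z ≤ c2 , then Ω is
defined by Ω = {(x, y, z) | c1 ≤ z ≤ c2 , (x, y) ∈ Dz }, and
\[ \iiint_\Omega f(x, y, z)\,dV = \int_{c_1}^{c_2} dz \iint_{D_z} f(x, y, z)\,dx\,dy. \]
In less precise language you can think of the double integral ∬_{Dz} f (x, y, z)dxdy as the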
mass of the lamina (Dz ) when the density per unit volume is f (x, y, z), and then the
integration with respect to z computes the mass of the solid Ω by adding the masses
of all of the laminae. This is illustrated in Figure 3.18.
Similarly, if Dx is the projection (onto the yz-plane) of the cross-section (Dx ) parallel to
the yz-plane at distance x with a1 ≤ x ≤ a2 , and Dy is the projection (onto the xz-plane)
of the cross-section (Dy ) parallel to the xz-plane at distance y with b1 ≤ y ≤ b2 , then
we also have
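\[ \iiint_\Omega f(x, y, z)\,dV = \int_{a_1}^{a_2} dx \iint_{D_x} f(x, y, z)\,dy\,dz = \int_{b_1}^{b_2} dy \iint_{D_y} f(x, y, z)\,dx\,dz. \]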
Example 3.5.4. Find ∭Ω xdV , where Ω is the same region in Example 3.5.3, by evaluating a double
integral first.
Solution. Evaluate the integral ∭Ω xdV by first evaluating a double integral over
This is a triangular cross-section of Ω at height z from the xy-plane. Its projection Dxy
onto the xy-plane is bounded by x + y = 1 − z, x = 0, and y = 0, as shown in Figure 3.19.
We have
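\[ \iiint_\Omega x\,dV = \int_0^1 dz \iint_{D_{xy}} x\,dx\,dy = \int_0^1 dz \int_0^{1-z} dx \int_0^{1-x-z} x\,dy = \frac{1}{24}. \]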
Example 3.5.5. Attempt to evaluate the following integral by first considering E to be a type I region,
then a type II region, and then a third method using a double integral as the inner integral:
\[ \iiint_E \sqrt{x^2 + z^2}\,dV, \]
so we obtain
\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_{-1}^{1} \int_{x^2}^{1} \left[ \int_{-\sqrt{y - x^2}}^{\sqrt{y - x^2}} \sqrt{x^2 + z^2}\,dz \right] dy\,dx. \]
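\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \iint_{D_{xz}} \left[ \int_{x^2 + z^2}^{1} \sqrt{x^2 + z^2}\,dy \right] d\sigma = \iint_{D_{xz}} \left( \left[ y \sqrt{x^2 + z^2} \right]_{y = x^2 + z^2}^{y = 1} \right) d\sigma. \]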
Since the domain Dxz is a circular disk, it is easier to convert this to polar coordinates
in the xz-plane, using the substitution x = r cos θ, z = r sin θ; Dxz is now given by
0 ≤ θ ≤ 2π and 0 ≤ r ≤ 1, which gives
Method 3: Now we consider computing a double integral first in the xz-plane. The
cross-section of E by the vertical plane passing through (0, y, 0) and perpendicular to
the y-axis is the circular disk Dy : x2 + z 2 ≤ y with center at (0, y, 0) and radius √y (see
Figure 3.21(c)). The triple integral becomes
\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_0^1 dy \iint_{D_y} \sqrt{x^2 + z^2}\,d\sigma. \]
Hence,
\[ \iiint_E \sqrt{x^2 + z^2}\,dV = \int_0^1 dy \iint_{D_y} \sqrt{x^2 + z^2}\,d\sigma = \int_0^1 \frac{2\pi y^{3/2}}{3}\,dy = \frac{4\pi}{15}. \]
Cylindrical coordinates
The cylindrical coordinates of a point P are (r, θ, z), as shown in Figure 3.22(a).
The coordinates r and θ are the polar coordinates of the projection of P onto the
xy-plane, and z is the directed distance from the xy-plane to P (the usual z-coordinate).
In cylindrical coordinates, the surfaces analogous to coordinate planes in Carte-
sian coordinates are as follows. If k, l, and m are constants:
r = k is a cylinder whose axis is the z-axis,
θ = l is a half-plane whose edge is the z-axis and its angle with the xz-plane is
θ = l,
z = m is a horizontal plane with distance m from the xy-plane.
The equations relating Cartesian coordinates and cylindrical coordinates of a
point are
x = r cos θ, y = r sin θ, and z = z, (3.5)
where 0 ≤ r < +∞, 0 ≤ θ ≤ 2π, and −∞ < z < ∞. The volume element in cylindrical
coordinates is rdrdθdz (see Figure 3.22(b)).
Example 3.5.6. A solid E lies within the cylinder x² + y² = 1 below the plane z = 2 and above the
paraboloid z = 1 − x² − y². The density (mass per unit volume) at any point (x, y, z) is ρ(x, y, z) =
z√(x² + y²). Find the mass of E.
The density function in cylindrical coordinates is ρ(x, y, z) = zr, and, therefore, the
mass M of E is
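\[
\begin{aligned}
M = \iiint_E zr\,dV &= \int_0^{2\pi} d\theta \int_0^1 r^2\,dr \int_{1 - r^2}^{2} z\,dz
= 2\pi \int_0^1 r^2 \left[ \frac{1}{2} z^2 \right]_{1 - r^2}^{2} dr \\
&= \pi \int_0^1 r^2 \left( 4 - (1 - r^2)^2 \right) dr = \frac{44\pi}{35}.
\end{aligned}
\]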
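A short numerical check of this mass follows. It is a sketch under stated assumptions: the helper midpoint is illustrative, and the θ-integration is replaced by the factor 2π because the integrand does not depend on θ. The integrand is the density zr times the factor r from the volume element r dr dθ dz.

```python
import math

def midpoint(g, lo, hi, n=200):
    """One-variable midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

# E in cylindrical coordinates: 0 <= theta <= 2*pi, 0 <= r <= 1, 1 - r^2 <= z <= 2.
# Integrand: density (z * r) times the volume-element factor r.
radial = midpoint(lambda r: midpoint(lambda z: z * r * r, 1 - r * r, 2.0), 0.0, 1.0)
mass = 2 * math.pi * radial
print(mass, 44 * math.pi / 35)
```

The lower z-limit 1 − r² is the paraboloid written in cylindrical coordinates, so the inner integral's range shrinks as r decreases toward the axis.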
Example 3.5.7. Evaluate
\[ \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{\sqrt{x^2 + y^2}}^{1} z\,dz\,dy\,dx. \]
Solution. This iterated integral is a triple integral of f (x, y, z) = z over the solid region
Ω, i. e.,
The projection of Ω onto the xy-plane is the disk x 2 + y2 ≤ 1, the lower surface of Ω is
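the cone z = √(x² + y²), and the upper surface is the plane z = 1. This region has a much
simpler description in cylindrical coordinates: 0 ≤ θ ≤ 2π, 0 ≤ r ≤ 1, and r ≤ z ≤ 1. Therefore,
\[
\begin{aligned}
\int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{\sqrt{x^2 + y^2}}^{1} z\,dz\,dy\,dx &= \int_0^{2\pi} \int_0^1 \int_r^1 z\,r\,dz\,dr\,d\theta \\
&= \int_0^{2\pi} d\theta \int_0^1 r\,dr \int_r^1 z\,dz = \pi \int_0^1 r(1 - r^2)\,dr = \frac{\pi}{4}.
\end{aligned}
\]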
Spherical coordinates
The spherical coordinates (ρ, θ, ϕ) of a point P are usually defined as in Figure 3.24(a).
The coordinate ρ = |OP| is the distance from the origin to P, θ is the same angle as
in cylindrical coordinates, and ϕ is the angle between the positive z-axis and the line
segment OP. Thus, all points in space have unique spherical coordinates (ρ, θ, ϕ), pro-
vided ρ, θ, ϕ are restricted by ρ ≥ 0, 0 ≤ θ ≤ 2π, and 0 ≤ ϕ ≤ π. The spherical coordi-
nate system is especially useful in problems where the formula of the function being
integrated contains the quantity x2 +y2 +z 2 or where the domain has a spherical nature
with center at the origin. In spherical coordinates, the surfaces analogous to the coor-
dinate planes in Cartesian coordinates are as follows. If k, l, and m are any constants:
ρ = k is a sphere with center the origin and radius k,
θ = l is a half-plane whose edge is the z-axis and angle with the xz-plane is θ = l,
ϕ = m is a half-cone making an angle m with the positive z-axis.
The equations relating spherical and rectangular coordinates of a point are
\[ x = \rho \sin\phi \cos\theta, \quad y = \rho \sin\phi \sin\theta, \quad z = \rho \cos\phi. \]
Example 3.5.8. Evaluate \(\iiint_E e^{(x^2 + y^2 + z^2)^{3/2}}\,dV\), where E is the top half of the unit ball
Solution. Since the boundary of E is part of a sphere, it is wise to try spherical coor-
dinates. The region corresponding to E in spherical coordinates is
\[ E = \left\{ (\rho, \theta, \phi) \;\middle|\; 0 \le \rho \le 1,\ 0 \le \theta \le 2\pi,\ 0 \le \phi \le \frac{\pi}{2} \right\}. \]
In addition, spherical coordinates give x2 +y2 +z 2 = ρ2 and this simplifies the integrand.
Thus,
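\[
\begin{aligned}
\iiint_E e^{(x^2 + y^2 + z^2)^{3/2}}\,dV &= \iiint_E e^{(\rho^2)^{3/2}} \rho^2 \sin\phi\,d\rho\,d\theta\,d\phi \\
&= \int_0^{2\pi} \int_0^{\pi/2} \int_0^1 e^{(\rho^2)^{3/2}} \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\phi\,d\phi \int_0^1 e^{\rho^3} \rho^2\,d\rho \\
&= 2\pi \cdot \Bigl[ -\cos\phi \Bigr]_0^{\pi/2} \cdot \Bigl[ \tfrac{1}{3} e^{\rho^3} \Bigr]_0^1 \\
&= \frac{2}{3} \pi (e - 1).
\end{aligned}
\]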
Example 3.5.9. Find the volume of the solid Ω enclosed by the sphere x² + y² + (z − a)² = a² and inside
the half-cone z = √(3x² + 3y²).
3.6 Change of variables in triple integrals | 179
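Solution. The solid is shown in Figure 3.25. In spherical coordinates, the boundary
surfaces become, after simplification,
\[ \rho = 2a \cos\phi \quad \text{and} \quad \phi = \frac{\pi}{6}. \]
So, the solid Ω is defined by the region Ω′: 0 ≤ ρ ≤ 2a cos ϕ, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π/6 in
spherical coordinates. Therefore,
\[
\begin{aligned}
V = \iiint_\Omega dx\,dy\,dz &= \iiint_{\Omega'} \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta = \int_0^{2\pi} d\theta \int_0^{\pi/6} \sin\phi\,d\phi \int_0^{2a\cos\phi} \rho^2\,d\rho \\
&= 2\pi \int_0^{\pi/6} \sin\phi \cdot \left[ \frac{\rho^3}{3} \right]_0^{2a\cos\phi} d\phi = \frac{16\pi a^3}{3} \int_0^{\pi/6} \cos^3\phi\,\sin\phi\,d\phi \\
&= \frac{16\pi a^3}{3} \left[ -\frac{1}{4} \cos^4\phi \right]_0^{\pi/6} = \frac{7}{12} \pi a^3 \text{ cubic units.}
\end{aligned}
\]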
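The spherical iterated integral can be evaluated numerically for a sample radius. The sketch below is illustrative only: the helper midpoint and the value a = 1.5 are assumptions, and the θ-integration is replaced by the factor 2π since nothing depends on θ.

```python
import math

def midpoint(g, lo, hi, n=200):
    """One-variable midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

a = 1.5   # hypothetical sample value for the radius of the sphere

# For each phi, integrate rho^2 sin(phi) over 0 <= rho <= 2a cos(phi).
inner = lambda phi: midpoint(lambda rho: rho * rho * math.sin(phi),
                             0.0, 2 * a * math.cos(phi))
volume = 2 * math.pi * midpoint(inner, 0.0, math.pi / 6)
print(volume, 7 * math.pi * a ** 3 / 12)
```

The upper ρ-limit 2a cos ϕ is the sphere x² + y² + (z − a)² = a² rewritten in spherical coordinates, and ϕ = π/6 is the half-cone.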
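Example 3.6.1. Evaluate ∭Ω |xy|dV , where Ω is the solid bounded by the ellipsoid x²/a² + y²/b² + z²/c² = 1.
Solution. We use the substitution x = au, y = bv, z = cw and compute its Jacobian
\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[1.5ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[1.5ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix} = \begin{vmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{vmatrix} = abc. \]
Therefore,
\[
\begin{aligned}
\iiint_\Omega |xy|\,dV &= a^2 b^2 c \iiint_{u^2 + v^2 + w^2 \le 1} |uv|\,du\,dv\,dw \\
&= a^2 b^2 c \times 8 \iiint_{u^2 + v^2 + w^2 \le 1,\ u \ge 0,\ v \ge 0,\ w \ge 0} uv\,du\,dv\,dw \\
&= 8 a^2 b^2 c \int_0^{\pi/2} d\theta \int_0^{\pi/2} d\phi \int_0^1 \rho^4 \sin^3\phi\,\sin\theta \cos\theta\,d\rho \\
&= \frac{8 a^2 b^2 c}{15}.
\end{aligned}
\]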
Cylindrical and spherical coordinates are special transformations in a triple in-
tegral. The Jacobian of the transformation from Cartesian to cylindrical coordinates
is
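\[ \frac{\partial(x, y, z)}{\partial(r, \theta, z)} = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} & \dfrac{\partial x}{\partial z} \\[1.5ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} & \dfrac{\partial y}{\partial z} \\[1.5ex] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial \theta} & \dfrac{\partial z}{\partial z} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r \sin\theta & 0 \\ \sin\theta & r \cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} = r. \]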
Hence, the absolute value of this Jacobian (used in the change of variables formula) is
|r| = r, since r ≥ 0.
3.7 Other applications of multiple integrals | 181
Since 0 ≤ ϕ ≤ π, we have sin ϕ ≥ 0, and, therefore, the absolute value of the Jacobian
(used in the change of variables formula) is
\[ \left| \frac{\partial(x, y, z)}{\partial(\rho, \theta, \phi)} \right| = \left| -\rho^2 \sin\phi \right| = \rho^2 \sin\phi. \]
So, the change of variables from Cartesian to spherical makes the following changes
in a triple integral:
\[ \iiint_\Omega f(x, y, z)\,dV = \iiint_{\Omega_{\rho\phi\theta}} f(\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\theta\,d\phi. \]
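The sign and size of the spherical Jacobian can be confirmed numerically. The sketch below estimates the nine partial derivatives by central differences and expands the 3 × 3 determinant; the function names, the step size, and the sample point are assumptions made for illustration. With the variables ordered (ρ, θ, ϕ), the determinant is −ρ² sin ϕ, and its absolute value is the factor used in the integral.

```python
import math

def spherical(rho, theta, phi):
    """Map spherical coordinates (rho, theta, phi) to Cartesian (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def jacobian_3d(T, p, step=1e-6):
    """Central-difference estimate of the Jacobian determinant of T at the point p."""
    rows = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += step
        minus[k] -= step
        fp, fm = T(*plus), T(*minus)
        # rows[k] holds the partials of (x, y, z) with respect to the k-th variable
        rows.append([(fp[m] - fm[m]) / (2 * step) for m in range(3)])
    (a, b, c), (d, e, f), (g, h, j) = rows
    # rows form the transpose of the usual Jacobian matrix; the determinant is the same
    return a * (e * j - f * h) - b * (d * j - f * g) + c * (d * h - e * g)

rho, theta, phi = 2.0, 0.4, 1.1       # hypothetical sample point
det = jacobian_3d(spherical, (rho, theta, phi))
print(det, -rho ** 2 * math.sin(phi))
```

Swapping the order of θ and ϕ would flip the sign of the determinant but not its absolute value, so the volume element ρ² sin ϕ is the same either way.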
Parameterized surfaces
When using a graphing calculator to sketch a sphere, you may notice that the calcula-
tor does not do a good job in sketching functions such as z = √(1 − x² − y²). However, if
you use the parametric form for the same surface x = a sin u cos v, y = a sin u sin v, and
z = a cos u, the graphing calculator does a much better job. In fact, a parameterization
of a surface can be written in a form of a vector-valued function
r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k or r(u, v) = ⟨x(u, v), y(u, v), z(u, v)⟩,
Note that r(u, v) = ⟨a cos u², a sin u², v³⟩ is also a parametric description of the same
cylinder.
(2) A parametric description of the circular cone is
\[ \mathbf{r}(u, v) = \left\langle \frac{v}{a} \cos u,\ \frac{v}{a} \sin u,\ v \right\rangle, \quad 0 \le u \le 2\pi,\ v \ge 0. \]
(3) One parametric description is
Also
\[ \mathbf{r}(u, v) = \left\langle \sqrt{v} \cos u,\ \sqrt{\frac{v}{2}} \sin u,\ v \right\rangle, \quad 0 \le u \le 2\pi,\ v \ge 0 \]
is a parametric description.
Surface area
We now apply double integrals to the problem of computing the area of a surface S
defined by r(u, v) = ⟨x(u, v), y(u, v), z(u, v)⟩, where x = x(u, v), y = y(u, v), and z = z(u, v)
all have continuous partial derivatives at (u, v) ∈ D. A special parametric description
where the surface has an explicit equation z = f (x, y), and a parametric description
for this surface is
To find the “surface area element dS,” we can use the tangent plane approximation,
as shown in Figure 3.26, where the area element on the tangent plane is given by
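\[
\begin{aligned}
\mathbf{r}_u \times \mathbf{r}_v &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x_u & y_u & z_u \\ x_v & y_v & z_v \end{vmatrix}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ R \cos u \cos v & R \cos u \sin v & -R \sin u \\ -R \sin u \sin v & R \sin u \cos v & 0 \end{vmatrix} \\
&= R^2 \sin^2 u \cos v\,\mathbf{i} + R^2 \sin^2 u \sin v\,\mathbf{j} + R^2 \sin u \cos u\,\mathbf{k},
\end{aligned}
\]
it follows that
\[
|\mathbf{r}_u \times \mathbf{r}_v| = \sqrt{(R^2 \sin^2 u \cos v)^2 + (R^2 \sin^2 u \sin v)^2 + (R^2 \sin u \cos u)^2} = R^2 \sin u.
\]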
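Integrating |ru × rv | over the parameter domain 0 ≤ u ≤ π, 0 ≤ v ≤ 2π should reproduce the area 4πR² of the sphere. The sketch below does this with a midpoint sum; the helper names, the grid sizes, and the sample radius are assumptions, and the tangent vectors are typed in from the parametrization x = R sin u cos v, y = R sin u sin v, z = R cos u.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_area(R, n_u=200, n_v=16):
    """Midpoint sum of |r_u x r_v| over 0 <= u <= pi, 0 <= v <= 2*pi."""
    du, dv = math.pi / n_u, 2 * math.pi / n_v
    total = 0.0
    for i in range(n_u):
        u = (i + 0.5) * du
        for j in range(n_v):
            v = (j + 0.5) * dv
            r_u = (R * math.cos(u) * math.cos(v), R * math.cos(u) * math.sin(v), -R * math.sin(u))
            r_v = (-R * math.sin(u) * math.sin(v), R * math.sin(u) * math.cos(v), 0.0)
            normal = cross(r_u, r_v)
            total += math.sqrt(sum(c * c for c in normal)) * du * dv
    return total

R = 3.0   # hypothetical sample radius
print(sphere_area(R), 4 * math.pi * R ** 2)
```

Only the length of the cross product enters the area, so the result does not depend on the orientation of the normal vector.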
Thus,
We use this formula to compute the surface area of a ball with radius R again in the
following example.
Example 3.7.3. Find the surface area of the sphere with radius R.
Solution. By symmetry, we compute the area of the top half of the sphere and double
this. To make the calculation easier, assume that the center of the sphere is at the
origin so that the equation of the top half of the sphere is
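\[ z = f(x, y) = \sqrt{R^2 - x^2 - y^2}, \]
so that
\[ z_x = \frac{\partial z}{\partial x} = \frac{-x}{\sqrt{R^2 - x^2 - y^2}} \quad \text{and} \quad z_y = \frac{\partial z}{\partial y} = \frac{-y}{\sqrt{R^2 - x^2 - y^2}}. \]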
x2 + y2 ≤ R2 and z = 0.
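\[
\begin{aligned}
S = 2 \iint_{D_{xy}} \sqrt{z_x^2 + z_y^2 + 1}\,d\sigma &= 2 \iint_{D_{xy}} \sqrt{\left( \frac{-x}{\sqrt{R^2 - x^2 - y^2}} \right)^2 + \left( \frac{-y}{\sqrt{R^2 - x^2 - y^2}} \right)^2 + 1}\,d\sigma \\
&= 2 \iint_{D_{xy}} \frac{R}{\sqrt{R^2 - x^2 - y^2}}\,d\sigma.
\end{aligned}
\]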
The integrand (the function being integrated) is unbounded on the region Dxy : x² +
y² ≤ R², so we consider the smaller disk x² + y² ≤ b², 0 < b < R, and then let b → R. Converting to
polar coordinates, we obtain
\[
\begin{aligned}
S &= \lim_{b \to R} 2 \int_0^{2\pi} d\theta \int_0^b \frac{R}{\sqrt{R^2 - r^2}}\,r\,dr = \lim_{b \to R} 4\pi R \int_0^b \frac{r}{\sqrt{R^2 - r^2}}\,dr \\
&= \lim_{b \to R} 4\pi R \left[ -\sqrt{R^2 - r^2} \right]_0^b = 4\pi R \lim_{b \to R} \left( R - \sqrt{R^2 - b^2} \right) = 4\pi R^2.
\end{aligned}
\]
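Example 3.7.4. Find the area of the part of the paraboloid z = x² + y² that lies under the plane z = 2.
\[
\begin{aligned}
A &= \iint_D \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}\,dx\,dy = \iint_D \sqrt{1 + (2x)^2 + (2y)^2}\,dx\,dy \\
&= \iint_D \sqrt{1 + 4(x^2 + y^2)}\,dx\,dy.
\end{aligned}
\]
In polar coordinates, D is the disk 0 ≤ r ≤ √2, 0 ≤ θ ≤ 2π, so
\[ A = \int_0^{2\pi} \int_0^{\sqrt{2}} \sqrt{1 + 4r^2}\,r\,dr\,d\theta = 2\pi \cdot \frac{1}{8} \cdot \frac{2}{3} \left[ (1 + 4r^2)^{3/2} \right]_0^{\sqrt{2}} = \frac{13\pi}{3}. \]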
In Figure 3.26, we note that the area element dσ in the xy-plane is exactly the pro-
jection of the area element dS in the tangent plane. This indicates that
dS × cos γ = dσ, where γ is the acute angle between the tangent plane
and the xy-plane.
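the unit vector
\[ \frac{\nabla F}{|\nabla F|} = \left\langle \frac{F_x}{|\nabla F|}, \frac{F_y}{|\nabla F|}, \frac{F_z}{|\nabla F|} \right\rangle = \langle \cos\alpha, \cos\beta, \cos\gamma \rangle. \]
So,
\[ dS = \frac{1}{|\cos\gamma|}\,d\sigma = \frac{|\nabla F|}{|F_z|}\,d\sigma = \frac{|\nabla F|}{|F_z|}\,dx\,dy. \]
Therefore, we conclude that
\[ \text{surface area } S = \iint dS = \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy. \tag{3.10} \]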
If the surface is given by z = f (x, y), then F(x, y, z) = f (x, y) − z = 0, Fz = −1, and
∇F = ⟨zx , zy , −1⟩. So, |∇F| = √(1 + zx² + zy²) and
\[ \text{surface area } S = \iint_{D_{xy}} \frac{\sqrt{1 + z_x^2 + z_y^2}}{|-1|}\,dx\,dy = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy, \tag{3.11} \]
which is the same formula as the one we derived before. However, equation (3.10) does
give us options to project the surface area element dS on the tangent plane onto the
other two coordinate planes. Therefore, we have
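\begin{align}
\text{surface area } S &= \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy = \iint_{D_{xy}} \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy \tag{3.12} \\
&= \iint_{D_{yz}} \frac{|\nabla F|}{|F_x|}\,dz\,dy = \iint_{D_{yz}} \sqrt{1 + x_y^2 + x_z^2}\,dz\,dy \tag{3.13} \\
&= \iint_{D_{xz}} \frac{|\nabla F|}{|F_y|}\,dx\,dz = \iint_{D_{xz}} \sqrt{1 + y_x^2 + y_z^2}\,dx\,dz, \tag{3.14}
\end{align}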
where Dxy , Dyz , and Dxz are projection regions of the surface onto the xy-, yz-, and
xz-planes, respectively.
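Example 3.7.5. Find the surface area of the part of the surface y = x^{3/2} for 0 ≤ x ≤ 4/3
and 0 ≤ z ≤ 4.
F (x, y, z) = y − x^{3/2} = 0, we have ∇F = ⟨−(3/2)x^{1/2}, 1, 0⟩.
\[
\begin{aligned}
\text{surface area } S &= \iint_{D_{xz}} \frac{|\nabla F|}{|F_y|}\,dx\,dz = \iint_{D_{xz}} \sqrt{1 + \left( -\frac{3}{2} x^{1/2} \right)^2}\,dx\,dz \\
&= \int_0^4 dz \int_0^{4/3} \sqrt{1 + \frac{9x}{4}}\,dx = \frac{224}{27}.
\end{aligned}
\]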
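Example 3.7.6. Find the surface area of the part of the sphere x² + y² + z² = R² for z ≥ h, where
0 < h < R.
\[
\begin{aligned}
S = \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy &= \iint_{D_{xy}} \frac{R}{z}\,dx\,dy = \iint_{D_{xy}} \frac{R}{\sqrt{R^2 - x^2 - y^2}}\,dx\,dy \\
&= \int_0^{2\pi} d\theta \int_0^{\sqrt{R^2 - h^2}} \frac{R}{\sqrt{R^2 - r^2}}\,r\,dr \\
&= 2\pi R \left[ -(R^2 - r^2)^{1/2} \right]_0^{\sqrt{R^2 - h^2}} \\
&= 2\pi R (R - h).
\end{aligned}
\]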
Center of mass
In physics, the center of mass of an object is the balance point of the object or the
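position towards which gravity attracts. For a rod with density function $\rho(x)$, the center
of mass $\bar{x}$ can be found by using the idea of moment and is given by
\[
\bar{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}. \tag{3.15}
\]
For a two-dimensional lamina with density function $\rho(x,y)$, the center of mass $(\bar{x}, \bar{y})$ is
given by
\[
\bar{x} = \frac{\iint_D x\rho(x,y)\,d\sigma}{\iint_D \rho(x,y)\,d\sigma} \quad\text{and}\quad \bar{y} = \frac{\iint_D y\rho(x,y)\,d\sigma}{\iint_D \rho(x,y)\,d\sigma}. \tag{3.16}
\]
For a three-dimensional solid with density function $\rho(x,y,z)$, the center of mass
$(\bar{x}, \bar{y}, \bar{z})$ is given by
\[
\bar{x} = \frac{\iiint_\Omega x\rho\,dV}{\iiint_\Omega \rho\,dV}, \quad
\bar{y} = \frac{\iiint_\Omega y\rho\,dV}{\iiint_\Omega \rho\,dV}, \quad\text{and}\quad
\bar{z} = \frac{\iiint_\Omega z\rho\,dV}{\iiint_\Omega \rho\,dV}, \tag{3.17}
\]
where $\rho = \rho(x,y,z)$.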
If an object is uniform (has constant density at each point), then its center of mass is
at the centroid, the geometric center of the figure.
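As a hedged illustration of formula (3.16), the following Python sketch approximates the center of mass of a lamina. The lamina chosen here, the uniform upper half of the unit disk, is an assumption made for this example and does not come from the text; its centroid is known to be $(0, 4/(3\pi))$. The helper `polar_integral` is a hypothetical name, and the midpoint rule in polar coordinates is just one simple choice.

```python
import math

def polar_integral(g, radius, theta0, theta1, n=400):
    """Midpoint-rule approximation of the double integral of g(x, y) over the
    sector 0 <= r <= radius, theta0 <= theta <= theta1, using dσ = r dr dθ."""
    dr = radius / n
    dth = (theta1 - theta0) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = theta0 + (j + 0.5) * dth
            total += g(r * math.cos(th), r * math.sin(th)) * r
    return total * dr * dth

# Assumed example: constant density on the half-disk x^2 + y^2 <= 1, y >= 0.
rho = lambda x, y: 1.0

mass = polar_integral(rho, 1.0, 0.0, math.pi)
x_bar = polar_integral(lambda x, y: x * rho(x, y), 1.0, 0.0, math.pi) / mass
y_bar = polar_integral(lambda x, y: y * rho(x, y), 1.0, 0.0, math.pi) / mass
print(mass, x_bar, y_bar)
```

Replacing `rho` by a non-constant density gives the center of mass of a non-uniform lamina with the same code.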
Moment of inertia
In physics, we know that the moment of inertia of a rigid body m about an axis l is
\[
I = \sum_{i=1}^{n} \Delta m_i\, r_i^2, \tag{3.18}
\]
where $\Delta m_i$ is a mass element, while $r_i$ is the perpendicular distance from this mass
element to the axis of rotation.
By taking limits, this can be represented as an integral. If $\rho$ stands for the density
function, the moments of inertia of a region D in the xy-plane being rotated about the
x-axis and y-axis are, therefore,
\[
I_x = \iint_D y^2 \rho(x,y)\,d\sigma \quad\text{and}\quad I_y = \iint_D x^2 \rho(x,y)\,d\sigma,
\]
respectively. Thus, by the perpendicular axis theorem for the moment of inertia of a
rigid body in the plane, the moment of inertia of the body about the z-axis is
\[
I_z = I_x + I_y = \iint_D (x^2 + y^2)\rho(x,y)\,d\sigma.
\]
Similarly, for a solid R in space, the moment of inertia of the solid about an axis of
rotation is
\[
I = \iiint_R \rho\,|r|^2\,dV,
\]
where |r| is the distance from the volume element dV to the axis of rotation.
Therefore, finding the coordinates for a center of mass or moment of inertia of an
object about an axis of rotation becomes a job of evaluating some multiple integrals.
3.8 Review
Main concepts discussed in this chapter are listed below.
1. Definition and properties of double integrals:
b ϕ2 (x)
6. Cylindrical coordinates:
7. Spherical coordinates:
\[
\iiint_\Omega f(x,y,z)\,dV = \iiint_{\Omega_{\rho\phi\theta}} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta.
\]
\[
F(x,y,z) = 0: \qquad S = \iint_{D_{xy}} \frac{|\nabla F|}{|F_z|}\,dx\,dy.
\]
Similar results hold if the surface is projected onto a coordinate plane other than
the xy-plane.
12. Center of mass: if f (x), f (x, y), and f (x, y, z) are density functions, then
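\[
\bar{x} = \frac{\int_a^b x f(x)\,dx}{\int_a^b f(x)\,dx} \qquad \text{one dimension, thin rod,}
\]
\[
\bar{x} = \frac{\iint_D x f(x,y)\,d\sigma}{\iint_D f(x,y)\,d\sigma}, \quad \bar{y} = \frac{\iint_D y f(x,y)\,d\sigma}{\iint_D f(x,y)\,d\sigma} \qquad \text{two dimensions, thin lamina,}
\]
\[
\bar{x} = \frac{\iiint_\Omega x f(x,y,z)\,dV}{\iiint_\Omega f(x,y,z)\,dV}, \quad
\bar{y} = \frac{\iiint_\Omega y f(x,y,z)\,dV}{\iiint_\Omega f(x,y,z)\,dV}, \quad
\bar{z} = \frac{\iiint_\Omega z f(x,y,z)\,dV}{\iiint_\Omega f(x,y,z)\,dV} \qquad \text{three dimensions, solid.}
\]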
\[
I = \iiint_\Omega \rho\,|r|^2\,dV,
\]
where |r| is the distance from the element dV to the axis of rotation.
3.9 Exercises
3.9.1 Double integrals
Find the average of the function $f(x,y) = \sin(2x - 3y)$ over the region D bounded
by $0 \le x \le \frac{\pi}{2}$ and $x \le y \le \frac{\pi}{2}$.
10. Use polar coordinates to combine the sum
1 x √2 x 2 √4−x2
D a b a
y2
= 1.
b2
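14. We can define the improper integral
\[
I = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy = \lim_{r\to+\infty} \iint_{D_r} e^{-(x^2+y^2)}\,dx\,dy,
\]
where $D_r$ is the disk with radius r and center the origin. Show that
\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy = \pi
\]
and deduce
\[
\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} \quad\text{and}\quad \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.
\]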
\[
\iiint_\Omega (1 - x^2 - 3y^2 - 2z^2)\,dV
\]
is a maximum.
2. Evaluate the iterated integral $\int_0^1\!\int_0^z\!\int_0^{x-z} 6xz\,dy\,dx\,dz$.
3. If Ω2 = {(x, y, z)|x2 + y2 + z 2 ≤ R2 } and
\[
f_{\text{ave}} = \frac{1}{V(\Omega)} \iiint_\Omega f(x,y,z)\,dV, \quad\text{where } V(\Omega) \text{ is the volume of } \Omega.
\]
Find the average value of the function f (x, y, z) = x 2 z +y2 z over the region enclosed
by the paraboloid z = 1 − x2 − y2 and the plane z = 43 .
6. Evaluate the following triple integrals by converting to cylindrical or spherical
coordinates:
(1) ∭Ω (x2 + y2 )dxdydz, where Ω is the region enclosed by the paraboloid x2 + y2 =
2z and the plane z = 2,
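(2) $\int_0^3\!\int_0^{\sqrt{9-x^2}}\!\int_0^{\sqrt{x^2+y^2}} \sqrt{x^2+y^2}\,dz\,dy\,dx$,
(3) $\iiint_\Omega \sqrt{x^2+y^2+z^2}\,dx\,dy\,dz$, where $\Omega$ is the ball $x^2+y^2+z^2 \le z$,
(4) $\int_0^1 dx \int_0^{\sqrt{1-x^2}} dy \int_{\sqrt{x^2+y^2}}^{\sqrt{2-x^2-y^2}} z^2\,dz$.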
7. Use triple integrals to find the volume of the solid
(1) bounded by planes x = 0, y = 0, z = 0, x = 4, and y = 4 and the paraboloid
z = x2 + y2 + 1.
(2) enclosed by the cone z = √x2 + y2 and the paraboloid az = x 2 + y2 , where
a > 0.
(3) enclosed by the spherical surfaces ρ = 4 sin ϕ.
8. Use the change of variables in a triple integral to evaluate ∭Ω zdV, where Ω is
bounded by the planes y = x, y = x + 2, z = x, z = x + 2, z = 0, and z = 6.
9. Let f (x) > 0 be a continuous function, Ω = {(x, y, z)|x 2 + y2 + z 2 ≤ t 2 }, and D =
{(x, y)|x2 + y2 ≤ t 2 }. Prove that the function
\[
F(t) = \frac{\iiint_\Omega f(x^2+y^2+z^2)\,dV}{\iint_D f(x^2+y^2)\,d\sigma}
\]
(3) the part of the cylinder x2 + y2 = x that lies within the sphere x2 + y2 + z 2 = 1,
(4) the part of the sphere x2 + y2 + z 2 = a2 that lies within the cylinder x 2 + y2 = ax
and above the xy-plane,
(5) the part of the cylinder x2 + y2 = 16 that is between the planes z = 0 and
z = 16 − 2x,
(6) the part of the trough z = x2 for −3 ≤ x ≤ 3 and 1 ≤ y ≤ 4.
2. A lamina is represented by the part of the disk x 2 + y2 ≤ 1 in the first and second
quadrants. Find the mass of the lamina if the density at any point is proportional
to its distance from the x-axis with constant of proportionality equal to K.
3. A solid has the shape of a half-cylinder bounded by −3 ≤ x ≤ 3, 0 ≤ y ≤ √9 − x 2 ,
and 0 ≤ z ≤ 2. Each point (x, y, z) in the solid has density given by the function
$f(x,y,z) = \frac{1}{1+x^2+y^2}$. Find the total mass of this solid.
4. Find the coordinates of the centroid of the constant-density cone D bounded by
z = 4 − √x2 + y2 and z = 0.
5. A torus is a surface obtained by rotating a closed plane curve (for example a circle)
about an axis, where the axis usually does not intersect the curve. Consider the
torus obtained by rotating the circle (x − R)2 + z 2 = r 2 (r < R) in the xz-plane about
the z-axis.
(a) Parameterize the torus.
(b) Find its volume.
(c) Find the centroid of the half-torus (z ≥ 0).
(d) Find the surface area and the moment of inertia of the torus.
6. Find the moment of inertia for:
(a) a uniform semidisk about its straight edge (the diameter),
(b) a uniform semidisk about the axis that is the perpendicular bisector of its
straight edge (in the same plane),
(c) a ball with radius R center at the origin, and density function √x 2 + y2 + z 2
about the z-axis.
4 Line and surface integrals
4.1 Line integral with respect to arc length
The definite integral
b b
https://doi.org/10.1515/9783110674378-004
Definition 4.1.1 (Line integral with respect to arc length). Let $C$ be a piecewise smooth curve in the
plane $\mathbb{R}^2$, connecting two fixed points $A$ and $B$. Let $s$ be the distance along $C$ measured from $A = S_0$.
Subdivide $C$ between $A$ and $B$ (Figure 4.2(a)) by points $A = S_0, S_1, S_2, \ldots, S_n = B$ and let $s_i$ for each
$i = 1, 2, \ldots, n$ be the distance along the curve from $S_0$ to $S_i$. Arbitrarily choose $(x_i^*, y_i^*) \in \mathbb{R}^2$ on $C$
between $S_{i-1}$ and $S_i$ for $i = 1, 2, \ldots, n$. Let $f$ be a real-valued function defined on $C$. If the limit
\[
\lim_{\max|\Delta s_i| \to 0,\; n\to\infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i
\]
exists for all possible subdivisions and choices of the $(x_i^*, y_i^*)$, then we define this to be the line integral
of $f$ on the curve $C$ with respect to arc length, and we write
\[
\lim_{\max|\Delta s_i| \to 0,\; n\to\infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i = \int_C f(x,y)\,ds.
\]
Note.
1. It can be proved that this line integral exists when f is continuous, or piecewise
continuous, provided C is finite and piecewise smooth. A curve C: r(t), a ≤ t ≤ b, is
piecewise smooth if there is a subdivision a = t0 < t1 < t2 < ⋅ ⋅ ⋅ < tn = b such that
$r'(t)$ is continuous on each subinterval $t_{i-1} < t < t_i$.
2. Sometimes we also use the notation ∫L f (x, y)ds for a line integral.
(1) if f (x, y) = 1 for all (x, y) ∈ C in ℝ2 , then ∫C 1ds is the length of the curve C,
(2) $\int_C (k f(x,y) + m g(x,y))\,ds = k\int_C f(x,y)\,ds + m\int_C g(x,y)\,ds$ if $k, m \in \mathbb{R}$ (linearity),
(3) $\int_{C_1 + C_2} f(x,y)\,ds = \int_{C_1} f(x,y)\,ds + \int_{C_2} f(x,y)\,ds$, where $C_1 + C_2$ means the curve formed
by combining curves $C_1$ and $C_2$ into one curve,
(4) if f (x, y) ≤ g(x, y), then ∫C f (x, y)ds ≤ ∫C g(x, y)ds,
(5) if f (x, y) is continuous, then there is a point (η, ξ ) such that
Example 4.1.1. Evaluate the line integral ∫C (2x ln(1 + y 2 ) + π)ds, where C is the half-circle y = √4 − x 2 .
= 0 + π × π × 2 = 2π 2 .
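The value $2\pi^2$ in Example 4.1.1 can be checked numerically. The sketch below assumes the parameterization $x = 2\cos t$, $y = 2\sin t$, $0 \le t \le \pi$ of the half-circle (a choice made here, not taken from the text) and uses a hypothetical helper `line_integral_ds` based on the midpoint rule.

```python
import math

def line_integral_ds(f, r, dr, a, b, n=20000):
    """Approximate the arc-length line integral of f along r(t), a <= t <= b,
    as the integral of f(r(t)) * |r'(t)| dt (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = r(t)
        dx, dy = dr(t)
        total += f(x, y) * math.hypot(dx, dy)
    return total * h

f = lambda x, y: 2 * x * math.log(1 + y * y) + math.pi
half_circle = lambda t: (2 * math.cos(t), 2 * math.sin(t))
velocity = lambda t: (-2 * math.sin(t), 2 * math.cos(t))

value = line_integral_ds(f, half_circle, velocity, 0.0, math.pi)
print(value, 2 * math.pi ** 2)
```

The term $2x\ln(1+y^2)$ is odd in $x$ and the half-circle is symmetric about the y-axis, so only the constant $\pi$ contributes: $\pi$ times the length $2\pi$ of the curve.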
where the functions x(t) and y(t) both have continuous first-order derivatives, and
where r(a) and r(b) correspond to the points A and B, respectively. We approximate
$\Delta s_i \simeq \sqrt{\Delta x_i^2 + \Delta y_i^2}$ (see Figure 4.2(b)). Let $t = t_i^*$ be the value of t that corresponds to the
point (xi∗ , yi∗ ), and let t = ti be the value of t that corresponds to the point Si on C, for
i = 1, 2, . . . , n. Writing Δti = ti − ti−1 and noting that max |Δti | → 0 ⇐⇒ max |Δsi | → 0,
it follows that
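\begin{align*}
\int_C f(x,y)\,ds &= \lim_{\max|\Delta s_i|\to 0,\; n\to\infty} \sum_{i=1}^{n} f(x_i^*, y_i^*)\,\Delta s_i\\
&= \lim_{\max|\Delta s_i|\to 0,\; n\to\infty} \sum_{i=1}^{n} f(x(t_i^*), y(t_i^*))\sqrt{\Delta x_i^2 + \Delta y_i^2}\\
&= \lim_{\max|\Delta t_i|\to 0,\; n\to\infty} \sum_{i=1}^{n} f(x(t_i^*), y(t_i^*))\sqrt{\frac{\Delta x_i^2}{\Delta t_i^2} + \frac{\Delta y_i^2}{\Delta t_i^2}}\;\Delta t_i.
\end{align*}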
The last Riemann sum is an ordinary one-variable Riemann sum. Hence, if the limit
exists, we can express the line integral as an ordinary one-variable definite integral,
i. e.,
\[
\int_C f(x,y)\,ds = \int_a^b f(x(t), y(t))\sqrt{\Big(\frac{dx}{dt}\Big)^2 + \Big(\frac{dy}{dt}\Big)^2}\,dt. \tag{4.2}
\]
Note that we can interpret $ds$, the arc length element, as $|r'(t)|\,dt$. This is consistent
with $\frac{ds}{dt} = |r'(t)|$.
Example 4.1.2. Compute the circumference of the circle x 2 + y 2 = R 2 using a line integral.
Solution. The circumference is ∫C 1ds, where C is the circle. Choose the standard pa-
rameterization,
Example 4.1.3. Find the total mass of the wire given by the curve C: y = x 2 , −2 ≤ x ≤ 2 with density
function f (x, y) = x + √y.
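Solution. We choose a parameterization $r(t) = \langle t, t^2\rangle$, and then $|r'(t)| = \sqrt{1^2 + (2t)^2}$. So
we have
\begin{align*}
\int_C f\,ds &= \int_{-2}^{2} (t + \sqrt{t^2})\,|r'(t)|\,dt = \int_{-2}^{2} (t + \sqrt{t^2})\sqrt{1^2 + (2t)^2}\,dt\\
&= 0 + 2\int_0^2 t\sqrt{1 + 4t^2}\,dt\\
&= \frac{1}{4}\cdot\frac{2}{3}\Big[(1 + 4t^2)^{3/2}\Big]_0^2 = \frac{17}{6}\sqrt{17} - \frac{1}{6}.
\end{align*}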
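A quick numerical check of Example 4.1.3 is sketched below, using formula (4.2) with the parameterization $r(t) = \langle t, t^2\rangle$ from the solution. The helper name `line_integral_ds` is hypothetical, and the midpoint rule is only one convenient quadrature choice.

```python
import math

def line_integral_ds(f, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f(r(t)) * |r'(t)| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = r(t)
        dx, dy = dr(t)
        total += f(x, y) * math.hypot(dx, dy)
    return total * h

density = lambda x, y: x + math.sqrt(y)   # f(x, y) = x + sqrt(y)
parabola = lambda t: (t, t * t)           # r(t) = <t, t^2>
velocity = lambda t: (1.0, 2 * t)         # r'(t) = <1, 2t>

mass = line_integral_ds(density, parabola, velocity, -2.0, 2.0)
exact = (17 * math.sqrt(17) - 1) / 6
print(mass, exact)
```

On the left half of the wire the density $t + \sqrt{t^2} = t + |t|$ vanishes, which is why only the interval $0 \le t \le 2$ contributes to the closed form.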
The process for expressing a line integral in ℝ3 as a definite integral is very similar
to the one used in ℝ2 , and is, therefore, not repeated here. When the curve C is, in a
vector parametric form, r(t) = ⟨x(t), y(t), z(t)⟩, for a ≤ t ≤ b, then the result is
\[
\int_C f(x,y,z)\,ds = \int_a^b f(x(t), y(t), z(t))\sqrt{\Big(\frac{dx}{dt}\Big)^2 + \Big(\frac{dy}{dt}\Big)^2 + \Big(\frac{dz}{dt}\Big)^2}\,dt \tag{4.4}
\]
or
\[
\int_C f(x,y,z)\,ds = \int_a^b f(r(t))\,|r'(t)|\,dt.
\]
Note. A curve can have many different parametric forms, but the line integral will al-
ways have the same value because the parametric form of the line integral will always
be the same as the original definition of the line integral in equation (4.2).
Example 4.1.4. Find the mass of the wire represented by the helical curve $C: r(t) = \langle 2\cos t, 2\sin t, t\rangle$,
$\pi \le t \le 2\pi$, when the density at a point $(x, y, z)$ on the wire is given by $\delta(x,y,z) = x^2 + y^2 + z^2$.
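A numerical sketch of Example 4.1.4, assuming formula (4.4): on the helix, $\delta = 4 + t^2$ and $|r'(t)| = \sqrt{5}$, so the mass should equal $\sqrt{5}\,(4\pi + 7\pi^3/3)$. That closed form is derived here from (4.4), not quoted from the text, and the helper `mass_of_wire` is a hypothetical name.

```python
import math

def mass_of_wire(density, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of density(r(t)) * |r'(t)| over [a, b]
    for a space curve r(t) = (x(t), y(t), z(t))."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        speed = math.sqrt(sum(c * c for c in dr(t)))
        total += density(*r(t)) * speed
    return total * h

helix = lambda t: (2 * math.cos(t), 2 * math.sin(t), t)
helix_velocity = lambda t: (-2 * math.sin(t), 2 * math.cos(t), 1.0)
delta = lambda x, y, z: x * x + y * y + z * z

mass = mass_of_wire(delta, helix, helix_velocity, math.pi, 2 * math.pi)
closed_form = math.sqrt(5) * (4 * math.pi + 7 * math.pi ** 3 / 3)
print(mass, closed_form)
```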
Example 4.1.5. Evaluate ∫C fds when f (x, y, z) = x + yz and C = C1 + C2 , where C1 is the line segment
from (0, 0, 2) to (2, 0, 2) and C2 is the line segment from (2, 0, 2) to (1, 1, 1).
Solution. The parametric form of the line through (0, 0, 2) and (2, 0, 2) is
Since r(0) = ⟨0, 0, 2⟩ and r(1) = ⟨2, 0, 2⟩, the line segment C1 has parameterization
\[
\int_C f\,ds = \int_{C_1} f\,ds + \int_{C_2} f\,ds = \int_0^1 f(r_1(t))\,|r_1'(t)|\,dt + \int_0^1 f(r_2(t))\,|r_2'(t)|\,dt
\]
1 1
Figure 4.3: Line integral with respect to arc length, Example 4.1.5.
4.2 Line integral of a vector field
Many physical phenomena can be modeled by associating a vector with each point in
space. Examples include electric fields, magnetic fields, gravitational fields, and the
velocity field for a fluid. A function that associates a vector with each point is called a
vector field.
More precisely, it is defined as follows.
Definition 4.2.1. A function F with domain set D ⊂ ℝn and with range a set of vectors in ℝn is called a
vector field.
For example, the functions F, G, and H defined as follows are vector fields:
\[
\mathbf{F}(x,y) = x^2 y\,\mathbf{i} + 2xye^{x}\,\mathbf{j},
\]
Example 4.2.1. The gravitational field of the Earth (of mass Me ) acting on an object of mass m is a
vector field. Find the formula for this force field.
Solution. The gravitational force F is given by the inverse square law formula
\[
F = G\,\frac{M_e m}{r^2},
\]
where G is the gravitational constant, r is the distance between the object and the
center of the Earth, and m and $M_e$ are the mass of the object and the Earth, respectively.
In order to find the gravitational field, let $\vec{r}$ be the vector from the center of the Earth
to the object. The force on the object acts along the same line, but in the opposite
direction to $\vec{r}$, and so the field becomes
\[
\vec{F} = -G\,\frac{M_e m\,\vec{r}}{r^3}.
\]
If we define a Euclidean coordinate system with origin at the center of the Earth, then
this becomes
\[
\vec{F} = -G\,\frac{M_e m}{r^2}\,\frac{\vec{r}}{|\vec{r}|} = -G\,\frac{M_e m\,(x\,\vec{i} + y\,\vec{j} + z\,\vec{k})}{(x^2 + y^2 + z^2)^{3/2}},
\]
To find the total work, we again use the element method to break down the problem.
Thinking about a very small piece of the curve, say, Δs, the work element ΔW can be
obtained by using |F| cos θ ⋅ Δs, as shown in Figure 4.6. Suppose T is the unit tangent
vector of the curve. Then
Adding them up and taking a limit, we will have a line integral similar to the one in
the previous section. We now give a formal definition of a line integral of a vector field
along a curve C in ℝ2 .
Definition 4.2.2 (Line integral of a vector field). Let $F = \langle f(x,y), g(x,y)\rangle$ be a vector field in $\mathbb{R}^2$, and let
$C: r(t)$, $a \le t \le b$, be a curve in the xy-plane. If the limit
\[
\lim_{\max|\Delta s_i|\to 0,\; n\to\infty} \sum_{i=1}^{n} \langle f(x_i^*, y_i^*), g(x_i^*, y_i^*)\rangle \cdot \frac{\Delta r_i/\Delta t_i}{|\Delta r_i/\Delta t_i|}\,\Delta s_i
\]
exists and is the same for any possible choice of sample points $(x_i^*, y_i^*)$ between $S_{i-1}$ and $S_i$ on $C$ and
for any subdivision $S_0, S_1, \ldots, S_n$ of the curve $C$, we define this limit to be the line integral of the vector field $F$
along the curve $C$ from point $A = r(a)$ to point $B = r(b)$, and we write
\[
\int_C (F\cdot T)\,ds = \lim_{\max|\Delta s_i|\to 0,\; n\to\infty} \sum_{i=1}^{n} \langle f(x_i^*, y_i^*), g(x_i^*, y_i^*)\rangle \cdot \frac{\Delta r_i/\Delta t_i}{|\Delta r_i/\Delta t_i|}\,\Delta s_i.
\]
Note.
1. If both components f and g of F are continuous or piecewise continuous and the
curve C is also piecewise smooth, then the above line integral of the vector field F
along C always exists.
2. Unlike the line integral with respect to arc length, the orientation of the curve C
does make a difference in the line integral of a vector field. We usually define the
positive orientation of a parameterized curve C to be the direction that is consis-
tent with the increasing value of its parameter. If we use −C to denote the negative
3. If the curve is closed, that is, r(a) = r(b), then we also adopt the notation
∮C (F ⋅ T)ds.
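\begin{align*}
\int_C (F\cdot T)\,ds &= \int_C \langle f, g\rangle \cdot \frac{r'(t)}{|r'(t)|}\,ds\\
&= \int_C \langle f, g\rangle \cdot \frac{r'(t)}{|r'(t)|}\,|r'(t)|\,dt = \int_C \langle f, g\rangle \cdot r'(t)\,dt\\
&= \int_C \langle f, g\rangle \cdot \Big\langle \frac{dx}{dt}, \frac{dy}{dt}\Big\rangle\,dt\\
&= \int_C f\,dx + g\,dy,
\end{align*}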
we have
Note. We can write ∫C F ⋅ dr = ∫C (F ⋅ T )ds, where T is a unit tangent to C, and the inte-
grals with respect to arc length s are unchanged when the orientation of C is reversed.
However, the orientation property still holds true because the unit vector T is replaced
by its negative, −T, when C is replaced by −C.
Example 4.2.2. Evaluate $\int_C x\,dy - y\,dx$, where C is the arc of the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ from $(a, 0)$ to $(0, b)$.
and $r'(t) = \langle -a\sin t, b\cos t\rangle$. The vector field is $F = \langle -y, x\rangle$, so
Example 4.2.3. Find the line integral ∫C x 2 dx − xydy, where C consists of the line segment C1 from the
point (1, 0) to the point (0, 0) followed by the vertical line segment C2 from the point (0, 0) to (0, 1).
Solution. The path and the vector field are shown in Figure 4.7(a). Along C1 , we have
y = 0 and dy = 0, so
\[
\int_{C_1} x^2\,dx - xy\,dy = \int_1^0 x^2\,dx = -\frac{1}{3}.
\]
Along $C_2$, $x = 0$, so $dx = 0$ and
\[
\int_{C_2} x^2\,dx - xy\,dy = 0.
\]
Altogether, we have
\[
\int_C x^2\,dx - xy\,dy = \int_{C_1} x^2\,dx - xy\,dy + \int_{C_2} x^2\,dx - xy\,dy = -\frac{1}{3}.
\]
Example 4.2.4. Find the work done by the force field $\vec{F}(x,y) = x^2\,\vec{i} - xy\,\vec{j}$ acting on a particle moving
along the quarter-circle $\vec{r}(t) = \cos t\,\vec{i} + \sin t\,\vec{j}$ for $0 \le t \le \frac{\pi}{2}$.
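Solution. The path and the vector field are shown in Figure 4.7(b). Since $x = \cos t$ and
$y = \sin t$, we have
\[
\vec{F}(\vec{r}(t)) = \langle \cos^2 t, -\cos t \sin t\rangle
\]
and $\frac{d\vec{r}(t)}{dt} = \langle -\sin t, \cos t\rangle$. Therefore, the work done is
\[
W = \int_0^{\pi/2} \langle \cos^2 t, -\cos t\sin t\rangle \cdot \langle -\sin t, \cos t\rangle\,dt = \int_0^{\pi/2} (-2\cos^2 t \sin t)\,dt = \Big[\frac{2}{3}\cos^3 t\Big]_0^{\pi/2} = -\frac{2}{3}.
\]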
Note. The two previous examples both compute the work done by the same force field,
$x^2\,\vec{i} - xy\,\vec{j}$, between the same two points, but reach different answers. This shows that
the work done by the same vector field may be different when calculated along differ-
ent routes. This is not true of so-called conservative force fields, such as gravity, where
the work done is the same, no matter what path is taken, so long as the starting point
and ending point are fixed. The next example with a conservative force field illustrates
this point.
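The path dependence described in this note can be observed numerically. The sketch below evaluates $\int_C F\cdot dr$ for $F = \langle x^2, -xy\rangle$ along the two routes of Examples 4.2.3 and 4.2.4; the helper name `work` and the straight-line parameterizations of the two segments are assumptions made for this illustration.

```python
import math

def work(F, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        fx, fy = F(*r(t))
        dx, dy = dr(t)
        total += fx * dx + fy * dy
    return total * h

F = lambda x, y: (x * x, -x * y)

# Example 4.2.3: (1, 0) -> (0, 0) along the x-axis, then (0, 0) -> (0, 1).
w_segments = (work(F, lambda t: (1 - t, 0.0), lambda t: (-1.0, 0.0), 0.0, 1.0)
              + work(F, lambda t: (0.0, t), lambda t: (0.0, 1.0), 0.0, 1.0))

# Example 4.2.4: quarter-circle from (1, 0) to (0, 1).
w_arc = work(F, lambda t: (math.cos(t), math.sin(t)),
             lambda t: (-math.sin(t), math.cos(t)), 0.0, math.pi / 2)

print(w_segments, w_arc)   # approximately -1/3 and -2/3
```

Both routes start at (1, 0) and end at (0, 1), yet the two values differ, so this field cannot be conservative.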
Example 4.2.5. Evaluate the line integral $\int_C xy\,dx + \frac{x^2}{2}\,dy$ between $O(0,0)$ and $B(1,1)$ along the following
curves:
(1) the line segment from O(0, 0) to A(1, 0), and then to B(1, 1).
(2) the line segment from O(0, 0) to B(1, 1).
(3) an arc of the parabola x = y 2 .
Solution. The vector field and the paths are shown in Figure 4.8.
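(1) Along the line segment from $O(0,0)$ to $A(1,0)$, $y = 0$ and $dy = 0$, and then
\[
\int_{OA} xy\,dx + \frac{x^2}{2}\,dy = 0.
\]
Along the line segment from $A(1,0)$ to $B(1,1)$, $x = 1$ and $dx = 0$. Thus
\[
\int_{AB} xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 \frac{1^2}{2}\,dy = \frac{1}{2}.
\]
All together,
\[
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_{OA} xy\,dx + \frac{x^2}{2}\,dy + \int_{AB} xy\,dx + \frac{x^2}{2}\,dy = 0 + \frac{1}{2} = \frac{1}{2}.
\]
(2) Along the line segment from $O(0,0)$ to $B(1,1)$, a parametric form is $x = x$, $y = x$ for
$0 \le x \le 1$, so
\[
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 x^2\,dx + \frac{x^2}{2}\,dx = \int_0^1 \frac{3x^2}{2}\,dx = \frac{1}{2}.
\]
(3) Along the arc of the parabola $x = y^2$, $0 \le y \le 1$,
\[
\int_C xy\,dx + \frac{x^2}{2}\,dy = \int_0^1 y^3\,d(y^2) + \frac{y^4}{2}\,dy = \int_0^1 \frac{5y^4}{2}\,dy = \frac{1}{2}.
\]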
Note. In this example, you can see that the line integral of the vector field $\langle xy, \frac{x^2}{2}\rangle$
from (0, 0) to (1, 1) is the same along three different curves. In fact, it would be the
same along any curve joining the two points. This is an example of a conservative
vector field. We will discuss this kind of field in more detail in the coming section.
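The three computations of Example 4.2.5 can be reproduced numerically. In the sketch below the helper `work` and the particular parameterizations of the three paths are illustrative assumptions; each path runs from (0, 0) to (1, 1).

```python
def work(F, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        fx, fy = F(*r(t))
        dx, dy = dr(t)
        total += fx * dx + fy * dy
    return total * h

F = lambda x, y: (x * y, x * x / 2)

# (1) O -> A(1, 0) -> B(1, 1): two straight segments.
path1 = (work(F, lambda t: (t, 0.0), lambda t: (1.0, 0.0), 0.0, 1.0)
         + work(F, lambda t: (1.0, t), lambda t: (0.0, 1.0), 0.0, 1.0))
# (2) Straight segment y = x.
path2 = work(F, lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)
# (3) Parabola x = y^2, parameterized by y = t.
path3 = work(F, lambda t: (t * t, t), lambda t: (2 * t, 1.0), 0.0, 1.0)

print(path1, path2, path3)   # each approximately 0.5
```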
Example 4.2.6. Evaluate ∫C (y − x)dx + xdy + (x + z)dz, where C consists of the line segment C1 from
(2, 0, 0) to (3, 4, 5) followed by the vertical line segment C2 from (3, 4, 5) to (−1, 4, 1).
Thus,
1
Definition 4.3.1 (Path independence). Let F = ⟨f , g⟩ be a vector field defined on a region D in ℝ2 . Let A
and B be any two points in D, and C be any path with endpoints A and B. If the value of the line integral
∫C fdx + gdy is independent of the path that connects A and B, then we say the line integral ∫C fdx + gdy
is path-independent and that the vector field F is path-independent in D.
4.3 The fundamental theorem of line integrals | 209
\[
\int_{C_1} F\cdot dr = \int_{C_2} F\cdot dr.
\]
This means
Note that C1 +(−C2 ) is a closed curve in D. (We say a parameterized curve r(t), a ≤ t ≤ b,
is closed if r(a) = r(b)).
On the other hand, if F is defined in D and
\[
\oint_C F\cdot dr = 0
\]
for any closed curve C in D, then we claim that the vector field F must be path-
independent in D. This is because for any two points A and B in D, and any two
different paths C1 and C2 from A to B, we have
\[
0 = \oint_C F\cdot dr = \int_{C_1} F\cdot dr + \int_{-C_2} F\cdot dr = \int_{C_1} F\cdot dr - \int_{C_2} F\cdot dr.
\]
This means
\[
\int_{C_1} F\cdot dr = \int_{C_2} F\cdot dr.
\]
Theorem 4.3.1. $F = \langle f, g\rangle$ is path-independent in D if and only if, for any closed curve C in D, we have
\[
\oint_C F\cdot dr = \oint_C f\,dx + g\,dy = 0.
\]
But how do we know that a line integral is path-independent? Recall the fundamental
theorem of calculus,
\[
\int_a^b f'(x)\,dx = f(b) - f(a).
\]
The integration of the rate of change of a function is equal to the net change of the
function over the interval [a, b]. For functions of two or more variables, we have seen
that the gradient of a function plays much the same role as the derivative of a function
does for functions of one variable. We consider the integral ∫C ∇φ ⋅ dr. This means the
vector field F is the gradient of some scalar function φ. This indeed works, and we now
state the fundamental theorem of line integrals.
Theorem 4.3.2. Let φ(x, y) be a differentiable function and F = ∇φ. Suppose that C is any curve that
has a parameterization r(t) = ⟨x(t), y(t)⟩, a ≤ t ≤ b, where r(a) and r(b) correspond to the points A
and B, respectively. Then F is path-independent and
\[
\int_C F\cdot dr = \int_C \nabla\varphi\cdot dr = \varphi(B) - \varphi(A).
\]
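\begin{align*}
\int_C \nabla\varphi\cdot dr &= \int_C \nabla\varphi\cdot\frac{dr}{dt}\,dt = \int_C \Big\langle \frac{\partial\varphi}{\partial x}, \frac{\partial\varphi}{\partial y}\Big\rangle \cdot \Big\langle \frac{dx}{dt}, \frac{dy}{dt}\Big\rangle\,dt\\
&= \int_a^b \Big(\frac{\partial\varphi}{\partial x}\frac{dx}{dt} + \frac{\partial\varphi}{\partial y}\frac{dy}{dt}\Big)\,dt\\
&= \int_a^b \Big(\frac{d\varphi}{dt}\Big)\,dt = \varphi(x(b), y(b)) - \varphi(x(a), y(a))\\
&= \varphi(B) - \varphi(A).
\end{align*}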
Due to the fundamental theorem of line integrals, we now give the definition of a
conservative vector field and a potential function of the vector field F.
Definition 4.3.2 (Conservative field and potential function). A vector field F is called conservative if it
is the gradient of a scalar function φ, that is, F = ∇φ. Such a φ is called a potential function of the
vector field F.
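\[
f\,dx + g\,dy = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy = d\varphi.
\]
This means
\[
\int_C f\,dx + g\,dy = \int_C \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy = \int_C d\varphi(x,y) = \varphi(B) - \varphi(A).
\]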
Therefore, if we can find a potential function for the vector field F, then we can eval-
uate a work integral along a curve C by directly finding the difference of the potential
function at the initial and terminal points of the curve C. In this case we also write
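\[
xy\,dx + \frac{x^2}{2}\,dy = d\Big(\frac{x^2 y}{2}\Big),
\]
so $\frac{x^2 y}{2}$ is a potential function of $\langle xy, \frac{x^2}{2}\rangle$. Therefore, the line integral $\int_C xy\,dx + \frac{x^2}{2}\,dy$ from $A(0,0)$ to
$B(1,1)$ along any curve C is obtained by
\[
\int_C xy\,dx + \frac{x^2}{2}\,dy = \frac{x^2 y}{2}\Big|_{(0,0)}^{(1,1)} = \frac{1^2\cdot 1}{2} - \frac{0^2\cdot 0}{2} = \frac{1}{2}.
\]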
We now establish the famous principle of conservation of energy. Let us assume that a
continuous force field F acts on an object moving along a path C. The path C is given
\[
W = \int_C F\cdot dr = \int_a^b F(r(t))\cdot r'(t)\,dt
\]
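\begin{align*}
W = \int_C F\cdot dr &= -\int_C \nabla P\cdot dr\\
&= -\int_a^b \Big(\frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}\Big)\,dt\\
&= -\int_a^b \frac{d}{dt}\big(P(x(t), y(t), z(t))\big)\,dt\\
&= -[P(r(b)) - P(r(a))]\\
&= P(A) - P(B).
\end{align*}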
Comparing this equation with the previous expression in terms of kinetic energy we
see that
P(A) + K(A) = P(B) + K(B).
This means that if an object moves from point A to point B under the influence of a
conservative force field, then the sum of its potential energy and its kinetic energy
remains constant. That is called the law of conservation of (mechanical) energy, and it
is why such a vector field is called conservative.
By the fundamental theorem of line integrals, we know that a conservative field
in a region D must also be path-independent in D. Conversely, if a vector field is path-
independent in D, do we know whether it is conservative? If the region D is open and
connected, the answer is yes. We say a region D is connected if for any two points in D
there exists a line/curve that lies entirely in D and connects the two points. We have
the following theorem.
Theorem 4.3.3. If D is an open and connected region, and a continuous vector field F = ⟨f , g⟩ is path-
independent in D, then F is also conservative in D. That is, there exists a function φ(x, y) such that
F = ∇φ.
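\[
\varphi(x,y) = \int_{(a,b)}^{(x,y)} F\cdot dr
\]
is independent of the path that connects $A(a,b)$ and the point $P(x,y)$. We are going to
show that
\[
\nabla\varphi(x,y) = F, \quad\text{or equivalently,}\quad \frac{\partial\varphi}{\partial x} = f \ \text{ and } \ \frac{\partial\varphi}{\partial y} = g.
\]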
Since D is open, there exists a disk centered at (x, y) that lies entirely in D. We choose
a point B(x0 , y) that is also in this disk, and then B(x0 , y) and P(x, y) are the two end-
points of the horizontal line segment BP. As shown in Figure 4.9(b), since D is con-
nected, there is a path C1 in D that connects A and B, and
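\[
\varphi(x,y) = \int_{A(a,b)}^{(x,y)} F\cdot dr = \int_{A(a,b)}^{B(x_0,y)} F\cdot dr + \int_{B(x_0,y)}^{P(x,y)} F\cdot dr.
\]
Note that $\int_{A(a,b)}^{B(x_0,y)} F\cdot dr$ is independent of $x$ and, along the line segment $BP$, $dy = 0$. Therefore,
\[
\varphi(x,y) = \int_{A(a,b)}^{B(x_0,y)} F\cdot dr + \int_{x_0}^{x} f(t,y)\,dt, \quad\text{where we changed the dummy variable to } t,
\]
and
\[
\frac{\partial\varphi}{\partial x} = 0 + f(x,y) = f(x,y).
\]
In a very similar way, with the aid of a vertical line segment, we could prove
\[
\frac{\partial\varphi}{\partial y} = g(x,y).
\]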
Note. If we choose the point A(a, b) such that AP lies in a disk centered at P(x, y) and
lies in D, then by integration along a horizontal line segment followed by a vertical
line segment, we would have a nice formula for finding a potential function,
\[
\varphi(x,y) = \int_a^x f(t,b)\,dt + \int_b^y g(x,t)\,dt.
\]
Now, there are still some questions. For example, how do we know whether a po-
tential function exists? Observing again, if
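\[
F = \langle f, g\rangle = \nabla\varphi = \langle \varphi_x, \varphi_y\rangle,
\]
then
\[
f_y = \varphi_{xy} = \varphi_{yx} = g_x.
\]
This means that if f and g both have continuous partial derivatives, then a necessary
condition for it to be a conservative field is
\[
\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}.
\]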
Remarkably, it turns out this is also a sufficient condition if the region D is simply
connected, and the curve C is simple and closed. This is proved in the next section by
using Green’s theorem. We now adopt this fact and demonstrate how to find a potential
function in two different ways.
Example 4.3.2. Given a vector field F = ⟨xy 2 , x 2 y + y⟩, find a function φ(x, y) such that F = ∇φ =
⟨xy 2 , x 2 y + y⟩.
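\[
\frac{\partial(x^2 y + y)}{\partial x} = 2xy = \frac{\partial(xy^2)}{\partial y}.
\]
\[
\frac{\partial\varphi}{\partial x} = xy^2 \quad\text{and}\quad \frac{\partial\varphi}{\partial y} = x^2 y + y.
\]
\[
\varphi = \int xy^2\,dx = \frac{x^2 y^2}{2} + C(y).
\]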
We need to remember that the integration is with respect to x, so the arbitrary constant
may be a function of y. To determine C(y), we take the partial derivative with respect
to y to obtain
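\[
\frac{\partial\varphi}{\partial y} = \Big(\frac{x^2 y^2}{2} + C(y)\Big)_y = x^2 y + C'(y); \quad\text{but we also have}\quad \frac{\partial\varphi}{\partial y} = x^2 y + y.
\]
Thus,
\[
x^2 y + C'(y) = x^2 y + y.
\]
So, $C'(y) = y$. Then, $C(y) = \frac{y^2}{2} + C$, where $C$ is an arbitrary constant. Now we can
conclude that
\[
\varphi(x,y) = \frac{x^2 y^2}{2} + \frac{y^2}{2} + C.
\]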
Method 2: We can also try either of equation (4.7) or equation (4.8). Then we have
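\begin{align*}
\varphi(x,y) &= \int_a^x f(t,b)\,dt + \int_b^y g(x,t)\,dt\\
&= \int_a^x (t b^2)\,dt + \int_b^y (x^2 t + t)\,dt = \Big(\frac{t^2 b^2}{2}\Big)\Big|_a^x + \Big(\frac{x^2 t^2}{2} + \frac{t^2}{2}\Big)\Big|_b^y\\
&= \frac{x^2 b^2}{2} - \frac{a^2 b^2}{2} + \frac{x^2 y^2}{2} + \frac{y^2}{2} - \frac{x^2 b^2}{2} - \frac{b^2}{2}\\
&= \frac{x^2 y^2}{2} + \frac{y^2}{2} + C.
\end{align*}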
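One way to sanity-check a candidate potential function is to compare its numerical gradient with the vector field. The sketch below does this for $\varphi(x,y) = \frac{x^2y^2}{2} + \frac{y^2}{2}$ (taking the constant $C = 0$) and $F = \langle xy^2, x^2y + y\rangle$; the sample points and the helper `numerical_gradient` are arbitrary choices made for illustration.

```python
def phi(x, y):
    """Candidate potential from Example 4.3.2 with the constant C set to 0."""
    return 0.5 * x * x * y * y + 0.5 * y * y

def F(x, y):
    """The vector field <x y^2, x^2 y + y>."""
    return (x * y * y, x * x * y + y)

def numerical_gradient(func, x, y, h=1e-5):
    """Central-difference approximation of the gradient of func at (x, y)."""
    return ((func(x + h, y) - func(x - h, y)) / (2 * h),
            (func(x, y + h) - func(x, y - h)) / (2 * h))

points = [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5), (2.0, -3.0)]
max_error = 0.0
for x, y in points:
    gx, gy = numerical_gradient(phi, x, y)
    fx, fy = F(x, y)
    max_error = max(max_error, abs(gx - fx), abs(gy - fy))

print(max_error)   # should be close to zero if grad(phi) = F
```

A small `max_error` at a handful of points does not prove $F = \nabla\varphi$, but a large one would immediately expose an algebra slip in either method.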
A simple curve is a curve which does not intersect itself except possibly at its end-
points. The set D ⊂ ℝ2 is connected if for any two points in D there exists a line (curve)
that lies entirely in D and connects the two points; D is called simply connected if for
every simple (i. e., nonself-intersecting) closed curve C composed of points of D, the
region inside of C is also part of D, that is, D has no holes and does not consist of
separate parts. In other words, one can continuously shrink any simple closed curve
to a point while remaining in the domain. Figure 4.10 shows some such curves and
regions. The boundary curve of a region D has a positive orientation if, as you walk
along the boundary, the region D is on your left-hand side. Figure 4.11 shows positive
orientation for two connected regions.
The line integral ∫C F⋅dr along an oriented curve C “adds up” the component of the
vector field that is tangent to the curve C. In this sense, the line integral measures how
much the vector field is aligned with the curve. If the curve C is a closed curve, then the
line integral indicates how much the vector field tends to circulate around the curve C.
We call the line integral the “circulation” of F around C, i. e.,
Note. The symbol “∮C ” is used to indicate a line integral along a closed oriented
curve C.
\[
\oint_C \frac{x\,dy - y\,dx}{x^2 + y^2}
\]
for any circle C with center at the origin and radius R > 0 with counterclockwise orientation.
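Solution. Figure 4.12 shows the vector field and the circle. Note that $F = \langle f, g\rangle$ with
$f = -\frac{y}{x^2+y^2}$, $g = \frac{x}{x^2+y^2}$, and C has a parametric form $r(t) = \langle R\cos t, R\sin t\rangle$. Then
\begin{align*}
\oint_C -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy &= \oint_C \Big\langle -\frac{y}{x^2+y^2}, \frac{x}{x^2+y^2}\Big\rangle \cdot r'(t)\,dt\\
&= \int_0^{2\pi} \Big\langle -\frac{R\sin t}{R^2}, \frac{R\cos t}{R^2}\Big\rangle \cdot \langle -R\sin t, R\cos t\rangle\,dt\\
&= \int_0^{2\pi} 1\,dt = 2\pi.
\end{align*}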
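The fact that this circulation equals $2\pi$ for every radius can be confirmed numerically. The sketch below uses the parameterization $r(t) = \langle R\cos t, R\sin t\rangle$ from the solution; the function name `circulation` and the sample radii are assumptions made for this illustration.

```python
import math

def circulation(R, n=2000):
    """Midpoint-rule approximation of the closed line integral of
    (x dy - y dx) / (x^2 + y^2) around the circle of radius R, counterclockwise."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = R * math.cos(t), R * math.sin(t)
        dx, dy = -R * math.sin(t), R * math.cos(t)
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total * h

values = {R: circulation(R) for R in (0.5, 1.0, 3.0)}
print(values, 2 * math.pi)
```

The nonzero result around a closed curve shows that this field is not path-independent on any region containing a circle around the origin, even though $\partial g/\partial x = \partial f/\partial y$ away from the origin; the punctured plane is not simply connected.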
We first introduce the concept of circulation density. As shown in Figure 4.13, we con-
sider the counterclockwise circulation along the rectangle. We start with writing the
following:
f (x, y − Δy) × 2Δx − f (x, y + Δy) × 2Δx + g(x + Δx, y) × 2Δy − g(x − Δx, y) × 2Δy,
which says that the integration of the rate of change of a function is related to the value
of the function at the boundary points of the integration interval. There is indeed a
similar relationship between the integration of rate of change of some functions on
the region that is bounded by a closed curve and a line integral along the curve. The
remarkable Green’s theorem is, therefore, sometimes called the fundamental theorem
of calculus for double integrals, as it reveals the relationship between a double integral
over the planar region D and a line integral on the boundary of D.
With the ideas of circulation integral and circulation density, we now state the
great Green’s theorem and give a partial proof of it.
Theorem 4.4.1 (Green’s theorem: circulation or tangential form). Let $L$ be a piecewise smooth simple
closed curve in $\mathbb{R}^2$ having positive orientation (the interior is on the left as you travel around $L$), and
let the region $D$ inside of $L$ be simply connected. Let $\vec{F} = f\,\vec{i} + g\,\vec{j}$ be a vector field for which $f$ and $g$ have
continuous partial derivatives in a region containing $L$ and $D$. Then
\[
\oint_L f(x,y)\,dx + g(x,y)\,dy = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma. \tag{4.11}
\]
Example 4.4.2. For the vector field F = ⟨−y, x⟩ and the closed curve x 2 +y 2 = 1, evaluate both integrals
in Green’s theorem and check that they are equal.
2π
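\[
\iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma = \iint_D \Big(\frac{\partial x}{\partial x} - \frac{\partial(-y)}{\partial y}\Big)\,d\sigma
\]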
Therefore, the two integrals in Green’s theorem are equal in this example.
where L1 : y = ϕ1 (x) and L2 : y = ϕ2 (x), for a ≤ x ≤ b, are two curves comprising L (see
Figure 4.16(a)).
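Since $\frac{\partial f}{\partial y}$ is continuous, by applying the integration formula for double integrals, we
have
\[
\iint_D \frac{\partial f}{\partial y}\,d\sigma = \int_a^b \Big\{ \int_{\phi_1(x)}^{\phi_2(x)} \frac{\partial f(x,y)}{\partial y}\,dy \Big\}\,dx
\]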
So
\[
-\iint_D \frac{\partial f}{\partial y}\,d\sigma = \oint_L f\,dx.
\]
Since D is also a type II region of form D = {(x, y)|ψ1 (y) ≤ x ≤ ψ2 (y), c ≤ y ≤ d}, a
similar argument leads to a proof that
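\[
\iint_D \frac{\partial g}{\partial x}\,d\sigma = \oint_C g\,dy.
\]
\[
\oint_C f\,dx + g\,dy = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma.
\]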
Note. If a simply connected region is not both type I and type II, then we can partition
it into several subregions which are both type I and type II. Green’s theorem can then
still be proved.
\[
\oint_C f\,dx + g\,dy = 0.
\]
4.4 Green’s theorem: circulation-curl form | 223
Theorem 4.4.2. Let F = ⟨f , g⟩ be a continuous vector field defined in a simply connected region R ⊂ ℝ2
and f and g both have continuous partial derivatives. Then we have
F = ⟨f , g⟩ is conservative
⇐⇒ there is a function φ such that F = ∇φ or dφ = fdx + gdy
So $\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}$ is a simple criterion to determine whether or not a vector field is conservative.
In this case, we can use the method introduced in Example 4.3.2 to find a potential
function. We can also find a potential function for the vector field by the method
introduced in the following example.
Solution. The graph of the vector field is shown in Figure 4.17(a). We now use the
simple criterion derived in the proof of Green’s theorem to determine whether it is
conservative or not.
(1) Since F has continuous partial derivatives in ℝ2 and
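\[
\frac{\partial g}{\partial x} = \frac{\partial(2xy)}{\partial x} = 2y = \frac{\partial(e^x + y^2)}{\partial y} = \frac{\partial f}{\partial y},
\]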
this field is conservative.
(2) Since the field is conservative, the line integral $\int_C F\cdot dr$ is path-independent. Therefore,
we will not use the original path, where it is hard to evaluate the integral, and,
instead, we try the line segment from $(0,0)$ to $(\sqrt{\pi}, 0)$, as shown in Figure 4.17(b).
Note that along this new path, $y = 0$. We have
\[
\int_C F\cdot dr = \int_C (e^x + y^2)\,dx + 2xy\,dy = \int_0^{\sqrt{\pi}} (e^x + 0^2)\,dx + 0 = e^x\Big|_{x=0}^{x=\sqrt{\pi}} = e^{\sqrt{\pi}} - 1.
\]
(3) We could have used equation (4.7) or equation (4.8). To enhance our understanding,
we use the idea of path independence again. Assume we are going to evaluate
the line integral from $(0,0)$ to any point, say, $(u,v)$. Since the line integral is
path-independent in this field, if $\varphi(x,y)$ is a potential function of $F$, then
$\int_{(0,0)}^{(u,v)} (e^x + y^2)\,dx + 2xy\,dy = \varphi(u,v) - \varphi(0,0)$. We now choose the line segments from $(0,0)$ to
$(u,0)$ and from $(u,0)$ to $(u,v)$. Note that on the first line segment $y = 0$ and along
the second line segment $x = u$ and $dx = 0$. Then
\begin{align*}
\int_{(0,0)}^{(u,v)} (e^x + y^2)\,dx + 2xy\,dy &= \int_{(0,0)}^{(u,0)} (e^x + y^2)\,dx + 2xy\,dy + \int_{(u,0)}^{(u,v)} (e^x + y^2)\,dx + 2xy\,dy\\
&= \int_0^u e^x\,dx + 2u\int_0^v y\,dy\\
&= e^u - 1 + u(v^2 - 0^2)\\
&= e^u + uv^2 - 1.
\end{align*}
Since (u, v) is any point in D, we have a potential function φ(x, y) = ex + xy2 (note
again that any two potential functions could differ by a constant).
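The path independence used in parts (2) and (3) can be checked numerically. The following is a minimal sketch, not part of the original solution: the curved path y = sin(πt) is a hypothetical choice made only for illustration, and the helper name `line_integral` is our own.

```python
import math

def f(x, y):
    return math.exp(x) + y * y   # first component of F

def g(x, y):
    return 2.0 * x * y           # second component of F

def line_integral(path, dpath, n=20000):
    """Midpoint-rule value of the integral of f dx + g dy along path(t), 0 <= t <= 1."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = path(t)
        dx, dy = dpath(t)
        total += (f(x, y) * dx + g(x, y) * dy) * h
    return total

a = math.sqrt(math.pi)

# Straight segment from (0, 0) to (sqrt(pi), 0), as in part (2).
straight = line_integral(lambda t: (a * t, 0.0), lambda t: (a, 0.0))

# Hypothetical curved path with the same endpoints (an assumption for illustration).
curved = line_integral(
    lambda t: (a * t, math.sin(math.pi * t)),
    lambda t: (a, math.pi * math.cos(math.pi * t)),
)

def phi(x, y):
    return math.exp(x) + x * y * y   # potential function from part (3)

exact = phi(a, 0.0) - phi(0.0, 0.0)  # e^{sqrt(pi)} - 1
print(straight, curved, exact)
```

All three printed numbers should agree up to the discretization error of the midpoint rule.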
Note. If we switch the letters (u, v) and (x, y), then the equation
\[ \varphi(x, y) = \int_0^x f(u, 0)\,du + \int_0^y g(x, v)\,dv \]
gives a potential function of a conservative field F = ⟨f , g⟩. Of course, the initial point is not necessarily (0, 0); it could be any qualifying point, say, (a, b), as shown in the corresponding figure. Then the equation
\[ \varphi(x, y) = \int_a^x f(u, b)\,du + \int_b^y g(x, v)\,dv \]
still gives a potential function of the conservative field F = ⟨f , g⟩. These are the same as equation (4.7) and equation (4.8).
Example 4.4.4. Evaluate $\int_C (x^2 - 2y)\,dx + (3xy + ye^y)\,dy$, where C is the closed triangular curve consisting of the line segments from (0, 0) to (1, 0), from (1, 0) to (0, 1), and from (0, 1) to (0, 0).
Solution. The graph of the vector field and the triangular curve are shown in Figure 4.18. The given line integral could be evaluated directly by integrating along each line segment. But instead we use Green’s theorem. Note that the region D enclosed by C is simply connected, and C has positive orientation. If we let $f(x, y) = x^2 - 2y$ and $g(x, y) = 3xy + ye^y$, then we have
\[ \begin{aligned} \int_C (x^2 - 2y)\,dx + (3xy + ye^y)\,dy &= \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma = \iint_D (3y - (-2))\,d\sigma \\ &= \int_0^1 \int_0^{1-x} (3y + 2)\,dy\,dx = \frac{3}{2}. \end{aligned} \]
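As a cross-check on Green’s theorem, the same line integral can be approximated directly along the three sides of the triangle. This is only an illustrative sketch; the helper name `segment_integral` and the midpoint rule are our own choices, not the book’s method.

```python
import math

def f(x, y):
    return x * x - 2.0 * y

def g(x, y):
    return 3.0 * x * y + y * math.exp(y)

def segment_integral(p, q, n=20000):
    """Midpoint-rule value of the integral of f dx + g dy along the segment p -> q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = p[0] + t * dx, p[1] + t * dy
        total += (f(x, y) * dx + g(x, y) * dy) * h
    return total

# Vertices in positive (counterclockwise) order, returning to the start.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
circulation = sum(segment_integral(vertices[i], vertices[i + 1]) for i in range(3))
print(circulation)  # should be close to 3/2, the double integral of 3y + 2 over D
```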
Example 4.4.5. Use Green’s theorem to find $\int_C (x^2 - y)\,dx - (x + \sin^2 y)\,dy$, where C is the arc of the circle $y = \sqrt{2x - x^2}$ from the point O(0, 0) to A(1, 1).
Solution. The graph of the vector field and the curve C are shown in Figure 4.19. Let $f = x^2 - y$ and $g = -(x + \sin^2 y)$. Note that $\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}$ for all x and y, so the line integral is path-independent. We choose another path, O to B, and then B to A, where B is the point (1, 0).
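We compute the line integral over the two line segments separately, i. e.,
\[ \int_{\overrightarrow{BA}} (x^2 - y)\,dx - (x + \sin^2 y)\,dy = -\int_0^1 (1 + \sin^2 y)\,dy = \frac{1}{4}\sin 2 - \frac{3}{2} \]
and
\[ \int_{\overrightarrow{OB}} (x^2 - y)\,dx - (x + \sin^2 y)\,dy = \int_0^1 x^2\,dx = \frac{1}{3}. \]
Hence,
\[ \int_C (x^2 - y)\,dx - (x + \sin^2 y)\,dy = \frac{1}{4}\sin 2 - \frac{3}{2} + \frac{1}{3} = \frac{1}{4}\sin 2 - \frac{7}{6}. \]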
4.4 Green’s theorem: circulation-curl form | 227
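when f = 0, g = x:
\[ \iint_D 1\,d\sigma = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma = \oint_L f\,dx + g\,dy = \oint_L x\,dy, \tag{4.13} \]
when f = −y, g = 0:
\[ \iint_D 1\,d\sigma = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma = \oint_L f\,dx + g\,dy = -\oint_L y\,dx, \tag{4.14} \]
when $f = -\frac{y}{2}$, $g = \frac{x}{2}$:
\[ \iint_D 1\,d\sigma = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma = \oint_L f\,dx + g\,dy = \frac{1}{2}\oint_L x\,dy - y\,dx. \tag{4.15} \]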
Example 4.4.6. Compute, using Green’s theorem, the area of the region D ⊂ ℝ2 that lies above the parabola $y = x^2$ and below y = 4.
Solution. Using the first of these formulas for finding areas and choosing a parameterization of the parabola as $x = t$, $y = t^2$, −2 ≤ t ≤ 2, the area is
\[ \oint_L x\,dy = \int_{-2}^{2} x\,\frac{dy}{dt}\,dt = \int_{-2}^{2} t\cdot 2t\,dt = \frac{32}{3}. \]
Note. As shown in Figure 4.20, the line integral should have included the flat top y = 4
of the region, but dy = 0 on this line segment, so it adds nothing to the area.
Example 4.4.7. Find the area of the region D inside the ellipse L given parametrically by x = a cos t,
y = b sin t, 0 ≤ t ≤ 2π.
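Solution. Using one of the equations (4.13)–(4.15) for the area A, we obtain
\[ \begin{aligned} A(D) = \iint_D 1\,d\sigma &= \frac{1}{2}\oint_L x\,dy - y\,dx \\ &= \frac{1}{2}\int_0^{2\pi} (a\cos t)(b\cos t)\,dt - (b\sin t)(-a\sin t)\,dt \\ &= \frac{ab}{2}\int_0^{2\pi} dt = \pi ab. \end{aligned} \]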
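The three area formulas (4.13)–(4.15) can be compared numerically on the ellipse. The sketch below is illustrative only; the semi-axes a = 3, b = 2 and the function name `ellipse_areas` are assumptions made for the example.

```python
import math

def ellipse_areas(a, b, n=2000):
    """Approximate the area of the ellipse x = a cos t, y = b sin t with (4.13), (4.14), (4.15)."""
    h = 2.0 * math.pi / n
    area_13 = area_14 = area_15 = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = a * math.cos(t), b * math.sin(t)
        dx, dy = -a * math.sin(t) * h, b * math.cos(t) * h
        area_13 += x * dy                     # closed integral of x dy
        area_14 += -y * dx                    # minus the closed integral of y dx
        area_15 += 0.5 * (x * dy - y * dx)    # half the closed integral of x dy - y dx
    return area_13, area_14, area_15

areas = ellipse_areas(3.0, 2.0)
print(areas, math.pi * 3.0 * 2.0)
```

Each of the three values should agree with πab up to rounding.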
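Evaluate
\[ \oint_C \frac{x\,dy - y\,dx}{x^2 + y^2} \]
for any circle C with center at the origin and radius R > 0 and oriented counterclockwise.
Solution. Note that F = ⟨f , g⟩ with $f = -\frac{y}{x^2 + y^2}$ and $g = \frac{x}{x^2 + y^2}$. We compute
\[ \frac{\partial f}{\partial y} = \Big(-\frac{y}{x^2 + y^2}\Big)_y = -\frac{(y)_y (x^2 + y^2) - (y)(x^2 + y^2)_y}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}, \]
\[ \frac{\partial g}{\partial x} = \Big(\frac{x}{x^2 + y^2}\Big)_x = \frac{(x)_x (x^2 + y^2) - (x)(x^2 + y^2)_x}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}. \]
Therefore,
\[ \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}. \]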
Applying Green’s theorem, the line integral would be 0! However, as seen in Exam-
ple 4.4.1, the value is not 0! Why is this? The answer is that F is undefined at (0, 0),
so F is not continuous on any region containing (0, 0) (in this case, the region is not
simply connected).
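A direct numerical evaluation makes the nonzero value concrete. The sketch below (the function name `circulation` and the sample radii are our own choices) approximates the line integral on circles of several radii centered at the origin.

```python
import math

def circulation(R, n=1000):
    """Approximate the closed integral of (x dy - y dx)/(x^2 + y^2) on the circle of radius R."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = R * math.cos(t), R * math.sin(t)
        dx, dy = -R * math.sin(t) * h, R * math.cos(t) * h
        total += (x * dy - y * dx) / (x * x + y * y)
    return total

values = [circulation(R) for R in (0.1, 1.0, 10.0)]
print(values, 2.0 * math.pi)
```

Every radius gives a value close to 2π rather than 0, consistent with the fact that Green’s theorem does not apply on a region containing the origin.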
Theorem 4.4.3 (Generalized Green’s theorem). Suppose the region D ⊂ ℝ2 lies between two simple, closed, piecewise smooth curves L1 and L2 , where L1 is completely contained within L2 (the curves can intersect but not cross each other). Let f (x, y) and g(x, y) be defined on D and have continuous first partial derivatives on D. If L2 and L1 have positive orientation, then
\[ \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,dx\,dy = \int_{L_1} (f\,dx + g\,dy) + \int_{L_2} (f\,dx + g\,dy). \tag{4.16} \]
Proof. A proof can be developed by combining L1 and L2 into one positively oriented
curve L that follows L2 , then follows a line segment S joining L2 to L1 , continuing along
L1 , and finally returns to L2 over the line segment S but in the opposite direction. Ap-
plying Green’s theorem to this curve L will give the generalized result. Figure 4.21 il-
lustrates this idea.
Figure 4.21: Generalized Green’s theorem, the region is not simply connected.
Show that
\[ \oint_C \frac{x\,dy - y\,dx}{x^2 + y^2} = 2\pi \]
for every positively oriented simple closed curve C that encloses the origin. Note that $f = -\frac{y}{x^2 + y^2}$ and $g = \frac{x}{x^2 + y^2}$ are not defined at the origin. So, Green’s theorem cannot be applied directly.
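Let C′ be a small circle centered at the origin that lies inside C, oriented clockwise, and let D be the region between C and C′. By the generalized Green’s theorem,
\[ \begin{aligned} \oint_C f\,dx + g\,dy + \oint_{C'} f\,dx + g\,dy &= \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,d\sigma \\ &= \iint_D \Big(\frac{y^2 - x^2}{(x^2 + y^2)^2} - \frac{y^2 - x^2}{(x^2 + y^2)^2}\Big)\,d\sigma \\ &= 0. \end{aligned} \]
Figure 4.22: Generalized Green’s theorem, the region is not simply connected, Example 4.4.9.
Therefore,
\[ \oint_C f\,dx + g\,dy = -\oint_{C'} f\,dx + g\,dy = \oint_{-C'} f\,dx + g\,dy = \int_0^{2\pi} dt = 2\pi, \]
where −C′ is the same circle traversed counterclockwise, on which $x = \varepsilon\cos t$, $y = \varepsilon\sin t$ gives $\frac{x\,dy - y\,dx}{x^2 + y^2} = dt$.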
where L is a piecewise smooth, simple closed curve that does not contain (0, 0) on L or in its interior.
4.5 Green’s theorem: flux-divergence form | 231
Solution. From the previous example the value is 2π when the origin is inside of L. If
the origin is not inside of L or on L, then the answer is 0 by applying Green’s theorem.
Suppose instead we are interested in how much of the vector field points outward across the given simple closed curve (in the normal direction). If the vector field models fluid flow, then the question is equivalent to finding the rate (mass per time) at which the fluid is flowing out of the region through the closed curve. This requires us to take the component of the vector field along the outward unit normal vector N at each point. Then we have
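But how do we find N? Well, as shown in Figure 4.23(b) we can define the outward unit normal vector
\[ \begin{aligned} \mathbf{N} = \mathbf{T}\times\mathbf{k} &= \frac{1}{|\mathbf{r}'(t)|}\frac{d\mathbf{r}}{dt}\times\mathbf{k} = \frac{1}{|\mathbf{r}'(t)|}\Big\langle \frac{dx}{dt}, \frac{dy}{dt}\Big\rangle\times\mathbf{k} \\ &= \frac{1}{|\mathbf{r}'(t)|}\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{dx}{dt} & \frac{dy}{dt} & 0 \\ 0 & 0 & 1 \end{vmatrix} \\ &= \frac{1}{|\mathbf{r}'(t)|}\Big(\frac{dy}{dt}\mathbf{i} - \frac{dx}{dt}\mathbf{j}\Big) = \frac{1}{|\mathbf{r}'(t)|}\Big\langle \frac{dy}{dt}, -\frac{dx}{dt}\Big\rangle. \end{aligned} \]
So
\[ \int_C (\mathbf{F}\cdot\mathbf{N})\,ds = \int_C \langle f, g\rangle\cdot\frac{1}{|\mathbf{r}'(t)|}\Big\langle \frac{dy}{dt}, -\frac{dx}{dt}\Big\rangle\,ds = \int_C \langle f, g\rangle\cdot\Big\langle \frac{dy}{dt}, -\frac{dx}{dt}\Big\rangle\,dt. \]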
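Example 4.5.1. Find the flux of the velocity flow v(x, y) = x i + xy j (cm/s) in two dimensions out of the circular region $x^2 + y^2 \le 4$, with boundary circle C, for a fluid with constant density δ g/cm2 .
Solution. Choose the parametric form for C: r(t) = 2 cos t i + 2 sin t j, 0 ≤ t ≤ 2π. The flux is
\[ \int_C \delta(\mathbf{v}\cdot\mathbf{N})\,ds = \delta\int_0^{2\pi} \langle 2\cos t, 4\cos t\sin t\rangle\cdot\langle 2\cos t, 2\sin t\rangle\,dt = \delta\int_0^{2\pi} (4\cos^2 t + 8\cos t\sin^2 t)\,dt = 4\pi\delta \ \text{(g/s)}. \]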
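The flux formula $\int_C \langle f, g\rangle\cdot\langle dy/dt, -dx/dt\rangle\,dt$ lends itself to a short numerical routine. The following is a sketch under stated assumptions: the function name `flux` and its signature are hypothetical, and the density δ is left out, so the result is the flux per unit density for the field and circle of Example 4.5.1.

```python
import math

def flux(F, r, dr, t0, t1, n=4000):
    """Midpoint-rule value of the integral of f dy - g dx along r(t), t0 <= t <= t1."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        x, y = r(t)
        dx, dy = dr(t)
        f, g = F(x, y)
        total += (f * dy - g * dx) * h
    return total

outflow = flux(
    lambda x, y: (x, x * y),                               # v = <x, xy>
    lambda t: (2.0 * math.cos(t), 2.0 * math.sin(t)),      # circle of radius 2
    lambda t: (-2.0 * math.sin(t), 2.0 * math.cos(t)),     # its derivative
    0.0,
    2.0 * math.pi,
)
print(outflow, 4.0 * math.pi)
```

Multiplying the result by a constant density δ gives the mass flow rate out of the region.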
Similar to the density of circulation, we can define the density of flux, which is also
called the divergence. The box shown in Figure 4.24 has length 2Δx and width 2Δy.
Then, the flux crossing the
bottom is F ⋅ (−j)2Δx= − g(x, y − Δy)2Δx,
top is (F ⋅ j)2Δx=g(x, y + Δy)2Δx,
Div(F) = ∇ ⋅ F.
With the ideas of the flux integral and the flux density (divergence), we can expect
that the flux integral along a simple closed curve C is equal to the double integral of
the flux density over the simply connected region enclosed by C. This is indeed true.
We now state Green’s theorem in the divergence-flux form.
Theorem 4.5.1 (Green’s theorem: flux-divergence form). Let L be a piecewise smooth simple closed curve in ℝ2 which has positive orientation (the interior is on the left as you travel round L), and the region D inside of L is simply connected. Let F = f i + g j be a vector field for which f and g have continuous partial derivatives in a region containing L and D. Then
\[ \oint_L f(x, y)\,dy - g(x, y)\,dx = \iint_D \Big(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\Big)\,d\sigma. \tag{4.19} \]
Proof. This is proved by using Green’s theorem in the circulation-curl form. The flux
integral
can be rearranged as
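Compared with the circulation integral, this might be confusing because f and g are in different positions now. It may be helpful to memorize Green’s theorem by writing it as
\[ \oint_L \clubsuit\,dx + \heartsuit\,dy = \iint_D \Big(\frac{\partial \heartsuit}{\partial x} - \frac{\partial \clubsuit}{\partial y}\Big)\,d\sigma. \tag{4.23} \]
Applying Green’s theorem in the circulation-curl form to the flux integral, we obtain the flux-divergence form of Green’s theorem
\[ \oint_L f\,dy - g\,dx = \oint_L -g\,dx + f\,dy = \iint_D \Big(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\Big)\,d\sigma. \tag{4.24} \]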
Note. Recall that the flux through C measures how much of the vector field “flows” out of the closed curve. If the net flux is positive, there must be some “source” in D. The term $\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$ is the divergence of the vector field F in D. If the divergence is positive at a point P, then P is a source. If the divergence is negative at a point P, then P is a sink.
Example 4.5.2. The graph of the vector field $\mathbf{F} = \langle x^2 - y^2, x - 3y\rangle$ and the curve $x^2 + y^2 = 1$ are shown in Figure 4.25(a).
1. Make a guess whether the flux along C is positive or negative.
2. Evaluate the two integrals in Green’s theorem in flux-divergence form. Check to see if they are
equal.
Solution.
1. From Figure 4.25(a), it looks like more vector field goes into the circle than comes
out of the circle, so the flux might be negative.
2. We compute the flux integral
(a) (b)
2π
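Both integrals in the flux-divergence form can be approximated numerically for this field, which gives an independent check on the guess in part 1. The sketch below is illustrative only; the function names and grid sizes are our own choices, and the double integral uses polar coordinates with area element r dr dθ.

```python
import math

def f(x, y):
    return x * x - y * y

def g(x, y):
    return x - 3.0 * y

def boundary_flux(n=2000):
    """Closed integral of f dy - g dx around the unit circle, counterclockwise."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * h, math.cos(t) * h
        total += f(x, y) * dy - g(x, y) * dx
    return total

def divergence_integral(nr=400, nt=400):
    """Double integral of div F = 2x - 3 over the unit disk."""
    hr, ht = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            x = r * math.cos((j + 0.5) * ht)
            total += (2.0 * x - 3.0) * r * hr * ht
    return total

line_value = boundary_flux()
area_value = divergence_integral()
print(line_value, area_value, -3.0 * math.pi)
```

Both values should be close to −3π, so the net flux is negative.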
If the divergence of a vector field is 0 at every point in a region D, then the field is called source-free. A point with positive divergence is called a source, and a point with negative divergence is called a sink. For a source-free vector field F = ⟨f , g⟩, we define the stream function ψ, which satisfies
\[ \frac{\partial \psi}{\partial y} = f \quad\text{and}\quad \frac{\partial \psi}{\partial x} = -g. \]
Since $f_x + g_y = 0$, or equivalently $f_x = -g_y$, we have
\[ \frac{\partial^2 \psi}{\partial y\,\partial x} = \frac{\partial^2 \psi}{\partial x\,\partial y} \quad\text{or}\quad \psi_{yx} = \psi_{xy}. \]
Similar to a conservative field, for a source-free field, under suitable conditions, we have the following results:
\[ \mathbf{F} = \langle f, g\rangle \text{ is source-free} \iff \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = 0 \iff \int_C (\mathbf{F}\cdot\mathbf{N})\,ds \text{ is path-independent.} \]
Example 4.6.1. Compute the divergence of the vector field F = ⟨−y, x⟩. Does it have a stream function?
If so, find one.
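Solution. The graph of the vector field is shown in Figure 4.25(b). Since f = −y and g = x,
\[ \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = \frac{\partial (-y)}{\partial x} + \frac{\partial (x)}{\partial y} = 0, \]
so it is a source-free field. Thus, there exists a stream function ψ. Since
\[ \frac{\partial \psi}{\partial y} = f = -y, \quad\text{we have}\quad \psi = \int (-y)\,dy = -\frac{y^2}{2} + C(x). \]
To find C(x), we differentiate with respect to x to obtain
\[ \frac{\partial \psi}{\partial x} = \Big(-\frac{y^2}{2} + C(x)\Big)_x = C'(x), \quad\text{but}\quad \frac{\partial \psi}{\partial x} = -g = -x. \]
So, $C'(x) = -x$. Thus, $C(x) = -\frac{x^2}{2} + C$. A family of stream functions for this vector field is
\[ \psi(x, y) = -\frac{y^2}{2} - \frac{x^2}{2} + C. \]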
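The defining relations $\psi_y = f$ and $\psi_x = -g$ can be verified with central differences. This is a minimal sketch; the sample points and the step size are arbitrary choices, and the constant C is taken to be 0.

```python
def psi(x, y):
    return -0.5 * y * y - 0.5 * x * x   # stream function with C = 0

def f(x, y):
    return -y

def g(x, y):
    return x

def check(x, y, h=1e-5):
    """Return the errors |psi_y - f| and |psi_x + g| using central differences."""
    psi_x = (psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    psi_y = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    return abs(psi_y - f(x, y)), abs(psi_x + g(x, y))

errors = [check(x, y) for (x, y) in [(0.3, -1.2), (2.0, 0.5), (-1.5, -0.7)]]
print(errors)
```

Both errors should be at the level of floating-point rounding at every sample point.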
4.7 Surface integral with respect to surface area | 237
Definition 4.7.1. Let S be a bounded surface in space and let f (x, y, z) be a bounded function defined on S. Subdivide S into n patches (small subregions) {Sk }, and arbitrarily choose a point (xk , yk , zk ) ∈ Sk in each patch. Form the Riemann sum
\[ \sum_{k=1}^{n} f(x_k, y_k, z_k)\,\Delta S_k, \]
where ΔSk is the area of subregion Sk . If the limit of this Riemann sum exists as n → ∞ and max |ΔSk | → 0 for all possible subdivisions and choices of points (xk , yk , zk ), then this value is the surface integral of f (x, y, z) over the surface S, written
\[ \iint_S f\,dS = \iint_S f(x, y, z)\,dS = \lim_{\max|\Delta S_k|\to 0,\ n\to\infty} \sum_{k=1}^{n} f(x_k, y_k, z_k)\,\Delta S_k. \tag{4.25} \]
Example 4.7.1. Suppose S is the surface of the unit ball. Evaluate the surface integral
\[ = -3\times 4\pi\cdot 1^2 = -12\pi. \]
and this is now a standard Riemann sum for a double integral, resulting in the final
formula
\[ \iint_S f(x, y, z)\,dS = \iint_{D_{uv}} f(x(u, v), y(u, v), z(u, v))\,|\mathbf{r}_u\times\mathbf{r}_v|\,du\,dv. \tag{4.26} \]
Example 4.7.2. Find $\iint_S \frac{1}{z}\,dS$, where S is the part of the sphere $x^2 + y^2 + z^2 = a^2$ that lies above the plane z = h and h is a constant satisfying 0 < h ≤ a.
\[ \mathbf{r}_u\times\mathbf{r}_v = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a\cos u\cos v & a\cos u\sin v & -a\sin u \\ -a\sin u\sin v & a\sin u\cos v & 0 \end{vmatrix} = (a^2\sin^2 u\cos v)\mathbf{i} + (a^2\sin^2 u\sin v)\mathbf{j} + (a^2\sin u\cos u)\mathbf{k}, \]
and $|\mathbf{r}_u\times\mathbf{r}_v| = \sqrt{(a^2\sin^2 u\cos v)^2 + (a^2\sin^2 u\sin v)^2 + (a^2\sin u\cos u)^2} = a^2\sin u$. Thus,
\[ \begin{aligned} \iint_S \frac{1}{z}\,dS &= \iint_{D_{uv}} \frac{1}{a\cos u}\,a^2\sin u\,du\,dv = a\int_0^{2\pi} dv\int_0^{\cos^{-1}\frac{h}{a}} \frac{\sin u}{\cos u}\,du \\ &= 2\pi a\times(-\ln|\cos u|)\Big|_0^{\cos^{-1}\frac{h}{a}} = -2\pi a\ln\frac{h}{a} = 2\pi a\ln\frac{a}{h}. \end{aligned} \]
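Formula (4.26) can be tested numerically on this example. The sketch below assumes the spherical parameterization consistent with the cross product above, r(u, v) = ⟨a sin u cos v, a sin u sin v, a cos u⟩ with 0 ≤ u ≤ cos⁻¹(h/a), and uses the sample values a = 2, h = 1, which are our own choices.

```python
import math

def cap_integral(a, h, nu=4000):
    """Approximate the surface integral of 1/z over the cap z >= h of the sphere of radius a."""
    u_max = math.acos(h / a)
    du = u_max / nu
    total = 0.0
    for i in range(nu):
        u = (i + 0.5) * du
        # f(r(u, v)) |r_u x r_v| = (1 / (a cos u)) * a^2 sin u, independent of v
        total += (a * a * math.sin(u)) / (a * math.cos(u)) * du
    return 2.0 * math.pi * total   # the v-integral contributes a factor 2*pi

approx = cap_integral(2.0, 1.0)
exact = 2.0 * math.pi * 2.0 * math.log(2.0 / 1.0)   # 2 pi a ln(a/h)
print(approx, exact)
```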
In fact, we can use all the ways that we have developed for the surface area element
dS in Chapter 3 to evaluate a surface integral, so
\[ \iint_S f(x, y, z)\,dS = \iint_{D_{xz}} f(x, y(x, z), z)\sqrt{1 + y_x^2 + y_z^2}\,dx\,dz. \tag{4.29} \]
Example 4.7.3. Let S be the closed surface formed by S1 , the portion of the cone with equation $z = \sqrt{x^2 + y^2}$ that lies below the plane z = 1, and S2 , the circular top of the cone given by z = 1, $x^2 + y^2 \le 1$. Let f be defined on S by $f(x, y, z) = x^2 + y^2$. Compute the area of S and evaluate $\iint_S f(x, y, z)\,dS$.
Solution. Figure 4.28 shows the cone and its circular top. The area of S does not need
any integration, since standard formulas give the lateral surface area of the cone as
π √2 and the area of the disk as π, so S has area π(√2 + 1). We compute the integral of
f by evaluating two surface integrals, i. e.,
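For the first integral, the projection of S1 onto the xy-plane is $D: x^2 + y^2 \le 1$, and since $z = \sqrt{x^2 + y^2}$, the surface integral becomes
\[ \begin{aligned} \iint_{S_1} f(x, y, z)\,dS &= \iint_D (x^2 + y^2)\sqrt{1 + \Big(\frac{\partial z}{\partial x}\Big)^2 + \Big(\frac{\partial z}{\partial y}\Big)^2}\,d\sigma \\ &= \iint_D (x^2 + y^2)\sqrt{1 + \Big(\frac{x}{\sqrt{x^2 + y^2}}\Big)^2 + \Big(\frac{y}{\sqrt{x^2 + y^2}}\Big)^2}\,d\sigma \\ &= \iint_D (x^2 + y^2)\sqrt{2}\,d\sigma. \end{aligned} \]
For the second integral the surface S2 is z = 1, so $z_x = z_y = 0$, and the domain D is the same as that of the first integral. Hence,
\[ \begin{aligned} \iint_{S_2} f(x, y, z)\,dS &= \iint_D (x^2 + y^2)\sqrt{1 + \Big(\frac{\partial z}{\partial x}\Big)^2 + \Big(\frac{\partial z}{\partial y}\Big)^2}\,d\sigma \\ &= \iint_D (x^2 + y^2)\,d\sigma = \frac{\pi}{2}. \end{aligned} \]
The value of this integral is obtained easily because it has exactly the same integrand as the first integral above except for a factor of $\sqrt{2}$. Hence, the complete integral of f is computed as
\[ \iint_{S_1} (x^2 + y^2)\,dS + \iint_{S_2} (x^2 + y^2)\,dS = \frac{\sqrt{2}\,\pi}{2} + \frac{\pi}{2} = \frac{\sqrt{2} + 1}{2}\pi. \]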
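The two pieces of this surface integral differ only by the factor $\sqrt{1 + z_x^2 + z_y^2}$, which a short computation in polar coordinates makes explicit. This is an illustrative sketch; the function name `polar_integral` and the grid size are our own choices.

```python
import math

def polar_integral(weight, nr=2000):
    """Integrate (x^2 + y^2) * weight over the unit disk; in polar form the integrand is r^2 * weight * r."""
    hr = 1.0 / nr
    radial = sum(((i + 0.5) * hr) ** 3 * hr for i in range(nr))
    return 2.0 * math.pi * weight * radial   # the integrand does not depend on the angle

cone_part = polar_integral(math.sqrt(2.0))   # S1: sqrt(1 + z_x^2 + z_y^2) = sqrt(2)
disk_part = polar_integral(1.0)              # S2: z = 1, so the factor is 1
total = cone_part + disk_part
exact = (math.sqrt(2.0) + 1.0) * math.pi / 2.0
print(cone_part, disk_part, total, exact)
```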
We have seen the line integral of a two-dimensional vector field, both for finding cir-
culation and for finding flux. Now we discuss flux for a three-dimensional vector field.
As was the case for flux in two dimensions, we first need to orient a surface so that we
know which direction we are talking about. The surfaces we have encountered so far
have two sides, as shown in Figure 4.29. Such surfaces are orientable. However, some
surfaces are one-sided and are not orientable. For example, the Moebius strip is a one-
sided surface (Figure 4.30(a)). It is formed by taking a long and narrow strip of paper
and joining the ends together after giving a half-twist to one end. Any point on the
Moebius strip can be joined to any other point by a path that stays on the surface of
the paper (the path does not go near the edge), showing that it really is one-sided. Consequently, on a Moebius strip it is not possible to define a unit normal vector, perpendicular to the surface, that changes continuously along every curve. For example, suppose the unit normal N at any point is defined to be in the direction away from the paper. Following a path from a point P with unit normal N around to the point P′ on the other side of the paper from P will give a normal in exactly the opposite direction to N. This violates the continuity of the normal, since P and P′ are essentially the same point. Another example of a nonorientable surface is the famous Klein bottle (Figure 4.30(b)). In the sequel, we only consider orientable surfaces, which we can orient either upward or downward, outward or inward, leftward or rightward, and so forth.
We now give the definition of an orientable surface.
Definition 4.8.1 (Orientable surface). Let S be a surface in ℝ3 . If at each point (x, y, z) of S we can assign a unit normal N = N(x, y, z) that changes continuously along any curve on S, then we say that S is an orientable surface. Once N is defined for a surface S, the function N defines an orientation on S, and S is said to be oriented.
If F(x, y, z) is a vector field (think of it as the flow of a fluid at each point in space
measured in mass/time) and N is the unit normal of an orientable surface, then at
each point of the surface, F ⋅ N is the component of F perpendicular to S, as shown in
Figure 4.31. Hence, we have the following definition of the flux.
4.8 Surface integrals of vector fields | 243
Definition 4.8.2 (Flux of a three-dimensional vector field). If a surface S is oriented with unit normal vector N, then the surface integral $\iint_S \mathbf{F}\cdot\mathbf{N}\,dS$ is called the flux of F across S (in the direction defined by N); $\iint_S \mathbf{F}\cdot\mathbf{N}\,dS$ is also called the integral of the vector field F over S.
In particular, if F(x, y, z) is the velocity vector for the flow of a fluid through a region of
space and the density function of the fluid at point (x, y, z) is δ(x, y, z) = 1, then the flux
element through a surface element ΔS in a given direction (or mass of fluid per unit
time across ΔS) is |F| cos θΔS, where θ is the angle between F and N (the normal vector
pointing in the given direction). Therefore, |F| cos θΔS = |F| cos θ|N|ΔS = (F ⋅ N)ΔS,
and the integration ∬S (F ⋅ N)dS gives the total flux across S in the given direction. This
is, indeed, a surface integral of the form already defined in equation (4.25), so it has
the same properties (for example, linear and additive across two regions S1 and S2 ). In
addition, we have
If S is closed, we also adopt the notation ∯S (F ⋅ N)dS, with a circle in the integral
sign.
Example 4.8.1. Given the vector field $\mathbf{F} = (x^2 - \sin y^2 + z)\mathbf{i} - y\mathbf{j} + (z + z^2)\mathbf{k}$, find the flux out of the top and bottom faces of the cube
Solution. Since the flux is out of two faces, S1 , the top side of D, and S2 , the bottom
side of D, and we orient S1 upward and S2 downward so that we will find outward flux
leaving the cube, we choose N1 to be ⟨0, 0, 1⟩ and we choose N2 to be ⟨0, 0, −1⟩. Then,
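\[ \begin{aligned} \iint_{S_1} (\mathbf{F}\cdot\mathbf{N}_1)\,dS &= \iint_{S_1} (z + z^2)\,dS \\ &= \iint_{S_1} (2 + 2^2)\,dS \quad \text{(since the top side is } z = 2\text{)} \\ &= \iint_{S_1} 6\,dS = 6\times A(S_1) \\ &= 24, \end{aligned} \]
\[ \iint_{S_1 + S_2} (\mathbf{F}\cdot\mathbf{N})\,dS = 24 + 0 = 24. \]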
Using the surface integral evaluation formula, equation (4.26), this becomes
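\[ \iint_S (\mathbf{F}\cdot\mathbf{N})\,dS = \pm\iint_D \mathbf{F}\cdot\frac{\mathbf{r}_u\times\mathbf{r}_v}{|\mathbf{r}_u\times\mathbf{r}_v|}\,|\mathbf{r}_u\times\mathbf{r}_v|\,du\,dv, \]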
Example 4.8.2. Find the flux out of the lateral surface of the cylinder $x^2 + y^2 = 4$, 0 ≤ z ≤ 2 for the vector field $\mathbf{F} = \langle x, y, x + z^2\rangle$.
Solution. The cylinder has a parametric description r(u, v) = 2 cos u i + 2 sin u j + v k, 0 ≤ u ≤ 2π and 0 ≤ v ≤ 2. Then
\[ \mathbf{r}_u\times\mathbf{r}_v = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -2\sin u & 2\cos u & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2\cos u\,\mathbf{i} + 2\sin u\,\mathbf{j}. \]
Note that this is an outward normal vector, so we take the positive sign and
This normal vector may or may not have the same direction as desired. Hence, for a vector field F = ⟨f , g, h⟩ defined on S, the surface integral becomes
\[ \iint_S (\mathbf{F}\cdot\mathbf{N})\,dS = \pm\iint_S \langle f, g, h\rangle\cdot\frac{1}{\sqrt{1 + z_x^2 + z_y^2}}\langle -z_x, -z_y, 1\rangle\,dS. \]
Using the surface integral evaluation formula, equation (4.27), this becomes
\[ \iint_S (\mathbf{F}\cdot\mathbf{N})\,dS = \pm\iint_{D_{xy}} \langle f, g, h\rangle\cdot\frac{1}{\sqrt{1 + z_x^2 + z_y^2}}\langle -z_x, -z_y, 1\rangle\sqrt{1 + z_x^2 + z_y^2}\,d\sigma \]
Similar formulas hold for surfaces S : x = x(y, z) and S : y = y(x, z). If Dyz and Dxz
are projections of S onto the yz-plane and xz-plane, respectively, then
Example 4.8.3. Evaluate $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where $\mathbf{F}(x, y, z) = y\mathbf{i} + x\mathbf{j} + z\mathbf{k}$ and S is the boundary of the solid region R enclosed by the paraboloid $z = 1 - x^2 - y^2$ and the plane z = 0. Assume S is oriented inward.
= ∬(−4xy − 1 + x2 + y2 )dxdy + 0
D
2π 1
cos αdS = dydz, cos βdS = dxdz, and cos γdS = dxdy.
Then,
Similarly, using the alternative forms of the equation (4.33), we can show that if F =
⟨0, g, 0⟩, then
Note. Again, one must be aware that the normal vectors must be consistent with the
desired direction.
when S is part of the plane x + 2y + z = 3 in the first octant with the unit normal N of S pointing to the
side of the surface away from the origin.
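\[ \begin{aligned} &= \iint_{D_{xy}} (4x + 6y - 6)\,d\sigma \\ &= \int_0^{3/2}\Big(\int_0^{3-2y} (4x + 6y - 6)\,dx\Big)\,dy \\ &= \int_0^{3/2} (-4y^2 + 6y)\,dy \\ &= \frac{9}{4}. \end{aligned} \]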
As seen with two-dimensional vector fields, the divergence, which measures the “source” strength of a vector field F, is defined to be ∇ ⋅ F. This definition can be extended to three-dimensional vector fields as well. For example, if F = ⟨f , g, h⟩, where f , g, and h are three functions of three variables, then the divergence of F is
\[ \text{divergence of } \mathbf{F} = \operatorname{Div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z}. \]
4.9 Divergence theorem | 249
we calculate the flux per unit volume out of this box, the density of the flux, and then
let Δx, Δy, Δz → 0 to show that the rate of change of the “quantity” of F at (x, y, z) is
∇ ⋅ F.
The component of F in the z-direction is h(x, y, z). Hence, the flow out of the face A
(Figure 4.32) of the box is approximately h(x, y, z + Δz)4ΔxΔy, i. e., the flow per unit
time at the center of the face multiplied by the area of the face. Similarly, the flow
into face A of the box is approximately h(x, y, z − Δz)4ΔxΔy. Hence, the change in the
z-direction per unit volume is approximately
and in the limiting case as Δz → 0, the flux change per unit volume in the z-direction
is
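\[ \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z} = \nabla\cdot\mathbf{F} = \operatorname{Div}\mathbf{F}. \]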
Note. If, for example, F = ⟨f , g, h⟩ is fluid flow at (x, y, z), then the flow may be in
any direction. However, we know that the flow F is equivalent to the flow of its three
components f , g, and h in the direction of the three coordinate axes. This allows us to
use the box method above to compute the total flux.
\[ \oint_C (\mathbf{F}\cdot\mathbf{N})\,ds = \iint_D (\nabla\cdot\mathbf{F})\,d\sigma = \iint_D \Big(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\Big)\,d\sigma, \]
which states that the integral of the divergence over a simply connected region D gives the total flux out of the boundary C of the region D. The three-dimensional version of this form of Green’s theorem is the following divergence theorem.
Theorem 4.9.1 (The divergence theorem). Let S be a closed surface in ℝ3 oriented outward, enclosing a simply connected region Ω. Let the vector field F = ⟨f (x, y, z), g(x, y, z), h(x, y, z)⟩ be defined and have continuous partial derivatives on a region containing Ω and S. Then
\[ \iint_S (\mathbf{F}\cdot\mathbf{N})\,dS = \iiint_\Omega \nabla\cdot\mathbf{F}\,dV. \]
Note. As noted previously, the quantity ∬S (F ⋅ N)dS measures the flux of the vector
field across the surface S. If, for example, F measures a fluid flow (mass/unit time in
the direction of F) at each point (x, y, z), then ∬S (F ⋅ N)dS measures the amount per
unit time of fluid crossing the surface S in the direction of N. In this case ∭Ω ∇ ⋅ FdV
measures the rate of change of fluid mass in the region Ω. This is because ∇⋅F measures
the flux per unit volume at the point, or in other words, the rate (per unit volume) at
which the fluid quantity is changing at the point (x, y, z).
Example 4.9.1. Compute the flux of F(x, y, z) = zi + yj + xk out of S : x 2 + y 2 + z 2 = 1 using the two
integrals in the divergence theorem.
Solution. We use the parameterization r(u, v) = ⟨sin u cos v, sin u sin v, cos u⟩, 0 ≤ u
≤ π, 0 ≤ v ≤ 2π. Then as computed before
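\[ = \Big(-\cos u + \frac{1}{3}\cos^3 u\Big)\Big|_0^{\pi}\times\pi = \frac{4\pi}{3}. \]
We now evaluate the triple integral, where Ω (the unit ball) is the interior of S. We have
\[ \begin{aligned} \iiint_\Omega \nabla\cdot\mathbf{F}\,dV &= \iiint_\Omega \Big(\frac{\partial}{\partial x}(z) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(x)\Big)\,dV \\ &= \iiint_\Omega 1\,dV = \frac{4\pi\cdot 1^3}{3} = \frac{4\pi}{3}. \end{aligned} \]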
Note. The last integral did not require a computation because it is the volume of the ball of radius 1.
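Evaluate
\[ \iint_S (xy\,\mathbf{i} + yz\,\mathbf{j} + xz\,\mathbf{k})\cdot d\mathbf{S}, \]
where S is the surface of the cube bounded by the coordinate planes and by x = 1, y = 1, and z = 1, oriented outward.
Solution. Using the divergence theorem, with B denoting the inside of the box, we have
\[ \begin{aligned} \iint_S (xy\,\mathbf{i} + yz\,\mathbf{j} + xz\,\mathbf{k})\cdot d\mathbf{S} &= \iiint_B \nabla\cdot(xy\,\mathbf{i} + yz\,\mathbf{j} + xz\,\mathbf{k})\,dV = \iiint_B (y + z + x)\,dV \\ &= \int_0^1\int_0^1\int_0^1 (y + z + x)\,dx\,dy\,dz = \int_0^1\int_0^1 \Big(y + z + \frac{1}{2}\Big)\,dy\,dz \\ &= \int_0^1 (z + 1)\,dz = \frac{3}{2}. \end{aligned} \]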
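The surface side of the divergence theorem can also be evaluated directly by summing the outward flux over the six faces of the cube. The following sketch is for comparison only; the helper `face_flux` and its arguments are hypothetical names introduced for this illustration.

```python
def F(x, y, z):
    return (x * y, y * z, x * z)

def face_flux(axis, value, sign, n=200):
    """Outward flux through the face of the unit cube where coordinate `axis` equals `value`.

    `sign` is +1 when the outward normal points in the positive axis direction, -1 otherwise.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            point = [(i + 0.5) * h, (j + 0.5) * h]
            point.insert(axis, value)              # fix the coordinate that is constant on the face
            total += sign * F(*point)[axis] * h * h
    return total

total_flux = sum(
    face_flux(axis, value, sign)
    for axis in range(3)
    for value, sign in ((0.0, -1.0), (1.0, 1.0))
)
print(total_flux)  # should be close to 3/2, the value of the triple integral
```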
Example 4.9.3. Find the flux across S, the top hemisphere $z = \sqrt{4 - x^2 - y^2}$, oriented outward, if $\mathbf{F} = \langle x^3 + 2y\sin z,\ x^4 + y^3,\ e^{-y^2 - x^2} + z^3\rangle$.
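\[ \begin{aligned} &= \iiint_\Omega (3x^2 + 3y^2 + 3z^2)\,dV \\ &= 3\int_0^{2\pi} d\theta\int_0^{\pi/2} d\phi\int_0^2 \rho^4\sin\phi\,d\rho \\ &= 3\int_0^{2\pi} d\theta\int_0^{\pi/2} \sin\phi\,d\phi\int_0^2 \rho^4\,d\rho = \frac{192\pi}{5}. \end{aligned} \]
Now we need to subtract $\iint_{S_1} \mathbf{F}\cdot d\mathbf{S}$, the flux through the bottom disk (written here as $S_1$: $x^2 + y^2 \le 4$, z = 0, oriented downward), in order to get the desired flux:
\[ \begin{aligned} \iint_S \mathbf{F}\cdot d\mathbf{S} &= \frac{192\pi}{5} - \iint_{S_1} \mathbf{F}\cdot d\mathbf{S} \\ &= \frac{192\pi}{5} - \iint_{S_1} \langle x^3 + 2y\sin z,\ x^4 + y^3,\ e^{-y^2 - x^2} + z^3\rangle\cdot\langle 0, 0, -1\rangle\,dS \\ &= \frac{192\pi}{5} + \iint_{S_1} e^{-y^2 - x^2}\,dS \\ &= \frac{192\pi}{5} + \iint_{x^2 + y^2 \le 4} e^{-y^2 - x^2}\,d\sigma = \frac{192\pi}{5} + \int_0^{2\pi} d\theta\int_0^2 e^{-r^2} r\,dr \\ &= \frac{192\pi}{5} - 2\pi\Big(\frac{1}{2}e^{-4} - \frac{1}{2}\Big). \end{aligned} \]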
Example 4.9.4. An electric charge q at the origin creates an electric field F at r (the position vector from the origin) given by
\[ \mathbf{F} = \frac{q}{4\pi\varepsilon_0}\frac{\mathbf{r}}{r^3}, \]
where r = |r| and $\varepsilon_0$ is a constant. Compute ∇ ⋅ F and find the flux across the surface S of the sphere $B: x^2 + y^2 + z^2 \le b^2$. Find the flux across any closed surface S1 containing the charge.
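A vector field of the form
\[ k\frac{\mathbf{r}}{r^3} = k\Big\langle \frac{x}{r^3}, \frac{y}{r^3}, \frac{z}{r^3}\Big\rangle, \]
where k is a constant, follows the inverse square law (as do the fields of an electric charge and of gravity). The divergence is zero, because $r = (x^2 + y^2 + z^2)^{1/2}$ and $r^2 = x^2 + y^2 + z^2$. So $2r\frac{\partial r}{\partial x} = 2x$, and $\frac{\partial r}{\partial x} = \frac{x}{r}$. Thus,
\[ \frac{\partial}{\partial x}\Big(\frac{x}{r^3}\Big) = \frac{r^3 - x\cdot 3r^2\frac{\partial r}{\partial x}}{r^6} = \frac{1}{r^3} - \frac{3x^2}{r^5}. \]
By symmetry, we have
\[ \begin{aligned} \nabla\cdot k\frac{\mathbf{r}}{r^3} &= k\nabla\cdot\Big\langle \frac{x}{r^3}, \frac{y}{r^3}, \frac{z}{r^3}\Big\rangle \\ &= k\Big(\frac{1}{r^3} - \frac{3x^2}{r^5}\Big) + k\Big(\frac{1}{r^3} - \frac{3y^2}{r^5}\Big) + k\Big(\frac{1}{r^3} - \frac{3z^2}{r^5}\Big) \\ &= \frac{3k}{r^3} - 3k\Big(\frac{x^2 + y^2 + z^2}{r^5}\Big), \end{aligned} \]
so that
\[ \nabla\cdot\mathbf{F} = \frac{3k}{r^3} - \frac{3k}{r^3} = 0. \]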
Using the divergence theorem when F is an inverse square field would give ∬S F ⋅ dS =
∭Ω ∇ ⋅ FdV = 0, but this is a wrong answer, because the vector field F is not defined
at the origin (in fact it approaches ∞ as x, y, z → 0).
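Hence, we must integrate the flux integral directly without using the divergence theorem. The outward unit normal to the sphere $x^2 + y^2 + z^2 = b^2$ at (x, y, z) ∈ S is $\frac{1}{b}(x, y, z)$, so
\[ \begin{aligned} \iint_S \mathbf{F}\cdot d\mathbf{S} &= \iint_S (\mathbf{F}\cdot\mathbf{N})\,dS \\ &= \frac{q}{4\pi\varepsilon_0}\iint_S \Big(\frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(\sqrt{x^2 + y^2 + z^2})^3}\Big)\cdot\Big(\frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{b}\Big)\,dS \\ &= \frac{q}{4\pi\varepsilon_0}\iint_S \frac{x^2 + y^2 + z^2}{b^4}\,dS \\ &= \frac{q}{4\pi\varepsilon_0}\iint_S \frac{1}{b^2}\,dS \\ &= \frac{q}{4\pi b^2\varepsilon_0}\cdot 4\pi b^2 \quad \text{(since } 4\pi b^2 \text{ is the area of } S\text{)} \\ &= \frac{q}{\varepsilon_0}. \end{aligned} \]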
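The independence of the flux from the radius b can be seen numerically. The sketch below drops the constant $\frac{q}{4\pi\varepsilon_0}$ (an assumption made to keep the example unit-free), so the expected flux of $\mathbf{r}/r^3$ through every sphere is 4π; the function name and grid sizes are our own choices.

```python
import math

def sphere_flux(b, nu=400, nv=400):
    """Approximate the outward flux of r / |r|^3 through the sphere of radius b."""
    du, dv = math.pi / nu, 2.0 * math.pi / nv
    total = 0.0
    for i in range(nu):
        u = (i + 0.5) * du
        for j in range(nv):
            v = (j + 0.5) * dv
            x = b * math.sin(u) * math.cos(v)
            y = b * math.sin(u) * math.sin(v)
            z = b * math.cos(u)
            r3 = (x * x + y * y + z * z) ** 1.5
            # F . N with N = (x, y, z) / b, times the area element b^2 sin(u) du dv
            f_dot_n = (x * x + y * y + z * z) / (r3 * b)
            total += f_dot_n * b * b * math.sin(u) * du * dv
    return total

fluxes = [sphere_flux(b) for b in (0.5, 1.0, 3.0)]
print(fluxes, 4.0 * math.pi)
```

Multiplying by $\frac{q}{4\pi\varepsilon_0}$ recovers the value $q/\varepsilon_0$ for every radius.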
Hence, the flux through the sphere of any radius is $\frac{q}{\varepsilon_0}$.
In fact, the flux through any closed orientable surface S1 containing the charge is the same value $\frac{q}{\varepsilon_0}$. To see this, let the radius b of the sphere S be sufficiently large so that S totally encloses S1 . If $\mathbf{F} = \frac{q}{4\pi\varepsilon_0}\frac{\mathbf{r}}{r^3}$, then the region R between the two surfaces satisfies the divergence theorem because this region no longer contains the origin. Hence,
\[ \iint_{S + S_1} (\mathbf{F}\cdot\mathbf{N})\,dS = \iiint_R \nabla\cdot\mathbf{F}\,dV = 0 \ \Rightarrow\ \iint_{S_1} (\mathbf{F}\cdot\mathbf{N})\,dS = -\iint_S (\mathbf{F}\cdot\mathbf{N})\,dS = -\frac{q}{\varepsilon_0}. \]
But the unit normal N on S1 will point out of the region R, which means it points into the region containing the charge. Changing the direction of N to point outwards from the charge q shows that
\[ \iint_{S_1} (\mathbf{F}\cdot\mathbf{N})\,dS = \frac{q}{\varepsilon_0}. \]
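and
\[ \iiint_\Omega \nabla\cdot\mathbf{F}\,dV = \iiint_\Omega \Big(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z}\Big)\,dV, \]
\[ \iiint_\Omega \frac{\partial h}{\partial z}\,dV = \iint_S h\,\mathbf{k}\cdot\mathbf{N}\,dS. \tag{4.37} \]
and
\[ \iiint_\Omega \frac{\partial g}{\partial y}\,dV = \iint_S g\,\mathbf{j}\cdot\mathbf{N}\,dS \quad\text{and}\quad \iiint_\Omega \frac{\partial f}{\partial x}\,dV = \iint_S f\,\mathbf{i}\cdot\mathbf{N}\,dS. \]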
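Recall that when we were trying to find the circulation along a plane curve C over a vector field F = ⟨f , g⟩, we saw the term $\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$ in Green’s theorem in curl-circulation form, and we noted that
\[ \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \]
is the k-component of the curl, a vector relating to the rotational effect of a vector field. If we want to find circulation along a simple closed curve over a vector field in ℝ3 , what can we expect for the i- or j-components of the curl? What do the i- or j-components look like? In general, we define the curl of the vector field F = ⟨f , g, h⟩ as
\[ \operatorname{curl}\mathbf{F} = \Big(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}\Big)\mathbf{i} + \Big(\frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}\Big)\mathbf{j} + \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\mathbf{k}. \]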
curl F = ∇ × F.
If the curl of a vector field is always 0, then the field is irrotational. An irrotational
vector field F might be conservative, that is, there exists a potential function φ(x, y, z)
such that F = ∇φ.
Example 4.10.1. First compute the curl of the vector field given by
Is the field irrotational? Attempt to find a potential function φ for the vector field F.
Solution. Note that f = 2xy − z, g = x2 + 4y, and h = −x + 2z. We first compute curl F.
We have
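\[ \operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 2xy - z & x^2 + 4y & -x + 2z \end{vmatrix} = \langle 0 - 0,\ -1 - (-1),\ 2x - 2x\rangle = \mathbf{0}. \]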
4.10 Stokes theorem | 257
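This field is irrotational. Now, we attempt to find a potential function φ. We know that φ satisfies
\[ \frac{\partial\varphi}{\partial x} = 2xy - z, \quad \frac{\partial\varphi}{\partial y} = x^2 + 4y, \quad\text{and}\quad \frac{\partial\varphi}{\partial z} = -x + 2z. \]
Step 1: We integrate the first equation with respect to x, which gives
\[ \varphi(x, y, z) = x^2 y - xz + C(y, z). \]
Step 2: We differentiate this φ with respect to y giving the second component of F, and use this to deduce more information about φ, i. e.,
\[ \begin{aligned} \frac{\partial\varphi}{\partial y} &= x^2 + \frac{\partial C(y, z)}{\partial y} \quad (\text{but this must be equal to } x^2 + 4y) \\ &\Rightarrow \frac{\partial C(y, z)}{\partial y} = 4y \\ &\Rightarrow C(y, z) = 2y^2 + C(z) \\ &\Rightarrow \varphi(x, y, z) = x^2 y - xz + 2y^2 + C(z). \end{aligned} \]
Step 3: We differentiate φ again with respect to z giving the third component of F, and we use this to determine φ(x, y, z) (up to a constant), i. e.,
\[ \begin{aligned} \frac{\partial\varphi}{\partial z} &= -x + \frac{dC(z)}{dz} \quad (\text{but this must be equal to } -x + 2z) \\ &\Rightarrow \frac{dC(z)}{dz} = 2z \\ &\Rightarrow C(z) = z^2 + C. \end{aligned} \]
Hence, we have
\[ \varphi(x, y, z) = x^2 y - xz + 2y^2 + z^2 + C, \]
where C is an arbitrary constant.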
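A quick way to test a candidate potential function is to compare its numerical gradient with F at a few points. This is a minimal sketch; the sample points, the step size, and the helper name `grad` are our own choices, and the constant C is taken to be 0.

```python
def phi(x, y, z):
    return x * x * y - x * z + 2.0 * y * y + z * z   # potential with C = 0

def F(x, y, z):
    return (2.0 * x * y - z, x * x + 4.0 * y, -x + 2.0 * z)

def grad(func, p, h=1e-5):
    """Central-difference approximation of the gradient of func at the point p."""
    result = []
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        result.append((func(*plus) - func(*minus)) / (2.0 * h))
    return tuple(result)

points = [(0.5, -1.0, 2.0), (1.5, 0.25, -0.75), (-2.0, 1.0, 1.0)]
max_error = max(
    abs(gi - fi)
    for p in points
    for gi, fi in zip(grad(phi, p), F(*p))
)
print(max_error)  # should be at the level of rounding error
```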
Recall Green’s theorem in curl-circulation form. If C is a simple and closed plane curve
which is the boundary of a simply connected region D, we have
\[ \oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \iint_D \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,dx\,dy. \]
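Theorem 4.10.1 (Stokes theorem). Suppose that S is a bounded simple orientable smooth surface with unit normal N and boundary curve C that is oriented with unit tangent T as described in the preceding sections (a corkscrew following the direction of N turns in the same direction as the positive direction on C). Let F(x, y, z) = ⟨f , g, h⟩ be a continuously differentiable vector-valued function defined on S. Then
\[ \oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{N}\,dS, \]
or in another form,
\[ \oint_C f\,dx + g\,dy + h\,dz = \iint_S \Big(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}\Big)\,dy\,dz + \Big(\frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}\Big)\,dz\,dx + \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,dx\,dy. \]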
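\[ \begin{aligned} \oint_C \mathbf{F}\cdot\mathbf{T}\,ds &= \iint_S \Big(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}\Big)\,dy\,dz + \Big(\frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}\Big)\,dz\,dx + \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,dx\,dy \\ &= \iint_S (\nabla\times\mathbf{F}\cdot\mathbf{N})\,dS. \end{aligned} \]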
Note. In the two-dimensional case, the surface S is a planar region D, and the unit
normal vector N is the basis vector k = ⟨0, 0, 1⟩, so Stokes theorem becomes
Note. There is more than one smooth surface with the same boundary C; however, the
integral ∬S ∇×F ⋅ NdS gives the same value for each of these smooth surfaces satisfying
the conditions of the theorem.
Example 4.10.2. Let the vector field F be ⟨2z − y, x, y⟩, and let the surface S be the upper hemisphere $z = \sqrt{4 - x^2 - y^2}$ oriented outward. The boundary curve C of S is $x^2 + y^2 = 4$ in the xy-plane oriented counterclockwise. Evaluate:
1. $\iint_S \nabla\times\mathbf{F}\cdot\mathbf{N}\,dS$.
2. $\iint_{S_1} \nabla\times\mathbf{F}\cdot\mathbf{N}\,dS$, where S1 is the disk with boundary C and with N pointing upward.
3. $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds$.
Solution. The surface and its orientation is shown in Figure 4.35. We first compute the
curl of F, which is
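\[ \operatorname{curl}\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 2z - y & x & y \end{vmatrix} = \mathbf{i} + 2\mathbf{j} + 2\mathbf{k}. \]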
Figure 4.35: Stokes theorem, Example 4.10.2, Example 4.10.4, and Example 4.10.5.
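1. Since $z_x = \frac{-x}{\sqrt{4 - x^2 - y^2}}$ and $z_y = \frac{-y}{\sqrt{4 - x^2 - y^2}}$, we have
3. We parameterize C : r(t) = ⟨2 cos t, 2 sin t, 0⟩, r′(t) = ⟨−2 sin t, 2 cos t, 0⟩, 0 ≤ t ≤ 2π. Then
\[ \oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \int_0^{2\pi} \langle -2\sin t, 2\cos t, 2\sin t\rangle\cdot\langle -2\sin t, 2\cos t, 0\rangle\,dt = \int_0^{2\pi} 4\,dt = 8\pi. \]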
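The three quantities asked for in this example can be compared numerically. The sketch below is illustrative only: it assumes the spherical parameterization r(u, v) = ⟨2 sin u cos v, 2 sin u sin v, 2 cos u⟩ for the hemisphere, for which $\mathbf{r}_u\times\mathbf{r}_v = 4\sin u\,\langle \sin u\cos v, \sin u\sin v, \cos u\rangle$ points outward, and the function names are our own.

```python
import math

CURL = (1.0, 2.0, 2.0)   # curl F = i + 2j + 2k, constant for F = <2z - y, x, y>

def F(x, y, z):
    return (2.0 * z - y, x, y)

def circulation(n=2000):
    """Line integral of F . dr around the circle x^2 + y^2 = 4, z = 0, counterclockwise."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        fx, fy, fz = F(2.0 * math.cos(t), 2.0 * math.sin(t), 0.0)
        total += (fx * (-2.0 * math.sin(t)) + fy * (2.0 * math.cos(t))) * h
    return total

def hemisphere_flux(nu=300, nv=300):
    """Flux of curl F through the upper hemisphere of radius 2, oriented outward."""
    du, dv = (math.pi / 2.0) / nu, 2.0 * math.pi / nv
    total = 0.0
    for i in range(nu):
        u = (i + 0.5) * du
        for j in range(nv):
            v = (j + 0.5) * dv
            normal = (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))
            dot = sum(c * n for c, n in zip(CURL, normal))
            total += dot * 4.0 * math.sin(u) * du * dv
    return total

def disk_flux():
    """Flux of curl F through the disk x^2 + y^2 <= 4, z = 0, with N = k."""
    return CURL[2] * math.pi * 2.0 ** 2

results = (hemisphere_flux(), disk_flux(), circulation())
print(results, 8.0 * math.pi)
```

All three values should be close to 8π, as Stokes theorem predicts for surfaces sharing the boundary C.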
Example 4.10.3. Evaluate $\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{N}\,dS$, when F = ⟨xz, yz, xy⟩ and S is the part of the sphere $x^2 + y^2 + z^2 = 4$ inside the cylinder $x^2 + y^2 = 1$ with z ≥ 0.
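\[ \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{N}\,dS = \oint_C \mathbf{F}\cdot\mathbf{T}\,ds, \]
where C is the boundary curve $x^2 + y^2 = 1$ and z is given by $z^2 = 4 - (x^2 + y^2) = 3 \Rightarrow z = \sqrt{3}$. We can represent C in vector form with positive orientation as
\[ \mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j} + \sqrt{3}\,\mathbf{k}, \quad 0 \le t \le 2\pi. \]
Hence, on C, $\mathbf{F} = \langle xz, yz, xy\rangle = \langle \sqrt{3}\cos t, \sqrt{3}\sin t, \cos t\sin t\rangle$, and
\[ \oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \int_0^{2\pi} \mathbf{F}\cdot\frac{d\mathbf{r}}{dt}\,dt = \int_0^{2\pi} (-\sqrt{3}\cos t\sin t + \sqrt{3}\sin t\cos t)\,dt = 0. \]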
Example 4.10.4. If $\mathbf{F} = \langle z^2 y, -3xy, e^{-x^2} y^3\rangle$ and S is part of the surface $z = 5 - x^2 - y^2$ above z = 4, oriented upward, find $\iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S}$.
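To evaluate $\iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S}$ directly would be hard. We now use Stokes theorem. We first find the boundary curve, z = 4 and $x^2 + y^2 = 1$, and orient it counterclockwise as viewed from above. We parameterize the curve by r(t) = ⟨cos t, sin t, 4⟩, 0 ≤ t ≤ 2π. Then,
\[ \begin{aligned} \iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S} &= \oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C \langle z^2 y, -3xy, e^{-x^2} y^3\rangle\cdot d\langle \cos t, \sin t, 4\rangle \\ &= \int_0^{2\pi} \langle 4^2\sin t, -3\cos t\sin t, e^{-\cos^2 t}(\sin t)^3\rangle\cdot\langle -\sin t, \cos t, 0\rangle\,dt \\ &= \int_0^{2\pi} (-16\sin^2 t - 3\cos^2 t\sin t)\,dt = -16\pi. \end{aligned} \]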
Alternatively, we can choose another surface with the same boundary. This alternative surface could be the disk $x^2 + y^2 \le 1$, z = 4, whose upward unit normal vector is ⟨0, 0, 1⟩. Thus,
Example 4.10.5. Compute $\int_C \mathbf{F}\cdot d\mathbf{r}$, where $\mathbf{F} = xz\,\mathbf{i} + xy\,\mathbf{j} + 3xz\,\mathbf{k}$ and C is the triangular closed curve with vertices followed in the order (1, 0, 0), (0, 2, 0), (0, 0, 2), (1, 0, 0).
→
and ∇ × F = ⟨0, x − 3z, y⟩. Hence,
→ → →
∬(∇ × F ) ⋅ N dS = ∬⟨0, x − 3z, y⟩ ⋅ N dS.
S S
Evaluating this using equation (4.33), we find $\iint_D (-f z_x - g z_y + h)\,dx\,dy$ with $z = z(x,y) = 2-2x-y$, $f = 0$, $g = x-3z$, and $h = y$, where D, the projection of S onto the xy-plane, is the triangle with vertices (0, 0, 0), (1, 0, 0), (0, 2, 0), and $\langle -z_x, -z_y, 1\rangle = \langle 2, 1, 1\rangle$. This is the correct orientation for the plane. Hence, the integral becomes
\[ \oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{N}\,dS = \iint_D \bigl((x-3z)+y\bigr)\,d\sigma = \int_0^1\!\!\int_0^{2-2x} (7x+4y-6)\,dy\,dx = -1. \]
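As a sanity check on Example 4.10.5, the two sides of Stokes theorem can be approximated numerically. The sketch below is only illustrative: the helper names `segment_integral` and `surface_side` and the midpoint rule are choices made here, not part of the text.

```python
# Numerical check of Example 4.10.5 (illustrative helpers, midpoint rule).

def F(x, y, z):
    """The vector field F = <xz, xy, 3xz>."""
    return (x * z, x * y, 3 * x * z)

def segment_integral(p, q, n=2000):
    """Midpoint-rule approximation of the line integral of F along the segment p -> q."""
    d = tuple(b - a for a, b in zip(p, q))
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        point = tuple(a + t * di for a, di in zip(p, d))
        total += sum(f * di for f, di in zip(F(*point), d)) / n
    return total

# Vertices in the order given in the example.
vertices = [(1, 0, 0), (0, 2, 0), (0, 0, 2), (1, 0, 0)]
line_integral = sum(segment_integral(vertices[i], vertices[i + 1]) for i in range(3))

def surface_side(n=400):
    """Midpoint-rule approximation of the double integral of 7x + 4y - 6
    over the triangle 0 <= x <= 1, 0 <= y <= 2 - 2x."""
    total = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (2 - 2 * x) / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += (7 * x + 4 * y - 6) * hx * hy
    return total

print(line_integral, surface_side())  # both values should be close to -1
```

Both approximations agree with the value −1 obtained from the double integral over the triangle.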
Example 4.10.6. Show, using Stokes theorem, that $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds = 0$ for any gradient vector field $\mathbf{F}$ with continuous partial derivatives and any simple orientable smooth surface S with unit normal $\mathbf{N}$ and boundary curve C in ℝ³.
Note that the proof of the fundamental theorem of line integrals can be extended to
three-dimensional vector fields in ways similar to the results we have obtained for two-
dimensional vector fields. For a three-dimensional vector field F = ⟨f , g, h⟩, where f ,
g, and h all have continuous partial derivatives and D is a simply connected region in
ℝ3 bounded by the simple curve C, we have
F = ⟨f , g, h⟩ is conservative
⇐⇒ there is a function φ such that F = ∇φ, or dφ = fdx + gdy + hdz
To find a potential function for a conservative field, we can follow the method used
in Example 4.10.1.
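\[ C:\ \mathbf{r}(t) = \frac{t(t-1)}{e^{\sqrt{t}}}\,\mathbf{i} + \sin\Bigl(\frac{\pi}{2}t^2\Bigr)\mathbf{j} + \frac{t}{t^2+1}\,\mathbf{k}, \quad 0 \le t \le 1. \]
Solution. Since
\[ \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ y & x & 2z \end{vmatrix} = \mathbf{0}, \]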
this field is conservative and, therefore, it is path-independent. The two endpoints are
(0, 0, 0) and $(0, 1, \tfrac{1}{2})$. We could have found a potential function, but we simply choose a simple route between the endpoints. If we choose the line segment $\mathbf{r}_1(t) = t\langle 0, 1, \tfrac{1}{2}\rangle$, for $0 \le t \le 1$, then we have $x = 0$, $y = t$, and $z = t/2$, and
\[ \int_C y\,dx + x\,dy + 2z\,dz = \int_0^1 \Bigl(0 + 0 + \frac{t}{2}\Bigr)\,dt = \frac{1}{4}. \]
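Path independence can also be checked numerically. The sketch below integrates $y\,dx + x\,dy + 2z\,dz$ along the original curve and compares the result with the difference of a candidate potential $\varphi = xy + z^2$. That potential is an assumption introduced here (the text only remarks that one could be found), and the function names are illustrative.

```python
import math

def r(t):
    """The curve C of the example, 0 <= t <= 1."""
    return (t * (t - 1) / math.exp(math.sqrt(t)),
            math.sin(math.pi * t * t / 2),
            t / (t * t + 1))

def phi(x, y, z):
    """Candidate potential (an assumption): grad(phi) = <y, x, 2z>."""
    return x * y + z * z

def line_integral(curve, n=20000):
    """Approximate the integral of y dx + x dy + 2z dz along curve(t), 0 <= t <= 1,
    using the field at chord midpoints times the chord increments."""
    total = 0.0
    prev = curve(0.0)
    for i in range(1, n + 1):
        cur = curve(i / n)
        xm, ym, zm = ((a + b) / 2 for a, b in zip(prev, cur))
        dx, dy, dz = (b - a for a, b in zip(prev, cur))
        total += ym * dx + xm * dy + 2 * zm * dz
        prev = cur
    return total

along_curve = line_integral(r)
potential_difference = phi(*r(1.0)) - phi(*r(0.0))
print(along_curve, potential_difference)  # both should be close to 0.25
```

The complicated curve gives the same value as the straight segment, as path independence predicts.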
\[ = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{N}\,dS. \]
Interpretation of curl
Now we can shed more light on the meaning of the curl vector using Stokes theorem.
Let $P_0$ be a point on the surface S and let $S_{P_0}$ be a very small patch of S containing $P_0$. Let $A(S_{P_0})$ be the area of the small patch. Then, under the continuity assumption, we have
\[ \operatorname{curl}\mathbf{F}(P_0)\cdot\mathbf{N}(P_0) \approx \frac{\oint_C \mathbf{F}\cdot d\mathbf{r}}{A(S_{P_0})}. \]
Taking the limit as the small patch contracts to $P_0$, we see that $\operatorname{curl}\mathbf{F}(P_0)\cdot\mathbf{N}(P_0)$ is the circulation density at the point $P_0$. Thus, integrating $\operatorname{curl}\mathbf{F}\cdot\mathbf{N}$ over S gives the total circulation along the boundary curve C. Also, one sees that the circulation density is greatest when $\operatorname{curl}\mathbf{F}$ is parallel to $\mathbf{N}$, in which case we have the greatest curling effect.
4.11 Review
Main concepts discussed in this chapter are listed below.
1. Line integral of f (x, y) or f (x, y, z) along a curve C with respect to arc length:
2. Some equivalent notations for the line integral of a vector field F = ⟨f , g⟩ along a
curve C:
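5. Green’s theorem:
\[ \oint_C f\,dx + g\,dy = \iint_D \Bigl(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Bigr)\,d\sigma \quad \text{(circulation-curl form)}, \]
\[ \oint_C f\,dy - g\,dx = \iint_D \Bigl(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\Bigr)\,d\sigma \quad \text{(flux-divergence form)}. \]
\[ \iff \oint_C \mathbf{F}\cdot d\mathbf{r} = 0 \]
\[ \iff \varphi_{xy} = \varphi_{yx} \quad \Bigl(\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}\Bigr). \]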
7. Under suitable conditions
⇐⇒ ∮ F ⋅ ds = 0
C
𝜕g 𝜕f
⇐⇒ ψxy = ψyx ( + = 0).
𝜕x 𝜕y
\[ \iint_S f(x,y,z)\,dS = \iint_{D_{xy}} f(x,y,z)\sqrt{1+z_x^2+z_y^2}\,dx\,dy \quad \text{for a surface } z = z(x,y), \]
\[ \iint_S f(x,y,z)\,dS = \iint_{D_{xy}} f(x,y,z)\,\frac{|\nabla F|}{|F_z|}\,dx\,dy \quad \text{for a surface } F(x,y,z) = 0. \]
Similar results hold for surfaces that can be projected to coordinate planes other
than the xy-plane.
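9. Divergence of a vector field $\mathbf{F} = \langle f, g, h\rangle$:
\[ \operatorname{Div}(\mathbf{F}) = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z} = \nabla\cdot\mathbf{F}. \]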
Similar results hold for a surface that can be projected onto coordinate planes
other than the xy-plane.
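12. The divergence theorem: the outward flux crossing a closed surface S is
\[ \iint_S f\,dy\,dz + g\,dz\,dx + h\,dx\,dy = \iiint_\Omega \Bigl(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z}\Bigr)\,dV, \]
\[ \operatorname{curl}\mathbf{F} = \Bigl(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}\Bigr)\mathbf{i} + \Bigl(\frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}\Bigr)\mathbf{j} + \Bigl(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Bigr)\mathbf{k} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ f & g & h \end{vmatrix} = \nabla\times\mathbf{F}. \]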
4.12 Exercises
4.12.1 Line integrals
1. Evaluate each of the following line integrals for the given curve C:
(1) ∫C √2yds, C : x = a(t − sin t), y = a(1 − cos t), 0 ≤ t ≤ 2π,
(2) ∫C (x + y)ds, C consists of three line segments with vertices (0, 0), (1, 0),
and (0, 1),
(3) ∫C cos √x2 + y2 ds, C is the boundary of the region in the first quadrant bounded
by x = y, y = √R2 − x2 , and y = 0,
(4) $\int_C \sqrt{2y^2+z^2}\,ds$, $C:\ \begin{cases} x^2+y^2+z^2 = a^2, \\ x - y = 0, \end{cases}$
(5) $\int_C (x^2+y^2+z^2)\,ds$, $C$: $x = e^t\cos t$, $y = e^t\sin t$, and $z = e^t$, $0 \le t \le 2\pi$.
(1) ∮C (x + y)2 dx + (x2 − y2 )dy, C is the triangle with vertices A(1, 1), B(3, 3), and
C(3, 5),
(2) ∮C xy2 dx − x2 ydy, C is the circle x2 + y2 = R2 ,
(3) ∫C (y + 2xy)dx + (x2 + 2x + y2 )dy, C is the top-half-arc of the circle x2 + y2 = 4x
from (4, 0) to (0, 0),
(4) ∮C F ⋅ dr, where F(x, y) = ⟨ex (1 − cos y), ex (y − sin y)⟩ and C is the boundary of
the region enclosed by x = 0, x = π, y = 0, and y = sin x,
(5) ∮C ∇(ex + sin(yx2 )) ⋅ dr, where C is any smooth simple closed curve in the
xy-plane,
4
(6) ∫C F ⋅ Tds, where F(x, y) = ⟨ex + y2 , xy + sin(ln y)⟩ and C is the boundary of the
quadrilateral with vertices (1, 1), (1, 2), (2, 3), and (2, 1).
5. Determine whether each of the following vector fields is conservative; if so, find a
potential function:
(1) F = xi − yj, (2) F = ⟨tan y, x sec2 y⟩,
(3) F = ⟨1 − ye−x , e−x ⟩, (4) F = ⟨y + 2xy, x2 + x + y2 ⟩,
(5) F = −yi + xj, (6) F = ⟨ex cos y, −ex sin y⟩,
(7) F = (x2 + 2xy − y2 )i + (x2 − 2xy − y2 )j.
6. Use a line integral to find the area of the region enclosed by the curve x = a cos3 t
and y = a sin3 t.
7. Use Green’s theorem to prove that the centroid of a plane region D in the xy-plane
has coordinates $(\bar{x}, \bar{y})$ given by
\[ \bar{x} = \frac{1}{2A(D)}\oint_C x^2\,dy \quad\text{and}\quad \bar{y} = -\frac{1}{2A(D)}\oint_C y^2\,dx, \]
where A(D) is the area of the region D. Hence, find the coordinates of the centroid
of the semicircle y = √a2 − x2 .
8. Evaluate the outward flux of each of the following vector fields across the given
curve C:
(1) F = xy2 i + xyj, and C is the boundary of the annulus 1 ≤ x 2 + y2 ≤ 4,
(2) F = ⟨−y, x⟩, and C is the circle with center the origin and radius a.
9. Consider the vector field $\mathbf{F} = \dfrac{x}{x^2+y^2}\,\mathbf{i} + \dfrac{y}{x^2+y^2}\,\mathbf{j}$.
(2) ∬S xyzdS, S is the part of the plane x + y + z = 1 that lies in the first octant,
(3) ∬S (xy + yz + zx)dS, S is the part of the cone z = √x 2 + y2 that lies inside the
cylinder x2 + y2 = 2x.
2. Evaluate each surface integral ∬S F ⋅ dS for each of the following vector fields F
and oriented surfaces S:
(1) F = ⟨0, 0, xyz⟩, S is the part of the cylinder x 2 + z 2 = R2 in the first and fifth oc-
tants and between the two planes y = 0 and y = h, with outward orientation,
(2) F = (y − z)i + (z − x)j + (x − y)k, S is the surface of the region E bounded by the
cone z = √x2 + y2 and the plane z = 1, with outward orientation.
3. Evaluate each of the following surface integrals ∬S Pdydz + Qdxdz + Rdxdy:
(1) ∬S xydydz + yzdxdz + zxdxdy, where S is the surface of the solid bounded by
z = 0, y = 0, z = 0, and x + y + z = 1, oriented outward,
2 2 2
(2) ∬S (e−x y + x)dydz + (2e−x y + y)dxdz + (e−x y + z)dxdy, where S is the part of the
plane x − y + z = 1 that is in the fourth octant, oriented outward,
(3) ∬S (x2 − y)dxdz + sin(xy)dxdy, where S is the part of the cylinder x 2 + y2 = 1
that is cut by z = 0 and z = 2, oriented outward.
4. Compute the divergence of each of the following vector fields:
2
(1) F = (x2 + sin y2 )i + (y2 − x)j, (2) F = ⟨x + x3 + yz 2 , e−x + ln(y2 + 1), z + xy⟩,
1
(3) F = (x3 + yz)i−xzj+yzk, (4) F = ⟨x − 1+xy2
, tan−1 z + y, z 2 + 3x⟩.
5. Use the divergence theorem to find $\iint_S x^3\,dy\,dz + y^3\,dz\,dx + z^3\,dx\,dy$, where S is the
\[ \mathbf{F} = \Bigl\langle x^3 + 3x + \frac{1}{z^2+y^2+1},\ y^3 + xy,\ z^3 - xz + \sin(xy)\Bigr\rangle \]
is a vector field.
(1) Compute the divergence of F.
(2) Find the flux out of W (that is, evaluate ∬S F ⋅ NdS).
8. To evaluate ∬Σ xyzdxdy, where Σ : x2 + y2 + z 2 = 1, (x ≥ 0, y ≥ 0), oriented outward,
two students provided the following solutions:
Solution 1
The integration surface is symmetric about the xy-plane, with half the plane above
the xy-plane and the other half below the xy-plane. The integrand xyz, if keeping
Solution 2
The second student writes
(a) Parameterize C and use the parameterization to evaluate the line integral
5.1 Introduction
We first investigate several examples. We assume the population P(t) is a function of
time, t, subject to constant birth and death rates. Then the rate of change of P with
respect to time t can be modeled as
\[ \frac{dP}{dt} = kP, \quad\text{where } k = \text{birth rate} - \text{death rate} = \text{some constant}. \]
This equation involves the unknown function $P$ and its first derivative $P'$. It is a first-order differential equation because it involves a first-order derivative. One can check that $P = Ce^{kt}$, where C is an arbitrary constant, is a solution to this equation. If k is
positive, it is an exponential growth model, and if k is negative, it is an exponential
decay model. This is the case for the population of a family of bacteria growing or
disappearing over a short period of time. Also, many radioactive materials satisfy this
law.
Newton’s law of cooling says that the rate of change of the temperature T(t) of
a body is proportional to the difference between T and the temperature A of the sur-
rounding medium. If we know T(0) = 50 °C, then we have
\[ \frac{dT}{dt} = -k(T - A), \quad T(0) = 50\ {}^\circ\mathrm{C}, \]
\[ F - F_r = -mx'', \]
https://doi.org/10.1515/9783110674378-005
where x is the displacement from the equilibrium point, and the derivative is with re-
spect to time. By Hooke’s law, we have F = −kx, where k is some constant. The negative
sign in the right-hand side indicates that the resultant force and the displacement are
in opposite directions. If $F_r$, the resistance force, is proportional to the velocity of the
mass, then $F_r = lx'$ for some constant $l$, so we have
This equation involves an unknown function x(t) and its first- and second-order
derivatives, so it is a second-order differential equation.
The examples above are all ordinary differential equations (ODEs), since the un-
known functions only depend on a single variable. The following differential equa-
tions are all examples of ODEs:
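\[ \text{(a)}\ \frac{d^3y}{dx^3} + 2\frac{dy}{dx} - 3y = \frac{1}{x}, \qquad \text{(b)}\ y^{(4)} + y^{(2)}y = y', \]
\[ \text{(c)}\ \Bigl(\frac{d^2x}{dt^2}\Bigr)^3 + 2\Bigl(\frac{dx}{dt}\Bigr)^4 = x, \qquad \text{(d)}\ \dot{x} = 2\sqrt{x} + t^2. \]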
The order of an ODE is the highest derivative that is found in the ODE. Thus, in the
above four ODEs, (a) is of order 3, (b) is of order 4, (c) is of order 2, and (d) has order 1.
The degree of an ODE is the highest power of the highest derivative in that ODE.
Thus, in the above four ODEs, (a), (b), and (d) all have degree 1 while (c) has degree 3.
Scientists and economists also use partial differential equations (PDEs) to solve
problems. For example, the heat equation is
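\[ \frac{\partial u}{\partial t} - \alpha\Bigl(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\Bigr) = 0, \]
where $u(x, y, z, t)$ is the temperature of a body and $\alpha$ is the thermal diffusivity. The wave equation
\[ \frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}, \]
the harmonic equation
\[ \frac{\partial^2 u}{\partial t^2} + \frac{\partial^2 u}{\partial x^2} = 0, \]
and the famous Black–Scholes model for option pricing
\[ \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2S^2\frac{\partial^2 V}{\partial S^2} = rV - rS\frac{\partial V}{\partial S} \]
are all examples of PDEs.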
Both ODEs and PDEs have enormously important applications, as seen above. In
this text, we only consider some ODEs.
5.2 First-order ODEs
We will discuss a first-order differential equation that can be written in the form
\[ \frac{dy}{dx} = f(x, y). \]
Example 5.2.1. Solve the differential equation $\frac{dy}{dx} = 2x$.
\[ y(x) = \int 2x\,dx, \]
\[ y = x^2 + C. \]
This is a general solution of the differential equation $\frac{dy}{dx} = 2x$ since it gives every possible solution to the equation. If, furthermore, we know $y(1) = 2$, then we will be able to determine the constant $C = 1$ to obtain the particular solution $y = x^2 + 1$.
The graphs of a general solution are a family of curves, called solution curves or
integral curves. Figure 5.1 shows some solution curves for C = 0, ±1, and 3.
\[ \frac{dy}{dx} = f(x, y) \]
The problem
\[ \begin{cases} \dfrac{dy}{dx} = f(x, y), \\ y(a) = b, \end{cases} \quad\text{alternatively written as}\quad \frac{dy}{dx} = f(x, y),\ y(a) = b, \]
is called an initial value problem. The solution to an initial value problem is called a
particular solution.
Theorem 5.2.1 (Existence and uniqueness of solutions). Suppose that the function $f(x, y)$ and its partial derivatives $f_x(x, y)$ and $f_y(x, y)$ are continuous on some region R in the xy-plane that contains the point $(a, b)$ in its interior. Then the initial value problem
\[ \frac{dy}{dx} = f(x, y), \quad y(a) = b \]
has one and only one solution that is defined on an open interval I containing the point a.
\[ \frac{dy}{dx} = f(x, y) \quad\text{or}\quad F(x, y, y') = 0, \]
we may not be able to see its solution at first sight, for example,
\[ \frac{dy}{dx} = x^2 + y^2 \quad\text{or}\quad \frac{dy}{dx} = x^3 - 2xy. \]
However, we do know, at each point, the derivative of the unknown function $y(x)$. The derivative is the slope of the tangent line at that point, so we can sketch a small line segment to indicate its tangent at some points. For example, for the differential equation $\frac{dy}{dx} = x^2 + y^2$, we can compute the derivative at points (0, 0), (1, 1), and (2, 1) to obtain $y'(0) = 0$, $y'(1) = 2$, and $y'(2) = 5$. Thus, we can sketch a diagram as in Figure 5.2(a). This type of diagram is called a direction field or slope field of the differential equation. With the help of a computer algebra system, we have the direction fields for the above two differential equations as shown in Figure 5.2(b) and (c). If we know an additional condition, $y(x_0) = y_0$, then we are able to sketch a solution curve that passes through the point $(x_0, y_0)$. Several particular solution curves for $y' = x^3 - 2xy$ are shown in Figure 5.3.
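The slopes quoted above, and a rough solution curve through a given point, can be computed with a few lines of code. This is a minimal sketch: the function names are illustrative, and Euler's method is used here only as a simple stand-in for following the direction field, not as the method behind the book's figures.

```python
def f1(x, y):
    """Slope function of dy/dx = x^2 + y^2."""
    return x**2 + y**2

def f2(x, y):
    """Slope function of dy/dx = x^3 - 2xy."""
    return x**3 - 2 * x * y

def slope_field(f, xs, ys):
    """Slope of the tangent segment at every grid point (x, y)."""
    return {(x, y): f(x, y) for x in xs for y in ys}

def euler_curve(f, x0, y0, x_end, steps=2000):
    """Follow the direction field from (x0, y0) to x_end with Euler steps."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(steps):
        y += h * f(x, y)   # move along the tangent segment
        x += h
        points.append((x, y))
    return points

slopes = [f1(0, 0), f1(1, 1), f1(2, 1)]      # slopes at (0,0), (1,1), (2,1)
field = slope_field(f2, [0, 1, 2], [0, 1])   # a tiny grid for the second equation
curve = euler_curve(f2, 0.0, 1.0, 1.0)       # approximate curve through (0, 1)
print(slopes, curve[-1])
```

Plotting short segments with the slopes in `field`, and the points in `curve`, would reproduce the kind of picture described for Figures 5.2 and 5.3.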
\[ \frac{dy}{dx} = f(x, y) = h(x)g(y), \]
\[ \frac{dy}{g(y)} = h(x)\,dx. \]
Then, we can integrate both sides separately, the left side as a function of y and the right side as a function of x, i. e.,
\[ \int \frac{1}{g(y)}\,dy = \int h(x)\,dx. \]
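\[ \frac{dP}{dt} = kP, \quad\text{where } k \text{ is a nonzero constant}. \]
\[ \frac{1}{P}\,dP = k\,dt. \]
We integrate
\[ \int \frac{1}{P}\,dP = \int k\,dt \]
to get
\[ \ln|P| = kt + C_1. \]
Simplifying this we obtain $P = \pm e^{kt+C_1} = \pm e^{C_1}e^{kt}$. Note that $\pm e^{C_1}$ is an arbitrary constant other than 0, but $P = 0$ is also a solution, so we can write this as
\[ P = Ce^{kt}, \quad\text{where } C \text{ is an arbitrary constant}. \]
\[ P_0/2 = P_0e^{kt}, \]
\[ \ln\frac{1}{2} = kt, \]
\[ t = \frac{-\ln 2}{k}, \quad\text{for } k < 0. \]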
Note. The radioactive half-life for a given radioisotope is given by the above formula.
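The half-life formula can be checked directly: at $t = -\ln 2/k$ the quantity $P_0e^{kt}$ should be exactly half of $P_0$. In the sketch below the decay constant and the initial amount are hypothetical values chosen only for illustration.

```python
import math

def half_life(k):
    """Half-life t = -ln(2)/k of the decay model dP/dt = kP (requires k < 0)."""
    if k >= 0:
        raise ValueError("a half-life exists only for k < 0")
    return -math.log(2) / k

def amount(P0, k, t):
    """Solution P(t) = P0 * exp(k t) with P(0) = P0."""
    return P0 * math.exp(k * t)

k = -0.25       # hypothetical decay constant
P0 = 100.0      # hypothetical initial amount
t_half = half_life(k)
print(t_half, amount(P0, k, t_half))  # the second value should be P0 / 2
```

Note that the half-life depends only on k, not on the initial amount.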
Example 5.2.3. Suppose a curve y = y(x) has a derivative 2x(y 2 + 1) at each point (x, y).
1. Find any such curve.
2. Furthermore, if the curve passes through the point (0, 1), then find this particular curve.
Solution.
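1. The curve $y = y(x)$ satisfies the differential equation
\[ \frac{dy}{dx} = 2x(y^2 + 1). \]
\[ \frac{1}{y^2+1}\,dy = 2x\,dx. \]
Then, we integrate
\[ \int \frac{1}{y^2+1}\,dy = \int 2x\,dx \]
to get $\tan^{-1}y = x^2 + C$, that is, $y = \tan(x^2 + C)$.
2. Since the curve passes through the point (0, 1), we have $\tan C = 1$, so we may take $C = \pi/4$ and
\[ y = \tan\Bigl(x^2 + \frac{\pi}{4}\Bigr). \]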
A homogeneous first-order differential equation $\frac{dy}{dx} = f(x, y)$ is one where $f(x, y)$ can be rewritten as a function of $\frac{y}{x}$, i. e.,
\[ \frac{dy}{dx} = F\Bigl(\frac{y}{x}\Bigr). \tag{5.1} \]
This typically happens when $f(x, y)$ is made up of polynomial terms in x and y where the exponents of x and y add to the same value for all such terms.
If $\frac{dy}{dx} = F(\frac{y}{x})$ and we make the substitution
\[ v = \frac{y}{x} \quad\text{so that}\quad y = vx \quad\text{and}\quad \frac{dy}{dx} = v + x\frac{dv}{dx}, \]
then the differential equation (5.1) is transformed into a separable equation with independent variable x and dependent variable v, i. e.,
\[ v + x\frac{dv}{dx} = F(v), \]
\[ x\frac{dv}{dx} = F(v) - v. \]
So we may solve the differential equation by using separation of variables.
\[ 2xy\frac{dy}{dx} = 4x^2 + 3y^2. \]
Solution. This is a homogeneous equation (the degree of each term is the same – two in this case). We rewrite it as
\[ \frac{dy}{dx} = \frac{4x^2+3y^2}{2xy} = 2\Bigl(\frac{x}{y}\Bigr) + \frac{3}{2}\Bigl(\frac{y}{x}\Bigr). \]
Then, the substitution $y = vx$, $\frac{dy}{dx} = v + x\frac{dv}{dx}$, transforms this to
\[ v + x\frac{dv}{dx} = \frac{2}{v} + \frac{3}{2}v, \]
\[ x\frac{dv}{dx} = \frac{4+v^2}{2v}, \]
and, hence,
\[ \int \frac{2v}{v^2+4}\,dv = \int \frac{1}{x}\,dx, \]
\[ \ln(v^2+4) = \ln|x| + C_1. \]
Thus,
\[ v^2 + 4 = |x|e^{C_1}, \]
\[ \frac{y^2}{x^2} + 4 = Cx, \quad\text{where } C = \pm e^{C_1}, \]
\[ y^2 + 4x^2 = Cx^3. \]
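One way to test the implicit solution $y^2 + 4x^2 = Cx^3$ is to integrate the original equation numerically and watch whether the value of C stays fixed along the computed curve. The sketch below does this with a classical Runge–Kutta step; the starting point (1, 1) and the helper names are assumptions made for illustration.

```python
def f(x, y):
    """Right-hand side of dy/dx = (4x^2 + 3y^2) / (2xy)."""
    return (4 * x * x + 3 * y * y) / (2 * x * y)

def rk4(f, x0, y0, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration from x0 to x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def constant_C(x, y):
    """Solve y^2 + 4x^2 = C x^3 for C."""
    return (y * y + 4 * x * x) / x**3

x0, y0 = 1.0, 1.0                 # hypothetical starting point
C_start = constant_C(x0, y0)      # value of C fixed by the starting point
y_end = rk4(f, x0, y0, 2.0)       # numerical solution at x = 2
C_end = constant_C(2.0, y_end)    # should agree with C_start
print(C_start, C_end)
```

If the general solution is right, `C_start` and `C_end` agree up to the integration error.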
\[ \frac{dy}{dx} = (x + y + 3)^2. \]
With the substitution $v = x + y + 3$, we have
\[ \frac{dy}{dx} = \frac{dv}{dx} - 1. \]
So the transformed equation is
\[ \frac{dv}{dx} = 1 + v^2. \]
This is a separable equation and
\[ \int \frac{dv}{1+v^2} = \int dx, \]
\[ \tan^{-1}v = x + C, \]
\[ v = \tan(x + C). \]
So
\[ y(x) = \tan(x + C) - x - 3. \]
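\[ f(x, y) + g(x, y)\frac{dy}{dx} = 0 \quad\text{or}\quad \frac{dy}{dx} = -\frac{f(x, y)}{g(x, y)}, \]
\[ \frac{\partial\varphi(x, y)}{\partial x} = f(x, y) \quad\text{and}\quad \frac{\partial\varphi(x, y)}{\partial y} = g(x, y), \]
then
\[ f(x, y)\,dx + g(x, y)\,dy = \frac{\partial\varphi(x, y)}{\partial x}\,dx + \frac{\partial\varphi(x, y)}{\partial y}\,dy = d\varphi(x, y) = 0. \]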
This means that φ(x, y) = C, which is a general solution of the differential equation
f (x, y)dx + g(x, y)dy = 0.
Equations of this type are called exact differential equations. In Chapter 4, we
showed the following result holds.
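Theorem 5.2.2 (Criterion for exactness). Suppose that the functions $f(x, y)$ and $g(x, y)$ are continuous and have continuous first-order partial derivatives in the simply connected region D. Then the ODE $f(x, y)\,dx + g(x, y)\,dy = 0$ is exact in D if and only if
\[ \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}. \]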
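Example 5.2.6. Solve the differential equation $\dfrac{dy}{dx} = \dfrac{y^2 - 2x + 3}{y - 2xy}$.
Solution. We rewrite the equation as $(y^2 - 2x + 3)\,dx + (2xy - y)\,dy = 0$. Since
\[ \frac{\partial(y^2 - 2x + 3)}{\partial y} = 2y = \frac{\partial(2xy - y)}{\partial x}, \]
this is an exact differential equation. Assume $d\varphi(x, y) = (y^2 - 2x + 3)\,dx + (2xy - y)\,dy$. Then
\[ \varphi(x, y) = \int (y^2 - 2x + 3)\,dx = y^2x - x^2 + 3x + h(y), \]
\[ \frac{\partial\varphi}{\partial y} = 2xy + 0 + h'(y). \]
But $\frac{\partial\varphi}{\partial y} = 2xy - y$, so $h'(y) = -y$ and one such function is $h(y) = -\frac{y^2}{2}$. Therefore,
\[ \varphi(x, y) = y^2x - x^2 + 3x - \frac{y^2}{2}, \]
and
\[ \varphi(x, y) = C \quad\text{or}\quad y^2x - x^2 + 3x - \frac{y^2}{2} = C \]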
is a general solution to the original differential equation. Figure 5.4 shows the direction
field and several solution curves.
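The exactness test and the potential function of Example 5.2.6 can be verified numerically by comparing central-difference partial derivatives. The sample points and the helper `partial` below are illustrative choices, not part of the text.

```python
def f(x, y):
    """Coefficient of dx: y^2 - 2x + 3."""
    return y * y - 2 * x + 3

def g(x, y):
    """Coefficient of dy: 2xy - y."""
    return 2 * x * y - y

def phi(x, y):
    """Potential found in the example: y^2 x - x^2 + 3x - y^2/2."""
    return y * y * x - x * x + 3 * x - y * y / 2

def partial(func, x, y, var, h=1e-5):
    """Central-difference partial derivative of func with respect to 'x' or 'y'."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

points = [(0.5, -1.0), (1.0, 2.0), (-2.0, 0.3)]   # arbitrary sample points
max_error = 0.0
for x, y in points:
    exactness_gap = partial(f, x, y, "y") - partial(g, x, y, "x")   # f_y - g_x
    phi_x_gap = partial(phi, x, y, "x") - f(x, y)                   # phi_x - f
    phi_y_gap = partial(phi, x, y, "y") - g(x, y)                   # phi_y - g
    max_error = max(max_error, abs(exactness_gap), abs(phi_x_gap), abs(phi_y_gap))
print(max_error)  # should be close to zero
```

All three gaps vanish up to rounding, which confirms both the exactness criterion and the computed φ.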
\[ a(x)y' + b(x)y = c(x) \quad\text{or}\quad a(x)\frac{dy}{dx} + b(x)y = c(x). \tag{5.3} \]
The term “linear” refers to the fact that the “y” terms appear as $y$ and $\frac{dy}{dx}$, but are not raised to any power or combined in some other function, whereas $a(x)$, $b(x)$, and $c(x)$ are allowed to be nonlinear functions. Thus, in the differential equations
\[ \frac{dy}{dx} + \frac{b(x)}{a(x)}y = \frac{c(x)}{a(x)}. \tag{5.4} \]
Let $P(x) = \frac{b(x)}{a(x)}$ and $Q(x) = \frac{c(x)}{a(x)}$. Then equation (5.4) becomes
\[ \frac{dy}{dx} + P(x)y = Q(x). \tag{5.5} \]
There is a nice technique for solving equation (5.5). Suppose there exists a function $\rho(x)$ such that multiplying both sides of the equation $\frac{dy}{dx} + P(x)y = Q(x)$ by $\rho(x)$ transforms the left-hand side into the derivative of the product $\rho(x)y$. Such a function $\rho(x)$ is called an integrating factor. Then,
\[ \rho(x)\frac{dy}{dx} + P(x)\rho(x)y = \rho(x)Q(x), \]
\[ \frac{d}{dx}\bigl(\rho(x)y\bigr) = \rho(x)Q(x). \tag{5.6} \]
We integrate both sides to obtain
\[ \rho(x)y = \int \rho(x)Q(x)\,dx. \]
\[ y = \frac{1}{\rho(x)}\int \rho(x)Q(x)\,dx. \tag{5.7} \]
Now the question left is, how do we find $\rho(x)$? Applying the product rule on the left-hand side of equation (5.6) gives
\[ \rho(x)\frac{dy}{dx} + y\frac{d\rho(x)}{dx} = \rho(x)Q(x). \]
Comparing this with the original equation multiplied by $\rho(x)$ shows that
\[ \frac{d\rho(x)}{dx} = P(x)\rho(x). \]
This is a separable equation, and solving it gives the integrating factor $\rho(x) = e^{\int P(x)\,dx}$. Substituting $\rho(x)$ in equation (5.7), we obtain a general solution to equation (5.5), i. e.,
\[ y = e^{-\int P(x)\,dx}\Bigl(\int Q(x)e^{\int P(x)\,dx}\,dx + C\Bigr). \tag{5.8} \]
Note. One can easily check that $y(x) = Ce^{-\int P(x)\,dx}$ is a general solution of the first-order linear homogeneous equation (right-hand side function $Q(x) = 0$)
\[ \frac{dy}{dx} + P(x)y = 0. \]
Note that the function $y^* = e^{-\int P(x)\,dx}\bigl(\int e^{\int P(x)\,dx}Q(x)\,dx\bigr)$ is a particular solution to $\frac{dy}{dx} + P(x)y = Q(x)$. Thus, a general solution to $\frac{dy}{dx} + P(x)y = Q(x)$ can be written as
a general solution of $\frac{dy}{dx} + P(x)y = 0$ + a particular solution of $\frac{dy}{dx} + P(x)y = Q(x)$.
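Example 5.2.7. Solve the first-order linear ODE $\frac{dy}{dx} = x^3 - 2xy$.
\[ y = e^{-\int 2x\,dx}\Bigl(\int e^{\int 2x\,dx}x^3\,dx + C\Bigr) = e^{-x^2}\Bigl(\int e^{x^2}x^3\,dx + C\Bigr) \]
\[ = e^{-x^2}\Bigl(\frac{1}{2}\int x^2e^{x^2}\,d(x^2) + C\Bigr) \]
\[ = e^{-x^2}\Bigl(\frac{1}{2}x^2e^{x^2} - \frac{1}{2}e^{x^2} + C\Bigr) \]
\[ = \frac{x^2}{2} - \frac{1}{2} + Ce^{-x^2}. \]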
Figure 5.3 shows the direction field and several solution curves to this ODE.
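The integrating-factor recipe can be carried out numerically and compared with the closed form of Example 5.2.7. The sketch below uses definite integrals starting at $x_0 = 0$ and Simpson's rule; both are choices made here for illustration, and the initial value $y(0) = 1$ is hypothetical.

```python
import math

def P(x):
    return 2 * x       # from dy/dx + 2xy = x^3

def Q(x):
    return x**3

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve_linear(x, y0, x0=0.0):
    """y(x) for y' + P y = Q with y(x0) = y0, using rho(s) = exp(integral of P from x0 to s)."""
    def rho(s):
        return math.exp(simpson(P, x0, s))
    integral = simpson(lambda s: rho(s) * Q(s), x0, x)
    return (y0 + integral) / rho(x)

def closed_form(x, C):
    """General solution from the example: x^2/2 - 1/2 + C exp(-x^2)."""
    return x * x / 2 - 0.5 + C * math.exp(-x * x)

y0 = 1.0              # hypothetical initial value y(0) = 1
C = y0 + 0.5          # constant fixed by y(0) = y0 in the closed form
numeric = solve_linear(1.0, y0)
exact = closed_form(1.0, C)
print(numeric, exact)
```

The two values agree closely, which is a useful check on both the integrating factor and the integration by parts in the example.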
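\[ \frac{dy}{dx} - y = 2e^{-x/3}, \quad y(0) = -1. \]
Solution. This is a first-order linear differential equation with $P(x) = -1$ and $Q(x) = 2e^{-x/3}$, so a general solution is
\[ y = e^{\int dx}\Bigl(\int 2e^{-x/3}e^{-\int dx}\,dx + C\Bigr) = e^{x}\Bigl(\int 2e^{-\frac{4}{3}x}\,dx + C\Bigr) = e^{x}\Bigl(-\frac{3}{2}e^{-\frac{4}{3}x} + C\Bigr). \]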
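Example 5.2.9. Solve the differential equation $y\,dx + (y^3 - x)\,dy = 0$ (assume that $y > 0$).
\[ \frac{dy}{dx} + \frac{y}{y^3 - x} = 0, \]
\[ \frac{dx}{dy} + \frac{y^3 - x}{y} = 0, \]
then
\[ \frac{dx}{dy} - \frac{1}{y}x = -y^2. \]
\[ x = e^{\int \frac{1}{y}\,dy}\Bigl(\int -y^2e^{-\int \frac{1}{y}\,dy}\,dy + C_1\Bigr) = e^{\ln y}\Bigl(-\int y^2e^{-\ln y}\,dy + C_1\Bigr) \]
\[ = y\Bigl(-\int y^2\times\frac{1}{y}\,dy + C_1\Bigr) \]
\[ = y\Bigl(-\frac{y^2}{2} + C_1\Bigr). \]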
A general solution is, therefore, 2x = −y3 + Cy (we replaced the constant 2C1 by C).
Bernoulli equations
A first-order differential equation of the form
\[ \frac{dy}{dx} + P(x)y = Q(x)y^n \tag{5.9} \]
Remark. This type of equation was named after Jacob Bernoulli, who was one of the
many prominent mathematicians in the Bernoulli family.
\[ v = y^{1-n} \]
\[ \frac{dv}{dx} + (1-n)P(x)v = (1-n)Q(x). \]
Rather than memorizing the form of this transformed equation, it is more efficient to
make the substitution explicitly, after dividing both sides by yn , as in the following
example.
\[ 2xy' = 3y + 4x^2y^3. \]
\[ \frac{dy}{dx} - \frac{3}{2x}y = 2xy^3. \]
We see that this is a Bernoulli equation with $P(x) = -\frac{3}{2x}$ and $Q(x) = 2x$ with $n = 3$. We divide the equation by $y^3$ to obtain
\[ y^{-3}\frac{dy}{dx} - \frac{3}{2x}y^{-2} = 2x. \]
Note that the first term $y^{-3}\frac{dy}{dx}$ is exactly $-\frac{1}{2}\frac{d(y^{-2})}{dx}$. Hence, we let $v = y^{-2}$, and the above equation becomes linear, i. e.,
\[ -\frac{1}{2}\frac{d(y^{-2})}{dx} - \frac{3}{2x}y^{-2} = 2x, \]
\[ \frac{dv}{dx} - \frac{3\times(-2)}{2x}v = 2\times(-2)x, \]
\[ \frac{dv}{dx} + \frac{3}{x}v = -4x. \]
A general solution is
\[ v = e^{-\int \frac{3}{x}\,dx}\Bigl(\int -4xe^{\int \frac{3}{x}\,dx}\,dx + C\Bigr) \]
\[ = x^{-3}\Bigl(\int -4x\times x^3\,dx + C\Bigr) \]
\[ = x^{-3}\Bigl(-\frac{4}{5}x^5 + C\Bigr), \]
\[ v = -\frac{4}{5}x^2 + \frac{C}{x^3}. \]
Since $v = y^{-2}$, a general solution to the original differential equation is
\[ y^{-2} = -\frac{4}{5}x^2 + \frac{C}{x^3}. \]
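A quick numerical check of this Bernoulli example is to substitute the solution back into both the original equation and the linear equation for v, using finite differences for the derivatives. The constant C = 2 and the sample points below are hypothetical values, chosen so that v stays positive.

```python
C = 2.0   # hypothetical value of the arbitrary constant

def v(x):
    """v = y^(-2) = -(4/5) x^2 + C / x^3."""
    return -0.8 * x * x + C / x**3

def y(x):
    """Positive branch y = v^(-1/2) of the general solution."""
    return v(x) ** -0.5

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def original_residual(x):
    """2x y' - (3y + 4x^2 y^3), which should vanish."""
    return 2 * x * derivative(y, x) - (3 * y(x) + 4 * x * x * y(x) ** 3)

def linear_residual(x):
    """v' + (3/x) v + 4x, which should vanish."""
    return derivative(v, x) + 3 * v(x) / x + 4 * x

xs = [0.5, 0.75, 1.0]   # sample points where v(x) > 0
worst = max(max(abs(original_residual(x)), abs(linear_residual(x))) for x in xs)
print(worst)  # should be close to zero
```

Both residuals are zero up to finite-difference error, so the substitution $v = y^{-2}$ and the final formula are consistent.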
Differential equations of higher order appear in many applications in science and en-
gineering. For example, the well-known simple harmonic motion equation
\[ \frac{d^2x}{dt^2} = -\omega^2x \]
is a second-order differential equation. A second-order differential equation involves the second derivative of an unknown function $y(x)$. Thus, it has the general form

\[ y' = \int f(x)\,dx + C_1, \]
Note. This is equivalent to solving two first-order ODEs $\frac{dP}{dx} = f(x)$ and $\frac{dy}{dx} = P$.
\[ y'' = x + 1. \]
\[ y' = \int (x+1)\,dx = \frac{x^2}{2} + x + C_1. \]
\[ y = \int \Bigl(\frac{x^2}{2} + x + C_1\Bigr)\,dx = \frac{x^3}{6} + \frac{x^2}{2} + C_1x + C_2. \]
The substitution
\[ y' = p, \quad y'' = \frac{dp}{dx} \]
\[ F(x, p, p') = 0. \]
This gives us a solution of equation (5.11) that involves two arbitrary constants C1 and
C2 , as is to be expected in the case of a second-order differential equation.
Example 5.3.2. Solve the equation $xy'' + 2y' = 6x$, in which the dependent variable y is missing. Determine the particular solution if $y(1) = 2$ and $y'(1) = 1$.
Solution. Let $y' = p$, so that $y'' = \frac{dp}{dx}$. The substitution defined above gives the first-order equation
\[ x\frac{dp}{dx} + 2p = 6x, \quad\text{that is,}\quad \frac{dp}{dx} + \frac{2}{x}p = 6. \]
This is linear (in p). So, solving it by the method given by equation (5.8) gives
\[ p(x) = 2x + \frac{C_1}{x^2}. \]
\[ y(x) = \int p(x)\,dx = \int \Bigl(2x + \frac{C_1}{x^2}\Bigr)\,dx = x^2 - \frac{C_1}{x} + C_2. \]
Since $y'(1) = 1$, this means $p(1) = 1 = 2 + \frac{C_1}{1^2}$, so $C_1 = -1$. Since $y(1) = 2$, we have $2 = 1^2 - \frac{-1}{1} + C_2$, so $C_2 = 0$.
Thus, the particular solution is
\[ y(x) = x^2 + \frac{1}{x}. \]
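The substitution $y' = p$ turns the second-order equation into a pair of first-order equations, which is also how such a problem is usually handed to a numerical solver. The sketch below integrates that pair for Example 5.3.2 with a classical Runge–Kutta step and compares the result with $y = x^2 + 1/x$; the step count and function names are illustrative.

```python
def rhs(x, y, p):
    """x y'' + 2 y' = 6x rewritten as the system y' = p, p' = 6 - 2p/x."""
    return p, 6 - 2 * p / x

def rk4_system(x0, y0, p0, x_end, steps=1000):
    """Classical RK4 for the pair (y, p), starting from y(x0) = y0, p(x0) = p0."""
    h = (x_end - x0) / steps
    x, y, p = x0, y0, p0
    for _ in range(steps):
        k1y, k1p = rhs(x, y, p)
        k2y, k2p = rhs(x + h / 2, y + h * k1y / 2, p + h * k1p / 2)
        k3y, k3p = rhs(x + h / 2, y + h * k2y / 2, p + h * k2p / 2)
        k4y, k4p = rhs(x + h, y + h * k3y, p + h * k3p)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        x += h
    return y, p

def exact(x):
    """Particular solution found in the example."""
    return x * x + 1 / x

y_num, p_num = rk4_system(1.0, 2.0, 1.0, 2.0)   # initial data y(1) = 2, y'(1) = 1
print(y_num, exact(2.0))                        # both should be close to 4.5
```

The computed slope `p_num` can likewise be compared with $y'(2) = 2\cdot 2 - 1/2^2 = 3.75$.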
The substitution
\[ p = y', \quad y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy} \]
\[ F\Bigl(y, p, p\frac{dp}{dy}\Bigr) = 0. \]
If we can solve this equation for a general solution $p(y, C_1)$ involving an arbitrary constant $C_1$, then (assuming that $y' \ne 0$) we can find a solution of the original equation, with x as a function of y, by solving the first-order ODE $\frac{dy}{dx} = p(y, C_1)$ as follows:
\[ p(y, C_1) = \frac{dy}{dx}, \quad\text{so that}\quad dx = \frac{1}{p(y, C_1)}\,dy, \]
\[ x = \int \frac{1}{p(y, C_1)}\,dy + C_2. \]
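Example 5.3.3. Solve the initial value problem $yy'' = (y')^2$, in which the independent variable x is missing, with the initial conditions $y(0) = 2$ and $y'(0) = 1$.
Solution. We substitute
\[ y' = p \quad\text{and}\quad y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}. \]
\[ yp\frac{dp}{dy} = p^2. \]
Separating variables gives $\frac{dp}{p} = \frac{dy}{y}$, so $p = C_1y$. The initial condition $y'(0) = 1$ when $y(0) = 2$ gives $C_1 = \frac{1}{2}$. Hence, integrating again we obtain
\[ dx = \frac{2}{y}\,dy, \]
\[ x = 2\ln|y| + C_2. \]
The condition $y(0) = 2$ gives $C_2 = -2\ln 2$, so
\[ y = 2\sqrt{e^x}. \]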
Early in this chapter, we derived the following second-order ODE for a mass-spring system if the resistance is proportional to the mass’s velocity:
\[ mx'' - lx' + kx = 0. \]
If, furthermore, there is an external force $f(t)$ acting on the mass, we will have
\[ mx'' - lx' + kx = f(t). \]
\[ a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = f(x). \tag{5.13} \]
The term “linear” applies to $y$, $y'$, and $y''$, and it means that they appear in separate terms of the ODE without an exponent (other than one) and are not part of another function (such as $\sqrt{1+y}$). The functions $a(x)$, $b(x)$, $c(x)$, and $f(x)$ are allowed to be nonlinear. We also assume that $a(x) \ne 0$.
If in addition $f(x) = 0$ for all x in the above equation, then the differential equation is called a homogeneous linear equation:
\[ a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = 0. \tag{5.14} \]
The ODE
are not linear because they contain products and powers of y or its derivative.
The second-order ODE
\[ x^2y'' + 2xy' + 3y = 0. \]
Theorem 5.3.1 (Principle of superposition for homogeneous equations). Let $y_1(x)$ and $y_2(x)$ be two solutions of the homogeneous linear equation (5.14), $a(x)y'' + b(x)y' + c(x)y = 0$, defined on the interval I. If $C_1$ and $C_2$ are constants, then the linear combination
\[ y = C_1y_1(x) + C_2y_2(x) \]
is also a solution of equation (5.14) on I.
Thus, if we can find two particular solutions to equation (5.14), and they are lin-
early independent, then the linear combination of the two particular solutions gives
a general solution to equation (5.14). The definition of two linearly independent func-
tions is given below.
Definition 5.3.1 (Linear independence of two functions). Two functions f and g defined on an open interval I are said to be linearly independent on I provided that neither is a constant multiple of the other (alternatively, neither of the two functions $\frac{f}{g}$ or $\frac{g}{f}$ is a constant-valued function on I).
For example, ex and e2x are two linearly independent functions, while e2x and 2e2x are
linearly dependent. By the superposition theorem, we have the following theorem.
Theorem 5.3.2 (General solution of second-order homogeneous linear ODEs). Let $y_1$ and $y_2$ be two linearly independent solutions of the homogeneous linear differential equation
\[ a(x)y'' + b(x)y' + c(x)y = 0, \]
where $a(x)$ ($\ne 0$), $b(x)$, and $c(x)$ are continuous on some interval I. Then, a general solution is
\[ y(x) = C_1y_1(x) + C_2y_2(x). \]
For the equation
\[ y'' - 4y = 0, \]
we can verify that $y_1(x) = e^{2x}$ and $y_2(x) = e^{-2x}$ are two solutions, and $\frac{y_1}{y_2} = e^{4x}$ is not a constant. Therefore, $y_1$ and $y_2$ are two linearly independent solutions. So, a general solution is $y(x) = C_1e^{2x} + C_2e^{-2x}$.
However, erx is never zero, so we can divide this out of the equation. We conclude
that y(x) = erx will satisfy the homogeneous linear differential equation (5.16) with
constant coefficients precisely when r is a root of the algebraic equation
\[ ar^2 + br + c = 0. \tag{5.17} \]
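Theorem 5.3.3 (Homogeneous linear ODEs – distinct real roots). If $r_1$ and $r_2$ are real and distinct roots of the characteristic equation $ar^2 + br + c = 0$ of the ODE $ay'' + by' + cy = 0$, then
\[ y(x) = C_1e^{r_1x} + C_2e^{r_2x} \]
is a general solution.
\[ 2y'' - 7y' + 3y = 0. \]
The roots of the characteristic equation
\[ 2r^2 - 7r + 3 = 0 \]
are $r_1 = \frac{1}{2}$ and $r_2 = 3$. So, a general solution is
\[ y(x) = C_1e^{\frac{1}{2}x} + C_2e^{3x}. \]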
If $r_1 = r_2$, we cannot claim that
\[ y = C_1e^{r_1x} + C_2e^{r_2x} \]
is a general solution since $e^{r_1x} = e^{r_2x}$, and they are not linearly independent. There is, in fact, only one arbitrary constant. To find another solution, we use the method of variation of parameters. We assume a particular solution has the form
\[ y^* = C(x)e^{r_1x}. \]
Then, we will look for such a $C(x)$. We plug $y^*$ into the equation to obtain
\[ a\bigl(C''(x)e^{r_1x} + 2r_1C'(x)e^{r_1x} + r_1^2C(x)e^{r_1x}\bigr) + b\bigl(C'(x)e^{r_1x} + C(x)r_1e^{r_1x}\bigr) + c\bigl(C(x)e^{r_1x}\bigr) = 0. \]
\[ a[C''(x) + 2r_1C'(x) + r_1^2C(x)] + b[C'(x) + r_1C(x)] + cC(x) = 0, \]
\[ aC''(x) + (2ar_1 + b)C'(x) + (ar_1^2 + br_1 + c)C(x) = 0. \]
Since $r_1$ is a double root, $ar_1^2 + br_1 + c = 0$ and $2ar_1 + b = 0$, so
\[ aC''(x) = 0. \]
Therefore, such a $C(x)$ does exist; we can choose a simple one, say, $C(x) = x$ (choosing any $C(x) = kx + l$ with $k \ne 0$ also works). Then, we have another particular solution $y = xe^{r_1x}$ which is linearly independent of $y = e^{r_1x}$. So, we have the following theorem.
Theorem 5.3.4 (Homogeneous linear ODEs – repeated roots). If the characteristic equation $ar^2 + br + c = 0$ of the ODE $ay'' + by' + cy = 0$ has only one root r (a double root, or repeated root), then
\[ y(x) = (C_1 + C_2x)e^{rx} \]
is a general solution.
\[ (3r + 2)^2 = 0. \]
\[ y = C_1e^{-2x/3} + C_2xe^{-2x/3}. \]
\[ \begin{cases} y'' + 2y' + y = 0, \\ y(0) = 5, \quad y'(0) = -3. \end{cases} \]
So, the initial conditions substituted into the equations for $y(x)$ and $y'(x)$ give
\[ y(0) = C_1 = 5, \]
\[ y'(0) = -C_1 + C_2 = -3, \]
The third case is when the discriminant of the auxiliary equation, $b^2 - 4ac$, is less than 0. Then the auxiliary equation has two complex roots, which form a conjugate pair $r_{1,2} = \alpha \pm \beta i$. The theory still implies that $e^{(\alpha+\beta i)x}$ and $e^{(\alpha-\beta i)x}$ are particular solutions of the linear ODE, but we would not expect to have complex numbers in the solution of a real problem. The next theorem shows that we can, in fact, find real solutions via these complex solutions.
Proof. For the proof, Euler’s theorem is required, which states that for any θ, we have
eiθ = cos θ + i sin θ. The theory of solutions developed above applies, so we can write
a general solution as
where C1 = A + B and C2 = i(A − B). The solutions are, therefore, real when C1 and C2
are both real.
Note. In fact, without using Euler’s theorem, one can still derive a general solution
by proving that eαx cos βx and eαx sin βx are two linearly independent solutions.
Solution. Since the characteristic equation, $r^2 + \omega^2 = 0$, has roots $r_1 = \omega i$ and $r_2 = -\omega i$,
it follows that a general solution is
Note that if we denote $A = \sqrt{C_1^2 + C_2^2}$, then this general solution could also be written
as
This is the general solution for a simple harmonic motion where A is the amplitude and $\omega$ is the angular velocity. The period is $T = \frac{2\pi}{\omega}$.
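The three cases for $ay'' + by' + cy = 0$ (distinct real roots, a repeated root, and a complex conjugate pair) can be summarized in a small helper that inspects the discriminant. This is a sketch: the function name `classify_roots` and its return format are assumptions made here, and the exact comparison `disc == 0` is only reliable for coefficients that are represented exactly.

```python
import math

def classify_roots(a, b, c):
    """Classify the roots of the characteristic equation a r^2 + b r + c = 0.

    Returns one of
      ("distinct real", r1, r2): y = C1 e^{r1 x} + C2 e^{r2 x}
      ("repeated", r, r):        y = (C1 + C2 x) e^{r x}
      ("complex", alpha, beta):  y = e^{alpha x} (C1 cos(beta x) + C2 sin(beta x))
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        r1, r2 = sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))
        return ("distinct real", r1, r2)
    if disc == 0:
        r = -b / (2 * a)
        return ("repeated", r, r)
    alpha = -b / (2 * a)
    beta = math.sqrt(-disc) / (2 * abs(a))
    return ("complex", alpha, beta)

print(classify_roots(2, -7, 3))    # 2y'' - 7y' + 3y = 0
print(classify_roots(9, 12, 4))    # (3r + 2)^2 = 0
print(classify_roots(1, -2, 5))    # r^2 - 2r + 5 = 0
print(classify_roots(1, 0, 4))     # simple harmonic motion with omega = 2
```

For simple harmonic motion the helper returns α = 0 and β = ω, matching the purely imaginary roots ±ωi.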
\[ r^2 - 2r + 5 = 0 \]
Summary
A general solution of $ay'' + by' + cy = 0$ has one of the following forms:
the results are similar to those that we have obtained for second-order linear ODEs.
The principal difference is that there will be n roots of the auxiliary equation when
the order of the ODE is n. A general solution is a linear combination of n independent
solutions, and each root (or conjugate pair of roots) of the auxiliary equation
\[ r^n + p_1r^{n-1} + \cdots + p_n = 0 \]
where the right-hand side function $f(x)$ is replaced by zero, is called the complementary
equation. A general solution of this equation is called a complementary function. In
cases where equation (5.18) models a physical system, the nonhomogeneous term f (x)
frequently corresponds to some external influence on the system being modeled.
There is a nice connection between the solutions of equation (5.18) and equa-
tion (5.19).
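Theorem 5.3.6 (General solution of nonhomogeneous linear ODEs). A general solution of a nonhomogeneous differential equation
\[ a(x)\frac{d^2y}{dx^2} + b(x)\frac{dy}{dx} + c(x)y = f(x) \]
can be written as
\[ y(x) = y_c(x) + y_p(x), \tag{5.20} \]
where $y_c(x)$ is a complementary function (a general solution of the associated homogeneous equation (5.19)), and $y_p(x)$ is a particular solution of equation (5.18).
Proof. We first show that $y(x) = y_p(x) + y_c(x)$ is a solution of equation (5.18). Substituting into that equation, we obtain
\[ a(x)(y_p + y_c)'' + b(x)(y_p + y_c)' + c(x)(y_p + y_c) \]
\[ = [a(x)y_p''(x) + b(x)y_p'(x) + c(x)y_p(x)] + [a(x)y_c''(x) + b(x)y_c'(x) + c(x)y_c(x)] \]
\[ = f(x) + 0 \]
\[ = f(x). \]
Now we show that any solution of equation (5.18) must be of the form of equation (5.20). If $y^*$ is any particular solution of equation (5.18), then the difference $y^* - y_p$ satisfies the homogeneous equation (5.19), because
\[ a(x)(y^* - y_p)'' + b(x)(y^* - y_p)' + c(x)(y^* - y_p) = f(x) - f(x) = 0. \]
Hence $y^* - y_p$ is a complementary function $y_c$.
So
\[ y^* = y_c + y_p. \]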
\[ ay'' + by' + cy = 0, \]
which have the form C1 y1 + C2 y2 , where y1 and y2 are two linearly independent so-
lutions. If we can find a particular solution yp to the nonhomogeneous second-order
linear differential equation with constant coefficients of the form
y(x) = C1 y1 + C2 y2 + yp .
Solution. The roots of the auxiliary equation $r^2 + r - 2 = 0$ are $r = 1$ and $r = -2$. Hence, a complementary function is
\[ y_c = C_1e^x + C_2e^{-2x}. \]
It seems likely that a polynomial will give a particular solution, because the right-hand side of the differential equation is a polynomial. Since the right-hand side, $2x + 1$, is a polynomial of degree 1, we try $y_p = Ax + B$ of degree 1. Substituting into the given differential equation we have
\[ (Ax + B)'' + (Ax + B)' - 2(Ax + B) = 2x + 1, \]
\[ A - 2Ax - 2B = 2x + 1. \]
However, the polynomial on the left-hand side equals the polynomial on the right-hand side exactly when their coefficients are equal. Thus,
\[ -2A = 2 \quad\text{and}\quad A - 2B = 1. \]
This gives $A = -1$ and $B = -1$. So, $y_p = -x - 1$ is a particular solution. Therefore, a general solution is
\[ y = y_c + y_p = C_1e^x + C_2e^{-2x} - x - 1. \]
Example 5.3.12. Find a particular solution for each of the following differential equations:
\[ 2e^{2x}(A + 2B + 4Ax + 2Bx + 2Ax^2) - 5e^{2x}(B + 2Ax + 2Bx + 2Ax^2) + 6e^{2x}(Bx + Ax^2) = xe^{2x}. \]
\[ 2(A + 2B + 4Ax + 2Bx + 2Ax^2) - 5(B + 2Ax + 2Bx + 2Ax^2) + 6(Bx + Ax^2) = x, \]
\[ 2A - B - 2Ax = x. \]
\[ A = -\frac{1}{2} \quad\text{and}\quad B = -1, \]
so a particular solution is
\[ y_p = x\Bigl(-\frac{1}{2}x - 1\Bigr)e^{2x}. \]
\[ y = C_1e^{2x} + C_2e^{3x} - x\Bigl(\frac{1}{2}x + 1\Bigr)e^{2x}. \]
Since $y' = 2C_1e^{2x} + 3C_2e^{3x} - 2x(\frac{x}{2} + 1)e^{2x} - (x + 1)e^{2x}$, under the conditions $y(0) = 1$ and $y'(0) = 2$, we have
\[ 1 = C_1 + C_2, \]
\[ 2 = 2C_1 + 3C_2 - 1. \]
Hence $C_1 = 0$ and $C_2 = 1$, and
\[ y = e^{3x} - x\Bigl(\frac{x}{2} + 1\Bigr)e^{2x}. \]
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x),$$
when $f(x)$ is of the form $P_m(x)e^{\lambda x}$, where $P_m(x)$ is a degree $m$ polynomial, and $\lambda$ is a constant. Our choice for a particular solution takes the form $y_p(x) = Q(x)e^{\lambda x}$, where $Q(x)$ is a polynomial. We substitute $y = Q(x)e^{\lambda x}$ into the ODE above to obtain
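$$a[Q''(x)e^{\lambda x} + 2\lambda Q'(x)e^{\lambda x} + \lambda^2Q(x)e^{\lambda x}] + b[Q'(x)e^{\lambda x} + \lambda Q(x)e^{\lambda x}] + cQ(x)e^{\lambda x} = P_m(x)e^{\lambda x}.$$
We cancel out the factor $e^{\lambda x}$ from both sides of the equation, resulting in
$$aQ''(x) + (2a\lambda + b)Q'(x) + (a\lambda^2 + b\lambda + c)Q(x) = P_m(x). \tag{5.22}$$
Case 1: $a\lambda^2 + b\lambda + c \neq 0$. That is, $\lambda$ is not a root of the characteristic equation. Then the left-hand side of equation (5.22) has the same degree as $Q(x)$, so $Q(x)$ should be chosen to be of degree $m$.
Case 2: $a\lambda^2 + b\lambda + c = 0$ and $2a\lambda + b \neq 0$. That is, $\lambda$ is a simple root of the characteristic equation. Then, equation (5.22) becomes
$$aQ''(x) + (2a\lambda + b)Q'(x) = P_m(x).$$
This tells us that $Q(x)$ should be chosen to be of degree $m + 1$. That is, in order to use the method of undetermined coefficients, we choose $Q(x)$ to have degree one more than the degree of $P_m(x)$.
Case 3: $a\lambda^2 + b\lambda + c = 0$ and $2a\lambda + b = 0$. That is, $\lambda$ is a root of the characteristic equation of multiplicity 2. Then, equation (5.22) becomes
$$aQ''(x) = P_m(x).$$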
This tells us that Q(x) should be of degree m + 2. That is, in order to use the method of
undetermined coefficients, we choose Q(x) to have degree two more than the degree
of Pm (x).
We summarize the results. If the right-hand side of the ODE is $f(x) = P_m(x)e^{\lambda x}$, then we initially choose $y_p(x) = Q(x)e^{\lambda x}$, where $Q(x)$ is a polynomial of degree $m$. We modify $y_p$ by multiplying it by $x$ if $\lambda$ is a simple root of the auxiliary equation, and by $x^2$ if $\lambda$ is a repeated root of the auxiliary equation. We determine the undetermined coefficients by substituting $y = y_p$ into the differential equation.
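The rule summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `root_multiplicity` and `trial_solution` are hypothetical, the tolerance `tol` is an arbitrary choice for floating-point comparison, and the output is a plain-text description of the trial form rather than a solved particular solution.

```python
def root_multiplicity(a, b, c, lam, tol=1e-12):
    """Multiplicity (0, 1 or 2) of lam as a root of a*r**2 + b*r + c = 0.

    lam is a root when a*lam**2 + b*lam + c = 0, and a repeated root
    when, in addition, the derivative 2*a*lam + b vanishes.
    """
    if abs(a * lam * lam + b * lam + c) > tol:
        return 0
    if abs(2 * a * lam + b) > tol:
        return 1
    return 2


def trial_solution(a, b, c, m, lam):
    """Describe the trial form x^k * Q(x) * exp(lam*x), with deg Q = m."""
    k = root_multiplicity(a, b, c, lam)
    poly = " + ".join(f"A{j}*x^{j}" for j in range(m, -1, -1))
    prefix = {0: "", 1: "x*", 2: "x^2*"}[k]
    return f"{prefix}({poly})*exp({lam}*x)"


# y'' - 3y' + 2y = x e^x: lam = 1 is a simple root, so multiply by x.
print(trial_solution(1, -3, 2, 1, 1))
# r^2 + 6r + 9 = 0 has the double root -3, so multiply by x^2.
print(trial_solution(1, 6, 9, 0, -3))
```

Because `abs` works on complex numbers, the same multiplicity check also applies to a complex exponent such as $\lambda = i$.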
$$r^2 - 2r - 3 = 0.$$
$$y_p = (Ax + B)e^{0x} = Ax + B.$$
$$0 - 2A - 3(Ax + B) = 3x + 1.$$
$$\begin{cases} -3A = 3, \\ -2A - 3B = 1, \end{cases}$$
with solution
$$A = -1 \quad\text{and}\quad B = \frac{1}{3}.$$
$$y'' - 3y' + 2y = xe^{x}.$$
$$y_c = C_1e^{x} + C_2e^{2x}.$$
The right-hand side $xe^{x}$ is a polynomial of degree 1 multiplied by $e^{\lambda x}$ with $\lambda = 1$. Since $\lambda = 1$ is one of the roots of the characteristic equation, the particular solution is chosen to be a polynomial of degree 1, multiplied by $x$, and then by $e^{x}$, i.e.,
$$y_p = x(Ax + B)e^{x}.$$
$$-2Ax + (2A - B) = x.$$
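$$-2A = 1 \quad\text{and}\quad 2A - B = 0.$$
Solving the system of equations gives $A = -1/2$ and $B = -1$. Thus, a particular solution is
$$y_p = x\left(-\frac{1}{2}x - 1\right)e^{x},$$
and a general solution is
$$y = y_c + y_p = C_1e^{x} + C_2e^{2x} - \left(\frac{1}{2}x^2 + x\right)e^{x}.$$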
The right-hand side is $P(x)e^{\lambda x} = 5e^{-3x}$, with a polynomial $P(x) = 5$ of degree 0 and $\lambda = -3$. Since $\lambda = -3$ is a double root of the characteristic equation, a particular solution can be chosen to be an arbitrary polynomial of degree 0 (that is, a constant) multiplied by $x^2$ and then by $e^{-3x}$, i.e.,
$$y_p = Ax^2e^{-3x}.$$
Substituting into the differential equation gives $A = \frac{5}{2}$, so a particular solution is
$$y_p = \frac{5}{2}x^2e^{-3x}.$$
Hence, a general solution is
$$y = y_c + y_p = (C_1 + xC_2)e^{-3x} + \frac{5}{2}x^2e^{-3x}.$$
$$y_c = C_1e^{x} + C_2e^{-x}.$$
To find a particular solution, we note that the right-hand side, $4x\sin x$, is the imaginary part of $4xe^{ix}$, since $e^{ix} = \cos x + i\sin x$. So we consider
$$y'' - y = 4xe^{ix}.$$
If we solve this equation for a particular solution $y_p$, then $y_p$ will be $y_1(x) + iy_2(x)$, where
$$\begin{cases} y_1'' - y_1 = 4x\cos x, \\ (iy_2)'' - iy_2 = 4x(i\sin x). \end{cases}$$
The real part of $y_p$, $y_1(x)$, must be a particular solution of $y'' - y = 4x\cos x$, and its imaginary part, $y_2(x)$, must be a particular solution of $y'' - y = 4x\sin x$.
Since $\lambda = i$ is not a root of the characteristic equation, we use the modified right-hand side and choose a particular solution that is a polynomial of degree 1 multiplied by $e^{ix}$, i.e.,
$$y_p = (Ax + B)e^{ix}.$$
Substituting this into $y'' - y = 4xe^{ix}$, simplifying, and dividing by $e^{ix}$ leads to
$$2Ai - 2(Ax + B) = 4x,$$
so $A = -2$ and $B = Ai = -2i$. Hence,
$$\begin{aligned}
y_p &= (-2x - 2i)e^{ix} \\
&= (-2x - 2i)(\cos x + i\sin x) \\
&= -2(x\cos x - \sin x) - 2(x\sin x + \cos x)i.
\end{aligned}$$
The original right-hand side is the imaginary part of $4xe^{ix}$, so we take the imaginary part of $y_p$ to get a particular solution of the original problem:
$$y_2 = -2(x\sin x + \cos x).$$
Note. This example shows a special case of a method for finding a particular solution of equation (5.21) when the right-hand side is of the form $f(x) = e^{\lambda x}P(x)\cos mx$ or $f(x) = e^{\lambda x}P(x)\sin mx$. The function $f(x)$ is replaced by the function $g(x) = e^{(\lambda + mi)x}P(x)$ (of which $f(x)$ is the real or imaginary part). The particular solution $y_p$ of the new ODE, with $g(x)$ on the right-hand side, can be found using the methods developed before. The real or imaginary part of $y_p$ is a particular solution of the original problem.
Some books give an alternative procedure, using only real-valued functions, where the trial solution (particular solution) is taken to be of the form
$$y_p = e^{\lambda x}[Q_1(x)\cos mx + Q_2(x)\sin mx],$$
where $Q_1(x)$ and $Q_2(x)$ are polynomials with unknown coefficients and of the same degree as $P(x)$, but multiplied by $x$ if $\lambda + mi$ is a root of the corresponding auxiliary equation.
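The complex-exponential approach in the preceding example can be checked numerically with Python's built-in complex numbers. The sketch below assumes the example's equation is $y'' - y = 4x\sin x$ and uses $y_p = (-2x - 2i)e^{ix}$ as the complex particular solution. The helper names and the finite-difference step `h` are illustrative choices, not part of the text.

```python
import cmath
import math


def yp_complex(x):
    """Complex particular solution of y'' - y = 4x e^{ix}."""
    return (-2 * x - 2j) * cmath.exp(1j * x)


def y2(x):
    """Imaginary part of yp_complex: candidate solution for 4x sin x."""
    return yp_complex(x).imag


def second_derivative(g, x, h=1e-4):
    """Central finite-difference approximation of g''(x)."""
    return (g(x + h) - 2 * g(x) + g(x - h)) / (h * h)


for x in (0.3, 1.0, 2.5):
    # The imaginary part should match -2(x sin x + cos x) ...
    closed_form = -2 * (x * math.sin(x) + math.cos(x))
    # ... and should satisfy y'' - y = 4x sin x up to discretization error.
    residual = second_derivative(y2, x) - y2(x) - 4 * x * math.sin(x)
    print(x, abs(y2(x) - closed_form) < 1e-12, abs(residual) < 1e-5)
```

The derivative is approximated by finite differences rather than computed symbolically, so the residual is only expected to be small, not exactly zero.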
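$$y_c = C_1e^{x} + C_2e^{-x}.$$
In order to find a particular solution, we separately find particular solutions of the two equations, and then add them, so we have
$$y'' - y = 3e^{2x},$$
$$y'' - y = 4x\sin x.$$
The first equation has the particular solution
$$y_1 = e^{2x},$$
and the second has the particular solution found in the previous example,
$$y_2 = -2(x\sin x + \cos x).$$
We now add them to give a particular solution for the original equation in this example:
$$y_p = e^{2x} - 2(x\sin x + \cos x).$$
So a general solution is
$$y = C_1e^{x} + C_2e^{-x} + e^{2x} - 2(x\sin x + \cos x).$$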
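The method of undetermined coefficients is often useful to solve problems when $f(x) = P_m(x)e^{\lambda x}$. We now introduce the method of variation of parameters, which is another way to find a particular solution to the differential equation
$$ay'' + by' + cy = f(x).$$
Now we look for a particular solution to $ay'' + by' + cy = f(x)$ of the form
$$y_p = u_1(x)y_1(x) + u_2(x)y_2(x),$$
where $y_1$ and $y_2$ are two linearly independent solutions of the corresponding homogeneous equation. Then,
$$y_p' = u_1'y_1 + u_1y_1' + u_2'y_2 + u_2y_2'.$$
To solve for $u_1(x)$ and $u_2(x)$, we need two equations. We already have the condition that $y_p$ is a particular solution, but we need an extra one. Let us impose
$$u_1'y_1 + u_2'y_2 = 0.$$
Then $y_p' = u_1y_1' + u_2y_2'$ and $y_p'' = u_1'y_1' + u_1y_1'' + u_2'y_2' + u_2y_2''$. We substitute these into the differential equation,
$$a(u_1'y_1' + u_1y_1'' + u_2'y_2' + u_2y_2'') + b(u_1y_1' + u_2y_2') + c(u_1y_1 + u_2y_2) = f(x)$$
to obtain
$$u_1(ay_1'' + by_1' + cy_1) + u_2(ay_2'' + by_2' + cy_2) + a(u_1'y_1' + u_2'y_2') = f(x).$$
Since $y_1$ and $y_2$ satisfy the homogeneous equation, the first two brackets vanish. This means
$$\begin{cases} u_1'y_1 + u_2'y_2 = 0, \\ u_1'y_1' + u_2'y_2' = f(x)/a. \end{cases}$$
Solving this system for $u_1'$ and $u_2'$ and integrating gives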
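$$u_1(x) = \int \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx \quad\text{and}\quad u_2(x) = \int \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx.$$
Thus, a particular solution is given by
$$y_p = y_1(x)\cdot\int \frac{\begin{vmatrix} 0 & y_2 \\ f(x)/a & y_2' \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx + y_2(x)\cdot\int \frac{\begin{vmatrix} y_1 & 0 \\ y_1' & f(x)/a \end{vmatrix}}{\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}}\,dx. \tag{5.28}$$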
$$y_c = C_1e^{-x} + C_2e^{x}.$$
$$y_p = 2e^{-x}\left(-\frac{1}{2}xe^{2x} + \frac{1}{4}e^{2x}\right) + x^2e^{x} = \frac{1}{2}e^{x}(2x^2 - 2x + 1).$$
So, a general solution is given by
$$y = C_1e^{-x} + C_2e^{x} + \frac{1}{2}e^{x}(2x^2 - 2x + 1).$$
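Formula (5.28) can also be evaluated numerically. The sketch below is a minimal Python illustration, assuming the example's equation is $y'' - y = 4xe^{x}$ (inferred from the stated $y_c$ and $y_p$) with $y_1 = e^{-x}$ and $y_2 = e^{x}$. The function names, the use of Simpson's rule, and the choice to integrate from 0 are assumptions made for this sketch. Integrating from 0 fixes the constants of integration, so the result differs from the $y_p$ above by a multiple of $e^{-x}$, which is itself a solution of the homogeneous equation.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def particular_solution(x, y1, dy1, y2, dy2, f, a=1.0):
    """Evaluate y_p(x) = u1*y1 + u2*y2, with u1, u2 integrated from 0 to x."""
    def wronskian(t):
        return y1(t) * dy2(t) - dy1(t) * y2(t)

    def du1(t):  # determinant |0 y2; f/a y2'| divided by the Wronskian
        return -y2(t) * f(t) / a / wronskian(t)

    def du2(t):  # determinant |y1 0; y1' f/a| divided by the Wronskian
        return y1(t) * f(t) / a / wronskian(t)

    return simpson(du1, 0.0, x) * y1(x) + simpson(du2, 0.0, x) * y2(x)


# Assumed example: y'' - y = 4x e^x with y1 = e^{-x}, y2 = e^{x}.
y1 = lambda t: math.exp(-t)
dy1 = lambda t: -math.exp(-t)
y2 = lambda t: math.exp(t)
dy2 = lambda t: math.exp(t)
f = lambda t: 4 * t * math.exp(t)

book_yp = lambda t: 0.5 * math.exp(t) * (2 * t * t - 2 * t + 1)

x = 1.0
numeric = particular_solution(x, y1, dy1, y2, dy2, f)
# The two particular solutions differ by the homogeneous term 0.5*e^{-x}.
print(abs(numeric - (book_yp(x) - 0.5 * math.exp(-x))) < 1e-6)  # True
```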
In this section, we introduce, very briefly, two ways to find exact or approximate
solutions to differential equations: the power series method and Euler’s method.
5.4 Other ways of solving differential equations | 309
When we cannot find an explicit expression for the solution of a differential equation,
we try to get information about the solution in other ways. One way is to express the
solution in the form of a power series,
$$y = f(x) = \sum_{n=0}^{\infty} c_nx^n = c_0 + c_1x + c_2x^2 + \cdots + c_nx^n + \cdots.$$
The method is to substitute this expression into the differential equation and use the
equation to determine the values of the coefficients c0 , c1 , c2 ⋅ ⋅ ⋅. This technique re-
sembles the method of undetermined coefficients discussed previously. Once a Taylor
series solution or some of the initial terms of that Taylor series have been found, this
can be used to compute numerical approximations to the solution of the ODE.
We now illustrate the method on the equation $y' - y = x$. We already know how to
solve this equation exactly by techniques introduced before, but it is a simple exam-
ple, helping us to understand the power series method.
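Example 5.4.1. Use a power series to solve the initial value problem $y' - y = x$ and $y(0) = 1$.
Solution. Let $y = \sum_{n=0}^{\infty} c_nx^n$. Then $y' = \sum_{n=1}^{\infty} nc_nx^{n-1}$. So, $y' - y = x$ becomes
$$(c_1 - c_0) + (2c_2 - c_1)x + (3c_3 - c_2)x^2 + \cdots + ((n+1)c_{n+1} - c_n)x^n + \cdots = x.$$
The initial condition $y(0) = 1$ gives $c_0 = 1$, and comparing coefficients gives
$$c_1 = 1, \quad c_2 = 1 = \frac{2}{2!}, \quad c_3 = \frac{2}{3!}, \quad \ldots, \quad c_n = \frac{2}{n!}, \quad \ldots.$$
Since $e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, this solution is
$$\begin{aligned}
y &= 2\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right) - x - 1 \\
&= 2e^{x} - x - 1.
\end{aligned}$$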
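The coefficient recurrence behind this example is easy to automate. The following Python sketch computes the coefficients exactly with `fractions.Fraction` and compares a partial sum with the closed form $2e^{x} - x - 1$. The function name `series_coefficients` and the number of terms are illustrative choices.

```python
from fractions import Fraction
import math


def series_coefficients(n_terms):
    """Return c_0, ..., c_{n_terms-1} for y' - y = x with y(0) = 1.

    Matching the coefficient of x^n on both sides gives
    (n + 1) * c_{n+1} - c_n = 1 if n == 1, and 0 otherwise.
    """
    c = [Fraction(1)]  # c_0 = y(0) = 1
    for n in range(n_terms - 1):
        rhs = 1 if n == 1 else 0
        c.append((c[n] + rhs) / (n + 1))
    return c


coeffs = series_coefficients(15)
print(coeffs[:5])  # c_0, ..., c_4 as exact fractions

# Compare a partial sum with the closed form 2e^x - x - 1 at x = 0.5.
x = 0.5
partial_sum = sum(float(c) * x**n for n, c in enumerate(coeffs))
print(abs(partial_sum - (2 * math.exp(x) - x - 1)) < 1e-12)  # True
```

Using exact fractions makes it straightforward to confirm the pattern $c_n = 2/n!$ for $n \geq 2$ before any floating-point evaluation is done.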
As previously mentioned, it is the exception rather than the rule when a first-order ODE of the general form
$$\frac{dy}{dx} = f(x, y)$$
can be solved exactly and explicitly by elementary methods like those discussed earlier. Even the simple equations
$$\frac{dy}{dx} = e^{-x^2} \quad\text{and}\quad \frac{dy}{dx} = \frac{\sin x}{x}$$
cannot be solved this way, since it can be proved that the antiderivatives of $e^{-x^2}$ and $\frac{\sin x}{x}$ are not elementary functions. However, if a solution exists, then we can always find numerical approximations to the solution. The most basic of the approximation methods is Euler's method.
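We consider the initial value problem of the form
$$\frac{dy}{dx} = f(x, y), \quad y(x_0) = y_0.$$
In Euler's method we first choose a small step size $h$, and we use this to define a sequence of $x$-values, starting with the initial value $x_0$ and separated by $h$, giving
$$x_n = x_0 + nh, \quad n = 0, 1, 2, \ldots.$$
The approximate values $y_n \approx y(x_n)$ are then computed step by step from
$$y_{n+1} = y_n + hf(x_n, y_n).$$
Euler's method works because, by Taylor's theorem, for the solution function $y = y(x)$ we have
$$y(x_n + h) = y(x_n) + hy'(x_n) + O(h^2) = y(x_n) + hf(x_n, y(x_n)) + O(h^2).$$
Thus,
$$y(x_{n+1}) \approx y(x_n) + hf(x_n, y(x_n)).$$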
Example 5.4.2. Use Euler's method with step size 0.1, and then step size 0.05, to construct a table of approximate values for the solution on the interval $0 \leq x \leq 1$ for the initial value problem $y' = x - y$ and $y(0) = 1$.
Solution. Here $f(x, y) = x - y$, so with $h = 0.1$ the iterative formula is
$$y_{n+1} = y_n + 0.1(x_n - y_n).$$
We obtain
$$y_1 = y_0 + 0.1(x_0 - y_0) = 1 + 0.1(0 - 1) = 0.9,$$
$$y_2 = y_1 + 0.1(x_1 - y_1) = 0.9 + 0.1(0.1 - 0.9) = 0.82,$$
and so on.
Proceeding with similar calculations, we find the values in the two tables (for $h = 0.1$ and $h = 0.05$). We have also included the corresponding values of the exact solution, $y = x + 2e^{-x} - 1$, and the deviation (error) of the approximate solution from the exact solution.
Step size h = 0.1

x_n     Euler y_n    Exact y(x_n)   Error
0       1            1.0            0
0.1     0.9          0.9096748      0.0096748
0.2     0.82         0.8374615      0.0174615
0.3     0.758        0.7816364      0.0236364
0.4     0.7122       0.7406401      0.0284401
0.5     0.68098      0.7130613      0.0320813
0.6     0.662882     0.6976233      0.0347413
0.7     0.6565938    0.6931706      0.0365768
0.8     0.6609344    0.6986579      0.0377235
0.9     0.6748410    0.7131393      0.0382983
1.0     0.6973569    0.7357589      0.038402

Step size h = 0.05

x_n     Euler y_n    Exact y(x_n)   Error
0       1            1.0            0
0.05    0.95         0.9524588      0.0024588
0.1     0.905        0.9096748      0.0046748
0.15    0.86475      0.8714160      0.006666
0.2     0.8290125    0.8374615      0.008449
0.25    0.7975619    0.8076016      0.0100397
0.3     0.7701838    0.7816364      0.0114526
0.35    0.7466746    0.7593762      0.0127016
0.4     0.7268409    0.7406401      0.0137992
0.45    0.7104988    0.7252563      0.0147575
0.5     0.6974739    0.7130613      0.0155874
0.55    0.6876002    0.7038996      0.0162994
0.6     0.6807202    0.6976233      0.0169031
0.65    0.6766842    0.6940916      0.0174074
0.7     0.6753500    0.6931706      0.0178206
0.75    0.6765824    0.6947331      0.0181507
0.8     0.6802533    0.6986579      0.0184046
0.85    0.6862407    0.7048299      0.0185892
0.9     0.6944286    0.7131393      0.0187107
0.95    0.7047072    0.723482       0.0187748
1.0     0.7169718    0.7357589      0.0187871
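The tables in this example can be regenerated with a short program. The following is a minimal Python sketch of Euler's method for $y' = x - y$, $y(0) = 1$. The function name `euler` and the output formatting are illustrative choices. The grid points are computed as $x_0 + nh$ rather than by repeated addition, to limit floating-point drift.

```python
import math


def euler(f, x0, y0, h, n_steps):
    """Return lists xs, ys with y_{n+1} = y_n + h * f(x_n, y_n)."""
    xs, ys = [x0], [y0]
    for n in range(n_steps):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(x0 + (n + 1) * h)
    return xs, ys


f = lambda x, y: x - y                      # right-hand side of y' = x - y
exact = lambda x: x + 2 * math.exp(-x) - 1  # exact solution with y(0) = 1

for h, n_steps in ((0.1, 10), (0.05, 20)):
    xs, ys = euler(f, 0.0, 1.0, h, n_steps)
    print(f"h = {h}")
    for x, y in zip(xs, ys):
        print(f"{x:5.2f}  {y:.7f}  {exact(x):.7f}  {exact(x) - y:.7f}")
```

Comparing the final errors at $x = 1$ for the two step sizes shows that halving $h$ roughly halves the error, which is consistent with Euler's method being a first-order method.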
Note. Euler’s method is subject to the numerical errors experienced by most itera-
tive methods. The small errors caused by the approximate solution at each step are
incorporated into the calculations for the next step, and so can gradually build up
into large errors. This is illustrated in the figure above. This error build-up can be re-
duced by decreasing the size of the step h. However, as h gets smaller, the number
of computations increases, and this can cause another kind of error during computer
computations. This is because computers approximate numbers by rounding them to
a certain precision, and this introduces minute errors (round-off errors). If an itera-
tive method requires an extremely large number of computations, then the round-off
errors can build up into a significant error.
Example 5.4.3. Apply Euler's method to approximate the solution of the initial value problem
$$\begin{cases} \dfrac{dy}{dx} = \sqrt{x^2 + y^2}, \\ y(0) = -1. \end{cases}$$
Solution. In this case the iterative formula is $y_{n+1} = y_n + 0.1\sqrt{x_n^2 + y_n^2}$, starting from $x_0 = 0$ and $y_0 = -1$, and for $n = 0, 1, 2, 3, \ldots$, the values of $x_n$ are $0, 0.1, 0.2, 0.3, \ldots, 0.9, 1$. A table of the computed approximate solution values is shown.
$n$     $x_n$     $y_n$     $\sqrt{x_n^2 + y_n^2}$
Note. In this example it is not possible to find an exact solution formula, so a numer-
ical approach is the only way to investigate the solution.
5.5 Review
Main concepts discussed in this chapter are listed below.
1. Separable differential equations, $\frac{dy}{dx} = f(x, y) = g(x)h(y)$, have the solution
$$\int \frac{1}{h(y)}\,dy = \int g(x)\,dx.$$
2. Substitution method: if
$$\frac{dy}{dx} = F\left(\frac{y}{x}\right),$$
then $y = xv$ will transform the ODE into a separable one.
3. Exact differential equation:
$$ay'' + by' + cy = 0,$$
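8. To find a particular solution $y^*$ for $ay'' + by' + cy = f(x)e^{\lambda x}$, where $f(x)$ is a polynomial of degree $m$, let $Q(x)$ be a polynomial of degree $m$ with unknown coefficients. We try
$$y^* = x^kQ(x)e^{\lambda x},$$
where $k = 0$, $1$, or $2$ is the multiplicity of $\lambda$ as a root of the auxiliary equation.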
5.6 Exercises
5.6.1 Introduction to differential equations
1. Which of the following equations are differential equations? For those that are,
state whether they are ODE or PDE. For those that are ODE, give their orders and
degrees.
(1) y = 2x + 6, (2) y = 2x + 3,
(3) $\frac{d^2y}{dx^2} = y + 2x$, (4) $x^2 - 3t = 0$,
(5) y = x + y + y2 cos x,
(6) yx + 8(y )2 + 6y8 = e2t ,
(7) $y(y')^2 = 1$, (8) $x^2\,dx + y\,dx = 0$,
(9) $y^{(4)} + 2y + 3x = 5$, (10) $\frac{\partial^2 u}{\partial x^2} + \left(\frac{\partial u}{\partial t}\right)^2 = x^2 - t$.
2. Verify that $x = 2(\sin 2t - \sin 3t)$ is the solution of the initial value problem $\frac{d^2x}{dt^2} + 4x = 10\sin 3t$, $x(0) = 0$, $x'(0) = -2$.
3. Graph the slope fields for the following differential equations using computer software:
(1) $y' = \frac{x - y}{x + y}$, (2) $\frac{dy}{dx} = (x + y - 2)^2$,
(3) $\frac{dy}{dx} = \sin x$, (4) $\frac{dy}{dx} = x(6 - x)$.
4. Find an equation of the curve that passes through the point $(1, 0)$ and whose slope at each point $(x, y)$ is $x^2$.
1
y = − ,
y3
we must have
1 1
y dy = − dy → ∫ y dy = ∫ − 3 dy.
y3 y
1
y = 1 − .
y2
5. Solve each of the following differential equations using the method of undetermined coefficients:
(1) $y'' - 7y' + 6y = 4x$, (2) $y'' - 2y' - 3y = 6e^{2x}$,
(3) $y'' + 4y = x\cos x$, (4) $y'' - y = 4xe^{x}$, $y(0) = 0$, $y'(0) = 1$,
(5) $y'' - 2y' + 5y = e^{x}\sin 2x$, (6) $y'' + y = e^{x} + \cos x$,
(7) $\ddot{y} + \dot{y} + y = 0$, (8) $\frac{d^2y}{dt^2} + 16y = 3\cos 4t$,
(9) $2\frac{d^2x}{dt^2} - 3\frac{dx}{dt} - 5x = 10t^2 + 1$, (10) $\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + y = e^{-x}$.
6. Solve $20\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + x = 2t + 11$, given that $x = 1$ and $\frac{dx}{dt} = 2.8$ when $t = 0$. Describe the behavior of $x$ when $t \to \infty$.
7. Solve the differential equation $y'' + y = \tan x$ using variation of parameters.
8. A spring with a mass of 2 kg is put on a table. One of the ends of the spring is
fixed on a wall and the other end is attached to the mass. It is held stretched 0.2 m
beyond its natural length by a force of 40 N. Now, suppose the mass is at its equi-
librium point, a push gives the mass an initial velocity of 2 m/s. Find the position
of the mass after t seconds.
9. The Kirchhoff voltage law says that
$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E(t),$$
where $L$ is the inductance, $R$ is the resistance, $C$ is the capacitance, $Q$ is the charge, and $E$ is the electromotive force. The current $I$ is always equal to $\frac{dQ}{dt}$. Find the charge and current at time $t$ in a circuit if the initial charge and current are both 0, and $L = 1$, $R = 40$, $C = 16 \times 10^{-4}$, and $E(t) = 10\sin(2t)$.
10. Attempt to find a general solution to the Euler equation
$$ax^2\frac{d^2y}{dx^2} + bx\frac{dy}{dx} + cy = 0,$$
where $a \neq 0$, $b$, and $c$ are constants. Hint: Try $y = x^r$.
11. (Solving a simple PDE) In general, solving a PDE for an analytical solution is not easy. Numerical methods are widely used in obtaining approximate solutions. However, in some cases, we may be able to find an exact solution to a PDE. Consider heat conduction in a cube, which can be modeled by
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \quad\text{for } 0 < x, y, z < a.$$
Note that $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}$ is often denoted by $\nabla^2 u$. Suppose $u = P(x)Q(z)$ and boundary conditions are
$$u = 0 \text{ on } x = 0 \text{ and } x = a,$$
$$u = 0 \text{ on } z = 0,$$
$$u = 1 \text{ on } z = a.$$
(a) Show that $P''Q + PQ'' = 0$ and that there is a constant $\lambda$ such that $Q'' = \lambda Q$ and $P'' + \lambda P = 0$.
(b) Show that a general solution for P(x) is
$$\cosh x = \frac{e^{x} + e^{-x}}{2} \quad\text{and}\quad \sinh x = \frac{e^{x} - e^{-x}}{2}.$$
Show that the general solution for Q(z) shown in (d) can be rewritten as
$$u(x, z) = \sum_{k=1}^{\infty} \frac{1}{(2k - 1)\pi}\,\frac{\sinh\frac{(2k - 1)\pi z}{a}}{\sinh((2k - 1)\pi)}\,\sin\frac{(2k - 1)\pi x}{a}.$$
https://doi.org/10.1515/9783110674378-006